English Pages [371] Year 2002
A COURSE IN SYMBOLIC LOGIC
Haim Gaifman
Philosophy Department Columbia University
Copyright © 1992 by Haim Gaifman
Revised: February 1999. Further corrections: February 2002.
Contents
Introduction
1 Declarative Sentences
  1.1 Truth-Values
    1.1.1 Context Dependency
    1.1.2 Types and Tokens
    1.1.3 Vagueness and Open Texture
    1.1.4 Other Causes of Truth-Value Gaps
  1.2 Some Other Uses of Declarative Sentences
2 Sentential Logic
  2.1 Sentences, Connectives, Truth-Tables
    2.1.1 Negation
    2.1.2 Conjunction
    2.1.3 Truth-Tables
    2.1.4 Atomic Sentences in Sentential Logic
  2.2 Logical Equivalence, Tautologies, Contradictions
    2.2.1 Some Basic Laws Concerning Equivalence
    2.2.2 Disjunction
    2.2.3 Logical Truth and Falsity, Tautologies and Contradictions
  2.3 Syntactic Structure
    2.3.1 Sentences as Trees
    2.3.2 Polish Notation
  2.4 Syntax and Semantics
  2.5 Sentential Logic as an Algebra
    2.5.1 Using the Equivalence Laws
    2.5.2 Additional Equivalence Laws
    2.5.3 Duality
  2.6 Conditional and Biconditional
3 Sentential Logic in Natural Language
  3.1 Classical Sentential Connectives in English
    3.1.1 Negation
    3.1.2 Conjunction
    3.1.3 Disjunction
    3.1.4 Conditional and Biconditional
4 Logical Implications and Proofs
  4.1 Logical Implication
  4.2 Implications with Many Premises
    4.2.1 Some Basic Implication Laws and Top-Down Derivations
    4.2.2 Additional Implication Laws and Derivations as Trees
    4.2.3 Logically Inconsistent Premises
  4.3 Fool-Proof Method
    4.3.1 Validity and Counterexamples
    4.3.2 The Basic Laws
    4.3.3 The Fool-Proof Method
  4.4 Proofs by Contradiction
    4.4.1 The Fool-Proof Method for Proofs by Contradiction
  4.5 Implications of Sentential Logic in Natural Language
    4.5.1 Meaning Postulates and Background Assumptions
    4.5.2 Implicature
5 Mathematical Interlude
  5.1 Basic Concepts of Set Theory
    5.1.1 Sets, Membership and Extensionality
    5.1.2 Subsets, Intersections, and Unions
    5.1.3 Sequences and Ordered Pairs
    5.1.4 Relations and Cartesian Products
  5.2 Inductive Definitions and Proofs, Formal Languages
    5.2.1 Inductive Definitions
    5.2.2 Proofs by Induction
    5.2.3 Formal Languages as Sets of Strings
    5.2.4 Simultaneous Induction
6 The Sentential Calculus
  6.1 The Language and Its Semantics
    6.1.1 Sentences as Strings
    6.1.2 Semantics of the Sentential Calculus
    6.1.3 Normal Forms, Truth-Functions and Complete Sets of Connectives
  6.2 Deductive Systems of Sentential Calculi
    6.2.1 On Formal Deductive Systems
    6.2.2 Hilbert-Type Deductive Systems
    6.2.3 A Hilbert-Type Deductive System for Sentential Logic
    6.2.4 Soundness and Completeness
    6.2.5 Gentzen-Type Deductive Systems
7 Predicate Logic Without Quantifiers
  7.1 PC0, The Formal Language and Its Semantics
    7.1.1 The Semantics of PC0
  7.2 PC0 with Equality
    7.2.1 Top-Down Fool-Proof Methods For PC0 with Equality
  7.3 Structures of Predicate Logic in Natural Language
    7.3.1 Variables and Predicates
    7.3.2 Predicates and Grammatical Categories of Natural Language
    7.3.3 Meaning Postulates and Logical Truth Revisited
  7.4 PC∗0, Predicate Logic with Individual Variables
    7.4.1 Substitutions
    7.4.2 Variables and Structural Representation
8 First-Order Logic
  8.1 First View
  8.2 Wffs and Sentences of FOL
    8.2.1 Bound and Free Variables
    8.2.2 More on the Semantics
    8.2.3 Substitutions of Free and Bound Variables
    8.2.4 First-Order Languages with Function Symbols
  8.3 First-Order Quantification in Natural Language
    8.3.1 Natural Language and the Use of Variables
    8.3.2 Some Basic Forms of Quantification
    8.3.3 Universal Quantification
    8.3.4 Existential Quantification
    8.3.5 More on First Order Quantification in English
    8.3.6 Formalization Techniques
9 FOL: Models, Truth and Logical Implication
  9.1 Models, Satisfaction and Truth
    9.1.1 The Truth Definition
    9.1.2 Defining Sets and Relations by Wffs
  9.2 Logical Implications in FOL
    9.2.1 Proving Non-Implications by Counterexamples
    9.2.2 Proving Implications by Direct Semantic Arguments
    9.2.3 Equivalence Laws and Simplifications in FOL
  9.3 The Top-Down Derivation Method for FOL Implications
    9.3.1 The Implication Laws for FOL
    9.3.2 Examples of Top-Down Derivations
    9.3.3 The Adequacy of the Method: Completeness
Introduction
Logic is concerned with the fundamental patterns of conceptual thought. It uncovers structures that underlie our thinking in everyday life and in domains that have very little in common, as diverse, for example, as mathematics, history, or jurisprudence. A rather rough idea of the scope of logic can be obtained by noting certain keywords: object, property, concept, relation, true, false, negation, names, common names, deduction, implication, necessity, possibility, and others.
Symbolic logic aims at setting up formal systems that bring to the fore basic aspects of reasoning. These systems can be regarded as artificial languages into which we try to translate statements of natural language (e.g., English). While many aspects of the original statement are lost in such a translation, others are made explicit. It is these latter aspects that are the focus of the logical investigation.
Historically, logic was conceived as the science of valid reasoning, one that derives solely from the meaning of words such as ‘not’, ‘and’, ‘or’, ‘all’, ‘every’, ‘there is’, and others, or syntactical constructs like ‘if ... then’. These words and constructs are sometimes called logical particles. A logical particle plays the same role in domains that have nothing else in common.
Here is a traditional, very simple example. From the two premises:
    All animals are mortal.
    All humans are animals.
we infer, by logic alone:
    All humans are mortal.
The inference is purely logical; it does not depend on the meanings of ‘animal’ and ‘human’, but solely on the meaning of the construct ‘all ... are’.
The same pattern underlies the following inference, in which all non-logical concepts are different. From:
    All uncharged elementary particles are unaffected by electromagnetic fields.
    All photons are uncharged elementary particles.
we can infer:
    All photons are unaffected by electromagnetic fields.
The two cases are regarded in Aristotelian logic as instances of the same syllogism, a certain elementary type of inference. The particular syllogism under which the two examples fall is the following scheme, where the premises are written above the line and the conclusion under it:
(1)
    All Ps are Qs
    All Rs are Ps
    -------------------------
    All Rs are Qs
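As a rough illustration of scheme (1), one can read ‘All Ps are Qs’ as the set-theoretic claim that P is a subset of Q, and then search a small universe for a choice of P, Q and R that makes both premises true and the conclusion false. This is only a sketch under that assumed reading: the function names and the three-element universe are illustrative choices, and a finite search of this kind is evidence for the scheme, not a proof of it.

```python
from itertools import combinations

def all_subsets(universe):
    """Return every subset of the universe as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def all_are(xs, ys):
    """Assumed reading of 'All Xs are Ys': every member of xs is in ys."""
    return xs <= ys

def scheme_1_counterexamples(universe):
    """Collect triples (P, Q, R) where both premises hold but the conclusion fails."""
    subsets = all_subsets(universe)
    return [(p, q, r)
            for p in subsets for q in subsets for r in subsets
            if all_are(p, q) and all_are(r, p) and not all_are(r, q)]

counterexamples = scheme_1_counterexamples({"a", "b", "c"})
print(len(counterexamples))  # 0: no choice of P, Q, R breaks the scheme here
```

Note that the search never asks whether the premises are true of the actual world; it only asks whether the premises can hold while the conclusion fails.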
Our first example is obtained if we substitute ‘animal’ for ‘P’, ‘mortal being’ for ‘Q’, ‘human’ for ‘R’. The second is obtained if we substitute ‘uncharged elementary particle’ for ‘P’, ‘thing unaffected by electromagnetic fields’ for ‘Q’, ‘photon’ for ‘R’. Had we substituted in the first case ‘immortal being’ for ‘Q’ (instead of ‘mortal being’), we would have gotten the inference:
    All animals are immortal.
    All humans are animals.
    -------------------------
    All humans are immortal.
Here the first premise is false, and so is the conclusion. But the inference is correct. Its correctness does not require that the premises be true, but that they stand in a certain logical relation to the conclusion: they should logically imply it.
The use of schematic symbols is a first step in setting up a formalism. Yet there is a long way from here to a fully fledged formal language. As will become clear during the course, there is much more to a formalism than the employment of formal symbols.
Here is a different type of logical inference. From the three premises:
    Either Jill went to the movie, or she went to the party.
    If Jill went to the party, she met Jack there.
    Jill did not go to the movie.
we can infer:
    Jill met Jack at the party.
The logical particles on which this last inference is based are: ‘either ... or’, ‘if ... then’, ‘not’.
The same scheme is exemplified by the following inference. (Again, its validity does not mean that the premises are true, but only that they imply the conclusion.) From:
    Either Ms. Hill invented her story, or Mr. Thomas harassed her.
    If Mr. Thomas harassed Ms. Hill, then he lied to the Senate.
    Ms. Hill did not invent her story.
we can infer:
    Mr. Thomas lied to the Senate.
The scheme that covers both of these inferences can be written in the following self-explanatory notation, where the premises (here there are three) are above the line and the conclusion is below it:
(2)
    A or B
    If B then C
    not A
    -------------------------
    C
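Scheme (2) can be checked mechanically by running through all truth-value assignments to A, B and C. The following is a minimal sketch, assuming the usual truth-functional readings of ‘or’ (taken here as inclusive), ‘if ... then’ and ‘not’; the helper names are illustrative and are not part of the course's notation.

```python
from itertools import product

def implies(x, y):
    """Truth-functional 'if x then y': false only when x is true and y is false."""
    return (not x) or y

def scheme_2_counterexamples():
    """List assignments (A, B, C) making all three premises true and the conclusion false."""
    found = []
    for a, b, c in product([True, False], repeat=3):
        premises = [a or b, implies(b, c), not a]
        conclusion = c
        if all(premises) and not conclusion:
            found.append((a, b, c))
    return found

print(scheme_2_counterexamples())  # []: the premises never hold while C fails
```

An empty list means that no assignment makes the premises true and the conclusion false, which is what the validity of the scheme amounts to under the assumed readings.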
Note that the schematic symbols of (2) are of a different kind than those of (1). Whereas in (1) they stand for general names: ‘human’, ‘mortal being’, ‘photon’, etc., they stand in (2) for complete sentences, such as ‘Jill went to the party’ or ‘Mr. Thomas harassed Ms. Hill’.
The part of logic that takes complete sentences as basic units, and investigates the combining of sentences into other sentences, by means of expressions such as ‘not’, ‘and’, ‘or’, ‘if ... then’, is called sentential logic. The sentence-combining operations are called sentential operations, and the resulting sentences (e.g., those in (2)) are sentential combinations, or sentential compounds. Sentential logic is a basic part that is usually included in other, richer systems of logic. The logic that treats, in addition to sentential operations, the attribution of properties to objects (‘Jill is not tall, but pretty’), or relations between objects (‘Jack likes Jill’) is predicate logic. If we add to predicate logic means for expressing general statements, like those formed in English by using ‘all’, ‘every’, ‘some’ (‘All humans are mortal’, ‘Jill likes some tall man’, ‘Everyone likes somebody’), then we get first-order quantification logic, known also as first-order predicate logic, or, for short, first-order logic.
The examples schematized by (1) and by (2) are rather simple. In general, the recasting of a sentence as an instance of some scheme is far from obvious. It amounts to an analysis: a way of getting the sentence from components, which displays a certain logical structure. The choice of components and how, in general, we combine them is crucial for the development of logic, just as the choice of primitive concepts and basic assumptions is crucial for any science.
Logic was bogged down for more than twenty centuries, because it was tied to a particular way of analyzing sentences, one that is based on syllogisms. It could not accommodate the wealth of logical reasoning that can be expressed in natural language and is apparent in the deductive sciences. To be sure, valuable insights and sophisticated analyses were achieved during that period. But only at the turn of the twentieth century, when the basic Aristotelian setup was abandoned in favour of an essentially different approach, did symbolic logic come into its own.
Examples such as (1) and (2) can serve as entry points, but they do not show what symbolic logic is about. We are not concerned merely with schematizing this or that way of reasoning, but with the construction and the study of formal languages. The study is a tool in the investigation of conceptual thought.
It aims, and has rich implications, beyond the linguistic dimension. Logic is also not restricted to the first-order case. Other logical formalisms have been designed in order to handle a variety of notions, including those expressed by the terms ‘possible’, ‘necessary’, ‘can’, ‘must’, and many others. There are also numerous systems that treat a great variety of reasoning methods, which are quite different from the type exemplified in first-order logic.
A Bit of History
The beginning of logic goes back to Aristotle (384 – 322 B.C.). Aristotle’s works contain, besides non-formal or semi-formal investigations of logical topics, a schematization and a systematic theory of syllogisms. This logic was developed and enlarged in the Middle Ages but remained very limited, owing to its underlying approach, which bases the logic on the grammatical subject-predicate relation. Other parts of logic, namely fragments of sentential logic, were studied by the Stoic philosophers (330 – 100 B.C.). They made use of schematic symbols, which did not, however, amount to a formal system. That stage had to come much later.
The decisive steps in logic took place in the second half of the nineteenth and the beginning of the twentieth century, in the works of George Boole (1815 – 1864), Charles Peirce (1839 – 1914), Giuseppe Peano (1858 – 1932), and, first and foremost, Gottlob Frege (1848 – 1925), whose work Begriffsschrift (1879) presents the first logical system rich enough for expressing mathematical reasoning. The other most significant event was the publication, in three volumes, of the Principia Mathematica by Russell and Whitehead (1910 – 1913), an extremely ambitious work from which the current formalisms derive directly. (Frege’s work was little noticed at the time, though Russell and other logicians knew it.)
The big bang in logic was directly related to new developments in mathematics, mostly in the works of Dedekind and Cantor. It owed much to the creation, by the latter, of set theory (1874). After the work of Russell and Whitehead, it was taken much further by mathematicians and philosophers, among whom we find Hilbert, Ackermann, Ramsey, Löwenheim, Skolem, and, somewhat later, Gentzen, Herbrand, Church, Tarski and many others. It owes its most important results to Kurt Gödel: landmark results of deep philosophical significance.
“Critical Thinking”, What Symbolic Logic Is Not
In the Middle Ages logic was described as “the art of reasoning”. It has been, and often still is, viewed as the discipline concerned with correct ways of deriving conclusions from given premises. Much of medieval logic was concerned with the analysis and the classification of arguments to be found in various kinds of discourse, from everyday life to politics, philosophy, and theology. In this way logic was related to rhetoric: the art of convincing people through talk. The medieval logicians devoted considerable effort to the uncovering of invalid arguments, known as fallacies, of which they classified a few dozen.
Perhaps a vestige of this tradition, or a renewal of it, is a logic course given in some curricula under “Critical Thinking”. In certain cases, I am told, the name is a euphemism for teaching text comprehension, filling thereby a high school lacuna. But, to judge by the textbooks, the course commonly comprises an assortment of topics that have to do with inference-drawing: deductive, inductive, statistics and probability, some elements of symbolic logic, and a discussion of various fallacies. There is no doubt that there is value to such an overview, or to the analysis of common fallacies. But the pretense of the title should be discounted. Critical thinking, without the scare quotes, is not something that can be taught directly as an academic subject. (Imagine a course called “Thinking”.) Thinking, like walking, is learned by practice. And good thinking, clear, rigorous, critical, is what one acquires in life, or through work in particular disciplines. Observations and corrective tips are useful, but they will not get you far, unless incorporated in long continuous experience. Clear thinking is a lifelong project.
A course in symbolic logic is not a course in “critical thinking”. What you will learn may, hopefully will, affect your thinking. You will study a certain formalism, using which you will learn, among other things, to analyze certain types of English sentences, to reconstruct them and to trace their logical relationships. These should make you more sensitive to some aspects of meaning and of reasoning. But any improvement of your thinking ability will be a consequence of the mental effort and the practice that the course requires. Thinking does not arrive by learning a set of rules, or by following this or that prescription.
The Research Program of Symbolic Logic
Symbolic logic is an enterprise that takes its “raw material” from the existing activity of reasoning, displayed by human beings. It focuses on certain fundamental patterns and tries to represent them in an artificial language (also called a calculus). The formal language is supposed to capture basic aspects of conceptual thought. The enterprise comprises discovery as well as construction. It would not be accurate to say that the investigator merely uncovers existing patterns. The structures are revealed and constructed at the same time.
Having constructed a formal system, we can go back and see how much of our reasoning is actually captured by it. The formalism is philosophically interesting, or fruitful, inasmuch as it gives us a handle on essential aspects of thought. It can also be of technical interest, for it can provide tools for language processing and computer simulation of reasoning processes. In either respect, there is no a priori guarantee of success. We should keep in mind that, even when the formal system represents something basic or important, its significance may be tied to some particular segment of our cognitive activity. There should be no pretense of “reducing” the enormous wealth of our thinking to an artificial symbolic system. At least there should be no a priori conviction that it can be done. To what extent human reasoning can be captured by a formal system is an intriguing and difficult question. It has been much discussed in the context of artificial intelligence (not always with the best results).
Let us compare investigations in logic to investigations of human body-postures. A medical researcher can use X-rays and other scans, or slow-motion pictures, in order to find out how human bodies function in a range of activities. He will accumulate a vast amount of data.
In order to organize his data into a meaningful account, he may classify it according to “human types”, establish certain regularities and formulate some general rules. Here he is already deviating from the “raw material”; he is introducing an abstract system inasmuch as his types are idealized constructs, which actual humans only approximate. (Not to mention the fact that in the very acquiring of data he is already making use of some theoretical system.) Our investigator may also arrive at conclusions concerning the correct ways in which humans should walk or sit in order to preserve healthy tissue, minimize unnecessary tension, etc. His research will establish certain norms; not only will it reveal how humans use their bodies, but also how they ought to. He may even conclude that most people do not maintain their bodies as they should. In this way the descriptive merges into the normative. And there is a feedback, for the normative may provide further concepts and further guidelines for the descriptive. Our investigator’s conclusions may be subject to debates, objections, or revisions. Here, as well, the descriptive and the normative are interlaced. Finally, it is possible that certain recommendations become highly influential, to the extent of being incorporated in the educational system. They would thus become part of the culture, determining, to an extent, the actual behaviour of humans, say, the way they hammer, or the kind of chairs they prefer.
All of these aspects exist when we are concerned with human thinking. Here, as well, the descriptive merges into the normative. Having started by investigating actual thinking, we might end by concluding how thinking ought to be done. Furthermore, the enterprise might influence actual thinking habits, projecting back on the very culture within which it was carried out.
First-Order Logic
The development of symbolic logic has by now had far-reaching consequences, which have deeply affected our philosophical outlook. Coupled with certain technological developments, it has also affected our culture in general. The basic system that the project has yielded is first-order logic. The name refers to a type of language, characterized, as stated in the first section, by a certain logical vocabulary. First-order logic serves also as the core of many modifications, restrictions and, most important, enlargements.
Although first-order logic is rather simple, all mathematical reasoning (derivations used in proving mathematical results) can be reproduced within it. Since it is completely defined by precise formal rules, first-order logic can itself be treated and investigated as a mathematical system. Mathematical logicians have done this, and they have been able to prove highly interesting theorems about it; for example, theorems that assert that certain statements are unprovable from such and such axioms. These and other theorems about the system are known as metatheorems; for they are not about numbers, equations, geometrical spaces, or algebras, but about the language in which theorems about numbers, equations, geometrical spaces, or algebras are proven. They enlighten us about its possibilities and limitations.
In philosophy, the development of symbolic logic has had far-reaching effects. The role of logic within the general philosophical inquiry has been a subject of debate. There is a wide spectrum of opinions, from those who accord it a central place, to those who restrict it to a specialized area. The subject’s importance varies with one’s interests and the flavour of one’s philosophy. In any case, logic is considered a basic subject, knowledge of which is required in most graduate and undergraduate philosophy programs.
The Wider Scope of Artificial Languages
Historically, the idea of a comprehensive formal language, defined by precise rules of mathematical nature, goes back to Leibniz (1646 – 1716). Leibniz thought of an arithmetical representation of concepts, and dreamed of a universal formal language within which all truth could be expressed. A similar vision was also before the eyes of Frege and Russell. The actual languages produced by logicians fall short of any kind of Leibnizian dream. This is no accident, for by now we have highly convincing reasons for discounting the possibility of such a universal language. The reasons have been provided by logic itself, in the form of certain metatheorems (Gödel’s incompleteness results). As noted, there is by now a wide variety of logical systems, which express many aspects of reasoning and of conceptual organization.
In the last forty years the enterprise of artificial languages has undergone a radical change due to the emergence of computer science. Computer scientists have developed scores of languages of types different from the types constructed by logicians. Their goal has not been the investigation of thought, but the description and the manipulation of computational activity. Computer languages serve to define the functioning of computers and to “communicate” with them, to “tell” a computer what to do. A major consideration that enters into the setting up of programming languages is that of efficiency: programs should be implementable in reasonable run time, on practical hardware. Usually, there is a trade-off between a program’s simplicity and conceptual clarity, on one hand, and its efficiency on the other.
At the same time we have witnessed a marked convergence of some of the projects of programming languages and those of logic. For example, the programming language LISP (and its many variants) is closely connected with the logical system known as the λ-calculus, developed in the thirties by Church.
The calculus and its variants have been the focus of a great amount of research by logicians and computer scientists. There has also been an important direct effect of symbolic logic on computer science. The clarity and simplicity of first-order logic suggested its use as a basis for a programming language. Ways were found to implement portions of first-order logic in an efficient way, which led to the development of what is known as logic programming. This, by now, is a vast area with hundreds, if not thousands, of researchers. Logic also enters, in an essential way, into other areas of computer science, in particular, artificial intelligence and automated theorem proving.
The Goals and the Structure of the Course
The main purpose of the course is to teach FOL (first-order logic), to relate it to natural language (English) and to point out various philosophical problems that arise thereby. The level is elementary, inasmuch as the course does not include proofs of the deeper results, such as Gödel’s completeness and incompleteness theorems. Nonetheless, the course aims at providing a good grasp of FOL. This includes an understanding of formal syntactic structures, an understanding of the semantics (that is, of the notion of an interpretation of the language and how it determines truth and falsity), the mastering of certain deductive and related techniques, and an ability to use the formal system in the analysis of English sentences and informal reasoning.
The first chapter, which is more of a general nature, is intended to clarify the concepts and presuppositions that underlie the project of classical logic: the category of declarative sentence and the classification of all declarative sentences into true and false. Various problems that arise when trying to apply this framework to natural language are discussed, among which are indexicality, ambiguity, vagueness and open texture. This introduction is also intended to put symbolic logic into a wider and more concrete perspective, removing from it any false aura of a given truth.
We get down to the actual course material in chapter 2, which provides a first view of sentential logic, based on a semantic-oriented approach. Here are defined the connectives, a variety of syntactic concepts (components, main connective, unique readability and others), truth-tables and the concept of logical equivalence. The chapter contains also various simplification techniques and an algebraic perspective on the system.
Having gotten a sufficient grasp of the formalism, we proceed in chapter 3 to match the formal setup with English. The chapter discusses, with the aid of many examples, ways of expressing in English the classical connectives, the extent to which English sentences can be rendered in sentential logic and what such an analysis reveals.
Chapter 4 treats logical implications and proofs. After defining (semantically) the concept of a logical implication, the chapter presents a very convenient method of deciding whether a purported implication, from a set of premises to a conclusion, is valid.
The method combines the ideas of Gentzen’s calculus with a top-down derivation technique. If the implication is valid it yields a formal proof (which can be rewritten bottom-up), if not–it produces a counterexample, thereby establishing the non-validity. In the last section we return to natural language and consider possible recastings of various inferences carried in English into a formal mode. Here we discuss also some concepts from the philosophy of language, such as implicature. Chapter 5 provides some basic mathematical tools that are needed for the rigorous treatment of logical calculi, in particular, for defining interpretations (models), giving a truth-definition, and for setting up deductive systems. These tools consist of elementary notions of set theory, and the basic techniques of inductive definitions and proofs. In chapter 6 the formal language of the sentential calculus is defined with full mathematical rigor, together with the concept of a deductive system. Here the crucial distinction between syntax and semantics is clarified and the relation between the two is established in terms of soundness and completeness.
Chapter 7 presents predicate logic (without quantifiers), based on a vocabulary of predicates and individual constants. The equality predicate is introduced and the top-down method for deciding logical implications is extended so as to include atomic sentences that are equalities. This chapter treats also predication in English. In the second half of that chapter the system is extended by the introduction of variables and steps are taken towards the introduction of quantifiers.
In chapter 8 the fully fledged language of first-order logic is defined, as well as the basic syntactic concepts that go with it: quantifier scope, free and bound variables, and legitimate substitutions of terms. Emphasis is placed on an intuitive understanding of what first-order formulas express and on translations from and into English.
The general concept of a model for a first-order language is presented in chapter 9, as well as the definitions of satisfaction and truth. Based on these we get the concepts of logical equivalence and logical implication. The top-down derivation technique of sentential logic is extended to first-order logic. As before, the method is guaranteed to yield a proof of any valid implication. A proof of this claim–which is not included in this chapter–yields immediately the completeness theorem.
Chapter 1 Declarative Sentences
1.0 Symbolic logic is concerned first and foremost with declarative sentences. These are sentences that purport to make factual statements. They are true if what they state is the case, and they are false–if it is not. ‘Grass is green,’ ‘Every prime number is odd,’ ‘Not every prime number is odd,’ ‘The moon is larger than the earth,’ ‘John Kennedy was not the first president of the USA to be assassinated,’ ‘Jack loves Jill, but wouldn’t admit it’, are all declarative sentences. The first, the third and the fifth are true. The second and the fourth are false. The last is true just in this case: (i) Jack loves Jill and (ii) Jack does not admit that he loves Jill. You can see what distinguishes declarative sentences by comparing them with other types. Interrogative sentences, for example, are used to express questions: ‘Who deduced existence from thinking?’
‘Did Homer write the Odyssey?’
Such sentences call for answers, which–depending on the kind of question–come in several forms; e.g., the first of the above questions calls for a name of a person, the second–for a ‘yes’ or a ‘no’. Commands are expressed by means of imperative sentences, such as: ‘Love thy neighbour as thou lovest thyself,’
‘Do not walk on the grass’.
Given in the appropriate circumstance, by someone with authority, they call for compliance. None of these, or of the other kinds of sentence, is true or false in the same sense that a declarative sentence is. We can say of a question that it is to the point, important, interesting, and so on, or that it is irrelevant, misleading or ill-posed. A command can be justified, appropriate, or illegitimate or out of place. But truth and falsity–in the basic, elementary sense of these terms–pertain to declarative sentences only. Sentences are used in many ways to achieve diverse purposes in human interaction. To question and to command are only two of a great variety of linguistic acts. We have requests, greetings, condolences, promises, oaths, and many others. What is then, within this picture of human interaction, the principal role of declarative sentences? It is–first and foremost–to convey information, to tell someone that such and such is the case, that a certain state of affairs obtains. But over and above their use in human communication, declarative sentences constitute descriptions (or purported descriptions) of some reality: a reality perceived by humans, but perceived as existing in itself, independently of its being described. A logical investigation of declarative sentences can serve as a tool that clarifies the nature of that reality. By uncovering certain basic features of our thinking it may also uncover basic features of the world that the thinking organizes. One can appreciate already, at this stage, the potential that the logic of declarative sentences has for epistemology–the inquiry into the nature of knowledge, and for ontology–the inquiry into the nature of reality. For this reason, when sentences are the target of a philosophical inquiry, the declarative ones play the most important role. Formal methods are not restricted to declarative sentences; formal systems have been designed for handling other types, such as questions and commands.
But symbolic logic is mostly about declarative sentences, and it is with these that we shall be concerned here. Henceforth, I shall use ‘sentence’ to refer to declarative sentences, unless indicated otherwise.
1.1
Truth-Values
1.1.0 A declarative sentence is true or false, according as to whether what it states is, or is not the case. It is very convenient to introduce two abstract objects, TRUE and FALSE, and to mark the sentence’s being true by assigning to it the value TRUE, and its being false–by assigning to it the value FALSE. We refer to these objects as truth-values. Truth-values are merely a technical device. They make it possible to use concise and clear formulations. One should not be mystified by these objects and one should not look for hidden meanings. To say that a sentence has the value TRUE is just another way of saying that it is
true, and to say that it has the value FALSE is no more than saying that it is false. Any two objects can be chosen as TRUE and FALSE. For the only thing that matters about truth-values is their use as markers of truth and falsity. Notation:
We use ‘T’ and ‘F’ as abbreviations for ‘TRUE’ and ‘FALSE’.
While the introduction of truth-values is a technical convenience, the very possibility of classifying sentences into true and false is a substantial philosophical issue. Does every sentence fall under one of these categories? A little reflection will show that in our everyday discourse such a classification is, to a large extent, problematic. The problem is not of knowing a sentence’s truth-value; we may not know whether Oswald was Kennedy’s only assassin, or whether $2^{32} + 1$ is a prime number, but we find no difficulty in appreciating the fact that, independently of our knowledge, ‘Oswald was Kennedy’s only assassin’ is either true or false, and so is ‘$2^{32} + 1$ is prime’. The problem is that in many cases it is not clear what the conditions for truth and falsity are and whether the classification applies at all. Perhaps certain sentences should on various occasions be considered as neither true nor false; which means, in our terminology, that neither T nor F is their value. The logic we are going to study, which is classical two-valued logic, assumes bivalence: the principle that every sentence has one of the two values T or F. This principle makes for systems that are relatively simple and highly fruitful at the same time. Logicians have, of course, been aware of the problems surrounding the assignment of truth-values. But in order to get off the ground, an inquiry must start by focusing on some aspects, while others are ignored. Later it may be broadened so as to handle additional features and other situations. The art is to know what to focus on and what, initially, to ignore. Classical two-valued logic has been extremely successful in contexts where bivalence prevails. And it serves also as a point of reference for further investigations, where problems of missing truth-values can be addressed. In short, we are doing what every scientist does, when he starts with a deliberately idealized picture.
In the coming sections of this chapter I shall highlight the main situations where the assignment of definite truth-values is called into question. This will also be an occasion for discussing briefly some major topics regarding language: context-dependency, tokens and types, indexicals, ambiguity and vagueness.
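The point that a sentence such as ‘2^32 + 1 is prime’ has a definite truth-value independently of our knowledge can be illustrated mechanically. The following is a minimal sketch in Python, not part of the course material; the function name `smallest_factor` is chosen only for this illustration. It settles the question by plain trial division.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest divisor of n greater than 1 (n itself if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

n = 2**32 + 1
factor = smallest_factor(n)

# The truth-value of the sentence '2^32 + 1 is prime':
# it is T exactly when n has no divisor other than 1 and itself.
is_prime = (factor == n)
```

Here `factor` comes out as 641, so the sentence has the value F. Nothing in the computation depends on whether anyone knew the answer beforehand; this is the sense in which not knowing a truth-value differs from the sentence lacking one.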
1.1.1
Context Dependency
The same sentence may have different truth-values on different occasions of its use. Consider, for example: Jack: I am tall, Jill: I am tall.
If Jack is not tall, but Jill is, then–in Jack’s mouth–the sentence is false, but in Jill’s mouth it is true. This shows that we are dealing here with two kinds of things: the entity referred to as sentence, which is the same in the mouth of Jack and the mouth of Jill, and its different utterances. The distinction is fundamental; it, and some of its hinging phenomena, will be now discussed.
1.1.2
Types and Tokens
Linguistic intercourse is based on the production of certain physical items: stretches of sounds, marks on paper, and their like, which are interpreted as words and sentences. Such items are called tokens. When you started to read this section you encountered a token of ‘linguistic’, which was part of a token of the opening sentence. And what you have just encountered is another token of ‘linguistic’, this time enclosed in inverted commas. Of course, “token” is meaningful only in as much as it is a token of something, a word, a letter, a sentence, or–in general–some other, more abstract entity. This other entity is called type. By a sentence-token we mean a token of a sentence, that is, a token of a sentence-type. Note that our terms ‘letter’, ‘word’, or ‘sentence’, are ambiguous. Sometimes they refer to types and sometimes to tokens. This is shown clearly in situations that involve counting. How many words are there on this page? The answer depends on whether you count repetitions of the same word. If you do, then you interpret “word” as word-token, if you don’t–you interpret it as word-type. Usually the number of word-tokens exceeds the number of word-types; for we do, as a rule, repeat.
Our ability to use language is preconditioned by our ability to recognize different tokens as being tokens of the same type. This “sameness” relation is often indicated by the physical similarity of tokens. Thus, the two tokens of ‘ability’ in the first sentence of this paragraph have exactly the same shape. But on the whole, what counts as being tokens of the same type is a matter of convention; similarity is not necessary. Think of the different fonts one can use for the same letters, and of the enormous variety of handwritings. (Reading someone’s written words is often impossible without knowing the language, even when the alphabet is known.) And to clinch the point, note that the same words are represented by tokens in different physical media: the acoustic and the visual.
Things would have been considerably simpler if we could disregard the difference between tokens of the same type. But this is not so; for, as the last example shows, different tokens of the same type may have different truth-values.
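The counting test for the type/token distinction can be tried out directly. The sketch below is only an illustration in Python, under one stipulated convention: two tokens count as tokens of the same word-type when they agree after lower-casing, so that ‘Our’ and ‘our’ are tokens of one type. As the text stresses, what counts as the same type is a matter of convention, and other conventions are possible.

```python
import re

def count_words(text: str):
    """Return (number of word-tokens, number of word-types) in text.

    Assumed convention: tokens are maximal runs of letters or apostrophes,
    and tokens that agree after lower-casing belong to the same type.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(tokens), len(set(tokens))

sample = ("Our ability to use language is preconditioned by our ability "
          "to recognize different tokens as being tokens of the same type.")

n_tokens, n_types = count_words(sample)
```

For this sentence the count gives 21 word-tokens but only 17 word-types, because ‘our’, ‘ability’, ‘to’ and ‘tokens’ each occur twice.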
Indexicals and Demonstratives
An indexical is a word whose reference depends–in a systematic way–on certain surroundings of its token, e.g., the token’s origin, its time, or its place. Such is the pronoun ‘I’, which refers to its utterer, and such are the words ‘now’ and ‘here’, which refer to the utterance’s time and place. The shift of reference may result in a truth-value change. Indexicals are, indeed, the most common cause for assigning different truth-values to different tokens of the same sentence. In the last example the difference in truth-value is caused by the indexical ‘I’, which denotes Jack, in the mouth of Jack, Jill–in the mouth of Jill. Quite often the indexicals are implicit. In
(1) It is raining,
the present tense indicates that the time is the time of the utterance. And, in the absence of an explicit place indication, the place is the place of the utterance. When (1) is uttered in New York, on May 17 1992 at 9:00 AM, it is equivalent to:
(1′) It is raining in New York, on May 17 1992 at 9:00 AM.
It is not difficult to spot indexicals, once you are aware of their possible existence. Besides ‘now’ and ‘here’, we have also the indexicals ‘yesterday’, ‘tomorrow’, ‘last week’, ‘next room’ and many others. Demonstratives, like indexicals, have systematically determined token-dependent references. They usually require an accompanying demonstration–some non-linguistic act of pointing. Such are the words ‘that’ and ‘this’. The use of ‘you’ involves a demonstrative element (the act of addressing somebody), as do sometimes ‘he’ and ‘she’. (It is not always easy to describe what exactly the demonstration is, but this is another matter.) Sometimes a distinction is made between pure indexicals–which, like ‘I’, require no demonstration–and non-pure ones. And sometimes ‘indexical’ is used for both indexicals and demonstratives.
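One way to picture the systematic dependence of indexicals on the token’s surroundings is to treat a sentence-type as a rule that yields a truth-value once the utterer, place and time are supplied. The following Python sketch does this for ‘I am tall’ and ‘It is raining’. Everything in it is stipulated for the illustration only: the `Context` record, the two small sets of “facts”, and the second location (Boston) are assumptions, not part of the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    """Surroundings of a sentence-token: who utters it, where, and when."""
    speaker: str
    place: str
    time: str

# Hypothetical facts, stipulated only for this illustration.
TALL_PEOPLE = {"Jill"}
RAIN_REPORTS = {("New York", "1992-05-17 09:00")}

def i_am_tall(ctx: Context) -> bool:
    # 'I' refers to the utterer of the token.
    return ctx.speaker in TALL_PEOPLE

def it_is_raining(ctx: Context) -> bool:
    # Implicit indexicals: the place and the time of the utterance.
    return (ctx.place, ctx.time) in RAIN_REPORTS

jack = Context(speaker="Jack", place="New York", time="1992-05-17 09:00")
jill = Context(speaker="Jill", place="Boston", time="1992-05-17 09:00")

jack_token = i_am_tall(jack)        # same sentence-type, Jack's token
jill_token = i_am_tall(jill)        # same sentence-type, Jill's token
rain_new_york = it_is_raining(jack)
rain_boston = it_is_raining(jill)
```

Under the stipulated facts Jack’s token of ‘I am tall’ gets F and Jill’s gets T, although both are tokens of one sentence-type; likewise ‘It is raining’ gets different values at the two places.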
Some Kinds of Ambiguity
Many, perhaps most, proper names denote different objects on different occasions. ‘Roosevelt’ can mean either the first or the second USA president of this name, ‘Dewey’ can refer either to the philosopher or to the Republican politician, ‘Tolstoy’ can refer to any of several Russian writers. First and second names, or initials, can help in avoiding confusion (thus, we distinguish between Teddy Roosevelt–the man who was fond of speaking softly while carrying a big stick, and Franklin Roosevelt–the Second World War leader in the wheelchair). Additional names reduce the ambiguity, but need not eliminate it. A glance in the telephone directory under ‘Smith’, or–in New York–under ‘Cohen’, will show this. Other distinguishing marks
can be used: ‘Dewey the philosopher’ versus ‘Dewey the politician’, ‘Johann Strauss the father’ versus ‘Johann Strauss the son’. Above all, a name’s denotation is determined by the context in which the name is used. (If I ask my daughter: has Bill telephoned? it is unlikely that she will take me to have referred to Bill Clinton.) But there are no clear-cut linguistic rules that regulate this. Various factors enter: what has been stated before, the topic of the discussion, and what is known of the interlocutor’s knowledge and intentions. Proper names behave quite differently from indexicals; the latter are subject to systematic rules (‘you’ refers to the person addressed, ‘now’ refers to the time of the utterance, etc.), the former are not. Besides indexicals and proper names, linguistic expressions in general may have different denotations, or meanings, on different occasions. The “same word” might mean different things, e.g., tank–a large container for storage, and tank–an armored vehicle on caterpillar treads. But here we should be careful, for the very difference of meaning is often taken to constitute a difference of words (i.e., of types). Homonyms are different words written and pronounced in the same way; their difference rests solely on difference in meaning. When ‘tank’ is split into homonyms, it is no longer a single ambiguous word. Accordingly, (2) John jumped into the tank, is, strictly speaking, not an ambiguous sentence (which has different truth-values on different occasions) but an ambiguous expression that can be read as more than one sentence: a sentence containing the ‘tank-as-container’-homonym, and a sentence containing the ‘tank-as-armored-vehicle’-homonym. The context in which (2) occurs (e.g., sentences that come before and after it) may help us to decide the intended reading. By contrast, different tokens of ‘It is now raining here’ are tokens of the same sentence.
For ‘now’ and ‘here’ do not constitute different words when used at different times, or at different places. A child learning to speak does not coin a new English word, when he uses ‘I’ for the first time. We can however say that the English language gained a new word when ‘tank’ (already in use as a name of certain containers) was introduced as a name of certain armored cars.¹ Many cases of ambiguity–where the meanings are linked–do not deserve to be treated as homonyms. ‘Word’ can mean word-type or word-token, but this does not constitute sufficient ground for distinguishing two homonyms. We would do better, one feels, to regard it as a single ambiguous word. Ambiguous terms are not the only source of sentential ambiguity; often the sentential structure itself can be construed in more than one way. (3) Taking the money out of his wallet, he put it on the table.
¹ By the same reasoning, no new word is coined when a new baby is given a current name, like ‘Henry’. But we did get a new homonym when the ninth planet was named ‘Pluto’.
Was it the money or the wallet he put on the table? That depends on the syntactic structure of (3); it is the first, if ‘it’ goes proxy for ‘the money’, the second–if it goes proxy for ‘his wallet’. Syntactic ambiguity takes place when the same sequence of words lends itself to different structural interpretations. The truth-value can depend on the way we structure the sentence, or–in more technical terminology–on the way we parse it. Here, again, the context can decide the intended parsing. We can have a concept of “sentence” according to which different parsings determine different sentences; if so, (3) is to be regarded in the same light as (2): an expression representing more than one sentence. But on the usual, everyday concept of sentence, (3) is a single syntactically ambiguous sentence. In symbolic logic the artificial language is set up in a way that bars any ambiguity. Every sentence has a unique syntactic structure and all referring terms have unique, context-independent references. Therefore a translation from natural language into symbolic logic involves an interpretation whereby, in cases of ambiguity, a particular reading is chosen. As a preparatory step, we can try to paraphrase the sentences of natural language, so as to eliminate various context dependencies. This is the subject of the next subsection.
Eliminating Simple Context Dependencies
Dependencies on context, which are caused by indexicals or by ambiguity, can be eliminated by replacing indexicals and ambiguous terms by terms that have unique and fixed denotations throughout the discussion. For example, each occurrence of ‘word’ can be replaced by ‘word-type’ or by ‘word-token’, depending on whether the first or the second is meant; and when either will do, we can make this explicit by writing ‘word-type or word-token’. Sometimes we resort to new nicknames: ‘The first John’ for our old school mate, ‘The second John’ for the new department chief. And, to be clear and succinct, we can introduce ‘John_1’ and ‘John_2’. The same policy can be used to eliminate homonyms. To be sure, ‘John_1’ is not an English name, but a newly coined word. Our aim, however, is not to preserve the original phrasings, but to recast them into forms more suitable for logical analysis. Indexicals can be eliminated by using names or descriptive phrases with fixed denotations. Thus (1)–when uttered in New York at 9:00 AM, May 17 1992–is rephrased as (1′). And ‘I am tall’–when uttered by Jill on May 10 1992–is recast as: (4) Jill is tall on May 10 1992. Here ‘is’ is to be interpreted in a timeless mode, something like is/was/will-be. Note the different degrees of precision in the specifications of time. The weather may change from hour to hour (hence we have ‘9:00 AM’ in (1′)), but presumably Jill’s being tall is not subject to
hourly changes. In this way sentence-tokens that involve context-dependency are translated into what Quine named eternal sentences, that is: sentences whose truth-values do not depend on time, location, or other contextual elements. Note: We are not concerned here with a conceptual elimination of indexicals. The time scale used in (1′) and (4) is defined by referring to the planet earth, and ‘earth’ is defined by a demonstrative: this planet, or the planet we are now on. We aim only to eliminate context dependency that can cause trouble in logical analysis. And this is achieved by paraphrases of the kind just given. Note also that, for local purposes: if we are concerned only with a particular discourse, we have only to replace the terms whose denotations vary within that discourse. If ‘today’ refers to the same day in all sentence-tokens that are relevant to our purpose, we need not replace it. The situation is altogether different when it comes to ambiguities in general. If my daughter tells me ‘Bill telephoned an hour ago’, I shall probably guess correctly which of the various Bills it was. But all I can appeal to is an assortment of considerations: the Bill I was expecting a call from, the Bill likely to call at that time, the Bill that has recently figured in our social life, etc. Considerations of this kind are classified in the philosophy of language under pragmatics. The resort to pragmatics, rather than to clear-cut rules, is of great interest for linguistic theory and the philosophy of language; but is of no concern for logic, at least not the logic that is our present subject. For our purposes, it is enough that there is a paraphrase that eliminates context-dependency. Logic takes it up from there. How we get there is another concern. The cases considered thus far are the tip of the iceberg. The real game of ambiguity and context-dependency starts when adjectives, descriptive phrases, and verbs are brought into the picture.
This subject–a wide area of linguistic and philosophical investigations–is not part of this course. A few observations may however give us some idea of the extent of the problems. Consider attributes such as small, big, heavy, dark, high, fast, slow, rich, and their like. You don’t need much reflection to realize that they are relative and highly context-dependent. (5) Lionel is big.
(6) Kitty is small.
You may deduce from (5) and (6) that Lionel is bigger than Kitty. Not so if it is known that Lionel is a cat and Kitty is a lioness. In that case the ‘big’ in (5) should read: ‘a big cat’, or ‘big as cats go’, and the ‘small’ in (6)–as ‘a small lioness’. If we apply the strategy suggested above for ambiguous names, we shall split ‘big’ and ‘small’ into many adjectives, say ‘big_x’ and ‘small_x’ where ‘x’ indicates some kind of objects; the ‘big’ in (5) is thus read as ‘big_c’: big on the scale of cats, and the ‘small’ in (6)–as ‘small_l’: small on the scale of lions. Another, better strategy is to provide for a systematic treatment of compounds such as ‘big as a ...’, ‘rich as a ...’, where ‘...’ describes some (natural) class. Systematic treatments do not apply however when the adjective must be interpreted by referring to a particular occasion. ‘The trunk is heavy’ can mean that the trunk is heavy, when I do the lifting, or when you do the lifting, or when both of us do the lifting. And occasionally there is nothing precise or explicit that we can fall back on. (7) Jack Havenhearst lives in a high building on the outskirts of Toronto. How high is “high”? A high building in Jerusalem is not so high in Manhattan. The context may decide it, or it may not. Perhaps the speaker has derived his statement from some vague recollection. In cases like this, when ambiguity is tied up with vagueness, the very possession of a definite truth-value is put into question. Before proceeding, note that the problems just mentioned concern attributes of the “neutral” kind. We have not touched on evaluative terms such as ‘beautiful’, ‘ugly’, ‘tasteful’, ‘repulsive’, ‘nice’, ‘sexy’, ‘attractive’, and their like, which involve additional subjective dimensions, nor on: ‘important’, ‘significant’, ‘marginal’, ‘central’, not to mention the ubiquitous ‘good’ and ‘bad’.
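The strategy of treating ‘big as a ...’ systematically can be pictured by making the comparison class an explicit argument of the adjective. The Python sketch below is one hedged way to do so for (5) and (6). The typical weights and the weights assigned to Lionel and Kitty are made-up round numbers chosen for the illustration, not data, and the sharp cut-off at the typical weight is an idealization that ignores the vagueness discussed next.

```python
# Illustrative typical weights in kilograms; made-up round numbers, not data.
TYPICAL_WEIGHT_KG = {"cat": 4.0, "lion": 150.0}

def big_as(kind: str, weight_kg: float) -> bool:
    """'big as a <kind>': heavier than is typical for that kind."""
    return weight_kg > TYPICAL_WEIGHT_KG[kind]

def small_as(kind: str, weight_kg: float) -> bool:
    """'small as a <kind>': lighter than is typical for that kind."""
    return weight_kg < TYPICAL_WEIGHT_KG[kind]

# Hypothetical individuals: Lionel is a cat, Kitty is a lioness.
lionel = ("cat", 7.0)
kitty = ("lion", 110.0)

sentence_5 = big_as(*lionel)     # 'Lionel is big', read on the scale of cats
sentence_6 = small_as(*kitty)    # 'Kitty is small', read on the scale of lions
lionel_bigger_than_kitty = lionel[1] > kitty[1]
```

With these stipulations both (5) and (6) come out true while ‘Lionel is bigger than Kitty’ comes out false, which is why the inference from (5) and (6) fails once the comparison classes differ.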
1.1.3
Vagueness and Open Texture
Some people are definitely bald, some are definitely not. But some are borderline cases, for whom the question: Is he bald? does not seem to have a yes-or-no answer. The same applies to every type of statement you might think of in the context of everyday discourse. For example, is it raining now? Sometimes the answer is yes, sometimes no, and sometimes neither appears satisfactory (does the present very light drizzle qualify as “rain”?) Are we now in the USA? That type of question has almost always a well-defined answer, even when we don’t know it; for international borders–things of extreme significance–are very
carefully drawn. But what if somebody happens to straddle the border-line? There is first the problem of pinpointing one’s location, and second the problem of pinpointing the border; and in both the “pinpointing” has limited precision. Even the question: Is the date now May 17 1992? may, on some occasion, lack a yes-or-no answer; for the time-point defined by the utterance of ‘now’ is determined with no more than a certain precision, surely not up to a millisecond, say. In everyday discourse we often handle borderline cases by employing a more refined classification. For example, we can use ‘quite bald’ and ‘hairy’ for the clear cases, and ‘baldish’ for those in between. This provides for more accurate descriptions. But it leaves us in the same situation when it comes to drawing the line between bald (in the old sense) and non-bald. And if we were to ban our original adjective, allowing only the refined ones, there would be still borderline cases for each of the new attributes. Cases in which neither T nor F is to be assigned are characterized as truth-value gaps, or for short, as gaps. The cases considered before–those of indexicals and ambiguous terms–are not genuine gaps, in as much as they can be resolved by removing the ambiguity. In cases of vagueness the gaps are for real (or so many philosophers think). Vagueness inheres in our very conceptual fabric. It does not arise because we are missing some facts. Knowing all there is to know about the hairs on Mr. Hairfew’s head: their number, distribution and length, may not determine whether he is bald or not. There is no point in insisting on a yes-or-no answer. The concept is simply not intended for cases like his. If you think of it you will see that the phenomenon is all around. Only mathematics is exempt, and some theoretical parts of exact science. It appears whenever an empirical element is present.²
Vagueness has been often regarded as a flaw, something to get rid of–if possible.
But it has a vital role in reducing the amount of processed information. In principle, we could–instead of using ‘young’, ‘rich’, ‘bald’, and their like–use descriptions that tell us one’s exact age, one’s financial assets to the last penny, or one’s precise amount of cranial hair. All of which would involve colossal waste of valuable resources. For in most cases a two-fold classification into young and not-young, rich and not-rich, bald and not-bald, will do. And additional information can be obtained if and when needed. The efficiency thereby achieved is worth the price of borderline cases with truth-value gaps. A deeper reason for vagueness is that every conceptual framework gives us only a limited purchase on “reality” or “the facts”. There is always a place for surprise, for something turning up that resists classification, something that defies our neatly arranged scheme. The examples considered so far are relatively simple borderline cases. In these situations a certain classification does not apply, yet an alternative exhaustive description is available.
² It is not a priori impossible that some experiment will turn up a particle for which the question: is it an electron? has no clear-cut answer. The theory rules this out; but the theory may change, for its authority is established by empirical criteria. Here however we are confronted with open texture rather than with simple vagueness.
There is no mystery about the financial status of Ms. Richfield. In principle a full list of her assets–calculated to the last cent–can be drawn. The difficulty in deciding whether she is rich stems solely from the vagueness of ‘rich’. But there are situations where no alternative description is available, situations that involve more than occasional borderline cases. (8) Jeremy, the chimpanzee, knows that Jill will feed him soon. Can we say that a monkey “knows” that something is going to happen in the near future? Granting the way we apply ‘know’ to people (itself a knotty issue and a subject of a vast philosophical literature) can we apply it, in some instances, to animals? All we can do is speculate on the monkey’s mode of consciousness, dispositions, state of mind or state of brain. And it is not even clear what are the factors that are relevant for deciding the status of (8). Surely there will be conflicting opinions. Cases of this kind display the undecidedness of our conceptual apparatus, the fact that it is open-ended and may evolve in more than one way. They are known as open texture. As the example above shows, open texture involves quite common concepts. Think of generosity, freedom, or sanity.
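The notion of a truth-value gap used in this section can be given a simple bookkeeping form: besides T and F we allow “no value”. The Python sketch below illustrates this for ‘bald’, with `None` standing for a gap. The two numerical thresholds are purely hypothetical placeholders introduced for the illustration; the text’s own point is that no such sharp lines are actually available, so the sketch should be read as a picture of what a gap is, not as an analysis of baldness.

```python
from typing import Optional

def is_bald(hairs: int,
            clearly_bald_up_to: int = 1_000,        # hypothetical threshold
            clearly_not_bald_from: int = 50_000     # hypothetical threshold
            ) -> Optional[bool]:
    """Return True (T), False (F), or None for a truth-value gap."""
    if hairs <= clearly_bald_up_to:
        return True
    if hairs >= clearly_not_bald_from:
        return False
    # Borderline case: neither T nor F is assigned.
    return None

clear_case_bald = is_bald(0)
clear_case_hairy = is_bald(100_000)
mr_hairfew = is_bald(20_000)   # full information about the hairs, yet no value
```

Note that the gap for `mr_hairfew` does not reflect missing information: the exact number of hairs is given. Refining the classification, say by adding ‘baldish’, would only move the problem to the new boundaries.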
Vagueness of Generality General statements convey quantitative information regarding some class (or multitude) of objects. They are usually expressed by words such as ‘all’, ‘every’, ‘some’, ‘most’, and their kin. For example: All human beings have kidneys and lungs. In classical logic generality is expressed by quantifiers, which have precise unambiguous interpretations. But in natural language the intended extent of generality is often ambiguous, as well as vague. ‘Everyone’ can cover many ranges, from one’s set of acquaintances to every human on earth. Consider, for example: (9) Everyone knows that Reagan used to consult astrologers. (10) Everyone wants to be rich and famous. (11) Everyone will sometime die. Only in (11) can we interpret ‘everyone’ as meaning every human being– the way it is construed in symbolic logic. In (9) and in (10) the intended interpretation is obviously different. In (9) ‘everyone’ refers to a very small minority: people who are knowledgeable about Reagan. (9) is just another way of saying that the item in question had some publicity. (10) covers a wider range than (9), but falls short of the generality of (11). Even when the range covered
by ‘everyone’ or ‘everything’ is explicit, the strength of the assertion can vary. For example, when the teacher asserts (12) Everyone in class passed the test, she will be taken literally; her assertion would be misleading even if a single student had failed. But a casual remark: (13) Everyone in college is looking forward to the holiday season, means only that a large majority does; it would not be considered false on the ground of a few exceptions. How large is the required majority? This is vague. Such phenomena are even more pronounced when the general statement–usually expressed by means of the indefinite plural–is intended to express a law, or a rule. For rules may have exceptions. (And the exceptions to this last rule are in mathematics, or some of the exact sciences, or in statements like (11).) The amount of tolerated exceptions is vague. Consider, for example: (14) Women live longer than men, (15) When squirrels grow heavy furs in the autumn, the winters are colder, (16) Birds fly. Statistical data (e.g., average life span) can be cited in support of (14); but under what conditions is the sentence true? This is vague. (15) sums up a general impression of past experience; presumably, statistics can be invoked here as well. (16), on the other hand, is better viewed as a rule that determines the “normal” case: If something is known to be a bird, then–in the absence of other relevant information–presume that it flies. The general principles concerning ambiguity and vagueness apply also here. We may have to give up precise systematic prescriptions and settle for pragmatic guidelines. And we should accept the possibility of borderline cases, where the assignment of any truth-value is rather arbitrary. Cases of the types just given can, of course, be handled by using mathematical-like systems. (14) and (15) call for statistical analysis, with all the criteria that go with it.
(16), on the other hand, indicates reasoning based on normalcy assumptions, where one’s conclusions are retracted, if additional information shows the case to be atypical (in the relevant way). Little reflection is needed to see that almost all our decision making involves reasoning of that kind. With no information to the contrary, the usual order of things is presupposed. To do otherwise would freeze all deliberate action. In recent years a great deal of research, by computer scientists, logicians and philosophers, has been devoted to systems within which
reasoning that involves retractions can be expressed. They come under the general term of non-monotone logic.
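The retraction pattern described for (16) can be illustrated by a minimal sketch. It is not one of the formal systems mentioned above; the function name and the list of exceptions are hypothetical, chosen only to show a conclusion being withdrawn when more information arrives.

```python
# A toy illustration of reasoning by default; all names are hypothetical.
def presumably_flies(facts):
    """Apply the default 'birds fly' unless the facts mark the case as atypical."""
    if "bird" not in facts:
        return None                      # the rule says nothing about non-birds
    exceptions = {"penguin", "ostrich", "broken wing"}   # assumed, not exhaustive
    if facts & exceptions:
        return False                     # extra information defeats the default
    return True                          # normal case: presume that it flies

known = {"bird"}
print(presumably_flies(known))           # the default conclusion

known = known | {"penguin"}              # additional information arrives
print(presumably_flies(known))           # the earlier conclusion is retracted
```

Adding a premise removes a conclusion that was drawn before. This is what cannot happen in classical logic, where conclusions, once derived, survive any enlargement of the premises.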
1.1.4 Other Causes of Truth-Value Gaps
Non-Denoting Terms
Declarative sentences may contain descriptive expressions that function as names but lack denotations. The standard, by now worn-out example is from Russell:
(17) The present king of France is bald.
(It is assumed that (17) is uttered at a time when there is no king of France. If needed, the time-indexical can be eliminated by introducing a suitable date.) Proper names, as well, may lack denotation: ‘Pegasus’, or ‘Vulcan’ (either the name of the Roman god, or of the non-existent planet).³ Frege held that declarative sentences containing non-denoting terms have no truth-value. This view was later adopted, for different reasons, by Strawson. Russell, on the other hand, proposed a rephrasing by which these sentences get truth-values; (17), for example, is reconstructed as:
(17′) There is a unique person who is a King of France, and whoever is a King of France is bald.
Therefore the sentence is false. Also false, by Russell’s reconstruction, is:
(18) The present king of France is not bald.
But
(19) It is not the case that the king of France is bald.
is true. (The difference between (18) and (19) is accounted for by giving the negation different scopes with respect to the descriptive term ‘the king of France’, a point that we shall not discuss here.)

³ ‘Neptune’, ‘Pluto’, and ‘Vulcan’ were introduced as names of planets whose existence was deduced on theoretical grounds from the observed movements of other planets. Neptune and Pluto were later observed directly. ‘Vulcan’ did not make it. The effects attributed to Vulcan were later explained by relativity theory.
As far as logic is concerned the question is more or less settled–not by a verdict in favour of one of the views, but by having the issue sufficiently clarified, so as to reduce it to a choice between well understood alternatives. It boils down to what one considers as fitting better our linguistic usage. Intuitions may vary. Nonetheless, the different resulting systems are variants within the general framework of classical logic.
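Russell’s treatment of (17), (18) and (19) can be checked on a toy model. The sketch below is only an illustration under stated assumptions: the three-element domain, the choice of who is bald, and the helper `the_king_is` are all hypothetical; the only feature taken from the text is that nobody in the domain is king of France.

```python
# Toy model; the domain and both predicates are illustrative assumptions.
domain = ["a", "b", "c"]

def is_king_of_france(x):
    return False                  # as assumed in the text: there is no king

def is_bald(x):
    return x == "b"               # arbitrary choice, irrelevant to the outcome

def the_king_is(pred):
    """Russell-style reading: exactly one king exists, and that king satisfies pred."""
    kings = [x for x in domain if is_king_of_france(x)]
    return len(kings) == 1 and pred(kings[0])

s17 = the_king_is(is_bald)                     # (17)
s18 = the_king_is(lambda x: not is_bald(x))    # (18): negation inside the description's scope
s19 = not the_king_is(is_bald)                 # (19): negation applied to the whole of (17)
print(s17, s18, s19)
```

Under this reading (17) and (18) both come out false, because the existence claim fails, while (19) comes out true. The two negative sentences differ only in where the negation is placed.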
Category Mistakes
In the usual order of things, almost every attribute and every relation is associated with a certain type of objects. When the objects do not fit the attribute, we get strange, though grammatically correct, sentences; for example,
(20) The number 3 is thirsty.
This is a category mistake; numbers are not the kind of things that are thirsty, or nonthirsty. Some may want to treat (20) as neither true nor false. Alternatively, (20) and its kin can be regarded as false. This policy can be extended so as to handle negations and other compounds. As in the case of non-denoting terms, the ways of dealing with such examples are well-understood and can be accommodated, as variants, within the general framework of classical logic.
Non-denoting terms and category mistakes, interesting as they are when it comes to working out the details, do not pose a foundational challenge to the framework of classical logic. But vagueness and open texture do.
1.2 Some Other Uses of Declarative Sentences
Declarative sentences have other uses, besides that of conveying information, or describing the world. I do not mean their misuse, through lying, or by misleading. Such misuses are direct derivatives of their ordinary use. I mean uses that are altogether different. They have been extensively studied by philosophers and linguists, and are worth noting for the sake of completeness and in order to give us a wider perspective.
Fictional Contexts
In a play, or in a movie, the players utter declarative sentences, much as people in “real life” do; but what goes on is obviously different. Compare, for example, an exclamation of ‘Fire!’ that is part of the play, with a similar exclamation, by the same actor in the same episode, when he observes a real fire breaking out in the hall. We say that the utterances in the play are not true assertions, or are not performed in an assertive mode. They are pretended assertions within a make-believe game.
Yet, within the game, they are subject to the same logic that applies to ordinary statements. Furthermore, truth-values can be meaningfully assigned to certain statements about fictional characters. ‘Hamlet killed Polonius, and was not sorry about it’ will be regarded as true, while ‘Hamlet intended to kill Polonius’ will be regarded as false. This merely reflects what is found in the play. The pretense reaches its limits easily: ‘Hamlet had blood type A’ is neither true nor false, or–if we adopt Russell’s method–false by some legislation in logic. Consider, for contrast, ‘Shakespeare had blood type A’, which has a determinate truth-value; even though we do not, and probably never will, know what this value is. No logical legislation can settle this. The declarative sentences that appear in novels, poetry, or jokes, achieve a variety of effects: they can amuse, entertain, evoke an aesthetic experience, a feel or a vision. Some can enlighten us, but not in the way that ‘The earth turns around the sun’ does.
Metaphors, Similes, and Aphorisms
Consider the following.
Skepticism is the chastity of the intellect. (Santayana)
To deny, to believe and to doubt well, are to a man as a race is to a horse. (Pascal)
Those who can, do; those who cannot, teach. (Shaw)
Taken literally, the first is trivially false, or a category mistake (skepticism is not the chastity of something and ‘chastity’ does not apply to intellects). The second is trivially true or trivially false–depending on whether the claimed likeness is indefinitely wide (any two things are alike in some respect) or narrow and precise. The third–as a plain general statement–is false on any of the usual criteria. Evidently, the points of the sayings have little to do with their literally determined truth-values. Many have found in metaphors (of which the first is an example) hidden meanings, which can be approximated–though not captured–by non-metaphorical rephrasings. Others have argued that the value of metaphor–what is transmitted, or evoked–is outside the scope of linguistic meaning. And yet a metaphor can be misleading in a way that a joke, or a poem cannot. The same can be said of similes, which achieve their effect through a somewhat different mechanism. Finally there are sayings like the third, which are not to be evaluated literally, but are neither metaphors nor similes. Their point is to underline some noteworthy feature, to focus our attention on a certain pattern. ––––––
To sum up: in this chapter a brief overview was given of declarative sentences, their basic role in conveying factual information, the ways they function in natural language and the problems of assigning them truth-values. We have also noted some other uses of declarative sentences. To most of this we shall not return. But you should be aware of the wider picture and of the perspective within which logic has been, and still is being developed. We shall often emphasize the relations between symbolic logic and natural language. One should remember, however, that symbolic logic is not concerned with language per se. Its aim is not the discovery of linguistic structures, or laws; that is the job of the linguist. Logic and language are closely related because in symbolic logic we try, following linguistic guidelines, to express in a precise, structured way some of the things expressed in natural language. Many aspects of linguistic usage are not representable in a system of logic. Even with respect to conveying factual information, a statement will often resist fruitful formalization, either because it is too vague or confused, or because it is too complex or subtle.
Chapter 2
Sentential Logic: Some Basic Concepts and Techniques
2.0
English sentences are usually made from nouns, verbs, adjectives, etc. But sometimes the smaller components are themselves sentences. For example:
(1) Jack went to the movie and Jill went home
is a compound made of two sentences: ‘Jack went to the movie’, and ‘Jill went home’, joined by the word ‘and’. We can also combine sentences into bigger sentences by ‘or’:
(2) Jack will get a job or Jill will get a job.
In principle, any two sentences can be combined, using ‘and’ or ‘or’. (Recall that, from now on, ‘sentence’ means a declarative sentence.) We can also make a sentence from a single sentence by negating it:
(3) Jack did not graduate last year
can be seen as the negation of:
(4) Jack graduated last year.
Strictly speaking, (4) is not a part of (3); not in the same way that ‘Jack went to the movie’ is a part of (1). The forming of negation involves insertions and possibly additional changes, and it varies from language to language. In English and French we usually use an auxiliary verb (‘do’ or ‘is’ in English, ‘avoir’ or ‘être’ in French), in Hebrew we do not. In English the auxiliary is succeeded by ‘not’, in French it is placed between ‘ne’ and ‘pas’. Grammatical details like these are abstracted away when we set up the system of symbolic logic. What is essential is that every sentence can be negated. Natural languages provide more than one way of forming negations. Instead of (3), we can use the following as a negation of (4):
(3′) It is not the case that Jack graduated last year.
(Here, indeed, (4) appears as a part of the negated sentence.)
Sentential logic is concerned exclusively with the making of sentences from sentences, in ways analogous to the ones just illustrated. Predicate logic, to which we shall come later, provides, in addition to the apparatus of sentential logic, a finer analysis, whereby sentences are made from parts analogous to nouns and verbs. The sentences of the system we are going to study do not belong to any natural language. But the system is meant to bring forth patterns that underlie language in general, inasmuch as language expresses logical thinking. For this purpose we posit abstract entities, in the role of sentences, and we postulate certain properties; just as in geometry we presuppose that the points, lines and planes satisfy certain axioms. Sentential logic includes certain operations, by which sentences can be combined into sentences. We shall refer to these operations as sentential connectives, or for short connectives. One connective, called conjunction, corresponds to the operation effected in English by using ‘and’ (as in (1)).
Another connective corresponds to the operation of combining two sentences by using ‘or’ (as in (2)). For the moment we leave unspecified the exact nature of the sentences. We only assume that they are given together with the connective operations, and that they satisfy certain properties, which we shall state as we go along.
Note: ‘Connective’ suggests the joining of more than one sentence, but it covers also operations on single sentences, such as negation. A connective is called binary if it operates on pairs of sentences. It is called monadic if it operates on single sentences. The system to be studied here is based on one monadic connective: negation, and on several binary ones. In principle one can consider connectives that operate on more than two sentences. But we shall not require them, because we shall be able to express whatever is needed by repeated applications of the connectives we have.¹

¹ For each connective, the number of sentences it combines is fixed. One can, however, generalize the notion of a connective, by allowing connectives that can combine, at one go, a variable number of sentences.
A sentence obtained by applying connectives to sentences (possibly, more than once) is called a sentential compound. The sentences to which the connectives are applied are known as its sentential components, or components for short. Connectives can be applied repeatedly, yielding larger and larger compounds. The semantic aspect, i.e., the aspect of truth and falsity, is represented in sentential logic by assignments of truth-values. The truth-value of a compound is determined by the truth-values of the components and by the connective that has been applied.
2.1 Sentential Variables, Connectives and Truth-Tables, Atomic Sentences
2.1.0 We shall use
‘A’, ‘B’, ‘C’, ‘D’, ‘A′’, ‘B′’, ‘D′’, ... etc.
as sentential schematic letters, i.e., as signs that stand for any sentences. A general claim is a claim that holds for all sentences that the schematic letters can stand for. We also call the schematic letters, conveniently, sentential variables; that is, variables ranging over arbitrary sentences.² Learning sentential logic is analogous to learning geometry, arithmetic, or algebra. Initially, we do not define what points, lines and planes are, or what numbers are; we rely on some intuitive understanding, and we lay down certain laws. Just as in algebra we have numerical operations: addition and multiplication denoted by ‘+’ and by ‘·’, we have in sentential logic the sentential connectives; for example, conjunction, which we shall denote by ‘∧’. And just as in algebra we speak of the numbers x, y, z, where ‘x’, ‘y’, ‘z’, are variables ranging over numbers, so in logic we speak of the sentences A, B, C. In algebra we say:
For every two numbers, x and y, there exists a number x · y, which is their product.
And in sentential logic we say:
For every two sentences, A and B, there exists a sentence, A ∧ B, which is their conjunction.

² It is customary to regard variables as taking values in some domain. We can thus regard sentential variables as having sentences as their possible values. But, as we shall see, the sentences themselves have truth-values, T and F. Accordingly we shall speak of the truth-value of A, the truth-value of B, etc.
Sentences are, however, more sensitive than numbers to the ways in which the operations are applied. In algebra we have the general equality x·y = y·x. But A ∧ B is, in general, different from B ∧ A. Indeed, ‘Jack went to the movie and Jill went home’ is not the same sentence as ‘Jill went home and Jack went to the movie’. We shall later see that A ∧ B and B ∧ A are logically equivalent; yet they are different sentences, unless A and B are the same sentence. Sentential expressions are either sentential variables, or expressions obtained by combining sentential variables (in the appropriate way) with connective signs. For example: ‘A ∧ B’ is a sentential expression denoting the conjunction of A and B.
2.1.1 Negation
Negation is an operation on sentences. It is monadic, i.e., it applies to single sentences. For every sentence, A, there is a sentence, called the negation of A. We shall use ‘¬’ as the name for negation and we shall write the negation of a sentence A as:
¬A
A negation of a sentence is referred to, for short, as ‘negation’. There is therefore an ambiguity in the use of ‘negation’: it can refer to the operation itself, or to a sentence that results from this operation. The intended meaning will be clear from the context. Note that in stating the rule for negation we have used ‘A’ to stand for an arbitrary sentence. We could have used, of course, any other sentential variable, e.g., ‘B’. Since ¬A is a sentence, there exists also a sentence which is the negation of ¬A, namely:
¬¬A
We can continue in the same way and get, for any sentence A, the sentences:
¬A, ¬¬A, ¬¬¬A, ¬¬¬¬A, ..., ad infinitum.
You can compare negation with the algebraic operation of forming the negative: for every number x we have the negative of x, denoted as ‘−x’. The double negative is equal to the original number, −(−x) = x, but when it comes to sentences the situation is different: ¬¬A is (we shall see later) logically equivalent to A, but they are different sentences. In fact, all the sentences in the above-written list are different. The truth-value of the negation of a sentence is determined by the following simple semantic law:
If the value of A is T, then the value of ¬A is F; if the value of A is F, then the value of ¬A is T. Regarding T and F as opposite values we can say that the value of ¬A is the opposite value of A: the effect of negation is to toggle (i.e., reverse) the truth-value.
2.1.2 Conjunction
For every two sentences A and B there is a sentence called the conjunction of A and B, which is written as:
A ∧ B
We say that A and B are the conjuncts of A ∧ B. Again, there is an ambiguity: ‘conjunction’ denotes the operation, and is also used to refer to the resulting sentence. (Note that in algebra we have two names: the result of applying addition to x and y is the sum of x and y; and the result of applying multiplication is the product.) The truth-value of the conjunction of two sentences is determined by the following rule:
If both A and B have the value T, then A ∧ B has the value T. In every other case, A ∧ B has the value F.
(Note that “every other case” covers here three cases: A gets T and B gets F, A gets F and B gets T, A gets F and B gets F.)
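The two semantic rules, for negation and for conjunction, can be put side by side in a short sketch. It is only one possible rendering: the class names `Var`, `Not`, `And` and the function `value` are assumptions introduced here, and Python’s `True` and `False` stand in for T and F. The sketch also separates two questions that the text keeps apart: whether two sentences are the same sentence, and whether they have the same truth-value.

```python
from dataclasses import dataclass

# Sentences as structured objects (class names are illustrative assumptions).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

def value(s, assignment):
    """Truth-value of sentence s, given truth-values for its variables."""
    if isinstance(s, Var):
        return assignment[s.name]
    if isinstance(s, Not):
        return not value(s.arg, assignment)          # negation reverses the value
    # conjunction: T only when both conjuncts get T
    return value(s.left, assignment) and value(s.right, assignment)

A, B = Var("A"), Var("B")
print(And(A, B) == And(B, A))      # False: they are different sentences
print(Not(Not(A)) == A)            # False: so are these

v = {"A": True, "B": False}
print(value(And(A, B), v) == value(And(B, A), v))    # True: same truth-value
print(value(Not(Not(A)), v) == value(A, v))          # True: same truth-value
```

Equality of the objects plays the role of identity of sentences, while `value` plays the role of the semantic rules.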
Repeated Applications and Grouping
By applying connectives to sentences we get sentences, to which we can again apply connectives, getting further sentences, and so on. We can form, for example, the negation of B and then we can form the conjunction of A with it:
A ∧ ¬B.
Note that the similar expression:
¬A ∧ B
can be interpreted in two ways:
(i) The conjunction of ¬A and B: (¬A) ∧ B
(ii) The negation of A ∧ B: ¬(A ∧ B)
(i) and (ii) are different sentences, which can, moreover, differ in truth-value: if B gets F, then you can easily verify that, independently of the value of A, (¬A) ∧ B gets F; but ¬(A ∧ B) gets T. In (¬A) ∧ B the scope of the negation is A; this is the sentence on which negation operates within the compound. In ¬(A ∧ B) the scope of the negation is A ∧ B. The notion of scope applies also to conjunction, as well as to the other connectives we shall later introduce. An occurrence of a conjunction has a left scope and a right scope. In (¬A) ∧ B the left scope of the conjunction is ¬A; in ¬(A ∧ B) it is A. In both, the right scope is B. Again, an analogy with algebra can clarify things:
(−3) + 4 ≠ −(3 + 4) and (3 · 4) + 5 ≠ 3 · (4 + 5).
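The claim that the two readings can differ in truth-value is easy to verify by running through all four cases. The following sketch does so, with `True` and `False` standing in for T and F; the variable names are chosen here for illustration.

```python
from itertools import product

# Each row: value of A, value of B, value of (¬A) ∧ B, value of ¬(A ∧ B).
rows = [(a, b, (not a) and b, not (a and b))
        for a, b in product([True, False], repeat=2)]

for a, b, reading_i, reading_ii in rows:
    print(a, b, reading_i, reading_ii)
```

In the two rows where B gets F, the third entry is `False` and the fourth is `True`, whatever the value of A.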
Parentheses are to be used whenever needed to determine the way of reading the expression. They figure among the symbols from which sentential expressions are constructed. For convenience of reading, we shall also use square and curly brackets:
¬[(A ∧ B) ∧ ¬(C ∧ D)]
Regard them as parentheses written in a different way.
Grouping Conventions
In algebra there are standard notational conventions that allow us to suppress parentheses:
(i) ‘−3 + 4’ is read as ‘(−3) + 4’, not as ‘−(3 + 4)’.
(ii) ‘3 · 4 + 5’ is read as ‘(3 · 4) + 5’, not as ‘3 · (4 + 5)’.
These two conventions can be expressed by saying that the negative sign, ‘−’, and the multiplication sign, ‘·’, bind more strongly than the addition sign, ‘+’. Conventions of the same nature are adopted in logic. The convention is that ‘¬’ binds more strongly than any of the other connective names. This means the following:
When parentheses are missing, fix the scopes of the negation symbols to be the smallest scopes that are consistent with the given expression.
Here is how it works:
‘¬A ∧ B’ is read as: ‘(¬A) ∧ B’.
‘¬¬A ∧ ¬(B ∧ C)’ is read as: ‘(¬¬A) ∧ ¬(B ∧ C)’.
‘¬(¬A ∧ ¬B) ∧ C’ is read as: ‘[¬((¬A) ∧ ¬B)] ∧ C’.
In the first example, the scope of the negation is A. In the second example, the scope of the first (leftmost) negation is ¬A, the scope of the second is A and the scope of the third is B ∧ C. In the third example, the scope of the first negation is (¬A) ∧ ¬B, the scope of the second is A and the scope of the third is B.
Homework 2.1 Insert parentheses in the following expressions according to the grouping convention, so as to ensure unique readability. (Do not add parentheses if there is no danger of ambiguity.) Having done this, write down the scopes of all occurrences of negations in 4, and all the left and right scopes of the occurrences of conjunctions in 2. (In each case start from the leftmost occurrence.)
1. ¬(¬A ∧ ¬(B ∧ A))
2. ¬(¬A ∧ ¬(B ∧ C)) ∧ (A ∧ B)
3. ¬(A ∧ (¬A ∧ ¬B))
4. ¬(A ∧ (¬A ∧ ¬C)) ∧ ¬¬B
5. C ∧ ¬(C ∧ ¬(A ∧ C))
6. A ∧ ¬(C ∧ (¬C ∧ B))
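The grouping convention for ‘¬’ amounts to a small reading procedure, which can be sketched as a parser. This is an illustration and not part of the system itself: the functions `parse` and `show` and the tuple representation are assumptions made here, and so is the decision to group an unbracketed chain of conjunctions to the left, a case the convention above does not address.

```python
def parse(text):
    """Read an expression built from letters, '¬', '∧' and brackets into nested tuples."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def conjunction():
        left = negation()
        while peek() == "∧":
            take()
            # assumption: unbracketed chains of '∧' are grouped to the left
            left = ("and", left, negation())
        return left

    def negation():
        if peek() == "¬":
            take()
            return ("not", negation())     # smallest scope: only the next unit
        return unit()

    def unit():
        tok = take()
        if tok in "([":
            inner = conjunction()
            if take() not in ")]":
                raise ValueError("missing closing bracket")
            return inner
        return tok                          # a sentential variable

    tree = conjunction()
    if pos != len(tokens):
        raise ValueError("unexpected symbol: " + tokens[pos])
    return tree

def show(tree):
    """Write the tree back with parentheses around every compound."""
    if isinstance(tree, str):
        return tree
    if tree[0] == "not":
        return "(¬" + show(tree[1]) + ")"
    return "(" + show(tree[1]) + " ∧ " + show(tree[2]) + ")"

print(show(parse("¬A ∧ B")))          # ((¬A) ∧ B)
print(show(parse("¬(A ∧ B)")))        # (¬(A ∧ B))
```

In the resulting tree, the scope of an occurrence of negation is the second member of its `("not", ...)` tuple, and the left and right scopes of a conjunction are the second and third members of its `("and", ..., ...)` tuple.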
2.1.3 Truth-Tables
Truth-tables are a standard, commonly used device for showing how the truth-values of sentences are determined by the values of their sentential components. The truth-values of each sentence are written, in the column headed by it, in the same row containing the truth-values of its components. Here is the truth-table for negation.

A  ¬A
T  F
F  T

And the truth-table for conjunction is:

A  B  A∧B
T  T  T
T  F  F
F  T  F
F  F  F
Note that the use of ‘A’ and ‘B’ is of no particular significance. We could have used any other sentential variables. A truth-table shows how to correlate, with every possible assignment of truth-values to the components, a value for the whole sentence. The order of rows is not essential; we can rearrange them arbitrarily. We can also rearrange the columns, provided that they keep the same headings. We can, for example, rewrite the truth-tables for negation and conjunction thus:

A  ¬A
F  T
T  F

A  B  A∧B
T  F  F
F  F  F
F  T  F
T  T  T

B  A∧B  A
T  T    T
F  F    F
F  F    T
T  F    F

It is desirable however to adopt some uniform fixed arrangement. And this is what we shall do. We can use a single truth-table to show the values of several sentences. In particular, when a sentence is built by iterating the connectives, it is convenient to have columns for the “intermediate” sentences:

A  B  ¬B  A ∧ ¬B
T  T  F   F
T  F  T   T
F  T  F   F
F  F  T   F

Here, we have a column for ¬B which, together with the column for A, can be used to determine the truth-values for A ∧ ¬B. You can include, or skip, such intermediate columns according to your convenience. The case of iterated negation can be described as follows:

A  ¬A  ¬¬A  ¬¬¬A  ...
T  F   T    F     ...
F  T   F    T     ...

Several sentences, not necessarily components of each other, can be included in a single truth-table. The table should include a column for every sentential variable that occurs in any of the sentential expressions. (Of course, the value of a sentential variable has no effect on the values of expressions not containing it.) The following is such an example.

A  B  C  C ∧ ¬B  ¬(C ∧ ¬B)  ¬A ∧ ¬(C ∧ ¬B)  A ∧ (C ∧ ¬B)  ¬(A ∧ (C ∧ ¬B))
T  T  T  F       T          F               F             T
T  T  F  F       T          F               F             T
T  F  T  T       F          F               T             F
T  F  F  F       T          F               F             T
F  T  T  F       T          T               F             T
F  T  F  F       T          T               F             T
F  F  T  T       F          F               F             T
F  F  F  F       T          T               F             T

Note that we did not include a column for ¬B. The truth-value of C ∧ ¬B is obtained directly from those of C and B; the toggling of B’s value is “done in the head”. The truth-value of ¬(C ∧ ¬B) is then obtained from that of C ∧ ¬B. A column for ¬A is not included; the truth-value of ¬A ∧ ¬(C ∧ ¬B) is obtained directly from those of A and ¬(C ∧ ¬B).
The Number of Rows
The number of rows in a truth-table is determined by the number of sentential variables figuring in it. With one sentential variable we have two rows, one for each of its possible two values. Every additional sentential variable multiplies the number of rows by two (each row gives rise to two: one where the additional variable gets T, another where it gets F). Therefore, for two sentential variables the number of rows is 4, for three the number is 8, and for 4 it is 16. For n sentential variables, the number of rows is 2ⁿ.
Homework 2.2 Write down the truth-tables for the sentences of Homework 2.1, after inserting the required parentheses.
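The construction of a truth-table, and the count of its rows, can be mechanized along the following lines. This is a sketch under stated assumptions: the function `truth_table` and the way columns are supplied (as Python functions of an assignment) are choices made here for illustration. The rows are generated in the fixed arrangement used above, with T before F.

```python
from itertools import product

def truth_table(variables, columns):
    """One row per assignment: the variables' values followed by each column's value."""
    rows = []
    for values in product("TF", repeat=len(variables)):
        v = dict(zip(variables, (x == "T" for x in values)))
        rows.append(list(values) + ["T" if f(v) else "F" for f in columns.values()])
    return rows

# Column headings paired with functions that compute their values.
columns = {
    "C ∧ ¬B": lambda v: v["C"] and not v["B"],
    "¬(C ∧ ¬B)": lambda v: not (v["C"] and not v["B"]),
    "A ∧ (C ∧ ¬B)": lambda v: v["A"] and (v["C"] and not v["B"]),
}

table = truth_table(["A", "B", "C"], columns)
print(*(["A", "B", "C"] + list(columns)), sep="  ")
for row in table:
    print(*row, sep="  ")
print(len(table) == 2 ** 3)    # three variables give 2^3 = 8 rows
```

Each additional variable doubles the length of the list returned by `product`, which is the doubling argument given for the number of rows.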
Sentences and Sentential Expressions
Sequences of symbols such as ‘A’, ‘B’, ‘A ∧ B’, ‘¬C’, ‘(¬A) ∧ (B ∧ C)’ are sentential expressions. They refer to sentences, whose final identity depends on the sentences referred to by the sentential variables. We therefore speak of the sentence A ∧ B. But sometimes these sequences are used to refer to the expressions themselves. In this case we speak of the sentential expressions A, A ∧ B, etc., without using quotes. This double usage, which is sometimes discouraged in logic, is convenient and we shall occasionally resort to it. It is quite common in algebra: one speaks of the number x + y·z, and also of the expression x + y·z. It is also common in English: we speak of the man Jack, but we speak also of the name Jack.
Sentences as Instances: A sentential expression can be viewed as a scheme. A sentence falls under the scheme if it can be obtained by interpreting the sentential variables in the scheme as standing for certain sentences. We say in this case that the sentence is an instance of, or that it can be written as, the sentential expression. E.g., any sentence of the form A ∧ ¬B is an instance of C ∧ D; it is obtained by letting ‘C’ stand for A and ‘D’ for ¬B. The formal notion of substitution allows us also to substitute A ∧ ¬B for A. But of this later. Truth-Values of Sentential Expressions: Truth-tables are determined by sentential expressions. They show how the truth-values of the sentence represented by the expression depend on the values of the components represented by the variables. We can consider assignment of truth-values directly to the sentential expressions. Hence we may speak of the values assigned, in a given row, to the sentential variables, and of the corresponding value of the expression; i.e. the value that appears in the expression’s column.
Truth Functionality
Every connective of (classical) sentential logic is truth-functional. This means that the truth-value of a compound built by applying the connective is completely determined by the values of the components. This becomes clear if we consider two English connectives, one truth-functional, the other not:
(1) Jack will go to see the play, and Jill says that the play is good.
(2) Jack will go to see the play, because Jill says that the play is good.
Each of (1) and (2) is obtained by combining the sentences:
(a) Jack will go to see the play,
(b) Jill says that the play is good.
In (1) the combining is by means of ‘and’, in (2) by means of ‘because’. If either (a) or (b) (or both) is false, then (1) and (2) are false. If both (a) and (b) are true, then (1) is true; but the value of (2) is still undetermined. (2) is true only if there is a causal relation between Jill’s saying and Jack’s going. If Jack goes to see the play, not because Jill has praised it, then (2) is false. This shows that ‘because’ is not truth-functional: the truth-values of a ‘because’-compound cannot be found just by knowing the values of the components. Here is an example of a non-truth-functional monadic operation. The operation is effected by attaching the expression ‘it is necessary that’. From the sentence ‘...’ we get the sentence: ‘it is necessary that ...’. Now both of the following are true:
Thirteen is a prime number,
John Kennedy was assassinated.
But only the first is a necessary truth (the attempt on Kennedy’s life could have failed). Hence, the first of the following is true, the second is false.
It is necessary that thirteen is a prime number,
It is necessary that John Kennedy was assassinated.
In natural language we have some connectives that are truth-functional, some that are clearly not, and some that are borderline cases. We shall return to the subject in chapter 3. There are systems of logic that incorporate connectives that are not truth-functional (for example, a logic containing a connective □ for expressing necessity: □A is true if and only if A is necessarily true). But they shall not concern us here.
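The test for truth-functionality used in these examples can be stated as a simple check: a connective fails to be truth-functional as soon as two situations agree on the values of the components but disagree on the value of the compound. The sketch below encodes this; the function name and the listed situations are hypothetical, made up to mirror the ‘and’ and ‘because’ examples.

```python
def is_truth_functional(observations):
    """True if no two observations agree on the components but differ on the compound."""
    table = {}
    for components, compound in observations:
        # remember the first value seen for these component values; compare later ones
        if table.setdefault(components, compound) != compound:
            return False
    return True

# Each observation: ((value of (a), value of (b)), value of the compound).
and_cases = [((True, True), True), ((True, False), False), ((True, True), True)]

# Two situations in which (a) and (b) are both true; only in the first
# does Jack go because of what Jill says.
because_cases = [((True, True), True), ((True, True), False)]

print(is_truth_functional(and_cases))        # True
print(is_truth_functional(because_cases))    # False
```

The same check applies to the monadic case: the two sentences about thirteen and about Kennedy are both true, yet prefixing ‘it is necessary that’ yields one true and one false sentence.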
2.1.4 Atomic Sentences in Sentential Logic
So far, we have stipulated certain basic features of the system: the connective operations and their truth-table interpretation. Other features follow in the sequel. We can proceed in this manner without committing ourselves to particular sentences. Later, when we set up languages based on predicates and individual names, we will have more specific entities. It is however customary and convenient to be more specific even at the sentential level. For this purpose we view the sentences as built from basic constituents by repeated applications of sentential connectives. Let us assume an infinite sequence of so-called atomic sentences:
A₁, A₂, ..., Aₙ, ...
All other sentences of the formal language are built from them, bottom-up, by repeated applications of the connectives (the ones we have so far, ¬ and ∧, and others to be introduced later). The atomic sentences, or atoms for short, are not sentential compounds. We assume an infinite sequence, for the sake of generality: in order not to be bound by arbitrary restrictions. Every sentence is built from atomic sentences in a finite number of steps, where each step consists in applying a connective to sentences already constructed. Hence every sentence involves a finite number of atoms. Our particular sentences are therefore entities of the kind:
A₁₂, A₁ ∧ A₆, ¬(A₆ ∧ (¬A₂ ∧ A₃)), ...
Later postulates will imply that all these are different sentences; e.g., A₃ ∧ A₄ ≠ A₁ ∧ A₂. Other connectives, to be added later, will be used to generate sentences as well.
Note the essential difference between atomic sentences and the unspecified sentences referred to by sentential variables. The difference between ‘A’, ‘B’, ‘C’,..., on one hand, and ‘A₁’, ‘A₂’, ‘A₃’,..., on the other, is like the difference between ‘x’, ‘y’, ‘z’ and ‘1’, ‘13’, ‘9’. The former are numerical variables, the latter are names of particular numbers. A₁, A₂, A₃, etc., are distinct sentences; A₂ ≠ A₃, by definition. A, B, C, A′, etc., are sentences left unspecified; A = B may, or may not, hold. Similarly, A₁ ≠ A₂ ∧ A₃, but A may, or may not be equal to B ∧ C. (We can, of course, have A = A₃, or A = A₃ ∧ A₄. But we cannot have A ∧ B = A₃, because A₃ is not a compound.)
Any particular interpretation of the language assigns truth-values to the atomic sentences, and this determines, via the truth-tables, the truth-value assigned to all other sentences. But sentential logic is not about particular assignments to atomic sentences, but about properties and relations that hold for assignments in general. By considering all possible assignments to atomic sentences, we treat them as being independent of each other. The truth-value assigned to Aᵢ is not constrained by the values assigned to all other atoms. If we try to cast some sentences of natural language in the role of “atoms”, we see that, as a rule, they are not independent. For example, ‘a is red’ and ‘a is blue’ (where ‘a’ denotes some object) cannot be both true; the truth of one implies the falsity of the other. This constraint, however, is not a matter of pure logic. It derives from the meaning of ‘red’ and ‘blue’. By making the atoms independent we filter out everything that is not implied by the meaning of the sentential connectives. If needed, we can add non-logical connections. For example, if A₁ and A₂ are, respectively, ‘a is red’ and ‘a is blue’, then the sentence ¬(A₁ ∧ A₂) states that a is not both red and blue.
If we restrict the assignments to those that make this sentence true, we impose the required constraint. We can decide to adopt the sentence as an axiom, but it will not be an axioms of sentential logic. Having introduced the atomic sentences, we can, by far and large, ignore them. We make use of the notion of atomic sentences in defining logical equivalence and other basic semantic concepts. We could have defined such concepts, rigorously, without assuming atomic sentences. But some basic properties of the semantic concepts would then require more intricate proofs.3 Once these properties are established, we do not need atomic sentences. The system is of a general schematic nature. General claims and techniques are best represented by using sentential variables, which is all that we need. Note that in any particular context the sentences denoted by the sentential variables play the roles of “atoms”, as long as we do not specify anything more about their structure. 3
The claim is that sentential logic can be done rigorously without assuming atomic sentences. In a previous version of the book, we followed this line, introducing atomic sentences only at a later stage. We used intuitive arguments instead of proofs, whose rigorous form would have been too abstract for the book. (cf. footnote 4, page 32). We continue to use intuitive arguments, but the rigorous proof is now around the corner.
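The construction just described can be sketched in Python. This is only an illustration under stated assumptions: the class names `Atom`, `Not`, `And` and the function `value` are hypothetical and not part of the formal system; `Atom(3)` plays the role of A₃, and an assignment is a dictionary from atom indices to truth-values.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Atom:
    index: int              # Atom(3) stands for the atomic sentence A3


@dataclass(frozen=True)
class Not:
    arg: "Sentence"         # the negated sentence


@dataclass(frozen=True)
class And:
    left: "Sentence"        # left conjunct
    right: "Sentence"       # right conjunct


Sentence = Union[Atom, Not, And]


def value(s: Sentence, assignment: Dict[int, bool]) -> bool:
    """Truth-value of s determined, via the truth-tables, by an assignment to the atoms."""
    if isinstance(s, Atom):
        return assignment[s.index]
    if isinstance(s, Not):
        return not value(s.arg, assignment)
    return value(s.left, assignment) and value(s.right, assignment)


# The sentence ¬(A6 ∧ (¬A2 ∧ A3)) under one particular assignment.
s = Not(And(Atom(6), And(Not(Atom(2)), Atom(3))))
v = value(s, {2: False, 3: True, 6: True})

# Sentences are compared structurally: A3 ∧ A4 and A1 ∧ A2 are different sentences.
distinct = And(Atom(3), Atom(4)) != And(Atom(1), Atom(2))
```

Only finitely many atoms occur in any sentence, so the dictionary needs entries only for those atoms.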
2.2 Logical Equivalence, Tautologies and Contradictions
2.2.0 Obviously, the truth-values of A and ¬¬A are the same, no matter what A’s truth-value is. This is a simple example of logically equivalent sentences. The general idea is that sentences are logically equivalent if they must have the same truth-value by reasons of pure logic. We have not yet determined what comes under “reasons of pure logic”. But in the case of sentential logic, only the sentential connectives are considered as logical elements. This means that the sentences should have the same truth-value solely because of the way in which they are obtained by applying the connectives. Equivalence that derives only from the sentential connectives is known as tautological. The detailed definition is as follows:

The sentences A and A′ are tautologically equivalent if, under any assignment of truth-values to the atomic sentences, A and A′ have the same truth-value.

In order to establish tautological equivalence we do not have, in general, to go to the level of atomic sentences. A ∧ B and B ∧ ¬¬A must have the same truth-value, no matter how A and B are constructed from smaller units. The same holds for other tautological equivalences that we establish here. All the relevant structure can be displayed by the sentential expressions. An equivalence is proven once it is observed that the truth-table assigns, in every row, the same value to the two sentences. Of course, the sentential variables can stand also for atomic sentences. Therefore the definition above implies the following:

Two sentences are tautologically equivalent if and only if they can be written as sentential expressions such that, in a truth-table that has a column for each, their respective columns are the same.

Tautological equivalence is a special kind of logical equivalence. In the sentential calculus the two are the same. But in richer systems, such as first-order logic, there are sentences that are logically, but not tautologically, equivalent, because in richer systems there are other logical elements that can enter into the sentence. Hence: Every two sentences that are tautologically equivalent are also logically equivalent; the converse holds when we limit ourselves to sentential logic, but not in general.

Notation and Terminology: We shall use ‘≡’ as the symbol for logical equivalence:

    A ≡ B

means that A is logically equivalent to B. Hence, for every sentence A, we have:

    A ≡ ¬¬A
We refer to statements that assert the equivalence of two sentences (e.g., the statement above) as equivalence statements, or simply, equivalences. Often, when we are dealing with sentential logic, we shall use ‘equivalent’ as a shorthand for ‘tautologically equivalent’. The context will indicate the intended meaning of the term.

Note: The symbol ‘≡’ is a shorthand for ‘is logically equivalent to’. It is a technical term, which is part of our English discourse. ‘A ≡ B’ reads as an English sentence: ‘A is logically equivalent to B’. On the other hand, ‘A ∧ B’ does not stand for any English sentence. It denotes the conjunction of A and B, which is a sentence in our formal system, but not in English.

Another easily verifiable equivalence is:

    A ∧ B ≡ B ∧ A

The sentences on the two sides are not, in general, the same: A ∧ B ≠ B ∧ A, unless A = B.

Equivalence and Sentential Expressions

Equivalences of the kind just illustrated are general claims: for all A, A ≡ ¬¬A, and for all A and B, A ∧ B ≡ B ∧ A. Therefore we can substitute for the sentential variables any sentential expressions, e.g.,

    ¬(C ∧ D) ≡ ¬¬¬(C ∧ D),
    (¬C) ∧ ¬(A ∧ B) ≡ ¬(A ∧ B) ∧ (¬C)

(Can you see the substitutions by which these are obtained from the previous equivalences?) General equivalences of this form are schematic; they derive from the sentential expressions. We can define tautological equivalence directly for sentential expressions. The definition is:

Two sentential expressions are equivalent if, in a truth-table that has columns for both, their respective columns have the same truth-value in every row.

The equivalence of sentential expressions implies, of course, the equivalence of the denoted sentences. On the other hand, two sentential expressions such as ‘A ∧ B’ and ‘A ∧ ¬C’ are not equivalent, but the denoted sentences may still be equivalent; for example, in the special case where B = ¬C, or in the special case where C = ¬B. It is not difficult to see that the following holds:
Two sentential expressions are equivalent if and only if the sentences obtained by letting the sentential variables stand for distinct atomic sentences are equivalent.

Two sentential expressions can be logically equivalent even when they involve different sentential variables, for example:

    A ∧ ¬A ≡ B ∧ ¬B,

because the two always get the same value, namely F. This may diverge from our intuitive notion of “equivalence”. Should the following be classified as equivalent?

(1) Jack is at home and Jack is not at home,
(2) The earth is larger than the moon and the earth is not larger than the moon.

In ordinary usage, “equivalence” often implies a common subject, or some sort of connection that is lacking in the case of (1) and (2). Tautological equivalence is not meant to capture such aspects. We are interested only in equivalence that reduces to having the same truth-values under all possible assignments of truth-values to the sentential variables.
Truth-Table Checking

One can show that two sentential compounds are tautologically equivalent simply by writing a truth-table for both, in which all sentential variables (involved in either sentence) occur. If the columns headed by the two sentences are the same, then they are equivalent. The equivalence of A ∧ ¬(A ∧ ¬B) and B ∧ ¬((¬A) ∧ B) is shown in this way:

    A   B   A ∧ ¬B   ¬(A ∧ ¬B)   A ∧ ¬(A ∧ ¬B)   (¬A) ∧ B   ¬((¬A) ∧ B)   B ∧ ¬((¬A) ∧ B)
    T   T   F        T           T               F          T             T
    T   F   T        F           F               F          T             F
    F   T   F        T           F               T          F             F
    F   F   F        T           F               F          T             F
This “brute force” checking is often quite cumbersome. There are, we shall see, methods that yield in many cases shorter, more elegant proofs. These methods yield also insights that are not obtained via truth-tables. Often they enable us to simplify a sentence, that is: to find a simpler sentence equivalent to it. The last two sentences, for example, are equivalent to a sentence that is much simpler than both: A∧B You can verify it by noting that the column of each is identical to the column of A ∧ B.
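The brute-force check can be mechanized. The following is a minimal Python sketch, assuming that each sentential expression is written as a Boolean function of its variables; the helper name `column` is a hypothetical choice, and rows are listed in the order TT, TF, FT, FF used in the tables of this chapter.

```python
from itertools import product


def column(f, n):
    """Truth-table column of an n-place truth function, rows ordered TT, TF, FT, FF."""
    return [f(*row) for row in product([True, False], repeat=n)]


left = lambda a, b: a and not (a and not b)       # A ∧ ¬(A ∧ ¬B)
right = lambda a, b: b and not ((not a) and b)    # B ∧ ¬((¬A) ∧ B)
conj = lambda a, b: a and b                       # A ∧ B

# Equal columns mean the expressions are tautologically equivalent.
all_equivalent = column(left, 2) == column(right, 2) == column(conj, 2)
```

For n variables the table has 2ⁿ rows, which is why this kind of checking quickly becomes cumbersome by hand.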
The equivalence of all the sentences in a group of more than two can be expressed by “chaining”, e.g.,

    A ∧ ¬(A ∧ ¬B) ≡ B ∧ ¬((¬A) ∧ B) ≡ A ∧ B

This mode of writing relies on the property that sentences that are equivalent to the same sentence are equivalent to each other. The chain therefore implies that all the displayed sentences are logically equivalent.
2.2.1 Some Basic Laws Concerning Equivalence

For all sentences A, B, C, the following holds:

    Reflexivity:   A ≡ A.
    Symmetry:      If A ≡ B, then B ≡ A.
    Transitivity:  If A ≡ B and B ≡ C, then A ≡ C.
‘Reflexivity’ indicates that the relation “reflects back”: every sentence is logically equivalent to itself. ‘Symmetry’ indicates that the two sides can be switched. ‘Transitivity’ points to the “passing on” of the relation, via the “mediator” B: from the pair A and B and the pair B and C to the pair A and C. Each of these properties is obvious. The argument for transitivity is, for example, this: If for every assignment of truth-values to the atomic sentences, A and B have the same truth-values and B and C have the same truth-values, then also for every assignment to the atomic sentences A and C have the same truth-values.⁴

If A₁ ≡ A₂, A₂ ≡ A₃, . . . , Aₙ₋₁ ≡ Aₙ, then A₁ ≡ A₃, hence also A₁ ≡ A₄, and so on, up to A₁ ≡ Aₙ. Thus, every two sentences among A₁, A₂, . . . , Aₙ are equivalent.

When you come to think of it you will see that reflexivity, symmetry and transitivity are true of equivalence in general, however defined and whatever the objects. Examples are equality of shape (between geometrical figures), parallelism (between lines), having the same pair of parents (between people); in fact, all relations that we characterize as equivalences. In mathematics an equivalence relation is by definition any relation that satisfies reflexivity, symmetry and transitivity.

⁴ It is here that the atomic sentences are needed. They are the smallest building blocks of all the sentences. Without them, the equivalence of A and B would rest on a representation of A and B as sentential compounds of smaller sentences, and the equivalence of B and C on another representation of B and C. We would then need a refinement of these representations, so as to have the same smallest units as building blocks of all three sentences. Using unique readability (cf. 2.3.0, page 43), this can be done. But it would carry us too far away from the course material.
Congruence Laws and Substitution of Equivalent Components

Besides the three basic properties that are common to all equivalence relations, there are, for each equivalence relation, contexts in which we can substitute an object by an equivalent one. Laws of this nature are sometimes known as congruence laws. Logical equivalence behaves in this way when it comes to applying connectives. If we replace in a sentential compound a component by an equivalent one, we get an equivalent compound:

If A ≡ A′, then:

    ¬A ≡ ¬A′,    A ∗ B ≡ A′ ∗ B,    B ∗ A ≡ B ∗ A′,

for every binary connective ∗.
The arguments that prove these claims are easy: If, by virtue of logic, A and A′ have the same truth-value, then also their negations have the same truth-value (namely the opposite one); and this follows by virtue of logic, because negation is one of the logical elements of the sentences. The same reasoning applies to connectives in general. All we need is that the binary connective ∗ be truth-functional (the value of C ∗ D should depend only on the values of C and D) and that it be classified as a logical element. The laws can be applied repeatedly, for example:

    C ≡ ¬¬C,   hence   A ∧ C ≡ A ∧ ¬¬C,   hence   ¬(A ∧ C) ≡ ¬(A ∧ ¬¬C).
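The role of truth-functionality in the congruence laws can be checked mechanically. The sketch below is an illustration only, under the assumption that a binary truth-functional connective is represented by its four-entry truth-table (a Python dictionary); the names `CONNECTIVES`, `double_negation` and `law_holds` are hypothetical. It replaces the component C by the equivalent ¬¬C, on either side, for each of the sixteen possible tables.

```python
from itertools import product

VALUES = [True, False]
ROWS = list(product(VALUES, repeat=2))

# A binary truth-functional connective is fixed by its four table entries,
# so there are 2**4 = 16 of them; conjunction is one of the sixteen.
CONNECTIVES = [dict(zip(ROWS, outputs)) for outputs in product(VALUES, repeat=4)]


def double_negation(c):
    return not (not c)      # the value of ¬¬C, always the same as that of C


def law_holds(star):
    """Check A ∗ C ≡ A ∗ ¬¬C and C ∗ A ≡ ¬¬C ∗ A for the table `star`."""
    return all(
        star[(a, c)] == star[(a, double_negation(c))]
        and star[(c, a)] == star[(double_negation(c), a)]
        for a, c in ROWS
    )


all_hold = all(law_holds(star) for star in CONNECTIVES)
```

The check succeeds for every table for the reason given in the text: the value of the compound depends only on the values of its components, and equivalent components always have the same value.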
The notions of components and substitutions will be elaborated in section 2.3 of this chapter. But we should have by now a sufficient intuitive understanding, relying on which we can make free use of the substitution law:

Given any sentence, the substitution of a component by a logically equivalent one results in a logically equivalent sentence.

We can establish equivalences by using substitutions, in combination with other properties of logical equivalence. Here is an example: From A ∧ C ≡ C ∧ A, we get, applying negation:

    ¬(A ∧ C) ≡ ¬(C ∧ A)

By symmetry,

    ¬(C ∧ A) ≡ ¬(A ∧ C)

Now we can substitute, in the right-hand side, C by the equivalent ¬¬C and get, via transitivity:

    ¬(C ∧ A) ≡ ¬(A ∧ ¬¬C)

Operating with logical equivalences is analogous to operating with algebraic equalities. One uses reflexivity, symmetry and transitivity and substitutions of equivalents. But you have
to remember that logical equivalence is not equality. Sentences are syntactic creatures, and they can differ as syntactic creatures even when logic dictates that they should have the same truth-value.
Some Terminology and Notation

‘Iff’: As is customary in logic and mathematics, we use ‘iff’ as shorthand for ‘if and only if’ (e.g., a product of two numbers is zero iff one of them is). We use ‘⇒’ (or its longer version ‘⟹’) to stand for the English ‘implies’, or ‘entails’. Thus, ‘... ⇒ ---’ is to be read as: ‘If ..., then ---’.
Note that, like ‘≡’, ‘⇒’ is not a part of the formal language, but a convenient shorthand within our English discourse. In a similar way we use ‘⇔’ (and ‘⟺’) to stand for ‘iff’. The following table sums up the basic properties of logical equivalence discussed above.

    A ≡ A
    A ≡ B ⟹ B ≡ A
    A ≡ B, B ≡ C ⟹ A ≡ C
    A ≡ B ⟹ ¬A ≡ ¬B
    For every binary connective ∗:
    A ≡ B ⟹ A ∗ C ≡ B ∗ C
    A ≡ B ⟹ C ∗ A ≡ C ∗ B
Non-Equivalent Sentences

The equivalences we establish are between expressions built from sentential variables. Hence they hold in general, no matter what the sentential variables stand for. On the other hand, sentential expressions may be non-equivalent as expressions, while some of their instances are equivalent sentences. As expressions, ‘A ∧ B’ and ‘A’ are not equivalent. But if A = B, or if A = ¬¬B, or if A is any other of an infinite number of sentences, then A ∧ B ≡ A. The non-equivalence of the expressions means that the equivalence between sentences does not hold in general, not that it never holds.

Two sentential expressions are not equivalent if there is an assignment of truth-values to the sentential variables under which the expressions get different values. (In the example above, assign T to A and F to B.) If, in this case, we let the sentential variables stand for distinct atomic sentences, we get two particular non-equivalent sentences, e.g., the non-equivalent sentences A₁ and A₁ ∧ A₂. We can also get non-equivalent instances without using atoms. In the example, since A should get T, substitute for it any sentence of the form ¬(C ∧ ¬C); such a sentence, it is not difficult to see, always gets T. And substitute for B any sentence of the form C ∧ ¬C, which always gets F. Then the resulting sentences are never equivalent, no matter what the sentential variables stand for.

Homework

2.3 Simplify, if possible, each of the sentences in Homework 2.1; i.e., try to find an equivalent sentence that is simpler; the simpler, the better. Do not use other connectives (introduced later) besides ¬ and ∧. (With the simplification methods of the sequel this will be very easy. Right now you can look at the truth-tables and try by guessing.)

2.4 Find all the pairs of sentences in Homework 2.1 that are equivalent. Fill the following table, by writing ‘+’ in every square for which the row sentence is equivalent to the column sentence.
        1   2   3   4   5   6
    1
    2
    3
    4
    5
    6
For each pair without a ‘+’, show that there is a truth-value assignment to the sentential variables under which the two sentences get different values. Note: You can put ‘+’ in the diagonal and you can also assume that the filled table is symmetric around the diagonal. (Can you see why?) This leaves fifteen pairs of sentential expressions for checking. Since equivalent sentences always have the same truth-values, they behave in the same way with respect to other sentences. Hence, the more equivalent pairs you discover at an early stage, the more you will economize in checking.
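The criterion for non-equivalence stated before the homework (an assignment under which the two expressions get different values) can be searched for mechanically. The following Python sketch assumes expressions are written as Boolean functions; `distinguishing_assignment` is a hypothetical helper name. It uses the example of ‘A ∧ B’ and ‘A’ from the text, not the homework sentences.

```python
from itertools import product


def distinguishing_assignment(f, g, names):
    """Return the first assignment (rows ordered TT, TF, ...) on which f and g differ, else None."""
    for row in product([True, False], repeat=len(names)):
        if f(*row) != g(*row):
            return dict(zip(names, row))
    return None


# 'A ∧ B' versus 'A': the search finds the assignment T to A, F to B.
witness = distinguishing_assignment(lambda a, b: a and b, lambda a, b: a, ["A", "B"])

# Instances without atoms: substitute ¬(C ∧ ¬C) for A and C ∧ ¬C for B.
top = lambda c: not (c and not c)       # always gets T
bottom = lambda c: c and not c          # always gets F
never_equivalent = all((top(c) and bottom(c)) != top(c) for c in [True, False])
```

A result of `None` means that no distinguishing assignment exists, i.e., the two expressions are equivalent.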
2.2.2 Disjunction
Disjunction is another binary connective, denoted by ‘∨’. For every two sentences, A and B, there is a sentence A ∨ B, called the disjunction of A and B. As in the cases of negation and conjunction, ‘disjunction’ is used ambiguously: for the operation and for the resulting sentence.
The disjuncts of A ∨ B are A and B; the first is the left disjunct, the second is the right disjunct. The truth-table for disjunction is:

    A  B  A ∨ B
    T  T    T
    T  F    T
    F  T    T
    F  F    F

In words: the truth-value of A ∨ B is F if the truth-values of both A and B are F. It is T in every other case. In English the operation corresponding to disjunction is often effected by using ‘or’. For example, under its usual reading, the sentence

(3) Jack is at home, or Jill is at home

is true when either Jack or Jill, or both, are at home, and is false when neither of them is. Read in this way, (3) can be construed as a disjunction of ‘Jack is at home’ and ‘Jill is at home’. This type of ‘or’ is said to be inclusive. There is another type, described as exclusive or, which is taken to imply that one, but not both, of the alternatives is true. In the following example, the ‘or’ is presumably exclusive:

(4) Either you will pay the fine, or you will go to prison.

A further discussion of inclusive versus exclusive ‘or’ is in chapter 3. Evidently, disjunction corresponds to inclusive ‘or’. But we can express exclusive ‘or’ by using the connectives introduced so far:

    (A ∨ B) ∧ ¬(A ∧ B)
Intuitively, this sentence says: “A or B, and not both A and B”. You can confirm formally that it has the desired property, by checking its truth-table:

    A  B  A ∨ B  A ∧ B  ¬(A ∧ B)  (A ∨ B) ∧ ¬(A ∧ B)
    T  T    T      T       F              F
    T  F    T      F       T              T
    F  T    T      F       T              T
    F  F    F      F       T              F
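The same confirmation can be carried out by a short program. This is a minimal Python sketch; the function names `exclusive_or` and `tf` are hypothetical, and Python's `or` is assumed to play the role of the inclusive disjunction ∨.

```python
from itertools import product


def exclusive_or(a, b):
    return (a or b) and not (a and b)       # (A ∨ B) ∧ ¬(A ∧ B)


def tf(x):
    return "T" if x else "F"


# One row per assignment, in the order TT, TF, FT, FF.
rows = [(a, b, exclusive_or(a, b)) for a, b in product([True, False], repeat=2)]
for a, b, v in rows:
    print(tf(a), tf(b), tf(v))

# The sentence is true exactly when A and B have different truth-values.
has_desired_property = all(v == (a != b) for a, b, v in rows)
```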
Homework

2.5 Suppose that ∨ₓ is a connective that corresponds to exclusive ‘or’ (i.e., A ∨ₓ B is true just when one of A and B is true, but not both). Show that disjunction can be expressed using ∧ and ∨ₓ (without using negation); in other words, using only ∧ and ∨ₓ, construct a sentence whose truth-table column is exactly that of A ∨ B. (This is easier than it looks.)

Using ∧ and ¬, we can construct a sentence equivalent to A ∨ B:

(5)    A ∨ B ≡ ¬(¬A ∧ ¬B)
This is described by saying that disjunction is expressible in terms of conjunction and negation. Heuristically, you can see why (5) holds by observing: “To say that A or B is the same as to say that it is not the case that both not-A and not-B.” But you can verify it, formally, by truth-tables; or, with some practice, by carrying out the checking in the head. (5) shows that, having negation and conjunction, we can do without disjunction without losing expressive power. Whenever we need A ∨ B, we can use the equivalent ¬(¬A ∧ ¬B). But eliminating disjunction in this way can yield non-transparent expressions. It is very convenient to have disjunction as a primitive connective, because it corresponds to the familiar ‘or’ operation of natural language, and because it makes for short clear expressions. And, most important, basic structural properties of the formalism are best displayed if both conjunction and disjunction are available.

Conjunction can be expressed in terms of negation and disjunction:

(6)    A ∧ B ≡ ¬(¬A ∨ ¬B)
This, like (5), can be easily verified by direct checking of truth-values. We can therefore dispense with conjunction, if we have negation and disjunction. But again, the formalism is much easier to operate and its structure much more transparent, if both conjunction and disjunction are available.

(5) and (6) are examples of logical equivalences that can be established by simple considerations of truth-values, without writing the whole truth-table. In general, we can use any of the following methods for establishing logical equivalence. Each of the conditions is both necessary and sufficient.

(I) Show, for one of the sentences, that if it has the value T, the other has the value T, and if it has the value F, the other has the value F.
(II) Show that one sentence has the value T iff the other has the value T.
(III) Show that one sentence has the value F iff the other has the value F.

Obviously, (I) suffices for proving logical equivalence. The same holds also for each of (II) and (III). Consider (III) for example. It implies that it is impossible that one sentence has T and the other F; for this would contradict the “iff”.

In the case of (5), (III) provides the shortest argument. Since A ∨ B gets F iff both A and B get F, it suffices to show that also ¬(¬A ∧ ¬B) gets F iff both A and B get F. And this is argued as follows: ¬(¬A ∧ ¬B) gets F iff ¬A ∧ ¬B gets T. And this last conjunction gets T iff both conjuncts, ¬A and ¬B, get T; i.e., iff both A and B get F. In a similar way, (II) can be used to prove (6).

One can also derive each of (5) and (6) from the other, using suitable substitutions and the general equivalence laws. Here, for example, is a derivation of (6) from (5): Applying negation to both sides of (5) we get:

    ¬(A ∨ B) ≡ ¬¬(¬A ∧ ¬B)
Since the double negation of a sentence is equivalent to the sentence, we can drop ‘¬¬’ on the right and get:

    ¬(A ∨ B) ≡ ¬A ∧ ¬B

Since this is true for any sentences A and B, it remains true if we substitute throughout A and B by their negations:

    ¬(¬A ∨ ¬B) ≡ ¬¬A ∧ ¬¬B

Again, we can drop double negations (replacing components by their equivalents), which yields:

    ¬(¬A ∨ ¬B) ≡ A ∧ B

And this, via symmetry, yields (6). (If substituting A and B by their negations confuses you, use different sentential variables and let A = ¬C, B = ¬D. You will get the desired equivalence, formulated in terms of C and D.)
The examples just given illustrate some techniques of equivalence proving, which will be elaborated and extended in chapter 4.
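Equivalences (5) and (6), and the intermediate step of the derivation, can also be confirmed by exhaustive checking. The following Python sketch assumes that expressions are written as Boolean functions of A and B; the helper name `equivalent` is hypothetical.

```python
from itertools import product


def equivalent(f, g, n=2):
    """True iff f and g get the same value under every assignment to n variables."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n))


# (5)  A ∨ B ≡ ¬(¬A ∧ ¬B)
eq5 = equivalent(lambda a, b: a or b,
                 lambda a, b: not ((not a) and (not b)))

# Intermediate step of the derivation:  ¬(A ∨ B) ≡ ¬A ∧ ¬B
step = equivalent(lambda a, b: not (a or b),
                  lambda a, b: (not a) and (not b))

# (6)  A ∧ B ≡ ¬(¬A ∨ ¬B)
eq6 = equivalent(lambda a, b: a and b,
                 lambda a, b: not ((not a) or (not b)))
```

The program only checks rows; it does not reproduce the derivation by substitution, which explains why the equivalences hold.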
Grouping with Disjunctions

With disjunction we have additional cases that require parentheses. For example,

    ¬A ∨ B    and    A ∨ B ∧ C

are ambiguous expressions. The first can be interpreted either as the sentence (¬A) ∨ B, or as ¬(A ∨ B). You can easily see that the two are not, in general, logically equivalent. The second can be interpreted either as A ∨ (B ∧ C), or as (A ∨ B) ∧ C. Again, these are not, in general, logically equivalent: if A gets T and C gets F, then A ∨ (B ∧ C) gets T, but (A ∨ B) ∧ C gets F. Parentheses are therefore employed in order to force unique readings. Our previous conventions for omitting parentheses are now extended by the following rule:

The disjunction symbol binds more weakly than the symbols for negation and conjunction.

This means that ‘¬’ binds the strongest, then ‘∧’, then ‘∨’. In treating expressions that are not fully parenthesized, we first determine the scopes of negations to be the smallest that are consistent with the given grouping; next we determine the left and right scopes of conjunctions to be the smallest consistent with the grouping at that stage. For example,

    ¬A ∨ ¬B ∧ C is read as: (¬A) ∨ [(¬B) ∧ C],
    (¬A ∨ ¬B) ∧ B ∨ D is read as: {[(¬A) ∨ (¬B)] ∧ B} ∨ D.

When parentheses are suppressed, it is often desirable to indicate grouping by appropriate spacing, e.g.,

    ¬A  ∨  ¬B∧C,    [¬A∨¬B]∧B  ∨  D.

It is preferable to retain parentheses, even when redundant, if this makes for easier reading.

Homework

2.6 Find all the pairs of logically equivalent sentences, from the list given below, and write your answer by filling a table in the manner described in Homework 2.4. For each pair that is not listed as equivalent, give an assignment of truth-values to the sentential variables under which the sentential expressions get different values. (Note that the remarks of Homework 2.4 apply also here.)

1. ¬A ∧ B
2. ¬(A ∨ ¬B)
3. (¬A ∧ B) ∨ ¬(C ∨ ¬C)
4. (A ∨ B) ∧ (¬A ∨ B)
5. (B ∧ C) ∨ (B ∧ ¬C)
6. ¬(A ∨ B) ∧ ¬C
7. ¬(A ∧ B) ∨ ¬C
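The binding convention for ‘¬’, ‘∧’ and ‘∨’ described before the homework can be sketched as a small recursive-descent parser in Python. This is an illustration under stated assumptions, not the book's definition: sentential variables are single letters, only round parentheses are accepted, and a chain of the same binary connective is grouped to the left (a choice made here only for definiteness). The names `parse` and `show` are hypothetical.

```python
def parse(text):
    """Parse an expression; ¬ binds strongest, then ∧, then ∨."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def disjunction():                  # weakest binding
        node = conjunction()
        while peek() == "∨":
            take()
            node = ("∨", node, conjunction())
        return node

    def conjunction():
        node = negation()
        while peek() == "∧":
            take()
            node = ("∧", node, negation())
        return node

    def negation():                     # strongest binding
        if peek() == "¬":
            take()
            return ("¬", negation())
        if peek() == "(":
            take()
            node = disjunction()
            if peek() != ")":
                raise ValueError("expected ')'")
            take()
            return node
        tok = peek()
        if tok is None or not tok.isalpha():
            raise ValueError(f"unexpected token {tok!r}")
        return take()

    tree = disjunction()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


def show(node, top=True):
    """Write the parsed expression with every proper compound parenthesized."""
    if isinstance(node, str):
        return node
    if node[0] == "¬":
        text = "¬" + show(node[1], False)
    else:
        text = f"{show(node[1], False)} {node[0]} {show(node[2], False)}"
    return text if top else f"({text})"


reading_1 = show(parse("¬A ∨ ¬B ∧ C"))
reading_2 = show(parse("(¬A ∨ ¬B) ∧ B ∨ D"))
```

Up to the style of brackets, the two readings agree with those given in the text for the same expressions.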
2.2.3 Logical Truth and Falsity, Tautologies and Contradictions
A sentence is a logical truth, or logically true, if it is true by reasons of pure logic. Again, the problem of specifying the scope of “pure logic” arises, and again, the idea is to classify certain elements of the sentence as logical and to require that the truth of the sentence derive solely from these. In the case of sentential logic the only logical particles are the connectives, hence a sentence is logically true just when its truth derives from the way it is built by applying sentential connectives. Such sentences are known as tautologies. The full definition is:

A sentence is a tautology if it gets T under every assignment of truth-values to the sentential atoms.

As in the case of logical equivalence (cf. 2.2.0), the definition can be stated without going to the level of atoms:

A sentence is a tautology iff it can be written as a sentential expression such that, in its truth-table, its column has T in every row.

Note: A tautology is a special case of a logical truth. In sentential logic tautologies and logical truths coincide. In general, every tautology is a logical truth, but not vice versa; when we come to first-order logic, we shall encounter many logical truths that are not tautologies.

The simplest tautology, constructible using the connectives introduced so far, is:

    A ∨ ¬A

Logical falsity is defined in a completely analogous way: A sentence is logically false just when it gets F solely by virtue of its logical elements. In the case of sentential logic this means that it gets F by virtue of the connectives. That is, it gets F under any assignment to the atomic sentences. Or, equivalently, it can be written as a sentential expression such that, in the truth-table, its column contains only F’s. We shall call such sentences sentential contradictions, or for short, contradictions. The simplest contradiction is:

    A ∧ ¬A
Again, when we come to first-order logic we shall encounter logical falsities that are not sentential contradictions. Obviously:

A is a logical falsity iff ¬A is a logical truth.
A is a logical truth iff ¬A is a logical falsity.
If A is logically true, then the logical truths are exactly the sentences that are logically equivalent to A.
If A is logically false, then the logical falsities are exactly the sentences that are logically equivalent to A.

All logical truths are therefore logically equivalent, and so are all logical falsities. The equivalence defined here is a technical concept; it does not, and is not intended to, capture various aspects of the intuitive notion of “equivalence”.

Note: While logical truths and falsities are highly significant, they are the exceptions rather than the rule. The sentences one usually encounters are neither logical truths nor logical falsities. ‘The sun has nine planets’ is true, ‘Nixon won the 1960 presidential election’ is false, but their truth and falsity do not derive from pure logic. They are neither logically true, nor logically false. The same obtains in the case of formal languages; “most” of the sentences of the sentential calculus are neither tautologies nor contradictions.

Note: We use ‘tautology’ and ‘contradiction’ in a technical sense, which should not be confused with a different, informal sense in which the terms are sometimes used. Occasionally, ‘tautology’ means a trivial logical truth, and often ‘contradiction’ means a self-evident logical falsity.
Tautological and Contradictory Sentential Expressions

Just as we did in the case of logical equivalence, we can define the notions of tautology and contradiction so as to apply to sentential expressions: A sentential expression is tautological if, in a truth-table that has a column for it, its column contains only T’s. It is contradictory if its column contains only F’s. It now follows easily that a sentence is tautological (or contradictory) iff it can be written in the form of a tautological (or contradictory) sentential expression. We also have:

A sentential expression is a tautology iff the sentence obtained from it by interpreting the sentential variables as distinct sentential atoms is a tautology. Similarly for contradictions.
The sentence A may or may not be a tautology, and may or may not be a contradiction (e.g., if A = B ∨ ¬B, it is a tautology, and if A = B ∧ ¬B, it is a contradiction); and it may be neither. But the sentential expression A (or, to use quotes, ‘A’) is neither a tautology nor a contradiction, for its column contains both T and F. And this is of course true of each sentential atom. The tautologies and contradictions that we establish are of a general schematic nature, and they remain so upon any substitutions for the sentential variables. Thus, if we substitute in A ∨ ¬A any sentential expression for ‘A’ we get a tautology:

    (A ∨ B) ∨ ¬(A ∨ B),    (¬A ∧ B) ∨ ¬(¬A ∧ B),    etc.

On the other hand, the claim that A ∨ B is not a tautology cannot be made without knowing what the sentential variables denote (if B = ¬A, this sentence is a tautology). We can only say that, in general, the sentence A ∨ B is non-tautological. The only exceptions to the last remark are sentences that are established as contradictions. Whatever A is, A ∧ ¬A is not a tautology, because it is a contradiction; similarly, A ∨ ¬A is never a contradiction, because it is a tautology.

Homework

2.7 Find all the tautologies and all the contradictions among the following sentences. For sentences not listed as tautologies (as contradictions) give a truth-value assignment to the sentential variables under which the sentence gets F (gets T).

1. ¬(A ∨ B) ∨ (A ∨ B)
2. A ∧ (¬(A ∨ B) ∨ (C ∧ ¬A))
3. (A ∧ B) ∨ (¬A ∧ ¬B)
4. (A ∨ B) ∧ (¬A ∨ ¬B)
5. (A ∧ ¬B) ∨ (B ∧ ¬A)
6. (A ∧ B) ∧ ¬(A ∧ C)
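The definitions of tautological and contradictory sentential expressions given before the homework translate directly into a column check. The sketch below is in Python and assumes that expressions are written as Boolean functions; `classify` is a hypothetical helper name, and the examples are the ones discussed in the text rather than the homework sentences.

```python
from itertools import product


def classify(f, n):
    """Classify an n-variable sentential expression by its truth-table column."""
    col = [f(*row) for row in product([True, False], repeat=n)]
    if all(col):
        return "tautology"          # column contains only T's
    if not any(col):
        return "contradiction"      # column contains only F's
    return "neither"


simplest_tautology = classify(lambda a: a or not a, 1)            # A ∨ ¬A
simplest_contradiction = classify(lambda a: a and not a, 1)       # A ∧ ¬A
general_disjunction = classify(lambda a, b: a or b, 2)            # A ∨ B as an expression
instance = classify(lambda a, b: (a or b) or not (a or b), 2)     # (A ∨ B) ∨ ¬(A ∨ B)
```

Note that the verdict on ‘A ∨ B’ concerns the expression only; as the text points out, a particular sentence of that form (for example, with B = ¬A) can still be a tautology.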
2.3 Syntactic Structure
2.3.0 The sentences of our formal system are, like those of natural language, structured entities. But unlike the sentences of natural language, which may involve syntactic ambiguity (cf. chapter 1, the section on ambiguity), every sentence of the formal system has a uniquely determined syntactic structure. This principle is known as unique readability. It amounts in the case of sentential logic to the following: If a sentence is obtained by applying a connective to other sentences, then the sentence determines uniquely the connective and the sentences to which the connective has been applied. Hence there is a unique reading of such a sentence as a compound of other sentences. It means, among other things, that a sentence cannot be both a negation of some sentence and a conjunction of two sentences, or both a negation and a disjunction, or both a conjunction and a disjunction, etc. Moreover, if we apply negation to different sentences the resulting sentences must be different. And if we apply conjunction to a pair of sentences, then applying it to another pair that differs either in the first or in the second sentence (or both) gives a different result; similarly for any other binary connective. Since we assumed that our sentences are constructed from atoms that are not compounds, we can add here also the requirement that a negation, or a compound formed by a binary connective, is not an atom.

The following is the explicit statement of unique readability. For all sentences A, B, C, A′, B′, we have:

• If ∗ is a binary connective, then: ¬A ≠ B ∗ C.
• ¬A = ¬A′ only if A = A′.
• If ∗ and ∗′ are binary connectives, then: A ∗ B = A′ ∗′ B′ only if ∗ = ∗′, A = A′, and B = B′.
• If A is an atomic sentence, then A ≠ ¬B and A ≠ B ∗ C, for every binary connective ∗.

Main Connective and Component Structure

The main connective of a sentence is ¬ if the sentence has the form ¬A; it is ∗ if the sentence has the form A ∗ B. The sentences to which the main connective is applied are the sentence’s immediate components. ¬A has one immediate component: A; A ∗ B has two: A and B. Unique readability guarantees that, for any sentential compound, the main connective and the immediate components are well defined. We can view a sentential compound as decomposable into its immediate components; any of these, which is a sentential compound, is again decomposable into its immediate components.
And so on. All the sentences that are obtained in this process of repeated decomposition are known as the components of the original sentence. For various purposes, it is convenient to regard each sentence also as a trivial component of itself. This is merely a terminological technicality concerning the use of ‘component’. The nontrivial components (those that are obtained in the process of repeated decomposition of a sentence) are referred to as the sentence’s proper components. Two sentences A and B are components of each other only in the trivial case in which A = B. In other words, if A is a proper component of B, then B cannot be a component of A. This is intuitively obvious, for in the decomposition we always get smaller sentences. It can be proved in a rigorous way (we shall not do it at present), using the assumption that each sentence is generated from atomic sentences. As remarked (cf. footnote 3 page 28), sentential logic can be developed without assuming atomic sentences. In that case the requirement that no sentence is a proper component of itself is included among the syntactic postulates.
The concept of component can be characterized by a set of rules. A sentence is a component of another iff this can be established, in a finite number of steps, by applying the following rules. For all sentences A, B, C:
1. A is a component of A.
2. A is a component of ¬A.
3. If ∗ is a binary connective, then A and B are components of A ∗ B.
4. If A is a component of B and B is a component of C, then A is a component of C.
For example, the components of (7)
(A ∨ ¬B) ∧ ¬A
are, beside the sentence itself: (i) A ∨ ¬B and ¬A (by 3), (ii) A and ¬B (by 3 and 4), (iii) B (by 2 and 4), (iv) any component of A and any component of B (by 4). Note that A is obtained here twice, first from A∨¬B (via 3), and second from ¬A (via 2). In the list of components it suffices to list it once (as we have just done). But it occurs more than once as a component, and any specification of the sentence should make this clear. The uniqueness of the decomposition (which is what unique readability amounts to) means that each composition step is uniquely determined. This implies that: • A is a proper component of ¬B iff it is a component of B.
• A is a proper component of B ∗ C (where ∗ is a binary connective) iff it is a component of B or of C (or of both).
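The rules for components and the two laws just stated can be sketched as a short recursive function. This is only an illustration under assumptions of my own: sentences are encoded as nested tuples (atoms are strings, compounds are `("not", X)`, `("and", X, Y)`, `("or", X, Y)`), and the sentential variables are treated as atoms, so the function finds displayed components only.

```python
def components(s):
    """All components of s, including s itself, as a set."""
    result = {s}                          # rule 1: s is a component of itself
    if not isinstance(s, str):            # s is a compound
        for part in s[1:]:                # rules 2 and 3: immediate components
            result |= components(part)    # rule 4: components of components
    return result

def proper_components(s):
    """The components of s other than s itself."""
    return components(s) - {s}

def neg(x):
    return ("not", x)

target = ("and", ("or", "A", neg("B")), neg("A"))   # (A ∨ ¬B) ∧ ¬A
candidate = ("and", neg("B"), neg("A"))             # ¬B ∧ ¬A

print(candidate in components(target))   # False
print(len(proper_components(target)))    # 5
```

With A and B atomic, the five proper components are A ∨ ¬B, ¬A, A, ¬B and B; the set lists A once even though it occurs twice.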
We can use these laws to show that certain sentences are not components of others. For example, ¬B ∧ ¬A is not a component of (A ∨ ¬B) ∧ ¬A. Here is the proof. First, ¬B ∧ ¬A ≠ (A ∨ ¬B) ∧ ¬A. Otherwise, we would have by unique readability ¬B = A ∨ ¬B, which is impossible, since the left-hand side is a negation and the right-hand side a disjunction. Hence, if ¬B ∧ ¬A is a component of (A ∨ ¬B) ∧ ¬A, it is a component either of A ∨ ¬B or of ¬A. By unique readability, it is different from either of these. The only way it can be a component of A ∨ ¬B is by being a component either of A or of ¬B. But both are impossible, since A and ¬B are proper components of ¬B ∧ ¬A. By the same reasoning, it is not a component of ¬A. On the other hand, A ∧ ¬A may, or may not, be a component of (A ∨ ¬B) ∧ ¬A, for it can be a component of B.
Homework 2.8 (i) List the proper components of each of the first three sentences in Homework 2.7. (ii) Find which of the six sentences in Homework 2.7 have A ∨ B as a component, which cannot have it, and which may or may not have it. Prove, in the manner given above, one of your negative claims (i.e., that it cannot be a component). If it may or may not be a component, indicate when it is and when it is not.
Displayed Components: When a sentence is written as a sentential expression, some of its components are displayed in the expression, e.g., A and ¬A are displayed as components of B ∧ ¬A, and A∧B is displayed as a component of (A ∨ ¬B) ∧ (A ∧ B) . We refer to such components as displayed components. (Of course, this is meaningful only with respect to a sentential expression.) A given sentence can have components not displayed in the expression. They can be components of the sentence by being proper components of the sentences represented by the sentential variables. ¬A is not a displayed component of A ∧ B; but it can be a component of that sentence, by being a component of B. If the sentential variables represent atomic sentences, then of course all the components are displayed.
Occurrences The same sentence can turn up, as a component of another sentence, more than once. For example, in (A ∨ ¬B) ∧ ¬A, A turns up as a component of the first conjunct: A∨¬B, and also of the second conjunct ¬A. For all we know, it may also be a component B. To distinguish between the different appearances of the same sentence as a component within another sentence we speak of occurrences. We say that there are at least two occurrences of A, as a component of (A ∨ ¬B) ∧ ¬A. And there are two occurrences of A ∨ B, as a component of (A ∨ B) ∧ (A ∨ B) . Here the number of occurrences is exactly two, because A ∨ B cannot be a component either of A or of B. The concept occurrence is very general. It applies whenever abstract structures can have repeating parts. For example, there are two occurrences of the word ‘Jack’ in the sentence (8) Jill kissed Jack and Jack laughed. And there are at least two occurrences of negation in (A ∨ ¬B) ∧ ¬A. There may be more, in as much as negation can occur also in A or in B; but in the sentential expression there are exactly two occurrences of the negation name ‘¬’. Occurrences should be distinguished from tokens. The latter are physical entities associated with particular spatio-temporal regions. But occurrences are abstract parts of abstract structures. The two occurrences of ‘Jack’ in (8) are parts of the sentence-type, not of the sentence-token. (When sentences are realized as tokens, their parts are usually represented as token-parts, which are tokens themselves. A token of (8) therefore contains two tokens of ‘Jack’. But these have to be distinguished from the two occurrences of ‘Jack’; the latter are parts of the type, not of the token.) Displayed Occurrences: When sentences are presented through sentential expressions, certain occurrences of components, or of connectives, are displayed. It is possible that there are other, undisplayed occurrences, which occur within the sentences represented by the sentential variables. 
The situation here is the same as in the case of displayed components. Terminology: To avoid long phrases we often omit the word ‘occurrence’. We may use ‘the first negation’ or ‘the leftmost conjunction’ when we mean the first occurrence of a negation, or the leftmost occurrence of a conjunction. Connective names are therefore used ambiguously, to denote the connective as well as to denote occurrences of the connective. Similarly ‘the main connective’ can refer to the connective (a sentential operation), or to a particular occurrence of it. We can also speak of the first A (meaning the first occurrence of A), the first A ∨ B,
etc. The context should make the intended meaning clear. Main Connectives in Sentential Expressions: If a sentential expression is more than a single variable (i.e., if it contains connective names), then it has a unique occurrence of a connective name that marks the main connective. It also determines the immediate components. We can say, for example, that in the sentential expression (9)
¬(¬A ∧ ¬B)
the main connective name is ‘¬’; or, more precisely, that it is the leftmost occurrence of ‘¬’, or for short the leftmost ‘¬’. The main component of the sentence is ¬A ∧ ¬B. In (10)
(A ∧ B) ∧ (A ∨ B)
the main connective name is the second ‘∧’. When we speak of the main connectives of sentential expressions, we should be understood as referring to connective names, not to the connectives themselves. Homework 2.9 (i) Encircle the main connective in each of the expressions in Homework 2.7. (ii) List, for each of these sentences, the components that have more than one displayed occurrence, and the number of displayed occurrences of each.
Substitutions of Sentential Components
From given sentences we can get any sentential compound in a finite sequence of steps, where each step consists in applying a sentential connective to previous sentences. The sentences to which connectives are applied during this process are the sentences used in the construction. Any sentence that is used in some step appears as a component of the end result; each separate use, say of B, introduces a separate occurrence of B as a component. If, instead of using B in a certain step, we use a different sentence B′, we get a different outcome: the sentence obtained by substituting an occurrence of B′ for the occurrence of B. We say in this case that B′ has been substituted for that occurrence of B, or that the occurrence of B has been substituted by B′. We can substitute at one go several occurrences of a component, by the same sentence, or by different ones. One often encounters substitutions in which all occurrences of a sentence (say B) are substituted by another sentence (say B′). We say in this case that B′ has been substituted for B. The substitution of B′ for B leaves the sentence unchanged if B is not a component, or if B′ = B. Here are a few examples. From the sentence
(11) (A ∨ ¬A) ∧ A
we get the following sentences, by substitutions.
(11.1) A ∧ B for the second occurrence of A: (A ∨ ¬(A ∧ B)) ∧ A
(11.2) A ∧ B for the first occurrence of A, and ¬B for the second: ((A ∧ B) ∨ ¬¬B) ∧ A
(11.3) A ∧ B for A: ((A ∧ B) ∨ ¬(A ∧ B)) ∧ (A ∧ B)
(11.4) B for A ∨ ¬A: B ∧ A
In (11), all occurrences of A are displayed, and so are all occurrences of A ∨ ¬A. (Can you see why?) But in other cases, possible occurrences of components can be undisplayed in the sentential expression. Quite often, in describing a substitution, one restricts it to displayed occurrences. ‘The second occurrence’ will thus mean the second displayed occurrence, and ‘the substitution of B′ for B’ will mean the substitution of B′ for all displayed occurrences of B. For example, from (12)
¬(A ∧ B) ∧ (C ∨ B)
we obtain sentences as follows:
(12.1) ¬A for the first occurrence of B, and A for the second: ¬(A ∧ ¬A) ∧ (C ∨ A)
(12.2) A ∨ B for B: ¬(A ∧ (A ∨ B)) ∧ (C ∨ (A ∨ B))
(12.3) C ∧ B for C ∨ B: ¬(A ∧ B) ∧ (C ∧ B)
If we want to substitute, in (12), A ∨ B for all occurrences of B, we have to describe it thus: (12.2′) ¬(A′ ∧ (A ∨ B)) ∧ (C′ ∨ (A ∨ B)), where A′ and C′ are obtained from A and C, respectively, by substituting (throughout) A ∨ B for B.
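The substitutions in examples (11.1) to (11.4) can be reproduced with a small sketch. The nested-tuple encoding and both function names are assumptions made for this illustration; since the variables are treated as atoms, only displayed occurrences are substituted, and occurrences are counted from left to right.

```python
def substitute(s, old, new):
    """Substitute `new` for every (displayed) occurrence of `old` in s."""
    if s == old:
        return new
    if isinstance(s, str):
        return s
    return (s[0],) + tuple(substitute(p, old, new) for p in s[1:])

def substitute_occurrences(s, old, replacements):
    """Substitute replacements[i] for the i-th occurrence of `old`
    (1-based, left to right); unlisted occurrences are left unchanged."""
    counter = 0

    def walk(t):
        nonlocal counter
        if t == old:
            counter += 1
            return replacements.get(counter, t)
        if isinstance(t, str):
            return t
        return (t[0],) + tuple(walk(p) for p in t[1:])

    return walk(s)

s11 = ("and", ("or", "A", ("not", "A")), "A")        # (11): (A ∨ ¬A) ∧ A
a_and_b = ("and", "A", "B")

print(substitute_occurrences(s11, "A", {2: a_and_b}))                   # (11.1)
print(substitute_occurrences(s11, "A", {1: a_and_b, 2: ("not", "B")}))  # (11.2)
print(substitute(s11, "A", a_and_b))                                    # (11.3)
print(substitute(s11, ("or", "A", ("not", "A")), "B"))                  # (11.4)
```

Note that `substitute` does not look inside the sentence it has just put in, so substituting A ∧ B for A does not loop.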
Note: Usually, when we substitute, we want each sentential variable to be replaced by the same sentential expression on all its occurrences, because the variable stands for the same sentence throughout the expression. But we can, nonetheless, consider substitutions of the kind just given as syntactic manipulations that convert sentences to sentences. The following homework is such an exercise in pure syntax.
Homework 2.10 In each of the following triples the third sentence is obtained from the first through substituting certain displayed occurrences of the second by other sentences. Find the occurrences that have been substituted and the sentence substituting each occurrence.
1. [(A ∧ B) ∨ C] ∨ [¬(A ∧ B) ∨ B]
   A ∧ B
   (¬B ∨ C) ∨ (¬¬B ∨ B)
2. [(A ∨ B) ∧ (¬A ∨ C)] ∨ ¬(A ∧ C)
   A
   [((A ∨ B) ∨ B) ∧ (¬A ∨ C)] ∨ ¬[¬(A ∨ B) ∧ C]
3. (¬A ∨ (¬B ∧ C)) ∧ (¬B ∨ ¬(B ∧ A))
   ¬B
   (¬A ∨ (¬(B ∧ A) ∧ C)) ∧ (¬A ∨ ¬(B ∧ A))
Substitution is a very general notion. It applies to all structures in which occurrences of some parts are replaceable by other parts. For example, each occurrence of ‘Jack’ in (8) can be substituted by any proper name. And any occurrence of a binary connective can be substituted by another binary connective. If in (A ∨ ¬B) ∧ (C ∨ D) we substitute all displayed occurrences of ∨ by ∧ we get:
(A ∧ ¬B) ∧ (C ∧ D)
And if we toggle in the original sentence (the displayed) ∨ and ∧, we get:
(A ∧ ¬B) ∨ (C ∧ D)
Repeated Conjunctions and Disjunctions The expression ‘ A∧B∧C ’ is ambiguous, for it can be interpreted as either of the two sentences: (A ∧ B) ∧ C
A ∧ (B ∧ C) .
The main connective in the first is the second (displayed) occurrence of ∧, in the second it is the first occurrence of ∧. The two resulting sentences are different; they are, however, logically equivalent. Each is true, just when all of A, B, C are true, and is false otherwise. For many purposes the distinction between (A ∧ B) ∧ C and A ∧ (B ∧ C) does not matter. It is often convenient to ignore it and to use ‘A ∧ B ∧ C’ as if it were a sentential expression. Actually, it is an ambiguous expression, which can denote either of the two sentences above. We can use it as long as the truth of what we say does not depend on which of the two sentences we choose. This generalizes to more than three conjuncts. We use ‘ A1 ∧ A2 ∧ . . . ∧ An ’ as if it were a sentential expression, where in fact it is an ambiguous expression that can denote any of the sentences obtained by grouping via parentheses. (The number of different groupings grows rapidly as n becomes larger.) All these sentences are logically equivalent. Each is true when all the Ai ’s are true, and is false otherwise. We can ignore the distinction, as long as the truth of our claims does not depend on the particular grouping. The case of repeated disjunctions is completely analogous. We use ‘ A1 ∨ A2 ∨ . . . ∨ An ’ as if it were a sentential expression. Actually, it is ambiguous and can denote any of the sentences obtained by parenthesizing. All of them are logically equivalent. Each is false, if all the Ai ’s are false, and is true otherwise. Again, our usage is harmless, as long as the particular groupings do not affect the truth of our claims.
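To see how rapidly the number of groupings grows, one can count them with a short recursion. This is a sketch of my own, not taken from the text: the function name `groupings` is hypothetical, and the count uses the observation that the main connective of a fully parenthesized reading must be one of the n − 1 displayed occurrences of ∧.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def groupings(n):
    """Number of sentences that 'A1 ∧ A2 ∧ ... ∧ An' can denote."""
    if n == 1:
        return 1
    # The main connective splits the conjuncts into k on the left and
    # n - k on the right; each side is grouped independently.
    return sum(groupings(k) * groupings(n - k) for k in range(1, n))

print([groupings(n) for n in range(1, 8)])   # [1, 1, 2, 5, 14, 42, 132]
```

For n = 3 the count is 2, matching the two sentences (A ∧ B) ∧ C and A ∧ (B ∧ C).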
2.3.1
Sentences as Trees
Trees are very useful structures, often used for representation and analysis. They can be defined as mathematical entities, but are easily grasped without a formal definition. The
following is a tree, drawn according to a bottom-to-top convention.
The little circles are called nodes, the line segments joining them are called edges. The bottom node is the root. The nodes above a given node, which are joined to it by edges, are its children; the node is their parent. The extreme nodes, those without children, are the leaves. In the present example the root has three children, the leftmost child has two children, and the rightmost one is a leaf. If the root of the tree is a leaf, the tree consists of a single node. Note that every node can serve as a root of a tree, which is a part of the whole tree. It consists of the node and all its descendants: its children, the children’s children and so on. We call this the subtree determined by the node. Since we read from top to bottom, trees are often drawn downward, with the root at the top. The same tree, drawn top-to-bottom, appears as:
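And sometimes trees are drawn from left to right. Trees are often labeled; that is, every node has an associated label, which is usually some symbol, but which can be any object. Different nodes can have the same label. In the trees that represent sentences, every leaf is labeled by a sentential variable, and every other node by a connective. The following four sentences are represented by the trees written below them.
[Figure: trees for the four sentences A, ¬A, A ∧ B and A ∨ B. The tree for A is a single node labeled A; the tree for ¬A has a root labeled ¬ with one child labeled A; the trees for A ∧ B and A ∨ B each have a root labeled by the connective, with two children labeled A and B.]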
The general principle is very simple: If the expression is a sentential variable, the tree has one node labeled by this variable. Else, the root is labeled by the main connective. If the main connective is negation, the root has one child; if it is a binary connective, the root has two children. The subtrees determined by the children represent the immediate components of the sentence. Written as a tree,
(A ∨ C) ∧ ¬[A ∨ (¬B ∨ ¬¬C)]
is:
While taking more space than sequences, trees provide a very clear picture of the structure. The main connective labels the root. The displayed components are exactly the subtrees determined by the nodes.
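The general principle for building the tree of a sentence can be sketched in a few lines. The nested-tuple encoding, the `SYMBOL` table and the function name `tree_lines` are assumptions of this illustration; the tree is printed top-to-bottom, with each child indented two spaces under its parent.

```python
SYMBOL = {"not": "¬", "and": "∧", "or": "∨"}

def tree_lines(s, indent=0):
    """Return the tree of s as a list of indented lines, root first."""
    if isinstance(s, str):
        return [" " * indent + s]             # a leaf, labeled by a variable
    lines = [" " * indent + SYMBOL[s[0]]]     # the root, labeled by the main connective
    for child in s[1:]:                       # one subtree per immediate component
        lines += tree_lines(child, indent + 2)
    return lines

# ¬(¬A ∧ ¬B), the sentential expression (9)
s9 = ("not", ("and", ("not", "A"), ("not", "B")))
print("\n".join(tree_lines(s9)))
```

Each line of the output is a node; the subtree under a node is the block of more deeply indented lines that follows it.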
Homework 2.11 (i) Write down the tree representations of the sentences of Homework 2.7. (ii) Write down the sentential expressions that correspond to the following trees.
2.3.2
Polish Notation
Polish notation is a way of writing sentences in sequential form, which ensures unique readability without using parentheses. The idea is very simple. Write the connective name to the left of the sentences to which it applies. That is, if ‘∗’ is a binary connective, write ‘∗AB’ instead of ‘A ∗ B’. Here are some examples that show how the notation works.
(A ∧ B) ∧ C becomes ∧ ∧ ABC .
A ∧ (B ∧ C) becomes ∧A ∧ BC . ¬(A ∨ (B ∧ ¬C)) becomes ¬ ∨ A ∧ B¬C . It can be proven that Polish notation ensures unique readability, but the proof is not trivial. The following prescription converts our sentential expressions into Polish notation: Determine the main connective and the immediate components. Write the main connective leftmost and follow it by the main components, in their given order, after having converted each of them to Polish notation. Since immediate components involve shorter expressions than the whole sentence, the prescription reduces the task to simpler tasks. Repeating it, one will eventually get the desired Polish-notation form. Converting from Polish notation to ours requires a more difficult method that shall not be given at present. But you can acquire the skill with some practice. Homework 2.12
Convert the expressions of Homework 2.7 to Polish notation.
2.4
Syntax and Semantics
A language can be studied purely from the syntactic perspective. Viewed in this way, the language is a system consisting of expressions built of symbols, according to rules. The rules classify the expressions and determine how they can be combined and recombined into larger units. This view ignores the interpretation of the language, i.e., its link with some nonlinguistic domain: what its terms denote, what its sentences say, how their truth and falsity are determined. All of these come under the heading of semantics. The English sentence
(1) John likes Mary’s brother. is syntactically analysed as a compound of the proper noun ‘John’, the third-person present tense of the transitive verb ‘like’, and the noun phrase ‘Mary’s brother’–arranged in that order. The syntax can tell us that any compound of that form is a sentence. And it can rule out as non grammatical the combination (2) Likes John Mary’s brother. But it does not tell us what the words refer to, what the sentence says, and it does not mention truth and falsity. All of these belong to the semantics. Now the semantics must involve the syntax in an essential way. Because syntactic classification and syntactic structure are among the factors that determine how expressions are interpreted. But the syntax can stand by itself. A computer can handle the syntax as a system of symbols without semantics. The same distinction between syntax and semantics obtains in formal linguistic systems. In our case, the rules that govern sentential structure belong to the syntax. Under this heading come: the unique readability property, the main connective, components, occurrences, substitutions, the tree structure, and their like. But truth-values and the interpretation of the connectives, which is given by their truth tables, belong to the semantics. So do all concepts whose definitions involve truth and falsity: logical truth, logical falsity and logical equivalence. You should be aware of this fundamental distinction and know how to apply it to richer systems. In each case we shall have a syntax and a semantics. We shall later see that there are theorems that connect syntactic notions with semantic ones. For example, logical truths can be given a purely syntactic characterization. But do not confuse the two and do not bring semantic notions into the syntax. Logically equivalent sentences, for example, can have completely different syntactic structures.
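The division can be made concrete with a small sketch. The first two functions below are purely syntactic: they implement the Polish-notation prescription of 2.3.2 and one possible way of reading Polish notation back, without any mention of truth. The third is semantic: it needs truth-values and the truth tables. Everything here is an assumption made for illustration (the nested-tuple encoding, the function names, and the restriction to one-letter variables); in particular the reverse conversion is my own sketch, not a method given in the text.

```python
SYMBOL = {"not": "¬", "and": "∧", "or": "∨"}
CONNECTIVE = {"¬": ("not", 1), "∧": ("and", 2), "∨": ("or", 2)}

def to_polish(s):
    """Syntax: main connective first, then the converted immediate components."""
    if isinstance(s, str):
        return s
    return SYMBOL[s[0]] + "".join(to_polish(p) for p in s[1:])

def from_polish(text):
    """Syntax: read a Polish-notation string with one-letter variables."""
    def parse(i):
        ch = text[i]
        if ch not in CONNECTIVE:
            return ch, i + 1                  # a variable
        tag, arity = CONNECTIVE[ch]
        parts = []
        i += 1
        for _ in range(arity):                # read as many sentences as the connective needs
            part, i = parse(i)
            parts.append(part)
        return (tag, *parts), i

    sentence, end = parse(0)
    if end != len(text):
        raise ValueError("symbols left over after a complete sentence")
    return sentence

def truth_value(s, assignment):
    """Semantics: the truth-value of s under an assignment to its variables."""
    if isinstance(s, str):
        return assignment[s]
    if s[0] == "not":
        return not truth_value(s[1], assignment)
    left = truth_value(s[1], assignment)
    right = truth_value(s[2], assignment)
    return (left and right) if s[0] == "and" else (left or right)

s = ("not", ("or", "A", ("and", "B", ("not", "C"))))    # ¬(A ∨ (B ∧ ¬C))
print(to_polish(s))                                     # ¬∨A∧B¬C
print(from_polish(to_polish(s)) == s)                   # True
print(truth_value(s, {"A": False, "B": True, "C": True}))   # True
```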
2.5
Sentential Logic as an Algebra
2.5.0 We noted already that logical equivalence does not amount to equality. But logical equivalence shares with equality certain features, which make it possible to adopt an algebraic approach in which the equivalence symbol ‘≡’ plays a role analogous to that of ‘=’ . Recall that, like any equivalence relation, logical equivalence is reflexive, symmetric and transitive. Moreover, as stated earlier (cf. 2.2.1), it satisfies the substitution of equivalents principle:
If A ≡ A′, and we change B to B′ by substituting in it one or more occurrences of A by A′, then B ≡ B′.
The principle can be proved formally, but it is quite evident on intuitive grounds: the only way in which an occurrence of the component A can affect B’s truth-value is through the truth-value of A; if A and A′ always have the same value, the substitution makes no difference for the value of B. Hence B ≡ B′.
The algebraic method for establishing logical equivalence is the following. First we fix certain equivalences as our starting point (if you wish, our axioms), then we derive from them other equivalences by using repeatedly the substitution of equivalents principle. The equivalences enclosed in the following box (along with another group to be given in 2.5.2) can play the role of the starting point.
Double Negation Law:
    ¬¬A ≡ A
Associativity:
    (A ∧ B) ∧ C ≡ A ∧ (B ∧ C)          (A ∨ B) ∨ C ≡ A ∨ (B ∨ C)
Commutativity:
    A ∧ B ≡ B ∧ A                      A ∨ B ≡ B ∨ A
Idempotence:
    A ∧ A ≡ A                          A ∨ A ≡ A
Distributive Laws:
    A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)    A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C)
De Morgan’s Laws:
    ¬(A ∧ B) ≡ (¬A) ∨ (¬B)             ¬(A ∨ B) ≡ (¬A) ∧ (¬B)
The equivalences are meant as general laws; they hold for all A, B, C; this should be clear from the context, even though the words ‘for all’ do not appear. Except for Double Negation, the laws are listed in pairs; each consists of two dual laws: the leftmost is the law for conjunction, the rightmost the corresponding law for disjunction. E.g., De Morgan’s first law is for conjunction, the second for disjunction. Dual laws are obtained from each other by toggling, throughout, ‘∧’ and ‘∨’.
Double Negation, Associativity and Commutativity are obvious and were discussed already. Also obvious is Idempotence (the name means the same power: A ∧ A and A ∨ A have “the same power” as A). The last two pairs are less obvious, but are easily verified via truth-tables. Consider the distributive law for conjunction:
A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)
When we pass from the left-hand side to the right-hand side, we distribute ∧ over ∨. This is analogous to the arithmetical law by which multiplication can be distributed over addition (cf. footnote 5):
x · (y + z) = (x · y) + (x · z)
We have also the dual law for distributing disjunction over conjunction. But here there is no arithmetical analogue. Arithmetical addition does not distribute over multiplication: in general x + (y · z) ≠ (x + y) · (x + z).
The distributing of ∧ over ∨ pushes-in the conjunction: ‘∧’ enters into the parentheses of ‘(B ∨ C)’. Similarly, the distributing of ∨ over ∧ pushes-in the disjunction. The opposite move, from the right-hand side to the left-hand side, involves a pulling-out: ∧ is pulled out in the law for conjunction, ∨ is pulled out in the law for disjunction. Also, in the first case there is a pulling out of the common conjunct A, in the second case of the common disjunct A. In a similar way, De Morgan’s laws, in the left-to-right direction, involve the pushing-in of negation; in the opposite direction they involve a pulling out. This is accompanied by conjunction/disjunction toggling.
Therefore these are not distributive laws. You cannot distribute negation, because in general ¬(A ∧ B) is not logically equivalent to (¬A) ∧ (¬B), and similarly for disjunctions.
Instances: Since the laws hold for all A, B, C, we can substitute any sentential expressions for the sentential variables and get an equivalence that holds. (Of course, occurrences of the same variable should be substituted by the same sentential expression.) An equivalence obtained in this way is an instance of the law. For example,
(B ∨ C) ∧ ((A ∨ B) ∨ C) ≡ ((B ∨ C) ∧ (A ∨ B)) ∨ ((B ∨ C) ∧ C)
is an instance of the distributive law for conjunction. It is obtained by substituting ‘B ∨ C’ for ‘A’, and ‘A ∨ B’ for ‘B’.
Footnote 5: Sometimes conjunction and disjunction are compared to multiplication and addition. If we put: T = 1, F = 0, then the truth-value of a conjunction is the product of the conjuncts’ values. But then disjunction does not correspond to addition; the value of a disjunction is not the sum but the maximum of the disjuncts’ values.
To identify a given equivalence as an instance of some law is not always easy. In the following homework you are required to do such identifications. Note that the law is grounded in the semantics, but being an instance is a purely syntactic notion. The following is an exercise in syntax.
Homework 2.13 Each of the following is an instance of one of the listed equivalence laws. Find the law and the substitution that has been used to get the instance. Note: Sometimes the two sides of the equivalence have been switched around.
1. ¬(¬A∧B ∨ ¬A) ≡ ¬(¬A ∧ B) ∧ ¬¬A
2. ¬(A ∨ ¬B) ∧ ¬(B ∧ C) ≡ ¬((A ∨ ¬B) ∨ (B ∧ C))
3. ¬(¬(A ∨ B) ∧ B) ≡ ¬¬(A ∨ B) ∨ ¬B
4. ¬B ∨ (C ∧ ¬A) ≡ (¬B ∨ C) ∧ (¬B ∨ ¬A)
5. (¬(A ∨ B)∧A) ∨ (¬(A ∨ B) ∧ A) ≡ ¬(A ∨ B) ∧ (A ∨ A)
6. A∧(B∨C) ∨ A∧C ≡ A∧C ∨ A∧(B∨C)
7. (B∨A) ∧ ((A∨B)∧(B∨A)) ≡ ((B ∨A) ∧ (A∨B)) ∧ (B ∨A)
8. ¬(C ∧ (A ∨ C)) ≡ ¬C ∨ ¬(A ∨ C)
9. (A ∨ B) ∧ (A ∨ C) ≡ A ∨ B∧C
10. ¬B ∧ (B ∨ ¬B) ≡ (¬B ∧ B) ∨ (¬B ∧ ¬B)
Left and Right Distributive Laws: In the distributive laws listed in the box, the pushed-in sentence appears on the left. Hence we call them the left distributive laws. The right distributive laws are:
(B ∨ C) ∧ A ≡ (B ∧ A) ∨ (C ∧ A)
(B ∧ C) ∨ A ≡ (B ∨ A) ∧ (C ∨ A)
They can be established directly via truth-tables, or derived from the left distributive laws by using commutativity. For example: (B ∨ C) ∧ A ≡ A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C) ≡ (B ∧ A) ∨ (C ∧ A)
The first equivalence is an instance of commutativity, the second–the distributive law for conjunction, the last–a simultaneous substitution of A ∧ B and A ∧ C by the equivalent (via commutativity) B ∧ A and C ∧ A. This last step can be split into two: (A ∧ B) ∨ (A ∧ C) ≡ (B ∧ A) ∨ (A ∧ C) ≡ (B ∧ A) ∨ (C ∧ A) Altogether there are three uses of commutativity. From now on we shall refer by ‘distributive law’ to both left and right distributive laws and we shall not bother to indicate every separate use of commutativity.
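Establishing an equivalence directly via truth-tables is a mechanical task, as the following sketch suggests. The nested-tuple encoding and the function names `value`, `variables` and `equivalent` are assumptions made for this illustration. The check treats the sentential variables as atoms and runs through all assignments of truth-values to them.

```python
from itertools import product

def value(s, assignment):
    """Truth-value of s, computed from the truth tables of ¬, ∧, ∨."""
    if isinstance(s, str):
        return assignment[s]
    if s[0] == "not":
        return not value(s[1], assignment)
    left, right = value(s[1], assignment), value(s[2], assignment)
    return (left and right) if s[0] == "and" else (left or right)

def variables(s):
    """The set of sentential variables displayed in s."""
    if isinstance(s, str):
        return {s}
    return set().union(*(variables(p) for p in s[1:]))

def equivalent(s, t):
    """True iff s and t get the same value in every row of the truth table."""
    names = sorted(variables(s) | variables(t))
    rows = product([True, False], repeat=len(names))
    return all(value(s, dict(zip(names, row))) == value(t, dict(zip(names, row)))
               for row in rows)

# Right distributive law: (B ∨ C) ∧ A ≡ (B ∧ A) ∨ (C ∧ A)
lhs = ("and", ("or", "B", "C"), "A")
rhs = ("or", ("and", "B", "A"), ("and", "C", "A"))
print(equivalent(lhs, rhs))    # True

# Negation does not distribute: ¬(A ∧ B) is not equivalent to ¬A ∧ ¬B
print(equivalent(("not", ("and", "A", "B")),
                 ("and", ("not", "A"), ("not", "B"))))    # False
```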
2.5.1
Using the Equivalence Laws
We can establish an equivalence in a sequence of steps that create a chain:
A1 ≡ A2 ≡ . . . ≡ An
This, via transitivity, proves: A1 ≡ An. The idea is that each of the equivalences Ai ≡ Ai+1 should be easily deducible, or previously established. Here is an example:
¬[(A ∧ B) ∨ ¬C] ≡ ¬(A ∧ B) ∧ ¬¬C ≡ ¬(A ∧ B) ∧ C ≡ (¬A ∨ ¬B) ∧ C ≡ (¬A ∧ C) ∨ (¬B ∧ C)
Find for yourself what law (or laws) is used in each step. The chain proves:
¬[(A ∧ B) ∨ ¬C] ≡ (¬A ∧ C) ∨ (¬B ∧ C) .
In this way we have simplified ¬[(A ∧ B) ∨ ¬C] to (¬A ∧ C) ∨ (¬B ∧ C) .
Another, longer but clearer, style of presenting equivalence proofs consists in writing the sentences on separate lines, indicating in the margin the grounds for the step. Our last chain becomes:
1. ¬[(A ∧ B) ∨ ¬C]
2. ¬(A ∧ B) ∧ ¬¬C           De Morgan’s law for disjunction
3. ¬(A ∧ B) ∧ C             double negation law
4. (¬A ∨ ¬B) ∧ C            De Morgan’s law for conjunction
5. (¬A ∧ C) ∨ (¬B ∧ C)      right distributive law for conjunction
Often, several simple steps are combined into one. We can, for example, drop double negations immediately without special indication; we can accordingly pass directly from 1. to 3. in the proof above. We also need not use separate steps for changing (via associativity) the grouping in repeated conjunctions and disjunctions; in fact, these groupings can be ignored (cf. 2.3.0 “Repeated Conjunctions and Disjunctions”). Similarly, we need not bother with explicit changes of order (via commutativity) in conjunctions and disjunctions, or with explicit deletions of repeated conjuncts or disjuncts (via idempotence). Once you get the hang of it you can assimilate these steps into others.
Pushing-In Negations We mentioned already the pushing-in of negations, via De Morgan’s laws: From ¬(A ∧ B) to ¬A ∨ ¬B .
From ¬(A ∨ B) to ¬A ∧ ¬B .
This replaces one occurrence of ‘¬’, whose scope is A ∧ B or A ∨ B, by two occurrences with smaller scopes: A and B. In the last example, negation is pushed inside in the passage from 1 to 2; then it is pushed again in the passage from 3 to 4. As long as there are components of the form ¬(A ∧ B), or ¬(A ∨ B), we can push negation in. By repeating this process, we will eventually get a sentence that has no components of these forms. (The mathematical proof of this claim will not be given here. But intuitively it is very clear, especially after having worked out a few examples.) In addition, we can always drop double negations. At the end of the process there will be no component with a string of more than one negation. Consequently, any sentential expression whose connectives are among ‘¬’, ‘∧’, and ‘∨’ can be transformed into an equivalent one, in which negation applies only to the sentential variables.
If several negation-occurrences can be pushed in, we can choose any of them. The final outcome does not depend on the choices of the pushed-in negations, provided that negation is pushed all the way in, all double negations are dropped and no other laws are applied. But the number of steps can vary. It is advisable to start with the outermost negations, because as the negation moves in it will form double negations with the inner ones, which can be immediately dropped. Here is an example to illustrate all these points. Given ¬[A ∧ ¬(B ∧ C)], we can start by pushing-in the inner (i.e., second) negation:
¬[A ∧ ¬(B ∧ C)], ¬[A ∧ (¬B ∨ ¬C)], ¬A ∨ ¬(¬B ∨ ¬C), ¬A ∨ (B ∧ C)
If we start by pushing-in the outermost negation, we get:
¬[A ∧ ¬(B ∧ C)], ¬A ∨ ¬¬(B ∧ C), ¬A ∨ (B ∧ C)
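The procedure of pushing negations all the way in, starting with the outermost ones and dropping double negations as they appear, can be sketched recursively. The nested-tuple encoding and the name `push_negations` are assumptions made for this illustration; the function uses only De Morgan's laws and the double negation law.

```python
def push_negations(s):
    """Return an equivalent sentence in which ¬ applies only to variables."""
    if isinstance(s, str):
        return s
    if s[0] != "not":                         # ∧ or ∨: work inside both components
        return (s[0], push_negations(s[1]), push_negations(s[2]))
    inner = s[1]
    if isinstance(inner, str):
        return s                              # ¬ applied to a variable: nothing to do
    if inner[0] == "not":
        return push_negations(inner[1])       # drop a double negation
    dual = "or" if inner[0] == "and" else "and"   # De Morgan: toggle ∧ and ∨
    return (dual,
            push_negations(("not", inner[1])),
            push_negations(("not", inner[2])))

# ¬[A ∧ ¬(B ∧ C)]  becomes  ¬A ∨ (B ∧ C)
s = ("not", ("and", "A", ("not", ("and", "B", "C"))))
print(push_negations(s))    # ('or', ('not', 'A'), ('and', 'B', 'C'))
```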
De Morgan’s Laws for Repeated Conjunctions and Disjunctions The law for conjunctions with more than two conjuncts is: ¬(A1 ∧ A2 ∧ . . . ∧ An ) ≡ ¬A1 ∨ ¬A2 ∨ . . . ∨ ¬An Here we are ignoring the grouping, since we are interested in logical equivalence only. The equivalence can be deduced by repeated applications of the two-conjunct version of De Morgan. For example, if n = 3, we have: ¬[(A1 ∧ A2 ) ∧ A3 ] ≡ ¬(A1 ∧ A2 ) ∨ ¬A3 ≡ (¬A1 ∨ ¬A2 ) ∨ ¬A3 It is also easily derivable by considering truth-values: The left-hand side is true iff A1 ∧ A2 ∧ . . . ∧ An is false, i.e., iff at least one of the Ai ’s is false. The right-hand side is true iff at least one of the sentences ¬Ai is true, i.e., iff at least one of the Ai ’s is false. Hence, the left-hand side is true iff the right-hand side is. The case of disjunctions is the exact dual: ¬(A1 ∨ A2 ∨ . . . ∨ An ) ≡ ¬A1 ∧ ¬A2 ∧ . . . ∧ ¬An Duality applies throughout; each of the above observations or claims has a dual.
Pushing-In Conjunctions Conjunctions are pushed-in by distributing repeatedly conjunction over disjunction. As explained already, we do it by going left-to-right in the distributive laws for conjunction: A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)
(B ∨ C) ∧ A ≡ (B ∧ A) ∨ (C ∧ A)
It increases the size (both ‘A’ and ‘∧’ occur once on the left, but twice on the right), but decreases the scope of the conjunction (instead of B ∨ C we have the scopes B and C). It also increases the scope of the disjunction. Here is an example of pushing-in conjunctions: A ∧ [(¬B ∨ C) ∧ ¬D] ≡ A ∧ [(¬B ∧ ¬D) ∨ (C ∧ ¬D)] ≡ (A ∧ ¬B ∧ ¬D) ∨ (A ∧ C ∧ ¬D) If we keep pushing-in conjunctions, we must eventually arrive at a point where no further pushing-in is possible; at that stage, there are no components either of the form A ∧ (B ∨ C) or of the form (B ∨ C) ∧ A. You may convince yourself that the process must terminate by working out a few examples; the formal proof, which will not be given here, is far from trivial. WATCH IT: Distributing the conjunction in ¬[(B ∨ C) ∧ A] yields ¬[B∧A ∨ C ∧A]. But you cannot distribute the conjunction in ¬(B ∨ C) ∧ A , because the first conjunct is not a
disjunction, but a negation. You can however simplify by pushing-in the negation: ¬B ∧ ¬C ∧ A .
Simultaneous Distributing of Conjunctions: Repeated applications of the distributive laws to (A1 ∨ A2) ∧ (B1 ∨ B2) yield the chain:
(A1 ∨ A2) ∧ (B1 ∨ B2) ≡ (A1 ∧ (B1 ∨ B2)) ∨ (A2 ∧ (B1 ∨ B2)) ≡ (A1 ∧ B1) ∨ (A1 ∧ B2) ∨ (A2 ∧ B1) ∨ (A2 ∧ B2) .
The final sentence is a disjunction of all the conjunctions Ai ∧ Bj. This generalizes to arbitrary disjunctions: (A1 ∨ A2 ∨ . . . ∨ Am) ∧ (B1 ∨ B2 ∨ . . . ∨ Bn) is logically equivalent to the disjunction of all the conjunctions in which one of the Ai’s is “conjuncted” with one of the Bj’s:
(A1 ∧ B1) ∨ . . . ∨ (A1 ∧ Bn) ∨ (A2 ∧ B1) ∨ . . . ∨ (A2 ∧ Bn) ∨ . . . . . . ∨ (Am ∧ B1) ∨ . . . ∨ (Am ∧ Bn)
Altogether there are m·n disjuncts. It generalizes further to more than two conjuncts: (A1 ∨ . . . ∨ Am) ∧ (B1 ∨ . . . ∨ Bn) ∧ (C1 ∨ . . . ∨ Cp) is logically equivalent to the disjunction of all conjunctions of the form Ai ∧ Bj ∧ Ck, where 1 ≤ i ≤ m, 1 ≤ j ≤ n, 1 ≤ k ≤ p; here we get m·n·p disjuncts (there are m choices for i, n choices for j, and p choices for k). In particular, (A1 ∨ A2) ∧ (B1 ∨ B2) ∧ (C1 ∨ C2) is logically equivalent to a disjunction of 8 conjunctions:
A1 ∧ B1 ∧ C1 ∨ A1 ∧ B1 ∧ C2 ∨ A1 ∧ B2 ∧ C1 ∨ A1 ∧ B2 ∧ C2 ∨ A2 ∧ B1 ∧ C1 ∨ A2 ∧ B1 ∧ C2 ∨ A2 ∧ B2 ∧ C1 ∨ A2 ∧ B2 ∧ C2
Pushing-In Disjunctions
Pushing-in disjunctions, which is carried out via the distributive laws for disjunction, is the exact dual. It reduces the scopes of ∨, but enlarges those of ∧. For example, two steps of pushing-in disjunction yield:
A ∨ [(¬B ∧ C) ∨ ¬D] ≡ A ∨ [(¬B ∨ ¬D) ∧ (C ∨ ¬D)] ≡ (A ∨ ¬B ∨ ¬D) ∧ (A ∨ C ∨ ¬D)
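The simultaneous distribution of conjunctions amounts to choosing one disjunct from each disjunction in every possible way. The sketch below illustrates this for the 2·2·2 example; the list-of-names representation and the helper names (`distribute`, `eval_conj_of_disj`, `eval_disj_of_conj`) are assumptions made for the illustration, not notation from the text.

```python
from itertools import product

def distribute(disjunctions):
    """Turn a conjunction of disjunctions into a disjunction of conjunctions.

    `disjunctions` is a list of lists of variable names. Each resulting tuple
    picks exactly one disjunct from every disjunction.
    """
    return list(product(*disjunctions))

def eval_conj_of_disj(disjunctions, env):
    # True when every disjunction has at least one true member.
    return all(any(env[x] for x in d) for d in disjunctions)

def eval_disj_of_conj(conjunctions, env):
    # True when at least one conjunction has all its members true.
    return any(all(env[x] for x in c) for c in conjunctions)

groups = [["A1", "A2"], ["B1", "B2"], ["C1", "C2"]]
pushed = distribute(groups)
print(len(pushed))  # m * n * p = 2 * 2 * 2 = 8

# Compare the two forms under all 64 assignments to the six variables.
names = [x for g in groups for x in g]
same = all(
    eval_conj_of_disj(groups, dict(zip(names, vals)))
    == eval_disj_of_conj(pushed, dict(zip(names, vals)))
    for vals in product([True, False], repeat=len(names))
)
print(same)
```

The dual case (distributing disjunction over several conjunctions) is obtained by swapping the roles of `all` and `any` in the two evaluators.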
Repeated pushing-in of disjunction terminates at a stage where there are no components either of the form A ∨ (B ∧ C) or of the form (B ∧ C) ∨ A. As in the case of conjunction, we can distribute, at one go, disjunction over several conjunctions: (A1 ∧ A2 ∧ . . . ∧ Am) ∨ (B1 ∧ B2 ∧ . . . ∧ Bn) is logically equivalent to:
(A1 ∨ B1) ∧ . . . ∧ (A1 ∨ Bn) ∧ (A2 ∨ B1) ∧ . . . ∧ (A2 ∨ Bn) ∧ . . . . . . ∧ (Am ∨ B1) ∧ . . . ∧ (Am ∨ Bn)
And this generalizes further to more than two disjuncts.
Homework
2.14 Let G be the sentence: ¬[(A ∨ ¬B) ∧ ¬C] ∧ ¬[¬C ∨ (¬D ∧ E)]
Construct the following sentences:
G1 : Obtained from G by pushing-in negations all the way.
G2 : Obtained from G1 by pushing-in conjunctions all the way.
G3 : Obtained from G1 by pushing-in disjunctions all the way.
If possible, simplify G2 and G3 by using the idempotence laws.
2.15 Let H = ¬G, where G is as in 2.14 above. Construct H1, obtained from H by pushing-in negations all the way; H2, obtained from H1 by pushing-in conjunctions all the way; and H3, obtained from H1 by pushing-in disjunctions all the way.
WATCH IT: It is inadvisable to mix pushing-in conjunctions and pushing-in disjunctions. The first reduces the scopes of ∧ and enlarges those of ∨. The second has the opposite effect. And both increase sentence size. If you interlace them you will get longer and longer sentences. For example:
A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C) ≡ [(A ∧ B) ∨ A] ∧ [(A ∧ B) ∨ C] ≡ {[(A ∧ B) ∨ A] ∧ (A ∧ B)} ∨ {[(A ∧ B) ∨ A] ∧ C} ≡ . . .
Occasionally you may want to apply the two kinds of pushing-in to separate components, or to combine them with other operations in between.
Pulling Out Negations and Common Factors
By applying De Morgan’s laws from right to left, we can pull out negations: ¬A ∨ ¬B is converted to ¬(A ∧ B), and ¬A ∧ ¬B is converted to ¬(A ∨ B). This generalizes to disjunctions of more than two disjuncts; similarly for conjunctions; i.e., we can replace
¬A1 ∨ ¬A2 ∨ . . . ∨ ¬An   by   ¬(A1 ∧ A2 ∧ . . . ∧ An) ,
¬A1 ∧ ¬A2 ∧ . . . ∧ ¬An   by   ¬(A1 ∨ A2 ∨ . . . ∨ An) .
Similarly, by applying the distributive laws in the right-to-left direction we pull out a common factor; this is a common conjunct if the law is for conjunction, and a common disjunct if the law is for disjunction. Thus we can replace:
(A∧B) ∨ (A∧C)   by   A ∧ (B ∨ C) ,
(A∨B) ∧ (A∨C)   by   A ∨ (B ∧ C) .
It generalizes to more than two disjuncts that share a common conjunct, and to more than two conjuncts that share a common disjunct:
(A∧B) ∨ (A∧C) ∨ (A∧D)   is replaceable by   A ∧ (B ∨ C ∨ D) .
(A∨B) ∧ (A∨C) ∧ (A∨D)   is replaceable by   A ∨ (B ∧ C ∧ D) .
Since each pull-out step reduces the size of the sentence, repeated pull-outs–whether of common conjuncts or of common disjuncts, or of both–must terminate at a stage where no further pulling out is possible. In the case of more than two disjuncts (or more than two conjuncts), the final outcome of repeated pull-out can be highly sensitive to the order in which it is done. Consider for example: (A∧B) ∨ (A∧C) ∨ (D∧C) . If we group together the first two disjuncts and pull out A, we get: [A∧(B ∨C)] ∨ (D∧C) and no further pulling out is possible. But if we group together the second and the third disjunct and pull out C, we get: (A∧B) ∨ [(A∨D)∧C] and, again, no further pulling out is possible. The two final outcomes are quite different; they are of course logically equivalent, but the equivalence is not obvious at first glance.
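That the two outcomes of pulling out are logically equivalent, although they look different, can be confirmed by comparing their truth-values under all sixteen assignments. A minimal sketch follows; the function names are ours.

```python
from itertools import product

def original(a, b, c, d):
    # (A∧B) ∨ (A∧C) ∨ (D∧C)
    return (a and b) or (a and c) or (d and c)

def pull_out_a(a, b, c, d):
    # [A∧(B∨C)] ∨ (D∧C): A pulled out of the first two disjuncts
    return (a and (b or c)) or (d and c)

def pull_out_c(a, b, c, d):
    # (A∧B) ∨ [(A∨D)∧C]: C pulled out of the last two disjuncts
    return (a and b) or ((a or d) and c)

rows = list(product([True, False], repeat=4))
print(all(original(*r) == pull_out_a(*r) == pull_out_c(*r) for r in rows))
```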
2.5.2 Additional Equivalence Laws
If we add to the laws of our previous box the following ones, we get a set of laws from which every tautological equivalence is derivable by (repeated) substitutions of equivalents. We shall not prove this mathematical theorem. But you can see how the additional laws enable us to drop certain conjuncts or certain disjuncts from an expression.
Tautological Conjunct: A ∧ (B ∨ ¬B) ≡ A          Contradictory Disjunct: A ∨ (B ∧ ¬B) ≡ A
Contradictory Conjunct: A ∧ (B ∧ ¬B) ≡ B ∧ ¬B     Tautological Disjunct: A ∨ (B ∨ ¬B) ≡ B ∨ ¬B
You can easily verify these laws by truth-value considerations. E.g., the truth-value of A ∧ (B ∨ ¬B) is the same as the value of A, because the value of B ∨ ¬B is always T. Note: B ∨ ¬B can be replaced, in the above laws, by any logically equivalent sentence, i.e., by any tautology. Similarly B ∧ ¬B can be replaced by any contradiction. The following pair of laws is derivable from the previous ones. They are important, because together with the other laws that do not involve negation (i.e., all other laws except De Morgan’s) they characterize all the algebraic properties of sentential logic that involve conjunction and disjunction only. They are also very useful in simplifications, by enabling immediate deletions of certain conjuncts, or certain disjuncts.
A ∧ (A ∨ B) ≡ A
A ∨ (A ∧ B) ≡ A
Again, simple truth-value considerations show that these equivalences hold. Take for example the first: if A gets T, so does A ∨ B and so does A ∧ (A ∨ B); if A gets F, so does A ∧ (A ∨ B). The derivation of the laws from the previous ones is much less obvious than the truth-value arguments. But it is of mathematical interest that certain laws are formally derivable from others.
Here is a derivation of the first. Note that the first step consists in applying the contradictory disjunct law from right to left, i.e., in adding a contradictory disjunct.
1. A ∧ (A ∨ B)
2. [A ∧ (A ∨ B)] ∨ (A ∧ ¬A)     adding a contradictory disjunct,
3. A ∧ [A ∨ B ∨ ¬A]     pulling out the common conjunct A,
4. A ∧ [B ∨ (A ∨ ¬A)]     commutativity and associativity of disjunction,
5. A ∧ (A ∨ ¬A)     tautological disjunct,
6. A     tautological conjunct.
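As a sanity check on the derivation of A ∧ (A ∨ B) ≡ A, one can confirm that every line has the same truth-table column. This sketch (the names `steps` and `column` are ours) only checks that the lines are equivalent; it does not check that each step is justified by the law cited for it.

```python
from itertools import product

# One function per line of the derivation, in order.
steps = [
    lambda a, b: a and (a or b),                     # 1. A ∧ (A ∨ B)
    lambda a, b: (a and (a or b)) or (a and not a),  # 2. [A ∧ (A ∨ B)] ∨ (A ∧ ¬A)
    lambda a, b: a and (a or b or (not a)),          # 3. A ∧ [A ∨ B ∨ ¬A]
    lambda a, b: a and (b or (a or not a)),          # 4. A ∧ [B ∨ (A ∨ ¬A)]
    lambda a, b: a and (a or not a),                 # 5. A ∧ (A ∨ ¬A)
    lambda a, b: a,                                  # 6. A
]

def column(f):
    """Truth-table column of f over the four assignments to A and B."""
    return tuple(f(a, b) for a, b in product([True, False], repeat=2))

columns = [column(f) for f in steps]
print(all(col == columns[0] for col in columns))
```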
Redundant Conjuncts and Disjuncts:
Given a conjunction
A1 ∧ A2 ∧ . . . ∧ An or a disjunction A1 ∨ A2 ∨ . . . ∨ An the conjunct (or a disjunct) Ai is redundant if its deletion from the conjunction (or from the disjunction) yields a logically equivalent sentence. For example, if the same conjunct, or the same disjunct, occurs more than once, we can delete repeated occurrences, via idempotence. The two last groups of equivalence laws imply the following cases of redundancy. Redundant Conjuncts: (C1) Every tautological conjunct is redundant. (C2) If one of the conjuncts is contradictory then every other conjunct is redundant. (C3) If A is a conjunct, then every other conjunct of the form A ∨ B (or one logically equivalent to A ∨ B) is redundant. Redundant Disjuncts: (D1) Every contradictory disjunct is redundant. (D2) If one of the disjuncts is tautological, then every other disjunct is redundant. (D3) If A is a disjunct, then every other disjunct of the form A∧B (or one logically equivalent to A ∧ B) is redundant.
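The definition of a redundant conjunct translates directly into a truth-table test: delete the conjunct and compare the two conjunctions under every assignment. The following sketch assumes conjuncts are given as Python functions of the sentential variables; `redundant_conjuncts` is a hypothetical helper, not part of the text.

```python
from itertools import product

def redundant_conjuncts(conjuncts, n_vars):
    """Indices i such that deleting conjunct i alone yields an equivalent conjunction."""
    rows = list(product([True, False], repeat=n_vars))
    result = []
    for i in range(len(conjuncts)):
        rest = conjuncts[:i] + conjuncts[i + 1:]
        if all(
            all(f(*r) for f in conjuncts) == all(f(*r) for f in rest)
            for r in rows
        ):
            result.append(i)
    return result

# A ∧ (A ∨ B) ∧ (B ∨ ¬B):
# conjunct 1 is redundant by (C3), conjunct 2 by (C1), conjunct 0 is not.
conjuncts = [
    lambda a, b: a,
    lambda a, b: a or b,
    lambda a, b: b or not b,
]
print(redundant_conjuncts(conjuncts, 2))  # [1, 2]
```

Each index is tested separately, so the result says which single deletions are safe. Deleting several redundant conjuncts at once may require re-checking, as case (C2) shows.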
Homework
2.16 Apply the simplification techniques in order to simplify the sentences of Homework 2.1. Write the simplified sentences (i) using only ¬ and ∧, (ii) using only ¬ and ∨.
Note: You can now see the advantage of having both ∧ and ∨ in the system. Every sentence can be rewritten in logically equivalent form, using only ¬ and ∧ (or only ¬ and ∨). But the rewriting results in hard to grasp constructs and renders the algebraic technique highly obscure (at least for humans).
The final simplified form of a sentence should not contain redundant conjuncts, or redundant disjuncts, for one can simplify further by deletion. But redundant components can be useful in the middle of a derivation; a contradictory disjunct was added in the first step of the last derivation. Here is another example. We show that A ∧ ¬A ≡ B ∧ ¬B, using only the law for contradictory disjunct (and the distributive and commutative laws).
1. A ∧ ¬A
2. A∧¬A ∨ B∧¬B     adding the contradictory B ∧ ¬B,
3. B ∧ ¬B     dropping the contradictory A ∧ ¬A.
The last step involves switching the order of disjuncts, via commutativity, and applying the contradictory disjunct law in the form stated above. Having shown this, we now derive the contradictory conjunct law from the contradictory disjunct law:
1. A ∧ (B ∧ ¬B)
2. A ∧ (B ∧ ¬B) ∨ A ∧ ¬A     adding a contradictory disjunct,
3. A ∧ (B∧¬B ∨ ¬A)     pulling out A,
4. A ∧ ¬A     dropping the contradictory B ∧ ¬B,
5. B ∧ ¬B     by the previously established equivalence.
Such derivations do not follow the standard patterns of pushing-in, or pulling out. Some inventiveness is required. It can be shown that each of the four laws for contradictory conjuncts and disjuncts implies, given the preceding laws, the other three. (Can you see how, using De Morgan’s laws, you can get the contradictory disjunct law from the tautological conjunct law, and vice versa?) Note: We can remove the tautological conjuncts from any given conjunction. What if all the conjuncts are tautological? The conjunction is then a tautology and can be simplified to A∨¬A; but we cannot remove all conjuncts, for this will leave us with no sentence. Sometimes
a convention is made by which one posits “the empty conjunction” as a tautology; one adds a special sentence that, by definition, gets only the value T. One can also make sense of “the empty conjunction” by regarding the truth of each conjunct as a truth-constraint; the conjunction is true just when it meets all its truth-constraints. If there are no conjuncts, there are no truth-constraints and the conjunction is always true, i.e., tautological. The dual case is a disjunction in which all disjuncts are contradictory. The disjunction is then contradictory and can be simplified to A ∧ ¬A. One can posit “the empty disjunction” as the contradictory sentence: a special sentence that, by definition, gets only F. One can make sense of “the empty disjunction” by regarding the falsity of each disjunct as a falsity-constraint; the disjunction is false just when it meets all its falsity-constraints. If there are no disjuncts, there are no falsity-constraints and the disjunction is always false, i.e., contradictory.
Homework
2.17 Simplify the following sentences. Try to get simplifications that are as simple, or as short, as possible. Indicate briefly how the simplification is achieved.
1. A∧B ∨ ¬A∧B
2. A∧B ∨ ¬B∧A
3. (A ∨ B) ∧ (A ∨ ¬B)
4. (A ∨ B) ∧ (B ∨ C)
5. (A ∨ B∧C) ∧ ¬(B ∧ C)
6. ¬B ∧ (B ∨ A)
7. ¬B ∨ B∧A
8. A ∧ (B ∨ C) ∨ ¬B
9. (A ∨ ¬B) ∧ ¬(¬B ∨ A)
10. ¬(A ∧ B) ∧ ¬(A ∨ B)
11. (¬B ∨ A∧B) ∧ (A ∨ ¬A∧B)
12. ¬(¬A ∧ ¬B) ∧ (¬A ∨ ¬B)
13. (A ∨ B) ∧ (A ∨ ¬B) ∧ (¬A ∨ B) ∧ (¬A ∨ ¬B)
14. A∧B ∨ A∧¬B ∨ ¬A∧B ∨ ¬A∧¬B
15. ¬(A∧B ∨ A∧C ∨ B∧C)
16. (¬A ∨ B) ∧ (¬B ∨ C) ∧ (¬C ∨ A)
17. ¬[(A ∨ B ∨ C) ∧ (¬A ∨ ¬B ∨ ¬C)]
18. ¬[A∧¬B ∨ ¬A∧B] ∧ ¬[B∧¬C ∨ ¬B∧C]
19. A∧¬B ∨ ¬A∧B ∨ A∧¬C ∨ ¬A∧C ∨ B∧¬C ∨ ¬B∧C
20. ¬[¬A ∨ (¬B ∨ C)] ∨ (¬(¬A ∨ B) ∨ ¬A ∨ C)
Small Sets of Laws
As noted above, the laws in the first two boxes (in 2.5 and 2.5.2) are sufficient for deriving, via the equivalence properties and substitution of equivalents, all the logical equivalences. Actually, we can reduce the number of required laws, because some of our listed laws are derivable from others. The right-hand side laws are derivable from the double negation law and the left-hand side laws; and vice versa. Here, for example, is a derivation of commutativity for disjunction from the double negation law, De Morgan’s law for conjunction and commutativity for conjunction.
1. A ∨ B
2. ¬¬A ∨ ¬¬B     adding double negation,
3. ¬(¬A ∧ ¬B)     pulling-out negation, via De Morgan for conjunction,
4. ¬(¬B ∧ ¬A)     commutative law for conjunction,
5. ¬¬B ∨ ¬¬A     pushing-in negation, via De Morgan for conjunction,
6. B ∨ A     deleting double negations.
And here is a derivation of the law for a tautological conjunct, from the double negation law, De Morgan’s law for disjunction and the law for contradictory disjunct.
1. A ∧ (B ∨ ¬B)
2. ¬¬A ∧ ¬¬(B ∨ ¬B)     adding double negation,
3. ¬[¬A ∨ ¬(B ∨ ¬B)]     pulling negation out, via De Morgan for disjunction,
4. ¬[¬A ∨ (¬B ∧ ¬¬B)]     pushing the third negation in, via De Morgan for disjunction,
5. ¬¬A     deleting a contradictory disjunct,
6. A     deleting double negation.
The pattern should be now clear: The law is derived from the corresponding law in the other column, using the double negation law and De Morgan–first from right to left (double negations are added and negation is pulled out), then from left to right (negation is pushed in and double negations are dropped). Homework 2.18 (i) Derive the associative law for conjunction, from the associative law for disjunction, the double negation law and De Morgan’s law for disjunction. (ii) Derive the law for tautological disjunct from the law for contradictory conjunct, the double negation law and De Morgan’s law for conjunction. While the derivability of some laws from others is of mathematical interest, you need not restrict yourself to a small basis of laws. When you simplify, or when you prove equivalences, you can use freely all the laws, as well as any equivalence that is obvious, or has been previously established.
2.5.3 Duality
We have noted already the duality phenomenon. If we toggle ∧ and ∨ in any of our laws we get the dual law. It is not difficult to deduce from this that if an equivalence, stated in terms of ¬, ∧ and ∨, is derivable from these laws, so is the dual equivalence, obtained by toggling throughout ∨ and ∧. In general, duality can be defined by considering a certain operation on truth-tables: Say that a connective is the dual of another, if its truth table is obtained from that of the other by toggling everywhere T and F. It now follows that ∨ is the dual of ∧ :
A  B  A∧B          A  B  A∨B
T  T   T           F  F   F
T  F   F           F  T   T
F  T   F           T  F   T
F  F   F           T  T   T
The order of rows in the second truth-table is different from the customary one; but this, as remarked in 2.1.3, makes no difference. If one connective is the dual of another, then the second is the dual of the first; because by toggling T and F in the second truth-table, we get back our first table. Hence duality is symmetric. We therefore speak of dual connectives, or of a dual pair. The dual of negation is, again, negation:
A  ¬A          A  ¬A
T   F          F   T
F   T          T   F
Duality of Sentential Expressions: The dual of a given sentential expression is the expression obtained from it by replacing every connective name by the name of its dual. Again, if, given the dual expression, we replace every connective name with the name of its dual, we get back our original expression. Hence duality, as a relation between expressions, is symmetric.
Note: Before an expression is transformed to its dual, it should be fully parenthesized. Conventions for omitting parentheses cannot be relied upon, because they discriminate between ∧ and ∨. E.g., to form the dual of A ∧ B ∨ C, we insert parentheses: (A ∧ B) ∨ C and then toggle the connectives: (A ∨ B) ∧ C. Without parentheses we would have gotten A ∨ B ∧ C; under the grouping conventions this would become A ∨ (B ∧ C), which is wrong.
Here are examples of dual expressions (each expression on the left is the dual of the one on its right):
A ∧ (B ∨ C)               A ∨ (B ∧ C)
¬A ∨ [¬(A ∧ B)]           ¬A ∧ [¬(A ∨ B)]
(A ∧ ¬B) ∨ (¬A ∧ B)       (A ∨ ¬B) ∧ (¬A ∨ B)
Now the following holds: The truth-tables of dual expressions are obtained from each other by toggling everywhere T and F. It has a formal proof, which we shall not bring here. But you can convince yourself of its truth by the following observation: We get the truth-table of an expression by working through its components, using at each stage the truth-table of the main connective. The toggling of truth-values transforms the truth-table of any connective to the table of its dual. Hence, toggling throughout the truth-values yields the truth-table of the expression in which every connective is replaced by its dual. The following pair of truth-tables illustrates what happens:
A  B  ¬A  ¬A ∨ B  ¬(¬A ∨ B)  A ∧ ¬(¬A ∨ B)
T  T  F     T         F            F
T  F  F     F         T            T
F  T  T     T         F            F
F  F  T     T         F            F

A  B  ¬A  ¬A ∧ B  ¬(¬A ∧ B)  A ∨ ¬(¬A ∧ B)
F  F  T     F         T            T
F  T  T     T         F            F
T  F  F     F         T            T
T  T  F     F         T            T
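The claim that toggling T and F throughout a truth-table yields the table of the dual expression can be tested mechanically on examples. In the sketch below, expressions are nested tuples such as ("and", "A", ("not", "B")); this representation and the names `evaluate`, `dual` and `toggling_gives_dual` are our own assumptions. The check is an illustration on a given expression, not the formal proof mentioned in the text.

```python
from itertools import product

def evaluate(e, env):
    """Evaluate a nested-tuple expression under the assignment env."""
    if isinstance(e, str):
        return env[e]
    op = e[0]
    if op == "not":
        return not evaluate(e[1], env)
    if op == "and":
        return evaluate(e[1], env) and evaluate(e[2], env)
    if op == "or":
        return evaluate(e[1], env) or evaluate(e[2], env)
    raise ValueError(op)

def dual(e):
    """Replace every connective by its dual: swap 'and' with 'or', keep 'not'."""
    if isinstance(e, str):
        return e
    if e[0] == "not":
        return ("not", dual(e[1]))
    swapped = {"and": "or", "or": "and"}[e[0]]
    return (swapped, dual(e[1]), dual(e[2]))

def toggling_gives_dual(e, names):
    """For every row, the dual at the toggled row has the toggled value."""
    for vals in product([True, False], repeat=len(names)):
        env = dict(zip(names, vals))
        toggled = {k: not v for k, v in env.items()}
        if evaluate(dual(e), toggled) != (not evaluate(e, env)):
            return False
    return True

expr = ("and", "A", ("not", ("or", ("not", "A"), "B")))  # A ∧ ¬(¬A ∨ B)
print(dual(expr))
print(toggling_gives_dual(expr, ["A", "B"]))
```

Because the tuples are always fully nested, the parenthesization issue noted earlier does not arise here.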
If two sentential expressions are equivalent, then their columns, in a truth-table with entries for both, are the same. If we toggle throughout T and F we get a truth-table for the dual expressions. Obviously, equal columns are transformed into equal columns. Therefore, we get the duality principle: If two sentential expressions are equivalent, so are their duals. The basic laws listed in the boxes have been arranged in such a way that the laws in each pair are duals: they are obtained from each other by replacing each expression by its dual. From any proven equivalence we can get, by the duality principle, a dual equivalence. For example, having established that ¬B ∧ (B ∨ A) ≡ A ∧ ¬B , we can deduce by duality that ¬B ∨ (B ∧ A) ≡ A ∨ ¬B . Duality can be extended to the case of laws that are stated in terms of truth-values, or which employ ‘tautology’ and ‘contradiction’. The dual of T is F, the dual of F is T. The dual of ‘tautology’ is ‘contradiction’, the dual of ‘contradiction’ is ‘tautology’. The following, for example, are duals:
If A is a tautology, then A ∧ B ≡ B .
If A is a contradiction, then A ∨ B ≡ B .
Homework
2.19 Write down the duals of 1., 5., 16., and 20. of Homework 2.17 and simplify them.
Note: As an algebraic structure, classical logic displays a striking symmetry in which T and F play symmetrical roles. In our use of language, however, truth and falsity cannot, in any sense, be considered symmetric. The essential difference between truth and falsity is brought about by the way language is used in our interactions with the extra-linguistic world and with each other. This subject, which has fundamental importance in the philosophy of language and the philosophy of logic, is beyond the scope of this book. Note only that the concept of logical implication (to be defined in chapter 4) reflects the asymmetry between truth and falsity. So does the fact that we find it advisable to have → (to be presently defined) as a connective, but we do not include its dual.
2.6 Conditional and Biconditional
Conditional is a binary connective, whose name is ‘→’. If we include it then, for every two sentences A and B, there is a sentence called the conditional of A and B, which we write as: A→B
A and B are called, respectively, the antecedent and the consequent of A → B. ‘Conditional’, like the names of the other connectives, is used ambiguously: for the connective (i.e., the operation) as well as the resulting sentence.
Note: Do not confuse the logical antecedent (the one just defined) with the grammatical antecedent of a pronoun (the word or phrase a pronoun refers to).
The truth-table of A → B is:
A  B  A→B
T  T   T
T  F   F
F  T   T
F  F   T
Conditional corresponds to the ‘If ... then ___’ formation of English. The sentence
(1) If Jack is at home then Jill is at home
can be recast in sentential logic as the conditional A → B, where A represents ‘Jack is at home’, and B represents ‘Jill is at home’. Recasting (1) in this way means that we consider it false if Jack is at home and Jill is not at home, and we consider it true in all other cases; in particular, if Jack is not at home. The problems arising from this interpretation of ‘If ... then ___’ are discussed in chapter 3, where the relations between natural language and sentential logic are looked into. In sentential logic → is just another connective having the above truth-table.
Obviously, the column of A → B is the same as the column for ¬A ∨ B. Hence we have: A → B ≡ ¬A ∨ B Consequently, → is expressible in terms of ¬ and ∨ (and therefore also in terms of ¬ and ∧). We could do without → in the sentential calculus; but there are good reasons for including it, which have to do with the role of → in the context of logical implications and formal deductions. On the other hand, ∨ (hence also ∧) is expressible in terms of ¬ and →: A ∨ B ≡ ¬A → B Thus, ¬ and each of the binary connectives ∧, ∨, → are sufficient to express the other two binary connectives. Note: Sometimes conditional goes under the name ‘implication’, or ‘material implication’. The term ‘implication’ will be used for a different purpose. You should take care not to confuse the two. Grouping Convention for →: The convention is that ‘→’ binds more weakly than any of the previous connective names. This means that in restoring parentheses we first determine the scopes of negations to be the smallest scopes consistent with the existent grouping, then the scopes of conjunctions, then those of disjunctions, and then the scopes of the conditionals. For example, the grouping in: ¬A ∧ B ∨ B → ¬C ∨ D is: [((¬A) ∧ B) ∨ B] → [(¬C) ∨ D] Conditional does not have the associativity property, enjoyed by conjunction and disjunction: (A → B) → C
and A → (B → C) are, in general, not equivalent. The two have different values when A, B, C get all F. It also lacks commutativity: A → B and B → A have different values, when A and B get different values. Consequently, when conditionals are repeated, grouping and order are extremely important. We cannot omit parentheses as we have done in the case of long conjunctions and disjunctions.
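The properties of the conditional stated above can be read off its truth-table, and a few lines of code make the check explicit. This is a sketch; `implies` is our own name for the truth-function of →, defined directly from the table.

```python
from itertools import product

def implies(a, b):
    """Truth-function of A → B: when A is T the value is that of B, otherwise T."""
    return b if a else True

pairs = list(product([True, False], repeat=2))

# A → B ≡ ¬A ∨ B
print(all(implies(a, b) == ((not a) or b) for a, b in pairs))

# A ∨ B ≡ ¬A → B
print(all((a or b) == implies(not a, b) for a, b in pairs))

# Failure of associativity when A, B, C all get F:
left = implies(implies(False, False), False)   # (A → B) → C
right = implies(False, implies(False, False))  # A → (B → C)
print(left, right)  # False True

# Failure of commutativity when A and B get different values:
print(implies(True, False), implies(False, True))  # False True
```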
Note: If we toggle T and F throughout the truth-table of conditional, we get the truth-table of ¬A ∧ B. Hence, if we were to introduce a connective dual to →, then the “dual-conditional” of A and B would be logically equivalent to ¬A ∧ B. This can be also seen by rewriting the conditional in the logically equivalent form ¬A ∨ B and forming the dual of that. None of the customary systems of logic has a “dual-conditional” as a primitive connective. To form the dual of an expression involving conditionals, we should therefore replace every component of the form A → B by ¬A ∧ B.
Homework
2.20 Consider the expression A → B → C → D. How many possible different sentences can we obtain from it by inserting parentheses? Find whether any two of the sentential expressions obtained in this way are logically equivalent.
2.21 In certain cases conditional can be distributed over conjunctions and disjunctions. For example, A → (B ∧ C) ≡ (A → B) ∧ (A → C). But sometimes the “distributing” involves a change in the other connective (the conjunction or the disjunction over which conditional is distributed). Find the “distributive laws” for the following cases.
• A → (B ∨ C)
• (A ∨ B) → C
• (A ∧ B) → C
2.22 Using the “distributive laws” of Homework 2.21, push → all the way in, in the following sentences:
1. (A ∨ B) → (C ∨ D)
2. (A ∨ B) → (C ∧ D)
3. (A ∧ B) → (C ∨ D)
4. (A ∧ B) → (C ∧ D)
Note: Using conditional, we can form the tautology A → A, which is perhaps simpler than our previous standard tautology A ∨ ¬A.
Biconditional
We conclude our list of connectives with biconditional. It is a binary connective whose name is ‘↔’. For every two sentences A and B, there exists a sentence that is their biconditional: A ↔ B
The interpretation of the biconditional is best conveyed by writing it as a conjunction of two conditionals: A ↔ B ≡ (A → B) ∧ (B → A) . In terms of truth-values this means that the value of A ↔ B is T if A and B have the same truth-value; it is F if they have different values. Since conditional represents, in a way, ‘If ... then ___’, biconditional corresponds to:
If ... then ___, and if ___ then ... ,
that is:
... iff ___ .
For example, (2) Jill is at home if and only if Jack is at home can be formalized as a biconditional. More than the other connectives ↔ is dispensable as a primitive. Replacing everywhere ‘A ↔ B’ by ‘(A → B) ∧ (B → A)’, may cause some, but usually minor, inconveniences. From the truth-table of biconditional we can immediately see that it has the commutativity property: A↔B ≡ B↔A Also the following useful equivalences are easily verified, either by truth-table, or by algebraic manipulations (after converting the biconditional into a conjunction of two conditionals and expressing these in terms of the previous connectives). A↔B
≡ (A∧B) ∨ (¬A∧¬B)
¬(A ↔ B) ≡ A ↔ ¬B ≡ ¬A ↔ B
¬(A ↔ B) ≡ (A∧¬B) ∨ (¬A∧B) ≡ (A ∨ B) ∧ (¬A ∨ ¬B)
Note that the right-hand side of the first equivalence can be understood as saying: “Either both A and B are true, or both are false”. Similarly, in the third row, the second sentence can be understood as saying that A and B have different truth-values; and the third sentence as saying that one of A and B is true, and one is false.
Note that ¬(A ↔ B) expresses exclusive ‘or’. Note: If we toggle T and F in the truth-table of A ↔ B we get the truth-table of ¬(A ↔ B), hence ¬(A ↔ B) can serve as the dual of A ↔ B. This can be also inferred from the fact that the rightmost sentences in the first and third rows in the above-given equivalences are duals of each other. In addition to commutativity, biconditional has the associativity property:
(A ↔ B) ↔ C ≡ A ↔ (B ↔ C)
This can be established either by truth-tables, or by algebraic manipulation using the equivalences given above. A short truth-value argument, which gives also some insight into the nature of repeated biconditionals, proceeds by proving first the following claim. (A ↔ B) ↔ C gets T, under a given assignment of truth-values to the sentential variables A, B, C, iff F is assigned an even number of times (i.e., to 2 of the variables, or to none of them). Here is the proof. Case (i): C gets T. Then (A ↔ B) ↔ C gets T iff A and B get the same truth-value. If both get T, the number of assigned F’s is 0; and if both get F, this number is 2.
Case (ii): C gets F. Then (A ↔ B) ↔ C gets T iff A and B get different values; in this case the number of assigned F’s is 2 (one to C, one to A or to B). It is easily seen that there are exactly four possibilities of assigning an even number of F’s. These are exactly the possibilities covered in (i) and (ii). Hence, (A ↔ B) ↔ C gets T just when the number of assigned F’s is even. This shows also that (B ↔ C) ↔ A gets T iff F is assigned an even number of times. It follows that: (A ↔ B) ↔ C ≡ (B ↔ C) ↔ A Since (B ↔ C) ↔ A ≡ A ↔ (B ↔ C), we get the desired result.
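Both the associativity of ↔ and the claim about an even number of F’s can be confirmed by running through the eight assignments to A, B, C. A minimal sketch, with `iff` as our own name for the truth-function of ↔:

```python
from itertools import product

def iff(a, b):
    """Truth-function of A ↔ B: T exactly when A and B have the same value."""
    return a == b

rows = list(product([True, False], repeat=3))

# (A ↔ B) ↔ C ≡ A ↔ (B ↔ C)
assoc = all(iff(iff(a, b), c) == iff(a, iff(b, c)) for a, b, c in rows)

# (A ↔ B) ↔ C gets T iff F is assigned to an even number of the variables.
parity = all(
    iff(iff(a, b), c) == ([a, b, c].count(False) % 2 == 0)
    for a, b, c in rows
)

# There are exactly four assignments with an even number of F's.
even_rows = [r for r in rows if r.count(False) % 2 == 0]

print(assoc, parity, len(even_rows))  # True True 4
```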
Consequently, in expressions in which ‘↔’ is the only connective name, changes in the order and grouping yield logically equivalent expressions (as is the case with ‘∧’, or with ‘∨’). We can, therefore, ignore parentheses, e.g.,
A ↔ B ↔ C ↔ D ↔ E
If in such a repeated biconditional a sentential variable, ‘A’, occurs more than once, we can, by changes of order and grouping, rewrite the biconditional as:
(A ↔ A) ↔ D where D is the rest of the repeated biconditional. Since A ↔ A is a tautology, it easily follows that (A ↔ A) ↔ D ≡ D We can therefore drop any pair of two occurrences of the same sentential variable. If every sentential variable occurs an even number of times, then the repeated biconditional is a tautology. If not, then by dropping all repeating pairs, we are left with a repeated biconditional in which every sentential variable occurs only once. Such a sentence is non-tautological (cf. Homework 2.23). Homework 2.23 Show that a sentential expression constructed using only ‘↔’ is tautological iff each sentential variable occurs in it an even number of times. (Most of the proof has already been done, you have to state it in full, supplying the last missing step). Using the type of argument just given, one can prove the following generalization of the claim used in the proof of associativity: If a sentential expression is built only from ‘↔’ and sentential variables, and each variable occurs only once, then the expression gets T iff the number of variables that get F is even. Unlike our previous binary connectives, ↔ with negation is not sufficient for expressing the other connectives. It can be proved that, for any sentential expression built from two sentential variables using only ‘¬’ and ‘↔’, the number of T’s in its truth-table column is even (either 0, or 2, or 4). Therefore it cannot be equivalent say to ‘A ∧ B’, which has an odd number of T’s in its column (namely, 1).
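The last claim, that ‘¬’ and ‘↔’ cannot express ‘A ∧ B’, can be illustrated by generating every truth-table column obtainable from the columns of A and B by repeatedly applying the two connectives. The closure loop and the variable names below are our own; the code is an exhaustive check for two variables, not the general proof.

```python
from itertools import product

rows = list(product([True, False], repeat=2))
col_a = tuple(a for a, b in rows)
col_b = tuple(b for a, b in rows)

# Close the set {column of A, column of B} under ¬ and ↔, applied row by row.
reachable = {col_a, col_b}
changed = True
while changed:
    changed = False
    current = list(reachable)
    new = {tuple(not x for x in c) for c in current}
    new |= {tuple(x == y for x, y in zip(c, d)) for c in current for d in current}
    if not new <= reachable:
        reachable |= new
        changed = True

col_and = tuple(a and b for a, b in rows)

print(len(reachable))                              # number of expressible columns
print(all(sum(c) % 2 == 0 for c in reachable))     # every column has an even number of T's
print(col_and in reachable)                        # the column of A ∧ B is not among them
```

Only eight of the sixteen possible columns are reached, and each contains 0, 2 or 4 T’s.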
Biconditionals are handy for characterizing logical equivalence in terms of logical truth:
A ≡ B   iff   A ↔ B is a logical truth.
Grouping: The convention is that ‘↔’ has weaker binding power than the other connective names. Hence ¬A ∨ B ∧C → B ↔ A → B ∨ C should be read as:
[(((¬A) ∨ (B ∧ C)) → B)] ↔ [A → (B ∨ C)]
Simplifying Expressions Containing Conditionals and Biconditionals
Expressions containing conditionals and biconditionals can be transformed into equivalent ones involving only ¬, ∧, and ∨, to which, in turn, we can apply our previous simplification techniques. Often, however, we can get simpler forms by retaining some conditionals, or biconditionals, in the final outcome. For example, A → (B → C) is logically equivalent to:
¬A ∨ ¬B ∨ C
but also to
A∧B → C
which has a clearer intuitive meaning. This generalizes to an arbitrary number of repeated conditionals grouped to the right:
A1 → (A2 → (. . . → (An−1 → An ) . . .)) ≡ (A1 ∧ A2 ∧ . . . ∧ An−1 ) → An
Noteworthy equivalences concerning → are:
¬A → ¬B ≡ B → A
¬(A → B) ≡ A ∧ ¬B
The above-mentioned properties concerning ↔, as well as the equivalence:
¬(A ↔ B) ≡ A ↔ ¬B
can be used in simplifications of expressions containing ‘↔’ and ‘¬’ only. When other connectives are present, there are no straightforward simplification techniques (short of rewriting everything in terms of ¬, ∧, ∨ and applying the previous methods). But special cases may lend themselves to special treatments.
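The equivalence between repeated conditionals grouped to the right and a single conditional with a conjunctive antecedent can be checked for small n. In this sketch the names `implies`, `nested_right`, `flattened` and `agree` are our own; the check covers only the values of n that are tried.

```python
from functools import reduce
from itertools import product

def implies(a, b):
    return (not a) or b

def nested_right(values):
    """A1 → (A2 → (... → (A(n-1) → An)...)), built from the innermost conditional outward."""
    return reduce(lambda acc, a: implies(a, acc), reversed(values[:-1]), values[-1])

def flattened(values):
    """(A1 ∧ A2 ∧ ... ∧ A(n-1)) → An."""
    return implies(all(values[:-1]), values[-1])

def agree(n):
    return all(
        nested_right(vals) == flattened(vals)
        for vals in product([True, False], repeat=n)
    )

print([agree(n) for n in range(2, 7)])

# The two noteworthy equivalences for →, checked on all four rows.
pairs = list(product([True, False], repeat=2))
print(all(implies(not a, not b) == implies(b, a) for a, b in pairs))
print(all((not implies(a, b)) == (a and not b) for a, b in pairs))
```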
Homework 2.24 Simplify the following sentences. You may employ in the final version any of the connectives introduced here. Try to get sentences that are short or easy to grasp. You can use truth-value considerations, the algebraic methods of the previous section, or a mixture of both. 1. (A → B) → C 2. (A → B) → A 3. (A → B) → B 4. (A → B) ∨ (B → A) 5. ¬(A → B) ∨ ¬(B → A) 6. (A ↔ B) ↔ A 7. A∧B ∨ ¬A∧¬B 8. ¬A∧B ∨ A∧¬B 9. (A → B) ↔ (B → A) 10. (A → B) → (B → A) 11. (A ∨ B) ↔ (A ∧ B) 12. [¬(A ↔ B)] ↔ ¬[(B ↔ C) ↔ C] 13. A∧B ↔ A∧C 14. ¬(A → B) ↔ ¬A∧¬C 15. (A → B) ∧ (B → C) ∧ (C → A) 16. (B → A) ∧ (C → A) ∧ (A → (B ∨ C)) 17. (A → B) ∧ (A → C) ∧ ((B ∧ C) → A)
Chapter 3
Sentential Logic in Natural Language
3.0
In this chapter we try to uncover structures of sentential logic in a natural language, namely English. We do so by recasting various English sentences in the form of sentential compounds, constructed by means of the connectives of classical two-valued logic. Certain sentences play in this representation the role of basic unanalysed units; they are represented by sentential variables. Others are built from them by using connectives. For example,
(1) Jack went to the theater, but Jill did not
can be recast as:
(1′) A ∧ ¬B ,
where A and B are, respectively, the counterparts of:
(1.i) Jack went to the theater,
(1.ii) Jill went to the theater.
This implies the following: For every assignment of truth-values to A and B, if (1.i) and (1.ii) are given, respectively, the same values as A and B, then (1) and (1′) get the same truth-value. We can think of A and B as translations in a formal language of (1.i) and (1.ii). At this point we don’t have to specify A and B any further, because the analysis stops there. Note that the sentential variables should represent self-contained declarative sentences, whose meaning, in the given context, is completely determined. In our example, this calls for spelling out ‘Jill did not’ as ‘Jill did not go to the theater’. B corresponds to ‘Jill went to the theater’, not to ‘Jill did’. By the same token, ‘Jack likes Jill and Jane likes her too’ should be formalized as a conjunction A ∧ B, where A and B correspond, respectively, to:
‘Jack likes Jill’, ‘Jane likes Jill’.
Again, the correlate of B is not ‘Jane likes her too’. Translating (1′) back literally, we get the repetitive sentence ‘Jack went to the theater and Jill did not go to the theater’. The formalization is not intended as a guide to good style. Neither do we aim at linguistic analysis. While the latter is relevant to what we do, our job is not that of the linguist. The formalization is intended to reveal a certain logical structure: it shows how the truth-value of the sentence depends on the truth-values of its components.

Note on Terminology: Throughout this chapter we adapt the formal terminology of sentential logic to the context of natural language. Thus, ‘logical negation’ (or, for short, ‘negation’) refers to an operation by which an English sentence is transformed to another sentence, which always has the opposite truth-value. Similarly ‘logical conjunction’ (or, for short, ‘conjunction’) refers to an operation by which two sentences are combined into a sentence, whose truth-value is determined according to the rules governing ∧. Likewise for the other connectives. We shall also use these terms to refer to the sentences themselves. We therefore say that (1) is read as the conjunction of (1.i) with the negation of (1.ii).

Note that in the above formalization of (1) ‘but’ is interpreted as ‘and’, i.e., as a word that forms a logical conjunction. Actually, ‘but’, unlike the neutral ‘and’, indicates some contrast between Jack’s and Jill’s actions. But this is not sufficient for making a difference as far as truth-values are concerned: (1) is true just when both ‘Jack went to the theater’ and ‘Jill did not go to the theater’ are true, and is false in all other cases. This is enough for reading (1) as a logical conjunction. Stylistic and esthetic elements, indications, suggestions, indirect approval or condemnation, and similar aspects that go beyond the mere statement of facts, are, as a rule, obliterated in the formalization.
Sentential logic is a very elementary part of logic, which cannot provide for an in-depth analysis. At the level of sentential logic, highly intricate sentences may appear as basic unanalysed units, because further analysis requires a richer logical apparatus. But this is the first necessary step to a deeper analysis. Besides, sentential analysis is often instructive in itself. Consider, for example:

Harvey weighed it. A mediocre two and two-thirds pounds. One more negative datum to sabotage the notion that the brain’s size might account for the difference between ordinary and extraordinary ability–a notion that various 19th-century researchers have labored futilely to establish (claiming along the way to have demonstrated the superiority of men over women, white men over black men, Germans over Frenchmen).
It comes out as a long conjunction. But what are the conjuncts? In other words: to the truth of what is the author committed? The question forces one to focus on what is actually being asserted here. You will find that the list runs somewhat like this:

(i) Harvey weighed it [the ‘it’ referring to something mentioned earlier],
(ii) The weighing showed a reading of two and two thirds pounds,
(iii) Two and two thirds pounds is a mediocre weight for an object of the kind in question,
(iv) The fact established by the weighing is evidence against the notion that brain size might account for differences between ordinary and extraordinary mental ability,
(v) There have been previous pieces of evidence disconfirming that notion about the effect of brain size,
Etc.
Truth-Functional Compounds and The Truth-Value Test

A truth-functional compound of given English sentences is a sentence formed from them by applying connectives (English equivalents of the connective operations), such that the compound’s truth-value is determined completely by the values of the components. For example, (1) is a truth-functional compound of ‘Jack went to the theater’ and ‘Jill went to the theater’; ‘Jill did not go to the theater’ is a truth-functional compound of ‘Jill went to the theater’. On the other hand ‘Jack went to the theater because Jill told him to’ cannot be analysed as a truth-functional compound of ‘Jack went to the theater’ and ‘Jill told Jack to go to the theater’. (See 2.1.3, page 26, for a discussion of the case.)

Whether an English sentence can be analysed as a truth-functional compound of other sentences is not always clear. Grammatical form and the presence of words such as ‘not’, ‘and’, and others, may guide us; but in many cases this is not sufficient. It would do well to keep in mind the following test.

Truth-Value Test: A sentence can be construed as a truth-functional compound, formed by applying a sentential connective, only if its truth-value is always determined according to the truth-table of that connective.

For example,

(2) John was unharmed by the accident

can be viewed as the negation of

(3) John was harmed by the accident.

If (3) gets T, (2) gets F; and if (3) gets F, (2) gets T. On the other hand, the construing of

(4) John is unhappy today

as the negation of:

(5) John is happy today

does not do so well on the truth-value test. If (5) gets T, then (4) gets F; but it is conceivable that both (4) and (5) get F. John’s state might be neither that of happiness nor that of unhappiness (say John is indifferent, or under sedation). More than mere absence of happiness, ‘unhappy’ implies some positive misery.

Note that the truth-value test constitutes a necessary but not a sufficient condition for interpreting a given sentence as a certain compound. Other considerations may enter. For example, of the two sentences:

‘Jill went to the theater’,
‘Jill did not go to the theater’,

the second is naturally construed as the negation of the first, not the first as the negation of the second; though both construals do equally well on the truth-value test. All we can say is that the first sentence is logically equivalent to the negation of the second. Here the grammatical form of negation decides the issue.

Like many undertakings pertaining to natural language, the success of recasting English sentences in logical form is a matter of degree. The question, “What is the logical form of a given sentence?” need not always have a clear-cut answer. An approximation that ignores certain aspects might do for certain purposes. Others may require a different, finer-grained analysis.
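The truth-value test for negation can be pictured as a check over the conceivable situations. The sketch below is only an illustration under stated assumptions: each situation is modelled as a hypothetical pair (value of the component sentence, value of the candidate negation), and the lists of situations for ‘harmed’/‘unharmed’ and ‘happy’/‘unhappy’ are our reading of the discussion above, not data from the text.

```python
# Truth-table of ¬: the value the candidate must take for each value of the component.
NEGATION = {True: False, False: True}

def passes_negation_test(situations):
    # situations: pairs (value of the component, value of the candidate negation).
    # The candidate passes only if it follows the table of ¬ in every situation.
    return all(candidate == NEGATION[component]
               for component, candidate in situations)

# (3) 'John was harmed' versus (2) 'John was unharmed':
# assumed here to admit only the two opposite-valued situations.
harmed_unharmed = [(True, False), (False, True)]

# (5) 'John is happy' versus (4) 'John is unhappy':
# an indifferent or sedated John makes both sentences false.
happy_unhappy = [(True, False), (False, True), (False, False)]

print(passes_negation_test(harmed_unharmed))
print(passes_negation_test(happy_unhappy))
```

Passing the check is only a necessary condition, as the text stresses; the choice between two construals that both pass is made on other grounds, such as grammatical form.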
The Use of Hybrid Expressions

A convenient, common way of showing the recastings of sentences in sentential logic involves the application of formal notation to English. For example, (1) is to be analysed as:

(1∗) (Jack went to the theater) ∧ ¬(Jill went to the theater).

We can read (1∗) as standing for: ‘The conjunction of ‘Jack went to the theater’ and the negation of ‘Jill went to the theater’ ’. (Just so ‘A ∧ ¬B’ can be read as ‘the conjunction of A and the negation of B’.) Strictly speaking, connective operations are defined for the formal setup. In English there is more than one construction that can represent a connective, e.g., there are several ways of expressing logical conjunction. But (1∗) is not meant as some unique English sentence. It is a convenient way of indicating a logical analysis of the sentence in question. It can refer to any of the sentences obtained by expressing the connectives in English.
3.1 Classical Sentential Connectives in English

3.1.1 Negation
An English construction that represents logical negation, which is also analogous to the notation ‘¬A’, is based on appending ‘it is not the case that’ as a prefix:

It is not the case that Jack likes Jill

is the negation of ‘Jack likes Jill’. Much more common is the attaching of ‘not’ to a verb inside the sentence:

Jack does not like Jill.

Sometimes negation is expressed by using ‘un’; thus, (2) above can be construed as the negation of (3). But sometimes ‘un’ does not express logical negation but something stronger, as is evidenced by (4). There are no formal rules for determining whether a certain use of ‘un’ yields logical negation. You shall have to rely on your understanding of English.

Another prefix that implies negation is ‘dis’. But usually this is stronger than mere logical negation, stronger also than the corresponding ‘un’-construct. Compare for example:

(6) It is not the case that Jack is respectful to his boss.
(6′) Jack is unrespectful to his boss.
(6″) Jack is disrespectful to his boss.

The first is the logical negation of ‘Jack is respectful to his boss’, but the last means that he is positively impertinent. The second seems to lie in between.
Intuitions vary. Some may construe “unrespectful” as sufficiently stronger than mere negation, so as to warrant an assignment of a different truth-value. Others might consider (6′) as a more emphatic version of (6), but with the same truth-value.

Sometimes, even a mere ‘not’ can indicate something stronger than logical negation. Consider for example:

(7) Jill did not like the play,

which seems to imply actual dislike, rather than mere absence of liking. Whether (7) actually says this, or only suggests it, is one of those questions to which a clear-cut answer is not forthcoming. We shall return to the problem in the next section. It is not among the goals of this course to settle questions of English usage, especially when these involve fine distinctions and when there are no unanimous answers. But you should be aware of the existence of the problem and of its bearing on the logical analysis.

A noteworthy aspect of colloquial usage is the use of repeated negation for emphasis:

I haven’t told no lie.

The speaker alleges that he has not told any lie. The formal reading, whereby the two negations cancel each other, is misleading; it would make the speaker assert that he has told a lie. A more extreme example was proposed by Russell. A charwoman who is unjustly accused of stealing replies indignantly: ‘I ain’t never done no harm to no one!’. Were we to follow blindly the formal rules, we would interpret her as claiming to have, at some time, done harm to every human being. (Can you see how this would follow? It involves also rules concerning quantifiers, expressed here by ‘never’ and ‘no one’, but it should not be very difficult to guess.)
3.1.2 Conjunction
By the truth-value test, a sentence cannot count as a conjunction of two (or more) sentences, unless it is true when both (or all) of them are true and false in every other case. In asserting a conjunction the speaker is committed to the truth of all the conjuncts, and to nothing more. (‘Conjunction’ as used here applies to sentences and should be clearly distinguished from the term of grammar, which refers to combining words.) The standard word that marks conjunction is ‘and’. But conjunction is also obtained by mere juxtaposition, with the appropriate punctuation. Each of the following

Oswald shot Kennedy and Ruby shot Oswald,
Oswald shot Kennedy; Ruby shot Oswald,
Oswald shot Kennedy, Ruby shot Oswald¹

can be read as:

(Oswald shot Kennedy) ∧ (Ruby shot Oswald)

Both ‘and’ and juxtaposition can serve in repeated conjunction:

Some are born great, some achieve greatness, and some have greatness thrust upon them.

A sequence of sentences, each ending in full stop, has the effect of a conjunction–in as much as the writer is committed to the truth of all the sentences in the sequence;² e.g.,

Oswald shot Kennedy. Ruby shot Oswald.

As remarked already, each conjunct should be an independent sentence. Pronouns that derive their meaning from other conjuncts should be replaced by independent particles. Thus,

Jack used to smoke a lot and so did his wife

becomes

(Jack used to smoke a lot) ∧ (Jack’s wife used to smoke a lot).

Besides ‘and’ the following words are used to form compounds of two sentences, whose truth requires that both components be true: ‘but’, ‘yet’, ‘moreover’, ‘however’, ‘although’, ‘nonetheless’, ‘since’, ‘therefore’, ‘because’, and others.

¹ Taking here stylistic license, as is done many times, to separate the sentences by a comma.
² We may not want to declare a whole text false on the force of a single false sentence. This, however, does not mean that the usual truth-table does not apply. It only shows that the truth-value of the resulting conjunction is not adequate for judging texts containing many sentences.
But the truth of both components is not in all cases sufficient for the truth of the compound. When it is not, we do not get a logical conjunction, but a compound that is not truth-functional. All these words have side effects that the neutral ‘and’ does not. The side effects of ‘but’, ‘yet’, ‘moreover’, ‘however’, ‘although’, and ‘nonetheless’ do not make for a truth-value difference. Such compounds can be construed as logical conjunctions, though some aspects of meaning are thereby lost. On the other hand, we have noted that the effect of ‘because’ cannot be ignored in the formalization and that ‘because’-compounds are not, as a rule, truth-functional. ‘Since’, ‘therefore’ and their like (e.g., ‘consequently’) appear to be intermediate cases. Consider for example,

(8) Jill said that the play was good, therefore Jack went to see it.

If (8) is to be construed as:

(Jill said that the play was good) ∧ (Jack went to see the play)

then (8) comes out true if Jill said that the play was good and Jack went to see it, even if Jack’s going had nothing to do with Jill’s saying. Under such circumstances (8) is undoubtedly misleading. But a statement can be misleading and yet formally true; something false can be suggested, without being explicitly stated. To what extent does (8) explicitly say that Jack’s going was caused by Jill’s saying? On this intuitions may differ. We shall reconsider the question in 4.5.2.
‘And’

When used to combine sentences, ‘and’ does not have the side effects that other combining terms (‘but’, ‘yet’ etc.) have. But it can indicate a temporal order or a causal relation that goes beyond logical conjunction:

(9) Jack’s wife told him to stop smoking, and so he did.

Evidently, (9) indicates that Jack stopped smoking after his wife told him to. It suggests moreover a causal relation between the two facts. Whether such indications or suggestions should be taken as part of what is actually said is, again, not clear-cut.

Homework 3.1 Give and discuss at least three examples, besides the above-given (9), of temporal order that is indicated by ‘and’, where the indications are of different strengths, from mere suggestion to almost explicit.

‘And’ is used, more commonly, not to join sentences, but to join nouns, or verbs, or adjectives, or adverbs, or phrases of these types. Often, the resulting constructs amount to logical conjunctions:
(10) Jack and Jill took driving lessons

can be analysed as

(10′) (Jack took driving lessons) ∧ (Jill took driving lessons)

And in a similar vein,

(11) Jack is short and blue-eyed

can be recast as:

(11′) (Jack is short) ∧ (Jack is blue-eyed)

Yet this is not always the case. Consider:

(12) Jack and Jill went to the party.

Does this amount to: ‘Jack went to the party and Jill went to the party’? Something more is implied, namely that they went together. Is the togetherness part of what is explicitly stated? In other cases this is certainly so:

(13) Jack and Jill painted this picture

implies cooperation between Jack and Jill. That each painted this picture separately does not make sense (unless “painting this picture” is understood in some very unusual way). Or consider:

(14) Tom, Bill and Harry elected Helen as the group’s representative,

which, obviously, does not reduce to a conjunction of ‘Tom elected Helen ...’, ‘Bill elected Helen ...’, and ‘Harry elected Helen ...’. And, as a final illustration:

(15) l and l′ are parallel lines,

which certainly does not reduce to ‘l is a parallel line and l′ is a parallel line.’

The picture is now clear. As a combiner of names, ‘and’ (and juxtaposition) can function in two ways. When it functions distributively, the result amounts to a sentential conjunction, which attributes some action, or property, to each of the named objects. The action, or property, distributes over ‘and’:
X and Y did Z    is equivalent to    X did Z and Y did Z.
When it functions collectively, ‘and’ combines the given names into a name of a group (consisting of the named objects). The action, or property, is attributed to the group as a whole; it cannot be reduced to the separate actions or the separate properties of group members. This is the non-distributive, or collective ‘and’:

{X and Y} did Z.

Sometimes ‘and’ is clearly distributive–as in the cases of (10) and (11); sometimes it is clearly not–as in the cases of (13), (14), and (15); and sometimes it is ambiguous. In (12) ‘and’ can be understood either distributively or collectively. The collective reading of (12) appears to predominate. But the other reading cannot be ignored; which is shown by the fact that

(16) Jack and Jill went to the party, but they did not go there together

is neither inconsistent nor in any way strange. Given (12) by itself, we interpret ‘and’ collectively. But with the additional clause in (16), we switch to the distributive reading. The switch is done in order to avoid an interpretation under which the sentence is obviously absurd. It is an instance of what some philosophers have called a principle of charity: In cases of ambiguity, other things being equal, interpret the speaker in a way that gives him the best benefit.

The distinction between distributive and collective ‘and’ arises also when it combines verb phrases. Thus,

Jack studied mathematics and played baseball

can be recast as a conjunction of ‘Jack studied mathematics’ and ‘Jack played baseball’; the ‘and’ is distributive. But often there is an implied temporal proximity, temporal order, or causal link. And if this is to be part of what the sentence says, then we cannot recast it as a simple conjunction. Compare:

(17) Jack cried and laughed    versus    (Jack cried) ∧ (Jack laughed).
(18) Jill hit the ball and sent it spinning    versus    (Jill hit the ball) ∧ (Jill sent the ball spinning).
In (17) the conjunction misses the implication that the crying and laughing were almost concurrent. In (18) the conjunction does not say that Jill’s hitting the ball was the cause of her sending it spinning.

When ‘and’ or juxtaposition combine adverbs, the combination is usually collective. Compare, for example,

(19) Jill ran fast and silently    versus    (Jill ran fast) ∧ (Jill ran silently).
The conjunction says that at some time (in the past) Jill ran fast and at some time she ran silently. It does not say that ‘fast’ and ‘silently’ apply to the same run–which is what the left-hand side sentence says. When adjectives are combined, the combination is distributive when the statement is in the present tense–as illustrated by (11) above (‘Jack is short and blue-eyed’). In the past tense there is often, as in the case of verbs, an implication of temporal proximity. Being informed that John was skinny and rich, one would understand that he was skinny and rich about the same time; that he was skinny and poor in his youth, and twenty years later–fat and rich does not accord well with what we are informed. (The problem does not arise in the present tense, because the temporal proximity is guaranteed by the tense: both conjuncts are in the present, hence both refer to the same time.) Note that, in the present tense, when adjectives are combined, the collective and distributive readings come to the same; because the adjectives are applied to the same name (e.g., ‘Jack’ in (11)), one that denotes in both conjuncts the same object. This fails with respect to adverbs because we have not a common peg to hang our adverbs on. Sometimes ‘and’ is used distributively but the distribution calls for a certain adjustment. We cannot distribute the ‘and’ in Jim and John went with their families to the zoo so as to produce: ‘Jim went with their families to the zoo and John went with their families to the zoo’. Nonetheless the ‘and’ here is distributive and its correct distribution yields: Jim went with his family to the zoo and John went with his family to the zoo. Additional factors, involving the use of plural pronouns, are here at play. Be aware of such possibilities and do not apply the tests blindly. There are various other complications into which we shall not enter. A more comprehensive analysis, which is the work of a linguist, is beyond the scope of this book. 
Homework 3.2 Express the following texts as sentential combinations of basic components. Get your basic components as small as possible. Remember that the basic components should be written as self-standing sentences. Indicate relevant ambiguities, as you find them, and formalize each of the readings.

1. Democracy is a form of government which may be rationally defended, not as being good, but as being less bad than any other.
2. The ear tends to be lazy, craves the familiar, and is shocked by the unexpected: the eye, on the other hand, tends to be impatient, craves the novel and is bored by repetition.
3. A sentimentalist is a man who sees an absurd value in everything and doesn’t know the market price of anything.
4. To know a little of anything gives neither satisfaction nor credit, but often brings disgrace or ridicule.
5. Knowledge is two-fold and consists not only in an affirmation of what is true, but in the negation of what is false.
6. It is the just doom of laziness and gluttony to be inactive without ease and drowsy without tranquillity.
7. To learn is a natural pleasure, not confined to philosophers, but common to all men.
8. It ain’t no sin if you crack a few laws now and then, just so long as you don’t break any.
9. Great eaters and great sleepers are incapable of anything else that is great.
10. The virtue of the camera is not the power it has to transform the photographer into an artist, but the impulse it gives him to keep on going.
11. The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd; indeed in view of the silliness of the majority of mankind, a widespread belief is more likely to be foolish than sensible.
12. In a just society men and women should have equal opportunity and be free to choose their vocations.

Here, as an example, is a solution for 1. Let A and B represent sentences as follows.

A: Democracy is a form of government which may be rationally defended as being good.
B: Democracy is a form of government which may be rationally defended as being less bad than any other.

Then 1. is translated as:

¬A ∧ B

3.1.3 Disjunction

Usually a disjunction is formed in English by using ‘or’ or ‘either ... or ___’:

(20) Jack will be home this evening or his wife will,

as well as,
(20′) Either Jack will be home this evening or his wife will,

can be recast as:

(20∗) (Jack will be home this evening) ∨ (Jack’s wife will be home this evening).

Note that ‘either’, which marks the beginning of the left disjunct, serves (together with the comma) to show the intended grouping. Compare for example:

Jack will go to the theater, and either Jill will go to the movie or Jill will spend the evening with Jack,

Either Jack will go to the theater and Jill will go to the movie, or Jill will spend the evening with Jack.

They come out, respectively, as:

Jack will go to the theater ∧ (Jill will go to the movie ∨ Jill will spend the evening with Jack),

(Jack will go to the theater ∧ Jill will go to the movie) ∨ Jill will spend the evening with Jack.

‘Or’ (and ‘either ... or ___’) is often used to combine noun phrases, verb phrases, adjectivals, or adverbials. In this it resembles ‘and’. But the distributivity problem does not arise for ‘or’ as it arises for ‘and’; usually, we can distribute:

Jack or his wife will go to the party,
Jill will either clean her room or practice the violin,

are, respectively, equivalent to

Jack will go to the party or Jack’s wife will go to the party,
Jill will clean her room or Jill will practice the violin.

Note that the following are equivalent as well:

Jill ran fast or silently,
Jill ran fast or Jill ran silently.
The problem of the adverbs applying to the same run does not arise here–as it arises in (19) for ‘and’–because of disjunction’s truth-table: That sometime in the past Jill ran fast or silently is true just if sometime in the past Jill ran fast, or sometime in the past she ran silently. (The problem arises if ‘or’ is interpreted not as ∨, but exclusively; for, then, in the first sentence the exclusion affects one particular run, but in the second it affects all of Jill’s past running.) Other failures of distributivity involve combinations of ‘or’ with ‘can’, or with verbs indicating possible choices. They will be discussed later (3.1.3, page 95).
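The contrast between ‘fast or silently’ and ‘fast and silently’ can be illustrated with a small model. This is a sketch under our own assumptions, going slightly beyond sentential logic: Jill’s past is represented as a hypothetical list of runs, each a pair (fast, silent), and ‘at some time’ is rendered by Python’s any. The exhaustive check covers only histories of at most three runs.

```python
from itertools import product

def some_run(runs, pred):
    # True if at least one past run satisfies the predicate.
    return any(pred(fast, silent) for fast, silent in runs)

# All hypothetical histories of at most three runs; each run is (fast, silent).
kinds = list(product([True, False], repeat=2))
histories = [h for n in range(0, 4) for h in product(kinds, repeat=n)]

# 'Jill ran fast or silently' versus 'Jill ran fast or Jill ran silently'.
or_distributes = all(
    some_run(h, lambda f, s: f or s)
    == (some_run(h, lambda f, s: f) or some_run(h, lambda f, s: s))
    for h in histories
)

# For 'and' the two sides can come apart, as with (19):
# one fast but noisy run and one slow but silent run.
counterexample = [(True, False), (False, True)]
same_run = some_run(counterexample, lambda f, s: f and s)
separate_runs = (some_run(counterexample, lambda f, s: f)
                 and some_run(counterexample, lambda f, s: s))

print(or_distributes, same_run, separate_runs)
```

In the counterexample the conjunction of the two separate sentences is true although no single run was both fast and silent, which is the gap noted for (19).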
Inclusive and Exclusive ‘Or’

We discussed already the inclusive and exclusive readings of ‘or’ (cf. 2.2.2, pages 34, 35). Recall that under the inclusive reading an ‘or’-sentence is analysed as a disjunction of sentential logic: it is true, if one of the disjuncts is true, or if both are. Under the exclusive reading, one, and only one, should be true. Exclusive disjunctions are truth-functional as well, but they should be recast differently, namely in the form

(A ∨ B) ∧ ¬(A∧B),

or in the equivalent form (cf. 2.2.2)

(A∧¬B) ∨ (¬A∧B).
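That the two recastings of exclusive ‘or’ agree, and that they part from inclusive disjunction in exactly one row, can be read off a four-row truth-table. A minimal sketch follows; the function names xor_form1 and xor_form2 are illustrative.

```python
from itertools import product

def xor_form1(a, b):
    # (A ∨ B) ∧ ¬(A∧B)
    return (a or b) and not (a and b)

def xor_form2(a, b):
    # (A∧¬B) ∨ (¬A∧B)
    return (a and not b) or ((not a) and b)

rows = [(a, b, xor_form1(a, b), xor_form2(a, b))
        for a, b in product([True, False], repeat=2)]

# The two forms take the same value in every row.
forms_agree = all(r[2] == r[3] for r in rows)

# Rows in which the exclusive reading differs from inclusive A ∨ B.
differs_from_inclusive = [(a, b) for a, b, x, _ in rows if x != (a or b)]

for row in rows:
    print(row)
print(forms_agree, differs_from_inclusive)
```

The only row of disagreement is the one where both disjuncts are true, which is why examples where that case is ruled out on independent grounds cannot decide between the readings.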
In many cases ‘or’ seems exclusive where an inclusive interpretation will do as well; the question of the right reading may not have a clear-cut answer. (21) Jack is now either in New York or in Toronto. Since Jack cannot be at the same time both in New York and in Toronto, many jump to the conclusion that the ‘or’ is exclusive. But this does not follow. From (21) one will infer that (i) Jack is in one of the two cities and (ii) he is not in both. But (ii) follows from the general impossibility of being in different places at the same time. It need not be part of what (21) explicitly states. The ‘or’ can be inclusive. Just so, if somebody says that the sun is now rising, we will naturally infer that it is rising in the east, though it is not part of the statement. Whether the speaker who asserts (21) intends the exclusion to be part of the meaning of ‘or’, or only an obvious non-logical corollary, can be a question that none may settle, including the speaker. Examples of the last kind, where ‘or’ can be read inclusively but where exclusion is nonetheless implied, come easily to mind. The strength of the implied exclusion varies, however. In Either Edith or Eunice will marry John
the exclusion of their both marrying him is implied by the prohibition on bigamy, which is considerably weaker than the impossibility of being in two places at the same time. And in

(22) This evening I shall either do my homework or go to the movies,

the exclusion is only suggested (by the inconvenience of doing both), but is by no means necessary. The speaker can in fact add ‘or do both’:

(23) This evening I shall either do my homework, or go to the movies, or do both.

By choosing to add ‘or do both’–or words to this effect–the speaker allows for the possibility of an exclusive interpretation. (22) and (23) are equivalent under the inclusive interpretation; under that interpretation, the additional ‘or do both’ is redundant. But it is not redundant if ‘or’ is interpreted exclusively; its addition neutralizes the effect of the exclusive interpretation. The possibility of an exclusive interpretation of ‘or’ is not evidenced by (21), but by cases, like (23), where one finds it appropriate to add ‘or both’.

There are examples where the exclusive interpretation of ‘or’ is called for, e.g.,

Either you pay the fine or you go to prison.

And there are others where ‘or’ is obviously inclusive:

If you are either a good athlete or a first-rate student, you will be admitted,

which should be recast as: (A ∨ B) → C.

To sum up, while there is an exclusive sense of ‘or’, the inclusive sense–formalized by disjunction of sentential logic–is appropriate in more cases than appears at first sight. Recall, however, that the main reason for having ∨ rather than exclusive disjunction (our ∨x) as a connective is its algebraic and formal properties. It is also directly related (as we shall see in chapter 5) to the set-theoretic operation of union.
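The remark that ‘or do both’ is redundant under the inclusive reading, but neutralizes an exclusive one, can be checked on the four possible cases. The sketch below rests on an assumed formalization that is ours, not the text’s: h stands for ‘I do my homework’, m for ‘I go to the movies’, and (23) under the exclusive reading is rendered as the exclusive disjunction of h and m, disjoined with h ∧ m.

```python
from itertools import product

def xor(p, q):
    # Exclusive disjunction: exactly one of p, q is true.
    return p != q

rows = list(product([True, False], repeat=2))

# (22) and (23) with every 'or' read inclusively: 'or do both' adds nothing.
inclusive_same = all((h or m) == (h or m or (h and m)) for h, m in rows)

# (23) with 'either ... or' read exclusively, followed by 'or do both':
# the result coincides with plain inclusive disjunction.
neutralized = all((xor(h, m) or (h and m)) == (h or m) for h, m in rows)

# (22) alone, read exclusively, differs from the inclusive reading
# only where both activities take place.
exclusive_22_differs = [(h, m) for h, m in rows if xor(h, m) != (h or m)]

print(inclusive_same, neutralized, exclusive_22_differs)
```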
‘Or’ with ‘Can’

(24) You can have either coffee or tea.

If we distribute the ‘or’ we get:

(25) You can have coffee ∨ you can have tea.
Now (25) is not the equivalent of (24). Suppose you can have coffee, but you cannot have tea. Then (25) is true, because the first disjunct is, but (24) is false. The problem with (24) is that it involves a hidden conditional. Spelled out in full, it comes out as something like:

(24′) If you ask for coffee or if you ask for tea, you’ll get what you ask for.

And the best way of expressing this is as a conjunction:

(24″) If you ask for coffee (and no tea) you’ll get coffee, and if you ask for tea (and no coffee) you’ll get tea.

The parentheses make explicit the assumption that one does not get both coffee and tea by asking for both. This corresponds to an exclusive reading of ‘or’ in (24′). If this assumption is not correct the parentheses should be omitted and the ‘or’ in (24′) should be read inclusively.

The equivalence of (24′) and (24″) is an instance of one of the following two equivalences. It is an instance of the first, or of the second, depending on whether the ‘or’ in (24′) is read exclusively or inclusively.

(A ∨x B) → C ≡ (A∧¬B → C) ∧ (¬A∧B → C)

(A ∨ B) → C ≡ (A → C) ∧ (B → C)
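Both equivalences can be confirmed by enumerating the eight assignments to A, B and C. The following is a minimal sketch; implies, xor and agree are illustrative helper names, with xor standing in for the exclusive disjunction ∨x.

```python
from itertools import product

def implies(p, q):
    # Material conditional: false only when p is true and q is false.
    return (not p) or q

def xor(p, q):
    # Exclusive disjunction (the text's ∨x).
    return p != q

def agree(f, g):
    # True if f and g coincide under all assignments to A, B, C.
    return all(f(a, b, c) == g(a, b, c)
               for a, b, c in product([True, False], repeat=3))

# (A ∨x B) → C ≡ (A∧¬B → C) ∧ (¬A∧B → C)
exclusive_case = agree(
    lambda a, b, c: implies(xor(a, b), c),
    lambda a, b, c: implies(a and not b, c) and implies((not a) and b, c))

# (A ∨ B) → C ≡ (A → C) ∧ (B → C)
inclusive_case = agree(
    lambda a, b, c: implies(a or b, c),
    lambda a, b, c: implies(a, c) and implies(b, c))

# The two left-hand sides are not interchangeable:
# they differ when A and B are both true and C is false.
readings_differ = not agree(
    lambda a, b, c: implies(xor(a, b), c),
    lambda a, b, c: implies(a or b, c))

print(exclusive_case, inclusive_case, readings_differ)
```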
Sometimes the following sentence is used as an equivalent of (24):

(24.1) You can have coffee or you can have tea.

In this case the ‘or’ in (24.1) does not stand for disjunction, either exclusive or inclusive. The phenomenon just exemplified is general; it takes place when ‘can’ is used to express choice or possibility: ‘You can choose either to marry her or not to see her again’, ‘ ‘Bank’ can mean either a river-bank or a financial institution’, and many similar cases.

Homework

In the following homework A, B, C and D represent, respectively, the sentences: ‘Ann is married’, ‘Barbara is married’, ‘Claire is married’, ‘Dorothy is married’. Sentences whose truth-values depend only on which of the women are married and which are not can be naturally formalized, using ¬, ∧ and ∨. For example,
‘Exactly one of Barbara and Claire is married’ comes out as: B ∧¬C ∨ ¬B ∧C

3.3 Formalize the following as sentential compounds of A, B, C, D, using ¬, ∧ and ∨. Try to get short sentences. (‘The four women’ refers to Ann, Barbara, Claire and Dorothy.)
1. At least one of the four women is married.
2. Exactly one of the four women is married.
3. At least two of Ann, Barbara and Claire are unmarried.
4. If one of the four women is married, all of them are.
5. If one of the four women is unmarried, all of them are.
6. At least one of Ann and Dorothy is married and at least one of Barbara and Claire is unmarried.
7. Ann and Claire have the same marital status.
8. Either one or three among Barbara, Claire and Dorothy are married.
3.4 Let A, B, C be as above. Using the simplified form of the sentences of Homework 2.17 (in 2.5.2), translate them into English. Use good stylistic phrasings. You do not have to reflect logical form, but the English sentences should always have the same truth-value as their formal counterparts. ‘Or’ is to be understood inclusively. You can use ‘the three women’ to refer to Ann, Barbara and Claire.
3.1.4 Conditional and Biconditional
The main English constructions that correspond to the conditional of sentential logic are:
‘If ... , then ___’, ‘If ... , ___’, and ‘___ , if ...’,
where ‘...’ corresponds to the antecedent and ‘___’ to the consequent. Consider for example:
(26) If John works hard he will pass the logic examination,
or the equivalent
(26′) John will pass the logic examination if he works hard.
If John works hard but does not pass the logic examination, (26) is false; and if he works hard and passes the examination, (26) is true. So far the intuition is clear. Somewhat less clear is the case of a false antecedent, i.e., if John doesn’t work hard. Note that if we are to assign a truth-value to (26) in this case, it must be T (as in the truth-table of →). The value cannot be F, for the point of making an ‘if ... , then ___’ statement is to avoid commitment to the truth of the antecedent. (26) cannot be understood to imply that John will work hard.

Some may want to say that when the antecedent is false no truth-value should be assigned. It is as if no statement is made. On this view, the very status of having a truth-value depends on the antecedent’s truth. This interpretation complicates matters considerably and puts (26) beyond the scope of classical logic. Whatever its merits, it is by no means compelling. Quite reasonably we can regard (26), when the antecedent is false, as true, just as we can judge a father who had told his daughter:
(27) If it doesn’t rain tomorrow, we shall go to the zoo,
to have fulfilled his obligation if it rained and they didn’t go to the zoo. He has fulfilled it in an easy, disappointing way, but fulfilled it he has. By the same token, if the antecedent of (26) is false, the sentence is true; true by default, but true nonetheless.

‘Material conditional’ (or, in older terminologies, ‘material implication’) is sometimes used to denote our present conditional; ‘material’ indicates that the truth-value depends only on the truth-values of the antecedent and the consequent, not on any internal or causal connection.

The construal of ‘if’-statements as material conditionals can lead to oddities that conflict with ordinary intuition. The following statements turn out to be true, the first–because the antecedent is false, the second–because the consequent is true.
(28.i) If pigs have wings, the moon is a chunk of cheese,
(28.ii) If two and two make four, then pigs don’t have wings.
There is more than one reason for the oddity of statements like (28.i) and (28.ii). First, we expect the speaker to be as informative as he or she can. There is no point in asserting a conditional if either the antecedent is known to be false, or the consequent is known to be true. In the first case one is expected to assert the negation of the antecedent; in the second case one is expected to assert the consequent. Furthermore, we expect some connection between what the antecedent and consequent say, and this is totally missing in (28.i) and (28.ii). The connection need not be causal, e.g.,
(29) If five times four is twenty, then five times eight is forty.
(29) makes sense inasmuch as the consequent can be “naturally deduced” from the antecedent.
Often ‘if ... , then ___’ is employed in a sense that indicates a causal relation. For example, (27) says that if they will not be hindered by rain, the persons in question (referred to by ‘we’) will go to the zoo. But if we construe it as a material conditional and apply the familiar equivalence
¬A → B ≡ ¬B → A,
we can convert it into the logically equivalent:
(27′) If we don’t go to the zoo tomorrow, it will rain.
Without additional explanation (27′) looks bizarre, for it suggests that not going to the zoo has an effect on the weather.

Cases like those discussed above have sometimes been called “paradoxes of material implication [or material conditional]”. Actually, there is nothing paradoxical here. The material conditional is not intended to reflect aspects of ‘if’ that pertain to causal links, temporal order, or any connections of meanings, over and above the truth-value dependence. Non-material conditionals can be expressed in richer systems of logic, which are designed to handle phenomena that are not truth-functional.
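The truth-functional facts used in this discussion are easy to verify by enumeration. The sketch below is for illustration only; the function name `implies` is an assumption of the sketch, not notation from the text. It checks the equivalence of ¬A → B and ¬B → A, and the two sources of "oddity": a false antecedent or a true consequent makes the conditional true regardless of the other component.

```python
from itertools import product

def implies(p, q):
    # Material conditional: false only when p is true and q is false.
    return (not p) or q

rows = list(product([True, False], repeat=2))

# (not A -> B) has the same truth-value as (not B -> A) in every row.
contraposition_holds = all(
    implies(not a, b) == implies(not b, a) for a, b in rows
)

# As in (28.i): a false antecedent yields T whatever the consequent is.
false_antecedent_gives_true = all(implies(False, q) for q in (True, False))

# As in (28.ii): a true consequent yields T whatever the antecedent is.
true_consequent_gives_true = all(implies(p, True) for p in (True, False))

print(contraposition_holds, false_antecedent_gives_true, true_consequent_gives_true)
```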
‘If’ and ‘Only If’, Sufficient versus Necessary Conditions
In ‘if’-statements that express conditionals the antecedent is marked by ‘if’. Recast as conditionals,
‘If ... , then ___’, ‘If ... , ___’, and ‘___ , if ...’
come out as:
(30) (...) → (___).
Note that in the third expression above, the antecedent comes after the consequent. The antecedent is not marked by its place but by the prefix ‘if’.

Just as ‘if’ marks the antecedent, ‘only if’ marks the consequent. As far as truth-values are concerned, to say ‘... , only if ___’ is to say that ‘...’ is not true without ‘___’ being true; which simply means that if ‘...’ is true, so is ‘___’. Thus, it comes out as (30).
A condition X (fact, state of affairs, what a sentence states–we won’t belabor this here) is said to be sufficient for Y, just when the obtaining of X guarantees the obtaining of Y. That X is a sufficient condition for Y is often described by asserting:
‘If X then Y’, or ‘Y, if X’,
or–to put it more accurately:
‘If ... , then ___’ or ‘___ , if ...’,
where ‘...’ describes X and ‘___’ describes Y.
On the other hand, X is a necessary condition for Y if Y cannot take place without X. And this is often expressed by:
‘Y, only if X’.
A sufficient condition need not be necessary; e.g., dropping the glass is sufficient for breaking it (expressed by: ‘If the glass is dropped it will break’), but it is not necessary–the glass can be broken in other ways. Vice versa, a necessary condition need not be sufficient; e.g., good health is necessary for Jill’s happiness (expressed by: ‘Jill is happy only if she is in good health’), but it is not sufficient–other things are needed as well.

The confusing of necessary and sufficient conditions is quite common and results in fallacious thinking; for example, the affirmation-of-consequent fallacy whereby, assuming the truth of ‘If ... , then ___’, one fallaciously infers the truth of ‘...’ from that of ‘___’.

Although ‘if ... , ___’ and ‘... , only if ___’ come to the same when construed as conditionals, the move from the first to the second can result in statements that are rather odd. If we rewrite (27) (‘If it doesn’t rain tomorrow, we shall go to the zoo’) in the ‘only-if’ form we get:
It won’t rain tomorrow only if we go to the zoo,
which makes the going to the zoo a necessary condition for not raining, suggesting, even more than (27′), some mysterious influence on the weather. Underlying this are, again, the causal implications that an ‘only-if’ statement can have, which disappear in the formalization. By now the matter has been sufficiently clarified.

‘If’-Statements that Express Generality
When an ‘if’-phrase is constructed with an indefinite article and a common noun, the result is not a conditional but a generalization of one:
(31) If a woman wants to have an abortion, the state cannot prevent her.
To assert (31) is to assert something about all women. It cannot be formalized in sentential logic. We shall see that in first-order logic it is expressed by using variables and prefixing a universal quantifier before the conditional. It comes out like:
(31′) For all x, if x is a woman who wants to have an abortion, then the state cannot prevent x from having it.
By contrast, the following is expressible as a simple conditional, though it has the same grammatical form.
(32) If Jill wants to have an abortion, the state cannot prevent her.

The expression of generality through conditionals is very common in technical contexts, when general rules are stated using variables, or schematic symbols. For example, the transitivity of > is stated in the form:
If x > y and y > z, then x > z,
meaning: for all numbers x, y, z, if x > y etc. The quantifying expression ‘For all numbers x, y, z’ has been omitted, but the reader has no difficulty in supplying it.
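As a rough preview of the quantified forms, (31′) and the transitivity statement might be displayed as follows. The predicate letters W and P are hypothetical, chosen here only for illustration; the notation actually adopted for first-order logic is the one introduced later in the book.

```latex
% Hypothetical predicate letters (not from the text):
%   W(x): x is a woman who wants to have an abortion
%   P(x): the state can prevent x from having it
\[
  \forall x\,\bigl(W(x) \rightarrow \neg P(x)\bigr)
\]
% Transitivity of >, with the omitted quantifiers restored:
\[
  \forall x\,\forall y\,\forall z\,\bigl((x > y \wedge y > z) \rightarrow x > z\bigr)
\]
```

By contrast, (32) contains no variable to quantify over, so it remains a plain conditional of the form A → B.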
Other Ways of Expressing Conditionals
Besides ‘if’, there are other English expressions that mark the antecedent of a conditional. Here are some:
‘provided that’, ‘assuming that’, ‘in case that’, and sometimes ‘when’.
For example, (27) can be rephrased as
... it does not rain tomorrow, we shall go to the zoo,
where ‘...’ can stand for each of: ‘Provided that’, ‘Assuming that’, ‘In case that’. We can also change the places of antecedent and consequent: ‘We shall go to the zoo tomorrow, provided that it does not rain’. As in the case of ‘if’, we can get an expression marking the consequent by adding ‘only’: ‘... , only in case that ___’.

As a rule, ‘when’ can be used to form conditionals involving generality. For example, you can replace in (31) ‘If’ by ‘When’; or consider:
(33) When there is a will there is a way.
(I.e., in every case: if there is a will there is a way.)
(34) A conjunction is true when both conjuncts are.
(I.e., for every conjunction, if both conjuncts are true so is the conjunction.)
This use of ‘when’ is possible when the temporal aspect is non-existent as in (34), or when it is not explicit, as in (33). (In the latter, the temporal aspect was neutralized by using ‘case’ in the paraphrase.) But when time is explicit, we cannot recast the sentence as a generalized conditional, at least not in a straightforward way (via common names). For example:
(35) Jack and Jill will marry when one of them gets a secure job.
The temporal proximity of the two events (getting a secure job and the marriage), which is explicitly stated in (35), disappears if (35) is formalized as a conditional. (The formalization of (35) in first-order logic requires the introduction of time points. Alternatively, it can be carried out in temporal logic, designed expressly for handling such statements. In this logic truth-values are time-dependent and there are connectives for expressing constructions based on ‘when’, ‘after’, ‘until’, etc.)

Unless: The harmless ‘unless’, which causes no problem in actual usage, can cause confusion when it comes to formalizing. ‘___ , unless ...’ means that if the condition expressed by ‘...’ is not satisfied, then ‘___’ is (or will be) true. In formal recasting it is:
¬(...) → (___).
It is a conditional whose antecedent is the negation of the clause that follows ‘unless’. That clause can also come first: ‘unless ... , ___’. Since ¬A → B ≡ A ∨ B, an ‘unless’-statement can be formalized as a simple disjunction:
(...) ∨ (___).
Reading ‘unless’ as ‘or’ is somewhat surprising. Analyzing the situation, one can see that ‘unless’ connotes (perhaps even more than ‘if’) some sort of causal connection. When we read it as ‘or’, this side of it disappears; hence, the surprise. We are also not used to regarding ‘unless’ as a disjunction. Whatever the reason, this is how ‘unless’ is interpreted as a truth-functional connective. A few examples will show that it is indeed the right way:

We shall go to the zoo tomorrow, unless it rains.
We shall go to the zoo tomorrow if it does not rain.
Either it will rain tomorrow, or we shall go to the zoo.

Unless you pass the exam, you will not qualify.
If you don’t pass the exam, you will not qualify.
You’ll pass the exam, or you will not qualify.
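The two readings of ‘unless’ can be compared row by row. The sketch below uses the zoo example; the function and variable names are illustrative assumptions, not notation from the text.

```python
from itertools import product

def implies(p, q):
    # Material conditional: false only when p is true and q is false.
    return (not p) or q

def unless_as_conditional(consequent, condition):
    # '___ , unless ...' read as: not(...) -> (___)
    return implies(not condition, consequent)

def unless_as_disjunction(consequent, condition):
    # The same statement read as: (...) or (___)
    return condition or consequent

# One row per combination of (rain, zoo) for
# 'We shall go to the zoo tomorrow, unless it rains.'
table = [
    (rain, zoo, unless_as_conditional(zoo, rain), unless_as_disjunction(zoo, rain))
    for rain, zoo in product([True, False], repeat=2)
]

readings_agree = all(c == d for _, _, c, d in table)
false_rows = [(rain, zoo) for rain, zoo, c, _ in table if not c]

print(readings_agree, false_rows)
```

On either reading the statement fails in exactly one situation: it does not rain and we do not go to the zoo.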
Biconditional
In natural language biconditionals are expressed by using ‘if and only if’:
Tomorrow will be the longest day of the year, if, and only if, today is June 20.
In this form, biconditionals can serve to assert that a certain condition is both necessary and sufficient for some other condition. Note that the “necessary”-part is expressed by ‘only if’, and it corresponds to the left-to-right direction of ‘↔’; the “sufficient”-part is expressed by ‘if’, and it corresponds to the right-to-left direction of ‘↔’.

Since a biconditional amounts to a conjunction of two conditionals, the construal of various ‘if-and-only-if’-statements as biconditionals inherits some of the problems of the conditional (material implication).

Other expressions that can be used to form biconditionals are:
just if, just in case, just when.
But biconditionals that are stated with ‘just when’ are usually general claims whose formalizations require quantifiers.

Homework
3.5 Express the following excerpts as sentential compounds. Get your basic components as small as possible. Note basic components that are equivalent to generalized conditionals and rewrite them so as to display the conditional part. For example:
The wise in heart will receive commandments, but a prating fool shall fall.
Answer: A ∧ B, where
A: ‘The wise in heart will receive commandments’. Generalized conditional: ‘If x is wise in heart, then x will receive commandments.’
B: ‘A prating fool shall fall’. Generalized conditional: ‘If x is a prating fool, then x will fall.’
For the sake of the exercise, you can treat an address to the reader as an address to a particular person, using ‘you’ as a proper name.
1. A leader is a dealer in hope.
2. Ignore what a man desires and you ignore the very source of his power.
3. Laws are like spider’s webs which, if anything small falls into them they ensnare it, but large things break through and escape.
4. If you command wisely, you’ll be obeyed cheerfully.
5. You’ll get well in the world if you are neither more nor less wise, neither better nor worse than your neighbors.
6. God created man and, finding him not sufficiently alone, gave him a companion to make him feel his solitude more keenly.
7. Women are as old as they feel–and men are old when they lose their feelings.
8. If there are obstacles, the shortest line between two points may be the crooked line.
9. There is time enough for everything in the course of the day if you do but one thing at once; but there is not time enough in the course of the day if you will do two things at a time.
10. No one can be good for long if goodness is not in demand.
11. Literary works cannot be taken over like factories, or literary forms of expression like industrial methods.
12. If the mind, which rules the body, ever forgets itself so far as to trample upon its slave, the slave is never generous enough to forgive the injury; but will rise and smite its oppressor.
Chapter 4 Logical Implications and Proofs

4.0
In this chapter we introduce implication (from many premises) and we present a method of proving valid implications of sentential logic (i.e., tautological implications). It is an adaptation of Gentzen’s calculus, which is easy to work with and which is guaranteed to produce either a proof, or a counterexample that shows that the given implication does not obtain in general–hence is not provable. As we shall see in chapter 9, the system extends naturally to first-order logic, where it is guaranteed to produce a proof of any given logical implication.

Returning, in the last section, to natural language, we try to represent implications that arise in English discourse, as formal implications of our system. In this connection we discuss some well-known concepts in the philosophy of language, such as meaning postulates and implicature.
4.1 Logical Implication
As noted in the introduction, logic was considered historically the science of correct reasoning, which uncovers and systematizes valid forms of inference. Generally, an inference starts with certain assumptions called premises and ends with a conclusion. It is not required that the premises be true, but that they imply the conclusion; i.e., it should be impossible that the premises be true and the conclusion–false. In general, implications are not grounded in pure logic. That Jack was at a certain hour in New York implies that he was not, shortly afterwards, in Toronto. This is not a logical implication. It rests on the practical impossibility of covering the New York - Toronto distance in too short a time. If the time is sufficiently short, the impossibility may be traced to a physical
law. And in the extreme case, it becomes the impossibility of being at the same time in two different places. But even this is not something that rests on pure logic. We shall not address at this point what comes under “pure logic”. As in the cases of logical equivalence and logical truth (cf. chapter 2), we can say that a sentence logically implies another if it is impossible that the first be true and the second false, by virtue of the logical elements of the two sentences. In sentential logic the only logical elements are the connectives. A logical implication that rests only on the meaning of the connectives is tautological. Here is the definition. A tautologically implies B if there is no assignment of truth-values to the atomic sentences under which A gets T and B gets F. As in the case of tautological equivalence, (cf. 2.2.0) there is no need to go to the level of the atomic sentences. That a sentence tautologically implies another can be seen by displaying their relevant sentential structure. The definition entails the following. A tautologically implies B if and only if they can be written as sentential expressions (with each sentential variable standing for the same sentence in both) such that there is no assignment to the sentential variables under which the expression for A gets T, and the expression for B gets F. This means that, in a truth-table containing columns for both, there is no row in which A’s value is T and B’s value is F. If A is a contradiction, then there is no truth-value assignment (to the atomic sentences) under which it gets T. Hence, for all B, there is no assignment in which A gets T and B gets F. Consequently a contradiction implies tautologically all other sentences. By a similar argument, every sentence implies tautologically a tautology. We shall return to this later. Note that tautological implication is a certain type of logical implication. In sentential logic the two are the same. 
But in more expressive systems, in particular in first-order logic, there are logical implications that are not tautological.

Terminology and Notation:
• ‘Logical’ in the context of sentential logic means tautological. In the present chapter, the terms are interchangeable. For the sake of brevity we often use ‘implication’ for logical implication.
• ‘|=’ denotes logical implication, that is: A |= B means that A logically implies B. If A |= B, we say that B is a logical consequence of A.
• ‘|=’ is a shorthand for ‘logically implies’. Like ‘≡’ it belongs to our English discourse, not to the formal system. To say that A |= B is to claim that A logically implies B.

Note: Terms such as ‘implication’ and ‘equivalence’ are used mostly with respect to sentences, or sentential expressions, of our formal system. But we use them also with respect to our own statements. E.g., we can say that A ≡ A′ implies that ¬A ≡ ¬A′, and we speak about the implication
A |= A′ =⇒ ¬A′ |= ¬A.
And here, ‘implication’, which is denoted by ‘=⇒’, refers to our own statements. Similarly, we may speak of the equivalence
A ≡ B ⇐⇒ B ≡ A.
A similar ambiguity surrounds ‘consequence’. The intended meaning of these, and other two-level terms, should be clear from the context.

If two sentences are equivalent, then under all truth-value assignments (to the atomic components) they get the same truth-value. Hence they imply each other. Vice versa, if they imply each other, then there is no assignment under which one gets T and the other gets F; therefore they are equivalent. Hence, sentences are equivalent just when they imply each other:
(1) A ≡ B ⇐⇒ A |= B and B |= A.
(1) shows how logical equivalence can be defined in terms of logical implication. On the other hand, using conjunction, we can express implication in terms of equivalence:
(2) A |= B ⇐⇒ A ≡ A∧B.
The argument for (2) is easy: Assume that A |= B. Then (i) if A gets T, so does B and so does A ∧ B; and (ii) if A gets F, then A ∧ B gets F. Therefore A and A ∧ B always get the same value. Vice versa, if A ≡ A ∧ B, it is impossible that A gets T and B gets F, for then A and A ∧ B get different values.

One can, nonetheless, argue that implication is the more basic notion. It corresponds directly to inferring. Moreover, logical equivalence is reducible to it without employing connectives, but not vice versa.¹ As we shall presently see, the most basic notion is that of implication with one or more premises.

¹ The reason for treating, in this book, equivalence before implication is didactic. Its analogy with equality makes equivalence more accessible and enables one to use algebraic techniques.
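Laws (1) and (2) can be checked exhaustively for sentences built from two sentential variables. In the sketch below each sentence is represented by its truth-table column (a tuple of four truth-values), so two sentences are equivalent exactly when their columns are equal. This representation, the restriction to two variables, and all the names are assumptions made for the illustration.

```python
from itertools import product

# The four rows of a truth-table for two sentential variables.
rows = list(product([True, False], repeat=2))

# All 16 possible columns; every sentence in two variables has one of them.
columns = list(product([True, False], repeat=len(rows)))

def implies_col(x, y):
    # x |= y: there is no row in which x gets T and y gets F.
    return not any(p and not q for p, q in zip(x, y))

def conj(x, y):
    # Column of the conjunction of the two sentences.
    return tuple(p and q for p, q in zip(x, y))

# (1): A is equivalent to B  iff  A |= B and B |= A
law1 = all(
    (x == y) == (implies_col(x, y) and implies_col(y, x))
    for x in columns for y in columns
)

# (2): A |= B  iff  A is equivalent to A & B
law2 = all(
    implies_col(x, y) == (x == conj(x, y))
    for x in columns for y in columns
)

contradiction = (False,) * len(rows)
tautology = (True,) * len(rows)
contradiction_implies_all = all(implies_col(contradiction, y) for y in columns)
all_imply_tautology = all(implies_col(x, tautology) for x in columns)

print(law1, law2, contradiction_implies_all, all_imply_tautology)
```

A finite check of this kind illustrates the laws; the general arguments given in the text are what establish them for sentences of any size.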
Using conditional, we can express logical implication in terms of logical truth.
(3) A |= B ⇐⇒ A → B is logically true.

Homework 4.1 Prove (3), via the same type of argument used in proving (2).

The following properties of implication are easily established.
Reflexivity: A |= A
Transitivity: If A |= B and B |= C, then A |= C.
Transitivity is of course intuitively implied by the very notion of implication. The detailed argument is trivial:² Assuming that A |= B and B |= C, one has to show that there is no assignment under which A gets T and C gets F. So assume that A gets T. Then B must get T, because A |= B. But then, C must get T, because B |= C.

Logical implications are preserved when we substitute the sentences by logically equivalent ones:
(4) If A ≡ A′ and B ≡ B′, then A |= B iff A′ |= B′.
One can derive (4) immediately from the definitions, by noting that logical implication is defined in terms of possible truth-values and that logically equivalent sentences always have the same value. ((4) is also derivable from (1) via the transitivity of implication: If A ≡ A′ then, by (1), A′ |= A. Similarly, if B ≡ B′, then B |= B′. If also A |= B, we get:
A′ |= A, A |= B, B |= B′.
Applying transitivity twice we get A′ |= B′. In the same way we derive A |= B from A′ |= B′.)

(4) implies that, in checking for logical implications, we are completely free to substitute sentences by logically equivalent ones. We can therefore use all the previous simplification techniques in order to reduce the problem to one that involves simpler sentences.

² It is trivial if we define implication by appealing to assignment to atomic sentences. If we want to bypass atomic sentences we have to show that the two implications A |= B and B |= C, can be founded on sentential expressions in which the three sentences are generated from the same stock of basic components (see also footnote 4 in chapter 2, page 32). This can be done by using unique readability.
Every case of logical equivalence is, by (1), a case of two logical implications, from left to right and from right to left. But generally implications are one-way. Here are a few easy examples in which the reverse implication does not hold in general.
(i) A ∧ B |= A
(ii) A |= A ∨ B
(iii) B |= A → B
(iv) ¬A |= A → B
(v) A ∧ B |= A ↔ B
(vi) ¬A ∧ ¬B |= A ↔ B
That the reverse implications do not hold in general can be seen by assigning the sentential variables truth-values, under which the left-hand side gets T and the right-hand side gets F. We can interpret them as standing for atomic sentences that can have these values. For example, in the case of (i), let A get T and let B get F. Note that we can also force this assignment by interpreting the variables as standing for tautologies or contradictions. For example let A = C → C, B = C ∧ ¬C.

In particular cases, the right-to-left implication holds as well. For example:
If B is logically true (e.g., B = C → C), then A |= A ∧ B.
If B is logically false, then A → B |= ¬A.
Here, as an illustration, is the argument for the second statement. Assume that B is logically false. If A → B gets T, then, since B gets F (being logically false), A must get F. Hence, ¬A gets T. There is, therefore, no assignment (to the atomic sentences) in which A → B gets T and ¬A doesn’t.

Homework 4.2 Find, for each of (i) - (vi) above, whether the reverse implication holds for all B, in each of the following cases: (1) A is logically true. (2) A is logically false. Altogether you have to check 12 cases. Prove every positive answer by an argument of the type given above. Prove every negative answer by a suitable counterexample.
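For the general case (A and B read as independent sentential variables), the one-way character of (i) - (vi) can be confirmed by searching the four rows for counterexamples. The following is only a sketch with illustrative names; it does not address the special cases of Homework 4.2.

```python
from itertools import product

def cond(p, q):
    # Truth-table of the conditional.
    return (not p) or q

def bicond(p, q):
    # Truth-table of the biconditional.
    return p == q

# Each entry pairs the left-hand side with the right-hand side of an implication.
examples = {
    "(i)":   (lambda a, b: a and b,             lambda a, b: a),
    "(ii)":  (lambda a, b: a,                   lambda a, b: a or b),
    "(iii)": (lambda a, b: b,                   lambda a, b: cond(a, b)),
    "(iv)":  (lambda a, b: not a,               lambda a, b: cond(a, b)),
    "(v)":   (lambda a, b: a and b,             lambda a, b: bicond(a, b)),
    "(vi)":  (lambda a, b: (not a) and (not b), lambda a, b: bicond(a, b)),
}

def counterexamples(f, g):
    # Rows (A, B) in which f gets T and g gets F; empty exactly when f |= g.
    return [(a, b) for a, b in product([True, False], repeat=2)
            if f(a, b) and not g(a, b)]

forward = {name: counterexamples(lhs, rhs) for name, (lhs, rhs) in examples.items()}
reverse = {name: counterexamples(rhs, lhs) for name, (lhs, rhs) in examples.items()}

for name in examples:
    print(name, "forward:", forward[name], "reverse:", reverse[name])
```

For (i) the search returns the assignment mentioned in the text, A getting T and B getting F.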
Note: We can define a notion of logical implication that applies to sentential expressions. This is completely analogous to the case of logical equivalence and logical truth (cf. 2.2). We shall discuss it in 4.3.1, in the more general context of implication from many premises.
4.2 Implications with Many Premises
4.2.0
Implications with several premises are a natural generalization of the one-premise case. The sentences A1, A2, ..., An logically imply the sentence B, and B is a logical consequence of A1, ..., An, if it is impossible, by virtue of the logical elements of the sentences, that all the sentences Ai are true and B is false. The notation is generalized accordingly:
A1, ..., An |= B
The sentences A1, A2, ..., An are referred to as premises and B–as the conclusion. The precise definition, in the case of sentential logic, is a straightforward generalization of the one-premise case. Here it is:
A1, ..., An |= B, if there is no truth-value assignment to the atomic sentences under which all the Ai’s get T and B gets F.
Again, this entails a characterization in terms of sentential expressions, without appealing to atomic sentences:
A1, ..., An |= B, iff all the premises and the conclusion can be written as sentential expressions, such that in a truth-table containing columns for all, there is no row in which all the Ai’s get T and B gets F.

Notation: We refer to sequences such as A1, ..., An as lists of sentences, and we use
‘Γ’, ‘∆’, ‘Γ′’, ‘∆′’, etc.,
for denoting such lists. Thus, if Γ = A1, A2, ..., An then
Γ |= B means that A1, ..., An |= B.
Furthermore, we use notations such as ‘Γ, A’ and ‘Γ, ∆’ for lists obtained by adding a sentence and by combining two lists: If
Γ = A1, ..., An and ∆ = B1, ..., Bk,
then
Γ, A = A1, ..., An, A and Γ, ∆ = A1, ..., An, B1, ..., Bk.
It is obvious that, as far as logical consequences are concerned, the ordering of the premises makes no difference. Also repeating a premise, or deleting a repeated occurrence, makes no difference. For example,
A, B, A, C, D, D,
B, A, C, D,
and
A, B, C, D
have the same logical consequences. Such rewriting of lists will, henceforth, be allowed, as a matter of course.

It should be evident by now that, in dealing with logical implications, we can apply the substitution-of-equivalents principle: any sentences among the premises and the conclusion can be substituted by logically equivalent ones. Spelled out in detail, it means this:
If A1 ≡ A1′, and ... and An ≡ An′, and B ≡ B′, then
A1, ..., An |= B ⇐⇒ A1′, ..., An′ |= B′.
The Empty Premise List
We include among the lists the empty list, one that has no members. To be a logical consequence of the empty list means simply to be a logical truth. (If Γ is empty, then, vacuously, all its members are true. Hence, to say that it is impossible that all members of Γ are true and B is false is simply to say that it is impossible that B is false.) Logical implication by the empty list is expressed by writing nothing to the left of ‘|=’. Therefore
|= B
means that B is a logical truth. In the case of sentential logic, it means that B is a tautology.

By using conjunction, we can reduce an implication from A1, ..., An to the single-premise case:
(5) A1, ..., An |= B ⇐⇒ A1 ∧ ... ∧ An |= B.
(5) is proved by observing that all Ai’s get T just when their conjunction, A1 ∧ ... ∧ An, gets T.

Implications from premise lists constitute, nonetheless, an important advance on single-premise implications. First, they do not necessitate the use of conjunctions. Second, it is
easier to grasp an implication stated in terms of several premises, instead of using a single long conjunction. Third, the premise list can be infinite. The definition of implication works equally well for that case, but the reduction to single premises, via (5), breaks down (unless we admit infinite conjunctions, which is a radical extension of the system). In this book we shall restrict ourselves to finite premise lists. Yet the infinite case has its uses. Fourth, by including the possibility of an empty list, we incorporate logical truth within the framework of logical implication. As we shall see, the rules for implications lead to useful methods for establishing logical truth. Finally and most important, there are nice sets of rules for establishing implications, which depend essentially on the possibility of having many premises.
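The definition of implication from a (finite) premise list translates directly into a truth-table search. The sketch below is one way of writing it; the function `entails`, the representation of sentences as Python functions of two truth-values, and the sample sentences are all assumptions of the illustration.

```python
from itertools import product

def entails(premises, conclusion, n):
    """premises |= conclusion, for sentences over n sentential variables.

    Returns False as soon as some row gives T to every premise and F to the
    conclusion. With an empty premise list this tests for tautology.
    """
    for row in product([True, False], repeat=n):
        if all(p(*row) for p in premises) and not conclusion(*row):
            return False
    return True

# Sample sentences over the variables A, B (chosen only for illustration).
A = lambda a, b: a
B = lambda a, b: b
A_implies_B = lambda a, b: (not a) or b

two_premises = entails([A, A_implies_B], B, 2)

# Reordering or repeating premises makes no difference.
reordered_repeated = entails([A_implies_B, A, A], B, 2)

# Empty premise list: |= B holds just when B is a tautology.
empty_list_tautology = entails([], lambda a, b: a or not a, 2)
empty_list_non_tautology = entails([], A, 2)

# (5): the premise list can be traded for the conjunction of its members.
conjunction = lambda a, b: A(a, b) and A_implies_B(a, b)
single_premise = entails([conjunction], B, 2)

print(two_premises, reordered_repeated, empty_list_tautology,
      empty_list_non_tautology, single_premise)
```

The search is finite only because the premise list is; the infinite case mentioned above is outside the reach of such a check.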
4.2.1 Some Basic Implication Laws and Top-Down Derivations
Our previous (3) (which characterizes implication from a single premise in terms of logical truth) can now be stated as:
(6) A |= B ⇐⇒ |= A → B
And this can be generalized to the following important law:
(7) Γ, A |= B ⇐⇒ Γ |= A → B
(6) is a particular case of (7), obtained when Γ is empty. Here is the proof of (7): To prove the left-to-right direction, assume that Γ, A |= B and show: Γ |= A → B, i.e., that it is impossible (by virtue of the logical elements) that all members of Γ are true and A → B is not. Assume a case where all members of Γ get T. If A gets F, then A → B gets T (by the truth-table of → ). And if A gets T, then all members of Γ, A get T; since we assumed that Γ, A |= B, B gets T. Again, by the truth-table of →, A → B gets T. To prove the right-to-left direction, assume that Γ |= A → B, and show: Γ, A |= B, i.e., that it is impossible (by virtue of the logical elements) that all members of Γ, A get T and B get F. So assume that all members of Γ, A get T. Then (i) all members of Γ get T and (ii) A gets T. Having assumed that Γ |= A → B, it follows that A → B gets T. Since also A gets T, it follows, by the truth-table of →, that B gets T.
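Law (7) can also be spot-checked by brute force. In the sketch below sentences over two variables are represented by their truth-table columns, and Γ is taken to consist of a single sentence G (by (5), a longer finite list can be replaced by its conjunction). The representation and the names are assumptions of the illustration, and the check is no substitute for the proof above.

```python
from itertools import product

rows = list(product([True, False], repeat=2))
# All 16 truth-table columns for sentences in two variables.
columns = list(product([True, False], repeat=len(rows)))

def entails(premise_columns, conclusion):
    # No row gives T to every premise column and F to the conclusion column.
    return not any(
        all(col[i] for col in premise_columns) and not conclusion[i]
        for i in range(len(rows))
    )

def conditional(a, b):
    # Column of A -> B, computed row by row from the truth-table of ->.
    return tuple((not p) or q for p, q in zip(a, b))

# (7): G, A |= B  iff  G |= A -> B, for every choice of columns G, A, B.
law7 = all(
    entails([g, a], b) == entails([g], conditional(a, b))
    for g in columns for a in columns for b in columns
)

# (6): the special case in which the premise list Γ is empty.
law6 = all(
    entails([a], b) == entails([], conditional(a, b))
    for a in columns for b in columns
)

print(law7, law6)
```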
Note that the argument relies on the logical elements of the sentences in Γ, A, B and on the truth-table of →, which is itself a logical element. (7) provides a very useful way of establishing implications in which the conclusion is a conditional. We can refer to it by the (rather unwieldy) “conclusion-conditional law”. We also mark it by the following self-explanatory notation (|=, →) . Here is an illustration how (|=, →) can work. Suppose that we want to show that: |= (A → (B → A)) Using (|=, →) (where Γ is empty and B is substituted by B → A) this reduces to showing that: A |= B → A Again, using (|=, →) (where Γ consists of A, A is substituted by B, and B by A), this reduces to: A, B |= A
But this last implication is obvious, because the conclusion occurs as one of the premises. Thus, we have established the logical truth of our original sentence. The argument just given is an example of a top-down proof, or top-down derivation: We start with the claim to be proved and, working our way “backward”, we keep reducing it to other sufficient claims (i.e., claims that imply it), until we reduce it to obviously true claims. We can then turn the argument into a bottom-up proof: the familiar kind that starts with obviously true claims and moves forward, in a sequence of steps, until the desired claim is established. The bottom-up proof is obtained by inverting the top-down derivation. In our last example the resulting bottom-up proof is:
A, B |= A    obvious,
A |= B → A    by (|=, →),
|= A → (B → A)    by (|=, →).
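The law (|=, →) and the three goals of this small derivation can also be checked semantically, by running through all truth-value assignments. The following Python sketch does so. The encoding of sentences as functions, and the names `implies` and `cond`, are assumptions made for the illustration and are not part of the text; the sample lists Γ are arbitrary.

```python
from itertools import product

def implies(premises, conclusion, variables):
    """True if no assignment gives T to all premises and F to the conclusion."""
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

# Sentences are encoded as functions from an assignment to a truth-value.
A = lambda v: v["A"]
B = lambda v: v["B"]

def cond(x, y):
    """The conditional x → y, evaluated by its truth-table."""
    return lambda v: (not x(v)) or y(v)

VARS = ["A", "B"]

# (7): Γ, A |= B  ⇐⇒  Γ |= A → B, tried on a few sample lists Γ.
samples = [[], [B], [cond(A, B)], [cond(B, A)]]
law_7_agrees = all(
    implies(gamma + [A], B, VARS) == implies(gamma, cond(A, B), VARS)
    for gamma in samples
)

# The three goals in the derivation of |= A → (B → A).
goal_3 = implies([A, B], A, VARS)
goal_2 = implies([A], cond(B, A), VARS)
goal_1 = implies([], cond(A, cond(B, A)), VARS)
```

Such a check only confirms particular instances; the general law rests on the proof given above.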
The implications occurring in top-down derivations are referred to as goals. The derivation starts with the initial goal (the implication we want to prove) and proceeds stepwise by reducing goals to other goals, until all the required goals are self-evident. The top-down method figures prominently in the sequel. Besides (|=, →), we shall avail ourselves of other laws. It goes without saying that all the laws are general schemes holding for all possible values of the sentential variables. Here are some. (8)
If Γ |= A and every sentence that occurs in Γ occurs in Γ′, then Γ′ |= A.
(8) says that the addition of premises can only increase the class of implied consequences. This property is the monotonicity of logical implication, or of the logical consequence relation.³ Given our definition of implication, (8) is trivial (if it is impossible that all the sentences in Γ get T and A gets F, then, a fortiori, it is impossible that all sentences in Γ′ get T and A gets F).
Note: Monotonicity obtains for many types of implication, not necessarily logical, and it seems quite obvious: By adding more premises one can only get more consequences, not fewer. Yet we often employ reasonings that are not monotone. Conclusions established on the basis of some information may be withdrawn when some additional information is obtained. The well-known example is: Being told that Twitty is a bird, one will conclude that Twitty can fly; but one will withdraw this conclusion if told, in addition, that Twitty is a penguin. Inferences of that nature have been, in the last twenty years, the subject of considerable research by logicians, computer scientists and philosophers working in the area of belief change and artificial intelligence. Various formal systems have been proposed. They come under the title of non-monotonic logic.
Not as obvious as (8), but still quite easy, is the following law of consequence addition:
(9) If Γ |= A, then, for every sentence B:
Γ |= B ⇐⇒ Γ, A |= B
(9) means that any consequence of the given premises can be added as an additional premise, without changing the class of consequences. Here is the proof. The left-to-right direction of the ‘⇐⇒’ follows from monotonicity. For the right-to-left direction, assume that Γ, A |= B. If all sentences in Γ get T, then, by the initial assumption (that Γ |= A), A must get T; hence, all sentences in Γ, A get T; therefore B gets T. Thus, it is impossible that all sentences in Γ get T and B gets F.
The following generalization of (9) allows us to add as premises many consequences of the original list.
(9∗) Assume that every sentence in ∆ is a consequence of Γ. Then Γ and Γ, ∆ have the same consequences; that is, for every sentence C:
Γ |= C ⇐⇒ Γ, ∆ |= C
³ In mathematics, ‘monotone’ (or ‘positively monotone’) is used to describe relations in which an increase in one quantity does not cause a decrease in another related one. For example, 2·x is a monotone function of x, for it doesn’t become smaller as x becomes larger. But x² − 2x is not monotone, for, as x becomes larger it sometimes increases and sometimes decreases (e.g., it increases if x is increased from 1 to 2, but decreases if x is increased from 0 to 1).
(9∗) can be proved by the same reasoning that proves (9). It can also be deduced by repeated applications of (9). (Assume that ∆ = B₁, . . . , Bₘ; then every Bᵢ is a consequence of Γ. By (9), Γ and Γ, B₁ have the same consequences. Since B₂ is a consequence of Γ, it is, by monotonicity, a consequence of Γ, B₁; again, by (9), Γ, B₁ and Γ, B₁, B₂ have the same consequences. Therefore Γ and Γ, B₁, B₂ have the same consequences, etc.)
(9∗) implies the following generalization of the transitivity law of one-premise implications:
If (i) for every B in ∆, Γ |= B, and (ii) ∆ |= C, then Γ |= C.
The argument is easy: By monotonicity, every consequence of ∆ is a consequence of Γ, ∆; and, by (9∗ ), Γ, ∆ has the same consequences as Γ. (10) If every sentence of ∆ is a consequence of Γ and every sentence of Γ is a consequence of ∆, then Γ and ∆ have the same consequences. (10) follows trivially from generalized transitivity: every consequence of ∆ is a consequence of Γ and, vice versa, every consequence of Γ is a consequence of ∆. Equivalent Premise Lists: Call two premise lists, Γ and ∆, logically equivalent, or equivalent for short, if they have the same logical consequences. If two premise lists are equivalent then every sentence of one list is a consequence of the other (because it is a consequence of the list in which it occurs). (10) says that the reverse direction holds as well. By the truth-table of →, we get immediately: (11)
A, A → B |= B
We have also: B |= A → B. These two imply the following very useful law:
(12) Γ, A, A → B |= C ⇐⇒ Γ, A, B |= C
(12) is obtained, via (10), by observing that every sentence in one of the two premise lists is a consequence of the other. (Every sentence in Γ, A, B is a consequence of Γ, A, A → B, because A, A → B |= B. And every sentence in Γ, A, A → B is a consequence of Γ, A, B, because B |= A →B.) We shall call (12) disjoining. It allows us to disjoin a premise that is a conditional into its parts, provided that the antecedent is among the premises.
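Disjoining can also be seen at the level of truth-tables: the lists A, A → B and A, B get T under exactly the same assignments, so adding the same Γ to both cannot separate them. The following is a small Python check of this, offered only as a sketch; the variable names are ours.

```python
from itertools import product

# For every assignment, compare "A and A → B both get T" with "A and B both get T".
rows = []
for a, b in product([True, False], repeat=2):
    a_implies_b = (not a) or b              # truth-table of →
    list_with_conditional = a and a_implies_b
    list_with_consequent = a and b
    rows.append((a, b, list_with_conditional, list_with_consequent))

same_on_every_row = all(left == right for _, _, left, right in rows)
```

Here `same_on_every_row` comes out true: the two lists are jointly true only in the row where both A and B get T.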
Note: Disjoining is related to what is known as modus ponens, or the rule of detachment, by which one can formally infer B from A and A → B. Disjoining is different. It is the semantic law which justifies the use of modus ponens. The name ‘disjoining’ is not a current term. (12) can be generalized to: (12∗ )
If A′ |= A, then
Γ, A′, A → B |= C ⇐⇒ Γ, A′, B |= C
To show (12∗), assume that A′ |= A. Then the addition of A to any list containing A′ yields an equivalent list. Hence, Γ, A′, A → B is equivalent to Γ, A′, A, A → B, which, by (12), is equivalent to Γ, A′, A, B. And this last list is equivalent to Γ, A′, B, since it is obtained from it by adding A.
Here is an example of a top-down derivation that uses some of the listed laws. We want to show that
|= [A → (B → C)] → [B → (A → C)]
Starting with this as our initial goal, we keep reducing each goal to another sufficient goal and we write the goals on separate, numbered lines. Indicated in the margin is the law (or laws) by which the preceding implication is reduced to the current one. The sign ‘√’ marks obvious implications that need no further reductions.
1. |= [A → (B → C)] → [B → (A → C)]    initial goal,
2. A → (B → C) |= B → (A → C)    by (|=, →),
3. A → (B → C), B |= A → C    by (|=, →),
4. A → (B → C), B, A |= C    by (|=, →),
5. B → C, B, A |= C    by disjoining,
6. C, B, A |= C    by disjoining. √
Note that the reduction from 4. to 5. uses an instance of disjoining, whereby A → (B → C), A is replaced by B → C, A. For the sake of brevity, we can write the three steps from 1. to 2., from 2. to 3., and from 3. to 4. as a single step:
|= [A → (B → C)] → [B → (A → C)]    initial goal,
A → (B → C), B, A |= C    by three applications of (|=, →).
In a similar way, we can write the steps from 4. to 5. and from 5. to 6. as a single step in which disjoining is applied twice. The bottom-up proof of the initial goal is obtained by reversing the list: Start from 6. and end at 1., justifying each step by the indicated rule; 5. is obtained from 6. by disjoining, 4. from 5. by disjoining, 3. from 4. by (|=, →), etc. From now on we will omit ‘initial goal’ in the margin of the first line.
Here is another example, where substitution-of-equivalents is used as well. The equivalences we use are:
B ∨ C ≡ ¬B → C and A → B ≡ ¬B → ¬A
The first equivalence is used in getting 2. from 1., the second in getting 4. from 3.; in the first case substitution is applied to the conclusion, in the second case to one of the premises.
1. A → B, ¬A → C |= B ∨ C
2. A → B, ¬A → C |= ¬B → C    substitution of equivalents,
3. ¬B, A → B, ¬A → C |= C    by (|=, →),
4. ¬B, ¬B → ¬A, ¬A → C |= C    substitution of equivalents,
5. ¬B, ¬A, ¬A → C |= C    by disjoining,
6. ¬B, ¬A, C |= C    by disjoining. √
Homework
4.3 Using the laws introduced so far, prove, via top-down derivations, the following five implications. The goal should be reduced in the end to an obvious implication in which the conclusion is one of the premises. You can use substitution-of-equivalents based on simple equivalences of the kind given in the last example. In the derivations of 4. and 5. you can use laws (12∗) and (10), as well as the implications B |= A ∨ B and A, B |= A ∧ B.
1. |= [A → (B → C)] → [(A → B) → (A → C)]
2. ¬A → B, B → C |= ¬C → A
3. A → (B ∨ C), ¬B |= A → C
4. (A ∨ B) → (B → C) |= B → C
5. A∧B → C, B |= A → C
4.2.2
Additional Implication Laws and Derivations as Trees
The following two laws handle conjunctions that occur either as premises or as conclusions. The first, the conjunction-premise law, handles the case where the conjunction is one of the premises; the second, the conjunction-conclusion law, handles the case where the conjunction is the conclusion. We adopt the following notation:
(∧, |=): for the conjunction-premise law,
(|=, ∧): for the conjunction-conclusion law.
Here are the two laws:
(∧, |=) Γ, A ∧ B |= C ⇐⇒ Γ, A, B |= C
(|=, ∧) Γ |= A ∧ B ⇐⇒ Γ |= A and Γ |= B
(∧, |=) follows, via (10) (and monotonicity), from the obvious facts that (i) A ∧ B logically implies each of A and B, and (ii) A, B |= A ∧ B. In the second law, (|=, ∧), the ⇒-direction follows, via transitivity, from the fact that A ∧ B implies each of A and B. The ⇐-direction (which is the direction we will be using in top-down derivations) is established by noting that, if Γ |= A and Γ |= B, then Γ implies every premise in the list A, B. Since this list implies A ∧ B, Γ implies it as well (via transitivity).
The conjunction-conclusion law is distinguished from the other laws used so far in that it reduces a goal (Γ |= A ∧ B) not to one but to two goals. Both must be achieved. The number of goals increases thereby. It may increase further, since each of the two new goals may give rise, directly or indirectly, to more than one goal. But the new goals are simpler: instead of the conclusion A∧B we have only A or B. This is also true of the other laws that we shall use. It constitutes the main feature of the method: Although the number of goals may increase, the goals themselves become simpler. In the end, the initial goal is reduced to a collection of so-called elementary goals; these are goals whose validity can be immediately checked.
Here is a simple top-down derivation that ends with two final, self-evident implications:
1. C → A |= (C → B) → (C → A∧B)
2. C → A, C → B, C |= A ∧ B    two applications of (|=, →),
3. C, A, B |= A ∧ B    two applications of disjoining,
4.1 C, A, B |= A    by (|=, ∧), √
4.2 C, A, B |= B    by (|=, ∧). √
Terminology:
In a given derivation, the goals to which a goal has been reduced in a single step are referred to as the goal’s children. The goal is referred to as the parent.
In the last derivation, 1. has a single child, which is 2. The only child of 2. is 3. But 3. has two children, numbered 4.1 and 4.2. They are numbered thus, in order to mark them as the two children of 3. The derivation is not complete, unless both 4.1 and 4.2 have been achieved; hence we need two ‘√’s to indicate success.
Top-Down Derivations Written as Trees
A top-down derivation can be written in the form of a tree, whose nodes are labeled by the implications that appear as goals in the derivation. (Concerning trees, see 2.4.1; recall that ‘children’ is used also in the tree-terminology for the nodes that issue from some node.) Here is the tree-form of the last derivation:
            C → A |= (C → B) → (C → A∧B)
                         |
              C → A, C → B, C |= A∧B
                         |
                  C, A, B |= A∧B
                   /            \
        C, A, B |= A          C, A, B |= B
The general rule is very simple:
(i) The root is labeled by the initial goal. (ii) A node has as many children as the children of the goal that labels it. Each child-goal labels exactly one child-node.
In a complete derivation the leaves of the tree are labeled by implications considered to be obviously true. So far, we have restricted this category to implications in which the conclusion is one of the premises. Later, we shall add to it another type of implication.
Usually, an implication can be reduced to a simpler one in more than one way. The choices of the sentence, to which we apply laws, determine the resulting derivation. The implication proved in the last example is also provable as follows:
1. C → A |= (C → B) → (C → A∧B)
2. C → A, C → B, C |= A∧B    two applications of (|=, →),
3.1 C → A, C → B, C |= A    by (|=, ∧),
3.2 C → A, C → B, C |= B    by (|=, ∧),
4.1 C, A, C → B |= A    by disjoining, √
4.2 C → A, C, B |= B    by disjoining. √
Here, instead of applying disjoining to the second goal (as we did before) we apply to it (|=, ∧). As a result the branching occurs earlier. We then apply disjoining to each of the children, 3.1 and 3.2, getting 4.1 as the single child of 3.1, and 4.2 as a single child of 3.2. The tree form of this derivation is:
                 C → A |= (C → B) → (C → A∧B)
                              |
                   C → A, C → B, C |= A∧B
                     /                  \
    C → A, C → B, C |= A          C → A, C → B, C |= B
              |                             |
      C, A, C → B |= A              C → A, C, B |= B
Note that 4.1 is not the child of 3.2 (which is the line immediately preceding it). Neither is 4.2 the child of its immediate predecessor, 4.1. The child-parent relation is determined by the numbering, not merely by the order of lines. This is unavoidable when we use a sequential form to represent a tree. The general rule for numbering the goals will be given later.
Laws for Other Connectives and More on Top-Down Derivations
In analogy with the two laws for conjunction, we have a disjunction-premise law, denoted (∨, |=), which handles a disjunction that occurs as premise; and a disjunction-conclusion law, denoted (|=, ∨), which handles a disjunction that is a conclusion. Here they are.
(∨, |=) Γ, A ∨ B |= C ⇐⇒ Γ, A |= C and Γ, B |= C
(|=, ∨) Γ |= A ∨ B ⇐⇒ Γ, ¬A |= B
Note: In (∨, |=), like in (|=, ∧), we get (via the ⇐ direction) a reduction of a goal to two goals, both of which must be proved. In (|=, ∧) the English ‘and’ on the right-hand side corresponds to ∧ in the conclusion. But in (∨, |=) it corresponds to ∨ in the premise. Some students find this confusing. The reason for converting ∨ to ‘and’ is that ∨ occurs in the premise. In order to show that ‘... or ___’ implies something, we have to show that ‘...’ implies it and ‘___’ implies it.
(|=, ∨) can be proved by replacing A ∨ B by the equivalent ¬A → B and, via (|=, →), transferring ¬A to the premises.
Homework
4.4 Give an argument that proves the disjunction-premise law.
Note: To show that the premises imply A ∨ B, it is sufficient to show that they either imply A or imply B (because each of A, B, implies A ∨ B). Hence, we can have a disjunction-conclusion law with ‘or’ in the right-hand side. But such a law holds only in the ⇐-direction. The ⇒-direction does not hold in general. Γ may imply A ∨ B without implying any of the disjuncts A, B. For example, |= A ∨ ¬A (here Γ is empty), but from this it does not follow that either |= A, or |= ¬A. For if A is neither logically true nor logically false, then ⊭ A and ⊭ ¬A. Therefore you run some risk if, in order to show that Γ |= A ∨ B, you try to show that either Γ |= A or Γ |= B. For if Γ implies neither disjunct, you are sure to fail, even though Γ may imply the disjunction. For example, you will fail if you try to prove in this way that |= A ∨ ¬A. On the other hand, trying to show that Γ |= A ∨ B, by showing that Γ, ¬A |= B, is safe; for the second task is equivalent to the first.
Here is an example of a derivation that involves more than one branching.
1. A ∨ B |= (A → B) → [(B → A) → A∧B]
2. A ∨ B, A → B, B → A |= A∧B    two applications of (|=, →),
3.1 A, A → B, B → A |= A∧B    by (∨, |=),
3.2 B, A → B, B → A |= A∧B    by (∨, |=),
4.1 A, B, B → A |= A∧B    by disjoining,
5.11 A, B, B → A |= A    by (|=, ∧), √
5.12 A, B, B → A |= B    by (|=, ∧), √
4.2 A → B, B, A |= A∧B    by disjoining,
5.21 A → B, B, A |= A    by (|=, ∧), √
5.22 A → B, B, A |= B    by (|=, ∧). √
The following is the same derivation written as a labeled tree. For convenience, the line numbers—rather than the implications—have been used as labels.
The Rule for Numbering Nodes in a Tree: The following is a convenient numbering of sequentially arranged items that are labels of a tree, from which you can construct the tree.
• Numbers are of the form n.___, where n is a positive integer and ___ is a (possibly empty) sequence of positive integers.
• The root of the tree is numbered 1.
• If a node numbered n.___ has a single child, the child is numbered n+1.___ (i.e., the head is increased by 1 and the tail is left unchanged).
• If a node numbered n.___ has k children, where k > 1, they are numbered, according to their left-to-right order: n+1.___1, n+1.___2, ..., n+1.___k
Note that in n.___ the number n is the node’s level in the tree, i.e., the number of nodes on the branch leading to it from the root (including both ends). The number of digits after the point shows how many branchings, up to that node, have occurred on that branch.
In our case, a goal is substituted by one or two goals. Hence nodes do not have more than two children and the sequence after n. consists of 1s and 2s. The numbering rule applies equally to derivations in which goals are substituted, in one step, by more than two goals.
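The numbering rule can be read backwards to recover the tree from a sequential list of labels. The Python sketch below does this for the line numbers of the last derivation. It is an illustration under stated assumptions: labels are written as strings containing the point (so the root is "1."), every tail entry is a single digit (fewer than ten children per node), and the function name `parent` is ours.

```python
def parent(label, labels):
    """Return the label of the parent of `label`, or None for the root.

    A label such as "5.12" has head 5 (the level) and tail "12"
    (one digit for each branching on the way from the root).
    """
    head, tail = label.split(".")
    level = int(head)
    if level == 1:
        return None
    same_tail = f"{level - 1}.{tail}"
    if same_tail in labels:
        # The parent had a single child: head decreased by 1, tail unchanged.
        return same_tail
    # Otherwise the parent branched: drop the last digit of the tail.
    return f"{level - 1}.{tail[:-1]}"

# Line numbers of the derivation with two branchings, in their sequential order.
labels = ["1.", "2.", "3.1", "3.2", "4.1", "5.11", "5.12", "4.2", "5.21", "5.22"]
tree = {label: parent(label, labels) for label in labels}
```

For instance, `tree["4.2"]` is `"3.2"` although line 4.2 directly follows line 5.12 in the list, which is the point made above about sequential order versus the child-parent relation.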
Note: If the conclusion is (A∧B)∧C, then two applications of (|=, ∧) will reduce our initial goal to three, each with one of A, B, C as a conclusion. We may consider a law that achieves it in one step. Here it is convenient to disregard the grouping in the repeated conjunction:
Γ |= A ∧ B ∧ C ⇐⇒ Γ |= A and Γ |= B and Γ |= C
The same applies to longer conjunctions. Repeated disjunctions in the premises can be treated similarly. Such laws introduce branching into more than two branches. They are not included among our basic laws.
The conditional-premise and conditional-conclusion laws, denoted respectively as (→, |=) and (|=, →), are as follows:
(→, |=) Γ, A → B |= C ⇐⇒ Γ, ¬A |= C and Γ, B |= C
(|=, →) Γ |= A → B ⇐⇒ Γ, A |= B
The first is obtained by replacing the premise A → B by the equivalent ¬A ∨ B and applying the disjunction-premise law. The second is our old friend (7) (with the two sides of ⇐⇒ reversed).
Note: If, in addition to A → B, the premise-list contains A, we can apply disjoining:
Γ, A, A → B |= C ⇐⇒ Γ, A, B |= C
This is better than applying (→, |=). But whereas (→, |=) is always applicable to a premise that is a conditional, disjoining requires that the antecedent, A, be among the premises. It is good practice to apply disjoining as long as the premises contain a conditional and its antecedent; e.g., a list of premises of the form
Γ, A → B, B → C, A
can be reduced, by two applications of disjoining, to the equivalent and much simpler list
Γ, A, B, C
Sometimes (12∗) can be applied as well; e.g., relying on the obvious implication A |= A ∨ B, we can replace Γ, A, (A ∨ B) → C by the equivalent premise-list Γ, A, C.
The remaining binary connective, ↔, can be dealt with through replacing A ↔ B by the conjunction of two conditionals: (A → B) ∧ (B → A). Alternatively, we can employ directly the following laws:
(↔, |=) Γ, A ↔ B |= C ⇐⇒ Γ, A, B |= C and Γ, ¬A, ¬B |= C
(|=, ↔) Γ |= A ↔ B ⇐⇒ Γ, A |= B and Γ, B |= A
(↔, |=) is obtained by replacing A ↔ B by the equivalent (A∧B) ∨ (¬A∧¬B) and then applying the premise laws for disjunction and conjunction. (|=, ↔) is obtained by replacing the biconditional by a conjunction of two conditionals and applying the conclusion laws for conjunction and conditional.
Treatment of Negated Compounds
Negated sentential compounds are sentences either of the form ¬(¬A), or of the form ¬(A ∗ B), where ∗ is a binary connective. If a negated compound occurs either in the premises or as the conclusion, then, if it is ¬(¬A), the goal can be simplified by dropping the double negation. In the other case, we can push negation inside, using our old equivalence laws. This yields a sentence to which we can apply one of the previous implication laws. If ∗ is ∧ or ∨, the pushing inside is done via De Morgan’s laws. If it is →, the pushing inside is achieved via the equivalence:
(13) ¬(A → B) ≡ A ∧ ¬B
In the case of biconditional we can use each of the equivalences:
(14.i) ¬(A ↔ B) ≡ (A∧¬B) ∨ (¬A∧B)
(14.ii) ¬(A ↔ B) ≡ A ↔ ¬B
(If the negated biconditional is a premise, then, usually, (14.i) is more convenient; we then apply (∨, |=) and, to each of the resulting implications, we apply (∧, |=). If the negated biconditional is the conclusion, then (14.ii) is more convenient; we then apply (|=, ↔). Sometimes (14.ii) is more convenient when ¬(A ↔ B) is a premise: if the premise-list contains A, we can replace A ↔ ¬B, A by the equivalent list A, ¬B.)
Homework
4.5 Derive (14.ii) from (14.i) by pushing-in disjunction, deleting redundant conjuncts and replacing the remaining disjunctions by equivalent conditionals.
4.6 Establish the following implications via top-down derivations. You can use all the laws introduced so far, as well as substitution of equivalents. At the end you should have reduced the initial goal to a bunch of self-evident goals in which the conclusion is among the premises. (In some cases you might have to use (12∗).)
1. |= ¬(A → B) → A
2. |= A∧B → (A ↔ B)
3. A ∨ B, A → C |= ¬B → C
4. (A → B) → A |= A
5. |= [C → (¬A → B)] → [(C → A) ∨ (C → B)]
6. (A ∨ B) → A∧B |= A ↔ B
7. [A → (A ↔ B)] ∧ [B → (A ↔ B)] |= A ↔ B
8. |= A → (B → C) ↔ (A∧B → C)
4.2.3
Logically Inconsistent Premises
A premise-list is said to be logically inconsistent, or inconsistent for short, if it is impossible for all of them to be true, by virtue of pure logic. This is equivalent to saying that the conjunction of all premises is logically false. In the case of sentential logic, where the connectives are the only logical elements, we say that the premise-list is contradictory. Given a logically inconsistent premise-list, and any sentence B, it is impossible that all premises are true and B is false; because it is impossible that all premises are true. Hence, by the definition, an inconsistent premise-list implies any sentence: If Γ is inconsistent,
Γ |= B, for all B.
This is sometimes expressed by saying that a “contradiction implies everything”, and is often a source of misunderstandings. Some accept it as a strange, or “deep” truth of logic. And some may find it a defect of the system. Actually, there is no mystery and no ground for objection. Logical implication is a technical concept defined for particular purposes. It captures some of our intuitions concerning “implication”; but it does not, and is not intended to, capture all. Using “imply” in the somewhat vague, everyday sense, we will never say that the two contradictory premises, (i) John Kennedy was assassinated by Lee Oswald, (ii) John Kennedy was not assassinated by Lee Oswald, imply that once there was life on Mars. For we require some internal link between the premises and the conclusion. But there is nothing wrong in introducing a more technical variant of implication, well-defined in terms of possible truth-values, by which a contradiction does imply every sentence. (Similar points relating to logical equivalence have been discussed already.) The law that every sentence is implied by contradictory premises is very useful when it comes to deriving and checking logical implications. We shall regard the trivial instances of this law,
where the premise-list contains a sentence and its negation, as self-evident implications that need no further proof. This means that every implication of the form
Γ, A, ¬A |= B
is a possible termination point in a top-down derivation. We therefore allow two kinds of successful leaves (that can be marked by ‘√’): those in which the conclusion appears among the premises, and those in which the premises contain a sentence and its negation. Here is an example showing the use of the second kind.
1. (C → A) ∨ (C → B) |= C → A∨B
2. (C → A) ∨ (C → B), C |= A ∨ B    by (|=, →),
3. (C → A) ∨ (C → B), C, ¬A |= B    by (|=, ∨),
4.1 C → A, C, ¬A |= B    by (∨, |=),
4.2 C → B, C, ¬A |= B    by (∨, |=),
5.1 C, A, ¬A |= B    by disjoining, √
5.2 C, B, ¬A |= B    by disjoining. √
5.1 is marked as successful, because the premises contain a sentence and its negation; 5.2 is marked because the conclusion is among the premises.
Note: We can now derive disjoining from the other laws as follows. By (→, |=):
Γ, A → B, A |= C iff Γ, ¬A, A |= C and Γ, B, A |= C.
But the first implication on the right-hand side is obvious. Hence,
Γ, A → B, A |= C iff Γ, B, A |= C.
Homework
4.7 Show, via top-down derivations, that the following implications obtain. The final goals should be of the two allowed kinds of self-evident implications.
1. |= (A → ¬A) → ¬A
2. |= (A → B) → (¬B → ¬A)
3. A∨B → C |= (A → C) ∧ (B → C)
4. |= (A → B) ∧ (¬A → B) → B
5. A → B∧C, (B → D) ∨ (C → D) |= A → C
6. A → B, B → ¬B |= ¬A
7. A → B∧C, B → ¬C |= ¬A
8. A∧B → C, A∧¬B → C |= A → C
4.3
A Fool-Proof Method for Finding Proofs and Counterexamples
4.3.1
Validity and Counterexamples
All the preceding implication laws, and the equivalence laws of chapter 3, can be regarded as general schemes. They hold no matter what sentences we substitute for the sentential variables. Therefore the derivable implications are schemes as well; having proved an implication we have also proved all its instantiations. To be precise, we should consider here (as we have done before) the sentential expressions. We can consider lists of sentential expressions. Call them premise expressions when they occur on the left-hand side of an implication. A list of premise expressions tautologically implies a sentential expression if:
There is no truth-value assignment to the sentential variables under which all the premise expressions get T and the conclusion expression gets F.
An implication (one that consists of expressions) is tautologically valid, or valid for short, if the list of premise expressions tautologically implies the conclusion expression. An implication is therefore non-valid when there is a truth-value assignment to the sentential variables under which all the premise expressions get T and the conclusion expression gets F. Such an assignment is called a counterexample. Hence an implication is valid just when it has no counterexamples.
Obviously, a non-valid implication between sentential expressions fails as an implication between sentences, if we substitute the sentential variables by distinct atomic sentences. Alternatively, we can get a failed implication between sentences as follows: Substitute every sentential variable that gets T in the counterexample by a tautology, and every sentential variable that gets F by a contradiction.
Example:
A → B, A∨C |= B
is not valid, because it has a counterexample:
A B C
F F T
If we assume that A = B = D ∧ ¬D and C = D → D, the implication fails as an implication between sentences:
(D∧¬D) → (D∧¬D), (D∧¬D) ∨ (D → D) ⊭ D∧¬D
But A, B and C can be other sentences for which the implication holds, e.g., if C = A ∨ B (check it for yourself).
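The example can be checked mechanically by listing all assignments under which the premises get T and the conclusion gets F. The Python sketch below does this for the scheme and for the instance with C = A ∨ B; the function name `counterexamples` and the encoding of expressions as functions are assumptions made for the illustration.

```python
from itertools import product

def counterexamples(premises, conclusion, variables):
    """List every assignment giving T to all premises and F to the conclusion."""
    found = []
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premises) and not conclusion(v):
            found.append(v)
    return found

a_cond_b = lambda v: (not v["A"]) or v["B"]      # A → B
a_or_c = lambda v: v["A"] or v["C"]              # A ∨ C
b = lambda v: v["B"]                             # B

# The scheme A → B, A ∨ C |= B, over the variables A, B, C.
scheme = counterexamples([a_cond_b, a_or_c], b, ["A", "B", "C"])

# The instance with C = A ∨ B; only A and B remain as variables.
a_or_a_or_b = lambda v: v["A"] or (v["A"] or v["B"])
instance = counterexamples([a_cond_b, a_or_a_or_b], b, ["A", "B"])
```

Here `scheme` contains the single assignment A = F, B = F, C = T given above, while `instance` is empty, so the implication holds when C = A ∨ B.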
‘Implication’ has therefore double usage: when the premises and the conclusion are unspecified sentences, and when they are sentential expressions. There is no danger of confusion, because the context will make the reading clear. If we say that an implication (e.g., the one in the last example) may or may not hold, then we are obviously referring to unspecified sentences. But if we say that it is not valid we are saying that the scheme does not hold in general, that is, as an implication between sentential expressions, it has a counterexample. By the same token we say that x · y > x + y may or may not hold, depending on the numbers x and y, but that x² + 1 > x is a valid numerical inequality.
Equivalence for Counterexamples
We have seen that in each of our laws the left-hand side implication holds iff all the right-hand side implications hold. Each of our laws satisfies also the following.
Counterexample Equivalence: A truth-value assignment to the sentential variables is a counterexample to the left-hand side iff it is a counterexample to at least one of the implications on the right-hand side.
Counterexample equivalence can be inferred from the following two facts: (I) In each law, the equivalence of the two sides is preserved under all substitutions. (II) An implication is non-valid iff it has an instantiation that fails as an implication between sentences. Alternatively, the arguments that prove the equivalence of the two sides can be used to show their counterexample equivalence. As an illustration we show this for (|=, →) and for (∨, |=).
(|=, →) Γ |= A → B ⇐⇒ Γ, A |= B
A truth-value assignment (to the sentential variables) is a counterexample to the left-hand side, iff all members of Γ get T and A → B gets F. But this is equivalent to saying that all members of Γ get T, A gets T, and B gets F; which is exactly the condition for a counterexample to the right-hand side.
(∨, |=) Γ, A ∨ B |= C ⇐⇒ Γ, A |= C and Γ, B |= C
A truth-value assignment is a counterexample to the left-hand side, iff all sentences in Γ get T, A∨B gets T, and C gets F. But A∨B gets T, iff either A gets T, or B gets T (or both). If A gets T, then this is a counterexample to Γ, A |= C, and if B gets T, it is a counterexample to Γ, B |= C. Vice versa, a counterexample to one of the right-hand side implications assigns T to all members of Γ and to A ∨ B, and assigns F to C.
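Counterexample equivalence for (∨, |=) can be observed on a concrete instance by comparing sets of counterexamples. In the Python sketch below, Γ is taken to consist of the single premise A → C; this choice, like the name `counterexamples`, is only an assumption for the sake of illustration.

```python
from itertools import product

VARS = ("A", "B", "C")

def counterexamples(premises, conclusion):
    """Set of assignments (tuples of values for A, B, C) that give T to
    all premises and F to the conclusion."""
    found = set()
    for values in product([True, False], repeat=len(VARS)):
        v = dict(zip(VARS, values))
        if all(p(v) for p in premises) and not conclusion(v):
            found.add(values)
    return found

A = lambda v: v["A"]
B = lambda v: v["B"]
C = lambda v: v["C"]
a_or_b = lambda v: v["A"] or v["B"]
gamma = [lambda v: (not v["A"]) or v["C"]]     # sample Γ: the one premise A → C

left = counterexamples(gamma + [a_or_b], C)    # Γ, A ∨ B |= C
right_a = counterexamples(gamma + [A], C)      # Γ, A |= C
right_b = counterexamples(gamma + [B], C)      # Γ, B |= C

# Counterexample equivalence: left equals the union of the two right-hand sets.
equivalent = left == (right_a | right_b)
```

In this instance the first child goal is valid (`right_a` is empty) and the single counterexample A = F, B = T, C = F of the parent is exactly the counterexample of the second child.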
Using top-down derivations, we will define a method that decides for any implication whether or not it is valid. Given an implication, the method is guaranteed to produce either a proof of it or a counterexample.
4.3.2
The Basic Laws
The method uses a finite number of basic laws. Some were mentioned before and some are easily obtained from previously mentioned laws. Not all laws considered above are taken as basic. The basic laws are naturally classified as follows. First, there is a law that enables trivial rearrangements of premise lists:
If every sentence occurring in Γ occurs in Γ′ and every sentence occurring in Γ′ occurs in Γ, then for every A,
Γ |= A ⇐⇒ Γ′ |= A
Using this law we can reorder the premises, delete repetitions, or list any premise more than once. Henceforth, such reorganizing will be carried out without explicit mention. Next, we designate two types of implications as self-evident:
Self-Evident Implications
Γ, A |= A
Γ, A, ¬A |= B
Implications belonging to these two types play a role similar to that of axioms. In bottom-up proofs they serve as starting points. In top-down ones they are the final successful goals. The rest, referred to as reduction laws, are the basis of the method. They enable us to replace a goal by simpler goals. The first group consists of laws that handle sentential compounds A ∗ B. For each binary connective, ∗, we have a premise law (∗, |=) and a conclusion law (|=, ∗), which handle, respectively, ∗-compounds that appear as a premise, or as the conclusion. The second group, which deals with negated compounds, is presented later.
Laws for Conjunction
(∧, |=)  Γ, A ∧ B |= C  ⇐⇒  Γ, A, B |= C
(|=, ∧)  Γ |= A ∧ B  ⇐⇒  Γ |= A and Γ |= B

Laws for Disjunction
(∨, |=)  Γ, A ∨ B |= C  ⇐⇒  Γ, A |= C and Γ, B |= C
(|=, ∨)  Γ |= A ∨ B  ⇐⇒  Γ, ¬A |= B

Laws for Conditional
(→, |=)  Γ, A → B |= C  ⇐⇒  Γ, ¬A |= C and Γ, B |= C
(|=, →)  Γ |= A → B  ⇐⇒  Γ, A |= B

Laws for Biconditional
(↔, |=)  Γ, A ↔ B |= C  ⇐⇒  Γ, A, B |= C and Γ, ¬A, ¬B |= C
(|=, ↔)  Γ |= A ↔ B  ⇐⇒  Γ, A |= B and Γ, B |= A
The goal-reduction process is as described in 4.3. To recap: a step consists of replacing the left-hand side of a reduction law (the goal that is being reduced) by the right-hand side. The ⇐-direction guarantees that proving the new goals (or goal) is sufficient for proving the old goal; the ⇒-direction means that they are also implied by it. All counterexamples to a new goal are also counterexamples to the old one, and any counterexample to the old one is obtained in this way. A goal’s children are the goals that replace it. (If there is one goal on the right, there is only one child). It follows from the above that if the children are valid so is the parent. Consequently, if all the leaf goals are valid, so are their parents, and the parents of their parents, and so on, up to the initial goal. On the other hand, any counterexample to one of the leaf goals is also a counterexample to one (or more) of their parents, hence also to the parent’s parent, and so on, up to the initial goal. And any counterexample to the original goal is a counterexample to a goal in some leaf.
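The counterexample equivalence that drives each reduction step can be spot-checked mechanically. The following is a minimal sketch, not part of the method itself: formulas are encoded as nested tuples (an encoding assumed here for illustration), and the variable G is a hypothetical stand-in for "every sentence in Γ gets T". The helper names `ev`, `counterexamples` and `same_counterexamples` are likewise illustrative.

```python
from itertools import product

VARS = ["G", "A", "B", "C"]  # G stands in for "every sentence of Γ is true"

def ev(f, v):
    """Truth-value of formula f (a variable name or a nested tuple) under v."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not ev(f[1], v)
    a, b = ev(f[1], v), ev(f[2], v)
    return {"and": a and b, "or": a or b, "imp": (not a) or b, "iff": a == b}[f[0]]

def counterexamples(premises, conclusion):
    """Assignments giving T to every premise and F to the conclusion."""
    found = set()
    for values in product([True, False], repeat=len(VARS)):
        v = dict(zip(VARS, values))
        if all(ev(p, v) for p in premises) and not ev(conclusion, v):
            found.add(values)
    return found

def same_counterexamples(parent, children):
    """True iff the parent's counterexamples are exactly those of its children."""
    union = set()
    for premises, conclusion in children:
        union |= counterexamples(premises, conclusion)
    return counterexamples(*parent) == union

# (→, |=):  Γ, A → B |= C  ⇐⇒  Γ, ¬A |= C  and  Γ, B |= C
cond_premise = same_counterexamples(
    (["G", ("imp", "A", "B")], "C"),
    [(["G", ("not", "A")], "C"), (["G", "B"], "C")])

# (|=, ∨):  Γ |= A ∨ B  ⇐⇒  Γ, ¬A |= B
disj_conclusion = same_counterexamples(
    (["G"], ("or", "A", "B")),
    [(["G", ("not", "A")], "B")])

# A deliberately wrong "law" for contrast: treating a premise A ∨ B like A ∧ B.
bogus = same_counterexamples(
    (["G", ("or", "A", "B")], "C"),
    [(["G", "A", "B"], "C")])

print(cond_premise, disj_conclusion, bogus)  # True True False
```

The bogus law fails because it loses the counterexamples in which exactly one of A, B gets T; a genuine reduction law neither loses nor adds counterexamples.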
We still need laws for reducing negated compounds, sentences of the form ¬¬A or ¬(A ∗ B). The laws for double negation allow us to drop it, either in a premise or in the conclusion. Compounds of the form ¬(A ∗ B) can be treated in the way described in 4.2.2 (page 124), i.e., by pushing the negation inside. This means that we use laws such as:
Γ, ¬(A ∧ B) |= C  ⇐⇒  Γ, ¬A ∨ ¬B |= C
We use similar laws for pushing negation inside ¬(A ∨ B), ¬(A → B), and ¬(A ↔ B). There is a more elegant way: Combine in a single law the pushing of negation and the law that applies to the resulting compound. For ¬(A ∧ B) this yields:
Γ, ¬(A ∧ B) |= C  ⇐⇒  Γ, ¬A |= C and Γ, ¬B |= C.
Doing so for all connectives, we get reduction laws for negated compounds, of the same type as our previous ones. It is easy to see that counterexample equivalence also holds for the second group, because these laws are obtained from the first group by replacing sentential expressions with logically equivalent ones; such substitutions do not change the sets of counterexamples.
Laws for Negated Negations
(¬¬, |=)  Γ, ¬¬A |= B  ⇐⇒  Γ, A |= B
(|=, ¬¬)  Γ |= ¬¬B  ⇐⇒  Γ |= B

Laws for Negated Conjunctions
(¬∧, |=)  Γ, ¬(A ∧ B) |= C  ⇐⇒  Γ, ¬A |= C and Γ, ¬B |= C
(|=, ¬∧)  Γ |= ¬(A ∧ B)  ⇐⇒  Γ, A |= ¬B

Laws for Negated Disjunctions
(¬∨, |=)  Γ, ¬(A ∨ B) |= C  ⇐⇒  Γ, ¬A, ¬B |= C
(|=, ¬∨)  Γ |= ¬(A ∨ B)  ⇐⇒  Γ |= ¬A and Γ |= ¬B

Laws for Negated Conditionals
(¬→, |=)  Γ, ¬(A → B) |= C  ⇐⇒  Γ, A, ¬B |= C
(|=, ¬→)  Γ |= ¬(A → B)  ⇐⇒  Γ |= A and Γ |= ¬B

Laws for Negated Biconditionals
(¬↔, |=)  Γ, ¬(A ↔ B) |= C  ⇐⇒  Γ, A, ¬B |= C and Γ, ¬A, B |= C
(|=, ¬↔)  Γ |= ¬(A ↔ B)  ⇐⇒  Γ, A |= ¬B and Γ, ¬B |= A
This completes the set of reduction laws.
Branching Laws: A law whose right-hand side has more than one implication is called a branching law. Each application of a branching law causes branching in the tree. The branching laws are, for non-negated compounds: (|=, ∧), (∨, |=), (→, |=), (↔, |=) and (|=, ↔). For negated compounds they are: (¬∧, |=), (|=, ¬∨), (|=, ¬→), (¬↔, |=) and (|=, ¬↔). The other laws are referred to as non-branching.
Memorization: You do not have to memorize all the laws. A useful strategy is to memorize only four: the two laws for conjunction, the premise-disjunction law, (∨, |=), and the conclusion-conditional law, (|=, →). The rest you can get by obvious substitutions of equivalents: the conclusion-disjunction law, by rewriting A ∨ B as ¬A → B; the premise-conditional law, by rewriting A → B as ¬A ∨ B; the premise-biconditional law, by rewriting the biconditional as (A ∧ B) ∨ (¬A ∧ ¬B); and the conclusion-biconditional law, by rewriting it as (A → B) ∧ (B → A). Besides the double negation laws, the other laws for negated compounds are obtained by pushing negation in, as indicated earlier.
Elementary Implications
Elementary implications are those that cannot be simplified through reduction laws. An implication is elementary if every sentential expression figuring in it is either a sentential variable or a negation of one. This is equivalent to saying that it contains neither binary nor negated compounds. Here are some examples.
A, ¬B, C, ¬D |= ¬B
¬A, C, B, ¬C |= D
¬A, B, ¬C, D |= ¬E
Claim:
(I) If an elementary implication is valid, then it is self-evident, i.e., either the conclusion occurs as a premise or the premises contain a sentential expression and its negation.
(II) If an elementary implication is not self-evident then there is a unique assignment to its sentential variables that constitutes a counterexample. This assignment is determined by the following conditions:
(i) Every sentential variable that occurs unnegated in the premises gets T, and every sentential variable that occurs negated in the premises gets F.
(ii) The sentential variable of the conclusion gets F if it occurs unnegated, and T if it occurs negated.
Proof: Assume that an elementary implication is not self-evident and show that the conditions in (II) determine an assignment, and that the assignment is the unique counterexample. For any given assignment the following is obvious: All the premises get T iff the assignment satisfies (i). The conclusion gets F iff the assignment satisfies (ii). There is at most one assignment, to the sentential variables occurring in the implication, that satisfies (i) and (ii); because (i) and (ii) prescribe truth-values to all these sentential variables. Hence there is a counterexample iff there is an assignment satisfying (i) and (ii); the counterexample is then unique. The only way in which (i) and (ii) can fail to determine an assignment is by prescribing more than one truth-value for the same sentential variable. This does not happen unless the
implication is self-evident. For if it is not, no sentential variable occurs in the premises both negated and unnegated; hence (i) assigns to each sentential variable occurring in the premises exactly one value. Next, if the variable of the conclusion does not occur in the premises, then only (ii) gives it a value. If it occurs in the premises, it must be either negated in the premises and unnegated in the conclusion, or unnegated in the premises and negated in the conclusion. Otherwise the conclusion is among the premises and the implication is self-evident. Hence (ii) and (i) assign to it the same value. QED
In order to check the validity of an elementary implication, we therefore check if it is self-evident. If it is not, then (i) and (ii) in (II) tell us what the counterexample is. Of the three elementary implications given above, the first two are self-evident. The counterexample to the third is:
A B C D E
F T F T T
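Conditions (i) and (ii) translate directly into a short procedure. The sketch below assumes, purely for illustration, that literals are written as strings such as "A" and "¬A"; the function name `check_elementary` is hypothetical. It returns None for a self-evident implication and the unique counterexample otherwise.

```python
def check_elementary(premises, conclusion):
    """Return None if the elementary implication is self-evident,
    otherwise the unique counterexample as a dict (True = T, False = F)."""
    def parse(literal):
        # "¬A" -> ("A", False), "A" -> ("A", True)
        return (literal[1:], False) if literal.startswith("¬") else (literal, True)

    assignment = {}
    # Condition (i): every premise must get T.
    for name, positive in map(parse, premises):
        if assignment.setdefault(name, positive) != positive:
            return None  # a variable occurs both negated and unnegated
    # Condition (ii): the conclusion must get F.
    name, positive = parse(conclusion)
    if assignment.setdefault(name, not positive) != (not positive):
        return None  # the conclusion occurs among the premises
    return assignment

first = check_elementary(["A", "¬B", "C", "¬D"], "¬B")
second = check_elementary(["¬A", "C", "B", "¬C"], "D")
third = check_elementary(["¬A", "B", "¬C", "D"], "¬E")
print(first, second)  # None None
print(third)  # {'A': False, 'B': True, 'C': False, 'D': True, 'E': True}
```

The three calls correspond to the three example implications; the result for the third agrees with the table above.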
We can now assemble all the pieces and sum up the method.
4.3.3 The Fool-Proof Method
To check any given implication
Γ |= A,
we take it as the initial goal and proceed top-down, by applying the premise and conclusion laws for binary connectives and for negated compounds. As long as there is a goal containing a binary compound, or a negation of one, or a double negation, we can continue. Such a process cannot go on indefinitely, because the goals become smaller. Intuitively this is clear. The mathematical proof of this will not be given here. (The proof is not trivial because as the goals become smaller, their number can increase; for precise inductive arguments see 6.2.4 page 232, and 6.2.5 page 239.)
When the process terminates, we get a top-down derivation tree in which all the goals in the leaves are elementary. If all are self-evident we get a top-down derivation of the initial implication, which can be turned upside down into a bottom-up proof. Otherwise, there are terminal goals that are not self-evident. Each of these yields a unique counterexample to the initial goal. All the counterexamples to the initial goal are obtained in this way.
Note: Our listed laws are sufficient for deriving all valid implications. For if an implication is not derivable, the resulting tree gives us a counterexample. This shows that non-derivable implications are not valid.
Consequently, there is no need to use any other laws or to rely on substitution of equivalent components. In practice, however, you can legitimately apply other established laws, such as disjoining ((12) of 4.2.1), and you may use substitutions of equivalents (where the equivalences have been proven already), in order to shorten the proof.
Note: In actual applications, you need not go all the way. The process can stop at the stage where all the goals are self-evident, even if they are not elementary. In the first example of 4.2.2 the final goals are elementary; in the second and the third they are not. Also, once you get an elementary implication that is not self-evident, you have your counterexample and you can stop. But if you want to get all counterexamples to the initial goal, you should get all non-valid elementary implications of the tree.
Here are two examples. In the first we get a proof, in the second a counterexample. The law that applies at each step is not indicated, but you can figure it out. The sentence to which a law is applied is underlined.
1. A → B, C → (A ∨ B) |= C → B
2. A → B, C → (A ∨ B), C |= B
3.1 ¬A, C → (A ∨ B), C |= B
3.2 B, C → (A ∨ B), C |= B √
4.11 ¬A, ¬C, C |= B √
4.12 ¬A, A ∨ B, C |= B
5.121 ¬A, A, C |= B √
5.122 ¬A, B, C |= B √
Note that had we employed law (12) of 4.2.1, we could have replaced C → (A ∨ B), C
by
C, A ∨ B ,
which would have eliminated 4.11, making 4.12 the sole child of 3.1.
1. A → (B ∨ C), B → (A ∧ C) |= ¬C → A
2. A → (B ∨ C), B → (A ∧ C), ¬C |= A
3.1 ¬A, B → (A ∧ C), ¬C |= A
3.2 B ∨ C, B → (A ∧ C), ¬C |= A
4.11 ¬A, ¬B, ¬C |= A ×
4.12 ¬A, A ∧ C, ¬C |= A
Here the derivation stopped after yielding the non-valid elementary 4.11. The corresponding counterexample is:
A B C
F F F
You can easily see that this is also a counterexample to 1. If you continue to reduce the remaining goals, 3.2 and 4.12, you will see that both of them are valid, hence this is the only counterexample to our initial goal.
Homework 4.8 Write the last two top-down derivations in tree form, using line numbers to label the nodes. Write to the side of each node the law by which it has been derived from its parent.
Some Noteworthy Properties of the Method
• Given any occurrence of a sentential expression in a goal, there is at most one law that can be applied to it and the result of the application is unique.
• One cannot go wrong by applying the reduction laws in any order, until all goals are elementary. (But the choice, at each stage, of where to apply a reduction law can have considerable effect on the derivation’s length. If, in the first of the last two examples, we had started by treating the leftmost premise, A → B, and had followed this by treating, on each of the two branches, C → (A ∨ B), we would have had four branches right at the beginning and the number of lines would have been 13, instead of 8.)
• The laws that deal with a binary connective do not introduce any other connective except, possibly, negation. Therefore, no other connectives, besides negation and those appearing in the initial goal, appear in the derivation. Consequently, given any set of connectives that contains negation, the laws for the connectives of the set are sufficient for deriving all tautological implications of the subsystem that is based on these connectives. For example, if we restrict our system to sentences whose connectives are ¬ and ↔ only, the double negation laws, and the premise and conclusion laws for biconditionals and negated biconditionals, are sufficient.
Note: The validity of a given implication can be settled also through truth-tables. We make a table for all the occurring sentential expressions; then we check, row after row, whether all the premises get T and the conclusion gets F. If we find such a row, we get a counterexample. If not, the implication is valid. But the execution of this “brute force” checking is tedious, prone to mistakes and, often, more time consuming.
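The whole method of 4.3.3 can be sketched as a short recursive program. This is an illustrative sketch under assumptions, not a definitive implementation: formulas are encoded as nested tuples, the conclusion is always reduced before the premises (any order would do), and all function names are hypothetical. Variables that drop out along a branch are simply left unassigned in the counterexample found at its leaf.

```python
def neg(f):
    return ("not", f)

def is_literal(f):
    """A sentential variable or the negation of one."""
    return isinstance(f, str) or (f[0] == "not" and isinstance(f[1], str))

def split(f):
    """Classify a compound, e.g. ('and', A, B), ('not-or', A, B), ('not-not', A, None)."""
    if f[0] == "not":
        g = f[1]
        return ("not-not", g[1], None) if g[0] == "not" else ("not-" + g[0], g[1], g[2])
    return f

def premise_law(f):
    """Each inner list replaces premise f in one child goal."""
    kind, a, b = split(f)
    return {
        "and": [[a, b]], "or": [[a], [b]],
        "imp": [[neg(a)], [b]], "iff": [[a, b], [neg(a), neg(b)]],
        "not-not": [[a]], "not-and": [[neg(a)], [neg(b)]],
        "not-or": [[neg(a), neg(b)]], "not-imp": [[a, neg(b)]],
        "not-iff": [[a, neg(b)], [neg(a), b]],
    }[kind]

def conclusion_law(f):
    """Each pair is (premises to add, new conclusion) for one child goal."""
    kind, a, b = split(f)
    return {
        "and": [([], a), ([], b)], "or": [([neg(a)], b)],
        "imp": [([a], b)], "iff": [([a], b), ([b], a)],
        "not-not": [([], a)], "not-and": [([a], neg(b))],
        "not-or": [([], neg(a)), ([], neg(b))],
        "not-imp": [([], a), ([], neg(b))],
        "not-iff": [([a], neg(b)), ([neg(b)], a)],
    }[kind]

def leaf_counterexample(premises, conclusion):
    """Conditions (i) and (ii) for an elementary goal; None if self-evident."""
    wanted = {}
    for lit, value in [(p, True) for p in premises] + [(conclusion, False)]:
        name, target = (lit, value) if isinstance(lit, str) else (lit[1], not value)
        if wanted.setdefault(name, target) != target:
            return None
    return wanted

def counterexamples(premises, conclusion):
    """Reduce the goal top-down; collect the counterexamples found at the leaves."""
    if not is_literal(conclusion):
        children = [(premises + extra, c) for extra, c in conclusion_law(conclusion)]
    else:
        for i, p in enumerate(premises):
            if not is_literal(p):
                rest = premises[:i] + premises[i + 1:]
                children = [(rest + new, conclusion) for new in premise_law(p)]
                break
        else:  # no compound left: the goal is elementary
            cx = leaf_counterexample(premises, conclusion)
            return [] if cx is None else [cx]
    found = []
    for child in children:
        found += [cx for cx in counterexamples(*child) if cx not in found]
    return found

first = counterexamples([("imp", "A", "B"), ("imp", "C", ("or", "A", "B"))],
                        ("imp", "C", "B"))
second = counterexamples([("imp", "A", ("or", "B", "C")), ("imp", "B", ("and", "A", "C"))],
                         ("imp", ("not", "C"), "A"))
print(first)   # []
print(second)  # [{'C': False, 'A': False, 'B': False}]
```

An empty list means that every leaf is self-evident, so the implication is valid. The two calls reproduce the two examples above: the first is valid, and the second has the single counterexample in which A, B and C all get F.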
A most important feature of the method is that it generalizes to richer systems where truth-tables are not available. As we shall mention later (in 9.3.3) there is no method for first-order logic that is guaranteed to produce, in a finite number of steps, either a proof or a counterexample. But there is one that is guaranteed to produce a proof if the implication is a logical implication. One of the proofs of this result is obtained by extending the present method and by using the same type of arguments that show its adequacy for the sentential case.
Homework 4.9 Give, for each of the following implication claims, a top-down derivation or a counterexample. To cut short the construction, you can use at your convenience additional laws, besides the basic ones, as well as simple substitutions of equivalents.
1. |= [A → (B → C)] → [(A → B) → (A → C)]
2. |= A ∨ (¬A ∧ C) → (¬A → C)
3. A → B, B → C, C → A ∨ B |= C ↔ (A ∨ B)
4. A ∧ B, B → (C ∨ ¬D), D → ¬C |= C
5. A ∨ ¬B, (B → C) → D |= A ∨ D
6. A ↔ B, B ∨ C, A → ¬C |= A ∨ (C ∧ ¬B)
7. |= ((A → B) → C) → (B → C)
8. A → B ∧ C, (B ∨ C) → D |= (D → A) → (B ↔ C)
9. A → (B ∨ C), ¬B ∨ ¬C |= ¬A
10. B ∧ C → A, (A ∨ B) → C |= B → A
11. A ∨ (B ∧ C), B ∨ (A ∧ C) |= (A ∨ B) ∧ C
4.4 Proofs by Contradiction
4.4.0 The following is easily established:
(15) Γ |= A iff Γ, ¬A is logically inconsistent.
The left-hand side holds just when it is impossible that all the sentences in Γ be true and A be false; this is the same as saying that it is impossible that all sentences in Γ, ¬A be true. Furthermore, we have:
(16) If C is any contradiction, then Γ is logically inconsistent iff Γ |= C.
Again, this is obvious: A logically inconsistent premise-list implies all sentences, in particular all contradictions. Vice versa, if Γ implies a contradiction, then it is impossible that all premises in Γ be true, for then the contradiction would have to be true as well. From (15) and (16) we get:
(17) If C is any contradiction, then: Γ |= A ⇐⇒ Γ, ¬A |= C.
(17) gives us a way of proving that Γ |= A: Add ¬A to Γ and show that the resulting list implies a contradiction. Such a proof is called proof by contradiction. We can choose any contradiction as C. The most common one is a sentence of the form B ∧ ¬B. But instead of using particular contradictions, it is convenient to introduce a special contradiction symbol that denotes a sentence which, by definition, gets only the value F. The symbol to be used is:
⊥
You can think of ‘⊥’ as denoting, ambiguously, any contradiction. But we shall employ it in a restricted way: it cannot occur among the premises but only as the right-hand side of ‘|=’:
Γ |= ⊥.
This is simply a way of saying that Γ is logically (or, in the special case of sentential logic, tautologically) inconsistent. ⊥ can be replaced, if one wishes, by any particular contradiction. With this notation (17) becomes:
(18) Γ |= A ⇐⇒ Γ, ¬A |= ⊥
It is easily seen that the two sides of (18) are counterexample equivalent (i.e., have the same counterexamples). All our previous premise laws apply to implications of the form ‘Γ |= ⊥’ (because ⊥ can be replaced by any contradictory sentence). E.g.,
Γ, A ∨ B |= ⊥  ⇐⇒  Γ, A |= ⊥ and Γ, B |= ⊥
Our previous notions of a self-evident implication and of an elementary implication carry over, in an obvious way, to implications of the form Γ |= ⊥. The implication is self-evident just when Γ contains a sentence and its negation. It is elementary if all the premise expressions are unnegated or negated sentential variables. There is now only one kind of self-evident implication, because cases in which the conclusion is among the premises are excluded by the restriction on ‘⊥’. The claim that an elementary implication is valid iff it is self-evident now has a simpler proof: Assume that the elementary implication is not self-evident. Assign T to every sentential variable appearing unnegated in the premise-list, and assign F to those that appear negated. Since no variable appears both unnegated and negated, each is assigned a single value. This assignment makes all premises true, thereby constituting a counterexample.
4.4.1 The Fool-Proof Method for Proofs by Contradiction
Our previous top-down method can be adapted to proofs by contradiction. Given an initial goal:
Γ |= A
we start by replacing it with the equivalent goal:
Γ, ¬A |= ⊥
Then we proceed to reduce this goal to simpler goals by applying our premise laws to binary compounds, to their negations, and to double negations. (If the initial goal is Γ |= ⊥ we start the reductions right away.) All the resulting goals have ‘⊥’ on the right-hand side. We can continue until all the goals are elementary. If all are self-evident we get a proof of the initial goal. Otherwise, every elementary implication that is not self-evident yields a counterexample.
Here are two illustrations. In the first the method yields a proof:
1. ¬(¬A ∨ B), C → B |= ¬(A → C)
2. ¬(¬A ∨ B), C → B, ¬¬(A → C) |= ⊥
3. ¬(¬A ∨ B), C → B, A → C |= ⊥
4. ¬¬A, ¬B, C → B, A → C |= ⊥
5. A, ¬B, C → B, A → C |= ⊥
6.1 A, ¬B, ¬C, A → C |= ⊥
6.2 A, ¬B, B, A → C |= ⊥ √
6.11 A, ¬B, ¬C, ¬A |= ⊥ √
6.12 A, ¬B, ¬C, C |= ⊥ √
Note that we could have shortened the derivation had we used (12) (which is not among our basic rules). An application of (12) to 5. yields the equivalent goal:
A, ¬B, C → B, C |= ⊥
which by another application of (12) becomes self-evident:
A, ¬B, B, C |= ⊥
In the second example, the method yields a counterexample:
1. A → B, ¬(B → C) |= A
2. A → B, ¬(B → C), ¬A |= ⊥
3. A → B, B, ¬C, ¬A |= ⊥
4.1 ¬A, B, ¬C, ¬A |= ⊥ ×
4.2 B, B, ¬C, ¬A |= ⊥ ×
(The repeated occurrences of premises, in the last two goals, could have been deleted.) Note that 4.1 and 4.2 are conjointly equivalent to the initial goal. Each has a counterexample. But their counterexamples are the same, namely:
A B C
F T F
Therefore, this is the only counterexample to the original implication. You can check that it is indeed a counterexample to the initial goal, by constructing a truth-table (for the three sentential expressions) and noting that the row corresponding to that assignment is one in which all the premises get T and the conclusion gets F. You can moreover check that this is the only row with that property.
The proof-by-contradiction variant uses fewer basic laws than our previous method. All the reduction laws are premise laws. On the other hand, it may occasionally require more steps. The basic laws for top-down proofs by contradiction are given on pages 141, 142. Except for the law for trivial rearrangements of the premises, no other laws are needed.
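Since the proof-by-contradiction variant needs premise laws only, it admits an especially compact sketch. As before, the nested-tuple encoding of formulas and the names `refute` and `check` are assumptions made for illustration; the premise laws themselves are the ones listed in this section.

```python
def neg(f):
    return ("not", f)

def is_literal(f):
    """A sentential variable or the negation of one."""
    return isinstance(f, str) or (f[0] == "not" and isinstance(f[1], str))

def premise_law(f):
    """Each inner list replaces premise f in one child goal."""
    if f[0] == "not":
        g = f[1]
        if g[0] == "not":                      # (¬¬, |=)
            return [[g[1]]]
        a, b = g[1], g[2]
        return {
            "and": [[neg(a)], [neg(b)]],       # (¬∧, |=)
            "or": [[neg(a), neg(b)]],          # (¬∨, |=)
            "imp": [[a, neg(b)]],              # (¬→, |=)
            "iff": [[a, neg(b)], [neg(a), b]], # (¬↔, |=)
        }[g[0]]
    a, b = f[1], f[2]
    return {
        "and": [[a, b]],                       # (∧, |=)
        "or": [[a], [b]],                      # (∨, |=)
        "imp": [[neg(a)], [b]],                # (→, |=)
        "iff": [[a, b], [neg(a), neg(b)]],     # (↔, |=)
    }[f[0]]

def refute(premises):
    """Reduce the goal  premises |= ⊥ ; return the counterexamples found."""
    for i, p in enumerate(premises):
        if not is_literal(p):
            rest = premises[:i] + premises[i + 1:]
            found = []
            for replacement in premise_law(p):
                for cx in refute(rest + replacement):
                    if cx not in found:        # different leaves may agree
                        found.append(cx)
            return found
    # Elementary goal: self-evident iff a variable occurs negated and unnegated.
    values = {}
    for lit in premises:
        name, value = (lit, True) if isinstance(lit, str) else (lit[1], False)
        if values.setdefault(name, value) != value:
            return []
    return [values]

def check(premises, conclusion):
    """Contradictory-conclusion law: Γ |= A  iff  Γ, ¬A |= ⊥."""
    return refute(premises + [neg(conclusion)])

proof = check([neg(("or", neg("A"), "B")), ("imp", "C", "B")],
              neg(("imp", "A", "C")))
refuted = check([("imp", "A", "B"), neg(("imp", "B", "C"))], "A")
print(proof)    # []
print(refuted)  # [{'A': False, 'B': True, 'C': False}]
```

The two calls correspond to the two illustrations above. In the second, both leaves yield the same assignment, which is why the sketch discards repeated counterexamples.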
Homework 4.10 Using the proof-by-contradiction method, check which of the following implications are valid. Give in each case either a top-down derivation or a counterexample. You can use at your convenience additional premise laws, such as (12), or simple substitutions by equivalents.
1. A → B, B → C |= ¬C → ¬A
2. A → A ∧ B, B → C |= A → C
3. (A ∨ B) → C, C |= A ∨ B
4. A ∧ (B → C), B |= A ∧ C
5. |= (A → B) ∧ (A → C) → (A ∨ B → C)
6. A ∨ B, B → C, C → ¬A |= ⊥
7. A ↔ B |= (A ∨ B) ↔ A
8. A → (B ∨ C), ¬(B → A), ¬(C → A) |= ⊥
9. (A ∧ B) ∨ C |= (A ∧ B) ∧ ¬C ∨ C
The Laws for Proofs by Contradiction
First we have the law that fixes the self-evident implications.
Self-Evident Implication
Γ, A, ¬A |= ⊥
Then, we have the reduction laws. The first is the law that introduces ⊥ as the conclusion. The others are premise laws for binary connectives and negated compounds. In the following list the laws for A ∗ B and for ¬(A ∗ B) are grouped together.
Contradictory-Conclusion Law
Γ |= A  ⇐⇒  Γ, ¬A |= ⊥

Law for Negated Negations
(¬¬, |=)  Γ, ¬¬A |= ⊥  ⇐⇒  Γ, A |= ⊥

Laws for Conjunctions and Negated Conjunctions
(∧, |=)  Γ, A ∧ B |= ⊥  ⇐⇒  Γ, A, B |= ⊥
(¬∧, |=)  Γ, ¬(A ∧ B) |= ⊥  ⇐⇒  Γ, ¬A |= ⊥ and Γ, ¬B |= ⊥

Laws for Disjunctions and Negated Disjunctions
(∨, |=)  Γ, A ∨ B |= ⊥  ⇐⇒  Γ, A |= ⊥ and Γ, B |= ⊥
(¬∨, |=)  Γ, ¬(A ∨ B) |= ⊥  ⇐⇒  Γ, ¬A, ¬B |= ⊥

Laws for Conditionals and Negated Conditionals
(→, |=)  Γ, A → B |= ⊥  ⇐⇒  Γ, ¬A |= ⊥ and Γ, B |= ⊥
(¬→, |=)  Γ, ¬(A → B) |= ⊥  ⇐⇒  Γ, A, ¬B |= ⊥

Laws for Biconditionals and Negated Biconditionals
(↔, |=)  Γ, A ↔ B |= ⊥  ⇐⇒  Γ, A, B |= ⊥ and Γ, ¬A, ¬B |= ⊥
(¬↔, |=)  Γ, ¬(A ↔ B) |= ⊥  ⇐⇒  Γ, A, ¬B |= ⊥ and Γ, ¬A, B |= ⊥
4.5 Implications of Sentential Logic in Natural Language
4.5.0 In order to establish logical implications between English sentences, we recast them as sentences of symbolic logic. We can then check whether logical implication holds for the recast sentences. Consider the premises:
(1) Jill will not marry Jack, unless he leaves New York,
(2) If Jack leaves New York, he must give up his current job,
and the inferred conclusion:
(3) Either Jack will give up his current job, or he won’t marry Jill.
Let A, B and C be, respectively, the formal counterparts of:
‘Jill will marry Jack’, ‘Jack will leave New York’, ‘Jack will give up his current job’.
Then the sentences are translated as:
(1∗) ¬B → ¬A
(2∗) B → C
(3∗) C ∨ ¬A
And indeed:
¬B → ¬A, B → C |= C ∨ ¬A
Had the implication not been valid, we would have had a counterexample, using which we could have pointed out a possible scenario in which (1) and (2) are true and (3) is false. For example, had we replaced (1) by:
(1′) Jack will not marry Jill if he leaves New York,
the formal implication would have been:
B → ¬A, B → C |= C ∨ ¬A
And here we get a counterexample:
A B C
T F F
That is, Jill marries Jack; he does not leave New York, and he does not give up his job.
Noteworthy Points:
(I) We have construed (2) as a conditional and we have read ‘he must give up his job’ as ‘he will give up his job’. Whatever the connotations of ‘must’ in this context, they are ignored as irrelevant for the formalization.
(II) We used the same A to represent, in (1∗) and in (3∗), both the sentence (4.i) ‘Jill will marry Jack’, as well as (4.ii) ‘Jack will marry Jill’. Here we relied on the special meaning of ‘marry’; with most other verbs, e.g., ‘understand’, ‘like’, ‘amuse’, etc. the move would have been illegitimate (‘Jack likes Jill’ is not equivalent in meaning to ‘Jill likes Jack’). In a more scrupulous formalization we would have represented (4.ii) by a different sentence, say A′. But then we should have included A ↔ A′ among the formalized premises. This additional premise reflects the equivalence of (4.i) and (4.ii), which is implicit in the argument. It has a different status than the other premises; for it is not explicitly stated, but is something that derives solely from the meaning of ‘marry’. Which brings us to our next subject.
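With only three sentential variables, both formalized implications of 4.5.0 are small enough to check by brute force over the eight rows of the truth-table. The sketch below is illustrative only; the helper names `implies` and `counterexamples` are assumptions, and A, B, C are read as in the text.

```python
from itertools import product

def implies(p, q):
    """Truth-table of the conditional p → q."""
    return (not p) or q

def counterexamples(premises, conclusion):
    """Rows in which every premise gets T and the conclusion gets F."""
    rows = []
    for a, b, c in product([True, False], repeat=3):
        if all(p(a, b, c) for p in premises) and not conclusion(a, b, c):
            rows.append({"A": a, "B": b, "C": c})
    return rows

conclusion = lambda a, b, c: c or not a                # (3*)  C ∨ ¬A

original = [lambda a, b, c: implies(not b, not a),     # (1*)  ¬B → ¬A
            lambda a, b, c: implies(b, c)]             # (2*)  B → C

variant = [lambda a, b, c: implies(b, not a),          # (1')  B → ¬A
           lambda a, b, c: implies(b, c)]              # (2*)  B → C

original_rows = counterexamples(original, conclusion)
variant_rows = counterexamples(variant, conclusion)
print(original_rows)  # []
print(variant_rows)   # [{'A': True, 'B': False, 'C': False}]
```

The empty list confirms that the implication from (1∗) and (2∗) to (3∗) is valid, while replacing (1) by (1′) leaves exactly the counterexample described above.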
4.5.1 Meaning Postulates and Background Assumptions
There are numerous connections between English verbs, adjectives, common names, and adverbs, which are based on their meaning and which English speakers are expected to know. They are taken for granted whenever we speak, argue, or draw conclusions. Our last example (Jack marries Jill if and only if Jill marries Jack) is a case among many. In that case we could avoid additional formal premises by using a coarse-grained formalization, in which the same A represents (4.i) and (4.ii). But this is not always desirable, and in most cases it is not possible. Consider,
(5) Carol can be on the task force, if, and only if, Carol is unmarried,
from which we want to conclude
(6) If Carol is a bachelor, he can be on the task force.
Let A, B, C represent, respectively,
‘Carol is unmarried’, ‘Carol is a bachelor’, ‘Carol can be on the task force’.
Then (5) and (6) become respectively:
C ↔ A and B → C.
To infer the second from the first, we must add the premise:
B → A,
representing:
(7) If Carol is a bachelor, then Carol is unmarried.
(We cannot let the same formal sentence represent both ‘Carol is unmarried’ and ‘Carol is a bachelor’; the two are not equivalent, since Carol can be an unmarried woman.)
The term meaning postulate has been introduced by Carnap to describe formalized sentences that are not logical truths, but are true by virtue of the meaning of their terms. They are supposed to determine, axiomatically, the meaning of the undefined symbols. Usually it takes first-order logic to express meaning postulates. E.g., the formal counterpart of (7) is a logical consequence of the meaning postulate:
(8) All bachelors are unmarried.
As we shall see, it can be written as:
(8∗) ∀x[Bachelor(x) → Unmarried(x)]
Carnap held that meaning postulates are unrevisable laws of language, without empirical content, a view that has by now been abandoned by most philosophers. Nonetheless, even if the distinction is not absolute, as Carnap held it to be, it is a good methodological policy to distinguish sentences like (8) from sentences like ‘Carol is unmarried’, which convey nonlinguistic factual information. We shall henceforth use meaning postulates, without however committing ourselves to the original significance associated with the term. Thus, in formalizing the inference from (5) to (6), we add B → A as a premise representing a meaning postulate (or a consequence of one).
Background Assumptions
Almost every reasoning involves background assumptions that are not spelled out explicitly. Given the premises:
(9) Arthur’s mother won’t be content, unless he lives in Boston,
(10) Arthur’s wife will be content only if he lives in New York,
we would naturally conclude:
(11) Either Arthur’s mother or his wife won’t be content.
Let us formalize:
A1: Arthur will live in Boston,
A2: Arthur will live in New York,
B1: Arthur’s mother will be content,
B2: Arthur’s wife will be content.
The required implication is:
(12) ¬A1 → ¬B1, B2 → A2 |= ¬B1 ∨ ¬B2.
But it is easy to see that (12) is not a valid implication: if A1, A2, B1 and B2 all get T, the premises are true and the conclusion is false. It turns out that in deriving (11) we have been assuming that Arthur will not live in New York and in Boston at the same time. (“At the same time”, because in (9), (10) and (11), the future tense is, obviously, intended to indicate the same time.) This background assumption becomes, upon formalization:
¬(A1 ∧ A2)
Having added it, we get the desired logical implication:
(12∗) ¬A1 → ¬B1, B2 → A2, ¬(A1 ∧ A2) |= ¬B1 ∨ ¬B2
A background assumption is by no means a necessary truth. It is, for example, conceivable that Arthur will live in Boston and in New York “at the same time”: say, he maintains two households and commutes daily. But given the inconvenience and the expense, such an arrangement is very unlikely; implicitly, we have ruled it out.
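The effect of the background assumption on (12) can be seen by enumerating the sixteen truth-value assignments to A1, A2, B1, B2. The following sketch is illustrative; the names `mother`, `wife`, `one_city` and `goal` are hypothetical labels for the formalized sentences, and each conditional is written out as an equivalent disjunction.

```python
from itertools import product

NAMES = ["A1", "A2", "B1", "B2"]

def counterexamples(premises, conclusion):
    """Assignments giving T to every premise and F to the conclusion."""
    rows = []
    for values in product([True, False], repeat=len(NAMES)):
        v = dict(zip(NAMES, values))
        if all(p(v) for p in premises) and not conclusion(v):
            rows.append(v)
    return rows

mother = lambda v: v["A1"] or not v["B1"]        # ¬A1 → ¬B1
wife = lambda v: (not v["B2"]) or v["A2"]        # B2 → A2
one_city = lambda v: not (v["A1"] and v["A2"])   # ¬(A1 ∧ A2)
goal = lambda v: (not v["B1"]) or (not v["B2"])  # ¬B1 ∨ ¬B2

without_assumption = counterexamples([mother, wife], goal)
with_assumption = counterexamples([mother, wife, one_city], goal)
print(without_assumption)  # [{'A1': True, 'A2': True, 'B1': True, 'B2': True}]
print(with_assumption)     # []
```

Without the assumption there is exactly one counterexample, the scenario in which Arthur lives in both cities and both his mother and his wife are content; adding ¬(A1 ∧ A2) excludes it.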
Or consider the inference from: If Jack leaves New York, he will have to resign his current position, AND Jack decided to leave New York, TO: Jack will resign his current position. Here there is an implicit assumption that Jack will carry out his decision. The assumption may be objectionable in contexts in which decisions are not always implemented. Implicit background assumptions are thus statements of fact that are assumed to be known, or which can reasonably be taken for granted. Their certainty can vary considerably, from that of the well-established law, to that of a mere plausibility. Even an obvious commonplace, e.g., that one cannot be in different places exactly at the same time, can be classified as a background assumption. There is a philosophical tradition, initiated by Kant, according to which certain truths, such as the impossibility of being in two places at the same time, derive from basic (non-linguistic) epistemic principles and are immune to revision. The truth just mentioned derives, presumably, from the very meaning of physical body and space. But today the force of that tradition has been considerably weakened. Many cast doubt on the unrevisability of such a priori conceptual truths. Let us therefore classify under “meaning postulates” cases that are more of a lexicographic nature, such as (8), rather than those that follow from foundational epistemic considerations. The latter will be classified as background assumptions, albeit ones we can hardly conceive of giving up. ‘Background assumptions’ thus covers an extremely wide spectrum, from the most entrenched general laws, to probable suppositions, to particular facts implied by context. In a finer analysis we should distinguish between them. For the sake of simplicity we ignore these distinctions. ‘Meaning postulates’ is reserved for cases like (8), which derive from the conventions of language. The boundary separating meaning postulates from background assumptions is, to be sure, blurred. 
This is true of many useful distinctions. The difference between (8) and some factual assumption (e.g., that Jack will carry out his decision) is sufficiently clear to warrant their classification under different headings.
4.5.2 Implicature
In linguistic exchange we often infer more than what is explicitly stated. On being told that
(13) Jack and Jill met and, so far, they have not quarreled,
one will naturally infer that a quarrel between Jack and Jill was likely. The sentence however does not say it. (13) is true, just in case that (i) Jack and Jill met and (ii) Jack and Jill have not quarreled, then and later. Grice, who pointed out and investigated these phenomena, proposed the term implicature, based on the verb implicate, for inferences of this kind. Thus, we can say that (13) implicates that there was some reason for expecting a quarrel between Jack and Jill. The sentence does not, strictly speaking, imply it. If we add to (13) the negation of the implicated sentence, we might get something odd, but not a contradiction:
(13′) Jack and Jill met and so far they have not quarreled; there was no reason to expect a quarrel.
On Grice’s analysis, the implicature from (13) derives from certain pragmatic rules that govern conversations. The rules that bear on (13), and on other cases that we shall consider, have to do with the relevance, the informativeness and the economy of the speaker’s utterances. The relevance requirement is that the statements made by the speaker be relevant to the topic under discussion. The informativeness requirement is that the speaker supply the right amount of information (known to her), which is required in that exchange. And economy means that she is required to avoid unnecessary length. In our example the rules produce the implicature in the following way. If there was no reason why Jack and Jill should quarrel, then to say that they have not quarreled is to supply a piece of useless information. Since we expect the speaker to go by the rules and to supply information that has some significance, we infer from (13) (assuming the speaker to be knowledgeable and sincere) that there is some reason for expecting a quarrel.
The rules of conversation require also that a speaker should not assert the conditional
If ... , then _ _ _ ,
if he knows that ‘...’ is false, or if he knows that ‘_ _ _’ is true. For one can be more informative and more brief by asserting, in the first case, the negation of the antecedent and, in the second case, the consequent. This point was already discussed in 3.1.4; the oddity of (28.i) and (28.ii) of that section is partly explained by noting this implicature. Also the inferring of a causal connection, which sometimes goes with the use of ‘and’, can perhaps be traced to conversational implicature. Being told:
(14) Jill recommended the play, and Jack went to see it,
4.5. IMPLICATIONS OF SENTENTIAL LOGIC IN NATURAL LANGUAGE
we infer that Jill’s recommendation was the cause of Jack’s going. Else there would be no point in mentioning the two together. Actually, this is not so much the requirement of relevance, as the requirement that there be a sufficiently focused topic of the discussion. The same requirement of sufficient focus can be seen to underlie the assumption of temporal proximity: Unless stated otherwise, we interpret conjunctively combined clauses, in past or future tense, as referring roughly to the same time. As you can see, implicatures make it possible to mislead without making assertions that are formally false. Many resort to this device. Politicians, advertisers and lawyers excel in it.
Implicature versus Ambiguity
In (13) the addition of the negated implicature does not yield a contradiction. We may take this as corroborative evidence for its being an implicature, not an implication. This kind of test is however not conclusive; on many occasions it misleads. Consider, (15) Jack and Jill were married last week. Usually (15) is taken to imply that Jack and Jill married each other. If, however, we add the negation of that conclusion, we get: (15′) Jack and Jill were married last week, but they did not marry each other, which is not at all contradictory. Shall we then say that our first inference from (15) is by implicature only? No. Actually (15) is ambiguous. We have seen (cf. 3.1.2) that the use of ‘and’ to combine names can result in two possible interpretations: the distributive, in which the sentence can be expressed as a conjunction, and the collective, where the combination of names functions as a name of a single item. The dominant reading of (15) is the collective, implying that Jack and Jill married each other. The addition of ‘they did not marry each other’ makes this interpretation untenable (for it leads to a trivial contradiction). Hence we switch to the other reading of ‘Jack and Jill were married’. In the same vein, we may interpret John jumped into the tank as stating that John jumped into some armored vehicle. But with a suitable addition, e.g., John jumped into the tank and dived to the bottom, we read it as stating that John jumped into a large container. A nice illustration of conversational implicature is provided by comparing (15) and
(15″) Jack and Jill were married last week, on the same day.
(15), under its first reading (i.e., that they married each other), implies (15″). But then, the addition of ‘on the same day’ would be altogether redundant. Assuming the speaker to go by the rule of economy, we reinterpret (15) and infer that they were not married to each other; else there would be no point to the additional information.
The Principle of Adjusting: We mentioned already the so-called charity principle (3.1.2 after example (16)). According to it, we interpret our interlocutor, in cases of ambiguity, so as to make him sound sensible. Our last examples illustrate this point. But the name ‘charity’ can mislead. For the principle derives from a wider principle, according to which we interpret our experience so as to make it cohere with the general scheme of expected regularities. This applies to linguistic as well as to non-linguistic phenomena, to interaction with people as well as to interaction with nature. The subject is too broad to go into here. Suffice it only to observe that in the case of language we expect utterances, linguistic texts and linguistic interaction to accord with certain rules, syntactic, semantic and pragmatic. We expect utterances to make a certain sense. And when the danger of nonsense looms, we use the available possibilities to adjust our reading so as to avoid it.
Homework 4.11 Use sentential formalization in order to analyze the logic of the following exchanges and to answer the questions. If there is no stated question, find whether the conclusion follows from the premises. Formalize only inasmuch as this is necessary for the purpose of your analysis. Discuss briefly any points relating to meaning postulates, background assumptions, ambiguity and implicature, which you find relevant. Assume that the speakers are reasonable.
(1) Jill: Jack’s mother won’t be content, unless he lives in Boston. Jack: But his wife will be content only if they live in New York. Jill: So either his wife or his mother won’t be content.
Take ‘they’ in Jack’s statement to refer to Jack and his wife. (2) Arthur, David and Mary share an apartment. Jack: Arthur and David are crazy about Mary, so if she is at home both of them are. Jill: In any case, if one of the boys is at home, the other is too, for none trusts himself alone with the neighbor’s dog. Jack: But they said that one of them will go over to Joe’s place to help him with his studies.
Jill: Which goes to show that Mary is not at home. Does it? Does it make a difference for the implication if ‘they’ and ‘them’, in Jack’s last statement, refer to the two boys or to the two boys and to Mary as well? (3) Jack: Do you think that Mary is still unmarried? Jill: I don’t know, but if Mary is not unmarried, neither is Myra. Jack: And if Myra is not married neither is Mary. Jill: All this is rather confusing. Doesn’t it imply that Myra is married only if Mary is? Does it? (4) Jill: Both Arthur and Jeremiah said that they won’t be happy, unless they marry Frieda. Jack: By now she should have married one of them. Jill: But she wasn’t going to marry anyone without a secure job. Jack: So, by now, one of them has a secure job and one of them is not happy. (5) Jack: If one of Arthur and Jeremiah goes to the movie either Olga or Amelia will go with him. Jill: And the two girls won’t go there together unless accompanied by a boy. Jack: Which goes to show that the two boys will go to the movie only if the two girls go there too. (6) Jack consults a fortuneteller whether he should become a musician or study for the law. Jack: I won’t be happy unless I practice music. Fortuneteller: But only by becoming a lawyer can you be rich enough to buy the things you like. Jack: It seems that my happiness depends on my giving up things I like. Does it? (7) Jill: If you go to the movie so will I. Jack: If what you have said is true, then I will go to the movie. Jill: Why this roundabout way of putting things? You could have simply said that you will go to the movie. Jack: Not at all. I only said that I will go to the movie if what you had said is true. Who is right and why? (8) Jack: If I enroll in the logic course I shall work very hard.
Jill: I don’t know that I believe you... Well, at least I believe that you won’t enroll in the logic course unless what you say is true. Jack: But doesn’t this show that you don’t know your true beliefs? Is Jack right?
(9) Jack: Arthur won’t move to a new apartment unless he accepts the new offer. But this won’t be true if he marries Olga. Jill: But if he marries Olga and moves to a new apartment, he will accept the offer. He won’t be able to do both on the salary he is getting now. Jack: So, unless one of us is wrong, he won’t marry Olga. (10) Jill: Unless you take a plane you won’t meet your father. Jack: Taking a plane is rather costly. Jill: But your father told me that if you meet him he’ll cover the expenses required for your trip. Jack: So if I take a plane, in the end it won’t cost me. (11) Jack, Jill and Arthur who took a midterm test discuss the possible outcomes. Jack: Someone who had a look at the list told me that two students got an A. Jill: I am expecting an A, I’ll be in a bad mood if I didn’t get it. Jack: So will I. Arthur: I don’t care. This test doesn’t matter so much. Jill: So either Arthur didn’t get an A, or one of us will be in a bad mood.
Chapter 5
Mathematical Interlude
5.0
Every description of a language (or setup, or system) must be phrased in some language. The language we use in talking about the language we discuss is referred to as the metalanguage, and the language we discuss as the object language. When we describe French in English, the metalanguage is English and the object language is French. The metalanguage can be the same as the object language: we can describe English in English. The language used in this course for discussing formal systems is English, or rather, English supplemented with some technical vocabulary.
We have been relying in our descriptions, arguments and proofs on certain basic, intuitively grasped concepts; for example, the concept of a finite number, and that of a finite sequence. We may say that a sentential expression is a finite sequence of symbols, and that a sentential compound is a sentence obtained in a finite number of steps by applying connectives, and so forth. Initially, the use of such notions poses no problems. But as the arguments and the constructions become more involved, there is an increasing need for a precise framework within which we can define certain abstract notions and carry out proofs. The framework can help us guard against error.¹ At the same time it should have resources for carrying out constructions and proofs beyond the immediate grasp of our intuitions.
The need for a rigorous conceptual foundation was addressed by mathematicians and philosophers in the second half of the nineteenth century. The desired foundations were laid in the works of Dedekind, Frege and, in particular, Cantor, who created between the years 1874 and 1884 what is known as set theory. This theory, developed later by other mathematicians (Zermelo, von Neumann, Hausdorff, Fraenkel, to name a few), provides a rigorous apparatus, sufficiently powerful for carrying out the constructions and proofs in all formal reasoning. All known formal systems can, in principle, be described within set theory; and all known valid reasoning about them can be derived in it. Nowadays, this theory provides the most basic kit of tools for any reasoning of a mathematical or formal nature. In its more advanced versions, set theory, itself a sophisticated branch of mathematics, is of interest to specialists only. But its elementary core is employed whenever formal precision is required.
In the first section of this chapter we shall introduce some very elementary set-theoretical notions. Our aim is not to study set theory per se, but to provide a more rigorous treatment of formal languages and their semantics. We shall take for granted a variety of mathematical concepts, such as natural number, finite set, and finite sequence. In set theory these and all other mathematical concepts are defined in terms of a single primitive: the membership relation. But such reductions do not concern us here.
The second section is devoted to a certain technique that is widely employed in defining formal languages and in establishing their properties. This is the technique of inductive definitions and inductive proofs.
¹ There are well-known examples, in the history of thought, of arguments considered clear and self-evident, which have later turned out to be confused or fallacious.
5.1
Basic Concepts of Set Theory
5.1.1
Sets, Membership and Extensionality
A set is a collection of any objects, considered as a single abstract object. There is a set consisting of the earth, the sun, and the moon; another, consisting of the earth, the sun, the moon, and the planet Jupiter; and still another, consisting of the earth, the moon, the number 8, and Bill Clinton. Thus, we can put into the same set any objects whatsoever. Usually, we consider sets whose members are of the same kind: sets of people, sets of numbers, sets of sentences, etc. But this is not a restriction imposed by the concept of set; the theory allows us to form sets arbitrarily. The objects that go into a set are said to be its members. And the basic relation on which set theory is founded is the membership relation; it holds between two objects just when the second object is a set and the first is a member of it. The symbol for membership is ‘∈’. It is employed as follows:
x∈X
means that x is a member of the set X, and
x ∉ X
means that x is not a member of X. Hence, if X is the set whose members are the earth, the moon, the number 8, and Bill Clinton, then: Earth ∈ X, 8 ∈ X, Moon ∈ X, Clinton ∈ X, Nixon ∉ X, 6 ∉ X, Jupiter ∉ X, etc. We also have: Clinton ∉ Nixon, because Nixon is not a set.
Terminology: The membership symbol is occasionally used inside English text, e.g., ‘there is x ∈ X’ is read as: ‘there is a member, x, of X’. Similar self-explanatory phrases will be used throughout. Sometimes (but not always!) ‘contains’, or ‘is contained’, means contains as a member, or is contained as a member. In these cases ‘X contains x’ means that x ∈ X. We also say that x belongs to X, or that x is an element of X. We use ‘x, y ∈ X’ as a shorthand for: ‘x ∈ X and y ∈ X’, and similarly for more than two members: ‘x, y, z ∈ X’.
Extensionality
A set is completely determined by its members. This means that sets that have the same members are the same set. Stated in full detail, the Extensionality Axiom says: If X and Y are sets then X = Y iff every member of X is a member of Y and every member of Y is a member of X. Note that the “only if” direction is trivial: X = Y means that X and Y are identical, hence they must have the same members. (This is actually a truth of first-order logic.) The real content of the axiom consists in the “if” direction: having the same members is sufficient for being the same set. To see the implications of extensionality, consider the following two concepts: that of a human being and that of a featherless two-footed animal. In an obvious sense, the concepts differ. But humans are featherless two-footed animals, and it so happens that there are no other such creatures besides humans. Hence, the set of humans is identical to the set of featherless two-footed animals. When forming sets, differences between concepts that cannot be cashed in terms of members are ignored.
The extensionality axiom provides the standard way of proving that sets are equal. If X and Y are sets, then to prove X = Y it suffices to show that, for every x,
x ∈ X iff x ∈ Y.
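The membership test just described can be tried out on small finite examples. The following is only an illustrative sketch in Python; the function name same_members and the sample listings are hypothetical, and a finite check of this kind illustrates the axiom rather than proving anything about sets in general.

```python
def same_members(xs, ys):
    """Extensionality test: every member of xs is in ys, and vice versa."""
    return all(x in ys for x in xs) and all(y in xs for y in ys)

# Two written-out listings; lists are used so that order and repetition are preserved.
listing_a = [2, 3, 5]
listing_b = [5, 3, 2, 2]

print(same_members(listing_a, listing_b))              # True
print(frozenset(listing_a) == frozenset(listing_b))    # True: Python compares sets by their members
print(same_members([1, 2], [1, 2, 3]))                 # False: 3 is not a member of the first
```

Python's built-in set equality behaves extensionally, which is why the second line agrees with the hand-written test.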
Ways of Denoting Sets
The simplest way of representing sets is by listing their members. The set is denoted by putting curly brackets, { }, around the list. The three examples given at the beginning of the section are denoted as:
{Earth, Sun, Moon}, {Earth, Sun, Moon, Jupiter}, {Earth, Moon, 8, Clinton}
The ordering of the list and repetitions in it do not matter:
{Earth, Moon, 8, Clinton} = {Clinton, Clinton, 8, Earth, Clinton, Earth, 8, Moon}
because every member of the left-hand side is a member of the right-hand side, and every member of the right-hand side is a member of the left-hand side. The method of listing the members is not practical when the list is too long, and not feasible if the set is infinite. Sometimes suggestive notations can be used for infinite sets, for example:
{0, 1, 2, . . .} or {0, 2, 4, . . .}
The first is the set of all natural numbers (i.e., non-negative integers), the second that of all even natural numbers. But this method, which is based on guessing the intended rule, is very limited. The most natural (and, in principle, perhaps the only) way of representing a set is by means of a defining condition: one that determines what objects belong to it. In English, the definition has the form: the set of all ... where ‘...’ expresses the condition in question. Thus, we have: The set of all positive integers divisible by 7 or 9, the set of all planets, the set of all stars, the set of all atoms, the set of all USA citizens born in August 1991, the set of all British kings who died before 1940, and so on. Note that finite listing can be seen as a special case of this kind of definition:
{earth, moon, 8, Clinton} = the set of all objects that are either the earth, or the moon, or number 8, or Clinton.
In mathematics the following is used:
{x : . . . x . . .}
It reads: The set of all x such that ...x... . Here, ‘...x...’ states the condition about x. Instead of ‘x’ any other letter can be used. We shall refer to it as the standard curly bracket notation. The examples given above can therefore be written as follows: {x : x is a positive integer divisible by 7 or 9}, {x : x is a planet}, {v : v is a star}, {y : y is an atom}, {z : z is a USA citizen born in August 1991}, {x : x is a British king who died before 1940}, and so on. This is not to say that every set can be denoted by an expression of the last given form, or, for that matter, by some other expression. In mathematics we allow for the possibility of sets not denoted by any expression in our language; just as there may be atoms that no description can pick out.
Variants of the Notation: Usually, set members are chosen from some fixed given domain (itself a set). If U is the domain in question, then the set of all members, x, of U that satisfy ...x... is, of course:
{x : x ∈ U and . . . x . . .}
An alternative notation is:
{x ∈ U : . . . x . . .}
which reads: ‘the set of all x in U such that ...x...’. Thus, if N is the set of all natural numbers, then:
{x ∈ N : x + 1 is divisible by 3} = {x : x ∈ N and x + 1 is divisible by 3}
Occasionally, the domain in question is to be understood from the context. It is also customary to employ variables that range over fixed domains. If in the last example it is understood that ‘x’ ranges over the natural numbers, then we can omit the reference to N and write simply
{x : x + 1 is divisible by 3}
Other variants of the notation involve the use of functions. For example,
{2x : x ∈ N} and {x² : x ∈ N}
are, respectively, the set of all numbers of the form 2x and the set of all numbers of the form x², where x ranges over N (i.e., the set of all even natural numbers and the set of all squares).
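Readers who know a programming language may recognize these notations: Python's set comprehensions mirror them closely. The sketch below is only an illustration, and it rests on one loud assumption: since N is infinite, a finite initial segment (the numbers 0 to 29) stands in for it, so the computed sets are finite fragments of the sets discussed above. The variable names are hypothetical.

```python
# Assumption: a finite initial segment stands in for the infinite set N.
N = range(0, 30)

# {x ∈ N : x + 1 is divisible by 3}
A = {x for x in N if (x + 1) % 3 == 0}

# {2x : x ∈ N}: the even numbers obtainable from this segment
evens = {2 * x for x in N}

# {x² : x ∈ N}: the squares obtainable from this segment
squares = {x * x for x in N}

print(sorted(A))
print(sorted(evens)[:5])
print(sorted(squares)[:5])
```

The condition after `if` plays the role of ‘...x...’, and the expression before `for` plays the role of the function applied to x.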
We can use for these sets the standard notation; but this would result in longer expressions. For example:
{x² : x ∈ N} = {z : there is x ∈ N, such that z = x²}
Once you get used to them you will find these and other notations self-explanatory. The following exercises will help you to get accustomed to set-theoretic notations and phrasings.
Homework 5.1
Translate the following into the standard curly-bracket notation.
(1) The set of all people who like themselves.
(2) The set of all integers that are smaller than their squares. (Recall, the square of x is x².)
(3) The set of all people married to 1992 Columbia students.
Rewrite the following in the curly-bracket functional notation. You can use ‘N’ and ‘Z’ to denote, respectively, the set of natural numbers and the set of integers. For (6) use ‘father(x)’ to denote the father of x.
(4) The set of all positive multiples of 4.
(5) The set of all successors of integers divisible by 5.
(6) The set of all fathers of identical twins.
Describe in English the following sets; use short, neat descriptions. (‘Livings’, ‘Humans’, and ‘Planets’ have the obvious meanings.)
(7) {x ∈ Livings : x has two legs}
(8) {x ∈ Humans : x has more than one child}
(9) {x ∈ Planets : x is larger than the earth}
Rewrite the following in the standard curly-bracket notation.
(10) {3x : x ∈ Primes}
(11) {x − y : x ∈ Primes, y ∈ Primes}
(12) {2x + y² : x ∈ N, y ∈ Primes}
Note: The concept of a set is primitive. It cannot be defined by reduction to more basic concepts. Explanations and examples (like the ones just given) may serve to get the concept
across, but they do not amount to definitions. In an indirect way, the concept is determined by what we take to be the basic properties of sets. The same takes place in Euclidean geometry, where the undefined concepts of point, line and plane are indirectly determined by the geometrical axioms. Like geometry, set theory is a system based on axioms. Some are “obvious”. Others, belonging to more sophisticated parts of the theory, require deep understanding. Except for extensionality, the axioms are not discussed here.
Singletons
The set {x} has a single member, namely, x. Such a set is called a singleton, or a unit set; {x} is the singleton of x, or the unit set of x. One may be tempted to identify the singleton of x with x itself. The temptation should be resisted. The singleton {Clinton} is a set containing Clinton as its sole member. Clinton himself is a man, not a set. Just so, one distinguishes between John the man and the one-member committee having John as its only member. If all the committee members except John perish in a crash, the committee becomes a one-member committee; but you do not want to say that it becomes a man. The standard version of set theory has an axiom, called the regularity axiom, which implies that nothing can be a member of itself. It therefore implies that, for all x, x ≠ {x} (because x ∈ {x}, but x ∉ x). The singleton of x is {x}, the singleton of {x} is {{x}}, the singleton of {{x}} is {{{x}}}, and so on: {. . . {{x}} . . .}. It can be shown (assuming the regularity axiom) that all of these are different from each other.
The Empty Set
Among sets we include the so-called empty set: one that has no members. At first glance one may find this strange, as one might find strange, at first, the idea of the number zero. In fact, the concept is simple, highly useful and easily handled. We speak about the empty set, because there is only one. This follows from extensionality: If X and X′ are sets that have no members, then X = X′, because they have the same members (for every x: x ∈ X iff x ∈ X′). The empty set is denoted as: ∅. Note that every object that is not a set (e.g., every physical object) has no members. Extensionality does not make these objects equal to ∅, because extensionality applies only to sets.
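The distinctions drawn in the last two paragraphs can be illustrated with Python's immutable sets. This is a sketch under stated assumptions: the helper singleton is a hypothetical name, Clinton is represented by a string standing for a non-set object, and the checks cover only these particular examples (they do not establish the regularity axiom).

```python
def singleton(x):
    """Return {x} as an immutable set, so that it can itself be a member of a set."""
    return frozenset([x])

empty = frozenset()          # the empty set ∅
clinton = "Clinton"          # a non-set object, represented here by a string

s1 = singleton(clinton)      # {Clinton}
s2 = singleton(s1)           # {{Clinton}}
s3 = singleton(s2)           # {{{Clinton}}}

print(clinton in s1)                 # True: Clinton ∈ {Clinton}
print(s1 == clinton)                 # False: the singleton is not its member
print(len({s1, s2, s3}))             # 3: the nested singletons are pairwise distinct
print(frozenset([]) == empty)        # True: any two memberless sets are equal
print(singleton(empty) == empty)     # False: {∅} has a member, ∅ has none
```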
5.1.2
Subsets, Intersections, and Unions
If X and Y are sets then we say that X is a subset of Y if every member of X is a member of Y. We also say in that case that Y is a superset of X. The notation is:
X ⊆ Y, or, equivalently, Y ⊇ X.
Occasionally, we use the term inclusion: we say that X is included in Y, meaning that X is a subset of Y. As is usual in mathematics, crossing out indicates negation: X ⊈ Y means that X is not a subset of Y. Obviously, X ⊆ X, for every set X.
Proper Subsets: If X ⊆ Y and X ≠ Y, then X is said to be a proper subset of Y, or properly included in Y, and Y is said to be a proper superset of X.
If X ⊆ Y and Y ⊆ X, then X and Y have the same members and, by extensionality, are the same. Therefore
X = Y iff X ⊆ Y and Y ⊆ X.
It is convenient to “chain” inclusions thus: X ⊆ Y ⊆ Z; it means: X ⊆ Y and Y ⊆ Z. Set inclusion is transitive:
If X ⊆ Y ⊆ Z then X ⊆ Z.
(The proof is trivial: Assume the left-hand side. If x ∈ X then x ∈ Y, because X ⊆ Y; hence also x ∈ Z, because Y ⊆ Z; therefore every member of X is a member of Z.) Every set, X, contains as members all members of the empty set (because the empty set has no members). Hence, ∅ ⊆ X, for every set X.
Note: The subset relation, ⊆, should be sharply distinguished from the membership relation, ∈. Every set is a subset of itself, but not a member of itself. On the other hand, a member of a set need not be a subset of it; the earth is a member of {Earth, Moon}, but it is not a subset of it, because the earth is not a set. Or consider the following:
∅ ⊆ {{∅}} (and the inclusion is proper), but ∅ ∉ {{∅}}; because the only member of {{∅}} is {∅}, and ∅ ≠ {∅}.
{∅} ∈ {{∅}} but {∅} ⊈ {{∅}}; because {∅} contains ∅ as a member, whereas {{∅}} does not contain ∅ as a member.
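The two examples just given can be checked mechanically. The following sketch assumes Python's frozenset as a stand-in for sets (in Python, `<=` tests the subset relation, `<` tests proper inclusion, and `in` tests membership); the variable names are hypothetical.

```python
empty = frozenset()              # ∅
s = frozenset([empty])           # {∅}
ss = frozenset([s])              # {{∅}}

print(empty <= ss)    # True:  ∅ ⊆ {{∅}}
print(empty < ss)     # True:  the inclusion is proper
print(empty in ss)    # False: ∅ ∉ {{∅}}, whose only member is {∅}
print(s in ss)        # True:  {∅} ∈ {{∅}}
print(s <= ss)        # False: {∅} ⊈ {{∅}}, since ∅ ∈ {∅} but ∅ ∉ {{∅}}
```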
Intersections
The intersection of two sets X and Y, denoted X ∩ Y, is the set whose members are all the objects that are members both of X and of Y: For every x,
x ∈ X ∩ Y iff x ∈ X and x ∈ Y,
or, equivalently:
X ∩ Y = {x : x ∈ X and x ∈ Y}
Examples: The intersection of the set of all natural numbers divisible by 2 and the set of all natural numbers divisible by 3 is the set of all natural numbers divisible both by 2 and by 3. (This is the set of natural numbers divisible by 6.) The intersection of the set of all even natural numbers and the set of all prime numbers is the set of all numbers that are both even and prime; since the only number that is both even and prime is 2, this is the singleton {2}. The intersection of the set of all USA citizens and the set of all redheaded people is the set of all redheaded USA citizens. The intersection of the set of all women and the set of all pre-1992 USA presidents is the set of all women that have been, at some time before 1992, USA presidents. This happens to be the empty set.
Disjoint Sets: Two sets, X, Y, are said to be disjoint if they have no common members; i.e., if X ∩ Y = ∅.
Unions
The union of the sets X and Y, denoted X ∪ Y, is the set whose members are all objects that are either members of X or members of Y (or members of both). That is: For every x,
x ∈ X ∪ Y iff x ∈ X or x ∈ Y,
or, equivalently:
X ∪ Y = {x : x ∈ X or x ∈ Y}
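The two definitions, and the numerical examples given for intersections, can be tried out in Python, where `&` and `|` compute intersection and union. This is only a sketch: the numbers 0 to 99 stand in for the natural numbers (an assumption made because N is infinite), and the helper is_prime and the variable names are hypothetical.

```python
# Assumption: a finite initial segment stands in for the natural numbers.
N = range(0, 100)

def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

div2 = {x for x in N if x % 2 == 0}
div3 = {x for x in N if x % 3 == 0}
div6 = {x for x in N if x % 6 == 0}
primes = {x for x in N if is_prime(x)}
odds = {x for x in N if x % 2 == 1}

print(div2 & div3 == div6)       # True: divisible by 2 and by 3 means divisible by 6
print(div2 & primes)             # {2}
print(div2.isdisjoint(odds))     # True: no common members

# The defining conditions agree with the built-in operators.
print(div2 & div3 == {x for x in N if x in div2 and x in div3})   # True
print(div2 | div3 == {x for x in N if x in div2 or x in div3})    # True
```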
Examples: The union of the set of all natural numbers that are divisible by 6 and the set of all natural numbers that are divisible by 4 is the set of all numbers divisible either by 6 or by 4 (or by both, e.g., 12). The union of the set of all mammals and the set of all humans is the set of all creatures that are either mammals or humans; since every human is a mammal, this union is the set of all mammals. The union of the set of all people that were, at some time up to t, senators, and the set of all people who were, at some time up to t, congressmen, is the set of people who were at one time or another, up to time t, members of at least one of the legislative houses.
The basic properties of intersections and unions are the following:
(X ∩ Y) ∩ Z = X ∩ (Y ∩ Z)        (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z)
X ∩ Y = Y ∩ X                    X ∪ Y = Y ∪ X
X ∩ X = X                        X ∪ X = X
X ∩ ∅ = ∅                        X ∪ ∅ = X
The equalities of the first row mean that the operations of intersection and union are associative, those of the second row mean that they are commutative, and those of the third row mean that they are idempotent. These properties follow directly from the meanings of ‘and’ and ‘or’. They are so obvious that one would hardly consider proving them formally. Formal, but tedious, proofs can be given. When this is done, one sees that the associativity of intersection reflects the associativity of ‘and’ (i.e., the fact that (A ∧ B) ∧ C and A ∧ (B ∧ C) are logically equivalent) and the associativity of union reflects that of ‘or’.
Repeated Intersections and Repeated Unions: Intersections can be applied repeatedly to more than two sets, and the same holds for unions. Since these operations are associative, we can ignore grouping and use expressions such as:
X₁ ∩ X₂ ∩ . . . ∩ Xₙ        X₁ ∪ X₂ ∪ . . . ∪ Xₙ
And since the operations are commutative, the order of the sets can be changed without affecting the result. It is easily seen that X₁ ∩ X₂ ∩ . . . ∩ Xₙ is the set of all objects that are members of all the sets X₁, . . . , Xₙ. Similarly, X₁ ∪ X₂ ∪ . . . ∪ Xₙ is the set of all objects that are members of at least one of X₁, . . . , Xₙ.
Distributive Laws: These two equalities hold in general:
X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)
X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)
The first is the distributive law of intersection over union, the second that of union over intersection. These laws are direct outcomes of the following two tautologies:
x ∈ X and (x ∈ Y or x ∈ Z) iff (x ∈ X and x ∈ Y) or (x ∈ X and x ∈ Z).
x ∈ X or (x ∈ Y and x ∈ Z) iff (x ∈ X or x ∈ Y) and (x ∈ X or x ∈ Z).
Obviously, each of X and Y includes (as a subset) their intersection X ∩ Y, and is included in their union X ∪ Y. This can be stated thus:
X ∩ Y ⊆ X, Y ⊆ X ∪ Y
As is easily seen, the subset relation can be characterized in terms either of unions, or of intersections:
X ⊆ Y iff X ∩ Y = X
X ⊆ Y iff X ∪ Y = Y
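The laws stated in this subsection can be tested exhaustively on a small example. The sketch below makes an explicit assumption: it checks every triple of subsets of the three-element domain {0, 1, 2}, which is a sanity check on a finite case and not a proof of the general laws. The names U, subsets and laws_hold are hypothetical.

```python
from itertools import combinations, product

U = [0, 1, 2]   # assumed small domain; all of its 8 subsets are enumerated below
subsets = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]
empty = frozenset()

def laws_hold(X, Y, Z):
    """Check the stated equalities and characterizations for one triple of sets."""
    return all([
        ((X & Y) & Z) == (X & (Y & Z)),          # associativity of intersection
        ((X | Y) | Z) == (X | (Y | Z)),          # associativity of union
        (X & Y) == (Y & X),                      # commutativity
        (X | Y) == (Y | X),
        (X & X) == X,                            # idempotence
        (X | X) == X,
        (X & empty) == empty,                    # behavior with the empty set
        (X | empty) == X,
        (X & (Y | Z)) == ((X & Y) | (X & Z)),    # distributive laws
        (X | (Y & Z)) == ((X | Y) & (X | Z)),
        (X & Y) <= X and Y <= (X | Y),           # X ∩ Y ⊆ X and Y ⊆ X ∪ Y
        (X <= Y) == ((X & Y) == X),              # subset via intersection
        (X <= Y) == ((X | Y) == Y),              # subset via union
    ])

print(len(subsets))                                                   # 8
print(all(laws_hold(X, Y, Z) for X, Y, Z in product(subsets, repeat=3)))   # True
```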
We also have:
If X ⊆ X′ and Y ⊆ Y′ then X ∩ Y ⊆ X′ ∩ Y′ and X ∪ Y ⊆ X′ ∪ Y′.
Every set which is included both in X and in Y is included in their intersection. This follows easily from the definitions. (It is also derivable from the above-given properties: If Z ⊆ X and Z ⊆ Y, then Z = Z ∩ Z ⊆ X ∩ Y.) Therefore, the intersection of two sets X and Y is (i) included both in X and in Y, and (ii) includes every set that is included in X and in Y. We can express this by saying that X ∩ Y is the largest set that is included both in X and in Y. Similarly, the union of X and Y can be characterized as the smallest set that includes both X and Y.
Homework 5.2
Let N = {0, 1, 2, . . . , n, . . .} and let x, y, z, range over N. Let
X₁ = {0, 1, 5, 7, 10, 13, 18, 19, 20}
X₂ = {3, 4, 5, 17, 21, 8, 9, 6, 1}
X₃ = {21, 31, 20, 40, 1, 0, 3, 20}
X₄ = {2x : x > 3}
X₅ = {x : x is divisible by 2 or by 3}
X₆ = {x : x is prime}
Write down in the curly-bracket notation (using ‘∅’ for the empty set) the following sets:
1. X₁ ∪ X₂
2. X₁ ∩ X₃
3. X₃ ∩ X₄
4. X₃ ∪ X₄
5. X₁ ∩ X₂ ∩ X₃
6. (X₁ ∩ X₆) ∪ (X₅ ∩ X₂)
7. (X₅ ∩ X₆) ∪ X₁
8. (X₆ ∪ X₅) ∩ (X₁ ∪ X₃)
9. (X₄ ∩ X₆) ∪ X₅
10. X₄ ∩ (X₆ ∪ X₅)
5.3 For any two sets, X, Y, define X − Y by:
X − Y = {x ∈ X : x ∉ Y}
With the Xᵢ’s as in 5.2, write down in the curly-bracket notation (using ‘∅’ for the empty set) the following sets:
1. X₁ − X₂
2. X₂ − X₁
3. X₆ − X₅
4. X₄ − X₅
5. (X₃ − X₁) ∩ (X₂ − X₄)
6. (X₁ − X₃) − X₂
7. X₁ − (X₃ − X₂)
8. N − X₄
9. X₄ − N
10. X₅ − (X₆ ∪ X₄)
5.1.3
Sequences and Ordered Pairs
Sequential orderings underlie almost everything. Impressions, actions, events, come arranged in time. Quite early in our life we become acquainted with finite sequences. We learn, moreover, that different sequences can be made by arranging the same objects in different ways. We also learn that elements can be repeated; the same color, shape, or whatnot, can occur in different places. We learn to identify certain sequences of letters as words, and certain sequences of words as sentences. Particular sequences of tones and rests make up tunes, and some sequences of moves constitute games. Sequences are all around.
We shall not define here the notion of a sequence in set-theoretical terms. Relying on our intuitive understanding we shall take it for granted. Sequences can be formed from any given objects. And the sequences are objects themselves. We may use ‘a₁, a₂, . . . , aₙ’ to denote the sequence of length n in which a₁ occurs in the first place, a₂ in the second, ..., and aₙ in the nth. But this notation is often inconvenient; for we also use ‘a₁, a₂, . . . , aₙ’ to refer to a plurality (we say: ‘the numbers 3, 7, 11, 19 are prime’), whereas a sequence is a single object. Therefore we have notations that display more clearly the sequence as an object. The most common are:
(a₁, a₂, . . . , aₙ) and ⟨a₁, a₂, . . . , aₙ⟩
Finite sequences are called tuples, and sequences of length n are called n-tuples. The expression ‘ith coordinate’ is used, ambiguously, for the ith place, as well as for the object occurring in that place. The sequences we encounter are finite. But the notion can be extended to infinite cases. We can speak of the infinite sequence of natural numbers: (0, 1, 2, . . . , n, . . .) or of the sequence of even natural numbers: (0, 2, 4, . . . , 2n, . . .). In this course we shall be concerned only with finite sequences; though we may mention infinite sequences of numbers or of symbols.
It is convenient to refer to the objects occurring in the sequence as its members. The object occurring in the ith place is the ith member of the sequence. Do not confuse this with the membership relation of set theory! As a rule, the context indicates the intended meaning of ‘member’.

Equality of Sequences: A sequence is determined by its length (the number of places, or of occurrences) and by the order in which objects occur: its first member, its second member, etc. Sequences are equal when they are “exactly the same”: they have the same length and, in each place, the same object occurs. Formally:

$(a_1, \ldots, a_m) = (b_1, \ldots, b_n)$ iff $m = n$ and $a_i = b_i$ for all $i = 1, \ldots, m$.

They are thus quite different from sets. A set is completely determined by its members. Set-theoretic notations may list the members in some sequential order, but neither the order nor repeated listings make a difference:

{0, 1, 1, 1} = {1, 0, 1, 1} = {0, 1} = {1, 0}

But the sequences (0, 1, 1, 1), (1, 0, 1, 1), (0, 1), (1, 0) are all different.

Ordered Pairs, Triples, Quadruples, etc. Ordered pairs, or pairs for short, are 2-tuples; triples are 3-tuples; quadruples are 4-tuples, and so on. Ordered pairs are of particular importance. The identity condition for sequences becomes, in the case of ordered pairs, the well-known condition:

$(a, b) = (a', b')$ iff $a = a'$ and $b = b'$.
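The contrast between sequences and sets can be tried out directly. The following is a minimal sketch, assuming that Python tuples stand in for finite sequences and Python sets for sets; the helper names `all_distinct` and `equal_sequences` are illustrative and not part of the text.

```python
# Tuples model finite sequences: length, order and repetitions all matter.
seqs = [(0, 1, 1, 1), (1, 0, 1, 1), (0, 1), (1, 0)]


def all_distinct(items):
    """True if no two items in the list are equal."""
    return len(set(items)) == len(items)


def equal_sequences(a, b):
    """The identity condition: same length and the same object in each place."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


# As sequences, the four tuples are pairwise different ...
sequences_differ = all_distinct(seqs)

# ... but the sets of their members coincide, since a set ignores
# both the order of listing and repeated listings.
sets_coincide = all(set(s) == {0, 1} for s in seqs)
```

Python's built-in `==` on tuples already implements the same identity condition; `equal_sequences` only spells it out.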
5.1.4
Relations and Cartesian Products
We have seen that any property of objects (belonging to some given domain) determines a set: the set of all objects (in the given domain) that have the property. We can therefore use sets as substitutes for properties. (By doing so we disregard the difference between any two properties that determine the same set.) There are creatures that, like properties, are true of objects, but which involve more than one object: they relate objects to each other. For example, the parent-child relation holds for any pair of objects, x and y, such that x is a parent of y. Set theory provides a very simple and elegant way of representing these creatures:
Regard the relation as a property of ordered pairs and represent it, accordingly, as a set of ordered pairs. Thus, the parent-child relation is the set of all ordered pairs (x, y), such that x is a parent of y. If, for the sake of illustration, we restrict our universe to a domain consisting of: Olga, Mary, Ruth, Jack, John, Abe, Bert, Nancy, Frieda, and if the parent-child relation among these people is given by: Abe is the father of Ruth and Jack, Olga is the mother of Mary, Abe and Nancy, Jack is the father of Bert, John is the father of Nancy, and there are no other parent-child relationships, then, over this domain, the parent-child relation is simply the set:

{(Abe, Ruth), (Abe, Jack), (Olga, Mary), (Olga, Abe), (Olga, Nancy), (Jack, Bert), (John, Nancy)}

Note that the child-parent relation is obtained by switching the two coordinates. It contains as members: (Ruth, Abe), (Jack, Abe), (Mary, Olga), etc. Relations that involve three members are construed, accordingly, as sets of 3-tuples. For example, the betweenness relation, which holds between any three points x, y, z on a line such that y is between x and z, is the set of all triples (x, y, z) such that y is between x and z.

Here, to sum up, are some basic notions and terms: A binary relation is a set of ordered pairs. An n-ary relation (also called an n-place relation) is a set of n-tuples. Unqualified ‘relation’ often means a binary relation. If R is an n-ary relation, then n is referred to as the arity of R, or the number of places of R. {(x1 , x2 , . . . , xn ) : . . . x1 . . . x2 . . . . . . xn . . .} is the set of all tuples (x1 , x2 , . . . , xn ) satisfying the condition stated by ‘. . . x1 . . . x2 . . . . . . xn . . .’.
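The representation of a relation as a set of tuples can be written down almost verbatim. The sketch below assumes Python tuples of names for ordered pairs; the function names `converse` and `arity` are illustrative choices, not notation from the text.

```python
# The parent-child relation over the illustrative domain, as a set of ordered pairs.
parent_of = {
    ("Abe", "Ruth"), ("Abe", "Jack"),
    ("Olga", "Mary"), ("Olga", "Abe"), ("Olga", "Nancy"),
    ("Jack", "Bert"), ("John", "Nancy"),
}


def converse(relation):
    """Switch the two coordinates of every pair of a binary relation."""
    return {(y, x) for (x, y) in relation}


def arity(relation):
    """The common length of the tuples of a non-empty relation."""
    lengths = {len(t) for t in relation}
    if len(lengths) != 1:
        raise ValueError("not a set of tuples of one fixed length")
    return lengths.pop()


# The child-parent relation is obtained by switching coordinates.
child_of = converse(parent_of)
```

Switching coordinates twice gives back the original relation, so `converse(child_of)` equals `parent_of`.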
The betweenness relation above can be written as: {(x, y, z) : y is between x and z} where ‘x’, ‘y’ and ‘z’ range over geometrical points. Here are some other examples:

{(x, y) : x is a parent of y},
{(x, y) : y is a parent of x},
{(x, y, z) : x introduced y to z}
{(x, y) : x and y are real numbers and y = 2x + 1}
{(x, y) : x and y are natural numbers and x ≥ y}

Note: The variables in relational notation are used as place holders, that is, to correlate coordinates with places in the defining expression. Different variables, or the same variables in different roles, can achieve the same effect:

{(x, y) : x is a parent of y} = {(y, x) : y is a parent of x} = {(u, x) : u is a parent of x}

But {(x, y) : x is a parent of y} ≠ {(x, y) : y is a parent of x}
The first relation consists of pairs in which the parent occupies the first coordinate and the child the second; in the other relation the child is in the first place and the parent in the second. Self-explanatory variants of our notation involve repetitions of variables, e.g., {(x, x) : x ∈ D} is the set of all pairs (x, x), where x ranges over D. It is equal to {(x, y) : x, y ∈ D and x = y}.

Note: The arity of the relation is the length of the tuple; it may be greater than the number of different variables that appear in the definition, because, as we have just seen, the same variable can occupy different places in the tuple.

Relations Over a Given Domain: Often, we consider relations that relate objects of particular kinds: numbers, people, animals, words, etc. We say that a relation is over D if it consists of tuples whose members belong to D. Usually, the variables range over well defined domains. In ‘x is an uncle of y’, ‘x’ and ‘y’ range, obviously, over people. Relations can, however, relate objects of different kinds; e.g., the ownership relation that holds between x and y, just when x is a person, and y is an object owned by x.

Homework 5.4 Consider binary relations consisting of the pairs (x, y), determined respectively by the following conditions. (When the signs ≥, [...] 1, of 2’s and 3’s, that is: all numbers of the form $2^m 3^n$, where m, n ≥ 0 and at least one of m, n is non-zero. (Recall that $x^0 = 1$ and $x^1 = x$.)

The set S2: (1) 2, 3 ∈ S2 (2) If x, y ∈ S2, then x·y ∈ S2

Clause (2) means that S2 is closed under products; i.e., it contains, with every two members, also their product. It is not difficult to see that S1 = S2. The argument, which is easy, shows how the property of being the smallest set satisfying the condition is used: S2 contains 2 and 3 and is closed under products. Hence it contains all products of 2’s and 3’s. Therefore S2 satisfies the conditions that define S1. Since S1 is the smallest set satisfying these conditions, we have: S1 ⊆ S2.
Vice versa, the set of all products > 1 of 2’s and 3’s contains 2 and 3 and is closed under products. Hence it satisfies the conditions that define S2. Since S2 is the smallest set satisfying these conditions, we have: S2 ⊆ S1. Putting the two together we get: S1 = S2. This case is easy. But, in general, the question whether two given inductive definitions define the same set can be very difficult. The set S3: (1) 1 ∈ S3.
(2) If x ∈ S3, then 2x ∈ S3.
It is not difficult to see that S3 is just the set consisting of all powers of 2: $\{2^0, 2^1, 2^2, 2^3, \ldots, 2^n, \ldots\}$
The set S4: (1) 3, 5 ∈ S4. (2) If x ∈ S4, then x+3 ∈ S4. (3) If x ∈ S4, then x+5 ∈ S4. S4 is the analogue of S1 (with 2 and 3 replaced by 3 and 5) in which products have been replaced by sums. It is not difficult to see that S4 consists of all numbers > 0 that can be written as 3m + 5n, where m, n are natural numbers. Just as S4 is the analogue of S1, so the following set is the analogue of S2. The set S5: (1) 3, 5 ∈ S5. (2) If x, y ∈ S5, then x+y ∈ S5. As in the case for products, one can show that S4 = S5. It can be also shown that this is the same as the following S6. The set S6: (1) 3, 5, 6, 8 ∈ S6. (2) If x ∈ S6 and x ≥ 8 then x+1 ∈ S6. S6 is simply the set consisting of 3, 5, 6, 8, and all numbers greater than 8. [To see that S4 ⊆ S6 note that 3, 5 ∈ S6, that of all numbers ≤ 8 only 3, 5, 6, 8 are sums of 3’s and 5’s; consequently, S6 is closed under (2) and (3) in the definition of S4. To see that S6 ⊆ S4, note that every number among 3, 5, 6, 8 is a sum of 3’s and 5’s and each number from 9 on is obtainable by adding to some number from 3, 5, 6, 8 a sum of 3’s and 5’s.] In the preceding examples, the recursive rules add to the set numbers of growing size. Consequently, the set keeps growing and the fixed point is infinite. As the following example shows, this need not hold in general. The set S7:
(1) 7 ∈ S7. (2) If n ∈ S7 and n is odd, then 2n ∈ S7. (3) If n ∈ S7 and n > 4, then n − 2 ∈ S7.

By iterating these rules we put into our set the following numbers: 7, 14, 5, 3, 12, 10, 8, 6, 4. Additional applications of the rules do not yield new numbers. Hence, S7 = {7, 14, 5, 3, 12, 10, 8, 6, 4}

Homework 5.9 Let k be a fixed natural number. Let Xk be the set defined, inductively, by the following clauses: (1) k ∈ Xk . (2) If x ∈ Xk and x is even, then x/2 ∈ Xk . (3) If x ∈ Xk and x is odd, then (3x+1)/2 ∈ Xk . Write down (in the curly brackets notation) the sets Xk for the cases: k = 0, 1, 2, 3, 5, 6, 15, 17. Does there exist a number k for which Xk is infinite? This is an open and apparently a very difficult problem in number theory.

Many examples of inductive definitions that apply to objects that are not numbers are given in 5.2.3. We had already one example: the set of descendants. Here is one of the same kind.

The Set of Maternal Descendants: Let ‘maternal descendant’ mean a descendant via the mother-child relation. Note that the connecting chain must consist of females, except, possibly, the last descendant. Using ‘MDa ’ for the set of maternal descendants of a, the clauses of the definition are:

(MD1) If a is female and x is a child of a, then x ∈ MDa .
(MD2) If x ∈ MDa and x is female and y is a child of x, then y ∈ MDa .

Note: If a is not female, MDa is empty. Formally, one shows that ∅ satisfies the two conditions for MDa : Since a is not female, the antecedent of the first condition is false and the condition holds vacuously. ∅ satisfies also the second condition, since no x is in ∅.
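The inductively defined number sets S4 to S7 above can be explored by iterating their rules until nothing new appears. The following is a minimal sketch under stated assumptions: the helper `least_fixed_point` and the cut-off `bound` are illustrative devices, not part of the text. Discarding numbers above the bound is harmless for S4, S5 and S6, whose rules only produce larger numbers, and for S7, where no rule ever produces a number above 14; in general such truncation could lose members.

```python
def least_fixed_point(base, step, bound):
    """Start from `base` and keep adding what `step` produces from the
    current set (ignoring numbers above `bound`) until nothing new appears."""
    current = {b for b in base if b <= bound}
    while True:
        new = {y for y in step(current) if y <= bound} - current
        if not new:
            return current
        current |= new


BOUND = 60

# S4: 3, 5 are members; closed under x -> x+3 and x -> x+5.
s4 = least_fixed_point({3, 5}, lambda s: {x + 3 for x in s} | {x + 5 for x in s}, BOUND)

# S5: 3, 5 are members; closed under sums of two members.
s5 = least_fixed_point({3, 5}, lambda s: {x + y for x in s for y in s}, BOUND)

# S6: 3, 5, 6, 8 are members; closed under x -> x+1 for x >= 8.
s6 = least_fixed_point({3, 5, 6, 8}, lambda s: {x + 1 for x in s if x >= 8}, BOUND)

# S7: 7 is a member; odd n gives 2n, and n > 4 gives n-2.
s7 = least_fixed_point(
    {7},
    lambda s: {2 * n for n in s if n % 2 == 1} | {n - 2 for n in s if n > 4},
    BOUND,
)
```

Up to the bound, `s4`, `s5` and `s6` coincide, in line with the claim S4 = S5 = S6, and `s7` comes out as the nine-element set listed above. A bounded computation of this kind illustrates the claims but does not prove them.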
Inductive Definitions of Relations The machinery of inductive definitions can be applied to define relations, where these, recall, are sets of pairs, or of n-tuples. The conditions determine rules for adding certain pairs, or n-tuples, to the set that is being constructed. Here, for example, is the definition of the descendant relation, Des, which is the set of all pairs (x, z) in which x is a descendant of z. This definition is obtained from that of a’s descendants by replacing the fixed parameter ‘a’ by a variable, say ‘(z)’, and by suitable replacements of ‘x’ by ‘(x, z)’. (1) If x is a child of z, then (x, z) ∈ Des. (2) If (x, z) ∈ Des and y is a child of x then (y, z) ∈ Des. Notation:
Let s be the successor function, defined for natural numbers: s(x) = x + 1.
Many relations over natural numbers can be defined inductively, in terms of the successor function. Here is one. (1) (x, s(x)) ∈ R (i.e., this holds for all natural numbers x). (2) If (x, y) ∈ R, then (x, s(y)) ∈ R. (1) puts in R all pairs of the form (x, s(x)). Then, an application of (2) adds all the pairs (x, s(s(x))), another application adds the pairs (x, s(s(s(x)))), and so on. It is not difficult to see that R consists exactly of all pairs (x, y) in which x < y. Hence, (1) and (2) define inductively the smaller-than relation, [...] 0, if all numbers smaller than n have the property P, then n has it. We simply let ‘n’ play the role of our previous ‘n+1’. (I) and (II∗∗ ) can be combined into a single condition:

(III) For any n, if all numbers smaller than n have the property P, then n has it.

For n = 0, (III) is equivalent to (I): since there are no numbers < 0, the antecedent is satisfied vacuously, hence the claim means that 0 has the property. For n > 0, (III) is the same as (II∗∗ ). The proof of (III) may, of course, proceed by cases, with n = 0 treated as a separate case. Various variants of strong induction are obvious. For example, in order to show that all natural numbers belonging to some given set, X, have a property P, it suffices to prove the relativized version of (III):

(IIIX ) For any n in X, if all numbers in X that are smaller than n have the property P, then n has it.

Here is an example of strong induction in use. A natural number is called prime if it is greater than 1 and is not a product of two smaller numbers. We shall now show, by strong induction, that every number > 1 is either a prime or a product of a finite number of primes; i.e., of the form p1 · p2 · . . . · pk , where the pi ’s are prime (we presuppose here the concept of finite sequences and some elementary properties of products of this kind). Assume that n > 1 and that the claim holds for all numbers > 1 that are smaller than n.
If n is not a product of two smaller numbers, then it is a prime and the claim holds. Otherwise, n is a product of two smaller numbers, say n = k · m, where k, m < n. Both k and m must be > 1 (if one of them is 1 the other cannot be smaller than n). Hence each is either a prime or a product of primes: $k = p_1 \cdots p_i$, $m = p'_1 \cdots p'_j$. Combining the two we get: $n = p_1 \cdots p_i \cdot p'_1 \cdots p'_j$.
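The strong-induction argument just given has the shape of a recursive procedure: either n cannot be split and is prime, or it splits into two smaller factors to which the same reasoning applies. The sketch below follows that shape; the function names `smallest_split` and `prime_factors` are illustrative, and trial division is merely one simple way, chosen here as an assumption, of finding a split.

```python
def smallest_split(n):
    """Return (k, m) with n == k * m and 1 < k <= m < n, or None if
    n is not a product of two smaller numbers."""
    k = 2
    while k * k <= n:
        if n % k == 0:
            return k, n // k
        k += 1
    return None


def prime_factors(n):
    """A list of primes whose product is n, for n > 1."""
    if n <= 1:
        raise ValueError("the claim concerns numbers greater than 1")
    split = smallest_split(n)
    if split is None:
        return [n]  # n is prime: the first case of the proof
    k, m = split
    # Both k and m are smaller than n, so the recursion mirrors
    # the use of the inductive hypothesis.
    return prime_factors(k) + prime_factors(m)
```

The recursion terminates for the same reason the induction works: each recursive call is on a number greater than 1 and smaller than n.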
Since strong induction is a more convenient tool it is employed whenever an inductive argument relating to natural numbers is needed. The term ‘induction’ often means strong induction.
5.2.3
Formal Languages as Sets of Strings
Written linguistic constructs are usually finite sequences of signs. There is a theory that treats languages simply as sets of finite sequences. The elements of the sequences are taken from some fixed domain, which is called, in this context, the alphabet. (If we were to represent English in this way, then the “alphabet” would consist of all English words and punctuation marks and the set of sequences will consist of all grammatical English sentences.) Let Σ be some fixed non-empty set of objects. We refer to Σ as the alphabet and we assume that no member of Σ is a sequence of members of Σ.

Strings Over Σ: By a (non-empty) string over Σ we mean either a member of Σ, or a sequence of length > 1 of members of Σ. The length of the string is 1, if it is a member of Σ; otherwise, it is the length of the sequence. Strings over Σ are like finite sequences of members of Σ, with the sole difference that the strings of length 1 are the members themselves. This is done for the sake of convenience; if a ∈ Σ the distinction between a and the sequence $\langle a \rangle$ plays no role in the theory. The assumption that no member of Σ is itself a sequence of members of Σ is necessary in order to avoid ambiguity in determining the length of a string and the string’s members. A string of length 1 has one member: the string itself. (Note that ‘member’ does not denote here set-theoretic membership!)

Each string over Σ has a unique length, a uniquely determined first member, say a1 , a uniquely determined second member, say a2 , and so on. If the length of the string is n, and ai is its ith member, i = 1, . . . , n, then the string is written as: a1 a2 . . . an If the string is of length 1 we simply have a1 (an element of Σ). We say that a occurs in the string if, for some i, a = ai . It is very useful to include among our strings a so-called empty string or null string, whose length is 0, which has no members. It plays a role somewhat analogous to that of the empty set.
The null string is denoted as: Λ

The criterion of identity for strings is obvious: If $a_i, b_j \in \Sigma$ for all $i = 1, \ldots, m$ and $j = 1, \ldots, n$, then

$a_1 \ldots a_m = b_1 \ldots b_n$ iff $m = n$ and $a_i = b_i$ for all $i = 1, \ldots, m$.

(If both m and n are > 1, this follows from the corresponding property of sequences (cf. 5.1.3). If one of them is 1, this follows from our assumption that no member of Σ is a sequence of members of Σ.) The set of all strings over Σ is denoted as: $\Sigma^*$

Concatenation of Strings: For x and y in $\Sigma^*$ such that $x = a_1 \ldots a_m$ and $y = b_1 \ldots b_n$, the concatenation of x and y is the string $a_1 \ldots a_m b_1 \ldots b_n$. It is denoted as: xy

Concatenation is also defined if one of the strings, or both, is Λ: xΛ = Λx = x

Obviously, concatenation is associative: (xy)z = x(yz). Hence, in repeated concatenation we can omit parentheses. If $x_1, x_2, \ldots, x_n$ are strings then $x_1 x_2 \ldots x_n$ is the string obtained by concatenating them in the given order. Note that the string $a_1 \ldots a_n$, where the $a_i$’s are members of Σ, is the concatenation of the $a_i$’s, where these are considered as strings. By a language over Σ, we mean any subset of $\Sigma^*$.
Inductive Definitions of Sets of Strings Strings form a domain where inductive definitions are particularly useful. First, note that, if Σ1 ⊆ Σ, then Σ∗1 (the set of all strings over Σ1 ) can be characterized inductively as the smallest set satisfying: (1) Λ ∈ Σ∗1 .
(2) If $x \in \Sigma_1^*$ and $p \in \Sigma_1$, then $xp \in \Sigma_1^*$.

In other words: $\Sigma_1^*$ is the smallest set containing Λ and closed under concatenation (to the right) with members of $\Sigma_1$. If, for example, $a_1, \ldots, a_n$ are any members of $\Sigma_1$, then, by (1), $\Lambda \in \Sigma_1^*$. Hence, by (2), $\Lambda a_1 \in \Sigma_1^*$, but this is exactly $a_1$. Consequently: $a_1 \in \Sigma_1^*$. An additional application of (2) yields: $a_1 a_2 \in \Sigma_1^*$, and so on; n applications of (2) give us: $a_1 a_2 \ldots a_n \in \Sigma_1^*$.

$\Sigma_1^*$ is also the smallest set containing Λ and satisfying the following two conditions:

(2′) If $x \in \Sigma_1$, then $x \in \Sigma_1^*$.
(2″) If $x, y \in \Sigma_1^*$, then $xy \in \Sigma_1^*$.

Prefixes of Strings: A prefix of a string x (called also an initial segment of x) is any string y, such that, for some string z: yz = x

It is easily seen that a prefix of $a_1 \ldots a_n$ is any string of the form $a_1 \ldots a_m$, where m ≤ n. The case m = 0 is taken to yield the empty string. Every string x is a prefix of itself, since x = xΛ. A proper prefix of x is one which is different from x. The prefix relation can be defined inductively, using only concatenation (to the right) with members of Σ:

(1) For every string x, x is a prefix of x.
(2) If x is a prefix of y, and p ∈ Σ, then x is a prefix of yp.

Homework 5.11 A suffix of x is any string y such that, for some string z: zy = x
A segment of x is any string y, such that, for some strings u and v: uyv = x

Give inductive definitions of these concepts, based on concatenation (to the left, or to the right) with members of Σ.

Powers of strings are defined as iterated concatenations: If n is a natural number, then: $x^n = xx \ldots x$, where the number of x’s on the right-hand side is n. If n = 0, this is defined to be Λ. The following is an inductive definition of that function. Note that the induction is on natural numbers, but the values of the function are strings.

(1) $x^0 = \Lambda$.
(2) $x^{n+1} = x^n x$.

Obviously, $x^1 = x$.

Note: If a ∈ Σ, then $a^n$ is simply the string of length n consisting of n a’s.
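The inductive clauses for powers and for the prefix relation translate directly into recursive functions. The sketch below assumes that every member of the alphabet is a single character, so that Python `str` values can stand in for strings over Σ and `""` for Λ; the names `EMPTY`, `power` and `is_prefix` are illustrative.

```python
EMPTY = ""  # stands in for the null string Λ


def power(x, n):
    """x^n, following the clauses x^0 = Λ and x^(n+1) = x^n x."""
    if n == 0:
        return EMPTY
    return power(x, n - 1) + x


def is_prefix(x, y, alphabet):
    """The prefix relation, following its two inductive clauses:
    (1) x is a prefix of x;
    (2) if x is a prefix of y' and p is in the alphabet, x is a prefix of y'p."""
    if x == y:
        return True          # clause (1)
    if y == EMPTY:
        return False         # no clause can put (x, Λ) in the relation when x != Λ
    shorter, p = y[:-1], y[-1]
    return p in alphabet and is_prefix(x, shorter, alphabet)  # clause (2), read backwards
```

For strings over the given alphabet, `is_prefix(x, y, alphabet)` agrees with Python's built-in `y.startswith(x)`; the recursive version is shown only because it mirrors the inductive definition.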
Examples: The following are examples of languages, i.e., sets of strings, defined by induction. We assume that a, b, and c are some fixed members of Σ, and x, y, z, are variables ranging over $\Sigma^*$.

The set L1: (1) b ∈ L1. (2) If x ∈ L1, then axc ∈ L1.

Starting with (1) (the base rule), we put b in L1. Then, rule (2) enables us, given any member of L1, to get from it a new member in L1 by concatenating it with a on the left and with c on the right. Hence, applying (1) and following it by repeated applications of (2) we get

b, abc, aabcc, aaabccc, . . . , $a^n b c^n$, . . .

It is not difficult to see that L1 consists of all strings of the form $a^n b c^n$, where n = 0, 1, . . ..

The set L2: (1) abcc ∈ L2 (2) If x ∈ L2, then axcc ∈ L2.

By applying (1), we put abcc into L2. Then, each application of (2) adds one a at the beginning and two c’s at the end. Hence, we end by getting strings of the form: $a^n b c^{2n}$, n = 1, 2, . . .
L2 is the language consisting exactly of all these strings. In these two examples the inductively defined sets of strings have also explicit definitions, which enumerate according to an obvious rule the members of the set. But, in general, an alternative description is not easily found. Sometimes the inductive definition is all that we have. Consider, for instance, the following very simple set of rules:

(1) ab ∈ L3.
(2) If x ∈ L3, then axb ∈ L3.
(3) If x, y ∈ L3, then xy ∈ L3.

Let a and b be, respectively, the left and right parentheses: a = ( and b = ). Then L3 consists of all parentheses-strings in which all parentheses are matched. E.g.,

(())     ()()()     (()())()     ()()()(())(()())

are in L3, while

())(     (()((())     ()())(()

are excluded from it. But the concept of matching parentheses is itself in need of clarification. A very good way of doing this is by the inductive definition just given.

If x is a string then $x^{-1}$ is defined to be the string obtained by reversing x, i.e., by reading it right-to-left (and writing the result left-to-right). Here is an inductive definition of this function:

(1) $\Lambda^{-1} = \Lambda$.
(2) If p ∈ Σ, then $(xp)^{-1} = p(x)^{-1}$ (i.e., the reversal of xp is p followed by the reversal of x).

(The parentheses in (2) are used to indicate the string to which the function is applied, they are not string members.) For example, to find $(abb)^{-1}$, we note that abb = Λabb, hence:

$(abb)^{-1} = (\Lambda abb)^{-1} = b(\Lambda ab)^{-1} = bb(\Lambda a)^{-1} = bba(\Lambda)^{-1} = bba\Lambda = bba$

A similar and shorter argument shows that, if p ∈ Σ, then $p^{-1} = p$.

Homework 5.12 Let L4 be the set of strings defined inductively by:
(1) Λ ∈ L4. (2) If x ∈ L4, then axb ∈ L4. (3) If x ∈ L4, then bxa ∈ L4. (4) If x, y ∈ L4, then xy ∈ L4.

Find all the strings in L4 whose length is 4. Show how to reach each of them by a sequence of rule-applications. There is a very simple description of L4, can you guess it?

5.13 Define inductively each of the following languages. Use the indicated names in the definition.

L5: The set of all $a^n b^n$, where n = 0, 1, . . .
L6: The set of all $a^m b a^n$, where 0 ≤ n < m.
L7: The set of all $a^n b^n c^n$, where n = 1, 2, . . ..
PAL: The set of all palindromes over the alphabet {a, b}. A palindrome is a string x such that $x^{-1} = x$.
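The inductive definition of reversal given before these exercises can be transcribed as a recursive function. This is a sketch only, assuming single-character alphabet members so that Python `str` can model strings and `""` the null string; the name `reverse` is illustrative.

```python
def reverse(s):
    """Reversal by the two clauses: the reversal of Λ is Λ, and the
    reversal of xp is p followed by the reversal of x."""
    if s == "":
        return ""            # clause (1)
    x, p = s[:-1], s[-1]     # decompose s as xp, with p the last member
    return p + reverse(x)    # clause (2)
```

Each recursive call peels off the last member, exactly as in the worked computation of the reversal of abb. The result agrees with Python's slice idiom `s[::-1]`.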
Proofs by Induction on Strings

Since sets of strings are often characterized inductively, the technique of inductive proofs comes in handy. The proofs fall under the general scheme of IN1 and IN2, given in 5.2.2. Here is an example that underlies string processing techniques. L3 is the matching-parentheses set defined above (with a and b in the role of left and right parentheses).

Claim: For every x in L3 the following is true: In every prefix of x the number of a’s is greater than or equal to the number of b’s.

Proof: First consider the strings that are put in L3 by the base rule (1) of the definition. The rule puts in L3 the single string ab. The prefixes of this string are: Λ, a, ab and the claim in this case is obviously true. Next we have two inductive rules (2) and (3). Accordingly, we have to show:

(C2) If, in every prefix of x, the number of a’s is not smaller than the number of b’s, then this is also true for every prefix of axb.
(C3) If, in every prefix of x and in every prefix of y, the number of a’s is not smaller than the number of b’s, then this is also true for every prefix of xy.

In showing this we shall use certain obvious properties of prefixes (themselves provable from the definition of prefix):

(a) Every prefix of axb is either Λ, or au, where u is some prefix of x, or axb.
(b) Every prefix of xy is either a prefix of x, or xv, where v is a prefix of y.

(C2) now follows easily from (a), and (C3) from (b). Consider, for example, (C2). If u is a prefix of x, and if m and n are, respectively, the numbers of a’s and b’s in it, then: (i) In au, the numbers of a’s and b’s are m+1 and n. (ii) In axb (where u is x itself), the numbers of a’s and b’s are m+1 and n+1. By our assumption (the inductive hypothesis) m ≥ n. Therefore m+1 > n, and m+1 ≥ n+1. The proof of (C3) is similar. QED

Homework 5.14 Prove the following by induction:
1. Every string in the set L1 (defined above) is of the form $a^n b c^n$, where n = 0, 1, . . ..
2. In every string of L2, the number of c’s is twice the number of a’s.
3. In every string of L3 the number of a’s is the same as the number of b’s.
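The claim about prefixes of members of L3 can be spot-checked on short strings. The sketch below generates the members of L3 up to a given length by iterating rules (1) to (3), with ( and ) in the roles of a and b, and then tests the prefix property on each. The function names and the length cut-off are illustrative assumptions; cutting off by length is harmless here because both rules only produce longer strings.

```python
def l3_up_to(max_len):
    """All members of L3 of length <= max_len, with a = '(' and b = ')'."""
    members = {"()"} if max_len >= 2 else set()   # rule (1)
    while True:
        new = set()
        for x in members:
            if len(x) + 2 <= max_len:
                new.add("(" + x + ")")            # rule (2): axb
            for y in members:
                if len(x) + len(y) <= max_len:
                    new.add(x + y)                # rule (3): xy
        if new <= members:
            return members
        members |= new


def prefix_claim_holds(x):
    """True if in every prefix of x the number of a's is at least the
    number of b's (a running balance that never goes negative)."""
    balance = 0
    for symbol in x:
        balance += 1 if symbol == "(" else -1
        if balance < 0:
            return False
    return True


short_members = l3_up_to(8)
claim_checked = all(prefix_claim_holds(x) for x in short_members)
```

Such a check covers only finitely many strings and is no substitute for the inductive proof; it merely illustrates what the proof establishes. The running-balance test is also the idea behind the usual one-pass check for unmatched right parentheses.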
5.2.4
Simultaneous Induction
The technique of inductive definitions can be applied to define several sets and/or relations at one go, i.e., by one set of conditions. Consider for example the following rules, which define, simultaneously, two sets of natural numbers E and O. (1) 0 ∈ E. (2) If x ∈ E, then x+1 ∈ O. (3) If x ∈ O, then x+1 ∈ E. It is not difficult to see that E and O are the sets of even and odd numbers.
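The simultaneous definition of E and O can be run as a joint iteration in which each set feeds the other. The following is a minimal sketch, assuming a non-negative cut-off `bound` (an illustrative device, harmless because both rules only produce larger numbers); the function name is not from the text.

```python
def even_odd_up_to(bound):
    """Build E and O together, restricted to numbers <= bound (bound >= 0)."""
    E, O = {0}, set()                      # clause (1): 0 is in E
    changed = True
    while changed:
        changed = False
        for x in list(E):                  # clause (2): x in E gives x+1 in O
            if x + 1 <= bound and x + 1 not in O:
                O.add(x + 1)
                changed = True
        for x in list(O):                  # clause (3): x in O gives x+1 in E
            if x + 1 <= bound and x + 1 not in E:
                E.add(x + 1)
                changed = True
    return E, O
```

Neither set can be computed on its own: every new member of O comes from E and vice versa, which is what makes the definition simultaneous.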
There are other pairs of sets that satisfy the clauses just given, for example the pair in which both sets consist of all the natural numbers. But the pair (E, O) is smallest, in the following sense: If $(E', O')$ is any pair of sets satisfying the three clauses (with ‘E’ and ‘O’ replaced by ‘E′’ and ‘O′’), then $E \subseteq E'$ and $O \subseteq O'$. If several sets (and relations) are defined inductively by a single set of rules, we say that they are defined simultaneously. These definitions are an extremely powerful tool. In the following example simultaneous induction is used to define a toy language, which is a fragment of English. We let Σ be the set consisting of eight English words:

Jack, Jill, the, person, who, liked, saved, hated
Since each of these is itself a string of more basic elements (the letters), we leave spaces between the concatenated English words in order to ensure a unique and easy reading. We now define by simultaneous induction three subsets of $\Sigma^*$, denoted as: NP, VP, S

As it will turn out, NP is a set of noun-phrases, VP is a set of verb-phrases, and S is a set of sentences.

(1) Jack, Jill ∈ NP.
(2) If x ∈ NP, then each one of the following strings is in VP: liked x, saved x, hated x
(3) If x ∈ VP, then the following string is in NP: the person who x
(4) If x ∈ NP and y ∈ VP, then xy ∈ S.

Applying these rules, we find, for example, that

the person who liked Jill saved the person who hated Jack

is in S. Here is the proof:

(i) Jack and Jill are in NP, by (1).
(ii) liked Jill and hated Jack are in VP, by (2).
(iii) the person who liked Jill and the person who hated Jack are both in NP, by (3).
(iv) saved the person who hated Jack is in VP, by (2).
(v) the person who liked Jill saved the person who hated Jack is in S, by (4).
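The toy language can be generated mechanically by applying the four rules a fixed number of times. The sketch below is one way of doing so under stated assumptions: strings of words are modelled as Python strings with single spaces between words, and the function name `toy_language` and its `rounds` parameter are illustrative.

```python
VERBS = ("liked", "saved", "hated")


def toy_language(rounds):
    """Apply rules (2) and (3) `rounds` times, starting from rule (1),
    then form sentences by rule (4). Returns the sets (NP, VP, S)."""
    NP = {"Jack", "Jill"}                                    # rule (1)
    VP = set()
    for _ in range(rounds):
        VP |= {f"{verb} {x}" for verb in VERBS for x in NP}  # rule (2)
        NP |= {f"the person who {x}" for x in VP}            # rule (3)
    S = {f"{x} {y}" for x in NP for y in VP}                 # rule (4)
    return NP, VP, S


NP, VP, S = toy_language(2)
example = "the person who liked Jill saved the person who hated Jack"
example_in_S = example in S
```

Two rounds suffice for the example sentence, matching the five-step derivation above: its noun-phrase appears after the first round and its verb-phrase after the second. The full sets NP, VP and S are infinite, so any finite number of rounds yields only an initial portion of each.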
Chapter 6
The Sentential Calculus
6.0
Returning in this chapter to sentential logic, we set up a rigorously defined formal language, based on an infinite sequence of atomic sentences and on the sentential connectives. We shall define the concept of an interpretation of this language, based on which we shall define, for that language, logical truth, logical falsity, and logical implication. We shall also investigate additional topics: disjunctive and conjunctive normal forms, truth-functions, and the expressiveness of sets of connectives. In the second part of this chapter the fundamental concept of a formal deductive system is defined, Hilbert-type and Gentzen-type systems for sentential logic are given and their basic features are established. The two fundamental criteria that relate the syntax to the semantics, soundness and completeness, are defined and the systems presented are shown to satisfy them.
6.1
The Language and Its Semantics
6.1.0
So far, we have assumed that our language is provided with certain sentential operations: negation, conjunction, and other connectives; that its sentences are generated from certain atomic sentences, and that certain general conditions hold. We shall now show how to define, with full formal rigour, a language that satisfies these assumptions. The language is built bottom-up, from a given set of atomic sentences; that is, all other sentences are generated from them by repeated applications of the sentential connectives. The latter are defined in a way that ensures unique readability; i.e., there is exactly one decomposition of a non-atomic sentence into components and no atomic sentence is decomposable (the detailed requirements were listed in 2.3).
6.1.1
Sentences as Strings
There are many ways of setting up the language so as to satisfy the required properties. The choice of a particular definition is a question of convenience. For example, one can define sentences to be certain labeled trees (cf. 2.4). The most common way, however, is to define them, and linguistic constructs in general, as strings over some given set of symbols; that is, they are either members or finite sequences of members of that set (cf. 5.2.3). This still leaves us with a large degree of freedom. Here is one way of defining the language. The set of symbols, referred to as the alphabet, consists of the following distinct members:

$A_1, A_2, \ldots, A_n, \ldots, \quad \boldsymbol{\neg}, \quad \boldsymbol{\wedge}, \quad \boldsymbol{\vee}, \quad \boldsymbol{\to}, \quad \boldsymbol{\leftrightarrow}$

The $A_i$’s are the atomic sentences. There is an infinite number of them. The other five symbols are the connective letters: the negation letter $\boldsymbol{\neg}$, the conjunction letter $\boldsymbol{\wedge}$, and so on. As usual, we assume that none of the alphabet members is a finite sequence of other members (cf. 5.2.3). The difference between the connective letters $\boldsymbol{\neg}$, $\boldsymbol{\wedge}$, $\boldsymbol{\vee}$, $\boldsymbol{\to}$, $\boldsymbol{\leftrightarrow}$, and the metalinguistic symbols we have been using: ‘¬’, ‘∧’, ‘∨’, ‘→’, ‘↔’, is that the former are simply symbols occurring in strings that constitute the sentences. But the latter are names of certain syntactic operations. The different font is used to make this clear. The two are of course related: the operations are defined by concatenation of the strings with the corresponding connective letters.

The set of all sentences is the smallest set of strings, S, satisfying:

(i) $A_i \in S$, for all i = 1, 2, . . .
(ii) If x, y ∈ S, then:
(ii.1) $\boldsymbol{\neg}x \in S$
(ii.2) $\boldsymbol{\wedge}xy \in S$
(ii.3) $\boldsymbol{\vee}xy \in S$
(ii.4) $\boldsymbol{\to}xy \in S$
(ii.5) $\boldsymbol{\leftrightarrow}xy \in S$
Here, and in this section only, ‘x’, ‘y’, ‘z’, ‘x′’, etc., are variables ranging over strings.

Note: The sentences of the language are constructed along the lines of the Polish notation. Had we used an infix convention, we should have included two additional symbols of left and right parenthesis. The choice of Polish notation helps to distinguish the given formal language from our metalanguage, where the infix notation is used.

The sentential operations are now defined as follows. (The sign ‘=df ’ should read: ‘equal by definition’.)

¬x (the negation of x) =df $\boldsymbol{\neg}x$
x ∧ y (the conjunction of x and y) =df $\boldsymbol{\wedge}xy$
x ∨ y (the disjunction of x and y) =df $\boldsymbol{\vee}xy$
x → y (the conditional of x and y) =df $\boldsymbol{\to}xy$
x ↔ y (the biconditional of x and y) =df $\boldsymbol{\leftrightarrow}xy$

These definitions imply, for example, the equalities:

$A_6 \vee \neg A_3 = \boldsymbol{\vee}A_6\boldsymbol{\neg}A_3$
$(A_1 \to A_4) \wedge A_2 = \boldsymbol{\wedge}\boldsymbol{\to}A_1A_4A_2$
$A_1 \to A_4 \wedge A_2 = \boldsymbol{\to}A_1\boldsymbol{\wedge}A_4A_2$

In the last equality, the grouping of the left-hand side is obtained via our grouping conventions. Given the previous definition of sentences, it is obvious that the set of sentences is closed under applications of the connectives: the negation of a sentence is a sentence, the conjunction of two sentences is a sentence, etc. It remains to show that unique readability obtains. This comes to the following claims.

(I) No atomic sentence, $A_i$, is of the form $\boldsymbol{\neg}x$, or $cxy$, where x and y are sentences and c is a binary connective letter.
(II) No sentence $\boldsymbol{\neg}x$ is of the form $cyz$, where c is a binary connective letter and y and z are sentences.
(III) If x and x′ are sentences and $\boldsymbol{\neg}x = \boldsymbol{\neg}x'$, then $x = x'$.
(IV) For all sentences x, y, x′, y′, if c and c′ are binary connective letters then: $cxy = c'x'y'$ only if $c = c'$, $x = x'$, and $y = y'$.
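The way the sentential operations build Polish-notation strings can be imitated in a few lines. The sketch below makes several assumptions that are not in the text: strings over the alphabet are modelled as Python tuples of symbols, atomic sentences as the labels "A1", "A2", and so on, and the five connective letters as the ASCII stand-ins `~`, `&`, `|`, `>`, `=`. All function names are illustrative.

```python
# Hypothetical ASCII stand-ins for the five connective letters of the alphabet.
NEG, AND, OR, IMP, IFF = "~", "&", "|", ">", "="


def atom(i):
    """The atomic sentence A_i, as a string of length 1."""
    return (f"A{i}",)


def neg(x):
    """The negation of x: the negation letter followed by x."""
    return (NEG,) + x


def binary(letter, x, y):
    """A binary connective letter followed by x and then y."""
    return (letter,) + x + y


def conj(x, y):
    return binary(AND, x, y)


def disj(x, y):
    return binary(OR, x, y)


def cond(x, y):
    return binary(IMP, x, y)


def bicond(x, y):
    return binary(IFF, x, y)


# The three equalities displayed above, written with the stand-in letters.
first = disj(atom(6), neg(atom(3)))             # | A6 ~ A3
second = conj(cond(atom(1), atom(4)), atom(2))  # & > A1 A4 A2
third = cond(atom(1), conj(atom(4), atom(2)))   # > A1 & A4 A2
```

In this representation the operations are ordinary functions of the metalanguage, while the letters are mere members of tuples, which reflects the distinction drawn in the text between the names of syntactic operations and the connective letters themselves.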
(I) follows trivially from the assumption that no member of the alphabet is equal to a sequence of other members. (II) and (III) are trivial as well: since $\boldsymbol{\neg}x$ and cyz are strings that start with different symbols, namely $\boldsymbol{\neg}$ and c, they cannot be equal. And if $\boldsymbol{\neg}x = \boldsymbol{\neg}x'$, then the strings obtained from them by deleting the first member are equal. Also trivial is the first part of (IV): if cxy = c′x′y′, then their first members, c and c′, must be equal. So far, we have used only general properties of strings. The claim that is far from obvious is that if cxy = cx′y′, where x, x′, y, y′ are sentences, then x = x′ and y = y′. Now, if cxy = cx′y′, then xy = x′y′ (because xy and x′y′ are obtained from cxy and cx′y′ by deletion of the first member). Hence, we have to prove:

For all sentences x, x′, y, y′: xy = x′y′ ⟹ x = x′ and y = y′
The proof is based on a method that enables one to determine, by easy counting, the scopes of the connectives.

The Symbol-Counting Method

Let us associate with every symbol a of our alphabet an integer, ν(a), as follows: ν(Ai) = 1 for all Ai, ν($\boldsymbol{\neg}$) = 0, and ν(c) = −1 for every binary connective letter c. Now let x be any non-empty string of length n, say x = a1 a2 ... an. For every occurrence of a symbol in x, let its count number (relative to x) be the sum of the integers associated with it and with the preceding occurrences in x. For the ith occurrence the count number is:

ν(a1) + ν(a2) + ... + ν(ai)

Here is an illustration, where the string is the sentence

(A2 ∨ ¬A1) ∧ (A4 → (¬(A7 → A1))),

that is: $\boldsymbol{\wedge}\boldsymbol{\vee}A_2\boldsymbol{\neg}A_1\boldsymbol{\to}A_4\boldsymbol{\neg}\boldsymbol{\to}A_7A_1$

This sentence is written below, in spaced form, on the first line; the numbers associated with the symbols are written below it, and below them the count numbers.

$$
\begin{array}{l|rrrrrrrrrrr}
\text{symbol} & \boldsymbol{\wedge} & \boldsymbol{\vee} & A_2 & \boldsymbol{\neg} & A_1 & \boldsymbol{\to} & A_4 & \boldsymbol{\neg} & \boldsymbol{\to} & A_7 & A_1 \\
\nu & -1 & -1 & 1 & 0 & 1 & -1 & 1 & 0 & -1 & 1 & 1 \\
\text{count number} & -1 & -2 & -1 & -1 & 0 & -1 & 0 & 0 & -1 & 0 & 1
\end{array}
$$
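The count numbers of the illustration can be recomputed mechanically. The following is a minimal sketch, not part of the original text; the ASCII tokens `~`, `&`, `|`, `>`, `=` are assumed stand-ins for the five connective letters, and atoms are written `A1`, `A2`, and so on.

```python
# Assumed ASCII stand-ins for the alphabet (not the book's notation):
# "~" is the negation letter; "&", "|", ">", "=" are the binary connective letters;
# any other token is treated as an atomic sentence A_i.
BINARY = {"&", "|", ">", "="}

def nu(symbol):
    """Integer associated with a symbol: 1 for atoms, 0 for negation, -1 for binary letters."""
    if symbol == "~":
        return 0
    if symbol in BINARY:
        return -1
    return 1

def count_numbers(string):
    """Running sums nu(a_1) + ... + nu(a_i), for i = 1, ..., n."""
    counts, total = [], 0
    for symbol in string:
        total += nu(symbol)
        counts.append(total)
    return counts

# The sentence (A2 v ~A1) & (A4 -> ~(A7 -> A1)) in Polish notation.
sentence = "& | A2 ~ A1 > A4 ~ > A7 A1".split()
print(count_numbers(sentence))
```

The printed list should agree with the bottom row of the table above.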
Main Claim: For every sentence a1 ... an, the count number of the last occurrence is 1 and all other count numbers are < 1.

Proof: By induction on the strings. We show that (i) the claim holds for atomic sentences, (ii) if the claim holds for a sentence x, it holds also for $\boldsymbol{\neg}x$, and (iii) if it holds for the sentences x and y, then it holds also for cxy, where c is any binary connective letter.
(i) is obvious. (ii) is easy: since ν($\boldsymbol{\neg}$) = 0, the count number of an occurrence in x, relative to $\boldsymbol{\neg}x$, is the same as the count number of that occurrence relative to x; and the leading $\boldsymbol{\neg}$ itself has count number 0. To show (iii), note that, since ν(c) = −1, the count numbers of occurrences in x, relative to cxy, are smaller by 1 than their numbers relative to x. Assuming that the claim holds for x, it follows that, relative to cxy, the last occurrence in x has count number 0 and all preceding occurrences have count numbers < 0. Hence, adding cx as a prefix does not affect the count numbers of occurrences in y: the number for each occurrence in y, relative to cxy, is the same as its number relative to y. Assuming the claim for y, it follows that, relative to cxy, the count number of the last occurrence is 1 and all other occurrences in y have numbers < 1. qed

This claim implies that no sentence is a proper prefix of another sentence: If y = xz and x and y are sentences, then y = x (i.e., z is the empty string). Proof: The last occurrence in x has count number 1 relative to x. Since y = xz, it has the same count number relative to y; since the last occurrence in y is the only one that has (relative to y) count number 1, the last occurrence in x is also the last occurrence in y. This implies x = y.

To complete the proof of unique readability, assume that xy = x′y′, and let m and m′ be, respectively, the lengths of x and x′. If m < m′, then x is a proper initial segment of x′, which is impossible, since both are sentences. For the same reason we cannot have m′ < m. Hence m = m′, implying x = x′. Since y and y′ are then obtained by deleting the first m members from the same string, we have y = y′. This concludes the proof.

Note: Count numbers, as we have defined them, give us a way of finding, given any sentence, its immediate components. Suppose that x is a sentence whose leftmost symbol, c, is a binary connective letter. Then x = cyz, where y and z are uniquely determined sentences.
To find them, find, by summing from left to right, all the count numbers in x. The first occurrence that has count number 0 (or, equivalently, count number 1 relative to the string obtained by deleting the leftmost c) is the last occurrence in y. The remainder of the string (which comes after cy) is z. This method leads also to a procedure for determining whether a given string is a sentence and provides a parsing, if it is.

Homework

6.1 Consider another way of constructing the language. Here $\boldsymbol{(}$ and $\boldsymbol{)}$ are additional alphabet members, functioning as left and right parentheses. The clauses for non-atomic sentences are: If x and y are sentences, then

$\boldsymbol{\neg(}x\boldsymbol{)}$ is a sentence.
$\boldsymbol{(}x\boldsymbol{)}c\boldsymbol{(}y\boldsymbol{)}$ is a sentence, for each binary connective letter c.

Unique readability is proved via the following counting method. Associate with the left parenthesis the number −1, with the right parenthesis the number 1, and with all the other alphabet symbols the number 0. Count numbers are defined as above, by summing from left to right the associated numbers. Prove, by induction, that in every sentence the last occurrence has count number 0 and all the other count numbers are ≤ 0. Deduce from this that (i) if x is a sentence, then in $\boldsymbol{(}x\boldsymbol{)}$ all occurrences except the last have count numbers < 0, and (ii) if x and y are sentences, then, in $\boldsymbol{(}x\boldsymbol{)}c\boldsymbol{(}y\boldsymbol{)}$, there are exactly two occurrences of parentheses with count number 0: the last and the one after $\boldsymbol{(}x$. Deduce from this the unique readability property.

Note: We have presupposed an alphabet with an infinite number of symbols, which function as basic units. These can be constructed from a finite number of other units. For example, they can be strings of 0's and 1's:

$\boldsymbol{\neg}$ = 10, $\boldsymbol{\wedge}$ = 110, $\boldsymbol{\vee}$ = 1110, $\boldsymbol{\to}$ = 11110, $\boldsymbol{\leftrightarrow}$ = 111110,
$A_1$ = 1111110, $A_2$ = 11111110, $A_3$ = 111111110, ..., $A_n$ = $1^{5+n}0$, ...

In each of these strings 0 serves to mark the end. If $x_1 \ldots x_m = y_1 \ldots y_n$, where all the $x_i$'s and $y_j$'s are strings of the form $1^k0$, k > 0, then m = n and $x_i = y_i$, for all i = 1, ..., n. Hence, concatenations of such strings of 0's and 1's are uniquely decomposable as concatenations of our alphabet symbols. For example, the string

111011111110101101111111101111110

is the sentence: $\boldsymbol{\vee}A_2\boldsymbol{\neg}\boldsymbol{\wedge}A_3A_1$
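The unique decomposition of such 0/1 strings can be carried out by reading off maximal runs of 1's, each closed by a 0. Below is a minimal sketch under that reading; the function name `decode` and the ASCII tokens `~`, `&`, `|`, `>`, `=` for the connective letters are illustrative assumptions, not notation from the text.

```python
# Assumed ASCII stand-ins: run length 1..5 encodes the five connective letters,
# run length 5 + n encodes the atom A_n.
LETTERS = {1: "~", 2: "&", 3: "|", 4: ">", 5: "="}

def decode(bits):
    """Split a string of 0's and 1's into alphabet symbols of the form 1^k 0, k > 0."""
    symbols, run = [], 0
    for b in bits:
        if b == "1":
            run += 1
        elif b == "0":
            if run == 0:
                raise ValueError("each block must contain at least one 1")
            symbols.append(LETTERS[run] if run <= 5 else f"A{run - 5}")
            run = 0
        else:
            raise ValueError(f"unexpected character {b!r}")
    if run:
        raise ValueError("trailing 1's without a terminating 0")
    return symbols

print(decode("111011111110101101111111101111110"))
```

Decoding the example string yields the disjunction letter, A2, the negation letter, the conjunction letter, A3 and A1, in that order.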
From now on we shall ignore the specific nature of the sentences. We require only that sentences be generated from the atoms by applying connectives and that unique readability hold. We do not have any further use for the connective letters of the language. We employ, as before, the symbols ‘¬’, ‘∧’, ‘∨’, ‘→’, ‘↔’ for the sentential operations. We also employ, as we did before, ‘A’, ‘B’, ‘C’, ‘A′’, ‘B′’, ‘C′’, ‘A1’, ‘B1’, ‘C1’, etc., as variables ranging over sentences. The only new pieces in our stock are the atomic sentences A1, A2, A3, etc. Do not confuse them with sentential variables!
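Before leaving the concrete Polish-notation language, the parsing procedure described in the Note on count numbers can be sketched as follows. This is an illustration only: the ASCII tokens `~`, `&`, `|`, `>`, `=` are assumed stand-ins for the connective letters, and `split_binary` and `parse` are hypothetical helper names.

```python
import re

BINARY = {"&", "|", ">", "="}   # assumed stand-ins for the binary connective letters

def nu(symbol):
    """1 for atoms, 0 for the negation letter, -1 for binary connective letters."""
    return 0 if symbol == "~" else (-1 if symbol in BINARY else 1)

def split_binary(tokens):
    """For tokens = c y z: y ends at the first occurrence whose count number is 0."""
    total = 0
    for i, symbol in enumerate(tokens):
        total += nu(symbol)
        if total == 0:
            return tokens[0], tokens[1:i + 1], tokens[i + 1:]
    raise ValueError("no occurrence with count number 0")

def parse(tokens):
    """Return a nested tuple for a sentence; raise ValueError for a non-sentence."""
    if not tokens:
        raise ValueError("empty string")
    head = tokens[0]
    if head == "~":
        return ("~", parse(tokens[1:]))
    if head in BINARY:
        c, y, z = split_binary(tokens)
        return (c, parse(y), parse(z))
    if len(tokens) == 1 and re.fullmatch(r"A[1-9][0-9]*", head):
        return head
    raise ValueError("not a sentence")

print(parse("> A1 & A4 A2".split()))   # the string for A1 -> (A4 & A2)
print(parse("& > A1 A4 A2".split()))   # the string for (A1 -> A4) & A2
```

A string is accepted exactly when the recursive splitting bottoms out in atoms, so the same function serves both as a sentence test and as a parser.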
6.1.2 Semantics of the Sentential Calculus
Let SC be the language of the sentential calculus, as just defined. So far, the definitions have been purely syntactic. The language is defined as a system of uninterpreted constructs. An interpretation, which reads the language as being about
something else, is (as explained in 2.5) the concern of the semantics. Since our treatment of formal languages is general, we shall not be concerned with one particular interpretation, but with the class of possible interpretations. Usually, an interpretation determines how extralinguistic entities (objects, relations or properties) are correlated with linguistic items. When it comes to SC, the only linguistic items are sentences. The truth-values of the atomic sentences determine the values of all the other sentences. We therefore take assignments of truth-values to atomic sentences as our possible interpretations:

An interpretation of SC is a function, σ, defined for all atoms, such that, for each Ai, σ(Ai) is a truth-value.

When we come to first-order languages, we shall encounter richer and more familiar types of interpretations. We shall refer to an interpretation of SC as a truth-value assignment, or assignment for short, and we shall use ‘σ’, ‘τ’, ‘σ′’, ‘τ′’, etc. as variables ranging over assignments.

An interpretation, σ, determines a unique assignment of truth-values to all the sentences of SC: The value of each atom is the value assigned to it by σ, and the values of the other sentences are determined by the usual truth-table rules. Spelled out in detail, this amounts to an inductive definition:

For Atoms: If A is an atom, A gets σ(A).
For Negations: If A gets T, ¬A gets F. If A gets F, ¬A gets T.
For Conjunctions: If A gets T and B gets T, A ∧ B gets T. If A gets F, A ∧ B gets F. If B gets F, A ∧ B gets F.
...
And so on for each of the connectives.
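The inductive definition translates directly into a recursive evaluation. The sketch below is illustrative only: sentences are represented as nested tuples with the assumed tags `"not"`, `"and"`, `"or"`, `"imp"`, `"iff"`, atoms as strings, an assignment σ as a dictionary from atoms to booleans (True for T, False for F), and `val` is a hypothetical name.

```python
def val(sigma, a):
    """Truth-value of sentence a under the assignment sigma."""
    if isinstance(a, str):            # For Atoms: A gets sigma(A)
        return sigma[a]
    op = a[0]
    if op == "not":                   # For Negations
        return not val(sigma, a[1])
    x, y = val(sigma, a[1]), val(sigma, a[2])
    if op == "and":                   # For Conjunctions
        return x and y
    if op == "or":
        return x or y
    if op == "imp":
        return (not x) or y
    if op == "iff":
        return x == y
    raise ValueError(f"unknown connective {op!r}")

sigma = {"A1": True, "A2": False}
print(val(sigma, ("not", ("and", "A1", "A2"))))
```

Each branch of the function corresponds to one clause of the definition, so every sentence receives exactly one value.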
Obviously, the set of all sentences that get a truth-value by virtue of these rules contains all the atoms and is closed under connective-applications. Hence every sentence gets a truth-value. Moreover, every sentence gets no more than one truth-value. This follows, again by induction, by showing that the set of all sentences that get unique values contains all atoms and is closed under connective-applications. Here we have to use the unique readability: An atom cannot get a value by virtue of any rule other than the rule for atoms, because an atom is not a sentential compound. Next, assume that A gets a unique value. Then, since ¬A is not of the form B ∗ C, where ∗ is a binary connective, and since ¬A = ¬A′ only if A = A′, it follows that ¬A can get a value only through the rule for negations and that this value is uniquely determined by the value of A. The same kind of argument applies to every other connective. We can therefore speak of the value of a sentence A under the assignment σ. Let us denote this as:

valσ(A)

Note: The atoms are treated as being completely independent: the truth-value of one is not constrained by the values of the others. Dependencies between atoms can be introduced by restricting the class of possible interpretations. The restrictions can be expressed by stipulating that certain sentences must get the value T. You can think of them as extralogical axioms. For example, the restriction that it is impossible for both A1 and A2 to be true amounts to stipulating that ¬(A1 ∧ A2) gets T. (Some restrictions cannot be expressed in this form; for example, the restriction that only a finite number of atomic sentences get T. But any restriction that involves a finite number of atoms can be thus expressed.)

Our previous semantic notions can now be characterized in these terms:

• A ≡ B, just when, for all σ, valσ(A) = valσ(B).
• A is a tautology, just when, for all σ, valσ(A) = T.
• A is a contradiction, just when, for all σ, valσ(A) = F.
• Γ ⊨ A, just when there is no σ such that valσ(B) = T for all B in Γ, and valσ(A) = F.

Our previous methods for establishing logical equivalence and logical implications relied only on the general features of the language and the connectives. Therefore they apply as before:
All the general equivalences, simplification methods, and proof techniques of the previous chapters apply, without change, when the sentential variables range over the sentences of SC.

On the other hand, with the sentences completely specified, we can now prove that particular sentences are not tautologies, or not contradictions, or are not logically implied by other sentences. For example, if A, B, and C are different atoms, then

A ∨ B, A → C ⊭ B → C

For let σ be the assignment such that σ(A) = F, σ(B) = T, σ(C) = F; then valσ(A ∨ B) = valσ(A → C) = T, but valσ(B → C) = F. With A, B, and C unspecified we can only claim that the implication need not hold in general. The counterexamples constructed in chapter 4 can be turned into counterexamples concerning specific sentences, by assuming that each sentential variable has a distinct atomic sentence as value.

Note: The value of A under the interpretation σ depends only on the values of the atoms that are components of A: If σ and τ assign the same values to all atomic components of A, then valσ(A) = valτ(A). Hence, as far as a particular sentence is concerned, we have to consider assignments defined only for its atomic components. And if we are concerned with a finite number of sentences, we have to consider only a finite number of atoms.

Truth tables can serve to show how a sentence fares under different assignments. A truth table for a given sentence should have a column for each of its atoms. The rows represent the different assignments; the value of the sentence is given in its column. When several sentences are compared by means of truth tables, their tables should be incorporated into a single table that has a column for each atom occurring in any of the sentences. Logical equivalence may hold between sentences with different atoms. For example:

A3 → [A1 ∧ (A5 ∨ ¬A5)] ≡ [A3 ∨ (A4 ∧ ¬A4)] → A1

Note: The notion of duality (cf. 2.5.3) can now be defined for specific sentences. Consider sentences built from atoms using only negation, conjunction and disjunction. Apply the definition given in 2.5.3, assuming that the sentential variables denote distinct atoms. Alternatively, it can be defined inductively as follows, where ‘$A^d$’ denotes the dual of A.

(i) If A is an atom, then $A^d = A$
(ii) $(\neg A)^d = \neg(A^d)$
(iii) $(A \wedge B)^d = A^d \vee B^d$
(iv) $(A \vee B)^d = A^d \wedge B^d$
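The semantic notions of this subsection can be checked by brute force, since only the atoms occurring in the sentences at hand matter. The following sketch is illustrative and makes its own assumptions: sentences are nested tuples tagged `"not"`, `"and"`, `"or"`, `"imp"`, `"iff"`, atoms are strings, and the names `val`, `atoms`, `assignments`, `implies`, `equivalent` and `dual` are hypothetical helpers, not notation from the text.

```python
from itertools import product

def val(sigma, a):
    """Value of sentence a under assignment sigma (dict: atom -> bool)."""
    if isinstance(a, str):
        return sigma[a]
    if a[0] == "not":
        return not val(sigma, a[1])
    x, y = val(sigma, a[1]), val(sigma, a[2])
    return {"and": x and y, "or": x or y, "imp": (not x) or y, "iff": x == y}[a[0]]

def atoms(a):
    """Set of atomic components of a."""
    if isinstance(a, str):
        return {a}
    return set().union(*(atoms(part) for part in a[1:]))

def assignments(atom_set):
    """All assignments restricted to the given atoms (the truth-table rows)."""
    names = sorted(atom_set)
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))

def implies(premises, conclusion):
    """True iff no assignment makes all premises T and the conclusion F."""
    relevant = atoms(conclusion).union(*(atoms(p) for p in premises))
    return not any(all(val(s, p) for p in premises) and not val(s, conclusion)
                   for s in assignments(relevant))

def equivalent(a, b):
    return all(val(s, a) == val(s, b) for s in assignments(atoms(a) | atoms(b)))

def dual(a):
    """Inductive definition of the dual, for sentences built with not/and/or."""
    if isinstance(a, str):
        return a
    if a[0] == "not":
        return ("not", dual(a[1]))
    swapped = {"and": "or", "or": "and"}[a[0]]
    return (swapped, dual(a[1]), dual(a[2]))

# The non-implication from the text, with atoms named A, B, C.
print(implies([("or", "A", "B"), ("imp", "A", "C")], ("imp", "B", "C")))
```

The search is exponential in the number of atoms, which is harmless for the small examples considered here.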
6.1.3 Normal Forms, Truth-Functions and Complete Sets of Connectives
A literal is a sentence which is either an atom or a negation of an atom. Definition: A sentence is in disjunctive normal form, abbreviated DNF, if it is a disjunction of conjunctions of literals. For example, the following is in DNF: (¬A3 ∧¬A4 ∧A5 ) ∨ A2 ∨ (A3 ∧A6 ) A sentence is in conjunctive normal form, abbreviated CNF, if it is a conjunction of disjunctions of literals. For example: (A5 ∨¬A1 ) ∧ (A5 ∨A6 ∨A7 ) ∧ (A2 ∨A3 ∨A4 ) ∧ ¬A3 Note: Every literal is both a conjunction of literals (namely, a conjunction with one conjunct) and a disjunction of literals (namely, a disjunction with one disjunct). A disjunction of literals, say A1 ∨ A2 ∨ A3 ∨ ¬A4 is in DNF, because it is a disjunction of conjunctions of literals (where every conjunction consists of one literal). It is also in CNF, because it is a conjunction of disjunctions of literals (namely, a conjunction with one conjunct). In a similar way, A1 ∧ A2 ∧ A3 ∧ ¬A4 is both in CNF and in DNF. An equivalent characterization of DNF and CNF is: A sentence A is in DNF iff: (i) A is constructed from atoms using no connective other than ¬, ∧, ∨. (ii) The scope of every negation is an atom. (iii) The scope of every conjunction does not contain any disjunction. A sentence is in CNF iff it satisfies (i) and (ii), and
(iii′) The scope of every disjunction does not contain any conjunction.

Theorem: For every sentence, A, there is a logically equivalent sentence in DNF, and there is a logically equivalent sentence in CNF.

Here is a way to convert any given sentence to an equivalent sentence in DNF.

(I) Eliminate → and ↔, by expressing them in terms of ¬, ∧, ∨. Get in this way a logically equivalent sentence that involves only ¬, ∧ and ∨.
(II) Push negation all the way in, cancelling double negations, until negation applies only to atomic sentences.
(III) Push conjunction all the way in, by distributing conjunction over disjunction, until no disjunction is within the scope of a conjunction.

To get an equivalent CNF, apply steps (I) and (II), but instead of (III) use:

(III′) Push disjunction all the way in, by distributing disjunction over conjunction, until no conjunction is within the scope of a disjunction.

As you carry out these steps you can, of course, simplify according to the occasion, dropping redundant conjuncts or disjuncts, or using established equivalences (e.g., replacing A ∨ (¬A ∧ B) by the equivalent A ∨ B).

Example: Assuming A, B, C to be atoms, the following are the stages of a possible conversion of

[(A∧C) → (B ↔ C)] ∧ ¬[(A∧C) ∨ ¬B]

into an equivalent DNF:

1. {¬(A∧C) ∨ [(B∧C) ∨ (¬B∧¬C)]} ∧ ¬[(A∧C) ∨ ¬B]
2. [¬A ∨ ¬C ∨ (B∧C) ∨ (¬B∧¬C)] ∧ [¬(A∧C) ∧ ¬¬B]
3. [¬A ∨ ¬C ∨ (B∧C)] ∧ [(¬A∨¬C) ∧ B]   (¬B∧¬C is a redundant disjunct because we have the disjunct ¬C)
4. [¬A ∨ ¬C ∨ (B∧C)] ∧ [(¬A∧B) ∨ (¬C∧B)]
5. {[¬A ∨ ¬C ∨ (B∧C)] ∧ ¬A ∧ B} ∨ {[¬A ∨ ¬C ∨ (B∧C)] ∧ ¬C ∧ B}
6. (¬A∧B) ∨ (¬C∧¬A∧B) ∨ (B∧C∧¬A) ∨ (¬A∧¬C∧B) ∨ (¬C∧B) ∨ (B∧C∧¬C)
7. (¬A∧B) ∨ (¬C∧B)
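Steps (I) to (III), together with the two simplifications used in the example (dropping contradictory conjunctions and dropping conjunctions absorbed by shorter ones), can be sketched as a short program. This is an illustration under stated assumptions, not the book's procedure verbatim: sentences are nested tuples tagged `"not"`, `"and"`, `"or"`, `"imp"`, `"iff"`; a DNF is represented as a set of conjunctions, each a frozenset of literals `(atom, polarity)`; `dnf` and `simplify` are hypothetical names.

```python
def dnf(a, sign=True):
    """DNF of a (sign=True) or of its negation (sign=False)."""
    if isinstance(a, str):
        return {frozenset({(a, sign)})}
    op = a[0]
    if op == "not":                       # step (II): push negation inward
        return dnf(a[1], not sign)
    if op == "imp":                       # step (I): A -> B is (not A) or B
        return dnf(("or", ("not", a[1]), a[2]), sign)
    if op == "iff":                       # step (I): (A and B) or (not A and not B)
        both = ("and", a[1], a[2])
        neither = ("and", ("not", a[1]), ("not", a[2]))
        return dnf(("or", both, neither), sign)
    left, right = dnf(a[1], sign), dnf(a[2], sign)
    if (op == "or") == sign:              # a disjunction, or a negated conjunction
        return left | right
    # step (III): a conjunction, or a negated disjunction; distribute
    return {c | d for c in left for d in right}

def simplify(conjunctions):
    """Drop contradictory conjunctions and those that properly contain another one."""
    consistent = {c for c in conjunctions
                  if not any((atom, not pol) in c for atom, pol in c)}
    return {c for c in consistent if not any(d < c for d in consistent)}

# The example: [(A & C) -> (B <-> C)] & ~[(A & C) v ~B]
left = ("imp", ("and", "A", "C"), ("iff", "B", "C"))
right = ("not", ("or", ("and", "A", "C"), ("not", "B")))
result = simplify(dnf(("and", left, right)))
for conjunction in sorted(sorted(c) for c in result):
    print(conjunction)
```

An empty result represents a contradictory sentence. The simplification is sound but makes no claim to produce a shortest DNF in general.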
In getting the CNF, steps 1-3 are the same; from 3 on we can proceed:

3. [¬A ∨ ¬C ∨ (B∧C)] ∧ [(¬A∨¬C) ∧ B]
4′. [¬A ∨ [(¬C∨B) ∧ (¬C∨C)]] ∧ (¬A∨¬C) ∧ B
5′. (¬A∨¬C∨B) ∧ (¬A∨¬C) ∧ B
6′. (¬A∨¬C) ∧ B   (¬A∨¬C∨B is redundant in the presence of B)
Note that in this particular case we could have gotten the CNF from the DNF by pulling out B, or the DNF from the CNF–by simple distribution of conjunction . But in general the two forms are not as simply related. A sentence in DNF is true just when some conjunction in it is true. Hence, this form shows clearly the interpretations under which the sentence is true. Consider, for example, (A1 ∧A2 ) ∨ (¬A1 ∧A3 ) ∨ (A2 ∧A3 ) ∨ (¬A1 ∧¬A2 ) This sentence is true just when: (A1 and A2 are true) or (A1 is false and A3 is true) or (A2 and A3 are true) or (A1 and A2 are false). Note that not all possibilities here are exclusive; if A1 and A2 and A3 are true, both the first and third alternatives hold. A sentence in DNF is false just when all its disjuncts are false. For example, our last sentence is false just when: (A1 is false or A2 is false) and (A1 is true or A3 is false) and (A2 is false or A3 is false) and (A1 is true or A2 is true). A CNF indicates the cases of truth and falsity in a dual way. Thus, (A1 ∨A2 ) ∧ (¬A1 ∨A3 ) ∧ (A2 ∨A3 ) ∧ (¬A1 ∨¬A2 ) is true, just when: (A1 is true or A2 is true) and (A1 is false or A3 is true) and (A2 is true or A3 is true) and (A1 is false or A2 is false). And the sentence is false just when:
(A1 and A2 are false) or (A1 is true and A3 is false) or (A2 and A3 are false) or (A1 and A2 are true).

Note: A sentence can have many equivalent DNFs (or CNFs). For example, A1 ∨ (¬A1 ∧ A2) and A2 ∨ (A1 ∧ ¬A2) are equivalent sentences in DNF. They are equivalent to A1 ∨ A2. If you replace them by their duals, you will get an analogous situation for CNFs.

Homework

6.2 Find, for each of the following sentences, equivalent sentences in DNF and CNF, as short as you can. Assume that A, B, C, D are atoms.

1. (A → B) ∧ (B → A)
2. (A∨B ↔ C∨D) ∧ (C ↔ D)
3. ((A → B) → C) → C
4. (A ∨ B) ∧ (¬C ∨ ¬D)
5. (A ∧ B) ∨ (¬C ∧ D)
6. ¬[A ∨ (B ∧ C) ∨ (C ∧ D)]
7. (A∧B → C∧D) ∧ (C ∨ ¬D)
8. ((A → C) ∨ (C → D)) → (B → A)
9. (A∧B → B∧A) ∨ C
10. (A ∨ B) ∧ (¬A ∨ ¬B) ∧ (¬A ∨ B) ∧ (A ∨ ¬B)

Expressing Truth-Functions by Sentences

Definition: An n-ary truth-function is a function defined for all n-tuples of T's and F's, which assigns to every n-tuple a truth-value (either T or F).

Here, for example, is a ternary truth-function f:

f(T, T, T) = F
f(T, T, F) = T
f(T, F, T) = F
f(T, F, F) = F
f(F, T, T) = T
f(F, T, F) = F
f(F, F, T) = T
f(F, F, F) = T
The n-tuples of truth-values correspond exactly to the rows in a truth-table based on n atomic sentences, provided that we choose a matching of the atoms with the coordinates. The tuple (x1 , x2 , . . . , xn ) corresponds to the row in which x1 is assigned to the first atom (the atom matched with the first coordinate), x2 is assigned to the second atom, and so on. Now assume that A1 , A2 , . . . , An are n distinct atoms and that we agree that Ai , for i = 1, . . . , n, is matched with the ith coordinate. For each n-tuple of truth-values (x1 , x2 , . . . , xn ), let the assignment represented by (x1 , x2 , . . . , xn ), be the assignment that assigns x1 to A1 , x2 to A2 ,..., xn to An . Then each sentence, A, whose atomic components are among A1 , . . . , An , defines an n-ary truth-function, fA : fA (x1 , . . . , xn ) = the value of A, under the assignment represented by (x1 , . . . , xn ). The values of the function fA are given in A’s column, in the truth-table based on the atoms A1 , . . . , An . Example: It is not difficult to see that the ternary truth-function given above is the function defined by the sentence ¬A1 ↔ (A2 → A3 ) Note: If Ai does not occur in A, then the ith argument has no effect on the value of fA . For example, if n = 2 and A = ¬A2 , then, under our definition, fA is a two-place function whose value for (x1 , x2 ) is obtained by toggling x2 . If A contains k atoms, then for every n ≥ k and every matching of the k atoms with coordinates from 1, 2, . . . , n, there is an n-ary truth-function defined by A. Theorem:
Every truth-function is defined by some sentence.
This is sometimes expressed by saying that every truth-table is a truth-table of some sentence. The proof will show how to construct the sentence, given the truth-table.

Proof: Let f be an n-ary truth-function. Fix n distinct atomic sentences A1, ..., An, with Ai corresponding to the ith coordinate, i = 1, ..., n. If there is no n-tuple for which the value of f is T, then obviously A1 ∧ ¬A1 defines f. Else, for each i = 1, ..., n define:

$A_i^{\mathrm{T}} =_{df} A_i \qquad A_i^{\mathrm{F}} =_{df} \neg A_i$

For every n-tuple of truth-values (x1, ..., xn), let

$C^{(x_1,\ldots,x_n)} = A_1^{x_1} \wedge A_2^{x_2} \wedge \ldots \wedge A_n^{x_n}$

Consider all the tuples (x1, ..., xn) for which f(x1, ..., xn) = T (i.e., the rows in the truth-table for which the required sentence should have T). Let A be the disjunction of all the $C^{(x_1,\ldots,x_n)}$'s, where (x1, ..., xn) ranges over these tuples.
A gets T iff one of these disjuncts gets T. But $C^{(x_1,\ldots,x_n)}$ gets T iff all the conjuncts $A_1^{x_1}, A_2^{x_2}, \ldots, A_n^{x_n}$ get T, that is, iff A1 gets x1, A2 gets x2, ..., An gets xn. Therefore A gets T iff the assignment is given by one of the tuples for which the value of f is T. The truth-function defined by A coincides with f. QED

Example: Consider the following truth-table:

A1  A2  A3  |  A
T   T   T   |  F
T   T   F   |  T
T   F   T   |  F
T   F   F   |  F
F   T   T   |  T
F   T   F   |  F
F   F   T   |  T
F   F   F   |  T

There are four rows for which the required sentence, A, should be T. Accordingly A can be taken as the disjunction of four conjunctions:

(A1∧A2∧¬A3) ∨ (¬A1∧A2∧A3) ∨ (¬A1∧¬A2∧A3) ∨ (¬A1∧¬A2∧¬A3)

Note that the proof of the theorem yields the required sentence in DNF. It is also a new proof that every sentence is equivalent to a DNF sentence.

A dual construction yields the required sentence in CNF. It is obtained by toggling everywhere T and F, and ∧ and ∨: Consider all tuples (x1, ..., xn) for which the value of the function is F. If there are none, then the function is defined by A1 ∨ ¬A1. Else, define:

$B_i^{\mathrm{T}} =_{df} \neg A_i \qquad B_i^{\mathrm{F}} =_{df} A_i$

Let

$D^{(x_1,\ldots,x_n)} = B_1^{x_1} \vee B_2^{x_2} \vee \ldots \vee B_n^{x_n}$

Then the required CNF is the conjunction of all the $D^{(x_1,\ldots,x_n)}$'s such that f assigns to (x1, ..., xn) the value F.

Example:
The CNF obtained for the above-given truth-table is:
(¬A1 ∨¬A2 ∨¬A3 ) ∧ (¬A1 ∨A2 ∨¬A3 ) ∧ (¬A1 ∨A2 ∨A3 ) ∧ (A1 ∨¬A2 ∨A3 ) Each of the disjunctions corresponds to a row in which the sentence gets F.
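Both constructions are mechanical and can be sketched in a few lines. In the sketch below a truth-function is represented as a Python function on booleans (True for T, False for F); the names `full_dnf`, `full_cnf` and `show` are illustrative, and the sample `f` is written out from the sentence ¬A1 ↔ (A2 → A3) mentioned earlier as defining the tabulated function.

```python
from itertools import product

def full_dnf(f, n):
    """One conjunction of literals for each n-tuple on which f is T."""
    rows = [xs for xs in product([True, False], repeat=n) if f(*xs)]
    if not rows:                                   # f is never T: use A1 and not-A1
        return [["A1", "¬A1"]]
    return [[f"A{i}" if x else f"¬A{i}" for i, x in enumerate(xs, 1)] for xs in rows]

def full_cnf(f, n):
    """One disjunction of literals for each n-tuple on which f is F."""
    rows = [xs for xs in product([True, False], repeat=n) if not f(*xs)]
    if not rows:                                   # f is never F: use A1 or not-A1
        return [["A1", "¬A1"]]
    return [[f"¬A{i}" if x else f"A{i}" for i, x in enumerate(xs, 1)] for xs in rows]

def show(parts, inner, outer):
    """Render a list of literal lists with the given inner and outer connectives."""
    return outer.join("(" + inner.join(part) + ")" for part in parts)

def f(x1, x2, x3):
    """The ternary truth-function of the example: not-A1 iff (A2 implies A3)."""
    return (not x1) == ((not x2) or x3)

print(show(full_dnf(f, 3), "∧", " ∨ "))
print(show(full_cnf(f, 3), "∨", " ∧ "))
```

The two printed sentences should coincide with the DNF and CNF given for the truth-table above.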
Full DNFs and CNFs Terminology: Given a conjunction, C, of literals, say that an atom occurs positively in C, if it is one of the conjuncts, and that it occurs negatively in C if its negation is one of the conjuncts. We speak, accordingly, of positive and negative occurrences of atoms. Similarly an atom occurs positively in a disjunction of literals if it is one of the disjuncts, and it occurs negatively if its negation is one of the disjuncts. Henceforth, we assume that when a sentence is written in DNF no atom occurs both positively and negatively in the same conjunction. For such conjunctions are contradictory and can be dropped. The only exception is when the sentence is contradictory, in which case it reduces to A1 ∧ ¬A1 . Similarly, we assume that in a CNF no atom occurs both positively and negatively in the same disjunction, unless the CNF is a tautology–in which case it reduces to A1 ∨ ¬A1 . We assume, moreover, that there are no repetitions of the same literal in any conjunction (of the DNF), or in any disjunction (of the CNF), and no repeated disjuncts (in the DNF), or repeated conjuncts (in the CNF). When comparing DNFs (or CNFs) we disregard differences in the order of literals of a conjunction (of a disjunction), and differences in the order of the disjuncts (of the conjuncts). Definition: A full DNF is one in which every occurring atom occurs in every conjunction. A full CNF is one in which every atom that occurs in it occurs in every disjunction. Examples: Assuming that A2 , A3 and A4 are distinct atoms, the following is a sentence in full DNF: (A2 ∧¬A3 ∧A4 ) ∨ (¬A2 ∧¬A3 ∧A4 ) ∨ (¬A2 ∧A3 ∧A4 ) By pulling ¬A3 ∧ A4 out of the first two conjunctions and dropping the resulting redundant conjunct A2 ∨ ¬A2 , we see that this sentence is logically equivalent to: (¬A3 ∧A4 ) ∨ (¬A2 ∧A3 ∧A4 ) which is in DNF but not in full DNF, because A2 occurs in the second conjunction, but not in the first. 
The sentence is also equivalent to: (A2 ∧¬A3 ∧A4 ) ∨ (¬A2 ∧A4 ) (can you see how to get it?), which is again in DNF, but not in full DNF. An example of a full CNF (where the Ai s are assumed to be atoms) is: (A1 ∨¬A2 ∨A5 ) ∧ (¬A1 ∨¬A2 ∨A5 ) ∧ (¬A1 ∨A2 ∨A5 ) ∧ (A1 ∨¬A2 ∨¬A5 )
which is equivalent to: (¬A2 ∨A5 ) ∧ (¬A1 ∨A2 ∨A5 ) ∧ (A1 ∨¬A2 ∨¬A5 ) (can you see how?), as well as to: (A1 ∨¬A2 ∨A5 ) ∧ (¬A1 ∨A5 ) ∧ (A1 ∨¬A2 ∨¬A5 ) And this last can be further compressed into: (A1 ∨¬A2 ) ∧ (¬A1 ∨A5 ) All of these are in CNF but not in full CNF. A sentence in DNF can be expanded into full DNF by supplying the missing atoms. Say that Ai is an atom occurring in some conjunction, but not in the conjunction C. We can replace C by the equivalent: C ∧ (Ai ∨ ¬Ai ) which, via distributivity, becomes: C ∧Ai ∨ C ∧¬Ai Thus, any disjunct not containing Ai is replaceable by two: one with an additional Ai and one with an additional ¬Ai . Proceeding in this way, we get eventually the full DNF. Obviously, this involves a blowing up of the sentence. A similar process works for the full CNF: We replace every disjunction D in which Ai does not occur by: (D ∨ Ai ) ∧ (D ∨ ¬Ai ) A full DNF shows us explicitly all the truth-table rows in which the sentence gets T. Each conjunction contributes the row in which every atom occurring positively is assigned T, and every atom occurring negatively is assigned F. A full CNF shows us, in a dual way, all the rows in which it gets F. Each disjunction contributes the row in which every atom occurring positively gets F, and every atom occurring negatively gets T. Note: The DNF and CNF constructed in the proof of the last theorem are full. They are obtained by following the prescription just given for correlating conjunctions (in the DNF), or disjunctions (in the CNF) with truth-table rows. Homework 6.3 Write down sentences B1 , B2 , B3 and B4 that have the following truth-tables. Write each of B1 and B2 in DNF and in CNF.
A1  A2  A3  |  B1  B2  B3  B4
T   T   T   |  T   T   F   T
T   T   F   |  F   T   T   F
T   F   T   |  F   F   T   F
T   F   F   |  T   F   T   F
F   T   T   |  F   T   F   F
F   T   F   |  T   T   T   T
F   F   T   |  T   T   T   F
F   F   F   |  F   F   T   F
Having written the sentences, see if you can simplify them by pulling out common conjuncts, or common disjuncts, as shown above.

6.4 Write B1 of 6.3 using only ¬ and ∧, B2 using only ¬ and ∨, and each of B2 and B3 using only ¬ and →.

Dummy Atoms: A DNF (or CNF) can contain dummy atoms, i.e., atoms that have no effect on the truth-value. For example, assuming that the Ai's are atoms, A1 is dummy in:

(A1 ∧ ¬A2 ∧ A3) ∨ (¬A1 ∧ ¬A2 ∧ A3)

That sentence is in fact equivalent to

¬A2 ∧ A3

Note that both sentences are in full DNF. It can be shown that, in a full non-contradictory DNF, an atom Ai is dummy iff the following holds:

For every conjunction in the DNF there is another conjunction in it that differs from the first only in that Ai occurs positively in one and negatively in the other.

The condition for full non-tautological CNFs is the exact dual of that. Dummy atoms can be eliminated from a full DNF by pulling out, i.e., by replacing each

$(A'_1 \wedge \ldots \wedge A'_{i-1} \wedge A_i \wedge A'_{i+1} \wedge \ldots \wedge A'_n) \vee (A'_1 \wedge \ldots \wedge A'_{i-1} \wedge \neg A_i \wedge A'_{i+1} \wedge \ldots \wedge A'_n)$

by the single conjunction

$A'_1 \wedge \ldots \wedge A'_{i-1} \wedge A'_{i+1} \wedge \ldots \wedge A'_n$

(here each $A'_j$ stands for one of the literals $A_j$, $\neg A_j$). Applying this process we eventually get an equivalent full DNF without dummy atoms. Concerning such DNFs the following is provable:
Two full non-contradictory DNFs without dummy atoms are logically equivalent iff they are the same (except for rearrangements of literals and disjuncts, and dropping repeated occurrences). The case of full non-tautological CNFs is the exact dual and we shall not repeat it. Note: The claims just made are true only under the assumption that the DNFs (or the CNFs) are full. The situation for non-full DNFs (or CNFs) is much more complex and will not be discussed here.
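The criterion for dummy atoms and their elimination can be illustrated on a simple representation of full DNFs. In this sketch, which is an assumption of the illustration and not the book's notation, a full DNF over a list of atoms is a set of rows: each row is a tuple of booleans, with True at position i when the ith atom occurs positively in the corresponding conjunction. The names `is_dummy` and `eliminate_dummies` are hypothetical.

```python
def is_dummy(rows, i):
    """Atom i is dummy iff every row has a partner differing only in coordinate i."""
    return all(r[:i] + (not r[i],) + r[i + 1:] in rows for r in rows)

def eliminate_dummies(atoms, rows):
    """Pull out dummy atoms one at a time from a full, non-contradictory DNF."""
    atoms, rows = list(atoms), set(rows)
    i = 0
    while i < len(atoms):
        if rows and is_dummy(rows, i):
            rows = {r[:i] + r[i + 1:] for r in rows}   # merge each pair into one row
            del atoms[i]
        else:
            i += 1
    return atoms, rows

# (A1 & ~A2 & A3) v (~A1 & ~A2 & A3): A1 is dummy.
atoms = ["A1", "A2", "A3"]
rows = {(True, False, True), (False, False, True)}
print(eliminate_dummies(atoms, rows))
```

With this representation, comparing two full DNFs without dummy atoms reduces to comparing their atom lists and row sets, which mirrors the uniqueness claim above. A tautology ends up with no atoms and the single empty row.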
General Connectives and Complete Connective Sets

The essential feature of a connective, which determines all its semantic properties, is its truth table. Hence, a binary connective ∗ is characterized by the two-argument truth-function defined by A1 ∗ A2, where A1 and A2 represent, respectively, the first and second coordinates. And a unary connective is characterized by the truth-function defined by its application to A1. For any given n-ary truth-function, we can introduce a corresponding n-place connective, one that determines the given function.

There are 16 possible binary truth-functions. This can be seen by noting that the domain of a binary truth-function consists of 4 pairs:

(T, T), (T, F), (F, T), (F, F)

For each pair there are two possible values: T and F. Hence, there are altogether 2 · 2 · 2 · 2 possibilities. (In other words, there are 16 possible columns in a truth-table with two atoms.) Accordingly, when considering binary connectives we have to consider 16 possibilities. Four of these are used, as primitives, in SC. But it is possible to set up languages based on any set of connectives. We can also consider connectives of higher arity, that is, connectives which combine, at one go, more than two sentences.

A set of connectives is called complete if all truth-functions are definable by sentences built by using only connectives from the set. The theorem proved in the previous subsection shows that ¬, ∧, and ∨ constitute a complete connective set. Since ∨ is expressible in terms of ¬ and ∧ (cf. (5) in 2.2.2), ¬ and ∧ constitute by themselves a complete set. For similar reasons, ¬ together with either ∨ or → constitutes a complete connective set. Hence, the following sets of connectives are complete.

{¬, ∧}, {¬, ∨}, {¬, →}
Obviously, every set that includes one of them as a subset is complete as well. It can be shown that none of the following is complete. {∧, ∨, →, ↔},
{¬, ↔}
Everything concerning the expressive power of sets of unary and binary connectives is well-known. There are exactly two binary connectives that form, each by itself, a complete set. One, known as Sheffer’s stroke and usually denoted as ‘|’, is given by the equivalence:

A|B ≡ ¬(A ∧ B)

Sheffer’s stroke is sometimes called nand (not-and). It is also called alternative denial, because A|B ≡ ¬A ∨ ¬B. To show that Sheffer’s stroke is complete, it suffices to show that negation and conjunction are expressible by it. Negation is expressible, since

¬A ≡ A|A

We have also:

¬(A|B) ≡ A ∧ B

Expressing the negation on the left-hand side in terms of Sheffer’s stroke, we see that conjunction is expressible as well:

(A|B)|(A|B) ≡ A ∧ B

The second complete binary connective, whose sign is often ‘↓’, is sometimes called nor, or joint denial. Its truth-table is given by:

A ↓ B ≡ ¬(A ∨ B)

which is equivalent to:

A ↓ B ≡ ¬A ∧ ¬B

Note: Among the sixteen possible binary connectives, six are “degenerate”, i.e., have one or more dummy arguments. These are:

The “tautology connective”, whose truth-value function assigns to all pairs the value T, and the “contradiction connective”, whose truth-value function assigns to all pairs the value F.

The “first-coordinate connective”, whose truth-value function assigns to every pair (x1, x2) the first coordinate x1, and the negated first-coordinate connective, whose truth-value function assigns to every pair the toggled first coordinate.
The “second-coordinate connective”, and the negated second-coordinate connective.
Homework
6.5 Prove that ↓ is, by itself, complete.
6.6 Recall that the dual of a connective is the connective whose truth-table is obtained by toggling everywhere T and F (cf. 2.5.3). Let →ᴰ and ↔ᴰ be the duals of → and ↔. Show that:
(1) A →ᴰ B is expressible in terms of ¬ and →.
(2) A ↔ᴰ B is expressible in terms of ¬ and ↔.
6.7 Show that, by using →ᴰ as a single connective, a contradictory sentence can be constructed, and that the same is true for ↔ᴰ. Use this to show that negation is expressible in terms of → and →ᴰ, as well as in terms of ↔ and ↔ᴰ. What can you infer from this concerning the completeness of {→, →ᴰ}?
6.8 Show that disjunction is expressible in terms of conditional (i.e., that a sentence logically equivalent to A ∨ B can be constructed from A and B, using → only).
6.9 Let A1 and A2 be atoms. Show that if each of A and B is equivalent to one of:
A1 → A1 (a tautology),
A1,
A2,
A1 → A2,
A2 → A1,
A1 ∨ A2,
then also A → B is equivalent to one of these sentences. Use this in an inductive argument to prove a certain restriction on the expressive power of the language that has → as the only connective.
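The equivalences stated earlier in this subsection for Sheffer’s stroke and for ↓, and the count of six degenerate binary connectives, can be confirmed by running through all truth-value pairs. The sketch below is only a truth-table check under illustrative names (nand, nor, equivalent, depends_on); it is not a substitute for the arguments in the text.

```python
from itertools import product

PAIRS = list(product([True, False], repeat=2))

def nand(a, b): return not (a and b)   # Sheffer's stroke A|B
def nor(a, b): return not (a or b)     # joint denial A ↓ B

def equivalent(f, g):
    """True if f and g agree on every pair of truth-values."""
    return all(f(a, b) == g(a, b) for a, b in PAIRS)

checks = {
    "A|B = ~A v ~B": equivalent(nand, lambda a, b: (not a) or (not b)),
    "~A = A|A": equivalent(lambda a, b: not a, lambda a, b: nand(a, a)),
    "A & B = (A|B)|(A|B)": equivalent(
        lambda a, b: a and b, lambda a, b: nand(nand(a, b), nand(a, b))),
    "A nor B = ~A & ~B": equivalent(nor, lambda a, b: (not a) and (not b)),
}

def depends_on(table, i):
    """Does flipping coordinate i ever change the value of the connective?"""
    def flip(v):
        return tuple((not x) if j == i else x for j, x in enumerate(v))
    return any(table[v] != table[flip(v)] for v in PAIRS)

# A connective is degenerate if it ignores at least one of its arguments.
degenerate = [cols for cols in product([True, False], repeat=4)
              if not all(depends_on(dict(zip(PAIRS, cols)), i) for i in (0, 1))]

print(checks)
print(len(degenerate))
```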
6.2 Deductive Systems of Sentential Calculi
6.2.1 On Formal Deductive Systems
Proving claims, reasoning and drawing conclusions, are fundamental in all cognitive domains. Logic, as was pointed out, is concerned with certain basic aspects of these activities. The outcome of the proving activity is a proof: a sequence of propositions that are supposed to establish the desired conclusion. As a rule, proofs presuppose an understanding of the concepts under consideration. Proofs come in many grades of precision and rigour. Proofs in mathematics, for example, are a far cry from “proofs” in philosophy, which rest on partially understood concepts and on
unspecified assumptions, and which are often controversial and subject to unending debates. But mathematics, as well, presupposes a great deal of intuitive understanding. The history of the subject shows that even mathematical proofs have not been immune to error and confusion.
The drive for clarity and rigour has resulted in setups in which proofs are subject to strict requirements. In classical form, a proof should start from certain propositions, chosen from a set fixed in advance, and every step should conform to certain rules. The paradigm of such a system has been Euclidean geometry (dating back to the fourth century B.C.), whose importance in the history of science and ideas can hardly be exaggerated. The propositions that serve as the starting points of proofs are known as axioms. The rules that determine which steps are allowed are known as inference rules.
When a given domain is organized as an axiomatic system, we can think of the axioms as self-evident truths. And we can view the inference rules as obvious truth-preserving rules, i.e., they never lead from true premises to a false conclusion. Proofs constructed in this way are therefore guaranteed to produce true propositions. The usefulness of proofs lies in the fact that, though each axiom is obvious and each step is simple, the conclusion can be a highly informative, far from obvious statement.
The axiomatic method effected a grand systematization of geometry and gave it a particular shape. It served not only as a fool-proof guard against error, but as a guide for discovering new geometrical truths. At the same time it provided a framework for communicating problems and results. It became a basic paradigm, an example to be followed by scientists and philosophers through centuries. Euclidean geometry relied, nonetheless, on many intuitions, quite a few of which were left implicit. Later geometricians, who noted these lacunae, made various assumptions explicit in the form of additional axioms.
The more precise the system became, the less it relied on unanalysed geometrical intuitions. The big breakthrough came at the turn of the twentieth century in the works of Hilbert. He showed that geometry can be completely reduced to a formal system, characterized by a certain set of axioms and a certain way of constructing proofs, which do not require any geometrical intuition. He thereby indicated the possibility of setting up a purely formal deductive system, one that is based on an uninterpreted language, which does not presuppose an understanding of the symbols’ meaning. In such a system, the construction of proofs amounts to symbol manipulation and belongs to the level of pure syntax.
These developments formed part of the general evolution of modern logic. At about the same time Peano, drawing on Dedekind’s work, proposed a formal deductive system for the theory of natural numbers. Frege’s systems are essentially fully fledged deductive systems. To a lesser degree this is also true of Whitehead and Russell’s Principia Mathematica.
A (formal) deductive system consists of:
(I) A formal language
(II) Rules that define the system’s proofs.
Most often (II) is given by:
(II.1) A set of axioms
(II.2) A set of inference rules.
Roughly speaking, a proof is a construct formed by repeated application of inference rules; the axioms serve as starting points.
Note: “Proof” is used here in more than one sense. The proofs that belong to deductive systems are formal structures, represented by arrangements of symbols in sequences, or in trees. But we speak also of our own arguments and reasonings as proofs; for example, the (forthcoming) proofs that every sentence in SC is equivalent to a sentence in DNF, and that a given set of connectives is complete. Do not confuse these two notions of proof! The context always indicates which notion is meant. A similar ambiguity surrounds the term “theorem”. We use it to refer to what is proved in a deductive system, as well as to claims we ourselves make. Again, the context indicates the intended meaning.
The importance of deductive systems does not derive from the practicality of their proofs (though some have found computerized applications), but from the light they throw on our reasoning activity. The very possibility of capturing a sizable chunk of our reasoning by means of a completely formal system, one which is itself amenable to mathematical analysis, is extremely significant. We can thus reason about reasoning, and we can prove that some things are provable and some are not.
When restricted to classical sentential logic, deductive systems do not play a crucial role, because truth-table checking can decide whether given premises tautologically imply a given conclusion. Yet, they are extremely important. First, because sentential deductive systems constitute the core of richer systems in richer languages, such as first-order logic, where nothing like truth-table checking is available. Second, they serve as a basis and as a point of comparison for various enriched sentential logics, which are beyond the scope of truth tables. Finally, they are the simplest example that beginners can study.
6.2.2 Hilbert-Type Deductive Systems
The simplest type of deductive system is often referred to as the Hilbert-type. In this type the axioms are certain sentences and each inference rule consists of: (i) a list of sentence-schemes referred to as the premises, (ii) a sentence-scheme referred to as the conclusion. It is customary to write an inference rule in the form:
A1, A2, . . . , Am
------------------
B
where the Ai’s are the premises and B is the conclusion. The rule allows us to infer B from A1, . . . , Am. The most common rule is modus ponens:
A, A → B
--------
B
which allows us to infer B from the two sentences A → B and A. Here A and B can be any sentences. The rule is a scheme that covers an infinite number of cases. We shall return to it in the next subsection. In principle, the number of premises (which can vary according to the rule) can be any finite number; but it is usually one or two.
Proofs and Theorems:
A proof in a Hilbert-type system is a finite sequence of sentences
B1, B2, . . . , Bn
in which every sentence is either an axiom or is inferred from previous sentences by an inference rule. Stated formally: for every k = 1, . . . , n either (i) Bk is an axiom, or (ii) there are j1, . . . , jm < k, such that Bk is inferred from Bj1, Bj2, . . . , Bjm by an inference rule.
Terminology: A proof, B1, B2, . . . , Bn, is said to be a proof of Bn; we also say that Bn is the sentence proved by this proof. A sentence is said to be provable if there is a proof of it. A provable sentence is also called a theorem (of the given system).
Note: If B1, . . . , Bn is a proof, then, trivially, every initial segment of it: B1, . . . , Bj, where j ≤ n, is a proof. Hence, all sentences occurring in a proof are provable.
Note: We can subsume the concept of axiom under the concept of inference rule, by allowing rules with an empty list of premises. A proof can then be described as a sequence of sentences in which every sentence is inferred from previous ones by some inference rule; axioms are included because they are inferred from the empty set.
It is not difficult to see that the set of theorems of a deductive system can be defined inductively as the smallest set satisfying:
(I) Every axiom is a theorem.
(II) If B1, . . . , Bm are theorems and A is inferred from B1, . . . , Bm by an inference rule, then A is a theorem.
From this viewpoint, proofs are constructs that show explicitly that a given sentence is obtainable by applying (I) and (II).
Proofs as Trees: The identification of proofs with sequences is the simplest, but by no means the only possible way of defining this concept. Proofs can also be defined as trees. The leaves of the proof-tree are labeled by axioms and every non-leaf is inferred from its children by an inference rule. The sentence that labels the root is the one proved by the tree. Proof-trees take more space, but give a fuller picture that shows explicitly the premises from which each sentence is inferred.
Notation: If D is a deductive system, then
⊢_D A
means that A is provable in D. The subscript ‘D’ is omitted if the intended system is obvious.
6.2.3 A Hilbert-Type Deductive System for Sentential Logic
The following system is one of the simplest deductive systems that are adequate for the purposes of sentential logic. We shall denote it by ‘HS1’ (‘HS’ for ‘Hilbert-type Sentential logic’). The language of HS1 is based on our infinite list of atomic sentences and on two connectives, ¬ and →. This means that the sentences of HS1 are built from atoms using ¬ and → only. Other connectives are to be expressed, if needed, in terms of ¬ and → (cf. page 215).
The axioms of HS1 are all the sentences which fall under one of the following three schemes:
(A1) A → (B → A)
(A2) (A → (B → C)) → ((A → B) → (A → C))
(A3) (¬A → ¬B) → (B → A)
It has a single inference rule, modus ponens:
A → B, A
--------
B
Each of (A1), (A2), (A3) covers an infinite number of sentences. For example, the following are axioms, since they fall under (A1):
¬(A2 → A4) → (¬A5 → ¬(A2 → A4))
(A2 → (A3 → ¬A2)) → (A1 → (A2 → (A3 → ¬A2)))
A1 → ((A1 → A1) → A1)
Modus ponens is schematic as well. We can, for example, infer:
from A6 → A2 and A6: the sentence A2,
from ¬(A1 ∧ A2) → (A1 ∨ A2) and ¬(A1 ∧ A2): the sentence (A1 ∨ A2),
from (A1 ∨ A3) → [A3 → ¬A5] and A1 ∨ A3: the sentence A3 → ¬A5,
and so on.
Since we employ sentential variables throughout, our claims are of schematic nature. When we say that
⊢_HS1 A → A
we mean that every sentence of the form A → A is a theorem of HS1. The HS1-proofs we construct are, in fact, proof-schemes. Here, for example, is a proof (scheme) of A → A. For the sake of clarity the sentences are written on separate numbered lines, with marginal indications of the axiom scheme under which each sentence falls or the previous sentences from which it is inferred. (The line-numbers and the marginal indications are not part of the formal proof.)
1. A → ((A → A) → A)    (A1)
2. (A → ((A → A) → A)) → ((A → (A → A)) → (A → A))    (A2)
3. (A → (A → A)) → (A → A)    (from 1. and 2.)
4. A → (A → A)    (A1)
5. A → A    (from 3. and 4.)
1. is an instance of (A1) where B has been replaced by A → A; the same substitution, together with the replacement of C by A, yields 2. as an instance of (A2); 4. is another instance of (A1), obtained by replacing B by A. When proofs are defined as trees, the proof just given becomes:
[Figure: proof-tree. The root is labeled A → A (line 5); its children are lines 3 and 4; the children of line 3 are lines 1 and 2. The leaves 1, 2 and 4 are axioms.]
Note: When constructing a proof in sequential form, a rule of inference can be applied to any previously constructed sentences. If, in the above-given sequence, we move 4. to the beginning (and leave the rest of the order unchanged) we get a proof of the same sentence, in which the fifth sentence is obtained by applying modus ponens to the first and the fourth. The corresponding proof-tree is unaffected by that modification.
Here is another example: a proof of ¬B → (B → A). The indications in the margin are left as an exercise.
1. (¬A → ¬B) → (B → A)
2. [(¬A → ¬B) → (B → A)] → [¬B → ((¬A → ¬B) → (B → A))]
3. ¬B → (¬A → ¬B)
4. ¬B → ((¬A → ¬B) → (B → A))
5. [¬B → ((¬A → ¬B) → (B → A))] → [(¬B → (¬A → ¬B)) → (¬B → (B → A))]
6. (¬B → (¬A → ¬B)) → (¬B → (B → A))
7. ¬B → (B → A)
Homework 6.10 Find for each sentence in the last proof the axiom under which it falls or the two previous sentences from which it is inferred (by modus ponens). In the case of the axiom, write the substitution that yields the desired instance.
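Because proofs in HS1 are purely syntactic objects, checking one is mechanical. The following is a small sketch of such a checker, under assumptions of our own: sentences are represented as nested tuples, the strings "A", "B", "C" serve both as sentential variables and as the metavariables of the schemes, and the names match, is_axiom and check_proof are illustrative. It only accepts or rejects a sequence; it does not report which axiom or lines justify each step.

```python
def imp(a, b): return ("->", a, b)
def neg(a): return ("~", a)

# The three axiom schemes of HS1, written with metavariables "A", "B", "C".
SCHEMES = {
    "A1": imp("A", imp("B", "A")),
    "A2": imp(imp("A", imp("B", "C")), imp(imp("A", "B"), imp("A", "C"))),
    "A3": imp(imp(neg("A"), neg("B")), imp("B", "A")),
}

def match(scheme, sentence, subst):
    """Try to extend subst so that the scheme, under subst, equals sentence."""
    if isinstance(scheme, str):                      # a metavariable
        return subst.setdefault(scheme, sentence) == sentence
    return (isinstance(sentence, tuple) and sentence[0] == scheme[0]
            and all(match(s, t, subst)
                    for s, t in zip(scheme[1:], sentence[1:])))

def is_axiom(sentence):
    return any(match(s, sentence, {}) for s in SCHEMES.values())

def check_proof(lines):
    """Every line must be an axiom or follow from two earlier lines
    by modus ponens."""
    seen = set()
    for s in lines:
        if not (is_axiom(s) or any(imp(p, s) in seen for p in seen)):
            return False
        seen.add(s)
    return True

A = "A"
proof_of_A_imp_A = [
    imp(A, imp(imp(A, A), A)),
    imp(imp(A, imp(imp(A, A), A)), imp(imp(A, imp(A, A)), imp(A, A))),
    imp(imp(A, imp(A, A)), imp(A, A)),
    imp(A, imp(A, A)),
    imp(A, A),
]
print(check_proof(proof_of_A_imp_A))
```

The same function can be applied to the seven-line proof of ¬B → (B → A); a sequence consisting of A → A alone is rejected, since that sentence is neither an axiom nor inferred from earlier lines.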
Proving that Something Is Provable
Finding proofs in HS1 by mere trying is far from easy. But there are techniques for showing that proofs exist, and for producing them if necessary, without having to construct them explicitly. For one thing, we can use sentences that have been proved already, or shown to have proofs, as axioms. Having shown that ⊢_HS1 A → A, we can use it in the following sequence in order to show that B → (A → A) is provable as well.
(A → A) → (B → (A → A)),
A → A,
B → (A → A)
The first sentence is an instance of (A1) and the third is derived from the previous two by modus ponens. The sequence is not a proof, because A → A is neither an axiom nor derivable by modus ponens from previous sentences. But we can replace A → A by the sequence that constitutes its proof, and the enlarged sequence is, as is easy to see, a proof of B → (A → A). We have thus shown that the sentence is provable, without constructing a proof. If called upon we can provide one. Applying again the same principle, we can use, from now on, B → (A → A).
Derived Inference Rules: Just as we can use theorems, we can use additional inference rules, provided that we show that everything provable with the help of these rules is also provable in the original system. Such rules are known as derived inference rules. Usually, the proof that a certain rule is derived will also show how every application of it can be reduced to applications of the original axioms and rules. For example, the following is a derived inference rule of HS1:
A
------
¬A → B
To prove this we have to show that from A we can get, by applying the axioms and rules of HS1, ¬A → B. Here is how to do it. We shall use ¬A → (A → B), whose provability has been established (cf. Homework 6.10, with A and B switched).
1. A
2. ¬A → (A → B)
3. [¬A → (A → B)] → [(¬A → A) → (¬A → B)]
4. (¬A → A) → (¬A → B)
5. A → (¬A → A)
6. ¬A → A
7. ¬A → B
Here 3. is an instance of (A2) and 5. is an instance of (A1); 4. is inferred from 2. and 3., 6. from 1. and 5., and 7. from 4. and 6.
Proofs From Premises
In the rest of this subsection, ‘⊢’ stands for ‘⊢_HS1’. The concept of provability is naturally extendible to provability from premises. As in 4.2, we use ‘Γ’, ‘∆’, ‘Γ′’, ‘∆′’, etc., for premise lists. The other notations introduced there will serve here as well.
Definition: A proof (in HS1) from a premise list Γ is a sequence of sentences B1, . . . , Bn such that every Bi is either (i) an axiom, or (ii) a member of Γ, or (iii) inferred from two previous members by modus ponens. We say that B1, . . . , Bn proves Bn from Γ. A sentence is provable from Γ if it has a proof from Γ; we denote this by:
Γ ⊢ B
Our previous concept of proof is the particular case in which Γ is empty. The notation in that case conforms to our previous usage:
⊢ B
Note: A premise list Γ is not a list of additional axiom schemes. Sentential variables that are used in writing premise-lists are meant to denote some particular unspecified sentences. But any general proof that Γ ⊢ B remains valid upon substitution of the sentential variables by any sentences.
Example: The following shows that A → (A → B) ⊢ A → B:
(A → (A → B)) → ((A → A) → (A → B)),
A → (A → B),
(A → A) → (A → B),
A → A,
A → B
The first sentence is an axiom (an instance of (A2)), the second is the premise, the third follows by modus ponens. The fourth is a previously established theorem and the last is obtained by modus ponens. The full formal proof from the premise A → (A → B) is obtained if we replace A → A by its proof.
The set of sentences provable from Γ can be defined inductively as the smallest set containing all axioms, all members of Γ, and closed under modus ponens.
Note: The concepts expressed by ‘|=’ and ‘⊢’ (both of which are symbols of our metalanguage) are altogether different. The first is semantic, defined in terms of interpretations and truth-values; the second is purely syntactic, defined in terms of formal inference rules. We shall see, however, that there are very close ties between the two. The establishing of these ties is one of the highlights of modern logic.
The following is obvious.
(1) If all the sentences in Γ occur in Γ′, then every sentence provable from Γ is provable from Γ′.
We also have:
(2) If Γ ⊢ A and Γ, A ⊢ B, then Γ ⊢ B
Intuitively, we may use A in proving B from Γ, because we can prove A itself from Γ. In a more formal manner: let A1, . . . , Ak−1, A be a proof of A from Γ and let B1, . . . , Bm−1, B be a proof of B from Γ, A; then A1, . . . , Ak−1, B1, . . . , Bm−1, B is a proof of B from Γ. (The occurrences of A in B1, . . . , Bm−1, B can be inferred from previous sentences in A1, . . . , Ak−1.) When proofs are trees, the proof of B from Γ is obtained by taking a proof of B from Γ, A and expanding every leaf labeled by A into a proof of A from Γ.
All the claims that we establish here for ⊢ hold if we replace ‘⊢’ by ‘|=’ (which, we shall see, is not accidental). For example, (2) is the exact analogue of (9) of 4.2. But the arguments that establish the properties of ⊢ are very different from those that establish their analogues for |=.
The Deduction Theorem
The following is the syntactic analogue of (|=, →) (cf. 4.2.1 (7)).
(3) Γ, A ⊢ B   iff   Γ ⊢ A → B.
The easy direction is from right to left: Consider a proof of A → B from Γ. If we add A to the premise-list, we can get, via modus ponens, B. The difficult direction is known as the Deduction Theorem:
If Γ, A ⊢ B then Γ ⊢ A → B.
Here is its proof. Consider a proof of B from Γ, A:
B1, B2, . . . , Bn
We show how to change it into a proof of A → B from Γ. First, construct the sequence of the corresponding conditionals:
A → B1, A → B2, . . . , A → Bi, . . . , A → Bn
This, as a rule, is not a proof from Γ. But we can insert before each A → Bi sentences so that the resulting sequence is a proof of A → Bn from Γ. Each Bi in the proof from Γ, A is either (i) an axiom or a member of Γ, or (ii) the sentence A, or (iii) inferred from two previous members by modus ponens.
If (i) is the case, insert before A → Bi the sentences
Bi → (A → Bi),   Bi
The first is an axiom, the second is (by our assumption) either an axiom or a member of Γ. From these two A → Bi is now inferred by modus ponens.
If (ii) is the case, then A → Bi = A → A, which, as we have seen, is provable in HS1; we can therefore insert before A → A sentences that, together with it, constitute its proof in HS1.
The remaining case is (iii). Say the two previous Bj’s which yield Bi via modus ponens are Bk → Bi and Bk. The original proof of B is something of the form:
. . . , Bk → Bi, . . . , Bk, . . . , Bi, . . .
(The relative order of Bk and Bk → Bi doesn’t matter.) This is converted into:
. . . , A → (Bk → Bi), . . . , A → Bk, . . . , A → Bi, . . .
Now insert before A → Bi the sequence:
[A → (Bk → Bi)] → [(A → Bk) → (A → Bi)],   (A → Bk) → (A → Bi)
The first is an axiom (an instance of (A2)); the second is inferred from A → (Bk → Bi) and the first by modus ponens. And now A → Bi is inferred by modus ponens from (A → Bk) → (A → Bi) and A → Bk.
After carrying out all the insertions, every sentence in the resulting sequence is either an axiom or a member of Γ, or inferred from two previous members. Hence we get a proof of A → B from Γ. QED
The deduction theorem is a powerful tool for showing the provability of various sentences. We can now employ the technique of transferring the antecedent to the left-hand side, which was used in the context of logical implication. Here, for example, is an argument showing that
⊢ (A → B) → [(B → C) → (A → C)]
Using the deduction theorem, it suffices to show that:
(a) A → B ⊢ (B → C) → (A → C)
Again, by the deduction theorem, the following is sufficient for establishing (a):
(b) A → B, B → C ⊢ A → C
Using the deduction theorem for the third time, (b) reduces to:
(c) A → B, B → C, A ⊢ C
But (c) is obvious: From A → B and A we can infer B, and from B and B → C we can infer C.
Note: The concept of proof from premises is definable for general deductive systems. Some systems have inference rules whose applicability to arbitrary sentences is subject to certain restrictions. In such systems the definition of proofs from premises is modified accordingly. However, (2), (3), and all other properties that are analogues of the implication laws, hold throughout.
Homework 6.11 Prove the following:
1. A, ¬A ⊢ B
Hint: use ¬A → (¬B → ¬A), axiom (A3) and twice modus ponens.
2. ¬¬A ⊢ A
Hint: use 1. with A and B replaced by their negations, transfer ¬A via the deduction theorem, choose for B any axiom (or theorem) of HS1.
3. A ⊢ ¬¬A
Hint: get from 2. ⊢ ¬¬A → A, replace A by ¬A, then use an instance of (A3) and modus ponens.
4. If Γ, ¬A ⊢ ¬B then Γ, B ⊢ A.
Hint: show that the assumption implies that Γ ⊢ ¬A → ¬B, then use (A3) and modus ponens.
5. If Γ, A ⊢ B then Γ, ¬B ⊢ ¬A.
Hint: use 2. and 3. to show that the assumption implies that Γ, ¬¬A ⊢ ¬¬B; then use 4.
6. A, ¬B ⊢ ¬(A → B)
Hint: apply 5. to: A, A → B ⊢ B.
7. If Γ, ¬A ⊢ C and Γ, B ⊢ C, then Γ, A → B ⊢ C.
Hint: get from the first assumption, via 5. and 2., Γ, ¬C ⊢ A; get from the second Γ, ¬C ⊢ ¬B; applying 6., get Γ, ¬C ⊢ ¬(A → B), then apply 4.
8. ¬(A → B) ⊢ A
Hint: get from 1. ¬A ⊢ A → B; then apply 5. (with an empty Γ) and 2.
9. ¬(A → B) ⊢ ¬B
Hint: get from (A1) B ⊢ A → B; then apply 5.
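The proof of the Deduction Theorem given earlier in this subsection is constructive: it says exactly which sentences to insert. The sketch below follows its three cases literally. It is only an illustration under our own assumptions (nested tuples for sentences, strings for sentential variables, and the hypothetical names check and deduce); the checker is included so that the output can be verified as a proof from premises.

```python
def imp(a, b): return ("->", a, b)
def neg(a): return ("~", a)

# HS1 axiom schemes; the strings "A", "B", "C" act as metavariables here.
SCHEMES = [
    imp("A", imp("B", "A")),
    imp(imp("A", imp("B", "C")), imp(imp("A", "B"), imp("A", "C"))),
    imp(imp(neg("A"), neg("B")), imp("B", "A")),
]

def match(scheme, s, sub):
    if isinstance(scheme, str):
        return sub.setdefault(scheme, s) == s
    return (isinstance(s, tuple) and s[0] == scheme[0]
            and all(match(x, y, sub) for x, y in zip(scheme[1:], s[1:])))

def is_axiom(s):
    return any(match(scheme, s, {}) for scheme in SCHEMES)

def check(lines, premises=()):
    """Is `lines` a proof from `premises` (axioms, premises, modus ponens)?"""
    seen = set()
    for s in lines:
        if not (is_axiom(s) or s in premises
                or any(imp(p, s) in seen for p in seen)):
            return False
        seen.add(s)
    return True

def self_implication(a):
    """The five-line proof of a -> a."""
    aa = imp(a, a)
    return [imp(a, imp(aa, a)),
            imp(imp(a, imp(aa, a)), imp(imp(a, aa), aa)),
            imp(imp(a, aa), aa), imp(a, aa), aa]

def deduce(proof, gamma, a):
    """Turn a proof of B from gamma, a into a proof of a -> B from gamma."""
    out = []
    for i, b in enumerate(proof):
        if b == a:                                    # case (ii)
            out += self_implication(a)
        elif is_axiom(b) or b in gamma:               # case (i)
            out += [imp(b, imp(a, b)), b, imp(a, b)]
        else:                                         # case (iii)
            bk = next(p for p in proof[:i] if imp(p, b) in proof[:i])
            out += [imp(imp(a, imp(bk, b)), imp(imp(a, bk), imp(a, b))),
                    imp(imp(a, bk), imp(a, b)),
                    imp(a, b)]
    return out

A, B, C = "A", "B", "C"
gamma = [imp(A, B), imp(B, C)]
step_c = [imp(A, B), A, B, imp(B, C), C]           # (c): C from gamma, A
step_b = deduce(step_c, gamma, A)                  # (b): A -> C from gamma
step_a = deduce(step_b, [imp(A, B)], imp(B, C))    # (a)
theorem = deduce(step_a, [], imp(A, B))            # no premises left
print(check(theorem), theorem[-1])
```

Applying deduce three times reproduces the argument (c), (b), (a) from the text and yields an explicit, though long, HS1 proof of (A → B) → [(B → C) → (A → C)].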
6.2.4 Soundness and Completeness
A formal deductive system is defined without recourse to semantic notions. Its significance, however, derives from its relation to some semantics. At least, this is the case in systems that are based on classical logic or some variant of it. The semantics is given by a class of possible interpretations, in each of which every sentence gets a truth-value.
Soundness
A deductive system, D, is said to be sound for a given semantics, if everything provable in D is true in all interpretations. This has a generalized form that applies to proofs from premises:
For all Γ and all A, if Γ ⊢_D A, then there is no interpretation in which all members of Γ are true and A is false.
Roughly speaking, it means that the proofs of the system can never lead us from true premises to false conclusions.¹
The soundness of D is proved by establishing the following two claims:
(S1) Every axiom of D is true in all interpretations.
(S2) The inference rules preserve truth-in-all-interpretations, that is: if the premises of an inference rule are true in all interpretations, so is the conclusion.
For the generalized form, (S2) is replaced by:
(S2∗) For every interpretation, if all premises of an inference rule are true in the interpretation, its conclusion is true in it as well.
¹ The generalized form can be deduced directly from the first, provided that the underlying language has (or can express) → and the deduction theorem holds. In other cases the term “strong soundness” is sometimes used for the generalized form.
(S1) and (S2) imply that all sentences constructed in the course of a proof of D are true in all interpretations. We never get outside the set of all true-in-all-interpretations sentences, because the axioms are in that set and all applications of inference rules leave us in it. Similarly, (S1) and (S2∗) imply that, for any given interpretation, if the premises are true in the interpretation, then every proof from these premises leaves us within the set of sentences true in that interpretation. It is an inductive argument: the set of provable (or provable from Γ) sentences is the smallest set containing the axioms (and the members of Γ) and closed under the inference rules. To show that all the sentences in this set have some property, we show that all axioms (and all members of Γ) have the property, and that the set of sentences having this property is closed under the inference rules.
In the case of the sentential calculus, the interpretations consist of all truth-value assignments to the atoms. Truth under all interpretations means tautological truth. Presupposing this semantics, the soundness of HS1 is the requirement that every provable sentence is a tautology; or, in symbols, that for every A:
⊢ A   =⇒   |= A
Similarly, generalized soundness means that for every Γ and A:
Γ ⊢ A   =⇒   Γ |= A
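For HS1 the two claims (S1) and (S2∗) are finite checks. The truth-value of an instance of an axiom scheme depends only on the truth-values of the sentences substituted for A, B and C, so it is enough to run through the eight combinations of values. The sketch below does this; the names AXIOMS, is_tautology, s1 and s2_star are illustrative.

```python
from itertools import product

def imp(x, y):
    """Truth-table of the conditional."""
    return (not x) or y

# The three axiom schemes of HS1 as truth-functions of the values of A, B, C.
AXIOMS = {
    "A1": lambda a, b, c: imp(a, imp(b, a)),
    "A2": lambda a, b, c: imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c))),
    "A3": lambda a, b, c: imp(imp(not a, not b), imp(b, a)),
}

def is_tautology(f, arity=3):
    return all(f(*v) for v in product([True, False], repeat=arity))

# (S1): every axiom scheme comes out true under all value combinations.
s1 = all(is_tautology(f) for f in AXIOMS.values())

# (S2*): whenever A -> B and A are both true, B is true.
s2_star = all(b for a, b in product([True, False], repeat=2)
              if imp(a, b) and a)

print(s1, s2_star)
```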
To prove that HS1 is sound we show (S1) and (S2) (for the generalized form, (S2∗)). That proof is easy: (S1) means that all axioms are tautologies, which can be verified directly by considering truth-tables. Since modus ponens is our only inference rule, (S2∗) amounts to the claim that whenever A → B and A are true in some interpretation, so is B. This, by the truth-table of →, is trivial.
Completeness
A deductive system D is complete with respect to the given semantics, if every sentence that is true under all interpretations is provable in D. If, again, we consider also proofs from premises, we get the generalized form of completeness:
If, in every interpretation in which all members of Γ are true, A is true, then A is provable from Γ.
The non-generalized form is the particular case where Γ is empty.² Completeness means that the deductive system is powerful enough for proving all sentences that are always true (or, in the generalized form, for establishing all logical implications).
² The generalized form can be deduced directly from the first provided that: (i) the underlying language has (or can express) →, (ii) modus ponens is a primitive or derived inference rule, and (iii) only finite premise-lists are considered. In other cases the term “strong completeness” is sometimes used for the generalized notion.
Completeness is of course a highly desirable property. Deductive systems that are not complete fail to express the full content of the semantics. But completeness is not as essential as soundness. It is also much more difficult to prove. Unlike soundness, there is no general inductive argument for establishing it.
Once the language and its semantics (set of possible interpretations) are chosen, the existence of a formal deductive system that is both sound and complete is extremely significant. It means that by using a purely syntactic system we can characterize basic semantic notions. For many interpreted languages completeness is unachievable.³ The most notable case in which we have a deductive system that is both sound and complete is first-order logic, the subject of this book.
In the case of HS1 completeness means that, for every sentence A:
|= A   =⇒   ⊢ A
Or, in the generalized form, for every Γ and A:
Γ |= A   =⇒   Γ ⊢ A
If both soundness and completeness hold, then we have:
Γ |= A   ⇐⇒   Γ ⊢ A
Soundness is the ⇐-direction, completeness the ⇒-direction.
The Completeness of HS1: The fool-proof top-down method of chapter 4 (cf. 4.3) can be used in order to show that HS1 is complete. Among other things, we have noted in 4.3 that the method applies to any sublanguage of SC that has negation among its connectives. Any true implication claim can therefore be derived by applying (repeatedly) the laws correlated with these connectives to self-evident implications of the types:
(I.1) Γ, A |= A
(I.2) Γ, A, ¬A |= B
In the case of HS1, the only connectives are ¬ and →. Consequently, there are six laws altogether. Three cover the cases of double negation, conditional and negated conditional in the premises, and three cover the same cases in the conclusion. The first group is:
(Pr1) Γ, A |= C   ⇐⇒   Γ, ¬¬A |= C
(Pr2) Γ, ¬A |= C and Γ, B |= C   ⇐⇒   Γ, A → B |= C
(Pr3) Γ, A, ¬B |= C   ⇐⇒   Γ, ¬(A → B) |= C
³ The most important is the language of arithmetic, which describes the natural-number system with addition and multiplication. Gödel’s incompleteness theorem shows that completeness is out of the question, for this and any richer language.
The second group is:
(Cn1) Γ |= A   ⇐⇒   Γ |= ¬¬A
(Cn2) Γ, A |= B   ⇐⇒   Γ |= A → B
(Cn3) Γ |= A and Γ |= ¬B   ⇐⇒   Γ |= ¬(A → B)
In a bottom-up proof we start with implications of the types (I.1) and (I.2) and apply the laws repeatedly in the ⇒ direction. In order to establish completeness we shall prove analogous claims for the ⇒ direction, where ‘|=’ is replaced by ‘⊢’. That is, we show the following:
(I.1∗) Γ, A ⊢ A
(I.2∗) Γ, A, ¬A ⊢ B
as well as:
(Pr1∗) Γ, A ⊢ C   ⇒   Γ, ¬¬A ⊢ C
(Pr2∗) Γ, ¬A ⊢ C and Γ, B ⊢ C   ⇒   Γ, A → B ⊢ C
(Pr3∗) Γ, A, ¬B ⊢ C   ⇒   Γ, ¬(A → B) ⊢ C
(Cn1∗) Γ ⊢ A   ⇒   Γ ⊢ ¬¬A
(Cn2∗) Γ, A ⊢ B   ⇒   Γ ⊢ A → B
(Cn3∗) Γ ⊢ A and Γ ⊢ ¬B   ⇒   Γ ⊢ ¬(A → B)
Assume for the moment that we have shown this. In 4.3 we claimed that, starting with an initial goal, the reduction is bound to terminate in a tree in which all end goals (in the leaves) are elementary. We appealed to the fact that the reductions always reduced the goal’s complexity. Here we shall make this reasoning precise by turning it into an inductive argument.
Let the weight of a sequence of sentences, ∆, be the sum of all numbers contributed by connective occurrences in ∆, where each occurrence of ¬ contributes 1 and each occurrence of → contributes 2. It is easily seen that, in each of the six claims given above, the sequence of sentences involved in the conclusion on the right-hand side (of ‘⇒’) has greater weight than the sequence of sentences involved in each of the left-hand side premises.
We now show by induction on the weight of Γ, A, that
if Γ |= A then Γ ⊢ A
If the weight is 0, all the sentences in Γ, A are atoms. In this case, and in the more general case where all the sentences are literals, we have: if Γ |= A, then either A occurs in Γ or some atom and its negation occur in Γ. Otherwise, we could define (as in 4.3.3) a truth-value assignment under which all the premises come out true and A comes out false. Hence, by (I.1∗) and (I.2∗), Γ ⊢ A.

If not all sentences are literals, then either some premise or the conclusion is a conditional, or a negated conditional, or a doubly negated sentence. Each of these possibilities is taken care of by the corresponding claim from (Pri∗), (Cni∗), i = 1, 2, 3. For example, consider the case where Γ = Γ′, B → C, i.e., where we have:

    Γ′, B → C |= A

By the ⇐ direction of (Pr2) we get:

    Γ′, ¬B |= A   and   Γ′, C |= A

Each of Γ′, ¬B, A and Γ′, C, A has smaller weight than Γ′, B → C, A. Hence, by the induction hypothesis:

    Γ′, ¬B ⊢ A   and   Γ′, C ⊢ A

Applying (Pr2∗) we get:

    Γ′, B → C ⊢ A
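The weight bookkeeping in this argument can be checked mechanically. The following Python sketch is only an illustration: the encoding of sentences as nested tuples and the names weight and sequence_weight are assumptions made here, not part of HS1. It computes the weights for one instance of the case just treated, with sample sentences in the roles of Γ′, A, B and C.

```python
# Sentences of the HS1 language as nested tuples (an encoding assumed here):
# an atom is a string, ("not", X) is a negation, ("imp", X, Y) a conditional.

def weight(sentence):
    """1 for each occurrence of negation, 2 for each occurrence of the conditional."""
    if isinstance(sentence, str):
        return 0
    if sentence[0] == "not":
        return 1 + weight(sentence[1])
    if sentence[0] == "imp":
        return 2 + weight(sentence[1]) + weight(sentence[2])
    raise ValueError(f"not an HS1 sentence: {sentence!r}")


def sequence_weight(sentences):
    """Weight of a sequence of sentences: the sum of the weights of its members."""
    return sum(weight(s) for s in sentences)


# Sample sentences, chosen only for illustration.
gamma_prime = [("not", "p")]
A, B, C = "s", "q", ("not", "r")

goal = gamma_prime + [("imp", B, C), A]   # Γ', B → C, A
left = gamma_prime + [("not", B), A]      # Γ', ¬B, A
right = gamma_prime + [C, A]              # Γ', C, A

print(sequence_weight(goal), sequence_weight(left), sequence_weight(right))
```

In this instance both reduced sequences weigh less than the original one, which is what the induction step needs.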
It remains to prove (I.1∗), (I.2∗) and the six claims (Pri∗), (Cni∗), i = 1, 2, 3. Of these, (I.1∗) is trivial and (Cn2∗) is the deduction theorem. The rest follow from the claims of Homework 6.11:

    (I.2∗) follows from 1.
    (Pr1∗) follows from 2. (since ¬¬A ⊢ A, everything provable from Γ, A is provable from Γ, ¬¬A).
    (Cn1∗) follows from 3. (if A is provable from Γ, so is ¬¬A, since A ⊢ ¬¬A).
    (Pr2∗) is 7.
    (Pr3∗) follows from 8. and 9. (since both A and ¬B are provable from ¬(A → B), everything provable from Γ, A, ¬B is provable from Γ, ¬(A → B)).
    (Cn3∗) follows from 6. (if both A and ¬B are provable from Γ, so is ¬(A → B), since A, ¬B ⊢ ¬(A → B)).

If we extend the language of HS1 by adding connectives, we can get a sound and complete system provided that we add suitable axioms. Here again the fool-proof method can guide us. For example, if we add conjunction, we should add sound axioms such that the following four claims are satisfied. They correspond to the conjunction and negated-conjunction laws of 4.3, for the premises and the conclusion.
    Γ, A, B ⊢ C                     ⇒   Γ, A∧B ⊢ C
    Γ ⊢ A and Γ ⊢ B                 ⇒   Γ ⊢ A∧B
    Γ, ¬A ⊢ C and Γ, ¬B ⊢ C         ⇒   Γ, ¬(A∧B) ⊢ C
    Γ, A ⊢ ¬B                       ⇒   Γ ⊢ ¬(A∧B)
Homework 6.12 Write down, for each of the connectives ∨ and ↔, the additional properties that ⊢ should have in order to imply completeness for the system obtained by adding the connective.

For each of ∧, ∨ and ↔ we can state axioms (rather, axiom schemes) that guarantee completeness if the connective is added. In what follows the associated axioms are chosen so as to involve only the connective in question and → (i.e., no ¬). There are three axioms for each connective. If more than one connective is added, we simply include the axioms for each. In all cases modus ponens remains the only inference rule. Altogether we get, besides HS1, seven complete and sound deductive systems, which correspond to the seven non-empty subsets of {∧, ∨, ↔}.

Axioms for ∧:
    A∧B → A
    A∧B → B
    A → (B → (A∧B))

Axioms for ∨:
    (A → C) → [(B → C) → ((A ∨ B) → C)]
    A → A∨B
    B → A∨B

Axioms for ↔:
    (A ↔ B) → (A → B)
    (A ↔ B) → (B → A)
    (A → B) → [(B → A) → (A ↔ B)]

Homework 6.13 Prove the above claim for ∧, that is: if we add ∧ to the language and the corresponding axiom schemes to the deductive system, we get a complete system. (Hint: Among other things, show, using the third axiom for conjunction, that
    ¬C → A, ¬C → B ⊢ ¬C → (A ∧ B)
Using ¬D → E ⊢ ¬E → D, deduce from this:
    (¬A → C), (¬B → C) ⊢ ¬(A∧B) → C
Show also: A → ¬B ⊢ ¬(A ∧ B), by showing: A ∧ B ⊢ ¬(A → ¬B).)
6.14 Prove the above claim for ∨. (Hint: Among other things, show, using the first axiom for disjunction, that
    ¬C → ¬A, ¬C → ¬B ⊢ ¬C → ¬(A ∨ B)
Taking C to be a negation of an axiom, deduce from this:
    ¬A, ¬B ⊢ ¬(A ∨ B)
Using the other two axioms show that both ¬A and ¬B are provable from ¬(A ∨ B). Infer from this that ¬(A ∨ B) ⊢ ¬(¬A → B); from which you can infer that
    ¬A → B ⊢ A ∨ B )

6.15 Prove the above claim for ↔.
Note: When we add new connectives, the old axiom schemes cover additional sentences. E.g., (A1) originally covers all sentences A → (B → A) in which A and B are in the HS1 language. In the enriched language, (A1) covers all the cases where A and B are in that language.

Note: We can have fewer axioms for each additional connective, if we use ¬. We simply add axioms that allow us to express the new connectives in terms of ¬ and →. E.g., for conjunction we add:
    (A ∧ B) → ¬(A → ¬B)   and   ¬(A → ¬B) → (A ∧ B)
The significance of the previous axioms, which bypass negation, lies in expressing certain properties of the connective in terms of the conditional. These are both algebraic and proof-theoretic properties. These topics are beyond the scope of this book.
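Since soundness requires every added axiom to be a tautology, the axiom schemes above can be checked by truth-tables. The following Python sketch does this by brute force. It is an illustration under assumptions of our own: schemes are represented as Python functions of truth-values, and the helper names imp and is_tautology are invented here.

```python
from itertools import product


def imp(x, y):
    """Truth-function of the conditional."""
    return (not x) or y


def is_tautology(scheme, number_of_letters):
    """True if the scheme gets T under every assignment to its letters."""
    rows = product([True, False], repeat=number_of_letters)
    return all(scheme(*row) for row in rows)


conjunction_axioms = [
    lambda a, b: imp(a and b, a),
    lambda a, b: imp(a and b, b),
    lambda a, b: imp(a, imp(b, a and b)),
]
disjunction_axioms = [
    lambda a, b, c: imp(imp(a, c), imp(imp(b, c), imp(a or b, c))),
    lambda a, b, c: imp(a, a or b),
    lambda a, b, c: imp(b, a or b),
]
biconditional_axioms = [
    lambda a, b: imp(a == b, imp(a, b)),
    lambda a, b: imp(a == b, imp(b, a)),
    lambda a, b: imp(imp(a, b), imp(imp(b, a), a == b)),
]
# The two axioms that express conjunction through negation and the conditional.
definitional_axioms = [
    lambda a, b: imp(a and b, not imp(a, not b)),
    lambda a, b: imp(not imp(a, not b), a and b),
]

two_letter = conjunction_axioms + biconditional_axioms + definitional_axioms
all_sound = (all(is_tautology(ax, 2) for ax in two_letter)
             and all(is_tautology(ax, 3) for ax in disjunction_axioms))
print(all_sound)
```

Such a check establishes only that the axioms are sound. Completeness is the subject of the homework problems above.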
6.2.5 Gentzen-Type Deductive Systems
In chapter 4 (4.3 and 4.4) we studied methods for establishing claims of the form:

    B1, B2, . . . , Bn |= B

These methods can be represented as purely formal proof procedures, based on a vocabulary of uninterpreted symbols. This is done by using a new type of syntactic construct, entities of the form:

    B1, B2, . . . , Bn ⊳ B

where B1, . . . , Bn and B are sentences and ⊳ is a new symbol. (This includes the possibility n = 0, where the construct is: ⊳ B.)
Constructs of this form, called sequents, were introduced by Gentzen around 1934. The sequent symbol, ⊳, differs from ‘|=’ in that it belongs to the syntax of the formalism and does not stand for any English expression. Sequents are on the same level as uninterpreted sentences. But they are not sentences (one cannot, for example, apply sentential connectives to them). They form a new syntactic type.

Gentzen considered deductive systems in which certain sequents are designated as axioms and inference rules enable us to deduce sequents from other sequents. The theorems are therefore not sentences but sequents. If D is such a system, then

    ⊢D Γ ⊳ A

means that Γ ⊳ A is a theorem of D. All this is not done, of course, as a mere game. We intend to interpret the sequents as implication claims. Let us say that the sequent Γ ⊳ A is valid if Γ |= A, i.e., there is no interpretation of the language in which all sentences of Γ are true and A is not.

Note: Gentzen’s calculus differs from the one we are discussing in an important aspect. The sequents in it are of the form Γ ⊳ ∆, where both Γ and ∆ are finite sequences of sentences, one of which but not both can be empty. The sequent is valid if, for every interpretation, if all members of Γ are true then at least one member of ∆ is true. The inference rules for such a system are simpler and more elegant than the ones used in this book. Each connective can be handled separately without bringing in negation. For beginners the intended interpretation is more natural and easier to grasp if only one sentence is on the right-hand side. The present system is a variant constructed for the purpose of this book.

Accordingly, the properties of soundness and completeness for the system are the following.

Soundness: A Gentzen-type system, D, is sound, if for all Γ and A:

    ⊢D Γ ⊳ A   =⇒   Γ |= A

Completeness: A Gentzen-type system, D, is complete, if for all Γ and A:

    Γ |= A   =⇒   ⊢D Γ ⊳ A
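Validity of a sequent is a semantic notion, and for sentential logic it can be decided by running through all truth-value assignments. The following Python sketch does so. It is an illustration only: the encoding of sentences as nested tuples with the tags "not", "and", "or", "imp", "iff", and the function names, are assumptions made here.

```python
from itertools import product


def atoms_of(sentence):
    """Set of atoms (strings) occurring in a sentence given as a nested tuple."""
    if isinstance(sentence, str):
        return {sentence}
    return set().union(*(atoms_of(part) for part in sentence[1:]))


def value(sentence, assignment):
    """Truth-value of a sentence under an assignment of booleans to atoms."""
    if isinstance(sentence, str):
        return assignment[sentence]
    op, *parts = sentence
    vals = [value(p, assignment) for p in parts]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "imp":
        return (not vals[0]) or vals[1]
    if op == "iff":
        return vals[0] == vals[1]
    raise ValueError(f"unknown connective: {op!r}")


def is_valid(antecedent, succedent):
    """No assignment makes every antecedent sentence true and the succedent false."""
    letters = sorted(set().union(atoms_of(succedent),
                                 *(atoms_of(s) for s in antecedent)))
    for row in product([True, False], repeat=len(letters)):
        assignment = dict(zip(letters, row))
        if all(value(s, assignment) for s in antecedent) and not value(succedent, assignment):
            return False
    return True


print(is_valid([("imp", "p", "q"), "p"], "q"))
print(is_valid(["p"], "q"))
```

Soundness says that only sequents passing this test are provable, and completeness says that every sequent passing it is provable.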
Terminology: The antecedent of the sequent Γ ⊳ A, or its left-hand side, is Γ. Its succedent, or right-hand side, is A. We shall use these terms because ‘premises’ and ‘conclusion’ are now needed to refer to sequents that are themselves premises and conclusions in proofs of the sequent calculus.
We shall consider two Gentzen-type deductive systems for sentential logic that are sound and complete. They are obtained by straightforward formalization of the methods of 4.3 and 4.4. The first, which we denote as GS1, is based on ordinary sequents. The second, which we denote as GS2, corresponds to the proof-by-contradiction method and involves sequents of the form Γ ⊳ ⊥.
The Deductive System GS1
GS1 has the following two axiom schemes:

    (GA1)   Γ, A ⊳ A
    (GA2)   Γ, A, ¬A ⊳ C

Obviously, the axioms are valid. If we replace ‘⊳’ by ‘|=’ we get what in chapter 4 (cf. 4.3) we called self-evident implications.

GS1 has a rule that allows us to reorder the left-hand side of a sequent:

      Γ ⊳ A
    ---------  Reordering
     Γ′ ⊳ A

where Γ′ is obtained by reordering Γ.

The other inference rules, which constitute the heart of the system, correspond to the laws of 4.3.2. We have antecedent rules and succedent rules for double negation, for each binary connective and for each negated connective. In the following list the rules are arranged in two groups. The first consists of all antecedent rules, the second of all succedent rules. Each row contains the rules for some connective and its negation. (This is different from the arrangement of the laws in 4.3.2. But you can easily see the correspondence, which is also indicated by the rules’ names.)

Note that the laws in 4.3.2 are ‘iff’ statements. The ⇐-direction of the law is the premises-to-conclusion direction of the corresponding rule. Thus, the sequents of the premises are always simpler than the conclusion sequent.
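ANTECEDENT RULES

       Γ, A ⊳ C
    --------------  (¬¬ ⊳)
      Γ, ¬¬A ⊳ C

      Γ, A, B ⊳ C                          Γ, ¬A ⊳ C     Γ, ¬B ⊳ C
    ----------------  (∧ ⊳)              ---------------------------  (¬∧ ⊳)
     Γ, A ∧ B ⊳ C                              Γ, ¬(A ∧ B) ⊳ C

    Γ, A ⊳ C     Γ, B ⊳ C                    Γ, ¬A, ¬B ⊳ C
    -----------------------  (∨ ⊳)       -------------------  (¬∨ ⊳)
         Γ, A ∨ B ⊳ C                      Γ, ¬(A ∨ B) ⊳ C

    Γ, ¬A ⊳ C     Γ, B ⊳ C                   Γ, A, ¬B ⊳ C
    ------------------------  (→ ⊳)      -------------------  (¬→ ⊳)
         Γ, A → B ⊳ C                      Γ, ¬(A → B) ⊳ C

    Γ, A, B ⊳ C     Γ, ¬A, ¬B ⊳ C          Γ, A, ¬B ⊳ C     Γ, ¬A, B ⊳ C
    ------------------------------  (↔ ⊳)  ------------------------------  (¬↔ ⊳)
           Γ, A ↔ B ⊳ C                          Γ, ¬(A ↔ B) ⊳ C

SUCCEDENT RULES

       Γ ⊳ B
    -----------  (⊳ ¬¬)
      Γ ⊳ ¬¬B

    Γ ⊳ A     Γ ⊳ B                          Γ, A ⊳ ¬B
    -----------------  (⊳ ∧)             -----------------  (⊳ ¬∧)
       Γ ⊳ A ∧ B                           Γ ⊳ ¬(A ∧ B)

      Γ, ¬A ⊳ B                          Γ ⊳ ¬A     Γ ⊳ ¬B
    -------------  (⊳ ∨)                 -------------------  (⊳ ¬∨)
      Γ ⊳ A ∨ B                            Γ ⊳ ¬(A ∨ B)

      Γ, A ⊳ B                           Γ ⊳ A     Γ ⊳ ¬B
    -------------  (⊳ →)                 ------------------  (⊳ ¬→)
      Γ ⊳ A → B                            Γ ⊳ ¬(A → B)

    Γ, A ⊳ B     Γ, B ⊳ A                Γ, A ⊳ ¬B     Γ, ¬A ⊳ B
    -----------------------  (⊳ ↔)       --------------------------  (⊳ ¬↔)
          Γ ⊳ A ↔ B                            Γ ⊳ ¬(A ↔ B)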
Soundness and Completeness of GS1: The soundness of GS1 follows easily by observing that every axiom is valid and, for every inference rule, if all its premises are logically valid, so is the conclusion. This is exactly the ⇐-direction of the corresponding law in 4.3.2.
The completeness of GS1 follows from the fact that every true implication can be established using the method of 4.3.3: Start with self-evident implications and apply the ⇐ directions of the laws. These applications correspond exactly to the steps of a proof in GS1. The rigorous inductive argument follows the same lines as the argument used to prove the completeness of HS1.

With each sequent we associate its weight, defined as follows: Sum all the numbers contributed by connective-occurrences, where every occurrence of ¬ contributes 1, every occurrence of ∧, ∨ and → contributes 2, and every occurrence of ↔ contributes 3. It is easily seen that, in all the inference rules, each of the premises has smaller weight than the conclusion. (The weight is some rough measure that reflects the fact that, in every inference rule, each of the premises is “simpler” than the conclusion. Any “simplicity measure” that has this property would do for our purposes.)

We now prove, by induction on the weight of Γ ⊳ A, that if Γ ⊳ A is valid then it is provable in GS1.
If all the sentences in the sequent are literals, then the sequent is valid iff either (i) the succedent is one of the antecedent sentences or (ii) the antecedent contains a sentence and its negation. (Otherwise we can define, as in 4.3.3, a truth-value assignment that makes all antecedent literals true and the succedent false.) In either case the sequent is an axiom.

If not all sentences in Γ ⊳ A are literals, then one of them is either of the form ¬¬C, or of the form C ∗ D, or of the form ¬(C ∗ D) (where ∗ is a binary connective). In each case our sequent can be inferred, from one or two premises, by applying the corresponding rule (an antecedent rule if the sentence is in the antecedent, a succedent rule if it is the succedent). We now invoke an important feature of our rules:

Reversibility: In each rule, if the conclusion is valid, so are all the premises.

This is the ⇒-direction of the laws of 4.3.2. Since Γ ⊳ A is assumed to be valid, the premises of the rule that yields it are valid as well. Since each premise has smaller weight, the induction hypothesis implies that it is provable in GS1. Therefore Γ ⊳ A is provable. QED
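The completeness argument is in effect a recipe for searching for proofs: pick any non-literal sentence of the sequent, replace the sequent by the premises of the corresponding rule, and repeat until only literals remain. The following Python sketch follows that recipe. It is an illustration, not a definitive implementation: the tuple encoding of sentences, the representation of a sequent as a pair (list of antecedent sentences, succedent), and all function names are assumptions made here. The reordering rule is left implicit, since the search may pick any antecedent sentence.

```python
def neg(sentence):
    return ("not", sentence)


def is_literal(s):
    return isinstance(s, str) or (s[0] == "not" and isinstance(s[1], str))


def antecedent_premises(s, rest, c):
    """Premises of the antecedent rule whose conclusion has antecedent rest + [s]."""
    if s[0] == "not":
        t = s[1]
        if t[0] == "not":
            return [(rest + [t[1]], c)]
        a, b = t[1], t[2]
        return {
            "and": [(rest + [neg(a)], c), (rest + [neg(b)], c)],
            "or": [(rest + [neg(a), neg(b)], c)],
            "imp": [(rest + [a, neg(b)], c)],
            "iff": [(rest + [a, neg(b)], c), (rest + [neg(a), b], c)],
        }[t[0]]
    a, b = s[1], s[2]
    return {
        "and": [(rest + [a, b], c)],
        "or": [(rest + [a], c), (rest + [b], c)],
        "imp": [(rest + [neg(a)], c), (rest + [b], c)],
        "iff": [(rest + [a, b], c), (rest + [neg(a), neg(b)], c)],
    }[s[0]]


def succedent_premises(gamma, s):
    """Premises of the succedent rule whose conclusion has succedent s."""
    if s[0] == "not":
        t = s[1]
        if t[0] == "not":
            return [(gamma, t[1])]
        a, b = t[1], t[2]
        return {
            "and": [(gamma + [a], neg(b))],
            "or": [(gamma, neg(a)), (gamma, neg(b))],
            "imp": [(gamma, a), (gamma, neg(b))],
            "iff": [(gamma + [a], neg(b)), (gamma + [neg(a)], b)],
        }[t[0]]
    a, b = s[1], s[2]
    return {
        "and": [(gamma, a), (gamma, b)],
        "or": [(gamma + [neg(a)], b)],
        "imp": [(gamma + [a], b)],
        "iff": [(gamma + [a], b), (gamma + [b], a)],
    }[s[0]]


def provable(gamma, c):
    """Backward proof search; gamma is a list of sentences, c the succedent."""
    for i, s in enumerate(gamma):
        if not is_literal(s):
            rest = gamma[:i] + gamma[i + 1:]
            return all(provable(g, d) for g, d in antecedent_premises(s, rest, c))
    if not is_literal(c):
        return all(provable(g, d) for g, d in succedent_premises(gamma, c))
    # Only literals remain: the sequent must be an instance of (GA1) or (GA2).
    return c in gamma or any(neg(s) in gamma for s in gamma)


print(provable([("imp", "p", "q"), ("imp", "q", "r")], ("imp", "p", "r")))
```

Committing to the first applicable rule is harmless because of reversibility: if the sequent is valid, so are the premises of whichever rule is chosen. The search terminates because every premise has smaller weight than its conclusion.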
The Gentzen-Type Deductive System GS2

GS2 formalizes the proof-by-contradiction method of chapter 4 (cf. 4.4) exactly in the way that GS1 formalizes the method of 4.3.3. In addition to the usual sequents, GS2 has sequents of the form:

    A1, . . . , An ⊳ ⊥

where ⊥ is a special symbol (signifying contradiction). ⊥ can appear only as the right-hand side of a sequent.
By stipulation, the truth-value of ⊥ is F, under any assignment of truth-values to the atomic sentences. The valid sequents are defined as before. This means that Γ ⊳ ⊥ is valid if there is no truth-value assignment that makes all the sentences in Γ true.

Instead of the two axiom schemes (GA1) and (GA2), GS2 has one axiom scheme:

    (GA3)   Γ, A, ¬A ⊳ ⊥

The reordering rule is now supposed to cover also sequents of the new form. GS2 has also the following inference rule:

     Γ, ¬A ⊳ ⊥
    -----------  Contradiction
       Γ ⊳ A

The other rules of GS2 are the antecedent rules of GS1, with the difference that the succedent is always ⊥. (That is, in the list above, replace ‘C’ by ‘⊥’.)

After applying the Contradiction Rule no other rule can be applied. Hence, the only way to prove Γ ⊳ A is to prove Γ, ¬A ⊳ ⊥ and then apply the Contradiction Rule.

The soundness and completeness of GS2 are proved in the same way as they are proved for GS1.
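Proof search in GS2 is even more uniform than in GS1, since only antecedent rules are involved. The following Python sketch is an illustration under assumptions of our own: sentences are nested tuples, an antecedent is a Python list, and the names expand, refutable and provable_gs2 are invented here. It decides whether a sequent with succedent ⊥ is provable and, through the Contradiction Rule, whether an ordinary sequent is.

```python
def neg(sentence):
    return ("not", sentence)


def is_literal(s):
    return isinstance(s, str) or (s[0] == "not" and isinstance(s[1], str))


def expand(s):
    """Alternatives that replace the non-literal sentence s in the antecedent.

    One alternative corresponds to a one-premise rule, two to a two-premise rule.
    """
    if s[0] == "not":
        t = s[1]
        if t[0] == "not":
            return [[t[1]]]
        a, b = t[1], t[2]
        return {
            "and": [[neg(a)], [neg(b)]],
            "or": [[neg(a), neg(b)]],
            "imp": [[a, neg(b)]],
            "iff": [[a, neg(b)], [neg(a), b]],
        }[t[0]]
    a, b = s[1], s[2]
    return {
        "and": [[a, b]],
        "or": [[a], [b]],
        "imp": [[neg(a)], [b]],
        "iff": [[a, b], [neg(a), neg(b)]],
    }[s[0]]


def refutable(gamma):
    """True if the sequent with antecedent gamma and succedent ⊥ is provable."""
    for i, s in enumerate(gamma):
        if not is_literal(s):
            rest = gamma[:i] + gamma[i + 1:]
            return all(refutable(rest + alternative) for alternative in expand(s))
    # Only literals remain: the sequent must be an instance of (GA3).
    return any(neg(s) in gamma for s in gamma)


def provable_gs2(gamma, a):
    """Contradiction Rule: prove the sequent by refuting gamma together with ¬a."""
    return refutable(gamma + [neg(a)])


print(provable_gs2([("or", "p", "q"), neg("p")], "q"))
```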
Chapter 7
Predicate Logic Without Quantifiers

7.0 Taking a further step in the analysis of sentences, we set up a language in which the atomic sentences are made of smaller units: individual constants and predicates (or relation symbols).
Individual Constants

Individual constants are basic units that function like singular names of natural language, that is, names that denote particular objects. For example: ‘The Moon’, ‘Ann’, ‘Everest’, ‘Chicago’, ‘Bill’, ‘The USA’.

An interpretation of the formal language associates with every individual constant a denoted object, referred to as the constant’s denotation. The object can be arbitrary: a person, a material body, a spatio-temporal region, an organization, the number 1, whatever.

In natural language a name can be ambiguous, e.g., Everest the mountain and Everest the officer. Usually, the intended denotation can be determined from context. A name may also lack denotation, e.g., ‘Pegasus’. (We have discussed this at some length in chapter 1.) But in predicate logic each individual constant has, in any given interpretation of the language, exactly one denotation. Different individual constants may have the same denotation, just as in natural language an object can have several names.

The denotations of the individual constants depend on the interpretation. The syntax leaves them undetermined. On the purely syntactic level the individual constants are mere symbols, which function as building blocks of sentences.
We shall assume that

    a, b, c, d, etc.

are individual constants. It is also convenient to use ‘a’, ‘b’, ‘c’, etc. as variables ranging over individual constants. Thus, we may say: ‘For every individual constant c’, or ‘There is an individual constant b’. Occasionally we shall help ourselves to names borrowed from English:

    Jill, Jack, Bill, 5, etc.
Predicates

Predicates, known also as relation symbols, combine with individual constants to make sentences. Grammatically, they fulfill a role analogous to English verb-phrases. But they express properties or relations. For example, in order to translate

    Ann is happy

into predicate logic, we need an individual constant, say a, denoting Ann, and a predicate, say H, that does the job of ‘... is happy’. The translated sentence is:

    H(a)

H expresses, under the intended interpretation, the property of being happy. As we shall see later, it is interpreted simply as the set of all happy people. Similarly, the translation of

    Ann likes Bill

is obtained by using individual constants, say a and b, for denoting Ann and Bill, and a predicate, say L, to play the role of ‘... likes ___’. The translation is then:

    L(a,b)
Here L is supposed to express the relation of liking. Actually it is interpreted as a set-theoretical relation (cf. 5.1.4 page 167): the set of all pairs (x, y) such that x likes y.

In the first case H is a one-place predicate. It comes with one empty place. When we fill the empty place with an individual constant we get a sentence. In the second case L is a two-place predicate. It comes with two empty places. We get a sentence when both places are filled with constants. The same individual constant can be used twice, e.g.,

    L(a,a)

which, under the interpretation just given, reads as: Ann likes Ann, or Ann likes herself.

The number of places that a predicate has is known as its arity. In our example H is unary (or monadic) and L is binary. The arity can be shown by indicating the empty places:

    H( )      L( , ).
A predicate can have any finite arity. For example, if a, b, and c are points on a line, we can translate

    a is between b and c

into:

    Bet(a,b,c)

Here Bet( , , ) is a ternary predicate (interpreted as the three-place betweenness relation) and the constants a, b, and c are interpreted as denoting the points a, b, and c. In general, an n-ary predicate is interpreted as some n-ary relation, where this is (as in set theory) a set of n-tuples.

For the moment it suffices to note that predicates are analogous to constructs such as ‘... is happy’, ‘... is red’, ‘... likes ___’, ‘... is greater than ___’, ‘... is between ___ and ∗∗∗’, etc. We assume that

    P, R, S, P′, R′, etc.

are predicates of the formal language. It is also convenient to use ‘P’, ‘R’, etc. as variables ranging over the predicates. Thus, we might say: ‘For some monadic predicate P’, or ‘Whenever P is a ternary predicate’, etc. We may also help ourselves to suggestive expressions such as:

    Happy( ), FatherOf( , ), Larger( , ), etc.
As far as the formal uninterpreted language is concerned, predicates are mere symbols, each with an associated positive integer (called its arity), which can be combined in a well-defined way with individual constants to form sentences. The interpretation of these symbols varies with the interpretation of the language. H( ) can be interpreted as the set of happy people in one interpretation, as the set of humans in another, as the set of all animals in a third, and as the empty set in a fourth. The same goes for individual constants.

Note: ‘Predicate’ has several meanings. It can refer to properties that can be affirmed of objects (‘rationality is a predicate of man’). It is also a verb denoting the attributing of a property to an individual (‘wisdom is predicated of Socrates’). Don’t confuse these meanings with the present technical sense!
7.1 PC0, The Formal Language and Its Semantics

7.1.0 By ‘PC0’ we shall refer to the portion of first-order logic that involves, besides the connectives, only individual constants and predicates. The language of PC0 is therefore based on:

(i) A set of individual constants.
(ii) A set of predicates, each having an associated positive integer called its number of places, or arity.
(iii) Sentential connectives, which (in our case) are: ¬, ∧, ∨, → and ↔.

Actually, we are describing here not a single language but a family of languages, for the languages may differ in their sets of individual constants and predicates. Given the individual constants and the predicates, the atomic sentences are the constructs obtained by applying the rule:

    If P is an n-place predicate and c1, . . . , cn are individual constants, then P(c1, . . . , cn) is an atomic sentence.
You can think of P(c1, . . . , cn) as the result of applying a certain operation to the predicate and the individual constants. (Just as A ∧ B is the result of applying a particular operation to A and B.) We want the atomic sentences to satisfy unique readability: the predicate and the sequence of individual constants should be readable from the sentence. This amounts to the requirement:

    If P(c1, . . . , cn) = P′(c′1, . . . , c′m), then P = P′, n = m, and ci = c′i for i = 1, . . . , n.

The exact nature of the atomic sentences does not matter, as long as unique readability for atomic sentences and for those constructed from them by applying connectives is satisfied.

Except for the difference in the atomic sentences, the set of all sentences is defined exactly as in the sentential calculus. It consists of all constructs that can be obtained from the following rules:

    Every atomic sentence is a sentence.
    If A and B are sentences, then ¬A, A ∧ B, A ∨ B, A → B, A ↔ B are sentences.
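The formation rule for atomic sentences, together with unique readability, can be mirrored in a small data structure. The following Python sketch is one possible realization, not the only one: the class name Atom, the helper atom, and the particular signature (the predicates H, L, Bet and the constants a, b, c) are assumptions made here for illustration.

```python
from typing import NamedTuple, Tuple


class Atom(NamedTuple):
    """An atomic sentence: a predicate together with a sequence of constants."""
    predicate: str
    constants: Tuple[str, ...]


# A hypothetical signature, chosen to match the examples in the text.
ARITY = {"H": 1, "L": 2, "Bet": 3}
CONSTANTS = {"a", "b", "c"}


def atom(predicate, *constants):
    """Build P(c1, ..., cn), insisting that n is the arity of P."""
    if predicate not in ARITY:
        raise ValueError(f"unknown predicate: {predicate}")
    if len(constants) != ARITY[predicate]:
        raise ValueError(
            f"{predicate} takes {ARITY[predicate]} constant(s), got {len(constants)}")
    for c in constants:
        if c not in CONSTANTS:
            raise ValueError(f"unknown individual constant: {c}")
    return Atom(predicate, tuple(constants))


likes = atom("L", "a", "b")
print(likes.predicate, likes.constants)
```

Because two Atom values are equal only when their predicates and their sequences of constants coincide, unique readability holds by construction in this representation. An attempt such as atom("H", "a", "b", "c") is rejected with an error, since H is monadic.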
Note: Sometimes an infix notation is adopted for certain binary predicates: Instead of R(a,b) we write a R b. This is purely a notational convention.

Example: If H( ) is a unary predicate playing the role of ‘... is happy’, and a, b and c are individual constants playing the roles of ‘Ann’, ‘Bert’ and ‘Claire’, then the sentences

    Ann, Bert and Claire are happy
    Ann or Bert or Claire is happy

are rendered in PC0, respectively, by:

    H(a) ∧ H(b) ∧ H(c)
    H(a) ∨ H(b) ∨ H(c)

(Assuming that the English ‘or’ is meant inclusively.) The second sentence is also a translation of ‘One of Ann, Claire and Bert is happy’, when ‘one of’ means at least one of. But if we want to formalize

    Exactly one of Ann, Bert and Claire is happy,

we have to use a sentence that says that at least one of these people is happy, but no two of them are:

    [H(a) ∨ H(b) ∨ H(c)] ∧ ¬[H(a)∧H(b) ∨ H(a)∧H(c) ∨ H(b)∧H(c)]
which is equivalent to:

    [H(a)∧¬H(b)∧¬H(c)] ∨ [H(b)∧¬H(a)∧¬H(c)] ∨ [H(c)∧¬H(a)∧¬H(b)]

WATCH OUT: You cannot use constructions that parallel English groupings of words, such as:

    H(a,b,c),   or   H(a ∧ b ∧ c),   or   H(a ∨ b ∨ c).

Such expressions do not represent anything in the formal language. They are, as far as PC0 is concerned, gibberish. H is a monadic predicate; it cannot be combined with three individual constants. Conjunction and disjunction combine only sentences, not individual constants. Similarly, if R( ) is a predicate that formalizes ‘... is relaxed’, then you cannot render ‘Ann is happy and relaxed’ by writing:

    (H ∧ R)(a)

This, again, is gibberish. We do not have in PC0 a conjunction that combines predicates. Conjunction is by definition an operation on sentences. Therefore ‘Ann is happy and relaxed’ is rendered as:

    H(a) ∧ R(a)
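The equivalence of the two renderings of ‘Exactly one of Ann, Bert and Claire is happy’ given above can be confirmed by a truth-table over the three atomic sentences H(a), H(b), H(c). The following Python sketch does so; the function names and the use of Python booleans for truth-values are assumptions made here for illustration.

```python
from itertools import product


def exactly_one_v1(ha, hb, hc):
    """At least one is happy, but no two of them are."""
    return (ha or hb or hc) and not ((ha and hb) or (ha and hc) or (hb and hc))


def exactly_one_v2(ha, hb, hc):
    """Disjunction of the three cases in which precisely one is happy."""
    return ((ha and not hb and not hc)
            or (hb and not ha and not hc)
            or (hc and not ha and not hb))


rows = list(product([True, False], repeat=3))
equivalent = all(exactly_one_v1(*r) == exactly_one_v2(*r) for r in rows)
true_rows = [r for r in rows if exactly_one_v1(*r)]

print(equivalent, len(true_rows))
```

The two sentences agree on all eight rows, and they are true on exactly the three rows in which a single atomic sentence gets T.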
7.1.1 The Semantics of PC0

An interpretation of a PC0-type language consists of:

(I) An assignment that assigns to each individual constant an object.
(II) An assignment that assigns to each n-ary predicate an n-ary relation, i.e., a set of n-tuples. (If n = 1 the interpretation assigns to the predicate some set.)

If the object c is assigned to the constant c and the relation P is assigned to the predicate P, then the object and the relation are described as the interpretations of c and P. We also say that the constant c denotes, or refers to, the object c, and that the object is the denotation, or the reference, of the constant. This terminology is also used, though less commonly, with respect to predicates.

An interpretation determines the truth-value of each atomic sentence as follows. If an n-ary predicate P is interpreted as the relation P and each individual constant ci is interpreted as the object ci, for i = 1, . . . , n, then:

    P(c1, . . . , cn) has the value T   if   the n-tuple of denotations (c1, . . . , cn) ∈ P
    P(c1, . . . , cn) has the value F   if   the n-tuple of denotations (c1, . . . , cn) ∉ P

(For n = 1, we simply take the member itself: P(c1) gets T if the denotation of c1 is in P, and it gets F if it is not.)
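The way an interpretation fixes the truth-values of sentences can be traced in a few lines of code. In the following Python sketch an interpretation is a pair of dictionaries, one for the individual constants and one for the predicates. Everything specific in it is an assumption made for illustration: the tuple encoding of sentences, the sample objects, and the facts about who is happy and who likes whom. Predicates are assumed not to be named like the connective tags.

```python
CONNECTIVES = {"not", "and", "or", "imp", "iff"}

# A sample interpretation (names and facts are invented for illustration).
denotation = {"a": "Ann", "b": "Bill"}
relation = {
    "H": {"Ann"},                                # unary: a set of objects
    "L": {("Ann", "Bill"), ("Bill", "Bill")},    # binary: a set of pairs
}


def value(sentence):
    """Truth-value of a sentence, given as a nested tuple, in the interpretation."""
    head, *args = sentence
    if head not in CONNECTIVES:                  # atomic sentence P(c1, ..., cn)
        objects = tuple(denotation[c] for c in args)
        # For n = 1 we take the member itself rather than a 1-tuple.
        member = objects[0] if len(objects) == 1 else objects
        return member in relation[head]
    vals = [value(arg) for arg in args]
    if head == "not":
        return not vals[0]
    if head == "and":
        return vals[0] and vals[1]
    if head == "or":
        return vals[0] or vals[1]
    if head == "imp":
        return (not vals[0]) or vals[1]
    return vals[0] == vals[1]                    # "iff"


print(value(("H", "a")), value(("L", "a", "a")))
```

Changing the two dictionaries changes the truth-values, while the sentences themselves stay the same.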
The assignment of truth-values to the atomic sentences determines the truth-values of all other sentences exactly in the same way as in the sentential calculus. By defining the possible interpretations of the language, we have also determined the concepts of logical truth and falsity: A sentence of PC0 is logically true if it gets the value T in all interpretations. It is logically false if it gets the value F under all interpretations. We have also determined the relation of logical implication between a premise-list and a sentence: Γ logically implies A just when there is no interpretation under which all members of Γ are true and A is false.

Note: The implications we are considering now are no longer schemes of the kind we handled in 4.3. Our sentences have been specified to be particular entities, e.g., P(a,b) or P(a, b) ∨ R(b, b). The implication is logical if, for every interpretation, if all premises are true so is the conclusion. The interpretations vary, but the sentences remain the same. This is true of all systems we shall consider henceforth.

So far, no restriction has been placed on the interpretation of the predicates and the individual constants. Consequently, any assignment of truth-values to the atomic sentences can be realized by interpreting the predicates and the individual constants in a suitable way. Given a truth-value assignment, σ, we can define an interpretation as follows:

    Interpret different individual constants as denoting different objects.
    Interpret any n-ary predicate P as the n-ary relation such that an n-tuple of objects belongs to it iff there are individual constants c1, . . . , cn, denoting the respective members of the tuple, such that σ assigns to P(c1, . . . , cn) the value T.

Under this interpretation, the atoms get the truth-values that are assigned to them by σ.

A tautology of PC0 is defined as a sentence whose truth-value, obtained via truth-tables, is T for all assignments of truth-values to its atomic components.
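The construction of an interpretation from a truth-value assignment σ can be spelled out as a short program. The following Python sketch is an illustration under assumptions of our own: atoms are tuples (predicate, c1, ..., cn), σ is a dictionary from atoms to booleans, the denoted objects are invented strings, and all relations (including unary ones) are kept as sets of tuples for uniformity.

```python
def interpretation_from_assignment(sigma):
    """Build denotations and relations that realize the assignment sigma."""
    denotation = {}
    relation = {}
    for (predicate, *constants), truth in sigma.items():
        for c in constants:
            # Different individual constants denote different objects.
            denotation.setdefault(c, f"object_{c}")
        relation.setdefault(predicate, set())
        if truth:
            relation[predicate].add(tuple(denotation[c] for c in constants))
    return denotation, relation


# A sample assignment, chosen only for illustration.
sigma = {
    ("P", "a", "b"): True,
    ("P", "b", "a"): False,
    ("H", "a"): False,
    ("H", "b"): True,
}
denotation, relation = interpretation_from_assignment(sigma)

# Read the truth-values of the atoms back off the interpretation.
recovered = {
    atom: tuple(denotation[c] for c in atom[1:]) in relation[atom[0]]
    for atom in sigma
}
print(recovered == sigma)
```

The read-back agrees with σ because distinct constants were given distinct denotations, so distinct atoms with the same predicate correspond to distinct tuples.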
Since every assignment of truth-values to the atomic sentences is obtainable in some interpretation, the logical truths of PC0 coincide with its tautologies. Similarly, logical implication and tautological implication are, in the case of PC0, the same. The story will change when we add the equality-predicate, ≈, to the language, because the interpretation of this predicate is severely restricted. For example, the atomic sentence a ≈ a always gets the value T. It is a logical truth but not a tautology.

The sentential apparatus that has been developed for sentential logic applies exactly in the same way to PC0. We can therefore use freely the techniques of distributing, pushing negations, De Morgan’s laws, substitutions of equivalents, the top-down proof methods of 4.3.3 and 4.4.1, DNF, CNF, and all the rest.
We can also set up a deductive system, like one of those described in 6.2, except that the atoms are not the Ai’s of SC, but the atomic sentences of our PC0 language.

Homework 7.1 Translate the following sentences, using predicates, individual constants and connectives. The translation is to be based on the assumption that the sentences are interpreted in a universe consisting of three men and three women:

    Jack, David, Harry, Claire, Ann, Edith

where gender is according to name, and different names denote different persons. The language has a name for each person. You can use the English names, or their (lower case) first letters: j, d, h, c, a, e. For ‘... is happy’, you can use H(...).

When you find a sentence ambiguous, give its possible translations and prove their non-equivalence by showing the existence of interpretations under which they get different truth-values.

1. Everyone is happy.
2. Every man is happy and every woman is happy.
3. Every man is happy or every woman is happy.
4. Someone is happy.
5. Some man is happy and some woman is happy.
6. Some man is happy or some woman is happy.
7. Some woman is happy and some is not.
8. Some women are happy, while some men are not.
9. If Jack is not happy, none of the women is.
10. If Harry is happy, then one of the women is and one is not.
11. No man is happy, unless another person is.
12. All women are not happy, but all men are.
13. Women are happy, men are not.
14. Not everyone who is happy is a woman.
15. If men are happy so are women.

7.2 Find all the logical implications that hold between the translations of the first six sentences in 7.1. (Note that there are only four sentences to consider, because of two obvious equivalences.) Prove every equivalence (by any of the methods of chapter 4, or by truth-value considerations). Prove every non-equivalence, by showing the existence of an interpretation that makes one sentence true and the other false.

7.3 Translate the following sentences, under the same assumptions and using the same notations as in 7.1. Use L(..., ___) for ‘... likes ___’. Follow the same instructions in cases that you find ambiguous.

1. Every woman is liked by some man.
2. Some man likes every woman.
3. Some woman likes herself, and some man does not.
4. Nobody is happy who does not like himself.
5. Some women, who do not like themselves, like David.
6. Ann does not like a man, unless he likes her.
7. Claire likes a man who likes Edith.
8. Claire and Edith like some man.
9. Unless liked by a woman, no man is happy.
10. Most men like Edith.

7.4 Which, if any, of the translated first two sentences of 7.3 implies logically the other? Prove your implication, as well as non-implication, claims.
Substitutions of Individual Constants

We can substitute in any atomic sentence an individual constant by another constant; this gives another sentence. When the individual constant has more than one occurrence we can substitute any particular occurrence. We can also carry out simultaneous substitutions, i.e., several substitutions at one go. A few examples suffice to make this clear.

Let a, b, c be different individual constants, and let P(a, b, a) be an atomic sentence.

    Substituting the first occurrence of a by b we get: P(b, b, a).
    Substituting (all occurrences of) a by c we get: P(c, b, c).
    Substituting the first occurrence of a by c, its second occurrence by b, and b by a, we get: P(c, a, b).

Substitution in non-atomic sentences is effected by substitution in the atomic components. For example, if the given sentence is

    P(a, c, b) → ¬R(b, c)

then:

    Substituting all occurrences of a by c and all occurrences of c by a we get: P(c, a, b) → ¬R(b, a).
    Substituting the first occurrence of b by c and the second by a we get: P(a, c, c) → ¬R(a, c).
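The substitution examples above can be reproduced with two small functions. The following Python sketch is an illustration only: the tuple encoding of sentences and the function names substitute and substitute_occurrences are assumptions made here. The first function carries out a simultaneous substitution of all occurrences; the second replaces particular occurrences in an atomic sentence, identified by their argument positions (counted from 0).

```python
CONNECTIVES = {"not", "and", "or", "imp", "iff"}


def substitute(sentence, mapping):
    """Simultaneously replace every occurrence of each constant by its image."""
    head, *args = sentence
    if head in CONNECTIVES:
        return (head, *(substitute(arg, mapping) for arg in args))
    return (head, *(mapping.get(c, c) for c in args))


def substitute_occurrences(atom, replacements):
    """Replace the constants at the given argument positions of an atomic sentence."""
    head, *args = atom
    for position, new_constant in replacements.items():
        args[position] = new_constant
    return (head, *args)


atom = ("P", "a", "b", "a")
print(substitute_occurrences(atom, {0: "b"}))
print(substitute(atom, {"a": "c"}))
print(substitute_occurrences(atom, {0: "c", 2: "b", 1: "a"}))

conditional = ("imp", ("P", "a", "c", "b"), ("not", ("R", "b", "c")))
print(substitute(conditional, {"a": "c", "c": "a"}))
```

Because the mapping is applied to the original constants in one pass, exchanging a and c works as intended. Applying the two replacements one after the other would not.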
7.2 PC0 with Equality

7.2.0 The equality predicate, or equality for short, is a binary predicate used to express statements of equality. In English such statements are expressed by:

    ‘... is equal to ___’,   ‘... is identical to ___’,   ‘... is the same as ___’

Since we use the symbol ‘=’ in our own discourse, we shall adopt a different symbol, ≈, as the equality predicate of PC0. Thus,

    c ≈ c′

is an atomic sentence of PC0, which says that the denotations of c and c′ are identical. On the other hand, ‘c = c′’ is a sentence in our discourse which says that c and c′ are the same individual constant.

We refer to atomic sentences of the form a ≈ b as equalities. (The infix notation is a mere convention; we could have used ‘≈(a, b)’.)
We use ‘a ≉ b’ as a shorthand for ‘¬(a ≈ b)’. Sentences of this form are referred to as inequalities.

We stipulate as part of the semantics of PC0 that if the language contains equality, then:

    a ≈ b gets T   iff   denotation of a = denotation of b.

As we shall see, an interpretation of a first-order language involves the fixing of a certain set of objects as the universe of discourse. All individual constants denote objects in that universe, and all predicates are interpreted as subsets of the universe, or as relations over it. Once the universe is chosen, the interpretation of ≈ is, by definition, the identity relation over it: the set of all pairs (a, a), where a belongs to the universe. For this reason ≈ is considered a logical predicate. All other predicates are non-logical.

The following sentences are logical truths:

    (1)  |= a ≈ a
    (2)  |= a ≈ b → b ≈ a
    (3)  |= (a ≈ b ∧ b ≈ c) → a ≈ c

But they are not tautologies. Their truth is not established on the basis of their sentential structure, but because ≈ is interpreted as the identity relation. The following principle obviously holds.

(EP) For all individual constants c, c′: If A′ is obtained from A by simultaneously substituting in some places c′ for c and/or c for c′, then

    c ≈ c′ → (A ↔ A′)

is a logical truth.

Here A can be any sentence; it can be itself an equality. Let A = a ≈ b, and let A′ be obtained by substituting simultaneously a for b and b for a. Then A′ = b ≈ a. By (EP), the following is a logical truth:

    a ≈ b → [a ≈ b ↔ b ≈ a]

This is easily seen to imply (2). In a similar way, we can derive (3) from (EP). (Can you see how?)
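The claims (1)-(3), and instances of (EP), can be tested by running through interpretations over a small universe. The following Python sketch does this for sentences involving the constants a, b, c and one monadic predicate H. It is an illustration under stated assumptions: sentences are represented as Python functions of a denotation map d and a set h interpreting H, the universe is taken to have three members, and the names interpretations and logically_true are invented here. The sketch assumes that three objects suffice for three constants, since the truth-value of such a sentence depends only on the denotations of its constants and on which of those denotations belong to h.

```python
from itertools import combinations, product


def imp(x, y):
    return (not x) or y


def interpretations(constants, universe_size):
    """All denotation maps for the constants, paired with all subsets for H."""
    universe = list(range(universe_size))
    subsets = [set(c) for r in range(universe_size + 1)
               for c in combinations(universe, r)]
    for objects in product(universe, repeat=len(constants)):
        for h in subsets:
            yield dict(zip(constants, objects)), h


def logically_true(sentence, constants=("a", "b", "c"), universe_size=3):
    return all(sentence(d, h) for d, h in interpretations(constants, universe_size))


reflexivity = lambda d, h: d["a"] == d["a"]                                # (1)
symmetry = lambda d, h: imp(d["a"] == d["b"], d["b"] == d["a"])            # (2)
transitivity = lambda d, h: imp(d["a"] == d["b"] and d["b"] == d["c"],
                                d["a"] == d["c"])                          # (3)
ep_instance = lambda d, h: imp(d["a"] == d["b"],
                               (d["a"] in h) == (d["b"] in h))             # a ≈ b → (H(a) ↔ H(b))
plain_equality = lambda d, h: d["a"] == d["b"]                             # a ≈ b

print([logically_true(s) for s in
       (reflexivity, symmetry, transitivity, ep_instance, plain_equality)])
```

The first four sentences hold in every interpretation examined, whereas a ≈ b fails as soon as a and b are given different denotations.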
(EP) is implied by the stipulation that ≈ is interpreted as the identity relation and by the following general principle that underlies the semantics of PC0:

Extensionality Principle: The truth-value of a sentence, in a given interpretation, does not change if we substitute a term by another term that has the same denotation.

By ‘term’ we mean here an individual constant; but the principle applies if ‘term’ covers predicates, whose denotations are, as we said, sets or relations (sets of tuples).

In many contexts of natural language the extensionality principle does not hold. The following standard example is due to Frege. ‘Hesperus’ and ‘Phosphorus’ are two names that stand, respectively, for ‘the evening star’ and ‘the morning star’. Both, it turns out, denote the planet Venus. The sentences

    (4) Jill believes that Hesperus is identical to Phosphorus.
    (5) Jill believes that Phosphorus is identical to Phosphorus.

need not have the same truth-value, although (5) results from (4) by substituting a name by a co-denoting name (i.e., a name with the same denotation). Jill does not doubt that (5) is true, but she may be unaware that Hesperus and Phosphorus are the same planet.

(4) and (5) are about Jill’s beliefs. We can formalize statements of this type if we introduce something like a monadic connective, say Bel, which operates on sentences. For every sentence A, there is a sentence Bel(A), which says that Jill (or some anonymous agent) believes that A. Syntactically Bel acts like negation. But it is not truth-functional, hence it is not a sentential connective of classical logic. Individual constants that occur in the scope of Bel cannot, in general, be substituted by co-denoting constants without change of the truth-value. The same is true of a wide class of sentences involving ‘that’-phrases (‘thinks that ...’, ‘knows that ...’ and others), as well as expressions of necessity or possibility (‘it is necessary that ...’, ‘it is possible that ...’).
In formal languages, such non-classical connectives are known as modal. In this course we shall not encounter them. Contexts, and languages, in which the extensionality principle holds are called extensional. Non-extensional contexts, such as (4), are known as intensional. Classical logic is extensional throughout.
The truth-table method for detecting tautologies does not detect the new kind of logical truth, which derives from the meaning of ≈. It can be modified so as to take care of this special predicate. Instead of considering all possible assignments of truth-values to the atomic sentences, we have to rule out certain assignments as unacceptable. This amounts to striking out certain rows in the truth-table; for example: a row that assigns c ≈ c the value F, or a row that assigns to a ≈ b and to b ≈ a different truth values is unacceptable. We should also strike out any row that assigns the value T to a ≈ b and to a ≈ c but assigns different values
7.2. PC0 WITH EQUALITY
to P(a, b) and P(c, c). It is possible to state general, necessary and sufficient conditions for the acceptability of a row; but we shall not do it here. Once the acceptable rows have been determined, we can check for logical truth: A sentence of PC0 is logically true iff it has the value T in all acceptable rows of its truth-table. Note the stronger condition for tautologies: a sentence is a tautology iff it gets the value T in all rows, including the non-acceptable ones.
Instead of modifying the truth-table method, we shall adjust the top-down derivation methods of 4.3.3 and 4.4.1, by adding certain laws that take care of ≈. This leads to a much simpler, easier to apply prescription for checking logical implications (and, in particular, logical truth) of PC0 sentences.
We remarked that the top-down derivation method applies to PC0 without change. The only difference is that instead of the atoms Ai of the sentential calculus (cf. 6.1), or instead of the sentential variables that are used in 4.2.1, we have the atomic sentences of PC0. (We can also continue to use sentential variables, regarding them as some unspecified sentences.) If ≈ is not present, we get at the end either a proof of the initial goal, or a truth-value assignment to the atoms that makes the premises true and the conclusion false. In the latter case we can (as shown in 7.1.1) find an interpretation of the individual constants and the predicates that yields this assignment; hence we get our counterexample. The same procedure is adequate for sentences containing ≈, provided that certain laws are added to our previous lists.
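The difference between "all rows" and "all acceptable rows" can be illustrated by a small computation. The following is only a sketch under simplifying assumptions: it handles a sentence whose atoms are all equalities (here a ≈ b → b ≈ a), and it finds the acceptable rows by brute force, as the rows induced by some assignment of objects to the constants. The names `all_rows`, `acceptable_rows` and `sentence` are ours and are not part of the text's apparatus.

```python
from itertools import product

CONSTANTS = ["a", "b"]
# The equality atoms of the sentence a≈b → b≈a, each written as a pair of constants.
ATOMS = [("a", "b"), ("b", "a")]

def sentence(row):
    """Truth-value of a≈b → b≈a in a row (a dict from atoms to truth-values)."""
    return (not row[("a", "b")]) or row[("b", "a")]

def all_rows():
    """Every truth-value assignment to the atoms, acceptable or not."""
    for values in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def acceptable_rows():
    """Rows induced by some interpretation of the constants, with ≈ read as identity."""
    seen = []
    for objs in product(range(len(CONSTANTS)), repeat=len(CONSTANTS)):
        denote = dict(zip(CONSTANTS, objs))
        row = {(c, d): denote[c] == denote[d] for (c, d) in ATOMS}
        if row not in seen:
            seen.append(row)
    return seen

is_tautology = all(sentence(r) for r in all_rows())
is_logically_true = all(sentence(r) for r in acceptable_rows())
print(is_tautology, is_logically_true)
```

The row that gives a ≈ b the value T and b ≈ a the value F falsifies the sentence, so it is not a tautology; that row is struck out as unacceptable, and the sentence holds in the remaining ones.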
7.2.1
Top-Down Fool-Proof Methods For PC0 with Equality
We shall concentrate mostly on the proof-by-contradiction variant (cf. 4.4.0, 4.4.1) which is based on fewer laws and which necessitates fewer additions. Only two additional laws are required. The first law adds another type of self-evident implications. They, as well, can now serve as axioms in a bottom-up proof, or as successful final goals (marked by ‘√’) in a top-down derivation. The second is a reduction law that can be used in reducing goals. In the following ‘c’ and ‘c′’ are variables ranging over individual constants. ‘ES’ stands for ‘Equality Substitution’.
Equality Laws For Proofs-by-Contradiction
(EQ) Γ, ¬(c ≈ c) |= ⊥
(ES) Γ, c ≈ c′ |= ⊥ ⇐⇒ Γ′, c ≈ c′ |= ⊥
where c and c′ are different constants, c occurs in Γ, and Γ′ results from Γ by substituting everywhere c′ for c.
(For simplicity c ≈ c′ has been written as the rightmost premise. This is immaterial since we can reorder the premises.) Recall that Γ |= ⊥ means that there is no interpretation that makes all sentences of Γ true; a counterexample is an interpretation that does this. The two sides of (ES) are counterexample-equivalent: an interpretation is a counterexample to the implication of one of the sides iff it is a counterexample to the other. This is implied by (EP): If c ≈ c′ gets T, then all the sentences of Γ get the same truth-values as their counterparts in Γ′.
To apply (ES) in a top-down derivation, choose some equality c ≈ c′ among the premises and replace every occurrence of c by c′ in every other sentence. After the application c does not occur in any of the sentences except in c ≈ c′. We shall refer to such an application as a c-reduction.
Note: The ⇐-direction of (ES) is the one applied in bottom-up proofs. This direction allows us, when c ≈ c′ is a premise and c does not occur in any other sentence, to replace in the other sentences some (one or more) occurrences of c′ by c. We could have formulated (ES) in a more general form, which allows us to substitute some, but not necessarily all, of the occurrences of c. But the top-down process is simpler and more efficacious when the law is applied in its present form.
Note: The restriction that c and c′ be different and that c occur in Γ rules out cases where the substitution would leave Γ unchanged.
Consider a c-reduction, where the equality is c ≈ c′. After the reduction c appears only in the equality c ≈ c′. Call an individual constant that appears in the premises dangling if it has exactly one occurrence, and that occurrence is the left-hand side of an equality. Then every c-reduction reduces the number of non-dangling constants by one.
Here are three very simple examples of top-down derivations:
1. a ≈ b |= b ≈ a
2. a ≈ b, ¬(b ≈ a) |= ⊥
3. a ≈ b, ¬(b ≈ b) |= ⊥ √
The first step is the usual move (via the Contradictory-Conclusion Law) in a proof by contradiction. The passage from 2. to 3. is via (ES), with a ≈ b in the role of c ≈ c′. 3. is a self-evident implication that falls under (EQ). The bottom-up proof is obtained by reversing the sequence. The passage from 3. to 2. is via the ⇐-direction of (ES).
1. a ≈ b, b ≈ c |= a ≈ c
2. a ≈ b, b ≈ c, ¬(a ≈ c) |= ⊥
3. a ≈ c, b ≈ c, ¬(a ≈ c) |= ⊥ √
Here the step from 2. to 3. is achieved by applying (ES), with the second equality (i.e., b ≈ c) in the role of c ≈ c′; the application results in substituting b in the first equality by c.
1. a ≈ b, c ≈ b, P(a, c, a) |= P(c, b, a)
2. a ≈ b, c ≈ b, P(a, c, a), ¬P(c, b, a) |= ⊥
3. a ≈ b, c ≈ b, P(b, c, b), ¬P(c, b, b) |= ⊥
4. a ≈ b, c ≈ b, P(b, b, b), ¬P(b, b, b) |= ⊥ √
Here 3. is obtained from 2. by using (ES), the equality in question being a ≈ b. Then 4. is obtained from 3. by another application of (ES), this time with the equality c ≈ b.
The Adequacy of the Method
The top-down method for PC0 with equality is based on our previous reduction steps and on reductions via (ES). Given an initial goal:
Γ |= A or Γ |= ⊥
we apply repeated reductions. This must end with a bunch of goals that cannot be further reduced. The argument follows the same lines as in the case of sentential logic. Applications of the connective laws yield simpler goals. (In 6.2.4 and 6.2.5 we have seen how to associate weights with goals, so that the applications are weight reducing.) Applications of (ES) consist in substitutions that preserve all the sentential structure. The goal is simplified in that the number of non-dangling constants is reduced. Following these considerations, it is not difficult to see that the process must terminate. (We can define a new weight by adding to the weight defined in 6.2.5 the number of all non-dangling constants. The new weight goes down with each reduction step. We therefore have an inductive argument exactly like those of 6.2.4 and 6.2.5.)
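The effect of c-reductions on the number of dangling constants can be traced on a small example. The following is a minimal sketch, restricted to premises that are literals; the representation of a literal as a triple and the names `lit`, `c_reduce`, `dangling` and `self_evident` are assumptions made for this illustration only. The premises are those of the goal a ≈ b, c ≈ b, P(a, c, a), ¬P(c, b, a) |= ⊥.

```python
EQ = "≈"  # label chosen here for the equality predicate

def lit(pred, *args, positive=True):
    """A literal as a triple (positive?, predicate, argument constants)."""
    return (positive, pred, tuple(args))

def c_reduce(premises, index):
    """Apply (ES) using the equality c ≈ c' at premises[index]:
    replace c by c' in every other premise."""
    positive, pred, (c, c_prime) = premises[index]
    assert positive and pred == EQ and c != c_prime
    reduced = []
    for i, (pos, p, args) in enumerate(premises):
        if i != index:
            args = tuple(c_prime if t == c else t for t in args)
        reduced.append((pos, p, args))
    return reduced

def dangling(premises):
    """Constants with exactly one occurrence, as the left-hand side of an equality."""
    counts = {}
    for _, _, args in premises:
        for t in args:
            counts[t] = counts.get(t, 0) + 1
    return {args[0] for pos, p, args in premises
            if pos and p == EQ and counts[args[0]] == 1}

def self_evident(premises):
    """True if some literal occurs with its negation, or some premise is ¬(c ≈ c)."""
    present = set(premises)
    for pos, p, args in premises:
        if (not pos, p, args) in present:
            return True
        if not pos and p == EQ and args[0] == args[1]:
            return True
    return False

goal2 = [lit(EQ, "a", "b"), lit(EQ, "c", "b"),
         lit("P", "a", "c", "a"), lit("P", "c", "b", "a", positive=False)]
goal3 = c_reduce(goal2, 0)   # a-reduction via a ≈ b
goal4 = c_reduce(goal3, 1)   # c-reduction via c ≈ b
print(len(dangling(goal2)), len(dangling(goal3)), len(dangling(goal4)))
print(self_evident(goal4))
```

Each reduction turns one more constant into a dangling one, so the number of non-dangling constants drops from three to one, at which point the goal is self-evident.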
The end goals of the process (those in the leaves) are elementary implications, i.e., they contain only literals: atoms or negated atoms. And they cannot be further reduced via (ES). Consider an implication of this type, where all equalities c ≈ c′ in which the two sides are different are written first:
c1 ≈ c1′, c2 ≈ c2′, . . . , cn ≈ cn′, Γ |= ⊥
Γ consists of all premises that are not equalities, or that are trivial equalities: a ≈ a. None of the ci’s occurs in any other place; otherwise we could have applied a ci-reduction. They are exactly all the dangling constants.
Assume that the implication is not self-evident, i.e., that the premises contain neither a sentence and its negation, nor an inequality of the form ¬(c ≈ c). Then the following interpretation makes all premises true. Hence it is a counterexample.
(I) Let a1, . . . , am be all the different non-dangling individual constants. Interpret them as names of different objects, a1, . . . , am.
(II) Interpret the predicates in a way that makes every atom occurring positively (i.e. unnegated) in Γ true and every atom occurring negatively false. (This can be done because different constants occurring in Γ have been assigned different denotations.)
(III) Interpret each ci as denoting the same object (among the aj’s) that is denoted by ci′, i = 1, . . . , n. (This can be done because each ci occurs only in the equality ci ≈ ci′.)
If all the elementary implications are self-evident, the top-down derivation tree shows that the initial goal is a logical implication. By inverting the tree we get a bottom-up proof. If, on the other hand, one of the elementary implications is not self-evident, then, as just shown, we can construct a counterexample. This is also a counterexample to the initial goal. QED
Here is a simple example. The initial goal is:
L(a, b), L(a, c)∧L(b, a) → H(a), b ≈ c |= H(a)
An attempt to construct a top-down derivation results in:
1. L(a, b), L(a, c)∧L(b, a) → H(a), b ≈ c |= H(a)
2. b ≈ c, L(a, b), L(a, c)∧L(b, a) → H(a), ¬H(a) |= ⊥
3. b ≈ c, L(a, c), L(a, c)∧L(c, a) → H(a), ¬H(a) |= ⊥
4.1 b ≈ c, L(a, c), ¬[L(a, c)∧L(c, a)], ¬H(a) |= ⊥
4.2 b ≈ c, L(a, c), H(a), ¬H(a) |= ⊥ √
5.11 b ≈ c, L(a, c), ¬L(a, c), ¬H(a) |= ⊥ √
5.12 b ≈ c, L(a, c), ¬L(c, a), ¬H(a) |= ⊥ ×
In the first step we have also rearranged the premises by moving b ≈ c to the beginning. The step from 2. to 3. is a b-reduction. The other steps are of the old kind. 5.12 is elementary but not self-evident. It has a counterexample: Let ‘a’ denote an object a, and let ‘c’ denote an object c, where a ≠ c. Interpret L as any relation L, such that (a, c) ∈ L and (c, a) ∉ L. Interpret H as any set, H, such that a ∉ H. Interpret b as denoting c.
In general, in the presence of ≈, it is advisable to apply c-reductions as soon as possible. This will reduce the number of constants in the other premises.
Note: Obviously, Γ, c ≈ c |= C ⇔ Γ |= C. Hence trivial equalities can be dropped. Yet, we do not have to include this among our laws. The proof above shows that the method is adequate without the law for dropping trivial equalities. Sometimes dropping c ≈ c results in the disappearance of c from the premises. In this case any counterexample to the reduced goal becomes a counterexample to the original goal if we assign to c any arbitrary denotation.
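The construction (I)-(III) of a counterexample from a fully reduced elementary implication that is not self-evident can be sketched as follows. This is an illustration under stated assumptions, not a general implementation: literals are represented as triples (positive?, predicate, arguments), the objects are hypothetical strings of the form "obj_a", and the function assumes, without checking, that its input cannot be reduced further via (ES) and is not self-evident. The input used is the leaf 5.12 of the derivation above.

```python
EQ = "≈"  # label chosen here for the equality predicate

def build_counterexample(premises):
    """premises: literals (positive?, predicate, args), assumed fully c-reduced
    and not self-evident. Returns (denotations, predicate extensions)."""
    counts = {}
    for _, _, args in premises:
        for t in args:
            counts[t] = counts.get(t, 0) + 1
    # Dangling constants, each paired with the right-hand side of its equality.
    link = {args[0]: args[1] for pos, p, args in premises
            if pos and p == EQ and args[0] != args[1] and counts[args[0]] == 1}
    # (I) Distinct objects for the non-dangling constants.
    denote = {t: "obj_" + t for t in counts if t not in link}
    # (III) Each dangling constant denotes whatever its partner denotes.
    for c, c_prime in link.items():
        denote[c] = denote[c_prime]
    # (II) Extensions containing exactly the tuples required by the positive atoms.
    extension = {}
    for pos, p, args in premises:
        if p != EQ:
            extension.setdefault(p, set())
            if pos:
                extension[p].add(tuple(denote[t] for t in args))
    return denote, extension

def holds(literal, denote, extension):
    """Truth-value of a literal under the given interpretation."""
    pos, p, args = literal
    objs = tuple(denote[t] for t in args)
    value = (objs[0] == objs[1]) if p == EQ else (objs in extension[p])
    return value == pos

leaf = [(True, EQ, ("b", "c")), (True, "L", ("a", "c")),
        (False, "L", ("c", "a")), (False, "H", ("a",))]
denote, extension = build_counterexample(leaf)
print(all(holds(l, denote, extension) for l in leaf))
```

The interpretation obtained agrees with the one described in the text: a and c get different objects, b gets the object assigned to c, L contains only the pair for (a, c), and H is empty.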
The Top-Down Method of 4.3.3 for PC0 with Equality
The adjustment of the method of 4.3.3 is obtained along similar lines. Here is a brief sketch. Recall that the method does not employ ‘⊥’. It treats implications of the form
Γ |= C
To handle ≈, we split each of our old equality laws into two laws, one for the premises and one for the conclusion. Altogether we have four laws that treat equalities: the self-evident implications (EQ1) and (EQ2) and the reduction laws (ES1) and (ES2).
(EQ1) Γ, ¬(c ≈ c) |= C
(EQ2) Γ |= c ≈ c
In the following laws c and c′ are different constants, c has at least one other occurrence, and Γ′, C′ result from Γ, C by substituting everywhere c′ for c.
(ES1) Γ, c ≈ c′ |= C ⇐⇒ Γ′, c ≈ c′ |= C′
(ES2) Γ |= ¬(c ≈ c′) ⇐⇒ Γ′ |= ¬(c ≈ c′)
To see why (ES2) holds, consider a counterexample to one of the sides. Since ¬(c ≈ c′) gets F, c ≈ c′ gets T. Hence c and c′ have the same denotation. Therefore the sentences in Γ and in Γ′ get the same truth-values.
The top-down reductions, for a given initial goal, proceed much as before. Again, it is advisable to carry out c-reductions, via (ES1) and (ES2), as early as possible. At the end the goal is reduced to a bunch of elementary implications that cannot be further reduced. Corresponding to the four self-evident implication laws, there are four types of self-evident implications:
Γ, A, ¬A |= C,    Γ, A |= A,    Γ, ¬(c ≈ c) |= C,    Γ |= c ≈ c
The method’s adequacy is proven by showing that if an elementary implication is not of any of these forms and if, moreover, it cannot be further reduced via (ES1) or (ES2), then there is an interpretation that makes all premises true and the conclusion false.
Example:
1. P(a), P(b) → ¬P(c) |= a ≈ b → ¬(b ≈ c)
2. P(a), P(b) → ¬P(c), a ≈ b |= ¬(b ≈ c)    (|=, →)
3. P(b), P(b) → ¬P(c), a ≈ b |= ¬(b ≈ c)    a-reduction, (ES1)
4. P(c), P(c) → ¬P(c), a ≈ c |= ¬(b ≈ c)    b-reduction, (ES2)
5.1 P(c), ¬P(c), a ≈ c |= ¬(b ≈ c)    (→, |=), √
5.2 P(c), ¬P(c), a ≈ c |= ¬(b ≈ c)    (→, |=), √
Gentzen-Type Systems for PC0 with Equality
The Gentzen-type systems GS1 and GS2, considered in 6.2.5, can be extended to PC0 with equality. By now the extension should be obvious: just as GS1 and GS2 are obtained by formalizing the laws of 4.3.4 and 4.4.1 into axioms and inference rules, the required extensions are obtained by formalizing the additional laws. The required extension of GS2 is obtained by adding the following axiom and inference rule:
Axiom: Γ, ¬(c ≈ c), with conclusion ⊥.
Inference rule: from Γ′, c ≈ c′, with conclusion ⊥, infer Γ, c ≈ c′, with conclusion ⊥,
where c and c′ are different constants, c occurs in Γ, and Γ′ results from Γ by substituting everywhere c′ for c.
Note that the inference rule corresponds to the ⇐-direction of (ES). The completeness of the extended system follows now from the adequacy of the top-down proof-by-contradiction method for PC0. The extension of GS1 is obtained in a similar way and is left to the reader.
A Hilbert-Type System for PC0 with Equality
The Hilbert-type system HS1, given in 6.2.3, can be extended to a system that is sound and complete for PC0 with equality. As in HS1, we assume that the language of PC0 has only ¬ and → as primitive sentential connectives. The same kind of extension applies to every sound and complete system that has modus ponens as an inference rule (either primitive or derived). In particular it applies to the systems obtained from HS1 by adding connectives with the associated axioms, as described in 6.2.4 (cf. Homework 6.12, 6.13 and 6.14). It turns out that the addition of the following two equality axiom-schemes is all that is needed:
EA1 c ≈ c, where c is any individual constant.
EA2 c ≈ c′ → (A → A′), where c and c′ are any individual constants, A is any sentence of PC0 and A′ is obtained from A by substituting one occurrence of c by c′.
Actually, we can restrict EA2 to the cases where A is an atom; this, together with EA1, is already sufficient, but we shall not go into this here.
If our original system is complete (i.e., is sufficient for proving all tautological implications), then the addition of the two axiom schemes takes care of all implications that are due to the connectives and to equality. For example, the logical truth
a ≈ b → b ≈ a
is derivable from EA1 and EA2, because
a ≈ b → (a ≈ a → b ≈ a)
is an instance of EA2 (where A is a ≈ a, c and c′ are, respectively, a and b, and A′ is obtained by replacing the first occurrence of a by b). This and a ≈ a tautologically imply a ≈ b → b ≈ a. Note that, while EA2 allows us to replace one occurrence of c by an occurrence of c′, repeated applications enable us to replace any number of c’s by c′’s.
The completeness of the resulting system is proved in the same way used to prove the completeness of HS1. We extend the previous proof by showing that the provability relation, ⊢, of that system has the properties required to ensure the adequacy of the top-down derivation method. We have to establish the following properties, which correspond to (EQ1), (EQ2) and the ⇐-directions of (ES1) and (ES2).
(i) Γ, ¬(c ≈ c) ⊢ C
(ii) Γ ⊢ c ≈ c
In the following Γ′, C′ result from Γ, C by substituting everywhere c′ for c.
(iii) Γ′, c ≈ c′ ⊢ C′ ⇒ Γ, c ≈ c′ ⊢ C
(iv) Γ′ ⊢ ¬(c ≈ c′) ⇒ Γ ⊢ ¬(c ≈ c′)
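The tautological implication used in the derivation of a ≈ b → b ≈ a from EA1 and EA2 can be checked by a plain truth-table, in which the three equalities are treated as unanalysed atoms. The following sketch does this by brute force; the letters p, q, r and the function names are ours.

```python
from itertools import product

def implies(x, y):
    return (not x) or y

# p stands for a ≈ b, q for a ≈ a, r for b ≈ a, all treated as unanalysed atoms.
def ea2_instance(p, q, r):
    return implies(p, implies(q, r))   # a ≈ b → (a ≈ a → b ≈ a)

def ea1_instance(p, q, r):
    return q                           # a ≈ a

def target(p, q, r):
    return implies(p, r)               # a ≈ b → b ≈ a

rows = list(product([True, False], repeat=3))

# The target gets T in every row in which both axiom instances get T.
tautologically_implied = all(
    target(*row) for row in rows if ea2_instance(*row) and ea1_instance(*row)
)
# Taken by itself, the target is not a tautology.
target_is_tautology = all(target(*row) for row in rows)
print(tautologically_implied, target_is_tautology)
```

No row is struck out here: the check runs over all eight rows, which is what makes the implication tautological rather than merely logical.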
Homework
7.5 Find which of the following is a logical implication. Justify your answers (positive: by derivations or truth-value considerations, or both; negative: by counterexamples).
1. L(a, b) → L(b, a), L(a, b)∧L(b, c) → L(a, c) |= c ≈ a → (L(a, b) → L(a, a)) ?
2. (L(a, b)∧L(b, a)∧(a ≉ b)) → H(a) |= L(a, a)∧L(a, b) → H(a) ?
3. L(a, b) → L(b, c), L(b, c) → L(c, a), L(c, a) → L(a, b) |= a ≈ b → [L(a, a) ↔ L(c, a)] ?
7.6 Consider interpretations of a language based on a two-place predicate L( , ), and individual constants a, b, c, such that:
7.3. STRUCTURES OF PREDICATE LOGIC IN NATURAL LANGUAGE
(i) Each individual constant denotes somebody among Nancy, Edith, Jeff, and Bert, where these are four distinct people.
(ii) L(... , ___) reads as: ‘... likes ___’, where all the pairs (x, y) such that x likes y are:
(Nancy, Edith), (Nancy, Jeff), (Edith, Edith), (Edith, Bert), (Jeff, Jeff), (Jeff, Bert), (Bert, Edith).
Consider the sentences
(1) L(a,b) ∧ L(b,a) ∧ ¬ L(b, b)
(2) ¬ L(a,a) ∧ L(b,a) ∧ L(c,a)
Find all the ways of interpreting a and b so that (1) comes out true; and all the ways of interpreting a, b, c so that (2) comes out true. Indicate in each case your reasons.
7.7 Show that the following sentences are implied tautologically by instances of EA1 and EA2, where EA2 is used for atomic sentences only. (Outline the argument, giving the instances of the required equality axioms.)
1. a ≈ b → [b ≈ c → a ≈ c]
2. a ≈ b → [L(a, a) → L(b, b)]
3. a ≈ b → [(c ≉ d) ∨ (S(a, c) ↔ S(b, d))]
7.8 Prove (iii) in the above-mentioned list of properties required of ⊢. (Hint: show that every sentence of Γ′ is provable from c ≈ c′ and the corresponding sentence of Γ, by repeated uses of (EA2). Using the fact that c ≈ c′ ⊢ c′ ≈ c, show that C is provable from c ≈ c′ and C′.)
7.3
Structures of Predicate Logic in Natural Language
7.3.1
Variables and Predicates
Variables as Place Markers
When predicates of PC0 of arity > 1 are meant to formalize English predicate-expressions, the correspondence between the argument places should be clear and unambiguous. For example, if we introduce the two-place L( , ) as the formal counterpart of ‘likes’ we can say:
L(... , ___) is to be read as ‘... likes ___’.
It is obvious here that the first and second places in L( , ) match, respectively, the left and right sides of ‘likes’. But this cumbersome notation is impractical when the arities are larger, or when the English expressions are longer. At this point variables come in handy. We can say that
L(x, y) is to be read as ‘x likes y’.
You are probably acquainted with the use of variables that range over numbers from high school algebra. Variables that range over sentences have been used in previous chapters, as well as variables ranging over strings. In chapter 5 we used variables ranging over arbitrary objects. Later we shall extend PC0 by incorporating in it variables as part of the language. At present we are going to use ‘x’, ‘y’, ‘z’, ‘u’, ‘v’, etc. merely as place markers: to mark certain places within syntactic constructs. The identity of the variables is not important. For example, the following three stipulations come to the same:
L(x, y) is to be read as ‘x likes y’.
L(u, v) is to be read as ‘u likes v’.
L(y, x) is to be read as ‘y likes x’.
Each means that L is a binary predicate to be interpreted as:
{(p, q) : p likes q}
But if we were to say that
L(x, y) is to be read as ‘y likes x’,
then we would be assigning to L a different interpretation, namely:
{(p, q) : q likes p}
If a and b denote, respectively, a and b, then under the first stipulation L(a,b) is true iff a likes b, but under the second it is true iff b likes a.
By substituting, in English sentences, variables for noun-phrases we can indicate predicate-expressions. We can then say, for example:
Let H be the predicate ‘x is happy’,
which says that the monadic predicate H( ) is to be interpreted as the set of all happy beings. When the arity is > 1 it should be clear which coordinates are represented by which variables. If we say
Let L be the predicate ‘x likes y’,
then we should indicate the correspondence between variables and places of L. In the absence of other indications, we shall take the alphabetic order of the variables as our guide: ‘x’ before ‘y’, ‘y’ before ‘z’.
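The point that the choice of variable names is immaterial, while the matching of variables to places is not, can be made concrete by writing the interpretations as sets of pairs. The data below (who likes whom) is hypothetical and serves only as an illustration.

```python
# Hypothetical facts: the pairs (p, q) such that p likes q.
likes = {("Ann", "Bert"), ("Bert", "Cal")}

# Stipulation 1: L(x, y) is read as 'x likes y'.
L_first = {(p, q) for (p, q) in likes}
# The same stipulation stated with other variables: L(u, v) is read as 'u likes v'.
L_renamed = {(u, v) for (u, v) in likes}
# Stipulation 2: L(x, y) is read as 'y likes x', i.e. {(p, q) : q likes p}.
L_second = {(p, q) for (q, p) in likes}

# Let the constants a and b denote Ann and Bert respectively.
a, b = "Ann", "Bert"
print((a, b) in L_first, (a, b) in L_second)
```

Under the first stipulation L(a, b) comes out true because Ann likes Bert; under the second it would require that Bert like Ann.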
Deriving Predicates from English Sentences
Generally, an English sentence can give rise to more than one predicate, because we can mark (using variables) different places as empty. To take an example from Frege,
(1) Brutus killed Caesar
gives rise to the expression: ‘x killed y’, as well as to: ‘x killed Caesar’ and ‘Brutus killed y’. Let K, K1, and K2 be, respectively, predicates corresponding to these expressions. Then each of the following three is a formalization of (1):
(1′) K(Brutus, Caesar)
(1″) K1(Brutus)
(1‴) K2(Caesar)
K denotes the binary relation of killing: the set of all pairs (p, q) in which p killed q. K1 corresponds to the property of being a killer of Caesar; it denotes the set of all beings that killed Caesar. K2 corresponds to the property of being killed by Brutus; it denotes the set of all beings that were killed by Brutus. We can also derive from the binary predicate ‘x killed y’ the monadic predicate ‘x killed x’, which denotes the set of all beings that killed themselves. The derived predicates can be quite arbitrary. For example, we can formalize (2) Jack frequents the movies and Jill prefers to stay home,
as:
(2′) B(Jack, Jill)
where B(x, y) is to be read as: x frequents the movies and y prefers to stay home. B is therefore interpreted as
{(p, q) : p frequents the movies and q prefers to stay home}
(2′) is not a good formalization since it hides the structure of (2). A better one, which shows (2) as a conjunction, is:
(2″) FrMv(Jack) ∧ PrHm(Jill)
where FrMv and PrHm are, respectively, the monadic predicates: ‘x frequents the movies’ and ‘y prefers to stay home’.
In the same vein, we can say that (1′) is a better recasting of (1) than either (1″) or (1‴). While grammar can be deceptive when it comes to logical analysis, grammatical aspects can guide us to more natural predicates that reveal more of the sentence’s logical structure.
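The derived predicates of this subsection can be computed from the interpretations of the predicates they derive from. The following sketch uses small hypothetical extensions, chosen only to make the sets non-trivial; the variable names follow the text's K, K1, K2, B, FrMv and PrHm.

```python
# Hypothetical extension of the binary predicate 'x killed y'.
K = {("Brutus", "Caesar"), ("Cassius", "Caesar"), ("Brutus", "Brutus")}

K1 = {p for (p, q) in K if q == "Caesar"}      # 'x killed Caesar'
K2 = {q for (p, q) in K if p == "Brutus"}      # 'Brutus killed y'
self_killers = {p for (p, q) in K if p == q}   # 'x killed x'

# The three formalizations of 'Brutus killed Caesar' get the same truth-value.
print(("Brutus", "Caesar") in K, "Brutus" in K1, "Caesar" in K2)

# Hypothetical extensions of the monadic predicates of sentence (2).
FrMv = {"Jack"}   # 'x frequents the movies'
PrHm = {"Jill"}   # 'y prefers to stay home'
# The binary predicate B is determined by the two monadic ones.
B = {(p, q) for p in FrMv for q in PrHm}
print((("Jack", "Jill") in B) == ("Jack" in FrMv and "Jill" in PrHm))
```

That B can be recovered from FrMv and PrHm, but not the other way round from B(Jack, Jill) as a bare atom, is one way of seeing why the conjunctive formalization shows more structure.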
7.3.2
Predicates and Grammatical Categories of Natural Language
The following are the basic grammatical categories that give rise to predicates. This is true of English, as well as of other languages. • Adjectives, as in ‘x is triangular’ • Common names, as in ‘x is a woman’. • Verbs, as in ‘x enjoys life’. Common names are also known as general names, or as common nouns. Adjectives and verbs give rise to predicates of arity greater than one. For example, from adjectives we get: x is taller than y, x is between y and z. And from verbs:
x introduced y to z.
In English, adjectives and common names require the word ‘is’, known in this context as the copula. It connects the adjective, or the common name, with the noun phrase (or phrases). In the predicate-expression the noun phrase is replaced by a variable. Common names are characterized by the presence of the indefinite article: a woman, an animal, a city, etc.
As you can see, a variety of English constructs are put in the same bag: all become predicates upon formalization in first-order logic. Differences of English syntax and certain differences in meaning are ignored. A finer-grained picture requires additional structural elements and may involve considerable increase in the formalism’s complexity.
Two Usages of ‘is’
The role of ‘is’ as a copula in predicate expressions is to be clearly distinguished from its role as a two-place predicate denoting identity. Compare, for example,
(3) Ann is beautiful
with
(4) Ann’s father is Bert.
In (3) ‘is’ functions as a copula. In (4) it functions as the equality predicate. (4) can be written as
Ann’s father = Bert
‘Is’ must function as the equality predicate when it is flanked by singular noun-phrases (i.e., noun-phrases denoting particular objects).
Singular Terms
Singular terms are constructs that function as names of particular objects, e.g.,
Bill Clinton, New York City, 132, The smallest prime number, The capital of the USA, etc.
There is a difference between the first two, which are, so to speak, atomic, and the last two, which pick their objects by means of a description. The first are called proper names, the last, definite descriptions. Usually, a definite description is marked by the definite article:
the capital of the USA, the satellite of the earth, the second world war, the man who killed Liberty Valence, etc.
But this rule has exceptions. ‘The USA’ should be construed as a proper name, while ‘132’ is really a disguised description (spelled out, it becomes ‘1 · 10² + 3 · 10¹ + 2 · 10⁰’). A definite description denotes the unique object satisfying the stated condition, e.g., ‘the earth’s satellite’ denotes that unique object of which ‘x is a satellite of earth’ is true. The definite description fails to denote if either no object, or more than one object, satisfies the condition. There are various strategies for dealing with non-denoting descriptions. In Russell’s theory of descriptions, sentences containing definite descriptions are recast into sentences that have truth values even when the description of the original sentence fails to denote. On other theories, a failure of denotation can cause a truth-value gap, that is: the sentence has no truth-value. These and other questions that relate to differences between proper names and definite descriptions have been the focus of considerable attention in the philosophy of language. Some have been the subject of a still ongoing debate.
Note: Sometimes the definite article is used merely for emphasis, or focusing:
(5) Jill is the daughter of Eileen
need not imply that Jill is the only daughter of Eileen. It can be read as
(5′) Jill is a daughter of Eileen.
which can be formalized as:
(5∗) Daughter(Jill, Eileen)
Here ‘is’ functions as a copula. Contrast this with
(6) The daughter of Eileen is Jill.
Here ‘is’ cannot be read as a copula, because ‘Jill’ cannot be a general name. (6) must be read as:
(6∗) The daughter of Eileen = Jill.
Both proper names and definite descriptions are represented by individual constants in the simplest language of first-order logic (of which PC0 is a fragment). In other variants of the language there are ways of forming other singular terms, besides individual constants. A most common variant contains function symbols. For example, it may contain a one-place function
symbol, F( ), such that F(x) is to be read as ‘the father of x’. We can therefore render ‘John’s father’ as: F(John), and ‘The father of John’s father’ as: F(F(John)). Function symbols can have arity > 1; for example, a two-place function symbol sum( , ), such that sum(x, y) is to be read as ‘the sum of x and y’. In infix notation sum(x, y) becomes x + y. Further details concerning such languages are given in 8.2.4, page 291.
Other variants of first-order languages contain a definite description operator, which is used to form expressions that read:
the unique x such that: ... x ...
where ‘...x...’ expresses some property of x.
Straightforward translations into the predicate calculus may obliterate other distinctions, besides those between adjectives, common names and verbs. The translation of ‘snow is white’ as: White(Snow), treats ‘snow’ as a name of an object, on a par with ‘Bill Clinton’. The distinction between mass terms (‘snow’, ‘water’, ‘coal’) and proper names (‘John’, ‘The USA’, ‘Chicago’) disappears in this translation.
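The way nested terms such as F(F(John)) receive their denotations can be sketched by a small recursive evaluator. The representation of terms as nested tuples, the family data, and the constants ‘two’ and ‘three’ are assumptions made for this illustration; they are not part of the formal language described in the text.

```python
# A term is either a constant (a string) or a tuple (function_symbol, arg_1, ..., arg_n).
father_of = {"John": "Paul", "Paul": "George"}   # hypothetical family data

functions = {
    "F": lambda x: father_of[x],   # F(x): 'the father of x'
    "sum": lambda x, y: x + y,     # sum(x, y): 'the sum of x and y'
}
constants = {"John": "John", "two": 2, "three": 3}

def denotation(term):
    """Denotation of a term: constants are looked up, and a function symbol
    is applied to the denotations of its argument terms."""
    if isinstance(term, tuple):
        symbol, *args = term
        return functions[symbol](*(denotation(t) for t in args))
    return constants[term]

print(denotation(("F", "John")))          # 'John's father'
print(denotation(("F", ("F", "John"))))   # 'the father of John's father'
print(denotation(("sum", "two", "three")))
```

The recursion mirrors the formation of the terms: the denotation of a compound term depends only on the denotations of its parts, in keeping with the extensionality principle.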
7.3.3
Meaning Postulates and Logical Truth Revisited
Except for the logical particles, our formal language is uninterpreted. Consequently, various truths that are taken for granted in natural language have to be stated explicitly when the discourse is formalized. For example, by saying that Jill is a female, one implies that she is not a male. To make this explicit, we can add (7) Fem(j) → ¬Male(j)
as a non-logical axiom. Actually, the axiom to be added is not (7), but the generalized sentence stating that no female is a male. The sentence is formed by using a quantifier:
(7∗) ∀v(Fem(v) → ¬Male(v))
For the moment let us state axioms within PC0. Following Carnap, we have introduced in 4.5.1 the term ‘meaning postulate’ to characterize non-logical axioms that reflect the meaning of linguistic terms; these terms, it turns out, are mostly predicates. We can thus say that (7), or the generalization (7∗), derives from the meaning of ‘female’ and ‘male’. As noted in 4.5.1, the absolute distinction that Carnap advocated between meaning postulates and empirical truth is now rejected by many. But it still makes good sense to distinguish (7∗) from plain empirical truths.
In 4.5.1 we have also considered another kind of presupposition, called ‘background assumptions’, which are not as basic, or as trivial, as meaning postulates. They cover a wide range, from truths whose certainty seems beyond doubt to those that are merely probable. No further elaboration is needed here.
Predicate logic and its relation to natural language call for further clarifications of logical truth and logical implication, beyond those that have to do with the sentential connectives. Consider for example:
(8) The earth is bigger than the moon,
(9) The moon is smaller than the earth.
Does (8) logically imply (9)? (Or, equivalently, is the conditional ‘If (8) then (9)’ a logical truth?) If two different predicates, say B and S, are used for ‘bigger’ and ‘smaller’, the sentences are formalized as:
(8∗) B(earth, moon)
(9∗) S(moon, earth)
The implication between the sentences rests in this case on a meaning postulate that can be stated schematically, with ‘a’ and ‘b’ standing for any individual constants:
(10) B(a, b) ↔ S(b, a)
But (10) is not a logical truth; neither is the implication from (8∗) to (9∗) a logical implication.
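That (10) is not a logical truth can be seen from an interpretation in which B and S are assigned unrelated sets of pairs. The sketch below uses a hypothetical two-object interpretation; the object names "Earth" and "Moon" and the helper `atom` are ours.

```python
# Hypothetical interpretation: two objects, with B and S assigned independently.
denote = {"earth": "Earth", "moon": "Moon"}
B = {("Earth", "Moon")}   # extension assigned to B
S = set()                 # extension assigned to S: the empty relation

def atom(extension, *constants):
    """Truth-value of an atomic sentence whose predicate has the given extension."""
    return tuple(denote[c] for c in constants) in extension

premise = atom(B, "earth", "moon")      # (8*) B(earth, moon)
conclusion = atom(S, "moon", "earth")   # (9*) S(moon, earth)
postulate = (premise == conclusion)     # the instance of (10) for earth, moon
print(premise, conclusion, postulate)

# If S is interpreted as the converse of B, the instance of (10) holds.
S_converse = {(q, p) for (p, q) in B}
print(atom(S_converse, "moon", "earth"))
```

The first interpretation makes (8∗) true and (9∗) false, so it is a counterexample to the implication; the implication holds only in interpretations that respect the meaning postulate.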
Another way of construing the situation is to regard ‘x is bigger than y’ simply as another way of writing ‘y is smaller than x’; just as in mathematics ‘x > y’ is another way of writing ‘y < x’. And in this case (8) implies tautologically (9), because they are construed
7.4. PC∗0 , PREDICATE LOGIC WITH INDIVIDUAL VARIABLES
as the same sentence. The question: what are the “right” translations of (8) and (9) may not have an answer. We might try a third alternative: The predicates representing ‘smaller’ and ‘bigger’ are different, but certain sentences, such as (10), count as logical axioms. This only transforms our original question into the question: What meaning postulates count as logical axioms? Suppose you regard (10) as a logical truth, would you adopt the same policy with respect to other pairs:
hotter and colder, prettier and uglier, to the left of and to the right of ?
And what about ‘hot’ and ‘cold’, ‘beautiful’ and ‘ugly’? Or sentences such as:
(11) Red(a) → ¬Blue(a) ?
All this should not undermine the concept of logical implication. It only indicates a looseness of fit between the formal structure and our actual language, a looseness that is inevitable whenever a theoretical scheme is matched against concrete phenomena.
7.4
PC∗0 , Predicate Logic with Individual Variables
7.4.0 We now take the crucial step of incorporating individual variables into the formal language. Let v1, v2, . . . , vn, . . . be a fixed infinite list of distinct objects called individual variables, or variables for short, which are different from all previous syntactic items of PC0. The vi’s are different, but they play the same role in the formal language. It is convenient to use
‘u’, ‘v’, ‘w’, ‘x’, ‘y’, ‘z’, ‘u′’, ‘v′’, etc.
as standing for unspecified vi’s, i.e., as variables ranging over v1, . . . , vn, . . . . (We may say, “For every individual variable v ...”, or “For some individual variable w ...”.) We shall also use ‘x’, ‘y’ and ‘z’ in another role: to range over various domains that come up in the discussion. For example in ‘{(x, y) : x < y}’, ‘x’ and ‘y’ range over numbers. Whether ‘x’, ‘y’ and ‘z’ stand for variables of the formal language, or are used in a different role, will be clear from the context.
Henceforth ‘PC∗0 ’ denotes the system obtained from PC0 by incorporating the individual variables.
Terms: An individual term, or, for short, a term, is either an individual constant or a variable.
Well Formed Formulas, or Wffs: The basic construct of PC∗0 is that of a well formed formula. The name is abbreviated as ‘wff’. Wffs are constructed like the sentences of PC0 , except that variables, besides individual constants, can fill the predicate’s empty places. We shall use lower case Greek letters:
‘α’, ‘β’, ‘γ’, ‘α1’, ‘α2’, ..., ‘β1’, ‘β2’, ..., ‘α′’, ..., etc.,
to range over well formed formulas. Since sentences turn out to be special cases of well formed formulas, this involves also a notational change with regard to sentences: from upper case Latin to lower case Greek. Spelt out in detail, the definition is:
Atomic Wffs: If P is an n-place predicate and t1, ..., tn are terms then P(t1, ..., tn) is an atomic wff. It goes without saying that unique readability is assumed with respect to atomic wffs: the predicate and the sequence of terms are uniquely determined by the formula. This is also assumed with respect to all compounds.
The sentential connectives are now construed as operations defined for wffs. The set of all wffs is defined inductively by:
(I) Every atomic wff is a wff.
(II) If α and β are wffs then ¬α, α ∧ β, α ∨ β, α → β, α ↔ β are wffs.
All the syntactic concepts, such as main connective, immediate components, and components, are defined in the same way as before. The occurrences of a term in a wff are determined in the obvious way: (i) A term, t, occurs, in the ith predicate-place, in the atomic wff P (t1 , . . . , tn ), iff t = ti (note that it can have several occurrences), and (ii) the occurrences of t in a wff α are its occurrences in the atomic components of α. The Sentences of PC∗0 : The sentences of PC∗0 are, by definition, the sentences of PC0 . It is easy to see that this means the following: A wff of PC∗0 is a sentence iff no variables occur in it.
(For atomic wffs this is obvious. For the others, it follows from the fact that the wffs of PC∗0 and the sentences of PC0 are generated from atoms by the same sentential connectives.)
Examples: The following are wffs, where u, v, w, v′ are any variables.
P(b, a)
P(u, b)
P(a, a) → ¬R(c)
P(v, u) → ¬R(c)
P(v, v) ∧ (R(w) ∨ R(v′))
The first is an atomic sentence, the second is an atomic wff that is not a sentence, the third is a non-atomic sentence, the fourth and the fifth are non-atomic wffs that are not sentences. So far the variables play the same syntactic role as individual constants. There is however a semantic distinction: The interpretation of the language assigns denotations to individual constants, but not to the variables. Consequently, an interpretation determines the truth-values of sentences, but not of wffs. To determine the truth-value of a wff α we need, besides the interpretation of the language, an assignment of objects to the variables of α. For example, the truth-value of P(a, b) is determined by the denotations of a and b and the interpretation of P; but, in order to get the truth-value of P(v, b), we need, in addition, to assign some object as the value of v. This will be elaborated and clarified within the general setting of first-order logic.
Note: It may happen that the truth-value of a wff, which is not a sentence, is the same for all assignments of objects to its variables. For example, P(u, v) ∨ ¬P(u, v) gets, for every assignment, the value T. Or (P(u, v) → H(a)) ∧ (¬P(u, v) → H(a)) gets, for every assignment, the same value as H(a). A wff may therefore be logically equivalent to a sentence. This does not make it a sentence. The distinction between sentences and wffs that are not sentences is syntactic, not semantic.
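The difference between what an interpretation fixes and what an assignment adds can be made concrete in a short program. The following Python sketch is only an illustration under assumptions of our own: wffs are encoded as nested tuples, the names VARIABLES, terms_of, is_sentence and truth_value are hypothetical, and the toy interpretation (universe {1, 2, 3}, with P read as ‘is less than’) is not taken from the text.

```python
# Hypothetical encoding, not taken from the text: a wff is a nested tuple.
#   ("atom", "P", ("v", "b"))              the atomic wff P(v, b)
#   ("not", A)                             ¬A
#   ("and" | "or" | "imp" | "iff", A, B)   binary compounds
VARIABLES = {"u", "v", "w", "x", "y", "z"}  # assumed names of variables


def terms_of(wff):
    """Return the set of terms occurring in the atomic components of wff."""
    if wff[0] == "atom":
        return set(wff[2])
    return set().union(*(terms_of(part) for part in wff[1:]))


def is_sentence(wff):
    """A wff of PC*_0 is a sentence iff no variable occurs in it."""
    return not (terms_of(wff) & VARIABLES)


def truth_value(wff, denotation, extension, assignment):
    """Variables are looked up in assignment, individual constants in
    denotation, and predicates in extension (a set of tuples)."""
    op = wff[0]
    if op == "atom":
        _, predicate, terms = wff
        objects = tuple(
            assignment[t] if t in VARIABLES else denotation[t] for t in terms
        )
        return objects in extension[predicate]
    if op == "not":
        return not truth_value(wff[1], denotation, extension, assignment)
    left = truth_value(wff[1], denotation, extension, assignment)
    right = truth_value(wff[2], denotation, extension, assignment)
    return {"and": left and right, "or": left or right,
            "imp": (not left) or right, "iff": left == right}[op]


# Toy interpretation (an assumption): universe {1, 2, 3}, P is "less than".
denotation = {"a": 1, "b": 2}
extension = {"P": {(1, 2), (1, 3), (2, 3)}}

p_ab = ("atom", "P", ("a", "b"))
p_vb = ("atom", "P", ("v", "b"))
p_uv = ("atom", "P", ("u", "v"))
excluded_middle = ("or", p_uv, ("not", p_uv))

# The sentence needs no assignment; the wff P(v, b) needs a value for v.
print(truth_value(p_ab, denotation, extension, {}))
print(truth_value(p_vb, denotation, extension, {"v": 1}))
print(truth_value(p_vb, denotation, extension, {"v": 3}))
```

In this sketch P(u, v) ∨ ¬P(u, v) comes out true under every assignment, yet is_sentence still rejects it, which mirrors the point that the distinction is syntactic.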
7.4.1 Substitutions
In 7.1.1 we discussed substitutions, in sentences of PC0 , of individual constants by individual constants. We can now extend this to substitution of terms by terms in wffs. We denote by:
S^{t}_{t′} α
the wff resulting from substituting t′ for t in α. By this we mean that t′ is substituted for every occurrence of t. Examples:
S^{u}_{c} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, c) ∧ R(x, b)) → P(c),
S^{u}_{x} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, x) ∧ R(x, b)) → P(x),
S^{a}_{c} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(c, u) ∧ R(x, b)) → P(u),
S^{b}_{x} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, u) ∧ R(x, x)) → P(u).
We can also substitute at one go several terms: s1 by t1, s2 by t2, ..., sn by tn. These, as we saw in 7.1.1 page 249, are called simultaneous substitutions. The result of such a simultaneous substitution in α is denoted:
S^{s1 s2 ... sn}_{t1 t2 ... tn} α.
Examples:
S^{u,x}_{a,c} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, a) ∧ R(c, b)) → P(a),
S^{u,x}_{x,u} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, x) ∧ R(u, b)) → P(x),
S^{a,b}_{b,x} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(b, u) ∧ R(x, x)) → P(u).
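That a simultaneous substitution differs from carrying out the single substitutions one after the other can be checked mechanically. The following Python sketch is an illustration only: the nested-tuple encoding of wffs and the function name substitute are our own assumptions. It exchanges u and x in the wff used in the examples, first simultaneously and then in two successive steps.

```python
# Hypothetical encoding (an assumption): ("atom", "R", ("a", "u")) is R(a, u);
# compounds are ("not", A) or ("and" | "or" | "imp" | "iff", A, B).

def substitute(wff, mapping):
    """Simultaneously replace every occurrence of each key of mapping
    by the corresponding value. There are no quantifiers in PC*_0,
    so every occurrence of a term is replaced."""
    if wff[0] == "atom":
        _, predicate, terms = wff
        return ("atom", predicate, tuple(mapping.get(t, t) for t in terms))
    return (wff[0],) + tuple(substitute(part, mapping) for part in wff[1:])


# The wff (R(a, u) ∧ R(x, b)) → P(u)
alpha = ("imp",
         ("and", ("atom", "R", ("a", "u")), ("atom", "R", ("x", "b"))),
         ("atom", "P", ("u",)))

# Simultaneous: u by x and x by u, at one go.
simultaneous = substitute(alpha, {"u": "x", "x": "u"})

# Successive: first u by x, then x by u in the result.
successive = substitute(substitute(alpha, {"u": "x"}), {"x": "u"})

print(simultaneous)   # encodes (R(a, x) ∧ R(u, b)) → P(x)
print(successive)     # encodes (R(a, u) ∧ R(u, b)) → P(u)
```

The two results differ because, in the successive version, the x’s introduced by the first step are themselves replaced in the second step.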
Variable Displaying Notation
Notations such as: ‘α(v)’, ‘β(x, y)’, ‘γ(u, v, w)’ are used for wffs in order to call attention to the displayed variables. The point of the notation is that if we use ‘α(x)’, then we understand by ‘α(a)’ the wff obtained from α(x) by substituting a for x. Hence we have:
S^{x}_{a} α(x) = α(a), S^{x,y}_{a,b} β(x, y) = β(a, b), S^{x,y}_{b,b} β(x, y) = β(b, b), S^{u,x,y}_{a,b,c} γ(u, x, y) = γ(a, b, c)
We extend this to cover substitutions of variables by variables:
S^{x}_{y} α(x) = α(y), S^{x,y}_{y,x} β(x, y) = β(y, x), S^{x,y}_{b,x} β(x, y) = β(b, x), etc.
If we think of α(x) as a predicate expressing a certain property, with ‘x’ marking the empty place of the predicate, then we can think of α(a) as saying that the property is true of the object denoted by a. Incautious use of this convention may lead to notational inconsistency. A wff containing both x and y as free variables should not be written both as α(x) and as α(y). For then α(a) can be read either as the wff obtained by substituting a for x, or as the wff obtained by substituting a for y; and these are different. Also in that case we might read α(y) as the result of substituting y for x in α(x). Such inconsistencies are avoided if we display all the free variables that we consider subject to substitutions. In the case just mentioned we write the wff as α(x, y). Then,
S^{x}_{a} α(x, y) = α(a, y), S^{y}_{a} α(x, y) = α(x, a), S^{x}_{y} α(x, y) = α(y, y).
Note that it would do no harm to display variables that do not occur in the wff; if x does not occur in α(x), then substituting for it any term does not have any effect: α(x) = α(c). But
since this may confuse it is best to avoid it. Usually use of ‘α(x)’ is taken to indicate that ‘x’ occurs in α. When we want to focus on a particular variable, while indicating that there are possibly others, we can use notations such as: α(. . . x . . .). Or we can state explicitly that α(x) may have other variables.
7.4.2 Variables and Structural Representation
A wff α(x), having no variables besides x, can serve as a scheme for getting sentences of the form α(c), where c is any individual constant. Similarly, a wff α(u, v) can serve as a scheme for sentences of the form α(a, b). Such schemes can give us a handle on long sentences. Suppose we want to formalize:
(1) Everyone, among Jack, David and Harry, who is liked by Ann is liked by Claire.
Let us use L(x, y) for ‘x likes y’, and the first letters for the names. The desired sentence is a conjunction, saying, of each of the men, that if he is liked by Ann he is liked by Claire:
(1′) (L(a, j) → L(c, j)) ∧ (L(a, d) → L(c, d)) ∧ (L(a, h) → L(c, h))
We can, instead, describe the sentence as follows:
(1∗) α(j) ∧ α(d) ∧ α(h), where α(x) = L(a, x) → L(c, x).
Here α(x) “says of x” that if he is liked by Ann he is liked by Claire; hence, α(j), α(d), and α(h) say, respectively, that the property holds for Jack, David and Harry. This way of rewriting (1′) is much shorter and much more transparent. It brings to the fore a certain structure. Here are additional examples, in which the logical form is displayed by using variables:
(2) Everyone, among Jack, David and Harry, who is happy likes himself.
(2∗) α(j) ∧ α(d) ∧ α(h), where α(x) = H(x) → L(x, x).
If needed, we can unfold the sentence by carrying out all the substitutions and writing down the full conjunction:
(H(j) → L(j, j)) ∧ (H(d) → L(d, d)) ∧ (H(h) → L(h, h))
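The unfolding of a scheme is a purely mechanical operation, which the following Python sketch imitates. It is an illustration under our own assumptions: schemes are represented as functions from names of constants to strings, and the names conjunction, alpha_1 and alpha_2 are hypothetical.

```python
def conjunction(scheme, constants):
    """Unfold a scheme α(x) into α(c1) ∧ α(c2) ∧ ... for the given constants."""
    return " ∧ ".join(scheme(c) for c in constants)


def alpha_1(x):
    # α(x) = L(a, x) → L(c, x), the scheme used in (1∗)
    return f"(L(a, {x}) → L(c, {x}))"


def alpha_2(x):
    # α(x) = H(x) → L(x, x), the scheme used in (2∗)
    return f"(H({x}) → L({x}, {x}))"


men = ["j", "d", "h"]
unfolded_1 = conjunction(alpha_1, men)
unfolded_2 = conjunction(alpha_2, men)
print(unfolded_1)
print(unfolded_2)
```

Replacing " ∧ " by " ∨ " in the same way would give the corresponding unfolding of a ‘someone among’ sentence as a disjunction.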
(3) Someone among Jack, David and Harry who does not like Ann likes Claire.
(3∗) α(j) ∨ α(d) ∨ α(h), where α(x) = ¬L(x, a) ∧ L(x, c).
The following case displays a two-level structure, obtained by repeating the same technique.
(4) Someone, among Ann, Claire and Edith, likes everyone among Jack, David and Harry who likes himself.
(4∗) α(a) ∨ α(c) ∨ α(e), where α(x) = β(x, j) ∧ β(x, d) ∧ β(x, h), where β(x, y) = L(y, y) → L(x, y).
You may arrive at this by the following analysis:
(i) (4) says that at least one among Ann, Claire and Edith has a certain property. If α(x) expresses this property then the desired sentence is: α(a) ∨ α(c) ∨ α(e)
(ii) α(x) says that x likes everyone among Jack, David and Harry who likes himself. It is expressed as a conjunction: β(x, j) ∧ β(x, d) ∧ β(x, h) where β(x, y) says that if y likes himself then x likes y; that is:
(iii) β(x, y) = L(y, y) → L(x, y).
The following case involves the equality predicate.
(5) Among Ann, Claire and Edith, someone likes all the others.
(5∗) α(a) ∨ α(c) ∨ α(e), where α(x) = β(x, a) ∧ β(x, c) ∧ β(x, e), where β(x, y) = x ≉ y → L(x, y).
We need the antecedent x ≉ y, because the sentence asserts that someone likes all others; she may or she may not like herself.
Watch Out: There is no presupposition that different variables must be substituted by different individual constants, or that they must have different objects as values.
If we want to say of x and y that they are different, we have to use the wff x ≉ y. If we unfold (5∗), it becomes a disjunction of the following three sentences:
(a ≉ a → L(a, a)) ∧ (a ≉ c → L(a, c)) ∧ (a ≉ e → L(a, e))
(c ≉ a → L(c, a)) ∧ (c ≉ c → L(c, c)) ∧ (c ≉ e → L(c, e))
(e ≉ a → L(e, a)) ∧ (e ≉ c → L(e, c)) ∧ (e ≉ e → L(e, e))
Now, conditionals such as a ≉ a → L(a, a) are logical truths (though not tautologies), because a ≉ a always gets the value F; hence, the first conjunct in the first conjunction, the second in the second conjunction, and the third in the third conjunction, are redundant. Having removed the redundant conjuncts, the three conjunctions become:
(a ≉ c → L(a, c)) ∧ (a ≉ e → L(a, e))
(c ≉ a → L(c, a)) ∧ (c ≉ e → L(c, e))
(e ≉ a → L(e, a)) ∧ (e ≉ c → L(e, c))
If we assume that different names denote different women, then inequalities such as a ≉ c are true and the conjunctions can be further simplified, by replacing each conditional by its consequent. The whole sentence becomes:
(L(a, c) ∧ L(a, e)) ∨ (L(c, a) ∧ L(c, e)) ∨ (L(e, a) ∧ L(e, c))
This last step does not rely, as the preceding steps do, on pure logic. There is an additional assumption that different names have different denotations, which is expressible by the sentence: a ≉ c ∧ a ≉ e ∧ c ≉ e.
Homework 7.9 (I) Use variables, and the technique just illustrated in (1) - (5), to display the logical form of the following sentences. The same presuppositions are made as in Homework 7.1: The universe consists of Jack, David, Harry, Ann, Claire, and Edith, gender is according to name, and different names denote different people. Use the same notation as in Homework 7.1. You don’t have to unfold the sentences. Note ambiguities and provide, if you can, the appropriate different formalizations.
1. No man is liked by everyone, but some woman is.
2. A man likes himself when he is liked by everyone else.
3. Some woman likes herself, as do two of the men.
4. When a man and a woman like each other, both are happy.
5. Ann does not like a man who likes all women.
6. Some man, liked by Ann, does not like her.
7. Among Jack, David, Ann and Claire, no one is liked by all, except possibly Ann.
8. The same men are liked by Claire and Edith.
9. There is a man, who does not like himself, though liked by all other men.
10. No one among Ann, Claire and Harry is happy, who is not liked by someone else among Ann, Claire, Harry, and David.
11. A man is happy, if two women like him.
12. Some women, who do not like themselves, like each other.
13. Some man does not like himself, though liked by every woman.
14. Some men do not like themselves, though liked by every woman.
15. Some men do not like themselves, though every other man does.
16. Most women are liked by some men.
17. Harry and David like the same woman.
18. Harry and David like the same women.
19. Harry and David like only one woman.
Note: (i) In some problems (e.g., 9) you can get shorter expressions by using inequalities, as is done in (5); the unfolded form contains redundant components, but you don’t have to unfold. (ii) Some problems (e.g., 4) call for the use of two variables for which various pairs of constants are to be simultaneously substituted.
Chapter 8
First-Order Logic, The Language and Its Relation to English
8.1 First View
First-Order logic, FOL for short, is obtained by enriching PC∗0 with first-order quantifiers. These are new syntactic operations that produce new types of wffs. The application of quantifiers is called quantification. The version we shall study here is based on two first-order quantifiers: the universal and the existential. The choice is a matter of convenience, similar to the choice of sentential connectives. We shall see that, semantically, each quantifier is expressible in terms of the other and negation. We use ‘∀’ and ‘∃’ for the universal and the existential quantifier. A quantifier takes two objects as arguments: an individual variable and a wff. It yields as outcome a wff. The outcomes of applying ∀, or ∃, to v and α are written, respectively, as:
∀vα and ∃vα
These wffs are called, respectively, universal and existential generalizations. The following are commonly used terminologies. We speak of the universal, or existential, quantification of α with respect to v; and also of quantifying (universally, or existentially) the variable v, in α, or of quantifying over v in α. One speaks of quantified wffs, and also of quantified variables (in a given wff). We might say, for example, that in ∀vα, the variable v is quantified (universally). All of which should not cause any difficulty.
Before going, in the next section, into the syntax of FOL, let us get some idea of the relation of FOL to English and of the way of interpreting a first-order language. It will help us to appreciate better the syntactic details. If α is to be read, in English, as ‘...’, then
∀v α can be read as: ‘for all v ...’,
∃v α can be read as: ‘for some v ...’.
For example, let Mn, Mr, Wm, Hp be the formal counterparts of the predicates: ‘x is a man’, ‘x is mortal’, ‘x is a woman’, ‘x is happy’. Then ‘All men are mortal’ can be formalized as:
(1) ∀v1 (Mn(v1) → Mr(v1))
which can be read as: For every object v1, if v1 is a man then v1 is mortal. (Of course, any vi could have been used instead of v1.) Similarly, ‘Some woman is happy’ can be formalized as:
(2) ∃v3 (Wm(v3) ∧ Hp(v3))
which can be read as: For some object v3, v3 is a woman and v3 is happy.
Here are some less straightforward examples. Let L(x, y) and K(x, y) correspond to: ‘x likes y’ and ‘x knows y’. Then
(3) ∀v2 (K(v2, Jack) → L(v2, Jack))
says that everyone who knows Jack likes him; literally: for every object v2, if v2 knows Jack then v2 likes Jack. And the following says that every man is liked by some woman.
(4) ∀v2 [Mn(v2) → ∃v1 (Wm(v1) ∧ L(v1, v2))]
Here is a miniature sketch of the semantics. The full definitions are given in chapter 9. An interpretation of an FOL language is specified by the following items:
• A non-empty set, playing the role of the universe (or domain) of the interpretation. The individual variables of the language are assumed to range over that universe.
• An assignment that correlates, with each individual constant of the language, a member of the universe, which is called its denotation.
• An assignment that correlates with every n-ary predicate an n-ary relation over the universe, which is said to be the interpretation (or denotation) of the predicate.
If the language contains ≈, its interpretation is the identity relation over the universe. From this and from the examples above you may get a rough idea of how truth or falsity is determined by the interpretation.
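For a finite universe, the way an interpretation determines truth and falsity can be imitated by a short program. The following Python sketch is an illustration only and is not the definition of chapter 9: the tuple encoding of wffs, the names atom and holds, and the three-member universe with its relations are all our own assumptions. It evaluates (1) to (4) in that toy interpretation.

```python
def atom(predicate, *terms):
    """Hypothetical encoding of the atomic wff predicate(terms...)."""
    return ("atom", predicate, terms)


def holds(wff, universe, denotation, extension, assignment=None):
    """Truth-value of wff in a finite interpretation. Quantified variables
    get their values from the assignment built up during evaluation; any
    other term is looked up as an individual constant."""
    env = assignment or {}
    op = wff[0]
    if op == "atom":
        _, predicate, terms = wff
        objects = tuple(env[t] if t in env else denotation[t] for t in terms)
        return objects in extension[predicate]
    if op in ("all", "ex"):
        _, variable, body = wff
        results = (holds(body, universe, denotation, extension,
                         {**env, variable: obj}) for obj in universe)
        return all(results) if op == "all" else any(results)
    if op == "not":
        return not holds(wff[1], universe, denotation, extension, env)
    left = holds(wff[1], universe, denotation, extension, env)
    right = holds(wff[2], universe, denotation, extension, env)
    return {"and": left and right, "or": left or right,
            "imp": (not left) or right, "iff": left == right}[op]


# A toy interpretation (an assumption, not from the text).
universe = ["jack", "ann", "rex"]
denotation = {"Jack": "jack"}
extension = {
    "Mn": {("jack",)},
    "Mr": {("jack",), ("ann",), ("rex",)},
    "Wm": {("ann",)},
    "Hp": {("ann",)},
    "K": {("ann", "jack"), ("rex", "jack")},
    "L": {("ann", "jack")},
}

s1 = ("all", "v1", ("imp", atom("Mn", "v1"), atom("Mr", "v1")))
s2 = ("ex", "v3", ("and", atom("Wm", "v3"), atom("Hp", "v3")))
s3 = ("all", "v2", ("imp", atom("K", "v2", "Jack"), atom("L", "v2", "Jack")))
s4 = ("all", "v2", ("imp", atom("Mn", "v2"),
                    ("ex", "v1", ("and", atom("Wm", "v1"), atom("L", "v1", "v2")))))

for name, sentence in [("(1)", s1), ("(2)", s2), ("(3)", s3), ("(4)", s4)]:
    print(name, holds(sentence, universe, denotation, extension))
```

In this particular interpretation (3) comes out false, because rex knows jack without liking him, while (1), (2) and (4) come out true.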
8.2 Wffs and Sentences of FOL
8.2.0 ‘First-order logic’ refers to a family of languages that have comparable logical resources. Those we consider here share a logical apparatus that consists of:
sentential connectives, individual variables, first-order universal and existential quantification.
The non-logical vocabulary may differ from language to language; it consists of:
individual constants, predicates (of any arity).
If the language has the equality symbol, ≈, then it belongs to the logical vocabulary.
Every FOL language must contain at least one predicate. But it need not contain individual constants. (The above-given (1), (2) and (4) are examples of sentences without individual constants.) As before, we use
‘u’, ‘v’, ‘w’, ‘x’, ‘y’, ‘z’, ‘u′’, ‘v′’, etc.
to stand for unspecified vi’s.
Note: We assume that, when different symbols from this list occur in the same wff-expression, they stand for different vi’s, unless stated otherwise. Thus, what is expressed by (4) above can be expressed by using any two different vi’s, which we can write as:
(4′) ∀u [Mn(u) → ∃v (Wm(v) ∧ L(v, u))]
We shall also use ‘x’, ‘y’ and ‘z’ as variables of our own language, ranging over various domains according to the discussion.
First-Order Wffs (Well-Formed Formulas)
Terms and atomic wffs are defined exactly as in PC∗0 . Wffs are then defined inductively:
(I) Every atomic wff is a wff.
(II) If α and β are wffs then ¬α, α ∧ β, α ∨ β, α → β, α ↔ β are wffs.
(III) If α is a wff and v is any individual variable, then ∀vα and ∃vα are wffs.
• Wffs of the forms ¬α, or α∗β, where ∗ is a binary connective, are referred to as sentential compounds.
• Wffs of the forms ∀vα and ∃vα are referred to as generalizations, the first universal, the second existential.
Every wff is, therefore, either atomic, or a sentential compound, or a generalization. Here are some additional examples of wffs:
S(x, y, b) ∨ R(c, a)
∀x (P(x) → R(x, x))
(∃yR(a, y)) ∧ ¬∀xS(b, x, y)
∀x ∃y ∃z (S(x, y, z) ∨ ∀uP(u))
∃yP(a)
¬[(∀xR(y, z)) → ∀xP(x)]
The first, the third and the last are sentential compounds. The rest are generalizations: the second and fourth are universal, the fifth is existential.
Terminology: An occurrence of a term t in an atomic wff of the form R(. . . , t, . . .) is said to be under the predicate R.
Unique Readability: Unique readability comprises the previous conditions concerning atomic wffs and sentential compounds, as well as conditions on quantification:
• Every generalization is neither an atomic wff, nor a sentential compound.
• The quantifier, the variable, and the quantified wff are uniquely determined by the generalization: If Qv α = Q′v′ α′ where Q and Q′ are quantifiers, then: Q = Q′, v = v′ (i.e., they are the same variable) and α = α′.
Operants, Scopes and Subformulas: A quantifier-operant is a pair of the form ∀v, or ∃v, which consists of a quantifier and a variable. By an operant we shall mean either a sentential connective, or a quantifier-operant. Negation and quantifier-operants are monadic: they act on single wffs. The others are binary. The notion of main connective generalizes, in the obvious way, to the notion of main operant:
• The main operant of ¬α is ¬ and its scope is α. The main operant of α ∗ β (where ∗ is a binary connective) is ∗ and its left and right scopes are α and β.
• The main operant of Qv α (where Q is a quantifier) is Qv and its scope is α.
Sometimes we omit to mention the variable of the quantifier-operant and speak of the scope of a quantifier (or, rather, of its occurrence), or of the quantifier itself as being the main operant.
The immediate components of a wff are defined by adding to the previous definition of chapter 2 (cf. 2.3) the clause for quantifiers: The immediate component of Qv α (where Q is a quantifier) is α. A component is now defined as before: it is either the wff itself, or an immediate component of it, or an immediate component of an immediate component, etc. A component of α is
proper if it is different from α. The components of α are also referred to as the subformulas of α, and the proper components–as the proper subformulas. The component-structure of a given wff is that of a tree. As in the sentential case, we can write wffs as trees, or even identify them with trees. The following is a wff whose subformulas correspond to the sub-trees that issue from the nodes. The nodes are numbered according to our old numbering rule (cf. 4.2.2, page 122). The main operants of the non-atomic subformulas are encircled. On the right-hand side is the graphic representation, with nodes labeled either by operants or by atomic wffs.
1.   ∀x [(∃y ∃z S(x, y, z)) ∨ ∀u P(u)]
2.   (∃y ∃z S(x, y, z)) ∨ ∀u P(u)
3.1  ∃y ∃z S(x, y, z)
3.2  ∀u P(u)
4.1  ∃z S(x, y, z)
4.2  P(u)   (atomic wff)
5.1  S(x, y, z)   (atomic wff)
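The component-structure can also be computed. The following Python sketch is an illustration under our own assumptions: wffs are encoded as nested tuples, and the names immediate_components and components are hypothetical. It lists the subformulas of the wff displayed above, starting with the wff itself.

```python
# Hypothetical encoding (an assumption):
#   ("atom", "S", ("x", "y", "z"))   atomic wff
#   ("not", A), ("and" | "or" | "imp" | "iff", A, B)
#   ("all", "x", A), ("ex", "x", A)  generalizations ∀xA and ∃xA

def immediate_components(wff):
    """Atomic wffs have none; Qv α has the single immediate component α;
    a sentential compound has the wffs on which its connective acts."""
    op = wff[0]
    if op == "atom":
        return []
    if op in ("all", "ex"):
        return [wff[2]]
    return list(wff[1:])


def components(wff):
    """The wff itself, its immediate components, their immediate
    components, and so on (one entry per node of the tree)."""
    result = [wff]
    for part in immediate_components(wff):
        result.extend(components(part))
    return result


s_xyz = ("atom", "S", ("x", "y", "z"))
p_u = ("atom", "P", ("u",))
example = ("all", "x",
           ("or", ("ex", "y", ("ex", "z", s_xyz)), ("all", "u", p_u)))

subformulas = components(example)
proper_subformulas = [c for c in subformulas if c != example]
print(len(subformulas), len(proper_subformulas))
```

The list has one entry for each node of the tree, so that a subformula with several occurrences would be listed several times.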
As in the case of sentential logic (cf. 2.3.0), we often omit the word ‘occurrence’. For example ‘the second ∀’ means the second occurrence of ∀, ‘the first ∃v’ means the first occurrence of ∃v, etc. The same systematic ambiguity applies to other particles and constructs: variables (‘the first v’), individual constants (‘the second a’), wffs (‘the first P(a)’) etc.
Nested Quantifiers: Quantifiers are said to be nested if one is within the scope of the other. In the last example, ∀x and ∃y are nested. A sequence of nested quantifiers is a sequence in which the second is in the scope of the first, the third within the scope of the second, and so on. In the last example the following are sequences of nested quantifiers:
∀x, ∃y, ∃z
∀x, ∀u
On the other hand, ∃y and ∀u are not nested.
(To be precise we should speak of quantifier-occurrences, because the same quantifier (with the same variable) can occur more than once.)
Grouping Conventions for Quantifier-Operants: The grouping convention for negation is extended to all monadic operants: Every monadic operant-name binds more strongly than every binary operant-name. This means, for example, that ∃vα ∧ β is to be read as (∃vα) ∧ β. To include β within the scope of ∃v, write: ∃v(α ∧ β)
8.2.1 Bound and Free Variables
An occurrence of a variable v in a wff α is bound if it is (i) within the scope of a quantifier-operant Qv, or (ii) the occurrence of v in the pair Qv. An occurrence of a variable is free if it is not bound. A variable that has a free occurrence in α is said to be free in α, or a free variable of α. A variable that has a bound occurrence in α is said to be bound in α, or a bound variable of α. Examples:
S(x, y, b) ∨ ¬R(y, x) : All variable occurrences are free.
∀xS(x, y, b) ∨ ¬∃xR(y, x) : All occurrences of x are bound, all occurrences of y are free.
∀x[∃yS(x, y, b) ∨ ¬R(y, x)] : All occurrences of x are bound, and so are the occurrences of y in ∃y and under S. The last occurrence of y is free (it is not within the scope of ∃y).
As the last example shows, a variable can have several occurrences, of which one or more are bound and one or more are free. Such a variable is both free and bound in α. Whether an occurrence is free depends on the wff. The same variable-occurrence which is free in one wff can be bound in a larger wff that contains the first as a component. For example, the occurrence of y in R(x, y) is free, but it is bound in the larger ∀yR(x, y). The x in ∀yR(x, y) is free in that wff, but is bound in ∃x∀yR(x, y).
The Binding Quantifier: An occurrence of a quantifier-operant Qv is said to bind, and also to capture, all the free occurrences of v in its scope; these latter are said to be bound, or captured, by the Qv.
As is usual, we apply the terminology to the quantifier itself: we speak of an occurrence of a quantifier as binding, or capturing, the variables that occur free in its scope. Among the occurrences that are bound by (an occurrence of) Q, we also include the occurrence of v in the pair Qv. It is not difficult to see that, for any wff α, every bound occurrence of v is bound by a unique occurrence of some quantifier. If the v occurs in Qv, then it is bound by that occurrence of Q. Otherwise, it is in some subformula, Qv β, such that it is free in β; it is then bound by that Q. (The uniqueness is guaranteed by unique readability.) An occurrence of v can belong to the scopes of several Qv’s (e.g., the last occurrence of v, in ∀v[P(v) → ∃vR(a, v)] .) Among these, the Qv with the smallest scope is the one that binds it. In the following illustrations the bindings are indicated by connecting lines.
∀ x [S(x, y) ∨ ∃ y S(x, y)]
∀ x [P(x) → (S(x, y)] ∨ ∃ y ∀ x S(x, y))
[Figure: connecting lines indicating the bindings.]
The significance of free and bound occurrences is explained in the next subsection.
Individual Constants: It is convenient to extend the classification of free and bound occurrences to individual constants. All occurrences of individual constants are defined to be free. Thus, an occurrence of a term is free iff it is either an occurrence of an individual constant, or a free occurrence of an individual variable.
Sentences: A wff is defined to be a sentence just when it has no free variables. (In some terminologies the term ‘open sentence’ is used for wffs with one or more free variables, and ‘closed sentence’ for sentences.) The wffs (1)-(4) at the beginning of the chapter are sentences. The definition of sentences in PC∗0 (given in 7.4) is a particular case of the present definition: All variable-occurrences in wffs of PC∗0 are free, since PC∗0 has no quantifiers. Hence a wff of PC∗0 is a sentence, according to the present definition, just when it contains no variables.
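The free and the bound variables of a wff can be computed by following the inductive definition of wffs. The following Python sketch is an illustration under our own assumptions: the tuple encoding, the set VARIABLES and the function names free_variables, bound_variables and is_sentence are hypothetical. It is applied to the three examples given above and to ∃x∀yR(x, y).

```python
# Hypothetical encoding (an assumption): ("atom", "S", ("x", "y", "b")),
# ("not", A), ("and" | "or" | "imp" | "iff", A, B), ("all" | "ex", "x", A).
VARIABLES = {"u", "v", "w", "x", "y", "z"}  # assumed names of variables


def free_variables(wff):
    """Variables having at least one free occurrence in wff."""
    op = wff[0]
    if op == "atom":
        return {t for t in wff[2] if t in VARIABLES}
    if op in ("all", "ex"):
        # Qv captures the free occurrences of v in its scope.
        return free_variables(wff[2]) - {wff[1]}
    return set().union(*(free_variables(part) for part in wff[1:]))


def bound_variables(wff):
    """Variables having at least one bound occurrence in wff; the
    occurrence of v in the pair Qv itself counts as bound."""
    op = wff[0]
    if op == "atom":
        return set()
    if op in ("all", "ex"):
        return {wff[1]} | bound_variables(wff[2])
    return set().union(*(bound_variables(part) for part in wff[1:]))


def is_sentence(wff):
    """A wff is a sentence just when it has no free variables."""
    return not free_variables(wff)


s = ("atom", "S", ("x", "y", "b"))
r = ("atom", "R", ("y", "x"))
e1 = ("or", s, ("not", r))                          # S(x,y,b) ∨ ¬R(y,x)
e2 = ("or", ("all", "x", s), ("not", ("ex", "x", r)))  # ∀xS(x,y,b) ∨ ¬∃xR(y,x)
e3 = ("all", "x", ("or", ("ex", "y", s), ("not", r)))  # ∀x[∃yS(x,y,b) ∨ ¬R(y,x)]
e4 = ("ex", "x", ("all", "y", ("atom", "R", ("x", "y"))))  # ∃x∀yR(x,y)

for wff in (e1, e2, e3, e4):
    print(sorted(free_variables(wff)), sorted(bound_variables(wff)),
          is_sentence(wff))
```

In the third example y appears in both sets, in agreement with the remark that a variable can be both free and bound in the same wff.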
8.2.2 More on the Semantics
The syntactic distinction between free and bound occurrences has a clear and crucial semantic significance. Consider the wff
P(v)
The interpretation of the language does not determine the formula’s truth-value, because the interpretation does not correlate particular objects with variables, as it does with the individual constants. In order to get a truth-value we need, in addition to the interpretation, to assign some object to v. We therefore introduce assignments of values to variables. The truth of P(v) is relative to an assignment that assigns a value to v. The wff P(v) gets T iff the assigned value is in the set that interprets P( ). If P( ) is interpreted as the set of all people, then P(v) is true, under the assignment that assigns Juno to v, iff Juno is a person. On the other hand the interpretation by itself determines the truth-values of
∀vP(v) and ∃vP(v)
The first is true iff all the objects in the universe of the interpretation are in the set denoted by P (which means that the set is the whole universe). The second is true iff some object is in this set (which means that the set is not empty). You can, if you wish, assign a value to v. But this value will have no effect on the truth-values of the last two wffs. Changing the free variable in P(v) results in a non-equivalent formula:
P(v) is not logically equivalent to P(u),
because, if P is interpreted as a set that is neither the whole universe nor empty, there is an assignment under which the first wff gets T and the second gets F: Assign to v a value in the set and to u a value outside it. On the other hand, changing the bound variable results in a different, but logically equivalent, formula:
∀vP(v) ≡ ∀uP(u)
∃vP(v) ≡ ∃uP(u)
Roughly speaking, the wff P(v) says something about the interpretation of P and the value of v; but the wffs ∀vP(v) and ∃vP(v) are not about the value of v, they are only about the interpretation of P. What we have just observed holds in general. If a wff, α, has free variables, then its truth-value in a given interpretation depends, in general, on the values assigned to these variables. But if all the variables are bound, that is, if the wff is a sentence, the truth-value is completely determined by the interpretation of the language. Note: There are wffs, with free variables, which get the same truth-value under all assignments. For example, the truth-value of P(v) → P(v)
is T, for any value of v; nonetheless v is a free variable and the wff is not a sentence. You can think of this wff as defining a function, whose value for each value of v is T; this is different from a sentence, which simply determines a truth-value.
Variable Displaying Notation: We extend the variable displaying notation of 7.4.1 (page 272) to wffs of first-order logic. We shall use
‘α(u)’, ‘β(u, v)’, ‘β′(x, y)’, ‘γ(x, y, z)’, etc.
to denote wffs in which the displayed variables (and possibly others that are not displayed) are free. One of the main points of the notation has to do with substitutions of free variables, to be considered in the next subsection. If v is the only free variable in α(v), you can think of α(v) as saying something about the value of v. It defines the set consisting of all objects whose assignment to v makes α(v) true (under the presupposed interpretation of the language). Similarly, a wff with two free variables defines a binary relation, one with three free variables defines a ternary relation, and so on. (Recall that in cases of arity > 1 we have to stipulate which variable represents which coordinate; cf. 7.3.) For example, if we have in our language the predicates Male( ) and Parent( , ), we can formalize ‘x is a grandfather of y’ as:
Male(x) ∧ ∃ v (Parent(x, v) ∧ Parent(v, y))
Wffs with free variables resemble predicates, but unlike predicates they are not atomic units.
Homework 8.1 Assume an interpreted first-order language, containing ≈ and the predicates: M( ), F( ) and C( , , ), interpreted so that M(x), F(x), and C(x, y, z) read as:
‘x is a human male’, ‘x is a human female’, ‘x is a child of y and z’
Write down wffs that formalize the following. Use the same free variables that are used here with the English. You may introduce shorthand notations; e.g., you can define ‘β1 (x, y)’ to stand for some wff, and then use it as a unit. But write in full unfolded form at least two of the formalizations.
1. x is the mother of y.
2. x is a sister of y.
3. x is an uncle of y.
4. x and y are first cousins.
5. x is y’s nephew.
6. x is y’s maternal grandmother.
7. x is a half-brother of y.
8. x has no sisters.
9. Everyone has a father and a mother. (Use H as a predicate for humans).
10. No one has more than one father.
Repeated Use of Bound Variables: The same variable can be paired in the same wff with several occurrences of quantifiers:
(5) ∀xP(x) → ∀xP′(x)
(5) says that if everything is in the set denoted by P, then everything is in the set denoted by P′. Such quantifiers can even be nested:
(6) ∀v [P(v) → ∀vP(v)]
We can try to read (6) as:
(6′) For every v, the following holds: If v is P, then for every v, v is P.
It may look, or sound, confusing, until you realize that there is no connection between the first v and the second v. To bring the point out, rephrase (6′) as:
(6″) For every v: if v is P, then everything is P.
(And this, it is not difficult to see, says that if something is P then everything is P.) While (6) is a legitimate sentence, one may wish to avoid the repeated use of the same bound variable. This can be easily done by using another variable in the role of the second v. The following is logically equivalent to (6).
(6∗) ∀v [P(v) → ∀uP(u)]
8.2.3 Substitutions of Free and Bound Variables
We denote by ‘$S^{t}_{t'}\alpha$’ the wff obtained from α by substituting every free occurrence of t by t′. (Recall that all occurrences of individual constants are considered free.) We describe the operation as the substitution of free t by t′, or the substitution of t′ for free t. We shall refer to it, in general, as free-term substitution, or for short free substitution. The concept is extended to cover also simultaneous substitutions of several terms:
$$S^{t_1\, t_2 \,\ldots\, t_n}_{t'_1\, t'_2 \,\ldots\, t'_n}\,\alpha$$
is the result of substituting simultaneously every free occurrence of t1 by t′1, every free occurrence of t2 by t′2, and so on.
Example: Let α = ∀u (P(u, v, w) → ∃wP(w, v, c))
All occurrences of u in α are bound, all occurrences of v are free, the first occurrence of w is free and the other two are bound. Consequently, we have:
$$S^{v}_{x}\alpha = \forall u\,(P(u, x, w) \rightarrow \exists w\,P(w, x, c))$$
$$S^{w}_{x}\alpha = \forall u\,(P(u, v, x) \rightarrow \exists w\,P(w, v, c))$$
$$S^{v\,w}_{x\,v}\alpha = \forall u\,(P(u, x, v) \rightarrow \exists w\,P(w, x, c))$$
$$S^{v\,c}_{c\,v}\alpha = \forall u\,(P(u, c, w) \rightarrow \exists w\,P(w, c, v))$$
As in 7.4, we write the wff obtained from β(v) by substituting x for the free v as: β(x). Similarly,
$$S^{x_1\, x_2\, x_3}_{v_1\, v_2\, v_3}\,\alpha(x_1, x_2, x_3) = \alpha(v_1, v_2, v_3)$$
Recall that, in order to avoid inconsistent notation, we have to display all the free variables that may be subject to substitution (cf. 7.4 for details).
Note: If t does not occur freely in α then $S^{t}_{t'}\alpha = \alpha$.
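The substitution example above can be replayed mechanically. The following is only a sketch: the tuple encoding of wffs and the name `subst_free` are assumptions made for illustration, not notation from the text, and the function does not check whether a substitution is legitimate.

```python
# Assumed encoding of wffs as nested tuples (hypothetical, for illustration):
#   atomic wff:  ("P", "u", "v", "w")   predicate name followed by its terms
#   connective:  ("not", A), ("and", A, B), ("or", A, B), ("->", A, B)
#   quantifier:  ("forall", "u", A), ("exists", "w", A)
QUANTIFIERS = {"forall", "exists"}
CONNECTIVES = {"not", "and", "or", "->"}

def subst_free(wff, mapping):
    """Simultaneously replace the free occurrences of each key of `mapping` by its value."""
    op = wff[0]
    if op in QUANTIFIERS:
        _, var, body = wff
        # Inside the scope of this quantifier, occurrences of `var` are bound,
        # so `var` must not be replaced there.
        inner = {old: new for old, new in mapping.items() if old != var}
        return (op, var, subst_free(body, inner))
    if op in CONNECTIVES:
        return (op,) + tuple(subst_free(part, mapping) for part in wff[1:])
    # Atomic wff: replace terms that are keys of the mapping.
    return (op,) + tuple(mapping.get(term, term) for term in wff[1:])

# alpha = ∀u (P(u, v, w) → ∃w P(w, v, c))
alpha = ("forall", "u",
         ("->", ("P", "u", "v", "w"), ("exists", "w", ("P", "w", "v", "c"))))

v_by_x = subst_free(alpha, {"v": "x"})                 # free v by x
vw_by_xv = subst_free(alpha, {"v": "x", "w": "v"})     # free v by x and free w by v
vc_by_cv = subst_free(alpha, {"v": "c", "c": "v"})     # free v by c and c by v
u_by_x = subst_free(alpha, {"u": "x"})                 # u has no free occurrence
```

Because the mapping is applied in a single pass, the substitution is simultaneous: in `vw_by_xv` the v that replaces the free w is not replaced again by x.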
Legitimate Substitutions of Free Terms: The semantic meaning of substituting free terms is the following: The new wff says of the values (or interpretations) of the new terms what the original wff says of the values (or interpretations) of the original ones. For example, let K(x, y) and L(x, y) read, respectively, as ‘x knows y’ and ‘x likes y’, and let γ(v) be: ∀ u (K(v, u) → L(v, u)) γ(v) says that v (or rather the value assigned to v) likes everyone he or she knows. The same thing is said by γ(w) about w. But if we substitute u for the free v we get: ∀ u (K(u, u) → L(u, u))
which has a different syntactic structure and a completely different meaning. It says that everyone who knows himself likes himself. The unintended outcome is due to the fact that the free v is in the scope of ∀u. The u that is substituted for the free v is captured by that quantifier. Such examples motivate the following definition.
The substitution, in a given wff, of free t by t′ is legitimate if no free occurrence of t becomes, after the substitution, a bound occurrence of t′.
Given some wff, we say that free t is substitutable by t′, and also that t is free for t′, if the substitution of free t by t′ is legitimate. This is generalized, in the obvious way, to simultaneous substitutions. Illegitimate substitutions result in wffs. But they change the structure of quantifier-variable bindings; therefore, when it comes to the semantics, the outcome can be unrelated to the original meaning. Henceforth, unless stated otherwise, we use ‘$S^{t}_{t'}\alpha$’, ‘$S^{t\,s}_{t'\,s'}\alpha$’, etc., for legitimate substitutions only. Use of the notation is taken to imply that the substitution is legitimate.
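The definition of a legitimate free substitution can be turned into a small check. This is a sketch under assumptions: wffs are encoded as nested tuples (an illustrative encoding, not the text's), only a single substitution of one term by one variable or constant is handled, and `is_legitimate` is a hypothetical name.

```python
# Assumed encoding (hypothetical): ("K", "v", "u") is atomic,
# ("->", A, B) etc. are connectives, ("forall", "u", A) is a quantified wff.
QUANTIFIERS = {"forall", "exists"}
CONNECTIVES = {"not", "and", "or", "->"}

def is_legitimate(wff, old, new, bound=frozenset()):
    """True if substituting `new` for free `old` captures no occurrence.

    `bound` collects the variables of the quantifiers in whose scope we are.
    """
    op = wff[0]
    if op in QUANTIFIERS:
        _, var, body = wff
        if var == old:
            return True  # below this quantifier `old` has no free occurrence
        return is_legitimate(body, old, new, bound | {var})
    if op in CONNECTIVES:
        return all(is_legitimate(part, old, new, bound) for part in wff[1:])
    # Atomic wff: any occurrence of `old` reached here is free; it would be
    # captured exactly when `new` is bound by an enclosing quantifier.
    return not (old in wff[1:] and new in bound)

# gamma(v) = ∀u (K(v, u) → L(v, u))
gamma = ("forall", "u", ("->", ("K", "v", "u"), ("L", "v", "u")))
ok_w = is_legitimate(gamma, "v", "w")   # w is not captured
ok_u = is_legitimate(gamma, "v", "u")   # u would be captured by ∀u
```

Under this encoding an individual constant never appears as a quantified variable, so substituting a constant for a free variable always passes the check.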
Substitution of Bound Variables
To substitute a bound variable, say u, by x, is to replace all bound occurrences of u by x. We describe this as the substitution of bound u by x, and we refer to the operation as bound-variable substitution. For example, ∀u (K(v, u) → L(v, u)) is transformed into ∀x (K(v, x) → L(v, x)). We can also substitute, at one go, several bound variables: ∀u∃vK(u, v) can be transformed into ∀x∃yK(x, y) and also into ∀v∃uK(v, u).
The result of such a substitution is a logically equivalent formula. (This is proved in the next chapter.) Substitutions of bound variables can be applied locally, in order to change a proper subformula to a logically equivalent one. In this manner (6) is transformed into the logically equivalent (6∗ ). These substitutions can serve to eliminate repeated use of the same bound variable. Using them we can transform any wff into a logically equivalent one, in which different occurrences of quantifiers are always paired with different variables. Bound-variable substitutions can be also used to get an equivalent wff in which no variable is both free and bound. All in all we get wffs that are easier to grasp. Furthermore, bound-variable substitutions can enable free substitutions, which would be otherwise illegitimate. If, for some reason (and such occasions arise), we want to substitute the
free v by u in ∃uL(u, v) we cannot do so, because the u will be captured by the quantifier. But after substituting the bound u by w, we get the logically equivalent ∃wL(w, v) And here the substitution of free v by u is legitimate and we get:
∃wL(w, u).
Legitimate Substitutions of Bound Variables: As in the case of free-term substitutions, substitutions of bound variables can have unintended effects of capturing free occurrences. Consider, for example, our previous γ(v): ∀u (K(v, u) → L(v, u)). If we substitute in it bound u by v we get: ∀v (K(v, v) → L(v, v)), which is not what we intended. Bound occurrences of u have been replaced here by bound occurrences of v, but, in addition, free occurrences of v became bound. A substitution may also transform some bound occurrence to an occurrence that is bound by a different quantifier. For example, if in:
(7) ∀u [P(u) → ∀wR(u, w)]
we substitute bound u by w we get the non-equivalent:
(8) ∀w [P(w) → ∀wR(w, w)]
To see clearly the difference, you can read (7) and (8), respectively, as:
(7′) For every u: if u is P, then u is R-related to everything.
(8′) For every w: if w is P, then everything is R-related to itself.
The trouble here is that the occurrence of u under R, which in (7) is bound by the first ∀, has been transformed into an occurrence of w, which is bound by the second ∀. All of this motivates the following definition.
A substitution of bound variables is legitimate if every free occurrence remains, after the substitution, free, and every bound occurrence is changed to, or remains, bound by the same quantifier.
We can combine the substitution conditions for free and for bound variables into one general condition, which covers all substitutions, including mixed cases where free and bound substitutions are carried out simultaneously.
Legitimate Substitutions in General
A substitution of variables is legitimate if the following holds:
(i) Every free occurrence remains, or is replaced by, a free occurrence;
(ii) free occurrences of the same variable remain, or are replaced by, occurrences of the same variable;
(iii) every bound occurrence remains, or is replaced by, a bound occurrence that is bound by the same quantifier-occurrence.
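For bound-variable substitutions, the requirement that every occurrence keep its binding can be tested by comparing binding structures. The sketch below assumes a nested-tuple encoding of wffs and a hypothetical helper `skeleton`; it numbers the quantifier-occurrences from left to right and replaces each bound occurrence by the number of the quantifier that binds it, leaving free occurrences as they are. A substitution of bound variables passes the test when the skeleton is unchanged.

```python
import itertools

# Assumed encoding (hypothetical): atomic ("R", "u", "w"); connectives
# ("not", A), ("and", A, B), ("or", A, B), ("->", A, B); quantifiers
# ("forall", "u", A), ("exists", "u", A).
QUANTIFIERS = {"forall", "exists"}
CONNECTIVES = {"not", "and", "or", "->"}

def skeleton(wff, env=None, counter=None):
    """Replace each bound occurrence by the index of its binding quantifier."""
    env = {} if env is None else env
    counter = itertools.count() if counter is None else counter
    op = wff[0]
    if op in QUANTIFIERS:
        _, var, body = wff
        index = next(counter)  # number of this quantifier-occurrence
        return (op, index, skeleton(body, {**env, var: index}, counter))
    if op in CONNECTIVES:
        return (op,) + tuple(skeleton(part, env, counter) for part in wff[1:])
    # Atomic wff: bound variables become indices, free terms stay as names.
    return (op,) + tuple(env.get(term, term) for term in wff[1:])

wff6 = ("forall", "v", ("->", ("P", "v"), ("forall", "v", ("P", "v"))))
wff6_star = ("forall", "v", ("->", ("P", "v"), ("forall", "u", ("P", "u"))))
wff7 = ("forall", "u", ("->", ("P", "u"), ("forall", "w", ("R", "u", "w"))))
wff8 = ("forall", "w", ("->", ("P", "w"), ("forall", "w", ("R", "w", "w"))))

same_6 = skeleton(wff6) == skeleton(wff6_star)   # (6) versus (6*)
same_78 = skeleton(wff7) == skeleton(wff8)       # (7) versus (8)
```

The comparison covers only substitutions of bound variables; mixed cases would also need condition (ii) on free occurrences, which this sketch does not address.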
Note: We use the notation ‘$S^{t}_{t'}\alpha$’ only for free substitutions. We will not need a special notation for bound ones.
Homework 8.2 Let α be the wff: ∀uR(u, v) → ∃v [S(v, u, c) ∨ ∀wR(v, w)]
(i) List, or mark the free occurrences of each of the terms in α. List, or mark all bound occurrences.
(ii) Substitute (legitimately) bound occurrences so as to get a logically equivalent wff in which no variable is both free and bound.
(iii) Construct the following wffs (carry out also the illegitimate substitutions): $S^{u}_{c}\alpha$, $S^{v}_{w}\alpha$, $S^{u\,v}_{v\,u}\alpha$, $S^{u\,c}_{w\,v}\alpha$
(iv) Note which of the substitutions in (iii) is legitimate. In each illegitimate case, change the bound variables so as to get an equivalent wff in which the same substitution is legitimate, then carry this substitution out.
8.2.4 First-Order Languages with Function Symbols
For many purposes it is convenient to include in the non-logical vocabulary an additional category: function symbols. Each function symbol comes with its associated number of places. Assume, for example, that the language is supposed to be interpreted in a universe consisting of numbers. It would be convenient to include two-place function symbols that denote addition and multiplication: sum( , ) and prd( , ).
If the numerals 1, 2, 3, ... are names of the numbers 1, 2, 3, ..., then, under this interpretation, sum(5, 3) denotes 8, and prd(4, 3) denotes 12. In this interpretation the following sentences are true:
sum(7, 2) ≈ 9
prd(sum(3, 2), 6) ≈ sum(prd(3, 6), sum(9, 3))
Ordinary mathematical notation is similar, except that the familiar addition and multiplication signs are used in infix notation (the function name appears between the argument places). If + and · are the formal-language symbols, and we rewrite +(x, y) and ·(x, y) as x+y and x·y, then the last sentences become:
7+2 ≈ 9
(3+2)·6 ≈ (3·6)+(9+3)
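The two sentences can be checked by computing what each constant term denotes. This is a minimal sketch: terms are encoded as nested tuples and Python integers stand in for the numerals, both of which are assumptions made for illustration.

```python
import operator

# Interpretation of the two function symbols over the numbers (assumed encoding).
FUNCTIONS = {"sum": operator.add, "prd": operator.mul}

def value(term):
    """Denotation of a constant term such as ("prd", ("sum", 3, 2), 6)."""
    if isinstance(term, int):
        return term  # a numeral denotes the corresponding number
    name, *args = term
    return FUNCTIONS[name](*(value(arg) for arg in args))

left = ("prd", ("sum", 3, 2), 6)               # prd(sum(3, 2), 6)
right = ("sum", ("prd", 3, 6), ("sum", 9, 3))  # sum(prd(3, 6), sum(9, 3))

first_sentence_true = value(("sum", 7, 2)) == 9
second_sentence_true = value(left) == value(right)
```

Both sides of the second sentence evaluate to 30, so the equality holds under this interpretation.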
In general, the non-logical vocabulary of such a first-order language contains, in addition to individual constants and predicates, function symbols: f, g, h, f′, g′, h′, etc., each with its associated number of places. The definition of term is extended accordingly:
(T1) Every individual variable and every individual constant is a term.
(T2) If f is an n-ary function symbol and t1, . . . , tn are terms, then f(t1, t2, . . . , tn) is a term.
The set of all terms is obtained by applying (T2) recursively, starting from the terms given by (T1). For example, the following are generated via (T1) and (T2), hence they are terms: u, v, a, b, f(u), g(a, v), g(g(a, v), f(u)), h(f(u), b, g(g(a, v), a)), etc.
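The recursive definition of terms can be read as a recognizer. In this sketch the vocabulary (variables u, v; constants a, b; f one-place, g two-place, h three-place) is an assumption chosen to match the examples, and the tuple encoding of compound terms is likewise only illustrative.

```python
# Assumed vocabulary, chosen to fit the example terms in the text.
VARIABLES = {"u", "v"}
CONSTANTS = {"a", "b"}
ARITY = {"f": 1, "g": 2, "h": 3}

def is_term(t):
    """Decide whether `t` is a term according to (T1) and (T2)."""
    if isinstance(t, str):
        # (T1): individual variables and individual constants are terms.
        return t in VARIABLES or t in CONSTANTS
    if isinstance(t, tuple) and t and isinstance(t[0], str) and t[0] in ARITY:
        # (T2): an n-ary function symbol applied to n terms is a term.
        name, *args = t
        return len(args) == ARITY[name] and all(is_term(arg) for arg in args)
    return False

nested = ("h", ("f", "u"), "b", ("g", ("g", "a", "v"), "a"))  # h(f(u), b, g(g(a, v), a))
wrong_arity = ("g", "v")                                      # g needs two arguments
```

A function symbol with the wrong number of arguments, such as `wrong_arity`, is rejected because (T2) applies only when the number of argument terms equals the symbol's number of places.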
The definitions of atomic wffs and of wffs are the same as before (cf. 7.4.0 page 270), but now the terms referred to in these definitions can be highly complex structures. The following, for example, are wffs. Here P and R are, respectively, a monadic and a binary predicate, and f and g are, respectively, a one-place and a two-place function symbol.
R(g(u, v), a)
∀u∀v [f(u) ≈ f(v) → u ≈ v]
∀v [P(g(f(v), b)) → ∃u (v ≈ f(u))]
The definitions of free and bound occurrences of individual variables are adjusted in the obvious way. If a variable occurs in a term t (including the case where t is this variable), then all its occurrences are free in t. The free occurrences of a variable in an atomic wff, α, are its occurrences in the terms that occur in α. An occurrence remains free as long as it is not captured by a quantifier. A term is constant if it does not contain variables.
Any interpretation of the language interprets, in addition to the predicates and the individual constants, the function symbols. If U is the universe of the interpretation, then an n-place function symbol is interpreted as an n-place function over U, whose values are in U. The function symbol denotes this function. In other words, an n-place f denotes an n-place function that maps every n-tuple of members of U to a member of U. Once the interpretation is given, every constant term gets a value in the interpretation’s universe. This value is obtained by applying the functions denoted by function symbols to the objects denoted by the individual constants. The value is what the constant term denotes, under the given interpretation. Terms that contain free variables do not denote particular objects. Their denotations depend on assignments of values to the individual variables occurring in them (just as the truth-values of wffs with free variables depend on such assignments). These terms define, in the given interpretation, functions (just as wffs with free variables define sets and relations).
Peano’s Arithmetic
The first-order language of Peano’s arithmetic contains, besides equality, the individual constant 0, a one-place function symbol s, and two-place function symbols + and ·. The standard interpretation of the language is known as the standard model of natural numbers. That is:
• The universe is the set {0, 1, . . .} of natural numbers.
• 0 denotes the number 0.
• s denotes the function that maps each number n to n + 1.
• + and · denote, respectively, the addition and the multiplication functions.
Using infix notation for the function symbols, we have:
s(s(v)) defines the function that maps each n to n + 2.
v·v defines the function that maps each n to n².
u·(v+s(s(0))) defines the function that maps each pair (m, n) to m · (n + 2). Here we assumed that u marks the first argument, v the second.
[((u+v)·(u+v))·(u+v)]+s(0) defines the function that maps each (m, n) to (m + n)³ + 1.
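How a term with free variables defines a function can be seen by evaluating it under different assignments in the standard model. The following sketch assumes a nested-tuple encoding of terms, with `"*"` standing for · and the string `"0"` for the constant 0; `evaluate` is a hypothetical helper.

```python
def evaluate(term, assignment):
    """Value of a term of Peano's arithmetic in the standard model."""
    if term == "0":
        return 0                      # the constant 0 denotes the number 0
    if isinstance(term, str):
        return assignment[term]       # a variable gets its assigned value
    op, *args = term
    values = [evaluate(arg, assignment) for arg in args]
    if op == "s":
        return values[0] + 1          # successor
    if op == "+":
        return values[0] + values[1]
    if op == "*":
        return values[0] * values[1]
    raise ValueError(f"unknown function symbol: {op!r}")

two_more = ("s", ("s", "v"))                         # s(s(v))
square = ("*", "v", "v")                             # v·v
product = ("*", "u", ("+", "v", ("s", ("s", "0"))))  # u·(v+s(s(0)))
u_plus_v = ("+", "u", "v")
cube_plus_one = ("+", ("*", ("*", u_plus_v, u_plus_v), u_plus_v), ("s", "0"))

example = evaluate(product, {"u": 3, "v": 4})        # 3 · (4 + 2)
```

Varying the assignment traces out the function the term defines; for instance, `example` is the value of the third term at the pair (3, 4).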
Given the associativity of addition and multiplication ((x + y) + z = x + (y + z) and (x · y) · z = x · (y · z)), we can ignore groupings in iterated +, or in iterated ·. Here of course we assume the standard interpretation. In general, the groupings cannot be ignored. Many properties and relations of natural numbers are expressible in Peano’s arithmetic. Here is a small sample. For convenience, we use ‘x’, ‘y’ and ‘z’ as variables of the formal language, as well as variables of our own mathematical-English discourse.
∃v [x+v ≈ y] : x ≤ y.
∃v [x+s(v) ≈ y] : x < y.
∃v [x·v ≈ y] : x is a divisor of y.
∀u∀v [(u·v ≈ x) → (u ≈ s(0) ∨ u ≈ x)] : x is a prime number.
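The four defining wffs can be tried out on small numbers. The sketch below replaces the unbounded quantifiers by searches over a finite range; this is an assumption that is safe here only because any witness or counterexample for these particular wffs is at most max(x, y) + 2. The function names are hypothetical.

```python
def leq(x, y):
    """∃v [x+v ≈ y], with v searched up to y."""
    return any(x + v == y for v in range(y + 1))

def less(x, y):
    """∃v [x+s(v) ≈ y], with v searched up to y."""
    return any(x + (v + 1) == y for v in range(y + 1))

def divides(x, y):
    """∃v [x·v ≈ y], with v searched up to y."""
    return any(x * v == y for v in range(y + 1))

def prime_wff(x):
    """∀u∀v [(u·v ≈ x) → (u ≈ s(0) ∨ u ≈ x)], with u, v searched up to x + 2."""
    bound = x + 2
    return all(u * v != x or u == 1 or u == x
               for u in range(bound + 1) for v in range(bound + 1))

satisfying = [x for x in range(12) if prime_wff(x)]
```

Read literally, the last wff is also satisfied by x = 1, so `satisfying` starts with 1; a further conjunct excluding s(0) would be needed if 1 is not to count as prime.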
The following sentences are true, when the language is interpreted by the standard model of natural numbers. One may ignore that particular interpretation and consider these sentences as non-logical axioms, which characterize (in some way that we shall not go into) the concept of natural number. We then get what is known as Peano’s axioms. (Actually, they are due to Dedekind; Peano gave them a certain formalization.) The axioms are the universal generalizations of the following wffs; this means that the sentences are obtained by quantifying universally over all the wffs’ free variables, e.g., the first axiom is ∀x [0 ≉ s(x)].
1. 0 ≉ s(x)
2. x ≉ 0 → ∃u [x ≈ s(u)]
3. x+0 ≈ x
4. x+s(y) ≈ s(x+y)
5. x·0 ≈ 0
6. x·s(y) ≈ (x·y)+x
7. {α(0) ∧ ∀v [α(v) → α(s(v))]} → ∀vα(v)
The last, known as the Induction Axiom, is a scheme covering an infinite number of instances. For every wff, α(v), there is a corresponding axiom. It states that if the wff is true for 0, and if for all n, its truth for n implies its truth for n + 1, then the wff is true for all natural numbers.
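Axioms 3 to 6 can be read as recursion equations that determine + and · from the successor function. The sketch below, with the hypothetical names `add` and `mul`, follows that reading and compares the results with ordinary arithmetic on a small range; it illustrates the axioms in the standard model and is not a proof of anything.

```python
def s(n):
    """Successor in the standard model."""
    return n + 1

def add(x, y):
    # Axiom 3: x+0 ≈ x.  Axiom 4: x+s(y) ≈ s(x+y).
    return x if y == 0 else s(add(x, y - 1))

def mul(x, y):
    # Axiom 5: x·0 ≈ 0.  Axiom 6: x·s(y) ≈ (x·y)+x.
    return 0 if y == 0 else add(mul(x, y - 1), x)

SMALL = range(8)

# Axiom 1: 0 is not a successor.  Axiom 2: every non-zero number is a successor.
axiom1 = all(s(x) != 0 for x in SMALL)
axiom2 = all(x == 0 or any(s(u) == x for u in SMALL) for x in SMALL)

agrees = all(add(x, y) == x + y and mul(x, y) == x * y for x in SMALL for y in SMALL)
```

The recursion in `add` and `mul` descends on the second argument, mirroring the way axioms 4 and 6 reduce the case s(y) to the case y.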
8.3 First-Order Quantification in Natural Language
8.3.1 Natural Language and the Use of Variables
Frege was the first to introduce quantifiers and to use them, together with variables, in order to express universal and existential claims. Since a variable can occur any number of times in a wff, and since any assignment gives it the same value on all its occurrences, variables constitute an unequaled device for pointing repeatedly to the same object. In English, which like other natural languages lacks such a device, the effect can be simulated to an extent by anaphora: the use of substitute words to refer to previously mentioned items.
(1) A painter who lived on the lower east side, got, from an anonymous donor who admired his works, a gift of money that helped him finish a painting.
Here ‘his’ and ‘him’ refer back to the painter. If we formalize (1) the repeated references appear as repeated occurrences of the same variable. Such repetitions correspond also to relative pronouns: ‘who’ and ‘that’:
(1′) For some x, y, z: x was a (male) painter, and x lived on the lower east side, and y admired the works of x, and x got z from y (and x did not know who sent z to x), and z was a gift of money, and z helped x to finish a painting.
We can continue (1) in variable-free English, by using ‘the painter’, ‘the donor’, and ‘the gift’ throughout. But when there are no distinguishing marks and the statement has some combinatorial complexity, we must resort to variables.
(2) If x, y, and z are different numbers, such that x is not between y and z, and y is not between x and z, then either x is smaller than z or y is smaller than z.
You may try something along the lines of:
(2′) If each of two, of three different numbers, is not between the remaining two, then one of these two is smaller than the third.
But this is not very clear and, besides, ‘the third’ is a variable in disguise.
No wonder that variables have been introduced, either explicitly or in disguise (‘the first’ ‘the second’ etc.), from the very beginning of logical studies; and they have been extensively used in ancient mathematics.
Still, a lot can be achieved by anaphora. Consider, for example, the following sentence in FOL, where L(x, y) reads as ‘x likes y’.
(3) ∃x∀y [L(x, y) → (L(y, x) ∨ ∃z(L(x, z)∧L(y, z)))]
Though it is not at all obvious, (3) can be rendered in variable-free English:
(3′) Someone likes only those who like either him, or someone he likes.
(The masculine pronouns have been used in a gender-neutral role; you can substitute ‘him’ by ‘him or her’ and ‘he’ by ‘he or she’; the result is grammatical but unwieldy.) Natural language can provide surprising solutions.
Homework 8.3 Recast the following in variable-free English. In 1., read L(x, y) as ‘x likes y’. In 2., you are allowed one use of ‘the other’, or a similar expression, to refer to a certain number.
1. ∃x∀y [L(x, y) → [L(y, x) ∨ ∃z(L(x, z)∧L(y, z) ∨ L(z, y)∧L(x, z)∧L(z, x))]]
2. Of three numbers, if x is smaller than y, and y is smaller than z, then x is smaller than z.
3. If x, y and z are three different numbers then either x is between y and z, or y is between x and z, or z is between x and y.
Concerning Gender
The interpretation of certain masculine forms depends on the context. The saying All men are mortal, which goes back a long time before our gender-troubled grammar, attributes mortality to all human beings, not only to males. The same goes for masculine pronouns, which can refer on occasions to persons in general, male and female. If there are no indications to the contrary, you may plausibly assume that in the following sentences the quantification is intended to cover all human beings:
(4) Someone in the room loves himself.
(5) Everyone should pay his taxes.
(In an appropriate background, the same might go for: ‘Someone in the room loves herself’.) Whether or not such uses are desirable is a delicate question, which I shall not risk addressing in a logic book.
8.3.2 Some Basic Forms of Quantification
First-order quantification is expressible in natural language in a great variety of ways, each with its own implied meanings and peculiarities of usage. The fit of this motley system with the simple FOL grid is bound to be far from perfect. Here we shall only point out basic patterns. In English, generalizations, especially universal ones, fall most commonly under the following scheme:
(S1) Quantifier Term - Common-Noun Phrase - Verb Phrase
Here are two examples, one universal and one existential, that fall under (S1).
(4) Every word he put on paper was subject to careful considerations.
[Every] [word he put on paper] [was subject to careful considerations.]
(5) Some girl who lives in New York grows beautiful tulips.
[Some] [girl who lives in New York] [grows beautiful tulips.]
In FOL both the common-noun phrase and the verb phrase are represented by wffs. The verb phrase can be derived, like any predicate, either from a common name or from an adjective or from a verb. The two wffs share a common free variable, which is the variable used with the quantifier. Here is what the formalizations of (4) and (5) will look like.
(4′) ∀x {x is a word he put on paper → x was subject to careful considerations}
(5′) ∃x {x is a girl who lives in New York ∧ x grows beautiful tulips}
The words ‘everyone’ and ‘someone’ can be rephrased as ‘every human’ and ‘some human’, and then the generalized sentence comes under (S1): Everyone in the room is smiling becomes Every human in the room is smiling.
[Every] [human in the room] [is smiling.]
More of this later.
The quantifications that fall under (S1) are of the following forms. Here ‘every’ and ‘some’ are the quantifier terms, ‘Y’ is the common-noun phrase and the rest: ‘... Z’ is the verb phrase, where ‘...’ can stand for ‘is’ (or, as indicated in the brackets, ‘is a’, or some other suitable verb). Next to them are written their FOL formalizations, where α(x) formalizes ‘x is a Y’, and β(x) formalizes ‘x is [is a, or a verb] Z’.
Every Y is [is a/verb] Z : ∀x (α(x) → β(x))
Some Y is [is a/verb] Z : ∃x (α(x) ∧ β(x))
Semantic considerations show why this is the correct formalization. ‘Every Y is a Z’ means that everything that is a Y is a Z. We can state this by saying: For every x, if x is a Y then x is a Z, or more formally: ∀ x [x is a Y → x is a Z]. Therefore the two parts are linked by a conditional, although there is no ‘if...then ’ in the original English. If I say, for example, ‘Everyone in the room is smiling’, then I have made an assertion about the people in the room. My assertion is not falsified if somebody who is not in the room is not smiling. We can imagine it as a (possibly infinite) conjunction of all sentences of the form: ‘If a is a person in the room then a is smiling’. If there is no one in the room, the sentence is vacuously true. Similarly, ‘Some Y is a Z’ means that there is something which is a Y and a Z: For some x, x is a Y and x is a Z, or more formally: ∃ x [x is a Y ∧ x is a Z]. Hence we have a conjunction, although there is no ‘and’ in the original English. If I say ‘Someone in the room is smiling’, then what I say is true just in case there is something that is both (i) a person in the room and (ii) smiling. If there is no one in the room, the sentence is false. Watch Out: Since the conditional and the conjunction do not appear in the original English sentence, beginners who go by surface appearance often get the formalization wrong.
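The difference between the two patterns, and the effect of getting the connective wrong, shows up already in a tiny finite universe. In this sketch the universe and the sets `in_room` and `smiling` are made-up data, and the function names are hypothetical.

```python
def every(universe, alpha, beta):
    """∀x (α(x) → β(x)) over a finite universe."""
    return all((not alpha(x)) or beta(x) for x in universe)

def some(universe, alpha, beta):
    """∃x (α(x) ∧ β(x)) over a finite universe."""
    return any(alpha(x) and beta(x) for x in universe)

def some_with_conditional(universe, alpha, beta):
    """A typical beginner's mistake: ∃x (α(x) → β(x))."""
    return any((not alpha(x)) or beta(x) for x in universe)

universe = ["Ann", "Bob"]   # made-up universe
smiling = {"Ann"}           # Bob is not smiling

def is_smiling(x):
    return x in smiling

def room(occupants):
    """Build the predicate 'x is a person in the room' for a given set of occupants."""
    return lambda x: x in occupants

# Ann is in the room, Bob is outside: Bob does not falsify the universal claim.
everyone_smiles = every(universe, room({"Ann"}), is_smiling)

# Nobody is in the room.
vacuous = every(universe, room(set()), is_smiling)                 # vacuously true
nobody_there = some(universe, room(set()), is_smiling)             # false
mistaken = some_with_conditional(universe, room(set()), is_smiling)  # true, wrongly
```

With an empty room the mistaken existential formalization comes out true although no one in the room is smiling, which is why the conjunction is needed.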
Note: In the case of (4), α(x) and β(x) formalize, respectively: ‘x is a word he put on paper’ and ‘x was subject to careful considerations’, and in the case of (5) they correspond to: ‘x is a girl who lives in New York’ and ‘x grows beautiful tulips’. α and β can be complex wffs that include quantifiers of their own. Cases that involve a different ordering of phrases can still be classified under (S1), for example:
(6) He subjected every candidate to a lengthy interview.
comes out as ∀x (α(x) → β(x)), where α(x) and β(x) formalize, respectively: ‘x was a candidate’ and ‘he subjected x to a lengthy interview’. A second basic scheme applies to existential quantification:
(S2) There - Existential Verb - Common-Noun Phrase
The existential verbs are ‘be’ and ‘exist’. For example:
(7) There is a prime number smaller than 5.
[There] [is] [a prime number smaller than 5.]
Finer points of grammar, such as the presence of the expletive ‘there’, or the indefinite article that is required in the singular form, are not represented here and we shall not dwell on them. With ‘exist’ the existential verb can come, without expletive, at the end: ‘A prime number smaller than 5 exists’. In what follows immediately we shall be concerned with cases that fall under (S1). We shall return to (S2) when we focus on existential quantification.
Relativized Quantifiers
In our present version of FOL, every individual variable ranges over “everything”; that is, over all the objects in the universe of the interpretation. If we want to restrict the quantification to a certain subdomain–say the domain of humans, or animals, or planets, or what have you–we should use a predicate that marks off that subdomain. To restrict quantification to the subdomain determined by P( ), we change the original generalized wffs as follows:
∀x (. . . x . . .) to ∀x (P(x) → . . . x . . .)
∃x (. . . x . . .) to ∃x (P(x) ∧ . . . x . . .)
In mathematical logic this is known as the relativization of the quantifiers to P (or to the subdomain determined by P). When these changes are applied to every subformula of a given wff, γ, we get the relativization of γ to P. What γ says about the universe, the relativization says about the domain that is marked off by P. Instead of the atomic P(x), we can use any wff, α(x), with one free variable; in which case we speak of relativizing the quantifiers to α(x). Quantifications of the forms:
∀x (α(x) → . . .)
∃x (α(x) ∧ . . .)
arise, as we have seen, in formalizations of natural language. In formalizing (4), we might start by: ∀x [x was subject to careful considerations]. But obviously we do not want to say that everything was subject to careful considerations. Hence we relativize the quantifier to things that are words that he (whoever he is) put on paper. This gives us (4′). Similarly, (5′) is obtained from the unrelativized ∃x [x grows beautiful tulips], by relativizing to girls who live in New York. If several quantifiers are involved, the quantified variables may need restrictions to different domains. Hence, as a rule, different α’s will be used to relativize different quantifiers. Certain natural domains, such as the domain of humans, or of dogs, or of stars, are best represented in a first-order language by monadic predicates that are included in the basic vocabulary. Occasional domains, which arise in the context of this or that statement, can be marked off by appropriate wffs. For example, if we have in our language the predicates Girl( ) and LiveIn( , ) and the individual constant NY, we can define the set of girls who live in New York by:
Girl(x) ∧ LiveIn(x, NY)
In which case, (5) will come out as:
(5∗) ∃x [Girl(x) ∧ LiveIn(x, NY) ∧ β(x)]
where β(x) defines those who grow beautiful tulips. Additional predicates will be needed to construct a plausible β(x). Sometimes a fine-grained analysis is not needed. To show that (5) implies that someone grows beautiful tulips, we do not need to know the structure of β(x). We can let β(x) be the atomic wff GBT(x), which stands for ‘x grows beautiful tulips’. But sometimes a deeper analysis is unavoidable; we need to go into β(x)’s details in order to show that (5) implies that someone grows tulips.
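Relativization is a purely syntactic transformation, so it can be sketched as a recursive function on wffs. The nested-tuple encoding, the name `relativize`, and the convention that the relativizing wff is given as a function from a variable to a wff are all assumptions made for illustration; here every quantifier is relativized to the same wff.

```python
# Assumed encoding (hypothetical): atomic ("GBT", "x"); connectives
# ("not", A), ("and", A, B), ("or", A, B), ("->", A, B); quantifiers
# ("forall", "x", A), ("exists", "x", A).
CONNECTIVES = {"not", "and", "or", "->"}

def relativize(wff, alpha):
    """Relativize every quantifier in `wff` to the domain defined by `alpha`.

    `alpha` maps a variable name to the wff that marks off the subdomain.
    """
    op = wff[0]
    if op == "forall":
        _, var, body = wff
        return ("forall", var, ("->", alpha(var), relativize(body, alpha)))
    if op == "exists":
        _, var, body = wff
        return ("exists", var, ("and", alpha(var), relativize(body, alpha)))
    if op in CONNECTIVES:
        return (op,) + tuple(relativize(part, alpha) for part in wff[1:])
    return wff  # atomic wffs contain no quantifiers

def girl_in_ny(x):
    """Girl(x) ∧ LiveIn(x, NY)"""
    return ("and", ("Girl", x), ("LiveIn", x, "NY"))

unrelativized = ("exists", "x", ("GBT", "x"))        # ∃x GBT(x)
relativized = relativize(unrelativized, girl_in_ny)  # ∃x [(Girl(x) ∧ LiveIn(x, NY)) ∧ GBT(x)]
```

When different quantifiers need different domains, as the text notes, a single `alpha` is not enough and each quantifier would have to be treated separately.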
Many-Sorted Languages
Some first-order languages have individual variables of several sorts. An interpretation associates with each sort a domain; all the variables of this sort range over that domain. Quantification is interpreted accordingly. Domains that correspond to sorts need no relativization; we simply use the appropriate variables. For instance, if ξ is a variable that, according to the interpretation, ranges over humans, then ∀x (Human(x) → Mortal(x)) is rewritten as ∀ξ Mortal(ξ). Single-sorted languages and many-sorted ones have the same expressive power. What the latter achieve by use of different sorts the former achieve by relativizing to the corresponding predicates. Many-sorted languages are more convenient in certain contexts of applications; single-sorted ones are simpler when it comes to defining the syntax and the semantics.
Note: Sometimes relativization is not necessary, because the restriction to a specific domain is already implied by other predicates that occur in the formula. Suppose, for example, that K(x, y) reads as ‘x knows y’, and that, by definition, only humans can know.
(8) Some person other than Jack knows Jill,
and
(9) Every person who knows Jill likes her,
can be rendered as:
(8′) ∃x (x ≉ Jack ∧ K(x, Jill))
and
(9′) ∀x (K(x, Jill) → L(x, Jill))
Explicit relativization to the human domain can be dispensed with, because the needed restriction is imposed already by the predicate K. But this does not always work out. You should note how the predicate occurs. For example,
(10) Someone doesn’t know Jill
should be formalized as
(10′) ∃x [H(x) ∧ ¬K(x, Jill)],
where H marks off the domain of humans. Without the conjunct H(x), the formalization will come out true whenever the universe includes non-human objects (can you see why?).
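The role of the conjunct H(x) can be seen in a small made-up model in which every human knows Jill but the universe also contains a non-human object. The universe, the extension of K, and all names below are hypothetical.

```python
# Made-up interpretation: two humans and one non-human object.
universe = ["Jack", "Jill", "pebble"]
human = {"Jack", "Jill"}
knows = {("Jack", "Jill"), ("Jill", "Jill")}  # only humans can know

def K(x, y):
    return (x, y) in knows

def H(x):
    return x in human

# ∃x ¬K(x, Jill): satisfied by the pebble, which knows no one.
without_H = any(not K(x, "Jill") for x in universe)

# ∃x [H(x) ∧ ¬K(x, Jill)]: asks for a human who does not know Jill.
with_H = any(H(x) and not K(x, "Jill") for x in universe)
```

In this model ‘Someone doesn’t know Jill’ is false, and only the relativized version agrees with that.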
8.3.3 Universal Quantification
The universal-quantifier terms in English are the indefinite pronouns: every
all
any
each .
They differ,however in significant aspects, grammatical as well as semantic. For example, ‘all’ requires a complement in the plural, the others–in the singular; this, we shall see, is related to semantic differences. ‘All’ can function as a quantifier-term in ways that the others cannot, ways that do not fall strictly under (S1). It can precede a relative pronoun: (1) All who went never returned. To use the other terms in such a construction, one would have to transform ‘who went’ into a noun phase, e.g., ‘one who went’. Sometimes the terms can be used interchangeably: (2) In this class, every student can pass the test. (3) In this class, all students can pass the test. (4) In this class, any student can pass the test.
(5) In this class, each student can pass the test.
But quite often they cannot:
(6) Not every rabbit likes lettuce.
(7) Not each rabbit likes lettuce.
(8) Not any rabbit likes lettuce.
The second is odd, if not ungrammatical; the third, if accepted, means something different than the first. Or compare:
(9) Each voter cast his ballot.
(10) Any voter cast his ballot.
You can easily come up with many other examples. We shall not go into the intricate differences of various first-order quantifications in English. This is a job for the linguist. In what follows some basic aspects of the two most important universal-quantifier terms, every and all, are discussed. The other two terms, which have peculiarities of their own, are left to the reader. ‘Every’ expresses best the universal quantification of FOL.
(11) Every tiger is striped
states no more and no less than the conjunction of all the sentences ‘... is striped’, where ‘...’ denotes a tiger (assuming that every tiger can be referred to by some expression). But
(12) All tigers are striped
implies some kind of law. This nomological (law-like) dimension is lost when (12) is formalized in FOL. The universal generalizations of FOL are, one might say, material. They state that, as a matter of fact, every object in some class is such and such; whether this is some law, or is a mere accident, does not matter. This does not mean that ‘every’ cannot convey a lawful regularity. When the domain of objects that fall under the generalization is sufficiently large, or sufficiently structured by a rich theory, an ‘every’-statement is naturally taken as an expression of lawfulness. In particular, in mathematics, ‘every’ is used throughout to state the most lawful imaginable regularities.
(13) Every positive integer is a sum of four squares.
We can also apply ‘all’ to accidental, or occasional groups of objects, without nomological implications.
(14) All the students passed the test
and
(15) Every student passed the test
say the same thing. Note however that in (14) the definite article is needed to pick up a particular group. Without it, the statement has a different flavour:
(14′) All students passed the test.
Distributive versus Collective ‘All’
(16) After the meeting all the teachers went to the reception.
(17) After the meeting every teacher went to the reception.
In (16) ‘all’ can be read collectively: all the teachers went together. But ‘every’ must be read distributively, in (17) and in all other cases. Only the distributive reading of (16) can be formalized as a first-order generalization. The collective reading cannot be expressed in FOL, unless we provide ways of treating pluralities as objects. The fact that ‘all’ takes a plural complement conforms to its functioning as something that relates to a plurality as a whole. Sometimes ‘all’ must be read collectively:
(18) All the pieces in this box fit together.
(18) neither implies, nor is implied by the statement that every two pieces fit. The latter can be expressed as a first-order generalization, by using a two-place predicate: ‘x fits together with y’. But (18) must be interpreted as an assertion about a single totality. (You can have a collection of pieces, every two of which can be combined, but which cannot be combined as a whole. Vice versa, a collection can be combined as a whole, though not every two pieces in it fit together.) But in general the two readings are possible and the choice is made according to context and plausibility. Quite often, the statement under the collective reading of ‘all’ implies the one
under the distributive reading. (If all the teachers went together, then also every teacher went.) But sometimes the statements are exclusive: (19) The problem was solved by all the engineers in this department, (20) The problem was solved by every engineer in this department. (One’s contribution to a joint solution does not count as “solving the problem”. ) Note: ‘All’ collects no less, even more, when it precedes a relative pronoun. Coming with ‘that’, it is employed in the singular form and points to a single totality. In (21) All that John did was for the best, ‘all that John did’ refers to the collection of John’s doings. Another example is (22) below. All with Negation (22) All that glitters is not gold. The proverb does not make the false assertion that every object that glitters is not made of gold. (22) is the negation of (23) All that glitters is gold, in which ‘all that glitters’ is read as referring to all glittering objects taken together. (23) states that this totality is made of gold; its negation, (22), says that it is not, i.e. that some of it is not made of gold. (22) can be the negation of (23), only if we treat ‘all that glitters’ as a name of a single entity. By contrast, (24) Every object that glitters is not made of gold is not the negation of (25) Every object that glitters is made of gold. (24) makes a much stronger statement: No glittering object is made of gold. By the same token, under collective ‘all’, (26) All good memories will not be forgotten, but some will, is not logically false. But the analogous statement with ‘every’ is logically false:
(27) Every good memory will not be forgotten, but some will. Solid Compounds ‘Every’ and ‘any’ combine with ‘one’ and ‘body’ to form solid compounds, which can be used to quantify over people: Everyone, Everybody, Anyone, Anybody.
The formalized versions require relativization to the human domain, via an appropriate predicate, unless the presupposed universe consists of humans only. Thus, (28) Everyone is happy at some time. comes out as: (28∗) ∀x[Human(x) → x is happy sometime] where ‘x is happy sometime’ can be further unfolded into a wff involving quantification over times (cf. 8.3.5). Other solid compounds are formed with ‘thing’: Everything, Anything.
Without a qualifying clause, ‘everything’ would seem to express unlimited universal generalization. In fact, it expresses a somewhat vague generality, which is made more precise by context. (29) Everything has a cause. [Does ‘everything’ cover natural numbers? does it cover people?] (30) Everything is for the best. [This, apparently refers to events.] ‘Everything’ is also limited to inanimate objects, unless this is overridden by the context. With an appropriate qualification ‘everything’ can express a precise general statement: (31) From then on, everything he held in his hands turned into gold. It has also a collective reading: (32) Everything here fits together.
8.3.4
Existential Quantification
The existential analogue of ‘every’ or ‘all’ is ‘some’. It is employed in forming existential generalizations under scheme (S1); cf. example (5) in 8.3.2. But when ‘some’ is combined with a common name, we expect the plural: (1) Some dogs have lived more than twenty years, (2) Some firemen have manifested exemplary courage during the earthquake. The singular forms of (1) and (2) are correct, but obsolete. With plural ‘some’, the statement claims the existence of more than one object that has the property in question. There is no problem in formalizing this. Say the property is expressed by γ(x) (e.g., γ(x) formalizes: ‘x is a dog and x has lived more than twenty years’). Then, ∃xγ(x) says that at least one object has the property; ∃x ∃y [γ(x) ∧ γ(y) ∧ x ≉ y] says that more than one object has it. The second sentence, which is much longer, is interesting only in that it shows how ‘more than one’ can be expressed in terms of ‘at least one’. From a logical point of view, the basic natural notion is that of ‘at least one’; the second is a mere derivative. Logicians have therefore tended to interpret the plural form of ‘some’ as ‘at least one’. This makes for a neat symmetry: ‘some’ is to ∃ as ‘every’ is to ∀. Opinions may vary concerning the extent to which this rule-bending (if it is a bending) is acceptable. Assume that somebody asserted (1). Has he made a false claim if it turns out that exactly one dog has lived longer than twenty years? And what would be the verdict in the analogous case of (2)? Note that (1) can be read as an assertion about dogs as a species; and in this perspective, even the existence of a single dog may count as verification. Solid Compounds of ‘Some’ The terms someone, somebody, something
are for existential quantification what ‘everyone’, ‘everybody’ and ‘everything’ are for the universal one. Being in the singular, the problems mentioned above do not arise here. The
first two involve, as the universal ones do, a restriction to the human domain. The third shares with ‘everything’ the same kind of vagueness and the same dependence on context.
‘There Is’, ‘There Exists’, ‘There was’, ‘There will be’
‘There is’, ‘There exists’, and their derivatives occur in the second scheme, (S2), of existential quantification (cf. 8.3.2). In scientific or abstract discourse, the present tense of ‘is’ or ‘exists’ indicates timeless generality, as in (7) of 8.3.2. In other contexts, the truth-value of the statement depends (because of the time-indexicality of the verb ‘is’, or ‘exists’) on the time point referred to by the verb. E.g., (3) below may have different truth-values at different times because the contents of the next room change with time. (3) There is a woman in the next room. Still, we can go ahead and formalize it: (3∗) ∃x [W(x) ∧ In(x, next-room)]. There is no syntactic problem here. We have only to note that the denotations of the indexical elements might change with time, place or other parameters. The interpretation of the formal language is therefore dependent on these parameters. But in each interpretation all predicates and individual constants have fixed denotations. In our case, the denotation of In( , ) is time-dependent: person a is in room b, iff a is in room b now. For simplicity, we have assumed that next-room denotes some fixed room. But this, as well, may be relative to time, and also to place: next-room denotes the room that is now next to here. On the other hand, W can be interpreted as the set of all past, present and future women, and this, it is easy to see, does not depend on time. Existential verbs in past and future tenses are to be treated along the same lines. Consider: (5) There was a redheaded painter, (6) There will be a blue cow. It would be a blunder to introduce here, for the purpose of a logical analysis, special time-dependent quantifiers.¹ The past and the future should be treated in the non-logical vocabulary. (5) is handled by introducing a predicate for past humans, (6) by introducing a predicate for future cows:
(5∗) ∃x [PastHuman(x) ∧ Painter(x) ∧ RedHead(x)]
(6∗) ∃x [FutureCow(x) ∧ Blue(x)]
The predicate PastHuman denotes the set of all humans that existed before now. FutureCow denotes the set of all cows that will exist after now. All the other predicates have time-independent interpretations. E.g., RedHead expresses the property of being redheaded, irrespective of time; it is therefore interpreted as the set of all redheaded creatures, past, present and future. More of this in the next subsection. The plural forms: ‘There are’, ‘There exist’, imply the existence of more than one object. The discussion above, concerning ‘some’ in the plural, applies in large measure also here.
¹ Temporal quantification underlies temporal logic. But this logic, which is used for specific purposes, need not enter into basic logical analysis, especially since the full temporal story can be told if we include in our vocabulary the required predicates; cf. 8.3.5.
8.3.5
More on First Order Quantification in English
Generality of Time and Location
Quite a few quantifier terms are used to generalize, either universally or existentially, with respect to time and place. Some of these are compounds based on terms discussed earlier. Here is a list.
Universal Generalization
For Time: whenever, always, anytime.
For Place: wherever, everywhere, anywhere.
Existential Generalization
For Time: sometime, sometimes.
For Place: somewhere.
Temporal generality can be expressed in FOL by including times, or time-points, among the objects over which the variables range. For example, the formalization of (1) Jill is always content comes out as: (1∗ )
∀ v (Time(v) → Cont(Jill, v))
where Time(x) reads as: ‘x is a time-point’, and Cont(x, y) as: ‘x is content at time y’.
The same strategy works in general: we increase the arity of each predicate by adding a time-coordinate. For example, instead of formalizing ‘likes’ as a binary predicate, we formalize it as a ternary one: L(x, y, z) reads as ‘x likes y at time z’.
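As an illustration only, the following sketch evaluates (1∗) in a small finite model in which time-points are among the objects of the universe. The universe, the helper is_time and the sample extension of Cont are assumptions made for this example; they are not part of the text.

```python
# Hypothetical toy universe: two people and four time-points (the integers 0..3).
universe = ["Jill", "Jack", 0, 1, 2, 3]

def is_time(v):
    # Plays the role of the predicate Time(v) in this toy model.
    return isinstance(v, int)

# Cont(x, y): 'x is content at time y', stored as a set of (person, time) pairs.
cont = {("Jill", 0), ("Jill", 1), ("Jill", 2), ("Jill", 3), ("Jack", 1)}

def always_content(person):
    # (1*): for every v, Time(v) -> Cont(person, v)
    return all((not is_time(v)) or ((person, v) in cont) for v in universe)

def sometimes_content(person):
    # Existential counterpart: for some v, Time(v) and Cont(person, v)
    return any(is_time(v) and ((person, v) in cont) for v in universe)

print(always_content("Jill"), always_content("Jack"), sometimes_content("Jack"))
```

Note that the objects that are not time-points satisfy the conditional vacuously, which is why the relativization to Time(v) is harmless.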
Temporal Aspects and Indexicality We have outlined above (cf. 8.3.4) how indexical elements make the interpretation dependent on time, or on other varying parameters. This relates both to universal and existential quantification. We have seen that such dependence should be no obstacle to formalization. The point becomes even clearer if we represent the time-parameter (and possibly others if needed) by an additional coordinate. (2)
Jack likes Jill now, but never liked her before
becomes: (2∗ )
L(Jack, Jill, now) ∧ ∀ u (u ≺ now → ¬L(Jack, Jill, u))
Here ‘now’ is an individual constant denoting the time of the utterance; ≺ is a two-place predicate (written in infix notation) denoting the precedence relation over time-points. Note: Having ‘now’, we can dispense with other time-indexicals. We do not need predicates for past humans or future cows. (6) of 8.3.4 can now be rendered as: ∃x ∃v [now ≺ v ∧ Cow(x, v) ∧ Blue(x, v)] We shall not enter here into the exact nature of our “times”: whether they are like the points of a continuous line, discrete points like the integers, or small stretches. Different contexts may call for different modelings of time (the question has been the subject of investigations among logicians and researchers in artificial intelligence). Generalizations over places, expressed by ‘everywhere’ and ‘somewhere’, can be similarly treated: We include in our universe locations and we add to the relevant predicates an additional location-coordinate. The exact nature of our locations, whether they are points in space, or small regions, depends on context and will not concern us here. Non-Temporal Use of Temporal Terms Terms such as ‘always’, ‘whenever’, ‘sometime’, and others from the list above, can serve in a non-temporal capacity to express quantification in general:
(3) A perfect number is always even. (4) The indefinite article is ‘an’, whenever the name begins with a vowel. (5) Sometimes the same trait is an outcome of different evolutions. This is also true, to a lesser extent, of terms used for locations.
Existential Import Very often, a universal generalization is taken to imply that the domain of objects that are subject to the claim is not empty. Thus, from ‘Every Y is a Z’ one would infer that there are Y ’s. Such an interpretation is said to ascribe existential import to universal quantification. It has a long history that goes back to Aristotle. Existential import seems to be the case in a considerable part of ordinary discourse. One usually infers, from (6) Every girl who saw this puppy was taken with it that some girl saw the puppy. There is no difficulty in formalizing this reading in FOL. We simply add to the wff constructed according to the previous rules: (6∗ ) ∀x(α(x) → β(x)) a conjunct asserting the implied existence (e.g., of a girl who saw the puppy): (6∗∗ ) ∀x(α(x) → β(x)) ∧ ∃xα(x) It is, however, possible to explain this, and similar cases, on the grounds of implicature. Under ordinary circumstances, an assertion of (6) is taken as a sign that the speaker believes, on good grounds, that some girls saw the puppy. Because (6)–interpreted as (6∗ )–would be vacuously true and completely uninteresting if no girl saw the puppy. The argument can be carried further by considering: (7) Everyone who was near the explosion is by now dead. The speaker may assert (7) in complete ignorance as to whether someone was near the explosion. The point is that if (7) is granted, and if we find later that someone was near the explosion, then we can deduce that the person is dead. If it turns out that no one was near the explosion (7) would still be considered true.
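The difference between (6∗) and (6∗∗) shows up only when nothing satisfies α. A minimal sketch, assuming a small finite domain and hypothetical sets standing for ‘was near the explosion’ and ‘is dead’, makes this concrete:

```python
def every(domain, alpha, beta):
    # Reading (6*): for all x, alpha(x) -> beta(x)
    return all((not alpha(x)) or beta(x) for x in domain)

def every_with_import(domain, alpha, beta):
    # Reading (6**): (6*) together with the conjunct 'exists x alpha(x)'
    return every(domain, alpha, beta) and any(alpha(x) for x in domain)

# Hypothetical data: nobody was near the explosion.
people = ["Ann", "Bob", "Cy"]
near_explosion = set()
dead = set()

def near(x):
    return x in near_explosion

def is_dead(x):
    return x in dead

plain = every(people, near, is_dead)                    # vacuously true
with_import = every_with_import(people, near, is_dead)  # false: no witness for alpha
print(plain, with_import)
```

Under the plain reading, (7) comes out true when no one was near the explosion; under the reading with existential import it comes out false.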
Something about ‘Someone’ ‘Someone’ means at least one person. Occasionally, an assertion of ‘some’ is taken to indicate not all, or even exactly one. But such cases can be explained on the grounds of implicature. If the teacher asserts (8) Someone got an A on the last test, the students will probably infer that only one of them got an A. For they assume that the teacher does not withhold relevant information. If several students got an A, he would have used the plural: ‘some of you’, and if all did, he would have used ‘all of you’. But if (8) is asserted by someone who stole a hasty glance at the grade sheet, and the students know this, they will infer only that there was at least one that got an A. Note also that the teacher himself can announce: (9) Someone got an A on the last exam, in fact all of you did, without contradicting himself. Note: ‘Some’ means also a relatively small quantity, as in ‘Some grains of salt got into the coffee’. Read in this way, it is not expressible in FOL.
Generality through Indefinite Articles An indefinite article, by itself, sometimes implies universal generalization. Usually, such statements are intended to express some law-like regularity, an aspect that will be lost in FOL formalization. (10) A bachelor is an unmarried man means: (10′) All bachelors are unmarried men. (11) Whales are mammals means: (11′) All whales are mammals. But often the last form expresses something weaker than strict universal generalization: (12) Birds fly
means something like: In most cases, or in most cases you are likely to encounter, birds fly.
And this kind of statement is outside the scope of FOL. Considerable efforts have been devoted to setting up formalisms in which generalizations of this kind are expressible. A very common variant of (10) and (12) employs the conditional. (13) If a triangle has two equal angles, it has two equal sides means: (13′) Every triangle that has two equal angles has two equal sides. (14) A man is not held in esteem, if he is easily provoked means: (14′) Every man who is easily provoked is not held in esteem.
Generality through Negation (15) No person is indispensable amounts to the negation of ‘Some person is indispensable’: (15′) Every person is not indispensable. In this category we have the very commonly used compounds: nothing, no one, nobody, as well as nowhere.
Generalization through ‘Some’ These cases belong together with (13) and (14) above. ‘Some’ plays here the role of an indefinite article. (16) If someone beats a world record, many people admire him means: (16′) If a person beats a world record many people admire him, which comes to: (16″) Everyone who beats a world record is admired by many people.
And in a similar vein: (17) Someone who is generous is liked really means: (17′) Everyone who is generous is liked. We may even get ambiguous cases where ‘some’ can signify either a universal or an existential quantifier: (18) In this class, someone lazy will fail the test. You can conceivably interpret (18) as stating that, in this class, all the lazy ones will fail the test. You can also read it as a prediction about some unspecified student that the speaker has in mind.
General Advice From the foregoing, you can see some of the tangle of first-order quantification in natural language. Remember that there are no simple clear-cut rules that will enable you to derive, in a mechanical way, correct formalized versions. Conceivably, some algorithm might do this; but it is bound to be a very complex affair. A good way to check whether you have got the formalization right is to consider truth-conditions: Assuming that vagueness, non-denoting terms and other complicating factors have been cleared, do the sentence and its formal translation have the same truth-value in every possible circumstance? This is not the only criterion, but it is a crucial one. In any case, do not follow blindly the grammatical form. You must understand what the sentence says before you formalize it!
8.3.6
Formalization Techniques
When translating from English into FOL, it is often useful (especially for beginners) to proceed stepwise, using intermediary semi-formal rewrites, possibly with variables. When the semi-formal sentence is sufficiently detailed, it translates easily into FOL. Here are some illustrations. The predicates and constants in the final wffs are self-explanatory (H is interpreted as the set of humans). In (1), (2), (5) and (6) more than one logically equivalent wff is given as a possible answer. In some you can trace the equivalence to the familiar: α → (β → γ) ≡ α∧β → γ. In all cases the equivalences follow from FOL equivalence
rules (to be given in chapter 9). But you may try to see for yourself that the different versions say the same thing.
(1) No man is happy unless he likes himself.
(1∗) Every man who is happy likes himself.
(1∗∗) For every man, x: If x is happy, then x likes x.
(1∗∗∗) For every x, if x is a man, then: If x is happy, then x likes x.
(1¦) ∀x [M(x) → (H(x) → L(x, x))]
or: ∀x [M(x) ∧ H(x) → L(x, x)]
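The two versions of (1¦) differ only by the sentential equivalence α → (β → γ) ≡ α∧β → γ. As a quick sanity check (a sketch, not a replacement for the equivalence rules of chapter 9), we can run through all eight truth-value combinations; the helper name implies is our own.

```python
from itertools import product

def implies(p, q):
    # Truth-table of the conditional: false only when p is true and q is false.
    return (not p) or q

# All 8 assignments of truth-values to alpha, beta, gamma.
rows = list(product([True, False], repeat=3))

equivalent = all(
    implies(a, implies(b, c)) == implies(a and b, c)
    for a, b, c in rows
)
print(equivalent)
```

Since the two sides agree on every row, replacing M(x) → (H(x) → L(x, x)) by M(x) ∧ H(x) → L(x, x) inside the quantifier does not change the truth-value under any assignment to x.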
(2) Some man is liked by every woman who likes herself.
(2∗) There is a man who is liked by every woman who likes herself.
(2∗∗) There is a man, x, such that: For every woman y: If y likes y, then y likes x.
(2∗∗∗) There is x, such that x is a man, and for every y, if y is a woman, then: If y likes y, then y likes x.
(2¦) ∃x {M(x) ∧ ∀y [W(y) → (L(y, y) → L(y, x))]}
or: ∃x {M(x) ∧ ∀y [W(y) ∧ L(y, y) → L(y, x)]}
(3) Claire likes somebody.
(3∗) There is a person whom Claire likes.
(3∗∗) There is a person, x, such that: Claire likes x.
(3∗∗∗) There is x, such that x is a person, and Claire likes x.
(3¦) ∃x [H(x) ∧ L(c, x)]
(4) Claire likes a man who does not like her.
(4∗) There is a man whom Claire likes and who does not like Claire.
(4∗∗) There is a man, x, such that: Claire likes x and x does not like Claire.
(4∗∗∗) There is x, such that x is a man, and Claire likes x and x does not like Claire.
(4¦) ∃x [M(x) ∧ L(c, x) ∧ ¬L(x, c)]
(5) Harry likes some women, though not all of them.
Reading ‘some women’ as: “more than one woman”, we get:
(5∗) There are two women whom Harry likes and there is a woman whom Harry does not like.
(5∗∗) There are women x, y, such that: Harry likes x and Harry likes y and x ≠ y, and there is a woman, z, such that: Harry does not like z.
(5∗∗∗) There is x, such that x is a woman, and there is y, such that y is a woman, and Harry likes x and Harry likes y and x ≠ y, and there is z, such that z is a woman, and Harry does not like z.
(5¦) ∃x {W(x) ∧ ∃y [W(y) ∧ L(h, x) ∧ L(h, y) ∧ x ≉ y]} ∧ ∃z [W(z) ∧ ¬L(h, z)]
or: ∃x ∃y {W(x) ∧ W(y) ∧ L(h, x) ∧ L(h, y) ∧ x ≉ y} ∧ ∃z [W(z) ∧ ¬L(h, z)]
Had we read ‘some women’ as ‘some woman’ the final version would have been:
(5′) There is x, such that x is a woman, and Harry likes x, and there is z, such that z is a woman, and Harry does not like z.
(5′¦) ∃x [W(x) ∧ L(h, x)] ∧ ∃z [W(z) ∧ ¬L(h, z)]
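The two readings of (5) can come apart. The sketch below evaluates both final wffs in a hypothetical model in which Harry likes exactly one woman; the domain, the names and the extension of ‘likes’ are invented for the illustration.

```python
def at_least_one(domain, gamma):
    # exists x gamma(x)
    return any(gamma(x) for x in domain)

def more_than_one(domain, gamma):
    # exists x exists y [gamma(x) and gamma(y) and x != y]
    return any(gamma(x) and gamma(y) and x != y for x in domain for y in domain)

# Hypothetical model: Harry likes Ann and no other woman.
domain = ["Harry", "Ann", "Bea", "Cleo"]
women = {"Ann", "Bea", "Cleo"}
likes = {("Harry", "Ann")}

def liked_woman(x):
    return x in women and ("Harry", x) in likes

def unliked_woman(x):
    return x in women and ("Harry", x) not in likes

# Plural reading: more than one liked woman, and some woman not liked.
plural_reading = more_than_one(domain, liked_woman) and at_least_one(domain, unliked_woman)
# Singular reading: at least one liked woman, and some woman not liked.
singular_reading = at_least_one(domain, liked_woman) and at_least_one(domain, unliked_woman)
print(plural_reading, singular_reading)
```

In this model the wff of the plural reading is false while the wff of the singular reading is true, so the choice of reading matters for the truth-conditions.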
(6) No woman likes a man, if he doesn’t like her.
(6∗) Every woman does not like a man, if the man doesn’t like her.
(6∗∗) For every woman, x: For every man y: If y does not like x, then x does not like y. [Or, equivalently: If x likes y, then y likes x.]
(6∗∗∗) For every x, if x is a woman, then: For every y, if y is a man, then: If x likes y, then y likes x.
(6¦) ∀x {W(x) → ∀y [M(y) → (L(x, y) → L(y, x))]}
or: ∀x {W(x) → ∀y [M(y) ∧ L(x, y) → L(y, x)]}
or: ∀x ∀y [W(x) ∧ M(y) ∧ L(x, y) → L(y, x)]
Expressing Uniqueness
A claim of uniqueness is a claim that there is one and only one object satisfying a given property. If the property is expressed by ‘...x...’, then the claim has the form: (1) There is a unique x such that ...x... .
If ‘...x...’ is formalized as α(x), then (1) is expressed in FOL by: (1∗ )
∃ x [α(x) ∧ ∀ y (α(y) → x ≈ y)]
In words: there is x such that: (i) ...x... and (ii) for every y, if ...y..., then y is equal to x. (Here we assume, of course, that α(y) results from α(x) by a legitimate substitution of the free variable. Also y should not occur freely in α(x).) Uniqueness is also expressed by the following logically equivalent wff: (1∗∗ )
∃xα(x) ∧ ∀y∀z[α(y) ∧ α(z) → y ≈ z]
The first conjunct says that there is at least one object satisfying the property; the second, that there is at most one object satisfying it. (Again, we assume that y and z are substitutable for x in α(x), and that they are not free there.) Sometimes (1∗), or (1∗∗), is abbreviated as: ∃!x α(x) which reads: there is a unique x such that α(x).
Homework 8.4 Rephrase the following sentences, using variables. Then formalize them in FOL.
(1) Claire and Edith like the same men. (2) The women who like Jack do not like Harry. (3) Only Ann is liked by Harry and David. (4) David is not happy unless two women like him. (5) Edith is liked by some man who does not like any other woman. (6) Harry likes a woman who likes all happy men. (7) Unless liked by a woman no man is happy. (8) Some man likes all women who like themselves. (9) Every happy man is liked by some happy woman. (10) Ann is liked by every man who likes some woman.
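Returning to the two uniqueness wffs (1∗) and (1∗∗) given earlier in this subsection: their equivalence can be spot-checked by brute force over small finite universes. The sketch below is only a check on universes of up to four elements, not a proof; the property α is represented by its extension, and all function names are our own.

```python
from itertools import combinations

def unique_v1(universe, ext):
    # (1*): exists x [alpha(x) and for all y (alpha(y) -> x = y)]
    return any(
        x in ext and all((y not in ext) or x == y for y in universe)
        for x in universe
    )

def unique_v2(universe, ext):
    # (1**): exists x alpha(x), and for all y, z [alpha(y) and alpha(z) -> y = z]
    at_least_one = any(x in ext for x in universe)
    at_most_one = all(
        not (y in ext and z in ext) or y == z
        for y in universe for z in universe
    )
    return at_least_one and at_most_one

def all_subsets(universe):
    # Every possible extension of a one-place predicate over the universe.
    for k in range(len(universe) + 1):
        for combo in combinations(universe, k):
            yield set(combo)

agree = all(
    unique_v1(u, ext) == unique_v2(u, ext) == (len(ext) == 1)
    for n in range(1, 5)
    for u in [list(range(n))]
    for ext in all_subsets(u)
)
print(agree)
```

The same style of check can be used on your answers to the homework: if two proposed formalizations disagree in some small model, they are not logically equivalent.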
Divide and Conquer It is often useful to formalize separately components of the sentence, which then can be fitted into the global structure. Such components are wffs that can contain free variables. They are obtained from a semi-formal version of the original English sentence. Here are three examples of this divide-and-conquer method. Note how short English sentences can display, upon analysis, an intricate logical structure.
(i) Whoever found John found also somebody else.
For all x, if x is a person and x found John then α(x), where α(x) is a wff saying: x found someone other than John. All in all, the sentence can be written as:
(i′) ∀x [H(x) ∧ Found(x, John) → α(x)]
We now turn our attention to α(x). It can be written as:
∃y (H(y) ∧ y ≉ John ∧ Found(x, y))
Substituting in (i′) we get our final answer:
(i″) ∀x [H(x) ∧ Found(x, John) → ∃y (H(y) ∧ y ≉ John ∧ Found(x, y))]
(ii) Jill owns a dog which is smaller than any other dog.
(ii′) ∃x [Dog(x) ∧ Owns(Jill, x) ∧ α(x)],
where α(x) says: x is smaller than any other dog. We can write it as:
∀y (Dog(y) ∧ y ≉ x → Smaller(x, y))
Substituting we get:
(ii″) ∃x [Dog(x) ∧ Owns(Jill, x) ∧ ∀y (Dog(y) ∧ y ≉ x → Smaller(x, y))]
(iii) Somebody loves a person who is loved by nobody else.
(iii′) ∃x {H(x) ∧ α(x)}
where α(x) says: x loves a person who is not loved by anyone, except x. It can be written as:
∃y [H(y) ∧ Loves(x, y) ∧ β(x, y)]
where β(x, y) says: Every person other than x does not love y. It can be written as:
∀z (H(z) ∧ (z ≉ x) → ¬Loves(z, y))
Therefore, α(x) becomes:
∃y [H(y) ∧ Loves(x, y) ∧ ∀z (H(z) ∧ (z ≉ x) → ¬Loves(z, y))]
Substituting in (iii′), we get:
(iii″) ∃x {H(x) ∧ ∃y [H(y) ∧ Loves(x, y) ∧ ∀z (H(z) ∧ (z ≉ x) → ¬Loves(z, y))]}
Homework 8.5 Formalize the following sentences in FOL. You can use one-letter notations for predicates and individual names; specify what they stand for. Indicate cases of ambiguity and formalize the various readings. 1. One woman in the room knows all the men there. 2. No one in the room knows every person there. 3. Someone in the room does not know any other person. 4. Someone can be admitted to the club only if two club members vouch for him. 5. Bonnie knows a man who hates all club members except her. 6. Bonnie will not attend the party unless some friend of hers does. 7. Bonnie met two persons only one of whom she knew. 8. Abe met two men, one of whom he knew, and the other who knew him. 9. Some women who like Abe do not like any other man. 10. Abe owns a house which no one likes. 11. Abe was bitten by a dog owned by a woman who hates all men. 12. Bonnie knows a man who likes her and no one else. 13. Whoever visited Bonnie knew her and was known to some other club member, except Abe. 14. With the possible exception of Bonnie, no club member is liked by all the rest. 15. With the exception of Bonnie, no club member is liked by all the rest.
8.6 Formalize the following sentences in FOL. Introduce predicates as you find necessary, specifying the interpretations clearly. (For example: GM(x, y) stands for ‘x and y are people and x is good enough to be y’s master’). Try to get a fine-grained formalization. Interpret ‘some’ as ‘at least one’. 1. No one means all he says. 2. Someone says all he means. 3. Each State can have for enemies only other States, and not men. 4. He who is by nature not his own but another’s man, is by nature a slave. 5. No man is good enough to be another man’s master. 6. Those who deny freedom to others deserve it not for themselves. 7. He who cannot give anything away cannot feel anything either. 8. You can fool some of the people all the time, or all the people some of the time, but you cannot fool all people all the time.
Chapter 9
Models for FOL, Satisfaction, Truth and Logical Implication
9.1
Models, Satisfaction and Truth
9.1.0 The interpretation of a first-order language is given as a model. By this we mean a structure of the form: (U, π, δ) in which:
(I) U is a non-empty set, called the model’s universe, or domain. The members of U are also referred to as members of the model.
(II) π is a function that correlates with every predicate, P, of the language a relation, π(P), over U of the same arity as P (if P’s arity is 1, then π(P) is a subset of U). We say that π(P) is the interpretation of P.
(III) δ is a function that correlates with every individual constant, c, of the language a member, δ(c), of U. We say that δ(c) is the denotation of c in the given model. We also speak of it as the interpretation of c.
In set-theoretic terms we can express this by: π(P) ⊆ U^n, where n = arity of P, and δ(c) ∈ U.
There is only one restriction on the interpretation of predicates: If the language has equality, then π(≈) = {(x, x) : x ∈ U}. In words: the equality sign is interpreted as the identity relation over U.
Note: In the case of a language with function symbols (cf. 8.2.4), the mapping π is also defined for the function symbols; it correlates with every n-place function symbol an n-place function from U^n into U. Henceforth we deal with languages based on predicates and individual constants. The extension to function symbols is more or less straightforward.
Notation and Terminology
• We shall use ‘M’, ‘M′’, . . ., ‘M1’, . . . for models.
• ‘|M|’, ‘P^M’, ‘c^M’ denote, respectively, the universe of M, the interpretation of P in M, and the interpretation of c in M. Hence, if M = (U, π, δ) then: |M| = U, P^M = π(P), c^M = δ(c).
• If we assume fixed orderings of the predicates and of the individual constants: P1, P2, . . ., c1, c2, . . ., the model is written by displaying the interpretations in the same order: (U, P1, P2, . . ., c1, c2, . . .) where U = |M|, Pi = Pi^M, cj = cj^M. (If there are no individual constants, the last sequence is simply omitted.) A structure of this form is known also as a relational structure.
• The size of a model M is, by definition, the number of elements in its universe. The model is finite if its universe is a finite set. As observed in 8.2.2, the truth-value of any wff α is determined by: (i) a model M and (ii) an assignment of members in M to α’s free variables. Accordingly, we have to define the truth-value of a wff α, in a model M, under an assignment g of values to α’s free variables. We shall denote this truth-value as: valM α[g] If α is a sentence, its truth-value depends only on the model and we can drop ‘[g]’.
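For finite universes, a model (U, π, δ) can be written down directly as data. The following is a minimal sketch under our own conventions, none of which come from the text: relations are sets of tuples (so a one-place predicate is a set of 1-tuples), and the names Model, preds, consts and identity are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Model:
    universe: set   # U, required to be non-empty
    preds: dict     # pi: predicate name -> relation over U (a set of tuples)
    consts: dict    # delta: constant name -> member of U

    def __post_init__(self):
        if not self.universe:
            raise ValueError("the universe must be non-empty")
        for name, relation in self.preds.items():
            for tup in relation:
                if not all(a in self.universe for a in tup):
                    raise ValueError(f"{name}: {tup} is not a tuple over the universe")
        for name, a in self.consts.items():
            if a not in self.universe:
                raise ValueError(f"{name} does not denote a member of the universe")

    def size(self):
        # The size of a model is the number of elements in its universe.
        return len(self.universe)

def identity(universe):
    # The required interpretation of the equality sign: {(x, x) : x in U}.
    return {(x, x) for x in universe}

U = {1, 2, 3}
m = Model(
    universe=U,
    preds={"M": {(1,), (3,)}, "L": {(1, 2), (2, 2)}, "=": identity(U)},
    consts={"c": 2},
)
print(m.size(), sorted(m.preds["="]))
```

The sketch checks that every relation is over U and that every constant denotes a member of U, but it does not check that a relation's tuples all have the arity of its predicate; that is left to the user.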
Note: The assignment g is neither a part of the language, nor of the model.
The following notations are used for assignments.
(I) If x1, . . . , xn are distinct variables, we use the two-row array

    x1 x2 ... xn
    a1 a2 ... an

to denote the assignment defined over {x1, . . . , xn}, which assigns each xi the value ai. Accordingly, valM α, with this array written inside the square brackets, is the truth-value of α in M under that assignment.
(II) If g is any assignment of values to some variables, then g^x_a is, by definition, the assignment that assigns to x the value a, and to every other variable the value that is assigned to it by g.
Note 1: g^x_a is defined for the following variables: (i) all the variables for which g is defined, (ii) the variable x. To variables different from x, g^x_a and g assign the same values. Whether g is defined for x, or not, does not matter; for in either case g^x_a assigns to x the value a.
Note 2: In order that valM α[g] be defined, g should be defined for all free variables of α. It can be also defined for other variables; but as we shall see, the values given to variables not free in α play no role in determining α’s truth-value.
Note 3: We use ‘assignment’ for a function that correlates members of the universe with variables. Do not confuse this with the truth-value assignment (to be presently defined) which correlates, with each wff α, each model M and each suitable assignment g of members of |M| to variables, the truth-value valM α[g].
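An assignment defined for finitely many variables can be pictured as a dictionary from variable names to members of the universe. The sketch below, with the hypothetical helper name modify, mirrors the definition of the modified assignment in (II) and the two cases distinguished in Note 1.

```python
def modify(g, x, a):
    # Returns the assignment that gives x the value a and agrees with g elsewhere.
    # g itself is left unchanged.
    h = dict(g)
    h[x] = a
    return h

g = {"x": 1, "y": 2}

g1 = modify(g, "x", 7)   # g is defined for x: the old value is overridden
g2 = modify(g, "z", 7)   # g is not defined for z: the domain is extended

print(g1, g2, g)
```

In both cases the result assigns the new value to the chosen variable, whether or not g was defined for it.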
9.1.1
The Truth Definition
valM α[g] is defined inductively, starting with atomic wffs and proceeding to more complex ones. It is convenient to start by assigning values to terms, i.e., to individual variables and constants. The value of a term, t, under g, is denoted as: valM t [g]. It is not a truth-value but a member of M, determined as follows:
If t is the individual constant, c, then valM t [g] = c^M. If t is the variable v, then valM t [g] = g(v). This value is defined iff g is defined for v.
Atomic Wffs
valM P(t1, . . . , tn) [g] = T if (valM t1 [g], . . . , valM tn [g]) ∈ P^M,
valM P(t1, . . . , tn) [g] = F if (valM t1 [g], . . . , valM tn [g]) ∉ P^M.
(For atomic sentences, this coincides with the definition given in 7.1.1.) Note that, since by assumption g is defined for all free variables of P (t1 , . . . , tn ), all the values valM ti [g] are defined.
Sentential Compounds valM ¬α [g] = T if valM α[g] = F,
valM ¬α [g] = F if valM α[g] = T. If is a binary connective, then valM (α β) [g] is obtained from valM α[g] and valM β[g] by the truth-table of . (In the last clause g is defined for all free variables of α β, hence it is defined for the free variables of α and for the free variables of β. )
Universal Quantifier
valM ∀xα [g] = T if, for every a ∈ |M|, valM α[g^x_a] = T,
valM ∀xα [g] = F otherwise.
Existential Quantifier
valM ∃xα [g] = T if, for some a ∈ |M|, valM α[g^x_a] = T,
valM ∃xα [g] = F otherwise.
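For a finite model, the clauses above translate almost line by line into a recursive procedure. The following is a minimal Python sketch, not part of the original text: the tuple encoding of wffs, the dictionary layout of the model, and the names val_term and val are all assumptions made for illustration, and function symbols are left out.

```python
def val_term(t, model, g):
    """Value of a term: a constant is looked up in the model, a variable in g."""
    if t in model["const"]:
        return model["const"][t]
    return g[t]  # raises KeyError if g is not defined for the variable t


def val(wff, model, g):
    """Truth-value (True for T, False for F) of wff in model under assignment g.

    Hypothetical encoding of wffs as nested tuples:
      ("atom", "P", t1, ..., tn), ("not", a), ("and", a, b), ("or", a, b),
      ("imp", a, b), ("iff", a, b), ("all", x, a), ("ex", x, a).
    """
    op = wff[0]
    if op == "atom":
        _, pred, *terms = wff
        values = tuple(val_term(t, model, g) for t in terms)
        return values in model["pred"][pred]
    if op == "not":
        return not val(wff[1], model, g)
    if op in ("and", "or", "imp", "iff"):
        a, b = val(wff[1], model, g), val(wff[2], model, g)
        return {"and": a and b, "or": a or b,
                "imp": (not a) or b, "iff": a == b}[op]
    if op in ("all", "ex"):
        _, x, body = wff
        # {**g, x: a} plays the role of g^x_a: like g, except that x gets a
        results = (val(body, model, {**g, x: a}) for a in model["universe"])
        return all(results) if op == "all" else any(results)
    raise ValueError(f"unknown operator: {op!r}")


# A small hypothetical model: universe {0, 1}, with P interpreted as {0}.
model = {"universe": {0, 1}, "pred": {"P": {(0,)}}, "const": {}}
exists_p = ("ex", "x", ("atom", "P", "x"))
forall_p = ("all", "x", ("atom", "P", "x"))
print(val(exists_p, model, {}))  # True
print(val(forall_p, model, {}))  # False
```

The loop over the universe terminates only when the universe is finite; for an infinite universe the quantifier clauses remain a definition rather than a procedure.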
If the language contains function symbols, then the definition is exactly the same, except that we have to include in the definition of valM t[g] inductive clauses for terms containing function symbols:
9.1. MODELS, SATISFACTION AND TRUTH
valM f(t1, . . . , tn) [g] = fM (valM t1 [g], . . . , valM tn [g]),
where fM is the function that interprets the function-symbol f in M.
A Special Kind of Induction: The truth-value of ∀xα (or of ∃xα) is determined by the truth-values of the simpler wff α, under all possible assignments to the variable x. It is therefore based on simpler cases whose number is, possibly, infinite. This is a special kind of high-powered induction that we have not encountered before.
Note: In the clauses for quantifiers, the variable x need not be free in α. If it is not, then it can be shown that, for each assignment g, the truth-values of α, ∀xα, and ∃xα are the same. Note also that α may contain non-displayed free variables, besides x. Since their values under g and under any g^x_a are the same, these values are fixed parameters in the clause.
Understanding the Logical Particles: The clauses for quantifiers employ the expressions ‘for every’ and ‘for some’. We must understand what these expressions mean in order to grasp the definition. Just so, we should understand ‘and’, ‘either ... or’, and ‘if ..., then’, in order to understand what the truth-tables mean. We say, for example: “The value of any sentence is either T or F”, or “If the value of A is T and the value of B is F, then the value of A ∧ B is F.” First-order logic does not provide us with a substitute for these concepts, but with a systematization that expresses them in rigorous, precise form.
Satisfaction: If valM α[g] = T we say that α is satisfied in M by the assignment g, or that M and g satisfy α. We denote this by:
M |= α[g]
If α has a single free variable, we say that α is satisfied by a (where a ∈ |M|), if α is satisfied by the assignment that assigns a to its free variable. Similarly, if it has two free variables, we say that it is satisfied by the pair (a, b), if it is satisfied by the assignment that assigns a to the first free variable and b to the second. Here, of course, we presuppose some agreed ordering of the variables. If α is not satisfied in M by g (i.e., if valM α[g] = F), we denote this by:
M ⊭ α[g]
Ambiguity of ‘|=’: ‘|=’ is also used to denote logical implication (and logical truth). There is no danger of confusion. The symbol denotes satisfaction if the expression to its left denotes a model; otherwise it denotes logical implication. These uses of ‘|=’ are traditional in logic.
Dependence on Free Variables
It can be proved (by induction on the wff) that valM α[g] depends only on the values assigned by g to the free variables of α. If g and g′ are assignments that assign the same values to all free variables of α, but which may differ otherwise, then
valM α[g] = valM α[g′]
The proof is not difficult but rather tedious; we shall not go into it here. If there are no free variables, i.e., if α is a sentence, the truth-value depends only on the model. We can therefore omit any reference to an assignment, saying (in case of satisfaction) that the sentence is satisfied, or is true, in M, and denoting this as:
M |= α
Similarly, valM α is the truth-value of the sentence α in the model M.
Example
Consider a first-order language based on (i) the binary predicate L, (ii) the monadic predicates H, W and M, (iii) the individual constants c1 and c2. Let their ordering be: L, H, W, M, c1, c2 and let M be the model
(U, L, H, W, M, c, d)
where:
U = {c, d, e, f, g, h}
L consists of the pairs: (c,c), (c,d), (c,f), (d,g), (d,h), (e,e), (e,f), (e,h), (f,c), (f,f), (f,h), (g,c), (g,d), (g,e), (g,g), (h,e), (h,f)
H = {c, e, f, g}
W = {c, d, e}
M = {f, g, h}
To make this more familiar let the six objects c, d, e, f, g, h be people, three women and three men: c = Claire, d = Doris, e = Edith, f = Frank, g = George, h = Harry.
Then, W consists of the women and M of the men. Assume moreover that L is the liking relation over U and that H is the subset of happy people, that is, for all x, y ∈ U:
(x, y) ∈ L iff x likes y,
x ∈ H iff x is happy.
Note that Claire and Doris have names in our language: c1 and c2, but the other people do not. Now let α be the sentence:
∀u [W(u) → ∃v (M(v) ∧ H(v) ∧ L(u, v))]
It is not difficult to see that, given that the interpretation is M, α says:
(1) Every woman (in U) likes some man (in U) who is happy.
Applying the truth-definition to α, we shall now see that the truth of α in the given model is exactly what (1) expresses. In other words:
α is true in M IFF (1)
Obviously α = ∀uβ(u), where
β(u) = W(u) → ∃v (M(v) ∧ H(v) ∧ L(u, v))
Hence M |= α iff for every a ∈ U, M |= β(u)[u/a], where [u/a] denotes the assignment that assigns a to u. Now β is a conditional:
β = W(u) → ∃vγ, where γ = γ(u, v) = M(v) ∧ H(v) ∧ L(u, v)
If a ∉ W then M ⊭ W(u)[u/a] and the antecedent of the conditional gets F, which makes the conditional true. Hence β is satisfied by every assignment [u/a] for which a ∉ W. Therefore, α is true in M iff β is also satisfied by all the other assignments, i.e., by all assignments [u/a] in which a ∈ W. For each of these assignments the antecedent gets T; hence the conditional gets T iff
M |= ∃vγ[u/a]
And this last wff is satisfied iff there exists b ∈ U such that M |= γ(u, v)[u/a v/b]; that is, iff there is b ∈ U such that:
M |= M(v) ∧ H(v) ∧ L(u, v) [u/a v/b]
The last condition simply means that each of the conjuncts is satisfied by [u/a v/b], which means that:
b ∈ M and b ∈ H and (a, b) ∈ L
Summing all this up, we have: α is satisfied in M iff for every a ∈ U, if a ∈ W, there exists b ∈ U, such that b ∈ M and b ∈ H and (a, b) ∈ L. Which can be restated as:
(2) For every a in U, if a is a woman, then there exists b in U, such that b is a man and b is happy and a likes b.
Obviously, (2) is nothing but a detailed rephrasing of (1). The truth-value of α can be found by checking, for every woman in W, whether there is in U a happy man whom she likes. This indeed is the case: c likes f, d likes g, e likes f. Hence, valM α = T.
The same reasoning can be applied to the sentence β:
∃v [M(v) ∧ H(v) ∧ ∀u (W(u) → L(u, v))]
β, it turns out, asserts that there is a happy man who is liked by all women. It gets the value F, because the happy men (in U) are f and g; but f is not liked by d and g is not liked by c.
Homework 9.1
(I) Find the truth-value in the model M of the last example of each of the following sentences. Justify briefly your answers in regard to 6 to 10. Do not go into detailed proofs; justifications for the above α and β can run as follows:
α gets T, because for each x in W, there is a y in M, which is also in H, such that L(x, y): for x = c choose y = f, for x = d choose y = g, and for x = e choose y = f. (Here we used ‘L(x, y)’ as a shorthand for ‘(x, y) ∈ L’.)
β gets F, because there is no x that is in M and in H, such that for all y ∈ W, L(y, x). M ∩ H has two members, f and g; but for x = f, y = d provides a counterexample, and for x = g, y = c provides it.
1. L(c1, c2) ∨ L(c2, c1)
2. L(c1, c2) → L(c2, c1)
3. ∀x (L(c1, x) ∧ M(x) → L(x, c1))
4. ∀x (L(c2, x) ∧ M(x) → L(x, c2))
5. ∀x (L(x, c2) ∧ M(x) → L(c2, x))
6. ∀x [W(x) → ∃y (M(y) ∧ L(x, y) ∧ L(y, x))]
7. ∀x∀y (W(x) ∧ W(y) ∧ x ≉ y → ¬L(x, y))
8. ∀x [W(x) → ∃y (W(y) ∧ L(x, y))]
9. ∀x [W(x) → ∃y (W(y) ∧ L(y, x))]
10. ∀x [H(x) ↔ L(x, x)]
(II) Translate the sentences into correct stylistic English. (This relates to the subject matter of the previous chapter. Do it after answering (I).)
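The truth-values found for α and β in the worked example can also be checked mechanically, since the universe is finite. The snippet below is a minimal Python sketch, not part of the original text; it encodes the model of the example directly as Python sets (the variable names are chosen for illustration) and renders ‘for every’ and ‘for some’ as all(...) and any(...).

```python
# The model of the example: six people, the liking relation L, and the sets H, W, M.
U = set("cdefgh")
L = {tuple(p) for p in
     "cc cd cf dg dh ee ef eh fc ff fh gc gd ge gg he hf".split()}
H = set("cefg")   # happy people
W = set("cde")    # women
M = set("fgh")    # men

# alpha: for every u, if W(u) then there is v with M(v), H(v) and L(u, v)
alpha = all((u not in W) or any(v in M and v in H and (u, v) in L for v in U)
            for u in U)

# beta: there is v with M(v), H(v) such that every u with W(u) satisfies L(u, v)
beta = any(v in M and v in H and all((u not in W) or (u, v) in L for u in U)
           for v in U)

print(alpha)  # True
print(beta)   # False
```

The conditional W(u) → ... is written as "(u not in W) or ...", which is exactly the truth-table reading used in the derivation above.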
9.1.2
Defining Sets and Relations by Wffs
The sets and relations defined by wffs in a given interpretation (cf. 8.1.1) can now be described formally using the concept of satisfaction:
In a given model M, a wff with one free variable defines the set of all members of |M| that satisfy it. A wff with two free variables defines the relation that consists of all pairs (a, b) ∈ |M|² that satisfy it. And a wff with n free variables defines, in a similar way, an n-ary relation over |M|. (For arity n > 1, we have to presuppose a matching of the free variables in the wff with the relation’s coordinates.)
Note: A wff with m free variables can be used to define relations of higher arity in which the additional coordinates are “dummy”: Say, the free variables of α occur among v1, . . . , vn and consider the relation consisting of all tuples (a1, . . . , an) such that
M |= α[v1/a1 v2/a2 . . . vn/an]
The vi’s that are not free in α make for dummy coordinates that have no effect on the tuple’s belonging to the relation.
Examples
Consider a first-order language, based on a two-place predicate R, the equality predicate ≈, and two individual constants: c1 and c2. Let |M| = {0, 1, 2, 3, 4} and let:
R^M = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 4), (2, 2), (4, 1), (4, 3), (4, 4)}
c1^M = 0, c2^M = 3
This model is illustrated below, where an arrow from i to j means that (i, j) is in the relation.
The following obtains.
1. M |= R(c1, x)[x/a] for a = 1, 2, 3 and M ⊭ R(c1, x)[x/a] for all other a’s. Hence, R(c1, x) defines the set {1, 2, 3}.
2. M |= R(y, c2)[y/a] for a = 0, 4 and M ⊭ R(y, c2)[y/a] for all other a’s. Hence, R(y, c2) defines the set {0, 4}.
3. R(x, c1) defines ∅ (the empty set).
4. M |= ∃xR(c1, x), because there is a ∈ |M| (e.g., 1) such that M |= R(c1, x)[x/a].
5. M ⊭ ∀xR(c1, x), because not all a ∈ |M| are such that M |= R(c1, x)[x/a]; e.g., a = 0.
6. Hence, M |= … and a similar argument works for a = 4.
12. ∀x (R(x, y) → R(y, x)) defines the set {0, 4}.
If we read R(u, v) as: ‘u points to v’, then the last wff can be read as: ‘Every member that points to y is pointed to by y’. 0 satisfies it vacuously, because no member points to 0.
13. The wff x ≈ c2 defines the set {3}.
14. The wff ∃y (y ≉ x ∧ R(y, y) ∧ R(y, x)) ∧ ∃zR(x, z) defines the set {1}.
Had we not included y ≉ x in the last wff, both 4 and 2 would have satisfied it (can you see why?). Had we not included the conjunct ∃zR(x, z), 3 would have satisfied it.
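The sets defined by several of the wffs above can be recomputed by testing each member of the universe in turn. The following Python sketch is only an illustration under stated assumptions: each wff with one free variable is hand-translated into a Python predicate, and defined_set is a hypothetical helper name, not notation from the text.

```python
# The model of the examples: universe {0, ..., 4}, the relation R, and the
# interpretations of the constants c1 and c2.
U = range(5)
R = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 4), (2, 2), (4, 1), (4, 3), (4, 4)}
c1, c2 = 0, 3


def defined_set(wff):
    """Set of members of the universe that satisfy a one-free-variable wff."""
    return {a for a in U if wff(a)}


print(defined_set(lambda x: (c1, x) in R))   # R(c1, x)   -> {1, 2, 3}
print(defined_set(lambda y: (y, c2) in R))   # R(y, c2)   -> {0, 4}
print(defined_set(lambda x: (x, c1) in R))   # R(x, c1)   -> set()
# For every x: R(x, y) -> R(y, x)
print(defined_set(lambda y: all((x, y) not in R or (y, x) in R for x in U)))  # {0, 4}


def wff14(x):
    """There is y, different from x, with R(y, y) and R(y, x); and x points to some z."""
    return (any(y != x and (y, y) in R and (y, x) in R for y in U)
            and any((x, z) in R for z in U))


print(defined_set(wff14))                    # {1}
```

Dropping the test y != x from wff14 yields {1, 2, 4}, in agreement with the remark that 4 and 2 would then satisfy the wff.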
Repeated Quantifier Notation: We use
∀x1, x2, . . . , xn α, ∃x1, x2, . . . , xn α
as abbreviations, respectively, for:
∀x1∀x2 . . . ∀xn α, ∃x1∃x2 . . . ∃xn α
Homework 9.2 Consider a language with equality whose non-logical vocabulary consists of: a binary predicate, S, a monadic predicate, P, and one individual constant a. Let Mi, where i = 1, 2, 3, be models for this language with the same universe U:
Mi = (U, Si, Pi, ai)
Assume that U = {1, 2, 3, 4} and that the relations Si and Pi and the object ai (which interpret, respectively, S, P, and a) are as follows:
S1 = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}, P1 = {1, 4}, a1 = 4,
S2 = S1 ∪ {(b, b) : b ∈ U}, P2 = ∅, a2 = 1,
S3 = {(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 2)}, P3 = {2, 3, 4}, a3 = 1.
Find, for each of the following sentences, which of the three models satisfy it. Justify briefly your answers (cf. Homework 9.1). You might find little drawings representing the model, or parts of it, useful; especially in the case of S3.
1. ∃v∀uS(u, v)
2. ∃v∀u (u ≉ v → S(v, u))
3. ∀u, v [S(v, u) → P(v)]
4. ∀u (S(a, u) → P(u))
5. ∀u, v [S(a, u) ∧ S(u, v) → S(a, v)]
6. ∀u, v, w [S(u, v) ∧ S(v, w) → S(u, w)]
7. ∀u [P(u) → ∃vS(u, v)]
8. ∀u, v, w [P(u) ∧ S(u, v) ∧ S(u, w) → v ≈ w]
9.3 Write down (in the list notation) the sets defined by each of the following wffs in each of the models Mi, i = 1, 2, 3 of 9.2.
1. S(a, v)
2. S(v, a)
3. P(v) ∨ v ≈ a
4. P(v) ∧ v ≈ a
5. ∃uS(u, v)
6. ∀uS(v, u)
7. ∀u (P(u) → S(v, u))
8. ∃u1, u2 [u1 ≉ u2 ∧ S(u1, v) ∧ S(v, u2)]
9. ∃u (¬P(u) ∧ u ≉ v ∧ S(u, v))
9.2
Logical Implications in FOL
9.2.0 The scheme that defines logical implication for sentential logic defines it also for FOL: A set of sentences, given as a list Γ, logically implies the sentence α, if there is no possible interpretation in which all the members of Γ are true and α is false. What characterizes implication in each case is the concept of a possible interpretation and the way interpretations determine truth-values. Rephrasing the definition in our present terms, we can say that, for a given first-order language, Γ logically implies α if there is no FOL model that satisfies all members of Γ but does not satisfy α. Furthermore, a sentence α is logically true if it is satisfied in all models, and logically false if it is satisfied in none.
The concepts extend naturally to the case of wffs; we have to throw in assignments of objects to variables, since the truth-values depend also on such assignments:
• A premise-list Γ of wffs logically implies a wff α, if there is no model M (for the language in question) and no assignment g (of values to the variables occurring freely in the wffs of Γ and α) which satisfy all members of Γ, but do not satisfy α.
• A wff α is logically true if it is satisfied in all models under all assignments of values to its free variables. (Or, equivalently, if it is logically implied by the empty premise-list.)
• A wff α is logically false if it is not satisfied in any model under any assignment of values to its free variables.
The definitions for sentences are particular cases of the definitions for wffs.
Satisfiable Sets of Wffs
A set of wffs is satisfiable in the model M if there is an assignment of values (to the free variables of its wffs) which satisfies all wffs in the set. If the wffs are sentences, this simply means that all the sentences are true in M. A set of wffs is satisfiable if it is satisfiable in some model. A set of wffs which is not satisfiable is described as logically inconsistent. Note that this accords with the previous usage of that term in sentential logic (cf. 3.4.3). Obviously, a wff is satisfiable just when it is not logically false.
As before, we use:
Γ |= α
to say that the premise-list Γ logically implies the wff α. (Recall the double usage of ‘|=’!) If the premise-list is empty, this means that α is a logical truth:
|= α
We have used ‘⊥’ to denote some unspecified contradiction (cf. 4.4.0). We adopt this notation also for FOL.
Following the reasoning used for sentential logic (cf. 4.4.0), we see that Γ |= ⊥ means that Γ is not satisfiable. And the same reasoning also implies:
Γ |= α ⇐⇒ Γ, ¬α |= ⊥
Logical Equivalence
From logical implication we can get logical equivalence. Using, as before, ‘≡’, we can define it by:
α ≡ β ⇐⇒ α |= β and β |= α
Obviously, α ≡ β iff α and β have the same truth-value in every model under every assignment of values to the variables that are free in α and in β. From now on, unless indicated otherwise, ‘implication’ and ‘equivalence’ mean logical implication and logical equivalence.
9.2.1
Proving Non-Implications by Counterexamples
We use ‘Γ ⊭ α’ to say that Γ does not imply α. It means that there is some model and some value-assignment to the variables, such that all members of Γ are satisfied but α is not. Such a model and assignment constitute a counterexample to the implication claim. Here are some non-implication claims that are proved by counterexamples.
(1) ∃xP(x) ⊭ ∀xP(x)
Proof: Consider a model, M, whose universe contains at least two members, such that PM is neither empty nor the whole universe. Since PM ≠ ∅, M |= ∃xP(x). Since PM ≠ |M|, M ⊭ ∀xP(x). QED
(2) ⊭ ∀y∃xR(x, y) → ∃x∀yR(x, y)
Proof: Consider the following model.
M = (U, R), where U = {0, 1}, R = {(0, 0), (1, 1)}
Since M |= R(x, y)[x/0 y/0], we have: M |= ∃xR(x, y)[y/0]. An analogous argument shows that M |= ∃xR(x, y)[y/1]. Since 0 and 1 are the only members, M |= ∀y∃xR(x, y).
On the other hand, M ⊭ R(x, y)[x/0 y/1]. Hence M ⊭ ∀yR(x, y)[x/0]. By a similar argument M ⊭ ∀yR(x, y)[x/1]. Therefore M ⊭ ∃x∀yR(x, y). Since the antecedent (of the sentence in (2)) is true in M, but the consequent is false, the sentence is false. QED
Let us modify the sentence of (2) a bit:
∀x∃y (x ≉ y ∧ R(x, y)) → ∃y∀x (x ≉ y → R(x, y))
It is not difficult to see that this sentence is satisfied in the last counterexample, since the antecedent is false. Still, the sentence is not a logical truth. To prove this consider the model M = (U, R) where:
U = {0, 1, 2}, R = {(0, 1), (1, 2), (2, 0)}
It is not difficult to see that the antecedent is true in this model: for every member a there is a different member b such that (a, b) ∈ R. But there is no member b such that (a, b) ∈ R for all a ≠ b. Hence the consequent is false.
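Small counterexamples of this kind can also be found by exhaustive search, since a universe with n members admits only finitely many binary relations. The following Python sketch is an illustration, not part of the original text; the function names are hypothetical, and the two sentences are hand-translated into Python. Note that such a search can only establish non-implication: finding no counterexample up to a given size proves nothing about larger or infinite models.

```python
from itertools import product


def relations(n):
    """Yield every binary relation over the universe {0, ..., n-1}."""
    pairs = list(product(range(n), repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, bits) if keep}


def sentence_2(U, R):
    """For all y there is x with R(x, y)  ->  there is x with R(x, y) for all y."""
    antecedent = all(any((x, y) in R for x in U) for y in U)
    consequent = any(all((x, y) in R for y in U) for x in U)
    return (not antecedent) or consequent


def modified(U, R):
    """The modified sentence, with the extra condition that x and y differ."""
    antecedent = all(any(x != y and (x, y) in R for y in U) for x in U)
    consequent = any(all(x == y or (x, y) in R for x in U) for y in U)
    return (not antecedent) or consequent


def smallest_counterexample(sentence, max_size=3):
    """Return (n, R) for the first model found that falsifies the sentence."""
    for n in range(1, max_size + 1):
        for R in relations(n):
            if not sentence(range(n), R):
                return n, R
    return None


print(smallest_counterexample(sentence_2)[0])  # 2
print(smallest_counterexample(modified)[0])    # 3
```

The search confirms that the two models used in the text are of minimal size: the sentence of (2) holds in every one-element model, and the modified sentence holds in every model with at most two elements.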
Homework 9.4
Prove the following negative claims by constructing small-size models (as small as you can) such that the premise is satisfied, but the conclusion is not.
You also have to assign values to the free variables occurring in the implication, if there are any. (Note that the same variable can have free and bound occurrences.)
1. ∀x∃yS(x, y) ⊭ ∃xS(x, …
… Hence the list contains a wff and its negation.
In the last two examples the less accurate but more suggestive variable-displaying notation is used. The new individual constants are obvious from the context. We allow pushing negation inside. The marginal indications are left mostly for the reader.
(IV) ∃x∀yα(x, y) |= ∀y∃xα(x, y)
1. ∃x∀yα(x, y) |= ∀y∃xα(x, y)
2. ∃x∀yα(x, y), ¬∀y∃xα(x, y) |= ⊥
3. ∃x∀yα(x, y), ∃y∀x¬α(x, y) |= ⊥   (pushing-in negation)
4. ∀yα(c1, y), ∃y∀x¬α(x, y) |= ⊥
5. ∀yα(c1, y), ∀x¬α(x, c2) |= ⊥
6. ∀yα(c1, y), α(c1, c2), ∀x¬α(x, c2) |= ⊥
7. ∀yα(c1, y), α(c1, c2), ∀x¬α(x, c2), ¬α(c1, c2) |= ⊥   √
Note: To get 5. we have to introduce a second new constant. We cannot use c1, because it occurs in 4. This example illustrates a general feature of the technique. Existential quantifiers can be eliminated via instantiations to new constants; universal quantifiers can be used to add instantiations to any chosen constant.
(V) ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y) |= ∃x∀yR(x, y) → ∀x∃yR(x, y)
1. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y) |= ∃x∀yR(x, y) → ∀x∃yR(x, y)
2. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ¬[∃x∀yR(x, y) → ∀x∃yR(x, y)] |= ⊥
3. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∃x∀yR(x, y), ¬[∀x∃yR(x, y)] |= ⊥
4. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∃x∀yR(x, y), ∃x∀y¬R(x, y) |= ⊥
5. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∀yR(c1, y), ∃x∀y¬R(x, y) |= ⊥
6. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
7.1 ∀x∃yR(x, y), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
7.2 ∀x∃y¬R(x, y), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
8.1 ∀x∃yR(x, y), ∃yR(c2, y), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
9.1 ∀x∃yR(x, y), R(c2, c3), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
10.1 ∀x∃yR(x, y), R(c2, c3), ∀yR(c1, y), ∀y¬R(c2, y), ¬R(c2, c3) |= ⊥   √
9.3. THE TOP-DOWN DERIVATION METHOD FOR FOL IMPLICATIONS
8.2 ∀x∃y¬R(x, y), ∃y¬R(c1, y), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
9.2 ∀x∃y¬R(x, y), ¬R(c1, c3), ∀yR(c1, y), ∀y¬R(c2, y) |= ⊥
10.2 ∀x∃y¬R(x, y), ¬R(c1, c3), ∀yR(c1, y), R(c1, c3), ∀y¬R(c2, y) |= ⊥   √
As you can see, the elimination of an existential quantifier introduces a new constant, over which we have no control. Universal quantifiers are not eliminated, but can be used to add instantiations, to any constants, of the quantified variables. The moves in this game of eliminating and instantiating are chosen so as to produce in the end a contradictory premise-list.
9.3.3
The Adequacy of the Method: Completeness
There is a uniform “automatic” way of applying the laws, which is guaranteed to produce, in a finite number of steps, a proof, provided that the initial implication is indeed a logical implication. This claim is proved by showing the following: If at no finite stage do we get a proof (i.e., a reduction to a set of self-evident goals), then we end with a tree containing an infinite branch. From this branch one can construct a (possibly infinite) model that satisfies all the original premises, which shows that we did not have a contradictory premise-list. Consequently, if the initial goal is a logical implication, there is a top-down derivation that ends in a finite number of self-evident goals. Turning the derivation tree upside down, we get a bottom-up proof. The proof-system referred to below is the system consisting of the self-evident implications (taken as axioms) and the ⇐-direction of the basic implication laws.
The Completeness of the Proof System: If Γ |= α is a logical implication of FOL, then there is a proof of it, obtainable via the top-down derivation method.
The soundness of the proof system, i.e., the fact that every proved implication is logical, follows from the validity of the laws, whose proof was given in 9.3.1. The above completeness claim amounts, essentially, to what is known as the completeness theorem for first-order logic.
We can convert our system into a deductive system (cf. 6.2), following the same lines we followed in the sentential calculus. The deductive FOL system can be either of Hilbert’s type, or of Gentzen’s type. Each includes the corresponding sentential-logic version (for the Hilbert-type system see 6.2.3, for the Gentzen type, 6.2.4). The systems are sound and complete for first-order logic. The completeness for the Hilbert-type system is stated in the same form as in the sentential calculus. Now, however, Γ is a list of wffs in FOL, α is a wff and ‘⊢’ denotes provability in FOL:
Γ |= α =⇒ Γ ⊢ α
The completeness result for first-order logic, with respect to a particular Hilbert-type system, was first proved by Gödel. (His, and other, proofs do not employ top-down derivations. The method employed in this book derives from the ideas underlying Gentzen’s calculus.)
The essential difference between sentential logic and FOL is that, in the latter, we are not guaranteed to get a counterexample in a finite number of steps if the initial goal is not valid. The required counterexample may take an infinite number of steps. In general, we cannot know for sure, at any finite stage, whether there is a proof around the corner. This restriction on what we can know can be given precise mathematical form; it becomes what is known as the undecidability of first-order logic. Roughly speaking, it says that there is no algorithm (or no computer program), which, on any given first-order sentence, decides in a finite number of steps whether the sentence is logically valid or not. That theorem was first proved by Church.
Homework 9.6 Prove the following. You can use top-down derivations, as well as substitution of equivalents and the equivalence laws of 9.3.1, besides the full sentential apparatus.
1. ∀x [S(x, a) → S(x, b)] ∧ ¬S(b, b) |= ¬S(b, a)
2. ∀vS(v, v) |= ∃yS(x, y)
3. |= ∀x [¬P(x) → ∃y (P(y) → R(y))]
4. |= ∃x∀yS(x, y) ∨ ∀x∃y¬S(x, y)
5. ∀xP(x) ∨ ¬∀xR(x) |= ∃x [R(x) → P(x)]
6. ∀x, y [S(x, y) → S(x, x)] |= ¬S(x, x) → ¬S(x, y)
7. ∀x (α ∨ β) |= ∀xα ∨ ∃xβ
8. |= ∃x [(∃uP(u) → ∃vR(v)) → (P(x) → R(x))]
9. ∃xα → ∀xβ |= ∀x (α → β)
10. ∃x∀y [S(x, y) ↔ S(x, x)] |= ∃x [∀yS(x, y) ∨ ∀y¬S(x, y)]