English Pages [102] Year 2021
P EDL : additional notes and exercises – scruffy and incomplete preliminary draft – T21
1 Preliminaries
(MP and CSP people may know most of this already; but perhaps not PP people.)
1.1 Sets
It is difficult to say anything uncontentious about what sets are over and above saying that they have members (alternative terminology: elements) and that a set A is identical to a set B if and only if A and B have precisely the same members. This identity condition is an extensionality principle, which distinguishes sets from predicates or properties: different predicates may be true of precisely the same things, or different properties may be had by precisely the same things, but different sets cannot have precisely the same things as members.

Notation for ‘a is a member (element) of A’: a ∈ A.¹

Extensionality Principle: A = B ⇔ for any a, (a ∈ A ⇔ a ∈ B).²

There is an empty set (null set): a set that has no members. Notation for the empty set: ∅.³ The defining condition for ∅ is that for any a, a ∉ ∅; and so, by extensionality, there is only one empty set.

Given a finite number of things, a1, ..., an, we can specify the set that contains all and only these things, by enumeration, as follows: {a1, ..., an}.⁴ In particular, given a single thing a, there is a set {a} that contains a and nothing else: such sets are called singletons, and people say ‘singleton a’ meaning the set {a}.⁵ The { ... } notation can sometimes be extended to cover infinitely many things: for example, if we have a sequence a1, a2, ..., an, ..., we can write {a1, a2, ..., an, ...}.

¹ ‘a ∈ A’ is standardly negated like this: a ∉ A. Also, ‘a ∈ A and b ∈ A’ is standardly abbreviated to ‘a, b ∈ A’, and ‘a1 ∈ A and ... and an ∈ A’ to ‘a1, ..., an ∈ A’, for any n.
² We shall be using ⇔ as informal notation for ‘if and only if’: it’s important to avoid confusion with the symbol ↔, which is used in a formal language. We can also use ⇒, for one-way implication, and this isn’t to be confused with the formal symbol →.
³ Often the empty set looks more like this: ∅. But, however exactly you want to write it, make sure it looks different from the Greek letter φ. (Sometimes φ looks more like this: ϕ.)
⁴ When n = 0, then we have the empty set, which is why some people like to use the notation ‘{ }’ for ∅.
⁵ Don’t confuse ‘{a}’ and ‘a’: in the ‘cumulative hierarchy’ of sets, which is currently taken as the standard background set-theoretical framework to work in, it’s never the case that {a} = a (and, even in non-standard set theories, this identity will not hold in general).
But note that sets are unordered: {0, 1, 2, 3, ..., n, ...} = {1, 0, 2, 3, ..., n, ...}, {0, 1, 2, 3} = {3, 2, 1, 0}, and so on. So don’t confuse a sequence (finite or infinite) with the set of its coordinates.

We can also specify sets by giving a membership condition: if we have a predicate ··· x ···, then the set containing all and only those things x such that ··· x ··· is specified as follows: {x | ··· x ···}.⁶ There are, however, dangers in attempting to specify sets in this way: paradoxes lurk if we assume that any coherent predicate specifies a set. But if we know we’ve got a set A to start with, then we’re safe in taking it that there’s a set {x | x ∈ A and ··· x ···}, for which the following abbreviation is used: {x ∈ A | ··· x ···}. Since sets are extensional, different predicates may, of course, determine the same set: for example, {x ∈ N | x is even} = {x ∈ N | x + 1 is odd}.⁷ When we have a set of things of a given sort in the background, then it’s standard to adopt particular letters as variables which are understood to range just over these things, and we needn’t then give an explicit restriction on the left of ‘|’. For example, if we know that φ ranges over the set of formulae of a given formal language, then we can just write {φ | φ is a tautology} for the set of tautologies of the language.

———

There are three basic operations on sets which get used all the time: given sets A and B, we can take their intersection, their union, and the complement of B in A.

The intersection of A and B: A ∩ B = {x | x ∈ A and x ∈ B}.
The union of A and B: A ∪ B = {x | x ∈ A or x ∈ B}.
The complement of B in A: A ∖ B = {x | x ∈ A and x ∉ B}.⁸

These operations of intersection and union give you a set from just two sets. But they can be generalized: given any set of sets there is a set that’s the intersection of all the sets in the set of sets, and a set that’s the union of all the sets in the set of sets. Commonly, a set of sets will be presented as an indexed family of sets: for example, as an infinite sequence A_0, A_1, ..., A_n, .... In this case notation like ‘⋂_{n∈N} A_n’ and ‘⋃_{n∈N} A_n’ is standard⁹:

⋂_{n∈N} A_n = {x | for every n ∈ N, x ∈ A_n};
⋃_{n∈N} A_n = {x | for some n ∈ N, x ∈ A_n}.

⁶ You’ll also find ‘{x : ··· x ···}’.
⁷ The set N is the set of natural numbers, viz. {0, 1, 2, 3, ...}. (See footnote 26.)
⁸ What about a non-relative complement of a set B, that is, a set {x | x ∉ B}? Well, at least in the cumulative hierarchy of sets, there is no such set: crudely put, paradoxes are avoided by imposing a size restriction on the totalities that count as sets, and an unrestricted complement would be too big to be a set.
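The set-forming notation above has a close analogue in Python's built-in sets, which may help to fix ideas. What follows is only a sketch for finite sets: N is cut off at an arbitrary bound, and the names `N_BOUND`, `family` and so on are illustrative assumptions rather than anything in these notes.

```python
# Finite stand-in for N: the cut-off is an arbitrary assumption for illustration.
N_BOUND = 20
N = frozenset(range(N_BOUND))

# Sets are unordered and extensional: only membership matters.
assert frozenset([0, 1, 2, 3]) == frozenset([3, 2, 1, 0])

# {x in N | x is even} and {x in N | x + 1 is odd}:
# different membership conditions, the same set.
evens_a = frozenset(x for x in N if x % 2 == 0)
evens_b = frozenset(x for x in N if (x + 1) % 2 == 1)
assert evens_a == evens_b

A = frozenset({1, 2, 3})
B = frozenset({3, 4})
intersection = A & B   # members of both A and B
union = A | B          # members of A or of B
complement = A - B     # complement of B in A

# Generalized union and intersection over an indexed family A_0, A_1, A_2,
# where A_n is taken (hypothetically) to be {n, n+1, n+2, n+3}.
family = [frozenset(range(n, n + 4)) for n in range(3)]
big_union = frozenset().union(*family)
big_intersection = frozenset.intersection(*family)
```

Note that Python offers no unrestricted comprehension: every comprehension ranges over some collection already given, which matches the safe form {x ∈ A | ··· x ···}.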
———

Next some relations between sets. A is said to be a subset of B (B a superset of A) if and only if any member of A is a member of B.

Notation for ‘A is a subset of B’ (‘B is a superset of A’): A ⊆ B (B ⊇ A).

Definition: A ⊆ B ⇔ for any a, a ∈ A ⇒ a ∈ B.¹⁰

Note that the relation ⊆ is a partial ordering: it’s reflexive¹¹ and transitive, and it’s also antisymmetric; thus A = B ⇔ A ⊆ B and B ⊆ A. When A ⊆ B and A ≠ B (equivalently, A ⊆ B and B ⊈ A), then A is said to be a proper subset of B (B a proper superset of A).¹² Note that ∅ is a subset of any set.

Sets A and B are said to be disjoint if and only if nothing is in both. This can be neatly expressed as follows: A and B are disjoint ⇔ A ∩ B = ∅.

———

Finally, given a set A, all standard theories of sets take there to be a set P(A) that contains all and only the subsets of A: P(A) = {B | B ⊆ A}. This is called the power set of A.

⁹ Recall footnote 7: the indexing set N is {0, 1, 2, 3, ...}. Indexing an infinite version of a two-place operation should be familiar from notation like ‘Σ_{n=0}^{∞} a_n’.
¹⁰ Don’t confuse ⊆ and ∈. Note that ⊆ is reflexive, but ∈ certainly isn’t: in fact, in the cumulative hierarchy, ∈ is asymmetric, and hence actually irreflexive. Some people both say ‘B contains a’, meaning a ∈ B, and say ‘B contains A’, meaning A ⊆ B; but, to help avoid confusion, I recommend keeping ‘contains’ for the first thing and saying ‘B is a superset of A’ for the second. (⊆ is often called the ‘set inclusion’ relation.)
¹¹ On the domain of all sets, as it were. ‘As it were’ because there can be no set of all sets: see footnote 8 and exercise 1.1.1.
¹² Sometimes the notation ‘A ⊂ B’ (‘B ⊃ A’) is used to mean that A is a proper subset of B (B is a proper superset of A); but, confusingly, it’s also sometimes used to mean the same as ‘A ⊆ B’ (‘B ⊇ A’), especially in older texts.
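For finite sets the subset relation, disjointness and the power set can all be computed directly, and a small sketch may make the definitions concrete. The function name `power_set` and the sample sets are assumptions made for illustration; `frozenset` is used because members of a Python set must be hashable.

```python
from itertools import combinations

def power_set(a):
    """Return P(a): the set of all subsets of the finite set a."""
    items = list(a)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )

A = frozenset({1, 2, 3})
P_A = power_set(A)

# Every member of P(A) is a subset of A, and the empty set is among them.
all_subsets = all(B <= A for B in P_A)
has_empty = frozenset() in P_A

# Proper subset: a subset of A that is not equal to A.
is_proper = frozenset({1}) < A

# Disjointness: the intersection is empty.
disjoint = A.isdisjoint(frozenset({4, 5}))
```

Python's `<=` on sets corresponds to ⊆ and `<` to proper subset, which sidesteps the ambiguity of ‘⊂’.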
Exercise 1.1.1. Show that there can be no set A of sets such that {B ∈ A | B ∉ B} ∈ A. Deduce that there can be no set of all sets.

Exercise 1.1.2. For each of the following, either show that it holds for any sets X and Y or else provide a counterexample.
(i) (a) P(X ∩ Y) ⊆ P(X) ∩ P(Y), (b) P(X) ∩ P(Y) ⊆ P(X ∩ Y).
(ii) (a) P(X ∪ Y) ⊆ P(X) ∪ P(Y), (b) P(X) ∪ P(Y) ⊆ P(X ∪ Y).
(iii) (a) P(X ∖ Y) ⊆ P(X) ∖ P(Y), (b) P(X) ∖ P(Y) ⊆ P(X ∖ Y).
(iv) (a) X ⊆ Y ⇒ P(X) ⊆ P(Y), (b) P(X) ⊆ P(Y) ⇒ X ⊆ Y.
(v) (a) X ∩ Y = ∅ ⇒ P(X) ∩ P(Y) = ∅, (b) P(X) ∩ P(Y) = ∅ ⇒ X ∩ Y = ∅.
1.2 n-tuples
An n-tuple is an n-length sequence of things: ‘⟨a1, ..., an⟩’ is notation for the n-tuple consisting of a1, ..., an in that order.¹³ A 2-tuple is called an ‘ordered pair’ (or sometimes just ‘pair’).¹⁴ And identity over n-tuples is determined by the identity of coordinates:

⟨a1, ..., an⟩ = ⟨b1, ..., bn⟩ ⇔ for all i, 1 ≤ i ≤ n, ai = bi.

All this is in Halbach’s Manual, of course, but note that our characterization of n-tuples hasn’t excluded the case where n = 1 or where n = 0. What about these cases? Well, in some contexts there may be good reason to distinguish a 1-tuple ⟨a⟩ from the object a itself, but it is often more natural to identify them, particularly in doing semantics for first-order languages (such as Halbach’s L2 and L=); and this identification obviously satisfies the general identity principle. And it will be useful (it’ll make things neat) to recognize a 0-tuple ⟨ ⟩: the 0-tuple, since the identity principle determines that there can’t be more than one.¹⁵

To denote the set {⟨x1, ..., xn⟩ | for all i, 1 ≤ i ≤ n, xi ∈ A}¹⁶ (i.e. the set containing all and only those n-tuples whose coordinates are each members of the set A) the notation ‘A^n’ is used.¹⁷ What we’ve said about 1-tuples and the 0-tuple means that A^1 = A and A^0 = {⟨ ⟩}.

¹³ Sometimes ordinary parentheses are used, as in ‘(a1, ..., an)’, but you ought to stick to ‘⟨’ and ‘⟩’, which is the standard modern notation. Never, of course, use ‘{’ and ‘}’, which should be reserved for (unordered) sets.
¹⁴ A 3-tuple is a(n ordered) ‘triple’; a 4-tuple is a(n ordered) ‘quadruple’; and finally, after “quin-” and “sex-”, we get all of “tuple” (and, I suppose, also after “sep-” and “oc-”), but reaching for a special Latinate word begins to sound sillier and sillier the larger you get.
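Python's tuples behave much like n-tuples, and `itertools.product` builds A^n for a finite A, so the definitions can be tried out directly. This is a sketch under stated assumptions: the helper name `power` and the two-element set are invented for the example, and Python keeps the 1-tuple `("p",)` distinct from `"p"`, so A^1 here is not literally A, unlike the identification adopted above.

```python
from itertools import product

def power(a, n):
    """Return A^n as a set of Python tuples (a sketch for finite A)."""
    return frozenset(product(a, repeat=n))

A = frozenset({"p", "q"})

A2 = power(A, 2)   # all ordered pairs with coordinates in A
A1 = power(A, 1)   # 1-tuples: kept distinct from the members of A by Python
A0 = power(A, 0)   # contains just the empty tuple ()

# Tuples are ordered, and identity goes coordinate by coordinate.
assert ("p", "q") != ("q", "p")
assert ("p", "q") == ("p", "q")

# There is exactly one 0-tuple, so A^0 is a singleton whatever A is.
assert power(frozenset(), 0) == A0
```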
1.3 Relations
An n-place relation is just a set of n-tuples. A 2-place relation is standardly called a binary relation. And a 3-place one ternary, .... (See the Manual.)

Halbach provides standard definitions of the domain and the range of a given binary relation.¹⁸ A common alternative approach (more suited, perhaps, for the way relations figure in the semantics of first-order languages¹⁹) is to consider a relation over (or on) an independently specified ‘domain’. This applies to n-place relations, for whatever n: R is an n-place relation over a domain A if and only if R ⊆ A^n. And if A ⊆ B, then A^n ⊆ B^n, and so any relation over a domain A is also a relation over B. Of course, A^n is itself an n-place relation: the universal n-place relation over A.

In the philosophical literature the term “relation” is often used to mean an n-place predicate, or something of the same ontological kind (whatever that is) as a property.²⁰ But in classical logic the notion of a relation is extensional: a relation R is identical to a relation S if and only if R and S are sets containing precisely the same n-tuples. Over a given domain A an n-place predicate ··· x1 ··· xn ··· will determine the relation containing all and only those n-tuples that satisfy the predicate, viz. {⟨x1, ..., xn⟩ ∈ A^n | ··· x1 ··· xn ···}. And different predicates can determine the same relation; for example, over the natural numbers ‘x < y’ and ‘y + 1 > x + 1’ determine the same relation: {⟨m, n⟩ | m < n} = {⟨m, n⟩ | n + 1 > m + 1}.²¹

Finally, what about 1-place and 0-place relations? Well, it follows from our stance on 1-tuples that a 1-place relation over a domain A is just a subset of A. And given that there is one and only one 0-tuple ⟨ ⟩, there are only two possibilities for a 0-place relation: the singleton {⟨ ⟩} or the empty relation ∅.

¹⁵ In standard developments of set theory n-tuples, for n > 1, are represented as (formally taken to be) sets, rather than allowing them in as a theoretically primitive construct. It doesn’t matter much how this is done, provided that the identity principle displayed above is guaranteed. Standardly, an ordered pair ⟨a, b⟩ is first defined to be {{a}, {a, b}}; then an ordered triple ⟨a, b, c⟩ to be ⟨⟨a, b⟩, c⟩; and so on. In fact we have an inductive definition: see section 1.6. (The 0-tuple doesn’t get a look in here.)
¹⁶ This is a natural extension of the set-specifying notation introduced in section 1.1: fully spelt out, the set is {x | there exist x1, ..., xn such that x = ⟨x1, ..., xn⟩ and, for all i, 1 ≤ i ≤ n, xi ∈ A}.
¹⁷ A^n is a special case of the Cartesian product A1 × ... × An of sets A1, ..., An, where A1 × ... × An = {⟨x1, ..., xn⟩ | for all i, 1 ≤ i ≤ n, xi ∈ Ai}. Thus A^2 = A × A, and so on.
¹⁸ The domain of a binary relation R (which is determined by R) is defined to be {x | there exists y such that ⟨x, y⟩ ∈ R}; and, correlatively with this, the range of a binary relation R is defined to be {y | there exists x such that ⟨x, y⟩ ∈ R}.
¹⁹ One of the oddities of Halbach’s book is that, in interpretation structures, the n-place relation interpreting an n-place predicate is not required to be over the domain of the interpretation structure, in the sense of ‘over’ defined here. I know of no other book that’s so eccentric: the standard definition of an interpretation structure does require this.
²⁰ You sometimes find these usages in maths texts too (and sometimes there’s a muddle).
It’s then tempting to ‘identify’ the truth value / (Truth) with {⟨ ⟩}, and the truth value 7 (Falsity) with ∅.²² After all, n-place predicates determine n-place relations; and sentences are 0-place predicates: so they determine 0-place relations, which is to say that they are either satisfied by ⟨ ⟩ or they’re not. How, then, could truth not be being satisfied by the only thing available to do any satisfying, and falsity be being left unsatisfied? Taking this line might at first sight seem artificial; but it would be an instance of pushing formal apparatus to a limiting case in a way that’s actually quite natural and certainly makes things neat in certain contexts.²³
———

The Manual gives a definition of what it is for a binary relation R to be an equivalence relation on a set A: it’s a relation that’s reflexive, symmetric and transitive on A. Such a relation will then determine a partition of A (viz. a set of mutually disjoint non-empty subsets of A whose union is the whole of A) such that the elements of each subset in the partition will all be R-related to one another, but no pair of elements from distinct subsets in the partition will be R-related. The sets in the partition (these subsets of A) are called equivalence classes modulo the relation R.²⁴ For example, if R is the relation defined over N by stipulating that ⟨m, n⟩ ∈ R if and only if m and n have the same remainder on division by 2, then there are two equivalence classes modulo R: the set of even numbers and the set of odd numbers.

Observe that the universal relation A^2 over a set A is an equivalence relation on A, and, if A is non-empty, there is just one equivalence class, viz. A; so the partition is {A}. At the other extreme the relation {⟨a, a⟩ | a ∈ A} will be an equivalence relation; this is the identity relation over A (a pair ⟨a, b⟩ is in the relation iff a = b): the equivalence classes are then the singleton subsets of A; in other words, the partition determined is {{a} | a ∈ A}.

²¹ Fully spelt out this equation is ‘{⟨x, y⟩ ∈ N^2 | x < y} = {⟨x, y⟩ ∈ N^2 | y + 1 > x + 1}’. But the context determines that ‘m’ and ‘n’ range over N, so we can adopt an extension of the convention mentioned above (section 1.1) and do without explicit mention of N.
²² The virtue of the notation ‘/’ and ‘7’ for Truth and Falsity will emerge in section 2.6.
²³ The smaller typeface means you could ignore the remarks.
²⁴ Note that even if A = ∅ there will be such a partition: the partition is ∅, and there are no equivalence classes.
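The passage from an equivalence relation to its partition can be carried out mechanically on a finite domain. The sketch below assumes a relation is represented as a set of Python pairs; the function names `is_equivalence` and `partition`, and the six-element stand-in for N, are choices made for this illustration only. It computes classes but does nothing to justify that they always form a partition.

```python
def is_equivalence(r, a):
    """Check that r (a set of pairs) is reflexive, symmetric and transitive on a."""
    reflexive = all((x, x) in r for x in a)
    symmetric = all((y, x) in r for (x, y) in r)
    transitive = all(
        (x, w) in r for (x, y) in r for (z, w) in r if y == z
    )
    return reflexive and symmetric and transitive

def partition(r, a):
    """Return the set of equivalence classes of a modulo r."""
    return frozenset(
        frozenset(y for y in a if (x, y) in r) for x in a
    )

A = frozenset(range(6))   # finite stand-in for N (an assumption)

# Same remainder on division by 2.
R = frozenset((m, n) for m in A for n in A if m % 2 == n % 2)
classes = partition(R, A)

# The two extreme cases: the universal relation and the identity relation.
universal = frozenset((x, y) for x in A for y in A)
identity = frozenset((x, x) for x in A)
```

Because the result is a set of sets, classes that arise from different representatives of the same class collapse into one member automatically.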
———

Exercise 1.3.1.²⁵
(a) Justify the claim that an equivalence relation on A determines a partition of A into equivalence classes, as described above.
(b) Assume that Π is a partition of A, and define the relation R_Π to be the set {⟨a, b⟩ ∈ A^2 | there exists X ∈ Π such that a, b ∈ X}. Show that R_Π is an equivalence relation on A.
(c) Show that the relation R_Π defined in (b) determines Π as its partition of A into equivalence classes.
(d) Assume that a relation R over A is an equivalence relation on A. Show that if Π is the partition into equivalence classes determined by R, then R_Π = R.
1.4 Functions
The idea of a function will be familiar from mathematical work with number domains of one sort or another. But the general notion of a function is not restricted to numbers (you should already also be familiar with truth functions, for example). The Manual, of course, provides a way of modelling functions as relations, but let’s ignore that for the moment.

We can think of a function f as taking arguments (inputs), and yielding a unique value (output) for each argument a, standardly written ‘f(a)’ (and often read ‘f of a’); and f could take any kind of thing as an argument, and yield any kind of thing as a value. The set of things on which a function is ‘defined’ (the things it actually yields a value for) is called its domain, and the set of all values yielded is called its range. It’s standard to write ‘f : A → B’ to mean that f is a function whose domain is A and whose range is a subset of B, and to say that f is a function from A into B. And if everything satisfying a predicate ... x ... is in the domain of f, then the set-specifying notation ‘{f(x) | ... x ...}’ can be used to stand for {y | there exists x such that y = f(x) and ... x ...}.

There’s also another useful sort of arrow notation: instead of writing ‘f(a) = b’ we can write ‘f : a ↦ b’. And the notation can be used (with variables or schematic letters) to give a description of a function: for example, the squaring function (on whatever number domain²⁶) can be specified as the function x ↦ x^2.

²⁵ The point of this exercise is to show that there is a natural correspondence between equivalence relations and partitions. The mathematicians will no doubt have covered the material already in their Mods course, and so they needn’t do the exercise. The physicists won’t have, but they had probably best postpone doing the exercise: make it vacation work.

Functions don’t have to be just one-place. The addition function, for example, is a two-place function, i.e. it takes two arguments. And in the lectures we’ll be considering n-place functions, where there is no bound on how (finitely) large n could be: the value of an n-place function f for arguments a1, ..., an is written ‘f(a1, ..., an)’. The domain of an n-place function is the set of all n-tuples, viz. n-length sequences, for which the function yields a value.²⁷ And this means that we could in fact think of an n-place function, whatever n is, as a one-place function, viz. a function whose arguments are n-tuples, an n-tuple being itself a single object. This justifies our giving basic definitions below just for one-place functions.

To illustrate these ideas, consider truth functions. Truth tables describe truth functions: for example, the truth table for ¬ describes the one-place function f¬ : {/, 7} → {/, 7} such that f¬(/) = 7 and f¬(7) = /; the truth table for ∧ describes the two-place function f∧ : {/, 7}^2 → {/, 7} such that f∧(/, /) = /, f∧(/, 7) = 7, f∧(7, /) = 7, and f∧(7, 7) = 7; the truth table for ∨ describes the two-place function f∨ : {/, 7}^2 → {/, 7} such that f∨(/, /) = /, f∨(/, 7) = /, f∨(7, /) = /, and f∨(7, 7) = 7; and so on. But truth functions are only the beginning: in a mathematical treatment of logic, functions of all different sorts come up all the time.

———

The question what a function is is as difficult as the question what a set is. But (at least on the classical conception, which we shall be adopting) functions, like sets, are extensional, in the sense that you can’t have different functions yielding exactly the same range of values for each argument. And the domain of a function is constitutive of the function, meaning that you can’t have the same function on different domains.
Hence we have an identity principle:

f = g ⇔ f and g have a common domain A such that for all a ∈ A, f(a) = g(a).

²⁶ Here is a (crudely specified) list of some number domains that you should be familiar with (along with the labels standardly used for them):
N = {0, 1, 2, 3, ...} (Natural Numbers)
Z = {..., −2, −1, 0, 1, 2, ...} (Integers)
Q = {n/m | n, m ∈ Z and m ≠ 0} (Rational Numbers; see footnote 27)
R (Real Numbers, which include irrational numbers such as √2, π, e, ...)
²⁷ And the set-specifying notation ‘{f(x1, ..., xn) | ... x1 ... xn ...}’ can be used to stand for {y | there exist x1, ..., xn such that y = f(x1, ..., xn) and ... x1 ... xn ...}.
This principle points up that you shouldn’t confuse a function with notation that specifies a function. For example, we can think of propositional formulae as specifying truth functions, and the distinct formulae (P ∧ Q) and ¬(¬P ∨ ¬Q) then both specify the same function: f∧ (as defined above). Again, x.(y + z) and x.y + x.z both specify the same (three-place) function, on any given number domain; furthermore, each expression expresses a different function over different number domains. And you shouldn’t confuse a function with a process of calculation or computation: in each of the examples just given, the two pieces of notation represent different ways of working out the value, for particular arguments, of the single function specified.

In standard developments of set theory functions are actually taken to be sets, in fact relations. This is the definition that Halbach gives in the Manual: a function is taken to be a binary relation F satisfying the condition that for any a there is at most one b such that ⟨a, b⟩ ∈ F. The domain and range of F (as defined above for functions) then turn out to be precisely the ‘domain’ and ‘range’ of F thought of just as a relation.²⁸ ²⁹

Say f : A → B; then the following terminology is used:

f is one-one (injective) iff f takes distinct arguments to distinct values; in other words, for any a and b in A, a ≠ b ⇒ f(a) ≠ f(b) (equivalently, f(a) = f(b) ⇒ a = b).

f is onto B³⁰ (surjective) iff B is the range of f (not just a superset of it); in other words, for every b in B, there is an a in A such that f(a) = b.

f is a one-one correspondence between A and B (is bijective (is a bijection)) iff f is both one-one and onto B.

A function which takes the same value for any argument is called a constant function. There will be constant functions on any domain, but note that a function whose domain is a singleton is inevitably constant: there is only one possible argument and hence only one value for it. This remark covers n-place functions for any n, since a domain is a set of n-tuples. (In particular, it covers 0-place functions, and we shall, for example, be interested in 0-place truth functions. For such a function there can only be one possible domain, viz. {⟨ ⟩}; this offers only one possible array of arguments, viz. ⟨ ⟩: so 0-place functions are all inevitably constant.)

———

²⁸ Observe that this representation of functions identifies them with their graphs: think of drawing a graph for a function such as x ↦ x^2 on the real numbers; the line drawn is {⟨x, y⟩ ∈ R^2 | x^2 = y}.
²⁹ Observe, too, that the identity principle for functions that’s stated above obviously follows from this representation of functions by relations.
³⁰ Not just into. (Every function, of course, is onto its range.)
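Treating a function as a set of argument-value pairs makes the definitions in this section directly checkable for finite cases. The sketch below is illustrative only: the helper names are invented, truth values are modelled as Python booleans (an assumption, not the notation of these notes), and `f_and_alt` is a second recipe, not(not p or not q), for computing the same values as conjunction.

```python
def is_function(rel):
    """A binary relation is a function iff no argument is paired with two values."""
    seen = {}
    for a, b in rel:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

def domain(rel):
    return frozenset(a for a, _ in rel)

def range_of(rel):
    return frozenset(b for _, b in rel)

def is_injective(rel):
    # For a finite function: one-one iff domain and range are the same size.
    return is_function(rel) and len(range_of(rel)) == len(domain(rel))

def is_onto(rel, b):
    return range_of(rel) == frozenset(b)

# Truth values modelled as Python booleans (an assumption for this sketch).
T, F = True, False
f_not = frozenset({(T, F), (F, T)})
f_and = frozenset({((T, T), T), ((T, F), F), ((F, T), F), ((F, F), F)})

# A different way of working out the same values.
f_and_alt = frozenset(
    ((p, q), not ((not p) or (not q))) for p in (T, F) for q in (T, F)
)
```

Since both are just sets of pairs, `f_and == f_and_alt` is a direct application of the identity principle: same domain, same value for each argument.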
Exercise 1.4.1. Write out explicitly all one-place and all two-place truth functions.

Exercise 1.4.2. Given any set A, and any function f whose domain is A, show that the set {x ∈ A | x ∉ f(x)} is not in the range of f (i.e. there is no a ∈ A such that f(a) = {x ∈ A | x ∉ f(x)}).³¹ [Hint: assume that there were such an element a ∈ A and derive a contradiction. (Ask yourself: What if a ∈ f(a)? What if a ∉ f(a)?)]
1.5 Cardinality
A cardinal number answers a question ‘How many ...?’.³² In particular, we can ask how many members a set has; and the answer is called its cardinality. Recall that the set N of natural numbers is the set {0, 1, 2, 3, ...}. These numbers can serve as cardinal numbers, and, when they do, they are taken to comprise all and only the finite cardinal numbers. Thus a set is finite if and only if its cardinality is a natural number.³³ For example, ∅ is finite, because it has cardinality 0; so is {∅}, because it has cardinality 1; and {∅, {∅}} has cardinality 2; {∅, {∅}, {∅, {∅}}} has cardinality 3; and so on.

But what is the cardinality of N itself? This is certainly not n for any n ∈ N: N is not finite; it’s infinite (‘infinite’ just means ‘not finite’). But infinite-size isn’t itself a cardinal number any more than finite-size is: there are different infinite cardinal numbers, infinitely many of them. (And this infinity of infinite cardinal numbers is explosively greater than the infinite number of finite cardinal numbers; in fact it’s greater than any infinite cardinal number!)

To get a general grip on the idea of the cardinality of a set (finite or infinite), standard foundational accounts start with a definition of when a set A has-the-same-cardinality-as a set B (common alternative terminology: A is equipollent to B, or A is equinumerous to (or with) B³⁴; notation: A ≈ B). The definition is just this:

A ≈ B ⇔ there is a one-one correspondence between A and B.

³¹ We haven’t actually stipulated that the range of f contains only sets; and there’s no need to: if f(x) is not a set, then it has no members and so, for any y, y ∉ f(x).
³² In contrast with a real number, say, which answers a question ‘How much ...?’.
³³ There are other definitions of what it is for a set to be finite, which in a suitably powerful set theory can be shown to be equivalent.
³⁴ The virtue of the alternative terminology is that it doesn’t look as if it presupposes a notion of cardinality. Of course, there’s in fact no such presupposition with ‘has-the-same-cardinality-as’, but it looks as if there is.
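For small finite sets the definition of ≈ can be tested by brute force: simply search for a one-one correspondence. The function name `equinumerous` is an assumption made for this sketch, and the search is exponential, so it is meant only to illustrate the definition, not as a practical way of comparing sizes.

```python
from itertools import permutations

def equinumerous(a, b):
    """Search for a one-one correspondence between finite sets a and b."""
    a, b = list(a), set(b)
    # Each arrangement of len(a) distinct members of b is a one-one function
    # from a into b; it is a correspondence if it is also onto b.
    return any(set(image) == b for image in permutations(b, len(a)))

same = equinumerous({1, 2, 3}, {"x", "y", "z"})
too_few = equinumerous({1, 2}, {"x", "y", "z"})
too_many = equinumerous({1, 2, 3}, {"x"})
```

The search never compares two numbers of members directly: it only asks whether a suitable function exists, in the spirit of the definition.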
It is then routine to check that, for any sets A, B, and C,
(i) A ≈ A;
(ii) A ≈ B ⇒ B ≈ A;
(iii) A ≈ B and B ≈ C ⇒ A ≈ C.
Thus, if we’re happy to talk of a ‘domain of sets’³⁵, then ≈ is an equivalence relation on this domain.³⁶

Anyhow, a set which has the same cardinality as N is called denumerable, and ‘ℵ0’ (read ‘aleph zero’) is the notation used to stand for the cardinal number: this is the smallest infinite cardinal number³⁷. A set that is either finite or denumerable is called countable, and any larger set is called uncountable.

But what does ‘larger’ actually mean? Well, we can define a relation ≼ between sets as follows:

A ≼ B ⇔ there is a one-one function from A into B.

It then turns out, though this is a non-trivial result (the Schröder-Bernstein Theorem), that if A ≼ B and B ≼ A, then A ≈ B; and it makes sense to think of ‘A ≼ B’ as meaning that B has cardinality no smaller than A. We can then define a relation of having strictly larger cardinality: A ≺ B if and only if A ≼ B and A ≉ B (equivalently, A ≼ B and B ⋠ A).

———

For the development of the theory of infinite cardinal numbers, see any introduction to set theory. And for philosophical and historical discussion, see, for example, Adrian Moore’s book The Infinite.

———

Exercise 1.5.1. Give an example (with justification) of a set which has a proper subset of the same cardinality.

³⁵ But see footnote 11.
³⁶ It would be nice if we could identify cardinal numbers with the equivalence classes of ≈. This, more or less, is what logicists such as Frege and Russell wanted to do. But in standard contemporary developments of set theory there’s a problem: equivalence classes are too big to be themselves sets in the theory: see footnote 8.
³⁷ It’s the start of a list of increasingly large infinite cardinals, ℵ0, ℵ1, ℵ2, ...: an infinite list, and one extending way way beyond what you’ve got after running through all the ℵn for n ∈ N.
Exercise 1.5.2. If A is finite, with cardinality m, what is the cardinality of A^n? If A and B are finite, with cardinalities m and n, respectively, how many functions are there from A into B? How many n-place truth functions are there?

Exercise 1.5.3. Deduce from Exercise 1.4.2 that A ≺ P(A). If A is finite, with cardinality m, what is the cardinality of P(A)? If A is finite, with cardinality m, how many n-place relations are there over A?

Exercise 1.5.4. An infinite sequence A_0, A_1, ..., A_n, ... of sets is a chain iff i < j ⇒ A_i ⊆ A_j, for all i, j ∈ N. Show that if A_0, A_1, ..., A_n, ... is a chain and B is a finite subset of ⋃_{n∈N} A_n, then B ⊆ A_n, for some n.
1.6 Induction and Recursion
A common pattern in providing a mathematical treatment of logic is first to define a totality (e.g. the set of formulae of a given formal language (or a particular subset of such formulae)) by induction, and then to define functions on this totality by recursion³⁸ on the inductive definition (e.g. a function giving the evaluation of a formula in a given interpretation structure).³⁹

An inductive definition of a totality first specifies a given range of basic objects (e.g. the sentence letters of a propositional language), and then specifies ways of making new items out of existing ones (e.g. making the formula ¬φ out of a formula φ, or making the formula (φ ∧ ψ) out of formulae φ and ψ); and the totality inductively defined by such specifications is the set that
(i) contains the basic items, that
(ii) contains any item got by one of the ways of making new items, if the item(s) it’s made out of is/are in the set⁴⁰, and that
(iii) contains nothing else.

³⁸ Sometimes people use the same terminology as before and call recursive definitions “inductive definitions”, but it’s clearer to have different labels.
³⁹ We can also define relations by recursion; but these definitions could in principle be recast as definitions of functions.
⁴⁰ A set satisfying condition (ii) is said to be closed under the specifications for making new items.
There are different but equivalent ways of making the idea of an inductive definition mathematically precise—in particular, showing that a set meeting conditions (i), (ii), and (iii) above actually exists. However we do this, it is noteworthy that an item will be in an inductively defined totality if and only if it has a ‘generation tree’ tracing through a way it can be made out of basic items. In the case of a set of formulae that has been inductively defined in the way sketched above, a formula in the set will have a unique generation tree describing the way it is built up out of sentence letters. Such a generation tree will then reveal the syntactic structure of a formula.41 If the items in an inductively defined totality do each have a unique generation tree, then a recursive definition of a function on this totality is one that first specifies outright a value for the basic items, and then, for each way of making new items out of existing ones, specifies a value for an item made in this way in terms of the value(s) for the item(s) it’s made out of: a (unique) value is thereby fixed for every item in the totality. (E.g. to define the evaluation of a formula under a given assignment of truth values to sentence letters—i.e. an interpretation structure—a sentence letter is first evaluated with the truth value assigned to it; then the evaluation of a formula ¬φ is defined to be T (Truth), if the evaluation of φ is F (Falsehood), and F otherwise—i.e. it’s f¬ of the evaluation of φ; and the evaluation of a formula (φ ∧ ψ) is defined to be T, if the evaluation both of φ and of ψ is T, and F otherwise—i.e. it’s f∧ of the evaluation of φ and the evaluation of ψ; and so on, for however many connectives there may be in the language.)42 This, of course, is just a description of the semantic clauses that the Manual gives for L1: for any interpretation structure A, a function φ ↦ |φ|A is defined from the domain of formulae of L1 into {T, F}—defined by recursion on the inductive definition of the totality of formulae of L1. For a detailed (and pedantic) account of inductive and recursive definitions, see Enderton, A Mathematical Introduction to Logic, 1.2—and see also 1.4 (on ‘unique readability’).
41 There can, however, be inductively defined totalities whose items may have more than one generation tree: some proof systems are an example of this, where generation trees turn out to be proofs. This is not true, however, of Natural Deduction systems of the kind presented in the Manual: the proofs are not straightforwardly generation trees of any inductive definition. Note that what we get for the systems in the Manual are in fact inductive definitions of the totality of proofs. (I’ve said ‘systems’ rather than ‘system’, since there are three different systems—for L1, for L2, and for L=.) The generation trees for these definitions represent stages in creating a more complex proof out of a given proof or proofs. Sadly, the particular definitions given in the Manual don’t quite yield unique generation trees, and so proofs can be structurally ambiguous: see Exercise 3.1.1. (In fact, though, it’s easy enough to formulate definitions for a Natural Deduction system in a way that does guarantee a unique structure to each proof.)
42 Recall f¬ and f∧ from page 8.
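The pattern just described (an inductive definition of formulae, then a recursive definition of evaluation) can be sketched in code. This is only an illustration under assumptions of my own: the class names `Letter`, `Not` and `And` and the function `evaluate` are hypothetical, only two connectives are modelled, and Python's `True`/`False` stand in for the two truth values.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Inductive definition: the basic items are sentence letters; new items are
# made by negation and by conjunction. (Only these two ways are modelled.)
@dataclass(frozen=True)
class Letter:
    name: str

@dataclass(frozen=True)
class Not:
    sub: "Formula"

@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"

Formula = Union[Letter, Not, And]

def evaluate(phi: Formula, structure: Dict[str, bool]) -> bool:
    """Recursive definition of |phi|_A: one clause per way of building phi."""
    if isinstance(phi, Letter):
        return structure[phi.name]            # value specified outright
    if isinstance(phi, Not):
        return not evaluate(phi.sub, structure)
    if isinstance(phi, And):
        return evaluate(phi.left, structure) and evaluate(phi.right, structure)
    raise TypeError("not a formula")

# The formula ¬(P ∧ ¬Q), evaluated in a structure making P true and Q false.
example = Not(And(Letter("P"), Not(Letter("Q"))))
value = evaluate(example, {"P": True, "Q": False})
```

Each `isinstance` branch corresponds to one clause of the inductive definition, so the nesting of the calls mirrors the generation tree of the formula; unique readability is what makes the recursion well defined.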
Once we have an inductively defined totality, we can then take proof by induction as a primitive proof strategy—that is, proof by induction on the inductive definition of the totality.43 To show that every item in an inductively defined totality satisfies a given condition, all we have to do is (i) show that every basic item satisfies the condition, and (ii), for each way of making new items out of existing ones, show that an item made in this way satisfies the condition, provided that the item(s) it’s made out of do(es)—and this can be shown by taking it as an assumption (called the inductive hypothesis) that the item(s) it’s made out of satisfies/satisfy the condition, and under this assumption showing that the item we’re considering satisfies it too. Part (i) is often described as establishing the ‘base cases’, and part (ii) the ‘induction steps’.
There’s another strategy that’s often adopted when reasoning about the formulae of a language, or like totalities, viz. to define some measure of complexity44—such as the number of symbols, or the number of connectives, or whatever—and then to use some form of proof by induction on the natural numbers. However, using a direct proof by induction on an inductive definition is conceptually more basic, and it’s also usually more straightforward. But proof by induction on the natural numbers will often be used for other purposes, and so there follows a schematic presentation of three common forms of it. These are inference schemes: from what’s above the line you may infer what’s below it.
Ordinary (step-by-step) Induction:
$$\frac{\Phi(0), \qquad \forall n\,[\Phi(n) \Rightarrow \Phi(n+1)]}{\forall n\,\Phi(n)}$$
Complete (all-at-once) Induction:
$$\frac{\forall n\,[\,\forall m\,[m < n \Rightarrow \Phi(m)] \Rightarrow \Phi(n)\,]}{\forall n\,\Phi(n)}$$
The Least Number Principle:
$$\frac{\exists n\,\Phi(n)}{\exists n\,[\,\forall m\,[m < n \Rightarrow \text{not-}\Phi(m)] \text{ and } \Phi(n)\,]}$$
(i.e. there is a smallest number such that Φ(it))
43 I’m not, of course, talking about formal proofs in any logical system, but about our informal (but rigorous) reasoning about formal logic.
44 This can standardly be a recursive definition.
It’s noteworthy that proof by step-by-step induction can itself be seen as proof by induction on an inductive definition. This is because the set of natural numbers can be inductively defined: there is just one basic item, viz. 0, and there is just one way of making a new item out of an existing one, viz. adding 1. The principle of complete induction then follows from the principle of step-by-step induction: there is a proof of this in, for example, Bostock, Intermediate Logic, Section 2.8. And the least number principle is really the same principle as complete induction: it’s just its ‘contraposition’ (see 1.7)—with ‘not-Φ( )’ replacing ‘Φ( )’.
———
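The remark that the least number principle is the contraposition of complete induction can be spelt out in three lines. What follows is only a sketch of that one step, using classical reasoning about the quantifiers; it is not a proof of either principle.

```latex
\begin{align*}
% Complete induction, applied to the condition not-\Phi:
\forall n\,\bigl[\forall m\,[m<n \Rightarrow \neg\Phi(m)] \Rightarrow \neg\Phi(n)\bigr]
  &\;\Longrightarrow\; \forall n\,\neg\Phi(n) \\
% Contraposing the whole implication:
\neg\forall n\,\neg\Phi(n)
  &\;\Longrightarrow\; \neg\forall n\,\bigl[\forall m\,[m<n \Rightarrow \neg\Phi(m)] \Rightarrow \neg\Phi(n)\bigr] \\
% Pushing the negations inwards on both sides:
\exists n\,\Phi(n)
  &\;\Longrightarrow\; \exists n\,\bigl[\forall m\,[m<n \Rightarrow \neg\Phi(m)] \text{ and } \Phi(n)\bigr]
\end{align*}
```

The last line is the least number principle as displayed among the three inference schemes.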
Exercise 1.6.1. (a) Assume that A is a non-empty set of natural numbers and that, for all n ∈ ℕ, if n + 1 ∈ A, then n ∈ A. Show that 0 ∈ A. (b) By the least number principle it’s trivial to show that if a set A of natural numbers is non-empty, then A contains a least element.45 But use just step-by-step induction to show that this holds.
Exercise 1.6.2. C Suggest an inductive definition of (the set of) Plato’s ancestors using the relations being-a-father-of and being-a-mother-of. From the assumption that someone’s human only if both their parents are human, prove by induction that all Plato’s ancestors were human. How many ancestors did Plato have?
1.7
Proof Strategies
There are various manoeuvres that are standardly adopted in setting up a proof that so-and-so follows from such-and-such—I mean an informal (but rigorous) argument about a logical system, or whatever, not a proof using any such system. And there are technical terms to specify these manoeuvres. There follows a list of some of the common ones. (Unsurprisingly, they do 45
Thus 0)-place ones. For example, we can define F in terms of ¬ and ∧: this is because F is logically equivalent to (P ∧ ¬P), say.107 But there is admittedly something unnatural about introducing an extraneous sentence letter into a formula defining a connective; and we can show that this is never needed for definitions of (> 0)-place connectives: if n > 0 and there is any formula φ of L1(C) at all such that cP₁ … Pₙ and φ are logically equivalent, then there is such a formula containing no sentence letters not among P₁, …, Pₙ. To see this we just need to check that, if Q₁, …, Qₘ are extraneous sentence letters in φ—i.e. not among P₁, …, Pₙ—then φ(P₁/Q₁, …, P₁/Qₘ) would do just as well as a definition, since it is logically equivalent. (Use Lemmas 2.2.1 and 2.4.1, of course.)
In other words, we take
$$\varphi_f = \bigvee_{\vec{x} \in f^{-1}(\mathrm{T})} \ \bigwedge_{1 \le i \le n} \theta(\vec{x}, i),$$
where f⁻¹(T) = {x⃗ ∈ {T, F}ⁿ | f(x⃗) = T}. (‘f⁻¹(a)’ is standard notation for the ‘inverse image’ of an element a in the range of a function f—i.e. for {x | f(x) = a}.) [A formula is said to be in disjunctive normal form (DNF) iff it is a disjunction of conjunctions of sentence letters/negated sentence letters, and is said to be in perfect disjunctive normal form iff it is a disjunction of conjunctions of sentence letters/negated sentence letters such that any sentence letter occurring at all occurs exactly once (negated or unnegated) in each conjunction that’s a disjunct. So we have shown that a (≥ 1)-place and not-constantly-false truth function can be expressed by a formula in perfect disjunctive normal form. (A formula on its own is counted as a disjunction with just one disjunct.) And it’s a corollary of this that any satisfiable formula is equivalent to one in perfect disjunctive normal form. (If, furthermore, we identified an empty disjunction (of anything) with F, then constantly-false truth functions and unsatisfiable formulae would not need to be excluded.)]
107 Hence the caveat in footnote 106 is actually unnecessary: sense can be made of saying that T and F can be defined away from {¬, ∧, ∨, T, F}, leaving {¬, ∧, ∨} adequate to express even fT and fF.
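The construction of φf given above can be tried out mechanically. The sketch below rests on an assumption of mine about θ, whose definition is not on this page: I take θ(x⃗, i) to be the letter Pᵢ when xᵢ is Truth and its negation otherwise. The function names are hypothetical, and `True`/`False` stand in for the two truth values.

```python
from itertools import product

def theta(x, i):
    # Assumed reading of θ(x, i): the letter P_i if x_i is True, else ¬P_i.
    return f"P{i + 1}" if x[i] else f"¬P{i + 1}"

def true_rows(f, n):
    """The inverse image of Truth under f: all n-tuples on which f is True."""
    return [x for x in product([True, False], repeat=n) if f(*x)]

def dnf_formula(f, n):
    """Build φ_f as a string: one conjunction for each row on which f is True."""
    rows = true_rows(f, n)
    if not rows:
        raise ValueError("constantly-false functions are excluded")
    conjunctions = ["(" + " ∧ ".join(theta(x, i) for i in range(n)) + ")"
                    for x in rows]
    return " ∨ ".join(conjunctions)

def dnf_value(f, n, assignment):
    """Value of φ_f under an assignment: some disjunct has all conjuncts true."""
    return any(all(assignment[i] == x[i] for i in range(n))
               for x in true_rows(f, n))

# A sample 2-place truth function: the one for material implication.
def implies(x, y):
    return (not x) or y

formula = dnf_formula(implies, 2)
```

Comparing `dnf_value(implies, 2, a)` with `implies(*a)` for each of the four assignments `a` checks that the formula expresses the function in this one case; it is a spot check, not a proof.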
——— We shall also want to establish that certain sets C of connectives are not adequate: to do this we have to find a truth function that no formula of L1(C) can express. And a general strategy for showing that a function f can’t be expressed is to find some property which all formulae of L1(C) must possess and which precludes their expressing f. For example, {T, ∧, ∨} is not adequate, because no formula of L1(T, ∧, ∨) can express f¬: to see this observe that any formula of L1(T, ∧, ∨) must come out T in any structure that assigns T to all letters (check this by induction), but no formula that expresses f¬ can have this property. And showing that a connective c can’t be defined in terms of a given set C of connectives calls for essentially the same manoeuvre: the example we’ve just run through is, after all, the sketch of a proof that ¬ can’t be defined in terms of {T, ∧, ∨}.
———
Exercise 2.8.1. C Let T* be F and F* be T (so the function x ↦ x* is in fact f¬); and let the dual f* of a truth function f then be defined as follows: f*(x₁, …, xₙ) = f(x₁*, …, xₙ*)*. Assuming that φ is a formula of L1* (see Exercise 2.6.1), show that φ expresses f with respect to ⟨P₁, …, Pₙ⟩ if and only if φ* expresses f* with respect to ⟨P₁, …, Pₙ⟩, where φ* is defined as in Exercise 2.6.1. [Note: f** = f.]
Exercise 2.8.2. C Show that any truth function can be expressed by a conjunction whose conjuncts are each either a disjunction whose only disjuncts are T or a disjunction whose disjuncts are either a letter or a negated letter. [Invoke the formula φf defined in the proof of Theorem 2.8.1, and appeal to Exercise 2.8.1.]
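The dual operation of Exercise 2.8.1 is easy to experiment with. Here is a minimal sketch, assuming that truth functions are modelled as Python functions on booleans; `dual` and `same_function` are hypothetical names of mine.

```python
from itertools import product

def dual(f):
    """Return f*, where f*(x1, ..., xn) = (f(x1*, ..., xn*))* and * is negation."""
    return lambda *xs: not f(*(not x for x in xs))

def same_function(f, g, n):
    """Compare two n-place truth functions on every n-tuple of truth values."""
    return all(f(*xs) == g(*xs) for xs in product([True, False], repeat=n))

def conj(x, y):
    return x and y

def disj(x, y):
    return x or y

# The dual of the conjunction function is the disjunction function,
# and taking the dual twice gives back the original function.
dual_of_conj_is_disj = same_function(dual(conj), disj, 2)
double_dual_is_identity = same_function(dual(dual(conj)), conj, 2)
```

This checks the note f** = f for one function only; the exercise itself still calls for a general argument.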
Exercise 2.8.3 (An alternative proof of Theorem 2.8.1). J Let P₁, P₂, P₃, … be an enumeration, without repetitions, of the sentence letters of L1(¬, ∧, ∨, T, F). Prove by induction on the natural numbers that the following condition holds for all n: for every function f : {T, F}ⁿ → {T, F}, there is a formula of L1(¬, ∧, ∨, T, F) that expresses f with respect to ⟨P₁, …, Pₙ⟩. [Hint for the induction step: Given any (n + 1)-place function f, there are two n-place functions f^T and f^F defined as follows: f^T(x₁, …, xₙ) = f(x₁, …, xₙ, T), f^F(x₁, …, xₙ) = f(x₁, …, xₙ, F). In terms of formulae that express f^T and f^F with respect to the list ⟨P₁, …, Pₙ⟩, you can then use ¬, ∧, and ∨ to construct a formula to express f with respect to ⟨P₁, …, Pₙ, Pₙ₊₁⟩.]
Exercise 2.8.4 (An alternative proof of Theorem 2.5.1 (Theorem 2.5.2)). Assume that φ ⊨ ψ and that ⟨P₁, …, Pₙ⟩ is a list of the sentence letters that occur both in φ and in ψ. We can then ‘generalize’ the obvious proof strategy for Exercise 2.3.4108 to find a Craig interpolant109 for φ and ψ (in fact, to specify the whole set of such interpolants).
(a) First show that there is no x⃗ ∈ {T, F}ⁿ such that both the following hold: (i) there is a B such that B(Pᵢ) = xᵢ, 1 ≤ i ≤ n, and |φ|B = T; (ii) there is a C such that C(Pᵢ) = xᵢ, 1 ≤ i ≤ n, and |ψ|C = F.
(b) Hence deduce that there is at least one f : {T, F}ⁿ → {T, F} such that
$$f(\vec{x}) = \begin{cases} \mathrm{T} & \text{if there is a } B \text{ such that } B(P_i) = x_i,\ 1 \le i \le n, \text{ and } |\varphi|_B = \mathrm{T}, \\ \mathrm{F} & \text{if there is a } C \text{ such that } C(P_i) = x_i,\ 1 \le i \le n, \text{ and } |\psi|_C = \mathrm{F}. \end{cases}$$
(c) Now adduce (the proof of) Theorem 2.8.1 to show that every such f can be expressed by a formula that’s a Craig interpolant for φ and ψ.
(d) Show that λ is a Craig interpolant for φ and ψ only if λ expresses, with respect to ⟨P₁, …, Pₙ⟩, a function f that satisfies the condition in part (b).
Exercise 2.8.5. (a) Define ∨ in terms of → alone. (b) Show that → cannot be defined in terms of ∨ alone. (c) Show that ∧ cannot be defined in terms of → alone.
108 In fact this exercise subsumes Exercise 2.3.4, since it all makes sense when n = 0.
109 I.e. a λ satisfying the conditions of Theorem 2.5.1 (Theorem 2.5.2).
Exercise 2.8.6. (a) Show that {↔, F, ∧} is adequate, but that no proper subset is. (b) Is {T, F, ∧, ∨} adequate? Justify your answer.
Exercise 2.8.7. (a) Show that each of {↑} and {↓} is expressively adequate. (b) Assume that c is a 2-place connective. Show that if either fc(T, T) = T or fc(F, F) = F, then {c} is not adequate. Also show that if either fc(x, y) = f¬(x) or fc(x, y) = f¬(y), then {c} is not adequate. (c) Deduce that no two-place connective other than ↑ or ↓ is expressively adequate on its own.
Exercise 2.8.8. Let the 3-place connectives ▽ and △ be the so-called majority and minority connectives110: f▽(x, y, z) = T if and only if at least two of x, y, and z are T111; and f△(x, y, z) = T if and only if at most one of x, y, and z is T112.
(a) (i) Show that ▽(φ, ψ, χ), ¬△(φ, ψ, χ) and △(¬φ, ¬ψ, ¬χ) are equivalent. (ii) Define (φ ∧ ψ) in terms of ▽ and F. (iii) Define ¬φ in terms of △. (iv) Define ▽(φ, ψ, χ) in terms of ∧ and ∨. (v) Show that ¬φ cannot be defined in terms of ∧, ∨ and F.
(b) (i) Are ▽ and F together adequate to express all truth functions? (ii) Are △ and F together adequate to express all truth functions? Justify your answers by reference to part (a).
110 For readability, we’ll write ‘▽(φ, ψ, χ)’ and ‘△(φ, ψ, χ)’ for ▽φψχ and △φψχ.
111 This is not actually very coherently put; ‘are’, of course, means ‘are identical to’, and that points up the problem: what I’m trying to say is that either both x = T and y = T or both y = T and z = T or both z = T and x = T.
112 Similar problem mutatis mutandis.
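The two truth functions of Exercise 2.8.8 can be written down directly, and a truth-table check of the three-way agreement claimed in part (a)(i) is then mechanical. This is a sketch with hypothetical names, using booleans for the truth values; a brute-force check is of course no substitute for the argument the exercise asks for.

```python
from itertools import product

def majority(x, y, z):
    # Truth iff at least two arguments are Truth, spelt out pairwise.
    return (x and y) or (y and z) or (z and x)

def minority(x, y, z):
    # Truth iff at most one argument is Truth.
    return [x, y, z].count(True) <= 1

rows = list(product([True, False], repeat=3))

# Majority agrees with negated minority on all eight rows ...
agrees_with_negated_minority = all(
    majority(x, y, z) == (not minority(x, y, z)) for x, y, z in rows)

# ... and with minority applied to the negated arguments.
agrees_with_minority_of_negations = all(
    majority(x, y, z) == minority(not x, not y, not z) for x, y, z in rows)
```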
Exercise 2.8.9. J The dual c* of a connective c is interpreted by the dual of the truth function that interprets c—i.e. f_{c*} = (f_c)* (as defined in exercise 2.8.1). Show that C is adequate if and only if {c* | c ∈ C} is.
Exercise 2.8.10. J Call a truth function f self-dual if and only if f = f* (as defined in exercise 2.8.1); and correspondingly call a connective c self-dual if and only if fc is self-dual. (a) Show that if C contains only self-dual connectives, then formulae of L1(C) express only self-dual functions. (b) Deduce that if C contains only self-dual connectives, then C is not adequate.
Exercise 2.8.11. J Let the 3-place connectives ▽ and △ be the majority and minority connectives as in Exercise 2.8.8. And, for any sentence letter P, let τP be the transformation taking a formula of L1(¬, ∨) to a formula of L1(¬, ▽) that’s defined by recursion as follows:
τP(Q) = Q, for every sentence letter Q (including P)
τP(¬φ) = ¬τP(φ)
τP(φ ∨ ψ) = ▽(τP(φ), τP(ψ), P)
(a) Show that, whenever |P|A = T, then it’s both the case that |τP(φ)|A = |φ|A and that |τP(φ)|A* = (|φ|A)*, where the first occurrence of ‘*’ is as in exercise 2.6.1, and the second is as in exercise 2.8.1 (so it’s just f¬).
(b) Assume that f is an (n + 1)-place self-dual truth function—‘self-dual’ is defined in exercise 2.8.10—and let f^T be the n-place truth function such that f^T(x₁, …, xₙ) = f(x₁, …, xₙ, T). Show that if a formula φ of L1(¬, ∨) expresses f^T and P does not occur in φ, then τP(φ) expresses f.
(c) Since ¬ and ▽ are both self-dual, we know from exercise 2.8.10, part (a), that any formula of L(¬, ▽) will express a self-dual function. Now show that the set {¬, ▽} is adequate to express every self-dual function.
(d) Is {▽} adequate to express every self-dual function?
(e) Is {△} adequate to express every self-dual function?113
113 △ is itself self-dual, and so formulae of L(△) express only self-dual functions.
3
Proof systems
In classical logic, at least, semantic definitions are taken to be basic114 , and any proof system is expected to generate as instances of what follows from what all and only those instances of following-from that the semantic definitions dictate: a proof system is said to be sound provided it doesn’t prove anything for which there is a counter-example; and it is said to be complete provided it does prove everything for which there is no counter-example. The Natural Deduction system in the Manual for L1 —henceforth ND1 —is both sound and complete115 —Theorems 3.3.1 and 3.3.2 below. But there are plenty of other systems too, and we shall investigate some of these.
3.1
Natural Deduction systems
Here is an inductive definition of a relation b between sets of sentences of L1 and sentences of L1—we’re writing ‘Γ b φ’ instead of ‘⟨Γ, φ⟩ ∈ b’, to make it more readable.
(1) {φ} b φ.
(2∧I) If Γ₁ b φ and Γ₂ b ψ, then Γ₁ ∪ Γ₂ b φ ∧ ψ.
(2∧E1) If Γ b φ ∧ ψ, then Γ b φ.
(2∧E2) If Γ b φ ∧ ψ, then Γ b ψ.
(2→I) If Γ b ψ, then Γ ∖ {φ} b φ → ψ.
(2→E) If Γ₁ b φ → ψ and Γ₂ b φ, then Γ₁ ∪ Γ₂ b ψ.
(2¬I) If Γ₁ b ψ and Γ₂ b ¬ψ, then (Γ₁ ∖ {φ}) ∪ (Γ₂ ∖ {φ}) b ¬φ.
(2¬E) If Γ₁ b ψ and Γ₂ b ¬ψ, then (Γ₁ ∖ {¬φ}) ∪ (Γ₂ ∖ {¬φ}) b φ.
(2∨I1) If Γ b φ, then Γ b φ ∨ ψ.
(2∨I2) If Γ b ψ, then Γ b φ ∨ ψ.
(2∨E) If Γ b φ ∨ ψ and ∆₁ b χ and ∆₂ b χ, then Γ ∪ (∆₁ ∖ {φ}) ∪ (∆₂ ∖ {ψ}) b χ.
(2↔I) If Γ₁ b ψ, and Γ₂ b φ, then (Γ₁ ∖ {φ}) ∪ (Γ₂ ∖ {ψ}) b φ ↔ ψ.
(2↔E1) If Γ₁ b φ ↔ ψ and Γ₂ b φ, then Γ₁ ∪ Γ₂ b ψ.
(2↔E2) If Γ₁ b φ ↔ ψ and Γ₂ b ψ, then Γ₁ ∪ Γ₂ b φ.
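Each clause of this definition can be read as an operation that builds a new pair ⟨Γ, φ⟩ out of pairs already in the relation. The sketch below models just four of the clauses under assumptions of my own: a pair is a Python tuple of a `frozenset` and a formula, a sentence letter is a string, compound formulae are nested tuples, and all the function names are hypothetical.

```python
def assume(phi):
    """Clause (1): {phi} b phi."""
    return (frozenset([phi]), phi)

def and_intro(s1, s2):
    """Clause (2∧I): pool the two sets of sentences and conjoin."""
    (g1, a), (g2, b) = s1, s2
    return (g1 | g2, ("∧", a, b))

def imp_elim(s1, s2):
    """Clause (2→E): from a conditional and its antecedent to its consequent."""
    (g1, imp), (g2, a) = s1, s2
    if not (isinstance(imp, tuple) and imp[0] == "→" and imp[1] == a):
        raise ValueError("clause (2→E) does not apply")
    return (g1 | g2, imp[2])

def imp_intro(s, phi):
    """Clause (2→I): remove phi from the left and form the conditional."""
    g, b = s
    return (g - {phi}, ("→", phi, b))

# From (P ∧ Q) → R to P → (Q → R), one pair per step.
P, Q, R = "P", "Q", "R"
premise = ("→", ("∧", P, Q), R)
step1 = and_intro(assume(P), assume(Q))    # P, Q b P ∧ Q
step2 = imp_elim(assume(premise), step1)   # P, Q, (P ∧ Q) → R b R
step3 = imp_intro(step2, Q)                # P, (P ∧ Q) → R b Q → R
step4 = imp_intro(step3, P)                # (P ∧ Q) → R b P → (Q → R)
```

The nesting of the calls that produce `step4` is a generation tree for the final pair, and the set difference in `imp_intro` is where a sentence disappears from the left.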
What in Section 1.6 I called the ‘generation trees’ of this definition actually constitute proofs in a sequent-style version of ND1. Consider, for example, the following pairs of proofs—showing that the sentences (P ∧ Q) → R and P → (Q → R) are interderivable—the first in the traditional Natural-Deduction style of the Manual, and the second in sequent style116. The steps at which assumptions are discharged are labelled—and matchingly the corresponding lines in the sequent-style proofs, where sentences disappear from the left of b:
(i)
$$\dfrac{\dfrac{\dfrac{\dfrac{[P]^2 \qquad [Q]^1}{P \land Q} \qquad (P \land Q) \to R}{R}}{Q \to R}\,(1)}{P \to (Q \to R)}\,(2)$$
$$\dfrac{\dfrac{\dfrac{\dfrac{P \mathrel{b} P \qquad Q \mathrel{b} Q}{P, Q \mathrel{b} P \land Q} \qquad (P \land Q) \to R \mathrel{b} (P \land Q) \to R}{P, Q, (P \land Q) \to R \mathrel{b} R}}{P, (P \land Q) \to R \mathrel{b} Q \to R}\,(1)}{(P \land Q) \to R \mathrel{b} P \to (Q \to R)}\,(2)$$
(ii)
$$\dfrac{\dfrac{\dfrac{[P \land Q]^1}{Q} \qquad \dfrac{\dfrac{[P \land Q]^1}{P} \qquad P \to (Q \to R)}{Q \to R}}{R}}{(P \land Q) \to R}\,(1)$$
$$\dfrac{\dfrac{\dfrac{P \land Q \mathrel{b} P \land Q}{P \land Q \mathrel{b} Q} \qquad \dfrac{\dfrac{P \land Q \mathrel{b} P \land Q}{P \land Q \mathrel{b} P} \qquad P \to (Q \to R) \mathrel{b} P \to (Q \to R)}{P \land Q,\ P \to (Q \to R) \mathrel{b} Q \to R}}{P \land Q,\ P \to (Q \to R) \mathrel{b} R}}{P \to (Q \to R) \mathrel{b} (P \land Q) \to R}\,(1)$$
Sadly, since labelling is not constitutive of proofs in the Manual definition, ND1 proofs are not actually in one-one correspondence with these sequent-style proofs (see footnote 41 and Exercise 3.1.1). Even so, we have the following useful lemma—useful, because in lots of contexts the inductive definition of b is easier to work with than ND1 proofs.
Lemma 3.1.1. Γ b φ if and only if there is an ND1 proof of φ whose undischarged premises are exactly the sentences in Γ.
Proof. From left to right by induction on the inductive definition of b; from right to left by induction on the inductive definition of ND1 proofs.
Def. Γ ⊢ φ ⇔ there is an ND1 proof of φ all of whose undischarged assumptions are elements of Γ.117
114 We take the meaning of the connectives to be given by their truth tables. In intuitionist logic, on the other hand—or other constructivist logics—the meaning of connectives can be seen as actually given by proof rules.
115 The Manual wraps these two results up in one and calls it ‘Adequacy’: page 144.
116 in which we drop ‘{’ and ‘}’ and replace ‘∪’ by ‘,’—as we do on the left of ‘⊨’ and ‘⊢’.
117 This is precisely the definition given in the Manual, of course.
Corollary 3.1.2. Γ ⊢ φ if and only if Γ′ b φ, for some subset Γ′ of Γ.
And another noteworthy—though intuitively obvious—lemma:
Lemma 3.1.3. If Γ b φ, then Γ is finite.
Proof. By induction on the inductive definition of b.
Corollary 3.1.4. Γ ⊢ φ if and only if Γ′ ⊢ φ, for some finite subset Γ′ of Γ.
And we ought now to record some language-independent properties of ⊢, which match the properties of ⊨ given in Lemma 2.3.2.
Lemma 3.1.5.
(i) If φ ∈ Γ, then Γ ⊢ φ.
(ii) If Γ ⊢ φ and Γ ⊆ ∆, then ∆ ⊢ φ.
(iii) If Γ ⊢ φ and Γ, φ ⊢ ψ, then Γ ⊢ ψ.
Proof. Parts (i) and (ii) are easy to check directly: Corollary 3.1.2 gives us (i), since {φ} b φ. And this corollary obviously gives (ii) as well. On the other hand, although it’s easy to ‘see’ that (iii) holds—by cutting-and-pasting ND1 proofs, as it were—a rigorous inductive argument is more complicated. Since this fact will not actually be required to establish the soundness or completeness of ND1, we could wait until we have proved these results and then deduce (iii) from them—using part (iii) of Lemma 2.3.2. But it might be instructive to get our hands dirty. The guts of the argument is contained in the following sublemma, which will also be useful for other results that it’s natural to argue for in a cut-and-paste sort of way: the result can be seen precisely as underpinning cut-and-paste arguments.118 119
Sublemma 3.1.6. If Γ b φ and ∆ b ψ, then Γ′ ∪ (∆ ∖ {φ}) b ψ, for some Γ′ ⊆ Γ.
Proof. Assume that Γ b φ and fix on Γ and φ. We can now argue by induction on the definition of b to show that the required condition holds for any ∆ and ψ such that ∆ b ψ—i.e. to show that the displayed property holds of all pairs ⟨∆, ψ⟩ in the relation b: there exists Γ′ ⊆ Γ such that Γ′ ∪ (∆ ∖ {φ}) b ψ.
(1) First we must show that if {ψ} b ψ, then there exists Γ′ ⊆ Γ such that Γ′ ∪ ({ψ} ∖ {φ}) b ψ. There are two cases to consider: (i) ψ = φ, (ii) ψ ≠ φ. In case (i) we can take Γ′ to be Γ, and then Γ′ ∪ ({ψ} ∖ {φ}) b ψ, since Γ′ ∪ ({ψ} ∖ {φ}) = Γ—and it’s a background assumption that Γ b φ. In case (ii) we can take Γ′ to be ∅, and then Γ′ ∪ ({ψ} ∖ {φ}) b ψ, since Γ′ ∪ ({ψ} ∖ {φ}) = {ψ}.
(2) Now we need inductive steps for each of the closure conditions on b, viz. the clauses in the definition that correspond to ND1 inference rules. I’ll present four of the thirteen cases—in order of increasing complexity—and leave the other nine as an exercise: Exercise 3.1.3.
(2∧E1) The inductive assumption for this clause is that there exists Γ′ ⊆ Γ such that Γ′ ∪ (∆ ∖ {φ}) b ψ₁ ∧ ψ₂; and we have to show that there exists Γ′ ⊆ Γ such that Γ′ ∪ (∆ ∖ {φ}) b ψ₁. But this is immediate, since, for any set Γ′, clause (2∧E1) gives that if Γ′ ∪ (∆ ∖ {φ}) b ψ₁ ∧ ψ₂, then Γ′ ∪ (∆ ∖ {φ}) b ψ₁.
(2∧I) The inductive assumption this time is that there both exists Γ₁ ⊆ Γ such that Γ₁ ∪ (∆₁ ∖ {φ}) b ψ₁ and exists Γ₂ ⊆ Γ such that Γ₂ ∪ (∆₂ ∖ {φ}) b ψ₂; and we have to show there exists Γ′ ⊆ Γ such that Γ′ ∪ ((∆₁ ∪ ∆₂) ∖ {φ}) b ψ₁ ∧ ψ₂. But clause (2∧I) gives that (Γ₁ ∪ (∆₁ ∖ {φ})) ∪ (Γ₂ ∪ (∆₂ ∖ {φ})) b ψ₁ ∧ ψ₂. And (Γ₁ ∪ (∆₁ ∖ {φ})) ∪ (Γ₂ ∪ (∆₂ ∖ {φ})) = (Γ₁ ∪ Γ₂) ∪ ((∆₁ ∪ ∆₂) ∖ {φ}). Hence we can take Γ′ to be Γ₁ ∪ Γ₂.
(2→I) The inductive assumption is the single condition that there exists Γ₀ ⊆ Γ such that Γ₀ ∪ (∆ ∖ {φ}) b ψ₂; and we have to show that there exists Γ′ ⊆ Γ such that Γ′ ∪ ((∆ ∖ {ψ₁}) ∖ {φ}) b ψ₁ → ψ₂. But clause (2→I) guarantees that (Γ₀ ∪ (∆ ∖ {φ})) ∖ {ψ₁} b ψ₁ → ψ₂. And (Γ₀ ∪ (∆ ∖ {φ})) ∖ {ψ₁} = (Γ₀ ∖ {ψ₁}) ∪ ((∆ ∖ {ψ₁}) ∖ {φ}). Hence we can take Γ′ to be Γ₀ ∖ {ψ₁}.
(2∨E) The inductive assumption for this clause is three-fold—that there exists Γ₀ ⊆ Γ such that Γ₀ ∪ (∆₀ ∖ {φ}) b ψ₁ ∨ ψ₂, that there exists Γ₁ ⊆ Γ such that Γ₁ ∪ (∆₁ ∖ {φ}) b χ, and that there exists Γ₂ ⊆ Γ such that Γ₂ ∪ (∆₂ ∖ {φ}) b χ; and we have to show that there exists Γ′ ⊆ Γ such that Φ b χ, where Φ = Γ′ ∪ ((∆₀ ∪ (∆₁ ∖ {ψ₁}) ∪ (∆₂ ∖ {ψ₂})) ∖ {φ}). But clause (2∨E) guarantees that Ψ b χ, where Ψ = (Γ₀ ∪ (∆₀ ∖ {φ})) ∪ ((Γ₁ ∪ (∆₁ ∖ {φ})) ∖ {ψ₁}) ∪ ((Γ₂ ∪ (∆₂ ∖ {φ})) ∖ {ψ₂}). Hence we can take Γ′ to be Γ₀ ∪ (Γ₁ ∖ {ψ₁}) ∪ (Γ₂ ∖ {ψ₂}), since then Φ = Ψ.
118 It tells us that in a proof of ψ whose undischarged assumptions are the sentences in ∆, any occurrence of φ among these assumptions can be replaced by a proof of φ whose undischarged assumptions are the sentences in Γ, to produce a proof of ψ whose undischarged assumptions are the sentences in ∆ other than φ (if it was there in the first place) together with sentences in a subset of Γ—not necessarily all the sentences in Γ, not only because there might not actually be any occurrences of φ to replace, but also because if there are occurrences that have been replaced, then there might be steps in the proof of ψ that require the discharging of assumptions that remained undischarged in the proof of φ.
119 We could invoke → and its rules to provide a much less elaborate proof—one that avoids induction steps for each rule in the system. However, I want a proof strategy that will work mutatis mutandis for any decent natural-deduction system for whatever language—possibly one without →. That rules for a connective provide workable induction steps could be seen as a constraint on their admissibility as natural-deduction rules.
This completes the amount of proof that I have the stamina for.
Proof of Lemma 3.1.5 continued: part (iii). If Γ ⊢ φ and Γ, φ ⊢ ψ, then there exist Γ′ ⊆ Γ and ∆′ ⊆ Γ ∪ {φ} such that Γ′ b φ and ∆′ b ψ. The sublemma then guarantees that Γ″ ∪ (∆′ ∖ {φ}) b ψ, for some Γ″ ⊆ Γ′. But then Γ″ ∪ (∆′ ∖ {φ}) ⊆ Γ, and so Γ ⊢ ψ.
———
Exercise 3.1.1. The following two ND1 proofs are structurally ambiguous: there are alternative ways in which to assign the dischargings to lines in the proof, so that they correspond to different proofs in the sequent-style presentation. Label the proofs in alternative ways so as to reveal the ambiguity and then write out the different corresponding proofs in the sequent-style presentation.
[Two ND1 proof trees; their two-dimensional layout is lost. The formulae occurring in them, in the order in which they survive: P, [¬P], ¬¬P, P → R, R, R, P → R, Q → (P → R), [P], P ∨ Q, [¬P], ¬¬P, [Q], Q → R, R.]
Exercise 3.1.2. Show, by induction on the definition of b, that if Γ b φ, then Γ ⊨ φ.
Exercise 3.1.3. Do the undone inductive steps in the proof of Sublemma 3.1.6.
3.2
Consistency and (Negation) Completeness
The Manual gives a definition of syntactic consistency as follows:
Def. Γ is (syntactically) consistent120 ⇔ there is some φ such that Γ ⊬ φ.
But there is an alternative definition that we can use for a system whose language contains ¬, which I’ll here distinguish by calling it ‘¬-consistency’:
Def. Γ is ¬-consistent ⇔ there is no φ such that both Γ ⊢ φ and Γ ⊢ ¬φ.
120 Standardly this is called ‘consistency’ simpliciter, and there’s no ambiguity if—as is also standard—the term ‘satisfiable’ is used instead of ‘semantically consistent’. In the following sections of these notes I’ll follow this practice.
It’s a trivial matter to show that these definitions are equivalent:
Lemma 3.2.1. Γ is (syntactically) consistent if and only if Γ is ¬-consistent.121
Proof. Exercise 3.2.1.
And the terminology ‘inconsistent’, and ‘¬-inconsistent’, can now simply be defined to mean not consistent, and not ¬-consistent. Obviously—from part (ii) of Lemma 3.1.5—we have
Lemma 3.2.2. If Γ is consistent and ∆ ⊆ Γ, then ∆ is consistent.
And, with Corollary 3.1.4 (and Lemma 3.1.5), Lemma 3.2.1 now immediately yields the finitariness of consistency—a cheap and easy result, unlike the finitariness of satisfiability, viz. compactness (Theorem 2.7.1).
Lemma 3.2.3. Γ is consistent if (and only if) Γ′ is consistent, for every finite subset Γ′ of Γ.
The notion of (¬-)consistency122 is defined in terms of ⊢, but we can easily formulate a necessary and sufficient condition to characterize ⊢ in terms of consistency. In fact consistency and ⊢ fit together in exactly the way that satisfiability and ⊨ do; and we have a result matching Lemma 2.3.3:
Lemma 3.2.4.
(i) Γ ⊢ φ if and only if Γ ∪ {¬φ} is inconsistent.
(ii) Γ ⊢ ¬φ if and only if Γ ∪ {φ} is inconsistent.
Proof. Exercise 3.2.1.
———
Now a notion of ‘completeness’—a property of sets of sentences, which is not to be confused with the semantic completeness of proof systems. (Note that we might very appropriately call it ‘¬-completeness’, since its definition matches that of ¬-consistency.)
Def. Γ is complete ⇔ for every φ either Γ ⊢ φ or Γ ⊢ ¬φ.
———
Exercise 3.2.1. Prove Lemmas 3.2.1 and 3.2.4.
121 Cf. Exercise 2.3.1: with parts (ii) and (iii) we have the corresponding equivalence for ⊨.
122 Henceforth, I’ll just be saying ‘consistency’, though now we have Lemma 3.2.1, we’re entitled to invoke whichever of the equivalent defining conditions is most convenient—and I’ll do this without comment.
Exercise 3.2.2. Show that if Γ is consistent, then either Γ ∪ {φ} or Γ ∪ {¬φ} is consistent.
Exercise 3.2.3. Show that if Γ is consistent and complete then, for any φ and ψ123,
(i) Γ ⊢ ¬φ ⇔ Γ ⊬ φ.
(ii) Γ ⊢ φ ∧ ψ ⇔ both Γ ⊢ φ and Γ ⊢ ψ.
(iii) Γ ⊢ φ → ψ ⇔ either Γ ⊬ φ or Γ ⊢ ψ.
(iv) Γ ⊢ φ ∨ ψ ⇔ either Γ ⊢ φ or Γ ⊢ ψ.
(v) Γ ⊢ φ ↔ ψ ⇔ either both Γ ⊢ φ and Γ ⊢ ψ or both Γ ⊬ φ and Γ ⊬ ψ.
Exercise 3.2.4. A set Γ of sentences is maximally consistent if and only if Γ is consistent and, for any φ, either φ ∈ Γ or Γ ∪ {φ} is inconsistent.124 (a) Show that if Γ is maximally consistent, then Γ is complete. (b) Show that if Γ is maximally consistent, then, for all φ, Γ ` φ ⇒ φ ∈ Γ. (c) Show that if Γ is complete and consistent, and, for all φ, Γ ` φ ⇒ φ ∈ Γ, then Γ is maximally consistent.
3.3
Soundness and (Semantic) Completeness
Theorem 3.3.1 (The Soundness of ND1). If Γ ⊢ φ, then Γ ⊨ φ.
Theorem 3.3.2 (The Completeness of ND1). If Γ ⊨ φ, then Γ ⊢ φ.
But from the definitions and results of sections 2.3 and 3.1, it’s not difficult to check that these results have alternative formulations in terms of consistency and satisfiability: Exercise 3.3.1.
Theorem 3.3.3 (The Soundness of ND1, alternative formulation). If Γ is satisfiable, then Γ is consistent.
Theorem 3.3.4 (The Completeness of ND1, alternative formulation). If Γ is consistent, then Γ is satisfiable.
123 Thus provability conditions exactly match truth conditions for the connectives.
124 It’s easy to check that this is equivalent to saying that Γ is consistent but has no proper superset that’s consistent.
——— Establishing soundness is easier than establishing completeness:
Proof of Theorem 3.3.1. From the assumption that Γ ⊢ φ, it follows, by Corollary 3.1.2, that Γ′ b φ, for some subset Γ′ of Γ; hence Γ′ ⊨ φ, by Exercise 3.1.2; but from this it follows, by Lemma 2.3.2, part (ii), that Γ ⊨ φ.
———
To establish completeness we’ll prove Theorem 3.3.4.
Proof of Theorem 3.3.4. It’s natural to break up the argument into two lemmas:
Lemma 3.3.5. If Γ is consistent, then Γ ⊆ Γ⁺, for some complete and consistent set Γ⁺ of sentences.
Lemma 3.3.6. If Γ is complete and consistent, and, for all sentence letters P, A(P) = T iff Γ ⊢ P, then, for all L1 sentences φ, |φ|A = T iff Γ ⊢ φ.
We can then easily put these lemmas together to establish Theorem 3.3.4. Given a consistent set Γ, Lemma 3.3.5 guarantees a complete consistent superset Γ⁺, in terms of which we simply define a structure AΓ⁺ by stipulating that AΓ⁺(P) = T iff Γ⁺ ⊢ P, for all sentence letters P; and then—since φ ∈ Γ ⇒ Γ⁺ ⊢ φ, by Lemma 3.1.5, parts (i) and (ii)—Lemma 3.3.6 guarantees that AΓ⁺ ⊨ Γ: so Γ is satisfiable.
Proof of Lemma 3.3.5. Given consistent Γ, there are various ‘constructions’ we could use to establish the existence of a superset Γ⁺ meeting the conditions required. The following is perhaps the easiest to work with (though not perhaps the neatest conceptually: see Exercise 3.3.2). First let φ₀, φ₁, φ₂, … be an exhaustive enumeration of the sentences of L1. (There is such an enumeration: cf. Exercise 2.7.3.) Then, starting with Γ—the consistent set we’re assuming—take a sequence Γ₀, Γ₁, Γ₂, … of sets of sentences to be defined, by recursion, as follows:
$$\Gamma_0 = \Gamma, \qquad \Gamma_{n+1} = \begin{cases} \Gamma_n & \text{if } \Gamma_n \vdash \varphi_n, \\ \Gamma_n \cup \{\neg\varphi_n\} & \text{otherwise.} \end{cases}$$
And let
$$\Gamma^+ = \bigcup_{n \in \mathbb{N}} \Gamma_n.$$
It’s now easy to check, by induction on ℕ, (i) that we have a chain of sets—i.e. that i < j ⇒ Γᵢ ⊆ Γⱼ; and (ii) that, for every n, Γₙ is consistent. The induction step for (ii) follows from Lemma 3.2.4: Γₙ₊₁ = Γₙ, unless Γₙ ⊬ φₙ, in which case Γₙ₊₁ = Γₙ ∪ {¬φₙ}; but the inductive assumption is that Γₙ is consistent, and if Γₙ ⊬ φₙ, then, by part (i) of Lemma 3.2.4, Γₙ ∪ {¬φₙ} is consistent—hence either way Γₙ₊₁ is consistent.
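The recursive construction of Γ₀, Γ₁, Γ₂, … just described can be imitated on a small scale. The sketch below makes two loud assumptions that are not part of the proof: the enumeration is a finite list rather than all the sentences of L1, and truth-table entailment is used as a stand-in for ⊢ (which only the soundness and completeness theorems would license, so this is an illustration and nothing more). All the names are hypothetical.

```python
from itertools import product

# Formulae: a sentence letter is a string; ("¬", A) and ("∧", A, B) are compound.
def value(phi, structure):
    if isinstance(phi, str):
        return structure[phi]
    if phi[0] == "¬":
        return not value(phi[1], structure)
    if phi[0] == "∧":
        return value(phi[1], structure) and value(phi[2], structure)
    raise ValueError("unknown connective")

def entails(gamma, phi, letters):
    """Truth-table entailment: an ASSUMED stand-in for 'gamma proves phi'."""
    for vals in product([True, False], repeat=len(letters)):
        structure = dict(zip(letters, vals))
        if all(value(g, structure) for g in gamma) and not value(phi, structure):
            return False
    return True

def extend(gamma, enumeration, letters):
    """Keep the current set if it yields phi_n; otherwise add the negation of phi_n."""
    current = set(gamma)
    for phi in enumeration:
        if not entails(current, phi, letters):
            current.add(("¬", phi))
    return current

letters = ["P", "Q"]
start = {("¬", ("∧", "P", "Q"))}
result = extend(start, ["P", "Q"], letters)
```

With this starting set neither P nor Q is yielded, so both negations are added; the resulting set decides every sentence in the (finite) enumeration and is still satisfiable, by the structure that makes both letters false.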
The consistency of Γ⁺ now follows from the purely set-theoretical fact—checked in Exercise 1.5.4—that, since we have a chain of sets, any finite subset of Γ⁺ is a subset of Γₙ, for some n: by Lemma 3.2.2, any finite subset must then be consistent, so that, by Lemma 3.2.3, Γ⁺ itself is consistent. And Γ⁺ is complete because any φ is φₙ, for some n. This means that either Γₙ ⊢ φ, or else Γₙ ⊬ φ and Γₙ₊₁ = Γₙ ∪ {¬φ}, so that Γₙ₊₁ ⊢ ¬φ—by part (i) of Lemma 3.1.5. Hence, since both Γₙ and Γₙ₊₁ are subsets of Γ⁺, either Γ⁺ ⊢ φ or Γ⁺ ⊢ ¬φ—by part (ii) of Lemma 3.1.5. This completes the proof of Lemma 3.3.5.

Proof of Lemma 3.3.6. All the hard work has been done in Exercise 3.2.3: after that, the proof is a straightforward induction on complexity of sentences.

———

Exercise 3.3.1. Derive Theorems 3.3.1 and 3.3.3 directly from one another, and derive Theorems 3.3.2 and 3.3.4 directly from one another.¹²⁵

Exercise 3.3.2. Show that if, in the proof of Lemma 3.3.5, we replaced the definition of Γₙ₊₁ in terms of Γₙ by either of the following, then Γ⁺ would be maximally consistent.¹²⁶

(i)
\[
\Gamma_{n+1} =
\begin{cases}
\Gamma_n \cup \{\varphi_n\} & \text{if this is consistent,}\\
\Gamma_n & \text{otherwise.}
\end{cases}
\]

(ii)
\[
\Gamma_{n+1} =
\begin{cases}
\Gamma_n \cup \{\varphi_n\} & \text{if this is consistent,}\\
\Gamma_n \cup \{\neg\varphi_n\} & \text{otherwise.}
\end{cases}
\]

[It’s probably more usual to work with maximal consistency in a completeness proof of this kind (a Henkin-style proof) than with consistency-plus-completeness. Note that we then get a set Γ⁺ whose actual membership conditions—not just provability-from conditions—match truth conditions: a maximally consistent set and its complement in the totality of L₁-sentences coherently partition sentences up into those that are true and those that are false under the truth-value assignment it determines to the sentence letters.¹²⁷]

¹²⁵ For our development of results, the important ways round are obviously 3.3.1 ⇒ 3.3.3 and 3.3.4 ⇒ 3.3.2.
¹²⁶ See Exercise 3.2.4 for the definition of maximal consistency.
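¹²⁷ In fact, we would also get a maximally consistent Γ⁺ if we simply modified the definition given in the proof like this:

\[
\Gamma_0 = \Gamma, \qquad
\Gamma_{n+1} =
\begin{cases}
\Gamma_n \cup \{\varphi_n\} & \text{if } \Gamma_n \vdash \varphi_n,\\
\Gamma_n \cup \{\neg\varphi_n\} & \text{otherwise.}
\end{cases}
\]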
Exercise 3.3.3. Deduce compactness from soundness and completeness—more specifically, derive Theorem 2.7.1 directly from Theorems 3.3.3 and 3.3.4 (and Lemma 3.2.3).

Exercise 3.3.4. C Recall the relation ∼_Γ, defined in Exercise 2.4.4 for any set Γ of sentences: φ ∼_Γ ψ ⇔ Γ ⊨ φ ↔ ψ. (But, now we have soundness and completeness, we know it could equally well be defined in terms of ⊢.)

(a) Concerning each of the following cases say whether it determines a particular number that’s the number of equivalence classes of ∼_Γ; and, if it does, then say what that number is.
(i) Γ is inconsistent.
(ii) Γ is consistent and complete.
(iii) Γ is consistent.
(iv) Γ is complete.
(v) Γ is maximally consistent.

(b) How many equivalence classes does the relation ∼_Γ have in each of the following specific cases?
(i) Γ = ∅.
(ii) Γ is the set of all sentence letters of L₁.
(iii) Γ = {P₁}.
(iv) Γ is the set of all sentence letters of L₁ except P₁.
(v) Γ is the set of all sentence letters of L₁ except P₁ and P₂.

3.4 Other Proof Systems
The ND₁ rules for →, ∧ and ∨ are so standard that it would hardly be a “natural deduction” system at all if they were altered (except perhaps over conventions for discharging). The Manual rules for ↔ are the natural ones too.¹²⁸ But there’s room for variation over rules for ¬. These remarks assume both that we’re talking about a proof system for L₁ and talking about systems sound and complete with respect to classical semantics. But we needn’t restrict ourselves to either of these assumptions.

¹²⁸ In the original draft of the text they were different (and electronic copies of this might still be around), but Halbach gave in to pressure from colleagues: what he originally had was messy because the rules weren’t ‘pure’: they involved ∧—in a way that meant he might just as well, and more neatly, have simply defined ↔ in terms of → and ∧.
Furthermore, we needn’t restrict ourselves to natural deduction systems at all, and we’ll have an exercise on an ‘axiom system’. Axiom systems don’t just have inference rules, but allow you to assume certain sentences—the ‘axioms’—for free, meaning that at the end of the proof they don’t count as undischarged assumptions. (Actually, the ‘rule’ =-Intro in the Manual system for L₌ is precisely an axiom in this sense; so the idea will already be familiar.) On the other hand, axiom systems never have inference rules that allow, let alone require, the discharging of assumptions. A typical axiom system will in any case keep the stock of rules to an absolute minimum—in propositional logic, usually just Modus Ponens, i.e. →-elimination¹²⁹—and include a stock of axioms powerful enough to make the system complete.

We have been using the notation ‘⊳’ and ‘⊢’ unadorned, but they have been defined relative to the system ND₁, and we could indicate this relativity by subscripting the notation: ‘⊳_{ND₁}’ and ‘⊢_{ND₁}’. Corresponding relations ⊳_S and ⊢_S can, of course, be defined for whatever system S we’re dealing with, and relativity will be made explicit in this way.¹³⁰

———

Sticking with L₁ for the moment, let’s first define a system ND_C: ND_C is just like ND₁, except that it lacks the rule ¬-Intro. The system ND_C is actually no weaker than ND₁, since ¬-Intro can always be eliminated from a proof using ¬-Elim. More precisely, any ND₁ proof can be transformed, stage by stage, into an ND_C proof—and this is really an argument by induction on the definition of ND₁ proofs (which has a corresponding argument by induction on the definition of ⊳_{ND₁}, invoking the inductive definition of ⊳_{ND_C}).
The point is this: any application of ¬-Intro will combine proofs Π₁ of ψ and Π₂ of ¬ψ to produce a proof of ¬φ, in which any undischarged assumption of φ in Π₁ or Π₂ gets discharged; but, assuming we’ve already transformed the proofs Π₁ and Π₂ into ¬-Intro-free ND_C proofs Π′₁ of ψ and Π′₂ of ¬ψ, then we can transform the newly produced proof as follows:

\[
\dfrac{\begin{array}{c}[\varphi]^{a}\\ \Pi_1\\ \psi\end{array}
       \qquad
       \begin{array}{c}[\varphi]^{a}\\ \Pi_2\\ \neg\psi\end{array}}
      {\neg\varphi}\,(a)
\quad\longmapsto\quad
\dfrac{\begin{array}{c}\dfrac{[\neg\varphi]^{b}\quad[\neg\neg\varphi]^{a}}{\varphi}\,(b)\\ \Pi'_1\\ \psi\end{array}
       \qquad
       \begin{array}{c}\dfrac{[\neg\varphi]^{c}\quad[\neg\neg\varphi]^{a}}{\varphi}\,(c)\\ \Pi'_2\\ \neg\psi\end{array}}
      {\neg\varphi}\,(a)
\]

Thus the discharged assumptions of φ are replaced by a proof of φ from ¬¬φ (using ¬-Elim and discharging ¬φ) and the assumptions of ¬¬φ are discharged at the last step (which is now a step of ¬-Elim). Of course, φ needn’t actually occur in Π₁ or in Π₂, in which case there’ll be nothing to replace (since it couldn’t have turned up in Π′₁ or in Π′₂). But note that there may be undischarged assumptions of ¬¬φ in Π₁ or in Π₂, and correspondingly in Π′₁ or in Π′₂, which are hidden in our schematic representation of the transformation but which will have to be discharged in the final step of ¬-Elim. Hence our transformation doesn’t guarantee exactly the same undischarged assumptions, merely that there are no additional undischarged assumptions: our argument shows that Γ ⊳_{ND₁} φ ⇒ Γ′ ⊳_{ND_C} φ, for some Γ′ ⊆ Γ, but it remains a possibility that Γ′ might be a proper subset of Γ.¹³¹ Even so, this is obviously sufficient for deducing that, for any Γ and φ, Γ ⊢_{ND₁} φ ⇒ Γ ⊢_{ND_C} φ.¹³² And, since any ND_C proof is an ND₁ proof, entailment goes the other way too; so the systems are equivalent: for any Γ and φ, Γ ⊢_{ND_C} φ ⇔ Γ ⊢_{ND₁} φ.

¹²⁹ Halbach uses the label ‘Modus Ponens’ for a logically true sentence, but its primary use is as the name of the rule.
¹³⁰ And corresponding versions of Sublemma 3.1.6 and Lemma 3.1.5 will hold.
¹³¹ This is more obvious if we convert the transformation described here into an argument by induction on the definition of ⊳_{ND₁}: see Exercise 3.4.1.
¹³² Actually, given the rules for ∧, we could stick back any assumptions that get additionally discharged in the transformation: see Exercise 3.4.2.

———

Still sticking with L₁, let’s now define a system ND_C′, which has neither the rule ¬-Intro nor the rule ¬-Elim, but which has two different rules, NCD and EFQ (Non-Constructive Dilemma, and Ex Falso Quodlibet¹³³):

\[
\dfrac{\begin{array}{c}[\varphi]\\ \vdots\\ \psi\end{array}
       \qquad
       \begin{array}{c}[\neg\varphi]\\ \vdots\\ \psi\end{array}}
      {\psi}\ \text{NCD}
\qquad\qquad
\dfrac{\begin{array}{c}\vdots\\ \varphi\end{array}
       \qquad
       \begin{array}{c}\vdots\\ \neg\varphi\end{array}}
      {\psi}\ \text{EFQ}
\]

NCD: The result of appending ψ to a proof of ψ and another proof of ψ, and of discharging all assumptions of φ in the first proof of ψ and of discharging all assumptions of ¬φ in the second proof of ψ, is a proof of ψ.

EFQ: The result of appending ψ to a proof of φ and a proof of ¬φ is a proof of ψ. [Note: there is no discharging in EFQ.]

The system ND_C′ turns out to be equivalent to ND_C (and hence, too, to ND₁): that is, for any Γ and φ, Γ ⊢_{ND_C′} φ ⇔ Γ ⊢_{ND_C} φ. Establishing this is left as an exercise: Exercise 3.4.3.¹³⁴

¹³³ In Bostock’s book NCD is called ‘TND’ (Tertium Non Datur), but I think it’s better to keep that label for the sentence form φ ∨ ¬φ. (If you want Latin then call it ‘DNC’ (Dilemma Non Constructivum (or should that be ‘Non Constructum’, perhaps?)).)
¹³⁴ Cf. question 6 in the logic section of the 2009 summer PPE Prelims paper.
———

Now we specify a system ND_I that drops ¬-Elim and replaces it with the rule EFQ: this is a system strictly weaker than ND₁ and ND_C—a system for Intuitionist logic. In particular, ¬¬P ⊬_{ND_I} P and ⊬_{ND_I} P ∨ ¬P. But I don’t know any way of establishing this in just a few lines. (See the footnote on pages 125 and 126 of the Manual.)

It is always more difficult to show that something isn’t derivable in a given system than that it is: if it is, you can just give a proof, but if it isn’t, then you need something more complicated. Typically, to show that Γ₀ ⊬_S φ₀, for some particular Γ₀ and φ₀, we can find some property which can be shown, by induction on the definition of ⊳_S, to be satisfied by all pairs ⟨Γ, φ⟩ such that Γ ⊢_S φ, but which ⟨Γ₀, φ₀⟩ doesn’t satisfy.

Arguments of this sort can be used to show the ‘independence’ of a rule in a system—i.e. that it’s not redundant, in that if we drop it, then we get a strictly weaker system. For example, we can’t just drop the rule ∧-Elim1 (correspondingly, we can’t drop the clause (2∧E1) in the inductive definition of ⊳) without leaving a weaker system; and we can argue as follows. First define a ‘non-standard semantics’ which messes with the clause for evaluating φ ∧ ψ in a structure, but keeps everything else the same: the evaluation φ ↦ |φ|′_A, let’s call it, stipulates that |φ ∧ ψ|′_A = |ψ|′_A.¹³⁵ Then, if S is the system got from ND₁ by dropping ∧-Elim1, we can show that S is sound for this semantics: Γ ⊢_S φ ⇒ Γ ⊨′ φ, where ⊨′ is the non-standard entailment relation defined in terms of |·|′_A. But, for example, P ∧ Q ⊭′ P. And so P ∧ Q ⊬_S P.

———

Let’s get rid of the clutter of so many connectives and work with L₁(→, ¬)—a language which is, of course, adequate to express any truth function. Natural natural-deduction systems for L₁(→, ¬), ND⁻_C and ND⁻_C′, are obtained from ND_C and ND_C′ by simply dropping the rules for the absent connectives.
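Returning for a moment to the independence argument for ∧-Elim1: the non-standard evaluation is easy to experiment with. The Python sketch below is an illustration under assumptions, not anything from the Manual: the tuple encoding of sentences and the names `value_nonstd` and `counterexample` are invented here. It evaluates sentences with the altered clause for ∧ and searches the truth table for an assignment witnessing P ∧ Q ⊭′ P.

```python
from itertools import product

# Assumed encoding: ("atom", "P"), ("not", s), ("and", s, t), ("or", s, t), ("imp", s, t).

def value_nonstd(s, a):
    """Evaluate s under assignment a, with the non-standard clause for conjunction."""
    tag = s[0]
    if tag == "atom":
        return a[s[1]]
    if tag == "not":
        return not value_nonstd(s[1], a)
    if tag == "and":
        # Non-standard clause: a conjunction takes the value of its right-hand conjunct.
        return value_nonstd(s[2], a)
    if tag == "or":
        return value_nonstd(s[1], a) or value_nonstd(s[2], a)
    if tag == "imp":
        return (not value_nonstd(s[1], a)) or value_nonstd(s[2], a)
    raise ValueError(f"unknown connective: {tag}")

def counterexample(gamma, s, letters):
    """Return an assignment making all of gamma true and s false, or None if there is none."""
    for row in product([True, False], repeat=len(letters)):
        a = dict(zip(letters, row))
        if all(value_nonstd(g, a) for g in gamma) and not value_nonstd(s, a):
            return a
    return None

P, Q = ("atom", "P"), ("atom", "Q")
witness = counterexample([("and", P, Q)], P, ["P", "Q"])
```

The search finds the assignment making P false and Q true, while `counterexample([("and", P, Q)], Q, ["P", "Q"])` and `counterexample([P, Q], ("and", P, Q), ["P", "Q"])` both return `None`, as one would expect if ∧-Elim2 and ∧-Intro remain sound for this semantics. The soundness of the remaining rules still has to be argued by induction, as in the text.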
But now let’s consider a (very standard) axiom system AX_C for L₁(→, ¬). The axioms of this system are all instances of the following schemes:¹³⁶

(A1) φ → (ψ → φ)
(A2) (φ → (ψ → χ)) → ((φ → ψ) → (φ → χ))
(A3) (¬φ → ψ) → ((¬φ → ¬ψ) → φ)

And there is just one rule: Modus Ponens, viz. →-Elim.

¹³⁵ Thus ∧ is interpreted by the truth function f′_∧ that just projects its right-hand argument: for any truth values x and y, f′_∧(x, y) = y.
¹³⁶ This means that, for any formulae φ, ψ, and χ, any one of the above is an axiom: there are only three overall syntactic forms, but infinitely many instances of each form. (And no formula not of any of these forms is an axiom.)
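As explained above, a natural-deduction style for presenting this system simply turns these axiom schemes into introduction rules like =-Intro in the Manual system for L₌. For example, here is a proof-scheme to show that, for any sentence φ, ⊢_{AX_C} φ → φ.

\[
\dfrac{[\varphi \to (\varphi \to \varphi)]
       \qquad
       \dfrac{[\varphi \to ((\varphi \to \varphi) \to \varphi)]
              \qquad
              [(\varphi \to ((\varphi \to \varphi) \to \varphi)) \to ((\varphi \to (\varphi \to \varphi)) \to (\varphi \to \varphi))]}
             {(\varphi \to (\varphi \to \varphi)) \to (\varphi \to \varphi)}}
      {\varphi \to \varphi}
\]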
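The proof-scheme above can also be written out as a linear sequence of lines and checked mechanically. The following Python sketch is a hedged illustration: the tuple encoding, the helper names (`imp`, `neg`, `is_a1`, `is_a2`, `is_a3`, `check_proof`) and the linear presentation are all assumptions made for the example. A line is accepted if it is a premise, an instance of (A1), (A2) or (A3), or follows from two earlier lines by Modus Ponens.

```python
# Assumed encoding of L1(->, not): ("atom", "P"), ("not", s), ("imp", s, t).

def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

def is_imp(s):
    return s[0] == "imp"

def is_a1(s):
    """s has the form  phi -> (psi -> phi)."""
    return is_imp(s) and is_imp(s[2]) and s[1] == s[2][2]

def is_a2(s):
    """s has the form  (phi -> (psi -> chi)) -> ((phi -> psi) -> (phi -> chi))."""
    if not (is_imp(s) and is_imp(s[1]) and is_imp(s[1][2])):
        return False
    phi, psi, chi = s[1][1], s[1][2][1], s[1][2][2]
    return s[2] == imp(imp(phi, psi), imp(phi, chi))

def is_a3(s):
    """s has the form  (not phi -> psi) -> ((not phi -> not psi) -> phi)."""
    if not (is_imp(s) and is_imp(s[1]) and s[1][1][0] == "not"):
        return False
    phi, psi = s[1][1][1], s[1][2]
    return s[2] == imp(imp(neg(phi), neg(psi)), phi)

def check_proof(lines, premises=()):
    """Check that every line is a premise, an axiom, or follows by Modus Ponens from earlier lines."""
    seen = []
    for s in lines:
        by_mp = any(imp(a, s) in seen for a in seen)
        if not (s in premises or is_a1(s) or is_a2(s) or is_a3(s) or by_mp):
            return False
        seen.append(s)
    return True

PHI = ("atom", "P")
proof = [
    imp(imp(PHI, imp(imp(PHI, PHI), PHI)),
        imp(imp(PHI, imp(PHI, PHI)), imp(PHI, PHI))),   # instance of (A2)
    imp(PHI, imp(imp(PHI, PHI), PHI)),                   # instance of (A1)
    imp(imp(PHI, imp(PHI, PHI)), imp(PHI, PHI)),         # Modus Ponens
    imp(PHI, imp(PHI, PHI)),                             # instance of (A1)
    imp(PHI, PHI),                                       # Modus Ponens
]
ok = check_proof(proof)
```

Here the sentence letter P is used in place of an arbitrary φ; since the axioms are schemes, the same five lines work for any sentence. Note that `check_proof([imp(PHI, PHI)])` fails: φ → φ is not itself an instance of any of the three schemes.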
The systems AX_C and ND⁻_C are equivalent: Γ ⊢_{AX_C} φ ⇔ Γ ⊢_{ND⁻_C} φ (Exercise 3.4.12). To establish ⇒, we first have to show that there’s an ND_C proof of every axiom of AX_C in which there are no undischarged premises, and then it’s an easy inductive argument. But the argument for ⇐ is more complicated: we have to argue, overall, by induction on the definition of ND⁻_C proofs (or the definition of ⊳_{ND⁻_C}), but, to show that →-Intro can be reproduced in AX_C, we need an argument by induction on the definition of AX_C proofs.¹³⁷

———

For the purposes of the exercises, let’s make our usage explicit: in saying that systems S₁ and S₂ for a language L are equivalent we mean that, for any set Γ of sentences and any sentence φ from L, Γ ⊢_{S₁} φ ⇔ Γ ⊢_{S₂} φ. Note that to show that Γ ⊢_{S₁} φ ⇒ Γ ⊢_{S₂} φ (for all Γ and φ) we can argue that (for all Γ and φ) if Γ ⊳_{S₁} φ, then there’s a subset Γ′ of Γ such that Γ′ ⊳_{S₂} φ. But rather than doing a formal induction on the definition of ⊳_{S₁}, we can work with the (inductive) definition of natural deduction proofs and provide a cut-and-paste argument that specifies how to transform proofs stage by stage (as in the argument on page 58): this will always be sufficient—except in Exercise 3.4.1, of course.

———

Exercise 3.4.1. Convert the transformation argument on page 58 to show that ¬-Intro is eliminable into an inductive argument on the definition of ⊳ (i.e. ⊳_{ND₁}).

Exercise 3.4.2. Show that, in any system S that has all three of the standard natural-deduction rules for ∧, if Γ ⊳_S φ, Γ ⊆ ∆, and ∆ is finite, then ∆ ⊳_S φ.

Exercise 3.4.3. Show that ND_C and ND_C′ are equivalent systems. (See page 59.)

¹³⁷ Fixing on φ, we have to show that if Γ ⊳_{AX_C} ψ, then Γ ∖ {φ} ⊳_{AX_C} φ → ψ. This—or rather, the result for system S that Γ, φ ⊢_S ψ ⇒ Γ ⊢_S φ → ψ—is known as the ‘Deduction Theorem’ (and it’s just not feasible to work with an axiom system until we’ve proved it).
Exercise 3.4.4. Establish the independence of the rule ∧-Intro in ND_C. (See page 60.)

Exercise 3.4.5. Establish the independence of the rule ¬-Elim in ND⁻_C. (See pages 60 and 61.)

Exercise 3.4.6. Let ND_♢ be a natural deduction system for the language L(¬, ♢) which has the rule ¬-Elim and the following introduction and elimination rules for ♢:

\[
\dfrac{\varphi}{\varphi \mathbin{\diamondsuit} \psi}
\qquad\qquad
\dfrac{\varphi \mathbin{\diamondsuit} \psi}{\psi}
\]

Show that, for any sentence φ of the language, ⊢_{ND_♢} φ. Would this hold if ND_♢ had the rule ¬-Intro instead of ¬-Elim? Justify your answer.

Exercise 3.4.7. Let ND_□ be a natural deduction system, for the language obtained from L₁ by adding a new two-place connective □, which has all the rules of ND₁ together with the following introduction and elimination rules for □:

\[
\dfrac{\varphi}{\varphi \mathbin{\Box} \psi}
\qquad\qquad
\dfrac{\varphi \mathbin{\Box} \psi}{\varphi}
\]

Specify a truth table for □—i.e. a truth function f_□, to be used to define |φ □ ψ|_A as f_□(|φ|_A, |ψ|_A)—such that with this interpretation, together with the standard interpretations for the standard connectives, ND_□ is sound and complete. Explain what additional things need to be checked out in order to establish soundness and completeness; and check them out.

Exercise 3.4.8. Here are two more negation rules:

\[
\dfrac{\begin{array}{c}[\neg\varphi]\\ \vdots\\ \varphi\end{array}}{\varphi}\ \neg\text{-E}_{C}
\qquad\qquad
\dfrac{\begin{array}{c}[\varphi]\\ \vdots\\ \neg\varphi\end{array}}{\neg\varphi}\ \neg\text{-I}_{I}
\]

(a) Show that if in ND_C′ we replace NCD with ¬-E_C, then we get a system equivalent to ND_C′ (and hence, too, to ND_C (and ND₁)).
(b) Show that if in ND_I we replace ¬-Intro with ¬-I_I, then we get a system equivalent to ND_I.

[Thus the system in (a) is a (classical) system with ¬-E_C and EFQ; and the system in (b) is a(n intuitionist) system with ¬-I_I and EFQ. (This is a neat way to locate the difference between the two logics.)]
Exercise 3.4.9. Let ND⁻_I be a natural deduction system for the language L₁(→, ¬) which has the rules →-Elim, →-Intro, ¬-Intro, and EFQ. And let ND^⊥_I be a natural deduction system for the language L₁(→, ⊥) which has the rules →-Elim, →-Intro, and EFQ⊥ (a version of Ex Falso Quodlibet for ⊥):

\[
\dfrac{\bot}{\varphi}\ \text{EFQ}\bot
\]

Now let a translation mapping φ ↦ φ^(⊥) from sentences of L₁(→, ¬) to sentences of L₁(→, ⊥), and a translation mapping φ ↦ φ^(¬) from sentences of L₁(→, ⊥) to sentences of L₁(→, ¬), be defined by recursion as follows:

P^(⊥) = P
(¬φ)^(⊥) = (φ^(⊥) → ⊥)
(φ → ψ)^(⊥) = (φ^(⊥) → ψ^(⊥))

P^(¬) = P
⊥^(¬) = ¬(P → P)
(φ → ψ)^(¬) = (φ^(¬) → ψ^(¬))

[Note that the first clause in each definition is the base case, for every sentence letter P; but the clause for ⊥ in the definition of φ ↦ φ^(¬) mentions the particular sentence letter P.]

(a) Show that, for any set Γ of sentences and any individual sentence φ from the language L₁(→, ¬), Γ ⊢_{ND⁻_I} φ ⇒ {γ^(⊥) | γ ∈ Γ} ⊢_{ND^⊥_I} φ^(⊥).
(b) Show that, for any set Γ of sentences and any individual sentence φ from the language L₁(→, ⊥), Γ ⊢_{ND^⊥_I} φ ⇒ {γ^(¬) | γ ∈ Γ} ⊢_{ND⁻_I} φ^(¬).
(c) Show that φ and (φ^(⊥))^(¬) are interderivable in ND⁻_I—i.e. show (by induction on the complexity of sentences of L₁(→, ¬)) that both φ ⊳_{ND⁻_I} (φ^(⊥))^(¬) and (φ^(⊥))^(¬) ⊳_{ND⁻_I} φ.
(d) Show that φ and (φ^(¬))^(⊥) are interderivable in ND^⊥_I—i.e. show (by induction on the complexity of sentences of L₁(→, ⊥)) that both φ ⊳_{ND^⊥_I} (φ^(¬))^(⊥) and (φ^(¬))^(⊥) ⊳_{ND^⊥_I} φ.
(e) Show that ‘⇒’ in (a) can be strengthened to ‘⇔’.
(f) Show that ‘⇒’ in (b) can be strengthened to ‘⇔’.

[The point of all this is really just that if we define ¬φ as φ → ⊥, then EFQ⊥ gives us the intuitionistic rules for ¬. (And this definition aptly reflects the intuitionistic way of understanding negation.)]

Exercise 3.4.10. Show that if we add to ND^⊥_I the axiom scheme ((φ → ⊥) → ⊥) → φ, then we obtain a system ND^⊥_C which fits together with ND⁻_C in the way ND^⊥_I fits together with ND⁻_I in Exercise 3.4.9—i.e. all the results hold mutatis mutandis.
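For readers who like to experiment, the two translation mappings of Exercise 3.4.9 can be transcribed directly as recursive functions. The Python sketch below only implements the definitions and does not touch the exercise itself; the tuple encoding and the names `to_bot` and `to_neg` are assumptions made for the illustration.

```python
# Assumed encoding: ("atom", "Q"), ("not", s), ("imp", s, t), and ("bot",) for the falsum constant.

P = ("atom", "P")   # the particular sentence letter mentioned in the clause for falsum
BOT = ("bot",)

def to_bot(s):
    """Translate a sentence of L1(->, not) into L1(->, bot)."""
    tag = s[0]
    if tag == "atom":
        return s
    if tag == "not":
        return ("imp", to_bot(s[1]), BOT)
    if tag == "imp":
        return ("imp", to_bot(s[1]), to_bot(s[2]))
    raise ValueError(f"not a sentence of the source language: {s!r}")

def to_neg(s):
    """Translate a sentence of L1(->, bot) into L1(->, not)."""
    tag = s[0]
    if tag == "atom":
        return s
    if tag == "bot":
        return ("not", ("imp", P, P))
    if tag == "imp":
        return ("imp", to_neg(s[1]), to_neg(s[2]))
    raise ValueError(f"not a sentence of the source language: {s!r}")

Q = ("atom", "Q")
there = to_bot(("not", Q))   # (Q -> bot)
back = to_neg(there)         # (Q -> not(P -> P))
```

The round trip does not return the sentence it started from: ¬Q comes back as Q → ¬(P → P). That is why parts (c) and (d) ask for interderivability rather than identity.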
Exercise 3.4.11. Sketch proofs of soundness and completeness¹³⁸ for the system ND^⊥_C.

Exercise 3.4.12. Show that ND⁻_C and AX_C are equivalent systems. (See page 61.)

Exercise 3.4.13. Here are yet more negation rules:

\[
\dfrac{\begin{array}{c}[\neg\varphi]\\ \vdots\\ \bot\end{array}}{\varphi}\ \neg\text{-E}_{C\bot}
\qquad\qquad
\dfrac{\begin{array}{c}[\varphi]\\ \vdots\\ \bot\end{array}}{\neg\varphi}\ \neg\text{-I}_{I\bot}
\]

Let systems ND⁻_{C⊥} and ND⁻_{I⊥} be systems for the language L₁(→, ¬, ⊥) such that each system contains EFQ, along with the usual elimination and introduction rules for →, while ND⁻_{C⊥} contains ¬-E_{C⊥} and ND⁻_{I⊥} contains ¬-I_{I⊥}.

(a) Establish the independence of ¬-E_{C⊥} in ND⁻_{C⊥}.
(b) Establish the independence of ¬-I_{I⊥} in ND⁻_{I⊥}.
(c) Show that adding ¬-Elim to ND⁻_{C⊥} would not create a stronger system.
(d) Show that adding ¬-Intro to ND⁻_{I⊥} would not create a stronger system.
(e) Would either system be made stronger by adding EFQ⊥ (see Exercise 3.4.9)?

[Saying that S₁ is stronger than S₂ means, of course, that Γ ⊢_{S₂} φ ⇒ Γ ⊢_{S₁} φ, for any Γ and φ, though there are Γ and φ such that Γ ⊢_{S₁} φ but Γ ⊬_{S₂} φ.]

Exercise 3.4.14. For each n ∈ ℕ, let ∧ⁿ and ∨ⁿ be n-place connectives, interpreted by truth functions f_∧ⁿ and f_∨ⁿ such that f_∧ⁿ(x₁, . . . , xₙ) = ⊤ iff xᵢ = ⊤, for all i, 1 ≤ i ≤ n, and f_∨ⁿ(x₁, . . . , xₙ) = ⊥ iff xᵢ = ⊥, for all i, 1 ≤ i ≤ n.

(a) Specify natural-deduction rules for ∧ⁿ and ∨ⁿ (for n ≥ 1)—that is, rule schemes that will work for any n ≥ 1.
(b) Assuming the standard rules for ∧ and ∨, use your rules to show that ((P ∧ Q) ∧ R) and ∧³(P, Q, R) are interderivable, and that ((P ∨ Q) ∨ R) and ∨³(P, Q, R) are interderivable.
(c) Show that P, ∧¹(P), and ∨¹(P) are interderivable.
(d) What about n = 0?¹³⁹ We need ∧⁰ to be provable outright, from no undischarged assumptions; and we need ∨⁰ to be interderivable with ⊥—assuming the rule EFQ⊥ (given in Exercise 3.4.9).

¹³⁸ With respect to classical semantics.
¹³⁹ Note that there is a style for presenting proofs in which sentences are discharged, and axioms are introduced, not by enclosing them in square brackets but by drawing an inference line on top of them—an inference line with nothing on top of it.
Exercise 3.4.15. Suppose that L₁ is enriched with two 3-place connectives ⇒ and ⇉, and that clauses are added to the definition of the set of sentences to determine that if φ, ψ, and χ are sentences, then so are (φ, ψ ⇒ χ) and (φ, ψ ⇉ χ).

(a) Show that if we add the following introduction rule and elimination rules to ND₁, then formulae (φ, ψ ⇒ χ) and ((φ ∨ ψ) → χ) will be interderivable.¹⁴⁰

\[
\dfrac{\begin{array}{c}[\varphi]\\ \vdots\\ \chi\end{array}
       \qquad
       \begin{array}{c}[\psi]\\ \vdots\\ \chi\end{array}}
      {(\varphi, \psi \Rightarrow \chi)}\ {\Rightarrow}\text{-Intro}
\]

\[
\dfrac{\begin{array}{c}\vdots\\ \varphi\end{array}
       \qquad
       \begin{array}{c}\vdots\\ (\varphi, \psi \Rightarrow \chi)\end{array}}
      {\chi}\ {\Rightarrow}\text{-Elim1}
\qquad\qquad
\dfrac{\begin{array}{c}\vdots\\ \psi\end{array}
       \qquad
       \begin{array}{c}\vdots\\ (\varphi, \psi \Rightarrow \chi)\end{array}}
      {\chi}\ {\Rightarrow}\text{-Elim2}
\]

(b) Specify two introduction rules and one elimination rule to add to ND₁ such that formulae (φ, ψ ⇉ χ) and ((φ ∧ ψ) → χ) will then be interderivable.¹⁴¹ Justify your specifications.

Exercise 3.4.16. Suggest natural deduction rules for the majority connective ▽ (see Exercise 2.8.8). Justify your suggestions. [Hint. Note that ▽(φ, ψ, χ) is equivalent both to (φ ∧ ψ) ∨ (ψ ∧ χ) ∨ (χ ∧ φ) and to (φ ∨ ψ) ∧ (ψ ∨ χ) ∧ (χ ∨ φ). Look for three introduction rules and three elimination rules (and make them pure rules, in the sense that they involve no connective other than ▽).]

Exercise 3.4.17. Suggest a set of natural deduction rules for the language L₁(↓) and a set of natural deduction rules for the language L₁(↑) such that the resulting systems will be sound and complete. Justify your suggestions. [It’ll be messy, but make it as neat as possible. There’s a lot of work here.]

Exercise 3.4.18.
(a) Check that the independence result in Exercise 3.4.5 can be extended to ND_C. This deals with ¬-Elim. The ∧-rules have been dealt with in the text and in Exercise 3.4.4: they’re each independent in ND₁ and hence too in ND_C. Now complete the picture for ND_C by establishing the independence of each of the other rules in the system.
(b) Is each of the ND_C′-rules independent in the system? Justify your answer.

¹⁴⁰ Recall that ((φ ∨ ψ) → χ) and ((φ → χ) ∧ (ψ → χ)) are interderivable.
¹⁴¹ Recall that ((φ ∧ ψ) → χ) and ((φ → χ) ∨ (ψ → χ)) are interderivable (given ¬-Elim).
4 Predicate Logic
The first-order¹⁴² language L₂ of the Manual is an extension of the propositional language L₁. And in its turn L₌ is an extension of L₂.

It’s natural to think of L₁ as containing two different types of vocabulary: (i) ‘logical’ vocabulary, viz. the connectives (along with the attendant apparatus of brackets) which has a ‘fixed’ interpretation; and (ii) the non-logical vocabulary, viz. sentence letters, which is up for variable interpretation. Correspondingly, it seems natural to divide the vocabulary of L₂ up in the same way: under heading (i) we now also have the quantifiers, ∀ and ∃ (along with their attendant apparatus of variables); and under (ii) there are the predicate letters—among which the sentence letters of L₁ are subsumed as zero-place predicate letters—and the constants.

However, although quantifiers would seem to be just as ‘logical’ as connectives, there’s a problem about thinking of them as having a ‘fixed’ interpretation: after all, different ‘interpretations’, in the informal Manual sense, can determine different domains for the quantifiers to range over; and, once we get to the formal semantics, then certainly different structures can have different domains. The point is that the domain determines a specific meaning for ∀ and ∃: ‘∀x . . . x . . .’ means ‘everything in the domain is such that . . . it . . . ’ and ‘∃x . . . x . . .’ means ‘something in the domain is such that . . . it . . . ’. Thus ‘∀’ might mean ‘all natural numbers’, ‘all real numbers’, ‘all people’ (‘everyone’), and so on, according to whatever the domain is; and so too with ‘∃’, mutatis mutandis. This is in stark contrast with the connectives, which have the same fixed truth table in any interpretation.

And when we move on to L₌ it’s natural to think of = as an additional item of ‘logical’ vocabulary, again with a ‘fixed’ interpretation. There has, however, been philosophical discussion about whether = is properly classed as ‘logical’. But let’s just focus again on the idea that its interpretation is ‘fixed’: this is more plausible than the fixedness of ‘∀’ and ‘∃’, if you’re happy to say that individual things are just all there, being what they are—identical to themselves and distinct from everything else—so that a domain just restricts attention to a given subtotality of all things. But maybe you might want to say that in specifying a domain you are thereby providing a ‘criterion of identity’ which actually determines what count as the individual things in it: so, for example, ‘the domain of British rivers’ might be thought an inadequate specification, because when streams merge, then it’s left indeterminate which of the two streams the single continuation is—or whether it’s neither of them but a third distinct river—for it certainly can’t be identical to both the streams, if they are distinct from one another—or maybe you want to say (though rather implausibly?) that, even though their springs may be quite remote, nonetheless the two original streams are the very same river; and, anyway, maybe the original streams are too tiny to be rivers—but then where do the rivers, properly so called, start? And so on, and so on: to have a domain coherent enough to be part of an interpretation for L₌—or indeed L₂—we need to have it bring along a determination of what its elements actually are. The worry about the fixedness of the interpretation of = arises if you take this kind of issue to be deeper than merely a problem about how precisely enough to formulate your specification of a domain but to suggest that there is in rebus no independently given totality of rivers-in-all-the-different-possible-senses in which the individuals are not only distinct from other rivers-in-the-same-sense but also from all rivers in different senses.

¹⁴² ‘First-order’, because there are only quantifiers ranging over (a domain of) individuals.

A more technical comment about the languages L₂ and L₌:- The Manual works with a single language L₂ and a single language L₌ that both have denumerably many predicate letters for each of the (denumerably many) different possible ‘adicities’—i.e. number of argument places. But there’s another common approach which is much smoother in lots of ways (see, for example, the discussion on page 73): we can define a(n infinite) class of first-order languages each with its own particular stock of non-logical vocabulary—vocabulary under heading (ii). And then, to do semantics, we define what it is to be a structure for so-and-so language. This fits in nicely with the use of logic in mathematical applications (and empirical theorizing too, I suppose): for example, with a particular interesting structure in mind we would consider a language for it: the language of arithmetic, say, or the language of set theory; and, equally, if we’re interested in a particular class of structures, such as orderings, groups, or whatever, then we would have a language to fit it—in the case of orderings this is just a language with a single two-place predicate (and, in fact, this is the usual language for set theory too).
The examples given of arithmetic and groups raise a further comment:- L₂ and L₌ lack a category of non-logical vocabulary that is standardly part of first-order languages: function letters. These are symbols that would allow for a symbolization of phrases such as ‘the father of’, for example, or of arithmetical operators such as ‘+’, and ‘×’. Thus expressions such as ‘2 + 2’, ‘x + y’, ‘3 × x’, ‘x × (y + 1)’, and so on, could be accommodated in a formal language for arithmetic: they would be compound ‘terms’ that, in the syntax of the language, could go where only variables or constants can go in L₂ and L₌. (And constants could then be subsumed as zero-place function letters, just as sentence letters are zero-place predicates.)

Notation:- the Manual uses ‘Z’ to range over predicate letters; ‘t’ to range over terms¹⁴³ (i.e. variables and constants); ‘v’ to range over variables in particular—though there’s nothing to range over constants in particular—and ‘Φ’ to range over the lot of them. I can’t cope with this: I’ll use ‘P’, ‘Q’ etc.,¹⁴⁴ to range over predicate letters; t, u, etc., to range over all terms; ‘v’, ‘w’, etc., to range over variables in particular; a, b, etc., to range over constants in particular—and there’ll be nothing to range over the lot of them.

¹⁴³ I’m not sure the Manual actually uses this terminology, but it’s handy.

Anyhow, we’ll work with L₂ and L₌—and ignore, as far as possible, all the clutter of subscripts, superscripts, . . . . But we’ll also consider languages L⁺₂ and L⁺₌, which contain the constant sentences.

4.1 Syntax
The atomic formulae of $L_2$ are expressions $Pt_1 \ldots t_n$, where P is an n-place predicate letter and $t_1, \ldots, t_n$ are terms, i.e. each is either a variable or a constant (see above). So, in contrast to the propositional languages, atomic formulae are not actually unstructured: it's just that their sub-parts are not subformulae. And for $L_=$ there is an additional kind of atomic formula: $t_1 = t_2$, where $t_1$ and $t_2$ are terms. For $L_2^+$ and $L_=^+$ we might also include ⊤ and ⊥ as atomic formulae: they would then come in as one of the 'base cases' of the inductive definition of formulae. (Alternatively, we might count them as coming in in the composition clauses: see the earlier discussion.)

Anyhow, the inductive definition of formulae is as follows. (For $L_2$ ignore (1b), (1c), and (1d); for $L_=$ ignore (1b) and (1c); for $L_2^+$ ignore (1d); but for $L_=^+$ don't ignore anything.)

1a) if P is an n-place predicate letter and $t_1, \ldots, t_n$ are terms, then $Pt_1 \ldots t_n$ is a formula;
[1b) ⊤ is a formula;]
[1c) ⊥ is a formula;]
[1d) if $t_1$ and $t_2$ are terms, then $t_1 = t_2$ is a formula;]
2a) if φ is a formula, then ¬φ is a formula;
2b) if φ and ψ are formulae, then (φ ∧ ψ) is a formula;
2c) if φ and ψ are formulae, then (φ ∨ ψ) is a formula;
2d) if φ and ψ are formulae, then (φ → ψ) is a formula;
2e) if φ and ψ are formulae, then (φ ↔ ψ) is a formula;
2f) if φ is a formula and v is a variable, then ∀vφ is a formula;
2g) if φ is a formula and v is a variable, then ∃vφ is a formula;
and nothing else is a formula.

The Manual gives a careful definition of when an occurrence of a variable in a formula is free, and hence a definition of what it is for a variable to occur freely in a formula. And all and only the variables that occur freely in a formula φ are called the 'free variables of φ'. The occurrence of a variable is said to be a bound occurrence if and only if it's not a free occurrence, and v is bound by an occurrence of ∀v or ∃v if and only if it occurs in its scope but not in the scope of any other occurrence of ∀v or ∃v.[145] A sentence is then defined to be a formula that has no free variables. (It's not uncommon to use different Greek letters to range specifically over sentences, typically σ and τ, rather than just using the same ones as for arbitrary formulae (φ, ψ, etc.).)

* * *

We'll need to be finicky about the substitution of terms in a formula: the substitution of a term t for a free variable v in a formula φ. The Manual uses the notation 'φ[t/v]' in the particular case where t is a constant, but we shall use this more generally for any term t, variable or constant. It will be useful to have a recursive definition of φ[t/v]. But first we need to define the result u[t/v] of substituting the term t for an occurrence of the variable v in the term u; this is utterly trivial, of course, since we don't have function letters, and so have no compound terms (though it wouldn't be if we did: we'd need a recursive definition of u[t/v]):

v[t/v] = t
u[t/v] = u, if u ≠ v

Now the recursive definition of φ[t/v] itself:

1a) $(Pt_1 \ldots t_n)[t/v] = P\,t_1[t/v] \ldots t_n[t/v]$
[1b) ⊤[t/v] = ⊤]
[1c) ⊥[t/v] = ⊥]
[1d) $(t_1 = t_2)[t/v]$ = $t_1[t/v] = t_2[t/v]$ [146]]
2a) (¬φ)[t/v] = ¬(φ[t/v]) [147]
2b) (φ ∧ ψ)[t/v] = (φ[t/v] ∧ ψ[t/v])
2c) (φ ∨ ψ)[t/v] = (φ[t/v] ∨ ψ[t/v])
2d) (φ → ψ)[t/v] = (φ[t/v] → ψ[t/v])
2e) (φ ↔ ψ)[t/v] = (φ[t/v] ↔ ψ[t/v])
2f) (∀wφ)[t/v] = ∀wφ, if w = v; (∀wφ)[t/v] = ∀w(φ[t/v]), if w ≠ v
2g) (∃wφ)[t/v] = ∃wφ, if w = v; (∃wφ)[t/v] = ∃w(φ[t/v]), if w ≠ v

[144] This meshes with their use in propositional logic to range over sentence letters, which are now a special case of predicate letters.
[145] Thus in ∀x∃xFx, for example, the x following F is bound by ∃x, but not by ∀x.
[146] I think we have to live with the equivocality of '=', which is used both for the symbol in the formal languages $L_=$ and $L_=^+$ and to make metalinguistic identity statements.
[147] As before, the use of '(' and ')' is informal: just to indicate the scope of '[t/v]'.
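The recursive definition of φ[t/v] can be read off almost clause by clause as a program. What follows is only a minimal sketch in Python, not anything from the Manual: the encoding of formulae as nested tuples, the tag names ("pred", "all", and so on), and the function names `subst_term` and `subst` are all assumptions made for illustration.

```python
# Hypothetical encoding (an assumption of this sketch, not the notes' notation):
#   terms    : strings, e.g. "x", "y" (variables) or "a", "b" (constants)
#   formulae : nested tuples
#       ("pred", "P", (t1, ..., tn))            atomic formula Pt1...tn
#       ("top",), ("bot",)                      the optional atomic formulae
#       ("eq", t1, t2)                          t1 = t2
#       ("not", phi)
#       ("and" | "or" | "imp" | "iff", phi, psi)
#       ("all" | "ex", w, phi)                  quantification over the variable w

BINARY = ("and", "or", "imp", "iff")


def subst_term(u, t, v):
    """u[t/v]: with no compound terms, u changes only if it is v itself."""
    return t if u == v else u


def subst(phi, t, v):
    """phi[t/v], with one branch per clause of the recursive definition."""
    op = phi[0]
    if op == "pred":                                   # clause 1a
        return ("pred", phi[1], tuple(subst_term(u, t, v) for u in phi[2]))
    if op in ("top", "bot"):                           # clauses 1b, 1c
        return phi
    if op == "eq":                                     # clause 1d
        return ("eq", subst_term(phi[1], t, v), subst_term(phi[2], t, v))
    if op == "not":                                    # clause 2a
        return ("not", subst(phi[1], t, v))
    if op in BINARY:                                   # clauses 2b to 2e
        return (op, subst(phi[1], t, v), subst(phi[2], t, v))
    if op in ("all", "ex"):                            # clauses 2f, 2g
        w, body = phi[1], phi[2]
        return phi if w == v else (op, w, subst(body, t, v))
    raise ValueError(f"not a formula: {phi!r}")


# (Fx AND forall x Gx)[a/x]: only the free occurrence of x is replaced.
example = ("and", ("pred", "F", ("x",)), ("all", "x", ("pred", "G", ("x",))))
result = subst(example, "a", "x")
```

In the example, `result` encodes (Fa ∧ ∀xGx): the quantified conjunct is returned untouched because clause 2f stops the recursion when the quantified variable is v itself.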
Observe that if a variable v occurs free in the scope of ∀w or ∃w in a formula φ, then in φ[w/v] any occurrence of w that results from its having replaced such an occurrence of v will become bound: it gets caught up in the scope of ∀w or ∃w. However, in all natural contexts where we use substitution we'll want to rule this case out. Constants, of course, are just not the sort of thing ever to get bound, and so there'll never be a problem with them.[148] But, even so, let's formulate a general definition of when a term t is substitutable for v in φ. Here is (a partial specification of) a recursive definition, using the abbreviation 'Sub(t, v, φ)':

(1a) $\mathrm{Sub}(t, v, Pt_1 \ldots t_n)$
⋮
(2a) Sub(t, v, ¬φ) iff Sub(t, v, φ)
(2b) Sub(t, v, φ ∧ ψ) iff Sub(t, v, φ) and Sub(t, v, ψ)
⋮
(2f) Sub(t, v, ∀wφ) iff either w = v or both t ≠ w and Sub(t, v, φ)[149]
⋮

It turns out, of course, that if t is a constant, then t will always be substitutable, and also that if v is not free in φ then any term will be substitutable for v.

* * *

Exercise 4.1.1. Give a direct definition, by recursion on the complexity of formulae, of the set F(φ) of variables that are free in φ, i.e., define the function φ ↦ F(φ).

Exercise 4.1.2.
(a) Say of each of the following whether or not it holds for all formulae φ and all variables $v_1$, $v_2$. Justify your answers.
(i) $\varphi = \varphi[v_2/v_1][v_1/v_2]$
(ii) $\varphi[v_2/v_1] = \varphi[v_2/v_1][v_1/v_2][v_2/v_1]$
(b) Would either of your answers in (a) be different if we added either one of the following provisos, or added their conjunction?
(A) $v_2$ is substitutable for $v_1$ in φ;
(B) $v_2$ does not occur free in φ.

[148] If we had a language with complex terms that contain variables, then it wouldn't just be variables that can get unwantedly caught up in the scope of a quantifier when they're substituted in for a variable, and it would all be a lot more complicated.
[149] Equivalently, either v is not free in ∀wφ or both t ≠ w and Sub(t, v, φ).
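The clauses for Sub(t, v, φ) given above can be turned into a short recursive check. This is a hedged sketch only: the tuple encoding of formulae and the name `substitutable` are assumptions, and the clauses the notes leave unstated are filled in by analogy (the other atomic cases like (1a), the other binary connectives like (2b), and ∃ like (2f)), which is itself an assumption about what the elided clauses say.

```python
# Hypothetical encoding (assumed for illustration):
#   ("pred", "P", (t1, ..., tn)), ("top",), ("bot",), ("eq", t1, t2),
#   ("not", phi), ("and" | "or" | "imp" | "iff", phi, psi),
#   ("all" | "ex", w, phi); terms are plain strings.

BINARY = ("and", "or", "imp", "iff")


def substitutable(t, v, phi):
    """Sub(t, v, phi), following clause (2f) as stated in the notes."""
    op = phi[0]
    if op in ("pred", "top", "bot", "eq"):
        # Clause (1a), and by assumption the other atomic cases too:
        # no quantifier is present, so nothing can capture t.
        return True
    if op == "not":                                    # clause (2a)
        return substitutable(t, v, phi[1])
    if op in BINARY:                                   # clause (2b) and analogues
        return substitutable(t, v, phi[1]) and substitutable(t, v, phi[2])
    if op in ("all", "ex"):                            # clause (2f) and its analogue for ex
        w, body = phi[1], phi[2]
        return w == v or (t != w and substitutable(t, v, body))
    raise ValueError(f"not a formula: {phi!r}")


# exists y Rxy: the variable y would be captured if substituted for x.
some_y_Rxy = ("ex", "y", ("pred", "R", ("x", "y")))
captured = substitutable("y", "x", some_y_Rxy)
harmless = substitutable("z", "x", some_y_Rxy)
```

Here `captured` comes out false because t is the quantified variable y, while `harmless` is true; a constant such as "a" can never equal the quantified variable, so it passes every quantifier clause.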
4.2
Semantics: Evaluation in a Structure
A structure A is defined in the Manual to be a pair ⟨D, I⟩, where D, the domain of the structure, is a nonempty set and I is a big function whose domain is all the non-logical vocabulary. But then, if A = ⟨D, I⟩, '$D_A$' is used to stand for D. (Thus $A = \langle D_A, I\rangle = \langle D_{\langle D_A, I\rangle}, I\rangle = \langle D_{\langle D_{\langle D_A, I\rangle}, I\rangle}, I\rangle = \ldots$.) The interpretation I(P) in A of an n-place predicate letter P is an n-ary relation. (Halbach actually treats zero-place predicate letters, i.e. sentence letters, as a special case and stipulates that if n = 0, then I(P) is a truth value; but note that if we make the identification suggested in Section 1.3, then this isn't a special case at all.) And the interpretation I(a) of a constant a is an element of $D_A$. You also might expect that the interpretation I(P) of an n-place predicate letter has to be an n-ary relation over $D_A$, i.e. that $I(P) \subseteq D_A^n$; but in fact there is no such stipulation in the Manual: this is rather idiosyncratic and non-standard. Anyhow, later on, when specifying particular structures, Halbach writes '$|P|_A$' and '$|a|_A$' instead of 'I(P)' and 'I(a)', though I suppose we could, by analogy with '$D_A$', write '$I_A(P)$' and '$I_A(a)$'; and I will in fact use this subscript notation (it would seem cleaner to reserve bars exclusively for clauses for the evaluation of expressions).

Sentences (for $L_2$, $L_=$, $L_2^+$, or $L_=^+$) will be either outright true or outright false in a structure; but there are also formulae with free variables, and, except in certain limiting cases, we can only make sense of their receiving a definite truth value relative to an assignment of elements of the domain of a structure to the free variables. In fact, even if we were interested only in the truth or falsity of sentences, we'd still need something like an apparatus of assignments to accommodate semantic clauses for the quantifiers. And so we need a definition of the truth value $|\varphi|^{\alpha}_{A}$ of a formula φ in a structure A with respect to an assignment α.

It'd be very messy defining $|\varphi|^{\alpha}_{A}$ just for those αs that made an assignment exclusively to the variables free in φ. And so assignments to all the variables in the language are used throughout: these are just functions from the set of variables into the domain of a structure A. Let's write 'A(A)' for the set of all such assignments. Thus A(A) = A(B) iff A and B have the same domain. (And, since the languages $L_2$, $L_=$, $L_2^+$, and $L_=^+$ all have the same stock of denumerably many variables, there'll not be any difference in the totality of assignments for the different languages.) Note that if $D_A$ contains exactly one element, then A(A) will also contain exactly one element; but otherwise, even if $D_A$ contains only two elements, A(A) will be uncountably infinite: of cardinality larger than $\aleph_0$ (i.e. it will be bigger than ℕ).[150]

[150] If $D_A$ is either finite with two or more elements or is denumerable (i.e. countably infinite), then A(A) will be of cardinality $2^{\aleph_0}$ (the size of P(ℕ)). But if $D_A$ is itself of cardinality $2^{\aleph_0}$, then A(A) won't be any larger: it too will be of cardinality $2^{\aleph_0}$.
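The cardinality claims just made can be checked with one line of cardinal arithmetic. The following is a sketch under two assumptions: that exponentiation is cardinal exponentiation, and that the notation |X| for the cardinality of a set X (not otherwise used in these notes) is read in the usual way. Since an assignment is a function from the denumerably many variables into $D_A$, the set A(A) has $|D_A|^{\aleph_0}$ elements.

```latex
\[
  |A(A)| = |D_A|^{\aleph_0}, \qquad 1^{\aleph_0} = 1,
\]
\[
  2 \le |D_A| \le 2^{\aleph_0}
  \;\Longrightarrow\;
  2^{\aleph_0} \;\le\; |D_A|^{\aleph_0} \;\le\; \left(2^{\aleph_0}\right)^{\aleph_0}
  \;=\; 2^{\aleph_0 \cdot \aleph_0} \;=\; 2^{\aleph_0}.
\]
```

So every domain with at least two and at most $2^{\aleph_0}$ elements yields exactly $2^{\aleph_0}$ assignments, which covers the finite, denumerable, and continuum-sized cases mentioned above.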
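In preparation for the recursive definition of $|\varphi|^{\alpha}_{A}$, we need first to give the evaluation $|t|^{\alpha}_{A}$ of a term t in A with respect to α. Since we have no compound terms, this is easy (and obvious): for a variable v, $|v|^{\alpha}_{A} = \alpha(v)$; and for a constant a, $|a|^{\alpha}_{A} = I_A(a)$. This is just like the Manual, except for the '$I_A$' notation, which was mentioned above.

When it comes to the clauses for ∀v and ∃v, however, I'll part company from the Manual, since I think these clauses are somewhat clumsy and unnatural (though Halbach is far from alone in favouring them). The point is that they quantify over assignments, rather than quantifying directly over elements of $D_A$[151]: $|\forall v\varphi|^{\alpha}_{A}$ (respectively, $|\exists v\varphi|^{\alpha}_{A}$) is defined to be ⊤ iff $|\varphi|^{\beta}_{A} = \top$ for every (respectively, some) assignment β such that β(w) = α(w) for all variables w distinct from v. But we'll be able to quantify over $D_A$ itself if we first give the following definition of an assignment α[v:d] for any α ∈ A(A), variable v, and $d \in D_A$:

α[v:d](v) = d;
α[v:d](w) = α(w), if w ≠ v.

(This definition will also make the formulation, and proof, of results about substitution a great deal more straightforward than they would otherwise be.)

Anyhow, here is the definition of $|\varphi|^{\alpha}_{A}$ in full:

\[
\begin{array}{ll}
\text{1a)} & |Pt_1 \ldots t_n|^{\alpha}_{A} = \begin{cases} \top & \text{if } \langle |t_1|^{\alpha}_{A}, \ldots, |t_n|^{\alpha}_{A} \rangle \in I_A(P) \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{[1b)} & |\top|^{\alpha}_{A} = \top, \text{ ]} \\[1ex]
\text{[1c)} & |\bot|^{\alpha}_{A} = \bot, \text{ ]} \\[1ex]
\text{[1d)} & |t_1 = t_2|^{\alpha}_{A} = \begin{cases} \top & \text{if } |t_1|^{\alpha}_{A} = |t_2|^{\alpha}_{A} \\ \bot & \text{otherwise, ]} \end{cases} \\[2ex]
\text{2a)} & |\neg\varphi|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha}_{A} = \bot \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{2b)} & |(\varphi \wedge \psi)|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha}_{A} = \top \text{ and } |\psi|^{\alpha}_{A} = \top \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{2c)} & |(\varphi \vee \psi)|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha}_{A} = \top \text{ or } |\psi|^{\alpha}_{A} = \top \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{2d)} & |(\varphi \to \psi)|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha}_{A} = \bot \text{ or } |\psi|^{\alpha}_{A} = \top \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{2e)} & |(\varphi \leftrightarrow \psi)|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha}_{A} = |\psi|^{\alpha}_{A} \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{2f)} & |\forall v\varphi|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha[v:d]}_{A} = \top \text{ for every } d \in D_A \\ \bot & \text{otherwise,} \end{cases} \\[2ex]
\text{2g)} & |\exists v\varphi|^{\alpha}_{A} = \begin{cases} \top & \text{if } |\varphi|^{\alpha[v:d]}_{A} = \top \text{ for some } d \in D_A \\ \bot & \text{otherwise.} \end{cases}
\end{array}
\]

[151] Hence, you might even argue, they actually get the meaning-in-A of ∀ and ∃ wrong!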
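For a structure with a finite domain, the clauses above can be run directly, with the quantifier clauses looping over the elements d of the domain and evaluating the body under α[v:d]. The sketch below is illustrative only and rests on several assumptions of its own: formulae are nested tuples, truth values are Python booleans, relations are sets of tuples, names beginning with u to z count as variables, and an assignment is a finite dict listing just the variables that matter (rather than a total function on all variables).

```python
# Hypothetical encoding (assumed for illustration):
#   ("pred", "P", (t1, ..., tn)), ("top",), ("bot",), ("eq", t1, t2),
#   ("not", phi), ("and" | "or" | "imp" | "iff", phi, psi),
#   ("all" | "ex", v, phi); terms are plain strings.

def is_var(t):
    # Assumption of this sketch: names starting with u..z are variables.
    return t[0] in "uvwxyz"


def term_value(t, interp, alpha):
    """|t|: alpha(v) for a variable v, the interpretation I_A(a) for a constant a."""
    return alpha[t] if is_var(t) else interp[t]


def value(phi, domain, interp, alpha):
    """Truth value of phi in the structure (domain, interp) under alpha."""
    op = phi[0]
    if op == "pred":                                   # clause 1a
        args = tuple(term_value(t, interp, alpha) for t in phi[2])
        return args in interp[phi[1]]
    if op == "top":                                    # clause 1b
        return True
    if op == "bot":                                    # clause 1c
        return False
    if op == "eq":                                     # clause 1d
        return term_value(phi[1], interp, alpha) == term_value(phi[2], interp, alpha)
    if op == "not":                                    # clause 2a
        return not value(phi[1], domain, interp, alpha)
    if op in ("and", "or", "imp", "iff"):              # clauses 2b to 2e
        p = value(phi[1], domain, interp, alpha)
        q = value(phi[2], domain, interp, alpha)
        return {"and": p and q, "or": p or q, "imp": (not p) or q, "iff": p == q}[op]
    if op in ("all", "ex"):                            # clauses 2f, 2g
        v, body = phi[1], phi[2]
        # {**alpha, v: d} plays the role of alpha[v:d].
        results = (value(body, domain, interp, {**alpha, v: d}) for d in domain)
        return all(results) if op == "all" else any(results)
    raise ValueError(f"not a formula: {phi!r}")


# A three-element structure in which R is a cycle 0 -> 1 -> 2 -> 0.
domain = [0, 1, 2]
interp = {"R": {(0, 1), (1, 2), (2, 0)}, "a": 0}
Rxy = ("pred", "R", ("x", "y"))
every_x_some_y = value(("all", "x", ("ex", "y", Rxy)), domain, interp, {})
some_y_every_x = value(("ex", "y", ("all", "x", Rxy)), domain, interp, {})
```

In this structure every element bears R to something, so `every_x_some_y` is true, but no single element has everything bearing R to it, so `some_y_every_x` is false.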
* * *

According to the definition of a structure, it has to interpret all the (infinitely many) items of non-logical vocabulary; but, when we actually use the apparatus, we're typically interested only in a few specific predicates and/or constants[152], and we'll want to talk about 'the' structure which interprets the specific vocabulary in such-and-such way. Strictly speaking this is just undefined, but, as Halbach points out in the footnote on pages 105 and 106 of the Manual, the interpretation of vocabulary we're not interested in is irrelevant: we have the following lemma, which is a predicate-logic version of Lemma 2.2.1.

Lemma 4.2.1 (Relevant Non-Logical Vocabulary). If $D_A = D_B$, $I_A(P) = I_B(P)$ for every predicate letter P in φ, and $I_A(a) = I_B(a)$ for every constant a in φ, then, for any assignment α, $|\varphi|^{\alpha}_{A} = |\varphi|^{\alpha}_{B}$.

Proof. By induction on the complexity of formulae: Exercise 4.2.3.

Observe that in the footnote referred to, Halbach in fact suggests a way of being specific about the structure we mean in saying 'the structure A such that . . . ': we can stipulate that $I_A(P) = \emptyset$, for any predicate letter P we haven't mentioned, and pick on some arbitrary thing d in $D_A$ and stipulate that $I_A(a) = d$, for any constant a we haven't mentioned. But there remains a problem: the treatment of irrelevant predicate letters is unproblematic, but we haven't got any single uniform stipulation for irrelevant constants: the thing arbitrarily picked depends on what the domain is, and we're not in a position to stipulate any uniform principle of picking to determine what is picked, so any stipulation remains ad hoc to the particular domain in question. And, worse, there may be domains whose specification doesn't bring along with it any way of actually specifying a particular element to pick for irrelevant constants, so all we can say is 'pick something'. So, short of presupposing some huge universal choice function, it all remains a bit messy.

Nonetheless, in virtue of Lemma 4.2.1, it's a messiness I suppose we can live with.

* * *

But there's another important analogue of Lemma 2.2.1, which is crucial for working with the evaluation of formulae in a particular structure A:

Lemma 4.2.2 (Relevant Variables). If α(v) = β(v), for every variable v free in φ, then $|\varphi|^{\alpha}_{A} = |\varphi|^{\beta}_{A}$.

[152] See the comments on page 67.
Proof. Induction on the complexity of formulae. The only non-routine steps in the argument are the induction steps for the quantifiers. I'll do ∀: Under the inductive hypothesis that the condition stated in the lemma holds for φ (and holds for any α and β, of course), we have to show that it also holds for ∀wφ. Thus we take α and β such that α(v) = β(v) for every variable v free in ∀wφ, and we have to show that $|\forall w\varphi|^{\alpha}_{A} = |\forall w\varphi|^{\beta}_{A}$. To do this, first observe that it's true of any $d \in D_A$ that α[w:d](v) = β[w:d](v), for every v free in φ: this is because either v = w, in which case α[w:d](v) = d and β[w:d](v) = d, or else both (i) v ≠ w and (ii) v is free in ∀wφ, in which case it follows from (i) that α[w:d](v) = α(v) and β[w:d](v) = β(v), and it follows from (ii) that α(v) = β(v). Hence, by the inductive hypothesis,

\[
(\ast) \qquad |\varphi|^{\alpha[w:d]}_{A} = |\varphi|^{\beta[w:d]}_{A} \quad \text{for any } d \in D_A.
\]

We now have a chain of biconditionals:

\[
\begin{array}{rcll}
|\forall w\varphi|^{\alpha}_{A} = \top & \Leftrightarrow & |\varphi|^{\alpha[w:d]}_{A} = \top \text{ for every } d \in D_A & \text{by definition of } |\cdot|^{\alpha}_{A} \\[1ex]
 & \Leftrightarrow & |\varphi|^{\beta[w:d]}_{A} = \top \text{ for every } d \in D_A & \text{from } (\ast)\ \text{[153]} \\[1ex]
 & \Leftrightarrow & |\forall w\varphi|^{\beta}_{A} = \top & \text{by definition of } |\cdot|^{\beta}_{A}
\end{array}
\]

Hence $|\forall w\varphi|^{\alpha}_{A} = |\forall w\varphi|^{\beta}_{A}$.

The limiting case is when a formula contains no free variables:

Corollary 4.2.3. If σ is a sentence, then, for any α and β, $|\sigma|^{\alpha}_{A} = |\sigma|^{\beta}_{A}$.

Thus the truth or falsity of a sentence σ is completely independent of assignments to variables, and we could coherently just write '$|\sigma|_{A} = \top$' or '$|\sigma|_{A} = \bot$'.

It's common to use '⊨' notation and write '$A \models_{\alpha} \varphi$' or 'A ⊨ φ [α]', or the like, to mean that $|\varphi|^{\alpha}_{A} = \top$; this is read 'A satisfies φ under α', or the like. In fact, semantic clauses are typically given directly as a recursive definition of the relation ⊨. And it's also common to define outright truth for arbitrary formulae. Using the '⊨' notation: A ⊨ φ iff, for every assignment α, $A \models_{\alpha} \varphi$. With this definition the point that the evaluation of sentences is independent of assignments can be encapsulated in the fact that if σ is a sentence and α is any assignment, then A ⊨ σ iff $A \models_{\alpha} \sigma$.[154]

* * *

There's one more lemma that's crucial for working with the semantic definitions. It guarantees that if a term t is substituted for a free variable v in a formula φ, then the evaluation of the result under a given assignment α is the same as the evaluation of the original formula under the assignment that assigns to all variables other than v exactly what α assigns them and directly assigns to v the semantic value (under α) of t. All this holds provided t is indeed substitutable for v according to the definition in Section 4.1. (Here we have a substitution result for variables that matches the propositional Lemma 2.4.1 much as Lemma 4.2.2 matches Lemma 2.2.1.)

Lemma 4.2.4 (Substitution for Variables). If t is substitutable for v in φ, then $|\varphi[t/v]|^{\alpha}_{A} = |\varphi|^{\alpha[v:|t|^{\alpha}_{A}]}_{A}$.

Proof. Induction on the complexity of formulae. Again the only non-routine steps in the argument are the induction steps for quantifiers, and I'll do ∀: Under the inductive hypothesis that the condition stated in the lemma holds for φ (holds with any t, v, and α), we have to show that it also holds for ∀wφ. Thus we take a t and a v such that t is substitutable for v in ∀wφ, and take an assignment α, and we have to show that $|(\forall w\varphi)[t/v]|^{\alpha}_{A} = |\forall w\varphi|^{\alpha[v:|t|^{\alpha}_{A}]}_{A}$. We shall split the argument into two separate cases: (1) v = w; (2) v ≠ w.

In case (1), we have that (i) (∀wφ)[t/v] = ∀wφ, since there's no actual substitution, and (ii) it's true of any $d \in D_A$ that $\alpha[w:d] = \alpha[v:|t|^{\alpha}_{A}][w:d]$, since $|t|^{\alpha}_{A}$ replaces α(v), i.e. α(w), but is in turn replaced by d. And so

\[
\begin{array}{rcll}
|(\forall w\varphi)[t/v]|^{\alpha}_{A} = \top & \Leftrightarrow & |\forall w\varphi|^{\alpha}_{A} = \top & \text{from (i) above} \\[1ex]
 & \Leftrightarrow & |\varphi|^{\alpha[w:d]}_{A} = \top \text{ for every } d \in D_A & \text{by definition of } |\cdot|^{\alpha}_{A} \\[1ex]
 & \Leftrightarrow & |\varphi|^{\alpha[v:|t|^{\alpha}_{A}][w:d]}_{A} = \top \text{ for every } d \in D_A & \text{from (ii) above [155]} \\[1ex]
 & \Leftrightarrow & |\forall w\varphi|^{\alpha[v:|t|^{\alpha}_{A}]}_{A} = \top & \text{by definition of } |\cdot|^{\alpha[v:|t|^{\alpha}_{A}]}_{A}
\end{array}
\]

In case (2), we have that (i) (∀wφ)[t/v] = ∀w(φ[t/v]), and, since t is substitutable for v in ∀wφ, both (ii) t is substitutable for v in φ and (iii) w ≠ t; and it follows from (iii) that, for any $d \in D_A$, $|t|^{\alpha[w:d]}_{A} = |t|^{\alpha}_{A}$; hence too, since v ≠ w, that (iv) $\alpha[w:d][v:|t|^{\alpha[w:d]}_{A}] = \alpha[v:|t|^{\alpha}_{A}][w:d]$: this for any $d \in D_A$.

Now we have enough for a chain of biconditionals:

\[
\begin{array}{rcll}
|(\forall w\varphi)[t/v]|^{\alpha}_{A} = \top & \Leftrightarrow & |\forall w(\varphi[t/v])|^{\alpha}_{A} = \top & \text{from (i) above} \\[1ex]
 & \Leftrightarrow & |\varphi[t/v]|^{\alpha[w:d]}_{A} = \top \text{ for every } d \in D_A & \text{by definition of } |\cdot|^{\alpha}_{A} \\[1ex]
 & \Leftrightarrow & |\varphi|^{\alpha[w:d][v:|t|^{\alpha[w:d]}_{A}]}_{A} = \top \text{ for every } d \in D_A & \text{from (ii) above, by inductive hypothesis [156]} \\[1ex]
 & \Leftrightarrow & |\varphi|^{\alpha[v:|t|^{\alpha}_{A}][w:d]}_{A} = \top \text{ for every } d \in D_A & \text{from (iv) above [157]} \\[1ex]
 & \Leftrightarrow & |\forall w\varphi|^{\alpha[v:|t|^{\alpha}_{A}]}_{A} = \top & \text{by definition of } |\cdot|^{\alpha[v:|t|^{\alpha}_{A}]}_{A}
\end{array}
\]

Hence, in either case, $|(\forall w\varphi)[t/v]|^{\alpha}_{A} = |\forall w\varphi|^{\alpha[v:|t|^{\alpha}_{A}]}_{A}$.

[153] To symbolize the move: we're inferring [∀xFx ↔ ∀xGx] from ∀x[Fx ↔ Gx].
[154] Note that Halbach defines outright truth as above, but defines it only for sentences.
[155] The same inferential move as described in footnote 153.
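Lemma 4.2.4 can be spot-checked mechanically on a small finite structure by computing both sides for every assignment to the relevant variables. The sketch below is an illustration rather than a proof, and everything in it is an assumption made for the purpose: the tuple encoding, the restriction to a fragment with predicate letters, ¬, ∧, ∀ and ∃, the convention that names beginning with u to z are variables, and the function name `lemma_holds`.

```python
from itertools import product

# Hypothetical encoding (assumed): ("pred", "P", (t1, ..., tn)), ("not", phi),
# ("and", phi, psi), ("all" | "ex", w, phi); terms are plain strings.
# Only this fragment is handled, to keep the sketch short.

def is_var(t):
    return t[0] in "uvwxyz"          # assumed naming convention for variables


def subst(phi, t, v):
    """phi[t/v] for the fragment above."""
    op = phi[0]
    if op == "pred":
        return ("pred", phi[1], tuple(t if u == v else u for u in phi[2]))
    if op == "not":
        return ("not", subst(phi[1], t, v))
    if op == "and":
        return ("and", subst(phi[1], t, v), subst(phi[2], t, v))
    w, body = phi[1], phi[2]         # op is "all" or "ex"
    return phi if w == v else (op, w, subst(body, t, v))


def term_value(t, interp, alpha):
    return alpha[t] if is_var(t) else interp[t]


def value(phi, domain, interp, alpha):
    """Truth value (a Python bool) of phi under alpha, for the fragment above."""
    op = phi[0]
    if op == "pred":
        return tuple(term_value(t, interp, alpha) for t in phi[2]) in interp[phi[1]]
    if op == "not":
        return not value(phi[1], domain, interp, alpha)
    if op == "and":
        return value(phi[1], domain, interp, alpha) and value(phi[2], domain, interp, alpha)
    results = (value(phi[2], domain, interp, {**alpha, phi[1]: d}) for d in domain)
    return all(results) if op == "all" else any(results)


def lemma_holds(phi, t, v, domain, interp, variables):
    """Compare |phi[t/v]| under alpha with |phi| under alpha[v:|t|], for all alpha."""
    for values in product(domain, repeat=len(variables)):
        alpha = dict(zip(variables, values))
        shifted = {**alpha, v: term_value(t, interp, alpha)}   # alpha[v:|t|]
        left = value(subst(phi, t, v), domain, interp, alpha)
        right = value(phi, domain, interp, shifted)
        if left != right:
            return False
    return True


# exists y Rxy, with the variable z (substitutable for x) put in for x.
some_y_Rxy = ("ex", "y", ("pred", "R", ("x", "y")))
structure = {"R": {(0, 1), (1, 2)}, "a": 2}
agrees = lemma_holds(some_y_Rxy, "z", "x", [0, 1, 2], structure, ("x", "y", "z"))
```

Because z is substitutable for x in ∃yRxy, the lemma guarantees that `agrees` is true. The check covers only the listed variables, which suffices here since no others occur in the formula or the term.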
* * *

Exercise 4.2.1. Show that the semantic clauses given in these notes for ∀ and ∃ are (extensionally) equivalent to the ones in the Manual. (See page 72.)

Exercise 4.2.2. Given a structure A, let its 'standard restriction' A′ be defined as follows: $D_{A'} = D_A$; $I_{A'}(a) = I_A(a)$, for all constants a; and $I_{A'}(P) = I_A(P) \cap D_A^n$, for all n-place predicate letters P, for all n. (See page 71.) Show that, for any formula φ and any assignment α, $|\varphi|^{\alpha}_{A'} = |\varphi|^{\alpha}_{A}$.

Exercise 4.2.3. Do the inductive proof of Lemma 4.2.1.

Exercise 4.2.4. Fill in more of the inductive argument in the proof of Lemma 4.2.2: do the base case, for atomic formulae, and the step for ∃.

Exercise 4.2.5. Fill in more of the inductive argument in the proof of Lemma 4.2.4: do the base case, for atomic formulae, and the step for ∃.

Exercise 4.2.6. Specify a formula φ, a variable v, a term t, a structure A, and an assignment α such that $|\varphi[t/v]|^{\alpha}_{A} \neq |\varphi|^{\alpha[v:|t|^{\alpha}_{A}]}_{A}$. (See Lemma 4.2.4.)

Exercise 4.2.7. Let $\|\varphi\|_{A} = \{\alpha \in A(A) : |\varphi|^{\alpha}_{A} = \top\}$.
(a) Give a direct definition, by recursion on the definition of formulae, of the function $\varphi \mapsto \|\varphi\|_{A}$. [Don't use '= ⊤', '≠ ⊤', '= ⊥', or '≠ ⊥' anywhere.]
(b) Show that if φ is a sentence, then either $\|\varphi\|_{A} = A(A)$ or $\|\varphi\|_{A} = \emptyset$.
(c) Is it only when φ is a sentence that either $\|\varphi\|_{A} = A(A)$ or $\|\varphi\|_{A} = \emptyset$?

Exercise 4.2.8. Let $v_1, \ldots, v_n$ be an exhaustive list of the variables free in φ. Show that $A \models \varphi \Leftrightarrow A \models \forall v_1 \ldots \forall v_n \varphi$. (For the '⊨' notation see page 74.)

[156] The same inferential move as described in footnote 153.
[157] The same inferential move as described in footnote 153.
4.3
Satisfiability, Logical Implication, etc.
Semantic properties of formulae, sets of formulae, and relations between sets of formulae and formulae all carry over from propositional logic in an obvious way. In the Manual these definitions are restricted to sentences, and sets of sentences, but we can give definitions for arbitrary formulae and sets of arbitrary formulae (from $L_2$, $L_2^+$, $L_=$, $L_=^+$, or whatever).

Satisfiability (Semantic Consistency): First, the use of the turnstile '⊨' notation, introduced on page 74, can be extended to sets of formulae: $A \models_{\alpha} \Gamma$ iff $A \models_{\alpha} \gamma$ (i.e. $|\gamma|^{\alpha}_{A} = \top$) for all γ ∈ Γ. Terminology: we read '$A \models_{\alpha} \Gamma$' as 'A satisfies Γ with α', etc.; and we can say 'A satisfies Γ' when A satisfies Γ with some assignment or other[158]. Then Γ is satisfiable (semantically consistent) iff $A \models_{\alpha} \Gamma$, for some A and α. And Γ is unsatisfiable (semantically inconsistent) iff it is not satisfiable (not semantically consistent), i.e. iff $A \models_{\alpha} \Gamma$ for no A and α. We can also say of an individual formula φ that it is unsatisfiable: precisely when {φ} is unsatisfiable.

[158] Don't confuse this with the following definition that some people give for the notation 'A ⊨ Γ': that A satisfies Γ with all assignments (or, equivalently, A ⊨ γ for all γ ∈ Γ).
Logical Implication (Entailment (Semantic Consequence)): Γ ⊨ φ iff there is no structure A and assignment α such that $A \models_{\alpha} \Gamma$ but $A \not\models_{\alpha} \varphi$.

Note that general properties of satisfiability and entailment carry over straightforwardly from propositional logic: in particular, Lemmas 2.3.1 and 2.3.2 can be repeated verbatim. Furthermore, entailment and satisfiability fit together in the same way as before: Lemma 2.3.3 and Exercise 2.3.1.

Logical truth[159]: A formula φ is logically true iff, for all structures A, A ⊨ φ: that is, spelt out in full, for all A and all α, $A \models_{\alpha} \varphi$. Hence the logical truth of formulae, like tautologousness in propositional logic, can be characterized as a special case of being entailed: φ is logically true iff ∅ ⊨ φ (⊨ φ). There's being logically false, of course, as well as being logically true: φ will be logically false iff $A \models_{\alpha} \varphi$ for no A and α ($|\varphi|^{\alpha}_{A} = \bot$, for all A and α), which is a generalization to arbitrary formulae of the Manual definition of being a contradiction. But in fact we've come across this case already: φ's being logically false is precisely φ's being unsatisfiable, as this was defined above.

Logical Equivalence: We could define φ and ψ to be logically equivalent iff, for all A and α, $|\varphi|^{\alpha}_{A} = |\psi|^{\alpha}_{A}$; or, equivalently, iff φ ⊨ ψ and ψ ⊨ φ (φ ⫤⊨ ψ).

* * *

Sentences and Models: When we do restrict attention to sentences, we can then make use of the remarks following Corollary 4.2.3 to set up assignment-independent definitions. So, for a set Σ of sentences, and a structure A: A ⊨ Σ iff A ⊨ σ for all σ ∈ Σ. The terminology 'model' is standard when we're dealing with sentences: we read 'A ⊨ Σ' as 'A is a model of Σ'. And we also say that A is a model of σ when A ⊨ σ (which coincides with when A ⊨ {σ}, of course). Thus being-a-model-of is a special case of satisfying; and being satisfiable is just having a model. The independence of sentences from assignments is now pointed up by the fact that if Σ is a set of sentences and α is any assignment, then A ⊨ Σ iff $A \models_{\alpha} \Sigma$. Note, too, that for sentences our definition of entailment boils down to

Σ ⊨ σ iff σ is true in any model of Σ.

[159] Note that in predicate logic it's very common to use the terminology 'valid', or 'logically valid', for individual formulae to mean that they are logically true.
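For sentences, the characterization of entailment in terms of models suggests a simple way of refuting an entailment claim: search small finite structures for a model of the premises in which the conclusion is false. The sketch below does this for sentences whose only non-logical vocabulary is a two-place predicate letter R. It is an assumption-laden illustration: the tuple encoding, the restricted fragment (predicate letters, ¬, →, ∀, ∃), and the names `binary_relations` and `find_countermodel` are all made up for the purpose. Note the asymmetry: finding a countermodel refutes Σ ⊨ σ, but failing to find one up to a given size establishes nothing, since the only countermodels might be larger or infinite.

```python
from itertools import product

# Hypothetical encoding (assumed): ("pred", "R", (x, y)), ("not", phi),
# ("imp", phi, psi), ("all" | "ex", v, phi). Terms are variables only, and
# every formula passed in is assumed to be a sentence.

def value(phi, domain, interp, alpha):
    """Truth value (a Python bool) of phi under alpha, for the fragment above."""
    op = phi[0]
    if op == "pred":
        return tuple(alpha[t] for t in phi[2]) in interp[phi[1]]
    if op == "not":
        return not value(phi[1], domain, interp, alpha)
    if op == "imp":
        return (not value(phi[1], domain, interp, alpha)) or value(phi[2], domain, interp, alpha)
    results = (value(phi[2], domain, interp, {**alpha, phi[1]: d}) for d in domain)
    return all(results) if op == "all" else any(results)


def binary_relations(domain):
    """Yield every binary relation on the domain as a set of pairs."""
    pairs = list(product(domain, repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        yield {pair for pair, keep in zip(pairs, bits) if keep}


def find_countermodel(premises, conclusion, max_size):
    """Return (domain, relation) making the premises true and the conclusion
    false, searching domains of size 1..max_size; None if the search fails."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        for rel in binary_relations(domain):
            interp = {"R": rel}
            if all(value(p, domain, interp, {}) for p in premises) \
                    and not value(conclusion, domain, interp, {}):
                return domain, rel
    return None


Rxy = ("pred", "R", ("x", "y"))
every_some = ("all", "x", ("ex", "y", Rxy))      # forall x exists y Rxy
some_every = ("ex", "y", ("all", "x", Rxy))      # exists y forall x Rxy

no_refutation = find_countermodel([some_every], every_some, 3)
refutation = find_countermodel([every_some], some_every, 3)
```

The first search returns `None`, as it must, since ∃y∀xRxy does entail ∀x∃yRxy. The second search returns a two-element structure, which shows that the converse entailment fails.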
Exercise 4.3.1.
(a) Show that ⊨ ∀v(φ → ψ) → (φ → ∀vψ), for any formulae φ and ψ, and any variable v that does not occur free in φ. And give an example to show that the constraint on v is necessary.
(b) Show that ⊨ ∀vφ → φ[t/v], for any formula φ, any variable v, and any term t that is substitutable for v in φ. And give an example to show that the constraint on t is necessary.

Exercise 4.3.2.
(a) For each of the following, either establish that it holds for any set Γ of formulae, any formula φ, and any variable v, or else specify a counterexample (some Γ, φ, and v such that it doesn't hold).
(i) Γ ⊨ ∀vφ, if Γ ⊨ φ
(ii) Γ ⊨ ∃vφ, if Γ ⊨ φ
(b) Now replace 'if' by 'only if' in (i) and (ii) and proceed as before.
(c) Would any of your answers in (a) or (b) be different if we assumed that v did not occur free anywhere in Γ? Justify your claims.
(d) Show that your answers in (a), (b), and (c) concerning (i) and (ii) apply, respectively, also to (i′) and (ii′), with 'and ψ is a formula' added to the rubric in (a), and 'anywhere in Γ ∪ {ψ}' replacing 'anywhere in Γ' in (c):
(i′) Γ, ∃vφ ⊨ ψ, if Γ, φ ⊨ ψ
(ii′) Γ, ∀vφ ⊨ ψ, if Γ, φ ⊨ ψ
[Don't do all the work again: deduce the results from your previous answers.]

Exercise 4.3.3. Show that all the following hold, whenever Σ, $\Sigma_1$, and $\Sigma_2$ are sets of sentences, φ is a formula in which no variable distinct from v occurs free (so that ∀vφ and ∃vφ are sentences), σ is a sentence, and a and b are constants:
(i) If Σ ⊨ ∀vφ, then Σ ⊨ φ[a/v].
(ii) If Σ ⊨ φ[a/v], then Σ ⊨ ∃vφ.
(iii) If Σ ⊨ φ[a/v], then Σ ⊨ ∀vφ, provided that a occurs nowhere in Σ ∪ {φ}.
(iv) If $\Sigma_1 \models \exists v\varphi$ and $\Sigma_2 \models \sigma$, then $\Sigma_1 \cup (\Sigma_2 \setminus \{\varphi[a/v]\}) \models \sigma$, provided that a occurs nowhere in $(\Sigma_2 \setminus \{\varphi[a/v]\}) \cup \{\varphi, \sigma\}$.
(v) ∅ ⊨ a = a.
(vi a) If $\Sigma_1 \models \varphi[a/v]$ and $\Sigma_2 \models a = b$, then $\Sigma_1 \cup \Sigma_2 \models \varphi[b/v]$.
(vi b) If $\Sigma_1 \models \varphi[a/v]$ and $\Sigma_2 \models b = a$, then $\Sigma_1 \cup \Sigma_2 \models \varphi[b/v]$.
Exercise 4.3.4. For each of the following either show that it is logically true, or else give a counterexample (a structure, or structure-plus-assignment, in which it is false).
(ia) ∃x(∃yRxy → ∃yRyy)
(ib) ∃x(∃yRxy → ∃yRzy)
(iia) ∃y(Rxy → ∀x∃yRxy)
(iib) ∃y(Rxy → ∃x∀yRxy)
(iiia) ∃x∀y(Rxy → ∃y∀xRxy)
(iiib) ∃x∀y(Rxy → ∀y∃xRxy)

Exercise 4.3.5.
(a) Specify a model of the following set of sentences: { ∀x¬Rxx, ∀x∀y∀z(Rxy → (Ryz → Rxz)), ∀x∃yRxy }.
(b) Show that no model of this set has a finite domain.
(c) Specify a sentence containing no non-logical vocabulary other than R which is true in every finite structure but false in some infinite structure.

Exercise 4.3.6.
(a) For any formula φ in the language $L_=$ or $L_=^+$, and any variable v, let $w_0, w_1, \ldots, w_n, \ldots$ be an enumeration of those variables distinct from v that don't occur anywhere in φ, and define ∃n vφ by recursion on the natural numbers as follows:
∃0 vφ = ∀v v = v,
∃n+1 vφ = ∃$w_n$(φ($w_n$/v) ∧ ∃n v(¬v = $w_n$ ∧ φ)).
What does ∃n vφ mean?[160]
(b) Give a recursive definition of ∃Qn vφ, so as to make it mean 'there are exactly n things v such that φ': that is, for any A and α, $|\exists Qn\, v\varphi|^{\alpha}_{A} = \top$ if and only if there are exactly n elements d of $D_A$ such that $|\varphi|^{\alpha[v:d]}_{A} = \top$.

[160] Note that in the definition of ∃n+1 vφ the variable $w_n$ is new: it won't get caught up in the scope of any quantifier in φ($w_n$/v) ∧ ∃n v(¬v = $w_n$ ∧ φ).
5.0
The Compactness Theorem and Cardinality Properties
Theorem: Σ has a model if (and only if) every finite subset of Σ has a model.

* * *

What cardinality properties can/can't be expressed by a single sentence? by a set of sentences?[161]

In the languages $L_=$ and $L_=^+$ we may proceed as follows: Define $\sigma_{\ge 0}$ to be ∀x x = x; $\sigma_{\ge 1}$ to be ∃x x = x;[162] and, for n ≥ 2, define $\sigma_{\ge n}$ to be ∃x1 … ∃xn ⋀ 1≤i […] n such that Σ has a model with domain of size m), then Σ has an infinite model.

Proof: Apply the Compactness Theorem to the set $\Sigma \cup \{\sigma_{\ge n} \mid n \in \mathbb{N}\}$.

Negative Results:
There is no Σ such that all A satisfy the following: A ⊨ Σ ⇔ $D_A$ is finite.
There is no σ such that all A satisfy the following: A ⊨ σ ⇔ $D_A$ is finite.
There is no σ such that all A satisfy the following: A ⊨ σ ⇔ $D_A$ is infinite.

* * *

Exercise 4.4.1. Spell out the proof of the lemma.

Exercise 4.4.2. Establish the negative results.

Exercise 4.4.3. For each of the following properties of a structure A, say whether or not there is (i) a sentence, (ii) a set of sentences, such that all and only models of (i) that sentence, (ii) that set of sentences, have the property. Justify your answers.
(i) $D_A$ contains an even finite number of elements.
(ii) $D_A$ either contains an odd finite number of elements or is infinite.
(iii) $D_A$ contains an even finite number of elements that's strictly less than 512.
(iv) $D_A$ contains the same number of elements as votes cast at the next General Election, if there is one and at least one vote is cast, and 99 elements otherwise.

Exercise 4.4.4. Show that if Σ is satisfied by all structures with an infinite domain, then every finite subset of Σ is satisfied by some structure with a finite domain.

[161] Of course, if by a single sentence σ, then a fortiori by a set of sentences: {σ}.
[162] Both $\sigma_{\ge 0}$ and $\sigma_{\ge 1}$ are, of course, logical truths in the semantics we've been working with, since any domain $D_A$ will be nonempty; the point of starting with n = 0 and using distinct sentences for $\sigma_{\ge 0}$ and $\sigma_{\ge 1}$ will only emerge when we consider empty domains.
[163] The definition we've given of $\sigma_{\ge n}$ is the obvious way of symbolizing 'there are at least n things'; but the length of the formulae grows exponentially as n increases: there is a conjunction of n(n − 1) inequalities. The formula ∀x1 … ∀xn−1 ∃y ⋀ 1≤i