A DECISION METHOD FOR ELEMENTARY ALGEBRA AND GEOMETRY
second edition, revised
by ALFRED TARSKI
prepared for publication with the assistance of J. C. C. McKinsey
UNIVERSITY OF CALIFORNIA PRESS
Berkeley and Los Angeles 1951
University of California Press, Berkeley and Los Angeles, California
Cambridge University Press, London, England
Copyright, 1951, by the Regents of the University of California
Second Edition, Revised
Printed in the United States of America
PREFACE
This work has a long history. Its main results were found in 1930 and first mentioned in print a year later. It took nine years, however, before the material in its full development was prepared for publication. In fact, a monograph embodying those results was scheduled to appear in 1939 under the title The completeness of elementary algebra and geometry in the collection Actualités scientifiques et industrielles, Hermann & Cie, Paris. As a result of the war, the publication did not materialize; two existing sets of page proofs are probably the only trace of this venture.

Naturally, I did not abandon the hope of seeing the results in print. However, as often happens with authors, the original version ceased to satisfy me. The specific character of the results seemed to call for a more formal presentation, and the addition of new observations and conclusions seemed to be desirable. The perplexity of the war and postwar periods resulted in postponing the revision from year to year. Hence I was happy when, in the beginning of 1948, the RAND Corporation, Santa Monica, California, became interested in my results and offered to publish them. It was especially fortunate that Professor J. C. C. McKinsey (Stanford University), who was at that time working with the RAND Corporation, was entrusted with the task of preparing the work for publication. With my collaboration, he actually prepared a new draft of the monograph and, aside from his competent editorial work, he contributed to a simplification of the development.

Within a few months the monograph was published. As was to be expected, it reflected the specific interests which the RAND Corporation found in the results. The decision method for elementary algebra and geometry (which is one of the main results of the work) was presented in a systematic and detailed way, thus bringing to the fore the possibility of constructing an actual decision machine. Other, more theoretical aspects of the problems discussed were treated less thoroughly, and only in notes.

The present edition is a photographic reprint of that published by the RAND Corporation, and hence no extensive changes could be introduced in the text. However, all known errors have been corrected, the introduction has been partly changed, and some supplementary notes have been added. In addition to new bibliographical references (mostly to recent literature), these supplementary notes contain elaborations of the original text which, in at least one case, result in enriching the content of the monograph by new theoretical material.

I should like to thank the readers and reviewers who drew my attention to misprints and minor flaws in the original edition, in particular, Professor L. A. Henkin (University of Southern California), Dr. G. F. Rose (University of Wisconsin), and Professor B. L. van der Waerden (University of Zurich, Switzerland). I also want to express my warm gratitude to Mr. Solomon Feferman and Mr. Frederick B. Thompson (both of the University of California) for their assistance in preparing the present edition. Mr. Feferman's work in this connection was done within the framework of a project sponsored by the Office of Naval Research.

Alfred Tarski
Berkeley, California
May 1951
CONTENTS
INTRODUCTION
SECTION 1. THE SYSTEM OF ELEMENTARY ALGEBRA
SECTION 2. DECISION METHOD FOR ELEMENTARY ALGEBRA
SECTION 3. EXTENSIONS TO RELATED SYSTEMS
NOTES
BIBLIOGRAPHY
SUPPLEMENTARY NOTES
A DECISION METHOD FOR ELEMENTARY ALGEBRA AND GEOMETRY
INTRODUCTION
By a decision method for a class K of sentences (or other expressions) is meant a method by means of which, given any sentence Θ, one can always decide in a finite number of steps whether Θ is in K; by a decision problem for a class K we mean the problem of finding a decision method for K. A decision method must be like a recipe, which tells one what to do at each step so that no intelligence is required to follow it; and the method can be applied by anyone so long as he is able to read and follow directions.

The importance of the decision problem for the whole of mathematics (and for various special mathematical theories) was stressed by Hilbert, who considered this as the main task of a new field of mathematical research for which he suggested the term "metamathematics". The most important kind of decision problems is that in which K is defined to be the class of true sentences of a certain theory. When we say that there is a decision method for a certain theory, we mean that there is a decision method for the class of true sentences of the theory.(1) (All superscripts in round brackets refer to Notes, pp. 47ff.)

Some decision methods have been known for a very long time. For example, Euclid's algorithm provides (among other things) a decision method for the class of all true sentences of the form "p and q are relatively prime," where p and q are integers (or polynomials with constant coefficients). And Sturm's theorem enables one to decide how many roots a given polynomial has and thus to decide on the truth of sentences of the form, "the polynomial p has exactly k roots."

Other decision methods are of more recent date. Löwenheim (1915) gave a decision method for the class of correct formulas of the lower predicate calculus involving only one variable. Post (1921) gave an exact proof of the validity of the familiar decision method (the so-called "truth-table method") for ordinary sentential calculus.
Langford (1927) gave a decision method for an elementary theory of linear order. Presburger (1930) gave a decision method for the part of the arithmetic of integers which involves only the operation of addition. Tarski (1940) found a decision method for the elementary theory of Boolean algebra. McKinsey (1943) gave a decision method for the class of true universal sentences of elementary lattice theory. Mrs. Szmielew has recently found a decision method for the elementary theory of Abelian groups*8'.
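The two classical decision methods mentioned above, Euclid's algorithm and Sturm's theorem, can be illustrated by a short sketch. It is only an illustration under stated assumptions: the function names and the representation of a polynomial as a list of coefficients (lowest degree first) are choices made here, not part of the monograph, and the root count returned is the number of distinct real roots of a nonconstant polynomial with rational coefficients.

```python
from fractions import Fraction
from math import gcd


def relatively_prime(p: int, q: int) -> bool:
    """Decide the sentence "p and q are relatively prime" (Euclid's algorithm)."""
    return gcd(p, q) == 1


def _trim(poly):
    """Drop zero coefficients of highest degree (coefficients: lowest degree first)."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly


def _remainder(a, b):
    """Remainder of the polynomial a on division by the nonzero polynomial b."""
    a = _trim(a)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = _trim(a)  # the leading term cancels exactly with Fractions
    return a


def _sign_changes(values):
    """Number of sign changes in a sequence, zeros being skipped."""
    signs = [v for v in values if v != 0]
    return sum(1 for x, y in zip(signs, signs[1:]) if (x > 0) != (y > 0))


def count_distinct_real_roots(coeffs):
    """Number of distinct real roots of a nonconstant polynomial (Sturm's theorem)."""
    p0 = _trim([Fraction(c) for c in coeffs])
    p1 = _trim([i * c for i, c in enumerate(p0)][1:])  # derivative of p0
    chain = [p0, p1]
    while True:
        rem = _remainder(chain[-2], chain[-1])
        if not rem:
            break
        chain.append([-c for c in rem])  # Sturm chain uses negated remainders
    # The signs at -infinity and +infinity are determined by the leading terms.
    at_plus = [p[-1] for p in chain]
    at_minus = [p[-1] * (-1) ** (len(p) - 1) for p in chain]
    return _sign_changes(at_minus) - _sign_changes(at_plus)
```

For instance, `count_distinct_real_roots([-1, 0, 1])` treats the polynomial x² - 1, and `count_distinct_real_roots([1, 0, 1])` treats x² + 1; a sentence of the form "the polynomial p has exactly k roots" is then decided by comparing the returned count with k.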
There are also some important negative results in this connection. From the fundamental results of Gödel (1931) and subsequent improvements of them obtained by Church (1936) and Rosser (1936), it follows that there does not exist a decision method for any theory to which belong all the sentences of elementary number theory (i.e., the arithmetic of integers with addition and multiplication), and hence no decision method for the whole of mathematics is possible. A similar result has been obtained recently by Mrs. Robinson for theories to which belong all the sentences of the arithmetic of rationals. It is also known that there do not exist decision methods for various parts of modern algebra: in fact, for the elementary theory of rings (Mostowski and Tarski), the elementary theories of groups and lattices (Tarski), and the elementary theory of fields (Mrs. Robinson).

In this monograph we present a method (found in 1930 but previously unpublished) for deciding on the truth of sentences of the elementary algebra of real numbers, and hence also of elementary geometry. By elementary algebra we understand that part of the general theory of real numbers in which one uses exclusively variables representing real numbers, constants denoting individual numbers, like "0" and "1", symbols denoting elementary operations on and elementary relations between real numbers, like "+", "·", "-", ">", and "=", and expressions of elementary logic such as "and", "or", "not", "for some x", "for all x". Among formulas of elementary algebra we find algebraic equations and inequalities; and by combining equations and inequalities by means of the logical expressions listed above, we obtain arbitrary sentences of elementary algebra. Thus, for example, the following are sentences of elementary algebra:

0 > (1 + 1) + (1 + 1);

For every a, b, c, and d, where a ≠ 0, there exists an x such that ax³ + bx² + cx + d = 0.

The first sentence is false, and the second is true.
On the other hand, in elementary algebra we do not use variables standing for arbitrary sets or sequences of real numbers, for arbitrary functions of real numbers, and the like. (When in this monograph we attach the qualifier "elementary" to the name of a theory, we refer to this abstention from the use of set-theoretical notions.) Hence those algebraic concepts whose definitions in terms of the fundamental notions listed above would require some set-theoretical devices cannot be represented in our system of elementary algebra. This applies, for instance, to the general notion of a polynomial, to the notion of solvability of an equation by means of radicals, and the like. For this reason it is not possible, for example, to consider as a sentence of elementary algebra the sentence:

Every polynomial has at least one root.

On the other hand, one can formulate in elementary algebra the sentences:

Every polynomial of degree 1 has a root;
Every polynomial of degree 2 has a root;
Every polynomial of degree 3 has a root;

and so on. Since we are dealing with real (not complex) algebra, the above sentences are true for odd degree but false for even degree.
It should be emphasized that the general notion of an integer (as well as that of a rational, or of an algebraic number) also belongs to those notions which cannot be represented in our system of elementary algebra, and this in spite of the fact that each individual integer can easily be represented (e.g., 2 as 1 + 1, 3 as 1 + 1 + 1, etc.)'8'. The variables in elementary algebra always stand for arbitrary real numbers and cannot be supposed to assume only integers as values. For such a supposition would imply that the class of all sentences of elementary algebra contains all sentences of elementary number theory; and, by results mentioned above, there could be no universal method for deciding on the truth of sentences of such a class. Thus, the following is not a sentence of elementary algebra:

The equation x³ + y³ = z³ has no solution in positive integral x, y, z.

This gives, we hope, an adequate idea of what is understood here by a sentence of elementary algebra. Turning now to geometry, we can say roughly that by a sentence of elementary geometry we understand one which can be translated into a sentence of elementary algebra by fixing a coordinate system. It is well known that most sentences of elementary geometry in the traditional meaning are of this kind. There are, however, exceptions. These are, for instance, statements which involve explicitly or implicitly the general notion of a natural number: for instance, statements regarding polygons with an arbitrary number of sides, such as the statement that in every polygon each side is shorter than the sum of the remaining sides. It goes without saying that statements which involve the general notion of a point set (of an arbitrary geometrical figure) are also not elementary in our sense, but they would hardly be regarded as elementary in the everyday understanding of the term. On the other hand, there are sentences which are elementary according to our definition but which are not ordinarily so considered.
Most sentences of analytic geometry concerning algebraic curves of any definite degree belong here: for example, the theorem that any two ellipses intersect in at most four points. It is important to realize that only the nature of the concepts involved, and not the character of the means of proof, determines whether a geometrical theorem is a sentence of elementary geometry. For instance, the statement that every angle can be divided into three congruent angles is an elementary sentence in our sense, and of course a true elementary sentence, despite the fact that the usual proofs of this statement make essential use of the axiom of continuity. On the other hand, the general notion of constructibility by rule and compass cannot be defined in elementary geometry, and therefore the statement that an angle in general cannot be trisected by rule and compass is not an elementary sentence, although we can express in elementary geometry the facts that, say, an angle of 30° cannot be trisected by 1, 2,..., or in general any fixed number n of applications of rule and compass.

If we now compare the theories treated in this monograph (i.e., elementary algebra and geometry) with the other theories mentioned above for which decision methods have been found, we see at once that although the logical structure in both cases is indeed equally elementary, the theories investigated here have a considerably richer mathematical content. It would be possible to mention numerous problems which can be formulated in these theories, and which played in the past an important role in the development of mathematics. In the solution of these problems, and in general in the development of the theories considered, a great variety of modes of inference have been applied, some of them of a rather intricate nature (to mention only one example: the proof of the theorem that a triangle is isosceles if the bisectors of two of its angles are congruent). Thus the fact that there exists a universal decision method for elementary algebra and geometry could hardly have been regarded as a foregone conclusion.

In the light of these remarks one should not expect that the mathematical basis for the decision method to be discussed will be of a quite obvious and trivial nature. In fact by analyzing this decision method the reader will easily see that in its mathematical content it is very closely related to a classical algebraic result (namely, the theorem of Sturm previously mentioned), and it even provides an extension of this theorem to arbitrary systems of equations and inequalities in many unknowns.

Since a decision method, by its very nature, requires no intelligence for its application, it is clear that, whenever one can give a decision method for a class K of sentences, one can also devise a machine to decide whether an arbitrary sentence belongs to K. It often happens in mathematical research, both pure and applied, that problems arise as to the truth of complicated sentences of elementary algebra or geometry. The decision method presented in this work gives the mathematician the assurance that he will be able to solve every such problem by working at it long enough. Once the machine is devised, his task will reduce to explaining the problem to the machine, or to its operator.
It may be instructive to illustrate, by means of an example, the more specific ways in which a decision machine could prove helpful in the study of unsolved problems. As is well known, any two polygons of equal area, P and Q, can be decomposed into the same finite number n of non-overlapping triangles in such a way that each triangle in P is congruent to the corresponding triangle in Q. We are interested in determining the smallest number n for which such a decomposition is possible. We assume in the following that P is the unit square and Q is a rectangle of unit area whose base has x units. Now the smallest number n depends exclusively on x and is denoted by d(x); our problem reduces to describing the behavior of the function d for all positive values of x. In particular, given any x, we can ask what the value of d(x) is. In most cases, even the answer to this simple question presents difficulty; e.g., it is not easily seen whether or not 0
we shall mean one of the following three symbols: ~, ∧, ∨.

The first is called the negation sign (and is to be read "not"), the second is called the conjunction sign (and is to be read "and"), and the third is called the disjunction sign (and is to be read "or", in the nonexclusive sense). By the (existential) quantifier we understand the symbol "E". If ξ is any variable, then (Eξ) is called a quantifier expression. The expression (Eξ) is to be read "there exists a ξ such that." By a formula we shall mean an expression built up from atomic formulas by use of sentential connectives and quantifiers. Thus, for example, the following are formulas:

0 = 0,
(Ex)(x = 0),
(x = 0) ∨ (Ey)(x > y),
(Ex) ~(Ey) ~[(x = y) ∨ (x > 1 + y)],
~(x > 1) ∧ (Ey)(x = y·y).
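The way formulas are built up from atomic formulas by sentential connectives and quantifier expressions can be sketched as a small data type. This is only an illustration: the class names, the storage of terms as plain strings, and the explicit set of variables attached to each atomic formula are assumptions made here for brevity, and `free_variables` follows the usual recursive clauses for free occurrences.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """An atomic formula such as "x = 0" or "x > 1 + y"."""
    left: str
    relation: str          # "=" or ">"
    right: str
    variables: frozenset   # the variables occurring in the two terms


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Exists:
    variable: str
    body: "Formula"


Formula = Union[Atom, Not, And, Or, Exists]


def show(formula: Formula) -> str:
    """Render a formula in a notation close to that of the text."""
    if isinstance(formula, Atom):
        return f"({formula.left} {formula.relation} {formula.right})"
    if isinstance(formula, Not):
        return "~" + show(formula.body)
    if isinstance(formula, And):
        return f"[{show(formula.left)} ∧ {show(formula.right)}]"
    if isinstance(formula, Or):
        return f"[{show(formula.left)} ∨ {show(formula.right)}]"
    return f"(E{formula.variable})" + show(formula.body)


def free_variables(formula: Formula) -> frozenset:
    """Variables with a free occurrence in the formula."""
    if isinstance(formula, Atom):
        return formula.variables
    if isinstance(formula, Not):
        return free_variables(formula.body)
    if isinstance(formula, (And, Or)):
        return free_variables(formula.left) | free_variables(formula.right)
    # A quantifier expression binds its own variable.
    return free_variables(formula.body) - {formula.variable}


xy = frozenset({"x", "y"})
third = Or(Atom("x", "=", "0", frozenset({"x"})), Exists("y", Atom("x", ">", "y", xy)))
fourth = Exists("x", Not(Exists("y", Not(Or(Atom("x", "=", "y", xy),
                                            Atom("x", ">", "1 + y", xy))))))
```

Here `third` and `fourth` encode two of the example formulas; `fourth` has no free variables and is therefore a sentence, while `third` has the free variable x.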
If one wants a precise definition of formulas, they can be defined recursively as follows: A formula of first order is simply an atomic formula. If Θ is a formula of order k, then ~Θ is a formula of order k + 1. If Θ is a formula of order k and ξ is any variable, then (Eξ)Θ is a formula of order k + 1. If Θ and Φ are formulas of order at most k, and one of them is of order k, then (Θ ∧ [...] if and only if ξ occurs in [...]; ξ is free in (Eη)Θ if and only if η is not the same variable as ξ and ξ is free in Θ; ξ is free in ~Θ if and only if ξ is free in Θ; ξ is free in (Θ ∧ Φ) [...]
Pg(a
- 0)
all
formulas
we
set
=0
Pf ( a - /3) > 0
(iii)
= /3)] s ' [ p i ( a > fi) VP^/3 >
(iv)
P^f-Ma > /3)] = [ p f ( a = ft VPg(/3 >
(v)
8 and
P_ξ(Θ ∨ [...] β is equivalent to α - β > 0. In order to carry out the recursive step, we make use of the facts: that ~(α = β) is equivalent to (α > β) ∨ (β > α); that ~(α > β) is equivalent to (α = β) ∨ (β > α); that ~(Θ ∨ [...] β) holds for every value of [...] (Regarding the possibility of constructing such a γ, see Theorem 25.) We then see that the hypothesis of (7) is satisfied with δ replaced by R_ξ(α,β). Hence the conclusion of (7) is also satisfied. This conclusion, however, with the indicated replacement, coincides with the conclusion of (4).

The proof now reduces to establishing the truth of (7), (8), and (9). It is convenient in this part of the proof to avail ourselves of customary mathematical language and symbolism. Also, we shall not be too meticulous in trying to avoid possible confusions between mathematical and metamathematical formulations.

Given a polynomial α and a number λ, we shall denote by f(λ,α) the order of λ in α: i.e., the uniquely determined non-negative integer r such that M^r_ξ(α) holds. The function f is thus defined for every number λ, and for every polynomial α which does not vanish identically. Similarly, for any given polynomials α and β in ξ we denote by g(α,β) the integer k for which G^k_ξ(α,β) holds. From the definition of G^k_ξ(α,β) (see Definition 22) it follows that such an integer always exists and is uniquely determined. It can be computed in the following way. We consider all those numbers λ for which f(λ,α) - f(λ,β) is positive and odd, and we divide them into two sets, P and N; λ belongs to P (or N) if there is an open interval whose right-hand end-point is λ, within which the values of α and β have always the same sign (or always different signs). Both sets P and N are clearly finite, and the difference between the number of elements in P and the number of elements in N is just g(α,β). Thus g(α,β) can be positive, negative, or zero; in case α or β vanishes identically, g(α,β) = 0. Finally we introduce the symbol h(α,β) to denote the integer p for which H^p_ξ(α,β) holds; in other words, h(α,β) is the number of all those numbers λ for which f(λ,α) - f(λ,β) is odd, though not necessarily positive.

For later use we state here without proof (which would be quite elementary) the following property of the function f defined above:
(10) Let α, β, γ, and δ be polynomials in ξ such that

α · β_n^q = γ · β - δ

holds for every value of ξ, β_n being the (nonvanishing) leading coefficient of β and q some integer. If, for any given number λ, f(λ,β) > f(λ,α) (so that α, as well as β, does not vanish identically), then f(λ,α) = f(λ,δ) (so that δ does not vanish identically either). Similarly, if f(λ,β) > f(λ,δ), then f(λ,α) = f(λ,δ).
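Property (10) can be checked on a concrete instance. The following is a minimal sketch, not part of the original development: the function name `order`, the coefficient-list representation (lowest degree first), and the particular polynomials are all assumptions chosen here for illustration. We take α = x³ - x² + x - 1, β = x³ - 3x + 2, γ = 1, δ = x² - 4x + 3, with β_n = 1 and q = 1.

```python
from fractions import Fraction


def order(lam, coeffs):
    """f(lam, alpha): the multiplicity of lam as a root of the polynomial alpha."""
    poly = [Fraction(c) for c in coeffs]
    if not any(poly):
        raise ValueError("the order is undefined for a polynomial vanishing identically")
    lam = Fraction(lam)
    r = 0
    while True:
        # Synthetic division of poly by (x - lam), highest coefficient first.
        quotient, carry = [], Fraction(0)
        for c in reversed(poly):
            carry = carry * lam + c
            quotient.append(carry)
        remainder = quotient.pop()
        if remainder != 0:
            return r
        quotient.reverse()
        poly, r = quotient, r + 1


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_sub(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


alpha = [-1, 1, -1, 1]   # (x - 1)(x^2 + 1)
beta = [2, -3, 0, 1]     # (x - 1)^2 (x + 2)
gamma = [1]
delta = [3, -4, 1]       # (x - 1)(x - 3)
q = 1

# The identity alpha * beta_n^q = gamma * beta - delta holds for these polynomials.
identity_holds = poly_mul(alpha, [beta[-1] ** q]) == poly_sub(poly_mul(gamma, beta), delta)
```

At λ = 1 the order of β exceeds the order of α, and, as (10) asserts, the orders of α and δ then coincide.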
We now take up the proof of (7), (8), and (9), which will be done by a simultaneous induction on h(α,β) = p. The reader can easily verify that (7), (8), and (9) hold in case the polynomial δ vanishes identically; therefore we shall assume henceforth that δ does not vanish identically.

Assume first that h(α,β) = 0. Thus there are no numbers λ such that f(λ,α) - f(λ,β) is odd. A fortiori there are no numbers λ such that f(λ,α) - f(λ,β) is positive and odd; and hence g(α,β) = 0. Furthermore, there are no numbers λ such that f(λ,β) - f(λ,δ) is positive and odd; for if such a number λ existed, we should have f(λ,α) = f(λ,δ) by (10), and hence f(λ,α) - f(λ,β) would be odd. Consequently, g(β,δ) = 0 and therefore g(α,β) = g(β,δ). Thus in this case (7) proves to hold, while (8) and (9) are of course vacuously satisfied.

Assume now that (7), (8), and (9) have been established for arbitrary polynomials α and β with h(α,β) = p (p any given integer). Consider any polynomials α and β in ξ with nonvanishing coefficients α_m and β_n, and with

(11) h(α,β) = p + 1,

as well as two further polynomials γ and δ in ξ such that

(12) α · β_n^q = γ · β - δ

holds identically for some non-negative integer q.

Two cases can be distinguished here, according as α_m·β_n > 0 or 0 > α_m·β_n; since the arguments are entirely analogous in both cases, however, we restrict ourselves to the case

(13) α_m·β_n > 0.

Our assumption (11) implies that there are exactly p + 1 numbers λ for which f(λ,α) - f(λ,β) is odd. Let

(14) λ_0 = the largest λ such that f(λ,α) - f(λ,β) is odd.
Condition (13) implies that for sufficiently large numbers ξ > λ_0 the values of α and β are of the same sign. This can be extended to every number ξ > λ_0 (not a root of α or β), since, by (14), there is no number ξ > λ_0 for which f(ξ, [...] λ_0, then, by (10) and (12), f(ξ,α) - f(ξ,β) would be odd for the same ξ > λ_0, and this would contradict (14). The argument for β and δ' is analogous; instead of (12) we use (17), and when stating the final contradiction we refer to (14) combined with the first part of (20), instead of merely to (14).

We now distinguish two cases, dependent on the sign of f(λ_0,α) - f(λ_0,β). In view of (20), (24), and (25), the only number which can cause a difference in the values of g(α,β) and g(α',β), or in the values of g(β,δ) and g(β,δ'), is the number λ_0. If now

(A) f(λ_0,α) - f(λ_0,β) > 0,

then by (14) and (15) the number λ_0 effects a decrease of g(α,β) by 1; while, as a result of (19), it has no effect on the value of g( [...] f(λ_0,α) = f(λ_0,δ), and hence, using (14), that

(29) f(λ_0,β) - f(λ_0,δ) is positive and odd.

Thus λ_0 is of order r in α and δ, and of a higher order in β. Consequently there are three polynomials α'', β'', and δ'' such that the equations

(30) α = α''·(λ_0 - ξ)^r, β = β''·(λ_0 - ξ)^r, δ = δ''·(λ_0 - ξ)^r

hold identically; λ_0 is a root of β'', but not of α'' or δ''. We obtain from (12) and (30): α'' · β_n^q = γ · β'' - δ''. Consequently, the values of α'' and δ'', for ξ = λ_0, have different signs. Therefore there is an open interval, whose right-hand end-point is λ_0, within which the values of α'' and δ'' have different signs; and, by (30), this applies also to α and δ. By comparing this result with (15), we conclude that there is an open interval whose right-hand end-point is λ_0, within which the values of β and δ have the same sign. Hence, and by (29), λ_0 contributes to the increase of g(β,δ) by 1. On the other hand, by (19) and (29), f(λ_0,β) - [...] is even, so that λ_0 has no effect on the value of g(β,δ'). Thus, finally,

(31) g(β,δ) = g(β,δ') + 1.

Equations (28) and (31) again imply (23). Hence (23) holds in both the cases (A) and (B). From (21), (22), and (23) we obtain at once:

g(α,β) = g(β,δ) in case p + 1 is even;
g(α,β) = g(β,δ) - 1 in case p + 1 is odd.

Thus we have shown that (7), (8), and (9) hold for polynomials α and β with h(α,β) = p + 1; and hence by induction they hold for arbitrary polynomials α and β. This completes the proof.

DEFINITION 28. Let
a = a o + af+.,.+
af"
/3 = /Bo + /3J +...+
/3J"
y' r = y' r , o + y' r , 1=£ + . . . + yr* 6e arbitrary (i)
If
in ¿j. We define
polynomials is a formula
of the
the function
"r
0 ) ]
,
then we set T{§) = | [ - ( a
= 6) V . . . V - ( a n = o) J A
i 5 G ^ [ a , D f f ( a ) ] A S G ^ r » f e ( a » + /3a),
V 2
1
® —
r
i >
2 r
m
a -
- m < r3
f ( y / + . . . + 7 / )
= 0 A ( y s + i > o ) A. . . A (y r > o ) j |
not of any of the previous § -
forms,
but such
that
(££)($ A £ A... A
where each Φ_i is of one of the forms γ_i = 0 or γ_i > 0, and if j_1,...,j_u are the values of i (in increasing order) for which Φ_i ≡ (γ_i = 0); and if j_{u+1},...,j_r are the values of i (in increasing order) for which Φ_i ≡ (γ_i > 0), then we set [...]

THEOREM 29. Let ξ be any variable, and let Φ be any formula such that T(Φ) is defined [...] and no [...]. Moreover, Φ is equivalent to T(Φ).
PROOF. The first part follows immediately from Theorem 27 and Definition 28. We shall prove the second part by considering separately the ten possible forms Φ can have according to Definition 28. As in the proof of Theorem 27, we shall use here partially ordinary mathematical modes of expression, without taking any great pains to distinguish sharply mathematical from metamathematical notions.

Suppose first, then, that Φ is of the form 28 (i): i.e., that Φ ≡ ( [...] and r_4 that [...] Eliminating r_4 between these two equations, we obtain

2k = r_1 - r_2 + r_3.

Thus (1) implies (2); the proof in the opposite direction is almost obvious.

To prove our theorem for formulas of the form 28 (iii), it suffices to show that if α is a polynomial of degree m, and not identically zero, and if γ_1,...,γ_r are any polynomials, then the following conditions are equivalent: (1) there are exactly k roots of α at which γ_1,...,γ_r are all positive; and (2) there are three integers r_1, r_2, r_3 satisfying 2k = r_1 + r_2 - r_3, 0 ≤ r_1, r_2, r_3 ≤ m, such that r_1 is the number of roots of α at which γ_1,...,γ_{r-2}, and γ_{r-1}·γ_r² are all positive, r_2 is the number of roots of α at which γ_1,...,γ_{r-2}, and γ_{r-1}²·γ_r are all positive, and r_3 is the number of roots of α at which γ_1,...,γ_{r-2}, and -1·γ_{r-1}·γ_r are all positive. In fact, if r_1, r_2, and r_3 have the meanings just indicated, we obviously have

0 ≤ r_1, r_2, r_3 ≤ m.

Let r_4 be the number of roots of α at which γ_1,...,γ_{r-2}, γ_{r-1} are all positive, and γ_r is negative. Let r_5 be the number of roots of α at which γ_1,...,γ_{r-2} and γ_r are all positive, and γ_{r-1} is negative. Let k be the number of roots of α at which γ_1,...,γ_r are all positive. From the definitions of r_1, r_2, r_3, r_4, r_5, and k we see that

k + r_4 = r_1,
k + r_5 = r_2,
r_4 + r_5 = r_3.

Eliminating r_4 and r_5 from these equations, we obtain 2k = r_1 + r_2 - r_3, which completes this part of the proof.
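The counting identity 2k = r_1 + r_2 - r_3 can be checked numerically on a small instance. The sketch below is illustrative only: the roots of α, the two polynomials γ_1 and γ_2 (so that r = 2 and the list γ_1,...,γ_{r-2} is empty), and all function names are assumptions chosen here, not taken from the text.

```python
def count_positive(roots, polys):
    """Number of the given roots at which every polynomial in polys is positive."""
    return sum(1 for x in roots if all(p(x) > 0 for p in polys))


def counts(roots, g1, g2):
    """Return (k, r1, r2, r3) for the last two polynomials g1 and g2."""
    k = count_positive(roots, [g1, g2])
    r1 = count_positive(roots, [lambda x: g1(x) * g2(x) ** 2])
    r2 = count_positive(roots, [lambda x: g1(x) ** 2 * g2(x)])
    r3 = count_positive(roots, [lambda x: -1 * g1(x) * g2(x)])
    return k, r1, r2, r3


# Hypothetical data: alpha = (x + 2)(x + 1)x(x - 1)(x - 2), so its roots are known.
alpha_roots = [-2, -1, 0, 1, 2]


def gamma_1(x):
    return 2 * x + 1


def gamma_2(x):
    return 3 - 2 * x


k, r1, r2, r3 = counts(alpha_roots, gamma_1, gamma_2)
```

With these data the three auxiliary counts r_1, r_2, r_3 determine k, which is the point of the reduction: a count involving r sign conditions is expressed through counts each involving only r - 1 sign conditions.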
To prove our theorem for formulas of the form 28 (iv), we need only notice that the formula [...] (α = 0) ∧ [~(α_0 = 0) ∨ ... ∨ [...] is equivalent to [...]

Now suppose that Φ is of the form 28 (v): i.e., that [...] 0, then the formula

(Eξ)[(γ_1 > 0) ∧ ... ∧ (γ_r > 0)]

is never satisfied (i.e., is satisfied by no values of the free variables occurring in it), so that Φ is never satisfied either, and hence is equivalent to (0 = 1). If k = 0, and n_1 + ... + n_r = 0, then n_1 = n_2 = ... = n_r = 0, and hence Φ reduces to [...] where γ_{1,0},...,γ_{r,0} are terms which do not involve ξ; since [...] is then equivalent to [...] we see that [...] 0. To establish in this case that Φ is equivalent to T(Φ), it suffices to prove: If γ_1,...,γ_r are polynomials in ξ not all of which are of degree zero, and whose leading coefficients are all different from zero, then a necessary and sufficient condition that there exist no value of ξ which makes all these polynomials positive is that the following three conditions hold: (1) at least one of the polynomials have a negative leading coefficient; (2) at least one of the polynomials satisfy (-1)^{n_i}·γ_{i,n_i} < 0 (where n_i is the degree of the polynomial, and γ_{i,n_i} its leading coefficient); (3) there exist no value of ξ which is a root of the derivative of the product of the polynomials, and which makes them all positive.

To see that the condition is necessary, suppose that γ_1,...,γ_r are polynomials which are never all positive for the same value of ξ; then it is immediately apparent that (3) is satisfied; to see that (1) is satisfied, we remember that, if the leading coefficient of a polynomial is positive, then we can find a number μ such that the polynomial is positive for all values of the variable ξ greater than μ; the proof of (2) is similar, by considering large negative values of the variable.

Now suppose, if possible, that the condition is not sufficient: i.e., suppose that (1), (2), and (3) are satisfied, and that there exists some ξ which makes all the polynomials positive. Let λ be a value of ξ at which γ_1 > 0,...,γ_r > 0. Then we see that, for ξ = λ,

γ_1·γ_2·...·γ_r > 0.

On the other hand, since (1) is true, there exists an i such that γ_i has a negative leading coefficient. Hence we can find a λ' which is larger than λ and sufficiently large that γ_i is negative at λ'. Then γ_i is positive at λ and negative at λ' and hence has a root between these numbers. Since every root of γ_i is also a root of γ_1·γ_2·...·γ_r, we conclude that γ_1·γ_2·...·γ_r has a root to the right of λ. Similarly, making use of (2), we see that γ_1·γ_2·...·γ_r has a root to the left of λ. Now let μ_1 be the largest root of γ_1·γ_2·...·γ_r to the left of λ and let μ_2 be the smallest root of γ_1·γ_2·...·γ_r to the right of λ. Then γ_1·γ_2·...·γ_r is positive in the open interval (μ_1, μ_2) and zero at its end-points. We see that no γ_i can have a root within the open interval (μ_1, μ_2); since each γ_i is positive at λ, which lies within this interval, we conclude that each γ_i is positive throughout the whole open interval. On the other hand, since γ_1·γ_2·...·γ_r is zero at μ_1 and at μ_2, we see by Rolle's theorem that there is a point ν within (μ_1, μ_2) at which the derivative of γ_1·γ_2·...·γ_r vanishes. Since this contradicts (3), we conclude that the condition is also sufficient.

Now suppose that Φ is of the form 28 (vi). If k ≠ 0, it is obvious that Φ is equivalent to T( [...] o] A. . .A^Rd/'Sr(yr)
> o]\
However is of the form 28 (v), and hence, as we have shown above, is equivalent to T((p). In view of these remarks, by looking at the formula defining T(^) in 28 (vi), we see at once that § and T( a o ' a i ' ^ i ) ] V
- ( a 2 = 0) A - ( 4 - V a 2 >a*) A
+ 2-a»-^ > 2 - < v v £ + V
V
/0] V
~(a 2 = 0 ) A ( a J > 4 a o . a 2 ) A te'Pl+W^
+a|-/?02>a0-ai-/3i-/32 + 2-a, - c y / ? ^ t a ^ - / ^ ) ] j .
We now turn to the second part of our task. We want to correlate, with every sentence $\Phi$ which contains no variables or quantifiers, an equivalent sentence of a very special form: in fact, one of the two sentences

0 = 0 and 0 = 1.

We first consider terms which occur in such sentences. As is easily seen, every such term is obtained from the algebraic constants 0, 1, and -1 by combining them by means of addition and multiplication. Hence we can correlate with every such term $\alpha$ an integer $n(\alpha)$ in the following way.

DEFINITION 33. We set

$n(1) = 1$, $n(-1) = -1$, $n(0) = 0$.

If $\alpha \equiv (\beta + \gamma)$, then we set $n(\alpha) = n(\beta) + n(\gamma)$.

If $\alpha \equiv (\beta \cdot \gamma)$, then we set $n(\alpha) = n(\beta) \cdot n(\gamma)$.
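The recursion of Definition 33 can be sketched in a few lines of Python. This is only an illustration: the class names `Const`, `Add`, and `Mul` and the function name `n` are hypothetical choices made for this sketch, not part of the formal system.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int  # assumed to be one of the algebraic constants 0, 1, -1


@dataclass(frozen=True)
class Add:
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Mul:
    left: "Term"
    right: "Term"


Term = Union[Const, Add, Mul]


def n(term: Term) -> int:
    """Correlate an integer with a variable-free term, as in Definition 33."""
    if isinstance(term, Const):
        if term.value not in (0, 1, -1):
            raise ValueError("only the constants 0, 1 and -1 are allowed")
        return term.value
    if isinstance(term, Add):
        return n(term.left) + n(term.right)
    if isinstance(term, Mul):
        return n(term.left) * n(term.right)
    raise TypeError("not a term")


one = Const(1)
two = Add(one, one)
example = Add(one, Mul(two, two))  # the term 1 + (1 + 1) * (1 + 1)
print(n(example))  # 5
```

The returned value is a Python integer, not an expression of the formal system, which mirrors the distinction between integers and terms that the definition relies on.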
REMARK. It should be emphasized that the above definition correlates integers, not expressions, with terms. It is for this reason that we have written, for example,

$n(1) = 1$ instead of $n(1) \equiv 1$;

$n(1)$ is the integer 1, not a name of that integer. In the equation $n(\alpha) = n(\beta) + n(\gamma)$ the addition sign indicates the sum of the two integers $n(\beta)$ and $n(\gamma)$. $n(\alpha)$ is what would ordinarily be called the "value" of the expression $\alpha$; thus, if $\alpha \equiv 1 + (1 + 1) \cdot (1 + 1)$, then $n(\alpha) = 5$. On the other hand, we could use for our purposes, instead of integers, certain expressions of our formal system of algebra: in fact, one of the terms of the following sequence

..., (-1) + (-1), -1, 0, 1, 1 + 1, ...

We can use these special terms since they can obviously be put in one-to-one correspondence with arbitrary integers. As a result of this modification, however, Definition 33 and the subsequent Definition 34 would assume a more complicated form.

DEFINITION 34.
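Let $\alpha$ and $\beta$ be terms, and $\Phi$, $\Psi$, and $\Theta$ formulas, none of which contain any variables.

(i) If $\Phi \equiv (\alpha = \beta)$, we set $W(\Phi) \equiv (0 = 0)$ in case $n(\alpha) = n(\beta)$, and otherwise $W(\Phi) \equiv (0 = 1)$.

(ii) If $\Phi \equiv (\alpha > \beta)$, we set $W(\Phi) \equiv (0 = 0)$ in case $n(\alpha) > n(\beta)$, and otherwise $W(\Phi) \equiv (0 = 1)$.

(iii) If $\Phi \equiv (\Psi \vee \Theta)$, we set $W(\Phi) \equiv (0 = 0)$ in case either $W(\Psi) \equiv (0 = 0)$ or $W(\Theta) \equiv (0 = 0)$, and otherwise $W(\Phi) \equiv (0 = 1)$.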
(iv)
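A minimal Python sketch of the function W of Definition 34 follows. Terms and formulas are encoded as nested tuples, which is purely an assumption of this sketch. Only clauses (i) to (iii) survive in the text above; the clauses for conjunction and negation in the code are assumed by analogy and are labeled as such.

```python
TRUE = ("=", 0, 0)   # stands for the sentence 0 = 0
FALSE = ("=", 0, 1)  # stands for the sentence 0 = 1


def n(term):
    """Integer correlated with a variable-free term (Definition 33)."""
    if isinstance(term, int):
        if term not in (0, 1, -1):
            raise ValueError("only the constants 0, 1 and -1 are allowed")
        return term
    op, left, right = term
    if op == "+":
        return n(left) + n(right)
    if op == "*":
        return n(left) * n(right)
    raise ValueError("unknown term operator")


def W(phi):
    """Map a variable-free, quantifier-free sentence to 0 = 0 or 0 = 1."""
    op = phi[0]
    if op == "=":                      # clause (i)
        return TRUE if n(phi[1]) == n(phi[2]) else FALSE
    if op == ">":                      # clause (ii)
        return TRUE if n(phi[1]) > n(phi[2]) else FALSE
    if op == "or":                     # clause (iii)
        return TRUE if TRUE in (W(phi[1]), W(phi[2])) else FALSE
    if op == "and":                    # assumed clause, by analogy with (iii)
        return TRUE if W(phi[1]) == TRUE and W(phi[2]) == TRUE else FALSE
    if op == "not":                    # assumed clause for negation
        return FALSE if W(phi[1]) == TRUE else TRUE
    raise ValueError("unknown formula operator")


print(W((">", ("+", 1, 1), 1)))  # ('=', 0, 0), since 2 > 1
```

The result is itself a sentence of the encoded language rather than a Boolean, in keeping with the definition, which correlates sentences with sentences.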
y. If one wishes, one can enrich the system by a new predicate Rl(x), agreeing that Rl(x) will mean that x is real¹⁶. The results obtained can furthermore be extended to the elementary systems of n-dimensional Euclidean geometry. Since the methods of extending the results to the algebraic systems and the geometric systems are essentially the same, we shall consider a little more closely the case of 2-dimensional Euclidean geometry. We first give a sketchy description of the formal system of 2-dimensional Euclidean geometry. We use infinitely many variables, which are to be thought of as representing arbitrary points of the Euclidean plane. We use three constants denoting relations between points: the binary relation of identity, symbolized by "="; the ternary relation of betweenness, symbolized by "B", so that "B(x, y, z)" is to be read "y is between x and z" (i.e., y lies between x and z on the straight line connecting them; it is not necessary that the three points all be distinct; B(x, y, z) is always true if x = y or if y = z; but we cannot have x = z unless x = y = z); and the quaternary equidistance relation, symbolized by "D", so that "D(x, y; x', y')" is to be read "x is just as far from y as x' is from y'" (or, "the distance from x to y equals the distance from x' to y'")¹⁷. The only terms of this system are variables. An atomic formula is an expression of one of the forms

$\xi = \eta$, $B(\xi, \eta, \zeta)$, $D(\xi, \eta; \xi', \eta')$,

where $\xi$, $\eta$, $\zeta$, $\xi'$, and $\eta'$ are arbitrary variables. As in the formal system of elementary algebra we build up formulas from atomic formulas by means of negation, conjunction, disjunction, and the application of quantifiers; we also introduce here as abbreviations the symbols "→" and "↔". Sentences of elementary geometry, in our formulation, express certain facts about points and relations between them. On the other hand, most theorems which one finds in high-school textbooks on this subject involve also such notions as triangle,
plane, circle, line, and the like. It is easy, however, to convince oneself that a considerable part of these notions can be translated into the language of our system. Thus, for example, the theorem that the medians of a triangle are concurrent can be expressed as follows (cf. the figure immediately following the formula):

$$(Ax)(Ay)(Az)(Ax')(Ay')(Az')\{[\sim B(x,y,z) \wedge \sim B(y,z,x) \wedge \sim B(z,x,y) \wedge B(y,z',x) \wedge B(z,x',y) \wedge B(x,y',z) \wedge D(x,z';z',y) \wedge D(y,x';x',z) \wedge D(z,y';y',x)] \rightarrow (Ew)[B(x,w,x') \wedge B(y,w,y') \wedge B(z,w,z')]\}$$

[Figure: triangle with vertices x, y, z and its medians]
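On the other hand, it would not be difficult to enrich our system of geometry so as to enable us to refer to these elementary figures directly. Regarding more essential limitations of our system, see the remarks in the Introduction. In order to obtain a decision procedure for elementary geometry, we correlate with every sentence $\Phi$ of elementary geometry a sentence $\Phi^*$ of elementary algebra in the sense of Section 1. The construction of $\Phi^*$ can be roughly described in the following way. With every (geometric) variable $\xi$ in $\Phi$ we correlate two different (algebraic) variables $\bar{\xi}$ and $\bar{\bar{\xi}}$, in such a way that if $\xi$ and $\eta$ are two different variables in $\Phi$, then $\bar{\xi}$, $\bar{\bar{\xi}}$, $\bar{\eta}$, $\bar{\bar{\eta}}$ are all distinct. Every quantifier binding $\xi$ is replaced by the corresponding pair of quantifiers binding $\bar{\xi}$ and $\bar{\bar{\xi}}$. We replace every partial formula $\xi = \eta$ by

$$(\bar{\xi} = \bar{\eta}) \wedge (\bar{\bar{\xi}} = \bar{\bar{\eta}});$$

every partial formula $B(\xi, \eta, \zeta)$ by

$$[(\bar{\eta} - \bar{\xi}) \cdot (\bar{\bar{\zeta}} - \bar{\bar{\eta}}) = (\bar{\bar{\eta}} - \bar{\bar{\xi}}) \cdot (\bar{\zeta} - \bar{\eta})] \wedge [((\bar{\xi} - \bar{\eta}) \cdot (\bar{\eta} - \bar{\zeta}) > 0) \vee ((\bar{\xi} - \bar{\eta}) \cdot (\bar{\eta} - \bar{\zeta}) = 0)] \wedge [((\bar{\bar{\xi}} - \bar{\bar{\eta}}) \cdot (\bar{\bar{\eta}} - \bar{\bar{\zeta}}) > 0) \vee ((\bar{\bar{\xi}} - \bar{\bar{\eta}}) \cdot (\bar{\bar{\eta}} - \bar{\bar{\zeta}}) = 0)];$$

and every partial formula $D(\xi, \eta; \mu, \nu)$ by

$$(\bar{\xi} - \bar{\eta})^2 + (\bar{\bar{\xi}} - \bar{\bar{\eta}})^2 = (\bar{\mu} - \bar{\nu})^2 + (\bar{\bar{\mu}} - \bar{\bar{\nu}})^2.$$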
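The coordinate reading of the two geometric relations can be tried out numerically. The sketch below is an assumption-laden illustration: it encodes points as pairs of integers, uses the standard analytic-geometry conditions for betweenness (collinearity plus coordinate-wise ordering) and equidistance (equal squared distances), and checks the medians theorem for one sample triangle. The function names `between` and `equidistant` are hypothetical.

```python
def between(x, y, z):
    """B(x, y, z): y lies between x and z (points are coordinate pairs)."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    collinear = (y1 - x1) * (z2 - y2) == (y2 - x2) * (z1 - y1)
    ordered_first = (x1 - y1) * (y1 - z1) >= 0
    ordered_second = (x2 - y2) * (y2 - z2) >= 0
    return collinear and ordered_first and ordered_second


def equidistant(x, y, u, v):
    """D(x, y; u, v): x is just as far from y as u is from v."""
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
            == (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2)


# A sample triangle with the midpoints of its sides and its centroid.
x, y, z = (0, 0), (6, 0), (0, 6)
z_mid, x_mid, y_mid = (3, 0), (3, 3), (0, 3)  # midpoints of xy, yz, zx
w = (2, 2)

hypothesis = (
    not between(x, y, z) and not between(y, z, x) and not between(z, x, y)
    and between(y, z_mid, x) and between(z, x_mid, y) and between(x, y_mid, z)
    and equidistant(x, z_mid, z_mid, y)
    and equidistant(y, x_mid, x_mid, z)
    and equidistant(z, y_mid, y_mid, x)
)
conclusion = (between(x, w, x_mid) and between(y, w, y_mid)
              and between(z, w, z_mid))
print(hypothesis, conclusion)  # True True
```

Checking one triangle is of course no proof; the decision method works on the translated algebraic sentence as a whole, not on sample points.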
It is now obvious to anyone familiar with the elements of analytic geometry that whenever $\Phi$ is true then $\Phi^*$ is true, and conversely. And since we can always decide in a mechanical way about the truth of $\Phi^*$, we can also do this for $\Phi$.

will be used to represent arbitrary variables; on the other hand, Greek capitals "Φ", "Θ", "Ψ" will be used to represent arbitrary formulas and sentences. With these exceptions we do not introduce any special metamathematical symbolism. Various metamathematical notions whose intuitive meaning is clear will be used without any explanation; this applies, for instance, to such a phrase as "the variable $\xi$ occurs in the formula $\Phi$." Also, we do not consider it necessary to set up an axiomatic foundation for our metamathematical discussion, and we avoid a strictly formal exposition of metamathematical arguments. We assume that we can avail ourselves in metamathematics of elementary number theory; we use variables "m", "n", "p", and so on to represent arbitrary integers; and we employ the ordinary notation for individual integers, arithmetical relations between integers, and operations on them. The reader who is interested in the deductive foundations, and a precise development, of metamathematical discussion, may be referred to Carnap [2] (part II, pp. 55 ff.), Gödel [4], Tarski [21] (Section 2, pp. 279 ff., in particular p. 289), and Tarski [20] (especially p. 100).

8.
In choosing symbols for the formalized system of algebra, we have been interested in presenting the metamathematical results in the simplest possible form. For this reason we have not introduced into the system various mathematical and logical symbols which are ordinarily used in expressing mathematical theorems: such as the subtraction symbol the symbol "" sign, by treating
$x > y$ merely as an abbreviation for $(Ez)[\sim(z = 0) \wedge (x = y + z^2)]$.
In an analogous way we could dispense with the symbols 0, 1, and -1, and with one of the two logical connectives V and A . But such a reduction in the number of symbols would hardly be advantageous from our point of view. It should be pointed out that, in order to increase the efficiency of the decision machine which may be constructed on the basis of this monograph, it might very well turn out to be useful to enrich the symbolism of our system, even if this carried with it certain complications in the description of the decision method. 9.
A formal definition of truth can be found in Tarski [21]. It should be pointed out that we can eliminate the notion of truth from our whole discussion by subjecting the system of elementary algebra to the process of axiomatization. For this purpose, we single out certain sentences of our system which we call "axioms". They are divided into logical and algebraic axioms. The logical axioms (or rather, axiom schemata) are those of the sentential calculus and the lower predicate calculus with identity; they can be found, for instance, in Hilbert-Bernays [7] (see sections 3, 4 and 5 in vol. 1, and supplement 1 in vol. 2). Among algebraic axioms we find, in the first place, those which characterize the set of real numbers as a commutative ordered field with the operations + and · and the relation >, and which single out in a familiar way the three special elements 0, 1, and -1. These axioms are supplemented by one additional axiom schema comprehending all sentences of the form (i)
U^).. . U ^ U ^ U ^ I
[ £> A < £ £ )
(££)((£ = O A(0>a))J — where £ from 77 and involves the the f a c t that ( i . e . , every and negative
= 77) A ( a > 0 ) ) A
)•
< £ £ ) ( < t j > £ ) A ( £ > £) A ( a = 0 )
where $\xi_1, \ldots, \xi_n$, $\eta$, $\zeta$ are arbitrary variables, $\xi$ is any variable different from $\eta$ and $\zeta$, and $\alpha$ is any term which, in the non-trivial cases, of course involves the variable $\xi$. Intuitively speaking, this axiom schema expresses the fact that every function which is represented by a term of our symbolism (i.e., every rational integral function) and which is positive at one point and negative at another, vanishes at some point in between.
From what can be found in the literature (see van der Waerden [23], in particular pp. 235 f.), it is seen that this axiom schema can be equivalently replaced by the combination of an axiom expressing the fact that every positive number has a square root, with an axiom schema comprehending all sentences to the effect that every polynomial of odd degree has a zero: i.e., all sentences of the form

(ii) $(A\eta_0)(A\eta_1)\ldots(A\eta_{2n+1})[\sim(\eta_{2n+1} = 0) \rightarrow (E\xi)(\eta_0 + \eta_1 \cdot \xi + \ldots + \eta_{2n+1} \cdot \xi^{2n+1} = 0)]$,

where $\eta_0, \eta_1, \ldots, \eta_{2n+1}$ are arbitrary variables, and $\xi$ is any variable different from all of them. It is also possible to use, instead of (ii), a schema comprehending all sentences to the effect that every polynomial of degree at least three has a quadratic factor. Finally, it turns out to be possible to replace equivalently schema (i) by the seemingly much stronger axiom schema comprehending all those particular cases of the continuity axiom which can be expressed in our symbolism. (By the continuity axiom we may understand the statement that every set of real numbers which is bounded above has a least upper bound; when expressing particular cases of this axiom in our symbolism, we speak, not of elements of a set, but of numbers satisfying a given formula.) The possibility of this last replacement, however, is a rather deep result, which is a by-product of other results presented in this work: in fact, of those discussed below in Note 15. After having selected the axioms, we describe the operations by means of which new sentences can be derived from given ones. These operations are expressed in the so-called "rules of inference" familiar from mathematical logic. A sentence which can be derived from axioms by repeated applications of the rules of inference is called a provable sentence. In our further discussion (in particular, in defining the notions of equivalence of terms and equivalence of formulas) we replace everywhere the notion of a true sentence by that of a provable one. Hence, when establishing certain of the results given later (in particular, the theorems about equivalent formulas) we have to show that the sentences involved are formally derivable from the selected axioms (and not that they are true in any intuitive sense); otherwise the discussion does not differ from that in the text.

10. We use the term "decision method" here in an intuitive sense, without giving a formal definition. Such a procedure is possible because our result is of a positive character: we are actually going to establish a decision method, and no one who understands our discussion will be likely to have any doubt that this method enables us to decide in a finite number of steps whether any given sentence of elementary algebra is true. The situation changes radically, however, if one intends to obtain a result of a negative character, i.e., to show for a given theory that no decision method can be found; a precise definition of a decision method then becomes indispensable. The way in which such a definition is to be given is of course known from the contemporary literature. Using one of the familiar methods (for instance the method due to Gödel) one establishes a one-to-one correspondence between expressions of the system and positive integers, and one agrees to treat the phrase "there exists a decision method for the class A of expressions" as equivalent to the phrase "the set of numbers correlated with the expressions of A is general recursive." (When the set of numbers correlated with a class A of sentences is general recursive, we sometimes say simply that A is general recursive.) For a discussion of the notion of general recursiveness, see Hilbert-Bernays [7] and Kleene [8].

11.
The method of eliminating quantifiers occurs in a more or less explicit form in the papers Löwenheim [11] (section 3), Skolem [18] (section 4), Langford [10], and Presburger [15]. In Tarski's university lectures for the years 1926-1928 this method was developed in a general and systematic way; cf. Presburger [15], p. 95, footnote 4, and p. 97, footnote 1.
12. The results obtained in Theorems 27 and 29, and culminating in Theorem 31, seem to deserve interest even from the purely mathematical point of view. They are closely related to the well-known theorem of Sturm, and in proving them we have partly used Sturm's methods. The theorem most closely related to Sturm's ideas is Theorem 27. In fact, by analyzing, and slightly generalizing, the proof of this theorem we arrive at the following formulation. Let $\alpha$ and $\beta$ be any two polynomials in a variable $\xi$, and $\kappa$ and $\mu$ any two real numbers with $\kappa < \mu$. We construct a sequence of polynomials $\gamma_1, \gamma_2, \ldots, \gamma_n$, which may be called the Sturm chain for $\alpha$ and $\beta$, by taking $\alpha$ for $\gamma_1$, $\beta$ for $\gamma_2$, and assuming that $\gamma_i$, with $i > 2$, is the negative remainder of $\gamma_{i-2}$ and $\gamma_{i-1}$; we discontinue the construction when we reach a polynomial $\gamma_n$ which is a divisor of $\gamma_{n-1}$. Let $\kappa_1, \ldots, \kappa_n$ and $\mu_1, \ldots, \mu_n$ be the sequences of values of $\gamma_1, \ldots, \gamma_n$ at $\xi = \kappa$ and $\xi = \mu$, respectively; let $k$ be the number of changes in sign of the sequence $\kappa_1, \ldots, \kappa_n$, and let $m$ be the number of changes in sign of the sequence $\mu_1, \ldots, \mu_n$. Then it turns out that $k - m$ is just the number $g(\alpha, \beta)$ defined as in the proof of Theorem 27, but with the roots assumed to lie between $\kappa$ and $\mu$. (In Theorem 27 we were dealing, not with the arbitrary interval $(\kappa, \mu)$, but with the interval $(-\infty, +\infty)$.)
Sturm himself considered two particular cases of this general theorem: the case where $\beta$ is the derivative of $\alpha$, when the number $k - m$ proves to be simply the number of distinct roots of $\alpha$ in the interval $(\kappa, \mu)$; and the case where $\beta$ is arbitrary but $\alpha$ is a polynomial without multiple roots, when $k - m$ proves to be the difference between the number of roots of $\alpha$ at which $\beta$ agrees in sign with the derivative of $\alpha$, and the number of roots of $\alpha$ at which $\beta$ disagrees in sign with the derivative of $\alpha$, the roots being taken from the interval $(\kappa, \mu)$. These two special cases easily follow from the theorem, and we have made an essential use of this fact in the proof of Theorem 29. The general formulation was found recently by J. C. C. McKinsey; it contributed to a simplification, not of the original decision method itself, but of its mathematical description. Apart, however, from technicalities connected with the notion and construction of Sturm chains, the mathematical content of Sturm's theorem essentially consists in the following: given any algebraic equation in one variable and with the coefficients $a_0, a_1, \ldots, a_n$, there is an elementary criterion for this equation to have exactly $k$ real solutions (which may be in addition subjected to the condition that they lie in a given interval): such a criterion is obtained by constructing a certain finite sequence of systems, each consisting of finitely many equations and inequalities which involve the coefficients $a_0, a_1, \ldots, a_n$ of the given equation (and possibly the end-points $b$ and $c$ of the interval); it is shown that the equation has exactly $k$ roots if and only if its coefficients satisfy all the equations and inequalities of at least one of these systems. (When applied to an equation with constant coefficients, the criterion enables us actually to determine the number of roots of the equation, but this is only a by-product of Sturm's theorem.)
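The construction of the chain and the sign-change count can be sketched in Python for polynomials with rational coefficients. This is a minimal illustration under stated assumptions: polynomials are lists of coefficients with the highest degree first, the end-points are assumed not to be roots of the first polynomial, and the helper names (`poly_rem`, `sturm_chain`, `k_minus_m`) are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients (highest degree comes first)."""
    while p and p[0] == 0:
        p = p[1:]
    return p


def poly_rem(a, b):
    """Remainder of a divided by b; b must not be the zero polynomial."""
    a = [Fraction(c) for c in a]
    b = trim([Fraction(c) for c in b])
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient is now zero
    return trim(a)


def sturm_chain(alpha, beta):
    """Chain alpha, beta, then negative remainders, until one divides the last."""
    chain = [trim(list(alpha)), trim(list(beta))]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:  # the last polynomial divides its predecessor
            return chain
        chain.append([-c for c in r])


def evaluate(p, x):
    value = Fraction(0)
    for c in p:  # Horner's scheme
        value = value * x + c
    return value


def sign_changes(values):
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def k_minus_m(alpha, beta, kappa, mu):
    chain = sturm_chain(alpha, beta)
    k = sign_changes([evaluate(g, kappa) for g in chain])
    m = sign_changes([evaluate(g, mu) for g in chain])
    return k - m


alpha = [1, 0, -1, 0]  # x^3 - x, with roots -1, 0, 1
beta = [3, 0, -1]      # its derivative 3x^2 - 1
print(k_minus_m(alpha, beta, -2, 2))  # 3
```

With the derivative as second polynomial this reproduces the first of Sturm's two special cases, the count of distinct real roots in the interval.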
By applying Sturm's theorem we obtain in particular an elementary condition for an algebraic equation in one unknown to have at least one real solution. Theorem 31 gives directly an extension of this special result to an arbitrary system of algebraic equations and inequalities with arbitrarily many unknowns. It is easily seen, however, that from our theorem one can obtain stronger consequences: in fact, criteria for such systems to have exactly $k$ real solutions. To clear up this point, let us consider the simple case of a system consisting of one equation in two unknowns

(i) $F(x, y) = 0$.

We form the following system of equations and inequalities

(ii) $F(x, y) = 0$, $F(x', y') = 0$, $(x - x')^2 + (y - y')^2 > 0$.

By Theorem 31 we have an elementary criterion for the system (ii) to have at least one solution. But it is obvious that this criterion is at the same time a criterion for (i) to have at least two solutions. In the same way, we can obtain criteria for (i) to have at least 3, 4, ..., $k$ real solutions. Hence we also obtain a criterion for (i) to have exactly $k$ solutions (since an equation has exactly $k$ solutions if it has at least $k$, but not at least $k + 1$, solutions).
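The passage from "at least one solution of the doubled system" to "at least two solutions of the original equation" can be illustrated by a brute-force search. The sketch below is not the criterion of Theorem 31: it merely searches a small integer grid, and both the sample polynomial `F` and the grid are assumptions chosen so that all real solutions happen to lie on the grid.

```python
from itertools import product


def F(x, y):
    # Sample polynomial: zero exactly at the two points (1, 0) and (-1, 0).
    return (x * x - 1) ** 2 + y * y


GRID = range(-3, 4)  # hypothetical finite search space


def solutions_of_i():
    """Grid points satisfying F(x, y) = 0."""
    return [(x, y) for x, y in product(GRID, repeat=2) if F(x, y) == 0]


def system_ii_has_solution():
    """Is there a pair of distinct grid points, both satisfying F = 0?"""
    for x, y, x2, y2 in product(GRID, repeat=4):
        distinct = (x - x2) ** 2 + (y - y2) ** 2 > 0
        if F(x, y) == 0 and F(x2, y2) == 0 and distinct:
            return True
    return False


print(solutions_of_i())
print(system_ii_has_solution())
```

The inequality in the doubled system is exactly what forces the two solutions to be different points; adding a third copy of the unknowns with pairwise inequalities would express "at least three solutions" in the same way.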
The situation does not change if the solutions are required to satisfy additional conditions, namely, to lie within given bounds. We can thus say that Theorem 31 constitutes an extension of Sturm's theorem (or, at least, of the essential part of this theorem) to arbitrary systems of equations and inequalities with arbitrarily many unknowns. It may be noticed that by Sturm's theorem a criterion for solvability (in the real domain) involves systems which contain inequalities as well as equations. Hence, to obtain an extension of this theorem to systems of equations in many unknowns, it seemed advisable to consider inequalities from the beginning, and in the first step to extend the theorem to arbitrary systems of equations and inequalities in one unknown. As a result of this preliminary extension, the subsequent induction with respect to the number of unknowns becomes almost trivial. In its most general form the mathematical result obtained above seems to be new, although, in view of the extent of the literature involved, we have not been able to establish this fact with absolute certainty. At any rate some precedents are known in the literature. From what can be found in Sturm's original paper, the extension of his result to the case of one equation and one inequality with one unknown can easily be obtained; Kronecker, in his theory of characteristics, concerned himself with the case of n (independent) equations with n unknowns.
It seems, on the other hand, that such a simple problem as that of finding an elementary criterion for the solvability in the real domain of one equation in two unknowns has not been previously treated; the same applies to the case of a system of inequalities (without equations) in one unknown, although this case is essential for the subsequent induction. (Cf. in this connection, Weber [24], pp. 271 ff., and Runge [17], pp. 416 ff., where further references to the literature are also given.)

The result established in Theorem 31 and discussed in the preceding note has various interesting consequences. To formulate them, we can use, for instance, a geometric language and refer the result to n-dimensional analytic space with real coordinates or, what is slightly more convenient, to the infinite-dimensional space $S_\omega$, in which, however, every point has only finitely many coordinates different from zero. By an elementary algebraic domain in $S_\omega$ we understand the set of all points $\langle x_0, x_1, \ldots, x_n, \ldots \rangle$ in which the coordinates $x_{k_1}, x_{k_2}, \ldots, x_{k_n}$ satisfy a given algebraic equation or inequality

$$F(x_{k_1}, x_{k_2}, \ldots, x_{k_n}) = 0 \quad \text{or} \quad F(x_{k_1}, x_{k_2}, \ldots, x_{k_n}) > 0$$

and the remaining coordinates are zeros. Consider the smallest family of point sets in $S_\omega$ which contains among its elements all elementary algebraic domains and is closed under the operations of finite set-addition, finite set-multiplication, set-complementation, and projection parallel to any axis. (The projection of a set A parallel to the nth axis is the set obtained by replacing by zero the nth coordinate of every point of A.) Now Theorem 31 in geometric formulation implies that this family consists of those and only those sets in $S_\omega$ which are finite sums of finite products of elementary algebraic domains. The possibility of passing from the original formulation to the new one is a consequence of the known relations between projection and existential quantifiers. Theorem 31 has also some implications concerning the notion of arithmetical (or elementary) definability. A set A of real numbers is called arithmetically definable if there is a formula in our system containing one free variable and such that A consists of just those numbers which satisfy _ 2 the symbol "B" of the betweenness relation can be defined in terms of the symbol "D" of the equidistance relation.
18.
Exactly as in the case of elementary algebra, we can treat the system of elementary geometry in an axiomatic way, and base our discussion of the decision problem on the notion of provability. If we restrict ourselves to the case of two dimensions, we can take, for instance (in addition to the general logical axioms mentioned in Note 9), the following geometrical axioms:
(x)
(Ax)(Ay)(Az)(Az')(Au){[B(x,z,z')AB(y,z,u)/\^(x
= 2)]
— -
(Ey')(£u')[B(*,y,y')AB(*,u,u')AB(y', (xi)
z',u')]|;
(A*)(yiy)(/l2)(Au)(£v)|[(B(*,a,v)VB(u,i;,x)VB(v,x,u))AJB(y,t;,2)]
V
[(B(y,u,v)VB(u,v,y)VB(i/,y,u))AB(2,v,*)] V [(B(z,u,v)VB(u,t>,2)VB(v,2,u))AB(*,v,y)] j (xii) (xiii) (xiv) (xv)
(Ax)(Ay)
D(x,y;y,x)
Ux)Uy)U2)[D(*,y;2,2)
=
y)]
(Ax)Uy)(Az)(Au)(Av)(Av>)^[D(x,y;z,u)AD(x,y,vtw)]-^D(z,u;v,v>)j (Ax) (Ay) (Az) (Az')
(Au)
;
(* = y) A D ( * , z;x,
z')AD(y,z;y,z')A
B(y,u,z')A(B(x,u,z)VB(x,z,u)j]—~(z (xvi)
;
= z')j ;
(Ax)(AxO(Ay)(Ay')(Az)(Az')(Au)(Au')^(x,y;x'>y')AI)(y,z;y',z')A D(x,u;x',u')AD(y,u;y',u')
A
B(x,y,z)AB(x',y',z')A = y)A~(y = 2)]—D(z,u;z',u')J ; (xvii) (xviii)
(Ax)(Ay)(Ay')(Az')(Ez)^B(x,y,z)A
D(y,z;y',z')J;
U*)U*0(4y)My0Mz0Ut>){«*,y;xSy0^ (Ez)(Eu)[D(x,z;x',z')AD(y,z;y,,z')AB(z,u,v)A (B(x,y,u)vB(y,u,x)VB(u,x,y)
)J | .
To these is added the axiom schema which comprehends all particular cases of the axiom of continuity (e.g., in the Dedekind form) that are expressible in our system: i.e., all sentences of the following form: (xix)
(A£j...(A^(E
M
)(A
V
J(A7
(Em) (aVi)(AVs) where n e i t h e r ¡1 nor ">7g i s i n the formula
] 2
)[($Af)-~B(
[(