Lecture Notes in Computer Science Edited by G. Goos and J. Hartmanis
170 7th International Conference on Automated Deduction Napa, California, USA May 14-16,1984 Proceedings
Edited by R. E. Shostak
Springer-Verlag Berlin Heidelberg New York Tokyo 1984
Editorial Board
D. Barstow W. Brauer P. Brinch Hansen D. Gries D. Luckham C. Moler A. Pnueli G. Seegmüller J. Stoer N. Wirth
Editor
R. E. Shostak SRI International 333 Ravenswood Avenue Menlo Park, CA 94025 U.S.A.
Library of Congress Cataloging in Publication Data International Conference on Automated Deduction (7th: 1984 : Napa, Calif.) Seventh International Conference on Automated Deduction. (Lecture notes in computer science ; 170) 1. Automatic theorem proving-Congresses. 2. Logic, Symbolic and mathematical-Congresses. I. Shostak, Robert. II. Title. III. 7th International Conference on Automated Deduction. IV. Series. QA76.9.A96158 1984 511.3 84-5441
CR Subject Classification (1982): 1.1, J.2.
© 1984 by Springer-Verlag New York Inc. All rights reserved. No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag, 175 Fifth Avenue, New York, NY 10010, U.S.A. Permission to photocopy for internal or personal use, or the internal or personal use of specific clients, is granted by Springer-Verlag New York, Inc. for libraries and other users registered with the Copyright Clearance Center (CCC), provided that the base fee of $0.00 per copy, plus $0.20 per page is paid directly to CCC, 21 Congress Street, Salem, MA 01970, U.S.A. Special requests should be addressed directly to Springer-Verlag, New York, 175 Fifth Avenue, New York, NY 10010, U.S.A. 0-387-96022-8/84 $0.00 + .20
Printed and bound by R.R. Donnelley & Sons, Harrisonburg, Virginia. Printed in the United States of America.
9 8 7 6 5 4 3 2 1
3-540-96022-8 Springer-Verlag Berlin Heidelberg New York Tokyo
0-387-96022-8 Springer-Verlag New York Heidelberg Berlin Tokyo
FOREWORD

The Seventh International Conference on Automated Deduction was held May 14-16, 1984, in Napa, California. The conference is the primary forum for reporting research in all aspects of automated deduction, including the design, implementation, and applications of theorem-proving systems, knowledge representation and retrieval, program verification, logic programming, formal specification, program synthesis, and related areas.

The presented papers include 27 selected by the program committee, an invited keynote address by Jörg Siekmann, and an invited banquet address by Patrick Suppes. Contributions were presented by authors from Canada, France, Spain, the United Kingdom, the United States, and West Germany.

The first conference in this series was held a decade earlier in Argonne, Illinois. Following the Argonne conference were meetings in Oberwolfach, West Germany (1976), Cambridge, Massachusetts (1977), Austin, Texas (1979), Les Arcs, France (1980), and New York, New York (1982).
Program Committee
P. Andrews (CMU)
W. W. Bledsoe (U. Texas), past chairman
L. Henschen (Northwestern)
G. Huet (INRIA)
D. Loveland (Duke), past chairman
R. Milner (Edinburgh)
R. Overbeek (Argonne)
T. Pietrzykowski (Acadia)
D. Plaisted (U. Illinois)
V. Pratt (Stanford)
R. Shostak (SRI), chairman
J. Siekmann (U. Kaiserslautern)
R. Waldinger (SRI)

Local Arrangements
R. Schwartz (SRI)
CONTENTS

Monday Morning

Universal Unification (Keynote Address)
  Jörg H. Siekmann (FRG)
A Portable Environment for Research in Automated Reasoning
  Ewing L. Lusk and Ross A. Overbeek (USA)
A Natural Proof System Based on Rewriting Techniques
  Deepak Kapur and Balakrishnan Krishnamurthy (USA)
EKL-A Mathematically Oriented Proof Checker
  Jussi Ketonen (USA)

Monday Afternoon

A Linear Characterization of NP-Complete Problems
  Silvio Ursic (USA)
A Satisfiability Tester for Non-Clausal Propositional Calculus
  Allen Van Gelder (USA)
A Decision Method for Linear Temporal Logic
  Ana R. Cavalli and Luis Fariñas del Cerro (France)
A Progress Report on New Decision Algorithms for Finitely Presented Abelian Groups
  D. Lankford, G. Butler, and A. Ballantyne (USA)
Canonical Forms in Finitely Presented Algebras
  Philippe Le Chenadec (France)
Term Rewriting Systems and Algebra
  Pierre Lescanne (France)
Termination of a Set of Rules Modulo a Set of Equations
  Jean-Pierre Jouannaud (France) and Miguel Munoz (Spain)

Tuesday Morning

Associative-Commutative Unification
  Francois Fages (France)
A Linear Time Algorithm for a Subcase of Second-Order Instantiation
  Donald Simon (USA)
A New Equational Unification Method: A Generalisation of Martelli-Montanari's Algorithm
  Claude Kirchner (France)
A Case Study of Theorem Proving by the Knuth-Bendix Method: Discovering that x^3 = x Implies Ring Commutativity
  Mark E. Stickel (USA)
A Narrowing Procedure for Theories with Constructors
  L. Fribourg (France)
A General Inductive Completion Algorithm and Application to Abstract Data Types
  Helene Kirchner (France)

Tuesday Evening

The Next Generation of Interactive Theorem Provers (Banquet Address)
  Patrick Suppes (USA)

Wednesday Morning

The Linked Inference Principle, II: The User's Viewpoint
  L. Wos, R. Veroff, B. Smith, and W. McCune (USA)
A New Interpretation of the Resolution Principle
  Etienne Paul (France)
Using Examples, Case Analysis, and Dependency Graphs in Theorem Proving
  David A. Plaisted (USA)
Expansion Tree Proofs and Their Conversion to Natural Deduction Proofs
  Dale Miller (USA)
Analytic and Non-analytic Proofs
  Frank Pfenning (USA)

Wednesday Afternoon

Applications of Protected Circumscription
  Jack Minker and Donald Perlis (USA)
Implementation Strategies for Plan-Based Deduction
  Kenneth Forsythe and Stanislaw Matwin (Canada)
A Programming Notation for Tactical Reasoning
  David A. Schmidt (USA)
The Mechanization of Existence Proofs of Recursive Predicates
  Ketan Mulmuley (USA)
Solving Word Problems in Free Algebras Using Complexity Functions
  Alex Pelin and Jean H. Gallier (France)
Solving a Problem in Relevance Logic with an Automated Theorem Prover
  Hans-Jurgen Ohlbach and Graham Wrightson (FRG)
UNIVERSAL UNIFICATION

Jörg H. Siekmann
Universität Kaiserslautern
FB Informatik
Postfach 3049
D-6750 Kaiserslautern

It is in the very nature of progress that it looks much greater than it really is.
J. N. Nestroy, 1859

ABSTRACT: This article surveys what is presently known about first order unification theory.

CONTENTS

0. INTRODUCTION
I. EARLY HISTORY AND APPLICATIONS
II. A FORMAL FRAMEWORK
   1. Unification from an Algebraic Point of View
   2. Unification from a Logical Point of View
      2.1 Equational Logic
      2.2 Computational Logic
   3. Universal Unification
III. RESULTS
   1. Special Equational Theories
   2. The General Theory
      2.1 Classes of Equational Theories
      2.2 Universal Unification Algorithms
IV. OUTLOOK AND OPEN PROBLEMS
V. BIBLIOGRAPHY
0. INTRODUCTION

Unification theory is concerned with problems of the following kind:

Let f and g be function symbols, a and b constants, and let x and y be variables, and consider two first order terms built from these symbols; for example:

$t_1 = f(x, g(a,b))$
$t_2 = f(g(y,b), x)$.

The first question which arises is whether or not there exist terms which can be substituted for the variables x and y such that the two terms thus obtained from $t_1$ and $t_2$ become equal: in the example $g(a,b)$ and $a$ are two such terms. We shall write

$\sigma_1 = \{x \leftarrow g(a,b),\ y \leftarrow a\}$

for such a unifying substitution: $\sigma_1$ is a unifier of $t_1$ and $t_2$ since $\sigma_1 t_1 = \sigma_1 t_2$.

In addition to the decision problem there is also the problem of finding a unification algorithm which generates the unifiers for a given pair $t_1$ and $t_2$.
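The claim that $\sigma_1$ unifies $t_1$ and $t_2$ can be checked mechanically. The following is a minimal sketch, not part of the original article; the tuple encoding of terms and the helper name `apply_subst` are assumptions made purely for illustration.

```python
# Assumed encoding (illustrative only): a variable is a plain string,
# an application is a tuple (symbol, arg1, ..., argn), and a constant
# is a 1-tuple such as ("a",).

def apply_subst(subst, term):
    """Apply a substitution (dict: variable -> term) to a term."""
    if isinstance(term, str):                 # variable
        return subst.get(term, term)
    head, *args = term
    return (head, *(apply_subst(subst, arg) for arg in args))

a, b = ("a",), ("b",)
t1 = ("f", "x", ("g", a, b))                  # f(x, g(a,b))
t2 = ("f", ("g", "y", b), "x")                # f(g(y,b), x)
sigma1 = {"x": ("g", a, b), "y": a}           # {x <- g(a,b), y <- a}

# Both sides become f(g(a,b), g(a,b)) under sigma1.
is_unifier = apply_subst(sigma1, t1) == apply_subst(sigma1, t2)
```

Here `is_unifier` records whether the two instantiated terms coincide symbol by symbol, which is the notion of equality used for free terms.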
Consider a variation of the above problem, which arises when we assume that f is commutative:

(C) $f(x,y) = f(y,x)$.

Now $\sigma_1$ is still a unifying substitution, and moreover $\sigma_2 = \{y \leftarrow a\}$ is also a unifier for $t_1$ and $t_2$, since

$\sigma_2 t_1 = f(x, g(a,b)) =_C f(g(a,b), x) = \sigma_2 t_2$.

But $\sigma_2$ is more general than $\sigma_1$, since $\sigma_1$ is an instance of $\sigma_2$, obtained as the composition $\lambda \circ \sigma_2$ with $\lambda = \{x \leftarrow g(a,b)\}$; hence a unification algorithm only needs to compute $\sigma_2$.
There are pairs of terms which have more than one most general unifier (i.e. they are not an instance of any other unifier) under commutativity, but they always have at most finitely many. This is in contrast to the first situation (of free terms), where every pair of terms has at most one most general unifying substitution.
The problem becomes entirely different when we assume that the function denoted by f is associative:

(A) $f(x, f(y,z)) = f(f(x,y), z)$.

In that case $\sigma_1$ is still a unifying substitution, but

$\sigma_3 = \{x \leftarrow f(g(a,b), g(a,b)),\ y \leftarrow a\}$

is also a unifier:

$\sigma_3 t_1 = f(f(g(a,b), g(a,b)), g(a,b)) =_A f(g(a,b), f(g(a,b), g(a,b))) = \sigma_3 t_2$.

But $\sigma_4 = \{x \leftarrow f(g(a,b), f(g(a,b), g(a,b))),\ y \leftarrow a\}$ is again a unifying substitution, and it is not difficult to see that there are infinitely many unifiers, all of which are most general.

Finally, if we assume that both axioms (A) and (C) hold for f, then the situation changes yet again, and for any pair of terms there are at most finitely many most general unifiers under (A) and (C).
The above examples as well as the practical applications of unification theory quoted in the following paragraph share a common problem, which in its most abstract form is as follows: Suppose two terms s and t are given, which by some convention denote a particular structure, and let s and t contain some free variables. We say s and t are unifiable iff there are substitutions (i.e. terms replacing the free variables of s and t) such that both terms become equal in a well defined sense.

If the structure can be axiomatized by some first order theory T, unification of s and t under T amounts to solving the equation s = t in that theory. However, the mathematical investigation of equation solving in certain theories is a subject as old as mathematics itself and, right from the beginning, very much at the heart of it: it dates back to Babylonian mathematics (about 2000 B.C.). Universal unification carries this activity on in a more abstract setting: just as universal algebra abstracts from certain properties that pertain to specific algebras and investigates issues that are common to all of them, universal unification addresses problems which are typical for equation solving as such.
Just as traditional equation solving drew its impetus from its numerous applications (the - for those times - complicated division of legacies in Babylonian times and the application in physics in more modern times), unification theory derives its impetus from its numerous applications in computer science, artificial intelligence and in particular in the field of computational logic.

Central to unification theory are the notion of a set of most general unifiers $\mu U\Sigma$ (traditionally: the set of base vectors spanning the solution space) and the hierarchy of unification problems based on $\mu U\Sigma$ (see part II for an exact definition of this hierarchy):

(i) a theory T is unitary if $\mu U\Sigma$ always exists and has at most one element;
(ii) a theory T is finitary if $\mu U\Sigma$ always exists and is finite;
(iii) a theory T is infinitary if $\mu U\Sigma$ always exists and there exists a pair of terms such that $\mu U\Sigma$ is infinite for this pair;
(iv) a theory T is of type zero otherwise.

We denote a unification problem under a theory T by $\langle s = t \rangle_T$.
In many practical applications it is of interest to know for two given terms s and t if there exists a matcher (a one-way-unifier) $\mu$ such that $\mu(s)$ and t are equal under T. We denote a matching problem under a theory T
by
T
In other words, in a matching problem we are allowed to substitute into one term only (into s using the above convention) and we say s matches t with matcher $\mu$.

A unification problem (a matching problem) under a theory T poses two questions:

Q1: is the equality of two terms under T decidable? If so:
Q2: are these two terms unifiable and if so, is it possible to generate and represent all unifiers?
Q1 is the usual word problem, which has found a convenient computational treatment for equational logics [KB70], [HO80]. These techniques, called term rewriting systems, are discussed in section II.2.2. An affirmative answer to Q1 is an important prerequisite for unification theory. Q2 summarizes the actual interest in unification theory and is the subject of this article.
It is reasonable to expect that the relationship between computer science and mathematical logic will be as fruitful in the next century as that between physics and analysis in the last.
John McCarthy, 1963

I. EARLY HISTORY AND APPLICATIONS

There is a wide variety of areas in computer science where unification problems arise.

1. Databases

A deductive database [GM78] does not contain every piece of information explicitly. Instead it contains only those facts from which all other information the user may wish to know can be deduced by some inference rule. Such inference rules (deduction rules) heavily rely on unification algorithms.

Also the user of a relational database [DA76] may logically AND the properties she wants to retrieve, or else she may be interested in the NATURAL JOIN [CO70] of two stored relations. In neither case would she appreciate having to take into account constantly that AND is associative and commutative, or that NATURAL JOIN obeys an associative axiom and may distribute over some other operation.
2. Information retrieval

A patent office may store all recorded electric circuits [BC66] or all recorded chemical compounds [SU65] as some graph structure, and the problem of checking whether a given circuit or compound already exists is an instance of a test for graph isomorphism [UL76], [UN64], [CR68]. More generally, if the nodes of such graphs are labelled with universally quantified variables ranging over subgraphs, these problems are practical instances of a graph matching problem.

3. Computer vision

In the field of computer vision it has become customary to store the internal representation of certain external scenes as some net structure [CL71], [WN75]. The problem of finding a particular object, also represented as some net, in a given scene is also an instance of the graph matching problem [RL69]. Here one of the main problems is to specify what constitutes a successful match (since a strict test for endomorphism is too rigid for most applications), although serious investigation of this problem is still pending (see paraunification in section IV).
4. Natural Language Processing

The processing of natural language [TL81] by a computer uses transformation rules to change the syntax of the input sentence into a more appropriate one. Inference rules are used to manipulate the semantics of an input sentence and to disambiguate it. The world knowledge a natural language understanding system must have is represented by certain (syntactic) descriptions, and it is paramount to detect if two descriptions describe the same object or fact. Transformation rules, inference rules and the matching of descriptions are but a few applications of unification theory to this field.

5. Expert Systems

An expert system is a computer program to solve problems and answer questions which up to now only human experts were capable of [SH76]. The power of such a system largely depends on its ability to represent and manipulate the knowledge of its field of expertise. The techniques for doing so are yet another instance of the application of unification theory within the field of artificial intelligence.
6. Computer Algebra

In computer algebra (or symbol manipulation) [SG77] matching algorithms also play an important role: for example the integrand in a symbolic integration problem [MO71] may be matched against certain patterns in order to detect the class of integration problems it belongs to and to trigger the appropriate action for a solution (which in turn may involve several quite complicated matching attempts) [BL71], [CK71], [FA71], [RN71], [MB68], [MO74].

7. Programming Language
An important contribution of artificial intelligence to programming language design is the mechanism of pattern-directed invocation of procedures [BF77], [HT72], [HT76], [RD72], [WA77]. Procedures are identified by patterns instead of procedure identifiers as in traditional programming languages. Invocation patterns are usually designed to express goals achieved by executing the procedure. An attempt is made to match incoming messages against the invocation patterns of procedures in a procedural database, and a procedure is activated after a successful match between message and pattern has been completed. So, matching is done (1) for looking up an appropriate procedure that helps to accomplish an intended goal, and (2) for transmitting information to the involved procedure.

For these applications it is particularly desirable to have methods for matching objects belonging to high level data structures such as strings, sets, multisets etc. A little reflection will show that for very rich matching structures, as has e.g. been proposed in MATCHLESS in PLANNER [HT72], the matching problem is undecidable. This presents a problem for the designer of such languages: on the one hand, very rich and expressive matching structures are desirable, since they form the basis for the invocation and deduction mechanism. On the other hand, drastic restrictions will be necessary if matching algorithms are to be found. The question is just how severe these restrictions have to be.

The fundamental mode of operation for the programming language SNOBOL [FG64] is to detect the occurrence of a substring within a larger string of characters (like e.g. a program or some text), and there are very fast methods known which require less than linear time [BM77]. If these strings contain the SNOBOL 'don't care' variables, the occurrence problem is an instance of the stringunification problem mentioned in the following paragraph.

Current attempts to use first order predicate logic [KO79] as a programming language [CM81] heavily depend on the availability of fast unification algorithms. In order to gain speed there are attempts at present to have a hardware realization of the unification procedure [GS84].
8. Algebra

A famous decidability problem, which in spite of many attacks remained open for over twenty-five years, has only recently been solved: the monoid problem (also called Löb's Problem in Western Countries, Markov's Problem in Eastern Countries and the Stringunification Problem in Automatic Theorem Proving [HJ64], [HJ66], [HJ67], [LS75], [MA54], [SS61], [PL72]) is the problem to decide whether or not an equation system over a free semigroup possesses a solution. This problem has been shown to be decidable [MA77]. The monoid problem has important practical applications inter alia for Automatic Theorem Proving (stringunification [SI75] and second order monadic unification [HT76], [~~76]), for Formal Language Theory (the crossreference problem for van Wijngaarden Grammars [WI76]), and for pattern directed invocation languages in artificial intelligence as mentioned above.

Another well-known matching problem is Hilbert's Tenth Problem [DA73], which is known to be undecidable [MA70]. The problem is to decide whether or not a given polynomial equation $P[x_1, x_2, \ldots, x_n] = 0$ has an integer solution (a Diophantine solution). Although this problem was posed originally and solved within the framework of traditional equation solving, unification theory has shed a new light upon this problem (see III.1).

Semigroup theory [HO76], [CP61] is the field traditionally posing the most important unification problems (i.e. those involving associativity). Although semigroup theory is scientifically more mature than unification theory is today, interesting semigroup problems have been solved using the techniques of unification theory (see e.g. [SS82], [LA80], [LA79]).
9. Computational Logic

All present day theorem provers have a procedure to unify first order terms as their essential component: i.e. a procedure that substitutes terms for the universally quantified variables until the two given terms are symbolwise equal or the failure to unify is detected. This problem was first studied by Herbrand [HE30], who gives an explicit algorithm for computing a most general unifier. But unification algorithms only became of real importance with the advent of automatic theorem provers (ATP), and algorithms to unify two first order terms have independently been discovered by [GO67], [RO65] and [KB70]. Because of their paramount importance in ATPs there has been a race for the fastest such algorithm [RO71], [BA73], [VZ75], [HT76], [MM79], resulting in a linear first order unification algorithm for the free algebra of terms [PW78], [KK82].
Also for almost as long as attempts at proving theorems by machines have been made, a critical problem has been well known [GO67], [CK65], [~E71]: certain equational axioms, if left without precaution in the data base of an automatic theorem prover, will force the ATP to go astray. In 1967, Robinson [RN67] proposed that substantial progress ("a new plateau") would be achieved by removing these troublesome axioms from the data base and building them into the deductive machinery.

Four approaches to cope with equational axioms have been proposed:

(1) To write the axioms into the data base, and use an additional rule of inference, such as paramodulation [WR73].
(2) To use special "rewrite rules" [KB70], [WR67], [HT80], [HO80].
(3) To design special inference rules incorporating these axioms [SL72].
(4) To develop special unification algorithms incorporating these axioms [PL72].

The last approach (4) still appears to be promising; however, it has the drawback that for every new set of axioms a new unification algorithm has to be found. Also recently there has been interesting work on combinations of approaches (2) and (4); see section III.2.2.

The work on higher order unification by G. Huet [HT72], [HT75], [HT76] has been very influential for first order unification theory also and was fundamental in shaping the field as it is known today.

G. Plotkin has shown in a pioneering paper [PL72] that whenever an automatic theorem prover is to be refutation complete, its unification procedure must generate a set of unifiers satisfying the three conditions completeness, correctness and minimality, which are defined below.

Summarizing, unification theory rests upon two main pillars: Universal Algebra and Computational Logic, and we shall now turn to a brief survey of the important notions which form the theoretical framework of the field.
"but we need notions, not notation."
A. Tarski, 1943
II. A FORMAL FRAMEWORK

1. Unification from an Algebraic Point of View

As usual let $\mathbb{N}$ be the set of natural numbers. A set of 'symbols with arity' is a mapping $\Omega: M \to \mathbb{N}$, where M is some set. For $f \in M$, $\Omega f$ is the arity of f. The domain of $\Omega$ is used to denote certain n-ary operations and is sometimes called a signature. $(f,n) \in \Omega$ is abbreviated to $f \in \Omega$.

A Universal Algebra $\mathcal{A}$ is a pair $(A, \Omega)$, where A is the carrier and $f \in \Omega$ denotes a mapping $f_A: A^n \to A$ with $n = \Omega f$ (for $a_1, \ldots, a_n \in A$ we write $f_A(a_1, \ldots, a_n)$ for the realization of the denoted mapping). Note that if $\Omega f = 0$ then f is a distinguished constant of the algebra $\mathcal{A}$. COD($\Omega$), the codomain of $\Omega$, is its type.

If $\mathcal{A}$ and $\mathcal{B}$ are algebras, $\varphi: A \to B$ is a homomorphism if $\varphi f_A(a_1, \ldots, a_n) = f_B(\varphi a_1, \ldots, \varphi a_n)$; a bijective homomorphism is called an isomorphism, in symbols $\cong$. For a subset $A_0 \subseteq A$, $\varphi_0 = \varphi|_{A_0}$ is the restriction of $\varphi$ to $A_0$.

An equivalence relation $\rho$ is a congruence relation iff $a_1 \rho b_1, \ldots, a_n \rho b_n$ implies $f_A(a_1, \ldots, a_n)\ \rho\ f_A(b_1, \ldots, b_n)$. $\mathcal{A}/\rho = (A/\rho, \Omega)$ is the quotient algebra modulo $\rho$. $[a]_\rho$ is the congruence class generated by $a \in A$.
For a class of algebras $\mathcal{K}_0$ of fixed type, the algebra $\mathcal{A} = (A, \Omega)$ is free in $\mathcal{K}_0$ on the set X, in symbols $\mathcal{A}_{\mathcal{K}_0}(X)$, iff

(i) $(A, \Omega) \in \mathcal{K}_0$;
(ii) $X \subseteq A$;
(iii) if $\mathcal{B} \in \mathcal{K}_0$ and $\varphi_0: X \to B$ is any mapping, then there exists a unique homomorphism $\varphi: A \to B$ with $\varphi_0 = \varphi|_X$.

If $\mathcal{K}$ is the class of all algebras of the fixed type, then $\mathcal{A}_{\mathcal{K}}(X)$ is the (since it exists and is unique up to isomorphism) absolutely free algebra on X. The elements of $\mathcal{A}_{\mathcal{K}}(X)$ are called terms and are given a concrete representation $W_\Omega^X$ by:
(i) $x \in X$ is in $W_\Omega^X$;
(ii) if $t_1, t_2, \ldots, t_n$ are terms and $\Omega f = n$, $n \geq 0$, then $f(t_1, \ldots, t_n)$ is in $W_\Omega^X$.

We assume that $\Omega$ consists of the disjoint sets $\Phi$ and $\Gamma$ such that $f \in \Phi$ iff $\Omega f \geq 1$ and $f \in \Gamma$ iff $\Omega f = 0$. $\Phi$ is called the set of function symbols, $\Gamma$ the set of constants and X the set of variables. We define operations for $n = \Omega f$ by $\hat{f}(t_1, \ldots, t_n) = f(t_1, \ldots, t_n)$. Let $\hat{\Omega}$ be the set of these (term building) operations. Let $\emptyset$ denote the empty set.

$F_\Omega^X = (W_\Omega^X, \hat{\Omega})$ is isomorphic to $\mathcal{A}_{\mathcal{K}}(X)$ and hence is called the absolutely free term algebra on X. $F_\Omega^\emptyset$ is the initial term algebra (or Herbrand universe). We shall write $F_{\Omega_0}$ for $F_\Omega^\emptyset$.

Our interest in $F_{\Omega_0}$ is motivated by the fact that for every algebra $\mathcal{A} = (A, \Omega)$ there exists a unique homomorphism $h_A: F_{\Omega_0} \to \mathcal{A}$. But then instead of investigating $\mathcal{A}$, we can restrict our attention to a quotient of $F_{\Omega_0}$ modulo the congruence induced by $h_A$.
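In order to have variables at our disposal in the initial algebra we define $\Omega_X = \Omega \cup X$, that is, we treat variables as special constants. Since $F_\Omega^X \cong F_{\Omega_X}^\emptyset$ we simply write $F_\Omega$ if $X \neq \emptyset$ and $X \subseteq \Omega$, and $F_{\Omega_0}$ if $X = \emptyset$. Because terms are objects in $F_\Omega$ we shall write $t \in F_\Omega$ instead of $t \in W_\Omega^X$.

An equation is a pair of terms $s, t \in F_\Omega$, in symbols $s = t$. The equation $s = t$ is valid in the algebra $\mathcal{A}$ (of the same type), in symbols $\mathcal{A} \models s = t$, iff $\varphi s = \varphi t$ in $\mathcal{A}$ for every homomorphism $\varphi: F_\Omega \to \mathcal{A}$.

Let $\bar{\sigma}: X \to F_\Omega$ be a mapping which is equal to the identity mapping almost everywhere. A substitution $\sigma: F_\Omega \to F_\Omega$ is the homomorphic extension of $\bar{\sigma}$ and is represented as a finite set of pairs: $\sigma = \{x_1 \leftarrow t_1, \ldots, x_n \leftarrow t_n\}$.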
$\Sigma$ is the set of substitutions on $F_\Omega$. The identity mapping on $F_\Omega$, i.e. the empty substitution, is denoted by $\varepsilon$. If t is a term and $\sigma$ a substitution, define $V: F_\Omega \to 2^X$ by $V(t)$ = {set of variables in t} and $V(t_1, \ldots, t_n) = \bigcup_i V(t_i)$. Furthermore:

$|t|$ denotes the length of t (i.e. the number of symbols in t)
$\mathrm{DOM}(\sigma) = \{x \in X : \sigma x \neq x\}$
$\mathrm{COD}(\sigma) = \{\sigma x : x \in \mathrm{DOM}(\sigma)\}$
$\mathrm{XCOD}(\sigma) = V(\mathrm{COD}(\sigma))$

$\Sigma_0 \subset \Sigma$ is the set of ground substitutions, i.e. $\sigma \in \Sigma_0$ iff $\mathrm{COD}(\sigma) \subset F_{\Omega_0}$.

An equation $s = t$ is unifiable (is solvable) in $\mathcal{A}$ iff there exists a substitution $\sigma: F_\Omega \to F_\Omega$ such that $\sigma s = \sigma t$ is valid in $\mathcal{A}$.

For a set of equations T let $=_T$ be the finest congruence containing $(s,t)$ and all pairs $(\sigma s, \sigma t)$, for $\sigma \in \Sigma$ and $s = t \in T$. $F_\Omega/{=_T}$ is the quotient algebra modulo $=_T$. A unification problem for T, denoted as $\langle s = t \rangle_T$, is given by a pair of terms $s, t \in F_\Omega$. The problem is to decide whether $s = t$ is unifiable in the initial algebra $F_\Omega/{=_T}$. We denote the constituent [...] as [...]
2. Unification from a Logical Point of View

2.1 EQUATIONAL LOGIC

The well formed formulas of our logic are equations, defined as pairs $(s,t)$ in $W_\Omega^X \times W_\Omega^X$ and denoted as $s = t$. A substitution $\sigma$ is a finite set of pairs in $W_\Omega^X \times W_\Omega^X$ (i.e. classical work confuses the issue a little by identifying the representation with the mapping that is being represented). The application of $\sigma = \{x_1 \leftarrow t_1, \ldots, x_n \leftarrow t_n\}$ to a term t, written $\sigma t$, is obtained by simultaneously replacing each $x_i$ in t by $t_i$.

Let T be a set of equations. The equation $p = q$ is derivable from T, $T \vdash p = q$, if $p = q \in T$ or $p = q$ is obtained from T by a finite sequence of the following operations:
(i) $t = t$ is an axiom
(ii) if $s = t$ then $t = s$
(iii) if $r = s$ and $s = t$ then $r = t$
(iv) if $s_i = t_i$, $1 \leq i \leq n$, then $f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n)$
(v) if $s = t$ then $\sigma s = \sigma t$ where $\sigma \in \Sigma$.

For a set of equations T, $T \models s = t$ iff $s = t$ is valid in all models of T.

Theorem (Birkhoff): $T \models s = t$ iff $T \vdash s = t$.

We shall abbreviate $T \models s = t$ (and hence $T \vdash s = t$) by $s =_T t$. An equation $s = t$ is T-unifiable iff there exists a substitution $\sigma$ such that $\sigma s =_T \sigma t$.

Although this is the traditional view of unification, its apparent simplicity is deceptive: we did not define what we mean by a 'model'. In order to do so we should require the notion of an interpretation of our well formed formulas, which is a 'homomorphism' from $W_\Omega^X$ to certain types of algebras, thus bringing us back to section 1. Since neither $\models$ nor $\vdash$ is particularly convenient for a computational treatment of $=_T$, an alternative method is presented below.
2.2 COMPUTATIONAL LOGIC For simplicity of notation we assume we have a box of symbols, GENSYM, at our disposal, out of which we can take an unlimited number of "new" symbols. More formally: for F,,' let" with "x
A $T$-unification problem $\langle s = t \rangle_T$ consists of a pair of terms $s, t$. For such a problem we define $\mu U\Sigma_T(s,t)$, the set of most general unifiers, as:
(i) $\mu U\Sigma_T \subseteq U\Sigma_T$ (correctness)
(ii) $\forall \delta \in U\Sigma_T$ there exists $\sigma \in \mu U\Sigma_T$ s.th. $\sigma \le_T \delta\ [W]$ (completeness)
(iii) $\forall \sigma_1, \sigma_2 \in \mu U\Sigma_T$: if $\sigma_1 \le_T \sigma_2\ [W]$ then $\sigma_1 = \sigma_2$ (minimality)
From condition (ii) it follows in particular that $U\Sigma_T =_T \Sigma \circ \mu U\Sigma_T\ [W]$, i.e. $U\Sigma_T$ is a left ideal in the semigroup $(\Sigma, \circ)$ and $U\Sigma_T$ is generated by $\mu U\Sigma_T$.
For theoretical reasons (idempotency of substitutions) as well as for many practical applications, it turned out to be useful to have the following additional technical requirement:
(0) For a set of variables $Z$ with $V(s,t) \subseteq Z$: $XCOD(\sigma) \cap Z = \emptyset$ for $\sigma \in \mu U\Sigma_T(s,t)$ (protection of $Z$)
If conditions (0) - (iii) are fulfilled we say $\mu U\Sigma_T$ is a set of most general unifiers away from $Z$ [FH83], [PL72]. The set $\mu U\Sigma_T$ does not always exist; when it does, it is unique up to the equivalence $\equiv_T$, see [FH83]. For that reason it is sufficient to generate just one $\mu U\Sigma_T$. In the following we always take $\mu U\Sigma_T$ as some representative of the equivalence class $[\mu U\Sigma_T]_{\equiv}$.
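For orientation, the simplest instance of these definitions is the empty theory, where $=_T$ is syntactic identity and a unifiable pair has a single most general unifier up to renaming. The sketch below is ordinary syntactic (Robinson-style) unification, offered only as an illustration of that special case and not as an algorithm from this paper; the term encoding and the names `unify` and `resolve` are assumptions made for the example.

```python
# Illustrative encoding (an assumption): variables are str,
# compound terms and constants are tuples ("f", arg1, ..., argn).

def is_var(t):
    return isinstance(t, str)

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear in t under subst?"""
    t = walk(t, subst)
    if is_var(t):
        return v == t
    return any(occurs(v, a, subst) for a in t[1:])

def unify(s, t, subst=None):
    """Return a triangular-form unifier of s and t, or None if none exists."""
    subst = {} if subst is None else subst
    a, b = walk(s, subst), walk(t, subst)
    if is_var(a) and is_var(b) and a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None  # symbol clash or arity mismatch
    for x, y in zip(a[1:], b[1:]):
        subst = unify(x, y, subst)
        if subst is None:
            return None
    return subst

def resolve(t, subst):
    """Fully apply a triangular substitution to a term."""
    t = walk(t, subst)
    if is_var(t):
        return t
    return (t[0], *(resolve(a, subst) for a in t[1:]))

s = ("f", "x", ("g", "y"))
t = ("f", ("g", "z"), "x")
mgu = unify(s, t)
```

Here `resolve(s, mgu)` and `resolve(t, mgu)` are the same term, which is the correctness condition (i) specialised to the empty theory.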
PROBLEM TWO (Existence Problem):
For a given equational theory $T$, does $\mu U\Sigma_T(s,t)$ always exist for every pair of terms $s, t$?

PROBLEM THREE (Enumeration Problem):
For a given equational theory $T$, is $\mu U\Sigma_T(s,t)$ recursively enumerable for any pair of terms $s, t$?
That is, we are interested in an algorithm which generates all mgu's for a given problem $\langle s = t \rangle_T$. Section III.1 summarizes the major results that have been obtained for special theories $T$.
The central notion $\mu U\Sigma_T$ induces the following fundamental classes of equational theories based on the cardinality of $\mu U\Sigma_T$:
(i) A theory $T$ is unitary if $\forall s,t$ $\mu U\Sigma_T(s,t)$ exists and has at most one element. The class of such theories is $\mathcal{U}_1$ (type one).
(ii) A theory $T$ is finitary if it is not unitary and if $\forall s,t$ $\mu U\Sigma_T(s,t)$ exists and is finite. The class of such theories is $\mathcal{U}_\omega$ (type $\omega$).
(iii) A theory $T$ is infinitary if $\forall s,t$ $\mu U\Sigma_T(s,t)$ exists and there exists a problem $\langle p = q \rangle_T$ such that $\mu U\Sigma_T(p,q)$ is infinite. The class of such theories is $\mathcal{U}_\infty$ (type $\infty$).
(iv) A theory $T$ is of type zero if it is not in one of the above classes. The class of these theories is $\mathcal{U}_0$.
(v) A theory is unification-relevant if it is not of type zero. The class of these theories is $\mathcal{U}$.
Several examples for unitary, finitary and infinitary theories as well as type zero theories are discussed in III.1.
A matching problem consists of a pair of terms $s, t$ and a theory $T$. A substitution $\nu \in \Sigma$ is a $T$-matcher (or one-way-unifier) if $\nu s =_T t$. $M\Sigma_T$ is the set of matchers, and a set of most general matchers $\mu M\Sigma_T$ is defined similarly to $\mu U\Sigma_T$.
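To make the "one-way" aspect concrete, here is a small Python sketch of matching in the empty theory only, where $\nu s =_T t$ reduces to $\nu s$ being syntactically identical to $t$. It is an illustration under assumed conventions (variables are strings, compound terms are tuples headed by their symbol; the name `match` is hypothetical), not a $T$-matching algorithm from the paper.

```python
# Illustrative encoding (an assumption): variables are str,
# compound terms and constants are tuples ("f", arg1, ..., argn).

def match(pattern, target, nu=None):
    """Return nu with nu(pattern) == target, or None.

    Only variables of `pattern` are instantiated; variables occurring in
    `target` are treated as fixed symbols, which is what makes this one-way.
    """
    nu = {} if nu is None else dict(nu)
    if isinstance(pattern, str):
        if pattern in nu:
            # A repeated pattern variable must match the same subterm again.
            return nu if nu[pattern] == target else None
        nu[pattern] = target
        return nu
    if (isinstance(target, str)
            or pattern[0] != target[0]
            or len(pattern) != len(target)):
        return None
    for p, t in zip(pattern[1:], target[1:]):
        nu = match(p, t, nu)
        if nu is None:
            return None
    return nu

a, b = ("a",), ("b",)
m1 = match(("f", "x", "x"), ("f", a, a))   # x must be a in both places
m2 = match(("f", "x", "x"), ("f", a, b))   # no single value for x
m3 = match(("f", "x", "y"), ("f", "y", a)) # target variable y is not touched
```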
The set $\mu M\Sigma_T$ induces the classes of matching-relevant theories similar to the classes based on $\mu U\Sigma_T$: a theory $T$ is unitary matching if $\mu M\Sigma_T$ always exists and has at most one element. The class of such theories is $\mathcal{M}_1$. Analogously we define $\mathcal{M}_\omega$, $\mathcal{M}_\infty$, $\mathcal{M}_0$ and the class $\mathcal{M}$.
A unification algorithm $UA_T$ (a matching algorithm $MA_T$) for $T$ is an algorithm which takes two terms $s$ and $t$ as input and generates a set $\Psi_T \subseteq U\Sigma_T$ ($\subseteq M\Sigma_T$) for the given unification (matching) problem. A minimal algorithm $\mu UA_T$ ($\mu MA_T$) is an algorithm which generates a $\mu U\Sigma_T$ ($\mu M\Sigma_T$).
For many practical applications this requirement is not strong enough, since it does not imply that the algorithm terminates for theories $T \in \mathcal{U}_1 \cup \mathcal{U}_\omega$. On the other hand, for $T \in \mathcal{U}_\omega$ it is sometimes too rigid, since an algorithm which generates a finite superset of $\mu U\Sigma_T$ may be far more efficient than the algorithm $\mu UA_T$ and for that reason preferable. We therefore define:
An algorithm $UA_T$ is type conformal iff:
(i) $UA_T$ generates a set $\Psi_T$ with $U\Sigma_T \supseteq \Psi_T \supseteq \mu U\Sigma_T$ for some $\mu U\Sigma_T$.
(ii) $UA_T$ terminates and $\Psi_T$ is finite if $T \in \mathcal{U}_1 \cup \mathcal{U}_\omega$.
(iii) if $T \in \mathcal{U}_\infty$ then $\Psi_T \subseteq [\mu U\Sigma_T]_{\equiv}$.
Similarly: an algorithm $MA_T$ is type conformal iff (i) - (iii) hold with $U$ replaced by $M$.
III. RESULTS

"However to generalize, one needs experience ..."
G. Gratzer, Universal Algebra, 1968

"a comparative study necessarily presupposes some previous separate study, comparison being impossible without knowledge."
A. N. Whitehead, Treatise on Universal Algebra, 1898

1. Special Theories

This section is concerned with Problems Two and Three (the existence resp. the enumeration problem) mentioned in II.3: For a given equational theory $T$, does there exist an algorithm which enumerates $\mu U\Sigma_T(s,t)$ for any terms $s$ and $t$?
The following table summarizes the results that have been obtained for special theories, which consist of combinations of the following equations:

A     (associativity)                  $f(f(x,y),z) = f(x,f(y,z))$
C     (commutativity)                  $f(x,y) = f(y,x)$
D     (distributivity)                 $D_R$: $f(x,g(y,z)) = g(f(x,y),f(x,z))$
                                       $D_L$: $f(g(x,y),z) = g(f(x,z),f(y,z))$
H, E  (homomorphism, endomorphism)     $\varphi(x \circ y) = \varphi(x) \circ \varphi(y)$
I     (idempotence)                    $f(x,x) = x$
calculus and a facility for talking about metatheoretic objects. Each EKL atom has a type (a syntactic entity), a sort, and a syntype, which is either VARIABLE, CONSTANT, BINDOP, DEFINED or SPECIAL. Of these syntypes, SPECIAL and DEFINED are not user declarable. The syntype SPECIAL refers to symbols in the standard context of EKL that are heavily overloaded in that they operate on all type levels; strictly speaking, these symbols don't have a type and are regarded as absolute constants. DEFINED atoms are introduced through the DEFINE command; in proof-theoretic terms, they can be viewed as "eigenvariables" resulting from
the elimination of existential quantifiers.

The central notion in the logic of EKL is that of types. Informally, they are represented as classes of objects in a set-theoretic universe. The main purpose of types in EKL is to restrict the class of acceptable formulas in the language in order to prevent logical contradictions from occurring. Our motivation is quite different from the use of typing in traditional programming languages: the type structures encountered in mathematical reasoning are much simpler than the ones found in programs. The intent is to gain maximum expressibility in our language while preserving consistency: we want to prevent the expression of inconsistencies like λx.¬x(x). At the same time, we want to give the user the means of rigidly (syntactically) excluding formulas. The user may want to specify the number and types of arguments to a function. For example, expressions of type ground → ground can be applied to terms of type ground, resulting in an object of type ground.

Of particular interest is the notion of list types, which allows us to talk in a natural way about parameterized formulas and functions taking an arbitrary number of variables. A term of type ground* → ground can be applied to any number of terms of type ground, resulting in an expression of type ground. Thus objects of this type could be regarded as having variable arity.

Formulas and terms are treated in a completely uniform manner; formulas are simply terms of type truthval. All deduction rules manipulate terms of type truthval. EKL checks terms for correct typing.

More formally, the EKL type structure is an algebra with an arbitrary set of atoms including a special atom empty (representing the null tuple () of empty type), together with type constructors "⊗" (product), "∨" (disjunction), "→" (application), "*" (list types) and relation "≤" (for a type being a subtype of another type). In addition, the user can introduce variable types. For example, "?Foo" denotes a variable type with the name Foo.

Bindops are operators that bind variables. They must have the BINDOP syntype. For example, the set-theoretic comprehension operator {x | P(x)} can be construed as a bindop of type (ground) ⊗ truthval → ground. This means that the bound variable has to be of type ground, the matrix of type truthval, and that an entity of type ground always results.

An applicative operator of variable arity can be declared left or right associative or simply associative. A bindop of variable arity can be declared right associative. Internally the resulting expressions will be automatically "flattened" by the EKL rewriter. Thus 2 + 3 + (1 + 2) becomes 2 + 3 + 1 + 2 if + has been declared right associative, and ∀x y.∀u v.π becomes ∀x y u v.π. Note that declarations of this type have implied semantic content.

Tupling is the most fundamental operation in EKL. We regard all functions as applying to n-tuples. The term ntuple(x, y, z) is written most of the time as (x, y, z). Our operators are unary: f(x, y, z) is thought of as applying f to the triple (x, y, z). In this sense, we do not have 1-tuples: (x) means x. We allow empty tuples; f() refers to a function of no arguments. We associate to the right: f(x, y, (u, v, w)) is the same as f(x, y, u, v, w). In particular, the empty tuple is deleted when it appears at the end: (x, y, ()) is rewritten to (x, y).

Perhaps the semantically trickiest part is in the introduction of metatheoretic operators ↑ (for naming EKL objects) and ↓ (roughly corresponding to evaluation) as a proper part of our language. One has to be careful about the interaction of bound variables with metatheoretic evaluation. For instance, we cannot universally generalize the variable x in the valid formula x = ↓↑x. Our approach avoids the separation of metatheory into another domain; the notion of "reflection" in the sense of [Weyhrauch 1978] is absent. Metatheory and the use of semantically attached functions are simply a part of the rewriting process.
In order to guarantee the soundness of the system, attachment of functions cannot be completely controlled by the user. EKL has a list of standard attached functions like PLUS, TIMES, APPEND, etc. Should the user declare a function with an internal name found in this list, the rewriter will do the appropriate computations on quoted entities to replace the corresponding expression with the (quoted) result if the type information matches what is expected. More details of the EKL language, including formal semantics and the proofs of consistency and soundness, can be found in [Ketonen and Weening 1983B].

Even though we have implemented a metatheoretic facility, there has been comparatively little use for it in day-to-day theorem-proving activities. It seems that most of the concepts regarded as "metatheoretic" can more naturally be expressed in terms of high-order predicate logic. The need for metatheory has in many cases been an artifact of restriction into first-order expressions; for example, facts about simple schemata can be formulated in terms of second-order quantifiers: the induction schema

P(0) ∧ ∀n.P(n) ⊃ P(n') ⊃ ∀n.P(n)

can be equally well expressed as the second-order sentence

∀P.P(0) ∧ ∀n.P(n) ⊃ P(n') ⊃ ∀n.P(n).

Our belief is bolstered by the nearly uniform denial by practicing mathematicians that they ever use what logicians call metatheory: it appears to be too crude and simplistic to capture even remotely the processes occurring in mathematics. The intrinsic structure of facts is obscured at the expense of emphasizing the particular choice of a language and syntactic forms for representation. As a disguised form of programming it may not present any apparent increase in either correctness or clarity.
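The tupling conventions stated earlier in this section (no 1-tuples, association to the right, deletion of a trailing empty tuple) can be sketched as a small normalization function. This is only an illustration on nested Python tuples with strings standing in for atoms; it is not EKL's rewriter, and the name `flatten_tuple` is hypothetical.

```python
def flatten_tuple(t):
    """Normalize a nested tuple following the three tupling conventions.

    Atoms are any non-tuple values (strings in the examples below).
    """
    if not isinstance(t, tuple):
        return t
    items = [flatten_tuple(x) for x in t]
    # Associate to the right: a tuple in final position is spliced into its
    # parent, so (x, y, (u, v, w)) becomes (x, y, u, v, w). A trailing empty
    # tuple contributes nothing, so (x, y, ()) becomes (x, y).
    if items and isinstance(items[-1], tuple):
        items = items[:-1] + list(items[-1])
    # No 1-tuples: (x) means x.
    if len(items) == 1:
        return items[0]
    return tuple(items)

right_nested = flatten_tuple(("x", "y", ("u", "v", "w")))
trailing_empty = flatten_tuple(("x", "y", ()))
singleton = flatten_tuple(("x",))
```

Only a tuple in final position is spliced; a tuple in the middle, as in `("a", ("b", "c"), "d")`, is left in place, since the text only describes association to the right.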
3. The Use of Definitions

Definitions play a key role in mathematics. They seem to be one of the principal ways of controlling complexity of formal discourse. Defined symbols in EKL have a special syntype, in rough correspondence with the notion of an eigenvariable in proof theory. They can be introduced in two different ways: through the use of the DEFINE command, which checks the validity of the proposed definition, or the DEFAX command, which allows an axiom to be regarded as a definition of some symbol occurring in it. Definitions are heavily used in EKL by both the rewriter and the unifier.
4. Sophisticated Unification

EKL is based on high-order logic since we felt strongly that many important mathematical facts admit a more natural representation in this context. There has been much recent work on the topic of high-order unification (for example, [Huet 1975], [Miller, Cohen and Andrews 1982] and [Jensen and Pietrzykowski 1976]), which has shown it to be a feasible alternative to first-order methods.

The most critical use of unification in EKL takes place within rewriting, where one may expect all the high-order unifiable variables to occur only on one side, for instance the left-hand side, of the match. One can show that in this case the Huet algorithm [Huet 1975] actually converges; it converges even when we allow first-order unifiable variables to occur on both sides. Given a reasonable definition of the size of a term, we can easily prove that the value of the pair (size(lhs), size(rhs)) decreases lexicographically during the course of unifying lhs against rhs. In fact, one can show that the cardinality of the set of unifiers generated by this process is exponentially bounded in the size of lhs. The algorithm terminates very rapidly in practice; we have yet to worry about exponential blow-ups.

We use two-sided unification in the sense explained above in rewriting situations. Thus one can avoid the problem of "free variables" mentioned by [Boyer and Moore 1979A]. This also allows us to do some existential verification in the process of rewriting.

Our implementation is optimized in many ways. For example, explicit substitutions are never made. For efficiency and in order to deal with implicit lambda eliminations, we have used the rather complex data structures for substitution lists suggested by [Lusk, McCune and Overbeek 1982].

The second modification to the Huet algorithm involves the way EKL treats tuples of variables: for example, a variable occurring at the end of a list may match to the constant (), which need not appear explicitly on the other side.

Finally, the unifier may in the process of matching (implicitly) expand definitions of atoms occurring on either side. This has turned out to be a powerful tool, though it is the current bottleneck in the unification process.

As an example of unification we present the EKL verification of the correctness of a definition of APPEND based on a high-order function existence axiom. This is an actual EKL run through the SAIL E editor.
;retrieve the basic lisp axioms
(GET-PROOFS LISPAX PRF PRF JK)
(PROOF APPEND)

;declare a new operator taking one or more arguments
(DECL NEWAPPEND (TYPE: |GROUND⊗(GROUND*)→GROUND|) (SYNTYPE: CONSTANT)
      (INFIXNAME: **) (BINDINGPOWER: 840))

;We will be using list induction.
;Note the distinct uses of "." in LISTINDUCTIONDEF:
;first, as a delimiter for quantified variables, and then as the infix
;operator-name for CONS.
;Expressions surrounded by bars are regarded as terms by EKL.
(SHOW LISTINDUCTION LISTINDUCTIONDEF)
;labels: LISTINDUCTION
29. (AXIOM |∀PHI.PHI(NIL)∧(∀X U.PHI(U)⊃PHI(X.U))⊃(∀U.PHI(U))|)
;labels: LISTINDUCTIONDEF
33. (AXIOM
     |∀DF NILCASE DEF.(∃FUN.(∀PARS X U.FUN(NIL,PARS)=NILCASE(PARS)∧
                        FUN(X.U,PARS)=DEF(X,U,FUN(U,DF(X,PARS)),PARS)))|)

;Note that there are 6 unifiable variables occurring in this line:
;DF of type GROUND⊗(GROUND*)→(GROUND*),
;DEF of type GROUND⊗GROUND⊗GROUND⊗(GROUND*)→GROUND,
;NILCASE of type (GROUND*)→(GROUND*),
;PARS of type GROUND*, and finally X,U of type GROUND.
;The variables PARS,X,U occur inside an existential quantifier.
;In the actual unification process they are replaced by functionally
;interpreted higher order variables.
(DEFINE NEWAPPEND |∀V X U.NIL**V=V∧(X.U)**V=X.(U**V)| (USE LISTINDUCTIONDEF))

;EKL accepts this definition because it is able to rewrite the formula
;∃NEWAPPEND.∀V X U.NIL**V=V∧(X.U)**V=X.(U**V) into TRUE
;by matching it against LISTINDUCTIONDEF with the additional unifiable
;variable NEWAPPEND coming from the other side.
;After translation from internal forms, the unifier found
;by EKL can be expressed as the following set of pairs:
;(DEF, |λX Y Z X1.X.Z|), (DF, |λX Y.Y|), (U, |U|), (X, |X|), (NILCASE, |λX.X|),
;(NEWAPPEND, |λX Y.FUN(X,Y)|) and (PARS, |V|)

;We can now go on and prove facts about NEWAPPEND.
;For example, we can show that U**V is a list
;for any two lists U,V by induction on U.
;This is done by instantiating the universal variable PHI
;in the list induction schema to the term λU.∀V.LISTP(U**V),
;opening the definition of NEWAPPEND,
;and letting the rewriter do the rest through
;the universal elimination (UE) command.
;It should be noted that the variables X,Y,Z have the sort SEXP
;and the variables U,V have the sort LISTP.
;Sorts are unary predicates representing "semantic" restrictions
;on atoms. For example, since the variable U is declared to have
;the sort LISTP, the formula LISTP U is true.
(UE (PHI |λU.∀V.LISTP(U**V)|) LISTINDUCTION (OPEN NEWAPPEND))
;∀U V.LISTP U**V

;This is a useful fact to add to the context of generally known facts
;about NEWAPPEND. The label SIMPINFO as a name to a line has
;special significance to EKL: all rewriting commands will
;use SIMPINFO lines automatically in the default mode.
(LABEL SIMPINFO)

;GENERAL PRINCIPLE: Many metatheoretic objects (axiom schemas etc.)
;can be replaced by constructions involving higher types.
Our approach to verifying the existence of LISP-like functions is quite different from the one chosen by [Boyer and Moore 1979A]. We do not use special-purpose definitional mechanisms. Any definition in EKL arises from axioms which often contain variables of higher type. Indeed, we have no desire to represent programs directly in the formalism of EKL: our approach is purely extensional. The axiom LISTINDUCTIONDEF given above is sufficient for simple primitive recursive functions with parameters. Many other LISP functions can be defined by primitive recursion on a higher type. Consider, for example, the function flat with the property

flat(x.y, z) = flat(x, flat(y, z)).

While flat is not primitive recursive, the function λy.flat(x,y) is.
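To see what the quoted property computes, the following Python sketch evaluates flat on S-expressions. Only the recursive equation comes from the text; the two base cases (flat(NIL, z) = z and flat(a, z) = a.z for an atom a) are assumptions chosen to give the usual "flatten with accumulator" behaviour, and the encoding of a LISP pair x.y as a Python 2-tuple with NIL as `None` is likewise illustrative.

```python
NIL = None  # assumed encoding of the empty list

def cons(a, d):
    """LISP pair a.d, encoded as a Python 2-tuple (an assumption)."""
    return (a, d)

def flat(x, z):
    if x is NIL:
        return z                     # assumed base case: flat(NIL, z) = z
    if not isinstance(x, tuple):
        return cons(x, z)            # assumed base case: flat(a, z) = a.z for an atom a
    car, cdr = x
    return flat(car, flat(cdr, z))   # the property quoted in the text

# ((a b) c), i.e. ((a . (b . NIL)) . (c . NIL))
tree = cons(cons("a", cons("b", NIL)), cons("c", NIL))
flattened = flat(tree, NIL)
```

Note how the second argument carries the already-flattened tail: the recursion on `car` is called with the result of the recursion on `cdr`, which is why the function is most naturally viewed as λy.flat(x,y) defined by recursion on x.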
5. Rewriting

Almost all the primitive commands of EKL use rewriting; even the decision procedures are viewed as a part of the rewriting process. Rewriting immediately poses the problem of control. Indiscriminate use of equalities may easily lead to infinite loops or worse consequences: unintended replacements. One cannot expect problems of termination to be of great relevance in our context. Thus we need a language for rewriting: how to control the process through simple instructions to EKL.

Many formal manipulation programs use the paradigm of "reasoning experts" operating on a "knowledge base", following the traditional separation of programs from data. Our first attempt at controlling rewriting was based on this approach, following a suggestion made by McCarthy. Regular expressions of strategies were used to tell the rewriter what to do. The results from this experiment were very valuable. While we were able to produce very compact proofs, often the expressions employed were incomprehensible to the casual user. One may argue that the fault lay in the design of the rewriting language. We are inclined to believe otherwise. In our opinion the point of departure from familiar mathematical practice occurs with the very attempt to separate programs from data. Mathematical statements often contain implicit procedural information.

Let us look at a simple example: what is the intended meaning of the fact

P(x) ⊃ A = B?

One can immediately enumerate several possibilities:
(1) Replace P(x) ⊃ A = B by true, whenever it appears.
(2) Replace A = B by true if one can prove P(x) in the current situation.
(3) Replace P(x) by false if one can prove ¬A = B.
(4) Replace A by B whenever one can prove P(x).
(5) Replace B by A whenever one can prove P(x).
(6) Replace A by B whenever one can prove P(x), but not in terms resulting from this substitution.

Some of the interpretations listed above subsume others: for example, (2) is more general than (1). Lines (4) and (5) are completely contradictory in intent. It is obvious that one can go on listing many more possibilities in this vein. With quantified statements the situation can get even more involved.

EKL terms are complicated data structures tagged by information about the applicability of various rewriting procedures. We view the interpretation of facts as a mapping of the form:

(fact) ⊗ (mode of use) → (rewriting procedure).

A rewriting procedure can be expressed in terms of a tuple consisting of left-hand side, right-hand side, list of unifiable variables, and conditions. The conditions can either be procedural in nature or consist of formulas to be verified before the result of the procedure can be accepted. The user can impose procedural conditions by listing arbitrary LISP S-expressions that have to evaluate to T in order for the application to be accepted. A rewriting procedure returns either the right-hand side together with a substitution list or else reports failure. The verification of the conditions of rewriting is considered separate from the process of rewriting. We have a small decision procedure that performs this task quite quickly.

Currently there are three possible modes of use for a fact. Given a non-failing application of a rewrite, the mode default accepts the result only if it is simpler, the mode always accepts the result in every case, and the mode exact accepts the resulting term but does not allow applications of this rewrite in it.

In addition to the actions described above, the rewriter does standard simplifications: for example, logical simplifications, λ-eliminations, and removal of unnecessary quantifiers are done automatically. Applications of associative operators are "flattened." Existential statements are replaced by true if they can be verified through the use of unification. Equalities are treated in a similar fashion.

Metatheoretic simplifications are done automatically. The rewriter will replace expressions of the form ↓↑t by t if t is an EKL expression of the right type and has no free variables that are captured by the current binding environment. In addition, computations involving absolute constants are performed. If an EKL symbol F is attached to the LISP function FOO, then the rewriter may replace F(↑X1 ... ↑Xn) with ↑Y (assuming the types are consistent), where Y is the result of applying FOO to the S-expressions representing X1 ... Xn.

One can ask the rewriter to apply the EKL decision procedure DERIVE. The appropriate term will be replaced by true when the procedure succeeds. DERIVE invokes a program which tests the validity of deductions in a fragment of predicate calculus designed to capture the notion of "trivial" inferences ([Ketonen and Weyhrauch 1983C]).

Rewriting procedures can be induced by conditional branches. For example, in the formula if P then A else B we use
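1) this equation has precisely one real root in the range 0 [...] 0, we can choose a $K_\epsilon$ such that, for all $L$:

$$p(L) + K_\epsilon (\gamma^* + \epsilon)^L f_A(\gamma^* + \epsilon) < K_\epsilon (\gamma^* + \epsilon)^L \qquad (5.6)$$

A base case for Eq. 5.3 (enlarging $K_\epsilon$ if necessary) is clearly true, so for all $L$:

$$T(L) < K_\epsilon (\gamma^* + \epsilon)^L \qquad (5.7)$$

∎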
For example, one naive algorithm is to substitute for variables that occur with only one polarity, if any, then pick a variable to "branch" on arbitrarily. There is only one case, and $A = \{2, 2\}$ because there are two subproblems, each at least 2 shorter than the base problem. Eq. 5.5 becomes $2\gamma^{-2} = 1$, and $\gamma^* = \sqrt{2}$.

In order to determine $\gamma^*$ for the algorithm of this paper, it is necessary to delineate the cases and subproblems that arise. This is done in the following sections.

6. Satisfiability Preserving Assignments and Dominance Pruning

If we adopt the convention that false [...] $\gamma(c, d)$. In other words, between two cases that achieve the same total reduction in subproblem lengths, the more extreme case is the worst.

Proof We restrict our attention to the region of interest, i.e., everything positive. Here $f_{mn}$ is decreasing in $\gamma$. Let $e = (m - n)/2$. Then

$$f_{mn}(\gamma) = \gamma^{-\frac{m+n}{2}} \left( \gamma^{e} + \gamma^{-e} \right)$$
For $m + n$ and $\gamma$ held constant this expression has a unique minimum at $e = 0$. Therefore (with $m + n$ held constant), as $e$ increases in absolute value, $\gamma$ must also increase to maintain the value of $f_{mn}$ at 1. ∎

Lemma 7.2 If variable $v$ occurs $k$ times in a succinct AND-OR tree with no dominant nodes, and it is chosen as the branch variable, then the sum of the reductions in the two subproblems is at least $3k$.

Proof Each node $n$ in which $v$ or $\bar{v}$ occurs is in the appropriate defining subset. There must be some other leaf in $family(n)$: it contains neither $v$ nor $\bar{v}$ or $n$ would be dominant. One assignment to $v$ eliminates $n$ and the other eliminates $family(n)$, for a total of at least 3 among both assignments. ∎

Theorem 7.3 Let the algorithm outlined in section 4 choose the "branch" variable according to rules A, B, C, and D stated above. Then the resulting cases can be characterized by the following sets of integers.

$A = \{4, 8\}$   $B = \{3, 6\}$   $C = \{4, 4\}$   $D = \{2, 8\}$

Proof In this proof, $x$ and $\bar{x}$ will always represent the literals associated with variable $v$.

Case A Some variable occurs more than 3 times. Rule A is applied to choose variable $v$. Each assignment to $v$ reduces the expression length by at least 4. By lemma 7.2, the sum of the reductions for both assignments is at least 12. By lemma 7.1, the worst possibility is that one assignment reduces by 4 and the other by 8. Therefore $A = \{4, 8\}$.

Case B Some variable occurs exactly 3 times. Rule B is applied to choose one such variable $v$. Without loss of generality, assume $x$ occurs twice and $\bar{x}$ once. There are subcases depending on what operations $x$ and $\bar{x}$ are under. Fig. 7.1 illustrates a typical subcase. In subcases B3 and B6 $v$ is symmetric; in the others it is mixed.

Subcase B1 Both $x$ are under and, and $\bar{x}$ is under an and. Each assignment to $x$ reduces by at least 4 because either $families(x)$ or $families(\bar{x})$ disappears, as well as the leaves containing $x$ and $\bar{x}$. But by lemma 7.2 the sum of reductions is at least 9. Therefore, $B1 = \{4, 5\}$.

Subcase B2 One instance of $x$, say $n_1$, is under an and, the other, $n_2$, is under an or, and $\bar{x}$ is under an and. Each assignment to $x$ reduces by at least 4 because either $family(n_1)$ or $family(n_2)$ disappears, as well as the leaves containing $x$ and $\bar{x}$. But by lemma 7.2 the sum of reductions is at least 9. Therefore, $B2 = \{4, 5\}$.

Subcase B3 Both $x$ are under and, and $\bar{x}$ is under an or. Consider the assignment $x$ = true. Both instances of $x$ disappear, as well as $\bar{x}$, for a total reduction of at least 3 leaves. Now consider $x$ = false. All of $families(x)$ disappears, plus $families(\bar{x})$, for a total reduction of at least 6 nodes. That is, $B3 = \{3, 6\}$.

Subcase B4 Both $x$ are under or, and $\bar{x}$ is under an or. This is similar to B1.

Subcase B5 One instance of $x$ is under an and, the other is under an or, and $\bar{x}$ is under an or. This is similar to B2.

Subcase B6 Both $x$ are under or, and $\bar{x}$ is under an and. This is similar to B3 with true and false interchanged.

Therefore, applying lemma 7.1, we conclude that $B = \{3, 6\}$.
The remaining cases concern expressions in which every variable occurs exactly twice, and hence every literal occurs exactly once with each polarity.

Case C Some variable is mixed. Apply Rule C to choose one such variable $v$ as the "branch" variable. The parents of $x$ and $\bar{x}$ must both contain the same operation.

Subcase C1 Both $x$ and $\bar{x}$ are under an or. Consider the assignment $x$ = true. Then $families(x)$ and $\bar{x}$ disappear for a total of at least 3 nodes. But since every variable occurs exactly twice, if an odd number of nodes disappear, some remaining variable occurs once and can be eliminated using the triviality lemma. Therefore, the reduction is at least 4 nodes. A similar argument applies to the assignment $x$ = false.

Subcase C2 Both $x$ and $\bar{x}$ are under an and. This is similar to C1 with true and false interchanged.

Therefore, we conclude that $C = \{4, 4\}$.

Case D All variables are symmetric. Apply Rule D to choose any variable $v$ as the "branch" variable. Without loss of generality, assume that all literals are named so that $x_i$ is under an and and $\bar{x}_i$ is under an or.
[Figure 7.1. Case B2. Panels: current; after x = true; after x = false.]
Subcase D1 At least one of $families(x)$ and $families(\bar{x})$ contains more than two leaf nodes. First let $families(x)$ contain 3 or more nodes. With the assignment $x$ = true, we can only count on $x$ and $\bar{x}$ disappearing for a reduction of 2 nodes. Consider the assignment $x$ = false. Now both $families(x)$ and $families(\bar{x})$ disappear, but in addition there is a domino effect. If $families(x)$ contains 5 nodes, then 7 disappear plus an unpaired one that can be eliminated by the triviality lemma, for a total of 8; so consider smaller cases. Let $families(\bar{x}) = \{\bar{x}, \bar{w}\}$. (When it is larger, a reduction of 8 is easy to show using a similar argument.) Now examination of all possibilities in which $families(x)$ has 3 or 4 nodes (keeping in mind the restriction to all symmetric variables, no dominance, and no collapsibility) reveals that there must be at least two additional literals (not complements of $\bar{w}$ or each other) that occur in $families(x)$. (See Fig. 7.2.) When one literal associated with a variable disappears, the other can be eliminated by the triviality lemma, so in all, at least 4 variables, hence 8 literals, disappear. Thus $D1 = \{2, 8\}$.

Subcase D2 Both $families(x)$ and $families(\bar{x})$ contain exactly two leaf nodes. As in D1, the assignment $v$ = true produces a reduction of 2 nodes. Consider the assignment $v$ = false. Let $w$ be the other literal in $families(x)$, and let $\bar{y}$ be the other literal in $families(\bar{x})$. (See Fig. 7.3.) ($w$ and $y$ cannot be the same literal or they would be collapsible into $x$.) Both $w$ and $\bar{y}$ disappear, so $\bar{w}$ and $y$ become trivial, and may be assigned true. But $\bar{w}$ is under an or, so assigning true to it causes $families(\bar{w})$ to disappear. Also, $y$ disappears. If $families(\bar{w})$ contains 3 or more nodes, this gives an immediate reduction of 7 or more, but exactly 7 would allow elimination of the unpaired node also, for a total reduction of 8. If $families(\bar{w})$ contains 2 nodes, let $\bar{z}$ be the other node, as shown in the diagram. It disappears, so $z$ becomes trivial and disappears. This again brings the reduction to 8. Thus D2 achieves the same reductions as D1.

Therefore, we conclude that $D = \{2, 8\}$, concluding the proof. ∎

We should emphasize that this theorem provides an upper bound only, which is not necessarily "tight". A tighter bound might be possible by exploring the worst cases above more carefully, including looking at the next level of recursion.

We now turn to the evaluation of $\gamma^*$, the exponential growth rate, as defined following Eq. 5.5.

Corollary 7.4 The algorithm's running time is $O(2^{(.25+\epsilon)L})$.
Figure 7.2. Possibilities for $families(x)$ in Subcase D1.

Figure 7.3. Illustration of Subcase D2.
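The growth rates behind Corollary 7.4 can be checked numerically. The sketch below assumes that Eq. 5.5 has the form $\sum_{a \in A} \gamma^{-a} = 1$, which is consistent with the worked example $2\gamma^{-2} = 1$ for $A = \{2, 2\}$ but is not quoted in full in the surviving text; the function name `growth_rate` and the bisection approach are illustrative choices.

```python
def growth_rate(reductions, tol=1e-12):
    """Solve sum(g ** -a for a in reductions) == 1 for g >= 1 by bisection.

    Assumes at least two reductions, so the sum exceeds 1 at g = 1 and
    decreases monotonically towards 0 as g grows.
    """
    def f(g):
        return sum(g ** (-a) for a in reductions)

    lo, hi = 1.0, 2.0
    while f(hi) > 1.0:          # widen the bracket until the root is inside
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 1.0:
            lo = mid            # root lies to the right of mid
        else:
            hi = mid            # root lies at or to the left of mid
    return (lo + hi) / 2.0

# The four case sets from Theorem 7.3.
cases = {"A": [4, 8], "B": [3, 6], "C": [4, 4], "D": [2, 8]}
rates = {name: growth_rate(a) for name, a in cases.items()}
worst_case = max(rates, key=rates.get)
naive_rate = growth_rate([2, 2])
```

Under that assumption the largest root comes from case C, where $2\gamma^{-4} = 1$ gives $\gamma = 2^{1/4}$, matching the exponent .25 in the corollary.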
Proof By lemma 7.1. $\gamma_A$ [...]

[...] $\langle W, \le \rangle$, where $W$ is a sequence $w_0, w_1, w_2, \ldots$ of states, and $w_i \le w_j$ iff $i \le j$. An assignment $V$ for $\langle W, \le \rangle$ assigns to each propositional variable $p$ a subset $V(p)$ of $W$.

Given an assignment $V$ for $\langle W, \le \rangle$, we define $V(A, w_i) \in \{T, F\}$, where $A$ is a formula and $w_i \in W$, according to the customary inductive definition; in particular

$V(\circ A, w_i) = T$ iff $V(A, w_{i+1}) = T$, and
$V(\Box A, w_i) = T$ iff $V(A, w_j) = T$ for every $w_j \in W$ such that $w_i \le w_j$.

We say that $A$ is valid (unsatisfiable) in $W$ if $V(A, w_i) = T$ ($F$) for every assignment $V$ for $W$ and every $w_i \in W$.
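The two semantic clauses for "next" and "always" can be exercised on a finite representation of an infinite state sequence. The sketch below assumes an ultimately periodic ("lasso") sequence, in which the last listed state is followed by the state at index `loop`; this restriction, the tuple encoding of formulas, and the function name `holds` are illustrative assumptions, not part of the paper.

```python
def holds(formula, i, labels, loop):
    """Evaluate a formula at state i of a lasso-shaped sequence.

    labels[k] is the set of propositional variables true at state k, so
    V(p) = {k : p in labels[k]}. After the last state the sequence
    continues at index `loop`.
    Formulas: ("var", p), ("not", A), ("or", A, B), ("next", A), ("always", A).
    """
    n = len(labels)
    op = formula[0]
    if op == "var":
        return formula[1] in labels[i]
    if op == "not":
        return not holds(formula[1], i, labels, loop)
    if op == "or":
        return (holds(formula[1], i, labels, loop)
                or holds(formula[2], i, labels, loop))
    if op == "next":
        # Successor state: i + 1, or back to the loop start after the last state.
        j = i + 1 if i + 1 < n else loop
        return holds(formula[1], j, labels, loop)
    if op == "always":
        # Every state at or after i: the rest of the list, plus the whole
        # loop if i already lies inside it.
        return all(holds(formula[1], j, labels, loop)
                   for j in range(min(i, loop), n))
    raise ValueError(f"unknown operator: {op}")

# States: w0 = {p}, w1 = {p, q}, w2 = {p}, then back to w1, w2, w1, ...
labels = [{"p"}, {"p", "q"}, {"p"}]
loop = 1
p, q = ("var", "p"), ("var", "q")
always_p = holds(("always", p), 0, labels, loop)
always_q = holds(("always", q), 0, labels, loop)
next_q_at_end = holds(("next", q), 2, labels, loop)
# "possibly q" written as not always not q
possibly_q = holds(("not", ("always", ("not", q))), 2, labels, loop)
```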
We consider the following abbreviations 'V 0 'V A (0 A means possibly A) A v o A v ... v on-l A notes the string °A n '" A s o A &••• & on-l A OA
n-l times 0 ... o A.
gA
The axiomatic system for propositional linear temporal logic is obtained by adding to the usual formalisation of propositional calculus the following axioms and inference rule :
A1 · AZ' A3 · A4 · AS'
IR.
o (A ->
B) -> (DA -> 0 B) o (A -> B) -> (0 A -> o B) O'V A Begin i o( 'I> End i v 3)
4)
v
0
0
Begin i )
Mu t ua l exc l us i on 0
('I>
Beg i ni v
'I>
End j )
0
('I>
Begin i v
'I>
Begin j)
[]
('I>
End 1.. v
End j ) , wher e i f
'I>
Fairness
o O(Begi n , v o O(Be gi n z v 5)
End i )
End,) End z)
Each process Pi is always in exactly one of the two code regions
o o
(Begi n i v Endi) ( 'I>
Begi n i v
'I>
End i )
In order to show that the specifications verify some properties we can ask: if we begin with process P1, can P2 reach the critical section? i.e. c 0 Begin z ? We assert that process P2 never reaches the critical section and we add these assertions to the specification: D o 'I> Begin z', and we prove that this set is refutable. The refutation will be:
C o '\, Begin z
D o '\, End Z
D ( '\, End Z v '\, Beg i n 1)
o (Begi n 1
n c v Be gi n 1
o
0
End 1
using cl aus e 3 . and 3 . 1 . 3 . , 3.1.Z.a and 3.1. 1.
vEndI )
n ( '\, Begin 1
v
using cl au s e S. 0
End1)
using clause Z.
o -v Begin 1
6. Some Conclusions
We think that the procedure proposed possesses some advantages with respect to other semantic or syntactical decision methods [BM] [CE]. In particular, for the expression of refinement we can follow the ideas developed in classical logic [CHL]. In this way an implementation of a linear refinement of this method has been realized in Prolog [FL].
Bibliography
[BM] BEN-ARI M. - Complexity of proofs and models in programming logics. Ph.D., Tel-Aviv University, May 1981.
[CR] CARNAP R. - Modalities and quantification. JSL Vol. 11, 1946, pp. 33-64.
[CHL] CHANG C., LEE R. - Symbolic logic and mechanical theorem proving. Academic Press, New York, 1973.
[FC] FARINAS DEL CERRO L. - A simple deduction method for modal logic. Information Processing Letters, Vol. 14, no. 2, 1982.
[FL] FARINAS DEL CERRO L., LAUTH E. - Raisonnement temporel : une methode de deduction. Rapport Universite Paul Sabatier, Toulouse, 1983.
[GPSS] GABBAY D., PNUELI A., SHELAH S., STAVI J. - Temporal Analysis of Fairness. Seventh ACM Symposium on Principles of Programming Languages, Las Vegas, NV, January 1980.
[LE] LEMMON E. - An introduction to modal logic. Amer. Phil. Quarterly Monograph Series, 1977.
[MZ] MANNA Z. - Verification of sequential programs: Temporal axiomatization. Report No. STAN-CS-81-877, Stanford University, 1981.
[RJ] ROBINSON J. - A machine oriented logic based on the resolution principle. J. ACM, 12, 1965, pp. 23-41.
[CE] CLARKE E., EMERSON E. - Design and synthesis of synchronization skeletons using branching time temporal logic, 1981.
A PROGRESS REPORT ON NEW DECISION ALGORITHMS FOR FINITELY PRESENTED ABELIAN GROUPS
D. Lankford , G. Butler, and A. Ballantyne Louisiana Tech University Mathematics and Statistics Department Ruston, Louisiana 71272
ABSTRACT
We report on the current state of our development of new decision algorithms for finitely presented Abelian groups (FPAG) based upon commutative-associative (C-A) term rewriting system methods. We show that the uniform word problem is solvable by a completion algorithm which generates Church-Rosser, Noetherian, C-A term rewriting systems. The raw result is theoretical, and few would contemplate implementing it directly because of the incredible amount of trash which would be generated. Much of this trash can be obviated by a different approach which achieves the same end. First, the uniform identity problem is solved by a modified C-A completion algorithm which generates Church-Rosser bases, and then the desired complete set can be computed directly from the Church-Rosser bases. Computer generated examples of the first part of this two part procedure are given. The second part is still under development. Our computer experiences suggest that Church-Rosser bases may often contain large numbers of rules, even for simple presentations. So we were naturally interested in finding better Church-Rosser basis algorithms. With some minor changes the method of Smith [1966] can be used to generate Church-Rosser bases. The Smith basis algorithm appears promising because computer experiments suggest that the number of rules grows slowly. However, small examples with small coefficients can give rise to bases with very large coefficients which exceed machine capacity, so we are not entirely satisfied with this approach.
This work was supported in part by NSF Grant MCS-8209143 and a Louisiana Tech University research grant .
INTRODUCTION
There are at least three different previous methods of solution of the FPAG uniform word problem: the direct sum (or basis) method, which is discussed in detail by Smith [1966]; the elementary formula method, see Szmielew [1954]; and the linear Diophantine equation method, see Cardoza [1975]. We have not studied the method of Reidemeister [1932] thoroughly enough to classify it. Only one of the above methods has been previously implemented, a variation of the basis method by Smith [1966], who modified the solution of Jacobson [1953] for improved computational efficiency. It should be routine to implement the linear Diophantine equation method, but writing an explicit algorithm for the elementary formula method would be non-trivial. It is also not obvious how to combine the linear Diophantine equation method or the elementary formula method with general computational logic systems.
The C-A term rewriting system methods of this paper are on the one hand generalizations of the Ballantyne and Lankford [1979] solution of the finitely presented commutative semigroup (FPCS) uniform word problem, and on the other hand special cases of more general C-A term rewriting system methods developed by Lankford and Ballantyne [1977]. These methods are easily incorporated into general computational logic systems, such as with blocked permutative narrowing and resolution, see Lankford and Ballantyne [1979]. Our solution of the FPAG uniform word problem with these methods solves part of the stumbling block problem mentioned by Bergman [1978]. We believe similar methods will solve other important parts of the stumbling block problem, such as the uniform word problem for finitely presented C-A rings.
BACKGROUND
It is known that the word problem for free Abelian groups is decidable by the complete C-A term rewriting system below, see Lankford and Ballantyne [1977] and Lankford [1979].
R1. $[x1] \to [x]$
R2. $[xx^{-1}] \to [1]$
R3. $[1^{-1}] \to [1]$
R4. $[(x^{-1})^{-1}] \to [x]$
R5. $[(xy)^{-1}] \to [x^{-1}y^{-1}]$
In the language of term rewriting systems, an FPAG consists of the above five C-A rewrite rules plus a finite number of ground (variable free) rewrite rules $[L_i] \to [R_i]$ where $L_i$ and $R_i$ are terms constructed from the group operations and constants. In the usual language of uniform word problems, the constants are the generators, and the equations $L_i = R_i$ are the relations. In general a presentation may have generators which do not occur in any of the relations, but we ignore them here (because they can be collected together using commutativity and associativity and kept in normal form by R1 - R5).
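As a rough illustration of what normal forms look like in the free Abelian case (R1 - R5 only, no ground rules), a ground term can be summarized by the signed exponent of each generator. The sketch below is not the authors' implementation; the nested-tuple term encoding and the function names are assumptions made for this example.

```python
from collections import Counter

def exponents(term, sign=1, acc=None):
    """Collect the signed exponent of every generator in a ground term.

    Assumed (hypothetical) encoding: "1" is the identity, any other string
    is a generator, ("mul", s, t) is a product and ("inv", t) an inverse.
    """
    if acc is None:
        acc = Counter()
    if term == "1":
        return acc
    if isinstance(term, str):
        acc[term] += sign
        return acc
    if term[0] == "mul":
        exponents(term[1], sign, acc)
        exponents(term[2], sign, acc)
        return acc
    if term[0] == "inv":
        # Inversion distributes over products (R5) and flips every sign.
        return exponents(term[1], -sign, acc)
    raise ValueError(f"unknown term: {term!r}")

def equal_in_free_abelian_group(s, t):
    """Ground terms are equal in the free Abelian group iff their
    exponent vectors agree once zero entries are dropped."""
    def clean(counter):
        return {g: e for g, e in counter.items() if e != 0}
    return clean(exponents(s)) == clean(exponents(t))
```

For instance, `a(ba)^{-1}` and `b^{-1}` have the same exponent vector, so the check reports them equal.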
In the case of FPCS it was obvious that the C-A term rewriting systems were Noetherian when ordered by a vector lexicographic order. It is natural to conjecture a similar result for FPAG, but we must be careful. For example, $[a] \to [a^{-1}]$ is not Noetherian, and there are non-ground rules. Evidently we require a norm which is compatible with certain lexicographic orders. Let $F_1 = 1$, $F_a = 1$ for any constant $a$, $F_\cdot(x,y) = x + y$, $F_{-1}(x) = x^2$, $|[v]| = 1$ for any variable symbol $v$, and define $|[t]|$ as in Lankford [1979]. It follows that if $[u]$ is an immediate reduction of $[t]$ by R1 - R5, then $|[t]| \ge |[u]|$ with equality possible only for immediate reductions using R3 or R4. And because R3 and R4 decrease the number of occurrences of -1's, it follows that R1 - R5 is Noetherian. That R1 - R5 is Noetherian is already known from Lankford [1979], but the norm used there is not compatible with lexicographic orders.
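The behaviour of the norm on the five rules can be checked on the rule skeletons themselves. The sketch below assumes the reading of the norm used above (1 for the identity, constants and variables; sum for products; square for inverses) and a hypothetical nested-tuple encoding of terms; it only evaluates the rules with every variable at norm 1, so it illustrates the claim rather than proving it for all instances.

```python
def norm(term):
    """Norm of a term: 1 for "1", constants and variables (all strings),
    norm(s) + norm(t) for ("mul", s, t), and norm(t) ** 2 for ("inv", t)."""
    if isinstance(term, str):
        return 1
    if term[0] == "mul":
        return norm(term[1]) + norm(term[2])
    if term[0] == "inv":
        return norm(term[1]) ** 2
    raise ValueError(f"unknown term: {term!r}")

x, y = "x", "y"
RULES = {
    "R1": (("mul", x, "1"), x),
    "R2": (("mul", x, ("inv", x)), "1"),
    "R3": (("inv", "1"), "1"),
    "R4": (("inv", ("inv", x)), x),
    "R5": (("inv", ("mul", x, y)), ("mul", ("inv", x), ("inv", y))),
}

# Norm of the left side minus norm of the right side, per rule.
DROPS = {name: norm(lhs) - norm(rhs) for name, (lhs, rhs) in RULES.items()}
# -> {"R1": 1, "R2": 1, "R3": 0, "R4": 0, "R5": 2}
```

Only R3 and R4 leave the norm unchanged here, which matches the statement that equality is possible only for those two rules.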
MAIN RESULTS
Lemma 1 Suppose the term algebra contains a finite number of constants ordered in some manner: $a_1, \ldots, a_k$. For any ground C-A congruence class $[g]$ which is irreducible relative to R1 - R5, let $m_i$ denote the number of occurrences of $a_i^{-1}$ in $g$, $n_i$ the number of occurrences of $a_i$ in $g$, and $V_1([g]) = (m_1, \ldots, m_k, n_1, \ldots, n_k)$. If R is a C-A term rewriting system consisting of R1 - R5 and ground rules $[L] \to [R]$ which are irreducible relative to R1 - R5 and satisfy $V_1([L]) > V_1([R])$ in the vector lexicographic order, then R is Noetherian.
Proof Suppose there were an infinite sequence $[t_1] \to [t_2] \to [t_3] \to \cdots$. For any ground rule $[L] \to [R]$ of R, $|[L]| \ge |[R]|$, i.e., the norm and vector lexicographic order are compatible. Thus it follows that $|[t_i]| \ge |[t_{i+1}]|$. Without loss of generality assume $|[t_i]| = |[t_{i+1}]|$, hence the only rules used in the immediate reductions are R3, R4, or ground. Because the ground rules are irreducible relative to R1 - R5, there is no infinite subsequence of applications of R3. The norms are equal, so R4 applies only to subterms of the form $x^{-n}$ where $n \ge 2$ and $x$ is a variable symbol or constant.¹ Clearly there can be no infinite subsequence of applications of R4 to subterms of the form $x^{-n}$ where $x$ is a variable symbol. When $x$ is a constant the only rules besides R4 which apply to a subterm of the form $x^{-n}$ are those of the form $[x] \to [c]$, $[x^{-1}] \to [c]$, or $[x^{-1}] \to [c^{-1}]$ where $c$ is a constant. The number of occurrences of -1's in these subterms is nonincreasing and strictly decreases with each application of R4. And because no new subterms of the form $x^{-n}$ are introduced (there are no rules of the form $[x] \to [c^{-1}]$, and the ground rules are irreducible relative to R1 - R5), there can be no infinite subsequence of applications of R4.
¹ The form $x^{-n}$ denotes $(\cdots((x^{-1})^{-1})\cdots)^{-1}$ rather than $(x^{-1})^n$.
So we now assume that the given infinite sequence of immediate reductions consists entirely of ground rule applications, and all $t_i$ are ground. (This does not mean that the $[t_i]$ are necessarily irreducible relative to R1 - R5.) When $x$ is a constant and $n \ge 2$ let us define the excess -1's of $x^{-n}$ as $n - 1$. The excess -1's of a ground $[g]$ is the sum of the excess -1's of subterms of $g$ of the form $x^{-n}$ where $x$ is a constant and $n \ge 2$. The excess -1's parameter of the $[t_i]$ is non-increasing, so without loss of generality assume the excess -1's parameter is constant. Now extend the domain of $V_1$ to arbitrary ground $[g]$ as follows. A term $g$ may be regarded as the product $L(g)N(g)$ where $L(g)$ is the linear part of $g$, that is a product of $a_i^{-1}$'s and $a_i$'s, and $N(g)$ is the nonlinear part, that is a product of terms of the form $g_j^{-n_j}$ where $n_j \ge 2$, and the leading function symbol of $g_j$ is not -1. (The $g_j$ may in turn be a product $L(g_j)N(g_j)$ and so on.) Let $V_1([g]) = V_1([L(g)]) + \sum_j V_1([g_j])$. It can be shown that if $[h]$ is an immediate C-A reduction of $[g]$ by a ground rule of R, and the excess -1's parameters of $[g]$ and $[h]$ are equal, then $V_1([g]) > V_1([h])$ in the vector lexicographic order. But then $V_1([t_1]) > V_1([t_2]) > V_1([t_3]) > \cdots$, which is impossible.
Conjecture If $V_1$ is replaced by $V_2([g])$ … revised Lemma 1 holds.
Theorem 1 If R is a Noetherian, C-A term rewriting system, and all critical pairs of R and its embeddings exist, then R is Church-Rosser iff for each critical pair X,Y, $X^* = Y^*$.
Proof See Lankford and Ballantyne [1977].
The completion procedure is well-known to workers in the field, and so will not be described in detail here. We just remind the reader of certain key points: equations are expressed as rewrite rules using the order described above, all rewrite rules are kept irreducible relative to the others, and equations $[t] = [u]$ which satisfy $t \in [u]$ are deleted. To show that the completion procedure decides the FPAG uniform word problem we must show (1) that all critical pairs exist at each round of the completion procedure, and (2) the completion procedure terminates uniformly.
Lemma 2 All critical pairs exist at each round of the completion procedure.
Proof It has recently been announced by Fages [1982] that the C-A unification procedure halts for arbitrary pairs of input terms, in which case it is immediate that all critical pairs exist at each round of the completion procedure. One can also prove Lemma 2 directly as follows. Let $S_1$ be R1 - R5 plus their embeddings and $S_2$ be the ground rules plus their embeddings. Because R1 - R5 are already known to be Church-Rosser, critical pairs need only be shown to exist between certain pairs of rewrite rules, namely, one each from $S_1$ and $S_2$ or both from $S_2$. This leads to 17 cases, two of which we show below.
Consider the critical pairs of the embedding $x1y \to xy$ of R1 and the embedding $Lz \to Rz$ of a ground rule $[L] \to [R]$. The trick in this case is that inverses of constants in $L$ and $R$ can themselves be treated as constants (remember that $[L]$ and $[R]$ are irreducible relative to R1 - R5), and so the C-A unification procedure halts.
Consider the critical pairs of the embedding $xx^{-1}y \to y$ of R2 and the embedding $Lz \to Rz$ of a ground rule. In this case, following Stickel [1975], a new variable $w$ is introduced for $x^{-1}$ so that the unifiers of $xwy$ and $Lz$ can be found (again treating inverses of constants as new constants). Now consider $w'$ and $(x')^{-1}$ where $x'$ and $w'$ are the assignments of $x$ and $w$ by one of the C-A unifiers in the complete set for $xwy$ and $Lz$. Because $w'$ and $x'$ are products of variables, constants, and inverses of constants, the only way that $w'$ and $(x')^{-1}$ can be C-A unifiable is for $w'$ to be the inverse of a constant and $x'$ to be the same constant. Thus the complete set of C-A unifiers of $xx^{-1}y$ and $Lz$ is a subset of the complete set of C-A unifiers of $xwy$ and $Lz$.
Lemma 3 After a current set of rules is reduced to irreducible form by the completion procedure, the new irreducible set consists of R1 - R5 and ground rules.
Proof It can be shown that if a fully reduced critical pair of the form $[g_1\,\textit{variables}]$, $[g_2\,\textit{variables}]$ is formed, where $g_1$ and $g_2$ are ground, then $[g_1]$, $[g_2]$ is also formed. Thus when the completion procedure is iterated, except for R1 - R5, only ground rules are present after the current set is reduced to irreducible form. Of course, before passing to a subsequent critical pairs step we must again add embeddings.
Lemma 4 The C-A completion procedure halts uniformly.
Proof By the same argument as Lemma 2 of Ballantyne and Lankford [1979], if the completion procedure ran indefinitely, then there would exist an infinite set of mutually incomparable vectors, which is impossible.
Theorem 2 The uniform word problem for finitely presented Abelian groups is decidable by the C-A completion procedure.
Proof Use Lemmas 1 - 4 and Theorem 1.
Inefficiencies in the raw procedure (Theorem 2) arise from two sources: (1) the C-A unification procedure, and (2) the fact that constants can occur in equations in equivalent ways, as a constant on one side of a rule, or as the inverse of the constant on the other side of the rule. We obviate much of this inefficiency by taking a different approach. First, the uniform identity problem is solved by a modified C-A completion algorithm which generates Church-Rosser bases, and then the desired complete set can be computed directly from the Church-Rosser bases. Computer generated examples of the first part of this two part procedure are given. The second part is still under development.
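The orientation requirement of Lemma 1 can be pictured with a small sketch. It assumes that $V_1$ lists the counts of inverted generators first and the counts of plain generators second, and that an irreducible ground word is given as a map from generator to signed exponent; the function names and this encoding are illustrative assumptions, not part of the paper.

```python
def v1(exps, generators):
    """Vector (m_1..m_k, n_1..n_k) of a word irreducible under R1 - R5.

    exps maps each generator to its signed exponent; m_i counts occurrences
    of the inverse of generator i, n_i counts occurrences of generator i.
    """
    m = tuple(max(-exps.get(g, 0), 0) for g in generators)
    n = tuple(max(exps.get(g, 0), 0) for g in generators)
    return m + n

def orient(lhs, rhs, generators):
    """Return (left, right) so that V1(left) > V1(right) in the vector
    lexicographic order, or None when the two vectors coincide."""
    l, r = v1(lhs, generators), v1(rhs, generators)
    if l > r:          # Python compares tuples lexicographically
        return lhs, rhs
    if r > l:
        return rhs, lhs
    return None
```

With generators ordered as `["a", "b"]`, the equation relating `a` and the inverse of `b` is oriented from the inverse of `b` to `a`, which agrees with the remark in the proof of Lemma 1 that no rule rewrites a constant into the inverse of a constant.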
For any C-A term rewriting system R let E(R) denote the set of equations obtained from R by deleting brackets and replacing $\to$ by $=$ throughout, together with the commutative and associative equations which define the congruence classes, let $[t]^*$ denote the normal form of $[t]$ relative to R1 - R5, and let $\mapsto$ denote an immediate C-A reduction by R1 or a ground rule. R is a Church-Rosser basis means R consists of R1 - R5 and ground rules of the form $[L] \to [1]$, and $E(R) \vdash t = 1$ iff $[t]^* \mapsto^* [1]$.
Lemma 5 If R consists of R1 - R5 and ground rules of the form $[L] \to [1]$, and $[L]$ is irreducible relative to R1 - R5, then R is a Church-Rosser basis iff for each critical pair X,Y obtained by the maximum overlap method of Ballantyne and Lankford [1979] (with inverses of constants treated as constants), … $\to^*$ …
Proof
Now let E (R) t
E
to' t
l,
1- t
.. , , t
axiom of E (R).
1. n
e
« )
1---+*
Clearly [tJ *
----+
[lJ.
[lJ implies E (R)I- t
1 where each step is obtained by one application of some
For n
=
applied is R1 - R4, then [toJ *
1 we have [to] =
1 has a proof of length n,
=
If the rule which
[1].
[1], hence [to]
*
----+
*
[1].
And i f the rule
which applied is ground, then [t J = [to] * (because the left sides of ground O rules are irreducible), so [to] * ---+
are now five cases to consider:
(1)
R1 - R5, (3) [to] --
to Rl - R5.
*
[l][ti]
[L][ti] ---+
1---+
*
[tl] .
i~
[L][ti]
But [L][ti]
*
* where
[til
~'(
=
[t
*
So
l].
may not be irreducible relative
If it is not irreducible relative to Rl - R5, then it follows that
only Rl and RZ apply, and that RZ applies only to constant-inverse of constant Without loss of generality let [L][ti] *
pairs.
* * ([L][ti]) [L
l]
=
=
[L'z], [tIl
*
l
[L] --->- [1] on xl ... ~.
* ---+
[L'~
+ 1 ... xn (Yk + 1
.. xny ... ynz], l
[1]
i]
[1]
--->-
8
R, and
R and overlaps with
8
The overlap may not be maximum, but it can be shown
that for ~ overlap critical pair X,Y, [xy- l ]* -1
l·
= [yl ... ynz] = [Ll ... Ljl where [L
It follows that [L -1 ]
[yl ... ykw].
[L'x
--->-* [1].
-1 -1 -1 * 1 * ... Y z ) ] --->- [1], so we are done with the nonn
interaction part of case (4).
If R is not a Church-Rosser basis, then R can be transformed into a Church-Rosser basis by a completion algorithm which adds rules $[M] \to [1]$ when $[XY^{-1}]^* \to^* [M] \ne [1]$ or $[L^{-1}]^* \to^* [M] \ne [1]$ with $[M]$ irreducible. The algorithm terminates because there can be no infinite set of mutually incomparable vectors.
Example 1 The Abelian group presented by $a^2b^{-3}c = 1$, $a^{-3}b^2c^3 = 1$, $a^2b^2c^{-2} = 1$ is typical of our computer experience with small examples. To conserve space we drop brackets. A 70 ground rule Church-Rosser basis was generated which consisted of the following 35 rules and their reduced inverses.
G1. $a^2b^{-3}c \to 1$
G8.
a
GZ.
a- 3b2c 3 ---+ 1
G9.
abc
G3.
aZbZc- Z ->- 1
GIO.
-1 4 -5 abc --->-1
G4.
a -\4 c->-l
GIL
a- 6b4 ---+ 1
G5.
b 5c-3 --->- 1
GIZ.
-3 Z -3 abc ->-1
G6.
abc
---+ 1
G13.
A\-5 ---+ 1
G7.
a
bc ->- 1
G14.
6 c ->-1
-4
-4
-3 7 b ---+ 1 -2
---+ 1
G15.
a- 2b3c 5 -----+ 1
lO
1
G26.
b
G16 .
a - \ - 6 c -----+ 1
G27.
a 1\ c- 2 -----+ 1
G17 .
a 3b 3 -----+ 1
G28.
4 -1 5 a b c -----+ 1
G18.
a-\9 c-2 -----+ 1
G29.
a
G19.
5 a c -----+ 1
G30.
a 12b 2 -----+ 1
G20.
5 3 b c -----+ 1
G3L
a l 6b c- 1 _
G2L
6 -3 a be -----+ I
G32 .
a
G22.
} b 2c - 1 -----+ I
G33.
a 2lb -----+ I
G23.
a 9b -1 -----+ 1
G34 .
a
G24 .
a
G35.
a
G25 .
a
10 -4 c 2b7
-----+
1
-----+
15 - 3 -----+ 1 c
20 - 2 c
I
-----+
1
25 - 1 -----+ 1 c 30
-----+
1
c -----+ 1
Our computer experiences suggest that Church-Rosser bases may often contain large numbers of rules, even for small examples. Whether this remains true for the corresponding Church-Rosser, Noetherian, C-A term rewriting systems is unknown because we have not yet implemented an algorithm which transforms Church-Rosser bases into complete sets of C-A reductions. Regardless of the outcome of that investigation we are naturally interested in finding better Church-Rosser basis generation algorithms. With some minor changes the method of Smith [1966] can be used to generate Church-Rosser bases. Smith's algorithm accepts a matrix of integers corresponding to the exponents of a given presentation and outputs a matrix of integers which describes the FPAG in terms of its cyclic summands. The output matrix is known as the Smith normal form (SNF). Smith [1966] points out that the SNF algorithm is equivalent to transforming a given presentation into an equivalent presentation of the form $a_i = \prod_j b_j^{n_{ij}}$, $b_j^{m_j} = 1$ where the $a_i$ and $b_j$ are distinct, the $a_i$ are constants which occur in the given presentation, and the $b_j$ are either constants which occur in the given presentation or new constants (introduced by the SNF algorithm). The number of equations of the first kind is bounded above by the number of distinct constants which occur in the given presentation, and the number of equations of the second kind is bounded above by the number of given relations. The corresponding C-A term rewriting system consisting of R1 - R5, the rules $[a_i] \to [\prod_j b_j^{n_{ij}}]$ and $[b_j^{m_j}] \to [1]$ will be denoted by $R_{SNF}$. $R_{SNF}$ is generally not a Church-Rosser basis, but it can be easily transformed into one by adding the rules $[b_j^{-m_j}]^* \to [1]$. We call the Church-Rosser bases obtained in this way SNF Church-Rosser bases.
Theorem 3
If R is an SNF Church-Rosser basis, and R' is obtained from R by replacing the rules $[b_j^{-m_j}]^* \to [1]$ and $[b_j^{m_j}] \to [1]$ by $[b_j^{-\frac{1}{2}|m_j|}] \to [b_j^{\frac{1}{2}|m_j|}]$ and $[b_j^{\frac{1}{2}|m_j| + 1}] \to [b_j^{-\frac{1}{2}|m_j| + 1}]$ when $m_j$ is even, or by $[b_j^{-\frac{1}{2}(|m_j| + 1)}] \to [b_j^{\frac{1}{2}(|m_j| - 1)}]$ and $[b_j^{\frac{1}{2}(|m_j| + 1)}] \to [b_j^{-\frac{1}{2}(|m_j| - 1)}]$ when $m_j$ is odd, then R' is a Noetherian, Church-Rosser, C-A term rewriting system.
Proof The number of applications of ground rules of the first kind is finite because the number of occurrences of $a_i$ is non-increasing, and strictly decreases with each application of a rule of the first kind. R' is thus Noetherian because any C-A term rewriting system consisting of R1 - R5 and ground rules of the second kind satisfies the hypothesis of Lemma 1. That R' is Church-Rosser follows from Theorem 1. Because of the simple structure of R' the superpositions which must be checked are simple and will not be given here.
Example 2
Let us call the complete sets generated via Theorem 3 SNF complete sets. The ground rules of the SNF complete set for Example 1 are given below (without brackets).
G1. $b \to a^{9}$
G2. $c \to a^{-5}$
G3. $a^{-15} \to a^{15}$
G4. $a^{16} \to a^{-14}$
This is obviously quite an improvement over the Church-Rosser basis of Example 1.
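The effect of the SNF approach on Example 1 can be reproduced with a few lines of integer arithmetic. The sketch below is not Smith's algorithm as published: it diagonalizes the exponent matrix by elementary row and column operations without enforcing the divisibility chain of a true Smith normal form. It assumes that the relations of Example 1 have exponent rows (2, -3, 1), (-3, 2, 3), (2, 2, -2) over a, b, c, and that the SNF complete set sends b to the 9th power of a and c to the inverse 5th power of a; all names are illustrative.

```python
from math import prod

def diagonalize(matrix):
    """Diagonalize an integer matrix by unimodular row/column operations
    and return the absolute values of the diagonal entries."""
    a = [row[:] for row in matrix]
    rows, cols = len(a), len(a[0])
    diag, t = [], 0
    while t < min(rows, cols):
        # Nonzero entry of least absolute value in the remaining block.
        entries = [(abs(a[i][j]), i, j)
                   for i in range(t, rows) for j in range(t, cols) if a[i][j]]
        if not entries:
            break
        _, pi, pj = min(entries)
        a[t], a[pi] = a[pi], a[t]                 # move pivot to row t
        for row in a:                             # move pivot to column t
            row[t], row[pj] = row[pj], row[t]
        pivot = a[t][t]
        for i in range(t + 1, rows):              # reduce column t
            q = a[i][t] // pivot
            for j in range(t, cols):
                a[i][j] -= q * a[t][j]
        for j in range(t + 1, cols):              # reduce row t
            q = a[t][j] // pivot
            for i in range(t, rows):
                a[i][j] -= q * a[i][t]
        done = (all(a[i][t] == 0 for i in range(t + 1, rows))
                and all(a[t][j] == 0 for j in range(t + 1, cols)))
        if done:                                  # otherwise retry with a smaller pivot
            diag.append(abs(pivot))
            t += 1
    return diag

RELATIONS = [[2, -3, 1], [-3, 2, 3], [2, 2, -2]]   # exponents of a, b, c
DIAGONAL = diagonalize(RELATIONS)                  # -> [1, 1, 30]
ORDER = prod(DIAGONAL)                             # the group is cyclic of order 30
IMAGE = (1, 9, -5)                                 # a -> a, b -> a^9, c -> a^-5

def normal_form(word):
    """Exponent of a, in the range -14..15, for a word given by its
    exponents over (a, b, c)."""
    e = sum(x * y for x, y in zip(word, IMAGE)) % ORDER
    return e if e <= ORDER // 2 else e - ORDER
```

Under these assumptions every relator of Example 1 has normal form 0, and the exponents 16 and -15 are normalized to -14 and 15, which is the behaviour the last two ground rules of the SNF complete set describe.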
Example 3 The following presentation was obtained by randomly selecting the last four digits of telephone numbers. The ground rules of the SNF complete set are given below (without brackets).
c
3827 l
c
- 2223 1934 - 3400 4815 -6646 7833 -9443 4584 -4462 c c c c c c c c 8 6 lO 2 3 4 5 7 9
Gl.
c3
---+
cl
G2.
c
---+
c
~.
c7
---+
cl
G4.
c
5
---+
8
c
475
- 47 - 771 - 285 130 74 - 25l - 380 1200 d3 d2 c2 c4 c6 c9 cl O d1
22
4 -11 - 13 58 - 7 -35 - 16 7 c c c c d d d 4 2 3 2 6 9 lO l
l
-139
2 l
c
c
1
17 226 86 - 39 - 22 75 109 -335 d3 c 2 c4 c6 c9 c l O d1 d 2
-2 -2 -2 3 d c c c d 4 2 6 9 ld2 3
CONCLUSIONS
Our approach has been based upon the methods of Lankford and Ballantyne [1977] because we are familiar with them. The more general methods of Peterson and Stickel [1981] could also be used to develop these results. Our investigations are incomplete, but preliminary computer experiments suggest that the SNF completion algorithm is currently the most practical method of deciding the FPAG uniform word problem with complete sets of reductions. Smith [1966] gave an example with 8 generators and 8 relations, all of whose exponents are one digit, which generated numbers that exceeded his machine capacity ($10^8$). We wonder if there are other basis generation methods which achieve a better balance between the number of rules and the exponent size. We also wonder what can be said about the theoretical complexity of SNF completion. Cardoza [1975] points out that the linear Diophantine equation method of solution of the FPAG uniform word problem is of polynomial complexity. But as we have said, that method does not combine readily with general logical systems. A basis solution of the uniform word problem for finitely presented nilpotent groups has been given by Mostowski [1966a, b], so it is reasonable to ask if similar term rewriting system methods can be developed for nilpotent groups. However, it appears to be unknown whether the word problem for free nilpotent groups is decidable by some kind of complete sets of reductions. We are especially interested in extensions of these methods to finitely presented C-A rings, and believe there is a close relationship with extensions of the method of Buchberger [1979] to Z-algebras, such as by Kapur [1983].
REFERENCES
Ballantyne, A. and Lankford, D. New decision algorithms for finitely presented commutative semigroups. Louisiana Tech U., Math. Dept., report MTP-4, May 1979; J. Comput. Math. with Appl. 7 (1981), 159-165.
Bergman, G. The diamond lemma for ring theory. Advances in Math. 29 (1978), 178-218.
Buchberger, B. A criterion for detecting unnecessary reductions in the construction of Gröbner-bases. Lecture Notes in Comp. Sci. 72 (1979), Springer-Verlag, 3-21.
Cardoza, E. Computational complexity of the word problem for commutative semigroups. M.Sc. thesis, MIT, Cambridge, MA, Aug. 1978; and MAC Tech. Memo. 67, Oct. 1975.
Fages, F. (some results of Fages presented by J. P. Jouannaud at a GE term rewriting system conference, Sept. 1983)
Jacobson, N. Lectures in Abstract Algebra, II Linear Algebra, Van Nostrand, 1953.
Kapur, D. (comparison of computer generated examples at a GE term rewriting system conference, Sept. 1983)
Lankford, D. On proving term rewriting systems are Noetherian. Louisiana Tech U., Math. Dept., report MTP-3, May 1979.
Lankford, D. and Ballantyne, A. Decision procedures for simple equational theories with commutative-associative axioms: complete sets of commutative-associative reductions. U. of Texas, Math. Dept., ATP project, report ATP-39, Aug. 1977.
Lankford, D. and Ballantyne, A. The refutation completeness of blocked permutative narrowing and resolution. Proc. Fourth Workshop on Automated Deduction, Austin, Texas, Feb. 1979, W. Joyner, ed., 168-174.
Mostowski, A. On the decidability of some problems in special class of groups. Fund. Math. LIX (1966), 123-135.
Mostowski, A. Computational algorithms for deciding some problems for nilpotent groups. Fund. Math. LIX (1966), 137-152.
Peterson, G. and Stickel, M. Complete sets of reductions for some equational theories. JACM 28, 2 (Apr. 1981), 233-264.
Smith, D. A basis algorithm for finitely generated Abelian groups. Math. Algorithms 1, 1 (Jan. 1966), 13-26.
Stickel, M. A complete unification algorithm for associative-commutative functions. Advance Papers of the Fourth International Conference on Artificial Intelligence, AI Lab, MIT, Aug. 1975, 71-76.
Reidemeister, K. Einführung in die kombinatorische Topologie, Braunschweig, 1932, 50-56.
Szmielew, W. Elementary properties of Abelian groups. Fund. Math. XLI (1954), 203-271.
Canonical Forms in Finitely Presented Algebras
Philippe Le Chenadec
INRIA, Domaine de Voluceau, Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France
ABSTRACT
This paper is an overview of rewriting systems as a tool to solve word problems in usual algebras. A successful completion of an equational theory, defining a variety of algebras, induces the existence of a completion procedure for the finite presentations in this variety. The common background of these algorithms implies a unified vision of several well-known algorithms: Thue systems, abelian group decomposition, Dehn systems for small cancellation groups, Buchberger's and Bergman's algorithms, while experiments on many classical groups prove their practical efficiency despite negative decidability results.
Keywords: Finitely Presented Algebras, Word Problem, Rewriting Systems, Completion Procedures.
1. Introduction
To compute in the group G defined by the generators a, b, c, and the equations abc = aa = bbb = ccccccc = 1, the mathematician alternately works on two levels: first, he deals with a group satisfying the associativity, right unit and inverse axioms, so that he simplifies words of the form $aa^{-1}$, takes inverses, etc.; second, the group G allows new substitutions such as bcabcb … . We explain in the present paper how this mathematician, turned computer scientist once he discovered that a computer makes fewer mistakes than he does in tedious work, will teach his favorite computer the elementary laws of classical algebras. In the theory of rewriting systems, the canonical system of rules associated to a variety performs the first type of algebraic operations. Such systems are known for all classical algebras [Hul80a]. This paper presents an approach for the second step. By coding the canonical system in new data and control structures, we get completion procedures for finitely presented algebras of the variety. After a brief presentation of the rewriting systems background, we analyse each variety from the simplest ones (semigroups...) to more complex ones (rings...). In the abelian case, the completion always terminates as the uniform word problem is solvable. By unsolvability results, this is false in non-commutative varieties. However, many examples of complete group presentations are given, showing the practical interest of an algorithm whose behaviour is theoretically worse. Also in group theory, we show that the basic result of the important small cancellation theory can be refined with rewriting theory. In other cases, we find as special cases Thue systems (monoids), Bergman's algorithm (associative algebras), Buchberger's one (commutative algebras); these facts prove the unifying effect of the present view.
2. REWRITING SYSTEMS and the COMPLETION PROCEDURE
This section is a succinct presentation; for a detailed development, we refer to [Hue80a, Hue80b]. We just define the basic notions.
2.1. Definitions
Let $S$ be a finite set; $S^*$ is the monoid of words on $S$, with concatenation and the empty word 1; $S^+ = S^* - \{1\}$ is the semigroup of non-empty words. An equivalence relation $\sim$ on $S^*$ is a congruence iff $\forall U, V, W, W' \in S^*$, $U \sim V \Rightarrow WUW' \sim WVW'$. The word $U \ne 1$ is a subword, prefix or suffix of $V$ iff $\exists W, W'$ s.t. $V = WUW'$, $V = UW$ or $V = WU$; $U$ is proper iff $U \ne V$. For computer handling, algebraic expressions are coded as terms, defined by an operator domain $F$ graded by arity $\alpha: F \to \mathbb{N}$, and a denumerable set $V$ of variables, disjoint from $F$. The algebra of terms is noted $T(F,V)$ and defined on $(F \cup V)^*$: $\forall c \in F$ s.t. $\alpha(c) = 0$, $c \in T(F,V)$; $\forall v \in V$, $v \in T(F,V)$; $\forall f \in F$ …
Definition 2.1 The variety $V(E)$ defined by a set of equations $E$ is the class of all algebras satisfying the equations of $E$. A presentation of an algebra $A$ in $V(E)$ is a pair of sets $(G, R)$ s.t.
- $G \cap F = \emptyset$, $\forall g \in G$, $\alpha(g) = 0$.
- $R$ is a set of equations on $G(F \cup G)$.
The algebra $A$ is the quotient $[G(F \cup G)/{=_E}]/{=_R}$. $A$ is finitely presented iff $G$ and $R$ are finite.
For example, groups are defined by the operators ., 1, $^{-1}$, and the three equations $(x.y).z = x.(y.z)$, $x.(x)^{-1} = 1$ and $x.1 = x$, $x, y, z \in V$, while $(a, b;\ a.a = 1,\ a.b = b.a,\ b.(b.b) = 1)$ is a presentation of $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/3\mathbb{Z}$.
2.2. Reductions
The basic idea to compute normal forms is to reduce terms by rules which are directed equations, noted $\lambda \to \rho$, s.t. $V(\rho) \subseteq V(\lambda)$. The precise definition of reductions needs subterm replacements: if $u$ is a subterm of $t$, then $t[u \leftarrow u']$ is the term $t$ where $u$ is replaced by $u'$. The term $t$ reduces to $t'$, noted $t \to t'$, iff there is a $t$-subterm $u$ and a substitution $\sigma$ s.t. $\sigma(\lambda) = u$ and $t' = t[u \leftarrow \sigma(\rho)]$. The transitive-reflexive closure of $\to$ is noted $\to^*$. To compute one reduction step, we must find the substitution $\sigma$; this operation is called the match of two terms. We also need the unification of two terms: $\exists \sigma$ s.t. $\sigma(t) = \sigma(t')$? When the equality is replaced by the associative-commutative equivalence, these operations are called AC-matching and AC-unification; in all cases complete and finite algorithms are known [Pet82a, Fag84a, Liv76a].
2.3. The Completion Algorithm
We are looking for sets of rules such that a term has a unique irreducible form. This is achieved when the rules satisfy two classical properties:
Noetherianity: there exist no infinite reduction chains.
Confluence: $\forall m, n, p$, $p \to^* m$ and $p \to^* n \Rightarrow \exists q$, $m \to^* q$ and $n \to^* q$.
This last condition may be tested by critical pairs when the reduction is noetherian [Hue80a]:
Definition 2.3 If the rules $\lambda \to \rho$ and $\mu \to \nu$ are such that there exists a non-variable subterm $\lambda'$ of $\lambda$ unifiable with $\mu$, under the substitution $\sigma$, then the equation $(\sigma(\lambda)[\sigma(\lambda') \leftarrow \sigma(\nu)], \sigma(\rho))$ is called a critical pair (c.p.) obtained by the superposition of $\mu \to \nu$ on $\lambda \to \rho$.
Thus, unification is a crucial point to compute the critical pairs. For example, the two rules $x*(x.(a*y)) \to f(x)$ and $b.z \to g(z)$ give the c.p. $(b*g(a*y), f(b))$, under the substitution $\{x \leftarrow b,\ z \leftarrow a*y\}$. These pairs are elementary divergent points in the reductions. We now present the Completion Algorithm.
It tries, giving a set of equations and a reduction ordering over terms, to compute a neetherian and confluent rewriting system [HulBOa, Knu70a]. Along this study, we modify the previous notions '{term structure, matching, reductions, superposition) by examining the behaviour of the completion procedure in the various varieties. However, the control structure of this algorithm will be essentially the same:
COMPLETION ALGORITHM
Input: a finite set E of equations; a reduction ordering.
Red(W,R): returns a normal form of W under the rewriting system R.
Super(k,R): computes all critical pairs between the rule k and those in R.
[…] $N$ Then let $\lambda := M$, $\rho := N$; If $M$ […]

[…] $b_1.(\ldots(b_{m-1}.b_m)\ldots)$, $\beta: a_1.(\ldots(a_{n-1}.(a_n.x))\ldots) \to b_1.(\ldots(b_{m-1}.(b_m.x))\ldots)$, $n,m \in \mathbb{N}$, $a_i,b_j \in G$, $x \in V$.
iii) Superpositions without the associative rule result from a $\beta$ left member and an $\alpha$ one s.t. $\exists i \in \mathbb{N}$ s.t. $a'_{i+1-j} = a_{n+1-j}$, $1 \le j \le i \le n$.

The first proposition asserts the linear structure of irreducible terms, under the correspondence $a_1.(a_2.(\ldots(a_{n-1}.a_n)\ldots)) \leftrightarrow a_1 a_2 \ldots a_{n-1} a_n$. The other ones detail the possible superpositions: in ii), the associative rule on a ground term rule gives a $\beta$ rule, and in iii) an $\alpha$ rule on a $\beta$ one gives an $\alpha$ rule. There are no other legal superpositions (remember that we superpose on a non-variable subterm). Thus, we get two consequences. First, the data-structure of words is better in this case than the term one; second, all the $\beta$ rules are redundant. So we get a new algorithm whose control structure is the completion one; the following list gives the connection between old and new keywords:
A term is now a word.
The word $U$ matches the word $V$ iff $U$ is a prefix of $V$.
The rule $U \to V$ superposes on $P \to Q$ iff $\exists A,B,C$ s.t. $B \neq 1$, $AB = P$ and $BC = U$; the critical pair is then $(QC, AV)$.
The word $W$ reduces in $W'$ iff $\exists\, U \to V$, $A$, $B$ s.t. $W = AUB$ and $W' = AVB$.
Then, if $R(V)$ is the irreducible form of the word $V$ under the canonical word system $R$, we have from theorem 2.4:
Corollary 3.2 If $(G,E)$ is a finite presentation of a semigroup $S$, and $R$ is a canonical system for $S$, then the subset $\Sigma$ of $G^+$ whose elements are all the $R$-irreducible words, with the law $\times$ defined by
$\forall W,W' \in \Sigma,\ W \times W' = R(WW')$,
is a semigroup isomorphic to $S$.

Of course, such a finite set of rules does not always exist, the word problem for semigroups being unsolvable [Nov55a]. The variety of monoids has the following canonical system:
M:
$(x.y).z \to x.(y.z)$
$x.1 \to x$
$1.x \to x$
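The word-level notions listed above (reduction $W = AUB \to AVB$ and the critical pair $(QC, AV)$) can be sketched in a few lines. This is only an illustrative sketch, not the paper's implementation: words are Python strings, the empty string stands for the monoid unit 1, the function names are hypothetical, and the rule set passed in is assumed to be terminating.

```python
def reduce_once(word, rules):
    """One leftmost reduction step W = AUB -> AVB, or None if W is irreducible."""
    for i in range(len(word)):
        for lhs, rhs in rules:
            if word.startswith(lhs, i):
                return word[:i] + rhs + word[i + len(lhs):]
    return None


def normal_form(word, rules):
    """Reduce until irreducible (assumes the rules are noetherian)."""
    while True:
        nxt = reduce_once(word, rules)
        if nxt is None:
            return word
        word = nxt


def critical_pairs(rule_uv, rule_pq):
    """U -> V superposes on P -> Q when P = AB, U = BC, B non-empty.

    Each overlap B yields the pair (QC, AV), the two reducts of ABC.
    """
    (u, v), (p, q) = rule_uv, rule_pq
    pairs = []
    for k in range(1, min(len(u), len(p)) + 1):
        if p.endswith(u[:k]):          # B = u[:k] is a suffix of P
            a, c = p[:-k], u[k:]       # P = A B, U = B C
            pairs.append((q + c, a + v))
    return pairs


if __name__ == "__main__":
    rules = [("ba", "ab"), ("aa", "")]
    print(normal_form("baba", rules))
    print(critical_pairs(("ab", "c"), ("ba", "d")))
```

For the overlap of `ab -> c` on `ba -> d`, the word `bab` reduces both to `db` and to `bc`, which is the pair returned.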
The only difference between semigroups and monoids is that the reductions must remove the constant 1 from words, as done by the last two rules in M: as superpositions occur on a non-variable subterm and all rules are interreduced, these two rules cannot give c.p. The sets of rules on words are a generalization of Thue systems, for which only length reducing replacements are allowed. Many results can be found about such systems in [Boo82a, Boo82b] or in [Coc76a], especially in connection with language theory.

4. GROUPS
The study of non abelian groups is done in two steps. The completion of groups is observed as in the previous cases. But a new fact appears: rules are symmetrized. This operation is closely related to the canonical system for groups; the second part of the present section is an analysis of this symmetrization process. We first need some technical definitions. Let $G$ be a set of generators; $G^{-1}$ is a copy of $G$ whose elements are noted $a^{-1}$, $a \in G$. If $b = a^{-1} \in G^{-1}$ then $b^{-1}$ is the generator $a$. If $U = u_1 \ldots u_n \in (G \cup G^{-1})^*$, $u_i \in G \cup G^{-1}$, then $U^{-1} = u_n^{-1} \ldots u_1^{-1}$, with $1^{-1} = 1$. The word $u_i \ldots u_n u_1 \ldots u_{i-1}$ is called a cyclic permutation of $U$, $1 \le i \le n$. The length of $U$, $|U|$, is the number of generators in $U$, with $|1| = 0$.

4.1. Completion
The famous canonical system for groups is the following one:

G:
$x.1 \to x$
$x.x^{-1} \to 1$
$x.(x^{-1}.y) \to y$
$1^{-1} \to 1$
$(x.y).z \to x.(y.z)$
$(x.y)^{-1} \to y^{-1}.x^{-1}$
$(x^{-1})^{-1} \to x$
$1.x \to x$
$x^{-1}.x \to 1$
$x^{-1}.(x.y) \to y$
By the rules G3, G4, G8 and G9, not all words of $(G \cup G^{-1})^*$ are G-irreducible. We therefore need the following function, which computes the G-canonical forms of words:
let G(W) = Case W of
  $aa^{-1}V$ or $a^{-1}aV$ Then G(V);
  $aV$ Then Case G(V) of
    1 Then $a$;
    $a^{-1}U$ Then $U$;
    Otherwise $a$G(V);
  Otherwise W.

This function is now used to define the group reduction:
$W \to W'$ iff $\exists\, U \to V$, $A$, $B$ s.t. $W = AUB$ and $W' = G(AVB)$.
The function G computes the well-known canonical form in free groups, i.e. the reduction generated by the rules $aa^{-1} \to 1$, $a \in G \cup G^{-1}$, or by the canonical system G. Of course, we could merge these rules with those generated by the completion of a given group, but the previous reduction runs faster, and this fact is essential in an implementation. We then have the equivalent of lemma 3.1, whose details are left out. The point is that new critical pairs are computed between ground term rules and the G ones; expressed as words they give the:

Lemma 4.1
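To the rule $a_1 \ldots a_n \to b_1 \ldots b_m$, $a_i,b_j \in G \cup G^{-1}$, the completion associates the following critical pairs:
$(a_1 \ldots a_{n-1},\ b_1 \ldots b_m a_n^{-1})$,
$(a_2 \ldots a_n,\ a_1^{-1} b_1 \ldots b_m)$,
$(a_n^{-1} \ldots a_1^{-1},\ b_m^{-1} \ldots b_1^{-1})$.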
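As a rough illustration of the function G and of the three pairs of Lemma 4.1, the sketch below encodes a word as a Python string in which a letter and its case-swapped counterpart are mutually inverse (`a` and `A`). That encoding and the function names are assumptions made for the example, and the stack-based free reduction is a standard substitute for the recursive Case definition above, not a transcription of it.

```python
def inv(word):
    """Inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def free_reduce(word):
    """Free-group canonical form: cancel adjacent inverse letters with a stack."""
    stack = []
    for x in word:
        if stack and stack[-1] == x.swapcase():
            stack.pop()            # x cancels the previous letter
        else:
            stack.append(x)
    return "".join(stack)


def canonical_pairs(lhs, rhs):
    """The three critical pairs of Lemma 4.1 for the rule lhs -> rhs,
    each member being freely reduced."""
    return [
        (free_reduce(lhs[:-1]), free_reduce(rhs + inv(lhs[-1]))),
        (free_reduce(lhs[1:]), free_reduce(inv(lhs[0]) + rhs)),
        (free_reduce(inv(lhs)), free_reduce(inv(rhs))),
    ]


if __name__ == "__main__":
    print(free_reduce("abBA"))
    print(canonical_pairs("ab", "c"))
```

For the rule `ab -> c` the pairs read $a = cb^{-1}$, $b = a^{-1}c$ and $b^{-1}a^{-1} = c^{-1}$, all consequences of $ab = c$ in a group.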
As these critical pairs are specific to the group variety, we may call them canonical. Now, the group completion is completely defined. The control structure is modified: we must add one step to compute the canonical pairs of the new rules. A good heuristic is to give them the highest priority in the set E of waiting equations, and to generate them between the first two steps. The reader can now compare the analogue of corollary 3.2 for groups: just add the inverse operator, i.e. if $W \in \Sigma$ then its inverse is $R(W^{-1})$. If we delete the superposition step in completion, we obtain the symmetrization algorithm, whose name will become obvious later on. As in the monoid case, the group word problem is unsolvable [Nov55a], so the completion may not terminate. However, it halts on finite groups, as noted by Métivier [Met83a]:
Proposition 4.2 Given a presentation of a finite group $G$, the completion algorithm always halts in success.

Suppose the completion does not halt; by theorem 2.4, it computes an infinite canonical system $R_\infty$. As $G$ is finite and the right members of the rules are $R_\infty$-irreducible, there exists a word $W$ which is the right member of infinitely many rules. Let $\Sigma = (W_i)_{i \in \mathbb{N}}$ be the sequence of associated left members. As the rules are interreduced, $W_i$ is not a subword of $W_j$, $j \neq i$; moreover, all proper subwords of $W_i$ are $R_\infty$-irreducible. Thus, we can extract from $\Sigma$ a strictly length increasing subsequence $\Sigma'$. Let $W_i = a_i V_i$, $W_i \in \Sigma'$, $a_i \in G \cup G^{-1}$; the set of all $V_i$ is an infinite set of $R_\infty$-irreducible words, contradicting the finiteness of $G$. The algorithm must halt.

Now, the question of complexity may be asked about the completion of finite groups. Our experience shows that complex situations exist: the symmetric groups $S_n$ have presentations whose completion gives a set of rules whose cardinality is in $O(|S_n|) = O(n!)$. Moreover, we did not succeed in completing the alternating groups $A_n$. However, the proof of prop. 4.2 may be used to give an upper bound to $|R|$, the number of rules. Let $(\rho_k)_{k=1,\ldots,n}$ be an enumeration of the right members, $\rho_k \neq \rho_l$ if $k \neq l$; then $n \le |G|$. If $L_k = \{\lambda_j \,/\, \lambda_j \to \rho_k \in R\}$, then $\sum_k |L_k| = |R|$. Let
$M_k = \{\mu_j \,/\, \exists a_j \in G \cup G^{-1}$ s.t. $a_j \mu_j = \lambda_j \in L_k\}$;
we may suppose $\mu_i \neq \mu_j$ if $i \neq j$, otherwise $a_i \mu_i =_G a_j \mu_i =_G \rho_k$ implies $a_i =_G a_j$ while $a_i \neq a_j$, as the rules are interreduced. But this means that a generator was redundant. Eliminating such cases, we get $|L_k| = |M_k|$; as the words in $M_k$ are irreducible, we have $|M_k| \le |G|$. Thus,
$|R| = \sum_{k=1}^{n} |L_k| \le \sum_{k=1}^{n} |G| \le |G|^2$,
and we get $|R| \le |G|^2$. But this upper bound may be improved. Fix one $\mu_i \in M_k$; then $\mu_i$ appears in at most $|G \cup G^{-1}|$ sets $M_l$ (twice the number of generators), otherwise two distinct rules would have the same left member. As the number of $\mu_i$ is bounded by $|G|$, we have $\sum_k |M_k| \le |G \cup G^{-1}| \cdot |G|$, where in $|G \cup G^{-1}|$ the letter $G$ denotes the set of generators and in $|G|$ the group. In other words, $|R|$ is at most twice the number of generators times the order of the group; as in practice the number of generators is very small, this inequality tells us that we get the multiplication table of the group $G$ in $O(|G|)$ rules. However, this bound is still large, as one can see in § 7. For infinite groups, we would like some criteria restricting the number of superpositions. In the following section, we show that the symmetrization is in some cases powerful enough to solve the word problem.

4.2. Symmetrization
The first step is to observe that, in groups, the word problem is equivalent to the question: does a given word belong to the congruence class of the identity? Indeed $W = W' \Leftrightarrow WW'^{-1} = 1$. Then, we can search for hypotheses s.t. the symmetrization answers the identity class membership question. However, this study needs some word combinatorics beyond the scope of the present paper; for a detailed presentation see [Che83a]. First, some definitions:
A cyclically reduced word (c.r.w.) is a word all of whose cyclic permutations are G-reduced.
A relation is a c.r.w. congruent to identity.
A symmetrized set of relations (s.s.r.) is a set of words including the relations $R$, the inverses $R^{-1}$, and all their cyclic permutations.
A piece is a common prefix of two distinct relations.
A group presentation is henceforth a set of generators and a set of defining relations to which we may associate a s.s.r.
A presentation satisfies $C(n)$, $n \in \mathbb{N}$, iff every piece $P$ of the s.s.r. of the defining relations is s.t. $|P| < \frac{1}{n}|W|$, the piece $P$ belonging to the relation $W$.
For example, $(a,b,c;\ abc,\ a^{[\ldots]},\ b^{5},\ c^{[\ldots]})$ is a presentation of a polyhedral group; it satisfies $C(2)$, but not $C(3)$: $a$ is a piece between the first two relations and $|a| = \frac{1}{3}|abc|$.
The first step consists in a detailed analysis of the symmetrization algorithm under the hypothesis of length decreasing rules:
Hypothesis 1: If $U \to V$ is a rule, then $|U| \ge |V|$.
The reason for this hypothesis lies in the fact that the present study is a detailed analysis of successive reductions, which is impossible when the words length increases by the replacements . Then. the next proposition summarizes the symmetrization's behaviour: Proposition 4.3 Givrm a group presentation G, the symmetrization always terminates. Let r be the set of rules computed, then if U....VEr : • The words UV-l • U-1V, VU- l and y-lU are relations. -If IUI+IVI=2p+1 then jUI =p+1.IVI=p. -If IUI+IVI=2p then lul=p=!VI or /UI=p+1.!VI=p-l. If Gsatisfies C(2) then. -The set S={ UV-1.VU-l/U....VErf is the s.s.r, of defining relations . • IfPQES and IPI>IQI thrmP....;Q. - A non confluent critical pair results from a superposition on a piece between two relations. Let us remember that the algorithm takes a nootherian ordering as input. so that the reductions always terminates, it is of course always possible to find such an ordering: for example a lexicographical one based on length and a total ordering of generators. Thus, as the reduction is ncetherian and only a finite number of words can appear in rules. it follows that the symmetrization halts in a finite number of steps. The remaining assertions follows from elementary manipulations on words. Now, the name of symmetrization appears clearly, this algorithm brakes the defining relations of a finite group presentation by computing what is called in literature a s.s.r . [Lyn77a]. As example. the fundamental group of a double torus, defined by (A,B,C,D;ABCDA-1B-1C-1D-l) . gives the following r set: DCBA -. ABCD BCDA ·1 -4 A-IDCB B-IA-IOC-. CDA-1B-I DA-IB·Ie- 1 -4 e-IB-IA -io D -1C ·1B-IA-I -. A-IB-IC-ID -I B·le-lD-IA -4 AD -Ie-IB-I BAD-IC-I -4 C-ID-IAB D-IABC -4 CBAD-l
In Prop. 4.3, the two words in the definition of $S$ are necessary: in the previous example, the defining relation is of type $VU^{-1}$ by rule 1, but not of type $UV^{-1}$, while $BCDA^{-1}B^{-1}C^{-1}D^{-1}A$ is of the second type by rule 2, but not of the first type. Note that we have made the assumption that $G$ satisfies $C(2)$. As for the length hypothesis, this is necessary in order to localize the possible reductions of a word.

The second step is an analysis of the critical pairs associated to a piece. Thus, we have two rules whose left members have the non-empty word $B$ as suffix and prefix respectively; $B$ is a subword of a piece, say $ABC$, between the distinct relations computed from these rules according to Prop. 4.3. That is, we have the configuration C:
$k: \alpha AB \to \beta C^{-1}$
$l: BC\gamma \to A^{-1}\delta$
$B \neq 1$ and $\beta^{-1}\alpha \neq \gamma\delta^{-1}$.

The superposition of rules $k$ and $l$ in $B$ gives the c.p. $(\alpha AA^{-1}\delta,\ \beta C^{-1}C\gamma)$, and after G-reductions, $(\alpha\delta,\ \beta\gamma)$. Thus, we get an elementary point of divergence in the reductions. But this is false in reductions of words congruent to identity: the word $\alpha\delta\gamma^{-1}\beta^{-1}$ is $\Gamma$-reducible under hypothesis $C(2)$, for in this case $|\delta\gamma^{-1}| > |ABC|$, which proves the $\Gamma$-reducibility by Prop. 4.3. In fact, the two words of the c.p., although G-irreducible, may be $\Gamma$-reducible, and we do not want this possibility: in this case we cannot assert anything about the c.p. Let us examine this eventuality. As the rules are interreduced, there exist a rule $m$ with left member $\mu\nu$, and words $\beta'$, $\gamma'$ with $\beta = \beta'\mu$ and $\gamma = \nu\gamma'$. Now, suppose that the word $\mu$ is not a piece between rules $k$ and $m$; we can then write an identity in $(G \cup G^{-1})^*$ between the associated relations, beginning $C^{-1}B^{-1}A^{-1}\alpha^{-1}\beta' = \nu$ […]. But this identity implies that the left member in $l$ would be G-reducible in the subword $C\nu$, contradicting the irreducibility of the members of the rules. Thus the words $\mu$ and $\nu$ are pieces. Clearly, this is impossible if we suppose the:

Hypothesis 2: All presentations satisfy $C(4)$.

We have just proved the

Lemma 4.4 Under Hyp. 2, all critical pairs are, possibly after some G-reductions, in $G\Gamma$-normal form.

We gave the proof of this lemma because it provides a good and succinct example of the demonstrations of the next lemmata: an assumption on the existence of a reduction gives a configuration that is impossible under our precise hypotheses or under the general features of a rewriting system, the identification of relations being constantly used. Indeed, we know the canonical form of the c.p. The main part of this second step is to examine carefully the reduction of a c.p. in a context:
Definition 4.5 A context is a $G\Gamma$-irreducible word $\mu M$ (left context) or $T\tau$ (right context) that reduces a member of a c.p. when concatenated to it, $M$ (resp. $T$) giving a possible G-reduction, and $\mu$ (resp. $\tau$), proper prefix (resp. suffix) of the left member of a $\Gamma$ rule, giving a $\Gamma$-reduction. For example, if $\mu\lambda \to \rho \in \Gamma$, we have the following reductions:
$\mu M\beta\gamma \to^*_G \mu\lambda\chi \to_\Gamma \rho\chi$.

The words $\beta\gamma$ and $\alpha\delta$ being symmetric, we restrict our attention to the reductions of $\beta\gamma$, searching for confluence conditions in the reductions of $\mu M\beta\gamma$ and $\mu M\alpha\delta$ (resp. $\beta\gamma T\tau$ and $\alpha\delta T\tau$). Then, we get two configurations:
L
k : aAB -'> M- 1fJ-IV11(i'C-l k : aAB 4 1: BC-y'Gl11"IT-l 4 1 : BCery' -'> A- 1o m: fJ-fJ-l 4 vpv; R m: '11" -'> n : p(i'a n : fJ--y'v -'> pieces: {J', ABC, fJ-IV11, G, p. pieces: v: ABC, fJ-,
4,
Their signification appears in the following proposition:
{J'fJ-C- 1
A- 1o alVa p
0'1 11"1. V.
Proposition 4.6 In the reduction of $\mu M\beta\gamma$ (resp. $\beta\gamma T\tau$), we have
Case 1: $\beta$ (resp. $\gamma$) is absorbed by $M$ (resp. $T$) in a G-reduction; then, in the context, the c.p. is confluent.
Case 2: There is no piece between rule $k$ (resp. $l$) and $m$ (resp. $n$); then, in the context, the c.p. is confluent.
Case 3: The reduction of $\mu M\beta\gamma$ (resp. $\beta\gamma T\tau$) gives a word $\nu\beta'\gamma$ (resp. $\beta\gamma'\sigma$) that is G-irreducible and $\Gamma$-reducible only by the rule $n$ in the configuration L (resp. R).

The interesting facts of this proposition are: 1) many pieces exist in the configurations; 2) in case of non confluence, the words $\beta'$ or $\gamma'$, being pieces, are not equal to 1; in other words the second component of the original c.p. has not been affected by the reductions. Under this tricky result lies in fact the heart of what is called small cancellation theory, as we are now going to explain. The last step in our study needs two assumptions:

Hypothesis 3: The set $\Gamma$ is s.t. the rules $n$ in R and L do not exist.
Assumption 4: $\exists W,W'$ s.t. $W \to^*_{G\Gamma} 1$, $W \to^*_{G\Gamma} W'$ and $W' \neq 1$ is $G\Gamma$-irreducible.

Noetherianity and finiteness of $\Gamma$ imply that the set
$\Theta(W) = \{Z \mid W \to^*_{G\Gamma} Z,\ \exists Z_1,Z_2$ s.t. $Z \to_\Gamma Z_i$, $1 \notin \mathrm{Irred}(Z_2)$, $\mathrm{Irred}(Z_1) = \{1\}\}$,
where $\mathrm{Irred}(V)$ is the set of all $G\Gamma$-irreducible forms of $V$, is non-empty. A minimal word $Y$ in $\Theta(W)$ with respect to the reduction ordering is then of the form $U\alpha ABC\gamma V$ with $U,V$ $G\Gamma$-irreducible and $U = \mu_n M_n \cdots \mu_1 M_1$, $V = T_1\tau_1 \cdots T_m\tau_m$, where the $\mu_i M_i$ and $T_j\tau_j$ are those from Def. 4.5. This word $Y$ is a minimal point of strong divergence in the reductions from $W$; strong because our choice of $\Theta(W)$ is such that after two reductions of a word $Z$ in $\Theta(W)$ we get two totally disconnected reduction dags; and this divergence implies that the reductions are done in a non-confluent c.p., that is on a piece. We have the following scheme:
$Y_1$ reduces only on 1, $Y_2$ can't reduce on 1, i.e. there is no confluence of $Y_1$ and $Y_2$.

Then, as $\mathrm{Irred}(Y_1) = \{1\}$, we may choose a precise chain of reductions starting at $Y_1$. Under the hypothesis 3, and because of the non-confluence of $Y_1$ and $Y_2$, Case 3 in Prop. 4.6 details the first reductions of $Y_1$: $Y_1 \to^*_{G\Gamma}$ […].

We would like to prove the falsity of assumption 4; we are done if we can prove that $Y_1$ reduces to a non-empty $G\Gamma$-irreducible word. The shortest way to achieve this goal is to reduce $Y_1$ in such a manner that a propagation of the irreducible words $\beta'_i$ and $\gamma'_j$ of Prop. 4.6 appears, according to the structure of $U$ and $V$. And in fact, this propagation is possible as explained above: $Y_1 \to^*_{G\Gamma} \beta'_{n-1} \cdots \beta'_1 \gamma'_1 \cdots \gamma'_{m-1} = Y_3$. Is this last word reducible? In looking for a reduction of $\beta'_1\gamma'_1$, we get the following last configurations:
k : cxAB ... M- 1IL11l1 1P'C- 1 1 : BCry"allllT-l ... A- 16 ... 11111 m: J.4.Ll ... ala n : 111
"'cu tll3'et2 pieces : ABC, p'. e. IL11l1 1, allll '
p:
k : cxAB ... M-1IL11l1 1,B"eC-1 1: BCiall'lT-l ... A- 16 ... 11111 m: ILlLl ... ala
n: '11
q:
tleit2
pieces : ABC, And the corresponding
Hypothesis 5: exist.
r
i,
...
'"
s, IL11l1 1. a1111_
LI,P, RR,P for (!J3'j-l and i j-l'1'j' So that, we assert the
is s.t. for all Y. the configurations IJI}-2, LHf·2, and RKf·2 do not
Consequently, $Y_1$ reduces to $Y_3 \neq 1$, which is $G\Gamma$-irreducible, but this contradicts the definition of $Y$; hypotheses 3 & 5 imply the falsity of assumption 4, and:
$W \to^*_{G\Gamma} 1 \Rightarrow \mathrm{Irred}(W) = \{1\}$.
The conclusion follows from an elementary result of group theory, a direct consequence of the definition of the normal closure: if $W$ belongs to the congruence class of the empty word, then there exists $W'$ of the form $\prod_i T_i R_i T_i^{-1}$, where $R_i \in S$, the s.s.r. of the defining relations, obtained from $W$ by insertion and deletion of subwords $aa^{-1}$ or $a^{-1}a$, $a \in G$, i.e. by G-reductions. Obviously, $W' \to^*_{G\Gamma} 1$, and the following scheme illustrates the last step of our study:
[Figure: reduction diagram relating $W$, the intermediate words $W_k$ and 1 by G- and $G\Gamma$-reductions.]
Theorem 4.7 If, for a presentation of a group $G$ satisfying $C(4)$, there exists a set of rules $\Gamma$ satisfying Hyp. 3 & 5, then
$W =_G 1 \Rightarrow W \to^* 1$.
Thus, the nice properties of $\Gamma$ (cf. prop. 4.3) allow, in the study of a restricted confluence over 1, the localization of a strong divergence, and then exhibit the configurations that may prevent this confluence. However, Hyp. 5 is quantified over all $Y$ words. A finite condition is used in small cancellation theory, which uses, together with the condition $C(n)$, the following one:
T: $\forall R_1,R_2,R_3 \in S$ s.t. $R_i, R_{i+1}$ and $R_3, R_1$ are not inverses, one of the products $R_1R_2$, $R_2R_3$ and $R_3R_1$ is G-irreducible.

Corollary 4.8 (Dehn) If a group satisfies $C(6)$, or $C(4)$ and T, then it has a solvable word problem.

In the case $C(6)$, Hyp. 3 is satisfied because the rules $n$ cannot exist: the left member, a concatenation of three pieces, would be shorter than the right one. On the other hand, $C(4)$ is Hyp. 2, and for example the configuration L gives the triple ({3'-l vIJ.LI IMaABC , C-IB-IA-Io-yo-Ja-I , aT- I p{3') in contradiction with T, so that Hyp. 3 is satisfied in both cases. LR~ gives the triple (p.-11l1J.LI IMlXABC, C-IB-1A-loTTllaa,,-1t;-1, t;~2(,)-1~Jf3'), impossible if T holds; while $C(6)$ implies that Mcx "'GrJ.L'II'-lp'C-IB- IA-I, i.e. the words $Y_1$ and $Y_2$ are confluent: under $C(6)$ the configuration LR and others may exist, but are not associated to a word $Y$; Hyp. 5 holds in both cases. The group $(A,B,C;\ ABC = CBA)$ gives a counter-example and an example. Indeed, it gives two $\Gamma$ sets. The first one is:
r
CBA ... ABC A-ICB -s BCA-1 B-1A-1C CA-1B- I C-1B-1A-1 A-1B-IC-l B-IC-IA AC-1B- l C-lAB BAC-l
ABCA- l BCA-1B- l CA-1B-1C-1 BAC-IB-I
CB KIC B-1A-1 C-IA
Then the rules 2,B,7,9 and Bin this order are a LRy configuration with Y = ABABABCA-ICBCA-IB-IA-IB-IC-IB-IC-IWIC-I and Irred(Y)
=11, ABABABCBCA-1CA-IWIA-IB-IC-lWIC-IB-IC-1 f·
Thus this $\Gamma$ set does not solve the word problem. The reader may check that it has four L and four R configurations, equivalent modulo a permutation of the letters; however, they do not provide non-confluent words. The other $\Gamma$ set is:
$\Gamma$:
$ABC \to CBA$
$A^{-1}CB \to BCA^{-1}$
$CA^{-1}B^{-1} \to B^{-1}A^{-1}C$
$C^{-1}B^{-1}A^{-1} \to A^{-1}B^{-1}C^{-1}$
$B^{-1}C^{-1}A \to AC^{-1}B^{-1}$
$BAC^{-1} \to C^{-1}AB$
This symmetrized set is in fact canonical; consequently it satisfies theorem 4.7. This group is not $C(6)$, but $C(5)$, and the triple $(AC^{-1}B^{-1}A^{-1}CB,\ B^{-1}A^{-1}CBAC^{-1},\ CBAC^{-1}B^{-1}A^{-1})$ shows that T does not hold. Thus, this group is not a small cancellation one, while the present proof applies to it.

Historically, this corollary appears first with Dehn [Deh12a], about the fundamental groups of surfaces. The algorithm used to solve the word problem just replaces every subword longer than half a member of a s.s.r. by the remaining part of the relation. Usual proofs use graphs associated to words and Euler's formula [Lyn77a]. Greendlinger [Gre60a] proved it using only G-reductions. Bücken [Buc79a] initiated the present approach, while our work details the symmetrization and, by localization, exhibits the basic configurations. The main advantage of the present proof is, by localization of a strong divergence in reductions, to give structural conditions more general than the usual metric ones.
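Dehn's algorithm as described above is short enough to sketch. The following is an illustration under assumptions, not the paper's implementation: words are strings in which case-swapped letters are mutually inverse, the helper names are hypothetical, and the presentation used in the example is the double torus of Section 4.2 (`ABCDabcd` encodes $ABCDA^{-1}B^{-1}C^{-1}D^{-1}$). Any subword that is more than half of a member of the s.s.r. is replaced by the inverse of the remaining part, followed by free reduction.

```python
def inv(word):
    """Inverse word: reversed, with every letter inverted (case swapped)."""
    return word[::-1].swapcase()


def free_reduce(word):
    """Cancel adjacent inverse letters (free-group normal form)."""
    stack = []
    for x in word:
        if stack and stack[-1] == x.swapcase():
            stack.pop()
        else:
            stack.append(x)
    return "".join(stack)


def ssr(relations):
    """Relations, their inverses, and all cyclic permutations."""
    out = set()
    for r in relations:
        for w in (r, inv(r)):
            out |= {w[i:] + w[:i] for i in range(len(w))}
    return out


def dehn(word, relators):
    """Repeatedly replace a subword P, more than half of some relation
    P.Q of the s.s.r., by Q^-1 (since PQ = 1), then freely reduce."""
    s = sorted(ssr(relators))
    word = free_reduce(word)
    changed = True
    while changed:
        changed = False
        for r in s:
            # prefixes strictly longer than half of r, longest first
            for k in range(len(r), len(r) // 2, -1):
                if r[:k] in word:
                    word = free_reduce(word.replace(r[:k], inv(r[k:]), 1))
                    changed = True
                    break
            if changed:
                break
    return word


if __name__ == "__main__":
    print(repr(dehn("DABCDabcdd", ["ABCDabcd"])))   # a conjugate of the relator
    print(repr(dehn("ABab", ["ABCDabcd"])))
```

Each replacement strictly shortens the word, so the loop terminates; a non-empty result means no further shortening step applies.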
5. ABELIAN CASE

5.1. Semigroups and Monoids
In the non-commutative cases, we examined the normal forms under the canonical system of the variety. Now, this is achieved by the associative-commutative equivalence of terms. We pointed out in § 2 that the commutative law cannot be oriented into a rule without losing noetherianity. But a coherent theory of AC rewriting and completion has been developed [Liv76a, Sti81a, Hul80a, Fag84a]. All this background is assumed in this section; we give the main applications in finitely presented commutative algebras.

A term of an abelian semigroup or monoid on the set $G = \{g_1, \ldots, g_p\}$ is, in accordance with the previously mentioned theory and with the usual notation, the flat structure $g_{i_1} + \cdots + g_{i_l}$, one of whose AC-equivalent terms is $t = g_{i_1} + (g_{i_2} + (\cdots + g_{i_l})\cdots)$. As we did in the non abelian case, we now choose a new data structure. In AC-unification, an ordering of the flat structure is taken from a total ordering of the operator domain. Such a total ordering on $G$ gives us the new structure: we represent terms by abelian words $(a_1, \ldots, a_p)$ in $\mathbb{N}^p$, $a_j$ being the number of occurrences of $g_j$ in $t$. In other words, we choose for the free semigroup on $G$ (resp. monoid) the concrete representation $\mathbb{N}^p - \{0\}$ (resp. $\mathbb{N}^p$). This deduction may be judged too formal, as we find the trivial model of free semigroups, but we think that the present paper justifies this technical and formal work. We also need some definitions:
Abelian word addition, component by component, deduced from the AC-term structure.
The partial order $\ll$: $M = (a_1, \ldots, a_p) \ll N = (b_1, \ldots, b_p)$ iff $a_i \le b_i$, $i = 1, \ldots, p$, where $\le$ is the ordering on integers.
$\max(M,N) = (\max(a_1,b_1), \ldots, \max(a_p,b_p))$ (resp. $\min$).
The ordering $\ll$ is noetherian and compatible with the addition. The embedding rules now replace the $\beta$-rules of § 2; their variables allow the subterm reduction.
Once more, the completion in monoids has the usual control structure with the following definitions:
A term is an abelian word.
The word $M$ matches $N$ iff $M \ll N$; the matching is $N - M = (b_1 - a_1, \ldots, b_p - a_p)$.
$M$ reduces in $N$ by the rule $\lambda \to \rho$ iff $\lambda \ll M$, and then $N = (M - \lambda) + \rho$.
The rules $\lambda \to \rho$ and $\sigma \to \tau$ superpose iff $\min(\lambda,\sigma) \neq 0$, and we may consider only one critical pair: $P = (\max(\lambda,\sigma) - \lambda + \rho,\ \max(\lambda,\sigma) - \sigma + \tau)$, the others being confluent if $P$ is introduced as a rule, i.e. they are less general.
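The abelian-word definitions translate almost literally into code. The sketch below is an illustration only; it assumes abelian words are tuples of non-negative integers of equal length, and all function names are hypothetical. The critical pair is obtained by applying each rule to $\max(\lambda,\sigma)$ with the reduction formula $N = (M - \lambda) + \rho$.

```python
def leq(m, n):
    """M << N: componentwise comparison, i.e. M matches N."""
    return all(a <= b for a, b in zip(m, n))


def add(m, n):
    return tuple(a + b for a, b in zip(m, n))


def sub(m, n):
    return tuple(a - b for a, b in zip(m, n))


def vmax(m, n):
    return tuple(max(a, b) for a, b in zip(m, n))


def reduce_once(word, rules):
    """Apply the first rule lhs -> rhs with lhs << word: (word - lhs) + rhs."""
    for lhs, rhs in rules:
        if leq(lhs, word):
            return add(sub(word, lhs), rhs)
    return None


def critical_pair(rule1, rule2):
    """Single critical pair of two rules, or None when min(lhs1, lhs2) = 0."""
    (l1, r1), (l2, r2) = rule1, rule2
    if not any(min(a, b) for a, b in zip(l1, l2)):
        return None                      # left members share no generator
    top = vmax(l1, l2)                   # smallest word matched by both
    return add(sub(top, l1), r1), add(sub(top, l2), r2)


if __name__ == "__main__":
    # Generators (a, b, c): rules 2a + b -> c and a + 2b -> a.
    k = ((2, 1, 0), (0, 0, 1))
    l = ((1, 2, 0), (1, 0, 0))
    print(reduce_once((3, 1, 1), [k]))
    print(critical_pair(k, l))
```

In the example the word $2a + 2b$ reduces to $b + c$ by the first rule and to $2a$ by the second, which is the pair returned.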
In the variety of semigroups, we have the following result [Cli67a], obtained by induction using $\ll$:

Theorem 5.1 Every finitely generated commutative semigroup is finitely presented.

In other words, in an infinite set of abelian words of finite length, the set of $\ll$-minimal words is finite. Then, observing that all abelian words introduced by superposition are not greater than the common max of the defining equations, we have [Bal79a]:

Proposition 5.2 On a finite presentation, the semigroup completion terminates. On a finitely generated semigroup, defined by an infinite set of equations, the completion computes in an infinite time a finite canonical set of rules.

This result is of course true for monoids. In other words, we find the well-known fact that the uniform word problem is solvable in these varieties.
5.2. Groups
The canonical system for abelian groups is

GA:
GA1: $x + 0 \to x$
GA2: $x + (-x) \to 0$
GA3: $x + (-y) + y \to x$
GA4: $-0 \to 0$
GA5: $-(-x) \to x$
GA6: $-(x+y) \to (-x) + (-y)$
The previous structure of abelian words is still used. But now, with the inverse operator, we work in the free abelian group $\mathbb{Z}^p$. The rule GA3, embedding of GA2, may be used via superpositions to control the behaviour of the completion in abelian groups. For example, the rule $a + b \to c$ superposed on GA3 with the substitution $\sigma = \{(x,a),(y,b)\}$ gives the c.p. $(a,\ c - b)$, so that we may restrict the left members to only one generator. Consequently, there is no reason to keep the waiting equations as a list E of abelian word pairs rather than a list of abelian words, each being the difference between the pair members. Now, in step 3 of the completion, the choice of a word to create a new rule $k$ is based on the minimal coefficient $a_i$ in absolute value among those in E:
$k: |a_i|\, g_i \to \sum_{j \neq i} \varepsilon a_j g_j$, $\varepsilon$ = opposite sign of $a_i$.
Then we look for the next minimum, but this minimum will decrease if all waiting words are reduced by the rule $k$. Also, all words in E are reduced. If we loop on this operation, there exists a step such that the last generator introduced as left member, say $g_j$, has disappeared from E. Note that when a left member is reducible, the associated rule is reintroduced as a word in E. Then we get two cases:
$|a_j| = 1$: the generator $g_j$ was redundant.
$|a_j| \neq 1$: again here we have two cases. The simpler one occurs when the right member is null, that is, the generator $g_j$ generates the cyclic group $\mathbb{Z}/|a_j|\mathbb{Z}$. In the other one, with GA3, we create the rule:
$k': |a_j|\,(g_j - \sum_{i \neq j} c_i g_i) \to \sum_{i \neq j} d_i g_i$
with $b_i = c_i * \varepsilon a_j + d_i$, the integer division of the right coefficients by the left one. This operation implies the interesting fact that all $d_i$ are now smaller than $a_j$ in absolute value, so that the minimum in E will decrease if the rule $k'$ is of the same type as the other ones, i.e. if we introduce a new generator $g'_j$:
$l: g_j \to g'_j + \sum_{i \neq j} c_i g_i$.
Then the rules $k$ and $k'$ are reduced; they give the same word, reintroduced in E, and the minimum decreases. The algorithm necessarily halts: when a new rule is created, the minimum in E decreases strictly or a generator disappears from E. This study shows that, for a given generator $g$, original or not, only three cases may happen:
1) It does not appear in left members; it generates an infinite cyclic group as $n*g$ is in canonical form, $\forall n \in \mathbb{N}$. Let $G_1$ be the set of all these generators.
2) It appears in a left member (such a rule is unique), with a coefficient equal to one. Then it was dependent on the remaining generators.
3) It appears in a left member with a coefficient $c > 1$. The right member is then null and the generator $g$ generates a finite cyclic group $\mathbb{Z}/c\mathbb{Z}$. Let $G_2$ be the set of those generators.
We have proved the

Theorem 5.3 The group $G$ is the direct product of $|G_1|$ infinite cyclic groups and $|G_2|$ finite groups whose cardinals are given by the rewriting system $R$, which reduces a word given in the primitive generators into this cyclic decomposition of $G$.

The reader who is familiar with abelian groups has noticed that the reductions in E and the introduction of a new generator correspond respectively to the elementary row and column operations on the matrix E. The interest of the present proof of abelian group decomposition is that the set of rules $R$ keeps trace of what usually appears as some row and column magic on matrices. In fact, the completion halts without superpositions between ground rules. To conclude this section, let us say that the abelian monoid completion was first studied by J.M. Hullot and G. Huet (unpublished manuscript), and D.A. Smith [Smi66a] published a detailed algorithm for abelian group decomposition by which the present section is inspired.

6. RINGS, MODULES and ALGEBRAS
6.1. Rings
The canonical system for non-abelian rings is [Hul80a]:

AU:
$x + 0 \to x$
$(-x) + x \to 0$
$x + (-y) + y \to x$
$1.x \to x$
$x.1 \to x$
$(x.y).z \to x.(y.z)$
$x.(y+z) \to (x.y) + (x.z)$
$(x+y).z \to (x.z) + (y.z)$
$-0 \to 0$
$-(x+y) \to (-x) + (-y)$
$-(-x) \to x$
$x.0 \to 0$
$0.x \to 0$
$x.(-y) \to -(x.y)$
$(-x).y \to -(x.y)$
The rules are partitioned in two sets: the right ones are needed in the symmetrization, while the left ones only define the new data-structure. The AU-canonical forms are the usual ones for polynomials, i.e. the orientation of the distributive law expresses the polynomial linearization. Thus, the data-structure is the composition of the two previous ones: first we have an abelian group, so that the first level of structure will be abelian words; second, this abelian structure is infinitely generated by the words from $G^*$. The order on $G$ previously used in abelian words is now replaced by the lexicographic ordering on monomials from $G^*$. We can speak of the heading monomial. From the addition structure, we get a symmetrization, but a still more complex one than for abelian groups. However we can choose the heading monomial as left member of the rules. And, still with the rule AU9, the symmetrization gives the rules:
$k: (p+1)m \to (-p)m + \sum_j \beta_j m_j$, $p \in \mathbb{N}$
$l: (p+1)m \to (-p+1)m + \sum_j \beta_j m_j$, $p \in \mathbb{N}^*$
according to the parity of $m$'s coefficient. Then we have two cases for the embedding rules:
1) If $p = 0$, then we have the classical embedding with associativity (cf. § 3).
2) Otherwise, two c.p. with the distributive rules create the new $\beta$-rules:
$k': (p+1)xmy \to (-p)xmy + \sum_j \beta_j (x m_j y)$, $x,y \in V$ (resp. $l'$).
This last rule allows the reduction of a monomial such as $k.m'mm''$, where $k > p$, in a polynomial $P$. Such a set of rules is symmetrized: all critical pairs between them and AU are solved. Of course, in ring completion as in the other ones, we only need the rules of type $l$ or $k$. The next lemma details the c.p.:

Lemma 6.1 The two rules
$k: \alpha U \to P$
$l: \beta V \to Q$,
$\alpha,\beta \in \mathbb{N}^*$, $U,V \in G^*$, $P,Q \in$ […],
superpose iff $\exists A,B,C$, $B \neq 1$, $AB = U$, $BC = V$; the c.p. is then
$((\max(\alpha,\beta) - \alpha)\,ABC + PC,\ (\max(\alpha,\beta) - \beta)\,ABC + AQ)$.

The description of ring completion is then achieved. The algorithm may not terminate. Observe that we chose a precise symmetrization; the following canonical system shows that other more complex situations exist:
$\Delta$:
$A + B \to AC + D$
$-A + -B \to -D + -(AC)$
$D + -A \to B + -(AC)$
$A + -D \to -B + AC$
$B + -D \to -A + AC$
Here is an open question: is there a canonical system such as I:i. for which no symmetrized canonical system exist? In other words, is our symmetrization choice, reasonable for the complexity point of view, complete or uncomplete? We briefly talk about abelian rings. The data-structure is now two superposed abelian word structures. Thus, the completion is deduced from the previous and monoid ones. We have a symmetr-ization that enables the choice of the
159
left member. Taking a degree and lexicographical ordering, this left member may be the heading monomial. Then, as for the abelian monoids, we compute the c.p. with respect to the min-max functions on monomials:
Lemma 6.2 The two rules
k : $\alpha U \to P$
l : $\beta V \to Q$,
with $\alpha, \beta$ positive integers, $U, V$ monomials, and $P, Q$ polynomials with integer coefficients, superpose iff $\min(U,V) \neq 1$, the null degree monomial; the c.p. is then
$((\max(\alpha,\beta)-\alpha)\max(U,V) + [\max(U,V) \div U] \cdot P,\ (\max(\alpha,\beta)-\beta)\max(U,V) + [\max(U,V) \div V] \cdot Q)$.
In this lemma, as for abelian monoids, a new operation $\div$ appears, corresponding to a reduction step. This operation is of course associated to the division modulo an ideal: the reduction computes an equivalent term modulo a sub-structure (sub-monoid, normal subgroup, or two-sided ideal in the present case). As the coefficients and monomials introduced by the superposition are not greater than the common max in the defining equations, the algorithm halts in a finite number of steps. Of course, other reduction orderings may be used.
6.2. Modules and Algebras
These algebraic structures are generalizations of abelian groups (Z-modules) and rings (Z-algebras). In consequence, we just point out the main differences. First of all, the scalar ring is supposed to have a canonical system: the normal form of a.m, where a is a scalar, needs the normal form of a. In fact, this assumption is not necessary, as we have a surjective map from the scalar ring into the module, not always an injective one, but this hypothesis costs nothing, while the explanations will be clearer. Then, the data-structures are the same as previously, where scalars now replace integers. The superposition between two rules occurs on the scalars, on the monomials, or on both. The scalar match and reduction are done as described in § 6.1, in place of the max-min operations on integers. Module completion always terminates: fix one generator g, create a new rule whose left member is a.g until g disappears from the waiting equations, and so on until the last generator. This is the same control structure as in the abelian group completion. Let us note that more than one rule with g in its left member may exist, and that if the scalar ring has sufficient properties, the algorithm may be improved; for example, on a division ring, the abelian group algorithm may be used, the minimum being defined by the division absolute value.
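Returning to Lemma 6.2, the critical pair of two abelian-ring rules can be computed mechanically. The Python lines below are only a sketch under assumptions that are not in the text: monomials are represented as tuples of exponents over a fixed list of generators, polynomials as dictionaries from monomials to integer coefficients, a rule $\alpha U \to P$ as a triple, and all function names are hypothetical.

```python
# Sketch of the critical pair of Lemma 6.2 (abelian rings).
# Assumed representation (not from the paper): a monomial is a tuple of
# exponents, a polynomial is a dict {monomial: integer coefficient},
# and a rule alpha*U -> P is the triple (alpha, U, P).

def mono_min(u, v):
    """min(U, V): componentwise minimum of exponents (common part)."""
    return tuple(min(a, b) for a, b in zip(u, v))

def mono_max(u, v):
    """max(U, V): componentwise maximum of exponents."""
    return tuple(max(a, b) for a, b in zip(u, v))

def mono_div(m, u):
    """Quotient of m by u, assuming u divides m."""
    return tuple(a - b for a, b in zip(m, u))

def times_mono(poly, mono):
    """Multiply every monomial of poly by mono."""
    return {tuple(a + b for a, b in zip(m, mono)): c for m, c in poly.items()}

def add(p, q):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
        if out[m] == 0:
            del out[m]
    return out

def critical_pair(rule_k, rule_l):
    """Return the c.p. of the two rules, or None if they do not superpose."""
    (alpha, U, P), (beta, V, Q) = rule_k, rule_l
    if not any(mono_min(U, V)):      # min(U, V) is the null degree monomial
        return None
    M = mono_max(U, V)
    top = max(alpha, beta)
    head = lambda c: {M: c} if c else {}
    left = add(head(top - alpha), times_mono(P, mono_div(M, U)))
    right = add(head(top - beta), times_mono(Q, mono_div(M, V)))
    return left, right

# Generators (x, y, z); illustrative rules 2.xy -> x and 3.yz -> z.
k = (2, (1, 1, 0), {(1, 0, 0): 1})
l = (3, (0, 1, 1), {(0, 0, 1): 1})
pair = critical_pair(k, l)
```

With these two illustrative rules, the common max is 3.xyz, so the first component keeps one copy of xyz next to xz, while the second component is xz alone.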
Finally, in algebras over a ring, the completion runs as well. In the abelian case, it terminates. In the other case, recall that we chose a precise symmetrization, but other less efficient ways exist. These two cases are studied in the literature. G.M. Bergman [Ber 78a] presents the reductions in non-commutative rings. On an example, he runs the Knuth-Bendix procedure. The reader will find in this paper many applications of the diamond lemma. In 1965, B. Buchberger also discovered the completion of commutative multivariate polynomial rings [Buc81a], providing many interesting examples of polynomial completion. These two authors restrict the left members to unitary monomials. We have seen that this restriction is unnecessary. However, when the scalar ring is the field Q, this restriction can be used to extend the completion: the scalar division is used to keep only unitary monomials in the left members. The reader will observe that it is an
extension of the classical Knuth-Bendix algorithm, as the field theory is not equational ($\forall x \neq 0,\ x x^{-1} = 1$). The computer is however supposed to have a decision procedure on rationals!
7. COMPLETE GROUP PRESENTATIONS
We investigated this variety because no systematic completion had been done for groups, as was the case for other varieties. Some complete systems for classical groups are given. When such a system exists, many others can be found. Also, the one presented has the smallest number of rules. Thus, we prefer to call such systems complete presentations rather than canonical ones, as they are not unique. The defining presentations were found in [Cox72a], the complete ones with a system written in Maclisp. For an exhaustive enumeration of the rules, see our thesis [Che83a].
7.1. Surface Groups
The defining presentation of a p-holed torus is $(A_1, \ldots, A_{2p}\ ;\ A_1 \cdots A_{2p} = A_{2p} \cdots A_1)$. The completion gives the system $T_p$ of 2p rules whose length is 2p:
Alk···AnA i ...A2!
c(y).
4.2 Groups as binary algebras with two axioms
Another interesting exercise in algebra using REVE is to prove that the two following axioms actually present the variety of groups.
(x / x) / ((y / y) / y) == y
(x / z) / (y / z) == (x / y).
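As a sanity check of one direction only, one can verify on a small non-abelian group that the two axioms hold when x / y is read as x·y⁻¹. The Python sketch below does this for the symmetric group S3. The reading of / as right division, the choice of S3, and all function names are assumptions made for illustration; the check says nothing about the converse direction, namely that every model of the two axioms is a group.

```python
from itertools import permutations, product

# Elements of S3 as tuples p, where p[i] is the image of i.
S3 = list(permutations(range(3)))

def mul(p, q):
    """Composition: first apply q, then p."""
    return tuple(p[j] for j in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def div(x, y):
    """Assumed reading of x / y as x * y^-1."""
    return mul(x, inv(y))

def axioms_hold():
    """Check the two axioms on every tuple of elements of S3."""
    ax1 = all(div(div(x, x), div(div(y, y), y)) == y
              for x, y in product(S3, repeat=2))
    ax2 = all(div(div(x, z), div(y, z)) == div(x, y)
              for x, y, z in product(S3, repeat=3))
    return ax1 and ax2

def derived_equations_hold():
    """Check two equations reported in the text, with a(y) = (y / y) / y."""
    a = lambda y: div(div(y, y), y)
    e1 = all(div(div(x, a(y)), y) == div(x, div(z, z))
             for x, y, z in product(S3, repeat=3))
    e2 = all(a(div(y, x)) == div(x, y)
             for x, y in product(S3, repeat=2))
    return e1 and e2
```

Under this reading a(y) is the inverse of y, which is consistent with the rule (x / x) / y -> a(y) suggested during the completion.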
Running KBP on this input, REVE generates an equation of the form
(x / x) / y == (z / z) / y
which suggests the monadic operator a and the rule
(x / x) / y -> a(y).
Later REVE generates the equation
(x / a(y)) / y == x / (z / z)
and the rule
x / (y / y) -> b(x)
and later the rule
b(x) -> x.
To direct the equation
(x / y) / (z / (y / u)) == (x / u) / z
the status of / is set to "right-to-left". But previous choices do not make it possible to direct an equation like
a(y / x) == x / y.
A second iteration is necessary, with a precedence as above. The equation
(z / (u / y)) / (x / u) == z / (x / y)
that cannot be directed is discarded. The equation
x / x == y / y
is generated and the identity is introduced. Eventually a set of equations is generated. A third iteration is made necessary to check the confluence of the set generated previously. The durations of the three iterations of KBP are respectively 3.9 s., 12 s., and 5 s.
4.3 Groups as associative algebras with left and right division
A group can also be presented as an associative binary algebra with a left division and a right division. The axioms are the following:
x * (y * z) == (x * y) * z
x \ (x * y) == y
(x * y) / y == x.
The problem turns out to be not as easy as we first believed. Because we have no reason to prefer one operator to another, we make all the operators equivalent and we set their status to "right-to-left". After about 150 equations, REVE generates the equation x / x == y / y and proposes to introduce a constant that it calls a. A number of rules later it generates the equation
(x \ y) / y == a / x.
That suggests introducing a new monadic operator b and the rule
a / x -> b(x).
At that time REVE sets the precedence to b < a and b < /. Therefore REVE is unable to direct the equation b(y / x) == x / y. Thus after stopping KBP a new precedence is declared, for instance a < / < \ < * and b ~ /, and the status of \ is now set to "left-to-right". KBP terminates and generates the b(x) / b(y).
The first iteration took about 17 minutes and the second 80 seconds. Notice that a precedence like a < * < b < / < \ at the beginning of the second iteration generates the "classical" axioms plus the "definition" of / and \.
5. CONCLUSION
In this paper, we presented REVE experiments in algebra and showed how an iterative KBP can be used in REVE to generate noetherian and confluent term rewriting systems. A case we did not talk about is when the second iteration needs a new unification algorithm because during the first iteration an equation that cannot be directed was generated, for instance the commutativity or an equation like
(x / y) / z == (x / z) / y
that occurs often in presentations of commutative groups [6]. When such unification algorithms are available, we will try this kind of experiment. On the other hand, during a discussion with Gerard Huet we thought that instead of throwing the rules away it could be possible to keep them for further use in the same iteration of KBP. For example, if an equation was eliminated because it was too big, maybe a new rule will reduce it, or a lack of equations will make it necessary to relax the condition on the size. The same holds for equations that cannot be directed: a new rule can reduce one side and allow REVE to direct it. Informally, in REVE the "trash can" should be replaced by a "refrigerator". A new completion procedure based on these ideas was designed and proved by Helene Kirchner (private communication) and Randy Forgaard has implemented these features in REVE 2.
Acknowledgments: I would like to acknowledge my colleagues of MIT, Randy Forgaard, John Guttag, and Jeannette Wing, for their help when I started these experiments [...]. This work was partially supported by the Agence de l'Informatique (grant 82/767) and the Greco Programmation.
[1] BUTLER G., LANKFORD D., "Experiments with computer implementations of procedures which often derive decision algorithms for the word problem in abstract algebras," Louisiana Tech U. Math Dept., Ruston LA 71272, Rept. MTP-7, Aug. 1980.
[2] DERSHOWITZ N., MARCUS L., "Existence and Construction of Rewrite Systems," Memo, Information Sciences Research Office, The Aerospace Corporation, El Segundo, California USA (August 1982).
[3] FORGAARD R., "A Program for Generating and Analyzing Term Rewriting Systems," Master's Thesis, MIT Lab. for Computer Science, Cambridge, Massachusetts USA (1984) (To appear).
[4] HIGMAN G., NEUMANN B. H., "Groups as groupoids with one law," Publ. Math. Debrecen 2 (1952), 215-221.
[5] HUET G., "A Complete Proof of Correctness of the Knuth-Bendix Completion Algorithm," J. Comp. Sys. Sc. 23 (1981), 11-21.
[6] JEANROND H. J., "Deciding unique termination of permutative rewriting systems: Choose your term algebra carefully," 5th Conf. on Automated Deduction, Lecture Notes in Computer Science 87 (1980), 335-355.
[7] JOUANNAUD J-P., LESCANNE P., REINIG F., "Recursive Decomposition Ordering," Conf. on Formal Description of Programming Concepts II, North-Holland (D. Bjorner Ed.) (1982), 331-346.
[8] KIRCHNER H., "Current Implementation of the general completion algorithm," Private Communication.
[9] KNUTH D. E., BENDIX P. B., "Simple Word Problems in Universal Algebras," in Computational Problems in Abstract Algebra, Ed. Leech J., Pergamon Press (1969), 263-297.
[10] LESCANNE P., "Computer Experiments with the REVE Term Rewriting System Generator," 10th POPL Conf., Austin, Texas (1983).
[11] LESCANNE P., "Uniform Termination of Term Rewriting Systems: Recursive Decomposition Ordering with Status," 9th Coll. on Trees in Algebra and Programming (Bordeaux, March 1984).
[12] METIVIER Y., "About the rewriting systems produced by the Knuth-Bendix Completion Algorithm," Info. Proc. Ltrs 16 (1983), 31-34.
Termination of a Set of Rules Modulo a Set of Equations¹
Jean-Pierre Jouannaud², Centre de Recherche en Informatique de NANCY and GRECO Programmation, Campus Scientifique, BP 239, 54506 Vandoeuvre les Nancy CEDEX, FRANCE.
Miguel Munoz², Universidad de Valencia, c/o Dr Moliner, Burjasot, Valencia, SPAIN.
Abstract
The problem of termination of a set R of rules modulo a set E of equations, called the E-termination problem, arises when trying to complete the set of rules in order to get a Church-Rosser property for the rules modulo the equations. We first show here that termination of the rewriting relation and E-termination are the same whenever the rewriting relation used is E-commuting, a property inspired by Peterson and Stickel's E-compatibility property. More precisely, their results can be obtained by requiring termination of the rewriting relation instead of E-termination if E-commutation is used instead of E-compatibility. When the rewriting relation is not E-commuting, we show how to reduce E-termination for the starting set of rules to classical termination of the rewriting relation of an extended set of rules that has the E-commutation property. This set can be classically constructed by computing critical pairs or extended pairs between rules and equations, according to the rewriting relation used. In addition we show that different orderings can be used for the starting set of rules and the added critical or extended pairs. Interesting issues for further research are also discussed.
¹ Research supported in part by Agence pour le Developpement de l'Informatique under contract 82/767 and for another part by Office of Naval Research under contract N00014-82-0333.
² Part of this work was done while the second author was visiting the Centre de Recherche en Informatique de Nancy and another part while the first author was visiting the Stanford Research Institute, Computer Science Laboratory, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
1 Introduction
Term Rewriting Systems (TRS for short) are known to be a very general and major tool for expressing computations. In addition, they have a very simple Initial Algebra semantics whenever the result of a computation does not depend on the choice of the rules to be applied (the so-called Confluence property). When a TRS is not confluent, it can be transformed into a confluent one, using the Knuth and Bendix completion procedure [Knuth and Bendix 70]. The Knuth and Bendix completion procedure is based on using equations as rewrite rules and computing critical pairs when left members of rules overlap. If a critical pair has distinct irreducible forms, a new rule must be added, and the procedure applies recursively until it possibly stops. This procedure requires the termination property of the set of rules, which can be proved by various tools. A full implementation of these techniques is described in [Lescanne 83]. When the termination property is not satisfied, the set of axioms can be split into two parts: those causing non-termination are used as equations while the others are used as rewrite rules. In that case, critical pairs must be computed between rules, and between rules and equations. In addition to a complete unification algorithm for the theory expressed by these equations, the process requires the termination property of the set of rules modulo the set of equations, called E-termination. This means that infinite sequences are not allowed for the relation obtained by composition of the equality generated by the equations with one-step rewriting using the rules. The theory for such completion algorithms was developed successively by [Lankford and Ballantyne 77a], [Lankford and Ballantyne 77b], [Lankford and Ballantyne 77c], [Huet 80], [Peterson and Stickel 81], [Jouannaud 83] and [Jouannaud and Kirchner 84].
The last paper gives a very general completion algorithm for such mixed sets of rules and equations based on two abstract properties called E-Confluence and E-Coherence. Implementers of these algorithms are faced with the problem of proving E-termination, which was never addressed before, except for recent work [Dershowitz, Hsiang, Josephson and Plaisted 83, Plaisted 83]. Those authors propose two techniques for solving the problem in the particular case of associative commutative theories, both based on transformations of terms. Unlike these authors, we do not want to transform terms, because we do not want to carry along many representations of the same term that have to be updated accordingly. However, our general purpose is the same: to reduce the E-termination problem to the ordinary termination problem, in order to apply well-known and powerful techniques based on simplification orderings such as the Recursive Path Ordering [Dershowitz 82] or the Recursive Decomposition Ordering [Jouannaud, Lescanne and Reinig 83]. This reduction can be obtained through a process of transforming left and right hand sides of rules, or by adding new rules to the starting set. The first method proposed in [Dershowitz, Hsiang, Josephson and Plaisted 83] actually uses these two techniques at the same time. Our purpose in this paper is to investigate the power of the second technique in the general case of arbitrary mixed sets of rules and equations and to show how these added rules are related to the so-called extended rules added during the run of an E-completion procedure [Peterson and Stickel 81, Jouannaud and Kirchner 84]. Sections 2 and 3 introduce the classical background about Term Rewriting. E-termination is defined in section 4. We show in section 5 that E-termination and termination of the rewriting relation are the same whenever the rewriting relation used is E-commuting. More precisely, we show that Peterson and Stickel's Church-Rosser results can be obtained by requiring termination for this relation instead of E-termination if E-commutation is used instead of E-compatibility as defined in [Peterson and Stickel 81]. When the rewriting relation is not E-commuting, we show in section 6 how to reduce E-termination for the starting set of rules to termination of the rewriting relation for an extended set of rules that has the E-commutation property. This set can be classically constructed by computing critical pairs or extended pairs between rules and equations, according to the rewriting relation used. In addition, we show that different orderings can be used for ordering on the one hand the starting set of rules, and on the other hand the critical or extended pairs. Finally, we show that Peterson and Stickel's associative-commutative extensions ensure E-commutation as well as E-compatibility, which proves that our approach is well suited to associative-commutative theories.
2 Term Rewriting Systems
Definition 1: Given a set X of variables and a graded set F of function symbols, T(F,X) denotes the free algebra over X. Elements of T(F,X), called terms, are viewed as labelled trees in the following way: a term t is a partial mapping of $\mathbb{N}_+^*$ into $F \cup X$ such that its domain D(t) satisfies: the empty word $\varepsilon$ is in D(t), and $\alpha i$ is in D(t) iff $\alpha$ is in D(t) and $i \in [1, \mathrm{arity}(t(\alpha))]$. D(t) is the set of occurrences of t, $\mathcal{G}(t)$ the subset of ground occurrences (i.e. non-variable ones), $\mathcal{V}(t)$ the set of variables of t, $t|_\alpha$ the subterm of t at occurrence $\alpha$, $t[\alpha \leftarrow t']$ the term obtained by replacing $t|_\alpha$ by t' in t, and #(x,t) the number of occurrences of x in t. A term t is linear iff #(x,t) = 1 for any x in $\mathcal{V}(t)$. □
Definition 2: We call axiom or equation any pair (t,t') of terms and write it t=t'. The equality $=_E$ or $|\!\stackrel{*}{-}\!|_E$ is the smallest congruence generated by the set E of axioms. Let $|\!\stackrel{n}{-}\!|_E$ denote n steps of the elementary one-step E-equality $|\!-\!|_E$ and $|\!\stackrel{+}{-}\!|_E$ the transitive closure of $|\!-\!|_E$. An equation t=t' is said to be linear if t and t' are both linear. □
Definition 3: Substitutions $\sigma$ are defined to be endomorphisms of T(F,X) with a finite domain $D(\sigma) = \{x \mid \sigma(x) \neq x\}$. An E-match from t to t' is a substitution $\sigma$ such that $t' =_E \sigma(t)$. $\sigma$ is simply called a match if E is empty. An E-unifier of two terms t and t' is a substitution $\sigma$ such that $\sigma(t) =_E \sigma(t')$. If E is empty, there exists a most general unifier, called mgu, for any pair (t,t') of terms that has unifiers. All unifiers are actually instances of the mgu. If E is not empty, a basis for generating the set of unifiers by instantiation, whenever one exists, is called a complete set of unifiers (csu for short). □
Many theoretical problems arise in equational theories that can be approached by the use of rewrite rules, i.e. directed equations, or more generally by the use of mixed sets of rules R and equations E.
We define the rewriting relation in the general case and specialize it for the standard case where equations are not used in the rewriting relation.
Definition 4: A rewrite rule is a pair of terms denoted $l \to r$ such that $\mathcal{V}(r)$ is a subset of $\mathcal{V}(l)$. The rewriting relation $\to_{R,E}$ (or more simply $\to_R$ if E is empty) is defined as follows: $t \to_{R,E} t'$ iff there exist an occurrence $\alpha$ of $\mathcal{G}(t)$, a rule $l \to r$ of the set R of rewrite rules and a substitution $\sigma$ such that $t|_\alpha =_E \sigma l$ and $t' = t[\alpha \leftarrow \sigma(r)]$. $\xrightarrow{*}_{R,E}$ (resp. $\xrightarrow{+}_{R,E}$) denotes the reflexive transitive (resp. transitive) closure of $\to_{R,E}$ and is called the derivation relation. $\xrightarrow{n}_{R,E}$ denotes n-step rewritings with $\to_{R,E}$. A term t is said to be reducible if $t \to_{R,E} t'$ for some t'; otherwise it is said to be in normal form. □
Notations: In the following, i, j, k denote natural numbers, x, y, z denote variable symbols, a, b, c and e denote constant symbols, f and h denote function symbols, s and t denote terms, $\lambda$, $\mu$ and $\nu$ denote occurrences, $\varepsilon$ denotes the empty occurrence, and $\lambda\mu$ denotes the concatenation of occurrences $\lambda$ and $\mu$. Letters $\sigma$, $\tau$, $\theta$ and $\rho$ denote substitutions, $\sigma t$ or $\sigma(t)$ denote the instantiation of t by $\sigma$, composition of substitutions $\sigma$ and $\tau$ is written $\sigma\tau$, l=r and g=d denote equations, $l \to r$ and $g \to d$ denote rules or specify the way an equation is used. Each one of these notations can be indexed by a natural number or by a letter denoting a natural number. Finally, composition of relations other than substitutions is denoted by a bold dot. □
Notice that R,E-reducibility is decidable whenever the matching problem is decidable in the theory E. This is always the case if the theory E is empty. When the theory E is not empty, we will use either the relation $\to_R$ or the relation $\to_{R,E}$. We will speak about rewritings for $\to_R$ and E-rewritings for $\to_{R,E}$. It must be well understood that the two relations $\to_{R,E}$ and $=_E \cdot \to_R$ are different: if $t \to_{R,E} t'$ at occurrence $\mu$, the E-equality steps may apply in t only at occurrences below $\mu$ (those having $\mu$ as a prefix). If $t =_E \cdot \to_R t'$, then the E-equality steps may apply at any place in t. The two relations are therefore the same only if $\to_{R,E}$ rewrites at the top. This is illustrated by the following example, where + is a binary infix operator with the properties:
associativity used as an equation: $(x+y)+z = x+(y+z)$,
left identity used as a rule: $e+x \to x$.
Now consider the term t = (y+e)+z. By associativity, $t =_E y+(e+z)$ and this last term rewrites to y+z.
However, t is in normal form for $\to_{R,E}$ because its subterm (y+e) is obviously in normal form and the whole term itself cannot be matched with the left member of the rule using associativity, since neither (y+e)+z nor y+(e+z) is an instance of e+x. A usually required property for a rewriting relation is Confluence, which expresses that the result of a computation does not depend on the choice of the rules to be applied. This property has to do with the so-called Church-Rosser property, which says roughly that the word problem can be solved by means of checking normal forms for equality or E-equality.
Definition 5: Let R be a set of rules and E a set of equations. Let $=_A$ be the congruence generated by both equations and rules used both ways. Then we say that the rewriting relation $\to$ (either $\to_R$ or $\to_{R,E}$) is:
1. E-Church-Rosser iff $\forall t_1, t_2$ such that $t_1 =_A t_2$, $\exists t'_1, t'_2$ such that $t_1 \xrightarrow{*} t'_1$, $t_2 \xrightarrow{*} t'_2$ and $t'_1 =_E t'_2$.
2. E-confluent iff the property holds for $t_1$ and $t_2$ such that $\exists s$, $s \xrightarrow{*} t_1$ and $s \xrightarrow{*} t_2$.
3. Locally E-confluent iff the property holds for $t_1$ and $t_2$ such that $\exists s$, $s \to t_1$ and $s \to t_2$. □
If the rewriting relation is terminating, then we can choose for $t'_1$ and $t'_2$ the normal forms of $t_1$ and $t_2$.
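The associativity example given before Definition 5 can be replayed mechanically. The Python sketch below is an illustration under assumptions of our own: terms are built from the single binary symbol +, the variables y and z of the example are treated as fixed leaves, the only rule is e+x → x, and the function names are hypothetical. Since every term modulo associativity is determined by its left-to-right sequence of leaves, matching e+x modulo associativity amounts to testing whether the leaf sequence of a sum starts with e.

```python
# Terms over the binary '+': a leaf is a str, a sum is ('+', left, right).

def leaves(t):
    """Left-to-right leaf sequence of a term."""
    if isinstance(t, tuple):
        return leaves(t[1]) + leaves(t[2])
    return [t]

def bracketings(ls):
    """All '+' trees over the leaf sequence ls, i.e. its associativity class."""
    if len(ls) == 1:
        return [ls[0]]
    return [('+', l, r)
            for i in range(1, len(ls))
            for l in bracketings(ls[:i])
            for r in bracketings(ls[i:])]

def subterms(t):
    """The term itself and all its subterms."""
    yield t
    if isinstance(t, tuple):
        yield from subterms(t[1])
        yield from subterms(t[2])

def reducible_R(t):
    """Plain rewriting: some subterm is syntactically an instance of e + x."""
    return any(isinstance(s, tuple) and s[1] == 'e' for s in subterms(t))

def reducible_RE(t):
    """R,E-rewriting: some subterm equals an instance of e + x modulo associativity."""
    return any(isinstance(s, tuple) and leaves(s)[0] == 'e' for s in subterms(t))

def reducible_R_after_E(t):
    """Composition of E-equality with R: some associativity variant of t is R-reducible."""
    return any(reducible_R(u) for u in bracketings(leaves(t)))

t = ('+', ('+', 'y', 'e'), 'z')      # the term (y+e)+z of the example
```

On this term both `reducible_R` and `reducible_RE` answer negatively, while `reducible_R_after_E` finds the variant y+(e+z), in agreement with the discussion above.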
When E is empty, the rewriting relation is necessarily $\to_R$. In that case, Church-Rosser and Confluence are the same property [Huet 80]. This is not the case anymore when E is not empty, as shown by the following example, where + is a binary infix operator with the following properties:
commutativity used as an equation: $x+y = y+x$,
left identity used as a rule: $e+x \to x$.
Confluence is obviously satisfied; however, x+e and x are equal under the whole theory, are both in normal form and are not equal under commutativity. Note that the problem disappears if we allow commutative rewritings, because x+e now rewrites to x. But this is not the case in general [Peterson and Stickel 81, Jouannaud 83, Jouannaud and Kirchner 84]: if we look back at our example with associativity, (y+e)+z and y+z were also equal under the whole theory, both were in normal form for the E-rewriting relation, however they were not equal under associativity. In order to have the E-Church-Rosser property, another relation must actually be introduced to express that two E-equal terms must have E-equal normal forms. This property can be either the E-commutation property defined in section 5, the E-compatibility property of [Peterson and Stickel 81], or the more general E-coherence property defined first in [Jouannaud 83] and further improved in [Jouannaud and Kirchner 84].
3 Termination
Definition 6: A TRS R is terminating if there does not exist an infinite sequence of terms such as $t_0 \to_R t_1 \to_R t_2 \to_R \cdots \to_R t_n \cdots$ □
The termination problem for TRS was studied by [Manna and Ness 70], who introduced reduction orderings:
Definition 7: An ordering > on terms is a reduction ordering iff it is compatible with the operations of the term algebra, i.e.: t > t' implies f(... t ...) > f(... t' ...). □
The main property of reduction orderings is that they contain a rewriting relation $\to_R$ whenever they contain the pairs $(\sigma g, \sigma d)$ for any rule $g \to d$ in R and any substitution $\sigma$ [Manna and Ness 70]. Therefore, they allow us to prove the termination of the $\to_R$ rewriting relation, whenever they are well founded. An important class of such orderings is given by the so-called polynomial orderings. In order to compare terms, these orderings compare polynomials in their variables. The key point is to guess a polynomial that will make each left hand side of a rule greater than the corresponding right hand side. This clearly requires some expertise. See [Huet and Oppen 80] for a discussion and accurate references. A very important class of reduction orderings is introduced in [Dershowitz 82]:
Definition 8: A simplification ordering is a reduction ordering > that has the subterm property, i.e. f(... t ...) > t. □
The main property of simplification orderings is that they contain the embedding relation, which is actually the smallest (in a set theoretic sense) simplification ordering:
Definition 9: Let s and t be two terms. s is said to be embedded in t, and we write s emb t, iff one of the following conditions holds:
(1) s is a variable of $\mathcal{V}(t)$
(2) $s = f(\ldots s_i \ldots)$, $t = f(\ldots t_i \ldots)$ and $s_i$ emb $t_i$ for any i
(3) s emb $t_j$ for $t_j$ a subterm of t. □
Embedding can just be seen as a way to map injectively the nodes of s on the nodes of t, with respect to the topology of s. But the map is not surjective in general: when it is the case, then s and t are identical. Let us draw an example:
[Figure: two term trees built from the symbols f, h, x and y, the first one embedded (emb) in the second.]
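The three cases of Definition 9 translate directly into a recursive test. The following Python sketch is only an illustration: the term representation (variables as strings, applications as tuples headed by their function symbol) and the function names are assumptions, and the sample terms are ours, not those of the figure.

```python
# Terms: a variable is a str, an application is a tuple (symbol, arg1, ..., argn).

def variables(t):
    """Set of variables occurring in t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def emb(s, t):
    """Test 's emb t' following the three cases of Definition 9."""
    if isinstance(s, str):                       # (1) s is a variable of V(t)
        return s in variables(t)
    if isinstance(t, str):                       # a non-variable s cannot embed in a variable
        return False
    if (s[0] == t[0] and len(s) == len(t)
            and all(emb(si, ti) for si, ti in zip(s[1:], t[1:]))):
        return True                              # (2) same root, arguments embed pairwise
    return any(emb(s, tj) for tj in t[1:])       # (3) s embeds in a subterm of t

s = ('f', 'x', 'y')                              # f(x, y)
t = ('f', ('h', 'x'), ('f', ('h', 'y'), 'x'))    # f(h(x), f(h(y), x))
```

Here `emb(s, t)` holds by case (2), each argument of s being a variable of the corresponding argument of t, while `emb(t, s)` fails because t has more nodes than s.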
Several interesting simplification orderings have been designed these past years, mainly the Knuth and Bendix ordering [Knuth and Bendix 70], the path of subterm ordering [Plaisted 78], the recursive path ordering [Dershowitz 82] [Kamin and Levy 80] and the recursive decomposition ordering [Jouannaud, Lescanne and Reinig 83] [Lescanne 84]. All these orderings extend to terms a precedence defined on function symbols. The precedence can be built automatically for the last one, which was first done in [Lescanne 83] and then improved in [Choque 83]. Before closing this section, we note that the problem of termination of the relation $\to_{R,E}$ has never been addressed. We will see in the following that this is one of the main issues for the E-termination problem that we define now.
4 E-Termination
Our problem is slightly different from the problem of termination because we allow equality steps between rewritings:
Definition 10: A TRS R is E-terminating iff there does not exist an infinite sequence of the form $t_0 =_E t'_0 \to_R t_1 =_E t'_1 \to_R \cdots\ t_n =_E t'_n \to_R t_{n+1} \cdots$ □
Note that we can use either $\to_R$ or $\to_{R,E}$ in the previous definition, since $=_E \cdot \to_{R,E}$ and $=_E \cdot \to_R$ are the same relation. Recall that the relation $=_E \cdot \to_R$ is strictly more powerful than the relation $\to_{R,E}$ itself, because it allows E-equalities to take place anywhere in the term to be rewritten, whereas $\to_{R,E}$ allows E-equalities to take place only under the occurrence where the term is rewritten. The same happens with the relation $=_E \cdot \to_{R,E}$ since it is equal to $=_E \cdot \to_R$. Let us give an additional example with our favorite binary infix operator +, and the following properties:
associativity used as an equation: $(x+y)+z = x+(y+z)$,
simplifiability used as a rule: $x + (-x) \to e$.
Then the term x + ((-x) + y) is in normal form for $\to_{R,E}$, since neither one of its subterms nor itself is an instance (modulo associativity) of x+(-x). However, the whole term is equal under associativity to (x + (-x)) + y and rewrites to e + y. Let now $\to_{R/E}$ be the relation $=_E \cdot \to_R \cdot =_E$. This relation on terms simulates the rewriting relation induced by $\to_R$ in E-congruence classes. It is clear that the E-termination problem is nothing else but the termination of $\to_{R/E}$. These remarks will be freely used in what follows. Let us now start with some general comments about E-termination that show simple restrictions on the set of equations. First, $\mathcal{V}(g) = \mathcal{V}(d)$ for any equation g=d. Otherwise there are obviously infinite loops, as we may instantiate the extra variable by anything, for instance by a left member l of some rule $l \to r$, then rewrite the term using the rule, and then come back to the starting term by applying the starting equation twice, in order to first erase r, then obtain l again. A second important remark is that E-termination cannot be satisfied whenever there are some equations of the form x=t where x has several occurrences in t, because in that case l is E-equal to a term with several occurrences of l. One can rewrite one of these and start the process again. This is actually the case with any instance of such an axiom with x replaced by a non-ground term. In the following, we assume that E does not contain such axioms. We are primarily interested in proving termination of TRS generated by the generalized Knuth and Bendix completion procedure [Jouannaud and Kirchner 84]. This procedure completes the set of rules in order to get two main properties, namely E-confluence and E-coherence. In the following, we introduce and use a property stronger than E-coherence, called E-commutation, that allows us to reduce E-termination to termination of the rewriting relation.
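The two restrictions on the equations stated above (equal variable sets on both sides, and no equation x=t with several occurrences of x in t) are purely syntactic and can be checked mechanically. The Python sketch below is a minimal illustration; the term representation, the function names and the three sample equations are assumptions of ours, and the second test only covers the axiom itself, not its instances.

```python
# Terms: a variable is a str, an application is a tuple (symbol, arg1, ..., argn);
# a constant is an application without arguments, e.g. ('0',).

def var_occurrences(t):
    """List of the variable occurrences of t, with repetitions."""
    if isinstance(t, str):
        return [t]
    return [v for arg in t[1:] for v in var_occurrences(arg)]

def violation(g, d):
    """Return the restriction broken by the equation g = d, or None."""
    if set(var_occurrences(g)) != set(var_occurrences(d)):
        return "V(g) != V(d)"
    for x, t in ((g, d), (d, g)):
        if isinstance(x, str) and var_occurrences(t).count(x) > 1:
            return "x = t with several occurrences of x in t"
    return None

# Illustrative equations (only the first one appears in the text).
assoc = (('+', ('+', 'x', 'y'), 'z'), ('+', 'x', ('+', 'y', 'z')))
absorb = (('*', 'x', ('0',)), ('0',))        # x * 0 = 0 : extra variable on one side
idem = ('x', ('+', 'x', 'x'))                # x = x + x : duplicated variable
```

Associativity passes both tests, whereas the two other sample equations each break one of the restrictions and would therefore have to be used as rules rather than as equations.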
5 E-Termination of E-Commuting Rewriting Relations
In this section, we prove abstract results for an arbitrary rewriting relation $\to$. These results are applied in the next section to the case where the rewriting relation is $\to_R$ or $\to_{R,E}$. In the following, the rewriting relation $\to$ is supposed to satisfy the property: $\to_R\ \subseteq\ \to\ \subseteq\ \to_{R/E}$. As a consequence, $=_E \cdot \to_R \cdot =_E$, $=_E \cdot \to \cdot =_E$, $=_E \cdot \to_{R/E}$, $=_E \cdot \to_{R/E} \cdot =_E$ and $\to_{R/E}$ are the same relation.
Definition 11: A rewriting relation $\to$ is
E-commuting with a set E of equations iff for any s', s and t such that $s' =_E s \xrightarrow{+} t$, then $s' \xrightarrow{+} t' =_E t$ for some t';
locally E-commuting, whenever the property holds for $s'\ |\!-\!|_E\ s \to t$. □
Notice that we require at least one $\to$ step from s' before getting t'. This is coherent with our goal of proving E-termination: if R is E-terminating, then s' must be different from t', else there would be the following cycle: $t =_E t' = s' =_E s \to t$, therefore $t \to_{R/E} t$ since $=_E \cdot \to$ is included in $\to_{R/E}$.
E-commutation differs from E-compatibility by rewriting from s' until a term t' is found to be E-equal to t, instead of rewriting once from s' to a term s'' and then rewriting from t until a term t'' is found to be E-equal to s''. On the other hand, E-coherence allows both kinds of rewritings and is therefore more general than either one.
The importance of the E-compatibility relation was shown by Peterson and Stickel: assuming E-compatibility, E-Church-Rosser and E-confluence are equivalent. This can easily be shown by induction on the length of the proof that $s =_{R \cup E} t$ in the E-Church-Rosser definition. Let us now point out the main importance of E-commutation for the E-termination problem:
Theorem 12: [Munoz 83] Let R be a set of rules and E a set of equations. Assume the rewriting relation $\to$ is terminating and E-commuting. Then R is E-terminating.
proof: by noetherian induction on $\to$. Let $t_0 \to_{R/E} t_1 \xrightarrow{*}_{R/E} t_n$ be a derivation issued from $t_0$. By definition of R/E, there exists a $t'_0$ such that $t_0 =_E t'_0 \to_R t_1$, therefore $t_0 =_E t'_0 \to t_1$ by definition of $\to$. Now, by E-commutation there exists a $t'_1$ such that $t_0 \xrightarrow{+} t'_1 =_E t_1$. As $t'_1 =_E t_1$, the rest of the derivation starting from $t_1$ is also an R/E derivation starting from $t'_1$. By induction hypothesis, it must be finite, because $t'_1$ is a proper son of $t_0$ for $\to$. □
Note that there is no need for R to be E-confluent. Proving the same property with local E-commutation instead of E-commutation requires a little more work.

Lemma 13: Assume $\to$ is locally E-commuting. Then $t_0 =_E t_0' \to t_1$ implies $t_0 \to^{+} \circ \to^{*}_{R/E} t_1$.

proof: By induction on n if $t_0 \vdash\!\dashv^{n}_E t_0'$. The basic n=0 case is straightforward. Let now $t_0 \vdash\!\dashv^{n}_E t \vdash\!\dashv_E t_0' \to t_1$. By local E-commutation, we have $t \to^{+} \circ =_E t_1$, thus $t \to t' \to^{*} \circ =_E t_1$ for some t'. By induction hypothesis, $t_0 \to^{+} \circ \to^{*}_{R/E} t'$, thus $t_0 \to^{+} \circ \to^{*}_{R/E} t_1$, using the fact that $\to$ is included into $\to_{R/E}$. $\Box$

Theorem 14: Let R be a set of rules and E a set of equations. Assume that the rewriting relation $\to$ is terminating and locally E-commuting. Then R is E-terminating.

proof: Almost the same proof as for theorem 12. What differs is that we use the lemma instead of E-commutation. This allows us to construct from $t_0$ a derivation of the form $t_0 \to^{+} t' \to^{*}_{R/E} t_1 \to^{*}_{R/E} t_n$ for some t', and we are done. $\Box$

As previously, there is no need for R to be E-confluent or locally E-confluent. From these results, we can obtain a new Church-Rosser result for Equational Term Rewriting Systems using termination instead of E-termination:

Theorem 15: Let R be a set of rules and E a set of equations. Assume that $\to$ is terminating. Then $\to$ is E-confluent and E-commuting (thus E-Church-Rosser) iff it is locally E-confluent and locally E-commuting.
proof: For the only if part, we remark that global properties imply local ones and that E-Church-Rosser is true since on one hand E-commutation implies E-coherence and on the other hand E-coherence and E-confluence imply E-Church-Rosser [Jouannaud and Kirchner 84]. For the if part, we first use theorem 14 to prove E-termination. Then we can prove E-commutation from local E-commutation by noetherian induction on the relation $(\to_{R/E})_{mult}$, which is the extension of $\to_{R/E}$ to multisets of terms (see [Jouannaud and Lescanne 82] for a discussion about multiset extensions of orderings):

Let $t_0 \to t_0' \to^{*} t_0''$ and $t_0 \vdash\!\dashv_E t_1 \ldots \vdash\!\dashv_E t_n$. By local E-commutation, $t_1 \to^{+} t_1' =_E t_0'$. By induction hypothesis applied to the multiset $\{t_1,\ldots,t_n\}$, which is strictly smaller than the starting multiset, we get $t_n \to^{+} t_n' =_E t_1'$. As the terms along the proof that $t_n' =_E t_1' =_E t_0'$ are all proper sons of $t_0$ for the relation $\to_{R/E}$, the multiset of these terms is smaller than the multiset $\{t_0\}$ itself, thus than the starting multiset. By induction hypothesis, $t_n' \to^{*} t_n'' =_E t_0''$ and we are done. E-confluence can now be obtained easily from local E-confluence and E-coherence by noetherian induction on $\to_{R/E}$. $\Box$

Let us point out that multiset induction allows simple and elegant proofs. This technique was introduced in [Jouannaud and Kirchner 84] for proving very general Church-Rosser results using local E-coherence instead of local E-commutation.

We already know how to prove termination of the standard rewriting relation $\to_R$, using a reduction ordering > such that $\sigma(g) > \sigma(d)$ for any rule $g \to d$ and any substitution $\sigma$. This enables us to prove E-termination of R, provided it is E-commuting.

Example 1: [Huet 80] Let 0 and 1 be constants, Exp be a unary function symbol and + and . be binary infix function symbols. Assume + and . are both associative and commutative. Rules are: $x+0 \to x$, $0+x \to x$, $x.1 \to x$, $1.x \to x$, $Exp(0) \to 1$ and $Exp(x+y) \to Exp(x).Exp(y)$. It is easy to check that the standard rewriting relation is locally E-commuting. Moreover, it is terminating, as shown by a recursive path ordering [Dershowitz 82] with Exp > . and Exp > 1 as the precedence on function symbols. By theorem 14, R is terminating modulo associativity and commutativity of + and . . Actually, R is also locally E-confluent and we could directly use theorem 15 for proving its Church-Rosser property.
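The termination argument of Example 1 can be illustrated by a minimal Python sketch of a recursive path ordering with multiset status. The term encoding (variables as strings, applications as tuples) and every function name below are assumptions made for this illustration; they are not taken from [Dershowitz 82]. The sketch only compares plain, unflattened terms, so it addresses termination of the standard relation, not anything modulo associativity and commutativity.

```python
# Terms: a variable is a str; an application is a tuple (symbol, arg1, ..., argN);
# a constant is a 1-tuple such as ("0",).  This encoding is an assumption of the sketch.

def is_var(t):
    return isinstance(t, str)

def occurs(x, t):
    """True if variable x occurs in term t."""
    if is_var(t):
        return t == x
    return any(occurs(x, a) for a in t[1:])

def multiset_diff(ms, ns):
    """Items of ms that remain after cancelling one copy per item of ns."""
    rest, out = list(ns), []
    for m in ms:
        if m in rest:
            rest.remove(m)
        else:
            out.append(m)
    return out

def multiset_gt(ms, ns, prec):
    """Multiset extension of rpo_gt."""
    m_only, n_only = multiset_diff(ms, ns), multiset_diff(ns, ms)
    return bool(m_only) and all(
        any(rpo_gt(m, n, prec) for m in m_only) for n in n_only)

def rpo_gt(s, t, prec):
    """s > t in the recursive path ordering; prec is a set of pairs (f, g) meaning f > g."""
    if is_var(s):
        return False
    if is_var(t):
        return occurs(t, s)                      # s is not a variable, so x in s gives s > x
    if any(a == t or rpo_gt(a, t, prec) for a in s[1:]):
        return True                              # some argument of s dominates t
    if (s[0], t[0]) in prec:
        return all(rpo_gt(s, b, prec) for b in t[1:])
    if s[0] == t[0]:
        return multiset_gt(list(s[1:]), list(t[1:]), prec)
    return False

# Precedence of Example 1: Exp > . and Exp > 1.
PREC = {("Exp", "."), ("Exp", "1")}
ZERO, ONE = ("0",), ("1",)
RULES = [
    (("+", "x", ZERO), "x"),
    (("+", ZERO, "x"), "x"),
    ((".", "x", ONE), "x"),
    ((".", ONE, "x"), "x"),
    (("Exp", ZERO), ONE),
    (("Exp", ("+", "x", "y")), (".", ("Exp", "x"), ("Exp", "y"))),
]
ORIENTED = [rpo_gt(l, r, PREC) for l, r in RULES]
```

Under these assumptions every entry of `ORIENTED` is true, that is, each left-hand side dominates its right-hand side for the stated precedence.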
In order to prove termination of the rewriting relation $\to_{R,E}$, we simply need a reduction ordering > that satisfies: $t > \sigma(d)$ for any rule $g \to d$, any substitution $\sigma$ and any t such that $t =_E \sigma(g)$. As a matter of fact, such an ordering will contain the $\to_{R,E}$ rewriting relation and thus prove its termination provided it is well founded.

Example 2: Let f be a binary symbol and - be a unary symbol with the following properties:
$f(x,f(-x,y)) \to y$ and $f(f(y,-x),x) \to y$ used as rules.
$--x = x$ and $-f(x,y) = f(-y,-x)$ used as equations.
This equational theory is introduced in [Jouannaud, Kirchner C. and H. 81] and studied in [Kirchner C. and H. 82], where R,E is proved to be E-commuting. Since R,E must reduce the number of f symbols, it is also terminating. Therefore R is E-terminating. Note that the same argument used for proving termination of R,E actually proves termination of R/E.

An interesting open problem now arises: the design of well-suited reduction orderings for proving termination of R,E. We expect it to be easier than the design of reduction orderings working in E-congruence classes, as is done in [Dershowitz, Hsiang, Josephson and Plaisted 83, Plaisted 83], that is, reduction orderings for proving termination of R/E. By the way, these orderings can also be used for proving termination of R,E, since R,E is included into R/E.
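The counting argument of Example 2 can be checked mechanically on the rule and equation schemas. The following Python sketch is an illustration only: the term encoding and the helper names are assumptions, and the check is a sufficient syntactic criterion (symbol count plus variable multiplicities), not the authors' proof.

```python
from collections import Counter

# Terms: variables are str, applications are tuples (symbol, args...).  Assumed encoding.

def count_symbol(t, sym):
    """Number of occurrences of the function symbol sym in t."""
    if isinstance(t, str):
        return 0
    return (t[0] == sym) + sum(count_symbol(a, sym) for a in t[1:])

def var_counts(t):
    """Multiset of variable occurrences of t."""
    if isinstance(t, str):
        return Counter([t])
    c = Counter()
    for a in t[1:]:
        c += var_counts(a)
    return c

def strictly_decreases(l, r, sym):
    """Every instance of l has more sym symbols than the same instance of r."""
    lv, rv = var_counts(l), var_counts(r)
    return (count_symbol(l, sym) > count_symbol(r, sym)
            and all(rv[v] <= lv[v] for v in rv))

def preserves(l, r, sym):
    """Every instance of l has exactly as many sym symbols as the same instance of r."""
    return (count_symbol(l, sym) == count_symbol(r, sym)
            and var_counts(l) == var_counts(r))

RULES = [
    (("f", "x", ("f", ("-", "x"), "y")), "y"),
    (("f", ("f", "y", ("-", "x")), "x"), "y"),
]
EQUATIONS = [
    (("-", ("-", "x")), "x"),
    (("-", ("f", "x", "y")), ("f", ("-", "y"), ("-", "x"))),
]
RULES_DECREASE = all(strictly_decreases(l, r, "f") for l, r in RULES)
EQUATIONS_PRESERVE = all(preserves(l, r, "f") for l, r in EQUATIONS)
```

Both flags hold here: the equations leave the number of f symbols unchanged and each rule removes two of them, which is why the same measure also bounds R/E derivations.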
6 E-Termination of non E-Commuting Rewriting Relations

The main idea of this section is quite simple: since the set R of rules is not E-commuting, we first compute the smallest E-commuting set of rules that contains R and then prove the termination of the rewriting relation associated with this new set of rules. In addition, we will show that we can use two different proofs for R and the added rules. This will give us a much more powerful technique.

Definition 16: A relation > is said to be:
• E-commuting with a rewriting relation $\to$ iff for any s', s and t such that $s' =_E s \to^{+} t$, we have $s' > t' =_E t$ for some t'.
• semi-locally E-commuting if the property holds for $s' =_E s \to t$.
• locally commuting if it holds for $s' \vdash\!\dashv_E s \to t$. $\Box$
E-commuting relations play the same role as E-commuting rewriting relations. Notice that we obtain our previous definition if > is taken to be $\to^{+}$. In the following, we assume > to be an ordering. Notice that E-commutation and semi-local E-commutation can actually be proved equivalent by a simple induction on the length of the derivation. This will be freely used in what follows by calling E-commutation what in fact is semi-local E-commutation. We can now adapt theorem 12 to the case of E-commuting orderings, provided they are assumed well founded. Our goal is actually to relax the hypothesis that > is well founded and assume that it contains the embedding relation.

Theorem 17: [Munoz 83] Assume that > is E-commuting with $\to_R$ and contains the embedding relation. Then R is E-terminating.

proof: By contradiction. Assume there exists an infinite chain for $\to_{R/E}$ issued from $t_0$: $t_0 =_E t_0' \to_R t_1 =_E t_1' \to_R t_2 \ldots t_n =_E t_n' \to_R t_{n+1} \ldots$. Then, applying E-commutation as many times as needed, we construct an infinite chain for >: $t_0 > t_1 > \ldots > t_n > t_{n+1} \ldots$. By Kruskal's theorem, there exist two terms $t_i$ and $t_j$ with $i < j$ in the sequence such that $t_i$ is embedded in $t_j$. It follows from the hypothesis that $t_i < t_j$ or $t_i = t_j$. But $t_i > t_j$ by transitivity of the ordering >, which gives a contradiction. $\Box$

Note that there is no need for > to be a well founded ordering: what is important is that it contains the embedding relation. The same trick was actually used in [Dershowitz 82] when proving that simplification orderings can be used to prove termination. This theorem can now be applied to simplification orderings, because they contain the embedding relation. We can thus suppose that > is an extension of such an ordering. Let us now remark that our definition of E-commutation collects two different notions:

• Let us assume that the E-equality step in the E-commutation definition is empty, i.e. $s' = s \to^{+} t$. In that case, we obtain from the definition that there exists a t' such that $s' > t' =_E t$. A reasonable restriction is to assume t' = t for that special case, as for the empty theory. This means that the rewriting relation $\to$ is included into the commuting ordering, which is easily obtained by requiring that the simplification ordering or the E-simplification ordering orients the rules: as seen previously, it will contain the whole rewriting relation.
• Let us assume now that $s' \neq s$ and $s' =_E s \to t$, which implies that $s' > t' =_E t$ for some t'. This is what is called E-commutation in the following. This E-commutation notion has as its purpose to ensure E-termination provided termination is true.

Theorem 17 can now be restated in the two following ways:

Theorem 18: [Munoz 83] Let > be a simplification ordering such that $\sigma(g) > \sigma(d)$ for any rule $g \to d$ in R and any substitution $\sigma$. Then $\to_R$ is E-terminating if > is contained into an ordering >' E-commuting with $\to_R$.

Theorem 19: Let > be an E-simplification ordering such that $t > \sigma(d)$ for any rule $g \to d$ in R, any substitution $\sigma$ and any t such that $t =_E \sigma(g)$. Then R is E-terminating if > is contained into an ordering >' E-commuting with $\to_{R,E}$.

proofs: We verify the hypotheses of theorem 17. >' contains >, thus the embedding, since > is a simplification ordering. Using the hypotheses, > contains the rewriting relation $\to_R$ or $\to_{R,E}$, depending upon theorem 18 or 19. Therefore >' has the same property. As >' is E-commuting, we capture the full power of our previous definition. $\Box$
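The embedding relation on which Theorem 17 and simplification orderings rely can be tested by a short recursive procedure. The Python sketch below assumes fixed-arity symbols and an ad hoc term encoding (variables as strings, applications as tuples); the function name `embedded` is hypothetical and only serves this illustration.

```python
# Terms: variables are str, applications are tuples (symbol, args...).  Assumed encoding.

def embedded(s, t):
    """True if s is homeomorphically embedded in t (s = t is allowed)."""
    if s == t:
        return True
    if isinstance(t, str):
        return False                     # a variable only embeds itself
    # Either s is embedded in some argument of t ...
    if any(embedded(s, a) for a in t[1:]):
        return True
    # ... or both terms have the same head and the arguments embed pairwise.
    return (not isinstance(s, str)
            and s[0] == t[0] and len(s) == len(t)
            and all(embedded(a, b) for a, b in zip(s[1:], t[1:])))

SMALL = ("f", "x")
LARGE = ("g", ("f", ("h", "x")))
```

For instance, `embedded(SMALL, LARGE)` holds because f(x) is obtained from g(f(h(x))) by erasing g and h, whereas g(x) is not embedded in f(x).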
This result can now be applied in the following way: let us start with a simplification ordering that orients the instances of rules of R (up to E-equality for the instance of the left member in the case where $\to_{R,E}$ is our rewriting relation). This ordering can be, for example, the Recursive Path Ordering or the Recursive Decomposition Ordering if the rewriting relation is $\to_R$. Now try to extend this ordering into an E-commuting reduction ordering. This will be the difficult step in practice and we now study various ways of achieving it. The technique we propose relies on the two following main remarks: first of all, the commutation relation is related to some kind of critical pairs, i.e. it can be checked on a set of pairs of terms $(t_0, t_1)$ that we can try to orient with the extended ordering. Second, usual simplification orderings are monotonic with respect to the precedence on function symbols they use. A first simple idea is thus to increase, if necessary, the precedence on function symbols in order to orient the critical pairs.
However, the ordering used to compare the critical pairs need not be the same as the ordering used to compare the rules themselves. What is required is that it must be an extension, and we can actually imagine various ways to build such an extension in practice, based for example on a lexicographic way of using the first ordering and then another one. This will provide a much more powerful technique than increasing the precedence only. The previous techniques reduce the problem of E-termination of a set of rules to the problem of termination of an extended set of rules. We now split our discussion into two sections, according to the two different rewriting relations we can use.

6.1 $\to_R$ Rewritings

Definition 20: Given two terms t' and t, we say that t' overlaps t at occurrence $\lambda$ in $\mathcal{G}(t)$ if t' and $t/\lambda$ are unifiable. If $\sigma$ is their most general unifier, $\sigma(t)$ is called the overlapping term. Given now two rules $g \to d$ and $l \to r$ such that g overlaps l at occurrence $\lambda$ with the substitution $\sigma$, we call critical pair the pair of terms $\langle \sigma(r), \sigma(l[\lambda \leftarrow d]) \rangle$. Let $SCP(g \to d, l \to r)$ be the set of all critical pairs obtained by overlapping of g on l. $\Box$

Notice that any term overlaps with itself at the top. Such overlappings produce trivial critical pairs that we do not consider here. Notice also that a critical pair is not symmetric: this is due to the fact that the overlapping operation itself is not symmetric. More precisely, the overlapping term $\sigma l$ rewrites both ways as follows: $\sigma l \to \sigma r$ and $\sigma l \to \sigma l[\lambda \leftarrow \sigma d]$. The first rewriting uses the rule $l \to r$ whereas the second one uses the rule $g \to d$. With critical pairs is associated the so-called:

Critical pair lemma [Huet 80]: Let $t \to t_1$ at the top with the rule $g \to d$ and $t \to t_2$ at occurrence $\lambda$ with the rule $l \to r$. If $\lambda$ is in $\mathcal{G}(g)$, then there exist a critical pair (p,q) in $SCP(l \to r, g \to d)$ and a substitution $\sigma$ such that $t_1 = \sigma p$ and $t_2 = \sigma q$. $\Box$

We now introduce an operation over sets of rules that we call Critical Pair:

Definition 21: We say that a set R of rules is closed with respect to a set E of equations under Critical Pair operation iff for any rule $l \to r$ of R and any equation g=d of E:
- for any (p,q) in $SCP(l \to r, g \to d)$ or in $SCP(l \to r, d \to g)$, $p \to^{+}_R \circ =_E q$.
- for any (p,q) in $SCP(g \to d, l \to r)$ or in $SCP(d \to g, l \to r)$, $q \to^{+}_R \circ =_E p$. $\Box$

Note that the previous condition is satisfied if $p \to q$ (resp. $q \to p$) is in R. Actually, this definition means that we want the E-commutation property to be true at least for the critical pairs of the rules with the equations. As usual, it will then be satisfied for all possible cases. Let now CP(R,E) be a (smallest) set of rules that contains R and is closed under Critical Pair operation. Such a (non unique) set can be obtained by adding the rules $p \to q$ or $q \to p$ each time the previous property is not satisfied. Generally, we cannot expect it to be finite, but let us postpone this problem.
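The overlap computation behind SCP can be sketched with plain syntactic unification. The Python code below is a minimal illustration under stated assumptions: the term encoding, the helper names and the function `critical_pairs` are all hypothetical, the two rules are assumed to have disjoint variables, and trivial self-overlaps at the top are not filtered out.

```python
# Terms: variables are str, applications are tuples (symbol, args...).  Assumed encoding.

def is_var(t):
    return isinstance(t, str)

def apply(sub, t):
    """Apply a substitution stored in triangular form (bindings may mention bound variables)."""
    if is_var(t):
        return apply(sub, sub[t]) if t in sub else t
    return (t[0],) + tuple(apply(sub, a) for a in t[1:])

def occurs(x, t):
    return t == x if is_var(t) else any(occurs(x, a) for a in t[1:])

def unify(s, t, sub=None):
    """Most general unifier of s and t extending sub, or None if there is none."""
    sub = {} if sub is None else sub
    s, t = apply(sub, s), apply(sub, t)
    if s == t:
        return sub
    if is_var(s):
        return None if occurs(s, t) else {**sub, s: t}
    if is_var(t):
        return unify(t, s, sub)
    if s[0] != t[0] or len(s) != len(t):
        return None
    for a, b in zip(s[1:], t[1:]):
        sub = unify(a, b, sub)
        if sub is None:
            return None
    return sub

def positions(t):
    """Non-variable occurrences of t, as tuples of argument indices."""
    if is_var(t):
        return
    yield ()
    for i, a in enumerate(t[1:], 1):
        for p in positions(a):
            yield (i,) + p

def subterm(t, p):
    for i in p:
        t = t[i]
    return t

def replace(t, p, u):
    if not p:
        return u
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], u),) + t[i + 1:]

def critical_pairs(g, d, l, r):
    """Pairs (sigma r, sigma l[p <- d]) for every overlap of g on l, as in SCP(g->d, l->r)."""
    pairs = []
    for p in positions(l):
        sub = unify(g, subterm(l, p))
        if sub is not None:
            pairs.append((apply(sub, r), apply(sub, replace(l, p, d))))
    return pairs

# Associativity of +, read left to right, overlapped on the rule u+0 -> u.
ASSOC_L = ("+", ("+", "x", "y"), "z")
ASSOC_R = ("+", "x", ("+", "y", "z"))
PAIRS = critical_pairs(ASSOC_L, ASSOC_R, ("+", "u", ("0",)), "u")
```

Here the single pair relates x+y and x+(y+0); the second member rewrites to the first with the rule itself, which is the shape of condition required by Definition 21 for pairs in SCP(g→d, l→r).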
Theorem 22: Let R be a set of left linear rules and E a set of linear equations. Then R is E-terminating if there exist:
• a simplification ordering > such that $\sigma(l) > \sigma(r)$ for any rule $l \to r$ in R and any substitution $\sigma$.
• a reduction ordering >' that contains > and such that $\sigma(l) >' \sigma(r)$ for any rule $l \to r$ in CP(R,E) - R and any substitution $\sigma$. $\Box$

proof: We have to prove the commutation of >' with $\to_R$. Let us first remark that >' contains >, thus $\to_R$, because > is a reduction ordering that contains the instances of rules in R. For the same reason, >' contains $\to_{CP(R,E)-R}$, thus $\to_{CP(R,E)}$, therefore $\to^{+}_{CP(R,E)}$ by transitivity. It is therefore sufficient to prove E-commutation of $\to^{+}_{CP(R,E)}$ with $\to_R$. Since $\to_{CP(R,E)}$ contains $\to_R$, this property follows from the E-commutation of $\to^{+}_{CP(R,E)}$ with $\to_{CP(R,E)}$. This last property is proved now:

Let $t_1 \vdash\!\dashv^{n}_E t \to_{CP(R,E)} t_2$. We prove the property by induction on n. If n is greater than 1, the property is easily proved by induction. If n=0, the property is obvious. Let us now deal with the n=1 case. Assume that $t/\lambda = \sigma(g)$ and $t_1 = t[\lambda \leftarrow \sigma(d)]$ for an equation g=d in E, and that $t/\mu = \sigma(l)$ and $t_2 = t[\mu \leftarrow \sigma(r)]$ for a rule $l \to r$ in CP(R,E). In order to have the same substitution for both rewritings, we classically assume without loss of generality that g and l have disjoint sets of variables. We discuss different cases according to the respective positions of $\lambda$ and $\mu$:

(1) If $\lambda$ and $\mu$ are disjoint, then the equality step and the rewriting step commute.
(2) If $\lambda$ is a prefix of $\mu$, let us say $\mu = \lambda\mu'$, we distinguish two cases:
(a) If $\mu'$ belongs to $\mathcal{G}(g)$, then by the critical pair lemma, there exists a critical pair (p,q) in $SCP(l \to r, g \to d)$ such that $t_1 = \sigma' p$ and $t_2 = \sigma' q$ for a substitution $\sigma'$. Then $t_1 = \sigma' p \to t_2 = \sigma' q$ or $t_1 = \sigma' p \to^{+} \circ =_E \sigma' q = t_2$ by hypotheses and definition of CP(R,E).
(b) If $\mu'$ does not belong to $\mathcal{G}(g)$, then $t_1 \to^{m} t_1' \vdash\!\dashv^{m'}_E t_2$, where m is the number of occurrences of the variable x at occurrence $\nu$ in g such that $\mu' = \nu\mu''$. As equations are supposed to be linear, m=1 and the result is true.
(3) If $\mu$ is a prefix of $\lambda$, the proof works as previously and uses the fact that rules in R are supposed to be left linear, which implies that rules in CP(R,E) are left linear too, since equations are linear. This is actually a property of unification.

This ends the proof of our theorem. $\Box$

Example 3: Let us use example 1 again, with the new rules $(x+y).z \to (x.z)+(y.z)$ and $z.(x+y) \to (z.x)+(z.y)$. By adding . > + to the precedence, the recursive path ordering will orient all our rules. Let us now check which rules must be added in order to have E-commutation. Actually, there are infinitely many of them, but they all have the same form, which enables us to prove their termination with the same recursive path ordering. Let us give some of them: $((x+y)+z).z' \to (x.z')+((y+z).z')$, $(x+y).(z.z') \to ((x.z)+(y.z)).z'$ and $(x'.(x+y)).z \to x'.((x.z)+(y.z))$.
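As a worked illustration of how the enlarged precedence of Example 3 orients the first distributivity rule, the comparison can be unfolded step by step. The derivation below assumes the recursive path ordering with multiset status, written $>_{rpo}$ here; this symbol and the layout are choices made for this sketch only.

```latex
\begin{align*}
& s = (x+y).z, \qquad t = (x.z)+(y.z), \qquad \text{precedence: } . > + \\
& \text{Since the head of } s \text{ is greater than the head of } t,\;
  s >_{rpo} t \text{ follows from } s >_{rpo} x.z \text{ and } s >_{rpo} y.z. \\
& s >_{rpo} x.z:\ \text{same head, so compare the multisets }
  \{x+y,\ z\} \text{ and } \{x,\ z\}; \\
& \qquad \text{after cancelling } z,\ \{x+y\} \text{ dominates } \{x\}
  \text{ because } x \text{ is a proper subterm of } x+y. \\
& s >_{rpo} y.z:\ \text{symmetrically, } \{x+y\} \text{ dominates } \{y\}. \\
& \text{Hence } (x+y).z >_{rpo} (x.z)+(y.z).
\end{align*}
```

The second rule, $z.(x+y) \to (z.x)+(z.y)$, is handled in the same way with the arguments exchanged.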
6.2 $\to_{R,E}$ Rewritings

The previous technique is rather restrictive, especially for the kind of rules that can be handled. In order to relax this restriction, we now use the $\to_{R,E}$ rewriting relation of Peterson and Stickel. This relation was introduced by these authors to solve the problem of non left linearity of rules, and it comes with a new notion of critical pairs:

Definition 23: Given two terms t' and t, t' E-overlaps t at occurrence $\lambda$ in $\mathcal{G}(t)$ with a complete set of E-unifiers $\Sigma$ iff $\sigma(t') =_E \sigma(t/\lambda)$ for any substitution $\sigma$ in $\Sigma$. Given two rules $l \to r$ and $g \to d$ such that l E-overlaps g at occurrence $\lambda$ with a complete set $\Sigma$ of E-unifiers, we call complete set of E-critical pairs of $l \to r$ on $g \to d$ at occurrence $\lambda$ the set of pairs $\{\langle \sigma(d), \sigma(g[\lambda \leftarrow r]) \rangle \mid \sigma \in \Sigma\}$, and extended pair of $g \to d$ on $l \to r$ at occurrence $\lambda$ the pair $\langle g[\lambda \leftarrow l], g[\lambda \leftarrow r] \rangle$. In the following, $SEP(g \to d, l \to r)$ denotes the set of all extended pairs of $g \to d$ on $l \to r$. $\Box$

Extended pairs were first introduced in [Peterson and Stickel 81] for the associative commutative case, and as above in [Jouannaud and Kirchner 84]. Notice that the substitution $\sigma$ is not involved in the computation of the extended pair. On the other hand, for any unifier $\sigma$ in $\Sigma$, we have: $\sigma(g[\lambda \leftarrow l]) = \sigma g[\lambda \leftarrow \sigma l] =_E \sigma g[\lambda \leftarrow \sigma(g/\lambda)] = \sigma g$. If g=d is an equation of E, then $\sigma g \vdash\!\dashv_E \sigma d$ at the top. It follows that if $\sigma g$ rewrites at the top with $\to_{R,E}$, then $\sigma d$ actually rewrites to the same term. Therefore, if the left member of the extended pair rewrites at the top to a term t, then the left member of any corresponding critical pair will rewrite at the top to the corresponding instance of t. This is very important in practice because it allows us to avoid the computation of the complete set of unifiers of l and $g/\lambda$: we only need to know whether the two terms unify or not.

With extended pairs is associated the E-extended pair lemma: Let $t \vdash\!\dashv_E t_1$ at the top with the equation $g \to d$ of E and $t \to_{R,E} t_2$ at occurrence $\lambda$ in $\mathcal{G}(g)$ with the rule $l \to r$ of R. Then there exist an extended pair (p,q) in $SEP(l \to r, g \to d)$ and a substitution $\sigma$ such that $t_1 =_E \sigma p$ and $t_2 =_E \sigma q$.

proof: Same as the proof of E-critical pairs lemma 1 in [Jouannaud 83]. $\Box$
As previously, extended pairs define a closure operation:

Definition 24: We say that a set R of rules is closed with respect to a set E of equations under Extended Pair operation iff for any rule $l \to r$ of R and any equation g=d of E:
• for any (p,q) in $SEP(l \to r, g \to d) \cup SEP(l \to r, d \to g)$, $p \to_{R,E} p'$ at the top and $p' \to^{*}_R \circ =_E q$.
Let EP(R,E) be the smallest set of rules that contains R and is closed under Extended Pair operation. $\Box$

As previously, p must be rewritten at least once. This is the case if $p \to q$ is in R.

Theorem 25: Let R be a set of rules and E a set of linear equations. Then R is E-terminating if there exist
• a simplification ordering > such that $t > \sigma(d)$ for any rule $g \to d$ in R and any substitution $\sigma$ such that $t =_E \sigma(g)$.
• a reduction ordering >' that contains > and such that $t >' \sigma(q)$ for any extended pair (p,q) in EP(R,E) and any substitution $\sigma$ such that $t =_E \sigma(p)$.
proof: The proof is nearly the same as the proof of theorem 22, except that $\to_R$ or $\to_{CP(R,E)}$ reductions are replaced by $\to_{R,E}$ or $\to_{EP(R,E),E}$ reductions. Notice that the starting relations contained in these orderings are no longer the instances of rules, since E-equalities can affect the instance of the left member of a rule. Thus, by commutation with the operations of the algebra, these orderings contain the relations $\to_{R,E}$ or $\to_{EP(R,E),E}$. Accordingly, the lemma that we have to prove states that $\to^{+}_{EP(R,E),E}$ is E-commuting with $\to_{EP(R,E),E}$. As previously, the proof is by induction and then by case analysis for the n=1 case. We use the same notations, but now $t \to_{EP(R,E),E} t_2$ and $t/\mu =_E \sigma(l)$.

The case where $\lambda$ and $\mu$ are disjoint is once more straightforward. If $\mu$ is a prefix of $\lambda$, then $t_1/\mu =_E t/\mu =_E \sigma(l)$. Therefore $t_1 \to_{EP(R,E),E} t_2$ and there is no need of extended pairs for this case. If $\lambda$ is a prefix of $\mu$, let us say $\mu = \lambda\mu'$, we distinguish two cases:
• If $\mu'$ belongs to $\mathcal{G}(g)$, then by the extended pair lemma, there exist a substitution $\sigma$ and an extended pair (p,q) such that $t_1 =_E \sigma p$ and $t_2 =_E \sigma q$. If $p \to q$ is in EP(R,E), then $t_1 \to_{EP(R,E),E} \sigma q =_E t_2$. If $p \to_{EP(R,E),E} p'$ at the top and $p' \to^{*}_{EP(R,E),E} \circ =_E q$, then $t_1 \to_{EP(R,E),E} \sigma(p') \to^{*}_{EP(R,E),E} \circ =_E \sigma q =_E t_2$.
• If $\mu'$ does not belong to $\mathcal{G}(g)$, then the diagram commutes, provided equations are linear. $\Box$
This last result provides an easy way of proving E-termination, provided the set of extended pairs is finite. Let us show that it is actually the case for the important case of associative commutative theories.

Theorem 26: The set EP(R,AC) is finite. More precisely: $EP(R,AC) = R \cup \{f(x,l) \to f(x,r) \mid \#(x,l) = 0$ and f is the top function symbol of l$\}$.

This result is already proved in [Peterson and Stickel 81], because it is one of the foundations of their AC-completion algorithm. By the way, we can see that extended pairs ensure the commutation property as well as the compatibility property: in fact they ensure an even stronger property, their common instance.

Example 4: The same previous example can be reused with rewriting modulo associativity and commutativity. Rules $x+0 \to x$ and $x*1 \to x$ can now be discarded, since they are associative commutative instances of others. We have now exactly three extended pairs: $(x+y)*(z*z') \to ((x*z)+(y*z))*z'$, $(x'*(x+y))*z \to x'*((x*z)+(y*z))$ and $x'*((x+y)*z) \to ((x'*x)+(x'*y))*z$. To prove E-termination of the starting set R of rules, we first prove termination of the relation R,E. This can be done by taking the following lexicographic ordering: t [...] that orients the extended rules. This can be done lexicographically with an ordering that counts the number of + symbols, each one with a multiplicity $2^h$, where h is its depth in the tree associated with the term.
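The shape of EP(R,AC) given by Theorem 26 suggests a direct construction: each rule whose left member is headed by an AC symbol f receives one extension f(x,l) → f(x,r) with a variable x that does not occur in l. The Python sketch below illustrates this under assumptions of its own: the term encoding, the fresh-variable naming scheme and the names `ac_extension` and `ep_ac` are hypothetical and are not taken from [Peterson and Stickel 81].

```python
# Terms: variables are str, applications are tuples (symbol, args...).  Assumed encoding.

def variables(t):
    """Set of variables occurring in t."""
    if isinstance(t, str):
        return {t}
    out = set()
    for a in t[1:]:
        out |= variables(a)
    return out

def fresh_variable(avoid, base="w"):
    """A variable name not in avoid (naming scheme is arbitrary)."""
    i = 0
    while f"{base}{i}" in avoid:
        i += 1
    return f"{base}{i}"

def ac_extension(rule, ac_symbols):
    """f(x,l) -> f(x,r) when the top symbol f of l is AC, else None."""
    l, r = rule
    if isinstance(l, str) or l[0] not in ac_symbols:
        return None
    x = fresh_variable(variables(l) | variables(r))   # guarantees #(x,l) = 0
    return ((l[0], x, l), (l[0], x, r))

def ep_ac(rules, ac_symbols):
    """R together with one extension per rule headed by an AC symbol."""
    exts = [ac_extension(rule, ac_symbols) for rule in rules]
    return list(rules) + [e for e in exts if e is not None]

DIST = (("*", ("+", "x", "y"), "z"), ("+", ("*", "x", "z"), ("*", "y", "z")))
EXP0 = (("Exp", ("0",)), ("1",))
EXTENDED = ep_ac([DIST, EXP0], {"+", "*"})
```

With these two rules, only the distributivity rule is extended, since Exp is not an AC symbol, so the resulting set is finite by construction.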
7 Conclusion

This work is an attempt to clarify the E-termination problem and show how it is related to the classical termination problem and the E-commutation property that already arose in [Dershowitz, Hsiang, Josephson and Plaisted 83] in a rather magic way. Rather than providing new orderings to solve the problem, we give here some interesting ways of addressing it. In addition, we have shown precisely what problems must be addressed before providing practical and efficient tools for E-termination proofs.

Let us note that we can imagine using mixed rewriting relations by splitting the whole set R of rules into a first subset Rl of left linear ones and a second subset Rnl, which must contain all non left linear rules of R and maybe some left linear rules too. This technique was used in [Jouannaud 83, Jouannaud and Kirchner 84] to improve the efficiency of rewritings as well as the Knuth and Bendix completion. It is clear from the previous proofs that our results can be easily adapted to this case.

Finally, a last question arises: how to design a completion algorithm for mixed sets of rules and equations that is based on termination of the rewriting relation instead of E-termination? First, we must ensure the E-commutation property instead of the E-coherence property as in [Jouannaud and Kirchner 84]. This can be achieved by systematically adding extended pairs each time an equation and a rule overlap. This is however not sufficient to conclude that E-termination follows from termination all along the completion process, since the E-commutation property will be ensured at the end of the completion and not during the course of the algorithm. On the other hand, E-commutation will be true at least for those rules whose critical pairs are already processed, and we can expect that this is sufficient for the completion process to be sound.
Acknowledgments: The authors acknowledge Nachum Dershowitz, Jieh Hsiang and Pierre Lescanne for fruitful discussions about this problem and Helene Kirchner and Jose Meseguer for carefully reading this draft and suggesting many improvements.
References

[Choque 83] Choque, G. Calcul d'un ensemble complet d'incrementations minimales pour l'ordre recursif de decomposition. Technical Report, CRIN, Nancy, France, 1983.

[Dershowitz 82] Dershowitz, N. Orderings for Term-rewriting Systems. Journal of Theoretical Computer Science 17(3):279-301, 1982. Preliminary version in 20th FOCS, 1979.

[Dershowitz, Hsiang, Josephson and Plaisted 83] Dershowitz, N., Hsiang, J., Josephson, N. and Plaisted, D. Associative-Commutative Rewriting. Proceedings 10th IJCAI, 1983.

[Huet 80] Huet, G. Confluent Reductions: Abstract Properties and Applications to Term Rewriting Systems. Journal of the Association for Computing Machinery 27:797-821, 1980. Preliminary version in 18th FOCS, IEEE, 1977.

[Huet and Oppen 80] Huet, G. and Oppen, D. Equations and Rewrite Rules: A Survey. In Book, R. (editor), Formal Language Theory: Perspectives and Open Problems. Academic Press, 1980.

[Jouannaud 83] Jouannaud, J.P. Church-Rosser Computations with Equational Term Rewriting Systems. Technical Report, CRIN, Nancy, France, January, 1983. Preliminary version in Proc. 5th CAAP, 1983, to appear in Springer Lecture Notes in Computer Science.

[Jouannaud and Kirchner 84] Jouannaud, J.P. and Kirchner, H. Completion of a set of rules modulo a set of Equations. Technical Report, SRI-International, 1984. Preliminary version in Proceedings 11th ACM POPL Conference, Salt Lake City, 1984, submitted to the SIAM Journal of Computing.

[Jouannaud and Lescanne 82] Jouannaud, J.P. and Lescanne, P. On Multiset Ordering. Information Processing Letters 15(2), 1982.

[Jouannaud, Kirchner C. and H. 81] Jouannaud, J.P., Kirchner, C. and Kirchner, H. Algebraic manipulations as a unification and matching strategy for equations in signed binary trees. Proceedings 7th International Joint Conference on Artificial Intelligence, Vancouver, 1981.

[Jouannaud, Lescanne and Reinig 83] Jouannaud, J.-P., Lescanne, P. and Reinig, F. Recursive Decomposition Ordering. In IFIP Working Conference on Formal Description of Programming Concepts II. North-Holland, 1983, edited by D. Bjorner.

[Kamin and Levy 80] Kamin, S. and Levy, J.J. Attempts for generalizing the recursive path ordering. 1980. Unpublished draft.

[Kirchner C. and H. 82] Kirchner, C. and Kirchner, H. Resolution d'equations dans les Algebres libres et les varietes equationelles d'Algebres. PhD thesis, Universite Nancy I, 1982.

[Knuth and Bendix 70] Knuth, D. and Bendix, P. Simple Word Problems in Universal Algebra. In J. Leech (editor), Computational Problems in Abstract Algebra. Pergamon Press, 1970.

[Lankford and Ballantyne 77a] Lankford, D. and Ballantyne, A. Decision Procedures for Simple Equational Theories with Commutative Axioms: Complete Sets of Commutative Reductions. Technical Report, Univ. of Texas at Austin, Dept. of Mathematics and Computer Science, 1977.

[Lankford and Ballantyne 77b] Lankford, D. and Ballantyne, A. Decision Procedures for Simple Equational Theories with Permutative Axioms: Complete Sets of Permutative Reductions. Technical Report, Univ. of Texas at Austin, Dept. of Mathematics and Computer Science, 1977.

[Lankford and Ballantyne 77c] Lankford, D. and Ballantyne, A. Decision Procedures for Simple Equational Theories with Associative Commutative Axioms: Complete Sets of Associative Commutative Reductions. Technical Report, Univ. of Texas at Austin, Dept. of Mathematics and Computer Science, 1977.

[Lescanne 83] Lescanne, P. Computer Experiments with the REVE Term Rewriting Systems Generator. In Proceedings, 10th POPL. ACM, 1983.

[Lescanne 84] Lescanne, P. How to prove termination? An approach to the implementation of a new Recursive Decomposition Ordering. Proceedings 6th CAAP, Bordeaux, 1984.

[Manna and Ness 70] Manna, Z. and Ness, N. On the Termination of Markov Algorithms. Proceedings of the third Hawaii Conference on System Sciences, 1970.

[Munoz 83] Munoz, M. Probleme de terminaison finie des systemes de reecriture equationnels. PhD thesis, Universite Nancy 1, 1983.

[Peterson and Stickel 81] Peterson, G. and Stickel, M. Complete Sets of Reductions for Some Equational Theories. JACM 28:233-264, 1981.

[Plaisted 78] Plaisted, D. A Recursively Defined Ordering for Proving Termination of Term Rewriting Systems. Technical Report R-78-943, University of Illinois, Computer Science Department, 1978.

[Plaisted 83] Plaisted, D. An associative path ordering. Technical Report, University of Illinois, Computer Science Department, 1983.
ASSOCIATIVE-COMMUTATIVE UNIFICATION

François Fages

CNRS, LITP, 4 place Jussieu, 75221 Paris Cedex 05,
INRIA, Domaine de Voluceau, Rocquencourt, 78153 Le Chesnay.

ABSTRACT

Unification in equational theories, that is solving equations in varieties, is of special relevance to automated deduction. Recent results in term rewriting systems, as in [Peterson and Stickel 81] and [Hsiang 82], depend on unification in presence of associative-commutative functions. Stickel [75,81] gave an associative-commutative unification algorithm, but its termination in the general case was still questioned. Here we give an abstract framework to present unification problems, and we prove the total correctness of Stickel's algorithm. The first part of this paper is an introduction to unification theory. The second part is devoted to the associative-commutative case. The algorithm of Stickel is defined in ML [Gordon, Milner and Wadsworth 79] since, in addition to being an effective programming language, ML is a precise and concise formalism close to the standard mathematical notations. The proof of termination and completeness is based on a relatively simple measure of complexity for associative-commutative unification problems.

1. Unification in equational theories

1.1. Equational theories
We assume well known the concept of an algebra $\mathcal{A} = \langle A, F \rangle$ with A a set of elements (the carrier of $\mathcal{A}$) and F a family of operators, given with their arities. More generally, we may consider heterogeneous algebras over some set of sorts, but all the notions considered here carry over to sorted algebras without difficulty, and so we will forget sorts and even arities for simplicity of notation. With this provision, all our definitions are consistent with [Huet and Oppen 80].

We denote by T(F) the set of (ground) terms over F. We assume that there is at least one constant (operator of arity 0) in F so that this set is not empty. We also assume the existence of a denumerable set of variables $\mathcal{V}$, disjoint from F, and denote by $T(F, \mathcal{V})$ the set of terms with variables over F and $\mathcal{V}$. When F and $\mathcal{V}$ are clear from the context, we abbreviate $T(F, \mathcal{V})$ as T and T(F) as G (for ground). We denote terms by M, N, ..., and write V(M) for the set of variables appearing in M. We denote by $\mathcal{T}$ (resp. $\mathcal{G}$) the algebra with carrier T (resp. G) and with operators the term constructors corresponding to each operator of F.

The substitutions are all mappings from $\mathcal{V}$ to T, extended to T as endomorphisms of $\mathcal{T}$. We denote by S the set of all substitutions. If $\sigma \in S$ and $M \in T$, we denote by $\sigma M$ the application of $\sigma$ to M. Since we are only interested in substitutions for their effect on terms, we shall generally assume that $\sigma x = x$ except on a finite set of variables $D(\sigma)$, which we call the domain of $\sigma$ by abuse of notation. Such substitutions can then be represented by the finite set of pairs $\{\langle x, \sigma x \rangle \mid x \in D(\sigma)\}$. We define the range $R(\sigma)$ of $\sigma$ as:

$R(\sigma) = \bigcup_{x \in D(\sigma)} V(\sigma x)$.

We say that $\sigma$ is ground iff $R(\sigma) = \emptyset$. The composition of substitutions is the usual composition of mappings: $(\sigma \circ \rho) x = \sigma(\rho x)$. And we say that $\sigma$ is more general than $\rho$: $\sigma \leq \rho$ iff $\exists \eta \;\; \eta \circ \sigma = \rho$.

An equation is a pair of terms M=N. Let E be a set of equations (axioms); we define the equational theory presented by E as the finest congruence over $\mathcal{T}$ containing all pairs $\sigma M = \sigma N$ for M=N in E and $\sigma$ in S. It is denoted by $=_E$. An equational theory presented by E is axiomatic iff E is finite or recursive. An algebra $\mathcal{A}$ is a model of an equation M=N if and only if $\nu M = \nu N$ as elements of A, for every assignment $\nu$ (i.e. mapping from $\mathcal{V}$ to A extended as a morphism from $\mathcal{T}$ to $\mathcal{A}$). We write $\mathcal{A} \models M=N$. $\mathcal{A}$ is a model of an equational theory E iff $\mathcal{A} \models M=N$ for every equation M=N in E. We denote by $\mathcal{M}(E)$ the class of models of E, which we call the variety defined by E.

E-equality in T is extended to substitutions by extensionality:

$\sigma =_E \rho$ iff $\forall x \in \mathcal{V} \;\; \sigma x =_E \rho x$.

We write for any set of variables V:

$\sigma =_E^V \rho$ iff $\forall x \in V \;\; \sigma x =_E \rho x$.

In the same way, $\sigma$ is more general than $\rho$ in E over V:

$\sigma \leq_E^V \rho$ iff $\exists \eta \;\; \eta \circ \sigma =_E^V \rho$.

The corresponding equivalence relation on substitutions is denoted by $\equiv_E^V$; i.e. $\sigma \equiv_E^V \rho$ iff $\sigma \leq_E^V \rho$ and $\rho \leq_E^V \sigma$. We will omit V when $V = \mathcal{V}$, and E when $E = \emptyset$.
1.2. E-unification

1.2.1. Historical

Let E be an equational theory. A substitution $\sigma$ is an E-unifier of terms M and N if and only if $\sigma M =_E \sigma N$. Hilbert's tenth problem (solving polynomial equations over the integers, called Diophantine equations) is the unification problem in arithmetic. Livesey, Siekmann, Szabo and Unvericht [79] have proved that Associative-Distributive unification is undecidable, and thus that the undecidability of Hilbert's tenth problem [Matiyasevich 70, Davis 73] does not rely on a specific property of integers. We denote by $U_E$ the set of all E-unifiers of M and N:

$$U_E(M,N) = \{\sigma \in S \mid \sigma M =_E \sigma N\}.$$

Axiomatic equational theories are semi-decidable, and $U_E$ is always recursively enumerable, but of course we are mostly interested in a generating set of the E-unifiers (called Complete Set of E-Unifiers by Plotkin [72], and denoted by $CSU_E$), from which we can generate $U_E$ by instantiations, or better in a basis of $U_E$ (called Complete Set of Minimal Unifiers and denoted by $\mu CSU_E$) satisfying the minimality condition $\sigma \leq_E^V \sigma' \Rightarrow \sigma = \sigma'$.
So we shall distinguish between unification procedures, which enumerate a $CSU_E$ (the exhaustive enumeration procedure in semi-decidable theories enumerates $U_E$ entirely), unification algorithms, which always terminate with a finite $CSU_E$, empty if the terms are not unifiable, and minimal unification procedures or algorithms, which compute a $\mu CSU_E$. Unification was first studied in first order languages (the case $E = \emptyset$) by Herbrand [30]. In his thesis he gave an explicit algorithm to compute a most general unifier. However the notion of unification really grew out of the work of the researchers in automatic theorem-proving, since the unification algorithm is the basic mechanism needed to explain the mutual interaction of inference rules. Robinson [65] gave the algorithm in connection with the resolution rule, and proved that it indeed computes a most general unifier. Independently, Guard [64] presented unification in various systems of logic. Unification is also central in the treatment of equality [Robinson and Wos 69, Knuth and Bendix 70]. Implementation and complexity analysis of unification is discussed in [Robinson 71], [Venturini-Zilli 75], [Huet 76], [Baxter 77], [Paterson and Wegman 78] and [Martelli and Montanari 82]. Paterson and Wegman give a linear algorithm to compute the most general unifier. First order unification was extended to infinite (regular) trees by Huet [76], who showed that a single most general unifier exists for this class, computable by an almost linear algorithm. This problem is relevant to the implementation of PROLOG-like programming languages [Colmerauer 72,82, Fages 83]. In the context of higher order logic, the problem of unification was studied by Gould [66], who defined "general matching sets" of terms, a weaker notion than that of CSU. The existence of a unifier is shown to be undecidable in third order languages in [Huet 73], and at second order by Goldfarb [81].
The general theory of CSU's and $\mu$CSU's in the context of higher order logic is studied in [Huet 76, Jensen and Pietrzykowski 77]. Unification in equational theories was first studied by Plotkin [72] in the context of resolution theorem provers, in order to build the underlying equational theory into the rules of inference. In this paper Plotkin conjectured that there existed an equational theory E where a $\mu CSU_E$ did not always exist. Theorem 1 in the next chapter proves this conjecture. Further interest in unification in equational theories arose from the problem of implementing programming languages with "call by patterns", such as QA4 [Rulifson 72]. Associative unification (finding solutions to word equations) is a particularly hard problem. Plotkin [72] gives a procedure to enumerate a (possibly infinite) $\mu CSU_A$, and Makanin [77] shows that the word equation problem is decidable. Stickel [75,81] and, independently, Livesey and Siekmann [76,79] give an algorithm for unification in the presence of associative-commutative operators. However its termination in the general case was still questioned, since some recursive calls are made on terms having a bigger size than the initial terms. Theorems 3 and 4 in this paper prove the termination and the completeness in the general case. Siekmann [78] studied the general problem in his thesis, especially the extension of the AC-unification algorithm to idempotence and identity. Lankford [79,83] gave the extension to a unification procedure in Abelian group theory.
In the class of equational theories for which there exists a canonical term rewriting system (see [Huet and Oppen 80]), Fay [79] gives a universal procedure to enumerate a $CSU_E$. It is based on the notion of "narrowing", as defined in [Slagle 74]. Hullot [80] gives a similar procedure and a sufficient termination criterion, further generalized in [Jouannaud and Kirchner 83]. Siekmann and Szabo [82] investigate the domain of regular canonical term rewriting systems in order to find general minimal unification procedures, but we show in [Fages and Huet 83] that even in this framework a $\mu CSU_E$ may not exist. Termination or minimality of unification procedures is much harder to obtain than completeness. However the main applications of unification in equational theories to the generalizations of the Knuth and Bendix algorithm, such as in [Peterson and Stickel 81] and [Hsiang 82], are covered by the associative-commutative unification algorithm.

1.2.2. Definitions

Let $M, N \in T$, $V = V(M) \cup V(N)$ and W be a finite set of "protected variables". S is a Complete Set of E-Unifiers of M and N away from W if and only if:

a) $\forall \sigma \in S \;\; D(\sigma) \subseteq V$ and $R(\sigma) \cap W = \emptyset$    (purity)
b) $S \subseteq U_E(M,N)$    (correctness)
c) $\forall \rho \in U_E(M,N) \;\; \exists \sigma \in S \;\; \sigma \leq_E^V \rho$    (completeness)

Furthermore S is a Complete Set of Minimal E-Unifiers of M and N away from W if additionally:

d) $\forall \sigma, \sigma' \in S \;\; \sigma \neq \sigma' \Rightarrow \sigma \not\leq_E^V \sigma'$    (minimality)
Remark that the most general unifiers in first order languages are $\mu CSU_\emptyset$ reduced to one element. The reason to consider W is that in many algorithms, unification must be performed on subterms, and it is necessary to separate the variables introduced by unification from the variables of the context. It is the case for instance for resolution in equational theories [Plotkin 72], and for the generalization of the Knuth and Bendix completion procedure in congruence classes of terms [Peterson and Stickel 81]. It is easy to show that there always exists a $CSU_E$ away from W, by taking all E-unifiers satisfying a). We may add to the definition of $CSU_E$:

d') $\forall \sigma, \sigma' \in S \;\; \sigma \neq \sigma' \Rightarrow \sigma \not\equiv_E^V \sigma'$    (non-congruency)

Such $CSU_E$ still always exist, but we lose the property that if $U_E$ is recursively enumerable then there exists a recursively enumerable one. For example, in undecidable axiomatic equational theories $U_E$ is recursively enumerable but in general the $CSU_E$ satisfying d') are not.

1.2.3. Existence of basis of the E-unifiers

It is well known that there may not exist a finite $CSU_E$. For instance a*x = x*a in the theory where * is associative [Plotkin 72]. When there exists a finite $CSU_E$, there always exists a minimal one, by filtering out redundant elements. But it is not true in general:

Theorem 1 (non-existence of basis): In some first order equational theory E there exist E-unifiable terms for which there is no $\mu CSU_E$.
Proof: The proof is in [Fages and Huet 83], where it is shown that there may be infinite strings of E-unifiers more and more general, and thus that minimality d) may be incompatible with completeness c). The example is $E = \{f(0,x) = x,\; g(f(x,y)) = g(y)\}$, for terms g(x) and g(a). ∎

However when a $\mu CSU_E$ exists, it is unique up to $\equiv_E^V$.

Theorem 2 (unicity of basis): Let M and N be two terms, $U_1$ and $U_2$ be two $\mu CSU_E$ of M and N. There exists a bijection $\varphi: U_1 \rightarrow U_2$ such that $\forall \sigma \in U_1 \;\; \sigma \equiv_E^V \varphi(\sigma)$.

Proof: $\forall \sigma \in U_1 \;\; \exists \rho \in U_2 \;\; \rho \leq_E^V \sigma$ since $U_2$ is complete. We pick one such $\rho$ as $\varphi(\sigma)$. $\forall \sigma' \in U_2 \;\; \exists \rho' \in U_1 \;\; \rho' \leq_E^V \sigma'$. We pick one such $\rho'$ as $\psi(\sigma')$. Thus $\forall \sigma \in U_1 \;\; \psi(\varphi(\sigma)) \leq_E^V \sigma$, so $\psi(\varphi(\sigma)) = \sigma$ by minimality, i.e. $\sigma \leq_E^V \varphi(\sigma) \leq_E^V \sigma$, i.e. $\sigma \equiv_E^V \varphi(\sigma)$. ∎
2. ML as a programming language for the term structure

ML is an applicative language with exceptions, in the line of ISWIM [Landin 66], POP2 [Burstall, Collins and Popplestone 71] and GEDANKEN [Reynolds 70]. Functions are "first class citizens"; the type inference mechanism allows functionals at any order, and polymorphic operators. Primitive types are void, bool, int and string, and type operators are cartesian product ×, sum + and function →. Type variables are denoted by *, **, ... The declaration of an abstract type consists in the definition of constructors and external functions. For instance, the polymorphic list type, * list = void + (* × * list), is predefined with the constructors nil and cons (noted . in infix), and the destructors hd: * list → * and tl: * list → * list. Variables can be structured in lists and pairs; bindings are then made by matching. We refer to the LCF manual [Gordon, Milner and Wadsworth 79] for the syntax and semantics of ML.

The composition of functions is an infix operator o: (** → ***) × (* → **) → * → ***. It could be defined by:

    mlinfix `o`;;
    let (f o g) x = f (g x);;

The curried function map: (* → **) → * list → ** list returns the list of the function applications to a list of arguments. It is equivalent to:

    letrec map f l = if l=nil then nil else f (hd l) . (map f (tl l));;

Particularly important is itlist: (* → ** → **) → * list → ** → **, which iterates the application of a function to a list of arguments, by composing the results:

    itlist f [l1; ...; ln] x = (f l1 (f l2 ... (f ln x) ...)) = ((f l1) o (f l2) o ... o (f ln)) x.

It is defined by:

    letrec itlist f l x = if l=nil then x else f (hd l) (itlist f (tl l) x);;

For example flatten: * list list → * list, which eliminates the first level parentheses in a list, can be defined by:

    let flatten l = itlist append l nil;;

We generalize itlist to the iterative application of a curried function on two lists of arguments, with:

    letrec itlist2 f k l x = if k=nil then x else f (hd k) (hd l) (itlist2 f (tl k) (tl l) x);;

We will not detail the abstract types related to the term structure; we keep the previous notations related to them, and define the function op: T → F, which returns the head symbol of a term, and largs: T → T list, which returns the list of the arguments.

    abstype F = ... ;;
    abstype V = ... ;;
    absrectype T = F × T list + V
      with ... and op M = ... and largs M = ... and isvar M = ... and equal(M,N) = ... ;;
    lettype S = T → T;;
    mlinfix `←`;;
    letrec x←M = ... ;;

Since substitutions are functions from terms to terms, the composition of substitutions is exactly the composition of functions o, and the identity function I serves as the identity substitution. If we suppose defined the function occurs: T×T → bool, which recognizes if a variable occurs in a term, the unification algorithm of Robinson [65] may be defined by the function uni: T×T → S with:

    letrec uni(M,N) =
      if isvar M then if equal(M,N) then I
                      else if occurs(M,N) then fail else M←N
      else if isvar N then if occurs(N,M) then fail else N←M
      else if op M ≠ op N then fail
      else (itlist2 unicompound (largs M) (largs N) I)
    and unicompound A B σ = uni(σ A, σ B) o σ;;
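For readers without access to an LCF ML system, the definition of uni can be transcribed almost line for line. The following is a sketch in Python under stated assumptions: terms are encoded as strings (variables) and tuples headed by the operator name, substitutions are dictionaries rather than functions, ML's `fail` becomes a hypothetical `UnificationFailure` exception, and an arity check is added because this encoding does not fix arities.

```python
class UnificationFailure(Exception):
    """Stands in for ML's `fail` (a name assumed for this sketch)."""

def is_var(t):
    return isinstance(t, str)

def apply(sigma, t):
    if is_var(t):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, a) for a in t[1:])

def occurs(x, t):
    """Does the variable x occur in the term t?"""
    if is_var(t):
        return x == t
    return any(occurs(x, a) for a in t[1:])

def compose(sigma, rho):
    """(sigma o rho) x = sigma (rho x), on dictionary substitutions."""
    result = {x: apply(sigma, t) for x, t in rho.items()}
    for x, t in sigma.items():
        result.setdefault(x, t)
    return result

def itlist2(f, k, l, x):
    """f k1 l1 (f k2 l2 (... (f kn ln x) ...)), as in the ML definition."""
    if not k:
        return x
    return f(k[0], l[0], itlist2(f, k[1:], l[1:], x))

def uni(m, n):
    """Most general unifier of m and n, or UnificationFailure."""
    if is_var(m):
        if m == n:
            return {}                      # the identity substitution I
        if occurs(m, n):
            raise UnificationFailure()
        return {m: n}                      # M <- N
    if is_var(n):
        if occurs(n, m):
            raise UnificationFailure()
        return {n: m}                      # N <- M
    if m[0] != n[0] or len(m) != len(n):   # arity check: an addition of this sketch
        raise UnificationFailure()

    def unicompound(a, b, sigma):
        return compose(uni(apply(sigma, a), apply(sigma, b)), sigma)

    return itlist2(unicompound, list(m[1:]), list(n[1:]), {})
```

For instance, unifying f(x, g(y)) with f(g(a), x) yields a substitution sending x to g(a) and y to a. Because itlist2 folds from the right, the last pair of arguments is unified first, exactly as in the ML text.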
3. AC-unification

3.1. Connection with the solving of linear homogeneous diophantine equations

A binary function + is associative and commutative iff it satisfies (in infix notation):

$$x+y = y+x, \qquad (x+y)+z = x+(y+z).$$

The set of those function symbols is denoted by $F_{AC}$. We will not consider them as symbols of variable arity, in order to stay in the term algebras formalism, but we define the function largAC: T → T list which returns the list of the AC-arguments, that is the list of the arguments after the elimination of parentheses on the head symbol if it is AC. For example with M = (x+(x+y))+(f(a+(a+a))+(b+c)) where $+ \in F_{AC}$, we have op M = +, largs M = [x+(x+y); f(a+(a+a))+(b+c)] and largAC M = [x; x; y; f(a+(a+a)); b; c].

Let M and N be two terms to be unified, beginning with the same AC head symbol. Stickel [81] proved that the elimination of common arguments, pair by pair, does not change $U_{AC}(M,N)$. For example the unification of M = (x+(x+y))+(f(a+(a+a))+(b+c)) and N = ((b+b)+(b+z))+c, where $+ \in F_{AC}$, is equivalent to the unification of the argument lists [x; x; y; f(a+(a+a))] and [b; b; z], obtained by eliminating the common arguments b and c. If a variable is eliminated in this operation, it must be added to the set of context variables W.

Unification of (non-ordered) lists of arguments is the problem of solving equations in the free abelian semigroup, which is isomorphic to the solving of homogeneous linear diophantine equations $\sum_i a_i x_i = \sum_j b_j y_j$ over $\mathbb{N} - \{0\}$. Thus to any AC-unification problem we associate such an equation, by associating integer variables to distinct arguments, with their multiplicity as coefficient. For instance $2x_1 + x_2 + x_3 = 2y_1 + y_2$ in the example. In return the solutions to the diophantine equation induce the unifiers of the list elements, which are still to be unified with the effective arguments.

Stickel [76] and Huet [78] give an algorithm to solve homogeneous linear diophantine equations, which enumerates a basis of solutions by backtracking with a certain bound on the value of the variables, and elimination of the redundant solutions. The best bound is double:

1) $\forall i \;\; x_i \leq \max_k b_k$ and $\forall j \;\; y_j \leq \max_l a_l$

2) $\forall i,j \;\; x_i \leq \dfrac{\mathrm{lcm}(a_i,b_j)}{a_i}$ or $y_j \leq \dfrac{\mathrm{lcm}(a_i,b_j)}{b_j}$

The basis of solutions can be represented as a matrix with as many columns as variables in the equation, and as many rows as solutions in the basis. For example the equation $2x_1 + x_2 + x_3 = 2y_1 + y_2$ admits as solution basis the matrix:
         x1  x2  x3  y1  y2
    s1    0   1   1   1   0
    s2    0   0   2   1   0
    s3    0   2   0   1   0
    s4    1   0   0   1   0
    s5    0   0   1   0   1
    s6    0   1   0   0   1
    s7    1   0   0   0   2

Any linear combination of the seven elements of the basis is a solution to the equation. However, because of the absence of a zero in the unification problem, we must consider all subsets of the solution basis, with the constraints that the sum of the coefficients in a column must be non null, and equal to 1 if the corresponding term is not a variable. Hullot [80] gives a method for a constrained enumeration of partitions, in order to reduce the high complexity of this computation. We refer to his thesis for details of this technique, which is crucial in an implementation. In the example, among the $2^7 = 128$ subsets, only 6 are to be considered. Let us examine $\{s_4, s_5, s_6, s_7\}$:

$$x_1 = s_4 + s_7, \quad x_2 = s_6, \quad x_3 = s_5, \quad y_1 = s_4, \quad y_2 = s_5 + s_6 + s_7.$$

We deduce a unifier from it, by associating to each solution $s_j$ a new variable $u_j$:

$$\sigma = \{x \leftarrow b + u_7,\; z \leftarrow f(a+(a+a)) + (y + (u_7 + u_7))\}.$$

[...] (∈ T list list) expresses each solution attached to a partition as a list of terms built on f and new variables, that are still to unify with the effective arguments in lA.
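The basis and the count of admissible subsets in this example can be checked mechanically. The sketch below is in Python and deliberately naive: it enumerates all candidate vectors inside bound 1) and filters the minimal ones, instead of using the backtracking algorithm of Stickel and Huet or Hullot's constrained enumeration. The function names `minimal_solutions` and `valid_subsets` are assumptions of this illustration.

```python
from itertools import combinations, product

def minimal_solutions(a, b):
    """Basis of sum(a_i * x_i) = sum(b_j * y_j) over the naturals.

    Brute force within the bound x_i <= max(b), y_j <= max(a); the zero
    vector is excluded and only componentwise-minimal solutions are kept.
    """
    solutions = []
    for xs in product(range(max(b) + 1), repeat=len(a)):
        left = sum(ai * xi for ai, xi in zip(a, xs))
        for ys in product(range(max(a) + 1), repeat=len(b)):
            right = sum(bj * yj for bj, yj in zip(b, ys))
            if left == right and (any(xs) or any(ys)):
                solutions.append(xs + ys)

    def strictly_below(s, t):
        return s != t and all(si <= ti for si, ti in zip(s, t))

    return [t for t in solutions
            if not any(strictly_below(s, t) for s in solutions)]

def valid_subsets(basis, is_variable):
    """Subsets of the basis whose column sums are >= 1, and exactly 1
    in the columns that correspond to non-variable arguments."""
    columns = range(len(basis[0]))
    selected = []
    for size in range(1, len(basis) + 1):
        for subset in combinations(basis, size):
            sums = [sum(row[c] for row in subset) for c in columns]
            if all(s >= 1 if is_variable[c] else s == 1
                   for c, s in enumerate(sums)):
                selected.append(subset)
    return selected

# Columns x1, x2, x3, y1, y2 stand for x, y, f(a+(a+a)), b, z.
basis = minimal_solutions((2, 1, 1), (2, 1))
subsets = valid_subsets(basis, [True, True, False, False, True])
```

Here `basis` contains the seven rows of the matrix above (possibly in a different order), and `subsets` contains the six admissible subsets, among them the subset {s4, s5, s6, s7} examined in the text.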
We give the AC-unification algorithm as the ML definition of the function uniAC: T×T → S list, which returns as a list a finite $CSU_{AC}$ of its arguments, empty if they are not unifiable:

    letrec uniAC(M,N) =
      if isvar M then if equal(M,N) then [I]          |case 1|
                      else if occurs(M,N) then nil    |case 2|
                      else [M←N [...]

[...] by noetherian induction $\exists \lambda_{i-1} \in S \;\; \lambda_{i-1} \circ \sigma_{i-1} \circ \dots \circ \sigma_1 =_{AC}^{V} \rho$. But $\lambda_{i-1}\sigma_{i-1} \dots \sigma_1 M/i =_{AC} \lambda_{i-1}\sigma_{i-1} \dots \sigma_1 N/i$, so by induction $\exists \sigma_i \in U(\sigma_{i-1} \dots \sigma_1 M/i,\; \sigma_{i-1} \dots \sigma_1 N/i,\; W_i) \;\; \sigma_i \leq_{AC}^{V_i} \lambda_{i-1}$, with $V_i = V(\sigma_{i-1} \dots \sigma_1 M/i) \cup V(\sigma_{i-1} \dots \sigma_1 N/i)$. $D(\sigma_i) \subseteq V_i$, so $\sigma_i \leq_{AC} \lambda_{i-1}$ and $\sigma_i \circ \dots \circ \sigma_1 \leq_{AC}^{V} \rho$. Therefore $\sigma_k \circ \dots \circ \sigma_1 \leq_{AC}^{V} \rho$.

Case 7: Let lA = [M1, ..., Mn] and llS = [lS1, ..., lSp] be the solutions returned by …list (largAC M, largAC N, W). By isomorphism with the solving of the equation $\sum_i a_i x_i = \sum_j b_j y_j$ over $\mathbb{N} - \{0\}$, we prove with the same inductions as in the previous case that the AC-unifiers of the effective arguments in lA with the terms in the solutions lS in llS form a $CSU_{AC}(M,N,W)$. ∎
In general U(M,N,W) is not a $\mu CSU_{AC}$ of M and N, but because it is finite, it suffices to eliminate the redundant unifiers (by AC-matching) in a final pass to get one.
3.6. Extension to identity and idempotence

In [Fages 83] we describe the extension to a unification algorithm in the presence of operators that may be associative (A), commutative (C), idempotent (I), with unit (U), with the only restriction that an associative operator must be commutative. Unification of terms built over only one function symbol ACU or ACUI is studied in [Livesey and Siekmann 76,79]. These authors show that case 7 in the algorithm is simplified, since the presence of a unit permits considering not all subsets of the solution basis, but only the sum, and idempotence allows solving the associated equation over {0,1} rather than over the integers. However in the general case of unification in the presence of operators U or I, case 6 is no longer a fail case. For example if 1 is the unit of f, f(x,y) and g(a,b) are unifiable with the $CSU_U$ [...]

[...]
    ELSE [S := UNIFY-PAIR([...]);
          j := j + 1;
          WHILE j ≤ [...] DO [S := EXTEND-UNIFIERS(S, [...]); j := j + 1];
          RETURN S];
The procedure UNIFY-PAIR(e1, e2, γ, V, v') works by finding the unifiers for (x_i, y_i, γ_i, V, v'), where x_i and y_i are corresponding arguments of e1 and e2, respectively, when x_i ≠ y_i. Since all the unifiers for different pairs of arguments must agree on the variables of V and v', we can take the intersection of the sets of unifiers restricted to these variables. This yields a set of unifiers which can be extended to unifiers for the original problem.
UNIFY-PAIR(e1, e2, γ, V, v')

If either e1 or e2 is a first-order variable or constant, or e1 and e2 have different heads or different numbers of arguments, then return [...]

[...] f(g(h(a)), h(a)), γ(u) --> f(g(h(b)), h(b)), γ(v) --> f(g(c), c). In the first phase, we find four unifiers and record four variables.

[Listing not recoverable from the source: four unifiers, with the recorded variables v1 := v, v2 := w, v3 := u, v4 := v.]

Since [...], the call to UNIFY-PAIR will be UNIFY-PAIR(f(g(h(a)), h(a)), f(g(h(b)), h(b)), γ, {v, w}, u). The algorithm will recurse on the arguments and find that UNIFY-PAIR(g(h(a)), g(h(b)), γ_1, {v, w}, u) has unifiers
{{<λz•g(h(z)), γ_1>, <λz•g(z), γ_1>, [...]}} and that UNIFY-PAIR(h(a), h(b), γ_2, {v, w}, u) has the unifiers {{<λz•h(z), γ_2>, [...]}, {[...]}}.

[...] φ_1, φ_2, ..., φ_{k-1} such that φ_j(γ_j v') = γ_{j+1} v' for 1 ≤ j [...]

[...] COMPARE-TERMS-OF-SAME-DEPTH([arguments not recoverable from the source]), and if this returns "TRUE" we merge the rows. In any case, we set last-test to this result, and continue up the rows in this manner, until all the rows have been examined.

COMPARE-TERMS-OF-SAME-DEPTH(e1, e2, v, w, last-test, x1, x2) assumes that x1 and x2 have the same depth and that the result of comparing x1 and x2 is stored in last-test. It returns the result of comparing (λv•e1)x1 and (λw•e2)x2.
COMPARE-TERMS-OF-SAME-DEPTH(e1, e2, v, w, last-test, x1, x2)

There are six cases to consider:

1. e1 = v and e2 = w. Return the result of the last comparison (last-test).

2. e1 = v and e2 ≠ w. Do COMPARE-TERMS-OF-SAME-DEPTH+(e2, w, x1).

3. e1 ≠ v and e2 = w. Do COMPARE-TERMS-OF-SAME-DEPTH+(e1, v, x2).

4. Either e1 or e2 is another first-order variable, or a first-order constant. If e1 = e2, return "TRUE", otherwise return "FALSE".

5. e1 and e2 are not first-order variables or constants and have different heads or numbers of arguments. Return "FALSE".

6. e1 = f(x_1, x_2, ..., x_r) and e2 = f(y_1, y_2, ..., y_r). For 1 ≤ i ≤ r, call COMPARE-TERMS-OF-SAME-DEPTH(x_i, y_i, v, w, last-test, x1, x2). If any of these is "FALSE", return "FALSE", otherwise return "TRUE".

COMPARE-TERMS-OF-SAME-DEPTH+(e1, u, e2)

If e1 = u, return "FALSE" (in this instance, x1 and x2 occur in the larger terms at different heights, hence the larger terms cannot be equal);
else if e1 or e2 is a first-order variable or constant, return "TRUE" if e1 = e2, "FALSE" otherwise;
else if e1 and e2 have different heads or numbers of arguments, return "FALSE";
else, for 1 ≤ i ≤ r, call COMPARE-TERMS-OF-SAME-DEPTH+(x_i, u, y_i). If any of these is "FALSE", return "FALSE", otherwise return "TRUE".
By doing the intersection this way, whenever we compare two subterms, we are sure that one of them has never been involved in a comparison before. Thus the algorithm is linear. The full algorithm for any number of columns is the obvious extension of this.
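The two comparison procedures can be sketched as follows. This is an illustration in Python, not the author's LISP code: atoms (first-order variables and constants) are encoded as strings and compound terms as tuples headed by the function symbol, and the names `compare`, `compare_plus` and `naive_compare` are assumptions of the sketch. `naive_compare` performs the two substitutions explicitly and serves only as a reference to check against.

```python
def is_atom(t):
    return isinstance(t, str)

def compare_plus(e1, u, e2):
    """COMPARE-TERMS-OF-SAME-DEPTH+: compare e1, with the bound variable u
    standing for a term of the same depth as the term e2 was cut from."""
    if e1 == u:
        return False  # the two bound terms would sit at different heights
    if is_atom(e1) or is_atom(e2):
        return e1 == e2
    if e1[0] != e2[0] or len(e1) != len(e2):
        return False
    return all(compare_plus(a, u, b) for a, b in zip(e1[1:], e2[1:]))

def compare(e1, e2, v, w, last_test, x1, x2):
    """Result of comparing (lambda v. e1) x1 with (lambda w. e2) x2, given
    that x1 and x2 have the same depth and last_test == (x1 == x2)."""
    if e1 == v and e2 == w:                       # case 1
        return last_test
    if e1 == v:                                   # case 2
        return compare_plus(e2, w, x1)
    if e2 == w:                                   # case 3
        return compare_plus(e1, v, x2)
    if is_atom(e1) or is_atom(e2):                # case 4
        return e1 == e2
    if e1[0] != e2[0] or len(e1) != len(e2):      # case 5
        return False
    return all(compare(a, b, v, w, last_test, x1, x2)   # case 6
               for a, b in zip(e1[1:], e2[1:]))

def substitute(e, var, x):
    if e == var:
        return x
    if is_atom(e):
        return e
    return (e[0],) + tuple(substitute(a, var, x) for a in e[1:])

def naive_compare(e1, e2, v, w, x1, x2):
    """Reference check: build both instances and compare them."""
    return substitute(e1, v, x1) == substitute(e2, w, x2)
```

For example, with e1 = f(v, c), e2 = f(w, c), x1 = g(a) and x2 = g(b), the call returns the stored last-test (false here) without ever looking inside x1 or x2, which is the point of the procedure.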
5. Implementation. The algorithm has been encoded in UCI LISP on a DEC-20.
6. Acknowledgements. I would like to thank Dr. Woody Bledsoe, Larry Hines, Ernie Cohen, and Natarajan Shankar for their useful discussions of the problem. Also, I would like to thank the referees for their suggestions.
Proof of theorem: In the following, the term "constant function" means a lambda expression of the form $\lambda z.c$ where z does not appear in the term c. First, we need two lemmas:

Lemma 2: If $x \neq y$ and $f(x) = f(y)$, then f is a constant function.

The proof is by induction on the body of f.

Basis: Suppose that the body of f is atomic. Hence $f = \lambda z.z$ or $f = \lambda z.c$. In the first case, $x = f(x) = f(y) = y$, which is false. Therefore f is constant.

Induction step: Suppose that $f = \lambda z.h(r_1(z), \dots, r_m(z))$ and the lemma is true for $r_1, \dots, r_m$. Then $h(r_1(x), \dots, r_m(x)) = f(x) = f(y) = h(r_1(y), \dots, r_m(y))$. Thus, for j in [1,m], $r_j(x) = r_j(y)$ and by induction $r_j$ is a constant function, hence f is also constant.

Lemma 3: If $f(x) = g(x)$, $f(y) = g(y)$, and $f(x) \neq f(y)$, then $f = g$.

Proof of lemma 3:

Basis: Suppose that the body of f is atomic. Then either $f = \lambda z.z$ or $f = \lambda z.c$ where c is a constant other than z. The second case is impossible, since then $f(x) = c = f(y)$.

Basis: Suppose that the body of g is atomic. Either $g = \lambda z.z$ or $g = \lambda z.d$. In the second case $f(x) = g(x) = d = g(y) = f(y)$, which is false. Thus $f = g$.

Induction step: Suppose that $g = \lambda z.h(g_1(z), \dots, g_n(z))$. Then $x = f(x) = g(x) = h(g_1(x), \dots, g_n(x))$. Hence for i in [1,n], $g_i(x)$ is a subterm of x. Therefore $g_i$ must be a constant function for i in [1,n]. Thus g is a constant function and a contradiction arises.

Induction step: Suppose that $f = \lambda z.h(f_1(z), \dots, f_n(z))$ and that the lemma is true for $f_1, \dots, f_n$. Now we induct on the body of g.

Basis: If g is atomic, then $f = g$ follows analogously to the case when f is atomic and g is non-atomic.

Induction step: $g = \lambda z.h'(g_1(z), \dots, g_{n'}(z))$. Since $h(f_1(x), \dots, f_n(x)) = f(x) = g(x) = h'(g_1(x), \dots, g_{n'}(x))$, we have $h = h'$, $n = n'$, and for i in [1,n], $f_i(x) = g_i(x)$. Similarly, since $f(y) = g(y)$, $f_i(y) = g_i(y)$ for i in [1,n]. If $f_i(x) \neq f_i(y)$, then by induction $f_i = g_i$. Otherwise, since $x \neq y$ and $f_i(x) = f_i(y)$, by Lemma 1 $f_i$ must be a constant function. Similarly, $g_i$ is a constant function, and $f_i$ and $g_i$ must be the same function. In either case $f_i = g_i$. Hence $f = g$.
Theorem: If $a \neq b$, $f(x1) = a$, $f(y1) = b$, $g(x2) = a$, and $g(y2) = b$, then there exists a $\phi$ such that either

$x1 = \phi(x2)$, $y1 = \phi(y2)$, and $g = f \circ \phi$, or
$x2 = \phi(x1)$, $y2 = \phi(y1)$, and $f = g \circ \phi$.

The proof follows from induction on the bodies of f and g.

Basis: Suppose that the body of f is atomic. Therefore, $f = \lambda z.z$ or $f = \lambda z.c$ where c is a constant other than z. The second case cannot occur, since it implies that $a = f(x1) = f(y1) = b$. In the first case, define $\phi$ to be g.

Induction step: Suppose that $f = \lambda z.h(f_1(z), \dots, f_n(z))$ and that the theorem holds for $f_1, f_2, \dots, f_n$. We now induct on the body of the function g.

Basis: The body of g is atomic. $g \neq \lambda z.c$ as above. Hence $g = \lambda z.z$, so define $\phi$ to be f.

Induction step: Suppose that $g = \lambda z.h'(g_1(z), \dots, g_{n'}(z))$ and that the theorem holds for $g_1, g_2, \dots, g_{n'}$. Since $f(x1) = g(x2)$, $h(f_1(x1), \dots, f_n(x1)) = h'(g_1(x2), \dots, g_{n'}(x2))$, so $h = h'$, $n = n'$, and for i in [1,n], $f_i(x1) = g_i(x2)$. Similarly, since $f(y1) = g(y2)$, for i in [1,n] $f_i(y1) = g_i(y2)$. Since $f(x1) \neq f(y1)$, there exists j in [1,n] such that $f_j(x1) \neq f_j(y1)$. Let $j_0$ be such a j. We define $a_{j_0} := f_{j_0}(x1)$ and $b_{j_0} := f_{j_0}(y1)$. Then

$a_{j_0} \neq b_{j_0}$, $f_{j_0}(x1) = a_{j_0}$, $f_{j_0}(y1) = b_{j_0}$, $g_{j_0}(x2) = a_{j_0}$, and $g_{j_0}(y2) = b_{j_0}$.

Hence, by the induction hypothesis, there exists a function $\phi_{j_0}$ such that either

$x1 = \phi_{j_0}(x2)$, $y1 = \phi_{j_0}(y2)$, and $g_{j_0} = f_{j_0} \circ \phi_{j_0}$, or
$x2 = \phi_{j_0}(x1)$, $y2 = \phi_{j_0}(y1)$, and $f_{j_0} = g_{j_0} \circ \phi_{j_0}$.

We define $\phi := \phi_{j_0}$ and claim that this function satisfies the theorem's conclusion. We examine two cases, based on the value of $\phi$.

Case I. $x1 = \phi(x2)$, $y1 = \phi(y2)$, and $g_{j_0} = f_{j_0} \circ \phi$. First, we note that since $f(x1) \neq f(y1)$, $x1 \neq y1$, and since $g(x2) \neq g(y2)$, $x2 \neq y2$. We first show that for all i in [1,n], $g_i = f_i \circ \phi$.
Case IA: $f_i(x1) = f_i(y1)$. By lemma 1, $f_i$ is a constant function. Similarly, since $g_i(x2) = f_i(x1) = f_i(y1) = g_i(y2)$, $g_i$ must be the same constant function. Therefore $g_i = f_i \circ \phi$ [...]
[...] obtain a multiequation system S' whose elements e are such that P(e) is empty or T(e) is not empty. On S' we can apply the first part of the proof. □

Definition: For any multiequation e, we define the normalized multiequation Nor(e) by:

Nor(e) = Case
  * P(e) [...] then e
  * T(e) [...] and |P(e)| = 1 then e
  * else if Trans(e) exists then Trans(e) else Nor(e) does not exist.
An immediate consequence of the previous lemma is the following:

Corollary: Any multiequation e either has no E-solution, or is E-equivalent to its normalized form Nor(e).

Definition: A system of multiequations S is said to be normalized iff
  * all the multiequations are normalized (that is, Nor(e) = e) and
  * S is merged.

Proposition: Any system of multiequations S either has no E-solution, or can be transformed into an E-equivalent and normalized one, denoted Nor(S), which is obtained by normalization of each multiequation in the merged system Merg(S).
As in any E-unification algorithm, we have to detect the cycles that can occur in the potential E-solution. Here we have two possibilities: we can detect such possible cycles "a posteriori", when the potential E-unification substitution is determined, or we can try to detect them as soon as possible, as in the Martelli and Montanari algorithm.

* In the first case a topological sort can be used to solve the problem.

* In the second case, one method is to give an ordering on the multiequations.

The algorithm, given in the next paragraph, uses the second and more elaborate method, but it can be easily modified to deal with the first one. Thus the following condition about the E theory is not essential for the application of our method.

Definition: let < be the relation on the multiequations set defined by: [...] there exists x in V(e), there exists t in P(e') ∪ T(e'), such [...]
e 1 is decidable. However, our result (for the n = 3 case) is easier to apply.
We will not review here the Knuth-Bendix method or its extension using associative and/or commutative unification. The Knuth-Bendix method is described by Knuth and Bendix [7], Lankford [8, 9], and Huet [3], among others. Associative and/or commutative unification are treated by Siekmann [17, 18], Livesey and Siekmann [13, 14], and Stickel [19, 20], and extension of the Knuth-Bendix method to incorporate associativity and/or commutativity is treated by Lankford and Ballantyne [10, 11, 12] and Peterson and Stickel [15, 16].

In addition to the publications cited above, Hullot [5] presents numerous examples of the use of the Knuth-Bendix method with and without special treatment of associativity and/or commutativity. The two most important differences in the use of the Knuth-Bendix method for this problem, as compared to previous work by Peterson and Stickel [15, 16], are the use of cancellation
laws to simplify reductions and a better pair-evaluation function to order matching reductions. Sections 2 and 3 describe the use of cancellation laws and the pair-evaluation function in the Knuth-Bendix method. Section 4 describes the proof of the $x^3 = x$ ring problem and Section 5 compares our approach with Veroff's. Section 6 discusses the complete set of reductions for free rings satisfying $x^3 = x$. Section 7 gives suggestions for improving the performance of the program.

2. Cancellation Laws

The most significant addition to the Knuth-Bendix method using associative and/or commutative unification that we made to solve the $x^3 = x$ ring problem is the use of cancellation laws to simplify derived equations. This addition made the solution feasible; we have so far failed to solve the $x^3 = x$ ring problem without the cancellation laws. It is certain that the effort required to do so would greatly exceed that used with the cancellation laws. We expect that cancellation laws can be widely used in the Knuth-Bendix method in the future to substantially accelerate convergence of complete sets of reductions.

To use the cancellation laws, we add the reductions (x + y = x) → (y = 0) and (x + y = x + z) → (y = z), which may be applicable only to the entire derived equation, not to its subterms as is the case for all the other reductions. With the single exception of the additive identity reduction x + 0 → x, which the cancellation laws are not permitted to reduce, the cancellation laws never reduce an equation of nonidentical terms to an equation of identical terms. Thus, any critical pair from which a reduction can be derived can lead instead to a simpler reduction if a cancellation law is applicable. This simpler reduction is more powerful than the original because it, plus x + 0 → x, can reduce the original reduction to an identity. Besides being more powerful, the simpler reduction has the further advantage that matching its left-hand side with the left-hand side of other reductions to generate new critical pairs will result in fewer, less complex equations and thus create less work for the program.

We ran two problems with and without using the cancellation laws. The first problem is completing the set of reductions for free commutative rings with unit element starting from the equations 0 + x = x, (-x) + x = 0, 1 × x = x, and x × (y + z) = (x × y) + (x × z). The second problem is to show that rings satisfying $x^2 = x$ are commutative. This is similar to, but much simpler than, the $x^3 = x$ ring problem. The relative simplicity of the $x^2 = x$ ring problem is suggested by the fact that in $x^2 = x$ rings, -x = x; in $x^3 = x$ rings, -x = 5x. The benefit of using the cancellation laws in the $x^3 = x$ ring problem is much greater than for these simpler problems.
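The effect of the two cancellation reductions on a derived equation can be sketched with multisets. The Python fragment below is an illustration only: it assumes each side of an equation has already been flattened into its top-level summands (the function `cancel` and the string encoding of summands are hypothetical), and it does not model the special treatment of x + 0 → x described above.

```python
from collections import Counter

def cancel(lhs, rhs):
    """Remove the summands common to both sides of lhs = rhs.

    Each side is a Counter of top-level summands of an associative-
    commutative sum; an empty Counter stands for 0.
    """
    common = lhs & rhs            # multiset intersection
    return lhs - common, rhs - common

def show(side):
    """Render a side for display, writing 0 for the empty sum."""
    return " + ".join(sorted(side.elements())) or "0"

# (x + y = x) -> (y = 0)
left, right = cancel(Counter(["x", "y"]), Counter(["x"]))

# (x + x*y + y = x + z) -> (x*y + y = z)
left2, right2 = cancel(Counter(["x", "x*y", "y"]), Counter(["x", "z"]))
```

Here `show(left)` and `show(right)` give `y` and `0`, matching the first cancellation law, and the second call leaves `x*y + y` on the left and `z` on the right.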
Problem                             Pairs Matched   Equations Simplified   Reductions Created
ring completion w.o. cancel         222             428                    31 (9 retained)
ring completion with cancel         115             289                    21 (9 retained)
x^2 = x ring problem w.o. cancel    124             268                    23 (13 retained)
x^2 = x ring problem with cancel    96              165                    19 (13 retained)
Modifying derived equations instead of immediately transforming them into reductions provides important added variability in the Knuth-Bendix method to give it greater efficiency or additional capabilities. The use of cancellation laws is an example of the former. An example of the latter is the modification of the Knuth-Bendix method to do induction proofs in equational theories with constructors (see, for example, Huet and Hullot [4]). There, if an equation of two terms headed by the same constructor with n arguments is derived, the equation is replaced by n equations equating the arguments.
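The effect of the cancellation laws on a whole derived equation can be sketched as follows. This is only an illustration, not the program described in the paper: it assumes that each side of an equation is stored as a list of the summands of an associative-commutative sum, and the function name `cancel` is hypothetical.

```python
from collections import Counter


def cancel(lhs, rhs):
    """Apply the additive cancellation laws to the equation lhs = rhs.

    Each side is a list of summands of an associative-commutative sum,
    so a side is treated as a multiset. Summands common to both sides
    are removed; a side left with no summands stands for 0.
    """
    left, right = Counter(lhs), Counter(rhs)
    common = left & right                      # multiset intersection
    left, right = left - common, right - common
    return (sorted(left.elements()) or ["0"],
            sorted(right.elements()) or ["0"])


# (x + y = x + z) -> (y = z)
print(cancel(["x", "y"], ["x", "z"]))
# (x + y = x) -> (y = 0)
print(cancel(["x", "y"], ["x"]))
```

Treating each side as a multiset is what lets a single call strip every common summand at once, regardless of the order in which the summands were written.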
3. Pair-Evaluation Function

In the Knuth-Bendix method, it is necessary to select which pair of reductions to match next to derive new critical pairs. If the selection algorithm is poor, the completion process may diverge, even though a complete set of reductions exists. There is no way of ensuring that the completion process will not diverge unnecessarily, but selecting pairs with small combined left-hand sides works well in practice. Our implementation of the pair-selection process involves maintaining a list of all pending pairs sorted in ascending order according to an evaluation function. The evaluation function we have used in the past is simply the sum of the number of symbols in the two left-hand sides. However, the x^3 = x ring problem is much more difficult than previous problems to which the Knuth-Bendix method has been applied, and this simple evaluation function was not adequate for easily solving it. An attempt to solve the x^3 = x ring problem managed to get within one step of discovering x × y = y × x, but failed to select the right pair of reductions to match next after several days of computing. The problem was the discovery of large numbers of reductions like x^2yxyx^2yx -> xyx.
Such reductions have relatively few symbols on the left-hand side and hence were given preference for matching. However, matching these reductions with other reductions (such as the distributivity reductions) often resulted in a large number of equations that were slow to simplify. Simply counting symbols in the left-hand side of the reduction did not reflect the greater complexity (after simplification to sum-of-products form) of products when a variable is instantiated to a sum compared to the complexity of sums when a variable is instantiated. The solution is to use an evaluation function that gives less preference to products. The evaluation function adopted to remedy this problem is V(λ1) + V(λ2), where λ1 and λ2 are the left-hand sides of the two reductions to be matched and V is defined over the constant 0, all variables x, and terms t, t_1, ..., t_n as

    V(0) = 2,
    V(x) = 2,
    V(-t) = 5 × V(t),
    V(t_1 + ... + t_n) = V(t_1) + ... + V(t_n),
    V(t_1 × ... × t_n) = V(t_1) × ... × V(t_n).
This is a natural evaluation function for ring theory problems because the value of a ring sum is simply the integer sum of the values and the value of a ring product is simply the integer product of the values. The value 2 used for constants and variables is the smallest positive integer that is not the additive or multiplicative identity.
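A minimal sketch of this evaluation function and of a pending-pair list ordered by it is given below. The term encoding (strings for 0 and variables, tuples headed by "+", "*" or "-" for compound terms), the reduction names, and the use of a binary heap in place of a sorted list are all assumptions made for illustration; the paper does not describe the program's data structures.

```python
import heapq
from math import prod


def V(term):
    """Evaluation function V: strings are the constant 0 or variables."""
    if isinstance(term, str):
        return 2
    op, *args = term
    if op == "-":
        return 5 * V(args[0])
    if op == "+":
        return sum(V(a) for a in args)
    if op == "*":
        return prod(V(a) for a in args)
    raise ValueError(f"unknown operator {op!r}")


# Left-hand sides of three reductions (names are hypothetical labels).
lhs = {
    "distrib": ("*", "x", ("+", "y", "z")),                           # x(y + z)
    "cube": ("*", "x", "x", "x"),                                     # x^3
    "long": ("*", "x", "x", "y", "x", "y", "x", "x", "y", "x"),       # x^2yxyx^2yx
}

pending = []  # heap of (score, name1, name2); smallest score is matched first
for a, b in [("distrib", "long"), ("distrib", "cube")]:
    heapq.heappush(pending, (V(lhs[a]) + V(lhs[b]), a, b))

print(V(lhs["long"]), V(lhs["distrib"]))
print(heapq.heappop(pending))
```

Under V the nine-factor product scores 2^9 = 512 while the distributivity left-hand side scores 8, so pairs involving long products sink toward the end of the pending list.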
The only part of the definition of V that seems contrived for the purpose of the x^3 = x ring problem is the definition of V(-t) as 5 × V(t). This value reflects the fact that -t = 5t is a consequence of x^3 = x, a fact discovered in the earlier attempt to solve the problem. The choice of V(-t) = 5 × V(t) as opposed to other reasonable definitions for V is inconsequential because the reduction -x -> 5x is discovered quite early in the completion process. All other occurrences of - are then eliminated, and the evaluation function for negated terms plays no further role.
4. The Proof

The appendix lists the proof of ring commutativity from x^3 = x. The proof has been cleaned up slightly and unused inferences are not shown. The program did not use exponentiation or multiplication by a constant, so 3xv^2 is our shorthand for the program's (x × v × v) + (x × v × v) + (x × v × v). Ring axioms were provided as reductions (1)-(11). These reductions, plus the assumptions of associativity and commutativity for + and associativity for ×, constitute a complete set of reductions for free rings. We did not allow matching pairs of ring axiom reductions because this is a complete set of reductions and no new reductions could be created. Reduction (12), not shown, invoked the cancellation laws, allowing the derivation of x = y from x + z = y + z and x = 0 from x + z = z or z = x + z. Reduction (13) is the hypothesis that x^3 = x.
Each of the intermediate steps in the derivation is a derived reduction. In effect, reductions are created by forming an expression to which both parent reductions are applicable. The results of applying the two reductions are set equal and fully simplified. If the result is not an identity, it is saved as a new reduction. After each reduction, there may be a line "simplifying x by y", where x is the equation formed from the two parent reductions and y is the list of simplifiers used to simplify it (cancel and distrib refer to use of the cancellation and distributivity laws). When there is no such line, the reduction is exactly an equation formed from matching the parent reduction left-hand sides.

The reduction numbers (n)^e, (n)^e1, etc., refer to embedded forms of reduction (n). For the reduction (n) λ -> ρ where λ is headed by the function symbol f, if f is associative-commutative then (n)^e is the reduction f(u, λ) -> f(u, ρ); if f is associative then (n)^e1 is the reduction f(u, λ) -> f(u, ρ), (n)^e2 is the reduction f(λ, v) -> f(ρ, v), and (n)^e3 is the reduction f(u, f(λ, v)) -> f(u, f(ρ, v)).

A useful perspective is to consider, for example, the derivation of (14) as the simplification of x^3 = x with x instantiated by y + z [i.e., (y + z)^3 = y + z] by applying distributivity to (y + z)^3.
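The derivation of (14) along these lines can be checked mechanically. The sketch below is not the paper's program: it assumes noncommutative polynomials are stored as a map from words (strings of variable letters) to integer coefficients, expands (y + z)^3 by distributivity, rewrites the pure cubes with x^3 -> x, and cancels y + z from both sides.

```python
from collections import Counter
from itertools import product


def multiply(p, q):
    """Product of two noncommutative polynomials (word -> coefficient)."""
    result = Counter()
    for (w1, c1), (w2, c2) in product(p.items(), q.items()):
        result[w1 + w2] += c1 * c2          # concatenation keeps factor order
    return result


y_plus_z = Counter({"y": 1, "z": 1})
cube = multiply(multiply(y_plus_z, y_plus_z), y_plus_z)   # (y + z)^3

reduced = Counter(cube)
for v in "yz":
    reduced[v] += reduced.pop(v * 3)        # apply x^3 -> x to yyy and zzz
reduced.subtract(y_plus_z)                  # cancel against the right side y + z

remaining = sorted(word for word, coeff in reduced.items() if coeff != 0)
print(remaining)
```

The six words that remain, each with coefficient 1, are the six summands on the left-hand side of reduction (14).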
At the completion of the proof, only 135 reductions had been created, of which 52 were retained. The economy of the Knuth-Bendix procedure in number of retained results is demonstrable from the fact that these 135 reductions were the result of simplifying 9,013 equations derived from matching 988 pairs of reductions. Most of the remaining equations were simplified to identities and discarded; a few were simplified to equations like 3xy = 3yx that could not be converted into reductions and were also discarded. Total time was about 14.3 hours (including garbage-collection time) on a Symbolics LM-2 LISP Machine. This reflects slowness of the simplification procedure on numerous lengthy terms and could be greatly reduced.
5. The Veroff Proof

The only previous computer proof of the x^3 = x problem was done by Robert Veroff in 1981 using the ANL-NIU theorem-proving system [21]. His solution required an impressively fast 2+ minutes on an IBM 3033. It is interesting to compare the approaches taken in these two proofs. Both rely heavily on equality reasoning. The process of fully simplifying equations with respect to a set of reductions is just demodulation. The Knuth-Bendix method's means for deriving equations from pairs of reductions is similar to the paramodulation operation used in the ANL-NIU prover. Cancellation laws were also used by the ANL-NIU prover.

Despite such similarities in approach, solution by the Knuth-Bendix method required less preparation of the problem. The Knuth-Bendix program was given only the 11 reductions for free rings, the reduction x × x × x -> x, declarations of associativity and commutativity, and a reduction for the cancellation laws. The ANL-NIU program was provided with a total of 60+ clauses, including the negation of the theorem (their proof was goal-directed whereas our program derived ring commutativity as a result of pure forward reasoning, attempting to complete a set of reductions). Some clauses expressed information about associativity and commutativity, which are handled by declarations in the Knuth-Bendix program. A large number were present to support a polynomial subtraction inference operation, e.g., to derive a + (-c) = 0 from a + b = 0 and b + c = 0. Comparable operations are implicit in the Knuth-Bendix method, which can infer a = c by matching the reductions x + a + b -> x (the embedded form of a + b -> 0) and y + b + c -> y (the embedded form of b + c -> 0).
6. A Complete Set of Reductions

After the discovery that multiplication is commutative, the problem can be run again to derive a complete set of reductions for free rings satisfying x^3 = x. This can then be used to provide a decision procedure for the word problem for such rings. Assuming the associativity and commutativity of addition and multiplication, reductions 1, 2, 3, 5, 6, 7, 9, and 10 comprise a complete set of reductions for free commutative rings. Attempting to complete the set of reductions consisting of these reductions plus x × x × x -> x resulted in the discovery that the reductions marked by • in the proof comprise a complete set of reductions (assuming finite termination) for free rings satisfying x^3 = x. During this computation, 120 pairs of reductions were matched and 2,121 equations simplified; this resulted in 18 reductions, 8 of which were retained to form the complete set of reductions. The cancellation laws were necessary for deriving this result as well.

To verify that this set of reductions derived by the program is actually a complete set of reductions, it must be proved that the set of reductions has the finite termination property. This can be done using a polynomial complexity measure over terms as used by Lankford [8, 9]. The polynomial complexity measure ‖·‖ is defined over the constant 0, all variables x, and terms t, t_1, ..., t_n as

    ‖0‖ = 2,
    ‖x‖ = 2,
    ‖-t‖ = 7 × ‖t‖ + 1,
    ‖t_1 + ... + t_n‖ = ‖t_1‖ + ... + ‖t_n‖ + n - 1 (i.e., ‖t_1 + t_2‖ = ‖t_1‖ + ‖t_2‖ + 1),
    ‖t_1 × ... × t_n‖ = ‖t_1‖ × ... × ‖t_n‖.

It can be shown that all the reductions in the final set (and also the deleted intermediate reductions from which they were derived) are complexity reducing according to this measure. Thus the set of reductions has the finite termination property and is complete.
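The complexity-reducing property can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not a termination proof: terms use a hypothetical tuple encoding, the sum rule is taken to add n - 1 for an n-ary sum (the reading consistent with the binary case given in the text), and each variable is sampled over a few values of at least 2, since an instantiated variable can take any value a term can have.

```python
from itertools import product
from math import prod


def norm(term, env):
    """Polynomial complexity measure; env maps variable names to values >= 2."""
    if isinstance(term, str):
        return 2 if term == "0" else env[term]
    op, *args = term
    values = [norm(a, env) for a in args]
    if op == "-":
        return 7 * values[0] + 1
    if op == "+":
        return sum(values) + len(values) - 1
    if op == "*":
        return prod(values)
    raise ValueError(f"unknown operator {op!r}")


def times(k, term):
    """k * term written out as a k-ary sum, as the program represents it."""
    return ("+",) + (term,) * k


REDUCTIONS = {
    "x^3 -> x": (("*", "x", "x", "x"), "x"),
    "-x -> 5x": (("-", "x"), times(5, "x")),
    "6z -> 0": (times(6, "z"), "0"),
    "x(y+z) -> xy+xz": (("*", "x", ("+", "y", "z")),
                        ("+", ("*", "x", "y"), ("*", "x", "z"))),
    "3v^2 -> 3v": (times(3, ("*", "v", "v")), times(3, "v")),
}


def reducing(lhs, rhs, values=range(2, 6)):
    """True if the measure decreases for every sampled variable assignment."""
    names = "xyzv"
    return all(norm(lhs, dict(zip(names, vs))) > norm(rhs, dict(zip(names, vs)))
               for vs in product(values, repeat=len(names)))


for name, (lhs, rhs) in REDUCTIONS.items():
    print(name, reducing(lhs, rhs))
```

For -x -> 5x with ‖x‖ = 2 the measure drops from 7 × 2 + 1 = 15 to 5 × 2 + 4 = 14, which shows why the coefficient for negation has to be larger than 5.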
7. Performance Improvement

Simplification accounts for most of the time spent by the program. Either increasing the speed of simplification or reducing the number of equations that need to be simplified could substantially improve the program's performance.

Probably the most serious defect of the simplification procedure is that the subsumption algorithm, used to determine if a term is an instance of the left-hand side of a reduction and form the matching substitution, is not incremental. When matching a pair of terms, the subsumption algorithm returns a set of all most general substitutions instantiating the first term to match the second, instead of returning a single substitution and generating additional substitutions on demand. The large number of substitutions so produced, and the fact that the simplification procedure requires only a single substitution, result in much wasted effort.

The use of a nonincremental subsumption algorithm was especially bad in two instances. Note that subsuming t_1 + ... + t_n by x + y where + is associative-commutative results in the production of 2^n - 2 substitutions. All these substitutions will be produced as an intermediate result when attempting to reduce t_1 + ... + t_n = s by the cancellation-law reduction (x + y = x + z) -> (y = z) or to reduce (t_1 + ... + t_n) × s by the distributivity reduction (x + y) × z -> (x × z) + (y × z). Given the lengthy terms sometimes simplified, and the extensive use of distributivity, we found it necessary to use LISP code to perform the cancellation-law and distributivity simplifications. Subsuming t_1 × ... × t_n by x × y where × is associative but not commutative is less costly, but still results in the production of n - 1 substitutions, and is another source of inefficiency. Adoption of an incremental subsumption algorithm, especially with "look ahead" for recognizing that a substitution for a pair of subterms will not be extendable to match the whole terms, could greatly speed the simplification process.

An obvious addition to the Knuth-Bendix method to reduce the number of equations generated for this and similar examples is to recognize symmetries. Although multiplication is not assumed to be commutative, each reduction using multiplication has a symmetric variant, e.g.,

    (3) x × (y + z) -> x × y + x × z and (4) (x + y) × z -> x × z + y × z,
    (7) x × 0 -> 0 and (8) 0 × x -> 0,
    (10) x × (-y) -> -(x × y) and (11) (-x) × y -> -(x × y).

If a_1, a_2 and b_1, b_2 are pairs of reductions that are symmetric variants, it is not necessary to match each a_i with each b_j. It would be sufficient to match a_1 with b_1 and b_2, provided that the symmetric variants of derived reductions are also added. Winkler [22] also offers a criterion for rejecting matches that could be used to reduce the number of equations generated.
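The substitution counts quoted in this section for matching x + y and x × y can be reproduced with a short enumeration. The sketch below is illustrative only: it assumes the arguments t_1, ..., t_n are pairwise distinct, represents them as strings, and uses Python generators to show what producing substitutions on demand could look like; it is not the subsumption algorithm of the paper.

```python
from itertools import combinations


def ac_matches(summands):
    """Lazily yield every way to match x + y against an AC sum.

    x receives a nonempty proper subset of the summands and y the rest.
    """
    n = len(summands)
    for k in range(1, n):
        for chosen in combinations(range(n), k):
            x = [summands[i] for i in chosen]
            y = [summands[i] for i in range(n) if i not in chosen]
            yield x, y


def assoc_matches(factors):
    """Lazily yield every way to match x * y against an associative product."""
    for cut in range(1, len(factors)):
        yield factors[:cut], factors[cut:]


terms = ["t1", "t2", "t3", "t4", "t5"]
print(len(list(ac_matches(terms))))      # all AC substitutions at once
print(len(list(assoc_matches(terms))))   # all associative substitutions
print(next(ac_matches(terms)))           # only the first one, on demand
```

With a generator, a simplifier that needs a single substitution can stop after the first `next` call instead of paying for all 2^n - 2 alternatives.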
Acknowledgments

Mabry Tyson provided much useful advice and assistance in this work. Dallas Lankford was extremely helpful in providing suggestions and reference material for this paper. Richard Waldinger also gave many helpful comments on an earlier draft of the paper. The strong encouragement by these people was very important to me in writing this paper.
References

[1] Bledsoe, W.W. Non-resolution theorem proving. Artificial Intelligence 9, 1 (August 1977), 1-35.
[2] Comer, S.D. Elementary properties of structures of sections. Boletín de la Sociedad Matemática Mexicana 19 (1974), 78-85.
[3] Huet, G. A complete proof of the correctness of the Knuth-Bendix completion algorithm. Journal of Computer and System Sciences 23, 1 (August 1981), 11-21.
[4] Huet, G. and J.M. Hullot. Proofs by induction in equational theories with constructors. Proceedings of the 21st IEEE Symposium on the Foundations of Computer Science, 1980.
[5] Hullot, J.M. A catalogue of canonical term rewriting systems. Technical Report CSL-113, Computer Science Laboratory, SRI International, Menlo Park, California, April 1980.
[6] Jacobson, N. Structure theory for algebraic algebras of bounded degree. Annals of Mathematics 46, 4 (October 1945), 695-707.
[7] Knuth, D.E. and P.B. Bendix. Simple word problems in universal algebras. In Leech, J. (ed.), Computational Problems in Abstract Algebra, Pergamon Press, 1970, pp. 263-297.
[8] Lankford, D.S. Canonical algebraic simplification. Report ATP-25, Department of Mathematics, University of Texas, Austin, Texas, May 1975.
[9] Lankford, D.S. Canonical inference. Report ATP-32, Department of Mathematics, University of Texas, Austin, Texas, December 1975.
[10] Lankford, D.S. and A.M. Ballantyne. Decision procedures for simple equational theories with commutative axioms: complete sets of commutative reductions. Report ATP-35, Department of Mathematics, University of Texas, Austin, Texas, March 1977.
[11] Lankford, D.S. and A.M. Ballantyne. Decision procedures for simple equational theories with permutative axioms: complete sets of permutative reductions. Report ATP-37, Department of Mathematics, University of Texas, Austin, Texas, April 1977.
[12] Lankford, D.S. and A.M. Ballantyne. Decision procedures for simple equational theories with commutative-associative axioms: complete sets of commutative-associative reductions. Report ATP-39, Department of Mathematics, University of Texas, Austin, Texas, August 1977.
[13] Livesey, M. and J. Siekmann. Termination and decidability results for string-unification. Memo CSM-12, Essex University Computing Center, Colchester, Essex, England, August 1975.
[14] Livesey, M. and J. Siekmann. Unification of A+C-terms (bags) and A+C+I-terms (sets). Interner Bericht Nr. 5/76, Institut für Informatik I, Universität Karlsruhe, Karlsruhe, West Germany, 1976.
[15] Peterson, G.E. and M.E. Stickel. Complete sets of reductions for some equational theories. Journal of the Association for Computing Machinery 28, 2 (April 1981), 233-264.
[16] Peterson, G.E. and M.E. Stickel. Complete systems of reductions using associative and/or commutative unification. Technical Note 269, Artificial Intelligence Center, SRI International, Menlo Park, California, October 1982. To appear in M. Richter (ed.), Lecture Notes on Systems of Reductions.
[17] Siekmann, J. String-unification, part I. Essex University, Colchester, Essex, England, March 1975.
[18] Siekmann, J. T-unification, part I. Unification of commutative terms. Interner Bericht Nr. 4/76, Institut für Informatik I, Universität Karlsruhe, Karlsruhe, West Germany, 1976.
[19] Stickel, M.E. Mechanical theorem proving and artificial intelligence languages. Ph.D. Dissertation, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pennsylvania, December 1977.
[20] Stickel, M.E. A unification algorithm for associative-commutative functions. Journal of the Association for Computing Machinery 28, 3 (July 1981), 423-434.
[21] Veroff, R.L. Canonicalization and demodulation. Report ANL-81-6, Argonne National Laboratory, Argonne, Illinois, February 1981.
[22] Winkler, F. A criterion for eliminating unnecessary reductions in the Knuth-Bendix algorithm. CAMP-LINZ Technical Report 83-14.0, Ordinariat Mathematik III, Johannes Kepler Universität, Linz, Austria, May 1983.
Appendix: Proof that x^3 = x Implies Ring Commutativity

Following is the proof that x^3 = x implies ring commutativity. The program was first run with + declared to be associative-commutative and × declared to be associative. After 988 pairs were matched, 9,013 equations simplified, and 135 reductions created (52 retained), the commutativity of × was derived. The program was then run again with both + and × declared to be associative-commutative. After 120 pairs were matched, 2,121 equations simplified, and 18 reductions created (8 retained), a complete set of reductions for the x^3 = x ring was discovered. The 8 reductions in the complete set are marked by •.
 •(1)    0 + x -> x
  (2)    (-x) + x -> 0
 •(3)    x(y + z) -> xy + xz
  (4)    (x + y)z -> xz + yz
  (5)    -0 -> 0
  (6)    -(-x) -> x
 •(7)    x0 -> 0
  (8)    0x -> 0
  (9)    -(x + y) -> (-x) + (-y)
  (10)   x(-y) -> -(xy)
  (11)   (-x)y -> -(xy)
 •(13)   x^3 -> x
  (14)   zyz + yz^2 + y^2z + z^2y + zy^2 + yzy -> 0   [from (3) and (13)]
         simplifying (y + z)^2 y + (y + z)^2 z = y + z by cancel, distrib, (13)
  (18)   yzw + zyw + wyz + wzy + ywz + zwy -> 0   [from (3) and (14)]
         simplifying w(y + z)w + (y + z)w^2 + (y + z)^2 w + w(y + z)^2 + (y + z)w(y + z) + w^2y + w^2z = 0 by distrib, (14)
 •(21)   6z -> 0   [from (13) and (14)]
         simplifying 5z^3 + z = 0 by (13)
  (22)   3z^2 + 3z -> 0   [from (13) and (14)]
         simplifying 3z^4 + 2z^5 + z^3 = 0 by (13)
  (24)   3yz + 3zy -> 0   [from (3) and (22)]
         simplifying 2(y + z)^2 + (y + z)y + (y + z)z + 3y + 3z = 0 by distrib, (22)
 •(28)   -x -> 5x   [from (2)^e and (21)^e]
 •(29)   3v^2 -> 3v   [from (21)^e and (22)^e]
 •(30)   3xv^2 -> 3xv   [from (3) and (29)]
         simplifying x(3v) = xv^2 + x(2v^2) by distrib
  (31)   3v^2z -> 3vz   [from (4) and (29)]
         simplifying (3v)z = (2v^2)z + v^2z by distrib
  (33)   3xux + 3ux -> 0   [from (13) ...]
         simplifying 3x^2ux + 2ux^3 + ux = 0 by (13), (31)
  (40)   3xux -> 3ux
  (48)   yzy + zy^2 + y^2z + 3yz -> 0
         simplifying 0 = 4yzy + 4zy^2 + 4y^2z by (21), (30), (31), (40)
  (60)   y^2t + ty^2 + yty -> 3yt
  (66)   2yty + 2y^2t -> 3yt + ty^2
         simplifying 3yt + ty^2 = 5yty + 5y^2t by (24), (31), (40)
  (80)   xux^2 + x^2ux -> 2ux
         simplifying x^2ux + ux + xux^2 = 3xux by cancel, (40)
  (82)   y^2uy^2 + uy^2 + yuy -> 3uy
         simplifying y^2uy^2 + uy^2 + yuy^5 = 3yuy^2 by (13), (30), (40)
  (115)  2y^2sy -> ysy^2 + ys
         simplifying 2y^2sy + y^3s + ys = 3y^2s + ysy^2 by cancel, (13), (31)
  (118)  y^2sy^2 -> sy^2
         simplifying 2y^5sy + y^4s + y^2s = 3y^5s + y^4sy^2 by cancel, (13), (66)
  (119)  2uy^2 + yuy -> 3uy
         simplifying it by (118)
  (133)  xux^2 -> ux
         simplifying ux^3 + ux + xux^2 = 3ux^2 by cancel, (13), (30)
  (135)  x^2ux -> ux
         simplifying it by cancel, (133)
  (***)  sy = ys
         simplifying it by cancel, (133), (135)

G, either ϑt_i =_E ϑt'_i, for some i (1 ≤ i ≤ p), or ¬(ϑu_j =_E ϑu'_j), for some j (1 ≤ j ≤ q).
2.2. Theories with constructors

For simplicity of notation, we assume theories to be one-sorted, but all the results carry over to many-sorted theories without difficulty. In theories with constructors [HH], the signature is partitioned as Σ = C ⊎ D. The members of C are called constructors and the members of D are called definite operators. We assume that there is at least one constant constructor symbol. GC is the set of ground terms formed solely from constructors. GCV is the set formed solely from constructors and variables.

Our aim is to express the E-validity property in terms of satisfiability, for which semi-decision procedures such as resolution have been extensively studied. In other words, can the problem of E-validity be reduced to a problem of satisfiability (i.e. existence of a model)? We have a positive answer to that question when there exists only one satisfying model, the initial algebra. This leads us to consider a set D of E-valid clauses which "lock" the initial algebra, in the sense that no other model satisfies E ∪ D.

First we consider clause sets which discriminate two distinct members of GC, then we show that such sets have the "locking" property. The framework of this method is more general than the Huet-Hullot framework. We extend their principle of definition [HH] in the following way:

definition A set E of equations satisfies the principle of definition with discriminating (or E defines D over C with discriminating) if: (1) there exists a mapping Ψ : G -> GC such that, for any t in G, Ψ(t) is a member of GC equal to t modulo E; (2) there exists a set D of E-valid clauses such that, for any terms t, u in G, Ψ(t) and Ψ(u) are distinct iff D ∪ {Ψ(t) = Ψ(u)} is unsatisfiable. Ψ is called the constructor mapping of E and D is called the discriminating set of E for it.

example Let us consider the natural number theory; we suppose that the constructor symbols are 0, 1 and +, and that Ψ(t) is of the form 0, 1+1, or 1+(1+(...1)...) for any ground term t. Then the following set is a discriminating set: { z+x ≠ z+y ∨ x = y, 1+x ≠ 0, 0 ≠ 1+x }
In the definition above, property (1) is the same as its counterpart in [HH]. In turn, property (2) is a generalization of its counterpart in [HH]. Indeed, according to [HH], terms in GC are equal modulo E iff they are identical. In that case, Ψ is determined in a unique way and associates with any t in G the unique member of GC equal to it modulo E. It can be shown that in such a case, the following set C is a discriminating set for E:

definition The compatibility set of clauses, denoted C, is the union of the two following sets of clauses:
C1 = { x_i = y_i ∨ c(x_1, ..., x_k) ≠ c(y_1, ..., y_k) | for i = 1, ..., k and every k-ary (k > 0) constructor operator c in C }
C2 = { c(x_1, ..., x_k) ≠ d(y_1, ..., y_l) | for every k-ary operator c and every l-ary operator d in C, with d distinct from c }
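The two clause sets can be generated mechanically from a constructor signature. The following sketch assumes a signature given as a map from constructor names to arities and prints clauses as plain strings; the function names and the sample signature (a constant 0 and a unary successor s) are hypothetical and only serve the illustration.

```python
def app(symbol, args):
    """Print a constructor application; constants take no parentheses."""
    return f"{symbol}({', '.join(args)})" if args else symbol


def compatibility_set(constructors):
    """Build C1 and C2 for a signature {constructor name: arity}."""
    c1, c2 = [], []
    for c, k in constructors.items():
        xs = [f"x{i}" for i in range(1, k + 1)]
        ys = [f"y{i}" for i in range(1, k + 1)]
        # C1: equal constructor terms have equal arguments, one clause per argument
        for i in range(k):
            c1.append(f"{xs[i]} = {ys[i]} v {app(c, xs)} != {app(c, ys)}")
        # C2: terms headed by distinct constructors are never equal
        for d, l in constructors.items():
            if d != c:
                ds = [f"y{j}" for j in range(1, l + 1)]
                c2.append(f"{app(c, xs)} != {app(d, ds)}")
    return c1, c2


c1, c2 = compatibility_set({"0": 0, "s": 1})
print(c1)
print(c2)
```

C1 grows with the total arity of the signature and C2 with the square of the number of constructors, so both sets stay small for typical data-type specifications.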
Remark Note that C1 is the dual form of the functional substitutivity axioms, and that 0-ary (constant) constructor symbols are involved in C2.

THEOREM 2.2.1 Let E be a set of equations defining D over C with a discriminating set D, and C an equational clause on T. C is E-valid iff E ∪ D ∪ {C} is equality-satisfiable.

proof (⇐) Let us suppose that E ∪ D ∪ {C} is equality-satisfiable. Then there exists an interpretation J which equality-satisfies E ∪ D ∪ {C}. Let C be of the form ∨_{k=1,...,p} t_k = t'_k ∨ ∨_{l=1,...,q} u_l ≠ u'_l. Let ϑ be a ground substitution V -> G. J satisfies ϑC, hence either J ⊨ ϑt_i = ϑt'_i, for some i (1 ≤ i ≤ p), or J ⊨ ¬(ϑu_j = ϑu'_j), for some j (1 ≤ j ≤ q). Now ϑt_i, ϑt'_i, ϑu_j, ϑu'_j are in G. Let τ_i, τ'_i, υ_j, υ'_j be their respective images in GC by the constructor mapping of E. ϑt_i, ϑt'_i, ϑu_j, ϑu'_j are respectively equal modulo E to τ_i, τ'_i, υ_j, υ'_j (by property (1) of the principle of definition). As J satisfies E, we have: either J ⊨ τ_i = τ'_i (a) or J ⊨ ¬(υ_j = υ'_j) (b). Since J satisfies D, (a) implies that τ_i and τ'_i are identical (by property (2) of the principle of definition). On the other hand, (b) implies that υ_j and υ'_j are distinct, J being an equality-model. Hence either τ_i = τ'_i (a') or ¬(υ_j = υ'_j) (b'). So, by substitutivity of an equal by an equal modulo E, either ϑt_i =_E ϑt'_i (a'') or ¬(ϑu_j =_E ϑu'_j) (b''). Thus for any ground substitution ϑ, there exists an integer i (1 ≤ i ≤ p) or j (1 ≤ j ≤ q) such that ϑt_i =_E ϑt'_i or ¬(ϑu_j =_E ϑu'_j); hence C is E-valid (by proposition 2.1.1).

(⇒) Let us suppose that C is E-valid. Then C is valid in I(Σ,E). Now, by property (2) of the principle of definition, I(Σ,E) satisfies D. Furthermore I(Σ,E) is an equality-model satisfying E (by definition). So {C} ∪ D ∪ E is satisfied by I(Σ,E), and thus is equality-satisfiable.

The E-validity problem is thus converted into the equality-satisfiability problem for sets of clauses.

Remark Note that theorem 2.2.1 still holds when E is a set of Horn equational clauses (the proof is the same as the one above).
The equality-satisfiability problem is semi-decidable and can be treated with a standard method of resolution with paramodulation [Lo]. Thus, any complete method of refutation by resolution with paramodulation will stand as an E-invalidity procedure, as soon as E satisfies our principle of definition. Furthermore, such procedures show satisfiability when they stop without having generated the empty clause (see [Jo] for satisfiability-oriented strategies). In the following, we consider theories defined by canonical sets of rewrite rules. In that case, narrowing [Sl,La] is a powerful optimization of paramodulation. Our concern is to build narrowing-based procedures which on the one hand generate shorter E-invalidity proofs and on the other hand produce E-validity proofs more often (i.e. do not loop forever).

3. The basic E-invalidity procedure

3.1. Hypotheses on E
Henceforth, we make the following hypotheses on E:
(H0) E defines a canonical term rewriting system, where no left-hand side is a variable.
(H1) For any term t in G, the E-normal form, denoted t*, is a term of GC.

The hypotheses (H0) and (H1) ensure that property (1) of the principle of definition is satisfied, the constructor mapping Ψ being the mapping which associates the normal form t* with any closed term t. Given a canonical set E, a sufficient criterion for (H1) to hold is:
(HB) For every f in D, there is in E a set of equations whose left-hand sides are of the form f(s^i_1, ..., s^i_k) (1 ≤ i ≤ p), and the set {S^1, ..., S^p}, where S^i denotes the k-tuple (s^i_1, ..., s^i_k), is a base for C in the sense that: for every k-tuple (t_1, ..., t_k) of ground terms in GC, there exist q, with 1 ≤ q ≤ p, and a substitution σ such that, for every j, 1 ≤ j ≤ k, we have t_j = σ(s^q_j).

Further sufficient criteria are developed in [HH,Th]. See also [Ka], in the case of E being a set of Horn equational clauses. In addition, we make the following hypothesis:
(H2) For every t, u in GC, t =_E u iff t ≡ u.

The hypothesis (H2) is exactly property (2) of Huet-Hullot's principle of definition. With (H0-H1), it ensures that t* is the unique member of GC equal to t modulo E. As previously remarked, (H2) ensures that our principle of definition is verified, by taking C as a discriminating set. Moreover, it can be seen that the hypothesis (H2), coupled with (H0-H1), implies (HB). In the following, the results will often hold under more general conditions than (H2); we shall specify which hypotheses are actually needed.

The E-validity problem will be treated through techniques of refutation by resolution and narrowing. We assume that the reader is familiar with the notions of resolution [Lo] and narrowing [Sl,La]. A selection function φ is given which chooses from each clause a single literal to be resolved upon in that clause (see [KK]) or to be narrowed. The constructor inequations c(x_1...x_k) ≠ c(y_1...y_k) are assumed to be the φ-selected literal for the clauses of C1. The only resolvents D' considered are selected binary resolvents of a clause D with a clause of C1 ∪ C2 ∪ {x=x}. In resolution against C2 or against {x=x}, D' is obtained from D through instantiation and deletion of the φ-selected literal. In resolution against C1, D' is obtained from D through instantiation and replacement of the φ-selected equation by an equation between subterms. This new equation is assumed to be the φ-selected literal in D'.

For a clause D, D* denotes its E-normal form. We say that a clause D+ is a selected-normal form (resp. selected-reduced form) of D if D+ is a reduced form of D and if φ(D+) is a normal form (resp. reduced form) of φ(D). The symbol *sel will stand for the operation of normalizing literals of a clause except the φ-selected one. Throughout, s.b.r. will stand for 'selected binary resolution against a clause of C ∪ {x=x}'.

3.2. Ground refutation
THEOREM 3.1 Let G be a ground clause on
ce. G is E'-invalid
iff there exists an input refutation Δgr:
{Go,G1,· ..,Gnl of C u ~Gl u ~x=xL obtained by s.b.r. (without merging) and such that: • Go G and Gn = 0 • Gi H Gi - cp(G) , for i=l, ... .n . and Gi +1 is deduced from Gi by application of a linear sequence of s. b.r. The deduction of Gj +1 from Gj can be depicted by the following diagram:
[Diagram: the linear deduction of G_{i+1} from G_i by selected binary resolution.]

proof (⇐) If there exists such a refutation Δgr, then C ∪ G ∪ {x=x} is unsatisfiable (by soundness of resolution); hence C ∪ G is equality-unsatisfiable and so is C ∪ G ∪ E. This implies that G is E-invalid (by theorem 2.2.1). (⇒) Suppose that G is E-invalid,

->R ⊆ ->R' ⊆ ->E.R.
R is said to be E-noetherian (or E-terminating) iff ->E.R is noetherian; ->R' is then also noetherian. =(R∪E) is the reflexive, symmetric, transitive closure of =E ∪ ->R, and -*->R' denotes the reflexive transitive closure of ->R'.

R is R'-Church-Rosser modulo E on T(F) iff for any ground
terms t1, t2 in ~(F), t1 =RUlI t2 implies there exist ground terms t'1, t'2 s.t. t1 _*_>R' t'1 =E t'2 R'E.R is Church-Rosser, the converse being false. In practice, we shall use for R' one of the
rewriting relations _>R,
_>R,E (defined by Peterson and Stickel [P&S,81]) or
(_>L U _>NL,E) (defined in
[J x]
The term (x + (0 + y)) is not R-reducible but R,E-reduces to (x + y).

(2) E = {x+y = y+x}, L = {x+0 -> x}, NL = {x+(-x) -> 0}

The previous term is no longer R'-reducible because no commutativity step is allowed before applying a rule of L. But the term (-(x+y) + (x+y)) is R'-reducible to 0.
STRUCTURED SPECIFICATION AND EQUATIONAL TERM REWRITING SYSTEMS

Following Remy [REM,82], we introduce the notion of structured specification, which allows us to translate the properties of consistency and completeness in terms of rewriting systems and to give sufficient conditions to check them.

Definition 4: Let BF ∪ DF be a partition of a set of function symbols F. We call
* BF-equation g=d a pair of terms in T(BF,X) such that V(g) = V(d);
* BF-rule l->r a directed pair of terms in T(BF,X) such that V(r) is a subset of V(l);
* DF-equation g=d a pair of terms in T(F,X)\T(BF,X) s.t. V(g) = V(d);
* DF-rule l->r a directed pair of terms s.t. l is in T(F,X)\T(BF,X) and V(r) is a subset of V(l). []

Notice that, according to this definition, an equation x=t with t in T(F,X) is not a DF-equation, and a rule x->t is not a DF-rule.
Definition 5: A specification SPEC=(S,F,A) is a structured specification based on BSPEC=(S,BF,BA) iff F = BF ∪ DF, A = E ∪ R, E = BE ∪ DE, R = BR ∪ DR, BA = BE ∪ BR, such that
1) BE is a set of BF-equations, DE is a set of DF-equations, BR is a set of BF-rules, DR is a set of DF-rules.
2) BR is BR'-Church-Rosser modulo BE on T(BF), where BR' satisfies ->BR ⊆ ->BR' ⊆ ->BE.BR.
3) R is E-noetherian.
4) Any term t of T(F) has an R'-normal form t!R' in T(BF), where ->R' is a relation such that ->R ⊆ ->R' ⊆ ->E.R and ->BR' ⊆ ->R'. []
The notion of structured specification can be illustrated by a classical, but nevertheless significant example:

S = integer

BSPEC :
  function symbols BF :
    ZERO :                    -> integer
    SUCC : integer            -> integer
    PRED : integer            -> integer
  axioms
    rules BR :
      SUCC(PRED(x)) -> x
      PRED(SUCC(x)) -> x
    equations BE : (none listed)

SPEC :
  function symbols DF :
    OPP  : integer            -> integer
    PLUS : integer, integer   -> integer
  axioms
    rules DR :
      OPP(ZERO) -> ZERO
      OPP(SUCC(x)) -> PRED(OPP(x))
      OPP(PRED(x)) -> SUCC(OPP(x))
      PLUS(ZERO,y) -> y
      PLUS(SUCC(x),y) -> SUCC(PLUS(x,y))
      PLUS(PRED(x),y) -> PRED(PLUS(x,y))
    equations DE :
      PLUS(x,y) = PLUS(y,x)
      PLUS(PLUS(x,y),z) = PLUS(x,PLUS(y,z))

Some
remarks are useful about definition 5:
- In practice, we shall use for R' and BR' one of the rewriting relations previously mentioned, assuming implicitly that complete BE- and E-unification algorithms are known if necessary.
- The condition 4) implies the completeness property of SPEC w.r.t. BSPEC. An equivalent formulation is: "any R'-normal form of a term t in T(F) is in T(BF)", since if a term t has some R'-normal form t0 which is not in T(BF), applying condition 4) to t0 yields that t0 would be R'-reducible to another term of T(BF), which is impossible. Thus t0 is in T(BF). The condition 4) is also equivalent to: for any term t = f(t0,...,tn) such that f belongs to DF and t0,...,tn are BR'-irreducible terms of T(BF), t is DR'-reducible. An efficient decision algorithm for this last property can be obtained from [THI,84]. But an easy way to satisfy this condition is to define the new symbols of DF by structural induction on BR'-irreducible terms of T(BF).

Let us first point out some properties of a
structured specification, resulting from the syntactical conditions of definition 4.

Lemma 1 :
Let (S,F,A) a specification satisfying the condition 1 of definition 5. Then for any terms t and t' such that t is in T(BF,X):
a) t =BE t'      =>  t' is in T(BF,X)
b) t -*->BR' t'  =>  t' is in T(BF,X)
c) t =E t'       =>  t =BE t' and t' is in T(BF,X)
d) t -*->R' t'   =>  t -*->BR' t' and t' is in T(BF,X).
Proof:
a) by induction on the length of the smallest proof that t =BE t'.
b) by induction on the length of the derivation.
c) and d) If t is in T(BF,X), neither equations of DE, nor rules of DR apply to t. An easy induction yields the result.
[]

The relation between structured specifications based on BSPEC and enrichments of BSPEC is expressed via term rewriting systems as follows:

Theorem 1 :
Let SPEC=(S,F,A) a structured specification based on BSPEC such that A = (R U E) and R is R'-Church-Rosser modulo E on T(F), with R' as usual. Then SPEC is a protected enrichment of BSPEC.
Proof: 1) Since R is R'-Church-Rosser modulo E on T(F), [...] implies that there exist t1 and t'1 s.t. t -*->R' t1 =E t'1
[...] t1 =BE t'1 [...] BR'-reducible.

[...] The extension is built from the so called [...]
[...] iff
- its left-hand side is reducible by the rule l->r,
- it is protected only for coherence of rules simplifiable by l->r,
- it may be an extension but only of rules simplifiable by l->r.
Let us point out that, at level k, each newly introduced rule is a [..](F)-rule. Thus only other [..](F)-rules need to be checked for simplifiability and [..](SR) is not modified.
We need a fairness selection hypothesis, in order to ensure that any rule and equation will be selected after a finite time to compute its critical pairs, except if it has been deleted.
The main procedure is INDUCTIVE COMPLETION. The DIRECT procedure attempts to direct a pair (p,q) into a rule. The SIMPLIFICATION procedure reduces as much as possible rules in [...]. Rules that are simplifiable by l->r must become new pairs because their orientation may change. The CRITICAL-PAIRS procedure computes overlappings between the rule or the equation given as argument and other equations and rules. This procedure also sets protections or builds extension-pairs if necessary, and normalizes pairs before adding them to the set of new pairs.

INDUCTIVE COMPLETION (SP, SR, SE, [...]

[...] PRED(OPP(x))
OPP(PRED(x)) -> SUCC(OPP(x))
PLUS(ZERO,y) -> y
PLUS(SUCC(x),y) -> SUCC(PLUS(x,y))
PLUS(PRED(x),y) -> PRED(PLUS(x,y))
Equations :
PLUS(x,y) = PLUS(y,x)
PLUS(PLUS(x,y),z) = PLUS(x,PLUS(y,z))
[...]
MULT(ZERO,x) -> ZERO
MULT(SUCC(x),y) -> PLUS(y,MULT(x,y))
MULT(PRED(x),y) -> PLUS(OPP(y),MULT(x,y))
MULT(PLUS(x,y),z) -> PLUS(MULT(x,z),MULT(y,z))
MULT(OPP(x),y) -> OPP(MULT(x,y))
[...]
MULT(MULT(x,y),z) = MULT(x,MULT(y,z))
MULT(x,y) = MULT(y,x)

The completion algorithm then generates and proves the new rules
PLUS(x,OPP(x)) -> ZERO,
OPP(OPP(x)) -> x,
OPP(PLUS(x,y)) -> PLUS(OPP(x),OPP(y)),
and terminates then with success. This example has been processed with the help of the system FORMEL due to G. Huet, G. Cousineau and their co-workers at INRIA.
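The three rules reported for OPP and PLUS can at least be tested on ground terms. The Python sketch below is not the completion procedure itself: it is a small innermost normaliser for the rules of BR and DR of the integer specification, written under our own assumptions (tuple encoding of ground terms, hypothetical names `norm` and `bf_terms`), and it only checks the new rules on a few sample instances.

```python
Z = ("ZERO",)

def norm(t):
    """Innermost normal form of a ground term for the rules of BR and DR."""
    op, args = t[0], [norm(a) for a in t[1:]]
    if op == "SUCC" and args[0][0] == "PRED":      # SUCC(PRED(x)) -> x
        return args[0][1]
    if op == "PRED" and args[0][0] == "SUCC":      # PRED(SUCC(x)) -> x
        return args[0][1]
    if op == "OPP":
        x = args[0]
        if x[0] == "ZERO":                         # OPP(ZERO) -> ZERO
            return Z
        if x[0] == "SUCC":                         # OPP(SUCC(x)) -> PRED(OPP(x))
            return norm(("PRED", ("OPP", x[1])))
        if x[0] == "PRED":                         # OPP(PRED(x)) -> SUCC(OPP(x))
            return norm(("SUCC", ("OPP", x[1])))
    if op == "PLUS":
        x, y = args
        if x[0] == "ZERO":                         # PLUS(ZERO,y) -> y
            return y
        if x[0] == "SUCC":                         # PLUS(SUCC(x),y) -> SUCC(PLUS(x,y))
            return norm(("SUCC", ("PLUS", x[1], y)))
        if x[0] == "PRED":                         # PLUS(PRED(x),y) -> PRED(PLUS(x,y))
            return norm(("PRED", ("PLUS", x[1], y)))
    return (op, *args)

def bf_terms(depth):
    """Ground terms over ZERO, SUCC, PRED with at most `depth` nested symbols."""
    terms = [Z]
    for _ in range(depth):
        terms = [Z] + [(f, t) for f in ("SUCC", "PRED") for t in terms]
    return terms

samples = bf_terms(2)
checks = [
    all(norm(("PLUS", x, ("OPP", x))) == Z for x in samples),
    all(norm(("OPP", ("OPP", x))) == norm(x) for x in samples),
    all(norm(("OPP", ("PLUS", x, y))) == norm(("PLUS", ("OPP", x), ("OPP", y)))
        for x in samples for y in samples),
]
print(checks)   # [True, True, True]
```

Such a test on finitely many ground instances is of course no proof; it only agrees with the fact that the three equalities hold in the initial algebra.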

PROOF OF THE INDUCTIVE COMPLETION ALGORITHM

For a given k, let SRi be the current set of rewrite rules at the ith terminal recursive call (1), (2), (3) or (4) of the procedure. R+ = U SRi (union over i) is the set of all the rules generated during the process. R# is the set of the rules which are never reduced neither on the right or the left. We are going to prove by successive lemmas the main theorem:

Theorem 2 :
Let F = F0 U ... U Fk-1 U Fk be a set of function symbols, SPEC = (S, F, R U E) a structured specification based on BSPEC = (S, [..](F), [..](R) U [..](E)), R" a set of [..](F)-rules and E" a set of [..](F)-equations s.t. a complete (E U E")-unification algorithm is known. Assume that =ind(R0 U E0) is decidable and that the general inductive completion procedure is initialized with R", R, E U E", [...]

[...] =(R U R" U E U E") [...] lemma 1, t'1 =[..](E) t2 and t2 is in T([..](F)).

Lemma [..] :
If the procedure does not stop with failure nor disproof, then
- SPEC" is consistent and complete w.r.t. BSPEC#,
- SPEC = (S, F, R U E) is consistent w.r.t. BSPEC,
- every equation of E" U R" holds in I(F, R U E),
- I(F, R U E) = I(F, R# U E) = I(F, R U R" U E U E").
Proof: By theorem 1, SPEC" is then complete and consistent w.r.t. BSPEC#. It is easily proved that =([..](R) U [..](E)) = =([..](R#) U [..](E)) on ground terms of T([..](F)). Thus BSPEC# = BSPEC and SPEC" is consistent w.r.t. BSPEC. The theorem of validity in [...] then applies and yields that SPEC is consistent w.r.t. BSPEC# and any equation of (R" U E") holds in I(F, R U E). The identity of initial algebras are easily obtained from validity lemma (with BA = ([..](R) U [..](E)), A = (R U E) and A' = (R U R" U E U E")) and lemma 4.
[]
The point b) of theorem 2 is based on the following lemma:

Lemma 7 :
If the procedure stops with disproof at some recursive call i,
- SPECi = (S, F, SPi U SRi U SE) is not consistent w.r.t. BSPEC,
- SPEC' = (S, F, R U R" U E U E") is not consistent w.r.t. BSPEC.
Proof: If the procedure stops with disproof, there exists a pair (p,q) in SPi such that p and q are both in T([..](F)) and not SE-equal. Then either k=1 and p=q does not hold in I(F0, E0 U E0), or INDUCTIVE COMPLETION((p,q), Fk(SRi), [..](SE), card([..](SRi)), k-1) stops with disproof and by induction hypothesis, p=q does not hold in I([..](F), Fk(SRi) U [..](E)). There exists then a substitution s from X to T(Fk(F)) such that s(p) ≠([..](SRi) U [..](E)) s(q). Thus s(p) ≠([..](R) U [..](E)) s(q) and SPECi is not consistent w.r.t. BSPEC. Then, from lemma 3, on T(Fk(F)), =(SPi U SRi U SE) is included into =(SP0 U SR0 U SE) = =(R U R" U E U E"). Thus if SPECi is not consistent w.r.t. BSPEC, SPEC' is clearly not consistent w.r.t. BSPEC.
In order to obtain the second point b) of the theorem 2, we apply the validity lemma to BSPEC, SPEC and SPEC'. Let us now prove c):

Lemma 8 :
If there exists some assertion of R" U E" which does not hold in I(F, R U E) or if SPEC = (S, F, R U E) is not consistent w.r.t. BSPEC, the inductive completion procedure stops with either disproof or failure.
Proof: Applying the validity lemma to BSPEC, SPEC and SPEC', we obtain that SPEC' is not consistent w.r.t. BSPEC. Thus there exist two terms t and t' in T([..](F)) such that: t =(R U R" U E U E") t' and t ≠([..](R) U [..](E)) t'. If the procedure terminates or loops, (according to [KIR,83] or [J&K,84]), there exist terms [...] s.t. t -*->Ri' [...]
[...]

(∀ B,A)[finite(A) & B ⊆ A -> finite(B)]

assume                                    (1)  Finite(A) and B ⊆ A
assume                                    (2)  B1 ≠ 0 and B1 ⊆ pB
definition finite                         (3)  Finite(A) iff (∀ B)(B ≠ 0 & B ⊆ pA -> (∃ C) C min-elt B)
1 simp                                    (4)  Finite(A)
4 implies using 3                         (5)  (∀ B)(B ≠ 0 & B ⊆ pA -> (∃ C) C min-elt B)
5 us                                      (6)  If B1 ≠ 0 & B1 ⊆ pA then (∃ C) C min-elt B1
1 simp                                    (7)  B ⊆ A
7 theorem using theorem 2.12.8            (8)  pB ⊆ pA
2 simp                                    (9)  B1 ⊆ pB
9, 8 theorem using theorem 2.4.2          (10) B1 ⊆ pA
2 simp                                    (11) B1 ≠ 0
11, 10 implies using 6                    (12) (∃ C) C min-elt B1
2, 12 cp                                  (13) If B1 ≠ 0 & B1 ⊆ pB then (∃ C) C min-elt B1
13 ug                                     (14) (∀ B1)(B1 ≠ 0 & B1 ⊆ pB -> (∃ C) C min-elt B1)
14 introduction using definition finite   (15) Finite(B)
1, 15 cp                                  (16) If finite(A) & B ⊆ A then finite(B)
16 ug                                     (17) (∀ B,A)[finite(A) & B ⊆ A -> finite(B)]
***QED***

There is a great deal more I could say about the use of the interactive theorem prover in the set-theory course, but since this is meant to be a talk about not what has been done but what should be done in the future I will say no more about it.
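For readers who want to experiment with the definition of finiteness used at step (3) of the proof above, here is a small brute-force sketch in Python. It is only an illustration under my own assumptions: the function names are hypothetical, "C min-elt B" is read as "C belongs to the family B and no member of B is a proper subset of C", and Python sets are of course finite to begin with, so the check can never fail.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as a list of frozensets (pA in the proof)."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def minimal_elements(family):
    """Members of the family that have no proper subset inside the family."""
    return [c for c in family if not any(d < c for d in family)]

def tarski_finite(a):
    """Step (3): every nonempty family of subsets of a has a minimal element."""
    families = powerset(powerset(a))        # every family B with B ⊆ pA
    return all(minimal_elements(f) for f in families if f)

A = {1, 2, 3}
B = {1, 3}                                              # B ⊆ A, as in step (1)
print(set(powerset(B)) <= set(powerset(A)))             # step (8): pB ⊆ pA
print(tarski_finite(A), tarski_finite(B))
```

The point of step (10) is visible in the code: any family of subsets of B is also a family of subsets of A, so the minimal element guaranteed for A serves for B as well.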

2.  Desirable Features for the User

In discussing the desirable features of the next generation of interactive theorem provers, it is natural to break the analysis into two parts. The most important is from the standpoint of the user but for reasons that I shall try to bring out it is almost equally important that the desirable features for authors creating courses be given serious and thoughtful consideration.

Flexibility of interaction. First, above all, in the list of features is easy and flexible use of the theorem-proving machinery. There can be in the construction of only a moderately difficult theorem for a student what corresponds to 25 or 30 pages of interaction as long as it is the kind of interaction that is easy for the student. There is, of course, more than one criterion of ease. If the student has to go through an awkward path to construct a proof because of the severe limitations on the theorem prover, then it does not have the proper sense of flexibility. There is a general problem of human engineering of the proper interface between the student and the interactive theorem prover that is too easy to forget about. I suppose the point I would stress the most is that the interface must be such that it can be used by someone with no programming experience of any kind. This is one criterion we have had to meet as a strict test in having our interactive theorem provers be used by large numbers of students. No programming requirements are placed on the students and they come to the course with, in many cases, no prior background in programming. We emphasize that they will learn nothing about programming or about computers in the courses. These are courses teaching them a given subject matter. Computers are being used just as they are used in banks or in factories. Students are not in these courses to learn about computers or to gain any programming experience. The interface has got to be thought of in this fashion, it seems to me, in order for us to have a successful next generation of theorem provers. If we put in more power and generality we must be careful that this power and generality do not impose strains on the use of the system by relatively naive users.

Minimum input. The technical typing of mathematical formulas is an arcane and difficult art. It is something that we do not want to get in the way of students' giving proofs. This means that we want to think of the interactive theorem prover as offering as much as possible a control language to the students, not directly a language for writing mathematics. There is a tension here that will not go away and that will remain with us forever, for there will be, on the one hand, the desire to make the student input as natural as possible in terms of ordinary mathematical practice of writing informal proofs, and, on the other hand, even informal proofs require, in subjects with any development, fairly elaborate mathematical formulas that are painful and unpleasant to type. We will therefore have a tension between a relatively more arcane control and the use of mathematical English in the giving of proofs. At the stage of development we should see in the next generation I think we should continue to concentrate, as we have in the set-theory course, on minimum input and the conception that we are offering to a student a control language rather than an informal mathematical language as his main vehicle for expressing the proof he wants to give.

Power. The theorem-proving machinery in the set-theory course is not as powerful as one would like, and natural inferences cannot be made directly and easily. The criterion here is the inferential leaps that are naturally and easily made by teachers and students in giving proofs. Now the leaps and jumps that will be made by different students and different instructors will vary widely, but I think there is a common understanding of when matters are too tedious and too much time is being spent on routine that should be swept under the rug. Power can be increased by having at the heart of matters more powerful resolution theorem provers, but I think that what I have to say under the inclusion of expert knowledge in the form of heuristic guidance and automatic "subject-matter" inference is probably as important an ingredient of power as direct computation. Do not misunderstand me. Increasing the power of the resolution theorem provers or related theorems that can be called by students is of importance and should by no means be neglected. The increasing cheapness of sheer computational power makes the prospects here rather bright.

Heuristic guidance. The incorporation of expert knowledge about a given subject matter and, in particular, a given course by specific heuristic guidance available to students in a variety of forms, especially under the form of goal structures and responses to calls for help is one of the most difficult, tedious, and time-consuming aspects of good interactive theorem provers. It is unfortunately an aspect that I see little prospect for being able to get right in a general way. Certainly it would seem for the next generation of theorem provers the best we can hope is to incorporate highly specific knowledge of a given subject matter, put together most likely by an experienced instructor in the subject. In fact, I guess I would express my skepticism that the kind of specific heuristic guidance and construction of goal structures required would ever become a matter of generalized routine. I do say something about the need for making it easier for authors to implement such guidance in the next section.

Graphics. As far as I know, there is no standard regular use of a theorem prover anywhere in the world that extensively and directly interacts with graphic displays related to that which is being proved. It is obvious that already at the level of high-school geometry a powerful use of graphics is called for. There is every reason to be hopeful about the kind of hardware that will be available to us. We are still, as far as I can see, a long way from having all the tools needed to create a really first-rate course even in high-school geometry. The uses in a variety of other courses should be obvious. I will say no more about this but take it as understood that the extensive interaction with graphic displays should be a high-priority feature.

3.  Desirable Features for Authors

It is too easy to concentrate on the kind of end product that should be available to users. It is obvious to me when I look back on the agony of effort that has gone into creating the set-theory course that if we are to have the kind of widespread use of interactive theorem provers that can be extremely useful in meeting teaching needs in mathematics and science, we must also worry about helping the authors who will actually create the courses using the tools I am calling for. Let me mention here four desirable features.

Nonprogramming environment. The first and most essential requirement is that a sufficiently rich author language be built up that authors can create a new course without having to do any programming, or, ideally, even having the assistance available of a programming staff. We are a very great distance from achieving this objective at the present time, but I see no reason in principle why it is not even a feasible objective for the next generation of interactive theorem provers. I cannot stress too much its importance if we are to see widespread use of theorem provers in both high school and college instruction in mathematics and science.

Easy to use author language. It would be possible to create a nonprogramming environment but one that is so awkward and tedious to use that only authors of the hardiest nature would be willing to tangle with it. It is important that the author language that is created be one that authors like and feel at home with. There is a great deal to be said on this subject. I would just emphasize that once again as much as possible we would like to minimize input on the part of authors. We would like to give them as much as possible a control language for creating a course. As far as I can tell we are very far from having such facilities anywhere in the world at the present time.

Flexible course structure. It is also an important requirement that authors have available a clearly formulated and flexible course structure that they can use without programming assistance. The course structures in the elementary logic and set-theory courses with which I have been associated closely myself are not, I think, impressive at all. We did not concentrate on what I generally call the course driver in these cases but more on the theorem-proving apparatus. But good courses using interactive theorem provers need to have the possibility for instructors to not be locked into a single course structure but to satisfy their own particular teaching plans and to fit the course into the curriculum of their particular institution. Again, it is easy to underestimate the importance of this kind of flexibility in terms of making the use of interactive theorem provers a success.

Easy ways to add expert knowledge. Above all, we need to make it easy for instructors without programming experience or programming assistance to add expert knowledge to give the course the full-bodied character it should have. I do not want to underestimate either my ambitions in this area or the difficulties. What is probably most important is not to think in terms of encoding a fixed body of knowledge but to build dynamic procedures that interact with what the student is doing in powerful ways to give pertinent and cogent guidance to the student. Let me give the simplest kind of example, but also one of the most important, that arises in any use of interactive theorem provers. The student begins to construct a proof. He is, let us say, a certain distance into the proof and although he had a reasonable idea to begin with he is now at a loss as to how to continue. He asks for help. A dumb expert system will make him start over and give advice in terms of some preset ideas of what a proof of a given theorem should look like. A smart system of expert knowledge will work in a very different way. It will look at the proof as developed thus far by the student and be able, if he has a reasonable idea, to give him help on continuing and completing the proof he has already begun. Now we all know it is easy to say this but either as instructors ourselves or as ones creating such expert systems, it is no mean feat to come up with such a system. We have had in the last decade a great deal of discussion of such expert systems of help in such trivial subjects as elementary arithmetic. I have myself devoted some time to these matters and so when I say 'trivial' I do not mean to denigrate the work that has been done but just to put it in proper perspective. The kind of work associated with BUGGY created by John Seely Brown and others simply has no obvious and easy extension to a subject at the level, let us say, even of the first course in axiomatic set theory, not to speak of more advanced subjects. The difficulties of creating really good systems of expert knowledge of the kind I am calling for cannot be underestimated. It is, I think, in many respects almost the first item on the agenda for the next generation of interactive theorem provers.

4.  Next Round of Courses

Let me just conclude by listing some of the courses that I think are just right in difficulty for attack by the next generation of interactive theorem provers. None of the courses reaches above the undergraduate level. I think it is going to take one more generation beyond the next one before we can have interactive theorem provers that can be seriously used in graduate courses of instruction in mathematics or science. To emphasize the generality of the framework that needs to be created, let me briefly describe seven standard courses that would be of considerable significance to have available in a computer-based framework and with good interactive theorem-proving facilities. The first three courses could as well be offered to able high school students as to college undergraduates.

Elementary geometry. This course is in fact a high school course. It is a bit of a scandal that we do not yet have a production version of a good elementary geometry course with a good interactive theorem prover available anywhere in the country as far as I know. There are some formidable problems to be solved in creating the theorem-proving facilities required for such a course, especially in having the proper interaction between proofs and graphs of the figures being constructed, but the problems are not of great fundamental difficulty. A standard criticism of many axiomatic theorem-proving elementary geometry courses is that there is too much emphasis on theorems whose geometrical content is limited. I think that a computer-based course can avoid this problem in the way that we have avoided it in the set-theory course. Students could be given individual lists of theorems and they could be led to expect to have to use previous theorems that they themselves have not proved in giving their own proofs. In this way it would be possible carefully to select theorems of geometrical interest that are still sufficiently elementary to let students try them and to deal with the problem of giving adequate proofs. I have some slightly idiosyncratic ideas about this course that I shall not go into here. I think there is a place for a quantifier-free formulation of elementary geometry that has a highly constructive formulation and that could be a basis of a course that would avoid some of the logical intricacies inherent in the quantified formulas that are so much a part of a standard geometry course.

Linear algebra. As has been emphasized by many mathematicians in the past several decades, an elementary course in linear algebra might well replace the elementary geometry course in high school. In any case, a course in linear algebra is now standard fare in every undergraduate mathematics curriculum. There are many nice things about linear algebra from the standpoint of being a computer-based course. Much of the course, for example, requires only a small number of types of variable. One of the standard bookkeeping problems in computer-based courses is that of embodying in a natural and easy way the standard informal use of typed variables. Moreover, most of the proofs in linear algebra are relatively easy and rather computational in spirit. Of the courses I mention in this list I think it would be the easiest to implement. As we all know, it is possible to continue development of the course so that it becomes relatively difficult, but even then I do not see the proofs as being as difficult as the harder proofs in the first course in axiomatic set theory described above.

Differential and integral calculus. The undergraduate teaching staffs are so familiar with this course and it is such a standard service offering, it might be wondered why it should be considered for development as a computer-based course with an interactive theorem prover. I think the main argument for this is that there is a definite place for it in the more than 20,000 high schools in this country. For many of these high schools it is really not economical to staff a small course in the calculus for the small number of students interested in taking it. From a broad national standpoint, however, it is important that such courses be offered to the willing and able students. We know from a great deal of experience that very bright sixteen- and seventeen-year-olds, for example, can do just as well in such a course as students who are a couple of years older. I can see offering an excellent course with theorem-proving facilities but also offering some additional graphical and symbolic facilities as desired of the kind that have been developed in recent years. In fact, one of the problems of the more powerful facilities for elementary integration, for example, is that of knowing exactly what should be available to the student at a given point in the course. It would also be possible to offer such a course in the calculus with a new viewpoint, for example, the viewpoint of nonstandard analysis. That would be a difficult decision because many of the schools at which a computer-based course would be directed would not have instructors who would feel at home with a nonstandard approach to the calculus. In any case, the desirability of such a course seems clear.

Differential equations. Again, this course might just as well be one that would be offered to the very best students in some of the high schools. I must confess that I know of no one who has yet tackled even the first course in differential equations as a computer-based course using an interactive theorem prover. It seems to me, however, that there is nothing that stands in the way of such a course. It is true that many instructors would, and so I would myself, emphasize concepts and applications perhaps more than proofs in the first course, but there is no reason that a computer-based framework could not offer a good approach to these matters as well. Also, here again is a case where the use of sophisticated graphics could be highly desirable.

Introduction to analysis. In this list I am developing, by now the student will be ready for a first course in analysis. Here a theorem prover would really get a proper workout but again I find that the proofs in most books that are billed as a first course in analysis are not especially more difficult or complex than the proofs in set theory. Also in such proofs there is a fairly restricted typing of variables. By careful and judicious arrangements of theorems I see no difficulties in principle, just the difficulties of actually working out all the details in a way that will give a smooth-running course with the kinds of facilities that could be offered in the near future.

Introduction to probability. The deductive organization of the first course in probability that assumes a background in the calculus is a natural subject to put within a computer-based framework. Also, it is a course that some faculties are not particularly interested in teaching. It could also be made available again for the very best students in high school. Most of the introductory courses in probability at the level I am talking about do not require very elaborate proof procedures. The machinery, I think, would not be too difficult to implement.

Theory of automata. An undergraduate course in the theory of automata is again at the right level of difficulty. There are some problems in this course. The notation is harder than the proofs themselves. The structures being considered are complicated but most of the proofs are not of a comparable difficulty. At the present time this is something of a problem in computer-based frameworks but it is something that I am sure we will see solved in a reasonable and intuitive fashion in the near future. I mention this only as one theoretical course in computer science. It is clear that there are other undergraduate courses in computer science that will, on occasion, be thinly staffed in many colleges and universities. The opportunities for computer-based courses using sophisticated interactive theorem provers in this domain are perhaps among the best of any of the areas I have touched on.

In closing I want to return to my point that what we need is a general facility for creating such courses. The logic and set-theory courses with which I have been closely associated myself have an inevitably parochial character in their organization and conception. This is because they were created from scratch and were focused on solving the immediate problems at hand for the subject at hand and not on creating a general framework usable by many different people for many different courses. At the time we were creating these courses it would have been premature to aim at such generality. It is not now. It is what we need in the near future in order to fulfill the promise of the role that theorem provers should be playing in the teaching of mathematics and science.

REFERENCES

McDonald, J., and Suppes, P. Student use of an interactive theorem prover. In W. W. Bledsoe and D. W. Loveland (Eds.), Automated theorem proving: After 25 years. Providence, R.I.: American Mathematical Society, 1984.
Suppes, P. Introduction to logic. New York: Van Nostrand, 1957.
Suppes, P. Axiomatic set theory. New York: Van Nostrand, 1960. Slightly revised edition published by Dover, New York, 1972.
Suppes, P. Computer-assisted instruction at Stanford. In Man and computer. Basel: Karger, 1972.
Suppes, P. (Ed.), University-level computer-assisted instruction at Stanford: 1968-1980. Stanford, Calif.: Stanford University, Institute for Mathematical Studies in the Social Sciences, 1981.
Suppes, P., and Binford, F. Experimental teaching of mathematical logic in the elementary school. The Arithmetic Teacher, 1965, 12, 187-195.
The Linked Inference Principle, II: The User's Viewpoint*
L. Wos
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue
Argonne, IL 60439

R. Veroff
Department of Computer Science
University of New Mexico
Albuquerque, NM 87131

B. Smith
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue
Argonne, IL 60439

and

W. McCune
Department of Electrical Engineering and Computer Science
Northwestern University
Evanston, IL 60201
1. Introduction

In the field of automated reasoning, the search continues for useful representations of information, for powerful inference rules, for effective canonicalization procedures, and for intelligent strategies. The practical objective of this search is, of course, to produce ever more powerful automated reasoning programs. In this paper, we show how the power of such programs can be sharply increased by employing inference rules called linked inference rules. In particular, we focus on linked UR-resolution, a generalization of standard UR-resolution [2], and discuss ongoing experiments that permit comparison of the two inference rules. The intention is to present the results of those experiments at the Seventh Conference on Automated Deduction. Much of the treatment of linked inference rules given in this paper is from the user's viewpoint, with certain abstract considerations reserved for Section 3.
*This work was supported in part by the Applied Mathematical Sciences Research Program (KC-04-02) of the Office of Energy Research of the U.S. Department of Energy under Contract W-31-109-Eng-38 (Argonne National Laboratory, Argonne, IL 60439).
Employment of linked inference rules enables an automated reasoning program to draw conclusions in one step that typically require many steps when standard (unlinked) inference rules are used. Each of the linked inference rules is obtained by applying the "linked inference principle", a principle of reasoning that is fully developed in [8]. Application of the linked inference principle yields generalizations of a number of well-known inference rules--for example, binary resolution, UR-resolution, hyperresolution, and paramodulation. The larger steps permitted by the use of various linked inference rules are made possible by substituting semantic criteria for syntactic criteria. In particular, the usual clause boundaries that define certain well-known inference rules are replaced by criteria defined in terms of the meaning of the predicates and functions being employed. For example, consider the case in which the natural representation of a problem produces two nonunit clauses, each containing negative literals. Further, assume that some reasoning step that you would like a reasoning program to take requires the simultaneous consideration of the two nonunit clauses. Neither standard hyperresolution nor standard UR-resolution suffice on syntactic grounds. However, linked UR-resolution would not be so restricted for the criteria that are employed governing application of the rule are semantic and not syntactic.

Although the discussion focuses chiefly on the inferential process, we shall touch on some consequences and benefits of a strategic nature that are present when employing linked inference. (We often discuss various strategic and inferential approaches as if they are integrally connected when the connection is in fact primarily historical. The study of linked inference has led to the discovery of strategies that can be used outside of linking, strategies such as the target strategy [8] and the extension of the set of support strategy [7], for example.) Notwithstanding our emphasis on the user, we shall briefly discuss the abstract notion of linking in order to show how it unifies a number of concepts.

In addition to furthering the objective of producing more effective and more powerful reasoning programs, linking contributes to two other objectives. First, the user is provided greater control over the performance of an automated reasoning program when attacking some specific problem. In particular, the user has control over the size of the steps that occur in a deduction, and also can instruct an automated reasoning program to restrict its deductions to those directly relevant to some chosen concept or goal. Second, the user is permitted more freedom in the use of a natural presentation of a specific problem. The choice of notation is not dictated by the syntactic flavor of the clauses to be deduced. In contrast, because of such syntactic considerations, many (unlinked) inference rules place limitations on the choice of notation.

The problems we solve with linked inference are taken from the world of puzzles, circuit design, and program verification. By selecting from the first of these three areas, we can immediately give a sample of how linked inference works. We choose two fragments of a puzzle called the "jobs puzzle". The jobs puzzle concerns four people who, among them, hold eight jobs. Each of the four people--Roberta, Thelma, Steve, and Pete--holds exactly two jobs. Each of the eight jobs--actor, boxer, chef, guard, nurse, police officer, teacher, and telephone operator--is held by one person. You are asked to determine which jobs are held by which people. Among the clues, you are told that Roberta is not the chef, and that the husband of the chef is the telephone operator.

Perhaps you have jumped to the conclusion that Thelma is the chef. To see if you are right, the following clauses can be used to represent this puzzle fragment, and UR-resolution can be used to draw conclusions.

Roberta is not the chef.
(1) -HASAJOB(Roberta,chef)

If a job is held by a female, then the female is Roberta or Thelma.
(2) -FEMALE(jobholder(y)) HASAJOB(Roberta,y) HASAJOB(Thelma,y)

If a person is a wife, then that person is female.
(3) -HUSBAND(x,y) FEMALE(y)

The person who holds the job of telephone operator is the husband of the person who holds the job of chef.
(4) -HASAJOB(x,telop) HUSBAND(x,jobholder(chef))

(A second clause
-HUSBAND(x,jobholder(chef)) HASAJOB(x,telop)
is needed to complete the representation of this fact, but it does not participate in the illustration.)

For every job, there is a person holding that job.
(5) HASAJOB(jobholder(y),y)

UR-resolution suffices to yield the desired conclusion in three steps. From 5 as satellite and 4 as nucleus,
(6) HUSBAND(jobholder(telop),jobholder(chef))
is obtained. From 6 as satellite and 3 as nucleus,
(7) FEMALE(jobholder(chef))
is obtained. And finally, from 7 and 1 as satellites with 2 as nucleus,
(8) HASAJOB(Thelma,chef)
is deduced as the desired result.

A person solving this puzzle fragment would simply and naturally conclude clause 8 by (simultaneously) considering the information contained in clauses 1 through 5. The information contained in clauses 6 and 7 would not exist, at least explicitly. One variant of linked UR-resolution would also immediately deduce clause 8 by simultaneously considering clauses 1 through 5 without deducing clauses 6 and 7. In particular, in the terminology to be illustrated in this paper, the user could instruct a reasoning program to choose clause 1 as the "initiating satellite" and the predicate HASAJOB as the "target". With those choices, clause 2 is the "nucleus", and clauses 3 and 4 are "linking clauses" that "link" clause 2 to the "satellite", clause 5.

In the example just discussed, the choice of the "target" is motivated by the natural interest in who holds which jobs. The intent is to produce a unit clause as output, since the goal is to establish a (simple) fact rather than a choice of possibilities. The conclusion, HASAJOB(Thelma,chef), is a descendant of a literal in clause 2. Thus, in this variation of linked UR-resolution, the nucleus is the "target clause". In the following variation, however, the nucleus is not the target clause, for the conclusion is not a descendant of a literal in the nucleus.

For this variant, we select another fragment from the "jobs puzzle". Another clue in the puzzle is that the job of nurse is held by a male. From this clue, you quickly deduce that Roberta is not the nurse. (Incidentally, this fragment is the example that led to the first application of the principle of linked inference and, in fact, to the first variant of linked UR-resolution.) The deduction can be obtained with the following clauses.

(9) FEMALE(Roberta)
(10) -FEMALE(x) -MALE(x)
(11) -HASAJOB(x,nurse) MALE(x)
Here again the user chooses the natural target and the inference rule of linked UR-resolution, the two choices motivated as discussed earlier. Clause 9 is the initiating satellite, clause 10 the nucleus, and clause 11 the target clause. The literal MALE(x) of clause 11 links to the literal -MALE(x) of clause 10. The other literal of clause 11, -HASAJOB(x,nurse), is the target literal, and is the parent of the deduced clause (as a literal)
(12) -HASAJOB(Roberta,nurse)
and we have an illustration of a variation on the preceding example of linked UR-resolution.

The two examples illustrate two variants of linked UR-resolution. They show how the user can rely on a natural representation for the problem, without regard to syntactic tricks required by the wish for using a particular inference rule. They show how the step size need not be limited by the obvious clause boundaries that occur when using this natural formulation of the puzzle. Finally, they suggest the possible increase in program control and possible decrease in user effort that can occur when employing linked inference. The increase in control is derived in part from avoiding the generation of certain classes of intermediate clauses and in part from keying on semantically chosen targets. The decrease in effort is derived in part from the ability to rely on a natural representation without being so concerned with the need to generate (intermediate) clauses that are required to be unit clauses.
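The three-step standard UR-resolution derivation of clause 8 from clauses 1 through 5 can be replayed with a small sketch. The encoding is an assumption made only for illustration: variables are strings with a leading "?", compound terms are tuples whose first element is the function or predicate symbol, literals are (sign, atom) pairs, and the helper names (unify, subst, ur_resolve) are hypothetical. The unifier omits the occurs check, which these clauses do not need. The sketch implements standard UR-resolution, so it produces the intermediate clauses 6 and 7 that a linked inference would not retain.

```python
from itertools import permutations

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    # Follow variable bindings in substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    # Extend substitution s so that a and b become equal, or return None.
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    t = walk(t, s)
    return tuple(subst(x, s) for x in t) if isinstance(t, tuple) else t

def rename(t, tag):
    # Rename variables apart by appending a tag.
    if is_var(t):
        return t + tag
    return tuple(rename(x, tag) for x in t) if isinstance(t, tuple) else t

def ur_resolve(nucleus, satellites):
    # Cancel all but one nucleus literal against the unit satellites.
    sats = [rename(lit, f"_{i}") for i, lit in enumerate(satellites)]
    for keep in range(len(nucleus)):
        rest = nucleus[:keep] + nucleus[keep + 1:]
        if len(rest) != len(sats):
            continue
        for order in permutations(sats):
            s = {}
            for (sign_n, atom_n), (sign_s, atom_s) in zip(rest, order):
                s = None if sign_n == sign_s else unify(atom_n, atom_s, s)
                if s is None:
                    break
            if s is not None:
                return subst(nucleus[keep], s)
    return None

c1 = ("-", ("HASAJOB", "Roberta", "chef"))
c2 = [("-", ("FEMALE", ("jobholder", "?y"))),
      ("+", ("HASAJOB", "Roberta", "?y")),
      ("+", ("HASAJOB", "Thelma", "?y"))]
c3 = [("-", ("HUSBAND", "?x", "?y")), ("+", ("FEMALE", "?y"))]
c4 = [("-", ("HASAJOB", "?x", "telop")),
      ("+", ("HUSBAND", "?x", ("jobholder", "chef")))]
c5 = ("+", ("HASAJOB", ("jobholder", "?y"), "?y"))

c6 = ur_resolve(c4, [c5])      # HUSBAND(jobholder(telop),jobholder(chef))
c7 = ur_resolve(c3, [c6])      # FEMALE(jobholder(chef))
c8 = ur_resolve(c2, [c7, c1])  # HASAJOB(Thelma,chef)
print(c6, c7, c8, sep="\n")
```

Each call retains its result as a new clause, which is exactly the intermediate material (clauses 6 and 7) that the linked variant is meant to avoid generating.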
2. Overview

In this section, we briefly review certain material covered in [8]. In particular, we discuss the need for linking, some of its properties, and the motivation for its existence. At the most general level, the motivating forces are the desire to rely on semantic considerations in place of syntactic, the intention of increasing the power and efficiency of automated reasoning programs, and the desire to provide the user with more control over the actions taken by a reasoning program while simultaneously placing less burden on the user. At the more specific level, linking addresses a number of problems commonly faced when using a reasoning program. Of course the problems are extremely interconnected, but let us discuss them as if they are somewhat separate.

The first problem focuses on the size and nature of the steps that ordinarily occur in a deduction. Because many inference rules are constrained and defined in terms of syntactic criteria alone, and because clause boundaries currently prohibit certain combinations of facts from being simultaneously considered, the steps taken by a reasoning program are often smaller than necessary or desired. The termination of a deduction step is often given in terms of syntactic criteria such as the signs of the literals and the number of the literals. Users of an automated reasoning program might well prefer the termination condition to rely on the significance and the meaning of the conclusion. In standard hyperresolution, for example, a reasoning program might be forced to accept a conclusion containing two positive literals, while linked hyperresolution might produce a conclusion with but one positive literal, the second literal being removed with a negative unit clause.

The second problem concerns representation and its interconnection with the choice of inference rule(s) and strategy. The choice of predicates and functions is typically dictated by a desire for convenience, readability, and naturalness. Too often the user wishes to use one and finds that, for example, binary resolution must then be included as an inference rule. If binary resolution is to be avoided--after all, its use results in too many clauses of too small a step size--then more clauses must be added to the set of support. Thus, in these situations, the user is forced to choose between using an inference rule that may be too prolific and weakening the power of the chosen strategy.

The third problem addresses user control of the actions taken by an automated reasoning program. In many cases, the user does not wish to be forced to read through and examine a myriad of conclusions, but instead wishes to be presented only with important conclusions. The intent of using linked inference is to provide each user with a means for telling a reasoning program which concepts are interesting, in turn permitting the program to present only conclusions consistent with the given instruction. By judicious choices, the program can be prohibited from generating intermediate clauses by in effect classifying them as relatively insignificant. Many such clauses are merely links between one significant statement and another significant one. Depending of course on the price paid (measured in terms of time) for achieving this reduced clause space, a sharp increase in efficiency results.

The fourth and final problem is that of extending the power and flavor of the set of support strategy. The strategy, as currently defined and employed, partitions the input clauses into those with support and those without. The strategy prohibits application of an inference rule to a set of clauses when none of its members has support. By doing so, many clauses that would have been generated are not generated. This action is far more efficient than actions that generate clauses and then purge them subject to various criteria. However, as pointed out by Luckham [1], this partitioning of the input clauses is not recursively present--is not present at higher levels of the clause space. Of course, levels 2, 3, 4, and so on are smaller than they would have been because of the clauses not present at level 1 when using the strategy. But the recursive power, were it present, would further partition the retained level 1 clauses into two sets, one with support and one without. With this action, the level 2 clause space would be smaller than with the standard definition of the set of support strategy. The problem thus is to provide a means for partitioning each level of retained clauses--as currently occurs for level 0--and, as a result, to continually reduce the size of levels greater than 1 comparable to the way that level 1 is reduced. Of course, the object is to extend the set of support strategy in this fashion without (operationally) losing refutation completeness. (Questions of refutation completeness are not addressed in this paper.)

Application of the linked inference principle to produce various linked inference rules addresses these various problems. Intuitively, linked inference rules enable an automated reasoning program to "link" together as many clauses as are required to skip the less significant intermediate results. The object is to draw a conclusion, or term the deduction step complete, only when a significant result has been obtained, where significance is defined by the user. Rather than totally abandoning syntactic notions such as sign of literals and number of literals and clause boundaries, linked inference rules permit the user to combine such criteria with certain semantic notions. Thus, for example, linked UR-resolution requires that the conclusion be a unit clause but, when the appropriate strategy is employed, broadens the definition of standard UR-resolution to require that the unit clause satisfy additional specified criteria. This extension enables a reasoning program to avoid terminating the deduction step merely because of having produced some unit clause. The extension also allows the use of clauses that would normally not be considered satellites. Linked UR-resolution is to standard UR-resolution as standard UR-resolution is to unit resolution.
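The effect of the standard set of support strategy discussed under the fourth problem can be seen in a small propositional sketch. The clause set, the integer encoding of literals, and the function names are assumptions chosen for illustration. The sketch implements the standard strategy only (every resolvent inherits support), not the recursive extension the section calls for.

```python
def resolvents(c1, c2):
    # All non-tautological binary resolvents of two propositional clauses.
    out = set()
    for lit in c1:
        if -lit in c2:
            r = (c1 - {lit}) | (c2 - {-lit})
            if not any(-l in r for l in r):
                out.add(frozenset(r))
    return out

def saturate(axioms, supported, use_sos, max_levels=10):
    # Level saturation; with use_sos, each inference needs a supported parent.
    known = set(axioms) | set(supported)
    has_support = set(supported) if use_sos else set(known)
    for _ in range(max_levels):
        new = set()
        for c1 in known:
            for c2 in known:
                if c1 in has_support or c2 in has_support:
                    new |= resolvents(c1, c2) - known
        if not new:
            break
        known |= new
        has_support |= new  # descendants of supported clauses are supported
        if frozenset() in known:
            break
    return frozenset() in known, len(known)

# Literals are nonzero integers; a negative integer is a negated atom.
axioms = [frozenset(c) for c in
          ([1], [-1, 2], [-2, 3], [-3, 4], [5], [-5, 6], [-6, 7])]
denial = [frozenset([-4])]  # the only clause given support

refuted_sos, kept_sos = saturate(axioms, denial, use_sos=True)
refuted_all, kept_all = saturate(axioms, denial, use_sos=False)
print(refuted_sos, kept_sos)
print(refuted_all, kept_all)
```

Both runs find the empty clause, but the supported run never resolves the axioms among themselves, so it retains fewer clauses at every level.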
3. The Linked Inference Principle

Note that, when referring to a clause, here we mean an occurrence of a clause. Thus the mention of two clauses admits the possibility that the clauses are identical--are merely copies of the same clause. Similarly, reference to a literal means an occurrence of a literal. Thus the mention of two literals does not imply that the two literals are distinct, but rather that they are different occurrences of (possibly the same) literals. Although we could define the linked inference principle to cover literal merging and to cover factoring [6], such a definition interferes slightly with an understanding of the principle and is not consistent with the current implementation of the corresponding inference rules. Employment of the principle, and hence of various inference rules derived from it, requires using factoring as an additional inference rule. Finally, we choose here to define the principle at the literal level, presenting its extension to the term level in another paper. Thus, equality-oriented inference rules such as paramodulation are not covered in this treatment.

3.1 Definition

The linked inference principle is a principle of reasoning that considers a finite set S of two or more (not necessarily distinct) clauses with the objective of deducing a single clause that follows logically from the clauses in S. The principle applies if there exists an appropriate function f and an appropriate unifier u that depends on f. (The formal discussion of appropriate functions and unifiers is found in [8].) Such a function--that required for the linked inference rule to apply--pairs literals in the same sense that certain standard inference rules do. For example, standard binary resolution considers two clauses and pairs a literal in one with a literal of opposite sign in the other with the intent of the two literals unifying. As a different example, standard UR-resolution considers a nucleus and a set of satellites, and pairs all but one of the literals in the nucleus with the satellites. Again the intent is to find an appropriate unifier that depends on the pairings, and that is required to simultaneously unify the chosen pairs.

In the same spirit, linked UR-resolution considers a nucleus, a set of satellites, and various linking clauses. It pairs literals in the nucleus with satellites and possibly with literals from linking clauses with the intent of employing a unifier that simultaneously unifies the chosen pairs. The object is to "cancel", with one exception, all literals in all clauses--the exception being the linked UR-resolvent that results if the application is successful. With this perspective, standard UR-resolution can be viewed as an instance of linked UR-resolution. Just as standard UR-resolution and standard hyperresolution can be implemented as a sequence of the appropriate binary resolutions, so also can various linked inference rules. However, as expected, linked inference rules are not implemented in this fashion, but instead are implemented to avoid the generation of intermediate clauses.

3.2 Inference Rules

Before briefly discussing other concepts covered by the linked inference principle, we focus on inference rules. For example, binary resolution is captured by the linked inference principle, even when a clause is resolved with itself. In that case, S consists of two copies of the same clause, and each of the literals not involved in the unification is mapped to itself. Factoring, on the other hand, is not captured by the principle as presented here. Hyperresolution, negative hyperresolution, and UR-resolution are also captured. Equally, linked UR-resolution and linked hyperresolution are captured by the linked inference principle.
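The idea of a pairing of literal occurrences together with one unifier that simultaneously unifies all chosen pairs can be sketched on the nurse fragment of the jobs puzzle (clauses 9, 10, and 11 of the introduction). The pairing is passed in explicitly, so no search is shown. The term encoding (variables with a leading "?") and the name linked_step are assumptions for illustration, and the unifier omits the occurs check.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    # Extend substitution s so that a and b become equal, or return None.
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    t = walk(t, s)
    return tuple(subst(x, s) for x in t) if isinstance(t, tuple) else t

def rename(t, tag):
    if is_var(t):
        return t + tag
    return tuple(rename(x, tag) for x in t) if isinstance(t, tuple) else t

def linked_step(clauses, pairs):
    # clauses: list of clauses, each a list of (sign, atom) literals.
    # pairs: list of ((clause, literal), (clause, literal)) index pairs.
    # Returns the one unpaired literal under the simultaneous unifier.
    cl = [[rename(lit, f"_{i}") for lit in c] for i, c in enumerate(clauses)]
    s, used = {}, set()
    for (c1, l1), (c2, l2) in pairs:
        (sign1, atom1), (sign2, atom2) = cl[c1][l1], cl[c2][l2]
        if sign1 == sign2:
            return None  # paired literals must be of opposite sign
        s = unify(atom1, atom2, s)
        if s is None:
            return None
        used |= {(c1, l1), (c2, l2)}
    unpaired = [(i, j) for i, c in enumerate(cl)
                for j in range(len(c)) if (i, j) not in used]
    if len(unpaired) != 1:
        return None  # exactly one exception literal is required
    i, j = unpaired[0]
    return subst(cl[i][j], s)

c9 = [("+", ("FEMALE", "Roberta"))]
c10 = [("-", ("FEMALE", "?x")), ("-", ("MALE", "?x"))]
c11 = [("-", ("HASAJOB", "?x", "nurse")), ("+", ("MALE", "?x"))]

# Pair FEMALE(Roberta) with -FEMALE(x), and -MALE(x) with MALE(x) of clause 11.
result = linked_step([c9, c10, c11], [((0, 0), (1, 0)), ((1, 1), (2, 1))])
print(result)
```

The single unpaired literal, instantiated by the simultaneous unifier, is clause 12; no intermediate clause such as -MALE(Roberta) is ever built.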
We can now give the following formal definition of linked UR-resolution.

Definition. Linked UR-resolution is that inference rule that requires the selection of a unit clause, called the initiating satellite, and a nonunit clause, called the nucleus, such that the literal of the initiating satellite unifies with a literal of opposite sign in the nucleus. In addition, with at most one exception, for each of the remaining literals of the nucleus, there must exist a (unit or nonunit) clause containing a literal of opposite sign that unifies with that literal. The unit clauses are called immediate satellites or satellites of ancestor-depth 1, and the nonunit clauses are called links of ancestor-depth 1. Further, for each of the literals in the ancestor-depth 1 links that are not paired with a literal in the nucleus, with at most one exception, there must exist a satellite of ancestor-depth 2 or a link of ancestor-depth 2 that provides a literal of opposite sign that unifies with that literal. There must be an n such that no satellites or links of ancestor-depth greater than n participate. Next, the number of so-called exception literals in the set consisting of the nucleus and the links must be exactly one. Finally, there must exist a unifier that simultaneously unifies pairwise all pairs designated by the given conditions. The linked UR-resolvent is obtained by applying the unifier to the unpaired literal.

In the unification requirement, the reason for allowing the possibility of no exception literals in the nucleus but instead allowing an exception in one of the links is that the deduced unit clause may be descended from a literal contained in one of the links. Accidental deduction of a unit clause, as can occur with merging, is not permitted. As in standard UR-resolution, we are interested in a definition that operationally predicts, if all conditions are satisfied, that a unit clause will be deduced. Allowing in the definition the unit clause that is accidentally produced by merging leads to an implementation of linked UR-resolution, as well as of standard UR-resolution, that is less effective. The broader definition would force exploration of many paths that in fact would not produce a unit clause.

The definitions, from the abstract viewpoint of the linked inference principle or from the user viewpoint, of linked hyperresolution can be obtained by focusing on the objective of deducing a positive clause rather than a unit clause. The corresponding definitions for linked negative hyperresolution and for linked binary resolution can be obtained by focusing on the corresponding objectives. The linked inference principle is thus seen to capture a number of inference rules.

3.3 Other Applications of the Principle

The linked inference principle also captures other concepts in addition to capturing various inference rules. For example, it captures, with one exception, all classes of proof by contradiction that are signaled by the deduction of the empty clause. The exception is illustrated with the two clauses

P(x) P(y)
-P(x) -P(y)

which, taken together, are an unsatisfiable set. The proof of unsatisfiability requires the use of factoring. The linked inference principle as given here can be extended to capture this type of proof as well by replacing the requirement of the one-to-one property of the required function f. In the extension, rather than simply considering pairs of literals l,f(l), the function is allowed to admit paired sets T,f(T). All literals in any such T must be from a single clause C in S, all literals in f(T) must be from a single clause D, and of course the literals of T must simultaneously unify, so must the literals of f(T), and the two resulting literals must unify and be of opposite sign. This extension is similar to the first definition published for binary resolution [3].

The linked inference principle also captures, with one exception, the notion of the deduction of a clause D from a given set S of clauses. As expected, again the exception is any deduction that requires factoring.

Finally, the linked inference principle can be viewed as capturing the deductive aspects of Prolog [4]. The procedural aspects can be captured by an appropriate strategy. The execution of a Prolog program can be achieved with a single linked hyperresolution.

The point of noting the various concepts captured under the linked inference principle is merely to observe that the principle provides a unifying framework for a number of rather distinct concepts. No claim is being made that, for example, when seeking a proof by contradiction, the user should instruct a reasoning program to search for the appropriate single linked inference. In fact, unless the user has a well-tailored algorithm for solving the problem under consideration, searching for such one-step proofs by contradiction is essentially a waste of time.
4. Experiments

We cannot overstate the importance of proper experimentation in evaluating new ideas. Because our implementation of the linked inference rules is still in a very early stage, we have not yet been able to do the extensive testing that is required to properly assess the value of the concept. In this section we briefly summarize the few experiments that we are running, and include other examples, obtained by hand, to further illustrate the potential of linking.

The problems are selected from three areas: solving puzzles, designing circuits, and proving properties of programs. The experiments focus on comparisons between standard and linked UR-resolution. Because of the obvious need for brevity, we include here only sketches of solutions to problems. A more detailed discussion of the problems is found in [8].

4.1 Solving Puzzles

The first experiment focuses on the "jobs puzzle", a fragment of which was described in the introduction. There are four people: Roberta, Thelma, Steve, and Pete. Among them, they hold eight different jobs. Each holds exactly two jobs. The jobs are: actor, boxer, chef, guard, nurse, police officer (gender not implied), teacher, and telephone operator. The job of nurse is held by a male. The husband of the chef is the telephone operator. Roberta is not a boxer. Pete has no education past the ninth grade. Roberta, the chef, and the police officer went golfing together. Who holds which jobs?

The second experiment concerns the "fruit puzzle". A merchant wishes to sell you some fruit. He places three boxes of it on a table. Each box contains only one kind of fruit: apples, bananas, or oranges. Each box contains a different type of fruit. Each box is mislabeled. Box a is labeled apples, box b oranges, and box c bananas. Box b contains apples. What do the other boxes contain?

4.2 Designing Circuits

Circuit design is an application of automated reasoning that has generated much interest [5]. The basic approach is to describe with axioms the available components and the way they interact, and then deny that a circuit with the desired properties can be constructed. Examination of a proof of the corresponding theorem contains all of the information necessary to specify the design of the desired circuit. The experiments are with multiple-valued logic circuits employing T-gates. The following clause defines a T-gate.
(1)
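As a check on the statement of the fruit puzzle in Section 4.1, the expected answer can be found by exhaustive enumeration. This is a brute-force sketch and not the UR-resolution or linked UR-resolution derivation used in the experiments; the variable names are assumptions.

```python
from itertools import permutations

boxes = ["a", "b", "c"]
labels = {"a": "apples", "b": "oranges", "c": "bananas"}
fruits = ["apples", "bananas", "oranges"]

solutions = []
for perm in permutations(fruits):  # each box holds a different kind of fruit
    contents = dict(zip(boxes, perm))
    mislabeled = all(contents[b] != labels[b] for b in boxes)
    if mislabeled and contents["b"] == "apples":
        solutions.append(contents)

print(solutions)
```

Under the stated clues the enumeration leaves a single assignment: box a holds bananas and box c holds oranges.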
(1) X v FALSE -> X
(2) X v X -> X
(3) X v TRUE -> TRUE
(4) X <=> TRUE -> X
(5) X <=> X -> TRUE
(6) X v (Y <=> Z) -> (X v Y) <=> (X v Z)
(7) X ! Y -> X <=> Y <=> FALSE
(8) ~X -> X <=> FALSE
(9) X & Y -> (X v Y) <=> X <=> Y
(10) X => Y -> (X v Y) <=> Y
(The rule (10) is directly computed, since the dual of => is not a usual connector.)

This system is built from v and <=> instead of & and !. It is confluent modulo the associativity/commutativity of these two connectors.

4-2-2: completion algorithm

Let S = {E1, ..., En} be a set of quantifier-free formulas. To complete S, we run the Knuth and Bendix algorithm in its associative/commutative version, initializing the set of rewrite rules with the Hsiang system (or the dual system) and the set of equations with {Ei = TRUE}. If a formula Ei is already in the form Fi <=> Gi, we can initialize the set of equations with the equation Fi = Gi instead of (Fi <=> Gi) = TRUE. From now on, we suppose that the completion algorithm does not stop with failure because of the generation of an incomparable critical pair. Let R∞ be the (finite or infinite) rewriting system built by the algorithm.

Theorem 4.2: Let F and G be two quantifier-free formulas. The formula F <=> G is a valid consequence of S iff F and G have the same R∞ normal form. In particular, F is a valid consequence of S iff F -*-> TRUE in R∞.

Proof: from theorem 4.1 and the general properties of the Knuth and Bendix algorithm, as explained in [6].

Remark: if the algorithm stops with failure, we can in certain cases run it again after putting the incomparable critical pair into the set of equations, with the associativity/commutativity of ! and & (or <=> and v if we use the dual system). For example, commutative predicates can be handled in this way.

Example: let us consider the following clausal specification:

S = {P(f(x)) v P(x), ~P(f(x)) v ~P(x)}

If we apply to this specification the completion algorithm of section 3.3, based on resolution, we generate an infinite number of rewrite rules:

(system BOOL +)
P(f(x)) v P(x) -> TRUE
~P(f(x)) v ~P(x) -> TRUE
~P(f(f(x))) v P(x) -> TRUE
P(f(f(f(x)))) v P(x) -> TRUE
~P(f(f(f(f(x))))) v P(x) -> TRUE

But if we apply to this specification the Knuth and Bendix completion algorithm, this algorithm will stop with only a finite number of rules, namely:

(Hsiang's system +)
P(f(x)) -> P(x) ! TRUE

Let us consider another example: let S be the following specification for a program to test if an element u is a member of a sequence z:

(1) x=x
(2) ~Elem(u,NIL)
(3) Elem(u,w.x) <=> (u=w) v Elem(u,x)
We have added (1), simple reflexivity, the only equality property needed here. Using the dual Hsiang system and the Knuth and Bendix algorithm, we obtained at once the following complete system:

(dual Hsiang's system +)
x=x -> TRUE
Elem(u,NIL) -> FALSE
Elem(u,w.x) -> (u=w) v Elem(u,x)

If we use the resolution algorithm, we must split the equivalence into three clauses. We obtain an infinite system:

(system BOOL +)
x=x -> TRUE
~Elem(u,NIL) -> TRUE
~Elem(u,w.x) v (u=w) v Elem(u,x) -> TRUE
~(u=w) v Elem(u,w.x) -> TRUE
~Elem(u,x) v Elem(u,w.x) -> TRUE
~Elem(u,w.NIL) v (u=w) -> TRUE
Elem(u,u.x) -> TRUE
~(u=w) v Elem(u,w1.(w.x)) -> TRUE
Elem(u,w1.(u.x)) -> TRUE

In both examples, the Knuth and Bendix algorithm is preferable to the resolution algorithm. That is due to the fact that we can use equivalence relations (such as (3)) as simplifiers. In other cases, both algorithms will run in parallel. For example, if S is {~P(x) v P(f(x))}, each algorithm generates infinitely many rules. Each rule:
~P(x) v P(f(f(f(...(x))))) -> TRUE
produced by the resolution algorithm is associated with the rule:
P(x) & P(f(f(f(...(x))))) -> P(x)
produced by the Knuth and Bendix algorithm.

Actually, the major problem with the Knuth and Bendix algorithm is the orientation of rules; for example if we run the Knuth and Bendix algorithm with S = axiomatization of equality, we generate at once non-orientable critical pairs such as: <(x=y)&(x=z), (x=y)&(y=z)>. Furthermore, with the techniques available now, we cannot put this rule into the set of equations, because we do not have a unification algorithm for this kind of equation.

4-2-3: Knuth-Bendix algorithm as a refutational proof technique

Theorem 4.2 can be particularized for unsatisfiable specifications as follows:
Theorem 4.3: Let S be a set of quantifier-free formulas. S is unsatisfiable iff the Knuth and Bendix completion algorithm generates one of the following rules (x being a boolean variable): X -> TRUE X -> FALSE FALSE -> TRUE TRUE -> FALSE Proof: taking F=FALSE and G=TRUE in theorem 4.2, we obtain: FALSE is a valid consequence of S (i.e. S is unsatisfiable) iff FALSE
and TRUE have the same normal form in R~. Therefore, either FALSE or TRUE must be reducible by R~. Hence, one of the above listed rules has been generated by the algorithm.
Theorem 4.3 is close to results of Hsiang [5] and Fages [4]. However, there are two differences:
-We do not suppose in theorem 4.3 that formulas in S are initially in clausal form.
-In exchange for removing this restriction, we have to compute all critical pairs; consequently, the algorithm can stop with failure if it generates an incomparable critical pair.
5-FIRST ORDER PREDICATE CALCULUS IN AN EQUATIONAL THEORY
5-1: preliminaries
We are now in first order predicate calculus with equality. Furthermore, we suppose that the set of clauses that we consider is divided into two parts:
-A set T of unit clauses of the form {M=N} which define an equational theory.
-A set S of clauses which do not contain any equality predicates.
We suppose that the equational theory T can be compiled into a canonical term rewriting system R. We introduce a new inference rule, called "narrowing", defined as follows: Given a clause C, if there is a non-variable occurrence u in C such that C/u is unifiable with the left-hand side of a rule (l->r) in R with m.g.u. s, the clause N(C) = C(u TRUE} with C in S. We denote by RS this equational term rewriting system. We say that RS is confluent on valid formulas iff for each formula F without equality predicate which is a valid consequence of S u T in predicate calculus with equality, F ~>RS TRUE.
Theorem 5.2: Let S u T be defined as above. We suppose that S does not contain the empty clause. The ETRS RS is confluent on valid formulas iff the following conditions are met:
(i) for each binary factor F of a clause in S, F ~>RS TRUE
(ii) for each binary resolvent R of two clauses in S, R ~>RS TRUE
(iii) for each narrowing N of a clause in S, N ~>RS TRUE
Proof: the run of the proof follows closely the proof of theorem 3.2.
Lemmas A, B, C have to be proved only for clauses in R-normal form (that is sufficient because we use only blocked resolution in theorem 5.1). This fact allows us to ignore the rewriting system R; consequently, the proofs of these lemmas are exactly the same as for theorem 3.2. We need the following additional lemma, which extends lemmas B and C to the narrowing operation:
Lemma H: if C is a clause such that C ~>RS TRUE, and N(C) is a narrowing of C, then N(C) ~>RS TRUE.
Proof: let U be the equational term rewriting system obtained from RS by using only the following rules of BOOL:
~(TRUE) -> FALSE
~(FALSE) -> TRUE
X v TRUE -> TRUE
X v FALSE -> X
X v X -> X
X v (~X) -> TRUE
Let C be a clause such that C ~>RS TRUE and N(C) a narrowing of C. We have C ~>U TRUE because the rules of BOOL which are not in U cannot be applied to a clause. We are going to prove that U is confluent w.r.t. the associativity/commutativity of v, by using the confluence criterion of Peterson and Stickel. This criterion consists in checking the confluence of all AC-critical pairs between rules of U and their extensions. AC-critical pairs means that we use AC-unification instead of ordinary unification (see [14] for more details). It is easy to check that all critical pairs are confluent:
-critical pairs between rules of R are reduced because R is confluent.
-critical pairs between rules of R and rule D -> TRUE, D in S, are confluent from hypothesis (iii), because such critical pairs correspond to a narrowing of D.
-critical pairs between rule D -> TRUE, D in S, and rule XvX -> X of BOOL (or its extension YvXvX -> YvX) are confluent from hypothesis (i), because such critical pairs correspond to a binary factoring of D.
-critical pairs between rule D1 -> TRUE and rule D2 -> TRUE, D1 and D2 in S, with: D1 = D3 v ~L1, D2 = L2, L1 and L2 being two positive unifiable literals, are confluent from hypothesis (ii), because such critical pairs correspond to a binary resolution between D1 and D2.
-other critical pairs are obviously confluent (in particular, the sub-system of BOOL which we use is confluent).
Since C ~>U TRUE, we have C =U TRUE. Hence N(C) =U TRUE from the definition of narrowing. Hence N(C) ~>U TRUE by confluence of U. Therefore N(C) ~>RS TRUE since U ⊆ RS, which ends the proof of lemma H.
We can now prove that the system RS is confluent on valid formulas by using theorem 5.1, which characterizes the valid formulas. The proof is exactly the same as the end of the proof of theorem 3.2 and so is omitted.
From theorem 5.2, we deduce a completion algorithm which is similar to the completion algorithm of section 3.3.
The differences are: -We initialize the set of rules RO to BOOL u R. -We add at step 3 the computation of narrowings. If R~ is the final rewriting system produced by the algorithm, we have the theorem: Theorem 5.3:
The rewriting system R~ has the following properties:
(i) R~ is interreduced
(ii) R~ is equivalent to S u T
(iii) R~ is confluent on valid formulas
Furthermore, for a given rewriting system R associated with the equational theory T, R~ is the only rewriting system associated with a set of clauses which has these properties.
Proof: this proof follows closely the proof of theorem 3.3 and is left to the reader.
5-3: extension to equational term rewriting system
We suppose now that the equational theory T can be compiled into a canonical equational term rewriting system (P,R), P being a set of equations and R being a set of rewrite rules, as described in section 2.1. We suppose that there is a finite and complete algorithm of P-unification. We define P-binary resolution, P-binary factoring, P-narrowing as binary resolution,... in which P-unification is used instead of ordinary unification. To extend the previous results, we need an additional property of the ETRS (P,R). This property has been introduced by Jouannaud and is called P-coherence. Roughly speaking, P-coherence allows us to replace the reduction relation ->P,R defined in section 2.1 by a weaker relation ->R,P defined as follows: The term t1 R,P-reduces at occurrence u to a term t2 using the rule l->r in R iff there exists a substitution s such that tl/u ~P Is and t2 ~ tl[uX) , X+l>O, X+l >Y+ l X>Y
The standard model of this specification is the set of natural numbers, with the usual meaning of 0, 1, +, >. Applying the Knuth and Bendix completion algorithm, we obtain the following system, complete w.r.t. the associativity/commutativity of +, !, &:
(Hsiang's system +)
X+0 -> X
0>X -> FALSE
1>1 -> FALSE
1>0 -> TRUE
X+1>Y+1 -> X>Y
X+1>0 -> TRUE
1>X+1 -> FALSE
X+1>1 -> X>0
If we use the (resolution + narrowing) algorithm of theorem 5.3, we must split the equivalence into two clauses. We obtain an infinite rewriting system:
(system BOOL +)
X+0 -> X
~(0>X) -> TRUE
X+1>0 -> TRUE
~(X+1>Y+1) v (X>Y) -> TRUE
~(X>Y) v (X+1>Y+1) -> TRUE
~(X+1+1>Y+1+1) v (X>Y) -> TRUE
~(X+1+1+1>Y+1+1+1) v (X>Y) -> TRUE
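The finite system obtained by the Knuth and Bendix algorithm can be tried out on ground terms. The following is a minimal sketch, not the authors' implementation: it assumes that a ground sum is represented as a flat Python list of the constants 0 and 1 (which stands in for matching modulo the associativity/commutativity of +), and the names normalize_sum and greater are hypothetical.

```python
def normalize_sum(atoms):
    """Apply X+0 -> X modulo AC of +: drop zeros unless the sum is just 0."""
    ones = [a for a in atoms if a == 1]
    return ones if ones else [0]


def greater(left, right):
    """Rewrite the ground atom left > right to TRUE or FALSE.

    Returns (truth value, list of the rules applied). Uses of X+0 -> X
    are performed by normalize_sum and are not recorded in the trace.
    """
    l, r = normalize_sum(left), normalize_sum(right)
    trace = []
    while True:
        if l == [0]:
            trace.append("0>X -> FALSE")
            return False, trace
        if r == [0]:
            trace.append("1>0 -> TRUE" if l == [1] else "X+1>0 -> TRUE")
            return True, trace
        if l == [1]:
            trace.append("1>1 -> FALSE" if r == [1] else "1>X+1 -> FALSE")
            return False, trace
        # From here on, l contains at least two 1s.
        if r == [1]:
            trace.append("X+1>1 -> X>0")
            l, r = l[1:], [0]
        else:
            trace.append("X+1>Y+1 -> X>Y")
            l, r = l[1:], r[1:]


value, steps = greater([1, 1, 0], [1])   # the atom 1+1+0 > 1
print(value, steps)
```

Every step removes at least one constant, so the loop terminates on any ground input, which is the practical benefit of having a finite complete system rather than the infinite one produced by resolution and narrowing.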
It is proved in this annex that the computation of binary factors and binary resolvents can be considered as a computation of critical pairs restricted to certain critical pairs. To take into account the associativity/commutativity of v, we consider for each rule its extension rule, as defined by Peterson and Stickel [14].
1-Binary factor
C being a clause and F(C) a binary factor of C, the rule F(C) -> TRUE is obviously a critical pair between the rule C -> TRUE and the rule XvX -> X (or its extension YvXvX -> YvX) of BOOL.
2-Binary resolvent
Let C = C1 v L and D = D1 v ~P be two clauses, L and P being two positive unifiable literals. Let s be the m.g.u. of L and P. A binary resolvent of C and D is R(C,D) = C1s v D1s. The rewrite rules associated with C and D are:
C1 v L -> TRUE
D1 v ~P -> TRUE
We suppose that C1 and D1 are not the empty clause (i.e. C and D are not unit clauses). The proof is easily extended if C1 and/or D1 are the empty clause. The rule R(C,D) -> TRUE can be obtained by a computation of critical pairs as follows:
The rules of BOOL:
X v (Y & Z) -> (X v Y) & (X v Z)
X & ~X -> FALSE
can be superposed, generating the rule:
(X v Y) & (X v ~Y) -> X (1)
Rule (1) can be superposed with the rule C1 v L -> TRUE, generating the rule:
C1 v ~L -> C1 (2)
The rule (2) can be considered as a kind of "extension" of the rule C1 v L -> TRUE. Such additional rules are also used by Hsiang [5] and Fages [4] for computing critical pairs. The two rules:
X v C1 v ~L -> X v C1 (extension of rule (2))
Y v D1 v ~P -> TRUE (extension of rule D1 v ~P -> TRUE)
can be superposed since L and P are unifiable. The rule generated is: C1s v D1s -> TRUE, i.e. R(C,D) -> TRUE.
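The two intermediate rules used in this derivation are sound Boolean identities, which can be confirmed by a truth table. The sketch below is only a propositional sanity check under the assumption that C1 and L are treated as propositional variables; the function names are hypothetical and nothing here performs superposition.

```python
from itertools import product

BOOLS = (False, True)


def rule1_is_sound():
    """(X v Y) & (X v ~Y) -> X: both sides agree on every valuation."""
    return all(((x or y) and (x or not y)) == x
               for x, y in product(BOOLS, repeat=2))


def rule2_is_sound():
    """C1 v ~L -> C1, checked only on valuations where C1 v L is TRUE."""
    return all((c1 or not l) == c1
               for c1, l in product(BOOLS, repeat=2)
               if (c1 or l))


print(rule1_is_sound(), rule2_is_sound())
```

Rule (2) is not valid unconditionally (take C1 and L both false), which is why the check is restricted to valuations satisfying the rule C1 v L -> TRUE it was derived from.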
Note that we retain only the rule R(C,D) -> TRUE to build the system R~ confluent on valid formulas. It is not necessary to retain the intermediate rules (1) and (2).
REFERENCES
[1] BIRKHOFF G.: On the structure of abstract algebras. Proc. Cambridge Phil. Soc. 31, pp 433-454 (1935).
[2] BUCKEN H.: Reduction systems and small cancellation theory. Proc. Fourth Workshop on Automated Deduction, 53-59 (1979).
[3] CHANG C.L. and LEE R.C.: Symbolic logic and mechanical theorem proving. Academic Press, New York (1973).
[4] FAGES F.: Formes canoniques dans les algebres booleennes, et application a la demonstration automatique en logique du premier ordre. These de 3me cycle, Universite Pierre et Marie Curie, June 1983.
[5] HSIANG J. and DERSHOWITZ N.: Rewrite methods for clausal and non-clausal theorem proving. ICALP 83, Spain (1983).
[6] HUET G.: A complete proof of correctness of the KNUTH-BENDIX completion algorithm. INRIA, Rapport de recherche No 25, July 1980.
[7] HUET G., OPPEN D.C.: Equations and rewrite rules: a survey. Technical Report CSL-11, SRI International, Jan. 1980.
[8] HULLOT J.M.: Canonical form and unification. Proc. of the Fifth Conference on Automated Deduction, Les Arcs (July 1980).
[9] JOUANNAUD J.P., KIRCHNER C. and KIRCHNER H.: Incremental unification in equational theories. Proc. of the 21th Allerton Conference (1982).
[10] JOUANNAUD J.P.: Confluent and coherent sets of reduction with equations. Application to proofs in data types. Proc. 8th Colloquium on Trees in Algebra and Programming (1983).
[11] KNUTH D. and BENDIX P.: Simple word problems in universal algebra. Computational problems in abstract algebra, Ed. Leech J., Pergamon Press, 1970, 263-297.
[12] LANKFORD D.S.: Canonical inference. Report ATP-32, Department of Mathematics and Computer Science, University of Texas at Austin, Dec. 1975.
[13] LEE R.C.: A completeness theorem and a computer program for finding theorems derivable for given axioms. Ph.D. diss. in eng., U. of California, Berkeley, Calif., 1967.
[14] PETERSON G. and STICKEL M.: Complete sets of reductions for some equational theories. JACM, Vol. 28, No 2, April 1981, pp 233-264.
[15] PLOTKIN G.: Building-in Equational Theories. Machine Intelligence, pp 73-90 (1972).
[16] ROBINSON J.A.: A machine-oriented logic based on the resolution principle. JACM, Vol. 12, No 1, January 1965, pp 23-41.
[17] SLAGLE J.R.: Automatic theorem proving for theories with simplifiers, commutativity and associativity. JACM 21, pp 622-642 (1974).
Using Examples, Case Analysis, and Dependency Graphs in Theorem Proving David A. Plaisted Department of Computer Science University of Illinois 1304 West Springfield Avenue Urbana, Illinois 61801
This work was partially supported by the National Science Foundation under grant MCS 81-09831.
1. Introduction
The use of examples seems to be fundamental to human methods of proving and understanding theorems. Whether the examples are drawn on paper or simply visualized, they seem to be more common in theorem proving and understanding by humans than in textbook proofs using the syntactic transformations of formal logic. What is the significance of this use of examples, and how can it be exploited to get better theorem provers and better interaction of theorem provers with human users? We present a theorem proving strategy which seems to mimic the human tendency to use examples, and has other features in common with human theorem proving methods. This strategy may be useful in itself, as well as giving insight into human thought processes. This strategy proceeds by finding relevant facts, connecting them together by causal relations, and abstracting the causal dependencies to obtain a proof. The strategy can benefit by examining several examples to observe common features in their causal dependencies before abstracting to obtain a general proof. Also, the strategy often needs to perform a case analysis to obtain a proof, with different examples being used for each case, and a systematic method of linking the proofs of the cases to obtain a general proof. The method distinguishes between positive and negative literals in a nontrivial way, similar to the different perceptions people have of the logically equivalent statements A ⊃ B and (~B) ⊃ (~A).
This work builds on earlier work of the author on abstraction strategies [17] and problem reduction methods [18], and also on recent artificial intelligence work on annotating facts with explanatory information [6,7,9]. This method differs from the abstraction strategy in that it is possible to choose a different abstraction for each case in a case analysis proof; there are other differences as well. For other recent work concerning the use of examples in theorem proving see [1] and [2].
1.1 Comparison with previous work
Several methods have been proposed for using examples or semantics in theorem provers. Gelernter [10] developed a geometry theorem prover which used back chaining and expressed semantic information in the form of diagrams; this enabled unachievable subgoals to be deleted. Reiter [19] proposed an incomplete natural deduction system which could represent arbitrary interpretations and use them as counterexamples to delete unachievable subgoals. His method also could use models to suggest instantiations of free variables. Slagle [20] presented a generalization of hyper-resolution to arbitrary models; his system gives a semantic criterion for restricting which resolutions are performed. Ballantyne and Bledsoe [1] give techniques for generating counterexamples in topology and analysis, and also show how examples can help in a positive sense in finding a proof. This idea is extended by Bledsoe [2], who gives methods for instantiating set variables to help prove theorems with existential quantifiers. Our method differs in the following ways:
a) We actually apply a transformation to the clauses themselves and do a search on the transformed clauses. This transformation is based on semantic information.
b) We split the set of input clauses to obtain Horn clauses and use the splits to structure the case analysis in the proof.
c) We construct a dependency graph representing the assertions that can conceivably contribute to a proof, and then restrict the search to these assertions.
Our method permits different models to be used for each case in a case analysis proof. Also, our methods of generating examples and counterexamples are not nearly as sophisticated as those in [1] and [2]; we concentrate on methods that are simple and general and easily mechanized. However, eventually more complex, domain-dependent approaches such as those in [1] and [2] will undoubtedly be necessary.
1.2 An example
We informally discuss an example of how the method works, before giving a formal presentation. This example should make the main features of the method clear. The theorem is the following: for all natural numbers x, x*(x+1) is even. Assume we have the axioms
even(x) V odd(x), even(x) ⊃ odd(x+1), odd(x) ⊃ even(x+1), even(x) ⊃ even(x*y),
and some axioms about arithmetic. The theorem, when negated and Skolemized, becomes ~even(c*(c+1)), which can be viewed as a goal of even(c*(c+1)). We first split the non-Horn clause even(x) V odd(x) to obtain the two clauses even(x) and odd(x), which must be dealt with separately. For semantics we use the standard model of the integers; this will be the "initial model" introduced below. Now, c may be interpreted as an arbitrary integer. Intuitively, when dealing with the case even(x), we would like to interpret c as an even integer, and when dealing with the case odd(x), we would like to interpret c as an odd integer. However, in general, there are technical problems, so that when doing a case analysis it is not always possible to find an interpretation making the case true. Thus when doing the case even(x), it is conceivable that we might be forced to look at an interpretation in which x is odd. For now suppose this does not happen. Suppose we do the case even(x) and interpret c to be 2. Suppose we also
interpret x to be 2. Then we construct causal relations between such interpreted facts; we say even(2) causes even(2*3). From these causal relations we construct a proof for the case even(x). Similarly, for the case odd(x), we interpret x and c as 3, say. We say odd(3) causes even(4) and even(4) causes even(4*3). From these causal relations we construct a dependency graph relating odd(3), even(4), even(4*3), and even(3*4) (making use of x*y = y*x). From this graph we construct a general proof for the case odd(x); these proofs are then combined to obtain a complete proof.
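The interpreted facts of this example can be generated mechanically once a value for c is fixed. The following sketch is only an illustration under the standard model of the integers; the function causal_chain and the string encoding of facts are hypothetical and are not part of the method as presented.

```python
def causal_chain(c):
    """List the interpreted facts leading to even(c*(c+1)) for a given c.

    The case even(x) is used when c is even and the case odd(x) otherwise,
    mirroring the split of the clause even(x) V odd(x).
    """
    if c % 2 == 0:
        # even(x) => even(x*y), instantiated with x = c and y = c+1
        chain = [f"even({c})", f"even({c}*{c + 1})"]
    else:
        # odd(x) => even(x+1), then even(x) => even(x*y), then x*y = y*x
        chain = [f"odd({c})", f"even({c + 1})",
                 f"even({c + 1}*{c})", f"even({c}*{c + 1})"]
    # Check the final fact in the standard model before returning it.
    assert (c * (c + 1)) % 2 == 0
    return chain


print(causal_chain(2))
print(causal_chain(3))
```

Running it for several values of c in the same case gives chains of identical shape, which is the regularity the method abstracts into a general proof for that case.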
2. The Horn Case
First we consider the case in which the input clauses are all Horn clauses. We assume the reader is familiar with the usual concepts of a term, a literal, a clause, a substitution, and so on. For an introduction to such theorem proving terminology see [5] or [15]. An atom is a positive literal. A Horn clause is a clause in which at most one literal is positive; thus P ∧ Q ⊃ R, considered as the clause ~P V ~Q V R, is a Horn clause. Consider the task of finding a proof of a positive literal M (called the goal literal) from a set S of Horn clauses. Assume for now that none of the clauses of S are all-negative clauses. This implies that S is consistent. Therefore there is a Herbrand model of S, which for our purposes is a set of ground atoms which are assumed to be true. Also, this model must make S true, in the sense that all clauses in S are true in the interpretation in which literals in the model are true and all other literals are false. Since S is a Horn set and has no all-negative clauses, there is in fact a minimal Herbrand model of S, which consists of all positive ground literals that may be derived from S by hyper-resolution (equivalently, all positive ground literals that are logical consequences of S). We denote this model by Cl(S), the closure of S. The first step in the proposed theorem proving method is to interpret the function symbols of S and interpret the terms of Cl(S) in a corresponding way to obtain a more "concrete" model that will serve as an example for our theorem prover. Let I be an interpretation of the function and constant symbols of S. If
t is a ground term composed of function and constant symbols of S, let t^I be the value of t in I. Assume for simplicity that S is untyped, so that I has one domain D_I and each function symbol f is interpreted as a function from D_I^n to D_I for n the arity of f. Then I can be extended to a model of S by properly interpreting the predicate symbols of S; let M_I(S) be the minimal such extension of I. We define an interpreted literal to be a literal whose arguments are elements of D_I. Thus an interpreted literal is of the form P(d_1, ..., d_k) where P is a predicate symbol of S and the d_i are elements of D_I, or is of the form ~P(d_1, ..., d_k). An interpreted clause is a clause which is the disjunction of interpreted literals. If L is a ground literal and I is an interpretation, let L^I be the interpreted literal in which each term of L is replaced by its value in I. Thus if L is P(t_1, ..., t_k) and t_i^I = d_i then L^I is P(d_1, ..., d_k).
If C is a ground clause and I is an interpretation, let C^I be the interpreted clause in which each literal L of C is replaced by L^I. For example, if C is a Sb V b Pffx Pfx :::> Pgfx Pgx :::> Pggx gfx 2: Igx
x2:y
Pfx
Theorem: (3z )P(x)
"
:::> :::>
Ix 2: fy gx2:gy
"
y2:z
x2:x x2:y
:::>
x2:z
x 2: ffgfggc
The first four axioms permit derivation of P g^k f^j c. The remaining axioms permit g and f to interchange, decreasing the value in the partial ordering. We consider two interpretations, I_a and I_b.
First Interpretation. We interpret c as 0, f(x) as x, g(x) as x+1, P(x) as x > 0, and x ≥ y as x = y. This model basically counts occurrences of g.
Second Interpretation. We interpret c as 0, f(x) as x+1, g(x) as x, P(x) as x ≥ 0, and x ≥ y as x = y. This model basically counts occurrences of f.
The theorem, negated, becomes ~P(x) V ~(x ≥ ffgfggc). This is converted to P(x) ∧ x ≥ ffgfggc ⊃ GOAL where GOAL is the goal literal. For the first interpretation, assuming J is chosen large enough, we obtain the following dependency graph, where literals have been omitted that cannot contribute to a proof of the goal:
[Figure: dependency graph for the first interpretation. Recoverable node labels: P(0), 0≥0, P(1), 1≥1, P(2), 2≥2, P(3), 3≥3, GOAL.]
For the second interpretation, we obtain the following graph:
x
x =y
::::>
x ~y
x 'C?:y " Y >z ::::> z ~z Suppose the goal is to show that max(x, y) ≥ x. We first introduce new Skolem constants c and d and transform the goal to max(c, d) ≥ c. Let I be the interpretation in which the domain D_I is the natural numbers {0, 1, 2, ...} and max(x, y) gives the maximum of the numbers x and y. Suppose I interprets c as 2 and d as 4. The non-Horn clause x ≥ y V y > x is replaced by the split clause x ≥ y with alternative y > x. The Horn set SH is obtained by replacing x ≥ y V y > x in this way. Now Sk has infinitely many clauses, including 2>3 ⊃ 3=2 and 4>1 ⊃ 4=4 (from the first clause), 4≥2 and 1≥5 etc. (from the split clause), and in general we get all clauses resulting from replacing the variables of clauses in SH with natural numbers and evaluating max in the standard way. The closure Cl(Sk) is all ground literals derivable from Sk by hyper-resolution. This includes all literals of the form x ≥ y (from the split clause) and all literals of the form y = x (from the first clause, using literals of the form x ≥ y). Let J be all such literals whose arguments x and y are in the range {0, 1, 2, 3, 4, 5}, say. From J it is possible to construct a dependency graph, a causal chain, and a proof of max(c, d) ≥ c V d > c. There are also many other proofs that could be constructed. We then attempt a proof of the goal literal from S ∪ {d > c}. This proof can be found without using the split clause, so we are done.
This example illustrates the use of splitting. However, it is not entirely satisfactory because the closure Cl(Sk) includes all possible literals, which means that in M_I(S), both "=" and "≥" are interpreted as identically true. Thus M_I(S) is "dense." This does not seem intuitively appealing, and in addition results in a dependency graph which does not help much in reducing the search space. We will show how to overcome these problems below. Despite the problems, this example should give an idea of how splitting works, and should help to motivate the discussion of section 5.
4. Matching and Searching Dependency Graphs
We now give a more detailed description of how the dependency graphs are actually used to guide the search for a proof, and how several dependency graphs may be matched to obtain a graph which contains the common features of all the separate graphs. It is not best to extract causal chains and then search separately for a proof using each causal chain, because there may be many subproofs in common among the various causal chains and these will be found repeatedly. It is better to work directly with the dependency graphs. The number of vertices in the dependency graph is bounded by the number of literals in J; however, there may be infinitely many causal chains, even for a finite number of literals. Even if the depth of the causal chains is limited to d, the number of such chains may still be double exponential in d (an exponential number of possible nodes, and for each node more than one possible literal). We show how to work directly with the dependency graphs, and thereby avoid this problem. Also, we show how to attach depth information to the nodes of the dependency graph to make it easy
to search for proofs at restricted depths.
4.1 Matching Dependency Graphs
Suppose dependency graphs G_1, ..., G_n are given and we want to find a graph G which contains the common features of all the G_i. More precisely, we want a graph G such that the set of causal chains of G is the intersection of the sets of causal chains of the G_i. To obtain G, we first define a procedure "match" which matches two graphs F_1 and F_2 to yield a graph F whose set of causal chains is the intersection of the causal chains of F_1 and F_2. This procedure is then applied to G_1 and G_2 to produce a new dependency graph H_1; then H_1 and G_3 are matched to produce H_2; then H_2 and G_4 are matched to produce H_3; and so on, until all graphs are matched to produce H_{n-1}, which is the desired graph G.
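The pairwise matching and its folding over G_1, ..., G_n can be sketched as a product construction on graphs, anticipating the definition of match given next. This is a rough illustration under several assumptions that are not in the text: a graph is encoded as a set of triples (premises, node, clause), an assumption node is a triple with an empty premise tuple, and premises are listed in the order of the literals of the annotating clause so that they can be paired by position. The names match and match_all are hypothetical.

```python
from functools import reduce


def match(g1, g2):
    """Return the causal relations shared by two dependency graphs.

    A node of the result is a pair (n1, n2); it is added only when both
    graphs derive n1 and n2 with the same annotating clause from premises
    whose pairs are already nodes of the result.
    """
    nodes, rels = set(), set()
    changed = True
    while changed:
        changed = False
        for prem1, n1, clause1 in g1:
            for prem2, n2, clause2 in g2:
                if clause1 != clause2 or len(prem1) != len(prem2):
                    continue
                pairs = tuple(zip(prem1, prem2))
                rel = (pairs, (n1, n2), clause1)
                if rel not in rels and all(p in nodes for p in pairs):
                    rels.add(rel)
                    nodes.add((n1, n2))
                    changed = True
    return rels


def match_all(graphs):
    """Fold match over G_1, ..., G_n, producing H_1, H_2, ... in turn."""
    return reduce(match, graphs)


g1 = {((), "a1", "A"), (("a1",), "b1", "A=>B"), (("a1",), "x1", "A=>X")}
g2 = {((), "a2", "A"), (("a2",), "b2", "A=>B")}
print(match_all([g1, g2]))
```

In this toy run the relation annotated with A=>X disappears because only the first graph contains it, so the result keeps exactly the derivations common to both graphs.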
The graph G = match(G_1, G_2) is defined as follows, for arbitrary dependency graphs G_1 and G_2: Let Nodes(G) be the nodes of G and Rel(G) be the set of causal relations of G. The nodes of G are ordered pairs <N^1, N^2> for N^i ∈ Nodes(G_i). To obtain G, first let Nodes(G) be ∅. Then add <N^1, N^2> to Nodes(G) for all N^i ∈ Nodes(G_i) such that the N^i are assumption nodes of G_i, and both N^1 and N^2 are annotated with the same clause {M}. The ordered pair <N^1, N^2> is an assumption node of G and is added to Rel(G) as a causal relation, annotated with all such clauses {M}, which will be positive unit clauses. Then successively add to Nodes(G) all nodes as follows, and add to Rel(G) all causal relations as follows, until no more can be added: If N^i and N^i_1, ..., N^i_k are in Nodes(G_i) for i = 1, 2, and the causal relations {N^i_1, ..., N^i_k} → N^i ∈ Rel(G_i) for i = 1, 2, both causal relations annotated with the same clause C, and <N^1_j, N^2_j> ∈ Nodes(G) for 1≤j≤k, then add <N^1, N^2> to Nodes(G) and add the causal relation {<N^1_1, N^2_1>, ..., <N^1_k, N^2_k>} → <N^1, N^2> to Rel(G), this causal relation annotated with all such clauses C.
4.2 Assigning Depths to the Dependency Graph
After matching the dependency graphs as above, depth information is added to each node telling at what depths the node can possibly contribute to a proof of the goal node. Each node N of the dependency graph G has a forward depth d_f(N) and a backward depth d_b(N) assigned to it. If no matching on graphs has been done, then the forward depth of N tells the smallest depth at which the literal labeling N can possibly be derived by hyper-resolution; the backward depth of N tells the smallest possible depth of a proof of the goal node such that the label of N occurs in the proof. To be precise, d_f(N) is the depth of the smallest causal chain whose root node is labeled with the label of N. Also, d_b(N) is the depth of the smallest causal chain whose root node is a goal node, and which also contains some node labeled with the literal
labeling N. Therefore, when searching for proofs of depth d or less, it is only necessary to consider the subgraph G_d of G consisting of nodes N such that d_b(N) ≤ d, together with all associated causal relations. The forward depths are assigned as follows: First assign d_f(N) = ∞ for all N which are not assumption nodes; if N is an assumption node, assign d_f(N) = 0. Then iterate the following equation on all causal relations {N_1, ..., N_k} → N until no more change occurs:
d_f(N) = min(d_f(N), 1 + max(d_f(N_1), ..., d_f(N_k)))
To assign backward depths, first assign d_b(N) = ∞ for all N except the goal node N_G, for which d_b(N_G) = d_f(N_G). Then iterate the following equation on all causal relations {N_1, ..., N_k} → N until no more change occurs:
d_b(N_i) = min(d_b(N_i), max(d_b(N), 1 + max(d_f(N_1), ..., d_f(N_k))))
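Both depth assignments are fixpoint iterations and are short to write down. The sketch below assumes that the two update rules are read as d_f(N) = min(d_f(N), 1 + max_j d_f(N_j)) and d_b(N_i) = min(d_b(N_i), max(d_b(N), 1 + max_j d_f(N_j))); this reading, the encoding of causal relations as (premises, node) pairs, and the function names are assumptions made for illustration.

```python
import math


def forward_depths(nodes, assumptions, rels):
    """Smallest depth at which each node can be derived from assumptions."""
    df = {n: (0 if n in assumptions else math.inf) for n in nodes}
    changed = True
    while changed:
        changed = False
        for premises, node in rels:
            if not premises:
                continue  # assumption nodes already have depth 0
            candidate = 1 + max(df[p] for p in premises)
            if candidate < df[node]:
                df[node] = candidate
                changed = True
    return df


def backward_depths(nodes, rels, goal, df):
    """Smallest depth of a proof of the goal in which each node occurs."""
    db = {n: math.inf for n in nodes}
    db[goal] = df[goal]
    changed = True
    while changed:
        changed = False
        for premises, node in rels:
            if not premises:
                continue
            candidate = max(db[node], 1 + max(df[p] for p in premises))
            for p in premises:
                if candidate < db[p]:
                    db[p] = candidate
                    changed = True
    return db


nodes = ["odd(3)", "even(4)", "even(4*3)", "even(3*4)", "odd(5)"]
rels = [(("odd(3)",), "even(4)"), (("even(4)",), "even(4*3)"),
        (("even(4*3)",), "even(3*4)"), (("even(4)",), "odd(5)")]
df = forward_depths(nodes, {"odd(3)"}, rels)
db = backward_depths(nodes, rels, "even(3*4)", df)
print(df)
print(db)
```

In this small graph the node odd(5) is derivable at forward depth 2 but keeps an infinite backward depth, so it would be excluded from every subgraph G_d used when searching for a proof of the goal.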
4.3 Searching for Proofs
To search for proofs at depth d or less, we consider a subgraph G_d of the dependency graph G, where G_d consists of all nodes N of G such that d_b(N) ≤ d, together with their associated causal relations. With each such node N, we associate a set clauses(N) of (clause, depth) pairs as follows: Initially clauses(N) = ∅ for all nodes N of G_d. Then add a pair for {M} to clauses(N) for assumption nodes N, where {M} is a positive unit clause annotating N. Thereafter, do the following as often as possible until no more clauses can be generated or until a proof is found: Suppose {N_1, ..., N_k} → N is a causal link of G_d annotated with the clause L_1 ∧ ... ∧ L_k ⊃ L, and suppose 2 in the usual axioms for inequalities and natural numbers, since 3>2 is itself a
consequence of the axioms. There is another approach to restricting the use of non-Horn clauses. It seems counter-intuitive to use some "false" literal such as 2≥3 as a split literal. When we do a proof by case analysis, if we assume some case D_i is true, we generally have in mind a model in which D_i is true. To force consideration of models in which D_i is false is unnatural and also increases the search space (because such models are less sparse). Frequently the goal literal is of the form ∀x_1 ... ∀x_m A(x_1, ..., x_m) for some formula A. To prove this, we convert the x_i into new Skolem constants which may appear in the ground instances used in the proof. Often all function symbols have standard interpretations except for these Skolem constants. Then, if the D_i do not contain any Skolem constants, they will be true or false in the standard interpretation. If some D_i is true, then it can be derived from the axioms and so splitting is not needed. Furthermore, not all D_i can be false since D is true in the standard interpretation. Now, if the D_i do contain Skolem constants, we are free to choose I to interpret them in any way desired. We would like to say that for each D_i such a choice can be made so that D_i will be true in M_I(S). If this is not possible, it must be that S ∪ D_i is inconsistent. Therefore it seems reasonable for each D_i to restrict I to interpret the Skolem constants so that D_i is true. (It is permissible and natural to choose a different I for each case in the case analysis.) If a proof cannot be
found, then attempt to show that S ∪ D_i is inconsistent. Thus if the non-Horn clause were e sd
V d 'x. - PI. In particular, we write L_oα x_α to denote the expression x ∈ L. This definition of the universal and existential quantifier may look rather peculiar, but it is very simple to explain. The meaning of the logical constant Π is such that Π_o(oα) B_oα is true if and only if B_oα is the "universal" set of type oα. Hence, Π[λx_α P_o] is true if and only if λx_α P_o is the universal set of type (oα), i.e. P_o is true for all x_α. We shall take as axioms of T the following formulas (p, q, and r are formulas_o):
p V p ⊃ p
p ⊃ p V q
p V q ⊃. q V p
p ⊃ q ⊃. r V p ⊃. r V q
Π_o(oα) f_oα ⊃ f_oα x_α
∀x_α [p V f_oα x_α] ⊃ p V Π_o(oα) f_oα
Here, α is a type variable, and the last two axioms represent axiom schemes. The rules of inference are substitution, modus ponens, universal generalization, and λ-conversion. We shall write ⊢_T A to denote that A has a Hilbert-style proof using these axioms and inference rules. The deduction theorem holds for T. At first glance T may look rather esoteric, but it can be described as being simply first-order logic in which we permit unrestricted comprehension via the use of λ-terms. The type structure is necessary here in order to avoid the paradoxes (like Russell's paradox) which arise from unrestricted comprehension. The use of λ-terms in substitutions can make the nature of deductions in T more complex than in first-order logic. In the fixpoint example, the result of substituting B with the term [λx. Lx ∧ x ≤ f(x)] in the
completeness axiom will change the subformulas of the form $Bz$ to $[\lambda x.\, Lx \land x \le f(x)]z$, which $\lambda$-converts to $Lz \land z \le f(z)$. For another example, if we have the formula (where $Y$ is a variable of type $o\iota$ and $D$ and $T$ are variables of type $o(o\iota)$)

$$\forall D\, [DY \supset TY]$$

and we wished to do a universal instantiation (a derived rule of inference) of this formula with the term $\lambda Z[TZ \land \forall x.\, Zx \supset Ax]$, i.e. the set of all sets of individuals which are members of $T$ and are subsets of $A$, we would then have

$$[\lambda Z.\, TZ \land \forall x.\, Zx \supset Ax]Y \supset TY.$$

We can now apply the $\lambda$-conversion inference rule to this formula to deduce

$$[TY \land \forall x.\, Yx \supset Ax] \supset TY.$$

Notice how the structure of this last formula is much more complex than that of the formula it was deduced from. This last formula contains occurrences of logical connectives and quantifiers which are not present in the original formula. Notice also that $Y$ now has the role of a predicate where this was not the case in the first formula. None of these structural changes can occur in first-order logic. The discovery of such substitution terms as the one used to instantiate $D$ is a much more complex problem than can be solved by simply applying unification. TPS, for example, cannot currently discover terms of this kind. Radical new heuristics for finding substitutions must be developed, and we hope that expansion trees will provide a vehicle for formalizing such attempts. Bledsoe in [8] and [9] has made some exciting progress in the development of just such heuristics.
3. Expansion Trees and ET-Proofs
All references to trees below will actually refer to finite, ordered, rooted trees in which the nodes and arcs may or may not be labeled, and in which labels, if present, are formulas. In particular, nodes may be labeled with simply the logical connectives $\sim$ and $\lor$. We shall picture our trees with their roots at the top and their leaves (terminal nodes) at the bottom. In this setting, we say that one node dominates another node if they are on a common branch and the first node is higher in the tree than the second. This dominance relation shall be considered reflexive. All nodes except the root node will have in-arcs while all nodes except the leaves will have out-arcs. A node labeled with $\sim$ will always have one out-arc, while a node labeled with $\lor$ will always have two out-arcs. We shall also say that an arc dominates a node if the node which terminates the arc dominates the given node. In particular, an arc dominates the node in which it terminates. Also, we say that an arc dominates another arc if their respective terminal nodes dominate each other in the same order.

3.1. Definition. Let $A$ be a formula of type $o$. An occurrence of a subformula $B$ in $A$ is a boolean subformula occurrence if it is in the scope of only $\sim$ and $\lor$, or if $A$ is $B$. A formula $A$ of type $o$ is an atom if its leftmost non-bracket symbol is a variable or a parameter. A formula $B$ is a boolean atom (b-atom, for short) if its leftmost non-bracket symbol is a variable, parameter or $\Pi$. A signed atom (b-atom) is a formula which is either an atom (b-atom) or the negation of an atom (b-atom).

3.2. Definition. Formulas of type $o$ of T can be considered as trees in which the non-terminal nodes are labeled with $\sim$ or $\lor$, and the terminal nodes are labeled with b-atoms. Given a formula $A$ of type $o$, we shall refer to this tree as the tree representation of $A$.

3.3. Example. Figure 1 is the tree representation of ${\sim}[\Pi B \lor Ax] \lor {\sim}{\sim}\Pi[\lambda x.\, Ax \lor Bx]$. This formula is equivalent to ${\sim}[\forall y\, By \lor Ax] \lor {\sim}{\sim}\forall x.\, Ax \lor Bx$.
[Figure 1]

We shall adopt the following linear representation for trees. If the root of the tree
$Q$ is labeled with $\sim$, we write $Q = {\sim}Q'$, where $Q'$ is the proper subtree dominated by $Q$'s root. Likewise, if the root of $Q$ is labeled with $\lor$, we write $Q = Q' \lor Q''$, where $Q'$ and $Q''$ are the left and right subtrees of $Q$. The expression $Q' \land Q''$ is an abbreviation for the tree ${\sim}[{\sim}Q' \lor {\sim}Q'']$.

3.4. Definition. Let $Q$, $Q'$ be two trees. Let $N$ be a node in $Q$ and let $l$ be a label. We shall denote by $Q +^l_N Q'$ the tree which results from adding to $N$ an arc, labeled $l$, which joins $N$ to the root of the tree $Q'$. This new arc on $N$ comes after the other arcs from $N$ (if there are any). In the case that the tree $Q$ is a one-node tree, $N$ must be the root of $Q$, and we write $A +^l Q'$ instead of $Q +^l_N Q'$, where $A$ is the formula which labels $N$.

3.5. Example. Figure 2 contains three trees, $Q$, $Q'$ and $Q +^c_N Q'$, where $N$ is a node of $Q$ and $c$ is some label. The nodes and arcs of $Q$ and $Q'$ may or may not have their own labels.

[Figure 2: The trees $Q$, $Q'$, and $Q +^c_N Q'$.]

3.6. Definition. Let $Q$ be a tree, and let $N$ be a node in $Q$. We say that $N$ occurs positively (negatively) if the path from the root of $Q$ to $N$ contains an even (odd) number of nodes labeled with $\sim$. We shall agree that the root of $Q$ occurs positively in $Q$. If a node $N$ in $Q$ is labeled with a formula of the form $\Pi B$, then we say that $N$ is universal (existential) if it occurs positively (negatively) in $Q$. A terminal node which is not labeled with a formula of the form $\Pi B$ is called a neutral node. A universal (existential) node which is not dominated by any universal or existential node is called a top-level universal (existential) node. A labeled arc is a top-level labeled arc if it is not dominated by any other labeled arc.

3.7. Definition. Let $Q$ be a tree with a terminal node $N$ labeled with the formula $\Pi B$, for some formula $B$ of type $o\alpha$. If $N$ is existential, then an expansion of $Q$ at $N$ with respect to the list of formulas of type $\alpha$, $\langle t_1, \ldots, t_n \rangle$, is the tree $Q +^{t_1}_N Q_1 +^{t_2}_N \cdots +^{t_n}_N Q_n$ (associating to the left), where, for $1 \le i \le n$, $Q_i$ is the tree representation of some $\lambda$-normal form of $Bt_i$. The formulas $t_1, \ldots, t_n$ are called expansion terms of the resulting tree. We say that each of these terms is used to expand $N$. If $N$ is universal, then a selection of $Q$ at $N$ with respect to the variable $y$ of type $\alpha$ is the tree $Q +^y_N Q'$, where $Q'$ is the tree representation of some $\lambda$-normal form of $By$, and $y$ does not label an out-arc of any universal node in $Q$. We say that the node $N$ is selected by $y$.
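The polarity rule of Definition 3.6 can be illustrated with a small sketch. This is not part of the paper: the tuple encoding of trees and the convention that a leaf string beginning with "Pi " stands for a b-atom of the form $\Pi B$ are assumptions made only for this example.

```python
# Trees are nested tuples: ("not", t), ("or", left, right), or a leaf string.
# Assumption for this sketch: a leaf starting with "Pi " encodes a formula Pi B.

def classify_leaves(tree, negations=0, out=None):
    """Return (leaf, kind) pairs in left-to-right order.

    kind is 'universal' or 'existential' for Pi-leaves, depending on whether
    the path from the root passes through an even or odd number of "not"
    nodes, and 'neutral' for every other leaf.
    """
    if out is None:
        out = []
    if isinstance(tree, str):
        if tree.startswith("Pi "):
            kind = "universal" if negations % 2 == 0 else "existential"
        else:
            kind = "neutral"
        out.append((tree, kind))
    elif tree[0] == "not":
        classify_leaves(tree[1], negations + 1, out)
    elif tree[0] == "or":
        classify_leaves(tree[1], negations, out)
        classify_leaves(tree[2], negations, out)
    else:
        raise ValueError(f"unknown node label: {tree[0]!r}")
    return out

# The formula of Example 3.3: ~[Pi B v A x] v ~~Pi[lam x. A x v B x]
example = (
    "or",
    ("not", ("or", "Pi B", "A x")),
    ("not", ("not", "Pi [lam x. A x v B x]")),
)
leaves = classify_leaves(example)
```

Under this encoding the leaf for $\Pi B$ sits under one negation and is classified as existential, while the second $\Pi$ leaf sits under two negations and is classified as universal.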
The set of all expansion trees is the smallest set of trees which contains the tree representations of all $\lambda$-normal formulas of type $o$, and which is closed under expansions and selections.

Expansion trees are, in a sense, generalized formulas. The main difference is that expansion trees can contain labeled arcs. An expansion tree which contains no labeled arcs can easily be interpreted as a formula.

3.8. Definition. Assume that $Q$ is an expansion tree. Let $S_Q$ be the set of all variable occurrences which label the out-arcs from (non-terminal) universal nodes in $Q$, and let $\Theta_Q$ be the set of all occurrences of expansion terms in $Q$.

Expansion trees, a generalization of Herbrand instances, do not use Skolem functions as is customary in Herbrand instances. Skolem functions can be used in this setting, but their occurrences in substitution terms must be restricted in ways that are not apparent from the first-order use of Skolem functions. The reader is referred to [16] for details. In order to do without Skolem functions, we need to place a restriction on selected variables which models the way in which Skolem terms would imbed themselves in other Skolem terms. This restriction amounts to requiring that the following binary relation on $S_Q$ be acyclic.

3.9. Definition. Let $Q$ be an expansion tree and let
- (x)(P(x) -> Z(x))
A
To see how this 'solves' the sapphire problem, let P(x) say x is a red sapphire. We decide to circumscribe on P since red sapphires are, as far as we can judge, quite unusual and unlikely to be present without being recognized and well-known. Once mentioned, the gem becomes 'the' red sapphire s of the story until further notice. So, the property of being a red sapphire becomes the only contextual information needed: A[P] is P(s). As long as it remains our judgement that red-sapphired-ness is appropriate to circumscribe, we will conclude that this red sapphire is also the one and only red sapphire, namely, the lost one. Thus we will be able to prove that r = s.
In detail, circumscription of P by P(s) (as the only information A[P] that initially pertains) can be applied by taking the predicate Z(x) to be x = s. Then A[Z] will be Z(s), i.e., s = s, resulting from replacing P by Z in A[P]. It follows by the above circumscription schema that P(x) -> Z(x), i.e., that the only red sapphire is s. This is seen as follows: first, Z(s) is obvious; and Z(x) -> P(x) follows from P(s). So the schema yields P(x) -> Z(x).
If we retain this conclusion on hearing about the sapphire r, then of course we must conclude that r = s.

Of course, we have made two significant judgements here, neither of which is automatic: that red sapphires are things to circumscribe on, and that new data of the sort presented (the existence of g) does not alter the first judgement. We are not tackling this issue here, but simply the one of how to formally represent such reasoning.
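The conclusion P(x) -> x = s can also be checked semantically on a toy universe. The sketch below is not from the paper: it assumes a finite domain {s, a, b}, where a and b are hypothetical other objects, and it reads circumscription of P as selecting the subset-minimal extensions of P that satisfy A[P] = P(s).

```python
from itertools import combinations

def subsets(domain):
    """Yield every subset of the domain as a frozenset."""
    items = sorted(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def minimal_extensions(domain, axiom):
    """Extensions of P satisfying the axiom with no proper sub-extension that does."""
    models = [p for p in subsets(domain) if axiom(p)]
    return [p for p in models if not any(q < p for q in models)]

domain = {"s", "a", "b"}          # assumed toy universe
axiom = lambda p: "s" in p        # A[P] is P(s)
minimal = minimal_extensions(domain, axiom)
```

In this toy setting the only minimal extension is {s}, so any object r known to satisfy P must be s, which mirrors the derivation of r = s above.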
2. Circumscription with Protected Terms
Here we discuss a simple syntactic device from Minker & Perlis [1984]. There we suggested that once A has been selected as appropriate for circumscribing P, and if (perhaps later) it is desired to protect S-things from this process so that circumscription will not be used to show S-things are not P-things, we can keep the same criteria A, but alter the form of the schema itself.
Starting with P(x) & -S(x), which we write P/S(x) (and more generally T/U(x) for T(x) & -U(x)), we alter the circumscription schema to read as follows:

C^{P/S}_A[Z]: [A[Z] & (x)(Z/S(x) -> P(x))] -> (x)(P/S(x) -> Z(x))

for all formulas Z. Intuitively, we are saying that conclusions are drawn only about non-S-things, as far as ruling out possible P-things goes. We refer to this schema as 'protected circumscription'; unless so indicated, circumscription will refer to McCarthy's schema. We write C[Z] when context makes clear what the A, P, and S (if protected) are.
It may appear that by circumscribing on the formula P(x) & -S(x) the same effect is achieved. Indeed intuitively this should be the case. However, circumscription, as defined by McCarthy, applies only for single predicate letters. It is not obvious how to extend it to general formulas. John McCarthy has communicated to us that he is currently pursuing this extension.
To return to our sapphire example, suppose in addition to the red sapphire that is lost, another precious stone has been brought from India by another Denver resident, but its precise gemology has not been revealed. In fact, we may suppose for the sake of story-line, that the two gem buyers are in fact obtaining gifts for their (one and the same) admiree, a third Denver resident whose birthday anniversary is to be celebrated soon. The reader may already feel a tingling sense of worry that the two gems may be identical in type and bound to produce embarrassment.
How then can we represent the reasoning that there are one and possibly two red sapphires, but no more, and that s is one, and the other stone, say g, may or may not be, in such a way that we still can conclude later that r = s (supposing g not to be lost)? Our schema will do this if we again let P(x) say x is a red sapphire, P(s) being the only information that is needed to circumscribe that very property (i.e., the axiom A[P] is simply P(s) itself) except that now we also state S(g) to protect g from being squeezed out of possible red-sapphired-ness. Again we let Z(x) be x = s, and further simply take S(g) as an axiom. S(x) will have no special meaning other than that x is 'selected' for protection from circumscription.
Then much as before we can conclude P(x) -> Z(x) v S(x), i.e., any red sapphire either is the first one (s) or is the new untyped stone (g). Then on learning of the red sapphire ring r, it follows that either r = s or r = g. If further it is known that g is not lost, indeed is in the firm possession of its owner, then we know r ≠ g, hence r = s.

Notice the apparent non-monotonicity present in such a line of reasoning. Before we have heard of the second stone g, we conclude r = s; later with further information but (apparently) no loss of what was previously known, we no longer can make such a strong conclusion but instead have only (r = s) v (r = g). In fact, of course, information has been retracted, namely our original unprotected treatment of red sapphires: now A[P] is {P(s), S(g)} whereas before it was just {P(s)}, so the previous schema has been replaced by a new one that in fact is not logically stronger.
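The effect of protecting g can be illustrated on a toy universe in the same semantic style. This sketch is not from the paper: the domain {s, g, a} (with a a hypothetical third object) is assumed, S is taken to hold of g only, and the ordering used here, which compares extensions of P only on their non-S part, is an assumption standing in for the protected notion of minimality.

```python
from itertools import combinations

def subsets(domain):
    """Yield every subset of the domain as a frozenset."""
    items = sorted(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def protected_minimal(domain, axiom, protected):
    """Extensions of P whose unprotected part (P minus S) is subset-minimal."""
    models = [p for p in subsets(domain) if axiom(p)]
    return [
        p for p in models
        if not any((q - protected) < (p - protected) for q in models)
    ]

domain = {"s", "g", "a"}              # assumed toy universe
protected = frozenset({"g"})          # S(g): g is shielded from circumscription
minimal = protected_minimal(domain, lambda p: "s" in p, protected)

# Every object that can be a red sapphire in some minimal model:
candidates = frozenset().union(*minimal)
# Learning that g is not lost rules g out as the lost stone r:
candidates_for_r = candidates - {"g"}
```

Here the minimal extensions are {s} and {s, g}, so a red sapphire must be s or g, and once g is excluded only s remains, matching the argument that r = s.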
3. Using Model-Theory
In McCarthy [1980] the concept of minimal model was discussed in the context of circumscription. In Minker & Perlis [1984] we re-defined minimal model in a manner appropriate to the new version of circumscription as follows: Let M and N be models of A[P]. We say M
∧I: from A and B, infer A∧B
∧El, ∧Er: from A∧B, infer A; from A∧B, infer B
∨Il, ∨Ir: from A, infer A∨B; from B, infer A∨B
∨E: from A∨B, a derivation of C from (A), and a derivation of C from (B), infer C
⊃I: from a derivation of B from (A), infer A⊃B
⊃E: from A⊃B and A, infer B
¬I: from a derivation of ff from (A), infer ¬A
¬E: from a derivation of ff from (¬A), infer A

where ff abbreviates any formula D∧¬D.

A proof of a proposition C from propositions Δ is a tree whose leaves are the members of Δ and whose root is C. Each internal connection between parent and children nodes is justified by one of the inference rules. A rule with a parenthesized formula (such as ⊃I) causes the removal (discharge) of that parenthesized leaf node when applied to the tree.
As an example,

[Proof tree with leaves (P∧Q)⊃R and P, and root P∧(Q⊃R)]

is a proof that (P∧Q)⊃R and P infer P∧(Q⊃R). Write

(P∧Q)⊃R, P ⊢ P∧(Q⊃R)

to abbreviate the tree; this expression is called a theorem. Note that the order in which the nodes of the tree were added does not affect the final result. The following two results hold for all natural deduction systems:

i) if Γ ⊢ B and B, Δ ⊢ C, then Γ, Δ ⊢ C.
ii) if Γ ⊢ C, then Γ, A ⊢ C.

Call i) the cut principle and ii) the thin principle. Since the proofs of both are constructive, there exist associated functions cut and thin which build the deduction tree of the consequent theorem from the deduction tree(s) of the antecedent theorem(s).
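A minimal sketch of how cut and thin might act on deduction trees follows. It is illustrative only: the Theorem record, the tuple encoding of trees, and the helper graft are assumptions of this example rather than the paper's definitions, and the grafting step ignores the complication of discharged leaves that happen to equal the cut formula.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Theorem:
    assumptions: frozenset   # the undischarged leaves of the tree
    conclusion: str          # the root formula
    tree: tuple              # (formula, subtrees); a leaf has no subtrees

def graft(tree, leaf, replacement):
    """Replace every leaf labeled `leaf` in `tree` by the tree `replacement`."""
    formula, subtrees = tree
    if not subtrees:
        return replacement if formula == leaf else tree
    return (formula, tuple(graft(t, leaf, replacement) for t in subtrees))

def cut(left, right):
    """From Gamma |- B and B, Delta |- C build Gamma, Delta |- C."""
    if left.conclusion not in right.assumptions:
        raise ValueError("cut formula is not an assumption of the second theorem")
    assumptions = left.assumptions | (right.assumptions - {left.conclusion})
    return Theorem(assumptions, right.conclusion,
                   graft(right.tree, left.conclusion, left.tree))

def thin(theorem, extra):
    """From Gamma |- C build Gamma, A |- C; the tree itself is unchanged."""
    return Theorem(theorem.assumptions | {extra}, theorem.conclusion, theorem.tree)

# P, Q |- P&Q (by and-introduction) and P&Q |- P (by and-elimination)
left = Theorem(frozenset({"P", "Q"}), "P&Q", ("P&Q", (("P", ()), ("Q", ()))))
right = Theorem(frozenset({"P&Q"}), "P", ("P", (("P&Q", ()),)))
stitched = cut(left, right)
weakened = thin(stitched, "R")
```

Both functions only stitch or relabel existing trees; neither applies an inference rule, which is why they can be treated as assembly operations.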
These functions will be useful for tree assembly.

2. LCF (Logic for Computable Functions) (Gor) is a software tool for developing natural deduction style proofs. A notable feature of the system is its functional depiction of deduction. Formulas are assigned the data type form and are built using formulas and logical connectives. Those formulas which are taken as axioms are given type thm (theorem). Inference rule schemes are functions which produce results of type thm from their arguments. For example,

P has type form
P∧Q has type form
axiom has type form → thm
axiom(P∧Q) has type thm
⊃E has type thm × thm → thm
⊃E(axiom(P⊃(P∧Q)), axiom(P)) has type thm.
Expressions of type thm are written in their sequent form, e.g., P⊃(P∧Q), P ⊢ P∧Q. The two expressions of type thm seen above are short examples of (forwards) LCF proofs, which are constructed by nested applications of the inference rule functions. This provides an element of security to the system, for only through the use of the axioms and the inference rules can new theorems be created.

The LCF system rises above its role as a mere proof checker due to its ability to perform goal directed (backwards) proofs, building a deduction tree from its root -- its goal -- to its leaves, i.e., assumptions. The basic approach is to take a goal, Δ ⊢? C, where Δ is a set of assumptions and C is the desired conclusion, and decompose C into a list of subgoals Δ ⊢? C1, ..., Δ ⊢? Cn, such that if proofs exist for C1, ..., Cn then a proof of Δ ⊢? C is also constructable. A function which decomposes a goal is called a tactic.

The functional formalization of goal directed proof is defined in LCF as

goal: form list × form
-- the assumption formula set Δ and the desired conclusion C;

tactic: goal → (goal list × validation)
-- the decomposition step of goal Δ ⊢? C into its subgoals plus a thm producing function;

validation: thm list → thm
-- the thm producing function which produces Δ ⊢ C from Δ ⊢ C1, ..., Δ ⊢ Cn, thus justifying the decomposition.
Using angle brackets to enclose lists and parentheses to bind pairs, here are the definitions of some LCF-style tactics:

IMPTAC: Δ ⊢? A⊃B ↦ (<Δ,A ⊢? B>, ⊃I)
ANDTAC: Δ ⊢? A∧B ↦ (<Δ ⊢? A; Δ ⊢? B>, ∧I)
TRIV: Δ,A ⊢? A ↦ (<>, triv)
IDTAC: Δ ⊢? A ↦ (<Δ ⊢? A>, λt.t)
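The goal, tactic and validation types and the four tactics above can be sketched in Python as follows. This is an illustration under stated assumptions, not LCF itself: LCF is programmed in ML, and the class Thm, the rule functions and_intro, imp_intro and triv_rule, and the exception TacticFailure are names invented for this example.

```python
# Formulas: atoms are strings; ("imp", A, B) and ("and", A, B) are compound.
# A goal is a pair (assumptions, formula), with assumptions a frozenset.

class Thm:
    """A proved sequent. By convention only the rule functions below build one."""
    def __init__(self, assumptions, conclusion):
        self.assumptions = frozenset(assumptions)
        self.conclusion = conclusion

class TacticFailure(Exception):
    """Raised when a tactic does not apply to the given goal."""

# Inference rules (forward direction).
def triv_rule(assumptions, a):
    if a not in assumptions:
        raise ValueError("conclusion is not among the assumptions")
    return Thm(assumptions, a)

def and_intro(t1, t2):
    return Thm(t1.assumptions | t2.assumptions,
               ("and", t1.conclusion, t2.conclusion))

def imp_intro(a, t):
    # Discharges the assumption a.
    return Thm(t.assumptions - {a}, ("imp", a, t.conclusion))

# Tactics (backward direction): goal -> (list of subgoals, validation).
def IMPTAC(goal):
    delta, c = goal
    if not (isinstance(c, tuple) and c[0] == "imp"):
        raise TacticFailure("goal is not an implication")
    _, a, b = c
    return [(delta | {a}, b)], lambda thms: imp_intro(a, thms[0])

def ANDTAC(goal):
    delta, c = goal
    if not (isinstance(c, tuple) and c[0] == "and"):
        raise TacticFailure("goal is not a conjunction")
    _, a, b = c
    return [(delta, a), (delta, b)], lambda thms: and_intro(thms[0], thms[1])

def TRIV(goal):
    delta, c = goal
    if c not in delta:
        raise TacticFailure("goal is not among the assumptions")
    return [], lambda thms: triv_rule(delta, c)

def IDTAC(goal):
    return [goal], lambda thms: thms[0]

# Backward proof of  |- P imp (Q imp (P and Q)).
target = ("imp", "P", ("imp", "Q", ("and", "P", "Q")))
(g1,), v1 = IMPTAC((frozenset(), target))
(g2,), v2 = IMPTAC(g1)
(g3, g4), v3 = ANDTAC(g2)
_, v4 = TRIV(g3)
_, v5 = TRIV(g4)
# Run the validations from the leaves back to the root.
proved = v1([v2([v3([v4([]), v5([])])])])
```

The validations are applied in the reverse order of the decomposition, so the final theorem is assembled only from rule applications, which is the security property the text attributes to the thm type.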
For example, ANDTAC accepts as an argument a goal ture
AAB, the result is the list of subgoals
sition is justified by the rule
AI.
Note that if
the tactic fails (generates an exception). empty list of theorems to the axiom tactics.
(::>~)
A:oB,A,lI 1-' C 1-+ .
can be defined similarly, and, in general, such
"match and thin" tactics using discovery algorithms.
~ ?
< P,Q,PAQ,R 1-" R>,
all of which occur within REMOVE_IMPLIES.
on the next iteration of SIMPLIFY_LHS.
The derivation tree built by validating the empty subgoal list is exactly that shown in section 1, as the cut and thin functions are tree stitching operations and not inference rules. The reader is encouraged to build the validation based upon the trace and generate the proof tree.

5. Conclusion

A notation for expressing proof discovery algorithms for natural deduction systems has been defined. It allows succinct specification of strategies widely used in all theorem provers and encourages the discovery of complementary ones as well. As the language is independent of the inference rule system used, it supports formulation of problem area-independent strategies and facilitates comparisons of control algorithms of different theorem proving systems. This separation of theorem discovery strategies from the problem areas to be studied exposes the importance of properly formulating the underlying inference rule system of the theorem prover. Many technical difficulties disappear when a coherent, complementary set of rules is used as a foundation.

This paper's length prevented a closer examination of the control constructs for the language. The LCF tacticals served well for the simple examples, but an obvious need exists for more sophisticated combinators such as breadth-first search ("EITHER t1 OR t2") and conditional iteration ("REPEAT t UNLESS e"). This area merits study.

Acknowledgements: Brian Monahan produced many concrete examples of tactic generation in his work with the LCF system. He made many useful suggestions and carefully read an earlier version of this paper. Discussions with Colin Stirling and Robin Milner have also been helpful. Dorothy McKie's typing is gratefully acknowledged.
References

(And) Andrews, P.B. Transforming matings into natural deduction proofs. 5th Conference on Automated Deduction, Les Arcs, France, 1980, LNCS 87, pp. 281-292.
(Ble) Bledsoe, W.W., and Tyson, M. The UT interactive theorem prover. Memo ATP-17, Mathematics Dept., University of Texas, Austin, 1975.
(Boy) Boyer, R.S., and Moore, J.S. A Computational Logic. Academic Press, New York, 1979.
(Cha) Chang, C., and Lee, R.C. Symbolic Logic and Mechanical Theorem Proving. Academic Press, New York, 1973.
(Coh) Cohen, P.R., and Feigenbaum, E.A., eds. The Handbook of Artificial Intelligence, Vol. 3. Pittman, New York, Ch. 12.
(Con) Cohn, A. The equivalence of two semantic definitions: a case study in LCF. Report CSR-76-81, Computer Science Dept., University of Edinburgh, Scotland, 1981.
(CoM) Cohn, A., and Milner, R. On using Edinburgh LCF to prove the correctness of a parsing algorithm. Report CSR-113-82, Computer Science Dept., University of Edinburgh, Scotland, 1982.
(Cos) Constable, R.L. Proofs as programs: a synopsis. Information Proc. Letters 16-3 (1983) 105-112.
(Gor) Gordon, M., Milner, R., and Wadsworth, C. Edinburgh LCF. LNCS 78, Springer-Verlag, Berlin, 1979.
(Gut) Guttag, J. Notes on type abstraction. IEEE Trans. on Software Engg. SE-6-1 (1980) 13-23.
(Hoa) Hoare, C.A.R. An axiomatic basis for computer programming. Comm. ACM 12 (1969) 576-580, 583.
(Lem) Lemmon, E.J. Beginning Logic. Nelson, London, 1965.
(Les) Leszczylowski, J. An experiment with Edinburgh LCF. 5th Conference on Automated Deduction, Les Arcs, France, 1980, LNCS 87, pp. 170-181.
(Mon) Monahan, B. Ph.D. thesis, University of Edinburgh, forthcoming.
(Nor) Nordstrom, B. Programming in constructive set theory: some examples. ACM Conf. on Functional Programming Languages and Computer Architecture, Portsmouth, N.H., 1981, pp. 141-153.
(Plo) Plotkin, G. A structural approach to operational semantics. Report DAIMI FN-19, Computer Science Dept., University of Aarhus, Denmark, 1981.
(Pra) Prawitz, D. Natural Deduction. Almquist and Wiksell, Stockholm, 1965.
(Rob) Robinson, J.A. Logic: Form and Function. Edinburgh Univ. Press, Edinburgh, 1979.
(Sup) Suppes, P. Introduction to Logic. Van Nostrand, Princeton, 1957.
Appendix

Inference rule schemes for quantifiers have additional restrictions attached to control the use of free variables. Tactics generated from those rules must also follow these restrictions. The rules given here for the universal and existential quantifiers have restrictions which make their corresponding tactics into the skolemization procedures of Robinson (Rob); the rules are a slightly modified version of those in Suppes (Sup). The unification operation is treated as a function like cut or thin.

Given the usual syntax for first order logic, let $x, y, z, \ldots$ represent variables which may be quantified. A Skolem variable is a free, barred variable, e.g., $\bar{x}, \bar{y}, \bar{z}, \ldots$, and a Skolem constant is a free variable, possibly subscripted by a list of Skolem variables (a "Skolem function"), e.g., $x(\bar{y})$. Let $A^x_t$ denote the usual syntactic substitution of term $t$ for all free occurrences of variable $x$ in formula $A$. The rules are
∀E: from $\forall x A$, infer $A^x_t$.

∀I: from $A^x_{y(\bar{z}_1 \ldots \bar{z}_n)}$, infer $\forall x A$, where none of $y, \bar{z}_1, \ldots, \bar{z}_n$ are free in any assumption upon which $A^x_{y(\bar{z}_1 \ldots \bar{z}_n)}$ depends, and all Skolem variables in $A^x_{y(\bar{z}_1 \ldots \bar{z}_n)}$ belong to $\{\bar{z}_1, \ldots, \bar{z}_n\}$.

The tactics (∀E⊢) and (⊢∀I) correspond to the usual skolemization routines for universal quantifiers. ∀I's inverted restriction forces $y$ to be newly generated; $\bar{z}_1, \ldots, \bar{z}_n$ are exactly the Skolem variables in $A$.

∃I: from $A^x_t$, infer $\exists x A$.

∃E: from $\exists x A$, infer $A^x_{y(\bar{z}_1 \ldots \bar{z}_n)}$, where $y$ is not free in any assumption upon which $\exists x A$ depends, and all Skolem variables in $A$ belong to $\{\bar{z}_1, \ldots, \bar{z}_n\}$.

The tactics (⊢∃I) (take term $t$ to be $\bar{y}$) and (∃E⊢) correspond to the usual skolemization routines for existential quantifiers.
Unification is substitution: UNIFY BACKWARDS (x, t
)
?
= b.
C
f-'
~
?
UNIFY_FORWARDS (x, t
b. 1-' C
)
j-;>
?
by
(1-3I)
F(y,xG»
~
by
(1-1Il) •
ment => =>
y
Note the necessity of the
in Skolem constant
by
(VEt-)
by
(3Ef-) .
x(y).
Attempts at unifying the structurally similar formulas can not succeed.
argu-
The Mechanization of Existence Proofs of Recursive Predicates

Ketan Mulmuley
Computer Science Department
Carnegie-Mellon University
Schenley Park
Pittsburgh, PA 15213, USA

1. Abstract
Proving the congruence of two semantics of a language is a well known problem. Milne [3] and Reynolds [5] gave techniques for proving such congruences. Both techniques hinge on proving the existence of certain recursively defined predicates. Milne's technique is more general than Reynolds', but the proofs based on that technique are known to be very complicated. In the last eight years many authors have expressed the need for a more systematic method and a mechanical aid to assist the proofs. In this paper we give a systematic method based on domain theory. The method works by building up appropriate cpos and continuous functions on them. Existence of a predicate then follows by using the Fixed Point Theorem. A mechanized tool has been developed on top of LCF to assist proofs based on this method. The paper refutes the fear expressed by many people that fixed-point theory could not be used to show existence of such predicates.
2. Introduction
The Scott-Strachey approach to giving semantics to a language is well known. In this approach each programming construct in a language is given a denotation or meaning in an appropriately constructed domain which is some kind of cpo (complete partial order). Of course there can be more than one way of constructing such domains and denotations. The question which arises naturally is: how does one know that these ways are in some sense equivalent? Say the language is $L$ and we have two semantics $L_1$ and $L_2$ which map $L$ into the domains $D_1$ and $D_2$ respectively. One might start by constructing some predicate, say $P \subseteq D_1 \times D_2$, which relates the equivalent values from two domains, i.e. $(d_1, d_2) \in P$ iff $d_1$ and $d_2$ are in some sense equivalent. The above question then reduces to asking: does $(L_1(e), L_2(e)) \in P$ for all $e \in L$? Sometimes one might succeed in constructing only a weaker predicate $P$ such that $(d_1, d_2) \in P$ iff $d_1$ is weaker than $d_2$ in some sense. One then proves the equivalence of two semantics by proving the following two assertions independently:

(1) $L_1(e)$ is weaker than $L_2(e)$ for all $e \in L$.
(2) $L_2(e)$ is weaker than $L_1(e)$ for all $e \in L$.

This research was supported in part by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 3597, monitored by the Air Force Avionics Laboratory under Contract F33615-81-K-1539, and in part by the U.S. Army Communications R&D Command under Contract DAAK80-81-K-0074. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the U.S. Government.

The key problem then is the construction of an appropriate predicate and proving its existence, the issue we shall be addressing in this paper. To make the ideas concrete we shall consider one example from [8]. Consider the simple expression language, essentially the lambda calculus with atoms,

$$E ::= I \mid B \mid (\lambda I.E_1)E_2$$
Here $I$ belongs to a syntactic domain Ide of identifiers and $B$ belongs to a domain $\mathbf{B}$ of basic values. Dynamic scoping is intended. Let $D$, the domain of values, and $C$, the domain of environments, be the least solutions satisfying $D = C \to \mathbf{B}$ and $C = \mathrm{Ide} \to D$. Let $\mathbf{E}$ be the syntactic domain of expressions. Let $F$ denote the domain of contexts, $\mathrm{Ide} \to \mathbf{E}$. Then a denotational semantics, $\mathcal{E} : \mathbf{E} \to D$ (or $\mathcal{E} : \mathbf{E} \to C \to \mathbf{B}$), and an operational semantics, $\mathcal{O} : \mathbf{E} \to F \to \mathbf{B}$, can be given easily:

$$\mathcal{E}[I]c = (cI)c \qquad\qquad \mathcal{O}[I]f = \mathcal{O}(fI)f$$
$$\mathcal{E}[B]c = B \qquad\qquad \mathcal{O}[B]f = B$$
$$\mathcal{E}[(\lambda I.E_1)E_2]c = \mathcal{E}[E_1](c[\mathcal{E}[E_2]/I]) \qquad\qquad \mathcal{O}[(\lambda I.E_1)E_2]f = \mathcal{O}[E_1](f[E_2/I])$$

The congruence problem is to show that $\mathcal{E}$ and $\mathcal{O}$ compute the same thing. More precisely let $\mathit{convert} : F \to C$ (i.e. $F \to \mathrm{Ide} \to D$) $= \lambda f.\lambda I.(\mathcal{E}(fI))$. Then we have to show that for all $E : \mathbf{E}$ and $f : F$, $\mathcal{E}[E](\mathit{convert}\, f) = \mathcal{O}[E]f$. It is easy to prove that $\mathcal{O}[E]f \sqsubseteq \mathcal{E}[E](\mathit{convert}\, f)$. The difficult part is to prove the other inequality. For that, one shows that the following recursively defined predicates, $\theta \subseteq D \times \mathbf{E}$ and $\eta \subseteq C \times F$, exist, where

$$\theta = \{(d, e) \mid \forall (c, f) \in \eta.\ dc \sqsubseteq \mathcal{O}ef\}$$
$$\eta = \{(c, f) \mid \forall I \in \mathrm{Ide}.\ (cI, fI) \in \theta\} \qquad (2.1)$$

Then it can be shown that for all $E : \mathbf{E}$ and $f : F$, $(\mathcal{E}[E], E) \in \theta$ and $(\mathit{convert}\, f, f) \in \eta$. Hence from (2.1) we get $\mathcal{E}[E](\mathit{convert}\, f) \sqsubseteq \mathcal{O}[E]f$. The above discussion shows the importance of recursively defined predicates in a congruence proof. Henceforth we shall confine our attention solely to the existence of such predicates. The existence of the above predicates, $\theta$ and $\eta$, is shown in [8] using Milne's technique [3]. Later in this paper we will show how the proof based on our method
can be mechanized. Both methods show that $\theta$ and $\eta$ exist for any choice of $\mathcal{O}$. But if we replace $\sqsubseteq$ by $=$ or $\sqsupseteq$ in (2.1), the interesting question is: do the predicates still exist? This question was raised in [8] but was left open. We can answer that question partially now. For that $\mathcal{O}$ given above we still do not know the answer. However we shall show using diagonalization that there exists an $\mathcal{O}$ such that the predicates do not exist. Consider the case of $\sqsupseteq$ first. Let Ide be a trivial one-point domain. Then $C$ can be identified with $D$ and $F$ can be identified with $\mathbf{E}$, so we have $D = D \to \mathbf{B}$, $F = \mathbf{E}$ and $\theta = \eta$. And the question reduces to asking whether $\theta'$ exists where

$$\theta' = \{(d : D, e : \mathbf{E}) \mid \forall (d', e') \in \theta'.\ dd' \sqsupseteq \mathcal{O}ee'\}$$

Let $I : D = (\lambda x.xx)$. We are omitting the obvious injections and projections. It can be shown that $II = \bot$ (as in the well known Park's theorem). Let $\mathcal{O} = \lambda e_1 \lambda e_2.b$ where the constant $b : \mathbf{B} \sqsupset \bot$. We are assuming that $\mathbf{B}$ contains at least two elements. As $\mathcal{O}$ is constant, existence of $\theta'$ is equivalent to existence of some $o$ where

$$o = \{d : D \mid \forall d' \in o.\ dd' \sqsupseteq b\} \qquad (2.2)$$

Assume for a contradiction that such an $o$ does exist. Consider two cases.

1) $I \in o$: in this case by (2.2), $II \sqsupseteq b$ and hence $\bot \sqsupseteq b$. This is a contradiction.

2) $I \notin o$: Consider any $d \in o$. By (2.2) we get $dd \sqsupseteq b$. Hence $Id = (\lambda x.xx)d = dd \sqsupseteq b$. Thus for all $d \in o$, $Id \sqsupseteq b$. Hence by (2.2) $I \in o$. This is again a contradiction.

We conclude that such an $o$ does not exist. If we replace $\sqsubseteq$ in (2.1) by $=$ exactly the same reasoning goes through.
The implications of this counterexample are many. It shows that the existence of recursively defined predicates is indeed a nontrivial problem. Secondly it weakens the hope that there exists a rich enough syntactic language such that any predicate expressed in that language exists. Reynolds [5] and Milne [3] gave general techniques to prove the existence of such predicates. Note that Reynolds' method can not be used to show the existence of the above predicates, $\theta$ and $\eta$. There has been some confusion about this in the literature. For example, see lemma 2.2.1 in Peter Mosses' thesis [4]. There he says that the existence of a certain recursively defined predicate follows by Reynolds' method easily. But it is possible to give a somewhat similar looking predicate and show that it does not exist. Due to lack of space we shall not discuss that example here. But the idea is the same as in the above counterexample, namely self-application. We hope that one example makes our point clear. (Incidentally I can not show that the particular predicate in [4] does not exist. So that remains an open question again.) Milne's method can be used to show existence of $\theta$ and $\eta$ (see [8]). The proof is undeniably complicated. Because the existence proofs of recursively defined predicates are so complicated, many authors have expressed the need for some mechanical aid in carrying out such proofs [2,8]. This problem has been open for many years. In this paper we shall show that such a mechanization is indeed possible. A mechanical aid has been actually implemented on top of LCF to assist in these existence proofs.
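As a concrete companion to the expression language and the two semantics defined earlier in this section, the following sketch runs both on a small program. It is not part of the paper: the tuple encoding of expressions and the function names denote, operate and convert are assumptions made for this illustration, undefined identifiers and non-termination are not modelled, and agreement on one example is of course no substitute for the congruence proof.

```python
# Expressions: ("id", name), ("const", value),
# and ("let", name, body, arg) standing for (lambda name. body) arg.

def denote(expr, env):
    """Denotational semantics: env maps identifiers to functions env -> value."""
    kind = expr[0]
    if kind == "id":
        # Dynamic scoping: the stored denotation is run in the current env.
        return env[expr[1]](env)
    if kind == "const":
        return expr[1]
    if kind == "let":
        _, name, body, arg = expr
        new_env = dict(env)
        new_env[name] = lambda c, arg=arg: denote(arg, c)
        return denote(body, new_env)
    raise ValueError(f"unknown expression: {expr!r}")

def operate(expr, ctx):
    """Operational semantics: ctx maps identifiers to unevaluated expressions."""
    kind = expr[0]
    if kind == "id":
        return operate(ctx[expr[1]], ctx)
    if kind == "const":
        return expr[1]
    if kind == "let":
        _, name, body, arg = expr
        new_ctx = dict(ctx)
        new_ctx[name] = arg
        return operate(body, new_ctx)
    raise ValueError(f"unknown expression: {expr!r}")

def convert(ctx):
    """Turn a context into an environment by denoting each bound expression."""
    return {name: (lambda c, e=e: denote(e, c)) for name, e in ctx.items()}

# (lambda x. (lambda y. x) 2) y, run in a context where y is bound to 1.
program = ("let", "x", ("let", "y", ("id", "x"), ("const", 2)), ("id", "y"))
context = {"y": ("const", 1)}
denotational_value = denote(program, convert(context))
operational_value = operate(program, context)
```

With dynamic scoping the inner binding of y is the one seen when x is finally looked up, so both semantics give 2 here rather than the 1 that static scoping would produce.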
3. Basic Notations
It will be assumed that the reader is familiar with domain theory. All the results in this section can be found in [6]. Domains in this paper are assumed to be consistently complete algebraic cpos. We shall denote them by $C$, $D$ etc. $C \to D$ will denote the usual domain of continuous functions from $C$ to $D$. The universal domain will be denoted by $U$.
3.1. Definition.

(1) $r : D \to D$ is called a retract if $r \circ r = r$, where $\circ$ denotes composition. We shall denote the fixed point set of $r$ by $|r|$, and by $r$ if no ambiguity arises. Note that $|r|$ is a cpo. Its bottom element is $r(\bot)$; we shall denote it by $\bot_r$.

(2) A retract $r : D \to D$ is called a projection if $r \sqsubseteq I$, where $I : D \to D$ is the identity function.

(3) A projection $r : D \to D$ is called finitary if $|r|$ is isomorphic to a domain.

(4) Given retracts $r, s : D \to D$, we say that $r$ is a subcpo of $s$ (denoted by $r \le s$) if $s \circ r = r$. This is equivalent to saying $|r| \subseteq |s|$.

3.2. Fact. If $f, g : D \to D$ are projections then $f \sqsubseteq g$ iff $|f| \subseteq |g|$.
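Definition 3.1 and Fact 3.2 can be checked by hand on a very small example. The sketch below is an illustration only: it assumes a toy domain, the finite chain 0 ⊑ 1 ⊑ 2 ⊑ 3 ⊑ 4, and two truncation maps r and s chosen for this example; none of these names come from the paper.

```python
D = range(5)  # assumed toy domain: the chain 0 <= 1 <= 2 <= 3 <= 4

def is_retract(r):
    """r o r = r on every element of D."""
    return all(r(r(x)) == r(x) for x in D)

def is_projection(r):
    """A retract lying below the identity in the pointwise order."""
    return is_retract(r) and all(r(x) <= x for x in D)

def fixed_points(r):
    """The fixed point set |r|."""
    return {x for x in D if r(x) == x}

def is_subcpo(r, s):
    """r is a subcpo of s when s o r = r."""
    return all(s(r(x)) == r(x) for x in D)

def below(f, g):
    """Pointwise ordering of functions on D."""
    return all(f(x) <= g(x) for x in D)

r = lambda x: min(x, 2)   # truncates the chain at 2
s = lambda x: min(x, 3)   # truncates the chain at 3

r_fixed = fixed_points(r)
s_fixed = fixed_points(s)
```

In this example r is a subcpo of s exactly because every fixed point of r is also a fixed point of s, and the pointwise ordering of the two projections agrees with the inclusion of their fixed point sets, as Fact 3.2 states.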
The domain of finitary projections of the universal domain $U$ will be denoted by $V$. Each domain is isomorphic to the fixed point set of some finitary projection on $U$ [6]; hence it can be regarded as an element of $V$. Therefore, we shall use the notation $C$, $D$ etc. to denote elements of $V$ as well.
3.3. Fact. There exist $\to, \times, + : V \times V \to V$ such that $|D \to E| \cong |D| \to |E|$, $|D \times E| \cong |D| \times |E|$, $|D + E| \cong |D| + |E|$.
In this paper a retract will perform a double duty. Sometimes we can think of it as denoting the set of its fixed points, sometimes we can think of it as a function. Thus if $r : D \to D$ is a retract, then $x : r$, $x \in r$, $x \in |r|$ and $r\,x = x$ all make sense; in fact they mean the same thing. Similarly if $S \subseteq |D|$ then $r(S)$ denotes the set $\{y \mid \exists x \in S.\ y = r\,x\}$. Of course the finitary projections are retracts, so all this applies to them too. The duality between finitary projections of $U$, i.e. elements of $V$, and domains will be assumed throughout this paper. Thus if $L \in V^k$ then sometimes we shall think of it as a subdomain of $U^k$, and sometimes as a function over $U^k$.
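The notions of Definition 3.1 can be checked mechanically on a small finite example. The following sketch is only an illustration and is not part of the LCF development: the four-element chain, the helper names (`is_retract`, `is_projection`, `fixed_points`, `is_subcpo`) and the two sample maps are all assumptions made here.

```python
# Finite illustration of Definition 3.1 on the chain 0 < 1 < 2 < 3,
# where 0 plays the role of the bottom element.
D = [0, 1, 2, 3]

def compose(f, g):
    """Return f o g, with functions on D represented as dictionaries."""
    return {x: f[g[x]] for x in D}

def is_retract(r):
    """r is a retract iff r o r = r."""
    return compose(r, r) == r

def is_projection(r):
    """A projection is a retract lying below the identity (pointwise)."""
    return is_retract(r) and all(r[x] <= x for x in D)

def fixed_points(r):
    """The fixed point set |r|."""
    return {x for x in D if r[x] == x}

def is_subcpo(r, s):
    """r is a subcpo of s iff s o r = r."""
    return compose(s, r) == r

r = {0: 0, 1: 0, 2: 2, 3: 2}   # |r| = {0, 2}
s = {0: 0, 1: 1, 2: 2, 3: 2}   # |s| = {0, 1, 2}
```

On this example `is_subcpo(r, s)` holds exactly because `fixed_points(r)` is contained in `fixed_points(s)`, which is the equivalence stated in item (4).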
3.4. Definition.
(1) The $n$-ary relation $R$ on some domain is said to be upward [downward] closed in its $i$th index iff $(x_1, \ldots, x_i, \ldots, x_n) \in R$ and $y_i \sqsupseteq (\sqsubseteq)\ x_i$ implies $(x_1, \ldots, y_i, \ldots, x_n) \in R$.
(2) The relation $S$ is directed-complete if the l.u.b. of any directed subset of $S$ belongs to $S$.
(3)
Cl (A,B), where A and B s" E g" and "3x E X. x EX 1\ g". We shall also allow the obvious shortforms such as: "'v'(Yl,"', Yl) E ...".
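Let $w \in \Phi(\mathbf{X} : V^m, X)$. Let an interpretation $\mathfrak{I}$ be a function such that:
(1) $\mathfrak{I}$ assigns to $\mathbf{X} : V^m$ a domain $\mathbf{X}^{\mathfrak{I}}$ in $V^m$ and to $X$ a relation $X^{\mathfrak{I}}$ on $\mathbf{X}^{\mathfrak{I}}$; i.e., $X^{\mathfrak{I}} \subseteq |\mathbf{X}^{\mathfrak{I}}|$.
(2) $\mathfrak{I}$ assigns to every constant symbol $C : D$ (which can occur in $\Delta(X)$ or $\Gamma(X)$) an element $C^{\mathfrak{I}}$ in $D$.
(3) $\mathfrak{I}$ assigns to every constant predicate symbol $E$ a predicate $E^{\mathfrak{I}}$ on some domain $\mathbf{E}^{\mathfrak{I}}$; i.e., $E^{\mathfrak{I}} \subseteq |\mathbf{E}^{\mathfrak{I}}|$.
(4) $\mathfrak{I}$ assigns to the predicate symbol '$\sqsubseteq$' the standard partial ordering predicate $\sqsubseteq$ on $U^2$. Assignments to the predicate symbols '$\sqsupseteq$' and '$=$' are similar.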
Let $s$ be a function from the set of variables over $U$ into $U$. It is very easy to extend the interpretation to predicate symbols, domain terms and simple terms:
(1) Every $S : V \in \Gamma(X)$ can be assigned an element $S^{\mathfrak{I}}$ in $V$.
(2) Every $r : U^k \in \Delta(X)$ can be assigned an element $r^{\mathfrak{I}}[s]$ in $U^k$.
(3) Every $P \in \mathcal{P}(X)$ can be assigned a predicate $P^{\mathfrak{I}}$ on some domain $\mathbf{P}^{\mathfrak{I}}$; i.e., $P^{\mathfrak{I}} \subseteq |\mathbf{P}^{\mathfrak{I}}|$.
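Now we inductively define what it means for $\mathfrak{I}$ to satisfy $w \in \Phi(\mathbf{X}, X)$ with $s$, written $\mathfrak{I} \models w[s]$:
(1) $\mathfrak{I} \models (r \in P)[s]$ iff $\mathbf{P}^{\mathfrak{I}}(r^{\mathfrak{I}}[s]) \in P^{\mathfrak{I}}$.
(2) $\mathfrak{I} \models (\forall y \in S.\ g)[s]$ iff $\forall a \in S^{\mathfrak{I}}.\ \mathfrak{I} \models g[s(a/y)]$; $s(a/y)$ is exactly like $s$ except that it maps $y$ to $a$. The other case is similar.
(3) $\mathfrak{I} \models (f \Rightarrow g)[s]$ iff $\mathfrak{I} \models f[s]$ implies $\mathfrak{I} \models g[s]$. The other cases are similar.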
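The inductive clauses above translate directly into a recursive evaluator when all domains are finite. The sketch below is a simplified illustration under stated assumptions: formulas are encoded as nested tuples, terms are Python functions of the assignment, domains and predicates are finite sets, and the step in clause (1) that first applies the domain retract to the term is omitted. The names `holds`, `interp` and the tuple tags are hypothetical and do not come from the paper.

```python
# Hypothetical mini-evaluator for the satisfaction relation on finite sets.
def holds(interp, w, s):
    """Decide whether interp satisfies the wff w under the assignment s.

    interp maps predicate and domain names to finite sets;
    s maps variable names to values.
    """
    tag = w[0]
    if tag == "in":                      # ("in", term, P): atomic membership
        _, term, pred = w
        return term(s) in interp[pred]
    if tag == "forall":                  # ("forall", y, S, g)
        _, y, dom, g = w
        return all(holds(interp, g, {**s, y: a}) for a in interp[dom])
    if tag == "exists":                  # ("exists", y, S, g)
        _, y, dom, g = w
        return any(holds(interp, g, {**s, y: a}) for a in interp[dom])
    if tag == "implies":                 # ("implies", f, g)
        _, f, g = w
        return (not holds(interp, f, s)) or holds(interp, g, s)
    if tag == "and":                     # ("and", f, g)
        _, f, g = w
        return holds(interp, f, s) and holds(interp, g, s)
    raise ValueError(f"unknown connective: {tag}")

# A toy interpretation: S is a three-element set, LE is the usual order on it.
interp = {
    "S": {0, 1, 2},
    "LE": {(a, b) for a in range(3) for b in range(3) if a <= b},
}
# The wff "for all y in S, (x, y) in LE", with x free.
w = ("forall", "y", "S", ("in", lambda s: (s["x"], s["y"]), "LE"))
```

Here `s(a/y)` of clause (2) is rendered as the dictionary update `{**s, y: a}`, which leaves every other variable unchanged.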
Let $w$ be a wff whose free variables over $U$ are contained in the list $v_1, \ldots, v_k$. Let $a$ be a tuple in $U^k$. Then we say $\mathfrak{I} \models w[a]$ iff $\mathfrak{I} \models w[s]$, where $s$ is a function which assigns the $j$th component of $a$, $(a)_j$, to $v_j$. If $r$ is a term whose free variables over $U$ are contained in the list $v_1, \ldots, v_k$, $r^{\mathfrak{I}}[a]$ is similarly defined. As the interpretations intended for the constant symbols and the constant predicate symbols should be clear from the context, we shall not mention them. In fact, if $\mathfrak{I}$ is the interpretation which assigns $\mathbf{L}$ to $\mathbf{X}$ and $L$ to $X$, then we shall sometimes denote $\mathfrak{I} \models w[s]$ simply by $(\mathbf{L}, L) \models w[s]$. Finally we are in a position to define a map on a predicate domain.

4.5. Definition. Let $\|\mathbf{L}, A, B, P\|$ and $\|\mathbf{R}, C, D, Q\|$ be predicate cpos, where $\mathbf{L}$ is a retract over $V^m$ and $\mathbf{R}$ is a retract over $V^n$. Then given $T : V^m \to V^n$
and w E g" : vVe have to prove that:
mr--:: h[yj implies
(~ ~ h[z] implies ~ F= g[z]) implies is easy to see th e suffi ciency of the goals: ~ ~
!= h ly] implies ~ F= h[z] i= g[z ] implies ~ F= g[y ]
~
!= g[y])
It
and
These can be redu ced further by Reduce2 an d Reducel respect ively. (b)w = " h l\ g" or " h v g": generate t he goals : ~ ~
F= h[z] implies ~!= h[y]
and
f= g [z] implies ~ != g[y]
These could be reduced further by Reduce1.

Reduce2: Let the goal be l§ll= w[v] implies ~ F= w[u], where w, ~, ~ are as in Reduce1 and $u : U^n \sqsubseteq v : U^n$. Reduce2 simplifies the goal recursively. It is very similar to Reduce1 and hence we shall not discuss it.

5.1. Example. Let us now prove the existence of a solution to (2.1). Remember that $D$, the domain of values, and $C$, the domain of environments, are the least solutions to the equations $D = C \to B$ and $C = \mathrm{Ide} \to D$. $E$ is the domain of expressions and $F$ is the domain of contexts, $\mathrm{Ide} \to E$. The general plan is as follows. We first construct the predicators $\|T, w\|$ and $\|S, f\|$ where
$T : V^2 \to V^2 = \lambda (C', F').\,(C' \to B,\ E)$
$S : V^2 \to V^2 = \lambda (D', E').\,(\mathrm{Ide} \to D',\ \mathrm{Ide} \to E')$
$w[d : U, e : U] \in \Phi(\mathbf{X} : V^2, X) = \forall (c, f) \in X.\ \mathrm{apply}\ d\ c \sqsubseteq O\ e\ f$
$f[c, f] \in \Phi(\mathbf{Y} : V^2, Y) = \forall I \in \mathrm{Ide}.\ (\mathrm{apply}\ c\ I,\ \mathrm{apply}\ f\ I) \in Y$

Here $\mathrm{apply} : U \to U \to U = \lambda x\,\lambda y.\,(j_{\to}\,x)\,y$, where $j_{\to}$ is the projection from $U$ to $U \to U$. Strictly speaking, $O$ here is the extension to $U \to U \to U$ of the operational semantics, $O$, given in section 2. Such an extension can be carried out uniquely. Next we shall construct predicate cpos such that the above predicators will form continuous functions between them. That is, we choose certain retracts $\mathbf{L}$, $\mathbf{R}$ over $V^2$ and predicates $P$ and $Q$ such that:

$\|T, w\| \in \|\mathbf{R}, \{2\}, \{1\}, Q\| \to \|\mathbf{L}, \{2\}, \{1\}, P\|$
$\|S, f\| \in \|\mathbf{L}, \{2\}, \{1\}, P\| \to \|\mathbf{R}, \{2\}, \{1\}, Q\|$

Note that we are considering only those relations which belong to $\mathrm{Cl}(\{2\}, \{1\})$. Continuity of the predicators will allow us to take the following fixed point.

$((\mathbf{L}^{\circ}, L^{\circ}), (\mathbf{R}^{\circ}, R^{\circ})) = \mathrm{FIX}(\lambda ((\mathbf{L}, L), (\mathbf{R}, R)).\,(\|T, w\|(\mathbf{R}, R),\ \|S, f\|(\mathbf{L}, L)))$   (5.1)
It will turn out that

$\mathbf{L}^{\circ} = D \times E, \quad \mathbf{R}^{\circ} = C \times F$   (5.2)

Then using (5.1), (5.2) and (4.2) it follows that (2.1) gets satisfied if we let $\theta = L^{\circ}$ and $\eta = R^{\circ}$. Of course several conditions ought to be satisfied before this can be done. Firstly, to guarantee that the predicators are well defined we require that:

for all $(\mathbf{R}, R) \in \|\mathbf{R}, C, D, Q\|$, $\|T, w\|(\mathbf{R}, R) \in \|\mathbf{L}, A, B, P\|$   (5.3)
for all $(\mathbf{L}, L) \in \|\mathbf{L}, A, B, P\|$, $\|S, f\|(\mathbf{L}, L) \in \|\mathbf{R}, C, D, Q\|$   (5.4)
Secondly, the algorithm given above generates some goals to ensure monotonicity (and hence continuity, by thm 4.6) of the predicators. (The goals given below are the ones generated by the implemented version. It takes care of some trivial goals, carries out primitive simplification and generates goals which are more readable.) The goals for $\|T, w\|$ are:
V(C',F'),(C',E,') E R. (C/,F') ~ (C',E') ==> \/£ E Q', dEC' ----7 B. B(apply d (C' f)) = B(apply dfJ V~ E E, f E F', B(O(E f) f) = B(O ~f) Similarly goals for
(5.5) (5.6)
liS, fll are:
V(D',E'),(D',~') E
L. (D/,E') [;;;;(D',~') ==> ' yn will be called well beho.ved (wb) if for all i, (pr oj!, 0 l 0 proj~) : Y -> V is a retract and jproj i 0 l 0 proj~ i = proji(ll f). Then it can be easily show n that the set of well behaved retracts on y n forms a cpo ; ca ll it \ Vb(V n ) .
6.1. Definition. A pair $(T : V^m \to V^n,\ \bar{T} : V^n \to V^m)$ is called a dual if given $l : \mathrm{Wb}(V^m)$, $(T \circ l \circ \bar{T}) \in \mathrm{Wb}(V^n)$ and $|T \circ l \circ \bar{T}| = T(|l|)$. We shall call $\bar{T}$ a 'right inverse' of $T$.

It is obvious that $\langle \mathrm{proj}_i^n, \overline{\mathrm{proj}}_i^n \rangle$ is a dual.

6.2. Definition. $T : V^m \to V^n$ is said to be nice if it can be constructed from $\to, \times, + : V \times V \to V$, $I : V \to V$, constants $C : \mathrm{Wb}(V^k)$ ($k$ is arbitrary), $\circ$, fnpair, $\mathrm{proj}_i^k$ ($k$ is arbitrary). Here $\circ$ is the composition, $\mathrm{fnpair}\ f\ g = \lambda x.\,(f\,x,\ g\,x)$, and $I$ is the identity function.

6.3. Theorem. If $T$ is nice then it has a 'right inverse' $\bar{T}$.

Proof. Though the proof is nontrivial it is technical, so we shall omit it.

The result we are interested in is:

6.4. Corollary. If $T : V^m \to V^n$ is nice, there exists $\hat{T} : \mathrm{Wb}(V^m) \to \mathrm{Wb}(V^n)$ such that $|\hat{T}(L)| = T(|L|)$.

Proof. Let $\hat{T} = \lambda L.\ T \circ L \circ \bar{T}$.
Nice functions form a fairly rich class. Most of the functions on finitary projections used to build reflexive domains are nice. For example $T$ and $S$ in example 5.1 are nice. In addition we have at our disposal the following simple combinators:

(1) $\mathrm{strict} : \mathrm{Wb}(V^n) \to \mathrm{Wb}(V^n)$, where strict is defined by: $\mathrm{strict}\ f = \lambda x.\ \text{if } x = \bot \text{ then } \bot \text{ else } f(x)$.
(2) $\otimes : \mathrm{Wb}(V^n) \times \mathrm{Wb}(V^m) \to \mathrm{Wb}(V^{n+m})$, where $\otimes$ is defined by $(f \otimes g)(x, y) = (f\,x,\ g\,y)$.
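The combinators fnpair, strict and $\otimes$ are ordinary higher-order functions, and their defining equations can be written down almost verbatim. The sketch below works on plain Python values rather than on well behaved retracts; the use of `None` as a stand-in for $\bot$ and the sample functions `inc` and `dbl` are assumptions made only for illustration.

```python
BOTTOM = None  # stand-in for the bottom element (an assumption of this sketch)

def strict(f):
    """strict f = lambda x. if x = bottom then bottom else f(x)."""
    return lambda x: BOTTOM if x is BOTTOM else f(x)

def fnpair(f, g):
    """fnpair f g = lambda x. (f x, g x)."""
    return lambda x: (f(x), g(x))

def tensor(f, g):
    """(f tensor g)(x, y) = (f x, g y)."""
    return lambda x, y: (f(x), g(y))

def inc(n):
    """Sample function: successor."""
    return n + 1

def dbl(n):
    """Sample function: doubling."""
    return 2 * n
```

For instance, `strict(inc)` maps `BOTTOM` to `BOTTOM` without calling `inc`, while `tensor(inc, dbl)` acts componentwise on a pair of arguments.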
6.5. (Continuation of example 5.1). We shall now choose $\mathbf{L}$, $\mathbf{R}$, $P$, $Q$ such that the conditions stated in Example 5.1 are satisfied. Conditions (5.5) to (5.8) guarantee monotonicity of $\|T, w\|$ and $\|S, f\|$. We have already seen how (5.5) to (5.7) could be proved. Hence only (5.8) remains to be proved. We state it again.

$\forall (D', E'), (D', \tilde{E}') \in \mathbf{L}.\ (D', E') \sqsubseteq (D', \tilde{E}') \Rightarrow \forall \ell \in \mathrm{Ide} \to E',\ I \in \mathrm{Ide}.\ E'(\mathrm{apply}\ ((\mathrm{Ide} \to E')\,\ell)\ I) = E'(\mathrm{apply}\ \ell\ I)$   (6.1)
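By the monotonicity of $\|T, w\|$, $\|S, f\|$ (the above conditions will guarantee that) and corollary 6.4, (5.3) and (5.4) can be reduced to:

(a) $\hat{T}(\mathbf{R}) \leq \mathbf{L}$ and $\hat{S}(\mathbf{L}) \leq \mathbf{R}$   (6.2)
(b) given $G \subseteq |\mathbf{G}|$, if $G \in \mathrm{Cl}(\{2\}, \{1\})$ then $|w|(G)$ and $|f|(G) \in \mathrm{Cl}(\{2\}, \{1\})$
(c) $\|T, w\|(\bot_{\mathbf{R}}, Q) \sqsupseteq (\bot_{\mathbf{L}}, P)$,   (6.3)
    $\|S, f\|(\bot_{\mathbf{L}}, P) \sqsupseteq (\bot_{\mathbf{R}}, Q)$   (6.4)

where $|w|(G) = \{x : T(\mathbf{G}) \mid (\mathbf{G}, G) \models w[x]\}$, and $|f|(G)$ is similarly defined. By the preceding theory we are justified in choosing $\mathbf{L}$ and $\mathbf{R}$ as the fixed points:

$\mathbf{L} : \mathrm{Wb}(V^2),\ \mathbf{R} : \mathrm{Wb}(V^2) = \mathrm{FIX}(\lambda (L', R').\,(\mathrm{strict}(\hat{T}\,R'),\ \hat{S}(L')))$   (6.5)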
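Then it can be shown in LCF that $\bot_{\mathbf{L}} : V \times V = (\bot, \bot)$, $\bot_{\mathbf{R}} = (\bot, \bot)$. The fixed point sets of $\mathbf{L}$ and $\mathbf{R}$ look simply as shown in fig 6.1.

[fig 6.1: the fixed point sets $|\mathbf{L}|$ and $|\mathbf{R}|$; the labelled points include $(\bot, \bot)$, $(\bot \to B, E)$, $(D, E)$ and $(C, \mathrm{Ide} \to E) = (C, F)$.]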
Let $P\ (\subseteq |\bot_{\mathbf{L}}|) = \{(\bot, \bot)\}$ and $Q\ (\subseteq |\bot_{\mathbf{R}}|) = \{(\bot, \bot)\}$. Then (6.3) and (6.4) follow trivially. But they cannot be proved within the LCF formalism, and hence have to be proved separately. (6.2) can be proved almost automatically by using the standard set of tactics provided by the implemented system. These tactics use straightforward properties of wb retracts, right inverses etc. which we have not stated in this paper. (6.1) can be shown using fixed point induction (refer to (6.5)). Thus all the conditions stated in Example 5.1 can be proved. The fixed point operation in (5.1) is then justified. Finally (5.2) follows by one more fixed point induction. Now the recursive predicates are obtained as was indicated in Example 5.1. Thus except for (6.3) and (6.4) everything else has been mechanized. As a side remark, it is interesting to note what happens if $\sqsubseteq$ in (2.1) is replaced by $=$ or $\sqsupseteq$. In that case the goals which are generated by the LCF interface can no longer be proved for all $O$. But we have already proved that in this case there exists $O$ for which the predicates do not exist. Thus though the goals generated by the LCF interface are by no means necessary, they seem to be necessary in some weak sense.
7. Implementation
Now the basic design of the system should be clear. It consists of two parts: an LCF interface and a working environment.
7.1. LCF interface. The input to this stage is a representation of the predicate domains and predicators. A function on predicates is specified by a wff. The output of this stage is the set of goals which will guarantee the continuity of the predicators. We have already discussed how this is done. These goals are proved in the working environment.

7.2. Working Environment. The working environment provides a hierarchy of LCF theories and a standard set of tactics. The hierarchy consists of theories of the universal domain, finitary projections, duals, well behaved retracts etc. Tactics are provided which take care of goals which frequently arise. They are programmed in ML using well known programming principles such as tacticals and simple resolution. Though the implementation is nontrivial, these aspects of ML programming and LCF are well understood, hence we shall not discuss them here (for example, see [1]).

8. Example
We shall consider one bigger example. This is taken from [2]. It was used to show the correctness of a LISP implementation. Let $D$, a domain of values, and $C$, a domain of environments, be the least solutions of the equations $D = C \to B$ and $C = \mathrm{Ide} \to D$ as in section 2. We want to show the existence of the recursively defined predicates:

$\theta \subseteq D \times \{Z \mid Z \subseteq \mathrm{Ide}\}$
$=_Z\ \subseteq C \times C$   (one for each $Z \subseteq \mathrm{Ide}$)

where intuitively $(v, Z) \in \theta$ means the free variables of $v$ are included in $Z$, and $(\rho, \rho') \in\ =_Z$ means $\rho$ and $\rho'$ "strongly" agree for all $z \in Z$. Formally we require that:

$\theta = \{(v, Z) \mid \forall Y, \rho, \rho'.\ (Z \subseteq Y \Rightarrow (\rho =_Y \rho' \Rightarrow v\,\rho = v\,\rho'))\}$
$=_Z\ = \{(\rho, \rho') \mid \forall z \in Z.\ \rho\,z = \rho'\,z \text{ and } (\rho\,z, Z) \in \theta\}$

By rearrangement we get:

$\theta = \{(v, Z) \mid \forall Y, \rho, \rho'.\ Z \subseteq Y \Rightarrow (\forall y \in Y.\ \rho\,y = \rho'\,y \text{ and } (\rho\,y, Y) \in \theta) \Rightarrow v\,\rho = v\,\rho'\}$

We shall show the existence of $\theta$; then that of $=_Z$ easily follows.
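Let Pide be the flat domain of all subsets of Ide. We show that

$\|T, w\| \in \|\mathbf{L}, \{\}, \{\}, P\| \to \|\mathbf{L}, \{\}, \{\}, P\|$

where:

$T = \lambda (D', P').\,((\mathrm{Ide} \to D') \to B,\ \mathrm{Pide})$
$w[v : U, Z : U] \in \Phi(\mathbf{X} : V^2, X) = \forall Y \in \mathrm{Pide}.\ \forall \rho \in \mathrm{Ide} \to (\mathrm{proj}_1^2(\mathbf{X})).\ \forall \rho' \in \mathrm{Ide} \to (\mathrm{proj}_1^2(\mathbf{X})).\ Z \subseteq Y \Rightarrow ((\forall y \in \mathrm{Ide}.\ y\ \mathrm{memberof}\ Y \Rightarrow (\rho\,y = \rho'\,y \wedge (\rho\,y, Y) \in X)) \Rightarrow v\,\rho = v\,\rho')$
$\mathbf{L} = \mathrm{FIX}(\lambda L.\,(\mathrm{strict}(\widehat{\mathrm{proj}}_1^2(\hat{T}(L))) \otimes (\lambda P.\,\mathrm{Pide})))$
$P = \{(\bot, Q) \mid Q \in \mathrm{Pide}\}$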
The wff $w$ contains two constant predicate symbols, 'memberof' and '$\subseteq$'. The symbol '$\subseteq$' is interpreted over $\mathrm{Pide} \times \mathrm{Pide}$ and 'memberof' is interpreted over $\mathrm{Ide} \times \mathrm{Pide}$ in the obvious manner. Sixteen goals were generated by the LCF interface to guarantee continuity of the above predicator. Fifteen were generated by the goal generation algorithm of section 5. All of them were proved automatically by standard tactics except the following:

$\forall (D', P'), (D', \tilde{P}') \in \mathbf{L}.\ (D', P') \sqsubseteq (D', \tilde{P}') \Rightarrow (\forall Y \in \mathrm{Pide}.\ P'\,Y = \tilde{P}'\,Y)$

This can be proved very easily by the user using fixed point induction (refer to the definition of $\mathbf{L}$ above). The remaining goal was '$\hat{T}(\mathbf{L}) \leq \mathbf{L}$'. This could be proved almost automatically. Once the continuity of the predicator is proved, existence of the predicate follows immediately as in Example 5.1.

9. Scope And Conclusion
It is hoped that this work will meet a long felt need for a mechanical aid in proving the existence of recursively defined predicates. We would like to make a few comments. Everything we have said so far easily generalizes to the case when predicators have more than one argument. Secondly, complicated predicators can always be constructed from simpler predicators. Hopefully continuity of these simpler ones would have been proved already. In the general case, retracts underlying predicate cpos do not look linear as in fig 6.1. Due to lack of space we could not discuss more examples, but it should suffice to say that the existence of all the predicates in [2,9,7] can be proved in the present system. Also, corresponding to every relational functor in [5], it is easy to construct a predicator and prove its continuity in the present system. In the next paper we hope to report on these and bigger predicates. Right now the major bottleneck of the system seems to be the speed.

10. Acknowledgement
This work was done under the guidance of Prof. Dana Scott. I would like to thank him for his insights and advice. Special thanks to Steve Brookes and Glynn Winskel for many helpful discussions and a great help in improving the readability of the paper. Thanks also to Roberto Minio and Bill Scherlis for helping me with TeX.
11. Bibliography

(1) Cohn, Avra: The Equivalence of Two Semantic Definitions: A Case Study in LCF; Internal Report, University of Edinburgh (1981).
(2) Gordon, Michael: Towards a Semantic Theory of Dynamic Binding; Memo AIM-265, Computer Science Department, Stanford University.
(3) Milne, Robert and Strachey, Christopher: A Theory of Programming Language Semantics; Chapman and Hall, London, and John Wiley, New York (1976).
(4) Mosses, Peter: Mathematical Semantics and Compiler Generation; Ph.D. thesis, Oxford University Computing Laboratory, Programming Research Group (1975).
(5) Reynolds, J.C.: On the Relation Between Direct and Continuation Semantics; pp. 111-156 of Proceedings of the Second Colloquium on Automata, Languages and Programming, Saarbrücken, Springer-Verlag, Berlin (1974).
(6) Scott, Dana: Lectures on a Mathematical Theory of Computation; Technical Monograph PRG-19 (May 1981), Oxford University Computing Laboratory, Programming Research Group.
(7) Sethi, Ravi and Tang, Adrian: Constructing Call-by-value Continuation Semantics; Journal of the Association for Computing Machinery, vol. 27, no. 3, July 1980, pp. 580-597.
(8) Stoy, Joseph: Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory; MIT Press, Cambridge, MA (1977).
(9) Stoy, Joseph: The Congruence of Two Programming Language Definitions; Theoretical Computer Science 13 (1981) 151-174, North-Holland Publishing Company.
Solving Word Problems in Free Algebras Using Complexity Functions

Alex Pelin
Department of Computer and Information Science, Temple University, Philadelphia, Pa 19122

and

Jean H. Gallier
Department of Computer and Information Science, University of Pennsylvania, Moore School of Electrical Engineering D2, Philadelphia, Pa 19104
Abstract: We present a new method for solving word problems using complexity functions. Complexity functions are used to compute normal forms. Given a set of (conditional) equations E, complexity functions are used to convert these equations into reductions (rewrite rules decreasing the complexity of terms). Using the top-down reduction extension Rep induced by a set of equations E and a complexity function, we investigate properties which guarantee that any two (ground) terms $t_1$ and $t_2$ are congruent modulo the congruence $\cong_E$ if and only if $\mathrm{Rep}(t_1) = \mathrm{Rep}(t_2)$. Our method actually consists in computing Rep incrementally, as the composition of a sequence of top-down reduction extensions induced by possibly different complexity functions. This method relaxes some of the restrictions imposed by the Church-Rosser property.

1. Introduction
We present a method for computing the normal form of a term with respect to a set of (conditional) equations. Given a signature $\Sigma$ and a countable set of variables $V$, the initial algebra over $\Sigma$ is denoted as $T_\Sigma$ and the free algebra generated by $V$ is denoted as $T_\Sigma(V)$ ($T_\Sigma(V)$ is the algebra of $\Sigma$-terms with variables from $V$). Let $E$ be a set of equations (or conditional equations) over $T_\Sigma(V)$. We are interested in solving the word problem for $T_\Sigma$, that is, the problem of determining for any two terms $t_1$ and $t_2$ in $T_\Sigma$ whether $t_1$ and $t_2$ are congruent modulo the congruence $\cong_E$ induced by the set of equations $E$.

Our method is to compute normal forms for the terms in $T_\Sigma$. Since the word problem is undecidable in general, we are interested in classes of sets of equations for which normal forms are effectively computable and can be characterized by (deterministic) context-free languages. Given a set of equations $E$ and a set of ground terms $L$ such that all ground substitution instances of terms occurring in $E$ are in $L$, a function $f : L \to L$ is a representation function for $(L, E)$ if for all $t_1, t_2$ in $L$, $t_1 \cong_E t_2$ if and only if $f(t_1) = f(t_2)$. In words, $t_1$ and $t_2$ are equivalent modulo $\cong_E$ if and only if their representatives are identical. Our goal is to find a representation function Rep.
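A very small instance may help to fix the idea of a representation function. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: ground terms are words over the letters a and b, the only equation is commutativity (ab = ba), and the candidate representation function simply sorts the letters. The names `rep` and `congruent` are hypothetical. The congruence is decided by brute-force search so that the defining property can be checked on all words of length at most three.

```python
from itertools import product

def rep(word):
    """Candidate representation function: sort the letters of the word."""
    return "".join(sorted(word))

def congruent(u, v):
    """Decide whether u and v are congruent modulo ab = ba.

    Explores every word reachable from u by swapping adjacent letters.
    """
    seen, frontier = {u}, [u]
    while frontier:
        w = frontier.pop()
        for i in range(len(w) - 1):
            x = w[:i] + w[i + 1] + w[i] + w[i + 2:]
            if x not in seen:
                seen.add(x)
                frontier.append(x)
    return v in seen

# All words over {a, b} of length 0 to 3.
words = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]

# rep is a representation function on this sample iff the two notions agree.
ok = all(congruent(u, v) == (rep(u) == rep(v)) for u in words for v in words)
```

The exhaustive check is of course only feasible because the sample is finite; the point of the method described here is to obtain such a function without enumerating congruence classes.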
Actually, the representation function Rep is computed as the composition $R_n . R_{n-1} \ldots R_1$ of representation functions. With each function $R_i$ is associated an input set of ground terms $L_{i-1}$ and a set of equations $A_{i-1}$ over $L_{i-1}$. The equations (conditional equations) in $A_{i-1}$ are called axioms, and $\mathrm{Th}(A_{i-1})$ is the set of theorems derivable from $A_{i-1}$. The representation function $R_i$ selects a subset $E_i$ of $\mathrm{Th}(A_{i-1})$ which it attempts to "eliminate". The elimination is accomplished by transforming the equations into reduction rules. This is done by defining a complexity function over $L_{i-1}$ which suits $E_i$ (this concept is related to that of a norm-decreasing set of rules in Gallier and Book [4]).

A complexity function $f$ over $L_{i-1}$ assigns to each term $t$ in $L_{i-1}$ an element $f(t)$ of a well-ordered set $(N^k, >)$, where $N$ denotes the set of nonnegative integers and $k$ is a positive integer. A function $f : L_{i-1} \to (N^k, >)$ is a complexity function if it is recursive, monotone, and has the subterm property, that is, whenever $t_1$ is a subterm of $t_2$ then $f(t_2) > f(t_1)$.
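As a concrete instance of the subterm property, the number of symbols in a term is a complexity function with $k = 1$. The sketch below is an illustration only: the encoding of terms as nested tuples (operator name first, constants as strings), the function names and the sample terms are assumptions introduced here, and the check is performed on a finite sample for proper subterms.

```python
def size(t):
    """Complexity of a term: the number of symbols it contains."""
    if isinstance(t, str):               # a constant
        return 1
    return 1 + sum(size(a) for a in t[1:])

def proper_subterms(t):
    """Yield every proper subterm of t."""
    if isinstance(t, str):
        return
    for a in t[1:]:
        yield a
        yield from proper_subterms(a)

def has_subterm_property(f, terms):
    """Check f(t2) > f(t1) whenever t1 is a proper subterm of t2, on a sample."""
    return all(f(t) > f(u) for t in terms for u in proper_subterms(t))

sample = [("plus", ("s", "0"), "0"), ("s", ("s", "0")), "0"]
```

A constant function, by contrast, fails the check on the same sample, since it cannot strictly separate a term from its subterms.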
We say that a complexity function $f$ strongly suits an equation $l \cong r$ if for all ground substitutions $s$, $f(s(l)) > f(s(r))$, or for all ground substitutions $s$, $f(s(r)) > f(s(l))$. If $f$ suits $l \cong r$ and for all ground substitutions $s$, $f(s(l)) > f(s(r))$, we say that $l \cong r$ is a reduction (with respect to $f$). A complexity function $f$ strongly suits a conditional equation $e_1, \ldots, e_n \Rightarrow e$ if $f$ strongly suits $e$ and for all equations $e_i$, $1 \leq i \leq n$, $f(s(e)) > f(s(e_i))$ for all ground substitutions $s$. The complexity of an instance $s(l) \cong s(r)$ of an equation is defined as $\max\{f(s(l)), f(s(r))\}$.
A weaker version of the suitability concept, which is useful in treating conditional axioms, is the concept of weak suitability defined below. A complexity function $f$ weakly suits an equation $l \cong r$ if for all ground substitutions $s$, $f(s(l)) = f(s(r))$ implies that $s(l) = s(r)$, that is, $s(l)$ and $s(r)$ are identical.

Both strong and weak suitability are used to generate meta-reductions. A meta-reduction has the form $(C) \Rightarrow l \to r$, where $C$ is a recursive predicate in a meta-language which contains names for the well-order $(N^k, >)$ and the complexity functions, and $l$ and $r$ are terms. For all such meta-reductions, $f(s(l)) > f(s(r))$ if the condition $C$ evaluates to true. Note that our approach is somewhat similar to that taken in Brand, Darringer and Joyner [2].

Once we have selected the set of theorems $E_i$ and we have found a complexity function $f$ which suits $E_i$ (that is, suits all equations in $E_i$), we define $R_i$ to be the top-down reduction extension of the set $\Pi_i$ of reductions generated by $E_i$ and $f$.
The top-down reduction extension $\alpha$ of a set of meta-rules $\Pi_i$ is defined as follows. For every term $t$ in $L_{i-1}$, we have the following cases:

(1) If for a meta-rule $(C) \Rightarrow l \to r$ and a ground substitution $s$, $s(l) = t$ and $C$ evaluates to true, then $\alpha(t) = \alpha(s(r))$.
(2) If case 1 does not apply and $t$ has the form $g(t_1, \ldots, t_n)$, then compute recursively $\alpha(t_1), \ldots, \alpha(t_n)$. Then $\alpha(t) = \alpha(g(\alpha(t_1), \ldots, \alpha(t_n)))$.
(3) If neither of the above cases applies, then $\alpha(t) = t$.

In case 3, $t$ is called an atom.
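The three cases above can be read as a recursive procedure. The following sketch is one possible rendering under explicit assumptions, not the authors' implementation: terms are nested tuples as in ordinary term rewriting, a set of meta-rules is modelled as a single Python function that returns the instantiated right-hand side when some rule (with its condition) applies and `None` otherwise, and case (2) stops when recomputing the arguments leaves the term unchanged, so that the recursion terminates on atoms. The rules for Peano addition are a hypothetical example.

```python
def plus_rules(t):
    """Hypothetical meta-rules for Peano addition.

    Returns the instantiated right-hand side if a rule matches t, else None.
    """
    if isinstance(t, tuple) and t[0] == "plus":
        x, y = t[1], t[2]
        if x == "0":                                   # plus(0, y) -> y
            return y
        if isinstance(x, tuple) and x[0] == "s":       # plus(s(x'), y) -> s(plus(x', y))
            return ("s", ("plus", x[1], y))
    return None

def alpha(t, rules):
    """Top-down reduction extension, following cases (1) to (3)."""
    reduced = rules(t)
    if reduced is not None:                            # case (1): a rule applies at the top
        return alpha(reduced, rules)
    if isinstance(t, tuple):                           # case (2): reduce the arguments
        rebuilt = (t[0],) + tuple(alpha(a, rules) for a in t[1:])
        # Assumption of this sketch: if nothing changed, t is already an atom.
        return t if rebuilt == t else alpha(rebuilt, rules)
    return t                                           # case (3): t is an atom

one = ("s", "0")
two = ("s", one)
three = ("s", two)
```

With these rules, `alpha(("plus", two, one), plus_rules)` evaluates to `three`, and replacing an argument by its own reduct before reducing gives the same result, for example for the term `("plus", ("plus", one, one), "0")`.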
be the range of R.
1
obtained by applying R
i
and let A.
1
be the system of axioms
to the axioms in Ai -1 •
L. consists of
the set
1
of atoms of
the top-down reduction extension R .•
theorems E.
is well chosen, some axioms from A become string i_ l
1
1
identities in A.
1
and can be eliminated.
eliminate axioms,
a proper choice for E
syntactic form for the set of terms L
We say that a
operator f
of
i
If
the set of
If it is not possible to i
may produce a simpler
or for the axioms Ai'
top-down reduction extension
rank n and for every n terms t
every j, 1ijin, if f(tl ••••• t
n)
e
L
i_ 1
1'
•••• t
n
in L
i_ 1,
for
then
Ri(f(t1'···'tj' ••• 'tn»=Ri(f(t1····,Ri(tj)' ••• 'tn»· The representation function Rep can be seen as the composition of the functions given by the sequence
R
(4) --->
O
---> .•.
if
the composition Rn.Rn_I ••• R
I
has the
DC -property, then it has the representation property for '
O
L
n
~o
We will now give criteria for obtaining "useful" theorems in $E_i$. There are essentially two methods. The first method is to force local confluence in the set of rules associated with