Equational grammar
 9783111729138, 9783110996272

JANUA LINGUARUM
STUDIA MEMORIAE NICOLAI VAN WIJK DEDICATA
edenda curat C. H. VAN SCHOONEVELD, Indiana University

Series Minor, 108

EQUATIONAL GRAMMAR
by GERALD A. SANDERS
University of Minnesota

1972

MOUTON THE HAGUE • PARIS

© Copyright 1972 in The Netherlands. Mouton & Co. N.V., Publishers, The Hague. No parts of this book may be translated or reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publishers.

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 72-86886

Printed in Belgium, by NICI, Printers, Ghent.

TABLE OF CONTENTS

1. INTRODUCTION

2. EQUATIONAL STATEMENTS AND PROOFS
   2.1. Equational Theories
   2.2. Proving Equivalence Theorems
   2.3. Proving Non-Equivalence Theorems
   2.4. Consistency and Completeness

3. GRAMMATICAL THEOREMS
   3.1. Symbolic Equivalence
   3.2. Synonymy and Ambiguity
   3.3. Other Properties and Relations
        3.3.1. Class-Membership Theorems
        3.3.2. Analyticity
        3.3.3. Contradiction
        3.3.4. Presupposition
        3.3.5. The Characterization of Linguistically Significant Attributes

4. GRAMMATICAL AXIOMS
   4.1. Equivalence Axioms
        4.1.1. The Equational Reduction of Directed Rewriting Systems
        4.1.2. Redundancy Axioms
        4.1.3. Lexical Axioms
        4.1.4. Ordering Rules
        4.1.5. Reordering Rules and their Reduction to Ordering Equations
        4.1.6. Grouping and Regrouping Axioms
        4.1.7. Idempotency Axioms
   4.2. Non-Equivalence Axioms
        4.2.1. The Four-Axiom-Type Equational Theory
        4.2.2. The Equational Reduction of Representational Constraints
        4.2.3. The Reduction of Relative Derivational Constraints
        4.2.4. The Reduction of Exceptionality Constraints
   4.3. The Range of Equational Axioms
        4.3.1. The Reduction of Lexis
        4.3.2. The Reduction of Semantic Amalgamation
        4.3.3. The Reduction of Empirical Logic

5. THE EXPLANATORY VALUES OF THE EQUATIONALITY HYPOTHESIS

BIBLIOGRAPHY

INDEX

1 INTRODUCTION

This study deals with the forms and functions of grammatical statements and with the problem of restricting the axiomatic basis of grammar in such a way as to achieve a maximally general and maximally revealing explanation of natural language data. The primary restriction which is proposed is the requirement that every well-formed grammatical statement must be an assertion of the equivalence or non-equivalence of two linguistic representations. This restriction effectively precludes the specification of any rule-specific or language-specific restrictions on the manner in which grammatical principles are used for the proof of particular grammatical theorems. The equationality constraint thus imposes severe limitations on the set of possible grammars and possible natural languages, and provides an otherwise-unavailable motivated basis for the formulation of certain significant universal principles of grammatical use, by means of which the optimal directionality and ordering of derivational inferences can be systematically predicted from the equational statements of a given grammar and the terminal mode of any given proof. The class of possible grammatical axioms and theorems will be further restricted by the imposition of additional constraints on the well-formedness of linguistic representations and derivations, and on the possible relations which may hold between the two members of any given equation. The empirical sufficiency of this restricted theory of equational grammar will be argued by showing that it is capable of generating the types of theorems and explanatory generalizations which must be generated by any significant
theory of language. Its empirical necessity will be argued by showing that it is capable of generating certain significant explanatory principles which are inherently inexpressible by means of any non-equational theory of grammar. A metascientific basis for all equational theories of language will be established in Chapter 2, where the distinctive properties of equivalence relations will be outlined, along with the general principles of proof for all assertions of equivalence or nonequivalence. The empirical sufficiency of equational statements for the formulation of linguistic theorems and axioms will be demonstrated in Chapters 3 and 4, respectively. In Chapter 3, it will be shown that each of the significant properties and relations of linguistic objects can be formally characterized by a set of one or more equations of the form A = B, where A and B are linguistic representations. A distinct set of valid equational theorems can thus be associated with each distinct claim about the grammaticality, synonymy, ambiguity, analyticity, or entailments of a given set of linguistic objects, and there is no significant empirically verifiable assertion about such objects which is not appropriately expressible in this way. In Chapter 4 it will be shown that all empirically-significant grammatical axioms can also be appropriately expressed as equations between linguistic representations. A limited class of equivalence axioms will be found to suffice for the expression of those explanatory generalizations associated with the traditional derivational processes of adjunction, deletion, and lexicalization, as well as those effecting the proper grouping and ordering of constituents at various levels of representation. A similarly restricted class of non-equivalence axioms will also be shown to suffice for the formulation of those grammatical principles which express conditions or constraints on the well-formedness of linguistic representations and derivational proofs. 
Certain arguments for the necessity of equational grammar will then be summarized in Chapter 5, where it will be shown that the equationality constraint on grammatical statements is not only consistent with the principled explanation of all particular facts about particular languages, but that it is
also essential for the explanation of certain universal characteristics of natural languages and certain observed limitations on their range of possible variation. Thus it will be argued there, in particular, that all principles determining the optimal use of grammatical axioms for the proof of particular theorems are universal in scope, and that all ad hoc language-specific restrictions on the directionality, order of application, or optionality of grammatical rules can be eliminated from the set of possible grammars of particular languages. By reducing all principles of grammatical inference to a small set of universal statements of the theory of grammar, we are able to radically reduce the set of possible non-universal statements of this theory, and it becomes possible for us to bring additional empirical evidence to bear on decisions concerning the correctness or naturalness of particular grammatical statements, which must be consistent now not only with some particular set of facts about some particular language but also with the proper application of all those principles of inference which have been substantiated with respect to clearer cases in that language and all other languages. In these and other ways, the proposed theory of equational grammar will be seen to provide a narrower and more revealing characterization of natural language than can be provided by any theory which is inconsistent with the equationality constraint or with any of the other general principles of grammar that are systematically correlated with it. I wish to express my gratitude to Andreas Koutsoudas for his encouragement and highly constructive criticism of the preliminary version of this monograph. I am indebted to him for a number of important ideas and insights generated during our discussions of many of the various issues dealt with here. I am also very grateful to Robert Lefkowitz, James H-Y. Tai, and Eleanor H. Young for their valuable comments on the preliminary manuscript. 
Their suggestions have led to a number of beneficial changes in the style and content of the original version. The original and much of the final version of this monograph were written during the academic years 1968-1969 and 1969-1970, at which time I was the holder of a National Science Foundation grant under the Science Development Program in Language and Behavior at the University of Texas at Austin. I wish to express my appreciation to all those who made this grant possible, and to the chairman, W. P. Lehmann, and entire faculty of the Department of Linguistics at Texas for providing the freedom and congenial atmosphere under which this research was carried out.

Minneapolis, June 1971

2 EQUATIONAL STATEMENTS AND PROOFS

An equational statement is a statement asserting the existence or non-existence of an equivalence relation between two abstract objects or representations. A relation between the members of some set is an equivalence relation if and only if it is a binary relation that is reflexive, symmetric, and transitive within that set. Examples of such relations include the familiar relation of arithmetical equality in the set of rational numbers, the relation 'is parallel to' in the set of lines in the same plane, and the relation 'was born in the same year as' in the set of all human beings. The particular equivalence relation which will be of interest to us here is the relation 'represents the same linguistic objects as' in the infinite set of all possible finite strings of symbols. Since there is a single very simple system of logical inference and proof for all theories whose statements assert or deny equivalences, it is appropriate to establish this system first before proceeding to any questions of a specifically linguistic character. Having established a metascientific basis for all possible equational theories and all possible equivalence relations, it will then be possible to make productive use of this basis in dealing with the specific properties of equational grammars and the specifically linguistic relation of symbolic equivalence.
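The three defining properties just listed can be checked mechanically on any finite set. The following is only a minimal sketch: the function name `is_equivalence_relation` and the birth years are invented for illustration and are not part of the text.

```python
from itertools import product


def is_equivalence_relation(elements, related):
    """Return True if `related` is reflexive, symmetric, and transitive on `elements`."""
    elements = list(elements)
    reflexive = all(related(x, x) for x in elements)
    symmetric = all(
        related(y, x)
        for x, y in product(elements, repeat=2)
        if related(x, y)
    )
    transitive = all(
        related(x, z)
        for x, y, z in product(elements, repeat=3)
        if related(x, y) and related(y, z)
    )
    return reflexive and symmetric and transitive


# Hypothetical birth years, invented purely for this illustration.
birth_year = {"John": 1940, "Bill": 1940, "Jack": 1951, "Jill": 1952}


def same_birth_year(a, b):
    return birth_year[a] == birth_year[b]


def is_older_than(a, b):
    return birth_year[a] < birth_year[b]


print(is_equivalence_relation(birth_year, same_birth_year))
print(is_equivalence_relation(birth_year, is_older_than))
```

On this data 'was born in the same year as' passes all three checks, whereas 'is older than' already fails reflexivity and so is not an equivalence relation.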

2.1. EQUATIONAL THEORIES

All assertions of equivalence can be expressed as special cases of the standard formula (A = B). Thus, for example, the statements "John was born in the same year as Bill" and "John and Bill were born in the same year" can both be reduced to the synonymous equational statement "The year in which John was born = the year in which Bill was born". All denials of equivalence can similarly be expressed as special cases of the negative equational formula (A ≠ B), which constitutes the standard abbreviation for the statement "It is not the case that A = B". Thus, for example, the statements "Jack doesn't live in the same house as Jill", "It's not the case that Jack lives in the same house as Jill", and "Jack and Jill don't live in the same house" can all be reduced to the synonymous negative equation "The house that Jack lives in ≠ the house that Jill lives in". An infinite set of positive and negative equations of these forms can of course be generated out of any arbitrary set of symbols simply by forming pairs of finite random strings over the given symbol-set and connecting the members of each such pair with either one or the other of the symbols "=" and "≠". The infinite set of equation-sets so generated properly includes a well-defined subset constituting the set of all possible well-formed equational theories. By randomly associating its symbols with verifiable observation statements, it is possible to convert each statement-set of the latter sort into an empirical theory which is contingently falsifiable. Only a very minute subset of these interpreted equational theories will be consistent with any reasonably large or reasonably coherent body of actual data about the real world. And it is only a still more minute subset of these, finally, which will be found to satisfy the conditions of generality and mutual nonreducibility which are required for the qualification of any statement-set as a real empirically-significant explanatory theory.
While the empirical scientist is thus concerned with only a very small portion of the set of possible equation-sets, his questions concerning the predictive and explanatory values of these interpreted theories can be meaningfully raised only in terms of some explicit system of deductive inference and proof. A simple logical system of this sort can be established for all interpreted and uninterpreted equational theories alike by the axiomatic assertion of the defining properties of equivalence relations, the rule of inference that equivalent representations can be substituted for each other, and the condition for proof that any (A = B) is validly proven if and only if some equation between identities, (C = C), can be validly derived from (A = B) by a finite number of valid substitutions. All systems of mutually-consistent equations based on a given vocabulary of symbols can thus be generated by means of the following principles:

(i) Equivalence and non-equivalence are binary relations in the set of all strings over the given vocabulary, such that the set of all pairs of strings is partitioned into disjoint sets of equivalent and non-equivalent pairs; that is, for any set of vocabulary symbols V, where V does not include "=" or "≠", and any finite strings X and Y out of V, the set of statements E is a consistent equational system for V only if E includes either the statement X = Y or the statement X ≠ Y, but not both.

(ii) Equivalence is reflexive; that is, for any string X out of V, the set of statements E includes the statement X = X.

(iii) Equivalence and non-equivalence are symmetric; that is, for any strings X and Y, the statement X = Y is included in E if and only if the statement Y = X is also included in E, and X ≠ Y is included in E if and only if Y ≠ X is also included in E.

(iv) Equivalence is transitive; that is, for any strings X, Y, and Z, if X = Y and Y = Z are included in E, then X = Z is also included in E.

Since there are an infinite number of distinct strings that can be formed out of any given set of symbols, there will be an infinite number of pairs of strings, and hence an infinite number of statements in any consistent equational system for any vocabulary.
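Principles (i)-(iv) can be illustrated on a finite sample of strings by closing a few given equations under reflexivity, symmetry, and transitivity, so that every pair then falls on exactly one side of the partition. The sketch below uses a union-find structure, which is our choice of technique and not anything proposed in the text. The names `equivalence_classes` and `relation` and the sample strings are hypothetical. The substitution of equals inside longer strings is deliberately left out of this sketch.

```python
def equivalence_classes(strings, equations):
    """Partition `strings` into classes using the given (x, y) equations.

    Every string named in `equations` must also occur in `strings`.
    """
    parent = {s: s for s in strings}

    def find(s):
        # Follow parent links to the class representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for x, y in equations:
        parent[find(x)] = find(y)  # merge the two classes

    classes = {}
    for s in strings:
        classes.setdefault(find(s), set()).add(s)
    return list(classes.values())


def relation(x, y, classes):
    """Return "=" or "≠" for the pair (x, y); exactly one always holds."""
    same = any(x in c and y in c for c in classes)
    return "=" if same else "≠"


# Illustrative data only.
strings = ["AD", "BD", "BC", "AC"]
equations = [("AD", "BD"), ("BD", "BC")]
classes = equivalence_classes(strings, equations)

print(relation("AD", "BC", classes))  # equivalent by transitivity
print(relation("AC", "AD", classes))  # no chain of equations links them
```

Because each string ends up in exactly one class, no pair can be assigned both "=" and "≠", which is the consistency requirement of principle (i).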
There will also be an infinite number of such systems for each set of vocabulary symbols, since for each distinct pair of strings, X and Y, out of the infinite set of strings over the given vocabulary set, there will be one set of equational systems whose members each include the statement X = Y and another distinct set whose members each include the statement X ≠ Y; each of these disjoint sets will be infinitely large. Any consistent equational system can be converted into a deductive system in which each statement stands in a relation of either derivability or non-derivability with respect to each other statement. Such conversion is effected simply by providing a formal definition of a valid substitution of one equation for another, with derivability then being defined by the existence of a sequence, or chain, of valid substitutions between two equations. Thus, let us say that one equation can be validly substituted for another if and only if the first equation is of the form (XAY r XCY) and the second is of the form (XAY r XBY), where r is either "=" or "≠", and the system which includes these equations also includes the equation (B = C). This is of course merely an extension of the familiar principle that if equals are substituted for equals the results are equal, i.e. that the meaning, or at least the inferential value and truth value, of statements is invariant under the operation of equal substitution. A statement which can be validly substituted for another statement can also be said to be DIRECTLY DERIVABLE from that statement. Thus, for example, given the statement (3 + 1 = 4), we would say that the statements (7 ≠ 5 + 4 + 2) and (7 ≠ 5 + 3 + 1 + 2) are validly substitutable for each other, or, equivalently, that each statement is directly derivable from the other. The operation of equal substitution is normally understood as defining an equivalence relation in the set of statements of any consistent system of equations.
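The definition of valid substitution can be made concrete with a small sketch. It rests on assumptions of ours and not of the text: equations are modelled as (left, relation, right) triples over plain character strings, a substitution replaces one occurrence of one side of an axiom by the other side, and the names `one_step_rewrites` and `directly_derivable` are hypothetical. Substitution on the left-hand side is allowed as well as on the right, by symmetry.

```python
def one_step_rewrites(s, axioms):
    """All strings obtained from `s` by replacing ONE occurrence of one side
    of an axiom (left, right) with the other side. Axiom sides are assumed
    to be non-empty strings."""
    results = set()
    for left, right in axioms:
        for old, new in ((left, right), (right, left)):
            start = s.find(old)
            while start != -1:
                results.add(s[:start] + new + s[start + len(old):])
                start = s.find(old, start + 1)
    return results


def directly_derivable(eq1, eq2, axioms):
    """True if eq2 can be validly substituted for eq1 in a single step."""
    if eq1 == eq2:
        return True  # every statement is substitutable for itself
    lhs1, rel1, rhs1 = eq1
    lhs2, rel2, rhs2 = eq2
    if rel1 != rel2:
        return False  # substitution never changes "=" into "≠" or vice versa
    if lhs1 == lhs2 and rhs2 in one_step_rewrites(rhs1, axioms):
        return True
    if rhs1 == rhs2 and lhs2 in one_step_rewrites(lhs1, axioms):
        return True
    return False


# The arithmetic example from the text, written without spaces.
axioms = [("3+1", "4")]
eq_a = ("7", "≠", "5+4+2")
eq_b = ("7", "≠", "5+3+1+2")

print(directly_derivable(eq_a, eq_b, axioms))
print(directly_derivable(eq_b, eq_a, axioms))
```

Treating the equations as raw character strings is a simplification: a numeral such as "14" would also match the symbol "4", so a fuller treatment would first split each side into tokens.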
Thus the relation is binary (each statement is either substitutable or non-substitutable for each other statement), reflexive (each statement can be validly substituted for itself), symmetric (if Statement A can be substituted for Statement B, then Statement B can be substituted for Statement A), and transitive (if Statement A can be substituted for Statement B, and B can be substituted for Statement C, then A can be substituted for C). If substitutability is assumed to be transitive, though, it can
no longer be equated with the relation of DIRECT derivability, since the latter relation, though binary, reflexive, and symmetric, is clearly nontransitive.¹ There is, on the other hand, a relation of INDIRECT DERIVABILITY, which is transitive and which can be defined in terms of the relation of direct derivability. Thus we can say that Statement A is indirectly derivable from Statement B if and only if there is a set of one or more statements not identical to A or B, such that A is directly derivable from some member of that set, and some member of the set is directly derivable from B, and each member of the set is either directly derivable from B or else directly derivable from some other member of the set. The union of the relations of direct and indirect derivability yields an equivalence relation which will be called simply the relation of DERIVABILITY. Thus, we will say that Statement A is (directly or indirectly) DERIVABLE FROM Statement B if and only if there is a set of statements which can be placed in a strictly ordered sequence beginning with Statement B and ending with Statement A, such that each statement in the sequence is directly derivable from, or validly substitutable for, the statement which immediately precedes it.

¹ The inference "If A is directly derivable from B, and B is directly derivable from C, then A is directly derivable from C" is false, in fact, for all cases except the special case where A and C are identical.

Since truth-value is invariant under the operation of equal substitution, all statements that are derivable from a true statement are true, and all statements that are derivable from a false statement are false. This means that it will always be possible to infer, or predict, the truth-value of certain statements from the given truth-values of certain other statements of the same system. But it is clearly not only the truth-value of a derivable statement that is predictable, but also all of its other properties, including the property of existence as a statement of a particular system of equations. It will thus always be possible to enumerate the members of any such system without listing, or axiomatically specifying, all of the members of the system. In other words, given the principle of equal substitution, it will always be possible to predict the existence of Statement X as a member of System Y from any PARTIAL SPECIFICATION of Y that includes some set of statements from which X is derivable. We may call any such partial specification of an equational system an AXIOMATIC BASIS for that system. In general, for any given system, there will be an infinite number of distinct axiomatic bases, since there will always be an infinite number of pairs of statements that are derivable from each other, and hence an infinite number of distinct sets formed by the selection of one member of each such pair. It is also evident, therefore, that at least some of the axiomatic bases for any given equational system will be infinitely large. However, since we are interested in equational systems here solely because of their utility in generating scientific explanations of empirical data, we will be concerned henceforth only with those systems which have at least one axiomatic basis that is finite and a vocabulary of element-types which is also finite. The conjunction of any finite set of elements and any finite set of equational statements expressed in terms of those elements determines a distinct EQUATIONAL THEORY from which all of the members of some infinite system of equations can be deductively derived in accordance with the principle of equal substitution. Any such theory can be converted into a formal axiomatic deductive theory by the addition to its axiomatic basis of statements asserting the principles of reflexivity (X = X), symmetry (If (X = Y), then (Y = X)), and equal substitution (If (A = B), then (XAY = XBY)), along with a statement asserting that, if (X = Y) or (X ≠ Y), then X and Y are equivalent to strings out of a given finite set of vocabulary elements. Any such theory can also be converted into an EMPIRICAL THEORY by assigning non-null extratheoretically-verifiable observation statements to at least some of its elements and some of the formal properties and relations of its statements.
The class of interpreted equational theories of this sort whose associated observation statements make assertions about the properties and relations of words, sentences, and discourses of natural languages constitutes a formally well-defined subset of
the set of all possible empirical theories of language. It will be argued here that there are no significant facts about natural languages which are not optimally explainable by some member or members of this class of interpreted equational theories. It will also be suggested that there are some such facts which can apparently be assigned principled explanations ONLY by means of theories of this class. The restricted formal properties and rules of inference which determine the abstract class of equational theories will thus be capable of contributing to the empirical function of delimiting the set of possible natural languages through the empirical hypothesis that the grammar of every such language must be expressed as an equational theory. To test this hypothesis, it is necessary to have an effective procedure for determining the set of valid theorems which follow from any given equational theory of language, and for associating each such theorem with some non-null set of potentially verifiable observation statements about the properties and relations of linguistic objects. The characterization of well-formed equational theorems will be dealt with in the remainder of the present chapter; the linguistically-relevant interpretation of such theorems will be taken up then in Chapter 3.

2.2. PROVING EQUIVALENCE THEOREMS

Given the reflexivity of equivalence relations, which is specified for all equational theories by the idempotency axiom (X = X), it follows by the principle of equal substitution that any statement of the form (A = B) will be a valid theorem of some equational theory if and only if (A = B) stands in a derivability relation with respect to some statement of the form (X = X). Thus, since (X = X) is taken to be true for all values of X in all possible equational systems, and since truth value is invariant under equal substitution, it follows that all statements that are in a derivability relation with respect to any equation between identities will necessarily be true statements. Conversely, if (A = B) is true, then it must also be the case, by the principle of equal substitution, that (A = A) is also true, and hence that every true assertion of equivalence is in a derivability relation with respect to at least one equation between identities. The set of true equivalence statements of an equational theory is thus precisely identical to the set of statements which are in a derivability relation with respect to idempotency equations of the form (X = X) in accordance with the principle of equal substitution and the given axioms of that theory. Those statements which are in a DIRECT derivability relation to idempotency equations are the EQUIVALENCE AXIOMS of the theory, and those which are in an INDIRECT derivability relation to such equations are its EQUIVALENCE THEOREMS.

A statement of the form (A = B) is VALID, or PROVABLE, with respect to a given theory if and only if there is a finite sequence of equations flanked by (A = B) and (C = D) such that (C = D) is an axiom of the theory and, for each pair of adjacent equations in the sequence, those equations are in a relation of direct derivability. We may call any such sequence a PROOF of the equivalence of A and B with respect to the given theory. Since truth-value is invariant under equal substitution, a provable equation will be true if and only if each equation in its proof is true. In order for an equivalence statement to make a truth claim, therefore, it must have a proof whose axiomatic flanking equation is true. Since the idempotency equation (X = X) is taken to be a true axiom for all possible equational theories, every true equivalence theorem of every equational theory will have a proof with an axiomatic flanking equation of this form, and there will be no false assertion of equivalence (or, equivalently, no true assertion of non-equivalence) which has a proof of this form. The relation of derivability with respect to idempotency equations can thus serve as the basis of an effective procedure for the enumeration of the truth-bearing proofs of equivalence which are determined by any given equational theory, and hence for the enumeration of all true equivalence theorems of that theory. Thus, since every equivalence theorem that is provable has
a proof which terminates with an idempotency equation, the set of equivalence theorems of any theory can always be determined from the set of valid idempotency-flanked proofs that are justified by that theory. An effective procedure for the characterization of such proofs must take into account not only the requirement that each pair of adjacent equations in a valid proof must be directly derivable from each other by equal substitution in accordance with the equivalence axioms of the given theory, but also the equally necessary requirement that the proof be finite, i.e. that it consist of a finite sequence of equations between finite strings of symbols. Given a set of axioms, there is no difficulty involved in the formulation of algorithms to determine whether or not a relation of direct derivability holds between two equations, provided that it is known that the set of axioms is finite and that the equations to be examined are of finite length. If it is also known that every sequence of equations to be examined consists of a finite number of adjacent equations, then the procedure will always terminate after a finite number of steps, both for sequences that constitute valid proofs and for those that do not. Termination with a decision of non-validity will occur if and only if a pair of adjacent equations is found in the given sequence such that one is of the form (W = XAY) and the other of the form (W = XBY) and there is no axiom in the given theory which asserts the equivalence of A and B. Termination with a decision of validity will occur if and only if each pair of adjacent equations in the sequence has been examined and there has been no decision of non-validity. Thus there is a decision procedure for partitioning any set of finite sequences of finite equations into a set of sequences which are valid proofs of equivalence of any given theory and a set of sequences which are not valid proofs of that theory. 
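The decision procedure just described, which examines each adjacent pair of a finite sequence and then its final equation, might be sketched as follows. The representation is an assumption of ours: an equation is a (left, right) pair of strings, and the names `rewrites`, `adjacent_ok`, and `is_valid_equivalence_proof` are hypothetical. The two axioms (A = B) and (C = D) are the ones the text itself uses for illustration later in this section.

```python
def rewrites(s, axioms):
    """Strings reachable from `s` by one substitution licensed by an axiom."""
    out = set()
    for left, right in axioms:
        for old, new in ((left, right), (right, left)):
            i = s.find(old)
            while i != -1:
                out.add(s[:i] + new + s[i + len(old):])
                i = s.find(old, i + 1)
    return out


def adjacent_ok(eq1, eq2, axioms):
    """Direct derivability: one side unchanged, the other rewritten once."""
    (l1, r1), (l2, r2) = eq1, eq2
    return (l1 == l2 and r2 in rewrites(r1, axioms)) or (
        r1 == r2 and l2 in rewrites(l1, axioms)
    )


def is_valid_equivalence_proof(sequence, axioms):
    """Decide whether a finite sequence of equations is a valid proof:
    every adjacent pair is directly derivable, and the last equation
    is an idempotency equation of the form (X = X)."""
    if not sequence:
        return False
    for eq1, eq2 in zip(sequence, sequence[1:]):
        if not adjacent_ok(eq1, eq2, axioms):
            return False  # decision of non-validity
    last_left, last_right = sequence[-1]
    return last_left == last_right


axioms = [("A", "B"), ("C", "D")]
short_proof = [("AD", "BD"), ("AD", "AD")]
non_proof = [("AD", "BD"), ("AC", "BD"), ("AC", "AD")]
long_proof = non_proof + [("AC", "AC")]

for candidate in (short_proof, non_proof, long_proof):
    print(is_valid_equivalence_proof(candidate, axioms))
```

The check always halts, because it makes one pass over a finite sequence and each adjacency test inspects only finitely many rewrites of finite strings.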
While a theory's valid equivalence PROOFS thus constitute a recursive, or decidable, subset of the set of all finite sequences of finite equations between strings over the alphabet of that theory, it is important to note that the equivalence THEOREMS of a theory, which are simply those single statements which occur as the non-idempotent terminus of at least one valid proof of that theory, do NOT constitute a recursive set, but only a recursively enumerable one. That is, there is an algorithm for determining that a statement is a theorem, but there is no possible algorithm for determining that it is not a theorem. This is due to the fact that if (A = B) is a theorem it has at least one valid proof, and it will always be possible to discover such a proof from an examination of a FINITE (but possibly very large) subset of the set of all possible finite equation sequences over the given vocabulary of a theory; if (A = B) is not a theorem, on the other hand, a mechanical determination of its non-theoremhood would be possible only if it were possible to examine ALL the valid equivalence proofs of a theory, since the discovery of any n equation sequences which do not constitute valid equivalence proofs terminating in (A = B) obviously does not prove that there is no n + 1st sequence which does constitute such a proof. But it is logically impossible to examine all of the equation sequences or valid equivalence proofs of a theory since there are an infinite number of these. It follows then that there is an effective procedure for enumerating the equivalence proofs, non-proofs, and theorems of any equational theory, but not its non-theorems. To say that (A = B) is not a valid equivalence theorem of some theory is of course the same as saying that (A ≠ B) is a valid non-equivalence theorem of that theory. This means that a theory's non-equivalence theorems cannot be formally characterized in the same sense as its equivalence theorems can. It will also be evident that the notion 'proof of non-equivalence' can be made explicit only with respect to a restricted class of equational theories which allow for the formal differentiation of all strings of symbols into a class of terminal representations and a class of non-terminal ones.
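The asymmetry between theorems and non-theorems can be seen in a bounded search. The sketch below is an illustration under our own assumptions, not a procedure given in the text: it explores equations breadth-first from (A = B) until it reaches an idempotency equation or exhausts a budget. The names `search_for_proof` and `max_equations` are hypothetical.

```python
from collections import deque


def rewrites(s, axioms):
    """Strings reachable from `s` by one substitution licensed by an axiom."""
    out = set()
    for left, right in axioms:
        for old, new in ((left, right), (right, left)):
            i = s.find(old)
            while i != -1:
                out.add(s[:i] + new + s[i + len(old):])
                i = s.find(old, i + 1)
    return sorted(out)


def search_for_proof(lhs, rhs, axioms, max_equations=10_000):
    """Breadth-first search from (lhs = rhs) toward some (X = X).

    Returns a proof (list of equations) if one is found. Returns None if
    the budget runs out or the reachable equations are exhausted; when the
    budget runs out, None does NOT show that no proof exists.
    """
    start = (lhs, rhs)
    parent = {start: None}
    queue = deque([start])
    while queue and len(parent) <= max_equations:
        eq = queue.popleft()
        left, right = eq
        if left == right:
            proof = []
            while eq is not None:  # walk back to the starting equation
                proof.append(eq)
                eq = parent[eq]
            return proof[::-1]
        neighbours = [(l2, right) for l2 in rewrites(left, axioms)]
        neighbours += [(left, r2) for r2 in rewrites(right, axioms)]
        for nxt in neighbours:
            if nxt not in parent:
                parent[nxt] = eq
                queue.append(nxt)
    return None


axioms = [("A", "B"), ("C", "D")]
print(search_for_proof("AD", "BD", axioms))

# With a length-increasing axiom the search space is infinite, so the
# search can only give up; giving up is not a proof of non-equivalence.
print(search_for_proof("A", "C", [("A", "AA")], max_equations=50))
```

A returned proof settles theoremhood once and for all. A `None` that results from an exhausted budget settles nothing, which is the practical face of the statement that the theorems are recursively enumerable but not recursive.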
Before turning to the matter of non-equivalence proofs, though, there is one further point that should be noted with respect to the characterization of proofs of equivalence. This concerns the distinction between the GENERATION of proofs and non-proofs and their deterministic or non-deterministic CONSTRUCTION. The members of any set can be algorithmically enumerated,
or generated, either by specifying principles for finitely CONSTRUCTING each member of the set and no non-members, or else by specifying principles for finitely RECOGNIZING each member of the set among the members of any arbitrary set that properly includes it. Thus, for example, the set of all triangles in a plane could be correctly generated either by the constructional principle "Construct a triangle by selecting any three points not lying on the same straight line and connecting each point to each of the others by a straight line", or by the recognition principle "Recognize as a triangle any figure in the plane which has straight lines as sides and three interior angles". Any set which can be generated by one of these methods can also be generated by the other. Thus, parallel to the recognition algorithm outlined above for the enumeration of the equivalence proofs, non-proofs, and theorems (but not non-theorems) of any equational theory, there is a constructional algorithm that generates exactly the same sets. Thus, for example, P is a valid equivalence proof of some theory if and only if there is a finite sequence of equations P', such that P' is identical to P, and P' is constructed by (1) forming a finite equation of the form (X = X) out of the vocabulary of the given theory, (2) forming a second equation by the operation of equal substitution on the preceding equation in accordance with some axiom of the theory, and (3) repeating step (2) a finite number of times. The set of non-idempotent terminal equations of the sequences constructed in this fashion constitute the set of valid equivalence theorems of the theory, and any object whose image cannot be constructed in this way is a non-proof with respect to that theory. 
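Steps (1)-(3) of the constructional algorithm can be sketched as follows. For the sake of a reproducible illustration the sketch applies every licensed substitution at each step instead of choosing among them; that systematic strategy is our assumption, as are the names `construct_theorems`, `seed`, and `steps`.

```python
def rewrites(s, axioms):
    """Strings reachable from `s` by one substitution licensed by an axiom."""
    out = set()
    for left, right in axioms:
        for old, new in ((left, right), (right, left)):
            i = s.find(old)
            while i != -1:
                out.add(s[:i] + new + s[i + len(old):])
                i = s.find(old, i + 1)
    return out


def construct_theorems(seed, axioms, steps):
    """Step (1): form (seed = seed). Steps (2)-(3): apply equal substitution
    up to `steps` times. Returns the non-idempotent equations reached."""
    frontier = {(seed, seed)}
    reached = set(frontier)
    for _ in range(steps):
        next_frontier = set()
        for left, right in frontier:
            for l2 in rewrites(left, axioms):
                next_frontier.add((l2, right))
            for r2 in rewrites(right, axioms):
                next_frontier.add((left, r2))
        frontier = next_frontier - reached  # keep only newly built equations
        reached |= frontier
    return {(left, right) for left, right in reached if left != right}


axioms = [("A", "B"), ("C", "D")]
print(sorted(construct_theorems("AD", axioms, steps=1)))
print(sorted(construct_theorems("AD", axioms, steps=2)))
```

Every equation this procedure returns is a theorem, since it is linked to (seed = seed) by a chain of valid substitutions. The absence of an equation after any finite number of steps shows only that no proof of that length from that seed exists.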
Similarly, just as there is no recognition algorithm for non-equivalence theorems, there is also no possible constructional algorithm for the enumeration of any such set, there being no finite number of failures to construct a valid proof terminating in (A = B) which can ever suffice to demonstrate the non-existence of such a proof. The fact that there is a constructional algorithm for equivalence proofs and theorems clearly does not imply that there is any strictly deterministic, or entirely non-random, procedure for the derivation of proofs from theorems. In fact, no such procedure is possible, since there is no way of knowing in advance of the construction of a complete sequence of equations whether a particular justifiable operation of equal substitution will or will not contribute to the construction of a sequence which constitutes a valid proof of equivalence. Thus, for example, consider the theory (1), the theorem (2), and the equation sequences (3) and (4).

(1) (i) (A = B)
    (ii) (C = D)
(2) (AD = BD)
(3) (AD = BD), (AD = AD)
(4) (AD = BD), (AC = BD), (AC = AD)

Both of the sequences (3) and (4) can be constructed from the given theorem by operations of equal substitution justified by the axioms of the given theory, but only (3) counts as a valid equivalence proof of this theory, since (3) but not (4) is a sequence of directly derivable equations flanked by an idempotency equation. But, given only (1) and (2), there is clearly no possible principle which could determine that (1i) rather than (1ii) should be used for the construction of the second equation in the proof of (2), since any principle which prevented construction of the non-proof (4) would necessarily also prevent the construction of an infinite number of valid proofs of (2) which, like (5), include this non-proof as a proper part.

(5) (AD = BD), (AC = BD), (AC = AD), (AC = AC)
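The difference between the proofs and the non-proof in this example can be checked mechanically by a recognition procedure. The following Python sketch is offered only as an illustration: it assumes that representations are plain strings of one-character symbols and that a proof is a list of (left, right) pairs, and the function names are hypothetical.

```python
def one_substitution(s, t, axioms):
    """True if t results from s by replacing one occurrence of one
    side of some axiom with the other side."""
    for x, z in axioms:
        for old, new in ((x, z), (z, x)):
            start = s.find(old)
            while start != -1:
                if s[:start] + new + s[start + len(old):] == t:
                    return True
                start = s.find(old, start + 1)
    return False


def directly_derivable(eq1, eq2, axioms):
    """True if eq2 differs from eq1 by one equal substitution in one member."""
    (l1, r1), (l2, r2) = eq1, eq2
    return ((r1 == r2 and one_substitution(l1, l2, axioms)) or
            (l1 == l2 and one_substitution(r1, r2, axioms)))


def is_equivalence_proof(sequence, axioms):
    """Recognize a sequence flanked by a non-idempotency equation and an
    idempotency equation, with adjacent equations directly derivable."""
    if len(sequence) < 2:
        return False
    first, last = sequence[0], sequence[-1]
    if first[0] == first[1] or last[0] != last[1]:
        return False
    return all(directly_derivable(a, b, axioms)
               for a, b in zip(sequence, sequence[1:]))


AXIOMS_1 = [("A", "B"), ("C", "D")]                   # theory (1)
SEQ_3 = [("AD", "BD"), ("AD", "AD")]                  # sequence (3)
SEQ_4 = [("AD", "BD"), ("AC", "BD"), ("AC", "AD")]    # sequence (4)
SEQ_5 = SEQ_4 + [("AC", "AC")]                        # sequence (5)
```

Under these assumptions the recognizer accepts (3) and (5) and rejects (4), whose last equation is not an idempotency equation. Note that the recognizer only inspects a finished sequence; it offers no guidance on which substitution to try next.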

Proofs of theorems can thus be mechanically discovered by construction only in a random fashion: that is, by randomly choosing between the use and non-use of each justified substitution operation in the construction of each subsequent equation in the constructed sequence and then checking the complete sequence to determine whether the last equation of the sequence is or is not an idempotency equation. Since the recognition algorithms for proofs also presuppose an arbitrary input of random
objects to be inspected, this means in essence that proofs can be discovered only by trial and error — a fact which undoubtedly plays a large part in making both mathematics and empirical science the perpetually fascinating endeavors which they are. One final point should also be noted with respect to recognition and constructional algorithms. This is that, while the two types of enumeration procedure are precisely equivalent in generative power, the simplest construction procedure for enumerating the members of any given set will always be conceptually and operationally more complex than the simplest recognition procedure for that set. This follows from the fact that every constructional algorithm properly includes a recognition algorithm, i.e. a set of principles for determining whether a particular given object is or is not identical to a particular constructed object. Recognition algorithms, on the other hand, need not include any constructional principles at all, since they determine whether or not a given object is a member of a given set simply by inspecting that object to determine whether or not it satisfies some given set of conditions. We will thus assume that the actual metatheoretical definition of well-formed equivalence proofs consists simply of an assertion of the conditions that any finite object must satisfy to qualify as a member of the set of equivalence proofs of some theory : namely, that it be a sequence of equations flanked by an idempotency equation and a non-idempotency equation, where each pair of adjacent equations stand in a relation of direct derivability in accordance with the axioms of a given theory and the principle of equal substitution.

2.3. PROVING NON-EQUIVALENCE THEOREMS

Non-equivalence is non-transitive and thus not an equivalence relation. Thus, while it is possible to infer the truth of (A = C) from the truth of (A = B) and (B = C), it is not possible to infer anything at all about the truth value of (A ≠ C) from the truth of (A ≠ B) and (B ≠ C). For example, given that (1) and (2) are true, we clearly cannot conclude from this that (3) is also necessarily true.

(1) biting tigers ≠ biting elephants
(2) biting elephants ≠ tigers that bite
(3) biting tigers ≠ tigers that bite

With respect to equational systems, both the transitivity of equivalence and the contrasting non-transitivity of non-equivalence follow from the fact that the one and only rule of inference for such systems is the principle of equal substitution. This principle thus precludes the derivation of (3) from (1) and (2), while properly justifying derivations such as that of (6) from (4) and (5).

(4) biting tigers ≠ biting elephants
(5) biting elephants = elephants that bite
(6) biting tigers ≠ elephants that bite

The fact that non-equivalence is non-transitive means that non-equivalence statements are not provable in the same way as equivalence statements. This also follows from the fact that (A ≠ B) is true for a given theory if and only if (A = B) is not a valid equivalence theorem of that theory, a condition which has already been seen to be algorithmically undeterminable. It is also clearly not the case that every non-proof of equivalence is a proof of non-equivalence, unless the notion of proof is extended in a quite unnatural way to include objects which are inconsistent with the vocabulary and equational axioms of the given theory. A natural formal characterization of proofs of non-equivalence is thus impossible with respect to the class of all possible equational theories. However, there is a well-defined subset of such theories, which appears to include all those which are of potential use as interpretable empirical theories, with respect to which proofs of non-equivalence can be algorithmically enumerated in a simple and quite natural way. These are the theories whose strings of symbols can be formally partitioned into a set
of primitive, or terminal, representations (which are interpreted, in the case of empirical theories), and a set of non-primitive, or non-terminal, representations (which are not subject to interpretation, in the case of empirical theories). Thus, for example, consider a theory of arithmetical addition whose vocabulary consists of the integers from 1 to 9 and the symbol '+' and whose axioms include the equations (1 + 1 = 2), (2 + 1 = 3), (3 + 1 = 4), etc. Without distinguishing a set of terminal representations of this theory, it is clearly impossible to determine whether a statement such as (2 ≠ 4) is one of its valid theorems or not. This is because (2 ≠ 4) is valid if and only if (2 = 4) is not a valid equivalence theorem of the theory, that is, if and only if there is no valid equivalence proof of the theory that is flanked by (2 = 4). But in order to be certain that there is no possible proof of this form, it would be necessary to inspect every possible sequence of equations formed out of the vocabulary of the theory. But, since there are an infinite number of such sequences, it is impossible to inspect every one of them. It is thus impossible to determine whether an assertion of non-equivalence does or does not follow from the axioms of any given theory, even when, as in the present case, it is intuitively quite clear that it does follow. This highly unsatisfactory indeterminacy can be readily eliminated as soon as the notion of terminal representation is introduced and formally defined for the theories in question. The representations of any equational theory can be partitioned into two subsets by specifying a proper subset of the vocabulary elements of the theory, which may be called its primitive, or terminal, elements, and by stipulating that a string of symbols is a terminal representation of that theory if and only if it consists solely of tokens out of this set of terminal elements.
All other strings can then be called non-terminal representations, that is, representations which include at least one symbol which is not a member of the specified terminal vocabulary of the theory. For empirical theories, of course, some such partitioning of vocabulary and representations is independently necessary, since it is never
the case, for non-trivial theories, that every element of the theory can be mapped into a non-null observation statement, or that every valid theorem can be understood as making a possibly verifiable empirical claim. For such theories, therefore, it is necessary that there be an effective procedure for selecting out of its infinite set of representations and theorems precisely those which are appropriately interpretable. For most empirical theories, each object in the domain of the theory has a single mode of interpretation, and interpretability can be equated with terminality, as determined by the specification of a single set of elements which are each uniquely interpreted into an observation statement about some well-defined class of physical states or events. For other empirical theories, most notably those about natural languages, the objects in the domain of the theory have two distinct modes of interpretation, where the states or events in each mode are mutually exclusive in all of their observable attributes. Theories about objects with three or more modes of interpretation are logically possible too, although it appears that no actual theories of this sort have thus far been proposed. In any event, for any empirical theory with n modes of interpretation, its representations will be necessarily and effectively partitioned into n + 1 distinct subsets by the formal criterion that a representation is interpretable in a given mode if and only if each and every one of its constituent non-null elements is interpretable into a non-null observation statement about states or events in that mode. Thus, for example, in a theory about linguistic objects, every representation of the theory will be formally characterized as either (1) an INTERPRETABLE SEMANTIC REPRESENTATION, i.e. one whose elements are all interpretable into observation statements about privately-perceptible cognitive states or events; or (2) an INTERPRETABLE PHONETIC REPRESENTATION, i.e.
one whose elements are all interpretable into observation statements about neuromuscular states or events effecting articulations, gestures, or other publicly-perceptible events, or (3) an UNINTERPRETABLE REPRESENTATION, i.e. one whose elements are neither all semantically interpretable nor all phonetically interpretable.

Since states of the universe are equivalent as states of the universe only if they are precisely IDENTICAL, and since there is no possible reason for an empirical theory to posit differences in its interpretable representations which are not correlated with differences in their interpretations, it follows that interpretable representations in the same mode are also equivalent only if they are identical. This means then that for any possible equational theory, if X and Y are non-identical interpretable representations of that theory, then X is not equivalent to Y. Since there is a very simple algorithm for determining the identity or non-identity of any pair of finite strings of symbols, and since the equivalence or non-equivalence of representations is invariant under equal substitution, it follows that there is also an algorithm for the enumeration of all the valid non-equivalence proofs and theorems of any possible interpreted equational theory. Thus, a valid non-equivalence proof is any sequence of one or more statements flanked by a statement of the form (X ≠ Y), where X and Y are non-identical interpretable representations in the same mode, and where each pair of adjacent statements in the sequence stand in a relation of direct derivability in accordance with the principle of equal substitution and the axioms of some given theory. The formal characterization of non-equivalence proofs and theorems can thus be effected in a manner which constitutes an entirely natural counterpart of the characterization which has been provided for equivalence proofs and theorems. Thus where the equivalence of A and B is determined by the derivability relation between (A = B) and the universally true statement that identical objects are equivalent, the non-equivalence of A and B is determined by the derivability relation between (A ≠ B) and the universally true statement that NON-IDENTICAL objects OF THE SAME TYPE are always NON-EQUIVALENT.
For empirical theories, terminality and interpretability are equivalent notions, and the necessary and sufficient conditions for the generation of non-equivalence proofs and theorems are automatically satisfied as a consequence of the fact that some but not all of the elements of such theories are assigned non-null
empirical interpretations. For non-empirical theories, on the other hand, the notion of terminality is necessarily primitive. However, the set of terminal representations of any such theory can be formally defined in a manner which is just as precise as the interpretability definition for empirical theories, and which is just as independent of the notion of non-equivalence. Thus for any given theory with vocabulary V, R is a TERMINAL REPRESENTATION of that theory if and only if there is a set of elements Vt included in V, such that R is a finite string over Vt, and such that, for any finite representation of the theory, R', there is a string over Vt which is in a derivability relation with respect to R', and such that the latter condition is not satisfied by any subset of V which includes a smaller number of members than Vt. In other words, a terminal representation is a string over the terminal vocabulary of a theory, and the terminal vocabulary of a theory is the smallest set of its elements which is necessary and sufficient for the formation of representations which are identical or equivalent to every member of every possible valid theorem of the theory. For any theory with specified terminality, then, its valid proofs and theorems of non-equivalence can be algorithmically enumerated just like those of empirically interpreted theories. Thus, to return to our uninterpreted theory of arithmetic, there is clearly one and only one set of elements which qualifies as a terminal vocabulary here, namely, that consisting of the two elements "1" and "+". Thus strings of the form 1 + ... + 1 stand in a relation of derivability with respect to all possible strings over the vocabulary of the theory, and there is clearly no other set of two or fewer members of its vocabulary which would be capable of serving this function.
Having determined terminality for this theory, the statement (2 ≠ 4), whose validity was shown to be inherently undeterminable without an explicit characterization of terminality, can now be formally characterized as a valid theorem of the theory by virtue of its occurrence as a flanking equation in sequences such as (7), which are generated as valid proofs of non-equivalence by the cited algorithm for such proofs.

(7) 2 ≠ 4                      (given)
    1 + 1 ≠ 4                  (axiom: 1 + 1 = 2)
    1 + 1 ≠ 3 + 1              (axiom: 3 + 1 = 4)
    1 + 1 ≠ 2 + 1 + 1          (axiom: 2 + 1 = 3)
    1 + 1 ≠ 1 + 1 + 1 + 1      (axiom: 1 + 1 = 2)

The last equation of this sequence is formally characterized as an equation between terminal representations by the fact that each constituent of each of its members is included in the specified terminal vocabulary of the theory. Since the members of this equation are determinably non-identical, and since each pair of adjacent equations in the sequence stand in a direct derivability relation, the sequence constitutes a valid non-equivalence proof, and the flanking equation (2 ≠ 4) is thus a valid non-equivalence theorem of this theory.
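The construction of a non-equivalence proof for the arithmetic theory can be sketched as follows. This is a minimal Python illustration, not a definitive implementation: it assumes representations are written as strings without spaces over the single-digit numerals 1 to 9 and '+', it applies each axiom only in the expanding direction, and the names `to_terminal` and `non_equivalence_proof` are hypothetical.

```python
# Axioms of the toy arithmetic: (1+1 = 2), (2+1 = 3), ..., (8+1 = 9).
AXIOMS = [(f"{n}+1", str(n + 1)) for n in range(1, 9)]
TERMINALS = {"1", "+"}          # the terminal vocabulary of the theory


def is_terminal(s):
    """True if s consists solely of tokens of the terminal vocabulary."""
    return all(ch in TERMINALS for ch in s)


def to_terminal(s):
    """Return the successive strings obtained by replacing the leftmost
    non-terminal numeral with the other side of its axiom."""
    expand = {rhs: lhs for lhs, rhs in AXIOMS}   # e.g. "4" -> "3+1"
    steps = [s]
    while not is_terminal(s):
        i = next(k for k, ch in enumerate(s) if ch not in TERMINALS)
        s = s[:i] + expand[s[i]] + s[i + 1:]
        steps.append(s)
    return steps


def non_equivalence_proof(a, b):
    """Return a sequence of (left, right) pairs, read as left != right,
    ending in non-identical terminal strings, or None if the terminal
    strings turn out to be identical."""
    left_steps = to_terminal(a)
    right_steps = to_terminal(b)
    sequence = [(left, b) for left in left_steps]
    sequence += [(left_steps[-1], right) for right in right_steps[1:]]
    last_left, last_right = sequence[-1]
    return sequence if last_left != last_right else None


proof_7 = non_equivalence_proof("2", "4")
```

For the pair 2 and 4 the sketch reproduces the five lines of (7), ending in the non-identical terminal strings 1+1 and 1+1+1+1. For a pair such as 2 and 1+1 it returns `None`, since the terminal strings coincide and no non-equivalence proof of this form exists.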

2.4. CONSISTENCY AND COMPLETENESS

A theory is inherently self-contradictory, or INCONSISTENT, if there is any statement S such that both S and ~S are included in its set of axioms and valid theorems. A theory is CONSISTENT if it is not inconsistent. Thus, for any consistent equational theory, it must be the case that if (A = B) is an axiom or valid theorem, then (A ≠ B) is not an axiom or valid theorem, and vice versa. A theory is COMPLETE if and only if for any statement S of the theory, either S or ~S is a valid theorem. Thus, if an equational theory is complete, it must be the case that for any pair of representations, A and B, either (A = B) or (A ≠ B) is included in the set of axioms and valid theorems of that theory. From the work of Gödel and others, it appears that the joint satisfaction of the conditions for consistency and completeness is impossible for a large class of mathematically interesting theories. (For discussion, see, for example, Stoll, 1961; and Nagel and Newman, 1956.) For present purposes, however, a much more important consideration is the existence of a class of empirical
theories, including theories about natural language, which can achieve consistency, it seems, only at the cost of a considerable increase in the complexity and unnaturalness of their axioms. This follows from the fact that some observational generalizations are most simply and most naturally accounted for by means of affirmative explanatory laws, while others are most simply and most naturally derived from laws of a negative, or prohibitive, character, where the generation of contradictory predictions about certain objects falling in the domains of both affirmative and negative laws cannot be precluded except by abandoning the negative law and significantly reducing the generality of the affirmative one. Situations thus arise in which the scientist is forced to choose between a theory that is strictly consistent but not maximally revealing and one that sacrifices strict consistency for the sake of greater generality and naturalness of its axioms. This dilemma can be eliminated, however, by permitting axioms of either affirmative or negative form, while restricting the predicate 'is a theorem' to statements that are affirmative. In other words, by restricting the notion of proof for empirical theories to proofs of true statements, it is possible to preserve the consistency of such theories without losing the important explanatory values that can be obtained solely by the use of negative axioms. These gains cannot be achieved without cost, of course, the cost here being the loss of completeness, in the sense that it will no longer be the case for reconciled empirical theories of this sort that for any S either S or ~S will be a formally demonstrable theorem of the theory. This is an extremely small price to pay, however, since completeness is the least interesting, least productive, and practically least attainable goal which might possibly be imagined for human scientific endeavor.
In fact, since such endeavor consists primarily of systematic attempts to falsify theories of a very rudimentary sort, and since such attempts are typically successful in a relatively short time on the basis of a very minute sample of any given theory's provable theorems, it would be highly counterproductive, if not downright perverse, to even raise questions of completeness with respect to any of the actual
empirical theories that a human scientist is ever likely to encounter in this world. Moreover, with respect to equational theories, whose logic is strictly two-valued, it will always be the case that for any given S (i.e. for any given representations A and B) the statement ~S (i.e. (A ≠ B)) will be true if and only if the statement 'It is true that S' (i.e. (A = B)) is not true. Thus even if proofs of non-equivalence are metatheoretically undefined, the truth of any given non-equivalence statement (A ≠ B) will always be implicitly defined by the absence of its contradictory (A = B) from the algorithmically-determinable set of valid equivalence theorems of the theory in question. We will thus be concerned henceforth only with that class of equational theories whose generated proofs are all proofs of equivalence. The equivalence and non-equivalence axioms of such theories jointly determine the relation of derivability in the sets of equivalence statements of each theory, with the equivalence axioms of a theory specifying the necessary conditions for derivability for that theory, and its non-equivalence axioms specifying the non-sufficient conditions for such derivability. The sufficiency of equivalence theorems for the characterization of linguistic properties and relations will be demonstrated in Chapter 3, and the particular functions of the equivalence and non-equivalence axioms that determine the provability of such theorems will be discussed in Chapters 4 and 5.
The restricted class of equational theories that has been delineated here will thus serve as the logical, or metascientific, basis for all subsequent discussion of linguistic theories, and it will be one of the primary purposes of this discussion to show that the set of most highly-valued explanatory theories of language is in fact properly included in the set of restricted equational theories which is formally defined and generated by the following principles:

For any finite vocabulary of symbols V, T is a theory based on V if and only if
(i) T consists of a finite number of non-logical statements such that each statement is of the form (A = B) or (A ≠ B), where A and B are finite strings of symbols out of the set V;
(ii) T incorporates the following set of logical axioms determining its system of inference and proofs:
    (a) For any representation A, (A = A);
    (b) For any pair of non-identical terminal representations in the same mode, A and B, (A ≠ B);
    (c) For any representations A and B, (A = B) if and only if (B = A), and (A ≠ B) if and only if (B ≠ A);
    (d) If (A = B) and (XAY = C), then (XBY = C); or, if (XAY ≠ C), then (XBY ≠ C);
(iii) For any A and B, T does not include both (A = B) and (A ≠ B);
(iv) Every theorem of T is a statement of the form (A = B), where A and B are finite strings out of V, and where there is a finite sequence of statements ((A1 = B1) ... (An = Bn)), such that A1 is identical to A, B1 is identical to B, and An and Bn are identical to each other, and such that for any subsequence ((Ai = Bi), (Aj = Bj)), Ai and Aj are identical, Bi is of the form WXY, Bj is of the form WZY, and there is a statement in T of the form (X = Z).

Unless specifically noted to the contrary, the term 'equational theory' will henceforth be used to refer only to those equational theories which are generated by the preceding principles.

3 GRAMMATICAL THEOREMS

Any grammar of a natural language must be capable of generating an infinite number of empirically-verifiable assertions about the properties and relations of the linguistic objects in its domain. Each of these assertions must be derivable by general principles of interpretation from some set of one or more statements which follow as valid theorems from the axioms and rules of inference of the given grammar and governing theory of grammar. Each of these interpreted theorems thus expresses a testable prediction about the linguistic competence of the speakers of some language, and the set of axioms and rules of inference which justify their proofs thus constitutes an empirical hypothesis about the nature of the system of knowledge which underlies this competence. For each distinct linguistic property or relation, therefore, there must be a distinct formally-defined theorem schema which serves to generate out of the theorems of any given grammar all assertions of the existence of the given property or relation in the domain of that grammar. Thus, for example, since it is true for all languages that certain phonologically-distinct utterances are known by their speakers to have the same meaning, it is necessary for any adequate theory of grammar to specify a distinct theorem schema for synonymy so that it will be possible to derive from any particular grammar a set of explicit claims about the synonymy relations which hold with respect to the particular linguistic objects which that grammar deals with. Similarly, since the knowledge of a language obviously also includes the knowledge that certain utterances of that language have more than
one meaning while others do not, that certain sentences are analytic or self-contradictory while others are not, that certain utterances express implications, entailments, or presuppositions of certain others, etc., there must also be a distinct universally-applicable theorem schema for each of these properties and relations, and for all other empirically-significant attributes of linguistic objects. Each distinct schema thus contributes to the empirically necessary function of generating testable empirical claims out of the statements of theories, which acquire truth-values and explanatory significance solely from the explicit specification of their interpretable theorem schemata and their interpretations. By generating sets of claims about the existence of a given property or relation in the domain of any given theory, each theorem schema also serves as a formal, theory-independent definition of that property or relation. Thus, by specifying a distinct theorem schema for synonymy, a theory of grammar provides an explicit, language-independent characterization of the notion of synonymy. In similar fashion, the theorem schemata which generate claims about ambiguity, analyticity, entailment, etc., serve to formally define each of these properties and relations, and to explain why they can be appropriately predicated of utterances in each and every known instance of a natural language. Nothing can count as a possible theory of language, therefore, unless it explicitly specifies a set of interpreted theorem schemata by means of which it is possible to derive from the valid theorems of any possible grammar that is consistent with that theory an infinite set of verifiable claims about the grammaticality, synonymy, ambiguity, analyticity, etc., of the infinite number of linguistic objects constituting the domain of that grammar.
Thus, since the theorems of equational theories are all statements of the form (X = Y), it is possible to maintain that all natural language grammars can be appropriately expressed as equational theories only if it can be shown that there is a theory of grammar which specifies for each distinct linguistic property or relation a distinct set of one or more statement-types such that all statements of
each type are of the form (X = Y) and such that some statements of each type are included in the set of valid theorems generated by each possible equational grammar. We will attempt to show now that these fundamental conditions for empirical adequacy can be readily satisfied by equational theories of grammar with respect to the central linguistic relation of symbolic equivalence, or sound-meaning association, and such major derivative relations as synonymy, ambiguity, analyticity, and grammaticality. By showing that there are simple and intuitively natural equational characterizations of those properties and relations which have traditionally been of most interest to linguists, where each characterization consists of a set of one or more formally well-defined types of symbolic equivalence statements, we will hopefully provide sufficient reasons for believing that the theory of equational grammar is sufficient for the generation not only of those types of claims which linguists have customarily sought to make about the objects in their domain but also of all the other possible significant empirical claims which might reasonably be made about such objects.

3.1. SYMBOLIC EQUIVALENCE

Clearly, one can be said to know a particular natural language if and only if one knows which sounds and which meanings are symbolically associated for the speakers of that language. This knowledge is formally expressed and accounted for by the set of grammatical and logical principles which are necessary and sufficient for the effective proof of equations between representations of those cognitive and phonetic objects which are symbolically associated for the speakers of some given language. Each of these basic SYMBOLIC EQUIVALENCE theorems can be formulated as a simple sound-meaning equation of the form

(1) (X) = (w),

where (X) is a terminal semantic representation and (w) is a terminal phonetic representation, that is, where (X) is a string of elements and operators which are interpreted into observation statements about mental states or events, and (w) is a string of elements and operators which are interpreted into observation statements about articulatory states or events.1 Given an equational theory of grammar, there is an effective procedure for determining for any arbitrary finite strings of symbols A and B whether (A = B) is a well-formed symbolic equivalence statement and hence whether it is POSSIBLE for (A = B) to be a verifiable symbolic equivalence theorem of any possible grammar of a natural language. In brief, the equated strings constitute a possible basic equivalence theorem if and only if one string consists solely of constants which are interpreted into semantic, or cognitive, observation statements and the other string consists solely of constants which are interpreted into phonetic, or articulatory, observation statements. Any theory of grammar, regardless of its particular equational or non-equational characteristics, must obviously include a specification of a finite set of semantically interpretable elements and a finite set of phonetically interpretable elements, the former being necessary and sufficient for the representation of the possible meanings of all possible linguistic objects, the latter being necessary and sufficient for the representation of the possible sounds of all such objects. Since these sets are both finite and disjoint, it is possible to determine for any arbitrary finite configuration of symbols whether it is or is not a possible terminal semantic or terminal phonetic representation of a linguistic object. Any equation between representations out of different terminal alphabets would thus be effectively characterized as a well-formed basic symbolic equivalence statement. For any given sets of phonetically- and semantically-interpreted elements, therefore, it is possible to generate the set of all possible symbolic equivalence statements based on those vocabulary elements. There are also simple algorithms for generating the set of putative proofs of any such statements, since each of these is simply a finite sequence of equations between finite strings of symbols, where one of the terminal equations in the sequence is a well-formed symbolic equivalence statement and the other is an equation between identical strings of symbols. For any particular grammar, then, there is an effective procedure for selecting out of the set of putative proofs generated by its governing metatheory those which are actual valid proofs of that grammar.

1 Interpretation will always be understood here in the sense of a direct (non-null, context-free, and one-to-one) mapping of a finite set of (unanalyzable) elements specified by some theory of grammar into a set of extratheoretical observation statements about real cognitive and physiological states, processes, or events. For most ordinary purposes, these statements can be adequately viewed as directions for articulation or cognition, e.g., (FRONT) = 'Activate those neuromuscular structures which effect anterior movement of the body of the tongue'; (GREEN) = 'Think about (something) green'. Italics will always be used here to refer to strings of symbols that are free of semantic elements. Where a distinction is clearly relevant, upper case letters will refer to strings that are free of phonetic elements; otherwise upper case will stand for any string of linguistic elements in either mode. It should be emphasized, however, that these notational conventions are strictly expository in character; in actual grammatical statements the variables in phonological rules and those in non-phonological rules will be symbolized in exactly the same way, since it is possible to define 'linguistic representation' in such a way that it is necessarily the case that the X of (...NASAL... X...) is free of semantic elements and the X of (...GREEN... X...) is free of phonetic elements. It should also be noted that in the general system of representation that is employed here, which is essentially that of Sanders (1967), all minimal elements are simplex (e.g. (VOICED), (ANIMATE)) rather than complex (e.g. (+VOICED), (-ANIMATE)), and any representation of the form (-X), or equivalently (~X), is a VARIABLE for any element or string of elements that is not identical to (X). Since they are not constants, (-X) or (~X) will thus never occur in any representation of any PARTICULAR linguistic object.
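The effective procedure for recognizing well-formed symbolic equivalence statements can be illustrated with a short Python sketch. The two element sets below are hypothetical stand-ins built from element names mentioned in the text, a representation is assumed to be a tuple of element names, and the function names are illustrative only.

```python
# Hypothetical, deliberately tiny terminal alphabets; a real theory of
# grammar would specify its own finite and disjoint sets.
SEMANTIC = frozenset({"GREEN", "ANIMATE"})
PHONETIC = frozenset({"FRONT", "VOICED", "NASAL"})


def mode(representation):
    """Classify a non-empty tuple of elements as 'semantic', 'phonetic',
    or 'uninterpretable' (mixed, unknown, or empty)."""
    elements = set(representation)
    if elements and elements <= SEMANTIC:
        return "semantic"
    if elements and elements <= PHONETIC:
        return "phonetic"
    return "uninterpretable"


def is_symbolic_equivalence_statement(a, b):
    """True if one member is a terminal semantic representation and the
    other a terminal phonetic representation."""
    return {mode(a), mode(b)} == {"semantic", "phonetic"}
```

For instance, `is_symbolic_equivalence_statement(("GREEN",), ("FRONT", "VOICED"))` holds under these assumptions, whereas an equation between two semantic strings, or one containing a mixed string, does not qualify. The check decides only well-formedness; whether a well-formed statement is a valid theorem of a particular grammar depends on that grammar's axioms.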
Since the valid symbolic equivalence PROOFS of a grammar determine the set of valid symbolic equivalence THEOREMS of that grammar, which are simply those symbolic equivalence statements which terminate at least one valid proof, it follows that there is also an effective procedure for selecting out of the set of all possible symbolic equivalence statements for all possible grammars precisely those which are included in the set of valid


theorems of any given particular grammar. Since each basic symbolic equivalence statement asserts an equivalence between a particular interpretable semantic representation and a particular interpretable phonetic representation, such a statement can be true with respect to a given language if and only if the set of linguistic objects which extensionally define that language includes an object which has the interpretation of the semantic member of the given equation as its meaning and the interpretation of its phonetic member as its sound. A grammar is empirically false, therefore, if any of its valid theorems makes an assertion of symbolic equivalence that is false for the language which that grammar purports to characterize.2 Two linguistic objects are distinct if and only if they differ in sound, in meaning, or in both sound and meaning. Thus any set of distinct linguistic objects stands in a one to one relation with a set of distinct symbolic equivalence theorems, and any theory that generates valid proofs of those theorems will thereby effectively enumerate that particular set of linguistic objects. The distinct meanings and distinct sounds of a set of linguistic objects similarly stand in one to one relations with the sets of distinct semantic and phonetic members of the symbolic equivalence theorems which characterize those objects.3 The interpreted equivalence theorems of an equational grammar are thus sufficient for the specification of the words, phrases, sentences, and discourses of any possible natural language and for the enumeration of the distinct meanings and pronunciations of these objects. It will be shown now that all other empirically-significant properties and relations of linguistic objects can also be formally characterized in terms of basic symbolic equivalence theorems. It will thus be seen that any theory which is capable of generating proofs of symbolic equivalence will necessarily also be capable of generating proofs of synonymy, ambiguity, analyticity, etc. All linguistic properties and relations, in other words, will be shown to be fully explicable in terms of the primitive relation of equivalence between linguistic representations. Thus, since the axioms and rules of inference that are necessary and sufficient for the generation of correct claims about the association of sounds and meanings in any given language are also necessary and sufficient for the generation of correct claims about all the other verifiable properties and relations of that language, a theory of grammar that explicitly restricts the set of possible theorems of any grammar to a set of simple symbolic equivalence statements will thereby provide a principled explanation for the fact that any human being who knows how to say things and understand things in some natural language will also know what things are synonymous or ambiguous or analytic in that language, what things stand in entailment or presupposition relations, what things are mutually contradictory, etc. By fully characterizing knowledge of a language as knowledge of a set of equivalence and nonequivalence relations between linguistic representations, an equational theory of grammar thus provides a correct and entirely natural characterization of the actual range of linguistic competencies that are invariably attendant upon the knowledge of a natural language. In other words, such theories not only provide an empirically-motivated delimitation of the set of possibly-true statements about natural languages, but also serve as explanatory models of the systems of knowledge and inference which underlie the actual language capacities of the speakers of such languages.

2 While it is possible to KNOW that an empirical theory is false, it is clearly not possible to KNOW that it is true, since non-analytic truth is nondemonstrable. It is nevertheless possible and quite appropriate to BELIEVE, or THINK, that a given theory (say, a grammar) is true at a given time and state of knowledge, the strength of one's beliefs being dependent on the number, range, and rigor of the unsuccessful attempts that have been made to falsify it.

3 The fact that linguistic sounds and meanings are determinable only with respect to other such sounds and meanings is well-known in modern linguistics, and the notion of intrasystemic definition in terms of intersection and non-intersection, has played a central role in all major research in the mainstream tradition of Saussure, Sapir, Trubetskoy, and their followers. It is important to note, though, that the intrasystemic definability of linguistic sounds and meanings is essential to their character and not, as suggested by Bloomfield (1926, 1933), a more or less temporary expedient necessitated by inadequacies in our powers of phonetic or semantic observation. Thus, although we now have exceedingly refined instruments for the recording and measurement of the physical articulatory and acoustic concomitants of particular utterances, the resulting data have had little bearing on linguistic research in general and none at all on the problem of determining which sounds are the 'same' and which are not in given languages. It is not inconceivable that we will eventually have comparable quantitative data concerning the physical cerebral concomitants of utterances, but there is no reason to suppose that these will be any more useful to general linguistics than the presently available acoustic findings.

3.2.

SYNONYMY AND AMBIGUITY

Synonymy, or paraphrase, is a relation between two linguistic objects which have the same meaning but different pronunciations. Ambiguity, or homophony, is a relation between two linguistic objects which have the same pronunciation but different meanings. In other words, to say that pronunciations P₁ and P₂ are synonymous is to say that there are two well-formed linguistic objects (M₁, P₁) and (M₂, P₂) such that M₁ and M₂ are identical cognitive objects and P₁ and P₂ are non-identical articulatory objects. Similarly, to say that a pronunciation P₁ is ambiguous is to say that there are two well-formed linguistic objects (M₁, P₁) and (M₂, P₂) such that P₁ and P₂ are identical articulatory objects and M₁ and M₂ are non-identical cognitive objects. Thus, since any possible linguistic object (M, P) is fully characterized by the symbolic equivalence theorem (M' = P'), where the interpretation of M' describes M and the interpretation of P' describes P, it follows that any possible pair of linguistic objects that are synonymous or ambiguous will necessarily be fully characterized by a pair of symbolic equivalence theorems, with synonymy signified by the identity of their semantic members and ambiguity by the identity of their phonetic members. While it is customary to predicate the terms 'synonymous' and 'ambiguous' only of the phonetic terms of a synonymy or ambiguity relation, it is clear that any assertion that phonetic representations a and b are synonymous, or that c is ambiguous, is


always understood to mean that a and b are synonymous with respect to some particular semantic representation (D), or that c is ambiguous with respect to some non-identical (E) and (F). It is also evident, therefore, that any PROOF of synonymy or ambiguity must necessarily make essential reference to all three terms of the relation distributed over a pair of simple symbolic equivalence equations. Thus synonymy can be formally characterized by a pair of valid symbolic equivalence theorems with a shared semantic member and ambiguity by a pair with a shared phonetic member. In other words these relations can be metatheoretically defined in an entirely natural way by means of the following equational schemata:

(1) a and b are synonymous =def (∃C) [(a = C) and (b = C)]

(2) a is ambiguous =def (∃C) (∃D) [(a = C) and (a = D)]
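The schemata (1) and (2) can be read as simple queries over a set of symbolic equivalence theorems. The following Python sketch is illustrative only: the theorem set, the semantic glosses, and the function names are hypothetical, and the requirement that the paired members be non-identical is taken from the prose definitions of synonymy and ambiguity rather than from the schemata themselves.

```python
from itertools import combinations

# Hypothetical theorems, each a pair (semantic member, phonetic member).
THEOREMS = {
    ("BIRDS FLY", "birds fly"),
    ("BIRDS THAT FLY CAN BE DANGEROUS", "flying birds can be dangerous"),
    ("TO FLY BIRDS CAN BE DANGEROUS", "flying birds can be dangerous"),
    ("BIRDS THAT FLY CAN BE DANGEROUS", "birds that fly can be dangerous"),
}

def synonymous_pairs(theorems):
    """Pairs of distinct phonetic members that share a semantic member."""
    pairs = set()
    for (m1, p1), (m2, p2) in combinations(sorted(theorems), 2):
        if m1 == m2 and p1 != p2:
            pairs.add(frozenset((p1, p2)))
    return pairs

def ambiguous(theorems):
    """Phonetic members that are paired with two distinct semantic members."""
    result = set()
    for (m1, p1), (m2, p2) in combinations(sorted(theorems), 2):
        if p1 == p2 and m1 != m2:
            result.add(p1)
    return result
```

Both functions consult nothing but the theorem set, which is the point of the schemata: no axioms beyond those needed to prove simple symbolic equivalence are required.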

Any grammar which is capable of demonstrating true symbolic equivalence theorems will thus necessarily also be capable of correctly characterizing all instances of synonymy and ambiguity among the objects of its domain. The converse is also true, of course, as is the fact that any false claim of synonymy or ambiguity demonstrates that at least one of the equivalence theorems which define the claim is false and thus that the grammar which justifies their joint proof is false. A variable-free symbolic equivalence theorem is VERIFIED, or empirically TRUE, if and only if in some given community the actual cognitive or semantic object which is the empirical interpretation of its semantic member is in fact a meaning of the actual articulatory, acoustic, or graphic object which is the empirical interpretation of its phonetic member. In other words, each valid equivalence theorem (A = b) defines a pair of empirical objects, A' (the interpretation of A) and b' (the interpretation of b), and claims that for the speakers of the language whose grammar justifies the proof of this theorem A' and b' will be judged to be


symbolically associated as the meaning and expression of some linguistic object. If these objects are judged to be symbolically equivalent, then the given theorem is true and the grammar which justifies its proof is, to this extent, empirically substantiated. If they are judged to be non-equivalent, then the theorem and the grammar that proves it are empirically false. It should be noted, though, that while the verification procedure for symbolic equivalence theorems is entirely straightforward in principle, the actual verification of such theorems is in practice necessarily indirect. This is because neither the significant sound properties nor the significant meaning properties of a linguistic object are determinable except in relation to the sounds and meanings of other linguistic objects. Thus competent native observers obviously cannot determine with absolute certainty that a particular articulatory rendition of the utterance Birds fly actually is or is not the sound of a particular cognitive rendition of the meaning BIRDS FLY; they can nevertheless approach such a determination asymptotically, though, by means of a set of entirely objective observations about the RELATIONS of these renditions to the possible renditions of the sounds and meanings of OTHER linguistic objects — observations, for example, that the meaning of Birds fly is NOT IDENTICAL to the meaning of Birds sigh, or the sound of BIRDS FLY to the sound of ELEPHANTS FLY, etc., or that the meaning of Birds do something is INCLUDED in the meaning of Birds fly, or, the sound of BURR in the sound of BIRDS FLY, etc. The intrasystematic character of linguistic sounds and meanings and the necessary indirectness of the verification of simple equivalences between them accounts to a large extent for the central position of synonymy and ambiguity in linguistics. 
Thus, while simple symbolic equivalence theorems constitute the sine qua non of grammar, the most useful theorems for the decisive verification of grammars are generally theorems of synonymy and ambiguity. Since each of these complex theorems asserts the JOINT equivalence of two distinct representations to some third representation — in the case of synonymy, two sounds to


one meaning; for ambiguity, two meanings to one sound — the complex theorem is false if EITHER of the asserted equivalences is false, and the procedure of verification can be reduced from the task of determining all of the actual properties of a pair of objects to the much simpler task of simply judging whether two given objects are the same or different in their relationships to a given third set of properties. Thus it is not necessary to know or become aware of ALL the semantic properties of BIRDS FLY or ALL of the phonetic properties of Birds fly and Elephants fly to effectively determine that these phonetic objects are not the same with respect to this semantic object. Similarly, it is possible for a competent observer to readily determine the truth of the assertion that Flying birds can be dangerous is equivalent to both (X) and (~X) without being able to determine the truth or falsity of the assertion that it is equivalent, say, to the meaning (SOME BIRDS FLY, THOSE BIRDS CAN BE DANGEROUS). Simple symbolic equivalence theorems with unknown phonetic or semantic members are also more easily verified than those with no unknowns, but it will still generally be the case that discrimination is easier than identification, and thus that synonymy and ambiguity theorems with or without unknowns will be easier to falsify than those asserting simple equivalence. This matter is of considerable significance with respect to the formulation and verification of linguistic theories, but since the proofs of synonymy and ambiguity theorems are inherent in any theory which is capable of proving simple symbolic equivalence theorems, the verification of any single symbolic equivalence theorem can be effected sufficiently through the verification of the binary synonymy and ambiguity theorems which include it.
The set of valid symbolic equivalence theorems of an equational grammar will thus always suffice for the generation of all possibly-true predictions about the grammaticality, synonymy, and ambiguity of the objects in its domain, where each theorem of the form (A = b) generates the verifiable claims that (A, b) is GRAMMATICAL, that A is A MEANING OF b, and that b is AN EXPRESSION OF A, and where each pair of theorems of the form (A = b, A = c) generates the claim that b and c are SYNONYMOUS and each pair of the form (A = b, C = b) generates the claim that b is AMBIGUOUS. The fact that all of these basic linguistic properties and relations are characterized by exactly the same sets of theorems is a fact about equational grammars which is not necessarily true for grammars governed by metatheories that do not incorporate the equationality hypothesis. There are, in fact, metatheories, such as that proposed by Chomsky in Syntactic Structures (1957), in which the only empirical property that can be exhaustively characterized by the possible interpretable theorem schemata of its grammars is the property of being the sound of some sentence of a given language, predicated of the interpretation of each terminal phonetic representation P of each valid theorem of the form (D → P), or (D = P), where D is an uninterpretable 'deep structure' representation generated by a specified subset of the statements of some grammar. These superficial well-formedness theorems remain as one of the two basic theorem types of the more complex and much richer theory of grammar proposed by Katz and Fodor (1963) and developed by Katz and Postal (1964) and Chomsky (1965), the other interpretable theorem schema for this theory being (D → M), where D is a deep structure representation and M is a terminal semantic representation. While this theory makes essential use of both theorem sets for the generation of claims about symbolic equivalence (D → P, D → M), synonymy (D₁ → P₁, D₁ → M₁, D₂ → P₂, D₂ → M₁) and ambiguity (D₁ → P₁, D₁ → M₁, D₂ → P₁, D₂ → M₂), it is like the Syntactic Structures theory in its capacity to generate claims about superficial well-formedness by means of only that well-defined set of theorems which assert derivability relations between uninterpretable deep structures and interpretable phonetic representations.
In contrast to the theory of equational grammar, therefore, these conventional non-equational theories both allow for the possibility of a grammar for which the particular theorem


(D → P) is valid but for which there is no M such that (D → M) is valid. In other words, these theories generate grammars generating claims that there are sentences in natural languages that are phonologically well-formed but have no semantic properties whatever, and hence no paraphrases, no entailments, no presuppositions, and no other such meaning relations with respect to any other sentence of the language. It is evident, however, that there are no meaningless sentences of this sort in natural languages,4 and that the cited standard theories of non-equational grammar are inadequate by virtue of their inherent inability to provide any principled explanation for this fact. Such an explanation is provided, on the other hand, by any equational theory of grammar, since it follows necessarily from the equationality hypothesis that ALL possibly-true assertions about linguistic objects must be assertions of symbolic equivalence, which is a relation that is defined as possibly holding only between the members of sets of representations which include one interpretable semantic representation and one interpretable phonetic representation. The equationality hypothesis thus generates the claim that all significant linguistic properties and relations are analyzable in terms of the single primitive relation of symbolic equivalence, and that all true assertions about such properties and relations in any natural language can be directly derived by a set of simple interpretation schemata from the set of valid simple symbolic equivalence theorems of an equational grammar of that language. This has been demonstrated already with respect to synonymy and ambiguity, and will be shown now to be equally true for other significant linguistic properties and relations.

4 Thus, while some philosophers have occasionally applied the term 'meaningless' to sentences like Colorless green ideas sleep furiously, it is clear that what is really meant in all such cases is not that the sentence is meaningless in any ordinary or even extended sense, but rather that the proposition that it expresses is either SELF-CONTRADICTORY or DEVOID OF TRUTH-VALUE by virtue of the failure of its presuppositions to describe true states of affairs. This failure to distinguish such false or presuppositionally odd sentences from such truly meaningless and grammatically non-characterized utterances (for English) as *Green furiously sleep ideas colorless or *Il fait chaud, which are neither contradictory nor noncontradictory, neither devoid nor non-devoid of truth value, is rarely observed in the works of linguists or philosophers who are at all cognizant of the essential properties and communicative functions of natural languages.

3.3.

OTHER PROPERTIES AND RELATIONS

Symbolic equivalence theorems have been characterized thus far as equations with no unknowns, that is, equations between a given string of phonetic constants and a given string of semantic constants. It is readily apparent, though, that any theory which is capable of demonstrating equations with no unknowns will also necessarily be capable of demonstrating equations between representations containing one or more unknowns represented by free or bound variables, an equation of the latter sort being nothing more than a schematic abbreviation of some class of equations of the former sort. Thus if a theory has a valid proof for the equation (3 + 2 = 4 + 1) it will also have valid proofs for the theorems (3 + 2 = 4 + x), (y + 2 = 4 + x), (3 + 2 = x + y), etc. Similarly, if a grammar is capable of proving the variable-free equation (1a), it will also be capable of proving equations (1b-e), where late letters are free variables for strings and where upper case and italicized lower case orthography stand for strings over the sets of semantic and phonetic elements and operators, respectively.

(1) (a) ((N, BIRDS), (V, FLY)) = ((birds) & (fly))
    (b) ((N, BIRDS), (V, FLY)) = ((birds) & x)
    (c) ((W), (V, FLY)) = ((birds) & x)
    (d) ((N, BIRDS), (V, FLY)) = (x)
    (e) (W) = ((birds) & (fly))

For any valid equation with no unknowns, there will thus be a set of valid equations which can be derived from it by replacing one or more of its constants with appropriate variables.


The task of proving an equation with unknowns is thus essentially the task of finding those constants which, if substituted for the variables of the given equation, will yield an equation which has a valid proof according to some given theory. For the linguist, there are two types of symbolic equivalence theorems with unknowns which are of particular interest: those like (1d) in which the entire phonetic member of the equation is unknown, and those like (1e) in which the entire semantic member is unknown. These are of particular interest because they represent precise formal models of the problems which a competent speaker faces and solves in producing and understanding particular utterances of his language. Thus the task of producing an utterance is, aside from the matter of physical execution of the required articulatory gestures, precisely the task of finding a phonetic object whose representation is demonstrably equivalent according to the rules of his grammar with the semantic representation of a given cognitive object. Similarly, to understand a given utterance it is necessary to find some semantic object whose representation is demonstrably equivalent to the given phonetic representation of that utterance. The heuristics of solving symbolic equivalence equations with unknown phonetic or semantic members will thus necessarily constitute one of the central concerns of any theory of linguistic performance.
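Solving an equation with unknowns, as described above, can be pictured as matching a pattern against the variable-free theorems of a grammar. The Python sketch below is a simplification under stated assumptions: the second theorem (ELEPHANTS) is hypothetical, `None` stands in for an unknown, and an unknown here matches exactly one element or one whole member, whereas the free variables of the text range over strings of any length.

```python
# Variable-free theorems as (semantic member, phonetic member) pairs.
# The first follows example (1a); the second is a hypothetical addition.
BIRDS_FLY = (("N", "BIRDS"), ("V", "FLY"))
THEOREMS = [
    (BIRDS_FLY, ("birds", "fly")),
    ((("N", "ELEPHANTS"), ("V", "FLY")), ("elephants", "fly")),
]

def matches(pattern, value):
    """True if value is an instance of pattern; None is an unknown."""
    if pattern is None:
        return True
    if isinstance(pattern, tuple) and isinstance(value, tuple):
        return len(pattern) == len(value) and all(
            matches(p, v) for p, v in zip(pattern, value)
        )
    return pattern == value

def solve(semantic, phonetic, theorems=THEOREMS):
    """Return every theorem that instantiates the equation (semantic = phonetic)."""
    return [
        (sem, phon)
        for sem, phon in theorems
        if matches(semantic, sem) and matches(phonetic, phon)
    ]

# Production, as in (1d): the whole phonetic member is unknown.
produced = solve(BIRDS_FLY, None)
# Comprehension, as in (1e): the whole semantic member is unknown.
understood = solve(None, ("birds", "fly"))
```

In this toy setting production and comprehension are the same lookup with a different member left open, which mirrors the claim that both tasks are instances of solving one kind of equation.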

3.3.1.

Class-Membership Theorems

All taxonomic or classificatory statements about linguistic objects can also be expressed as symbolic equivalence theorems with one or more unknowns. Thus, for example, such conventional assertions of class-membership as (2a-c) can be formally expressed by the respective equations (3a-c):

(2) (a) Birds fly is a sentence
    (b) man is an animate count noun
    (c) man is a verb in the sentence Man the oars

(3) (a) (birds fly) = (S(X))
    (b) (man) = (N, ANIMATE, COUNT, X)
    (c) (man) = (V, X) in the equation (man the oars) = (I ASK OF YOU THAT YOU WILL MAN THE OARS)
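Equations such as (3b) can be checked mechanically once the unknown X is read as "whatever other elements the semantic member contains". The following Python sketch makes that reading explicit. It is only an illustration: the semantic element sets are invented for the example, the function name is hypothetical, and the sentence-relative condition of (3c) is ignored.

```python
# Hypothetical theorems: (phonetic member, set of semantic elements).
THEOREMS = [
    ("man", frozenset({"N", "ANIMATE", "COUNT", "HUMAN", "MALE"})),
    ("man", frozenset({"V", "OPERATE"})),
    ("birds", frozenset({"N", "ANIMATE", "COUNT", "PLURAL"})),
]

def belongs_to(phonetic, required, theorems=THEOREMS):
    """True if some theorem equates `phonetic` with a semantic member that
    contains every required element; the remainder plays the role of X."""
    required = frozenset(required)
    return any(p == phonetic and required <= sem for p, sem in theorems)
```

For instance, `belongs_to("man", {"N", "ANIMATE", "COUNT"})` corresponds to (3b), while `belongs_to("man", {"N", "V"})` fails because no single theorem supplies both elements.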

Classificatory theorems of this sort are generally of quite marginal interest in themselves, of course, but it is nevertheless of some interest to note that proofs of such theorems will always be INHERENT to any theory which is capable of proving simple variable-free equivalence theorems. Grammars which demonstrate simple symbolic equivalence theorems are thus necessarily also sufficient for the classification of any object in their domain with respect to any element of the grammar. It has been demonstrated then that any theory which is capable of demonstrating equations with no unknowns will necessarily also be capable of demonstrating equations with one or more unknowns, and that these latter theorems suffice for the classification of any linguistic object with respect to any possible linguistic elements or structures. Thus, the class of all [...]

[...] ADV & N & V
3. N → John
4. V → danced
5. ADV → yesterday

Proof:
1. S = yesterday John danced (given)
2. N & V & ADV = yesterday John danced (Rule 1)
3. ADV & N & V = yesterday John danced (Rule 2)
4. ADV & John & V = yesterday John danced (Rule 3)
5. ADV & John danced = yesterday John danced (Rule 4)
6. yesterday John danced = yesterday John danced (Rule 5)

(2) Grammar G':
5'. yesterday → ADV
4'. danced → V
3'. John → N
2'. ADV & N & V → N & V & ADV
1'. N & V & ADV → S

Proof:
1. S = yesterday John danced (given)
2. S = ADV & John danced (Rule 5')
3. S = ADV & John & V (Rule 4')
4. S = ADV & N & V (Rule 3')
5. S = N & V & ADV (Rule 2')
6. S = S (Rule 1')

Since any pair of inverse rewriting rules, such as (N → John) and (John → N), is logically equivalent to a single equational rule, such as (N = John), it is clear that any antisymmetric non-equational rewriting rule can be replaced by a symmetric equational rule without any possible loss of generality or explanatory power. Thus any theorem which is demonstrable by either of the inverse

GRAMMATICAL AXIOMS


grammars G and G' will also be demonstrable by the other and also by the following equational grammar G" which subsumes all of the rules of both G and G':

(3) Grammar G":
1". S = N & V & ADV
2". N & V & ADV = ADV & N & V
3". N = John
4". V = danced
5". ADV = yesterday
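The way a single equational grammar justifies both directed derivations can be sketched as a search over equal substitutions. The Python code below is one possible illustration, not a procedure given in the text: strings are modelled as tuples of symbols, `&` is modelled as simple concatenation, each axiom may be applied in either direction, and the breadth-first search with a length bound (`max_len`) is an assumption introduced only to keep the search finite.

```python
from collections import deque

# The five equational axioms of the equational grammar, as pairs of strings.
AXIOMS = [
    (("S",), ("N", "V", "ADV")),
    (("N", "V", "ADV"), ("ADV", "N", "V")),
    (("N",), ("John",)),
    (("V",), ("danced",)),
    (("ADV",), ("yesterday",)),
]

def substitutions(string, axioms):
    """Yield every string obtained by one equal substitution, in either direction."""
    for left, right in axioms:
        for old, new in ((left, right), (right, left)):
            n = len(old)
            for i in range(len(string) - n + 1):
                if string[i:i + n] == old:
                    yield string[:i] + new + string[i + n:]

def prove(start, goal, axioms=AXIOMS, max_len=6):
    """Return a shortest chain of strings from start to goal, or None."""
    queue = deque([(start,)])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in substitutions(path[-1], axioms):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(path + (nxt,))
    return None

SENTENCE = ("yesterday", "John", "danced")
top_down = prove(("S",), SENTENCE)   # same direction as the proof in G
bottom_up = prove(SENTENCE, ("S",))  # same direction as the proof in G'
```

Both calls succeed with the same axiom set, while a string such as `("John", "yesterday", "danced")` is not reachable from `("S",)`, since no axiom licenses that order.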

Since the latter grammar will justify by the principle of equal substitution any derivational step which can be justified by either of the arbitrary empirically-equivalent directed grammars which it subsumes, it is clear that the equational reduction is not only possible here but also necessary by virtue of the metascientific requirement that less general statements and theories be progressively reduced to more general ones. It is also clear that such reduction will be equally possible and equally well-motivated for ANY system of directed rewriting rules whose rules are selected out of a set of exclusively antisymmetric rules that are all used to justify directed substitutions in proofs terminating in one and the same terminal alphabet. But this is precisely the case for all of the even moderately-motivated directed rewriting systems which have actually been proposed thus far for natural languages, including not only those phonetically-directed systems governed by Chomsky's (1957) original theory of transformational grammar, but also both the semantically-directed and phonetically-directed subsystems of grammars governed by the more elaborate componential theories of Katz and Fodor (1963) and Chomsky (1965). Thus, although these theories require that particular grammars include language-specific identifications of the structural description (input) and structural change (output) of EACH of their rewriting rules — thereby generating the patently false claim that for ANY representations A and B, both (A → B) and (B → A) are possible phonetically-directed rules of natural language grammars — the grammarians who have actually


been working in these theoretical frameworks have apparently made no use of this vast power of language-specific directionality-specification with respect to ANY of the motivated directed rewriting rules of their grammars. Thus while the governing non-equational theories predict that converses such as subject-raising and object-lowering, or pronominalization and unpronominalization, or voiced final consonant devoicing and voiceless final consonant voicing are equally possible phonetically-directed transformations for natural language grammars, we consistently find that only one of the members of each such converse pair has ever actually been proposed as a phonetically-directed rule for any such grammar. All of these grammars are thus directly reducible to equational grammars by simply removing all of their language-specific directionality specifications in favor of universal principles of inferential precedence which predict from the form of any equation (A = B) which of its pair of entailed converse directed transformations (A → B) and (B → A) takes precedence in phonetically-directed derivations and which takes precedence in semantically-directed derivations. Since it is only by means of such universal inference principles that it can be explained WHY certain directed transformations are empirically appropriate ONLY in phonetically-directed derivations while their converses are appropriate ONLY in semantically-directed derivations, and since it is possible for a theory of grammar to posit such principles ONLY if it prohibits language-specific specifications of directionality in the particular grammars that it governs, it is clear that the reduction of directed rewriting systems to sets of equational axioms is not only possible here but empirically necessary.

4.1.2.

Redundancy Axioms

Directed rewriting rules have traditionally been differentiated into two major types: (1) FORMATIONAL, or PHRASE-STRUCTURE, rules, which specify inclusion, or dominance, relations between individual constants; and (2) TRANSFORMATIONAL rules, which specify simple derivability relations between pairs of equivalent representations including one or more variables, where the members of each pair are related to each other according to some set of one or more elementary transformations enumerated by the governing metatheory. A phrase-structure rule is thus any rewriting rule of the form

(4) X(A)Y → X(A(B))Y,

where A is a single constant, where B is a non-null internally unbracketed set of one or more constants, and where X and Y may be null. Various particular constraints can be imposed on such rules, of course, to make them more or less powerful in terms of both strong and weak generative capacity. Thus, for example, it could be required that the X and Y of the standard phrase-structure schema (4) also be free of variables, or that they be null; it could also be required that B must or must not include A, or that B must or must not consist of n elements, for any finite n, or that the elements of B may or must be ordered with respect to each other, or unordered with respect to each other. Regardless of the particular set of constraints imposed, though, it is clear that all these varieties of phrase-structure rules are only special cases of the schema (4), and that if any directed inference of this form is valid then its inverse is necessarily also valid. Thus, for example, for any grammar which validly demonstrates any true theorem by making use of the inference of (S(NP & PREDICATE)) from (S), there will be at least one true theorem which can be validly demonstrated by that grammar by making use of the inference of (S) from (S(NP & PREDICATE)). Every motivated directed rule of the form (4), in other words, presupposes and is presupposed by an inverse rule of the form

(5) X(A(B))Y → X(A)Y

and both of the members of any such pair are subsumed and implied by an equational rule of the form

(6) X(A)Y = X(A(B))Y.

The equational character of all assertions of classification or


constituent structure is intuitively quite obvious. Thus to assert that all cats are mammals is clearly the same as asserting that everything that is a cat is both a cat and a mammal, or that everything that is both a cat and a mammal is a cat, or, in other words, simply that the equation ((CAT) = (CAT, MAMMAL)) is true. Similarly, the assertion that all NP's consist of - a Dieterminer) followed by a N(oun) is precisely equivalent to the assertion that everything that is an NP is both an NP and a D & N, or, equivalently, that everything that is both an NP and a D & N is an NP, or, most generally, that the equation (NP) = (NP(D & N)) is true. The intrinsically equational character of categorization statements has often been ignored or obscured, however, in the case of the categorization axioms of grammars. Thus the arbitrary, empirically-unmotivated directionality specifications imposed on the phrase-structure rules of non-equational grammars have sometimes actually been taken by linguists to be empirically significant in some unknown sense. Such misconception has probably been due not only to the fairly common tendency to hypostatize the more familiar strategies for proving linguistic theorems, but also to the common empirically non-significant notational convention for abbreviating the phrase-structure schema (X(A)Y X(A(B))Y) as (X(A)Y X(B)Y). It is clear, though, that while this abbreviated schema is not distinct from that for lexicalization or other simple substitution transformations, the derivational function of a phrase-structure rule is always understood to be that of its unabbreviated schema and not that of a substitution transformation. Thus a conventional phrase-structure rule (NP —» D & N) is understood to justify the phonetically directed additive derivation (7), and not the simple substitutive derivation (8): (7) (a) .... NP .... (b) NP

D&N

GRAMMATICAL AXIOMS

(8)

(a) (b)

81

.... NP .... .. D & N . . .
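The contrast between the additive derivation (7) and the substitutive derivation (8) can be made concrete with a small sketch. The list-and-tuple encoding of a derivation line and the function names `additive_step` and `substitutive_step` are assumptions introduced here for illustration only; they are not part of the theory being described.

```python
def additive_step(line, parent, daughters):
    """Schema X(A)Y -> X(A(B))Y: keep the parent and record its daughters under it."""
    result = []
    for symbol in line:
        if symbol == parent:
            result.append((parent, tuple(daughters)))
        else:
            result.append(symbol)
    return result


def substitutive_step(line, parent, daughters):
    """Schema X(A)Y -> X(B)Y: replace the parent by its daughters."""
    result = []
    for symbol in line:
        if symbol == parent:
            result.extend(daughters)
        else:
            result.append(symbol)
    return result


# "X" and "Y" stand in for the unspecified context written "..." in (7) and (8).
line = ["X", "NP", "Y"]
print(additive_step(line, "NP", ["D", "N"]))      # ['X', ('NP', ('D', 'N')), 'Y']
print(substitutive_step(line, "NP", ["D", "N"]))  # ['X', 'D', 'N', 'Y']
```

Only the additive step preserves the information that the sequence D & N is an NP, which is the sense in which a phrase-structure rule is understood as its unabbreviated schema.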

The formational character of the particular type of rewriting system assumed by Chomsky (1957, 1965, etc.) has been repeatedly shown to represent an empirically indefensible hypothesis about natural language grammar. Thus it has been shown, in a number of recent studies, that the function of formation is neither necessary (see, e.g., Sanders, 1967) nor sufficient (see, e.g., Stanley, 1967; Perlmutter, 1968) for the principled description or explanation of linguistic data. This means that directed phrase-structure rules of formation must be metatheoretically excluded from the set of possible grammatical axioms generated by any adequate theory of grammar, regardless of the directionality or non-directionality of its permitted transformational axioms. For equational theories, this required exclusion is automatically effected by the independently-motivated general requirement that all grammatical axioms must be expressible as equations. While all empirical theories require certain postulations of class-inclusion relations, it is clearly unnecessary to interpret these as rules of formation rather than conditions for well-formedness. Moreover, it is also clear that no classification statement can have any empirical significance unless its terms are understood to be the names for interpretable PROPERTIES rather than as the uninterpretable (and hence empirically arbitrary) names for CLASSES of objects. Thus a statement such as NP -> (NP(D & N)) would merely be a nominal definition and hence devoid of any scientific interest or utility if NP, D, and N are merely uninterpreted constant category symbols; this statement could justifiably be included in a grammar then only if these symbols are interpreted property elements or at least variables for three empirically significant classes of objects. 
But the conventional notion of phrase-structure rules is incompatible with both of these alternative prerequisites for significance, since their assumed directionality and formational function preclude the interpretation of their arguments as variables while the obvious phonetic and semantic irrelevance or redundancy of their usual arguments precludes their inclusion in the set of interpreted elements of any grammar. Formational or directed phrase-structure rules are thus neither necessary nor sufficient for any purpose of scientific linguistics. (For further discussion, see Sanders, 1967.)

Any set of structured objects that can be characterized by a phrase-structure grammar interpreted as a set of rules of formation can also be characterized by a set of acceptability, or well-formedness, conditions. (See, e.g., Hays, 1964; Sanders, 1967.) For any natural language, however, there will be certain significant generalizations which can be readily expressed by means of well-formedness conditions, but which cannot possibly be expressed in any principled or natural way by means of rules of formation. Thus the generalizations of (9), for example, can readily be expressed by equational well-formedness conditions such as those of (10):³

(9) (a) Cognitive verbs (e.g. admire, believe, remember, think) require cognitive agents (e.g. this man, you, Henry, my favorite cat)
(b) Verbs of trying (e.g. attempt, be motivated, try) require identical superordinate and subordinate agents

(10) (a) S(N(W, X) V(COGNITIVE, Y)Z) = S(N(W, COGNITIVE) V(COGNITIVE, Y)Z)
(b) S((AGT, X) V(TRYING) (S((AGT, Y)Z))W) = S((AGT, X, Y) V(TRYING) (S((AGT, X, Y)Z))W)

³ These examples are purely illustrative. As will be shown in Sec. 4.2, all representational well-formedness constraints of this sort are most appropriately expressed by assertions of non-equivalence rather than equivalence.

Neither (9a) nor (9b) can be expressed by context-free phrase-structure rules, of course, and (9b) cannot be expressed by means of any rule which does not make essential use of variables. There are thus certain significant generalizations about natural languages which are inherently beyond the powers of phrase-structure grammar but not beyond the powers of equational grammar.

There is considerable additional evidence of the empirical inadequacy of phrase-structure grammar and thus of the necessity as well as sufficiency of its elimination in favor of equational grammar. Perhaps the strongest evidence of all, though, consists of the fact that phrase-structure rules are demonstrably inadequate even for their presumably primary function of classification. Here, moreover, it is not only the case that formational phrase-structure grammar is incapable of expressing certain generalizations, but also that the acceptance of any such grammar as a proper PART of a grammar of a natural language will PRECLUDE the expression of certain generalizations by means of ANY grammatical statement at all. Thus, for example, a phrase-structure rule such as (11) will not only be inherently incapable of revealing which of the empirically non-equivalent statements of (12) are true but will also make any explicit expression of the true members of (12) impossible in any grammar which includes it:

(11) A → A(B C)

(12) (a) Every A is an A(B C)
(b) Every (B C) is an A(B C)
(c) Every A(B) is an A(B C)
(d) Every A(C) is an A(B C)
(e) Every B is an A(B C)
(f) Every C is an A(B C)

These empirically-significant distinctions are readily expressed by the distinct acceptability equations:

(13) (a) A = A(B C)
(b) (B C) = A(B C)
(c) A(B) = A(B C)
(d) A(C) = A(B C)
(e) B = A(B C)
(f) C = A(B C)

This fundamental inadequacy of formational phrase-structure grammar has long been recognized, of course, in the area of phonology (see, e.g., Halle, 1962; Chomsky, 1964), but a large number of scholars have failed to move from the general rejection of the phrase-structure generation of underlying or superficial phonological structures to the equally motivated rejection of such generation with respect to semantic or syntactic structures. For all aspects of linguistic objects, nevertheless, and for all empirical objects in general, it will always be the case that any attempt to account for the contents of a domain by means of statements of a formational or phrase-structure character will result in the systematic loss of significant generalizations and in a blurring of the empirically vital distinctions between predictability and unpredictability, redundancy and distinctiveness, and between what is axiomatic in a given system and what can be deductively inferred from those axioms. (For further discussion, see Sanders, 1967.) The elements with respect to which the two members of an equation differ (e.g. the (B C) of (13a), the A of (13b)) are formally characterized as non-axiomatic, predictable, and nondistinctive relative to the identical elements of both members (e.g. the A of (13a), the (B C) of (13b)). Equational grammars thus provide an explicit and natural characterization of the vital distinction between those properties and relations of an object which are predictable and those which are not, a characterization which is inherently beyond the powers of any possible phrase-structure grammar.
Since the characterization functions of phrase-structure rules are fully subsumed by well-formedness equations, and since the latter are not only more revealing but also entirely compatible with the goal of maximal prediction with respect to minimal axiomatic bases, it is reasonable to assume that any adequate explanatory theory of grammar will require the exclusion of formational phrase-structure rules from all specific grammars in favor of well-formedness conditions in the form of predictive equational statements.

Recent evidence presented by Perlmutter (1968) and others has suggested that certain facts that can be readily accounted for by well-formedness conditions are also beyond the explanatory power of grammatical transformations, even when transformations are allowed to apply in ways which are clearly excessively unconstrained and unnatural with respect to the known properties of natural languages. The metatheoretical acceptance of well-formedness conditions thus would appear to be clearly necessary, and such acceptance allows for a very extensive narrowing of the constraints on all possible transformations. Thus, for example, given the availability of conditions on the well-formedness of surface structures, there can be no possible basis for the continued toleration of such arbitrary and unnatural devices as rule-specific conditions on applicability or on the relative order of application of substantively unrelated and unconflatable rules (extrinsic ordering in the sense of Chomsky (1965)). Even without well-formedness conditions it is possible to impose on all rules the constraint that they be unconditionally self-explanatory in terms of their governing metatheory (see Sanders, 1969a); WITH well-formedness conditions this unconditionality constraint is not only possible but necessary, since otherwise there would be numerous characterizations which could be effected equally adequately either by means of ad hoc well-formedness conditions or by means of ad hoc rule-conditions, and there would be no principled basis for choosing between them or for preventing overlaps in their respective filtering functions.

It has already been suggested that any affirmative well-formedness condition can be formulated as an equational statement which formally expresses an empirically true synthetic dependency relation between certain properties of linguistic objects.
It has also been suggested that the distinction between classification, acceptance, and derivational inference is essentially only a matter of heuristics, and that the form, meaning, and empirical significance of any grammatical statement must be independent of its various uses in proving various theorems of different types. Although any statement is subject to multiple interpretation and multiple uses, it appears that the only standard formulation of grammatical statements which is NEUTRAL with respect to ALL of their possible uses and interpretations is the equational formulation; any other ordinary way of expressing a grammatical principle will incorporate some grammatically gratuitous direction for the efficient use of the principle in the proof of some particular class of theorems in some particular way, typically for the proof of equations with phonetic unknowns by means of phonetically terminated derivations from the given semantic member. Moreover, it could also be easily shown that these rule-specific heuristic directions are gratuitous with respect to theories of PERFORMANCE as well, since there are GENERAL procedures for PREDICTING the most efficient directional use of any equational rule for any given terminal mode of any proof. It is only an equational formulation, in other words, which allows for the richness of interpretation and use that is required of any empirical theory, and which can provide the heuristically neutral characterization of linguistic knowledge which is prerequisite to any significant study of the ways in which such knowledge is used for the actual solution of particular communicative problems.

Any specification of inclusion, redundancy, categorization, or structural well-formedness can be adequately expressed in the form of a simple equivalence statement. Thus any equation of the form (14a) can be understood as the symbolic representation of any and all of the laws verbally expressed by (14b-e):

(14) (a) (A) = (A, B)
(b) A is a sub-type or species of the genus B;
(c) All A's are also B;
(d) A is included in B;
(e) (A, B) could be a well-formed A.
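The equivalence between the equation (14a) and the inclusion readings (14c-d) can be checked mechanically on a toy model. In the sketch below, the encoding of objects as sets of property names, the sample `UNIVERSE`, and the function names are all assumptions made for illustration; the text itself fixes no such encoding.

```python
# Each object is modelled as the set of properties it has (illustrative assumption).
UNIVERSE = [
    frozenset({"CAT", "MAMMAL"}),
    frozenset({"CAT", "MAMMAL", "BLACK"}),
    frozenset({"DOG", "MAMMAL"}),
    frozenset({"TROUT"}),
]


def extension(properties, universe):
    """Return the objects that have every property in `properties`."""
    wanted = frozenset(properties)
    return [obj for obj in universe if wanted <= obj]


def equation_holds(a, b, universe):
    """(A) = (A, B): being A and being both A and B pick out the same objects."""
    return extension(a, universe) == extension(set(a) | set(b), universe)


def all_are(a, b, universe):
    """Class-inclusion reading: every object that is A is also B."""
    return all(frozenset(b) <= obj for obj in extension(a, universe))


print(equation_holds({"CAT"}, {"MAMMAL"}, UNIVERSE))   # True: all cats are mammals
print(equation_holds({"MAMMAL"}, {"CAT"}, UNIVERSE))   # False: not all mammals are cats
```

In this model `equation_holds` and `all_are` agree for every choice of A and B, which is the sense in which the equational and the class-inclusion formulations say the same thing.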

The following examples should suffice to show that all of these alternate verbal expressions of the underlying equivalence relation are entirely appropriate with respect to the equational representations of the usual types of categorization, inclusion, redundancy, and well-formedness statements which have actually been proposed for various grammars of natural languages:

(15) (a) (S) = (S(NP & VP))
(b) Sentences are a subtype of those constructions consisting of a Nominal Phrase followed by a Verbal Phrase;
(c) All sentences are also constructions whose immediate constituents are an NP followed by a VP;
(d) The set of sentences is included in the set of constructions whose immediate constituents are an NP followed by a VP;
(e) Any structure analyzable as (S(NP & VP)) could be a well-formed sentence.

(16) (a) (+NASAL) = (+NASAL, +VOICED)
(b) Nasals are subtypes of the class of voiced segments;
(c) All nasals are also voiced;
(d) The set of nasal segments is included in the set of voiced segments;
(e) Any segment which is both voiced and nasal could be a well-formed nasal segment.

(17) (a) (CLITIC) = (CLITIC ((W, se) & (X, II) & (Y, I) & (Z, III, DATIVE)))⁴
(b) Clitic pronoun structures are subtypes of the class of clitic pronoun sequences where se precedes second person and/or second person precedes first person and/or first person precedes third person dative;
(c) All clitic sequences are also sequences of the specified sort;
(d) The set of clitic sequences is included in the set of sequences of the specified sort;
(e) Any clitic sequence which is analyzable as specified could be a well-formed clitic structure.

⁴ This is a purely illustrative reduction of Perlmutter's output condition on clitic pronouns in Spanish (Perlmutter, 1968: 158), with parentheses here signifying optionality according to the usual abbreviation of (A/∅) as (A). Perlmutter's actual formulation is "(72) Output condition on clitic pronouns: se II I III Dative" where this is to be applied as a template to surface structures which are identifiable in some unspecified way as sequences of clitic pronouns. The appropriate equational formulation of this constraint is actually by means of a non-equivalence axiom. This axiom is presented in Sec. 4.2.2. For an alternative treatment of these data by means of a precedence-relational ordering rule, see Sanders (1969; note 33).

An equational redundancy axiom of the form ((A) = (A, B)) will thus be sufficient for the expression of any possible law-like generalization that can be expressed by a directed phrase-structure rule of the form (A → A(B)), or by a directed redundancy rule of the form ((A) → (A, B)), or by an affirmative well-formedness condition of the form ((A, B) could be a well-formed (A)). It is clear, though, that the generalizations which are expressible by phrase-structure rules and affirmative well-formedness conditions are not all necessarily law-like statements, since it is possible to assert by means of such statements, for example, that only SOME A's are well-formed B's, while OTHERS are well-formed non-B's. Since non-lawlike statements are inherently incapable of contributing to the explanation of empirical data, it is necessary for any adequate empirically-motivated theory of grammar to provide some principled constraint against the inclusion of such statements in any of the grammars that it governs. This required constraint is automatically provided by the general equationality condition on grammatical statements in conjunction with the additional independently-motivated condition that ((A) = (A, B)) is a possible statement if and only if either A is of the form XBY or else B is a single constant. The latter condition, from which Chomsky's (1965) condition on 'recoverability of deletions' follows as a necessary consequence, suffices to assure that each redundancy axiom of each possible grammar will express a generalization which is asserted to hold without exception for all proofs that are generated by that grammar.


The redundancy axioms of equational grammars are thus statements of the form ((A) = (A, B)), where B is a single constant. It follows from any such statement by the principle of equal substitution that both ((A) → (A, B)) and ((A, B) → (A)) are valid derivational inferences. It is also evident that for any actual statement of this sort it will be empirically appropriate and necessary to make use of one of its directed inferences only in proofs terminating in phonetically-interpretable representations, and the other only in proofs terminating in semantically-interpretable representations. The required inferential precedence relations between the additive and reductive substitutions justified by any given redundancy axiom are deductively determined by the universal principle of MAXIMALIZATION OF TERMINALITY, which constitutes the most general and intuitively most obvious member of the set of principles of valid linguistic inference that must be explicitly specified by any adequate theory of equational grammar:

(18) For any grammar G, such that G includes the axiom ((A) = (A, B)), G has no valid proof of the form (...X(A)Y = Z, X(A, B)Y = Z,..., W = W), such that the elements of B and W are included in the same terminal alphabet, and G has no valid proof of the form (...X(A, B)Y = Z, X(A)Y = Z,..., W = W), such that the elements of B and W are included in the same terminal alphabet.
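A minimal sketch of how this principle selects a direction of use for a redundancy axiom ((A) = (A, B)). It follows the consequences drawn in this section for the phonological and semantic cases: the additive substitution is used when the added element B belongs to the terminal alphabet of the proof, and the reductive substitution otherwise. The two small alphabets and the function names are hypothetical and serve only to make the rule executable.

```python
# Hypothetical terminal alphabets, limited to the elements used as examples
# in this section.
PHONETIC = {"NASAL", "VOICED"}
SEMANTIC = {"HUMAN", "ANIMATE"}


def alphabet_of(element):
    """Return the name of the terminal alphabet that `element` belongs to."""
    if element in PHONETIC:
        return "phonetic"
    if element in SEMANTIC:
        return "semantic"
    raise ValueError(f"unknown element: {element!r}")


def licensed_substitution(b, terminal_mode):
    """For an axiom (A) = (A, B), choose the substitution usable in a proof
    that terminates in `terminal_mode` ('phonetic' or 'semantic')."""
    if terminal_mode not in ("phonetic", "semantic"):
        raise ValueError(f"unknown terminal mode: {terminal_mode!r}")
    if alphabet_of(b) == terminal_mode:
        return "additive"    # (A) -> (A, B)
    return "reductive"       # (A, B) -> (A)


print(licensed_substitution("VOICED", "phonetic"))   # additive
print(licensed_substitution("VOICED", "semantic"))   # reductive
print(licensed_substitution("ANIMATE", "phonetic"))  # reductive
```

The choice depends only on B and on the terminal mode of the proof, so no direction needs to be stored with the axiom itself.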

It follows from this principle that a phonological redundancy axiom such as ((NASAL) = (NASAL, VOICED)) can serve to justify additive substitutions only in phonetically-directed derivations and reductive substitutions only in semantically-directed derivations, and that a semantic redundancy axiom such as ((HUMAN) = (HUMAN, ANIMATE)) can serve to justify additive substitutions only in semantically-directed derivations and reductive substitutions only in phonetically-directed derivations. Equational theories of grammar thus provide a principled and extremely general explanation of the fact that there are natural languages in which all nasal consonants are pronounced with voicing or all humans are understood to be animate, but no such languages in which all nasals are pronounced voicelessly or all humans are understood to be inanimate. Similar explanations of such significant restrictions on the range of variability of natural language will be generated with respect to each possible pair of semantic elements and each possible pair of phonetic elements by means of the same principle of maximalization of terminality, a principle which necessarily presupposes the equational formulation of all axioms of all particular grammars, and which represents a level of explanatory generalization which would appear to be far beyond the inherent powers of any possible theory of non-equational grammar.

4.1.3. Lexical Axioms

The relation between the sound and the meaning of a word or morpheme has almost always been viewed as one of simple substitutability, or direct symbolic equivalence. No arguments against the fundamental correctness of this view are known or presently conceivable. Thus we will take it for granted that the derivational function of any lexical statement is the substitution of a given phonological structure for a given non-phonological structure in phonetically-terminated derivations and the inverse substitution in semantically-terminated derivations. In the context of phonetically-terminated derivations, specific proposals to this effect have been made, for example, by Bach (1964), Gruber (1965), and Sanders (1967). In support of these proposals, Bach has shown that substitutive lexicalization is POSSIBLE and CAN apply subsequently to the application of all phonetically-directed syntactic transformations in phonetically-terminated derivations, Gruber has shown that lexicalization MUST apply subsequently to at least SOME syntactic transformations, and I have attempted to show that lexicalization MUST be COMPLETELY substitutive and post-syntactic in phonetically-directed derivations if we are to have any basis for the explanation of certain highly significant facts about all natural languages, e.g. the fact that there are no motivated grammatical rules of any language which make essential reference to both a phonetically-interpretable constant and a semantically-interpretable constant, or to both the sound and the meaning of any morpheme. These and other arguments for substitutive lexicalization have never been challenged, as far as I can tell, and they would clearly appear to be equally valid with respect to the use of lexical rules in semantically-terminated derivations. Thus it can presently be assumed that the lexical statements of all grammars will at least be restricted to statements of the form

(19) ((X), (W)) = ((w), (Z)),

where X and W are free of phonologically-interpretable elements and w and Z are free of semantically-interpretable elements, where X is a (possibly null) set of uninterpretable intermediate elements or rule-identification markers relevant to each of the rules which must apply exceptionally and make essential reference to any element of W, and where Z is a (possibly null) set of uninterpretable markers relevant to those rules which must apply exceptionally and make essential reference to any element of w.⁵ Thus the lexical rules for the English morphemes ox and try, for example, might conceivably be something like (20) and (21), respectively:

(20) (SUBSTANTIVE, BOVINE,...) = (((SYLLABIC, LOW) (DORSAL) (SPIRANT)), (Plural -en Rule,...))

(21) ((Identical Agents Condition, Equi NP Deletion,...),⁶ (V, TRYING,...)) = ((APICAL, STOP, RELEASED) (RESONANT, -LATERAL) (SYLLABIC, LOW) (SYLLABIC, FRONT))

⁵ This formulation assumes the characterization of exceptionality and idiosyncrasy by means of a system of exceptionality elements and conventions of the sort proposed by G. Lakoff (1965). It will be suggested, in Sec. 4.2.4, that this special theory of exceptionality can be appropriately reduced to ordinary statements of equivalence and non-equivalence between linguistic representations, in which case the schema for lexical rules would be simplified to (A = b), or, allowing for context-sensitivity, (XAY = XbY). The question of context-sensitive lexicalization, which will be prohibited for all non-ordering lexical axioms in Sec. 4.2.1, will not be argued here, though, chiefly because I am presently unable to provide any thoroughly convincing arguments against the toleration of such sensitivity, or, if tolerated, for choosing between phonological and non-phonological contexts. Thus, for example, it appears that most facts about English can be accounted for equally well by means of grammars governed by the contradictory requirements of total context freedom, phonological context freedom, and semantic context freedom in lexical rules, the weak but not strong equivalence of these requirements being illustrated by the distinct proofs of the theorem ((OX, PLURAL) = (oxen)) which they respectively determine:

(i) (a) (OX, PL)   (given)
(b) ((OX) & (OX, PL))   ((N, X, PL) = ((N, X) & (N, X, PL)))
(c) ((OX) & en)   ((OX, PL) = (en))
(d) (ox & en)   ((OX) = (ox))

(ii) (a) (OX, PL)   (given)
(b) ((OX) & (PL))   ((N, X, PL) = ((N, X) & (PL)))
(c) ((OX) & en)   ((OX & PL) = (OX & en))
(d) (ox & en)   ((OX) = (ox))

(iii) (a) (OX, PL)   (given)
(b) ((OX) & (PL))   ((N, X, PL) = ((N, X) & (PL)))
(c) (ox & (PL))   ((OX) = (ox))
(d) (ox & en)   ((ox & (PL)) = (ox & en))

Although I feel that the toleration of phonological sensitivity (as in iii) is the least desirable of the three metatheoretical alternatives, and that the prohibition of any sensitivity (as in i) is the most desirable, I am unable here to provide sufficient significant empirical evidence to substantiate these feelings.

⁶ These derivational restriction markers are at most purely expository abbreviations. An obviously more general and revealing treatment of the restriction on TRY is provided by equation (10b), or by a non-equivalence axiom of the sort illustrated in Sec. 4.2.2 and Chapter 5.

The determining relation between a grammatical rule and its possible deductive uses is most obvious and most simply stated in the case of lexical rules. Thus the directionality of a lexical statement with respect to any given proof is fully determined by the mode of the final equation of that proof; if the terminal mode is phonetic then all lexical statements that are usable will be used to effect the substitution of their phonological members for their non-phonological members, the opposite substitution being LOGICALLY IMPOSSIBLE until after the appropriate phonological strings have been inferred from their equivalent non-phonological strings; in semantically-terminated derivations, likewise, all lexical rules must be initially applied so as to effect a substitution of non-phonological strings for phonological ones, and, unless there are exact retracings of derivational steps, this is the only POSSIBLE way they can be applied. For any given proof the order of use of a lexical statement is also fully determined with respect to all non-lexical statements of the grammar. Thus for any sequence of equivalent representations including one pair whose equivalence is justified by a lexical rule, it is clearly impossible for any rule in the replacing mode to be used before the lexical rule is used, or for any rule in the replaced mode to be used after it. In other words, lexical rules are necessarily applicable only medially in bimodal proofs, the only possible order of derivational use of the three major types of grammatical statements being (1) semantic and syntactic rules, (2) lexical rules, (3) phonological rules (in phonetically-terminated proofs); or (1) phonological rules, (2) lexical rules, (3) semantic and syntactic rules (in semantically-terminated proofs). The range of choice in the order of derivational use of grammatical rules is thus necessarily limited to rules of the same modally-defined type. For lexical rules, moreover, whose members are mutually distinct, it clearly cannot make any difference whether one substitution is made before, after, or simultaneously with another, significant applicational precedences being possible only for equations which are intersecting and thus subject to possible conflation into a single complex equation.
(There are general metatheoretical principles for predicting from the simplest form of any pair of conflatable statements their appropriate order of application in any valid proof (see Chomsky, 1967; Sanders, 1970b).) The direction and order of derivational use of every lexical equation is thus fully determined for any possible proof by the form of the equation and the terminal mode of that proof. There is obviously also no need or empirical justification for restricting any lexical equation in terms of optional or obligatory use, since failure to apply any applicable lexical substitution is logically inconsistent with satisfaction of the condition that all valid proofs must terminate in equations between identical representations in the same interpretable alphabet.

Although the predictability of use is most obvious and most easily stated, perhaps, in the case of lexical equations, it appears that the empirically appropriate direction, optionality, and order of use of EVERY grammatical equation can be predicted for any grammar and any given direction of proof of any grammatical theorem. This fact is entirely obscured, of course, by any theory of grammar which allows the incorporation in grammars of heuristic annotations in the abbreviated form of direction-determining arrows, optionality or obligatoriness labels, or order-determining numbers associated with particular rules, or in the unabbreviated form of non-universal conditions for valid proof by particular grammars. Such heuristic annotations are not only inconsistent with the formulation of empirically adequate theories about languages, but also with respect to theories about the communicative use of languages, whose general principles governing the use of grammatical principles must necessarily be mutually exclusive with any particular specification of the uses of any particular principles of any particular grammar. The only theories of grammar which are capable of providing an appropriate basis for the description and explanation of linguistic competence and performance alike are thus those which require the heuristically and inferentially neutral expression of all rules of grammar, as is the case if all rules are required to be equations.
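The claim that the direction of use of a lexical equation is fixed by the terminal mode of the proof, and need not be annotated on the rule, can be sketched as follows. The one-to-one `LEXICON` table, the flat list encoding of a derivation line, and the name `lexical_step` are illustrative assumptions; the actual lexical equations of the theory have the richer form (19).

```python
# Undirected lexical equations, stored once as (non-phonological, phonological)
# pairs. The entries are toy stand-ins for equations such as (20) and (21).
LEXICON = {"OX": "ox", "TRY": "try"}


def lexical_step(line, terminal_mode):
    """Apply every applicable lexical equation in the direction that the
    terminal mode of the proof determines."""
    if terminal_mode == "phonetic":
        table = LEXICON                                   # meaning -> sound
    elif terminal_mode == "semantic":
        table = {sound: meaning for meaning, sound in LEXICON.items()}
    else:
        raise ValueError(f"unknown terminal mode: {terminal_mode!r}")
    return [table.get(symbol, symbol) for symbol in line]


print(lexical_step(["OX", "PL"], "phonetic"))   # ['ox', 'PL']
print(lexical_step(["ox", "en"], "semantic"))   # ['OX', 'en']
```

The same table serves both modes, and every applicable substitution is made in a single step, so neither a direction nor an optionality label is attached to any individual equation.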

4.1.4. Ordering Rules

Ordering rules can be viewed either as instances of an elementary transformation of ordering or simply as special cases of lexicalization, namely those cases where the non-phonological member is the semantically-significant relational element for simple GROUPING, or commutative co-constituency, and the phonological member is the phonetically-significant relational element for ORDERING, or non-commutative serial concatenation. According to either interpretation, though, it is clear that every ordering rule specifies a symbolic equivalence relation between a set of intersecting constituents and a particular ordering of those constituents. Thus, regardless of whether ordering rules are viewed as elementary transformations of constructions or as lexical substitution relations between their relational operators, every ordering rule can be expressed as an instance of the general equational schema

(22) W(X, Y)Z = W(X & Y)Z,

where all terms are representations in the same mode and where comma and ampersand represent grouping and ordering, respectively, these relational elements being internally defined by the metatheoretical equations of (23) and externally defined by the empirical interpretation statements of (24):

(23) Grouping: (X, Y) = (Y, X)
Ordering: (X & Y) ≠ (Y & X)

(24) Grouping: (X, Y) = 'the interpretations of X and Y are (cognitively) associated with each other'
Ordering: (X & Y) = 'the interpretation of X immediately precedes the interpretation of Y in temporal (and/or spatial) successivity'
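The internal definitions in (23) can be mirrored by two familiar data structures. The sketch below models grouping as an unordered set and ordering as a tuple; this encoding, and the names `group`, `order` and `orderings_of`, are assumptions chosen for illustration and not part of the theory.

```python
from itertools import permutations


def group(*members):
    """(X, Y): commutative co-constituency, modelled as a frozenset."""
    return frozenset(members)


def order(*members):
    """(X & Y): non-commutative serial concatenation, modelled as a tuple."""
    return tuple(members)


def orderings_of(grouping):
    """All orderings that an instance of schema (22) could equate with a grouping."""
    return sorted(permutations(grouping))


print(group("X", "Y") == group("Y", "X"))   # True:  (X, Y) = (Y, X)
print(order("X", "Y") == order("Y", "X"))   # False: (X & Y) != (Y & X)
print(orderings_of(group("X", "Y")))        # [('X', 'Y'), ('Y', 'X')]
```

An ordering rule of a particular grammar then amounts to selecting one member of `orderings_of(...)` for a given grouping. Note that a frozenset also collapses repeated members; that is a property of this encoding and not something that (23) asserts.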

I have presented evidence elsewhere (Sanders, 1967, 1969, 1970a) showing that empirically adequate explanatory grammars require the metatheoretical assumptions (1) that all terminal semantic representations are free of ordering relations, (2) that all elements of terminal phonetic representations are ordered with respect to each other, and (3) that all ordering relations are derivationally invariant, i.e. that if an element A is ordered to the left of an element B in any line of any derivation there is no line in that derivation in which B is ordered to the left of A. For theories which incorporate these derivational constraints, the direction and order of use of any ordering rule will be intrinsically determined in essentially the same way as in the case of all other rules of lexicalization. Although any theory of grammar which fails to incorporate these constraints will be incompatible with the principled expression of certain significant generalizations both about linguistic knowledge and about its use, the acceptance or rejection of these constraints has no bearing at all on the appropriateness of equational grammar, since the relation between the constituents of a structure and any ordering of those constituents can always be adequately specified by an equational statement in the form of (22) regardless of whether the specification is defined on constructions which are not semantically terminal or on those which are, and regardless of whether it is used to justify only one inference in the derivation of a given construction or to justify more than one.

4.1.5. Reordering Rules and Their Reduction to Ordering Equations

Grammars which do not observe the Invariant Order Constraint permit the derivational re-ordering, or permutation, of constituents. In many such grammars, these permutations are specified by directed re-ordering rules of the form

(25) W & X & U & Y & Z -> W & Y & U & X & Z,

where each such rule is, of course, reducible to a corresponding equation of the form

(26) W & X & U & Y & Z = W & Y & U & X & Z.

If permutation were viewed as a single elementary relation or derivational process, it is evident that every instance of (25) or (26) would directly contradict the defining non-commutativity axiom for ordering, (X & Y ≠ Y & X), which asserts that (26) is false for any possible values of its variables and thus that any


inference instruction such as (25) is necessarily invalid. But it is clear that permutation is actually not an elementary transformation of any known theory of grammar, since the derivational function of any possible permutation rule will always be precisely equivalent to that of an ordered sequence of instances of the independently motivated elementary transformations of (1) identity adjunction, or iteration; (2) grouping or regrouping (of the adjunct); (3) ordering (of the adjunct), and, either before or after ordering, (4) identity deletion (of the source of iteration). Thus a permutation equation such as (27)

((S) & (ADV)) = ((ADV) & (S))

is at most an abbreviation for the set of non-permutational equations

(28) (1) (ADV) = (ADV, ADV)   (COPYING)
     (2) ((S) & (ADV, ADV)) = ((ADV), ((S) & (ADV)))   (REGROUPING)
     (3) ((ADV), ((S) & (ADV))) = ((ADV) & ((S) & (ADV)))   (ORDERING)
     (4) ((ADV) & ((S) & (ADV))) = ((ADV) & ((S))),   (DELETION)

which justify, for example, from the structure underlying (29a) the sequential derivation of the structures underlying (29b-d):

(29) (a) It was raining yesterday
     (b) It was raining yesterday, yesterday   (28.1)
     (c) Yesterday, it was raining yesterday/then   (28.2, 28.3)
     (d) Yesterday (,) it was raining   (28.4)

The complex nature of derivational reordering has been widely recognized, of course, and its reduction to adjunction (with grouping and ordering) and deletion has been explicitly noted by Rosenbaum and Lochak (1966) and others. However, there is an even simpler and more natural reduction of reordering in terms of the elementary relation of ordering alone. Thus the derivational function of any directed reordering rule

(30) X & Y -> Y & X

is precisely the same as that of the sequence of directed rules

(31) (i) X & Y -> X, Y
     (ii) X, Y -> Y & X.

These directed inferences presuppose and are justified by the respective ordering equations

(32) (i) (X, Y) = (X & Y)
     (ii) (X, Y) = (Y & X)

But it can readily be seen that these equations subsume not only the directed permutation (30) but also the directed rule (33) which generates the possible inputs to (30) and is thus necessarily presupposed by it:

(33) X, Y -> X & Y

Thus for any construction (X, Y) both the ordering (X & Y) and the ordering (Y & X) will always be derivable without the use of reordering rules, and these derivations will be justified by exactly the same pair of simple ordering equations which would be necessary to justify any weakly equivalent derivations by means of reordering. When reordering is thus reduced to ordering it follows that the difference between grammars which observe the Invariant Order Constraint and those which violate it cannot possibly consist of a difference in RULES, since exactly the same equations (e.g. (32)) that are necessary and sufficient for the invariant derivation of alternate orderings of any given constituents are also necessary and sufficient for their derivation by permutation-effecting sequences of inferences such as (31i-ii). Rather, the difference is clearly shown to involve a difference in the way rules are USED to prove a given theorem, along with the obvious difference in the standards of logical consistency which are imposed on all proofs. Thus, to observe the invariant order condition and thereby avoid all contradictions of the non-commutativity axiom (A & B ≠ B & A) in the lines of any proof, it is not possible to use both (32i) and (32ii) in the derivation of any given structure. Conversely, in order to violate the invariance condition and contradict the non-commutativity axiom, it would be necessary to make use of both (32i) and (32ii) in the derivation of the same structure. Now in order to make use of both of these rules in a single derivation it is clear that at least one of them must be used at least TWICE. Thus (34) and (35) are the shortest possible derivations from (X, Y) which are justified by (32) and which include both (X & Y) and (Y & X): (34)

(a) (X, Y)   (given)
(b) (X & Y)   (32i)
(c) (X, Y)   (32i)
(d) (Y & X)   (32ii)

(35) (a) (X, Y)   (given)
     (b) (Y & X)   (32ii)
     (c) (X, Y)   (32ii)
     (d) (X & Y)   (32i)

It is not the case, of course, that repeated use of the same rule will always result in a violation of invariant ordering. Thus, for example,

(36) (a) (X, Y)   (given)
     (b) (X & Y)   (32i)
     (c) (X, Y)   (32i)
     (d) (X & Y)   (32i)

However, it will always be the case that for any theorem which has a valid proof that requires inverse use of the same rule in the derivation of the same structure there will also be a valid proof that consists of fewer lines and that has no derivational retracings or inverse inferences. Thus, for the theorem which is repetitively proven by (35) and (36), there is the shorter non-repetitive proof (37)

(a) (X, Y)   (given)
(b) (X & Y)   (32i)
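The contrast between derivations (34), (36), and (37) can be sketched in a few lines of Python. This is only an illustrative encoding and not part of the theory's own formalism: an unordered construction (X, Y) is modelled as a frozenset, an ordered construction (X & Y) as a tuple, and the helper names `violates_invariant_order` and `is_repetitive` are hypothetical.

```python
# Hypothetical encoding (an assumption of this sketch):
#   (X, Y)   unordered grouping -> frozenset
#   (X & Y)  ordered grouping   -> tuple

def unordered(x, y):
    return frozenset((x, y))

def ordered(x, y):
    return (x, y)

def violates_invariant_order(lines):
    """True if one line orders A before B and another line orders B before A."""
    seen = set()
    for line in lines:
        if isinstance(line, tuple):
            a, b = line
            if (b, a) in seen:
                return True
            seen.add(line)
    return False

def is_repetitive(lines):
    """True if the same representation occurs on more than one line."""
    return len(set(lines)) < len(lines)

# Derivations (34), (36) and (37) from the text.
proof_34 = [unordered("X", "Y"), ordered("X", "Y"),
            unordered("X", "Y"), ordered("Y", "X")]
proof_36 = [unordered("X", "Y"), ordered("X", "Y"),
            unordered("X", "Y"), ordered("X", "Y")]
proof_37 = [unordered("X", "Y"), ordered("X", "Y")]
```

Under this encoding, (34) is both repetitive and order-violating, (36) is repetitive but order-preserving, and (37) is neither.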


Every non-repetitive proof will necessarily observe the invariant order constraint. Since such proofs have an obvious simplicity value, it is reasonable to assume that any adequate theory of grammar will determine by its principles for the selection of computationally optimal proofs the preference of non-repetitive proofs over repetitive ones for any possible theorem. The set of such non-repetitive proofs is in fact automatically determined by the independent empirically-motivated principle of maximalization of terminality specified at the beginning of the present chapter. This universal inference principle for equational grammar could thus be said to explain why the Invariant Order Constraint has not only an independent logical basis in the non-commutativity axiom for ordering but also an independent empirical basis in its necessity and sufficiency for the principled explanation of all known facts about ordering in natural languages.

The preceding remarks reveal once again that there are certain interesting relations between grammatical inferences and proofs which can be readily stated and explained in terms of a theory which assumes the underlying equational nature of all principles of grammar. Many of the questions that have been raised here (concerning the reduction of permutation to ordering, the relation between inverse use and repetitive derivations, the general question of the relationships between grammatical principles and the inferences and proofs which they justify) would evidently have no source or basis in the context of any theory which assumes that non-equational directed inferences are the highest level hypotheses of grammar.

4.1.6 Grouping and Regrouping Axioms

All directed regrouping transformations of the forms

(38) (a) ((A), (B), X) -> ((A, (B)), X)   (INTERPOSITION)
     (b) ((A, (B)), X) -> ((A), (B), X)   (EXTRAPOSITION)

are clearly subsumed by associativity, or alternative grouping, equations of the form

(39) ((A), (B), X) = ((A, (B)), X).

Regrouping has not generally been recognized as an elementary transformation, and grouping has usually been treated as a necessary (and hence predictable) component of the operation of adjunction. It is nevertheless logically possible to interpret an adjunction transformation, such as

(40) (SYLLABIC, X) (NASAL, Y) -> (SYLLABIC, X, NASAL) (NASAL, Y),

as a set of logically-independent transformations of (self-)adjunction and regrouping, jointly effecting sequential derivations, such as

(41) (a) (SYL, X) (NAS, Y)   (given)
     (b) (SYL, X) (NAS, NAS, Y)   (idempotency, i.e. (X, X) = (X))
     (c) (SYL, X, NAS) (NAS, Y)   ((SYL) (NAS, NAS) = (SYL, NAS) (NAS))

Moreover, while the subordinate rather than coordinate grouping of adjuncts relative to the constituents to which they are adjoined could be properly assured by a metacondition on adjunction for such cases as (42) and (43),

(42) S(N(W), X, S(N(W), Y
     -> S(N(W), X, S(N(W), S(N(W), X), Y)   (identity adjunction)
     -> S(N(W), X), S((N(W), (N(W), S(N(W), X))), Y)   (metacondition)

(43) NEGATIVE, (V, X)
     -> NEGATIVE, (NEGATIVE, V, X)   (identity adjunction)
     -> NEGATIVE, (V(NEGATIVE, V), X)   (metacondition)

it would be necessary to predict coordinate rather than subordinate grouping for other cases of adjunction, such as that of the vowel nasalization rule (40). It is possible nevertheless that the correct differential predictions might be adequately derived from the universal equations

(44) (i) Subordinative Grouping: ((A), ((B), (A), X)) = ((A), (((B), ((B), (A))), X))

(HUMAN, ANIMATE)

can appropriately apply to the non-phonological construction (N, HUMAN, MALE) in any phonological-element-free representation of the sentence A man arrived yesterday. And the phonetic redundancy rule (2)

(NASAL) -> (NASAL, VOICED)

can appropriately apply to the nasal segments of the construction ((NASAL, LABIAL) (SYLLABIC, FRONT, LOW) (NASAL)) in any semantic-element-free representation of the same sentence. The two representations of the morpheme man in English can thus be mutually exclusive in every line of every derivation. They MUST be universally mutually-exclusive, moreover, if we are to explain why natural languages have numerous general rules such as (1) and (2), but NO modally-mixed rules such as (3)

*(HUMAN) -> (HUMAN, NASAL)
*(HUMAN) -> (HUMAN, MALE)/(___, man)
*(NASAL) -> (NASAL, ANIMATE)
*(NASAL) -> (NASAL, LABIAL)/(___, (HUMAN, MALE))

This fact is explained by the metatheoretical definitions of terminal representations and the assumption that all grammatical statements are equations, since the mutual exclusivity of the two


modes of representation of the morpheme man, for example, follows necessarily from the empirically necessary lexical rule (4)

(N, HUMAN, MALE) = ((NASAL, LABIAL), (SYLLABIC, FRONT, LOW), (CODA, NASAL))

The existence of a lexicon, as a set of axiomatically bimodal representations of morphemes, is thus not only IRRELEVANT to the proper specification of redundancy rules but also INCOMPATIBLE with any principled explanation of their universally monomodal character. The notion of a lexicon is equally irrelevant with respect to the proper specification of co-occurrence relations and equally incompatible with any reasonable explanation of the fact that phonological information is NEVER relevant to the characterization of semantic and syntactic well-formedness, a characterization which can be sufficiently effected in terms of the well-defined relations of inclusion and contradiction with respect to the predicates and arguments of terminal semantic representations.

In Chomsky (1965), however, it is proposed that correct and incorrect collocations of predicates and their arguments can be appropriately differentiated in terms of a general principle of lexical insertion. In essence this principle requires that lexical entries be introduced into constituents of partially-formed representations of sentences if and only if the properties of a lexical entry are not distinct from those of the generic constituents they are introduced into and in addition, for the lexical entries of verbs and adjectives, if their inherent argument co-occurrence properties are not distinct from the actual generic properties of the arguments of the predications that they are introduced into. Thus, for example, Chomsky's lexical insertion theory of collocation would account for the contrast between the collocationally well-formed sentences (5)

(a) John admires sincerity
(b) Sincerity frightens John

and the collocationally deviant ones

(6) (a) *Sincerity admires John
    (b) *John frightens sincerity

as a consequence of the fact that the conditions for the proper insertion of the predicate lexical entries (7)

(a) ((admire), (V, ((ANIMATE) — (N)),...))
(b) ((frighten), (V, ((N) (ANIMATE)),...))

are satisfied in the case of (5) but not in the case of (6). It can readily be seen that the phonological members of these hypothetical lexical entries are entirely irrelevant to the specification of proper collocations. But this means then that the notions of 'lexical entry' and 'lexical insertion' must also be entirely irrelevant here, and that an adequate theory of predicational well-formedness can thus be achieved in terms of a general theory of grammar which has a simpler and much more narrowly-constrained set of axioms and statement-types than that proposed by Chomsky. In place of Chomsky's metarule for the formational insertion of lexical entries, this theory would have a universal metacondition on the well-formedness of groupings of predicates and their arguments in terminal semantic representations. In essence, this condition will express the general principle that a sentence is collocationally well-formed if and only if there are no contradictions between its underlying predicates and their arguments. Some of the problems involved in achieving a correct formulation of this principle are discussed in detail in Sanders (1967), where it is shown, for example, that ordering and contextual features are neither necessary nor sufficient for an adequate general characterization of predicational contradiction, and that such a characterization can readily be effected in terms of the mutual violation of a synthetic disjunction by a predicate and one of its arguments, where a disjunctive, or antonymy, relation (cf. Katz, 1964; Bierwisch, 1969) holds between elements A and B if and only if there is an element G such that the statements (A, -B) = (G, (A, -B)) and (-A, B) = (G, (-A, B)) are both true.


As shown in Chapter 2, there is a principled conflation schema for such disjunctive statements in the form of a redundancy axiom (8)

(A/B/...N) = (G, (A/B/...N)),

where slash stands for the metatheoretical operator or relation of alternative denial. Moreover, it is appropriate to view this schema as asserting not merely that there are representations of the forms (9)

(a) (G, A, X) [where X does not include B or... or N]
(b) (G, B, X) [where X does not include A or... or N]

but also that these are equivalent to the representations (10)

(a) (G, A, (NEG, B),..., (NEG, N))
(b) (G, B, (NEG, A),..., (NEG, N)).

Representations like the latter suffice, as shown by Katz (1964), as a basis for the uniform characterization of all contradictions by the formal relation (argument including X, predicate including (NEG, X)). Thus, for example, the contradictory sentence (11)

Red is green

is formally characterized as such by the fact that when a phonetic representation of this sentence is substituted for the phonetic variable of the theorem schema for contradiction, (12)

(a) = ((N, (A, (B))), (PRED, (A, (NEG, B))))

the resulting theorem has a valid proof by means of a grammar of English. Aside from the previously-mentioned general rule for subject-predicate ordering (Sec. 4.1.6, Equation 59.ii) and the ordinary lexical statements (13)

(i) (RED) = (red)
(ii) (GREEN) = (green)

this proof depends only on the presumably well-motivated redundancy rule (14)

(RED/GREEN/BLUE/...) = (COLORED, (RED/GREEN/BLUE/...))


From this rule it follows that (15)

(a) (RED) = (COLORED, (RED, (NEG, GREEN),...))
(b) (GREEN) = (COLORED, (GREEN, (NEG, RED),...))

and thus that the terminal semantic representation of (11) must be equivalent to (16)

((N, COLORED, (RED, (NEG, GREEN),...)), (PRED, (COLORED, (GREEN, (NEG, RED),...))))
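The step from the redundancy rule (14) to the expansions (15) and the contradictory representation (16) can be illustrated with a small Python sketch. It rests on simplifying assumptions that are not in the text: the open-ended list of alternatives in (14) is cut down to three colour terms, each representation is flattened into a pair of sets (features asserted, features denied), and the names `expand` and `is_contradictory` are hypothetical.

```python
# Assumption: only the three alternatives named in rule (14) are modelled;
# the rule itself leaves the list open-ended.
COLOR_TERMS = ("RED", "GREEN", "BLUE")

def expand(term, alternatives=COLOR_TERMS, genus="COLORED"):
    """Sketch of (15): a term asserts its genus and itself,
    and denies (NEG) every other alternative of the same rule."""
    asserted = {genus, term}
    denied = {alt for alt in alternatives if alt != term}
    return asserted, denied

def is_contradictory(argument, predicate):
    """Sketch of schema (12): the argument includes some X
    while the predicate includes (NEG, X)."""
    arg_asserted, _ = argument
    _, pred_denied = predicate
    return bool(arg_asserted & pred_denied)

# 'Red is green': argument RED, predicate GREEN.
red_is_green = is_contradictory(expand("RED"), expand("GREEN"))
# 'Red is red': no feature of the argument is denied by the predicate.
red_is_red = is_contradictory(expand("RED"), expand("RED"))
```

Here `red_is_green` comes out true because RED is asserted by the argument and denied by the predicate, which mirrors the pairing of (RED) with (NEG, RED) in (16).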

Substitution of variables for the constants of this representation results in a representation which is identical to the semantic member of the theorem schema for contradiction (12). All collocationally ill-formed sentences are contradictory and their ill-formedness is characterized in the same way with respect to the same schema for contradictions. Thus the ill-formed sentences (6) are formally differentiated from the well-formed sentences of (5) by the fact that a grammar of English will demonstrate valid contradiction theorems of the form (12) with respect to the former but not with respect to the latter. To show this we need only assume the lexical rules (17) and the predictive redundancy rule (18) : (17)

(i) (John) = (N, ANIMATE,...)
(ii) (sincerity) = (N, ABSTRACT,...)
(iii) (admires) = (PRED, (AGENT, ANIMATE), (OBJECT),...)
(iv) (frightens) = (PRED, (AGENT), (DATIVE, ANIMATE),...)

(18) (ANIMATE/ABSTRACT/LOCATIVE/...) = (N, (ANIMATE/ABSTRACT/LOCATIVE/...))

After the application of these rules (along with those effecting order-elimination and other presently irrelevant general processes) to the phonetic representations of (5) and (6), the representations of these sentences will be as follows :

(5') (a) ((AGENT, N, ANIMATE, (NEG, ABSTRACT),...), (PRED, (AGENT, N, ANIMATE, (NEG, ABSTRACT)...), (OBJECT),...), (OBJECT, N, ABSTRACT, (NEG, ANIMATE),...))
     (b) ((AGENT, N, ABSTRACT, (NEG, ANIMATE),...), (PRED, (AGENT), (DATIVE, N, ANIMATE, (NEG, ABSTRACT)...), (DATIVE, N, ANIMATE, (NEG, ABSTRACT),...))

(6') (a) ((AGENT, N, ABSTRACT, (NEG, ANIMATE),...), (PRED, (AGENT, N, ANIMATE, (NEG, ABSTRACT),...), (OBJECT),...), (OBJECT, N, ANIMATE, (NEG, ABSTRACT),...))
     (b) ((AGENT, N, ANIMATE, (NEG, ABSTRACT),...), (PRED, (AGENT), (DATIVE, N, ANIMATE, (NEG, ABSTRACT),...)), (DATIVE, N, ABSTRACT, (NEG, ANIMATE,...))
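The way the lexical rules (17) and the redundancy rule (18) separate (5) from (6) can be sketched as follows. The sketch is a deliberate flattening and makes assumptions beyond the text: each noun carries a single class, the classes of (18) are treated as mutually exclusive (so that a mismatch stands in for an X paired with (NEG, X)), and the table and function names are hypothetical.

```python
# (17i-ii): noun classes. Assumption: one class per noun.
NOUNS = {"John": "ANIMATE", "sincerity": "ABSTRACT"}

# (17iii-iv): the class each predicate requires of a given argument role.
REQUIREMENTS = {
    "admires": {"AGENT": "ANIMATE"},
    "frightens": {"DATIVE": "ANIMATE"},
}

# Role of the second argument, following (17iii-iv).
SECOND_ROLE = {"admires": "OBJECT", "frightens": "DATIVE"}

def is_well_formed(first, verb, second):
    """True if no argument belongs to a class that the predicate excludes.
    Because the classes in (18) are assumed mutually exclusive, a mismatch
    with a required class counts as a contradiction in the sense of (12)."""
    roles = {"AGENT": NOUNS[first], SECOND_ROLE[verb]: NOUNS[second]}
    return all(roles[role] == required
               for role, required in REQUIREMENTS[verb].items())

results = {
    "5a": is_well_formed("John", "admires", "sincerity"),
    "5b": is_well_formed("sincerity", "frightens", "John"),
    "6a": is_well_formed("sincerity", "admires", "John"),
    "6b": is_well_formed("John", "frightens", "sincerity"),
}
```

No lexical entry or insertion rule appears in the sketch; the check consults only non-phonological features, which is the point the surrounding argument makes.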

Since the representations of (6') both include contradictions between co-generic constituents of a predicate and one of its arguments (as indicated by the italics), they are both analyzable as instances of the semantic member of the contradiction schema (12). Neither of the representations of (5') can be analyzed in this way.

The preceding remarks provide only a highly schematic view of the general theory of collocational well-formedness suggested in Sanders (1967), and they fail to suggest either the full range of motivations and discriminatory powers of the theory or the full range of interesting unsolved problems which it raises. Further discussion would be inappropriate here, however, since we have already gone beyond the limits dictated by our present purpose, which is simply to show that collocational well-formedness can be adequately characterized by theories which do not include lexical entries or lexical insertion rules in their inventory of possible linguistic statement-types. This much could actually have been demonstrated in a much simpler way, moreover, since neither


the formational character of Chomsky's metatheory (1965), nor his postulation of context features, nor any other essential aspect of his treatment of collocation depends on the existence of lexical entries and lexical insertion rather than ordinary equational lexicalization. Thus, preserving all non-lexical features of Chomsky's theory, we could simply add to his branching and generic subcategorization base rules an additional set of specific subcategorization rules, and, in place of the pre-cyclic lexical insertion metarule, a pre-cyclic agreement metarule, roughly of the form (19)

S(X & (PRED) & Y) = S(X & (PRED, X Y)) & Y).

If we assume then that the lexical rules for admire and frighten are equations between the phonological and non-phonological members of the lexical entries of (7), it is clear that admire, whose non-phonological equivalent includes (ANIMATE N), could not be substituted for the predicate of (6a), which, after the application of (19), would be (PRED, V, (ABSTRACT N)). Thus no phonetic representation would be assigned to this or any other collocationally deviant sentence, and only those sentences which are collocationally well-formed (such as (5)) will have possible phonetically terminated derivations. There are also other possible non-lexical and non-componential systems for the characterization of predicational well-formedness, but it is evident that all of these will also be compatible with the constraints of the theory of equational grammar in its narrowest form. Since these systems can account for all co-occurrence restrictions that can be accounted for by Chomsky's lexical-entry system, and since they also account for the monomodal character of grammatical rules (a fact which cannot be explained at all by the lexical-entry system), it can only be concluded that equational grammar is of greater explanatory value than any type of grammar which permits the postulation of bimodal lexical entries, or any other types of lexical statements that are distinct from the ordinary lexical axioms of equational grammars.


4.3.2. The Reduction of Semantic Amalgamation

The 'semantic projection rules' of Katz and Fodor (1963) require little discussion, since these semantically-directed transformations are not only directly reducible to equations but are also based on an arbitrary and empirically unmotivated differentiation of semantics and syntax and on the postulation of bimodal dictionaries which, as shown above, have no essential function in explanatory theories of language. For present purposes, it will be sufficient to briefly demonstrate the reducibility and empirical non-necessity of these rules, ignoring the general inadequacies of the linguistic metatheory they presuppose, many of which have already been amply discussed in Weinreich (1966), McCawley (1968) and various other published and unpublished papers.

As defined by Katz and Fodor (1963), "projection rules amalgamate sets of paths [ordered and grouped bimodal representations] dominated by a grammatical marker by combining elements from each of them to form a new set of paths which provides a set of readings for the sequence of lexical items under the grammatical marker." Since Katz and Fodor's lexical entries are assumed not only to be bimodal but to include context-sensitive disjunctions of semantic representations, the operation of amalgamation serves primarily the function of eliminating the contextually-inappropriate senses of axiomatically polysemous lexical entries by the deletion of those members of their semantic disjunctions whose inherent 'selection restriction', or context, features are not included among the inherent non-contextual features of the constituent that they are in construction with. In addition to deletion, amalgamation effects regroupings (and apparently, unorderings¹⁸) of the semantic representations of the constituents of a construction into a single (presumably unordered) representation of the construction as a whole. By progressive regrouping and elimination of ordering and contextually-inappropriate disjunctions, the semantic representations of larger constructions are progressively derived from those of their constituents until a single representation (which may still include disjunctions, however) is associated with the largest construction in the given object, which will be, in the case of non-deviant objects, a non-null representation of the meaning of a sentence. Thus, given lexical entries of the general form

(20) ((A), ((MEANING1 in the context (X1))/(MEANING2 in the context (X2))/.../(MEANINGn in the context (Xn))))

every projection rule for non-coordinate constructions (which we will assume to be uniformly represented as A(A, B), where B is the attribute) will be an instance of the following semantically-directed rule schema, where (21a) is the generalized structural description and (21b) the generalized output description, and where the elementary transformational relations are clearly those of deletion and regrouping:

(21) (a) (A(A((a), (X, Y), (B((b), (W, X in the context Y)/(Z in the context -X, -Y)))))
     (b) (A((a, b), ((X, Y), W)))

If any statement of this form is necessary to justify the proof of any true linguistic theorem, then its converse will also serve to justify the proof of some true theorem, and both statements will of course be reducible to a more general and heuristically-neutral equation of the form ((21a) = (21b)). Similarly all projection rules for coordinative constructions, which we will assume to be uniformly represented as A(A, A), will follow as special cases of the general idempotency and regrouping equation:

(22) (A(A((x), (X))), (A(A((y), (Y))))) = ((A((x, y), (X, Y))))

¹⁸ It is unclear, from Katz and Fodor's discussion, precisely what relations are presumed to hold between the constituents of semantic representations. In all their actual examples of projection rules, though, it appears that ordering is preserved with respect to the (semantically-irrelevant) phonological constituents of amalgamated structures but is always eliminated with respect to their non-phonological constituents.

It is evident then that, aside from their bimodality, projection


rules are formally indistinguishable from any ordinary grammatical rule justifying deletion and regrouping. Moreover, since phonological information is obviously just as irrelevant to amalgamation as it is to the specification of proper collocations, and since even for Katz and Fodor the 'carrying' of phonological representations in instances of (21) and (22) serves no function other than that of expository reference, it is clear that the phonological parts of all projection rules can be eliminated, and that such rules could thus be fully reduced to the ordinary schemata of equational grammar in its most restricted form. However, as soon as the semantic irrelevance of phonological representations is made explicit, the Katz and Fodor model of grammar collapses. For, if bimodal representations are as unnecessary for amalgamation as for the specification of all other linguistic processes or relations, then there can be no possible justification for the assumption of axiomatically bimodal representations of morphemes, that is, for a dictionary or lexicon in the sense of Katz and Fodor or Chomsky (1965). But if there are no bimodal lexical entries, there can be no possible justification then for assuming that there are disjunctions in the semantic representations of morphemes, an assumption which would be ad hoc and highly dubious even if bimodality COULD be justified (see, e.g., Weinreich, 1966; McCawley, 1968). The entire Katz and Fodor model depends crucially on this assumption, however, since the elimination of members of semantic disjunctions is the only function and raison d'être of projection rules, the ancillary regroupings which they effect being independently unmotivated in their theory and of no discriminatory value at all with respect to the characterization of meanings. 
Thus we must either accept the empirically unacceptable postulation of axiomatically bimodal dictionaries or else reject the Katz and Fodor theory of linguistic description, in which case all of the eminently-reasonable and well-stated goals of this theory must be achieved by other means. Such means are in fact readily available to any non-separatist and non-lexical theory of grammar, including the most narrowly restricted theory of equational grammar.


It has already been shown that equational grammars adequately characterize the semantically-significant relations of synonymy, ambiguity, entailment, analyticity, contradiction, presupposition, and predicational well-formedness. It has also been shown that the successful characterization of these and other 'semantic' relations will NECESSARILY be achievable by any theory which successfully characterizes the central 'grammatical' relation of simple symbolic equivalence between the sounds and meanings of linguistic objects. It will thus always be the case that the achievable goals of any purportedly distinct theory or sub-theory of semantics will be properly included in those of an ordinary theory of equational grammar. The explanatory domain of semantics and empirical logic, in other words, is properly included in the domain of linguistics, and all theories of semantics are reducible to internally-homogeneous theories of grammar.

There is one specified function of Katz and Fodor's projection rules, however, which has not yet been explicitly dealt with; this is the function of explaining how "a speaker employs the syntactic structure of a sentence to determine its meaning as a function of the meanings of its lexical items" (Katz, 1964 : 521), and how he can "disambiguate parts of a sentence in terms of other parts and thereby determine the number of [semantic] readings of a sentence" (Katz and Fodor, 1963 : 485). But it is clear that the task of determining the meaning of a sentence from the MEANINGS of its constituents is simply a PART of the more general task of determining its meaning from the SOUNDS of its constituents, and ultimately, of course, from the sound of the sentence AS A WHOLE. The achievement of this more general task is formally characterized, as we have shown, by the valid semantically-terminated proof of a symbolic equivalence theorem with a given phonetic member and unknown semantic member.
Since the proof of all such theorems depends on ALL the rules of the grammar and requires no postulations of bimodal or disjunctive representations of linguistic objects, it is clear that the speaker's ability to determine the meanings of sentences can only be explained by the GRAMMAR AS A WHOLE, and that it will in fact be so-explained


by grammars which are strictly equational and free of any rules for the formation or elimination of disjunctions. The competent speaker's ability to "disambiguate parts of a sentence in terms of other parts", i.e. to determine that a given sound sequence is ambiguous in one context but not in another, is similarly explained by an equational grammar as a whole without recourse to projection rules or any other special devices for semantic disambiguation. Thus the fact that a speaker of English knows that the sounds of ball have two possible meanings in that ball was nice but only one of these in that ball ended late and only the other in that ball weighed five pounds is obviously inseparable from the fact that he is capable of associating a phonetic rendition of the former sentence with either of two contradiction-free semantic renditions, while each of the latter can be associated with only one. The principles of knowledge which explain these facts are precisely the same as those which explain the speaker's ability to recognize the grammaticality of (23a-b) and the contrasting ungrammaticality of (23c-d): (23)

(a) That ball was nice and ended late
(b) That ball weighed five pounds and was nice
(c) *That ball weighed five pounds and ended late
(d) *The ball that ended late weighed five pounds

These principles are expressed by the ordinary and presumably well-motivated lexical equations (24)

(i) (ball) = (N, EVENT,...)
(ii) (ball) = (N, PHYSICAL OBJECT,...)
(iii) (nice) = (PRED,...)
(iv) (weigh) = (PRED, (GENITIVE, PHYSICAL OBJECT),...)
(v) (end) = (PRED, (N, EVENT),...)
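How the lexical equations (24) by themselves yield the readings of ball, with no separate disambiguation device, can be sketched in Python. The sketch assumes, as a simplification not spelled out in (24), that EVENT and PHYSICAL OBJECT exclude each other, so that a predicate requiring one contradicts an argument with the other; the names `BALL_SENSES`, `PREDICATE_REQUIRES`, and `readings` are hypothetical.

```python
# (24i-ii): the two lexical equations for 'ball'.
BALL_SENSES = ("EVENT", "PHYSICAL OBJECT")

# (24iii-v): the class each predicate requires of its argument (None = no requirement).
PREDICATE_REQUIRES = {
    "nice": None,
    "weigh": "PHYSICAL OBJECT",
    "end": "EVENT",
}

def readings(predicates):
    """Senses of 'ball' that contradict none of the given predicates.
    Assumption: EVENT and PHYSICAL OBJECT are mutually exclusive."""
    return [sense for sense in BALL_SENSES
            if all(PREDICATE_REQUIRES[p] in (None, sense) for p in predicates)]

nice_only = readings(["nice"])            # 'that ball was nice'
end_only = readings(["end"])              # 'that ball ended late'
weigh_only = readings(["weigh"])          # 'that ball weighed five pounds'
weigh_and_end = readings(["weigh", "end"])  # (23c)
```

A sentence is ambiguous when more than one sense survives, unambiguous when exactly one does, and deviant, like (23c), when none does.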

From these rules it follows that a contradiction-free semantic representation can be derived from that ball is nice by making use of either (24i) or (24ii), while if the former is used in a semantically-terminated derivation of that ball weighs five pounds


the resulting representation will be formally characterized as contradictory by the presence of (EVENT) in its argument and (NEG, EVENT) in its predicate. The same would hold for the use of (24ii) rather than (24i) in a semantically-directed derivation from that ball ended late, with the resulting predicational ill-formedness being characterized by the argument (PHYSICAL OBJECT) and predicate (NEG, PHYSICAL OBJECT) in the only valid termination for this derivation. Thus a speaker knows whether any utterance or utterance constituent has one or more non-contradictory meanings if and only if he knows the true phonological, non-phonological, and lexical equivalence rules of his language, the ability to disambiguate possibly ambiguous sequences being entailed by and inseparable from the ability to simply speak and understand a language.

4.3.3. The Reduction of Empirical Logic

Linguistic equivalence rules play a central role in all traditional theories of natural, or empirical, logic — that is, theories about the semantic properties and relations of natural language sentences and about the ability of human beings to recognize the validity or invalidity of any inference that can be expressed by these sentences. Thus, for example, the relation of 'simple conversion', a common notion of scholastic Aristotelian syllogistics (see, e.g. Prior, 1962 : 108), is expressed by the equations (25)

No B is an A = No A is a B

(26)

Some B is an A = Some A is a B.

Since "No X is a Y" is synonymous with "It's not the case that some X is a Y", the informal equations (25) and (26), which are essentially expressed in terms of the normal phonetic representations of the four proposition-types in English, can be reduced to the parallel non-phonological equations (27)

(NEG, ((SOME, B), (PRED, A))) = (NEG, ((SOME, A), (PRED, B)))


(28)

((SOME, B), (PRED, A)) = ((SOME, A), (PRED, B))

Since the relation of conversion for existentially-quantified predications holds not only for predicational arguments under the scope of negation but also those under the scope of such other propositional quality predicates as "true", "certain", and "possible" (cf. "It's possible that some B is an A", "It's possible that some A is a B"), the rule of conversion can be generalized by means of the following single equivalence axiom : (29)

((PRED, x) ((SOME, B), (PRED, A))) = ((PRED, x) ((SOME, A), (PRED, B)))

This rule of conversion, which is an instance of the general equational axiom schema for alternative grouping, or associativity, serves to account, for example, for the apparent cognitive synonymy of the phonetic expressions of (30a) and (30b) in all human languages, and for the fact that a competent speaker of any language will recognize the validity of the inferences expressed by the phonetic representations of (30c) and (30d) in that language : (30)

(a) No cat is a canary
(b) No canary is a cat
(c) If it is true that no fat men are happy men, then it must also be true that no happy men are fat men
(d) If it is true that it is possible that some men are immortals, then it must also be true that it is possible that some immortals are men

Similar reductions to grammatical equivalence axioms could be effected, it seems, for all other traditional equivalence statements of empirical logic. Moreover, with respect to all motivated implicational principles such as (31)

(a) Every A is a B ⊃ Some A is a B
(b) It's certain that X ⊃ It's possible that X,

these appear to be invariably reducible to ordinary independently-motivated redundancy equations such as (32)

(a) (ALL) = (ALL, SOME)
(b) (CERTAIN) = (CERTAIN, POSSIBLE),

which, in conjunction with the standard theorem schema for entailment (see Chapter 3), will provide for the valid proof of true entailment theorems for all pairs of linguistic objects that are informally characterized by the given logical implication principle.

I wish to suggest now that there will be no significant principle or relation of empirical logic that cannot be adequately expressed by means of the elements, axioms, and theorems of equational theories of grammar. The possibility of such a reduction of natural logic to natural language grammar follows from the fact that an equational grammar which adequately characterizes the primary grammatical relation of symbolic equivalence will, without the addition of any supplementary elements, statement-types, or conventions, necessarily suffice as well for the characterization of synonymy, contradiction, entailment, and all other relations of traditional interest to students of empirical logic. It might still be maintained, however, that even after reduction there will still be some distinctive difference between those statements which express principles of logic and those which express principles of grammar of the sort which underlie syntactic transformations of the ordinary types. Such a distinction may in fact be already provided by the theory of equational grammar, since it will be observed that the axiom for conversion (29) differs from all other known derivational regrouping axioms in that it cannot justify either an increase or a decrease in terminality with respect to either interpretable alphabet, and thus has no appropriate derivational use according to the maximalization of terminality principle.
If this is not accidental to the particular formulation of conversion here, and if a similar derivational nonutility is found to characterize all other axioms expressing traditional logical equivalences, then this characteristic would suffice to differentiate logical axioms from all other rules of grammar, and it would follow — quite correctly, I believe — that the linguistic objects that are related by principles such as conversion


have no common terminal semantic representation and are thus characterized as NON-SYNONYMOUS, the asserted equivalence then being appropriately interpreted as the weaker relation of argumentative or inferential equivalence, rather than the stronger and more specific relation of meaning-identity. I wish to suggest now that there is no other possible principled basis for the differentiation of logical and syntactic rules. Thus, in terms of the elementary transformations which they justify, these are precisely the same for logical equivalence axioms and those axioms which justify such traditional phonetically-directed syntactic processes as Subject Raising, Passive Formation, Negative Incorporation, Particle Movement, and Clause Extraposition, as well as such typical phonological processes as assimilation, metathesis, and degemination. In non-equational grammars, therefore, there is no plausible basis for differentiating a passive transformation from a conversion transformation in terms of their formal properties as transformations. Frequent efforts have been made to differentiate semantics and logic from grammar on the basis of purported differences in their explanatory functions. Thus far, however, such efforts have been invariably unsuccessful, since the results of linguistic research have repeatedly shown that, apart from certain phonetic features of a language, there are few if any facts about linguistic objects which can be explained in an optimally general way independently of their semantic representations. Thus, for example, in what is perhaps the only real empirically-based effort to differentiate semantic from syntactic theories by function, Chomsky (1957) has suggested that, in the context of a particular formational and categorial model of directed transformational grammar, a grammar of English which characterizes sentences (33a) and (33b) but not (33c) will be simpler than one which characterizes either all three sentences or only (33a): (33)

(a) Colorless clear liquids boil furiously
(b) ?Colorless green ideas sleep furiously
(c) *Furiously sleep ideas green colorless.


The conjunction of this metatheory and a given body of empirical data thus determines, on the basis of total generality of axioms and laws, that two theories are better than one for the optimal explanation of the properties and relations of these sentences, one theory to account for the distinction between (33a-b) and (33c), the other to account for that between (33a) and (33b). In terms of this metatheory, however, collocational well-formedness can only be accounted for by means of a large set of entirely ad hoc particular co-occurrence restrictions between particular classes of linguistic structures. In order to provide any real explanation of these co-occurrence restrictions, it has been necessary to eliminate the notion of categorized atomic morphemes in favor of much less superficial representations which include at least a partial semantic representation of morphemes, with respect to which certain GENERALIZATIONS about collocational well-formedness can then be shown to hold (see Chomsky, 1965). Attempts to achieve progressively more general explanations of co-occurrence relations have required that the relevant explanatory principles be defined on domains which are progressively farther removed from the categorized and complexly grouped and ordered representations of 'syntactic surface structures' or 'syntactic deep structures' and progressively closer to fully-specified terminal semantic representations (see, e.g. Sanders, 1967; McCawley, 1968). Thus we finally reach a point where it appears that what was originally purported to be the distinctive function of GRAMMAR AS OPPOSED TO SEMANTICS (the explanation of collocational well-formedness) must not only be effected in terms of uncategorized, unordered, minimally-grouped SEMANTIC REPRESENTATIONS, but also, for an optimally general achievement of this function, in terms of the traditional SEMANTIC RELATION of contradiction.
Thus in the present state of knowledge the simplest, most general, and most revealing theories of language will NECESSARILY perform the explanatory functions of any purported theory of semantics or logic AS WELL AS those of any purportedly 'non-semantic' theory of grammar. And each of the statements of such theories will have the individual explanatory function of


asserting an empirically-motivated symbolic equivalence relation between two representations of linguistic objects. Phonological, non-phonological, and lexical equations are differentiated in terms of their constituting elements; it might seem likely then that if there is any distinct difference between semantic and syntactic statements, it would also involve some difference in their respective element sets. Although conceivable perhaps with respect to redundancy rules (a rule such as ((N, X) = (NP, (N, X))) being classifiable as syntactic by the non-interpretability of NP, a rule such as ((HUMAN) = (HUMAN, ANIMATE)) as semantic by the interpretability of all its constants), such a distinction cannot be found in the case of rules expressing transformational relations between variables. Thus syntactic transformations seem to be largely if not entirely specifiable by means of an alphabet consisting of free variables and the constants N (noun or nominal argument), PRED (verb or predicative), NEG (negative), and, perhaps, certain quantifiers and the sentential element S (although the latter appears to be generally replaceable by a configuration of N and PRED). However, if these elements are considered to be the distinctive markers of 'syntactic' rules, then traditional 'logical' rules such as that for simple conversion (29) must also be 'syntactic' rules, since these must also typically refer to nominal arguments, predicatives, negativity elements, and quantifiers. Moreover, NEG and quantificational elements are necessarily semantically interpretable, and there is good reason to assume that N and PRED are also so-interpretable. Thus if differentiation in terms of elements is possible at all it will certainly not differentiate syntactic from logical rules in any way which even roughly corresponds to any of the traditional informal conceptions of their distinctness.
Principles of natural logic are normally considered to hold with respect to all rational beings in all possible worlds. Many ordinary grammatical rules, on the other hand, are known to be properly applicable only in the context of certain particular speech communities. A differentiation of the logical and non-logical components of grammars in terms of the scope of application of


their respective rules would thus appear to be both possible and intuitively reasonable, logical rules being characterized then as those which are universal, in the sense of being language-context-free, non-logical rules being those which are non-universal, or language-context-sensitive. Although such a differentiation might be possible, it would require the assumption of a number of ad hoc provisos, and, in any event, will not result in the kind of distinction between logic and grammar which has been claimed or presupposed to exist by various contemporary linguists. First, to make any reasonable use of the universality criterion, it is necessary to exclude from consideration all phonological and lexical rules, since, even if they are universal, no one would wish, certainly, to call statements such as the lexical axiom ((TEA) = ((APICAL, STOP), (FRONT, VOCALIC), (LOW, SYLLABIC))) or the redundancy axiom ((NASAL) = (NASAL, VOICED)) principles of logic. Moreover, if we consider the traditional formulations of many logical equations, we find that these are actually defined not on the semantic representations of linguistic objects but on certain ordered superficial representations which are well-formed only with respect to a certain proper subclass of the set of all possible languages. It might still be said that these rules are universal, though, since all that they assert is that GIVEN a particular pair of structures these structures are always symbolically equivalent. Thus there would be no need to restrict (25), for example, so that it would apply to English but not Thai, the rule being equally true for Thai in spite of the fact that neither of its members would ever be derivable by a grammar of that language.
This 'empty universality' is eliminated, of course, when logical principles are formulated in their simplest, most general, and most revealing form, which will evidently always be with respect to unordered terminal semantic representations, these being available, by definition, as axiomatic representations of the linguistic objects of any language. There are other interesting problems and consequences associated with the effort to differentiate logic from grammar proper


in terms of the universality or non-universality of their respective rules. But for present purposes we need only note that if all universal rules are rules of logic then many of the rules which underlie transformations which have traditionally been called 'grammatical' or 'syntactic' are necessarily rules of logic. Thus, on the basis of all known evidence, most of the equations justifying such syntactic processes as Anaphora Formation, Subordination, Equi NP Deletion, Contrastive Stressing, Passive Formation, Coordination Reduction, etc., are almost certainly universal, and universal in the strongest possible sense of being independently necessary for the optimally general explanation of the facts about each particular language. The universality criterion would thus probably result in a partition of linguistic theories in such a way that most transformational copying, reduction, and regrouping rules would belong to the same component as all traditional principles of logic, this 'syntactic-semantic-logical' component being differentiated by the absence of language-context restrictions on its rules from the rest of non-phonological grammar, which would include syntactic rules chiefly only in the etymologically-basic sense of ordering rules. In short, then, there would appear to be no logical or empirical basis in non-equational grammar for assuming that there is any distinction between what have been called logical or semantic equivalence statements and what have been called optional transformations. Both can be appropriately reduced to monomodal equations justifying idempotency and regrouping transformations, and there is apparently no systematic way to differentiate them in terms of form, function, constituent elements, or scope of applicability.
In view of this, it is not surprising that all attempts to establish theories or subtheories of logic or semantics distinct from the rest of grammar have thus far consistently resulted in the loss rather than gain of explanatory generalizations and in the proliferation of empirically non-significant competitions between the unnecessarily numerous powers and components of the proposed theories.

5 THE EXPLANATORY VALUES OF THE EQUATIONALITY HYPOTHESIS

Hypotheses about natural language grammar are confirmed to the extent that they determine a more restricted characterization of the set of possible natural languages and a larger class of true generalizations about such languages than any hypotheses that are contrary or contradictory to them. Evidence of such confirmation has been presented in the preceding chapters with respect to the general hypothesis of equational grammar and its contrary and contradictory hypotheses of complete and partial non-equationality. Thus it has been shown that the sets of possibly-true theorems, axioms, and possible natural languages generated by an equational theory of grammar are properly included in those generated by any otherwise equivalent theory which requires or permits grammars to specify non-symmetrical rather than symmetrical relations between linguistic representations. It has also been shown that all of the empirically-motivated types of directed rules and constraints of traditional non-equational grammars can be reduced to more general equational statements drawn from a highly-restricted class of equivalence and non-equivalence axioms whose appropriate uses for the justification of inferences in the proofs of linguistic theorems are determined wholly by universal principles. Since these universal principles are logically incompatible with the axiomatic specifications of rule-specific and language-specific conditions on inferential use that are inherent to all non-equational rules of grammar, and since they exclude from the class of possible natural languages a large class of clearly non-natural


languages whose principled exclusion could not otherwise be achieved, the constraints which the equationality hypothesis imposes on grammatical statements are thus found to be not only consistent with the known facts about particular languages but also necessary for any principled explanation of the known limits on their range of possible variability. On the basis of all available evidence, therefore, the equationality hypothesis can justifiably be claimed to possess a degree of explanatory generality and range of necessary explanatory values which is inherently unattainable by any grammatical theory that is not consistent with this hypothesis. In the light of present knowledge, in other words, it is appropriate to maintain that there can be no possible adequate theory of natural language grammar that is not an equational theory of grammar. I would like to conclude now by summarizing some of the principal distinctive empirical implications of the hypothesis of equational grammar and some of the primary evidence for its claimed superiority relative to the contrary and contradictory hypotheses of non-equational grammar as a basis for the principled explanation of natural-language data and as a catalyst for significant research in linguistics. The hypothesis of equational grammar is based on the intuitively quite natural assumption that linguistic objects are differentiated from all other objects of the universe by the fact that each distinct linguistic object is associated with a distinct set of finite representations, such that one and only one member of the set is a terminal semantic representation, whose elements are all interpretable into observation statements about cognitive states or events, and one and only one member of the set is a terminal phonetic representation, whose elements are all interpretable into observation statements about articulatory states or events.
The relation that holds between any two members of such a set is the relation 'represents the same linguistic object as', a binary relation that is symmetric, transitive, and reflexive — and hence a member of the well-defined class of equivalence relations. The equationality hypothesis asserts that this relation of symbolic


equivalence is the only primitive relation in theories of language, and that there is no empirically significant statement about any natural language which cannot be appropriately expressed in terms of this relation. More precisely, then, an equational theory of grammar is any theory that imposes the following three conditions on the axioms, theorems, and valid proofs of all particular grammars of natural languages:
(1) For any natural language L, G is a possible grammar of L only if G consists of a finite number of statements such that each statement asserts a relation of either equivalence or non-equivalence between two linguistic representations. (Restricted sub-types of equational theories are determined by the imposition of additional empirically-motivated constraints on equational axioms, such as those determining the four-axiom-type theory outlined in Sec. 4.2.1.)
(2) For any grammar G, T is a valid theorem of G if and only if T is of the form (X₁ = Y₁) and there is a finite sequence of finite equations (X₁ = Y₁, ..., Xₙ = Yₙ), such that X₁ is a terminal semantic representation and Y₁ is a terminal phonetic representation, Xₙ and Yₙ are identical terminal representations, and, for any pair of adjacent equations in the sequence, the two equations are identical except that one of the members of the first is of the form WAZ and one of the members of the second is of the form WBZ, and there is included in G some statement of the form (A = B), and there is no universal principle of inference that precludes the appropriate substitution of B for A in proofs terminating with equations in the terminal alphabet of Xₙ and Yₙ; and where, moreover, for any (adjacent or non-adjacent) equational members Xₘ and Xₚ or Yₘ and Yₚ of the forms WAZ and WBZ, there is no statement of the form (A ≠ B) included in G.
(3) For any attribute P, P is a distinct linguistically significant property or relation if and only if there is a distinct universal theorem schema of the form (w = X), for property attributes, or (w = X; y = Z), for relational attributes, such that P is an


attribute of a linguistic object or pair of linguistic objects in any language L if and only if the grammar of L demonstrates valid theorems for these objects, and these theorems are instances of the given theorem schema. (A further formal restriction of the class of linguistically-significant attributes of linguistic objects can be effected, as shown in Sec. 3.3.5, by requiring that the phonetic members of all theorem schemata must be unrestricted variables for any possible terminal phonetic representation.)
Any theory that incorporates these conditions will also include a specification of the vocabulary of elements for all grammatical representations and their interpretations, a specification of the interpreted theorem schemata that formally define the attributes of grammaticality, synonymy, ambiguity, etc., and a specification of the principles of inferential precedence and terminal well-formedness which impose universal synthetic conditions on the appropriate use of statements as justification for the inference of one representation from another in the valid proofs of linguistic theorems. Given these three inventories of substantive universals, along with the three formal equationality conditions on grammatical axioms, proofs, and theorem-types, there are effective procedures, of the general types discussed in Chapter 2, for generating a highly-restricted particular infinite set of possible grammars, and, for each such grammar, an infinite set of valid interpretable theorems expressing testable empirical claims about the existence and nature of some particular infinite set of possible linguistic objects. Particular grammars are thus formally defined by the equationality hypothesis as finite sets of equivalence and non-equivalence axioms which are necessary for the proof of particular sets of equivalence theorems that can be mapped into empirically-vulnerable claims about particular languages by means of such universal theorem-schemata as the following: (1)

(a) A = b ((A, b) is grammatical; A is a meaning of b; b is an expression of A)
(b) A = b; A = c (b and c are synonymous)
(c) A = b; C = b (b is ambiguous)
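How the schemata in (1) map a set of proved theorems onto attributes can be shown with a minimal sketch. It assumes, purely for illustration, that each valid theorem is recorded as a (meaning, sound) pair of opaque labels; the function name `classify` and the labels A, C, b, c are hypothetical stand-ins for terminal semantic and phonetic representations.

```python
from collections import defaultdict


def classify(theorems):
    """Apply schemata (1b) and (1c) to an iterable of (meaning, sound) pairs,
    each pair standing for one proved theorem of the form A = b."""
    sounds_of = defaultdict(set)    # meaning -> sounds that express it
    meanings_of = defaultdict(set)  # sound -> meanings it expresses
    for meaning, sound in theorems:
        sounds_of[meaning].add(sound)
        meanings_of[sound].add(meaning)
    # (1b): A = b; A = c  means b and c are synonymous.
    synonymous = {frozenset(s) for s in sounds_of.values() if len(s) > 1}
    # (1c): A = b; C = b  means b is ambiguous.
    ambiguous = {b for b, ms in meanings_of.items() if len(ms) > 1}
    return synonymous, ambiguous


# Three hypothetical theorems: A = b, A = c, C = b.
synonymous, ambiguous = classify([("A", "b"), ("A", "c"), ("C", "b")])
```

Under these assumptions no attribute is stored in the grammar itself: synonymy and ambiguity are read off the set of theorems, which is the point of defining them by schema.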

This means that there can be no non-universal generalizations about natural language which are not simple assertions of representational equivalence or non-equivalence. The equationality hypothesis thus generates the claim that the set of possible natural languages is such that the members of this set can differ from each other only to the extent that such differences follow from the assumption of different sets of equivalence and non-equivalence axioms. This highly restrictive claim about the nature and range of variability of human language is logically incompatible, of course, with any theory of language which permits particular grammars to include assertions of non-symmetric or anti-symmetric relations between linguistic representations. Its apparent consistency with what is presently known about the similarities and differences between actual natural languages would thus serve to confirm the hypothesis of equational grammar and to disconfirm its contrary and contradictory hypotheses of partial or complete non-equationality. The general hypothesis of equational grammar has also been shown to provide a principled basis for the assumption of other universal principles of grammar which determine further empirically-motivated reductions of the class of possible natural languages and progressively more adequate explanations of the observed properties and relations of such languages. 
Thus, by requiring the explicit reduction of all of the rules and constraints of traditional non-equational grammars to the equational statements from which they are deductively derivable, it becomes immediately apparent that many of the numerous distinctly-formulated statement-types and distinctly specified components and componentially-restricted conventions of non-equational grammar are actually empirically non-distinct, their gross identities in derivational function and explanatory significance having simply been obscured by the arbitrarily non-homogeneous notational conventions and empirically unmotivated statement-specific


inference conditions posited by most of the more standard theories of non-equational grammar.1 It has been found thus far, in fact, that all of the empirically-defensible instances of the various statement-types of non-equational grammars can be reduced with significant gains in generality and explanatory value to instances of one or another of the following four equational statement-types: (2)

(a) Redundancy axioms; adjunction-deletion of constants
    e.g. (1) (NASAL) = (NASAL, VOICED)
         (2) (RED/GREEN/...) = (COLORED, (RED/GREEN/...))
(b) Lexical axioms; intermodal substitution
    e.g. (1) (YOUNG, MALE, HUM) = ((LAB, STP, VCD) (VOC, BK) (VOC, FR))
         (2) ((DET), (NOUN)) = (DET) & (NOUN)
(c) Idempotency axioms; adjunction-deletion of variables
    e.g. (1) (NASAL) & (CONS, X) = (NASAL, X) & (CONS, X)
         (2) ((N, X), (V, (V, Y) Z)) = ((N, X), (V, ((N, X), (V, Y)) Z))
(d) Non-equivalence axioms; non-derivability constraints
    e.g. (1) Z ≠ ((N, X), (V, TRYING), ((N, ≠X), (V, Y)))
         (2) (W, (NP, S(C, X)), Y) ≠ (W, (NP, (X)), Y, C)

1 It is not surprising in this context of overabundance of powers, components, statement-types, and arbitrary notational distinctions that competitions between esthetically more- and less-favored components or styles of proof-construction should sometimes have been mistaken for empirically-decidable competitions between mutually-inconsistent theories. See Chomsky (1968) and Lakoff (1969) for a general survey of the range of non-empirical competitions that are generated within the framework of non-equational grammar, and for a clear indication of the vastly excessive powers and internal structures of all of the standard theories that have been formulated in terms of this framework.


Since such reduction is not possible, though, for a large class of empirically-indefensible non-equational rules, such as (3)

(a) (VOCALIC, HIGH, FRONT) → (CONSONANTAL, LABIAL, STOP, VOICED)
(b) (MALE) → (FEMALE)/(N, HUMAN, __) & (PRED, PREGNANT)
(c) (VOCALIC) → (VOCALIC, LOW)/((N, AARDVARK), ( *)),

the principled exclusion of such rule-types by the independently-motivated general principles of equational theories, and the apparent lack of any principled basis for their exclusion by non-equational theories, counts as further evidence of the correctness of the equationality hypothesis. Each equivalence statement of the types illustrated in (2a-c) determines in conjunction with the metascientific principle of equal substitution a pair of symmetrical derivational inferences, directed transformations, or local constraints on derivational well-formedness.2 Thus, for example, given equation (2a.1), it follows that both ((NASAL) → (NASAL, VOICED)) and ((NASAL, VOICED) → (NASAL)) are valid derivational inferences. It can be seen that this will always be the case for any possible equivalence axiom, since the relation of equal substitution, or direct derivability, is strictly symmetrical, and since, for any theorem which has a valid proof as defined by the general conditions stated above and in Chapter 2, that theorem will have two optimally simple proofs, one terminating in an equation between identical phonetic representations, the other terminating in an equation between identical semantic representations. These two proofs will always be validated by exactly the same set of equivalence axioms, and will differ only in that, for each validating axiom (A = B),

2 The empirical equivalence between transformations and local constraints on derivational well-formedness, i.e. constraints on the formal and substantive relations that must hold between the adjacent lines of valid derivations, has been demonstrated by Lakoff (1969). See also Hays (1964), Sanders (1967, 1969), and Chapter 2 above.


one proof makes use of the directed inference (A → B), and the other makes use of the converse inference (B → A). It has also been shown that, contrary to the predictions of non-equational theories, which treat conditions on the appropriate derivational use of grammatical axioms as rule-specific conditions that are accidental to particular grammars, the appropriate use of all empirically-defensible axioms is actually fully determined by universal principles, which predict for any (A = B) the inferential precedence relations that will hold between (A → B) and (B → A) in the proofs of all possible grammars. Thus, while a non-equational theory is incapable of accounting for the fact that there are languages with motivated phonetically-directed processes of nominal raising or unconditioned nasal voicing but none with phonetically-directed converses of these processes, and must treat such facts simply as accidental to the particular sample of natural languages thus far investigated, any equational theory of grammar will provide a principled explanation of these facts as a consequence of general laws about the essential properties of all natural languages. The equationality hypothesis thus claims that it is linguistically essential and not a mere accident of particular languages or particular linguistic axioms such as (2a-d) that the empirically-appropriate use of each such axiom for the construction or acceptance of optimally simple proofs of true theorems is that which results in a progressive increase in elements and well-formed element-strings out of the terminal alphabet of the final identity equation of the proof and/or in a progressive decrease in elements or element-strings out of the opposite terminal alphabet.
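The relation between the two optimally simple proofs of a theorem can be made concrete with a small sketch. It assumes, for illustration only, that a proof is recorded as a list of directed inferences (A, B), each read as "substitute B for A"; the function names and the labels A, B, C are hypothetical.

```python
def converse_proof(inferences):
    """Return the proof that runs in the opposite direction: the same steps
    in reverse order, each replaced by its converse inference."""
    return [(b, a) for (a, b) in reversed(inferences)]


def axioms_used(inferences):
    """Collect the undirected equations (A = B) that validate the inferences."""
    return {frozenset(step) for step in inferences}


# A hypothetical phonetically-directed proof using the axioms A = B and B = C.
phonetic_proof = [("A", "B"), ("B", "C")]
semantic_proof = converse_proof(phonetic_proof)

# Both proofs are validated by exactly the same set of equivalence axioms.
assert axioms_used(phonetic_proof) == axioms_used(semantic_proof)
```

The sketch records only the symmetry of equal substitution; which of the two directions is appropriate in a given proof is settled, on the account given here, by universal principles rather than by anything stored with the axiom.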
In other words, since it is ESSENTIAL for the inference principles of equational theories of grammar to be universal, a principle like the maximalization of terminality principle (see Chapter 4) will necessarily impose constraints not only on the grammars of those particular languages whose properties provided the original empirical basis for the principle, but also on the grammars of all possible languages whatever. This means that the terminal maximalization principle of equational theories expresses a significant empirically-vulnerable explanatory hypothesis about the ESSENTIAL properties of ALL natural languages, namely, that the non-universal principles of knowledge which are necessary and sufficient for the competent communicative use of any such language are such that each principle has exactly two formally-determinable converse FUNCTIONS, one of which contributes to the mapping of sounds onto meanings by justifying the substitution of a representation which is relatively more similar to a terminal semantic representation for one which is relatively less similar, the other of which contributes to the mapping of meanings onto sounds by justifying the substitution of a representation which is relatively more similar to a terminal phonetic representation for one which is relatively less similar. Although this is an exceptionally reasonable and natural condition to impose on the affirmative rules of natural language grammars, it is by no means a logically-necessary condition, and is in fact neither necessary nor independently-motivated for any otherwise adequate non-equational theory of grammar, which treats all conditions on inferential use as axiomatic attributes of the particular rules of particular grammars and hence has no need for universal principles of use or the constraints on grammatical rules that are entailed by such principles. The terminal maximalization principle is sufficient to exclude from the set of valid linguistic proofs all proofs containing empirically redundant recursions or derivational loops. It is also seen to have the much more important effect of radically reducing the set of possible linguistic theorems, and hence the set of otherwise possible natural languages specifiable by any possible equational theory of grammar.
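The way a single equation yields two converse directed uses, one for each terminal alphabet, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: representations are flat tuples of element names, terminality is scored as the number of target-alphabet elements minus the number of opposite-alphabet elements, and the alphabets, the scoring function, and the names `terminality` and `directed_use` are hypothetical rather than part of the theory's own formalism. The sample equations are toy stand-ins for redundancy equations of nasal voicing and color, not transcriptions of the numbered axioms.

```python
from typing import FrozenSet, Optional, Tuple

Representation = Tuple[str, ...]
Alphabet = FrozenSet[str]


def terminality(rep: Representation, target: Alphabet, opposite: Alphabet) -> int:
    """Assumed score: elements of the target alphabet minus elements of the opposite one."""
    gained = sum(1 for element in rep if element in target)
    retained = sum(1 for element in rep if element in opposite)
    return gained - retained


def directed_use(
    a: Representation, b: Representation, target: Alphabet, opposite: Alphabet
) -> Optional[Tuple[Representation, Representation]]:
    """Pick the directed use of the equation a = b for proofs ending in `target`.

    Returns (source, result) for the direction that raises the assumed
    terminality score, or None when neither direction raises it.
    """
    score_a = terminality(a, target, opposite)
    score_b = terminality(b, target, opposite)
    if score_b > score_a:
        return (a, b)
    if score_a > score_b:
        return (b, a)
    return None


# Hypothetical terminal alphabets and toy redundancy equations.
PHONETIC: Alphabet = frozenset({"NASAL", "VOICED"})
SEMANTIC: Alphabet = frozenset({"BROWN", "COLORED"})

NASAL: Representation = ("NASAL",)
NASAL_VOICED: Representation = ("NASAL", "VOICED")
BROWN: Representation = ("BROWN",)
BROWN_COLORED: Representation = ("BROWN", "COLORED")

# Phonetically-directed use adds voicing; semantically-directed use removes it.
voicing = directed_use(NASAL, NASAL_VOICED, PHONETIC, SEMANTIC)
devoicing = directed_use(NASAL, NASAL_VOICED, SEMANTIC, PHONETIC)
# Semantically-directed use adds the generic predicate; the converse removes it.
coloring = directed_use(BROWN, BROWN_COLORED, SEMANTIC, PHONETIC)
uncoloring = directed_use(BROWN, BROWN_COLORED, PHONETIC, SEMANTIC)
```

Because the direction is computed from the equation and the alphabets alone, nothing in a particular grammar can stipulate the opposite choice, which is the point of treating conditions on use as universal rather than rule-specific.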
Thus, for example, given this principle, it follows that there can be no theorem that is provable only by proofs involving a phonetically-directed process of unconditioned nasal devoicing, or a semantically-directed process of unconditioned color uncoloring. This means, in other words, that there can be no natural language in which all nasal consonants are pronounced voicelessly, or in which specific color predicates like being brown are never understood as special cases of the generic predicate being colored. The consistency of this claim with the known facts about natural languages thus counts as confirming evidence for the premises from which it necessarily follows, namely, the general hypothesis of equational grammar, the equationality-dependent universal principle of terminal maximalization, and the substantive linguistic laws expressed by the redundancy equations (2a.1) and (2a.2). Similar empirical claims about the set of natural languages will be generated with respect to each of the other possible redundancy, lexical, and idempotency axioms that can be expressed in the alphabet of any given equational theory. The most interesting of these claims are those derived from the inferential precedence conditions determining the appropriate derivational uses of idempotency axioms, since these conditions must be based on a more precise universal characterization of terminal phonetic and semantic representations than the simple elementary characterization that suffices for the correct prediction of all appropriate uses of lexical and redundancy axioms. Idempotency axioms thus motivate and provide an empirical basis for testing hypotheses not only about the interpretable elements of terminal linguistic representations but also about the constituency relations which hold between these elements in all such representations. Thus, for example, it was shown in Sec. 4.1.7 that, given the general principle of maximalization of terminal specificity, along with the independently-reasonable requirement that every n-place predicate must be grouped with precisely n arguments in terminal semantic representations, it follows that an idempotency axiom like (2c.1) will always be used to justify adjunction transformations in semantically-directed proofs and deletion transformations in phonetically-directed ones.
This accounts for the fact that while there are languages with motivated phonetically-directed Equi NP Deletion transformations, there are no languages with motivated phonetically-directed converses of these transformations. The same predicate-argument condition on the well-formedness of terminal semantic representations was also seen to explain why nominal raising processes always take precedence over their converse lowering processes in phonetically-directed derivations, with the reverse precedence holding for all semantically-directed derivations. For an equational theory of grammar, in other words, the very assumptions that are required to account for the facts about nominal-raised sentences in a single language such as English are also sufficient for the explanation of a logically non-necessary fact about ALL languages, namely, that while there may be languages in which arguments of embedded predicates occur as subjects of matrix predicates, there are apparently no languages in which an argument of a matrix predicate ever occurs as the subject of an embedded predicate. Similar principled explanations will be generated by equational theories for a wide range of other significant linguistic facts of this sort. Since all of these explanations depend essentially on the assumption that all non-universal rules of grammar are strictly symmetrical, each of the facts which they explain serves to confirm the equationality hypothesis and to falsify any theory of grammar which does not incorporate this hypothesis. Thus if the appropriate derivational use of a grammatical statement is permitted to be specified by the grammar that includes it, which is the case for all of the standard theories of non-equational transformational grammar, this can only be taken to imply that the appropriate inferential use of a rule is entirely idiosyncratic to that particular rule and the particular grammar that includes it. But this claim is empirically false, and all of the available facts about natural languages would seem to support the contrary claim that for any pair of representations, A and B, if some language has an empirically-motivated transformation (A → B) for proofs terminating in a given interpretable alphabet, then there is no language that has an empirically-motivated transformation (B → A) for proofs terminating in that same alphabet.
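The cross-linguistic claim just stated is the kind of prediction that can be checked mechanically against a sample of grammars. The following Python sketch shows one way such a check might look, assuming each motivated transformation is recorded as a triple of source, result, and terminal alphabet. The function name `converse_conflicts`, the language labels, and the sample data are all hypothetical and stand for no actual language.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# (A, B, alphabet): a motivated transformation A -> B for proofs ending in `alphabet`.
Transformation = Tuple[str, str, str]
Conflict = Tuple[str, str, str, str, str]


def converse_conflicts(grammars: Dict[str, Set[Transformation]]) -> List[Conflict]:
    """List pairs of languages using converse transformations with the same directionality.

    The claim in the text predicts that this list is empty for any sample
    of actual languages; a non-empty result would count against it.
    """
    attested = [
        (language, rule)
        for language, rules in sorted(grammars.items())
        for rule in sorted(rules)
    ]
    conflicts: List[Conflict] = []
    for (lang1, (a1, b1, t1)), (lang2, (a2, b2, t2)) in combinations(attested, 2):
        if (a1, b1, t1) == (b2, a2, t2):
            conflicts.append((lang1, lang2, a1, b1, t1))
    return conflicts


# Hypothetical sample: raising used toward the phonetic alphabet in L1,
# and its converse used toward the semantic alphabet in L2.
consistent_sample: Dict[str, Set[Transformation]] = {
    "L1": {("EMBEDDED_SUBJECT", "MATRIX_SUBJECT", "phonetic")},
    "L2": {("MATRIX_SUBJECT", "EMBEDDED_SUBJECT", "semantic")},
}

# Adding an invented language with phonetically-directed lowering
# produces exactly the situation the claim excludes.
falsifying_sample = dict(consistent_sample)
falsifying_sample["L3"] = {("MATRIX_SUBJECT", "EMBEDDED_SUBJECT", "phonetic")}

no_conflicts = converse_conflicts(consistent_sample)
found_conflicts = converse_conflicts(falsifying_sample)
```

Converse uses in opposite directions, as in the first sample, are what an equational theory expects; only converse uses with identical directionality are reported.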
This latter claim follows directly from the equationality hypothesis, of course, since this hypothesis is inconsistent with the existence of language-specific conditions on derivational use, and since any universal principles which determine that (A → B) is the appropriate phonetically- or semantically-directed inference justified by (A = B) will necessarily determine exactly the same thing with respect to all other possible instances of (A = B) in all other possible grammars. It is thus only by means of equational theories that the observed limitations on natural language variability can be accounted for here, since the apparent non-existence of languages with converse transformations of identical directionality can be explained only by deriving all restrictions on inferential use by means of universal principles that are applicable over the domain of all possible pairs of equationally-related linguistic representations. The explanatory values of equational grammar have been demonstrated, moreover, not only with respect to those affirmative grammatical statements which can be appropriately expressed as assertions of equivalence between representations, but also with respect to the class of motivated negative or prohibitive rules of grammar, all of which have been shown to be expressible as assertions of non-equivalence between representations. Thus, by requiring the reduction of all non-universal constraints on representational and derivational well-formedness to non-equivalence axioms like those illustrated in (4), the equationality hypothesis again determines a more restricted class of grammars and possible natural languages than its contrary or contradictory hypotheses.
(4)

(a) (W, (NP, S (C, X)), Y) ≠ (W, (NP, (0, X)), Y, C)
(b) [not recoverable from the source: a non-equivalence between two S-bracketed representations containing the constituents L1 and L2]
(c) (X (SEGMENTS) Y) ≠ (X (-VOCALIC)₃ Y)
(d) Z ≠ X (-VOCALIC)₃ Y
(e) Z ≠ ((N, X), (V, TRYING), ((N, -X), (V, Y)))

Each such non-equivalence axiom asserts that a given pair of representations cannot be representations of the same linguistic object. Each of these axioms thus serves as a global derivational constraint³ that effectively denies the validity of any otherwise valid proof that includes derivationally related representations corresponding to both members of the given negative equation. Equation (4a), for example, was shown in Sec. 4.2.3 to perform the intended explanatory filtering function of J.R. Ross's Complex NP Constraint (Ross, 1967) by denying the equivalence or derivability relation between any pair of representations such that a constituent included in the clause of a complex NP in one representation is not included in that clause in the other. The non-equivalence axiom (4b) constitutes a similar, though less formally expressed, equational reduction of G. Lakoff's (1969) Logical Predicate Constraint, which Lakoff formulates as a non-symmetrical constraint against the derivational co-occurrence of a semantic or semantically-proximate representation in which a negative or quantificational constituent L1 asymmetrically commands⁴ another such constituent L2 and a semantically-remote 'surface structure' representation in which L2 both commands and precedes L1. The non-symmetrical features of Lakoff's formulation are empirically-unmotivated, however, and its intended explanatory function, which is to account for such facts as the synonymy of There are few men who everyone is enlightened by, Few men enlighten everyone, and The men who enlighten everyone are few, and the contrasting non-synonymy between any of these and Everyone is enlightened by few men, can evidently be achieved just as well by the more general non-equivalence axiom (4b), which makes no reference to the relative derivational order or semantic proximity of the representations whose derivational co-occurrence it prohibits. Equations (4c) and (4d) express two empirically-distinct reductions of C.W. Kisseberth's Yawelmani Consonant Constraint (Kisseberth, 1969); the former effectively precludes the phonetically-directed derivation of the terminally-prohibited sequences CCC, #CC, and CC# by vowel-deletion transformations, while also (possibly incorrectly) precluding the phonetically-directed elimination of such sequences by epenthesis; the latter axiom simply generalizes Kisseberth's observed restriction on terminal phonetic representations to ALL representations of Yawelmani sentences. A similar absolute representational well-formedness constraint is expressed by the non-equivalence axiom (4e), which determines the familiar Identical Subject Constraint for verbs like try and attempt which has been discussed and variously specified by Lakoff (1965), Perlmutter (1968), and a number of other linguists. This equation asserts that there is no representation whatever that is equivalent to a representation in which the subject of the complement of a verb of trying is not identical to the subject of the verb of trying, from which it follows necessarily that there can be no valid linguistic theorem with a phonetic member like *Dogs try for cats to chase mice or *John attempted that Bill would open the door. Although none of the constraints that have been equationally expressed here were originally formulated as non-equivalence statements, their reduction to such statements is seen to result in no loss of generality or explanatory power with respect to the facts they were intended to account for.

3 The term 'global constraint' is employed by Lakoff (1969) to refer generally to those conditions on derivational well-formedness which make essential reference to more than two lines of a derivation or to two lines that are not necessarily adjacent. His use of the term there, however, is not wholly consistent with this definition, since, though he refers to Ross's Coordinate Structure Constraint (Ross, 1967) as a global constraint, Lakoff's formulation (1969; 3) explicitly refers to two adjacent lines of derivations only, which makes it precisely equivalent to Ross's original formulation as a constraint on rule application rather than on the derivational co-occurrence of integral and non-integral representations of the same coordination. That constraints of this sort cannot possibly be adequately imposed on rule applications or on necessarily adjacent lines of derivations has been demonstrated in Sanders (1969).

4 A constituent A commands a constituent B if and only if A neither dominates nor is dominated by B and the clause that most immediately includes A also includes B. There is an asymmetrical command relation between A and B if A commands B but B does not command A.
The requirement that ALL prohibitive grammatical constraints be expressed as non-equivalence statements results, moreover, in a large number of significant explanatory gains, and in the same sort of empirically-motivated restriction of the class of possible languages as results from the requirement that all affirmative rules of grammar be expressed as statements of representational equivalence. This has already been shown in Chapter 4, but it can be demonstrated most clearly, I believe, by considering the relative explanatory powers of the symmetric non-equivalence axioms of equational grammars and the global constraints of G. Lakoff's (1969) theory of derivational constraints, which constitutes an exceptionally explicit formulation of the basic assumptions of all standard theories of non-equational grammar. Lakoff's global constraints differ from non-equivalence axioms in three fundamental and empirically highly significant respects. First, in contrast to the metalanguage of equational grammar, Lakoff's metalanguage for derivational constraints allows for the expression of constraints that make essential reference to more than two lines in a derivation or proof. Secondly, his metalanguage permits the use not only of negation and material implication in the expression of constraints, but also the contrastive use of existential and universal quantification over derivations. Finally, by allowing essential reference to the relative order of representations in (phonetically-)directed derivations, Lakoff's theory permits the existence of global constraints which are not symmetrical. The theory he assumes would thus allow for the possibility of languages for which the converse of an ill-formed derivation might be well-formed, as well as pairs of languages such that there is a prohibition in one language against the phonetically-directed derivation of A from B but not against the derivation of B from A, and a prohibition in the other language against the derivation of B from A but not against the derivation of A from B. For theories that incorporate the equationality hypothesis, on the other hand, the set of possible global constraints on derivations will necessarily be restricted to those which can be expressed as intrinsically universally quantified assertions of the strictly binary and symmetric relation of non-equivalence between representations.
Thus the set of global constraints generated by an equational theory of grammar will constitute only a small proper subset of the set of possible constraints generated by any non-equational theory of the sort assumed by Lakoff. I wish to suggest now that there is in fact no empirically-motivated global derivational constraint which does not fall within the smaller class of equationally-restricted constraints, and hence that the equationality hypothesis is again shown to provide a delimitation of the set of natural languages which is closer to a correct delimitation than that provided by any theory of grammar which does not incorporate this hypothesis. Thus with respect to non-equivalence statements as well as equivalence statements, the symmetry requirement determines a narrower and more precise delimitation of the set of possible natural languages than could otherwise be achieved. Consider, for example, the Complex NP Constraint. Although this was expressed by Ross as a phonetically-directed restriction against the movement of any constituent OUT of the clause of a Complex NP, it seems clear that any language whose derivations observe this restriction will also observe a restriction against the movement of any constituent INTO the clause of a complex NP. By permitting the inclusion of non-symmetrical statements in grammars, the theories assumed by Ross and by Lakoff make it impossible to account for the symmetrical character of this constraint in English, or for the more important fact that there are apparently NO languages which prohibit either interposition or extraposition with respect to a given structure without also prohibiting BOTH of these derivational regrouping processes with respect to that structure. Thus, where extraposition is prohibited out of sentential subjects or coordinations, interposition is also prohibited, and vice versa. All such facts follow necessarily from the equationality hypothesis, of course, and thus receive a principled and quite natural explanation from any theory that incorporates this hypothesis. It can be seen, from the nature and range of the examples cited, that there will be a possible non-equivalence formulation for any derivational constraint which expresses a true universal negative statement about the representations of some language.
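The difference between a symmetric non-equivalence axiom and an order-sensitive constraint can be made concrete with a short Python sketch. It rests on several simplifying assumptions that are not in the text: a representation is reduced to the sets of constituents inside and outside the clause of a complex NP, an axiom is a pair of predicates over representations, and the names `Rep`, `violates`, and `violates_directed` are hypothetical.

```python
from typing import Callable, FrozenSet, List, NamedTuple, Sequence, Tuple


class Rep(NamedTuple):
    """Toy representation: constituents inside and outside a complex-NP clause."""
    inside: FrozenSet[str]
    outside: FrozenSet[str]


Pattern = Callable[[Rep], bool]
Axiom = Tuple[Pattern, Pattern]


def violates(proof: Sequence[Rep], axiom: Axiom) -> bool:
    """Symmetric filter: the proof holds lines matching both members, in any order."""
    left, right = axiom
    return any(left(rep) for rep in proof) and any(right(rep) for rep in proof)


def violates_directed(proof: Sequence[Rep], axiom: Axiom) -> bool:
    """Order-sensitive filter: a line matching `left` precedes one matching `right`."""
    left, right = axiom
    return any(
        left(proof[i]) and right(proof[j])
        for i in range(len(proof))
        for j in range(i + 1, len(proof))
    )


def c_inside(rep: Rep) -> bool:
    return "C" in rep.inside


def c_outside(rep: Rep) -> bool:
    return "C" in rep.outside


# Toy counterpart of a complex-NP non-equivalence: C inside the clause vs. C outside it.
COMPLEX_NP: Axiom = (c_inside, c_outside)

# Moving C out of the clause, and the converse derivation moving it in.
extraposition: List[Rep] = [
    Rep(frozenset({"C", "X"}), frozenset()),
    Rep(frozenset({"X"}), frozenset({"C"})),
]
interposition: List[Rep] = list(reversed(extraposition))
```

Under the symmetric filter both derivations are rejected together, whereas the order-sensitive filter rejects only the movement out of the clause and would therefore permit a language that blocks one regrouping process but not its converse.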
There remain, however, three very commonly proposed grammatical statement-types which cannot possibly be expressed as assertions of either equivalence or non-equivalence between linguistic representations. These are the global derivational constraints which specify the optional or obligatory application of particular directed transformations in particular grammars, and those which impose language-specific extrinsic ordering restrictions, or non-universal inferential precedence conditions, on the use of particular pairs of transformations. The schemata for all statements of optionality, obligatoriness, and extrinsic rule ordering are presented in (5a), (5b), and (5c), respectively.
(5)

(a) (∃P) (P = (.., XAY = Z, XBY = Z, .., w = w))
(b) (∀P) (P ≠ (.., XAY = Z, ~(XBY = Z), .., w = w))
(c) (∀P) (P ≠ (.., UCV = Z, UDV = Z, .., XAY = Z, XBY = Z, .., w = w))

Thus to say that (A → B) is an optional phonetically-directed transformation is to say that there is a set of one or more valid phonetically-directed proofs in which a line including a representation of the form XAY immediately precedes a line including a corresponding representation of the form XBY. Similarly, to say that (A → B) is an obligatory phonetically-directed transformation is to say that there is no valid phonetically-directed proof in which there is a line including XAY and an immediately subsequent line that does not include XBY. Likewise, to say that (A → B) is extrinsically ordered before (C → D) with respect to the phonetically-directed derivations of some language is to say that there is no valid phonetically-directed proof for that language in which a pair of lines including XAY followed by XBY is preceded by a pair of lines including UCV followed by UDV. From these three schemata for language-specific inferential precedence statements, which are empirically equivalent to the corresponding global constraint schemata proposed by Lakoff (1969), it can be seen that assertions about optionality, obligatoriness, or rule-ordering can be expressed only by statements which make essential reference to entire derivations or proofs, and which make essential use of variables for proofs and quantification over proofs. These are in fact the very properties which distinguish the class of invariant inference principles of a theory from its class of ordinary contingent generalizations. Since the equationality hypothesis requires that all non-universal grammatical statements must be equations whose members are single linguistic representations, and since it is clearly impossible to reduce the schemata of (5) to such equations, the theory of equational grammar thus generates the claim that there are no non-universal restrictions on grammatical inference whatever, and hence that there are no facts about natural languages and no significant generalizations about such languages which require the assumption of any non-universal axiom about the optionality, obligatoriness, or relative order of application of the rules of any grammar. There is every reason to believe that this claim is correct. I cannot attempt to substantiate it here, however, or even attempt to summarize the large and growing body of research contributing towards its confirmation.⁵ There is presumably no question, though, of where the burden of proof lies here, since the class of possible languages that can be generated by means of language-specific principles of inference is obviously vastly larger and more heterogeneous than that which is generated by means of universal inference principles alone. In fact, I know of no yet unfalsified linguistic hypothesis that determines a smaller or less varied class of possible languages than the hypothesis of equational grammar. To falsify this hypothesis, therefore, it would be necessary to show that there is some actual natural language that falls outside of the restricted class of languages it generates. Pending such evidence, the greater restrictive powers and explanatory values of equational theories relative to their non-equational contraries or contradictories require the acceptance of the basic hypothesis of equational grammar as a well-confirmed empirical law about the nature of natural language.
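The observation that the schemata of (5) quantify over whole proofs, and so cannot be reduced to equations between single representations, can be illustrated with a Python sketch. The encoding is an assumption made for illustration only: a proof is simplified to the list of left-hand representations of its lines, each a tuple of symbols, and the function names and sample proofs are hypothetical.

```python
from typing import List, Sequence, Tuple

Rep = Tuple[str, ...]
Proof = List[Rep]
Rule = Tuple[Rep, Rep]


def contains(line: Rep, a: Rep) -> bool:
    """True if `a` occurs as a contiguous substring of `line` (line has the form XAY)."""
    n = len(a)
    return any(line[i:i + n] == a for i in range(len(line) - n + 1))


def applies(before: Rep, after: Rep, a: Rep, b: Rep) -> bool:
    """True if `after` is `before` with one occurrence of `a` replaced by `b`."""
    n = len(a)
    return any(
        before[i:i + n] == a and before[:i] + b + before[i + n:] == after
        for i in range(len(before) - n + 1)
    )


def steps(proof: Proof, a: Rep, b: Rep) -> List[int]:
    """Indices of adjacent line pairs related by the directed use (a -> b)."""
    return [i for i in range(len(proof) - 1) if applies(proof[i], proof[i + 1], a, b)]


def is_optional(proofs: Sequence[Proof], a: Rep, b: Rep) -> bool:
    """Schema (5a): some valid proof has XAY immediately followed by XBY."""
    return any(steps(proof, a, b) for proof in proofs)


def is_obligatory(proofs: Sequence[Proof], a: Rep, b: Rep) -> bool:
    """Schema (5b): no valid proof has XAY followed by a line that is not XBY."""
    return all(
        applies(proof[i], proof[i + 1], a, b)
        for proof in proofs
        for i in range(len(proof) - 1)
        if contains(proof[i], a)
    )


def ordered_before(proofs: Sequence[Proof], first: Rule, second: Rule) -> bool:
    """Schema (5c): no valid proof applies `second` at a step preceding a use of `first`."""
    (a, b), (c, d) = first, second
    return not any(
        i < j
        for proof in proofs
        for i in steps(proof, c, d)
        for j in steps(proof, a, b)
    )


# Hypothetical set of valid proofs over a toy vocabulary.
A, B, C, D = ("a",), ("b",), ("c",), ("d",)
valid_proofs: List[Proof] = [
    [("x", "a", "y"), ("x", "b", "y")],
    [("c", "a"), ("d", "a"), ("d", "b")],
]
```

Each of the three predicates takes the entire set of valid proofs as its argument, whereas an equation (A = B) or a non-equivalence mentions only the two representations it relates; that difference in argument type is the formal counterpart of the irreducibility claimed in the text.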

5 I am referring here primarily to recent and forthcoming works on rule-application by S. R. Anderson, W. Chafe, M. Kenstowicz, C. W. Kisseberth, A. Koutsoudas, T. Lehmann, C. Noll, C. Ringen, and G. Sanders. See also Perlmutter (1968) and Sanders (1970b).

BIBLIOGRAPHY

Bach, E.
1964 "Subcategories in Transformational Grammars", in H. Lunt (ed.), Proceedings of the Ninth International Congress of Linguists (The Hague: Mouton), 672-678.
1968 "Two Proposals Concerning the Simplicity Metric in Phonology", Glossa 2, 128-149.
Bierwisch, M.
1969 "On Certain Problems of Semantic Representations", Foundations of Language 5, 153-184.
Bloomfield, L.
1926 "A Set of Postulates for the Science of Language", Language 2, 153-164.
1933 Language (New York: Holt).
Chomsky, N.
1957 Syntactic Structures (The Hague: Mouton).
1961 "On the Notion 'Rule of Grammar'", in Proceedings of the Twelfth Symposium in Applied Mathematics, 6-24. Reprinted in Fodor and Katz, 119-136.
1964a "Current Issues in Linguistic Theory", in Fodor and Katz, 50-118.
1964b "The Logical Basis of Linguistic Theory", in H. Lunt (ed.), Proceedings of the Ninth International Congress of Linguists (The Hague: Mouton), 914-1008.
1965 Aspects of the Theory of Syntax (Cambridge: M.I.T. Press).
1967 "Some General Properties of Phonological Rules", Language 43, 102-138.
1968 "Deep Structure, Surface Structure, and Semantic Interpretation", in Steinberg and Jakobovits (1971), 183-210, reprinted in Chomsky (1972).
1970 "Some Empirical Issues in the Theory of Transformational Grammar", in Chomsky (1972).
1972 Studies on Semantics in Generative Grammar (The Hague: Mouton).
Chomsky, N., and M. Halle
1968 The Sound Pattern of English (New York: Harper and Row).
Chomsky, N., and G. A. Miller
1963 "Introduction to the Formal Analysis of Natural Languages", in R. D. Luce, R. Bush, and E. Gallanter (eds.), Handbook of Mathematical Psychology II (New York: Wiley), 269-321.
Fodor, J. A., and J. J. Katz (eds.)
1964 The Structure of Language (Englewood Cliffs, N.J.: Prentice Hall).
Gruber, J. S.
1965 Studies in Lexical Relations (M.I.T. doctoral dissertation).
Halle, M.
1962 "Phonology in Generative Grammar", Word 18, 54-72.
Hays, D. G.
1964 "Dependency Theory: A Formalism and Some Observations", Language 40, 511-525.
Katz, J. J.
1964 "Analyticity and Contradiction in Natural Language", in Fodor and Katz, 519-543.
Katz, J. J., and J. A. Fodor
1963 "The Structure of a Semantic Theory", Language 39, 170-210.
Kisseberth, C. W.
1969 "On the Role of Derivational Constraints in Phonology", (mimeographed) (Indiana University Linguistics Club).
Lakoff, G.
1965 On the Nature of Syntactic Irregularity, (= Mathematical Linguistics and Automatic Translation Report NSF-16) (Cambridge: Harvard University Computation Laboratory).
1969 "On Generative Semantics", (mimeographed) (Indiana University Linguistics Club), in Steinberg and Jakobovits (1971), 232-296.
Langacker, R. W.
1968 "Mirror Image Rules", Language 45: 3, 575-598; 45: 4, 844-862.
McCawley, J. D.
1968 "The Role of Semantics in A Grammar", in E. Bach and R. T. Harms (eds.), Universals in Linguistic Theory (New York: Holt, Rinehart and Winston), 124-169.
Nagel, E., and J. R. Newman
1956 "Goedel's Proof", in J. R. Newman (ed.), The Mathematical Way of Thinking (New York: Simon and Schuster), 1668-1695.
Perlmutter, D. M.
1968 Deep and Surface Constraints in Syntax (M.I.T. doctoral dissertation).
Postal, P. M.
1968 "The Cross-Over Principle" (Duplicated preliminary version of June 1968).
Prior, A. N.
1962 Formal Logic, 2nd ed. (Oxford: Oxford University Press).
Rosenbaum, P.
1968 "English Grammar II", Specification and Utilization of a Transformational Grammar (= Scientific Report 2, Section 1) (Yorktown Heights, N.Y.: IBM).
Rosenbaum, P., and D. Lochak
1966 "The IBM Core Grammar of English", Specification and Utilization of a Transformational Grammar, Part I (Yorktown Heights, N.Y.: IBM).
Ross, J. R.
1967 Constraints on Variables in Syntax (M.I.T. doctoral dissertation) (Mimeographed, Indiana University Linguistics Club, 1968).
Sanders, G. A.
1967 Some General Grammatical Processes in English (Indiana University doctoral dissertation) (Mimeographed, Indiana University Linguistics Club, 1968).
1969 "Invariant Ordering" (Duplicated, University of Texas at Austin) (Mimeographed, Indiana University Linguistics Club, 1970).
1970a "Constraints on Constituent Ordering", Papers in Linguistics 2: 3, 460-502.
1970b "Precedence Relations in Language", Paper presented at the Summer Meeting of the Linguistic Society of America, Columbus, Ohio, July 1970. (Publication forthcoming.)
Sanders, G. A., and J. H-Y. Tai
1969 "Immediate Dominance and Identity Deletion", Paper presented at the Winter Meeting of the Linguistic Society of America, San Francisco, California, December 1969. (Foundations of Language 8 (1972) 161-198.)
Stanley, R.
1967 "Redundancy Rules in Phonology", Language 43: 1.
Steinberg, D. D., and L. A. Jakobovits (eds.)
1971 Semantics (Cambridge: Cambridge University Press).
Stoll, R. R.
1961 Sets, Logic, and Axiomatic Theories (San Francisco: Freeman).
Weinreich, U.
1966 "Explorations in Semantic Theory", in T. A. Sebeok (ed.), Current Trends in Linguistics III (The Hague: Mouton), 395-477. Reprinted (1972) (= Janua Linguarum, Series Minor, 89) (The Hague: Mouton).

INDEX

acceptance, 18-23 addition, (see adjunction.) adjacency, 105 adjunction, 73, 78-90, 101, 110-25, 169 amalgamation, 151-56 ambiguity, 40-45, 154-56, 168 ampersand, 95 analyticity, 49-51 applicational precedence, 68-72, 9294, 109, 135, 167, 179-81 argument, 51, 52, 118, 145-50 arrow, 75 assimilation, 105, 116, 121 associativity, (see also regrouping.) 100, 102-103 attributes, linguistically-significant, 60-63, 166-67 axiomatic basis, 16 axioms, 8, 16, 18, 64-163 directed, 67-68, 75 equivalence, 8, 18, 75-124, 166, 169 grammatical, 64-163 grouping, 95, 100-110, 124-25 idempotency, 110-22, 124-25, 169 lexical, 90-94, 123-24, 138, 169 non-commutativity, 96, 99 non-equivalence, 8, 66, 122-40, 169, 175-79 ordering, 94-100, 123-24 redundancy, 78-90, 123, 147, 169 regrouping, 97, 100-110, 124-25 reordering, 96-100

bimodality, 143 bracketing, (see grouping.) collocation, 145-50 comma, 95 completeness, 29-31 conditional analyticity, 50 conditions, (see constraints.) coordinative grouping, 102 consistency, 29-31 constituency, (see grouping.) constituent ordering, 94-100, 123-24 constraints, (see also non-equivalence axioms, proofs, validity.) derivational, 95-96, 130-33, 17581 equationality, 31-32, 61, 66-67, 165-67 exceptionality, 133-40 global, (see also derivational.), 175-80 invariant order, 95-96, 98-100, 130-31 non-derivability, (see non-equivalence axioms.) phonotactic functionality, 70 representational, (see also representation.), 118, 125-30 well-formedness, 82, 125-30 context-sensitivity, 91-92 contradiction, 51-54, 146-50 converse, (see also inverse.), 172 conversion, 156-58 copulative analyticity, 49

186

INDEX

deep structure, 44, 144 deletion, 73, 78-90, 97, 110-25, 169 derivability, (see also inference, proofs, validity.), 14-15, 18, 125 derivational constraints, 95-96, 13033, 175-81 derivations, (see proofs.) direct derivability, 14, 18 directed use, (see also inference, proofs, transformations, validity.), 67-69, 89, 107, 116-25 disambiguation, 154-56 disjunctive statements, 147 entailment, 50-51, 53-54 equation, (see also equational statements.), 11-14 equational statements, (see also axioms, theorems.), 8, 11-18, 3132, 166 equational theories, 11-17, 31-32, 61, 66-67, 165-67 equivalence, 7, 11-23 axioms, 18, 75-125, 166, 169 symbolic, 11, 35-40 theorems, 17-23, 33-63, 166 existential presupposition, 55-56 exceptionality, 133-40 expression of, 43-44, 167 extraposition, 100, 103, 116-17 extrinsic ordering, 179-81 focal presupposition, 56-58 generation, 20-23 grammatical axioms, 64-163 grammatical theorems, 33-63 g r a m m a t i c a l l y , 43, 167 grouping, 95, 100-110, 124-25 idempotency, 17, 18, 101, 110-22, 124-25, 169 indirect derivability, 15, 18 inference, (see also directed use, proofs, transformations, validity.), 14, 17-19, 31-32, 66-69, 123-25, 166, 170-75, 179-81 inferential precedence, (see applicational precedence.)

interposition, 100, 116-17
interpretation, 26, 36, 95, 165
invariant ordering, 95-96, 98-100, 130-31
inverse, (see also directed use.), 75-78, 79, 99, 110, 170-71
laws (see also axioms.), 65
lexical axioms, 90-94, 123-24, 138, 169
lexical entries, 142, 143-50
lexicalization, (see lexical axioms.)
lexicon, 142
lexis, 143-50
logic, 156-63
maximalization of terminality, 89, 107, 171-74
meaning of, 43, 167
mirror-image, 131-32
nominal ordering, 106
nominal raising, 102-107, 159
non-equivalence, 7, 8, 11-14, 23-31, 66, 122-40, 169
  axioms, 66, 122-40, 169
  proofs, 23-29
  theorems, 23-31
obligatoriness, 179-80
optionality, 179-80
ordering, 94-100, 123-24, 151
paraphrase, (see synonymy.)
particle iteration, 106
particle ordering, 106
permutation, (see reordering.)
phonetic representation, 26, 68, 95, 109
phrase-structure rules, 78-90
precedence, (see applicational precedence, inference.)
predicate, 49, 51, 52, 118, 145-50
presupposition, 55-60
projection rules, (see amalgamation.)
proofs, 17-32, 66-67, 89, 166
  acceptance, 18-23
  construction, 20-23
  equivalence, 17-23, 32, 66-67
  generation, 20-23
  non-equivalence, 23-29
  recognition, 18-23
prothesis, 69-71
provability, (see also proofs, validity.), 18-19
redundancy axioms, 78-90, 122, 123, 147, 169
reflexivity, 13, 16, 165
regrouping, 97, 100-110, 151
reordering, 96-100
representation, 25-29, 36, 68, 95, 116, 118, 161, 165, 166
  interpretable, (see also terminal.), 26, 36, 95, 165
  phonetic, 26, 68, 95, 109
  semantic, 26, 36, 68, 95, 109, 114, 118, 151-56
  terminal, 25-29, 36, 68, 95, 109, 118, 166
  uninterpretable, 26, 161
rewriting systems, 75-90
rules (see also axioms, transformations.)
  directed, 67-68, 75, 77, 170
  formational, 75, 78-90
  non-equational, (see directed.)
  phrase-structure, 78-90
  rewriting, 75-90
  transformational (see also directed use, transformations.), 67-75
self-contradiction, 51-54
semantic(s), (see also logic, semantic representation, theorems.), 151-63
semantic representation, 26, 36, 68, 95, 109, 114, 118, 151-56
significance, 60-63, 166-67
statements (see also axioms, theorems.)
  classificatory, 47-48
  equational, 8, 11-18, 31-32, 166
  non-equational, (see also directed rules.), 67-68, 75-90, 171-78


subject formation, (see also nominal raising.), 103, 159
subordinative analyticity, 50
subordinative grouping, 102
substitutability (see also inference, lexical axioms.), 14, 16, 68, 143, 169
symbolic equivalence, 11, 35-40
symmetry, 13, 16, 67, 165, 174
synonymy, 40-45, 168
terminality, 25-29, 89, 109, 118, 158, 171-74
theorems, 7, 8, 17-29, 33-63, 166
  ambiguity, 40-45
  analyticity, 49-51
  class-membership, 47-48
  contradiction, 51-54
  entailment, 50-51, 53-54
  equivalence, 17-23, 33-63, 166
  grammatical, 33-63, 166
  non-equivalence, 23-31
  presupposition, 55-60
  synonymy, 40-45
theorem-schemata, 33-63, 166-67
theories
  empirical, 16
  equational, 11-17, 31-32, 65-75, 165-67
  four-axiom-type, 122-25, 166
  non-equational, (see also directed rules.), 67-73, 174-81
transformations, (see also directed use, inference.), 67-75
  adjunction, 73, 78-90, 101, 110-25, 169
  deletion, 73, 78-90, 97, 110-25, 169
  extraposition, 100, 103, 116-17
  interposition, 100, 116-17
transitivity, 13, 16, 165
validity, 14, 18, 19, 27, 31-32, 66-67, 89, 109, 166
verification, 41-43
vocabulary (see also representation.), 31-32, 36, 167