English Pages 388 [412] Year 2009
A. H. Louie More Than Life Itself A Synthetic Continuation in Relational Biology
CATEGORIES Edited by Roberto Poli (Trento) Advisory Board John Bell (London, CA) Mark Bickhard (Lehigh) Heinrich Herre (Leipzig) David Weissman (New York) Volume 1
Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
North and South America by Transaction Books, Rutgers University, Piscataway, NJ 08854-8042
United Kingdom, Ireland, Iceland, Turkey, Malta, Portugal by Gazelle Books Services Limited, White Cross Mills, Hightown, LANCASTER, LA1 4XS
Livraison pour la France et la Belgique: Librairie Philosophique J.Vrin 6, place de la Sorbonne; F-75005 PARIS Tel. +33 (0)1 43 54 03 47; Fax +33 (0)1 43 54 48 18 www.vrin.fr
2009 ontos verlag P.O. Box 15 41, D-63133 Heusenstamm www.ontosverlag.com ISBN 978-3-86838-044-6 2009 No part of this book may be reproduced, stored in retrieval systems or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use of the purchaser of the work Printed on acid-free paper FSC-certified (Forest Stewardship Council) This hardcover binding meets the International Library standard Printed in Germany by buch bücher dd ag
To my lignum vitae and arbor scientiae, my genealogical and academic ancestors on whose shoulders I stand
In 518 BC, Pythagoras journeyed west, and had a comprehensive interview with the prominent ruler Leon of Phlius while both were attending the Olympic Games. Prince Leon was most impressed by Pythagoras’s range of knowledge and asked in which of the arts he was most proficient. Pythagoras replied that, rather than being proficient in any art, he regarded himself as being a “philosopher”. Prince Leon had never heard this term before — it had been newly coined by Pythagoras —and asked for an explanation. Pythagoras said: “Life may well be compared with these pan-Grecian Games. For in the vast crowd assembled here, some are led on by the hopes and ambitions of fame and glory, while others are attracted by the gain of buying and selling, with mere views of profit and wealth. But among them there are a few, and they are by far the best, whose aim is neither applause nor profit, but who come merely as spectators through curiosity, to observe what is done, and to see in what manner things are carried on here. “It is the same with life. Some are slaves to glory, power and domination; others to money. But the finest type of man gives himself up to discovering the meaning and purpose of life itself. He, taking no account of anything else, earnestly looks into the nature of things. This is the man I call a lover of wisdom, that is, a philosopher. As it is most honourable to be an onlooker without making any acquisition, so in life, the contemplation of all things and the quest to know them greatly exceed every other pursuit.” — free translation of anecdote recorded in Marcus Tullius Cicero (c. 45 BC) Tusculanae Questiones Liber Quintus, III: 8–9
Contents

Praefatio: Unus non sufficit orbis
Nota bene

Prolegomenon: Concepts from Logic
  In principio...
  Subset
  Conditional Statements and Variations
  Mathematical Truth
  Necessity and Sufficiency
  Complements
  Neither More Nor Less

PART I: Exordium

1 Praeludium: Ordered Sets
  Mappings
  Equivalence Relations
  Partially Ordered Sets
  Totally Ordered Sets

2 Principium: The Lattice of Equivalence Relations
  Lattices
  The Lattice X
  Mappings and Equivalence Relations
  Linkage
  Representation Theorems

3 Continuatio: Further Lattice Theory
  Modularity
  Distributivity
  Complementarity
  Equivalence Relations and Products
  Covers and Diagrams
  Semimodularity
  Chain Conditions

PART II: Systems, Models, and Entailment

4 The Modelling Relation
  Dualism
  Natural Law
  Model versus Simulation
  The Prototypical Modelling Relation
  The General Modelling Relation

5 Causation
  Aristotelian Science
  Aristotle’s Four Causes
  Connections in Diagrams
  In beata spe

6 Topology
  Network Topology
  Traversability of Relational Diagrams
  The Topology of Functional Entailment Paths
  Algebraic Topology
  Closure to Efficient Causation

PART III: Simplex and Complex

7 The Category of Formal Systems
  Categorical System Theory
  Constructions in S
  Hierarchy of S-Morphisms and Image Factorization
  The Lattice of Component Models
  The Category of Models
  The Α and the Ω
  Analytic Models and Synthetic Models
  The Amphibology of Analysis and Synthesis

8 Simple Systems
  Simulability
  Impredicativity
  Limitations of Entailment and Simulability
  The Largest Model
  Minimal Models
  Sum of the Parts
  The Art of Encoding
  The Limitations of Entailment in Simple Systems

9 Complex Systems
  Dichotomy
  Relational Biology

PART IV: Hypotheses fingo

10 Anticipation
  Anticipatory Systems
  Causality
  Teleology
  Synthesis
  Lessons from Biology
  An Anticipatory System is Complex

11 Living Systems
  A Living System is Complex
  (M,R)-Systems
  Interlude: Reflexivity
  Traversability of an (M,R)-System
  What is Life?
  The New Taxonomy

12 Synthesis of (M,R)-Systems
  Alternate Encodings of the Replication Component
  Replication as a Conjugate Isomorphism
  Replication as a Similarity Class
  Traversability

PART V: Epilogus

13 Ontogenic Vignettes
  (M,R)-Networks
  Anticipation in (M,R)-Systems
  Semiconservative Replication
  The Ontogenesis of (M,R)-Systems

Appendix: Category Theory
  Categories → Functors → Natural Transformations
  Universality
  Morphisms and Their Hierarchies
  Adjoints

Bibliography

Acknowledgments

Index
Praefatio Unus non sufficit orbis
In my mentor Robert Rosen’s iconoclastic masterwork Life Itself [1991], which dealt with the epistemology of life, he proposed a Volume 2 that was supposed to deal with the ontogeny of life. As early as 1990, before Life Itself (i.e., ‘Volume 1’) was even published (he had just then signed a contract with Columbia University Press), he mentioned to me in our regular correspondence that Volume 2 was “about half done”. Later, in his 1993 Christmas letter to me, he wrote: ...I’ve been planning a companion volume [to Life Itself] dealing with ontology. Well, that has seeped into every aspect of everything else, and I think I’m about to make a big dent in a lot of old problems. Incidentally, that book [Life Itself] has provoked a very large response, and I’ve been hearing from a lot of people, biologists and others, who have been much dissatisfied with prevailing dogmas, but had no language to articulate their discontents. On the other hand, I’ve outraged the “establishment”. The actual situation reminds me of when I used to travel in Eastern Europe in the old days, when everyone was officially a Dialectical Materialist, but unofficially, behind closed doors, nobody was a Dialectical Materialist.
When Rosen died unexpectedly in 1998, his book Essays on Life Itself [published posthumously in 2000] was in the final stages of preparation. But this collection of essays is not ‘Volume 2’, as he explained in its Preface: Thus this volume should be considered a supplement to the original volume. It is not the projected second volume, which deals with ontogenetics rather than with epistemology, although some chapters herein touch on ideas to be developed therein. We see, therefore, that the “projected second volume” was then still a potentiality. I have, however, never seen any actualization of this ‘Volume 2’, and no part of its manuscript has ever been found. Rosen did, nevertheless, leave behind a partially completed manuscript tentatively entitled “Complexity”. This was a work-in-progress, with only a few sections (mostly introductory material) finished. It may or may not be what he had in mind for the projected second volume of Life Itself. My opinion is that it is not. To me, its contents are neither sufficiently extensive nor on-topic enough for it to be a more-than-half-done Volume 2 on the ontogeny of life. In the years since, I had begun an attempt to extend the manuscript into Life Itself Volume 2, but this effort of raising his orphan, as it were, was abandoned for a variety of reasons; one of them was that I did not want to be Süssmayr to Mozart’s Requiem. The book that you are now reading, More Than Life Itself, is therefore not my completion of the anticipated Volume 2 of Robert Rosen’s Life Itself, and has not incorporated any of his text from the “Complexity” manuscript. It is entirely my own work in the Rashevsky-Rosen school of relational biology. The inheritance of Nicolas Rashevsky (1899–1972) and Robert Rosen (1934–1998), my academic grandfather and father, is, of course, evident (and rightly and unavoidably so). Indeed, some repetition of what Rosen has already written first (which is worthy of repetition in any case) may occasionally be found. 
After all, he was a master of les mots justes, and one can only rearrange a precise
mathematical statement in a limited number of ways. As Aristotle said, “When a thing has been said once, it is hard to say it differently.” The crux of relational biology, a term coined by Nicolas Rashevsky, is “Throw away the matter and keep the underlying organization.” The characterization of life is not what the underlying physicochemical structures are, but by its entailment relations, what they do, and to what end. In other words, life is not about its material cause, but is intimately linked to the other three Aristotelian causes, formal, efficient, and final. This is, however, not to say that structures are not biologically important: structures and functions are intimately and synergistically related. Our slogan is simply an emphatic statement that we take the view of ‘function dictates structure’ over ‘structure implies function’. Thus relational biology is the operational description of our endeavour, the characteristic name of our approach to our subject, which is mathematical biology. Note that ‘biology’ is the noun and ‘mathematical’ is the adjective: the study of living organisms is the subject, and the abstract deductive science that is mathematics is the tool. Stated otherwise, biology is the final cause and mathematics is the efficient cause. The two are indispensable ingredients, indeed complementary (and complimentary) halves of our subject. Relational biology can no more be done without the mathematics than without the biology. Heuristic, exploratory, and expository discussions of a topic, valuable as they may be, do not become the topic itself; one must distinguish the science from the meta-science. The Schrödinger question “What is life?” is an abbreviation. 
A more explicitly posed expansion is “What distinguishes a living system from a non-living one?”; alternatively, “What are the defining characteristics of a natural system for us to perceive it as being alive?” This is the epistemological question Rosen discusses and answers in Life Itself. His answer, in a nutshell, is that an organism — the term is used in the sense of an ‘autonomous life form’, i.e., any living system — admits a certain kind of relational description, that it is ‘closed to efficient causation’. (I shall
explain in detail these and many other somewhat cryptic, very Rosen terms in this monograph.) The epistemology of biology concerns what one learns about life by looking at the living. From the epistemology of life, an understanding of the relational model of the inner workings of what is alive, one may move on to the ontogeny of life. The ontology of biology involves the existence of life, and the creation of life out of something else. The ontogenetic expansion of Schrödinger’s question is “What makes a natural system alive?”; or, “What does it take to fabricate an organism?” This is a hard question. This monograph More Than Life Itself is my first step, a synthesis in every sense of the word. With the title that I have chosen for the book, I obviously intend it to be a continuation of Robert Rosen’s conception and work in Life Itself. But (as if it needs to be explicitly written) I am only Robert Rosen’s student, not Robert Rosen himself. No matter how sure I am of my facts, I cannot be so presumptuous as to state that, because I know my mentor-colleague-friend and his work so well, what I write is what he would have written. In other words, I cannot, of course, claim that I speak for Robert Rosen, but in his absence, with me as a ‘torch-bearer’ of the school of relational biology, my view will have to suffice as a surrogate. But surrogacy implicitly predicates nonequivalence. My formulations occasionally differ from Rosen’s, and this is another reason why I find it more congenial to not publish my More Than Life Itself as ‘Volume 2 of Robert Rosen’s Life Itself’. I consider these differences evolutionary in relational biology: as the subject develops from Rashevsky to Rosen to me, each subsequent generation branches off on the arbor scientiae. Any errors (the number of which I may fantasize to be zero but can only hope to be small, and that they are slight and trivially fixable) that appear in this book are, naturally, entirely mine. 
The capacity to err is, in fact, the real marvel of evolution: the processes of metabolism-repair-replication are ordained from the very beginning to make small mistakes. Thus through mutational blunders progress and improvements are made. The Latin root for ‘error’, the driving force of evolution, is erratio, which means roving, wandering about looking for something, quest.
In complex analysis (the theory of functions of a complex variable), analytic continuation is a technique used to extend the domain of a given holomorphic (alias analytic) mapping. As an analogue of this induction, I use the term synthetic continuation in the subtitle of this monograph that is the song of our synthetic journey. Analytic biology attempts to model specific fragments of natural phenomena; synthetic biology begins with categories of mathematical objects and morphisms, and seeks their realizations in biological terms. Stated otherwise, in relational biology, mathematical tools are used synthetically: we do not involve so much in the making of particular models of particular biological phenomena, but rather invoke the entailment patterns (or lack thereof) from certain mathematical theories and interpret them biologically. Nature is the realization of the simplest conceivable mathematical ideas. I shall have a lot more to say on analysis versus synthesis in this monograph. Someone once said to Rosen: “The trouble with you, Rosen, is that you keep trying to answer questions nobody wants to ask.” [Rosen 2006]. It appears that his answers themselves cause even more self-righteous indignation in some people, because the latter’s notions of truth and Rosen’s answers do not coincide. Surely only the most arrogant and audacious would think that the technique they happen to be using to engage their chosen field is the be-all and end-all of that subject, and would be annoyed by any alternate descriptions, Rosen’s or otherwise. One needs to remember that the essence of a complex system is that a single description does not suffice to account for our interactions with it. Alternate descriptions are fundamental in the pursuit of truth; plurality spices life. “One world is not enough.” Uncritical generalizations about what Rosen said are unhelpful. 
For example, according to Rosen, one of the many corollaries of being an organism is that it must have noncomputable models. The point is that life itself is not computable. This in no way means that he somehow implies that computable models are useless, and therefore by extension people involved with biological computing are wasting their time! There are plenty of useful computing models of biological processes. The simple
fact is that computing models (and indeed any models whatsoever) will be, by definition, incomplete, but they may nevertheless be fruitful endeavours. One learns a tremendous amount even from partial descriptions. Along the same vein, some impudent people take great offence in being told by Rosen that their subject area (e.g. the physics of mechanisms), their ‘niche’, is special and hence nongeneric. Surely it should have been a compliment! An algebraic topologist, say, would certainly take great pride that her subject area is indeed not run-of-the-mill, and is a highly specialized area of expertise. I am a mathematical biologist. I would not in my wildest dream think that mathematics can provide almost all (in the appropriately mathematical sense) the tools that are suitable for the study of biology. A mathematical biologist is a specialist in a very specialized area. There is nothing wrong in being a specialist; what is wrong is the reductionistic view that the specialization is in fact general, that all (or at least all in the ‘territory’ of the subject at hand) should conform to the specialization. Why is being nongeneric an insult? Are some people, in their self-aggrandizement, really pretentious enough to think that the subject they happen to be in would provide answers to all the questions of life, the universe, and everything? Rosen’s revelations hit particularly hard those who believe in the ‘strong’ Church-Turing thesis, that for every physically realizable process in nature there exists a Turing machine that provides a complete description of the process. In other words, to them, everything is computable. Note that Rosen only said that life is not computable, not that artificial life is impossible. However one models life, natural or artificial, one cannot succeed by computation alone. Life is not definable by an algorithm. 
Artificial life does not have to be limited to what a computing machine can do algorithmically; computing is but one of a multitude of available tools. But for the ‘strong’ Church-Turing thesis believers, they would have the syllogism
Rosen says life is not computable.
Everything is computable.
Therefore Rosen says artificial life is impossible.
Compare and contrast this to Alan Turing’s psychotic syllogism, a non sequitur that is so iconic of his demise:
Turing believes machines think.
Turing lies with men.
Therefore machines cannot think.
The following is a diagram of the modelling relation. (I shall have a lot more to say about it in Chapter 4.)
Natural systems from the external world generate in us our percepts of ‘reality’. While causal entailments themselves may be universal truths, perceived causal entailments are not. All we have are our own observations, opinions, interpretations, our individual alternate descriptions of ‘reality’ that are our personal models of ‘truth’. Causal entailments are interpreted, not proven. A mathematical proof is absolute; it is categorically more than a scientific ‘proof’ (and a judicial ‘proof’) of ‘beyond a reasonable doubt’. A scientific theory can never be proven to the same absolute certainty of a mathematical theorem. This is because a scientific ‘proof’ is merely considered ‘highly likely based on the evidence available’; it depends on observation, experimentation, perception, and interpretation — all of which are fallible, and are in any case approximations of truth. Sometimes the minimized doubt later turns out to be errors, and paradoxically, in the same
spirit of ‘errors drive evolution’, the inherent weakness in scientific proofs leads to scientific revolutions, when, ‘based on new evidence’, ‘proven’ theories are refined, updated, surpassed, or replaced. Modelling is the art of bringing entailment structures into congruence. The essence of an art is that it rests on the heuristic, the unentailed, the intuitive leap. The encoding and the decoding arrows in the modelling relation diagram are themselves unentailed. Theoretical scientists are more artists than artisans. Natural Law assures them only that their art is not in vain, but it in itself provides not the slightest clue how to go about it. There is no right or wrong in art, only similarities and differences, the congenial and the uncongenial. Among the four arrows in the diagram of the modelling relation, only inferential entailment may be proven in the rigorous mathematical sense. Absolute statements about the truth of statements validated by proofs cannot be disputed. Rosen proved the theorems that he stated in Life Itself, although his presentations are not in the orthodox form of definition-lemma-theorem-proof-corollary that one finds in conventional mathematics journals and textbooks. His book is, after all, not a text in pure mathematics. While the presentation may be ‘Gaussian’, with all scaffolding removed, there are nevertheless enough details in Rosen’s prose that any reasonably competent mathematician can ‘fill in the blanks’ and rewrite the proofs in full, if one so wishes. But because of the unorthodox heuristic form, people have contended Rosen’s theorems. Since the dispute is over form rather than substance, it is not surprising that the contentions are mere grumbles, and no logical fallacies in the theorems have ever been found. A common thread running in many of the anti-Rosen papers that I have encountered is the following: they simply use definitions of terms different from Rosen’s, whence resulting in consequences different from Rosen’s, and thereby concluding that Rosen must be wrong! I shall in the present monograph recast Rosen’s theorems in as rigorously mathematical a footing as possible, using the algebraic theory of lattices. It is an interesting exercise in itself, but it is most unlikely to convert any skeptics with their preconceived ideas of truth.
The other three arrows in the modelling relation diagram (causal entailment, encoding, and decoding) all have intuitive elements in their art and science. As such, one, if so inclined, may claim that another’s interpretations are uncongenial, but cannot conclude that they are wrong: there are no absolute truths concerning them. Rosen closed his monograph Anticipatory Systems with these words: For in a profound sense, the study of models is the study of man; and if we can agree about our models, we can agree about everything else. Agree with our models and partake in our synodal exploration. Else agree to disagree, then we shall amicably part company. In the Preface of Life Itself, Rosen identified his intended readership by quoting from Johann Sebastian Bach’s Clavierübung III: Denen Liebhabern, und besonders denen Kennern von dergleichen Arbeit, zur Gemüths Ergezung... [Written for those who love, and most especially those who appreciate such work, for the delight of their souls...] Let me add to that sentiment by quoting a couplet from a Chinese classic:
[The diligent one sings for oneself, not for the recruitment of an audience.]
The same readers who took delight in Life Itself should also enjoy this More Than Life Itself. Be our companions on our journey and join us in our songs.
A. H. Louie 22 February, 2009
Nota bene
Many references in this monograph are drawn from Robert Rosen’s Trilogy:
● [FM] Fundamentals of Measurement and Representation of Natural Systems [1978],
● [AS] Anticipatory Systems: Philosophical, Mathematical, Methodological Foundations [1985a], and
● [LI] Life Itself: A Comprehensive Inquiry into the Nature, Origin, and Fabrication of Life [1991].
Additional references are in
● [NC] “Organisms as Causal Systems which are not Mechanisms: an Essay into the Nature of Complexity” [1985b] and
● [EL] Essays on Life Itself [2000].
My thesis
● [CS] “Categorical System Theory” [1985]
contains much of the background material on the category theory of natural and formal systems. (See the Bibliography for publication details of these references.) Familiarity with our previous work is not a prerequisite; it would, however, make simpler a first reading of this monograph. I strive to make it as self-contained as possible, but because of the subjects’ inherent complexity, the entailment patterns of the many concepts cannot be rendered unidirectional and sequential. Some topics I present herein depend not only upon material on previous pages but also upon material on following pages. So in a sense this monograph is an embodiment of a relational diagram in graph-theoretic form, a realization of the branches and cycles of the entailment patterns. In this book I assume that the reader is familiar with the basic facts of naive set theory, as presented, for example, in Halmos [1960]. Set theory from the naive point of view is the common approach of most mathematicians (other than, of course, those in mathematical logic and the foundations of mathematics). One assumes the existence of a suitable universe of sets (viz. the universe of small sets) in which the set-theoretic constructions, used in contexts that occur naturally in mathematics, will not give rise to paradoxical contradictions. In other words, one acknowledges these paradoxes, and moves on. This is likewise the position I take in this monograph. In the Prolegomenon I present some set-theoretic and logical preliminaries, but more for the clarity of notations than for the concepts themselves. For example, the relative complement of a set A in a set B may be variously denoted as B ∼ A, B − A, B \ A, etc.; I use the first one. I often use the language of category theory as a metalanguage in my text. The definitive reference on this branch of abstract algebra is Mac Lane [1978]. I give a concise summary in the Appendix of those category-theoretic concepts that appear in my exposition.
Prolegomenon Concepts from Logic
The principle that can be stated Cannot be the absolute principle. The name that can be given Cannot be the permanent name. — Lao Tse (6th century BC) Tao Te Ching Chapter 1
In principio... 0.1 Κοιναι εννοιαι α Book I of Euclid’s Elements begins with a list of twenty-three definitions and five postulates in plane geometry, followed by five common notions that are general logical principles. Common Notion 1 states
“Things equal to the same thing are also equal to one another.” Equality is a primitive: such proclamation of its self-evident property without proof is the very definition of axiom. Thus formally begins mathematics... It may be argued that equality is the most basic property in any mathematical subject. In set theory, equality of sets is formulated as the 0.2 Axiom of Extension Two sets are equal if and only if they have the same elements.
Subset
I shall assume that the reader has a clear intuitive idea of the notions of a set and of belonging to a set. I use the words ‘set’, ‘collection’, and ‘family’ as synonyms. The elementary operations that may be performed with and on sets are used without further comments, save one:
0.3 Definition If A and B are sets and if every element of A is an element of B, then A is a subset of B.
The wording of the definition implies that there are two possibilities: either A = B, or B contains at least one element that is not in A, in which case A is called a proper subset of B. It has been increasingly popular in the mathematical literature to use A ⊆ B as notation, seduced by the ordering relation ≤. This usage, unfortunately, almost always ends here. ⊆-users rarely use the then-consistent notation A ⊂ B, analogous to <, to mean proper subset, but often resort to the idiosyncratic ⊊ instead. The few exceptions that do employ ⊂ to mean ‘proper subset’ invariably lead to confusion, because of the well-established standard notation
(1)  A ⊂ B
for ‘either A = B or A is a proper subset of B’. The notation I use in this book is this standard which is inclusive of both senses of the containment of set A in set B. Sometimes A ⊂ B is reversely described as ‘B is a superset of A’. If A and B are sets such that A ⊂ B and B ⊂ A, then the two sets have the same elements. It is equally obvious vice versa. The Axiom of Extension 0.2 may, therefore, be restated as the
0.4 Theorem Two sets A and B are equal if and only if A ⊂ B and B ⊂ A.
On account of this theorem, the proof of set equality A = B is usually split into two parts: first prove that A ⊂ B, and then prove that B ⊂ A.
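The two-part proof pattern of Theorem 0.4 can be illustrated for finite sets with a minimal Python sketch. The function names is_subset, is_proper_subset, and sets_equal are hypothetical names chosen for this illustration only; they do not come from the text.

```python
# A minimal sketch for finite sets; all function names are illustrative assumptions.

def is_subset(a, b):
    """Inclusive containment A ⊂ B (Definition 0.3): every element of a is in b."""
    return all(x in b for x in a)

def is_proper_subset(a, b):
    """A is contained in B, and B has at least one element that is not in A."""
    return is_subset(a, b) and not is_subset(b, a)

def sets_equal(a, b):
    """Theorem 0.4: A = B if and only if A ⊂ B and B ⊂ A."""
    return is_subset(a, b) and is_subset(b, a)

A = {1, 2, 3}
B = {3, 2, 1}
C = {1, 2, 3, 4}

print(sets_equal(A, B))        # the same elements, listed in a different order
print(is_subset(A, C))         # inclusive containment holds
print(is_proper_subset(A, C))  # C contains 4, which is not in A
print(sets_equal(A, C))        # fails because C ⊂ A does not hold
```

The check in sets_equal mirrors the proof strategy in the text: one containment is established first, then the other.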
Conditional Statements and Variations
0.5 Conditional Many statements, especially in mathematics, are of the form ‘If p, then q.’ We have already encountered some in this prologue. These are called conditional statements, and are denoted in the predicate calculus of formal logic by
(2)  p → q.
The if-clause p is called the antecedent and the then-clause q is called the consequent. Note that the conditional form (2) may be translated equivalently as ‘ q if p .’ So the clauses of the sentence may be written in the reverse order, when the antecedent does not in fact ‘go before’, and the conjunction ‘then’ does not explicitly appear in front of, the consequent. If the antecedent is true, then the conditional statement is true if the consequent is true, and the conditional statement is false if the consequent is false. If the antecedent is false, then the conditional statement is true regardless of whether the consequent is true or false. In other words, the
conditional p → q is false if p is true and q is false, and it is true otherwise. 0.6 I Say What I Mean “Then you should say what you mean,” the March Hare went on. “I do,” Alice hastily replied; “at least — at least I mean what I say — that’s the same thing, you know.” “Not the same thing a bit!” said the Hatter. “Why, you might just as well say that ‘I see what I eat’ is the same thing as ‘I eat what I see’!” “You might just as well say,” added the March Hare, “that ‘I like what I get’ is the same thing as ‘I get what I like’!” “You might just as well say,” added the Dormouse, which seemed to be talking in his sleep, “that ‘I breathe when I sleep’ is the same thing as ‘I sleep when I breathe’!” “It is the same thing with you,” said the Hatter, and here the conversation dropped, and the party sat silent for a minute,... — Lewis Carroll (1865) Alice’s Adventures in Wonderland Chapter VII A Mad Tea Party
Alice’s “I do” is the contention “I say what I mean”. This may be put as the conditional statement “If I mean it, then I say it.”, which is form (2), p → q, with p = ‘I mean it’ and q = ‘I say it’. It is equivalent to the statement
“I say it if I mean it.”
The conditional p → q may also be read as ‘p only if q’. Alice’s statement is then
“I mean it only if I say it.”
The adverb ‘only’ has many nuances, and in common usage ‘only if’ is sometimes used simply as an emphasis of ‘if’. But in mathematical logic ‘only if’ means ‘exclusively if’. So ‘p only if q.’ means ‘If q does not hold, then p cannot hold either.’ In other words, it is logically equivalent to ‘If not q, then not p.’, which in the predicate calculus is
(3)  ¬q → ¬p
(where ¬ denotes negation, the logical not). The conditional form (3) is called the contrapositive of the form (2). The contrapositive of Alice’s “I mean it only if I say it.” ( = “If I mean it, then I say it.” ) is the equivalent conditional statement
“If I do not say it, then I do not mean it.”
0.7 I Mean What I Say The conditional form
(4)  q → p
is called the converse of the form (2), and the equivalent contrapositive of the converse, i.e. the conditional form
(5)  ¬p → ¬q,
is called the inverse of the original form (2). A conditional statement and its converse or inverse are not logically equivalent. For example, if p is true and q is false, then the conditional p → q is false, but its converse
q → p is true. The confusion between a conditional statement and its converse is a common mistake. Alice thought “I mean what I say.” (i.e. the converse statement “If I say it, then I mean it.”) was the same thing as “I say what I mean.” (the original conditional statement “If I mean it, then I say it.”), and was then thoroughly ridiculed by her Wonderland acquaintances. 0.8 Biconditional The conjunction (6)
( p → q) ∧ (q → p)
(where ∧ is the logical and) is abbreviated into (7)
p ↔ q,
called a biconditional statement. Since q → p may be read ‘ p if q ’ and p → q may be read ‘ p only if q ’, the biconditional statement p ↔ q is ‘ p if and only if q ’, often abbreviated into ‘ p iff q ’. If p and q have the same truth value (i.e. either both are true or both are false), then the biconditional statement p ↔ q is true; if p and q have opposite truth values, then p ↔ q is false.
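As a minimal sketch (the function names are illustrative and not part of the text, and Python's built-in booleans are assumed to stand in for the two truth values), the claims about forms (2) to (7) can be checked by running through all four rows of the truth table:

```python
from itertools import product

def conditional(p: bool, q: bool) -> bool:
    """Form (2), p -> q: false only when p is true and q is false."""
    return (not p) or q

def contrapositive(p: bool, q: bool) -> bool:
    """Form (3): not q -> not p."""
    return conditional(not q, not p)

def converse(p: bool, q: bool) -> bool:
    """Form (4): q -> p."""
    return conditional(q, p)

def inverse(p: bool, q: bool) -> bool:
    """Form (5): not p -> not q."""
    return conditional(not p, not q)

def biconditional(p: bool, q: bool) -> bool:
    """Form (7), defined as the conjunction (6): (p -> q) and (q -> p)."""
    return conditional(p, q) and conditional(q, p)

# All four assignments of truth values to (p, q).
ROWS = list(product([True, False], repeat=2))

def same_truth_table(f, g) -> bool:
    """Two forms are logically equivalent iff they agree on every row."""
    return all(f(p, q) == g(p, q) for p, q in ROWS)

if __name__ == "__main__":
    print("p      q      p->q   q->p   p<->q")
    for p, q in ROWS:
        print(f"{p!s:6} {q!s:6} {conditional(p, q)!s:6} "
              f"{converse(p, q)!s:6} {biconditional(p, q)!s:6}")
```

The row p true, q false is the one that separates Alice's statement from its converse: there the conditional is false while the converse is true.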
Mathematical Truth “Pure mathematics consists entirely of such asseverations as that, if such and such a proposition is true of anything, then such and such another proposition is true of that thing. It is essential not to discuss whether the first proposition is really true, and not to mention what the anything is of which it is supposed to be true. … If our hypothesis is about anything and not about some one or more particular things, then our deductions constitute mathematics. Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.” — Bertrand Russell (1901) Recent work on the principles of Mathematics
In mathematics, theorems (also propositions, lemmata, and corollaries) assert the truth of statements. Grammatically speaking, they should have as their subjects the statement (or the name of, or some other reference to, the statement), and as predicates the phrase ‘is true’ (or ‘holds’, or some similar such). For example, the concluding Rosen theorem in Section 9G of LI is 0.9 Theorem There can be no closed path of efficient causation in a mechanism. (The word ‘mechanism’ has a very specific meaning in the Rosen lexicon: 0.10 Definition A natural system is a mechanism if and only if all of its models are simulable. I shall have a lot more to say on this in Chapter 8.) Theorem 0.9 should be understood as 0.9' Theorem ‘There can be no closed path of efficient causation in a mechanism.’ is true. Or, what is the same, 0.9' Theorem Theorem 0.9 is true. But, of course, this Theorem 0.9' really means 0.9" Theorem Theorem 0.9' is true.
Or, equivalently, 0.9" Theorem “Theorem 0.9 is true.” is true. This “statement about a statement” idea may, alas, be iterated ad infinitum, to 0.9 T Theorem ……“ “ “ “ “Theorem 0.9 is true.” is true ” is true ” is true ” is true” ……. Lewis Carroll wrote about this hierarchical ‘reasoning about reasoning’ paradox in a witty dialogue What the Tortoise said to Achilles [1895]. Efficiency and pragmatism dictate the common practice that the predicate is implicitly assumed and hence usually omitted. A theorem, then, generally consists of just the statement itself, the truth of which it asserts. 0.11 Implication An implication is a true statement of the form (8)
“ ‘ p → q ’ is true.”
It is a statement about (the truth of) the conditional statement (9)
‘ p → q ’.
The implication (8) is denoted in formal logic by (10)
p ⇒ q,
which is read as ‘ p implies q ’. When a conditional statement is expressed as a theorem in mathematics, viz. Theorem If p , then q . it is understood in the sense of (8), that it is an implication.
The difference between → and ⇒ , i.e. between a conditional statement and an implication, is that of syntax and semantics. Note that p → q is just a proposition in the predicate calculus, which may be true or false. But p ⇒ q is a statement about the conditional statement p → q , asserting that the latter is a true statement. In particular, when p ⇒ q , the situation that p is true and q is false (which is the only circumstance for which the conditional p → q is false) cannot occur. 0.12 Modus tollens Since a conditional statement and its contrapositive are equivalent, when p → q is true, ¬q → ¬p is also true. The contrapositive inference (11)
( p ⇒ q ) ⇒ ( ¬q ⇒ ¬ p )
is itself an implication, called modus tollens in mathematical logic. Most mathematical theorems are stated, or may be rewritten, as implications. The Rosen Theorem 0.9, for example, is p ⇒ q with p = ‘ N is a mechanism’ and q = ‘there is no closed path of efficient causation in N ’, where N is a natural system. Stated explicitly, it is the 0.13 Theorem If a natural system N is a mechanism, then there is no closed path of efficient causation in N . The equivalent contrapositive implication ¬q ⇒ ¬p is the 0.14 Theorem If a closed path of efficient causation exists in a natural system N , then N cannot be a mechanism. 0.15 Equivalence A true statement of the form (12)
“ ‘ p ↔ q ’ is true.”,
which asserts the truth of a biconditional statement, is called an equivalence. It is denoted as (13)
p ⇔ q,
and is read as ‘ p and q are equivalent’. It is clear from the definitions that the equivalence (13) is equivalent to the conjunction (14)
p ⇒ q and q ⇒ p .
When p ⇔ q , either both p and q are true, or both are false. When a biconditional statement is expressed as a theorem in mathematics, viz. Theorem
p if and only if q .
it is understood in the sense of (12) that the biconditional statement p ↔ q is in fact true, that it is the equivalence p ⇔ q . 0.16 Definition A definition is trivially a theorem — by definition, as it were. It is also often expressed as an equivalence, i.e., with an ‘if and only if’ statement. See, for example, Definition 0.10 of ‘mechanism’. Occasionally a definition may be stated as an implication (e.g. Definition 0.3 of ‘subset’), but in such cases the converse is implied (by convention, or, indeed, by definition). Stated otherwise, a definition is always an equivalence, whether it is expressed as such or not, between the term being defined and the defining conditions. Definition 0.3 is the implication p ⇒ q where p = ‘every element of A is an element of B ’ and q = ‘set A is a subset of set B ’. But since this is a definition, implicitly entailed is the converse q ⇒ p :
0.3' Definition If a set A is a subset of a set B then every element of A is an element of B . So the definition is really 0.3'' Definition A set A is a subset of a set B if and only if every element of A is an element of B . Note that this implicit entailment is not a contradiction to the fact, discussed above in 0.7, that a conditional statement is not logically equivalent to its converse. The propositions p → q and q → p will always remain logically distinct, and in general the implication p ⇒ q says nothing about q ⇒ p . The previous paragraph only applies to definitions, and its syntax is
(15)
‘If p ⇒ q is a definition, then also q ⇒ p , whence p ⇔ q .’
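The distinction between the conditional p → q and the implication p ⇒ q can be illustrated with a small finite model. This is only a sketch under a stated assumption: statements are modelled as Python predicates evaluated over a finite range of integer 'cases' (the names `CASES`, `divisible_by_4`, `even`, and `square_is_even` are hypothetical), and 'p ⇒ q' is taken to mean that the conditional is true in every case, so that the situation 'p true and q false' never occurs.

```python
def conditional(p: bool, q: bool) -> bool:
    """The conditional p -> q: false only when p is true and q is false."""
    return (not p) or q

def implies(p, q, cases) -> bool:
    """Finite model of p => q: the conditional p -> q is true in every case."""
    return all(conditional(p(c), q(c)) for c in cases)

def equivalent(p, q, cases) -> bool:
    """Finite model of p <=> q via (14): p => q and q => p."""
    return implies(p, q, cases) and implies(q, p, cases)

def negate(p):
    """Return the predicate 'not p'."""
    return lambda c: not p(c)

# Hypothetical universe of cases and sample statements about an integer n.
CASES = range(-20, 21)

def divisible_by_4(n: int) -> bool:
    return n % 4 == 0

def even(n: int) -> bool:
    return n % 2 == 0

def square_is_even(n: int) -> bool:
    return (n * n) % 2 == 0

forward = implies(divisible_by_4, even, CASES)                   # p => q
tollens = implies(negate(even), negate(divisible_by_4), CASES)   # not q => not p, as in (11)
backward = implies(even, divisible_by_4, CASES)                  # the converse q => p
both = equivalent(even, square_is_even, CASES)                   # an equivalence, as in (13)
```

In this model the contrapositive inference (11) goes through, while the converse fails at n = 2: an implication by itself says nothing about its converse unless, as in (15), it is a definition.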
Necessity and Sufficiency 0.17 Modus ponens (16)
The law of inference
‘If p ⇒ q and p is true, then q is true.’
is called modus ponens. This inference follows from the fact that when p ⇒ q , p → q is true; so the situation that p is true and q is false (the only circumstance for which the conditional p → q is false) cannot occur. Thus the truth of p predicates q . Incidentally, modus ponens is the ‘theorem’ that begins the propositional canon in Lewis Carroll’s What the Tortoise said to Achilles
[1895]. Note that the truth of p → q is required for the truth of p to entail the truth of q . In a general (not necessarily true) conditional statement p → q , the truth values of p and q are independent. Because of its inferential entailment structure (that the truth of p is sufficient to establish the truth of q ), the implication p ⇒ q may also be read ‘ p is sufficient for q ’. Contrapositively (hence equivalently), the falsehood of q is sufficient to establish the falsehood of p . In other words, if q is false, then p cannot possibly be true; i.e. the truth of q is necessary (although some additional true statements may be required) to establish the truth of p . Thus the implication p ⇒ q may also be read ‘ q is necessary for p ’. The equivalence p ⇔ q (i.e. when ‘ p iff q ’ is true so that p and q predicate each other) may, therefore, be read ‘ p is necessary and sufficient for q ’. 0.18 Membership The concepts of necessity and sufficiency are intimately related to the concept of subset. Definition 0.3'' is the statement (17)
A ⊂ B iff ∀x ( x ∈ A ) ⇒ ( x ∈ B ) .
Stated otherwise, when A is a subset of B (or, what is the same, B includes A ), membership in A is sufficient for membership in B , and membership in B is necessary for membership in A . Similarly, the Axiom of Extension 0.2 is the statement (18)
A = B iff ∀x ( x ∈ A ) ⇔ ( x ∈ B ) ;
i.e. membership in A and membership in B are necessary and sufficient for each other. The major principle of set theory is the
0.19 Axiom of Specification For any set U and any statement p ( x ) about x , there exists a set P the elements of which are exactly those x ∈ U for which p ( x ) is true. It follows immediately from the Axiom of Extension that the set P is determined uniquely. To indicate the way P is obtained from U and p ( x ) , the customary notation is (19)
P = { x ∈ U : p ( x )} .
The ‘ p ( x ) ’ in (19) is understood to mean “ ‘ p ( x ) ’ is true” (with the conventional omission of the predicate); it may also be read as ‘ x has the property p ’. For example, let N be the set of all natural systems, and let s ( N ) = ‘all models of N are simulable’. Then one may denote the set of all mechanisms M (cf. Definition 0.10) as (20)
M = { N ∈ N : s ( N )} .
When the ‘universal set’ U is obvious from the context (or inconsequential), it may be dropped, and the notation (19) abbreviates to (21)
P = { x : p ( x )} .
As a trivial example, a set A may be represented as (22)
A = { x : x ∈ A} .
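For finite sets, the notation (19) corresponds closely to a set comprehension. The following is a sketch only: the helper `specify` and the sample universe and predicates are assumptions made for illustration, and a finite Python `set` can of course only stand in for the sets of the Axiom of Specification.

```python
from math import isqrt

def specify(universe, p):
    """Finite sketch of (19): P = { x in U : p(x) }."""
    return {x for x in universe if p(x)}

# Hypothetical universe U and property p(x) = 'x is a perfect square'.
U = set(range(1, 21))

def is_square(x: int) -> bool:
    return isqrt(x) ** 2 == x

P = specify(U, is_square)

# The trivial representation (22), for a set A contained in U.
A = {2, 3, 5, 7}
A_again = specify(U, lambda x: x in A)

# Two properties that hold for exactly the same elements specify the same set,
# in line with the uniqueness that follows from the Axiom of Extension.
evens = specify(U, lambda x: x % 2 == 0)
even_squares_roots = specify(U, lambda x: (x * x) % 2 == 0)
```

Here `P` collects the perfect squares in U, `A_again` reproduces `A`, and the last two sets coincide because an integer is even exactly when its square is.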
0.20 Implication and Inclusion Statement (17) connects set inclusion with implication of the membership property. Analogously, if one property implies another, then the set specified by the former is a subset of the set specified by the latter (and conversely). Explicitly, if x has the property p
implies that x has the property q , i.e. if ∀x p ( x ) ⇒ q ( x ) , then P = { x : p ( x )} is a subset of Q = { x : q ( x )} (and conversely):
(23)
P ⊂ Q iff ∀x p ( x ) ⇒ q ( x ) .
The equivalence (23) may be read as P ⊂ Q if and only if p is sufficient for q , and also P ⊂ Q if and only if q is necessary for p . For example, let N be the set of all natural systems, let t ( N ) = ‘there is no closed path of efficient causation in N ’, and let (24)
T = { N ∈ N : t ( N )} .
Let M be the set of all mechanisms as specified in (20). Theorem 0.9 (the proof of which is the content of Chapter 9 of LI, and is given an alternate presentation later on in Chapter 8 of this monograph) is the statement (25)
∀N ∈ N s ( N ) ⇒ t ( N ) ,
whence equivalently (26)
M ⊂ T.
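Statements (17) and (23) can be checked exhaustively on small finite sets. The sketch below assumes Python's `<=` on sets as the inclusion ⊂ of the text (which allows equality), and uses hypothetical sample properties on the integers 1 to 100.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as Python sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def membership_implies(A, B, universe) -> bool:
    """For all x: (x in A) => (x in B), i.e. the conditional holds for every x."""
    return all((x not in A) or (x in B) for x in universe)

def property_implies(p, q, universe) -> bool:
    """For all x: p(x) => q(x)."""
    return all((not p(x)) or q(x) for x in universe)

# Statement (17), checked for every pair of subsets of a four-element universe.
U = set(range(4))
statement_17 = all(
    (A <= B) == membership_implies(A, B, U)
    for A in subsets(U)
    for B in subsets(U)
)

# Statement (23), checked for two hypothetical properties.
V = range(1, 101)
p = lambda x: x % 6 == 0   # 'x is a multiple of 6'
q = lambda x: x % 3 == 0   # 'x is a multiple of 3'
P = {x for x in V if p(x)}
Q = {x for x in V if q(x)}
statement_23 = (P <= Q) == property_implies(p, q, V)
statement_23_converse = (Q <= P) == property_implies(q, p, V)
```

Being a multiple of 6 is sufficient for being a multiple of 3, so P ⊂ Q; the reverse inclusion and the reverse implication fail together.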
Complements “I think that it would be reasonable to say that no man who is called a philosopher really understands what is meant by the complementary descriptions.” — Niels Bohr (1962) Communication 1117
“Some believe the Principle of Complementarity, but the rest of us do not.” — Anonymous
0.21 Definition The relative complement of a set A in a set B is the set of elements in B but not in A :
(27)
B ∼ A = { x ∈ B : x ∉ A} .
When B is the ‘universal set’ U (of some appropriate universe under study, e.g. the set of all natural systems N), the set U ∼ A is denoted Aᶜ , i.e. (28)
Aᶜ = { x ∈ U : x ∉ A} ,
and is called simply the complement of the set A . An element of U is either a member of A , or not a member of A , but not both. That is, A ∪ Aᶜ = U , and A ∩ Aᶜ = ∅ . The set specified by the property p , P = { x : p ( x )} , has as its complement the set specified by the property ¬p ; i.e. (29)
Pᶜ = { x : ¬p ( x )} .
In the predicate calculus, there are these 0.22 Laws of Quantifier Negation
(30)
¬∀x p ( x ) ⇔ ∃x ¬p ( x )
(31)
¬∃x p ( x ) ⇔ ∀x ¬p ( x )
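Over a finite universe, the quantifiers ∀ and ∃ behave like Python's built-in `all` and `any`, so laws (30) and (31), together with the complement identities, can be tested directly. This is a sketch under that assumption; the universe and the sample predicates are hypothetical.

```python
def quantifier_negation_holds(p, universe) -> bool:
    """Check laws (30) and (31) for one predicate over a finite universe."""
    law_30 = (not all(p(x) for x in universe)) == any(not p(x) for x in universe)
    law_31 = (not any(p(x) for x in universe)) == all(not p(x) for x in universe)
    return law_30 and law_31

def complement(A, universe):
    """The complement of A in the universe: { x in U : x not in A }."""
    return {x for x in universe if x not in A}

# Hypothetical universe and predicates: one sometimes true, one always, one never.
U = set(range(10))
predicates = [lambda x: x < 5, lambda x: x >= 0, lambda x: x > 100]
laws_hold = all(quantifier_negation_holds(p, U) for p in predicates)

# The set specified by p and the set specified by 'not p' are complements.
p = predicates[0]
P = {x for x in U if p(x)}
P_c = {x for x in U if not p(x)}
```

The laws also hold over an empty universe, where `all` is vacuously true and `any` is false.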
The negation of the statement s ( N ) = ‘all models of N are simulable’ is thus ¬s ( N ) = ‘there exists a model of N that is not simulable’. This characterizes the collection of natural systems that are not mechanisms as those that have at least one nonsimulable model. The predicate calculus also has this trivial tautology: 0.23 Discharge of Double Negation
(32)
¬¬p ⇔ p
The negation of the statement t ( N ) = ‘there is no closed path of efficient causation in N ’ is therefore ¬t ( N ) = ‘there exists a closed path of efficient causation in N ’. The equivalent contrapositive statement of (25) is hence (33)
∀N ∈ N ¬t ( N ) ⇒ ¬s ( N ) ,
which gives the 0.24 Theorem If there exists a closed path of efficient causation in a natural system, then it has at least one model that is not simulable (whence it is not-a-mechanism). I shall explore the semantics of Theorems 0.9, 0.13, 0.14, and 0.24 (instead of just their sample syntax used in this prologue to illustrate principles of mathematical logic) in Chapter 8 et seq.
Neither More Nor Less 0.25 Nominalism “I don’t know what you mean by ‘glory’,” Alice said. Humpty Dumpty smiled contemptuously. “Of course you don’t — till I tell you. I meant ‘there’s a nice knock-down argument for you’!” “But ‘glory’ doesn’t mean ‘a nice knock-down argument’,” Alice objected. “When I use a word,” Humpty Dumpty said, in a rather scornful tone, “it means just what I choose it to mean — neither more nor less.” “The question is,” said Alice, “whether you can make words mean so many different things.” “The question is,” said Humpty Dumpty, “which is to be master — that’s all.” — Lewis Carroll (1871) Through the Looking-Glass, and What Alice Found There Chapter VI Humpty Dumpty
Humpty’s point of view is known in philosophy as nominalism, the doctrine that universals or abstract concepts are mere names without any corresponding ‘reality’. The issue arises because in order to perceive a particular object as belonging to a certain class, say ‘organism’, one must have a prior notion of ‘organism’. Does the term ‘organism’, described by this prior notion, then have an existence independent of particular organisms? When a word receives a specific technical definition, does it have to reflect its prior notion, the common-usage sense of the word? Nominalism says no.
0.26 Semantic Equivocation A closely related issue is a fallacy of misconstrual in logic known as semantic equivocation. This fallacy is quite common, because words often have several different meanings, a condition known as polysemy. A polysemic word may represent any one of several concepts, and the semantics of its usage are context-dependent. Errors arise when the different concepts with different consequences are mixed together as one. For a word that has a technical definition in addition to its everyday meaning, non sequitur may result when the distinction is blurred. Confusion often ensues from a failure to clearly understand that words mean “neither more nor less” than what they are defined to mean, not what they are perceived to mean. This happens even in mathematics, where terms are usually more precisely defined than in other subjects. The most notorious example is the term ‘normal’, which appears in numerous mathematical subject areas to define objects with specific properties. In almost all cases (e.g. normal vector, normal subgroup, normal operator), the normal subclass is nongeneric within the general class of objects; i.e. what is defined as ‘normal’ is anything but normal in the common-usage sense of ‘standard, regular, typical’. While it is not my purpose in this monograph to delve into nominalism and semantic equivocation themselves, they do make occasional appearances in what follows as philosophical and logical undertones. 0.27 Structure ‘Extreme’ polysemous words, those having two current meanings that are opposites, are called amphibolous. For example, the word ‘structure’, which means ‘a set of interconnecting parts of a thing’ (its Latin root is struere, to build), has antonymous usage in biology and mathematics: ‘concrete’ in one, and ‘abstract’ in the other. In biology, ‘structure’ means material structure, the constituent physicochemical parts. In our subject of relational biology, our slogan is ‘function dictates structure’. 
Entailment relations within living systems are their most important characteristics.
In mathematics, on the other hand, ‘structure’ (as in set-with-structure) in fact means the relations defined on the object. A structure on a set is a collection of nullary, unary, binary, ternary ... operations satisfying as axioms a variety of identities between composite operations. Thus a partially ordered set (which I shall introduce in Chapter 1) is a set equipped with a binary relation ≤ having certain specified properties. A group is a set equipped with a binary (the group multiplication), a nullary (the unit element), and a unary (the inverse) operation, which together satisfy certain identities. A topological space’s structure is a collection of its subsets (the open sets) with certain prescribed properties. And so forth. It is perhaps this ‘relations are structure’ concept in mathematics that inspired Nicolas Rashevsky on his foundation of relational biology. 0.28 Function We should also note the polysemy of the word ‘function’. The Latin functio means ‘performance’. An activity by which a thing fulfils a purpose, the common meaning of ‘function’, may be considered a performance. This is the word’s biological usage, although the teleologic ‘fulfils a purpose’ sense is regularly hidden. (I shall have much more to say on this later.) A mathematical function may be considered as a set of operations that are performed on each value that is put into it. Leibniz first used the term function in the mathematical context, and Euler first used the notation f ( x ) to represent a function, because the word begins with the letter f . Since in mathematics ‘function’ has a synonym in ‘mapping’, in this book I shall use mapping for the mathematical entity (cf. Definition 1.3), and leave function to its biological sense.
PART I Exordium
No one really understood music unless he was a scientist, her father had declared, and not just a scientist, either, oh, no, only the real ones, the theoreticians, whose language was mathematics. She had not understood mathematics until he had explained to her that it was the symbolic language of relationships. “And relationships,” he had told her, “contain the essential meaning of life.” — Pearl S. Buck (1972) The Goddess Abides Part I
Equivalence relation is a fundamental building block of epistemology. The first book of the Robert Rosen trilogy is Fundamentals of Measurement and Representation of Natural Systems [FM]. It may equally well be entitled ‘Epistemological Consequences of the Equivalence Relation’; therein one finds a detailed mathematical exposition on the equivalence relation and its linkage to similarity, the pre-eminent archetypal concept in all of science, in both the universes of formal systems and natural systems. Equivalence relation is also a fundamental building block of mathematics. The concept of equivalence is ubiquitous. Many of the theorems in this book depend on the fact that the collection of equivalence relations on a set is a mathematical object known as a lattice. In this introductory Part I, I present a précis of the algebraic theory of lattices, with emphasis, of course, on the topics that will be of use to us later on. Some theorems will only be stated in this introduction without proofs. Their proofs may be found in books on lattice theory or universal algebra. The standard reference is Lattice Theory written by Garrett Birkhoff, a founder of the subject [Birkhoff 1967].
1 Praeludium: Ordered Sets
Mappings 1.1 Definition (i) If X is a set, the power set 𝒫X of X is the family of all subsets of X . (ii) Given two sets X and Y , one denotes by X × Y the set of all ordered pairs of the form ( x, y ) where x ∈ X and y ∈ Y . The set X × Y is called the product (or cartesian product) of the sets X and Y . 1.2 Definition A relation is a set R of ordered pairs; i.e. R ⊂ X × Y , or equivalently R ∈ 𝒫( X × Y ) , for some sets X and Y .
The collection of all relations between two sets X and Y is thus the power set 𝒫( X × Y ) . 1.3 Definition A mapping is a set f of ordered pairs with the property that, if ( x, y ) ∈ f and ( x, z ) ∈ f , then y = z . Note that the requirement for a subset of X × Y to qualify as a mapping is in fact quite a stringent one: most, i.e., common, members of 𝒫( X × Y ) do not have this property. A mapping is therefore a special, i.e., nongeneric, kind of relation. But genericity is not synonymous with
importance: general relations and mappings are both fundamental mathematical objects of study. 1.4 Definition Let f be a mapping. One defines two sets, the domain of f and the range of f , respectively by
(1)
dom ( f ) = { x : ( x, y ) ∈ f for some y }
and (2)
ran ( f ) = { y : ( x, y ) ∈ f for some x } .
Thus f is a subset of the product dom ( f ) × ran ( f ) . If ran ( f ) contains exactly one element, then f is called a constant mapping. Various words, such as ‘function’, ‘transformation’, and ‘operator’, are used as synonyms for ‘mapping’. The mathematical convention is that these different synonyms are used to denote mappings having special types of sets as domains or ranges. Because these alternate names also have interpretations in biological terms, to avoid semantic equivocation, in this book I shall — unless convention dictates otherwise — use mapping (and often map) for the mathematical entity. 1.5 Remark The traditional conception of a mapping is that of something that assigns to each element of a given set a definite element of another given set. I shall now reconcile this with the formal definition given above. Let f be a mapping and let X and Y be sets. If dom ( f ) = X and
ran ( f ) ⊂ Y , whence f is a subset of X × Y , one says that f is a mapping of X into Y , denoted by (3)
f : X →Y ,
and occasionally (mostly for typographical reasons) by
(4)
X \xrightarrow{\,f\,} Y .
The collection of all mappings of X into Y is a subset of the power set 𝒫( X × Y ) ; this subset is denoted Y^X (see A.3, in the Appendix). To each element x ∈ X , by Definition 1.3, there corresponds a unique element y ∈ Y such that ( x, y ) ∈ f . Traditionally, y is called the value of the mapping f at the element x , and the relation between x and y is denoted by y = f ( x ) instead of ( x, y ) ∈ f . Note that the y = f ( x )
notation is only logically consistent when f is a mapping — for a general relation f , it is possible that y ≠ z yet both ( x, y ) ∈ f and ( x, z ) ∈ f ; if one were to write y = f ( x ) and z = f ( x ) in such a situation, then one would be led, by Euclid’s Common Notion 1 (cf. 0.0 and also the Euclidean property 1.10(e) below), to the conclusion that y = z : a direct contradiction to y ≠ z . With the y = f ( x ) notation, one has (5)
ran ( f ) = { y : y = f ( x ) for some x } ,
which may be further abbreviated to (6)
ran ( f ) = { f ( x ) : x ∈ dom ( f ) } .
One then also has (7)
f = {( x, f ( x ) ) : x ∈ X } .
From this last representation, we observe that when X ⊂ ℝ and Y ⊂ ℝ (where ℝ is the set of real numbers), my formal definition of a mapping coincides with that of the ‘graph of f ’ in elementary mathematics.
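The formal definition of a mapping as a set of ordered pairs translates almost verbatim into code. The following is a minimal sketch, assuming finite sets of Python tuples; the helper names (`is_mapping`, `dom`, `ran`, `value`, `is_constant`) and the sample relations are illustrative and not part of the text.

```python
from itertools import product

def is_mapping(pairs) -> bool:
    """Definition 1.3: (x, y) and (x, z) both in f force y == z."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

def dom(f):
    """Equation (1): the set of first components."""
    return {x for x, _ in f}

def ran(f):
    """Equation (2): the set of second components."""
    return {y for _, y in f}

def value(f, x):
    """y = f(x); raises ValueError unless exactly one y is paired with x."""
    (y,) = {y for a, y in f if a == x}
    return y

def is_constant(f) -> bool:
    """A constant mapping has exactly one element in its range."""
    return is_mapping(f) and len(ran(f)) == 1

# A mapping: the squaring map on {-2, ..., 2}, given by its set of pairs.
square = {(x, x * x) for x in range(-2, 3)}

# A relation that is not a mapping: 1 is paired with two different elements.
not_a_mapping = {(1, "a"), (1, "b"), (2, "a")}

# Representation (7): f = { (x, f(x)) : x in X }.
rebuilt = {(x, value(square, x)) for x in dom(square)}
```

For the relation `not_a_mapping`, calling `value(not_a_mapping, 1)` raises an error, which mirrors the remark that the y = f ( x ) notation is only consistent for mappings.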
Sometimes it is useful to trace the path of an element as it is mapped. If a ∈ X , b ∈ Y , and b = f ( a ) , one uses the ‘maps to’ arrow (note the short vertical line segment at the tail of the arrow) and writes
(8)
f : a ↦ b .
Note that this ‘element-chasing’ notation of a mapping in no way implies that there is somehow only one element a in the domain dom ( f ) = {a} = X , mapped by f to the only element b in the range ran ( f ) = {b} ⊂ Y . Here a is a symbolic representation of variable elements in the domain of f , while b denotes its corresponding image b = f ( a ) defined by f . The notation f : a ↦ b is as general as f : X → Y , the former emphasizing the elements and the latter the sets. One occasionally also uses the ‘maps to’ arrow to define the mapping f itself: (9)
x ↦ f ( x ) .
1.6 Definition Let f be a mapping of X into Y . If E ⊂ X , f ( E ) , the image of E under f , is defined to be the set of all elements f ( x ) ∈ Y for x ∈ E ; i.e., (10)
f ( E ) = { f ( x ) : x ∈ E} ⊂ Y .
In this notation, f ( X ) is the range of f . 1.7 Definition
If f is a mapping of X into Y , the set Y is called the
codomain of f , denoted by cod ( f ) .
The range f ( X ) = ran ( f ) is a subset of the codomain Y = cod ( f ) , but they need not be equal. When they are, i.e. when f ( X ) = Y , one says
that f is a mapping of X onto Y , and that f : X → Y is surjective (or is a surjection). Note that every mapping maps onto its range. 1.8 Definition
If E ⊂ Y , f⁻¹ ( E ) denotes the set of all x ∈ X such that f ( x ) ∈ E :
(11)
f⁻¹ ( E ) = { x : f ( x ) ∈ E } ⊂ X ,
and is called the inverse image of E under f . If y ∈ Y , f⁻¹ ({ y} ) is abbreviated to f⁻¹ ( y ) , so it is the set of all x ∈ X such that f ( x ) = y . Note that f⁻¹ ( y ) may be the empty set, or may contain more than one element. If, for each y ∈ Y , f⁻¹ ( y ) consists of at most one element of X , then f is said to be a one-to-one (1-1 , also injective) mapping of X into Y . Other commonly used names are ‘ f : X → Y is an injection’, and ‘ f : X → Y is an embedding’. This may also be expressed as follows: f is a one-to-one mapping of X into Y provided f ( x₁ ) ≠ f ( x₂ ) whenever x₁ , x₂ ∈ X and x₁ ≠ x₂ .
If A ⊂ X , then the mapping i : A → X defined by i ( x ) = x for all x ∈ A is a one-to-one mapping of A into X , called the inclusion map (of A in X ).
If f : X → Y is both one-to-one and onto, i.e. both injective and surjective, then f is called bijective (or is a bijection), and one says that it establishes a one-to-one correspondence between the sets X and Y . While the domain and range of f are specified by f as in Definition 1.4, the codomain is not yet uniquely determined: all that is required so far is that it contains the range of f as a subset. One needs to invoke a category theory axiom (see Appendix: Axiom A.1(c1)), and assign to each mapping f a unique set Y = cod ( f ) as its codomain.
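Images, inverse images, and the injective/surjective/bijective distinctions of Definitions 1.6 to 1.8 can be sketched for finite sets as follows. The assumptions here are that a mapping f : X → Y is stored as a Python `dict` whose keys form the domain X, and that the codomain Y is passed explicitly (in keeping with the remark that the pairs alone do not determine it); all names and sample maps are hypothetical.

```python
def image(f: dict, E):
    """Equation (10): f(E) = { f(x) : x in E }, for E a subset of the domain."""
    return {f[x] for x in E}

def inverse_image(f: dict, E):
    """Equation (11): the set of all x in the domain with f(x) in E."""
    return {x for x in f if f[x] in E}

def is_injective(f: dict, codomain) -> bool:
    """Each y in the codomain has at most one element in its inverse image."""
    return all(len(inverse_image(f, {y})) <= 1 for y in codomain)

def is_surjective(f: dict, codomain) -> bool:
    """The range f(X) equals the codomain Y."""
    return image(f, f.keys()) == set(codomain)

def is_bijective(f: dict, codomain) -> bool:
    return is_injective(f, codomain) and is_surjective(f, codomain)

def inclusion(A) -> dict:
    """The inclusion map i(x) = x of a subset A."""
    return {x: x for x in A}

# Hypothetical examples on X = {-2, ..., 2} with codomain Y = {0, ..., 4}.
X = {-2, -1, 0, 1, 2}
Y = {0, 1, 2, 3, 4}
square = {x: x * x for x in X}   # neither injective nor onto Y
shift = {x: x + 2 for x in X}    # a bijection of X onto Y
```

The squaring map illustrates both remarks in the text: the inverse image of 3 is empty, the inverse image of 4 has two elements, and the map is onto its range {0, 1, 4} although not onto Y.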
Equivalence Relations Recall Definition 1.2 that a relation R is a set of ordered pairs, R ⊂ X × Y for some sets X and Y . Just as for mappings, however, there are traditional terminologies for relations that were well established before this formal definition. I shall henceforth use these traditional notations, and also concentrate on relations with X = Y . 1.9 Definition If X is a set and R ⊂ X × X , one says that R is a relation on X , and writes x R y instead of ( x, y ) ∈ R .
1.10 Definition A relation R on a set X is said to be
(r) reflexive if for all x ∈ X , x R x ;
(s) symmetric if for all x, y ∈ X , x R y implies y R x ;
(a) antisymmetric if for all x, y ∈ X , x R y and y R x imply x = y ;
(t) transitive if for all x, y, z ∈ X , x R y and y R z imply x R z ;
(e) Euclidean if for all x, y, z ∈ X , x R z and y R z imply x R y .
1.11 Definition A relation R on a set X is called an equivalence relation if it is reflexive, symmetric, and transitive; i.e. if it satisfies properties (r), (s), and (t) in Definition 1.10 above.
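For a relation on a finite set, each of the properties (r), (s), (a), (t), and (e) can be tested directly from the set of ordered pairs. The following sketch assumes a relation is stored as a Python set of tuples; the two sample relations (same parity, and ≤ on a few integers) are hypothetical illustrations.

```python
def is_reflexive(R, X) -> bool:
    """(r): x R x for all x in X."""
    return all((x, x) in R for x in X)

def is_symmetric(R) -> bool:
    """(s): x R y implies y R x."""
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R) -> bool:
    """(a): x R y and y R x imply x = y."""
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R) -> bool:
    """(t): x R y and y R z imply x R z."""
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

def is_euclidean(R) -> bool:
    """(e): x R z and y R z imply x R y."""
    return all((x, y) in R for (x, z) in R for (y, w) in R if z == w)

def is_equivalence(R, X) -> bool:
    """Definition 1.11: reflexive, symmetric, and transitive."""
    return is_reflexive(R, X) and is_symmetric(R) and is_transitive(R)

# Hypothetical sample relations on X = {0, ..., 5}.
X = range(6)
same_parity = {(x, y) for x in X for y in X if (x - y) % 2 == 0}
less_or_equal = {(x, y) for x in X for y in X if x <= y}
```

Here `same_parity` satisfies (r), (s), and (t), while `less_or_equal` satisfies (r), (a), and (t) but fails (s) and (e).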
The equality (or identity) relation I on X , defined by x I y if x = y , is an equivalence relation. As a subset of X × X , I is the diagonal I = {( x, x ) : x ∈ X } . Because of reflexivity (r), any equivalence relation R ⊂ X × X must have ( x, x ) ∈ R for all x ∈ X ; thus I ⊂ R . The universal
relation U on X , defined by xU y if x, y ∈ X , is also an equivalence relation. Since U = X × X , for any equivalence relation R on X one has R ⊂U . The equality relation I is Euclidean. Indeed, when R = I , the Euclidean property (e) is precisely Euclid’s Common Notion 1 (cf. 0.0):
“Things equal to the same thing are also equal to one another.” One readily proves the following 1.12 Theorem If a relation is Euclidean and reflexive, it is also symmetric and transitive (hence it is an equivalence relation). 1.13 Definition Let R be an equivalence relation on X . For each x ∈ X the set
(12)
[ x ]R = { y ∈ X : x R y }
is called the equivalence class of x determined by R , or the R-equivalence class of x . The collection of all equivalence classes determined by R is called the quotient set of X under R , and is denoted by X / R ; i.e. (13)
X / R = {[ x ]R : x ∈ X } .
1.14 All or Nothing
By reflexivity (r), one has x ∈ [ x ]R for all x ∈ X . The equivalence classes determined by R are therefore all nonempty, and (14)
X = ⋃_{x ∈ X} [ x ]R .
Also, the members of X / R are pairwise disjoint. For suppose x, y ∈ X and [ x ]R ∩ [ y ]R ≠ ∅ . Choose z ∈ [ x ]R ∩ [ y ]R , whence x R z and y R z . By symmetry (s) and transitivity (t) one has y R x . Now if w ∈ [ x ]R , x R w , so together with y R x just derived, transitivity (t) gives y R w , whence w ∈ [ y ]R . This shows that [ x ]R ⊂ [ y ]R . By symmetry (of the argument) [ y ]R ⊂ [ x ]R , and consequently [ x ]R = [ y ]R .
Stated otherwise, every element of X belongs to exactly one of the equivalence classes determined by R .
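The 'all or nothing' property can be observed on a small example. The sketch below assumes an equivalence relation given as a Python predicate `related(x, y)` on a finite set; the sample set of words and the relation 'has the same length as' are hypothetical.

```python
def equivalence_class(x, X, related):
    """Equation (12): [x]_R = { y in X : x R y }."""
    return frozenset(y for y in X if related(x, y))

def quotient_set(X, related):
    """Equation (13): X/R = { [x]_R : x in X }."""
    return {equivalence_class(x, X, related) for x in X}

def covers(blocks, X) -> bool:
    """Equation (14): the union of the classes is X."""
    return set().union(*blocks) == set(X)

def pairwise_disjoint(blocks) -> bool:
    """Distinct classes have no element in common."""
    blocks = list(blocks)
    return all(a.isdisjoint(b)
               for i, a in enumerate(blocks)
               for b in blocks[i + 1:])

# Hypothetical example: words related when they have the same length.
X = {"a", "d", "bb", "cc", "eee"}
same_length = lambda u, v: len(u) == len(v)

classes = quotient_set(X, same_length)
# Every element lies in exactly one equivalence class.
exactly_one = all(sum(x in block for block in classes) == 1 for x in X)
```

Although `equivalence_class` is evaluated once per element, the quotient set contains only three distinct classes, because related elements yield the same class.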
1.15 Congruence An equivalence relation is sometimes also called a congruence. There is, however, a canonical usage of the latter which provides a nontrivial example of the former.
Let m > 0 be a fixed integer. One defines a relation · ≡ · ( mod m ) on the set ℤ of integers as follows: for a , b ∈ ℤ , one writes (15)
a ≡ b ( mod m ) ,
and says that a is congruent modulo m to b , when the difference a − b is a multiple of m ; the fixed positive integer m is called the modulus of the relation. It easily follows that a ≡ b ( mod m ) if and only if a and b leave the same remainder upon division by m . The relation congruence modulo m defines an equivalence relation on the set of integers, and it has m equivalence classes, [ 0] , [1] ,..., [ m − 1] . One readily verifies the arithmetic rules: if a ≡ b ( mod m ) and c ≡ d ( mod m ) , then a + c ≡ b + d ( mod m ) and ac ≡ bd ( mod m ) . One
may also prove that if ab ≡ ac ( mod m ) and a is relatively prime to m , then b ≡ c ( mod m ) . The quotient set ℤ/≡ of ℤ under · ≡ · ( mod m ) is often denoted ℤ_m . All the rules for addition and multiplication hold for ℤ_m , so it is in fact a ring. It is called the ring of integers modulo m , and plays an important role in algebra and number theory. We shall encounter ℤ_m again later on in this monograph.
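The arithmetic rules for congruences can be spot-checked by brute force over a small window of integers. This is a sketch, not a proof: the modulus 6 and the window of values are arbitrary choices made for illustration, and the function names are hypothetical.

```python
from itertools import product
from math import gcd

def congruent(a: int, b: int, m: int) -> bool:
    """a ≡ b (mod m): the difference a - b is a multiple of m."""
    return (a - b) % m == 0

def rules_hold(m: int, values) -> bool:
    """If a ≡ b and c ≡ d, then a + c ≡ b + d and ac ≡ bd (all mod m)."""
    for a, b, c, d in product(values, repeat=4):
        if congruent(a, b, m) and congruent(c, d, m):
            if not (congruent(a + c, b + d, m) and congruent(a * c, b * d, m)):
                return False
    return True

def cancellation_holds(m: int, values) -> bool:
    """If ab ≡ ac and a is relatively prime to m, then b ≡ c (all mod m)."""
    for a, b, c in product(values, repeat=3):
        if gcd(a, m) == 1 and congruent(a * b, a * c, m) and not congruent(b, c, m):
            return False
    return True

M = 6
WINDOW = range(-6, 7)

# Congruence modulo m is the same as leaving the same remainder on division by m.
same_remainder = all(congruent(a, b, M) == (a % M == b % M)
                     for a in WINDOW for b in WINDOW)

# The m equivalence classes [0], [1], ..., [m - 1], labelled by their remainders.
remainders = {a % M for a in WINDOW}
```

The coprimality condition in the cancellation rule cannot be dropped: with m = 6 and a = 2, one has 2·0 ≡ 2·3 (mod 6) although 0 and 3 are not congruent modulo 6.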
1.16 Definition A pairwise disjoint family of sets the union of which is X is called a partition of X . The sets in the disjoint family are the blocks of the partition.
Here is a special type of partition that we shall need later:
1.17 Definition A partition is called singular if all its blocks consist of single elements except for one block.
We have just seen in 1.14 that the family of equivalence classes determined by an equivalence relation on X is a partition of X . The blocks of the partition that corresponds to the equality relation I are all singleton sets (sets containing exactly one member). In the partition that corresponds to the universal relation U , there is only one block, the set X itself. Conversely, given a partition of X one may define a relation R by defining two elements of X to be related under R if they belong to the same block. It is trivial to verify that R is an equivalence relation on X , and that its equivalence classes are precisely the blocks of the partition. One thus has: 1.18 Lemma There is a one-to-one correspondence between the equivalence relations on a set X and the partitions of X .
The set of all ordered pairs ( x, y ) with x, y ∈ A is, of course, simply the product set A × A . So an alternate concise formulation of the above lemma is 1.19 Lemma A relation R on a set X is an equivalence relation if and only if there is a partition 𝒜 of X such that R = ⋃_{A ∈ 𝒜} A × A .
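The correspondence of Lemmata 1.18 and 1.19 can be traced in both directions on a finite example. The sketch below assumes a partition stored as a set of `frozenset` blocks and a relation stored as a set of ordered pairs; the function names and the sample partition are hypothetical.

```python
def relation_from_partition(blocks):
    """Lemma 1.19: R is the union of the products A x A over the blocks A."""
    return {(x, y) for A in blocks for x in A for y in A}

def partition_from_relation(R, X):
    """The equivalence classes of R, i.e. the blocks of the corresponding partition."""
    return {frozenset(y for y in X if (x, y) in R) for x in X}

# Hypothetical partition of X = {1, ..., 6} into three blocks.
X = {1, 2, 3, 4, 5, 6}
blocks = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})}

R = relation_from_partition(blocks)
round_trip = partition_from_relation(R, X)

# The two extreme cases: the equality relation I and the universal relation U.
I = {(x, x) for x in X}
U = {(x, y) for x in X for y in X}
singletons = partition_from_relation(I, X)
one_block = partition_from_relation(U, X)
```

Going from the partition to the relation and back returns the original blocks; the equality relation yields singleton blocks and the universal relation yields the single block X.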
Partially Ordered Sets
1.20 Definition A relation R on a set X is called a partial order if it is reflexive, antisymmetric, and transitive; i.e. if it satisfies properties (r), (a), and (t) in Definition 1.10 above.
One usually uses the notation ≤ instead of R when it is a partial order.
1.21 Definition A partially ordered set (often abbreviated as poset) is an ordered pair ⟨X, ≤⟩ in which X is a set and ≤ is a partial order on X.
When the partial order ≤ is clear from the context, one frequently omits it from the notation for simplicity, and denotes ⟨X, ≤⟩ by the underlying set X. Each subset of a poset is itself a poset under the same partial order.
1.22 Definition Let ⟨X, ≤⟩ be a poset and x, y ∈ X. If x ≤ y, one says ‘x is less than or equal to y’, or ‘y is greater than or equal to x’, and writes ‘y ≥ x’. One also writes ‘x < y’ (and ‘y > x’) for ‘x ≤ y and x ≠ y’, and reads it as ‘x is less than y’ (and ‘y is greater than x’).
The simplest example of a partial order is the equality relation I. The equality relation, as we saw above, is also an equivalence relation, and it is in fact the only relation which is both an equivalence relation and a partial order. The relation ≤ on the set of all integers is an example of a partial order. As another example, the inclusion relation ⊂ is a partial order on the power set ℙA of a set A. Morphisms in the category of posets are order-preserving mappings:
1.23 Definition A mapping f from a poset ⟨X, ≤_X⟩ to a poset ⟨Y, ≤_Y⟩ is
called order-preserving, or isotone, if
(16) x ≤_X y in X implies f(x) ≤_Y f(y) in Y.
(The somewhat awkward symbols ≤_X and ≤_Y are meant to indicate the partial orders on X and Y may be different. With this clearly understood, I shall now simplify the notation for the next part of the definition.) Two
posets X and Y are isomorphic, written X ≅ Y, if there exists a bijective map f : X → Y such that both f and its inverse f⁻¹ : Y → X are order-preserving; i.e.
(17) x ≤ y in X iff f(x) ≤ f(y) in Y.
Any poset may be represented as a collection of sets ordered by inclusion:
1.24 Theorem Let ⟨X, ≤⟩ be a poset. Define f : X → ℙX, for x ∈ X, by
(18) f(x) = {y ∈ X : y ≤ x}.
Then X is isomorphic to the range of f ordered by set inclusion ⊂; i.e. ⟨X, ≤⟩ ≅ ⟨f(X), ⊂⟩.
1.25 Definition The converse of a relation R is the relation R̆ such that x R̆ y if y R x.
Thus the converse of the relation ‘is included in’ ⊂ is the relation ‘includes’ ⊃ ; the converse of ‘less than or equal to’ ≤ is ‘greater than or equal to’ ≥ . A simple inspection of properties (r), (a), and (t) leads to the 1.26 Duality Principle The converse of a partial order is itself a partial order.
The dual of a poset X = ⟨X, ≤⟩ is the poset X̆ = ⟨X, ≥⟩ defined by the converse partial order. Definitions and theorems about posets are dual in pairs (whenever they are not self-dual). If any theorem is true for all posets, then so is its dual.
1.27 Definition Let ≤ be a partial order on a set X and let A ⊂ X. The subset A is bounded above if there exists x ∈ X such that a ≤ x for all
a ∈ A ; such x ∈ X is called an upper bound for A . An upper bound x for A is called the supremum for A if x ≤ y for all upper bounds y for A . A subset A can have at most one supremum (hence the article the), and if it exists it is denoted by sup A .
The terms bounded below, lower bound, and infimum (notation inf A ) are defined analogously. A set that is bounded above and bounded below is called bounded. 1.28 The Greatest and the Least Every element of X is an upper bound for the empty set ∅ ⊂ X . So if ∅ has a supremum in X , then sup∅ is an element such that sup ∅ ≤ y for all y ∈ X ; such an element (if it exists) is called the least element of X , and this is the element sup ∅ = inf X .
Dually, inf ∅, if it exists, is an element such that y ≤ inf ∅ for all y ∈ X; such an element is called the greatest element of X, and this is the element inf ∅ = sup X. Let A be a set and consider the poset ⟨ℙA, ⊂⟩. Each subset 𝒮 ⊂ ℙA (i.e. each family of subsets of A) is bounded above (trivially by the set A) and bounded below (trivially by the empty set ∅), hence bounded. A subset B of A (i.e. B ∈ ℙA) is an upper bound for 𝒮 if and only if ⋃_{S∈𝒮} S ⊂ B, and a lower bound for 𝒮 if and only if B ⊂ ⋂_{S∈𝒮} S. Thus ⋃_{S∈𝒮} S = sup 𝒮 and ⋂_{S∈𝒮} S = inf 𝒮. The least element of ⟨ℙA, ⊂⟩ is ∅, and the greatest element of ⟨ℙA, ⊂⟩ is A. The greatest and least elements of a poset X are only considered to ‘exist’ if they are members of X. It is important to note, however, that an upper bound, a lower bound, the supremum, and the infimum (if any exists) for a subset A ⊂ X are only required to be elements of the original poset X. They may or may not be in the subset A itself. In the example in the
previous paragraph, ⋃_{S∈𝒮} S = sup 𝒮 and ⋂_{S∈𝒮} S = inf 𝒮 are members of ℙA, but they are not necessarily members of 𝒮. Even in cases where there is no greatest or least element, there may be elements in a poset that have no other elements greater than or less than they are:
1.29 Definition Let ≤ be a partial order on a set X and let A ⊂ X. An element x ∈ A is maximal if whenever y ∈ A and x ≤ y one has x = y. Stated otherwise, x ∈ A is maximal if x < y for no y ∈ A. Dually, an element x ∈ A is minimal if whenever y ∈ A and y ≤ x one has x = y, or equivalently if y < x for no y ∈ A.
Note that maximal and minimal elements of A are required to be members of A. The greatest element (if it exists) must be maximal, and the least element (if it exists) must be minimal. But the converse is not true. As an example, let A be the three-element set {1, 2, 3}. Its power set is
ℙA = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, A},
partially ordered by ⊂. The element A is the greatest element (hence a maximal element) of this poset ⟨ℙA, ⊂⟩, and the element ∅ is the least element (hence a minimal element). Now consider 𝒮 ⊂ ℙA with 𝒮 = {{1}, {1, 2}, {1, 3}}. ⟨𝒮, ⊂⟩ is a poset in its own right. One has
sup 𝒮 = ⋃_{S∈𝒮} S = {1} ∪ {1, 2} ∪ {1, 3} = {1, 2, 3} = A
and
inf 𝒮 = ⋂_{S∈𝒮} S = {1} ∩ {1, 2} ∩ {1, 3} = {1}.
So both sup 𝒮 and inf 𝒮 exist in ℙA, but sup 𝒮 ∉ 𝒮 while inf 𝒮 ∈ 𝒮. ⟨𝒮, ⊂⟩ has no greatest element, but both {1, 2} and {1, 3} are maximal elements. ⟨𝒮, ⊂⟩ has {1} as its least element (which is therefore also a minimal element).
1.30 Theorem Any (nonempty) finite subset of a poset has minimal and maximal elements.
PROOF
Let ⟨X, ≤⟩ be a poset and A = {x_1, x_2, ..., x_n} be a finite subset of X. Define m_1 = x_1, and for k = 2, ..., n, define
(19) m_k = x_k if x_k < m_{k−1}, and m_k = m_{k−1} otherwise.
Then m_n will be minimal. Similarly, A has a maximal element.
□
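The scan used in the proof of Theorem 1.30 translates directly into code. The sketch below is an illustration only: the function names are hypothetical, the strict order is passed in as a predicate, and the family {{1}, {1, 2}, {1, 3}} ordered by inclusion is used as the test case.

```python
def minimal_element(elements, less_than):
    """Scan as in (19): replace m by x_k whenever x_k < m."""
    it = iter(elements)
    m = next(it)               # m_1 = x_1 (the subset must be nonempty)
    for x in it:
        if less_than(x, m):
            m = x
    return m

def maximal_element(elements, less_than):
    """Dual scan: replace m by x_k whenever m < x_k."""
    it = iter(elements)
    m = next(it)
    for x in it:
        if less_than(m, x):
            m = x
    return m

def proper_subset(a, b):
    """Strict inclusion, the '<' of the poset of sets ordered by inclusion."""
    return a < b

family = [frozenset({1, 2}), frozenset({1}), frozenset({1, 3})]
lowest = minimal_element(family, proper_subset)
highest = maximal_element(family, proper_subset)

# All maximal members: those with no strictly larger member in the family.
all_maximal = {s for s in family if not any(proper_subset(s, t) for t in family)}

# The supremum is the union, which need not belong to the family itself.
supremum = frozenset().union(*family)
```

The scan returns one maximal element, and which one depends on the order in which the elements are listed. A finite poset may have several maximal elements, none of which is the greatest.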
1.31 Poset as Category A partially ordered set ⟨X, ≤⟩ may itself be considered as a category, in which the objects are elements of X, and a hom-set X(x, y) for x, y ∈ X has either a single element or is empty, according to whether x ≤ y or not. Product in this category corresponds to infimum, and coproduct corresponds to supremum. This is a single-poset-as-a-category, and is completely different from ‘the category of all posets and isotone mappings’ considered in 1.23 above. Note the analogy to a single-set-as-a-category (i.e. a discrete category) versus the category Set of all sets and mappings (see A.2 and A.3).
Totally Ordered Sets
Two elements x and y in a poset are called comparable if either x ≤ y or y ≤ x. A poset may contain incomparable elements. (Consider ⟨ℙA, ⊂⟩ in the previous example in 1.29, and one has neither {1, 2} ⊂ {1, 3} nor {1, 3} ⊂ {1, 2}.) A partial order for which this cannot happen is called a total order. Explicitly:
1.32 Definition A poset ⟨X, ≤⟩ is said to be totally (or linearly) ordered if [in addition to properties (r), (a), and (t)] all elements are comparable, i.e.
(l) for all x, y ∈ X, either x ≤ y or y ≤ x.
The term ‘totally ordered set’ is sometimes abbreviated into toset. The relation ≤ on the set of all integers is an example of a total order.
1.33 Definition A chain in a poset ⟨X, ≤⟩ is a subset A ⊂ X such that ≤ totally orders A.
Every subset of a chain is a chain. In chains, the notions of least and minimal element are identical. This is because if x ∈ A is minimal, whence y < x for no y ∈ A, then by property (l) of Definition 1.32 one must have x ≤ y for all y ∈ A, whence x is the least element of A. Dually, in chains, the notions of greatest and maximal element are identical. So in view of Theorem 1.30, one has
1.34 Theorem Every finite chain has a least element and a greatest element.
The set {1, 2,..., n} of the first n natural numbers (i.e. positive integers) in its natural order ≤ forms a finite chain, called the ordinal number n . When totally unordered (so that no two different elements are comparable), it forms another poset called the cardinal number n .
1.35 Theorem Every finite chain of n elements is isomorphic to the ordinal number n .
There is a result about partial orders that has far-reaching consequences in several branches of mathematics. It is 1.36 Zorn’s Lemma Let X be a nonempty partially ordered set with the property that each nonempty chain in it has an upper bound. Then X has a maximal element.
It is interesting to note that Zorn’s Lemma is equivalent to the
1.37 Axiom of Choice Given a nonempty family 𝒜 of nonempty sets, there is a mapping f with domain 𝒜 such that f(A) ∈ A for all A ∈ 𝒜.
The encyclopaedic treatise on this topic is Rubin & Rubin [1963]. Both Zorn’s Lemma and the Axiom of Choice will make their appearance again later on.
2 Principium: The Lattice of Equivalence Relations
Lattices
2.1 Definition A lattice is a nonempty partially ordered set ⟨L, ≤⟩ in which each pair of elements x, y has a supremum and an infimum in L. They are denoted by x ∨ y = sup {x, y} and x ∧ y = inf {x, y}, and are often called, respectively, the join and meet of x and y.
For any elements x, y of a lattice,
(1) x ≤ y ⇔ x ∨ y = y ⇔ x ∧ y = x.
If the lattice L (as a poset) has a least element 0, then 0 ∧ x = 0 and 0 ∨ x = x for all x ∈ L. If it has a greatest element 1, then x ∧ 1 = x and x ∨ 1 = 1 for all x ∈ L. It is trivially seen from condition (1), known as consistency, that every totally ordered set is a lattice, in which x ∨ y is simply the larger and x ∧ y is the smaller of x and y. The poset ⟨ℙA, ⊂⟩, of the power set of a set A with the partial order of inclusion, is also a lattice; for any X, Y ∈ ℙA, X ∨ Y = X ∪ Y and X ∧ Y = X ∩ Y.
The families of open sets and closed sets, respectively, of a topological space are both lattices. In these lattices the partial orders, joins, and meets are the same as those for the power set lattice. The collection of all subgroups of a group G is a lattice. The partial order ≤ is set-inclusion restricted to subgroups, i.e. the relation ‘is a subgroup of’. For subgroups H and K of G , H ∧ K = H ∩ K , but H ∨ K is the smallest subgroup of G containing H and K (which is generally not their set-theoretic union). A lattice may also be regarded as a set with two binary operators, ∨ and ∧ , i.e. the triplet L, ∨, ∧ . Again, for simplicity of notation, we often abbreviate to the underlying set and denote the lattice as L . The two operators satisfy a number of laws that are similar to the laws of addition and multiplication, and these laws may be used to give an alternative definition of lattices. 2.2 Theorem Let L be a lattice, then for any x, y, z ∈ L ,
(a) [associative] x ∨ (y ∨ z) = (x ∨ y) ∨ z, x ∧ (y ∧ z) = (x ∧ y) ∧ z;
(b) [commutative] x ∨ y = y ∨ x, x ∧ y = y ∧ x;
(c) [absorptive] x ∧ (x ∨ y) = x, x ∨ (x ∧ y) = x;
(d) [idempotent] x ∨ x = x, x ∧ x = x.
Conversely, if L is a set with two binary operators ∨ and ∧ satisfying (a)–(c), then (d) also holds, and a partial order may be defined on L by the rule
(e) x ≤ y if and only if x ∨ y = y
[whence if and only if x ∧ y = x]. Relative to this ordering, L is a lattice such that x ∨ y = sup {x, y} and x ∧ y = inf {x, y}.
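The laws of Theorem 2.2 can be verified exhaustively on a small lattice. The example below is an assumption introduced purely for illustration and does not appear in the text: the divisors of 60, with join taken as least common multiple and meet as greatest common divisor, so that rule (e) should recover the divisibility order.

```python
from math import gcd
from itertools import product

def lcm(a: int, b: int) -> int:
    """Least common multiple, used here as the join."""
    return a * b // gcd(a, b)

L = [d for d in range(1, 61) if 60 % d == 0]   # divisors of 60
join, meet = lcm, gcd

associative = all(
    join(x, join(y, z)) == join(join(x, y), z)
    and meet(x, meet(y, z)) == meet(meet(x, y), z)
    for x, y, z in product(L, repeat=3)
)
commutative = all(
    join(x, y) == join(y, x) and meet(x, y) == meet(y, x)
    for x, y in product(L, repeat=2)
)
absorptive = all(
    meet(x, join(x, y)) == x and join(x, meet(x, y)) == x
    for x, y in product(L, repeat=2)
)
idempotent = all(join(x, x) == x and meet(x, x) == x for x in L)

# Rule (e): x <= y iff x v y = y, which here means "x divides y",
# and this agrees with the dual condition x ^ y = x.
order_recovered = all(
    (join(x, y) == y) == (y % x == 0) == (meet(x, y) == x)
    for x, y in product(L, repeat=2)
)
```

In this lattice the least element is 1 and the greatest is 60, matching the identities 0 ∨ x = x and x ∧ 1 = x when 0 and 1 are read as the least and greatest elements.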
We have already seen the duality principle for partial orders in 1.26. As one may deduce from Theorem 2.2, this duality is expressed in lattices by interchanging ∨ and ∧ (hence interchanging ≤ and ≥), resulting in its dual. Any theorem about lattices remains true if the join and meet are interchanged.
2.3 Duality Principle The dual of a lattice is itself a lattice.
2.4 The Greatest and the Least The greatest element of a poset (if it exists) must be maximal, and the least element (if it exists) must be minimal. I illustrated with an example in 1.29 that the converse is not necessarily true. But for a lattice, one has
Theorem In a lattice, a maximal element is the greatest element (and hence unique); dually, a minimal element is the least element (and hence unique).
PROOF Let x_1 and x_2 be two maximal elements. Their join x_1 ∨ x_2 is such that x_1 ≤ x_1 ∨ x_2 and x_2 ≤ x_1 ∨ x_2 (by definition of ∨ as the supremum). Because x_1 is maximal, it cannot be less than another element, so x_1 ≤ x_1 ∨ x_2 ⇒ x_1 = x_1 ∨ x_2; similarly, because x_2 is maximal, x_2 ≤ x_1 ∨ x_2 ⇒ x_2 = x_1 ∨ x_2. Therefore x_1 = x_2. Thus there can only be one maximal element.
Now let x be the only maximal element, and y be an arbitrary element of the lattice. One must have x ≤ y ∨ x by definition of ∨; but x is maximal, so x ≤ y ∨ x ⇒ x = y ∨ x, whence y ≤ x (by property 2.2(e) above). Thus x is the greatest element. □
Note that this theorem does not say a lattice necessarily has the greatest and the least elements, only that if a maximal (respectively, minimal) element exists, then it is the greatest (respectively, least).
2.5 Inequalities Let L be a lattice, and let x, y, z ∈ L. Then
(a) [isotone] if x ≤ z, then x ∧ y ≤ y ∧ z and x ∨ y ≤ y ∨ z;
(b) [distributive] x ∧ (y ∨ z) ≥ (x ∧ y) ∨ (x ∧ z) and x ∨ (y ∧ z) ≤ (x ∨ y) ∧ (x ∨ z);
(c) [modular] if x ≤ z, then x ∨ (y ∧ z) ≤ (x ∨ y) ∧ z.
While any subset of a poset is again a poset under the same partial order, a subset of a lattice need not be a lattice (because x ∨ y or x ∧ y may not be members of the subset even if x and y are). So one needs to make the explicit
2.6 Definition A sublattice of a lattice L is a subset M which is closed under the operators ∨ and ∧ of L; i.e. M ⊂ L is a sublattice if x ∨ y ∈ M and x ∧ y ∈ M for all x, y ∈ M.
The empty set is a sublattice, and so is any singleton subset. Let a, b ∈ L and a ≤ b. The (closed) interval is the subset [a, b] ⊂ L consisting of all elements x ∈ L such that a ≤ x ≤ b. An interval [a, b] need not be a chain (Definition 1.33), but it is always a sublattice of L, and it has the least element a and the greatest element b. Let A be a set and let ∗ be a fixed element of A, called the base point. A pointed subset of A is a subset X of A such that ∗ ∈ X (cf. Example A.6(ii) in the Appendix). The pointed power set ℙ_∗A is the subset of the power set ℙA containing all pointed subsets of A; i.e. ℙ_∗A = {X ⊂ A : ∗ ∈ X}. ℙ_∗A is a sublattice of ℙA.
Note that it is possible for a subset of a lattice L to be a lattice without being a sublattice of L. For example, as we saw above, the collection Σ(G) of all subgroups of a group G is a lattice. When G is considered a set (forgetting the group structure), ℙG is a lattice with ∨ = ∪ and ∧ = ∩. Σ(G) is a subset of ℙG, but it is not a sublattice of ℙG.
While Σ(G) and ℙG have the same partial order ≤ = ⊂ and the same meet operator ∧ = ∩, the join operator ∨ of Σ(G) is not inherited from that of ℙG. For another example, let X be a set, whence the power set ℙ(X × X) is a lattice with ∨ = ∪ and ∧ = ∩. A relation on X is a subset of X × X, so the collection ℚX of all equivalence relations on X is a subset of ℙ(X × X). As we shall soon see, ℚX is again a lattice, but not usually a sublattice of ℙ(X × X). In particular, the union of two equivalence relations need not be an equivalence relation. ℚX is another example of a lattice with a join operator different from the standard set-theoretic union. A morphism in the category of lattices is defined in the obvious structure-preserving fashion:
2.7 Lattice Homomorphism A mapping f from a lattice L to a lattice L′ is called a (lattice) homomorphism if for all x, y ∈ L
(2) f(x ∨ y) = f(x) ∨ f(y) and f(x ∧ y) = f(x) ∧ f(y).
A lattice homomorphism preserves the ordering: x ≤ y ⇒ f(x) ≤ f(y). But not every order-preserving mapping (i.e. poset homomorphism) between lattices is a lattice homomorphism.
2.8 Lemma If f is a homomorphism from a lattice L into a lattice L′, then the image f(L) is a sublattice of L′.
If a lattice homomorphism f : L → L′ is one-to-one and onto, then f is called an isomorphism of L onto L′ , and the two lattices are said to be isomorphic. If f : L → L′ is one-to-one, then f is an embedding, L and
f(L) are isomorphic, and by Lemma 2.8 L has a representation (namely f(L)) as a sublattice of L′.
2.9 Heyting Algebra Any nonempty finite subset A = {a_1, ..., a_n} of a lattice L has, by induction, both a supremum and an infimum, denoted respectively by
(3) sup A = ⋁_{i=1}^{n} a_i = a_1 ∨ ⋯ ∨ a_n and inf A = ⋀_{i=1}^{n} a_i = a_1 ∧ ⋯ ∧ a_n.
Note that brackets may be omitted by associativity, and that the order of the factors is immaterial by commutativity. For the empty subset, however, the existence of sup ∅ = inf L (least element) and inf ∅ = sup L (greatest element) has to be postulated separately (cf. the discussion on finite categorical products in the appendix, A.28 and Lemma A.29). Now let ⟨L, ≤⟩ = ⟨L, ∨, ∧⟩ have all finite suprema and infima (including those of the empty family, hence L has least element 0 and greatest element 1). Recall (1.31) that a poset ⟨L, ≤⟩ may itself be considered as a category, in which the objects are elements of L, and a hom-set L(x, y) for x, y ∈ L has either a single element or is empty, according to whether x ≤ y (alternatively, x ∨ y = y, x ∧ y = x) or not. The category L is cartesian closed (cf. A.53). The exponential (cf. A.52) z^y ∈ L defined in the bijection L(x ∧ y, z) ≅ L(x, z^y) is uniquely determined as the largest element for which the meet with y is less than or equal to z, i.e., z^y = sup {x ∈ L : x ∧ y ≤ z}, whence x ∧ y ≤ z iff x ≤ z^y. A lattice with all finite suprema and infima is called a Heyting Algebra, and is a model of the propositional calculus.
2.10 Completeness An infinite subset of a lattice, however, need not have a supremum or an infimum. (Consider, for example, the lattice of integers totally ordered by ≤.) A lattice in which every subset has a supremum and an infimum is said to be complete. This includes the empty
subset ∅ . So in particular, a complete lattice L has a least element (inf L = sup ∅ ) and a greatest element (sup L = inf ∅ ). It is an interesting fact that for a lattice to be complete, it suffices for every subset to have a supremum (or for every subset to have an infimum): the ‘other half’ of the requirement is automatically entailed. 2.11 Theorem Let L be a poset in which every subset has a supremum (or in which every subset has an infimum). Then L is a complete lattice.
One must note carefully that in the hypothesis of this theorem, every subset includes the empty subset ∅, otherwise the conclusion does not follow. For example, in the lattice ℕ of natural numbers (positive integers) totally ordered by ≤, every nonempty subset has an infimum (the smallest number in the subset). But ℕ is not complete, since an infinite subset does not have a supremum. This is not a contradiction to the theorem, because the infimum of the empty set, inf ∅ = sup ℕ, does not exist.
2.12 Examples The power set ℙA of any set A is a complete lattice with ∨ = ∪ and ∧ = ∩; so is a pointed power set ℙ_∗A.
For a less trivial example, consider the collections 𝒪 and 𝒞 respectively of open sets and closed sets of a topological space X. We have already encountered them above, as examples of lattices. With ∨ = ∪ and ∧ = ∩, 𝒪 and 𝒞 are sublattices of ℙX. But 𝒪 and 𝒞 are not complete with these operators, because the intersection of an infinite collection of open sets need not be open, and the union of an infinite collection of closed sets need not be closed. To make 𝒪 and 𝒞 complete, ∨ and ∧ have to be defined slightly differently. Let A be an arbitrary index set. Let {F_a : a ∈ A} ⊂ 𝒪. Then the operations
(4) ⋁_{a∈A} F_a = ⋃_{a∈A} F_a and ⋀_{a∈A} F_a = (⋂_{a∈A} F_a)°
(where S° denotes the interior of the set S) are the join and meet that make 𝒪 into a complete lattice. Similarly, for {G_a : a ∈ A} ⊂ 𝒞, the operations
(5) ⋁_{a∈A} G_a = (⋃_{a∈A} G_a)⁻ and ⋀_{a∈A} G_a = ⋂_{a∈A} G_a
(where S⁻ denotes the closure of the set S) are the join and meet that make 𝒞 into a complete lattice. [Note that when the index set A is finite, the ( )° and ( )⁻ of the definitions are redundant, and these new ∨ and ∧ become identical to set-theoretic union and intersection respectively.] This is another example that shows it is possible for a subset of a lattice L to be a lattice without being a sublattice of L. With the operators defined as in (4) and (5), both 𝒪 and 𝒞 are themselves lattices, and they are both subsets of ℙX. But neither is a sublattice of ℙX, because in each case, the partial order, join, and meet are not all identical to the ⊂, ∪, and ∩ of ℙX.
The Lattice ℚX
Let X be a set and let ℚX denote the collection of all equivalence relations on X. A relation on X is a subset of X × X, so ℚX is a subset of ℙ(X × X). An equivalence relation as a subset of X × X has a very special structure (Lemma 1.19), so an arbitrary subset of an equivalence relation is not necessarily itself an equivalence relation. The partial order ⊂ of set inclusion, when restricted to ℚX, implies more. When two equivalence relations R_1, R_2 ∈ ℚX are such that R_1 ⊂ R_2, it means that in fact every R_1-equivalence class is a subset of some R_2-equivalence class. This also means, indeed, that the blocks in the partition defined by R_1 are obtained by further partitioning the blocks in the partition defined by R_2. Stated otherwise, the blocks of R_2 are obtained from those of R_1 by taking set-theoretic unions of them. I shall henceforth use the notation R_1 ≤ R_2
when R_1, R_2 ∈ ℚX are such that R_1 ⊂ R_2. An alternate description of R_1 ≤ R_2 is
2.13 Definition Let R_1 and R_2 be equivalence relations on a set X. One says that R_1 refines R_2 (and that R_1 is a refinement of R_2) if for all x, y ∈ X,
(6) x R_1 y ⇒ x R_2 y.
When R_1 refines R_2, i.e. when R_1 ≤ R_2, one says that R_1 is finer than R_2, and that R_2 is coarser than R_1. One may verify that the relation of refinement on ℚX is a partial order. Thus
2.14 Theorem ⟨ℚX, ≤⟩ is a partially ordered set.
The equality relation I is the least element, and the universal relation U is the greatest element in the poset ⟨ℚX, ≤⟩. Stated otherwise, the equality relation I, which partitions X into a collection of singleton sets, is the finest equivalence relation on X; the universal relation U, which has one single partition block that is X itself, is the coarsest equivalence relation on X. Contrast this with the fact that ∅ is the least element in ⟨ℙ(X × X), ⊂⟩, while the largest element is the same U = X × X.
2.15 Definition Let R_1 and R_2 be equivalence relations on a set X. Their meet R_1 ∧ R_2 is defined as
(7) x (R_1 ∧ R_2) y iff x R_1 y and x R_2 y.
It is trivial to verify that R = R1 ∧ R2 is an equivalence relation on X , and that R refines both R1 and R2 , i.e.
(8) R ≤ R_1 and R ≤ R_2,
and is the coarsest member of ℚX with this property. In other words,
(9) R = R_1 ∧ R_2 = inf {R_1, R_2}.
One also has
2.16 Lemma The equivalence classes of R_1 ∧ R_2 are obtained by forming the set-theoretic intersection of each R_1-equivalence class with each R_2-equivalence class. As subsets of X × X, R_1 ∧ R_2 = R_1 ∩ R_2.
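The meet of two equivalence relations is easy to compute when each relation is stored as its partition. The following Python sketch is illustrative only: the function names and the two partitions of {1, ..., 6} are assumptions, and empty intersections of blocks are discarded since they are not equivalence classes.

```python
def meet(p1, p2):
    """Blocks of the meet: nonempty intersections of a p1-block with a p2-block."""
    return {b1 & b2 for b1 in p1 for b2 in p2 if b1 & b2}

def as_pairs(partition):
    """The relation as a subset of X x X: all pairs lying in a common block."""
    return {(x, y) for block in partition for x in block for y in block}

P1 = {frozenset({1, 2, 3}), frozenset({4, 5, 6})}
P2 = {frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})}
M = meet(P1, P2)
```

Viewed as sets of ordered pairs, the meet coincides with the plain intersection of the two relations, so no closure step is needed here, in contrast with the join.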
Since the collection of equivalence classes forms a partition, the R_1-class and R_2-class that intersect to form an (R_1 ∧ R_2)-class are uniquely determined. The definition of meet may easily be extended to an arbitrary index set A and a collection of equivalence relations {R_a : a ∈ A}:
(10) x (⋀_{a∈A} R_a) y iff x R_a y for all a ∈ A.
And one has
(11) ⋀_{a∈A} R_a = inf {R_a : a ∈ A}.
The set-theoretic union of two equivalence relations does not necessarily have the requisite special structure as a subset of X × X (Lemma 1.19) to make it an equivalence relation. The join has to be defined thus: 2.17 Definition Let R1 and R2 be equivalence relations on a set X .
Their join R1 ∨ R2 is defined as follows: x ( R1 ∨ R2 ) y iff there is a finite sequence of elements x1 ,..., xn ∈ X such that
(12) x R_1 x_1, x_1 R_2 x_2, x_2 R_1 x_3, …, x_n R_1 y.
One readily verifies that R = R_1 ∨ R_2 is an equivalence relation on X, and that R is refined by both R_1 and R_2, i.e.
(13) R_1 ≤ R and R_2 ≤ R,
and is the finest member of ℚX with this property. In other words,
(14) R = R_1 ∨ R_2 = sup {R_1, R_2}.
One concludes from (13) that, as subsets of X × X, R_1 ⊂ R_1 ∨ R_2 and R_2 ⊂ R_1 ∨ R_2, whence R_1 ∪ R_2 ⊂ R_1 ∨ R_2. The set (and relation) R_1 ∨ R_2 is called the transitive closure of the union R_1 ∪ R_2.
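The join can be computed by merging blocks that share elements until no overlaps remain, which yields the transitive closure of the union. The sketch below uses a disjoint-set (union-find) structure, a technique substituted here for convenience rather than taken from the text. The function names and the sample partitions are assumptions.

```python
def join(p1, p2):
    """Blocks of the join: merge any blocks of p1 and p2 that share an element."""
    parent = {}

    def find(x):
        # Follow parent links to the representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    all_blocks = list(p1) + list(p2)
    for block in all_blocks:
        for x in block:
            parent.setdefault(x, x)
    for block in all_blocks:
        members = list(block)
        for y in members[1:]:
            union(members[0], y)

    classes = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(c) for c in classes.values()}

def as_pairs(partition):
    """The relation as a subset of X x X: all pairs lying in a common block."""
    return {(x, y) for block in partition for x in block for y in block}

P1 = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5}), frozenset({6})}
P2 = {frozenset({1}), frozenset({2, 3}), frozenset({4}), frozenset({5}), frozenset({6})}
J = join(P1, P2)
plain_union = as_pairs(P1) | as_pairs(P2)
```

In this example 1 is related to 2 under the first relation and 2 to 3 under the second, but the pair (1, 3) is missing from the plain union; it appears only in the join, which is why the union alone fails to be transitive.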
For an arbitrary index set A and a collection of equivalence relations {R_a : a ∈ A}, the definition of the join ⋁_{a∈A} R_a (i.e. the transitive closure of the union ⋃_{a∈A} R_a), that corresponds to the binary join in (12), is: x (⋁_{a∈A} R_a) y iff there exist a finite sequence of elements x_1, ..., x_n ∈ X and indices a_1, ..., a_n ∈ A such that
(15) x R_{a_1} x_1, x_1 R_{a_2} x_2, x_2 R_{a_3} x_3, …, x_n R_{a_n} y.
With the meet and join as defined in 2.15 and 2.17, ℚX is a lattice. In fact,
2.18 Theorem ℚX is a complete lattice.
Because of the one-to-one correspondence between equivalence relations and partitions (Lemma 1.18), any sublattice of the lattice of equivalence relations is also called a partition lattice.
Mappings and Equivalence Relations
2.19 Definition Given a mapping f : X → Y, one calls two elements x_1, x_2 ∈ X f-related when f(x_1) = f(x_2), and denotes this relation by R_f; i.e.
(16) x_1 R_f x_2 iff f(x_1) = f(x_2).
Then R_f is an equivalence relation on X, whence the equivalence classes determined by R_f form a partition of X. f is a constant mapping on each R_f-equivalence class. R_f is called the equivalence relation on X induced by f, and f is called a generator of this equivalence relation. The equivalence relation induced on a set X by a constant mapping is the universal relation U (with only one single partition block which is all of X). The equivalence relation induced on a set X by a one-to-one mapping is the equality relation I (with each partition block a singleton set). Any mapping with domain X induces an equivalence relation on X. It is a very important fact that all equivalence relations on X are of this type:
2.20 Theorem If R is an equivalence relation on X, then there is a mapping f with domain X such that R = R_f.
PROOF Consider the mapping from X to the quotient set of X under R, π : X → X/R, that maps an element of X to its equivalence class; i.e.
(17) π(x) = [x]_R for x ∈ X.
This mapping π is called the natural mapping (projection) of X onto X/R, and has the obvious property that R_π = R. □
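The induced relation R_f and the natural mapping π can be illustrated on a finite set. In this Python sketch the function names, the set {0, ..., 9}, and the mapping x ↦ x mod 3 are assumptions chosen for the example. The partition stands in for the equivalence relation.

```python
def induced_partition(f, X):
    """Blocks of R_f: elements of X grouped by their value under f."""
    fibres = {}
    for x in X:
        fibres.setdefault(f(x), set()).add(x)
    return {frozenset(block) for block in fibres.values()}

def natural_map(partition):
    """The natural mapping pi: x -> [x], the block containing x."""
    lookup = {x: block for block in partition for x in block}
    return lookup.__getitem__

X = range(10)

def f(x):
    return x % 3

R_f = induced_partition(f, X)       # the relation induced by f
pi = natural_map(R_f)               # natural mapping onto the quotient set
R_pi = induced_partition(pi, X)     # the relation induced by pi itself

universal = induced_partition(lambda x: 0, X)   # constant mapping
equality = induced_partition(lambda x: x, X)    # one-to-one mapping
```

The natural mapping induces exactly the relation it was built from, which is the content of the proof: every equivalence relation has at least one generator.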
2.21 Lemma Let f : X → Y be a mapping and R_f be the equivalence relation on X induced by f. Then there is a one-to-one correspondence between the quotient set X/R_f and the range f(X).
PROOF Let x ∈ X, then f(x) ∈ f(X). Identify the equivalence class [x]_{R_f} with f(x). This is the one-to-one correspondence between X/R_f and f(X). □
If one denotes the one-to-one correspondence by f̄ : X/R_f → f(X) and the natural mapping of X onto X/R_f by π_f, then the diagram
(18) [commutative diagram of X, X/R_f, and f(X), with the mappings π_f, f̄, and f]
commutes; i.e. for all x ∈ X, f̄(π_f(x)) = f(x). The result also shows that the range of any mapping with domain X may always be identified with a quotient set of X. Via f̄, any (algebraic or topological) structure on Y may be pulled to X/R_f, and very often, via π_f back to X. I shall have more to say on this imputation of structure from codomain to domain later (cf. injection in 7.37, and metaphorically in terms of the modelling relation in Chapter 4).
2.22 Homomorphism Theorems Let R be an equivalence relation on X. Any mapping with the quotient set X/R as domain may be lifted, via
the natural mapping π : X → X/R, to give a mapping with X as domain. Explicitly, for g : X/R → Y, define ĝ : X → Y by ĝ(x) = g(π(x)) for x ∈ X:
(19) [commutative diagram of X, X/R, and Y, with the mappings π, g, and ĝ]
The results of Lemma 2.21 and Remark 2.22 are very general, and appear in many areas in mathematics. For example, the Fundamental Theorem of Group Homomorphisms is
Theorem Let φ be a homomorphism of a group G into a group H with kernel K. Then G/K ≅ φ(G).
Note that in connection with this theorem there are many important results in group theory. For example, that the kernel K is a normal subgroup of G, that φ(G) is a subgroup of H, that if φ is onto, then G/K ≅ H, etc. In linear algebra, one has the corresponding
Theorem If T is a linear transformation from vector space U into vector space V with kernel W, then T(U) is isomorphic to U/W.
Conversely, if U is a vector space and W is a subspace of U, then there is a homomorphism of U onto U/W. The reader is invited to discover homomorphism theorems in other categories.
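A small finite case makes the group version of the theorem tangible. The example is an assumption introduced for illustration, not taken from the text: G is the additive group of integers modulo 12, and φ(x) = 3x mod 12 is a homomorphism of G into itself. The variable names are hypothetical.

```python
n = 12
G = range(n)

def phi(x):
    """Illustrative homomorphism of the additive group Z_12 into itself."""
    return (3 * x) % n

# phi respects addition modulo n.
is_hom = all(phi((x + y) % n) == (phi(x) + phi(y)) % n for x in G for y in G)

kernel = frozenset(x for x in G if phi(x) == 0)
cosets = {frozenset((x + k) % n for k in kernel) for x in G}   # elements of G/K
image = {phi(x) for x in G}

# Induced map on cosets: well defined when phi is constant on each coset.
phi_bar = {}
well_defined = True
for coset in cosets:
    values = {phi(x) for x in coset}
    well_defined = well_defined and len(values) == 1
    phi_bar[coset] = next(iter(values))

# One-to-one onto the image: distinct cosets map to distinct values.
bijective = well_defined and set(phi_bar.values()) == image \
    and len(set(phi_bar.values())) == len(cosets)
```

The cosets of the kernel are exactly the equivalence classes of the relation induced by φ, so this is Lemma 2.21 with the group structure carried along.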
2.23 Definition Let X be a set. An observable of X is a mapping with domain X. The collection of all observables of X, i.e. the union of Y^X for all Set-objects Y, may be denoted •^X.
2.24 Equivalent Observables We just saw that a mapping f with domain X induces an equivalence relation R_f ∈ ℚX. Dually, equivalence relations on a set X induce an (algebraic) equivalence relation ∼ on the set of all mappings with domain X (i.e. on the set •^X of observables of X), as follows. If f and g are two mappings with domain X, define f ∼ g if R_f = R_g, i.e. if and only if
(20) f(x) = f(y) ⇔ g(x) = g(y) for all x, y ∈ X.
This means the equivalence relations induced by f and g partition their common domain the same way. Stated otherwise, f ∼ g iff f and g are generators of the same equivalence relation in ℚX. By definition, an observable cannot distinguish among elements lying in the same equivalence class of its induced equivalence relation. Two algebraically equivalent mappings ‘convey the same information’ about the partitioning of the elements of X; one cannot distinguish the elements of X further by employing equivalent observables. Succinctly, one has
(21) •^X/∼ ≅ ℚX.
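Algebraic equivalence of observables amounts to comparing the partitions they induce. The sketch below is illustrative: the function names and the three sample observables on {0, ..., 11} are assumptions, with one observable deliberately given a non-numerical codomain.

```python
def induced_partition(f, X):
    """Blocks of R_f: elements of X grouped by their value under f."""
    fibres = {}
    for x in X:
        fibres.setdefault(f(x), set()).add(x)
    return {frozenset(block) for block in fibres.values()}

def equivalent(f, g, X):
    """f ~ g: both observables induce the same equivalence relation on X."""
    return induced_partition(f, X) == induced_partition(g, X)

X = range(12)

def f(x):
    return x % 3            # numerical codomain

def g(x):
    return "abc"[x % 3]     # codomain of letters, disjoint from that of f

def h(x):
    return x % 2            # partitions X differently
```

Here f and g are equivalent although their values never coincide, while f and h are not equivalent even though both are numerical.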
Note that the algebraic equivalence f ∼ g only means that X/R_f ≅ X/R_g; in other words, there is a one-to-one correspondence between f(X) and g(X), but there may be no relation whatsoever between the values f(x) and g(x) for x ∈ X. Indeed, the two mappings f and g may even have codomains that do not intersect. In particular, if their codomains are equipped with metrics, the fact that f(x) may be ‘close’ to f(y) in cod(f) says nothing about the closeness between g(x) and g(y) in cod(g). So in this sense, even equivalent mappings give
‘alternate information’ about the elements of X, when the codomains are taken into account.
2.25 Qualitative versus Quantitative An observable of X, as I define it, may have any set Y as codomain. The difference between ‘qualitative’ and ‘quantitative’ thus becomes one of degree and not of kind. Indeed, an observable measures a ‘quantity’ when Y is a set of numbers [e.g. when Y is a subset of ℕ, ℤ, ℚ, ℝ, or even ℂ (respectively the sets of natural numbers, integers, rational numbers, real numbers, and complex numbers), without for now straying into the territories of quaternions and Cayley numbers], and measures a ‘quality’ when Y is not a ‘numerical set’.
Seen in this light, quantitative is in fact a meagre subset of qualitative. The traditional view of reductionism is (among other things) that every perceptual quality can and must be expressible in numerical terms. Consider Ernest Rutherford’s infamous declaration “Qualitative is nothing but poor quantitative.” For us, the features of natural systems in general, and of biological systems in particular, that are of interest and importance are precisely those that are unquantifiable. Even though the codomains of qualitative observables can only be described ostensively, the observables themselves do admit rigorous formal definitions. Rosen has discussed much of this in earlier work. See, for example, AS, NC, LI, EL.
Linkage
Let R1 and R2 be equivalence relations on a set X. Recall (Definition 2.13) the partial order of refinement in the lattice of equivalence relations on X: R1 ≤ R2 (R1 refines R2) if
(22) x R1 y ⇒ x R2 y.
2.26 Lemma If R1 is a refinement of R2, then there is a unique mapping ρ : X/R1 → X/R2 that makes the diagram
(23) [Diagram: the natural projections X → X/R1 and X → X/R2, together with ρ : X/R1 → X/R2]
commute.
PROOF Define ρ([x]_{R1}) = [x]_{R2}. □
The mapping ρ induces an equivalence relation R_ρ on X/R1. By Lemma 2.21, one sees that (X/R1)/R_ρ ≅ X/R2. In other words, when R1 refines R2, one may regard X/R2 as a quotient set of X/R1. The refinement relation between two equivalence relations may be defined through their generators:
2.27 Lemma Let f and g be two mappings with domain X. R_f ≤ R_g in the lattice of equivalence relations on X if and only if
(24) f(x) = f(y) ⇒ g(x) = g(y) for all x, y ∈ X.
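Condition (24) can be checked by brute force on a finite set. The following is a minimal sketch, not part of the original text; the set X and the observables `f` and `g` (residues mod 6 and mod 3) are hypothetical illustrations.

```python
from itertools import product

def refines(f, g, X):
    """Return True if R_f <= R_g on X, i.e. f(x) == f(y) implies g(x) == g(y)."""
    return all(g(x) == g(y)
               for x, y in product(X, repeat=2)
               if f(x) == f(y))

# Hypothetical finite example: observables on X = {0, ..., 11}.
X = range(12)
f = lambda x: x % 6   # residue mod 6
g = lambda x: x % 3   # residue mod 3

print(refines(f, g, X))  # True: equal residues mod 6 force equal residues mod 3
print(refines(g, f, X))  # False: 0 and 3 agree mod 3 but differ mod 6
```

Here g is a function of f (namely g(x) = f(x) mod 3), which is the situation of a refinement R_f ≤ R_g.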
2.28 Definition If R ≤ Rg in X , then g is called an invariant of R .
An invariant of an equivalence relation R is an invariant of every refinement of R . Rg is the largest equivalence relation of which g is
invariant. An invariant of R is constant on the equivalence classes of R. An invariant g of R will in general take the same value on more than one R-class; it takes on distinct values on distinct R-classes iff R = R_g, i.e., iff g is a generator of R. If R_f ≤ R_g, then by Lemma 2.26 there is a unique mapping h : X/R_f → X/R_g that makes the diagram
(25) [Diagram: the natural projections X → X/R_f and X → X/R_g, together with h : X/R_f → X/R_g]
commute. This says the value of g at every x ∈ X is completely determined by the value of f through the relation
(26) g(x) = h(f(x)).
Thus, in the obvious sense of ‘is a function of’, one has
2.29 Lemma If R_f ≤ R_g in the lattice of equivalence relations on X, then g is a function of f.
2.30 Definition
Let f and g be observables of X. Let π_f : X → X/R_f and π_g : X → X/R_g be the natural quotient maps. For the R_f-equivalence class [x]_{R_f} ∈ X/R_f, consider the set of R_g-equivalence classes that intersect [x]_{R_f}; i.e. consider the set
(27) π_g(π_f⁻¹([x]_{R_f})) = {[y]_{R_g} ∈ X/R_g : f(x) = f(y)} = {[y]_{R_g} ∈ X/R_g : [y]_{R_g} ∩ [x]_{R_f} ≠ ∅}.
Note that [x]_{R_g} ∈ π_g(π_f⁻¹([x]_{R_f})), so the set (27) is necessarily nonempty, containing at least one R_g-equivalence class. One says
(a) g is totally linked to f at [x]_{R_f} if the set (27) consists of a single R_g-class;
(b) g is partially linked to f at [x]_{R_f} if the set (27) consists of more than one R_g-class, but is not all of X/R_g;
(c) g is unlinked to f at [x]_{R_f} if the set (27) is all of X/R_g.
Further, one says that g is totally linked to f if g is totally linked to f at each [x]_{R_f} ∈ X/R_f, and that g is (totally) unlinked to f if g is unlinked to f at each [x]_{R_f} ∈ X/R_f.
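The three cases of Definition 2.30 can be illustrated on finite sets. The sketch below is an illustration only: the helper names `classes` and `linkage` and the example observables are hypothetical, and R_g-classes are identified with the values of g. When X/R_g has only one class, cases (a) and (c) coincide and the sketch reports "total".

```python
def classes(h, X):
    """Group the elements of X into the classes of R_h, keyed by the value of h."""
    out = {}
    for x in X:
        out.setdefault(h(x), set()).add(x)
    return out

def linkage(f, g, X):
    """For each R_f-class (keyed by its f-value), classify the linkage of g to f."""
    n_g_classes = len(classes(g, X))
    result = {}
    for value, f_class in classes(f, X).items():
        # The set (27): the R_g-classes that intersect this R_f-class.
        met = {g(x) for x in f_class}
        if len(met) == 1:
            result[value] = "total"
        elif len(met) < n_g_classes:
            result[value] = "partial"
        else:
            result[value] = "unlinked"
    return result

X = range(12)
mod6 = lambda x: x % 6
mod3 = lambda x: x % 3
print(linkage(mod6, mod3, X))   # mod3 is totally linked to mod6 at every class
print(linkage(mod3, mod6, X))   # mod6 is only partially linked to mod3

# Coordinates of a product set are unlinked to each other.
Y = [(a, b) for a in range(2) for b in range(3)]
first = lambda p: p[0]
second = lambda p: p[1]
print(linkage(first, second, Y))
```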
It is immediate from the definition that R_f ≤ R_g has another characterization:
2.31 Lemma g is totally linked to f if and only if R_f refines R_g.
And therefore
2.32 Corollary f and g are totally linked to each other iff R_f = R_g, i.e. f ∼ g.
It is also immediate from the definition that
2.33 Lemma g is totally unlinked to f if and only if every R_f-class intersects every R_g-class, and vice versa.
The essence of a description of X by an observable f lies in the set of equivalence classes arising from that description, i.e. by the induced equivalence relation R_f on X. When there is an alternate description of X by another observable g, intuitively one expects to learn something more. The meet of two equivalence relations (Definition 2.15) when defined through their generators becomes
2.34 Lemma Let f and g be two mappings with domain X. The meet R_f ∧ R_g in the lattice of equivalence relations on X is
(28) x (R_f ∧ R_g) y iff f(x) = f(y) and g(x) = g(y).
One often abbreviates and denotes this meet as R_fg = R_f ∧ R_g. By Theorem 2.20, there exists an observable h of X such that R_h = R_fg. Lemma 2.16 says that the equivalence classes of R_fg are obtained by forming the set-theoretic intersection of each R_f-equivalence class with each R_g-equivalence class. Some of them do not intersect, of course, and indeed, Definition 2.30 above classifies the possible intersections. In general (when g is partially linked or unlinked to f), alternate descriptions do give additional information. But in the particular case when g is totally linked to f, i.e. R_f ≤ R_g, one has
(29) R_fg = R_f ∧ R_g = inf{R_f, R_g} = R_f,
so here the additional observable g does not distinguish the elements of X any more than f already has.
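A small finite illustration of (28) and (29), offered as a sketch under assumptions: the paired mapping x ↦ (f(x), g(x)) is used here as one convenient generator of R_f ∧ R_g (it satisfies (28) by construction), and the observables below are hypothetical.

```python
def partition(h, X):
    """Return the partition of X induced by h, as a set of frozensets."""
    blocks = {}
    for x in X:
        blocks.setdefault(h(x), set()).add(x)
    return {frozenset(b) for b in blocks.values()}

X = range(12)
f = lambda x: x % 6
g = lambda x: x % 3          # totally linked to f
k = lambda x: x % 4          # not totally linked to f

fg = lambda x: (f(x), g(x))  # generates R_f ∧ R_g, by (28)
fk = lambda x: (f(x), k(x))  # generates R_f ∧ R_k

# g adds nothing to the description by f, as in (29).
print(partition(fg, X) == partition(f, X))
# k does add information: the meet has more, smaller blocks than R_f.
print(len(partition(f, X)), len(partition(fk, X)))
```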
Representation Theorems
Cayley’s Theorem in group theory tells us that permutation groups are special, because every group is isomorphic to a group of permutations on a set. It turns out that lattices of equivalence relations (i.e. partition lattices) are analogously special: every lattice is isomorphic to a lattice of equivalence relations.
2.35 Definition If R1 and R2 are relations on X, the relative product R1 ∘ R2 is the set of all ordered pairs (x, y) ∈ X × X for which there exists a z ∈ X with x R1 z and z R2 y.
If R1 and R2 are equivalence relations, then because x R1 x one has R2 ⊂ R1 ∘ R2; similarly R1 ⊂ R1 ∘ R2. Thus
(30) R1 ∘ R2 ⊂ R1 ∘ R2 ∘ R1 ⊂ R1 ∘ R2 ∘ R1 ∘ R2 ⊂ ⋯,
and it is easily seen that R1 ∨ R2 is the union of this chain. It is possible, however, that R1 ∨ R2 is actually equal to some term in the chain. For example, this is the case when X is a finite set. This, in fact, turns out to be true for any set X, in a precise sense as follows.
2.36 Definition Let L be a lattice. A representation of L is an ordered pair ⟨X, f⟩ where X is a set and f is an injective (i.e. one-to-one) lattice homomorphism from L into the lattice of equivalence relations on X. The representation ⟨X, f⟩ is
(i) of type 1 if for all x, y ∈ L, f(x) ∨ f(y) = f(x) ∘ f(y);
(ii) of type 2 if for all x, y ∈ L, f(x) ∨ f(y) = f(x) ∘ f(y) ∘ f(x);
(iii) of type 3 if for all x, y ∈ L, f(x) ∨ f(y) = f(x) ∘ f(y) ∘ f(x) ∘ f(y).
In 1946, P. Whitman proved that every lattice had a representation:
2.37 Whitman’s Theorem Every lattice is isomorphic to a sublattice of the lattice of equivalence relations on some set X.
Or more succinctly: Every lattice is isomorphic to a partition lattice.
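The relative product of Definition 2.35 and the chain (30) can be computed directly on a finite set. The following is a rough sketch; the helper names and the two example partitions are hypothetical, and relations are represented as sets of ordered pairs.

```python
def compose(R1, R2):
    """Relative product: all (x, y) with x R1 z and z R2 y for some z."""
    return {(x, y) for (x, z1) in R1 for (z2, y) in R2 if z1 == z2}

def from_partition(blocks):
    """Equivalence relation (set of pairs) whose classes are the given blocks."""
    return {(x, y) for b in blocks for x in b for y in b}

# Two hypothetical equivalence relations on {1, ..., 5}.
R1 = from_partition([{1, 2}, {3, 4}, {5}])
R2 = from_partition([{1}, {2, 3}, {4, 5}])

# Build the chain R1∘R2, R1∘R2∘R1, R1∘R2∘R1∘R2, ... until it stops growing.
factors = [R1, R2]
chain = [compose(R1, R2)]
i = 0
while True:
    nxt = compose(chain[-1], factors[i % 2])
    if nxt == chain[-1]:
        break
    chain.append(nxt)
    i += 1

# On a finite set the chain stabilizes, and its last term is the join R1 ∨ R2.
print(len(chain))
print(chain[-1] == from_partition([{1, 2, 3, 4, 5}]))
```

In this example the two partitions overlap in a zigzag, so the join is the universal relation, reached only after several alternating factors.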
In 1953, B. Jónsson found a simpler proof that gave a stronger result.
2.38 Theorem Every lattice has a type 3 representation.
The proofs involved transfinite recursion, and produced (nonconstructively) an infinite set X in the representation, even when the lattice L is a finite set. For several decades, one of the outstanding questions of lattice theory was whether every finite lattice can be embedded into the lattice of equivalence relations on a finite set. An affirmative answer was finally given in 1980 by P. Pudlák and J. Tůma:
2.39 Theorem Every finite lattice has a representation ⟨X, f⟩ with a finite set X.
A representation of a lattice L induces an embedding of L into the lattice of subgroups of a group. Given a representation ⟨X, f⟩ of L, let G be the group of all permutations on X that leave all but finitely many elements fixed, and let Σ(G) denote the lattice of subgroups of G. Define h : L → Σ(G) by
(31) h(a) = {φ ∈ G : x f(a) φ(x) for all x ∈ X}.
[Note that f(a) is an equivalence relation on X, so x f(a) φ(x) in (31) is the statement ‘x is f(a)-related to φ(x)’; i.e. (x, φ(x)) ∈ f(a).] One sees that h is an embedding (i.e. a one-to-one lattice homomorphism), thus
2.40 Theorem Every lattice can be embedded into the lattice of subgroups of a group.
3 Continuatio: Further Lattice Theory
Modularity
3.1 Definition Let L be a lattice, and let x, y, z ∈ L. The modular identity (which is self-dual) is
(m) if x ≤ z then x ∨ (y ∧ z) = (x ∨ y) ∧ z.
Not all lattices satisfy property (m); but if a lattice does, it is said to be modular. Recall the modular inequality 2.5(c) [ if x ≤ z , then x ∨ ( y ∧ z ) ≤ ( x ∨ y ) ∧ z ], which is satisfied by all lattices.
Let G be a group and let Σ(G) denote the lattice of subgroups of G. Let Ν(G) be the set of all normal subgroups of G. Ν(G) is a sublattice of Σ(G), inheriting the same ∨ and ∧. Recall (2.1) that for subgroups H and K of G, H ∧ K = H ∩ K, and H ∨ K is the smallest subgroup of G containing H and K. For H, K ∈ Ν(G), the join becomes the simpler H ∨ K = HK in Ν(G). Note that this is not a ‘different’ ∨, but a consequent property because H and K are normal subgroups. Ν(G) is a modular lattice, while Σ(G) in general is not.
3.2 Transposition Principle In any modular lattice, the intervals [b, a ∨ b] and [a ∧ b, a] are isomorphic, with the inverse pair of isomorphisms x ↦ x ∧ a and y ↦ y ∨ b.
Two intervals of a lattice are called transposes when they can be written as [b, a ∨ b ] and [ a ∧ b, a ] for suitable a and b , hence the name of Theorem 3.2. A natural question to ask after having Theorem 2.38, that every lattice has a type 3 representation, is whether all lattices have, in fact, representations of either type 1 or type 2 (cf. Definition 2.36). The answer is negative for general lattices, and is positive only for lattices with special properties, which serve as their characterizations. 3.3 Theorem A lattice has a type 2 representation if and only if it is modular.
The lattice Ν(G) of normal subgroups of a group G is modular, whence by Theorem 3.3 it has a type 2 representation. It, indeed, has a natural representation ⟨X, f⟩ with X = G (as the underlying set) and, for H ∈ Ν(G), f(H) = {(x, y) ∈ G × G : x y⁻¹ ∈ H}. (This representation is in fact type 1.) While type 2 representation is completely characterized by the single modular identity (m), the characterization of lattices with type 1 representations is considerably more complicated. The question of whether a set of properties exists that characterizes lattices with type 1 representations (i.e. such that a lattice has a type 1 representation if and only if it satisfies this set of properties) is an open question. It has been proven thus far that even if such a set exists, it must contain infinitely many properties.
Distributivity
3.4 Lemma In any lattice L, the following three conditions are equivalent:
(d1) for all x, y, z ∈ L, x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z);
(d2) for all x, y, z ∈ L, x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z);
(m′) for all x, y, z ∈ L, (x ∨ y) ∧ z ≤ x ∨ (y ∧ z).
3.5 Definition A lattice L is distributive if it satisfies one (hence all three) of the conditions (d1), (d2), and (m′).
Recall the distributive inequalities 2.5(b) [(d1) with ≥ in place of = and (d2) with ≤ in place of =], which are satisfied by all lattices; but the conditions (d1), (d2), and (m′) are not. Note also that the ‘for all x, y, z ∈ L’ quantifier is essential for their equivalence. In an arbitrary (non-distributive) lattice L, when one of (d1), (d2), and (m′) is true for three specific elements x, y, z ∈ L, it does not necessarily imply that the other two are true for the same three elements. Any chain (or totally ordered set) is distributive. The dual of a distributive lattice is distributive, and any sublattice of a distributive lattice is distributive. The power set lattice is distributive; it is in fact the canonical distributive lattice. Every distributive lattice has a representation in a power set lattice:
3.6 Theorem A distributive lattice can be embedded into the power set lattice of some set X, ordered by inclusion ⊂.
Combining condition ( m′ ) with the modular inequality 2.5(c), one has 3.7 Theorem Every distributive lattice is modular.
A distributive lattice also has the nice ‘cancellation law’:
3.8 Theorem For a lattice to be distributive, it is necessary and sufficient that
(1) if x ∧ z = y ∧ z and x ∨ z = y ∨ z, then x = y.
Complementarity
3.9 Definition Let L be a lattice with least element 0 and greatest element 1 [whence 0 = inf L = sup ∅ and 1 = sup L = inf ∅]. A complement of an element x ∈ L is an element y ∈ L such that x ∨ y = 1 and x ∧ y = 0.
The relation ‘is a complement of’ is clearly symmetric. Also, 0 and 1 are complements of each other. 3.10 Definition A lattice L (with 0 and 1) is said to be complemented if all its elements have complements.
Let L be a lattice, a, b ∈ L, and a ≤ b. The interval [a, b] ⊂ L is itself a lattice, with least element a and greatest element b. A complement of x ∈ [a, b] is thus a y ∈ [a, b] such that x ∧ y = a and x ∨ y = b, in which case one also says x and y are relative complements in the interval [a, b]. The interval [a, b] is complemented if all its elements have complements.
3.11 Definition A lattice L is said to be relatively complemented if all its intervals are complemented.
Note that if L is a lattice with least element 0 and greatest element 1, then [0, 1] = L is an interval of L. So a relatively complemented lattice with 0 and 1 is complemented. We also know that
3.12 Theorem Any complemented modular lattice is relatively complemented.
The power set lattice of X is complemented. The complement of S ⊂ X is the set-theoretic complement S^c = X ∼ S = {x ∈ X : x ∉ S}. The power set lattice of X is also relatively complemented. Let A, B ⊂ X with A ⊂ B, and let S ∈ [A, B], i.e. A ⊂ S ⊂ B. The complement of S in [A, B] is the set B ∼ (S ∼ A).
Recall (from the previous chapter) the collection of all equivalence relations on a set X. We saw (Theorem 2.18) that it is a complete lattice. It is more:
3.13 Theorem The lattice of equivalence relations on X is a complemented lattice.
We have already seen that the equality relation I is the least element, and the universal relation U is the greatest element in this lattice.
PROOF Let R be an equivalence relation on the set X, considered as its corresponding partition of X. We may assume that R is neither of the trivial partitions I and U. Let the blocks of R be denoted by A_i. I shall construct a partition R^c of X that is a complement of R. In each block A_i, choose a single representative a_i ∈ A_i, and denote the set {a_i}, which has exactly one element from each block, by S. Let R^c be the singular partition [Definition 1.17] consisting of the block S, with the rest of the blocks being the singleton sets {x} for each x ∈ X ∼ S. Then
(2) R ∨ R^c = U and R ∧ R^c = I.
Note that the existence of the set {a_i} with exactly one element a_i from each block A_i, whence the existence of a singular partition, is equivalent to the Axiom of Choice (1.37). □
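The construction in this proof can be carried out explicitly on a finite set, where no appeal to the Axiom of Choice is needed. The sketch below is illustrative only; the function names, the choice rules `min` and `max`, and the example partition are hypothetical.

```python
def meet(P, Q):
    """Meet of two partitions: all nonempty intersections of blocks."""
    return {frozenset(a & b) for a in P for b in Q if a & b}

def join(P, Q):
    """Join of two partitions: repeatedly merge blocks that overlap."""
    blocks = []
    for b in list(P) + list(Q):
        b = set(b)
        keep = []
        for c in blocks:
            if c & b:
                b |= c          # absorb every existing block that meets b
            else:
                keep.append(c)
        blocks = keep + [b]
    return {frozenset(b) for b in blocks}

def complement(P, choose=min):
    """One representative per block forms the block S; all else are singletons."""
    S = {choose(b) for b in P}
    rest = {x for b in P for x in b} - S
    return {frozenset(S)} | {frozenset({x}) for x in rest}

X = {1, 2, 3, 4, 5, 6}
R = [frozenset({1, 2, 3}), frozenset({4, 5}), frozenset({6})]
Rc = complement(R)                    # S = {1, 4, 6}
U = {frozenset(X)}
I = {frozenset({x}) for x in X}

print(join(R, Rc) == U, meet(R, Rc) == I)
# A different choice of representatives gives a different complement.
print(complement(R, choose=max) != Rc)
```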
Since the choice of a_i ∈ A_i is in general not unique, the above construction also shows that complements in the lattice of equivalence relations on X are not unique.
3.14 Theorem The lattice of equivalence relations on X is a relatively complemented lattice.
PROOF Let A, B, R be equivalence relations on the set X, with R ∈ [A, B]. Let the blocks of B be denoted by B_i. I shall construct a partition R^c of X that is a complement of R in the interval [A, B]. Since R ≤ B, each block B_i is the disjoint union of a collection of blocks R_ij of R; and since A ≤ R, each block R_ij of R is itself the disjoint union of a collection of blocks A_ijk of A. In each block R_ij of R, choose a single block from its component blocks A_ijk of A. Merge all these chosen A_ijk's, after ranging through all the blocks R_ij of R, into one single subset of X denoted by S. Let R^c be the partition consisting of the block S, with the rest of the blocks all those A_ijk's of A not chosen in building S. In other words, other than S, the blocks of R^c coincide with those of A. Then one may verify that
(3) R ∨ R^c = B and R ∧ R^c = A. □
Again, the choice of A_ijk ⊂ R_ij is in general not unique, so the above construction also shows that relative complements in the lattice of equivalence relations on X are not unique. An element may have more than one complement, or none at all. The cancellation law (Theorem 3.8) asserts that for a lattice to be
distributive, it is necessary and sufficient that relative complements be unique (if they exist). 3.15 Theorem In any interval of a distributive lattice, an element can have at most one complement. Conversely, a lattice with unique complements (whenever they exist) in every interval is distributive.
Complementarity may also be used to characterize modular lattices:
3.16 Theorem A lattice L is modular if and only if for each interval [a, b] ⊂ L, any two comparable elements of [a, b] that have a common complement are equal; i.e. iff for all [a, b] ⊂ L and x1, x2, y ∈ [a, b], if
(i) x1 ≤ x2 or x2 ≤ x1,
(ii) x1 ∧ y = x2 ∧ y = a, and
(iii) x1 ∨ y = x2 ∨ y = b,
then x1 = x2.
One may use these theorems to show that a particular lattice is not distributive by demonstrating an element with two distinct complements, or not modular by demonstrating an element with two distinct comparable relative complements. Thus, in view of the constructions in the proofs of Theorems 3.13 and 3.14, and Theorems 3.15 and 3.16, one may conclude that the full lattice of all equivalence relations on a set X is not in general distributive, and not in general modular. But because of the representation theorems, it evidently contains distributive and modular sublattices.
3.17 Definition A Boolean lattice is a complemented distributive lattice.
In a complemented lattice, every element by definition has at least one complement. In a distributive lattice with 0 and 1, every element by Theorem 3.8 has at most one complement. Thus
3.18 Theorem In any Boolean lattice L, each element x has one and only one complement x*. Further, for all x, y, z ∈ L
(i) x ∨ x* = 1, x ∧ x* = 0;
(ii) (x*)* = x;
(iii) (x ∨ y)* = x* ∧ y*, (x ∧ y)* = x* ∨ y*.
A Boolean lattice is self-dual. Its structure may be considered as an algebra with two binary operations ∨ , ∧ , and one unary operation * (satisfying the requisite properties), whence it is called a Boolean algebra. Note that a Boolean algebra is required to be closed under the operations ∨ , ∧ , and *. So a proper interval of a Boolean algebra may be a Boolean sublattice, but is not necessarily a Boolean subalgebra. A distributive lattice with 0 and 1, however, has a largest Boolean subalgebra formed by its complemented elements: 3.19 Theorem The collection of all complemented elements of a distributive lattice with 0 and 1 is a Boolean algebra.
The power set lattice of X is a Boolean algebra, called the power set algebra of X. A field of sets is a subalgebra of a power set algebra.
3.20 Stone Representation Theorem Each Boolean algebra is isomorphic to a field of sets.
Equivalence Relations and Products
3.21 Lemma Let X = X1 × X2 and let R1, R2 be the equivalence relations on X induced by the natural projections π1 : X → X1, π2 : X → X2, i.e. R1 = R_π1, R2 = R_π2. Then
(i) Each R1-class intersects every R2-class; each R2-class intersects every R1-class;
(ii) The intersection of an R1-class with an R2-class contains exactly one element of X, whence R1 ∧ R2 = I (the equality relation);
(iii) R1 ∨ R2 = U (the universal relation).
The conditions R1 ∧ R2 = I and R1 ∨ R2 = U, of course, say that R1 and R2 are complements in the lattice of equivalence relations on X (Definition 3.9 and Theorem 3.13). This lemma follows directly from the definitions and the observation that an R1-class is of the form {(a, y) : y ∈ X2} for some fixed a ∈ X1, and an R2-class is of the form {(x, b) : x ∈ X1} for some fixed b ∈ X2; in other words, each R1-class π1⁻¹(a) is a copy of X2, and each R2-class π2⁻¹(b) is a copy of X1. The converse of Lemma 3.21 is also true:
3.22 Lemma Let X be a set and let R1, R2 be equivalence relations on X that satisfy the three conditions (i)–(iii) in Lemma 3.21. Then X = X1 × X2, where X1 = X/R1 and X2 = X/R2.
PROOF By Lemma 2.16, the equivalence classes of R1 ∧ R2 are obtained by forming the set-theoretic intersection of each R1-equivalence class with each R2-equivalence class. Given R1 and R2, a map
(4) φ : X/(R1 ∧ R2) → X/R1 × X/R2
may therefore be defined, that sends an R1 ∧ R2-class to the uniquely determined ordered pair of R1-class and R2-class of which the R1 ∧ R2-class is the intersection. This map φ is one-to-one. If condition 3.21(i) is satisfied, then φ is onto X/R1 × X/R2, whence
(5) X/(R1 ∧ R2) ≅ X/R1 × X/R2.
Condition 3.21(ii) then completes the proof to the requisite X ≅ X/R1 × X/R2. □
In terms of generating observables, one has
3.23 Lemma Let X be a set equipped with two observables f, g. Then there is always an embedding (i.e. a one-to-one mapping)
(6) φ : X/R_fg → X/R_f × X/R_g.
This embedding is onto if and only if f and g are unlinked to each other.
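Lemma 3.23 can be illustrated on a finite set. In the sketch below (an illustration under assumptions, not the author's construction), each equivalence class is identified with the common value of the observable on it, so that the image of φ becomes the set of pairs (f(x), g(x)); the function names and observables are hypothetical.

```python
from itertools import product

def phi_image(f, g, X):
    """Image of phi: each R_fg-class goes to its (R_f-class, R_g-class) pair.
    Classes are identified here with the values of f and g on them."""
    return {(f(x), g(x)) for x in X}

def phi_is_onto(f, g, X):
    """phi is onto iff every R_f-class meets every R_g-class."""
    target = set(product({f(x) for x in X}, {g(x) for x in X}))
    return phi_image(f, g, X) == target

X = range(12)
mod3 = lambda x: x % 3
mod4 = lambda x: x % 4
mod6 = lambda x: x % 6

print(phi_is_onto(mod3, mod4, X))  # unlinked: every mod-3 class meets every mod-4 class
print(phi_is_onto(mod6, mod4, X))  # not unlinked: some pairs of classes do not intersect
```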
Covers and Diagrams
The notion of ‘immediate superior’ in a hierarchy may be defined in any poset:
3.24 Definition Let X be a poset and x, y ∈ X. One says y covers x, or x is covered by y, if x < y and there is no z ∈ X for which x < z < y.
The covering relation in fact determines the partial order in a finite poset: the latter is the smallest reflexive and transitive relation that contains the former.
3.25 Definition Let X be a poset with least element 0. An element a ∈ X is called an atom if a covers 0.
In the power set of X, ordered by ⊂, for A, B ⊂ X, B covers A if and only if A ⊂ B and B ∼ A contains exactly one element. Any singleton subset of X is an atom. In the lattice of equivalence relations on X, ordered by ≤, for two partitions R1 and R2, R2 covers R1 if and only if one of the blocks of R2 is obtained by the union of two blocks of R1, while the rest of the blocks of R2 are identical to those of R1. In these two examples, note that we are considering the full posets of all subsets and of all equivalence relations; with their subsets, the ‘gaps’ in the covers may of course be larger.
3.26 Hasse Diagram Using the covering relation, one may obtain a graphical representation of a finite poset X. Draw a point (or a small circle or a dot) for each element of X. Place y higher than x whenever x < y, and draw a straight line segment joining x and y whenever y covers x. The resulting graph is called a (Hasse) diagram of X.
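Let us consider two simple examples. Let A be the three-element set {1, 2, 3}. Its power set is {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, A}. The diagram of the power set of A, ordered by ⊂, is
(7) [Hasse diagram: ∅ at the bottom, the three singletons above it, the three two-element subsets above those, and A at the top]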
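The edges of such a diagram are exactly the pairs in the covering relation of Definition 3.24, which can be computed by brute force for a small poset. This is a minimal sketch; the function name `covers` is hypothetical, and subsets are modelled as frozensets so that `<` means proper inclusion.

```python
from itertools import combinations

def covers(elements, lt):
    """All pairs (x, y) such that y covers x under the strict order lt."""
    return [(x, y)
            for x in elements for y in elements
            if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements)]

A = {1, 2, 3}
subsets = [frozenset(c) for r in range(len(A) + 1)
           for c in combinations(sorted(A), r)]

edges = covers(subsets, lambda a, b: a < b)   # a < b: proper subset
atoms = [y for (x, y) in edges if x == frozenset()]

print(len(edges))                             # number of line segments in the diagram
print(sorted(sorted(a) for a in atoms))       # the singletons are the atoms
```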
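The partitions of A are I = {{1}, {2}, {3}}, R1 = {{1, 2}, {3}}, R2 = {{1, 3}, {2}}, R3 = {{1}, {2, 3}}, and U = {{1, 2, 3}}. The diagram of the lattice of equivalence relations on A, ordered by ≤, is
(8) [Hasse diagram: I at the bottom, R1, R2, R3 in the middle, and U at the top]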
One sees that x < y in X iff there is a path from y to x downward in the diagram of X. Thus any poset is defined (up to isomorphism) by its diagram. Also, the diagram of the dual poset ⟨X, ≥⟩ is obtained from that of ⟨X, ≤⟩ by turning the latter upside down. Hasse diagrams are also useful in characterizing modular and distributive lattices. From Theorem 3.16, one sees that the property characterizing nonmodularity only involves five elements: viz. the endpoints of an interval, an element, and its two comparable complements. The lattice of these five elements may be represented in the following diagram, which is called (evidently) the pentagon as well as (obscurely) N5:
(9) [Hasse diagram of the pentagon N5: a least and a greatest element, joined on one side by a chain of two elements and on the other side by a single element]
3.27 Theorem A lattice is modular if and only if it does not contain a sublattice isomorphic to the five-element lattice N5. Equivalently, a lattice is nonmodular if and only if it contains a sublattice isomorphic to the five-element lattice N5.
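Every distributive lattice is modular (Theorem 3.7), but the converse is not true. The following five-element lattice, which is called (evidently) the diamond as well as (obscurely) M3, is modular but not distributive:
(10) [Hasse diagram of the diamond M3: a least element, three pairwise incomparable elements above it, and a greatest element]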
From Theorem 3.15, we see that the property characterizing nondistributivity again involves five elements: viz. the endpoints of an interval, an element, and its two (not necessarily comparable) complements. The only two possible arrangements of these five elements are the pentagon N5 and the diamond M3.
3.28 Theorem A lattice is distributive if and only if it does not contain a sublattice isomorphic to the five-element lattice N5 or M3. Equivalently, a lattice is nondistributive if and only if it contains a sublattice isomorphic to the five-element lattice N5 or M3.
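The stated properties of the pentagon and the diamond can be verified by exhaustive checking of the modular identity (m) and the distributive law (d1). The sketch below is illustrative; the element labels "0", "a", "b", "c", "1" and the helper names are hypothetical, and each lattice is given by its strict order pairs (assumed transitively closed).

```python
from itertools import product

def lattice(elements, below):
    """Build leq, join, meet from strict order pairs (x, y) meaning x < y."""
    leq = lambda x, y: x == y or (x, y) in below
    def join(x, y):
        ubs = [u for u in elements if leq(x, u) and leq(y, u)]
        return next(u for u in ubs if all(leq(u, v) for v in ubs))
    def meet(x, y):
        lbs = [l for l in elements if leq(l, x) and leq(l, y)]
        return next(l for l in lbs if all(leq(v, l) for v in lbs))
    return leq, join, meet

def is_modular(elements, leq, join, meet):
    """Check (m): if x <= z then x ∨ (y ∧ z) = (x ∨ y) ∧ z."""
    return all(join(x, meet(y, z)) == meet(join(x, y), z)
               for x, y, z in product(elements, repeat=3) if leq(x, z))

def is_distributive(elements, join, meet):
    """Check (d1): x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)."""
    return all(meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
               for x, y, z in product(elements, repeat=3))

E = ["0", "a", "b", "c", "1"]
# Pentagon N5: 0 < a < b < 1 and 0 < c < 1.
N5 = {("0", "a"), ("0", "b"), ("0", "c"), ("0", "1"),
      ("a", "b"), ("a", "1"), ("b", "1"), ("c", "1")}
# Diamond M3: 0 < a, b, c < 1 with a, b, c pairwise incomparable.
M3 = {("0", "a"), ("0", "b"), ("0", "c"), ("0", "1"),
      ("a", "1"), ("b", "1"), ("c", "1")}

for name, below in [("N5", N5), ("M3", M3)]:
    leq, join, meet = lattice(E, below)
    print(name, is_modular(E, leq, join, meet), is_distributive(E, join, meet))
```

In N5 the triple x = a, y = c, z = b violates (m), since a ∨ (c ∧ b) = a while (a ∨ c) ∧ b = b.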
Semimodularity
3.29 Lemma In any lattice, if x ≠ y and both x and y cover z, then z = x ∧ y. Dually, if x ≠ y and z covers both x and y, then z = x ∨ y.
3.30 Theorem In a modular lattice,
(i) if x ≠ y and both x and y cover z (whence z = x ∧ y), then x ∨ y covers both x and y;
(ii) if x ≠ y and z covers both x and y (whence z = x ∨ y), then both x and y cover x ∧ y.
3.31 Corollary In a modular lattice, x covers x ∧ y if and only if x ∨ y covers y . 3.32 Definition A lattice is (upper) semimodular if x covers x ∧ y implies x ∨ y covers y . The dual property is called lower semimodular, for a lattice in which x ∨ y covers y implies x covers x ∧ y .
Corollary 3.31 says that a modular lattice is both (upper) semimodular and lower semimodular. Upper semimodularity is equivalent to the condition (i) in Theorem 3.30, and dually, lower semimodularity is equivalent to the condition (ii) in Theorem 3.30. Henceforth I shall follow the convention that ‘semimodular’ by itself means ‘upper semimodular’. The lattice of equivalence relations on X is not modular if X contains four or more elements. However,
3.33 Theorem The lattice of equivalence relations on X is a semimodular lattice.
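Chain Conditions
Most of the partially ordered sets and lattices we encounter are infinite, but many of them satisfy certain ‘finiteness conditions’.
3.34 Lemma In any poset ⟨X, ≤⟩ the following conditions are equivalent:
(a) [ascending chain condition, ACC] every ascending chain becomes stationary: if
(11) x1 ≤ x2 ≤ x3 ≤ ⋯,
then there exists n ∈ ℕ such that x_k = x_n for all k ≥ n;
(b) every strictly ascending chain terminates: if x1 < x2 < x3 < ⋯, then the chain has only finitely many terms;
(c) [maximum condition] every nonempty subset of X has a maximal element.
3.35 Lemma In any poset ⟨X, ≤⟩ the following conditions are equivalent:
(a) [descending chain condition, DCC] every descending chain becomes stationary: if x1 ≥ x2 ≥ x3 ≥ ⋯, then there exists n ∈ ℕ such that x_k = x_n for all k ≥ n;
(b) every strictly descending chain terminates: if x1 > x2 > x3 > ⋯, then the chain has only finitely many terms;
(c) [minimum condition] every nonempty subset of X has a minimal element.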
Recall Theorem 1.30 that any nonempty finite subset of any poset has minimal and maximal elements. The maximum and minimum conditions 3.34(c) and 3.35(c) (for every nonempty subset, finite or infinite) are not satisfied by all posets. The two lemmata say that when a poset satisfies condition (c), then it also equivalently satisfies the corresponding conditions (a) and (b). Note that the proof of the implication (b) ⇒ (c) requires the Axiom of Choice (1.37), and it may be shown that this is indispensable. Indeed, the implication 3.34(a) ⇒ (c) is Zorn’s Lemma (1.36). Stated otherwise, without the Axiom of Choice (whence its equivalent Zorn’s Lemma), the maximum condition is stronger than the ascending chain condition; but with the Axiom of Choice, both are equivalent. The poset of natural numbers ⟨ℕ, ≤⟩ satisfies the DCC. Condition 3.35(b) says that an infinite strictly descending chain cannot exist. This fact is the basis of an invention by Pierre de Fermat:
3.36 The Method of Infinite Descent Suppose that the assumption that a given natural number has a given property implies that there is a smaller natural number with the same property. Then no natural number can have this property.
Note that the method of infinite descent actually uses the fact that is the ‘opposite’ of its name: there cannot be infinite descent in natural numbers. Stated otherwise, using the method, one may prove that certain properties are impossible for natural numbers by proving that if they hold for any numbers, they would hold for some smaller numbers; then by the same argument, these properties would hold for some still-smaller numbers, and so on ad infinitum, which is impossible because a sequence of natural numbers cannot strictly decrease indefinitely. It may even be argued that Fermat used this method in (almost) all of his proofs in number theory. (He might have, perhaps, even used it in the one proof that a margin was too narrow to contain!) The next two lemmata say that induction principles hold for posets with chain conditions.
3.37 Lemma Let ⟨X, ≤⟩ be a poset satisfying the ACC. If P(x) is a statement such that
(i) P(x) holds for all maximal elements x of X;
(ii) whenever P(x) holds for all x > y then P(y) also holds;
then P(x) is true for every element x of X.
PROOF Let Y = {x ∈ X : ¬P(x)} (i.e., Y is the collection of all x ∈ X for which P(x) is false). I shall show that Y has no maximal element. For if y ∈ Y is maximal, then consider elements x ∈ X such that x > y. Either no such elements exist, or P(x) has to be true, because y is maximal. But then condition (ii) implies that P(y) is true, contradicting y ∈ Y. (When there are no elements x ∈ X with x > y, the antecedent of condition (ii) is vacuously satisfied.) Since ⟨X, ≤⟩ satisfies the ACC, whence by Lemma 3.34 also the maximum condition, the only subset of X that has no maximal element is empty. Thus Y = ∅, and so P(x) is true for every element x of X. □
Dually, one has
3.38 Lemma Let ⟨X, ≤⟩ be a poset satisfying the DCC. If P(x) is a statement such that
(i) P(x) holds for all minimal elements x of X;
(ii) whenever P(y) holds for all y < x then P(x) also holds;
then P(x) is true for every element x of X.
Note that in both Lemmata 3.37 and 3.38, condition (i) is in fact a special case of (ii). If x is a minimal element of X, then there are no elements y ∈ X with y < x. The antecedent of condition 3.38(ii) is vacuously satisfied, whence P(x) is true; i.e. (ii) ⇒ (i). Condition (i) is included in the statements of the lemmata because maximal and minimal elements usually require separate arguments. Compare Lemma 3.38 with ordinary mathematical induction; we see that (i) is analogous to ordinary induction’s ‘initial step’, and (ii) is analogous to the ‘induction step’. Indeed, Lemmata 3.37 and 3.38 are known as Principles of Transfinite Induction. Lemma 3.38 is used more often in practice than Lemma 3.37, because it is usually more convenient to use the DCC and minimum condition than their dual counterparts.
3.39 Definition A poset ⟨X, ≤⟩ is well-ordered if every nonempty subset of X has a minimal element.
This is, of course, simply the minimum condition of Lemma 3.35(c). The concept of ‘well-ordered set’ has a separate set-theoretic history, and is
intimately tied to that of ordinal number (cf. Theorem 1.34 and 1.35). The fact that ⟨ℕ, ≤⟩ is a well-ordered set is the basis for ordinary mathematical induction. The following results are immediate.
3.40 Lemma A well-ordered set is a chain.
3.41 Lemma Any finite chain is well-ordered.
3.42 Lemma Any subset of a well-ordered set is well-ordered.
3.43 Definition An element x of a lattice L is join-irreducible if x = sup M = ⋁_{y∈M} y for some finite subset M of L implies x ∈ M. Dually, an element x of a lattice L is meet-irreducible if x = inf M = ⋀_{y∈M} y for some finite subset M of L implies x ∈ M.
Note that by definition (since ∅ is a finite subset of L) the least element 0 of L (if it exists) is not join-irreducible, because 0 = sup ∅, and the greatest element 1 of L (if it exists) is not meet-irreducible, because 1 = inf ∅. For a join-irreducible element x, if x = y ∨ z, then x = y or x = z. Dually, for a meet-irreducible element x, if x = y ∧ z, then x = y or x = z. The following important theorems follow from the transfinite induction lemmata 3.37 and 3.38:
3.44 Theorem If a lattice satisfies the ascending chain condition, then every element can be expressed as a meet of a finite number of meet-irreducible elements.
3.45 Theorem If a lattice satisfies the descending chain condition, then every element can be expressed as a join of a finite number of join-irreducible elements.
Recall (Definition 3.25) that in a poset X with least element 0, an element a ∈ X is called an atom if a covers 0.
3.46 Theorem An atom is join-irreducible.
PROOF Let a be an atom (whence 0 < a), and let a = b ∨ c. Then 0 ≤ b ≤ a and 0 ≤ c ≤ a. Since there are no other elements between 0 and a, b and c must each be either 0 or a. But 0 ∨ 0 = 0, so either b = a or c = a. □
PART II Systems, Models, and Entailment
If, then, it is true that the axiomatic basis of theoretical physics cannot be extracted from experience but must be freely invented, can we ever hope to find the right way? Nay, more, has this right way any existence outside our illusions? Can we hope to be guided safely by experience at all when there exist theories (such as classical mechanics) which to a large extent do justice to experience, without getting to the root of the matter? I answer without hesitation that there is, in my opinion, a right way, and that we are capable of finding it. Our experience hitherto justifies us in believing that nature is the realisation of the simplest conceivable mathematical ideas. I am convinced that we can discover by means of purely mathematical constructions the concepts and the laws connecting them with each other, which furnish the key to the understanding of natural phenomena. Experience may suggest the appropriate mathematical concepts, but they most certainly cannot be deduced from it. Experience remains, of course, the sole criterion of the physical utility of a mathematical construction. But the creative principle resides in mathematics. In a certain sense, therefore, I hold it true that pure thought can grasp reality, as the ancients dreamed. — Albert Einstein (10 June 1933) On the Methods of Theoretical Physics Herbert Spencer Lecture, University of Oxford
In this second movement, the timbre of my composition changes. I move from abstract algebra into the domains of ontology and epistemology. System is a basic undefined term, a primitive. It takes on the intuitive meaning of ‘a collection of material or immaterial things that comprises one’s object of study’. The crux in the formulation of a theory of living systems is the conception of model. It is the nature of the relation between the entailment patterns of two systems that allows one to serve as a model of the other. The purpose of modelling is that one may learn something new about a system of interest by studying a different system that is its model.
4 The Modelling Relation
The essence of a modelling relation consists of specifying an encoding and a corresponding decoding of particular system characteristics into corresponding characteristics of another system, in such a way that implication in the model corresponds to causality in the system. (I shall presently explain these italicized terms in detail below.) Thus in a precise mathematical sense a theorem about the model becomes a prediction about the system. A general theory of the modelling relation results when these remarks are given a rigorous setting. This theory has many important implications: to more general situations of metaphor, to the way in which distinct models of a given system are related to each other, and to the manner in which distinct systems with a common model may be compared. The modelling relation is the point of departure in Rosen’s science. It was explored in detail in Chapters 2 and 3 of AS, and also in Chapter 3 of LI. The present chapter contains a précis of the theme of, as well as my variations on, the subject.
Dualism
4.1 Self Any organism, whether observer, perceiver, or cognizer, automatically creates a dualism between self and non-self.
The concept of ‘self’ is a universal primitive, indeed, the universal primitive, from which everything in the universe unfolds. If I represent ‘self’ as a set I , then the recognition of self is the identification of the property ‘belonging to’ I , which is itself a primitive in set theory. The non-self is, therefore, the complementary set I c in a suitable universe. Stated otherwise, if one considers the ‘membership relation’ ∈, self is ∈, and non-self is ∉. 4.2 Ambience The ‘non-self’ may also be called one’s ambience. The ambience comprises all that inhabits the external or outer world, the world of events and phenomena (including other observers). An organism draws the sharpest possible distinction between oneself and one’s ambience, and proceeds on the basis that there is a real, fundamental distinction between them. The dualism between internal and external is one of the most ancient in a sentient being’s perception of the universe. This dualism intrudes itself at the most basic levels. In particular, for human beings, it complicates our ‘science’. Science is supposed to deal with the objective phenomena in the external world, but these need, however, to be perceived by the senses that are of the subjective internal or inner world of the cognizer. The understanding of these phenomena means a translation into a realm of language and symbol which does not belong to the external world. The world was, of course, here before we were. Without any hearer around, however, while a tree falling in the forest may still make a sound, there will not be a science of the sound. The word ‘science’, after all, extends its original meaning of knowledge, which implies a cognizant observer. (Note the answer to any question, existential or otherwise, depends on the definition of the terms involved. 
If by the word ‘sound’ one means, say, ‘the sensation caused in the ear, or the mental image created in the brain, by vibrating molecules’, then of course by definition sound and the hearing of it are coexistent, thence Berkeley’s falling-tree question becomes rhetorical, posed to simply affirm a tautology. But if ‘sound’ means ‘a longitudinal pressure wave in an elastic medium’, then the concepts of sound and hearing are uncoupled, whence ‘unheard sound’ is not an oxymoron.) The requirement of
understanding the objective by symbolic means is almost a definition of cognition. In the deepest sense, the claim of a surrogacy between phenomena and their description is the essence of modelling between external world and the parallel internal one. Without a modeller, there is no modelling. The establishment of relationships between the internal world of ideas and language and the external world of sensory phenomena is also the hallmark of the theory of systems. The necessity of treating external and internal phenomena together is one of the crucial characteristics of system theory, which has introduced a new element into the philosophy of science over the past few decades. While there had been a few preliminary works in the field, the subject of ‘general system theory’ was considered to be founded in 1954 by Ludwig von Bertalanffy, Kenneth Boulding, Anatol Rapoport, and Ralph Gerard. Note that in the name ‘general system theory’, the adjective ‘general’ modifies the noun ‘theory’, whence the topic is the generalities of ‘system theory’; ‘general’ is not attributed to ‘systems’, so the topic is not the theory of ‘general systems’. The definitive introduction to the subject was written by a founder, the still-precious General System Theory by von Bertalanffy [1968]. 4.3 “Systems [sic] Theory” Consider the terms ‘theory of systems’ and ‘system theory’ in the previous paragraph; in particular, note the singular form system in the latter: not “systems theory”. This last usage is a solecism that became accepted when it had been repeated often enough, a very example of ‘accumulated wrongs become right’. Recall that von Bertalanffy’s masterwork is called General System Theory. (In some of his later writings, the term “systems theory” did occasionally appear. 
I have in my collection some copies of his original typescripts, in which he had written “system theory”, but in the published versions they mysteriously mutated to “systems theory” — evidence of the handiwork of an over-zealous copy editor, perhaps...) Just think of ‘set theory’, ‘group theory’, ‘number theory’, ‘category theory’, etc. Of course one studies more than one object in each subject! Indeed, one would say in the possessive ‘theory of sets’, ‘theory of groups’,
‘theory of numbers’, ‘theory of categories’, ...; one says ‘theory of systems’ for that matter. But the point is that when the noun of a mathematical object (or indeed any noun) is used as adjective, one does not use the plural form. 4.4 Natural System Natural science is one attempt to come to grips with what goes on in the external world. It has taught one to isolate from one’s ambience conspicuous parts, which one may call natural systems. The extraction of a natural system from one’s ambience then creates a new dualism, between the system and its environment. A specification of what the system is like at a particular time is its state. As science has developed since the time of Newton, we have learned to characterize systems in terms of states, and to cast the basic problems of natural science in terms of temporal sequences of state transitions in systems. These sequences are determined in turn by the character of the system itself, and by the way it interacts with the environment. Thus one has the Primitive A natural system is (a) a part, whence a subset, of the external world; and (b) a collection of qualities, to which definite relations can be imputed. If ‘self’ is represented by the set I , then a natural system N may be represented by a subset N ⊂ I c . Note that I have used the symbol N to denote both a natural system and a set that represents it. Indeed, I also have I as self and a set representing the same. This equivalence is the essence of the modelling relation that I shall explicate presently. Here, let us simply note in passing that it is not too great an exercise of faith to believe that everything is a set: ‘a set of apples’, ‘a set of celestial bodies’, ‘a set of cells of an organism’, ‘a set of metabolic and repair components’, ‘a set of mechanistic parts’, a set containing infinitely many members, a set consisting of a single item, an empty set,…
4.5 Quid est veritas? ‘Reality’ is one of the few words which mean nothing without quotes. — Vladimir Nabokov (1956) On a Book Entitled «Lolita»
One learns about systems, their states, and their state-transition sequences through observation, through measurement. This, of course, is what observers do. But a mere observer is not a scientist. Indeed, basic features of the observer’s internal world enter, in an essential way, in turning an observer into a scientist. Often, these features are imputed to the ambience of the observer, and treated as if they too were the results of observation. Models and the modeller’s behaviour shape each other. The models realized in the anticipatory systems that are ourselves are relative truths, and are in fact the ‘constitutive parameters’ of our individuality, or as Rosen put it, “subjective notions of good and ill”. One may speak of absolute truths concerning formal systems, in the universe of mathematics. But there is no such luxury for natural systems, in the external world outside the realm of formalism. In the natural world, ‘reality’ is subjective, ‘truth’ is relative. We can indeed agree about life, the universe, and everything, if (and what a big if this is!) we can agree on our answers to the question, in Pontius Pilate’s immortal words: “What is truth?” (John 18:38). 4.6 Formal System A formalism is a ‘sublanguage’ specified entirely through its syntax. The importance of making it a sublanguage resides in (a) its capacity, through the larger language in which it sits, for semantic function, and (b) one may explore a formalism through purely syntactic means, independent of any such semantic function, if one wishes to do so. There is a close analogy between extracting a formalism from a language and extracting a natural system from an observer’s ambience; one may therefore call a formalism a formal system. Note that the concept includes, but is not limited
and therefore not equivocated to, Hilbert’s formalization. In the broadest sense, mathematics is the study of such formal systems, whence one has the next Primitive A formal system is an object in the universe of mathematics. In the formal world, the axioms of a formal system define its objective reality, and logic predicates its absolute truths. In Chapter 7, I shall give a definition of formal system that encompasses this primitive without loss of generality. I shall be concerned with the internal syntactic structures of formal systems, and the ways in which semantic content can be attached to formal systems. For now, let us concentrate on the close relations that exist between natural systems in the external world, and formal systems in the internal one.
Natural Law
It can be commonly agreed that no one, whether experimenter, observer, or theorist, does science at all without believing that nature obeys laws or rules, and that these natural regularities can be at least partly grasped by the mind. That nature obeys laws is often subsumed under the notion of causality. The articulation of these causal laws or relationships means, in brief, that one can establish a correspondence between events in the world and propositions in some appropriate language, such that the causal relations between events are exactly reflected in implication relations between corresponding propositions.
4.7 Causality and Perception ‘Law of Nature’, or Natural Law, consists of two independent parts. The first of these comprises a belief, or faith, that what goes on in the external world is not entirely arbitrary or whimsical. Stated in positive terms, this is a belief that successions of events in that world are governed by definite relations, usually called causal. Without such a belief, there could be no such thing as science. Causality, and ideas of
entailment in general, guarantee a kind of regularity that one expects in nature and in science, in the sense that the same causes imply the same effects. In short, causality is ‘objective’. I shall explicate the topic of causation in greater philosophical and scientific detail in the next chapter.
The second constituent of Natural Law is a belief that the causal relations between events can be grasped by the mind, articulated and expressed in language. Therefore, one sees in the causal world the operation of laws in terms of which the events themselves may be understood. ‘Perceived causality’ then becomes subjective. This ‘perception’ aspect of Natural Law posits a relation between the syntactic structure of a language and the semantic character of its external referents. This relation is different in kind from entailment within language or formalisms (i.e., implication or inference, which relate purely linguistic entities), and from entailment between events (i.e., causal relations between things in the external world). Natural Law, therefore, posits the existence of entailments between events in the external world and linguistic expressions (propositions) about those events. Stated otherwise, it posits a kind of congruence between implication (a purely syntactic feature of languages or formalisms) and causality (a purely semantic, extra-linguistic constituent of Natural Law).
One may summarize thus: Natural Law makes two separate assertions about the self and its ambience:
I. The succession of events or phenomena that one perceives in the ambience is not arbitrary: there are relations (e.g. causal relations) manifest in the world of phenomena.
II. The posited relations between phenomena are, at least in part, capable of being perceived and grasped by the human mind; i.e. by the cognitive self.
Science depends in equal parts on these two separate axioms of Natural Law. In short, Axiom I, that causal order exists, is what permits science to exist in the abstract, and Axiom II, that this causal order can be imaged by implicative order, is what allows scientists to exist. Both are required.
4.8 Wigner The theorist’s job is essentially to bring causal order and implicative order into congruence. There appears no a priori reason, however, to expect that purely syntactic operations (i.e., inferences on propositions about events) should in fact correspond to causal entailments between events in the external world. Pure mathematicians like to boast that for the objects of their studies, they do not. Surprisingly often, however, even pure mathematics of the most abstract origins turns out to be useful in explaining natural phenomena. If one chooses one’s language carefully, and expresses external events in it in just the right way, the requisite homology appears between implication in the language and causality in the world described by the language. The physicist Eugene Wigner once delivered a lecture [Wigner 1960] on “the unreasonable effectiveness of mathematics in the natural sciences”. Wigner would not have written his now-famous article if he felt that mathematics was only ‘reasonably effective’, or even ‘reasonably ineffective’, in science — the reasonable is not the stuff of miracles, and gives one no reason to reason about it further. Wigner obviously felt, however, that the role played by mathematics in the sciences was in some sense excessive, and it was this excess which he regarded as counterintuitive. There are two prongs to Wigner’s disquiet about the role of mathematics in science. The first is that mathematics, taken in itself as an abstract entity, should have such success in dealing with extra-mathematical referents. The second, which builds upon the first, notes that the criteria which have guided the historical development of mathematics have apparently been unrelated to any such extra-mathematical referents; they have been, rather, selections of subjective mathematical interest, arena of the exercise of cleverness, and most of all, pageants of abstract beauty. 
Why, Wigner asks, should formalisms developed according to such criteria allow extra-mathematical referents at all, let alone with such fidelity? His essay provides evidence to bolster his impression, but he does not attempt to account for the excess which he perceives. A relation between a language or formalism and an extra-linguistic referent, which manifests such a congruence between syntactic implication
within language and causality in its external referent, will be called a modelling relation. I shall next describe such relations more precisely, before investigating what they themselves entail, and in so doing provide a possible explanation for Wigner’s miracle.
Model versus Simulation
4.9 Arrow Diagram
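[Figure (1): arrow diagram of the systems S1 and S2, with entailment arrow ϕ in S1, entailment arrow ψ in S2, encoding arrow α from S1 to S2, and decoding arrow β from S2 to S1.]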
Figure (1) contains the components I need to describe what a modelling relation is between a system S1 and a system S2. Often, S1 is a natural system and S2 is a formal system; but I shall begin with the most general case where each of S1 and S2 can be either natural or formal. Later, I shall specialize to the prototypical situation, as well as other natural/formal system combinations for S1 and S2. The crux of the matter lies in the arrows of the diagram, which I have labelled ϕ, ψ, α, and β. The arrows ϕ and ψ represent entailment in the systems S1 and S2, respectively. The arrow α is called the encoding arrow. It serves to associate features of S1 with their counterparts in S2. The arrow β denotes the inverse activity to encoding; namely, the decoding of features of S2 into those of S1.
The arrows α and β taken together thus establish a kind of dictionary, which allows effective passage from one system to the other and back again. However, I may remark here on the peculiar status of the arrows α and β. Namely, they are not a part of either system S1 or S2, nor are they entailed by anything either in S1 or in S2.
4.10 Simulation A modelling relation exists between systems S1 and S2 when there is a congruence between their entailment structures. The vehicle for establishing a relation of any kind between S1 and S2 resides, of course, in the choice of encoding and decoding arrows, the arrows α and β. A necessary condition for congruence involves all four arrows, and may be stated as ‘whether one follows path ϕ or paths α, ψ, β in sequence, one reaches the same destination’. Expressed as composition in mathematical terms, this is
$$\phi = \beta \circ \psi \circ \alpha. \tag{2}$$
If this relation is satisfied, one says that S2 is a simulation of S1. Let f : X → Y be a mapping representing a process in the entailment structure of the arrow ϕ in S1. Consider a mapping g : α(X) → α(Y) (which is a process in the entailment structure of the arrow ψ in S2) that makes the diagram
[Diagram (3): square relating f : X → Y in S1 and g : α(X) → α(Y) in S2 through the encoding α.]
commute (which means for every element x in X, whether it traces through the mappings f followed by α, or through α followed by g, one gets the same result in α(Y); i.e. the equality
$$\alpha(f(x)) = g(\alpha(x)) \tag{4}$$
holds for all x ∈ X). Note that this commutativity condition for simulation places no further restrictions on the mapping g itself, other than that it needs to reach the correct final destination. Such emphasis on the results regardless of the manner in which they are generated (i.e. with no particular concern for underlying principles) is the case when S2 is a simulation of S1.
4.11 Model If, however, the mapping g is itself entailed by the encoding α, i.e. if g = α(f), whence the mapping in S2 is α(f) : α(X) → α(Y), then one has the commutative diagram
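[Diagram (5): commutative square relating f : X → Y in S1 and α(f) : α(X) → α(Y) in S2 through the encoding α.]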
and the equality corresponding to (4), for every element x in X, is
$$\alpha(f(x)) = \alpha(f)(\alpha(x)). \tag{6}$$
When this more stringent condition (6) is satisfied, the simulation is called a model. If this modelling relation is satisfied between the systems S1 and S 2 ,
one then says that there is a congruence between their entailment structures, and that S2 is a model of S1. This kind of congruence between entailment structures is defined by the mathematical entity called functor (consult the Appendix for definitions and examples) in category theory, which I shall explain briefly later (cf. 4.17 below). A simulation of a process provides an alternate description of the entailed effects, whereas a model is a special kind of simulation that additionally provides an alternate description of the entailment structure of the mapping representing the process itself. It is, in particular, easier to obtain a simulation than a model of a process.
4.12 Remarks Examples are in order. For instance, Claudius Ptolemy’s Almagest (c. AD 150) contained an account of the apparent motion of many heavenly bodies. The Ptolemaic system of epicycles and deferents, later with adjustments in terms of eccentricities and equant points, provided good geometric simulations, in the sense that there were enough parameters in defining the circles so that any planetary or stellar trajectory could be represented reasonably accurately by these circular traces in the sky. Despite the fact that Ptolemy did not give any physical reasons why the planets should turn about circles attached to circles in arbitrary positions in the sky, his simulations remained the standard cosmological view for 1400 years. Celestial mechanics has since, of course, been progressively updated with better theories of Copernicus, Kepler, Newton, and Einstein. Each improvement explains more of the underlying principles of motion, and not just the trajectories of motion. The universality of the Ptolemaic epicycles is nowadays regarded as an extraneous mathematical artefact irrelevant to the underlying physical situation, and it is for this reason that a representation of trajectories in terms of them can only be regarded as simulation, and not as model.
As another example, a lot of the so-called ‘models’ in the social sciences are really just sophisticated kinds of curve-fitting, i.e. simulations. These activities are akin to the assertion that since a given curve can be approximated by a polynomial, it must be a polynomial. Stated otherwise,
curve-fitting without a theory of the shape of the curve is simulation; model requires understanding of how and why a curve takes its shape. In short: simulation describes; model explains.
‘Simulation’ is based on the Latin word similis, ‘like, similar’. A simulacrum is ‘something having merely the form or appearance of a certain thing, without possessing its substance or proper qualities’. ‘Model’ in Latin is modulus, which means ‘measure’; herein lies a fine nuance that implies a subtle increase in precision. (It is interesting to note that in FM, Fundamentals of Measurement and Representation of Natural Systems, the first book of the Rosen trilogy, measurement and similarity are two main topics.) In common usage, however, the two words ‘simulation’ and ‘model’ are often synonyms, meaning:
○ a simplified description of a system put forward as a basis for theoretical understanding
○ a conceptual or mental representation of a thing
○ an analogue of different structure from the system of interest but sharing an important set of functional properties.
Some, alternatively, use ‘model’ to mean mathematical theory, and ‘simulation’ to mean numerical computation. What I have presented above, however, are Robert Rosen’s definitions of these two words.
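The difference between the simulation condition (4) and the model condition (6) can be made concrete in a toy setting. The sketch below is an illustration only, under assumptions of my own: S1 is taken to be the integers 0 to 9 with the process f(x) = x², the encoding α is reduction modulo 3, and the names `g_table` and `alpha_f` are hypothetical. A lookup table fitted to observed outcomes satisfies (4) without saying anything about how the results arise, whereas α(f), "squaring modulo 3", is the image of the rule f itself under the encoding.

```python
X = range(10)  # assumed domain of the process in S1


def f(x):
    """The process in S1 (an assumed example): squaring."""
    return x * x


def alpha(x):
    """The encoding (an assumed example): reduction modulo 3."""
    return x % 3


# Simulation: a table fitted purely to observed encoded outcomes.
g_table = {}
for x in X:
    # The table is well defined only if equal encodings give equal outcomes.
    assert g_table.setdefault(alpha(x), alpha(f(x))) == alpha(f(x))


def alpha_f(y):
    """Model: the rule f transported by the encoding, i.e. squaring mod 3."""
    return (y * y) % 3


def commutes(g):
    """Condition (4): alpha(f(x)) == g(alpha(x)) for every x in X."""
    return all(alpha(f(x)) == g(alpha(x)) for x in X)


simulation_ok = commutes(g_table.__getitem__)
model_ok = commutes(alpha_f)
```

Both mappings make the square commute on X. Only `alpha_f` carries the structure of f across the encoding, which is the extra requirement g = α(f) of 4.11; the table merely records destinations.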
The Prototypical Modelling Relation
I now specialize to the prototypical modelling relation when S1 is a natural system and S2 is a formal system.
4.13 Interpretation of the Arrows When S1 is a natural system, ϕ is causal entailment. It may be thought of as the entailment of subsequent states by present or past states. It is what an observer sees diachronically when
looking at a system. The arrow ϕ thus makes no reference to anything pertaining to language, or indeed to any internal activity of the observer, beyond the basic act of isolating the system and observing the causal entailment in the first place. In short, the existence of causal entailment is ontological, but its representation as the arrow ϕ is, however, epistemological. The fact that the natural system S1 with the arrow ϕ can be represented at all is due to Axiom I of Natural Law. The arrow ψ schematically represents the entailment apparatus of the formalism S2 , its inferential structure, implications. This inferential entailment is entirely syntactic. It makes no reference to semantics, meaning, or any external referents whatever. In short, inferential entailment ψ is strictly epistemological. The encoding arrow α serves to associate features of S1 with formal tokens in S2 . The simplest and perhaps the most familiar kind of encoding is the expression of results of measurement in numerical terms. (This topic was investigated in detail in FM.) Numbers, of course, are formal objects; the association of numbers with meter readings is the most elementary kind of encoding of a natural system in a formal one. In short, encoding α is our description of the ontology. The decoding arrow β denotes the complementary activity to encoding; namely, the association of elements of the formalism S2 into specific external referents, observable properties of the natural system S1 . In short, decoding β is our interpretation of our epistemology. Thus encoding and decoding let one pass effectively from the natural world to the formal one and back again. As I mentioned above, they cannot be meaningfully said to be caused by S1 or anything in it; nor can they be said to be implied by anything in S2 . The formal system S 2 with its inferential entailment arrow ψ , together with the encoding and decoding arrows α and β , are Axiom II of Natural Law.
4.14 Summary The situation may be represented in the following canonical diagram, with a change-of-names of the symbols:
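[Diagram (7): the canonical modelling relation between a natural system N, with causal entailment arrow c, and a formal system F, with inferential entailment arrow i, connected by the encoding arrow ε from N to F and the decoding arrow δ from F to N.]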
A modelling relation exists between the natural system N and the formal system F when
$$i = \varepsilon(c) \quad \text{and} \quad c = \delta \circ i \circ \varepsilon. \tag{8}$$
If these conditions are satisfied, F is a model of N, and N is a realization of F. Let me explain alternatively what the above congruence conditions (8) mean. One thinks of the causal entailment structure c as embodied in state-transition sequences in N; it is what an observer sees when simply watching events in N unfold. The encoding arrow ε pulls features of N into the formal system F. More precisely, it endows these features with formal images in F, images on which the inferential entailment structure i of F may operate. One may think of these ‘observed’ images as ‘hypotheses’ or ‘premises’ in F. The inferential structure of F then specifies what these ‘hypotheses’ entail within the formal system F; this process of entailment in F is precisely the arrow i. Applying inferential rules to hypotheses generates ‘theorems’ in F. The particular ‘theorems’ in which one is now
interested are those arising from hypotheses coming from N via the encoding ε. It is evident that such ‘theorems’, when decoded from F to N via the arrow δ, become assertions (‘predictions’) about N. The commutativity requirement in the modelling condition then requires that one always gets the same answer, whether (a) one simply watches the operation of causal entailment c in the natural system N itself, or (b) one encodes N into F via the arrow ε, applies the inferential processes i of F to what is encoded, and then decodes the resulting theorems, via the decoding arrow δ, into predictions about N.
4.15 Natural Law Revisited The deceptively simple diagram of the prototypical modelling relation above allows the reformulation of the concept of Natural Law in a mathematically rigorous way. Tersely, Natural Law asserts that any natural system N possesses a formal model F, or conversely, is a realization of a formalism F. Stated otherwise, Natural Law says that any process of causal entailment in the external world may be faithfully represented by homologous inferential structure in some formal system F with the appropriate encodings and decodings. It must be stressed that Natural Law alone does not tell us how to accomplish any of this; it merely says that it can be done.
This equivalence of causality in the natural domain and inference in the formal domain is an epistemological principle, the axiom Every process is a mapping. Just like the axiom “Everything is a set.” leads to the identification of a natural system N and its representation as a set (cf. 4.4), mathematical equations representing causal patterns of natural processes are results of the identification of entailment arrows and their representations as mappings. As a cautionary note, it must be emphasized that this equivalence is a consequence of the model, the accessibility of which is predicated by Natural
Law. Simulation is promiscuous; the less stringent requirement in the encoding means that a causal process may very well be manipulated so that its function is lost (cf. 4.10 and 4.11), in which case a natural process may not be represented by a formal mapping. I shall revisit simulation in the context of simulability in Chapter 8. Structure is the order of parts, represented by sets; function is the order of processes, represented by mappings. The very fact that the right-hand side of the modelling relation diagram (7) is an object in the universe of mathematics thus implies, in a sense, that the concept of Natural Law already entails the efficacy of mathematics, which Wigner was astonished to find so unreasonable. Mathematics combines generality of concepts with depth of details, and is the language of nature. Perhaps what is truly surprising is not the ‘can be done’ part, but that we should have been so good at the ‘how’. After all, modelling itself, the choice of appropriate sets and mappings {F , ε , δ } of formal system, encoding, and decoding given a natural system N , is more an art than a science. But then again, mathematics is both high art and supreme science. Mathematics is Regina scientiarum; it adds lucidity and clarity to science. The tree of science has many branches, but the trunk is mathematics. A caveat, however, is that a danger of mathematical modelling is in abstracting too much away. One must always remember the semantics, not just the syntax. And one must learn the importance of alternate descriptions. Generality and depth must be part of the modelling endeavour itself, manifested in the plurality of models from diverse branches of mathematics. Wigner’s miracle is not the all-encompassing “mathematics is effective in explaining everything”, only that “mathematics is unreasonably effective in the natural sciences”, and that what it can do it can usually do exceptionally well. 
Mathematical modelling, therefore, is not the reduction of science to mathematics. The error of Reductionism (i.e., to physics) is not the claim that physics can explain other sciences, but that it exhausts ‘reality’, that all sciences should have its format.
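The commutativity requirement c = δ ∘ i ∘ ε in conditions (8) can be sketched computationally, with the caveat that a natural system is not a data structure. In the sketch below, every particular is an assumption made for illustration: the ‘natural system’ is stood in for by a table of observed state transitions on tally marks, the encoding counts marks, the inferential rule doubles a number, and the decoding writes a number back as marks.

```python
# Stand-in for causal entailment c in N: observed state -> next observed state.
# (Hypothetical observations; a real natural system is watched, not tabulated.)
observed = {"|": "||", "||": "||||", "|||": "||||||"}


def encode(state):
    """Encoding (epsilon): measure a state by counting its tally marks."""
    return len(state)


def infer(n):
    """Inferential entailment i in the formal system F: doubling."""
    return 2 * n


def decode(n):
    """Decoding (delta): interpret a number as a state of tally marks."""
    return "|" * n


def predict(state):
    """The path delta after i after epsilon through the formal system."""
    return decode(infer(encode(state)))


# Congruence: watching N and predicting through F give the same answer.
congruent = all(predict(state) == nxt for state, nxt in observed.items())
```

On the observed states the two paths agree, so within this toy setting the formalism behaves as a model of the tabulated system, and a decoded theorem such as `predict("||||")` becomes a prediction about a state not yet observed.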
The General Modelling Relation
In the previous discussion on the prototypical modelling relation, I have been concerned with the case S1 = N and S2 = F, making manifest some of the parallels that exist between the external world of events or phenomena and the internal world of language and mathematics. Indeed, the modelling relation, and the concept of Natural Law on which it rests, are merely direct expressions of these parallels. Next, I consider other natural/formal system combinations for S1 and S2.
4.16 Homology When both S1 and S2 are formal systems, one has a modelling relation within mathematics itself. By ‘modelling within mathematics’, I mean the establishment of homologies between different kinds of inferential structures, arising from different parts of mathematics. In effect, one part of mathematics is treated like the external world; its inferential properties treated like causal entailment. Such ‘internal modelling’ within mathematics allows one to bring one part of mathematics to bear on another part, often to the most profound effect.
Examples of such ‘internal modelling’ abound. Consider, for instance, Cartesian Analytic Geometry, which created an arithmetic model of Euclid’s Elements, and thereby brought algebraic reasoning to bear on the corpus of geometry. Later, the consistency of ‘non-Euclidean’ geometries was proved by establishing Euclidean models of non-Euclidean objects. Whole theories in mathematics, such as the theory of Group Representations, rest entirely on such notions. Henri Poincaré ushered in a new era in mathematics by showing how to build other kinds of algebraic models of geometric objects. His idea of homotopy, and later, of homology, showed how to create group-theoretic images of topological spaces, and to deduce properties of the latter from those of the former.
4.17 Category Theory In 1945, a whole new branch of mathematics was developed, by Samuel Eilenberg and Saunders Mac Lane, initially to formalize these methodologies initiated by Poincaré. This came to be called
category theory, and its subject matter was precisely the relations between different inferential structures within mathematics itself. In fact, it can be regarded as a general theory of modelling relations within mathematics. The Appendix in this monograph is a terse introduction to the theory. The active agents of comparison between categories (i.e., between different kinds of formalisms or inferential structures) are called functors. The formal counterpart of Natural Law in this purely abstract setting is the existence of nontrivial functors α between categories S1 and S2 :
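$$
\begin{aligned}
\alpha &: S_1(X, Y) \to S_2\bigl(\alpha(X), \alpha(Y)\bigr), \qquad \alpha : \phi \mapsto \alpha(\phi), \\
\alpha &: X \mapsto \alpha(X), \qquad \alpha : Y \mapsto \alpha(Y).
\end{aligned}
\tag{9}
$$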
In mathematics, there is also a subject in axiomatics and foundations called model theory. A model for an axiomatic theory is simply a system of objects satisfying the axioms, chosen from some other theory. The topic is intimately related to consistency and completeness, and Hilbert’s axiomatization and proof theory. In this monograph, I shall not stray into this chapter on foundational matters of mathematics, the exploration of which may begin with Kleene’s timeless Introduction to Metamathematics [1952]. I shall have occasions to refer to Kleene again.
4.18 Analogues
(10)
In figure (10), there are two different natural systems N1 , N 2 which possess the same formal model F (or alternatively, which constitute distinct realizations of F ). It is not hard to show that one can then ‘encode’ the features of N1 into corresponding features of N 2 and conversely, in such a way that the two causal structures, in the two natural systems N1 and N 2 , are brought into congruence. That is, one can construct from the above figure a commutative diagram of the form shown in figure (11). This is a modelling relation between two natural systems, instead of a natural system and a formal one.
(11)
Under these circumstances depicted in the two previous figures, the proper term to use is that the natural systems N1 and N 2 are analogues. Analogous systems allow us to learn about one by observing the other. Relations of analogy underlie the efficacy of ‘scale models’ in engineering, as well as all of the various ‘principles of equivalence’ in physics. But the
relation of analogy cuts much deeper than this. Natural systems of the most diverse kinds (e.g. organisms and societies, economic systems and metabolisms) may be analogous; analogy is a relation between natural systems which arises through the models of their causal entailments, and not directly from their material structures. As such, analogy and its cognates offer a most powerful and physically sound alternative to reductionism (viz. ‘share a common model’ and therefore ‘analogous’, as opposed to ‘one encompasses the other’). 4.19 Alternate Models
(12)
A complementary diagram to that of figure (10) is shown in figure (12). Here, a single natural system N is modelled in two distinct formalisms F1 , F2 . The question here is: What, if any, is the relation between the formalisms F1 and F2 ? The answer here is not in general as straightforward as before; it depends entirely on the extent of the ‘overlap’ between the two encodings of N in F1 and F2 , respectively. In some cases, one can effectively build at least some encoding and decoding arrows between the two formalisms. For a couple of examples, consider Dirac’s transformation theory formulation of quantum mechanics which unifies Heisenberg’s matrix mechanics and Schrödinger’s wave mechanics, and the relation between the thermodynamic
and statistical-mechanical models of fluids. In other cases, there exists no formal relation between F1 and F2. One then has the situation in which N simultaneously realizes two distinct and independent formalisms; Bohr’s various complementarities for microphysical phenomena are examples.
4.20 Largest Model Many practical and theoretical questions are raised in situations of this last type; some of them bear crucially on the limits of reductionism. It is often asserted, and still more widely believed, that physics implies reductionism; that the only way to relate natural systems is by analyzing them down to a common set of constituents: molecules, atoms, or elementary particles. In the reductionistic view, a scientific theory that is not firmly grounded on physicochemical principles is by definition wrong. The end result of such an analysis is an encoding of any natural system into a formalism which serves to express any system property in terms of these ultimate constituents. In some sense, this is the largest formalism, the largest model, which can exist; any other model is, in formal terms, some kind of quotient model or submodel of this biggest one.
In this case, the independence of two formalisms F1 and F2 , that N simultaneously realizes, is only apparent. Reductionism holds that one can always embed F1 and F2 in some larger formalism F , which is again a model, and from which F1 and F2 can be recaptured by purely formal means. An all-encompassing largest model is the metaphorical all-explaining ‘theory of everything’. The existence of such a largest formalism, which itself models a given natural system N , and from which all others can be formally generated, would constitute a new postulate about the nature of the material world itself. Some kinds of natural systems admit such a largest model, but others do not. Indeed, if one pursues this matter further, it appears that the distinction, between those that do admit a largest model and those that do not, has many of the properties of the distinction between inanimate and animate, or of simple and complex. I shall pursue these matters later on in this monograph. Rosen’s previous works [AS, LI, EL] initiated these investigations.
5 Causation
Felix qui potuit rerum cognoscere causas [Happy is one who comes to know the causes of things] — Virgil (29 BC) Georgics Book II, line 490
Aristotelian Science
Aristotle’s categories of causation made their first appearance, albeit only in passing, in Rosen’s publications in Chapter 7 of AS. The topic then received detailed treatment in LI (notably Sections 3E, 3G, and 5I). The Philosopher’s ancient text on causality, Chapter 3 of Book II of Physics, presented some of the most influential concepts in human thought. The four causes dominated philosophical and scientific thinking in the Western world for millennia, until the Newtonian revolution that is ‘the mechanization of the world picture’.
5.1 Wisdom and Knowledge O Sapientia, quae ex ore Altissimi prodiisti, attingens a fine usque ad finem, fortiter suaviterque disponens omnia: veni ad docendum nos viam prudentiae. [O Wisdom, coming forth from the mouth of the Most High, reaching out from one end to another, mightily and sweetly guiding all things: Come to teach us the way of knowledge.] — Advent Antiphon for 17 December attributed to Benedictine monks (c. sixth century AD)
Σοφία (sophia) is the Greek word for (and also the Greek goddess of) ‘wisdom’. Philosophy is therefore literally ‘the liking of wisdom’, and has come to mean ‘the use of reason and argument in seeking truth and knowledge’. Sapientia is the Latin word for ‘wisdom’, hence Homo sapiens is Latin for ‘wise man’. Γνώσις (gnosis) is the Greek word for ‘knowledge’. But in English it has mutated to mean ‘esoteric knowledge of spiritual mysteries’. Scientia is the Latin word for ‘knowledge’. Hence Arbor scientiae is the ‘tree of knowledge’ in the Garden of Eden. But the word ‘science’ has been specialized (indeed, mechanized) to mean a branch of knowledge conducted on prescribed principles involving the systematized observation of and experiments with phenomena, especially concerned with the material and
functions of the physical universe. This ingrained notion of science is an artefact of the age of analysis. Note that the full title of Principia, Isaac Newton’s 1687 masterwork, is Philosophiæ naturalis Principia mathematica (Mathematical principles of natural philosophy). ‘Natural philosophy’ had then been the term used to describe the subject that was the study of nature, while the word ‘science’ had been more specialized and referred to the Aristotelian concept of knowledge, that which was secure enough to be used as a sure prescription for exactly how to do something. John Locke, in An Essay Concerning Humane Understanding (1690), wrote that “natural philosophy is not capable of being made a science”, in the sense that a prescriptive scientific method was too restrictive to encompass the study of nature: an early statement of “not all processes are algorithmic”! I shall continue the exploration on science versus natural philosophy, in terms of analysis versus synthesis, in Chapter 7.
5.2 Αίτιον Aristotle was concerned with γνώσις, i.e. knowledge in its original general sense. He contended that one did not really know a ‘thing’ (which to Aristotle meant a natural system) until one had answered its ‘why?’ with its αίτιον (primary or original ‘cause’). In other words, Aristotle’s science is precisely the subjects for which one knows the answers to the interrogative ‘why?’. Aristotle’s original Greek term αίτιον (aition) was translated into the Latin causa, a word which might have been appropriate initially, but which had unfortunately diverged into our contemporary notion of ‘cause’, as ‘that which produces an effect’. The possible semantic equivocation may be avoided if one understands that the original idea had more to do with ‘grounds or forms of explanation’, so a more appropriate Latin rendering, in hindsight, would probably have been explanatio.
5.3 Whys and Wherefores The interrogative “why?” has a synonym in English in the archaic “wherefore?”.
One of Shakespeare’s most quoted lines
is often misunderstood. When Juliet asked, “O Romeo, Romeo, wherefore art thou Romeo?”, she was not checking his whereabouts, but asking why he had to be, of all things, a member of the hated rival Montague clan and inherited such an unfortunate family name. ‘Wherefore’ means ‘why’ — for what purpose, reason, or end — not ‘where’. Indeed, Juliet’s “wherefore” line leads in to her famous “What’s in a name?” speech. While ‘wherefore’ has disappeared from modern English usage (other than the not-infrequent misuse of “Wherefore art ...?” as a pretentious substitution of “Where is/are ...?”), the pleonasm ‘whys and wherefores’ has survived. The expression is at least as old as Shakespeare (Comedy of Errors, 1590): “Was there ever any man thus beaten out of season, When in the why and the wherefore is neither rhyme nor reason?” Note the singular, which was once a common form: that is the way Captain Corcoran, Sir Joseph, and Josephine sing it in Gilbert and Sullivan’s HMS Pinafore (1878): “Never mind the why and wherefore, Love can level ranks, and therefore, ...” The usual meaning is perhaps a bit more than just that of the individual words; the redundancy is used as a way to emphasize that what is needed is not just ‘a reason’, but ‘the whole reason’, or ‘all the causes’. The lyrics quoted above also provide a hint on the subtle differences between ‘why’ and ‘wherefore’. While the words themselves are synonymous, their corresponding answers take different (but equivalent) forms. Often, an answer to a question “why?” is “because”; i.e. an answer to “Why q ?” is “ q because p .”, which is the conditional statement in the form “ q , if p ”:
(1)
q ← p.
An answer to a question “wherefore?” can also be “because”, but is more congenially phrased as “therefore”; i.e. an answer to “Wherefore q ?” is “ p therefore q .”, which is the conditional statement in the form “ p , only if q ”: (2)
p → q.
The Prolegomenon contains further musings on conditional statements, and Louie [2007] interprets their implications in the context of the etymology of the Rosen lexicon.
Aristotle’s Four Causes 5.4 Relational Diagram Relational diagrams in graph-theoretic form made their first appearance in Chapter 9 of LI. A simple mapping f : A → B has the relational diagram
(3)
where a hollow-headed arrow denotes the flow from input in A to output in B , and a solid-headed arrow denotes the induction of or constraint upon this flow by the processor f . An unnecessarily degenerate interpretation of diagram (3) in completely mechanistic terms characterizes the flow as the software, and the processor as the hardware. The solid/hollow-headed arrow symbolism was first introduced in Section 9B of LI. Its form evolved a few
times, and settled on this depiction in arrow diagram (3) (which is [9E.4] and [10C.1] in LI). When the mapping is represented in the element-chasing version f : a ↦ b (cf. Remark 1.5), the relational diagram may be drawn as
(4)
(where I have also eliminated the dots that represent the vertices of the graph). 5.5 Entailment Symbol The processor and output relationship may be characterized ‘ f entails b ’ (Sections 5H and 9D in LI). I denote this entailment as
$$ f \vdash b \tag{5} $$
where ⊢ is called the entailment symbol. The graph-theoretic and entailment forms (3), (4), and (5) are models of the mapping, and are themselves examples of ‘modelling within mathematics’ as explained in Chapter 4.
5.6 Examples in the Natural and Formal Worlds A marble sculpture of Aristotle and a mapping f : a ↦ b (alternatively f : A → B and f ∈ H(A, B)) both provide excellent illustrations of Aristotle’s four causes. The former is a physical object in the natural world, and the latter is a mathematical object in the formal world. We therefore seek answers to the questions “Why marble sculpture of Aristotle?” and “Why mapping?”.
Recall (5.2) that Aristotle’s original idea of causation had more to do with ‘grounds or forms of explanation’ for a natural system. It is with this sense of ‘cause’ that I identify components of the two examples as their four Aristotelian causes. There is no philosophical problem with this exercise for the natural system that is the marble sculpture. But for the formal system that is a mapping, it requires further justification. As I explained in the previous chapter, through the modelling relation, mappings are the formal-system embodiment, in terms of their inferential entailment, of the causal entailment in natural systems. It is through the axiom “Every process is a mapping.” (cf. 4.15) that components of mappings represent the four causes. 5.7 Materia The material cause (of a thing) is “that out of which a thing comes into being and that which remains present in it”.
Thus for the marble sculpture, the material cause is the marble, while that out of which the mapping comes to be is its input a ∈ A . One may choose to identify the material cause as either the input element a or the input set, the domain A . 5.8 Forma The formal cause is “the account of the essence and the genera to which the essence belongs”.
The formal cause of the sculpture is its specifying features. The mapping’s form, or its statement of essence, is the structure of the mapping itself as a morphism. Note that the Greek term for ‘form’ is μορφή (morphé), the etymological root of ‘morphism’. The forms f ∈ H(A, B) and f : a ↦ b (i.e., the entailment patterns of the morphism f that maps a general element a ∈ A to its corresponding image b ∈ B) imply the relational diagrams (3) and (4) above; the formal cause of the mapping is thus the ordered pair of arrows
(6)
The arrows implicitly define the processor and the flow from input to output. The compositions of these arrows also need to follow the category rules. Alternatively, when the material cause, the exact nature of the input, is immaterial (which is the essence of relational biology), the formal cause may just be identified with the entailment symbol
$$ \vdash \tag{7} $$
which implicitly defines the processor and the output. The identification of a morphism with its formal essence (6) or (7) is an interpretation of the category axioms in ‘arrows-only’, i.e., graph-theoretic, terms. 5.9 Efficientia The efficient cause is “that which brings the thing into being, the source of change, that which makes what is made, the ‘production rule’”.
Thus the efficient cause of the sculpture is the sculptor, and for the mapping it is the function of the mapping as a processor. The difference between the formal cause and the efficient cause of a mapping is that the former is what f is (i.e., f ∈ H(A, B)), and the latter is what f does (i.e., f : a ↦ b). One may simply identify the efficient cause as the processor itself, whence also the solid-headed arrow that originates from the processor
(8)
5.10 Finis The final cause is “an end, the purpose of the thing, the ‘for the sake of which’”.
The final cause of a sculpture of Aristotle is to commemorate him. The purpose of the mapping, why it does what it does, is its output b ∈ B . One may choose to either identify the final cause as the output element b , or consider b to be the entailed effect and the output set, the codomain B , to be
the final cause. The Greek term τέλος (télos, translated into finis in Latin), meaning ‘end’ or ‘purpose’, covers two meanings: the end considered as the object entailed (i.e., b itself), or the end considered as the entailment of the object (i.e., the production of b). In both cases, the final cause may be identified as b, whence also the hollow-headed arrow that terminates on the output
(9)
One might, indeed, consider final cause to be ambiguous. It is either or both (i.e. the end considered as the object entailed or considered as the entailment of the object) depending on how one looks at it. In fact, an inherent ambiguity in final cause may be the underlying reason that it is an indispensable defining property of organisms. As a rational alternative to the claim that to speak of final cause is to ‘preach religion’, this inherent ambiguity may serve as a means to readmitting teleology back into science. I shall have more to say on the topics of ambiguity and impredicativity, and recast these in algebraic-topological terms, later on in this monograph. 5.11 Diagrams Material and formal causes are what Aristotle used to explain static things, i.e., things as they are, their being. Efficient and final causes are what he used to explain dynamic things, i.e. how things change and come into being, their becoming.
Here is a succinct summary of the four causes as components in the relational diagram of a mapping:
(10)
Without the material cause, the three causes are in the entailment diagram:
(11)
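As a rough illustration of the identifications made in 5.7 to 5.10, the following Python sketch labels the four causes for a single evaluation of a mapping. The names `FourCauses`, `causes_of`, and `square` are hypothetical and introduced only for this illustration; rendering the formal cause as a text pattern is an assumption of the sketch, not a formal definition.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class FourCauses(Generic[A, B]):
    """Hypothetical record of the four causes of one evaluation f : a ↦ b."""
    material: A                   # the input a (5.7)
    formal: str                   # the entailment pattern, shown here as text (5.8)
    efficient: Callable[[A], B]   # the processor f itself (5.9)
    final: B                      # the output b (5.10)


def causes_of(f: Callable[[A], B], a: A, name: str = "f") -> FourCauses[A, B]:
    """Evaluate f at a and label the pieces of f : a ↦ b by their causal roles."""
    b = f(a)
    return FourCauses(material=a, formal=f"{name} : {a!r} ↦ {b!r}",
                      efficient=f, final=b)


def square(n: int) -> int:
    """An arbitrary example mapping (an assumption of this sketch)."""
    return n * n


causes = causes_of(square, 3, name="square")
print(causes.formal)  # the element-chasing form of this one evaluation
```

Dropping the `material` field leaves the three causes that appear in the entailment form: the processor, the pattern, and the output.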
Connections in Diagrams 5.12 Ουροβόρος The ouroboros, the ‘tail-devouring snake’, is an ancient symbol depicting a serpent (or dragon) swallowing its own tail and forming a cycle. It often represents self-referencing in its many guises. I now consider the possible and impossible ouroboros for a mapping.
The relational diagram
(12)
represents the self-referencing processor
(13)
An interpretation of the arrow diagrams (12) and (13) is in order. The ‘self-referencing’ symbolism does not mean that one has ‘a(a) = b’. The situation represented is where a mapping f : A → B is uniquely determined by a specific element a ∈ A in its domain. As a simple example, for a fixed a0 ∈ A, consider the mapping f : A → {0,1} defined by f(a0) = 1, and f(a) = 0 for a ≠ a0; such f is, indeed, the characteristic mapping χ{a0} (cf. A.3). Each a0 ∈ A determines its corresponding characteristic mapping χ{a0} uniquely; the identification a ↔ χ{a} establishes a correspondence (i.e., an isomorphism) between A and a subset of the hom-set H(A, {0,1}) of morphisms from A to {0,1}.
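The correspondence a ↔ χ{a} can be checked on a small case. The sketch below is only an illustration under stated assumptions: the finite domain `A` and the helper name `chi` are hypothetical, and mappings are compared by tabulating their values over `A`.

```python
def chi(a0):
    """Return the characteristic mapping of the singleton {a0}, with values in {0, 1}."""
    def chi_a0(a):
        return 1 if a == a0 else 0
    return chi_a0


# A hypothetical three-element domain, chosen only for illustration.
A = ["x", "y", "z"]

# Tabulate each characteristic mapping over A so that mappings can be compared by value.
tables = {a0: tuple(chi(a0)(a) for a in A) for a0 in A}

# Distinct elements give distinct mappings, so a ↦ χ{a} is one-to-one.
is_injective = len(set(tables.values())) == len(A)

# There are 2^|A| mappings from A to {0, 1}; only |A| of them are of the form χ{a}.
number_of_all_mappings = 2 ** len(A)

print(tables, is_injective, number_of_all_mappings)
```

Each element is thus matched with exactly one mapping on A, which is the sense in which an element may stand in for a processor in diagrams (12) and (13).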
Isomorphic objects are considered categorically the same (see the Appendix for more details). Thus arrow diagrams (12) and (13), in this example, are abstract representations of ‘a ≅ χ{a}’. We shall encounter more examples of such self-referencing in Chapter 12, when I examine alternate realizations of (M,R)-systems. When the loop is at the other end, one has the relational diagram
(14)
which represents the self-inference
(15)
With the same consideration for isomorphic objects as above, arrow diagrams (14) and (15) only need to represent a mapping f : A → A for which a and f ( a ) are isomorphic (in the category concerned); in other words, since here
the domain and codomain are the same object, the mapping a ↦ f(a) is an automorphism. This may, depending on the emphasis, be on occasion interpreted as f : a ↦ a, which may in turn represent either the identity mapping 1_A ∈ H(A, A) or the fixed point a of the mapping f. The self-entailed mapping
(16)
has no [non-self] predecessor in efficient entailment, and as such may be referred to (albeit somewhat erroneously in theological terms) as a ‘Garden of Eden’ object. St. Thomas Aquinas (we shall encounter him again below in 5.19) wrote in his masterwork Summa Theologica that “There is no case known (neither is it, indeed, possible) in which a thing is found to be the efficient cause of itself; for so it would be prior to itself, which is impossible.” To Aquinas, a Garden of Eden object is necessarily entailed by a “first efficient cause” that is, of course, God. But I digress... The self-entailed mapping (16) is, in any case, an impossibility in the category Set, except trivially when its domain A is either empty or a singleton set, since the existence of f ⊢ f would involve an infinite hierarchy of hom-sets:
$$ f \in H\bigl(A, H(A, H(A, H(A, \dots)))\bigr). \tag{17} $$
Vacuously, for an empty domain, since H(∅, Y) = {∅} for any set Y (cf. A.4), the hierarchy of hom-sets in (17) collapses to {∅}. Thus f is the empty mapping ∅, whence the tautology ∅ ⊢ ∅ (“Nothing comes from
nothing.”, as it were). If A is a singleton set, then f is clearly determined by its only functional value. Note that given a mapping f, of course one has f itself. But this is the statement of the entailment of existence ‘f ⊢ ∃f’ (see 5.18 below on immanent causation). The ouroboros ‘f ⊢ f’ is a causation different in kind (see 5.15 on functional entailment). Note also that the impossibility of the existence of the nontrivial ouroboros f ⊢ f is a statement in naive set theory (and categories of sets with structure, which form the universe with which we are concerned). In hyperset theory [Aczel 1988], f ⊢ f does exist, and is precisely analogous to the prototypical hyperset equation Ω = {Ω}, which has a unique solution.
5.13 Sequential Composition Relational and entailment diagrams of mappings may be composed. For example, consider the two mappings f ∈ H(A, B) and g ∈ H(X, A): the codomain of g is the domain of f.
Thus
$$ X \xrightarrow{\;g\;} A \xrightarrow{\;f\;} B. \tag{18} $$
Let the element chases be f : a ↦ b (whence f ⊢ b) and g : x ↦ a (whence g ⊢ a): the final cause of g is the material cause of f. The relational diagrams of the two mappings connect at the common node a as
(19)
This sequential composition of relational diagrams represents the composite mapping f ∘ g ∈ H(X, B) with f ∘ g : x ↦ b, and has the abbreviated relational diagram
(20)
whence the corresponding entailment diagram is
$$ f \circ g \vdash b \tag{21} $$
Note that in these diagrams (20) and (21) for the single efficient cause f ∘ g, both efficient causes f and g, as well as the (final) final cause b, are accounted for.
5.14 Hierarchical Composition Now consider two mappings f ∈ H(A, B) and g ∈ H(X, H(A, B)): the codomain of g contains f. Because of this ‘containment’, the mapping g may be considered to occupy a higher ‘hierarchical level’ than the mapping f. Let the element chases be f : a ↦ b (whence f ⊢ b) and g : x ↦ f (whence g ⊢ f): the final cause of g is the efficient cause of f. Then one has the hierarchical composition of relational diagrams
(22)
with the corresponding composition of entailment diagrams
$$ g \vdash f \vdash b. \tag{23} $$
A comparison of (21) and (23) shows that sequential composition and hierarchical composition are different in kind: they are formally different. While the diagrams (22) and (23) may contract into something similar in form to (20) and (21), namely
(24)
and
$$ g(x) \vdash b, \tag{25} $$
in these abbreviated forms the entailed efficient cause f becomes ‘hidden’. Since the accounting (and tracking) of all efficient causes in an entailment system is crucial in our synthesis in relational biology (more on this in later chapters), one needs to preserve every solid-headed arrow and every entailment symbol ⊢. So there will not be any abbreviation of hierarchical compositions.
5.15 Functional Entailment g ⊢ f is a ‘different’ mode of entailment, in the sense that it entails a mapping: the final cause of one morphism is the efficient cause of another morphism. It is given the name of functional entailment (Section 5I of LI). When one is concerned not with what entails, but only what is entailed, one may simply use the notation
$$ \vdash f. \tag{26} $$
Note that there is nothing in category theory that mandates an absolute distinction between sets and mappings. Indeed, in the cartesian closed category Set (cf. A.53 and Example A.19(iii)), one has
$$ H\bigl(X, H(A, B)\bigr) \cong H(X \times A, B); \tag{27} $$
thus g ∈ H ( X , H ( A, B) ) that entails a mapping and has a hom-set as codomain may be considered equivalently as the isomorphic g ∈ H ( X × A, B ) that has a simple set as codomain. Functional entailment is therefore not categorically different; it is, however, formally different (as concluded previously). It warrants a new name because it plays an important role in the ‘closure to efficient causation’ characterization of life (which, incidentally, is the raison d’être of this monograph: much more on this later). 5.16 Other Modes of Connection Two mappings, with the appropriate domains and codomains, may be connected at different common nodes. We have already seen two:
(28)
is the sequential composition (19), and
(29)
is the hierarchical composition (22). These two connections are the only compositions of two mappings. For a connection to be a composition, the hollow-headed arrow of one mapping must terminate on a node of the other mapping: the first mapping must entail something in the second. So, after (28) and (29), the only remaining possibility of composition is the connection
(30)
But this configuration simply shows two mappings with a common codomain, and the mappings do not compose.
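The two compositions, together with the isomorphism (27), may be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names `compose`, `curry`, and `uncurry`, and the arithmetic mappings used as examples, are hypothetical choices made for this sketch.

```python
def compose(f, g):
    """Sequential composition: return f ∘ g, i.e. x ↦ f(g(x))."""
    def f_after_g(x):
        return f(g(x))
    return f_after_g


def uncurry(g):
    """Turn g : X → H(A, B) into the corresponding mapping X × A → B."""
    def h(x, a):
        return g(x)(a)
    return h


def curry(h):
    """Turn h : X × A → B back into a mapping X → H(A, B)."""
    def g(x):
        def f(a):
            return h(x, a)
        return f
    return g


# Sequential composition: the output of g_seq is the input of f.
def g_seq(x):
    return x + 1


def f(a):
    return 2 * a


seq_output = compose(f, g_seq)(3)   # f(g_seq(3)) = f(4)

# Hierarchical composition: the output of g_hier is itself a mapping.
def g_hier(x):
    def scale_by_x(a):
        return x * a
    return scale_by_x


entailed_f = g_hier(5)              # g entails the mapping f
hier_output = entailed_f(4)         # f then entails the output b

# The isomorphism (27): the same data viewed as a mapping on pairs.
flat_output = uncurry(g_hier)(5, 4)
round_trip = curry(uncurry(g_hier))(5)(4)
```

In the uncurried form the intermediate mapping `entailed_f` never appears as a value, which mirrors the remark in 5.14 that the entailed efficient cause becomes hidden in the abbreviated forms.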
The connection
(31)
is degenerate, because when the efficient causes of two mappings coincide, so must their domains and codomains (since a mapping uniquely determines its domain and codomain; cf. 1.8). Thus the geometry of (31), with two solid-headed arrows originating from the same vertex, is not allowed as an entailment pattern. The connection
(32)
shows that the domain of one mapping consists of mappings; i.e., the material cause of one is the efficient cause of the other. The two mappings have common features, but this entailment geometry is not a composition.
The final possibility, the ‘crossed-path’ connection
(33)
is bad notation, since it is unclear which solid-headed arrow is paired with which hollow-headed arrow. So to avoid confusion it should be resolved into two disjoint diagrams thus
(34)
Note that the patterns (30) and (32) may also be resolved without loss of entailment structure into two disjoint diagrams (34). 5.17 Multiple Connections and Resolution It is possible that two mappings are connected at more than one common node. The relational diagram, however, may be resolved into single connections for analysis. For example, the relational diagram
(35)
may be resolved into the hierarchical composition
(36)
while preserving the entailment
$$ \Phi \vdash f \vdash b. \tag{37} $$
Note that the phrase while preserving the entailment is important here. This is because diagram (35) may also be resolved into the sequential composition
(38)
which (cf. sequential composition 5.13 above) abbreviates to the relational diagram
(39)
whence the corresponding entailment diagram is
$$ \Phi \circ f \vdash f. \tag{40} $$
Comparing (37) with (40), one sees that the latter loses one entailment in the process. Thus one must be careful in a resolution analysis to preserve hierarchical compositions. Similarly, the relational diagram
(41)
may be resolved into the hierarchical composition
(42)
while preserving the entailment
$$ b \vdash \Phi \vdash f. \tag{43} $$
Thus the union of these two examples, the relational diagram
(44)
may be resolved into the hierarchical cycle (i.e. cycle with hierarchical compositions)
(45)
with the cyclic entailment
(46)
I shall explain entailment cycles in the next chapter, and I shall have a lot more to say on this particular entailment system (44)–(46) in Chapter 11.
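The way the two entailment chains (37) and (43) join into a cycle can be traced with a small Python sketch. It is only an illustration: the dictionary encoding of entailment and the helper name `entailment_chain` are assumptions of the sketch, not notation from the text.

```python
# Entailments read off (37) and (43): each key entails its value.
entails = {"f": "b", "b": "Φ", "Φ": "f"}


def entailment_chain(start, relation):
    """Follow entailments from start; report the chain and whether it returns to start."""
    chain = [start]
    current = relation.get(start)
    while current is not None and current not in chain:
        chain.append(current)
        current = relation.get(current)
    return chain, current == start


# The union of the two hierarchical compositions closes into a cycle.
chain, closes = entailment_chain("f", entails)

# With the entailments of (37) alone, the chain stops at b and does not close.
open_chain, open_closes = entailment_chain("Φ", {"Φ": "f", "f": "b"})

print(chain, closes, open_chain, open_closes)
```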
In beata spe
5.18 Immanent Causation Now let us consider the ‘inverse problem’ of entailment. If an object b is entailed, then there exists a morphism f such that f ⊢ b (which implicitly implies the existence of a set A, the domain of f, whence f ∈ H(A, B), and the existence of an element a ∈ A such that f : a ↦ b). In other words, entailment itself entails the existence of an efficient cause. In particular, if a morphism f ∈ H(A, B) is functionally entailed, then there exists a morphism g ∈ H(X, H(A, B)) (which implicitly implies the existence of a set X and an element x ∈ X) such that g : x ↦ f. Symbolically, this situation may be summarized
$$ (\vdash f) \vdash (\exists g : g \vdash f). \tag{47} $$
The entailment of the existence of something (often on a higher hierarchical level) is termed immanent causation (Section 9F of LI ) in philosophy. There are many different nuances in the various definitions of
immanent causation in the philosophical literature, but they all involve ‘an external agent causing something to exist’, hence ontological in addition to epistemological considerations. 5.19 St. Thomas Aquinas Ontological considerations necessitate an escape from the Newtonian trap of mechanistic simplification, in which epistemology entails and hence swallows ontology. In the pre-Newtonian science that is natural philosophy, a natural system is studied in terms of its existence and essence. See Chapter 17 of EL for a succinct discussion. The equation therein
CONCRETE SYSTEM = EXISTENCE + ESSENCE is something that could have come directly from St. Thomas Aquinas (1225– 1274). Aquinas’s writings include De Principiis Naturae (The Principles of Nature) and De Ente et Essentia (On Being and Essence), which explain Aristotelian (and post-Aristotelian) physics and metaphysics. Apart from his commentaries on Aristotle, Aquinas did not write anything else of a strict philosophical nature. But his theological works are full of philosophical insights that would qualify him as one of the greatest natural philosophers. Before Aquinas, theologians like St. Augustine placed their activities within a Platonic context. Aquinas absorbed large portions of Aristotelian doctrine into Christianity. It is, of course, not my purpose here to digress into a comparison between Plato and Aristotle. With gross simplification, one may say that Plato took his stand on idealistic principles, so that the general implies the particular, while Aristotle based his investigations on the physical world, so that the particular also predicts the general. Plato’s method is essentially deductive, while Aristotle’s is both inductive and deductive. Stated otherwise, for Plato the world takes its shape from ideas, whereas for Aristotle ideas take their shape from the world (cf. Natural Law discussed in Chapter 4).
Aristotle’s teleological view of nature may be summarized as “Nature does nothing in vain.” Using this regulative principle, Aristotle realized that the understanding of function and purpose is crucial to the understanding of nature. Aquinas elaborated on Aristotle’s science of ens qua ens (beings in their capacities of being), and developed his own metaphysical insight in so doing. To Aquinas, the key factor of any ‘reality as a reality’ was its existence, but existence was not just being present. A being is not a being by virtue of its matter, but a being of what it is, i.e. its essence. This constitutive principle is called esse, or ‘the act of existing’. Aquinas frequently used the Aristotelian dictum “vita viventibus est esse” (“for living things, to be is to live”). One can see in these Aristotelian and Thomistic principles the germ of relational biology: a natural system is alive not because of its matter, but because of the constitutive organization of its phenomenological entailment. The esse of an organism is this special entailment that shall be my subject of investigation later.

5.20 Exemplary Causation With the distinction between the concepts of essence and existence, Aquinas added a fifth cause, called exemplary cause. It is defined as “the causal influence exercised by a model or an exemplar on the operation of an agent”, i.e., ‘that which entails’. In terms of our first example, the marble sculpture of Aristotle, before the sculpture is realized, the model of the sculpture must already be in the mind of the efficient cause, the sculptor. The bauplan in the mind of the agent is the exemplary cause. In terms of our second example, the mapping $f : a \mapsto b$, when it is functionally entailed, say when $\exists g : g \vdash f$, the entailing mapping $g$ is the exemplary cause.
5.21 Michelangelo
The Artist

Nothing the greatest can conceive
That every marble block doth not confine
Within itself: and only its design
The hand that follows intellect can achieve.
The ill I flee, the good that I believe,
In thee, fair lady, lofty and divine,
Thus hidden lie; and so that death be mine,
Art, of desired success, doth me bereave.
Love is not guilty, then, nor thy fair face,
Nor fortune, cruelty, nor great disdain,
Of my disgrace, nor chance nor destiny,
If in thy heart both death and love find place
At the same time, and if my humble brain,
Burning, can nothing draw but death from thee.

Michelangelo Buonarroti, Sonnet 151 (composed c.1538–1544), translated by Henry Wadsworth Longfellow
As expounded in his sonnet, Michelangelo contended that the sculpture already existed in a block of marble. ‘To sculpt’ meant ‘to take away’, not ‘to add’. His philosophy is that the efficient cause, realized as the sculptor’s hand, “follows intellect” and takes out the superfluous surroundings to liberate the idea that is already extant. So the exemplary cause, instead of being the bauplan in the mind of the agent, is according to Michelangelo the hidden sculpture imprisoned in matter. Michelangelo’s alternate description, however, still fits the requirement of ‘that which entails’.
6 Topology
Network Topology For a collection of mappings in a formal system, their compositions may give rise to a very complicated pattern of inferential entailment in a network. Here is an example of what a relatively simple one may look like:
(1)
Recall (5.13 and 5.14) that two mappings compose sequentially when the final cause of one is the material cause of the other, while two mappings compose hierarchically when the final cause of one is the efficient cause of the other. The isomorphism between an efficient cause and its representation as a solid-headed arrow (5.9) provides an important link between a formal system and its relation diagram in graph-theoretic form:

6.1 Theorem A network G of inferential entailment in a formal system contains a particular efficient cause f if and only if the path of G in the relational diagram contains the solid-headed arrow that corresponds to f.

Because of the isomorphism, I shall use the same symbol G to denote both the network of inferential entailment and its relational diagram, and both are referred to as an entailment network. As we discussed in 5.6, the axiom “Every process is a mapping” allows us to identify components of mappings with the Aristotelian causes. In what follows, the ‘isomorphism’ between an efficient cause of a natural process and the corresponding efficient cause of its formal mapping that is the solid-headed arrow is invoked implicitly. Thus through the modelling relation, a description of a property of functional entailment either causally in a natural system or inferentially in its formal-system model is extended dually to the other domain.

A collection of interconnecting edges is called a graph, and called a directed graph (digraph for short) when every edge has an associated direction. Thus the relational diagram in graph-theoretic form of a formal system is a digraph. The digraph topology of relational diagrams is the subject of this chapter.

6.2 Analysis situs Topology is a nonmetric and nonquantitative mathematical discipline sometimes called ‘rubber-sheet geometry’. Its propositions hold as well for objects made of rubber, under deformations, as for the rigid figures from common metric geometry.
It deals with fundamental geometric properties that are unaffected when one stretches, shrinks, twists, bends, or
otherwise distorts an object’s size and shape (but without gluing or tearing). Another name for topology is analysis situs: analysis of position. Topology deals with different problems and ideas in several different branches, including general (point-set) topology, differential topology, and algebraic topology. Note that algebraic topology is the inspiration of category theory, which is the mathematical language of Robert Rosen’s modelling relation (cf. Chapter 4).

Topology began with a paper on the puzzle of the Königsberg bridges by Leonhard Euler (1707–1783), entitled Solutio problematis ad geometriam situs pertinentis (“The solution of a problem relating to the geometry of position”). The title indicates that Euler was aware that he was dealing with a different type of geometry where distance was not relevant. The paper was presented to the Academy of Sciences at St. Petersburg in 1735. It appeared in the 1736 edition of Commentarii Academiae Scientiarum Imperialis Petropolitanae, although the volume was not actually published until 1741. An English translation of the paper is reprinted in Euler [1736]. Euler described the problem thus:

“The problem, which I understand is quite well known, is stated as follows: In the town of Königsberg in Prussia there is an island A, called ‘Kneiphof’, with the two branches of the river (Pregel) flowing around it, as shown in Figure 1.
Figure 1. The seven Königsberg bridges.
There are seven bridges, a, b, c, d, e, f and g, crossing the two branches. The question is whether a person can plan a walk in such a way that he will cross each of these bridges once but not more than once. I was told that while some denied the possibility of doing this and others were in doubt, there were none who maintained that it was actually possible. On the basis of the above I formulated the following very general problem for myself: Given any configuration of the river and the branches into which it may divide, as well as any number of bridges, to determine whether or not it is possible to cross each bridge exactly once.” 6.3 Graph Theory The puzzle of the Königsberg bridges is a classic exercise in topology, a ‘network problem’. Network topology is more commonly known as graph theory. (Two good references on graph theory are Trudeau [1993] and Gross & Yellen [1999], to which the reader may refer for further exploration on the subject. Note, however, that some terminology has not been standardized, so the definitions that I present in what follows may differ from another author’s usage.) Euler’s method of solution (recast in modern terminology here) was to replace the land areas by vertices, and the bridges by edges connecting these vertices:
(2)
The resulting topological object is called a graph, in which only the configuration of the vertices and edges in terms of their relative ordering and
connections is important, but not the distances. Graph (2) is, indeed, Euler’s formal-system model of the natural system of the Königsberg bridges. The problem of crossing the bridges is then encoded into that of traversing the graph with a pencil without lifting it from the paper, in one continuous trace of the edges, passing along each edge exactly once. A graph for which this is possible is called traversable, and the continuous trace is now known as an Eulerian path. Stated otherwise, the geometric problem of crossing the Königsberg bridges now becomes a graph-theoretic one: to determine whether graph (2) is traversable. A (graph) edge may be considered as an unordered pair of vertices that are its end points. For example, in (2) the edge f is the unordered pair of vertices {B, D} . When the two end points are identical, i.e. when an edge
joins a vertex to itself, the edge is called a self-loop. Graph (2) has no self-loops. Note that the presence of self-loops does not affect the continuity of a trace of the edges: a graph with self-loops is traversable if and only if the graph obtained by eliminating all the self-loops is traversable.

A path in a graph is a continuous trace of edges. In other words, a path from vertex $A$ to vertex $B$ in a graph $G$ is a consecutive sequence of edges $\{v_0, v_1\}, \{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{n-1}, v_n\}$ (where each edge $\{v_{i-1}, v_i\}$ is in $G$) with $v_0 = A$ and $v_n = B$. What we have in (2) is more properly called a connected graph, a graph in which there is a path from any vertex to any other vertex. But since we are only concerned with the topic of traversability here, we need only deal with connected graphs for now: a disconnected graph is clearly not traversable.
Multiple edges are two or more edges joining the same two vertices in a graph. In (2), a and b are multiple edges, since they are both {A, B} . A simple graph is a graph that contains no self-loops and no multiple edges. A multigraph is a graph in which multiple edges, but no self-loops, are permitted. Euler’s graph of the Königsberg bridges in (2) is a multigraph. A pseudograph is a graph in which both self-loops and multiple edges are permitted.
The number of edges meeting at a vertex is called its degree. Note that a self-loop contributes 2 to the degree of its (only) vertex. A vertex is called odd or even according to its degree. Since an edge connects two vertices, it follows that the sum of the degrees over all vertices in a graph is even, whence a graph must have an even number of odd vertices. Euler reasoned that, with the possible exception of the initial vertex and the final vertex, each vertex of a traversable graph had to have even degree. This was because in the middle of a journey, when passing through a land area, one had to enter on one bridge and exit on a different bridge; so each visit to a land area added two to the degree of the corresponding vertex. In the beginning of the journey, the traveler required only one bridge to exit; and at the end, only one bridge to enter — so these two vertices might be odd. If, however, the point of departure coincided with the point of arrival, then this vertex would also have even degree. A path is closed when final vertex = initial vertex. A closed Eulerian path is called an Eulerian circuit. Thus Euler discovered that if a graph has only even vertices, then it is traversable; also, the journey may begin at any vertex, and after the trace the journey ends at the same initial vertex. If the graph has exactly two odd vertices, then it is still traversable, but it is not possible to return to the starting point: the journey must begin at one odd vertex and end at the other. If the graph has more than two odd vertices, then it is not traversable. The general principle is that, for a positive integer n , if the graph contains 2n odd vertices, then it will require exactly n distinct journeys to traverse it. In the graph (2) of the Königsberg bridges, all four vertices are odd; the graph is therefore not traversable. 
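Euler's parity argument can be checked mechanically. The following is a minimal sketch, not Euler's own procedure: it counts odd vertices in graph (2) and classifies the graph accordingly. The text fixes only the bridges a, b = {A, B} and f = {B, D}; the other four incidences are an assumption taken from the usual presentation of the puzzle, and the function names are illustrative.

```python
from collections import Counter

# Bridges of graph (2) as unordered vertex pairs. The text fixes a, b = {A, B}
# and f = {B, D}; the remaining four are ASSUMED from the usual layout.
BRIDGES = {
    "a": ("A", "B"), "b": ("A", "B"),
    "c": ("A", "C"), "d": ("A", "C"),
    "e": ("A", "D"), "f": ("B", "D"), "g": ("C", "D"),
}

def degrees(edges):
    """Degree of each vertex; a self-loop (u, u) contributes 2."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def classify(edges):
    """Classify a connected graph by its number of odd vertices."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    if odd == 0:
        return "Eulerian circuit"
    if odd == 2:
        return "Eulerian path"
    return f"not traversable ({odd // 2} journeys needed)"

konigsberg = list(BRIDGES.values())
print(dict(degrees(konigsberg)))
print(classify(konigsberg))
```

Under the assumed incidences all four vertices come out odd, in agreement with the text, so the classification is 'not traversable' with two journeys needed. Connectivity is taken for granted here, as it is in the discussion above.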
In other words, Euler provided a mathematical proof, as some of the Königsbergers had empirically verified, that a walk crossing each of the seven bridges exactly once was impossible. In sum, Euler proved the following
6.4 Theorem (a) A graph possesses an Eulerian circuit if and only if its vertices are all of even degree. (b) A graph possesses an Eulerian path if and only if it has either zero or two vertices of odd degree. 6.5 Digraph A graph in which every edge has an associated direction (i.e., each edge has an initiating vertex and a terminating vertex) is called a directed graph, or digraph for short. In a digraph, a (directed) edge may be considered as an ordered pair of vertices that are its end points. For example, in the relational diagram
(3)
representing the mapping $f : a \mapsto b$, the solid-headed arrow is a directed edge represented by the ordered pair of vertices $(f, a)$, and the hollow-headed arrow is a directed edge represented by the ordered pair of vertices $(a, b)$. In a directed graph, the degree of a vertex has to be split into two entities. The number of edges terminating at a vertex $v$ (i.e., inwardly directed edges on $v$, or directed edges of the form $(i, v)$) is the indegree of $v$. The number of edges initiating from a vertex $v$ (i.e., outwardly directed edges on $v$, or directed edges of the form $(v, i)$) is the outdegree of $v$. Traversability for a digraph has an additional requirement: the underlying (undirected) graph itself has to be traversable first, but the Eulerian path also has to follow the directions of the edges. Explicitly, one has the
6.6 Theorem (a) A directed graph possesses an Eulerian circuit if and only if the indegree of every vertex is equal to its outdegree. (b) A directed graph possesses an Eulerian path if and only if the indegree of every vertex, with the possible exception of two vertices, is equal to its outdegree. For these two possibly exceptional vertices, the indegree of one is one smaller than its outdegree, and the indegree of the other is one larger than its outdegree.
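The degree conditions of Theorem 6.6 translate directly into a short check. This is a sketch under the assumption that the digraph is connected (the theorem's degree conditions alone do not test that); the arc lists and function names are hypothetical examples, not diagrams from the text.

```python
from collections import Counter

def in_out_degrees(arcs):
    """Return (indegree, outdegree) counters for a list of arcs (tail, head)."""
    indeg, outdeg = Counter(), Counter()
    for tail, head in arcs:
        outdeg[tail] += 1
        indeg[head] += 1
    return indeg, outdeg

def classify_digraph(arcs):
    """Apply the degree conditions of Theorem 6.6 (connectivity is assumed)."""
    indeg, outdeg = in_out_degrees(arcs)
    vertices = set(indeg) | set(outdeg)
    # Keep only the vertices whose outdegree differs from their indegree.
    surplus = {v: outdeg[v] - indeg[v] for v in vertices if outdeg[v] != indeg[v]}
    if not surplus:
        return "Eulerian circuit"
    if sorted(surplus.values()) == [-1, 1]:
        return "Eulerian path"
    return "not traversable"

cycle = [("p", "q"), ("q", "r"), ("r", "p")]   # every vertex balanced
chain = [("p", "q"), ("q", "r")]               # p starts, r ends
fork = [("p", "q"), ("p", "r")]                # p has two surplus outgoing arcs
```

The three toy digraphs exercise the three cases: balanced everywhere, exactly one start and one end vertex, and a vertex whose imbalance exceeds one.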
Traversability of Relational Diagrams

6.7 Ordered-Pairwise Construction While an entailment network is a digraph, the reverse is not true: a general digraph is not necessarily the entailment network of a formal system. The partitioning of directed edges into solid-headed and hollow-headed arrows in a relational diagram comes with stringent requirements on the topology of an entailment network. In particular, the ordered-pairwise construction of solid-headed and hollow-headed arrows in the formal cause diagram
(4)
predicates that in an entailment network G of a formal system, a solid-headed arrow must be followed by a hollow-headed arrow. This also implies that the number of solid-headed arrows and hollow-headed arrows in G must be equal (therefore the total number of edges must be even). 6.8 Four Degrees of a Vertex In an entailment network, the degree of a vertex v has to be split into four entities: the number of inwardly directed solid-headed arrows, the number of inwardly directed hollow-headed arrows, the number of outwardly directed solid-headed arrows, and the number of outwardly directed hollow-headed arrows. I shall denote these four numbers ε i ( v ) , τ i ( v ) , ε o ( v ) , and τ o ( v ) , respectively. (The ε is for the efficient
cause that is the solid-headed arrow, and the τ is for the telos, final cause, that is the hollow-headed arrow; the i and o are for ‘in’ and ‘out’.) The indegree of v is therefore ε i ( v ) + τ i ( v ) , and the outdegree ε o ( v ) + τ o ( v ) . The requirement that a solid-headed arrow must be followed by a corresponding hollow-headed arrow says (5)
ε i ( v ) = τ o ( v ) for all v ∈ G . There are some other relations among the four degrees. One must have
(6) $\sum_v \varepsilon_i(v) = \sum_v \varepsilon_o(v)$

(where the sum is over all vertices of the graph, i.e. $\sum_v = \sum_{v \in G}$); this sum is simply the total number of solid-headed arrows. Similarly, the sum

(7) $\sum_v \tau_i(v) = \sum_v \tau_o(v)$

is the total number of hollow-headed arrows. Since the entailment network $G$ of a formal system must contain the same number of solid-headed and hollow-headed arrows (it also follows from the equality (5)), the four sums appearing in (6) and (7) are in fact all equal. If I call the common value of these four sums $n$, then the total number of edges (arrows) in the graph $G$ is $2n$, and the sum of the degrees over all vertices in $G$ is $4n$.

6.9 Example To fix ideas, consider the relational diagram
(8)
which we have met in the previous chapter (diagram (44) of Chapter 5). It has four vertices and six edges — three each of solid-headed and hollow-headed arrows. The sum of the degrees over all vertices is twice the number of edges, hence 12. Vertices a and Φ have indegree 1 and outdegree 1, while vertices b and f have indegree 2 and outdegree 2. Thus by Theorem 6.6(a), as a digraph, (8) is traversable and has an Eulerian circuit. As a relational diagram, (8) may have the degrees of its vertices enumerated thus:
(9) $\begin{cases} (\varepsilon_i(a), \tau_i(a), \varepsilon_o(a), \tau_o(a)) = (1,0,0,1) \\ (\varepsilon_i(b), \tau_i(b), \varepsilon_o(b), \tau_o(b)) = (1,1,1,1) \\ (\varepsilon_i(f), \tau_i(f), \varepsilon_o(f), \tau_o(f)) = (1,1,1,1) \\ (\varepsilon_i(\Phi), \tau_i(\Phi), \varepsilon_o(\Phi), \tau_o(\Phi)) = (0,1,1,0) \end{cases}$,

with

(10) $n = \sum_v \varepsilon_i(v) = \sum_v \tau_i(v) = \sum_v \varepsilon_o(v) = \sum_v \tau_o(v) = 3$.
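The four degrees can be computed from a list of mappings, since each mapping contributes exactly one solid-headed and one hollow-headed arrow. The sketch below does this bookkeeping. The list of three mappings is my own reconstruction, an assumption chosen only because it reproduces table (9); the drawing of diagram (8) is not available here, and the names are illustrative.

```python
from collections import defaultdict

# Each mapping f : a |-> b is stored as (f, a, b). It contributes one
# solid-headed arrow (f, a) and one hollow-headed arrow (a, b).
# ASSUMPTION: this list is a reconstruction of diagram (8), chosen only
# because it reproduces the degree table (9).
MAPPINGS = [("f", "a", "b"), ("Phi", "b", "f"), ("b", "f", "Phi")]

def four_degrees(mappings):
    """Return {v: (eps_i, tau_i, eps_o, tau_o)} for every vertex v."""
    deg = defaultdict(lambda: [0, 0, 0, 0])
    for efficient, material, final in mappings:
        deg[efficient][2] += 1   # solid-headed arrow leaves the efficient cause
        deg[material][0] += 1    # ... and enters the material cause
        deg[material][3] += 1    # hollow-headed arrow leaves the material cause
        deg[final][1] += 1       # ... and enters the final cause
    return {v: tuple(d) for v, d in deg.items()}

table = four_degrees(MAPPINGS)
n = len(MAPPINGS)
column_sums = [sum(d[k] for d in table.values()) for k in range(4)]
total_degree = sum(sum(d) for d in table.values())
```

With this input, `table` agrees with (9), each entry of `column_sums` equals n = 3 as in (10), and `total_degree` is 4n = 12. Because the arrows are generated in solid-then-hollow pairs, equality (5) holds by construction.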
6.10 Further Constraints A relational diagram has more constraints on its topology than the equal number of solid-headed and hollow-headed arrows. In 5.16, I explained why the bifurcation
(11)
is a forbidden degenerate pattern. This means a vertex in a relational diagram G may have at most one solid-headed arrow initiated from it; i.e. for each v∈G , (12)
ε o ( v ) = 0 or 1 .
Likewise, the ‘crossed-path’ connection
(13)
is disallowed, whence there cannot be more than one solid-headed arrow terminating at a vertex. Thus for each v ∈ G , (14)
ε i ( v ) = 0 or 1 .
6.11 Alternating Arrows When tracing a path in a relational diagram G, one, of course, must follow the direction of the arrows. In addition, when tracing functional entailment paths in the network, a proper path must consist of an alternating sequence of solid-headed and hollow-headed arrows. So, for traversability as a digraph, Theorem 6.6 says that (a) G has an Eulerian circuit if and only if (15)
ε i ( v ) + τ i ( v ) = ε o ( v ) + τ o ( v ) for all v ∈ G ,
and (b) $G$ has an Eulerian path if and only if the equality (15) of indegree and outdegree holds with one possible pair of exceptional vertices $v_1$ and $v_2$, for which

(16) $\begin{cases} [\varepsilon_o(v_1) + \tau_o(v_1)] - [\varepsilon_i(v_1) + \tau_i(v_1)] = 1 \\ [\varepsilon_i(v_2) + \tau_i(v_2)] - [\varepsilon_o(v_2) + \tau_o(v_2)] = 1 \end{cases}$.
Since the condition ε i ( v ) = τ o ( v ) for all v ∈ G must be satisfied by any relational diagram, traversable or not (equation (5) above), one has the corresponding, more specific, theorem for traversability as a relational diagram: 6.12 Theorem
(a) A relational diagram (of the entailment pattern of a formal system) possesses an Eulerian circuit if and only if (17)
ε i ( v ) = τ o ( v ) and τ i ( v ) = ε o ( v )
for all v ∈ G .
(b) A relational diagram possesses an Eulerian path if and only if the equalities in (17) hold with the possible exception of two vertices $v_1$ and $v_2$. For these two possibly exceptional vertices,

(18) $\begin{cases} \varepsilon_o(v_1) - \tau_i(v_1) = 1 \\ \tau_i(v_2) - \varepsilon_o(v_2) = 1 \end{cases}$.
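Theorem 6.12 can be applied to a table of four-degree tuples. The sketch below assumes a connected diagram; it first insists on equality (5) and then inspects only the difference between ε_o and τ_i, which is all that remains of (16) once (5) holds. The tuples for diagram (8) are copied from (9); the single-mapping table is a hypothetical illustration.

```python
# Degree tuples (eps_i, tau_i, eps_o, tau_o) copied from table (9).
DIAGRAM_8 = {
    "a": (1, 0, 0, 1),
    "b": (1, 1, 1, 1),
    "f": (1, 1, 1, 1),
    "Phi": (0, 1, 1, 0),
}

def classify_relational(degrees):
    """Classify by Theorem 6.12; assumes a connected diagram satisfying (5)."""
    if any(ei != to for ei, ti, eo, to in degrees.values()):
        raise ValueError("eps_i(v) = tau_o(v) fails: not a relational diagram")
    # With (5) in force, only eps_o(v) - tau_i(v) can be unbalanced.
    surplus = sorted(eo - ti for ei, ti, eo, to in degrees.values() if eo != ti)
    if not surplus:
        return "Eulerian circuit"
    if surplus == [-1, 1]:
        return "Eulerian path"
    return "not traversable"

# Hypothetical single mapping f : a |-> b, drawn on its own.
SINGLE_MAPPING = {"f": (0, 0, 1, 0), "a": (1, 0, 0, 1), "b": (0, 1, 0, 0)}
```

Diagram (8) satisfies both equalities in (17) at every vertex, so it is classified as having an Eulerian circuit, matching the conclusion of 6.9. The lone mapping has f as the exceptional vertex $v_1$ and b as $v_2$ in (18), so it has an Eulerian path only.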
The Topology of Functional Entailment Paths 6.13 Tree The mappings in an entailment network may compose in such a way that no paths are closed, so that the arrows connect in branches of a tree; for example
(19)
Note that ‘cycles’ may form in the non-directed graph sense, for example
(20)
But since one must follow the direction of the arrows when tracing a path, a ‘cycle’ with some reversed arrows is not a cycle, i.e. not a closed path in the digraph sense of a relational diagram. 6.14 Closed Path of Material Causation A path in a relation diagram is closed, i.e. forms cycles, however, if the arrows involved have a consistent direction. When the compositions involved in the cycle are all sequential, one has a closed path of material causation. For example, when three mappings have a cyclic permutation of domains and codomains,
(21)
f ∈ H ( A, B ) ,
g ∈ H ( B, C ) ,
h ∈ H ( C , A) ,
their sequential compositions result in a cycle consisting of hollow-headed arrows entirely (with solid-headed arrows peripheral to the cycle):
(22)
The three mappings compose to

(23) $h \circ g \circ f : A \to A$,

which, as we saw in 5.12, may, depending on the emphasis, be interpreted as the automorphism

(24) $a \cong h \circ g \circ f(a)$,

the identity mapping

(25) $h \circ g \circ f = 1_A \in H(A, A)$,

or the fixed point $a$ of the mapping $h \circ g \circ f$,

(26) $h \circ g \circ f(a) = a$.

Cyclic permutation of the three mappings also gives

(27) $f \circ h \circ g : B \to B$

and

(28) $g \circ f \circ h : C \to C$,
with the corresponding automorphism, identity mapping, and fixed point interpretations in their appropriate domains. It is easy to see that the number of mappings involved in a closed path of material causation may be any finite number (instead of three in the example), and the above discussion may be extended accordingly. Thus a closed path of material causation is formally analogous to the simple relation diagram with a self-loop
(29)
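A small finite instance may make the cyclic compositions (23), (27) and (28) concrete. In this sketch the three sets and three mappings are hypothetical, and they are deliberately chosen so that the identity-mapping interpretation (25) holds; in general a closed path of material causation need not compose to the identity.

```python
# Three hypothetical finite mappings with cyclically permuted domains and
# codomains: f : A -> B, g : B -> C, h : C -> A (the sets are illustrative).
f = {1: "p", 2: "q"}        # A = {1, 2},  B = {"p", "q"}
g = {"p": "u", "q": "v"}    # C = {"u", "v"}
h = {"u": 1, "v": 2}

def compose(*maps):
    """compose(h, g, f) is the mapping x |-> h(g(f(x))), stored as a dict."""
    *outer, first = maps
    result = dict(first)
    # Apply the remaining mappings from the innermost outwards.
    for m in reversed(outer):
        result = {x: m[y] for x, y in result.items()}
    return result

hgf = compose(h, g, f)   # A -> A, as in (23)
fhg = compose(f, h, g)   # B -> B, as in (27)
gfh = compose(g, f, h)   # C -> C, as in (28)
```

Each of the three composites returns every element of its domain to itself, so every element is a fixed point in the sense of (26).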
6.15 Closed Path with Exactly One Efficient Cause When all but one of the compositions involved in a cycle are sequential, the exception being a hierarchical composition, one has the following situation:
(30)
The mappings

(31) $f(a) = x_1$ with $g_1(x_1) = x_2$, $g_2(x_2) = x_3$, ..., $g_{n-1}(x_{n-1}) = x_n$, and $g_n(x_n) = f$

compose to

(32) $g_n \circ g_{n-1} \circ \cdots \circ g_1 \circ f(a) = f$,

or

(33) $\Phi \circ f(a) = f$ where $\Phi = g_n \circ g_{n-1} \circ \cdots \circ g_1$.
Thus the cycle (30) may be abbreviated as
(34)
which is one of the multiple connections we encountered in 5.17. Note that only one mapping (namely f ) is functionally entailed in this topology. Stated otherwise, in this cycle there is exactly one solid-headed arrow. 6.16 Closed Path of Efficient Causation When two or more compositions involved in the cycle are hierarchical, one has a closed path of efficient causation. In other words, a closed path of efficient causation is an entailment cycle that contains two or more efficient causes.
For example, consider three mappings from a hierarchy of hom-sets,

(35) $f \in H(A, B)$, $g \in H(C, H(A, B))$, $h \in H(D, H(C, H(A, B)))$

with entailments

(36) $f \vdash b$, $g \vdash f$, $h \vdash g$.
Their hierarchical compositions form the relational diagram
(37)
Now suppose there is a correspondence between the sets $B$ and $H(D, H(C, H(A, B)))$ (I shall explain one of the many ways to achieve this correspondence in the next section, and some alternate ways in later chapters). Then an isomorphic identification between $b$ and $h$ may be made, whence $f \vdash b$ may be replaced by

(38) $f \vdash h$,
and a cycle of hierarchical compositions results
(39)
with the corresponding cyclic entailment pattern
(40)
Formally, one has the

6.17 Definition A hierarchical cycle is the relational diagram in graph-theoretic form of a closed path of efficient causation.
Note that in a hierarchical cycle (for example, arrow diagram (39)), there are two or more solid-headed arrows (since a closed path of efficient causation is defined as a cycle containing two or more hierarchical compositions). Because of Definition 6.17, that a hierarchical cycle is the formal-system representation (i.e. encoding) of a closed path of efficient causation in a natural system, trivially one has the following 6.18 Lemma A natural system has no closed path of efficient causation if and only if none of its models has hierarchical cycles.
Stated contrapositively, the statement is 6.19 Corollary A natural system has a model containing a hierarchical cycle if and only if it has a closed path of efficient causation.
Because of this equivalence of a closed path of efficient causation in a natural system and a hierarchical cycle in its model, the term hierarchical cycle, although defined for formal systems, sometimes gets decoded back as an alternate description of the closed path of efficient causation itself. In other words, one may speak of a hierarchical cycle of inferential entailments as well as a hierarchical cycle of causal entailments. Thus ‘hierarchical cycle’ joins the ranks of ‘set’, ‘system’, etc., as words that inhabit both the realms of natural systems and formal systems.
Algebraic Topology 6.20 Homology A hierarchical cycle may be constructed in terms of a sequence of algebraic-topological hom-sets { H n } . Define
(41)
H0 = B ,
(42)
H1 = H ( A, B ) ,
and (43)
H n = H ( H n−2 , H n−1 ) for n ≥ 2 .
Analogous to modular arithmetic $\equiv_m$ (cf. 1.15), the infinite sequence $\{H_n\}$ reduces to $m$ members $\{H_0, H_1, \ldots, H_{m-1}\}$ if one has

(44) $H_n \cong H_j$ if $n \equiv j \pmod{m}$, for $n \geq m$ and $j = 0, 1, \ldots, m-1$;

or, equivalently,

(45) $H_{mk} \cong H_0$, $H_{mk+1} \cong H_1$, ..., $H_{mk+m-1} \cong H_{m-1}$, for $k = 0, 1, 2, \ldots$

or

(46) $H_n \cong H_{n-m}$ for $n \geq m$.
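The recursion (41) to (43) and the index bookkeeping of (44) can be sketched symbolically. The code below only manipulates names of hom-sets as strings; it does not and cannot establish the isomorphisms themselves. The function names are illustrative assumptions.

```python
def hom_set(n):
    """Symbolic name of H_n from (41)-(43): H_0 = B, H_1 = H(A, B), ..."""
    if n == 0:
        return "B"
    if n == 1:
        return "H(A, B)"
    return f"H({hom_set(n - 2)}, {hom_set(n - 1)})"

def projected_level(n, m):
    """Index j in {0, ..., m-1} with H_n identified with H_j, as in (44)."""
    return n % m

for n in range(7):
    print(n, "->", projected_level(n, 3), ":", hom_set(n))
```

Syntactically the names keep growing with n; only the projection onto the residue class of n modulo m is periodic. This is one way to see why the identification in (46) has to come from outside the syntax.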
This semantic correspondence is an ‘infinity-to-$m$ hierarchical projection’ that closes the hierarchical cycle. The isomorphism $H_n \cong H_{n-m}$ is something that cannot be derived from syntax alone. One needs to know something about the maps involved, i.e. the semantics, to reach this identification.

For example, suppose $m = 3$ (as in the three-mapping example in the previous section). One way to establish $H_n \cong H_{n-3}$ is as follows. An $x \in H_{n-3}$ defines an evaluation map $\hat{x} \in H(H_{n-1}, H_{n-2})$ by

(47) $\hat{x}(z) = z(x) \in H_{n-2}$ for $z \in H_{n-1}$.

If one imposes the semantic requirement that $\hat{x}$ be monomorphic, then the existence of the inverse map

(48) $\hat{x}^{-1} \in H(H_{n-2}, H_{n-1}) = H_n$

is entailed, whence

(49) $x \mapsto \hat{x}^{-1}$

is the embedding that allows $x \in H_{n-3}$ to be interpreted as a map $x \cong \hat{x}^{-1} \in H_n$. Again, the identification $x \cong \hat{x}^{-1}$ is something that can only be reached through semantics and not syntax alone.
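The evaluation map (47) and its inversion can be illustrated on a toy scale. In this sketch x is a number and each z is an ordinary function of a number; these are hypothetical stand-ins for elements of the hom-sets, and the inverse is tabulated only on the values actually reached, which is a simplification of (48).

```python
def evaluation_map(x):
    """Return x_hat, the map z |-> z(x) of equation (47)."""
    return lambda z: z(x)

def invert_on(x_hat, candidates):
    """Tabulate x_hat on `candidates` and invert it; fails unless injective."""
    table = {}
    for z in candidates:
        value = x_hat(z)
        if value in table:
            raise ValueError("x_hat is not monomorphic on these candidates")
        table[value] = z
    return table

# Hypothetical stand-ins: x is a number, each z is a function of x.
def double(t):
    return 2 * t

def square(t):
    return t * t

x_hat = evaluation_map(3)
inverse = invert_on(x_hat, [double, square])   # maps a value back to its z
```

Evaluating at 3 separates the two candidate maps, so the inverse exists on the image. Evaluating at 2 would not, since both maps send 2 to 4; that is the situation excluded by the monomorphism requirement.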
With the hom-sets and mappings constructed thus, one has, given x ∈ H n−3 , y ∈ H n−2 , and z ∈ H n−1 , the simple cyclic entailment pattern
(50)
and the relational diagram
(51)
which one may also present in a ‘dual’ circular element-chasing version
(52)
6.21 Helical Hierarchy The infinity-to-three hierarchical projection is summarized succinctly in the following graph of helical hierarchy:
(53)
The helix has three (or $m$ for the general case) hierarchical levels per turn in an apparent ever-increasing infinite sequence. But the isomorphism of hom-sets means that

(54) $H_{3k} \cong H_0$, $H_{3k+1} \cong H_1$, $H_{3k+2} \cong H_2$, for $k = 0, 1, 2, \ldots$

so there are in fact only three ($m$) distinct maps:

(55) $\begin{cases} x \in H_0 = B \\ y \in H_1 = H(A, B) \\ z \in H_2 = H(B, H(A, B)) \end{cases}$
with their entailment pattern in cyclic permutation, shown in diagram (53) as the bottom circle that is the vertical projection of the helix. The bottom circle is, of course, simply the element-chasing digraph (52). From diagrams (51) and (52), one sees that for $x \in H_0 = B$, $y \in H_1 = H(A, B)$, and $z \in H_2 = H(B, H(A, B))$,

(56) $x \circ z \circ y \in H(H_2, H_2)$, $z \circ y \circ x \in H(H_1, H_1)$, and $y \circ x \circ z \in H(H_0, H_0)$

are automorphisms, which may also, depending on the context, be interpreted as identity morphisms or fixed points in the appropriate objects. Note, in particular, that for this closed path of efficient causation one may have the identity morphism

(57) $x \circ z \circ y = 1_{H_2} = 1_{H(B, H(A, B))}$;

compare this with $h \circ g \circ f = 1_A \in H(A, A)$ in (25) for three mappings in a closed path of material causation; note in particular the different hierarchical levels to which the morphisms belong.

I note in passing that the algebraic ‘infinity-to-$m$ hierarchical projection’ and the geometric ‘helical hierarchy’ just introduced have a topological analogue. In complex analysis, there is a general method for turning a many-valued ‘mapping’ of a complex variable into a single-valued mapping of a point on a complex manifold, a Riemann surface. The standard geometric illustration of the piecing together of a Riemann surface from ‘sheets with slits’ resembles our graph (53) of the hierarchical helix and its projection.
6.22 Hypersets We have already encountered, in 5.12, the prototypical hyperset $\Omega = \{\Omega\}$ as an analogue of the ouroboros $f \vdash f$, which is not a Set-object. The Set-theoretic hierarchical cycle inhabits both worlds: it also has its analogue in hyperset theory. For example, a two-mapping hierarchical cycle
(58)
is isomorphic to the (solution of the) hyperset (equation) Ω = {{Ω}} , while a three-mapping hierarchical cycle
(59)
is isomorphic to the hyperset $\Omega = \{\{\{\Omega\}\}\}$. In general, an $n$-mapping hierarchical cycle is isomorphic to the hyperset $\Omega = \underbrace{\{\{\cdots\{}_{n}\,\Omega\,\underbrace{\}\cdots\}\}}_{n}$. The interested reader is referred to Aczel [1988] for all the details on hyperset theory.
Closure to Efficient Causation 6.23 Definition A natural system is closed to efficient causation if its every efficient cause is entailed within the system, i.e., if every efficient cause is functionally entailed within the system.
It is important to note that ‘closure to efficient causation’ is a condition on efficient causes, not on material causes. Thus a system that is closed to efficient causation is not necessarily a ‘closed system’ in the thermodynamic sense. (In thermodynamics, a closed system is one that is closed to material causation, i.e., a system that allows energy but not matter to be exchanged across its boundary.) Let N be a natural system and let κ ( N ) be all efficient causes in N . If
$N$ is closed to efficient causation, one may symbolically write

(60) $\forall f \in \kappa(N)\ \exists \Phi \in \kappa(N) : \Phi \vdash f$.
6.24 Eulerian Circuit In terms of relation diagrams G , ‘every efficient cause functionally entailed’ means that if a vertex v initiates a solid-headed arrow, it must terminate at least one hollow-headed arrow. Due to ε o ( v ) = 0 or 1 (restriction (12)), this simply means that if ε o ( v ) = 1 one must
have τ i ( v ) ≥ 1. That is,
(61)
τ i ( v ) ≥ ε o ( v ) for all v ∈ G .
Note that as a consequence of this inequality, the Eulerian path condition (Theorem 6.12(b)) (62)
ε o ( v1 ) − τ i ( v1 ) = 1
for an exceptional vertex v1 cannot occur.
The equality

(63) $\sum_v \tau_i(v) = \sum_v \varepsilon_o(v)$

(cf. the argument leading to (6)=(7) in 6.8) in fact turns the inequality (61) into an equality

(64) $\tau_i(v) = \varepsilon_o(v)$ for all $v \in G$.

This is because if

(65) $\tau_i(v_1) > \varepsilon_o(v_1)$ for some $v_1 \in G$,

it would force

(66) $\tau_i(v_2) < \varepsilon_o(v_2)$ for some other $v_2 \in G$

in compensation, in order to keep the sums in (63) equal; but (66) contradicts (61). Thus, with the equality (64), and the equality (5) which is satisfied by all relational diagrams, one has the requisite conditions of Theorem 6.12(a) for Eulerian circuits:

6.25 Theorem Closure to efficient causation for a natural system means it has a formal system model in which all of the efficient causes in its causal entailment structure are contained in closed paths; i.e., all efficient causes are components of hierarchical cycles.

6.26 More Than Hierarchical Cycles Note that ‘closed to efficient causation’ is more stringent than simply ‘containing a hierarchical cycle in its entailment pattern’. The latter property only requires some, but not necessarily all, efficient causes to be part of a hierarchical cycle; on the other hand, the former property requires all efficient causes to be in hierarchical cycles.
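The compensation argument behind (64) can be seen on a small table of degrees. The sketch below tests inequality (61) vertex by vertex. The first table is (9) for diagram (8); the second is a hypothetical variant of my own in which one extra mapping w : c ↦ a is added and nothing entails w.

```python
def closed_to_efficient_causation(degrees):
    """Check inequality (61): tau_i(v) >= eps_o(v) at every vertex v.

    `degrees` maps each vertex to (eps_i, tau_i, eps_o, tau_o).
    """
    return all(ti >= eo for ei, ti, eo, to in degrees.values())

def unentailed_efficient_causes(degrees):
    """Vertices that start a solid-headed arrow but end no hollow-headed one."""
    return sorted(v for v, (ei, ti, eo, to) in degrees.items() if eo > ti)

# Degree table (9) of diagram (8).
diagram_8 = {"a": (1, 0, 0, 1), "b": (1, 1, 1, 1),
             "f": (1, 1, 1, 1), "Phi": (0, 1, 1, 0)}

# Hypothetical variant: diagram (8) plus one extra mapping w : c |-> a
# whose efficient cause w is entailed by nothing.
with_w = dict(diagram_8, a=(1, 1, 0, 1), c=(1, 0, 0, 1), w=(0, 0, 1, 0))
```

In the variant, vertex a now ends a hollow-headed arrow without starting a solid-headed one, so the strict inequality (65) holds at a; the sums in (63) still balance, and the deficit appears at w, exactly as the argument predicts. The variant therefore contains a hierarchical cycle without being closed to efficient causation.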
As an example, consider the following entailment diagram
(61)
While the three maps $\{x, y, z\}$ form a hierarchical cycle in its entailment structure, the map $w$ is not entailed, whence (the system represented by) this entailment pattern is not closed to efficient causation.

Herein lies a cause of the confusion on the term ‘closure to efficient causation’. Some people use it to mean ‘a closed path containing some efficient causes exists’, instead of Theorem 6.25 ‘all of the efficient causes are contained in closed paths’ that is a consequence of Definition 6.23. The discrepancy is, however, simply due to their different usage of the word ‘closure’ (or ‘closed’), rather than an outright error on their part. (Humpty Dumpty is never far away!) It still remains that systems satisfying the more stringent universal (‘for all’) condition form a proper subset of systems satisfying the existential (‘for some’) condition. Note that a ‘universal characterization’ is consistent with other mathematical usage of the term ‘closure’. For example, in topology, a subset of a metric space is closed if it contains all (not just some) of its cluster points; in algebra, a set is closed under a binary operation if the result of combining every pair (not just some pairs) of elements of the set is also included in the set.

6.27 Connected Components One must also note that ‘closed to efficient causation’ only means that every efficient cause is part of a hierarchical cycle, but it is not necessary to have one single hierarchical cycle that contains all efficient causes. The causal entailment patterns (and therefore the inferential
159
entailment networks) need not be connected: but each network is a collection of connected components. So the requirement for ‘closed to efficient causation’ is that in each connected component all of the efficient causes are contained in a single close path. By Theorem 6.1, one has 6.28 Theorem A natural system is closed to efficient causation if and only if each connected component in its relational diagram has a closed path that contains all the solid-headed arrows.
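The difference between the universal (‘for all’) and existential (‘for some’) readings can be checked mechanically on a small example. The following Python sketch rests on a simplifying assumption that is not in the text: the entailment pattern is flattened into a directed graph on the maps alone, with an edge a → b read as ‘a entails b’. The function names and the edge leaving w are hypothetical, chosen only so that w is not itself entailed.

```python
def reachable(graph, start):
    """Nodes reachable from start along one or more edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def on_closed_path(graph, node):
    """True if some closed path passes through node."""
    return node in reachable(graph, node)


def closed_universal(graph):
    """'For all' reading: every map lies on a closed path."""
    return all(on_closed_path(graph, n) for n in graph)


def closed_existential(graph):
    """'For some' reading: at least one map lies on a closed path."""
    return any(on_closed_path(graph, n) for n in graph)


# Assumed encoding: x entails y, y entails z, z entails x;
# w entails x (hypothetical), but nothing entails w.
example = {"x": ["y"], "y": ["z"], "z": ["x"], "w": ["x"]}
cycle_only = {"x": ["y"], "y": ["z"], "z": ["x"]}
```

Under this encoding, `example` satisfies only the existential condition, whereas `cycle_only` satisfies both.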
This brings us back to the topological topic of traversability we discussed in the beginning of this chapter. In (a connected component that is) a multigraph, a cycle that contains all the solid-headed arrows necessarily, because of the pairwise construction of the solid-headed and hollow-headed arrows, contains all the hollow-headed arrows as well. Indeed, the cycle must contain all the solid-headed arrows and hollow-headed arrows in an alternating sequence. A cycle containing all the arrows (recall 6.3) corresponds to the graph-theoretic concept of Eulerian circuit. In pseudographs, since a single arrow may form a self-loop, a cycle that contains all the solid-headed arrows may or may not be an Eulerian circuit. But this cycle will still be part of an Eulerian path. I shall revisit traversability when I explicate (M,R)-systems and their realizations in Chapters 11 and 12.
PART III Simplex and Complex
To find the simple in the complex, the finite in the infinite — that is not a bad description of the aim and essence of mathematics. — Jacob T. Schwartz (1986) Discrete Thoughts: Essays on Mathematics, Science and Philosophy Chapter 7
With lattice theory from Part I and modelling theory from Part II (along with category theory from the Appendix), we are now equipped to make our approach to the subject of Part III, the dichotomy of simplexity versus complexity.
7 The Category of Formal Systems
Categorical System Theory
In Chapter 4, I considered formal system as a primitive in our epistemology, and described it as an object in the universe of mathematics. ‘An object in the universe of mathematics’ may, of course, be interpreted as an ‘object’ in an appropriate category. Now I formalize the term, without loss of generality, in the
7.1 Definition A formal system is a pair $\langle S, F\rangle$, where $S$ is a set, and $F$ is a collection of observables of $S$, i.e. $F \subset \bullet^S$, such that $0 \in F$, where $0$ is (the equivalence class of) the constant mapping on $S$.
Recall (Definition 2.23) that an observable of $S$ is a mapping $f$ with domain $S$. Because of the $0 \in F$ requirement, the collection $F$ of observables of $S$ is always nonempty. Quite frequently, we are more interested in the equivalence relation $R_f$ induced on $S$ by $f$ rather than $f$ itself. So we may pass on to equivalence classes in $\bullet^S$, and consider $F \subset \bullet^S/{\sim} \cong \mathbb{Q}S$ (cf. 2.24). This, incidentally, explains the ‘equivalence class of’ in the definition of the constant mapping $0 \in F$. $0$ is not necessarily the ‘zero mapping’ (one that sends every element of $S$ to the number zero), since the codomain is not required to contain the number zero. The important function of $0$ is that $R_0 = U$, the universal relation on $S$, which simply serves to identify the set $S$ itself. The universal relation $U$, indeed, recognizes the property of ‘belonging to $S$’, which is a primitive concept in set theory.
As in any formal object that is a ‘set with structure’, when the ‘specifying structure’ $F$ is understood, one may sometimes refer to a formal system $\langle S, F\rangle$ by its underlying ‘set of states’ $S$ (where a state is simply defined as a ‘member of $S$’, i.e., the formal-system analogue of its natural-system counterpart; cf. 4.4). On the other hand, since $0 \in F$ implicitly entails $S$, a formal system may (analogous to the ‘arrows-only’ interpretation of the category axioms; cf. A.1) also be considered as defined by $F$ itself. In other words, a formal system is characterized by its set of observables, i.e., by an operational definition.
That $F$ is a collection of observables of $S$, $F \subset \bullet^S$, means $F \in \mathcal{P}(\bullet^S)$. But because of the base point $0 \in F$, $F$ cannot be an arbitrary element of $\mathcal{P}(\bullet^S)$: one actually has $F \in \mathcal{P}_0(\bullet^S)$ [see 2.6 for a discussion of the pointed power set $\mathcal{P}_0 A$ of a set $A$]. But since both $\mathcal{P}A$ and $\mathcal{P}_0 A$ are complete lattices [see 2.12] (which will be the important fact invoked later), for simplicity of notation I shall continue with $\mathcal{P}$ instead of $\mathcal{P}_0$ (unless the latter is explicitly required for clarity).
7.2 Resolution Associated with a formal system $\langle S, F\rangle$ there is an equivalence relation $R_F \in \mathbb{Q}S$ defined by $R_F = \bigwedge_{f \in F} R_f$ (cf. Definition 2.15 and Lemmata 2.16 and 2.34). In this notation, $R_{\{0, f\}} = R_f$. One may say that the formal system $\langle S, F\rangle$ is characterized by $R_F$, or has resolution $R_F$.
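For a finite set of states the resolution can be computed directly: two states are related by $R_F$ exactly when every observable in $F$ gives them the same value. The following Python sketch illustrates this; the state set and the observables `f`, `g`, `h` are hypothetical examples, not taken from the text.

```python
def classes(states, observables):
    """Equivalence classes of R_F: s R_F s' iff f(s) == f(s') for every f in F."""
    blocks = {}
    for s in states:
        key = tuple(f(s) for f in observables)
        blocks.setdefault(key, set()).add(s)
    return {frozenset(b) for b in blocks.values()}


S = range(6)
zero = lambda s: 0        # the constant observable 0
f = lambda s: s % 2       # hypothetical observable
g = lambda s: s % 3       # hypothetical observable
h = lambda s: s           # hypothetical observable separating every state

R_0 = classes(S, [zero])            # the universal relation U: a single class
R_f = classes(S, [f])
R_0f = classes(S, [zero, f])        # same classes as R_f
R_F = classes(S, [zero, f, g])      # meet of R_0, R_f and R_g
R_H = classes(S, [zero, h])         # a different collection of observables
```

In this example `R_0f` coincides with `R_f`, and the two distinct collections `{zero, f, g}` and `{zero, h}` have the same resolution, so the collection of observables cannot be read back from the resolution alone.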
Note that the correspondence $F \mapsto R_F$ is a projection, whence some information is lost in the process: there is in general no way to recover the collection $F$ of observables, or $\{R_f : f \in F\}$, from the single equivalence relation $R_F = \bigwedge_{f \in F} R_f$. Also note that there may not be an observable $h \in F$ that generates the equivalence relation $R_F$, i.e., although mathematically there exists $h \in \bullet^S$ such that $R_F = R_h$ (cf. Theorem 2.20), $h$ (and all of its $\sim$-equivalent observables) may not be in $F$.
7.3 The Category S An observable $f$ of $S$ is only required to have domain $S$; its codomain may be any arbitrary set. It is conventional to take $\mathrm{cod}(f) = \mathbb{R}$, the set of real numbers (whence $f \in \mathbb{R}^S$). The evaluation of a real-valued mapping on a set $S$ is a formal metaphor of the measurement process. For most of our purposes $\mathrm{cod}(f) = \mathbb{R}$ is sufficient, but Definition 7.1 allows further generalizations when required (cf. the discussion on qualitative and quantitative in 2.25). With restriction to the real codomain, a formal system in the more general Definition 7.1 is identical to that defined in our previous theses [FM, CS].
In Categorical System Theory [CS], I studied the category S of all formal systems, in which the objects are pairs $\langle S, F\rangle$ where $S$ is an arbitrary set and $F \subset \mathbb{R}^S$. Now I generalize the category S to have objects $\langle S, F\rangle$ with $F \subset \bullet^S$.
7.4 S-Morphism An S-morphism $\varphi \in \mathbf{S}((S_1, F_1), (S_2, F_2))$ is a pair of mappings $\varphi \in \mathbf{Set}(S_1, S_2)$ and $\varphi \in \mathbf{Set}(F_1, F_2)$ such that for all $f \in F_1$ and for all $s, s' \in S_1$,
(1) $f(s) = f(s') \Rightarrow (\varphi f)(\varphi s) = (\varphi f)(\varphi s')$,
i.e., $s R_f s'$ implies $(\varphi s) R_{\varphi f} (\varphi s')$. Note that this compatibility condition (1) is equivalent to saying that for all $G \subset F_1$ and for all $s, s' \in S_1$, $s R_G s'$ implies $(\varphi s) R_{\varphi G} (\varphi s')$, where $\varphi G = \{\varphi f : f \in G\} \subset F_2$. This means that for all $G \subset F_1$, $\varphi$ can be considered as a mapping from $S_1/R_G$ to $S_2/R_{\varphi G}$.
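On finite systems the compatibility condition (1) can be tested exhaustively. The Python sketch below does so; it assumes, purely for illustration, that observables are stored in dictionaries keyed by name and that the observable part of a morphism is a dictionary from names to names. The function `is_s_morphism` and all example data are hypothetical.

```python
from itertools import product


def is_s_morphism(S1, F1, F2, phi_state, phi_obs):
    """Check condition (1) on a finite system.

    F1, F2 map observable names to functions; phi_state maps states of S1
    to states of S2; phi_obs maps names in F1 to names in F2.
    """
    for name, f in F1.items():
        phi_f = F2[phi_obs[name]]
        for s, t in product(S1, repeat=2):
            if f(s) == f(t) and phi_f(phi_state(s)) != phi_f(phi_state(t)):
                return False
    return True


S1 = [0, 1, 2, 3]
F1 = {"0": lambda s: 0, "f": lambda s: s // 2}
F2 = {"0": lambda t: 0, "g": lambda t: t}      # observables on S2 = {0, 1}

# States collapse along the classes of f, and f is sent to g: compatible.
good = is_s_morphism(S1, F1, F2, lambda s: s // 2, {"0": "0", "f": "g"})
# Sending every observable to 0 is compatible with any state mapping.
to_zero = is_s_morphism(S1, F1, F2, lambda s: s % 2, {"0": "0", "f": "0"})
# States 0 and 1 agree on f but their images disagree on g: not compatible.
bad = is_s_morphism(S1, F1, F2, lambda s: s % 2, {"0": "0", "f": "g"})
```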
One always defines $\varphi 0 = 0$. This is compatible because $0s = 0s'$ implies $0(\varphi s) = 0(\varphi s')$. Note also that for any observable $f$, the assignment $\varphi f = 0$ is acceptable.
7.5 Identity Define $1_{(S,F)} \in \mathbf{S}((S, F), (S, F))$ by, for all $s \in S$, $s \mapsto s$ and, for all $f \in F$, $f \mapsto f$. Thus for all $G \subset F$, $G \mapsto G$. Then $1_{(S,F)}$ satisfies the compatibility condition (1).
7.6 Composition in S Define composition of morphisms in S as simultaneously the compositions on the states and on the observables; i.e., if $\varphi : (S_1, F_1) \to (S_2, F_2)$ and $\psi : (S_2, F_2) \to (S_3, F_3)$, define $\psi \circ \varphi : (S_1, F_1) \to (S_3, F_3)$ by, for every $s \in S_1$, $\psi \circ \varphi(s) = \psi(\varphi(s))$ and, for every $f \in F_1$, $\psi \circ \varphi(f) = \psi(\varphi(f))$. Note for $f \in F_1$ and $s, s' \in S_1$, $s R_f s'$ implies $(\varphi s) R_{\varphi f} (\varphi s')$, which in turn implies $\psi(\varphi s) R_{\psi(\varphi f)} \psi(\varphi s')$; so $\psi \circ \varphi$ satisfies the compatibility condition (1). One easily verifies that composition so defined is associative, and for $\varphi : (S_1, F_1) \to (S_2, F_2)$, $1_{(S_2,F_2)} \circ \varphi = \varphi = \varphi \circ 1_{(S_1,F_1)}$.
7.7 Isomorphism If $\varphi : (S_1, F_1) \to (S_2, F_2)$ and $\psi : (S_2, F_2) \to (S_1, F_1)$ are such that $\psi \circ \varphi = 1_{(S_1,F_1)}$ and $\varphi \circ \psi = 1_{(S_2,F_2)}$, then it is easy to see that $\varphi : S_1 \to S_2$ and $\varphi : F_1 \to F_2$ must be bijections (Set-isomorphisms) and that for $f \in F_1$ and $s, s' \in S_1$, $f(s) = f(s')$ if and only if $(\varphi f)(\varphi s) = (\varphi f)(\varphi s')$, i.e., for every $G \subset F_1$, $S_1/R_G = S_2/R_{\varphi G}$. Thus isomorphic systems are abstractly the same in the sense that there is a ‘dictionary’ (one-to-one correspondence) between the states and between the observables inducing the ‘same’ equivalence relations on the states. In particular, if $F$ and $G$ are two sets of observables on $S$ and there is a bijection $\varphi : F \to G$ such that for all $f \in F$, $f \sim \varphi f$, then the two systems $(S, F)$ and $(S, G)$ are isomorphic with the S-isomorphism $1_S : S \to S$, $\varphi : F \to G$.
Since categorical constructions are only unique up to isomorphism, in the category S all constructions $(S, F)$ are only ‘unique up to $\sim$-equivalent observables’ (i.e., one can always replace $F$ by a $\sim$-equivalent set of observables $G$ in the above sense) even when the set of states $S$ is held fixed. This last comment is particularly important for all constructions in S.
Constructions in S
I now briefly explore several category-theoretic constructions in S that will be of use later. The reader may consult CS for further detailed examples and proofs.
7.8 Product Products in the category S exist. For a family $\{(S_i, F_i) : i \in I\}$, the product is $(S, F) = \prod_{j \in I} (S_j, F_j)$ with an $I$-tuple of S-morphisms of the form $\pi_i : (S, F) \to (S_i, F_i)$. $S$ is defined as the cartesian product $\prod_{j \in I} S_j$ of the sets of states. $F$ is defined as the ‘cartesian product’ $\prod_{j \in I} F_j$ of the sets of observables interpreted as follows: $f = (f_j : j \in I) \in F$ is an observable of $S$ defined by
(2) $(f_j : j \in I)(s_j : j \in I) = (f_j(s_j) : j \in I)$.
Note that the codomain of $f$ is the cartesian product set of the codomains of the $f_j$s. The projections are obviously defined by $\pi_i((s_j : j \in I)) = s_i$ and $\pi_i((f_j : j \in I)) = f_i$. It is easily checked that the $\pi_i$s are indeed S-morphisms.
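For two finite systems, the product construction and the claim that the projections are S-morphisms can be sketched in Python as follows. This is only an illustration for a binary product; the helper names `s_product` and `projection_is_compatible`, the dictionary representation of observables, and the example systems are assumptions made here.

```python
from itertools import product


def s_product(S1, F1, S2, F2):
    """Binary S-product: pairs of states, pairs of observables as in (2)."""
    states = list(product(S1, S2))
    observables = {}
    for (n1, f1), (n2, f2) in product(F1.items(), F2.items()):
        # Bind f1, f2 as defaults so each closure keeps its own pair.
        observables[(n1, n2)] = lambda s, f1=f1, f2=f2: (f1(s[0]), f2(s[1]))
    return states, observables


def projection_is_compatible(states, observables, F_i, i):
    """Check condition (1) for pi_i: (s_1, s_2) -> s_i, (f_1, f_2) -> f_i."""
    for names, f in observables.items():
        f_i = F_i[names[i]]
        for s, t in product(states, repeat=2):
            if f(s) == f(t) and f_i(s[i]) != f_i(t[i]):
                return False
    return True


S1, F1 = [0, 1, 2], {"0": lambda s: 0, "f": lambda s: s % 2}
S2, F2 = ["a", "b"], {"0": lambda s: 0, "g": lambda s: s == "a"}
S, F = s_product(S1, F1, S2, F2)
```

The product observable keyed by `("0", "0")` is constant on `S`, so it plays the role of the observable 0 of the product system.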
The terminal object in S is $(1, \{0\})$ where $1$ is the singleton set, the terminal Set-object. The unique S-morphism from any system to $(1, \{0\})$ is the one that sends all states to $1$ and all observables to $0$.
7.9 Meet as Product Let $f$ and $g$ be observables on $S$. Recall (Lemma 2.34) that the meet $R_{fg} = R_f \wedge R_g$ of their equivalence relations $R_f$ and $R_g$ on $S$ is defined by
(3) $s R_{fg} s'$ if and only if $f(s) = f(s')$ and $g(s) = g(s')$.
Note that there may not be an observable of $S$ that generates the equivalence relation $R_{fg}$, i.e., although mathematically there exists $h \in \bullet^S$ such that $R_{fg} = R_h$, the set of all possible observables of $S$, as a representation of a natural system, may not contain (the $\sim$-equivalence class of) this $h$.
There is always an embedding $\varphi : S/R_{fg} \to S/R_f \times S/R_g$ that maps $(s)_{fg} \mapsto ((s)_f, (s)_g)$. Via this embedding, a state $s \in S$ is represented by the pair of numbers $(f(s), g(s))$. This embedding $\varphi$ is in general one to one, but it is onto if and only if $f$ and $g$ are totally unlinked (to each other; cf. Lemma 2.33).
This Set-product representation can be constructed neatly as an S-product. Consider the two systems $(S, \{f, 0\})$ and $(S, \{g, 0\})$. The S-product of these two systems is $(S \times S, F)$, where $F = \{0, (f, 0), (0, g), (f, g)\}$, with the natural projections. Now consider further the system $(S, \{f, g\})$. There exist S-morphisms
(4) $\varphi_1 : (S, \{f, g\}) \to (S, \{f, 0\})$ and $\varphi_2 : (S, \{f, g\}) \to (S, \{0, g\})$
defined by, for every $s \in S$,
(5) $\varphi_1(s) = s$; $\varphi_1 f = f$, $\varphi_1 g = 0$
and
(6) $\varphi_2(s) = s$; $\varphi_2 f = 0$, $\varphi_2 g = g$.
Hence by the universal property of the product, there exists a unique $\varphi : (S, \{f, g\}) \to (S \times S, F)$ that makes the diagram commute. Namely, $\varphi$ is defined by sending $s \in S$ to $\varphi(s) = (s, s)$ (the diagonal map) and by $\varphi f = (f, 0)$, $\varphi g = (0, g)$. So one has the following diagram:
(7)
In particular, $\varphi$ being an S-morphism implies that $\varphi : S/R_{\{f, g\}} \to (S \times S)/R_{\{(f,0),(0,g)\}}$. $\varphi$ is a one-to-one mapping (on $S$) and $R_{\{f, g\}} = R_{fg}$, hence $S/R_{\{f, g\}} = S/R_{fg}$. Also, $(S \times S)/R_{\{(f,0),(0,g)\}} \cong S/R_f \times S/R_g$. Thus $\varphi$ is indeed the one-to-one map from $S/R_{fg}$ to $S/R_f \times S/R_g$, and the degree of onto-ness of $\varphi$ is an indication of the lack of linkage between $f$ and $g$. The onto-ness of a morphism is discussed in the Appendix in A.44.
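The onto-ness of this embedding is easy to test on a finite set of states: label each class by the value of its observable and compare the pairs actually realized with all conceivable pairs. The Python sketch below does this; the function names and the observables `mod2`, `mod3`, `mod4` are hypothetical illustrations.

```python
def embedding_image(states, f, g):
    """Image of S/R_fg in S/R_f x S/R_g, with classes labelled by values."""
    return {(f(s), g(s)) for s in states}


def embedding_is_onto(states, f, g):
    """True if every pair (f-class, g-class) is realized by some state."""
    f_values = {f(s) for s in states}
    g_values = {g(s) for s in states}
    return len(embedding_image(states, f, g)) == len(f_values) * len(g_values)


S = range(12)
mod2 = lambda s: s % 2
mod3 = lambda s: s % 3
mod4 = lambda s: s % 4
```

Here `mod2` and `mod3` realize all six pairs of values, so the embedding is onto. The value of `mod4` determines that of `mod2`, so only four of the eight conceivable pairs occur and the embedding is not onto.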
7.10 Equalizer For S-morphisms $\varphi, \psi : (S_1, F_1) \to (S_2, F_2)$, $\mathrm{eq}(\varphi, \psi) = (E, H)$ may not exist. The equalizer would have to be given by $E = \{s \in S_1 : \varphi s = \psi s\}$, $H = \{f|_E : f \in F_1, \varphi f = \psi f\}$ and $\iota : (E, H) \to (S_1, F_1)$ would be the inclusion morphism. But $\iota(f|_E) = f$ may not be uniquely defined because there may be another $g \in F_1$ such that $g|_E = f|_E$ and $\varphi g = \psi g$. Thus an S-equalizer only exists when the inclusion map from $H$ to $F_1$ is single-valued.
Note when $(E, H) = \mathrm{eq}(\varphi, \psi)$ does exist, $\iota : (E, H) \to (S_1, F_1)$ has the property that for all $s, s' \in E$ and for all $g \in H$, $g(s) = g(s')$ if and only if $(\iota g)(s) = (\iota g)(s')$, i.e., $E/R_g \cong \iota(E)/R_{\iota g}$. Further, any S-morphism $\chi : (X_1, G_1) \to (X_2, G_2)$ that is one-to-one on the states and on the observables, and that has this property (that $X_1/R_g \cong \chi(X_1)/R_{\chi g}$ for all $g \in G_1$) is an equalizer. It is easy to construct a pair of S-morphisms $\varphi_1, \varphi_2$ with domain $(X_2, G_2)$ such that $(X_1, G_1) = \mathrm{eq}(\varphi_1, \varphi_2)$. Thus although S does not have equalizers for every pair of S-morphisms, given an S-morphism $\varphi$ with the correct properties one can always find a pair of S-morphisms for which $\varphi$ is the equalizer.
7.11 Coproduct The category S has coproducts. The coproduct is $(S, F) = \coprod_{j \in I} (S_j, F_j)$ where $S = \coprod_{j \in I} S_j$ is the coproduct of the $S_j$s in Set (i.e., the disjoint union $S = \bigcup_{j \in I} \{j\} \times S_j$) and $F = \{0\} \cup \{(j, f) : j \in I, f \in F_j, f \neq 0\}$ defined as follows. For $f \in F_j$, $f \neq 0$, the observable $(j, f)$ of $S$ is defined by
(8) $(j, f)(k, s) = \begin{cases} f(s) & \text{if } j = k \\ (k, s) & \text{if } j \neq k \end{cases}$.
The natural injections are $\iota_j : (S_j, F_j) \to (S, F)$ with $\iota_j(s) = (j, s)$ for $s \in S_j$, and $\iota_j(f) = (j, f)$ for $f \in F_j$ with $f \neq 0$ and $\iota_j(0) = 0$.
The initial object in Set is the empty set $\emptyset$, thence the initial object in S is $(\emptyset, \{0\})$. For any system $(S, F)$, the unique S-morphism from $(\emptyset, \{0\})$ to $(S, F)$ is the empty mapping on $\emptyset$ with $0 \mapsto 0 \in F$.
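The piecewise definition (8) translates directly into code. In the following Python sketch a state of the disjoint union is a pair `(k, s)`; the helper names and the example observable `f` are hypothetical.

```python
def coproduct_observable(j, f):
    """Observable (j, f) on the disjoint union, following (8)."""
    def obs(tagged_state):
        k, s = tagged_state
        return f(s) if k == j else (k, s)
    return obs


def injection(j):
    """State part of the natural injection: s -> (j, s)."""
    return lambda s: (j, s)


S0, S1 = [0, 1, 2, 3], ["a", "b"]
f = lambda s: s % 2                                  # hypothetical observable on S0
S = [(0, s) for s in S0] + [(1, s) for s in S1]      # disjoint union of S0 and S1
obs = coproduct_observable(0, f)
values = [obs(x) for x in S]
```

On the summand `S0` the observable `obs` agrees with `f`, and on the other summand it returns the tagged state itself, so it distinguishes every state lying outside `S0`.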
7.12 Coequalizer The category S also has coequalizers, constructed as follows. Let $\varphi, \psi : (S_1, F_1) \to (S_2, F_2)$. Let $Q = S_2/R$ where $R$ is the intersection of all equivalence relations on $S_2$ containing $\{(\varphi(s), \psi(s)) \in S_2 \times S_2 : s \in S_1\}$ and of all $R_{F_2}$. So in particular for $t, t' \in S_2$, $t R t'$ implies for all $g \in F_2$, $g(t) = g(t')$. Let $\chi : S_2 \to Q$ be the canonical projection $\chi(t) = (t)_R$. This takes care of the map on the states.
As for the observables, let $R$ on $F_2$ be the intersection of all equivalence relations containing $\{(\varphi f, \psi f) \in F_2 \times F_2 : f \in F_1\}$, and let $\chi : F_2 \to H = F_2/R$ be, naturally, $\chi g = (g)_R$, such that $R_{(g)_R}$ is the equivalence relation on $S_2$ generated by $\{R_{g'} : g' \in (g)_R\}$, i.e., $R_{(g)_R}$ is the finest equivalence relation on $S_2$ such that it is refined by each of the $R_{g'}$, $g' \in (g)_R$. Putting it another way, $R_{(g)_R}$ is defined to be the supremum of the family $\{R_{g'} : g' \in (g)_R\}$ in the lattice of all equivalence relations on $S_2$. One sees, then, that $R_{(g)_R}$ is refined by $R$ on $S_2$ and hence $(g)_R$ is well defined on $Q = S_2/R$, so one can consider $\mathrm{dom}\,(g)_R = Q$.
Finally, to check that $\chi : (S_2, F_2) \to (Q, H)$ so defined is indeed an S-morphism, let $g \in F_2$ and $t, t' \in S_2$; then $g(t) = g(t')$ implies $(g)_R(t) = (g)_R(t')$, hence $(g)_R (t)_R = (g)_R (t')_R$. So $t R_g t'$ does imply $\chi(t) R_{\chi g} \chi(t')$. Further, $\chi \circ \varphi = \chi \circ \psi$.
Now if $\chi' : (S_2, F_2) \to (Q', H')$ is such that $\chi' \circ \varphi = \chi' \circ \psi$, then $\{(t, t') \in S_2 \times S_2 : \chi'(t) = \chi'(t')\}$ is an equivalence relation on $S_2$ containing $R$. Thus $\pi(t)_R = \chi'(t)$ is well defined on $Q = S_2/R$. Similarly, $\pi(g)_R = \chi' g$ is well defined on $H$. $\chi' = \pi \circ \chi$ and it is unique.
(9)
Finally, we have to check that $\pi$ is an S-morphism. Note that for every $g' \in (g)_R$ (i.e., $g' R g$), $\chi' g' = \chi' g$ because $\{(g, g') \in F_2 \times F_2 : \chi' g = \chi' g'\}$ is an equivalence relation on $F_2$ and since for every $f \in F_1$, $\chi'(\varphi f) = \chi'(\psi f)$, this equivalence relation contains all $(\varphi f, \psi f)$ and hence contains $R$. Also since $\chi'$ is an S-morphism, for each $g' \in (g)_R$ we have $g'(t) = g'(t')$ implying $(\chi' g')(\chi' t) = (\chi' g')(\chi' t')$, i.e., $(\chi' g)(\chi' t) = (\chi' g)(\chi' t')$. Thus $R_{\chi' g}$ is ‘refined’ by each of $R_{g'}$ (on $F_2$). Since $R_{(g)_R}$ is the supremum of $\{R_{g'} : g' \in (g)_R\}$, we have $R_{(g)_R} \subset R_{\chi' g}$. Therefore $(g)_R (t)_R = (g)_R (t')_R$ in $Q$ implies $(g)_R (t)_R = (g)_R (t')_R$ in $S_2$, which in turn implies that $(\chi' g)(\chi' t) = (\chi' g)(\chi' t')$ in $Q'$, i.e., $\pi(g)_R [\pi(t)_R] = \pi(g)_R [\pi(t')_R]$ in $Q'$, whence $\pi : (Q, H) \to (Q', H')$ is indeed an S-morphism.
It is apparent that whereas products and equalizers are easy to define in S, their dual concepts are a lot more complicated. This is indeed observed in many familiar categories (cf. A.36). A difficult problem in the study of a specific category is in fact to explicitly describe its coproducts and coequalizers (‘colimits’).
Hierarchy of S-Morphisms and Image Factorization
7.13 S-Monomorphisms A mono in the category S is an S-morphism that is injective as set mappings on the set of states and on the set of observables. For suppose $\varphi : (S_1, F_1) \to (S_2, F_2)$ is a mono and there are distinct states $s$ and $s'$ in $S_1$ for which $\varphi(s) = \varphi(s')$, then consider $\psi_1, \psi_2 : (S_1, F_1) \to (S_1, F_1)$ with $\psi_1$ mapping all states in $S_1$ to $s$, $\psi_2$ mapping all states in $S_1$ to $s'$, and both $\psi_1$ and $\psi_2$ acting as identity on $F_1$. It is easy to check that in this case the S-morphisms $\psi_1$ and $\psi_2$ are such that $\varphi \circ \psi_1 = \varphi \circ \psi_2$ but $\psi_1 \neq \psi_2$, a contradiction. So $\varphi : S_1 \to S_2$ must be injective. Also, suppose distinct observables $f$ and $f'$ in $F_1$ are such that $\varphi f = \varphi f'$. Then consider $\psi_1, \psi_2 : (\{s\}, \{f, f'\}) \to (S_1, F_1)$ where $s \in S_1$, $\psi_1$ is the inclusion, and $\psi_2(s) = s$, $\psi_2 f = f'$, $\psi_2 f' = f$. Again $\psi_1$, $\psi_2$ are S-morphisms with $\varphi \circ \psi_1 = \varphi \circ \psi_2$ but $\psi_1 \neq \psi_2$, a contradiction. So $\varphi : F_1 \to F_2$ is also injective. Conversely, if an S-morphism $\varphi : (S_1, F_1) \to (S_2, F_2)$ is injective on both $S_1$ and $F_1$, it is mono.
Now suppose $\varphi : (S_1, F_1) \to (S_2, F_2)$ is an equalizer and that $S_1$ is nonempty. ($F_1$ is already nonempty because $0 \in F_1$.) Say $\varphi = \mathrm{eq}(\psi_1, \psi_2)$ for $\psi_1, \psi_2 : (S_2, F_2) \to (X, H)$. Then, as an equalizer, $\varphi : (S_1, F_1) \to (S_2, F_2)$ is isomorphic to an inclusion (see 7.10). So in particular for $f \in F_1$ and $s, s' \in S_1$, $f(s) = f(s')$ if and only if $(\varphi f)(\varphi s) = (\varphi f)(\varphi s')$, i.e., $S_1/R_f \cong \varphi(S_1)/R_{\varphi f}$. Thus $\varphi^{-1}$ is well defined on $\varphi(S_1)$ and $\varphi(F_1)$ and can be extended to an S-morphism on $(S_2, F_2)$. (We need a nonempty $S_1$ for the same reason as in Set.) So in S, an equalizer with nonempty domain is split mono.
In the examples in CS, I have shown that an S-mono is not necessarily an S-equalizer, so the hierarchy for monomorphisms in S is (for $\varphi : (S_1, F_1) \to (S_2, F_2)$ with nonempty $S_1$)
(10) split mono $\Leftrightarrow$ equalizer $\Rightarrow$ mono $\Leftrightarrow$ injection (on both $S_1$ and $F_1$).
7.14 S-Epimorphisms In S, an epi is the same as an S-morphism that is surjective on both the set of states and the set of observables. For suppose $\varphi : (S_1, F_1) \to (S_2, F_2)$ is an epi and there is an $s \in S_2 \sim \varphi(S_1)$, then $\psi_1, \psi_2 : (S_2, F_2) \to (\{0, 1\}, \{0\})$, where $\psi_1 = \chi_{\varphi(S_1)}$ on $S_2$, $\psi_1 f = 0$ for all $f \in F_2$, $\psi_2 = \chi_{S_2}$ on $S_2$, and $\psi_2 f = 0$ for all $f \in F_2$, provide a pair of S-morphisms such that $\psi_1 \circ \varphi = \psi_2 \circ \varphi$ but $\psi_1 \neq \psi_2$, a contradiction. So $\varphi(S_1) = S_2$. Now suppose there is an $f \in F_2 \sim \varphi(F_1)$, then $\psi_1, \psi_2 : (S_2, F_2) \to (S_2, F_2)$ where $\psi_1 = 1_{(S_2,F_2)}$, $\psi_2 = 1_{S_2}$ on $S_2$ and $\psi_2 f = 0$ for all $f \in F_2$, is an example in which $\psi_1 \circ \varphi = \psi_2 \circ \varphi$ but $\psi_1 \neq \psi_2$, again a contradiction. Thus $\varphi(F_1) = F_2$. Conversely, an S-morphism $\varphi : (S_1, F_1) \to (S_2, F_2)$ that is onto both $S_2$ and $F_2$ is epi. Thus in S, one has
(11) split epi $\Rightarrow$ coequalizer $\Rightarrow$ epi $\Leftrightarrow$ surjection (onto both $S_2$ and $F_2$).
In CS I have shown that the two preceding one-way implications are indeed irreversible; so the preceding is the hierarchy for epimorphisms in S.
Note that although the two hierarchies in S for the dual concepts of monomorphisms and epimorphisms are not the same, this is not a counterexample to the principle of categorical duality (see A.30). The principle only states that if $\Sigma$ is a statement about a category C, then $\Sigma^{op}$ is universally true if $\Sigma$ is. For a particular category, it may very well happen that $\Sigma$ is true but $\Sigma^{op}$ is not.
7.15 Image Factorization The category S, also, has epi-equalizer and coequalizer-mono factorizations. The diagram
(12)
with $\pi = \varphi$ on $(S_1, F_1)$ and $\iota$ = inclusion is an epi-equalizer factorization $\varphi = \iota \circ \pi$ of $\varphi$.
Consider the equivalence relation $R = \{(s, s') \in S_1 \times S_1 : \varphi(s) = \varphi(s')\}$ on $S_1$ and $R = \{(f, f') \in F_1 \times F_1 : \varphi f = \varphi f'\}$ on $F_1$; the diagram
(13)
with $\pi$ = natural projection (where $F_1/R \subset \mathbb{R}^{S_1/R}$ is to be interpreted as in the construction of coequalizers in 7.12) and $\iota$ defined by $\iota(s)_R = \varphi(s)$, $\iota(f)_R = \varphi f$ (since $s R s'$ iff $\varphi s = \varphi s'$, and $f R f'$ iff $\varphi f = \varphi f'$, $\iota$ is well defined) is then a coequalizer-mono factorization of $\varphi$.
Note that although both the epi-equalizer and coequalizer-mono factorizations are unique up to isomorphism, the two factorizations are not necessarily isomorphic (an example is the case of Top, A.45).
7.16 Subobjects In S, as we saw, there are two distinct types of monomorphisms, namely that of an equalizer (= split mono) and that of a mono (= injective morphism). I shall say that $\varphi : (S_1, F_1) \to (S_2, F_2)$ is an S-subsystem (or simply $(S_1, F_1)$ is a subsystem of $(S_2, F_2)$) if $\varphi$ is an equalizer, and it is an S-monosubobject (or simply $(S_1, F_1)$ is a monosubobject of $(S_2, F_2)$) if $\varphi$ is mono. So a subsystem is a monosubobject but not vice versa.
Note that subsystem implies that for each $f \in F_1$, $S_1/R_f \cong \varphi(S_1)/R_{\varphi f}$ ($\varphi : S_1/R_f \to S_2/R_{\varphi f}$ is one-to-one), i.e., $f(s) = f(s')$ if and only if $(\varphi f)(\varphi s) = (\varphi f)(\varphi s')$; whereas monosubobject does not have this ‘backward implication’ ($\varphi : S_1/R_f \to S_2/R_{\varphi f}$ is not necessarily one-to-one). A subsystem, therefore, is the appropriate subobject of a system that preserves most of its structures. On the other hand, a monosubobject may be used to define a partial order on S, with $(S_1, F_1) \leq (S_2, F_2)$ if and only if $(S_1, F_1)$ is a monosubobject of $(S_2, F_2)$. Biological implications of this partial order may be found in CS. In this monograph, I shall proceed slightly differently, and specialize on the partial order for various collections of observables on the same set of states (cf. 7.23 below).
The Lattice of Component Models
Let $N$ be a natural system.
7.17 Definition A model of $N$ is a finite collection of formal systems $\{\langle S_i, F_i\rangle : i = 1, \ldots, n\}$ such that the collection of mappings $\{F_i : i = 1, \ldots, n\}$ satisfies the entailment requirements of the modelling relation. Each formal system $\langle S_i, F_i\rangle$ is called a component of the model.
I shall consider next, in detail, the meaning of the phrase ‘satisfy the entailment requirements of the modelling relation’.
7.18 Nuances of Dualism Recall in 4.4 that the primitive natural system is thus attributed that it is (a) a part, whence a subset, of the external world; and (b) a collection of qualities, to which definite relations can be imputed. Rosen continued the explication on the concept of a natural system [AS, Section 2.1] with the following:
...a natural system from the outset embodies a mental construct (i.e., a relation established by the mind between percepts) which comprises a hypothesis or model pertaining to the organization of the external world. In what follows, we shall refer to a perceptible quality of a natural system as an observable. We shall call a relation obtaining between two or more observables belonging to a natural system a linkage between them. We take the viewpoint that the study of natural systems is precisely the specification of the observables belonging to such a system, and a characterization of the manner in which they are linked. Indeed, for us observables are the fundamental units of natural systems, just as percepts are the fundamental units of experience.
Note the nuance here, that of the subtle difference between a material system (or a physical system) and a natural system. A material system is ontological, it being simply any physical object in the world. A natural system, on the other hand, is epistemological, since the partitioning of the external world and the formation of percepts and their relations are all mental constructs (and are therefore entailed by the bounds of mental constructs). In short, a natural system is a subjectively-defined representation of a material system. Recall, as we discussed in 4.13, that the existence of causal entailment in a natural system is ontological, but the representation of causality, by an arrow (i.e., as mappings), is epistemological.
Likewise, note the nuance between a formal system (Definition 7.1) and a model (Definition 7.17). For a general formal system $\langle S, F\rangle$, the only
requirement for the collection of mappings $F$ is $0 \in F \subset \bullet^S$, with no size limits. A model is a functorial image of a natural system. Recall (4.15) the Natural Law axiom “Every process is a mapping.” My formal definition (Definition 2.23) of observable of a set $X$ (a mapping with domain $X$) categorically models an observable of a natural system (a perceptible quality). Thus the mappings in a model are observables in both senses. In particular, the number of percepts is finite (it may be a very large number, but finite nonetheless), therefore the number of observables of a model is finite.
7.19 Finitude St. Thomas Aquinas in his Summa Theologica wrote that “in efficient causes it is not possible to go on to infinity”. There is a philosophical debate as to whether Aquinas intended to say that an infinitely long causal chain (i.e., in our terminology an infinite sequence of hierarchical compositions) would be impossible, or that there are only finitely many efficient causes in the universe. For our purpose, a natural system (being a mental construct) can have only finitely many efficient causes, and a model has only finitely many mappings. A model, an abstraction by the modeller, is by definition an incomplete description. Thus, if $\{\langle S_i, F_i\rangle : i = 1, \ldots, n\}$ is a model (that there are finitely many model components $\langle S_i, F_i\rangle$ is already part of Definition 7.17), then each $F_i$ must be a finite subset of $\bullet^{S_i}$.
The requirement that the $F_i$s (hence the totality of observables $\bigcup_{i=1}^{n} F_i$ of $N$) are to be finite sets looks like a very severe mathematical restriction. But in mathematical modelling of natural systems, a finiteness restriction is not unrealistic: all one requires is that the sets are finite, and there is no restriction on how small the sets have to be. So the sets could be singletons, have $10^{10}$ elements, or have $10^{100}$ elements and still be finite. After all, Sir James Jeans, in The Astronomical Horizon [1945], defined the universe as a gigantic machine the future of which is inexorably fixed by its state at any given moment, that it is “a self-solving system of 6N simultaneous differential equations, where N is Eddington’s number”. Sir Arthur Eddington, in The Philosophy of Physical Science [1939], asserted (evidently poetically) that $N = 2 \times 136 \times 2^{256}$ ($\sim 10^{79}$) is the total number of particles of matter in the universe. The point is that it is a finite number. Thus the set of states of a natural system is certainly finite at one time (this is not to be confused with the set of all possible states a system can have), and the set of observables on a natural system at one time is also clearly finite.
A graph with finitely many edges (and finitely many vertices) is called a finite graph. While one may study infinite graphs, the subject of graph theory (cf. Chapter 6) is almost always finite graphs. In view of the isomorphism given in Theorem 6.1, I summarize the epistemological finitude as the
7.20 Axiom of Finitude
(a) a natural system has finitely many efficient causes;
(b) a model has finitely many mappings;
(c) the relational diagram in graph-theoretic form (of the entailment patterns of a natural system) is a finite graph.
7.21 Further Entailment Requirements
Let $\langle S, F\rangle$ be a component of a model. Since $I \leq R_f$ for each $f \in F$ (where $I$ is the equality relation on $S$), one has $I \leq R_F$. But one almost always has $I < R_F$: to have $I = R_F$ is to say that one has a complete description of a component set $S$ of the natural system $N$ (since the resolution is down to every single element of $S$), which does not usually happen in a model component $\langle S, F\rangle$ unless $S$ is exceedingly simple.
It is important to note the epistemological difference between the equivalence relations $U$ and $I$. The universal relation $U = R_0$ induced by $0 \in F$ allows us to identify the whole natural system $S$, to distinguish elements that belong to $S$ from those that do not. This differentiation of self from non-self is a requirement of Natural Law. Note that if $\langle S, F\rangle$ is a model component of $N$ then so trivially is $\langle S, \{0\}\rangle$. The equality relation $I$, on the other hand, identifies all the individual elements of $S$, and this is a description that is rarely available to us. Another way to summarize the situation succinctly is that the equality relation $I$ and the universal relation $U$ characterize, respectively, the left-hand side and the right-hand side of the membership relation $\in$.
The modelling relation imposes restrictions on mappings that qualify to be members of $F$, since the mappings are functorial images of processes. The available observables of $S$, which can belong to a family $F$ so that $\langle S, F\rangle$ is a model component, therefore form a proper subset $H(S, \bullet)$ of $\bullet^S$ (see item A.3 on the category Set). In particular, $\{R_f : f \in F\} \neq \mathbb{Q}S$. Further restrictions apply to $F$: since linkages of mappings model relations of percepts, $F$ cannot be an arbitrary collection of observables of $S$.
7.22 Definition Let $S$ be a set such that $\langle S, \{0\}\rangle$ is a model component of a natural system $N$. The collection of all model components of the form $\langle S, F\rangle$ with $F \in \mathcal{P}(\bullet^S)$ is denoted $C(S)$.
Note that while by Theorem 2.20 there exists an observable $h$ of $S$ such that $R_h = R_F$, there is no requirement that $h \in F$. But evidently $\langle S, \{0, h\}\rangle$ and $\langle S, F\rangle$ may be considered equivalent model components (in the sense of $\sim$-equivalent observables in the isomorphism $\bullet^S/{\sim} \cong \mathbb{Q}S$; cf. 2.24); in other words, if $\langle S, F\rangle \in C(S)$, then also $\langle S, \{0, h\}\rangle \in C(S)$.
I now proceed to construct $C(S)$ into a lattice. For simplicity I shall use the term ‘model of $S$’ to abbreviate the verbose but more proper term ‘model component of the form $\langle S, F\rangle$ of the natural system $N$’, i.e. an element $\langle S, F\rangle \in C(S)$.
7.23 Joining Models Given $\langle S, F\rangle, \langle S, G\rangle \in C(S)$, define the join of these two models of $S$ as
(14) $\langle S, F\rangle \vee \langle S, G\rangle = \langle S, F \cup G\rangle$.
Note that the join operator $\vee$ of $C(S)$ is defined covariantly through the join operator $\cup$ of the power set $\mathcal{P}(\bullet^S)$. One easily verifies that
(15) $R_{F \cup G} = R_F \wedge R_G$;
thus the characterizing equivalence relation of a join in $C(S)$ corresponds on the other hand contravariantly to the meet operator $\wedge$ of $\mathbb{Q}S$.
If $F \subset G$, then $R_F \supset R_G$ as subsets of $S \times S$, whence $G$ refines $F$, in the obvious sense that $R_G \leq R_F$ in the lattice $\mathbb{Q}S$. The inclusion $F \subset G$ also means $F \cup G = G$, whence $\langle S, F\rangle \vee \langle S, G\rangle = \langle S, F \cup G\rangle = \langle S, G\rangle$, and therefore $\langle S, F\rangle \leq \langle S, G\rangle$ [with the natural definition, that $x \leq y$ if and only if $x \vee y = y$]. So note the covariant implication:
(16) $F \subset G \Rightarrow \langle S, F\rangle \leq \langle S, G\rangle$.
The converse is not true: $\langle S, F\rangle \leq \langle S, G\rangle$ does not imply $F \subset G$. I leave it as an easy exercise for the reader to demonstrate a counterexample.
Consider the trivial model $\langle S, \{0\}\rangle \in C(S)$. Since $0 \in F$ for any model $\langle S, F\rangle \in C(S)$, one has $\{0\} \cup F = F$, whence $\langle S, \{0\}\rangle \vee \langle S, F\rangle = \langle S, F\rangle$. This says $\langle S, \{0\}\rangle$ is the least element of $C(S)$. Note the contravariance in this correspondence: while the universal relation $U$ corresponding to the constant mapping is the greatest element in the lattice $\mathbb{Q}S$, the trivial model $\langle S, \{0\}\rangle$ is the least element of $C(S)$.
The dual operator is obviously defined thus:
7.24 Meeting Models Given ⟨S, F⟩, ⟨S, G⟩ ∈ C(S), define the meet of these two models of S as
$$\langle S, F \rangle \wedge \langle S, G \rangle = \langle S, F \cap G \rangle . \tag{17}$$
Note that the meet operator ∧ of C(S) is again defined covariantly, through the meet operator ∩ of the power set i S. While the join of two models is a useful construction in combining models — the more observables, i.e. more alternate descriptions, a model has, the more information one gains on the system — the meet of two models has no practical purpose other than that the mathematical structure requires it. Indeed, there is no simple relationship dual to R_{F∪G} = R_F ∧ R_G: in general, R_F and R_G do not have to combine in any way to give R_{F∩G}.
The greatest element (if it exists) of C(S) would have to be a model ⟨S, G⟩ such that for every ⟨S, F⟩ ∈ C(S),
$$\langle S, F \rangle \wedge \langle S, G \rangle = \langle S, F \cap G \rangle = \langle S, F \rangle .$$
[Recall that x ≤ y if and only if x ∧ y = x.] This would require F ∩ G = F for every set F of observables, whence G must contain all the observables of S. Such a Godlike perspective on a system — having access to the collection G of all observables — is, however, usually not available to us. We must conclude, therefore, that for a general natural system S (or more properly, a set S representing a component of a general natural system N), either the complete collection G cannot be determined, or even if it can be, we must have ⟨S, G⟩ ∉ C(S); stated otherwise, generically, C(S) does not have a greatest element. Those natural systems S for which C(S) does have a greatest element are, therefore, very specialized, highly nongeneric. I shall explore these specialized systems in the next chapter.
The definitions of join and meet in C(S) establish a (covariant) embedding of
⟨C(S), ∨, ∧⟩ into ⟨i S, ∪, ∩⟩ and a (contravariant) isomorphism between ⟨C(S), ∨, ∧⟩ and the lattice of equivalence relations on S (with its own ∨ and ∧) in the category of lattices, whence the collection of all model components has a representation as a sublattice (and hence inheriting properties) of the two canonical lattices (cf. Lemma 2.8). I state this important fact as the
7.25 Theorem
The collection of model components C ( S ) is a lattice.
The Category of Models
7.26 Model Network The collection of mappings in a model {⟨S_i, F_i⟩ : i = 1, ..., n}, when represented as a relational diagram in graph-theoretic form, is a network of blocks and arrows. Each component ⟨S_i, F_i⟩ may be represented by a block, and an arrow is drawn from ⟨S_i, F_i⟩ to ⟨S_j, F_j⟩ if there is a mapping f ∈ F_i such that cod(f) = S_j or cod(f) ∩ F_j ≠ ∅. (Note that this is different from the existence of an S-morphism from ⟨S_i, F_i⟩ to ⟨S_j, F_j⟩.) Here is an example:
(18) [diagram: an example of a model network]
The main purpose of a model network diagram is to show how the various model components are interconnected. A mapping f ∈ F_i may have a codomain cod(f) that is not related to any other component ⟨S_j, F_j⟩. These images may be considered the ‘environmental outputs’ of the component; they are understood to be implicitly present, and usually not explicitly shown; on the occasions when they are shown for emphasis, they appear as arrows initiating from the component and terminating in the ambience. Similarly, ‘environmental inputs’ are not usually represented; when they are, they appear as arrows initiating from the ambience and terminating in a component.
The solid-headed and hollow-headed arrow distinction used in connection with the relational diagram of a mapping (cf. 5.4) may be extended analogously to a model network. A mapping f ∈ F_i such that cod(f) = S_j entails a material cause, and may be represented by a hollow-headed arrow. A mapping f ∈ F_i such that cod(f) ∩ F_j ≠ ∅ entails an efficient cause, and may be represented by a solid-headed arrow. Note, however, that the two kinds of arrows are used to distinguish causal differences, but the arrows do not represent the same entities in an entailment network of mappings and in a model network of components. In particular, in an entailment network, solid-headed and hollow-headed arrows come in formal-cause pairs, but there is no such relational requirement in a model network. In a model network, a hollow-headed arrow means that the processed image of the arrow is used as a material input of its target block, while a solid-headed arrow means that the processed image of the arrow is itself a processor; thus a solid-headed arrow represents functional entailment that yields a transfer function of the target block. In short, the arrowheads of the hollow-headed and solid-headed arrows point, respectively, to entailed material and efficient causes. In contrast, in an entailment network of a formal system with its relational diagram in graph-theoretic form, it is the tails of the hollow-headed and solid-headed arrows that are the formal positions of, respectively, the material and efficient causes of a mapping.
I illustrate the usage for model networks with a simple example: consider a three-component model M = {⟨A, {0, f}⟩, ⟨B, {0, g}⟩, ⟨X, {0, Φ}⟩}, where f : A → B with f : a ↦ b, g : B → C with g : b ↦ c, and Φ : X → H(A, B) with Φ : x ↦ f. Then the block diagram for M is
(19) [block diagram of M]
Each collection of observables of the components may be resolved into its individual mappings that are themselves represented in their relational diagrams in graph-theoretic form, resulting in a network of the solid-headed and hollow-headed arrows. The relational diagram for our example M is
(20) [relational diagram of M]
and it contracts to
(21) [contracted relational diagram of M]
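The arrow rule of 7.26 can be applied mechanically to the three-component example M. The sketch below assumes an encoding of my own: sets are referred to by name, the constant observable 0 is omitted, and the hom-set H(A, B) is represented only by the members one happens to know (here just f). The class `Mapping` and the function `network_arrows` are hypothetical helpers, not notation from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    name: str
    domain: str
    codomain: object  # a set name (str), or a frozenset of mapping names for a hom-set

def network_arrows(components):
    """components: dict of set name -> list of Mapping (constant observable 0 omitted).
    Returns a list of (source block, target block, arrow kind, mapping name)."""
    arrows = []
    for src, maps in components.items():
        for m in maps:
            for tgt, tgt_maps in components.items():
                if m.codomain == tgt:
                    # cod(f) = S_j: material cause, hollow-headed arrow
                    arrows.append((src, tgt, "hollow", m.name))
                elif (isinstance(m.codomain, frozenset)
                      and m.codomain & {t.name for t in tgt_maps}):
                    # cod(f) ∩ F_j ≠ ∅: efficient cause, solid-headed arrow
                    arrows.append((src, tgt, "solid", m.name))
    return arrows

f = Mapping("f", "A", "B")
g = Mapping("g", "B", "C")                   # C is not a component of M
Phi = Mapping("Phi", "X", frozenset({"f"}))  # H(A, B), listing only the known member f
M = {"A": [f], "B": [g], "X": [Phi]}

arrows = network_arrows(M)
for arrow in arrows:
    print(arrow)
```

With this encoding the sketch yields a hollow-headed arrow from block A to block B (through f) and a solid-headed arrow from block X to block A (through Φ); the mapping g produces no arrow, since its codomain C lies outside M and so counts as an environmental output.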
The similarity in form of the arrow usage between diagrams (19) and (21) (i.e., the isomorphism of their formal causes) demonstrates why I choose the solid-headed and hollow-headed arrow analogy for model networks. For a model consisting of many components and many mappings, the relational diagram in graph-theoretic form is, of course, a very complicated network. Each formal-system component by itself may already be complicated (see, for example, diagram (1) in Chapter 6, and indeed the whole of the previous chapter on network topology), and now the networking process is iterated: a model is a network of component blocks, and each of these blocks is a collection of connected components, each of which is itself a network of arrows.
7.27 Definition The collection of all models of a natural system N is denoted C(N).
The lattice structure of model components C(S) may be extended to C(N). For two models {⟨S_i, F_i⟩}, {⟨T_j, G_j⟩} ∈ C(N), their join ∨ may be defined as the set-theoretic union of the two collections of components, with the exception that when S_i = T_j, instead of admitting ⟨S_i, F_i⟩ and ⟨T_j, G_j⟩ separately into the union, one takes their join in C(S_i), ⟨S_i, F_i⟩ ∨ ⟨T_j, G_j⟩ = ⟨S_i, F_i ∪ G_j⟩. The meet of two models may be defined dually in the obvious way. Thus
7.28 Theorem The collection C(N) of models of a natural system N is a lattice.
I remarked in 1.31 that a partially ordered set ⟨X, ≤⟩ may itself be considered as a category, in which the objects are elements of X, and a hom-set X(x, y) for x, y ∈ X has either a single element or is empty, according to whether x ≤ y or not. It is in this sense that the lattice (hence poset) C(N) may also be considered as a category.
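The join in C(N) and the poset-as-category reading can be sketched concretely. The encoding is an assumption of mine: a model is a dictionary from set names to frozensets of observable names, and models are compared literally rather than up to equivalence of observables. The names `join_models`, `leq` and `hom` are hypothetical.

```python
def join_models(M1, M2):
    """Join of two models: union of components, joining those on the same set."""
    joined = dict(M1)
    for S, G in M2.items():
        joined[S] = joined.get(S, frozenset()) | G
    return joined

def leq(M1, M2):
    """x ≤ y if and only if x ∨ y = y."""
    return join_models(M1, M2) == M2

def hom(M1, M2):
    """Hom-set of the poset category: one arrow if M1 ≤ M2, otherwise empty."""
    return [("≤", "M1", "M2")] if leq(M1, M2) else []

M1 = {"S": frozenset({"0", "f"}), "T": frozenset({"0", "g"})}
M2 = {"S": frozenset({"0", "h"}), "U": frozenset({"0", "k"})}
J = join_models(M1, M2)

assert J["S"] == frozenset({"0", "f", "h"})  # components on the same set are joined
assert leq(M1, J) and leq(M2, J)             # the join is an upper bound
assert len(hom(M1, J)) == 1 and hom(J, M1) == []
```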
7.29 Corollary The collection C ( N ) of models of a natural system N is a category. Thus we may speak of ‘the category of models’ C ( N ) in the literal category-theoretic usage.
The Α and the Ω
That the binary join and meet operators of every lattice may be extended to any nonempty finite collection of elements is a matter of course (cf. 2.9). The meet operator of C(N) may also be extended to any nonempty, finite or infinite, collection of models, since, essentially, the power-set lattice ⟨i S, ∪, ∩⟩ is a complete lattice; in other words, the meet, or infimum, of any nonempty collection of models is itself a model. I have noted above, however, that the greatest element, which is inf ∅, does not necessarily exist for a general system. The join operator of C(N), on the other hand, may only be extended to any countably infinite collection of models, but not to any arbitrary collection: otherwise one may take the supremum of all the models in C(N) and again obtain the greatest element sup C(N), which we have already excluded epistemologically for a general system.
For a lattice to be complete, every subset (including the empty subset ∅ and the lattice itself) must have a supremum and an infimum. While we do have the least element inf C(N) = sup ∅ = ⟨N, {0}⟩, the greatest element sup C(N) = inf ∅ does not necessarily exist; thus C(N) is generally not complete. (Note that in the notation ⟨N, {0}⟩ I have used the symbol N to denote both a natural system and a set that represents it. See the discussion on natural systems in 4.4.)
7.30 Definition The greatest element (if it exists) of the lattice of models C ( N ) is called the largest model of N . The largest model may also be called the greatest model, or maximal model.
⟨N, {0}⟩ is the least element of C(N). But R_0 = U gives us no additional information other than the identification of the set N. More interesting are the models that are ‘slightly larger’ than ⟨N, {0}⟩ (cf. Definition 3.43).
7.31 Definition The join-irreducible elements in the lattice of models C(N) are called the minimal models of N.
One may also refer to minimal models as smallest models — but not ‘least models’, to avoid confusion with the least element ⟨N, {0}⟩ of C(N). Recall (Definition 3.25) that an element in a poset that covers the least element is called an atom, and that an atom is join-irreducible (Theorem 3.46). Models in C(N) of the form ⟨S, {0, f}⟩ for which there is no observable h of S such that R_f ≤ R_h and ⟨S, {0, h}⟩ ∈ C(N) are minimal models.
Since C(N) does not necessarily have a greatest element, one cannot generally speak of complements (Definition 3.9) in this lattice. One can, however, speak of relative complements (Definition 3.11). The lattice of equivalence relations on S is relatively complemented (Theorem 3.14). The same constructive proof translates to C(S), because of the contravariant correspondence between C(S) and the lattice of equivalence relations on S; whence the lattice C(S) of model components is also relatively complemented. Thus by extension, one also has the
7.32 Theorem
C ( N ) is a relatively complemented lattice.
When one considers relative complements in the interval [R_2, U] in the lattice of equivalence relations on S, one has the
7.33 Corollary If R_1 and R_2 are equivalence relations on S and R_2 ≤ R_1 (R_2 refines R_1), then there is an equivalence relation X on S such that R_1 ∧ X = R_2.
When one considers relative complements in the interval [⟨S, {0}⟩, ⟨S, F_2⟩] in C(S) (remember that the correspondence between C(S) and the lattice of equivalence relations on S is contravariant), one has the dual
7.34 Corollary If ⟨S, F_1⟩, ⟨S, F_2⟩ ∈ C(S) and ⟨S, F_1⟩ ≤ ⟨S, F_2⟩, then there is an ⟨S, G⟩ ∈ C(S) such that ⟨S, F_1⟩ ∨ ⟨S, G⟩ = ⟨S, F_2⟩.
And when one considers relative complements in the interval [⟨N, {0}⟩, M_2] in C(N), one has the corresponding
7.35 Corollary If M_1, M_2 ∈ C(N) and M_1 ≤ M_2, then there is an X ∈ C(N) such that M_1 ∨ X = M_2.
Analytic Models and Synthetic Models
Next, I shall examine the inherent algebraic or topological structure of the codomains of a family of observables F of a set S. One learns about S through its model ⟨S, F⟩ by its projection into quotients
$$S \to S/R_F \subset \prod_{f \in F} S/R_f . \tag{22}$$
7.36 Definition The expression of a set S as a cartesian product of quotient sets is an analysis of S; the corresponding model ⟨S, F⟩ is an analytic model of S.
Note that the term ‘analytic model’ is a description of the expression, or representation, of the model rather than of the model itself. Every model ⟨S, F⟩ may be represented in its cartesian-product-of-projections form ∏_{f∈F} S/R_f, and therefore every model is an analytic model. I use the term to
distinguish the model from an alternate description called synthetic model that is my next topic, when the observables are not interpreted as projections. Note also that, for simplicity as in the construction of the lattice C(S) above, I have used the term ‘model of S’ to abbreviate the verbose but more proper term ‘model component of the form ⟨S, F⟩ of the natural system N’. But the extension from C(S) to C(N) is trivial: a model M = {⟨S_i, F_i⟩} ∈ C(N) is analytic if every ⟨S_i, F_i⟩ ∈ C(S_i) is analytic.
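The projection into quotients of (22) can be made concrete on a finite set. In this sketch, which rests on assumptions of mine (an eight-element S and two hypothetical observables, reduction modulo 2 and modulo 4), each element is sent to its tuple of equivalence classes, one class per observable. Two elements receive the same tuple exactly when they are R_F-related, and the image is a proper subset of the full product of quotients.

```python
def quotient_class(S, f, x):
    """The class of x in S/R_f, as a frozenset."""
    return frozenset(y for y in S if f(y) == f(x))

def analysis(S, F):
    """Send each x to its coordinates in the product of the quotients S/R_f."""
    return {x: tuple(quotient_class(S, f, x) for f in F) for x in S}

S = range(8)
F = [lambda x: x % 2, lambda x: x % 4]  # hypothetical observables

coords = analysis(S, F)
image = set(coords.values())            # a copy of S/R_F inside the product

# Size of the full product of quotients: the product of the numbers of classes
product_size = 1
for k in range(len(F)):
    product_size *= len({c[k] for c in image})

# Same coordinates if and only if related by R_F
assert all((coords[x] == coords[y]) == all(f(x) == f(y) for f in F)
           for x in S for y in S)
# S/R_F sits inside the product, here as a proper subset
assert len(image) < product_size
```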
7.37 Injection The imputation process dual to projection is injection (cf. A.22 Product and A.33 Coproduct in the Appendix), when one reverses the arrow of an observable f : S → Z, and considers properties that may be imputed onto S by ‘f⁻¹ : Z → S’. Note that ‘f⁻¹’ is used in the ‘inverse image’ sense (Definition 1.8), and is not necessarily a (single-valued) mapping.
For a topological example, let Z be a metric space; then a pseudometric d_S may be defined on S by the metric d_Z on Z as
$$d_S(x, y) = d_Z(f(x), f(y)) . \tag{23}$$
For an algebraic example, let Z be a ring; then for x, y ∈ S and a ∈ Z one may define corresponding ‘addition’ x + y, ‘multiplication’ x y, and ‘scalar multiplication’ a x in S by
$$f(x + y) = f(x) + f(y) , \tag{24}$$
$$f(x y) = f(x) f(y) , \tag{25}$$
$$f(a x) = a f(x) . \tag{26}$$
In each of these two examples, the ‘new’ mathematical structures in the system S on the ‘left-hand side’ are ‘decoded’ from the existing mathematical structures in the system Z on the ‘right-hand side’. Obviously, the construction in S of the mathematical structures induced by observables has to be carefully done, and is indeed not always possible.
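The topological example (23) is easy to try out. The sketch below assumes Z is the real line with the usual metric and takes a hypothetical, non-injective observable f(x) = x² on a small set of integers. It shows why the induced structure is in general only a pseudometric: distinct points of S may lie at distance zero.

```python
from itertools import product

def induced_pseudometric(f, d_Z):
    """d_S(x, y) = d_Z(f(x), f(y)), as in (23)."""
    return lambda x, y: d_Z(f(x), f(y))

d_Z = lambda a, b: abs(a - b)  # usual metric on the real line (assumed choice of Z)
f = lambda x: x * x            # hypothetical observable S -> Z, not injective
d_S = induced_pseudometric(f, d_Z)

S = range(-3, 4)

# Symmetry and the triangle inequality are inherited from d_Z
assert all(d_S(x, y) == d_S(y, x) for x, y in product(S, repeat=2))
assert all(d_S(x, z) <= d_S(x, y) + d_S(y, z) for x, y, z in product(S, repeat=3))

# Distinct points with equal images are at distance zero
assert d_S(-2, 2) == 0
```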
The strategy of imputing properties to S by injecting into it is not promising in epistemological terms. A general natural system S is not known a priori: it is the object of study, to be probed through observation and modelling. It is, therefore, natural to consider S as the domain on which observables are defined, and models established. To treat S as codomain presupposes from the outset a knowledge of S that is difficult to phenomenologically justify. Systems for which this is possible must be special. This is the realm of synthesis, dual to analysis (although synthesis using analytic fragments is only a meagre part of synthesis; there is more discussion on this towards the end of this chapter). I now leave topological synthesis (the topic, dynamical systems, having been covered thoroughly in FM , CS, and AS), and concentrate on algebraic synthesis.
7.38 Algebraic Synthesis It is in human nature to fractionate and atomize. We like to break up complicated situations into simple ones, and then to take the resultant elementary pieces and put them together in various ways. Note that there is nothing wrong with this synthetic procedure — indeed many valuable lessons may be learned. The fatal flaw is the presumptuous reductionistic claim that this very specific kind of synthetic approach, the assembly of analytic fragments, exhausts reality.
Like following life through creatures you dissect,
You lose it in the moment you detect.
— Alexander Pope: Essay on Man

Humpty Dumpty sat on a wall,
Humpty Dumpty had a great fall;
All the King’s horses and all the King’s men
Couldn’t put Humpty together again.
— nursery rhyme
Despite Alexander Pope’s cautionary message and Humpty Dumpty’s fate, reductionists continue in their futile attempt to identify the ‘complete’ set of elementary pieces and the ‘right’ way of putting them together again.
In the categorical dual of the product of the analytic model, one has the coproduct
$$\coprod_{\alpha} U_\alpha \to S , \tag{27}$$
where each member in the collection of sets {U_α} may be considered to be representing a part of the whole system S. We shall specialize in categories in which the coproduct is an ‘almost disjoint union’ (the object generated by the disjoint union of all ‘nonzero’ elements, together with a common ‘zero’; see the discussion in the Appendix on examples of these coproducts, A.36), whence we have the direct sum
$$S = \bigoplus_{\alpha} U_\alpha . \tag{28}$$
When a system S is a direct sum, by the universal property, one sees that
7.39 Lemma A mapping f is an observable of S = ⊕_α U_α if and only if it may be expressed as a direct sum of observables f_α of the summands U_α.
If each U_α is equipped with its own set of observables F_α, then a set of observables F of S may be constructed as the direct sums of the F_α.
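One way to picture the ‘almost disjoint union’ and Lemma 7.39 is through pointed sets, which I assume here purely for illustration (the examples the text has in mind are those of A.36). Each summand carries the common zero; nonzero elements are tagged with the label of their summand; an observable of the sum is assembled from observables of the summands that send zero to 0. The names `ZERO`, `wedge_sum` and `direct_sum_of_observables` are hypothetical.

```python
ZERO = "0"  # the common zero shared by all summands

def wedge_sum(summands):
    """Almost disjoint union: tagged nonzero elements plus one common zero."""
    S = {ZERO}
    for label, U in summands.items():
        S |= {(label, u) for u in U if u != ZERO}
    return S

def direct_sum_of_observables(observables):
    """Assemble f on the sum from observables f_alpha with f_alpha(ZERO) == 0."""
    def f(s):
        if s == ZERO:
            return 0
        label, u = s
        return observables[label](u)
    return f

summands = {"U1": {ZERO, "a", "b"}, "U2": {ZERO, "a"}}
observables = {
    "U1": lambda u: {ZERO: 0, "a": 1, "b": 2}[u],
    "U2": lambda u: {ZERO: 0, "a": 5}[u],
}

S = wedge_sum(summands)
f = direct_sum_of_observables(observables)

# The element "a" of U1 and the element "a" of U2 stay distinct in the sum
assert f(("U1", "a")) == 1 and f(("U2", "a")) == 5
# Restricting f to a summand recovers that summand's observable
assert all(f(ZERO if u == ZERO else ("U1", u)) == observables["U1"](u)
           for u in summands["U1"])
```

Conversely, any mapping on S that sends the common zero to 0 is recovered from its restrictions to the summands, which is the sense of the lemma in this toy setting.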
7.40 Definition The expression of a set S as a direct sum is a synthesis of S; the corresponding model ⟨S, F⟩ is a synthetic model of S.
(The extension from C(S) to C(N) is, again, trivial: a model M = {⟨S_i, F_i⟩} ∈ C(N) is synthetic if every ⟨S_i, F_i⟩ ∈ C(S_i) is synthetic.)
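If ⟨S, F⟩ and ⟨S, G⟩ are two synthetic models, then so is their join; indeed,
$$\langle S, F \rangle \vee \langle S, G \rangle = \langle S, F \rangle \oplus \langle S, G \rangle . \tag{29}$$
Since each model ⟨U_α, F_α⟩ is a fortiori an analytic model, so is their direct sum
$$\langle S, F \rangle = \bigoplus_{\alpha} \langle U_\alpha, F_\alpha \rangle . \tag{30}$$
Thus one has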
7.41 Theorem
Every synthetic model is an analytic model.
7.42 Corollary The collection of all synthetic models of N is a sublattice (whence a subcategory) of the lattice C ( N ) of all (analytic) models of N . The converse of Theorem 7.41 is not true: there generally exist analytic models that are not synthetic models. There are many reasons for this, and many were explained in Chapter 6 of LI. But above all, there is the simple mathematical reason that not every set may be expressed as a direct sum. Indeed, not every set may admit an inherently consistent linear structure, which is the central requisite in synthetic modelling. In other words, a synthetic model is a very special type of analytic model, in terms of the observables that define it. These observables must be expressible as linear combinations of observables of the summands. Stated yet otherwise, the values of the observables defining a synthetic model are determined, or entailed, entirely by values on summands. In category-theoretic terms, while the subcategory of synthetic models is an additive category, the category C ( N ) of all models is not necessarily so (cf. A.38–A.40 in the Appendix).
The Amphibology of Analysis and Synthesis
“I am not yet so lost in lexicography, as to forget that words are the daughters of earth, and that things are the sons of heaven. Language is only the instrument of science, and words are but the signs of ideas: I wish, however, that the instruments might be less apt to decay, and that signs might be permanent, like the things which they denote.”
— Samuel Johnson (1755) Dictionary of the English Language, Preface
7.43 Etymology The word ‘analysis’ is the Greek compound ἀνάλυσις, meaning ‘an undoing’. The first component is ana-, ‘up, on’, and the second component is lusis, ‘setting free’. The analysis of something complicated is a ‘freeing up’ of the thing, or resolving it into its component parts for detailed study. This technique has been in use in mathematics and logic since before Aristotle, although analysis as a mathematical subject is a relatively recent development as a descendant of calculus.
The word ‘synthesis’ is the Greek compound σύνθεσις, meaning ‘putting together’. The first component is sun-, ‘together, with’, and the second component is thesis, ‘putting, placing’. When something is ‘put together’ from parts, it often attracts a value judgment and takes on the sense of being ‘a substitute’, ‘an imitation’, or ‘artificial’. But this derogation is quite unnecessary: if one does not restrict to the dissected pieces, a synthetic approach is in no way ‘inferior’. Indeed, quite the opposite is true.
7.44 Dual Analysis and synthesis are dual concepts; and just like other category-theoretic duals, they are ‘philosophical opposites’ but usually not
‘operational inverses’. In analytic geometry, the space is ‘cut up’ by the scale of a coordinate system. In synthetic geometry (the type of geometry developed by the Greeks, i.e. ‘Euclidean’ geometry), shapes are treated ‘together’ as wholes. The meanings and origins of the two words explain the terminology in the naming of the representation of the models (Definitions 7.36 and 7.40). A model expressed as a product resolves itself into component projections. This is the point of view of decomposition of the whole into parts; hence a product representation of a model is called analytic. A model expressed as a coproduct puts the injected summands together. This is the point of view of assembly of parts into the whole; hence a coproduct representation of a model is called synthetic.
7.45 Synthetic Models ⊂ Analytic Models Corollary 7.42 indicates a hierarchy of models, that synthetic models are specialized analytic models. But we must note that this is a consequence of the very particular definitions of analytic and synthetic models, in terms of products and coproducts respectively. One must not conclude from this technical result that, in general, analysis is more generic than synthesis.
For a general natural system N, its category C(N) of models does not have the requisite linear structure for the component projections of an analytic model M ∈ C(N) to be injected back into an analytic model that is C(N)-isomorphic to M. Coproduct is the category-theoretic construction dual to product, but not its inverse. Thus, in terms of Definitions 7.36 and 7.40, synthetic model and analytic model are not inverse constructions. In more general terms, analysis and synthesis are not categorical inverses of each other: synthesis of life, in particular, is more than reassembling analytic fragments. This is the lesson of Humpty Dumpty: the pieces cannot be put together again. In short,
$$\text{synthesis} \circ \text{analysis} \not\cong \text{identity} . \tag{31}$$
On the other hand, in the product-coproduct sense of analysis-synthesis, if one carefully keeps track of the parts, one can recover them by taking apart the assembled whole. So it is possible to have
$$\text{analysis} \circ \text{synthesis} \cong \text{identity} . \tag{32}$$
But note that this identity in (32) and the nonexistent one in (31) are in different domains. In short, (31) and (32) together demonstrate a one-sided inverse.
7.46 Observation < Analysis < Synthesis James F. Danielli described the three ages in the science of modern biology as
$$\text{age of observation} \to \text{age of analysis} \to \text{age of synthesis} \tag{33}$$
[Danielli 1974]. The same progression may also be appropriately ascribed to science in general. It is, indeed, no hyperbole to state that the progress of human culture also depends on the capacity to move from the second age of analysis to the third age of synthesis, i.e., from analytic machine-based technologies to synthetic biology-inspired modes of production. Paradoxically, a synthetic approach restores to our fragmented sciences the kind of integration and unity they possessed in an earlier time, when scientists regarded themselves as natural philosophers.
As I explicated in connection with Natural Law (4.7), perception is an integral part of science. Since science begins with observations, everything in it evolves from our physical senses. These senses interact with physical objects. It is therefore natural, as a first approximation, for scientists to assume that matter is the fundamental building block of the universe. The examination of material fragments, i.e. analysis, is then the next step after the accumulation of observed data, with the hope that the knowing of the parts will tell the story of the whole. The assumption is that there are certain physical laws (‘equations’) that all matter must obey, epitomized in the Newtonian paradigm. In sum, this is an ‘upward’ theory of causation.
7.47 Newtonianism
Nature does nothing in vain, and more is in vain when less will serve; for Nature is pleased with simplicity and affects not the pomp of superfluous causes.
— Isaac Newton (1687) Philosophiæ naturalis Principia mathematica, De Mundi systemate Liber tertius: Regulæ Philosophandi
The mechanistic view of the universe was what made sense to Newton. In his clockwork universe, God makes the clock and winds it up. Unlike Aristotle, Newton did not claim to have an explanation for everything (thus began the exile of final cause from science: see Chapter 5). For example, Newton, in his Principia, describes how gravity works, on the basis of the effects seen, but does not say what gravity is. On this and other ‘mystery subjects’, Newton said “hypotheses non fingo” (that he “frames no hypotheses”). While Newton himself might have framed no hypothesis, the same cannot be said of his followers. Scientists since Newton were led astray by the mechanistic viewpoint, in their attachment of synthetic significance to analytic fragments. If the world was (like) a machine, then, they reasoned, the understanding of the world had to be reduced through analogies to machines. The ‘Newtonian paradigm’ may be succinctly summarized thus: (i) Axiomatic presentation; (ii) Mathematical precision and tight logic; (iii) All science should have this format. Although the paradigm bears Newton’s name, the dogma was only attributed to him. Newton certainly demonstrated (i) and (ii) in his Principia, but he himself did not necessarily agree with statement (iii).
When the mechanical model is the ultimate explanation, then the emphasis is on “how?”, not “why?”. The misconception is that analytic knowledge (i.e. its physiology) can tell something about its creation (i.e. its ontogenesis). This is not to say analysis, Newtonian or otherwise, is not valuable. Quite the contrary: the scientific revolution, hence modern science, is founded on it. The error is in the induction that analysis is all of science, and therefore everything must be explained in analytic terms. Stated otherwise, there is nothing wrong with (i) and (ii) of the Newtonian paradigm: the effectiveness of mathematics in science is a consequence of Natural Law (cf. Chapter 4). It is the generalization (iii), the Newtonian model of all knowledge (the Newtonian model indeed extends far beyond physics, into philosophy, sociology, economics, etc.), that firmly epitomizes, whence stagnates, the age of analysis.
7.48 Incompleteness Newton, of course, based the ‘formalization of mechanics’ that is his Principia on Euclid’s Elements, the very model of scientific rigour for millennia. If axiomatic presentation with mathematical precision and tight logic worked so well for geometry, the same ought to work for physics. It had even been said that “Physics is geometry.” The same analogue reasoning led, two and a half centuries later, to Hilbert’s program, one of the goals of which was the formalization of all of mathematics. Then came Gödel. It must, however, be remembered that although it is not possible to formalize all mathematics, it is possible to formalize almost all the mathematics that anyone uses (other than, naturally, those specifically involved in mathematical logic and foundations). In short, one acknowledges the incompleteness, and moves on. Gödel’s arguments applied to a wide range of mathematics, although his original arguments were carried through an axiomatic number-theoretic system in which a certain amount of elementary arithmetic could be expressed and some basic rules of arithmetic could be proven. In essence, what is today known as ‘Gödel’s incompleteness theorem’ states that any such axiomatic system (as one that Gödel had considered), if consistent, is incomplete, and the consistency of the system cannot be proven within the system itself.
Succinctly, consistency and completeness of an axiomatic theory may be explicated thus:
consistency: Statement p is provable, therefore it is true.
completeness: Statement p is provable, because it is true.
I invite the reader to reflect on these two properties in connection to the “whys and wherefores” discussion in 5.3.
The lesson to be learned from Gödel, however, is more importantly the metaphorical one, that “nothing works for everything”: all attempts at universality or genericity in human endeavours are likely unsuccessful. Let me tersely summarize the situation in this variation of Epimenides’s Paradox:
Every absolute statement is false.
What ought to be done is the reversal of this unfortunate legacy of misidentification, and separate analysis and synthesis again. Analysis and synthesis are not mutual inverses; indeed, most of our modes of analysis are inherently irreversible. Both analysis and synthesis are essential in natural philosophy. Modes of synthesis would include the entailment of existence, immanent causation (cf. 5.18).
7.49 Towards an Age of Synthesis It must be emphasized that the progression of the three ages is a natural development. A tremendous amount of knowledge has been gained in each of the first two ages. What is causing scientists to consider that one must go beyond analysis is their dawning realization that upward causality contains no explanation for why complexity increases. Note that “complex forms emerge” is not an answer: emergence is a descriptive concept, not an explanatory one.
The accumulation of experience and knowledge must, and will, continue, of course. But knowledge should not be the only emphasis that is passed on from generation to generation, from age to age. When one is only given the analytic fragments to assemble, one is told what one is supposed to know. Wisdom may be defined as ‘experience and knowledge together with the power of applying them critically or practically’. In short, wisdom is knowledge applied. It is about how to think, instead of what to think. The crux is the discovery, not the prescription. The difference between analysis and synthesis is analogous to that between knowledge and wisdom.
In the age of synthesis, one recognizes that the foundational feature of the universe is not matter but information — the interconnecting relationships and entailment patterns among matter. The world may be considered a hierarchy of systems that convey information, and the purpose of theory is to extract as much information from these systems as possible. One does not limit oneself to the analytic fragments; information has diverse sources. This ‘downward’ theory of causation frees science from the reductionistic project of forcing nature into a Newtonian mould. Note that nothing of substance in mechanistic science is lost: synthesis extends analysis but does not replace it. Analytic tools are necessary but simply not sufficient: to progress one needs synthetic tools. Towards an age of synthesis, what one must give up is the idea that science is a ‘bottom-up’ affair in which knowledge of a system’s parts determines knowledge of the system as a whole.
‘Analysis’ and ‘synthesis’ are examples of amphibolous words, those that bear two meanings that are diametrically opposed. As we saw, as expressions of the forms of models, synthetic is ‘specialized analytic’; but as scientific methods, synthetic is ‘generalized analytic’. The progressive generalization from analysis to synthesis in a biological context is essentially what the work of the Rashevsky-Rosen school of relational biology is about.
8 Simple Systems
Simulability
We have already met a member of the etymological family of simulacrum: simulation (4.10). Recall that a simulation of a process provides an alternate description of the entailed effects, whereas a model is a special kind of simulation that additionally provides an alternate description of the entailment structure of the mapping representing the process itself.
8.1 Simulation Revisited To recap, a simulation may be represented by the commutative element-chasing diagram
(1) [commutative diagram of a simulation]
The equality
$$\alpha(f(x)) = g(\alpha(x)) \tag{2}$$
may be summarized by the entailment
$$\alpha : \{f, x, y\} \mapsto \{g, \alpha(x), \alpha(y)\} . \tag{3}$$
The important fact to note here is that simulation converts the efficient cause f , the processor of that being simulated, into material cause of the simulator α , and the ‘simulated processor’ g becomes part of the ‘effect’. In particular, the process-flow distinction in the original mapping f is lost, and all become ‘flow’ in the simulation. In a model, the processor f itself is mapped to a processor α ( f ) , and the commutative element-chasing diagram is
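(4)   [Commutative element-chasing diagram of the model; figure not reproduced.]
Here the equality
(5)   $\alpha(f(x)) = \alpha(f)(\alpha(x))$
may be summarized by the functorial entailment
(6)   $\alpha : (f : x \mapsto y) \mapsto (\alpha(f) : \alpha(x) \mapsto \alpha(y))$.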
Note that the ‘modelled processor’ α ( f ) preserves the entailment pattern (in particular, the process-flow distinction) of the original efficient cause f . Stated otherwise, a model maps efficient causes to efficient causes.
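The contrast between equalities (2) and (5) can be made concrete with a small Python sketch. Everything in it is an illustrative assumption rather than part of the formal development: the encoding `encode` plays the role of α on elements, `f` is an arbitrary processor, `simulate` treats f purely as data (a table of input-output pairs), and `model` sends the processor f to another processor α(f).

```python
from typing import Callable, Dict

# All names and concrete choices below are illustrative assumptions.

def encode(x: int) -> str:
    """alpha acting on elements: an arbitrary encoding of integers as strings."""
    return f"<{x}>"

def decode(s: str) -> int:
    """Inverse of encode, used only to build alpha(f)."""
    return int(s[1:-1])

def f(x: int) -> int:
    """The original processor (efficient cause)."""
    return 2 * x + 1

def simulate(f_table: Dict[int, int]) -> Dict[str, str]:
    """Simulation: f enters only as data (input/output pairs); the result g is again data."""
    return {encode(x): encode(y) for x, y in f_table.items()}

def model(h: Callable[[int], int]) -> Callable[[str], str]:
    """Model: alpha sends the processor h to a processor alpha(h)."""
    return lambda s: encode(h(decode(s)))

domain = range(5)
g = simulate({x: f(x) for x in domain})   # a lookup table: everything has become 'flow'
alpha_f = model(f)                        # still a processor

simulation_commutes = all(encode(f(x)) == g[encode(x)] for x in domain)   # equality (2)
model_commutes = all(encode(f(x)) == alpha_f(encode(x)) for x in domain)  # equality (5)
```

Both diagrams commute on the chosen domain, but only `alpha_f` is callable: in the simulation the processor survives only as a table of effects, whereas the model retains a processor in the image.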
8.2 Definition An algorithm is a process with the following attributes: (i) it terminates after a finite number of steps; (ii) each step is unambiguously defined; (iii) it has zero or more input data; (iv) it has one or more output data; and (v) it must be effective, which means there must be a Turing-machine equivalent; i.e., the process must be evaluable by a mathematical (Turing) machine.
An algorithm, therefore, is a computation procedure that requires in its application a rigid stepwise mechanical execution of explicitly stated rules. It is presented as a prescription, consisting of a finite number of instructions. It may be applied to any number (including none) of members of a set of possible inputs, where each input is a finite sequence of symbolic expressions. Once the inputs have been specified, the instructions dictate a succession of discrete, simple operations, requiring no recourse to chance and ingenuity. The first operation is applied to the input and transforms it into a new finite sequence of symbolic expressions. This outcome is in turn subjected to a second operation, dictated by the instructions of the algorithm, and so on. After a finite number of steps, the instructions dictate that the process must be discontinued, and some outputs be read off, in some prescribed fashion, from the outcome of the last step.
8.3 Definition A mapping is simulable if it is definable by an algorithm.
If a mapping (representing a processor or efficient cause) is simulable, then a simulation (in the sense of diagram (1) above) exists. Note, however, the technical Definition 8.3 is somewhat more specific, and is predicated on the term ‘algorithm’. In turn, the definition of ‘algorithm’ depends on other computing-theoretic terms: unambiguous, effective, and ‘evaluable by a mathematical (Turing) machine’. ‘Simulable’ is also called computable and algorithmically definable. There are fine nuances that distinguish these near-synonymous terms.
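As a familiar instance of Definition 8.2, Euclid's algorithm may be sketched in Python. The function name `gcd_with_trace` and the step counter are assumptions made for illustration; the point is only that each step is unambiguous, the inputs and output are finite symbolic data, and termination after finitely many steps is guaranteed because the second argument strictly decreases.

```python
def gcd_with_trace(a: int, b: int):
    """Euclid's algorithm on non-negative integers; returns (gcd, number of steps)."""
    if a < 0 or b < 0:
        raise ValueError("inputs must be non-negative")
    steps = 0
    while b != 0:            # the new b is a % b < b, so the loop must terminate
        a, b = b, a % b      # one discrete, unambiguous operation
        steps += 1
    return a, steps

result = gcd_with_trace(48, 18)   # (48, 18) -> (18, 12) -> (12, 6) -> (6, 0)
```

The mapping (a, b) ↦ gcd(a, b) is therefore simulable in the sense of Definition 8.3.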
I will not repeat here the thorough discussion on simulation given in Chapter 7 of LI. Readers interested in pursuing the computing-theoretic aspects will also find an excellent exposition in Kercel’s paper “Entailment of Ambiguity” [2007]. For our purpose, we only need to note these three simple
8.4 Properties If a mapping is simulable, then
(a) its corresponding Turing machine halts after a finite number of steps;
(b) its corresponding algorithmic process is of finite length; and
(c) its corresponding program, which may be considered as a word built of the alphabets of its Turing machine, is of finite length.
The keyword is finite.
A formal system, an object in the universe of mathematics, may be considered a collection of mappings connected by the system’s entailment pattern (i.e., its graph, which may itself be considered a mapping). (We studied this in categorical detail in the previous chapter.) So one may extend Definition 8.3 and give the
8.5 Definition A formal system is simulable if its entailment pattern and all of its mappings are simulable.
Note that any formal system (i.e., any mathematical structure) may or may not be the model of something, and it may or may not be simulable. If a formal system is simulable, then its entailment pattern and mappings may be replaced by their corresponding simulations, so that all processes and flows become just ‘flow’ (material causes and effects) for the simulator. Remember that a simulation keeps track of the ‘inputs’ and ‘outputs’, but the processors inside the ‘black boxes’ (i.e. the transfer functions) may be lost, unless the simulation is in fact a model.
8.6 Finitude Redux Because of the finiteness predicated in Properties 8.4, a simulable formal system must have only finitely many observables (since, evidently, an infinite number of nonequivalent mappings requires an infinite number of algorithms). Note, however, this finitude is a necessary, but not sufficient, condition for simulability: it is the entailment pattern of the finitely many mappings, together with the possible computability of the mappings themselves, that determines whether or not the formal system is simulable.
While a model is a special kind of simulation, a ‘simulable model’ is not a tautology. This is because the requirement for a formal system to be ‘simulable’ is more than ‘a simulation exists’. Stated otherwise, a simulable model is a model (of a natural system) for which the entailment pattern and all mappings are definable by algorithms. Recall (Axiom 7.20(b)) that a model has finitely many mappings, so it already satisfies this finitude necessity for simulability. Finitude is a crucial property of computing (and a standard ingredient of the definition of computability) since the pioneering work of Turing. Before I leave the subject altogether, let me just illustrate with one quotation. In Kleene’s Introduction to Metamathematics [1952], he began Chapter XIII on “Computable Functions” (i.e. our ‘simulable mappings’ of Definition 8.3) thus: § 67. Turing machines. Suppose that a person is to compute the value of a function for a given set of arguments by following preassigned effective instructions. In performing the computation he will use a finite number of distinct symbols or tokens of some sort. He can have only a finite number of occurrences of symbols under observation at one time. He can also remember others previously observed, but again only a finite number. The preassigned instructions must also be finite. Applying the instructions to the finite number of observed and remembered symbols or tokens, he can perform an act that changes the situation in a finite way, e.g. he adds or erases some occurrences of symbols, shifts his observation to others, registers in his memory those just observed. A succession of such acts must lead him from a symbolic expression representing the arguments to another symbolic expression representing the function value. There are six occurrences of the word finite in this one paragraph.
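Kleene's description translates almost line by line into a program. The following is a minimal sketch, not a definitive formalization: the function `run_turing_machine`, the rule-table format, the blank symbol `_`, and the step budget are all assumptions chosen for illustration. The rule table is finite, the set of symbols is finite, and each act changes the tape in a finite way.

```python
from typing import Dict, Tuple

Rules = Dict[Tuple[str, str], Tuple[str, str, int]]  # (state, symbol) -> (state, write, move)

def run_turing_machine(rules: Rules, tape: str, state: str = "start",
                       blank: str = "_", max_steps: int = 1000):
    """Run a one-tape machine until it reaches 'halt'; return (tape contents, steps)."""
    cells = dict(enumerate(tape))          # finitely many occupied cells
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:             # guard for this sketch only
            raise RuntimeError("step budget exhausted")
        symbol = cells.get(head, blank)
        state, write, move = rules[(state, symbol)]
        cells[head] = write                # a finite change to the situation
        head += move
        steps += 1
    lo, hi = min(cells), max(cells)
    contents = "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)
    return contents, steps

# Hypothetical example machine: unary successor (append one more '1').
SUCCESSOR: Rules = {
    ("start", "1"): ("start", "1", +1),    # skip over the existing strokes
    ("start", "_"): ("halt", "1", 0),      # write one more stroke and stop
}

output = run_turing_machine(SUCCESSOR, "111")
```

On the input `111` the machine scans three strokes, writes a fourth, and halts after four steps, in accordance with Properties 8.4.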
Impredicativity
Let N be a natural system. Recall (Definition 7.27) that C(N) denotes the collection of all models of N, and (Theorem 7.28) that it is a lattice. I emphasized in the previous chapter (7.24 and 7.30) the fact that for a general natural system N, its lattice of models C(N) does not necessarily have a greatest element, whence N does not necessarily have a largest model. In this chapter I study a special class of natural systems that do have largest models.
Let M ∈ C(N); i.e., let M be a model of the natural system N. Let us consider what shape the inferential entailment pattern of its relational diagram in graph-theoretic form (necessarily a finite graph, Axiom 7.20(c)) would have to take to make M simulable. It is clear that a tree of sequential compositions (6.13) of simulable mappings is simulable. Closed paths of material causation (6.14) abound, and they are also all simulable. It is standard that the output of a step in an algorithm be the input of the next sequential step. An algorithm would be quite cumbersome if it contained no closed causal loops of this type: the program length would otherwise increase proportionally with the number of computation steps. It is also obvious that a closed path with exactly one efficient cause (6.15) is simulable. Thus one may conclude with the
8.7 Theorem A model without hierarchical cycles is simulable.
An iteration of ‘efficient cause of efficient cause’ is inherently hierarchical, in the sense that a lower-level efficient cause is contained within a higher-level efficient cause; e.g. H(X, H(A, B)) is at a higher hierarchical level than H(A, B). A closed path of efficient causation must form a hierarchical cycle of containment. Both the hierarchy of containment and the cycle are essential attributes of this closure. In formal systems, hierarchical cycles are manifested by impredicativities (i.e., entailed ambiguities). In other words, a hierarchical cycle is an impredicative cycle of inferential entailment.
8.8 Predicate In logic, the predicate is what is said or asserted about an object. It can take the role as either a property or a relation between entities. Thus predicate calculus is the type of symbolic logic that takes into account the contents (i.e., predicate) of a statement. The defining property p ( x ) in
(7)   $P = \{ x \in U : p(x) \}$
(cf. Axiom of Specification, 0.19) is an example of a predicate, since it asserts unambiguously the property that x must have in order to belong to the set P . 8.9 Self-Referencing Contrariwise, a definition of an object is said to be impredicative if it invokes (mentions or quantifies over) the object itself being defined, or perhaps another set which contains the object being defined. In other words, impredicativity is the property of a self-referencing definition.
As an example, consider the definition of supremum (cf. Definition 1.27). Let ≤ be a partial order on a set X and let A ⊂ X . The subset A is bounded above if there exists x ∈ X such that a ≤ x for all a ∈ A ; such x ∈ X is called an upper bound for A . An upper bound x for A is called the supremum for A if x ≤ y for all upper bounds y for A . Stated otherwise, (8)
$x = \sup A \iff x \le y \text{ for all } y \in Y$
where $Y = \{ y \in X : y \text{ is an upper bound for } A \}$. Note that the definition invokes the set $Y$ and the supremum $x \in Y$, whence the definition of ‘supremum’ is impredicative. Impredicative definitions usually cannot be bypassed, and are mostly harmless. But there are some that lead to paradoxes. The most famous problematic impredicative construction is Russell’s paradox, which involves the set of all sets that do not contain themselves:
(9)   $\{ x : x \notin x \}$.
This foundational difficulty is only avoided by the restriction to a naive set-theoretic universe that explicitly prohibits self-referencing constructions. A formal definition of impredicativity may be found in Kleene’s Introduction to Metamathematics [1952]:
8.10 Impredicative Definition “When a set M and a particular object m are so defined that on the one hand m is a member of M, and on the other hand the definition of m depends on M, we say that the procedure (or the definition of m, or the definition of M) is impredicative. Similarly, when a property P is possessed by an object m whose definition depends on P (here M is the set of the objects which possess the property P). An impredicative definition is circular, at least on its face, as what is defined participates in its own definition.”
So we see that the distinguishing feature of impredicativity is the self-referencing, cyclic constraint
(10)   [Diagram: m and M joined in a cycle; figure not reproduced.]
(cf. the two-mapping hierarchical cycle in 6.22). This is, of course, precisely the same defining feature of hierarchical cycles.
8.11 Deadlock A deadlock is a situation wherein competing actions are waiting for one another to finish, and thus none ever does. It is thus a relational analogue of impredicativity. The most famous example of deadlock is ‘the chicken or the egg’ paradox. Another example is the following statute supposedly passed by the Kansas Legislature: “When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone.”
In computer science, deadlock refers to a specific condition when two or more processes are each waiting for another to release a resource, or more than two processes are waiting for resources in a circular chain. Implementation of hierarchical cycles (or attempts to execute ambiguous codes in general) will lead a program to either a deadlock or an endless loop. In either case the program does not terminate. This is practical verification from computing science of the inverse of Theorem 8.7: a hierarchical cycle (i.e. a cycle in which two or more compositions are hierarchical, a model of a closed path of efficient causation) is different in kind from the other patterns of entailment considered above; it is not simulable. I state this formally as
8.12 Theorem A formal system that contains a hierarchical cycle is not simulable.
Thus equivalently one also has:
8.13 Theorem If an impredicative cycle of inferential entailment exists in a formal system, then it is not simulable.
‘Practical verification’ contributes to scientific ‘proofs’, but is, however, not a mathematical proof. The rest of this chapter will be a detailed examination of properties of simulable models, culminating in the proof of (an equivalent form of) Theorems 8.12 and 8.13.
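The ‘practical verification’ invoked above can be illustrated, by analogy only, with a Python sketch. Assume a hypothetical table `requires` in which each processor lists the processors that must be constructed before it; the function `evaluation_order` (an invented name) looks for an order in which everything can be built step by step. For a tree of sequential compositions such an order exists; for a two-element cycle, in which each processor is waiting on the other, none does. Note that the check itself is an ordinary algorithm on a finite graph; what fails is the stepwise construction it is asked to schedule.

```python
from typing import Dict, List, Optional

def evaluation_order(requires: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return an order that builds every item after its prerequisites, or None if circular."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNSEEN for name in requires}
    order: List[str] = []

    def visit(name: str) -> bool:
        if state[name] == IN_PROGRESS:     # we are waiting on ourselves: a closed loop
            return False
        if state[name] == DONE:
            return True
        state[name] = IN_PROGRESS
        for dep in requires[name]:
            if not visit(dep):
                return False
        state[name] = DONE
        order.append(name)
        return True

    for name in requires:
        if not visit(name):
            return None
    return order

# Hypothetical examples: a chain of compositions, and a two-mapping cycle.
chain = {"h": ["g"], "g": ["f"], "f": []}
cycle = {"f": ["g"], "g": ["f"]}

chain_order = evaluation_order(chain)
cycle_order = evaluation_order(cycle)
```

The chain is scheduled as `f`, `g`, `h`, while the cycle yields `None`: no finite prescription can say which of the two processors to construct first.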
Limitations of Entailment and Simulability
Let $\mathbf{N}$ be the set of all natural systems. Let $t(N)$ be the property ‘there is no closed path of efficient causation in $N$’ (recall from Definition 6.16 that this means there are no closed paths that contain two or more efficient causes), and let
(11)   $T = \{ N \in \mathbf{N} : t(N) \}$
(see the Axiom of Specification 0.19 in the Prolegomena for a review of the notation). This is the
8.14 Definition A natural system is in the subset $T$ of $\mathbf{N}$ if and only if it has no closed path of efficient causation.
8.15 In the Realm of Formal Systems Let $M \in C(N)$; i.e., let $M$ be a model of the natural system $N$. Let $\iota(M)$ be the property ‘there is a hierarchical cycle in $M$’, and $\pi(M)$ be the property ‘$M$ is simulable’. Theorem 8.7 says a formal system without hierarchical cycles is simulable. Thus, in particular,
(12)   $\forall M \in C(N)\quad \neg\iota(M) \Rightarrow \pi(M)$.
The equivalent contrapositive of statement (12) is
(13)   $\forall M \in C(N)\quad \neg\pi(M) \Rightarrow \iota(M)$,
which says
8.16 Corollary A model that is not simulable must contain a hierarchical cycle.
8.17 Implications in the Realm of Natural Systems If a natural system has no closed path of efficient causation, then none of its models can have hierarchical cycles (Lemma 6.18), so (by Theorem 8.7) all of its models are simulable. Therefore, $\forall N \in \mathbf{N}$,
(14)   $t(N) \Rightarrow [\forall M \in C(N)\ \neg\iota(M)] \Rightarrow [\forall M \in C(N)\ \pi(M)]$.
Stated otherwise,
(15)   $\forall N \in T\quad \forall M \in C(N)\quad \neg\iota(M) \wedge \pi(M)$.
Now let $s(N)$ be the property ‘all models of $N$ are simulable’. Thus
(16)   $s(N) = \forall M \in C(N)\ \pi(M)$.
Let
(17)   $S = \{ N \in \mathbf{N} : s(N) \}$.
One therefore, in view of (14) and (15), has
(18)   $T \subset S$,
and the following characterization of members of $T$:
8.18 Lemma All models of a member of $T$ are simulable.
Lemma 8.18 may alternatively be stated as
8.19 Lemma If a natural system has no closed path of efficient causation, then all of its models are simulable.
That is,
(19)   $\forall N \in \mathbf{N}\quad t(N) \Rightarrow s(N)$.
Note that the property $t(N)$ is a characterization of a natural system in the natural domain, whereas the property $s(N)$ characterizes a natural system through its models in the formal domain. The implication (19) is a quintessential property of the functorial encoding in a modelling relation. Lemma 8.19 proclaims the consequences in the universe of formal systems, when a natural system has no closed path of efficient causation, in terms of limitations on its models. It is a very important fact that the converse of the lemma is also true. That is, one also has
(20)   $\forall N \in \mathbf{N}\quad s(N) \Rightarrow t(N)$,
and
(21)   $S \subset T$,
stated explicitly as the
8.20 Theorem If all models of a natural system N are simulable, then there can be no closed path of efficient causation in N . Theorem 8.20 says that certain modes of entailment are not available to a natural system when all its models are simulable. This is the most important theorem in LI, and as such, most far-reaching and hence most controversial. Chapter 9 of LI is a detailed apagogical argument that proves it, albeit in the ‘illustrative’ Rosen form. Towards the end of this chapter, I shall provide an alternate proof using the mathematical tools we have accumulated heretofore in this monograph. But before we get there, let us discover more of what the property ‘all models are simulable’ entails.
The Largest Model
For the remainder of this chapter, unless otherwise stated, let $N$ denote a natural system all models of which are simulable; i.e., $N \in S$.
8.21 Lemma The lattice $C(N)$ satisfies the ascending chain condition.
PROOF I shall show that a strictly increasing chain in $C(N)$ must terminate, whence the equivalence in Lemma 3.34 entails the desired conclusion. Consider
(22)   $M_1 < M_2 < M_3$ [...]
(24)   [...] $M_3 > \cdots > M_k > \cdots$
in C ( N ) . Each M k must have a program of finite length, with the length a decreasing function of k . There are only finitely many programs of a fixed finite length (cf. Property 8.4(c)), and one obviously cannot have a strictly decreasing infinite sequence of natural numbers (cf. the method of infinite descent, 3.36); thus the strictly decreasing chain must be finite, i.e. terminate. □ By Lemma 3.45, one has the following important theorem for a natural system N with all simulable models:
8.26 Theorem Every element of $C(N)$, i.e. every model of $N$, can be expressed as a join of a finite number of join-irreducible elements.
The set $\{ M_i^{\min} \}$ of all join-irreducible elements of $C(N)$, i.e. minimal models of $N$, must be finite: otherwise the strictly ascending chain
(25)   $M_1^{\min} < M_1^{\min} \vee M_2^{\min} < M_1^{\min} \vee M_2^{\min} \vee M_3^{\min}$ [...]
$t$. Thus, any observable $f[m(t)]$ serves as a predictor for the behaviour of some corresponding observable of $S$ at that later instant. I shall now allow $M$ and $S$ to be coupled; i.e. allow them to interact in specific ways. For the simplest model, one may simply allow the output $f[m(t)]$ to be an input to the system $S$. This then creates a situation in which a future state $s(h(t))$ of $S$ is controlling the present state transition in $S$. But this is precisely what is characterized above as anticipatory behaviour.
It is clear that the above construction does not violate causality; indeed, we have invoked causality in an essential way in the concept of a predictive model, and hence in the characterization of the system $M$. Although the composite system $(M + S)$ is completely causal, it nevertheless will behave in an anticipatory fashion. Similarly, we may construct a system $M$ with outputs that embody predictions regarding the inputs $a(t)$ to the system $S$. In that case, the present change of state of $S$ will depend upon information pertaining to future inputs to $S$. Here again, although causality is in no sense violated, our system will exhibit anticipatory behaviour.
From the above remarks, we see that anticipatory behaviour will be generated in any system that: (i) contains an internal predictive model of itself and/or of its environment; and (ii) is such that its dynamical law uses the predictions of its internal model in an essential way. From this point of view, anticipatory systems can be viewed as a special class of adaptive control systems.
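Conditions (i) and (ii) can be seen at work in a toy simulation. The following Python sketch is an invented illustration, not a construction from AS: a tank with a fixed `capacity` receives an inflow, and a drain may be opened at each step. The reactive controller consults only the present level; the anticipatory controller consults a crude internal model (a persistence forecast, assumed here, that the current inflow continues for `horizon` steps) and acts on the predicted level. The whole program is causal, yet the second controller's present action depends on a predicted future state.

```python
def run(inflow, anticipatory: bool, horizon: int = 3,
        capacity: float = 10.0, drain: float = 2.0) -> int:
    """Simulate the tank and return the number of steps spent above capacity."""
    level, overflows = 0.0, 0
    for a in inflow:
        if anticipatory:
            predicted = level + horizon * a     # internal model M: persistence forecast
            open_drain = predicted > capacity   # present action uses the prediction
        else:
            open_drain = level > capacity       # reactive: present state only
        level = max(0.0, level + a - (drain if open_drain else 0.0))
        if level > capacity:
            overflows += 1
    return overflows

# Hypothetical inflow record: a calm period, a surge, then calm again.
inflow = [1.0, 1.0, 3.0, 3.0, 3.0, 3.0, 1.0, 1.0]

reactive_overflows = run(inflow, anticipatory=False)
anticipatory_overflows = run(inflow, anticipatory=True)
```

With these numbers the reactive controller spends three steps above capacity, while the anticipatory one opens the drain as soon as the surge begins and never overflows.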
There are many other modes of coupling, discussed in AS, which will allow S to affect M , and which will amount to updating or improving the model system M on the basis of the activity of S . I shall for the present example suppose simply that the system M is equipped with a set E of effectors that operate either on S itself or on the environmental inputs to S , in such a way as to change the dynamical properties of S . We thus have a situation of the type in the diagram, formulated as an input-output system.
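(8)   [Block diagram of the anticipatory system as an input-output system, with model M, effectors E, and object system S; figure not reproduced.]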
The abstract formalism is how we describe the predictive model, not an anthropomorphic imputation that the anticipatory system S itself, which may after all be a primitive organism, has to somehow formulate M and E . While it is true that without a modeller, there is no modelling, there may not be intentional effort involved on the modeller’s part. Let me use the same symbols for the object, model, and effector systems, respectively S , M , and E , to denote their efficient causes. In other words, let each symbol represent the processor associated with the block (the ‘black box’) as well as the block itself. If one traces the path of an input element a , the diagram of the anticipatory system (8) becomes
(9)   [Element-chasing diagram tracing the input $a$ through $M$, $E$, and $S$; figure not reproduced.]
and the corresponding output $b$ will be of the form
(10)   $b(t) = S\left[ a(t),\, E(a(t)),\, E(M(a(t))) \right]$,
which is the anticipatory analogue to the reactive relation (6). Within such generality, it is easy to see that it is possible to define many different time scales. 10.14 Errors A natural system is almost always more than any model of it. In other words, a model is, by definition, incomplete. As a consequence, under appropriate circumstances, the behaviour predicted by a model will diverge from that actually exhibited by the system. This provides the basis for a theory of error and system failure on the one hand, and for an understanding of emergence on the other. It is crucial to understand this aspect in any comprehensive theory of control based on predictive models.
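The composition in (10), and the way an incomplete model produces error, can both be sketched in a few lines of Python. The particular functions below are assumptions chosen only so that the arithmetic is easy to follow: `M` extrapolates a rising trend in the input, `E` is a linear effector, and `S` simply sums its three arguments.

```python
def M(a: float) -> float:
    """Hypothetical predictive model: expects the input to rise by one unit."""
    return a + 1.0

def E(signal: float) -> float:
    """Hypothetical effector: counteracts half of whatever signal it receives."""
    return -0.5 * signal

def S(a: float, e_now: float, e_pred: float) -> float:
    """Hypothetical object system: responds to the input and to both effector actions."""
    return a + e_now + e_pred

def b(a: float) -> float:
    """Output of the anticipatory system, following the form of (10)."""
    return S(a, E(a), E(M(a)))

# An input record in which the rising trend holds at first and then breaks.
inputs = [1.0, 2.0, 3.0, 3.0, 2.0]
prediction_errors = [abs(M(inputs[i]) - inputs[i + 1]) for i in range(len(inputs) - 1)]

output_at_2 = b(2.0)
```

While the trend holds, the prediction error is zero; once the input stops rising, the model's predictions diverge from what actually happens, although the system continues to act on them.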
Anticipation can fail in its purpose. A study of how planning can go wrong is illustrative; indeed the updating of models from lessons learned is the essence of an anticipatory system. Rosen discussed errors in anticipation in Section 1.1 of AS, and also in Rosen [1974], to which the reader is referred for details. I shall only give a brief summary here. The causes of errors in anticipation may be categorized into (i) bad models, (ii) bad effectors, and (iii) side effects.
A bad model can result from technical, paradigmatical, or state-correspondence errors, all due to improper functorial imaging of mappings. In short, faulty encodings lead to faulty models. A proper choice of the internal predictive model M and the fine tuning of its updating processes are evidently crucial to an anticipatory system’s success.
An effector E is defective when it is incapable of steering S, when it cannot appropriately manipulate the state variables, or simply when it fails to react accordingly to the information from M. Thus the careful construction of an anticipatory system also depends on the selection, design, and programming of the effector system E, as well as on the partitioning of the ‘desirable’ and ‘undesirable’ regions of response.
Side effects arise because, essentially, structures have multiple functions and functions may be carried out by multiple structures. Combined with the fact of incomplete models, the consequence is that, in general, an effector E will have additional effects on S to those planned, and the planned modes of interaction between E and S will be modified by these extraneous effects.
The diagnosis and treatment of erroneous anticipatory systems are frequently analogous to the procedures used in neurology and psychology.
We may further ask, how does a system generate predictive models? On this point we may invoke some general ontogenic principles, by means of natural selection, to achieve some understanding (see Chapter 13). And finally, given a system that employs a predictive model to determine its present behaviour, how should we observe the system so as to determine the nature of the model it employs? This last question raises fundamental and new questions of importance to the empirical sciences, for it turns out that most of the observational techniques we traditionally employ actually destroy our capability to make such underlying models visible.
Just think how often one kills an organism to study its living processes. Rosen often joked that ‘molecular biology’ is an oxymoron.
Lessons from Biology 10.15 Examples The conscious generation and deployment of predictive models for the purpose of control is one of the basic intuitive characteristics of intelligence. However, precisely the same type of model-based behaviour appears constantly at lower levels of biological organization as well. For instance, many simple organisms are negatively phototropic; they tend to move away from light. Now darkness in itself is physiologically neutral; it has no intrinsic biological significance (at least for non-photosynthetic organisms). However, darkness tends to be correlated with other characteristics that are not physiologically neutral, such as moisture and the absence of sighted predators. The tropism can be regarded biologically as an exploitation of this correlation, which is in effect a predictive model about the environment. Likewise, the autumnal shedding of leaves and other physiological changes in plants, which are clearly an adaptation to winter conditions, are not cued by ambient temperature, but rather by day length. There is an obvious correlation between the shortening day, which again is physiologically neutral in itself, and the subsequent appearance of winter conditions, which again constitutes a predictive model exploited for purposes of adaptive control. Innumerable other examples of such anticipatory preadaptation can be found in the biosphere, ranging from the simplest of tropisms to the most complex hormonal regulatory processes in physiology. 10.16 To Do or Not to Do The behaviours exhibited by an anticipatory system will be largely determined by the nature of its internal models. Understanding such a system means knowing its models. From this viewpoint, the modelling relation is not simply established between us and a system; it is also established between the system and itself. 
When viewed in this perspective, many of the baffling problems posed by complexity, as manifested especially in the behaviours of organisms and societies, appear in an entirely new light. For instance, the employment of a predictive model for control purposes brings with it an almost ethical aspect to the system’s behaviour. There is an avoidance of certain future states as bad or undesirable; a tendency towards others as good or favourable. Such a system will behave as if it knew the meaning of the word ‘ought’. Further, the
availability of several models always raises the possibility that they will predict distinct futures, and hence invoke incompatible responses. Such conflict can arise within a single system, or it can arise between systems; from this perspective, the resolution of conflict consists in an adjustment of the models giving rise to it. As a final remark, we may note that the intrinsic limitations of models, which arise from the fact that in complex systems they must be abstractions, themselves give rise to behaviours which can be characterized as senescent — maladaptations that grow in time, without any localizable structural cause. In dealing with these issues, properties of biological systems will provide crucial insights. Robert Rosen was fond of saying that the first lesson to be learned from biology is that there are lessons to be learned from biology. Biology, incidentally, is the study of special types of adaptive anticipatory systems, and not the memorization of the names of structural parts and biochemical molecules that passes for the subject in schools. Biology provides us with existence proofs, and specific examples, of cooperative rather than competitive activities on the part of large and diverse populations. Indeed, considered in an evolutionary context, biology represents a vast encyclopaedia of how to effectively solve complex problems; and also of how not to solve them. Biology is the science of the commonality of relations; and relationships contain the essential meaning of life. These insights represent natural resources to be harvested, resources perhaps even more important to our ultimate survival than the more tangible biological resources of food and energy. But to reap such a harvest, we need to fabricate proper tools. It is my belief that the conceptions of nature arising from relational biology will help us learn how to make it so.
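An Anticipatory System is Complex
10.17 Anticipatory Cycle The entailment diagram for the anticipatory system (10) is
(11)   [Entailment diagram of the anticipatory system, showing the cycle among S, M, and E; figure not reproduced.]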
The map ε : S → M, completing the hierarchical cycle, is the encoding of the object system S into its model M. The entailment of the three maps {M, E, S} in cyclic permutation, i.e., the hierarchical cycle {S ⊢ M, E ⊢ S, M ⊢ E}, renders this anticipatory system complex. Note that an anticipatory system has more structure in its entailment pattern than the existence of the cycle (11). What characterizes an anticipatory system is that its maps have a feedforward aspect. Also, the model-updating map E ⊢ M is not necessarily present in a hierarchical cycle of a complex system. Stated otherwise, an anticipatory system is a special kind of complex system.
10.18 Corollary Rosen concluded AS with:
[One] final conceptual remark is also in order. As we pointed out above, the Newtonian paradigm has no room for the category of final causation. This category is closely tied up with the notion of anticipation, and in its turn, with the ability of systems to possess internal predictive models of themselves and their environments, which can be utilized for the control of present actions. We have argued at great length above that anticipatory control is indeed a distinguishing feature of the organic world, and developed some of the unique features of such anticipatory systems. In the present discussion, we have in effect shown that, in order for a system to be anticipatory, it must be complex.
Thus, our entire treatment of anticipatory systems becomes a corollary of complexity. In other words, complex systems can admit the category of final causation in a perfectly rigorous, scientifically acceptable way. Perhaps this alone is sufficient recompense for abandoning the comforting confines of the Newtonian paradigm, which has served us so well over the centuries. It will continue to serve us well, provided that we recognize its restrictions and limitations as well as its strengths. The corollary, as I demonstrated with the anticipatory cycle 10.17, is: 10.19 Theorem An anticipatory system must be complex; a complex system may (or may not) be anticipatory.
In other words, the collection $A$ of anticipatory systems is a proper subset of the collection $I$ of complex systems:
(12)   $A \subset I$ but $A \neq I$.
Thus, one has, in view of Axiom 10.4 and Theorem 10.19, the hierarchy:
(13)   $O \subset A \subset I$,
in which both inclusions are proper. Thus
(14)   [Figure not reproduced.]
11 Living Systems
Hierarchical order is found both in “structures” and in “functions”. In the last resort, structure (i.e. order of parts) and function (order of processes) may be the very same thing: in the physical world matter dissolves into a play of energies, and in the biological world structures are the expression of a flow of processes. — Ludwig von Bertalanffy (1968) General System Theory
A Living System is Complex Rosen created the theory of anticipatory systems as a stepping-stone towards the ultimate goal of the characterization of life. There is no question that the subject of Chapter 10, anticipation itself, just as complexity described in Chapter 9, is of independent interest, far-reaching, and tremendously worthy of study. It must, however, be remembered that the raison d’être of biology, hence of our relational approach to the subject, is life itself. The final causes of complexity and anticipation, in this regard, are the zeroth step and the first step, respectively, in the tightening of the necessary conditions that define life.
The important conclusion one draws from the proper inclusions
(1)   $O \subset A \subset I$
that conclude the previous chapter, that a living system is anticipatory and an anticipatory system is complex, is
(2)   $O \subset I$ but $O \neq I$.
In short, Rosen’s Conjecture 9.8 may now be restated as
11.1 Rosen’s Theorem An organism must be complex; a complex system may (or may not) be an organism.
11.2 “Nothing comes from nothing.” In any deductive system of logic, after the definition of the primitive terms, one must begin with a collection of statements that by nature cannot be proven. These are the axioms of the theory. The Greek word ἀξίωμα means ‘that which is deemed worthy; self-evident principle’. Rosen’s Theorem 11.1 ( O ⊂ I ) is deduced from the postulated Axiom of Anticipation 10.4 ( O ⊂ A ) and the proven Theorem 10.19 ( A ⊂ I ).
As I explained in the Preface and in Chapter 4, modelling is more an art than a science. A major part of this art of modelling is the careful choice of what the axioms of the theory are. For natural systems, all one has are perceptions and interpretations: one has to begin somewhere; not everything can be mathematically proven. I have chosen, in my formulation of relational biology that is this monograph, to state as axiom the more self-evident “life is anticipatory”. The less self-evident “life is complex” then follows as a theorem. While the Axiom of Anticipation, the statement “life is anticipatory”, cannot be mathematically proven, it may well be considered as being scientifically ‘proven’: there is plainly abundant evidence for it. That is the best one can do with scientific ‘proofs’ and biological ‘laws’. It has been said that the only law in biology is that there are exceptions to every possible ‘law’ in biology.
A living system is complex, whence inherits all properties of complex systems. As consequences of Theorems 9.2 and 9.3 and Lemma 9.4, therefore, one has the following
11.3 Theorem A living system must contain a closed path of efficient causation.
11.4 Corollary A living system must have a model that contains an impredicative cycle of inferential entailment.
11.5 Theorem In (the relational diagram of) a living system there is a cycle that contains two or more solid-headed arrows.
11.6 Theorem A living system must have noncomputable models.
11.7 Finale on Simulability Note that our (i.e., the relational-biology school’s) interest is in life itself. The only reason the issue of simulability makes an appearance in relational biology is that it turns out to be a derivative of what we have to say about living systems. Simulability is a not-insignificant scientific subject of investigation, but computing science is simply not congenial, and less interesting, to me personally. Biology-envy is the curse of computing science. In some languages, the term for ‘computer’ is ‘electric brain’. It may be argued that the Holy Grail of computing, its ultimate final cause, is to successfully model biological and cognitive processes. The point of the Turing Test, indeed, is to see whether a computing machine’s programming may be sophisticated enough to fool us into thinking that it may not be nonhuman. But what if life itself is, even in principle, nonsimulable? There are, of course, many things that mechanization by rote does better than life, in terms of speed, repeatability, precision, and so forth. On the other hand, a living system is driven by final causation, and may be characterized by its ability to handle ambiguities and take chances, indeed, its ability to err. These are precisely the processes that cannot, by definition (cf. Definition 8.2), be modelled algorithmically.
Algorithms can at best simulate. A simulation of life is, alas, not life itself. An impredicative cycle is, in particular, a representation of processes in the modelling relation such that there is no algorithm for using the representation itself to gain novel insight about the processes being modelled. The existence of the impredicative cycle is the lesson of complexity. In short, intuition, insight, and creativity are not computable.
(M,R)-Systems 11.8 Necessity versus Sufficiency In Chapter 1 of EL, after explaining that a living system must contain a closed path of efficient causation, hence must be complex, Rosen added: To be sure, what I have been describing are necessary conditions, not sufficient ones, for a material system to be an organism. That is, they really pertain to what is not an organism, to what life is not. Sufficient conditions are harder; indeed, perhaps there are none. If so, biology itself is more comprehensive than we presently know. And in the final, concluding Section 11H of LI, he wrote: But complexity, though I suggest it is the habitat of life, is not itself life. Something else is needed to characterize what is alive from what is complex. That “something else” is the ‘closure to efficient causation’ property found in (M,R)-systems, the subject of this chapter and the next. 11.9 The Next Step Robert Rosen introduced (M,R)-systems to the world in 1958, in his very first published scientific paper [Rosen, 1958]. They began as a class of metaphorical, relational paradigms that define cells. The M and R may very well stand for ‘metaphorical’ and ‘relational’ in modelling terms,
but they are realized as ‘metabolism’ and ‘repair’. The comprehensive reference is Rosen [1972]. Rosen subsequently discussed (M,R)-systems in many of his publications, notably in Section 3.5 of AS, Sections IV and V of NC, Section 10C of LI, and Chapter 17 of EL. The reader may refer to any or all of the above for further details. In Rosen [1971], he listed three basic kinds of problems arising in the study of (M,R)-systems: a. To develop the formal properties of such systems, considered in the abstract, and interpret them in biological terms; b. To determine the methods by which the abstract organization which defines the (M,R)-system may be realized in concrete terms; c. To determine whether a particular concrete biological system is in fact a realization of an (M,R)-system (i.e. to identify the functional components in a real biological system); this is basically the inverse problem to (b). And he wrote: Almost all of my published scientific work has arisen from a consideration of these three problems, although this is perhaps not always immediately apparent. This last statement is in fact as true today, when we study Rosen’s whole lifetime’s work, as it was when he wrote it in 1971. 11.10 Metabolism and Repair The simplest (M,R)-system may be represented by the diagram (3)
$A \xrightarrow{\;f\;} B \xrightarrow{\;\Phi\;} H(A, B)$.
The mapping $f$ represents metabolism, whence its efficient cause, an enzyme, with material input and output represented by the sets $A$ and $B$. Thus metabolism is a morphism $f \in H(A, B) \subset B^A$. Members of $H(A, B)$ are mappings that model metabolic processes, so clearly not all mappings in $B^A$ qualify; thus $H(A, B)$ is a proper subset of $B^A$. The element trace is
(4) $f : a \mapsto b$
with relational diagram
(5) [relational diagram]
and entailment diagram
(6) $f \vdash b$.
In form (3), the morphism $\Phi$ represents repair. Its codomain is $H(A, B)$, so it may be considered as a mapping that creates new copies of enzymes $f$, hence a gene that ‘repairs’ the metabolism function. In other words, repair is a morphism $\Phi$ with the prescribed codomain $H(A, B)$; i.e. $\Phi \in H(\bullet, H(A, B)) \subset H(A, B)^{\bullet}$. Repair in cells generally takes the form of a continual synthesis of basic units of metabolic processor (i.e. enzymes), using as inputs materials provided by the metabolic activities themselves. Stated otherwise, the domain of the repair map $\Phi$ is the codomain of metabolism $f$, its ‘output set’ $B$. Thus $\Phi \in H(B, H(A, B)) \subset H(A, B)^{B}$, and its element-chasing, relational, and entailment diagrams are, respectively,
(7) $\Phi : b \mapsto f$
(8) [relational diagram]
(9) $\Phi \vdash f$.
Metabolism and repair combine into the relational diagram
(10) [relational diagram]
and entailment diagram
(11) $\Phi \vdash f \vdash b$.
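The hierarchical entailment of metabolism and repair can be illustrated on small finite sets. The following is only a minimal sketch: the element names, the particular choice of the subset H(A, B), and the helper `apply` are assumptions made for illustration, not part of Rosen's formalism.

```python
from itertools import product

A = ("a1", "a2")
B = ("b1", "b2")

# Every mapping A -> B (the set B^A), stored as a tuple of (input, output) pairs.
all_maps = [tuple(zip(A, images)) for images in product(B, repeat=len(A))]

# H(A, B): the mappings admitted as metabolism; here an arbitrary proper subset.
f = (("a1", "b1"), ("a2", "b2"))
g = (("a1", "b2"), ("a2", "b2"))
H_AB = {f, g}

# Repair Phi : B -> H(A, B); each of its values is itself a metabolism map.
Phi = {"b1": f, "b2": f}

def apply(mapping, x):
    """Evaluate a mapping stored as (input, output) pairs at x."""
    return dict(mapping)[x]

b = apply(f, "a1")   # f entails b
f_again = Phi[b]     # Phi entails f
```

Here `Phi[b]` returns the very map `f` that produced `b`, which is the toy counterpart of the chain Φ entails f entails b.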
This hierarchical entailment is the essence of the simplest (M,R)-system in form (3): Φ is the repair component, f is the metabolism component, and b is the output of the metabolic activity. Note the adjective simplest I use for the (M,R)-system in form (3). This form already entails the impredicative cycle in (M,R)-systems. But a general (M,R)-system is actually a network of formal systems that are the metabolism and repair components (and their outputs). While form (3) may capture the
essence of all (M,R)-systems (and indeed it is possible in principle to abbreviate every abstract (M,R)-system to this simple form by making the sets and mappings involved sufficiently complex), one must, nevertheless, not lose sight of the network aspect of (M,R)-systems. An (M,R)-network, i.e., a network of (a necessarily finite family of) metabolism and repair components, contains many fine nuances of inferential entailment that are not reflected in the simplest model. I shall come back to the exploration of the very rich mathematical structure of (M,R)-networks in Chapter 13. 11.11 Replication What if the repair components themselves need repairing? New mappings representing replication (i.e. that serve to replicate the repair components) may be defined. A replication map must have as its codomain the hom-set H ( B, H ( A, B )) to which repair mappings Φ belong, so it must be of the form
(12) $\beta : Y \to H(B, H(A, B))$
for some set $Y$, whence the relational diagram
(13) [relational diagram]
For the convenience of iterative combination (for illustrative purposes), one may choose $Y = H(A, B)$; so (12) and (13) become
(14) $\beta : H(A, B) \to H(B, H(A, B))$
and
(15) [relational diagram]
The replication morphism (14) may be combined with the repair morphism Φ in (3) to give a new (M,R)-system from the old one; viz. (16)
$B \xrightarrow{\;\Phi\;} H(A, B) \xrightarrow{\;\beta\;} H(B, H(A, B))$,
which has the property that the ‘metabolic’ part of system (16) is the ‘repair’ part of system (3), and the ‘repair’ part of system (16) is the ‘replication’ part of system (3) (i.e. form (14)). Indeed, one may sequentially extend this formalism ad infinitum, the next system being (17)
$H(A, B) \xrightarrow{\;\beta\;} H(B, H(A, B)) \xrightarrow{\;\Xi\;} H(H(A, B), H(B, H(A, B)))$.
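The pattern behind forms (3), (16), and (17) is that each system X → Y → H(X, Y) is followed by Y → H(X, Y) → H(Y, H(X, Y)). As a purely symbolic sketch (the function names `hom` and `mr_tower` are hypothetical), the sequence of systems can be generated like this:

```python
def hom(x, y):
    """Symbolic name of the hom-set H(x, y)."""
    return f"H({x}, {y})"

def mr_tower(x="A", y="B", levels=3):
    """List the (domain, middle, codomain) triples of successive systems."""
    rows = []
    for _ in range(levels):
        z = hom(x, y)
        rows.append((x, y, z))
        x, y = y, z   # shift one step to the right
    return rows

for row in mr_tower():
    print(" -> ".join(row))
```

The three printed rows correspond to forms (3), (16), and (17); raising `levels` continues the sequence, which is exactly the unbounded regress that the closure argument of 11.12 avoids.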
The arrow diagrams may be extended on either side, rightward shown above as well as leftward. 11.12 Closure If this were all there is to it with (M,R)-systems, it would have been pretty pointless. The magic of an (M,R)-system is that the replication mapping β may already be entailed in the original form (3). On the basis of what are already present in (3), “under stringent but not prohibitively strong conditions, such replication essentially comes along for free.” The apparent infinite sequence of maps that may arise with the iteration of ‘repair the repair’ is truncated by a mathematical argument that turns it into a hierarchical cycle of maps instead. So there is no infinite graph of arrows here: the finite arrow diagram (3) alone suffices.
Not all (M,R)-networks satisfy the stringent requirements for entailment closure. Those that do may acquire an adjective and be called replicative (M,R)-systems. The hierarchical cycle is the closure that provides the ‘self-sufficiency in efficient causes’ that defines replicative (M,R)-systems. The defining characteristic, in other words, is the self-sufficiency in the networks of metabolism-repair-replication components, in the sense that every mapping is entailed within; in short, closure in efficient causation. Henceforth I shall use the term (M,R)-network to describe a network of metabolism and repair components that is not necessarily closed to efficient causation. I shall drop the adjective ‘replicative’ for (M,R)-systems, whence all (M,R)-systems are replicative. I state this explicitly as the formal
11.13 Definition (a) An (M,R)-network is an entailment network of a finite collection of metabolism and repair components. (b) An (M,R)-system is an (M,R)-network that is closed to efficient causation.
11.14 Nominalism Revisited A note on Rosen’s terminology is in order. Some people have pointed out that Rosen’s usage of ‘repair’ and ‘replication’ does not conform to current standards in molecular biology, where these terms are mostly applied to nucleic acids, DNA in particular. While this nonconformity may be true, one must consider the chronology of events.
The Watson-Crick model of the structure of the DNA macromolecule was published in Nature in 1953. Francis Crick proclaimed his Central Dogma in 1958. The concepts of DNA repair and DNA replication in molecular biology were gradually formulated and developed in the decades thence. On the other hand, Robert Rosen introduced ‘metabolism’ and ‘repair’ components of his (M,R)-systems in 1958, as I have previously mentioned, in his first published scientific paper, entitled “A relational theory of biological systems”. ‘Replication’ in (M,R)-systems first appeared in 1959, in Rosen’s fourth paper [Rosen 1959], entitled “A relational theory of biological systems II”. Thus, when Rosen wrote of ‘repair’ and ‘replication’,
their modern biological senses had not been well established and certainly not standardized. In short, Rosen defined them first. Words mutate and their usage morphs, of course. So we must remind ourselves that Rosen’s ‘repair’ is ‘replenishment’, ‘resynthesis’, or ‘replacement’ (of enzymes) in modern biological terms. Similarly, note that Rosen’s ‘replication’ is strictly limited to the notion of a mapping in the model that makes copies of the repair process. Thus it is not (self-)replication in the modern biological sense. Rosen’s ‘replication’ is the efficient cause of an (M,R)-system’s inherent ‘autonomy’, a kind of ‘relational invariance’ in terms of its entailment pattern. This is a concept of relational biology that has no obvious counterpart in molecular biology. I emphasize: replication in an (M,R)-system is not how the genetic material is replicated, not a description of how a cell makes copies of itself, and not organismal reproduction. In one synthesis, however, of an alternate (M,R)-system, which I shall introduce in the epilogue, the replication of the repair process is functionally identical to (i.e. has an analogous interpretation as) the replication of the genetic components in a cell; i.e. nucleic acid replication. 11.15 Evaluation Map There are many ways to construct the mapping β in form (12) from nothing else but what are already in the arrow diagram (3). Rosen has always used the simplest way, chosen his replication map β to
have domain $Y = H(A, B)$, and made it an inverse evaluation map. (I shall explore other ways in the next chapter.) True to the spirit of relational biology, we must recognize that the most important aspect of a replication map is not its form, i.e. not the exact details of how the map is defined. Rather, the most important is its function, that it needs to produce repair mappings $\Phi$, which belong to the hom-set $H(B, H(A, B))$. Therefore the codomain of a replication map $\beta$ must be $H(B, H(A, B))$; stated otherwise, one must have $\beta \vdash \Phi$.
Here is how one constructs Rosen’s $\beta$. An element $b \in B$ defines an ‘evaluation map’ (cf. Examples A.19(i) and (ii), and A.52)
(18) $\hat{b} \in H(H(B, H(A, B)), H(A, B))$
by
(19) $\hat{b}(\Phi) = \Phi(b)$ for $\Phi \in H(B, H(A, B))$.
The map
(20) $\alpha : b \mapsto \hat{b}$
defines an embedding of $B$ into $H(H(B, H(A, B)), H(A, B))$. Rosen
mentioned (for example in Rosen [1972] ) that this “is the abstract version of the familiar embedding of a vector space into its second dual space”. I shall further explore this concept, in the interlude below, after I finish constructing Rosen’s inverse evaluation map. 11.16 Inverse Evaluation Map The mapping bˆ is invertible if it is monomorphic; viz. for every pair of repair maps Φ1 , Φ 2 ∈ H ( B, H ( A, B)) ,
(21) $\hat{b}(\Phi_1) = \hat{b}(\Phi_2) \Rightarrow \Phi_1 = \Phi_2$;
i.e.
(22) $\Phi_1(b) = \Phi_2(b) \Rightarrow \Phi_1 = \Phi_2$.
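Condition (22) can be checked mechanically on a finite family of repair maps. In the minimal sketch below, repair maps are modelled as dictionaries from elements of B to enzyme labels; the labels `"f1"`, `"f2"` and the function names are assumptions for illustration only.

```python
def hat(b):
    """Evaluation map at b: sends a repair map Phi to its value Phi(b)."""
    return lambda Phi: Phi[b]

def is_injective_at(b, repair_maps):
    """Condition (22): distinct repair maps must take distinct values at b."""
    values = [hat(b)(Phi) for Phi in repair_maps]
    return len(set(values)) == len(values)

# Two distinct repair maps B -> H(A, B); "f1" and "f2" stand for enzymes.
Phi1 = {"b1": "f1", "b2": "f1"}
Phi2 = {"b1": "f2", "b2": "f1"}

print(is_injective_at("b1", [Phi1, Phi2]))  # the maps differ at b1
print(is_injective_at("b2", [Phi1, Phi2]))  # the maps agree at b2
```

Evaluation at `"b1"` separates the two maps, so the condition holds there; evaluation at `"b2"` does not, so the evaluation map at `"b2"` could not be inverted on this family.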
This implication (22) is a condition on the repair maps Φ ∈ H ( B, H ( A, B)) : if two repair maps agree at b , then they must agree everywhere. In other words, a repair map Φ [gene] is uniquely determined by its one value Φ ( b ) ∈ H ( A, B ) [enzyme]. This result may be regarded as the abstract version of the one-gene-one-enzyme hypothesis. These are essentially one set of the “stringent but not prohibitively strong conditions” required to make the
inverse evaluation map a replication map with nothing but the ingredients of arrow diagram (3). Note the inverse evaluation map $\hat{b}^{-1}$ maps thus:
(23) $\hat{b}^{-1} : H(A, B) \to H(B, H(A, B))$,
(24) $\hat{b}^{-1}(\Phi(b)) = \Phi$.
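On a finite family of repair maps, the inverse evaluation map of (23) and (24) is just a lookup table from the value Φ(b) back to Φ. This is a hedged sketch: the dictionary encoding, the enzyme labels, and the name `inverse_hat` are assumptions, and the table is defined only on values actually attained at b.

```python
def inverse_hat(b, repair_maps):
    """Build the table Phi(b) -> Phi; fail if evaluation at b is not injective."""
    table = {}
    for Phi in repair_maps:
        value = Phi[b]
        if value in table and table[value] != Phi:
            raise ValueError("two repair maps agree at b; no inverse exists")
        table[value] = Phi
    return table

# Repair maps B -> H(A, B) as dicts; "f1" and "f2" stand for enzymes.
Phi1 = {"b1": "f1", "b2": "f1"}
Phi2 = {"b1": "f2", "b2": "f1"}

replicate = inverse_hat("b1", [Phi1, Phi2])
recovered = replicate[Phi1["b1"]]   # recover the whole map from one value
```

Calling `inverse_hat("b2", [Phi1, Phi2])` raises `ValueError`, since both maps take the value `"f1"` at `"b2"`; this is the failure that the stringent condition is there to rule out.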
It takes one image value f = Φ ( b ) ∈ H ( A, B ) to the whole mapping
Φ ∈ H ( B, H ( A, B)) : this is the sense in which it ‘replicates’. But the stringent condition, requiring a repair map Φ to be uniquely determined by the one value Φ ( b ) in its range, neatly overcomes this Φ = Φ ( b ) identification problem! 11.17 Remarks Let me paraphrase the dictum of relational biology in the current context. When the replication map β has domain Y = H ( A, B ) and
one constructs it as an inverse evaluation map, the important aspect is not that it is an inverse evaluation map. The fact that Rosen’s regular example has $\beta = \hat{b}^{-1}$ is entirely incidental. Rather, the important aspect is that this particular replication map has the property that it is uniquely determined by one value in its range. The crux is $\beta : \Phi(b) \mapsto \Phi$. There are other ways to define $\beta \in H(H(A, B), H(B, H(A, B)))$ such that $\beta : \Phi(b) \mapsto \Phi$; choosing $\beta = \hat{b}^{-1}$ is just the simplest way, one specific example of how such a map may arise naturally. In other words, the emphasis is not on replication’s efficient cause, but on its final cause. So when one seeks material realizations of the replication map thus constructed, the question to ask is not, say, “What is the physical interpretation of the inverse evaluation map?” One ought to ask, instead, “What biochemical processes are uniquely determined by their products?” One possible answer here is that one gene controls the production of one enzyme, or conversely, a gene is uniquely determined by which enzyme it produces. This is, of course, the one-gene-one-enzyme hypothesis.
Interlude: Reflexivity
11.18 Evaluation Map Revisited Rosen usually constructed his ‘evaluation map’ in two steps. He would begin with two arbitrary sets $X$ and $Y$, and then define for each element $x \in X$ a mapping
(25) $\hat{x} : H(X, Y) \to Y$
by
(26) $\hat{x}(f) = f(x)$ for all $f \in H(X, Y)$.
Next he would put $X = B$ and $Y = H(A, B)$. Then an element $b \in B$ defines an evaluation map $\hat{b}$ as in lines (18) and (19) above, where I have defined the evaluation map in one single step. The map $x \mapsto \hat{x}$ that sends an element to its corresponding evaluation map defines an embedding of $X$ into $H(H(X, Y), Y)$. It is analogous to the
embedding of a vector space into its second dual space (cf. Example A.19(ii)). The main subject here is linear algebra; two standard references are Halmos [1958] and Hoffman & Kunze [1971]. The counterpoint of reflexivity is found in the topic of functional analysis; two good references are Brown & Page [1970] and Rudin [1973]. 11.19 Dual Space and Dual Transformation Let X and Y be two vector spaces over the field F . I shall restrict F to either the real field ℝ or the complex field ℂ. The hom-set from X to Y in the category Vct of vector spaces consists of linear transformations, and is denoted L ( X , Y ) , a standard
notation in linear algebra in lieu of the category-theoretic Vct ( X , Y ) . Note that L ( X , Y ) is itself a vector space over F , Vct being an additive category (cf. Definition A.39). Also, the scalar field F is a one-dimensional vector space over itself, so one may speak of L ( X , F ) . An element of L ( X , F ) , a
linear transformation of X into F , is called a linear functional. This special vector space L ( X , F ) is called the dual space of X , and one writes X ∗ in place of L ( X , F ) . The concept of ‘dual’ applies to linear transformations as well. For any linear transformation T : X → Y one may define a linear transformation T ∗ : Y ∗ → X ∗ by (27)
$T^{*}(g) = g \circ T$ for all $g \in Y^{*}$,
or diagrammatically
(28)
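Definition (27) says that the dual transformation acts by precomposition. A minimal sketch in coordinates follows, assuming real vectors as Python lists and a linear map given as a list of matrix rows; all function names and the sample matrix are hypothetical.

```python
def apply_matrix(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

def functional(coeffs):
    """The linear functional y -> sum of coeffs[i] * y[i]."""
    return lambda y: sum(c * yi for c, yi in zip(coeffs, y))

def dual(T):
    """Dual transformation: sends a functional g on Y to g composed with T."""
    return lambda g: (lambda x: g(apply_matrix(T, x)))

T = [[1, 2], [0, 1], [3, 0]]   # a linear map from R^2 to R^3
g = functional([1, -1, 2])     # a linear functional on R^3
pullback = dual(T)(g)          # the functional g o T on R^2

print(pullback([1, 0]), pullback([0, 1]))  # coefficients of g o T
```

The two printed numbers are the coordinates of the pulled-back functional, which agree with multiplying the coefficient row of `g` by the matrix `T`.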
The linear transformation $T^{*} \in L(Y^{*}, X^{*})$ is called the dual transformation of $T \in L(X, Y)$. Let $D$ be the operation of forming the dual space and the dual transformation. In other words, define $DX = X^{*}$ for each vector space $X$, and $DT = T^{*} \in L(Y^{*}, X^{*})$ for each linear transformation $T \in L(X, Y)$. Then $D$ is a contravariant functor on the category Vct of vector spaces and linear transformations, called the dual functor. Note that $D$ is not the same as its namesake, the dual functor $I_{A^{op}}$ that sends a category $A$ to its dual $A^{op}$ (cf. Example A.12(i)). In particular, $I_{A^{op}}$ is an involution ($I_{A^{op}} \circ I_{A^{op}} = I_{A}$, the identity functor), but $D^{2} = D \circ D$ is not necessarily equivalent to $I_{\mathrm{Vct}}$, as we shall see presently.
11.20 Second Dual The dual process may be iterated. Since $X^{*}$ is itself a vector space, one may consider its own dual. For simplicity one writes $X^{**}$ in place of $(X^{*})^{*}$, and one calls $X^{**}$ the second dual (space) of $X$. Note that an element of $X^{**}$ is a ‘linear functional of linear functionals’, $X^{**} = L(L(X, F), F)$. Similarly, the second dual transformation of $T \in L(X, Y)$ may be defined as $T^{**} \in L(X^{**}, Y^{**})$.
Repeated applications of the dual operation on a given vector space $X$ result in a sequence of vector spaces $X, X^{*}, X^{**}, X^{***}, \ldots$ If $X$ is a finite-dimensional vector space over $F$, then each vector space of the sequence is finite-dimensional and has the same dimension as $X$. This means they are all isomorphic (because each one is isomorphic to $F^{n}$, where $n$ is the dimension). There does not exist, however, any canonical isomorphism from $X$ to $X^{*}$ (unless $X$ has certain additional algebraic structures; we shall encounter one in the next chapter). But from a finite-dimensional vector space $X$ over the field $F$ to its second dual $X^{**}$, there is an isomorphism that distinguishes itself from all the others. Define for each element $x \in X$ a mapping
(29) $\hat{x} : X^{*} \to F$
by
(30) $\hat{x}(f) = f(x)$ for all $f \in X^{*}$.
($\hat{x}$ is, of course, the now familiar evaluation map.) The mapping
(31) $\alpha_X : X \to X^{**}$
defined by
(32) $\alpha_X : x \mapsto \hat{x}$ for all $x \in X$
is an isomorphism, called the natural isomorphism between $X$ and $X^{**}$. For every linear transformation $T \in L(X, Y)$, one has
(33) $T^{**} \circ \alpha_X = \alpha_Y \circ T$,
i.e. the diagram
(34) [commutative square: $T : X \to Y$ on top, $T^{**} : X^{**} \to Y^{**}$ below, vertical maps $\alpha_X$ and $\alpha_Y$]
commutes.
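The commutativity stated in (33) can be spot-checked numerically by evaluating both sides on a sample vector and a sample functional. This is a sketch under assumptions: functionals are represented as Python callables, the second dual transformation is taken to be precomposition with the dual transformation, and all names and sample values are illustrative.

```python
def apply_matrix(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

def functional(coeffs):
    """The linear functional y -> sum of coeffs[i] * y[i]."""
    return lambda y: sum(c * yi for c, yi in zip(coeffs, y))

def hat(x):
    """Canonical map: x becomes 'evaluate a functional at x'."""
    return lambda f: f(x)

def dual(T):
    """Dual transformation: g goes to g composed with T."""
    return lambda g: (lambda x: g(apply_matrix(T, x)))

def second_dual(T):
    """Second dual: xi goes to xi composed with the dual of T."""
    T_star = dual(T)
    return lambda xi: (lambda g: xi(T_star(g)))

T = [[1, 2], [0, 1], [3, 0]]   # a linear map from R^2 to R^3
x = [2, -1]
g = functional([1, -1, 2])     # a test functional on R^3

left = second_dual(T)(hat(x))(g)     # second dual of T applied to x-hat
right = hat(apply_matrix(T, x))(g)   # hat of T(x)
```

Both `left` and `right` reduce to `g` evaluated at `T` applied to `x`, which is why the square commutes for every choice of `x` and `g`, not only for this sample.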
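11.21 Second Dual Functor The second dual functor $D^{2} = D \circ D$ is a covariant functor on the category of vector spaces defined by $D^{2}X = X^{**}$ and $D^{2}T = T^{**}$. Let the identity functor on Vct be denoted by $I$; i.e. $IX = X$ and $IT = T$. Then diagram (34) may be rewritten as
(35) [commutative square: $IT : IX \to IY$ on top, $D^{2}T : D^{2}X \to D^{2}Y$ below, vertical maps $\alpha_X$ and $\alpha_Y$]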
Thus the natural isomorphism may be regarded as a morphism $\alpha : I \to D^{2}$ of functors. It is, for finite-dimensional vector spaces, a natural isomorphism in the sense of category theory; indeed, it is the Example A.19(ii). Each finite-dimensional vector space $X$ can thus be identified with its second dual $X^{**}$, and consequently $T = T^{**}$ for each linear transformation $T : X \to Y$ of finite-dimensional vector spaces. Therefore in the sequences
$X, X^{*}, X^{**}, X^{***}, \ldots$
$T, T^{*}, T^{**}, T^{***}, \ldots$
one needs only consider the first pairs of terms $X, X^{*}$ and $T, T^{*}$. The remaining ones, being naturally isomorphic copies, may be identified with them. [The first two members of the hierarchy suffice: here is another level of analogy with (M,R)-systems.] Let me make one other remark before leaving finite-dimensional vector spaces. From the fact that $X$ and its second dual $X^{**}$ have the same finite dimension it follows that they are isomorphic; i.e. there exists an isomorphism $\varphi : X \to X^{**}$. This unspecified isomorphism, however, may not satisfy the condition that is satisfied by the natural isomorphism when $\varphi = \alpha_X$, that
(36) $(\varphi(x))(f) = f(x)$ for all $x \in X$ and $f \in X^{*}$.
11.22 Reflexive Vector Space When the vector space $X$ is infinite-dimensional, the mapping $\alpha_X$ defined in (32) is still injective, and it still satisfies (33) (and so the arrow diagram (34) still commutes). But the range of $\alpha_X$ may not be all of $X^{**}$. Thus $\alpha_X$ is an embedding of $X$ into, but not necessarily onto, its second dual space $X^{**}$. Since for an infinite-dimensional $X$, $\alpha_X$ is not necessarily an isomorphism from $X$ to its codomain $X^{**}$, one changes its name, from the ‘natural isomorphism’, and calls it the canonical mapping of $X$ into $X^{**}$. The canonical mapping is an isomorphism of $X$ onto its range, the subspace $\hat{X} = \{\hat{x} : x \in X\}$ of $X^{**}$. In general, however, $\hat{X} \neq X^{**}$. A vector space $X$ is called reflexive if and only if the canonical mapping $\alpha_X : x \mapsto \hat{x}$ maps $X$ onto $X^{**}$; i.e. iff $\hat{X} = X^{**}$; in other words, if and only if the canonical mapping is the natural isomorphism between $X$ and $X^{**}$. [Here is polysemy at work again: this ‘reflexive’ is obviously different from the reflexive ‘self-relating’ property for relations (cf. Definition 1.10(r)).] All finite-dimensional vector spaces are reflexive, but some infinite-dimensional vector spaces are not. Let me emphasize that for $X$ to be reflexive, the existence of some isomorphism from $X$ onto $X^{**}$ is not enough: the vector space and its second dual must be isomorphic under the canonical mapping. It is possible for a vector space $X$ to be isomorphic to its second dual $X^{**}$ without being reflexive.
11.23 Inverses As a final note of this interlude in linear algebra, I would like to point out the difference between two kinds of inverse mappings that one encounters when these linear algebra concepts extend to (M,R)-systems. Since the canonical mapping $\alpha_X$ is injective, its inverse exists with domain $\hat{X}$, defined by
(37) $\alpha_X^{-1}(\hat{x}) = x$.
But note that
(38) $\alpha_X^{-1} \in L(\hat{X}, X)$.
It is completely different from the ‘inverse’ of the evaluation map $\hat{x} = \alpha_X(x)$, which may not exist. Since $\hat{x} \in X^{**} = L(X^{*}, F)$, it is a linear transformation from, generally, a higher-dimensional space into a one-dimensional space, thus highly singular. To make it invertible, “stringent but not prohibitively strong conditions” are required, just like for its counterpart $\hat{b}$ in (M,R)-systems. Here, the conditions are restrictions on its domain and codomain. If the inverse exists, it would be a mapping
(39) $\hat{x}^{-1} \in L(F, X^{*})$.
A comparison of (38) and (39) shows how different the two inverses are: there is no general entailment between the inverse of a mapping [$\alpha_X$], and the inverse of the image of one single element [$x$] of the domain of that mapping, when that image [$\hat{x} = \alpha_X(x)$] happens to be a mapping in its own right. The situation is summarized succinctly as
(40) $\exists\, \alpha_X^{-1}(\hat{x}) \;\not\Leftrightarrow\; \exists\, (\alpha_X(x))^{-1}$.
Traversability of an (M,R)-System
11.24 Completing the Cycle The inverse evaluation map $\beta = \hat{b}^{-1}$ establishes a correspondence between $H(H(A, B), H(B, H(A, B)))$ and $B$, whence $\beta : f \mapsto \Phi$ may be replaced by the isomorphic
(41) $b : f \mapsto \Phi$
with relational diagram
(42) [relational diagram]
and entailment diagram
(43) $b \vdash \Phi$.
The cyclic entailment pattern when one combines lines (11) and (43) is the impredicative, hierarchical cycle of the (M,R)-system. The three maps $\{b, \Phi, f\}$ of replication, repair, and metabolism entail one another in a cyclic permutation. The three maps form the trivial traversable simple digraph
(44) [entailment digraph on the vertices $b$, $\Phi$, $f$]
The Eulerian circuit may begin at any of the three vertices: $b \to \Phi \to f \to b$, $\Phi \to f \to b \to \Phi$, or $f \to b \to \Phi \to f$. In terms of relational diagrams, the identification ‘$\hat{b}^{-1} = b$’ transforms (15) into
(45) [relational diagram]
which is our multiple-connections example 5.17.
11.25 Cytoplasm and Nucleus A typical biological (eukaryotic) cell is compartmentalized into two observably different regions, the cytoplasm and the nucleus. Metabolic activities mainly occur in the cytoplasm, while repair processors (i.e. genes) are contained in the nucleus. Let me suggest an alternate depiction of the digraph (45) of the simplest (M,R)-system. I change the geometry to enclose the repair map Φ within. This gives a graphic representation of the metabolism component as the abstract equivalent of ‘cytoplasm’ and the repair component as the abstract counterpart of ‘nucleus’. After I additionally label the arrows, digraph (45) becomes the equivalent digraph
(46)
Both vertices a and Φ have indegree 1 and outdegree 1, while vertices b and f have indegree 2 and outdegree 2. Thus by Theorem 6.6(a), diagram (46) is traversable as a digraph; its Eulerian circuit in our context is precisely the hierarchical cycle containing all the solid-headed arrows, whence the (M,R)-system is closed to efficient causation (cf. Definition 6.23 and Theorem 6.28), therefore complex (Theorem 9.5). The closed path may begin at any vertex; in particular, the Eulerian circuits (1,2,3,4,5,6), (3,4,5,6,1,2) and (5,6,1,2,3,4) correspond to the Eulerian circuits in digraph (44). Note that these three Eulerian circuits of digraph (46) respect the solid-headed-arrow–hollow-headed-arrow ordered-pairing of each morphism: the solid-headed arrows and hollow-headed arrows are in an alternating sequence. In strictly digraph-theoretic terms, circuits such as (3,1,5,6,4,2) and (6,1,2,3,4,5) are Eulerian as well; such ‘out-of-phase’ circuits, however, have no corresponding Eulerian circuits in the entailment digraph (44), and hence do not represent the hierarchical cycle. In particular, the fact that the Eulerian circuit (3,1,5,6,4,2) happens to ‘segregate’ the solid-headed arrows and the hollow-headed arrows is a graph-theoretic coincidence that has no entailment implications.
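The circuit claims above can be verified on an explicit edge list. Since diagram (46) is not reproduced here, the arrow labelling below is an assumption, reconstructed so that it agrees with the stated degrees and with every circuit quoted in the text; the function names are likewise hypothetical.

```python
# Arrows of digraph (46) keyed by label: (tail, head, kind).
# Assumed labelling, reconstructed from the circuits quoted in the text.
ARROWS = {
    1: ("b", "f", "solid"),     # b : f |-> Phi, processor to input
    2: ("f", "Phi", "hollow"),  # b : f |-> Phi, input to output
    3: ("Phi", "b", "solid"),   # Phi : b |-> f
    4: ("b", "f", "hollow"),
    5: ("f", "a", "solid"),     # f : a |-> b
    6: ("a", "b", "hollow"),
}

def degrees(arrows):
    """Return (indegree, outdegree) dictionaries of the digraph."""
    indeg, outdeg = {}, {}
    for tail, head, _ in arrows.values():
        outdeg[tail] = outdeg.get(tail, 0) + 1
        indeg[head] = indeg.get(head, 0) + 1
    return indeg, outdeg

def is_eulerian_circuit(labels, arrows):
    """Each arrow used exactly once, heads meeting tails, ending at the start."""
    if sorted(labels) != sorted(arrows):
        return False
    steps = [arrows[k] for k in labels]
    n = len(steps)
    return all(steps[i][1] == steps[(i + 1) % n][0] for i in range(n))

def alternates(labels, arrows):
    """True if the circuit runs solid, hollow, solid, hollow, ..."""
    kinds = [arrows[k][2] for k in labels]
    return all(k == ("solid" if i % 2 == 0 else "hollow")
               for i, k in enumerate(kinds))

in_phase = [(1, 2, 3, 4, 5, 6), (3, 4, 5, 6, 1, 2), (5, 6, 1, 2, 3, 4)]
out_of_phase = [(3, 1, 5, 6, 4, 2), (6, 1, 2, 3, 4, 5)]
```

Under this labelling all five sequences pass `is_eulerian_circuit`, while only the three in-phase ones pass `alternates`, matching the distinction drawn in the text between circuits that represent the hierarchical cycle and those that do not.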
11.26 Traversability as a Relation Diagram As a relational diagram, (46) may have the degrees of its vertices enumerated thus:
(47) $\begin{cases} (\varepsilon_i(a), \tau_i(a), \varepsilon_o(a), \tau_o(a)) = (1, 0, 0, 1) \\ (\varepsilon_i(b), \tau_i(b), \varepsilon_o(b), \tau_o(b)) = (1, 1, 1, 1) \\ (\varepsilon_i(f), \tau_i(f), \varepsilon_o(f), \tau_o(f)) = (1, 1, 1, 1) \\ (\varepsilon_i(\Phi), \tau_i(\Phi), \varepsilon_o(\Phi), \tau_o(\Phi)) = (0, 1, 1, 0) \end{cases}$
(cf. 6.8 for the notation of the four degrees of a vertex in a relational diagram, and Example 6.9 for the enumeration of the diagram (45) which is isomorphic to (46)). Thus diagram (46) satisfies the conditions of Theorem 6.12(a), whence as a relational diagram it is traversable and has an Eulerian circuit.
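The enumeration (47) can be recomputed from the three maps f : a ↦ b, Φ : b ↦ f, and b : f ↦ Φ. The sketch below assumes that ε counts solid-headed arrows and τ counts hollow-headed arrows (the notation of 6.8 is not restated in this section), and that each map contributes one solid-headed arrow from processor to input and one hollow-headed arrow from input to output.

```python
# Each map processor : input |-> output, as (processor, input, output).
MAPS = [("f", "a", "b"), ("Phi", "b", "f"), ("b", "f", "Phi")]

def four_degrees(maps):
    """Return vertex -> (eps_in, tau_in, eps_out, tau_out)."""
    deg = {}
    def bump(vertex, slot):
        deg.setdefault(vertex, [0, 0, 0, 0])[slot] += 1
    for processor, source, target in maps:
        bump(processor, 2)   # solid-headed arrow leaves the processor
        bump(source, 0)      # solid-headed arrow enters the input
        bump(source, 3)      # hollow-headed arrow leaves the input
        bump(target, 1)      # hollow-headed arrow enters the output
    return {v: tuple(d) for v, d in deg.items()}

table = four_degrees(MAPS)
for vertex in ("a", "b", "f", "Phi"):
    print(vertex, table[vertex])
```

Under these assumptions the computed tuples coincide with the four rows of (47).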
What is Life? Rosen’s idea behind (M,R)-systems was, as he explained in Chapter 17 of EL, “to characterize the minimal organization a material system would have to manifest or realize to justify calling it a cell”. Whence Rosen defined a cell thus:
11.27 Definition A cell is (at least) a material structure that realizes an (M,R)-system. Recall that a realization of a formal system is a natural system obtained from decoding that formal system; i.e., a natural system that is a realization of a formal system has the latter as a model (cf. 4.14). Note that the word ‘cell’ in the definition is used in the generic sense of ‘autonomous life form’. This
class of relational cell models can just as well describe organisms, indeed all living systems. In Section 10C of LI one finds (The “graph” is Rosen’s arrow diagram [10C.6] in LI of an (M,R)-system, which is an alternate version of my diagrams (45) and (46) above.): Any material system possessing such a graph as a relational model (i.e., which realizes that graph) is accordingly an organism. From our present perspective, we can see that [10C.6] is not the only graph that satisfies our conditions regarding entailments; there are many others. A material realization of any of them would likewise, to that extent, constitute an organism. Definition 11.27 says that ‘having an (M,R)-system as a model’ is a necessary condition — that is what the ‘at least’ in the definition signifies — for a natural system to be an autonomous life form. Rosen, for emphasis, added the adjectival phrase “under the condition that at least one of the appropriate inverse evaluation maps exists” to his description of an (M,R)-system in his original definition in Chapter 17 of EL. The requisite inverse evaluation map is what completes the impredicative cycle in Rosen’s standard construction of an (M,R)-system. I shall have more to say on other means of closure (cf. “there are many others” in the quote above) in the next chapter. Immediately after Definition 11.27, in the same paragraph in Chapter 17 of EL in fact, Rosen added: Making a cell means constructing such a realization. Conversely, I see no grounds for refusing to call such a realization an autonomous life form, whatever its material basis may be. The converse statement provides the sufficiency. So I may define ‘organism’, meaning any ‘living system’, as:
11.28 Postulate of Life A natural system is an organism if and only if it realizes an (M,R)-system. Note that Postulate 11.28 is not a contradiction to the quote in the necessity-versus-sufficiency discussion in 11.8 that there may not be sufficient conditions that characterize life. Rosen had established the necessity; he chose to state the sufficient condition in his definition of life. ‘Having an (M,R)-system as a model’ is the necessary and sufficient condition for a natural system to be an autonomous life form, on a relational level, even if one may not readily recognize the natural system as ‘alive’ on the material level. Rosen’s answer to the question “What is life?” (in its epistemological form of “What are the defining characteristics of a natural system for us to perceive it as being alive?”) is given in Chapter 10 of LI:
11.29 Theorem A material system is an organism if, and only if, it is closed to efficient causation. An (M,R)-system is a relational model of a living organism that captures this necessary and sufficient condition. Definition 11.13(b) establishes the equivalence of Postulate 11.28 and Theorem 11.29. The important point to note for the purposes of relational biology is that life is characterized through the use of efficient causation, one of Aristotle’s four categories. The characterization of life is not what the underlying physicochemical structures are, but by its entailment relations, what they do, and to what end. In other words, life is not about its material cause, but is intimately linked to the other three Aristotelian causes, formal, efficient, and final. Explicitly in terms of efficient causes, one has
11.30 Theorem A natural system is an organism if and only if it has a closed path containing all of its efficient causes.
An efficient cause is identified with its corresponding solid-headed arrow in the relational diagram in graph-theoretic form. So equivalently:
11.31 Theorem In the relational diagram of (a formal system model of) a living system, there is a cycle that contains all the solid-headed arrows. Functional entailment is identified with the entailment symbol $\vdash$. So one also has
11.32 Theorem In the entailment diagram of (a formal system model of) a living system, there is a cycle that contains all the $\vdash$.
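For the simplest (M,R)-system, the cycle asserted in these theorems can be traced by following each processor to the output it entails. The sketch below covers only this simple situation, where every processor entails exactly one output; the dictionary encoding and the name `hierarchical_cycle` are assumptions, not a general test for closure to efficient causation.

```python
def hierarchical_cycle(maps, start):
    """Follow processor -> output; return the cycle if it closes at `start`
    and passes through every processor, else None."""
    cycle, current = [], start
    while current in maps and current not in cycle:
        cycle.append(current)
        current = maps[current][1]   # the output entailed by this processor
    closes = current == start and set(cycle) == set(maps)
    return cycle if closes else None

# processor -> (input, output)
network = {"f": ("a", "b"), "Phi": ("b", "f")}   # metabolism and repair only
system = dict(network, b=("f", "Phi"))           # plus b : f |-> Phi

print(hierarchical_cycle(system, "b"))    # closed: b, Phi, f
print(hierarchical_cycle(network, "f"))   # not closed
```

The (M,R)-network without the replication map leaves Φ unentailed, so no cycle is found; adding b : f ↦ Φ closes the path through all three efficient causes.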
The New Taxonomy
11.33 From Necessity to Sufficiency The journey to identify the distinguishing features of a living system began with the collection $N$ of natural systems. Let $O$ be the collection of organisms (i.e. living systems), and let $<$ be the relation ‘less than’ (cf. Definition 1.22) in the poset $\langle N, \subset \rangle$, viz. the relation ‘is a proper subset of’:
(48) $X < Y$ if and only if $X \subset Y \subset N$ and $X \neq Y$.
Clearly
(49) $O < N$,
i.e., an organism is necessarily a natural system. Thus
(50)
The strategy is to keep tightening the necessary conditions, to eventually arrive at a set of conditions r that is both necessary and sufficient to characterize O; in other words, (51)
O = R = { N ∈ N : r ( N )}