1. PHILOSOPHICAL FOUNDATIONS OF CYBERNETICS
CYBERNETICS AND SYSTEMS SERIES
Editor in Chief: J. ROSE, Director General, World Organisation of General Systems and Cybernetics
Other Titles in Preparation
2. Fuzzy Systems
3. Artificial Intelligence
4. Management Cybernetics
5. Economic Cybernetics
6. Cybernetics and Society
7. Models and Modelling Systems
8. Automation and Cybernetics
9. Medical Cybernetics
10. General Systems Theory
11. Computers and Cybernetics
12. Neurocybernetics
PHILOSOPHICAL FOUNDATIONS OF CYBERNETICS
F. H. GEORGE, M.A., Ph.D.
Director, Institute of Cybernetics, Brunel University, United Kingdom
ABACUS PRESS

First published in 1979 by ABACUS PRESS
Abacus House, Speldhurst Road, Tunbridge Wells, Kent TN4 0HU
© Abacus Press 1979 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of Abacus Press.
British Library Cataloguing in Publication Data
George, Frank Honywill
Philosophical foundations of cybernetics. (Cybernetics and systems series)
1. Cybernetics — Philosophy
I. Title II. Series
001.53'01 Q310
ISBN 0-85626-163-7
Printed in Great Britain by W & J Mackay Ltd. Chatham, Kent
SERIES PREFACE
The inter- and trans-disciplinary sciences of cybernetics and systems have made tremendous advances in the last two decades. Hundreds of books have been published dealing with various aspects of these sciences. In addition, a variety of specialist journals and voluminous post-conference reports have appeared, and learned societies, national and international, have been established. These substantial advances reflect the course of the Second Industrial Revolution, otherwise known as the Cybernetics Revolution.

In order to extend the readership from experts to the public at large, and to acquaint readers with up-to-date advances in these sciences, which are rapidly achieving tremendous importance and which impinge upon many aspects of our life and society, it was considered essential to produce a series of concise and readable monographs, each concerned with one particular aspect. The twelve topics constituting the first series are as follows (in alphabetical order): Artificial Intelligence, Automation and Cybernetics, Computers and Cybernetics, Cybernetics and Society, Economic Cybernetics, Fuzzy Systems, General Systems Theory, Management Cybernetics, Medical Cybernetics, Models and Modelling of Systems, Neurocybernetics, Philosophical Foundations of Cybernetics.

The authors are experts in their particular fields and of great repute. The emphasis is on intelligible presentation without excess mathematics and abstract matter. It is hoped each monograph will become standard reading matter at academic institutions and also be of interest to the general public. In this age of enormous scientific advances and of uncertainty concerning the welfare of societies and the very future of mankind, it is vital to obtain a sound insight into the issues involved, to help us to understand the present and face the future with greater confidence.

J. ROSE
Blackburn
PREFACE
In writing this book I have been guided by a number of considerations. The first and most important is to supply a clear-cut picture of what the central themes of cybernetics entail. If we state baldly that "machines could be made to think", then we have to examine what it is that we mean exactly, and what such a statement entails. We have to examine evidence and above all be clear about the philosophical implications. In this regard I have been greatly influenced by the late Dr. Alan Turing, who set out the essential issues in his article "Computing Machinery and Intelligence" (published in 1950). There he considered some of the likely objections to such a view, and I have followed these up and have gone on to try to make the whole matter more explicit.

The second consideration was in terms of what I had previously said myself on the subject. I hope that what I now say is consistent with such previous statements, even if I now place the emphasis somewhat differently. A particular aspect of this second consideration is that I wish to discuss cybernetics in philosophical terms, but with emphasis on the central themes of cybernetics. This was read as a restriction on the possibility of here examining epistemology, logic, truth, meaning and the like in great detail — perhaps even from a cybernetic point of view. But I have tackled this second, more philosophical matter in a separate book which should appear soon after this one. The reason for mentioning this is to ensure that the more philosophically inclined reader should not feel disappointed at the relatively small space given to the more traditional philosophical problems. Certain of these traditional problems, such as the mind—body problem, are so central to our purpose that they are discussed, but others, such as that of ontology, receive only brief mention.

I would like to say also that I believe the relation between cybernetics and philosophy is not only close, but is a two-way relationship, since I believe that one's philosophical views are clarified by cybernetic thinking. Similarly any possibility of being clear about
cybernetics without careful philosophical thought is inconceivable. This last statement represents my view that science and philosophy are very much more closely related than one would guess from a casual reading of most books or journals on either subject. The obvious exception is the work on the philosophy of science, but even this does not deal with the relationship I am primarily concerned with. There is a deeper relationship between basic philosophical thinking and science — rather especially cybernetics — that I am hoping this book will bring out. It is not just that philosophy should be invoked to analyse scientific activities, but that science should be used to analyse philosophical activities.

In looking at cybernetics in this manner, I owe a number of acknowledgements to different people. In the first place I have submitted, over the years, a number of postgraduates at Brunel to my views and have, without doubt, clarified my own as a result of the feedback derived from their views. In particular I owe a debt of gratitude to Mr. W. J. Chandler, Director of Corporate Planning at Reed International, for a whole series of discussions which surround his ideas on the science of history and planning, and of which I have openly taken advantage. I also owe a special debt to Mr. L. Johnson of the Department of Cybernetics at Brunel for his reading of and constructive comments on this text. Thanks are also due to my wife, to my secretary at the University, Mrs. P. M. Kilbride, to my elder daughter Mrs. C. E. Smith and to my younger daughter Miss Karen George for varying degrees of help, both direct and indirect. As usual one has to add that the extent to which the final result may seem adequate is entirely my responsibility.

Frank George
Beaconsfield
CONTENTS
Series Preface
Preface
1. Artificial Intelligence and the Interrogation Game
2. Scientific Method and Explanation
3. Gödel's Incompleteness Theorem
4. Determinism and Uncertainty
5. Axioms, Theorems and Formalisation
6. Creativity
7. Consciousness and Free Will
8. Pragmatics
9. A Theory of Signs
10. Models as Automata
11. The Nervous System
12. In Summary
References
Author Index
Subject Index
Chapter 1
ARTIFICIAL INTELLIGENCE AND THE INTERROGATION GAME
This book is concerned with the philosophical background of cybernetics. As a science, cybernetics is the science of communication in animals, men and machines, and we shall not in this book seek to justify it as a science. This is because we feel that it needs no justification other than the fact that it exists and is, in our view, satisfactorily progressing along a fairly well-defined line. As a science, it can be judged by its practical pay-off, and from this point of view it is not necessarily of great importance whether we take one philosophical view of it rather than another.

We shall, however, be discussing cybernetics from a philosophical point of view. One of the basic questions we shall be considering is whether or not machines can be made to think. The philosophical importance of this is obvious, but the scientific importance is very much less obvious. It does not really matter, provided the science is supplying a good 'spin-off', whether or not in the end it can fulfil (or wholly fulfil) this goal. To this extent the science of cybernetics, which is concerned with artificial intelligence and its application in all sorts of other fields, such as behaviour, biology, economics, business, education, etc., is not the subject of our discussion.

The point we have already made about "machines thinking" as being one of the central points of cybernetics will also be the central theme of our own discussion. It will be stated in the form1 "Could machines be made to think?" Sometimes this has been phrased in the form "Can machines be made to think?"2 but we are not concerned with the relatively unimportant sense in which this has already been proved to be possible, when compared with the far more important sense in which we think it could be made possible. It is best from our
point of view to throw the question into the future and say, regardless of whether or not it is possible now, is it possible in principle?

We should be clear from the start that a question put this way is easily misunderstood, if for no other reason than that various key words have a variety of different possible meanings. In other words, if we defined the word 'machine' and defined the word 'think', then we could, without too much trouble, settle our question in the affirmative or negative as a direct result of our choice of definitions. We could say that machines by the very nature of things (i.e. by our definitions) are precisely those systems which do not think, that are automatic and unthinking, and that therefore to ask whether or not they think is an absurd question. We could, on the other hand, take a very much more general definition of 'machine', to include human beings for example, in which case to ask whether or not machines think is equally absurd, because they now include a class of systems which manifestly (by definition) thinks. Therefore we have to consider other ways of phrasing our central theme which make it more intelligible and easier to handle.

The first difficult word to define is undoubtedly the word 'machine'. We really want to talk about systems "capable of being manufactured in the laboratory". The artificial manufacture of the system is the important thing; not whether it is machine-like in any other sense. We are not thinking of a machine such as a potato peeler or a motor bicycle; we are thinking of an artificially constructable system which is capable of being manufactured in a laboratory and which also has the properties of adaptability which characterise human beings. We have to be careful here to distinguish between artificial insemination, for example of a human, which provides another human, and the laboratory manufacture of a seed which is capable of growing in our own artificially prepared environment and becoming human-like.

There is also a further difficulty because, when we ask about the possibility of machines (in the complex sense of artificially constructable systems) thinking, we do not necessarily mean in a human-like way. However, we are bound to use the human being as a yardstick, and ask whether or not we could manufacture a system which is capable of thinking with the same degree of efficiency as a human being. It seems, however, likely to follow from this that, provided we can understand the general principles by which human beings think as effectively as they do, we can then reproduce these principles in an
artificial system. Given this situation, the possibility of producing a system which is superior to man in its abilities is fairly straightforward. Not that we wish to make the claim at this point that we can make machines that can think more efficiently than humans; our argument would rest sufficiently on making machines that can think at least as efficiently as humans.

The second main word, 'think', provides another obvious difficulty, since some people use this word to apply purely to what humans do, and indeed not only to what humans do but to what they are conscious of doing. We would want to say that thinking is a process of manipulating symbolic representations of events, and the process of learning and adapting as a result of these manipulations, as well as solving problems and formulating plans, etc., without necessarily being conscious of the process one is going through, and without necessarily being a human. This sort of definition is behaviouristic by inclination, and does not insist, as some people do (more often than not philosophers talking of thinking), that thinking is a process necessarily involving consciousness. This, of course, is not to say that much of what we call thinking is not actually a conscious process, but that is another question.
THE INTERROGATION GAME

We have now made the point clear that talking about the possibility of machines thinking implies something fairly special in the meaning of the word 'machine'. It also implies something fairly clear-cut by way of the meaning of 'thinking', where we mean to use the human being as a yardstick. It leaves open the question of whether a synthesised (as opposed to simulated) human being could actually use methods for effective thinking and problem solving other than human methods. We do not need, though, to discuss that particular question at this point. However, we do want to look at the question of the Turing interrogation game, to be quite clear that the system we produce is not necessarily human-like in its construction.

There is a parlour game that has sometimes been played (though not in the experience of the present writer) whereby you try to decide, as a result of asking questions, whether a human being is a man or a woman. Clearly this is subject to the constraint of not being
able to look at the person in question. So you try to formulate questions which would elicit answers that should somehow serve to give the questioner a clear picture of which of the two sexes he is talking to. Turing has suggested an adaptation of this interrogation game in order to distinguish the human from a machine; he believed that this would be an effective way of defining the concept of a machine's ability to think. If you can carry out an interrogation game with a human being compared with an artificially constructed system (as opposed to another human being), and if you find it impossible to tell which is which, then you have to accept the fact that a machine can think as well as a human being. In using the phrase 'as well as' we are not concerned so much with precise relative abilities in every sphere, only that in general the quality of human-like thought is equally attributable to both.

It could be objected that this does not compare the human-like quality of thought so much as the human-like responsiveness of the two systems. The answer to this is that the problem of 'other minds' arises just as much when comparing another human being with oneself. It is not possible to tell whether other human beings think; all one can tell is that they behave (or do not, as the case may be) as if they thought. So we shall have to settle for this criterion when comparing a machine and a human being; we shall have to decide whether it behaves as if it thought.

The importance of the interrogation game lies in the fact that it is saying, in effect, that the artificially constructed system could be a digital computer, albeit a fifth or sixth generation computer. It could on the other hand be an electronic system of some kind, and it certainly does not need to be made, as a human is, of colloidal protoplasm. All that matters is that it should behave in a similar way in a similar sort of situation.
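To fix ideas, here is a minimal sketch, in Python, of the structure of the interrogation game as described above. The function and parameter names, the use of a fixed question list and the single judge are illustrative assumptions rather than anything specified by Turing or by this book.

```python
import random

def interrogation_game(human_respondent, machine_respondent, questions, judge):
    """Play one round of the interrogation game.

    Both respondents are callables mapping a question string to an answer
    string. The judge sees only the labelled answers and must name the
    label it believes belongs to the machine. Returns True if the judge
    picks the machine correctly.
    """
    # Hide the identities behind labels assigned at random.
    if random.random() < 0.5:
        labelled = {"A": human_respondent, "B": machine_respondent}
    else:
        labelled = {"A": machine_respondent, "B": human_respondent}

    # A fixed question list is a simplification; a real interrogator
    # would choose each question in the light of earlier answers.
    transcript = [(q, {label: respond(q) for label, respond in labelled.items()})
                  for q in questions]

    guess = judge(transcript)          # judge returns "A" or "B"
    return labelled[guess] is machine_respondent
```

On this reading, a machine "thinks as well as" a human when, over many such rounds, judges cannot identify it at better than chance.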
OBJECTIONS TO THE MAIN THEME

Our main theme is that machines could be made to think. We are answering the question posed by our main theme at least to the extent of saying that we can think of no reason to doubt the possibility. We now start to cast around for possible objections to our viewpoint. Some of these objections will receive the most detailed
analysis in separate chapters which follow; others will merely be mentioned, without detailed analysis.

The first objection, which we shall not treat in great detail, is the theological objection. This says in effect that thinking is a function of man's immortal soul, and must be something attributable to man and man alone, and that no other system could possibly achieve it. This is in some part like the argument which says that thinking is human-like and human-like only (though not necessarily on theological grounds) and therefore cannot apply to machines. We shall simply assume the wrongness of this argument and leave a discussion of a theological kind outside the text altogether. In saying this we should perhaps just mention that we have here the support of D. M. MacKay, who, while having strong positive theological views, would still accept that machines could be made to think in the sense that we intend it to be understood in the text.

Next, there is what Turing calls the 'heads in the sand' view, which simply says that it would be 'quite dreadful' if machines could be made to think, and so as a result rival human beings, in much the same way as it would be dreadful if we found another species overtaking us in our abilities. On purely biological grounds there seems no reason to doubt the possibility of another species overtaking the human species, and by the same sort of argument it seems pointless merely to say it would be 'quite dreadful' and feel that this is the counter-argument, so we shall also altogether neglect this type of counter-argument.

The third type of argument we shall consider in Chapter 3. This is an argument from the point of view of the foundations of mathematics. Basically, it is an argument based on Gödel's theorems3 and related theorems due to Turing4, Church5, Post6 and others. The essential feature of these arguments is that you seem unable to construct an axiomatic system in which both its own completeness and its own consistency can be demonstrated from within the system. In other words, there are certain statements or features which we would accept as necessarily being within the system but which we cannot demonstrate in our axiomatic system when that axiomatic system is used to investigate its own characteristics. We shall try to show in the next chapter but one that the Gödel arguments, as they are sometimes called, do not really serve as a barrier to our main theme. However, this is such a complicated matter that we will not attempt to provide the counter-argument here.
Another type of argument used to counter our main theme is the argument from consciousness. We shall be treating this from various points of view, since we want to discuss the relation of consciousness as it seems to occur in humans with the possibility of an equivalent state occurring in machines. We also wish to discuss the problem of free will, and the problem of creative ability. Various aspects of consciousness can be broken down into various possible counter-arguments, and these will be examined in considerable detail. We shall say no more about it at this point but merely note that the property of consciousness, which seems to be a characteristic of human beings, will be seen by some as a barrier to the possibility of making machines capable of performing the same sort of activities as human beings, on the grounds that they (the machines) could not possibly have consciousness. We believe this is a false argument too, and we hope to show it not only in one way but in various ways.

Another argument considered by Turing as a counter-argument is what he calls the argument of 'various disabilities'. This, in essence, says that there are various things which you cannot make a machine do that a human being can do. One of these 'things' is that the bodily structure of a human being would be extremely difficult to produce by any mechanical engineer, however sophisticated. We shall not argue this particular point, because we think that the intelligence shown, which is the basis of our argument, is independent of the structure which shows that intelligence, and therefore we do not necessarily want to produce a system made in the same way as a human. There are some doubts about this view, since some people feel that the fabric of manufacture is closely bound up with the system's performance. We will bear this objection in mind.

Other disabilities which should be mentioned are the inability to reproduce oneself and the inability to function under conditions of error. Von Neumann7 has shown that both these arguments can be overcome: artificially constructed systems can reproduce themselves, and furthermore, however much error there may be in the functioning of the system, provided that the system is sufficiently complex it will survive that error and correct it if necessary.

Turing has actually used the argument that if you specify precisely what it is that cannot be done by our artificially constructed system, then from the description of what it is that it cannot do we will manufacture the system to do it. This argument, while persuasive in part, is not necessarily wholly acceptable as regards the human-like claims of
an artificially constructed system. Where we say that it cannot do a certain thing, we may only be able to point to the end result, without saying how it is achieved, and therefore we may not give enough information to make it possible to reproduce what it is that is required or missing and is claimed to be impossible to reproduce. We shall be looking at this question of various disabilities throughout the text as it rears its head in various different contexts.

There is one further argument which we will consider, and that is that a computer, or any other artificially constructed system, only does what it is made to do by its programmer. This is a view that was held by Lady Lovelace. It is a popular fallacy that computers can only do what the programmers make the computers do. The fallacy arises from various considerations. One is the failure to remember that human beings only do what they are programmed to do, although they are programmed by various different features of the environment, including parents, teachers, etc., and are adaptable and change according to changing circumstances. Now in this sense it is perfectly true to say that computers can only do what they are programmed to do, but they can certainly be given exactly the same flexibility as humans. In other words, various people can program them and they can be made adaptive, so that they change and function in changing circumstances.

If you are making an intelligent machine to play chess, for example, then to make it merely reproduce a set of standard moves which had been thought out by the programmer would be perfectly useless. It is absolutely essential that it should be given only starting programs, all of which it is capable of changing in the light of its particular experience. We can, therefore, firmly disregard the argument that a computer only does what it is programmed to do as an important objection to our main theme; it has no relevance whatever. Nevertheless we shall be repeating this point more than once in the course of this book, since it is such a widespread misunderstanding that it needs to be emphasised frequently that it is just a misunderstanding.
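The force of the reply to Lady Lovelace's objection can be illustrated with a toy sketch. The Python fragment below is an assumption-laden illustration, not anything from the original text: the move list, the scoring rule and the update factors are invented solely to show a starting program whose later behaviour depends on its own experience.

```python
import random

class AdaptivePlayer:
    """Toy game player whose move preferences change with experience."""

    def __init__(self, moves):
        # The programmer supplies only a starting program: every move
        # begins with the same neutral weight.
        self.weights = {move: 1.0 for move in moves}

    def choose(self):
        # Prefer moves that have worked well so far, but keep exploring.
        total = sum(self.weights.values())
        pick = random.uniform(0, total)
        running = 0.0
        for move, weight in self.weights.items():
            running += weight
            if pick <= running:
                return move
        return move

    def learn(self, move, won):
        # Reinforce moves that led to wins, weaken those that led to losses.
        self.weights[move] *= 1.2 if won else 0.8

player = AdaptivePlayer(["open centre", "flank attack", "defend"])
for game in range(100):
    move = player.choose()
    won = random.random() < 0.5          # stand-in for a real game result
    player.learn(move, won)
# After play, the weights, and hence the behaviour, differ from anything
# the programmer explicitly wrote down.
```

After a run of games the player's behaviour is no longer anything the programmer explicitly specified, which is the point being made in the text.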
Turing's interrogation game and the objections it gives rise to automatically involve us in some philosophical and scientific discourse. This in turn means that we are bound to be involved in the philosophy of science, as well as epistemology, ontology and the like.
The philosophical implications of cybernetics do take us straight into a number of philosophical issues which could be regarded these days as virtually 'off the shelf'. They are the so-called 'mind—body problem', the problem of 'other minds', 'free will' and questions, notoriously difficult to deal with, such as that of 'consciousness'. To some extent these issues thread through our whole book and are fundamental to the philosophy of cybernetics. If, for example, you argue that minds are something private and characteristic of human beings and human beings alone, then, by definition, the question of making 'machine minds' is a contradiction. We must therefore be careful not to become involved in linguistic absurdities or obscurities if they can be avoided. Bearing in mind the problem as seen from the viewpoint of the interrogation game (which particularly deals with 'other minds'), let me say that the questions of 'free will' and 'consciousness' are discussed in detail later. So now, to complete this introductory chapter, we will say a few words explicitly on the 'mind—body' problem and how it can be regarded from the viewpoint of the cybernetician.

There are many ways of looking at the mind—body problem, but we will start by considering what are sometimes called cognitive terms. Sommerhoff8 has recently made a point which is often made in some form or another:

.... the deplorable mistake of Behaviourists of interpreting the meaning of all mental state concepts as synonyms with the respective behaviour dispositions.
These behaviour dispositions are not, argues Sommerhoff, what we commonly mean by such cognitive terms as 'perception', 'learning', etc. He further argues that Ryle9 was influential in the process of interpreting mental state concepts (words) as dispositions, so that the shift from private descriptions to public descriptions occurs and, by implication, misleads. Sommerhoff then says that Ryle believes that this (behavioural disposition) is what we mean all the time by these mentalistic terms, and that, says Sommerhoff emphatically, is not correct. Regardless of what Ryle thought about the matter, let us take this as our starting point, as at least being a more plausible argument on behalf of what we might call 'behaviourism' than that advanced by more extreme supporters of such a view: for example, Watson10.

There is nearly always a problem of translating from terminology
derived from one set of circumstances to new circumstances. Carnap has said on many occasions that people (scientists) should develop their own terminology, and it can be assumed that sooner or later someone will provide the translation either from one language into the other, or both into a third. Suppose now I say

"I perceive something to be blue and round"    (1)

or

"I think that it is correct to follow strategy A"    (2)
The words 'perceive' and 'think' are used to refer to processes of which I am aware (or partially aware), or, it might be said, to refer to a state (act or conclusion) at which I have arrived with or without any awareness of how I arrived there. We have argued elsewhere that the meanings we give to terms are in part conventional (a dog could clearly have been called a 'cat') and a matter of definition, but are often contextual (e.g. a word may be defined with increasing precision according to the nature of the discussion), and this allows us to accept at least two different meanings of cognitive terms, as in the case of 'perceive' and 'think' above. We can go further and define 'perceive' or 'think' in terms of something that I am aware of, so as, for example, to exclude specifically the possibility of 'unconscious thinking'.

Armed with these alternative ways of approaching definitions, and therefore meaning, we can now see that the gulf between the 'mental' and the 'behavioural' usage can be bridged, and bridged in more than one way. The first method of bridging is by identifying the mental use of 'think' with the process inferred from the results of apparently ruminative activity. We can say that thinking is a reallocation of beliefs that disposes us to respond differently. So now, when we say

"I think I should follow strategy B"    (3)
we are implying that a process has gone on (in your nervous system presumably) that allows you to say (3), or it could well have been (2). But what difference would it make to me (the listener) whether you meant me to understand either (2) or (3) in mentalistic or behavioural terms? “Probably none” is the answer and this leads to the obvious
question: Under what circumstances would it make a difference? The answer now is perhaps that if you were examining human intelligence you would wish to feel free to accept or reject the theory that brains are the seat of intelligence. This would leave you free to accept or reject the theory that when you say 'think' you are referring to a process (regardless of location) which occurs and about which you have no direct awareness. I might want to know that I can distinguish between you saying

"I am aware of a decision that I have taken and why I have taken it and how I arrived at that decision."    (4)

and

"I am aware of a decision that I have taken and am aware of why or how (or both) I arrived at it."    (5)
and then either (4) or (5), with the additional statement about the actual processes and location involved, would clarify my ability to distinguish one meaning from the other. It is perhaps only as a brain scientist or a behavioural scientist that you care about these distinctions. But a philosopher-cum-logician may also become interested if he is involved in discussions about psychologism or physicalism. But the main point is to see that the identification is a harmless procedure, as surely it is if all concerned know that it is being carried out. This is clear from Sommerhoff's understanding of what he says Behaviourists have done.

The second way of discussing what has happened is to say that these ways of speaking are two sides of the same coin. This falls short of identifying the meaning of the terms, but in practice is really no more than an acknowledgement that we want to avoid category mistakes9, whether or not such mistakes have serious consequences. So we say brain and mind are to each other what engine and performance are to each other, say, in an automobile. The result is that terms like 'perceive' and 'think', and other such terms as 'learn', 'believe', etc., are capable of being construed in either category. This could require using different terms, but need not as long as we all know what we mean by the different usages.
VARIATIONS ON THE THEME
Is this all there is to the mind—body problem? Not necessarily, since there are other leads into the same issue, although they too could all be construed, and therefore clarified, in a similar fashion. Pap11, for example, says that if he asserts that "X is ugly" and "X is wise", the first statement refers to X's body, but the second statement does not, and what it does refer to can be called his 'mind'. Chisholm12 says:

We have found it necessary to add that the organism, rather than being merely stimulated by the referent, must perceive it or recognise it, or have it manifested to him, or take something to be it, or else we must add that these intentional events do not occur.
Chisholm's argument is that it is the interpretation of stimulus signs that is indicative of mental activity. Finally, Ayer13 reviews a number of these points, and the most important of them, which we would wish to underline, is embodied in the following comment:

. . . mental events having been rendered causally otiose, the physicalist, in his pursuit of uniformity, proceeds to eliminate them altogether by reducing them to their physical counterparts.
Ayer concludes that this is either uninteresting or implausible, and so it is: uninteresting if it merely reminds people that underlying their own feelings lie mechanisms that cause them (or are them), and implausible if it is thought to lead to people accounting for their state of mind only by recourse to physically observable processes. Ayer cites the following example, which illustrates what he means:

I find it hard to imagine someone who seemed to himself to be feeling great pain, but on being informed that his brain was not in the condition which the theory presented, decided that he was mistaken in believing that he felt any pain at all. He might be persuaded that it was wrong to use the word 'pain' in these circumstances, since it had ceased to stand only for a sensation, but he would still be the best authority as to what he felt.
This, of course, is all quite acceptable but need not cause us any difficulty. Pap’s example reminds us that it is sometimes convenient to distinguish between different types of properties of systems. Certainly there are occasions when we wish to distinguish physical from mental in the same way as we distinguish structure from function (engine from performance) and also with Chisholm one
should not complain that intentional (interpretive) activities are being used as a distinction between mental and physical. This is also a distinction which has a contextual use. But neither view should conceal the fact that we are dealing with a situation which is unitary when it happens: people behave in a particular manner in particular circumstances, and can describe these manners in different ways for different purposes in different contexts.

Exactly the same can be said of Ayer's point. Physicalism is not of interest in the contexts he describes, and one might be tempted at this point to ask why we are discussing the matter at all. The answer is that there seems to be (or is) a problem because we have made a distinction between mind and body: even if it is just a verbal distinction, it represents a distinction of immediacy — 'feeling' something directly or observing something either directly or indirectly. We are then brought to such questions as "in what way does the mind interact with the body?" If we contrast this with the question "in what way does the brain interact with the body?", it can be seen that the second question raises no problem (except the difficulty of finding out), but the first does. The difficulty in the first case is not just finding an answer, but trying to discover what form the answer could take. This requires precisely the sort of analysis with which we started. Viewed as a causal process from which we can abstract certain aspects by introspection, we do not encounter any special difficulty. The only problem then — apart from the scientific one — is to remember that we may use language in different ways on different occasions, which is perfectly acceptable as long as we do not forget in which way we were using it at any particular time. This all comes quite close to what was said by Ayer14 a while ago:

The traditional disputes of philosophers are, for the most part, as unwarranted as they are unfruitful. The surest way to end them is to establish beyond question what should be the purpose and method of a philosophical enquiry ... if there are any questions which science leaves it to philosophy to answer, a straightforward process of elimination must lead to their discovery.
The question as to the manner in which the body and mind work together is purely scientific if 'mind' is taken to mean certain 'workings of the brain'. If it means what I am conscious of and how I describe that, then by the process of elimination I must decide that the answer required depends on the nature of the question (the context) and will fall generally into one of two categories: the first is
where I mean the term to be understood in ordinary mentalistic terms (all neural process barred), and the other where I mean to make a statement which applies to anyone (either others or myself), where I may or may not have recourse to behavioural or neural terms.

The reasons for spending some considerable time on the mind—body problem are two-fold. It seemed desirable to indicate the type of question that must arise when we talk of the philosophical foundations of cybernetics, and also to indicate our own manner of regarding it. There clearly is a problem, and the solution largely lies — in our opinion — in the way the problem is posed. A cybernetician would (or should) hypothesise, in my view, that mind is a function of brain, as we have already said. This leaves open the question as to how the brain works, since it may be that it is a huge storage and processing system (this is how the computer equivalent operates when programmed to behave 'intelligently'). It may, however, operate on very different principles, so that the brain-in-action uses the environment as its store (this is true in any event), and only retains minor traces inside itself. We then have a problem over whether the brain has a separate linguistic store and event store, which are closely associated, probably in alternate hemispheres, or whether the two are coalesced as one and operate in a dynamic manner where thoughts and words are one. This latter view is reminiscent of the one held by Wittgenstein15. But the cybernetician, in so far as he is interested primarily in synthesis, will only be influenced by that which is easier to construct.

To the cybernetician interested also in simulation, of course, the major problem will be to decide the manner in which the processing actually occurs. Such issues are very much bound up with problems of meaning, as considered by philosophers, so we must expect to become involved in such discussions as we have illustrated briefly in the case of the mind—body problem. We shall leave the matter at this point and turn, in the next chapter, to explanation and related topics.
Chapter 2
SCIENTIFIC METHOD AND EXPLANATION
This second chapter takes up the questions that arise from scientific method and explanation, since they are closely bound up with the foundations of cybernetics. Ashby* once said at an informal discussion that cybernetics was the same as scientific method. The reason for repeating his comment is not to agree with it unreservedly, but to accept that cybernetics is in many ways as much a method of 'carving up' and organising our knowledge as it is a matter of supplying models, such as robots, automata, self-programmed computers and automated machinery. It is an approach that is very general and sees the world as a set of dynamic interlocking systems — usually feedback systems — and often hierarchical in structure. There is a sense — slightly more general than we would argue — in which science sees things in an analogous manner. There is another close connection, and that arises because the automating of the scientist is one of the obvious tests to which a cybernetician could reasonably be put. That the scientist deals in scientific method, and that explanation is a large part of what that method entails, is fairly evident.

* After his paper read at a conference in Dayton, Ohio, in 1966.

Looking first at scientific method, we can argue that science is most obviously applicable to repetitive situations, because the occurrence of repetitive cycles makes the formulation of inductive generalisations and deterministic theories (laws, hypotheses, etc.) or near-deterministic theories possible. Nothing is better than to discover that certain events, such as the release of an object from a height, lead to its falling to the ground in a repeatable and wholly
predictable fashion. The use of deductive logic allows us to apply such generalisations to a whole variety of situations, and we call this process that of Explanation.

But before we move on more explicitly to Explanation as such, let us look beyond the more obvious scientific applications, such as are embodied in physics, chemistry, biology and the like, and consider a subject such as history, since we would argue that this too can be treated as a science. We have to accept that to be scientific means both to look for similarities among differences and differences among similarities. This means that each individual experiment in science might reflect — more or less — certain universal laws. But experiments also reflect in some measure variations due to individual circumstances, so that even each repetitive experiment in a laboratory varies within small limits from occasion to occasion. This makes it easier to accept the fact that the so-called one-off event of history, say, is just as amenable to the scientific approach.

So we can apply our scientific method at different levels. We can talk in terms of the deterministic, the probabilistic, the statistical and the uncertain, where no actual measure — even statistical — may be possible, but where, nevertheless, by such methods as axiomatisation and the use of heuristics, we can provide some statement of likelihood. These points are also illustrated in the study of behaviour — both individual and social — where we can derive the following levels of description: (1) the molar level of behaviour which deals with people en masse; (2) the molar level of behaviour which deals with the individual at the level of the type of person he is, etc.; (3) the molar-molecular level which makes reference to his dispositions, beliefs and the like, and perhaps, by implication, to his nervous system in molar fashion; and (4) the detailed firing, on a neuron-by-neuron basis, of the nervous system.

To take up the point about types first, Watkins16 has argued that:

An understanding of a complex social situation is always derived from a knowledge of the dispositions, beliefs and relationships of individuals. Its overt characteristics may be established empirically, but they are only explained by being shown to be the resultants of individual activities.
We would accept this view, and add that in our above grouping there is a 'reductionism' possible, whereby explanations in terms
of (1) can be couched in terms of (2) — this is Watkins' point. Similarly, explanations in terms of (2) can be couched in terms of (3), and (3) in terms of (4), and any 'lower' number in terms of any 'higher'. The interest here lies in the fact that a set of laws that are — or could be — deterministic at one level may be statistical at another. The reason would be that the feature that makes them statistical is their observability or their complexity or both. In other words, we are subject to general principles of Uncertainty — of which Heisenberg's principle is a particular case — whereby we have limits placed on our ability to observe events, and this introduces uncertainty. Then there is the fact of complexity, as in the case of the nervous system, so that we do not (or cannot) work out all the causal details and are necessarily committed to a broader approximation which is effectively statistical. The relevance of this to a science of history, say, is clear, since history is akin to the highly complex — and highly abstracted — in that in an obvious sense history includes economic, social and indeed all other affairs whatever. But we must return later to this, the very nub of our discussion.

Nagel17 has made the point (one that we are attempting to establish) that history and science are not so different. As he puts it, natural science is primarily Nomothetic and history is Ideographic, but they have a considerable degree of overlap. What this means is that universal statements certainly occur in history and singular statements certainly occur in science. As Nagel says:

For the effective execution of this task of external and internal criticism the historian must be armed with a wide assortment of general laws, borrowed from one or the other of the natural and social sciences. And, since historians claim to be more than mere chroniclers of the past, and attempt to understand and explain recorded actions in terms of their causes and consequences, they must obviously assume supposedly well-established laws of causal dependence. In brief, history is not a purely ideographic discipline.
This is exactly what we believe too, and would invoke ‘laws’ like that of diminishing returns (from economics) and other social, political and economic laws to provide explanations of such things as trade cycles and stop-go policies in our economy. The laws (or universal hypotheses) in question are often not explicitly stated but are subsumed and the effects explained by them. These same laws are then used for prediction, where it is the prediction rather than the explanation that requires the general laws and the periodicity or near
periodicity in events. Finally, then, we have come to base our belief in the possibility of the science of history* on the following considerations:

(1) Science is not simply deterministic and made up of repetitive patterns alone; it is often statistical and shows evolutionary trends which are not simply periodic.
(2) History is not simply statistical and concerned only with the description of one-off events.
(3) Explanation can be the same for history as for science.
(4) History includes as part of itself at least social, psychological and economic considerations from which general laws (or universal hypotheses) may be drawn.
(5) The fruitfulness of our form of historicism and its justification can only be shown by treating 'history' — in our sense — scientifically.

So much, then, for a very broad view of science, which amounts to the systematic handling of information within the compass of a certain context at a certain level of abstraction. It will be admitted — indeed asserted — that this means that anything can be studied scientifically. We can have business science, the science of gambling, even the science of dishwashing or window cleaning; although these last two may be, arguably, fairly trivial cases of applied physical chemistry or some such.

Our question now is whether the processes gone through by the scientist — and he is now potentially 'everyman' — can be automated. The answer we would give is "we see no reason to doubt it". But, of course, one would accept that the job on hand is to show how it can be done in practice. What we have argued so far, in essence, is that to be scientific is to be as every human being is in ordinary life, but to be so explicitly (not implicitly), to spell out one's beliefs, one's evidence and one's logic, and to be sufficiently precise in one's definitions and descriptions. The context will, in each case, determine how precise we need to be. We can now turn explicitly to explanation.
* I am greatly indebted to Mr. W. J. Chandler for many of the ideas mentioned here on history as a science.
THE NATURE OF EXPLANATION

The logic of explanation is something that has been discussed at great length by various people at various times, and the consequences of an attitude to scientific explanation are, like those of an attitude to epistemology and ontology, very important to our particular pilgrimage. We should mention here by way of example one of the best known approaches to explanation, which is due to Hempel18 and Hempel and Oppenheim19. In very broad terms, what they assert is that if the explanandum (that which is to be explained) is some particular sentence E, say, the explanans (that which provides the explanation) is made up of statements of antecedent conditions

C1, C2, ..., Ck

and general laws

L1, L2, ..., Lr

and the process of deriving explanations of the explanandum is one of logical deduction from the explanans, which is a combination of antecedent statements and general laws.

Hempel and Oppenheim19 in their article go on to point out that this type of causal explanation applies just as much to motivated behaviour as it does to any other kind of behaviour. The fact that behaviour may be purposive, and this is very relevant to cybernetic systems, does not mean that the purposes as represented in the system itself do not act as causes; in fact they clearly do. In other words, goal-directed behaviour is as much causal behaviour as any other type of behaviour.

In very general terms, Hempel and Oppenheim19 say that an ordered couple of sentences (T, C), in a language L, constitutes a potential explanans for a molecular sentence E, if and only if the following conditions are satisfied:

(1) T is essentially generalised and C is molecular.
(2) E is derivable in L from T and C jointly.
(3) T is compatible with at least one class of basic sentences which has C but not E as a consequence.
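It may help to display the deductive pattern just described as the schema in which it is usually drawn. The LaTeX sketch below simply lays out the symbols already introduced in the text; it adds no assumptions of its own.

```latex
% The deductive-nomological schema: the explanans (antecedent conditions
% plus general laws) entails the explanandum E by logical deduction.
\[
\begin{array}{ll}
C_1, C_2, \ldots, C_k & \text{antecedent conditions} \\
L_1, L_2, \ldots, L_r & \text{general laws} \\
\hline
E & \text{explanandum}
\end{array}
\]
```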
The difference between an explanans and a potential explanans is simply that, over and above the conditions for a potential explanans, T must be a theory and C must be true. The requirement is that T be a theory rather than merely true, for, as can then be shown, the generalised sentences occurring in an explanans have to constitute a theory, and not every essentially generalised sentence which is true is actually a theory. In other words, E is not always a consequence of a set of purely generalised true sentences.

Hempel and Oppenheim's language L has the syntactical structure of the lower functional calculus, subject to certain constraints, such as the omission of identity and the requirement that no sentence may contain free variables. If S is a sentence in L, S is formally true in L if S can be proved in L (a similar argument applies to a falsehood). S is molecular if it contains no variables, and is also atomic if it contains no connectives. In the following:

Q(a, b) ⊃ R(a) . ~S(a)
R(a)
Q(a, b, c)

all three are molecular sentences and the last two are also atomic. We eventually derive analytic and synthetic forms in the language, such as

(x)(Q(x) ∨ ~Q(x))

and

(x)(Q(x) ⊃ R(x))

where the truth of the second form depends directly on the interpretation placed on Q and R respectively. We shall say no more of L, since it takes a precise descriptive form which is fairly obvious, but the details of which add little to its explanatory value.

We should also here remark on the need for C to be true. The alternative to being true is that it should be highly confirmed, but Hempel and Oppenheim object to this on the grounds that we might have to change our minds in the course of time as to whether something is an explanation in the light of subsequent evidence. This notion of confirmation, however, seems not only acceptable but inevitable, since we cannot know the truth of empirical statements, only the extent to which they are confirmed, so confirmation would be acceptable as a replacement for truth.
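As a small worked illustration of condition (2), the following miniature explanans and explanandum use only forms of the kind just listed; the particular choice of T, C and E is ours, for illustration only, and does not come from Hempel and Oppenheim's paper.

```latex
% A miniature explanans/explanandum pair in the style of the language L.
% T is essentially generalised (a candidate theory), C is molecular and
% assumed true, and E follows from T and C jointly, as condition (2) asks.
\[
\begin{array}{lll}
T: & (x)(Q(x) \supset R(x)) & \text{general law} \\
C: & Q(a) & \text{antecedent condition} \\
\hline
E: & R(a) & \text{derived from } T \text{ and } C \text{ jointly}
\end{array}
\]
```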
We should add that the language L is in fact a simplified version of the language that would be needed to state scientific theories, but is provided solely as a basis for analysis. To make L sufficient would at least demand the inclusion of functions and provision for the use of real numbers. Hempel and Oppenheim claim that their concept of causal explanation can be generalised in various ways, for example to include statistical laws; however, we shall not make any more of that here.

One thing we should say, however, is that they do make the point, in their theory of the logic of explanation, that explanation can be at various levels. They give a very simple example of this, which we shall briefly recount. When an oar is dipped in water from a rowing boat, the oar appears to be bent. The explanation of this is that the laws of refraction vary in media of different density. This answers explicitly the question "Why is it bent?" in terms of the antecedent conditions and general laws by virtue of which the observed phenomenon occurs, but the difficulty now is that we can more generally ask "Why does the propagation of light conform to the laws of refraction?" It may seem reasonable to say that the laws of refraction are indeed a generalisation of the laws of the propagation of light, and here we pinpoint a problem that arises in explanation.

Pap20 has made the same point clearly. He says that if you ask why a particular acid turns the litmus paper red, an explanation which says "because all acids turn litmus paper red" is not sufficient, since you may now ask "Why do all acids turn litmus paper red?" We can always go on asking for more and more explanation indefinitely. Nothing perhaps brings this out more than historical explanation, where the causes of certain events can be traced back to other events, which can in turn be traced back to yet other events, indefinitely. Lucas21 makes the point very clearly:

Any scheme of explanation is bound to be open to some adverse criticism, because there will be some things it cannot explain. If it explains some things in terms of others, then these others are not themselves explained. And if it offers subsequent explanations of these in terms of yet other things, then those others in turn are left unexplained. Explanations are answers. And however many answers we give, there is always room for the further question 'Why?'. There must sometime be an end to answering. No scheme of explanation will be able to go on giving answers indefinitely.
Lucas’s point is well taken and is a reminder of the problems we are going to meet again and again in our explorations of the foundations
of cybernetics, and which are in effect essential to the foundations of science. One of these is that we can always go on performing certain processes indefinitely and there is no obvious end to them, except perhaps by circularity. A similar example to explanation that springs to mind here is that of definitions. We can define a term in terms of other terms, and we can define the other terms which are the defining terms in terms of yet other terms, and we can proceed indefinitely, either until we reach a circular state where terms are defined in terms of other terms which are themselves defined in the first lot of terms, or until in the end we depend upon what are in fact undefined terms which form our basic building bricks. The significant point here, and this is something which we will certainly meet again and again in this book, is that the system cannot get outside itself but is nevertheless anchored to features which are themselves outside the system.

There are ways out of these difficulties of an infinity of explanations or an infinite regress in definitions. The notion of infinity is not new, since we have means of generating infinite sets of symbols, such as in the case of the positive integers 1, 2, ..., n; this need not present any difficulty in handling arithmetic. Similarly, we need not worry that we can always go on asking 'why' about anything; it would indeed be strange if we could not do so, since of any statement about which we could not ask 'why', we should certainly want to ask "why can we not ask 'why'?", to which an answer would either explain 'why not' or explain 'why'.

The difficulty of unlimited explanation can be overcome if we supply some appropriate limit. Suppose we say that the answer to a request for an explanation must be limited to the use to which the explanation is to be put. If you ask "What is electricity?" or "Why does that happen?" when pressing down a light switch, answers of various degrees of detail can be given. What we, as answerers, should ask is "To what use are you going to put our explanation?" "Are you an electrician or are you a quantum physicist?" If the questioner answers that he is a seeker after truth, then you have to explain to him that we do not, and cannot, know the truth, but we can know aspects of it, or at least the extent, for example, to which alleged descriptions of it are confirmed by evidence. But this takes us further ahead in our story, and evidence, confirmation and truth must all be discussed at some stage.

Let us now return to the argument that Lucas has put forward to
try to show that the Hempel type of theory of explanation is inadequate. He points out that Hempel is concerned with scientific explanation, and the use of 'scientific' here may imply that other types of explanation are either 'unscientific' or 'non-scientific'; the first may be acceptable, but the second is not, since non-scientific explanations are manifestly available. We need not here make too much of this point, but accept the fact that explanation, whether scientific or otherwise, must depend upon some principle of verification, and this in turn depends upon evidence; we shall now say a few words on this point.

Evidential statements are relevant to some hypothesis H if they are part of the causal description of events which follow by virtue of H. Evidence is part of the process of confirmation or verification of the truth or otherwise of statements, and Hempel and Oppenheim make it clear that the statement C, or the set of statements C₁, C₂, ..., Cₖ, must be true. We have argued that 'highly confirmed' is all we can expect; but, more important, we must also recognise that what is now being asserted is the relevance of truth and confirmation22, which we shall be dealing with in a later chapter.

We can summarise our attitude so far to explanation by saying that explanations are always empirical statements which are relevant to other empirical statements, where the first set may constitute both antecedent conditions and general theories (perhaps laws). The statements need confirmation, and therefore explanations are to be looked at in terms of their relevance, the weight of evidence they supply, and the extent to which they are confirmed.

We would agree with Bergmann23 that theories are to explain rather than merely describe, to verify science and produce understanding. The only qualification we would add to what Bergmann has said is that to explain is also to describe: an explanation is a selected description, made at more than one level of generality, and confined to what is relevant either to the causal process of which the explanandum is a part or to the more general hypotheses (laws) from which the explanandum is derived.

We would also agree with Craik24 when he asserts that an explanation is like a distance receptor which allows the organism to react to expected change and thus anticipate events. This is clearly not the only purpose of an explanation, but it is certainly an aspect of how explanations can be, and doubtless are, used. Craik's view is of special interest because of his own important contribution to cybernetics.
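The deductive-nomological pattern that Hempel and Oppenheim describe, an explanandum derived from antecedent conditions together with general laws, lends itself to a toy computational illustration. The sketch below is ours, not theirs; the statement names are invented, 'laws' are treated simply as propositional implications, and derivation is done by forward chaining.

# A toy stand-in for the deductive-nomological pattern: explanandum E is
# "explained" by antecedent conditions C1..Ck together with laws L1..Lr
# when E is derivable from the conditions by repeated use of the laws.
# All names here are illustrative only.

def derivable(conditions, laws, explanandum):
    known = set(conditions)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in laws:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return explanandum in known

laws = [
    ({"oar_partly_in_water", "light_is_refracted_at_a_change_of_medium"},
     "oar_appears_bent"),
]
conditions = {"oar_partly_in_water", "light_is_refracted_at_a_change_of_medium"}

print(derivable(conditions, laws, "oar_appears_bent"))   # True
print(derivable(conditions, laws, "oar_is_bent"))        # False: not derivable

The regress noted by Pap and Lucas reappears at once in such a scheme: nothing in it explains the 'law' itself, which is simply taken as given, and the confirmation of the conditions and the law remains a separate, empirical matter.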
Finally, we would accept that all that has been said on explanation is in some measure dependent on our notions of determinacy, and we shall be examining this concept in Chapter 4.

We now have to add that a test case for cybernetics is the possibility of automating the process of explanation, and for our purpose here we need not be wholly concerned with whether or not all of what we mean by 'explanation' in ordinary language is covered by our own description, based in the main on Hempel and Oppenheim19. Work on theorem proving, coupled with the work on induction, seems to suggest that in principle there should be no difficulty in covering the essential steps involved. Work on concept formation, hypothesis formation (similar to induction) and the other various steps entailed in science and explanation gives further evidence for our belief.

We know, of course, that we are talking in terms of an interactive 'machine' that learns from experience, and we know that such a machine has a vast store, mostly of 'reference' information, available to it. It will, as human beings do, use the environment to store much of its information and then have the information as to where to retrieve what from the environment: libraries and other people (machines) in particular. This still leaves us with a certain doubt as to the ability of such a system to do anything of striking originality. Could it, for example, have come up with the special theory of relativity before Einstein did, granted that it had the same information available to it? We think that it could, but we shall reserve our arguments on this sort of 'final' and very difficult test until later in this book.

This brings us to a first-stage conclusion, to the effect that if one of our basic questions in cybernetics is "could machines be made to think?", then we accept that this requires analysis and consideration of both a semantic and a scientific kind, and also touches on traditional philosophical issues. We would say that by 'machine' we must mean some system (undoubtedly complex, adaptive, capable of interaction with its environment and of the storing of information) that can be constructed by other than the ordinary biological means of reproduction. By 'think' we mean capable of processing information through symbolic means of representing the "external world" and also of linguistically representing that "internal" model of both external and internal reality,
and solving problems, making decisions and planning. The conclusion we would also draw is that one should be capable of making sensible comments on traditional philosophical problems. One should also expect philosophers to question cyberneticians' intentions, as they do those of any other scientist.
Chapter 3
GODEL’S INCOMPLETENESS THEOREM
BACKGROUND TO GODEL'S THEOREM
We have seen in Chapter One that the Turing interrogation game gives us a basis for defining artificial intelligence, although we should not wish to be held to it precisely as a justification for the existence of artificially intelligent systems. If a loophole were found in such a method, it would not necessarily invalidate our belief in the main theme that "systems which are artificially constructed can be made to show at least the same degree of intelligence as human beings show". The counter-arguments mentioned by Turing25 are certainly addressed primarily towards this claim, and in this chapter we take up one aspect of the argument, which Turing calls "the mathematical objection". The aspect we are to consider is embodied in the work of Kurt Godel, with special emphasis on his incompleteness theorem.

To set the scene in which we develop Godel's incompleteness theorem (1931), and draw from it the implications which are relevant to our main theme of artificial intelligence, we must look back to the situation that existed when Godel's work first appeared. At that time, and just before Godel's contribution, there was considerable interest in the foundations of mathematics, and there were three principal views about those foundations. First of all there was the logistic view, developed in Principia Mathematica26, which asserted that mathematics was a part of logic and could be reduced to logic. A second view, which is very relevant to the development of Godel's incompleteness theorem, is the formalist view, particularly associated with the name
of David Hilbert.27 This asserted that mathematics was a game with symbols, in which you could not necessarily argue that 1 + 1 = 2 was the same as "one plus one equals two". If we interpret a symbolic system, such as we have in arithmetic, as applying to the real world, then we give meaning to the symbols in a clear sense; it was Hilbert's argument that such a semantic interpretation was not a necessary step in explaining mathematics. Mathematics was, in his view, a formal game played according to formal rules. Hilbert's formalism was known as 'proof theory'. He thought of mathematics as a formalised system made up of formulae. Some of these formulae are axioms and some, derived by proofs, are theorems. A formula is called provable if it is either an axiom or the end formula of a proof. Thus we presuppose a metamathematics, within which we supply the proofs for mathematics. It is natural that in such a system we should seek to show its consistency, i.e. that contradictions cannot occur, and this leads to a consideration of completeness, i.e. that every theorem we wish to prove in the system (and only those) is in fact provable in it. We shall be returning to these matters of axiomatic systems in Chapter Four, but in the meantime we should notice that Godel, in showing the incompleteness of an axiomatic system for mathematics, cast doubt on the Hilbert theory of formalism.

The third view of the foundations of mathematics, which is not so much our concern in this particular context, is the intuitionist view. Brouwer was the founder of this school and Heyting28 one of its principal spokesmen. They emphasised that the notions of mathematics were intuitively given. The main argument they put forward, apart from their insistence that mathematics is not based on logic, is in favour of the absolute necessity for constructive proofs. The intuitionists believe that the use of the excluded middle, that is the use of proofs based on everything in the world being either A or not-A, is illegitimate. Proofs had to be constructive in the sense that they never depend on reductio ad absurdum, but are positive applications of rules of inference to axioms or to statements derived from axioms.

It is not our concern to enter into the argument about the foundations of mathematics, merely to note that Godel's incompleteness theorems will be seen, as we have already mentioned, to cast a certain doubt on the nature of the formalist argument. This doubt about formalism is relevant to the general argument
about whether machines can be made to think, in so far as a machine may be regarded as a formal system. If we can show that a formal system is not adequate to describe a human being as it stands, then certain consequences may be seen to follow (some people21 do see certain consequences following) regarding our main theme. Let us now concern ourselves in more detail with the Godel situation.

Godel's arguments are based on elementary number theory, and he chose to characterise mathematics in terms of what are called primitive recursive functions. Primitive recursive functions are known to be adequate to encapsulate the whole of mathematics, and they can be investigated in terms of formal arithmetic alone. So from this axiomatic system, which is known to be sufficient to provide a foundation for classical mathematics, Godel was able to show that it was possible to derive a statement which could be formulated in the system but was not provable in it.

Godel's arguments are fairly complicated, and we shall not go into them in great detail, but indicate the form they take in general terms. The arguments themselves show that axiomatic systems, certainly axiomatic systems rich enough to provide a foundation for mathematics, cannot prove their own consistency. They are apparently either complete and inconsistent, or incomplete and consistent. It is important to notice, though, that he is not saying that systems are either complete or inconsistent as such; he is saying that there are certain statements belonging to the system that cannot be proved within the system.
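Since so much of what follows turns on what a formal system can and cannot prove, a deliberately trivial example may help to fix ideas. The system sketched below is invented purely for illustration; it is not Hilbert's system, nor the system L of the text. There is one axiom, two rewriting rules, and 'provable' simply means reachable from the axiom by finitely many applications of the rules.

# A trivial formal system over strings of 'a' and 'b'.
# Axiom: "ab".
# Rule 1: from any provable string S, derive S + "b".
# Rule 2: from any provable string S, derive "a" + S.
# The provable strings are exactly those of the form a...ab...b.

from collections import deque

AXIOM = "ab"

def successors(s):
    return [s + "b", "a" + s]

def provable(target, max_length=12):
    # Breadth-first search over derivations. Each rule lengthens the string,
    # so any provable string no longer than max_length will be found.
    seen, queue = {AXIOM}, deque([AXIOM])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        for t in successors(s):
            if len(t) <= max_length and t not in seen:
                seen.add(t)
                queue.append(t)
    return False

print(provable("aaabb"))   # True:  ab -> abb -> aabb -> aaabb
print(provable("ba"))      # False: no derivation produces "ba"

For such a toy system questions of consistency and completeness are trivial; Godel's point is that once a system is rich enough to express arithmetic about itself, the corresponding questions can no longer be settled from inside the system.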
GODEL NUMBERING
The technique employed by Godel is sometimes referred to as arithmetization. It is an extremely ingenious technique whereby all the symbols of the formal system (or axiomatic system) are mapped on to the positive integers. So, for example, the basic symbols

0   s   ~   ⊃   (   )

are represented by

2   3   4   5   6   7

the variables

x   y   z   ...

are represented by

11   11^2   11^3   ...

and the predicate letters

Fx   Gx   Hx   ...

are represented by

13   13^2   13^3   ...

and so on. So it is that every formula in our language, call it L (a formal or axiomatic system is perhaps a more suitable description), is in one-to-one correspondence with a number sequence, which we call a Godel number. For example, the formula

(y)F₂(y)

corresponds to the sequence

6   11^2   7   13^2   6   11^2   7

We now raise the successive primes to the power of these numbers, so that

2^6 . 3^(11^2) . 5^7 . 7^(13^2) . 11^6 . 13^(11^2) . 17^7

is the Godel number of the formula (y)F₂(y). So we have a situation, thanks to the factorization theorems of arithmetic, where any number, when it is factored into its primes, defines a formula, and any formula defines a Godel number.

We can now extend this notion to a sequence of formulae, as would occur in a proof. Suppose we have a sequence F₁, F₂, ..., Fₙ ending in F (which is the theorem proved). Each formula is represented by a Godel number, as above, and these in turn are used as the exponents of the successive primes, so that we have a number of the form

2^gn1 . 3^gn2 . 5^gn3 ...

where gn1 means "the Godel number of F₁", and so on.
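A short computational sketch may make the coding concrete. The symbol codes below follow the table just given; a formula is written as a list of symbols, and its Godel number is the product of successive primes raised to the corresponding codes. The spelling of the symbols and the use of Python are, of course, ours and purely illustrative.

# Godel numbering as described above: the i-th symbol of a formula contributes
# a factor (i-th prime) ** (code of that symbol).

SYMBOL_CODES = {
    "0": 2, "s": 3, "~": 4, ">": 5, "(": 6, ")": 7,  # basic symbols; ">" stands in for the conditional
    "x": 11, "y": 11 ** 2, "z": 11 ** 3,             # variables
    "F": 13, "G": 13 ** 2, "H": 13 ** 3,             # predicate letters (G plays the role of F2)
}

def primes(n):
    # First n primes by trial division; adequate for short formulae.
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def godel_number(symbols):
    number = 1
    for prime, symbol in zip(primes(len(symbols)), symbols):
        number *= prime ** SYMBOL_CODES[symbol]
    return number

# The example in the text, (y)F2(y), with G as the second predicate letter:
n = godel_number(["(", "y", ")", "G", "(", "y", ")"])
# n = 2**6 * 3**121 * 5**7 * 7**169 * 11**6 * 13**121 * 17**7
print(len(str(n)), "decimal digits in the Godel number")

Because factorization into primes is unique, the process can be reversed: factorizing the number recovers the exponents, and hence the formula, which is what allows statements about formulae and proofs to be re-expressed as statements about numbers.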
Every formal proof of F in L is thus a finite sequence of formulae, and every proof therefore has a Godel sequence number; the last formula of the proof will have the Godel number of the particular formula which has been proved. What we have done is to set up a correspondence between every symbol, formula and proof of the axiomatic system or formal language L and the natural integers.

The next stage in Godel's argument is very much more complicated, but it shows that one can, bearing in mind that primitive recursive functions are represented in our formal system L, produce a formula which is primitive recursive but which is not provable in L. The method used is of special interest, because it seems to involve a somewhat artificial mode of construction. It is in fact very similar to the mode of construction developed by Cantor in his diagonal method for proving the non-enumerability of the real numbers. We shall therefore first look briefly at Cantor's method.
CANTOR'S DIAGONAL METHOD
Cantor’s method assumes that real numbers could be placed in a specific order and we can write this down in a square array. all
a12
al3
#21
a22
a23
a31
a32
a33
■■■
■■■
He now said that if you take the leading diagonal you have the digits a₁₁, a₂₂, a₃₃, ..., which may be read as the fractional part of a real number (we are of course representing our numbers by non-terminating decimals). From this leading diagonal we can now construct a new decimal number which differs from the leading-diagonal number in every place: the first digit of the new number, which must still leave it lying in the interval (0, 1), must not be equal to a₁₁, the second digit must not be equal to a₂₂, and so on. Thus we can construct a new decimal number which, thanks to its fractional part, is different from every number in the enumeration.
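The construction can be carried out quite mechanically, and a small sketch (ours, not Cantor's) may make it vivid. Each listed number is represented by a function from digit positions to digits, and the diagonal number is built so as to differ from the k-th listed number in its k-th digit; the digits 5 and 6 are used so that awkward cases involving recurring 0s and 9s do not arise.

# Cantor's diagonal construction over (a finite prefix of) an alleged
# enumeration of decimals in the interval (0, 1).

def diagonal_digits(listed, n):
    # Return n digits of a number differing from the k-th listed number
    # in its k-th decimal place, for every k < n.
    digits = []
    for k in range(n):
        d = listed[k](k)
        digits.append(5 if d != 5 else 6)
    return digits

# A toy "enumeration": the k-th number has digit (k * i) % 10 in place i.
listed = [lambda i, k=k: (k * i) % 10 for k in range(8)]

diag = diagonal_digits(listed, 8)
print("0." + "".join(str(d) for d in diag))
for k in range(8):
    assert diag[k] != listed[k](k)   # the new number differs from each listed one

Reiss's complaint, quoted below, attaches to exactly this step: the new number is defined by reference to the very list in which it is then claimed not to occur.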
This method has been subjected to a certain amount of criticism, because it seems to make use in the definiens of something which is part of the definiendum. Reiss29 (p. 124) pointed out that Richard's paradox can be constructed in exactly the same way as Cantor's diagonal method, and he goes on to say:

It is evident that the reasoning by which the above paradox is obtained is identical with the reasoning by which it is proven that there are 'more' decimals than integers. Because the initially postulated polar definiens-to-definiendum relation is thereby violated, it is illegitimate to ask whether the diagonally defined decimal P is or is not included among the listed finitely defined decimals. The expression which defines the diagonal decimal P must maintain silence about itself and therefore about its own property of being of finite length. (It should be remarked that Cantor gave still another proof of the non-denumerability of the decimals which does not make use of the diagonal process. This proof, and similar ones which have been given, involve however the postulated existence of irrational numbers, and we have already pointed out that irrational numbers are brought into existence by the very same violation of the definiens-to-definiendum relation.)
Reiss goes on to say that the Godel argument is essentially the same as the Cantor argument, and therefore falls foul of the same criticism. All of these points are closely related to assumptions about the self-evident nature of certain axiomatic systems, such as those embodied in euclidean geometry. The emergence of non-euclidean geometry showed that 'self-evidence' was a poor guide, and then Riemann (one of the non-euclideans) showed that if euclidean geometry was consistent, so was non-euclidean geometry. Then Cantor's set theory ran into trouble (well before Reiss's comments), which led to Russell's theory of types.

The main problem is that of self-reference. Paradoxes like Richard's, mentioned above, had shown that apparently well-formed statements were by no means always meaningful. If a Cretan utters the statement "all Cretans are liars", what are we to believe about Cretans? If the barber in the village shaves everyone except those that shave themselves, who shaves the (clean-shaven) barber? It is clear that some statements can be self-referential and some cannot. If I say "the set of all statements that are embodied in this report are embodied in this report" I am correct. If I say "the set of all mathematicians is itself a mathematician" I am incorrect. All this amounts to what Reiss has called, in effect, a circularity: an unwanted case in which the reference of a class turns out to be also self-referential.
GODEL’S INCOMPLETENESS THEOREM
31
It also (sometimes) means that utterances can produce paradoxes or meaningless statements unless we have some sort of hierarchy, such as Russell's theory of types, which says that statements₁ about statements₂ are not of the same order. This is very similar to the point that statements in the meta-language are not to be confused with statements in the object language. We shall be following up some of these points in Chapter 4, but will now go back to look at Godel's argument, which apparently shows the incompleteness of L.
GODEL'S ARGUMENT
There are many versions, of varying detail, of Godel's argument30, 31, but we shall provide one in very brief form that gets to the heart of the argument, for those interested in the more technical side of the problem. This form of proof is based on that of Wilder.32

In the meta-language M of L, construct a relation B(x, y) which holds when y is the Godel sequence number of a formal proof of the formula F whose Godel number is x. If we replace each occurrence of the variable z in F by y, we have a new formula F′, with Godel number f(x, y) say. Both B and f can be shown to be primitive recursive. We next form the relation B(f(x, x), y), which is also primitive recursive and is therefore expressed in L by a formula H(x, y). The system is now known to contain the formula

~(Ey)H(z, y)

which we can give Godel number i. We can obtain a further formula by replacing z by i, to derive

~(Ey)H(i, y)

which has Godel number j, and we note that j = f(i, i).
From these rather awkward beginnings we can show that if L
(such as above) is consistent, then ~(Ey)H(i, y) is not provable in L. We start by assuming that it is provable in L, and let k be the Godel sequence number of the proof. Then B(j, k) holds, and so does G(i, k), where

G(i, k) = B(f(i, i), k)
Since G is primitive recursive and is expressed in L by H, it follows that H(i, k) is provable in L, and hence that (Ey)H(i, y) is provable in L; but we assumed that ~(Ey)H(i, y) was provable, so L is not consistent. Alternatively, if L is assumed consistent, we must conclude that ~(Ey)H(i, y) is not provable in L.
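It may help to gather the steps into one place; this is only a restatement of the argument just given, in the same notation, and adds nothing new. B(x, y) holds when y is the sequence number of a proof of the formula whose Godel number is x; f(x, y) is the Godel number of the formula obtained by substituting y for z; H(x, y) expresses B(f(x, x), y) within L; i is the Godel number of ~(Ey)H(z, y), so that j = f(i, i) is the Godel number of ~(Ey)H(i, y). A proof of this last formula, with sequence number k, would give

B(j, k),  hence  G(i, k) = B(f(i, i), k),  hence  H(i, k) provable,  hence  (Ey)H(i, y) provable,

contradicting the assumed provability of ~(Ey)H(i, y); so, if L is consistent, ~(Ey)H(i, y) is not provable in L.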
SOME CONSEQUENCES OF GODEL
We are not primarily concerned in this book with the consequences of Cantor's diagonal method of reasoning for set theory, nor are we concerned with the implications for mathematics per se of Godel's incompleteness theorem. What we are concerned with is the implications for our main theme as to whether thinking systems can be artificially constructed or not. Thus we can look with some doubt at the Cantor-Godel methods and ask ourselves what implications they have for the general argument.

Before leaving this aspect of the subject we should simply remark that Gentzen33 has in fact shown that the consistency of the same formal system L, which Godel showed could not be established within L itself, can be established by appeal to extra assumptions occurring in
the meta-language M of L. The methods used would not appeal to the intuitionists because of their non-constructive nature, and their correctness (or otherwise) is not vital to us, since the inferences we shall draw about the implications of Godel’s theorem would have occurred without Gentzen’s proofs. Gentzen’s proofs merely add to our confidence in taking the standpoint we are to take.
THE IMPLICATION OF GODEL'S THEOREM FOR ARTIFICIAL INTELLIGENCE
We must now examine a few of the views that have been put forward in the argument as to the implications of Godel's work. First of all we should state Lucas's view. Lucas21 states that a human being cannot be a logistic system L, which is what is implied, in his view, by our argument, since he can see that a formula that is unprovable in L is in fact true. Nagel and Newman, as quoted by Arbib31, say the following:

Godel's conclusions bear on the question whether a calculating machine can be constructed that would match the human brain in mathematical intelligence. Today's calculating machines have a fixed set of directives built into them; these directives correspond to the fixed rules of inference of formalised axiomatic procedure. The machines thus supply answers to problems by operating in a step-by-step manner, each step being controlled by the built-in directives. But, as Godel showed in his incompleteness theorem, there are innumerable problems in elementary number theory that fall outside the scope of a fixed axiomatic method, and that such engines are incapable of answering, however intricate and ingenious their built-in mechanisms may be and however rapid their operation. The human brain may, to be sure, have built-in limitations of its own ... (but Godel's theorem does indicate that the structure and power of the human mind are far more complex and subtle than any non-living machine yet envisaged).
On the other side of the fence we have the view of Hilary Putnam, also quoted by Arbib:

Let T be the Turing machine which 'represents' me in the sense that T can prove just those statements I can prove. Then the argument (Nagel and Newman give no argument, but I assume they must have this one in mind) is that by using Godel's technique I can discover a proposition that T cannot prove, and moreover I can prove the proposition. This refutes the assumption that T 'represents' me, hence I am not a Turing Machine. The fallacy is a misapplication of Godel's theorem, pure and simple. Given an arbitrary machine T,
all I can do is find a proposition U such that I can prove: if T is consistent, U is true, where U is undecidable by T if T is in fact consistent. However, T can perfectly well prove this too, i.e. T can prove that U is undecidable by it, and that if T is consistent then U is 'true' by the programmed interpretation. And the statement U, which T cannot prove (assuming consistency), I cannot prove either (unless I can prove that T is consistent, which is unlikely as T is very complicated).
The argument that Arbib finds more acceptable than the Putnam argument, though he stands on the same side as Putnam, is that due to Scriven:

Nagel and Newman are struck by the fact that whatever axioms and rules of inference one might give a computer, there would apparently be mathematical truths which it could never 'reach' from these axioms by the use of these rules. This is true, but their assumption that we could suppose ourselves to have given the machine an adequate idea of mathematical truths when we give it the axioms and rules of inference is not true. This would be to suppose the formalists were right, and they were shown by Godel to be wrong. The Godel theorem is no more an obstacle to a computer than to ourselves; one can only say that mathematics would have been easier if the formalists had been right, and it would in that case be comparatively easy to construct a mechanical mathematician. They were not, and it is not. But just as we can recognise the truth of an unprovable formula by comparing it with what it says and what we know to be the case, so can a computer do the same.
Another somewhat similar view which rebuts Godel's relevance to mechanism was stated by Whiteley34. Arbib goes on to say that existing computer programs do not show very much intelligence, but realises that the question of their potential intelligence is a question of principle. Even since Arbib wrote this, considerable developments in "computer intelligence" have been achieved, and we are beginning to see the possibility, mainly due to heuristic programming, of developing this computer intelligence. This whole matter we shall be discussing later.

But to return to the Godel argument: Lucas35 originally formulated an argument similar to his more recent one21, to which the present author replied in the following terms36:

... Mr. Lucas there states that the Godel theorem shows that any consistent formal system strong enough to produce arithmetic fails to prove, within its own structure, theorems that we, as humans ('Minds'), can nevertheless see to be true. From this he argues that 'minds' can do more than machines, since machines are essentially formal systems of this same type, and subject to the limitations implied by Godel's theorem. This is a very brief summary of what is a more complex and interesting argument, which does indeed show that
what we might call 'deduction systems' are limited by factors that do not limit human beings. Now the trouble is that this only disposes of deductive systems, and these are really of no cybernetic interest in any case. Cybernetics has been almost wholly concerned with what are called 'Inductive Systems' ...
The argument goes on in terms of learning machines, and it is pointed out that cybernetics is concerned primarily with adaptive machines that are capable of self-modification in the light of experience. This takes us away from the immediate impact of the Godel argument to another question which is also thought important to our main theme. Do self-adaptive machines, still capable of being described as a logistic system L, have characteristics which are essentially different from a 'fixed' logistic system L, such as that described in Principia Mathematica? Certainly it seems that an Interrogating Turing Machine has some of the same properties as an adaptive system. The questions that arise here concern the degree of adaptation. There seems to be no sense in which the Turing machine changes its goals, which are prescribed, however contingently, by the 'programmer'. (This is another version of the Lady Lovelace argument.) We can, however, change our own goals as human beings, and we can devise machines that change their goals. Such changes may or may not apply to all goals, and the behaviour capable of being changed is more characteristic of mathematicians than of mathematics. Mathematics is in fact not merely the process of proving formulae, as Scriven rightly seems to be saying. It is also the decision as to which axioms should be set up and which goals (areas of mathematics) should be developed. This ability to provide goals is socially orientated, or environment-dependent, and the decision as to how to set up the axiom systems is an inductive process, rather than a deductive one. The question comes up again as to what sort of formalised system describes this new adaptive (inductive) system. If it is another logistic system L (and it seems to be), then the arguments which show the irrelevance of Godel's theorem to mechanism still stand. If it is not, then Godel's theorem does not even apply.

A more recent argument which asserts that Godel's incompleteness theorem has no bearing on mechanism is supplied by Benacerraf.37 He asks himself in effect what his capacities and limitations would be if he were a Turing Machine, the Turing Machine being, of course, essentially the same as our formal system L. He says:
Given any Turing Machine Wj, either I cannot prove that Wj is adequate for arithmetic, or, if I am a subset of Wj, then I cannot prove that I can prove everything Wj can ... if I am a Turing Machine then I am barred by my very nature from obeying Socrates' profound philosophical injunction: Know thyself.
Benacerraf goes on to make other inferences about his own assumptions regarding the implications of Godel's incompleteness theorem and the possibility of his himself being a Turing Machine. These arguments are more dubious and have been criticised by Hanson.38 They do not concern us in our search for an interpretation of Godel's theorem, and we would now like to draw our own conclusions.

It seems clear to the present writer that the people who have argued against Nagel and Newman are correct, and it seems likely that all their arguments amount to the same thing. D.M. MacKay, in a private comment at a meeting in Oxford some years ago, put the matter very concisely when he said:

If you need an interpreter to go along with a logistic system L, then you can mechanise the interpreter and L together, and presumably make up a new logistic system L′.
This is exactly the view we would take. We would say, indeed, that the set of logistic systems L₁, L₂, ..., Lₙ that can be interpreted by a human being will be infinite in number, but there will still be some logistic systems, namely those which are a complete description of himself, which cannot be interpreted by himself. This is what Godel seems to assert, and it does not, of course, mean that we cannot do what Gentzen did for Godel and supply an external observer for whom I am a complete logistic system. What we are asserting is that if we are all logistic systems, those which we can interpret successfully are not a complete description of ourselves, and the logistic system which is a complete description of myself I cannot show to be complete. In other words, Benacerraf is completely right when he says we cannot know all about ourselves in this Godelian sense.

The conclusion to be drawn from this argument is that Godel's incompleteness theorem, while resting on a slightly dubious form of interpretation (which, as Reiss would say, offends the definiens-to-definiendum rule), nevertheless says nothing to suggest that a human being is not just as much a logistic system, or a formal system, or an axiomatic system, as a formal system we ourselves construct. To put it very simply and in a very straightforward manner,
Godel’s incompleteness theorem has no implications for mechanism, does nothing, in other words, to make one think that a human being could not be an artificially constructed system.
Chapter 4
DETERMINISM AND UNCERTAINTY
INTRODUCTION

The various arguments about deterministic and non-deterministic systems, most of them emanating from the philosophy of physics, are the subject of this chapter.

Unlike Godel's incompleteness theorem and the subject matter of some of our other chapters, this chapter is not directly concerned with a counter-argument to the Turing interrogation game or to our main theme of artificial intelligence. It is nevertheless associated with questions at the foundations of science, which is another important theme in our analysis. It is certainly relevant to the relationship between the observer and the observed, and insofar as the question of "free will" has sometimes been thought relevant to this, it does touch, if lightly, on two of Turing's counter-arguments25, which were called respectively "the argument from informality of behaviour" and "the argument from consciousness". In broad terms these two arguments are concerned with morality, or codified behaviour, and with creative capacities, especially artistic ones, and both in turn touch upon the question of free will.

Further to the above, Lucas21 has made the point that any deterministic physical system which is capable of being described in mathematical terms is subject to the limitations implied by Godel's incompleteness theorem.

We are also led by our interest in determinism into a consideration of probability and statistics, and these are of special interest to cybernetics, as well as of direct relevance to our understanding of the uncertainty of empirical knowledge.
We cannot possibly review here, even briefly, all the various approaches to probability theory that have been suggested. However, many of the now classical approaches, such as the finite-frequency theory, the Mises-Reichenbach theory and the Keynes theory, are well reviewed in the literature. The interested reader would do well to look at Russell's discussion of these matters39, which supplies a lucid summary of what has been said on them. We are in a rather similar position with respect to probability theory as we were with respect to the foundations of mathematics; it is a by-product of our interests, but not central to them.

Most of the efforts to provide a logical foundation for probability have got into difficulty because of the ambiguity of the word 'probability'. The finite-frequency theory, for example, provides a view which does not cater for all uses of the word 'probably', as is illustrated by statements such as "all our empirical knowledge is only probable", "X's theory is better than Y's", and so on. It seems here that we are dealing mainly with evidence and explanation, and this cannot always be reduced to a mathematical ratio between two classes. Furthermore, 'credibility' is to be distinguished from 'probability', the former term being much wider than the latter and dealing with what are sometimes referred to as 'subjective probabilities'.

We shall argue that all probabilities (whether or not a non-arbitrary metric can be applied) are statements of a kind whose truth is not certainly known, and where the evidence varies from that which is highly publicly verifiable (but not completely so) to that which is only slightly publicly verifiable and has a high degree of private support. This is not to argue that the private support does not supply good evidence, only that the goodness, or otherwise, of that evidence is not publicly verifiable, or at least has not been publicly verified.

We must bear this brief discussion of probability in mind when we are trying to assess the status of the Heisenberg uncertainty principle. We now turn to determinism.
DETERMINISM

A deterministic system, according to Pap11, is one describable as follows:
... that antecedents a₁, b₁, c₁ together with the law "if A, B, C, then D" uniquely determine the event d₁ is to say that d₁ is predictable from the mentioned antecedents by means of the law, such that if a different event from d₁ occurred, we would have to conclude that either our knowledge of the antecedents was deficient or our assumed law was false.
Pap referred to this as 'unique determinism', and the similarity to Hempel and Oppenheim's logic of explanation, which we discussed in Chapter 2, is evident. More recently, Mackay40 described determinacy in the following terms:

We call a system 'physically determinate', meaning that for anyone outside it there exists already a definite answer to any question about its future (or its past) which he would be correct to accept and mistaken to reject.
So by a deterministic system we are implying a precise causality in which every event is caused by aspects (not necessarily a single event) of the past and in its turn causes events in the future. This invites us to look at the notion of causality, a concept that has been under scrutiny for many years. Russell41 has investigated in some detail the meaning of 'cause' and 'effect' and has made abundantly clear how vaguely the terms have been used. He nevertheless considers that there is a perfectly good sense39 in which we can say that "A causes B". It is not that A invariably causes B, but that it will do so provided the circumstances are appropriate. It may be indeed that we should say "A causes B if and only if C", or "A causes B if and only if not D and E", and so on. Russell39 (p. 471) quotes J.S. Mill as saying:

All events can be divided into classes in such a way that every event of a certain class A is followed by an event of a certain class B, which may or may not be different from A. Given two such events, the event of class A is called the 'cause' and the event of class B is called the 'effect'.
It is supposed that the notion of causality is justified by induction, although there is some doubt about this. We also want to know whether "A causes B" means that A is always (or almost always) followed by B, or something more. In ordinary molar terms, we certainly mean something more. We need only consider a case such as a ball being kicked by a man to say "the man caused the ball to change its position and move from here to there". It is clearly not merely a matter of one event following another, but rather of one event impinging upon (or compelling) another event, or set of events.
The principal difficulty over causality seems to be due to the failure to recognise the facts of multi-causality and also the conditional nature of causality. We would suppose that induction does indeed supply evidence for causality, since it is the repetitive nature of events which, in the main, makes science possible. Whether we wish to say that action-by-contact (as in the kicking of the ball) is a causal feature which applies at all levels of complexity, including the sub-microscopic, must remain an open question. As Rosenfeld42 has said, the question as to whether or not there is a determinist substratum in nature is something that must be discovered by empirical investigation.
PHYSICAL DETERMINISM AND THE UNCERTAINTY PRINCIPLE

The Heisenberg43 principle states that if we try to measure simultaneously the position and velocity of a sub-microscopic particle, then there is a decrease in the accuracy of one measurement as the accuracy of the other increases. The product of these two errors can never be less than h/2π, where h is known as Planck's constant. The fact that empirical measurements are liable to error is not new; it was known long before Heisenberg. What Heisenberg was suggesting was that here is a limit, in principle, to what we can observe. In other words, it is not merely, as Dirac and others have suggested, that we have to "average" our observations, but that there is an indeterminacy which provides a limit to even theoretical accuracy.

It is important that we now discuss the various views surrounding the so-called new indeterminacy that has been introduced by Heisenberg into twentieth-century physics, to try to see its implications. Eddington44 and Jeans45 in particular have been proponents of such indeterminacy. Eddington44 (1939, p. 63) makes the point that whereas in the older classical physics the laws of nature and initial data are sufficient to provide predictions for the future, this, he argues, is only possible in a deterministic universe and cannot apply to "the current indeterministic system of physics". Eddington44 (1939, p. 90) also considers the possibility that indeterminacy only comes about because our physical concepts are no longer suited to precise observation. He argues that such indeterminacy (as opposed to indeterminism) does not touch the point of
indeterminism at all; all we can know are probabilities. As he puts it:

The entire system of laws of physics at present recognised is concerned with probability, which we have seen signifies something that has an irreversible relation to observation. As a means of calculating future probabilities the laws form a completely deterministic system; but as a means of calculating future observational knowledge the system of law is indeterministic.
Jeans45 (1943, p. 210) makes the point:

... the indeterminacy does not reside in objective nature, but only in our subjective interpretation of nature.
Both Eddington and Jeans seem to have felt that this new indeterminacy directly affects the problem of free will, but we shall leave that aspect of the subject until a later chapter. Susan Stebbing46 analyses the views of Eddington and Jeans in some detail, and criticises much of their logical argument. She herself sums up physical determinism in a traditional manner:

If the initial state of a system and the laws of its behaviour are known, then its states at any other moment can be predicted, and the prediction can be verified by measurement.
Classical physics is clearly deterministic in this sense, and it was assumed to be possible to eliminate all observational errors, at least in principle. Quantum mechanics, however, has made it clear that even in principle it is impossible to know the initial conditions. This again raises the point as to whether or not the failure to know the initial conditions is a barrier in principle; we need not, and indeed cannot, decide this matter at this stage, since it remains to be seen whether it is such a barrier. We would also, of course, wish to assert that the barrier supplied by the Heisenberg principle is really only a particular case of a more general principle, namely that the truth of empirical statements can never be known beyond doubt.

We should notice now the relevance of our brief investigation of probability theory. In other words, the same continuum between the limits of almost completely "subjective probability" and almost completely "objective probability" is mirrored in the interpretation of observations in quantum physics and in our assessment of the credibility of any empirical statement. We are not, of course, saying that these two classes of statements are at the two extreme limit
points, but that they represent different points, and that such points are represented appropriately by a continuum.
LOGICAL INDETERMINACY

We have already quoted Mackay40 on what he regards as a deterministic system. His view of determinism can in part be gauged from the title of his more recent article on the subject, "The bankruptcy of determinism", which opens with the statement:

Ever since Heisenberg's Uncertainty Principle shattered the deterministic image of physics ...
Mackay goes on to discuss free will, and we shall ourselves be returning to that aspect of the subject in a later chapter. Here, however, we are only concerned with setting out Mackay's argument for dispensing with determinism. He reminds us that Popper47 has argued (Mackay says 'shown') that a computer of unlimited capacity would be unable completely to predict the future of a physical system of which it was itself a part.* We shall not argue this, but merely note that there is a certain similarity between the point being made and the Godel type of argument, where a logistic system may not be able to show its own completeness.

*The italics are Mackay's.

A central feature of Mackay's argument is that although we might in principle be able to monitor someone's brain without changing it, we cannot claim that he will be able unconditionally to assent to some event as being inevitable. Let us concentrate on the cognitive system (C, say) of the person's brain. We can in principle frame a complete specification of this system C. But the owner of the brain cannot possibly assent to our description, because as soon as he is asked to make his assent, C, or his brain state, has by such an act changed. Thus an indeterminacy has been introduced into the person's cognitive activities which is unconnected with the physical determinacy of his brain workings. To distinguish this sort of indeterminacy from physical indeterminacy, Mackay has called it 'logical indeterminacy'.

The next stage in the argument is to assert that any one observer X and any other observer Y would in general disagree about X's brain
state. If X agreed with Y's specification, X would be in a different state from the one Y specified, and must therefore be wrong to assent, even though Y's specification may be correct. The likeness to the notion of simultaneity in relativity theory is clear: if two observers X and Y are making observations, say, on the velocity of a third body Z, then in general, even if both are correct, their observations will not be identical. Mackay goes on to say that whereas we may be able to predict X's behaviour, his behaviour was not inevitable for him. We shall not at this stage discuss this last issue, which leads to the claim of freedom of choice and all the implications which follow; we shall dwell simply on the implications for logical determinism.

The main issue of determinism, in the physical sense, is untouched by the Mackay argument, but even the argument about logical determinism is open to question. The fact that there are different descriptions (all of which, we will say, are correct) of the same system surely does not make the system indeterminate. What it does is make it relativistic, which is a vastly different thing. The so-called indeterminacy in X is with respect to his own state, whereas the notion of determinacy refers to the system external to X; there is no requirement that the determinacy be self-referential. But even if we demanded that X should be able to predict not only the future of the world outside him but the world of X himself ('inside him') as well, the Mackay argument still lacks persuasion. There is no reason at all for asserting that X may not assent to a description of himself which includes the predicted state of assenting to that description.

Let us be careful to note that Mackay compares assent to an event, such as a solar eclipse, which is not (significantly) influenced by X's assent, with assent to a state of C (X's cognitive system) which is influenced by X's assent. The argument is that X's assent in the second case does (or can) change the situation, making it non-inevitable but predictable. We would simply say that if it was predictable by Y it must be inevitable for X, even though X himself may not be able to predict X's own behaviour. Even this last point, though, depends upon X's ability to make (in some state of C) correct predictions about future states of C. We shall return to this matter again later, but the main point to be made now is that Mackay's title 'the bankruptcy of determinism' is not justified, since what he has shown in his article certainly does not touch physical determinism at all. The extent to
which he seems to touch logical determinism is the extent to which he denies the possibility of self-reference in an organism, combined with a 'phenomenalistic' type of determinism.

Let us look again at the main point. X cannot, so Mackay argues, agree with Y's description of X, because X's assent changes that description. What, precisely, though, is a description of X? We would say that it is a description that includes the possibility of alternative states of X. It might be objected that if X knows what his decision will be (and this may be only momentarily before he decides) then his description is changed; we would say that this is not so, merely that the decision has become determined, and that both X and Y could, in principle, know in advance what that determination would be.

Of the many papers that have been written on determinism, one other that we should notice, which also picks on a logical point, is that due to Mayo.48 Mayo argues that the first problem (he calls it pragmatic) is that we can never be sure that our description of a future state, a prediction, is in accord with what actually happens. This is because:

The resources of language are finite and the possibilities of what may happen are not.
The confusion here is twofold. In the first place the resources of language are not finite, but obviously infinite; and in the second place, although we may use language as a finite resource (there is no isomorphism between symbols and reality), the fact that we may not be able exactly to prescribe the predicted future state only means that there is some doubt as to whether the prediction has been adequately achieved, and exactly the same problem applies to every empirical statement. We do not know whether it is true or not; we can only know the extent to which it is confirmed. This involves the usual vagueness that occurs precisely because of the presence of an observer who is trying to model his environment and must do so by trading in limited means.

Mayo's second argument is that we cannot predict all the features of a future state. This he says is a logical objection. In fact it is virtually the same as his first objection: that no statement is the same as its reference, or what it describes. Nobody seriously thinks (or should think) that a statement is the same as its reference49, and nobody should suppose either that determinism is involved in such an argument. All that is involved, as usual, is the well-known principle
that empirical statements are made by observers with a limited power to detect what in fact occurs or, to put it plainly, that empirical statements are not known to be true or false; they are only confirmed to some extent.
SUMMARY OF A VIEWPOINT
It seems that the notion of causality is a principle with more than one meaning, as in the case of probability. Causality could indeed be likened to probability, or, say, to evolution, and the question should be asked as to the exact status it has.

We shall argue later that we can formalise any concept or concepts by defining them in axiomatic terms. This does not necessarily mean, however, that the formalised version of the concept takes priority over its empirical status. It is precisely this situation, as we shall see, that applies to pure pragmatics (Chapter 8), and it also applies to the logical foundations of induction. Induction may be capable of being formalised, but this does not mean that induction requires any more justification than is provided by the empirical evidence. This means that to ask whether causality is justified by induction is not a question of the logical justification of causality. Causality is more or less useful as a principle and more or less justified by empirical evidence.

The importance of quantum mechanics lies in drawing attention to the fact that causality is a generalisation from our own molar (or macroscopic) experiences, and may or may not apply to the sub-microscopic or the supra-astronomic. Such an argument can be applied in exactly the same way to probability, evolution and other concepts; indeed to any concept that is a generalisation from our molar experience.

The problem therefore is whether determinism falls foul of empirical or logical evidence. We believe that there is no evidence on the question of whether the indeterminacy is a phenomenon of the observer-observed relationship or something intrinsic to nature. Almost everyone seems to believe the latter, but Mackay is unusual in choosing a definition of determinism which is explicitly 'phenomenalistic'.

The most important point, however, as far as our main theme is concerned, is that the status of determinism does not directly affect
the possibility of building artificially intelligent systems, since if human beings are (logically) indeterminate, artefacts can be the same. If the environment is indeterminate (and we have no sufficient reason for supposing that it is) then it will be the same for both humans and artefacts. Therefore if Popper's argument is correct (and there are reasons for doubting it), the case for a limit, in principle, to what we can know is made; we would argue that this limit will apply to the human as well, and this is Mackay's own argument.
Chapter 5
AXIOMS, THEOREMS, AND FORMALISATION
In Chapter One we established the details of the Interrogation Game and agreed that we could think of artificial intelligence in these terms. In other words, if we could provide an interrogation game which failed to distinguish between the human being and the machine, then we would have to grant the machine the same capacities as the human being. We do not, of course, limit ourselves by such an argument, since we would undoubtedly want to argue that if we understand the principles by which human beings behave intelligently we can certainly generalise them, and provide in the course of time abilities that would surpass the human. This seems to follow directly from assuming that we understand the general principles by which human intelligence operates.

In Chapter Three we dealt with the objection to the theme "that an artificially intelligent system can achieve the same level as, or even surpass, a human being in intelligence" raised by the arguments implicit in Godel's incompleteness theorem. We saw there that the incompleteness theorem is primarily a problem in self-reference. The fact of incompleteness is less a fact and more a matter of proof, and depends on the ability (or inability) of the system to show completeness within itself. It is doubtful whether such a property is achieved by any system, let alone an axiomatic system sufficiently complex to provide the basis for the whole of mathematics. Whether this is true or not, it seems clear that the Godel incompleteness theorem argument does not stand up as a basis for refutation of the notion of making artificially intelligent systems as efficient as human beings.
In Chapter Four we analysed determinism and indeterminacy and showed that whereas it is possible that there is no deterministic substratum in actuality from which physical laws are drawn, this alone does not affect our main theme. In fact we discovered in Chapter Four that there are many different attitudes to determinism: some think of it in logical terms, others in relativistic terms, and others in physical terms; none of these arguments seems to be any barrier to our main goal. We shall continue to assume that we are living in a deterministic world, which is merely indeterministic to us as individuals because of our inability to get beyond a certain level in observing the world independently of our own reactions to that world.

It will be quite clear by now that much of our argument is concerned with what might be called the relation between the observer and the observed. In putting it this way we are not necessarily espousing a form of philosophical realism; we accept that the phenomenalistic interpretation of events might still be 'true' or 'justified' even in our world of the observer and the observed. Our inclination, though, is towards the assumption of a real world independent of the observer, wherein the observer has limited knowledge of the world because of the particular relationship he has with that world.

We come now to consider axiomatic systems in general and the problem of formalisation in particular.
FORMALISATION
By formalisation we mean something very similar to what was intended by Braithwaite50 when he distinguished between models and theories. He thought of theories as the interpretation of models, and models as the formalisation of theories. The words 'model' and 'theory' are relative to each other in a context, but we think of the model in general as the logical basis in which the empirical description, implied by the theory, is embedded. It is important, however, to accept that formalisation is a process that can be carried out in stages, to any desired degree. We can start with an empirical theory, say of learning, as in the case of Hull's theory51, and then formalise it, as is done by Fitch and Barry52. We could then take Fitch and Barry's formalisation of Hull's theory of learning and
formalise it in far greater detail, proceeding from mathematical to logical terms, and so on into any degree of detail and precision required for some particular purpose. Let us give an example of the process.53 In Hull's theory of learning the first two basic concepts are:

A1. Stimulus event. Each stimulus event is a specific event of very short duration, taking place in the body of the subject and having a causal effect on the central nervous system of the subject. Events of longer duration, which might be expected to have been classified as also being stimulus events, can be viewed as sequences of these very short stimulus events. (In a more detailed system the maximum length of duration of a stimulus event would have to be specified.)

A2. Response event. Each response event is a specific event of very short duration, taking place in the body of the subject and resulting from the operation of the central nervous system. Responses of long duration are regarded as sequences of these short response events.

A typical axiom of the formalised theory then reads:

Axiom B2. ΔsHr(t₁, t₂, s', r') = F(x, y, z, w), where x = J(t₁, t₂), t₁ = T(r'),
y = S(s,s,), z = S(r, /) and w = 7V) - T(s') We now quote the two main basic concepts involved: A6. Increment of tendency. By the ‘increment of tendency’ is here meant the part of the total tendency JIr(t) due to the occurrence of a stimulus event s' and a response event r . Increment of ten-
AXIOMS, THEOREMS AND FORMALISATION
51
dency is a function of time, because it depends on the ‘effective drive reduction’ J(tx, t2), which is itself a function of time. Negative increments of tendency may perhaps be regarded as positive increments of inhibition. We write A ^ft', s' ,r) to stand for the increment of sHft) due to the occurrence of stimulus event s', and the response event r . A12. Function for dependence of increment of tendency. This is a mathematical function, denoted by the letter F and expressing the dependence of the increment AsH^t2, s', r) on effective drive reduction J(tly t2), on stimulus similarity S(s, s') on response similarity S(r, r), and on the time interval from s' to r', namely on T(r') — T(s'), where tx - T(r'). The process of formalisation at this level can well be illustrated by Woodger’s54 contribution. He set up an axiomatic foundation for biology in terms of ‘things’ and their ‘parts’. Mom(x) = (u)(v)(P(u, x). P(v, x) D~ Tr(u, v)) which tells us that a space—time region x is momentary if all of it occurs together in time. Then we have: (x)(Ey)(P(y, x). Mom ((y)) which says that every individual has momentary parts. We now introduce the function Sli(x, y) to mean ‘x is a slice of a thing y’, i.e. Sli(x,y) = Th(y). Mom(x). P(x,y). ^ (£z)(Mom(z). P(z,y). .P(x, z). (x ^)) from which we can deduce theorems, the first of which says “two different slices of a thing have no common parts”. Sli(x,z). Sli(y, z). (x^y) 3 ^ (Eu)(P(u, x). P(u,y)) and “of two different slices of a thing, one is earlier than the other”.
Sli(x, z) . Sli(y, z) . (x ≠ y) ⊃ Tr(x, y) ∨ Tr(y, x)

Gradually and painstakingly from these beginnings Woodger builds up a formal language for describing cells and organisms and their biological behaviour. We shall not pursue this matter further here, although such methods are characteristic of the process of formalisation. It will be appreciated that all along we have tended towards a view that science is not a process of discovering ultimate truths or finding out what is certain (this is more the philosophers' concern, though probably wrongly so); we are concerned with answering specific questions in a specific context for a particular purpose. To some extent we would accept the fact that from this we can talk in very general terms and therefore to some extent would accept the fact that science and philosophy were concerned with the pursuit of ultimate truth, but in the same way as we are barred from knowing the ultimate truth of empirical statements, and have to be satisfied with the extent to which they are confirmed by further observations, so we are barred from ever appreciating the ultimate facts independent of our understanding of our observation of those facts. It will be appreciated here that it does not effectively matter whether the phenomenalistic, or even the solipsistic, argument is correct or not; that which is true is true, and this is regardless of whether we know it or not. We shall not of course in this chapter want to become deeply involved in the philosophical issues of truth, but we can say the following things to give some indication of our attitude to this problem in so far as it is relevant to our particular search. We shall say that we accept the semantic theory of truth55 which claims that the semantic definition is independent of ontological commitments either to realism, phenomenalism or any other ontological viewpoint. We accept this in the sense that we believe that statements are either true or false subject in many cases to conventions as to the precise meaning of the statement, and we can only know the extent to which they are confirmed by external evidence.46 We hope, ontologically, to take what is a critically realistic view and say that we think of the world (the justification for this is its utility in practical terms) as independent of the people in it and the people who make the observations. We also think that truth, apart from being predicated of sentences which are either true or false and
we may not know which, also applies to reality, i.e. that which is, or that which is “real” is true. Statements therefore about “what is” are true statements or false statements even though we may not know which. To this extent we are accepting the correspondence theory of truth. The coherence theory also plays a part however, because the coherence theory tends to provide a weight of evidence in support of a particular statement being accepted as true, to the extent to which it coheres with other such statements which are already accepted as true. In other words, correspondence is really the test of the truth of a statement, to the extent to which it can be tested, whereas coherence is the extent of the acceptance of confirmation, or the extent to which it can be accepted. In this background of an understanding of confirmation and truth, we can think of the formalisation process very much in Braithwaitian terms, where we have models and theories which are related to each other by the processes of interpretation and formalisation. The process of interpretation is really the process of applying a meaning to a formal system. Hilbert’s formal mathematical argument whereby 1 + 1 = 2 is in fact intended to be the same as “one plus one equals two”. We should be clear that the intended interpretation is always in mind when we formulate axiomatic systems for models. Map makers do not make abstract maps in general and then look for territory to fit the maps. We usually have particular territories in mind, and then provide maps which we think might sufficiently fit some particular purpose and then place the interpretation on the map. Map making to this extent is very much like model making and model making so often in modern science means the construction of axiomatic systems. When we describe axiomatic systems it must be recognised that the description is in terms of a higher language. We always have some meta-language M in which to describe an axiomatic system or precise language L. The precise language L may well be a model or some theory which can be described in M, and therefore M can be the interpretation of L, but most generally M is restricted to a statement of certain properties that the axiomatic system L must have. In other words, it usually has such things as rules of formation, definitions, certain meta-theorems which lead to a proof of the completeness or otherwise, consistency or otherwise, independence of the axioms or otherwise, of the axiomatic system. It also is usually concerned with whether the axiomatic system has a decision procedure or not. We
know of course that some axiomatic systems do have, such as in the case of the propositional calculi. What is important to notice at this point, and this bears some relationship to Godel's incompleteness argument, is that even in a relatively simple system such as the propositional calculus, the completeness and consistency are proved as meta-theorems, not as theorems of the calculus itself. It seems that to this extent all statements about precise models, or formal axiomatic systems, must be in terms of some higher language. This refers back to Russell's57 work on theory of types which reminds us again that self-reference is a limited and often confusing activity, and that, in general, statements about statements and the statements themselves are not of the same order and cannot be intermingled in a logical argument. They must be distinguished as two different levels of a hierarchy. This hierarchical approach is the most effective way of designing the sort of formal systems we have in mind. Formalisation therefore is the making precise of a model within the context of an imprecise model. The precision is gained, in other words, by reference to imprecise statements. Very much the same sort of situation exists with respect to definition. If we try to define every term in a whole argument we find ourselves appealing to other terms which are undefined, or we have to define the terms circularly, in terms of other terms which have already been defined in terms of the terms we are now defining. It seems most useful to accept the fact that precise models are bound to be dependent upon imprecise ideas, and therefore unanalysed terms58. Such unanalysed or undefined terms are perfectly acceptable (indeed inevitable) as providing a foundation for a precise system. Having gone this far, it will be appreciated that we are arguing against the philosophical search for certainty, and we are arguing in favour of a "scientific approach" to an understanding of the world around us by appeal to some sort of axiomatic foundation. Let us now turn briefly to the subject of Syntax, Semantics and Pragmatics.
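Before doing so, it may help to see how small the step is from an axiomatisation of this kind to something that can be run mechanically. The following is a minimal illustrative sketch (in Python, a choice of ours and not anything suggested by the original text): it encodes the part, momentary, thing and slice predicates quoted above over a tiny invented universe of four point-atoms, and checks the two theorems about slices by brute force. The particular universe, and the reading of Tr as "lies wholly earlier in time", are assumptions made only for the sake of the illustration.

    from itertools import combinations

    # Toy universe (invented for this illustration): four point-atoms, each a
    # (label, time) pair; a "region" is any non-empty set of atoms.
    ATOMS = [('a', 1), ('b', 1), ('c', 2), ('d', 2)]
    THING = frozenset(ATOMS)        # the single "thing" of this toy model

    def regions():
        for r in range(1, len(ATOMS) + 1):
            for combo in combinations(ATOMS, r):
                yield frozenset(combo)

    def P(u, x):                    # u is a part of x
        return u <= x

    def Tr(u, v):                   # u lies wholly earlier in time than v
        return max(t for _, t in u) < min(t for _, t in v)

    def Mom(x):                     # x is momentary: no part of x precedes another
        return all(not Tr(u, v)
                   for u in regions() if P(u, x)
                   for v in regions() if P(v, x))

    def Th(y):                      # y is a thing
        return y == THING

    def Sli(x, y):                  # x is a slice (maximal momentary part) of y
        return (Th(y) and Mom(x) and P(x, y) and
                not any(Mom(z) and P(z, y) and P(x, z) and x != z
                        for z in regions()))

    slices = [x for x in regions() if Sli(x, THING)]

    # Theorem 1: two different slices of a thing have no common parts.
    assert all(not any(P(u, x) and P(u, y) for u in regions())
               for x in slices for y in slices if x != y)

    # Theorem 2: of two different slices, one is earlier than the other.
    assert all(Tr(x, y) or Tr(y, x) for x in slices for y in slices if x != y)

    print(len(slices), "slices found; both quoted theorems hold in this toy model.")

Nothing in the sketch depends on what the atoms "are": like the axiomatic system it mimics, it is indifferent to interpretation until one is supplied.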
SYNTAX, SEMANTICS AND PRAGMATICS
The distinction between syntax, semantics and pragmatics is due primarily to Carnap59 and Morris.58 The idea is that syntax is concerned with the precise relationship between terms, etc., in a grammar
whose rules apply to the well-formed nature of statements made in the system, whereas semantics applies to the interpretation placed on the system. In a sense syntax is a model in the Braithwaitian sense and semantics is an interpretation of that model, or a theory, also in a Braithwaitian sense. So semantics is concerned with meaning, and this implies reference, designation, connotation and the like. This implies in turn all the attendant difficulties that we have had to try to sort out, i.e. "what is an adequate reference?" and "where do connotation and denotation differ from each other?", etc. Pragmatics is not only concerned with the well-formed nature of the statements (i.e. syntax) and is not only concerned with the reference of the statement (i.e. semantics) but is also concerned with the behavioural reactions of the people in the total conversation piece. It is being assumed here that communication of the kind implied by syntax and semantics normally takes place between particular people in particular circumstances. It should be noticed that Korzybski49 felt so strongly about the pragmatic importance of formal systems that he argued that the whole of mathematics should be redescribed in pragmatic terms. In other words, mathematical statements, the proof of mathematical theorems and the like, were all parts of the conversational piece between people. To some extent this is a view we would accept ourselves, and say that we should be prepared, where necessary, in pursuit of information, and to the extent to which it might be useful to some particular goal (this is the standard we set ourselves), to talk in terms of the behavioural reactions people exhibit to the statements made in a particular interbehavioural context. The progress from syntax to semantics to pragmatics is important in terms of philosophy of science. It is important because it tends to be in keeping with the contextual view of knowledge which we have so far espoused. It lays emphasis on the fact that scientific approaches, by use of axiomatic models, could often be illuminating in what are apparently philosophical problems. We shall say no more about this particular matter at this time, since in Chapter Eight we shall be devoting ourselves to a strict discussion of pragmatics and the implications of pragmatics for epistemology and other aspects of science, and particularly of course the implications for the cybernetic viewpoint. We cannot leave syntax, semantics and pragmatics without making the further distinction which cuts across all three. Each of the above can be either descriptive or pure. It is easy to see how a language (its
meaning and its syntax) can apply to actuality or be merely formal, but some argument has occurred over the existence of pure pragmatics.60 The argument asserts in effect that pragmatics can only be descriptive, because it is inevitably part of empirical science. We would not agree with this but accept the fact that formalisation of a pragmatic model is very much what is meant by pure pragmatics.
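How the three layers come apart can also be shown quite mechanically. What follows is only an illustrative sketch (in Python; the toy grammar, the little taxonomy and the particular hearer are all invented for the purpose and are not drawn from the text): a sentence of the form "An X is a Y" is tested first for syntactic well-formedness, then for semantic admissibility against the taxonomy, and finally "pragmatically" against what one particular hearer already accepts.

    # Invented toy grammar, taxonomy and hearer -- illustrative only.
    TAXONOMY = {'man': 'mammal', 'dog': 'mammal', 'mammal': 'animal',
                'fish': 'animal', 'animal': None}

    def parse(sentence):
        # Syntax: accept only the surface form "a/an X is a/an Y".
        words = sentence.lower().rstrip('.').split()
        if (len(words) == 5 and words[0] in ('a', 'an')
                and words[2] == 'is' and words[3] in ('a', 'an')):
            return words[1], words[4]
        return None

    def ancestors(term):
        while term is not None:
            yield term
            term = TAXONOMY.get(term)

    def syntactic(sentence):
        return parse(sentence) is not None

    def semantic(sentence):
        # Semantics: the predicate noun must classify the subject noun.
        parsed = parse(sentence)
        return (parsed is not None and parsed[0] in TAXONOMY
                and parsed[1] in ancestors(parsed[0]))

    def pragmatic(sentence, hearer):
        # Pragmatics: whether this particular hearer will accept the statement,
        # i.e. whether it coheres with what he already takes to be true.
        return semantic(sentence) and parse(sentence) in hearer

    hearer_beliefs = {('man', 'mammal')}
    print(syntactic('A man is a fish'))                    # True  - well formed
    print(semantic('A man is a fish'))                     # False - semantically barred
    print(pragmatic('A man is a mammal', hearer_beliefs))  # True  - accepted by this hearer

The point of the toy is only that the same sentence can pass one test and fail the next, and that the pragmatic test cannot even be stated without bringing a particular user of the language into the description.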
TURING AND CHURCH'S THEOREMS
Before we complete this chapter we should say something about the work of Turing4 (1936) and Church5 and the context of their work with respect to Turing's counter-arguments.25 It will be recalled that one of the counter-arguments to Turing's view regarding artificial intelligence, which is our main theme and asserts that artificially constructed systems can show intelligent behaviour, is the mathematical one. Now some part of this mathematical objection we have dealt with in terms of Godel's Incompleteness Theorem. We are now concerned with what turn out to be generalisations of Godel's Incompleteness Theorem in terms of Turing's and Church's theorems. It is of interest that it was Turing himself who contributed a vital line of thought to the foundations of mathematics which was used by others as a counter-argument to the fundamental theorem of cybernetics, which Turing himself supported. Let us explain this matter quite briefly. Turing and Church independently showed, and Turing used his famous Turing machine to do so (we shall be discussing this later in Chapter Ten in more detail), that the whole of mathematics as represented by an axiomatic system61,62 investigated for completeness by Godel had no decision procedure. It not only has no decision procedure, but it is impossible that such a system should have a decision procedure. This impossibility proof is more general than Godel's Incompleteness Theorem and implies that the system is necessarily incapable of showing its own completeness or consistency. The important thing about this is that we have to drop the idea of assuming that because something is computable it must necessarily have a decision procedure. This notion came about as a result of a too narrow idea of what computability meant. A decision procedure was once equated to computability, even by Turing himself, but Turing would not argue, and certainly we would not, that because
something was not computable in this narrow sense, it was not capable of being carried out by a computer. The answer to the above is that many functions can be carried out by a computer which refer to systems where no decision procedure exists. What we have to accept under these circumstances (the same exactly applies to human beings) is that mistakes might be made. We are therefore forced to go from a deterministic type (or deductive type) view of what is entailed by computer programming to an inductive or indeterministic type. In fact what has been developed is called heuristic programming, which has precisely the uncertainty made inevitable by the lack of decision procedures, but which nevertheless applies to the world around us and which is capable of being described in a formalised system.
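A very small sketch may make the point concrete (Python is used, and the particular question — whether the Collatz iteration starting from a given number ever reaches 1 — is our own illustrative choice, not the author's). No decision procedure is offered; the program simply runs the iteration for a bounded number of steps and then commits itself, so that a wrong answer is possible in exactly the sense just described.

    def reaches_one(n, budget=1000):
        # Heuristic check: iterate the Collatz rule for at most `budget` steps.
        # A "True" answer is certain; a "False" answer may only mean that we
        # gave up too early -- the kind of mistake we have agreed to accept.
        steps = 0
        while n != 1 and steps < budget:
            n = 3 * n + 1 if n % 2 else n // 2
            steps += 1
        return n == 1

    print(reaches_one(27))              # True: reached within the budget
    print(reaches_one(27, budget=10))   # False: an error forced by the cut-off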
AXIOMATIC SYSTEMS AND AUTOMATA
We should now take one step further forward in considering the point and purpose of axiomatic systems. They represent an attempt to provide precision and may also imply a degree of generalisation. To take the latter point first, science has always attempted to supply abstract theories of an ever greater level of generality. In mathematics the search for the abstract has been particularly clear in the case of geometry, logic and algebra. In physics, we have seen the attempt to integrate quantum mechanics and relativity, and proceed with each towards a more abstract theory of matter. The underlying thinking is that the more simply we can represent a description of a system, the more effective that system is; this is a variant on the theme of Occam's razor. We have seen that axiomatisation is also a process of formalisation, or of making more precise. This process of formalisation is very similar to the drawing of a blueprint for a machine, hence the main reason for this whole chapter. If you set up an axiomatic system, which you can show to be complete and consistent (hopefully also having an algorithm) then you have a theoretical machine or blueprint. There is no sense in which you have to specify the fabric of which the machine is to be constructed. These two points are easily seen to follow from a study of automata, since Turing4 himself developed the so-called Turing machine as an automaton precisely to discover whether or not mathematics had an algorithm.
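The sense in which such a blueprint says nothing about fabric can be seen in miniature below. This is only an illustrative sketch (Python again; the particular machine table is invented for the example): the whole "machine" is a table of quintuple-like entries, and the simulator that runs it could equally well be realised in valves, relays or anything else.

    def run(table, tape, state='q0', blank='_', max_steps=1000):
        # A general-purpose simulator: `table` maps (state, symbol read) to
        # (new state, symbol to write, head movement).
        cells = dict(enumerate(tape))
        head = 0
        for _ in range(max_steps):
            if state == 'halt':
                break
            symbol = cells.get(head, blank)
            state, write, move = table[(state, symbol)]
            cells[head] = write
            head += 1 if move == 'R' else -1
        return ''.join(cells[i] for i in sorted(cells)).strip(blank)

    # A tiny machine: append a single '1' to a block of '1's (unary successor).
    TABLE = {
        ('q0', '1'): ('q0', '1', 'R'),    # scan rightwards over the existing 1s
        ('q0', '_'): ('halt', '1', 'R'),  # write one further 1 and halt
    }

    print(run(TABLE, '111'))   # prints '1111'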
It can be seen from this that a process which can be axiomatised can be studied for its machinelike properties. This is not to say that an axiomatic system is necessarily machinelike in the narrow sense. Turing himself showed that mathematics (while in essence axiomatisable) is not computable in the sense of having an algorithm. This though does not mean that such systems are not computable in the broader sense of being capable of being run on a computer. To do so however requires the use of heuristic methods63 and as a result the possibility of error. We shall now summarise our views on axiomatic systems and empirical descriptions.
AXIOMATIC SYSTEMS AND FORMALISATION
We accept the fact that we can analyse communication systems such as are involved in languages (whether formal or informal) into their syntax and semantics. This distinction though is an arbitrary one, and we have seen that a particular statement may be valid or invalid on account of a rule that might be thought to be either syntactical or semantic. The issue here is largely dependent upon whether you allow (or not), or the extent to which you allow, meanings to determine the structure of allowable sentences. "An X is a Y" is an allowable syntactic structure, but there are clearly rules which do not permit every such interpretation: "A man is a fish". The rule will assert that the subject noun must have a certain specified relationship with the object noun, and such a rule could be within either syntactics or semantics; it seems most natural to think of it as the latter. Semantics is in turn embedded in pragmatics. Here the distinction is slightly more clearcut, since we can more easily distinguish behaviour (responses to stimuli, and changes of state) from the actual detail of language torn from the context of its users. The distinction between pragmatics and semantics is also arbitrary in some measure though, since language can be (normally is) composed of both stimuli and responses. Most certainly lansigns (language signs) are stimuli, but they also signify or refer to other stimuli. The difficulty here is that many stimuli (other than lansigns) also signify. Our case must be that lansigns signify in terms which are public even if only approximate in their publicly agreed signification. Meaning is essentially the process of providing interpretations to symbols and
making them into signs. They evoke responses and here we provide the overlap of semantics and pragmatics. The importance of pragmatics to us in our search for an understanding of science lies in its relation to epistemology (we shall be returning to this theme in Chapter Eight). If formalisation produces a basic core of knowledge that is the proper study of epistemologists, then pragmatics provides a formal model of a behavioural context within which the epistemologist operates. The argument is essentially that both formalisation and modelling can, like asking 'Why' of some descriptive statement, be carried on indefinitely. The importance of this for our main theme lies in the need for an appreciation of what constitutes a satisfactory explanation or explication of certain concepts, such as the concept of a "thinking machine". In particular, this chapter has completed the second half of the Turing25 counter-argument from mathematics. The Turing machine and other formal models of mathematics5,6 do not prove that machines cannot be made to think. All that is shown is that they may be subject to error, in exactly the same way as the human being is. We are led immediately to consider heuristic methods and heuristic programming. We accept this as an essential ingredient of machine intelligence, and must say something about how it operates. We will start our investigation of heuristics in the next chapter.
Chapter 6
CREATIVITY
CREATIVE ACTIVITIES
In this chapter we shall consider the problem of original thinking, creative thinking, and creativity and originality in general. To some extent this reflects the Turing counter-arguments25, in particular the so-called "argument from consciousness" and to some extent the "argument from various disabilities", which reflect many people's belief that nothing new can be done by a machine and also represent many people's belief that machines are not really aware of what they are doing; and this of course brings up the usual problem of "Other Minds", something we shall be discussing in Chapter Seven. The problem of not being able to do anything really new also relates in some measure to Lady Lovelace's objection, which was really dependent upon the idea that the computer, or any other "machine" which was manufactured by human beings, could do no more than the programmer programmed it to do. We are confronted with a considerable problem in dealing with creative activities, mainly because, and this is something which is so often forgotten by people who put the Turing counter-argument, we do not know enough of how human beings operate in these domains. Two examples immediately come to mind of types of activity which were regarded as being incapable of being carried on by machine, until Von Neumann7 showed the opposite. These were the problems of organismic reliability and of self-reproduction. Von Neumann was able to show that even a network made up of unreliable elements could be made as reliable as one pleased by the simple
process of multiplexing. This answered the objection that any computer (or automaton) which purported to have the same capabilities as the human being would in fact break down because of the unreliability of the components. His work was based upon the fact that even if neurons are relatively unreliable (and there are some doubts on this particular point) they could be made as reliable as we please in a system simply by multiplexing the system; this is really the duplication (multiplication) of information channels. The other argument Von Neumann dealt with was the argument of self-reproduction, and he showed in a mathematical argument that machines could be made to reproduce themselves in exactly the same state as the machines had themselves reached when they performed the reproduction. This entails having a model of itself inside the machine, which could be used as a blueprint for a separate machine outside itself. It is of interest to us here, in view of the importance that has been placed on self-reference, to realise that it is apparently possible to have a complete picture, as it were, of one's own system within one's own system. It should be noticed that the word 'complete' above means a sufficient description of the essentials from which a new system can be derived by interacting with its environment. This is certainly all that is required of human reproduction; any limitations it may have (the same argument as we have seen applies to Godel's Incompleteness Theorem) apply to the machine equivalent of the human in reproduction. In this chapter, however, we are not so much concerned with these sorts of arguments, which deal with how we perform creative physical functions; we are more concerned with what creative thinking is. We want to know whether new ideas are ever arrived at by human beings, and if so what they entail, and if we can discover what they entail, can "machines" be made to do the same thing? We shall first look at the work of Newell, Shaw and Simon64 in which they investigate the processes of creative thinking, where they start out by saying:

that we call problem solving creative when the problems solved are relatively new and difficult.
They argue that the processes of creative thinking, although they may seem dramatic to the external observer, are more explicable, in
what are virtually behaviouristic terms, than they appear at first sight. It has often been argued that the imagery employed in creative thinking, and its importance for the effectiveness of that thinking, involves a certain "flash of insight" that reveals a solution to a problem which has been worked on over a period of time. This involves what psychologists call set and other characteristics known to exist in the human being; there seems no reason to suppose that such characteristics are not obtainable in an artificially constructed system. Newell, Shaw and Simon put the matter this way. They argue that a theory of creative thinking should consist in:

(1) Completely operational specifications for the behaviour of mechanisms (organisms) that, with appropriate initial conditions, would in fact think creatively;
(2) A demonstration that mechanisms behaving as specified (by these programs) would exhibit the phenomena that commonly accompany creative thinking (e.g. incubation, illumination, formation and change in set, etc.);
(3) A set of statements — verbal or mathematical — about the characteristics of the class of specifications (programs) that includes the particular examples specified.
They go on to assume that creative thinking is simply a special kind of problem solving behaviour. In other words, there is no sense in which the word ‘creative’ adds anything to general problem solving behaviour, other than the fact that the problem being tackled, and hopefully, solved is relatively rare or relatively new. Newell, Shaw and Simon say: Thus, creative activity appears simply to be a special class of problem-solving activity characterised by novelty, unconventionality, persistence, and diffi¬ culty in problem formulation.
Newell, Shaw and Simon go on to develop a problem solving procedure based on heuristic programming. It is sufficient for us to notice that at least one attempt has been made to show that creative thinking involves no more than the type of behaviouristically describable activity of dealing with problem situations. The work of Newell, Shaw and Simon, and our own thoughts on the Turing counter arguments, depends in some measure on the idea of novelty. What we should ask ourselves is whether we really ever have new concepts or new hypotheses. The idea of concept formation65’66 is closely connected with the notion of a word, so that the
word ‘red’ refers to the concept of red, and the word ‘square’ refers to the concept of squareness. There is very close association between words and concepts, so close indeed that some people think of concept-analysis and linguistic-analysis as virtually the same thing. We shall be very careful to distinguish between the word and its equivalent concepts,49 since to say a thing is the same as the word which refers to it is the basic offence of “semantic analysis”. We will, however, accept the fact that hypotheses are built up out of concepts which may themselves not be elementary or unitary concepts. In other words, we can conjoin redness and squareness to make a new concept, red square, which is itself not elementary any longer, and is thus capable of being decomposed into its elements, i.e. that of redness and squareness. Indeed, of course, one could take the argu¬ ment further and prepare an explanation of the concepts of red and square in terms of physical descriptions, in terms of particles and waves which exist between the environment and observers of the environment. This is the point where we can derive some sort of scientific explanation of the principles which are regarded as basic from the point of view of logical or philosophical analysis. This need not concern us here, because we are more concerned with trying to trace originality of thought, rather than trying to explain the fundamental characteristics of the material world. We shall be returning to this other problem later in Chapter Eight, where we discuss pragmatics. The argument is that we can never have any new concept in our particular environment other than by enlarging our sensory experi¬ ence. Although it may be the case that any particular individual does not formulate or discover the full range of existing concepts, the community taken in toto must have by definition done just this. So a new sensory experience seems to be necessary to provide new con¬ cepts. However, it should be noticed here that as soon as new con¬ cepts are introduced they are explained or described in terms of existing concepts. Thus, if I try and describe some new concept X say, then I have to say “X is ...” and then supply some concepts that are already familiar to the listener. In other words, I say that “X is a red square” or “X is a blue circle”, etc. and then the meaning of X is clear to me. It is to be understood that the core of already familiar concepts can itself be enlarged by direct experience of X, and any other concept that is being introduced for the first time, hence our belief
that only new sensory experience can be the basis of new concepts. It should also be remembered though that this applies to the com¬ munity as a whole and does not apply to any particular individual who may continually be making discoveries, which have already been discovered by other people. In other words, what seems to be an argument for saying “there is nothing new under the sun” turns out to be an argument for saying that we can only understand things which are new in terms which are already familiar, although as we become familiar with those new things so they can become the basis for understanding further new things. Therefore the possibility of novelty seems only to exist for a community in terms of new sensory experience. In saying what we have said we have not attempted to define ‘sensory experience’, and we feel that this is not absolutely necessary here, although one can see that difficulties might arise over the relationship between ‘sensation’, ‘perception’, and the cognitive apparatus which seems to be necessary, according to the existing theories of behaviour, in order to understand the perceptual and learning processes. The argument seems to be that creative activities in the ordinary sense of this phrase are concerned only with novelty, which depends in turn only on our sensory experience, whether individual or collec¬ tive, and the new hypotheses which are derived from these concepts, whether old or new, or combinations of each, are themselves only novel in that they depend on new sensory experience. Thus it is that there can be no possible objection, on the face of it, to imagining or even building a machine or artificial intelligence system which is capable of going through exactly the same processes. All that is needed is the ability to have sensory experience (including new sensory experiences) and this requires that the suitable apparatus be constructed. The fact that it also must include new sensory experiences simply underlines the point that we must not build apparatus simply to sense what we know in advance will be sensed by the system. If we fell into the latter trap we should also have fallen foul of Lady Lovelace’s argument and said in effect that the machine could do no more than the programmer had made it do; this can easily be avoided by providing the machine with sensory systems that can scan various regions of “space” and therefore have the capacity to pick up data, knowledge of which in advance no one could possibly have.
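The point about compound concepts made above can be put in an almost trivially mechanical form. The following sketch (Python; the particular predicates and objects are invented for the illustration) treats elementary concepts as tests and a compound concept such as "red square" as nothing more than their conjunction, so that a "new" concept X is introduced entirely in terms of concepts already familiar.

    def conjoin(*concepts):
        # A compound concept is simply the conjunction of its components.
        return lambda thing: all(concept(thing) for concept in concepts)

    # Elementary, already familiar concepts: here, simple tests on a description.
    red    = lambda thing: thing.get('colour') == 'red'
    square = lambda thing: thing.get('shape') == 'square'

    # "X is a red square": the new concept X is explained in familiar terms.
    X = conjoin(red, square)

    print(X({'colour': 'red',  'shape': 'square'}))   # True
    print(X({'colour': 'blue', 'shape': 'square'}))   # False

What the sketch cannot do, of course, is supply itself with an elementary concept it does not already have; that, as argued above, is exactly where new sensory experience comes in.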
Let us now turn our attention to another aspect of the problem of creativity and originality, i.e. artistic creation.
AESTHETICS AND EMOTIONS

The argument here is as to whether an artificially constructed system (i.e. a 'machine' in the broad sense of the term) could manufacture a serious work of art. A great deal has already been written about computers writing poems and about computers drawing pictures and paintings and also apparently creating other works of art. On the other hand, there are many people who say, and this really is a variation on the Lady Lovelace argument, that originality or creativity of the kind implied by works of art, in whatever modality, is precisely something a machine cannot do. We have to be extremely careful here to appreciate the main point, and that is that if you program a computer merely to produce a work of art, you are falling foul of Lady Lovelace's argument, and the computer is simply producing what is implicit in the instructions of the programmer. There is no sense in which one would want to claim that the computer was acting 'spontaneously', or having any of the usual feelings and emotions that accompany the creative process in the human artist. As a result of this situation, it is clear that we have to look again at the computer and ask what is missing when it is compared with a human being. In talking this way, of course, we mean a computer suitably programmed by a highly flexible heuristic type of program. The answer is that the logical capacity and the capacity to reason, to think, to problem solve, etc., are all capable of being built into the program and operated by the computer successfully, without indeed falling foul of Lady Lovelace's argument, because of the adaptive nature of the program. There are still enough features which are essential to works of art which are missing from the ordinary computer or computer program. The main one is that of emotions. We can of course represent the emotions by some mathematical means in the systems as just another constraint that operates in cognitive activities. This, for many purposes, i.e. where emotions simply interfere with the working of the rational part of the programmed operations, might well be a reasonable simulation. However we do run into difficulties when we come to works of art, since the creative
aspects of the painter and artist are not easily reproducible by some sort of mathematical function. What we must do now is look much more closely at the role emotions play in organisms, and then return to tackle the problem in the light of what we have discovered. We have to be extremely careful in describing the emotions, partly because this is a very vast subject which has been extensively studied by experimental psychologists, and also because in talking very generally about the emotions, we are tending to do what philosophers so often do, i.e. ignoring empirical evidence in the interests of some sort of logical argument. We have already mentioned (and we will return to the subject in the next three chapters) that problems of logic and theory of knowledge are in fact illuminated by empirical science, hence we are putting forward pragmatics as a better (or at least as good a) basis for the investigation of the world than epistemology in the classical sense. We cannot as a result ignore the empirical study of emotions in this current chapter; we must however put the matter with great brevity. The emotions are undoubtedly physiologically determined, especially indeed neuro-physiologically determined. The emotions represent organic changes which have both a disorganising effect on the integrated behaviour patterns of the individual and also an integrating effect. What decides which effect they will have will depend upon the total context in which emotional behaviour occurs. The emotions themselves involve changes of blood pressure and other physiological changes which are accompanied by certain facial expressions, gestures, etc., which we particularly associate with different emotions in human beings. We have to be extremely careful to think of the facial expressions and the gestures as signs, signs indeed of a social character, which we have learnt accompany different physiological states in other people (including ourselves). We recognise fear, anger, jealousy, love and other emotions by various overt gestures and facial expressions, but we would suggest that these were the incidental accompaniments of particular internal physiological changes. In other words, the emotions facilitate a physiological state, such as the ability to run away or to fight,67 and they play an important part in the evolution of the human being; they are a sort of alarm system or a facilitating system and we have learned, as human beings living in society, to recognise the external signs of the internal facilitating or alarm system; but these external signs are in a sense
incidental to their 'evolutionary purpose'. It is interesting that the word 'emotion' and the word 'motivation' have a common derivation from the Latin word 'emovere', which means 'to stir up, to agitate, to excite, or to move'.68 In other words, the emotions play a very important part in motivating the organism as well as warning the organism, which is in any case a form of motivation. Their role is fairly basic and an organism without any emotional activities could hardly have survived the process of natural selection.69 We shall assume in talking this way, therefore, that emotions are primarily a subset of motivations, all of which occur in a social environment. They are facilitated by particular structures, in the central nervous system in particular, and seem especially concerned with the autonomic nervous system; most of all they are concerned with the sympathetic branch of the autonomic nervous system and the hypothalamus, which when electrically stimulated brings about a state of so-called 'sham rage' in a human being, i.e. all the expressions of rage appear without any of the contextual justification for that rage. This means that people show all the symptoms without having any of the accompanying feelings. This seems a clear argument in favour of the hypothalamus at least being in the circuit at some point. The hypothalamus is probably an integrating centre, since emotions are also represented in the reticular formation. The exact method by which the emotions are integrated into total behavioural patterns is not yet fully understood; however there are at least two theories of emotions, one due to James and Lange.70,71 The James-Lange theory of emotion claims that bodily changes in the organism produce the emotional experience which we normally have. The Bard-Cannon theory suggests exactly the opposite, which is that we have the emotional experience and this produces the bodily changes. It is widely felt now that neither of these two views is correct, but rather that the emotional experience and the bodily changes occur together as part of a total behavioural picture, and this is the view we ourselves would take of emotional experience. We might even say that the so-called emotional experience and the bodily changes are one and the same thing. We cannot here examine in detail the various theories, either neurophysiological or behavioural, of emotions; all we need to say at this stage is that some sort of reproduction of this system in the computer programming set up, or in our artificial system at some
stage or level, is an essential ingredient if we are to represent the aesthetic activities of human beings, and indeed a whole range of social behaviour patterns which are not normally exhibited by an artificially intelligent system. This is a reminder of course that an artificially intelligent system (if it were really to simulate the whole range of human behaviour) would have to be a social organism in any case, because clearly much of human behaviour is socially determined. This leads us full circle in our argument. We are now going to suggest that we can indeed represent the role emotions play in determining behavioural patterns, in exactly the same way as we do in problem solving. In other words, the emotional states can all be simulated by suitably chosen mathematical functions, but it does not follow that we can yet supply a suitable set of functions. It seems very likely that a great deal more analysis of emotional behaviour is necessary. An understanding of its relation to the total behavioural pattern is also necessary before we can specify exactly what functions would be appropriate. An alternative to the above representation as a simulation of emotions is the possibility of building a system which actually has the emotions. We have the very primitive beginnings of such a system in the ordinary fuse box arrangement in a computer. There is no reason why we could not, if we were pushed, reproduce a very much more complicated fuse box system which would play very much the same role in the computer's evolution, if it were made into an autonomous system, as the emotions play in a human being. It has to be admitted that this is a fairly tenuous sort of argument at this stage, and to adequately simulate the emotional range of behaviour exhibited by human beings, it may well be that our artificially intelligent system should be constructed of colloidal protoplasm. This indeed puts it very much in the future, even though some steps have been taken in this direction,73,74 and leaves us with perhaps the biggest doubt so far about our capacity to reproduce the whole of man's behavioural activity. It should be noticed quite clearly here that we have discovered no barrier to the modelling of rational behaviour in human beings. The barrier, which we fully admit is a difficult although not an impossible one to get by, concerns the modelling of the total range of human behaviour.
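By way of illustration only, the kind of representation intended might look like the sketch below (Python; the update rule, the numbers and the two available actions are all invented for the example and make no claim to psychological accuracy). A single "emotional" variable is driven by events and then enters the choice of action as just another constraint, playing the alarm-or-facilitating role described above.

    # Invented numbers and rules -- an illustration, not a psychological model.
    def update_fear(fear, threat, decay=0.5, gain=1.0):
        # The "emotion" is a simple function of the history of threatening events.
        return decay * fear + gain * threat

    def choose_action(expected_gain, fear, caution=2.0):
        # The emotion enters the choice as just another constraint: as fear
        # rises, the same alternatives are weighed differently.
        return 'explore' if expected_gain > caution * fear else 'withdraw'

    fear = 0.0
    for threat in [0.0, 0.0, 1.0, 1.0, 0.0]:
        fear = update_fear(fear, threat)
        print(round(fear, 2), choose_action(expected_gain=1.0, fear=fear))

Whether such a function, however refined, would ever amount to having the emotions rather than merely simulating them is, of course, exactly the question left open above.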
HEURISTIC PROGRAMMING
This is a convenient moment to discuss briefly heuristic programming.63 A great deal of work has been done on heuristic methods generally and the application of heuristic methods to computer programming in particular. We shall not attempt to try to reproduce the full breadth of that work in this section, but merely draw attention to its significance as related to creativity on one hand and its relation to formalisation and the narrow notion of 'machine-like' on the other hand. It is also related in some measure to our later discussions of epistemology and pragmatics, in its relation to beliefs or hypotheses and what constitutes our knowledge. The main theme of Godel, and subsequently of Turing and Church, was to the effect that 'machine-like' meant 'computable'. Turing, at least, realised that although 'machine-like' could mean computable in the narrow sense of machine-like, this was no barrier to the possible construction of an artificially intelligent system that used methods quite capable of being put onto a computer. Such methods were not necessarily precise, but they were nevertheless 'machine-like' in the wider sense of still being computable. They were not precise in the sense that they did not necessarily have a decision procedure. In other words, provided we were prepared to accept errors, as in the case of human beings who themselves perform similar types of activity, it was perfectly possible to consider intelligent activity on a computable basis. This latter sense of the word 'computable' was very much more general than the narrow sense originally envisaged in discussions of Turing machines. Heuristic programs are programs for the computer which are meant to provide evidence to support beliefs or hypotheses. If we use the word 'beliefs' here it will be merely to draw attention to the same use of the theoretical term 'belief' that we use later in the discussion of pragmatics (Chapter Eight) and indeed also in our formalisation of our pragmatic systems on automata (Chapter Ten). Regardless of the particular terminology used, the heuristic programs are completely precise at the machine code level, but subserve a model at a higher level in one of the meta-languages which is itself probabilistic. The main point about heuristic programs can be brought out very clearly by simply drawing attention to the fact that they are rather like inductive generalizations. The assertion made (a hypothesis
formulated or belief held) that black clouds are a sign of rain, for example, is something from which empirical evidence can be derived. The derivation of the empirical evidence, the counting of the number of constant conjunctions between black clouds and rain, is something which is clearly algorithmic. The generalisation and the degree of factual support and confirmation that it gets as a result of this empirical evidence is itself probabilistic, i.e. heuristic. We rather tend to think of 'heuristic' as meaning successive approximations or approximate optimization, but it certainly could refer to any probabilistic modelling (i.e. where uncertainty is involved) even though the actual accumulation of evidence to subserve the higher level model which embodies the heuristic is itself completely algorithmic. It should be noted here that the probabilities involved as a result of the use of heuristic programming are the same as the probabilities (indeterminacy) involved in Quantum Theory; they are the direct result of our ignorance of all the facts over which we (necessarily) formulate hypotheses. If we could provide a complete set of hypotheses for any particular well-defined type of activity, then we should have achieved an algorithm. We should also note that in many situations the type of activity may not be well-defined, and this itself is the result of ignorance of facts. At least we shall argue, as in the case of indeterminacy, that there is no question of claiming that there is some specific and inevitable limit to what we can know in principle. Having said as much we must now deal with one important point. If, as we have said, a "complete" set of hypotheses is effectively an algorithm, and since we know that there are many unsolvable problems (e.g. in mathematics30) where by 'unsolvable' we mean that there is no decision procedure, there must be some limit on our knowledge which is a limit in principle. In fact, we know there is no decision procedure for a whole set of problems, usually because they involve either an infinity of possible solutions or they are, in effect,
in this argument reminded ourselves again of the essentially same limitations that were observed to apply to axiomatic systems by Godel, Turing and Church. Let us now return to the basic point of heuristic programming. The development of heuristic programming is a key point in support of the argument which Turing himself set down under the heading ‘learning machines’. We would now take the argument beyond the point of learning machines, although they clearly play a very important part in the argument, since they point out that a computer program is not wholly committed a priori to specific results. The commitment is only to collecting information which may modify the basis of the program as a function of that informa¬ tion. In other words, the computer program can change its state as a function of its experience. Learning programs perform a very important aspect of what we are looking for, but it is heuristic programming in particular, in this context, which provides the rest of what is needed in adaptivity and flexibility and which takes the computer program beyond the level of the purely algorithmic. In this particular chapter on creativity, heuristic programming is appropriately mentioned because quite clearly if we were going to develop any sort of creative activities (whatever we decide this phrase means) in the computer context, then it must depend very much on the development of heuristic methods. It is certainly heuristics which (if anything) are the creative aspect in human beings. If you play chess for example, you cannot play chess algorithmically, you must play heuristically, and to play heuristically means much the same as what we normally mean by to play creatively.
IN SUMMARY
We have argued in this chapter that words like ‘creative’ do not add anything significant to other phrases like ‘problem-solving’, ‘thinking’, ‘learning’, etc. We do accept however that the artistic ability, which seems to be closely related to the emotions in human beings, is an obvious omission from existing computers. If computers (or any artificially constructed system) were to try to simulate the whole of human behaviour, some equivalent of the emotional system would be required.
We can think of science as being an explicit art and the arts as being implicit science; the difference is one of degree (of explicitness) and in no sense an absolute distinction. Finally, we should want to argue that the artistic and the emotional problems, while implying some considerable complexities, provide no barrier to our main theme of the possibility of artificially intelligent systems.
Chapter 7
CONSCIOUSNESS AND FREE WILL
Up to now in this book we have been primarily concerned with two types of counter-argument to the original Turing argument25 in favour of the possibility of artificially intelligent systems and their construction in the laboratory. Some of the arguments were quite explicit, as in the case of the arguments from mathematics, while some of the other arguments which we have been concerned with, such as those over indeterminacy and formalisation, have been to some extent potential counter-arguments rather than actual ones. These potential counter-arguments have sometimes cut across some of the counter-arguments mentioned originally by Turing. In this chapter we come to consider what is more of a potential counter-argument than an actual one in the form of the question as to whether it would be possible for machines to have consciousness and also, and this is of interest in its own right, whether free will is something (whatever it is) that human beings have, and if they have it, is it unique to them or could machines have it too? First of all though we will look at the question of consciousness. Some considerable thought has been given to the question of consciousness in human beings and how it arises, and a great many recent theories have tried to relate problems of consciousness to the functioning of the reticular formation in the central nervous system. We shall not concern ourselves with any of the empirical evidence in this particular case, but merely discuss the possible ways in which consciousness may arise, what it entails for the human being, and of course whether machines could have the same capacity. It will be appreciated straight away that this raises the problem we have mentioned once or twice in the text: the problem of "other minds".
This very simply arises because we cannot be sure that anyone but ourselves is consciously aware of what he is doing and whether in this sense he has a "mind". We only assume from the outward show of behaviour and the apparent similarity to our own behaviour that other organisms (particularly this is true of humans, although there is always some argument about animals) have the same conscious awareness as we ourselves do. By 'consciousness' we are going to mean the ability to have self-reference. Immediately the use of the word 'self-reference' will alert the reader to the relationship between the notions of consciousness and many of the notions relating the observer to the observed that we have been discussing in this text. Self-reference is the key point, so that when we use the word 'I' we have clearly got to the point where we can make self-reference. We need not ask too carefully what we mean by the word 'I', since no doubt on different occasions different people mean different things by it. There is an obvious sense in which we would mean our body (the organism as a whole), and if we did not wish to make the primarily verbal distinction between body and mind, this would be perfectly satisfactory. We might on other occasions refer to our 'minds' because we may be talking about that "conscious me" which is that part of the bodily system which is self-aware. It is quite clear that, in saying what we have said, not all aspects of the bodily system are self-aware, although this is something that can be discovered by scientific experiment, and is not always immediately obvious by our own conscious introspective analysis. It would seem reasonable to suppose that consciousness arose in the ordinary course of behavioural transactions with the environment. We can talk about other people and events external to ourselves, and then come to realise — we can after all see parts of our own body which we recognise as part of a unified whole — that we are an intricate yet single entity, an individual in some sense. So as soon as we use the word 'I' we have come to some point of self-awareness. Over and beyond this we can be aware of many other things that we do and are conscious of doing. This means that we can analyse the material which we are handling, in the situation of a solver, learner or thinker, which allows us to see the means-ends-result activities which we normally regard as cognitive. We have said from the start that we shall not regard the processes of thinking, learning, problem solving and the other cognitive processes as requiring
conscious awareness. To this extent we are taking a behaviouristic view of our subject, and describing or attempting to describe processes which we can infer go on in organisms, irrespective of whether they are aware of them or not. No doubt the fact of actually being aware of these cognitive processes in some measure changes them, and this may well have a direct bearing on the problems of "free will" which we shall be discussing shortly. One attempt at least has been made75 to try to simulate conscious behaviour by finite automata. Culbertson used neural net automata and tried to describe the hierarchical organisation of such neural nets as were necessary to simulate a conscious cognitive system. We shall not discuss the degree of success or otherwise he had in this attempt, but merely note that such an attempt has been made, and add that we can see no reason why consciousness (no doubt some sort of self-stimulating feedback mechanism eventually operating within the organism) should not be as amenable to modelling by automata, or indeed any other form of precise modelling procedures, as the ordinary overt behaviour of organisms. The main point to make at this juncture is that there is not necessarily any mystery about consciousness, and we should wish to regard it as something having continuity with our ordinary externally observable behaviour, and something that inevitably evolved from that ordinary behaviour, perhaps particularly associated with the development of language. As soon as one gets to analyse the notion of reference one can see the possibility of making the step to becoming self-aware. It should not necessarily be argued here that organisms which do not have linguistic capacities are not self-aware, and indeed it may well be that self-awareness is a matter of degree. All that is really being asserted is that language may play a great part in emphasising self-awareness and it is self-awareness, and through language self-reference, that is the main content of consciousness. The important point however is that no special mystery is involved, and we have no reason to doubt that we could make an artificially intelligent system which has the same self-stimulating organisation which would result in the same self-awareness; therefore we can conceive of machines which would have the same property of consciousness as humans have. It could perhaps simply be added that as in the case of motivation, particularly emotional types of motivation, leading to artistic and other types of rather specialised abilities, the same sort of
It could perhaps simply be added that, as in the case of motivation (particularly emotional types of motivation, leading to artistic and other rather specialised abilities), the same sort of consciousness may not be available in machines as in human beings without the use of the same material, or fabric, but we would not want to make a major issue of this particular point.
One of the most important points perhaps about consciousness (rather like imagination) is that it is inconceivable that anyone could behave intelligently without it. Any system which, we might reasonably guess, had the same degree of complexity as a human being could not conceivably be non-self-conscious. To try to imagine a “normal” person who is totally unaware of himself is absurd to a degree, and this still allows a wide distribution to the notion of normality. Consciousness, and therefore self-consciousness, is to be thought of as a property of all human beings and can also be thought of as a property of all automata (machines) which have the capacity to use language and think at the same level as human beings do. It must be made absolutely clear, though, that in saying this we mean the machine to be able to converse and think in the “spontaneous” manner of which humans are capable and not merely to be a highly sophisticated question-answering machine.
After this very brief discussion of consciousness, and the emphasis put on the possibility of consciousness being included in any sort of behaviouristic or pragmatic analysis of organismic or machine behaviour, we would like to pass over to the problem of free will.
FREE WILL
The problem of free will has dogged philosophers for a very long time, and during the last fifty years or so the developments of Quantum Theory and the new physics have led many writers to believe that the problems of free will have been made clear as a result of the indeterminacy in physics. Let us first say about this that, as far as our own attitude is concerned, the indeterminacy of physics has proved nothing which seems in any degree relevant to the problem of free will; this is in spite of the arguments put forward at various times by Eddington44 and Jeans,45 which seem to argue that the essential indeterminacies of physics left a state which allowed for the volition of individuals. This is an argument that has been taken up by MacKay40,76 and developed in some detail, where he asserts that in talking about the relativity of observers (we discussed this matter briefly in Chapter Four) their relative standpoints made it impossible
for the observer (x) to make the same prediction about the observed person, or agent (y), since they are seeing the matter from essentially different standpoints, and furthermore the making of the prediction changes the state of the brain.
It is interesting that Lucas,21 who himself shares MacKay's opposition to determinism, says the following about MacKay's argument:

Professor MacKay, in a powerful and intriguing article, has argued, however, that in a sense the spectator and the agent cannot communicate, because if they did, it would constitute an alteration in the situation which would make the spectator's prediction no longer applicable ... MacKay's argument raises what will be the decisive issue against determinism, but is not itself conclusive.*

The point is, as we have said previously in Chapter Four, that MacKay's argument about logical determinism is in reality an argument about relativism. Since we will not accept the argument in the first place, the attendant argument that therefore people are responsible for their actions, have free will, etc. is not accepted either; at least these conclusions do not follow from MacKay's argument from logical indeterminacy.
Lucas argues that determinism defeats responsibility. He says:

There are two reasons why we feel that determinism defeats responsibility. It dissolves the agent's ownership of his actions: and it precludes their being really explicable in terms of their rationale. If determinism is true, then my actions are no longer really my actions, and they no longer can be regarded as having been done for reasons rather than for causes.

We would not agree with Lucas, because “my actions” remain my actions regardless of whether they are caused or not. Furthermore, to argue that causes and reasons are different seems to be quite wrong. One of the main sets of causes of my actions is precisely the reasons I adduce for acting so. There is no reason, though, to say that reasons are not themselves causally determined by my experience.
Against the views expressed by Eddington, Jeans, Lucas and MacKay we have the view expressed by Bazzoni,77 where he says the following:

It should be remarked that the words ‘determinate’ and ‘indeterminism’ used in this connection have a mathematical connotation and do not have at all the same meaning as when used in philosophical discussions such as those for example bearing on ‘free will and determinism’. Failure to recognise this difference in meaning has led to entirely unjustified applications of a philosophical doctrine of free will to atomic processes. The principle of indeterminism is better called the uncertainty principle.

*The italics are the present author's.
We have expressed our own views on indeterminism and physics already, and can now follow them up with a brief discussion of free will which is not wholly identical with the type of argument used by Feigl,78 Grunbaum79 and Russell.40,41 Feigl regards the confusion over free will as a part of the confusion over the traditional mind-body problem. Grunbaum analyses the problem of determinism and free will with respect to four different arguments which are directed against determinism:

1. The uniqueness of the individual. This is easily disposed of as an argument, since the fact of being unique (as every item of anything is) has no bearing on whether or not it is causally determined.
2. Causality is too complex to discover in practice. This is even more obviously irrelevant to whether or not causality exists.
3. Human behaviour is purposive (goal-seeking) and therefore the past does not wholly determine the present. This argument is also irrelevant because the fact that behaviour is goal-seeking is not an argument against its being causally determined; the goals are also causes (cf. Lucas above).
4. If behaviour is causal, there is no possibility of choice between good and evil. This argument hinges on the word ‘choice’, and determinists would not argue against the existence of choice; they would only insist that choices are themselves determined.

Russell's argument is in a somewhat similar vein to Grunbaum's, as can be illustrated by one quotation:

... This sense of freedom, however, is only a sense that we can choose which we please of a number of alternatives: it does not show us that there is no causal connection between what we please to choose and our precise history.
This last view is essentially the same as our own, and we will now discuss that view. Our own view is simply that, although there is a clear sense in which we can think of all our activities as being determined in principle, in practice it is very difficult to imagine that all processes have an effect on all other processes. In other words, certain types of events can be unconnected with other types of events, and it is not necessary to see the totality of events as wholly interconnected. We would also want to argue that causality is not a unitary process but more often than not a multiple process, involving an almost inexplicable interaction of causes and effects of the most complicated kind. The most important aspect of this complex causal relationship,
however, is the fact that stimuli (events) which occur up to the last instant prior to a decision being taken can influence the nature of that decision. And some of these external stimuli, which may come into the organism and modify the organism's state up to the last instant before a decision, cannot necessarily be predicted in advance. There is, as a result, an obvious sense in which complete prediction (as opposed to determinism in principle) is impossible. Nobody has ever argued seriously that in practical terms we can make complete predictions, but we are arguing that, even in principle, prediction at any given time ahead of events is not necessarily possible, because it cannot be known exactly what causal stimuli will impinge upon an organism which is a powerful selective processor.
The fact that the organism is a selective processor, and is exposed to stimuli which might modify its state and therefore its decision in matters of choice up until the last instant before a choice is made, is not necessarily an argument in favour of free will. If, though, by ‘free will’ we simply mean the feeling that we have that we could choose any one of a number of alternatives, and are not in a position where we are inexorably compelled from time immemorial to choose one and one only, then of course free will is a perfectly ordinary and understandable part of human behaviour. What we are asserting is that ‘free will’ has a perfectly good meaning, although it is not necessarily the meaning that such people as Eddington, Jeans, Lucas or MacKay would want to give to it. It does not mean that we can do absolutely anything we choose within a certain total range of possibilities which is completely unpredictable. On the contrary, what it means is that most people could predict very well what most of us would do in most choice situations, and granted that no further (unexpected) stimulation occurs after the prediction has been made, then the prediction could be absolute. To this extent we are constrained by our environment and our experience, and this is inevitable. We would not wish to argue that responsibility for our actions is altogether avoided because we cannot choose in a wholly free manner from every one of a number of possibilities; whether we always follow rules and precepts which the community in which we live might regard as moral is a totally different question. We shall not pursue this argument further here except to say, and this is all that is germane to this particular book, that whatever it is that occurs in the choice among a number of alternatives in human
beings (and we do not really want to make a major issue, primarily a semantic issue, over the term ‘free will’), we can assert that there is no reason for doubting that machines, as in the case of consciousness, will have precisely the same set of problems and can solve them in the same way as we do. In other words we are asserting, without providing any great deal of evidence, that there is no reason to doubt that artificially intelligent systems (our main theme) can be constructed. To put it more carefully, if they cannot be constructed, the arguments about consciousness or free will would not seem to be arguments that prove that they cannot be constructed. In the light of this, taking a non-constructive view of proof, we shall continue to assume that they are constructible.
Chapter 8
PRAGMATICS
We must now take up once more a philosophical stance, though not so much in a traditional sense as from the point of view of a behavioural analysis of epistemological questions.
In talking of knowledge and how we know, and what we think we know, we should conclude that we cannot obtain certainty regarding knowledge (empirical facts), if we are talking of what is public or communicable knowledge: this indeed is perhaps all that qualifies for knowledge as opposed to conjecture. We can achieve a kind of certainty only about our own feelings and impressions, and these are, to the extent that they are certain, private and uncommunicable, or at least difficult to communicate.
The observer is certainly imprisoned to some extent in (certainly limited by) his own private world and can only derive so much from his contact with reality. In talking this way we are maintaining a realistic ontology, even though we may still accept that what we are aware of, or capable in principle of being aware of, is the sufficient basis for much of science and scientific discourse. We should also bear in mind that science is thought by us to be mainly concerned with trying to answer specific questions about specific issues in a specific context, and not merely with being a seeker after (universal) truth. We are therefore asserting that the epistemologist should consider what the scientist is asserting in so far as it may help his own case. We also believe that a scientific approach to our epistemology, based on assumptions and known to be dubitable, could throw light on attempts to explicate questions such as those concerned with meaning and truth.
Pragmatics58 is the study of the inter-behavioural responses within
which discourse occurs. Semantics is a special case of pragmatics concerned with meaning, and syntax is a special case of semantics concerned with formation rules, or what is acceptable in grammatical terms. These three disciplines are closely inter-related, and pragmatics is in one aspect a study of psychology and in another aspect a study of cybernetics. What primarily distinguishes these two aspects is whether the approach is simulatory or synthetic.
Pragmatics (as well as syntax and semantics) can also occur in either of two modalities, pure or descriptive.59,60 There is little doubt that syntax and semantics can be thought of as either pure or descriptive (‘applied’ is a synonym for ‘descriptive’), since we have a normative approach to each and we can study each as a science. How do people actually use words and what do they mean by them? Pragmatics has been a source of some discussion in this context, since it has been suggested that it can never be other than descriptive. We feel that anything that can be described can be formalised, and that anything that can be formalised can also be the subject of a description or an interpretation; hence the existence of pure pragmatics.
The work of Ogden and Richards80 and Korzybski,49 although carried out under the name of semantics, really offers, in the Carnap-Morris sense of the term, a theory of meaning which is rightly placed in the field of pragmatics. The Ogden and Richards view is closely associated with conditional response theory and first saw the light of day in the early 1930s. They were setting out in effect a theory of signs, or perhaps in their case more accurately a theory of symbols. We can at this stage distinguish a symbol from a sign by saying that a symbol is something like x when used (quite arbitrarily) in an algebraic equation, or a letter of the alphabet say, and therefore can be made to signify anything we like, whereas a sign is something which specifically signifies something; natural signs are things like smoke, which signifies fire, and language signs are words (in themselves symbols) or sentences which have specific meaning, however difficult it is to define what that meaning is.
In The Meaning of Meaning, Ogden and Richards attempted to make a distinction between what can and what cannot be intelligently talked about. They recognised, as many writers had before them, that language is in some ways a barrier to understanding, and they tried to supply a theory of meaning which was in effect a scientific theory. Ogden and Richards were concerned primarily with what they called interpretation; acts of interpretation are involved
whenever you read a book or hear words spoken, and the process of interpretation is a sort of decoding process where the statements heard are translated into concepts or ideas: it can be thought of as an imaginative process, where we envisage some scene, with or without accompanying images. ‘Basic interpretation’ is a term used by Ogden and Richards to refer to interpretations that cannot be broken down any further. They are the equivalent, in the world of meaning, of the smallest particles of physics, or of morphemes or phonemes.
Let us look now at an example of how Ogden and Richards saw the process of meaning take shape. Consider a man who strikes a match: the striker has an expectation of seeing a flame on striking the match; the interpreter, who of course may be the same person as the striker, but is in any case an observer in his role of interpreter, watches expectantly and thinks of, or certainly may think of, the flame emanating from the match. The actual striking of the match is a referent, the conceptual process of thinking about it or expecting something to happen is the reference (the having of the concept), and an adequate reference refers to a referent which actually happens and which can be symbolised by a word. Let us take another example; a referent can be a cat, say, and the thought of the cat is the reference, and the symbol which refers to the referent is the word ‘cat’. As far as they went, Ogden and Richards gave a fairly convincing theory of meaning, but it seems that they did not go far enough. They certainly did not go far enough to include the subtleties of meaning encountered in the more sophisticated uses of language, nor did they go into any detail on the non-verbal behaviour of the language users.
Korzybski, following on in some measure where Ogden and Richards left off, emphasised the hierarchical nature of language and drew attention to the fact that the physical world was distinguishable from the conceptual world, which again was distinguishable from the levels of language which described both the conceptual and physical worlds. There is of course an obvious sense in which the conceptual world is part of the physical world, as indeed are the words used to describe either. He then went on to distinguish the immediate descriptive level of terms which referred directly to physical events or relationships, and then words which referred to words, and words that referred to words that referred to words, and so on. Korzybski pointed out that terms are many-meaninged
(‘multi-ordinal’ is the word he used for this) and that they mean different things on different levels of abstraction and in different contexts. He regarded the process of abstracting as fundamental to the use of language and the cause of a great deal of misunderstanding about the nature of language. He contrasted languages with maps, and held a view, somewhat similar to Wittgenstein15 in the Tractatus, that language is a pictorial thing. In other words, a linguistic description is a sort of verbal picture of the environment. This leads Korzybski to say that all our knowledge is only knowledge of structure. All we can know are structural relationships which exist in the empirical world.
Korzybski makes the point that modelling, of which language is a particular kind, is self-referential; it refers to itself just as words refer to themselves. But no model can ever be as complete as the thing being modelled, since if it were it would be the original rather than the model. Korzybski draws the obvious conclusion from this that we must be extremely careful about our use of language, and we can easily agree about this. In particular we have to be careful to distinguish between the words and the things the words represent. Most logicians regard this as self-evident and hardly worth saying. However, the fact remains, and here Korzybski was absolutely correct,81 that in spite of the fact that we recognise explicitly that things or events and the words used to describe them are different, we nevertheless often fall into the trap of forgetting that statements and the things that statements are about are, in fact, different. It is often assumed that the use of a word automatically implies the existence of some equivalent, and this is clearly not necessarily so. This is usually an unconscious assumption, but it nevertheless often occurs.
We now come to the modern theory of signs, which owes its origin to the work of C.S. Peirce89 and Charles Morris.58 Both had a common purpose in supplying a scientific foundation for a theory of knowledge, and both felt it could be done through the behavioural background specifically associated with language signs. The main difference between the work of Ogden and Richards and Korzybski on the one hand and the work of Peirce and Morris on the other is that the latter is rather more general in being concerned with other sorts of signs besides language signs, and it is for this reason that the latter have called their work ‘pragmatics’ and the former have called their work ‘semantics’. However, the former's work on semantics was so closely concerned with the actual utilisers of the signs that it is (as we have agreed)
probably better described as pragmatics. We should perhaps add that we are talking about pragmatics in this chapter in a descriptive sense, although it would also be perfectly fair to formalise it, and at that point it would presumably become pure pragmatics. This formalisation indeed we shall subsequently in part carry out.
‘Semiotic’ is the name that was originally chosen by C.S. Peirce, as well as by John Locke and the Stoics, to categorise this particular scientific study of signs. Sometimes it was referred to as ‘Semiosis’ (this was Peirce's usual choice in fact) and sometimes, more recently, particularly with respect to the work of Morris, it has become known as ‘Behavioural Semiotic’. We shall use pragmatics to cover all these different versions of what is really a theory of sign behaviour. There are still various possible approaches, some more philosophical and some more scientific, in so far as we can make such a distinction.
Peirce, in christening the subject ‘semiosis’, thought of it as the study of signs, sign processes, sign-mediation and other context relationships existing between people, particularly with respect to their trade in signs and symbols both with each other and with the environment. A mental process, for Peirce, was a sign process. The ‘representamen’, as he called it, is something that is a mental event, and it is also the representation of a thing or object for which it stands. It is exactly what Ogden and Richards called a thought or reference, and what we have tended to call a concept or conceptual process. Signs are representamens of human minds, used by them and manipulated by them in representing through language the world around themselves. This is indeed a sort of model building process. Morris defines a sign as follows:

If anything A is a preparatory stimulus which in the absence of stimulus-objects initiating response-sequences of a certain behaviour-family causes a disposition in some organism to respond under certain conditions by response-sequences of this behaviour family, then A is a sign.
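To bring out the force of this definition, the following is a minimal sketch of a sign as a preparatory stimulus. It is our own illustration rather than Morris's formalism, and the stimulus names and release conditions are invented: encountering the sign does not itself trigger the response-sequence, it merely stores a disposition which is released only when the appropriate conditions later obtain.

# Illustrative sketch of Morris's definition of a sign: a preparatory
# stimulus that, in the absence of the stimulus-object itself, leaves the
# organism with a disposition to respond under certain later conditions.

class SignUser:
    def __init__(self):
        self.dispositions = []            # stored, not yet acted upon

    def encounter_sign(self, sign, behaviour_family, release_condition):
        # The sign does not elicit the behaviour now; it only creates
        # a disposition to respond when the release condition holds.
        self.dispositions.append((sign, behaviour_family, release_condition))

    def conditions_change(self, current_conditions):
        responses = []
        for sign, behaviour_family, release_condition in self.dispositions:
            if release_condition(current_conditions):
                responses.append(behaviour_family)   # response-sequence now emitted
        return responses


organism = SignUser()
# A "dinner bell" stands for food although no food is present yet.
organism.encounter_sign("dinner bell", "approach and eat",
                        lambda conditions: conditions.get("at table", False))
print(organism.conditions_change({"at table": False}))   # [] - disposition only
print(organism.conditions_change({"at table": True}))    # ['approach and eat']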
This definition has the obvious merit of distinguishing a sign from the referent or denotatum, or whatever it is that the sign refers to. The referent is a goal object whereas a sign itself is not; also it is clear that signs can be understood without an immediate overt behavioural response to that sign. This is the whole point of the internal mediation of the organism under the heading ‘disposition to respond’. In other words, we realise that not all behavioural changes will be immediately apparent in performance, which is something we knew
from the original discussions between Hull51 and Tolman83 leading to the revision of Hull's earlier theories.
So one important step forward has been taken. We now have to build a theory upon this starting point, and this we shall do in the next chapter. Another point that needs to be made is that when we talk about lexical signs (signs produced by the lexicographers), as we might call them, we are talking of dictionary definitions, and we see how difficult these are to think of other than as empirical. They also have something of the conventional about them. We are saying, in effect, that words have meanings by virtue of conventions; these conventions may or indeed may not represent what is empirically used by the majority of people, and this is a matter that can only be determined by empirical investigation.84 The normative aspect of convention is the set of syntactical constraints which we ourselves impose upon language, partly as a result of discovering how to use language.
Let us consider again the distinction between sign and symbol. Black clouds are a sign of rain and ‘rain’ is a symbol for rain, and certainly it is true that in describing rain it is only a sign for me by virtue of already being a symbol. But this need not make the sign theory of meaning circular, as some85 have argued, since we would argue in turn that symbols are signs and signify (albeit by convention). Price's79 criticism of the sign theory of symbolisation was in fact a criticism of the Ogden and Richards version as presented in The Meaning of Meaning, and his criticism was really of its one-sided nature. Whatever its merits in describing other people's sign behaviour, he felt it was not sufficient for the sign-user himself. There are two points to be made here. One is that the notion of Ogden and Richards could be regarded as one-sided in just this way, but certainly they did not intend it to be so, since the expectancy elicited by a sign, say, expecting a flame when a match is struck, was as much in the striker as in the other observer(s). When we think of language signs, however, we see Price's point. It is that when I utter a statement S1 such as “the horse is in the garden”, this does not cause a ‘disposition to respond’ in me. The counter-argument would be that this is because S1 was now not a sign for me, since I had already acquired the ‘disposition to respond’ as a result of previous knowledge, whether by acquaintance or description. In using the phrases “learning by acquaintance” and “learning by description”, we shall mean no more than “direct experience of” and “sign experience of”, and do not wish to take on the surplus
meaning and the considerable overtones of Russell's usage.
All, of course, that was being attempted by Morris and others was to distinguish a sign stimulus from a non-sign stimulus, and a stimulus that elicits an immediate response from one that elicits only a delayed response, or no response at all. The effect of language therefore is to draw attention to something other than itself (although, of course, it can also be self-referential), and to do it by also being meaningful.
Such considerations lead us directly into the traditional problems of truth and meaning, and we are not in this book attempting any sort of analysis in depth of these terms, but a few words are appropriate. Truth is closely related to meaning, yet clearly distinct from it. We wish to say that truth can be thought of, in ontological terms, as whatever exists. The facts of the world are truth. But ‘truth’ can also be thought of as a ‘property’ of propositions. Any meaningful proposition uttered as a statement in any language has the property of truth or falsehood, but we see that this is only true of meaningful statements. Clearly a meaningless statement cannot be said to be true or false, since we do not know what it is asserting. This already gives us a lead on meaning. We know what we mean by a meaningless statement, since it is one that conveys no information to us. But even here we hesitate: may it not be the case that the statement is meaningful to someone else but not to me, simply because I am unable to understand it? My belief that it is meaningless does not make it so; it merely makes it meaningless to me.
We must avoid embarking on too long a pilgrimage at this point, but we should notice that Morris' definition of ‘sign’ certainly makes its ‘meaning’ only for some organism. One way to deal with this difficulty is to use formalised languages such as Logistic Systems (e.g. the Propositional or Predicate Calculus), wherein the formation rules determine what is meaningful without argument. This, though, leaves the unformalised world of natural language little better off, but it makes clear that we have two obvious alternatives — one is to attempt to discover formation rules (if only on an ad hoc basis and of a partially syntactical kind), the other is to say that meaning is indeed for the individual concerned. A statement is thus meaningless to X if he does not understand it.
This takes us back to the nature of understanding. We ask ourselves what it means to understand a statement. The answer is complicated because of the wide variety of possible statement types. If the statement is of the form:
“Pass me the packet of biscuits.” (1)
I recognise a request or command for an action. If the statement is of the form:
“You must agree with me that statement P is analytic” (2)
I recognise a persuasive statement asking me to accept a ruling about a linguistic matter. So it is that I can recognise many different intentions on the part of the utterer, which shows that he is trying to change my behavioural state — make me do something or persuade me of something. This already reminds one of the Gricean86 approach to meaning, which we shall discuss briefly, but not yet. I also know the reference of the key words like ‘pass’, ‘biscuits’, ‘analytic’, and without a knowledge of their reference I could not understand them. I would not on this account assert them to be meaningless, merely that I do not understand them. This recognises the fact that a statement's meaning is somehow vested in the statement itself (rather than my understanding of it) and, more broadly, in the speech act, i.e. the total behavioural context in which the statement is uttered.
We must now go back to the start of this excursion and note that ‘truth’ cannot apply to all meaningful statements, because (1) and (2) are not true or false in the way that the following statement is:
“The horse is in the garden.” (3)
This is because (3) asserts something about the state of the world which I can test for myself (at least in principle). The test elicits the truth of (3) or otherwise; or, to be more guarded, one might say it supplies evidence for, or confirms, the truth of (3), and leaves us to admit that (3) could still be false as a result of faulty observation. Such dubitability is characteristic of all empirical statements. Truth looks like the correspondence between a statement and a fact, and a belief held by any person X will be true or false (correct or incorrect) according to whether or not what is believed corresponds with the fact or with the statement. This is to discover the truth or otherwise of both the fact and the belief. To know the conditions which would make a statement true is more or less the same as to know the meaning of the statement.
We say “more or less” because, as we have already appreciated, meaning also depends on recognising the intention of the utterer even where truth conditions are not relevant. Furthermore, meaning depends upon certain conventions about usage which are not always apparent from knowing the reference alone.
This suggests we say a brief word on what I take to be the pragmatic theory of meaning proposed by Grice. He accepts the distinction between natural and non-natural meaning (cf. Morris' signs and lansigns, which use symbols) and says:

“If U does x thereby meaning that P he does it intending: (1) that some audience A should come to believe P; (2) that A should be aware of intention (1), and (3) that the awareness mentioned in (2) should be part of A's reason for believing P.”
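As a rough illustration only (our own, and certainly not Grice's formalism), the three conditions can be written down as a simple check on a description of an utterance; the field names are invented for the example.

# Rough sketch of Grice's three conditions on non-natural meaning.
# An "utterance" here is just a record of who intends what; the field
# names are invented for the illustration.

def means_that_p(utterance):
    # (1) U intends the audience A to come to believe P
    # (2) U intends A to be aware of intention (1)
    # (3) U intends that awareness to be part of A's reason for believing P
    return (utterance["intends_belief"]
            and utterance["intends_awareness_of_intention"]
            and utterance["awareness_is_reason"])


telling = {"intends_belief": True,
           "intends_awareness_of_intention": True,
           "awareness_is_reason": True}
# Leaving evidence about P lying around satisfies (1) but not (2) or (3).
planting_evidence = {"intends_belief": True,
                     "intends_awareness_of_intention": False,
                     "awareness_is_reason": False}

print(means_that_p(telling))            # True
print(means_that_p(planting_evidence))  # False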
These conditions can easily be adjusted to deal with injunctions as well as statements, where “come to believe” is replaced by “do”. What we should notice is the emphasis on the behavioural context of meaning, which is now clearly much more than merely the words themselves. It must be that we need to add the rules of reference and conventions of meaning, otherwise the phrase “U does x, thereby meaning that P ...” is incomplete. It also may be, as Searle suggests (1969), that it confuses ‘illocutionary’ and ‘perlocutionary’ speech acts. This is a failure to distinguish the intention of U from the effect on A, and this could be significant in distinguishing different types of situation. An example would be to say something and mean it, without intending to produce any effect at all. My statement “Brisbane is in Queensland” could be made with complete indifference as to the effect on the listeners.
Since we are not concerned here with philosophy-qua-philosophy, let me try to relate what has been said so far to cybernetics. Cyberneticians are concerned with the synthesis and simulation of human behaviour, and to this extent they are vitally concerned with the fact that the ability to use language and logic is involved. They also need to have the skill to develop that language and logic. This is so because, although “synthetic man” can be given a ready-made language and logic, “simulated man” must have the skill to develop both for himself, as well as be able to use them. This is one reason why cyberneticians are especially interested in pragmatics. This provides the more complete behavioural background from which they can operate.
At the same time the cybernetician must be interested in the conceptual background attached to language and logic, and this means that concepts, reference, intention, meaning and truth, and even questions of ontology, are of special interest. This is not only because such ingredients are needed for (by) the automaton, but also to ensure that the cybernetician himself is clear in his own conceptual apparatus. It is, of course, primarily the latter consideration which is the concern of this book.
One should note in passing that, even in our brief discussion of questions of meaning, there seems to be a dangerous sort of circularity occurring. It is difficult to talk of truth without presupposing meaning. It is difficult to talk of reality without presupposing our linguistic usage. It is also difficult to talk of meaning and linguistic usage without having presuppositions about the nature of reality, knowing and understanding. It is even difficult to talk of logic without presupposing logic.
The way in which semioticians and cyberneticians have dealt with this is by making explicit assumptions, accepting unanalysed terms and proceeding to see what follows, either conceptually or in actuality. This is precisely the scientific approach, and it cuts through some of the jungle of linguistic and conceptual confusion. This is in no sense meant to be a jibe at philosophers, only a reminder that whereas we will (and must) continue to examine our assumptions regularly, we must in the meantime also examine their consequences. That pragmatic analysis is of the utmost philosophical value is also undoubted, and I have tried to show something of that value elsewhere, in a forthcoming book. Now, however, our main objective is to show how a theory of signs might be developed, and in trying to develop it we shall be paying special attention to the modern work in the field.87,88,89 This work is very much in the pragmatic tradition, approached largely from a background of psychological theory, and this is one reason for the difference of emphasis from, say, Jonathan Bennett,90 who writes very much more (yet on the same subject) from a philosophical background. We will, as cyberneticians, welcome all these approaches and say that, both in its positive function of supplying theories and models of systems and in its analysis of our presuppositions and concepts, we must regard pragmatics as a central theme.
The fact that cybernetics goes beyond the human being — behaviour and structure — as a system and considers all sorts of other systems should not conceal the fact that the human being remains the centrepiece of cybernetic endeavour.
One term should now be looked at briefly, since it plays such a large part in cybernetics: it is ‘purpose’. Several years ago, Wiener, Rosenblueth and Bigelow91 (1951) and Taylor92 discussed purpose and talked of the behaviour of a greyhound which adjusted its behaviour as a function of the speed and direction of the mechanical hare it was chasing. This, up to a point, is satisfactory, but some more needs to be said.
If human beings (all organisms) did not have purposes (motives, goals) they would not be able to adapt or to learn. What is required is at least an agreement here over usage. We would say that the behavioural patterns which initiate the simplest goal-seeking activity are instinctive, and may involve what ethologists have called innate releaser mechanisms. Over and above these instincts we have learned activity and, by secondary motivation, we build up sets of positive and negative values or subgoals. This happens when there is a need-reduction, in terms of primary motives. All the other stimuli take on a positive value when the behaviour leads to success and a negative value when it does not. Primary needs are such things as food, drink and sex, and we say that a human being is motivated to eat when hungry. His goal is food. The word ‘needs’ is a description of his basic requirements; ‘motives’ are roughly synonymous with ‘purposes’, but not exactly, since although we do say “his motive in driving to London was to see X” and “his purpose in driving to London was to see X”, we also perhaps think of ‘motive’ as rather more specific, but only marginally so. The word ‘reason’ is often used for ‘motive’ and ‘purpose’ in the above statements.
All of this makes it clear that there is a sort of terminological jungle surrounding these “motivational” terms, but we can at least suggest that people have needs, motives and purposes, and that they are attached to primary, secondary, ... even n-ary goals and subgoals which are in the outside world. Whether we then want to say that all behaviour is purposive, or that all adaptive behaviour is purposive, is a matter of making a decision over usage — a purely semantic point. A similar problem arises over predicating ‘living’ or ‘consciousness’ of people at any particular time.
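Returning to the greyhound and the mechanical hare, the kind of purposive, feedback-governed behaviour being described can be sketched in a few lines. This is only an illustrative toy under our own simplifying assumptions (a one-dimensional track and fixed speeds, none of which is taken from the cited papers): the pursuer continually adjusts its movement as a function of the observed position of the moving target.

# Toy illustration of purposive (feedback-governed) behaviour: a pursuer
# adjusts each step as a function of the current position of a moving
# target, as in the greyhound-and-hare example. One-dimensional for brevity.

def pursue(pursuer, target, pursuer_speed, target_speed, steps):
    for step in range(steps):
        target += target_speed                     # the hare keeps moving
        error = target - pursuer                   # feedback: where is it now?
        move = max(-pursuer_speed, min(pursuer_speed, error))
        pursuer += move                            # adjust behaviour to the error
        print(f"step {step}: pursuer={pursuer:.1f} target={target:.1f}")
        if abs(target - pursuer) < 0.5:            # goal reached
            print("caught")
            break

pursue(pursuer=0.0, target=5.0, pursuer_speed=2.0, target_speed=1.0, steps=10)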
With this typically semantic problem borne in mind, let us move on in the next chapter to a theory of signs.
Chapter 9
A THEORY OF SIGNS
All that was said in the last chapter suggests that we should be looking for a theory of signs that places some sort of interpretation on the cognitive acts of people and helps us to understand what influences their behaviour, and we are thinking primarily of their sign behaviour — their use of language. We have no illusions that we shall be able to arrive at anything like a complete understanding of their total behaviour in terms of their linguistic behaviour alone, but it is an enormously fruitful starting point. We shall remember also that we are aiming at sufficient precision (but initially no more) in our theory such that we can manufacture a model which is a formalisation of that theory. This in turn suggests an attempt to provide an axiomatic description of our sign behaviour, and in order to show how one might proceed, an existing theory of behaviour will be used and modified with a view to concentrating on its linguistic aspects. We shall then show a little of the axiomatic form of the theory, which will be put into terms of models in the next chapter.
One first question will be about belief. We are hoping to make belief a central concept, since this is how it was done in the original theory. I will now state the initial ideas (I repeat these, more or less, as stated originally) and consider how we might modify them and, for present convenience, do so minimally.
The theory of signs to be described centres around the notion of a belief. It is suggested that a belief be thought of, in the manner used by C.S. Peirce, as “that which disposes the organism to think and perhaps to act”. Belief is here used as a theoretical term (a sort of logical construct) and can be thought of as being represented by any
empirical statement. In other words, anyone who makes a statement that asserts a fact, a relationship, an attitude, etc., and who implies that he believes it to be true, is to be thought to hold that belief. Beliefs are also considered to be “relatively permanent states” of the central nervous system, although that is not necessary to the present behavioural theory. In everyday terms, beliefs are those stored memories (however stored) whose contents specify for the organism what may be expected to happen (S2) if certain behaviour (R1) is performed under certain stimulating circumstances (S1). Since at any given moment the organism's behaviour is a function of relatively few of the totality of its stored beliefs, we shall call those beliefs which are actually influencing behaviour at any given instant of time expectancies, or E(R1 — S2)'s. Beliefs may be converted into expectancies through the action of the activating stimulus state (S1). This activating stimulus state is a conjunction of motivational stimuli (Sms), stemming from the motivational state (M) of the organism, and of the stimulus complexes (S*s). It is the activating stimulus state which is the effective part of all the potential stimuli in the environment. Its effectiveness springs from active searching and selection on the part of the person, as well as from what is environmentally given. Both the Sms and the S*s are sub-classes of the class of stimuli that have been associated with particular beliefs. Sms are, of course, primarily (although not entirely) internal to the organism, while the S*s may be either external or internal. One possible sub-class of S*s (the relation may actually be one of overlap, class exclusion or class inclusion) is the class of modifying motivational stimuli (MMSs), which are capable of changing the internal motivational state (M). This motivational state, which is seen as being derived from two factors, drive (D) and urgency (U), may act to determine the size and nature of the range of the expectancies transformed from the relevant beliefs.
When a range of expectancies has been transformed from the totality of relevant stored beliefs by the activating stimulus state (S1), the range of expectancies is scanned. This process of scanning leads to the ‘selection’ of a single expectancy whose correlative response (R1) is the one that will be subsequently emitted. The ‘selection’ of the single expectancy during the scanning process is made in terms of (a) the strength of the belief underlying the expectancy and (b) the valence and value of the expectancy.
Valence and value, in turn, are a function of the anticipated reward and the anticipated effort involved in the projected response; the value of the act or object is the invariant feature, while the valence is its value in a particular context.
The emission of the correlative response (R1) associated with the chosen expectancy follows automatically upon the selection of that expectancy. This response will either be followed by the anticipated outcome (S2), in which case a confirmation of the correlative belief will take place, or it will not, in which case a falsification of the belief will follow. This whole process, beginning with the transformation of beliefs into expectancies by the S1 and ending with confirmation or falsification, is called the Conceptual Behavioural Unit (CBU). As far as problem solving is concerned this ends with either (1) the solution, (2) a substitute solution or (3) abandonment. As far as learning is concerned it is an on-going process, and where ‘thinking’ is involved, as it will often be in both problem solving and learning, a series of CBUs occur. They will be concerned with signs (including symbols and possibly images).
While the interactions of the above variables are too complex to be presented in any adequate detail here, a few further points should be made. Which beliefs are converted into expectancies depends upon the previous association between certain stimuli and certain beliefs. Beliefs are thus acquired by contiguity — the association in experience of the activating stimuli (S1s) with means-outcomes (R1 — S2)s. Such experiences may be direct ones, wherein physical energy changes, forming the basis for the stimuli, emanate from that event about which the organism acquires beliefs. But many beliefs are learned indirectly through the use of symbols, where the emanating energy changes from the event do not in any direct fashion determine the beliefs about that event. Thus, knowledge may be acquired either through direct acquaintance or through description, in the simple sense of these terms already agreed. Apart from their acquisition, beliefs may be strengthened (through confirmation) or weakened (through falsification), either by description or by direct acquaintance, by going through the steps of the general behavioural unit (GBU). The relation between this theory of signs and our discussions of explanation in science (Chapter 2) is fairly clear.
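To indicate how readily the CBU lends itself to the kind of explicit modelling spoken of here, the following is a minimal sketch of one pass through the unit. It is our own simplification rather than the formal axiomatisation of the theory: belief strength, value and valence are collapsed into single numbers, and the stimulus and response names are invented for the illustration.

# Minimal sketch of one Conceptual Behavioural Unit (CBU): an activating
# stimulus state S1 converts relevant beliefs into expectancies, the
# expectancies are scanned, one is selected on the joint basis of belief
# strength, value and valence, its response R1 is emitted, and the
# outcome S2 confirms or falsifies the underlying belief.

class Belief:
    def __init__(self, s1, response, expected_s2, strength, value, valence):
        self.s1 = s1                    # stimulating circumstances
        self.response = response        # R1
        self.expected_s2 = expected_s2  # anticipated outcome
        self.strength = strength
        self.value = value              # invariant worth of the outcome
        self.valence = valence          # its worth in the present context

def conceptual_behavioural_unit(beliefs, s1, actual_s2):
    # Transformation: only beliefs associated with S1 become expectancies.
    expectancies = [b for b in beliefs if b.s1 == s1]
    if not expectancies:
        return None
    # Scanning and selection: belief strength together with value and valence.
    chosen = max(expectancies, key=lambda b: b.strength * b.value * b.valence)
    # Emission of R1, then confirmation or falsification by the outcome S2.
    if actual_s2 == chosen.expected_s2:
        chosen.strength += 0.1          # confirmation strengthens the belief
    else:
        chosen.strength -= 0.1          # falsification weakens it
    return chosen.response

beliefs = [Belief("hungry", "go to kitchen", "food found", 0.8, 1.0, 0.9),
           Belief("hungry", "wait", "food brought", 0.4, 1.0, 0.5)]
print(conceptual_behavioural_unit(beliefs, "hungry", "food found"))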
Motivation is not only an important factor in the determination of the range of expectancies elicited for scanning, but it also plays an important role in the selection process through its indirect effect upon the valence and on the value. Furthermore, motivation operates to determine the speed of selection as well as the speed and strength of response elicitation.
There are other behavioural units, called perceptual behavioural units (PBUs), which are closely related to CBUs and are necessary to them as far as the person's transactions with the environment are concerned. Both are connected to a GBU (General Behavioural Unit), which depicts the sensorily determined behaviour and includes perception and recognition on the one hand and thinking and learning on the other. For every CBU there must exist a PBU to categorise the originally registered activating stimulus state (S1). A PBU is also necessary to identify the outcome or goal (S2) to allow assessment of confirmation or falsification. It is clear that a PBU must precede every CBU where environmental transactions are concerned (GBUs occurring). The exception is in the conceptual processes, where thinking and problem solving are concerned; here CBUs might follow each other: this is what has often been called ‘free’ as opposed to ‘tied’ thinking.
We next come to a set of theorems and assumptions regarding the detailed nature of motivation,93,94 and we shall not discuss this sort of detail in this chapter; we shall instead look back at what has been said so far. The most important point in the theory is perhaps the use of the word ‘belief’ in such a broad way. It could reasonably be said that, in ordinary parlance, our knowledge is the total collection of beliefs we have, and some of them are presumably erroneous. We try typically (where rational) to see that our set of beliefs is self-consistent, as this is a clue to their truth (cf. the coherence theory of truth). But when we behave, we certainly do so in a variety of different ways. We say we are trying to persuade, to cajole, to bully and so on. How are these different modalities related to beliefs, and when we come to talk of signs in the context of beliefs, how shall we be able to fit our theory to such matters as illocutionary acts, such as intending, stating, describing, demanding, etc., on the one hand, and perlocutionary acts, such as causing alarm, convincing someone, getting someone to realise something, etc., on the other? We also need to be careful in relating the word ‘belief’ to a speech act, in that I may believe X and utter Y, so clearly I cannot say anything as simple as that a statement I make represents a belief I hold. All of these matters make it clear that whenever I translate beliefs into expectancies, I must invoke beliefs of the form “I must persuade X of Y” or “I can mislead Z by saying W”, etc.
This involves at least multiple beliefs, which could be, say, of the following form:

“I believe Y (to be true)” (1)
“I wish X to know Y” (2)
“I wish to mislead Z” (3)
“I believe W to be false” (4)
“I wish to help Y” (5)
“I wish to hinder Z” (6)

All of (1) — (6) can be conjoined as B1 — B6. But four of them use the word ‘wish’ and not ‘believe’. Our decision now is as to whether we subsume wishes, desires, needs, etc. under beliefs or make them into a separate category. They can all be collectively named ‘Propositional Attitudes’. There are motivational stimuli in the theory which change the nature of the beliefs which are transferred into expectancies, and perhaps the difference between ‘beliefs’ and ‘wishes’ is best viewed as the extent of the motivational influence. I could, for example, translate (2) as:

“I believe X should know Y” (7)

Yet one would be forced to admit that (7) and (2) have a slight difference of emphasis. We may do best to accept the fact that, when we come to language signs, we will be free to describe illocutionary and perlocutionary acts as refinements of the general theory of signs. It is in this way that a pragmatic theory of signs may be made to make internal distinctions as required. As far as deception is concerned, this will refer in just the same way to the modality of my speech act, based on multiple beliefs such as (3) and (4) above.
Let us now, though, return to look at some more of the theory, which represents an attempt to provide, in stages, an account of (linguistic) behaviour that can be explicitly modelled in what might reasonably be called a cybernetic manner. In general, the form of the PBU is similar to that of the CBU. After proper encoding, certain of the stimuli impinging on the central nervous system are capable of transforming beliefs into expectancies; then, through the process of scanning, one of the expectancies is
selected, and a response ensues. But there are certain important differences between the details of the CBU and the PBU that must be noted. The first difference concerns itself with the nature of the final response (R1). In the PBU the R1 is the covert response of categorising or classifying the impinging stimuli. Such a response is best thought of as occurring entirely within the central nervous system, and as not necessarily involving conscious awareness. In contrast, the R1 of the CBU may be any number of different actions, some of them overt and some covert.
We must distinguish between the contents of the beliefs of the PBU and those of the CBU. In the PBU, beliefs concern themselves with such cognitive actions as seeing, hearing, tasting, etc. — in general, those activities which have traditionally come under the rubric “sensation and perception”. They involve the action of the central nervous system as the organism “apprehends” its external, as well as its internal, environment. Perceptual beliefs can be expressed as conditionals of the form: if the impinging stimuli (the S1) have been categorised as C1 (the R1), then the subsequent impingement by other stimuli of the categories C2, ..., Cn (the S2) is likely to obtain, with some probability p. What the conditions are under which these probable impingements will take place also forms the content of perceptual beliefs — or, more precisely, the perceptual meta-beliefs. For convenience's sake, we shall consider all those beliefs that are concerned with the perceptual categorisation of events to be perceptual beliefs, and all others general beliefs. That such an arbitrary division as this is only a temporary verbal convenience will be appreciated when we now note the close relationship between CBUs and PBUs.
In the CBU, in order for an activating stimulus state (S1) to transform a belief into an expectancy, it is first necessary that the S1 be perceived. By perception, of course, we mean the action of a categorising response which is the R1 of an immediately preceding PBU. That is to say, in order for a CBU to take place in “tied thinking” it must be preceded by a PBU. On the other hand, in the PBU, in order for an S1 to transform a perceptual belief into a perceptual expectancy we must arbitrarily assert that it is not possible that the S1 be preceded by a categorising response. Thus, for the PBU, given a particular S1, those perceptual beliefs that are associated with the S1 will immediately and automatically be converted into perceptual expectancies.
This transformation will take place without the S1's being first perceptually categorised. The only categorisation involved is the R1 which ends the PBU.
Since there may be more than one perceptual belief associated with a given S1, the question arises as to which of the beliefs converted into perceptual expectancies will be selected. Such a selection will lead automatically to the categorising response, R1, of the PBU. We have seen that in the CBU the selection of the expectancy which leads automatically to the R1 is a joint function of (a) the value and valence of the expectancies which have been transformed from beliefs, and (b) the strength of the beliefs correlative with these expectancies. In the PBU, however, while these three factors of value, valence and belief strength also operate to select a single expectancy, it is the latter that seems to play the more important role. This does not mean that value and valence are not important, especially when the stimulating circumstance (S1) is ambiguous and when motivation is strong; but, on the whole, the strength of the perceptual belief is primary in the selection of a particular perceptual expectancy leading to the R1 of the PBU. By strength of perceptual belief we mean, simply, the degree or strength of association existing between a particular S1 and a particular perceptual belief.
Unlike the CBU, the final step in the PBU is not an outcome or goal (S2) which is then followed by another PBU leading to a categorising response, for this would now involve us in an infinite regress. Rather, the PBU ends with an R1, the categorising response. This perceptual categorisation may, however, be subsequently confirmed or disconfirmed; and this, in turn, will lead to the strengthening or weakening of the correlative perceptual belief through confirmation or falsification. Such confirmation or falsification may take place in two ways. First, the outcome (S2) of the response (R1) of the subsequent GBU may either confirm or falsify the veridicality of the previous perceptual categorising response. Secondly, a subsequent PBU, because of the content of the organism's belief system, may be categorised as being either compatible or incompatible with the previous PBU in question. And this, in turn, may bring about a confirmation or falsification of the previous PBU, leading to the strengthening or weakening of the correlative perceptual belief. It should be noticed that perception is normally regarded as certain by the organism and is not thought of as requiring confirmation.
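The conditional form of a perceptual belief, and the dominant role of belief strength in the PBU's selection, can likewise be sketched briefly. Again this is only our own illustration, under the simplifying assumption that strength and degree of association are single numbers; the category names are invented.

# Sketch of the Perceptual Behavioural Unit (PBU): a perceptual belief is
# a conditional "if the impinging stimuli are categorised as C1, then
# stimuli of categories C2..Cn are likely to follow, with probability p".
# Selection is driven chiefly by the strength of association with the S1.

class PerceptualBelief:
    def __init__(self, category, expected_categories, probability, strength):
        self.category = category                        # C1, the categorising response R1
        self.expected_categories = expected_categories  # C2, ..., Cn
        self.probability = probability                  # p
        self.strength = strength                        # association with the S1

def perceptual_behavioural_unit(perceptual_beliefs, s1_resemblance):
    # All beliefs associated with the S1 become perceptual expectancies at
    # once; the strongest association wins and yields the categorising R1.
    expectancies = [(b, s1_resemblance.get(b.category, 0.0) * b.strength)
                    for b in perceptual_beliefs]
    chosen, _ = max(expectancies, key=lambda pair: pair[1])
    return chosen.category, chosen.expected_categories, chosen.probability

beliefs = [PerceptualBelief("smoke", ["fire", "heat"], 0.7, 0.9),
           PerceptualBelief("mist", ["damp air"], 0.8, 0.4)]
# The S1: how strongly the impinging stimuli resemble each category.
print(perceptual_behavioural_unit(beliefs, {"smoke": 0.6, "mist": 0.5}))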
We must distinguish between beliefs regarding perceptual events and beliefs regarding perceptual rules. We might also talk of perceptual beliefs regarding rules about rules. Another way of distinguishing these various levels of perceptual beliefs is to call them “beliefs”, “meta-beliefs”, “meta-meta-beliefs”, etc.
Let us now continue our examination of the PBU in detail. We have previously defined the activating stimulus state (S1) as “... that state of the central nervous system which is capable of transforming specific beliefs into expectancies”, and we have considered the S1 to be made up of stimulus complexes (S*s) plus stimuli arising from motivational states (Sms). But in order to specify more precisely the functioning of the S1 in the PBU, it is convenient to partition S1s into three subcategories: (a) cues, (b) clues and (c) signs.
We shall postulate, as part of the connotation of the words “cue”, “clue” and “sign”, that they refer to the organism's use of certain stimuli after they have been associated with specific perceptual beliefs. Thus, we shall consider cues, clues and signs to be a subclass of the class of activating stimulus states (S1s) rather than, say, a subclass of stimulus complexes (S*s) or motivational stimuli (Sms). For we wish to make it clear that cues, clues and signs have their functional basis not only in events external to the organism, but also in events internal to it, such as beliefs, attitudes, motivations, etc.
D1. CUES (Cus):
A subclass of the class of activating stimulus states (S1s) which stems from objects or events, either external or internal, which are being apprehended directly by the organism through knowledge by acquaintance.
Cues from external events may be modified by internal Sms stemming from concomitant motivational states. Conversely, cues from internal events may be modified by S*s stemming from concomitant external events. Thus:
D2. CLUES (Cls): A subclass of activating stimulus states (S1s) which stems from the context, ground or surround in which the apprehended object or event, either external or internal, is embedded. The
apprehension of a clue by the organism is direct, i.e. through knowledge by acquaintance.
Unlike cues, clues are not apprehended in and of themselves; if they are experienced at all, they are experienced in conjunction with cues, and they usually exert influence upon the apprehension of the concomitant cue, and vice versa. Clues, then, are S1s which inform the organism about some events or objects other than themselves. Such information may have to do with such 'objective' matters as the size, colour, shape, location in space, etc. of objects as well as such 'subjective' matters as pleasantness, attractiveness, harmfulness, etc. Such information as is given by clues may or may not, of course, be veridical. Thus:
D3. SIGNS (Sns): A subclass of the class of activating stimulus states (S1s) which the organism, through the acquisition of beliefs, has learned to stand for other stimuli.
As can be seen by the definition, signs are closely related to clues, since they are a subclass of S1s which are concerned with objects or events other than themselves. But, unlike clues, which must always appear with the object or event about which they yield information, signs may give information about objects or events which are not simultaneously present. In this sense, clues may be considered to be a kind of subclass of the class of signs.
As we see from the above definition, signs may come to stand for, or be substitutable for, or come to represent, objects or events to the organism which are not now present. In fact, under special circumstances, they may come to stand for events or objects which can never be known directly (i.e. through knowledge by acquaintance) by the organism. Thus, though the information the clue conveys is always about an object or event known through knowledge by acquaintance, in the case of signs this need not be so. This is particularly true of signs that are encountered in language which are used in formal and informal education to convey information about objects or events which may never be directly experienced by the organism. Cues as well as clues act as signs under special circumstances.
This is by no means a sufficient analysis of signs, since that part of the definition which says "stand for", itself requires a great deal of
further explanation. The first step in our further explanation is to define a symbol.
D4. SYMBOLS (Sys): A subclass of the class of signs (Sns) which have specific reference to concepts (or beliefs) and may also have external referents in the form of physical objects.
Symbols are thus arbitrary or conventional words or sentences. But as words they may signify other internal states or external objects, relations, etc. As sentences, it is the symbols (as words) which form the sentence which signifies the concept or belief. The meaning of the sentence is then to be thought of as its signification (to the concept or belief). In Price's example of "the cat in the cupboard" (S1) the sentence clearly signifies this for the utterer or the listener. That he may not believe what is signified is clear, since we are not saying that signification disposes the utterer or hearer to believe in S1, only to form a belief about it (which may lead to action).
We can now add to our vocabulary signals, as standardised signs, sometimes symbolic in form, and images, which are internal signs and are the equivalent of cues and clues in the environment. Objects and events about which cues and clues convey information can be experienced only directly, while the objects and events about which signs convey information can be experienced directly or indirectly. Knowledge about cues, clues and signs, however, may be acquired either through knowledge by acquaintance or knowledge by description.
The breaking down of S1s into the above subclasses may often turn out to be rather arbitrary; nevertheless, it is believed to be a convenient way of describing the various functions of the activating stimulus in the PBU.
Often, when we apprehend an object or event, we are not able to categorise it completely or fully, to our satisfaction, at the first attempt. Instead, we may first categorise in one way then another, and so on. We may, of course, take into account each one of our 'abortive' categorisations to form a concatenation of interpretations from which we construct our final categorising response. Or we may make use of further incoming information from the object or event itself. Or, finally, by examining our past experiences, through our
perceptual beliefs, we may try to 'remember' what this object or event 'might be'. We then arrive at a categorisation upon which we are prepared to act, insofar as we are relatively sure that our categorisation is a 'correct' one.
As with Bruner et al.95 we think of such categorising behaviour in cognitive activities such as problem solving as having the following advantages:
(1) It reduces the complexity of the environment.
(2) It is a means of identification.
(3) It reduces the need for learning.
(4) It is a means of action.
(5) It permits the ordering and relating of classes of events.
From this we see that we can look upon the perceptual process as central to problem solving and as consisting of a finite series of interpretations: I1, I2, ..., In, where In is that interpretation or categorisation upon which the organism is prepared to act. In the limiting case, the series of interpretations may of course comprise only two, I1 and I2, or even merely I1. The series of categorisations may take place extremely rapidly, and it is only on relatively rare occasions that such a series is slow enough and perhaps difficult enough so that we become conscious of the process.
It is therefore necessary, for any complete description of the perceptual process and conceptual process, to make allowances for such a series of interpretations and categorisations. We shall, therefore, distinguish between what we will term the provisional categorising
responses (PCRs) and the final categorising responses (FCRs), the latter being of particular interest since it forms the S1 of the subsequent GBU.
The final stages of such cognitive acts as problem solving, as opposed to the processes of perception, may be terminated by the solution, etc. This is no longer to be thought of as an on-going process, as is the case in perception and learning, which is terminated only by sleep (partially) or death (totally). Before termination of, or even the application of, an expectancy to a cognitive situation, we may expect to find the notion of risk also being involved.
So we now introduce language signs (lansigns) and ask what makes them differ from signs in general. This is a notoriously difficult problem, as someone listening on a short-wave radio,
for instance, and hearing a sequence of sounds, finds it difficult to know whether the sounds are linguistic (e.g. Morse code) or not (atmospherics). In the extreme cases the distinction is obvious, but science requires that we say what the distinction is, or how we should draw it. In the first place, though, the phrase "stand for" in the existing definition of 'sign' is admittedly inadequate. It presumably means 'refers', and reference is certainly not enough for language signs, although it is where they start, through ostension if nothing else. So a lansign is a sign that refers in a particular way, and that particular way would be by the use of symbols (and these include gestures), so we must say linguistic symbols. So far so good; linguistic symbols are statements in a language (wffs in a logistic system) which obey certain additional rules which are syntactic, conventional and also (as a result) have meaning. Let us now remember that, as we said in the last chapter, meaning can be for someone and we have to make some decision as to how to characterise it generally and yet allow a statement to be meaningful for some and not for others. The sort of definitions we need are:
D5.
LANSIGNS are signs that use language symbols in their role of
making reference to something other than themselves. For 'self-referential' signs we can use quotes, and then the reference is from "X" to X, "X" to 'X' or 'X' to X, where the last is word-to-"thing", and the first two are thing-word to thing and thing-word to word.
D6. MEANINGFUL LANSIGNS are signs that are capable of eliciting a response from, or changing the state of, some other person, in a manner which follows the intention of the utterer and subject to the conventions of the language used.
We need also to define, at least semi-formally, some of the key terms we have already described. The reason for this is simply to contribute enough to the behaviour theory to make axiomatisation possible. D7.
BELIEF: A relatively permanent state of the organism which represents the association, through experience, of the activating stimulus state (S1) with a means-outcome (R1 - S2).
D8. STIMULUS (S): Any change of energy which impinges upon the nervous system such that it brings about activity in that system. The source of this energy change may be either external or internal to the organism.
D9.
STIMULUS COMPLEX (S*): A subclass of the class of stimuli, both internal and external to the organism.
D10. MOTIVATIONAL STIMULI (Sm): A subclass of the class of stimuli, internal to the organism, which derive from motivational states and which have become associated with particular beliefs.
D11. MODIFYING MOTIVATIONAL STIMULI (MMS): A subclass of the class of stimuli, both external and internal, which may either include, be included in, or be equivalent to the stimulus complex (S*), and which have the properties necessary to modify the motivational state of the organism.
D12. ACTIVATING STIMULUS STATE (S1): That state of the central nervous system which is capable of transforming specific beliefs into expectancies.
D13. DRIVE (D): Specific needs of the organism, both primary and secondary, which manifest themselves in the motivational stimuli (Sms).
D14. URGENCY (U): A special state of the organism which manifests itself in the motivational stimuli (Sms).
D15. EXPECTANCY (E(R1 - S2)): A relatively temporary state of the organism, elicited by S1, which is derived from and has the same content as the correlative belief.
D16. RESPONSE ELICITATION: A process, following automatically upon selection, in which the selected means is elicited as a response.
D17. RESPONSE: The behaviour of the organism following immediately upon the selection process. The response may or may not be overt, and may itself act as a stimulus for further behaviour.
D18. GOAL: The intended end state with respect to any expectancy.
D19. PURPOSE: The basis of the selection of an expectancy with respect to its goal.
D20. CONTEMPLATION: The process of evaluation by which O1 selects his response.
D21. EVALUATION: The assessment of valence and value involved in the relevant goal with respect to the anticipated effort of attainment.
D22. VALENCE: A function of the anticipated effort and anticipated value contained as part of a belief and, hence, associated with an expected means-outcome.
D23. EFFORT: The amount of effort believed to be, and hence expected to be, necessary in order to attain a certain goal.
D24. VALUE: The amount of reward believed to be, and hence expected to be, forthcoming upon the attainment of a certain goal.
We can now write a few axioms, by taking our definitions and putting them together to provide a basis for our inference making, and then we can derive some theorems.
A1. The strength of drive (D13), rising beyond a certain threshold, initiates a set of beliefs and then, by selection, an expectancy for drive reduction purposes.
A2. The state of expectancy continues until a terminal state is reached where a classification is made (PBU) or an action takes place (GBU) or a further CBU is alerted.
A3. A stimulus S1 brings about the state of expectancy in organism O1, which may be followed by contemplation, but is always followed by a response R1.
A4. Lansigns (as a statement Ss) are uttered by O1 in order to persuade some audience (O2) of some state of affairs. The state of affairs referred to may include O1's beliefs, O2's suspected (by O1) beliefs, a state of the world, etc. The word 'persuade' may (as an illocutionary act) involve action or change of belief (a perlocutionary act) for O2.
A5. A statement Ss, to have meaning, must be composed of symbols used as lansigns so that O2 understands the reference of the lansigns and is able (though not obliged) to follow the intention of O1.
A6. Ss is true if and only if its reference is correctly ascribed by Ss.
Now some theorems follow:
T1. If a set of beliefs is elicited in O1 by an S1, then if the S1 is an Ss (emanating from O2), O1 may respond by an R1 that remains covert, but still results in a change of state of O1. This will occur if O2 did not ask a question, or where O1 did not wish (a belief evoked) to reply, or where the belief evoked in O1 was that no statement was required of him.
T2. The same as T1 where the response R1 is overt. The distinction between the overt and covert response depends wholly on the particular beliefs evoked in O1.
T3. If a set of beliefs is elicited in O1 by an S1 that emanates from the environment and is composed of Sms only, then the response R1 will be arrived at by the selection of the appropriate expectancy, which will provide a goal that initiates overt activity.
T4. If the overt activity initiated by T3 leads to success, then the belief(s) which was/were acted upon would be strengthened and thus more likely to be evoked on similar future occurrences of such Sms: learning has occurred (by confirmation).
T5. If learning occurs as in T4, where the situation is novel, then we say that the behaviour was problem solving behaviour.
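To indicate the kind of computerisation such an axiomatisation invites, a purely illustrative sketch follows. The Python fragment below is ours and not part of the theory; the names Belief, Expectancy and select_expectancy, and the additive scoring rule, are assumptions standing in for whatever function a finished theory would specify for the joint action of belief strength, value and valence (D15, D22-D24, A1-A3).

from dataclasses import dataclass

@dataclass
class Belief:
    # D7: the association of an activating stimulus state S1 with a means-outcome (R1 - S2)
    response: str      # R1, the means
    outcome: str       # S2, the expected outcome
    strength: float    # degree of association between S1 and the belief
    value: float       # D24: reward expected on attaining the goal
    effort: float      # D23: effort expected to be necessary

    def valence(self) -> float:
        # D22: some function of anticipated value and anticipated effort;
        # the simple difference used here is only a placeholder.
        return self.value - self.effort

@dataclass
class Expectancy:
    # D15: a temporary state, elicited by S1, with the content of the correlative belief
    belief: Belief

def select_expectancy(elicited_beliefs) -> Expectancy:
    # A1-A3: the beliefs elicited by S1 become expectancies and one is selected;
    # in the CBU the selection depends jointly on strength, value and valence,
    # while in the PBU belief strength is the dominant factor.
    best = max(elicited_beliefs, key=lambda b: b.strength + b.value + b.valence())
    return Expectancy(best)

# Example: two beliefs evoked by the same S1; the stronger, more valuable one is
# selected, and its R1 would then be elicited (D16) as the response (D17).
beliefs = [Belief("reply", "question answered", 0.9, 1.0, 0.3),
           Belief("stay silent", "no change of state", 0.4, 0.2, 0.0)]
print(select_expectancy(beliefs).belief.response)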
This outline beginning of an axiomatised theory of signs is meant here only to illustrate a technique which I take to be characteristic of science and also central to cybernetics. The reason is quite simple. Science has always used models and theories and, in the search for precision, it has sought to place much of its content into axiomatic form. This goes back to Aristotle and geometry and much more recently to Spinoza in his Ethics. In the present century there have been a large number of such efforts designed to reduce a mass of data and some attendant theory to a coherent set of statements. Hull and many of his associates attempted it in the Mathematico-Deductive Theory of Rote Learning, and much more specifically Woodger attempted it with much of biology. Peano and others have done the same for number and set theory, and so it goes on. Much of this type of work has been criticised as having no purpose because of its complexity. But all this was before we became aware of computers. Now this sort of precision is just what is required by the systems analyst if he is to computerise a scientific model or theory. As a result we would regard the present chapter as a starting point for just such an undertaking. It is natural to think of computerising a theory of behaviour, where that behaviour is ideally human behaviour (a simulation) and includes especially the linguistic aspects of that behaviour.
The process of building up the axiomatic system in question is long and painstaking. The present author has taken his theory of behaviour beyond what is described here and also recognises the need for a thorough critical reappraisal before taking it further, or revising or restating it at a more precise level of description.
In the next chapter we turn to various models used by cyberneticians to illustrate different aspects of human behaviour, and the ones we shall be considering are automata. These seem among the most "natural" of all ways in which cybernetic modelling should develop.
Chapter 10
MODELS AS AUTOMATA
In this chapter we discuss methods of modelling and formalising the type of Pragmatic Theory of Signs which we have outlined in the last chapter. We have already conflated, in part, the philosophical use of the terms with the behavioural use. We have done this in order to draw attention to the possible similarities between what we seem to mean normally by a cognitive term and how we might guess at its behavioural and physiological representation. If our interest in cybernetics is a simulatory process, some such task as this seems necessary.
In doing what we have done, we readily recognise the risk of creating greater confusion than clarification, and confounding rather than conflating terms. It is therefore to be thought of as an experiment to be judged by its subsequent success or failure; exactly the same as with any other scientific undertaking.
Our next task, and this is what this chapter is concerned with, is to examine the models, largely of a "logical" kind, which underlie our behavioural theory. This takes us, as it were, full circle back to the logical foundations of our theory. The principal reason for performing this type of formalisation is the need to arrive at an actual blueprint, so that the models of the theory could, if necessary, be constructed. This is what makes cybernetics "effective" in that it tries to ensure that theories are really capable of making the predictions they claim and thus allows testing of those theories.
The situation for the analysis of behaviour, from the cybernetic point of view, therefore becomes quite complicated. We have, in a sense, a whole series of approaches, where many will treat each as "free standing" and which can also be tied together. This at all events
is what is attempted and is exactly how physiological psychology came into being. We are, in the same way, relating philosophy to behaviour, and behaviour to physiology, and both to models, by a process of formalisation. This allows the appropriate testing of the theory. We then, of course, have to invoke philosophy in its other role and ask whether these are sensible things to do or whether they are a
priori absurd. This leads directly to the mind-body problem (Chapter One) to which our answer is: there is nothing absurd at all about the attempt. However, it may fail, and there are risks, so one must proceed with caution.
Without any more ado then, let us look at automata, first of all the Turing Machine with its tape ruled into squares which are scanned one at a time and the tape (or the scanner) moving from left to right. The program for the automaton is the set of quadruples (or quintuples) which define, as a function of the initial set of symbols on the tape (the input), the operation or computation of the machine. At the end of the computing we are left with the output, which is made up of the symbols remaining on the tape. We assume that the Turing Machine is capable of only a finite number of distinct internal states and that the next operation at any time t is a function of its internal configuration at that moment and the finite expression which then appears on the tape. We can define a Turing Machine more formally as follows:
Q(t + 1) = G(Q(t), S(t))
R(t + 1) = F(Q(t), S(t))        (1)
D(t + 1) = D(Q(t), S(t))

G is some function relating the previous state and the previous input, while F is a function relating the same two arguments as G. G determines the change of state, while the output R writes on the square being scanned; this "writing" may include overprinting the symbol already there, and also moving the tape either to the left or the right. The new function D simply tells us which way the tape moves. We can think of the set (s0, s1, ..., sm) as the alphabet of input symbols, (r0, r1, ..., rn) as the output signals, and (q0, q1, ..., qp) as the internal states, called state symbols. The machine so far described is finite, but can be thought of as "potentially infinite" (or linear bounded) in that if it should be about to run off either end of the tape, more tape can always be added. Nevertheless at any particular time the amount of tape is finite.
Formal mathematical descriptions of Turing Machines are available in Turing,41 Post,96 Kleene,97 Davis,30 Arbib31 and Minsky.98 We are not concerned here with the various notations, which are all equivalent to each other, and will use the notation used by Davis. Davis talks of quadruples
qi Sj Sk ql
qi Sj R ql
qi Sj L ql
qi Sj qk ql

which imply that given the internal state qi and scanned symbol Sj we can either expect to overprint Sk, move right, move left or, in the case of the last quadruple, ask, in effect, a question as to whether a particular integer n, say, belongs to a set such as A. Turing machines which include this fourth quadruple are called Interrogation Machines. Those that exclude it are called Simple Turing Machines.
We next add some formal definitions. An Instantaneous Description A is an expression containing only one state symbol and contains neither R nor L. It must have at least one symbol to the right of the state symbol. A Tape Expression contains only alphabetic symbols, and we shall write the number of 1's on a tape expression as <F>, where we mean to imply F 1's; for example, <6> = 111111, or any tape expression containing six 1's, since all symbols other than 1 are ignored.
Let us look at a simple (and highly artificial) Turing Machine operation. Consider the tape:

A  B  C  A  B  D  A  C  D  A  A  B
The purpose will be to eliminate anything on the tape but A and B (leaving a blank called X) and make all Bs into As. The program of quadruples is as follows, where we start at the A on the left end of the tape:
q0 A R q0
q0 B A q0
q0 C X q0
q0 X R q0
q0 D X q0

This trivial operation has successive instantaneous descriptions:

q0ABCABDACDAAB
Aq0BCABDACDAAB
Aq0ACABDACDAAB
AAq0CABDACDAAB
AAq0XABDACDAAB
AAXq0ABDACDAAB
AAXAq0BDACDAAB
AAXAq0ADACDAAB
AAXAAq0DACDAAB
AAXAAq0XACDAAB
AAXAAXq0ACDAAB

and so on until we have the terminal state written in full as:

AAXAAXAXXAAAq0

The sequence has completed the tape and then stops.
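Since the point of casting the program as quadruples is that it could actually be run, the following is a minimal simulator sketched in Python; the function name run_quadruples, the tuple layout and the choice of "_" for an unwritten square are our own conventions and not Davis's. Applied to the five quadruples above it reproduces the terminal tape AAXAAXAXXAAA.

def run_quadruples(quads, tape, state="q0", blank="_"):
    # quads: (state, scanned symbol, action, next state), the action being a
    # symbol to overprint or "R"/"L" for a move, as in the quadruples above.
    table = {(q, s): (act, nxt) for q, s, act, nxt in quads}
    tape, pos = list(tape), 0
    while True:
        if pos < 0:                      # extend the tape on the left
            tape.insert(0, blank)
            pos = 0
        if pos == len(tape):             # extend the tape on the right
            tape.append(blank)
        entry = table.get((state, tape[pos]))
        if entry is None:                # no quadruple applies: terminal description
            return "".join(tape).strip(blank)
        action, state = entry
        if action == "R":
            pos += 1
        elif action == "L":
            pos -= 1
        else:
            tape[pos] = action           # overprint the scanned square

program = [("q0", "A", "R", "q0"), ("q0", "B", "A", "q0"),
           ("q0", "C", "X", "q0"), ("q0", "X", "R", "q0"),
           ("q0", "D", "X", "q0")]
print(run_quadruples(program, "ABCABDACDAAB"))    # prints AAXAAXAXXAAA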
This program can be depicted by a state transition diagram, which remains in the single state q0 for all the operations, unlike our next example of parenthesis checking, a method of ensuring that a set of parentheses is well formed. This time we shall use quintuples rather than quadruples, if only to illustrate the difference. The tape is as follows:
B  B  *  (  (  )  (  )  )  *  B  B
The quintuples are:
q0 ( q0 ( R
q0 ) q1 B L
q0 B q0 B R
q0 * q1 * N
q1 ( q0 B R
q1 B q1 B L
q1 * q2 * N

The machine starts at the parenthesis furthest to the left. It moves right and erases the first right parenthesis, then moves back to eliminate the matching left one, and proceeds to eliminate in pairs, leaving none if the string was well formed. The state transition diagram this time clearly involves three states (where q2 is terminal); with each arrow labelled scanned symbol/written symbol/move, q0 loops on (/(/R and B/B/R, q1 loops on B/B/L, q0 passes to q1 on )/B/L and on */*/N, q1 returns to q0 on (/B/R, and q1 passes to the terminal state q2 on */*/N.
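A corresponding sketch for quintuples, again ours rather than the text's, shows how little the simulator changes: each step now writes a symbol, changes state and moves (or stays put) in a single operation. Run on the tape above, our reading of the program erases every parenthesis, which is the mark of a well-formed string.

def run_quintuples(quints, tape, state="q0", pos=0, terminal=("q2",)):
    # quints: (state, scanned, next state, written symbol, move), the move
    # being "R", "L" or "N" (no move), as in the quintuples above.
    table = {(q, s): (nxt, w, m) for q, s, nxt, w, m in quints}
    tape = list(tape)
    while state not in terminal:
        entry = table.get((state, tape[pos]))
        if entry is None:
            break                        # no quintuple applies
        state, tape[pos], move = entry   # write the symbol, take the new state
        if move == "R":
            pos += 1
        elif move == "L":
            pos -= 1
    return "".join(tape)

quints = [("q0", "(", "q0", "(", "R"), ("q0", ")", "q1", "B", "L"),
          ("q0", "B", "q0", "B", "R"), ("q0", "*", "q1", "*", "N"),
          ("q1", "(", "q0", "B", "R"), ("q1", "B", "q1", "B", "L"),
          ("q1", "*", "q2", "*", "N")]
print(run_quintuples(quints, "BB*(()())*BB", pos=3))   # only B's and the two * markers remain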
The relation between the quadruple and the quintuple is evident, as is the fact that the quintuples are clearly more difficult to follow when the computation or operation becomes complicated, and Wang, among others, has suggested a different terminology which is very much easier to use in such circumstances.
As far as Turing Machines are concerned, we must now think of a
function as being defined by the Turing Machine's behaviour. The argument occurs on the tape before the computation starts and the value of the function for that argument is what is left when the computation is complete. We now show two examples of Turing Machine computations.
Our first example is the trivial one of subtracting one number from another. Let us take 5 - 2, so the function F(x, y) = x - y is to be computed for x = 5 and y = 2. The quadruples needed are as follows:
q1 1 B q1
q1 B R q2
q2 1 R q2
q2 B R q3
q3 1 R q4
q3 B L q9
q4 1 R q4
q4 B L q5
q5 1 B q5
q5 B L q6
q6 1 L q7
q7 1 L q7
q7 B L q8
q8 1 L q8
q8 B R q1
q9 B 1 q9

With the convention that a number n is represented on the tape by n + 1 1's, the argument pair (5, 2) is written as

111111B111

So if q1 starts at the left-hand digit we get the following set of I.D.'s:

q1111111B111B
q1B11111B111B
Bq211111B111B
B1q21111B111B
B11q2111B111B
B111q211B111B
B1111q21B111B
B11111q2B111B
B11111Bq3111B
B11111B1q411B
B11111B11q41B
B11111B111q4B
B11111B11q51B
B11111B11q5BB
B11111B1q61BB
B11111Bq711BB
B11111q7B11BB
and this cycle is repeated until only 111 is left on the tape. Our second example is a simple proof as opposed to a numerical example. The successor function is simply:
S(x) = x + 1, with particular examples such as S(8) = 9, S(11) = 12, and so on. The successor function plays a very important part in the foundations of mathematics, and is by the nature of things easily shown to be computable.
Proof: Take a Turing Machine Z, and let q1m (that is, q1 followed by the representation of m), which is the initial Instantaneous Description, be terminal for all m. Z needs only one quadruple:
q1 B B q1

and then for q1m we have output m + 1: since the machine starts by scanning a 1, this quadruple never applies, the initial description is terminal, and the m + 1 1's of the argument representation are left on the tape as the output. We now define a Turing computable function f(x) as that function which can be computed by some Turing Machine T; the tape of T is initially blank except for the conventional representation of the argument x, and the value of f(x) is the number of 1's that remain on the tape when T stops. The reference to 1's here as the form of the output is a reference to the representation of numbers by their total number of 1's, e.g. 6 = 111111 or, when stated conventionally as an argument, 1111111.
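The conventions just described can be made concrete with a small illustrative fragment (the helper names encode_argument and decode_output are ours); it also shows why the successor machine Z above needs to do nothing at all to its tape.

def encode_argument(n):
    # Conventional argument representation: n appears on the tape as n + 1 1's.
    return "1" * (n + 1)

def decode_output(tape):
    # The value computed is the total number of 1's left on the tape.
    return tape.count("1")

# Z halts at once (its single quadruple applies only to B, and the scan starts
# on a 1), so the tape is unchanged and the output is m + 1 = S(m).
for m in (0, 5, 8):
    tape = encode_argument(m)
    print(m, decode_output(tape))        # prints m alongside S(m) = m + 1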
The important point that Turing demonstrated is that there exists a Turing Machine that can, subject to conventions of input representation, compute any Turing computable function whatever, and such a machine is called a Universal Turing Machine.
A Universal Turing Machine is a little like an interpreter in computer programming, where by an interpreter we mean a symbolic language which describes a machine code program where there is some simpler code in the interpretative language and, as opposed to compilers, there is a (1, 1) correspondence between interpreter and machine code language, and where the translation and programming occur in the same step. If we think of a function f(x) which is Turing-computable, then we can find, by definition, a Turing Machine Z which can compute the function f. This means that for each value of x on the tape of Z a computation occurs leaving a string of symbols Sf(x) remaining on the tape. We have in the last section of this chapter done precisely this sort of thing, and in doing so we have behaved like a Universal Turing Machine. The technique used for translating the description of a Turing Machine onto the tape of the Universal Turing Machine will not be described here; a description can be found in Minsky.98
We now move on to a brief statement of recursive functions. There are certain functions which play an important part in recursive function theory and provide a particularly convenient description of most of mathematics. One of these is the successor function, which we have already mentioned; another is proper
subtraction, x - y, which we illustrated by a simple numerical example. If we add four further functions and two operations (called "composition" and "minimalization") we have an axiomatic type of system from which we generate recursive functions. We can show that all the functions and the two operations are Turing-computable, so it follows that recursive functions are computable.
The question then is as to whether there are any functions of classical mathematics which are not recursive, and if there are, are they computable? The answer is that there are such functions, of which x^y is an example. To deal with this we widen our concept of recursive functions. We then find that such functions as x^y are included and can also be shown to be computable.
This brings us to the final issue of whether all the well-formed formulae of classical mathematics are computable, to which the answer, as shown by Church5 and Turing,4 is "No". It is possible to construct a function which is acceptable within the domain of classical mathematics which is shown not to be computable. The details of these results and the closely associated results of Gödel3 we have already discussed sufficiently in Chapters Three and Five.
So much for our brief discussion of automata and metamathematics. We must now return to automata, other than of the Turing machine type, and then eventually we shall consider automata in other than tape form (i.e. neural nets). But first let us consider other tape automata.
It is important to appreciate that what is mathematically interesting may not be cybernetically interesting, and vice versa. Thus it is that a Turing Machine (a particular form of automaton) with many channels, as well as having tapes with many rows, rather than a single one, is of some considerable cybernetic interest, but of little mathematical interest. It is of little mathematical interest because it can be shown that all problems solvable by multi-channel, multi-tape machines can also be solved by single channel, single tape machines. This, however, still leaves a variety of different methods for solution, and these differences are of interest to the cybernetician, as a possible approach to brain models.
In this section we shall describe automata (which include Turing Machines) in general terms and subsequently we shall consider particular interpretations of some of these automata. The situation here is exactly the same as it is with formal calculi such as the propositional or lower functional calculus, and their interpretation of
descriptive systems, by way of semantic rules (cf. our discussion of models and theories in Chapter Five).
We shall first discuss a version of finite-state machines; we are assuming the word 'machine' to be sufficiently understood here in a broad sense as an "artificial" system, and we will not discuss it further. We are concerned with a machine or a model or a black box that is interacting with an environment. A stimulus or input to the machine is a response or output from the environment and vice versa. We first need to distinguish between a deterministic and a non-deterministic machine. A deterministic machine, and these are the ones we are mainly concerned with, is defined so that its output R at time t + 1 (written R(t + 1)) is determined by the input at time t and the previous history of the machine. We can write this:

R(t + 1) = F(H(t), S(t))        (2)
where H is the previous history and S is the input. F is some function relating the two main features to the output of the system. A non-deterministic machine which is of special interest is one where, for example, the input and history determine the probability of some response. This last type of machine is called probabilistic, and is a particular case of the class of non-deterministic machines (cf. our discussion of indeterminacy in Chapter Three). The internal state (Q) of the machine is defined as being dependent upon its previous internal state and on the previous input. We write this:
Q(t + 1) = G(Q(t), S(t))        (3)

where G is some function relating the previous state and the previous input. We can further distinguish between automata that are able to
grow (and are therefore potentially infinite) and fixed automata, which include virtually all those that have been referred to so far. It seems that automata that can grow, but cannot do so beyond some specified size, can do no more than an automaton that is fixed. We shall call these last automata growth automata, and the potentially infinite ones we shall call growing automata. Turing Machines can be thought of as being either 'growth' or 'growing' automata according to their precise definition. In passing it should be noted that when we say that a growth
automaton can do no more than a fixed automaton we are thinking in terms of computations and not in terms of methods, a point we have already made with respect to Turing Machines. This therefore does not mean that, from a cybernetician's view-point for example, growth automata are of no interest. There is some good reason to suppose the human brain comes precisely into this group of automata. We can further classify automata as continuous or
discrete;100 they can also be classified as synchronous or non-synchronous according to whether or not their input and output history can be described completely as occurring during certain discrete moments of time. They may, of course, have any combination of the various properties so far described. In general, we shall say that automata are devices of finite size at any particular time such that the defined output is a function of what has happened at the defined input. Our equations (1), (2) and (3) make this concept more precise.
The basic notion of a finite automaton is that by its present and future behaviour only some finite classes of possible histories can be distinguished or recognised. These histories are what we have called the 'internal states' (q1, q2, ..., qn) of the machine. Let us try to be clear about the recognition of a machine's history. Before doing so, however, let us follow Minsky98 and describe some representations of such nets. The first is a simple tabular form, where we represent our two defining functions F and G. A simple 'memory machine' has the following definition:

G (next state)      state q0    state q1
    input s0           q0          q0
    input s1           q1          q1

F (output)          state q0    state q1
    input s0           r0          r1
    input s1           r0          r1
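As a minimal sketch (ours, not Minsky's) of how the two tables define the machine's behaviour, the functions G and F of the memory machine can be written out directly and iterated over an input sequence:

# Next-state and output tables of the 'memory machine': the next state simply
# records the current input, and the output depends only on the current state,
# so the output at time t + 1 reports the input received at time t.
G = {("q0", "s0"): "q0", ("q1", "s0"): "q0",
     ("q0", "s1"): "q1", ("q1", "s1"): "q1"}
F = {("q0", "s0"): "r0", ("q1", "s0"): "r1",
     ("q0", "s1"): "r0", ("q1", "s1"): "r1"}

def run(inputs, state="q0"):
    outputs = []
    for s in inputs:
        outputs.append(F[(state, s)])    # output determined by state and input
        state = G[(state, s)]            # next state determined by state and input
    return outputs

print(run(["s1", "s0", "s0", "s1"]))     # prints ['r0', 'r1', 'r0', 'r0']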
This same automaton can also be represented by a state-transition diagram,
in which, given input 1 (at the base of an arrow) in state q0 (or Q0), the hexagon, the output is 0 (written on the arrow itself), and the new state is q1 (or Q1). We could write this same description in the form of a set of quadruples: