English Pages [144] Year 2006
Advanced Issues on Cognitive Science and Semiotics edited by Priscila Farias and João Queiroz
© 2006 All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher. Advanced Issues on Cognitive Science and Semiotics Edited by Priscila Farias and João Queiroz ISBN
Contents
Preface
1 A-life, organism and body: the semiotics of emergent levels (Claus Emmeche)
2 Semiosis and living membranes (Jesper Hoffmeyer)
3 Biosemiotics and the foundation of cybersemiotics (Søren Brier)
4 Information and direct perception: a new approach (Anthony Chemero)
5 Revisiting the dynamical hypothesis (Tim van Gelder)
6 The dynamical approach to cognition: inferences from language (Robert F. Port)
7 Models of abduction (Paul Bourgine)
Contributors
Preface João Queiroz & Priscila Farias
There seems to be a consensus that many of the classic problems in cognitive science are strongly connected to the fundamental issues of information, meaning and representation. There is, indeed, no domain of research interested in cognitive processes that has not been concerned, at some point, with these notions. At the same time, as many authors have noted, these seminal elements of investigation are frequently obscured by terminological conflicts, uncertainties and vagueness. Problematic as they may be, notions of information and representation, or models of them, are always present, even if only implicitly, in studies on cognitive systems, and they call for a reliable and sound theoretical basis.
North-American pragmatist Charles Sanders Peirce, founder of the modern theory of signs, defined semiotics as a science of the essential and fundamental nature of all possible varieties of meaning processes (semiosis). Peirce’s concept of semiotics as the ‘formal science of signs’, and the pragmatic notion of meaning as the ‘action of signs’, have had a deep impact on philosophy, psychology, theoretical biology, and cognitive science.
The reader will find here a collection of papers that attempt, from different perspectives, to relate semiotics and cognitive science with linguistics, logic, and philosophy of biology. As a first broad account of those subjects, it does not specifically focus on or privilege any of the different approaches proposed so far, but instead gives the reader the opportunity to consider the various directions and topics of research that emerge from such relations.
Several chapters focus on the logic of semiosis, and propose a range of different analyses of the nature of mediation as an action of interpretation. Emmeche and Hoffmeyer explore the explanatory power of the biosemiotic approach as an alternative (or complementary) theoretical frame to the physicalist point of view.
Emmeche’s chapter comments upon some of the ‘open problems’ in artificial life from the perspective of qualitative organicism and the emergent field of Peircean-oriented biosemiotics. Hoffmeyer is interested in the rise and evolution of qualia. According to the author, a scientifically consistent theory might be developed on the basis of what he calls semiotic materialism. Semiotic materialism considers qualia as an evolved instantiation of a semiotic freedom that was latently present in our universe from the beginning. It claims that our universe has a built-in tendency to produce organized systems possessing increasing semiotic freedom, in the sense that the semiotic aspect of the system’s activity becomes more and more autonomous relative to its material basis.
Brier proposes a transdisciplinary approach able to postulate a unified theory of information, cognition and communication. His proposal involves setting up an epistemological framework that takes into account recent developments from ethology, second-order cybernetics, cognitive semantics and pragmatic linguistics, and that builds on basic concepts from biosemiotics. According to his view, all living organisms are immersed in a web of signs, participating in what he defines as sign games, which eventually lead to Wittgensteinian human language games.
Chemero challenges the notion of information in Gibsonian-oriented approaches to perception. He proposes an understanding of direct perception and information that differs from the ecological psychology orthodoxy, the Turvey-Shaw-Mace view. According to his view, perception is direct when the perceiver and perceived are coupled and their relationship is unmediated by mental representations. Dynamical hypotheses on cognition, by rejecting the idea that cognition is to be explained exclusively in terms of internal representations, are close to Chemero’s approach.
Two papers discuss the use of dynamical systems theory (DST) as a strategy for modeling cognition. Van Gelder is interested in exploring the “essence of dynamical cognitive science” and in showing how it differs from traditional computational cognitive science. His strategy is to oppose the traditionalist slogan that cognitive agents are digital computers to the dynamicist claim that cognitive agents are dynamical systems. Port’s argument concerns an essentially temporal (continuous, historical) feature of verbal language generation and processing at a phonological level of description. He suggests that, to constitute a general dynamic theory of language, from phonology to semantics, we should understand linguistic events and structures as temporal processes.
One of Peirce’s most original contributions to the study of cognitive inference is his development of the concept of abduction. However, Peirce’s abductive inference has so far received relatively little attention. In the last chapter of this book, Bourgine proposes axiomatic and geometric models for abduction, close to the framework of belief revision, in an effort to formally capture what differentiates it from induction and deduction, while preserving a coherent relation between the three types of reasoning envisaged by Peirce. Bourgine’s models are exemplified in the context of three perspectives from the field of cognitive science: cognitivism, connectionism and constructivism.
The various strategies presented here may be considered non-standard, and therefore remain peripheral in their fields. It is still too early to properly evaluate all the perspectives opened up by the research frontiers presented in this book. Indeed, it is premature to assert that they may one day constitute new scientific paradigms. What all these perspectives suggest, however, are alternative approaches to basic principles of cognition, information and representation. We hope that the ideas presented in this book, by offering innovative and consistent propositions, may bring a breath of fresh air and point out important new directions to be followed in the future.
Acknowledgments
The authors would like to acknowledge the support received, in the form of research grants, from FAPESP - The State of São Paulo Research Foundation.
A-life, organism and body: the semiotics of emergent levels Claus Emmeche
Introduction: Organicist philosophies
Artificial Life research raises philosophical questions, just as cognitive science involves philosophy of mind. No clear demarcation line can be drawn between science and philosophy; every scientific research programme involves metaphysical assumptions and decisions about how to interpret the relations between experiment, observation, theoretical concepts and models (this was also evident when Artificial Life was originally formulated by C.G. Langton in the late 1980s; cf. Emmeche 1994). Yet we should not conflate questions that may be answered by science with questions that by their very nature are conceptual and metaphysical. The aim of this paper is to address, from the perspective of biosemiotics, a subset of the open problems (as described by Bedau et al. 2000) raised by Artificial Life research, including ‘wet Alife’, about the general characteristics of life; the role and nature of information; how life and mind are related; and their relations again to culture and machines. Biosemiotics, as the study of communication and information in living systems, may provide some inspiration and conceptual tools for inquiry into these theoretical and philosophical issues. First, it is apt to introduce briefly organicism as a mainstream position in philosophy of biology, together with a variant called qualitative organicism, and then to introduce biosemiotics as a non-standard philosophy of biology. Neither qualitative nor mainstream organicism is a specific research paradigm; both are more like general and partly intuitive stances on how to understand living systems in the context of theoretical biology.
Organicism. In its mainstream form (cf. Emmeche 2001) organicism endorses these theses: (a) non-vitalism (no non-physical occult powers should be invoked to explain living phenomena); (b) non-mechanicism (living phenomena cannot be completely described merely by mechanical principles, whether classical or quantum); (c) emergentism (genuine new properties are characteristic of life as compared with ‘purely’ physical non-living systems) implying ontological irreducibility of at least some processes of life (though methodological reductionism is fully legitimate); (d) the teleology of living phenomena (their purpose-like character) is real, but at least in principle explainable as resulting from the forces of blind variation and natural selection, plus possibly some additional ‘order for free’ (physico-chemical self-organization). The emergent properties studied within an organicist perspective are seen as material structures and processes within several levels of living systems (developmental systems, evolution, genetic and biochemical networks, etc.), all of which are treated as objects with no intrinsic experiential properties. Mayr (1997) acknowledged his position as organicist, and mainstream organicism is widely accepted among biologists, even though the position has often been confused with vitalism (see also El-Hani & Emmeche 2000, Gilbert & Sarkar 2000). On this view, there are no principled obstacles to the scientific construction of life and mind as emergent phenomena by evolutionary or bottom-up methods.
Qualitative Organicism. This is a more radical position, differing from mainstream organicism in its appraisal of teleology and phenomenal qualities. It emphasizes not only the ontological reality of biological higher-level entities (such as self-reproducing organisms being parts of historical lineages) but also the existence of qualitative experiential aspects of cognitive behavior. When sensing light or colors, an organism is not merely performing a detection of external signals which then get processed internally (described in terms of neurochemistry or information processing); something more is to be told if we want the full story, namely about the organism’s own experience of the light. This experience is seen as real. It may be said to have a subjective mode of existence, yet it is an objectively real phenomenon (Searle 1992 emphasized the ontological reality of subjective experience, yet most of the time only in a human context). As a scientific stance, qualitative organicism is concerned not only with the category of ‘primary’ measurable qualities (like shape, magnitude, and number) but also with inquiry into the nature of ‘secondary’ qualities like color, taste, sound, feeling, and the basic kinesthetic consciousness of animal movement. A seminal example of qualitative organicism is Sheets-Johnstone 1999. The teleology of living beings is seen as an irreducible and essential aspect of living movement, in contrast to mere physical change of position. This teleology is often attributed to a genuine form of causality (‘final causation’, cf. Van de Vijver et al. 1998), and qualitative organicism’s assessment of the ‘reality’ of an instance of artificial life will partly depend on how the causality of the artificial living system is interpreted.
Biosemiotics. The study of living systems from the point of view of semiotics, the theory of signs (and their production, transfer, and interpretation), mainly in the tradition of the philosopher and scientist Charles Sanders Peirce (1839-1914) but also inspired by the ethologist Jakob von Uexküll (1864-1944), has a long and partly neglected history in 20th-century science (see Kull 1999 for the history, Hoffmeyer 1996 for an introduction). It re-emerged in the 1990s and is establishing itself as a cross-disciplinary field attempting to offer alternatives to a gene-focused reductionist biology (much like one of the aims of Artificial Life, and indeed inspired by it), by gathering researchers for new approaches to biology, or a new philosophy of biology, or ultimately with the hope of bridging the gap between science and the humanities. The semiotic approach means that cells and organisms are not seen primarily as complex assemblies of molecules, insofar as these molecules (rightly described by chemistry and molecular biology) are sign vehicles for information and interpretation processes, briefly, sign action or semiosis. A sign is anything that can stand for something (an object) to some interpreting system (e.g., a cell, an animal, a legal court), where ‘standing for’ means ‘mediating a significant effect’ (called the interpretant) upon that system. Thus, semiosis always involves an irreducibly triadic process between sign, object and interpretant. Just as in chemistry we see the world from the perspective of molecules, in semiotics (as a general logic of sign action) we see the world from the perspective of sign action, process, mediation, purposefulness, interpretation, generality. Those are not reducible to a dyadic mode of mechanical action-reaction, or merely efficient causality. The form of causality governing triadic processes is final causation. Organisms are certainly composed of molecules, but these should be seen as sign vehicles having functional roles in mediating sign action across several levels of complexity, e.g., between single signs in the genotype, the environment, and the emerging phenotype. Biosemiotics is a species of qualitative organicism for these reasons: (i) It holds a realist position regarding sign processes of living systems, i.e., signs and interpretation processes are not merely epistemological properties of a human observer but exist as well in nature, e.g., in the genetic information system (El-Hani et al. 2006). (ii) Biosemiotics interprets the teleology of sign action as related to final causation (Hulswit 2002).
(iii) The qualitative and species-specific ‘subject’ of an animal (i.e., its Umwelt, understood here as a dynamic ‘functional circle’ of an internal representation system interactively cohering in action-perception cycles with an environmental niche) can to some extent be studied scientifically by the methods of cognitive ethology, neurobiology and experimental psychology, even though the experiential feeling of the animal is closed to the human Umwelt (on Umwelt research, see papers in Kull 2001). (iv) Signs have extrinsic, publicly observable aspects as well as intrinsic phenomenal aspects. We can only access the meaning of a sign from its observable effects, a good pragmaticist principle indeed, but observation of the phenomenal experiences of another organism may be either impossible or highly mediated. However, reality exceeds what actually exists as observable. (v) Even though sign activity generally can be approached by formal and logical methods, sign action has a qualitative aspect as well. Due to the principle of inclusion (Liszka 1996), every sign of a higher category (such as a legisign, i.e., a sign of a type) includes a sign of a lower category (e.g., a sinsign, i.e., a sign of a token); a type has somehow to be instantiated by a token of it, just as any sign must be embodied. A symbol is not an index, but includes an indexical aspect, which again involves an icon. All signs must ultimately include (even though this might not be pertinent for their phenomenology) qualisigns, which are of the simplest possible sign category, hardly functioning as mediating any definite information, yet being signs of quality and thus having a phenomenal character of feeling (Peirce preferred a type-token-tone trichotomy to the type-token dichotomy; a tone is like a simple feeling). The argument (v) may strike a reader not acquainted with Peirce as obscure, but it is a logical implication of the ontological-phenomenological basis of Peirce’s semiotics and points to an interesting continuity between matter, life and mind, or, to phrase it more precisely, between sign vehicles as material possibilities for life, sign action as actual information processing, and the experiential nature of any interpretant of a sign, i.e., the effects of the sign upon a wider mind-like system (Emmeche 2004). To recapitulate, the biosemiotic notion of life is a notion of a complex web of sign and interpretation processes, typically with the single cell seen as the simplest possible autonomous semiotic system.
Synthetic biosemiosis?
Computers are semiotic machines (Nöth 2003), and computers or any other adequate medium, such as a complex chemistry, can in principle function as a medium of genuine sign processes. Not all sign processes need to be biological, although all signs seem to involve at some point in their semiosis interpreters who typically would be biological organisms. Remember the distinction above between the interpretant as the effect or meaning of a sign and the interpreting system (or interpreter) as the wider system in which semiosis is taking place. So, what then is the biosemiotic stance regarding true synthetic life or ‘wet’ artificial life? To answer this question, we have to consider, though more carefully than can be done here, (a) three non-exclusive notions of ‘life’; (b) the relation between the notions of organism, animal, body, and the general embodiment of various levels of sign processes; and (c) the semiotics of scientific models. This need for caution in assessing the degree of genuineness of synthetic life in other media is related to another organicist theme: the thesis of irreducibility of levels of organization, or, as we shall interpret that thesis here, levels of embodied sign action.
‘Life’ in Lebenswelt, biology and ontology
Synthetic life provokes, of course, the general question “what is life?”, partly because of an intuition we (or some people) have from our ordinary life, or, as the German philosophers would say, from the Lebenswelt (life world) of human beings, that life (like death) is a basic condition we as humans can hardly control, know completely, or create. Now science seems to teach us otherwise. One contribution to clarifying the issue is to be aware that there exist at least three non-exclusive notions of life.
I will briefly sketch these:
Lebenswelt life: A set of diverse, non-identical, culture-specific notions (determined by intuitive, practical, ideological, or social factors) of what it is to be ‘alive’, what life and death are, why being alive and flexible is more fun than being dead and rigid, and so on. Science is distinct from, but not independent of, forms of the human Lebenswelt (just as scientific concepts can be seen as presupposing and being a refinement of ordinary language). For biologically relevant notions of life, we can talk about ‘folk’ and ‘experiential’ biology (Emmeche 2000).
Biological life: The so-called life sciences are not interested in the life of the Lebenswelt as a normative phenomenon, but in the general physical, chemical and biological properties of life processes, as conceived of within separate paradigms of biology. This leads to several distinct ‘ontodefinitions’ of life (Emmeche 1997), such as life as evolutionary replicators, life as autopoiesis, or life as sign systems (and probably many more). However, advances in biotechnology and biomedicine will tend to mix up, ‘hybridize’ or create new boundary objects (sensu Star & Griesemer 1989) between the domains of bioscience and a life-world deeply embedded in technoscience.
Ontological life: Depending upon the ontology chosen, an ontological notion of life is marked by distinctions from other, similarly general and essential domains of reality. Take the ontology of Peirce, for instance. Here life is of the category of Firstness; it does not only include life in organisms, evolution or habit taking (which are of the category of Thirdness). Rather, life is seen as an all-inclusive aspect of the developing cosmos, on a par with spontaneity and feeling: “insofar as matter does exhibit spontaneous random activity (think of measurement error or Brownian motion), it still has an element of life left in it” (Reynolds 2002, 151). Biosemiotics typically does not use Peirce’s broad ontological notion of life, but construes a notion of life derived from contemporary biology, namely, as mentioned, life as organic sign-interpreting systems. But biosemiotics entails a thesis of the reality of ideal objects, including possibilities like a fitness space, virtuality in nature, or tendencies in evolution and development, and so “the possibilities for final causes to prefer one tendency over another. Thus biosemiotics entails an ontological revolution admitting the indispensable role of ideality in this strict sense in the sciences” (Stjernfelt 2002, 342).
The invention of synthetic ‘wet’ life may affect all three non-exclusive preoccupations with understanding life, that is, life within a cultural context, life as studied by science, and life as a general metaphysical aspect of reality. To approach how this may come about, we must analyze some levels of embodied life from the perspective of an emergentist ontology.
Level-specific forms of life and embodiment
As a species of organicism, biosemiotics is an emergentist position. However, emergent levels of sign processes have not often been explicitly discussed (von Uexküll 1986, Kull 2000, El-Hani et al. 2004). The account given here should be seen as preliminary; the important point is not the number of levels (more fine-grained analyses may be done) but the very existence of separate levels of embodied sign action. See also Table 1.
The body of physics | The body of biology | The body of ‘evo-devo’ research | The body of zoology | The body of anthropology | The body of sociology
Complex dissipative, self-organizing structures | Physiologic-homeostatic units with a genetic code-plurality, and irritability | Vegetative swarm of cells coordinating multi-cellular communication with multiple organic codes | Self-moving, action-perception cycles, animation, kinesthetics | Language, culture-specific Lebenswelt | The life in societal institutions, habit formation
Table 1. Ordering relations between forms of embodiment. The epistemic dimension (top row) is shown by organizing those forms according to different domains of science each constituting its own objects; the ontic dimension (bottom row) is implied by an underlying ontology of levels of organization in Nature. Increasing specificity from left to right; for each new level the previous one is presupposed.
The emerging forms of embodiment of life could be suspected merely to reflect a historically contingent division of sciences, an objection often raised against simple emergentist level ontologies. Thus one should respect the fallibilist principle and never preclude that new discoveries will fundamentally change the way we partition the levels of nature. The point is that, to the best of our present knowledge, we can construct some major modes of embodiment in which ‘life’ and ‘sign action’ play crucially different roles, and in which we can place such broad phenomena as life, mind, and machines. Reflexivity is allowed for, so even the scientific description of these phenomena can be placed in this overall scheme of processes. A consequence is that wet artificial life is seen as a hybrid phenomenon of ‘the body of biology’ and ‘the body of sociology’, as will be explained below.
The emergent modes of embodiment, increasing in specificity (Table 1), are one-way inclusive and transcending: the human body includes the animate organism, which again presupposes multicellularity and basic biologic autopoiesis (but not vice versa). A human body (e.g., the body of a child, a soccer player, or a diplomat) as studied by anthropology is something more specific (i.e., in need of more determinations) than its being as an animal, thus transcending the mere set of animate properties (such as having an Umwelt) and organismic properties (like growth, metabolism, homeostasis, reproduction), just as an organism is a physical system, yet transcends the basic physics of that system. That an entity or process at an emergent level Z transcends phenomena at level Y has two aspects. One is epistemic, i.e., “Z’s description cannot adequately be given in terms of a theory generally accounting for Y, even though this Z-description in no way contradicts a description of the Y-aspects of Z”. The other is ontological, i.e., “crucial properties and processes of Z are of a different category than the ones of Y, even though they may presuppose and depend on Y”. Thus, a Z-entity is a highly specific mode of realizing a Y-process, not explained by Y-theory. The organism is a physical processual entity with a form of movement so specific that physics (as a science) cannot completely account for that entity. The organism is a very special type of physical being, as it includes certain purposeful (functional) part-whole relations, based upon genuine sign systems of which the genetic code is the best known but not the only example. Here is a brief characterization of the levels.
‘Life’ as self-organization far from equilibrium
Physics deals with three kinds of objects: first, general forces in nature, particles, general bodies (matter in bulk), and the principles (‘laws’) governing their action; second, more specifically, the structural dynamics of self-organized bodies (galaxies, planets, solid matter clusters, etc.); third, physical aspects of machines (artifacts produced by human societies and thus fully explainable only with the additional use of social sciences, like history of technology). One has often seen attempts to reduce all of physics to a formalism equivalent to some formal model of a machine, but there are strong arguments against the completeness of this programme (Rosen 1991), i.e., mechanical aspects of the physical world are only in some respects analogous to a machine. Some of the general properties of bodies studied in physics have a teleomatic character (a kind of directedness or finality, cf. Wicken 1987), which may be called ‘thermo-teleology’, as this phenomenon of directedness is best known from the second law of thermodynamics (a directedness towards disorder), or from opposing self-organizing tendencies in far-from-equilibrium dissipative systems.
Often when physicists talk about ‘life’ in the universe, the reference is to preconditions for biological life, such as self-organizing non-equilibrium dissipative processes, rather than to life at the following level.
Life as biofunctionality - organismic embodiment
A biological notion of function is not a part of physics, while it is crucial for all biology. Biofunctionality is not possible unless a living system is self-organizing in a very specific way, based upon a memory of how to make components of the system that meet the requirement of a functional (autopoietic and homeostatic) metabolism of high specificity. For earthly creatures this principle is instantiated as a code-plurality between a ‘digital’ genetic code of DNA, a dynamic regulatory code of RNA (and other factors as well), and a dynamic mode of metabolism involving molecular recognition networks of proteins and other components (see the semiotic analysis by El-Hani et al. 2004). Symbolic, indexical and iconic molecular sign processes are all involved in protein synthesis. The symbols (using DNA triplets as sign vehicles) seem to be a necessary kind of sign for a stable memory to pick out the right sequences for the right job of metabolism.
This establishes a basic form of living embodiment, the single cell (a simple organism) in its ecological niche. This presupposes the workings of ‘the physical body’ as a thermodynamic non-equilibrium system, but transcends that general form by its systematic symbolic memory of organism components and organism-environment relations. Biosemiotics posits that organismic embodiment is the first genuine form of embodiment in which a system becomes an autonomous agent “acting on its own behalf” (cf. Kauffman 2000), i.e., taking action to secure access to available resources necessary for continued living. It is often overlooked that the subject-object structure of this active agent is mediated not only energetically, by a structured entropy difference between organism and environment, but also semiotically, by signs of this difference: signs of food, signs of the niche, signs of where to be, what to eat, and how to trigger the right internal processes of production of organismic components at the right time. The active responsivity of the agent organism (based upon observable molecular signs) has, as an ‘inner’ dimension, a quality of feeling, implied by what in Table 1 is called irritability at the level of a single cell. Irritability is a real phenomenon, well known in biology, logically in accordance with a basic evolutionary matter-mind continuity, and rationally conceivable, though it is impossible for humans to sense or perceive ‘from within’ or empathetically know ‘what it feels like’ for, say, an amoeba or an E. coli. It is highly conceivable that synthetic systems analogous to this level of embodiment may be produced some day.
Life as biobodies - coordinate your cells!
Characteristics like multiple code-plurality (involving the genetic code, signal transduction codes, and other organic codes, see Barbieri 2003) and forms of semiotic coordination between cell lines cooperating and competing for resources within a multicellular plant or fungus are characteristics of ‘the evolution of individuality’ (Buss 1987). The ‘social’ life of cells within a lineage of organisms with alternating life cycles constitutes a special level of embodied biosemiosis, and a special coupling of evolution and development. It is the emergence of the first biobodies, in which the whole body constrains the growth and differentiation of its individual cells (a form of ‘downward causation’, cf. El-Hani & Emmeche 2000). This multicellular level of embodiment corresponds to what was called a vegetative principle of life in Aristotelian biology, like that of a plant.
Life as animate - moving your self!
Here the body gets animated, we see a form of ‘nervous code’ (still in the process of being decoded by neuroscience), and we see the emergence of animal needs and drives. When we consider animal mind and cognition, the intentionality of an animal presupposes the simpler forms of feelings and irritability we stipulate in single cells (including the ‘primitive’ free-living animals, such as protozoa, lacking a nervous system), yet transcends these forms by the phenomenal qualities of the perceptual spaces that emerge in functional perception-action cycles as the animal’s Umwelt. Proprioceptive semiosis is crucial for phenomenal as well as functional properties of animation (Sheets-Johnstone 1999). More generally, the animal body is a highly complex and specific kind of multicellular organism (a biobody) that builds upon the simpler systems of embodiment, such as physiological and embryogenetic regulation of the growth of specific organ systems, including the nervous system. These regulatory systems are semiotic in nature, and rely on several levels of coded communications within the body and their dynamic interpretations (Hoffmeyer 1996, Barbieri 2003). The expression “the body of zoology” in Table 1 is used to emphasize both its distinctness as a level of embodiment, and that zoology, instead of being simply part of an old-fashioned division of the sciences, should be the study of animated movement, including its phenomenal qualities.
Life as anthropic - talk about life!
With the emergence of humans come language, culture, division of work, desires (not simply needs, but culturally informed needs), power relations, etc. The political animal not only lives and makes tools, but talks about it. Within this anthroposemiosis (von Uexküll 1986), the body is marked by differences of gender (not simply sex), age, social groups, and cultures.
Life as societal - get a life!
After humans invented agriculture and states, more elaborate institutions could emerge, and social groups became informed and enslaved by organizational principles of all the sub-systems of a civilized real society (work, privacy, politics, consumption, economy, law, art, science, technology, etc.). Humans discover the culture-specificity of human life, ‘them’ and ‘us’.
Reflexivity creeps in as civilization creates more and more ways to get a life. The body becomes societal (marked by civil life) and cyborgian (crucially dependent upon technology, machines). The political animal becomes cosmopolitical. The body is marked not only in the anthropic sense (see above), but also by institutions. The cyborg body is a civilized one, dependent upon technoscience (to keep ‘us’ healthy and young) and, because of the dominant forms of civilization, ultimately co-determined by the globally unequal distribution of wealth. One can foresee Artificial Life research playing an increasing role in the contest over bodies and biopower as we approach the ‘posthuman’ condition (cf. Kember 2003).
Hybridization and downward causation
This tour de force through some levels of embodiment makes a note on entanglement and hybridization relevant. The neat linearity implied by the concepts of inclusion and increasing specificity, and by the chain of levels (which admittedly resembles an idea of progress), does not hold without restriction. For instance, the very possibility of ‘human’ creation bottom-up of new forms of life seems to suggest some complication (as human purposes may radically inform the natural teleology of what looks like biobodies). Already the culture-determined breeding of new breeds of cattle, crops, etc. suggests that even though biology should be enough to account for the body of a non-human animal, the human forms of signification interfere with pure biosemiosis, and create partly artificial forms of life like the industrialized pig or weird-looking pet dog breeds. In some deep sense, cows and pigs within industrialized agriculture are already cyborgs, partly machines, partly animals (cf. Haraway 1991). Culture mixes with nature in a ‘downward causation’ manner, and thus, the hierarchy of levels is ‘tangled’ (Hofstadter 1979) and ‘natural’ and ‘cultural’ bodies hybridize (Latour 1991). We might expect something similar to apply if we assess the status of ‘wet’ artificial life, as reported by Rasmussen et al. (2004) and Szostak et al. (2001). Here, however, we need to consider not only the biosemiotics of life, but also the special anthroposemiosis of experimental science, and especially the use of models and organisms to study life processes.
Models of life
Pattee (1989) was emphatic about the distinction between a model of life and a realization of some life process. In the early phase of Artificial Life research, focus was put on the possibility of ‘life in computers’, and thus the question of computational simulations vs. realizations was crucial. Considering the possibility of a ‘wet’ bottom-up synthesis of other forms of life, we need to expand the kind of analysis given by Pattee to include not only the role of computational models in science in general and Artificial Life in particular, but also the very notion of a model in all its variety, and especially the notion of model organisms in biology. It is beyond the scope of this note to make any detailed analysis here, so in this final section only some hints will be given. Let us make a preliminary, almost Borgesian classification of models in biology like the following.
Formal models and simulations. Highly relevant for ‘software’ Artificial Life. Such models are, for their theoretically relevant features, computational and medium-independent, and thus disembodied, and would hardly qualify as candidates for ‘true’ or ‘genuine’ life, from the point of view of organicism or biosemiotics. Semiotically, the map is not the territory; a model is not the real beast.
Mechanical and ICT-models. The paradigmatic example here is robots. Robots may provide good clues to study different aspects of animate embodiment, but again, if taken not as models (which they obviously may serve as) but as proclaimed real ‘machine bodies’ or ‘animats’, their ontology is a delicate one. They are built from (often ready-made) pieces of information and communication
technology; they may realize a certain kind of ‘machine semiosis’ (Nöth 2003), but their form of embodiment is radically different from that of real animals (see also Ziemke 2003, Ziemke & Sharkey 2001).
Evolutionary models. This label collects a large class of dynamic models not only across the previous two categories (because they may be either computational or mechanical, cf. also the field of ‘evolutionary robotics’) but also combining evolutionary methods with real chemistries. Many sessions of previous Artificial Life conferences have been devoted to these models.
“Model organisms”. The standard notion of a model here is to study a phenomenon, say regulation of cancer growth in humans, by investigating the same phenomenon in another but in some senses similar organism like the house mouse. In experimental biology, choosing ‘the right organism for the right job’ has proved highly important for a fruitful research programme. Drosophila genetics is a well-known case in point. The lineage or population of model organisms is often deeply changed during the process of adapting it to do its job properly, and it is apt to talk about a peculiar co-evolution of this population and the laboratories using it in research. E.g., Kohler (1994) describes how Drosophila was introduced and physically redesigned for use in genetic mapping and sees the lab as a special kind of ecological niche for a new artificial animal with a distinctive natural history.
“Stripping-down” models. A method of investigating the minimal degree of complexity of a living cell by removing more and more genetic material to see how few genes are really needed to keep autopoiesis going (cf. Rasmussen et al. 2004). The problem is, of course, as is well known from parasitology, that the simpler the organism becomes, the more complex an environment is needed, so by adding more compounds to the environment, you can get along with fewer genes.
The organism is always part of an organism-environment relation, which makes any single measure of complexity, such as genome size, problematic.
Bottom-up models. The term ‘bottom-up’ may be used for all three major areas of Artificial Life, in relation to ‘software’, ‘hardware’ and ‘wet’ models. Considering only the latter here (e.g., Szostak et al. 2001), the crucial question is to distinguish between, on the one hand, a process aimed at by the researchers which is truly bottom-up emergent, creating a new autonomous level of processes such as growth and self-reproduction pertaining to biofunctionality and biobodies, and on the other hand, something more similar to engineering a robot from pre-fabricated parts, that is, designing a functioning protocell but under such special conditions that one might question its exemplifying a genuine agent or organism. Just as exciting as they are as examples of advances in wet Artificial Life research, just as perplexing are they as possible candidates for synthetic
true organisms, because their process of construction is highly designed by the research team. In this way, they are similar to the ‘model organisms’ in classical experimental biology, but with the crucial difference that no one doubts the latter to be organisms, while it is question-begging to proclaim the former to be.
The Life-Model entanglement problem
A special kind of hybridization is of interest here: the co-evolution of human researchers and a population of model organisms. As hinted at above, also in the case of wet Artificial Life systems, the ‘real’ life and the model of life get entangled. This raises questions not only about sorting out, or ‘purifying’ as Latour (1991) would say, biosemiosis from anthroposemiosis to the extent that this is possible at all, but also about considering in more detail the nature of the very entanglement. The hybridity of human design ‘top-down’ and nature’s open-ended, evolutionary ‘design’ bottom-up creates a set of complex phenomena that need further critical study.
Conclusion
From an organicist perspective, real biological life involves complex part-whole relationships, not only regarding the structured network of organism-organs-cells relationships, but also regarding the environment-(Umwelt)-organism relations. The biosemiotic trend in organicism is needed to understand natural life (the plants and animals we already know) from a scientific perspective, but is not enough to evaluate the complex question of “what is life?” as recently raised by synthetic chemistry approaches to wet artificial life. Here, considerations of a more ontological and metaphysical kind, and considerations inspired by the philosophy of science (and of scientific models), are also needed. Some of these have been presented, others merely hinted at.
Acknowledgements
I thank Frederik Stjernfelt, Simo Køppe, Charbel Niño El-Hani, João Queiroz, Jesper Hoffmeyer, Mia Trolle Borup and Tom Ziemke for stimulating discussions. The work was supported by the Faculty of Science, University of Copenhagen, and the Danish Research Foundation for the Humanities.
References
Barbieri, M. 2003. The Organic Codes. An introduction to semantic biology. Cambridge: Cambridge University Press. Bedau, M., McCaskill, J., Packard, N., Rasmussen, S., Adami, C., Green, D., Ikegami, T., Kaneko, K. and Ray, T. 2000. “Open problems in artificial life”. Artificial Life 6: 363-376. Buss, L.W. 1987. The Evolution of Individuality. Princeton: Princeton University Press. El-Hani, C. N. and Emmeche, C. 2000. “On some theoretical grounds for an organismcentered biology: Property emergence, supervenience, and downward causation”. Theory in Biosciences 119 (3/4): 234-275. El-Hani, C. N., Queiroz, J., and Emmeche, C. 2004. “A Semiotic Analysis of the Genetic Information System”. Semiotica 160 (1/4): 1-68.
Emmeche, C. 1994. The garden in the machine. The emerging science of artificial life. Princeton: Princeton University Press. __1997. “Defining life, explaining emergence”. Online paper at: http://www.nbi.dk/ ~emmeche/ __2000. “Closure, Function, Emergence, Semiosis and Life: The Same Idea? Reflections on the Concrete and the Abstract in Theoretical Biology”. In: J. L. R. Chandler & G. Van de Vijver, eds.: Closure: Emergent Organizations and Their Dynamics. Annals of the New York Academy of Sciences 901: 187-197. __2001. “Does a robot have an Umwelt? Reflections on the qualitative biosemiotics of Jakob von Uexküll”. Semiotica 134 (1/4): 653-693. __2004. “Causal processes, semiosis, and consciousness”. In: J. Seibt (ed.), Process Theories: Crossdisciplinary Studies in Dynamic Categories. p. 313-336. Dordrecht: Kluwer. Gilbert, S. F. & Sarkar, S. 2000. “Embracing complexity: Organicism for the 21st Century”. Developmental Dynamics 219: 1-9. Haraway, D. 1991. Simians, Cyborgs and Women: The Reinvention of Nature. New York: Routledge. Hoffmeyer, J. 1996. Signs of Meaning in the Universe. Bloomington: Indiana University Press. Hofstadter, D.R. 1979. Gödel, Escher, Bach: an Eternal Golden Braid. London: The Harvester Press. Hulswit, M. 2002. From Cause to Causation. A Peircean Perspective. Dordrecht: Kluwer. Kauffman, S. 2000. Investigations. Oxford: Oxford University Press. Kember, S. 2003. Cyberfeminism and Artificial Life. London & New York: Routledge. Kohler, R.E. 1994. Lords of the Fly. Drosophilae Genetics and the Experimental Life. Chicago: The university of Chicago Press. Kull, K. (ed.) 2001. Jakob von Uexküll: A paradigm for biology and semiotics. Berlin & New York: Mouton de Gruyter (= Semiotica 127(1/4): 1–828). __2000. “An introduction to phytosemiotics: Semiotic botany and vegetative sign systems”. Sign Systems Studies 28: 326-350. __1999. “Biosemiotics in the twentieth century: A view from biology”. Semiotica 127(1/4): 385–414. Latour, B. 1991. 
We have never been Modern. New York: Harvester Wheatsheaf. Liszka, J. J. 1996. A general introduction to the semeiotic of Charles Sanders Peirce. Bloomington: Indiana University Press. Mayr, E. 1997. This is Biology. The science of the Living World. Cambridge: Harvard University Press. Nöth, W. 2003. “Semiotic machines”. S.E.E.D. Journal (Semiotics, Evolution, Energy, and Development) 3 (3): 81-99. Pattee, H.H., 1989. “Simulations, realizations, and theories of life”. In: Artificial Life (Santa Fe Institute Studies in the Sciences of Complexity, Vol. VI), C.G. Langton (ed.). pp. 63-77. Redwood City: Addison-Wesley. Rasmussen, S., Chen, L., Deamer, D., Krakauer, D., Packard, N., Stadler, P., Bedau M. 2004. “Transitions from nonliving and living matter”. Science 303: 963-965. Reynolds, A., 2002. Peirce’s Scientific Metaphysics. The Philosophy of Chance, Law, and Evolution. Nashville: Vanderbilt University Press. Rosen, R. 1991. Life Itself. A Comprehensive Inquiry Into the Nature, Origin, and Fabrication of Life. New York: Columbia University Press. Searle, J. 1992. The Rediscovery of the Mind. Cambridge, Mass.: MIT Press. Sheets-Johnstone, M. 1999. The Primacy of Movement. Amsterdam & Philadelphia: John Benjamins.
Szostak, J., Bartel, D., Luisi, P. 2001. “Synthesizing life”. Nature 409: 383-390. Star, S.L. & Griesemer J.R. 1989. “Institutional ecology, ‘translations’, and Boundary Objects: amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39”. Social Studies of Science 19: 387-420. Stjernfelt, F. 2002. “Tractatus Hoffmeyerensis: Biosemiotics expressed in 22 basic hypotheses”. Sign Systems Studies 30(1): 337-345. Van de Vijver, G., Salthe, S. and Delpos, M. (eds.) 1998. Evolutionary Systems: Biological and Epistemological Perspectives on Selection and Self-Organization. Dordrecht: Kluwer. von Uexküll, T. 1986. “Medicine and semiotics”. Semiotica 61 (3/4): 201-217. Wicken, J.S. 1987. Evolution, Thermodynamics, and Information. Extending the Darwinian Program. Oxford: Oxford University Press. Ziemke, T. 2003. “What’s that thing called embodiment?”. In: R. Alterman and D. Kirsh (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society. pp. 1305-1310. Mahwah, NJ: Lawrence Erlbaum. Ziemke, T. and Sharkey, N. E. 2001. “A stroll through the worlds of robots and animals: Applying Jakob von Uexküll’s theory of meaning to adaptive robots and artificial life”. Semiotica 134(1-4): 701-746.
Semiosis and living membranes Jesper Hoffmeyer
External or internal explanation?
The idea that the world is such a thing that it can be objectively described, so to say, in “the view from nowhere”, as Thomas Nagel has expressed it (Nagel 1986), has nourished the scientific spirit for centuries, and there can be no doubt that this strategy has been enormously successful. Undeniably, however, this century has witnessed a decline in the general confidence in this idea. The deeper science digs into the material basis of nature, the more sophisticated and complex are the problems it must confront. Thus at the most fundamental level we have witnessed a gradual disintegration of the concept of elementary particles into an ever-growing series of very abstract entities threatening our idea of materiality. But even more disquieting perhaps are the subtle problems posed by the observer himself as a human being: how can it be the case that one of the people in the world is me? (Nagel 1986:13). This strange feeling of being a first person singular seems to encompass qualities (sometimes called qualia), which are not describable in a language that cannot go beyond the categories of the 3rd person singular or plural (Searle 1992). The error of confusing 1st person experience and 3rd person experience comes easily to our thinking, as was illustrated by a recent newspaper story published under the headline: “Scientists will soon be able to see consciousness”. The facts behind this alarming title turned out to be a quote from an expert in mathematical modeling working with brain scanning: “I am rather sure that one day we will have a picture on our scanner of the activity patterns constituting consciousness” he told the newspaper. But will he? Let us imagine him scanning my brain while I, living in the dark winter of Denmark, have an experience of longing for summer. Here I personally have no difficulty in believing that this experience might somehow have been evoked by a brain activity that can be visualized on the scanner.
For the sake of the argument let us now further assume (though I take it to be not very likely) that one day our expert would be able to tell his colleagues, while scanning me, that now Jesper Hoffmeyer has an experience of longing for summer, and also that I did in fact have this experience at exactly this time. Even then, of course, the expert did not see my longing, for it cannot be seen; it can only be felt, and only by me.
Biology has had a long and painful tradition of resistance towards the mechanistic implications of the objectivistic ontology. The obligation to analyze and explain living systems as if they were purely physical, i.e. non-living, systems has seemed counter-intuitive to many biologists, and the recurrent emergence of vitalistic ideas in ever-new disguises may testify to this. The “vestigia terrent”
of vitalism probably also explains the strangely heated atmosphere generated by any scientific criticism of the Darwinian orthodoxy. Darwinian theory, when formulated in a hypothetico-deductive scheme, is in fact a true representation of the view from nowhere inasmuch as it is not concerned about the material and temporal causative details of the evolutionary process but contents itself with the level of statistical description. The scheme of natural selection is so precious to biologists exactly because it seems to implement the bird’s eye view solidly in biology or, in other words, to safeguard the position of biology as one of the true sciences. But what about qualia? If one does not believe in miraculous creation then human beings must be the products of evolution and the neo-Darwinian theory is therefore challenged to account for the evolution of qualia. Can a purely mechanical process create qualia? One may of course dismiss such questions by simply claiming qualia to be an illusion, as did philosophers such as Patricia and Paul Churchland and Daniel Dennett (Churchland 1986, 1991, Dennett 1991). Rather than following this eliminativist strategy (see Searle (1992) for a criticism of eliminativism) one might suggest that biology, and science in general, reconsider its reasons for adopting the ontology of the “view from nowhere” in the first place. As Nagel himself has put it: ‘the fundamental idea behind the objective impulse is that the world is not our world. This idea can be betrayed if we turn objective comprehensibility into a new standard of reality. That is an error because the fact that reality extends beyond what is available to our original perspective does not mean that all of it is available to some transcendent perspective that we can reach from here. But so long as we avoid this error, it is proper to be motivated by the hope of extending our objective understanding to as much of life and the world as we can.’ (Nagel 1986: 18).
Nagel was led to accept a modern kind of “dual aspect theory”. Dual aspect theory (also called the dual-attribute theory) was originally put forward by Spinoza as the view that the unitary substance God is expressed in the distinct modes of the mental and the physical. “To talk about a dual aspect theory is largely hand waving” writes Nagel “It is only to say roughly where the truth might be located, not what it is. If points of view are irreducible features of reality, there is no evident reason why they shouldn’t belong to things that also have weight, take up space, and are composed of cells and ultimately of atoms.” But: “the main question, how anything in the world can have a subjective point of view, remains unanswered.” (Nagel 1986: 30). In a deep sense we will perhaps always feel that this question remains unanswered, but I do think that the question could be approached in an evolutionary frame if science were persuaded to give up its unfounded belief that reality is in a narrow sense identical to objective reality. A scientifically consistent dual aspect theory might be raised on the basis of what I have called semiotic materialism (Hoffmeyer 1997a). Semiotic materialism claims that our universe
has a built-in tendency (consonant with modern interpretations of the second law of thermodynamics) to produce organized systems possessing increasingly more semiotic freedom in the sense that the semiotic aspect of the system’s activity becomes more and more autonomous relative to its material basis (Hoffmeyer 1992, 1996, Swenson 1989, Swenson and Turvey 1991). The semiotic dimension of a system is always grounded in the organization of its constituent material components, and cannot exist without this grounding, but evolution has tended to create more and more sophisticated semiotic interactions which were less and less constrained by the laws of the material world from which they were ultimately derived. And this same process finally led to the creation of self-conscious and intelligent beings, and the religions and cultures they made (or which made them). Thus, rather than seeing human subjectivity or qualia as a unique and utterly unexplainable feature of human existence (modus Searle) or, conversely, as a seductive linguistic illusion (modus Dennett), semiotic materialism sees qualia as a highly evolved instantiation of a semiotic freedom which was latently present in our universe from the beginning and which has been gradually unfolding in the course of organic evolution (Hoffmeyer 1995, 1996, 1998a).
Fundamentally the view from nowhere is an externalist way of understanding things. Mechanical dynamics are seen as guided by global (simultaneous) relations, which are held to be prior to their sequential realization. Koichiro Matsuno has shown that this “global” or “externalist” way of describing nature does not capture the inner workings of what it describes (Matsuno 1989, 1996; Matsuno and Salthe 1995). The problem is the need for global simultaneity, i.e. the availability of global initial conditions at any arbitrary moment.
Global simultaneity cannot be assured a priori for the simple reason that nothing —no possible kind of communication or co-ordination— propagates beyond the velocity of light. “Because of this, global description in a mechanistic framework, however consistent logically, will turn out to be self-contradictory if the objects described are supposed to be located in the material world” (Matsuno and Salthe 1995:317). If we want to explain and not only describe material processes we therefore have to apply an “internalist” view, i.e. giving priority to concrete particulars, seeing things not from without but from within: “Local dynamics is the more inclusive perspective because it incorporates something completely lacking in the global dynamics; it is concerned with just how, in the rough and tumble of actual situations, global integration will materialize in detail based on local behaviors, while global dynamics idealistically takes attainment of consistent global integration for granted from the outset” (Matsuno and Salthe 1995: 313). Continuous time, duration, and not discrete time is the medium required by local dynamics, which thereby exhibits agency, i.e. the capacity for making
contingent choices internally. But this does not preclude the cumulation of the local dynamics into a global record of seemingly simultaneous operations. According to Matsuno and Salthe “the necessity for securing the law of the excluded middle is a form of final cause, while locality itself locates an observer. Observation interacts with local dynamics to bring about history, which is absent from global mechanics; any global history must be constituted out of prior local records” (Matsuno and Salthe 1995: 333). Surprisingly then the view from nowhere and the mechanical model of the world which nourishes it turns out to be anti-materialistic, while a true materialism gives priority to the fundamentally semiosic nature of material processes.
Life: the invention of externalism
In the Umwelt theory of Jakob von Uexküll, animals possess internal phenomenal worlds, Umwelts, which they project out into their surroundings as “experienced external” guiding marks for activity (Uexküll 1982 [1940]). More recently the term Umwelt has acquired a slightly broader interpretation allowing for all kinds of organisms to possess some sort of species-specific Umwelt (Anderson et al. 1984). Even bacteria may be said to possess Umwelts in the sense that tens of thousands of receptor protein molecules at their surfaces bind to selected molecules in their environment, thus mediating measurements of the outside chemistry to patterns of activity at the inside (figure 1). If the bacterium enters a nutrient gradient it will start moving upstream, and if it enters a gradient of bacterial waste products it will “know” to move downstream. The bacterium in other words has evolved a capacity to make distinctions based on historically appropriated cyto-molecular habits built into the dynamic macromolecular architecture of the cell and its DNA (Hoffmeyer 1997a). Seen from the human observer’s point of view the Umwelt of an organism is a kind of world model (Meystel 1998), but seen from the organism’s own “point of view” all there is is situated activity, possibly accompanied by a sense of awareness or even anticipation in the case of the most sophisticated animals. Thus, to describe living systems in terms of possessing Umwelts is still part of an externalist discourse even though it is an attempt to deal with the world as seen from the animal’s point of view. This is because it is only in the historical perspective, to which the animal itself has no access, that the Umwelt can be described as a model. Retrospectively, and thus externalistically, we can say that evolution has molded the Umwelts of organisms and created species possessing still more sophisticated Umwelts matching still deeper levels of environmental dynamics.
There is no way to escape externalism in science, but one should be aware of the limitations this fact poses on science, namely that we can only approach an understanding of historical processes after the fact, i.e. retrospectively.
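The bacterial gradient-following described above can be illustrated, very roughly, by a "run-and-tumble" random walk. The following Python sketch is not taken from the text: the one-dimensional world, the linear nutrient profile and the two tumble probabilities are all assumptions chosen for illustration. The simulated cell never sees the gradient as a whole; it only compares the current concentration with the previous one and tumbles less often when things are improving.

```python
import random


def concentration(x):
    """Hypothetical nutrient profile: concentration rises linearly with x."""
    return x


def run_and_tumble(steps=2000, p_tumble_improving=0.1,
                   p_tumble_worsening=0.7, seed=0):
    """Return the final position of a toy 1-D run-and-tumble walker.

    The walker keeps its heading ("run") most of the time while the sensed
    concentration is rising, and re-draws a random heading ("tumble") more
    often when the concentration is not rising.
    """
    rng = random.Random(seed)
    x = 0.0
    direction = rng.choice((-1, 1))
    previous = concentration(x)
    for _ in range(steps):
        x += direction                      # run one step along the heading
        current = concentration(x)
        improving = current > previous      # purely local, temporal comparison
        p_tumble = p_tumble_improving if improving else p_tumble_worsening
        if rng.random() < p_tumble:         # tumble: pick a fresh random heading
            direction = rng.choice((-1, 1))
        previous = current
    return x


if __name__ == "__main__":
    print(run_and_tumble(seed=1))
```

With these assumed probabilities the walker drifts up the gradient on average, although no step in the loop refers to anything but the walker's own last two measurements; calling this a "model" of the environment is the observer's description, in line with the point made in the text.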
The understanding that biology models the activity of model-building organisms is, of course, at the core of biosemiotics. Where bacteria are concerned, the subtleties of the situation stop here because the Umwelt of bacteria is mostly concerned with chemistry. But considering the Umwelts of more sophisticated organisms it becomes clear that these organisms have developed models of their surroundings which are very much aimed at the activity of other organisms and thus of other model-builders, producing a semiotic web of infinite complexity. It remains an open question to what extent animals may understand that those other animals, which they model in their Umwelts, do themselves act on the basis of models. As we know, even some human beings, e.g. many scientists, have not been too willing to admit that much. We can safely say, however, that the evolutionary road from the most primitive externalist models of the world as possessed by bacteria to the appearance of the first models capable of approaching an internalist perspective
Figure 1. Binding of a nutrient molecule to the chemoreceptor blocks CheY phosphorylation, allowing for the dissociation of CheY from the switch complex. The switch complex now changes its conformation and induces a counterclockwise movement of the flagellum. Result: the bacterium moves straight forward.
has taken billions of years to pass. No wonder, then, that internalism is still rejected by many hard scientists. It certainly takes some amount of intellectual and emotional sophistication to enter the unruly inner worlds of otherness.
The fact that even prokaryotic organisms like bacteria have Umwelts must influence our understanding of the origin of life. As pointed out by Stanley Salthe, neither self-reference nor other-reference can be said to be an exclusive property of life, since even tornadoes may be said to possess a primitive kind of both (Salthe 1998). Yet I would resist the temptation to ascribe Umwelts to tornadoes or to any other pre-biotic systems because none of these systems has yet acquired means for an evolutionarily productive interaction between these two kinds of reference. The self-reference of a tornado is too short-lived and unstable to allow for a true evolutionary process of self-modification in response to other-reference. I shall suggest that it is the stable integration of self-reference and other-reference which establishes the minimum requirement for an Umwelt and thereby sets living systems apart from all their non-living predecessors. It is this double referential or semiotic character of living systems which is the true challenge to theories of the origin of life. And to my knowledge none of the present theories has tried to confront this most central aspect of life: the semiotic core of what it is like to be living.
Mathematical modeling of complex systems indicates that the formation of “life-like” chemical systems may not have been such an unlikely event as was formerly believed (e.g. Monod 1971). In rejecting the “magical molecule” or RNA approach to the origin of life problem, Stuart Kauffman from the Santa Fe Institute has pointed to the remarkable fact that the simplest known free-living cells, so-called pleuromona, are already very complex, containing an estimated number of genes of a few hundred to about a thousand.
And he suggests that the reason for this might be that a certain minimum complexity is necessary for life to appear (Kauffman 1995). His work with mathematical modeling of “combinatorial chemistry” has shown that when a large enough number of reactions are catalyzed in a chemical reaction system, a vast web of catalyzed reactions will suddenly crystallize. “Such a web, it turns out, is almost certainly autocatalytic —almost certainly self-sustaining, alive” (Kauffman 1995: 58). Complexity thus is a prerequisite to autocatalytic closure, which again is a prerequisite to life. And Kauffman confidently concludes that “The secret of life, the wellspring of reproduction, is not to be found in the beauty of Watson-Crick pairing, but in the achievement of collective catalytic closure. The roots are deeper than the double helix and are based in chemistry itself. So, in another sense, life —complex, whole, emergent— is simple after all, a natural outgrowth of the world in which we live” (Kauffman 1995: 48). While this scheme stands as a convincing alternative to the predominant RNA scenarios for the origin of life, it does not confront the question I have posed as central here: How could a system with the ability to make a productive model
of its external environment appear? What is the origin of the Umwelt? I have dealt with this question elsewhere (Hoffmeyer 1998b) and claimed that what is missing in Kauffman’s model is the concept of an asymmetry between inside and outside, and the first precondition for the establishment of such an asymmetry would be the formation of a closed membrane around a complex autocatalytically closed web of interacting molecules (step 2 in figure 2). Bruce Weber has also emphasized the importance of membrane formation (Weber 1998a and b).
ORIGIN OF LIFE
Five necessary steps
1. Autocatalytic closure (Kauffman)
2. Inside-outside asymmetry (closed surface)
3. Proto-communication (a community of surfaces)
4. Digital redescription (code duality)
5. Formation of an interface (inside-outside loops)
Figure 2. Five necessary steps on the road to the origin of life. See text for discussion.
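Kauffman's claim that a web of catalyzed reactions "suddenly crystallizes" once enough reactions are catalyzed can be illustrated by the random-graph analogy of buttons and threads that he himself uses in the same book. The Python sketch below shows only that analogy, not his chemical model, and its function name, node counts and edge counts are assumptions made for illustration. Nodes stand for molecular species, randomly placed edges stand for catalyzed reactions linking them, and the quantity tracked is the share of nodes in the largest connected cluster.

```python
import random


def largest_component_fraction(n_nodes, n_edges, seed=0):
    """Fraction of nodes in the largest cluster of a random graph.

    Edges are drawn uniformly at random (self-loops and repeats allowed),
    and clusters are tracked with a union-find structure.
    """
    rng = random.Random(seed)
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _ in range(n_edges):
        a = find(rng.randrange(n_nodes))
        b = find(rng.randrange(n_nodes))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a          # merge the smaller cluster into the larger
            size[a] += size[b]

    return max(size[find(i)] for i in range(n_nodes)) / n_nodes


if __name__ == "__main__":
    n = 2000
    for ratio in (0.2, 0.4, 0.5, 0.6, 1.0, 1.5):
        frac = largest_component_fraction(n, int(ratio * n))
        print(f"edges/nodes = {ratio:.1f}  largest cluster = {frac:.2f}")
```

In such random graphs the largest cluster stays tiny while the ratio of edges to nodes is well below one half and then grows rapidly to span most of the nodes. Connectivity of this kind is at most a necessary ingredient of autocatalytic closure and says nothing yet about the inside-outside asymmetry discussed in the text.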
The main fabric of the kind of membranes possessed by living systems is a so-called LIPID BILAYER, a continuous sheet, no more than five to six nanometers thick, made of two layers of amphiphilic molecules (mostly phospholipids, glycolipids, and sulfolipids) joined laterally, as well as tail to tail, by van der Waals interactions between their hydrophobic hydrocarbon chains (figure 3). Christian de Duve characterizes lipid bilayers in the following words: “Lipid bilayers are fluid structures of almost limitless flexibility. They behave as two-dimensional liquid crystals within which the lipid molecules can move about freely in the plane of the bilayer and reorganize themselves into almost any sort of shape without loss of overall structural coherence. Furthermore, lipid bilayers are self-sealing arrangements that automatically and necessarily organize themselves into closed structures, a property to which they owe their ability to
Figure 3. Lypid bilayers. The hydrophilic “heads” and “tails” of phospholipids cause them to form bilayers when dispersed in water. Bilayers tend to form closed vesicles.
25
26
join with each other by fusion and to divide by fission, while always maintaining a closed, vesicular shape.” (de Duve 1991). Now, while such closed membrane systems may have formed rather easily in the pre-biotic world, they would probably also die out relatively quickly unless they had acquired a capacity to canalize a selective flow of chemicals (“nutrients” and “waste”) across the membrane. But in themselves lipid bilayers are very impenetrable and in cell membranes as known today the transport of chemicals in and out of the cell is only possible because of the inclusion into the membrane itself of thousands of rather complex protein molecules (figure 4). For this reason de Duve in his scenario for the origin of life does not favor membrane encapsulation as an early step. Here however we are concerned with the chemistry of the process only in the trivial sense that what is suggested should conform to the chemical evidence we have, and that evidence is certainly very speculative. Whatever did happen at the chemical level and whatever the order of succession actually was what I claim here only is that from a conceptual point of view membrane formation was THE decisive step in the process. Before membrane closing occurred there could be no inside-outside asymmetry and thus no communication between systems. And my guess is that it is only because pre-biotic systems reciprocally dragged each other into a communicative network that they could muster the creativity needed for the gradual construction of a true cell. The continuous interaction between processes of “invention” and “interpretation”, which I have termed SEMETIC INTERACTION, would then have been put in play (Hoffmeyer 1997b). If for instance at some locality conditions allowed for the production of swarms of such closed membrane systems one might eventually obtain a higher level autocatalytic closure so that the outputs from one entity served as inputs to other entities and vice versa. 
In such a swarm one might say that the closed membrane systems had acquired a germ form of other-reference or proto-communication (step 3 in figure 2). Still, in my terminology this would not qualify as Umwelt possession, since at this stage the system still has no self-referential dynamics. For self-referential dynamics to occur the system needs what Kauffman has called a “written record”, i.e. the spatially organized components of the system should somehow become re-described in the digital alphabet of DNA or RNA, thereby forming what Claus Emmeche and I have called a code-duality (step 4 in figure 2). Code-duality refers to the idea that organisms and their DNA are both carriers of a message sent down through generations: living systems form a unity of two coded and interacting messages, the analogically coded message of the organism itself and its re-description in the digital code of DNA. As analog codes, the organisms recognize and interact with each other in the ecological space, giving rise to a horizontal semiotic system, while as digital codes they (after eventual
recombination through meiosis and fertilization in sexually reproducing species) are passively carried forward in time between generations (Hoffmeyer and Emmeche 1991). So far a system has been established which —seen from the observer’s point of view— has an obvious interest in maintaining the needed flow of chemicals across the surface. But the system still has no way to assist the fulfillment of its own “interest”; it has no mechanism for goal-oriented modification or action. Thus the system is not an agent in its own interests. It does not matter to the system whether it can distinguish features of its environment or, in other words, it has not yet acquired the capacity for making distinctions. What is needed in addition to the DNA-record is the formation of a feedback link between DNA and environment, so that events outside the system become translated into appropriate events inside the system (Weber et al. 1989). The membrane, in other words, must turn into an interface linking the interior and the exterior (step 5 in figure 2). Only then does the system’s understanding of its environment matter to the system: relevant parts of the environment become internalized as an “inside exterior”, the Umwelt, and at the same time the interior becomes externalized as an “outside interior” in the form of “the semiotic niche”, i.e. the diffuse segment of the semiosphere which the lineage has learned to master in order to control organism survival in the semiosphere (Hoffmeyer 1996). I see this as the decisive step in the evolutionary process of attaining true semiotic competence, i.e. the competence to make distinctions in space-time where formerly there were only differences. The semiotic looping of organism and environment into each other through the activity of their interface, the closed membrane, also lies at the root of the strange future-directedness or “intentionality” of life, its “striving” towards growth and multiplication. The spatial asymmetry between the “inside interior” and the “outside exterior” is coupled to the time asymmetry implicit in the self-referential mechanism of DNA re-description followed by cell division.

Figure 4. Model of membrane structure showing the integration of specialized proteins into the lipid bilayer. Membrane-bound proteins may perform a variety of functions such as recognition of molecular messages from other cells or transport of specific molecules across the membrane. A hypothetical channel for water and certain ions between several protein subunits is shown.
The landscape of membranes

As far as is known, cyto-membranes never form de novo by self-assembly of their constituents (as they must nevertheless have done at least once in the distant past); they always grow, in an essentially homomorphic fashion, by accretion, that is, by the insertion of additional constituents into pre-existing membranes. The corresponding patterns are transmitted from generation to generation by way of the cytoplasm (e.g. of egg cells), which contains samples of the different kinds of cyto-membranes found in the organism. The ordinary textbook talk of DNA as governing cellular or even organismal behavior is therefore rather misleading. In fact, if any entity should be thought of as a governor of cellular activity, it should certainly be the membrane. DNA contains the recipes for constructing the one-dimensional amino acid chains which form the backbones of enzymes, among them the enzymes needed for catalyzing the formation of the constituents of lipid bilayers and for assembling them. But whether these recipes are actually “read” and executed by cellular effectors depends on membrane-bound activity. All major activities of cells are topologically connected to membranes. In the prokaryotes (bacteria) the plasma membrane (the active membrane inside the cell wall) is itself in charge of molecular and ionic transport, biosynthetic translocations (of proteins, glycosides etc.), assembly of lipids, communication (via receptors), electron transport and coupled phosphorylation, photoreduction, photophosphorylation, and anchoring of the chromosome (replication) (de Duve 1991: 63). In the more modern and much bigger eukaryotic cells these tasks have been taken over by specific sub-cellular membrane structures of mitochondria, chloroplasts, the nuclear envelope, the Golgi apparatus, ribosomes, lysosomes etc.
Many —if not all— of these membranes are themselves descendants of once free-living prokaryotic membranes, which perhaps a billion years ago became integrated into that co-operative or symbiotic complex of prokaryotic membranes which is the eukaryotic cell. Membranes are also the primary organizers of multicellular life. The topological specifications necessary for the growth and development of a multicellular organism cannot be derived from the DNA, for the good reason that the DNA cannot “know” where in the organism it is located. Such “knowledge” has to be furnished through the communicative surfaces of the cells. Morphogenesis is mostly a result of local cell-cell interactions in which signaling molecules from one cell affect neighboring cells. Animal cells, for instance, are constantly exploring their environments by means of little cytoplasmic feelers called
filopodia (filamentous feet) that extend out from the cell. “These cytoplasmic extensions that drive cell movement and exploration are expressions of the dynamic activity of the cytoskeleton with its microfilaments and microtubules that are constantly forming and collapsing (polymerizing and depolymerizing), contracting and expanding under the action of calcium and stress”, writes Brian Goodwin (Goodwin 1995: 136). A developing organism has to generate its own form from a simple initial shape, and this process can be described as highly parallel sequences of bifurcations, i.e. transitions from states of higher symmetry (lower complexity) to states of lower symmetry (higher complexity). A key factor in this process seems to be calcium ion transport across cellular membranes. Calcium ions are bound by special proteins, and it has been shown that the interaction between calcium and the cytoskeleton could result in the spontaneous formation of spatial patterns in the concentration of free calcium and the mechanical state of the cytoplasm, i.e. the kind of bifurcations needed for explaining developmental processes (Goodwin 1995). What emerges from the work of Goodwin and others is an understanding of developmental processes as a formation of relational order arising from a complex pattern of basically semiotic interactions between the constituents of the developing organism. The medium for these interactions is the landscape of membranes at all levels of complexity forming the organism —from the mitochondrion to the skin.

Dynamic boundaries

If the prototype tool is a hammer, the prototype organism is probably a dog (in the western world at least). Like ourselves, dogs have relatively well-defined boundaries; they are mortal individuals and they cannot be in two places at the same time, but have to move in order to get food. Such organisms have been called determinate organisms. Most organisms in the world, however, are not at all like this.
The life of fungi, for instance, is a constantly changing interplay between dissociation and association, generating varied patterns in the interconnected, protoplasm-filled tubes (hyphae) that spread through and absorb sources of nutrients. The hyphae branch away from one another (i.e. dissociate) most prolifically when nutrients are freely available, but re-associate to form such structures as mushrooms when supplies are depleted. Fungi exemplify what might be called indeterminate organisms, and according to the British biologist Alan Rayner “The fungi are an entire kingdom of organisms: their total weight may well exceed the total weight of animals by several times and there are many more species of them than there are of plants! In many natural environments fungi provide the hidden energy-distributing infrastructure —like the communication pipelines and cables beneath a city— that connects the lives of plants and animals in countless and often surprising ways” (Rayner 1997: vii).
Indeterminate organisms possess expandable or “open” boundaries that enable them to continue to grow and alter their patterns indefinitely. Such organisms are potentially immortal. Thus, in a Canadian forest one individual rhizomorphic fungus, Armillaria bulbosa, is reported to have produced a network some 1500 ha in area, estimated to weigh 100 tonnes and to be 1500 years old (Rayner 1997: 130). Only extreme conditions would kill this exemplar. The extreme emphasis on the regenerative aspect of life, proliferation and reproduction, which is the essence of modern gene-centered evolutionary thinking, does not work well in the world of indeterminate creatures. Here processes of boundary-fusion, boundary-sealing and boundary-redistribution all provide means for reducing dissipation, allowing energy to be maintained within the system rather than lost to the outside. Such processes lead to more persistent organizations in which individuality is blurred. For illustration let us consider a few of the many cases offered in Alan Rayner’s book (Rayner 1997). Heartrot is intuitively conceived as a disease, but in actual fact it is part of a quite normal recycling process, i.e. a process where internal partitioning allows resources to be redistributed from locations that no longer participate in energy gathering or exploration to sites where these processes are being sustained. Heartrot is caused by the degradation by fungi of the predominantly dead wood of mature trees. Heartrot thus results in the hollowing of tree trunks, which provides a huge variety of habitats for animals as well as allowing the tree to recycle itself by proliferating roots within its own internal ‘compost’ of decomposing remains that accumulate within the cavity (Rayner 1997: 179). Another illustrative example is the partnership formed by the majority of higher plants with fungi, the so-called mycorrhizas.
Here the fungi not only provide their plant partners with improved access to mineral nutrients and water in exchange for organic compounds produced by photosynthesis; the mycorrhizal mycelia are also thought to provide communication channels between plants, enabling adult plants to “nurse” seedlings through fungal “umbilical cords”, to reduce competition and to enhance efficient usage and distribution of soil nutrients. There is a risk to this invention, however, because “pirating plants”, by tapping into mycorrhizal networks, may indirectly divert resources from the participating plants (Rayner 1997: 63). These are both cases of mutualistic symbiosis, of course, and symbiosis in general is probably the most underestimated aspect of evolutionary creativity (cf. Sapp 1994). The one-eyed focus on the reproductive aspect of life, the proliferation of successful genes, systematically weeds out the heterogeneous contexts in which organisms always live. These contexts do not consist only in material interactions but also always include subtle semiotic interactions mediated by environmental cues of all kinds. Thus, traditional symbiosis should be seen as just a particular kind of a much more widespread eco-semiotic integration (Hoffmeyer 1997c).
The absurdity of conflating all of this into one simple measure of genetic fitness becomes especially obvious when considering the world of indeterminate organisms, where individuality and mortality are only loosely connected, and where the dynamic boundaries in space and time are not defined by their genetic set-up. The evolution of boundaries and the evolution of the contexts in which they put themselves are assisted by, not caused by, genetic inventions.

Membrains

The distinction between determinate and indeterminate organisms is itself indeterminate, of course; no organism is completely determinate or completely indeterminate. Complicated functions such as photosynthesis, capture of prey, ingestion of food and reproduction require a high degree of adaptive refinement, and in order to handle these tasks even indeterminate organisms regularly produce highly prescribed determinate offshoots, e.g. flowers, fruits, leaves, and fungal fruit bodies. While these offshoots often attract the interest of the human observer, it is nevertheless the indeterminate part of the system “that generates the offshoot and regulates the interrelationships between offshoots by providing interconnected pathways whose variably deformable and penetrable boundaries outline the channels of communication within the system. Determinate superstructure is integrated by indeterminate infrastructure” (Rayner 1997: 80, my italics). This pattern can be generalized, since even clearly determinate organisms such as dogs or human beings are dependent on indeterminate infrastructure in the form of the vascular system and the nervous system. The pattern of development of nervous and vascular infrastructures resembles that of a foraging mycelium, says Rayner: “Like the hyphal tubes of a mycelium they may be variably partitioned and cross-linked, and their lateral boundaries are variably insulated so as to receive and distribute input with minimal dissipation” (Rayner 1997: 141).
There seems to be a deep logic in this arrangement. Nervous systems and brains never developed in plants or fungi. From the beginning these structures were connected to the need of animals to move for purposes of flight or foraging. Nerve cells became specialized for the “long distance” communication needed in order to co-ordinate the activities of body parts too far apart for quick interaction through traditional cell-to-cell communication. Thus, the determinacy of the body structure was compensated for by the indeterminacy of the animals’ movements. As Rayner observes: “Animals...follow and create ‘trajectories’ in their surroundings as they change their position and behavior over time and interact with one another. If these trajectories are mapped, they often exhibit a clear but irregular structure, similar to the body boundaries of indeterminate organisms like plants and fungi” (Rayner 1997: 70). The plasticity of the sensorimotor system of determinate organisms, in
other words, assures the capacity for “capitalizing on opportunities” which in indeterminate organisms is assured simply through the plasticity of their boundaries. It follows that the brains which gradually evolved to guide the activity of animals could not themselves be determinate, but would have to retain a plasticity which could match the unpredictable semiotic heterogeneity of the environment. While morphogenesis in general is guided by local cell-to-cell interactions, neurons can be in direct contact with many cells that are located quite far apart from one another, by virtue of their long output (axon) and input (dendrite) branches. This allows populations of cells far apart from one another in the brain to interact directly and influence one another, so that a non-local developmental logic is superimposed on top of the local regional differentiation that preceded it. And this is the key to the adaptive plasticity of the developing brain, since “it makes it possible for the nervous system as a whole to participate actively in its own construction” (Deacon 1997: 195). Loosely sketched, what happens is a kind of neuronal selection process. The developing brain produces a surplus of nerve cells, and each of these nerve cells produces far more branches of its growing axon than will finally become functional synaptic connections. Only a fraction of the newly produced connections will happen to get involved in persistent coordinated activity, while the remainder are eliminated in a competition between axons from different neurons over the same synaptic targets. Nerve cells which do not succeed in making any synaptic contributions are persuaded to commit suicide, so that only some 60% of the totality of nerve cells originally produced will survive into adulthood. That experience from the outer world influences the pattern of neuronal cell death and synaptic strengthening was illustrated by some rather harsh experiments done on newborn kittens.
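The neuronal selection process sketched above (overproduction of axon branches, competition for the same synaptic targets, and death of neurons that win no target) can be made concrete in a few lines. The following is a minimal toy model under stated assumptions, not the model used by Deacon or Gilbert: every name and number is hypothetical, each target keeps exactly one winning branch, and a branch wins with probability proportional to the activity of its source neuron.

```python
import random


def prune(n_neurons, n_targets, branches_per_neuron, activity, seed=0):
    """Toy overproduction-and-pruning model (illustrative assumption only).

    Each neuron sends `branches_per_neuron` axon branches to random targets.
    At every target the competing branches are reduced to one winner, drawn
    with probability proportional to the activity of its source neuron.
    Returns (survivors, winners): the set of neurons that kept at least one
    connection, and a dict mapping each captured target to its winning neuron.
    """
    rng = random.Random(seed)
    contenders = {t: [] for t in range(n_targets)}
    for neuron in range(n_neurons):
        for _ in range(branches_per_neuron):
            contenders[rng.randrange(n_targets)].append(neuron)

    winners = {}
    for target, axons in contenders.items():
        weights = [activity[a] for a in axons]
        # A target reached only by silent axons (or by none) stays uncaptured.
        if axons and sum(weights) > 0:
            winners[target] = rng.choices(axons, weights=weights, k=1)[0]

    survivors = set(winners.values())
    return survivors, winners


if __name__ == "__main__":
    # Hypothetical setup: neurons 0-49 stand for one input source, 50-99 for another.
    normal = [1.0] * 100
    silenced = [1.0] * 50 + [0.0] * 50  # second group carries no activity

    for label, act in (("normal", normal), ("second group silenced", silenced)):
        survivors, winners = prune(100, 60, 5, act)
        second = sum(1 for n in winners.values() if n >= 50)
        print(f"{label}: {len(survivors)} neurons survive, "
              f"{second} of {len(winners)} targets held by the second group")
```

Setting the activity of one group to zero gives a crude analogue of the deprivation experiments described next: the silent group captures no targets, so the active group takes all the available synaptic space.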
If the right eye of a kitten was sewn shut during a critical period of early life, the kitten would develop functional blindness in that eye for the rest of its life. It could be shown that this was not due to lack of experience as such. What happened was that synaptic competition eliminated all those connections to the visual cortex that were derived from the passive right eye, leaving all the available synaptic space for left eye connections to capture. The patterns of impulses emitted through the optic nerve from the right eye never reached the visual cortex (Gilbert 1991: 644). The indeterminacy of the brain is its strength. In Terrence Deacon’s words: “Cells in different areas of the brain are not their own masters, and have not been given their connection orders beforehand. They have some crude directional information about the general class of structures that make appropriate targets, but apparently little information about exactly where they should end up in a target structure or group of target structures. In a very literal sense, then, each
developing brain region adapts to the body it finds itself in....There need be no ‘pre-established harmony’ of brain mutations to match body mutations, because the developing brain can develop a corresponding organization ‘on line’, during development” (Deacon 1997: 205). For decades neuroscientists have tried to model the brain as a computational organ. Brain cells were seen as logical gates, adding and subtracting input spikes until some threshold level of charge was breached, at which point they would convulse to produce a spike of their own. This all-or-nothing nature of a nerve cell’s firing was thought to overcome the usual soupy sloppiness of cellular processes, i.e. to bring it into the realm of digital calculation. But 30 years of research along these simplistic lines has not been able to give any real answers as to how the brain works. And it has now become clear that the output of any individual neuron depends on what the brain happens to be thinking at the time. Not the single firing nerve cell but the self-organizing and ever-changing global pattern of activity all over the brain seems to hold the key to an understanding of its capacity to produce the mental phenomena we all experience. Thus on top of the indeterminacy of the growing brain comes an indeterminacy at the level of synaptic connections, which furthermore makes for an indeterminacy of global activity pattern formation. And finally, on top of all this plasticity comes yet another level of indeterminacy: language. “Language ran its hyphae far into the nervous system allowing, today, no hope of excision —not even in theory. Language does not think through us but it has become a part of us. And yet language is common property and, hence, extraneous to us” (Hoffmeyer 1996: 112). Terrence Deacon, in his recent book “The Symbolic Species”, suggests that languages have adapted to children’s brains much more than brains have evolved to become linguistic.
This would explain the mystery of how children can learn to talk in spite of the often postulated unlearnability of language: “Human children appear preadapted to guess the rules of syntax correctly, precisely because languages evolve so as to embody in their syntax the most frequently guessed patterns. The brain has co-evolved with respect to language, but languages have done most of the adapting” (Deacon 1997: 122). Thus, there is no need for postulating innate linguistic knowledge. With the invention of speech the Umwelts of individual organisms gradually gave way to the idea of the one and only world. The searching membranes of life had at length found the means for a productive non-local association with other membranes, thereby creating what the American neurophysiologist Walter Freeman has called “societies of brains” (Freeman 1995). And this is where the idea of objectivity is rooted, i.e. in the social nature of human knowledge. Our ingenious membrains communicate directly with other membrains, attempting to construct the world in the image of the collective, i.e. in a view from nowhere.
But membrains are mortal, they have a beginning and an ending. In between is a life story, which cannot be thought away as nowhere business.

References
Churchland, Paul M. 1991. “Folk Psychology and the Explanation of Human Behavior”. In J. D. Greenwood (ed.). The Future of Folk Psychology. Cambridge: Cambridge University Press.
Churchland, Patricia S. 1986. Neurophilosophy: Towards a Unified Theory of Mind/Brain. Cambridge: MIT Press/A Bradford Book.
Deacon, Terrence 1997. The Symbolic Species. New York: Norton.
de Duve, Christian 1991. Blueprint for a Cell. The Nature and Origin of Life. Burlington: Neil Patterson Publishers.
Dennett, Daniel C. 1991. Consciousness Explained. London: Allen Lane, Penguin Press.
Freeman, Walter J. 1995. Societies of Brains. A Study in the Neuroscience of Love and Hate. New Jersey: Lawrence Erlbaum.
Gilbert, Scott F. 1991. Developmental Biology. Sunderland: Sinauer.
Goodwin, Brian 1995. How the Leopard Changed its Spots. London: Phoenix Giants.
Hoffmeyer, Jesper 1992. “Some Semiotic Aspects of the Psycho-Physical Relation: The Endo-Exosemiotic Boundary”. In Thomas A. Sebeok and Jean Umiker-Sebeok (eds.). Biosemiotics: The Semiotic Web 1991. Berlin: Mouton de Gruyter, 101-123.
__1995. “The Swarming Cyberspace of the Body”. Cybernetics & Human Knowing 3(1): 1-10.
__1996. Signs of Meaning in the Universe. Bloomington: Indiana University Press.
__1997a. “Semiotic Emergence”. Revue de la Pensée d’aujourd’hui 25-7(6): 105-17 (in Japanese).
__1997b. “The Swarming Body”. In Irmengard Rauch and Gerald F. Carr (eds.). Semiotics Around the World. Proceedings of the Fifth Congress of the International Association for Semiotic Studies, Berkeley 1994. Berlin: Mouton de Gruyter, 937-940.
__1997c. “Biosemiotics: Towards a New Synthesis in Biology”. European Journal for Semiotic Studies 9(2): 355-376.
__1998a. “The Unfolding Semiosphere”. In Gertrudis van de Vijver, Manuela Delpos and Stanley Salthe (eds.). Evolutionary Systems. Dordrecht: Kluwer.
__1998b. “Anticipatory Surfaces”. Cybernetics & Human Knowing 5(1): 33-42.
Hoffmeyer, Jesper and Claus Emmeche 1991. “Code-Duality and the Semiotics of Nature”. In Myrdene Anderson and Floyd Merrell (eds.). On Semiotic Modeling. New York: Mouton de Gruyter, 117-166.
Kauffman, Stuart 1995. At Home in the Universe. New York: Oxford University Press.
Matsuno, Koichiro 1989. Protobiology: Physical Basis of Biology. Boca Raton: CRC Press.
__1996. “Internalist Stance and the Physics of Information”. Biosystems 38: 111-118.
Matsuno, Koichiro and Stanley Salthe 1995. “Global Idealism/Local Materialism”. Biology and Philosophy 10: 309-337.
Meystel, Alex 1998. “Multiresolutional Umwelt: Toward Semiotics of Neurocontrol”. In Jean Umiker-Sebeok (ed.). Semiotics in the Biosphere: Reviews and a Rejoinder. Special Issue of Semiotica 120(3/4).
Monod, Jacques 1971. Chance and Necessity. New York: Knopf.
Nagel, Thomas 1986. The View from Nowhere. Oxford: Oxford University Press.
Rayner, Alan D. M. 1997. Degrees of Freedom. Living in Dynamic Boundaries. London: Imperial College Press.
Salthe, Stanley 1998. “Naturalizing semiotics”. In Jean Umiker-Sebeok (ed.). Semiotics in the Biosphere: Reviews and a Rejoinder. Special Issue of Semiotica 120(3/4).
Sapp, Jan 1994. Evolution by Association. A History of Symbiosis. New York: Oxford University Press.
Swenson, Rod 1989. “Emergent Attractors and the Law of Maximum Entropy Production”. Systems Research 6: 187-197.
Swenson, Rod and M. T. Turvey 1991. “Thermodynamic Reasons for Perception-Action Cycles”. Ecological Psychology 3(4): 317-348.
Uexküll, Jakob von 1982 [1940]. “The Theory of Meaning”. Semiotica 42(1): 25-87.
Developing biosemiotics into cybersemiotics

Søren Brier
In the last decade there has been a growing interest in creating a new transdisciplinary science that could grasp, understand and manipulate the informational aspect of nature, culture and technology. In particular, a better understanding of the semantics of cognition and of the representation of knowledge in texts, compared with the way a computer represents and manipulates linguistic knowledge, has been sought in order to improve the ability of the international computer network to handle the semantic aspects of text and speech. But at the same time the debate about how to define information has led deep into the question of the nature and limits of scientific knowledge and of what kinds of reality we are dealing with. To formulate frameworks for cognitive and informational science is to go deep into the philosophical foundation of science, especially its epistemological aspect. In science the universe has generally been considered to consist of matter and energy, but following Norbert Wiener’s declaration that ‘information is information and not energy or matter’ the question has been raised whether information is a third true constituent of our basic reality as science sees it (Hayles 1999). One of the clearest and most outspoken promoters of this view is Stonier (1990, 1992, 1997). I have discussed this idea of a unified theory based on the idea of objective information in Brier (1992 and 1996c, d). The main problem is perhaps that it has very little to say about semantics and signification in living systems, as Hayles (1999) also points out. An international program on the Foundation of Information Science (FIS) is now evolving. In Brier (1997) I argue for my own approach to this project, which briefly stated is the following: a truly scientific theory of information, cognition and communication has to encompass the area covered by the social sciences and humanities as well as biology and the physico-chemical sciences.
A genuine transdisciplinarity is necessary if we want to understand information, cognition and communication in natural, living, artificial and social systems in a broadly based scientific theory. A way to connect the phenomenological view from within with a theory of behavior and language is crucial for such an enterprise —in short a theory of signification. In my opinion we have to look for a theory which is on the one hand not mechanistic —because our scientific results so far do not support the belief that semantics is mechanistically explainable. The evolutionary view of reality seems to be well supported by empirical data and is therefore a necessary basis of a worldview. As Prigogine and Stengers (1985) argue and demonstrate,
evolution cannot be understood on a mechanistic foundation, but must be based on a foundation of complexity at least like the thermodynamic one, with a probability function and with the irreversibility that comes from the second law of thermodynamics, the irreversible growth of entropy. But this is still not enough to make a foundation for understanding the emergence of life and mind (as inner life, emotion, will and qualia). On the other hand, idealistic, transcendental phenomenological views like Husserl’s, social constructivism, or, as an alternative, a radical social constructivist epistemology seem unable to account for the genuine aspect of reality that science has uncovered and for the connection and continuity between nature and culture (discussed in Brier 1993b and 1996b). Based on our personal and intersubjective experiences with forces like electromagnetism and gravity, I find it necessary to hold on to some kind of realism. As I also find it problematic and unnecessarily complicated to establish a non-empirical, qualitatively different mental world, and then still have to explain how it can have causal influence on matter, I want to adhere also to a monism: a monism viewed as the general scientific idea that everything in the whole Universe has —at least from the beginning— been developed from the same “stuff.” But within a traditional physicalistic understanding, be it deterministic as in classical physics or probabilistic as in thermodynamics and cybernetics, this “stuff of the world” is too reductionistic to explain life and mind. But I also find the attempt to explain life and mind away in an eliminative materialism paradoxical and self-refuting (Churchland 1986, Churchland 1995). With Peirce, I therefore prefer the Aristotelian concept of ‘hyle’, which through its continuity or field idea of matter first brings us closer to the quantum field framework, and second opens up the idea, obscure to present-day science, that matter can have an internal aspect of life and mind.
This is what Peirce calls ‘pure feeling’.
To make a realistic, evolutionary and non-mechanistic cognitive science was actually what Lorenz and Tinbergen set out to do when they created the science of “ethology”. From its beginning in the 1930s, ethology was based on the three theoretical foundations of modern biology: the theory of evolution, ecological theory and modern population genetics, plus the method of comparative anatomy transferred to instinctive movements (Lorenz 1970-1971). From this foundation, Lorenz and Tinbergen in particular developed a theory of innate release response mechanisms fueled by specific motivational energy and released by innate sign stimuli. Although the theory is very compatible with Freud’s psychoanalytical theory, it was never able to deal with the phenomenological aspect in a theoretically consistent and constructive way (Brier 1993b, 1998a). In the present paper I want to develop further the epistemological framework of ethology and evolutionary epistemology through a biosemiotic founding of basic concepts, in the light of the problem of establishing the reality of qualia in a materialistic evolutionary cognitive biology, and to proceed from there to the semantic
level of meaning in human language communication. I see the development of a non-reductionistic biosemiotics as essential for building a general framework for information, cognitive and communication studies. What is interesting and fruitful about Lorenz’s biological theory of animal behavior is its attempt to make a cognitive science based on biological theory, surpassing on the one hand the reductionism based on the mechanisms of physics and chemistry and on the other hand the vitalism of Driesch and others. Both Lorenz and Tinbergen were aware that animal instinctive behavior is largely inherited. A good theory of genes was not available at the time, but heredity was well known and supposed to have a material basis in the chromosomes, and population genetics was under development. Morphology was well studied and, in accordance with the Darwinian paradigm, it was studied from the angle of the survival value of animal behavior. One of the puzzles was how animal instinctive behavior and learning could at the same time be hereditary and purposeful. There was no doubt that animals had selective perception and related to certain events as biologically meaningful to their survival when these appeared in certain situations, depending on the animal’s mood. But neither Lorenz nor Tinbergen managed, in my opinion, to formulate the needed integrative evolutionary-ecological theory for cognitive science that could be an alternative to the objectivism of modern cognitive science and its information-processing paradigm (Lakoff 1987, Brier 1992, 1996b). They did manage to make a theory of genetically preprogrammed behavior and learning, showing how perception depends on specific kinds of partly self-energizing motivations that are also regulated by age, sex, physiological needs and time of year.
But the foundation of the concept of motivation in particular, and its relation to emotions and consciousness, has not found a broadly accepted form (Hinde 1970, Reventlow 1970, Brier 1980). This has limited its usefulness in the human sphere, and most ethologists no longer use the concepts. At least a new ‘cognitive ethology’ has been developed by Marc Bekoff (Allen and Bekoff 1997; Bekoff, Allen and Burghardt 2002). It recognizes the inner life of animals as a causal factor, but unfortunately it argues for its existence only from the common traits of humans and animals and from observations of animal behavior and, like Lorenz, fails to develop a new theoretical framework to sustain and develop its scientific concepts. Among other things, this is what I hope to accomplish with my cybersemiotic framework. On the other hand, a very important conclusion in Lakoff (1987) is that our biology is decisive for the way we formulate concepts and make categorizations. He points out that linguistics lacks a broader theory of motivation based on embodiment, one that would let us understand how we extend metaphors from the concrete to the abstract in a meaningful way and explain how we organize concepts into
different types of categories. He points out that cognitive models are embodied, or based on an abstraction of bodily experiences, so that many concepts, contents or other properties are motivated by bodily or social experience in a way that goes beyond the usual linguistic idea of motivation. Only in this way do they make sense, thereby providing a non-arbitrary link between cognition and experience that is not logical in the way we usually understand it. This means that human language is based on human concepts that are motivated by human experience. It is easier to learn something that is motivated than something that is arbitrary or logically arranged. So one of Lakoff’s conclusions is that motivation is a central phenomenon in human cognition, especially in categorization. Lakoff also points out that motivational categorization in humans is based on idealized cognitive models (ICMs) that are the result of accumulated embodied social experience and give rise to certain anticipations. This fits very well with ethological thinking around concepts like sign stimuli and imprinting (Brier 1995 and 1998), but unfortunately ethology’s very physiological and energy-oriented models of motivation, cognition and communication are not developed enough to encompass the area from animal instinctive communication to human linguistic behavior. A further development is needed that focuses more on signification and communication. From the other end, it is a problem that Lakoff develops only a rather simplistic model of bodily kinetic-image schemata as the source of metaphor and metonymy. I tend to think that one could develop the theory much further by also drawing on a combination of ethological and psychoanalytical knowledge of the connection between motivational states and the cognition of phenomena as meaningful signs.

Anticipatory behavior in animals

Based on the results of ethology I propose that all perceptual cognition is anticipatory (Brier 1993b).
According to my ethological model of instinctive reaction, the fixed action patterns that are the behavioral part of the animal instinct are released by the innate release mechanism only if motivationally borne pattern recognition occurs. The perceptions that release the more or less hereditary innate release response mechanism, when it is properly motivated, are called sign stimuli. Ethology enumerates many different motivations, some of them species-specific (Brier 1993b). Most living systems have great problems perceiving something that is not biologically, psychologically or socially anticipated. So from an ethological point of view one should include the action of the subject and its motivational value in a model of the dynamics of behavior. This brings ethology’s theoretical framework close to the crucial concept of intentionality in Husserl’s phenomenology. It therefore makes sense to view animal instincts, with their specific motivations for fighting, mating, hunting etc., as bio-psychological expectations or anticipations. Some of these anticipations are completely innate, such as the hunting and
mating behavior of the digger wasp, which has no contact whatsoever with its parents. It emerges from its egg, buried in a little cave under the surface of the ground, and eats the prey put there by its mother. It does not encounter other members of its species in its larval stage, and nobody teaches it anything. Still, the wasp is able to hunt and mate when it meets the relevant other living systems. So this is a rather closed, inheritance-based system. Some behavior systems in other species are partly open to variations in what is anticipated, as e.g. in the imprinting of ducklings, which will follow the first big, moving, sound-producing object they see within a certain time span after hatching and will later choose a mate like it. Many birds have a basic song but have to listen to other members of the species to learn the full song, and they will not learn the song of another species. But some bird species can include the songs of other species and natural sounds and make different kinds of variations, some even as ongoing improvisations. It is clear to Lorenz that emotions have functions and survival value. Wimmer (1995) gives a further development of this kind of science of emotion, which one can also find in a cybernetic version in Bateson’s (1972) work, where it is viewed as inter-individual relational logic. But the problem is that it is still a purely functionalistic description, not really able to explain how certain things and events become significant for living systems in such a way that they see them as signs of something emotional, existential and vital for their self-organized system (Brier 1992). The crux of the matter is the problem of the relation of motivation, intentionality and feelings to the experience of anything as being meaningful.
So far, no functionalistic model of explanation of behavior, perception or communication has been able, on its own, to account sufficiently for the wills and emotions of the minds of animals and man. This problem is also crucial in the discussion of what information is and what the foundation of information science should be. The foundation of meaningful experience, categorization and communication is the crucial question that cognitive science should solve, as strongly pointed out by, for instance, Lakoff (1987). On one side we have mathematical information theory, which in Wiener’s cybernetics is connected to thermodynamics. Cybernetics integrates the ideas of computing (formalized by Turing, among others), the idea of artificial intelligence and the functionalist information concept, and this mixture is often used in the cognitive sciences. Today these trends are united in “The Information Processing Paradigm” (see Brier 1997 for further argumentation and documentation). On the other side we have phenomenology, hermeneutics and semiotics. They are the traditional humanistic disciplines of meaning, signification, mediation, interpretation and cultural consciousness, and their conceptual foundations do
not allow them to encompass the areas of science, not even biology. The cognitive information sciences, partly based on first order cybernetics, have run into a situation of powerlessness in their attempts to find the algorithms of intelligence, informational meaning and language (Winograd & Flores 1986, Dreyfus & Dreyfus 1996, Searle 1989 and Penrose 1995). This structural approach has great problems with the phenomena of context and signification and how they interact. Its proponents want to understand everything, including consciousness and meaning, algorithmically (Lakoff 1987), based on the old belief that both the world and the mind are structured mathematically and that mathematics in the end is the Logos, be it in the One, the unmoved mover, the pure nature of the vacuum field or the Christian God. When such foundational limitations are realized, it is surely time to develop one’s conceptual foundation further, as Bohr pointed out (Bohr 1954). I believe that Peirce’s semiotics, with its ability to be the foundation of a biosemiotics (Sebeok 1979, Hoffmeyer 1996), can establish such a new transdisciplinary foundation, integrating the new results from other, more specific disciplines such as second order cybernetics, cognitive semantics and pragmatic language philosophy. I will show that Bateson (1972), and later the new second order cybernetics of von Foerster, have developed some fruitful concepts on the self-organization of cognition. Furthermore, the concepts of autopoiesis and structural coupling of Maturana and Varela, which were developed in the same tradition, bring us important steps forward. But I also want to show that these are not sufficient to explain how meaningful communication is possible (Fogh Kirkeby 1997).
To this end I integrate concepts from Peirce’s semiotics, which offer an alternative philosophical foundation to mechanistic materialism on the one hand and pure constructivism on the other, in the form of an objective idealistic, realistic, evolutionary and pragmatic philosophy. The argument builds on the papers of mine listed in the references; I refer to the most relevant of them as I unfold the argument, for those who want to examine a more detailed argumentation.

Bateson’s information and Maturana & Varela’s autopoiesis concepts

Bateson (1972) brought cybernetic information science a step further when he laid the basis for second order cybernetics by stating that “information is a difference that makes a difference”. For something to be perceived as information, it has to be of relevance for the survival and self-organization of a living system and therefore has to be anticipated to some degree. Later, Maturana and Varela coined the term autopoiesis to underline the organizational closure of the living system, including the nervous system. I have argued in Brier (1992) that this improves and develops Bateson’s points by giving a more explicit and cybernetic theory of the observer, while at the same time seeing observation as a cybernetic process where “things” or “percepts” emerge only when they are able to obtain a dynamic stability in
the perceptual system. Von Foerster (1984) uses mathematical functors as an example of the establishment of “eigenvalues” in the perceptual system. Only eigenfunctions (recursive functions whose results settle on a stable output) are perceived as stable “objects”. Reventlow (1977) coined the term “rependium” for these sudden, often irreversible, reorganizations in cognition that make us see things as objects (see Brier 1993b for further explanation). Through the perceptual apparatus, the nervous system is perpetually perturbed by stimuli that disturb its own firing patterns, but perception is possible only if structural couplings have been formed in advance in the process of evolution, so that the perturbation of the autopoietic system was in a way anticipated. A way to combine these different concepts and frameworks is to use Peircean biosemiotics to say that perturbations that fall within a structural coupling will generate information inside the system only by generating a sign or rather, as Peirce puts it, by generating the interpretant that makes the connection between the representamen and the object, which is then seen as an ‘object’. What ethology calls IRM and sign stimuli (Lorenz 1970-71) seems to fit very well into this model, the IRM being one kind of structural coupling. Other members of the species are also part of the surroundings, and are likewise recognized through pre-established structural couplings. Maturana and Varela are opposed to the cognitive science concept of information. They oppose the view of organisms according to which “...their organization represents the ‘environment’ in which they live, and that through evolution they have accumulated information about it, coded in their nervous systems. Similarly, it has been said that the sense organs gather information about the ‘environment’ and through learning this information is coded in the nervous system” (Maturana & Varela 1980: 6).
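Von Foerster’s notion of an “eigenvalue”, mentioned earlier in this passage, can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions, not von Foerster’s own formalism: the cosine function stands in for a recursive perceptual operation, and the names `find_eigenvalue`, `tol` and `max_iter` are hypothetical. Applying the operation again and again to its own output settles on the same stable value whatever the starting point, which is the sense in which a stable “object” is an outcome of recursion rather than something picked up from outside.

```python
import math


def find_eigenvalue(operation, start, tol=1e-12, max_iter=10_000):
    """Apply `operation` to its own output until the value stops changing.

    Returns the stable value, i.e. a fixed point x with x == operation(x)
    up to the tolerance `tol`. Raises ValueError if the recursion has not
    settled within `max_iter` steps.
    """
    value = start
    for _ in range(max_iter):
        next_value = operation(value)
        if abs(next_value - value) < tol:
            return next_value
        value = next_value
    raise ValueError("the recursion did not settle on a stable value")


# Assumption for illustration: cos(x) plays the role of the recursive operation.
# Very different initial "perturbations" converge on the same eigenvalue.
stable_values = [find_eigenvalue(math.cos, start) for start in (-3.0, 0.0, 1.0, 10.0)]
```

Not every operation has such a stable value. For one that does not, the sketch raises an error, which can loosely be read as a perturbation that never stabilizes into a percept.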
Maturana and Varela are of the opinion that the organizationally and structurally oriented account of living systems does not require recourse to a conventional notion of “information”, which they rightly also attach to the traditional idea of coding. They explicitly reject the cognitive science view of cognition as the nervous system picking up information from the environment, and the conceptualization of the cognitive processes in the brain as information processing. They clearly see that this leads to the conventional (cognitive) account of language as “.... a denotative symbolic system for the transmission of information.” (Maturana & Varela 1980: 30) But Maturana and Varela both seem to believe that it is possible to build a grand theory of life, cognition and communication on a biological constructivism as epistemological basis. It is important to notice that Maturana and Varela’s theory of autopoiesis does not have a phenomenologically or social-communicatively reflected philosophical basis. They acknowledge the inner emotional world of living systems, but they have no phenomenological
or social theory of emotion, meaning and signification. Maturana is developing a biological behavioral theory of love that includes the social but lacks a concept of communicational meaning and culture. The theoretical construct is based on a cybernetic biology of self-organization, so the only account of meaning they can offer is the functionalistic description of the structural coupling. I think that both autopoiesis and structural coupling are fruitful concepts in cognitive biology, but the foundation of probabilistic cybernetic biology is not transdisciplinary enough to include a theory of signification encompassing the phenomenological aspect of reality (see also the analysis in Hayles 1999). The answer, seen from the viewpoint of the phenomenological semiotics of C. S. Peirce, is that the structural couplings establish the possibility for semiosis and are driven by the necessity for autopoietic systems to establish a semiotic domain or signification sphere (Uexküll’s Umwelt). Second-order cyberneticians and autopoieticians do not have the triadic sign concept in their theory, and some are opposed to it. But I think that it is possible to fuse the two theories here, as they both are of second order (Brier 1992). All the elements in Peirce’s semiosis are signs. It is worth noticing that Peirce’s triadic, phenomenological and pragmaticistic semiosis is very different from the cognitive theories of symbolism that autopoieticians and second order cyberneticians distance themselves from.
The relation I see between the concepts and models of ethology, autopoiesis and semiotics in this case can be briefly summed up like this: it is the structural coupling that makes it possible for something (a difference, the immediate object), given the right motivation, to create inside the living system an interpretant that discriminates the object/difference as a sign, namely as vital, meaningful information that must release behavior, as when, for example, a sign stimulus triggers an innate releasing mechanism (IRM), which sets a fixed action pattern in motion. Note that the information concept here is intersubjectively based, but on the other hand it is developed in exchange with the environment through evolution and is therefore compatible with that reality. So Maturana and Varela are right when they say that information is something that is socially ascribed to a process by other observers. It is only through collective language that we can make the conscious reflection: “I received information from this interaction”. As said above, a framework for cognition, information and communication must simultaneously take its departure in the scientific (objective), the phenomenological (subjective) and the social-linguistic (intersubjective); none of these aspects can be left out. It is important to remember that language is the prerequisite for all science, as are our biology and our inner world of emotion, meaning and will. One must work simultaneously with all three if one wants to create a general framework for cognition, information and communication. Second order cybernetics has interesting aspects to offer here.

Second order cybernetics’ contribution to a bio-phenomenological framework

As an example, Heinz von Foerster (1986) has developed some very interesting thoughts about the dual evolution of the biological system and the life-world, or “Umwelt”, that it creates. His theory is closely related to Maturana’s idea of the co-evolution of autopoietic systems and environment, but von Foerster’s theory has an interesting epistemological and ontological turn, illustrating how organisms “carve out” realities of the Universe through evolution. Just as we cannot, in the theory of general relativity, speak of an absolute time or absolute space, so we cannot, in von Foerster’s bio-psychological theory of cognitive systems, talk of an absolute reality or environment. All systems travel with their own environment, as von Uexküll also pointed out in his “Umweltslehre” (Uexküll 1973 and 1986). Still, both theories retain a vague idea of one Universe, the independent something that everything evolved from. So one might conclude that the universe is not a reality, but a metaphysical construct made by theories produced in our scientific worlds. But these theories are in turn based on the cognitive skills we have developed in evolution, which guarantees their survival value and thereby their ‘reality’. They have a basis shared with most of the other vertebrates. This is also one of Lorenz’s (1970-71) arguments when he points out that Kant’s categories must have evolved in the evolution of animal species into the human species. So the world might be a construct, but it is all we have, based on millions of years of perceptual experience. Von Foerster (1986: 87-88) concludes from this type of argument that the stability of our worldview and of our concepts of things and categories is the outcome of a converging pressure for communicability. This epistemological foundation of second-order cybernetics connects it with important points in Heidegger’s phenomenology.
The important point from Heidegger (1962) is that, as observers, we are always already a part of the world when we start to describe it. We cannot have what Lakoff (1987) calls an “external realism”, but only an “internal realism”, as we are in the world. Our science works from within time and space, as Prigogine and Stengers (1986) also point out. When we start to describe the world, we to a certain degree separate ourselves from the wholeness of the world of our living praxis. A great part of our communication and thinking is not of our own doing. It is biological evolution and cultural history that signify through us and, as Karl Popper (2004) has pointed out, history cannot be given a deterministic, lawful description. Here, too, he is probably inspired by Peirce. Maturana (1983 and 1988) has, in the same line of thinking as von Foerster, pointed out that there is an ongoing interaction between the autopoietic system and its environment. They co-evolve in a non-deterministic historical drift. Organisms that live together become surroundings for
each other, coordinating their internal organization, and, finally, languaging (not the same as language) is created as a coordination of coordinations of behavior. So there is a complicated psychobiological development and dynamic-system organization behind cognition and communication. The aspects of the processes of mind that can be modeled in classical logical terms do not seem to have any special position in, or control over, how the intentions, goals and ideas of the system are created. Furthermore, the elementary processes of which this system consists do not seem to be made of classical mechanistic information processing, but of a self-organized, motivated dynamics. Communication between members of the same species is then explained in second order cybernetics as a double structural coupling between two closed systems, each internally creating information (von Foerster 1993). As von Foerster (1986) also underlines, the environment is established only by stipulating a second observer. Luhmann (1990 and 1992) has developed this approach into a general communication theory distinguishing three different levels of autopoiesis: the biological, the psychological and the social communicative. Building on both Maturana’s autopoiesis theory and von Foerster’s second order cybernetics, Luhmann expands both into a sociological theory based not on the actions of subjects, but on communication as a self-organized system in itself. It functions on the basis of, and in between, psychobiological interpenetrated autopoietic systems, which it uses as environment because, although they are alive, perceiving, thinking and feeling, only communication can communicate (Luhmann 1992: 251). The two other systems are silent. He thus underlines that our mental system is also self-organized and closed around its own organization, although it still depends on the functioning of the biological autopoietic system.
Even in the social realm, messages are received only if they fall within the anticipated spectrum of structural couplings called generalized media, such as power, money, love, art, science etc. In humans the complexity of the environment is reduced against the background of meaning (Luhmann 1990 and 1995, discussed in Brier 1992, 1996a and 2002). Luhmann attempts to generalize autopoiesis from the biological realm, where Maturana and Varela defined it, to both the psychological and the social communicative spheres and, as argued above, I find this a more solid approach for a general theory of cognition, information and communication. But unfortunately he does not develop a profoundly philosophical theory of signification encompassing the phenomenological semiotic sphere. He stays mostly at the social level, somewhat inspired by Husserl’s ideas. Although, inspired by Husserl’s phenomenology, he uses the concepts of meaning and interpretation as active selections as an important part of his theory, he does not have a semiotic theory of signification that deals with the origins of sign vehicles and how they come to obtain meaning through signification, as
C. S. Peirce does in his semiotics. But the fruitful thing about Luhmann’s theory is that social communication is basic to his understanding of cognition and communication from the start. Communication is his basic social concept. Society is a system of communications. Luhmann (1992: 251) writes: “I would like to maintain that only communication can communicate and only within such a network of communication is what we understand as action created”. Maturana developed his theory in this direction in later years, but it does not have Luhmann’s sophistication on the sociological level. All animals live in interbreeding groups, so in cognitive biology, too, the concept of social communication is vital. As an ethologist I therefore think it is important to extend to animals Luhmann’s emphasis on the importance of the social in understanding signification and communication, by distinguishing between biological and cultural meaning. Biological meaningfulness, which also dominates humans in, for example, spontaneously aggressive or sexual responses, is primarily non-linguistic and emotionally borne. In ethology, one says that ritualized instinctive behavior becomes sign stimuli in the coordination of behavior between, for instance, the two sexes of a species in their mating play. So, as is already the case in the language of ethology, a piece of behavior or the coloration of plumage in movement becomes a sign for the coordination of behavior in a specific mood, such as mating. It is the mood and the context that determine the biological meaning of these signs, which are true triadic signs in the biosemiotic interpretation of Peirce’s triadic and evolutionary semiotics. Ethologists have never deliberated on the foundations of the sign concept they used in their theory, but I think that Peirce’s will be the most fitting, as Saussure never worked with signs outside human language and culture.
Peirce’s triadic sign also describes sign processes in a dynamic way that fits with evolutionary biology and therefore with ethological processes. Instinctive sign communication is not completely arbitrarily established and is presumed to be emotionally significant to the animal (Lorenz 1971-72), which reflexes are not. A sign process needs a representamen, an object and an interpretant to communicate something about the object to somebody in some aspect, but not all possible representamens are signs. There are, for example, many habits of nature that we have not yet interpreted. Although a sign is about an aspect of reality (the object), there is no final and true representation. Instead of Kant’s “thing in itself”, Peirce operates with a “dynamical object”, sometimes even called the “ultimate object”, that is the ideal limit of all the “immediate objects” that are created through interpretants and interpretants’ interpretants, worked out through endless time by all scientists. As he points out, signs exist in communicative societies. Biosemiotics points out that this also includes animal communities such as, for instance, an anthill, a beehive, a school of fish or a group of higher apes. The interpretant is created
through an ongoing dynamic process in communicative systems. The sign represents the immediate object, which contains some aspect of the dynamical object. The immediate object is what the sign “picks up” from the dynamical object and mediates to the interpretant, based on the ground. From an ethological point of view it is the innate motivation, and thereby the whole IRM, that determines the ground, as in Freud, where it is likewise the (repressed) drive that determines what an entity or a situation is interpreted as. The concept of ground is an important aspect of Peirce’s sign theory that makes it possible to connect it to ethology on the one hand, to cognitive semantics on the other and finally to the language game theory of Wittgenstein. It is the entrance to the question of context, so important in the debate on the limitations of AI (Dreyfus & Dreyfus 1995), and to the difference between the immediate and the dynamical object. Ground is a belief habit, an expectation of a pattern, which mediates between the experiences of the past and the present. This past experience is both cultural and biological and, as Lakoff (1987) points out, this is what connects concepts with reality. This is also well established in AI. For example, Tschader (1997: 170) says that the semantic mapping of a concept is “grounded” in human experience in the real world. In ethology, it is the motivation that sets the ground and determines in what sign game (Brier 1995) and, within that, in what conceptual scheme a certain color or movement should be interpreted. As we usually do not accept that animals’ use of signs has the necessary syntactic structure and semantic generativity to be called language, we cannot call the communicative situations they participate in “language games”, as Wittgenstein (1958) does for humans. Holding on to the fruitfulness of Wittgenstein’s way of qualifying context in a dynamically social way through his concept of game, I suggest calling what animals do sign games.
Though animals are not linguistic beings living in the language game, as humans are, it is the most important point of biosemiotics that they live in external as well as internal sign games. This, then, is also the case for the human body. Animals are “signborgs”, not quite as much cultural products as the human languageborgs, but still, like all living systems, they live in a web of signs. Biosemiotics thus connects ethological knowledge and second order cybernetics with embodied cognitive semantics, and gives new insight into the combination of biological and cultural experience in the process of signification and communication. What makes this possible is, in my opinion, Peirce’s development of his profound triadic philosophy. We must not forget that we have so far moved only slightly beyond a purely functionalistic view of signification, although it is now placed in a second order and evolutionary perspective. We have so far not come up with a foundation for the creation of a world view within which the existence of the phenomenological aspect, first person experience, the value and force of emotion, and the meaning and willing
in cognition and communication can be placed in a consistent way. Modern natural scientific worldviews really do not give room to the psyche as a self-organized causal force, not even in Dennett’s evolutionary view.
Peirce’s triadic philosophy
Peirce is not a materialist, nor a mechanicist, and not even an atomist, as he believes, with Aristotle, that the substance of reality is continuous, that signs, concepts and regularities are real, and that we cannot remove the mental and emotional from basic reality, as we are connected within it (Nature) and as it is connected with us (Mind). But, unlike Aristotle, he is a material evolutionist, and he does not believe in classical logic penetrating to the ultimate depth of reality. Like the pre-Socratic philosophers, he believes that Chaos (Firstness) is the cradle of all qualities, which are manifested as particulars (Secondness) and through habit-taking (Thirdness) give rise to order, and not the other way around, with complexity arising from simple mathematical order. Through this combination we now have one big evolutionary narrative going into the human history of language-born self-consciousness, and we have left the mechanical-atomistic, and deterministic, ontology and its epistemology of the possibility of total knowledge (called world formula thinking by Prigogine and Stengers 1980). Evolutionary science is science within time, attempting to find relatively stable patterns and dynamical modes (habits). It is not a science of eternal laws. It is a science of the habits of evolution and the meaning they come to have for the living systems created in the process. Peirce does not have an atomistic worldview, and his idea of firstness is truly complex and chaotic, and potentially possesses the primary aspect of both the “inner” and “outer” world.
Firstness has a lot in common with the modern idea of the “quantum vacuum field” or space-time geometry fields (Brier 1996e), except that Peirce does not let firstness be devoid of potential qualia and emotions as is a basic ontological and epistemological principle in most contemporary views of natural science. His worldview is thus fundamentally anti-reductionistic and anti-mechanistic and truly evolutionary (Brier 1993a+b, 1996b). Peirce integrates emotions and qualia from the beginning in his metaphysics and thereby avoids the present problems of the sciences. Many scientists are in my opinion ruled unconsciously by their basic ontology of a mechanical reality based on a mathematical eternal order. In this view meaning, emotion and willing can only get functionalistic explanations and must in the end be determined as hallucinatory phenomenological processes with no real causal effects on the physiology of the body. Still how the quality of consciousness should ever be able to fit into any explanations in this paradigm is beyond my imagination. It has never been truly established that mechanicism was an adequate philosophy for biology, especially for ethology (Brier 1993b and 1998a). The implication of Peirce’s philosophy and method is that qualia and
“the inner life” are potentially there already from the beginning, but they need a nervous system to get to a full manifestation. The point is that organisms and their nervous systems do not create mind and qualia. The qualia of mind develop through interaction with those nervous systems that the living bodies develop into still more self-organized manifested forms. Peirce’s point is that this manifestation happens through the development of triadic semiosis. We become conscious through the semiotic development of the living systems and their autopoietic signification-spheres in sign games for shared communication, which finally evolve into human language games. This is the new foundation I suggest: that biosemiotics and evolutionary epistemology can be supported by, and are able to integrate, recent developments from ethology, second order cybernetics, cognitive semantics and pragmatic linguistics in a fruitful way, leading to a new transdisciplinary view of cognition and communication. To combine the ethological, the autopoietic and the semiotic description one can say the following: Meaning is habits established as structural couplings between the autopoietic system and the environment. “Objects” are cognized in the environment, through abduction, by attaching sign habits to them related to different activities of survival such as eating, mating, fighting, and nursing, which we, with Wittgenstein (see note 1), call life forms (Brier 1995 and 1996a) in a society (animal or human), and are thereby constituted as meaningful. We thus take a step forward in the understanding of how signs get their meaning and produce information inside communicative systems, as we see information as actualized meaning in shared sign or language games. This is an alternative to the transdisciplinary framework of Stonier (1997), which builds on a concept of objective information existing by itself (ultimately as “infons”).
Cybersemiotic framework of information and communication science
The cybersemiotic transdisciplinary framework delivers a bio-psycho-social framework for understanding signification that supplements and develops the original ethological models of animal cognition. So I claim that perception, cognition, anticipation, signification and communication are intrinsically connected in autopoietic systems in mutual historical drift in the creation of signification and sign categories. As Lakoff (1987) observed, the relations between categorical concepts are not logical but motivated, having their origins in the basic life forms and their motivated language games. Living systems are self-organized cognizant anticipatory autopoietic systems. With Spinoza, I will say that they have got Conatus! This means that the individuality of life systems values itself through its continuing efforts to preserve its own internal organization. But I also think that the knowledge developed in ethology and second order cybernetics can deepen and complement Lakoff’s efforts by giving a more profound concept of motivation and a more differentiated view of the experiential biological basis. For Peirce the ultimate
drive of evolution is love, the law of mind and the tendency to take habits. It is not a prefixed foundational mathematical order. This view makes a connection between humanities and the natural and social sciences possible, as well as between matter and mind, inside and outside, truth and meaning, causality and purpose without reducing one to the other. Thus, by joining the effort from the latest development of cybernetics and ethology with Peirce’s already transdisciplinary semiotics, and uniting them with Wittgenstein’s pragmatic language game theory we get a framework that is truly transdisciplinary. This cybersemiotic framework for information, cognitive and communication sciences also conceptualizes the anticipatory dynamics of all cognition in a more fruitful way than the information-processing paradigm. There is an active anticipatory element in all perception and recognition. It is not a pure mechanical process. Perception and cognition are active processes deeply connected to the self-organizing dynamics of living systems and their special ability to be individuals. Knowledge systems thus unfold from our bio-psycho-socio-linguistic conscious being. Their function is to orient us in the world and help us act together in the most productive way, but they do not explain us to ourselves. Peirce’s view, that we cannot split the concepts of mind and matter, is a very sound and profound basis from which to begin. I do not see any good reason why the inner world of cognition, emotions, and volition should not be accepted as just as real as both the physical world and the cultural world of signs and meaning. Embodied life, even single-celled life, is a basic component of constructing reality. We are thinking in —or maybe even with— the body. The psyche and its inner world arise within and between biological systems or bodies. Employing Peirce, one may claim that there will always be some type of psyche in every kind of biological autopoietic and dual code system. 
Nevertheless, a partially autonomous inner world of emotions, perceptions, and volitions only seems to arise in multi-cellular chordates with a central nervous system. Lorenz (1973) argues that such a system with emotions and experiences of pleasure is necessary for animals to have appetitive behaviors that motivate them to search for objects or situations that elicit their instinctive behavior and release the motivational urge built up behind it. This is qualitatively different from how reflexes function on a signal, which is on a proto-semiotic informational level. The signs of instincts function on a genuine semiotic level. It is obvious that what we call language games arise in social contexts where we use our minds to coordinate our willful actions and urges with fellow members of our society. Some of these language games concern our conceptions of nature as filtered through our common culture and language. But underneath that, we also have emotional and instinctual bio-psychological sign games. For humans these function as unconscious paralinguistic signs, such as facial mimics,
hand gestures, and body positions that originate in the evolution of species-specific signification processes in living systems. Luhmann’s theory of the human socio-communicative being consisting of three levels of autopoiesis can be used in cybersemiotics to distinguish between: 1. the languaging of biological systems, which is the coordination of behaviors between individuals of a species on a reflexive signal level (following Maturana), 2. the motivation-driven sign games of bio-psychological systems, and finally, 3. the language games level of the self-conscious linguistic human in the socio-communicative systems. A semiotic understanding has thus been added to Luhmann’s conception, and his theory is placed within Peircean triadic metaphysics. I will develop this further below, as there are also semiotic systems within the body and the psychological system and between them that can be pointed out and named for further use. We simultaneously have internal communication occurring between our mind and body. In Luhmann’s theory this differs from what Kull (1998) calls psychosomatics, as it is not a direct interaction with culture, but rather only with the psyche. Nor is it merely endosemiosis. The terms endosemiosis and exosemiosis were both coined by Sebeok (1976: 3). Endosemiosis denotes the semiosis that occurs inside organisms, and exosemiosis is the sign process that occurs between organisms. Endosemiosis became a common term in semiotic discourse (see von Uexküll et al. 1993) to indicate a semiotic interaction at a purely biological level between cells, tissues, and organs. Nöth (2001) introduced the term ecosemiotics to designate the signification process of non-intentional signs from the environment or other living beings that creates meanings for another organism, for instance, one that is hunting.
The sign signifying an organism is suitable prey is not intentionally emitted by the organism being preyed upon; it is therefore ecosemiotic rather than exosemiotic. Then what can we call the internal semiotic interaction between the biological and psychological systems within an organism? I call this interaction between the psyche and the linguistic system thought semiotics. This is where our culture, through concepts, offers possible classifications of our inner state of feelings, perceptions, and volitions. In their non-conceptual or pre-linguistic states, these are not recognized by conceptual consciousness (our life world). I shall therefore call them phenosemiotic processes (phenosemiosis). This is a reference to Merleau-Ponty, who, in the Phenomenology of perception, speaks of that aspect of awareness that is pre-linguistic, and claims that there are not yet even subject and object. But it is still semiotic in Cybersemiotic theory. As the interactions between the psyche and the body are internal, but not purely biological as in endosemiotics, I call the semiotic aspect of this
interpenetration between biological and psychological autopoiesis intrasemiotics. These terms remind us that we are dealing with different kinds of semiotics. We need to study more specifically the way semiosis is created in each instance. Today we realize that there are semiotic interactions between hormone systems, transmitters in the brain, and the immune system, and that these interactions are important for establishing a second-order autopoietic system within a multicellular organism. Such an organism is composed of cells that are themselves autopoietic systems, and these are organized on a new level into an autopoietic system. But we do not clearly understand the relations between this system and our lived inner world of feelings, volitions, and intentions. It appears that certain kinds of attention to bodily functions, such as imaging, can create physiological effects within this combined system. This is partly carried by different substances that have a sign effect on organs and specific cell types in the body (endosemiotics). We also know that hormonal levels influence sexual and maternal responses; fear releases chemicals that alter the state and reaction time of specific body functions, and so on. This is a significant part of the embodiment of our mind, but intrasemiotics seems to function as meta-patterns of endosemiotic processes. For example, our state of mind determines our body posture through the tightness of our muscles. There is a subtle interplay between our perceptions, thoughts, and feelings and our bodily state working, among other things, through the reticular activating system. There is much we do not yet know about the interaction between these systems. The nervous system, the hormonal system and the immune system seem to be incorporated into one large self-organized sign web.
The autopoietic description of living cybernetic systems with closure does not really leave space for sign production per se, and semiotics itself does not reflect very much about the role of embodiment in creating signification. Thus, the cybersemiotic solution to this problem is that signs are produced when the systems interpenetrate in different ways. The three closed systems produce different kinds of semiosis and signification through different types of interpenetration, plus a level of structural couplings and cybernetic “languaging”. Realizing that a signification sphere not only pertains to the environment, but also to the perception of other species’ members, cultural and proto-cultural behavior, and perceptions of one’s own mind and body-hood, I use “eco” as a prefix for the signification sphere when it pertains to non-intentional nature and culture external to the species in question. In inanimate nature, in other species, and in cultural processes, we can observe differences that signify meanings to us that were never intended by the object. Ecosemiotics focuses on the aspects of language that relate to how living systems represent nature within signification spheres, including language games in culture. Cybersemiotics suggests that the basis of these eco-language games
is the eco-sign games of animals, combined with a signification sphere created through evolution. Furthermore, these eco-language games are based on an intricate interplay between the living system and its environment, establishing what Maturana and Varela call “structural couplings.” The signification sphere is a workable model of nature for living systems that, as species, have existed and evolved throughout millions of years. This is also true for the human species, indicating that our language has a deep, inner connection to the ecology of our culture. Any existing culture is a collective way of ensuring a social system will survive ecologically. As such, the cybersemiotic theory of mind, perception, and cognition is realistic, but not materialistic or mechanistic. It builds on the inner semiotic connection between living beings, nature, culture, and consciousness carried by the three Peircean categories in a synechistic and tychistic ontology within an agapistic theory of evolution, thus delivering a philosophy beyond the dualistic oppositions between idealism (or spiritualism) and materialism (or mechanism). The cybersemiotic model provides a new conceptual framework within which these different levels of motivation can be represented and distinguished in ways not possible within frameworks of biology, psychology, and socioculture. A transdisciplinary framework can be constructed that supersedes some of the limitations of earlier divisions between disciplines by viewing meaning in an evolutionary light, as always embodied, and by seeing the body as semiotically organized, as in Peirce’s triadic worldview where mind as pure feeling is Firstness. This gives us hope that the cybersemiotic development of biosemiotics can contribute to a transdisciplinary semiotic theory of mind, information, cognition, communication, and consciousness.
Notes
1. I am aware that I am stretching the interpretation of Wittgenstein’s life form concept to be somewhat more concrete, and divided into smaller parts, than is usual.
References
Allen, Colin and Bekoff, Marc 1997. Species of Mind: The Philosophy and Biology of Cognitive Ethology. Cambridge: MIT Press.
Bateson, G. 1972. Steps to an Ecology of Mind. Frogmore, St. Albans: Paladin.
Bekoff, Marc, Allen, Colin, and Burghardt, Gordon M. 2002. The Cognitive Animal: Empirical and Theoretical Perspectives on Animal Cognition. Cambridge: MIT Press.
Bohr, N. 1954. “Kundskabens Enhed”. in Bohr, N. Atomfysik og menneskelig erkendelse, pp. 83-99. Copenhagen: J.H. Schultz Forlag.
Brier, S. 1980. Prize Essay in psychology about the fruitfulness of hierarchy - and probability deliberations in constructing models of motivation from behavioral analysis. Awarded with the Gold Medal in psychology of Copenhagen University.
__1992. “Information and consciousness: A critique of the mechanistic concept of information”. Cybernetics & Human Knowing, Vol. 1, (2/3): 71-94.
__1993a. “Cyber-Semiotics: Second-order cybernetics and the semiotics of C.S. Peirce”. Proceedings from the Second European Congress on Systemic Science, Vol. II: 427-436.
__1993b. “A Cybernetic and Semiotic view on a Galilean Theory of Psychology”. Cybernetics & Human Knowing, Vol. 2, (2): 31-45.
__1995. “Cyber-Semiotics: On autopoiesis, code-duality and sign games in bio-semiotics”. Cybernetics & Human Knowing, Vol. 3, (1): 3-14.
__1996a. “From Second-order Cybernetics to Cybersemiotics: A Semiotic Re-entry into the Second-order Cybernetics of Heinz von Foerster”. Systems Research, Vol. 13, (3): 229-244.
__1996b. “The Necessity of a Theory of Signification and Meaning in Cybernetics and Systems Science”. Proceedings of the Third European Congress on Systems Science, Rome, 1-4 October 1996, pp. 693-697. Rome: Edizioni Kappa.
__1996c. “Cybersemiotics: a new interdisciplinary development applied to the problems of knowledge organization and document retrieval in information science”. Journal of Documentation, 52 (3), September 1996: 296-344.
__1996d. “The Usefulness of Cybersemiotics in dealing with Problems of Knowledge Organization and Document Mediating Systems”. Cybernetica: Quarterly Review of the International Association for Cybernetics, Vol. 39 (4): 273-299.
__1996e. “Trust in the order of things: an extended review of Freya Mathews’ book The Ecological Self”. Systems Practice 9 (4): 377-385.
__1997. “What is a Possible Ontological and Epistemological Framework for a True Universal ‘Information Science’? The suggestion of a Cybersemiotics”. World Futures, Vol. 49: 297-308.
__1998a. “The Cybersemiotic Explanation of the Emergence of Cognition: The Explanation of Cognition, Signification and Communication in a non-Cartesian Cognitive biology”. Cognition and Evolution, Vol. 4, no. 1, pp. 90-102.
__1998b. “Cybersemiotics: A suggestion for a Transdisciplinary Framework for Description of Observing, Anticipatory and Meaning Producing Systems”. Proceedings from CASYS97: conference on anticipatory systems. Liege, Belgium: Chaos foundation. American Institute of Physics Conference Proceedings 437, pp. 182-193. Berlin: Springer Verlag.
__1999. “Biosemiotics and the foundation of cybersemiotics. Reconceptualizing the insights of Ethology, second order cybernetics and Peirce’s semiotics in biosemiotics to create a non-Cartesian information science”. Semiotica, Special issue on Biosemiotics, 127-1/4, 1999, 169-198.
__2000a. “Konstruktion und Information. Ein semiotisches re-entry in Heinz von Foersters metaphysische Konstruktion der Kybernetik zweiter Ordnung”. in Jahraus, Oliver, Nina Ort und Benjamin Marius Schmidt (eds.). Beobachtungen des Unbeobachtbaren, pp. 254-295. Weilerswist: Velbrück Wissenschaft.
__2000b. “Trans-Scientific Frameworks of Knowing: Complementarity Views of the Different Types of Human Knowledge”. Yearbook Edition of Systems Research and Behavioral Science, V.17, No. 5, pp. 433-458.
__2001a. “Cybersemiotics and Umweltslehre”. Semiotica, 134-1/4 (2001), 779-814.
__2001b. “Cybersemiotics: A reconceptualization of the foundation for Information Science”. Systems Research and Behavioral Science, 18, 421-427.
__2001c. “Ecosemiotics and Cybersemiotic”. Sign Systems Studies 29.1, 107-120.
__2002. “Intrasemiotics and Cybersemiotics”. Sign Systems Studies 30.1, 113-127.
__2002a. “Varela’s contribution to the creation of Cybersemiotics: The calculus of self-reference”. ASC-column, Cybernetics & Human Knowing, 9, 2: 77-82.
__2002a. “The five-leveled Cybersemiotic Model of FIS”. Trappl, R. (ed.): “Cybernetics and Systems vol. 1, 2002”, Austrian Society for Cybernetic Studies. 1: 197-202.
__2003. “The Cybersemiotic model of communication: An evolutionary view on the threshold between semiosis and informational exchange”. TripleC 1(1): 71-94.
Churchland, Patricia Smith. 1986. Neurophilosophy: Toward a Unified Science of the Mind-Brain. Cambridge: MIT Press.
Churchland, Paul M. 1995. The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain. Cambridge: MIT Press.
Foerster, Heinz von 1984. Observing systems. California: Intersystems Publication.
__1993. “Für Niklas Luhmann: Wie rekursiv ist Kommunikation”. Teoria Sociologica 1(2): 61-88.
__1996. “From Stimulus to Symbol”. in McCabe, V. and Balzano, G.J. (eds.). Event and Cognition: An Ecological Perspective. New Jersey: Lawrence Erlbaum.
Hayles, N.K. 1999. How We Became Posthuman: Virtual Bodies in Cybernetics, Literature and Informatics. Chicago: The University of Chicago Press.
Heidegger, M. 1962. Being and Time: A Translation of Sein und Zeit by John Macquarrie and Edward Robinson. San Francisco: Harper.
Hinde, R. 1970. Animal Behavior: A synthesis of Ethology and Comparative behavior. Tokyo: McGraw-Hill.
Hoffmeyer, J. 1996. Signs of Meaning in the Universe. Bloomington: Indiana University Press.
Kirkeby, O.F. 1997. “Event and body-mind. An outline of a Post-postmodern Approach to Phenomenology”. Cybernetics & Human Knowing, Vol. 4, No. 2/3, pp. 3-34.
Kull, K. 1998. “Semiotic ecology: different natures in the semiosphere”. Sign Systems Studies, Vol. 26, pp. 344-364.
Lakoff, G. 1987. Women, Fire and Dangerous Things: What categories reveal about the Mind. Chicago: The University of Chicago Press.
Lorenz, K. 1935. “Der Kumpan in der Umwelt des Vogels”. J. F. Ornith., 83, 137-213, 289-413.
__1950. “The comparative method in studying innate behavior patterns”. Sym. Soc. Exp. Biol., 4, 221-268.
__1966. On Aggression. London: Methuen.
__1970-71. Studies in animal and human behavior I and II. Cambridge: Harvard Univ. Press.
__1973. Die Rückseite des Spiegels. München: Piper and Co.
Luhmann, N. 1990. Essays on self-reference. New York: Columbia University Press.
__1992. “What is communication?”. Communication Theory, Vol. 2, no. 3, pp. 251-258.
Maturana, H. R. 1983. “What is it to see?”. Arch. Biol. Med. Exp. 16: 255-269.
__1988. “Ontology of observing: The Biological Foundation of Self Consciousness and the Physical Domain of Existence”. The Irish Journal of Psychology, Vol. 9, (1): 25-82.
Maturana, H.R. & Varela, F.J. 1980. Autopoiesis and Cognition: The Realization of the Living. Boston: Reidel.
Nöth, W. 2001. “Introduction to Ecosemiosis”. Tarasti, ISI Congress papers, Nordic Baltic Summer Institute for Semiotic and Structural Studies Part IV, June 12-21 2001 in Imatra, Finland: Ecosemiotics: Studies in Environmental Semiosis, Semiotics of the Biocybernetic Bodies, Human/too Human/ Post Human, pp. 107-123.
Peirce, C.S. 1931-58. Collected Papers vol. I-VIII, Eds. Hartshorne and Weiss, Harvard University Press.
Penrose, R. 1995. Shadows of the Mind: A Search for the Missing Science of Consciousness. London: Oxford University Press.
Popper, K. R. 2004. The Poverty of Historicism. London: Routledge.
Prigogine, I. & Stengers, I. 1986. Order out of Chaos: Man’s new Dialogue with Nature. Toronto: Bantam Books.
Reventlow, I. 1970. Studier af komplicerede psykobiologiske fænomener. Copenhagen: Munksgaard. Doctoral thesis. University of Copenhagen.
__1977. “Om ‘dyrepsykologien’ i dansk psykologi og om dens betydning for begrebsdannelsen i psykologien, I”. Dansk filosofi og psykologi, bind 2. Filosofisk Institut, University of Copenhagen: 127-137.
Searle, J. 1989. Minds, Brains and Science. London: Penguin Books.
Sebeok, T. 1976. Contributions to the Doctrine of Signs. Bloomington: Indiana University.
__1979. The Sign & Its Masters. University of Texas Press.
Stonier, T. 1990. Information and the Internal Structure of the Universe. London: Springer Verlag.
__1992. Beyond Information: The Natural History of Intelligence. London: Springer Verlag.
__1997. Information and Meaning: An Evolutionary Perspective. Berlin: Springer Verlag.
Tinbergen, N. 1973. The Animal in Its World. London: George Allen & Unwin.
Uexküll, J. von 1973. “The Theory of Meaning”. in Thure von Uexküll 1982. Jakob von Uexküll’s “The Theory of Meaning”. Special issue of Semiotica, 42-1.
__1986. “Environment (Umwelt) and Inner World of Animals”. in Burghardt, G.M. (ed.). Foundations of Comparative Ethology. New York: Van Nostrand Reinhold.
Wimmer, M. 1995. “Evolutionary Roots of Emotions”. Evolution and Cognition, Vol. 1, (1): 8-50. Vienna: Vienna University Press.
Winograd, T. & Flores, F. 1987. Understanding Computers and Cognition. Norwood, New Jersey: Ablex Publishing Corporation.
Wittgenstein, L. 1958. Philosophical Investigations. New York: Macmillan Publishing.
Information and direct perception: a new approach
Anthony Chemero
Introduction
Since the 1970s, Michael Turvey, Robert Shaw, and William Mace have worked on the formulation of a philosophically-sound and empirically-tractable version of James Gibson’s ecological psychology. It is surely no exaggeration to say that without their theoretical work ecological psychology would have died on the vine because of the high-profile attacks from establishment cognitive scientists (Fodor and Pylyshyn 1981, Ullman 1981). But thanks to Turvey, Shaw and Mace’s work as theorists and, perhaps more importantly, as teachers, ecological psychology is currently flourishing. A generation of students, having been trained in ecological psychology by Turvey, Shaw and Mace at Trinity College and/or the University of Connecticut, are now distinguished experimental psychologists who train their own students in Turvey-Shaw-Mace ecological psychology. Despite the undeniable and lasting importance of Turvey, Shaw and Mace’s theoretical contributions for psychology and the other cognitive sciences, their work has not received much attention from philosophers. It will get some of that attention in this paper. I will point to shortcomings in the Turvey-Shaw-Mace approach to ecological psychology, and will offer what I take to be improved versions of two important aspects of it. In particular, I will describe theories of information and of direct perception that differ from the Turvey-Shaw-Mace account. Given the debt that those of us who wish to pursue ecological psychology owe to Turvey, Shaw and Mace, this, no doubt, seems ungrateful (see note 2). Perhaps it is. But I would argue that because of the success of the Turvey-Shaw-Mace approach to ecological psychology, the field has become a true contender in psychology, cognitive science and artificial intelligence. Given the stability of ecological psychology and its standing as a research program, it can withstand some questioning of the assumptions on which its current practice is founded.
This is especially the case if the questioning is aimed at firming up foundations rather than tearing down the house.
Gibsonian ecological psychology and the Turvey-Shaw-Mace approach
Gibson’s ecological theory of vision (1979) was intended as a direct response to the increasing dominance of computational theories of mind, according to which perception and thought are rule-governed manipulations of internal representations. Gibson’s ecological approach to perception has three major tenets. First, perception is direct, which is to say that it does not involve computation or mental representations. That is, Gibson thought that perception was not a matter of internally adding information to sensations. Second, perception is primarily
for the guidance of action, and not for action-neutral information gathering. We perceive the environment in order to do things. The third tenet follows from the first two. Because perception does not involve mental addition of information to stimuli, yet is able to guide behavior adaptively, all the information necessary for guiding adaptive behavior must be available in the environment to be perceived. Thus the third tenet of Gibson’s ecological approach is that perception is of affordances, i.e., directly-perceivable, environmental opportunities for behavior. Affordances, as Gibson was well aware, are ontologically peculiar:
[A]n affordance is neither an objective property nor a subjective property; or it is both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer. (Gibson 1979: 129)
Despite this ontological peculiarity and the controversy over how best to understand affordances (Turvey 1992, Reed 1996, Chemero 2003a, Stoffregen 2003, Scarantino 2004), the idea of affordances, divorced from their relation to direct perception, is the one aspect of Gibson’s theory that gained significant attention from the beginning, e.g., from designers (see Norman 1986). The rest of Gibson’s ideas were not widely accepted by cognitive scientists upon their appearance. Indeed, as noted above, they were subjected to withering criticism from an establishment in psychology that was committed to understanding perception and cognition as computational manipulations of internal representations of the environment. The ecological approach was not helped by Gibson’s writing style, which, though highly readable, was often imprecise. Enter Turvey, Shaw and Mace.
Along with a few colleagues, Turvey, Shaw and Mace wrote a series of papers outlining a detailed philosophical account of the ontology and epistemology of Gibson’s ecological approach (Shaw and MacIntyre 1974; Mace 1977; Turvey 1977; Turvey and Shaw 1979; Shaw, Turvey and Mace 1980; Turvey, Shaw, Reed and Mace 1981; see note 3). The most complete and rigorous of these papers is Turvey et al’s 1981 reply to criticism from Fodor and Pylyshyn, so I will focus my discussion of the Turvey-Shaw-Mace approach on this work. The goal of this paper, stated in the first sentence, is to provide a more precise explication of Gibson’s work, specifically his claim that “there are ecological laws relating organisms to the affordances of the environment” (Gibson 1979: 237). There are four key notions here, which come in pairs: the first pair is affordance and effectivity; the second is ecological law and information. I will look at them in order, suppressing as much formalism as possible. On the Turvey-Shaw-Mace view, an object X affords an activity Y for an organism Z just in case there are dispositional properties of X that are complemented by dispositional properties of organism Z, and the manifestation of those dispositional properties is the occurrence of activity Y. Conversely, an organism Z can effect the activity Y
with respect to object X just in case there are dispositional properties of Z that are complemented by dispositional properties of object X, and the manifestation of those dispositional properties is the occurrence of activity Y. The idea here is that affordances, or opportunities for behavior, are tendencies of things in the environment to support particular behaviors, and effectivities are abilities of animals to undertake those behaviors in the right circumstances. Thus, a copy of Infinite Jest has the affordance ‘climbability’ for mice in virtue of certain properties of the book (height, width, stability, etc.) and of the mouse (muscle strength, flexibility, leg length, etc.); the mouse has the effectivity ‘being-able-to-climb’ in virtue of the same properties of the mouse and the book. The dispositional affordance and effectivity complement one another in that the climbing-of-book-by-mouse occurs only when the climbability and the being-able-to-climb interact. This, according to the Turvey-Shaw-Mace view, is what affordances and effectivities are.

To understand how organisms perceive and take advantage of affordances, and, in particular, how they do so directly, Turvey et al define information and natural law. As with affordances and effectivities, the definitions of information and ecological law interact. Ecological laws, according to the Turvey-Shaw-Mace view, are quite different than they are according to what they term the ‘establishment/extensional analysis’. Most of the differences don’t matter to us here, so I will focus on just one key point of ecological laws: their being bound to contexts. According to Turvey et al, ecological laws are defined only within settings and do not apply universally. Thus, the ecological laws relating to things in the niche of mice do not necessarily hold in outer space, or even in the niches of mackerel or fruit flies.
So, instead of taking laws to be universal relationships between properties as the ‘establishment/extensional analysis’ does, Turvey et al say that properties-in-environments specify, or uniquely correspond to, other properties-in-environments. The most important ecological laws on the Turvey-Shaw-Mace view are those relating ambient energy to properties in the environment, e.g., those relating patterns in the optic array to affordances. Thus, in virtue of ecological laws, particular patterns of the ambient optic array specify the presence of affordances in particular environments. It is this specification that allows the arrays to carry information about the affordances: because there is a lawful connection between patterns in ambient energy and the properties specified by those patterns, organisms can learn, or be informed about, the properties by sensing the patterns. Of course, among the properties about which information is carried in the array are affordances. Here’s what we have so far: Ecological laws make it such that ambient arrays specify properties (including affordances) and this specification is what makes the arrays carriers of information. The presence of this kind of information underwrites direct perception. If the information required to guide behavior is
available in the environment, then organisms can guide their behavior just by picking that information up. Ecological laws guarantee that if a particular pattern is present in the optic array in a mouse’s niche, affordances for climbing by mice are also present. Hence perception of those properties can be direct.

This view of direct perception is clearly represented by Shaw’s principle of symmetry (Shaw and McIntyre 1974, Turvey 1990). We can represent the symmetry principle as follows. Let E = “The environment is the way it is”, I = “The information is the way it is”, and P = “Perception is the way it is”. Also, let ‘>’ stand for the logical relation of adjunction, a non-transitive conjunction that we can read as “specifies”. Then, the symmetry principle is [(E > I) & (I > P)] & [(P > I) & (I > E)]. In English, this says that “That the environment is the way it is specifies that information is the way it is and that information is the way it is specifies that perception is the way it is, and that perception is the way it is specifies that information is the way it is and that information is the way it is specifies that the environment is the way it is.” We can simplify this to say that the environment specifies the information, which specifies perception, and perception specifies the information, which specifies the environment. This principle is symmetrical in that the environment, information and perception determine one another. This, on the Turvey-Shaw-Mace view, is what it is for perception to be direct. By law, the environment determines the information, which determines the perception. This makes the perception a guarantee of the presence of the information and also of the environment. So direct perception is perception that, by ecological law, is guaranteed accurate.

Issues with the Turvey-Shaw-Mace approach

The Turvey-Shaw-Mace approach is a sensible and faithful account of an epistemology and ontology to accompany Gibsonian ecological psychology.
I think, though, that there are problems with the account. Over the last several years, I have developed an alternative ontological and epistemological background for ecological psychology, one that attempts to be equally faithful to Gibson’s vision. Since I have written at length about differences between my views and those of Turvey, Mace and Shaw concerning affordances (Chemero 2003a) [4], I will restrict my comments here to differences concerning direct perception and information. The main problem with the Turvey-Shaw-Mace account is that, by insisting that information depends upon natural law, they have made it such that there is too little information available for direct perception. In particular, on the Turvey-Shaw-Mace view, there is no information about individuals, in social settings, or in natural language. I will discuss these in order.

On individuals

Because Turvey, Shaw and Mace take direct perception to be infallible, they insist
that it be underwritten by information, which is, in turn, underwritten by natural law. They are careful to maintain that the laws in question are ecological laws, laws that hold only in particular niches. Thus, laws need not be universal in order to allow information to be carried in the environment. But, of course, ecological laws must still be general in that they apply to a variety of individuals. For example, there would be an ecological law that connects a particular optical structure, a visible texture, to the bark of a particular kind of tree: in the environment of gray squirrels, say, optical structure O is present only when light has reflected off a silver maple. Note that making the ecological law niche-specific makes it so that the presence of optical pattern O in other environments, where lighting conditions or tree species differ, doesn’t affect O’s information carrying in the squirrel’s environment. So far so good, but in each gray squirrel’s environment there are a few trees that have special affordances in that, unlike most trees in the environment, they contain nests. There are no ecological laws relating these trees, as individuals, to properties of the optic array, so there is no information about these trees, as individuals, available to the squirrels. This, of course, does not apply only to trees. If information depends on laws, there is also no information about individual people available for perception. So although a human infant might have information available about humans, she has none about her mother. So, on the Turvey-Shaw-Mace view, either babies do not perceive their mothers (because the information for direct perception is unavailable) or they do not perceive them directly. I take it that either alternative is unacceptable to the ecological psychologists. 
On social and linguistic information

Another facet of the Turvey-Shaw-Mace requirement of law-like regularities for information to be present is that no information can be carried in virtue of conventions. Conventions hold, when they do, by public agreement or acquiescence and are thus easily violated. Because of an error at the factory or a practical joke, a milk carton may not contain milk and a beer can may not contain beer. This is true in any context in which milk cartons and beer cans appear. Similarly, through ignorance or dishonesty, spoken and written sentences can be false and words can be used to refer to non-standard objects. In fact, these things happen all the time even in the environments where the conventions in question are supposed to be most strongly enforced, e.g., at the grocery store or Presidential press conferences.

None of this is to imply that there is no information to be picked up at grocery stores or when the President speaks. Ecological laws determine the way that collections of aluminum cans in a cardboard box will structure fluorescent light and the way exhalations through vocal cords that pass by moving mouth, lips, tongue and teeth will structure the relatively still air. So there is information that there are cans on the shelf and that the President has said that he and Tony Blair use the same toothpaste. But, because these things
are merely conventionally determined, and conventions may be violated, there is no information concerning the presence of beer or the President’s toothpaste of choice. And since direct perception depends upon the presence of such information, we must, according to the Turvey-Shaw-Mace view, perceive that there is Boddingtons in the cans and that the President and Prime Minister use the same toothpaste either indirectly, or not at all. I would prefer theories of information and direct perception that allow children to directly perceive their mothers and beer cans to inform us about the presence of beer. This requires different accounts of what it is for perception to be direct and of the nature of information.
An alternative approach to direct perception

On the Turvey-Shaw-Mace approach, direct perception is defined as perception that is grounded in ecological law, and so is always accurate. Indeed, Turvey et al 1981 define perception itself as direct and law-governed (Turvey et al 1981: 245). As argued above, this rules out information about, and so direct perception of, individuals and things partly determined by convention. To make it possible for these things to be perceived directly, we need a different understanding of direct perception. In this section, I describe perception as direct when and only when it is non-inferential, where being non-inferential does not guarantee accuracy. Direct perception is perception that does not involve mental representations.

We can get started in seeing what this kind of direct perception is by looking at Brian Cantwell Smith’s notions of effective and non-effective tracking. We can see effective tracking in the shop-worn example of a frog tracking a passing fly. In terms of the physics of the situation, Smith points out, what we have is a continuously moving column of disturbance, beginning at the fly and ending at the frog. The key here is that this column-shaped disturbance is just one thing, and is not separable into frog, fly and intervening atmosphere, at least not in terms of physics. When a frog tracks a fly in this way, the frog and fly are coupled in a very strong sense: they are not separate things. The key for our purposes is that the tracking is a matter of constant causal connection among frog, fly and intervening air. This involves nothing worth calling a mental representation: in effective tracking, any internal parts of the agent that one might call representations are causally coupled with their targets. This effective tracking is direct perception.

We can also have direct perception during non-effective tracking. Often an animal must continue to track an object despite disruption of causal connection.
The frog, that is, must be able to continue to track the fly even when the light reflected from it is (temporarily) occluded. Frogs probably are not capable of this, and indeed it is hard to imagine something coming between a frog and a fly at a tongue-reachable distance. But this kind of non-effective tracking is the norm in vigilance in the animal kingdom. A nesting bird doesn’t lose track of the fox that is temporarily behind a rock. Non-effective tracking, though, also does not
require mental representation. There are three reasons for this. First, non-effective tracking could be accomplished just by causal connection and momentum. The head’s momentum keeps it going that way, and the bird’s eyes meet up with the light that is no longer occluded by the rock. Second, as Gibson points out, perception is an activity, and as such happens over time. So directly perceiving something may involve periods of time when it is being tracked effectively and periods when it is tracked non-effectively. Third, and this is getting ahead of myself because I haven’t said what information is yet, there is still information in the light about something that is temporarily occluded. Thus we can have direct, that is non-representational, perception even when tracking is non-effective. [5]

There are two relevant consequences of taking tracking as the model of direct perception. First, we can see that perception is, by definition, direct. Perception is always a matter of tracking something that is present in the environment. Because animals are coupled to the perceived when they track it, there is never need to call upon representations during tracking. Effective and non-effective tracking are non-representational, hence direct. Acts of conception, which Smith calls registration, may require representations. In conception or registration, there is a distancing and abstraction. It requires detachment in that the subject must “let go” of the object, stop tracking it (even non-effectively) for a while. The difference here is like that between knowing your nephew will come out from under the other side of the table, and knowing that you won’t see him again until next Thanksgiving. This latter requires abstraction in that the subject must ignore many of the details of the object to keep track of it.
When you’re effectively or non-effectively tracking your nephew, you are coupled with every detail of him: every freckle, individual hair, and shirt-wrinkle is moving in concert with your head and eyes. When this physical connection is broken completely, you lose or abstract away from much of this detail. This, it seems, will require something like a representation. But direct perception never does. The second consequence of taking tracking as the model of direct perception is that perception can be direct and mistaken. First, and perhaps obviously, when tracking is non-effective, it is possible for the animal to lose track of its object. The fox might stop behind the rock, yet the bird’s head and eyes might keep moving along the path that the fox was following. This kind of minor error is typically easily corrected, of course. Another possibility is when an animal is coupled with an inappropriate object. For example, the same optical pattern can be caused by a full moon and a light bulb on a cloudy night. And there will be the same sort of continuous column of disturbance connecting a moth to each. So the moth will be effectively tracking whichever of the two it happens to be connected with. When the moth is effectively tracking the light bulb, it is making a mistake. But this does not mean that it is tracking the bulb via a mental representation of the moon. For if it did, then it would also be tracking the moon via a mental representation
of the moon when it was doing things correctly and perception would never be direct. Instead, the moth is directly perceiving the moon or misperceiving the light bulb via what Withagen (2004) calls a non-specifying optical variable. A variable is non-specifying when its presence is not one-one correlated with some object in the environment. Withagen argues that, like the moth when it is coupled with the moon, many animals rely on non-specifying variables. Yet according to the Turvey-Shaw-Mace view, non-specifying variables do not carry information about the environment, and so cannot be used for perception, direct or otherwise. So to make sense of the moth’s effective coupling with the moon as a case of direct perception, we need a different theory of information, according to which non-specifying variables can carry information. (The same is true if we want to understand my perception of beer-presence in beer cans and meanings in words.)
An alternative approach to information

There is a theory of information that has considerable currency in cognitive science that is consistent with Gibsonian information: Barwise and Perry’s (1981, 1983) situation semantics, and the extensions of it by Israel and Perry (1990), Devlin (1991), and Barwise and Seligman (1997). Situation semantics is a good candidate here because Barwise and Perry’s realism about information was directly influenced by Gibson. Barwise and Perry (1981, 1983) developed situation semantics in order to, as they said, bring ontology back to semantics. That is, they were interested in a semantics based on how the world is, and not on minds, knowledge, mental representations, or anything else epistemic in character. Information, according to this view, is a part of the natural world, there to be exploited by animals, though it exists whether or not any animals actually do exploit it.

According to situation semantics, information exists in situations, which are roughly local, incomplete possible worlds. Suppose we have situation token s1 which is of type S1 and situation token s2 which is of type S2. Then situation token s1 carries information about situation token s2 just in case there is some constraint linking the type S2 to the type S1. Constraints are connections between situation types (figure 1). To use the classic situation semantics example (Barwise and Perry 1983, Israel and Perry 1990, Barwise and Seligman 1994), consider the set of all situations of type X, in which there is an x-ray with a pattern of type P. Because patterns of type P on x-rays are caused by veterinarians taking x-rays of dogs with broken legs, there will be a constraint connecting situations of type X with situations of type D, those in which there is a dog with a broken leg that visits a veterinarian.
Given this, the fact that a situation x is of type X carries the information that there is a situation d (possibly identical to x) of type D in which some dog has a broken leg (figure 2). For our purposes here, there are two things to note about this example. First, the constraint between the situation types is doing all the work. That is,
the information that exists in the environment exists because of the constraint, and for some animal to use the information, the animal must be aware of the constraint. This feature is true not just of the example of the unfortunate dog, but holds generally of information in situation semantics. The second point is that the constraint in the example holds because of a causal regularity that holds among dog bones, x-ray machines and x-rays. That is, the particular x-ray bears the information about the particular dog’s leg because, given the laws of nature and the way x-ray machines are designed, broken dog legs cause x-rays with patterns of type P. This feature of the example does not hold more generally of information in situation semantics. That is, constraints that hold between situation types are not just law-governed, causal connections. Constraints can hold because of natural laws, conventions, and other regularities. So, a situation with smoke of a particular type can bear information about the existence of fire by natural law, but it can also bear information about the decisions of tribal elders by conventions governing the semantics of smoke signals.

Even given this very sketchy description of the nature of information in situation semantics, we can see that this view of information can capture the kind of information that Gibson was interested in. We can see this via an example.
Figure 1. Situation token s1 carries information about s2 when there is a constraint linking type S1 to type S2.
Figure 2. A classic situation semantics example.
Imagine that there is a beer can on a table in a room that is brightly lit from an overhead source. Light from the source will reflect off the beer can (some directly from the overhead source, some that has already been reflected off other surfaces in the room). At any point in the room at which there is an uninterrupted path from the beer can, there will be light that has reflected off the beer can. Because of the natural laws governing the reflection of light off surfaces of particular textures, colors and chemical makeup, the light at any such point will be structured in a very particular way by its having reflected off the beer can. In situation s1, the light at point p has structure of type A. Given the laws just mentioned, there is a constraint connecting the situations with light-structure type A to the beer-can-present situations of type B. So, the light structure at point p contains information about token beer-can-presence b (of type B). Notice too that, because of conventional constraints governing the relationship between cans and their contents, beer-can-presence b being of type B carries information about beer-presence c of type C.

Furthermore, the light at some point in the room from which the beer can is visible will contain information about the beer can’s affordances. Take some point p, which is at my eye height. The light structure available at this point will contain not just information about the beer can and the beer, but also about the distance the point is from the ground, the relationship between that distance and the distance the beer can is from the ground, hence the reachability of the beer can and drinkability of the beer for a person with eyes at that height. Note that this example makes clear that on my view, but not Turvey-Shaw-Mace’s, constraints that connect situations are not limited to law-like connections but can also be cultural or conventional in nature.
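The beer-can example can be restated as a small computational sketch. This is only an illustration under stated assumptions, not part of situation semantics proper: the class `Constraint`, the function `carried_information`, the labels "law" and "convention", and the decision to let information chain across constraints (so that the light carries information about the beer via the can) are all hypothetical choices made here to mirror the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """A link between two situation types (hypothetical representation)."""
    source: str  # situation type whose tokens carry the information
    target: str  # situation type the information is about
    basis: str   # why the constraint holds: "law" or "convention"


# Illustrative constraints taken from the beer-can example in the text.
CONSTRAINTS = [
    Constraint("A: light structure", "B: beer can present", "law"),
    Constraint("B: beer can present", "C: beer present", "convention"),
]


def carried_information(token_type, constraints, allowed_bases):
    """Return the situation types that a token of `token_type` carries
    information about, following chains of constraints whose basis is
    in `allowed_bases` (chaining is an assumption of this sketch)."""
    found = []
    frontier = [token_type]
    while frontier:
        current = frontier.pop()
        for c in constraints:
            usable = c.source == current and c.basis in allowed_bases
            if usable and c.target not in found:
                found.append(c.target)
                frontier.append(c.target)
    return found


# A law-only reading: the light informs about the can, not the beer.
law_only = carried_information("A: light structure", CONSTRAINTS, {"law"})

# A reading that also admits conventional constraints: the light
# informs about both the can and the beer.
law_or_convention = carried_information(
    "A: light structure", CONSTRAINTS, {"law", "convention"}
)

print(law_only)
print(law_or_convention)
```

Restricting `allowed_bases` to law-like constraints roughly corresponds to the Turvey-Shaw-Mace requirement, while admitting conventional constraints corresponds to the situation-semantics reading defended here. The sketch leaves out reliability and the possibility of exceptions.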
The fact that some situation token contains information about some other token does not necessarily entail that the second situation token is factual. For example, the light at my point of observation contains information about the beer can and the beer can contains information about beer being present. Even though it is possible that, because of some error at the bottling plant that caused the can to be filled with water, there is no beer in the can, the beer-can presence can still carry information about beer presence. But according to Turvey-Shaw-Mace, the connection between the states of affairs must be governed by natural law. So according to the Turvey-Shaw-Mace view, beer-can presences don’t carry information about beer presences, and this is because the beer can is not connected by natural law with the presence of beer. This is also a feature of Dretske’s theory of information (1981) and has long been thought to be problematic. [6]

Situation theorists have typically argued that constraints need not be law-like connections between situation types. Barwise and Seligman (1994, 1997), for example, have argued that the regularities that allow the flow of information must be reliable, but must also allow for exceptions. Millikan (2000) makes a similar point. She distinguishes between informationL (information carried in virtue
of natural law) and informationC (information carried in virtue of correlation). Because constraints need only be reliable, and not law-like, non-specifying variables can carry information. Millikan also makes a valuable point concerning just how reliable non-specifying variables need be. On her teleosemantic view, the correlation between two events needs to be just reliable enough that some animal can use it to guide its behavior. Thus, information-carrying connections between variables can be fully-specifying, marginally significant, or anything in between, depending on the type of behavior that the variable provides information for.

This works well with the theory of what it is for perception to be direct, outlined in section 3 above. Remember that, according to this view, perception is direct when it is non-representational, the result of an informational coupling between perceiver and perceived. This says nothing about what kind of constraint allows the information to be available. Since the situation semantics theory of information allows information to be present with merely reliable constraints, constraints that hold only sometimes can underwrite direct perception. So we can directly perceive beer-presence, given beer-can presence, despite occasional mix-ups at the factory. And we can directly perceive the meaning in spoken sentences despite the fact that people lie or misspeak. Most importantly, I think, a developing child can directly perceive her mother, even though there are no laws of nature concerning individuals.

Compare and contrast: on specification and symmetry

I have already said that on the views of information and direct perception outlined here, there is information about, and so the possibility of direct perception of, individuals and socially-, culturally-, and conventionally-determined entities and states of affairs. This is already a marked difference between the view I outline and the Turvey-Shaw-Mace view.
Even more striking, and perhaps more troubling to some ecological psychologists, is the effect the views I have outlined have on Shaw’s principle of symmetry. Remember that the principle of symmetry is that (1) the environment specifies the information available for perception and that the information available for perception specifies what is perceived and (2) what is perceived specifies the information available for perception and that the information available for perception specifies the environment. There are, in other words, 1:1 correspondences between the environment and the information available for perception and between the information available for perception and what is perceived. This principle is, perhaps, the most important part of the Turvey-Shaw-Mace view of information and direct perception. Indeed as was noted above, information and direct perception are defined in terms of it. On the view described here, however, symmetry is not true. This is the case because on my situation-semantics-derived view, information does not depend on 1:1 correspondences. To repeat the example, on my view, there could be information about beer at my point of observation because light arriving there has been
reflected off an unopened Boddingtons can, but there may actually be no beer because the can might be full of something else. In fact, according to the view I’ve outlined, there is an important asymmetry at work here. The asymmetry in question here is partly an asymmetry in what we might call direction of fit. The environment to perception fit is, at least partly, causal, while the perception to environment fit is primarily normative. The can being the way it is causes the light to be the way it is at my point of observation, which sometimes causes me to perceive the beer in the refrigerator. But my perception, via the structure of the light, that there is beer in the refrigerator in no way causes there to be beer in the refrigerator. Instead, my perception fails, is incorrect, if there is no beer.

A second way the asymmetry of direction of fit shows up can be brought to light diagrammatically. In situation semantics, constraints connecting types of situations allow tokens of those types to carry information. So for example, because of various constraints concerning the way light reflects off surfaces, there are causal constraints connecting the type of situation in which my daughter is present to situations in which the optic array is structured in a particular way, and because of the way light interacts with me and my visual system, there will be constraints connecting these optical array structurings and my perception of my daughter. That is, constraint C1 connects Ava-present situation type E with Ava-array situation type A and constraint C2 connects Ava-array situation type A with Ava-perception situation type P. Constraints C1 and C2 are, of course, primarily causal. We can see this in the top part of Figure 3. This part of the figure, and this direction of fit from environment to perception, corresponds to the first part of the symmetry principle, E > I > P. In contrast, consider the lower part of Figure 3.
This depicts the relationship among tokens: this particular Ava-perception token p of type P is informative about a particular Ava-array token a of type A which is, in turn, informative about a particular Ava-presence token e of type E. This reflects a truism of situation theory: information “flows” among tokens in virtue of constraints among types. This lower part of the diagram corresponds to the second part of the symmetry principle, P > I > E. We can, then, see another way in which the different directions of fit are different: the environment to perception direction of fit is due to constraints among types and the perception to environment direction of fit is due to an informational relationship among tokens. On this view, Shaw and McIntyre were right that there is a two-way informational relationship between perception and the environment, but they were wrong in thinking that both directions of the relationship are the same.

Figure 3. The top part of the diagram is analogous to Shaw’s E > I > P; the bottom is analogous to his P > I > E.

Final words

In this paper, I have offered understandings of direct perception and information that differ from the ecological psychology orthodoxy, the Turvey-Shaw-Mace view. While their view takes perception to be direct when it is necessarily correct, on the view I have outlined, perception is direct when the perceiver and perceived are coupled and their relationship is unmediated by mental representations. While their view takes information to depend upon ecological laws and fully-specifying variables, my view takes information to depend upon constraints that may be only partly-specifying. I hope that I have said enough to make it clear that my alternative views comprise a coherent and attractive option for those interested in the ecological approach to psychology and those interested in embodied cognitive science. Indeed, I have argued elsewhere that the theory of information found in situation semantics ought to be appropriate for everyone in the cognitive and computing sciences. I have, of course, said nothing that makes the Turvey-Shaw-Mace orthodox view incoherent, though some of my arguments should make it less attractive.

Notes

1. For a representative sample, see (Port & van Gelder, 1995).
2. I should also point out that I owe them a personal debt. Though I was never formally a student of Shaw, Turvey, or Mace, each has been a patient corrector of my misinterpretations and has even encouraged me in the development of my competing views. They still think that I’m wrong.
3. A quick note on Edward Reed: Although Reed was an author on the paper in Cognition and spent his career working on a philosophically-sound version of Gibson’s ecological psychology, I think it makes more sense to speak of the Turvey-Shaw-Mace view and not the ‘Turvey-Shaw-Reed-Mace view’. This is because after working on the 1981 paper, Reed developed views that diverged both from that presented in the 1981 paper and from the one I’m presenting here.
4. Furthermore, Michael Turvey and I have recently come to think that our views of affordances are actually more similar than meets the eye. See Chemero and Turvey (forthcoming).
5. I should point out that there are some who would argue that there are mental representations involved, even in effective tracking. I have written about this at length elsewhere (Chemero 2000, 2001) and will not repeat myself here other than to say that, during tracking, claiming that some part of an animal represents some part of the environment provides no explanatory purchase. That is, it is only possible to pick out the part of the animal that is the representation once one already understands the system as a causally connected whole.
6. Note that everything said here about Turvey-Shaw-Mace is also true of Dretske’s classic probability-based theory of information (1981).
References
Barwise, J. and Perry, J. 1981. “Situations and attitudes”. Journal of Philosophy 77: 668-91.
__1983. Situations and Attitudes. Cambridge: MIT Press.
Barwise, J. and Seligman, J. 1994. “The rights and wrongs of natural regularity”. Philosophical Perspectives 8: 331-364.
__1997. Information Flow. Cambridge: Cambridge University Press.
Chemero, A. 2003a. “An outline of a theory of affordances”. Ecological Psychology 15: 181-195.
__2003b. “Information for perception and information processing”. Minds and Machines 13: 577-588.
Chemero, A. and Turvey, M. T. (forthcoming). “Complexity and ‘Closure to Efficient Cause’”. In K. Ruiz-Mirazo and R. Barandiaran (eds.). Proceedings of AlifeX 2006: Workshop on Artificial Autonomy.
Devlin, K. 1991. Logic and Information. Cambridge: Cambridge University Press.
Dretske, F. 1981. Knowledge and the Flow of Information. Cambridge: MIT Press.
Fodor, J. and Pylyshyn, Z. 1981. “How direct is visual perception? Some reflections on Gibson’s ‘ecological approach’”. Cognition 9: 139-196.
Gibson, J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton-Mifflin.
Israel, D. and Perry, J. 1990. “What is Information?”. In P. Hanson (ed.). Information, Language and Cognition. Vancouver: University of British Columbia Press.
Mace, W. 1977. “James Gibson’s strategy for perceiving: Ask not what’s inside your head, but what your head’s inside of”. In Shaw and Bransford (eds.). Perceiving, Acting and Knowing. Hillsdale: Erlbaum.
Millikan, R. 2000. On Clear and Confused Ideas. Cambridge: Cambridge University Press.
Norman, D. 1986. The Psychology of Everyday Things. New York: Basic Books.
Reed, E. 1996. Encountering the World. New York: Oxford University Press.
Scarantino, A. 2003. “Affordances explained”. Philosophy of Science 70: 949-961.
Shaw, R. and McIntyre, M. 1974. “Algoristic foundations to cognitive psychology”. In Weimer and Palermo (eds.). Cognition and Symbolic Processes. Hillsdale: Erlbaum.
Shaw, R., Turvey, M. and Mace, W. 1982. “Ecological psychology: The consequence of a commitment to realism”. In Weimer and Palermo (eds.). Cognition and the Symbolic Processes II. Hillsdale: Erlbaum.
Smith, B. C. 1996. On the Origin of Objects. Cambridge: MIT Press.
Stoffregen, T. 2003. “Affordances as properties of the animal-environment system”. Ecological Psychology 15: 115-134.
Turvey, M. 1977. “Preliminaries to a theory of action with reference to vision”. In Shaw and Bransford (eds.). Perceiving, Acting and Knowing. Hillsdale: Erlbaum.
__1990. “The challenge of a physical account of action: A personal view”. In H. T. A. Whiting, O. G. Meijer, and P. C. W. van Wieringen (eds.). The natural physical approach to movement control. Amsterdam: Free University Press.
__1992. “Affordances and prospective control: An outline of the ontology”. Ecological Psychology 4: 173-187.
Turvey, M. and Shaw, R. 1979. “The primacy of perceiving: An ecological reformulation of perception for understanding memory”. In L.-G. Nilsson (ed.). Perspectives on Memory Research. Hillsdale: Erlbaum.
Turvey, M., Shaw, R., Reed, E., and Mace, W. 1981. “Ecological laws of perceiving and acting: In reply to Fodor and Pylyshyn”. Cognition 9: 237-304.
Ullman, S. 1981. “Against direct perception”. Behavioral and Brain Sciences 3: 373-381.
Withagen, R. 2004. “The pickup of nonspecifying variables does not entail indirect perception”. Ecological Psychology 16: 237-253.
Revisiting the dynamical hypothesis
Tim van Gelder

There is a familiar trio of reactions by scientists to a purportedly radical hypothesis: (a) “You must be out of your mind!”, (b) “What else is new? Everybody knows that!”, and, later, if the hypothesis is still standing, (c) “Hmm. You might be on to something!”
Dennett
Introduction

Here are some claims about cognitive science, which, it seems to me, no sane person could deny:

1. Increasingly, cognitive scientists are using dynamics to help them understand a wide range of aspects of cognition.1
2. Dynamical systems theory and orthodox computer science are rather different disciplines; typical dynamical systems and typical digital computers are rather different kinds of things.
3. Dynamically-oriented cognitive scientists see themselves as understanding cognition very differently from their mainstream computational cousins.

These claims are my starting point. When I say that they are undeniable, I don’t mean to pretend that they are unproblematic. Indeed, for a philosopher of cognitive science, claims like these really just set up a challenge. What is going on here? What exactly is dynamical cognitive science, and how, more precisely, does dynamical cognitive science relate to orthodox computational cognitive science? And once we have reasonable answers to questions like these, even more interesting ones arise. What are the prospects for dynamical cognitive science? What does it tell us about the nature of mind? Is the mind really a dynamical system rather than a digital computer?

In a series of papers, I have taken a particular approach to addressing this broad cluster of issues. That approach begins with the observation that the essence of the mainstream computational approach to cognition is often taken to be encapsulated in Newell & Simon’s “Physical Symbol System Hypothesis,” the claim that “a physical symbol system has the necessary and sufficient means for general intelligent action” (Newell & Simon, 1976).
In the quarter-century since Newell & Simon put this hypothesis on the table, a lot of work has been done elaborating and articulating the core idea, and it is now more aptly expressed in the slogan that “cognitive agents are digital computers.” But if this slogan captures the essence of the mainstream computational approach to cognition, then the obvious parallel for dynamical cognitive science is the slogan “cognitive agents are dynamical systems.” The philosophical challenge is then to say what the dynamical slogan means, in a way that does justice not only to the key concepts but also to cognitive science as it is actually practiced “in the field.”
My basic stand on the meaning of the dynamical hypothesis (DH), and its place in cognitive science, was laid out in a paper that appeared in Behavioral and Brain Sciences (van Gelder, 1998b). However, the task of articulating broad, deep ideas about the nature of a whole discipline (especially a discipline like cognitive science, which seems to be in a constant state of flux) is one that can never be definitively completed. Just as cognitive science is constantly evolving, so our philosophical understanding of the nature of cognitive science must also evolve. Here I will consider some of the most interesting objections to the dynamical hypothesis, as formulated in the BBS paper, and consider how that formulation should be defended or adapted (Section 4). Before doing that, I will try to convey the flavor of dynamical cognitive science with a couple of illustrations (Section 2), and then present a précis of the basic dynamical hypothesis (Section 3).

Roughly speaking, you can tell dynamicists in cognitive science by the fact that their models are specified by differential or difference equations rather than by algorithms. However, describing the difference this way does little to convey the depth and interest of the contrast between dynamical cognitive science and its orthodox computational counterpart. In my experience the easiest way to do this is to begin with an example, one that is drawn not from cognitive science but from the history of the steam engine.

Computational versus Dynamical Governors

Imagine it is sometime in the latter part of the 18th century, and you need a reliable source of power to drive your cotton mills. The obvious choice is the newly-developed rotary steam engine, in which the back-and-forth motion of a steam piston is converted to the circular motion of a flywheel, which can then power your machines.
The problem, however, is that for quality output your machines need to be driven at a constant speed, but the speed of the rotary steam engine fluctuates depending on a range of factors such as the temperature in the furnace and the workload. So here is your engineering problem: design a device that can regulate or “govern” the engine so it runs at constant speed despite the myriad factors causing variation. The best way to control the speed of a steam engine is to adjust the throttle valve, which controls the amount of steam entering the piston. So the challenge becomes that of figuring out when, and by how much, to adjust the valve. From the vantage point of classical cognitive science, the proper approach seems obvious: attach a mechanism carrying out the following little algorithm:

1. Measure the speed of the flywheel;
2. Compare the actual speed against the desired speed;
3. If there is no discrepancy, return to step 1; otherwise
   a. Measure the current steam pressure.
   b. Calculate the desired alteration in steam pressure.
   c. Calculate the necessary throttle valve adjustment.
4. Make the throttle valve adjustment. Return to step 1.

Note that this mechanism, which we can call a computational governor, has to first take input in the form of measurements which result in symbolic representations of various aspects of the engine; then compute various quantities by manipulating symbols according to quite complex rules; and then convert the resulting specifications into actual throttle valve adjustments. The system is thus cyclic, (digital) computational, and representational in its design. An additional subtlety is worth noting: the only timing constraint on the operation of the computational governor is at the output, where the throttle valve adjustments are made. These must be made sufficiently often to control the speed within acceptable limits. Within that constraint, all other operations can happen at any time and at any speed. In this sense, timing within the computational governor is arbitrary; in other words, the device is in an interesting way atemporal.

Now, no doubt today it would be possible to build a governor working in this familiar computational fashion. However, this was not the way the problem could have been solved in the eighteenth century. Most obviously, digital computers capable of handling the relevant calculations wouldn’t be invented for another 150 years. The actual solution, developed initially by the Scottish engineer James Watt, was marvelously simple. It consisted of a vertical spindle geared into the main flywheel so that it rotated at a speed directly dependent upon that of the flywheel itself. Attached to the spindle by hinges were two arms, and on the end of each arm was a metal ball. As the spindle turned, “centrifugal” force drove the balls outwards and hence upwards. By a clever arrangement, this arm motion was linked directly to the throttle valve.
The result was that as the speed of the main wheel increased, the arms raised, closing the valve and restricting the flow of steam; as the speed decreased, the arms fell, opening the valve and allowing more steam to flow. The engine adopted a constant speed, maintained with extraordinary swiftness and smoothness in the presence of large fluctuations in pressure and load.

It is worth emphasizing how remarkably well the Watt governor actually performed its task. This device was not just an engineering hack employed because computer technology was unavailable. In 1858 Scientific American claimed that an American variant of the basic Watt governor, “if not absolutely perfect in its action, is so nearly so, as to leave in our opinion nothing further to be desired.”

The Watt governor is often known as the centrifugal governor, but it would be more accurate, and in the current context more convenient, to describe it as the dynamical governor, for the Watt governor is a classic example of a dynamical system as studied in dynamics textbooks. The key variable is the angle of the arms, $\theta$, whose behavior is described by the differential equation

$$\frac{d^2\theta}{dt^2} = (n\omega)^2 \cos\theta \sin\theta - \frac{g}{l}\sin\theta - r\,\frac{d\theta}{dt}$$

where $n$ is a gearing constant, $\omega$ is the speed of the engine, $g$ is a constant for gravity, $l$ is the length of the arms, and $r$ is a constant of friction at the hinges. This nonlinear, second order differential equation tells us the instantaneous acceleration in arm angle, as a function of what the current arm angle happens to be (designated by the state variable $\theta$), how fast arm angle is currently changing (the derivative of $\theta$ with respect to time, $d\theta/dt$) and the current engine speed ($\omega$). In other words, the equation tells us how change in arm angle is changing, depending on the current arm angle, the way it is changing already, and the engine speed. Note that in the system defined by this equation, change over time occurs only in arm angle $\theta$ (and its derivatives). The other quantities ($\omega$, $n$, $g$, $l$, and $r$) are assumed to stay fixed, and are called parameters. The particular values at which the parameters are fixed determine the precise shape of the change in $\theta$. For this reason, the parameter settings are said to fix the dynamics of the system.

In normal operation, of course, the dynamical governor is connected to the engine, which can also be described as a dynamical system. The two systems are said to be coupled, in this precise sense: the key variable in the engine system is its speed, $\omega$, which is a parameter in the pendulum system, and the key variable in the pendulum system, $\theta$, is a parameter in the engine system. When all is working as it should, the coupled engine/governor system has a stable fixed-point attractor, which is the desired constant speed.

The important lesson here is this: the dynamical governor is obviously very different from the computational governor. Instead of cycles of inputs, symbolic representations, rule-governed, atemporal computations, and outputs, we have the continual mutual influencing of two quantities.
This influencing is very subtle (though mathematically describable): the state of one quantity is continually determining how the other is accelerating and vice versa. This relationship is very unlike the relationship between a digital symbol and its referent.

Now, let me be the first to point out that the dynamical governor is not cognitive in any very interesting sense. It is invoked to convey the general flavor of dynamical systems and how they can interact with their “environments,” and for cognitive scientists, to spark the imagination and get the conceptual juices flowing. To really understand the distinctive character of dynamical cognitive science, we need to turn to the dynamical models themselves. As it happens, some of these models do actually bear a striking similarity to the dynamical governor/engine arrangement. One good example is a model developed by Esther Thelen and colleagues to account for that perplexing developmental phenomenon, the so-called “A not B” error.
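The contrast between the two governors described in this section can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions, not a reconstruction of any historical device: the toy engine model, the function names, and all numeric values (gains, arm length, friction, step sizes) are hypothetical. The first function runs the measure-compare-compute-adjust cycle of the computational governor, collapsing the steam-pressure steps into a single proportional adjustment; the second integrates the arm-angle equation for a fixed engine speed, so it shows the governor's own dynamics rather than the fully coupled engine/governor system.

```python
import math


def computational_governor(target=5.0, cycles=400, dt=0.1,
                           gain=0.01, power=10.0, drag=1.0):
    """Cyclic, rule-based control of a toy engine (all values hypothetical)."""
    speed, valve = 0.0, 0.0
    for _ in range(cycles):
        measured = speed                       # step 1: measure flywheel speed
        error = target - measured              # step 2: compare with desired speed
        if abs(error) > 1e-9:                  # step 3: discrepancy, so compute
            adjustment = gain * error          #         the valve adjustment
            valve = min(1.0, max(0.0, valve + adjustment))  # step 4: adjust
        # Toy engine (assumption): speed relaxes toward power * valve / drag.
        speed += dt * (power * valve - drag * speed)
    return speed


def arm_acceleration(theta, dtheta, omega, n=1.0, g=9.81, l=0.3, r=2.0):
    """Right-hand side of the governor equation for the arm angle theta."""
    return ((n * omega) ** 2 * math.cos(theta) * math.sin(theta)
            - (g / l) * math.sin(theta) - r * dtheta)


def dynamical_governor(omega=10.0, theta=1.0, dtheta=0.0,
                       dt=0.001, t_end=20.0):
    """Integrate the arm angle with fourth-order Runge-Kutta, omega held fixed."""
    for _ in range(int(round(t_end / dt))):
        k1t, k1v = dtheta, arm_acceleration(theta, dtheta, omega)
        k2t = dtheta + 0.5 * dt * k1v
        k2v = arm_acceleration(theta + 0.5 * dt * k1t, k2t, omega)
        k3t = dtheta + 0.5 * dt * k2v
        k3v = arm_acceleration(theta + 0.5 * dt * k2t, k3t, omega)
        k4t = dtheta + dt * k3v
        k4v = arm_acceleration(theta + dt * k3t, k4t, omega)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        dtheta += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return theta


if __name__ == "__main__":
    print("computational governor, final speed:", computational_governor())
    print("dynamical governor, final arm angle:", dynamical_governor())
```

With the engine speed held fixed, the arm angle in this sketch settles where the two position-dependent terms of the equation balance, that is, where cos θ = g / (l (nω)²). Note that the first function contains explicit measurements, comparisons and discrete cycles, while the second contains nothing but the equation itself and a numerical integrator used to explore it.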
Why do infants reach to the wrong place?

The classic A not B error, as originally discovered and described by Piaget, goes something like this. Suppose Jean is an infant roughly 7-12 months old. At this age Jean knows what he likes (e.g., a toy), and when he sees it he will reach out for it. If you hide the toy in one of two bins in front of him, Jean will reach towards the right bin. But if you hide the toy in bin A a few times, and then hide it in bin B, after a few seconds poor Jean makes the A not B error; he reaches towards bin A. Why does this happen?

Piaget’s original explanation was cast in terms of the infant’s emerging concept of an object. Jean is at “Stage IV” in this process, where infants seem to believe that an object has lasting existence only where it first disappeared. Call this the cognitivist explanation: the error is due to limitations in Jean’s concepts. The A not B error is fascinating because although the main effect is quite reliable in the standard setup, it is very sensitive to many kinds of changes in the experimental conditions. Many contemporary developmental psychologists reject Piaget’s explanation, and its descendants, because it seems unable to account for Jean’s reaching behavior under these alternative conditions. They have come up with at least two other major kinds of explanations.

According to the spatial hypothesis, Jean’s concept of an object is OK; he just has trouble moving his arm to the right place, and this is because he represents space wrongly. In this transitional stage Jean is still representing the world “egocentrically” rather than “allocentrically.” Jean reaches in the direction the object usually is in relation to him rather than to where it now is in independent 3-D space. Thus if Jean is rotated to the other side of the table, he will now reach for bin B, which now occupies the “A” position relative to him.
The memory hypothesis also maintains that Jean’s concept of an object is OK; however, in this story he has trouble remembering where the object has been hidden. The memory is necessary in order to overcome the habit of reaching towards A. The memory hypothesis can account for why Jean actually does reach towards B if he is allowed to reach in the first few seconds after hiding. It is only later that he makes the error.

The cognitivist, spatial and memory hypotheses are ingenious attempts to explain a rich and perplexing body of experimental data. These explanations are very broadly similar to the computational approach to the governing problem, in that they focus attention on Jean’s internal cognitive machinery, the way he thinks about the world. Also, while each one seems to capture some truth about the A not B error, none delivers an adequate account of the overall phenomenon. In each case there are some aspects of the experimental data the hypothesis cannot explain.

Thelen et al. have taken a very different approach to the A not B error. Instead
of focusing on the contents of Jean’s mind, they focus on Jean’s reaching activity. They develop a general, high level dynamical model of how we come to reach in a particular direction, and then explain the A not B error by applying the general model to the special circumstances of an infant at approximately 7-12 months.

Think of it this way. Suppose we are trying to choose a direction to reach in, with options ranging from far left to far right. Suppose also that our level of inclination to reach in any particular direction is constantly changing. In the Thelen et al. model, this constantly changing set of inclinations is what they call the movement planning field, and is specified by a function u(x,t), which tells us our inclination to reach in direction x at time t. The heart of their model is a differential equation specifying how u is changing at any given time, depending on a number of further factors, including

• the current state of the movement planning field;
• general characteristics of the task domain, such as the presence of two bins in front of Jean;
• specific aspects of the current situation, such as the toy being hidden in bin A;
• memory of previous reaches, which bias the movement planning field in favor of previous reach directions (roughly, habit);
• competitive interactions between locations across the movement planning field, which help guarantee that one direction “wins out.”

Change over time in these further factors is specified by their own functions, and when all these are coupled together the result is a rather complicated beast. Fortunately, however, the dynamical system specified by this grand equation can be simulated on a digital computer, and so the behavior of the model can be compared with the mass of experimental data gathered on the A not B error. The bottom line is this.
It is possible to choose a specific set of parameters for the grand equation such that the resulting dynamical system reproduces the classic A not B error, including the various contextual subtleties that caused so much grief for earlier kinds of explanations. This in itself is an impressive achievement, since as Thelen et al. point out, there was no guarantee in advance that this would be possible. The form of the equations builds in strong assumptions about the general nature of the dynamical system responsible for reaching. The fact that there exists a set of parameters generating appropriate behavior within the constraints of those assumptions already goes a considerable way toward vindicating those assumptions. However, the critical test for the model is not whether it can find a way of accounting for the existing data, but whether the very same equations and parameters can account for any other relevant data that may come along. And indeed Thelen et al. found that the model made a number of novel predictions, which were borne out in further experiments. So the exercise is not mere
retrospective “curve-fitting;” it is finding a particular “curve” which not only fits existing data but also successfully predicts new data. For anyone interested, the details are recounted in their 2001 article in Behavioral and Brain Sciences (Thelen et al. 2001).

My interest here is not in whether the Thelen model is, in the end, the one true account of the A not B error. Rather, I am interested in the kind of explanation they are providing. Thelen et al. summarize it this way:

The model accomplishes all this without invoking constructs of “object representation,” or other knowledge structures. Rather, the infants’ behavior of “knowing” or “not-knowing” to go to the “correct” target is emergent from the complex and interacting processes of looking, reaching and remembering integrated within a motor decision field.

I would like to highlight just two of the fundamental differences with more traditional explanations of cognitive processes. First, the explanation looks in the first instance not at what Jean is thinking, but at what he is doing. The primary focus is not on the little Jean-like homunculus within Jean’s head, but on the whole embodied Jean embedded in his environment (i.e., jeans and all!). Of course, within-the-head, neurally-instantiated processes are involved in the explanation of the A not B error; after all, a headless Jean wouldn’t be reaching anywhere (except in a horror movie). But in the Thelen explanation, it would not be sufficient to advert to such processes; they are only one crucial part of an overall story, which essentially invokes the whole embodied and embedded Jean. Second, in this explanation there are no symbols, rules, representations, algorithms, etc., postulated in Jean’s mind. Rather, the explanation is cast in terms of the continual evolution and interaction of a set of coupled continuous variables, as described by a differential equation.
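To give a sense of what a differential equation of this kind can look like, here is a generic dynamic-field form, offered as a sketch only. The chapter does not reproduce the equation from Thelen et al. (2001), so apart from the movement planning field u(x,t) every symbol below is an assumption chosen to mirror the factors listed earlier: a time constant τ, a resting level h, task, specific and memory inputs S_task, S_spec and S_mem, an interaction kernel w, and an output nonlinearity f.

```latex
\tau \, \frac{\partial u(x,t)}{\partial t}
  = -\,u(x,t) + h
    + S_{\mathrm{task}}(x,t) + S_{\mathrm{spec}}(x,t) + S_{\mathrm{mem}}(x,t)
    + \int w(x - x')\, f\big(u(x',t)\big)\, dx'
```

Read term by term under these assumptions: the first term makes the field relax toward its resting level, so its rate of change depends on its current state; the three S terms stand for the task layout, the cue at the hiding location, and the habit built up by previous reaches; and the integral lets active locations excite their neighbors and suppress distant competitors, which is one way of ensuring that a single reach direction wins out.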
The A not B error is a behavior which emerges from all this ongoing interaction under certain specific conditions. If we grant that the A not B error is a genuinely cognitive issue, then we have a thoroughly dynamical explanation of a cognitive phenomenon, one in which the processes involved resemble Watt’s dynamical governor much more than any orthodox computational alternative.

The Dynamical Hypothesis in Cognitive Science

After these brief illustrations, it is now time to return to the main issue: what is the essence of dynamical cognitive science, and how does it differ from traditional computational cognitive science? As mentioned, my strategy in answering this question has been to note that where traditionalists have rallied under the slogan that cognitive agents are digital computers, dynamicists fall in behind the idea that cognitive agents are dynamical systems. The challenge is then to say, in a reasonably rigorous and interesting way, what this means. My response to that challenge, in a nutshell, is this:

The Dynamical Hypothesis (DH):
• The Nature Hypothesis: For every kind of cognitive performance exhibited by a natural cognitive agent, there is some quantitative system instantiated by the agent at the highest relevant level of causal organization, such that performances of that kind are behaviors of that system.
• The Knowledge Hypothesis: that causal organization can and should be understood by producing dynamical models, using the theoretical resources of dynamics, and adopting a broadly dynamical perspective.

OK… but what does all that mean? Here we have to do quite a bit of unpacking. A natural place to start is with the notion of a dynamical system.
Dynamical systems

Dynamical systems are, obviously, systems of a particular sort. A system, in the sense most useful for current purposes, is a set of variables changing interdependently over time. For example the solar system of classical mechanics is the set of positions and momenta of the sun and planets (and their moons, and the asteroids…). The question then is: when is a system dynamical? Interestingly, there is no established answer within cognitive science or even outside it. A search of the literature reveals a wide range of definitions of dynamical systems, ranging from the very specific (“a set of bodies behaving under the influence of forces”) to the hopelessly broad (“a system which changes in time”). Somewhere on this spectrum lies the definition that is the most useful in articulating the dynamical hypothesis in cognitive science. Which is that?

The clue, I think, is found in the fact that in all those systems standardly counted as dynamical systems in the practice of cognitive science, the variables are numerical, in the sense that we can use numbers to specify their values. Why is this? Well, one thing about numerical quantities is that it makes sense to talk about how far apart any two values are, and indeed we have an easy way of telling what that distance is. And when a system’s variables are numerical, we can also tell how far apart any two overall states of the system are. And the key point is this. For some systems, it is possible to describe how they change over time (their behavior) by specifying how much or far they change in any given time step or period. The rule capturing this description is a difference or differential equation. In my opinion the best way to articulate the dynamical hypothesis is to take dynamical systems to be systems with this property, i.e., quantitative systems. There are various reasons for this.
First, it reflects pretty well the actual practice of cognitive scientists in classifying systems as dynamical or not, or as more or less dynamical. Second, it is cast in terms of deep, theoretically significant properties of systems. For example, a system that is quantitative in state is one whose states form a space, in a more than merely metaphorical sense; states are positions in that space, and behaviors are paths or trajectories. Thus quantitative systems
support a geometric perspective on system behavior, one of the hallmarks of a dynamical orientation. Other fundamental features of dynamical systems, such as stability and attractors, also depend on distances. Third, the definition sets up a solid contrast between dynamical systems and digital computers, essential if we want to understand dynamical cognitive science as distinctively different from orthodox computational cognitive science.

Now we have a fix on dynamical systems; what does it mean to say that cognitive agents are those things? Here things will be clearer if we make a distinction between what I call the Nature and Knowledge hypotheses.

Nature hypothesis

The nature hypothesis tells us something about reality itself, i.e., that things of one kind (cognitive agents) are things of another kind (dynamical systems). The truth or falsity of the nature hypothesis is completely independent of what we happen to think or know about reality; it concerns the way the world is. And to say that cognitive agents are dynamical systems is to make a somewhat complicated claim. Notice first of all that it is not a straightforward identification. Jean is a cognitive agent, and one thing is for sure: Jean is not simply a set of interdependent variables, just as the Watt governor is not simply the arm angle variable. The simple slogan is really saying that for any cognitive performances of mine you might be interested in, there is some set of variables associated with me (and the relevant environment) which constitute a dynamical system of a particular sort, and the cognitive performances are behaviors of that system.
So for example Jean’s “deciding” to reach for one box or another is a kind of cognitive performance, and Thelen et al.’s account suggests that associated with Jean there are various variables tied together and changing in the way specified by their grand equation, such that his reaching behavior (including his A not B error) is the behavior of that set of variables. And the nature aspect of the dynamical hypothesis says that all cognitive performances are like that. Note that on this analysis each cognitive agent “is” no one dynamical system; different kinds of cognitive performances would be the behavior of different systems associated with the same agent.

Knowledge hypothesis

While the nature hypothesis is a claim about reality, the knowledge hypothesis is a claim about cognitive science. It says that cognition (at least in the case of natural cognitive agents, such as humans and other animals) is best understood dynamically. This is of course because cognitive agents are in fact dynamical systems (the nature hypothesis), and your intellectual tools really ought to fit the subject matter at hand. Conversely, our best evidence for the nature hypothesis would be discovering that the best way to study cognition is to use dynamics. However, we should not allow the undeniable fact that the nature and knowledge
hypotheses are intimately related to cloud this important distinction. What is it to understand natural cognition dynamically? I said above that the easiest way to pick out a dynamicist in cognitive science is to see whether they use differential or difference equations, and while this is not the whole story it is certainly a key part of it. A thoroughly dynamical perspective on cognition has three major components: a dynamical model, use of the intellectual tools of dynamics, and adopting a broadly dynamical perspective. A dynamical model is an abstract dynamical (i.e., quantitative) system whose behavior is defined by the scientist’s equations. The behavior of the model is compared with empirical data on the cognitive performances of human subjects. If the match is good, we infer that the cognitive performances simply are the behavior of relevantly similar dynamical systems associated with the subjects. So for example Thelen et al define an abstract dynamical model by specifying a set of variables and a grand differential equation governing their interdependent change. They then show that if the parameters are set right, the model system behaves just the way Jean does; from which they infer that Jean’s cognitive performances are, in reality, the behavior of a very similar system whose variables are aspects of Jean himself and his environment. One problem with this whole approach to cognitive science is that the behavior of even simple nonlinear dynamical systems can be rather hard to understand. So while defining an abstract dynamical model might be easy enough, understanding what it does —and so whether it is a good model— can be pretty challenging. One handy tool here is the digital computer, used to simulate the dynamical model. Note that in such cases the digital computer is not itself a model of cognition; it is just a tool for exploring models. 
But much more important than the computer is the repertoire of concepts and techniques loosely gathered under the general heading of “dynamics.” This includes dynamical modeling, that traditional branch of applied mathematics which aims to understand some natural phenomenon (e.g., the solar system) via abstract dynamical models; and also dynamical systems theory, the much newer branch of pure mathematics that aims to understand systems in general and nonlinear dynamical systems in particular. To use the intellectual tools of dynamics is to apply this body of theory (suitably modified and supplemented for the purposes at hand) to the study of natural cognition. The third component of dynamical understanding is a broadly dynamical perspective. The best way to convey this somewhat nebulous idea is to describe it as the difference between the two ways of conceiving the steam-governing problem. From a broadly dynamical perspective, cognition is seen as the emergent outcome of the ongoing interaction of sets of coupled quantitative variables rather than as sequential discrete transformations from one data structure to another. Cognitive performances are conceived as continual movement in a geometric
space, where the interesting structure is found over time rather than statically encoded at a time. Interaction with the world is a matter of simultaneous mutual shaping rather than occasional inputs and outputs. Dynamicists are certainly interested in “within-the-head” structures and processes, and usually even allow that some of these count as representations, but they reject the idea that cognition is to be explained exclusively in terms of internal representations and their algorithmic transformations. It is hard to overemphasize how different dynamical cognitive science is in practice from its orthodox computational counterpart, and also hard to convey the nature of the dynamical approach in a few short paragraphs. In my opinion the Dynamical Hypothesis, as formulated above, comes pretty close to encapsulating the theoretical essence of the dynamical approach; further, the contrast between the DH and the computational hypothesis is the most significant theoretical division in contemporary cognitive science. However these are contentious philosophical claims about the nature of cognitive science. How have they been received by other cognitive scientists?

Some Objections to the DH

The DH is not true

The largest and most considered set of responses to the DH was the set of peer commentaries in Behavioral and Brain Sciences. A rough count indicated that a majority of this self-selected bunch was basically sympathetic to the DH (in some form), and almost everyone was willing to grant that the DH (in some form) is true of at least some of cognition. Nevertheless one of the most common responses was to deny that the DH is true in general. This denial was grounded in the belief that at least some cognition (generally “higher” or more “central” aspects) is clearly best accounted for in computational rather than dynamical terms.
For example Alan Bundy claimed that …with our current experience of the modeling power of dynamical versus symbolic techniques, this [dynamical accounts of higher level cognitive processes] seems very unlikely. However such objections missed the point of my work, the whole thrust of which was to articulate the DH in order that its truth might be evaluable, rather than to argue for its truth. The difference between proposing a hypothesis for empirical evaluation, and endorsing that hypothesis as true, is a subtle one —too subtle, it seemed, for many of the commentators. My own official position is that we are not currently able to say with any certainty to what extent the DH or the competing CH is true. It will only be after lots of hard work producing and evaluating particular models of particular aspects of cognition that we will be justified in asserting any verdict. It is true that at various places I have provided broad philosophical
arguments in favor of the truth of the DH, and these may have led some people to conclude that I was already committed to DH being unqualifiedly true. However, while these philosophical arguments may be interesting, they are rarely if ever decisive. They should be interpreted, I think, not as demonstrating that the DH is in fact true, but as demonstrating that the DH is currently sufficiently plausible to be worth taking seriously, i.e., to be worth devoting the huge amounts of time and resources required for serious empirical evaluation. Where I stand my ground is not on the blanket truth of the DH, but on the idea that the DH takes a certain form, i.e., that it should be articulated a certain way. I think these philosophical issues can be largely resolved in advance of the hard empirical work. Indeed, the corresponding philosophical questions in the case of the CH have been largely resolved over the past few decades. The challenge is to reach a similar level of clarity and consensus for the DH —something that my formulation of the DH has not yet achieved, to judge by most of the responses.
Eliminate the nature hypothesis!

The DH as I articulate it has two intimately interconnected components, one that says something about cognitive agents and one that says something about cognitive science (i.e., the Nature and Knowledge hypotheses). Another common response has been to insist that the DH is really only the Knowledge hypothesis. This idea comes in many flavors, but the thrust is to deny that dynamicists are concerned with the way the world really is. For example, the eminent dynamicist Randy Beer has argued that

As mathematical formalisms, both computation and dynamics are sufficiently broad that there is no empirical fact of the matter about which kind of system a cognitive agent is…What the debate between computational and dynamical approaches to cognitive science is really about is which is the most insightful, explanatory, penetrating and parsimonious stance to take toward a cognitive agent. (Beer, 1998)

Steven Quartz claims that

the crucial distinction between the computational and dynamical hypotheses is an epistemic one resting on the appropriate level of explanation for understanding cognitive systems. (Quartz, 1998)

and Bernstein & van de Wetering claim that

the DH as a whole is pragmatic in nature, i.e., it is more convenient/enlightening/interesting to describe cognition in dynamical terms than in computational terms. (Bernstein & van de Wetering, forthcoming)

It is a curious thing about scientists that they are often very hesitant to use terms like “truth” and “reality,” despite the fact that they, more than anyone else, are able to uncover the truth about reality. These scientists correctly observe that any particular scientific claim or theory may (or even probably will) eventually turn out to be false, and that any good scientist should avoid dogmatism and acknowledge
the uncertainty associated with their position. However they mistakenly go on to conclude that scientists are not (or should not be) purporting to describe reality itself, but are merely providing more and more useful ways of talking. That is, they revert to a kind of instrumentalism, according to which scientific theories are only more or less convenient instruments or tools, and do not accurately or truly describe the way the world actually is. Put another way, they adopt a form of what philosophers know as Kantian transcendental idealism, according to which the world “as it is in itself” is intrinsically unknowable; all we can access is the world “as understood by us.” Now this is not the place to debate the virtues or otherwise of transcendental idealism. Suffice to say that for practical purposes a naïve realism is the optimal metaphysical stance. Scientists are in the business of finding out what the world is like. They do so by developing successively more adequate (convenient/enlightening/interesting etc.) descriptive frameworks, where the adequacy of a framework is ultimately determined by fit between that framework and the world, as measured by scientific practice. A good scientific theory is not merely useful or convenient; it asserts (correctly or incorrectly) that the world is a certain way and not some other way. So for example Thelen et al are claiming that the A not B error is the emergent behavior of a particular kind of dynamical system, and not the result of ill-formed concepts in Jean’s head. This is the commonsense interpretation of what is going on, and we’d want pretty good arguments before surrendering it. Good arguments, however, are exactly what Beer and company don’t provide. Beer, for example, attempts to argue that the Nature hypothesis is incoherent, but his arguments turn on misunderstanding the technical details of the definitions of dynamical systems and digital computers as kinds of systems.
(For elaboration, see van Gelder, 1998a.) Bernstein & van de Wetering claim that the distinction between the Nature and Knowledge hypotheses is “unhelpful” because the Nature hypothesis doesn’t add anything to the Knowledge hypothesis. Well, here —putting it bluntly— is what the Nature hypothesis adds:

• I have been arguing that it adds truth; i.e., the idea that cognitive agents are in fact dynamical systems, and not merely conveniently describable as such.

• When articulating the DH, distinguishing the Nature and Knowledge hypotheses enables one to sort out a whole lot of issues into two separate piles. One pile is ontological; it consists of issues such as: what are systems; how do systems relate to each other; what are dynamical systems; what are digital computers; how do dynamical systems and digital computers relate; how do cognitive agents and dynamical systems relate; etc. The other pile is epistemological; it consists of issues such as what is a model and how do models enable us
to understand natural phenomena; what is dynamical modeling; what is dynamical systems theory; what is a dynamical perspective; what are the important differences between a dynamical perspective and an orthodox computational perspective; and so forth. Of course, in the day-to-day practice of cognitive science, there is no need to append the claim “and this theory truly describes the way cognitive agents really are” to one’s dynamical theory of cognition; that much is implicit in the fact that one is asserting and defending the theory. But the philosopher of cognitive science would be delinquent if he didn’t discuss such issues.
The DH is not falsifiable

Another sort of objection is that the DH fails to be a genuine empirical hypothesis because it is not falsifiable, i.e., nothing could prove that the DH is wrong. One source of this objection seems to be the idea that the DH must be true of everything, and you can’t falsify a theory that is trivially true. Another line of thought seems to be that the DH as formulated makes no specific empirical predictions, and so can never be tested. Let me take these in turn. If we were fuzzy enough about what dynamical systems are and what it is to be one, then the DH certainly would be trivially true. However in articulating the DH, I put considerable effort into crafting a hypothesis that is as narrow and precise as possible given the diversity of dynamical research in cognitive science. Recall that the Nature hypothesis is the claim that

For every kind of cognitive performance exhibited by a natural cognitive agent, there is some quantitative system instantiated by the agent at the highest relevant level of causal organization, such that performances of that kind are behaviors of that system.

Note that this definition interprets dynamical systems as quantitative systems, which are a specific subclass of systems, viz., systems for which there exist metrics over their state set (and perhaps over the time set) such that system behavior is systematically related to distances as measured by those metrics. Not every system is like this, and so the Nature hypothesis is nontrivial in claiming that cognitive agents are systems of a specific kind. Second, note that the Nature hypothesis requires that cognitive agents be dynamical systems (in this sense) at the highest relevant level of causal organization for a given kind of behavior. Digital computers do not satisfy this condition, even though they may instantiate any number of other dynamical systems at various levels.
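As an illustration only, one way to read “system behavior is systematically related to distances” is as a continuity-style condition on how the system evolves. The notation below (state set X, metric d, evolution map \phi_t, bound L(t)) is assumed for the sketch and is not taken from the text; other readings of “systematically related” are possible.

```latex
% Assumed notation (not from the text): X is the state set, d is a metric
% on X, and \phi_t : X \to X gives the state reached after elapsed time t.
\[
  d\bigl(\phi_t(x), \phi_t(y)\bigr) \;\le\; L(t)\, d(x, y)
  \qquad \text{for all } x, y \in X \text{ and } t \ge 0,
\]
% where L(t) is a finite bound depending only on the elapsed time t.
```

On this reading, states that are close under the metric lead to behaviors that stay close for a time, so distance in state space carries information about what the system will do.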
So the Nature hypothesis requires a quite specific kind of relationship between cognitive agents and dynamical systems. The non-triviality of the DH is also obvious when we consider the Knowledge hypothesis, which basically claims that cognitive agents can and should be understood in dynamical terms. If this were trivially true, we would already have perfect models of every aspect of cognition and cognitive science
would be over. But understanding cognition dynamically is obviously not a trivial matter. Understanding some natural phenomenon in dynamical terms is never simple, and if anything it is especially difficult in cognitive science. After all, physicists have been producing good dynamical models since the seventeenth century; three hundred years later in cognitive science we are only just getting into the game. In short, the DH is certainly not true of everything, and proving that it is true of cognitive agents (IF it is!) is damned hard. (If you doubt that, just have a go!) The second version of the falsifiability objection is more interesting. Randy Beer suggests that the DH isn’t a genuine scientific hypothesis, at least not in the traditional sense of making an empirically falsifiable claim. What’s at issue here aren’t experimentally testable predictions... and another serious dynamicist, Richard Heath, worries that there is little guidance on how such investigation can determine the relative validity of DH and CH. It may be the case that it is very difficult indeed to provide the empirical evidence needed to reject CH in most cognitive scenarios, using tools available to experimental psychology (Heath, 1998). These are important points, and the proper response consists in explaining in what way the DH, like any hypothesis of its kind, is empirically contentful and hence falsifiable. Beer and Heath are correct that the DH cannot be decisively tested by means of any direct and immediate confrontation with reality. It is a very general hypothesis, perched deep in the web of theory, and surrounded by a wide buffer of auxiliary hypotheses and chains of inference. The DH does however issue one major prediction —that our best accounts of cognition will in the long run be dynamical in form. The DH will be known to be false if, after an extensive period of investigation, cognitive scientists have in practice rejected dynamical approaches in favor of some other modeling framework.
In this respect, the DH is on a par with other venerable scientific doctrines. For example, the “evolutionary hypothesis,” that all biological complexity is the outcome of natural selection, does not on its own make any specific testable predictions. It does however predict that in the long run all our best explanations of biological complexity will be cast in terms of natural selection. With much auxiliary theorizing, the evolutionary hypothesis does make specific predictions, but if those predictions fail, the main hypothesis can be preserved by shifting the blame elsewhere. If there is too much blame to be shifted, we eventually reject the main hypothesis. For broad theoretical hypotheses, this indirect connection with the world is not unfalsifiability; rather, it is what falsifiability consists in. Thus, contra Beer, the DH can be a genuine scientific hypothesis even if it alone does not make specific testable predictions. The testability of any broad theoretical hypothesis depends essentially on
a fund of good judgment which is implicit in scientific practice and can never be made fully explicit and written down in a rule book (Kuhn, 1962). Heath is right to note that in any given case it will be difficult, perhaps impossible, to establish in any conclusive or mechanical way whether a dynamical model is preferable to a computational competitor, but it would be wrong to fault the DH for failing to solve this problem. Moreover, there are some very general principles that may help us even when the detailed empirical arguments are inconclusive. When scientists, as a group, choose one model or general theoretical framework over another, they inevitably allow certain very general desiderata to shape their judgments. Famously, for example, they prefer simple and elegant theories to complex and ungainly rivals; and they prefer theories that integrate well with our best theories in neighboring domains. Some refer to such virtues as “aesthetic” or “superempirical;” whatever we call them, it is clear that the process of empirical evaluation always involves such criteria. This is not to say that scientific judgment is “irrational,” or just a matter of some “leap of faith” —rather, to grasp the essential role of such reliance is to understand the nature of scientific rationality. Finally, it is worth observing that while one group of critics claim that the DH is obviously false, another group worry that the DH is not falsifiable! The critics can’t all be right (though they might all be wrong).
The truth is in the middle!

In articulating the DH, I deliberately tried to make the contrast with orthodox cognitive science as strong and clean as possible. The reason for this should be obvious enough: we are more likely to make scientific progress when the major options are clearly delineated and can stand against each other. Not surprisingly, however, some critics have claimed that the DH and CH are too extreme; that neither is likely to be true, and the truth must be somewhere in the middle. Daniel Dennett presented this idea in a memorable way. In van Gelder’s view of the theoretical landscape, he claimed, there is Mt. Newton on one side, and Mt. Turing on the other, and nothing in between. The trouble is that neither classical mechanics nor Turing machines are likely to account for natural cognition. The truth about cognition will actually be found among the foothills and ranges scattered around and beyond the grand peaks. In effect, Dennett is claiming that the DH and the CH caricature the available options in contemporary cognitive science. However Dennett only makes his point by himself caricaturing the DH. Dynamical cognitive science is not simply (indeed, not ever) the straightforward application of Newtonian mechanics to natural cognitive processes. The dynamical umbrella covers a rich tapestry of models and theoretical machinery, including, I think, much of the supposed middle ground between Newton and Turing. To see this, consider first the general notion of computation. What makes a process a computational process? In my opinion the answer is that a computational
process is one that sets up a mapping between two domains. Metaphorically, computational processes systematically provide answers to questions: provide a question as input, and the process will deliver an answer as output. In this sense, almost anything can be construed as a computer. The concept of computation only starts to get interesting when significant further constraints are placed on the nature of the process involved. The most familiar approach is to require that the process be effective: intuitively, to produce its answers by means of a finite number of discrete operations specified by some finite recipe or algorithm. Effective computation is the same thing as Turing computation, which is equivalent to digital computation. The second half of the twentieth century has come to be dominated by the digital computer. This is obviously true in practical domains, but it is also true in the intellectual sphere. For example that body of mathematics going under the name of “theory of computation” has been overwhelmingly the theory of digital computation. Closer to home, cognitive science has been dominated by the idea that natural cognition is a form of digital computation. This is the essence of orthodox approaches. Given these developments, many people seem to have lost sight of the fact that there are many other kinds of computation. “Non-Turing” computation is simply any kind of computation that for whatever reason fails to satisfy the full set of strict conditions for counting as digital computation. Thus, in the days before digital computers became widely available, analog machines such as differential analyzers and even the humble slide rule were used for everyday computational tasks. Back in the 1960s Scientific American carried an article describing how to build your own personal computer at home —analog, of course.
Given the vast range of possible forms of non-Turing computation, it makes no sense to ask how non-Turing computation “in general” compares with its digital counterpart. But one can focus on specific kinds of non-Turing computation, defined by alternative sets of constraints. One approach is to consider a given class of dynamical systems as computers. There is now a whole branch of the theory of computation vigorously enquiring into the computational properties of dynamical systems performing one form or another of non-Turing computation. The existence of a rigorous body of knowledge at the intersection of dynamical systems theory and the theory of computation obviously opens up a whole new set of possibilities for understanding natural cognition. It may well be that certain aspects of cognition are best understood as the behavior of dynamical systems performing non-Turing computation (Garson, 1996) —that is, as occupying the “middle ground” between Mt. Newton and Mt. Turing. To the extent that this is true, the orthodox computational theory of mind clearly stands refuted. Would it likewise refute the DH —or vindicate it? Nothing in the DH requires that natural cognition be understood in solely
traditional dynamical terms. Indeed, such a requirement would be quite bizarre. Do orthodox models draw solely upon the theory of computation? Dennett caricatures the DH by placing it atop Mt. Newton. In reality dynamicists draw on a wide range of auxiliary concepts, methods, etc., even while holding to their dynamical core. One strategy is to combine dynamics with non-Turing computation —to see a cognitive process as simultaneously the behavior of a dynamical system and as a kind of analog computation. This middle ground, I believe, really belongs to the dynamical approach to cognition, just in case a thoroughly dynamical perspective continues to be essential to understanding the process. If the dynamics eventually drops out and the process is understood primarily as computation —even non-Turing computation— then the DH ceases to be true of that process, even if at some level the process is in fact the behavior of some dynamical system.
Conclusion

I conclude that the DH still stands as the proper way to articulate the essence of contemporary dynamical approaches to cognition. But what about the question I keep deferring: is it actually true? To answer this is, in effect, to predict the course of cognitive science; and, as a pundit once pointed out, it is hard to make predictions, especially about the future. Moreover, as a philosopher somewhat removed from the front lines I certainly have no special insight. However, putting qualifications aside, recent broad trends in cognitive science, as well as some very general considerations, indicate that the Dynamical Hypothesis will turn out to be true of a considerable portion of natural cognition; that where computation is relevant, it will be analog computation implemented in dynamical systems; and insofar as the DH is false, it will be superseded by some form of theoretical framework whose elements are being pieced together by unheard-of mathematicians laboring under the illusion that their ideas couldn’t possibly have any application to reality.

References
Beer, R. 1998. “Framing the debate between computational and dynamical approaches to cognition”. Behavioral and Brain Sciences, 21, 630.
Bernstein, D., & van de Wetering, S. (forthcoming). “More boulders of confusion”. Behavioral and Brain Sciences, 21.
Dennett, D. 1995. Darwin’s Dangerous Idea. New York: Touchstone.
Garson, J. 1996. “Cognition poised at the edge of chaos: A complex alternative to a symbolic mind”. Philosophical Psychology, 9, 301-321.
Heath, R. 1998. “Cognitive dynamics: a psychological perspective”. Behavioral and Brain Sciences, 21, 642.
Kuhn, T. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Newell, A., & Simon, H. 1976. “Computer science as empirical enquiry: Symbols and search”. Communications of the Association for Computing Machinery, 19, 113-126.
Quartz, S. 1998. “Distinguishing between the computational and dynamical hypotheses: What difference makes the difference?”. Behavioral and Brain Sciences, 21, 649-650.
Thelen, E., G. Schöner, et al. 2001. “The Dynamics of Embodiment: A Field Theory of Infant Perseverative Reaching”. Behavioral and Brain Sciences, 24, 1-86.
van Gelder, T. J. 1998a. “Disentangling dynamics, computation, and cognition”. Behavioral and Brain Sciences, 21, 40-7.
van Gelder, T. J. 1998b. “The dynamical hypothesis in cognitive science”. Behavioral and Brain Sciences, 21, 1-14.
The dynamical approach to cognition: inferences from language
Robert F. Port
Cognition: symbolic or dynamic?

There seem to be two major conceptual models for understanding human thinking. Each makes specific, testable predictions that guide research efforts. The first is the computational or symbolic paradigm and the second is the dynamical paradigm. Despite an intellectual pedigree extending from Aristotle to Chomsky, the computational model is probably inadequate. It seems likely that the dynamical systems framework will have more success at providing satisfying answers and merits further exploration. The traditional story about the nature of human cognition – about the way people think and interact intelligently with the world – is that the process of cognition is based on the manipulation of symbolic objects (Chomsky 1965, Newell and Simon 1975, Fodor 1975, Fodor and Pylyshyn 1988, Haugeland 1985). However there are a variety of reasons to be skeptical about the possibility that computation with static objects in discrete time will be adequate for description of real human cognition (e.g., van Gelder and Port 1995, van Gelder 1998, Port and Leary 2005). Without attempting to survey all those arguments, I will note that one of the most fundamental arguments is simply that actual human cognitive behavior must unfold in real (that is, continuous, historical) time. Therefore, at some point, there must be an account of how symbolic units are employed (e.g., ‘written’, ‘read’ or ‘computed with’ etc.) in real time. But why is implementation such a problem, one might ask? After all, computers do this kind of implementation. Why shouldn’t we assume similar capabilities for human cognition? Digital computers are engineered with a constant-rate control clock that synchronizes all computational activity on the chip. But there is no evidence of uniform, synchronic periodicity governing neurocognitive behavior in any animals.
Of course, there are many kinds of neural periodicity that can be observed in various parts of the mammalian brain, but there is nothing that resembles the governing clock of a digital computer. Instead there is evidence of a great many separate oscillators of variable rate, some of which may couple with each other from time to time, but no standard clocking process. So, if our brains aren’t rigidly timed like digital computers, then how do they run in time? How could cognition coordinate one part of itself with another part in time? And how could it coordinate some parts with a temporal pattern presented to sensory surfaces? It is difficult to see how an account of such coordination could be developed that remains compatible with the previous assumptions of the computational view. We should explore the traditional view further. How is ordinary mundane thinking to be accounted for in the symbolic/computational framework? The computationalists propose that all thinking or cognition employs a symbolic
representational system, a “language of thought”. As Fodor (1975) pointed out, this general purpose cognitive language should be expected to be easily mappable onto natural languages. That is, natural language is the obvious, first-order model for many basic properties of the language of thought (as employed, for example, in “good old-fashioned artificial-intelligence” or GOFAI projects, Haugeland 1985). Let’s take an example of an everyday thought: “I had better run to the post office and get stamps before it closes”. Such a thought, on the computational view, implies the existence of a word-like code that includes units like ‘post office’ and ‘postage stamp’ and concepts like ‘to close a business for the day’. That is, the symbolic theory of cognition claims that there are representational units for each of these concepts. A linguistic unit, like the word ‘post-office’ in English, thus predicts a discrete cognitive representation ‘post-office’ in the computational systems of individual speakers of English. As Fodor (1975) made quite clear, the cognitive models of Chomsky and Newell and Simon should be seen as making explicit theoretical claims about both everyday words and concepts —like ‘table, cup, to share’ — and linguistic units — like Plural, Past-Tense, etc. They are taken to be domain-independent abstract objects. But this abstractness is supported, as Haugeland (1985) pointed out, by some kind of flawless mechanism for reading and writing the symbols. Thus to postulate a cognitive system along the lines of a computational system requires assuming some device that can execute programs of the system without error. But what evidence is there that cognition uses formal symbol-like units of any kind? The most obvious and convincing examples seem to be in human speech.

Chomsky and Halle: Language as a formal system.
The discipline of linguistics, my own home discipline, might claim to have the most impressive array of evidence regarding the symbolic nature of cognition. According to the conventional linguistic stance, sentences are made of strings of words in a particular order, and words are concatenations of phoneme-like symbols. Unfortunately, however, the Formality of Language has long been merely an assumption, rather than an empirical issue within linguistics. Chomsky (1965; Chomsky and Halle 1968) settled this issue for modern linguistics by assuming that language simply is a formal, discrete system of symbols and rules. The distinction of Competence from Performance contrasts the formal, Platonic, discrete-time domain of the Grammar with the noisy, continuous-time, fallible world of neural, physiological and physical aspects of human linguistic behavior. One great achievement of Chomsky is his helpful proposal that the way to study language is to inquire about the biases that the child brings to language learning. This seems a very practical statement of the problem — a way to focus our attention on more specific research issues. Both the language-learning child and the professional linguist investigating the grammar of a language bring to their induction problem some set of biases that lead them to analyze specific linguistic states of affairs in particular ways. In both cases, of course, these biases have a large influence on the grammatical analysis that is eventually achieved.
One hopes that with thoughtful research, science will be able to make these two theories consonant with each other.

Universal phonetic alphabet.

Chomsky went further and proposed that the form of these biases and the proper way in which to state them will reflect the basic a priori elements of a symbolic mathematical system. Just like any other mathematical system, a formal system is defined by a set of primitive elements and a set of primitive operations. For language, Chomsky and Halle propose both ‘substantive universals’ and ‘formal universals’, the innate, primitive elements and the formal operations for human linguistic activity (Chomsky and Halle 1968, p. 4). The phonetic system they proposed deserves to be called the Standard Theory of Phonetics — the accepted theory of the relationship between the abstract sound units of language and the concrete ‘real’ world of gestures and sound for several generations of linguists. The theory provides a list of substantive linguistic universals — the sound units from which all words in all languages are made.

The total set of features is identical with the set of phonetic properties that can in principle be controlled in speech. They represent the phonetic capabilities of man and, we would assume, are therefore the same for all languages. (p. 294-295)

Their Chapter 10 offered a preliminary list of the set of phonetic elements that are possible in human language, about 50 or so segmental phonetic features – like Vocalic, Labial, High, Voiced, Rounded, Glottalized and so forth. Unlike ordinary orthographic alphabets, these features are supposed to be scales that permit at least a binary distinction (e.g., 0/1 or +/-) but perhaps more than 2 levels per scale. The explicit claim is that a list along the lines of the one they present can account for all differences between languages plus all possible phonological rules.
Of course, this definition means that ONLY SOME of the things observable in speech sound are relevant to phonetics (that is, to language). Thus a certain amount of individual token-token variation and variation between speakers of different sizes plus many of the little squiggles and blobs observable in sound spectrograms can be ignored in principle. That seems like the right thing to do since any theory of perception needs some way to select certain information for response so that the remainder can be ignored. The minimal categories of phonetics need to be large enough to help speakers ignore minute differences (by selecting primarily relevant information). This is one reason why the categories need to be as large as possible (and why the size of the primitive phonetic vocabulary needs to be fairly small.) This style of categorization also predicts that languages cannot make use of any temporal structures except those describable in terms of serial order. All phonetic features must be static units that describe specific states of the system. Changes between these temporal states are assumed to click along serially in time. (Presumably adherents of this theory would agree that there must be some continuous-time clock generating these time clicks, but there is no assumption that the time clicks need be equally spaced.) So, if there DID happen to be a temporal
structure of some other (nonserial) type in the data (eg. an invariant duration ratio between two adjacent time intervals) that structure, it would seem, should not be available for linguistic use – according to the Standard Theory of Phonetics. For the speaker, this alphabet provides a speech control inventory, a listing of all the sound differences that languages have any control over. For the hearer, conversely, it implies a rich set of prelinguistic perceptual mechanisms that can do phonetic transcription of any speech that comes along into these categories (see Stevens and Blumstein 1981 for elaboration and experimental tests of these ideas). Languages may differ from each other only in ways representable in this alphabet.

How large is the universal phonetic alphabet?

To be plausible (and to yield testable empirical claims), the alphabet must be assumed to be quite small in size. From the standpoint of language acquisition, one hopes that the number of distinctive units is very small (and the size of each in sensory space should be large). The inventory must be small enough that composing explicit rules is feasible for language learners (as well as for linguists). If the alphabet proliferates too much, the rules will need to define more complex classes to do the right thing. So the rules must get larger, more complex and more numerous. Surely, if the alphabet were to reach into the millions of primitive units, then the plausibility of the whole idea would become very strained indeed. Chomsky and Halle actually proposed about 40 (binary) distinctive phonetic features. This set of phonetic terms with a few additions has served as the standard theory of phonetics within linguistics for the past 30 years.

Universal phonetics: What does it account for?

The proposed theory of phonetics – of a universal inventory of discrete, static, phonetic objects – plays a critical role in formal linguistic theory.
This inventory of segmental features accounts for a great many phenomena. For example, at least for the following 6 issues (lettered a-f below), the standard theory implies simple plausible explanations.

a. Why the words in every language seem to be discretely different from each other. When a language constructs many words on the basis of a small number of discrete patterns, the standard theory of phonetics provides a natural and succinct description. Languages exhibit many persuasive examples of discreteness. To look at one simple example, compare the following word sets that contrast four English front vowels:

beat, bit, bet, bat
peak, pick, peck, pack
meek, Mick, Meg, mag
seek, sick, secular, saccule

and even

speak, Spic, speck, spackle
sneaker, snicker, SNEC, snacker

(SNEC? How about ‘South Newton Executive Council’?)
For every language, it appears, similar neat tables of minimally different words can be found. Cases like this are strongly suggestive that there is a small inventory of sound types (representable simply with alphabetic symbols of one sort or another) in which the vocabulary items are ‘spelled’ for each language. Standard orthographic alphabets are assumed to approximate some actual ‘cognitive phonological alphabet’.

The discrete categoricity of phonemes is shown quite vividly by ‘categorical perception’ experiments (see Liberman and Mattingly 1985). Here, a series of artificially manipulated speech samples that span an acoustic continuum from, say, ‘beak’ to ‘peak’ is presented to listeners who are asked to say whether pairs of them are the same or different. Subjects find it very difficult to hear differences between tokens that they label the same (as either B words or P words). So although the stimuli gradually fade from B to P, the subjects (as long as you don’t give them too much training on the discrimination task, Kewley-Port, Watson and Foyle 1988) can only discriminate them (that is, tell if a pair of them are the same or different) as well as they can classify them (as B words vs. P words). Evidence that humans hear most words of their native language in discretely distinct categories is very strong. An account of how and why phonological and phonetic structure exists so prominently in the languages of the world is a central problem for linguistic and cognitive theory.

On the other hand, it is important to keep in mind that finding many suggestive examples of discretely different categories in natural languages is not sufficient justification for the formality assumption about language. The formality assumption requires that all of language exhibit discrete symbolic structure. If language really works like a computational system, involving a system of many rules, clearly all its constituents must be symbolic (that is, discrete and static).
Otherwise, any unit that was not symbolic should cause the computational model to break down or jam. Computational models simply must have symbol-like units that can be ‘written’ and ‘read’ essentially perfectly at each discrete time step of the model (Haugeland 1985).

Returning to the roles played by the phonetic alphabet, beyond providing succinct descriptions of the minimal-pair structure of the vocabularies of languages, the phonetic theory also provides an account of a number of other important phenomena related to language and speech.

b. Why any language can be acquired by any human child. Since every child has the universal phonetic inventory for representing the words of any language (as well as other cognitive mechanisms), any child can learn any language. Apparently speakers lose access to universal phonetic descriptions as they mature, so that learning a language at 20 years is a very different matter from learning it at age 5. So, if begun young enough, any child can apparently learn any language with native fluency.

c. Why a simple IPA-like inventory of symbols can include most sounds of all languages. The modern western alphabets are based directly on the Greek orthographic system, including the technically motivated universal
alphabet of the International Phonetic Association (IPA). This form of writing seems natural and appropriate for most linguistic features (though prosodic features, like distinctive tone, are less naturally represented). It seems at first glance that every language must choose from the same very limited inventory, which is why [s, z, d, t], [i, a, u] and [n, m] etc, keep turning up in language after language (Ladefoged and Maddieson 1996). The exact list, of course, is taken to be a matter for continued research, but both the IPA list and the Chomsky-Halle list are believed to be practical and reliable first attempts.

d. Why words can be learned quickly during early language acquisition (6 mo - 36 mo). If a limited set of sound classes is employed, the child has available a useful notation for representing tentative word hypotheses. That is, after Mommy says ‘cookie’ a few times, a 14 mo old may say ‘guhguh’. Somehow the child had a memory of its mother’s pronunciation that was adequate to support its first effort at the word. The phonetic alphabet makes possible an account of how that process could occur.

e. Why morphophonological rules are easy to acquire. Context-sensitive variants of morphemes don’t confuse the learners very much. In some varieties of American English, mother says ‘What’re you eat’n?’ (with a glottal stop) and ‘Eat this’ (dental stop with affricated release) using completely different pronunciations of the final /t/ in ‘eat’. But they are similar along certain dimensions (eg. both are stops and both are voiceless). Learners can recognize invariant phonological units because the phonetic alphabet available for composing such rules is small, so there is only a limited set of dimensions on which context-sensitivity can depend. If each of those segments had a cloud of thousands of features attached to it, it would be far more difficult to recognize patterns in the speech one hears.
Modern phonologists use about 100 segmental features and a few prosodic features, which seems to be enough to describe most of what is phonologically interesting in languages of the world (although, as we shall see, not all).

f. Why humans do not have conscious access to ‘subsegmental’ components of speech. In this theory, the phonetic features are extracted by some hypothetical ‘subsegmental’ system. Thus, it may be relevant that speakers seem to be unable to access the acoustic or psychophysical properties (eg. formants, bursts, etc) of speech signals (cf. Liberman and Mattingly 1985, Stevens and Blumstein 1981).

Despite these successes, there are many reasons for skepticism of the traditional view of phonetics. One of them is that words in various languages exhibit many kinds of different temporal patterns (Port 1995, Tajima and Port 1998, Port and Leary 2005). The next section will survey another of these problems, the phenomenon known as “incomplete neutralization”. This effect poses a serious empirical challenge to the view that the phonetic control space for speech production is always discrete.

Problems with a static, discrete, universal phonetic space

Despite the many clear cases of discretely contrastive linguistic units, there
remains considerable evidence that human languages often exhibit structures that do NOT fall into discretely distinct categories. The phenomenon of ‘incomplete neutralization’ is one case where paradoxes arise (Port and Crawford 1989, Warner et al 2004). In these cases, there is a morphological difference that is clear and obvious in some contexts, like English ‘beat’ vs. ‘bead’, and yet seemingly neutralized in another context, like ‘beat it’ vs. ‘bead it’ for many British and American speakers. In the second pair of phrases, the difference between the T and D seems to have disappeared since both the /t/ and /d/ are pronounced as ‘apical flaps’. In such cases, speakers of the language often maintain a small but consistent statistical difference between the two morphological structures, a difference in the phonetic details of their pronunciation (Fox and Terbeek 1976, Port 1996). Although the differences may be small and the overlap between the two distributions is fairly large, listeners demonstrate the ability to use these phonetic details in making minimal-pair judgments. That is, if you record many American speakers (of an appropriate dialect, such as my own) saying ‘beat it’ and ‘bead it’ many times and ask listeners to guess which phrase was intended, they will perform somewhat better than chance. This is quite unexpected. When words are linguistically contrastive, like ‘beat’ vs. ‘bead’, listeners will get around 99% correct (when listening conditions are good), whereas, if the words are phonologically identical (that is, homophones), like ‘beat’ and ‘beet’, they will perform at 50% correct (assuming equal probability of each stimulus type). But what is one to conclude when listeners perform at 60% to 75% correct? Such perceptual performance has been found for the case of German syllable-final stops (Port and Crawford 1989).
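The puzzle of intermediate scores can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that some phonetic measure (say, the duration of the vowel before the flap) is normally distributed with the same spread for both intended words but a slightly different mean, and that the listener places a single criterion midway between the two means. The function name, the choice of measure and all numbers are hypothetical and are not taken from the experiments cited.

```python
from statistics import NormalDist

def expected_accuracy(mean_t, mean_d, sd):
    """Proportion correct for a listener who places one criterion midway
    between two equal-variance normal distributions (equal priors)."""
    d_prime = abs(mean_d - mean_t) / sd      # separation in SD units
    return NormalDist().cdf(d_prime / 2)     # correct rate at the midpoint criterion

# Hypothetical vowel durations (ms) before the final stop; sd is assumed.
for diff in (0, 5, 10, 15, 60):
    acc = expected_accuracy(100, 100 + diff, sd=12)
    print(f"mean difference {diff:2d} ms -> {acc:.0%} correct")
```

Under these assumptions, identical distributions yield chance performance, widely separated ones yield near-perfect performance, and a mean difference smaller than the spread of the distributions yields scores in between. A nondiscrete difference in production is thus enough to produce the intermediate range of scores.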
Similar results will be found for ‘beat it’ and ‘bead it’ as well (if my classroom demonstrations of this phenomenon predict results in careful experiments). All the obvious concerns about the experimental details have been controlled for: the effect doesn’t depend on the orthographic notation on the answer sheet, nor on combining data across subjects (some of whom might show a discrete difference while others none at all), or on any other obvious experimental artifacts. I have replicated the effect many times myself over the years. The effect cannot be easily ignored. This is a very strange result if one is committed to the view that phonetic units are discretely different from each other. Obviously pairs like ‘beat it’ and ‘bead it’ are linguistically distinct (since they are different words that normally sound discretely different), but when a syllable-final T or D occurs between vowels in the relevant English dialects (cf. word pairs like riding-writing, budding-butting, ladder-latter and phrases like the mud over there, the mutt over there), both stops are shortened to a rapid flick of the tongue against the roof of the mouth. What is surprising is that these tongue flicks are slightly – but nondiscretely – different from each other depending on whether the intended word ‘underlying’ this pronunciation had a final D or T. Both the production data and the perception data agree that this phonetic contrast is marginal in these dialects. Thus if you ask speakers of this language (or of any other language, for that matter) if these two tokens sound the same or different, they will overwhelmingly say they sound the same, that is, ‘beat it’ and
‘bead it’ seem to demand identical phonetic transcriptions. Even English-speaking linguists tend to agree that the two productions are phonetically the same, despite the fact that very slight differences can be noticed (if you listen to a large sample of productions). But the difference seems too small to require transcription. Cases of incomplete neutralization have been found in English, German (in examples like ‘Bund’ vs. ‘bunt’, Port and Crawford 1989, Port 1995, Warner et al 2004) and reportedly in Russian, Catalan and Polish. Though these cases are probably fairly common, only one case is required to make the present argument that phonetic units are not necessarily discretely different from each other. The conclusion to be drawn is that, at the level of phonetics, where the cognitive symbol system makes genuine contact with the physiological and physical, the discreteness of the segmental phonetic alphabet turns out to be illusory. There is no hint of widespread discreteness in pronunciations across languages. It was only a convenient assumption that permitted linguists to get on with the enterprise of linguistics. But apparently phonetics cannot be alphabet-like.

One might attempt to defend the traditional view from this conclusion by pulling the phonetic alphabet upstairs a bit: “All you have shown is that you have not measured the right parameters in your experiments. There is a discrete phonetics but it has been masked by physiological effects (that is, ‘performance’) that the traditional theory does not claim to account for.” To make such a counterproposal, however, is apparently to abandon any direct link with physical phenomena. It is like saying “Since I know my theory is correct, there must be some intermediate effect messing up the data”. Defenders will have to either propose some novel experimental measurements that might show the discreteness they claim is there, or else acknowledge that their theory may be unfalsifiable.

Temporal patterns.
Several kinds of evidence show that languages employ temporal patterns of various kinds, either in the definition of specific segmental contrasts (eg. in long and short vowels or in obstruent voicing contrasts, Port, Alani and Maeda 1979, Klatt 1976) or for prosodic structures like the mora (eg. Port, Dalby and O’Dell 1989) or interstress intervals (eg. Tajima and Port 2003). In speech production, such effects are easily “accounted for” with language-external, nonsymbolic implementation rules (eg. Klatt 1976, Port 1981) but perception is a much more difficult problem (see Port, Cummins and McAuley 1995, Fowler et al 1980). In any case, to maintain traditional phonetics, one must somehow explain how these cases of language-specific temporal features are not temporal within the language itself, because the language must be definable symbolically – that is, in static, serially-ordered terms. On this account, within the language these temporal features are not temporal at all, but only static parameters of the normal kind that happen to be implemented in such a way that temporal effects (like duration ratios, etc) are caused. Once again, to salvage the theory, the units that were supposed to connect the abstract phonology with concrete, observable phenomena must be drawn back inside the language where they can no longer be directly observed.
Conclusions so far

The discipline of linguistics has nearly unanimously endorsed Chomsky’s stance that we can take for granted that language is symbolic and formal, and that the empirical issues to be addressed in linguistics research deal only with what particular kind of formal system must be innate to the child learner. However, there are now many forms of evidence that speech production and speech perception are controlled by a system that is capable of much more sophisticated exploitation of timing than can be represented by a time scale based on serial order. In real language, we have seen that continuous time parameters are exploited occasionally in such a way as to rest exquisitely balanced between ‘contrastive’ and ‘not contrastive’. The exclusively formal and static (that is, discrete time) character of language is an assumption that is not supportable. At best there are many cases of approximately formal structures involved in speech production and perception, units I have sometimes called symboloids (van Gelder and Port 1994). Many other partly symbol-like structures may exist as well. If these and other empirical arguments have any validity, then linguistics should seriously consider a broader range of theoretical options. To simply assume that language is a formal system is extremely risky and probably incorrect.

But whatever may be its importance for linguistic theory, what are the implications for general cognition of the observation that language is not formal? The nonformal nature of language suggests the importance of looking at cognition without the blinders imposed by the computational perspective. To return to the post-office example, it is important that the target of cognitive research not be the thought as expressed on the page by words. It is important to do more – to study the act of reading the words or listening to them in time.
For example, what cognitive structures appear and then collapse during the production or perception of such a sentence, or during the act of thinking the thought itself? By following such a strategy, cognitive science and linguistics would have to abandon the security of knowing what kind of theoretical tools are required. But what can be gained is a theory of both language and cognition that does not require assuming an infallible executive machine, but rather a model that directly simulates simple linguistic behaviors. As an alternative to the symbolic models, it is proposed that we explore dynamical systems theory. In the balance of the paper, I look at some dynamical features of language and then outline a dynamical theory of meter, the basic rhythmic framework of behavior. A simple dynamical model seems to provide understanding of many actions. So the possibility is raised of a dynamical conception of ‘knowledge’ as a kind of skill, as a (partly acquired) ability to constrain the dynamics of an abstract, real-time cognitive system as well as of a coupled, real-time motor system.

Language: from symbolic to dynamic

The computational (or symbolic) view of cognition, in order to be successful, would have to succeed in showing that discrete, static linguistic units will do the job of accounting for grammatical linguistic performance. Formal linguistic
theories began with Saussure (1915/1959), extending to Bloomfield (1933), Hockett (1955) and finally Chomsky (1965). All these theories are predicated on the assumption that language can be fully described in terms of discrete symbols – from phonemes to syllables to words, feet, clauses and sentences. The basic units of each level of almost any 20th century linguistic grammar – whether focusing on semantics, syntax, morphology or phonology – are discrete, static information packets, each a symbolic phonological form plus perhaps a meaning, arranged in serially ordered strings. The treatment of time strictly in terms of serial order strikes many linguists as critical to the very definition of a “linguistically significant property”. After all, it is said, we linguists are investigating human knowledge of language, and knowledge is usually thought to be intrinsically static. But in linguistics, this static character is not an empirical issue that might turn out either way. Computational models must have data structures that stand still. Each symbolic structure must have some particular state that it is in at each of the discrete time steps or else formal modeling will fail (Haugeland 1985). And, of course, times that lie in between the discrete integer values of time are completely undefined in such a model. So, on the traditional view, the static data structures of language (eg. phonemes, words, phrases and sentences) present (each on its own hierarchical level) nonoverlapping symbols arranged in a sequence. This sequence models certain aspects of real time, but these units cannot have any further properties of time beyond serial order. Thus, in particular, they cannot have measures of duration other than ‘number of time steps,’ no durational ratios, no rates of change or rates of acceleration, etc. (See Port, Cummins and McAuley 1995 and Port and Leary 2005 for further discussion). 
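A toy illustration may help to show what is lost when time is reduced to serial order. In the sketch below, two hypothetical productions of the same word differ in their durations and duration ratios, yet they map onto one and the same symbol string. The segment labels, the durations and the function names are invented for the example.

```python
# Two hypothetical productions of the same word, as (segment, duration in ms).
slow = [("k", 80), ("ae", 240), ("t", 80)]
fast = [("k", 60), ("ae", 90), ("t", 60)]

def serial_order(production):
    """Keep only what a static symbol string keeps: the order of the units."""
    return [segment for segment, _ in production]

def duration_ratio(production, first, second):
    """Ratio of two segment durations, a property that serial_order discards."""
    durations = dict(production)
    return durations[first] / durations[second]

print(serial_order(slow) == serial_order(fast))   # same symbol string
print(duration_ratio(slow, "ae", "k"), duration_ratio(fast, "ae", "k"))
```

The two productions are indistinguishable once only the order of segments is retained, although the vowel is three times as long as the initial consonant in one and one and a half times as long in the other.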
The order of phonemes in a word like cat, /k æ t/, and the sequence of words (and phrases) in a written sentence, ‘How’s your mom?’ (or /hawz yɚ mam/), fully represents all linguistic aspects of the temporal structure of the event of saying those words. This is what is meant by a static system, a system for which time can always be represented appropriately by means of serially ordered symbol structures. My claim is that natural languages are not static in this sense, whereas traditional linguistics takes this property as a premise.

Temporal structure and language

This strategy of isolating the language from real time seems doomed to failure. One reason is that research reveals some distinct properties of speech that are clearly temporal, that require more information than is allowed by serial order, and that are nevertheless language-specific, as was pointed out above. These cases contradict the fundamental assumption of linguistics that one never needs to consider time (other than serial order) in the specification of a grammar. For speech production, the traditional way to avoid this contradiction is to propose that static features in the language cause some dynamical effects in motor output (Chomsky and Halle 1968, Klatt 1976, Port 1981). One says, for example, that a static feature [+long] gets ‘implemented’ (outside the grammar) as ‘lengthening of the segment by 20%’. After a series of these rules, a specific
duration in milliseconds is specified for the segment. But there are many serious conceptual and practical problems with specifying durations in milliseconds. In particular, let’s say the phonetic grammar computes “Segment P should remain at its steady-state for 142 ms”. But now what mechanism is going to assure that the actual gesture lasts for this duration? We now have an entire new set of control problems to address. It seems that rules like “Make segment P last n milliseconds” don’t really solve any problems for the talker or listener, even if they do help experimentalists describe their data (see Fowler et al 1981, Turvey 1990, Kelso 1995). The perceptual processing of continuous-time waveforms into discrete linguistic units (like phones or phonemes) ultimately depends on hardware components (eg. a periodically clocked waveform sampler) that are highly implausible psychologically given the way in which auditory information is distributed over time in speech (see Port, Cummins and McAuley 1995, Port 1995, van Gelder 1998, for further discussion).

If one wanted to abandon the approach of translating back and forth between digital (symbolic) and analog (that is, drop the D/A and A/D conversions into and out of real time), is there an alternative? Clearly there is. One can propose that words (and all linguistic units) should have whatever temporal specifications they require. Whatever these specifications turn out to be, I suspect they will tend not to resemble “Hold this state for 142 milliseconds”. The constraints are more likely to be expressible as parameters in a dynamical system, for example, as changes in the stiffness of oscillators or in specifications of the phase angle of one event relative to another (Browman and Goldstein 1992, 1995, Saltzman 1995, Byrd and Saltzman 2003). It is entirely an empirical issue what kind of parameters will be needed to handle the temporal structures observed in specific languages.
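The contrast between the two kinds of specification can be sketched numerically. The first function below mimics an implementation rule of the ‘lengthen by 20%’ kind, which ends in an explicit number of milliseconds. The second treats a gesture as an undamped mass-spring oscillator, so that no duration is stated anywhere and the time taken simply follows from a stiffness parameter. Both functions, their names and all numerical values are illustrative assumptions; they are not the rule system of Klatt or the gestural models of Browman and Goldstein or Saltzman.

```python
import math

def rule_based_duration(base_ms, percent_changes):
    """Apply a series of 'lengthen/shorten by p%' rules to a base duration."""
    duration = base_ms
    for p in percent_changes:
        duration *= 1 + p / 100
    return duration

def gesture_duration_ms(stiffness, mass=1.0):
    """Time for an undamped mass-spring system to travel from one extreme
    to the other: half of its natural period T = 2*pi*sqrt(mass/stiffness)."""
    half_period_s = math.pi * math.sqrt(mass / stiffness)
    return 1000 * half_period_s

print(rule_based_duration(100, [20, -10]))    # explicit milliseconds
print(gesture_duration_ms(stiffness=400.0))   # duration implied by stiffness
print(gesture_duration_ms(stiffness=200.0))   # less stiff, hence slower
```

In the second case a lower stiffness yields a longer movement without any rule ever mentioning a duration, which is the sense in which a temporal pattern can be a consequence of dynamical parameters rather than a stored value.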
Each language may be said to offer its own unique combination of motor skills that are effective for pronouncing its vocabulary in its set of speaking styles. My work over the past few years has been to explore speaking tasks that could be easily elicited in the lab where temporal constraints may be observable. One advantage of this move is to make available a set of conceptual tools and experimental methodologies from dynamical systems research in other disciplines that can be brought to bear on linguistic behavior. For example, the speech cycling task discussed below was inspired by Treffner and Turvey’s experiments where subjects swung a single pendulum in each hand. Furthermore, it seems that static-like properties of languages, the symbol-like units themselves, might be accounted for by a dynamical theory as well. Dynamical systems might eventually provide a satisfactory interpretation of the symboloidal units of natural language as stable attractors. Thus both continuous time properties (eg. response to perturbations) and discrete, symbol-like, stable properties of linguistic behavior (such as our strong tendency to hear either word A or word B, but nothing in between) may find explanation via dynamical systems theory. One consequence of following the new approach that puts language (and linguistics) back into time is that our science returns to the fold of the other sciences. We need no longer be a ‘special science’, the science of mind. Linguistics
has followed a research program that tends to deny the relevance of the methods and conclusions of other sciences like psychology and computer science (Chomsky 1965). In describing classes of physical events, from astronomy to mechanics to neuroscience, the natural sciences conventionally employ descriptions based on dynamical systems theory. A small number of parameters or variables are captured by differential (or difference) equations that characterize their changes in continuous (or discrete) time. Since sentences, words and speech gestures are also continuous events observable in time (at least whenever they are actually performed), I propose as a new working assumption for linguistics that linguistic units are events in time and that the processes which employ these units are also events in time (cf. Browman and Goldstein 1995). From this point of view, then, the temporal structure of linguistic events, at all levels from phonetic to syntactic and semantic, becomes a problem that is just as central to the general theory of language as the more static-like units, like phonetic features or parts of speech. An unavoidable implication of this approach is that we can no longer look at language merely as a type of ‘knowledge’, if knowledge is taken to be something that one ‘acquires’ and then ‘has’ – that is, as something static. Instead language is a domain that can be studied especially clearly only when it is performed. Whereas traditional linguistics insists on looking at language only after it has been reduced to phonetic (or orthographic) transcription (so that its temporal structure has been reduced to mere serial order on a page), we shall find that data presented in this form is much less useful for our purposes. Only audio or other recordings contain the information we need at this point in the development of a new theory of language.
From this temporal data, we seek clues regarding the invariant or slowly changing dynamical system that generates those temporal phenomena. Before going on to defend the feasibility of this orientation by presenting some results of its application to speech, it may be useful for some readers if we begin with a brief introduction to a few basic concepts of dynamical systems.

Dynamical systems

The mathematics appropriate for describing temporal events is dynamical systems theory. There are a number of approachable introductions to dynamical systems. One very intuitive approach is provided by Abraham and Shaw (1983). For a readable introduction to the application of dynamical systems theory to cognitive science, see Norton’s mathematical introduction (Norton 1995) to Port and van Gelder (1995). Another good introduction to dynamical systems is Glass and Mackey’s From Clocks to Chaos (1988), written especially for biologists. A fascinating but more advanced introduction is Stephen Strogatz’ Nonlinear Dynamics and Chaos (1995). The basic idea of a dynamical system is that there are (a) one or more variables that take on values at each point in time and which together comprise a relatively isolated system, plus (b) a rule, most often expressed as a differential (or difference) equation, specifying how the state of the system (that is, the values of the variables) should change over some small amount of time.
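As a minimal sketch of these two ingredients, the fragment below takes a single state variable x and, as its rule, the differential equation dx/dt = x - x^3, stepping it forward in time with the simple Euler method. The particular equation, the step size and the function names are illustrative assumptions chosen for brevity; nothing in the argument depends on them.

```python
def rule(x):
    """Rate of change of the state: dx/dt = x - x**3 (a nonlinear rule)."""
    return x - x**3

def simulate(x0, dt=0.01, steps=2000):
    """Euler integration: repeatedly nudge the state by rule(x) * dt."""
    x = x0
    for _ in range(steps):
        x += rule(x) * dt
    return x

for start in (-2.0, -0.1, 0.1, 2.0):
    print(f"start {start:+.1f} -> settles near {simulate(start):+.3f}")
```

Run from several starting values, the state drifts to one of two resting values, +1 or -1, depending on which side of zero it began. The same rule applied to different initial states thus sorts them into a small number of stable outcomes.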
The range of values of the variables of the system provides a state space (of all possible combinations of values). The rule (or the dynamic, as it is sometimes called) can be interpreted as imposing a kind of pressure or bias on the system to change from a given state in a certain direction. When the state space is viewed geometrically, then state changes can be interpreted as motions of a point along the (usually smooth) ‘surface’ of the state space. The differential equation says, for each point in the state space, which direction it should move and how fast over a tiny time interval. To model the cases that are most relevant to cognitive issues, the governing equations are nonlinear ones. Those states of the system that it tends to move toward are called ‘attractors’, while those states from which it will veer away are called ‘repellors’. Each attractor is a location within a surrounding region, the basin of attraction, where the system will always tend to move toward the attractor. Each attractor, then, defines a stable state, a state of the system that it tends to remain in for some time period until it gets somehow pushed out. (Thus attractors have some of the properties of ‘symbols’, since they are stable over time and are typically discrete.) An attractor could be a steady-state or an oscillation (either simple or complex) or even some complex time function.

Coupled oscillators.

One important class of dynamical system is the type that involves two oscillators that influence each other’s rate of oscillation. Dynamical systems of this type may underlie the notion of meter, both in music and in speech. But let us begin by describing a simple oscillator that receives a pulse input that is periodic. If this oscillator adjusts its cycle time in response to periodic pulse stimulation so as to more closely match its phase zeros with those of the input pulses, then it can be called an adaptive oscillator (McAuley 1995, Large and Kolen 1995, Large and Jones 1999).
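The idea of an adaptive oscillator can be illustrated with a deliberately simple error-correcting scheme. This is not the model of McAuley (1995) or of Large and Kolen (1995); the update rule, the two gain parameters and the function name `adapt` are assumptions made only for this sketch. At each input pulse, the oscillator compares the pulse time with the time at which it expected to pass through phase zero, and uses the error to shift both its phase and its period.

```python
def adapt(pulse_times, period, phase_gain=0.5, period_gain=0.3):
    """Track a periodic pulse train with a self-correcting oscillator.

    `expected` is the time at which the oscillator next passes phase zero.
    At each pulse the timing error shifts the phase and adjusts the period.
    Returns the oscillator's period after each pulse."""
    expected = pulse_times[0]          # start aligned with the first pulse
    history = []
    for pulse in pulse_times:
        error = pulse - expected       # positive if the pulse arrived late
        period += period_gain * error  # lengthen (or shorten) the cycle
        expected += phase_gain * error + period   # schedule next phase zero
        history.append(period)
    return history

pulses = [0.5 * n for n in range(30)]      # input pulses every 500 ms
periods = adapt(pulses, period=0.4)        # oscillator starts out too fast
print(f"{periods[0]:.3f} ... {periods[-1]:.3f}")
```

With these (assumed) gain values the oscillator's period moves from its initial 0.4 s toward the 0.5 s spacing of the input pulses within a few dozen cycles. Larger or smaller gains would make the adaptation quicker or slower.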
The simplest form of coupling is when both “fire” (that is, pass through phase zero) always at the same time. If they differ in frequency or phase, then an adaptive oscillator will adapt (whether quickly or slowly) to the phase and rate of the stimulation pattern. Now, leaving aside input stimulation for the moment, imagine a system of two internal oscillators that are tightly coupled (meaning that each is strongly influenced by changes in the other). Each oscillates through its cycle of states periodically. We can describe the instantaneous state of each oscillator in terms of its phase angle – on a scale from 0 to 1 (or equivalently, from 0 to 2π). How do we know which phase is phase 0 for any oscillator? There is no apriori basis for this distinction, but in many cases one can easily tell which state is the ‘starting point’ of a cycle. For example, if a gesture causes a pulse to occur at some particular phase angle, then phase zero (the line-up point for any other oscillator that might be coupling with it) will tend to be at the onset of that pulse. Whenever there are two oscillators being compared with each other, a useful variable to examine to understand their relationship is the relative phase of the oscillators, that is, the difference in instantaneous phase between them. If both are cycling through their phases in lockstep, so that each goes through its phase 0 at the same time, then we would say that they exhibit 1:1 frequency ratio and
are “phase locked” since their relative phase remains at zero while they oscillate. This condition is represented as a fixed point on the circle of relative phase. If one of them runs slightly faster (e.g., with a period 5% shorter than the other's), then the relative phase of the two-oscillator system advances by 5% of a cycle on each cycle of the faster oscillator, taking 20 such cycles to return to 0 relative phase. What if one oscillator is ‘tapping’ twice for each tap of the other? In that case, relative phase will advance half-way around the circle (in the first cycle of the faster oscillator), then traverse the second half of the circle and take its next beat with both oscillators striking synchronously again at 0 relative phase. Later we will examine a mathematical model of a system of attractors on relative phase, a model that accounts for the attractiveness of frequency ratios like 1:1, 2:1 and 3:1.
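The arithmetic of relative phase can be checked with a short sketch. The function name and the choice to sample relative phase at each phase zero of the second oscillator are assumptions made for illustration; exact fractions are used so that the results are not blurred by rounding.

```python
from fractions import Fraction


def relative_phase_at_beats(period_a, period_b, n_beats):
    """Relative phase (0-1) of oscillator B against A at each of B's phase zeros."""
    ratio = Fraction(period_b) / Fraction(period_a)
    # At B's k-th beat, A has completed k * ratio cycles, so B leads by the remainder.
    return [(-k * ratio) % 1 for k in range(n_beats + 1)]


# B has a period 5% shorter than A: relative phase creeps around the circle.
drift = relative_phase_at_beats(Fraction(1), Fraction(19, 20), 20)

# B taps twice for each tap of A: relative phase alternates between 1/2 and 0.
two_to_one = relative_phase_at_beats(Fraction(1), Fraction(1, 2), 4)

print(drift[1], drift[20], two_to_one)
```

In the first case the relative phase advances by 1/20 per beat and returns to 0 only on the twentieth beat. In the second it returns to 0 on every other beat, which is why a 2:1 ratio counts as a stable, repeating relationship.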
Phonological meter and dynamics Working with several former students at Indiana University (Fred Cummins and Keiichi Tajima), we have taken the first steps toward ways of studying the dynamical properties of language. This approach requires asking somewhat different questions about language and requires use of some unfamiliar methods of statistical analysis. First, this strategy leads us to look at linguistic performances from which durational measures could be obtained. Thus, periodic behavior should be the first place to look, because the dynamics is relatively simple there. Secondly, testing dynamical models requires collecting large amounts of data, because the shape of distributions (and not necessarily their mean values) plays an important role in drawing inferences about the dynamical model. (See, eg. Schöner et al 1986, Cummins 1996). Third, it turns out that the simplest kind of dynamics to study is that of periodic patterns - of patterns that involve some kind of cycling at time scales between about a quarter second and a second or two. Therefore, it is to such phenomena that we will turn next. Traditional statement of the problem. Within the tradition of formal linguistics, these temporal phenomena are addressed under the rubrics of ‘stress’ and ‘meter’ (Liberman 1978, Liberman and Prince 1977, Hayes 1995) and are modeled using hierarchically arrayed static symbols. We suspect that many of these linguistic phenomena, like the [stress] features on particular syllables of particular words, may nevertheless tend to ‘attach themselves’ or ‘entrain themselves’ to some temporal metrical pattern. Only relatively recently have any research strategies for investigating the temporal structure of linguistic units begun to be developed (Browman and Goldstein 1986, 1995, Saltzman 1995, Tuller and Kelso 1986, Thelen and Smith 1996). Our approach. In the study of gross motor control of limbs, dynamic models have been applied with striking success. 
Work by many researchers (including Bernstein, Turvey, Kelso, Saltzman and many others) has shown that dynamic systems theory in one form or another offers a plausible framework for describing the coordination of motor activity. Obviously, speech too is a motor act and thus should be subject to such analysis. So dynamical models seem an appropriate place to begin.
If dynamics handles motor control, then surely the perception of speech as well should require a dynamical interpretation. But since perception is a purely cognitive process and does not involve the macroscopic motion of structures having mass, the dynamics may differ in certain details. It may be that the process of speech perception of prosodic structures is appropriately modeled by an abstract dynamic system whose dynamics resemble the dynamic structures observed in motor control for skilled actions. At the most abstract level, both systems may be entrained to meter. We may imagine such systems acting as though they had mass and stiffness even though there is no mass. When it is modeling stimulus patterns that reflect the motion of physical objects (e.g., jaws, lips and vocal cords), the system should act as though it had masses and springs. One goal of our research program is to find out what kind of dynamic system could model the rhythmic aspects of both perception and production of linguistically controlled speech gestures. Perception and production may share certain abstract structures yet clearly these structures differ between languages. It appears to take some years to acquire native-like speech rhythm. This rhythmical structure forms one important basis of the perception of foreign accent (Tajima et al 1997). Since we expect perception to work on dynamic principles, it becomes plausible that the phonology of any language is itself simply a particular complex dynamic system, or a configuration of parameters that will ‘generate’ phonologically appropriate gestures in time. We might imagine that parallel oscillators at levels like the syllable and foot could be easily generated. One paradigm for testing dynamical models of oscillating systems is to entrain one system to an external pattern. Then if the stimulus pattern is perturbed, one can explore. 
In the next section, some of the evidence will be reviewed that speech often behaves as though coupled to simple meters (Port 2003). That is, speech often shows cyclic repetition of similar events after similar time intervals. Hopefully some of the phenomena currently described using traditional linguistic symbolic concepts such as ‘stress’, ‘metrical hierarchy’, ‘foot’, ‘stress-timing’, etc. can be interpreted as manifestations of a general real-time model for meter. A model of this kind would incorporate continuous-time dynamic notions like oscillator, velocity, phase angle, stiffness and entrainment, in order to deal with the phonological units currently interpreted as hierarchical data structures of static symbols. If this research program is successful, it would ultimately provide for each language a linguistic description which, unlike standard symbolic models of speech, will require no additional clocking devices to perceive speech or produce it in real time. That is, it should require no external mechanisms in order to actually execute the model in time. The model specifies its own behavior in time, given specification of parameters like rate and a piece of text. This contrasts with the traditional symbolic models that always require some external implementational mechanism (eg. some linguist or piece of computer hardware) that will run the rules of the analysis and test their adequacy.
Phonetic Research on Speech Rhythm Poets, phoneticians and most ordinary speakers share the intuition that speech is often rhythmically performed. It seems likely that all linguistic communities exhibit some overtly rhythmical styles of speech — whether as poetry, song, chant, preaching, exotic styles of declamation or simple worksong. Many communities also have conventional forms of group recitation or responsive reading where a text is recited in unison. (Note the highly rhythmical style that American schoolchildren use to recite the “Pledge of Allegiance to the Flag,” for reciting the alphabet or multiplication tables, or the rhythmic chanting of monks and worshipers.) Given the ubiquity and appeal of such metrical genres of speech to ordinary humans, one might look for similarities between these styles and normal spoken language in order to reveal something about the structure of language itself. Within linguistics, in fact, there remain longstanding arguments about the extent to which perceived rhythm in normal prose reflects either quantitative constraints on timing or, conversely, that perception is an experience based merely on alternations of serially ordered units (Liberman 1978, Boomsliter and Creel 1977). Yet even ordinary prose speech is sometimes described in rhythmical terms, even to the extent of representation by musical notation (eg. Jones 1932, Martin 1972). Kenneth Pike (1945) characterized some languages (like English and Russian) as “stress-timed,” suggesting that in these languages the “time interval between the beginning of prominent syllables is somewhat uniform” (p. 34) and their component intervals (like syllables and segments) were stretched or compressed to make the onsets of stressed syllables more equally spaced. He claimed that some other languages (like French and Spanish) were “syllable-timed,” meaning that each syllable is produced with an equally spaced psychological beat of its own (p. 35). 
Abercrombie asserted boldly that “all human speech possesses rhythm” and further claimed that all languages in the world are either stress-timed or syllable-timed (1967). He proposed that listeners who speak the former rhythmic type of language would have “expectations about the regularity of the succession of stresses” and the latter type would have expectations about syllables. One issue that reopened the debate about the temporal basis for speech rhythm was the investigation of so-called perceptual centers or P-centers in the late 1970s. Periodicity implies regularly occurring events which need not be identical but only similar on successive cycles. However, the observable events that can be easily measured were found to differ from syllable to syllable. An English syllable can begin with a single consonant, a cluster of 2 - 3 consonants or no consonant at all. Which point in one syllable should be lined up with (that is, treated as the same as) which point in another? From experiments in which subjects were asked to read a simple list of alternating monosyllabic words, like “ba, spa, ba, spa, . . . .”, it became apparent that subjects tried to locate the words isochronously in time. But the time points that are adjusted to equalize the apparent spacing of the words may not have a simple correspondence with any single salient feature of the signal, although a good first approximation to the P-center appears to be
the point of the onset of voicing (Marcus 1981), roughly the onset of a stressed vowel. The more successful P-center extractors simply look for a sharp increase in the low-frequency energy of the speech signal, smoothed at a time scale that yields roughly one bump per vowel. It appears that listeners ‘impose regularity’ on the speech signal that reflects their ability to predict what will happen and when. Success at these predictions gives rise to a strong experience of periodicity (Jones and Boltz 1989). We suspect that the production system used by speakers incorporates an oscillator that generates rhythmic performance during speech production and also generates a similar perceptual rhythm when listening to speech (cf. Large and Kolen 1996, Large and Jones 1999). Thus, one can only agree with Abercrombie that listeners “have an immediate and intuitive apprehension of speech rhythm.” Our goal is to discover what kind of mechanism could support this apprehension, at least for the easiest cases.

The speech cycling task

In the 1990s, my research group explored the task we call “speech cycling” as a class of methods that may be useful for studying the timing of speech (e.g., Cummins and Port 1998, Port, Tajima, Cummins 1996, 1998, Cummins 1996, 1997, Tajima and Port 2003). Subjects are asked to repeat a short text fragment over and over, as in the P-center experiments but with more text material. Here we will glance at results from two of these experiments. The first example of typical results is from a rate-variation study in which subjects were asked to say a simple phrase with two natural stresses (on the first and last syllables). In this case they repeated “Give the dog a bone” once on each metronome pulse. The metronome period started at a very slow 2.2 sec and was shortened in small steps on successive trials until subjects could no longer perform the task (typically at a period around 700 ms). In each trial subjects listened to four beeps at a constant metronome rate.
Then they jumped in and repeated the phrase 8 times along with the metronome. No instructions about speech timing were given except to line up the initial syllable “give” with the metronome. Then, from the audio files, the phase angle of the onset of “bone” relative to the onset of the two adjacent “gives” was measured on scale of (0, 1) using a semiautomatic beat extractor program (Cummins 1996). If the internal timing of a phrase is not influenced at all by the fact that one is about to repeat the phrase again, one should expect a relatively smooth and uniform distribution of syllable onsets for the final word. At very slow rates the phase angle of “bone” should occur at smaller values and then increase fairly uniformly as the amount of time available for the repetition gets shorter. But figure 1 shows a frequency histogram of the observed phase angle of “bone” relative to the phrase onset summed across all 4 speakers and the full range of metronome rates. (In other experiments, we have demonstrated a small amount of ‘hysteresis’ in this task due to increasing vs. decreasing metronome rate across trials, but we ignore that effect here.) Clearly there is a strong preference here for locating the onset of the syllable
“bone,” the ‘nuclear stressed’ syllable, at a phase close to 1/2. It is also possible that there are other preferred phases near 0.3 and 0.6 (though these particular data are only weakly suggestive of other attractors). The reader is encouraged to try repeating our test phrase at a comfortable rate (about a one-second repetition rate). The pattern you will most likely employ places the onset of the word ‘bone’ just halfway between successive initial stressed ‘Gives’. So it is rather like “give...bone..give..bone..” and so forth — where “bone” is half way between “gives”. If the repetition rate is particularly slow (eg. a two-second or longer repetition rate), then (unless you really drag out the pronunciation of the phrase) you probably line up the main syllables like this: “give..bone..[rest]..give..bone..[rest]..” (with ‘bone’ occurring at 1/3 of the phrase repetition cycle). The results of our experiment are quite robust and exhibit similar behavior despite changes in the text and in many details of instructions. Still this particular method has some problems, such as the fact that some speaking rates are much easier to perform than others. In later experiments, we refined the method by varying the phrase repetition rate such that speakers did not have to change their rate of speech very much. In the following experiment (from Cummins and Port 1998), we asked subjects to repeat a phrase like ‘Beg for a dime’ in time with a metronome but this time the metronome alternated between two different tones. One tone marked phrase onsets — where subjects should place the first words “Beg”. The second tone marked where “dime” should be located. By keeping the “Beg...dime” interval constant and varying only “Beg...Beg”, the speaking rate of the subjects could be kept roughly constant while testing subject ability to repeat the test phrase so as to place the final stress at arbitrary phase angles of the phrase repetition cycle.
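The phase measure used in both experiments is simple arithmetic, and a short sketch may help. The onset times below are invented for illustration and are not data from the experiments; the function names are hypothetical, and the real measurements came from a semiautomatic beat extractor rather than hand-entered times.

```python
import statistics


def onset_phase(phrase_onsets, target_onsets):
    """Phase (0-1) of each target onset within its enclosing phrase cycle."""
    phases = []
    for start, end in zip(phrase_onsets, phrase_onsets[1:]):
        for t in target_onsets:
            if start <= t < end:
                phases.append((t - start) / (end - start))
    return phases


def cycle_for_target_phase(fixed_interval, target_phase):
    """Phrase repetition period that puts a fixed onset-to-onset interval at a target phase."""
    return fixed_interval / target_phase


# Hypothetical onsets (in seconds) of "give" and "bone" over three repetitions.
give = [0.0, 1.0, 2.0, 3.0]
bone = [0.5, 1.52, 2.47]
phases = onset_phase(give, bone)
trial_median = statistics.median(phases)

# Second design: hold a hypothetical 0.6 s "Beg...dime" interval constant and
# lengthen the "Beg...Beg" cycle to move the target phase from 1/2 to 1/3.
cycle_half = cycle_for_target_phase(0.6, 0.5)
cycle_third = cycle_for_target_phase(0.6, 1 / 3)
print(phases, trial_median, cycle_half, cycle_third)
```

Summarizing each trial by its median phase, as in the second function call above, is one simple way to reduce a trial's repetitions to a single value that can be entered in a histogram.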
Figure 1. A frequency histogram of the phase of onset for the final stressed syllable, “bone”, in “Give the dog a bone” as the metronome rate was increased from very slow to very fast. A strong preference for phases near 1/2 is evident plus a possible attractor near 1/3. (From Port, Tajima and Cummins, 1998).
Figure 2 shows histograms of the observed median phase of each trial of 10-12 repetitions of the test phrase, for target phases drawn from a uniform random distribution between 0.3 and 0.75, for four of the 8 subjects (the other four closely resembled these). It can be seen that subjects found certain ranges of phase easy and natural to reproduce while other values of target phase (which had been uniformly distributed along the X axis) could not be reproduced accurately. It is clear that all these subjects found 1/2 and 1/3 to be phase lags that could be easily reproduced and that 3 of the subjects also exhibit attractive phases near 2/3. When target phases occurred at other phase angles, they tended to be reproduced with values close to these three values, 1/2, 1/3 and 2/3. This very strong constraint on the timing of speech production is persuasive evidence that subjects (Ss) are somehow subdividing the long cycles into integer fractions (that is, the harmonic fractions: halves and thirds). Apparently they actually have attractor phase angles at these harmonic fractions of the phase circle. These attractors are similar in certain ways to the attractors of relative phase observed in the finger-wagging task employed by Haken, Kelso and Bunz, thus suggesting a similar model, but one with attractors at 1/3 and 2/3 as well.

Conclusions. Evidence from these experiments, as well as from various folk-art genres of human speech, shows that humans have a strong tendency to produce speech in rhythmical ways. Periodic structure in speech is so ubiquitous that we must have an account of speech that renders such entrainment not only possible but likely and natural.

Meter system. It seems that a natural way to account for these 2-beat and 3-beat patterns during speech cycling is to propose that when Ss repeat the text, they activate a meter system (see Port 2003).
A meter system is hypothesized to be a pair (or more) of oscillators that are phase locked (that is, coupled in such a way that the pair of oscillators fire simultaneously when possible). They fire at frequency ratios of 1:2 or 1:3 (or conceivably m:n) (cf. Large and Kolen 1996, Large
Figure 2. A frequency histogram of the phase of onset of the phrase final stressed syllable in the Cummins and Port experiment (1998). Target phases were specified uniformly along the phase continuum from 0.3 to 0.7. These histograms show the median phase for the ‘beat’ of the final stressed syllable on each trial of 10-12 repetitions. Only a few intervals along the phase circle seem to attract the onset of the stressed syllable.
and Jones 1999). A reasonable guess is that some region of the brain must exhibit a pattern of firing that cycles for each of these oscillators, eg. for both the 1 and the 2. The pulses of these oscillators apparently attract the stressed syllable ‘beats’ of the repeated phrases. The balance of this paper will develop aspects of a mathematical model that will begin to provide an explanation of some of the phenomena just described.
Comment on ‘universals of language’. This research program began with the questions “What are the dynamical properties of language? And what properties are ignored by the conventional search for an inventory of the symbolic a priori units of language?” We had discarded as misguided the conventional search for linguistic universals. But surprisingly, almost as soon as a methodology for this study reached workable form, we began to realize that we were finding very strong and easily recognizable universal features of speech. For example, it is likely that periodic alternations of strong and weak elements in speech production are widespread across languages, possibly even universal (as proposed for the “metrical grid” by Liberman 1978). And it appears likely that the periodic attractors at 1/2, 1/3 and 2/3 that we have discovered are universals of all human behavior, including language production. They can probably be observed in most languages given an appropriate linguistic task. The difference now is that the universals we have discovered are not symbolic in form, but rather reflect dynamical systems that are very general and widespread. It seems possible that there is a fairly small number of distinct rhythmical styles, that is, styles for imposing rhythmic constraints on speech, across the languages of the world, as claimed by Abercrombie (1967) and hinted at by Pike (1945). It is not likely that such a typology will be as restricted and simplistic as the “stress-timed” vs. “syllable-timed” typology claims. But, just as there is only a small set of widely observed musical meters (e.g., 2-beat, 3-beat and 4-beat meters), there may be a fairly limited set of possible linguistic rhythmical styles too. Our approach seeks general dynamic mechanisms for rhythm and meter, suitable for any language (and probably for music as well). This work makes a small start toward uncovering a subdiscipline within phonology that might be called the temporal phonology of language.
Of course, other aspects of phonology (like the production of consonants and vowels) also require dynamical models, probably along the lines suggested by Browman and Goldstein (1986 1995) and by Saltzman and Byrd (Saltzman 1995, Byrd and Saltzman 2003). Do the rhythms that we have observed in speech have any relationship to other kinds of motor behavior? In the next section, we shall review a simple motor task having nothing to do with speech that exhibits simple rhythmicity. The task is simple enough that an easily understood mathematical model has been proposed. From this simple model, we will derive a variant that may be more suitable for the speech results. Dynamical Model for Limb Motion and Speech Timing One question to ask is whether there might be any similarity between these rhythmical phenomena and other dynamical effects observed in human
behavior even if they have no obvious relationship to speech at all. The Haken-Kelso-Bunz model (Haken et al. 1985) is a very simple model for some very simple motor phenomena.

The H-K-B model. Scott Kelso tried wagging his left and right index fingers in various ways. He observed (1981) that he (and his experimental subjects) could stably reproduce just two specific patterns of bimanual coordination of the index fingers. In one, the fingers approach each other at the midline of the body (such that homologous muscle groups contract simultaneously, defining zero relative phase). In the other, the fingers alternate in motion toward the midline, so that both move simultaneously to the left and then both to the right (defined as a relative phase of 0.5). He asked his subjects to move their index fingers back and forth in the 0.5 phase relation (it doesn’t seem to matter much what particular hand orientation you employ). Each trial began at a slow rate and sped up gradually. At some rate each subject became unable to maintain the alternating coordination and could not prevent a shift to moving the fingers symmetrically (in 0 relative phase). Haken, Kelso and Bunz proposed a model for this robust behavioral phenomenon (1985, Kelso 1995). The model proposes a vector field applying to the relative phase of the two fingers. This vector field always has an attractor at 0 relative phase (where both go toward midline and then both away). By saying 0 phase is an attractor, they are claiming that the state of the system will tend to move toward a relative phase of zero if it finds itself in the neighborhood. Thus it implies that moderate perturbations of instantaneous phase (like those that occur in the nervous system when you try to produce a semiskilled act like wagging your index fingers according to some silly instructions) will tend to result in automatic correction back to the region near 0 relative phase.
At slower rates, the model asserts the existence of a second attractor at phase 1/2. Although this attractor is not as ‘strong’ or ‘deep’ as the attractor at phase 0, it is nevertheless fairly effective, at least at slow rates. When the rate is increased, however, the attractor at 1/2 typically becomes weaker (modeled by reducing the amplitude of the second harmonic component for faster rates). Eventually the attractor at 1/2 disappears (at the “critical point”), and at still faster rates 1/2 becomes a maximally repelling phase angle. The HKB model of coordination is an example of a very high-level descriptive idea: observe macroscopic patterns of a complex dynamical system and seek collective variables (e.g., relative phase) and control parameters (e.g., rate) of the underlying self-organizing processes that coordinate a large number of subsystems (including not just the two fingers, but also at least the various muscle groups that control them and various relevant parts of the central motor system). Because complex systems have a propensity for turning themselves into simpler devices that may meet various functional requirements, the behavior of complex systems can sometimes be modeled by means of a few macroscopic order parameters, or dimensions, on which pattern changes are reflected (Haken
1983, Kelso 1995). It is possible that there may be certain dynamical characteristics that hold for many kinds of complex systems, from simple mechanical devices all the way to the motor-cognitive apparatus responsible for language. The HKB model puts the finger wagging task into a broader framework of phase transitions occurring when systems with two stable states turn into systems with only one stable state – as change in the control parameter causes changes in the attractor landscape. One way to think of attractors is as basins in a potential function, V, whose slopes represent the vector field that changes the state of the system. Thus in a potential function, a ‘valley’ is an attractor and a ‘peak’ is a repellor. The steeper the sides, the faster the change for any value of relative phase. This would lead a magic marble to settle toward local energy minima (imagining motion without any momentum in the basin). Phi stands for the collective variable, the relative phase of oscillators 1 and 2. Since the data seem to exhibit attractors at phi = 0 and phi = 0.5, the HKB-model describes the situation in terms of a potential function, V (made from negative cosines) and a control parameter, rate of finger oscillation. Increases in the rate cause the relative amplitude of a negative cosine with a trough at 1/2 to become smaller. V thus gives a range of attractor landscapes for different values of the control parameter, rate. The two attractors at the slowest rate correspond to phase lockings of antiphase and inphase. Figure 3 shows the attractor landscape for the slow rate with a moderately strong attractor at 1/2.
Figure 3. The H-K-B model of attractors on the relative phase of the two wagging fingers at slow rate. It shows attractors at a relative phase of 0 and 0.5. The potential function is simply the sum of two negative cosines, V(phi) = -a cos(phi) - b cos(2 phi), with relative phase phi expressed as an angle. As rate increases, the amplitude b of the cos 2phi term is assumed to decrease, and the phase angle of 0.5 becomes a repelling phase angle as b approaches 0. As b gets smaller (assuming a>b), the attractor at 0.5 gets shallower and flatter, thus predicting slower return to the attractor after any perturbation, and also predicting more variation in phase due to the constant internal noise than would be predicted for the slower rate.
At a slow rate, as shown in Figure 3, there is a deep basin in V which corresponds to the stronger inphase attractor and a shallower basin where the weaker, antiphase attractor is located. As the rate is accelerated, the weaker attractor basin is hypothesized to become more shallow, gradually losing its effectiveness as an attractor. The system will gradually lose its stability, and eventually reach the critical point where the attractor disappears and the system state must slip over to the deeper inphase basin.

Model Predictions. This set of potential functions makes several experimentally testable predictions (using stochastic methods, Schöner, Haken and Kelso 1986) when the control parameter (cycling rate) is manipulated so as to approach the critical rate.

1. When the rate is slowly increased while Ss maintain a target phase of 1/2, Ss will eventually slip over to the “inphase” pattern.
2. Even at slow rates, no phase other than 0 and 1/2 will be stable. (With practice, 1/3 and 2/3 and perhaps other phases might be trained up.)
3. “Critical slowing down.” Because the slopes of the attractor basin become more gentle at rates near the critical point, there will be less tendency to draw the phase back to the basin bottom. Therefore, given a single perturbation, such as mechanical hindering of the motor performance or a delayed stimulus cycle, when the rate is near the critical rate, the recovery of phase back to the pre-perturbation target phase should take longer than at slower rates.
4. “Critical fluctuations.” Given some amount of intrinsic noise, the weakening stability of the antiphase attractor should manifest itself as an increase of fluctuations of relative phase as the rate approaches the critical rate.

These predictions have been verified for the finger wagging task as well as for several other related tasks involving other limbs (see Kelso 1995 for review).
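The attractor landscape and its dependence on rate can be explored numerically. The sketch below assumes the usual two-cosine form of the potential, written on the 0-1 phase scale used in this chapter as V(p) = -a cos(2 pi p) - b cos(4 pi p); the coefficient values, the grid search and the function names are illustrative assumptions, with a smaller b standing in for a faster rate.

```python
import math


def hkb_potential(p, a, b):
    """V at relative phase p (0-1 scale): -a cos(2 pi p) - b cos(4 pi p)."""
    return -a * math.cos(2 * math.pi * p) - b * math.cos(4 * math.pi * p)


def local_minima(potential, n=3600):
    """Grid phases (0-1) where the potential is lower than at both neighbours."""
    v = [potential(k / n) for k in range(n)]
    return [k / n for k in range(n) if v[k] < v[k - 1] and v[k] < v[(k + 1) % n]]


def antiphase_curvature(a, b):
    """Second derivative of V at p = 1/2; a basin exists there only if positive."""
    return (2 * math.pi) ** 2 * (4 * b - a)


slow = local_minima(lambda p: hkb_potential(p, a=1.0, b=1.0))  # large b: slow rate
fast = local_minima(lambda p: hkb_potential(p, a=1.0, b=0.1))  # small b: fast rate
print(slow, fast)
print(antiphase_curvature(1.0, 1.0), antiphase_curvature(1.0, 0.3))
```

In this form the basin at 1/2 exists only while 4b exceeds a, so the critical point falls at b = a/4. As b shrinks toward that value the curvature at 1/2 shrinks toward zero, which corresponds to the gentler slopes behind predictions 3 and 4: slower recovery from a perturbation and larger fluctuations under the same noise.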
General Model of Meter

The phenomena reported here suggest a speculative new model of meter. This notion is more general than just a linguistic structure, and may include the HKB finger-wagging task as a special case. Only the most general properties are clear at the moment. A more explicit mathematical treatment is required to spell out its implications. This theory depends on the notion of adaptive oscillation (McAuley and Kidd 1998, Large and Kolen 1996) and closely resembles the Large and Jones (1999) notion of musical meter as coupled oscillators whose phase zeros specify temporal attractors in perception as well as production. See Port (2003) for further discussion. Notice that HKB had only one attractor on the relative phase circle in addition to 0, an attractor at 1/2. But the speech cycling tasks revealed three attractors in addition to 0, at 1/2, 1/3 and 2/3. The two new attractors can be accounted for by adding one more harmonic to the HKB model, a term of [-c cos 3phi] (where a, b and c are positive, c is normally smaller than b, and b is normally smaller than a). With inclusion of this term, then, if c is sufficiently large, attractors will appear (in the order of their attractiveness) at 0, 1/2 and then equally 1/3 and 2/3.
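The effect of the added harmonic can be explored with a small numerical sketch. It assumes the three-term potential V(p) = -a cos(2 pi p) - b cos(4 pi p) - c cos(6 pi p) on the 0-1 phase scale; the coefficient values and function names are illustrative assumptions, not fitted to the speech cycling data.

```python
import math


def meter_potential(p, a, b, c):
    """Three-harmonic potential at relative phase p (0-1 scale)."""
    return (-a * math.cos(2 * math.pi * p)
            - b * math.cos(4 * math.pi * p)
            - c * math.cos(6 * math.pi * p))


def local_minima(potential, n=3600):
    """Grid phases (0-1) where the potential is lower than at both neighbours."""
    v = [potential(k / n) for k in range(n)]
    return [k / n for k in range(n) if v[k] < v[k - 1] and v[k] < v[(k + 1) % n]]


# Relatively strong third harmonic: basins at 0 and at the thirds.
three_beat = local_minima(lambda p: meter_potential(p, a=1.0, b=0.5, c=0.3))

# Relatively strong second harmonic: basins at 0 and at 1/2.
two_beat = local_minima(lambda p: meter_potential(p, a=1.0, b=0.9, c=0.1))

print(three_beat, two_beat)
```

One caveat is worth checking before relying on this form. The slope of the three-term potential factors as sin(2 pi p) times a quadratic in cos(2 pi p), so it has at most six stationary points on the circle and hence at most three basins. For a single choice of a, b and c the sketch therefore finds basins either at 0, 1/3 and 2/3 or at 0 and 1/2, but not all four at once. Reproducing all four would seem to need either coefficients that vary with the task or basin shapes other than negative cosines.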
Figure 4 shows the archetypal structure of such a system displayed as a potential function. It seems likely that there is some region of tissue in the brain that cycles for each of these coupled oscillators. So, if one is producing (or listening to) a waltz rhythm, then something is oscillating on each beat as well as on each measure of 3 beats. If this is a metrical structure, then the two oscillations are phase locked so that they fire simultaneously at 0 phase of the slower oscillator. This system of one or several coupled oscillators should generate regularly spaced and nested, hierarchical structures of pulses. The way in which meter makes its presence felt, then, is that if a subject repeats a motor gesture (whether speech or nonspeech) or hears an auditory pattern that contains something perceptually salient (e.g., a loud onset) that repeats at an appropriate rate, then those events may become coupled to one of the pulses of the meter (either to a faster one or a slower one). Because many other kinds of behavior aside from speech can exhibit meter, we cannot call it essentially a linguistic structure. But apparently repeated text phrases (or the lines of a poem or chant) can become entrained to one, along with wagging or tapping fingers as well. Obviously, the relative depth of these attractors may vary with the task and with the subject, as well as over time within a subject (if certain patterns are practiced). It is only the general form of the potential function that I suggest may be a universal (on the assumption that all cultures can produce both 2-beat and 3-beat patterns in at least some genre or another). The details of attractor shapes may turn out to be quite different from negative cosines. These issues require more data and more theory. The proposal made here is that a set of attractors on the relative phase (shown in Figure 4) is what explains results like those in Figures 1 and 2.
Figure 4. A summary image representing the attractors of a 3-oscillator system with frequency ratios of 1:2:3 and a>b>c. It is proposed as a representation of the most likely attractors of the basic metrical patterns for most people (but may vary depending on cultural experience). The attractor at 1/2 is shown as less stable than at 0 but as more stable than at 1/3 and 2/3 because this is what seems generally to be observed.
What I have presented here is only a sketch of some of the properties that a general theory of meter should have: (1) both 2-beat and 3-beat attractors, with 2-beat being easier, (2) restriction to a certain range of rates, from 2-3 sec at the long end down to about a quarter second at the short end, and (3) attraction of prominent events (especially tapping sounds or rapid motions, whatever can serve to define a unique repeating event) toward the 0 phase locations of all the participating oscillators. This theory implies many testable hypotheses. Obviously both more modeling work and more experiments are required to clarify these ideas.

Concluding discussion
This paper has raced over a broad range of issues. It is time to wrap them all up, beginning with the data and working backwards to more general topics.

Meters as attractor systems. In recent experiments, we modified a familiar dynamical research method and asked subjects to repeat a piece of text regularly. During regular repetition of the utterance (especially when stabilized by a metronome), we found that harmonics of the repetition cycle appeared as attractors at very predictable points in time that happened to be small integer fractions of the basic text cycle. We call them attractors because the onsets of stressed syllables tend to be biased toward these specific phase angles. From these and other experiments we have observed some similarities between these speech patterns and other nonspeech periodic motor patterns. Eventually, we leaped to the hypothesis that there may be a general metrical structure that is widespread in human behavior (perhaps what linguists would like to call a ‘universal’). This system can be easily characterized (in the most distinctive cases) as an ‘underlying’ (in some sense) pair of coupled, phase-locked oscillators. The phase zeros of these oscillators appear to be rather powerful attractors for prominent motor or perceptual events.
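The phase angles mentioned above are relative phases: the position of an event within the repetition cycle, expressed as a fraction of that cycle. As a hedged illustration only (the function names and the timings are hypothetical, and this is not the analysis procedure used in the experiments), a stressed-syllable onset could be located on the phase circle and compared with the candidate attractors like this:

```python
def relative_phase(onset, cycle_start, next_cycle_start):
    """Position of an event within its repetition cycle, as a fraction in [0, 1)."""
    period = next_cycle_start - cycle_start
    return ((onset - cycle_start) / period) % 1.0

def circular_distance(p, q):
    """Shortest distance between two phases on the unit phase circle."""
    d = abs(p - q) % 1.0
    return min(d, 1.0 - d)

# Candidate attractor phases named in the text.
ATTRACTORS = (0.0, 1 / 3, 0.5, 2 / 3)

def nearest_attractor(phase):
    """Candidate attractor closest to the given phase (distance measured on the circle)."""
    return min(ATTRACTORS, key=lambda a: circular_distance(phase, a))

# Hypothetical timings in seconds: a phrase repeated every 2.0 s,
# with a stressed-syllable onset 1.0 s into the cycle.
print(relative_phase(1.0, 0.0, 2.0))  # 0.5
```

Measuring distance on the circle matters because a phase of 0.95 lies close to the attractor at 0, not far from it.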
We might say that the simplest possible meter is a single oscillator ticking at some frequency. Let’s call this Level 0 (modeled by a single adaptive oscillator). A meter system at Level 1 (the simplest structure that one would usually call meter) has two oscillators at frequency ratio 1:2 or 1:3 (most likely). These are the simplest hierarchical structures in time. A nested system at Level 2 might have 1:2 nested within 1:2 (thus resembling 1:4), or 1:2 nested within 1:3 (for one kind of 6/8 pattern). In musical contexts many other meters are possible (see Tajima and Port 1998, Gasser, Eck and Port 1998, Large and Kolen 1996 and Port 2003 for further discussion). Why is it so difficult to observe these meters in spontaneous activity or natural speech? It is probably because in spontaneous action (including conversational talk) there are a great many factors affecting timing. At the very least there are the dynamics of the individual speech gestures, the dynamics of meter and the dynamics of ‘deciding what to do or say’. The metrical attractors are only one source of constraint. Only when most other variation is discarded, as when repeating some well-known text, do the meters spontaneously make themselves felt in human behavior. Presumably these attractors exist at all times, just waiting for an opportunity to entrain a speech pattern to themselves.
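The levels described above can be made concrete by listing the phases at which each oscillator reaches phase 0 within one cycle of the slowest oscillator. This is a minimal sketch under stated assumptions: the oscillators are idealized as perfectly phase-locked, the list `ratios` is read from the slowest level downward (one possible reading of "nested within"), and the function names are illustrative.

```python
from fractions import Fraction

def pulse_phases(ratios):
    """Pulse phases for each level, as fractions of the slowest cycle.

    `ratios` lists the subdivisions from the slowest level down, e.g. [3, 2]
    means three beats per cycle, each beat divided in two.
    """
    levels = [{Fraction(0)}]  # Level 0: a single oscillator, one pulse per cycle
    per_cycle = 1
    for r in ratios:
        per_cycle *= r
        levels.append({Fraction(k, per_cycle) for k in range(per_cycle)})
    return levels

def strength(levels):
    """Map each pulse phase to the number of oscillators that fire there."""
    counts = {}
    for level in levels:
        for phase in level:
            counts[phase] = counts.get(phase, 0) + 1
    return dict(sorted(counts.items()))

# Level 1 meter with ratio 1:2, then a Level 2 meter with a 3-beat and a 2-beat level.
print(strength(pulse_phases([2])))
print(strength(pulse_phases([3, 2])))
```

Phases where more oscillators coincide (phase 0 above all) correspond to the metrically stronger positions in the nested hierarchy.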
Linguistic theory. Some may argue that these results showing how speech is easily entrained to periodic patterns and the demonstration of nondiscrete categories may be of some interest, but that, no matter how regular they are, these results are not about language at all, but only about linguistic performance. In response, I would repeat my argument that the distinction of Competence and Performance is a great mistake. The mistake is to assume in advance that Mind and Body are clearly distinct, and that we know enough about each of them that a line can be drawn. I don’t think we know enough yet. I would point out that the very fact that language is so easily entrained to meters provides us with some kind of vivid and potentially useful information about what words are really like. They do have temporal properties, and abstract [+stress] syllables are attracted strongly to the phase 0s of one meter or another. So it seems foolhardy to insist that this attractiveness and these rhythmic behaviors simply cannot be ‘The Language Itself.’ Surely if they show us critical facts about the ‘execution’ of language, they are also showing us something important about the language itself. Only words that are temporal, dynamical objects could be coupled to a metrical system. No matter how one may prefer to think of language, if one also seeks to study language in a way that might be useful for clinical purposes or for high performance speech recognition, then dealing with the temporal properties of language will prove more than simply “relevant”; it is essential. The fact that linguistics has always restricted itself to static descriptions in the past is not a good rationale for continuing to ignore temporal structure. Approaching linguistics from this perspective implies that new metaphors will be needed for interpretation as well as new methods of research.
Sooner or later, linguistics must deal with time, because sooner or later language is always spoken, listened to and read in time. The Speech Cycling tasks developed here may turn out to be revealing for many kinds of linguistic phenomena. Of course, these tasks are subject to the accusation of being ‘artificial’, but this is a small price to pay. It is almost impossible to see what various temporal factors do to speech ‘in the wild’ and in unconstrained contexts. When speakers repeat something over and over, it gets polished, and symmetries that are intrinsic to the behavior of the system can begin to assert themselves. It is rather like studying the angles and the refractive properties of a large crystal of salt in order to learn about the micro-structure of individual salt molecules. When a small form is repeated over and over, some of its intrinsic regularities may become observable. Speech cycling employs massive repetition to render visible the information we need about subtle linguistic patterns. Although phonology and phonetics are the first subdisciplines of linguistics to exploit speech cycling as a tool to study language, it is possible that variants of the method can be employed for research on syntax and other issues.

Cognitive Science. Does all of this discussion about how language should be analyzed have implications for cognitive science in general? First, we have argued in this paper that natural language does not necessarily provide the prima facie evidence for cognitive symbols that it is often assumed to provide. The fact that people talk using fairly discrete words does not tell us a lot about how we manage to think in general. Languages are, at best, only partially symbol-like. Nondiscrete ‘categories’ exist, and non-serial order features seem to differentiate one language from another (and thus must be learned rather than being universal and a priori). Therefore one cannot assume that linguistic units are authentic examples of symbolic structures. Instead of linguistics providing the model for general cognition, linguistics too needs an account of what its structures are and how they could exist. Hopefully dynamical systems theory will continue to provide improved understanding of such phenomena. Second, just as a new unexplored set of dynamical phenomena presented themselves for investigation when linguistic behavior was examined from the dynamical orientation (e.g. with speech cycling tasks), other aspects of cognition may similarly find new and productive research methods when the possibility of entrainment in time is explored. Finally, language seems to have always provided the basic metaphor for the rest of cognition. Thinking is a little like talking. Aristotle based his propositional logic on simple sentence structures, and 19th century predicate calculus enlarged the linguistic coverage of logic considerably. The discreteness of language has had the effect of encouraging us humans (or at least Westerners) to believe that the powers of language could be captured by mechanical automata. But thus far machines have been able to emulate humans only when we leave time out, that is, when the inputs are assumed to be static. Even for language, where the phenomena look at first glance to be discrete and representable quite naturally in the static medium of writing, it turns out that understanding its temporal (nonserial) structure plays a critical role in theories of cognition. Writing, no matter how practical it may be, leaves out something central to a practical understanding of language: the dimension of time.
In the same way, cognitive science too must take responsibility for the temporal coordination of cognitive acts. Ordinary thinking and reasoning also take place in real, continuous time. Coupling of all these perceptual and motor activities with predictable external events is surely quite natural and unavoidable. Finally, looking even beyond cognitive science, it is reassuring to note that the simple temporal structures we observe in speech and song are very reminiscent of many kinds of periodic structures in time and space. For example, tubes and strings resonate at discrete harmonically related frequencies (if they have uniform diameter), crickets chirp at regular rates, and clouds and sand dunes often arrange themselves, we might say, in spatially periodic ‘streets.’ Tigers, zebras and butterflies grow regular periodicities of coloring. Fireflies and cicadas sometimes entrain themselves to each other in time. Mathematical accounts of all these phenomena appeal to the behavior of relatively simple dynamic systems, whether in the atmosphere, in embryonic zebra skin, in the firefly brain, or wherever (Haken 1983, Murray 1993). It seems likely that the morphogenetic processes that structure the mechanical world and biological systems are very similar to the ones that contribute to the creation of linguistic and other cognitive structures.
Acknowledgments
The author is grateful for helpful discussions and comments to Tony Chemero, Fred Cummins, Ken de Jong, Mafuyu Kitahara and Keiichi Tajima.
Bibliography
Abercrombie, D. 1967. Elements of General Phonetics. Chicago: Aldine Publishing Company.
Abraham, Ralph & Shaw, Christopher 1983. Dynamics, The Geometry of Behavior, Part 1. Santa Cruz: Aerial Press.
Bloomfield, Leonard 1933. Language. London: Allen-Unwin.
Boomsliter, P. C. & Creel, W. 1977. ‘The secret springs: Housman’s outline on metrical rhythm and language’. Language and Style 10, 296-323.
Browman, Catherine & Goldstein, Louis 1986. ‘Towards an articulatory phonology’. Phonology Yearbook 3, 219-252.
__1995. ‘Dynamics and articulatory phonology’. In R. Port and T. van Gelder (1995) Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge: Bradford Books/MIT Press.
__1992. ‘Articulatory phonology: An overview’. Phonetica 49, 155-180.
Byrd, D. & Saltzman, E. 2003. ‘Task dynamics of gestural timing: Phase windows and multifrequency rhythms’. Human Movement Science 19, 499-526.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge: MIT Press.
Chomsky, N. & Halle, M. 1968. Sound Pattern of English. New York: Harper-Row.
Cummins, Fred 1996. Rhythmic Coordination in English Speech: An Experimental Study. PhD thesis, Indiana University, Bloomington, IN. Also Technical Report 198, Indiana University Cognitive Science Program.
__1997. ‘Synergetic organization in speech rhythm’. Proceedings of the Joint Conference on Complex Systems in Psychology, Gstaad, Switzerland.
Cummins, Fred & Port, Robert 1996. ‘Rhythmic constraints on stress timing’. In Proceedings of the Fourth International Conference on Spoken Language Processing. Alfred duPont Institute, Wilmington, Delaware, pp. 2036-2039.
__1998. ‘Rhythmic constraints on stress timing in English’. Journal of Phonetics 26, 145-171.
Fodor, Jerry A. 1975. The Language of Thought. New York: T. Y. Crowell.
Fodor, Jerry & Pylyshyn, Z. 1988. ‘Connectionism and cognitive architecture: a critical analysis’. Cognition 28, 3-71.
Fowler, C., Rubin, P., Remez, R. & Turvey, M. T. 1981. ‘Implications for speech production of a general theory of action’. In B. Butterworth (ed.) Language Production, pp. 373-420. New York: Academic Press.
Freeman, Walter 1975. Mass Action in the Nervous System. New York: Academic Press.
Gasser, Michael, Eck, Douglas & Port, Robert 1998. ‘Meter as mechanism: A neural network that learns metrical patterns’. Connection Science (under review).
Glass, L. & Mackey, M. 1988. From Clocks to Chaos. Princeton: Princeton Univ. Press.
Haken, H. 1983. Synergetics: An Introduction. Berlin: Springer-Verlag.
Haken, H., Kelso, J. A. S. & Bunz, H. 1985. ‘A theoretical model of phase transitions in human hand movements’. Biological Cybernetics 51, 347-356.
Haugeland, John 1985. Artificial Intelligence: The Very Idea. Cambridge: Bradford Books/MIT Press.
Hayes, Bruce 1995. Metrical stress theory: Principles and case studies. Chicago: Univ. of Chicago.
Hockett, Charles 1955. Manual of Phonology. Baltimore: Waverly Press.
James, Jamie 1993. The Music of the Spheres: Music, Science and the Natural Order of the Universe. Berlin: Grove Press/Springer-Verlag.
Jeka, J. J., Kelso, J. A. S. & Kiemel, L. 1993. ‘Pattern switching in human multilimb coordination dynamics’. Bulletin of Mathematical Biology 55, 829-845.
Jones, Daniel 1932. An Outline of English Phonetics, 3d Ed. Cambridge: Cambridge Univ. Press.
Jones, Mari & Boltz, Marilyn 1989. ‘Dynamic attending and responses to time’. Psychological Review 96, 459-491.
Keating, Patricia 1985. ‘Universal phonetics and the organization of grammars’. In V. Fromkin (ed.) Phonetic Linguistics: Essays in Honor of Peter Ladefoged, 115-132. New York: Academic Press.
Kelso, J. A. S. 1981. ‘On the Oscillatory Basis of Movement’ (conference abstract). Bulletin of the Psychonomic Society 18, 63.
__1995. Dynamic Patterns: The Self-Organization of Brain and Behavior. Cambridge: Bradford Books/MIT Press.
Kelso, J. A. S., Delcolle, J. & Schöner, G. 1990. ‘Action-perception as a pattern formation process’. In M. Jeannerod (ed.) Attention and Performance XIII. Hillsdale: Erlbaum.
Kelso, J. A. S. & Jeka, J. J. 1992. ‘Symmetry breaking dynamics of human multilimb coordination’. Journal of Experimental Psychology: Human Perception and Performance 18, 645-668.
Kelso, J. A. S., Buchanan, J. J. & Wallace, S. A. 1991. ‘Order parameters for the neural organization of single multipoint limb movement patterns’. Experimental Brain Research 85, 432-444.
Kewley-Port, D., Watson, C. & Foyle, D. 1988. ‘Auditory temporal acuity in relation to category boundaries: Speech and nonspeech stimuli’. J. Acous. Soc. Amer. 83, 1133-1145.
Klatt, Dennis 1976. ‘Linguistic uses of segmental duration in English: Acoustic and perceptual evidence’. J. Acous. Soc. Amer. 59, 1208-21.
Ladefoged, Peter & Maddieson, I. 1996. The Sounds of the World’s Languages. Oxford: Blackwell.
Large, E. W. & Jones, M. R. 1999. ‘The dynamics of attending: How we track time-varying events’. Psychological Review 106, 119-159.
Large, Ed & Kolen, John 1995. ‘Resonance and the perception of musical meter’. Connection Science 6, 177-208.
Liberman, A. M. & Mattingly, I. 1985. ‘The motor theory reconsidered’. Cognition 21.
Liberman, Mark 1978. The intonational system of English. MIT Doctoral Dissertation, Department of Linguistics. Published by IU Linguistics Club, Bloomington, Indiana.
Liberman, Mark & Prince, Alan 1977. ‘On stress and linguistic rhythm’. Linguistic Inquiry 8, 249-336.
Marcus, S. 1981. ‘Acoustic determinants of perceptual center (P-center) location’. Perception and Psychophysics 30, 247-256.
Martin, James G. 1972. ‘Rhythmic (hierarchical) versus serial structure in speech and other behavior’. Psychological Review 79, 487-509.
McAuley, J. Devin & Kidd, Gary 1998. ‘Effect of deviations from temporal expectations on tempo discrimination of isochronous tone sequences’. J. Experimental Psychology: Hum. Perc. Perf., in press.
Murray, J. D. 1993. Mathematical Biology, 2d Ed. Berlin: Springer-Verlag.
Newell, Allen & Simon, Herbert 1975. ‘Computer science as empirical enquiry’. Communications of the Association for Computing Machinery 6, 113-126.
Norton, Alec 1995. ‘Dynamics: An introduction’. In R. Port and T. van Gelder (eds) Mind as Motion: Explorations in the Dynamics of Cognition, 45-68. Cambridge: Bradford Books/MIT Press.
Pike, Kenneth 1945. The Intonation System of American English. Ann Arbor: Univ. of Michigan Press.
Port, Robert F. 1996. ‘Phonetic discreteness and formal linguistics: Reply to A. Manaster-Ramer’. Journal of Phonetics 24, 491-511.
__2003. ‘Meter and speech’. Journal of Phonetics 31, 599-611.
Port, R. F. & Leary, A. 2000. ‘Against formal phonology’. Language 81, 927-964.
Port, Robert, Cummins, Fred & McAuley, Devin 1995. ‘Naive time, temporal patterns and human audition’. In R. Port and T. van Gelder (eds) Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge: Bradford Books/MIT Press.
Port, Robert & Crawford, Penny 1989. ‘Pragmatic effects on neutralization rules’. J. Phonetics 16, 257-282.
Port, Robert & O’Dell, Michael 1986. ‘Neutralization of syllable-final voicing in German’. Journal of Phonetics 13, 455-471.
Port, R., Dalby, J. & O’Dell, M. 1987. ‘Evidence for mora timing in Japanese’. Journal of the Acoustical Society of America 81, 1574-1585.
Port, R. & van Gelder, T. 1995. Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge: Bradford Books/MIT Press.
Port, R. F., Tajima, K. & Cummins, F. 1996. ‘Self-entrainment in animal behavior and human speech’. Online Proceedings of the 1996 Midwest Artificial Intelligence and Cognitive Science Conference, http://www.cs.indiana.edu/event/maics96/proceedings.html.
__1998. ‘Speech and rhythmic behavior’. In G. J. P. Savelsburgh, H. van der Maas and P. C. L. van Geert (eds) The Non-linear Analysis of Developmental Processes, 53-78. Amsterdam: Royal Dutch Academy of Arts and Sciences.
Port, Robert, Al-Ani, Salman & Maeda, Shosaku 1980. ‘Temporal compensation and universal phonetics’. Phonetica 37, 235-266.
Saltzman, E. 1995. ‘Dynamics and coordinate systems in skilled sensorimotor activity’. In R. Port & T. van Gelder (eds) Mind as Motion: Explorations in the Dynamics of Cognition, 150-173. Cambridge: Bradford Books/MIT Press.
Saussure, Ferdinand 1915/1959. Course in General Linguistics. New York: Philosophical Library.
Schöner, Gregor, Haken, H. & Kelso, J. A. S. 1986. ‘A stochastic theory of phase transitions in human hand movement’. Biological Cybernetics 53, 442-452.
Scholz, J. P., Kelso, J. A. S. & Schöner, G. 1987. ‘Non-equilibrium phase transitions in coordinated biological motion: Critical slowing down and switching time’. Physics Letters A 123, 390-394.
Scott, S. K. 1993. P-centers in Speech: An Acoustic Analysis. PhD thesis, University College London.
Stevens, K. N. & Blumstein, Sheila 1981. ‘The search for invariant acoustic correlates of phonetic features’. In P. Eimas and J. L. Miller (eds) Perspectives on the Study of Speech, 1-38. Hillsdale: Erlbaum.
Strogatz, Stephen 1994. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering. New York: Addison-Wesley.
Tajima, Keiichi & Port, Robert 2003. ‘Speech rhythm in English and Japanese’. In J. Local, R. Ogden & R. Temple (eds) Phonetic Interpretation: Papers in Laboratory Phonology Vol. 6, 317-334. Cambridge: Cambridge University Press.
Tajima, Keiichi, Port, Robert & Dalby, Jonathan. ‘Effects of speech timing on intelligibility of foreign-accented English’. Journal of Phonetics 25, 1-24.
Thelen, E. & Smith, L. 1996. The Dynamic Systems Approach to the Development of Cognition and Action. Cambridge: MIT Press.
Treffner, Paul J. & Turvey, Michael T. 1993. ‘Resonance constraints on rhythmic movement’. J. Experimental Psychology: Human Perception and Performance 19, 1221-1237.
Turvey, M. T. 1990. ‘Coordination’. American Psychologist 45, 938-953.
Tuller, B. & Kelso, J. A. S. 1990. ‘Phase transitions in speech production and their perceptual consequences’. In M. Jeannerod (ed.) Attention and Performance XIII. Hillsdale: Erlbaum.
van Gelder, Timothy & Port, Robert 1994. ‘Beyond symbolic: Toward a Kama-Sutra of compositionality’. In Vasant Honavar and Leonard Uhr (eds) Artificial Intelligence and Neural Networks: Steps Toward Principled Integration, 107-125. New York: Academic Press.
__1995. ‘It’s about time: Overview of the dynamical approach to cognition’. In Robert Port and Timothy van Gelder (1995) Mind as Motion: Explorations in the Dynamics of Cognition, 1-43. Cambridge: Bradford Books/MIT Press.
Van Gelder, Tim 1998. ‘The dynamical hypothesis in cognitive science’. Brain and Behavioral Science, in press.
Warner, N., Jongman, A., Sereno, J. & Kemps, R. R. 2004. ‘Incomplete neutralization and other subphonemic durational differences in production and perception: Evidence from Dutch’. Journal of Phonetics 32, 251-276.
Models of abduction
Paul Bourgine

The mind’s capacity to guess the hypothesis with which experience must be confronted, leaving aside the vast majority of possible hypotheses without examination
Peirce
Introduction
C. S. Peirce held that reasoning involves three kinds of inference: abduction, deduction and induction. His reflections on this question developed over forty years, at a time when logic in its modern guise started emerging. Peirce’s thoughts changed substantially during this period, as was shown by Burks, Fann, Thagard and Anderson. These analysts distinguished two main phases in Peirce’s reflections about the kinds of reasoning. In the first phase, before 1900, Peirce offers a syllogistic approach to the three kinds of reasoning; in the second phase, after 1900, he favors an inferential approach that is closer to scientific inquiry. In the syllogistic approach, exemplified by the celebrated Barbara syllogism, deduction is the type of reasoning which allows deriving, from a major premise (the rule) and a minor premise (the case), a conclusion (the result). Induction derives a rule by generalization from a collection of observations of case-result pairs. Abduction starts with the conclusion and the major premise to derive the minor premise. In the inferential approach, the function of abduction is to emit a hypothesis, which can be either a premise, a rule or a theory. The function of deduction is to draw from a hypothesis its necessary or probable consequences. The function of induction consists in comparing the predicted consequences with the observed results. These three kinds of reasoning are considered necessary in scientific inquiry, abduction being the main instrument in a logic of discovery. When a new surprising fact is encountered, the first stage in reasoning consists in abducing an explanatory hypothesis, the second in deducing the testable consequences and the third in testing these consequences to either confirm or falsify the explanatory hypothesis. In his inferential approach, the second Peirce clearly distinguishes these three kinds of inference and makes their relations explicit.
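The syllogistic reading of the three inferences can be sketched in a few lines. This is an illustrative toy, not a model proposed in the paper: a rule is encoded as a pair (M, P) read as "every M is P", a case as (s, M) and a result as (s, P), and all the names are hypothetical.

```python
def deduce(rule, case):
    """Rule (every M is P) and case (s is M) give the result (s is P)."""
    middle, predicate = rule
    subject, category = case
    return (subject, predicate) if category == middle else None

def abduce(rule, result):
    """Rule (every M is P) and result (s is P) suggest the case (s is M).

    The output is only a hypothesis: the inference is not deductively valid.
    """
    middle, predicate = rule
    subject, prop = result
    return (subject, middle) if prop == predicate else None

def induce(pairs):
    """Case-result pairs that all agree generalize to a rule (every M is P)."""
    if not all(case[0] == result[0] for case, result in pairs):
        return None  # each result must concern the same subject as its case
    categories = {case[1] for case, _ in pairs}
    properties = {result[1] for _, result in pairs}
    if len(categories) == 1 and len(properties) == 1:
        return (categories.pop(), properties.pop())
    return None

rule = ("M", "P")
print(deduce(rule, ("s", "M")))   # ('s', 'P')
print(abduce(rule, ("s", "P")))   # ('s', 'M')
```

The three functions use the same three ingredients and differ only in which two are given and which one is produced, which is the point of the syllogistic presentation.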
The aim of this paper is to sketch a theory of abduction, together with its relations to deduction and induction, in the sense of the second Peirce. Abduction is seen as a relation reciprocal to deduction, not directly as done by Flach (1996), but in a more general and sophisticated sense very close to the framework of belief revision (Alchourron & al. 1985, Gärdenfors 1988, Katsuno & al. 1991). One supplementary advantage beyond generality is that this conception of abduction is directly compatible with the conception of induction underlying belief revision.
This theory of abduction will be expressed within models belonging to different paradigms of cognition, the purpose being to check its adequacy for the whole field of cognitive science. Three main paradigms are considered: the cognitivist paradigm, which takes knowledge to be symbolic, with validity as a criterion of success; the connectionist paradigm, which substitutes sub-symbolic states of a neural network for symbols while retaining the same criterion of success; and the constructivist paradigm, which replaces the criterion of validity with an evolutionary criterion of viability and claims that the symbolic level is grounded in the sub-symbolic level. The three parts of the paper discuss models of abduction corresponding to these three paradigms. For the sake of clarity, inferences are not considered to be probabilistic and are only analyzed in a set-theoretic framework, which supposes that Nature’s responses are deterministic.

Abduction and cognitivism
Basic schema
Let us consider a special type of expert, a physician. There is a set of diseases that are not directly observable; only signs are available to the physician. A causal relation runs from a hidden disease to a set of observable signs. The physician’s reasoning moves in the converse direction, from the observable signs towards the hidden diseases: she has to “abduct” the hidden hypothesis (disease) from the observable facts (signs). What must first be understood are the constraints that apply to this kind of reasoning. The most convenient way to express these constraints, as is the case in other kinds of reasoning, is to spell out the axioms which abduction follows. A great advantage of such a specification of abductive reasoning is that it can be falsified by psychological experiments. More precisely, a set of axioms defines a class of abductive reasonings that can be enlarged by weakening the axioms or restricted by strengthening them. The following set of axioms seems to be both interesting for its global properties and plausible:
- A1-Consistency: nothing can be abducted from a contradiction. It is the dual of the well-known property of deduction: everything can be deduced from a contradiction.
- A2-Success: every hypothesis can be abducted from itself. Again, this is the dual of the property that every hypothesis can be deduced from itself.
- A3-Cautious monotony: if one can abduct a first hypothesis from a fact and if this hypothesis can be abducted from another hypothesis, one can abduct both hypotheses from the fact. This principle can be reformulated by saying that the abductive inference for a hypothesis can be extended to other underlying facts or hypotheses that might confirm the hypothesis.
- A4-Or: if one can abduct a first hypothesis from a fact and if a second hypothesis is incompatible with the first one, then the conjunction of the two hypotheses can be abducted from the conjunction of the fact and the second hypothesis. This principle can be reformulated by saying that, if many incompatible hypotheses are possible, they can be preserved within a particular scheme of abduction.
- A5-And: if one can abduct a hypothesis from two facts, then one can abduct it from the conjunction of these two facts.
We suppose that the facts and the hypotheses are represented as propositions constructed with the help of the classical connectors {¬, ∧, ∨, →, ↔} from a finite (for the sake of simplicity) set of atomic propositions P = {π1,…,πn}. The set W of possible worlds is the set of interpretations of P, i.e. the set of functions from P to {⊤ = true, ⊥ = false}. To each proposition α it is possible to associate the set A of possible worlds that make the proposition true. Then the syntactic implication α → β holds if and only if the semantic inclusion A ⊆ B holds between the corresponding sets of worlds A and B. And the syntactic equivalence α ↔ β holds if and only if the semantic equality A = B holds. In this framework, we now collect together the previous axioms for abductive inferences, where “α |< β” means “from α it is possible to abduct β” or, more simply, “from α one abducts β”:
A1-Consistency: ¬(⊥ |< α)
A2-Success: α |< α
A3-Cautious monotony: if α |< β and γ |< β then α |< β ∧ γ
A4-Weak Or: if α |< β and β ∧ γ → ⊥ then α ∧ γ |< β ∧ γ
A5-And: if α |< γ and β |< γ then α ∧ β |< γ
A deep representation theorem gives the canonical form of abductive inferences satisfying A1-A5. This representation theorem is given in two forms: theorem 1 links abduction and deduction; theorem 1’ prepares the link between abduction and induction, through the belief revision operator. 1
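The possible-worlds semantics used to state the axioms can be made concrete with a small sketch. It is only an illustration under stated assumptions: the three atom names are hypothetical, propositions are encoded as Python functions from a world to a truth value, and nothing here implements the abduction relation itself.

```python
from itertools import product

ATOMS = ("p1", "p2", "p3")  # hypothetical finite set P of atomic propositions

# W: every interpretation of P, i.e. every assignment of True/False to the atoms.
WORLDS = [dict(zip(ATOMS, values))
          for values in product((True, False), repeat=len(ATOMS))]

def models(proposition):
    """Indices of the worlds in which the proposition is true (its set A)."""
    return frozenset(i for i, w in enumerate(WORLDS) if proposition(w))

def implies(alpha, beta):
    """alpha -> beta holds iff the worlds of alpha are included in those of beta."""
    return models(alpha) <= models(beta)

def equivalent(alpha, beta):
    """alpha <-> beta holds iff both propositions have the same set of worlds."""
    return models(alpha) == models(beta)

alpha = lambda w: w["p1"] and w["p2"]
beta = lambda w: w["p1"]
print(implies(alpha, beta))  # True
print(implies(beta, alpha))  # False
```

With n atoms the set W has 2^n worlds, so this brute-force enumeration is only practical for small examples, but it makes the correspondence between implication and set inclusion directly checkable.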
Theorem 1 (representation theorem for abductive reasoning) : The abductive inference “|