Sensory Integration and the Unity of Consciousness [1 ed.] 9780262319270, 9780262027786

In this volume, cognitive scientists and philosophers examine two closely related aspects of mind and mental functioning


English Pages 423 Year 2014


Sensory Integration and the Unity of Consciousness

edited by David J. Bennett and Christopher S. Hill

The MIT Press Cambridge, Massachusetts London, England

© 2014 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please email [email protected].

This book was set in ITC Stone Serif Std by Toppan Best-set Premedia Limited, Hong Kong. Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Sensory integration and the unity of consciousness / edited by David J. Bennett and Christopher S. Hill.
pages cm
Includes bibliographical references and index.
ISBN 978-0-262-02778-6 (hardcover : alk. paper)
1. Consciousness. 2. Sensorimotor integration. 3. Senses and sensation. I. Bennett, David J., 1957–, editor. II. Hill, Christopher S., editor.
B808.9.S46 2014
128--dc23
2014003872

10 9 8 7 6 5 4 3 2 1

For DJB’s mom and dad, Elizabeth and Jacob

Contents

Preface

I Mainly on Sensory Integration

1 Bayesian Modeling of Perceiving: A Guide to Basic Principles
  David J. Bennett, Julia Trommershäuser, and Loes C. J. van Dam
2 The Multisensory Nature of Perceptual Consciousness
  Tim Bayne
3 The Long-Term Potentiation Model for Grapheme-Color Binding in Synesthesia
  Berit Brogaard, Kristian Marlow, and Kevin Rice
4 Intermodal Binding Awareness
  Casey O’Callaghan
5 The Unity Assumption and the Many Unities of Consciousness
  Ophelia Deroy
6 Multimodal Unity and Multimodal Binding
  Frédérique de Vignemont
7 Can Blue Mean Four?
  Jennifer Matey
8 Establishing Cross-Modal Mappings: Empirical and Computational Investigations
  Pawan Sinha, Jonas Wulff, and Richard Held
9 Berkeley, Reid, and Sinha on Molyneux’s Question
  James Van Cleve
10 Modeling Multisensory Integration
  Loes C. J. van Dam, Cesare V. Parise, and Marc O. Ernst

II Primarily on the Unity of Consciousness

11 A Unity Pluralist Account of the Unity of Experience
  David J. Bennett and Christopher S. Hill
12 Unity, Synchrony, and Subjects
  Barry Dainton
13 Experiences and Their Parts
  Geoffrey Lee
14 Unity of Consciousness: Advertisement for a Leibnizian View
  Farid Masrour
15 Partial Unity of Consciousness: A Preliminary Defense
  Elizabeth Schechter
16 E pluribus unum: Rethinking the Unity of Consciousness
  Robert Van Gulick
17 Counting Minds and Mental States
  Jonathan Vogel

Contributors
Index

Preface

The papers in the present volume examine two closely related aspects of mind and mental functioning: the relationships among the various senses, and the bonds that tie conscious experiences together to form unified wholes. A flood of recent work in both philosophy and perception science has challenged widespread conceptions of the traditional sensory systems (sight, touch, and so on) as operating in isolation. The contributors to Part I discuss the various ways and senses in which perceptual contact with the world is or may be “multisensory.” Recent years have also seen a surge of interest in unity of consciousness. The papers in Part II explore a range of questions about this topic, including the nature of unity, the degree to which conscious experiences are unified, and the relationship between unified consciousness and the self. Like the papers in Part I, those in Part II seek to integrate scientific and philosophical concerns.

Work on the present volume began with a conference held at Brown University in November 2011. David Bennett and Susanna Siegel were the organizers of the conference, and the funding came from the Brown Department of Philosophy and the Network for Sensory Research. The editors wish to thank Bernard Reginster, the Chair of Philosophy at Brown, for help in securing support for the conference, and also Mohan Matthen, Principal Investigator of the Network project. We also gratefully acknowledge help of various kinds from Kevin Connolly, Philip Laughlin, Aimee McDermott, and Joanna Rolfes.

There are some grouping and dependency relationships among the papers that should be noted. This is especially true of the papers in Part I, but there is also an important dependency relationship between two papers in Part II.

(1) The de Vignemont, Deroy, O’Callaghan, and Bayne papers all concern various aspects of perceptual binding, addressing philosophical concerns and issues in deeply empirically informed ways. One question addressed is how perceptual systems link disparate sources of sensory information to the same object or event.

(2) Sinha and Van Cleve discuss “Molyneux’s Problem.” Sinha presents and discusses relevant recent empirical results from his Project Prakash. Van Cleve places these recent results in a philosophical-historical setting.

(3) In their paper on synesthesia, Brogaard, Marlow, and Rice survey recent empirical results, along with the different empirical-scientific explanatory accounts on offer, including their own. Matey addresses the kind of content carried by certain kinds of synesthetic experiences, connecting her analysis to contemporary debates about the possible “high-level” content of everyday experiences.

(4) Bennett, Trommershäuser, and van Dam, in their contribution, guide readers through basic principles of recent computational modeling of cue combination. In their chapter, van Dam, Parise, and Ernst discuss recent computational modeling of cue combination that is tied closely to recent empirical results, which they also describe.

(5) Finally, in Part II, while Vogel’s contribution is to some extent a freestanding paper, it is largely a response to Schechter’s contribution.

The volume represents substantial commitments of time and energy by both editors, but CSH wishes to acknowledge that DJB is solely responsible for the original idea for the project and also that DJB has made a larger contribution to the editorial process.

We conclude by noting a work that is not cited in any of the following papers nor elsewhere in the contemporary literature on unity of consciousness, but that we believe to merit attention. Ivan Fox’s Ph.D. dissertation The Unity of Consciousness and Other Minds (Harvard, 1984) argues forcefully for the view that whole states of consciousness are metaphysically more fundamental than the individual experiences that are thought to constitute them.

I Mainly on Sensory Integration

1 Bayesian Modeling of Perceiving: A Guide to Basic Principles

David J. Bennett, Julia Trommershäuser, and Loes C. J. van Dam

1 The Single Cue Case. Example: Perceiving Slant from Texture

We begin with a simplified example concerning the perception of surface slant from sensitivity to texture information (cf. Knill, 2003, 2007). The basic, perception-science version of Bayes will fall out naturally and intuitively. In our toy model, suppose that it is assumed that surface texture elements are circular (see figure 1.1 and note 1). Suppose a perceiver views a surface head on, looking straight at a circular texture element. We’ll say that at “upright,” the slant of the surface is 90 degrees. That circular element will project to a circle on a flat projection plane, which approximates a sensory surface. The height-to-width ratio of a projected circular texture element is called the “aspect ratio” of the projected image. So, the aspect ratio of the image when looking straight on at the (90-degree, upright) surface is 1 because the image is itself circular. Now imagine the surface to be slanted back in depth. The projections of the same circular texture element will now be progressively “thinner” ellipses with progressively smaller aspect ratios (all fractions of 1, but ever smaller). There is a simple relation between such image aspect ratios and the slant of the surface:

$$A = \sin(S) \tag{1}$$

(A stands for the aspect ratio. S is the slant of the surface, measured from the “straight on” upright, where S = 90. If the surface is viewed straight on, the projection of a circular texture element will be a circle and the aspect ratio will be 1. As the surface is slanted back in depth away from upright, to 70 degrees, to 50 degrees, and so on, the projections will be successively thinner ellipses, with smaller aspect ratios.)
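As a minimal sketch of equation (1), the relation between slant and aspect ratio can be written as a pair of functions. The function names are hypothetical, and the inverse assumes a noise-free reading of a truly circular texture element, which is the idealization of the toy model rather than a claim about real perceptual systems.

```python
import math


def aspect_ratio_from_slant(slant_deg):
    """Equation (1): A = sin(S), with S in degrees and S = 90 at upright."""
    return math.sin(math.radians(slant_deg))


def slant_from_aspect_ratio(aspect_ratio):
    """Invert equation (1) for a noise-free reading; returns a slant in [0, 90] degrees."""
    if not 0.0 <= aspect_ratio <= 1.0:
        raise ValueError("aspect ratio of a projected circle must lie in [0, 1]")
    return math.degrees(math.asin(aspect_ratio))


if __name__ == "__main__":
    # Slanting the surface back from upright gives ever smaller aspect ratios.
    for slant in (90, 70, 50):
        print(slant, round(aspect_ratio_from_slant(slant), 3))
```

Because the arcsine is only taken over aspect ratios between 0 and 1, the sketch returns slants between 0 and 90 degrees, matching the range used in the example.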


[Figure 1.1. Panel “Slant from texture”: a slanted surface in lateral and frontal views, showing the texture element aspect ratio (slant = 90°, h/w = 1; slant = 45°, h/w = 0.707). Panel “Bayesian cue integration”: prior p(s) (expected slant), likelihood p(d|s) (measured slant), and posterior p(s|d) (perceived slant), plotted against slant (°).]

It follows that if the aspect ratio of a projection from the circular element were measured accurately, surface slant could be immediately and directly recovered under the assumption that surface texture elements are circular. Thus, a measured aspect ratio of 0.707 would correspond to a surface slant of 45 degrees.

Suppose reasonably, though, that there is noise in the measuring process in recording aspect ratios. As a result, on one occasion, a surface slanted at 45 degrees might give rise to an aspect ratio “measurement reading” of 0.643, “best corresponding” to a slant of 40 degrees by application of equation 1. On another occasion, that same surface (at 45 degrees) might lead to a measured aspect ratio of 0.719, “best corresponding” to a surface slant of 46 degrees by equation 1. And so on. If percipients operate by the measured aspect ratio (possibly in error due to noise), they’ll often be a bit off under repeated viewing of the same 45-degree slanted surface. The mean or center of the “aspect ratio” readings might correspond (by equation 1) to a 45-degree slant, but there will be a spread or distribution. The amount of “bouncing” or variability will depend on how noisy the measuring process is. The greater the spread, the less precise or reliable the measurement is said to be. (Note: accuracy is a different matter, determined by whether the mean of the distribution aligns with the actual quantity being measured.)

Suppose, though, that one also had a decent prior guess or hypothesis that surfaces tended to be slanted at 45 degrees in the organism’s surrounding environment (see note 2). A basic Bayesian idea is that drawing on such a prior, or working assessment of the distribution of slants in the world, can help leaven the possibly detrimental effects of measurement noise; such a prior hypothesis might also counter error in the operative working model of how the stimulus is generated or measured. See figure 1.1 for specific illustration. Intuitively, assuming such a prior distribution of slants will tug upshot estimates toward slant values in line with the prior probability distribution.

With that as background, here is the perception-science version of Bayes applied to the slant perception case:

$$P(s \mid d) = \frac{P(d \mid s)\, P(s)}{P(d)} \tag{2}$$

Think of d as the measured value of an aspect ratio. Therefore, on a given perceptual encounter there will be a specific value of d, a measured aspect ratio, plugged into equation 2 in inferring a perceptual estimate of what slant is present.

The P(s) is the prior probability distribution, with the basic meaning just discussed. Here the s ranges over slant values that are assumed to be distributed in the world in a certain way (see figure 1.1b). In our example, surface slants are assumed to be distributed around a mean of 45 degrees.

The P(d|s) is the likelihood function (see figure 1.1b). This reflects a working model of: (i) how the world projects to a viewpoint or sensory surface, and (ii) how those projections are measured, especially the noise/variability in the measuring process. In our case, the likelihoods reflect the geometry of projection of a circular texture element given by equation 1; for a given measured aspect ratio, d, the mean of the likelihood function will correspond to the slant given by plugging that measured value into equation 1. In our case, the likelihoods will also reflect the model of noise or variability in the measuring process. This corresponds to the spread or variance of the likelihood function (i.e., how peaked it is or how spread out).

The P(s|d) is the posterior probability distribution (figure 1.1b). Intuitively (see note 3), this gives the probability of surface slants given the measured value d of (here) an aspect ratio. The idea is that this is what the organism “works from.” The most likely slant, given the aspect ratio, is the point where the posterior distribution is maximal. Estimating the slant by choosing this point is called maximum a posteriori estimation (MAP). If the distributions are Gaussian, this will correspond to choosing the mean of the posterior. If the prior distribution is flat (and therefore uninformative), the estimate of the slant will be based only on the likelihood. This is called maximum likelihood estimation (MLE).
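The MAP and MLE estimates just described can be illustrated with a small grid computation. This is only a sketch under stated assumptions: the likelihood is taken to be Gaussian in slant and centered on the slant that equation 1 assigns to the measured aspect ratio, the prior is Gaussian around 45 degrees, and the two standard deviations (5 and 10 degrees) are hypothetical values chosen for illustration. The function names are likewise hypothetical.

```python
import math


def gaussian(x, mean, sd):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def estimate_slant(measured_aspect_ratio, noise_sd, prior_mean=None, prior_sd=None, step=0.01):
    """Return the slant (degrees) that maximizes likelihood * prior on a grid over [0, 90].

    With prior_mean=None the prior is flat, so the result is the MLE;
    otherwise it is the MAP estimate. P(d) is a constant and can be ignored.
    """
    # Assumption: the likelihood is centered on the slant given by equation 1.
    likelihood_center = math.degrees(math.asin(measured_aspect_ratio))
    best_slant, best_value = 0.0, -1.0
    for i in range(int(round(90.0 / step)) + 1):
        s = i * step
        value = gaussian(s, likelihood_center, noise_sd)
        if prior_mean is not None:
            value *= gaussian(s, prior_mean, prior_sd)
        if value > best_value:
            best_slant, best_value = s, value
    return best_slant


if __name__ == "__main__":
    d = 0.643  # noisy reading that "best corresponds" to about 40 degrees
    print("MLE:", estimate_slant(d, noise_sd=5.0))
    print("MAP:", estimate_slant(d, noise_sd=5.0, prior_mean=45.0, prior_sd=10.0))
```

With these illustrative numbers the flat-prior estimate stays near 40 degrees, while the prior centered on 45 degrees pulls the MAP estimate upward to roughly 41 degrees, which is the “tug” toward the prior described above.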
To summarize, the upshot perceptual estimate of (here) slant will result from interpreting the incoming stimulus in terms of two kinds of prior assumptions about the world. Such “prior assumptions” are reflected in the priors, and also in the likelihood functions. The latter (likelihood) assumptions may be embodied in a fairly complex model of stimulus/measurement generation, often called a “generative model.”

2 Perceptual Estimation: Multiple Sources of Information

So, we’ll assume that one source of slant information is the visual slant from texture source, as described above. Suppose on some occasion such a visual-texture-based path yields an estimate of slant, vs. We will assume that the other perceptual route at work derives slant from haptic information, perhaps by gauging wrist flexion or angle. Suppose that, on this occasion, this haptic route leads to an estimate of slant, hs. One way to arrive at an upshot estimate of slant based on these two estimates from different sources would be to take a weighted average:

$$\mathrm{Slant} = w_1 \cdot \mathit{vs} + w_2 \cdot \mathit{hs} \tag{3}$$

Here it is assumed that w1 + w2 = 1 (i.e., the weights are fractions that add up to 1; see note 4). The obvious question then is how to choose the weights. The basic idea of linear cue integration is that sources are weighted proportionally less if more noisy/variable (that is, less precise or less reliable). Under some reasonable assumptions, the resulting upshot estimates of (here) slant will be as precise (that is, as little variable) as possible (see Landy, Banks, & Knill, 2011, 7). That is, if the weights are chosen in this way to reflect variability/precision, averaging will minimize the variability in the upshot slant estimates over other choices of weights. In this circumscribed sense, the slant estimates arrived at in the way described are guaranteed to be “optimal” (see note 5).

Empirically it has often (though not invariably) been found that perceptual estimates that combine information from different sources are indeed optimal in the sense described (cf. Ernst & Banks, 2002; see Landy, Banks, & Knill, 2011 for an overview). Adjustments of weights to align with source noisiness/variability can indeed be rapid, sometimes within a few trials (cf. Seydell, Knill, & Trommershäuser, 2010, 2011).

It turns out that this sort of weighted averaging is a special case of a Bayesian model of cue integration that generalizes equation (2) above to the multicue case (Landy, Banks, & Knill, 2011, 8–10; see note 6). The case of estimation by (3) would correspond on the Bayesian formulation to a form of MLE (see section 1), only here there are two likelihood functions, one for each source of information. If a nonflat prior distribution is used, then, on the Bayesian formulation, the idea is that perceivers arrive at an estimate via a MAP estimate by taking the maximum of the upshot posterior distribution.

3 Perceptual Integration and Fusion

A challenge faced by perceptual systems is to determine when and how to combine information. Here we will describe a few of the puzzles involved in meeting such challenges. The first two subsections, (A) and (B), concern “decisions” confronting perceptual systems about whether to combine or integrate information in estimation. Subsection (C) concerns whether or when perceptual systems retain separate estimates from different sources (for example, visual and haptic) after integrating those estimates in estimating a worldly property. We do not say much here about the details of the computational modeling designed to explain perceptual system behavior in these cases, which can be complex, especially for cases (B) and (C). Our aim is mainly to isolate different perceptual system challenges. For a more complete summary of the modeling details, see van Dam, Parise, and Ernst (this volume), who also provide numerous references to relevant recent work on these topics in the modeling literature.

(A) In the example of slant perception developed in the preceding sections, the assumption made was that texture elements are circular. As a result, surface slant can be determined by equation (1) above. But suppose the additional use of stereo to gauge slant yields a widely discrepant estimate of slant consistent with the unusual (but possible) situation where the texture elements are ellipses and not circles. As a result, the most sensible perceptual system strategy may well be to veto the slant-from-texture assessment and instead rely only on the stereo-derived estimate of slant.
However, the cue combination schemes described thus far would lead to an upshot estimate between these two estimates, thereby skewing the upshot estimate in the direction of the divergent texture-based estimate. In a scheme explored by Knill (2003, 2007), veto behavior is modeled using a mixture likelihood. Such a mixture likelihood is derived from a likelihood reflecting the assumption that texture elements are circular and from a likelihood reflecting the assumption that the texture elements are elliptical. Such a mixture likelihood has long, flat, nonnegligible “tails” that essentially reflect the possibility that the texture elements are ellipses.
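A heavy-tailed mixture of the kind just described can be sketched numerically. The sketch below is not Knill's model: it stands in for the circular-texture likelihood with a narrow Gaussian and for the elliptical-texture likelihood with a much broader Gaussian, and every number in it (the standard deviations, the 0.1 mixing proportion, and the readings) is a hypothetical choice made for illustration, as are the function names.

```python
import math


def gaussian(x, mean, sd):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def texture_mixture_likelihood(s, texture_reading, sd_circle=4.0, sd_ellipse=40.0, p_ellipse=0.1):
    """Mixture likelihood: narrow component (circles assumed) plus broad tails (ellipses possible)."""
    narrow = gaussian(s, texture_reading, sd_circle)
    broad = gaussian(s, texture_reading, sd_ellipse)
    return (1.0 - p_ellipse) * narrow + p_ellipse * broad


def combined_estimate(texture_reading, stereo_reading, sd_stereo=4.0, step=0.01):
    """Slant (degrees) maximizing stereo likelihood * texture mixture likelihood, flat prior."""
    best_slant, best_value = 0.0, -1.0
    for i in range(int(round(90.0 / step)) + 1):
        s = i * step
        value = gaussian(s, stereo_reading, sd_stereo) * texture_mixture_likelihood(s, texture_reading)
        if value > best_value:
            best_slant, best_value = s, value
    return best_slant


if __name__ == "__main__":
    # Small conflict: the estimate lands between the two readings.
    print(combined_estimate(texture_reading=42.0, stereo_reading=45.0))
    # Large conflict: the estimate stays close to the stereo reading.
    print(combined_estimate(texture_reading=10.0, stereo_reading=45.0))
```

When the two readings nearly agree, the narrow component dominates and the result behaves like ordinary averaging; when they diverge widely, only the broad tails overlap the stereo likelihood, so the texture reading has almost no pull on the estimate.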


This leads to veto-like behavior when the estimates are widely discrepant (Knill, 2003, 2007; see note 7). Note that in this case it is granted as somehow known or assumed by the perceptual system that the separate (stereo and texture) estimates pertain to the same (here) slant property. The source of the discrepancy of estimates results from the falsity of the assumption that the texture elements are circular, which throws off the texture-based estimate of slant. The long-tailed mixture likelihood, in effect, allows the estimate that is discrepant (due to the failed assumption) to be vetoed or discarded.

(B) Section (A) presents an example in which additional information from a second cue serves to rule out as discrepant the estimate resulting from a false assumption about the world. However, more typically a discrepancy between sensory information cannot so easily be blamed on one information source or the other. Cue combination modeling should also explain how perceptual systems make a reasonable determination of whether sensory signals from different sources (say, haptic and stereovision) pertain to the same worldly property (slant, size, etc.). Consider, for instance, observers who are determining object size or width by both gripping the object and determining size visually via stereo. Due to measurement noise, there will inevitably be some discrepancy between the haptic size signal or estimate and the visual-stereo size signal even when the estimates are derived from the same width or expanse. The perceptual-system challenge is to decide whether the estimates are indeed of the same size or expanse and so should be integrated or combined in arriving at an estimate of size. This correspondence problem (or problem of causal inference) is an important challenge facing perceptual systems that has only recently begun to draw the sustained attention of modelers.
The modeling details in, for instance, the work by Ernst and collaborators are somewhat complex (see Ernst & Di Luca, 2011; Ernst, 2012; for a summary of the Ernst work, see van Dam, Parise, & Ernst, this volume; for a survey of related work, see Shams & Beierholm, 2010). However, the basic idea is that the perceptual system determines the extent to which any discrepancy in (here) haptic and visual estimates of size is likely due to (i) noise, (ii) a bias in the measuring of the worldly properties, or (iii) a difference in the worldly properties measured. To do this, prior knowledge is needed specifying how likely it is that the combination of the relevant kinds of sensory estimates (haptic and stereo, for example) co-occur or yield estimates that align with each other. This is captured in what is called the coupling prior, which encodes the assumption of how tightly the two signals are generally linked. An extremely tight linkage would mean that the system assumes that these sensory sources are always providing information about the same world property; a flat coupling prior means that their co-occurrence is governed by chance. Using this prior, the separate sensory estimates are each adjusted according to the estimate obtained from the other source and the strength of the assumed association. If after this step a discrepancy between the updated estimates is still sensed, this means it is likely that there is either a measurement bias in one or both of the senses or a real difference in the world properties measured with each. In that case, if it is determined as very likely that the visual and haptic estimates derive from different worldly size properties, then the estimates would not be further integrated or combined.

(C) Suppose once more that subjects are determining size or width by both looking at an object and gripping it (cf. Ernst & Di Luca, 2011, figure 12.3). Ernst and Banks (2002) found that haptic and visual size information is combined in reaching an upshot estimate of size in a way that maximizes precision. But it seems quite possible that even though perceivers combine the visual and haptic sensory information in this way, they still retain access to the visual and haptic signals or estimates. If so, there would not be complete perceptual fusion in the sense that the initial visual and haptic estimates can still be accessed and used. This is just what is found for touch and sight via the oddity task used in Hillis et al. (2002) and described in van Dam, Parise, and Ernst (this volume). By contrast, Hillis et al. (2002) found that for purely visual assessment of slant as gauged by stereo and texture information, subjects could not draw upon the individual (visual) stereo and texture signals or estimates, or at least not to the same extent as when each was presented in isolation.
Thus, at least for this sort of purely visual slant perception, there was perceptual fusion, more or less automatically leading to optimal integration or combination of the estimates. In modeling when and why fusion occurs, Ernst and collaborators make use of the coupling prior distributions noted in (B) (see van Dam, Parise, and Ernst, this volume, and Ernst & Di Luca, 2011). Such a coupling prior reflects assumptions about how likely it is that the different sorts of sensory signals occur together. These might be haptic and stereo-vision signals or the signals might both be visual, like texture and stereo. In effect, the strength of the association between, say, haptic and the stereo-vision sensory signals is encoded in the width or spread of this coupling prior distribution. Therefore, a tight or sharp-ridged coupling prior indicates that haptic size signals and stereo (vision) size signals tend to vary together. With this sort of prior information on hand, it is sensible for perceptual systems to treat any discrepancy between haptic and stereo signals or estimates as due to measurement noise and to be guided instead by the (strongly peaked) coupling prior, discarding the individual haptic and stereo estimates.

4 Brief Pointers to the Philosophy of Perception

Though the Bayesian approach to modeling cue combination is the dominant approach in perception science, the computational modeling details of this work are just starting to find their way into discussions of perception in philosophy (see Rescorla, forthcoming; see also Colombo & Seriès, 2012). In this section we briefly point to places where connections are there to be drawn and explored, with the science potentially stimulating, enriching, and constraining the work in philosophy.

As we have seen, the chief aim in this Bayesian cue combination modeling tradition is to explain how different sources of information are combined in reaching upshot estimates of single worldly properties, such as size or slant. This general kind of perceptual combination is what de Vignemont (this volume) discusses and explores as “integrative binding.”

First there are apparent connections to Molyneux’s question, which is explored in the Sinha et al. and Van Cleve contributions to this volume. As Van Cleve discusses, formulations of Molyneux’s question can differ subtly and importantly. But Molyneux’s question is essentially whether a person born blind would, upon gaining sight, recognize by vision the shape of an object previously known by touch. On the working outlook of cue combination modeling, the same spatial properties are accessed and estimated through different sources. This is in line with the idea, dating to Aristotle, that there are perceptual “common sensibles” (here, spatial common sensibles). Such common sensible views have traditionally been associated with a proposed or hypothesized “yes” to Molyneux’s question. This observed, the connection between the cue combination computational modeling and commitments on Molyneux is not straightforward.
A full charting would include exploring just how subjects gain the ability to access the same spatial properties across information gained from touch and sight, perhaps through some form of learning. There is still much to be learned about how the relevant kind(s) of perceptual learning is/are achieved (on the modeling of learning, see van Dam, Parise, & Ernst, this volume, and Ernst, 2012).

Here is another kind of connection between the modeling we have described and topics in the philosophy of perception. A number of contributors to this volume discuss how and when different properties (for example, slant and color) are bound to the same object or object surface. See especially the chapters by O’Callaghan, Deroy, and Bayne. De Vignemont (this volume) refers to this sort of perceptual-binding achievement as “additive binding.” Explaining this kind of binding of multiple properties to an object/event is not the primary aim of the kind of Bayesian cue combination modeling described in this chapter. By way of (oversimplified) illustration: in the case of Bayesian cue combination, the modeling task is to understand how to combine information that is at least partly redundant (haptic and visual information about slant, for example). By contrast, in the multiple property cases that O’Callaghan, Deroy, and Bayne focus on, the sensory information specifying one property (such as slant) and the sensory information specifying the other property (such as color) are not redundant; two different properties are indicated (to be bound to the same object).

Nonetheless, there is an important connection between the Bayesian cue combination case (redundant information) and the multiple property case (nonredundant information). A core challenge faced by perceptual systems that must be resolved if different properties are to be accurately bound to the same object or surface is determining whether the information guiding the detection of each property derives from the same object. Once again, see the contributions to this volume by Deroy, O’Callaghan, Bayne, and de Vignemont. Meeting this challenge requires solving a correspondence problem (or a problem of causal inference) of the sort described in part (B) of section 3 and in the van Dam, Parise, and Ernst (this volume) section on the correspondence problem.

Acknowledgments

Thanks to Cesare Parise for helpful discussion, as well as for figure 1.1.

Notes

1. Figure due to Cesare Parise.

2. A perennial challenge is to account for the origin of “prior knowledge” or “assumptions” engaged in perceptual response (see Ernst & Di Luca, 2012, for extended discussion). One possibility is that this assumed distribution has been programmed into the organism over evolutionary time in encounters over eons with typical earthly natural environments. Another is that the prior distributions and/or the likelihoods (reflecting world models) are learned. The question of whether the prior information is innate or learned (and if learned, under what conditions) can often be approached through experiment (van Dam, Parise, & Ernst, this volume; Ernst, 2012). The exact mechanisms of learning the relevant distributions are not well understood (as observed in both van Dam, Parise, & Ernst, this volume, and Ernst, 2012).

3. Strictly, for particular (real-valued) slants, the probability will be zero. So, if we are careful, we’d talk instead of areas under the “probability density function” corresponding to the probability that a slant will fall within a certain range.

4. Prior information or assumptions about the distribution of slants can be incorporated into the upshot estimate by adding a further estimate, pr (for “prior”), with its own weight:

\[ \mathrm{Slant} = w_1 \cdot \mathit{vs} + w_2 \cdot \mathit{hs} + w_3 \cdot \mathit{pr} \tag{4} \]

Here again it is assumed that the weights add up to 1.

5. The upshot estimate is only guaranteed to be unbiased if the individual estimates are unbiased. Assuming Gaussian noise, you can think of the assumption of lack of bias as the assumption that the distribution of measurements from a source—of slant, say, via texture—will center on the actual, worldly slant value present. Details are debated, but this is likely often an unrealistic assumption. Too much perceiving is biased or inaccurate. For sustained discussion of this “challenge of bias” within the Bayesian modeling tradition, see Ernst and Di Luca (2011) and part (B) of section 3. The challenge was independently emphasized outside the Bayesian tradition by Domini and Caudek (2011).

6. The basic idea is that the likelihood distributions—one for each source of information—are multiplied together and with the prior. Therefore, with two sources of information,

\[ P(s \mid \mathit{vs}, \mathit{hs}) \propto P(\mathit{vs} \mid s) \cdot P(\mathit{hs} \mid s) \cdot P(s) \tag{5} \]

Then a MAP estimate of the worldly property such as slant is arrived at by taking the maximum of the posterior P(s|vs,hs). Equation (3) above is a special case of (5) in the following way. First, there is no prior in (3), so it corresponds to the case where (5) has a flat prior. The posterior, P(s|vs,hs), in (5) will in that case simply correspond to multiplying the two likelihoods and thus to MLE estimation. Again, the maximum of the posterior distribution is the most likely slant given the two inputs, vs and hs. This should lead to the same perceived slant as equation (3) in the case where the weights in equation (3) are optimally chosen to minimize variability.

7. Ernst (2012) contains a detailed discussion of the computational effects of positing such thick-tailed distributions and of the motivations and justifications for doing so.
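The relation between the weighted sum in note 4 and the product of distributions in note 6 can be checked numerically. What follows is only a minimal sketch under assumptions that the notes do not fix: every source, and the prior, is treated as an independent Gaussian; each weight is taken to be proportional to 1/sigma², the choice that minimizes the variance of the combined estimate; and all names and numerical values (`combine_cues`, `map_estimate`, the slants and standard deviations) are hypothetical and chosen purely for illustration.

```python
import math


def combine_cues(estimates, sigmas):
    """Weighted linear combination in the style of equations (3) and (4).

    Assumption: independent Gaussian noise per source, with each weight
    proportional to 1 / sigma**2 (the variance-minimizing choice).
    """
    reliabilities = [1.0 / sigma ** 2 for sigma in sigmas]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]  # weights add up to 1
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_sigma = math.sqrt(1.0 / total)
    return combined, combined_sigma, weights


def map_estimate(estimates, sigmas, grid):
    """Grid search for the maximum of the posterior in equation (5).

    Multiplying Gaussian densities is done by summing their logs.
    A Gaussian prior is passed in as one more (mean, sigma) pair;
    leaving it out corresponds to a flat prior.
    """
    def log_posterior(s):
        return sum(-0.5 * ((e - s) / sigma) ** 2
                   for e, sigma in zip(estimates, sigmas))
    return max(grid, key=log_posterior)


# Hypothetical values, in degrees of slant.
vs, sigma_v = 30.0, 2.0   # visual estimate, fairly reliable
hs, sigma_h = 36.0, 4.0   # haptic estimate, noisier
pr, sigma_p = 0.0, 10.0   # broad prior centered on zero slant

grid = [i * 0.01 for i in range(6001)]  # candidate slants from 0 to 60

# Flat prior: weighted sum of the two cues versus the posterior maximum.
flat_linear, flat_sigma, flat_weights = combine_cues(
    [vs, hs], [sigma_v, sigma_h])
flat_map = map_estimate([vs, hs], [sigma_v, sigma_h], grid)

# Gaussian prior treated as a third weighted term, as in equation (4).
prior_linear, _, prior_weights = combine_cues(
    [vs, hs, pr], [sigma_v, sigma_h, sigma_p])
prior_map = map_estimate([vs, hs, pr], [sigma_v, sigma_h, sigma_p], grid)

print("flat prior:    ", flat_weights, flat_linear, flat_map)
print("Gaussian prior:", prior_weights, prior_linear, prior_map)
```

Under these assumptions the weighted sum and the grid maximum agree to within the grid spacing, both with and without the prior, and the combined standard deviation is smaller than that of either cue alone. With non-Gaussian (for example, thick-tailed) distributions of the kind mentioned in note 7, the two computations would in general no longer coincide.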

References

Colombo, M., & Seriès, P. (2012). Bayes on the brain—on Bayesian modeling in neuroscience. British Journal for the Philosophy of Science, 63, 697–723.

Domini, F., & Caudek, C. (2011). Combining image signals before three-dimensional reconstruction: The intrinsic constraint model of cue integration. In J. Trommershäuser, K. Körding, & M. S. Landy (Eds.), Sensory cue integration (pp. 120–143). Oxford: Oxford University Press.

Ernst, M. O. (2012). Optimal multisensory integration: Assumptions and limits. In B. E. Stein (Ed.), The new handbook of multisensory processes (pp. 1084–1124). Cambridge, MA: MIT Press.

Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429–433.

Ernst, M. O., & Di Luca, M. (2011). Multisensory perception: From integration to remapping. In J. Trommershäuser, K. Körding, & M. S. Landy (Eds.), Sensory cue integration (pp. 224–250). Oxford: Oxford University Press.

Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining sensory information: Mandatory fusion within, but not between, senses. Science, 298, 1627–1630.

Knill, D. C. (2003). Mixture models and the probabilistic structure of depth cues. Vision Research, 43, 831–854.

Knill, D. C. (2007). Robust cue integration: A Bayesian model and evidence from cue-conflict studies with stereoscopic and figure cues to slant. Journal of Vision, 7(5), 1–24.

Landy, M. S., Banks, M. S., & Knill, D. C. (2011). Ideal observer models of cue integration. In J. Trommershäuser, K. Körding, & M. S. Landy (Eds.), Sensory cue integration (pp. 5–29). Oxford: Oxford University Press.

Rescorla, M. (Forthcoming). Bayesian perceptual psychology. In M. Matthen (Ed.), The Oxford handbook of the philosophy of perception. Oxford: Oxford University Press.

Seydell, A., Knill, D. C., & Trommershäuser, J. (2010). Adapting Bayesian priors for the integration of visual depth cues. Journal of Vision, 10, 1–27.

Seydell, A., Knill, D. C., & Trommershäuser, J. (2011). Priors and learning in cue integration. In J. Trommershäuser, K. Körding, & M. S. Landy (Eds.), Sensory cue integration (pp. 155–172). Oxford: Oxford University Press.

Shams, L., & Beierholm, U. R. (2010). Causal inference in perception. Trends in Cognitive Sciences, 14, 425–432.

2 The Multisensory Nature of Perceptual Consciousness Tim Bayne

1 Introduction

Philosophical reflection on perceptual consciousness has typically adopted a modality-specific perspective as its point of departure. According to this approach, an account of perceptual consciousness as a whole will simply fall out of an account of each of the various perceptual modalities. In this chapter, I argue against one manifestation of this atomistic approach to perceptual experience: the decomposition thesis. According to the decomposition thesis, a person’s overall perceptual experience can be identified with the sum of his or her modality-specific experiences. If the decomposition thesis is correct, then perceptual experience can be exhaustively factored into modality-specific states. Modality-specific ways of referring to perceptual experience would not merely pick out distinctive clusters of phenomenal features and objects, they would also carve the structure of consciousness at its joints, so to speak. Michael Tye has suggested that the decomposition thesis—or at least something very much like it—is the “received view” among philosophers of mind (Tye, 2003, 17). I suspect that he is right. Whether or not that is the case, there is no doubt that the decomposition thesis captures an influential and intuitively plausible conception of consciousness. It is, however, a conception that faces significant objections—or so I will argue. I begin by considering whether the decomposition thesis can accommodate the unity of consciousness (sec. 2) and the perception of common sensibles (sec. 3), before turning to the question of whether it can account for multisensory integration (secs. 4–6).

Before addressing those tasks, let me first consider a more fundamental issue raised by the decomposition thesis. One might think that it is not possible to evaluate the decomposition thesis without having an account of how many perceptual modalities there are, where the boundaries between
them lie, and what makes it the case that an experience is assigned to a particular sensory modality. Unfortunately, there are no answers to these questions—at least, none that command widespread assent (Macpherson, 2011a, 2011b). Some theorists recognize only the five traditional Aristotelian senses of vision, audition, touch, taste, and smell, whereas others argue that human perception involves a number of additional senses. Those who argue for more than the traditional five senses, however, do not themselves agree about the identity of the additional senses. Some theorists regard passive touch and active touch as separate modalities; others hold that the detection of temperature, pressure, and vibration involve separate modalities (or, at the very least, submodalities). There are also debates about the individuation of the chemical senses, with open questions concerning the relations between the perception of taste, smell, and flavor (Auvray & Spence, 2008; Stevenson, 2009). In addition, there are unresolved questions about whether perceptual experiences are to be assigned to specific modalities on the basis of their intentional objects, their phenomenal character, or their underlying neurofunctional structure.

Although these questions raise important issues that any thoroughgoing treatment of the decomposition thesis must address, I will (largely) set them to one side here. As we will see, significant progress can be made in evaluating the decomposition thesis by relying only on our intuitive sense of how to individuate experiences. In what follows, I will focus my attention on vision and audition, although many of the issues raised in the following discussion apply also to other modalities.

2 Phenomenal Unity

One source of pressure on the decomposition thesis derives from reflection on the unity of consciousness. Consider what it’s like to experience a jazz trio, where one both sees the musicians and hears the music that they produce.
In such cases, one doesn’t merely enjoy visual and auditory experiences at one and the same time, but instead these experiences are unified with each other within the context of one’s overall perceptual awareness of the world. One way to highlight this unity is by contrasting normal perceptual experience with the kind of perceptual experience that might be enjoyed by a creature with two streams of consciousness. In such a creature, the visual experience of the trio might be restricted to one stream of consciousness, and the auditory experience might be restricted to the other stream of consciousness. Such an individual will have visual and auditory experiences at one and the same time, but these experiences will not enjoy

a conjoint phenomenal character in the way that those of a normal perceptual subject will. These considerations suggest that we need to recognize a phenomenal unity relation in order to characterize normal perceptual experience (Bayne & Chalmers, 2003; Bayne, 2010; Dainton, 2000/2006; Tye, 2003). Might recognizing such a relation put pressure on the decomposition thesis? It would if phenomenal unity has its own distinctive phenomenology—a phenomenology over and above that which is possessed by the experiences that it relates. The idea that phenomenal unity does possess its own phenomenal character is not difficult to motivate. If phenomenal unity is a relation that two experiences bear when they share a conjoint phenomenal character, then it seems to follow that relations of phenomenal unity must possess a proprietary phenomenal character. When phenomenal unity holds between experiences drawn from different modalities—as it often does—then that phenomenal character could not be identified with any one experiential modality. For example, the character of the unity relation that holds between an auditory experience (a) and a visual experience (v) could be neither purely auditory nor purely visual; instead, it must be either audio-visual or perhaps generically perceptual. If that is so, then the decomposition thesis would be false: “subtracting” modality-specific experiences from one’s overall perceptual experience would leave in place relations of phenomenal unity. Although it is not wholly unattractive, it seems to me that this line of argument ought to be resisted. Despite its name, phenomenal unity does not itself possess any phenomenal character. Although there is something it is like to experience a and v together, this phenomenal togetherness should not be assigned to the phenomenal unity relation itself. 
Phenomenal unity cannot be singled out in an act of introspective awareness, nor can the “what it’s likeness” of one phenomenal unity relation (say, that which holds between a visual and an auditory experience) be compared and contrasted with the “what it’s likeness” of another phenomenal unity relation (say, that which holds between an emotional experience and a tactile experience). A further reason to deny that phenomenal unity possesses its own phenomenal character is that ascribing phenomenal character to phenomenal unity would threaten to generate a vicious regress (Hurley, 1998). If relations of phenomenal unity did possess their own phenomenal character, then they would need to be unified with the rest of the subject’s experiences, and in order to account for that unity, we would need to posit an additional set of phenomenal unity relations. In positing these relations,

we would seem to have embarked on a regress that shows every sign of being vicious.

The lesson to draw from these considerations is that although the presence of phenomenal unity does make a phenomenal difference, the relation does not itself possess any phenomenology. The phenomenal difference that phenomenal unity makes should not be assigned to the phenomenal unity relation itself but to the complex experiential state that is generated by its presence. Whenever two experiences are phenomenally unified with each other, there is a complex experiential state that subsumes them both (Bayne & Chalmers, 2003; Bayne, 2010). For example, if a visual experience (v) and an auditory experience (a) are phenomenally unified, then there will be a complex audio-visual experience that subsumes both v and a. And here we do have an objection to the decomposition thesis, for the complex state a-v is not itself a modality-specific state, nor can it be identified with the mere conjunction of modality-specific states. Consider again an experience of a jazz trio in which one’s auditory and visual experiences are unified in the form of a complex multimodal perceptual state. This state cannot be identified with the conjunction of the visual and auditory states since a subject with two streams of consciousness could enjoy both the visual state and the auditory state without enjoying the complex state a-v. Insofar as the phenomenology of a-v “outstrips” that of its modality-specific constituent experiences, it would appear to be at odds with the decomposition thesis.

3 Common Sensibles

A second source of trouble for the decomposition thesis involves the existence of common sensibles—that is, features of the world that can be experienced via more than one perceptual modality. Paradigm examples of common sensibles include spatial, temporal, and causal relations. How might the advocate of the decomposition thesis accommodate common sensibles?
One commonly adopted strategy for meeting this challenge involves an appeal to modality-specific modes of presentation (Kulvicki, 2007; Lopes, 2000; O’Dea, 2006). The idea here is that common sensibles are “common” (or amodal) at the level of reference but not at the level of sense. In the same way that ‘the morning star’ and ‘the evening star’ pick out the same object in different ways, so too might visual experience and auditory experience (for example) represent spatial relations in different ways. Visual experiences of space have a visual phenomenology whereas auditory experiences of space have an auditory phenomenology.1 This conception

of perceptual content is sometimes referred to as a Fregean account and is contrasted with a Russellian account of the common sensibles (and of perceptual content more generally) according to which the phenomenal character of a perceptual experience is fixed by the properties that it represents (Chalmers, 2004; Thompson, 2009). According to the Russellian, there are no modality-specific modes of presentation for the simple reason that there are no modes of presentation at all. What reason might be given for the Fregean account of perceptual content? It is not uncommon for Fregeans to appeal to introspection in support of the claim that there are modality-specific perceptual modes. However, it is far from clear—to me, at any rate—that introspection can do the work asked of it here. I don’t deny that there is typically an experiential (and introspectively discernible) difference between experiences of space that involve one modality and those that involve another modality, but it seems to me that one can account for this fact without appealing to modality-specific modes of presentation. The first thing to note is that different modalities will typically represent spatial properties and relations with different degrees of grain and precision. For example, normal human vision exhibits greater spatial resolution than either audition or touch do. Second, perceptual experience typically has a self-representational aspect to it, for one is implicitly aware of which sensory organs one is using to perceive the world. For example, one is usually aware of whether one’s awareness of a surface is mediated by visual exploration, tactile exploration, or both. Third, if the representation of space did involve modality-specific modes of presentation, then such modes ought to be apparent to one in those contexts in which one is aware of the same property via multiple sensory modalities. For example, suppose that one were both visually and tactually aware of the length of a table.
In such a case, the visual experience of the table’s length ought to be introspectively discernible from the tactile experience of its length. But for what it’s worth, I cannot identify any such multiplicity in my own experience. A further problem with the modal-specificity view is that we are clearly aware of spatial relations between stimuli that are presented in different modalities. For example, one can be aware of the sound of a siren as being to the left of a visually presented dog. What, according to the Fregean, is the phenomenal character of one’s awareness of this spatial relation? Clearly it could be neither purely visual nor purely auditory, but if that is the case then positing modality-specific modes of representation wouldn’t provide the advocate of the decomposition thesis with a fully satisfactory account of the common sensibles.

The picture becomes even bleaker for the advocate of the decomposition thesis if we consider other common sensibles such as time. The suggestion that there might be modality-specific modes of presentation for the representation of time seems to lack even the prima facie plausibility that accompanies corresponding claims about the representation of space. This point applies to temporal relations between events irrespective of whether they are mediated by the same sensory modality or not. In short, a Russellian account of the representation of temporal relations seems to be so much more attractive than a Fregean account.2

What might the advocate of the decomposition thesis say at this point? One line of thought involves appealing not to the phenomenal character of an experience but to the causal history and/or neurofunctional basis of the relevant representations. For example, one might argue that a particular representation of space is visual not in virtue of its phenomenal character, but in virtue of the fact that its causal history can be traced back to processing in the retina and/or in virtue of the fact that the representation itself—its vehicle, if you prefer—is located in the visual cortex. This position motivates a version of the decomposition thesis according to which at least some aspects of perceptual experience can be assigned to particular modalities in virtue of their history and/or neurofunctional basis. Might this line of thought save the decomposition thesis? I think not. But in order to see why, we must turn to the topic of multisensory integration. That is the business of the next section.

4 Multisensory Integration and the Unity Assumption

What explains the popularity of the decomposition thesis? Arguably much of its appeal derives from assuming that perceptual experience involves modality-specific systems.
According to this view, not only is perceptual processing in one perceptual modality doxastically impenetrable, it is also encapsulated from processing that occurs in those neural systems associated with the subject’s other perceptual modalities. This picture of perceptual processing has been extremely influential, but its influence is not commensurate with its plausibility, for contemporary cognitive neuroscience has replaced this view of perceptual processing with one that is robustly multimodal. Research has revealed the existence of significant numbers of bimodal and multimodal cells—that is, cells that respond to stimulation in more than one sensory modality—and multisensory association areas (Driver & Noesselt, 2008). For example, the superior

colliculus has been implicated in the integration of visual, auditory, and tactile cues; the parieto-insular vestibular cortex has been implicated in the integration of vestibular and visual cues; and the temporoparietal junction has been implicated in the integration of tactile, proprioceptive, kinesthetic, and visual cues. Moreover, many of the areas that have traditionally been regarded as “unisensory” may be so only in terms of their afferent projections, for back-projections from multisensory convergence areas often result in input from a secondary modality. Indeed, the ubiquity of multisensory interaction in the brain has led some researchers to describe the cortex as “essentially multisensory” (Foxe & Schroeder, 2005; Ghazanfar & Schroeder, 2006). But what really matters here is not the anatomical basis of multisensory integration (MSI) but its effects on information processing. Research extending back for over a century has revealed that perceptual processing in one modality is often modulated by perceptual processing in other modalities (Bertelson, 1988; Calvert, Stein, & Spence, 2004; de Gelder & Bertelson, 2003). Let me begin with some examples of MSI before turning to the implications of this research for the questions at hand. One of the most well-known examples of MSI is the ventriloquism effect, in which the apparent source of a spatially discrepant sound source is mislocalized so that it more closely matches the seen source, be it the articulated lips of the ventriloquist’s dummy or the sight of a loudspeaker cone (Bertelson, 1999; Bertelson & de Gelder, 2004). There are also temporal forms of “ventriloquism,” in which the presentation of a stream of auditory stimuli modulates the perceived temporal location of visual stimuli (Bertelson & Aschersleben, 2003; Morein-Zamir et al., 2003; see also Hanson et al., 2008). Other examples of MSI involve the modulation of “categorical” perception. 
For example, in the McGurk effect, dubbing the phoneme /ba/ onto the lip movements for /ga/ produces a percept of the phoneme /da/ (McGurk & MacDonald, 1976). Indeed, MSI can even affect the number of objects that are presented in perceptual experience. In the sound-induced flash illusion, participants are presented with multiple tones concurrently with a single flash of light (Shams, Kamitani, & Shimojo, 2000, 2002; Shams, Ma, & Beierholm, 2005). As a result of the interaction between vision and audition, participants typically see (or “seem to see”) two flashes of light. A similar interaction has been found between touch and audition: participants who are presented with multiple tones will experience a single tap as two taps (Hötting & Röder, 2004). What bearing does multisensory integration have on the decomposition thesis? Let us return to the suggestion that perceptual representations of

common sensibles can be assigned to a particular modality in virtue of their history and/or neurofunctional basis. We can now see that MSI problematizes this proposal. In the ventriloquism effect, the representation of the sound’s spatial location is not solely a function of the information received via the ears or of processing in early auditory cortex but is the joint result of both auditory and visual processing. Looking to the causal history of these representations will not save the decomposition thesis. What about appeals to the neurofunctional basis of perceptual representations? Contrast two conceptions of MSI: a (purely) causal conception and a constitutive conception. According to the (purely) causal conception, the “multisensory” nature of MSI is restricted to processing the relevant perceptual input: by contrast, the representations that result from multisensory integration are wholly unisensory. For example, a causal treatment of ventriloquism would take it to involve a purely visual representation of a visual stimulus and a purely auditory representation of an auditory stimulus. According to the constitutive conception, by contrast, the multisensory nature of MSI is not restricted to perceptual processing but extends to the neurofunctional nature of the perceptual representations that are generated by that processing. In the context of ventriloquism, the representations in question are not confined to either auditory cortex or to visual cortex but instead are located in areas that are inherently audiovisual. It might be useful to consider how these two accounts apply to the sound-induced flash illusion. A causal treatment of the sound-induced flash illusion takes it to result in a purely visual experience as of multiple flashes (rather than the single flash that actually occurred) and a purely auditory experience of multiple tones. 
A constitutive treatment of the sound-induced flash illusion, by contrast, takes the multi-sensory interaction to result in a representation that is audio-visual. Rather than generating a representation of a visual object and a representation of an auditory object, the interaction generates a representation of an audio-visual object—an entity that is both flashing and tone-emitting. If the causal conception of MSI were correct, then the line of thought outlined earlier might suffice to save (a version of) the decomposition thesis. However, it seems to me that a strong case can be made for treating many multisensory interactions in constitutive terms.3 At the heart of this case is an appeal to what Welch and Warren dub “the unity assumption” (Welch & Warren, 1980). Here is Welch’s characterization of the assumption:

An intersensory conflict can be registered as such only if the two sensory modalities are providing information about a sensory situation that the observer has strong reasons to believe (not necessarily consciously) signifies a single (unitary) distal object or event. (Welch, 1999, 373)

The idea is that in order to explain why the perceptual system integrates information across modalities in the way that it does, we must take it to identify the objects (and properties) of one modality with those of another. (Thus, “the identity assumption” would be a more accurate label than “the unity assumption.”) For example, we can understand the ventriloquism effect only by assuming that the perceptual system identifies the heard event with the seen event. If the perceptual system failed to identify one event with the other, then it would not register the conflict between vision and audition as a conflict, and we would have no information-processing explanation of the effect. A similar story can be told with respect to the McGurk effect. The perceptual system has reason to integrate its representations of the visual and auditory cues only if it takes these cues to concern a single speech act. As Ernst (2006, 128) has remarked, “The integration of signals is only reasonable if they are derived from the same object or event; unrelated signals should be kept separate.” Before examining how the unity assumption puts pressure on the decomposition thesis, let me make a number of clarifying remarks about the assumption itself. First, although Welch presents the assumption in terms of what the subject believes (see above), it is more plausibly regarded as subdoxastic in nature. MSI can be modulated by “top-down” effects (see e.g., Vatakis & Spence, 2007), but it is not—in general, at least—doxastically penetrable (Radeau, 1994). Knowing that one is currently subject to the ventriloquism effect will not typically have any impact on the nature of one’s perceptual experience. Second, we can appeal to the unity assumption not only to explain the resolution of intersensory conflict but also the enhancement of intersensory coherence. Consider again the sound-induced flash illusion. 
It is unclear how the unity assumption might account for this effect if it were concerned only with intersensory conflict, for there is no inconsistency in representing a tone and representing the absence of any corresponding visual event at (or near) the relevant time. However, there is an intuitive sense in which hallucinating an additional flash of light increases the coherence of the subject’s overall perceptual experience.4 Finally, the unity assumption does not itself constitute a complete account of MSI. For one thing, it does not have anything to say about the conditions under which an object that is perceived via one modality is identified with an object that is perceived via another modality. Moreover,

the unity assumption says nothing about how intersensory conflict ought to be resolved in any particular case. Answers to these questions will be provided only by broader explanatory frameworks such as that provided by the Bayesian cue combination account (Ernst & Bülthoff, 2004; Trommershäuser, Landy, & Körding, 2011). Such accounts do not render the unity assumption redundant but provide a framework that explains when the assumption is “triggered” and why the perceptual system responds to intermodal inconsistency in the way that it does.5

5 Multisensory Object Files

So much for multisensory integration and the unity assumption—what relevance does the foregoing have for the decomposition thesis? The connection can be couched in terms of the notion of an object file, where “object file” talk is simply a convenient way of referring to the fact that various perceptual features have been tagged by the perceptual system as belonging to the same intentional object. If it is implicit in the operation of the perceptual system that an object perceived via one modality is identical to an object perceived via another modality, then it seems to follow that the object files created as the result of this processing must be multimodal: they are not restricted to the features drawn from a single modality but extend across modalities. In fact, these object files are multisensory in two respects. First, they contain “amodal” information about certain kinds of properties—the “common sensibles.” Consider the spatial content of an object file concerning a ventriloquized event. Is this feature a visual feature or an auditory feature? A causal conception of MSI would imply that the subject has two object files: a visual file that represents its location in a visual manner and an auditory file that represents its location in an auditory manner.
However, the fact that one’s awareness of the location of the ventriloquized event is a function of both the visual and the auditory input indicates that the spatial information in question is not modality-specific. A second respect in which these files are multisensory is that they are not restricted to features associated with one modality but include features associated with multiple modalities. For example, in the sound-induced flash illusion, subjects do not have one object file associated with the tone and another associated with the flash but instead have a single object file containing information associated with both. In effect, this object file represents that there is a flashing, beeping object at such-and-such location in the subject’s perceptual field. Similarly, in the McGurk effect, one has

a single object file for the perceived speech act that contains information derived from both audition and vision.

That, in a nutshell, is the unity assumption argument against the decomposition thesis. The argument turns on three claims: (i) MSI requires the unity assumption; (ii) the unity assumption requires a constitutive interpretation, an idea that I have developed in terms of multisensory object files; and (iii) if perception involves multisensory object files, then the decomposition thesis is false. Let me flesh out this argument by considering some objections to it.

6 Objections and Replies

6.1 From Object Files to Perceptual Content

One objection to the argument is that it confuses levels of description: multisensory integration and the notion of a multisensory object file are subpersonal matters, whereas the decomposition thesis is a claim about perceptual experience, a personal-level phenomenon. As such, the objection runs, claims about multisensory integration and the structure of object files couldn’t have any bearing on the prospects of the decomposition thesis, for claims about personal-level phenomena are not hostage to the fortunes of subpersonal, information-processing models. This challenge is certainly worth taking seriously, for there are a number of reasons why one should be reluctant to identify the contents of object files with the objects of perceptual experience. First, experiments by Mitroff and colleagues using a motion-induced blindness paradigm indicate that object files can be updated independently of awareness (Mitroff & Scholl, 2005; Mitroff, Scholl, & Wynn, 2005; see also Bonneh, Cooperman, & Sagi, 2001). Second, although patients with object-centered neglect are often consciously aware of only the right side of an object, evidence from priming indicates that they possess information about the left side of neglected objects (Marshall & Halligan, 1988; see also Doricchi & Galati, 2000; Wojciulik & Kanwisher, 1998).
Presumably information about both the neglected and consciously perceived sides of an object is bound together within a single object file. Third, and perhaps of most direct relevance to the current question, there is evidence that at least certain kinds of multisensory integration can occur “outside” of consciousness, in the sense that the subject need not be conscious of the sensory cues responsible for the modulation of perceptual content. Visually presented lip movements that are presented in the ipsilesional hemifield of a patient suffering from hemineglect can give
rise to a McGurk effect when combined with auditory speech stimuli presented in the patient’s (neglected) contralesional hemifield (Soroker, Calamaro, & Myslobodsky, 1995a, 1995b; see also Leo et al., 2008). Similar results have been found for ventriloquism (Bertelson et al., 2000) and the integration of emotion as detected via facial expression and tone of voice (de Gelder & Vroomen, 2000). Assuming that neglect patients are unaware of stimuli that they do not report, these results provide further evidence of a gap between the contents of an object file and the structure of perceptual content.

However, we can recognize that there is an important distinction between the contents of object files and the nature of perceptual objects without giving up on the claim that the nature of object files might constrain the nature of perceptual objects. One way to connect the two levels of description is to consider the implications that MSI has for the contents of the perceptual experiences that result from it. What “constraints” does the perceptual experience generated by the sound-induced flash illusion place on one’s environment? The answer recommended by the causal account of MSI is that the veridicality of one’s experience requires only that one’s environment contain sounds of a certain kind and flashes of a certain kind. By contrast, the constitutive account of MSI would indicate that unless one’s experience of lights and tones has a common causal ground, then it is not wholly veridical. If one’s environment contained a flash of light and a sound that failed to have a common ground—as indeed is the case in the sound-induced flash illusion—then one’s experience would be to some degree illusory.
Following O’Callaghan (2008b), it seems to me that the perceptual experience that one has in the context of the sound-induced flash illusion doesn’t leave it as an open question whether the flash and the beep are manifestations of a single event, but instead imposes this requirement on one’s environment. In other words, the perceptual content of one’s experience is in line with that suggested by the constitutive conception of MSI. Consider a parallel between the issues just discussed and the interpretation of certain unisensory phenomena such as the phi phenomenon (Kolers & Von Grunau, 1976) and the “cutaneous rabbit” (Geldard & Sherrick, 1972), a tactile illusion in which a rapid sequence of taps delivered first near the wrist and then near the elbow creates the experience of an object “hopping” from one’s wrist to one’s elbow. These illusions of apparent motion are best accounted for by supposing that the perceptual system makes assumptions about the numerical identity of the objects of perception. It is only because the visual system assumes that the first object is numerically
identical to the second object that it constructs a (nonveridical) percept of a single object moving from one location to another, and it is only because the tactile system assumes that there is a single object “hopping” up one’s arm that it revises the order in which the taps to the arm are experienced. In these cases, it is part of the perceptual content of one’s experience that one is presented with a single object in motion. Similarly, we should recognize that claims about the numerical identity of perceptual objects are built into the content of one’s perceptual experience in the sound-induced flash illusion and many other examples of MSI.

It might be objected that this view is at odds with an influential conception of perceptual content according to which perceptual content is purely general rather than singular (McGinn, 1997; Davies, 1992; Horgan & Tienson, 2002). Suppose one is looking at a book. On the general view of perceptual content, the particular book at which one is looking does not enter into the content of one’s experience, but one’s perceptual experience would instead have been satisfied by any number of qualitatively identical books. However, although the argument presented above might appear to presuppose a singular account of perceptual content, any such appearance would be misleading. All the argument requires is that if perceptual content is purely general, then the features relating to more than one modality must fall within the scope of the existential quantifier. In other words, the content must have the form “There is an x such that x is both flashing and beeping” rather than “There is an x such that x is flashing and there is a y such that y is beeping.”

6.2 Multisensory Objects

A second challenge to the argument from multisensory integration concerns the very notion of a multisensory object. The objects of sight are typically physical entities, whereas the objects of hearing are typically sounds.
We see the dog but hear its bark; we see the coffee pot but hear its gurgling; we see the ambulance but hear its siren. And, our imagined critic might continue, given that physical objects and sounds are different types of things, it is unclear what an audiovisual object could possibly be. The idea that visual and auditory features could be bound together in the form of a single intentional object seems to be a deep mistake—indeed, a category mistake, no less!—for there is nothing that could have both visual and auditory features. A dog can be brown but not loud, whereas its bark can be loud but not brown—nothing, it seems, could be both brown and loud.6 The challenge is a serious one, but there is a respectable reply to it.7 The response in question is best approached by considering a distinction
that is often made in discussions of audition between primary (or “direct”) perceptual objects and secondary (or “indirect”) perceptual objects (e.g., O’Callaghan, 2008a; Matthen, 2010). According to this proposal, although the primary objects of auditory experiences are sounds, audition has secondary objects in the form of sound-emitting objects. We are aware of the secondary objects of auditory experience by hearing the primary objects that they produce. We hear the dog by hearing its bark; we hear the coffee pot by hearing its gurgle; and we hear the ambulance by hearing its siren. According to the account of multisensory objects that I am sketching here, although the primary objects of perceptual experience are modality specific, these objects will themselves be bound together in the form of a composite “object.” Visual features will be bound to a visual object, and auditory features will be bound to an auditory object, but auditory and visual features will also be bound together insofar as the visual and auditory objects are bound together. On this view, we need object files of two kinds: one kind of file individuates the primary objects of perception and another individuates its secondary objects. The role of the secondary object file is to keep track of when two primary perceptual objects are associated with the same secondary perceptual object. But what exactly are secondary perceptual objects? They are, I suggest, events. (It is perhaps no accident that Bertelson entitled his influential overview of multisensory integration “The perception of multimodal events.”)8 Consider the McGurk effect. Here, both visual and auditory features are bound together as a representation of the utterance of the phoneme . In simple cases of ventriloquism, we might think of the intentional object as something akin to an “explosion”—an event involving the production of a sound and a spark of light. 
In other examples of ventriloquism—such as when seeing a drum being struck “ventriloquizes” an accessory sound—we might think of the intentional object as the event of the drum’s being struck. And in the sound-induced flash illusion, the intentional objects of one’s experience are events of flashing/beeping. In short, a plausible account can be given of the intentional objects of multisensory integration: they are events.

6.3 Partial Integration

A rather different challenge to the argument from MSI involves the fact that multisensory integration is often merely partial. For example, studies of ventriloquism often find that although the presence of a visual cue biases the perceived location of the auditory cue, and the presence of the auditory cue may also bias the perceived location of the visual cue, the auditory and
visual cues may not be experienced as co-located; rather, they will simply be experienced as closer to each other than they would have been had each cue been presented independently (see, e.g., Bertelson & Radeau, 1981; Bertelson et al., 2000). Partial integration is prima facie puzzling. On the one hand, the unity assumption suggests that the perceptual system must have bound the visual and auditory cues together, for if it didn’t, then we would lack an information-processing account of why their apparent locations were modulated in the way that they were. But if the perceptual system did bind these cues together, why did it not also co-locate them? After all, it seems unlikely that a single event would produce a flash in one location and a sound in another location. We can put the partial integration challenge in terms of the following dilemma: On the one hand we could deny that partial integration results in multisensory object files. But if we do that, then we must either deny that the unity assumption is required in order to explain partial integration, or we must deny that the unity assumption is indicative of the presence of multisensory object files. On the other hand, if we allow that partial integration involves multisensory object files, then we must allow that an object file can represent an object as having incompossible attributes—that it can, for example, fully locate an event at two distinct locations. And that too seems prima facie implausible. What should we do? Embracing the first horn of this dilemma seems to me to be particularly unpalatable. The link between multisensory integration (even when merely partial) and the unity assumption seems to me to be highly robust, as does the link between the unity assumption and the existence of multisensory objects. So, that leaves the second horn of the dilemma—is there any way of making this response to the objection palatable? Perhaps so.
One line of response would be to hold that object files can attribute incompossible properties to an object. Consider the waterfall illusion, in which a stationary object appears to be both moving and, at the same time, stationary. Some theorists describe the content of the visual experience produced by the waterfall illusion as “logically impossible” (Frisby, 1979, 101; see also Smeets & Brenner, 2008). This description—which has some phenomenological plausibility—seems to require that the relevant object file represents its object as both stationary and moving (relative to the same frame of reference). And if we are willing to allow that unisensory object files can attribute incompossible properties to their objects, perhaps we should also allow that multisensory object files can attribute incompossible properties to their objects.

Another—and to my mind rather more attractive—response to the challenge posed by partial integration leans heavily on the idea that the intentional objects of multisensory perception are not objects (strictly speaking) but events. Events are not (usually) instantaneous, point-like occurrences, but have both temporal and spatial dimensions. Because events have both temporal and spatial duration, it is possible for an event-involving object file to bind together visual and auditory objects even when those objects are themselves assigned to distinct spatiotemporal locations. Roughly speaking, we can think of perceptual content in cases of partial integration on the model of the following: “There is a visual object at such-and-such a spatio-temporal location, and there is an auditory object at such-and-such a spatio-temporal location, and these two objects are implicated in the same event.” Much more remains to be said on this topic, but I hope that this is at least the start of a plausible account of partial integration.

7 Conclusion

In this chapter I have examined the decomposition thesis—the claim that perceptual experience can be parceled out into modality-specific chunks. I have argued that there are three respects in which the decomposition thesis is open to challenge. First, it struggles to accommodate the unity of consciousness, for—I have argued—we must recognize distinctively multimodal experiences in order to do justice to the unity between experiences associated with different modalities. Second, the decomposition thesis struggles to accommodate the perceptual representation of common sensibles, such as spatial, temporal, and causal relations. Third, the decomposition thesis struggles to accommodate the phenomenon of multisensory integration, which—I have argued—requires conceiving of perceptual content in terms of multisensory perceptual objects.
Over and above the pressure that it puts on the decomposition thesis, I hope that this chapter draws attention to some of the many questions of philosophical interest that are raised by the multimodal nature of perception. Multisensory integration “is the rule and not the exception in perception” (Shimojo & Shams, 2001, 505), and there is a pressing need to explore its implications for philosophical accounts of perception.9

Notes

1. Note that the Fregean conception of phenomenal content need not be unpacked in modality-specific terms, for one could hold that phenomenal content is Fregean rather than Russellian but deny that the modes of presentation associated with phenomenal content track differences between sensory modalities. However, as a matter of fact, many Fregeans do seem to think that there are modality-specific modes of presentation.

2. As Nudds (2001) has argued, there are also cross-modal perceptual relations involving causation. For example, in perceiving a drum being struck, one might be aware of the sound as being generated by the contact of the drumstick with the drum—an event that one is visually aware of.

3. The following discussion is heavily indebted to Casey O’Callaghan’s (2008b, 2012) important work on the multisensory nature of perception.

4. Another example of the perceptual system “creating” an additional perceptual object in order to foster the coherence of the subject’s experience is provided by the Aristotle illusion. When an object is placed between one’s fingers while they are crossed, one will feel as though one is touching two objects.

5. The Bayesian account of multisensory integration can explain why one flash is perceived as two when accompanied by multiple tones, whereas two flashes are not perceived as one when they are accompanied by a single tone. Briefly, the account appeals to the fact that audition has a finer temporal grain than vision does, and hence the perceptual system puts more weight on those time-dependent numerosity judgments that involve audition than those that involve vision.

6. The information-processing variant of this objection can be put in terms of the “indispensable attributes” account of visual and auditory objects associated with the work of Kubovy and colleagues (Kubovy & van Valkenburg, 2001; Kubovy & Schutz, 2010). The idea, briefly, is that visual objects are individuated in terms of their position in a spatial frame of reference, whereas auditory objects are individuated in terms of their position in a pitch-based frame of reference.
In this framework, the question posed by multisensory objects can be put as follows: with respect to what frame of reference are audiovisual objects individuated? By appeal to both pitch and space? In terms of some independent frame of reference? Or does the answer to this question depend on the kind of audio-visual integration in question? I hope to explore these questions in future work.

7. Note that this challenge applies with particular force in the context of interactions between vision and audition. It is less clear how much force it has in the context of other forms of multisensory interaction such as interactions between vision and touch (for example).

8. Note too that in the literature on multisensory binding, theorists typically refer to the intentional objects of such binding as “events” rather than “objects.” See Zmigrod, Spapé, and Hommel (2009) and Zmigrod and Hommel (2011, 2013).

9. Earlier versions of this chapter were presented at the Thumos group (Geneva), the Logos group (Barcelona), and at a conference on the unity of consciousness and
sensory integration held at Brown University in November 2011. I am grateful to the audiences present at these occasions for their very helpful comments, and in particular to my commentator at the Brown conference, Sydney Shoemaker. Thanks are also due to Casey O’Callaghan and Matt Nudds, whose papers on multisensory integration have had an important impact on my thinking, and to Chris Hill for his comments on an earlier draft of this chapter. This chapter was written with the support of European Research Council Grant The Architecture of Consciousness (R115798), for which I am very grateful.

References

Auvray, M., & Spence, C. (2008). The multisensory perception of flavor. Consciousness and Cognition, 17, 1016–1031. Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press. Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness (pp. 23–58). Oxford: Oxford University Press. Bertelson, P. (1988). Starting from the ventriloquist: The perception of multimodal events. In M. Sabourin, F. I. M. Craik, & M. Robert (Eds.), Advances in psychological science (Vol. 2, pp. 419–439). Hove, UK: Psychology Press. Bertelson, P. (1999). Ventriloquism: A case of cross modal perceptual grouping. In G. Aschersleben, T. Bachmann, & J. Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 347–362). Amsterdam: Elsevier. Bertelson, P., & Aschersleben, G. (2003). Temporal ventriloquism: Crossmodal interaction on the time dimension. (1) Evidence from time order judgments. International Journal of Psychophysiology, 50, 147–155. Bertelson, P., & de Gelder, B. (2004). The psychology of multimodal perception. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 141–177). Oxford: Oxford University Press. Bertelson, P., & Radeau, M. (1981). Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Perception and Psychophysics, 29, 578. Bertelson, P., Vroomen, J., de Gelder, B., & Driver, J. (2000). The ventriloquist effect does not depend on the direction of deliberate visual attention. Perception and Psychophysics, 62(2), 321–332. Bonneh, Y. S., Cooperman, A., & Sagi, D. (2001). Motion-induced blindness in normal observers. Nature, 411, 798–801. Calvert, G. A., Stein, B. E., & Spence, C. (Eds.). (2004). The handbook of multisensory processing. Cambridge, MA: MIT Press.
Chalmers, D. (2004). The representational character of experience. In B. Leiter (Ed.), The future for philosophy (pp. 153–181). Oxford: Oxford University Press. Dainton, B. (2000/2006). Stream of consciousness: Unity and continuity in conscious experience. London: Routledge. Davies, M. (1992). Perceptual content and local supervenience. Proceedings of the Aristotelian Society, 92, 21–45. de Gelder, B., & Bertelson, P. (2003). Multisensory integration, perception, and ecological validity. Trends in Cognitive Sciences, 7(10), 460–467. Dorichi, F., & Galati, G. (2000). Implicit semantic evaluation of object symmetry and contralesional visual denial in a case of left unilateral neglect with damage of the dorsal paraventricular white matter. Cortex, 36, 337–350. Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on “sensory-specific” brain regions, neural responses, and judgments. Neuron, 57, 11–23. Ernst, M. O. (2006). A Bayesian view on multimodal cue integration. In G. Knoblich, et al. (Eds.), Human body perception from the inside out (pp. 105–131). New York: Oxford University Press. Ernst, M. O., & Bülthoff, H. H. (2004). Merging the senses into a robust percept. Trends in Cognitive Sciences, 8, 162–169. Foxe, J. J., & Schroeder, C. E. (2005). The case for feedforward multisensory convergence during early cortical processing. Neuroreport, 16, 419–423. Frisby, J. P. (1979). Seeing. Oxford: Oxford University Press. Geldard, F., & Sherrick, C. (1972). The cutaneous “rabbit”: A perceptual illusion. Science, 178(57), 178–179. de Gelder, B., & Vroomen, J. (2000). The perception of emotions by ear and by eye. Cognition and Emotion, 14, 289–311. Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends in Cognitive Sciences, 10, 278–285. Guski, R., & Troje, N. F. (2003). Audiovisual phenomenal causality. Perception and Psychophysics, 65, 789–800. Hanson, J. V., Heron, J., & Whitaker, D. (2008). 
Recalibration of perceived time across sensory modalities. Experimental Brain Research, 185, 347–352. Horgan, T., & Tienson, J. (2002). The intentionality of phenomenology and the phenomenology of intentionality. In D. Chalmers (Ed.), Contemporary readings in philosophy of mind. Oxford: Oxford University Press.
Hötting, K., & Röder, B. (2004). Hearing cheats touch but less in the congenitally blind than in sighted individuals. Psychological Science, 15(1), 60–64. Hurley, S. (1998). Consciousness in action. Cambridge, MA: Harvard University Press. Kolers, P., & Von Grunau, M. (1976). Shape and color in apparent motion. Vision Research, 16, 329–335. Kubovy, M., & van Valkenburg, D. (2001). Auditory and visual objects. Cognition, 80, 97–126. Kubovy, M., & Schutz, M. (2010). Audio-visual objects. Review of Philosophy and Psychology, 1, 41–61. Kulvicki, J. (2007). What is what it’s like? Introducing perceptual modes of presentation. Synthese, 156, 205–229. Leo, F., Bolognini, N., Passamonti, C., Stein, B. E., & Làdavas, E. (2008). Cross-modal localization in hemianopia: New insights on multisensory integration. Brain, 131, 855–865. Lopes, D. M. (2000). What is it like to see with your ears? The representational theory of mind. Philosophy and Phenomenological Research, 60(2), 439–453. Macpherson, F. (2011a). Taxonomising the senses. Philosophical Studies, 153(1), 123–142. Macpherson, F. (2011b). Cross-modal experiences. Proceedings of the Aristotelian Society, 111(3), 429–468. Marshall, J. C., & Halligan, P. W. (1988). Blindsight and insight in visuospatial neglect. Nature, 336, 766–767. Matthen, M. (2010). On the diversity of auditory objects. Review of Philosophy and Psychology, 1, 63–89. McGinn, C. (1997). The character of mind. Oxford: Oxford University Press. McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748. Mitroff, S. R., & Scholl, B. J. (2005). Forming and updating object representations without awareness: Evidence from motion-induced blindness. Vision Research, 45, 961–967. Mitroff, S. R., Scholl, B. J., & Wynn, K. (2005). The relationship between object files and conscious perception. Cognition, 96, 67–92. Mitterer, H., & Jesse, A. (2010). Correlation versus causation in multisensory perception. 
Psychonomic Bulletin and Review, 17, 329–334.
Morein-Zamir, S., Soto-Faraco, S., & Kingstone, A. (2003). Auditory capture of vision: Examining temporal ventriloquism. Brain Research: Cognitive Brain Research, 17, 154–163. Nudds, M. (2001). Experiencing the production of sounds. European Journal of Philosophy, 9, 210–229. O’Callaghan, C. (2008a). Object perception: Vision and audition. Philosophy Compass, 3(4), 803–829. O’Callaghan, C. (2008b). Seeing what you hear: Cross-modal illusions and perception. Philosophical Issues, 18(1), 316–338. O’Callaghan, C. (2012). Perception and multimodality. In E. Margolis, R. Samuels, & S. Stich (Eds.), Oxford handbook of philosophy of cognitive science (pp. 92–117). Oxford: Oxford University Press. O’Dea, J. (2006). Representationalism, supervenience, and the cross-modal problem. Philosophical Studies, 130(2), 285–295. Radeau, M. (1994). Auditory-visual spatial interaction and modularity. Current Psychology of Cognition, 13, 3–51. Shams, L., Kamitani, Y., & Shimojo, S. (2000). What you see is what you hear. Nature, 408, 788. Shams, L., Kamitani, Y., & Shimojo, S. (2002). Visual illusions induced by sound. Cognitive Brain Research, 14, 147–152. Shams, L., Ma, W. J., & Beierholm, U. (2005). Sound-induced flash illusion as an optimal percept. Neuroreport, 16(17), 1923–1927. Shimojo, S., & Shams, L. (2001). Sensory modalities are not separate modalities: plasticity and interactions. Current Opinion in Neurobiology, 11, 505–509. Smeets, J. B. J., & Brenner, E. (2008). Why we don’t mind to be inconsistent. In P. Calvo & A. Gomila (Eds.), Handbook of cognitive science—An embodied approach (pp. 207–221). Amsterdam: Elsevier. Soroker, N., Calamaro, N., & Myslobodsky, M. (1995a). “McGurk illusion” to bilateral administration of sensory stimuli in patients with hemispatial neglect. Neuropsychologia, 33, 461–470. Soroker, N., Calamaro, N., & Myslobodsky, M. S. (1995b). 
Ventriloquism effect reinstates responsiveness to auditory stimuli in the “ignored” space in patients with hemispatial neglect. Journal of Clinical and Experimental Neuropsychology, 17, 243–255. Stevenson, R. J. (2009). The psychology of flavour. Oxford: Oxford University Press.
Thompson, B. (2009). Senses for senses. Australasian Journal of Philosophy, 87, 99–117. Trommershäuser, J., Landy, M. S., & Körding, K. P. (Eds.). (2011). Sensory cue integration. New York: Oxford University Press. Tye, M. (2003). Consciousness and persons. Cambridge, MA: MIT Press. Vatakis, A., & Spence, C. (2007). Crossmodal binding: Evaluating the “unity assumption” using audiovisual speech stimuli. Attention, Perception, and Psychophysics, 69(5), 744–756. Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88, 638–667. Welch, R. B. (1999). Meaning, attention, and the “Unity Assumption” in the intersensory bias of spatial and temporal perceptions. In G. Aschersleben, T. Bachmann, & J. Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 371–387). Amsterdam: Elsevier. Wojciulik, E., & Kanwisher, N. (1998). Implicit visual attribute binding following bilateral parietal damage. Visual Cognition, 5, 157–181. Zmigrod, S., & Hommel, B. (2013). Feature integration across multimodal perception and action: A review. Multisensory Research, 26, 143–157. Zmigrod, S., & Hommel, B. (2011). The relationship between feature binding and consciousness: Evidence from asynchronous multi-modal stimuli. Consciousness and Cognition, 20, 586–593. Zmigrod, S., Spapé, M., & Hommel, B. (2009). Intermodal event files: Integrating features across vision, taction, and action. Psychological Research, 73, 674–684.

3 The Long-Term Potentiation Model for Grapheme-Color Binding in Synesthesia Berit Brogaard, Kristian Marlow, and Kevin Rice1

The phenomenon of synesthesia has undergone an invigoration of research interest and empirical progress over the past decade. Studies investigating the cognitive mechanisms underlying synesthesia have yielded insight into neural processes behind such cognitive operations as attention, memory, spatial phenomenology, and intermodal processes. However, the structural and functional mechanisms underlying synesthesia still remain contentious and hypothetical. The first section of the chapter reviews recent research on grapheme-color synesthesia, one of the most common forms of the condition, and addresses the ongoing debate concerning the role of selective attention in eliciting synesthetic experience. Drawing on conclusions from the first half, the chapter’s second half examines the various models proposed to explain the cognitive mechanisms behind grapheme-color synesthesia and discusses the explanatory virtues of a new model suggesting that some forms of grapheme-color synesthesia are grounded in memory. The last section offers an examination of some of the broader philosophical implications of synesthesia.

1 Introduction

Synesthesia is a condition in which input from one sensory or cognitive stream gives rise to experience in another sensory or cognitive stream, characterized by atypical binding of objects and properties (Rich & Mattingley, 2002; Brogaard, 2012). The most common forms of synesthesia involve associations between graphemes or words and visual experience, commonly known as grapheme-color synesthesia (Simner et al., 2006). In grapheme-color synesthesia, looking at or thinking about an achromatic letter or numeral gives rise to the sensation or thought that the numeral has a specific color with a highly specific hue, brightness, and saturation. Visual synesthesia, however, can take many forms, and some synesthetes
may have visual experiences unique to their individual cases such as complex geometric patterns in response to mathematical formulas (Brogaard, Vanni, & Silvanto, 2012). In another common form of synesthesia, certain concepts can give rise to associated spatial phenomenology. An example would be synesthetes reporting days of the week as exhibiting a particular spatial arrangement across their perceptual field. Even further, although the most common forms of synesthesia elicit visual experiences, other variations can involve different sensory modalities. For example, some synesthetes report the occurrence of touch sensations or gustatory experiences in response to auditory input. Within the literature and study of synesthesia, the stimulus that gives rise to a synesthetic experience is known as the “inducer” while the synesthetic experience itself is called the “concurrent” (Grossenbacher & Lovelace, 2001). For each individual case of synesthesia, concurrents may either be projected out into space or merely be imagistically or semantically associated with the inducer. Projection is experienced much like veridical perception; that is, the concurrents are experienced as located in the visual scene outside of the synesthetes’ minds (Dixon, Smilek, & Merikle, 2004). While some synesthetes who experience projection report seeing concurrents that float above their inducers, others describe experiences similar to seeing afterimages or phosphenes. Projection may also be experienced through sensory modalities other than vision. Some music-touch synesthetes experience touch sensations localized to specific regions of the body in response to music, and some motion-sound synesthetes experience sounds in response to visual motion. 
Synesthetes who associate concurrents with their inducers imagistically or semantically describe seeing or feeling synesthetic concurrents within their “mind’s eye,” or report simply “knowing” that concurrents are strongly connected to or associated with their inducers (Dixon, Smilek, & Merikle, 2004). For example, grapheme-color synesthetes might view a black letter “A,” yet describe it as having the quality of the color red, even though they do not actually have the perceptual experience of seeing a red “A.” Association between an inducer and its concurrent is analogous to the connection between smell and memory in nonsynesthetic perceptual experience. In neurotypical individuals, a smell may elicit vivid visual imagery associated with that smell in memory. For example, the smell of a particular perfume may trigger a particular memory of a former friend who wore it. One crucial difference between synesthesia and neurotypical memory association is that associations formed in synesthetes are idiosyncratic and apparently random (Baron-Cohen et al., 1993). A synesthete

The Long-Term Potentiation Model

Figure 3.1 Example of test-retest reliability of synesthetic experience in an associator grapheme-color synesthete from age three to nine (Blue, Black, Brown, Dark Brown, Green, Gold, Purple, Red, White, and Yellow).

typically has no explanation of why a concurrent is associated with a particular inducer. A key characteristic of synesthesia, regardless of its associator or projector form, is that the synesthetic experiences are automatic; that is, synesthetes cannot suppress the association between an inducer and its concurrent. Just as the smell of freshly baked cookies would bring on thoughts of the cookies themselves, synesthetes cannot help but experience a synesthetic concurrent when an inducer is presented to them. The synesthetic association of inducers to concurrents is also stable over time (see figure 3.1) (Baron-Cohen, Wyke, & Binnie, 1987). Although the color attributed to a particular grapheme in grapheme-color synesthesia may vary among synesthetes, within-synesthete measurements of test-retest reliability show that the particular grapheme-color associations are highly stable and consistent in over 80 percent of cases (Mattingley et al., 2001). For cases of grapheme-color synesthesia, the Synesthesia Battery, an automated online test, allows for rigorous phenotyping of grapheme-color synesthetes along with many of the other forms of the condition (Eagleman et al., 2007). Data generated within the test are used to determine whether a participant falls below a certain threshold based on normalized data comparing self-reported synesthetes and nonsynesthetes. Falling below this threshold indicates a high likelihood of synesthetic experience

B. Brogaard, K. Marlow, and K. Rice

occurring in a subject. Since it is possible that some subjects who have genuine synesthetic experiences will fall above the threshold, the battery is not ultimately diagnostic, yet the difficulty of reaching the threshold ensures that only genuine synesthetes are identified as such for research purposes.

The Synesthesia Battery is made up of two sections: a color-choosing task and a color-recall task. For the first task, a subject is presented with a grapheme for which she must choose a specific hue, brightness, and saturation from a color palette representing more than 16.7 million distinct choices. After the subject repeats the exercise three times for each grapheme (108 trials; graphemes A–Z and 0–9), a computer calculates the geometric distance among the subject’s answers in red, green, and blue (RGB) color space. If the range of chosen RGB values for a grapheme falls below the normalized threshold, the subject is scored as a synesthete for that grapheme. For the second task, the subject is presented with randomly ordered graphemes printed in the specific colors the subject chose. The subject must then quickly determine whether the grapheme has the color in question. Self-reported synesthetes tend to have no trouble answering correctly 90 percent of the time, and thus a score in excess of 90 percent further validates the score achieved in the prior color-choosing task.

Even with the breadth of recent research and interest in synesthesia, the mechanism behind the condition is still unknown. Any accurate neural mechanism for synesthesia should accommodate at least two factors: whether the connection between an inducer and concurrent is direct or indirect and whether this connection is structural or functional (Bargary & Mitchell, 2008; Ward, 2013).
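As a rough illustration of the color-choosing score in the Synesthesia Battery described above, the following sketch sums the pairwise distances among a subject's three color picks for one grapheme. The function names, the use of Euclidean distance on RGB values scaled to the range 0 to 1, and the cutoff of 1.0 are assumptions made for illustration only; the battery's actual scoring procedure is described in Eagleman et al. (2007).

```python
from itertools import combinations
from math import dist


def consistency_score(choices):
    """Sum of pairwise Euclidean distances among RGB picks.

    Each pick is an (r, g, b) tuple with channels scaled to 0..1.
    Lower scores mean the subject chose nearly the same color each time.
    """
    return sum(dist(a, b) for a, b in combinations(choices, 2))


def is_consistent(choices, threshold=1.0):
    """True when the picks fall below the cutoff (threshold is an assumed value)."""
    return consistency_score(choices) < threshold


# Hypothetical picks for one grapheme across three repetitions.
stable = [(0.90, 0.10, 0.10), (0.88, 0.12, 0.10), (0.92, 0.10, 0.08)]
scattered = [(0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.1, 0.1, 0.9)]

print(is_consistent(stable))     # picks cluster tightly around one red
print(is_consistent(scattered))  # picks jump between red, green, and blue
```

On this sketch, a subject who picks almost the same red three times scores far below the cutoff, while a subject whose picks jump between unrelated colors scores well above it.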
The first factor is concerned with whether unimodal brain regions interact by feed-forward mechanisms alone (direct) or whether they influence regions that influence other brain areas through feedback (indirect) (Ward, 2013). Both types of mechanisms exist in the neurotypical brain, but in synesthesia they are aberrant compared to the mechanisms of the normal brain (Driver & Noesselt, 2008; Ward, 2013). The second factor turns on whether neural networks that give rise to synesthetic experiences have additional synaptic connections (structural) or merely exhibit excessive disinhibition or hyperexcitement of neurotypical connections through a change in neurotransmitters (functional) (Ward, 2013). Empirical investigation on grapheme-color synesthesia originally led to three hypotheses accounting for the atypical binding that is characteristic of synesthesia. The local cross-activation hypothesis proposes that synesthesia results from a cross-activation between neural networks in adjacent brain regions. In explaining grapheme-color synesthesia, for example, the

local cross-activation hypothesis proposes that the color areas in the visual cortex and the physiologically adjacent word form area interact to produce a synesthetic concurrent (Ramachandran & Hubbard, 2001a, 2001b; Hubbard, Manohar, & Ramachandran, 2005b). Another hypothesis, the disinhibited feedback model, holds that synesthesia occurs due to disinhibition of an area of the brain that binds information from different senses, causing information from one sensory modality to trigger another (Grossenbacher, 1997; Armel & Ramachandran, 1999; Grossenbacher & Lovelace, 2001). The third hypothesis, called the aberrant reentrant processing hypothesis, holds that high-level information reenters low-level color areas, leading to the experience of synesthetic colors in response to a grapheme stimulus (Smilek et al., 2001; Myles et al., 2003). While there are studies that individually support these three hypotheses, recent research has brought each of them into question (Bor, Billington, & Baron-Cohen, 2007; Cytowic & Eagleman, 2009, 75, 217–218; Brogaard, Vanni, & Silvanto, 2012; Brogaard, 2013).

In the present chapter we first discuss the ramifications of the Long-Term Potentiation (LTP) model for synesthesia, introduced by Brogaard (2013). Based on what we call the “reactivation model of memory,” the model proposes that in some cases of grapheme-color synesthesia, the hippocampus binds together information from color areas with information from grapheme areas by associations made in long-term memory, simultaneously giving rise to conscious color experience. The LTP model is, therefore, an indirect one; however, more research is needed to determine whether it emerges via aberrant structural or functional changes in the brain. The comparative explanatory strength of the LTP model lies in its ability to accommodate conclusions from recent synesthesia research.
Not only does the LTP model explain why synesthetic experience sometimes occurs after attention is directed at its inducer, it can also account for cases in which there is not a true binding of graphemes and colors. From this discussion, we then examine the possible cognitive advantages of synesthesia such as heightened mathematical or artistic ability. Finally, we assess how the crossmodal nature of synesthesia makes the condition relevant to classical debates in philosophy of mind, including the modularity and cognitive impenetrability hypotheses and the binding problem.

2 Selective Attention and Color Binding

The automaticity of synesthesia has been taken by some to provide evidence that synesthetic experience is much like normal veridical perception.

Figure 3.2 The word “red” is here displayed in black (left) and gray, representing green (right). It takes longer for subjects to read the word “red” when it is printed in green than when it is printed in black or red.

Prior research has shown that perceptual features that attract attention and lead to segregation must be processed early in the visual system (Beck, 1966; Treisman, 1982); therefore many studies have attempted to determine whether selective attention is required to bind inducing graphemes with their synesthetic colors (Mattingley et al., 2001; Ramachandran & Hubbard, 2001a, 2001b; Rich & Mattingley, 2002, 2003; Robertson, 2003; Smilek, Dixon, & Merikle, 2003; Mattingley & Rich, 2004; Edquist et al., 2006; Sagiv et al., 2006; Rich & Mattingley, 2010). The idea is that if synesthetic inducers attract attention and lead to segregation like perceptual features do, then the synesthetic inducers must be processed early in the visual system. Automaticity is supported by research showing that synesthetes are susceptible to Stroop effects. A Stroop effect is a type of reaction-time interference in certain perceptual tasks (Stroop, 1935). The most common Stroop task demonstrates that it takes significantly longer for neurotypical individuals to name the color in which a word is printed if the color referred to by the word is incongruent with the printed color (see figure 3.2) (MacLeod, 1991). Likewise, it takes significantly longer for synesthetes to name the printed color of a grapheme if the synesthetic color induced by the grapheme is incongruent with the printed color (Wollen & Ruggiero, 1983; Mills, Boteler, & Oliver, 1999; Odgaard, Flowers, & Bradman, 1999; Mattingley et al., 2001; Mattingley, Payne, & Rich, 2006). A Stroop effect also has been demonstrated in music-taste (Beeli, Esslen, & Jancke, 2005) and music-color synesthesia (Ward, Huckstep, & Tsakanikos, 2006) as well as synesthesia that gives rise to spatial phenomenology (Sagiv et al., 2006; Smilek et al., 2006).
Although neurotypical controls can be trained to exhibit responses that are seemingly automatic, functional magnetic resonance imaging (fMRI) has shown differences in brain activation for synesthetes, indicating that synesthetes and trained neurotypical controls may have different experiences for these associations (Elias et al., 2003).
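Stroop interference of the kind described above is commonly summarized as the difference between mean reaction time on incongruent trials and mean reaction time on congruent trials. The sketch below shows only that arithmetic; the function name and the trial values are hypothetical and are not taken from any of the cited studies.

```python
from statistics import mean


def stroop_interference(congruent_ms, incongruent_ms):
    """Mean incongruent reaction time minus mean congruent reaction time (ms).

    A positive value means responses were slower when the induced
    (or named) color clashed with the printed color.
    """
    return mean(incongruent_ms) - mean(congruent_ms)


# Invented reaction times (ms) for a single participant.
congruent = [610, 595, 605]
incongruent = [700, 690, 710]

print(stroop_interference(congruent, incongruent))
```

A value near zero would suggest that the induced color did not interfere with naming the printed color, as in the masked-prime condition discussed in the text.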

Automaticity, however, should not be construed as an indication that synesthetic experience occurs prior to attention directed at its inducers. In other synesthetic Stroop tasks, no significant differences in response times were observed between congruent and incongruent target colors when the letter primes were masked from conscious recognition (Mattingley et al., 2001). If the color experience in grapheme-color synesthesia occurs prior to conscious recognition, then, regardless of whether the letter prime is masked or visible, synesthetes should have slower response times when identifying incongruently colored targets. The results of the masking studies thus indicate that synesthetic primes elicit color experiences only after they reach consciousness. There is also reason to think that response time may be affected by attentional load during the presentation of letter primes. It has been found that the effect of letter-prime congruency is still present when a task places low demands on attention during the presentation of letter primes, but the influence of priming diminishes as the attentional task becomes more difficult (Mattingley, Payne, & Rich, 2006). The finding that synesthetic experience may be altered by the degree of attentional load suggests that attention plays at least some role in grapheme-color synesthesia. However, it cannot be determined whether the manipulation of attentional load affects the cognitive process of linking the letter prime to its synesthetic color or whether it merely influences the perceptual representation of the letter prime (Myles et al., 2003; Mattingley, Payne, & Rich, 2006). Given its limitations, we cannot conclude on the basis of these considerations that synesthetic experience is post-attentional. Research involving visual search tasks has provided further ground for debate on whether synesthesia is an early visual phenomenon. 
In studies involving neurotypical subjects, visual search tasks typically present a subject with a target hidden among other objects or features (distractors). In these tasks, a subject is instructed to find a target as quickly as possible while a computer measures either accuracy or the time it takes the subject to identify the target. Visual search tasks have also been used to study the role of attention in perceptual experience. For example, reaction time tends to be quicker when the color of the target is incongruent with distractors, which may be associated with a phenomenon known as “pop-out.” In the case of grapheme-color synesthesia, if selective attention is not required to elicit color associations, then a digit’s synesthetic color should capture attention much like it would if the digit were actually colored differently from distractors. This would lead to a highly efficient identification of a target grapheme in visual search tasks. However, if selective attention is

Figure 3.3 When normal subjects are presented with the figure on the left, it takes them several seconds to identify the hidden shape. Some grapheme-color synesthetes purportedly can quickly recognize the triangular shape because they experience the 2s and the 5s as having different colors.

required to induce the synesthetic grapheme-color association, then identification should be inefficient since the grapheme would still have to be located among the distractors prior to the elicitation of the synesthetic color (Edquist et al., 2006). Ramachandran and Hubbard (2001a) investigated the role of selective attention in grapheme-color synesthesia. They conducted a visual search task that presented two synesthetes and neurotypical controls with an array of synesthetic color-inducing graphemes. Within each array, target graphemes were arranged so that they could be grouped together into simple shapes (see figure 3.3). Participants were presented with each array for a duration of one second and then asked to name the correct shape from a group of four alternatives. Their findings showed that although synesthetes are not remarkably better than neurotypical controls at naming the target shapes hidden among distractors, they do appear to have a slight advantage evidenced by marginally higher accuracy or quicker reaction times (Ramachandran & Hubbard, 2001a). The synesthetes’ outperformance of the controls was taken to be due to a pop-out effect that preattentively directed the synesthetes in locating the grouped target graphemes more efficiently than controls. Based upon this observed pop-out effect, it has been argued that synesthesia is an early visual phenomenon that is induced prior to selective attention (Ramachandran & Hubbard, 2001a, 2003a; Hubbard et al., 2005; Rich & Karstoft, 2013).

Cytowic and Eagleman (2009), however, offer an alternative explanation of the greater efficiency in synesthetes’ search tasks compared to controls. They argue that the additional identifier provided by the synesthetic color of target graphemes may only be assisting synesthetes in remembering the location of previously discovered targets or rejected distractors. Although synesthetes may scan a matrix for targets in the same way as nonsynesthetes, the additional post-attentional cue of the synesthetic colors of the target graphemes may reduce the time necessary for the grouped graphemes to break into conscious awareness (Cytowic & Eagleman, 2009). With this counterexplanation on the table, it cannot be concluded that the increased search efficiency of synesthetes observed in Ramachandran and Hubbard (2001a) is due to the graphemes eliciting vivid color experience prior to selective attention. Even the subsequent research (Hubbard et al., 2005; Ward et al., 2010) that has replicated Ramachandran and Hubbard’s (2001a) original findings is still subject to the critique that the synesthetic colors may only be facilitating the perceptual grouping of the graphemes and that increased efficiency is not necessarily indicative of a pop-out effect. If grapheme-color associations occur prior to selective attention, then true color-based pop-out should capture attention at the same rate regardless of the number of distractors in an array, much like it would if the targets were actually colored. However, other studies have found that increasing the number of target and distractor elements in visual search tasks causes a corresponding increase in reaction time for synesthetes (Palmeri et al., 2002). This increase in reaction time could be indicative of a limitation on the speed by which graphemes are processed after selective attention, further supporting the hypothesis that synesthetic color associations merely speed up the visual search. 
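The set-size argument above can be made concrete with a search slope: the least-squares slope of reaction time against the number of items in the array. A slope near zero is the signature of pop-out, whereas a clearly positive slope indicates that each added item costs search time. The sketch below is illustrative only; the function name and the reaction times are invented and are not data from Palmeri et al. (2002).

```python
def search_slope(set_sizes, reaction_times_ms):
    """Least-squares slope of reaction time (ms) against set size (items)."""
    n = len(set_sizes)
    if n < 2 or n != len(reaction_times_ms):
        raise ValueError("need two or more paired observations")
    mean_x = sum(set_sizes) / n
    mean_y = sum(reaction_times_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    if sxx == 0:
        raise ValueError("set sizes must vary")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(set_sizes, reaction_times_ms)
    )
    return sxy / sxx


# Invented mean reaction times (ms) at four array sizes.
set_sizes = [4, 8, 16, 32]
popout_rts = [520, 522, 519, 521]               # roughly flat: pop-out-like
serial_rts = [500 + 30 * n for n in set_sizes]  # grows by 30 ms per item

print(search_slope(set_sizes, popout_rts))
print(search_slope(set_sizes, serial_rts))
```

On this reading, the finding that synesthetes' reaction times rise with the number of items corresponds to a positive slope, which is what one would expect if synesthetic color becomes available only after each grapheme is attended.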
It has also been argued that grapheme-color synesthetes are more efficient in visual search tasks than neurotypical controls due to implicit biases in visual search paradigms. Laeng, Svartdal, and Oelmann’s (2004) case study with subject PM observed that PM quickly identified graphemes only when the color-inducing target graphemes were close to PM’s initial focus of attention in the visual search task. Another team led by Edquist et al. (2006) carried out a group study involving fourteen grapheme-color synesthetes and fourteen controls. Each subject performed a visual search task in which a target digit differed from the distractor digits in terms of its synesthetic color or its display color. Both synesthetes and controls identified the target digit efficiently when the target had a unique display color, but the two groups were equally inefficient when the target had a unique synesthetic color.

Despite the results reported in the above studies, the idea persists that grapheme-color binding occurs preattentively (see, e.g., Kim et al., 2006). However, many paradigms used to measure the advantages of synesthesia in visual search are inherently subject to bias in that it is possible that the apparent abilities of synesthetic participants emerge from the participants’ interest in showing that synesthesia provides one with special abilities. In response to Kim et al. (2006), two recent studies aimed to address the weaknesses of former studies purporting to show a lack of a preattentive effect of synesthesia. Gheri et al. (2008) introduced a novel paradigm they expected would demonstrate an effect of preattentive grapheme-color binding only if synesthetic participants performed worse than neurotypical controls. Synesthetes were shown a 4 × 4 matrix of different achromatic numerals displayed in the center of a screen under two conditions: for Condition I, a target numeral was chosen such that its concurrent was significantly different in color from the concurrents for the remaining distractor numerals; for Condition II, the target numeral shared its concurrent color with one of the distractor numerals. Age-matched neurotypical controls viewed the same achromatic matrices as synesthetes in both conditions. The authors hypothesized that if grapheme-color binding occurs preattentively, then synesthetes should score higher than controls under Condition I and lower than controls under Condition II. They argued that if synesthetic binding is a low-level perceptual process, then the synesthetic color of the target would aid synesthetes in identifying an achromatic target among achromatic distractors, reducing the time necessary to report identification of the target.
Furthermore, if the synesthetic color of the target matched the synesthetic color of one of the distractors, an additional step would be required to reject the similarly colored distractor, placing synesthetes at a disadvantage. However, the authors did not find a significant difference between synesthetes and controls under either condition and concluded that binding does not occur preattentively. Ward et al. (2010) conducted a revised version of the Ramachandran and Hubbard (2001a) and Hubbard et al. (2005) studies. In addition to the tasks included in the former studies, subjects were asked to report on their experiences of synesthetic colors after completing each task. Synesthetes scored higher than controls in visual search tasks, confirming the results of the former studies. However, the reported instances of synesthetic color experience did not correlate with the number of synesthetic inducers displayed to synesthetic participants, suggesting a lack of relationship between synesthetic phenomenology and task performance. The authors concluded that

the better performance of synesthetes could not be due to cues provided by preattentively bound synesthetic color experience. The body of research continues to grow in support of the hypothesis that for most grapheme-color synesthetes, graphemes elicit a synesthetic color only once the subject attends to them. Evidence, however, is still limited in its conclusiveness regarding the question of selective attention in grapheme-color synesthesia. Thus far, only theoretical, alternative explanations have been offered to account for synesthetes’ more efficient visual searches as originally observed in Ramachandran and Hubbard (2001a) and replicated in subsequent studies (Hubbard et al., 2005; Ward et al., 2010). We conducted our own study utilizing a novel visual search task that overcomes the limitations described above (Brogaard, Marlow, & Rice, 2013a). In order to examine whether synesthetic colors guide a subject’s attention to the location of a target, we compared the speed at which synesthetes and controls were able to identify the location of graphemes heavily camouflaged within flicker images. Grapheme-color synesthetes and nonsynesthetic controls were presented with a series of Graphics Interchange Format (GIF) images created to alternate between images of a forest scene with and without a target “2” grapheme. The target grapheme was colored red, blue, or green (camouflaged) (see figure 3.4). A control condition was also included wherein only a small change in the forest scene was introduced between images. Participants were charged with, and timed on, searching for the target change. Given the difficulty of finding the camouflaged green “2” grapheme, we suspected that any true pop-out effect induced by the number “2” should significantly reduce reaction time. Preliminary results suggest that both synesthetes and controls demonstrate highly efficient searches in locating red and blue numerals. 
In searching for the camouflaged green “2” grapheme, however, identifying the location of the change appears to be significantly more difficult for synesthetes compared to controls. Furthermore, within-synesthete results suggest significantly longer search times in locating the green camouflaged grapheme “2” than either the red or blue graphemes. These findings indicate that no preattentive pop-out effect occurs for synesthetes prior to selective attention since such an effect should make synesthetes’ searches (particularly in the case of locating the green camouflaged “2” graphemes) more efficient than neurotypical controls. Another intriguing preliminary result from our study is that there appears to be some interference caused by the synesthetic condition in identifying the green camouflaged “2” graphemes. This interference is inferred from the fact that synesthetes and controls did not differ in their abilities to

Figure 3.4 Example of GIF image with target superimposed over a forest scene. The “2” grapheme has been circled here to better assist the reader in noticing the change.

detect more general changes in the scenery, yet synesthetes had significantly slower detection times for the camouflaged graphemes. This suggests that the greater inefficiency in locating the camouflaged graphemes for synesthetes is due to the synesthetic condition. The slower reaction may be the result of a unique Stroop effect on visual search for the synesthetes. Participant synesthetes in the study were subjected to an elimination criterion that excluded synesthetes who associated the color green with the “2” grapheme. Since participant synesthetes experienced non-green color concurrents to the grapheme “2,” some interference could be occurring between the top-down feedback of the synesthetic color association and the different bottom-up visual information. Such an occurrence could account for the slower reaction times for synesthetes when compared to controls in locating the camouflaged green “2” graphemes.

3 Mechanism

While the precise neural mechanism underlying grapheme-color synesthesia is unknown, several hypotheses have been offered (Baron-Cohen et al.,

1993; Grossenbacher & Lovelace, 2001; Ramachandran & Hubbard, 2001a; Nunn et al., 2002; Weiss, Zilles, & Fink, 2005; Hubbard & Ramachandran, 2005; Hubbard et al., 2005; Rouw & Scholte, 2007; Weiss & Fink, 2009). The different proposed mechanisms underlying developmental grapheme-color synesthesia can be divided into groups based on two factors: whether the mechanism responsible for the binding of colors to graphemes operates directly or indirectly and whether such a mechanism suggests structural or functional differences from the neurotypical brain (Bargary & Mitchell, 2008; Ward, 2013). In regard to the first factor, direct mechanisms take unusual connectivity to be the result of an atypical feed-forward connection between form-processing areas and color areas, whereas the indirect mechanisms suggest that the connectivity issues originate in aberrant feedback from number- or word-processing areas to color areas. In regard to the second factor, structural mechanisms take the cause for unusual connectivity to be the underlying brain structure, whereas functional mechanisms take the cause to be a difference in how an otherwise neurotypically structured brain processes perceptual information through normal channels. An influential hypothesis of the direct structural type is the local cross-activation hypothesis, according to which grapheme-color synesthesia arises due to cross-activation between color areas in the visual cortex and the adjacent visual word form area (Ramachandran & Hubbard, 2001a, 2001b; Hubbard, Manohar, & Ramachandran, 2005b). This suggestion is inspired by the observation that local crossover phenomena may explain other illusory and hallucinatory experiences such as phantom limb sensations (Ramachandran & Hubbard, 2003b; Ramachandran & Hirstein, 1998). This hypothesis has several limitations.
One is that it doesn’t explain why processing of the visual form of a grapheme should elicit processing of unique brightness in the striate cortex and unique hue in the V4/V8 color complex in the visual cortex. Another is that it doesn’t generalize to other forms of color synesthesia (e.g., sound-color and emotion-color). It is, of course, highly plausible that different forms of color synesthesia proceed via different mechanisms. For example, cases of color synesthesia have been reported in which the visual cortex is not involved in generating synesthetic colors (Bor, Billington, & Baron-Cohen, 2007; Brogaard, Vanni, & Silvanto, 2012). So, this mechanism could well be correct for some forms of grapheme-color synesthesia. The best-known hypothesis of the indirect functional type posits that the unusual crosstalk originates in feedback processes as a result of disinhibited feedback from an area of the brain that binds information from different senses (Armel & Ramachandran, 1999; Grossenbacher, 1997;

Grossenbacher & Lovelace, 2001). The main piece of evidence cited in favor of this hypothesis comes from an analogous case in which patient PH reported seeing visual movement in response to tactile stimuli following acquired blindness (Armel & Ramachandran, 1999). As PH was blind, he could not have received the information via standard visual pathways. It is plausible that the misperception was a result of disinhibited feedback from brain regions that receive information from other senses. The hypothesis that the synesthetic effect of psychedelic substances such as LSD or psilocybin could be due to aberrant feedback connections has been taken to provide further evidence for the disinhibited feedback hypothesis (Shanon, 2002; Sinke et al., 2012). It is unknown, however, whether drug-induced synesthesia and developmental synesthesia have the same underlying mechanism, as the former differs from the latter in nearly every respect (Sinke et al., 2012). Even the very experience of drug-induced synesthesia at the time at which it occurs appears notably different from most cases of developmental synesthesia. Though music and sounds are the most frequent inducers of synesthesia during drug intoxication, all sorts of sensory input, including olfactory, gustatory, haptic, pain, and emotional stimuli, can induce synesthetic experience. Drug-induced synesthesia also may fail to exhibit the test-retest reliability that is characteristic of other forms of synesthesia, though the jury is still out. A second hypothesis of the indirect type is the atypical reentrant processing hypothesis. It is similar to the disinhibited feedback hypothesis but suggests specifically that high-level information reenters color areas in the visual cortex and that it is this form of reentrant information processing that leads to the experience of synesthetic colors (Smilek et al., 2001; Myles et al., 2003).
This model would explain why visual context and meaning often influence the phenomenal quality of synesthetic experience (Myles et al., 2003; Dixon & Smilek, 2005). In figure 3.5, for instance, many grapheme-color synesthetes assign different colors to the shared letter depending on whether they interpret the string of letters as spelling the word “POT” or “JACK.” For example, one of our child subjects, a seventeen-year-old female, experiences the shared letter as bitter lemon (O) when she reads the word “POT” and as bright pink (C) when she reads the word “JACK.” This suggests that it is not the shape of the letter that gives rise to the color experience but the category or concept associated with the letter (Cytowic & Eagleman, 2009, 75). The observation that the very same grapheme can elicit different color experiences in synesthetes depending on the context in which it occurs suggests that synesthetes sometimes need to interpret what they visually

Figure 3.5 Synesthetes interpret the middle letter as a “C” when it occurs in “JACK” and as an “O” when it occurs in “POT.” The color of their synesthetic experience will depend on which word the grapheme is considered a part of.

experience prior to having a synesthetic experience. Though Ramachandran and Hubbard (2003a) argue that grapheme-color synesthesia is a form of low-level perception (a “sensory phenomenon”), they grant that linguistic context can affect synesthetic experience. They presented the sentence “Finished files are the result of years of scientific study combined with the experienced number of years” to a subject and asked her to count the number of “f’s” in it. Most normal subjects count only three “f’s” because they disregard the high-frequency word “of,” and even though the synesthete eventually spotted six “f’s,” she initially responded the way normal subjects do. Ramachandran and Hubbard (2003a) suggest that these contextual effects can be explained by top-down influences. Whether this is right in cases of grapheme-color synesthesia, however, will depend on whether color experience processed in early visual areas is indeed affected by high-level contextual information and interpretive processes. If it is not, then

strong, top-down influences cannot explain the contextual effects. A better explanation then may be that interpretation of low-level perceptual information is sometimes required for synesthetic experience. The lack of pop-out effects as described in, for example, Edquist et al. (2006) and Brogaard, Marlow, and Rice (2013a) does, however, lend support to the reentrant processing model as explanatory in some cases of synesthesia. The absence of preattentive pop-out in synesthetes’ searches indicates that top-down attention is needed to elicit the synesthetic experience. One possibility is that the reentrant processing of high-level information to lower-level color areas is responsible for inducing the synesthetic color experience from the attended grapheme inducer. The reentrant processing theory can also explain why synesthetes may have significantly slower reaction times for locating camouflaged green “2” inducers compared to controls (Brogaard, Marlow, & Rice, 2013a). If high-level information reenters lower-level color areas, then there could be cognitive interference caused by a clash between the synesthetic color arising from top-down processes and the grapheme’s real color arising from bottom-up processes. For example, under the reentrant processing model, a black letter “A” is processed bottom-up as achromatic and then is attentionally perceived as red by the grapheme-color synesthete. The processing of red from reentrant processes would then conflict with the achromatic feature resulting from bottom-up processes. This could create cognitive interference that impedes conscious target recognition. Such an account could possibly explain synesthetes’ slower search times when compared to controls as observed in Brogaard, Marlow, and Rice (2013a). There are also models that suggest that grapheme-color synesthesia consists in special kinds of automatized memory associations. 
Brogaard (2013) proposes that the automatic association between graphemes and colors in some cases of grapheme-color synesthesia is akin to the automatic association between smell and negative memories. For example, the smell of chlorine may automatically induce visual images of a particular negative event. In the case of smell, the tight association presumably is formed immediately as a result of the negative value of the event. Presumably hyperactivation of the amygdala leads to the formation of connections between the adjacent olfactory bulb and visual areas. Affect is unlikely to be a factor that influences information binding in developmental synesthesia. The association presumably forms automatically because of its advantage in the learning process. Because of the indirect character of memory processes, the memory model is best understood as depicting one of the indirect mechanisms.

The Long-Term Potentiation Model

According to the currently accepted model of memory, which we might call the “reactivation model,” the hippocampus is not a storage space for information but a subcortical executive region in charge of maintaining connections between neural networks located in different areas of the brain (Eichenbaum, 2004; Serences et al., 2009; Rissman & Wagner, 2012). Working memory in the prefrontal cortex and hippocampus operate in tandem. The hippocampus guides the depositing of proteins at the synapses of neurons in areas that originally processed the information to be remembered. Together with neighboring hippocampal areas it also keeps track of the relative order of events and binds events that belong together. Memory retrieval by working memory reactivates the original areas of information processing by interaction with the executive hippocampus. On the memory model, synesthesia is sometimes the result of an indirect mechanism. The hippocampus would at some point have generated LTP connecting visual color and grapheme areas. Exposure to achromatic grapheme-stimuli would trigger both recognition of the grapheme as a particular grapheme (e.g., the numeral “2”) and memory retrieval of synesthetic colors to executive areas of the brain. The renewed activity in the color areas taking place in order for memory retrieval of synesthetic color information to occur may simultaneously give rise to a conscious projection of synesthetic color from visual color areas either via functional hyperactivity or structural changes in the visual cortex. In synesthetes in whom graphemes and colors truly are bound together to the extent that graphemes literally are seen as having colors, the hippocampus may be treating the distinct neural networks the way it normally would with form and color that belong together (e.g., tomato and red). 
In cases in which grapheme and color are not tightly bound together, the hippocampus must be treating the neural networks more like involuntary quick associations such as that between the striking of a match and its being lit. The LTP model is supported by the lack of pop-out effects described in, for example, Edquist et al. (2006) and Brogaard, Marlow, and Rice (2013a). In the LTP model, conscious projection of synesthetic color is subsequent to memory retrieval of synesthetic colors from visual color areas in response to exposure to graphemes. For conscious projection to occur, the synesthete must first spot the grapheme and interpret it, which would rule out any true pop-out of graphemes. The LTP model can also account for the slower reaction times among synesthetes in searching for green inducer targets described in Brogaard, Marlow, and Rice (2013a). The processing of green in visual areas may have slowed down the retrieval of the non-green synesthetic

Figure 3.6 Graphic illustration of the LTP model for grapheme-color binding in synesthesia.

color information and the projection of the synesthetic color onto the “2’s” after reactivation of color areas. While the reentrant-processing model may also explain arbitrary associations between inducers and their concurrents, only the LTP model explains nonarbitrary associations that may occur in certain types of synesthesia. In lexical-gustatory synesthesia, for example, words that denote foods typically give rise to the taste of the foods they denote (Ward & Simner, 2003). The nonarbitrariness of associations between words and tastes indicates that the mechanism responsible for the creation of associations must be related to memory. Under the LTP model, associations between words and tastes may occur in neurotypical memory formation. In cases of synesthesia, the connections between food words and tastes become strong enough to produce gustatory experiences.

A further virtue of the LTP model as an indirect model explanatory of some cases of synesthesia is that it straightforwardly can account for cases in which there is not a true binding of graphemes and colors. Some grapheme-color synesthetes report that graphemes merely are felt as inducers of colors either projected out into the world or seen in the mind’s eye. On the LTP model, information stored in memory about graphemes and their synesthetic colors need not be as tightly connected as the characteristic features of objects (e.g., red hearts). There are also synesthetes that merely know the color of graphemes but have no corresponding visual experience associated with the grapheme. Ward (2013) suggests that the visual experience may have faded over the years. The LTP model offers a simple explanation of this type of fading. Memories tend to fade in their vividness over time. So, if synesthetic binding is stored in memory, we should expect some fading to occur. Sound symbolism, the idea that phonemes carry meaning, indicates that synesthetic associations arise via a similar mechanism. In the English language, nouns and verbs have distinct phonological properties (Farmer, Christiansen, & Monaghan, 2006). Studies have shown that neurotypical individuals can use phonemes to correctly determine the meaning of a word in the absence of any other clues. For example, one study showed that subjects consistently preferred the nonwords “baluma” to denote a rounded shape and “takete” to denote a pointed shape (Köhler, 1929). Other studies have shown similar consistency for other phoneme-object pairs (Davis, 1961; Ramachandran & Hubbard, 2001b; Maurer, Pathman, & Mondloch, 2006). The phenomenon of sound symbolism also has been demonstrated across languages. In one study, native English speakers were able to correctly guess which phonemes denoted birds when presented with both bird and fish phonemes (Berlin, 1994). 
The similarity between phoneme-object association in neurotypical perception and inducer-concurrent association in some cases of synesthesia indicates that the neurotypical brain is primed to make seemingly arbitrary associations between stimuli and concurrents. In the case of synesthesia, inducer-concurrent associations may be much stronger and may be combined with functional or structural changes of sensory areas, leading to the distinct phenomenology that characterizes the condition. Under the LTP model, it is possible that neurotypical individuals form the same sort of associations as synesthetes without actually experiencing synesthetic phenomenology. Synesthetic phenomenology might only appear above a certain threshold for activation due to the abnormally strong degree of association between inducers and their concurrents. Alternatively, it may

require aberrant functional or structural changes in sensory areas, such as hyperexcitability.

4 Cognitive Advantages of Grapheme-Color Synesthesia

If pop-out effects normally require top-down attention to the synesthetic inducers, grapheme-color synesthesia is unlikely to give most subjects a huge cognitive advantage in visual search tests. However, there may nonetheless be other cognitive advantages associated with color synesthesia. For example, one of our recent pilot studies suggests that grapheme-color synesthetes may have greater recall ability for digits and written names when compared to nonsynesthetes, though it is yet to be seen whether our initial results hold up in larger studies. There have also been rare cases in which color synesthesia has been associated with extreme mathematical skill. Subject DT, for example, sees numbers as three-dimensional, colored, textured forms, and he reports his synesthesia as giving him the ability to multiply large numbers very rapidly (Bor, Billington, & Baron-Cohen, 2007). As DT describes it, the product of multiplying two numbers is the number that corresponds to the shape that fits between the shapes corresponding to the multiplied numbers. DT’s color synesthesia also gives rise to extreme mnemonic skills. He currently holds the European record in reciting the decimal places of the number pi. An fMRI study comparing DT to controls while attempting to locate patterns in number sequences indicated that his synesthetic color experiences occur as a result of information processing in nonvisual brain regions, including temporal, parietal, and frontal areas (Bor, Billington, & Baron-Cohen, 2007). Brogaard, Vanni, and Silvanto (2012) describe a case of a subject, JP, who has exceptional abilities to draw complex geometrical images by hand (see figure 3.7) as well as a form of acquired synesthesia for mathematical formulas and moving objects, which he perceives as colored, complex, geometrical figures. 
These two unusual case studies suggest that at least some forms of color synesthesia can give rise to cognitive advantages in the area of mathematics. As the visual cortex does not appear to be directly involved in generating the synesthetic images in either subject, however, the two cases also suggest that at least some forms of color synesthesia are best characterized as forms of high-level perception that proceed via a nonstandard mechanism. Cases of acquired synesthesia suggest a yet unexplored theory: the condition may be a window to unconscious brain processes. In the neurotypical

Figure 3.7 Image hand-drawn by subject JP. His synesthesia began in the wake of a brutal assault that led to unspecified brain injury. An fMRI study contrasting activity resulting from exposure to image-inducing formulas and noninducing formulas indicated that JP’s colored synesthetic images arise as a result of activation in areas in the temporal, parietal, and frontal cortices in the left hemisphere. The image-inducing formulas as contrasted with the noninducing formulas caused no activation in the visual cortex or the right hemisphere (see figure 3.8).

brain, much information is processed unconsciously. One such type of information consists of calculations made in the dorsal visual stream. According to the two-streams hypothesis, visual information is split into two streams in the visual cortex. The ventral stream runs sideways through the temporal lobe and ends in the prefrontal cortex, whereas the dorsal stream runs upward through the parietal cortex and ends in the motorsensory cortex. The ventral stream is specialized for vision for object recognition, and the dorsal stream is specialized for vision for action (Goodale & Milner, 1992; Milner & Goodale, 1996; Brogaard, 2011b). The dorsal stream is highly sophisticated, responsible for the complex calculations necessary to perform actions directed at moving stimuli in real time. But despite their sophistication, dorsal stream processes do not correlate

Figure 3.8 Activation induced by the image-inducing formula contrasted to noninducing formulas. The SPM(T) maps were thresholded at family-wise-error-corrected p-value 0.01 and overlaid on JP’s structural T1-weighted MRI, which was standardized into MNI-space using SPM8. (A) All activation viewed from sagittal (upper row) and axial (lower row) directions. (B–C) Two sagittal and axial slices. The white lines indicate the section of the other orientation (Brogaard, Vanni, & Silvanto, 2012).

with visual awareness (Brogaard, 2011b). Other high-level cognitive information, such as simple mathematical calculations, may be processed unconsciously as well. At least one recent study showed that abstract, symbolic, and rule-forming computations can be processed unconsciously (Sklar et al., 2012). It is possible that synesthetic phenomenology involves conscious representations of the output of unconscious processes. The representational nature of some synesthetic phenomenology is evidenced by DT’s case: the ability to consciously report the results of multiplication relies on visual phenomenology, but the synesthetic shapes used in multiplication correspond to real numbers. DT lacks conscious awareness of the method used to generate the imagery, yet the imagery is representative of the correct result. The representational ability of synesthesia need not be limited to developmental cases. Traumatic brain injury and subsequent anatomical or functional reorganization may sometimes lead to arbitrary associations of high-order concepts with perceptual representations. These associations may then become bound into a conscious experience upon retrieval. JP’s case lends credibility to this theory. As JP learned new mathematical concepts following the injury, corresponding synesthetic imagery must have evolved over time before becoming stable.

Figure 3.9 In the Müller-Lyer illusion, subjects believe the lines are of the same length, but no matter how long they look, they continue to experience the lines as having different lengths. This illustrates a case in which perceptual information is encapsulated from belief influence.

5 Modularity

It is often said that synesthesia research has few or no interesting implications for classical debates in philosophy of mind. Research on synesthetic phenomena, however, has been recognized to have had profound influences on the modularity of mind hypothesis, which claims that systems involved in producing particular mental states or abilities are modular (Fodor, 1983; Sperber, 1994, 2002; Pinker, 1997; Carruthers, 2006). Beyond the core premise of modularly describing what different regions of the brain do, the modularity hypothesis also holds that certain kinds of information are encapsulated from influence by other regions (Fodor, 1983). For example, the Müller-Lyer illusion illustrates that perceptual information is encapsulated from belief influence (see figure 3.9). Though people familiar with the illusion believe that the line segments have the same length, they perceptually experience them as having different lengths. For information to be encapsulated, however, it does not suffice that there are no top-down influences on producing it; it must also be free of influences from other modules. Thus, although the Müller-Lyer illusion demonstrates that knowledge of the lines being of the same length does not alter the experience of the lines being of different lengths, high-level cognitive processes such as selective attention still might influence cognitive processes at an even lower level than that of perceptual experience (Tsal, 1984; Weidner et al., 2009). Given this possibility, the Müller-Lyer illusion fails to demonstrate the modularity of perceptual experience under a strict interpretation of the modularity hypothesis. The modularity of mind hypothesis was rejected long before synesthesia research took off as a serious field of study, but the theory has persisted in a limited form with respect to color perception. Synesthesia evidently provides a challenge even for those who restrict modularity to systems of color.

If grapheme-color synesthesia is a type of perceptual experience or a perceptual experience enriched by a mental image (Deroy, 2012), then grapheme-color synesthesia undermines the idea that the system that produces color perception is free of influences from other modules.2 There are further cases of synesthesia that challenge the modularity hypothesis even when restricted to perception. In the cases discussed above, subjects JP and DT report having internal, colorful, visual imagery in response to formulas or numerals. In both cases, data from fMRIs show that the visual cortex is not a source of the imagery. On the assumption that JP and DT’s synesthetic experiences are perceptual experiences enriched by visual imagery as argued by Deroy (2012), these cases challenge even a very modest form of modularity. A very modest form of modularity states that perceptual systems computing color are regionally defined and encapsulated. For example, defenders of the modularity hypothesis could hold that, in the human brain, these systems are restricted to the V4/V8 color complex in the visual cortex and that these systems are free of outside influences. The visual cortex, however, is not involved at all in producing the colors of the synesthetic experiences of JP and DT. The cases of JP and DT also challenge later more radical defenses of modularity that hold that modules are, as Carruthers (2006) puts it, “isolable function-specific processing systems,” which are all or almost all domain specific, and “whose operations aren’t subject to the will, [and] are associated with specific neural structures (albeit sometimes spatially dispersed ones) … whose internal operations may be inaccessible to the remainder of cognition” (12). 
This variant, too, would be undermined by cases in which color experiences synesthetically are associated with colored numerals or geometrical equations, which seem to be produced by structures in the parietal cortex or the temporal gyri rather than in the visual cortex.

6 Cognitive Impenetrability

The implications of synesthesia research do not end with the modularity hypothesis, however. Another related, but distinct, classical debate turns on the extent to which visual experience is penetrable by cognitive factors such as belief and familiarity (Pylyshyn, 1984). This concept of impenetrability should not be understood in the same way as encapsulation. Whereas encapsulation concerns all influences on a mental state, cognitive penetration merely concerns top-down influences on perceptual experience. One piece of evidence in favor of the cognitive penetrability hypothesis comes from studies investigating the effect of belief on color experience and

the effects of cognitive factors on grapheme-color experience. For example, an early study completed by Delk and Fillenbaum (1965) indicated that our beliefs about the characteristic color of an object may affect the color we experience that object as having. In the study, experimenters cut out shapes from uniformly colored pieces of paper. Some shapes represented objects that are characteristically red (for example, an apple, a heart, a pair of lips), while other shapes depicted objects that are not characteristically red (for example, a circle, an oval, a bell, a mushroom). Each cutout was placed in front of a colored background that could be changed from light red to dark red. Subjects were asked to adjust the background until the color was the same as the shape in front. The researchers found that when the object represented was characteristically red, the subjects selected a background color that was redder than the color they selected when the shape was of an object not characteristically red. Based on these types of observations, philosophers have argued our beliefs about the colors of objects penetrate our color experiences (Macpherson, 2012). The question still remains, however, whether subjects were reporting on their visual experiences or on higher cognitive states based on interpretations of their visual experiences. If the latter is true, then cognitive penetrability is unsurprising. Synesthesia could potentially shed some light on this uncertainty. Grapheme-color synesthesia has traditionally been characterized as experiential (low-level visual experience). Ramachandran and Hubbard (2003a), for example, characterize synesthesia as a genuine experiential, or “sensory,” phenomenon. As they put it,

Work in our laboratory has shown that synaesthesia is a genuine sensory phenomenon. … The subject is not just “imagining the colour,” nor is the effect simply a memory association (e.g. from having played with coloured refrigerator magnets in childhood). (2003, 51)

Ramachandran and Hubbard grant that cognitive factors can influence synesthetic experience, but they maintain that this phenomenon can be explained by cognitive penetration of visual experiences. If Ramachandran and Hubbard were right, then we would have some vindication of the hypothesis that visual experience is penetrable by cognitive factors. As we have argued in this chapter, however, research into the dependence of synesthesia on focal attention suggests that grapheme-color synesthesia is not a low-level visual phenomenon. High-level cognitive mechanisms seem required to elicit synesthetic experience, indicating that what’s cognitively penetrable is interpreted visual experience (e.g., the interpretation of a shape as a numeral) rather than visual experience as such. These findings

thus undermine one of the main pieces of evidence in favor of the cognitive penetrability hypothesis, namely the claim that synesthetic experience in grapheme-color synesthesia is both a form of low-level perception and cognitively penetrable. Even further, the evidence from synesthesia research suggests that what we previously took to be cognitive penetrability may just be the subjects’ beliefs penetrating higher-order cognitive states that depend on the interpretation of low-level color experience.

7 The Role of Attention in Feature Binding

Synesthesia research can also shed light on philosophical debates about attention and the binding problem. Many cases of synesthetic experience are atypical in binding unusual features: in some cases, graphemes literally are consciously seen as colored. Despite this atypicality, recent research into the mechanism underlying synesthesia may give us some insights into the nature of attention and the role it plays in feature integration. In the LTP model, the hippocampus binds together neural networks in distinct brain regions, namely grapheme areas and color areas. Attention to an achromatic grapheme elicits both recognition of a shape as a particular grapheme and retrieval of the color information in visual areas. In cases of normal perception, attention may consist in a selection of a target by choosing among features entering the system. The selection of features cannot proceed completely on the personal level, as some features appear inseparable (e.g., color and form). But one of the lessons from synesthesia is that selecting features is not the only role of attention. In grapheme-color synesthesia, selective attention facilitates recognition of a grapheme, which then elicits synesthetic color. Wu (2011) argues that conscious selective attention consists in more than perceptually locking on to a specific object. According to Wu, it also involves a way of demonstratively locking on to it (e.g., attending to that woman). 
Smithies (2011) goes one step further and argues that attention is what makes information fully accessible for use in rational thought and action. The lessons drawn in this chapter about the dependence of synesthesia on attention are in broad agreement with Wu’s and Smithies’s views. However, our studies suggest that prior to locking on to an object demonstratively or gaining full access to information about a particular object that suffices for acting and reporting, we must be able to consciously recognize the thing attended to as a specific object. You can, perhaps in poor

visual conditions, attend to a blob in the environment without recognizing it as a specific object and even without being able to see precisely where the blob begins or ends. Attention itself, however, is essential to conscious recognition. In line with the LTP model for synesthesia, attention to a target presumably causes the hippocampus (or neighboring hippocampal areas) to reactivate neurons in distinct areas of the brain (e.g., color and form areas) (Brogaard, 2013). When this information is retrieved by working memory in a bound form together with novel target information, this may result in conscious recognition of the object. It is at this point that one can demonstratively lock on to the object as a specific object with well-defined boundaries. On the view proposed here, then, one function of selective attention is to choose a target; another is to ensure that recognition can take place by initiating memory retrieval of bound features. Initial integration of features of novel stimuli likely takes place earlier in the sensory systems. But initial feature integration probably does not correlate with conscious recognition or even consciousness of a feature. Blindsight patients are able to predict several features of a target located among distractors, despite being unable to consciously perceive them (Brogaard, 2011a; Brogaard, Marlow, & Rice, 2013b). As blindsight patients have lesions to the primary visual cortex, there is reason to think that some features are found in early parts of the visual system, perhaps in the LGN. Studies have shown that blindsighters can attend to targets they are not aware of, but this requires cues that indicate the location of the target (Kentridge, Heywood, & Weiskrantz, 1999; Kentridge, 2011). It is unlikely that focused attention is always required for feature integration to take place in blindsight. 
On the feature integration theory of attention first proposed by Anne Treisman (Treisman & Gelade, 1980; Treisman, 2003), conscious perception is preceded by a preattentive stage during which different features of objects (e.g., shape, color, orientation, depth) are processed by different brain regions. This is then followed by an attentive stage where the features are combined by focused attention to a specific object, resulting in a conscious experience of the object. Treisman and Schmidt (1982) showed in a masking study that subjects who are only briefly exposed to different objects often combine features of distinct objects, for example, mistakenly attributing a shape of one object to a different object. Similar kinds of conjunctive illusions have been found in people with Bálint’s syndrome, a condition in which damage to the parietal cortex prevents focused attention

on individual objects (Friedman-Hill, Robertson, & Treisman, 1995). This model of attention, however, probably is too radical. Aside from the evidence from blindsight studies, there is evidence to suggest that initial feature integration does not depend on focused attention. For example, there are no known perceptual illusions where colors spill out of objects. This suggests that color and boundary integration is preattentive.

8 Conclusion

The invigoration of synesthesia research over the past two decades has inspired novel empirical and theoretical approaches to studying and understanding the human mind. As a phenomenon, synesthesia has compelled profound reconsiderations of many once salient philosophical perspectives concerning a multitude of cognitive mechanisms such as memory, attention, and reentrant processing. This chapter offered a reassessment of these mechanisms through the lens of grapheme-color synesthesia research and the related debate concerning the role of selective attention in the phenomenon. Out of this review and discussion we presented new and compelling research contesting the theory of a preattentive pop-out effect aiding synesthetes in their visual searches for target graphemes among distractors. These findings, however, showed not only that grapheme-color synesthesia does not always provide an advantage in complicated visual search tasks but also that it may sometimes incur a slight disadvantage. Such an observation lends credence to either the reentrant processing hypothesis or the LTP model for some cases of grapheme-color synesthesia. Along with these new observations, we further argued that the LTP model has greater explanatory power than competitors in some cases. 
We contend that the superior explanatory power in some cases can be attributed to the fact that the LTP model can account both for the arbitrary associations between synesthetic inducers and concurrents and for the nonarbitrary synesthetic associations that may arise in some cases of synesthesia. Although grapheme-color synesthesia may not provide an advantage in visual search tests, there are independent reasons to think that color synesthesia can provide cognitive advantages in areas such as memory retrieval of words and digits. Continued research into the mechanisms of synesthesia will help reveal the cognitive idiosyncrasies and mental processes underlying the condition as well as advance our understanding of the workings of the neurotypical brain. Through a litany of potentials, synesthesia offers modern cognitive research a new doorway into unlocking the secrets of the human mind.

Acknowledgments

For helpful comments and/or discussion of the paper’s ideas, we are grateful to Kathleen Akins, David Bennett, Ophelia Deroy, Emma Esmaili, Dimitria Gazia, Brenda Kirchhoff, and Wayne Wu.

Notes

1. Everyone has contributed equally. The listing of the authors is alphabetical.

2. This only holds if the mechanism underlying grapheme-color synesthesia is of the feedback type as opposed to the feed-forward type. However, we already offered several reasons for thinking the mechanism underlying grapheme-color synesthesia is at least sometimes of the feedback type and not the feed-forward type.

References

Armel, K. C., & Ramachandran, V. S. (1999). Acquired synesthesia in retinitis pigmentosa. Neurocase, 5, 293–296.
Bargary, G., & Mitchell, K. J. (2008). Synaesthesia and cortical connectivity. Trends in Neurosciences, 31, 335–342.
Baron-Cohen, S., Harrison, J., Goldstein, L. H., & Wyke, M. (1993). Coloured speech perception: Is synaesthesia what happens when modularity breaks down? Perception, 22, 419–426.
Baron-Cohen, S., Wyke, M., & Binnie, C. (1987). Hearing words and seeing colors: An experimental investigation of synesthesia. Perception, 16, 761–767.
Beck, J. (1966). Effect of orientation and of shape similarity on perceptual grouping. Perception and Psychophysics, 1, 300–302.
Beeli, G., Esslen, M., & Jancke, L. (2005). Synaesthesia: When coloured sounds taste sweet. Nature, 434, 38.
Berlin, B. (1994). Evidence for pervasive synaesthetic sound symbolism in ethnozoological nomenclature. In L. Hinton, J. Nichols, & J. Ohala (Eds.), Sound symbolism (pp. 77–93). New York: Cambridge University Press.
Bor, D., Billington, J., & Baron-Cohen, S. (2007). Savant memory for digits in a case of synaesthesia and Asperger syndrome is related to hyperactivity in the lateral prefrontal cortex. Neurocase, 13, 311–319.
Brogaard, B. (2011a). Are there unconscious perceptual processes? Consciousness and Cognition, 20, 449–463.

Brogaard, B. (2011b). Conscious vision for action vs. unconscious vision for action. Cognitive Science, 35, 1076–1104.
Brogaard, B. (2012). Color Synesthesia. In K. A. Jameson (Ed.), Cognition and language: Encyclopedia of color science and technology. New York: Springer.
Brogaard, B. (2013). Grapheme-color synesthesia and the reactivation model of memory. In O. Deroy & M. Nudds (Eds.), Sensory blendings: New essays on synaesthesia. Oxford: Oxford University Press.
Brogaard, B., Marlow, K., & Rice, K. (2013a). Do synesthetic colors grab attention in visual search? Manuscript.
Brogaard, B., Marlow, K., & Rice, K. (2013b). Unconscious influences on decision making in blindsight. Behavioral and Brain Sciences, 19(6), 566–575.
Brogaard, B., Vanni, S., & Silvanto, J. (2012). Seeing mathematics: Perceptual experience and brain activity in acquired synesthesia. Neurocase. doi:10.1080/13554794.2012.701646.
Brown, R. (Ed.). (2014). Consciousness inside and out: Phenomenology, neuroscience, and the nature of experience. Studies in Brain and Mind, Synthese Library, Vol. 6. Dordrecht: Springer.
Carruthers, P. (2006). The architecture of the mind. Oxford: Oxford University Press.
Cytowic, R. E., & Eagleman, D. M. (2009). Wednesday is indigo blue. Cambridge, MA: MIT Press.
Davis, R. (1961). The fitness of name to drawings: A cross-cultural study in Tanganyika. British Journal of Psychology, 52, 259–268.
Delk, J. L., & Fillenbaum, S. (1965). Differences in perceived color as a function of characteristic color. American Journal of Psychology, 78(2), 290–293.
Deroy, O. (2012). Synaesthesia: An experience of the third kind? In R. Brown (Ed.), The phenomenology and neurophilosophy of consciousness. Neuroscience Series, Synthese Library. (Response to Brit Brogaard’s paper in the same volume.)
Dixon, M. J., & Smilek, D. (2005). The importance of individual differences in grapheme-color synesthesia. Neuron, 45, 821–823.
Dixon, M. J., Smilek, D., & Merikle, P. M. (2004). Not all synaesthetes are created equal: Projector versus associator synaesthetes. Cognitive, Affective, and Behavioral Neuroscience, 4, 335–343.
Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on “sensory-specific” brain regions, neural responses, and judgments. Neuron, 57, 11–23.
Eagleman, D. M., Kagan, A. D., Nelson, S. S., Sagaram, D., & Sarma, A. K. (2007). A standardized test battery for the study of synesthesia. Journal of Neuroscience Methods, 159, 139–145.


Edquist, J., Rich, A. N., Brinkman, C., & Mattingley, J. B. (2006). Do synaesthetic colours act as unique features in a visual search? Cortex, 42, 222–231.
Eichenbaum, H. (2004). The hippocampus, memory, and place cells: Is it spatial memory or a memory space? Neuron, 44(1), 109–120.
Elias, L. J., Saucier, D. M., Hardie, C., & Sarty, G. E. (2003). Dissociating semantic and perceptual components of synaesthesia: Behavioural and functional neuroanatomical investigations. Brain Research: Cognitive Brain Research, 16, 232–237.
Farmer, T. A., Christiansen, M. H., & Monaghan, P. (2006). Phonological typicality influences on-line sentence comprehension. Proceedings of the National Academy of Sciences of the United States of America, 103, 12203–12208.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Friedman-Hill, S. R., Robertson, L. C., & Treisman, A. (1995). Parietal contributions to visual feature binding: Evidence from a patient with bilateral lesions. Science, 269, 853–855.
Gheri, C., Chopping, S., & Morgan, M. J. (2008). Synaesthetic colours do not camouflage form in visual search. Proceedings of the Royal Society of London, Series B: Biological Sciences, 275, 841–846.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Grossenbacher, P. G. (1997). Perception and sensory information in synaesthetic experience. In S. Baron-Cohen & J. E. Harrison (Eds.), Synaesthesia: Classic and contemporary readings (pp. 148–172). Malden, MA: Blackwell.
Grossenbacher, P. G., & Lovelace, C. T. (2001). Mechanisms of synesthesia: Cognitive and physiological constraints. Trends in Cognitive Sciences, 5, 36–41.
Hubbard, E. M., Arman, A. C., Ramachandran, V. S., & Boynton, G. M. (2005). Individual differences among grapheme-color synesthetes: Brain-behavior correlations. Neuron, 45(6), 975–985.
Hubbard, E. M., Manohar, S., & Ramachandran, V. S. (2005). Contrast affects the strength of synesthetic colors. Cortex, 42, 184–194.
Hubbard, E. M., & Ramachandran, V. S. (2005). Neurocognitive mechanisms of synesthesia. Neuron, 48, 509–520.
Kentridge, R. W. (2011). Attention without awareness: A brief review. In C. Mole, D. Smithies, & W. Wu (Eds.), Attention: Philosophical and psychological essays (pp. 228–246). Oxford: Oxford University Press.
Kentridge, R. W., Heywood, C. A., & Weiskrantz, L. (1999). Attention without awareness in blindsight. Proceedings of the Royal Society of London, Series B: Biological Sciences, 266, 1805–1811.


Kim, C. Y., Blake, R., & Palmeri, T. J. (2006). Perceptual interaction between real and synesthetic colors. Cortex, 42, 195–203.
Laeng, B., Svartdal, F., & Oelmann, H. (2004). Does color synesthesia pose a paradox for early-selection theories of attention? Psychological Science, 15, 277–281.
MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163–203.
Macpherson, F. (2012). Cognitive penetration of colour experience: Rethinking the issue in light of an indirect mechanism. Philosophy and Phenomenological Research, 84, 24–62.
Marks, L. E. (1974). On associations of light and sound: The mediation of brightness, pitch, and loudness. American Journal of Psychology, 87(2), 173–188.
Marks, L. E. (1987). On cross-modal similarity: Auditory-visual interactions in speeded discrimination. Journal of Experimental Psychology: Human Perception and Performance, 13(3), 384–394.
Mattingley, J. B., Payne, J., & Rich, A. N. (2006). Attentional load attenuates synaesthetic priming effects in grapheme-colour synaesthesia. Cortex, 42, 213–221.
Mattingley, J. B., & Rich, A. N. (2004). Behavioural and brain correlates of multisensory experience in synaesthesia. In G. Calvert, C. Spence, & B. Stein (Eds.), Handbook of multisensory integration. Cambridge, MA: MIT Press.
Mattingley, J. B., Rich, A. N., Yelland, G., & Bradshaw, J. L. (2001). Unconscious priming eliminates automatic binding of colour and alphanumeric form in synaesthesia. Nature, 410, 580–582.
Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults. Developmental Science, 9(3), 316–322.
Mills, C. B., Boteler, E. H., & Oliver, G. K. (1999). Digit synaesthesia: A case study using a Stroop-type test. Cognitive Neuropsychology, 16, 181–191.
Milner, A. D., & Goodale, M. A. (1996). The visual brain in action. Oxford: Oxford University Press.
Myles, K. M., Dixon, M. J., Smilek, D., & Merikle, P. M. (2003). Seeing double: The role of meaning in alphanumeric-colour synaesthesia. Brain and Cognition, 53, 342–345.
Nunn, J. A., Gregory, L. J., Brammer, M., Williams, S. C., Parslow, D. M., Morgan, M. J., et al. (2002). Functional magnetic resonance imaging of synesthesia: Activation of V4/V8 by spoken words. Nature Neuroscience, 5, 371–375.


Odgaard, E. C., Flowers, J. H., & Bradman, H. L. (1999). An investigation of the cognitive and perceptual dynamics of a colour-digit synaesthete. Perception, 28, 651–664.
Palmeri, T. J., Blake, R., Marois, R., Flanery, M. A., & Whetsell, Jr., W. (2002). The perceptual reality of synesthetic colors. Proceedings of the National Academy of Sciences of the United States of America, 99, 4127–4131.
Pinker, S. (1997). How the mind works. New York: W. W. Norton.
Pylyshyn, Z. (1984). Computation and cognition. Cambridge, MA: MIT Press.
Ramachandran, V. S., & Hirstein, W. (1998). The perception of phantom limbs: The D. O. Hebb Lecture. Brain, 121, 1603–1630.
Ramachandran, V. S., & Hubbard, E. M. (2001a). Psychophysical investigations into the neural basis of synaesthesia. Proceedings of the Royal Society of London, Series B: Biological Sciences, 268, 979–983.
Ramachandran, V. S., & Hubbard, E. M. (2001b). Synaesthesia: A window into perception, thought, and language. Journal of Consciousness Studies, 8, 3–34.
Ramachandran, V. S., & Hubbard, E. M. (2003a). The phenomenology of synaesthesia. Journal of Consciousness Studies, 10, 49–57.
Ramachandran, V. S., & Hubbard, E. M. (2003b). Refining the experimental lever: A reply to Shanon and Pribram. Journal of Consciousness Studies, 10, 77–84.
Rich, A. N., & Karstoft, K. I. (2013). Exploring the benefit of synaesthetic colours: Testing for “pop-out” in individuals with grapheme-colour synaesthesia. Cognitive Neuropsychology, 30, 1–16.
Rich, A. N., & Mattingley, J. B. (2002). Anomalous perception in synaesthesia: A cognitive neuroscience perspective. Nature Reviews: Neuroscience, 3, 43–52.
Rich, A. N., & Mattingley, J. B. (2003). The effects of stimulus competition and voluntary attention on colour-graphemic synaesthesia. Neuroreport, 14, 1793–1798.
Rich, A. N., & Mattingley, J. B. (2010). Out of sight, out of mind: The attentional blink can eliminate synaesthetic colours. Cognition, 114(3), 320–328.
Rissman, J., & Wagner, A. D. (2012). Distributed representations in memory: Insights from functional brain imaging. Annual Review of Psychology, 63, 101–128.
Robertson, L. C. (2003). Binding, spatial attention and perceptual awareness. Nature Reviews: Neuroscience, 4, 93–102.
Rouw, R., & Scholte, H. S. (2007). Increased structural connectivity in grapheme-color synesthesia. Nature Neuroscience, 10, 792–797.


Sagiv, N., Heer, J., & Robertson, L. (2006). Does binding of synesthetic color to the evoking grapheme require attention? Cortex, 42(2), 232–242.
Sagiv, N., & Ward, J. (2006). Cross-modal interactions: Lessons from synesthesia. Progress in Brain Research, 155, 259–271.
Serences, J. T., Ester, E. F., Vogel, E. K., & Awh, E. (2009). Stimulus-specific delay activity in human primary visual cortex. Psychological Science, 20(2), 207–214.
Shanon, B. (2002). Ayahuasca visualizations: A structural typology. Journal of Consciousness Studies, 9, 3–30.
Simner, J., Mulvenna, C., Sagiv, N., Tsakanikos, E., Witherby, S. A., et al. (2006). Synaesthesia: The prevalence of atypical cross-modal experiences. Perception, 35, 1024–1033.
Sinke, C., Halpern, J. H., Zedler, M., Neufeld, J., Emrich, H. M., & Passie, T. (2012). Genuine and drug-induced synesthesia: A comparison. Consciousness and Cognition, 21, 1419–1434.
Sklar, A. Y., Levy, N., Goldstein, A., Mandel, R., Maril, A., & Hassin, R. (2012). Reading and doing arithmetic nonconsciously. Proceedings of the National Academy of Sciences of the United States of America. doi:10.1073/pnas.1211645109.
Smilek, D., Callejas, A., Dixon, M. J., & Merikle, P. M. (2007). Ovals of time: Time–space associations in synaesthesia. Consciousness and Cognition, 16(2), 507–519.
Smilek, D., Dixon, M. J., Cudahy, C., & Merikle, P. M. (2001). Synaesthetic photisms influence visual perception. Journal of Cognitive Neuroscience, 13, 930–936.
Smilek, D., Dixon, M. J., & Merikle, P. M. (2003). Synaesthetic photisms guide attention. Brain and Cognition, 53, 364–367.
Smithies, D. (2011). Attention is rational-access consciousness. In C. Mole, D. Smithies, & W. Wu (Eds.), Attention: Philosophical and psychological essays (pp. 247–273). Oxford: Oxford University Press.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind (pp. 39–67). Cambridge: Cambridge University Press.
Sperber, D. (2002). In defense of massive modularity. In I. Dupoux (Ed.), Language, brain, and cognitive development (pp. 47–57). Cambridge, MA: MIT Press.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18(6), 643–662.
Treisman, A. (1982). Perceptual grouping and attention in visual search for features and for objects. Journal of Experimental Psychology: Human Perception and Performance, 8(2), 194–214.


Treisman, A. (2003). Consciousness and perceptual binding. In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation. Oxford: Oxford University Press.
Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97–136.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107–141.
Tsal, Y. (1984). A Mueller-Lyer illusion induced by selective attention. Quarterly Journal of Experimental Psychology, Section A: Human Experimental Psychology, 36(2), 319–333.
Ward, J. (2013). Synesthesia. Annual Review of Psychology, 64, 49–75.
Ward, J., Huckstep, B., & Tsakanikos, E. (2006). Sound-colour synaesthesia: To what extent does it use cross-modal mechanisms common to us all? Cortex, 42, 264–280.
Ward, J., Jonas, C., Dienes, Z., & Seth, A. (2010). Grapheme-colour synaesthesia improves detection of embedded shapes, but without pre-attentive “pop-out” of synaesthetic colour. Proceedings of the Royal Society of London, Series B: Biological Sciences, 277, 1021–1026.
Ward, J., & Simner, J. (2003). Lexical-gustatory synaesthesia: Linguistic and conceptual factors. Cognition, 89(3), 237–261.
Weidner, R., Krummenacher, J., Reimann, B., Müller, H. J., & Fink, G. R. (2009). Sources of top-down control in visual search. Journal of Cognitive Neuroscience, 11, 2100–2113.
Weiss, P. H., & Fink, G. R. (2009). Grapheme-colour synaesthetes show increased grey matter volumes of parietal and fusiform cortex. Brain, 132, 65–70.
Weiss, P. H., Zilles, K., & Fink, G. R. (2005). When visual perception causes feeling: Enhanced crossmodal processing in grapheme-color synesthesia. NeuroImage, 28, 859–868.
Wollen, K. A., & Ruggiero, F. T. (1983). Colored-letter synesthesia. Journal of Mental Imagery, 7, 83–86.
Wu, W. (2011). What is conscious attention? Philosophy and Phenomenological Research, 82(1), 93–120.

4 Intermodal Binding Awareness Casey O’Callaghan

It is tempting to hold that perceptual experience amounts to a co-conscious collection of visual, auditory, tactual, gustatory, and olfactory episodes. If so, each aspect of perceptual experience on each occasion is associated with a specific modality.

This chapter, however, concerns a core variety of multimodal perceptual experience. It argues that there is perceptually apparent intermodal feature binding. I present the case for this claim, explain its consequences for theorizing about perceptual experience, and defend it against objections. I maintain that just as one thing may perceptually appear at once to jointly bear several features associated with the same sense modality, one thing also may perceptually appear at once to jointly bear features associated with different sense modalities. For instance, just as something may visually appear at once to be both red and round, or to have a red part and a green part, something may multimodally perceptually appear at once to be both bright and loud, or to have a red part and a rough part.

The main lesson, I argue, is that perceiving is not just co-consciously seeing, hearing, feeling, tasting, and smelling at the same time. And perceptual phenomenal character is not on each occasion exhausted by that which is distinctive to or associated with a given modality, along with that which accrues thanks to simple co-consciousness. Not all ways of perceiving are modality specific. I defend this account against three main objections: that singular content theorists avoid my conclusions; that apparent infusion of perceptible features is required for perceptually apparent binding but does not occur intermodally; and that the diversity of objects across modalities makes perceptually apparent intermodal binding rare.

1 Feature Binding Awareness

Humans are able consciously to perceive things and their features. You may see a baseball in addition to its dominant color or its laces. You may hear a sound and its pitch or its duration. I am speaking of cases in which a subject, through the use of the senses, consciously experiences or becomes sensorily aware of that which is perceived.1

Perceptible feature bearers include individual things and happenings. Their perceptible features may include sensible attributes, such as qualities or properties, and parts, such as surfaces or segments. That humans perceive individual things does not imply that bare particulars are perceptible. And here I remain neutral on the metaphysical nature of perceptible attributes.

Perceptible feature bearers commonly are perceived to have or as having their perceptible features. You may see a baseball’s being cream-colored or its having laces. You may hear a sound to be high-pitched or to have a long duration. I’ll ignore the difference between perceiving a thing’s being (or having) F and perceiving a thing to be (or to have) F. In illusion, perceptible feature bearers are perceived to have or as having features they lack.

Individual things may be perceived at once to jointly have or as jointly having multiple features. You see the baseball at once to be cream-colored, spherical, and laced. You hear the sound at once to be loud and shrill. You feel the cool surface to have a rough part. In each case, the features are perceived to jointly belong to one thing. Perceptible feature bearers may in differing ways illusorily seem perceptually to jointly bear distinct features.

When you consciously perceive features’ jointly belonging to the same thing, say that you consciously perceive those features to be bound. Call this a case of feature binding awareness. Allow that feature binding awareness may be illusory, so that you may for some reason consciously perceive things merely as jointly bearing distinct features. To simplify, I’ll assume that feature binding awareness sometimes is veridical perception. The right adjustments accommodate cases of misperception.
My concern is not in the first instance the apparent relation between a single feature and its bearer. A single feature perceptibly belonging to an individual does not suffice for feature binding awareness as I understand it. I mean to focus on cases in which differing features perceptibly belong at once to the same thing. This is the standard concern of empirical work on feature binding.

Feature binding awareness may involve perceiving attributes’ (properties or qualities) jointly sharing a bearer or being coinstantiated, or it may involve perceiving parts’ jointly belonging to or composing (perhaps partly) the same whole. So, feature binding awareness need not involve a single uniform qualitative character. Nonetheless, feature binding awareness may depend upon a common type of mechanism. Anne Treisman’s influential work, in particular, expressly tests for common mechanisms in property binding and part binding.2

(i) The binding of properties—the integration of different object properties, such as the color red and the shape + to form a red cross; (ii) the binding of parts—the integration in a structured relationship of the separately codable parts of a single object, such as the nose, eyes, and mouth in a face. … The first two seem to me to be closely related and to depend on the same binding mechanism. (Treisman, 2003, 98)

“Binding” commonly is used to refer to the perceptual process by which information concerning distinct perceptible features (such as color and shape, or distinct components) is bundled together as information concerning a common perceptible item.3 Talk of feature binding in this sense risks conflating information and its subject matter. This chapter concerns perceptual awareness, so I characterize feature binding in terms of perception and its objects. Feature binding occurs when differing features perceptibly belong to a common individual. Feature binding awareness involves differing features perceptually appearing to belong to a common entity: to be coinstantiated by an individual, or to be parts of the same whole.

Feature binding awareness presumably depends upon feature binding processes. I say “presumably” because a feature binding process, as described in the first sentence of the previous paragraph, may require that features are detected or analyzed separately by subpersonal perceptual mechanisms. There is powerful evidence for this claim. But it is possible that what I have characterized as feature binding awareness could occur without such a feature binding process. In any case, my topic is the conscious upshot rather than the process. This does not set my discussion off entirely from traditional debates about feature binding, which often focus upon the psychological and explanatory relations between feature binding processes, which may be subpersonal, and phenomenal characteristics of conscious awareness.4 My main topic is conscious perceptual awareness.

2 Intramodal Feature Binding Awareness

This chapter presupposes that conscious episodes of feature binding awareness sometimes occur. Paradigm feature binding awareness is intramodal. Visual feature binding is best understood. The egg looks whitish and ovoid, and “Q” has a visible part that “O” lacks.
A rich experimental literature (concerning, for instance, the role of attention in binding, illusory conjunctions of features, serial versus parallel search, object-specific preview advantages and penalties, and the role of preattentive segmentation and grouping) has investigated intramodal visual feature binding processes and their relationship to conscious visual feature binding awareness.

Audition also involves feature binding awareness. The sound audibly is high-pitched but wavering, and an utterance of “overtly” has an audible part an utterance of “overt” lacks. Research on auditory scene analysis tracing especially to Bregman (1990) has illuminated the mechanisms responsible for auditory feature binding awareness. Touch, too, involves feature binding awareness. The surface feels smooth and warm to the touch.5 Taste, gustation, and olfaction are more difficult cases, but each might also enable you to perceive individuals and bound features. The cookie in your mouth tastes sweet and salty. You might smell the odor to be jointly rancid and intense.6

Not all modalities reveal the same individuals and features, but individuals and bound features are part of the structure revealed by perceptual awareness in several exteroceptive sensory modalities.7 Intramodal feature binding awareness occurs in more than one modality.

3 Intermodal Feature Binding Awareness

Is there intermodal feature binding awareness? If so, features consciously perceived through different modalities can perceptually appear to be bound and thus to belong to the same thing. The skeptical position is that there is intramodal but not intermodal feature binding awareness.8 Humans only associate features perceived through different modalities or infer that they belong to the same object.

Fulkerson (2011), for instance, claims that only unimodal perceptual experiences involve apparent feature binding. According to Fulkerson, “the predication or assignment of distinct features to perceptual objects” is “a distinguishing feature of unisensory perceptual experiences” (504–505).
Multisensory perceptual experiences do not involve the direct predication of features onto individual perceptual objects. Instead, there is an association between experiences. … What we experience is a higher-order association between sensory experiences. (Fulkerson, 2011, 506)

Fulkerson denies that a multimodal perceptual experience can ascribe to a perceptible object distinct features associated with different modalities. Instead, he thinks distinct unimodal experiences of worldly objects are associated in a higher-order, multisensory perceptual experience. Fulkerson thus rejects intermodal binding awareness as I have characterized it.


Spence and Bayne (2014, esp. sec. 7) argue that there is good evidence for unimodal but not multimodal feature binding awareness.

Are features belonging to different modalities bound together in the form of multisensory perceptual objects (MPOs)? … We think it is debatable whether the “unity of the event” really is internal to one’s experience in these cases, or whether it involves a certain amount of post-perceptual processing (or inference). In other words, it seems to us to be an open question whether, in these situations, one’s experience is of an MPO or whether it is instead structured in terms of multiple instances of unimodal perceptual objects. (Spence & Bayne, 2014, 117, 119)

Spence and Bayne are skeptical whether perceptual consciousness includes awareness as of unified objects that bear features associated with different modalities. They propose to admit only apparent unity stemming from post-perceptual processing or inference rather than apparent unity that is “internal to one’s experience.” Thus, Spence and Bayne express skepticism about intermodal binding awareness.9

Here I want to present a case for intermodal binding awareness, to spell out its consequences, and to defend it against objections. My aim is not to refute the determined skeptic. And I do not claim that every variety of intramodal binding awareness occurs intermodally. Instead, my aim is to show that we should prefer a position that recognizes certain forms of intermodal binding awareness.

My case for a nonskeptical position begins with a contrast between (1) and (2):

(1) Perceiving a thing’s being F and a thing’s being G.
(2) Perceiving a thing’s being both F and G.

An instance of (2) requires that a single thing perceptibly has both features. However, an instance of (1) does not require that. Feature binding awareness occurs just in case the difference between (1) and (2) sometimes is reflected in perceptual awareness.

Consider an intramodal example. Hearing a thing’s being loud and a thing’s being high-pitched differs from hearing a thing’s being both loud and high-pitched. Hearing a thing’s being both loud and high-pitched requires that a single thing perceptibly is loud and high-pitched; hearing a thing’s being loud and a thing’s being high-pitched does not require that. If intramodal feature binding awareness occurs, an intramodal episode of (2) may differ from an intramodal episode of (1) in a way that is reflected in perceptual awareness. Since we have assumed that intramodal feature binding awareness occurs, the difference between (1) and (2) may be reflected in perceptual awareness.
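One way to regiment the contrast between (1) and (2), offered only as a sketch: let F and G stand for perceptible features and let the quantifiers range over perceptible individuals. The quantifier notation is an assumption of this illustration, not notation the chapter itself uses. What is perceived in the two cases then differs in the scope of the existential quantifier:

```latex
\begin{align*}
\text{(1)}\quad & \exists x\, Fx \;\wedge\; \exists y\, Gy \\
\text{(2)}\quad & \exists x\, (Fx \wedge Gx)
\end{align*}
```

On this regimentation the content in (2) entails the content in (1), but not conversely: a scene containing one loud thing and a distinct high-pitched thing satisfies the first formula without satisfying the second. That asymmetry is what leaves room for the difference between (1) and (2) to show up in perceptual awareness.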


My claim is that the difference between (1) and (2) may be reflected in multimodal episodes of conscious perceptual awareness. There are intermodal episodes of (2) that are not merely episodes of (1). For instance, there are episodes of consciously perceiving a thing’s being both bright and loud that are not just episodes of consciously perceiving a thing’s being bright and a thing’s being loud. The difference is reflected in perceptual awareness. This has important consequences for theorizing about perception. Not every aspect of perceptual awareness is associated with a specific modality or accrues thanks to simple co-consciousness.

Before I present the evidence and discuss the consequences, three further clarifications are needed.

First, Tye (2003, 2007) uses an argument with a similar structure to establish that experiences associated with different modalities are co-consciously unified. But Tye’s concern differs from mine. He contrasts, for instance, having a visual experience and an auditory experience at the same time with having an experience that is both auditory and visual. Some of Tye’s examples involve a phenomenally unified multimodal experience as of a common object, but he does not draw the contrast I emphasize in this chapter. This contrast holds between pairs of co-consciously phenomenally unified multimodal experiences. It holds, for instance, between a phenomenally unified audio-visual experience as of hearing a thing’s being F and seeing a thing’s being G and a phenomenally unified audio-visual experience as of a thing’s being both F and G.

Next, I have assumed for simplicity that humans sometimes do consciously perceive objects and features and thus may consciously perceive features to be bound. But, as much as possible, I aim to be neutral regarding theories of perception. Representational content theorists may prefer an alternative formulation of the contrast.

(1′) Perceptually representing that a thing is F and a thing is G.
(2′) Perceptually representing that a thing is both F and G.

Finally, philosophers have focused on perceptible attributes and their apparent bearers, but feature binding awareness also may involve parts. Relations among perceptible parts and wholes are part of the apparent structure revealed by perceptual awareness. Thus, a version of the contrast involves parts.

(1″) Perceiving a thing’s having a as a part and a thing’s having b as a part.
(2″) Perceiving a thing’s having both a and b as parts.
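The shape shared by the attribute and part versions of the contrast can be made concrete with a toy model. The sketch below is purely illustrative: the `Item` class, the function names, and the example scenes are hypothetical, and a "feature" is just a label that may stand for an attribute (such as "bright") or a part (such as "rough part"). It models what is apparently perceived, not any perceptual mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A hypothetical perceived individual and the features it appears to bear."""
    name: str
    features: frozenset


def separately(scene, f, g):
    """Form (1): something appears f and something (possibly else) appears g."""
    return any(f in x.features for x in scene) and any(g in x.features for x in scene)


def jointly(scene, f, g):
    """Form (2): a single item appears to bear both f and g."""
    return any(f in x.features and g in x.features for x in scene)


# Two distinct individuals, one seen and one heard.
flash_and_bang = [
    Item("flash", frozenset({"bright"})),
    Item("bang", frozenset({"loud"})),
]

# One individual apparently bearing a visible and an audible feature.
explosion = [Item("explosion", frozenset({"bright", "loud"}))]

print(separately(flash_and_bang, "bright", "loud"), jointly(flash_and_bang, "bright", "loud"))
# prints: True False
print(separately(explosion, "bright", "loud"), jointly(explosion, "bright", "loud"))
# prints: True True
```

Both scenes satisfy form (1), but only the second satisfies form (2). In the toy model the difference is carried by whether two feature labels attach to one and the same item, and it does not matter whether the labels come from one modality or two.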


The differences matter, but this chapter’s guiding concern is what the differing versions of the contrast have in common. Each involves a contrast between conscious episodes in which a subject is perceptually aware of a common item’s jointly bearing features that are perceived at once using different senses and episodes in which a subject need not be perceptually aware of any such common item. In short, each contrast requires that a common individual may be perceptible as such across different senses.

4 Evidence for Intermodal Feature Binding Awareness

4.1 Perceptual Judgment

Five sources of evidence converge to support intermodal feature binding awareness. The first concerns perceptual judgment. In many ordinary cases, perceptual evidence does not support an immediate perceptual judgment that a thing is both seen and felt, as when you see an airplane and touch a baseball, or when you touch a baseball and unknowingly see it reflected in a mirror. However, perceptual evidence may support an immediate perceptual judgment that what is seen is what is felt, or that something perceived bears both visible and tactual features. Imagine holding a baseball while looking at it. Normally, it would be silly to judge on appearances that the object you see is numerically distinct from the object you feel. The simplest explanation is that you perceive the sphere in your hand at once to be jointly white and red, smooth and leathery; it appears perceptually that there is a white and red, smooth, leathery sphere in your hand. That is what you tend to judge.

However, someone might object that, even granting their veridicality, perceptual appearances leave room for doubt whether features perceived with different modalities belong to the same thing. Thus, the identification does not hinge just upon perceptual appearances or looks; it is not simply a matter of endorsing appearances. If so, the coinstantiation of features need not be perceptible intermodally; instead, it may be cognized only through contributions from further postperceptual resources. The claim is that the identification is neither perceptual nor an immediate perceptual judgment but instead belongs fully to extraperceptual cognition.

4.2 Perception-Guided Action

We can make progress by recognizing that common perceptually guided actions suggest that you sometimes are sensitive to the identity of things perceived through different modalities in a way that does not require perceptual judgment. Imagine crossing a street and hearing something rapidly approaching from your left. You may reflexively jump out of the way, or you may turn quickly to look for it. But it makes little sense to jump from or to look for a sound. Your actions instead suggest sight and hearing share objects. Moreover, once you’ve picked it up by sight, you track and respond to it as a unified perceptible thing or happening, accessible to sight and hearing, rather than as distinct individuals.

Another example involves seeing a baseball coming at you and visually “guiding” it into your mitt. Your activities coordinate sight and touch in a way that suggests you implicitly recognize the ball as a common perceptible target. This ability extends to novel circumstances, so it generalizes. An additional example involves using sight to orient yourself so that you can better listen to the source of a sound. Slightly angling your face away from a source often improves how it sounds.

Such activities involve responsiveness, orienting, and tracking across modalities. They suggest you perceptually identify or are sensitive to the identity of what’s seen with what’s heard or felt. The manner in which multimodal perception guides action supports intermodal binding awareness.

However, someone might object that such actions could depend on pure (but fancy) reflexes, on sophisticated learned associations and coordinated predictions, or on snap judgments and implicit inferences rather than on perception. Moreover, perception for action may be functionally distinct from perception for recognition and awareness. Thus, even if intermodal perception for action identifies common objects, a subject might still wholly lack intermodal binding awareness.

4.3 Empirical Research

A third source of evidence supports the claim that the identification of common objects is not limited to perception for action.
A great deal of recent empirical work on multisensory perception claims that perceptual systems integrate and bind information from different senses to yield unified perceptual awareness of common multimodally accessible objects or events. Here are four representative passages that concern audio-visual binding. When presented with two stimuli, one auditory and the other visual, an observer can perceive them either as referring to the same unitary audiovisual event or as referring to two separate unimodal events. … There appear to be specific mechanisms in the human perceptual system involved in the binding of spatially and temporally aligned sensory stimuli. (Vatakis & Spence, 2007, 744, 754, my italics)

As an example of such privileged binding, we will examine the relation between visible impacts and percussive sounds, which allows for a particularly powerful form of binding that produces audio-visual objects. (Kubovy & Schutz, 2010, 42, my italics)

In a natural habitat information is acquired continuously and simultaneously through the different sensory systems. As some of these inputs have the same distal source (such as the sight of a fire, but also the smell of smoke and the sensation of heat) it is reasonable to suppose that the organism should be able to bundle or bind information across sensory modalities and not only just within sensory modalities. For one such area where intermodal binding (IB) seems important, that of concurrently seeing and hearing affect, behavioural studies have shown that indeed intermodal binding takes place during perception. (Pourtois et al., 2000, 1329, my italics)

There is undeniable evidence that the visual and auditory aspects of speech, when available, contribute to an integrated perception of spoken language. … The binding of AV speech streams seems to be, in fact, so strong that we are less sensitive to AV asynchrony when perceiving speech than when perceiving other stimuli. (Navarra et al., 2012, 447, my italics)10

The main source of empirical evidence for intermodal binding is that sensory systems interact and share information. Cross-modal recalibrations are effects in which a stimulus presented to one sensory system impacts experience associated with another sense modality. Sometimes this generates an illusion. For instance, compelling ventriloquism involves an auditory spatial illusion produced by the visible location of an apparent sound source—the visual stimulus affects auditory spatial experience. In the McGurk effect, video of a speaker uttering /ga/ presented with audio of /ba/ leads subjects to mistakenly hear the utterance as /da/. So, processes associated with one sense sometimes interact causally with processes associated with another sense, and this can alter experience from what it otherwise would have been.11 Explaining such crossmodal effects as mere causal influence misses something important. Welch and Warren (1980) say:

The bias measured in such experimental situations is a result of the tendency of the perceptual system to perceive in a way that is consonant with the existence of a single, unitary physical event. … Within certain limits, the resolution may be complete, so that the observer perceives a single compromise event. (661, 664, my italics)

For instance, in ventriloquism, visual and auditory spatial information may be recalibrated to produce a concordant spatial experience. In the McGurk effect, alveolar /da/ is a compromise between the visible velar /ga/ and the audible bilabial /ba/. So, discrepant or conflicting information from different sensory systems is reconciled in order to reduce or resolve conflict. But conflict requires a common subject matter. Thus, if perceptual processes resolve conflicts between the senses, they treat information as if

it has a common subject matter or shares a source. This requires discerning whether or not different sensory messages concern the same thing and thus belong together as candidates for reconciliation. (The alternative to attributing incompatible features to one item is attributing differing features to distinct items.) So, among perceptual strategies and mechanisms responsible for intermodal recalibrations and illusions, those that reduce and resolve conflicts require the capacity to treat information from different sensory systems as stemming from a common source—as concerning the same things or features. A unified subpersonal grasp upon common perceptible objects in turn may ground unified perceptual awareness as of a single event with visible and audible features. However, one might object. Grant that there is a pattern of causal influence across sensory systems that conforms to principles of conflict resolution, and grant that information is transmitted between senses. This does not require a common or unified representation, and it does not by itself constitute a unified grasp or representation as of a common object or feature bearer. Perceptual mechanisms might effectively resolve conflicts between distinct information streams without integrating or binding them together. The performance of effective conflict resolution need not involve explicitly tracking or representing any common sources as such. Thus, further empirical evidence is needed for intermodal feature binding. In fact, standard empirical measures of intramodal feature binding also provide evidence for intermodal feature binding. 
For instance, multisensory integration, illusory conjunctions, object-specific preview effects, multimodal object files, and intermodal event files (temporary episodic representations of persisting real-world objects and events) have been studied and reported in a variety of intermodal conditions.12 The important upshot of this experimental work (which endnote 12 describes in additional detail) is that perceptual processes indeed do involve tracking or representing individual feature bearers as common across sensory modalities and as bearing features perceptible with different senses. This addresses the objection raised in the previous paragraph. But it calls attention to another worry. The relationship between perceptual processes that involve feature binding—as operationalized by such experimental measures—and conscious perceptual awareness is not clear. In the intramodal visual case, for instance, Mitroff, Scholl, and Wynn (2005) report that object-specific preview benefits disagree with conscious visual percepts. Therefore, in some cases, object trajectories as determined by the object-file system may diverge from those apparent in conscious perceptual awareness. Moreover, in intermodal audio-visual cases, Zmigrod and

Hommel (2011) claim that implicit measures of intermodal feature binding from event-specific preview effects may disagree with conscious perceptual awareness of audible and visible features as belonging to the same event. Event-specific preview effects can tell one story, and measures of conscious perceptual awareness can tell another. The authors say, “binding seems to operate independently of conscious awareness, which again implies that it solves processing problems other than the construction of conscious representations” (592). Thus, it is risky to draw conclusions about conscious perceptual awareness just from experimental work on feature binding.

Here is where things stand. If there is intermodal feature binding awareness as I have characterized it, some mechanisms are responsible. It remains plausible that empirical research on multisensory integration and binding of information concerning a common subject matter should play a critical role in explaining intermodal feature binding awareness. For instance, it helps to show that sensitivity to the identity of things perceptible through different sense modalities is not wholly cognitive and is not limited to perception for action. However, current empirical work does not definitively account for the relation between integration and binding processes and feature binding awareness. So, we cannot take experimental work on intermodal feature binding at face value as direct support for intermodal feature binding awareness. One may doubt psychologists’ interpretations of their own results, but that is not the issue. Psychological explanations of perceptual mechanisms and processes involving feature binding just do not translate neatly and uncontroversially to claims about conscious perceptual awareness.13

4.4 Perceptual Appearances

The contrast between (1) and (2) marks a difference in how things may appear perceptually to be.
This difference may be apparent whether or not you believe things are as they appear and whether or not things are as they appear. When all is going well, the contrast corresponds to a difference in whether or not you are perceptually sensitive to the coinstantiation of features by a common individual perceptible through different senses. The argument stems from misleading appearances and the possibility of error. On one hand, apparent binding can be illusory. Take a compelling case of ventriloquism. You may seem to hear the visible puppet speaking, even if you are not taken in. Contrast this with a poor attempt at ventriloquism, in which it is perceptually evident that the visible puppet is not what you hear. Or consider movies. Nothing in the theater utters the words you hear and is visible on screen. In the psychology lab, you wear

headphones and watch video of disks apparently colliding with a clack. Since there is no particular perceptible event with those visible and audible features, the appearance as of a common source is an illusion. The illusion need not be spatial or temporal, since the speaker could be placed right behind the movie screen. And it does not require belief. A mere case of (1) may simply seem like a case of (2), where the difference concerns that to which you are perceptually sensitive.14 On the other hand, visible and audible features can appear to belong to distinct individuals, or not appear bound, even if you know they belong to one thing. In successful ventriloquism, the sounds appear to come from the dummy but in fact come from the ventriloquist you see. Or take the trick in which you cross one wrist atop the other, weave your fingers together, twist your hands inward and up, visually target a finger, and try to raise it. When the trick works, before you move anything, the seen but visually untargeted finger you surprisingly raise seems distinct from that finger as it is felt. Perceiving features that are coinstantiated seems like a mere case of (1). You fail to be sensitive perceptually to the identity of an individual and to the coinstantiation of its features. So, apparent intermodal binding can be illusory, and features of one thing can mistakenly perceptually appear to be features of distinct things (or may simply not perceptually appear to be bound). These possibilities support the claim that there are cases in which intermodal feature binding is perceptually apparent that differ in what is presented in experience from other cases in which it is not. This provides the materials for a reply to Spence and Bayne. Each of these effects decouples from what you think and what you are inclined to judge on extraperceptual grounds. Therefore, the differing appearances are not due wholly to extraperceptual cognition or inference. 
These cases involve differences in conscious perceptual awareness. It also provides the materials for a reply to Fulkerson. The differences concern what you may be consciously perceptually aware of, not simply relations between experiences. A mere conscious association between experiences cannot in itself be an illusion or misperception. However, suppose that such associations between experiences ground a difference in how things seem perceptually to be and thus may be accurate or illusory. If so, in order for skepticism to have teeth, merely seeming to be associated or to tend to co-occur must differ from seeming to belong to something common. But, if seeming to be associated or to tend to co-occur does not guarantee seeming to share a common object or source, then appearing merely as associated or as tending to co-occur is too permissive to capture the relevant distinctions among the cases discussed above. For instance, a sound and an image may seem

merely to be associated or to tend to co-occur without seeming perceptually to share a common source. A rough surface and a red surface may seem to be associated without their seeming perceptually to be one surface or to belong to one object. Mere associations thus do not suffice for an account of that to which one may be multimodally perceptually sensitive, and they do not suffice for an account of multimodal perceptual awareness.

4.5 Perceptual Phenomenology

A skeptic nevertheless might question whether the difference between (1) and (2) itself may be marked by a difference in the phenomenology of perception. Imagine watching a movie with a compelling, immersive soundtrack. You hang on the actors’ words and jump from your seat at the explosions. It sounds like planes flying up behind you and overhead. Now imagine the soundtrack’s timing is off. It could be just a little bit, so that it is noticeable but not disturbing. It could be even more, so that the experience is jarring. Or it could be a lot, so that the sights and sounds appear wholly dissociated. In each of these four cases, the auditory and visual stimulation independently remain qualitatively the same, but the phenomenology differs unmistakably. The alignment matters. The dramatic phenomenological difference between the perfect soundtrack and the very poorly aligned soundtrack stems in part from perceiving audible and visible features as belonging to something common in the coincident case but not in the misaligned case. The contrast is between apparent intermodal episodes of (2) and of (1).15 A similar argument applies to dubbed foreign language films. In that case, the fine-grained structures mismatch. Someone may object: These experiences differ in spatiotemporal respects; once you control for spatiotemporal differences, such as those involving apparent temporal or spatial relations between what’s audible and visible, any experiential difference dissolves.
Notice that in this respect my case parallels that of perceiving causality. Stimulus features that cue perceptual awareness as of causality also are responsible for the scene’s apparent spatiotemporal features. The main features that indicate causation just are spatiotemporal. So, it is difficult to control for perceptually apparent spatiotemporal features. In the case of intermodal binding awareness, there is a clear way forward. Intermodal binding awareness may depend not just on spatiotemporal cues, but also on factors such as whether and how the subject is attending, the plausibility of the combination or how compelling the match, and whether the subject expects one event or multiple events to occur.

The binding versus segregation of these unimodal stimuli—what Bedford (2001) calls the object identity decision; see also Radeau and Bertelson (1977)—depends on both low-level (i.e., stimulus-driven) factors, such as the spatial and temporal co-occurrence of the stimuli (Calvert, Spence, & Stein, 2004; Welch, 1999), as well as on higher level (i.e., cognitive) factors, such as whether or not the participant assumes that the stimuli should “go together.” This is the so-called “unity assumption,” the assumption that a perceiver makes about whether he or she is observing a single multisensory event rather than multiple separate unimodal events. (Vatakis & Spence, 2007, 744)

Fixing spatiotemporal features does not by itself suffice to fix whether intermodal binding awareness occurs. At the same time, the perceptual system also appears to exhibit a high degree of selectivity in terms of its ability to separate highly concordant events from events that meet the spatial and temporal coincidence criteria, but which do not necessarily “belong together.” (Vatakis & Spence, 2007, 754)

Thus, it is possible to tease apart the appearance of intermodal feature binding from perceptually apparent spatiotemporal features. Fixing apparent spatiotemporal features need not fix whether or not intermodal feature binding is perceptually apparent. Take a pair of cases that controls for spatiotemporal features and for other aspects of perceptual phenomenology. A case in which you “get” the perceptual effect of intermodal binding awareness may contrast in character with an otherwise similar one in which you do not. In addition, the capacity for intermodal binding can be disrupted. Individuals with autism have difficulty integrating cues about emotion from vision and audition. But mechanisms for integrating information from different sets of senses or even features may be dissociated, so localized deficits or brain damage may not cause a wholesale inability to perceive features as bound intermodally. Instead, specific forms of intermodal feature binding awareness may fail. For instance, Pasalar, Ro, and Beauchamp (2010) show that transcranial magnetic stimulation can disrupt visuotactile sensory integration. Hamilton, Shenton, and Coslett (2006) report a patient who is unable to integrate auditory and visual information about speech. “We propose that multisensory binding of audiovisual language cues can be selectively disrupted” (Hamilton, Shenton, & Coslett, 2006, 66). Controlling for spatiotemporal differences—even apparent ones— therefore need not dissolve the phenomenological difference in perceptual experience. Higher-level cognitive factors sometimes may play a role in determining whether or not intermodal feature binding awareness occurs. This

implies neither that intermodal feature binding is extraperceptual nor that the phenomenology of intermodal binding awareness is wholly cognitive. Cognition may causally but not constitutively influence perception, and intermodal binding awareness need not involve awareness of the relevant cognitive factors. One complication concerns the role of attention. I am attracted to the idea that intermodal attention is required for intermodal feature binding awareness. So, suppose there are differing ways of deploying attention. For instance, you might maintain distinct intramodal streams, or you might sustain a single intermodal focus. If so, phenomenological differences associated with these differing ways of deploying attention might account for phenomenological differences between apparent cases of (1) and of (2) that otherwise are alike. Nevertheless, given that perceptual attention targets individual objects or groups whose members are treated as parts of a unified perceptible entity, a single intermodal focus may require recognizing a common perceptible item. Such attended items may perceptibly bear features associated with different senses.

4.6 Summary

Perceptual judgment, perception-guided action, empirical research on multisensory perception, perceptual appearances, and perceptual phenomenology together provide good evidence that intermodal episodes of (2) may contrast with intermodal episodes of (1), that intermodal episodes of each occur, and that the difference sometimes is reflected in perceptual awareness. Humans can be perceptually aware as of something’s jointly having both visible and audible features. This may differ from seeing as of something’s having visible features while hearing as of something’s having audible features. Only the latter is compatible with their being apparently distinct individuals. Thus, perceptually apparent intermodal feature binding occurs. There is intermodal feature binding awareness.
5 Consequences

5.1 Perception Is Not Just Minimally Multimodal

Intermodal feature binding awareness has noteworthy consequences. It follows that consciously perceiving an individual object or event is not always a modality-specific episode. Some ways to perceive individuals cannot be analyzed just in terms of ways in which you could perceive with specific modalities on their own. For instance, visuotactually perceiving a thing’s being jointly F and G is not merely co-consciously seeing a thing’s being

F and feeling a thing’s being G, where it just happens that the same thing is F and G. Perceptually appreciating or being sensitive to the identity of what is seen and felt cannot occur unimodally. So, visuotactually perceiving a thing’s being both F and G is not a way of perceiving that boils down to jointly occurring episodes of seeing and feeling that could have occurred independently. Thus, overall perceptual awareness is not just a matter of co-consciously seeing, hearing, feeling, tasting, and smelling. Where F and G are perceived thanks to different senses, an attentive sensory episode of perceiving a thing’s being both F and G, in which you are sensitive to and able to recognize the identity of what is F with what is G, may not be factorable without remainder into co-conscious modality-specific components that could have occurred independently from each other. A related argument shows that not all perceptual phenomenal character is modality specific. Suppose that the phenomenal character associated with some modality on an occasion includes just that which could be instantiated by a perceptual experience wholly of that modality under equivalent stimulation, where a perceptual experience wholly of some modality belongs to that and no other modality. For example, given a particular multimodal perceptual experience, the phenomenal character associated with vision on that occasion includes just that which could be instantiated by a wholly visual perceptual experience under equivalent stimulation, where a wholly visual perceptual experience is one that is visual but not auditory, tactual, olfactory, or gustatory. The previous section’s arguments have as a consequence that visuotactually perceptually experiencing a thing’s being jointly F and G may have phenomenal features that could not be instantiated either by a wholly visual or by a wholly tactual perceptual experience and that do not accrue thanks to mere co-consciousness. 
(Phenomenal features that accrue thanks to mere co-consciousness may include simple coconscious phenomenal unity or those that supervene upon phenomenal character that is associated with specific modalities.) To demonstrate this, suppose seeing a thing’s being F could be a wholly visual perceptual experience, and suppose feeling its being G could be a wholly tactual perceptual experience. Co-consciously seeing a thing’s being F and feeling a thing’s being G, where it happens that what’s seen is what’s felt, does not suffice for visuotactually perceptually experiencing as of a thing’s being both F and G. So, co-consciously seeing a thing’s being F and feeling a thing’s being G may differ phenomenally from visuotactually perceiving a thing’s being jointly F and G. Thus, the phenomenal character of a multimodal perceptual episode need not be exhausted by that which is associated with

each of its modalities along with that which accrues thanks to mere co-consciousness. Therefore, not all phenomenal character on each occasion is modality specific. While intermodal binding awareness as I have characterized it entails this conclusion, it is worth being explicit that a skeptical position about intermodal binding awareness is compatible with the conclusion. For instance, Fulkerson’s account in terms of conscious higher-level associations between modality-specific experiences entails the same conclusion while rejecting intermodal binding awareness. Nevertheless, skepticism about intermodal binding awareness is required to maintain that all phenomenal character apart from that which accrues thanks to mere co-consciousness is modality-specific.

5.2 Phenomenal Character Is Not Locally Distinctive

Many philosophers say that perceptual experiences of a given modality have a distinctive phenomenal character.16 From the above, it follows that not all perceptual phenomenal character is locally distinctive since not every phenomenal feature is distinctive to a specific modality. That is, it is not the case that each perceptual phenomenal feature could be instantiated only by perceptual episodes associated with a certain modality.17 This is not just the traditional argument from common sensibles. The argument from intermodal feature binding requires that it is possible at a time to perceive visible and audible features to be coinstantiated, and the argument from common sensibles does not. And, unlike the traditional argument from common sensibles, it is not feasible to escape the argument from intermodal feature binding with help from modality-specific modes of presentation or modality-inflected phenomenal character (phenomenal character that is partly a product of the modality itself, understood as a mode of intentionality).
Each leaves unaddressed the phenomenal character of perceptually experiencing as of a single something’s having both visible and audible features—the phenomenally apparent numerical sameness of an individual that is seen and heard. Phenomenal character nonetheless may be regionally distinctive within a modality. Due to perceptually apparent proper sensibles, the overall phenomenal character associated with any given modality on any occasion may be distinctive in that it could only be instantiated by perceptual experiences of that same modality. However, this comes at a cost. Since local distinctiveness fails, you may not be able to tell what modality a phenomenal feature is associated with on an occasion. So, there may be no clear verdict

concerning which phenomenal features, among many candidates, belong to the distinctive overall character that is associated with a given modality. Thus, the boundaries of the phenomenal character associated with a modality on an occasion may not be introspectible, and they are not settled just by considering what’s distinctive.

6 Objections and Replies

6.1 Singular Contents

I have aimed to be neutral regarding theories of perception. But (1) and (2) talk about perceiving “a thing,” and the contrast between (1) and (2) is clear when read to express existentially quantified or general perceptual contents. Perceiving that something is F and something is G differs from perceiving that something is both F and G. What if perception has singular or particular contents? Someone might object that intermodal feature binding awareness does not show that not all perceptual experience is modality specific. You might hear that o is F and see that o is G. This captures the identity of the individual heard and seen, but you could hear that o is F without seeing, and you could see that o is G without hearing. So, overall perceptual awareness may be just co-consciously seeing, hearing, and the rest. No parallel move exists for general perceptual contents.18 There is a good reply that helps illuminate the issue. In principle, twin objects undetectably could be swapped. So, if o and p are distinct but you cannot by perceiving discern the difference in a way that enables you to tell which is which, you may not be able to detect the difference in a way that enables you to tell which is which between, for instance, seeing that o is G and seeing that p is G. So, singular content theorists should accept:

(*) Suppose o and p are distinct but perceptually indistinguishable in ways that would enable a subject to tell which is which.
Controlling for other differences, hearing that o is F while seeing that o is G is introspectively indistinguishable in ways that would enable a subject to tell which is which from hearing that o is F while seeing that p is G.

But then the discernible difference between when features are perceptually experienced intermodally to be bound and when they are not cannot be explained by modality-specific singular contents alone. The singular content theorist in this respect has no advantage over the general content theorist.
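
One way to make the contents at issue explicit is to write them in first-order notation. This is only a sketch, not the author's own formalism: F, G, o, and p are taken from the text, while the "hear" and "see" brackets and the row labels are assumptions introduced here for illustration.

```latex
\[
\begin{aligned}
\text{General reading of (1):}\quad & \exists x\, Fx \;\wedge\; \exists y\, Gy \\
\text{General reading of (2):}\quad & \exists x\, (Fx \wedge Gx) \\
\text{Singular, same individual:}\quad & \mathrm{hear}[Fo] \ \text{and}\ \mathrm{see}[Go] \\
\text{Singular, swapped twin:}\quad & \mathrm{hear}[Fo] \ \text{and}\ \mathrm{see}[Gp], \qquad o \neq p
\end{aligned}
\]
```

On this rendering, the second line entails the first but not conversely, which is why the contrast is clear for general contents. If (*) holds, the third and fourth lines are introspectively indistinguishable to the subject, so the pair of modality-specific singular contents cannot by itself mark the difference between bound and unbound cases.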

Some singular content theorist might reject (*) and try to capture the difference with differing modality-specific singular contents—for instance, by saying intermodal binding is perceptually apparent just in case singular contents overlap. This is a bad idea. First, it requires accepting that any difference in visual singular content is introspectively discernible by a subject. Suppose o and p are distinct but perceptually indistinguishable to a subject. And suppose that, controlling for other differences, hearing that o is F while seeing that o is G must be introspectively distinguishable by the subject from hearing that o is F while seeing that p is G. Since hearing that o is F is introspectively indistinguishable from hearing that o is F, seeing that o is G must be introspectively distinguishable from seeing that p is G. Second, it leaves no coherent way to explain illusions of identity and merely apparent distinctness in terms of modality-specific singular contents. What is the singular content of an episode of illusory intermodal binding awareness? It must be hearing that o is F and seeing that o is G, if apparent binding requires overlapping singular contents. But, since the appearance is illusory, the singular contents cannot overlap, contrary to the proposal. Capturing the contrast and the illusions therefore requires something further, such as the perceptual content that o is F and G (or that o is p). However, perceptual contents probably are not closed under conjunction. This is especially plausible for the singular content theorist who accepts (*). Sharing a constituent of singular content (seeing and hearing the same individual) does not guarantee that a subject is able to recognize that what is seen and heard is the same individual. And it does not guarantee the subject perceives features to be coinstantiated or to be bound intermodally. 
Indeed, you can see and hear the same thing without its being perceptually apparent that something has both visible and audible features.19 And, even if perceptual contents were closed under conjunction within a modality, different perceptual modalities are more plausibly viewed as distinct ways of entertaining contents, so it is far less plausible that contents from different modalities are closed under conjunction. So, hearing that o is F while seeing that o is G does not guarantee perceiving that o is F and G. Perceiving that o is F and G requires a contentful perceptual episode that differs from just hearing that o is F while co-consciously seeing that o is G. Given the failure of conjunctive closure for perceptual contents from different modalities, an episode of perceiving that o is F and G need not be factorable without remainder into modality-specific contentful perceptual episodes that could occur independently from each other. And it may have

phenomenal features beyond those of a wholly auditory or a wholly visual experience under equivalent stimulation. Therefore, even if contents are singular, intermodal binding awareness shows that not all perceptual experience is modality specific.20

6.2 Binding and Infusion

O’Dea (2008) says that features perceived through one sense can appear bound in a manner that features perceived through different senses cannot. Intramodally bound features may appear to qualify or to be bound to each other, rather than just appearing to belong to a common object.

For example, to describe a visual experience of a red square as simply an experience of an object as red and as square is to miss out something crucial, namely that it is the redness that we are aware of that we are experiencing as square-shaped. It is not the case that we see an object which is square and which is red—it is the squareness which is red and the redness which is square. (O’Dea 2008, 302)

O’Dea describes the redness as infusing the squareness. If perceptually apparent feature binding requires that one feature appears to infuse another, and if one feature may appear to infuse only another feature perceived with the same modality, then there is no intermodal feature binding awareness. O’Dea does not offer this argument. He rejects that features may appear infused intermodally with other features, but he allows that features may appear intermodally to belong to a common object.21 Thus, according to O’Dea, intermodal binding awareness does not require infusion. And I agree. Feature binding awareness requires only that features appear jointly to be features of some common item. Perceptibly bound features may even include distinct parts that perceptually appear to belong to a common whole. This reply is not ad hoc, since (as O’Dea also allows) even perceptually apparent intramodal feature binding does not require infusion. Features such as speckledness, hen-shapedness, and being wattled do not all appear to infuse each other even though all could appear to qualify one body. And, as O’Dea suggests, apparent infusion may be asymmetric. Moreover, the criteria for infusion are obscure. Why think perceptibly bound features never appear infused intermodally? The booming sound of the explosion might seem to infuse its bright flash. Why can’t the coolness seem to infuse the blueness of the sphere? The voice you hear might seem to infuse the visible mouth movements and articulatory gestures of the speaker—qualities of sound and visible motion thus may seem bound up

Intermodal Binding Awareness


with each other. The McGurk effect demonstrates that the apparent qualities of one regularly do depend on the other. And, even if I ceased to see it, it is difficult for me to imagine my perceptible interlocutor’s vocal activity “losing all of its visible properties without affecting its audible qualities,” which is one criterion for infusion that O’Dea mentions (305). Nevertheless, another observation may block intermodal infusion. Infusion involves a dependence of your awareness of features of one type upon your awareness of features of another type. Apparent infusion thus may require that, for each pair of apparently infused features (for instance, color and shape, or timbre and duration), if you ceased entirely to perceive any feature belonging to one of those types, you would cease to perceive any feature belonging to the other type. So, ceasing to see color would render an object’s shape invisible. Thus, what is difficult to imagine is losing visual awareness of something’s color while leaving intact visual awareness of its shape. Occasions of intermodal feature binding awareness involve perceiving thanks to more than one modality. You see an event’s brightness and hear an event’s loudness, even while perceiving something’s being jointly bright and loud. And perceiving with one modality dissociates from perceiving with another. So, even if you ceased to see, you could continue to hear an event’s loudness. More generally, it is not so difficult to imagine losing visual awareness of each of something’s visible properties while leaving intact auditory awareness of each of its audible qualities. And, if there is no feature such that ceasing entirely to perceive any feature of that type with one modality would render an object’s features imperceptible to another modality, then there is no apparent intermodal infusion. Notice that feature binding awareness does not require infusion, thus understood. 
Not even all intramodally bound features satisfy this criterion for infusion. For instance, the features belonging to a face or the parts that make up a typed letter each perceptibly appear bound but are not, according to this criterion, infused. So, this does not rule out intermodal binding awareness. It may, however, rule out intermodal instances of one particularly intimate variety of binding awareness that occurs intramodally.22 Infusion may be a distinctive variety of feature binding, and it deserves further attention. In particular, a clear explication of infusion and its differences from other common forms of feature binding would be valuable, as would a study of whether intermodal infusion is possible. But apparent infusion is too restrictive as a requirement on feature binding awareness, and feature binding awareness suffices to establish the conclusions of section 5.


C. O’Callaghan

6.3 Multimodal Perceptual Objects

I maintain that intermodal feature binding awareness requires shared objects. However, given the diversity of objects across modalities, someone might object that intermodal binding awareness is less common than I have made it seem. Consider this contrast: You can see and touch the baseball, and you can perceive its being jointly yellowed and rough. But you see the baseball and hear something else—the sound it makes when it hits the bat. Thus, one might argue that no common object perceptibly seems to bear both audible and visible features—the sound is loud; the ball is round and rough. So, in this case, no intermodal binding awareness occurs.

The objection succeeds only if the features do not perceptibly belong to something common. I maintain that in many such nonobvious cases we can admit common perceptible objects. My view is that perceptual objects in general are best understood as mereologically complex individuals that bear perceptible features. Perceiving something requires perceiving some of its parts or properties. However, it does not require perceiving all of them. In cases of intermodal feature binding, you may perceive the same mereologically complex individual while many of its parts and properties are perceptible to one but not both modalities.

Take the case of seeing and hearing. Physical objects such as baseballs and bats are visible. They participate in events such as collisions. Such events also are visible. When such events occur in a surrounding medium, they may involve sounds. And sounds are audible. But suppose the sound is a feature—a constituent part or a complex property—of such an event that occurs in a medium, rather than a wholly distinct individual. The sound is a feature of the collision that occurs in the medium, and the baseball and bat are participants in that collision. Thus, the audible sound and the visible rebounding of the ball from the bat each are perceptible features of the collision.
Events like the collision of the baseball with the bat in a surrounding medium are audible because they include sounds as features. (This does not imply that you hear the collision mediately by or in virtue of hearing the sound.) You may hear the sound, and you may hear the collision of which it is a part or property. But you need not hear all of their features. For instance, you need not hear the baseball or the bat as such; you certainly do not typically hear their colors or their facing surfaces as such. And you may see the baseball, the bat, and thus the collision, but not their hidden parts or their sound. So, you can see and hear the collision of the ball with the bat thanks to its visible and audible features. When you consciously


perceive its jointly having both visible and audible features at once, that is a case of intermodal binding awareness. In order to determine the reach of intermodal binding awareness, this strategy must be assessed case by case.

7 Conclusions

My main claim is that there is intermodal feature binding awareness. Features—properties or parts—are consciously perceived to be coinstantiated or to belong to the same thing—to be bound—intermodally. The argument for this claim is that evidence from immediate perceptual belief, perception-guided action, experimental research, perceptual illusions, and perceptual phenomenology converges to support contrasting intermodal episodes of (1) and (2). The important consequence is that not all perceptual awareness is modality specific. Some multimodal perceptual episodes require the kind of coordinated sensitivity that enables identifying individuals across modalities. Some multimodal perceptual experiences are not just co-conscious episodes of seeing, hearing, touching, tasting, and smelling that could have occurred independently from each other. A closely related consequence is that not all perceptual phenomenal character is modality specific. The phenomenal character of a multimodal perceptual experience need not be exhausted by that which is associated with each of the modalities plus that which accrues thanks to mere co-consciousness.23 The significant upshot is that limiting inquiry to individual modalities of sensory perception and bare co-consciousness leaves out something critical. It leaves out richly multimodal forms of perceptual awareness, such as intermodal binding awareness. Therefore, no complete account of perceptual awareness or its phenomenal character can be formulated in modality-specific terms. Perceiving involves more than just co-consciously seeing, hearing, feeling, tasting, and smelling.
Acknowledgments

For valuable questions, conversations, and correspondence that helped me to improve this chapter, many thanks to Mike Barkasi, Tim Bayne, David Bennett, Justin Broackes, David Chalmers, Kevin Connolly, Ophelia Deroy, James Genone, Richard Grandy, Fiona Macpherson, Michael Martin, Mohan Matthen, Jesse Prinz, Barry C. Smith, Jeff Speaks, Charles Spence, Dustin Stokes, and audience members at Washington University in St. Louis, University of Geneva, and the 2013 Central APA symposium on multimodal perception.

Notes

1. For instructive discussion of experiential awareness, see Hill (2009, esp. chap. 3). For a differing perspective on sensory awareness, see Johnston (2006).
2. See, e.g., Treisman and Gelade (1980) and Treisman (1988, 1996, 2003).
3. See, e.g., Treisman (1996, 2003).
4. See, e.g., Treisman (1982, esp. 212–213), Treisman (1988, esp. 204), Treisman (2003, esp. 109–110), and Mitroff et al. (2005).
5. See, e.g., Fulkerson (2011).
6. See, e.g., Batty (2011).
7. See, in particular, Clark (2000), who maintains that different features are attributed to individual locations. See also Matthen (2005, 277–292), who holds that vision and audition but not olfaction involve perceptually attributing multiple features to objects (though not to commonplace material objects in the case of audition).
8. This should be distinguished from the differing skeptical position that there is no feature binding awareness at all, which sec. 2 set aside.
9. Connolly (2014) is a more complicated case. He endorses elements of skepticism. At the outset, he seems to reject intermodal binding awareness.

Are some of the contents of perception fused multimodal units (fused audio-visual units, for instance)? I think that the answer is no. … We need not hold that the content of Q1 involves a fused audio-visual property, since we can explain that phenomenal type in terms of an auditory and a visual property. (Connolly, 2014, 354)

And he says multimodal episodes can be explained in terms of “a conjunction of an audio content and visual content” (362) but do not involve “fused audio-visual content” (354–355). However, he later states that perceptual experiences may have additional amodal contents involving individuals, objects, or events, characterized in modality-independent terms, of which modality-specific features may be predicated. He does not say outright whether such perceptual episodes involve mere association or intermodal binding awareness. While Connolly’s account raises questions beyond my chapter’s scope, it means Connolly need not endorse thoroughgoing skepticism about intermodal binding awareness. Connolly, in correspondence, has suggested a preference for an account in terms of association rather than binding awareness.


10. See also, e.g., Bushara et al. (2003), Bertelson and de Gelder (2004), Spence and Driver (2004), Spence (2007), and Stein (2012).
11. See O’Callaghan (2012) for a catalog and review of crossmodal illusions and recalibrations.
12. On multisensory integration, see, e.g., Stein and Stanford (2008), Stein et al. (2010), and Stein (2012). On intermodal illusory conjunctions, see Cinel, Humphreys, and Poli (2002). On intermodal object-specific preview benefits and penalties, object files and event files, see Zmigrod, Spapé, and Hommel (2009) and Jordan, Clark, and Mitroff (2010). The remainder of this note describes a selection of these results in additional detail.

Stein et al. (2010) characterizes multisensory integration as “the neural process by which unisensory signals are combined to form a new product” (1719). For instance, superadditive effects occur when the multisensory neural or behavioral response to a stimulus is significantly greater than the sum of the modality-specific responses to that stimulus. Such effects are evidence that perceptual processes do not merely reconcile conflicts. Instead, multisensory processes sometimes integrate information concerning a common source and generate a novel response to it as such.

A traditional source of support for intramodal feature binding is the existence of illusory conjunctions (ICs) of features, especially outside focal attention (see, e.g., Treisman & Schmidt, 1982). Unattended perceptible features may mistakenly appear coinstantiated. For instance, an unattended red square and green circle may mistakenly cause the perceptual impression of a red circle. An unattended “O” and “R” may mistakenly cause the perceptual impression of a “Q.” Cinel, Humphreys, and Poli (2002) present experimental evidence supporting crossmodal illusory conjunctions between vision and touch. For instance, an unattended felt texture may be perceptually ascribed to the wrong visible shape.
The authors say, “These results demonstrate that ICs are possible not only within the visual modality but also between two different modalities: vision and touch” (1245). They argue based upon a series of studies that illusory conjunctions of visible and tactual features are “perceptual in nature” rather than effects of memory or extraperceptual cognition (1261). Taken together, the evidence is consistent with the idea that information converges preattentively for binding from different sensory modalities and that this binding process is modulated by the parietal lobe. … The present evidence for cross-modal ICs suggests that there is multimodal integration of sensory information in perception so that misattributions of modalities arise under conditions of inattention. (Cinel, Humphreys, & Poli, 2002, 1244, 1261)

The existence of intermodal illusory conjunctions therefore supports intermodal feature binding in perceptual processes. Another critical diagnostic for intramodal feature binding stems from object-specific preview effects (see Kahneman, Treisman, & Gibbs, 1992). Kahneman et al. (1992, see esp. 176) propose that visual object perception involves deploying object files, which are temporary episodic representations of persisting real-world objects. Object


files integrate information about distinct perceptible features. Previewing a target affects one’s capacity to recognize it again when its two appearances are “linked” perceptually to the same object (reviewing). If an object’s features match at two times, reviewing it enhances recognition; if its features do not match, reviewing it hampers recognition. Object-specific preview effects are used to determine whether or not feature binding occurs. A preview benefit requires that matching feature combinations are ascribed to a common object; no object-specific preview benefit accrues for features not initially attributed to the reviewed object. And a preview penalty requires mismatching feature combinations ascribed to a common object. Zmigrod, Spapé, and Hommel (2009, 675) say, “Interactions between stimulus-feature-repetition effects are indicative of the spontaneous binding of features and thus can serve as a measure of integration.” Object-specific preview benefits and penalties occur intermodally. Zmigrod, Spapé, and Hommel (2009, 674–675) report that patterns of interaction that characterize unimodal feature binding occur intermodally between audition and vision, and between audition and touch. They argue, for instance, that color—pitch pairs may be bound, since presenting color1 with pitch1 followed by color1 with pitch2 impairs recognition in a way that differs from what is predicted by modality-specific object files and binding alone. The authors report that perceptual processes involve “episodic multimodal representations” rather than mere intermodal interactions (682) and that feature binding occurs across modalities (683). In addition, Jordan, Clark, and Mitroff (2010) report “a standard, robust OSPB” between vision and audition. Although object files are typically discussed as visual, here we demonstrate that object-file correspondence can be computed across sensory modalities. 
An object file can be initially formed with visual input and later accessed with corresponding auditory information, suggesting that object files may be able to operate at a multimodal level of perceptual processing. (491)

The authors report that their data “explicitly demonstrate object files can operate across visual and auditory modalities” (501).
13. Treisman (2003, esp. 109–111) is emblematic.
14. It is noteworthy that Austin mentions ventriloquism as an example of illusion. “Then again there are the illusions produced by professional ‘illusionists,’ conjurors—for instance the Headless Woman on the stage, who is made to look headless, or the ventriloquist’s dummy which is made to appear to be talking” (Austin, 1962, 22–23).
15. The phenomenological difference between the jarring third case and the far-off fourth case also may involve a contrast between apparent episodes of (2) and of (1). Perhaps what makes it jarring is the sense that the misaligned features belong to something common and thus should be aligned. But perhaps the jarring third case is a better candidate for seeming merely to be associated, in respect of which it differs from the slightly misaligned but not jarring second case, which involves apparent intermodal binding awareness.
16. See, e.g., Grice (1962, esp. 267), Peacocke (1983, esp. 27–28), and Lopes (2000, esp. 439).
17. It is false that for every perceptual phenomenal feature f there exists a unique perceptual modality m such that every possible perceptual experience that instantiates f belongs to modality m.
18. Thanks to Jeff Speaks for pressing me to address this line of objection.
19. Thus, singular content theorists do have one important advantage over general content theorists. Singular contents allow for overlapping, modality-specific contents without perceptually apparent binding, as when a subject fails to perceptually appreciate the overlap.
20. For helpful discussion of co-conscious phenomenal unity, content, and closure, see Bayne (2010, chap. 3).
21. O’Dea’s thesis is that Tye’s (2003) account of the co-conscious unity of perceptual experience cannot accommodate the difference between infusion and binding.
22. One might be tempted to think that apparent infusion involves only integral, in contrast to separable, feature dimensions (in a sense stemming from Garner, 1970, 1974; see also Treisman & Gelade, 1980; Treisman, 1986, esp. 35–37). Hue, brightness, and saturation are examples of integral dimensions. This would help explain why awareness of features of one type requires awareness of features of another type. And perhaps intermodally there are no integral, as opposed to separable, feature dimensions. However, O’Dea’s examples involve paradigm separable features, such as color and shape. More importantly, paradigm cases of feature binding involve separable rather than integral features.
23. This type of argument need not be limited to apparent feature binding. There may be intermodally perceptible relations, such as motion, synchrony, rhythm, and causality. See O’Callaghan (2014).

References Austin, J. L. (1962). Sense and sensibilia. Oxford: Oxford University Press. Batty, C. (2011). Smelling lessons. Philosophical Studies, 153, 161–174. Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press. Bedford, F. (2001). Towards a general law of numerical/object identity. Cahiers de Psychologie Cognitive/Current Psychology of Cognition, 20, 113–175.


Bertelson, P., & de Gelder, B. (2004). The psychology of multimodal perception. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 141–177). Oxford: Oxford University Press. Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: MIT Press. Bushara, K. O., Hanakawa, T., Immisch, I., Toma, K., Kansaku, K., & Hallett, M. (2003). Neural correlates of cross-modal binding. Nature Neuroscience, 6(2), 190–195. Cinel, C., Humphreys, G. W., & Poli, R. (2002). Cross-modal illusory conjunctions between vision and touch. Journal of Experimental Psychology: Human Perception and Performance, 28(5), 1243. Clark, A. (2000). A theory of sentience. New York: Oxford University Press. Connolly, K. (2014). Making sense of multiple senses. In R. Brown (Ed.), Studies in brain and mind (Vol. 6): Consciousness inside and out: Phenomenology, neuroscience, and the nature of experience (ch. 24). Berlin: Springer. Fulkerson, M. (2011). The unity of haptic touch. Philosophical Psychology, 24(4), 493–516. Garner, W. R. (1970). The stimulus in information processing. American Psychologist, 25(4), 350–358. Garner, W. R. (1974). The processing of information and structure. Hillsdale, NJ: Erlbaum. Grice, H. P. (1962). Some remarks about the senses. In R. J. Butler (Ed.), Analytical philosophy, Series 1. Oxford: Blackwell. Hamilton, R. H., Shenton, J. T., & Coslett, H. B. (2006). An acquired deficit of audiovisual speech processing. Brain and Language, 98(1), 66–73. Hill, C. S. (2009). Consciousness. Cambridge: Cambridge University Press. Johnston, M. (2006). Better than mere knowledge? The function of sensory awareness. In T. S. Gendler & J. Hawthorne (Eds.), Perceptual experience (pp. 260–290). Oxford: Clarendon Press. Jordan, K. E., Clark, K., & Mitroff, S. R. (2010). See an object, hear an object file: Object correspondence transcends sensory modality. Visual Cognition, 18(4), 492–503. 
Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24(2), 175–219. Kubovy, M., & Schutz, M. (2010). Audio-visual objects. Review of Philosophy and Psychology, 1(1), 41–61.


Lopes, D. M. M. (2000). What is it like to see with your ears? The representational theory of mind. Philosophy and Phenomenological Research, 60(2), 439–453. Matthen, M. (2005). Seeing, doing, and knowing: A philosophical theory of sense perception. Oxford: Oxford University Press. Mitroff, S. R., Scholl, B. J., & Wynn, K. (2005). The relationship between object files and conscious perception. Cognition, 96(1), 67–92. Navarra, J., Yeung, H. H., Werker, J. F., & Soto-Faraco, S. (2012). Multisensory interactions in speech perception. In B. E. Stein (Ed.), The new handbook of multisensory processing (pp. 435–452). Cambridge, MA: MIT Press. O’Callaghan, C. (2012). Perception and multimodality. In E. Margolis, R. Samuels, & S. Stich (Eds.), Oxford handbook of philosophy of cognitive science (pp. 92–117). Oxford: Oxford University Press. O’Callaghan, C. (2014). Not all perceptual experience is modality specific. In D. Stokes, M. Matthen, & S. Biggs (Eds.), Perception and its modalities. New York: Oxford University Press. O’Dea, J. (2008). Transparency and the unity of experience. In E. Wright (Ed.), The case for qualia (pp. 299–308). Cambridge, MA: MIT Press. Pasalar, S., Ro, T., & Beauchamp, M. S. (2010). TMS of posterior parietal cortex disrupts visual tactile multisensory integration. European Journal of Neuroscience, 31(10), 1783–1790. Peacocke, C. (1983). Sense and content. Oxford: Oxford University Press. Pourtois, G., de Gelder, B., Vroomen, J., Rossion, B., & Crommelinck, M. (2000). The time-course of intermodal binding between seeing and hearing affective information. Neuroreport, 11(6), 1329–1333. Radeau, M., & Bertelson, P. (1977). Adaptation to auditory-visual discordance and ventriloquism in semirealistic situations. Perception and Psychophysics, 22, 137–146. Spence, C. (2007). Audiovisual multisensory integration. Acoustical Science and Technology, 28(2), 61–70. Spence, C., & Bayne, T. (2014). Is consciousness multisensory? In D. Stokes, M. Matthen, & S. 
Biggs (Eds.), Perception and its modalities. New York: Oxford University Press. Spence, C., & Driver, J. (Eds.). (2004). Crossmodal space and crossmodal attention. Oxford: Oxford University Press. Stein, B. E. (2012). The new handbook of multisensory processing. Cambridge, MA: MIT Press.


Stein, B. E., Burr, D., Constantinidis, C., Laurienti, P. J., Alex Meredith, M., Perrault, T. J., et al. (2010). Semantic confusion regarding the development of multisensory integration: A practical solution. European Journal of Neuroscience, 31(10), 1713–1720. Stein, B. E., & Stanford, T. R. (2008). Multisensory integration: Current issues from the perspective of the single neuron. Nature Reviews: Neuroscience, 9(4), 255–266. Treisman, A. (1982). Perceptual grouping and attention in visual search for features and for objects. Journal of Experimental Psychology: Human Perception and Performance, 8(2), 194–214. Treisman, A. (1986). Properties, parts, and objects. In K. Boff, L. Kaufman, & J. Thomas (Eds.), Handbook of perception and human performance (Vol. 2, pp. 1–70). New York: Wiley. Treisman, A. (1988). Features and objects. Quarterly Journal of Experimental Psychology, 40A(2), 201–237. Treisman, A. (1996). The binding problem. Current Opinion in Neurobiology, 6(2), 171–178. Treisman, A. (2003). Consciousness and perceptual binding. In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation (pp. 95–113). Oxford: Oxford University Press. Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12(1), 97–136. Treisman, A. M, & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107–141. Tye, M. (2003). Consciousness and persons: Unity and identity. Cambridge, MA: MIT Press. Tye, M. (2007). The problem of common sensibles. Erkenntnis, 66, 287–303. Vatakis, A., & Spence, C. (2007). Crossmodal binding: Evaluating the “unity assumption” using audiovisual speech stimuli. Perception and Psychophysics, 69(5), 744–756. Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88(3), 638–667. Welch, R. B. (1999). 
Meaning, attention, and the “unity assumption” in the intersensory bias of spatial and temporal perceptions. In G. Aschersleben, T. Bachmann,


& J. Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 371–387). Amsterdam: Elsevier. Zmigrod, S., & Hommel, B. (2011). The relationship between feature binding and consciousness: Evidence from asynchronous multi-modal stimuli. Consciousness and Cognition, 20(3), 586–593. Zmigrod, S., Spapé, M., & Hommel, B. (2009). Intermodal event files: Integrating features across vision, audition, taction, and action. Psychological Research, 73(5), 674–684.

5 The Unity Assumption and the Many Unities of Consciousness Ophelia Deroy

1 Introduction

The unity of consciousness is taken to reflect a special fact about the holistic or seamless aspect of our conscious experience. As I hear the whistle of the kettle while I am writing this line, I am conscious of the kettle’s whistle, but I am also conscious of the screen in front of me. These two elements are unified in my conscious experience, even though the visual and auditory features are not experienced as belonging to one and the same object. They are experienced as copresent, that is, as being experienced at the same time, by the same conscious subject. This is known as phenomenal unity.

As I turn my head toward the stove, I am still conscious of the sounds, but I am also conscious of the colored shape of the steaming kettle. However, in this case, I am not conscious of a whistling sound, and of the look of the kettle as copresent with the sound. I experience these auditory and visual features as belonging to one and the same object: the whistling, gray kettle. I am not even left wondering whether the two are copresent or about the same object: object or event unity occurs spontaneously in consciousness. It means here that various features are experienced as being about one and the same object or to constitute one and the same event.1

Object or event unity (which will be shortened as object unity in this chapter) is usually considered a different kind of unity from phenomenal unity. Phenomenal unity is, after all, independent of object unity in the sense that it can occur when object unity does not. I can be conscious of the kettle’s whistle as being a distinct event from the change seen on my screen, and yet my experiences of the sound and the changes on the screen are phenomenally unified. However, questions arise as to how object unity relates to phenomenal unity. What explains the difference between phenomenal and object unity? How, in other words, should we explain that I am conscious of the screen and the whistling sounds as merely


phenomenally unified in one case, and of the whistling kettle as a single object in the second? The question is here addressed with features belonging to different senses. We can easily contrast the experience of merely copresent sounds and colored shapes and of unified objects emitting sounds and having colored shapes. It is harder to imagine, or to conceive, merely copresent experiences of shapes and colors to contrast with the experience of unified colored shapes in a unisensory case. This means that the claims defended and discussed here are restricted to multisensory cases. They might not apply to object unity within a single sense. In the crossmodal case, object unity seems intuitively to be what happens to phenomenal unity when the experienced features, instead of being only experienced as copresent and referred to different places, are referred to a single object, and—minimally—to the same locus in space. Object unity seems then to require that the experienced features are bound together and their location in space experienced as common. This is where the problem of the unity of consciousness meets cognitive neuroscience. Scientists have, for decades now, been studying the rules of crossmodal binding, that is the cognitive and neurophysiological operations that bind these cues together into a single object or event. More specifically, they have unveiled rules of multisensory integration whereby signals from different senses are combined and provide an estimate of a single property—such as an object’s location. Put this way, there seems to be a perfect alignment between the occurrence of object unity in consciousness and the occurrence of a certain cognitive or neurophysiological operation of binding—noticeably via the multisensory integration of information provided by the different senses. It is, then, natural to think of object unity as the result or manifestation of crossmodal binding. 
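The idea that signals from different senses are combined into an estimate of a single property can be made concrete with a small sketch. The chapter does not commit to any particular combination rule at this point; the inverse-variance (reliability-weighted) average below is only one common modeling assumption, and the function name `combine_cues` and the numbers are hypothetical, chosen to echo the kettle example.

```python
def combine_cues(est_a, var_a, est_b, var_b):
    """Reliability-weighted average of two estimates of one property.

    Assumption (not taken from the chapter): each sense delivers a noisy
    estimate with a known variance, and the combined estimate weights
    each sense by its inverse variance.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    precision_a = 1.0 / var_a
    precision_b = 1.0 / var_b
    w_a = precision_a / (precision_a + precision_b)
    w_b = 1.0 - w_a
    combined = w_a * est_a + w_b * est_b
    # The combined variance is never larger than either single-cue variance.
    combined_var = 1.0 / (precision_a + precision_b)
    return combined, combined_var


# Hypothetical numbers: vision places the kettle at 10 degrees (variance 1),
# audition places it at 14 degrees (variance 4).
location, variance = combine_cues(10.0, 1.0, 14.0, 4.0)
print(location, variance)
```

On these assumptions the combined location lies closer to the more reliable visual estimate. Note that the sketch describes only a processing-level operation; it says nothing about whether the result is experienced as a single unified object, which is the question at issue in what follows.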
The articulation between phenomenal unity and object unity receives its natural explanation with what I call the Switch thesis. The Switch thesis can be summarized in the following way:

Switch thesis
(i) Phenomenal switch: There is a switch in experience when two features across two senses are experienced as merely phenomenally copresent versus as belonging to the same object.
(ii) Processing switch: There is a switch in processing that corresponds to the occurrence or nonoccurrence of binding—the cognitive or neurophysiological operation whereby sensory features are bound together as belonging to the same object.
(iii) Phenomenal-processing alignment: The phenomenal switch to object unity corresponds to the occurrence of a cognitive or neurophysiological switch.

Although it is widely assumed, I want to argue that the Switch thesis is false, or at least weak. In a previous paper, Spence and Bayne (forthcoming) have, in their own way, already questioned the Switch thesis by doubting that crossmodal binding leads to the unified experience of multisensory objects (what they call a multisensory experience). Their strategy consists in showing that the results on crossmodal binding collected in cognitive neuroscience are mostly about behavior and independent from multisensory experience and the occurrence of object-unity in consciousness. Ultimately, their line is that results of the available cognitive neuroscience research are compatible with the idea that there is no such unified experience of objects across the senses (see also Deroy, Chen, & Spence, forthcoming, for discussion). If the skeptical resistance to a quick phenomenal-processing alignment is fair, pushing it to its extreme conclusion requires radical revisions in terms of phenomenology. From the idea that we do not necessarily experience unified objects because our brains show multisensory integration or enhanced responses, Spence and Bayne move to the suggestion that we might never have experiences of unified multisensory objects. Object unity, in their extreme view, might simply be what we think or imagine we experience, and not the reality of a conscious experience. Instead of unity, then, we should talk of an impression of unity. This impression of unity is rather a confused impression of continuity coming from a rapid switch of attention between unimodal experiences. Both object and phenomenal unity between experiences are then denied.
Their view in this sense could count as neo-Humean as Hume was famously resistant to the idea that there would be more than isolated impressions.2 The neo-Humean line can be summarized as follows: Neo-Humeanism (i) Processing switch is maintained: There is a switch in processing corresponding to the occurrence or nonoccurrence of binding—the cognitive or neurophysiological operation whereby sensory features are bound together as belonging to the same object. (ii) Against phenomenal-processing alignment: Nothing demonstrates that crossmodal binding is manifested as the experience of multisensory unified objects. As a result, the previous claim of a phenomenal switch is questioned. The last step is accomplished by finding a substitute for the claim that we experience bound objects, that is:

O. Deroy

(iii) Mere attentional switch: What might be occurring when features are bound is merely a rapid switching of one’s attention between distinct experiences.3 The argumentative strategy behind neo-Humeanism is to start with scientific evidence and to disconnect it from phenomenological claims. From there, it questions the evidence of a switch to experiencing unified objects across the senses and suggests another interpretation of our conscious experience—that is, a mere attentional switch between distinct unisensory experiences.4 I am very sympathetic to the idea that, in the quest for better theories of perception, scientific evidence prevails over our intuitions about what our experience is like (or must be like). If something needs to be revised, it should be our commonsense or intuitive ideas: one cannot be a revisionist about science and bend it until it preserves our commonsense or intuition. However, I think that scientific evidence points toward another option besides the fundamental revision of our intuitive phenomenology. Notably, we need not radically give up on the idea that we can experience unified objects across the senses—nor that we can have phenomenally unified experiences. We still need then to explain how these two kinds of unity exist and how they articulate. What we need to abandon, as I will argue, is the idea of a switch between the two. The argument goes as follows: Section 2 starts by disentangling the concepts of object unity, crossmodal binding, and multisensory integration, and insists that they account for unities that are not operating at the same level. Section 3 argues that the phenomenal switch claim should be replaced by a phenomenal gradation claim. 
Although there is something it is like to experience sensory cues as distinctively belonging to the same object, and something it is like to experience them as belonging to different objects, there is no neat boundary between the two—but rather a continuum of more or less unified experiences. Scientific protocols are here particularly useful in revealing why we should not simply think in terms of a switch from perfect object unity to mere phenomenal unity. In section 4, I argue for a revision of the claims regarding binding, mainly the idea that it should be conceived as an on/off process whereby binding occurs or it does not. In the crossmodal case, this claim is based on a simplification that only considers the role of temporal and spatial congruence in multisensory integration. I show that there are actually two different dimensions that modulate the occurrence of binding: one concerns spatiotemporal congruence and the other concerns qualitative congruence

or crossmodal correspondence. There are therefore reasons to consider that crossmodal binding also comes with modulations and various degrees of strength. In section 5, as a conclusion, I show that the ideas of a phenomenal gradation and the modulation of binding might be a way to preserve a form of phenomenal-processing alignment, like the Switch thesis. However, and echoing Spence and Bayne’s (forthcoming) legitimate doubt about this alignment, more evidence is needed to establish whether and how our experience of more or less unified objects or sets corresponds to the modulations of binding. 2 Distinguishing Object Unity, Crossmodal Binding, and Multisensory Integration It is important to note that object unity, crossmodal binding, and multisensory integration stand at very different levels. Object unity concerns the level of phenomenology. Crossmodal binding is the underlying cognitive operation that is taken to result in a common object attribution or the representation of various features as having a common bearer or source. Multisensory integration, on the other hand, is a neurological claim about how the brain treats different kinds of sensory information. Let’s start with this latter fact—that is, that signals coming from different senses converge in the human brain and are processed in an integrated manner. A criterion (not to be confused with a definition) through which multisensory integration is evidenced comes from the fact that, at the neuronal level, multisensory stimulation can lead to a response that “is significantly different (e.g., larger, smaller) from the best component response” obtained during unisensory stimulation (Stein et al., 2010, 1717; see also Holmes & Spence, 2005). The effect also surfaces at the level of behavior—with the accuracy and speed of the performance in a task performed in a multisensory setting being different from the baseline obtained in a unisensory setting. 
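The neuronal criterion quoted above can be made concrete with a small sketch. The snippet below computes the percentage difference between a multisensory response and the best unisensory response, a descriptive index often used for this kind of comparison. The function name and the response values are hypothetical illustrations, and the sketch does not address whether a difference is statistically significant, which the quoted criterion also requires.

```python
def enhancement_index(multisensory, unisensory_a, unisensory_b):
    """Percent difference between a multisensory response and the best
    (largest) of the two unisensory responses.

    All arguments are response magnitudes, for example mean spike counts.
    Positive values indicate enhancement, negative values depression.
    """
    best = max(unisensory_a, unisensory_b)
    if best <= 0:
        raise ValueError("best unisensory response must be positive")
    return 100.0 * (multisensory - best) / best


# Hypothetical values: a combined response larger than either component,
# and a combined response smaller than the best component.
enhanced = enhancement_index(15, 10, 6)
depressed = enhancement_index(8, 10, 6)
print(enhanced, depressed)
```

Both a positive and a negative index count as "different from the best component response" in the sense of the criterion, which is why the definition speaks of a difference and not only of an increase.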
Take, for instance, the time needed for one to detect where a visual target appears on a computer screen. Although the task seems to be only about detecting a visual target, the performance is influenced by more than merely visual information: the speed can be different in a purely visual setting and in a multisensory setting where sounds are played whenever the visual target appears.5 The occurrence of multisensory integration is not, then, saying anything about object unity in experience. Whether participants experience a unified

object or event is here not necessary to the definition of multisensory integration. It is true that, when pairs of auditory and visual stimuli are presented in temporal synchrony (as opposed to asynchronously), this is likely to give rise to reports saying that auditory and visual features were experienced as belonging to a single object. It is also true that participants in cognitive neuroscience studies of multisensory integration are sometimes asked whether they perceive one or two events. However, they are more typically presented with pairs of auditory and visual stimuli and have to perform a certain task on only one of the stimuli. For instance, in the many studies that have utilized the speeded classification task,6 they might have to respond to the (target) stimuli presented in one sensory modality while ignoring the (distractor) stimulus presented in the other modality. In studies of the redundant target effect,7 they may have to respond to any stimulus that happens to be presented. These phenomenological and behavioral accompaniments of multisensory integration should not make one forget that they are not constitutive of multisensory integration, which is defined at the level of neurological response. As a reminder of the fact that unified conscious experience is not part of the definition of multisensory integration, it is important to stress that it was first studied in anaesthetized animals presented with auditory and visual cues in temporal synchrony, independently of evidence or possibilities of unified object experiences.8 Multisensory integration, on the other hand, can certainly be a mechanism underlying crossmodal binding—at least in the sense that, once integrated, the information is de facto bound and no longer neatly distributable into different component information. But here also care is needed. 
First, questions remain about the development of crossmodal binding and multisensory integration, which should prevent us from claiming that they always go together. These, though, are topics for another day.9 Second, it is important to underscore that the occurrence of multisensory integration is only one of the ways in which crossmodal binding can take place. Before the discovery of multisensory integration, crossmodal binding (which was called synthesis and was closely related to conscious object unity) was considered to occur through a form of cognitive or categorical linkage—for instance, by various sensory features being subsumed under the same object concept. Even now, cognitive factors—such as the prior assumption that various features are likely to co-occur or belong to the same object—are considered important to explain crossmodal binding. Equipped with these distinctions (see table 5.1), let us now turn to the Switch thesis, according to which the occurrence of crossmodal binding

Table 5.1 Differences between object unity, crossmodal binding, and multisensory integration.

                                      Definition                                            Bears on
Multisensory object unity             Multimodal experience of a single object              Experienced features
Crossmodal or multisensory binding    Process of attribution to a common object or source   Represented features
Multisensory integration              Joint neurological process                            Signals coming from different receptors

corresponds to a switch in our experience from mere copresence to the experience of a unified object. 3 Revising the Phenomenal Switch The first point to discuss is the widely assumed idea that there is a switch in experience when two features are experienced as merely phenomenally copresent, for instance as a screen and a whistling sound, or as belonging to the same object (a whistling, gray kettle). This neat divide has an intuitive appeal. Its strength can be traced back to the contrastive way in which claims about the unity of consciousness are often set up. The phenomenal unity that is supposed to hold between distinct experiences is often contrasted with the experience of a disconnected set of experiences (which is hard, if not impossible for creatures like us with unified consciousness to even imagine10). In other words, the phenomenal unity of consciousness is then defined by setting up a contrast with what it is not—that is, a disunited experience. Take for instance, Brook and Raymont (2013): When one experiences a noise and, say, a pain, one is not conscious of the noise and then, separately, of the pain. One is conscious of the noise and pain together, as aspects of a single conscious experience. … This phenomenon has been called the unity of consciousness. More generally, it is consciousness not of A and, separately, of B and, separately, of C, but of A-and-B-and-C together, as the contents of a single conscious state.

Or take Bayne and Chalmers (2003): “What is important, on the unity thesis, is that this total state is not just a conjunction of conscious states.” The same contrastive definition is also used when it comes to object unity. The

experience of features as belonging to a unified object is contrasted with the experience of the same features as belonging to distinct objects. The introduction of the present chapter, for one, illustrates this point, when it reads: “I am not conscious of a whistling sound, and of the look of the kettle as copresent with the sound. I experience these auditory and visual features as belonging to one and the same object: the whistling, gray kettle.” Bayne and Chalmers in their 2003 definition of object unity also use a contrast to show what object unity is: I might see a car’s shape and hear its noise without anything in my conscious state tying the noise to the car (perhaps I perceive the noise as behind me, due to an odd environmental effect). If so, the experiences are not objectually unified.

A contrast, however, does not mean that the two elements are also totally separated, or that there is a switch from one to the other. The switch in phenomenology seems to find additional support in our everyday experience. We seem to go from mere phenomenal unity to object unity as simply as we turn our heads toward sources of sounds and back again. I experience the mere phenomenal unity of the screen and the whistling kettle, but the object unity of the sound and sight of the kettle is just at a neck’s twist. As I turn my head back and forth toward the screen, I am experiencing object unity and then a mere phenomenal unity of the auditory and visual experiences. However, our take on our phenomenology can obviously be misleading: can I really say when my two experiences suddenly bind or go back to mere phenomenal unity? Take the example of a dubbed movie. I experience the sounds and the visual movements of the actor’s lips on the screen as constituting one and the same speaking event. If the lip movement and sounds get badly out of synchrony, say by 200 milliseconds or more,11 I will suddenly experience the auditory event and the visual event as distinct. Object unity has suddenly been lost and phenomenal unity is all that I am left with. Not all cases are so dramatic though. Imagine what would happen if someone was gradually introducing, millisecond by millisecond, a time lag between the speech sounds and the lip movements. When the time lag reaches 200 milliseconds, you might experience the sounds and the visual movements as distinct. But probably, with a time lag of 199 milliseconds, the experience was already not perfectly object-unified. Around 200 milliseconds there is a moment at which I cannot help describing my experience as two distinct events. But this does not mean that this is the moment when my experience suddenly changes from being about a single event to mere phenomenal unity.
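The contrast between a sudden switch and a gradual loss of unity in the dubbed-movie case can be sketched numerically. The snippet is only an illustration under stated assumptions: the 200 millisecond figure is the illustrative value used in the text, while the logistic shape, the 40 millisecond slope, and the function names are hypothetical choices, not empirical estimates.

```python
import math

THRESHOLD_MS = 200.0  # illustrative lag taken from the text
SLOPE_MS = 40.0       # hypothetical: how gradually unity degrades


def switch_unity(lag_ms):
    """Switch picture: full object unity below the threshold, none at or above it."""
    return 1.0 if abs(lag_ms) < THRESHOLD_MS else 0.0


def graded_unity(lag_ms):
    """Graded picture: unity falls off smoothly as the lag grows.

    The logistic curve is an assumption made for illustration only.
    """
    return 1.0 / (1.0 + math.exp((abs(lag_ms) - THRESHOLD_MS) / SLOPE_MS))


for lag in (0, 100, 199, 200, 300):
    print(lag, switch_unity(lag), round(graded_unity(lag), 3))
```

On the switch picture, the one-millisecond step from 199 to 200 carries the whole change. On the graded picture, that step changes almost nothing, and the experience at 199 milliseconds is already far from perfectly unified.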

An even more convincing illustration comes from the experience of the rubber-hand illusion.12 When I see and feel my hand being stroked by a gentle brush, I experience a single, unified stroke event occurring on my hand. But if a spatial gap is introduced between the visual experience (the seeing of the hand being stroked) and the tactile experience (the feeling of the hand being stroked), for instance by presenting me visually with the sight of a rubber hand being stroked at the same rhythm as what I feel on my hand, hidden from sight, I might start to feel a drift of my hand toward the location of the seen hand. What happens can be described as a way of maintaining object/event unity against the spatial discrepancy now introduced.13 But this is not perfect object/event unity: I don’t have the experience now of the feeling of the stroke as neatly unified with the new visual location of the rubber hand. My experience is more of an in-between: felt and seen synchrony and other visual cues indicate that a single event is supposed to be at stake and therefore that the event is supposed to occur at the same location (i.e., my hand), but proprioceptive information and visual information are not perfectly congruent regarding this location. It is not as if there is a perfectly defined location in between the felt location and the seen location where the feeling and the image would be referred. Nobody has a clearly defined multisensory experience where their hand is visually, tactually, and proprioceptively present in between their real hand and the rubber one. Rather the experience is of an in-between, or of the felt location imprecisely shifting toward the rubber hand. Experimental cases like the rubber hand or desynchronized speech are the best places to stretch our experiences beyond the most frequent cases. They provide us with the means to revise our intuition about what experiences can or cannot be.
The literature in multisensory integration is noticeably full of other cases where a spatial and/or temporal discrepancy is introduced between the presentation of the two sensory stimuli to measure the scope of their integration or of crossmodal binding. The amount of spatiotemporal discrepancy that can be introduced and that will still lead to integration is known as the “window of integration.”14 In their 1980 seminal review on the resolution of intersensory discrepancy, Welch and Warren insisted that introducing a spatiotemporal distance between auditory and visual stimuli was not necessarily introducing a conscious conflict in experience. The term conflict is avoided in this review because it connotes an awareness of and an aversive response to the stimulus situation. Discrepancy and discordance are more

appropriate because they refer to the stimulus situation and presuppose on the part of the observer neither awareness nor a negative emotional reaction. (Welch & Warren, 1980, 638, n. 1)

Decades of experimentation on audio-visual integration have certainly confirmed that, within the window of integration, a spatiotemporal discrepancy does not lead to an awareness of distinct objects or features. But from there it is also wrong to infer that discrepant stimuli, when integrated, will lead to the same conscious experience as nondiscrepant stimuli. In the literature on multisensory integration, it is possible to find reports that discrepant sensory cues, still leading to an integrated response, are accompanied by a degraded or shattered object unity in consciousness. The slight time lag in dubbed movies is one example. Anyone who has experienced the McGurk effect15 can try it for himself or herself and see that when the auditory and visual stimuli start to fall too far out of sync, or when there is a male voice but a female-looking face, one senses the discord. Jackson (1953), who presented his participants with whistling sounds coming from different locations and asked them to determine which of seven visible kettles the sound was coming from, admitted, for instance, that the participants reported a mild feeling or impression of “intermodal conflict” when they received an auditory stimulus and a simultaneous visual cue separately, although they supposed and perceived that it was from one and the same object. Many of them maintained that they “must be imagining,” or that they had experienced a “detached feeling,” or even nausea—signaling perhaps a gradual breach in the seamless, unified experience of a single object even in the case of successful multisensory integration. To conclude on this point, experimental studies suggest that instances of successful binding are not always accompanied by the experience of a perfect object unity.
To go one step further, it seems as though the idea of a phenomenal switch, where object unity suddenly becomes mere phenomenal unity, should be revised for: (i′) Phenomenal gradation claim: There is a gradual change, from the experience of perfectly unified to less unified objects or sets of features (what researchers sometimes call “consistent with a unified object”) to the experience of distinct features. The phenomenal gradation claim, as I suggest calling it, constitutes a new way to resist the intuitive idea that crossmodal binding must lead to, or is necessarily accompanied by, the unified experience of a single object across the modalities. On the basis of similar experimental evidence, Spence

and Bayne (forthcoming) already object to this idea but conclude that multisensory integration never leads, or is never accompanied by, a unified experience. The gradation claim agrees with their neo-Humean view that binding (and multisensory integration) can occur in the absence of a unified experience, but disagrees with the radical negation that it never leads, or is never accompanied by, such a unified experience. Object unity, in this sense, is not always an illusory impression or what we think or imagine our experience to be. Object unity is a genuine fact about conscious experience but only one end of a spectrum that ranges from more unified experiences (as in the case of a normal kettle, where auditory and visual stimuli really originate in a single object and benefit from a high spatiotemporal congruence) to less unified experiences (like in some of Jackson’s experimental cases where the auditory and visual cues come from two slightly discrepant locations and are yet still assumed to be coming from one object and are therefore integrated).16 4 The Variations of Crossmodal Binding Welch and Warren (1980) incidentally suggest a parallel between grades of perceived object unity and the strength of what they call “the assumption of unity”—i.e., a mix of principles deciding that signals have a single source and features belong to the same object.17 “The stronger the assumption of unity, the greater the tendency will be for perception to occur in a way consistent with a unitary event” (Welch & Warren, 1980, 639, emphasis is my own). Attributing to Welch and Warren a view close to the phenomenal gradation claim is certainly an overinterpretation of what they say. At least here they plant the seeds for the second part of the view that I wish to defend. 
With the idea of a broader “unity assumption” governing binding, Welch and Warren underscore that binding is not only a matter of stimuli falling within a determined spatiotemporal window and being integrated neurologically.18 Although it remains true that too distant cues will not lead to multisensory integration, it seems that spatiotemporal congruency between signals is not sufficient to fully explain all nuances and cases of crossmodal binding. Welch and Warren in their seminal work already hint at the idea that the congruency between modality-specific sensory cues is playing a role in the decision to bind features across modalities: If the two modalities have some common history of presenting nonredundant information … about a known single event, then the observer should be more

accepting of discrepant information as representing a single event. (Welch & Warren, 1980, 663)

Noticeably, the evidence of crossmodal binding in cases where cues are crossmodally congruent defines a larger “window of integration” than the one defined for arbitrary cues. For instance, when the pair of presented stimuli correspond to sounds and sights of natural or artificial objects (e.g., sounds of whistling kettles and shapes of kettles), they are bound over a larger spatiotemporal window than mismatching pairs such as the sound of the kettle and the movement of a door, or pairs of arbitrary stimuli such as beeps and flashes. In other words, congruent auditory and visual stimuli can still be referred to the same object when incongruent or unrelated pairs would no longer be bound (see Deroy, forthcoming). But does this mean that congruency is really a separate means of crossmodal binding, independent from spatiotemporal factors? One problem here comes from the difference between the cases. When one sees and hears a kettle, the quantity of temporal information present in the two cues is much greater than the one contained in a flash and a beep. The richer the signals, for instance, the more the rhythmic pattern of movement provides cues to synchrony between the sounds and the visual target. If this is the case, the larger tendency to bind the features would also come from spatiotemporal factors.19 Not all effects of crossmodal congruency, however, can be reduced to the presence of a larger quantity of temporal or spatial information. Take what happens with crossmodal correspondences. Crossmodal correspondences are tendencies for a sensory feature or attribute in one modality—either physically present or merely imagined—to be matched with a sensory feature in another modality. 
For instance, human adults and infants (and even chimpanzees) consistently associate high-pitched sounds and bright colors or high elevation in space; decreasing pitch and descending tactile movement, tastes and sounds, sounds and shapes, and so on.20 Crossmodal correspondences have been described using a variety of different headings, including weak synesthesia (Martino & Marks, 2001), natural crossmodal mappings (Evans & Treisman, 2010), metaphorical mappings (Wagner et al., 1981), and synesthetic associations (Crisinel & Spence, 2010; Parise & Spence, 2009). What matters here is that, contrary to the rich crossmodal congruencies holding between sounds and appearances of kettles, they hold between relatively bare stimuli. Yet, they still introduce a difference in crossmodal binding. For instance, people find it significantly harder to correctly say which of two stimuli—one auditory

and one visual—came first when these pairs of stimuli are crossmodally congruent than for pairs that are incongruent.21 The just noticeable difference, as it is called, is for instance higher for pairs of bright visual flashes and high-pitch sounds than for pairs of dark visual flashes and high-pitch sounds, suggesting that crossmodal congruence leads to a more pronounced binding. These results have been confirmed by several psychophysical and neuroimaging studies.22 Crossmodal congruence is a factor of its own, which can modulate crossmodal binding along with spatiotemporal factors. This leads to the second revision of the processing claim. Instead of an on/off process governed merely by the strict limits of a bounded window of integration at the neurological level, crossmodal binding varies along at least two dimensions, that is, spatiotemporal congruency and crossmodal congruency. The processing claim should therefore be revised and crossmodal binding defined as a complex set of variables: (ii′) Processing variations: There are variations in the strength of crossmodal binding—the cognitive or neurophysiological operation whereby sensory features are bound together as belonging to the same object. 5 Conclusions: Aligning Phenomenology and Underlying Cognitive Process? We started with the intuitive ideas contained in the Switch thesis according to which there is an alignment between our conscious experience of multisensory objects and what cognitive neuroscience tells us about the underlying cognitive and neurophysiological operation of binding. The point of this chapter has been to stress that this intuitive alignment needs revision. For one thing, we need to distinguish between the occurrence of multisensory integration, crossmodal binding, and the experience of a unified object in consciousness. Multisensory integration is independent from the experience of two sensory features as unified into a multisensory object.
The fact that the two seem to go together is not sufficient to conclude that one explains or underlies the other. Should we then sign up for the radical neo-Humean suggestion that there is no such thing as a unified experience (Spence & Bayne, forthcoming)? Should we cling to our intuitive phenomenological claims and forget about explaining them with neurological and cognitive evidence? A closer look at experiments shows an alternative. Object unity need not be abandoned, but it is only one end of a spectrum where experiences

can be more or less unified. There are certainly cases where we report our experiences—or where our experiences introspectively strike us as being of a single unified object. But these cases can correspond either to our experiences being rigorously about a single object or to experiences consistent enough with them being about a single object. These “consistent enough” experiences stand somewhere in between the perfect object unity usually discussed by philosophers and the mere phenomenal unity usually contrasted with object unity. Turning to crossmodal binding, the decision to unify cues or features also turns out to be more complex. It varies at least with two factors—one, which corresponds to spatiotemporal closeness, and the other, which corresponds to crossmodal congruency. We are then faced with the two following claims: (i′) Phenomenal gradation: There are gradual changes, from the experience of perfectly unified to less unified objects or sets of features (what researchers sometimes call “consistent with a unified object”) to the experience of distinct features. (ii′) Processing variations: There are variations in processing corresponding to the strength of crossmodal binding—the cognitive or neurophysiological operation whereby sensory features are bound together as belonging to the same object. These two claims in themselves are not sufficient to suggest a new alignment between our experiences and the underlying cognitive and neurophysiological processes. Nothing shows that the gradations map onto the varieties of process. In this sense, and following Spence and Bayne’s methodological considerations, we should abstain from interpreting too quickly the effects of crossmodal binding processes in terms of conscious experience. There is however a way to make sense of a connection between phenomenal gradation and the variations of binding. In complex, real-life situations, our experience is largely dominated by the need to locate objects in space. 
When two crossmodally congruent but distant features are perceived (say the brightness of a light on the ceiling and the high-pitch sound of a bird singing in a cage), the experience of crossmodal congruency is overshadowed by the feelings of unity experienced about the bird song (as the auditory cues get bound to the visual shape of the bird) and the light (experienced also as a unified, visual object). When tested alone though, crossmodal correspondences lead to a feeling of two experiences “going together,” which is orthogonal to the experience of them being about the same spatiotemporal object. For instance,

Figure 5.1 Schematic illustration of the continuum view. Decrease in crossmodal congruence and/or spatiotemporal congruence leads to experiences of less and less unified objects—resembling more sets of congruent features—and leading to mere phenomenal unity. [Figure: axes labeled “Quantitative or spatiotemporal congruence” and “Qualitative or crossmodal congruence”; regions labeled “Canonical ‘object unity,’” “More or less qualitatively or spatiotemporally unified sets of features,” and “Mere phenomenal unity.”]

in crossmodal correspondences between odors and tones, participants are simply asked to evaluate which flavor or odor (experienced at a certain location and moment) matches particular music or a tone (coming from distant loudspeakers, and not necessarily in temporal alignment).23 Here the two stimuli can be experienced as going together in the sense of being congruent without being necessarily experienced as originating in the same spatial location. Another way of putting the distinction, in more Kantian terms, is to think that the unity of our experience varies both along a qualitative and a quantitative dimension (see figure 5.1).24 The qualitative dimension might then rest on crossmodal congruency. It leads to feelings of unity between distinct sensory features—that is, to feelings that certain features go together or match. The quantitative dimension on the other hand might rest on the spatial and temporal closeness between the cues and also leads to feelings of unity, this time of belonging to the same locus in space and/or time. This, however, remains an untested relation. In conclusion, it is sufficient to say that (i) the phenomenal gradation and (ii) the processing variations claims cast another shadow on the claim that consciousness is always unified and add another challenge to the

pathological cases of partial unity (see Bayne, 2008; Hurley, 1998). Even in typical cases, experiencing unified objects and their properties is a special case at the end of the spectrum of more or less unified experiences.

Acknowledgments

Thanks to David Bennett, Barry Smith, and Charles Spence for comments on earlier drafts.

Notes

1. See also Bayne (2010); Bayne & Chalmers (2003).
2. See Hume (1759/1962).
3. In their 2003 piece, Bayne and Chalmers, following another strategy, suggested another variant of this line, namely by making it possible for (ii) to lead merely to a switch in access consciousness and not to a phenomenal change.
4. Given problems with chemical senses, it is fair to say that Spence and Bayne restrict themselves to spatial senses.
5. Notice that at the level of behavior a difference in performance is only sufficient to infer a form of multisensory, or crossmodal, interaction (an influence of one type of sensory information on another). One needs additional criteria to be satisfied such as neurological convergence or fusion of information for this kind of interaction to count more specifically as integration. This precaution applies to the redundant target effect, often considered a great example of multisensory integration where performance exceeds the race model. Otto et al. (2013) have proposed that it might be due to noise instead.
6. E.g., Wallace et al. (2004); see Marks (2004) for a review.
7. E.g., Miller (1991).
8. E.g., Meredith, Nemitz, and Stein (1987).
9. See Bremner, Lewkowicz, and Spence (2012) for an overview of the current research.
10. What Hurley notes about a different contrast case, i.e., partial unity, applies also to absent unity. “We cannot imagine what it is like for there to be partial unity. That doesn’t show partial unity is unintelligible because being partially unified isn’t the sort of thing there could be anything it is like to be. We shouldn’t expect to be able to imagine what it is like” (Hurley, 1998, 165).
11. The estimate is based on (discussed) empirical evidence and only given as an illustration (see Vatakis & Spence, 2010, for a review).

12. See Botvinick and Cohen (1998).
13. Note that this is not the canonical description given to participants’ experience in the rubber hand illusion. Most authors talk about participants experiencing the rubber hand as their own. The evidence, however, is only of a drift of performance in tasks involving localization of the hand, and of a change in ownership prompted by subjective questionnaires.
14. Notice that the concept is controversial as there can be “integration” outside this window (Soto-Faraco & Alsius, 2007, 2009).
15. See McGurk and MacDonald (1976).
16. This range of nuances is particularly helpful, I believe, when it comes to analyzing certain experiences that are maintained as experiences of the same object but don’t have precise spatial or temporal boundaries such as the experience of one’s own body in space or the awareness of flavors.
17. See also Spence (2007).
18. While the majority of the studies investigating these congruency effects have been published over the last couple of decades (thus hinting at something of an explosion of interest recently in this particular area, see Doehrmann & Naumer, 2008, for a review), some of the seminal early studies go as far back as the 1950s (Jackson, 1953).
19. See Parise et al. (2012).
20. See Deroy and Spence (2013a, 2013b) and Spence (2011) for reviews.
21. Parise and Spence (2009).
22. Sadaghiani et al. (2009).
23. See Crisinel and Spence (2010) and Crisinel et al. (2013).
24. Kant (1781/1787) considers that the experience of a unified object depends on four fundamental categories: quantitative, qualitative, relational, and modal categories.

References

Bayne, T. (2008). The unity of consciousness and the split brain syndrome. Journal of Philosophy, 105(6), 277–300.
Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation. Oxford: Oxford University Press.

Botvinick, M., & Cohen, J. (1998). Rubber hands “feel” touch that eyes see. Nature, 391, 756.
Bremner, A. J., Lewkowicz, D. J., & Spence, C. (Eds.). (2012). Multisensory development. Oxford: Oxford University Press.
Brook, A., & Raymont, P. (2013). The unity of consciousness. In Edward N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/archives/spr2013/entries/consciousness-unity/.
Crisinel, A.-S., Jacquier, C., Deroy, O., & Spence, C. (2013). Composing with crossmodal correspondences: Odors and music in concert. Chemosensory Perception, 6, 45–52.
Crisinel, A.-S., & Spence, C. (2010). As bitter as a trombone: Synesthetic correspondences in nonsynesthetes between tastes/flavors and musical notes. Attention, Perception, and Psychophysics, 72, 1994–2002.
Deroy, O. (in press). Multisensory perception, semantic effects and cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), Cognitive penetrability. Oxford: Oxford University Press.
Deroy, O., Chen, Y.-C., & Spence, C. (in press). Multisensory constraints on awareness. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences.
Deroy, O., & Spence, C. (2013a). Are we all born synaesthetic? Examining the neonatal synaesthesia hypothesis. Neuroscience and Biobehavioral Reviews, 37, 1240–1253.
Deroy, O., & Spence, C. (2013b). Why we are not all synesthetes (not even weakly so). Psychonomic Bulletin and Review, 20, 1–22.
Doehrmann, O., & Naumer, M. J. (2008). Semantics and the multisensory brain: How meaning modulates processes of audio-visual integration. Brain Research, 1242, 136–150.
Evans, K. K., & Treisman, A. (2010). Natural cross-modal mappings between visual and auditory features. Journal of Vision, 10(1), 1–12.
Holmes, N. P., & Spence, C. (2005). Multisensory integration: Space, time, and superadditivity. Current Biology, 15, R762–R764.
Hume, D. [1739] (1962). Treatise of human nature. Oxford: Oxford University Press.
Hurley, S. (1998). Consciousness in action. Cambridge, MA: Harvard University Press.
Jackson, C. V. (1953). Visual factors in auditory localization. Quarterly Journal of Experimental Psychology, 5, 52–65.

Kant, I. [1781] (1787). Critique of pure reason. Cambridge: Cambridge University Press.
Marks, L. E. (2004). Cross-modal interactions in speeded classification. In G. A. Calvert, C. Spence, & B. E. Stein (Eds.), Handbook of multisensory processes (pp. 85–105). Cambridge, MA: MIT Press.
Martino, G., & Marks, L. E. (2001). Synesthesia: Strong and weak. Current Directions in Psychological Science, 10, 61–65.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
Meredith, M. A., Nemitz, J. W., & Stein, B. E. (1987). Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. Journal of Neuroscience, 7, 3215–3229.
Miller, J. O. (1991). Channel interaction and the redundant targets effect in bimodal divided attention. Journal of Experimental Psychology: Human Perception and Performance, 17, 160–169.
Otto, T. U., Dassy, B., & Mamassian, P. (2013). Principles of multisensory behavior. Journal of Neuroscience, 33(17), 7463–7474.
Parise, C. V., Spence, C., & Ernst, M. O. (2012). When correlation implies causation in multisensory integration. Current Biology, 22(1), 46–49.
Parise, C., & Spence, C. (2009). “When birds of a feather flock together”: Synesthetic correspondences modulate audiovisual integration in non-synesthetes. PLoS ONE, 4(5), e5664.
Sadaghiani, S., Maier, J. X., & Noppeney, U. (2009). Natural, metaphoric, and linguistic auditory direction signals have distinct influences on visual motion processing. Journal of Neuroscience, 29(20), 6490–6499.
Soto-Faraco, S., & Alsius, A. (2007). Conscious access to the unisensory components of a cross-modal illusion. Neuroreport, 18, 347–350.
Soto-Faraco, S., & Alsius, A. (2009). Deconstructing the McGurk-MacDonald illusion. Journal of Experimental Psychology: Human Perception and Performance, 35, 580–587.
Spence, C. (2007). Audiovisual multisensory integration. Acoustical Science and Technology, 28, 61–70.
Spence, C. (2011). Crossmodal correspondences: A tutorial review. Attention, Perception, and Psychophysics, 73, 971–995.
Spence, C., & Bayne, T. (in press). Is consciousness multisensory? In M. Matthen & D. Stokes (Eds.), Perception and its modalities. Oxford: Oxford University Press.

Stein, B. E., Burr, D., Constantinidis, C., Laurienti, P. J., Meredith, M. A., Perrault, T. J., et al. (2010). Semantic confusion regarding the development of multisensory integration: A practical solution. European Journal of Neuroscience, 31(10), 1713–1720.
Vatakis, A., & Spence, C. (2010). Audiovisual temporal integration for complex speech, object-action, animal call, and musical stimuli. In M. J. Naumer & J. Kaiser (Eds.), Multisensory object perception in the primate brain (pp. 95–121). New York: Springer.
Wagner, S., Winner, E., Cicchetti, D., & Gardner, H. (1981). “Metaphorical” mapping in human infants. Child Development, 52, 728–731.
Wallace, M. T., Roberson, G. E., Hairston, W. D., Stein, B. E., Vaughan, J. W., & Schirillo, J. A. (2004). Unifying multisensory signals across time and space. Experimental Brain Research, 158, 252–258.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88(3), 638.

6 Multimodal Unity and Multimodal Binding

Frédérique de Vignemont

For a long time, most research has addressed each sensory modality in isolation. This is in line with a modular conception of perception according to which the first levels of sensory processing are encapsulated, isolated from the influence of any other kind of information. However, recent literature has emphasized the importance of multisensory interaction, which has sometimes been taken as evidence that perception is “beyond modularity” (Driver & Spence, 2000). It is now well known that sensory modalities can influence each other. For example, in the famous ventriloquism effect, the observation of lip movements displaces the apparent location of speech sounds. Multisensory effects are pervasive. They can be found between most modalities. They are often automatic and mandatory since they can occur very early on in sensory processing. It may even seem now that a purely unimodal experience (not affected one way or the other by information coming from a different modality) is more the exception than the rule.1 As claimed by Stein and Meredith (1993, 6), “The sensory modalities have evolved to work in concert.” Their interaction improves the likelihood of detecting, localizing, and identifying objects and events. It is thus beneficial to combine different sources of information in order to achieve the best perceptual judgments. In other words, the more information, the better. But this is so only if one brings together the right pieces of information. One of the difficulties that the perceptual system faces is to select the relevant information to bind together. In this chapter, I will first consider to what extent the classic binding problems apply to multimodal experiences. This will lead me to distinguish two kinds of multimodality, additive multimodality and integrative multimodality, which respectively raise specific problems and fulfill specific functions. 
I will then focus on what Treisman (1999) calls the parsing problem, that is, the problem of selecting the information that comes from a common object or event and segregating that information from that which

concerns distinct objects or events. On what basis does the selection operate? More precisely, does the perceptual system bind together the sensory states that carry information about the same location or about the same object? And in the latter case, how is the sameness of object determined? Does it require sortal identification? Or can nonconceptual individuation suffice?

1 The Binding Problems

In his book, The Unity of Consciousness, Tim Bayne (2010) describes a small sample of his stream of consciousness while in a bar in Paris. It actually happened that I went to have a drink with him in this bar. Here is a brief glimpse of my stream of consciousness when I arrived:

I am looking for Tim. I scan the faces around. I have a blurred visual experience of many people. I also have a loud auditory experience of people chatting and laughing. I hear bits of sentences. I also hear the music in the background. Suddenly I feel that my visual search has been successful. I have an experience of familiarity: I recognize Tim in the corner. I feel happy to see my old friend. I see that he is writing in his notebook, very focused. I do not feel surprised. Memories of him working all the time come back to me. I hear him saying “bonjour” in his best French while I see his lips moving. I feel my muscles relaxing when I drop my bag. I feel my legs crossed. The table feels a bit sticky. I feel the coldness of the glass. I have a gustatory experience of the taste of my drink and a visual experience of its color.

Visual experiences, auditory experiences, gustatory experiences, emotions, bodily sensations, memories, and so forth: those experiences of various types are part of one’s everyday stream of consciousness. Yet, despite their plurality and variety, one does not have an overwhelming experience of scattering. Rather, one experiences unity and, to some extent, consistency. This is so at several levels of analysis.
In particular, one can single out three levels of unity:
(i) Conscious unity: This refers to the unity of an individual’s stream of consciousness in a specific time window. For example, I have an auditory experience of the music in the bar while having a visual experience of Tim and a gustatory experience of my drink.
(ii) Multimodal unity: This refers to the unity of the multimodal experience of an object perceived by several sensory modalities. For example, I have a thermal experience of the coldness of my drink, a visual experience of its color, and a gustatory experience of its taste.

(iii) Unimodal unity: This refers to the unity of the experience of an object perceived by a single sensory modality. For example, I have a visual experience of the glass as being cylindrical and red.

Most literature has focused its attention on the first and last of these levels. Here I will primarily focus on the intermediate level, the level of multimodality. I will analyze the similarities and dissimilarities of the principles that govern the experience of multimodal unity with those that govern the other types of unity. On the one hand, multimodal unity may appear closer to conscious unity than to unimodal unity. Like conscious unity, it requires bringing together distinct perceptual experiences from different sensory modalities. In contrast, unimodal unity has a more limited scope. It involves a single modality and the features of the object that are brought together are normally not separately consciously available. On the other hand, conscious unity includes distinct experiences about many different objects and events. By contrast, both multimodal unity and unimodal unity characterize perceptual experience of a single object or event. In the examples given above, visual, thermal, and gustatory experiences are unified into a multimodal experience in virtue of being about the same object. Likewise, the seen features of color and shape are unified into a visual experience in virtue of being the features of the same object. Because both are object- (or event-) based forms of unity, multimodal unity and unimodal unity involve perceptual binding. There is perceptual binding if different pieces of information taken to be about the same object or event are brought together into a unified content. Binding is then successful if the information that is bound is actually about the same object or event. The involvement of binding in unimodal and multimodal unity raises what is known as “the binding problem” in the literature.
To be more precise, one should actually say the binding problems, as noted by Treisman (1999, 105):

For any case of binding, the binding problem can actually be dissected into three separate problems. Different theories have focused primarily on one of the three.
(1) Parsing: How are the relevant elements to bind as a single entity selected and segregated from those belonging to other objects, ideas, or events?
(2) Encoding: How is the binding encoded so that it can be signaled to other brain systems and used?
(3) Structural description: How are the correct relations specified between the bound elements within a single object?

We will soon see that these are not the only problems that binding raises. But the parsing and the encoding problem are the most fundamental ones

insofar as they correspond to the two main stages in the binding process in general, no matter what is bound together. The first stage consists in selecting the relevant information to bind together in order to avoid binding features or experiences that have nothing to do with each other (for example, the color of my glass of wine with the shape of Tim’s mojito glass, or my visual experience of the mojito with my gustatory experience of the wine). The parsing problem can then be articulated into two distinct questions: what mechanism is used to select the relevant information to the exclusion of other information, and on what basis does this mechanism operate?2 The second stage consists in tagging the selected information as coming from the same object or event so that it can be integrated into a unified content. The key question then is what neural mechanism unifies the selected information.

The structural description problem is a further problem, which is specific to a certain kind of binding, the binding of the various parts of the object, in contrast to the binding of its features. Feature binding is the most standard case, but here is an example of part binding. When arriving, I first see the back of Tim’s T-shirt and then its front. I can then bind the two visual experiences together into a unified representation of the T-shirt. Likewise, I can bind together the tactile experience of the shape of the back of my glass and the visual experience of its front into a unified multimodal experience of my glass. Part binding then specifically raises the problem of structural description. It requires a correct knowledge of the structural properties of the object, that is, of the spatial and functional relations between the distinct parts (such as back and front) relative to the object as a whole.

The binding problems are not specific to visual perception.
They constitute a more general difficulty for any type of perceptual experience, whether it comes from a single sensory modality or from several. As long as the experience is about a single object or event, the perceptual system needs to select the relevant information to bind together. Binding is even more complex in the multimodal case because of the differences in format, precision, and reference frames between the senses.

The problem of whether to combine two signals involves an (implicit or explicit) inference about whether the two signals are caused by the same object or by different objects, i.e., causal inference. This is not a trivial problem. … The different senses have different precisions in all dimensions, including the temporal and spatial dimensions, and even if the two signals are derived from the same object/event, the noise in the environment and in the nervous system makes the sensory signals somewhat inconsistent with each other most of the time. Therefore, the nervous

system needs to use as much information as possible to solve this difficult problem. (Shams, 2012, 219)
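One way to make the causal inference that Shams describes concrete is a toy Bayesian calculation. The sketch below is an illustration under assumptions that are not in the chapter: locations lie on a single axis, sensory noise and the prior over source locations are Gaussian and centered on zero, and the function names, noise levels, and prior probability of a common cause are all hypothetical. It estimates how probable it is that a visual and an auditory measurement come from one source rather than two.

```python
import math


def bivariate_normal_pdf(x1, x2, var1, var2, cov):
    """Density of a zero-mean bivariate Gaussian evaluated at (x1, x2)."""
    det = var1 * var2 - cov * cov
    # Quadratic form x^T Sigma^{-1} x, written out for the 2x2 case.
    quad = (var2 * x1 * x1 - 2.0 * cov * x1 * x2 + var1 * x2 * x2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def posterior_common_cause(x_vis, x_aud, sigma_vis, sigma_aud,
                           sigma_prior, p_common):
    """Posterior probability that both measurements share one source.

    Assumed generative model (illustrative only):
      one source:  x_vis = s + noise_vis,   x_aud = s + noise_aud
      two sources: x_vis = s_v + noise_vis, x_aud = s_a + noise_aud
    with every source location drawn from N(0, sigma_prior^2).
    """
    var_source = sigma_prior ** 2
    var_vis = var_source + sigma_vis ** 2
    var_aud = var_source + sigma_aud ** 2
    # A shared source makes the two measurements covary by var_source.
    like_one = bivariate_normal_pdf(x_vis, x_aud, var_vis, var_aud, var_source)
    # Independent sources leave the measurements uncorrelated.
    like_two = bivariate_normal_pdf(x_vis, x_aud, var_vis, var_aud, 0.0)
    evidence_one = like_one * p_common
    return evidence_one / (evidence_one + like_two * (1.0 - p_common))


# Hypothetical measurements in degrees of azimuth.
near = posterior_common_cause(2.0, 3.0, 1.0, 2.0, 10.0, 0.5)
far = posterior_common_cause(-15.0, 15.0, 1.0, 2.0, 10.0, 0.5)
print(near, far)
```

With these illustrative numbers, measurements that lie close together favor a common cause, whereas widely separated ones favor two distinct causes. This is the sense in which the decision to bind is itself an inference made on noisy evidence.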

Each sensory modality is encoded in its own format and, in particular, its own spatial frame, centered on different parts of the body such as the eyes, head, torso, or skin. Consequently, the brain cannot just add up the converging sensory inputs. Rather, the perceptual system needs to go beyond the differences between sensory formats and spatial frames of reference. For example, in order to bring together the visual experience of the front of my glass with the tactile experience of its back, the representations of the two distinct parts need to use the same spatial frame. In order to solve this Tower of Babel type of problem, the information coming from the different modalities must be translated into a common reference frame before it can be combined. The common frame can be amodal or the frame of one modality. This is known as the recoding problem, which is specific to multimodal binding (Pouget, Deneve, & Duhamel, 2002).3 In the remainder of the chapter, I will determine how to solve Shams’s “difficult problem,” which multimodal unity faces. Beforehand, though, I will make a further distinction between two definitions of multimodality. I will argue that an experience can qualify as multimodal on the basis of either additive binding or integrative binding.

2 Additive Multimodality and Integrative Multimodality

Until now, I have described only two examples of multimodal experience based on feature binding and part binding. But there are many ways the senses can be brought together. Multisensory interactions do not form a homogeneous category.4 In particular, multisensory interaction can occur at different levels of sensory processing, at the perceptual, attentional, or cognitive levels. Multisensory effects can also be of distinct types. There can be convergence, in which information is recoded in a common multimodal format. Alternatively, there can be conversion, in which information is recoded into the format of another modality.
The dynamics of these effects can vary; most are short-term, lasting as long as the sensory signals last, but some have long-term consequences.5 More importantly, for the purposes of this chapter, multisensory interaction does not necessarily involve binding. For example, in the neuropsychological syndrome of tactile extinction, patients fail to notice a touch on their left hand, but only when they are simultaneously touched on the right hand or see a visual stimulus near the right hand. The latter case clearly constitutes a multisensory effect.

Vision extinguishes touch due to the competition of touch and vision for the same attentional resource. Yet, visual information and tactile information are not about the same object or event. They are not bound together into a unified multimodal experience. Still, they interact. By contrast, other multisensory interactions involve binding, but even then they do not form a homogeneous category. One should not believe that binding is always of the same type. In particular, one must distinguish between additive binding and integrative binding. In the examples I described, information from one modality did not affect information in the other. The modality-specific experiences were merely combined, and they complemented each other. One may then talk of additive binding. Additive binding results from the combination of sensory information that is not redundant because it is about distinct features or distinct parts of the object. However, there are many cases in which one modality influences another modality one way or the other. In particular, integrative binding results from the fusion of sensory information that is redundant. For example, I hear Tim say “bonjour” and I see his lips moving, shaping the word “bonjour.” Both modality-specific experiences carry information about the uttered word, which constitutes a common sensible. They can then be merged together into a unified content of the multimodal experience of the word. Because of the redundancy, the binding is so strong that the experiences melt into each other, so to speak. This is especially salient if the visual information is in conflict with the auditory information, as in the McGurk effect—when the auditory stimulus /aba/ is heard while looking at lips making movements that would produce an /aga/ sound, one reports hearing /ada/. Like additive binding, integrative binding raises the parsing problem, also known as the assignment problem in the multimodal literature. 
The assignment problem—determining which sensory stimuli pertain to the same object … is a difficult task, given that the senses often respond to multiple objects simultaneously. (Pouget, Deneve, & Duhamel, 2002, 741)

Welch and Warren (1980) were among the first to acknowledge the challenge of accounting for the mechanisms that select the relevant signals to integrate together. They claim that multisensory integration depends on what they call “the observer’s assumption of unity,” that is, the assumption that the various sensory signals carry information about the same object or event. I shall come back to the nature of the unity assumption in the last section, but one can already note that multisensory integration is of use only when the signals carry information about a particular property of

the same object or event. Otherwise, one may end up with illusory binding, as in the following illusion. In the classic setup of the rubber hand illusion, participants sit with their arm resting on a table hidden behind a screen. They are asked to fixate on a rubber hand presented in front of them, and the experimenter simultaneously strokes with two paintbrushes both the participant’s hand and the fake hand. The illusion occurs after a couple of minutes, but only if the two hands are in congruent positions and synchronously stimulated (Botvinick & Cohen, 1998). Most participants then report feeling as if they were touched on the rubber hand and as if the rubber hand were their hand. In addition, they mislocate their hand toward the rubber hand. The illusion results from the integration of visual information about the rubber hand being stroked with tactile and proprioceptive information. But this integration should not occur since the visual information and the somatosensory information are not about the same object, namely the participant’s hand. The rubber hand illusion illustrates that the parsing of information is as important for integrative binding as it is for additive binding. In addition, integrative binding raises two further problems. Like additive binding, it raises the recoding problem. It is all the more necessary to translate the different pieces of information into a common format for the perceptual system if they are to be fused:

Second, sensory signals—for example, the sound of a word and the image of a person’s moving lips—must be recoded into a common format before they can be combined, because sensory modalities do not use the same representations. We call this the recoding problem. (Pouget, Deneve, & Duhamel, 2002, 741)
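The recoding problem can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that locations are azimuths on a single horizontal axis, that vision reports locations relative to the direction of gaze, that audition reports them relative to the head, and that the two frames differ only by the current eye-in-head rotation. The function names and numbers are hypothetical and are not a model of how the brain performs the translation.

```python
def eye_to_head_centered(retinal_azimuth_deg, eye_in_head_deg):
    """Recode a gaze-relative azimuth into a head-relative azimuth.

    Simplifying assumption: the two frames differ only by the current
    rotation of the eyes in the head, so the recoding is an addition.
    """
    return retinal_azimuth_deg + eye_in_head_deg


def head_to_eye_centered(head_azimuth_deg, eye_in_head_deg):
    """Inverse recoding: express a head-relative azimuth relative to gaze."""
    return head_azimuth_deg - eye_in_head_deg


# Hypothetical scene: the eyes are rotated 20 degrees to the right, the
# moving lips are seen 5 degrees left of fixation, and the voice is heard
# 14 degrees to the right of the head's midline.
eye_in_head = 20.0
lips_seen_retinal = -5.0
voice_heard_head = 14.0

# Comparing the raw numbers (-5 versus 14) would be meaningless; the
# signals become comparable only once they share a frame.
lips_in_head_frame = eye_to_head_centered(lips_seen_retinal, eye_in_head)
discrepancy = abs(lips_in_head_frame - voice_heard_head)
print(lips_in_head_frame, discrepancy)
```

Either direction of translation serves the purpose: the shared frame may be the head-centered one used here for audition or the gaze-centered one used for vision, as long as both signals end up expressed in the same coordinates before they are compared or fused.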

The last problem is specific to integrative binding:

Finally, combining multimodal cues involves statistical inferences, because sensory modalities are not equally reliable and their reliabilities can vary depending on the context. For example, vision is usually more reliable than audition for localizing objects in daylight, but not at night. For best performance, the statistical inference must assign greater weights to the most reliable cues, and adjust these weights according to the context. (ibid.)
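The weighting described in this passage is often formalized as inverse-variance (reliability-weighted) averaging, which gives the minimum-variance combination when the cues are unbiased and their noise is independent and Gaussian. The sketch below is a minimal illustration under those assumptions; the function name and the daylight and night noise levels are hypothetical and are not taken from the chapter.

```python
def combine_cues(estimates, sigmas):
    """Fuse redundant estimates of one property by reliability weighting.

    Each cue's reliability is taken to be 1 / sigma^2, and its weight is
    its share of the total reliability (assumes independent Gaussian noise).
    Returns the fused estimate, its standard deviation, and the weights.
    """
    reliabilities = [1.0 / (s * s) for s in sigmas]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    fused = sum(w * e for w, e in zip(weights, estimates))
    fused_sigma = (1.0 / total) ** 0.5
    return fused, fused_sigma, weights


# Hypothetical location estimates in degrees: vision says 10, audition 14.
estimates = [10.0, 14.0]

# Daylight: vision is assumed to be much less noisy than audition.
day_fused, day_sigma, day_weights = combine_cues(estimates, [1.0, 4.0])

# Night: vision is assumed to be noisier than audition.
night_fused, night_sigma, night_weights = combine_cues(estimates, [8.0, 4.0])

print(day_fused, day_weights)
print(night_fused, night_weights)
```

Under these assumed noise levels, the fused estimate stays close to the visual one in daylight and moves toward the auditory one at night. In both cases the fused estimate is less variable than either cue taken alone, which is why redundancy pays off only when the weights track the context.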

What we may call the weighting problem is not encountered in additive binding. When I have a visual experience of the color of my drink and a thermal experience of its coldness, it does not matter whether visual information is more reliable than thermal information. The two sources of information are about the same object but not the same property of the object. They are not confronted, they can never be in conflict, and thus the perceptual system will never have to give primacy to one or the other.

They remain independent components of the multimodal experience. In contrast, integrative binding requires solving the weighting problem. The redundancy of information that characterizes multisensory integration does not imply that all information is equally trustworthy. Furthermore, the reliability of each sensory modality varies widely according to the context and the type of information. The perceptual system thus needs to adjust the weight given to each modality according to the context, so that in cases of conflict it can achieve the most reliable estimate.

Integrative binding raises the specific additional binding problem of weighting because it has a different function than additive binding. In a nutshell, additive binding aims at completeness, whereas integrative binding aims at reliability. Thanks to additive binding, the perceptual system can collect all the pieces of information about the various parts and the various features of an object in order to have as rich an experience of the object as possible. Integrative binding is more narrow-minded, so to speak. It is focused on a single property of the object. But we know that each sensory receptor sends noisy signals. Furthermore, the quality of the signals can be degraded by environmental conditions (poor light, for example, or a noisy environment). It is thus important to have more than one source of information. Informational redundancy increases robustness and reliability. Thanks to integrative binding, the perceptual system can generate the best estimate of the property of the object by weighting and integrating the various sources of information (for more details, see Bennett and Trommershäuser, this volume). For example, the music is very loud in the bar and I cannot hear well what Tim is saying, but the vision of his lips moving helps me to understand him.

There is a last difference between additive binding and integrative binding. Additive binding is a matter of all or nothing.
Either the features are bound or not. By contrast, integrative binding is a matter of degree. The information can be more or less bound. This is well illustrated by studies on prism adaptation. When participants wear optical prisms that deviate the direction of light rays by a constant angle, they see their hand at a location distinct from where it actually is. After a certain time of adaptation, prismatic deviation affects not only where they see but also where they feel their hand to be through proprioception (Stratton, 1899; Welch & Warren, 1980). Interestingly in some cases, participants wearing prisms have no conscious access to the original proprioceptive information. “Without any conscious opposition in the localisation of the different impressions: the touch sensations were not referred to any other than their visible locality” (Stratton, 1899, 495). The visual experience and the proprioceptive

experience are fused so that the subjects experience their hand at a single location. However, in other cases, participants can report feeling their hand at a location that is in between the optical location (i.e., on the basis of vision only) and the proprioceptive location (i.e., on the basis of proprioception only) (Welch & Warren, 1980). The fusion is not complete. The degree of binding then depends on the degree of confidence that the various perceptual experiences that are bound together concern a single object. In summary, the brief glimpse we had into the variety of multisensory interaction shows the difficulty that one faces if one wants to give a general theory of multimodality. I have argued that there are at least two ways an experience can qualify as multimodal. First, an experience is multimodal if it results from additive binding. What is bound can then be distinct features or distinct parts of an object or event. It is a matter of all or nothing. It requires solving the parsing, the encoding, and the recoding problems. It aims at forming the most complete multimodal experience of the object or event. Second, an experience is multimodal if it results from integrative binding. What is bound is redundant information about the same property of the same object within the same spatial frame of reference. It is a matter of degree. It requires solving the weighting problem in addition to the parsing, the encoding, and the recoding problems. It aims at forming a coherent and reliable multimodal experience. The question now is how to understand these two types of multimodality. I argued that multimodal unity differs from conscious unity insofar as it consists in the unified experience of a single object, whereas conscious unity brings together experiences of many distinct objects and events. Yet I shall now argue that multimodal unity consists in a kind of unity that is similar to conscious unity, at least in one specific account of conscious unity. 
There are several ways to interpret and analyze the unity of consciousness, but it is not the purpose of this chapter to discuss them all. Rather, I will assume the mereological model defended by Bayne (2010), among others.

We might say that two conscious states are phenomenally unified when, and only when, they are co-subsumed. … What it is for one experience to subsume another is for the former to contain the latter as a part. (Bayne, 2010, 20–21)

On the mereological model, the unity of consciousness is conceived in terms of the relation between a complex phenomenal state and simpler states that are its components, so to speak. Bayne and Chalmers (2003) characterize the mereological model as follows: (i) reflexivity (a state subsumes itself), (ii) transitivity (if A subsumes B, and B subsumes C, then A


F. de Vignemont

subsumes C), and (iii) antisymmetry (if A subsumes B, and B subsumes A, then A = B). In my example, I have a global, experiential state that contains my auditory experience of the music in the bar, my visual experience of Tim, and my gustatory experience of my drink as components. One may wonder whether one could apply the same mereological model to understand multimodal unity.6 Let us consider first additive multimodality. Arguably a multimodal experience goes above and beyond the modality-specific experiences. One may then say that it subsumes the modality-specific experiences such as visual and gustatory. It is a more complex state that includes simpler states as its components. Furthermore, the visual experience itself subsumes the color and shape states. By the rule of transitivity, one can then say that the multimodal experience, which subsumes the visual experience, also subsumes the color state. Because each piece of modality-specific information is a component of the multimodal experience, the more components there are, the richer the experience. For example, my multimodal experience of the drink is richer if it includes all the various ways of experiencing it, giving me access to all its properties. But if I have a cold and cannot taste it, or if I close my eyes, then one might say that there is something missing. Put another way, additive binding results in the collection of distinct pieces of information under a subsuming global state. Does the mereological model apply to integrative binding as well as to additive binding? In other words, does my multimodal experience of “bonjour” subsume my visual and auditory experiences? The difficulty is that in the case of full fusion, one cannot retrieve the original modality-specific information. In the McGurk effect, for example, one does not have access to the auditory /aba/, consciously or not.
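The additive case just described can be sketched in code. The sketch assumes, as a simplification that is mine and not part of the mereological model itself, that an experience can be represented as a set of atomic contents and that subsumption is set inclusion. All names are hypothetical.

```python
from itertools import product

# Hypothetical atomic contents; the labels are illustrative only.
color = frozenset({"color"})
shape = frozenset({"shape"})
visual = color | shape               # the visual experience has color and shape as parts
gustatory = frozenset({"taste"})
multimodal = visual | gustatory      # additive binding: parts are collected, not altered


def subsumes(whole, part):
    """True when every component of `part` is also a component of `whole`."""
    return part <= whole


states = [color, shape, visual, gustatory, multimodal]

# (i) reflexivity: every state subsumes itself.
reflexive = all(subsumes(a, a) for a in states)

# (ii) transitivity: if A subsumes B and B subsumes C, then A subsumes C.
transitive = all(
    subsumes(a, c)
    for a, b, c in product(states, repeat=3)
    if subsumes(a, b) and subsumes(b, c)
)

# (iii) antisymmetry: mutual subsumption implies identity.
antisymmetric = all(
    a == b
    for a, b in product(states, repeat=2)
    if subsumes(a, b) and subsumes(b, a)
)
```

On this toy representation the multimodal state subsumes the color state by way of the visual state, as transitivity requires. The sketch covers only additive binding: it presupposes that each component survives intact in the whole, which is exactly what full fusion calls into question.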
But if the original modality-specific information is lost in the process of integrative binding, then how can it be a component of the multimodal experience? The mereological model can hardly then apply. However, as argued, there are degrees of integration, and the fusion is not always complete. At some sensory level, the information can still be there. For example, in the rubber hand illusion, the participants experience their hand at a location close to the location of the rubber hand, but if asked to use their hand, the kinematics of their movements indicate that they do not mislocate it (Kammers et al., 2009). Hence, the original proprioceptive information is preserved at the motor level. It then seems that the multimodal experience of the hand location can have the proprioceptive state as a component, although it is not consciously available. In addition, one may note that most experiences that involve integrative binding also involve additive binding. When there are several sources

Multimodal Unity and Multimodal Binding


of information coming from different modalities, some information will be redundant, and some will not be redundant. Even in the McGurk effect, one has a multimodal experience that results from the integration of the visual information about the lip movements and the auditory information about the sound of the word, but one also has a multimodal experience that results from the combination of the visual experience of the color of the lips and the auditory experience of the tone of the voice. Although the information that is redundant is fused, the other types of information, which are not redundant, remain independent. They do not influence each other, and what affects one does not affect the other. The mereological model can then clearly apply.

3 The Binding Parameter

We can now turn back to our fundamental problem. How does the perceptual system achieve the selection of the relevant information without falling into a vicious circle? In a nutshell, binding results in the unified perception of an object, but binding requires first parsing the relevant information, which in turn requires the individuation or identification of the object that is the common source of the different pieces of information. But how can singling out the object be both the output and the constraint, or guideline, of binding? This raises the question of the criterion or criteria that the perceptual system uses to select the relevant information, what John Campbell calls the “binding parameter”:

By the “binding parameter,” I mean the characteristic of the object that the visual system treats as distinctive of that object, and uses in binding together features as features of the same thing. (Campbell, 2002, 37)7

The notion of binding parameter can be easily generalized to apply to multimodal unity. The multimodal binding parameter is the characteristic of the object that the various sensory modalities treat as distinctive of that object and use in combining or integrating information as information about the same object. Interestingly, the application of the notion of binding parameter to the multimodal domain adds a new constraint. For the multimodal binding parameter to play the role of guiding the selection of the information across several sensory modalities that will be combined or integrated, it must be a common sensible.8 There are two interpretations of this constraint. According to a strong interpretation, the multimodal binding parameter is always the same no matter the sensory modalities that are integrated. Then it must be a characteristic available to all the senses.


However, there is no reason the perceptual system should always use the same binding parameter and cannot adjust depending on the perceptual context. One can then offer a weaker interpretation of the constraint, according to which the multimodal binding parameter can vary, depending on the perceptual context. Then it must be accessible to all the sensory modalities that are concerned, and only to those. The application of the notion of binding parameter to the multimodal domain also adds a new difficulty. As said earlier, multimodal binding raises additional problems, including the recoding problem. It operates not only on different pieces of information, but also on different senses, endowed with different characteristics. The parsing problem and the recoding problem are not independent. The perceptual system must go beyond those differences in order to determine whether there is a single source or not. What are the candidates for the binding parameter? Are they the same for the multimodal binding parameter? Are they the same for additive multimodality and for integrative multimodality? Here I will focus exclusively on the latter. There is a classic distinction within the attentional literature between space-based attention and object-based attention. Traditional models characterize attention in spatial terms. One pays attention to stimuli in a specific region of space, to the exclusion of stimuli from other locations (Posner et al., 1980). In contrast, recent models emphasize the role of discrete objects: one can attend to independent individuals that one can track over time, rather than to spatial regions of the visual field. For example, one is able to selectively look at one of two spatially superimposed movies by ignoring the other, despite the fact that they share the same locations (Neisser & Becklen, 1975). 
Furthermore, when subjects are presented with two overlapping objects (such as a box and a line), they are more accurate in judging two properties of the same object than they are of separate objects (Duncan, 1984). Thus, discrete objects can serve as units of attention. In other words, one is able to attend to feature clusters parsed as independent individuals rather than spatial areas. The dissociation between the two types of attention is confirmed in the neuropsychological literature. For example, patients suffering from neglect represent only the right half of a scene and leave out figures on the left side, but they can also sometimes represent only the right side of objects, omitting their left side, no matter where they are located in the scene (Driver & Halligan, 1991; Marshall & Halligan, 1994). Without necessarily adopting the attentional model of binding, we can keep the distinction and claim that the selection of the relevant elements to


bind is based either on a specific region of space independent of the objects that it contains or on the object itself. In the latter case, one can suggest using sortal concepts, such as a book, hand, glass and so forth, to parse the information. Sortal concepts provide principles of individuation and numerical identity (Wiggins, 2001). They allow one to count the entities and to reply to the question “what is it?” We can thus contrast what I call the spatial hypothesis and the sortal hypothesis of the multimodal binding parameter.

Spatial hypothesis: The perceptual system binds together the sensory states that carry information about the same location.

Sortal hypothesis: The perceptual system binds together the sensory states that carry information about the same object. The sameness of object is determined through its conceptual identification.9

We shall now see that neither of these hypotheses is satisfactory.

3.1 The Spatial Hypothesis

Let us consider first the spatial hypothesis, according to which location is the parameter that is used to select the relevant information to be bound together (Treisman, 1999; Campbell, 2002). However, location must be ruled out in the case of multimodal binding, and this is so for several reasons.10 First, a classic problem encountered also in the visual literature is that the perceptual system must be able to bind features of moving objects. The second difficulty is specific to multimodal binding. As said earlier, each sensory modality has its own spatial frame of reference and one needs to translate one frame to another if one wants to use location as the binding parameter. If parsing is space-based, then location needs to be encoded within the same frame of reference. Otherwise, it could not play its role of common currency between the different modalities. Third, it is controversial whether all senses, including taste, for example, encode the location of the perceived object.
True, when eating one can feel that the piece of chocolate is first on the tip of the tongue and then at the back of the mouth, but is this spatial information carried by the gustatory experience itself or by the associated tactile experience? None of these difficulties is a fatal objection to the spatial hypothesis. At most, they reduce its scope. One can indeed limit it to cases of static objects perceived by sensory modalities that carry spatial information. There is, however, a series of cases that is more difficult to account for, which involve either two objects for one location or one object for two locations. On their basis I will argue that location is neither a sufficient nor a necessary binding parameter.


We know that amputees can feel the presence of a phantom limb although they know that their limb is missing and can see that there is no limb where they feel it to be. The experience of their phantom limb can be even more vivid than the experience of their intact biological limbs. “I, says one man, I should say, I am more sure of the leg which ain’t than of the one that are, I guess I should be about correct” (Mitchell, 1871, 566). How can one experience the presence of a limb despite the fact that one can constantly see that the limb is missing? This may seem all the more surprising since it is well established that what one sees generally prevails if there is any conflict with other sensory modalities. For example, one can induce illusory arm movements by vibrating muscle tendons of the arm, but the illusion immediately disappears if one sees that the arm is not moving (Lackner, 1988). The perceptual system generally trusts more what one sees than what one feels. Consequently, one should expect vision to chase the phantom. Yet, this is not the case. Let us imagine that the patient feels her phantom hand to be at a location where there is a book. Nonetheless, seeing the book does not cancel the experience of the phantom hand. One explanation why vision does not erase the phantom is that the perceptual system does not confront the visual information about the book with the proprioceptive information about the hand despite the fact that the book and the phantom hand are experienced at the same location; hence, there is no conflict and no attempt to solve the conflict by canceling the phantom sensation. One may want to explain the lack of integrative binding by a failure to recode the two sensory modalities into a common spatial frame of reference. For example, one may claim that the location of the book is represented within an eye-centered frame only, whereas the location of the hand is represented within a body-centered frame only. 
This explanation, however, is not fully satisfying. It is true that the book is not located within a bodily frame of reference, but it is more controversial that the phantom hand is not located within a visual frame of reference. There are indeed some reports that indicate the contrary. For example, Ramachandran and Blakeslee (1998) describe the case of a patient who felt that he was grasping a mug and who began to scream when the examiner moved the mug away from him. The patient’s reaction shows that the phantom hand was experienced in the same space as the mug. Hence, it seems that it cannot be only a problem of recoding. The case of the phantom limb is typically a scenario in which integrative binding does not occur despite the identity of location. Yet, this should not be understood as a failure of the perceptual system insofar as there are


indeed two distinct objects. The parsing of information is correct. It merely shows that location is not always a good binding parameter. Moreover, it shows that it is not always sufficient. Further findings on phantom limbs confirm that something more than location is required for integrative binding. Ramachandran and Rogers-Ramachandran (1996) invented a virtual reality box that provided to patients the missing visual feedback of the phantom hand. A mirror was placed vertically so that the reflection of the intact hand was “superimposed” on the felt position of the phantom hand. When the normal hand moved, the mirror reflected a moving contralateral hand. This experimental device induced the illusion of phantom movements in six of ten patients, some of whom had experienced their phantom hand as paralyzed for ten years. This shows that visual information and proprioceptive information became integrated into a unified bodily experience. Without the mirror box, what the patients feel and what they see are too different, and thus there is no confrontation between the two sources of information about the same location, no conflict, and no attempt to solve the conflict. With the mirror box, on the contrary, both proprioceptive information and visual information are about the hand. This initiates a new dialogue between them. Location is neither a sufficient binding parameter nor a necessary condition.11 The perceptual system needs to appeal to further information. Let us consider prism adaptation. As I described earlier, when one wears prisms, visual information about the location of one’s hand is in conflict with proprioceptive information. Yet, they are integrated together despite the spatial discrepancy. Again, this should not be considered the result of a failure of the perceptual system. The function of integrative binding is to bind information about the same object. This is what it does here. 
Consequently, if the identity of the object is the same (one’s hand), then there is integrative binding no matter the spatial differences. The parsing of information can be exclusively object-based. The question then is how the object is identified.

3.2 The Sortal Hypothesis

According to the sortal hypothesis, the use of the sortal concept is necessary to single out the object that is the cause of the various perceptual experiences. In Campbell’s terms, one uses a sortal concept to “delineate” the boundaries of the object that is the common cause of the various perceptual experiences. It is clear that the application of a sortal concept can create expectations about specific perceptual experiences in different modalities and their co-occurrence. In addition, the application of a sortal concept


has the advantage of answering nicely to the common sensible constraint. Arguably, sortal concepts are amodal and thus common to all sensory modalities. In this view, multimodal binding requires the conceptual identification of the object that is perceived by the different sensory modalities involved. For example, the visual information and the auditory information of an approaching car are integrated in virtue of the fact that one recognizes that they both carry information about a car. This is not to say that the sortal concept suffices as a binding parameter because one can simultaneously perceive several objects of the same sort such as several cars in the street. It is not beneficial to bind together properties that belong to different objects, even if they can be grasped by the same concept. The view is rather that one uses sortal concepts to delineate objects in a specific region of space. A further asset of the sortal hypothesis is that it can easily account for the findings on phantom limbs. The concept of a book is distinct from the concept of a hand, and thus visual information about the book is not integrated with somatosensory information about the hand, although the book and the phantom hand are experienced in the same region of space. Finally, the sortal hypothesis seems to be in line with the theory of the assumption of unity in the psychological literature (Welch & Warren, 1980). In this view, multimodal binding depends on the expectation that the information is about the same object. For instance, in prism adaptation, which was extensively studied by Welch and Warren, one must believe that visual and proprioceptive information can refer to the same object, namely one’s hand, in order for the two to be integrated. The reasons to make such a specific assumption are partly “historic.” Most of the time one feels and sees one’s hand at the same location. 
In Bayesian terms, the unity assumption depends on the prior knowledge derived from experience about the coupling of the signals in the world, that is, how often the two signals are caused by the same source in general. For example, the perceptual system has accumulated information about the statistics of proprioceptive-visual events in the environment. If the two signals have always been consistent, then the expectation is that they will be highly consistent in the future. Prior knowledge thus helps give a causal structure to specific scenes and events (Shams & Kim, 2010). The sortal hypothesis claims that one needs to identify the object common to the different types of sensory information in order to bind the information together. But does the parsing of information require such a sophisticated explanation? Does one need to recognize the object that is perceived in order to register that the two sources of information are emanating from a single object? It remains to be shown that the assumption of


unity requires the application of sortal concepts and that prior knowledge is represented in conceptual terms. Unfortunately, the question of the binding parameter has scarcely been investigated in the literature on multisensory integration, in contrast to the literature on visual binding. Many studies on visual binding have used multiple visual stimuli to investigate how the right color is bound to the right shape, for example. By contrast, the majority of studies in the multimodal literature have focused on relatively simple situations, in which only a single stimulus is presented in each sensory modality at any given time. Under such conditions, the parsing problem has already been solved since there is no question about which information should be bound. Furthermore, the few models of the multimodal binding parameter generally involve a combination of perceptual and cognitive factors (e.g., Radeau & Bertelson, 1977; Welch & Warren, 1980; Ernst, 2006; Shams, 2012). There is no doubt that if one has a prior expectation about the sort of object one is perceiving through multiple sensory modalities, then it is easier to select the relevant information. Yet, that does not show that the application of a sortal concept is necessary. Actually, accepting the role of the unity assumption and of prior expectation does not necessarily commit us to accepting the sortal hypothesis. These two notions are indeed sometimes ambiguous in the multimodal literature. Let us compare two accounts of the unity assumption.

A necessary condition for the occurrence of intersensory bias is the registration, at some level, that the two discrepant sources of information are emanating from a single event. (Welch & Warren, 1980, my emphasis)

An intersensory conflict can be registered as such only if the two sensory modalities are providing information about a sensory situation that the observer has strong reasons to believe (not necessarily consciously) signifies a single (unitary) distal object or event. (Welch, 1999, my emphasis)

Likewise, Shams (2012) refers sometimes to prior knowledge and sometimes to prior experience. One can thus distinguish a strongly cognitive interpretation and a weaker interpretation, according to which one can dispense with conceptual identification. Furthermore, even if one adopted a strong interpretation, this would not imply that sortal concepts were required. All one would need to assume would be that there was a single source to the information coming from distinct modalities, but the nature of the source could remain unknown. And all one would need to know is “how often two signals are caused by the same source in general” (Shams, 2012). The knowledge of mere co-occurrences between modalities could be acquired without necessarily identifying the


perceived object. What matters for both the assumption of unity and prior knowledge is the relation between the perceptual experiences, not the sort of object that is the common source of the perceptual experiences. Not only are we not committed to the sortal hypothesis by the current scientific models of binding, there also seem to be good arguments against it. The application of sortal concepts is indeed too sophisticated and superfluous. At least, it seems hardly plausible that nonhuman animals (including rats, which rely heavily on multisensory integration) use sortal concepts. In humans, we also know that multisensory integration recruits low-level neural mechanisms. For example, a direct modulation of the activity of the primary somatosensory area by visual information has been found (Taylor-Clarke, Kennett, & Haggard, 2002). Furthermore, one can pay conscious attention to objects that one cannot categorize (Campbell, 2002). Arguably, the same is true of multimodal binding. This is well illustrated by the case of a patient with visual agnosia, who had difficulty recognizing everyday objects by vision but not by touch (Takaiwa et al., 2003). What is interesting is what happened when she used both types of information simultaneously. If the sortal hypothesis were true, one would expect no interaction between vision and touch. Since she could not visually identify the object, she should not have been able to bind visual and tactile information together. Her performance should have been the same as when she used only touch to recognize the objects. But actually she had more difficulty recognizing the objects, and the quality of her performance decreased. In other words, visual information impaired tactile processing. Visuotactile integration occurred despite her lack of visual identification. Hence, one can have a visual experience of an unknown shape, explore it tactually, and bind visual and tactile information. One can thus dispense with categorizing the object.
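The Bayesian reading of the unity assumption discussed in this section can be illustrated with a small sketch. It computes the probability that two signals share a single source from nothing more than their discrepancy, their noise levels, and a prior for a common cause; no category of object enters the computation. The Gaussian form, the function name `prob_common_source`, and every numerical value are assumptions made for illustration, not a model taken from the studies cited above.

```python
import math


def prob_common_source(x_v, x_p, sigma_v, sigma_p, sigma_s, prior_common):
    """Posterior probability that two noisy location signals share one source.

    Assumptions of this sketch: source locations are drawn from a zero-mean
    Gaussian with spread sigma_s, and each signal adds independent Gaussian noise.
    """
    a = sigma_v ** 2 + sigma_s ** 2   # variance of the visual signal
    b = sigma_p ** 2 + sigma_s ** 2   # variance of the proprioceptive signal
    c = sigma_s ** 2                  # covariance when the source is shared

    # One source: the two signals are correlated (bivariate Gaussian density).
    det = a * b - c * c
    quad = (b * x_v ** 2 - 2 * c * x_v * x_p + a * x_p ** 2) / det
    like_common = math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

    # Two sources: the signals are independent.
    like_separate = math.exp(-0.5 * (x_v ** 2 / a + x_p ** 2 / b)) / (
        2 * math.pi * math.sqrt(a * b)
    )

    numerator = like_common * prior_common
    return numerator / (numerator + like_separate * (1.0 - prior_common))


# Hypothetical values: a small and a large visual-proprioceptive discrepancy.
near = prob_common_source(0.0, 1.0, 1.0, 2.0, 10.0, 0.5)
far = prob_common_source(0.0, 15.0, 1.0, 2.0, 10.0, 0.5)

# Same discrepancy, different histories of co-occurrence (different priors).
often_together = prob_common_source(0.0, 5.0, 1.0, 2.0, 10.0, 0.9)
rarely_together = prob_common_source(0.0, 5.0, 1.0, 2.0, 10.0, 0.1)
```

In this sketch the probability of a common source falls as the discrepancy grows and rises with the prior, which stands in for how often the two kinds of signal have co-occurred in the past. Whether the perceptual system computes anything of this form is an empirical question; the point is only that such a computation needs no sortal concept.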

3.3 The Proto-Object Hypothesis

We can then suggest an alternative hypothesis of object-based binding, which is less cognitively demanding than the sortal hypothesis.

The proto-object hypothesis: The perceptual system binds together the sensory states that carry information about the same object. The sameness of the object is determined through its nonconceptual individuation.

Rather than conceptually identifying the source, one can merely perceptually single out proto-objects. The individuation of proto-objects requires differentiating the figure from the ground using Gestalt principles such as continuity, proximity, and similarity. It does not require classifying the proto-objects as members of classes of other things that have the same


criterion of identity. Still, one can keep track of them while they move, even despite changes in their features. Proto-objects are grasped in a nonconceptual way (see for instance Clark, 2004; Raftopoulos, 2009; Pylyshyn, 2001, 2003). Hence, the individuation of a proto-object depends on neither its location nor its conceptual recognition. Multimodal binding then depends on the perceptual similarity between what is perceived by the distinct sensory modalities instead of its conceptual identity. The perceptual features that are compared include not only location, but also size, shape, intensity, motion, orientation, texture, and so on. They also include temporal factors such as the time of onset and offset and the rhythm. The more congruent features there are, the more likely it is that the information is about the same proto-object. The proto-object hypothesis can account for the phenomenon of the phantom limb without the cost of positing the use of a sortal concept of the hand. It can also account for cases of integrative binding with no spatial link, as in prism adaptation. The common source is assigned to a proto-hand at a certain location, not to the location itself. If the location is ambiguous, as is the case when wearing prisms, then the proto-hand can be individuated by features. The proto-object hypothesis, however, may be challenged by recent findings on the rubber hand illusion (Guterstam, Gentile, & Ehrsson, 2013). It was found that one could elicit the illusion despite the absence of any rubber hand. In this new version, the experimenter synchronously stroked the participant’s hidden hand and a discrete volume of empty space five centimeters above the table in direct view of the participant. Surprisingly, participants reported feeling touch in the empty space. They also mislocalized their hand toward the location where they saw the paintbrush stroking the empty space. Finally, they vividly reacted when a knife threatened the empty space.
At first sight, these results seem to invalidate the proto-object hypothesis insofar as the apparent lack of a perceived proto-object did not preclude visuosomatosensory integration. But was there really no proto-object? At the physical level this was true, but at the perceptual level it might be different. Participants reported that it seemed as if they had an “invisible hand.” Hence, it was not as if they perceived the empty space as being empty. They perceived it as being occupied by a hand they could not see, or, one may then say, a proto-hand.12 The authors explained the illusion in terms of prior experience of somatosensory information without visual information. It frequently happens that we feel sensations in limbs that we are not currently seeing. This does not prevent us from feeling the sensation in the location where the unseen limb is, nor does it make us feel the sensation outside the boundaries of our body (i.e., exosomesthesia).13 One may then suggest that in Guterstam and colleagues’ study, the


synchronous stimulation activated a kind of visuospatial representation of the body, which was used as the frame of reference for the localization of the tactile sensation. These puzzling findings are thus compatible with the proto-object hypothesis. Furthermore, the proto-object hypothesis has two main assets.14 First, it removes the threat of circularity since the multimodal binding parameter is merely a proto-object and not the object itself. The threat was never fatal, even for the sortal hypothesis. Arguably, multimodal binding is not necessary for object identification. Hence, one can identify the object before multimodal binding. Yet, the fact that one does not need first to identify the object that is perceived before binding the information allows multimodal binding to play a role in the recognition of the object. If the application of a sortal concept is not a prerequisite of binding, then it can result from it. One then finds out what the object is on the basis of the multimodal experience. For example, in the rubber hand illusion, it seems unlikely that the participants feel that their hand is F, see that the rubber hand is F, erroneously judge that the rubber hand is their own hand, and only then integrate what they feel with what they see. It may rather be the reverse. Participants do not identify the rubber hand as their hand and then experience the illusion; rather, they experience the illusion and only then do they judge whether the rubber hand is their own. The identification of the rubber hand as one’s own is not a prerequisite of visuosomatosensory binding; it is a consequence of it. The second asset of the proto-object hypothesis is that it removes another potential worry that the sortal hypothesis faces. Since Fodor (1983), it has been traditionally accepted that perceptual systems are informationally encapsulated modules, that is, they are insensitive to beliefs.
For instance, many familiar visual illusions persist even when the perceiver knows that they are illusions. One may then argue that if multimodal binding required something like conceptual identification, then multimodal binding could not occur at the perceptual level, only later at a postperceptual stage. The only way to avoid this conclusion is to accept a kind of cognitive penetration of perceptual processes.15 By contrast, a proponent of the proto-object hypothesis does not have to settle the complex debate over cognitive penetration. The proto-object hypothesis does not involve top-down processes. The individuation of the proto-object is perceptual. Hence, the whole process of multimodal binding, from the selection of information to the integration, can occur at the perceptual level. To recap, I have argued that neither the spatial hypothesis nor the sortal hypothesis can account for integrative binding. We have seen that


integrative binding does not occur in the two objects/one location condition (as in the case of the phantom limb), whereas it occurs in the one object/two locations condition (as in the case of prism adaptation). Hence, the binding parameter of location is neither sufficient nor necessary, and what matters is the prior individuation of objects. However, although William James said, “Everyone knows what an object is,” there is no commonly agreed upon definition. In particular, one must distinguish between objects per se and proto-objects. Individuation does not imply identification. I have argued that one can dispense with the conceptual recognition of the object. This is not to say that there are no cognitive factors, but they do not require the application of sortal concepts. I thus argue that the multimodal binding parameter is at the intermediary level between spatial individuation and conceptual identification—the level of proto-object. According to the proto-object hypothesis, integrative binding requires a nonconceptual individuation of the common source of the perceptual experiences. To conclude, I have argued that multimodal binding, whether additive or integrative, constitutes a real asset for perception. In particular, integrative binding aims at maximizing reliability. Part of the satisfaction condition of binding is that information about the same object or event must be selected to the exclusion of information about other objects or events. I have argued that one does not need to appeal to the localization of the common source of the information nor to its conceptual identification. Rather, all that is needed is to individuate the proto-object that constitutes the common source. Many factors can then help the perceptual system to operate such individuation: spatial information, perceptual similarity, prior experiences of co-occurrence between perceptual experiences, and so on. 
The fact that the individuation is permeable to this variety of factors shows that the perceptual system operates at the intermediary level of the proto-object, which cannot be defined as a simple location or sortal concept.

Acknowledgments

I would like to thank Tim Bayne for his helpful comments.

Notes

1. Some perceptual experiences, including bodily experiences, can even be said to be constitutively multimodal (de Vignemont, forthcoming).

2. Treisman (1998) proposes that attention is the mechanism that filters the relevant information: the perceptual system binds together properties that are currently attended. The mechanism may be attentional in multimodal binding as well. Indeed, attention does not operate within each modality in a strictly encapsulated fashion (Driver & Spence, 1998a). However, there is now a whole debate concerning the role of attention for binding and more particularly whether it is a prerequisite. Some evidence seems to indicate that binding is possible in the absence of attention. As Driver and Spence (1998b) noted, it would be “highly maladaptive” to be unable to integrate information without first focusing attention on a particular location. On the contrary, their results seem to indicate that crossmodal integration precedes crossmodal attention. Although it is interesting, I shall leave this debate aside. Instead, in the last section of the chapter I will focus only on the second question of the basis of the selection.

3. To be distinguished from the encoding problem.

4. Further useful distinctions can be found in Macpherson (2011) and Spence and Bayne (2014).

5. For example, I experience my left hand being touched on the right side of my body on the basis of the automatic remapping of tactile sensations from a body-centered frame to an external eye-centered frame. Multisensory remapping requires past experiences in another modality in order to learn its specific format such as its spatial frame of reference. Once the format is learned, one can translate or recode the format of one modality into that of another modality even in the absence of current signal coming from the other modality. Perceptual experiences in one modality during development can thus have long-term consequences on perceptual experiences in a different modality. For more details, see de Vignemont (forthcoming).

6. One may reject the mereological model of the unity of consciousness, like Michael Tye (2003). But the question here is whether you accept the mereological model for the other levels once you have accepted it at the level of unity of consciousness.

7. I will not discuss here Campbell’s theory of the role of attention for reference.

8. This has the advantage of avoiding a visuocentric conception of binding, which is often dominant in the literature.

9. The sortal hypothesis may look close to the delineation hypothesis discussed by Campbell (2002). They both highlight the role of sortal concept, but not at the same level. The sortal hypothesis concerns binding, whereas the delineation hypothesis concerns demonstrative reference and conscious attention.

10. For further criticisms of the spatial criterion for visual binding, see Matthen (2006), Raftopoulos and Muller (2006), and Raftopoulos (2009).

11. Advocates of space-based parsing often agree with this conclusion (see, e.g., Treisman, 1999). Campbell (2002) briefly notes that the requirements for the binding parameter vary depending on the sort of object. For example, an object that moves, like a man, requires a binding parameter that can keep track of the object through movement, unlike an object that never moves, like a valley. The perceptual system then uses a “complex binding parameter” (Campbell, 2002, 62; see also Campbell, 2006a, 2006b).

12. The illusion did not work if a box was stroked rather than the empty space.

13. The authors conclude that there may be some similarities between the invisible hand illusion and the experience of phantom limbs. One may indeed claim that phantom limbs resist the lack of visual feedback because of prior experience of somatosensory information with no visual information. Roughly speaking, there is nothing unusual in phantom limbs. That is what we experience all the time when we do not see our limbs. The difference, however, is that.

14. The proto-object hypothesis also has specific epistemological implications in the case of first-person judgments. Many (if not all) self-ascriptions of bodily properties result from multimodal integration. If multimodal integration required self-identification, then those self-ascriptions could not be immune to error through misidentification relative to the first person. However, if the proto-object hypothesis is true, then multimodal bodily judgments can be immune to error. For further details, see de Vignemont (2012).

15. For discussion of cognitive penetration, see, e.g., Macpherson (2012) and Deroy (2013).

References

Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation. Oxford: Oxford University Press.
Botvinick, M., & Cohen, J. (1998). Rubber hands “feel” touch that eyes see. Nature, 391, 756.
Campbell, J. (2002). Reference and consciousness. Oxford: Clarendon Press.
Campbell, J. (2006a). Does visual attention depend on sortal classification? Reply to Clark. Philosophical Studies, 127, 221–237.
Campbell, J. (2006b). What is the role of location in the sense of a visual demonstrative? Reply to Matthen. Philosophical Studies, 127, 239–254.
Clark, A. (2004). Feature-placing and proto-objects. Philosophical Psychology, 17(4), 443–469.
Deroy, O. (2013). Phenomenal contrast without cognitive penetrability of perception. Philosophical Studies, 162, 87–107.
de Vignemont, F. (2012). Bodily immunity to error. In F. Recanati & S. Prosser (Eds.), Immunity to error through misidentification. Cambridge: Cambridge University Press.
de Vignemont, F. (Forthcoming). A multimodal conception of bodily awareness. Mind.
Driver, J., & Halligan, P. W. (1991). Can visual neglect operate in object-centred coordinates? An affirmative single case-study. Cognitive Neuropsychology, 8(6), 475–496.
Driver, J., & Spence, C. (1998a). Attention and the cross-modal construction of space. Trends in Cognitive Sciences, 2(7), 254–262.
Driver, J., & Spence, C. (1998b). Crossmodal links in spatial attention. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 353, 1319–1331.
Driver, J., & Spence, C. (2000). Multisensory perception: Beyond modularity and convergence. Current Biology, 10(20), R731–R735.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501–517.
Ernst, M. O. (2006). A Bayesian view on multimodal cue integration. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out. New York: Oxford University Press.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Guterstam, A., Gentile, G., & Ehrsson, H. H. (2013). The invisible hand illusion: Multisensory integration leads to the embodiment of a discrete volume of empty space. Journal of Cognitive Neuroscience, 25, 1078–1099.
Kammers, M., de Vignemont, F., Verhagen, L., & Dijkerman, H. C. (2009). The rubber hand illusion in action. Neuropsychologia, 47(1), 204–211.
Lackner, J. R. (1988). Some proprioceptive influences on the perceptual representation of body shape and orientation. Brain, 111, 281–297.
Macpherson, F. (2011). Cross-modal experiences. Proceedings of the Aristotelian Society, 111(3), 429–468.
Macpherson, F. (2012). Cognitive penetration of colour experience: Rethinking the issue in light of an indirect mechanism. Philosophy and Phenomenological Research, 84(1), 24–62.
Marshall, J. C., & Halligan, P. W. (1994). The yin and the yang of visuo-spatial neglect: A case study. Neuropsychologia, 32(9), 1037–1057.
Matthen, M. (2006). On visual experience of objects. Philosophical Studies, 127, 195–220.

Mitchell, S. W. (1871). Phantom limbs. Lippincott’s Magazine of Popular Literature and Science, 8, 563–569.
Neisser, U., & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7, 480–494.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160–174.
Pouget, A., Deneve, S., & Duhamel, J.-R. (2002). A computational perspective on the neural basis of multisensory spatial representations. Nature Reviews: Neuroscience, 3, 741–747.
Pylyshyn, Z. (2001). Visual indexes, preconceptual objects, and situated vision. Cognition, 80, 127–158.
Pylyshyn, Z. (2003). Seeing and visualizing: It’s not what you think. Cambridge, MA: MIT Press.
Radeau, M., & Bertelson, P. (1977). Adaptation to auditory-visual discordance and ventriloquism in semirealistic situations. Perception and Psychophysics, 22, 137–146.
Raftopoulos, A. (2009). Reference, perception, and attention. Philosophical Studies, 144, 339–360.
Raftopoulos, A., & Muller, V. (2006). Non-conceptual demonstrative reference. Philosophy and Phenomenological Research, 72(2), 251–286.
Ramachandran, V. S., & Blakeslee, S. (1998). Phantoms in the brain. London: Fourth Estate.
Ramachandran, V. S., & Rogers-Ramachandran, D. (1996). Synaesthesia in phantom limbs induced with mirrors. Proceedings of the Royal Society of London, 263, 377–386.
Shams, L. (2012). Early integration and Bayesian causal inference in multisensory perception. In M. M. Murray & M. Wallace (Eds.), The neural bases of multisensory processes. Boca Raton, FL: CRC Press.
Shams, L., & Kim, R. (2010). Crossmodal influences on visual perception. Physics of Life Reviews, 7, 269–284.
Spence, C., & Bayne, T. (2014). Is consciousness multisensory? In M. Matthen (Ed.), Oxford handbook of philosophy of perception. Oxford: Oxford University Press.
Stein, B. E., & Meredith, M. A. (1993). The merging of the senses. Cambridge, MA: MIT Press.
Stratton, G. M. (1899). The spatial harmony between sight and touch. Mind, 8(4), 492–505.

Takaiwa, A., Yoshimura, H., Abe, H., & Terai, S. (2003). Radical “visual capture” observed in a patient with severe visual agnosia. Behavioural Neurology, 14(1–2), 47–53.
Taylor-Clarke, M., Kennett, S., & Haggard, P. (2002). Vision modulates somatosensory cortical processing. Current Biology, 12, 233–236.
Treisman, A. (1998). Feature binding, attention and object perception. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 353(1373), 1295–1306.
Treisman, A. (1999). Feature binding, attention, and object perception. In G. Humphreys, J. Duncan, & A. Treisman (Eds.), Attention, space, and action. Oxford: Oxford University Press.
Tye, M. (2003). The unity of consciousness. Cambridge, MA: MIT Press.
Welch, R. B. (1999). Meaning, attention, and the “unity assumption” in the intersensory bias of spatial and temporal perceptions. Advances in Psychology, 129, 371–387.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88, 638–667.
Wiggins, D. (2001). Sameness and substance renewed. Cambridge: Cambridge University Press.

7 Can Blue Mean Four?

Jennifer Matey

1 Introduction

In Art as Experience, John Dewey wrote,

When we perceive by means of the eyes as causal aids, the liquidity of water, the coldness of ice, the solidity of rocks, the bareness of trees in winter, it is certain that other qualities than those of the eye are conspicuous and controlling in perception. And it is as certain as anything can be that optical qualities do not stand out by themselves with tactual and emotive qualities clinging to their skirts. If we hear a rich and haunting voice, we feel it immediately as the voice of a certain kind of personality.1

These observations anticipate two concerns at the forefront of contemporary debates in philosophy of mind. Do conscious experiences, experiences that there is something that it is like to have, inform perceivers about the world? If they do, what type of information about the world can conscious experiences convey? With respect to the first question, a common view is that conscious experiences can carry information or “content” about the world that is of the sort that could be judged to be accurate or inaccurate.2, 3 This influential idea is sometimes put as the view that conscious experience can be representational. Focusing on the case of conscious perceptual experience, this paper takes up the question: what sort of information about the world can perceptual experiences carry, or, what properties can they represent? The most conservative proposal, most often attributed to Tye and Dretske, but probably more widespread than it is commonly taken to be,4 is that in perception we only represent objects to have so-called low-level properties, those that can be directly transduced by the sensory modalities.5 For vision, this amounts to shaped colors arranged in some spatial dimension, but not much else. A variety of more liberal proposals have also been offered. Some recognize the representation of properties over and above what the conservative view would allow in a general way.6 Other accounts support views about the representation of specific “high-level” properties such as causation,7 being an instance of a natural kind (e.g., a pine tree),8 uses or functions for a perceiver such as being edible or openable,9 certain mental states,10 and even moral properties.11 This chapter continues in the vein of such projects. Drawing on the perceptual condition of synesthesia, I argue here that visual experiences can represent visual items as instances of a general type such as a number or letter. From the fact that some synesthetes represent alphanumeric values in perceptual character, we can infer, more generally, that conscious perceptual experiences can represent objects as falling into fairly abstract conceptual categories.

2 Synesthesia

In synesthesia, a stimulus that is typically associated with one sensory modality, such as a sound, induces additional modality-specific experiences that are often, but not always, associated with another modality.12, 13 For example, the sound of a chime could induce a specific color experience (called a color “photism”). While synesthesia often occurs between different sensory modalities, at least one form of synesthesia, grapheme-color synesthesia, is intramodal. Grapheme-color synesthetes systematically experience particular color photisms when presented with particular alphanumeric graphemes such as an “F” or a “4.”14 In many instances of grapheme-color synesthesia, the formal properties of the inducing grapheme seem sufficient to induce synesthetic color experiences.
Researchers refer to this as “lower synesthesia.” But in a small subset of grapheme-color synesthetes, color photisms seem to correlate with the inducer’s associated meaning, for example, the numerical value of the grapheme rather than just its shape. This latter phenomenon has been called “higher synesthesia.” For instance, it has been demonstrated that higher synesthetes have identical color photisms when perceiving diverse symbols that have the same numerical value, for example, an Arabic numeral “5,” a Roman numeral “V,” and an array of five dots all might correlate with blue photisms for a five-blue synesthete.15 Moreover, some grapheme-color synesthetes have different color experiences when shown ambiguous graphemes (such as “13”) according to how the grapheme is interpreted. The synesthete may have a green photism when the grapheme is situated in the series “12, 13, 14,” but a red photism when it is situated in the series “A, 13, C” (presumably when the grapheme is taken as a “B”). This reveals that changes in synesthetic color experiences correlate with changes in the semantic significance ascribed to the grapheme (as opposed to merely its perceived shape). Also, some grapheme-color synesthetes have different color photisms when shifting from attending to an image as a “5” to attending to it as an array of “2’s” (see figure 7.1).16 These results provide evidence that some synesthesia is triggered by the numerical values of graphemes. But I want to go a step further and show that these numerical values don’t just trigger photisms; they can also enter into the representational contents of higher grapheme-color synesthetes’ photisms.17

[Figure 7.1: an array of “2”s that can also be seen as a “5.”]

3 Synesthesia and Mathematical Savantism

A small subset of number-color synesthetes have demonstrated a heightened ability to store and recall complex numerical information along with an unusual facility with mental arithmetic. These “synesthete-savants,” as we will refer to them, report that color photisms play a critical role in facilitating both abilities. The point here is to demonstrate that the view that their photisms represent numerical values or numbers explains the facilitative effect of photisms on their numerical abilities better than alternative proposals do. British-born Daniel Tammet is one such synesthete-savant.18 Tammet reports that he experiences each whole number up to ten thousand as a unique colored shape. He writes, “I can visualize the numbers as meaningful shapes,” and, “my mind perceives numbers as complex, multidimensional, colored and textured shapes.”19 Indeed, controlled studies confirm Tammet’s reports by showing that his photisms are both systematic and highly consistent over time, common indicators of the perceptual reality of synesthesia.20

One of Tammet’s unusual abilities is reciting pi from memory up to 22,514 digits. He credits his ability to store and retrieve these digits to his photisms, which he claims facilitate storage by providing a means of chunking numerical information. He writes:

It is much easier to conceive of the possibility that a human mind might be capable of recalling over 22,500 consecutive digits of Pi, particularly when, as in my case, it is able to “chunk” groups of numbers spontaneously into meaningful images that constitute their own hierarchy of associations.21

Although the amount of information retained in memory is limited, “chunking” or consolidating individual items into a single meaningful unit allows us to store a greater amount of information. For instance, we can more easily retain a ten-digit telephone number by grouping the digits into three meaningful chunks (a three-digit area code and so on). Consistent with Tammet’s report, for the synesthete-savant, a single photism could act as a means to chunk numerical information by representing or carrying information pertaining to a complex number or series of numbers. Mental arithmetic is another skill at which synesthete-savants such as Tammet excel. Here, once again, nonsavants are limited by a finite memory. To perform mental arithmetic, all of the numerical information concerning the problem to be solved must be retained in working memory while the process of mental computation is completed. This includes both the original problem information as well as all of the intermediary results that are arrived at during the computational process. Typical nonsavants, however, can only retain three to four items in working memory at a time.22 This makes it difficult, or practically impossible, for most to work efficiently with large numbers. Recall that while working out a problem such as 42,239 multiplied by 62,349, the digits to be computed as well as the partial results arrived at along the way must all be retained.23 To see this point, consider how much easier it is to multiply two two-digit numbers than it is to multiply two four-digit numbers, despite the fact that the same mathematical knowledge applies in both cases. Again, representing numbers by experiences of colored shapes could confer some advantage to the synesthete-savant’s performance of mental arithmetic by providing him or her with an effective means of chunking numerical information. 
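The working-memory point above can be made concrete with a rough count. The sketch below is illustrative only. It assumes schoolbook long multiplication, treats each operand digit as one item to be held in memory, and treats a chunked operand as a single item. The function names (`digit_count`, `operand_items`, `digit_products`) are hypothetical and introduced here for illustration; nothing in the sketch models how Tammet or any other person actually computes.

```python
def digit_count(n: int) -> int:
    """Number of decimal digits in n (sign ignored)."""
    return len(str(abs(n)))


def operand_items(a: int, b: int, chunked: bool = False) -> int:
    """Items needed just to hold the two operands.

    Assumption: unchunked, every digit is a separate item;
    chunked, each whole number counts as a single item.
    """
    if chunked:
        return 2
    return digit_count(a) + digit_count(b)


def digit_products(a: int, b: int) -> int:
    """Single-digit multiplications in schoolbook long multiplication."""
    return digit_count(a) * digit_count(b)


if __name__ == "__main__":
    # Columns: operands, digit-by-digit load, chunked load, partial products
    for a, b in [(12, 34), (1234, 5678), (42239, 62349)]:
        print(a, b,
              operand_items(a, b),
              operand_items(a, b, chunked=True),
              digit_products(a, b))
```

On these assumptions, two four-digit operands occupy eight items when held digit by digit but two items when chunked, and the number of single-digit products grows from 4 for two two-digit numbers to 16 for two four-digit numbers.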
For instance, when Tammet multiplies two four-digit numbers, because he can represent many numbers by a single photism, he need only retain two items in working memory, as opposed to eight. But although the ability to retain all of the problem information in working memory is necessary for the savant’s extraordinary computational abilities, it is not sufficient. Synesthete-savants are also typically quicker than nonsavants at performing mental arithmetic using smaller numbers that should not put a strain on the resources of ordinary working memory. This is particularly striking when savants are presented with novel, computationally intensive math problems. Again, Tammet’s self-reports point to the photisms’ facilitative role in the actual process of mental computation. He describes mental computation facilitated by these colored shapes as “rapid, intuitive and largely unconscious.”24 Moreover, he connects the facilitative role of photisms to the fact that the photisms themselves represent numbers. He writes:

I know these semantic relations between numbers as I know the relations between words, because I can visualize the numbers as meaningful semantic shapes. … Being able to visualize numbers helps me to see and to understand the various interrelations between them.25

Theoretical considerations lend some support to Tammet’s reports. Numerical information manipulated during mental arithmetic is encoded and stored in working memory in both verbal26 and visual27 formats. Studies show that visually encoded information is more important for mental arithmetic than is verbally encoded information.28 For instance, it has been demonstrated that people with average and poor mathematical ability differ in their capacities to visually represent numbers in working memory but not in their capacities to represent them verbally.29 Also, people who score higher in mathematical ability tend to have an enhanced capacity for numerical representation and to be more efficient at visually representing numbers.30, 31 The idea is that representing numbers with photisms might provide synesthetes with a particularly good means for visually encoding information. Color experiences are likely more vivid than the more typical form of visual encoding by orthographic form and when used in conjunction with form could play a reinforcing role. For instance, study participants report that their mathematical images become less vivid when the visual components of working memory are loaded with visual information irrelevant to the task. Loading also correlates with a decline in arithmetical skill. This correlation between vividness of experience and facility with mental arithmetic is consistent with the idea that more vivid representations of numbers (photisms) can have a facilitating effect on mental arithmetic. Moreover, the apparently greater vividness of synesthetes’ color photisms could relate to the photisms’ ability to carry more information. The ability to carry more complex information would explain the synesthete-savant’s enhanced computational speed. One promising way of characterizing numerical representations is by the view that such representations are at least partially individuated by, or even identical to, their conceptual roles, the roles they play in cognition, reasoning, decision making, and reporting.32 To represent a specific number, say four, is to have a representation that plays a specific cognitive role. For instance, numbers have relations to other numbers (4 < 5 < 6). We should then expect that numerical representations would play the role of prompting judgments or facilitating reasoning about numbers in a way that reflects these relations. Numerical representations can be more or less determinate or complex. In this view, a more determinate or complex numerical representation would specify more of the number’s relations to other numbers. More complex representations might, in these subjects, facilitate more complex mathematical reasoning by prompting judgments about more of the numbers’ complex relationships to other numbers. The synesthete-savant may differ from nonsavants, then, because photisms enable the savant to have very complex and determinate numerical representations of even very large numbers that carry information about the complex ways those numbers relate to other numbers.33 Because the savant’s numerical representations would already specify this complex set of relations, the savant’s mathematical reasoning can be very quick and automatic. In summary, the hypothesis offered is that vivid photisms that exploit the resources of visual memory can carry more information about numbers by representing more cognitive roles relating numbers to other numbers.
Some synesthete-savants report that photisms facilitate their heightened ability to retain numerical information and to very quickly perform highly complex mental computations.34 Some empirical considerations have been offered to show how representing numbers by photisms might enable synesthetes to circumvent some of the limitations on these skills by providing an effective means for chunking and by representing information about a number’s complex relations to other numbers, facilitating quick arithmetical judgments.

4 Inferentialism and Associationism

The previous section offered a view on how photisms’ representation of numbers could explain synesthete-savants’ facility with the performance of mental arithmetic. That view is also consistent with self-reports of synesthete-savants, who attribute their abilities to their color photisms. An alternative explanation, however, may be that photisms themselves don’t represent numbers, but rather that synesthete-savants represent numbers in the content of independent thoughts that are derived from photisms. The aim of this section is to cast doubt on the adequacy of this alternative account. If the representational view can account for the facilitative role of photisms, and the alternatives cannot, then the reasonable thing is to adopt the representational view. Let’s consider the alternative view. There are two kinds of cognitive processes by which the representation of numbers might be derived from the photism. Representations of numerical value could be derived from percepts by a routinized associative process: a percept of a numeral “4” leads automatically to a four-belief. Or, based on the photism together with a background belief, we could actively infer that a digit is a certain number, say a “4.” We will assume that by inference we mean conscious inference. If the inference were automatic and preconscious, then this view could likely be reduced to the former associative view. To start, the inferential view is particularly untenable. For one thing, this view is not consistent with Tammet’s self-reports about his experience. Tammet describes his mental computations as “rapid, intuitive and largely unconscious.” While it is possible that Tammet’s reports do not reflect his mental reality, all else being equal, it is customary to prefer a theory that grants authority to self-reports. There are additional problems with this view. Synesthetes are subject to special Stroop effects, showing that associations of numbers with colors are accomplished very early in perceptual processing. Higher number-color synesthetes were shown indicators of numbers such as a hand displaying a particular quantity of fingers painted in various colors. They were quicker at identifying the quantity of fingers displayed when they were painted the color ordinarily associated with that number. This suggests that numbers associated with photism color are processed and influential prior to conscious judgments.
Inference is also more cognitively costly than association, and not only are associative processes quicker, they are oftentimes more accurate. For these reasons, it is advantageous for the cognitive system to connect commonly performed inferential processes with routinized associations. Much of learning is just that—a set of propositions and various if-then relationships internalized and routinized into a set of associations. The result is an expertise or skill. If this happens in most commonly repeated inferences, then there is reason to think that it would happen in the case of higher synesthesia as well. Can, then, the associative view better explain the computational advantage conferred by photisms?

We said that there seemed to be two skills that distinguish synesthete-savants from nonsynesthetes: the ability to retain more numerical information in memory, including in working memory, and the enhanced speed of their ability to mentally compute very complex problems. Let’s look at whether the view that savants derive representations of numbers from photisms by an automatic routinized process of association handles these two abilities better than the inferential view and the representational proposal advanced here. Performing mental computation of large numbers requires that a good deal of numerical information be retained in working memory, and working memory can only hold three to seven items at a time. As previously discussed, on the representational view, synesthete-savants can circumvent this problem because photisms provide a way to “chunk” and so store more numerical information. Mere association between photism and number, however, wouldn’t provide any advantage here. For the synesthete-savant to appreciate, much less solve, a complex problem, numbers would still first have to be derived from photisms, leaving the synesthete with a working memory taxed to a level comparable to that of a neurotypical subject. Moreover, the ability to retain more in working memory is necessary, but it is not sufficient to account for the relative difference in computational speed that we see between synesthete-savants and nonsynesthetes. To underscore this point, note that synesthete-savants’ superior mental computation abilities are evident even in the computation of smaller numbers, such as in multiplying two two-digit numbers. In the representational view, vivid photisms carry information about a number’s complex set of relations to other numbers. But in the associative view, after numbers are derived from photisms, the performance of mental arithmetic is ultimately accomplished without photisms, in the same way as it is in neurotypical subjects.
This view, therefore, does not offer a corresponding explanation for the synesthete-savants’ increased computational speed. We will now consider a few additional objections to the view that some synesthetes represent numerical value in perceptual experience.

5 Objections

It has been argued that the view that photisms represent numbers in synesthete-savants has explanatory power and can account, better than alternative views, for the relative abilities of synesthete-savants at mental arithmetic. But is this view vulnerable to any conclusive objections? In the remainder of this chapter, I want to consider potential objections to the argument presented.35 The first objection targets the assumption, required in our view, that the content of color experiences can consist of perceiver-relative disjuncts, multiple types of properties including those that do not seem obviously visual (e.g., numerical values). When a neurotypical perceiver has a color experience that ascribes color quality q, intuitively, he or she represents that the perceptual object it is ascribed to has a particular spectral reflectance or color property. If the above argument is correct, then unlike neurotypical perceivers, the content of at least some synesthetes’ photisms is not exhausted by the color properties or spectral reflectance properties that they seem to ascribe. On our view, when the number-color synesthete has a color experience by which a digit is marked green, he or she visually represents the grapheme either to be green or representative of a particular number (or both).36 Our view, then, requires that visual experiences such as color experiences can have disjunctive contents, that they be capable of representing multiple properties, and that they be capable of representing different types of properties including those that may not be typically associated with color experience. There is, however, a precedent for the view that a single qualitative experience can represent multiple types of properties, including properties that we do not ordinarily take to be associated with color experiences. Consider studies by Bach-y-Rita implementing the tactile visual substitution system (TVSS).37 The TVSS takes in visual information about light spectral reflectance through a camera positioned on the subject’s head and converts the information to a series of tactile pulses given on the subject’s skin.
Subjects trained to interpret the information report experiencing the representations as primarily conveying information about objects located in the spatial field around the subject rather than as information about the objects that are causing the pulses on the skin. Moreover, several have noted that TVSS provides skilled users with information that is more like visual perception than tactile perception.38 Reports from blind individuals concerning their experiences perceiving with the use of their walking canes are similar to the reports of TVSS users. They report representing space using a cane in a way that is similar to the way that sighted people do. The primary objects of perception seem to be objects arranged spatially around one. Blind individuals do not report the object causing sensations of pressure on their hand as the primary object of perception. What is more, representations derived from TVSS are utilized to fill the same role in modulating behavior as visual representations play in sighted

individuals. For example, TVSS users have learned to hit a ball with a bat and to perform assembly-line work. These are both activities that require that one react quickly to information about the way objects are positioned in space, information about properties that are typically the subject of visual representations rather than tactile. The reaction speed required for performing these actions suggests that the processing upon which it depends is automatic; that is to say that the representations utilized do not depend on a series of long and drawn-out translations. Automaticity of processing is often taken to indicate that a process is noninferential, or at least noninferential in the relevant way. The fact that processing is quick and noninferential suggests that these tactile representations already indicate spatially located objects. This would set a precedent for the view that a sensory quality such as a color experience or a tactile sensation can represent properties that are not typically associated with that modality. The second objection that I want to consider here is that the view that synesthete-savants’ color experiences have disjunctive content may appear to yield counterintuitive results for the accuracy of some synesthetes’ visual experiences. I have proposed that for a given synesthete-savant s, a blue photism may have disjunctive content specifying a color or number and that it represents the grapheme that it projects color onto as either blue and/or a number “4.” If s encounters an actual blue numeral “7,” but erroneously has a blue photism that projects blue onto the digit, then the color experience represents the digit as either a “4” or to be the color blue (or alternatively to have the property typically associated with blue). It may appear then that our view entails that in this case, s’s experience is veridical. 
Intuitively, however, it seems better to count s’s experience as misrepresenting the “7.” The fact that our proposal apparently does not make sense of the intuitively erroneous nature of the synesthete’s experience in such a case might be taken to count against our view. It may be tempting to take the problem described above as motivating the view that synesthetes’ experiences have conjunctive instead of disjunctive contents. But this would not solve the problem either. In the conjunctive view, whenever a blue-four synesthete BF marks a blue body of water with a blue color quality, BF would thereby be misrepresenting the water to be blue and four. This result seems strange, though. Surely we want to say that a synesthete can veridically represent a colored object. The fact that we cannot appeal to conjunctive content appears to compound the problem for our view. This objection problematically reads too much into the proposal that has been offered here. Our proposal about the disjunctive nature of the

content of at least some synesthetes’ photisms can be implemented in two ways. One view is that each time a synesthete tokens a color experience t, t itself has the disjunctive content that the object it projects its color quality onto is either a particular color c or representative of a numerical value n (or both). This interpretation could yield the unhappy result discussed above. The alternative view is that color experiences have disjunctive content for a synesthete s such that whenever s’s experience instantiates a color quality q, q will either represent that the item is a particular color or that it has a particular numerical value, or q will represent that the item is a particular color and that it has some numerical value. This alternative interpretation does not incur the problems raised above. I have been careful to stay neutral with respect to which of these two versions of the view best describes the synesthete-savant. To the extent that one finds the objection above compelling, he or she may find it better to understand the proposal about the disjunctive content of synesthetic experiences in this latter way. 6 Conclusion In recent years there has been a lot of work devoted to better understanding the nature and scope of perceptual representation. Conservatives have argued that perceptual experiences can only represent properties such as colors, shapes, and their respective relations. It was argued here that perceptual experiences can, and in some cases do, represent numerical values.39 If photisms play the role of facilitating mental calculation that synesthete-savants claim they do, then it seems that the best explanation for this will be that synesthetes’ color experiences, or “photisms,” represent numerical values. Moreover, insofar as such synesthetes’ photisms project their colors onto digits, their experiences should represent the digits that color is projected onto as having the properties that photisms represent. 
The visual experiences of the synesthete-savant therefore represent the digits that color qualities are projected onto as having numerical values.40 Notes 1. Dewey (1934), 122. 2. Recent arguments for the view that conscious experiences have representational content are given by Pautz (2010), Searle (1983, 1992), Siegel (2005, 2010), and Siewert (1998). 3. People accept the view that perceptual experiences have representational content for a variety of reasons. Some are persuaded of the thesis based on the “transparent”

character of perceptual experience (e.g., see Horgan & Tienson, 2002; Tye, 1995, 30; cf. also Tye, 2000). Moreover, the commonsense view is that conscious perceptual experiences, like beliefs, may be either veridical or nonveridical. This is made sense of by the view that experiences have content, since content is taken to be equivalent to the conditions under which the experience would turn out to be veridical (Siewert, 1998). Finally, this view fits well with the common intuition that some beliefs are directly justified based on what is presented via the “testimony of the senses.” Intuitively, when I am standing in front of, for example, an ocean, I seem to be justified in a direct and immediate way in my belief that the world is a particular way based on what I see (Sellars, 1997; Brewer, 1999). 4. Bayne (2011) also attributes the conservative view to Clark (2000), Jackendoff (1987), Langsam (2000), Levine (1983), and Lormand (1996). 5. We can make a distinction between what we represent in the content of perceptual experience, and what we go on to believe based on those experiences. Tye and Dretske both hold that belief or judgment-independent visual experiences cannot represent high-level properties. For Dretske (1995), nonepistemic experiences are automatic, modular, sensory representations of objects that are impervious to beliefs or learning. To see an object as a particular kind of thing requires forming a judgment or belief about the object. Tye (1995) draws a sharp distinction between sensory/perceptual representations, which are modular, and the cognitive states that they feed into, such as beliefs and other conceptual states or “thoughts.” Unlike thoughts, perceptual states are thought to be nondoxastic and nonconceptual. 6. Crutchfield (2012); Siewert (1998). 
That number-color synesthetes represent objects to have semantic value should be of interest to philosophers of perception who are interested in the nature and content of perceptual experience as well as what those experiences represent. But the issue has significance for other philosophical projects as well. For instance, Siegel (2005) discusses the relevance for projects in epistemology concerning the justification of beliefs such as which beliefs are directly versus indirectly justified and what other mechanisms and considerations justification might depend on. Moreover, Bayne (2011) notes that whether or not we can represent high-level properties in perceptual character may also bear on the metaphysical problem of determining whether and how something seemingly immaterial like consciousness can be accounted for in material terms. Not all reductive theories of consciousness are compatible with the representation of semantic value. For additional discussion see Macpherson (2011). 7. Butterfill (2011); Siegel (2011). 8. Siegel (2005). 9. Nanay (2011). 10. McDowell (1982). For additional support for liberalism, see Carruthers (2000), Goldman (1993), and Pitt (2004).

11. E.g., see McGrath (2004), Cullison (2010), Copp (2001), and Watkins and Jolley (2002). 12. For a thorough discussion of both the scientific and philosophical literature on synesthesia, see Hermanson and Matey (2011). 13. Some work with a broader definition of synesthesia on which synesthetic experience can include systematic, nonmodal correspondences such as the assignment of genders to numbers. 14. Ramachandran and Hubbard (2001); Dixon, Smilek, and Merikle (2004). 15. Ramachandran and Hubbard (2001); Ward and Sagiv (2007). 16. Ramachandran and Hubbard (2001). Dixon et al. (2000) have studied an individual with higher synesthesia who has color photisms not merely when viewing visual items that have recognizable cognitive significance, but also when merely considering the numerical concept. 17. By “photism” I mean the unusual color experience typical of number-color synesthetes. There is reason to take the photisms of some synesthetes to be representational rather than as mere unintentional color qualia. Some synesthetes’ photisms have transparent- or world-presenting character, and transparency has been used as evidence of representation (Horgan & Tienson, 2002; Tye, 1995, 2000). For instance, Dixon, Smilek, and Merikle (2004) demonstrate that so-called “projector” synesthetes’ experiences present the world as if the external grapheme that triggers a photism has the color property of that photism. This is reinforced by the fact that when target graphemes that are associated by the synesthete with a particular color are presented as flanked by formally similar graphemes, the target tends to “pop out” saliently when the color of its flankers is incongruent with the color that the synesthete projects onto the significant grapheme. A cluster of such graphemes may form a salient group such as a triangle (Ramachandran & Hubbard, 2001). Cytowic (1993) also discusses projector synesthesia. 18. 
Although I focus on Tammet, there are numerous other synesthetes who claim to have savant abilities in virtue of their synesthesia. For instance, Jason Padgett experiences complex fractal images upon seeing many everyday objects and spontaneously understands the mathematical equations that the fractals are based on. For discussion, see Brogaard (2011). Baron-Cohen et al. (2007) suggest that having synesthesia may increase the likelihood for savantism. They note the established link between savantism and autism spectrum conditions and make the case that the co-occurrence of synesthesia along with autism spectrum conditions may increase the likelihood for savantism. 19. Tammet (2009), 140. 20. Baron-Cohen et al. (2007). This means that photisms he experiences as correlating with numbers that are similar (i.e., 99 and 999) have systematically similar

features and number-to-photism pairings are the same over multiple testings. In research on synesthesia, this systematicity and consistency are taken as indicators that photisms are perceptually real and not merely associated ideas. 21. Tammet (2009), 72. 22. Miller (1956) argued that the capacity of working memory was for roughly seven items, but many psychologists have since argued that the accurate number is more likely to be three to five items. 23. Hitch (1978). 24. Tammet (2009), 139–140. 25. Ibid., 141. 26. The working memory component that stores verbally encoded information is called the “phonological loop” (Baddeley & Hitch, 1974). The importance of the phonological loop in mental arithmetic for retaining information about the problem and partial results while the problem is solved is addressed by both Ashcraft (1995) and Heathcote (1994). Further evidence of the role of the phonological loop in mental arithmetic comes from Fuerst and Hitch’s (2000) studies in which subjects were exposed to repeated vocalization of a word while engaged in mental arithmetic. They found that when participants were required to retain problem information in working memory, the concurrent articulatory suppression, which researchers take as loading the phonological loop, interferes with mental arithmetic. When subjects were allowed visual access to the problem, the interference ceased. 27. The storage component responsible for retaining visually encoded information is called the “visuospatial sketchpad” (Baddeley & Hitch, 1974). For discussion of the role of the visuospatial sketchpad in mental arithmetic, see Logie, Gilhooly, and Wynn (1994) and Ashcraft (1995). Hope and Sherrill (1987) note that many engage in finger writing when performing mental arithmetic, indicating the role of visual images in mental calculation, particularly in retaining positional information about numbers in columns and information about carrying operations. 
Some studies show performance in arithmetic to be impaired when subjects are given a task involving spatial or visual information to perform concurrently. 28. Chincotta et al., 1999; Zago et al. (2001) make the stronger claim that mental calculation relies exclusively on visuospatial information and not verbally encoded information. They used positron emission tomography (PET) to arbitrate between models of mental calculation that emphasize linguistic or visuospatial representation. Their findings implicate visuospatial representation in both arithmetical fact retrieval and actual calculation. 29. McClean and Hitch (1999). 30. Dark and Benbow (1991).

31. Dark and Benbow (1991); McClean and Hitch (1999). See also Morris and Walter (1991) on the importance of visuospatial ability in the arithmetical performance of adults. 32. Accounts have been defended by Block (1986), Horwich (2005), Loar (1981), and Peacocke (1992). 33. Tammet’s reports seem consistent with the view that to entertain a numerical concept is to have a representation that plays such roles. He writes, “Using the example of 6,253 from the multiplication we just looked at, I can immediately ‘see’ that its shape derives from the combination of 13 × 13 (169) × 37. I factorize larger numbers (above 10,000) by separating them into meaningful (visualizable) related parts: for a number like 84,187, I separate it into 841 (29 × 29) and 87 (3 × 29), telling me immediately that the number is divisible by 29 as well as 2,903, a prime” (Tammet, 2009, 142–143). 34. See also Brogaard (2011). 35. Additional objections are addressed in Matey (2013). 36. Some question whether synesthetic experiences really are phenomenally like regular color experiences. Macpherson (2007) notes that some synesthetes can identify both the color quality synesthetes project onto a significant digit (say, blue) as well as the actual color that the digit is printed in (say, black). For further discussion of Macpherson’s view as well as a critical response, see Hermanson and Matey (2011). 37. Bach-y-Rita (1972, 1983, 1984, 1996). 38. Noe (2005). 39. It is one thing to ask what properties perceptual experiences are capable of representing and another to ask what properties they do represent. For some discussion of this distinction, see Macpherson (2011). My aim here is to focus on the first question about what properties can be represented in perception by arguing that in at least some cases, numerical values are represented. The discussion, however, should not be taken as irrelevant to the question about what properties perceptual experiences more commonly represent. 
For the relevance of considerations such as those presented here and what is more typically represented, see Matey (forthcoming). 40. One might now wonder whether there is any reason to think that claims about the representational nature of the visual experiences of synesthetes can generalize to speak about the representational capacities of visual experiences. I believe that the answer here is yes. Recent work suggests that we can draw general conclusions like this about representation by studying visual representations for synesthetes.

References Ashcraft, M. H. (1995). Cognitive psychology and simple arithmetic: A review and summary of new directions. Mathematical Cognition, 1, 3–34. Bach-y-Rita, P. (1972). Brain mechanisms in sensory substitution. New York: Academic Press. Bach-y-Rita, P. (1983). Tactile vision substitution: Past and future. International Journal of Neuroscience, 19(1–4), 29–36. Bach-y-Rita, P. (1984). The relationship between motor processes and cognition in tactile vision substitution. In A. F. Sanders & W. Prinz (Eds.), Cognition and motor processes (pp. 150–159). Berlin: Springer. Bach-y-Rita, P. (1996). Substitution sensorielle et qualia (Sensory substitution and qualia). In J. Proust (Ed.), Perception et Intermodalite (pp. 81–100). Paris: Presses Universitaires de France. Reprinted in English translation in A. Noe & E. Thompson (Eds.). (2002). Vision and mind: Selected readings in the philosophy of perception (pp. 497–514). Cambridge, MA: MIT Press. Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8). New York: Academic Press. Baddeley, A. D., & Logie, R. H. (1999). Working memory: The multiple-component model. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control. Cambridge: Cambridge University Press. Baron-Cohen, S., Bor, D., Billington, J., Asher, J., Wheelwright, S., & Ashwin, C. (2007). Savant memory in a man with colour form-number synaesthesia and Asperger syndrome. Journal of Consciousness Studies, 14, 243–245. Bayne, T. (2011). Perceptual experience and the reach of phenomenal content. In K. Hawley & F. Macpherson (Eds.), The admissible contents of experience. Oxford: Wiley-Blackwell. Block, N. (1986). Advertisement for a semantics for psychology. Midwest Studies in Philosophy, 10, 615–678. Brewer, B. (1999). Perception and reason. Oxford: Oxford University Press. Brogaard, B. (2011). 
A case of acquired synesthesia and savant syndrome after a brutal assault. CAS Grant Report. University of Missouri, St. Louis. Butterfill, S. (2011). Seeing causings and hearing gestures. In K. Hawley & F. Macpherson (Eds.), The admissible contents of experience. Oxford: Wiley-Blackwell. Carruthers, P. (2000). Phenomenal consciousness. Cambridge: Cambridge University Press.

Chincotta, D., Underwood, G., Ghani, K. A., Papadopoulou, E., & Wresinski, M. (1999). Memory span for Arabic numerals and digit words: Evidence for a limited capacity visuo-spatial storage system. Quarterly Journal of Experimental Psychology, 52A, 325–351. Clark, A. (2000). A theory of sentience. Oxford: Oxford University Press. Copp, D. (2001). Four epistemological challenges to ethical naturalism: Naturalized epistemology and the first-person perspective. Canadian Journal of Philosophy, 26, 31–74. Crutchfield, P. (2012). Representing high-level properties in perceptual experience. Philosophical Psychology, 25(2), 279–294. Cullison, A. (2010). Moral perception. European Journal of Philosophy, 18(2), 159–175. Cytowic, R. E. (1989). Synesthesia: A union of the senses (2nd ed.). Cambridge, MA: MIT Press. Cytowic, R. E. (1993). The man who tasted shapes. New York: Warner. Cytowic, R. E., & Eagleman, D. (2009). Wednesday is indigo blue: Discovering the brain of synesthesia. Cambridge, MA: MIT Press. Dark, V. J., & Benbow, C. P. (1991). Differential enhancement of working memory with mathematical versus verbal precocity. Journal of Educational Psychology, 83, 48–60. Dehaene, S. (1992). Varieties of numerical abilities. Cognition, 44, 1–42. Dehaene, S., & Cohen, L. (1995). Towards an anatomical and functional model of number processing. Mathematical Cognition, 1, 83–120. Dewey, J. (1934). Art as experience. New York: Capricorn Books. Dixon, M. J., Smilek, D., Cudahy, C., & Merikle, P. M. (2000). Five plus two equals yellow. Nature, 406, 365. Dixon, M., Smilek, D., & Merikle, P. M. (2004). Not all synesthetes are created equal: Projector versus associator synesthetes. Cognitive, Affective, and Behavioral Neuroscience, 4(3), 335–343. Dretske, F. (1995). Naturalizing the mind. Cambridge, MA: MIT Press. Fuerst, A. J., & Hitch, G. J. (2000). Separate roles for executive and phonological components of working memory in mental arithmetic. Memory and Cognition, 28, 774–782. 
Goldman, A. (1993). The psychology of folk psychology. Behavioral and Brain Sciences, 16, 15–28.

Heathcote, D. (1994). The role of visuo-spatial working memory in the mental addition of multi-digit addends. Current Psychology of Cognition, 13, 207–245. Hermanson, S., & Matey, J. (2011). Synesthesia. In Internet Encyclopedia of Philosophy, http://www.iep.utm.edu/synesthe/. Hitch, G. J. (1978). The role of short-term working memory in mental arithmetic. Cognitive Psychology, 10, 302–323. Hope, J. A., & Sherrill, J. M. (1987). Characteristics of skilled and unskilled mental calculators. Journal for Research in Mathematics Education, 18, 98–111. Horgan, T., & Tienson, J. (2002). The intentionality of phenomenology and the phenomenology of intentionality. In D. Chalmers (Ed.), Philosophy of mind: Classical and contemporary readings. Oxford: Oxford University Press. Horwich, P. (2005). Reflections on meaning. Oxford: Oxford University Press. Jackendoff, R. (1987). Consciousness and the computational mind. Cambridge, MA: MIT Press. Kim, C.-Y., Blake, R., & Palmeri, T. J. (2006). Perceptual interaction between real and synesthetic colors. Cortex, 42, 195–203. Langsam, H. (2000). Experiences, thoughts, and qualia. Philosophical Studies, 99, 269–295. Levine, J. (1983). Materialism and qualia. Pacific Philosophical Quarterly, 64, 354–361. Loar, B. (1981). Mind and meaning. Cambridge: Cambridge University Press. Logie, R. H., Gilhooly, K. J., & Wynn, V. (1994). Counting on working memory in arithmetic problem solving. Memory and Cognition, 22, 395–410. Lormand, E. (1996). Nonphenomenal consciousness. Noûs, 30, 242–261. Macpherson, F. (2007). Synaesthesia, functionalism, and phenomenology. In M. de Caro, F. Ferretti, & M. Marraffa (Eds.), Cartographies of the mind: Philosophy and psychology in intersection series: Studies in brain and mind (Vol. 4, pp. 65–80). Dordrecht: Springer. Macpherson, F. (2011). Introduction. In K. Hawley & F. Macpherson (Eds.), The admissible contents of experience. Oxford: Wiley-Blackwell. Matey, J. (2013). You can see what “I” means. 
Philosophical Studies, 16(1), 51–70. McClean, J. F., & Hitch, G. J. (1999). Working memory impairments in children with specific arithmetic learning difficulties. Journal of Experimental Child Psychology, 74, 240–260.

McDowell, J. (1982). Criteria, defeasibility, and knowledge. Proceedings of the British Academy, 68, 455–479. McGrath, S. (2004). Moral knowledge by perception. Philosophical Perspectives, 18(2), 159–175. Miller, G. A. (1956). The magical number seven, plus or minus two. Psychological Review, 63, 81–97. Morris, R. B., & Walter, L. W. (1991). Subtypes of arithmetic-disabled adults: Validating childhood findings. In B. P. Rourke (Ed.), Neuropsychological validation of learning disability subtypes. New York: Guilford Press. Nanay, B. (2011). Do we see apples as edible? Pacific Philosophical Quarterly, 92, 305–322. Noe, A. (2005). Action in perception. Cambridge, MA: MIT Press. Pautz, A. (2010). Why explain visual experience in terms of content? In B. Nanay (Ed.), Perceiving the world. Oxford: Oxford University Press. Peacocke, C. (1983). Sense and content. Oxford: Oxford University Press. Peacocke, C. (1986). Thoughts: An essay on content. Oxford: Blackwell. Peacocke, C. (1992). A study of concepts. Cambridge, MA: MIT Press. Pitt, D. (2004). The phenomenology of cognition, or, What it is like to think that P. Philosophy and Phenomenological Research, 69, 1–36. Ramachandran, V. S., & Hubbard, E. M. (2001). Synesthesia: A window into perception, thought, and language. Journal of Consciousness Studies, 8(12), 3–34. Searle, J. (1983). Intentionality: An essay in the philosophy of mind. Cambridge: Cambridge University Press. Searle, J. (1992). The rediscovery of mind. Cambridge, MA: MIT Press. Sellars, W. (1997). Empiricism and the philosophy of mind. Cambridge, MA: Harvard University Press. Siegel, S. (2005). Which properties are represented in perception? In T. Szabo Gendler & J. Hawthorne (Eds.), Perceptual experience. Oxford: Oxford University Press. Siegel, S. (2010). Do visual experiences have content? In B. Nanay (Ed.), Perceiving the world. Oxford: Oxford University Press. Siegel, S. (2011). The visual experience of causation. In K. Hawley & F. 
Macpherson (Eds.), The admissible contents of experience. Oxford: Wiley-Blackwell.

Siewert, C. (1998). The significance of consciousness. Princeton: Princeton University Press. Smilek, D., Dixon, M. J., Cudahy, C., & Merikle, P. M. (2001). Synesthetic photisms influence visual perception. Journal of Cognitive Neuroscience, 13, 930–936. Tammet, D. (2009). Embracing the wide sky: A tour across the horizons of the mind. New York: Free Press. Tye, M. (1995). Ten problems of consciousness. Cambridge, MA: MIT Press. Tye, M. (2000). Consciousness, color, and content. Cambridge, MA: MIT Press. Ward, J., & Sagiv, N. (2007). Synaesthesia for finger counting and dice patterns: A case of higher synaesthesia? Neurocase, 13(2), 866–893. Watkins, M., & Jolley, K. (2002). Pollyanna realism: Moral perception and moral properties. Australasian Journal of Philosophy, 80, 75–85. Zago, L., Pesenti, M., Mellet, E., Crivello, F., Mazoyer, B., & Tzourio-Mazoyer, N. (2001). Neural correlates of simple and complex mental calculation. NeuroImage, 13(1), 314–327.

8 Establishing Cross-Modal Mappings: Empirical and Computational Investigations Pawan Sinha, Jonas Wulff, and Richard Held

If the world of a newborn is indeed a “blooming, buzzing confusion,” as William James suggested, how does it eventually acquire coherence? Perhaps this question presupposes a fallacy; maybe the sensory world is not chaotic, but rather, ordered and coherent right from the outset. These fundamental issues were thrust into prominence with the formulation of Molyneux’s famous question. After having remained tantalizingly open for over three centuries, the question is now beginning to yield. Empirical data from a humanitarian effort in India that provides sight to congenitally blind children suggest that shape representations across sensory modalities are initially disparate, but a cross-modal mapping can be established within a few weeks through experience with the real world. These results have implications for understanding the genesis of sensory coherence in the human brain and also help guide the development of a computational model for linking sensory modalities. 1 The Historical Backdrop The late seventeenth century was a period of great debate on the nature-nurture question. Responding to the rationalists, John Locke (1690) was formulating his influential ideas on empiricism, which posited that the mind was a “tabula rasa.” It was an issue with broad ramifications, ranging from the origins of knowledge to the rights of the royals to rule over the commoners. A few decades later, Berkeley and Hume would carry these ideas forward and fuel the debate between empiricists and nativists. Against this backdrop unfolded the rather tragic family drama of William Molyneux, a natural philosopher and political writer in Ireland. By all accounts, Molyneux was a man with varied interests including metaphysics, politics, and optics. An inheritance gave him the latitude to pursue many of these avenues (O’Hara, 2004). However, his personal family

history was filled with sorrow. Of his three children, only one lived to be an adult. Just a dozen years into their marriage, his wife fell ill, became blind, and died. Perhaps it was his experience with blindness at such close quarters that led Molyneux to pose what has come to be regarded as one of the foremost questions in the philosophy of mind. It is a question that crystallized the broad nature-versus-nurture debate and set in motion attempts and discussions seeking to resolve it. The first historical record of the question is in a letter Molyneux wrote to John Locke in 1688 (described in Degenaar, 1996). He asked, Suppose a man born blind, and now adult, and taught by his touch to distinguish between a cube and a sphere of the same metal. … Suppose then the cube and sphere placed on a table, and the blind man be made to see: query, whether by his sight, before he touched them he could now distinguish and tell which is the globe, which the cube? (Locke, 1690)

The beauty of this formulation lies in its concreteness. By addressing this empirically approachable question, one could get some traction on the more abstract nature-nurture issue. However, during the initial outreach from Molyneux, the import of this question seems to have been lost on Locke. The latter did not respond and makes no mention of this question in the first edition of his monograph An Essay Concerning Human Understanding. However, this changed in the ensuing couple of years. Locke came to appreciate the significance of the puzzle and included it in the 1690 edition of his monograph with the preamble, “I shall here insert a problem of that very ingenious and studious promoter of real knowledge, the learned and worthy Mr. Molyneux, which he was pleased to send me in a letter some months since; and it is this … .” The rest, as they say, is history. From that point on, scholars of philosophy have referred to this query as Molyneux’s question and have placed it high in the collection of big issues about the mind. The Stanford Encyclopedia of Philosophy (Degenaar & Lokhorst, 2014) has this to say about the question: There is no problem in the history of the philosophy of perception that has provoked more thought than the problem that Molyneux raised in 1688. In this sense, Molyneux’s problem is one of the most fruitful thought-experiments ever proposed in the history of philosophy, which is still as intriguing today as when Molyneux first formulated it more than three centuries ago.

The implications of answering Molyneux’s query would be far-reaching. For Locke, Berkeley, Hume, and other empiricists, a positive answer to the Molyneux question would provide evidence in support of an innate conception of space common to both senses, sometimes called an amodal

Establishing Cross-Modal Mappings


representation. A negative answer, on the other hand, would suggest that cross-modal mapping results from a process of association between the senses derived from experience. The results, irrespective of how they turned out, would bear significantly on issues concerned with cross-modal identification and intermodal interactions, which are ubiquitous in daily life and also in the experimental literature (Degenaar, 1996; Newell et al., 2001; Jacomuzzi, Kobau, & Bruno, 2003; Lacey, Peters, & Sathian, 2007).

Given the great significance of the question (Morgan, 1977), it is not surprising that attempts to answer it began soon after its formulation. But the question proved to be a challenging one. As the next section makes clear, the inferences from many attempts were often overshadowed by the caveats inherent in the techniques they employed.

2 Attempts to Address the Molyneux Query

The first recorded attempt to answer Molyneux’s question is attributed to the great English surgeon, William Cheselden, who, coincidentally, was born in the same year that Molyneux formulated his question. In 1728, Cheselden had a unique opportunity to directly address the query. He was able to operate on a thirteen-year-old boy who had been blind from birth due to cataracts. In testing the boy’s visual abilities after the surgery, Cheselden found that “he knew not the shape of anything, nor any one thing from another, however different in shape or magnitude; but upon being told what things were, whose form he knew before from feeling, he would carefully observe, that he might know them again” (Cheselden, 1728).

Cheselden’s observations about the highly compromised nature of the boy’s vision seemed to offer an unequivocal answer to Molyneux’s query; they suggested that a blind man restored to sight would not be able to distinguish objects and would have to learn to see. However, reposing too much faith in this conclusion as a definitive answer to Molyneux’s query is fraught with risk.
The boy’s visual difficulties following removal of cataracts may have had little to do with limitations of visual development after blindness; instead, they may have resulted from low-level factors: damage to the eyes. Until well into the twentieth century, the procedure for removing cataracts was based on the crude technique of “couching.” This involved applying enough pressure on the lens (typically with a blunt needle inserted into the front of the eye) to make the lens break out of its capsule and fall into the posterior chamber of the eye. In principle, this removes the occlusion from the eye’s optical axis, but in practice, the trauma this procedure inflicts on the delicate tissues of the eye often leaves


P. Sinha, J. Wulff, and R. Held

the person blind even though the cataract has been removed. Given that this is the procedure Cheselden would have followed in treating the thirteen-year-old boy, we cannot be certain that the subsequent inability of the boy to see was just due to the brain not being able to understand the signals from the eyes; it may well have been due to damage to the eyes themselves. Furthermore, given the very limited ophthalmic assessment tools available in the eighteenth century, we do not know if the boy suffered from any other pathologies of the eye such as retinal degeneration, which may have limited his visual outcomes. In summary, without precise ophthalmic assessments and procedures, very little can be inferred from a negative result like Cheselden’s. An alternative approach that sidesteps some of these problems involves working with newborns. Instead of insisting on a blind person who has just gained sight, one could potentially work with a baby who has just been born and hence has had no prior visual experience other than that of diffuse light in the womb. Would a newborn show the ability to transfer tactile knowledge to vision? Andrew Meltzoff and Richard Borton adopted this approach (see also Streri & Gentaz, 2004). In an influential paper (Meltzoff & Borton, 1979), they reported finding that when presented with two pacifiers visually, one with a smooth surface and the other with a rough one, newborns exhibited a significant preference for the pacifier that they had previously been sucking on in complete darkness. This result led the authors to conclude, “Our experiments show that humans can recognize intermodal matches without the benefit of experience in simultaneous tactual-visual exploration.” Thus, work with newborns suggested that the Molyneux query may admit an affirmative answer in contrast to Cheselden’s conclusion. However, just like Cheselden’s result, this inference too has to be qualified with some caveats. 
The first one has to do with the age of the infants. The newborns that Meltzoff and Borton worked with were, on average, twenty-nine days old. Thus, while young, they were not entirely experientially naive, undercutting the strength of the claim that a visuotactile mapping can be established even in the complete absence of experience. Second, the results with neonates have proven to be controversial, with at least three groups failing to replicate Meltzoff and Borton’s original finding of cross-modal transfer in infants (Brown & Gottfried, 1986; Pêcheux, Lepecq, & Salzarulo, 1988; Maurer, Stager, & Mondloch, 1999). Given the mixed outcomes, the findings with neonates have not proven effective in resolving the Molyneux question.

It is perhaps also worth mentioning other results that have led some researchers to argue for a natural mapping between the senses. Köhler’s


well-known takete-baluma demonstration (Köhler, 1929) is an example of such a cross-modal mapping. Unprimed subjects across many cultures, some as young as two and a half years of age (Rogers & Ross, 1975; Ramachandran & Hubbard, 2001; Maurer, Pathman, & Mondloch, 2006), tend to spontaneously pair the word “takete” with a sharp-cornered, star-like shape, and “baluma” with a more rounded shape. There is apparently a natural homology between signals from the two senses; given the existence of the homology, it is not implausible to assume that sensitivity to cross-modal correspondence might be innate, suggesting by extension that Molyneux’s question might admit a positive answer. This is merely speculation, however. The actual takete-baluma results, though reliably reproducible, are derived from older populations with significant multimodal sensory experience. Results from those studies are not directly applicable to the situation Molyneux envisioned of testing an individual immediately after the onset of sight.

Given these caveats and methodological challenges, an answer to the Molyneux question has remained elusive. Let us return to the experimental approach that Molyneux had advocated in his initial formulation: initiating sight in a congenitally blind person. What is needed to operationalize this approach while sidestepping the challenges that Cheselden faced? The critical conditions for testing the Molyneux question are as follows:

We must find appropriate patient-subjects who are congenitally blind by clinical definition (that is, incapable of form discrimination) but with functional retina, optic nerve, and the visual pathways of the central nervous system. If the retina or the “back-end” neural mechanisms were not functional, the patient would be incapable of recovering form vision and being tested. This reduces the field of possible test cases to those blind due to occlusive pathologies such as dense bilateral cataracts or corneal opacities.
They may possess sufficient light-sensing capability so as to be able to tell night from day, but would be unable to resolve visual forms. Another necessary precondition of our subjects to test the existence of an innate cross-modal representation based on a Molyneux-type test is that both senses in question, touch and vision, must be independently functional after treatment. Molyneux probably presupposed that a newly sighted individual would have fully functional vision and touch, but a surgery to treat blindness may itself cause unintended injury to the eye that could severely compromise vision. Indeed, as we mentioned above, this was a key concern surrounding Cheselden’s attempts to address the Molyneux question. Therefore, surgical procedures should be such as to ensure that the ability of the eye to receive and communicate visual information to the brain is not compromised.


Finally, surgery to clear the occlusion and install refractive devices (such as intra-ocular lenses or external eyeglasses) must be followed as soon as possible, ideally immediately, by appropriate testing of visual discrimination and transfer of object discrimination from vision to touch. Patients must be mature enough to be capable of reliable discrimination testing.

Finding such patients is a tall order. Individuals who meet these criteria are extremely rare in western countries because they are detected in infancy and treated as early as possible. However, a recent initiative in India has offered renewed hope for finding and working with people in whom sight can be initiated late in life.

3 Project Prakash

India shoulders the world’s largest burden of childhood blindness. It is estimated that nearly half a million children in the country are either blind or severely visually impaired (Dandona & Dandona, 2003). The visual handicap, coupled with extreme poverty, greatly compromises the children’s quality of life; fewer than 50 percent of these children survive to adulthood. These numbers take on added poignancy when one notes that in 40 percent of the cases, the blindness is treatable or preventable. Most children, however, never receive medical care because the treatment facilities are concentrated in major cities, while 70 percent of the population lives in villages. These circumstances effectively ensure that a blind child in a financially strapped rural family will live a dark and tragically short life. For blind girls, the outlook is even more dire. Many are confined at home and denied contact with the outside world. This is clearly a humanitarian crisis that needs to be urgently addressed. To this end, Project Prakash seeks to identify and treat blind children and simultaneously build awareness amid the rural populace regarding treatable and preventable blindness.
Embedded in the humanitarian aspect of Project Prakash is an unprecedented opportunity to study one of the deepest scientific questions: how does the brain learn to extract meaning from sensory information? The humanitarian initiatives of Project Prakash are creating a remarkable population of children across a wide age range who are just setting out on the enterprise of learning how to see (Mandavilli, 2006; Sinha, 2013). We have begun following the development of visual skills in these unique children to gain insights into fundamental questions regarding object learning and brain plasticity (Ostrovsky, Andalman, & Sinha, 2006; Sinha & Held,


Figure 8.1 A Prakash child before and after gaining sight.

2012). This is a unique and unprecedented window into some of the most fundamental mysteries of learning and plasticity. On an applied note, as new eye treatments become available and existing treatments reach children who are currently blind, the basic question we have to confront is how to proceed with their integration into the sighted world. In this context, Project Prakash holds the potential for making a significant impact by directly assessing how extended visual deprivation influences children’s subsequent development of visual skills. This undertaking is a prerequisite for developing strategies to compensate for particular deficits.

In the specific context of the Molyneux question, working with the Prakash children after their surgeries offers an opportunity to examine the status of the visuotactile mapping immediately after the onset of sight.

4 Addressing the Molyneux Question with Prakash Children

In 2011, we presented our findings with newly sighted Prakash children on the Molyneux task (Held et al., 2011). Here we recapitulate the basic findings. Five subjects recruited from Project Prakash proved appropriate


for our study. Subjects YS (age: eight years), BG (age: seventeen years), SK (age: twelve years), and PS (age: fourteen years) presented with dense congenital bilateral cataracts. They were treated with phacoemulsification and an intraocular lens (IOL) implant. Subject PK (age: sixteen years) presented with congenital corneal opacities and was provided with a corneal transplant. Prior to treatment, subjects were able only to discriminate between light and dark, with subjects BG and PK additionally able to determine the direction of a bright light. None of the subjects were able to perform form discrimination.

Stimuli: Molyneux had proposed using just one pair of objects, a cube and a sphere, to assess visuotactile transfer. However, we decided to include many shape pairs in our stimulus set. The reason for doing this is straightforward. With a single shape pair, we can obtain just one response from a given subject, which has a 50 percent probability of being correct by random chance. Given that our subject pool comprises five patients, this high chance level does not give us the option of determining whether the responses are statistically significantly different from what one would get from random guessing. Also, no inferential statistical tests can be conducted for single subjects given such limited data. Hence, we necessarily needed to increase the cardinality of our stimulus set. Our stimulus set comprised twenty pairs of simple three-dimensional forms drawn from a children’s shape set. The forms were large (ranging from six to twenty degrees of visual angle at a viewing distance of twelve inches) so as to sidestep any acuity limitations of the subjects. They were presented on a plain white background so as to avoid any difficulties in figure-ground segmentation. With these choices for the stimulus set and presentation, our participants had little trouble locating and comparing these simple objects. Figure 8.2(a) shows a few of our stimuli.
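The chance-level argument can be made concrete with a little binomial arithmetic. The sketch below is illustrative only: the helper name `binom_tail`, the choice of a one-sided test, and the 18-of-20 score are assumptions introduced here, not the analysis or the data reported in the study.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """One-sided probability of k or more correct responses in n guesses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One shape pair, one subject: a correct answer is uninformative.
p_single_subject = binom_tail(1, 1)      # 0.5

# One shape pair, five subjects: even a perfect group score is weak evidence.
p_group_perfect = binom_tail(5, 5)       # 0.5 ** 5 = 0.03125

# Twenty pairs, one subject: a hypothetical score of 18 correct (not study data).
p_subject_18_of_20 = binom_tail(18, 20)  # 211 / 2**20, about 0.0002

print(p_single_subject, p_group_perfect, p_subject_18_of_20)
```

With a single pair, the best possible group outcome only just dips below the conventional 0.05 threshold and individual subjects cannot be assessed at all, whereas twenty pairs give each subject enough trials for guessing to be clearly distinguishable from reliable matching.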
Procedure: Subjects performed a match-to-sample task with the objects described above. One sample object was presented either visually or haptically, followed by the simultaneous presentation of the original object (target) and a distractor object in the modality matching the condition in the diagram (figure 8.2[b]). Subjects’ task was to identify the previously presented object. Subjects were seated in front of a table covered with a white, featureless sheet. For visual presentation of an object, the object was placed on the table approximately twelve inches away from the eyes, although subjects were free to adjust their distance or viewpoint. Subjects were not allowed to touch the object while viewing it. For tactile object presentation, the object was placed in the subject’s hands underneath the table where neither the


Figure 8.2 (a) Three shape pairs of the kind we used in our experiments. (b) The match to sample paradigm we followed in our studies. The within-modality tactile match to tactile sample task assesses haptic capability and task understanding. The visual match to visual sample task provides a convenient way to assess whether subjects’ form vision is sufficient for visually discriminating between test objects. The visual match to tactile sample task represents the critical test of intermodal transfer. (Adapted from Held et al., 2011.)

hands nor the object were visible to the subject; in addition, subjects were instructed to close their eyes. No feedback was provided to the subjects. Each of the three conditions entailed twenty trials. The order of presentations within each condition was randomized across subjects.

Results: As shown in figure 8.3, within two days following treatment all subjects performed near ceiling for the touch-to-touch condition (T-T) (mean: 98 percent) and for the vision-to-vision condition (V-V) (mean: 92 percent), meaning that the stimuli were easily discriminable in both modalities. In contrast, performance fell precipitously in the touch-to-vision condition (T-V), where performance was not statistically different from chance and significantly different from T-T and V-V performance (p < 0.001 and p < 0.004, respectively).

We had the opportunity to test three of the five subjects on later dates. Remarkably, performance in the T-V condition with novel test objects


Figure 8.3 Within-modality and cross-modality match to sample performance of five newly sighted individuals within two days after sight onset. The last panel shows average performance across all subjects. (Adapted from Held et al., 2011.)


Figure 8.4 Tactile sample to visual match performance improvement across two postoperative assessments. (Adapted from Held et al., 2011.)

improved significantly (p < 0.02) in as little as five days from the initial performance test post-treatment. Figure 8.4 shows the results. (Performance in the other two conditions, predictably, remained at ceiling.) Subjects had been given no training during the intervening period.

Given that our newly sighted participants did not exhibit an immediate transfer of their tactile shape knowledge to the visual domain, we are led to conclude that the answer to Molyneux’s question is likely “no.” This negative finding poses a puzzle when juxtaposed with evidence for the rapid acquisition of the visuotactile mapping. Since a few days’ time is likely too short for new neuronal pathways to be established, the more plausible account of these results is that the substrates responsible for cross-modal interaction precede their functionally useful linkage. Indeed, recent neurophysiological findings provide some support for this thesis. Single unit electrophysiology studies in nonhuman primates and human neuroimaging studies have shown that cortical areas traditionally considered to be unimodal are often multisensory (Falchier et al., 2002) or can be rendered rapidly so (Pascual-Leone et al., 2005; Merabet et al., 2008).

The learning-based account of establishing a cross-modal mapping brings up the question of whether this approach offers any advantages over an innately specified mapping. The most obvious answer is adaptability to change. As the optical, visual, haptic, and motor capabilities of an infant change over the course of the first few months and years, a learning-based system that can keep refining the mapping based on the nature of the inputs and their contingencies is preferable over one that starts with a


Figure 8.5 A Prakash subject’s segmentation of three natural images soon after treatment.

mapping that may well be wrong for a given individual’s sensory apparatus and environment. A dynamically learnable mapping will, in short, be best able to accommodate the specific sensory constraints of an individual and their changes over time.

To summarize, our results argue for a learning-based account for the establishment of cross-modal mappings. This leads to a natural question: what is the nature of such learning mechanisms? This question does not yet have a definitive answer, but we can offer an educated guess.

5 A Possible Mechanism of Cross-Modal Learning

Our proposal for how intermodal associations might be learned is based on the results from Project Prakash for intramodal learning. The perception of the world for newly sighted children is highly fragmented. Instead of parsing the world into cohesive objects, the children see it as a collection of many unrelated regions of different colors and luminances. Figure 8.5 shows a newly treated individual’s responses when asked to point to the objects in the scene. To a Prakash child soon after treatment, the world is seemingly a jigsaw puzzle where the relationship between the pieces is not evident. Interestingly, however, the introduction of dynamic information brings about a dramatic improvement in parsing skills. A similarity of motion trajectories of the fragments apparently helps bind them together into meaningful assemblies. Even more significantly, the usefulness of motion is not momentary; rather, it appears to act as a teaching cue to help the visual system learn the heuristics of image parsing that can be effective even in static images (Sinha et al., 2009; Ostrovsky et al., 2009). Studies with normally developing infants have also yielded similar findings (Kellman & Spelke, 1983; Johnson, 2004).

A fruitful analogy can be made between the intramodal parsing task and the intermodal mapping one. If we consider the regions discussed above


as stand-ins for attributes that could come from different modalities, then the task of region grouping translates to that of cross-modal linking. In this view, it follows that the primary cue for associating the attributes will be the dynamic trajectories they trace out. The notion of precisely what a “trajectory” is needs to be expanded, of course, to accommodate modality-specific attributes. Instead of a trajectory necessarily being defined as spatial position as a function of time, it could be any attribute over time. Some representative examples include amplitude of pressure sensor response, motion energy in a given region of the visual field, and the envelope of an auditory signal. The extent of mutual information (or even simply the correlation) across these time-series can help determine the level of association between two sensory signals that may arrive from the same modality or different ones. Held (2009) has also argued for such a mechanism for establishing cross-modal relationships. Having a common mechanism subserve both kinds of tasks has the appeal of parsimony and also does not require us to treat information from the different senses as conveying fundamentally different aspects of the environment.

To make these ideas more concrete, we consider a specific computational implementation in the multimodal domain of vision and audition. The question is: given a video sequence with an audio track, can we robustly determine which visual elements of the video are the ones associated with the sound even in the presence of significant signal degradation?

Detecting Audio-Visual Associations

Within a few weeks after birth, infants, when presented with two dynamic faces and a speech stream, look significantly longer at the “correct” talking person (Patterson & Werker, 2003). How does the infant brain solve this audio-visual analogue of the visuotactile Molyneux problem?
What makes this challenge especially daunting is the degraded quality of the sensory signals delivered by the immature nervous system. Also, the incomplete myelinization of the axons can significantly affect timing relationships between the signals. Can one reliably detect cross-modal relationships even in the presence of such complications? How far can we get through the use of similarities in the time-series corresponding to different sensory signals in a manner similar to what we described above for intramodal organization? Past computational analyses in the field of audio/video synchrony and automatic speaker detection (e.g., Hershey & Movellan, 2000) have demonstrated the strengths of this approach with high-quality sensory signals. Building on this foundation, the computational approach we adopt involves an estimation of similarity between many pairs of signals.


Specifically, in the present problem context, for every pixel in the image, we deduce the mutual information between the time-varying magnitude of the optical flow centered on that pixel and the audio signal energy (figure 8.6a). This yields a matrix of mutual information values across the entire image that can be visualized as a “heat map.” Figure 8.6b shows the results for a video of a talking head under a few variations. It is evident that the highest mutual information zone is concentrated around the mouth and chin region. Thus, even without any interaural time or level differences, similarities of temporal trajectories of the auditory and visual signals can effectively identify subsets of greatest cross-modal association.

Next we test the robustness of this approach on more realistically degraded sensory signals as might be delivered by the infant sensory system. Specifically, our goal is to analyze the dependence of audio-visual (AV) synchrony detection on

• Time lag between audio and video
• Size of window of integration
• Spatiotemporal resolution of input

Experiment 1: Misaligned Audio and Video

We introduced a time lag between the audio and the video. Positive values of time lag correspond to delayed audio signal. As the results and graph in figure 8.7 show, synchrony detection is robust for audio-visual misalignment ranging from zero to one hundred milliseconds.

Experiment 2: Size of Window of Integration

We computed correlation for various sizes of the sliding window. As figure 8.8 shows, we found that increase in size of window of integration caused a decrease in the absolute value of correlation and an increase in robustness, i.e., the ratio of correlation in mouth and background increased.
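The per-pixel mutual-information map and the time-lag manipulation of Experiment 1 might be sketched roughly as follows. Everything here is an assumption made for illustration: the synthetic 6 × 6 "video", the hypothetical mouth pixels, the simple histogram estimator, and all function names. It is not the authors' implementation, which worked with optical flow computed from real footage.

```python
import math
import random

def mutual_information(x, y, bins=4):
    """Histogram estimate of the mutual information (bits) of two equal-length series."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0          # guard against a constant series
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(bx)
    joint, px, py = {}, {}, {}
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    return sum(c / n * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def lagged_mi(motion, sound, lag):
    """MI when the audio is delayed by `lag` frames relative to the video."""
    if lag == 0:
        return mutual_information(motion, sound)
    return mutual_information(motion[lag:], sound[:-lag])

random.seed(0)
T, H, W = 2000, 6, 6                 # frames, rows, columns (all hypothetical)
mouth = {(4, 2), (4, 3)}             # hypothetical "mouth" pixels

# Slowly varying audio envelope: a 10-frame moving average of noise.
raw = [random.random() for _ in range(T + 10)]
audio = [sum(raw[t:t + 10]) / 10 for t in range(T)]

def motion_energy(pixel):
    """Synthetic optical-flow magnitude over time at one pixel."""
    if pixel in mouth:
        return [a + random.gauss(0, 0.02) for a in audio]   # moves with the sound
    return [random.random() for _ in range(T)]               # unrelated motion

video = {(r, c): motion_energy((r, c)) for r in range(H) for c in range(W)}

# Heat map of MI between each pixel's motion energy and the audio envelope.
heat_map = [[mutual_information(video[(r, c)], audio) for c in range(W)]
            for r in range(H)]
peak = max(video, key=lambda rc: heat_map[rc[0]][rc[1]])
print("peak pixel:", peak)

# Experiment-1 style check: a short lag versus a long one at a mouth pixel.
for lag in (0, 2, 50):
    print(lag, round(lagged_mi(video[(4, 2)], audio, lag), 3))
```

Because the synthetic envelope changes slowly, a delay of a couple of frames leaves most of the shared structure intact, while a delay much longer than the envelope's smoothing window removes it. This mirrors, in toy form, the kind of tolerance to small misalignments described in Experiment 1.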
Experiment 3: Applying Spatial and/or Temporal Blur

In order to examine the effects of reduced visual and/or auditory acuity, we conducted tests with

• Spatial blur applied to video
• Temporal blur applied to audio
• Both degradations applied simultaneously

Our results revealed that synchrony detection is robust to significant audio and visual blur. This implies that detection is feasible even with sensory systems that have rudimentary spatial and temporal resolution.
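A toy version of the blur manipulation is sketched below. To keep it short it swaps mutual information for plain Pearson correlation (which the chapter mentions as a simpler alternative) and uses a one-dimensional strip of pixels; the box-filter blur, the strip size, the mouth columns, and the function names are all assumptions for illustration rather than the procedure actually used.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def box_blur(series, width):
    """Moving average with truncated edges; a crude stand-in for low acuity."""
    half = width // 2
    out = []
    for i in range(len(series)):
        window = series[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

random.seed(1)
T, W = 300, 9                         # frames and pixel columns (hypothetical)
mouth = {3, 4, 5}                     # hypothetical mouth columns

raw = [random.random() for _ in range(T + 10)]
audio = [sum(raw[t:t + 10]) / 10 for t in range(T)]   # slowly varying envelope

# frames[t][c]: motion energy of column c at time t.
frames = [[audio[t] + random.gauss(0, 0.02) if c in mouth else random.random()
           for c in range(W)] for t in range(T)]

blurred_frames = [box_blur(frame, 3) for frame in frames]   # spatial blur, per frame
blurred_audio = box_blur(audio, 5)                          # temporal blur of audio

def peak_column(video, sound):
    """Column whose motion energy correlates best with the sound."""
    scores = [pearson([frame[c] for frame in video], sound) for c in range(W)]
    return max(range(W), key=scores.__getitem__)

sharp_peak = peak_column(frames, audio)
blurry_peak = peak_column(blurred_frames, blurred_audio)
print(sharp_peak, blurry_peak)
```

In this synthetic setting the best-correlated column stays within the mouth region after both degradations are applied together, which is the qualitative pattern the experiment reports; the toy says nothing about how much blur real infant-like signals can tolerate.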


Figure 8.6 (a) The basic computational step for determining which aspects of the visual scene are associated with the auditory input (without loss of generality, we consider one auditory channel for the present example). We have to determine the mutual information (MI) between the time-varying structure of the audio signal (here the envelope, indicated by the trace atop the high-frequency audio signal) and the motion energy derived from a localized region of visual space. The basic computation has to be performed across all visual regions. The figure shows three motion energy traces from different regions of the video. Some of these will have higher MI with the audio signal (red trace) than others. The spatial plot of MIs across the full scene can be visualized as a heat map, as in (b). (b) We split the audio signal into low, mid, and high frequencies. As the upper row of (b) shows, mid frequencies gave the best results, underscoring the fact that even an immature auditory system that might not have developed sensitivities for very high frequencies can fruitfully detect audio-visual synchronies. Similarly, we experimented with three different video backgrounds: plain white, random noise, and natural (real world). The results, shown in the bottom row, proved to be robust against these changes.


Figure 8.7 Results of AV synchrony detection with temporal misalignments between signals.


Figure 8.8 Effects of changing size of window of integration on AV synchrony detection.

These results show that the time-series of signal variation across different modalities can be effectively used for establishing a linkage between the most correlated sections. This sets the stage for building a conceptual “lookup table” to codify the mapping between signal appearance in one modality and that in the other. In the example we have used here, this would correspond to a mapping between mouth appearance and the sound produced. With this lookup table, it is straightforward to hypothesize, based on information in one modality, what the signal in the other modality ought to look like. Recent text-to-animation systems (e.g., Albrecht et al., 2002) have used precisely this idea. The idea also translates directly to the vision and touch mapping that is central to Molyneux’s question. A lookup table with tactile and corresponding visual features would enable the prediction of visual appearance based only on haptic experience, or vice versa. To summarize, the use of dynamic information across two modalities like vision and touch can help develop, in a purely empirical framework, a


Figure 8.9 Effects of spatial or temporal blur on synchrony computation.

mapping between the two senses and thus explain the results observed with Prakash children. It should be noted that the work described here is more a computational demonstration of the feasibility of finding cross-modal mappings rather than a model of neural processing. The results simply show that information that allows us to establish cross-modal correspondences is in the data, that it can be extracted using straightforward means, and that it is robust against a number of degradations, comparable to those affecting newborns.

6 Conclusion

Like all “big” questions in science, Molyneux’s query has far-reaching implications. It touches upon several foundational issues in philosophy, cognition, and neuroscience. After having resisted resolution for over three centuries, the question now appears to be yielding. A humanitarian effort in India has offered a potential answer to this long-standing challenge. Consistent with the expectations of the empiricists, the visuotactile mapping appears not to be available at the outset of vision. Rather, it seems to be established over the course of multimodal experience with the world. The process of learning is rapid, perhaps indicating that the neuronal substrates for the cross-modal linkage are already present but need to be rendered functionally useful through experience. We speculate that the specific aspect of experience that is important for establishing the mapping


is the dynamic nature of signals across the modalities; similarities of time-series of different sensory signals serve as a cue for associating corresponding sensory fragments. These associations can then allow for generalization to new inputs that are compositions of previously experienced fragments.

This empirical and computational work points to several interesting avenues for further research. For instance, we need to determine where precisely in the brain cross-modal associations are encoded. Are they enmeshed with the primary sensory cortices, or do they reside in the higher-order association areas? We also need to test the hypothesis of cross-modal synchrony as the driver of association formation. One way to do so would be to have newly sighted patients experience both modalities (i.e., vision and touch), but not simultaneously; the objects they see, they are not allowed to touch and vice versa. Would this kind of experience preclude the formation of cross-modal linkages? On a more applied note, we can ask whether a tactile-rich experience would expedite the acquisition of visual proficiency by the newly sighted. It is a testament to the beauty of Molyneux’s query that even three hundred years past its formulation it continues to generate questions and ideas that can advance the cutting edge of our understanding of the brain and mind.

Acknowledgments

This work was supported by the National Eye Institute of the NIH through grant R01EY020517 and through a Scholar Award from the James S. McDonnell Foundation.

References

Albrecht, I., Haber, J., Kahler, K., Schroder, M., & Seidel, H.-P. (2002). “May I talk to you? :-)” – Facial animation from text. In Proceedings of the 10th Pacific Conference on Computer Graphics and Applications. http://www.mpi-inf.mpg.de/resources/FAM/publ/pg2002.pdf.
Brown, K. W., & Gottfried, A. W. (1986). Cross-modal transfer of shape in early infancy: Is there reliable evidence? In L. P. Lipsitt & R. Rovée-Collier (Eds.), Advances in infancy research (Vol. 4, pp. 163–170). Norwood, NJ: Ablex.
Cheselden, W. (1728). An account of some observations made by a young gentleman, who was born blind, or lost his sight so early, that he had no remembrance of ever having seen, and was couch’d between 13 and 14 years of age. Philosophical Transactions, 402, 447–450.

190

P. Sinha, J. Wulff, and R. Held

Dandona, R., & Dandona, L. (2003). Childhood blindness in India: A population based perspective. British Journal of Ophthalmology, 87, 263–265.
Degenaar, M. J. L. (1996). Molyneux’s problem: Three centuries of discussion on the perception of forms. Dordrecht: Kluwer Academic.
Degenaar, M. J. L., & Lokhorst, G. (2014). Molyneux’s problem. In Edward N. Zalta (Ed.), The Stanford encyclopedia of philosophy. http://plato.stanford.edu/archives/spr2010/entries/molyneux-problem/.
Falchier, A., Clavagnier, S., Barone, P., & Kennedy, H. (2002). Anatomical evidence of multimodal integration in primate striate cortex. Journal of Neuroscience, 22(13), 5749–5759.
Held, R. (2009). Visual-haptic mapping and the origin of cross-modal identity. Optometry and Vision Science, 86(6), 595–598.
Held, R., Ostrovsky, Y., de Gelder, B., Gandhi, T., Ganesh, S., Mathur, U., et al. (2011). Newly sighted cannot match seen with felt. Nature Neuroscience, 14, 551–553.
Hershey, J., & Movellan, J. R. (2000). Audio-vision: Using audio-visual correlation to locate sound sources. In S. A. Solla, T. K. Leen, & R. Muller (Eds.), Advances in neural information processing systems (Vol. 12, pp. 813–819). Cambridge, MA: MIT Press.
Jacomuzzi, A. C., Kobau, P., & Bruno, N. (2003). Molyneux’s question redux. Phenomenology and the Cognitive Sciences, 2, 255–280.
Johnson, S. P. (2004). Development of perceptual completion in infancy. Psychological Science, 15, 769–775.
Kellman, P., & Spelke, E. (1983). Perception of partly occluded objects in infancy. Cognitive Psychology, 15, 483–524.
Köhler, W. (1929). Gestalt psychology. New York: Liveright.
Lacey, S., Peters, A., & Sathian, K. (2007). Cross-modal object recognition is viewpoint-independent. PLoS ONE, 9, e890.
Locke, J. (1690). An essay concerning human understanding. London.
Mandavilli, A. (2006). Look and learn. Nature, 441(18), 271–272.
Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound–shape correspondences in toddlers and adults. Developmental Science, 9(3), 316–322.
Maurer, D., Stager, C. L., & Mondloch, C. J. (1999). Cross-modal transfer of shape is difficult to demonstrate in one-month-olds. Child Development, 70(5), 1047–1057.
Meltzoff, A., & Borton, W. (1979). Intermodal matching by human neonates. Nature, 282, 403–404.
Merabet, L. B., Hamilton, R., Schlaug, G., Swisher, J. D., Kiriakopoulos, E. T., Pitskel, N. B., et al. (2008). Rapid and reversible recruitment of early visual cortex for touch. PLoS ONE, 3(8), e3046.
Morgan, M. (1977). Molyneux’s question. Cambridge: Cambridge University Press.
Newell, F. N., Ernst, M. O., Tjan, B. S., & Bülthoff, H. H. (2001). Viewpoint dependence in visual and haptic object recognition. Psychological Science, 12, 37–42.
O’Hara, J. G. (2004). Molyneux, William (1656–1698). In Oxford dictionary of national biography. Oxford: Oxford University Press.
Ostrovsky, Y., Andalman, A., & Sinha, P. (2006). Vision following extended congenital blindness. Psychological Science, 17(12), 1009–1014.
Ostrovsky, Y., Meyers, E., Ganesh, S., Mathur, U., & Sinha, P. (2009). Visual parsing after recovery from blindness. Psychological Science, 20(12), 1484–1491.
Pascual-Leone, A., Amedi, A., Fregni, F., & Merabet, L. (2005). The plastic human brain cortex. Annual Review of Neuroscience, 28, 377–401.
Patterson, M. L., & Werker, J. F. (2003). Two-month-old infants match phonetic information in lips and voice. Developmental Science, 6(2), 191–196.
Pêcheux, M.-G., Lepecq, J.-C., & Salzarulo, P. (1988). Oral activity and exploration in 1–2-month-old infants. British Journal of Developmental Psychology, 6(3), 245–256.
Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia—a window into perception, thought, and language. Journal of Consciousness Studies, 8(12), 3–34.
Rogers, S. K., & Ross, A. S. (1975). A cross-cultural test of the Maluma-Takete phenomenon. Perception, 4(1), 105–106.
Sinha, P. (2013). Once blind and now they see. Scientific American (July).
Sinha, P., Balas, B. J., Ostrovsky, Y., & Wulff, J. (2009). Visual object discovery. In S. Dickinson & M. Tarr (Eds.), Object categorization: Computer and human vision perspectives. Cambridge: Cambridge University Press.
Sinha, P., & Held, R. (2012). Sight-restoration. F1000. Medica-Report, 4, 17.
Streri, A., & Gentaz, E. (2004). Cross-modal recognition of shape from hand to eyes and handedness in human newborns. Neuropsychologia, 42(10), 1365–1369.

9 Berkeley, Reid, and Sinha on Molyneux’s Question

James Van Cleve

Molyneux’s question, posed over three centuries ago and still debated today, is whether a man born blind and made to see would be able to recognize by sight alone the shapes of objects he formerly knew by touch. Berkeley, in company with Molyneux and several others among the first wave of thinkers to discuss the problem, said no. Reid, a critic of Berkeley in many matters, said yes. Until recently, the empirical evidence bearing on the question was equivocal, but thanks now to Pawan Sinha and his collaborators at Project Prakash, we are starting to get more direct empirical evidence than ever before (Held et al., 2011; Sinha, 2013; Sinha et al., this volume). The answer so far emerging from their research is no. In what follows, I critically discuss the question itself and the answers to it offered by Berkeley, Reid, and Sinha.

1 Molyneux’s Question

Molyneux posed his famous question in a letter to Locke, which Locke quoted along with his endorsement of Molyneux’s answer in the second edition of the Essay Concerning Human Understanding (1694, Book II, ch. IX, sec. 8). Berkeley in turn quoted Locke in his Essay Toward a New Theory of Vision. Here is the question as it appears in Berkeley’s text:

Suppose a man born blind, and now adult, and taught by his touch to distinguish between a cube and a sphere of the same metal, and nighly of the same bigness, so as to tell when he felt one and the other, which is the cube, which the sphere. Suppose then the cube and sphere placed on a table, and the blind man to be made to see: quaere, whether by his sight, before he touched them, he could now distinguish and tell which is the globe, which the cube? To which the acute and judicious proposer [Molyneux] answers: Not. For though he has obtained the experience of how a globe, how a cube affects his touch; yet he has not yet attained the experience, that what affects his touch so or so must affect his sight so or so; or that a protuberant angle in the cube, that pressed his hand unequally, shall appear
to his eye as it does in the cube. I [Locke] agree with this thinking gentleman, whom I am proud to call my friend, in his answer to this his problem; and am of opinion, that the blind man, at first sight, would not be able with certainty to say which was the globe, which the cube, while he only saw them. (Berkeley, [1709] 1963, sec. 132; hereinafter cited as NTV)

Berkeley went on to agree with the two thinking gentlemen who preceded him. There are five clarifications or possible amendments to Molyneux’s question that ought to be considered. The first, proposed by Diderot in his Letter on the Blind of 1749, is that we change the question from globe vs. cube to circle vs. square.1 His reason is that a subject gaining sight for the first time might not be able to perceive depth (as Berkeley had maintained), in which case he would not yet be able to see anything as a three-dimensional globe or cube, but could nonetheless still presumably distinguish from one another two-dimensional figures like circles and squares. I agree with Gareth Evans (1985) that Diderot’s substitution leaves us with a question that still poses the key issues about the recognition of shapes presented in different sensory modalities.

The next two twists are due to Leibniz, who discussed Molyneux’s question in the section of the New Essays Concerning Human Understanding dealing with Locke’s treatment of it. Here is Leibniz’s spokesman Theophilus:

I believe that if the blind man knows that the two shapes which he sees are those of a cube and a sphere, he will be able to identify them and to say without touching them that this one is the sphere and this the cube. … I have included in [my answer] a condition which can be taken to be implicit in the question: namely that it is merely a problem of telling which is which, and that the blind man knows that the two shaped bodies which he has to discern are before him and thus that each of the appearances which he sees is either that of a cube or that of a sphere. Given this condition, it seems to me past question that the blind man whose sight is restored could discern them by applying rational principles to the sensory knowledge which he has already acquired by touch. ([1704] 1981, 136–137)

There are two suggestions here. One is that the subject be told that one of the objects before him is what he formerly knew by touch as a cube and the other what he knew as a globe. Molyneux’s own formulation is silent on whether the subject is to be given this information. The other is that the subject be given the opportunity to work things out “by applying rational principles to the sensory knowledge which he has already acquired by touch” (for example, that a cube but not a sphere has eight distinguished points). In his discussion of Leibniz’s version of the question, Evans gives prominence to the “let him work it out” aspect, but Leibniz
himself emphasizes the “give him a hint” aspect—that the subject be told “that, of the two appearances or perceptions he has of them, one belongs to the sphere and the other to the cube” ([1704] 1981, 138). It seems to me that the hint is the more significant factor of the two (or at any rate, that the hint together with the time to work things out is more significant than the time alone). A fourth clarification is due to Evans. The interesting question according to Evans is not whether a newly sighted subject will be able to distinguish circles and squares, but whether a subject who can see circles and squares will recognize them as shapes he formerly knew by touch. Perhaps his visual field at first will be chaos in which no distinct figures stand forth. But once the subject is able to see figures, will he be able to recognize them as the very figures he previously knew by touch? That is what we really want to know, and when the question is so understood, some apparently negative results involving subjects whose sight was restored are seen to be irrelevant. To these four qualifications or amendments, I propose to add a fifth: let the question concern the subject’s ability to recognize the shapes that now visually appear to him for the first time, regardless of whether these shapes are actually possessed by the objects before him. If we do not make this stipulation, the answer to Molyneux’s question threatens to be negative for an irrelevant reason. Suppose that because of some systematically distorting property in the conditions of observation or in the subject’s sensory transducers, tangible globes appear to his vision with corners poking out and tangible cubes appear with their corners smoothed away.2 If the subject in a Molyneux experiment could not rule out such a hypothesis about how tangible globes and cubes affect his sight, he would not be in a position to say which of the objects before him is a globe and which a cube. 
But this is an utterly boring reason for answering no. It would also be a reason for answering no to the following question: would a man who had become acquainted with red and white things in London be able to recognize them again when he saw them in Amsterdam? For all he knows, there are strange lights in the new city that make white things look red and red things look white, leaving him unsure or mistaken in his judgments about the colors of objects. Let it be so; it is still a question of interest whether he could recognize the visual presentations of red and white things. A parallel question about presentations is the central point of interest in the Molyneux problem: could the subject recognize visually presented shapes as the shapes he previously knew by touch? Diderot was aware of the complication I have just raised, but he does not reformulate the instructions to the subject so as to remove it. In
consequence, he says he would expect the following response in a Molyneux experiment if the subject were philosophically minded: “This seems to me to be the object that I call square, that to be the object I call circle; however, I have been asked not what seems, but what is: and I am not in a position to answer that question” (Morgan, 1977, 50). I would take such a response as warranting an answer of yes to the Molyneux question as I conceive of it.3 Diderot’s philosophical subject would show by his words that he connects one visual presentation with tangible squares and the other with tangible circles.

We have, then, five possible variations or specifications of Molyneux’s question: (1) replace globe/cube in the problem posed to the subject by circle/square; (2) tell the subject that one of the objects now before him is what he formerly knew by touch as a globe (or circle) and the other is what he formerly knew as a cube (or square); (3) let him work out his answer by reasoning rather than insisting on immediate recognition; (4) assign the recognitional task only after the subject is able to see circles and squares as distinct figures; (5) let the question concern the shapes presented to the subject’s vision rather than the shapes actually possessed by the objects before him. In what follows, I always take stipulations (4) and (5) for granted. I give separate consideration to how Molyneux’s question should be answered depending on whether stipulations (1), (2), and (3) are in place.

2 Berkeley’s Answer

The fourth main thesis of Berkeley’s New Theory of Vision is that there is no idea or kind of idea common to sight and touch—the objects of these senses are entirely disparate and heterogeneous (NTV, secs. 121, 127). He says that the extension and figure perceived by sight are “specifically distinct” from those perceived by touch, meaning that they are different in species or kind. He agrees with Molyneux and Locke in their answer to Molyneux’s question, and he takes the supposed correctness of their answer as an important confirmation of his heterogeneity thesis. As he puts the point in the table of contents entry summarizing NTV, section 133, the Molyneux problem “is falsely solved, if the common supposition [of ideas common to sight and touch] be true.” Berkeley, in other words, offers us the following modus tollens:

1. If visible squares resemble tangible squares, the answer to Molyneux’s question is yes.
2. But in fact the answer is no.
3. Therefore, visible squares do not resemble tangible squares.

He explains as follows why he takes the first premise to be true:

Now, if a square surface perceived by touch be the same sort with a square surface perceived by sight, it is certain the blind man here mentioned might know a square surface as soon as he saw it. It is no more but introducing into his mind, by a new inlet, an idea he has been already well acquainted with. (NTV, sec. 133)

He goes on to say that since the blind man is supposed to have known by his touch that a cube is terminated by square surfaces and a globe is not, he could know (on the supposition of similarity between visible and tangible squares) “which was the cube, and which not, while he only saw them.” Thus Berkeley returns negative answers to both the globe-cube and circle-square versions of the question, basing his negative answer to Molyneux’s original question on a negative answer to Diderot’s variant.

In Berkeley’s view, a seen square and a felt square have nothing more in common than do a square of either variety and the word “square” (see NTV, sec. 140). The connection between them is a brute correlation of the sort that can be learned only through experience. One could no more expect a newly sighted man to know that the square figure he sees belongs to a tangibly square object than one could expect a neophyte in English to know that the word “square” denotes squares.

3 Reid’s Answer

Although Reid never addresses the Molyneux question by name,4 he gives us ample materials for extracting an answer to it, particularly in his discussions of the capacities of the blind and the relations of visible to tangible figure. I begin with a passage from chapter 6, section 3 of the Inquiry into the Human Mind (hereinafter cited as IHM 6.3) in which Reid seems to give a negative answer. At the end of a discussion of the perspectival variation in the figures given to sight (a round coin seen at an angle will look elliptical, a rectangular table seen from the end as trapezoidal, and so on), Reid says this:

To a man newly made to see, the visible appearance of objects would be the same as to us; but he would see nothing at all of their real dimensions, as we do. He could form no conjecture, by means of his sight only, how many inches or feet they were in length, breadth, or thickness. He could perceive little or nothing of their real
[three-dimensional] figure; nor could he discern, that this was a cube, that a sphere. (IHM 6.3, 84–85, italics mine)

That sounds like a definite no to Molyneux’s question—until we read on. In IHM 6.11, Reid asks us to imagine what amounts to a Molyneux scenario involving Dr. Nicholas Saunderson, a blind mathematician whose acquaintance Reid made on a visit to Cambridge in 1736.

Let us suppose such a blind man as Dr Saunderson, having all the knowledge and abilities which a blind man may have, suddenly made to see perfectly. Let us suppose him kept from all opportunities of associating his ideas of sight with those of touch, until the former become a little familiar; and the first surprise, occasioned by objects so new, being abated, he has time to canvass them, and to compare them, in his mind, with the notions which he formerly had by touch; and in particular to compare, in his mind, that visible extension which his eyes present, with the extension in length and breadth with which he was before acquainted. … … If Dr Saunderson had been made to see, and had attentively viewed the figures of the first book of Euclid, he might, by thought and consideration, without touching them, have found out that they were the very figures he was before so well acquainted with by touch. (IHM 6.11, 117–118)

That sounds like a definite yes! Has Reid given us two conflicting answers to Molyneux’s question? Readers who recall the Diderot variation on the Molyneux question will see that Reid’s answers are perfectly consistent; they are answers to different questions. The earlier passage gives an answer of no to the globe-cube question. The newly sighted man would not recognize a cube he sees as having the shape of the objects he formerly knew by touch as cubes because he would not see the object as a cube at all. That is precisely why Diderot proposed his variation. The later passage gives an answer of yes to the circle-square question. Reid does not mention circles and squares specifically, but the figures he imagines Dr. Saunderson viewing are from Book I of Euclid, which deals only with plane figures, solids not making their appearance until Book XI. Moreover, in the paragraph explaining why he thinks Dr. Saunderson could recognize such figures (elided in my quotation from page 118 above), Reid discusses plane figures explicitly, emphasizing how great he takes the similarity between visible and tangible plane figures to be.

There is another difference between the earlier and later passages that might be thought significant. IHM 6.11 concerns not just any blind person (as IHM 6.3 does), but Dr. Saunderson, who was the Lucasian professor of mathematics at Cambridge (successor but one to Newton). Does Dr. Saunderson’s mathematical sophistication play an essential role in Reid’s positive answer to the Molyneux question?
For the circle-square question, I believe it does not. Reid tells us that a visible square and a tangible square may have the same figure (or be almost perfectly similar in shape),5 just “as two objects of touch may have the same figure, although one is hot and the other cold” (IHM 6.11, 118). If visible squares and tangible squares are as similar as hot squares and cold squares, even a mathematically unsophisticated person should be able to detect the similarity.

On the other hand, Dr. Saunderson’s mathematical expertise may well make a difference with respect to the globe-cube question. We can see why by consulting what Reid tells us about the relation of visible figure (which is always two-dimensional) to real figure (which may be either two- or three-dimensional). “The real figure of a body consists in the situation of its several parts with regard to one another,” he tells us, whereas “its visible figure consists in the position of its several parts with regard to the eye” (IHM 6.7, 96). The position of a point with regard to the eye is its direction out from the eye, regardless of its distance, as Reid believes the eye does not see depth. The visible figure of an object is given by the sum of all these directions. If lines of direction are drawn from the eye (considered as a single point) to every point of an object, the intersections of all these lines with a sphere centered on the eye—in other words, the projection of the object on that sphere—will mark out the object’s visible figure as seen from the central point. The visible figure of a tilted coin will be an ellipse, while that of a wireframe cube will be a network of polygons.6 The visible figure of an object is thus mathematically derivable from its real figure (together with its distance and orientation), as Reid explains in the following passage from IHM 6.7:

The visible figure, magnitude, and position, may, by mathematical reasoning, be deduced from the real. … Nay, we may venture to affirm, that a man born blind, if he were instructed in mathematics, would be able to determine the visible figure of a body, when its real figure, distance, and position are given. Dr. Saunderson understood the projection of the sphere, and perspective. Now, I require no more knowledge in a blind man, in order to his being able to determine the visible figure of bodies, than that he can project the outline of a given body, upon the surface of a hollow sphere, whose centre is in the eye. This projection [determines] the visible figure he wants. (IHM 6.7, 95)7

From these principles, having given [to him] the real figure and magnitude of a body, and its position and distance with regard to the eye, he [our blind mathematician] can find out its visible figure and magnitude. He can demonstrate in general, from these principles, that the visible figure of all bodies will be the same with that
of their projection upon the surface of a hollow sphere, when the eye is placed in the centre. (IHM 6.7, 96)
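
The projection Reid describes can be illustrated with a short computation. The sketch below is only an illustration under stated assumptions: the eye is treated as a single point at the origin, and the function names, the coin's size, its distance, and its tilt are all hypothetical choices made for the example rather than anything given by Reid. It maps each point of an object to its direction from the eye, which is the point's position on a unit sphere centred on the eye.

```python
import math


def visible_direction(point, eye=(0.0, 0.0, 0.0)):
    """Unit vector from the eye to the point: the point's projection on a
    unit sphere centred on the eye. The point's distance is discarded."""
    dx, dy, dz = (p - e for p, e in zip(point, eye))
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError("the point coincides with the eye")
    return (dx / norm, dy / norm, dz / norm)


def visible_figure(points, eye=(0.0, 0.0, 0.0)):
    """Project every point of an object onto the sphere around the eye."""
    return [visible_direction(p, eye) for p in points]


def tilted_coin(radius, distance, tilt_deg, n=12):
    """Hypothetical test object: n rim points of a coin centred on the line
    of sight (the z-axis) and tilted about the x-axis (0 = facing the eye)."""
    tilt = math.radians(tilt_deg)
    rim = []
    for k in range(n):
        a = 2.0 * math.pi * k / n
        rim.append((radius * math.cos(a),
                    radius * math.sin(a) * math.cos(tilt),
                    distance + radius * math.sin(a) * math.sin(tilt)))
    return rim


def angle_between(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


if __name__ == "__main__":
    near = visible_figure(tilted_coin(1.0, 10.0, 60.0))
    far = visible_figure(tilted_coin(2.0, 20.0, 60.0))
    # Angular width (along the tilt axis) and height (across it) of the rim.
    print("width:", angle_between(near[0], near[6]))
    print("height:", angle_between(near[3], near[9]))
    # A coin twice as large and twice as far away projects identically.
    print("same visible figure:",
          all(math.isclose(a, b, abs_tol=1e-12)
              for u, v in zip(near, far) for a, b in zip(u, v)))
```

Because only directions are kept, the computation needs nothing beyond the real figure, distance, and orientation of the object, which is the sense in which the visible figure is said to be deducible from the real one.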

Reid is telling us that Dr. Saunderson could have worked out, in advance of ever seeing them, what visible figures are normally presented by globes and cubes when seen from various perspectives. This knowledge is not fully tantamount to knowing how they would look, if only because it omits any information about color. But it ought to be enough to enable him to know by reflection after he sees them for the first time that the object that looks this way is a cube and the object that looks that way is a globe. At any rate, Dr. Saunderson could surely have known this if he had been given the Leibniz hint—that one of the objects is a globe and the other a cube.8

Reid’s position on the Molyneux problem may thus be summed up in table 9.1.

Table 9.1

                 Plain man on first view    Dr. Saunderson after thought and consideration
Globe-cube       IHM 6.3: No                IHM 6.7: Yes
Circle-square    IHM 6.11: Yes              IHM 6.11: Yes

The reasons I have attributed to Reid for answering yes to the circle-square question (even in the case of the plain man) involve attributing to him an argument like the following:

1. If visible squares resemble tangible squares, the answer to Molyneux’s question is yes.
2. Contrary to Berkeley, visible squares do resemble tangible squares.
3. Therefore, the answer to Molyneux’s question is yes.

Reid thus agrees with Berkeley’s conditional premise but replaces Berkeley’s modus tollens with his own modus ponens.9

4 Empirical Evidence

Though merely a thought experiment when first propounded, Molyneux’s question is apparently a straightforwardly empirical question.10 One might think that it ought to have been decided by now by actual cases of persons born blind and made to see. Nonetheless, three centuries after the question was first asked, the evidence drawn from cases of restored vision does not conclusively settle it. Such, at any rate, is the conclusion of three writers who have surveyed the evidence: Morgan (1977), Evans (1985), and Degenaar (1996).

There have been scores of reported cases of persons blind from birth or a very early age who were restored surgically to sight, most often by the couching (moving aside) or removal of cataracts. Some of these patients were explicitly given Molyneux tasks and others not (though their relevant reactions were recorded). Some of them could perform Molyneux tasks (for example, name figures or answer which-is-which questions) and others not.11 The evidence thus points in divergent directions and is often simply ambiguous. Many questions have been raised about the reported cases. Could the postoperative patients really see? How blind were they initially, and for how long? Had they really been denied any opportunity to learn by association? Were they asked any leading questions? Evans (1985) notes that many of the “can’t tells” may have been “can’t sees” (violating stipulation 4) and that many of the “can tells” may already have had relevant experience (380–382). One of the best-known cases of a cataract patient restored to sight is that reported by the surgeon William Cheselden in 1728. Cheselden’s patient was a thirteen-year-old boy who had lost sight so early that he had no memory of it. Berkeley, Reid, Voltaire, and others commented upon Cheselden’s published report of the boy’s experiences. Some cited the Cheselden case as supporting a negative answer to Molyneux’s question, but others questioned its relevance, noting that the Cheselden lad apparently could not at first distinguish figures at all. It seems clear to me that the questioners are right. Cheselden says that when the boy first saw, “he knew not the shape of anything, nor any one thing from another, however different in shape” (Morgan, 1977, 19). He thus belongs in the class of “can’t sees” rightly deemed irrelevant by Evans. 
Degenaar (1996) concludes her book reviewing three centuries of empirical evidence bearing on the Molyneux question with the following sentence: “We have not answered Molyneux’s question—and, indeed, we think that it cannot be answered because congenitally blind people cannot be made to see once their critical period is passed” (132). She is referring to a period early in life during which if there is no appropriate retinal stimulation, there is no formation of the feature analyzer cells that are needed for subsequent discrimination of shapes—or so it has been believed until fairly recently. If there really is a critical period, though, how can there have been any positive results to date in Molyneux experiments? And how can the Sinha subjects described below have demonstrated an ability to discriminate visible shapes after being cured of congenital blindness?12 Beginning in 1979, there have been a number of experiments on infants that potentially bear on the issues underlying the Molyneux question. I
have in mind the experiments on “Molyneux babies”—infants too young to have learned any associations between sight and touch who are given Molyneux-like tasks. In one such experiment, conducted by Streri and Gentaz (2003), days-old infants were allowed to grasp either a cylinder or a triangular prism out of sight in their right hand. They were then shown these objects for the first time, whereupon they gazed longer and more often at the object they had not previously grasped. Since independent experiments have shown that novel objects rather than familiar objects tend to capture an infant’s attention, Streri and Gentaz took their results to indicate that the infants recognized one of the seen shapes as what they had already felt and regarded the other as new.13 As the authors put it, “This is experimental evidence that newborns can extract shape information in a tactual format and transform it in a visual format before they have had the opportunity to learn from the pairings of visual and tactual experience.”14

Despite the suggestiveness of the experiments with infants, it seems that nothing short of tests on adults or adolescents cured of congenital blindness would constitute direct tests of Molyneux’s original question.

5 Sinha’s Answer

Enter Project Prakash, a humanitarian and scientific project underway in India, where many children are born with curable blindness but not treated for it owing to inadequate medical resources. Under the auspices of the project, such children are identified, operated on to cure their blindness, and (if they meet the screening requirements to be described next) recruited as subjects in Molyneux experiments. The patients are examined ophthalmologically before and after their operations, and only those are subsequently used as Molyneux subjects who (a) can see nothing more than light and dark before the operation and (b) can discriminate visually presented shapes from each other after the operation. There is thus no question whether the subjects were sufficiently blind before the operation or whether any of them were “can’t sees” afterwards.

The tasks given to subjects who meet the screening requirements are as follows: They are given a Lego block of any of several shapes to hold in their hand;15 that object is taken away, and then it (the target) and another of a different shape (a distractor) are presented again simultaneously to the subject’s touch. Can the subject say which of the two was felt previously? Next, the procedure is repeated for vision: first one shape is seen, then it and another are presented to the subject’s vision simultaneously. Can the subject say which of the two was seen previously? Finally, the procedure
is repeated with one initially felt Lego block (held under a sheet and out of sight) and two subsequently seen Lego blocks, one the same in shape as what was felt and the other different. Can the subject say which of the two seen objects is the same as the one previously felt? That is Sinha’s version of the Molyneux question.

Concerning five subjects tested and reported on to date, the results are as follows: The subjects’ success rate at matching one felt shape in a pair to a previously felt shape was very high (averaging 98 percent), as was their rate at matching one seen shape in a pair to a previously seen shape (averaging 92 percent). Condition (4) on Molyneux questions is therefore clearly satisfied; the subjects had the ability to discriminate and recognize (in intramodal matching tests) visual shapes as well as tactual shapes. But their rate of success at matching one seen shape in a pair to a previously felt shape fell to a mean of 58 percent, which is statistically indistinguishable from chance. Sinha concludes:

Our results suggest that the answer to Molyneux’s question is likely negative. The newly sighted subjects did not exhibit an immediate transfer of their tactile shape knowledge to the visual domain. (2011, 552)
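
To see what "statistically indistinguishable from chance" amounts to in a two-alternative matching task, one can run an exact binomial test against a guessing rate of one half. The sketch below is illustrative only. The text above reports success rates but not trial counts, so the number of trials used here is a placeholder assumption, not a figure from the study, and the function name is likewise hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, trials, chance=0.5):
    """Exact two-sided binomial p-value: the total probability, under pure
    guessing, of every outcome that is no more likely than the observed one."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    pmf = [comb(trials, k) * chance ** k * (1.0 - chance) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # A small relative tolerance keeps outcomes that tie with the observed
    # one from being dropped because of floating-point rounding.
    total = sum(q for q in pmf if q <= observed * (1.0 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    ASSUMED_TRIALS = 50  # placeholder: the real trial counts are not given here
    for label, rate in (("touch-to-touch", 0.98),
                        ("vision-to-vision", 0.92),
                        ("touch-to-vision", 0.58)):
        hits = round(rate * ASSUMED_TRIALS)
        print(label, hits, "of", ASSUMED_TRIALS,
              "p =", binomial_two_sided_p(hits, ASSUMED_TRIALS))
```

With a placeholder of this size, the two intramodal rates fall far outside what guessing would produce, while a 58 percent rate does not. A different trial count would shift the p-values, so the numbers printed here say nothing about the published analysis.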

He goes on to note, however, that the subjects’ ability to make correct touch-to-vision transfer improved remarkably in a short period of time. I comment on the significance of this below.

6 Questions for Sinha

I have several questions for Sinha, the answers to which may affect the significance of his data, as well as two suggestions for further research.

First, why have results for only five subjects been published to date? Are there really so few who pass the screening, being blind enough to qualify before the operation and able to see well enough after? Or is it a matter of there not having been enough time, money, or personnel for further experiments?

Second, the test objects in the Prakash experiments were Lego blocks, which have more complicated shapes than the globes and cubes of Molyneux’s original question. Have the experimenters considered using globes and cubes, if only for the sake of addressing Molyneux’s question exactly as originally formulated?

Third, were the subjects still holding the tangible target under the sheet when the visual objects were presented? Or did they have to rely on tactile memory of the shape of the target object? I conjecture that the task would


be somewhat easier if the subjects were grasping the target object concurrently, especially when the targets have complex shapes.

Fourth, were the Project Prakash subjects given the Leibniz hint? That is, were they told something along the lines of, “One of the two objects I am going to show you is the same shape as the object you are now holding/recently held. Which one do you think it is?” I suspect the answer is yes, but it would be helpful to know exactly what verbal instructions were given to the subjects.16

Fifth (and most important of all), have the experimenters considered using two-dimensional objects as their targets and distractors? I for one would be very interested in learning the outcome of circle-square Prakash experiments. As noted above, Diderot proposed his modification for fear that newly sighted subjects might not be able to see in three dimensions at all and therefore might not be able to see any objects as having the three dimensions of objects they were familiar with by touch. This was probably the case with the Prakash subjects. The experimenters report that their tests were done within forty-eight hours of surgery on the first eye. Thus the subjects were presumably unable to see in depth; they lacked binocular vision, which is the most powerful mechanism of depth perception and one of the few that does not rely on crossmodal learning. The Project Prakash experimenters tested three of the subjects again not long after the operation: five days in one case, seven days in another, and five months in the third. They found significant improvement in vision-to-touch matching, the success rate now averaging about 85 percent.
Sinha mentions (but does not endorse) the possibility that the subjects’ rapid improvement may have been due to their acquiring the ability to construct three-dimensional visual representations.17 If that were the case, it would confirm my Diderot-inspired conjecture that their initial failures to match seen and felt were due in good part to their inability to see in three dimensions at all. That conjecture is open to criticism from two opposing flanks. On the one hand, it may be that monocular subjects can see three-dimensional shapes thanks to shape-from-motion cues;18 on the other hand, it may be that the ability to see in three dimensions is not acquired by newly sighted subjects as soon as their ability to match seen to felt shapes.19 Nonetheless, it seems to me that our state of knowledge would be greatly improved by a two-dimensional variant of the Prakash experiments. Of course, there are no strictly two-dimensional tangible objects. It may nonetheless be possible to conduct a Molyneux experiment with objects that are effectively two-dimensional—for example, circles and squares of fine wire that subjects are allowed to trace with their fingers.


Sixth and last, I would like to ask if Sinha has any data bearing on a variant of the Molyneux question I posed in Van Cleve (2007). In developing his thesis of the radical heterogeneity of the objects of sight and touch, Berkeley explicitly denies that shape, size, and motion are “common sensibles”—features that can be perceived by more than one sense. What would he say about number, I wondered, which Aristotle had listed as a common sensible? In particular, how would he answer the one-two Molyneux question, a question we might frame in language reminiscent of Molyneux’s own as follows: Suppose a man born blind and taught to distinguish by his touch a single raised dot from a pair of raised dots (as Braille readers can do). Suppose the blind man made to see, and presented with a single visible dot on the left and a pair on the right. Quaere: whether by his sight and before he touched them, he could now distinguish and tell, which is the single and which the pair?

My thought had been that every reason Berkeley has for saying no to the original Molyneux question is also a reason for saying no to the one-two Molyneux question. Yet I also thought there would be something bizarrely defective about a subject who was stymied by the crossmodal number recognition task. In consequence, I took there to be something deeply implausible about Berkeley’s overall position. Does Sinha have data on the one-two question? I notice that one of his target-distractor pairs consists of two blocks that differ only in that one cylinder projects from one of them and two from the other (Held et al., 2011, fig. 1). I do not know whether any other target-distractor pairs differed only in a one-versus-two way or whether such pairs were used often enough to generate any useful statistics. I would love to see an experiment dedicated to the one-two Molyneux question.

Acknowledgments

Thanks to David Bennett for helpful comments on an earlier draft.

Notes

1. Substantial portions of Diderot’s letter are translated and discussed in Morgan (1977).
2. For discussion of this sort of possibility, see Thomson (1974).
3. Here is another reason for insisting on the qualification I am proposing: even after learning from experience that what affects his touch in a certain way normally


affects his sight in a certain associated way (in other words, even after learning what it takes according to a Berkeleian to perform Molyneux tasks), the subject could not authoritatively identify tangible objects on the basis of their visible appearances. For how could he be sure that the conditions of observation are normal and not such as to distort the visual appearances he has come through past experience to expect?
4. Molyneux’s name does not occur at all in the index to Reid ([1764] 1997), hereinafter cited as IHM.
5. The qualification (almost perfectly similar in shape) is necessary because Reid thinks the geometry of visible figures, unlike the geometry of tangible figures, is non-Euclidean. A tangible rectangle has exactly 360 degrees, a visible rectangle slightly more than that. For further explanation of this point, see Van Cleve (2002).
6. For further discussion of visible figures and their relation to a sphere centered on the eye, see Van Cleve (2002).
7. I replace Reid’s “is” by “determines” for reasons elaborated in Van Cleve (2002). The projection is curved and at a certain distance from the eye; the visible figure has neither of these properties.
8. Saunderson is reported to have had an opinion on the Molyneux question himself: that the subject could indeed know, if given the Leibniz hint (Degenaar, 1996, 49). I daresay that if Saunderson was capable of having this opinion, it must have been right!
9. It is sometimes suggested that Molyneux’s question is a question of nature vs. nurture, but that dichotomy is not exhaustive unless “nature” is used broadly to cover two quite different possibilities. One is that we are innately hardwired so as to respond the same way to certain tactile and visual stimuli, once we receive them; the other is that round objects as presented to touch and round objects as presented to vision are inherently similar. In the latter case, it would be the nature of the objects as much as our own nature that facilitated a yes answer to Molyneux’s question. Reid’s yes answer is based on his belief in inherent similarity, which a person made to see would be in a position to appreciate. For more on innate mappings as an alternative basis for an affirmative answer, see Evans (1985), especially 381.
10. It is possible, however, to construe it alternatively as a broadly normative question, a question not about what a Molyneux subject would say, but about what he should say, or would be in a position to say given the evidence available to him. The endorsement of Berkeley’s conditional I attribute to Reid is perhaps more plausible under such a normative construal of the Molyneux question.
11. Degenaar reports that the first researcher explicitly to pose a Molyneux task with sphere and cube was J. C. A. Franz in 1841. Franz’s patient could not recognize


the sphere and cube as such, but identified them instead as circle and square. Some sixty years later, A. M. Ramsay conducted a Molyneux test with a ball and a brick, telling his newly sighted patient which two objects he would be shown. The patient was able to identify each correctly. These cases highlight for me the significance of Diderot’s variation and Leibniz’s hint (see Degenaar, 1996, 87–97, for further details).
12. Until recently, it has also been believed that there is a critical period for the development of the binocular neurons that enable one to perceive depth. That this critical period is a myth is one of the main lessons of Barry (2009).
13. Brian Glenney has pointed out to me that it does not matter for purposes of the experiment whether novelty or familiarity draws an infant’s attention; all that matters is that the infants respond differently to seen objects that are new and seen objects that were previously felt.
14. The first experiments of this type were those of Meltzoff and Borton (1979). Sinha notes that Maurer et al. (1999) pointed out flaws in Meltzoff’s experiments and failed in their efforts to replicate his results, as did two other teams. Does Sinha know of any problems with the more recent work by Streri and Gentaz, which was designed to avoid the flaws in Meltzoff and which did find evidence of touch-to-vision transfer?
15. Lego blocks are not cubes, but cubes, prisms, wedges, and objects of other shapes with projecting cylinders.
16. Kenneth Pearce has pointed out to me that it is actually a somewhat delicate matter to phrase the hint without begging the question against Berkeley, who denies that shape belongs to both visible and tangible objects in any univocal sense of the term.
17. I presume, but do not know, that the retests were done while the subjects were still monocular, in which case any three-dimensional visual abilities they possessed at the time of retesting cannot have been due to binocular disparity.
18. Evidence that eight-week-old infants can perceive three-dimensional shape on the basis of motion cues, which may be monocular, is presented in Arterberry and Yonas (2000). However, the objects shown to Prakash subjects were stationary.
19. Such is Sinha’s suggestion in Held et al. (2011, 552).

References

Arterberry, M., & Yonas, A. (2000). Perception of three-dimensional shape specified by optic flow by eight-week old infants. Perception & Psychophysics, 62, 550–556.
Barry, S. R. (2009). Fixing my gaze. New York: Basic Books.


Berkeley, G. [1709] (1963). An essay towards a new theory of vision. In C. M. Turbayne (Ed.), Works on vision (pp. 7–102). Indianapolis: Bobbs-Merrill.
Degenaar, M. (1996). Molyneux’s problem. Dordrecht: Kluwer.
Evans, G. (1985). Molyneux’s question. In Collected papers (pp. 364–399). Oxford: Clarendon Press.
Held, R., Ostrovsky, Y., de Gelder, B., Gandhi, T., Ganesh, S., Mathur, U., & Sinha, P. (2011). The newly sighted fail to match seen with felt. Nature Neuroscience, 14, 551–553.
Leibniz, G. W. [1704] (1981). New essays on human understanding. P. Remnant & J. Bennett (Trans. and Eds.). Cambridge: Cambridge University Press.
Locke, J. [1694] (1975). An essay concerning human understanding. P. H. Nidditch (Ed.). Oxford: Oxford University Press.
Maurer, D., Stager, C. L., & Mondloch, C. J. (1999). Cross-modal transfer of shape is difficult to demonstrate in one-month-olds. Child Development, 70, 1047–1057.
Meltzoff, A. N., & Borton, W. (1979). Intermodal matching by human neonates. Nature, 282, 403–404.
Morgan, M. (1977). Molyneux’s question: Vision, touch, and the philosophy of perception. Cambridge: Cambridge University Press.
Reid, T. [1764] (1997). An inquiry into the human mind. D. R. Brookes (Ed.). Edinburgh: Edinburgh University Press.
Sinha, P. (2013). Once blind and now they see. Scientific American, July, 48–55.
Streri, A., & Gentaz, E. (2003). Cross-modal recognition of shape from hand to eye in human newborns. Somatosensory and Motor Research, 20, 11–16.
Thomson, J. (1974). Molyneux’s problem. Journal of Philosophy, 71, 637–650.
Van Cleve, J. (2002). Thomas Reid’s geometry of visibles. Philosophical Review, 111, 373–416.
Van Cleve, J. (2007). Reid’s answer to Molyneux’s question. Monist, 90, 251–270.

10 Modeling Multisensory Integration Loes C. J. van Dam, Cesare V. Parise, and Marc O. Ernst

When we move through the world, we constantly rely on sensory inputs from vision, audition, touch, etc., to infer the structure of our surroundings. These sensory inputs often do not come in isolation, and multiple senses can be stimulated at the same time. For instance, if we knock on a door, we see our hand making impact on the door, we hear the resulting sound, and we feel the movement of our arm and our hand making contact (see figure 10.1). How does our sensory system make sense of all these different inputs if, for example, we need to judge where exactly we are knocking? We can illustrate that, in order to perform such a task, we need to combine our sensory information in several fundamentally different ways. For instance, to obtain a visual estimate of where we are knocking, we have the information of the image of our hand on the retina, but this alone is insufficient information for estimating the location of the hand in the world. Such an estimate of the hand’s location can only be obtained if the information from the retinal location of the hand is combined with the information from the eye and neck muscles about the orientation of the eyes relative to the world. In other words, retinal location and eye and neck muscle signals provide complementary information, and without either one or the other, a single estimate of the location of the hand in the world would not be possible. In a similar fashion, the auditory cues about the location of the knocking sound need to be combined with information about the position of the ears with respect to the world to estimate where the knocking sound is coming from. Using the different senses, we can obtain several independent estimates about our knocking location. As indicated above and in figure 10.1, we have an estimate of where our hand is through vision (retinal location combined with eye- and neck-muscle information), through audition (location of the sound combined with neck-muscle information), and through


Figure 10.1 When we knock on a door, our sensory system receives information from several different modalities. We see our hand knocking on the door. Together with the signals from our eye and neck muscles, this provides information about where the knocking event takes place. Similarly, our auditory system receives input from the sound produced by the contact. At the same time, we also feel the contact itself and the effects it has on our arm. To behave optimally, these separate sensory modalities have to be combined to come to a unified percept (reprinted from Ernst & Bülthoff, 2004, with permission from Elsevier).

proprioception (location of the hand estimated from muscles and the stretch of the skin on our arm). Each of these estimates alone is, in principle, sufficient to provide a good and unambiguous estimate of the hand’s location in the world. That is, besides providing complementary information, the separate sensory modalities often provide redundant estimates. In this chapter, we will describe how humans exploit such redundancy to get more reliable perceptual estimates through multisensory integration.1 In order to deal with redundant sensory information, the simplest strategy would be to pick just one sensory modality to base our estimate on and virtually ignore the others (see Rock & Victor, 1964). Though simple, this strategy of basing our estimate on a single sensory modality has several drawbacks. First, the sensory estimate of the chosen modality could be biased, and thus might not necessarily be the best we can do if, for example, the bias for the ignored senses is smaller. Second, the sensory signals may be noisy, such that the very same stimulus would be perceived slightly differently each time it is presented. That is, the estimate based on a single modality may also not be very precise, and integrating estimates


from separate sensory modalities would cause their noise to partially cancel out, thus leading to a more precise estimate. Third, individual sensory signals are often ambiguous, hence combining information from multiple senses can help the system to resolve such ambiguities. Therefore, rather than ignoring part of our senses, a much better solution to deal with multiple sensory inputs would be to integrate them to come to a more precise and unambiguous estimate.

Consider, for instance, estimating the size of an object by simultaneously seeing and touching it (Ernst & Banks, 2002). Both vision and touch provide a sensory signal that is inherently corrupted by noise. Thus, when presented with a certain signal, we can only say something about the actual stimulus that may have caused it with limited certainty. This can be represented probabilistically by likelihood functions representing the likelihood of the occurrence of the actual stimulus given the specific signal. The shape of these likelihood functions is determined by the sensory noise. Assuming that the noise is Gaussian distributed and independent across sensory modalities,2 the amount of noise can be quantified by the variance $\sigma_i^2$ of the likelihood function (where $i$ refers to the different modalities). An estimate $\hat{S}_i$ of the relevant stimulus property can be obtained by taking the maximum of the likelihood function. That is, both vision and touch provide an independent estimate $\hat{S}_i$ of the size (see figure 10.2). If the sensory noises $\sigma_i^2$ in the separate estimates are independent, the optimal way to combine multiple sensory estimates in order to maximize precision is to multiply the likelihood functions corresponding to the individual estimates. When the likelihood functions are Gaussian, this corresponds to a weighted average of the sensory estimates, with weights proportional to the reciprocal of the variance of each estimate (i.e., its precision, sometimes also referred to as the reliability). This is called maximum likelihood estimation (MLE) and can be written as follows:

$$\hat{S} = \sum_i w_i \hat{S}_i \quad \text{with} \quad w_i = \frac{1/\sigma_i^2}{\sum_j 1/\sigma_j^2} \tag{1}$$

Here, the combined estimate $\hat{S}$ is the weighted average of the individual estimates $\hat{S}_i$ from each modality, with weights $w_i$ assigned according to the relative reliability of each sensory estimate. Given that the weight is inversely related to the amount of noise in each estimate, the estimate with less variance will receive a higher weight. Furthermore, when the senses are combined in this way, the variance in the combined estimate is reduced with respect to either of the unisensory estimates:


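[Figure 10.2 plot: probability (y-axis) against size (x-axis), showing the likelihood functions for touch (T), vision (V), and their combination (VT), with maxima ŜT, ŜV, and ŜVT.]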

Figure 10.2 To estimate the size of an object, the separate estimates obtained from vision ŜV and touch ŜT have to be combined into a unified one ŜVT. These estimates can be represented by probability distributions (i.e., likelihood functions) of how likely different sizes of the object are, based on the sensory input. The likelihood functions vary in uncertainty σ. The less reliable the estimate (larger σ), the wider the distribution of the likelihood function. The combined estimate corresponding to the normalized product of the likelihood functions has a mean (i.e., maximum) equal to the weighted average of the unisensory estimates, where the relative weights are contingent on the levels of uncertainty in each estimate. The more reliable estimate (narrower distribution) receives a higher weight. The combined estimate is more reliable than either of the independent estimates, as indicated by the narrower distribution.

$$\sigma^2 = \frac{1}{\sum_i 1/\sigma_i^2} \tag{2}$$

or, as in our example with vision and touch: $\sigma_{VT}^2 = \frac{\sigma_V^2 \sigma_T^2}{\sigma_V^2 + \sigma_T^2}$. This implies that when the variance in the two sensory estimates is very different, the final combined estimate and the resulting variance are both close to that of the best (i.e., more precise) estimate.

Ernst and Banks (2002) investigated whether human multisensory perception is consistent with the MLE combination of the sensory estimates. Participants estimated the size of a ridge with vision, touch, or with both modalities simultaneously. To test how the estimates from both senses are combined, Ernst and Banks used a virtual reality setup to bring the two senses into conflict and to parametrically manipulate the relative reliabilities. They found that observers’ behavior was indeed well predicted by the MLE weighting scheme.
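The weighting scheme of equations (1) and (2) can be sketched numerically. The following is a minimal illustration, not the procedure used by Ernst and Banks: the function name `mle_combine` and the example numbers (a visual size estimate of 5.0 with variance 1.0, a haptic estimate of 6.0 with variance 4.0) are hypothetical. The second half checks, on a grid, that the normalized product of two Gaussian likelihoods has the same peak and variance as the closed-form result.

```python
import numpy as np

def mle_combine(estimates, variances):
    """Reliability-weighted average of independent Gaussian estimates."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    reliabilities = 1.0 / variances                  # 1 / sigma_i^2
    weights = reliabilities / reliabilities.sum()    # equation (1)
    combined = float(np.dot(weights, estimates))
    combined_variance = float(1.0 / reliabilities.sum())  # equation (2)
    return combined, combined_variance, weights

# Hypothetical example values (assumptions, not data from the chapter).
s_hat, var_hat, w = mle_combine([5.0, 6.0], [1.0, 4.0])
# s_hat is approximately 5.2, var_hat approximately 0.8, w approximately [0.8, 0.2]

def gaussian(x, mu, var):
    """Unnormalized Gaussian likelihood with mean mu and variance var."""
    return np.exp(-0.5 * (x - mu) ** 2 / var)

# Numerical check: multiply the two likelihoods on a fine grid.
grid = np.linspace(0.0, 12.0, 120001)
product = gaussian(grid, 5.0, 1.0) * gaussian(grid, 6.0, 4.0)
p = product / product.sum()                          # normalize to sum to 1
peak = grid[np.argmax(product)]                      # maximum of the product
numeric_mean = float((grid * p).sum())
numeric_var = float(((grid - numeric_mean) ** 2 * p).sum())
```

In this example the more reliable visual estimate receives four times the weight of the haptic one, and the combined variance (about 0.8) is smaller than either unisensory variance, as equation (2) requires.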


By now, many studies have demonstrated optimal integration in other cue/modality combinations. For example, Alais and Burr (2004) showed that integration of spatial cues from vision and audition is also obtained in an optimal fashion, thereby accounting for the well-known ventriloquist effect. Similarly, the numerosity of sequences of events presented in two or more modalities such as tactile taps combined with auditory beeps and/ or visual flashes are perceived according to optimal integration principles (e.g., Bresciani, Dammeier, & Ernst, 2006, 2008; see also Shams, Kamitani, & Shimojo, 2000). Further examples of MLE integration across modalities include visuohaptic shape perception (Helbig & Ernst, 2007b), perception of the position of the hand using both vision and proprioception (e.g., van Beers, Sittig, & Denier van der Gon, 1999), and visual-vestibular heading perception (Fetsch, DeAngelis, & Angelaki, 2010), to name but a few. Integration of redundant sensory cues does not occur only between, but also within senses. This is, for instance, the case for visual slant perception, whereby both binocular disparity and texture gradients provide redundant slant cues. Notably, even in this case, cue integration is well predicted by a weighted average of binocular and texture cues (Hillis et al., 2002; Knill & Saunders, 2003; Hillis et al., 2004), thus suggesting that the MLE weighting scheme might represent a universal principle of sensory integration. When estimating a property of the environment, however, we do not just rely on our sensory inputs, but also make use of prior knowledge. For instance, we know that under most circumstances light comes from above (e.g., Brewster, 1826; Mamassian & Goutcher, 2001) and that most objects are static (Stocker & Simoncelli, 2006; Weiss, Simoncelli, & Adelson, 2002). It would therefore make sense to use such information to interpret incoming sensory signals that may otherwise be ambiguous or less precise. 
Within a probabilistic framework of perception, Bayes’ theorem provides a formal approach to model the combination of sensory inputs and prior information. In the Bayesian framework, the prior knowledge is represented by a probability distribution indicating the expected likelihood of certain circumstances occurring. Making use of both the sensory inputs and prior information, the combined perceptual estimate (i.e., the posterior) corresponds to the normalized product of the likelihood functions defining the sensory information and the prior knowledge (i.e., the prior). This is known as Bayes’ Rule, which can be formally written as:

$$P(X \mid \hat{S}) \propto P(\hat{S} \mid X)\, P(X) \tag{3}$$


Here $P(\hat{S} \mid X)$ represents the likelihood function of the sensory input $\hat{S}$ occurring given the possible state of the world $X$. $P(X)$ represents the prior probability of the state of the world $X$ occurring based on, for instance, prior experience. Combining these two estimates provides us with the posterior distribution $P(X \mid \hat{S})$, whose maximum (known as maximum a posteriori, or MAP) indicates the most likely state of the world $X$ given the prior knowledge and the sensory input $\hat{S}$. A number of researchers have demonstrated that such prior information is also optimally integrated with sensory evidence to provide a more reliable estimate of the properties of the environment (e.g., Adams, Graf, & Ernst, 2004; Girshick, Landy, & Simoncelli, 2011; Stocker & Simoncelli, 2006; Weiss, Simoncelli, & Adelson, 2002; Parise et al., 2014). Notably, MLE constitutes a particular case of Bayesian sensory combination in the absence of prior information (i.e., flat prior), and in this case the combined estimate $\hat{S}$ of equation (1) would correspond directly to the maximum of the posterior distribution $\max\left[P(X \mid \hat{S})\right]$. For more information and further examples concerning the integration of prior information using Bayes’ rule, see also the tutorial on Bayesian cue combination contained in chapter 1 of this volume by Bennett, Trommershäuser, and van Dam.
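Equation (3) and the MAP estimate can be illustrated on a grid. This is a minimal sketch under stated assumptions: both the likelihood and the prior are taken to be Gaussian, and the numbers (a measurement at 2.0 with standard deviation 1.0, a prior centered on 0.0 with standard deviation 2.0) and the names `gaussian` and `map_estimate` are hypothetical. It also shows that a flat prior leaves the maximum of the likelihood unchanged.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Unnormalized Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def map_estimate(grid, likelihood, prior):
    """Return the MAP value and the normalized posterior on the grid."""
    dx = grid[1] - grid[0]
    posterior = likelihood * prior                    # equation (3), unnormalized
    posterior = posterior / (posterior.sum() * dx)    # integrate to 1
    return grid[np.argmax(posterior)], posterior

grid = np.linspace(-10.0, 10.0, 20001)                # candidate world states X

# Hypothetical values, chosen only for illustration.
likelihood = gaussian(grid, 2.0, 1.0)                 # P(S_hat | X)
prior = gaussian(grid, 0.0, 2.0)                      # P(X)
flat_prior = np.ones_like(grid)                       # no prior information

x_map, posterior = map_estimate(grid, likelihood, prior)
x_flat, _ = map_estimate(grid, likelihood, flat_prior)

# For Gaussians the MAP is the reliability-weighted average of the
# measurement and the prior mean: (2.0/1 + 0.0/4) / (1/1 + 1/4) = 1.6.
analytic_map = (2.0 / 1.0 + 0.0 / 4.0) / (1.0 / 1.0 + 1.0 / 4.0)
```

With these numbers the prior pulls the estimate from 2.0 toward 0.0, giving a MAP near 1.6, whereas the flat prior returns the maximum of the likelihood itself.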

1 The Cost of Integration

As shown above, integration of sensory information brings about benefits in the combined estimate. Particularly, the precision of the combined estimate is increased compared to the estimates based on each individual sensory modality, and combining the estimate from sensory information with prior information further increases the precision of the final estimate. Integration, however, also comes at a cost. If the senses are completely fused into a unified percept, this would mean we do not have access to the separate estimates of the individual senses anymore. In other words, we would not be able to discriminate between a small and a large intersensory conflict if the weighted average in the two conflicts leads to the same combined estimate (perceptual metamers, see figure 10.3). Hillis and colleagues (2002) investigated to what extent the separate senses are fused, or put differently, whether we still have access to the individual sensory estimates feeding into the combined percept. Participants had to identify the odd one out in a set of three stimuli and could use any difference, regardless of whether unisensory or combined, to do the task. If participants had access to unisensory estimates, they should have been able to perform this task accurately. Conversely, if the senses were completely fused, identifying the odd


[Figure 10.3 plot: panels running from “No combination” (left) to “Complete fusion” (right); x-axis SV (JND), y-axis SH (JND), each ranging from –1 to 1.]

Figure 10.3 Discrimination performance in an oddity task for different levels of strength of fusion represented in 2-D sensory space. The x-axis represents the visual estimate of, for example, object size; the y-axis represents the haptic estimate. Both dimensions are normalized for the unisensory just noticeable difference (JND). Black corresponds to discrimination performance at chance relative to the center of this 2-D space, and white indicates perfect discrimination. If the separate estimates are independent, i.e., no combination takes place, discrimination performance should be about equal in all directions (left panel) and fully dependent on detection of a difference in either of the unisensory modalities. If, however, full fusion takes place (right panel), all sensory combinations leading to the same combined estimate (negative diagonal) become indistinguishable. Thus, discrimination performance along the negative diagonal should be at chance (perceptual metamers). Between these two extremes there are different levels of coupling strength (middle) (reprinted from Ernst, 2007, © ARVO).

one out would be impossible when it is defined by a sensory conflict (i.e., metameric behavior). For cue combination within a single sensory modality, such as the combination of visual perspective and binocular disparity cues to slant, they found that information about the separate estimates is indeed lost. That is, participants clearly showed metameric behavior. However, for size estimation across two separate modalities (vision and haptics), observers still had access to the individual senses in spite of the fact that sensory fusion took place, as observed by Ernst and Banks (2002). Costs of multisensory integration in terms of metameric behavior are also evident when previously unrelated cues end up integrated after prolonged exposure to multisensory correlation (Ernst, 2007; Parise et al., 2013).

2 The Correspondence Problem

It may seem that the cost of integration, that is, losing information about the separate sensory modalities, is a small price to pay when combining the


separate sensory modalities. After all, through sensory integration we can obtain more precise estimates about the same physical property of interest compared to each sensory modality alone. However, in spite of being a statistically optimal solution, a mandatory fusion of sensory information according to MLE is not always the best strategy to deal with multiple incoming signals. For example, if the spatial separation between a sound source and a visual event is large, we can wonder whether those signals are not causally related and, hence, whether it makes sense at all to integrate them. If we would always integrate the signals from vision and audition, information about this discrepancy would be lost and we would end up with one inaccurate combined estimate rather than two separate estimates of the two unrelated events. In other words, the perceptual system has to determine whether or not two signals belong to the same event in order to choose whether or not to integrate them. This is known as the correspondence problem, or causal inference. It has been demonstrated that spatial separation (Gepshtein et al., 2005), temporal delay (Bresciani et al., 2005), and natural correlation (Parise & Spence, 2009) between two sensory signals are used by the perceptual system as parameters that influence the integration of the signals. However, fusion can survive a large spatial separation between the sensory signals, provided there is additional evidence that the signals are coming from a common source (e.g., seeing your own hand move via a mirror, Helbig & Ernst, 2007a). Moreover, it has been demonstrated that prolonged exposure to spatiotemporal asynchrony can destroy fusion (Rohde, Di Luca, & Ernst, 2011). On top of being spatially and temporally coincident, multisensory signals originating from the same distal event are also often similar in nature. 
Specifically, the correlation of temporal structure of multiple sensory signals, rather than merely their temporal coincidence, provides a powerful cue for the brain to determine whether or not multiple sensory signals have a common cause. In order to demonstrate the role of signal correlation in causal inference, Parise and colleagues (2012) asked participants to localize streams of beeps and flashes presented together or in isolation. In combined audiovisual trials, the temporal structure of the visual and auditory stimuli could either be correlated or not. Notably, localization was well predicted by the MLE weighting scheme only when the signals were correlated, while it was suboptimal otherwise. This demonstrates that the correlation in the fine temporal structure of multiple sensory signals is indeed a cue for causal inference. In a Bayesian framework, the strength of sensory fusion can be modeled as a coupling prior (Bresciani, Dammeier, & Ernst, 2006; Ernst 2005, 2007,

Modeling Multisensory Integration


2012; Ernst & Di Luca, 2011; Knill, 2007; Körding et al., 2007; Roach, Heron, & McGraw, 2006; Shams, Ma, & Beierholm, 2005) that represents the belief that two signals go together. This belief is likely based on prior experience: having been exposed to consistent correlations between the sensory modalities, we build up an expectation of whether certain combinations of signals do (or do not) belong together. This can be better understood if we illustrate the integration process in a two-dimensional perceptual space with the separate sensory modalities represented along the x- and y-axes (figure 10.4). In such a space, a prior expectation for coupling would be represented by the tuning of a 2-D probability distribution, the coupling prior, along the identity line (figure 10.4, middle column). In the case of strong perceptual coupling, this tuning will be very narrow (i.e., the signals are expected to fully correlate; see figure 10.4, bottom). If instead the sensory signals are expected to co-occur only by chance, the coupling prior will be flat (no association, figure 10.4, top). In other words, the coupling prior can be interpreted as a predictor of how strongly one sensory signal reflects what the other one should be. Using the coupling prior, the separate sensory estimates can be adjusted according to both the estimate obtained from the other source and the strength of the expected association. In our 2-D representation, this means that the estimates from the unisensory modalities (the likelihood function, figure 10.4, left column) are multiplied by the coupling prior (figure 10.4, middle column), to obtain a combined a posteriori estimate (right column). The maximum of the a posteriori distribution (MAP), that is, the most likely pair of combined sensory estimates $\hat{S}^{MAP} = (\hat{S}_V^{MAP}, \hat{S}_H^{MAP})$, would eventually determine the final percept. 
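The multiplication of likelihood and coupling prior described above can be sketched numerically. The following is a minimal illustration, assuming independent Gaussian likelihoods for the two unisensory signals and a Gaussian coupling prior on their difference; under those assumptions the MAP estimate has a simple closed form. The function name, argument names, and example values are hypothetical and are not taken from the studies cited.

```python
import math


def map_with_coupling_prior(z_v, z_h, var_v, var_h, var_x):
    """Return the MAP pair (s_v, s_h).

    Assumptions (illustrative only):
      - z_v, z_h are the unisensory (e.g., visual and haptic) estimates,
        with independent Gaussian noise of variance var_v and var_h;
      - the coupling prior is Gaussian in the difference s_v - s_h,
        with variance var_x (the squared coupling uncertainty).
    """
    conflict = z_v - z_h
    total = var_v + var_h + var_x
    # Each estimate is pulled toward the other in proportion to its own
    # variance; the pull vanishes as var_x grows (flat prior).
    s_v = z_v - var_v * conflict / total
    s_h = z_h + var_h * conflict / total
    return s_v, s_h


if __name__ == "__main__":
    # Hypothetical values: vision reports 10, touch reports 12.
    for var_x in (0.0, 5.0, math.inf):
        print(var_x, map_with_coupling_prior(10.0, 12.0, 1.0, 4.0, var_x))
```

In this sketch, setting var_x to zero places both estimates at the reliability-weighted average (complete fusion), an infinite var_x leaves them at their unisensory values (no integration), and intermediate values move the two estimates only part of the way toward each other.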
Depending on the shape of the coupling prior (e.g., a 2-D Gaussian probability distribution) and in particular on the width of the prior along the negative diagonal, which represents the coupling uncertainty $\sigma_x$, different a posteriori estimates will be obtained. The MLE type of integration in which the senses are completely fused can be illustrated as a prior that enables a one-to-one mapping from one sensory estimate to the other (i.e., an infinitely thin line along the positive diagonal; figure 10.4, bottom row). In this case, the system is certain the two signals always belong together, thus the coupling uncertainty $\sigma_x$ is zero. Here, the combined estimate in the posterior distribution will always end up on the identity line $\hat{S}_V^{MAP} = \hat{S}_H^{MAP}$, and information about a discrepancy between the senses would be completely lost. At the other extreme, if the sensory modalities are considered independent by the perceptual system, the coupling prior is flat (i.e., the coupling uncertainty $\sigma_x$ is infinite). In this case, the a posteriori estimate is fully determined by the likelihood function, and no combined percept


L. C. J. van Dam, C. V. Parise, and M. O. Ernst

[Figure 10.4 panels: Likelihood (left column), Prior (middle column), Posterior (right column, with the MLE and MAP estimates marked); top row: flat prior ($\sigma_x^2 \to \infty$); bottom row: impulse prior ($\sigma_x^2 = 0$).]

Figure 10.4 The sensory estimates from two separate modalities (e.g., vision and touch) are represented in 2-D space (left side). The Gaussian blob represents the uncertainties (blob-width) as well as the unisensory estimates (position along the corresponding axis). The coupling prior linking these separate estimates together can have different shapes: from completely flat, indicating no integration (top), to infinitely narrow, indicating full fusion of the senses (bottom). To obtain the final a posteriori estimate (right column), the distributions of the unisensory estimates are multiplied with the prior. Depending on how strongly the senses are combined, the location of the maximum a posteriori probability shifts in the 2-D perceptual space (adapted from Ernst, 2007).


is formed (figure 10.4, top row). For intermediate levels of fusion, the coupling prior can be modeled as a bivariate distribution aligned along the diagonal (middle plot). In this case, partial fusion will take place, and there will be a perceptual benefit for estimating the property of interest without the information about the separate modalities being completely lost (see figure 10.3). The narrower the distribution of the coupling prior, the stronger the fusion between the senses. Here it is important to note that as long as priors and likelihood distributions are assumed to be Gaussian, sensory integration will always occur for all possible signal combinations. That is, assuming Gaussian priors and likelihoods, the weighting between the senses would not change regardless of whether the discrepancy between the senses is small (close to the diagonal) or very large (off-diagonal, as demonstrated in figure 10.5a). Ideally, however, if the discrepancy between the signals is large, one can wonder whether they are caused by the same event at all, and rather than integrating them, it would be better to treat them separately. Such a complete breakdown of integration for large discrepancies cannot be modeled using a single 2-D Gaussian distribution for the coupling prior. In other words, to model both integration for small sensory discrepancies as well as

[Figure 10.5 panels: Prior, Likelihood, Posterior, and a 1-D cross section of the posterior, shown for increasing prior-likelihood separation; row A: Gaussian prior; row B: heavy-tailed prior; axes: visual size and haptic size.]

Figure 10.5 (A) Integration using a Gaussian-shaped prior. The weighting for the a posteriori estimate is always the same regardless of the discrepancy between the senses. (B) Using a heavy-tailed prior, integration occurs for small discrepancies, but with large discrepancies, the a posteriori estimate is the same as for the likelihood, indicating a breakdown of integration. In other words, the heavy-tailed prior can explain both integration for small sensory discrepancies as well as breakdown of integration for large sensory discrepancies (adapted from Ernst, 2012, fig. 30.7).
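The contrast drawn in the caption of figure 10.5 can be reproduced with a small grid search. The sketch below is only an illustration under stated assumptions: Gaussian likelihoods, and a coupling prior built as a Gaussian ridge along the identity line plus a constant floor standing in for the heavy tails. The prior is left unnormalized, and all function names and parameter values (sigma_x, tail, step, margin) are hypothetical choices, not fitted quantities.

```python
import math


def likelihood(s_v, s_h, z_v, z_h, sigma_v, sigma_h):
    """Unnormalized Gaussian likelihood centered on the unisensory estimates."""
    return math.exp(-(s_v - z_v) ** 2 / (2.0 * sigma_v ** 2)
                    - (s_h - z_h) ** 2 / (2.0 * sigma_h ** 2))


def coupling_prior(s_v, s_h, sigma_x, tail):
    """Gaussian ridge along s_v == s_h plus a constant floor (the 'heavy tails').

    tail = 0.0 gives a purely Gaussian coupling prior.
    """
    d = s_v - s_h
    return math.exp(-d * d / (2.0 * sigma_x ** 2)) + tail


def grid_map(z_v, z_h, sigma_v=1.0, sigma_h=1.0, sigma_x=0.5,
             tail=0.05, step=0.1, margin=3.0):
    """Brute-force MAP estimate (s_v, s_h) on a square grid."""
    lo = min(z_v, z_h) - margin
    hi = max(z_v, z_h) + margin
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    best, best_p = None, -1.0
    for s_v in grid:
        for s_h in grid:
            p = (likelihood(s_v, s_h, z_v, z_h, sigma_v, sigma_h)
                 * coupling_prior(s_v, s_h, sigma_x, tail))
            if p > best_p:
                best_p, best = p, (s_v, s_h)
    return best


if __name__ == "__main__":
    print("small conflict, heavy tails:", grid_map(0.5, -0.5))
    print("large conflict, heavy tails:", grid_map(5.0, -5.0))
    print("large conflict, Gaussian only:", grid_map(5.0, -5.0, tail=0.0))
```

With these assumed values, a small conflict yields estimates drawn close together, a large conflict under the heavy-tailed prior leaves the estimates at the likelihood peak, and the same large conflict under the purely Gaussian prior still pulls the estimates most of the way toward each other.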


breakdown of integration for large discrepancies, the shape of the coupling prior is very important. Using a mixture of the distributions described above, we can, however, create a single coupling prior that can explain both integration for small conflicts as well as breakdown of integration for large conflicts. For example, in figure 10.5b, the shape of the prior is Gaussian-like, but instead of approaching zero for large sensory conflicts, it has heavy tails added to it. That is, for large sensory conflicts, the prior is flat and nonzero. For the posterior estimates, this means that for small conflicts (top), the combined estimate is influenced by the peak of the prior (integration) in much the same way as discussed above for Gaussian priors. However, for large conflicts, the flat part of the prior does not influence the combined estimate (bottom) and thus will lead to independent treatment of the signals. This demonstrates that simply adding heavy tails to the coupling prior (i.e., nonzero probability for large cue conflicts) can explain the breakdown of integration when the conflict between the signals is too large, while at the same time keeping sensory integration for small conflicts intact (Ernst & Di Luca, 2011; Ernst, 2012; Körding et al., 2007; Roach, Heron, & McGraw, 2006). Such a smooth transition from integration to no integration depending on the sensory discrepancy, known as the “window of integration,” has also been repeatedly shown experimentally (Bresciani et al., 2005; Jack & Thurlow, 1973; Jackson, 1953; Radeau & Bertelson, 1987; Shams, Kamitani, & Shimojo, 2002; van Wassenhove, Grant, & Poeppel, 2007; Warren & Cleaves, 1971; Witkin, Wapner, & Leventhal, 1952), and the coupling prior approach using heavy tails approximates these experimental findings reasonably well. However, an empirical measure of the precise shape of the coupling prior along all possible discrepancy axes (e.g., temporal, spatial, etc.) 
is of course not so easy to obtain, as it would require a very large number of measurements.

3 Learning

In the integration process, there are several stages at which learning can influence how and to what extent the sensory modalities are fused. First of all, in order to integrate the sensory information optimally, we need to know the reliabilities associated with each signal. That is, we need to assign the relative weights to each cue. How the sensory system exactly estimates the reliabilities, and thus the weights, is not entirely clear, but learning appears to be involved. For instance, Ernst and colleagues (2000) put two unisensory estimates of slant (perspective and stereo disparity) into


conflicts of different extents. At the same time, by moving an object along the surface, haptic sensation provided a third estimate of the slant. The haptic slant estimate was always in correspondence with one visual cue (either perspective or disparity), and the slant specified by the remaining visual cue was randomly chosen from several different levels of conflict. Given such a situation, one of the visual estimates was always the odd one out from the three slant estimates, and it would make sense for the perceptual system to learn to weight this cue less. This is exactly what Ernst and colleagues (2000) found. After exposure to such a situation, a reweighting of the visual cues had taken place, such that the estimate that had always been the odd one out was weighted less when only the visual signals were presented together. In an analogous study using a different training paradigm, Atkins and colleagues (2001) showed very similar results. An open question, however, is whether this type of learning changes the reliability of the individual cues when measured in isolation, or whether it merely changes a decision rule or some prior applied when a cue is combined with other sensory signals. In addition to learning the reliabilities, the mapping between multiple sensory inputs can also be learned. For example, Ernst (2007) investigated whether humans could learn to integrate two previously unrelated signals. For this purpose, he brought two normally unrelated sensory signals, the luminance of an object and the stiffness of the same object, into correspondence during a training phase so that bright objects would be very stiff and dark objects soft (or vice versa). Before training, observers treated these signals as independent. That is, in an oddity task, performance was similar regardless of whether the odd one out was defined by a difference in brightness and stiffness congruent or incongruent to the to-be-trained axis. 
However, after training, performance for identifying differences that were incongruent to the learned correlation axis (i.e., the odd one out was chosen from the anticorrelation axis) was significantly worse than when the difference was defined along the congruent axis (luminance and stiffness in correspondence to the learned correlation). Previous studies have demonstrated that the fusion of sensory combinations leading to the same combined estimate results in perceptual metamers (Hillis et al., 2002, see “The Cost of Integration”). Therefore, such a difference in performance between before-and-after training is consistent with integration of the two signals and indicates that participants no longer have complete access to the individual cues. If the cues were still treated independently, performance for congruent and incongruent trials should still have been the same after training. In short, Ernst (2007) demonstrated that we can learn to integrate arbitrary signals from different sensory modalities if we are


exposed to correlations between these two. This learning is consistent with a change in the distribution of the coupling prior after training (from flat to nonflat). Learning the mapping between multiple senses is also a key factor for multisensory perceptual development. Recent studies have demonstrated that until the age of eight or ten, children are not optimal integrators; that is, when simultaneously presented with redundant sensory information, they base their sensory estimates on a single sensory cue (not necessarily the most precise one), virtually ignoring the others (Gori et al., 2008; Nardini et al., 2008). Moreover, it has been shown that blind people regaining sight are unable to immediately match visual and haptic shapes and that some exposure to visuotactile stimuli (between five and nineteen days) is necessary to perform such tasks (Held et al., 2011). This result mirrors a developmental study by Meltzoff and Borton (1979), demonstrating that infants between the ages of twenty-six and thirty-three days old are able to match visual and haptic shape information. In a Bayesian framework, the inability to optimally integrate multiple sensory inputs can be modeled by a flat coupling prior, representing the lack of knowledge about the mapping between the senses (Ernst, 2008). Then, during development, the perceptual system learns the natural correlation between the senses—hence the shape of the coupling prior. Note however, that there is evidence that some mechanisms aimed at matching information from the separate senses may well be present already at birth (see, e.g., Streri & Gentaz, 2003). Such findings suggest that the brain might be carefully designed to search for similarities between the senses, a necessary condition for learning of a coupling prior to occur. In short, in order to integrate the senses, learning is involved in determining the relative weights as well as the mapping and the strength of the correlation between the senses. 
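One way to picture the learning of a coupling prior from exposure is sketched below. It is only an illustration under strong assumptions: both signals are expressed on a common standardized scale, and the width of the coupling prior is taken to be the variance of the cross-modal discrepancies encountered during exposure. This learning rule, the simulated data, and all names are hypothetical and are not the procedure of any study cited here.

```python
import random
import statistics


def simulate_exposure(n, correlated, rng):
    """Generate n pairs of standardized signals (e.g., luminance, stiffness).

    If correlated is True, the second signal tracks the first with a little
    noise; otherwise the two are drawn independently.
    """
    pairs = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = a + rng.gauss(0.0, 0.1) if correlated else rng.gauss(0.0, 1.0)
        pairs.append((a, b))
    return pairs


def learned_coupling_variance(pairs):
    """Hypothetical rule: coupling variance = variance of a - b over exposure."""
    return statistics.pvariance([a - b for a, b in pairs])


if __name__ == "__main__":
    rng = random.Random(0)
    before = learned_coupling_variance(simulate_exposure(500, False, rng))
    after = learned_coupling_variance(simulate_exposure(500, True, rng))
    print("unrelated signals, coupling variance:", before)
    print("correlated signals, coupling variance:", after)
```

Under these assumptions, unrelated signals produce a large coupling variance (a broad, nearly flat prior over the range tested), while consistently correlated signals produce a small one (a narrow prior along the identity line), which is the qualitative change from flat to nonflat described above.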
But perhaps the most obvious form of learning is that we appear to integrate prior knowledge into our perceptual estimates. This prior knowledge somehow has to be embedded in the perceptual system. The question is whether we have such knowledge innately, acquire it in early development, or are continuously updating our prior knowledge of the world. In other words, how fixed are our priors? Adams and colleagues (2004) investigated this question for the light-from-above prior, that is, the expectation that light sources are normally positioned above the observer’s head. Observers were presented with ambiguous visual images of shaded objects that could either be perceived as concave or convex. When observers actively touched those ambiguous shapes, perception (convex or concave) was disambiguated through a combination of


the senses. Using such a disambiguation scheme, Adams and colleagues exposed the observers to a light direction that was rotated thirty degrees from the observers’ initial light-from-above prior. After training, Adams and colleagues found that the light-from-above prior had changed in a manner corresponding to training, thus demonstrating that priors are dynamically tuned to the statistics of the world. In a follow-up study, Adams and colleagues (2010) demonstrated that error signals, indicating that the percept from the ambiguous input alone was wrong, are important for such learning to occur. Furthermore, using an entirely different task, Körding and Wolpert (2004) showed that, depending on the experience, the learned priors can even take on relatively complex shapes. Another form of multisensory perceptual learning is sensory recalibration. Consider, for instance, watching an out-of-sync video in which the audio signal always precedes the corresponding visual events. Initially, we are likely to notice such a discrepancy since we still have access to the individual estimates from each modality with respect to time. But as you may have experienced when watching such videos, the asynchrony between the auditory and visual signals seems to decrease over time. In short, we are remapping (i.e., learning) how these sensory signals link together in the temporal domain based on the consistent information that they are out of sync at a constant delay; that is, we are recalibrating to this sensory discrepancy (Fujisaki et al., 2004). In sensory recalibration, which modality is being changed also depends on the assumed statistics of the signals and auxiliary information about which source might be at fault. For instance, Di Luca and colleagues (2009) showed, by investigating audiotactile and visuotactile synchrony perception after audiovisual adaptation, that time perception for vision changed the most. 
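The gradual fading of a constant audiovisual asynchrony can be pictured with a simple error-driven update. The sketch below is an assumption-laden illustration, not a model from the studies cited: the internal offset is nudged toward the residual asynchrony on every exposure, and the learning rate, the number of exposures, and the eventual near-complete compensation are arbitrary choices made for clarity.

```python
def simulate_recalibration(physical_delay_ms, learning_rate=0.2, n_exposures=20):
    """Return the perceived asynchrony (ms) on each successive exposure.

    The internal offset starts at zero and is updated by a fixed fraction
    of the residual (perceived) asynchrony after every exposure.
    """
    offset = 0.0
    perceived = []
    for _ in range(n_exposures):
        residual = physical_delay_ms - offset   # what is still perceived
        perceived.append(residual)
        offset += learning_rate * residual      # recalibrate toward the delay
    return perceived


if __name__ == "__main__":
    for trial, asynchrony in enumerate(simulate_recalibration(100.0), start=1):
        print(trial, round(asynchrony, 2))
```

In this sketch the perceived asynchrony shrinks on every exposure even though the physical delay stays constant. Which modality's timing absorbs the change is not represented here.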
In the temporal domain, vision is also the less precise sense of the two, and the system may therefore also have assumed it to be the least accurate. However, when the auditory signal was presented via headphones, providing additional information that the auditory signal is coming from a different location (auditory signals presented via headphones are often perceived to be coming from inside the head), the auditory signal changed the most after being exposed to an audiovisual delay (Di Luca, Machulla, & Ernst, 2009). How quickly adaptation takes place is dependent, among other factors, on the quality/reliability of the error signal. For instance, Burge and colleagues (2008) showed that for visuomotor adaptation, very reliable visual feedback results in faster adaptation than when the visual feedback is blurred, and thus less reliable. Moreover, recent findings have shown that adaptation to cross-sensory inconsistencies can sometimes be


very fast, occurring after a few milliseconds of exposure (Wozny & Shams, 2011; van den Burg, Alais, & Cass, 2013). Overall, as is evident from the examples above, we learn to interpret the incoming signals by constantly updating our knowledge about the general behavior of the world (i.e., its statistics). The Bayesian framework introduced above explicitly incorporates the use of knowledge from prior experience (priors) and thus also provides an elegant approach to model all such learning effects (see, e.g., Ernst & Di Luca, 2011; Ernst, 2012). In particular, learning which modality (not) to trust (e.g., Ernst et al., 2000) can be conceptualized as learning the weights (the reliability) of the individual sensory estimates and thus the widths of the distributions. Learning the mapping between multiple sensory cues (e.g., Ernst, 2007) can be modeled as learning the shape (i.e., slope and variance) of a coupling prior. Crossmodal recalibration (e.g., Di Luca, Machulla, & Ernst, 2009) can be represented as a shift of the coupling prior, and learning the statistics of the environment (e.g., Adams, Graf, & Ernst, 2004; Körding & Wolpert, 2004) can be modeled as a prior influencing the interpretation of sensory cues.

4 Concluding Remarks and Open Questions

Here we have discussed how we can model the combination of separate estimates from multiple sensory modalities. Overall, sensory integration seems to occur in an optimal fashion, but before integration takes place, the system has to estimate the likelihood of the different signals belonging together (correspondence problem). We have discussed how the system makes use of prior knowledge to help in the process and that learning processes are involved in every aspect of the integration process. 
Still, there are a number of open questions, such as how such computational principles are implemented in the brain; how the brain knows on an instance-by-instance basis the variance of the sensory input; how learning influences unisensory estimates; and how to reconcile apparently non-Bayesian findings (such as the size-weight illusion; see Ernst, 2009; Flanagan, Bittner, & Johansson, 2008). It will be a challenge for future research to tackle such questions, to provide further insight into the actual validity of this framework for perception science, and to explore its potential in other disciplines such as cognitive science and cognitive robotics.

Notes

1. Though note that sensory systems also have to deal with complementary information. Battaglia and colleagues (2010) examined an appealing case, in which cues


from the separate senses are complementary rather than redundant: the perception of object-size change. Using only monocular vision, an object changing in size cannot be distinguished from the same object moving in depth because both affect the retinal image size in very much the same way. Therefore, to unambiguously perceive the retinal image changes as object-size changes, additional information about the object location is needed to “explain away” the effect of a change in depth on the retinal image. Such complementary information can come from another source such as touch or binocular vision.

2. See Ernst (2012) for a treatise on deviations from the assumptions of normally distributed and independent estimates.

References

Adams, W. J., Graf, E. W., & Ernst, M. O. (2004). Experience can change the “light-from-above” prior. Nature Neuroscience, 7, 1057–1058.
Adams, W. J., Kerrigan, I. S., & Graf, E. W. (2010). Efficient visual recalibration from either visual or haptic feedback: The importance of being wrong. Journal of Neuroscience, 30(44), 14745–14749.
Alais, D., & Burr, D. (2004). The ventriloquist effect results from near optimal crossmodal integration. Current Biology, 14, 257–262.
Atkins, J., Fiser, J., & Jacobs, R. A. (2001). Experience-dependent visual cue integration based on consistencies between visual and haptic percepts. Vision Research, 41, 449–461.
Battaglia, P. W., Di Luca, M., Ernst, M. O., Schrater, P. R., Machulla, T., & Kersten, D. (2010). Within- and cross-modal distance information disambiguate visual size-change perception. PLoS Computational Biology, 6(3), e1000697. doi:10.1371/journal.pcbi.1000697.
Bresciani, J.-P., Dammeier, F., & Ernst, M. O. (2006). Vision and touch are automatically integrated for the perception of sequences of events. Journal of Vision, 6(5), 554–564.
Bresciani, J.-P., Dammeier, F., & Ernst, M. O. (2008). Tri-modal integration of visual, tactile and auditory signals for the perception of sequences of events. Brain Research Bulletin, 75, 753–760.
Bresciani, J.-P., Ernst, M. O., Drewing, K., Bouyer, G., Maury, V., & Kheddar, A. (2005). Feeling what you hear: Auditory signals can modulate tactile taps perception. Experimental Brain Research, 162(2), 172–180.
Brewster, D. (1826). On the optical illusion of the conversion of cameos into intaglios, and of intaglios into cameos, with an account of other analogous phenomena. Edinburgh Journal of Science, 4, 99–108.


Burge, J., Ernst, M. O., & Banks, M. S. (2008). The statistical determinants of adaptation rate in human reaching. Journal of Vision, 8(4), 1–19.
Di Luca, M., Machulla, T., & Ernst, M. O. (2009). Recalibration of multisensory simultaneity: Cross-modal transfer coincides with a change in perceptual latency. Journal of Vision, 9(12), 1–16.
Ernst, M. O. (2005). A Bayesian view on multimodal cue integration. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out. Oxford: Oxford University Press.
Ernst, M. O. (2007). Learning to integrate arbitrary signals from vision and touch. Journal of Vision, 7(5), 1–14.
Ernst, M. O. (2008). Multisensory integration: A late bloomer. Current Biology, 18(12), R519–R521.
Ernst, M. O. (2009). Perceptual learning: Inverting the size-weight illusion. Current Biology, 19(1), R23–R25.
Ernst, M. O. (2012). Optimal multisensory integration: Assumptions and limits. In B. E. Stein (Ed.), The new handbook of multisensory processes (pp. 1084–1124). Cambridge, MA: MIT Press.
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429–433.
Ernst, M. O., Banks, M. S., & Bülthoff, H. H. (2000). Touch can change visual slant perception. Nature Neuroscience, 3(1), 69–73.
Ernst, M. O., & Bülthoff, H. H. (2004). Merging the senses into a robust percept. Trends in Cognitive Sciences, 8(4), 162–169.
Ernst, M. O., & Di Luca, M. (2011). Multisensory perception: From integration to remapping. In J. Trommershäuser, K. Körding, & M. S. Landy (Eds.), Sensory cue integration (pp. 224–250). New York: Oxford University Press.
Fetsch, C. R., DeAngelis, G. C., & Angelaki, D. (2010). Visual-vestibular cue integration for heading perception: Applications of optimal cue integration theory. European Journal of Neuroscience, 31(10), 1721–1729.
Flanagan, J. R., Bittner, J. P., & Johansson, R. S. (2008). Experience can change distinct size-weight priors engaged in lifting objects and judging their weights. Current Biology, 18(22), 1742–1747.
Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. (2004). Recalibration of audiovisual simultaneity. Nature Neuroscience, 7(7), 773–778.
Gepshtein, S., Burge, J., Ernst, M. O., & Banks, M. S. (2005). The combination of vision and touch depends on spatial proximity. Journal of Vision, 5(11), 1013–1023.


Girshick, A. R., Landy, M. S., & Simoncelli, E. P. (2011). Cardinal rules: Visual orientation perception reflects knowledge of environmental statistics. Nature Neuroscience, 14(7), 926–932.
Gori, M., Del Viva, M., Sandin, G., & Burr, D. C. (2008). Young children do not integrate visual and haptic form information. Current Biology, 18(9), 694–698.
Helbig, H. B., & Ernst, M. O. (2007a). Knowledge about a common source can promote visual-haptic integration. Perception, 36(10), 1523–1533.
Helbig, H. B., & Ernst, M. O. (2007b). Optimal integration of shape information from vision and touch. Experimental Brain Research, 179(4), 595–606.
Held, R., Ostrovsky, Y., de Gelder, B., Gandhi, T., Ganesh, S., Mathur, U., et al. (2011). The newly sighted fail to match seen with felt. Nature Neuroscience, 14(5), 551–553.
Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining sensory information: Mandatory fusion within, but not between, senses. Science, 298(5598), 1627–1630.
Hillis, J. M., Watt, S. J., Landy, M. S., & Banks, M. S. (2004). Slant from texture and disparity cues: Optimal cue combination. Journal of Vision, 4(12), 967–992.
Jack, C. E., & Thurlow, W. R. (1973). Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Perceptual and Motor Skills, 37, 967–979.
Jackson, C. V. (1953). Visual factors in auditory localization. Quarterly Journal of Experimental Psychology, 5, 52–65.
Knill, D. C., & Saunders, J. A. (2003). Do humans optimally integrate stereo and texture information for judgments of surface slant? Vision Research, 43, 2539–2558.
Knill, D. C. (2007). Robust cue integration: A Bayesian model and evidence from cue-conflict studies with stereoscopic and figure cues to slant. Journal of Vision, 7(7), 1–24.
Körding, K. P., Beierholm, U., Ma, W. J., Quartz, S., Tenenbaum, J. B., & Shams, L. (2007). Causal inference in multisensory perception. PLoS ONE, 2(9), e943.
Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427, 244–247.
Mamassian, P., & Goutcher, R. (2001). Prior knowledge on the illumination position. Cognition, 81(1), B1–B9.
Meltzoff, A. N., & Borton, R. W. (1979). Intermodal matching by human neonates. Nature, 282, 403–404.
Nardini, M., Jones, P., Bedford, R., & Braddick, O. (2008). Development of cue integration in human navigation. Current Biology, 18(9), 689–693.


Parise, C. V., Harrar, V., Ernst, M. O., & Spence, C. (2013). Cross-correlation between auditory and visual signals promotes multisensory integration. Multisensory Research, 26, 307–316.
Parise, C. V., Knorre, K., & Ernst, M. O. (2014). Natural auditory scene statistics shapes human spatial hearing. PNAS, 111(16), 6104–6108.
Parise, C. V., & Spence, C. (2009). When birds of a feather flock together: Synesthetic correspondences modulate audiovisual integration in nonsynesthetes. PLoS ONE, 4(5), e5664.
Parise, C. V., Spence, C., & Ernst, M. O. (2012). When correlation implies causation in multisensory integration. Current Biology, 22(1), 46–49.
Radeau, M., & Bertelson, P. (1987). Auditory-visual interaction and the timing of inputs: Thomas (1941) revisited. Psychological Research, 49, 17–22.
Roach, N. W., Heron, J., & McGraw, P. V. (2006). Resolving multisensory conflict: A strategy for balancing the costs and benefits of audio-visual integration. Proceedings of the Royal Society of London, Series B: Biological Sciences, 273, 2159–2168.
Rock, I., & Victor, J. (1964). Vision and touch: An experimentally created conflict between the two senses. Science, 143, 594–596.
Rohde, M., Di Luca, M., & Ernst, M. O. (2011). The rubber hand illusion: Feeling of ownership and proprioceptive drift do not go hand in hand. PLoS ONE, 6(6), e21659. Epub June 28, 2011.
Shams, L., Kamitani, Y., & Shimojo, S. (2000). What you see is what you hear. Nature, 408, 788.
Shams, L., Kamitani, Y., & Shimojo, S. (2002). Visual illusion induced by sound. Cognitive Brain Research, 14, 147–152.
Shams, L., Ma, W. J., & Beierholm, U. (2005). Sound-induced flash illusion as an optimal percept. Neuroreport, 16, 1923–1927.
Stocker, A. A., & Simoncelli, E. P. (2006). Noise characteristics and prior expectations in human visual speed perception. Nature Neuroscience, 9(4), 578–585.
Streri, A., & Gentaz, E. (2003). Cross-modal recognition of shape from hand to eyes in human newborns. Somatosensory and Motor Research, 20(1), 13–18.
van Beers, R. J., Sittig, A. C., & Denier van der Gon, J. J. (1999). Integration of proprioceptive and visual position information: An experimentally supported model. Journal of Neurophysiology, 81, 1355–1364.
van den Burg, E., Alais, D., & Cass, J. (2013). Rapid recalibration to audiovisual asynchrony. Journal of Neuroscience, 33(37), 14633–14637.


van Wassenhove, V., Grant, K., & Poeppel, D. (2007). Temporal window of integration in auditory-visual speech perception. Neuropsychologia, 45, 598–607.
Warren, D. H., & Cleaves, W. T. (1971). Visual-proprioceptive interaction under large amounts of conflicts. Journal of Experimental Psychology, 90, 206–214.
Weiss, Y., Simoncelli, E. P., & Adelson, E. H. (2002). Motion illusions as optimal percepts. Nature Neuroscience, 5(6), 598–604.
Witkin, H. A., Wapner, S., & Leventhal, T. J. (1952). Sound localization with conflicting visual and auditory cues. Journal of Experimental Psychology, 43, 58–67.
Wozny, D. R., & Shams, L. (2011). Recalibration of auditory space following milliseconds of cross-modal discrepancy. Journal of Neuroscience, 31(12), 4607–4612.

II

Primarily on the Unity of Consciousness

11 A Unity Pluralist Account of the Unity of Experience

David J. Bennett and Christopher S. Hill

1 Causal Holism, Experiential Holism, and the Unity of Experience Question

We don’t know how to carve or classify experiences in any theoretically satisfying way. We can’t tell you how many experiences you are having right now, or what their fundamental kinds are. We do suspect that such assessment is not an entirely arbitrary matter. Or at least that it ideally shouldn’t be. Perhaps a mature perception science will provide guidance. Pending the arrival of that happy day, the following observations still suffice to get an interesting exploration started. An element of informed common sense is that some experiences of a subject at a certain time occur independently of other experiences undergone by that subject at that same time. The sound of a bird’s chirp outside my window seems detachable from the sight of a splotch on the wall before me (compare Siegel, 2010, 19). On a first pass, this means that that very sort of “splotch experience” might have occurred in other experience settings, including non-bird-chirp settings.1 Appreciating the following distinction supports the plausibility of this general picture. It is a psychological commonplace that experiences triggered by a certain physical stimulus can differ depending on the causal context of the physical stimulus, where such causal contexts consist of other physical stimuli that do or might impinge on sensory surfaces nearby in space or time. So, the sound heard in striking a certain piano key can vary depending upon the keys struck nearby in time in playing a melody. There are many familiar, similar, visual-perception examples as well. Call this causal holism. Causal holism can come in varieties and degrees. Its exact characterization requires close attention to how physical stimuli are marked out, a matter best determined by empirically informed


Figure 11.1 Kanizsa triangle.

perception-science theorizing (see Bennett, 2014). Until this task is discharged, and specific perceptual domains carefully scouted, some caution is in order. But causal holism may well be a common, even pervasive, phenomenon. Such causal holism, however, is to be distinguished from experiential holism. There are kinds and grades here, too (cf. endnotes 3 and 4). The basic idea of experiential holism is that experiences of certain determinate sorts can only occur in the setting of certain other “surrounding” sorts of experiences (typically of similar specificity). Call experiences so bound together members of experiential wholes.2, 3 A main aim of Dainton (2000, ch. 8) is, in effect, to argue that what we have called “experiential holism” is not widespread. But Dainton does take seriously the suggestion that (in our terms) a local4 experiential holism is, for example, present in perceiving a Kanizsa triangle. The case is suggestive and illustrative of experiential holist claims. Approaching the example as a (tentative, restricted) experiential holist, Dainton appears to argue, in part, that experiences of the “filled-in” “illusory contours” likely can only occur in the setting of “Pac-Man experiences”


(2000, 206).5 If that is the claim (or part of it), the proposal seems doubtful. The exact characterization and explanation of these effects is a complex matter (see endnote 5). But such modally filled-in figures are associated with differences in experienced lightness (cf. Frisby & Clatworthy, 1975). It is not clear why the lightness-marked “illusory contours” that partially limn the figure could not occur in non-Pac-Man settings. We need not sort through cases. There is strong theoretical reason to expect that experiential holism is at least not pervasive. A constraint on any reasonable view of perceptual experience is that experiences signal environmental properties—which can come in untold, ever-changing combinations. To serve the organism’s purpose, perceptual systems must be able to flexibly mix and match the environmental properties conveyed or represented in experience. Experiences of, say, an edge or a color or a certain shape or structure can only serve their key signaling purpose if just such signaling through experience is possible in other settings—consisting of other exemplified properties, many themselves also conveyed through experience.6 It may be that the/a reason why experiential holism has sometimes been felt to be widespread is that it has been confused with causal holism—which there is reason to think is widespread. Experiences that are not bound together into experiential wholes can be considered separable.7 The experiences of a subject at a time are therefore not an indissolubly linked experiential whole. Are there interesting ways or respects in which separable experiences of a subject at a time—some or all—might still be joined, or unified? In appealing here to the notion of unity, is a decently, sharply characterized target phenomenon conveyed? That is less than clear. 
There is surprisingly little help to be found in the unity of consciousness/experience literature about how to characterize the core target phenomenon/phenomena that is/are to be explained or illuminated.8 Webster’s Dictionary tells us that to unify is to “make into a unit or a coherent whole.” Dictionary.com lists as synonyms, “combine, merge, fuse, coalesce.” We’ll start with these characterizations as rough guides, asking what unifies (separable) experiences of a subject at a time—if or to the degree to which experiences of a subject at a time are unified.

2 Unity Pluralism: An Overview

Tim Bayne’s recent book (2010) has figured prominently in the revival of interest in the topic of unity of consciousness (see also Bayne & Chalmers,


2003). This and preceding work by Bayne is in this respect a stimulating instigator for us. His views are, however, also a foil for us in our update of a multiple relations view of the unity of consciousness/experience (Hill, 1991)—or, as we now prefer to say, a unity pluralist view.9 We differ with Bayne at every significant step. We thus begin by providing an orienting overview of our unity pluralism by contrasting core elements of our approach with core elements of the Bayne approach. On the Bayne view, experiences are unified in being subsumed within an overarching experience. Subsumption is held to serve as a single, universal, unity-making relation. It holds (it is said) among all the experiences of a subject at a time as a matter of necessity (in some sense of “necessary”). Subsumption is said to make a phenomenological difference; there is (it is said) “something it is like”—a special “conjoint phenomenology”—for experiences to occur thus joined or unified. Finally, Bayne argues that subsumption constitutes a species of part-whole relation, with some experiences parts of a subsuming experience. On our contrasting, unity-pluralist account, there is not one unity-making relation but many. Some unity-making relations join fairly wide swaths of the experiences of a subject at a time; some join more narrow swaths of experience. No single, unity-making relation joins all of the experiences of a subject at a time. That is, for the unity pluralist, there is no unity-making relation that is universal or total (Hill, forthcoming). These unity-pluralist, unity-making relations enter at different mental levels, more sensory to more cognitive. Included under the first head are, for example, spatial-unity relations, deriving from the spatial contents of experiences (see sec. 3.1). Falling under the second head are, for example, cognitive relations of shared accessibility (see sec. 4). Our unity-making relations are not necessarily present or represented in experience. 
And for those unity-making relations not present or represented in experience, only in a loose sense can it be said that “there is something it is like” for each such unity-making relation—per se—to link experiences undergone.10 Finally, experiences joined by one or more of our unity-making relations do not thereby bear a part-whole relation to some larger experiential whole, though some experiences may still, by some measure(s), bear part-whole relations to each other.11 It may be claimed at this point that we are simply not addressing the Bayne unity phenomena or the Bayne explanatory project (a possibility raised in Schechter, 2013a). After all, Bayne (2010) acknowledges some of our very unity-making relations (see also Bayne & Chalmers, 2003)—but


this in the course of isolating his target phenomenon of phenomenal unity as (it is said) something altogether different, pertaining (it is claimed) to the experiences of a subject at a time exhibiting a conjoint phenomenology. By way of conveying the core target and project, we have:

The plausibility of the unity thesis derives largely from introspection. Consider the structure of your overall conscious state. I suspect that you will be inclined to the view that all your current experiences are phenomenally unified with each other—that they occur as the components of a single phenomenal field; to put the same point in different terminology, that you enjoy a single phenomenal state that subsumes them all. (Bayne, 2010, 75)12, 13

This appears to be an attempt to fix a target phenomenon by a kind of ostensive pointing to an aspect of experience revealed by introspection. But when we reflect14 carefully upon our own experiences, we can’t find the phenomenon. Or at least: all we find are experiences linked by familiar unity-making relations of the sort we have alluded to (several discussed in more detail in sections below). Yet here we are instructed to discern the presence or trace of some further, distinct, universal unity relation. This we cannot find. Perhaps we can be brought around by argument and instruction to accepting the Bayne picture, but we can hardly start thus initially guided.

The next sections (3 and 4) provide some sample unity-pluralist unity-making relations (see Hill, forthcoming, for other candidates).15 We start (sec. 3) with sensory unity-making relations. Here we can be especially brief, pointing to work by several other contributors to this volume who explore these aspects of experience in rich detail. Section 4 describes an important kind of more cognitive, unity-making relation.

3 Sensory Unity-Making Relations

3.1 Object and Event, Binding Unity

Properties seem to always/typically be present in experiential awareness bound to objects.16 Consider visually sensed color and shape, qualifying an object. Or consider tactile-sensed texture or roughness linked in experience to seen object-surface shape; this second crossmodal case plausibly involves a similar sort of property binding (for discussion, see O’Callaghan, this volume; DeRoy, this volume; Bayne, this volume). When different properties are thus fixed in experience to the same object or surface, experiences of those properties may be said to stand in a kind of unity relation—achieved in attributing different properties, in experience, to the same object (Bayne


& Chalmers, 2003; Bayne, 2010). De Vignemont (this volume) calls this “additive binding.” (An aside: one version of the corresponding explanatory binding problem for properties asks, “How can subjects detect multiple properties via disparate neural-computational processes, in a way that leads to experiences of those different properties as bound to the same objects?” A solution to the explanatory binding problem would tell us how the foregoing kind of experiential object binding-unity is achieved [again, compare Bayne & Chalmers, 2003].)

Observe that a somewhat analogous binding connects experiences of events or processes such as the bird’s seen moving beak and the heard bird chirp. And perhaps, as well, experiences of wide swaths of causally interacting events (compare O’Callaghan, 2012; see also Scholl & Gao, forthcoming). Such sensed causal ties are, plausibly, present or represented in experience (Siegel, 2010; Rolfs, Dambacher, & Cavanagh, 2013; Scholl & Gao, forthcoming). Experiences are thus unified (the idea is) because, for example, the bird’s moving beak and the heard chirp are experienced to be connected—the latter sensed as emanating from the former.

3.2 Varieties of Spatial Unity

Much ongoing perception-science research studies how information from different sources is combined in estimating spatial properties such as the slant of a surface (Trommershäuser, Kording, & Landy, 2011; Bennett, Trommershäuser, & van Dam, this volume; van Dam, Parise, & Ernst, this volume). The information drawn on might be entirely visual (visually sensed texture information, and stereo information, say). Or there may, for example, be haptic information involved (Banks & Ernst, 2002). This sort of cue combination is explored as integrative binding in de Vignemont (this volume). Our present concern is not with the causal-computational explanatory details, but with the perceptual targets and the experiential upshots.
This research fits easily with the view that there are spatial common sensibles17— spatial properties accessed across multiple sensory systems (especially sight, hearing, and touch). Such spatial properties are the targets of much perceptual processing and are often represented in experience. This suggests that a kind of unity of experience is achieved in representing objects and events to be present in a shared spatial setting. A full treatment would detail different spatial frames of reference that objects or events are represented in, and how locations in such different frames of reference are or can be related. For example, I might presently be experiencing


a cup to my right on a desk and a pen to my left. These experiences might place or represent these objects relative to me in some egocentric frame of reference. Or, compatibly, these experiences might represent the objects to be located within a spatial frame of reference anchored to the desk or desktop. It is not a trivial matter to determine which locations are recovered in experience and how they are related or might be held to be related (perhaps especially in the service of guiding action or decisions about what to do). But for our purposes, we can simply observe that when the spatial locations coded in experiences of objects thus bear spatial relations to each other, a certain spatial unity of experiences is achieved (compare Bayne & Chalmers, 2003, sec. 2).

What of inner sensory events like felt twinges, itches, pains, or occurrent, symbolically clothed, thoughts? The issues here are subtle and complex. But it is worth noting that such inner experiences often seem spatially present or tagged, typically as on or within the bounds of bodily surfaces. Such location-tagging may mark out such occurrences as potential targets of perception-like monitoring (Hill, 2009). Perhaps in these sensed locations there are resources available to understand how such inner experiences are spatially related or relatable to the spatial codings of outer perception—with this marking a kind of spatial unity between inner and outer (for rich, relevant discussion, see de Vignemont, forthcoming).

4 Joining of Experiences via Accessibility Relations

When experiences are said by philosophers of mind to be accessible to a subject or to a cognitive system at a time, the capacities potentially in play are a complex lot (Block, 1995; Hill, 2009). Here we restrict ourselves to several brief observations. There is a range of different possible accessibility-based, unity-making principles or relations.
A key version: in order for experience-contents P and Q to be joined or unified (and so the corresponding experiences unified), these contents must be jointly accessible, as the conjunction P&Q. Other access-unity principles are worthy of exploration (see Bayne & Chalmers, 2003, and Van Cleve, 1999, 79–80, for illuminating discussion). But essentially this principle has been linked repeatedly to the unity of consciousness (Nagel, 1971; Marks, 1980; Van Cleve, 1999). We will focus on this access-unity relation. (Aside: though we will not pursue the topic, there will clearly be corresponding introspective unity principles—differing, no doubt, depending upon how introspection is understood [see note 14].)


It is true that not all experiences of a subject at a time can be joined by the foregoing access-unity relation (or any close variant). Thus suppose, for two experiences, that each experience content, P and Q, nearly exhausts the capacity of short-term memory. Then it will not be possible to cognitively access the conjunctive content, P&Q (see Bayne & Chalmers, 2003; Dainton, 2000). But by our lights, this in no way diminishes the role of this principle as a unity-making principle. Indeed, as we’ve noted, for the unity pluralist, no significant unity-making relation by itself unites all of the experiences of a subject at a time. Of course, various experiences of a subject at a time will typically be linked by one or more unity-pluralist unity-making relations—thickly, tightly, not much at all, and what have you. That is all there is to the unity of experience, or so we suggest.

5 Subject Unity

Surely it will be suggested that there is a unity-relation that is both total and substantial. What joins the experiences of a subject at a time into a unified whole (it will be said) is simply that they are the experiences of the same subject. However, Bayne (2010) and Bayne and Chalmers (2003) doubt that a substantive subject-unity thesis of this sort can be formulated (see Hill, forthcoming).18 We, too, in the end doubt that subject unity is an interesting, illuminating kind of experience-unity but on rather different grounds. Bayne and Chalmers (2003) begin by endorsing the following definition of subject unity:

Definition: A set of experiences is subject unified just in case all of the members of the set belong to the same subject.

They then try to combine this definition with the following unity thesis:

(UT1) For any subject S and any time T, all of the experiences that are possessed by S at T are subject unified.

But of course, when it is combined with the preceding definition, (UT1) becomes a tautology.
Subject unity is thus set aside by Bayne and Chalmers as uninteresting. This seems the wrong way around, however, in approach. Our question is whether various relations between experiences are or might be assessed as contributing to some interesting unification of experience. So, instead of


beginning with the foregoing definition, it is more appropriate to start with the relation being possessed by the same subject at the same time. The question is, then, whether this is a relation that can confer unity of an interesting sort on the experiences it connects. We should instead thus consider the following thesis:

(UT2) The relation being possessed by the same subject at the same time is a unity relation, so any two experiences possessed by a subject at a time are unified.

This claim is not obviously true or false. Part of the uncertainty in assessing (UT2) derives from uncertainty in interpreting key notions employed. So, if subjects are understood as human animals or human biological organisms (or the like), then the thesis is at best only contingently true (Hill, forthcoming). (UT2) thus comes out as false if held to obtain in any sort of modally strong sense (as in the unity theses advanced in Bayne & Chalmers, 2003; Bayne [2010] considers unity theses of varying modal strength). In the human animal understanding of subject, (UT2) comes out as actually false of split-brain patients if splits are interpreted as operating with “two minds”—a reading popular at early stages of the study of these patients by the pioneers of such studies (see Sperry, 1968, cited in Corballis, 1994).19 However:

[Suppose we understand] a subject to be a psychological entity with a highly integrated functional organization, a functional organization that moreover involves only non-duplicated faculties—a single introspective faculty, a single visual working memory, and so on. (Hill, forthcoming)20

On this conception of what “subjects” are, assessment of (UT2) will turn centrally on how we understand the claim of unity. Is bare coinstantiation by such subjects at a time enough to ensure any interesting, substantial unity of experience? Consider cases of visual form agnosia (Farah, 2004, ch. 2; Milner & Goodale, 2006; Goodale & Milner, 2003; compare the Molyneux subjects described in Sinha, this volume, and in Ostrovsky et al., 2009). The experiences of visual form agnosics are deeply fractured.21 Certain simple, visible features or properties are present, such as textures and especially colors. Perhaps as well—from the research reports, the cases may differ—some shape or contour elements or fragments. Whatever shape fragments are sensed, however, they are not bound together to form comprehensive shape representations that might be used to segment scenes into objects (Farah, 2004;


compare Ostrovsky et al., 2009; see also De Vignemont, this volume). Properties are also apparently not sensed bound to objects. Perhaps as Bayne (2010, 60–61) suggests, there is some sense in which the experiences of visual form agnosics are still unified. If so, we doubt this derives from the bare “co-bearing” of experiences or experience properties by the same subject. In this we agree with Bayne. But we also doubt Bayne’s (2010, 61) suggestion that “we have no reason to deny that [the experiences of such visual form agnosics] are phenomenally unified with each other.” At least where this is understood, on Bayne scruples, as holding that any felt residual unity derives from supposing that the form agnosic’s inchoate, fractured, sensory presentations display a special subsumption-shaped “conjoint phenomenal character.”22 Instead of directly backing these doubts about the foregoing interpretations, we contrast a better unity-pluralist story, which is immediately illuminating (we say) and built from familiar resources. Surely, the only respects in which such visual-form agnosics still enjoy any residual unity of experience come from the joining of their fragmentary sensings by one or more of our familiar unity-making relations. So, for example, such subjects are not entirely blind, and if probed, they can convey aspects of their (fractured) experiences.23 There is thus some residual access unity at work. Further, we suggest that some form or forms of spatial unity can also be reasonably inferred from the reports of some of these subjects and/or from experimental results—say, pertaining to coarsely (perhaps) sensed spatial directions.24 Just as important: the unity pluralist has a ready account of the manifest way in which the experiences of visual-form agnosics are very substantially and centrally disunified. Surely this stems from the absence, or severe degradation, of key unity-making relations. 
This includes the absence or severe degradation of object unity, the apparent degradation or impoverishment of spatial unity, and the apparent impoverishment of access unity.25 To be fair, we have noted that Bayne (2010) acknowledges unity-making relations other than subsumption. He can also, therefore, allow that there are respects in which the experiences of visual-form agnosics are disunified. Our unity-pluralist reading of these cases, however, still differs in what seem to be telling ways. So, the unity pluralist can readily and plausibly emphasize the deep and central disunity of form-agnosic experience. The unity pluralist can also readily and plausibly account for all of any apparent residual unity by appealing to unity-making relations familiar to everyday thought and/or familiar from empirical mind-science theorizing.


6 What Unifies the Unity-Pluralist Unity-Making Relations?

Bayne (2010) and Bayne and Chalmers (2003) themselves use the term “unity” in characterizing some of our unity-pluralist unity-making relations. There is something fitting in this appellation. But precisely what? Are the unity-pluralist unity-making relations linked by any deep theoretical principle? We will close by briefly considering three possibilities.

6.1 Unity-Making Relations Are Required for the Existence of Psychological Subjects

Perhaps some measure or degree of joining of experiences by our unity-making relations is a requirement for the core integration and existence of psychological subjects (in the Hill sense described above in sec. 5). Establishing such a link would connect at least key unity-pluralist unity-principles in a theoretically satisfying way. This would also vindicate the sometimes-voiced view or conjecture that there is a central link between the unification of experiences and the presence or possibility of integrated, perceiving subjects (see endnote 8). Consider, then, the principle:

Psychological Unity Thesis (PU): The holding of unity-pluralist unity-making relations—singly or (more likely) in combination—is required in order for there to be an integrated psychological subject.

In support, it might, for example, be observed that representing the spatial environment in an integrated manner allows for the reasoned assessment of action options—in this way, at least furthering smooth, adaptive functioning of psychological selves. Further, perceptual binding is a primitive, fundamental achievement that allows for information uptake in perceptual judgment. No doubt much more could be added in this vein. We will see below (6.3) that there is likely something illuminating in these observations.
But simple considerations show directly that (PU) faces serious challenges.26 The problem is that the bonds just don’t seem tight enough to support anything like (PU) between sensory-experiential order/disorder and the basic integrity/disintegration of psychological subjects. Recall the visual-form agnosia condition discussed above in section 5, as exhibited by the Milner and Goodale subject DF. Or recall Sinha’s Molyneux subjects. In these cases, there is a severe breakdown of sensory-perceptual integration.


Yet there seems no notable undermining of the basic integrity of these beings as psychological subjects. We treat these people as intact subjects, in what seems a pertinent sense of “subject.” The defender of (PU) has, at best, work to do.

6.2 No Theoretically Interesting Unification of the Unity-Pluralist Unity-Making Principles

Philosophers know better, post-Wittgenstein, than to infer the presence of links of interesting principle—of any kind—simply from the ready application of the same term in varying settings. We do take seriously the possibility that the unity-pluralist unity-making principles share no deep theoretically interesting connections. Suppose there is indeed no theoretically interesting unification to be found of our unity-pluralist unity-making principles. Suppose further that we have been correct in arguing that the only unity (or coherence, or what have you) to be found in experience is fixed by the obtaining of our unity-pluralist unity-making relations. Then there would be no interesting theoretical core about mentality or experience to be discovered through the study of the source(s) of whatever unity of experience is present across the range of experiences undergone by a subject at a time.

Of course, there would still be much of theoretical interest and importance about experience to be learned from empirical studies of the capacities that underlie the obtaining of particular unity-pluralist unity-making relations. This lesson is amply illustrated in the many contributions to this volume that focus on just such vigorous and ongoing empirical research projects. This includes reflection on empirical projects concerning various forms of sensory integration.

6.3 Unity-Making Relations as Promoting Smooth, Adaptive Functioning of Psychological Subjects

Note that visual-form agnosics and Sinha’s Molyneux subjects have great trouble responding and acting on the basis of their experiences.
Note also that the considerations that cast serious doubt on (PU) above (sec. 6.1) do not threaten the view that the holding of key unity-making relations promotes the smooth, adaptive functioning of psychological subjects. This suggests:

Psychological Functioning Thesis (PF): The holding of the unity-pluralist unity-making relations, singly or in combination, promotes the smooth, adaptive functioning of psychological subjects.


Appeal to (PF) seems a good deal more promising than working with (PU) as a way of establishing a theoretically interesting commonality between our unity-making principles. The observations above (sec. 6.1) about spatial unity and binding unity provide support. Other, similar, considerations could be marshaled. Equally important, the absence of key unity-pluralist relations between experiences is associated with disruption of smooth, adaptive functioning—as with DF and Sinha’s Molyneux subjects.

We don’t see decisive objections to this approach. However, until (PF) is spelled out and explored more fully, there are some grounds for caution. To begin, it is not immediately clear how (PF) applies to our access-unity principle. However, at least the diminishment of the capacity to co-access experiences presumably does or would diminish smooth functioning of psychological subjects. So there may be a fit.

One might worry that appeal to “smooth, adaptive functioning” is too general, even hazy, to be of much help. The worry here might be framed as a version of the preceding worry (sec. 6.2). No doubt there is much of interest in studying the separate functions served by the prominent unity-pluralist principles—including in an empirical-scientific setting. But can we really expect much of deep, unifying, theoretical interest to be found across the ways in which the presence of such “experience-joinings” promotes smooth, adaptive functioning? A full exploration here would require a careful study of the ways in which the joining of experiences by unity-pluralist unity-making principles serves psychological-subject functions. This task would include engaging the science of high-level cognizing and acting.

Acknowledgments

Thanks to Jeremy Goodman, Grace Helton, Geoff Lee, Lizzie Schechter, and Jim Van Cleve for helpful discussion.

Notes

1. This is a claim about “experience types,” of whatever determinate form experiences are held to take.
(A note to readers: the longer footnotes to follow are included for scholarly and intellectual i-dotting. They can be skipped on a first read, as the text body is meant to stand on its own.)

2. This claim pertains to experience types (see also the preceding endnote). So certain clusters of experience-sorts only come in indissoluble packages (the idea is). Of course, a subject at a time, undergoing an associated package-deal cluster of


experiences will be in some particular experiential state(s) (details depending on one’s exact view of the nature of experiences). See Lee (this volume) for discussion of a somewhat different kind of “holism” of experience.

3. Dainton’s (2000, 199) distinction between “weak impingement” and “strong impingement” is suggestive of our distinction between causal holism and experiential holism. Bayne, perhaps more distinctly, briefly in passing invokes the root distinction between (in our terms) causal holism and experiential holism (Bayne, 2001, 91). A full treatment of these topics would also explore connections between our experiential holism(s) and various, possibly related theses about within-experience dependencies. So, it has been claimed that any very determinate experience of a color qualifying a shaped surface requires experiencing the relevant sort of shaped surface. Various subtle and intricate dependencies between experiences of hue, saturation, and lightness may also constitute species of experiential holism(s) (see Wyszecki & Stiles, 1982, 420–424, on the Bezold–Brücke and the Abney effects; see also Garner, 1974, on “integral dimensions”). Compare, as well, O’Callaghan’s (this volume) incisive discussion of the O’Dea (2006) claim of evident “infusion” of properties in intramodal binding of properties; Bayne (this volume) is a sustained discussion of similar issues. We won’t pause to elaborate and adjudicate. That is a matter best left for a full, separate study. We are not claiming that there are no cases of (local) experiential holism.

4. With local experiential holisms, the idea is that certain experiences are bound together with other experiences (edge experiences with Pac-Man experiences, what have you)—but these tightly locked experience-clusters are themselves separable from other experiences undergone by a subject at the same time. Dainton (2000, 185) introduces something like this idea in his notion of “partial” holisms.
The exact connection to our observations, however, is not simple to work out given Dainton’s rich web of distinctions and theses, and given his attempt to take a purely “phenomenological” approach.

5. Here is the relevant, fuller passage:

The triangle is “illusory” because its apparent boundaries (between the “pies”) are not perceived in response to anything on the page, which is plain white. Yet the triangle does seem to have boundaries which cross these gaps. These phenomenal boundaries consist, in effect, in a quite distinctive kind of distortion of phenomenal space. These distortions are also parts of the boundary of a triangle which seems to be superimposed on three black circles. It is hard to see how this exact combination of effects could be produced in a different way.

Our question or worry in the text concerns the claim “quite distinctive kind.” That said, the relevant science is complex and should be consulted. On the human psychophysics, see Zhou et al. (2008), including their concluding survey of the extensive recent perception-science debates. A classic work on neural response to illusory


contours is von der Heydt, Peterhans, & Baumgartner (1984). For worries directed at Dainton’s last-sentence summary claim, see Bayne (2001, 91). 6. On some accounts of experience, this may need qualification. So, it is in some sense conceivable that perceivers come to associate two (say) different sensations with the same worldly property—learning to use both such sensory signs of the same worldly property. We find much to fault in talk of sensations and sensory signs. But even if we grant appeal to such notions for argument’s sake, the noted scenario would surely be a baroque prospect—at least regarding within-modality sensory signaling. We also think that acknowledging this (unlikely) prospect in our formulations would only complicate the statement of the signaling principle or constraint described, while leaving the basic point intact. 7. In light of the possibility of local experiential wholes (see endnote 4), we should, strictly, locate experiences as separable relative to certain other experiences of a subject at a time—but allow that these experiences may not be separable from certain additional experiences of that subject at a time. We suppress this complication in order to avoid needlessly—for our purposes—tangled formulations. 8. At least in central ways that various engaged parties can all accept. So, Bayne (2010) suggests, in effect, that the target phenomenon be fixed by a kind of ostension through introspection. But see sec. 2 and Hill (forthcoming)—we don’t find what Bayne (2010) reports. Or: it is sometimes suggested as a constraint that unified experiences constitute a stream of consciousness—and that, further, there is a one-to-one pairing of streams of consciousness and subjects of experience (Lee, forthcoming, mentions, cautiously, such constraint[s]). But now we are left with the question of what a stream of consciousness is—a phrase with no clear everyday meaning, sometimes offered with little or no explanation. 9. 
To use the framing introduced by Geoff Lee (this volume) in characterizing our approach. In the present note, we also focus on the unity of experience, instead of the unity of consciousness, which may be a broader notion (though see the quite encompassing use of “experience” explored in Hill, 2009). 10. Or that the holding of the relevant relations makes a phenomenal difference, or what have you. A full exploration of if/whether/when our unity-making relations are present or represented in experience would require principled exploration of how to decide such issues. This would include scrutiny of uses of the familiar-in-philosophy “what it is like” locution (Hill, 2009, 21), along with carefully distinguishing what is present/represented in experience from what can be readily gleaned from reflecting on our experiences. But, we submit that our favored, cognitive accessibility, unity-making relation (see sec. 4) does not have a direct phenomenological presence in experience; of course, a minimum of reflection will reveal that accessed experiences are joined in being accessible. We think that the same may also hold of our spatial unity and our binding unity relations. See, though, sec. 3.1 concerning the experience of


causal relations, for a case where a unity-making relation between certain experience contents seems itself to be represented in experience. 11. It would seem that at least not all experiences joined by the unity-pluralist unity-making relations serve as parts of some overarching experience (see Hill, forthcoming). Consider the conscious entertaining of a thought, feeling an itch, and undergoing a color experience. These experiences might be joined by one or more unity-pluralist unity-making relations. But it seriously strains usage to hold that these are parts of some more complex experience. (This whether the talk [as here] is about a limited range of experiences or about all the experiences of a subject at a time. In general, Bayne’s claims about the latter sort of universal or total unity relation are our main target. But we think at least versions of our worries apply as well to claims about more limited ranges of experiences—claims, say, that any such limited range of experiences exhibits a special conjoint phenomenology that reflects some mereological relation between experiences.) The point is likely not decisive. A rejoinder: “You have yourselves noted the imprecision in everyday thought and talk about just how to ‘carve’ and ‘count’ experiences. Here we simply hold that the upshot is itself to be understood as an ‘experience,’ with the ‘parts’ noted. This is a useful if somewhat stipulative extension of our everyday talk—cleaning up and pinning down in the service of philosophical systematization.” Still, the unity pluralist might claim advantages here. The unity pluralist avoids flouting an aspect of everyday thought and talk about our experiences. And the unity pluralist avoids the gnarly challenge of clarifying, precisely, claims of part-whole relations holding between experiences—in a way that illuminates any significant unification of experiences of a subject at a time. 
Hill (forthcoming) develops worries about the latter project; Lee (this volume) takes up the challenge. 12. However, compare Bayne (2010, 31–32), where Bayne appears to deny that the experiential part-whole relation has a proprietary phenomenology. This is in seeming conflict with the just-quoted passage—and in apparent conflict with any attempt to ostensively isolate a target universal unity relation. Bayne does say (31), apparently in reference to the pertinent relation, that its presence “makes a phenomenal difference” but adds “not in the sense that it has its own phenomenal character that makes an additional [italics in the original] contribution to what it is like to be the subject in question.” There is subsequent reference to “unity” as a “manner of experiencing” (32). Bayne also denies that the experienced part-whole relation has a phenomenological dimension in sec. 2 of his chapter in the present volume. As with the corresponding passage in Bayne (2010), we have had trouble reconciling that passage with the claims made in the quoted material above. We describe below all that we find in reflecting on our own experiences. 13. It is interesting to note that the appeal to introspection is little evident in Bayne and Chalmers (2003). In this earlier study, there are hints that claims about principles of unity are instead to be measured against something like our concepts of experience, consciousness, selves, and the like (cf. Bayne & Chalmers, 2003, sec. 5; see also Bayne & Chalmers, 2003, sec. 7, on the concept of a phenomenal state; compare the observations on Bayne & Chalmers, 2003, in Lee, forthcoming). 14. The term “introspective” might be appropriate here as well if it is allowed that we gain access to our mental-experiential states through multiple means (Hill, 2009, ch. 8; Samoilova, 2013). 15. There is some overlap. The kinds of unity achieved via sensory unity-making relations described in the next two sections are species of what Hill and others characterize as representational unity. There are some differences; for example, Hill (forthcoming) emphasizes relations between experiences in “phenomenal space.” 16. This has been disputed (Zeki, 2003), but we will assume the core truth of the claim. 17. There need be no conflict here with the Sinha group evidence suggesting a no on Molyneux (Held et al., 2011). To begin, the issues here are subtle, and there are open empirical questions (Sinha, this volume; Van Cleve, this volume; see also de Vignemont, this volume). Most important, the basic, cue-combination observation is just that spatial properties are accessible via different routes. There is no commitment in this regarding how this capacity is gained, or exactly what it consists in. 18. Initial framing in this section—of the trivial and of the nonobvious subject-unity principles—derives closely from Hill (forthcoming). Stretches of the Hill (forthcoming) language are also taken over in framing this opening contrast. The lessons then drawn are somewhat different from those drawn in Hill (forthcoming) but not incompatible. 19. For empirically up-to-date and philosophically sophisticated discussion of split-brain patients, see Schechter (this volume) and Schechter (2013b,c). 
Schechter’s views are probably the closest to ours in the unity of consciousness/experience philosophical literature. But there are differences. So, we are obviously skeptical that there is interesting theoretical insight about experience to be found in considering the Schechter (2013c) “conscious singularity” (encompassing the Bayne “phenomenal unity”). While related, the Schechter (2013c) notion of coherence unity also differs from the unity of experience fixed by our unity-pluralist unity-making relations. So: where we focus on experience, Schechter focuses on the/a unity/coherence of consciousness—for her, a broader notion. For purposes of her project, Schechter also differs in, in effect, building the presence/unity/integration of a psychological subject into the achievement of coherence unity (see the summary statement, Schechter, 2013c, 200; though see also our closing sec. 6.3). 20. This may be stronger than needed. Perhaps, say, visual working memory can be broken down into various coordinated subcapacities (cf. Baddeley, 2000). We can


understand the basic Hill picture to be that “psychological subjecthood” is marked by tightly coordinated mental or psychological capacities, working together (which can include, not interfering) to serve organism ends. It would be a task to spell out details, consulting, as needed, empirically driven theorizing about high-level mentality. 21. At least or especially experiences of static objects or forms. Matters improve—significantly, qualitatively—when object or figure motion is introduced (see Farah, 2004; compare Ostrovsky et al., 2009). We also gloss over important, related complexities concerning tracing strategies that form agnosics often adopt to do certain tasks (Farah, 2004). 22. Taking this to be a distinct proposal, going beyond the observation that the form agnosic fragmentary experiences are simply all undergone at the same time by the same subject. On the Bayne (2010) official account of phenomenal unity, this is indeed a distinct proposal (though see the Hill, forthcoming, discussion of interpretations/commitments of the Bayne mereological conception of experience). 23. See the reports presented or described in Farah (2004, ch. 2) or the reports of Milner and Goodale’s famous patient DF (cf. Goodale & Milner, 2004). Compare, as well, the self-report by a Sinha Molyneux subject described in Mandavilli (2006). See also endnote 24, below. 24. First, this fits with the assumptions that (i) the experience fragments reflect the fairly peripheral detection of pre-bound features; and (ii) the neural mechanisms engaged are organized in topographic neural maps—with “map locations” corresponding to visual directions. Further, in a report quoted in Schechter (2013c, 206), in response to an interviewer’s inquiry a visual form agnosic denies a problem with “localization,” reporting instead that properties—which in fact belong to the same object—are experienced as not “belong[ing] together.” This is an informal report in an old (1908) study. 
But it fits well enough with other reports and evidence to take it seriously. Finally, Farah (2004, 13) also writes, “In the three cases [of visual-form agnosia] where depth perception was explicitly reported, it was either intact or recovered while the patient was still agnosic.” (Unfortunately, neither the tests done nor the relevant research reports are identified.) Schechter also notes that there is a sense in which visual-form agnosics remain “aware of … [the] co-occurrence” of certain properties/states. This is close to simply observing—with us—that there is an access unity relation between the relevant experience contents (though this seems not quite Schechter’s own take—see Schechter, 2013c, 198–199, for the details). 25. All that actually seems clear is that there is an impoverished range of experience contents for cognitive capacities to work from, and not that the cognitive mechanisms underlying access are themselves impaired (perhaps at all). Of course, whatever diminishment of access unity is present—however exactly understood—this accompanies the severe fracturing of experience in other ways. 26. At least if the links between experiences invoked are said to be fixed by sensory unity-making principles. It is not interesting to observe that psychological subjects dissolve if more cognitive unity-making relations break down. After all, on the Hill characterization (sec. 5), decently intact functioning of the capacities underlying our more cognitive “unity-making” relations is essential to “psychological-subjecthood.”

References

Baddeley, A. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4, 417–423.
Bayne, T. (2001). Co-consciousness: Review of Barry Dainton’s Stream of Consciousness. Journal of Consciousness Studies, 8, 79–82.
Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation. Oxford: Oxford University Press.
Bennett, D. J. (2014). Formulating debates about the cognitive penetration of perceptual capacities and of perceptual experience. Under review.
Block, N. (1995). On a confusion about a function of consciousness. Behavioral and Brain Sciences, 18, 227–247. Reprinted with some changes in Ned Block, Owen Flanagan, & Güven Güzeldere (Eds.), The nature of consciousness (Cambridge, MA: MIT Press, 1997), 375–415.
Corballis, M. C. (1994). Split decisions: Problems in the interpretation of results from commissurotomized subjects. Behavioural Brain Research, 64, 163–172.
Dainton, B. (2000). Stream of consciousness: Unity and continuity in conscious experience. New York: Routledge.
de Vignemont, F. (forthcoming). A multimodal conception of bodily awareness. Mind.
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429–433.
Farah, M. (2004). Visual agnosia (2nd ed.). Cambridge, MA: MIT Press.


Frisby, J. P., & Clatworthy, J. L. (1975). Illusory contours: Curious cases of simultaneous brightness contrast? Perception, 4, 349–357.
Garner, W. R. (1974). The processing of information and structure. Hillsdale, NJ: Erlbaum.
Goodale, M. A., & Milner, D. A. (2004). Sight unseen: An exploration of conscious and unconscious vision. New York: Oxford University Press.
Held, R., Ostrovsky, Y., de Gelder, B., Gandhi, T., Ganesh, S., Mathur, U., et al. (2011). The newly sighted fail to match seen with felt. Nature Neuroscience, 14, 551–553.
Hill, C. (1991). Sensations: A defense of type materialism. Cambridge: Cambridge University Press.
Hill, C. (2009). Consciousness. Cambridge: Cambridge University Press.
Hill, C. (forthcoming). Tim Bayne on the unity of consciousness. Analysis.
Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining sensory information: Mandatory fusion within, but not between, senses. Science, 298, 1627–1630.
Landy, M. S., Banks, M. S., & Knill, D. C. (2011). Ideal observer models of cue integration. In J. Trommershäuser, K. Kording, & M. S. Landy (Eds.), Sensory cue integration. Oxford: Oxford University Press.
Lee, G. (forthcoming). Unity and essence in Chalmers’ theory of consciousness. Forthcoming in a symposium on David Chalmers, “The Character of Consciousness.” Philosophical Studies.
Mandavilli, A. (2006). Visual neuroscience: Look and learn. Nature, 441, 271–272.
Marks, C. E. (1980). Commissurotomy, consciousness, and the unity of mind. Montgomery, VT: Bradford.
Milner, D. A., & Goodale, M. A. (2006). The visual brain in action (2nd ed.). New York: Oxford University Press.
Nagel, T. (1971). Brain bisection and the unity of consciousness. Synthese, 22, 396–413.
O’Callaghan, C. (2012). Perception and multimodality. In E. Margolis, R. Samuels, & S. Stich (Eds.), Oxford handbook of philosophy of cognitive science (pp. 92–117). Oxford: Oxford University Press.
O’Dea, J. (2006). Representation, supervenience, and the cross-modal problem. Philosophical Studies, 130, 285–295.


Ostrovsky, Y., Meyers, E., Ganesh, S., Mathur, U., & Sinha, P. (2009). Visual parsing after blindness. Psychological Science, 20, 1484–1491.
Rolfs, M., Dambacher, M., & Cavanagh, P. (2013). Visual adaptation of the perception of causation. Current Biology, 23, 1–5.
Samoilova, K. (2013). The Case for Pluralism about Introspection. PhD Dissertation, Brown University.
Schechter, E. (2013a). Commentary on Hill and Bennett. Society for Philosophy and Psychology Conference (June 13–15), held at Brown University.
Schechter, E. (2013b). The unity of consciousness: Subjects and objectivity. Philosophical Studies, 165, 671–692.
Schechter, E. (2013c). Two unities of consciousness. European Journal of Philosophy, 21, 197–219.
Scholl, B. J., & Gao, T. (forthcoming). Perceiving animacy and intentionality: Visual processing or higher-level judgment? In M. D. Rutherford & V. A. Kuhlmeier (Eds.), Social perception: Detection and interpretation of animacy, agency, and intention. Cambridge, MA: MIT Press.
Siegel, S. (2010). The contents of visual experience. Oxford: Oxford University Press.
Sperry, R. W. (1968). Mental unity following surgical disconnection of the cerebral hemispheres. In The Harvey Lecture Series, 62, 293–323. New York: Academic Press.
Trommershäuser, J., Kording, K., & Landy, M. S. (Eds.). (2011). Sensory cue integration. Oxford: Oxford University Press.
Van Cleve, J. (1999). Problems from Kant. Oxford: Oxford University Press.
von der Heydt, R., Peterhans, E., & Baumgartner, G. (1984). Illusory contours and cortical neural responses. Science, 224, 1260–1262.
Wyszecki, G., & Stiles, W. S. (1982). Color science: Concepts and methods, quantitative data, and formulas (2nd ed.). New York: Wiley.
Zeki, S. (2003). The disunity of consciousness. Trends in Cognitive Sciences, 7, 214–218.
Zhou, J., Tjan, B. S., Zhou, Y., & Liu, Z. (2008). Better discrimination for illusory than for occluded perceptual completions. Journal of Vision, 8, 1–17.

12 Unity, Synchrony, and Subjects Barry Dainton

1 The Relational Approach to Phenomenal Unity: Brentano

Brentano’s discussion of the unity of consciousness in Part II, section IV of his Psychology from an Empirical Standpoint (1874) is in a number of respects very much in keeping with much contemporary thinking on this topic. Although Brentano himself very likely believed (much of the time) that mental states belong to mental substances, he also subscribed to F.A. Lange’s project of developing a psychology without a soul. Irrespective of whether souls exist, we know that mental states and episodes exist, and that there “is a certain continuity of our mental life here on earth” (1995, 17). Brentano argued that a properly scientific psychology can (and should) make these the focus of its attention, setting entirely to one side the controversial metaphysical issue of whether our mental states reside in immaterial souls. If we are not allowed to appeal to the soul in accounting for any aspect of the mind, we obviously cannot appeal to the soul in explaining mental unity either. There are different forms of mental unity, but one is more prominent than most: the unity of consciousness. During our waking hours, our conscious mental lives are often highly complex, but the various experiences we are having at any one time—our thoughts, mental images, visual and auditory experiences, emotional feelings, and so on—are also invariably unified in a highly distinctive way. This distinctive form of unity poses no problem if our experiences are states of a metaphysically simple conscious substance; the nature of these substances is such that their conscious states are unified, or so the believer in souls will maintain. But what can we say about this unity if we are obliged to leave souls out of the picture? Brentano saw no difficulty:

The unity of consciousness, as we know with evidence [i.e., certainty] through inner perception, consists in the fact that all mental phenomena which occur in us


simultaneously such as seeing and hearing, thinking, judging and reasoning, loving and hating, desiring and shunning, etc., no matter how different they may be, all belong to one unitary reality only if they are inwardly perceived as existing together. They constitute phenomenal parts of a mental phenomenon, the elements of which are neither distinct things nor parts of distinct things but belong to a real unity. This is the necessary condition for the unity of consciousness, and no further conditions are required. (Brentano, [1874] 1995, 163–164)

Here Brentano is making two substantive claims. The first concerns the ontological status of our conscious states at any one time; they are “real unities,” entities that are more akin to bricks or electrons than herds of cows or collections of toy soldiers. The second claim concerns the source of this unity. According to Brentano, experiential (or phenomenal) contents at a given time belong to a single, unified, conscious state by virtue of being experienced as existing together. This “experienced togetherness” is necessary for the unity of consciousness, and since nothing further is required—contents experienced together are thereby unified in the relevant way—it is also sufficient. In characterizing this position, Brentano talks of mental phenomena being “inwardly perceived” as existing together. In so doing, he is adverting to his doctrine that mental items are conscious if, and only if, they are accompanied by an “inner perception” or “inner consciousness” that they are occurring. On this controversial view, I never just feel a sensation of pain; whenever I feel a pain, I also simultaneously know, in an immediate way, that I am having the feeling. If this doctrine is true, when our experiences exist together in unified ensembles, they will necessarily be accompanied by an inner perception that they are so doing, and so their unity will be “inwardly perceived,” just as Brentano states. Now, while we undeniably can be explicitly aware of the unity of our current experiences, it would surely be implausible to hold that this is always the case. Far more commonly, as we go about our ordinary business, we pay little or no attention to the unity of our consciousness, even though our consciousness is quite likely unified during these periods. However, although inner perception can take the form of an explicit conceptual judgment to the effect that “I am now experiencing …” this is not its only or most fundamental form. 
Or at least, this is what Brentano seems to be suggesting in passages such as this: No one who pays attention to what goes on within himself when he hears or sees and perceives his act of hearing or seeing could be mistaken about the fact that this judgement of inner perception does not consist in the connection of a mental act as


subject with existence as predicate, but consists rather in the simple affirmation of the mental phenomenon which is present in inner consciousness. ([1874] 1995, 142)

The claim that consciousness necessarily involves inner perception in this form—a “simple affirmation” that a particular experience is occurring—has a good deal more prima facie plausibility. It remains controversial quite what Brentano understood by it.1 To simplify, in what follows I will set the inner perception doctrine aside. To a significant degree, Brentano’s approach to the unity of consciousness is independent of his doctrines concerning inner perception. The key idea is simply that phenomenal contents constitute parts of a unified conscious state when, and only when, they are experienced as existing together. That contents can be experienced as existing together is well founded phenomenologically; the different parts of our overall states of consciousness at a given time typically are experienced as occurring together. We can call this way of making sense of the unity of consciousness relational. The contrasting substantivalist doctrine is that experiences are unified by virtue of being states of a single subject or conscious substance. Relational theorists needn’t deny that such things as conscious subjects or substances (or souls) exist, but they do maintain that we don’t need to appeal to such things in accounting for the unity of consciousness. All we do need to appeal to are relations, of an experiential sort, that exist between experiences. In his “unity” chapter, Brentano goes on to point out that the relational account of phenomenal unity is quite modest in its implications and commitments. For example, the relational theorist need not be committed to any particular relationship between unified conscious states and living organisms. An ordinary human being is an organism that (normally) contains just one unified consciousness, but there is no reason to think this is invariably the case. 
In corals “countless little animals appear to have a common bodily life in one and the same stem” ([1874] 1995, 55), but the experiences had by each of these animals are not experientially unified (“there is no inner perception which apprehends their simultaneous existence”), or so Brentano plausibly suggests. Those who house unified conscious states in metaphysically simple substances may (justifiably) feel that such states are immune to the possibility (or threat) of fission or fusion. But Brentano—here anticipating more recent debates—argues that since “it does not require either the simplicity or the indivisibility of consciousness” (171), there is no such guarantee if we embrace the relationalist’s account of phenomenal unity. Brentano—at


least in the Psychology—also seems open to the possibility that conscious states may themselves be spatially extended and distributed through the spatial parts of organisms. In this context (166), he mentions Aristotle’s intriguing (if slightly gruesome) conjecture that when a worm is cut into several still-living segments, its previously unified consciousness undergoes a corresponding division.

2 Convergences

When in Stream of Consciousness (Dainton, 2006) I set about the task of inquiring into what could be said about the unity of consciousness if we confine ourselves to the phenomenal level, the account I arrived at was very close to Brentano’s—more so, indeed, than I then appreciated. Rejecting the substantival approach in favor of relationalism, I suggested “co-consciousness” as the unifying relation. In these terms, a fully unified conscious state is one whose constituent parts are all mutually co-conscious; a maximal or “total” conscious state is one that is not a (proper) part of any larger state whose parts are all mutually co-conscious. As for co-consciousness itself, I suggested it is simply the relationship of experienced togetherness.

There is clearly a relationship between these experiences [that you are having at any one time], but of what sort? It may be that either of these experiences could have existed on its own, but as it happens they both co-exist within your consciousness, and the togetherness of these two experiences is itself something that you experience. (Dainton, 2003b, 2)

Brentano spoke in terms of “existing together” rather than “coexistence,” but it is very plausible to think that he had precisely the same relationship in view. As Brentano also appreciated, co-consciousness is most conspicuous—to the extent that it ever is—in the crossmodal case. That the various things that we are seeing at any given time—the desk just in front, the walls all around—are experienced together is rarely a cause of puzzlement or wonder. That our visual faculty supplies us with a fully unified visual field is as fundamental to its normal mode of functioning as the fact that it allows us to see color or shape. Since that’s what seeing (for us, most of the time) is invariably like, it strikes us as entirely natural that the things we are seeing at any given time are seen together. The same applies for our ability to hear multiple sounds all together or to feel several different bodily sensations at once. Puzzlement only begins when we ask ourselves how it is possible


for experiences that belong to different sensory modalities to be intimately bound together—experienced together—to precisely the same degree as experiences within a single modality. In the Theaetetus, Plato argued that since crossmodal sensory integration cannot be the work of any of our ordinary senses, it must be accomplished by the soul.2 Aristotle proposed an alternative solution: he posited a higher-order sensory faculty, a “common sense,” over and above the familiar five. The nature and workings of Aristotle’s “sensus communis” are obscure, but higher-order models of consciousness still have their contemporary defenders, as do higher-order accounts of phenomenal unity. I have argued elsewhere that these are all problematic, in one way or another, but will not repeat these arguments here.3 I will simply note that when it comes to the unity of consciousness, we do not need to hold that our consciousness possesses a complex, multilevel architecture to explain how unity is possible. Acknowledging that the relationship of experienced togetherness holds between token experiences of diverse kinds does the job simply and economically. In recent writings, Bayne and Chalmers have labeled the sort of relational account just outlined as “bottom-up” and contrasted it with their preferred “top-down” approach. Their starting point is the thought that it seems reasonable to suppose that “there is a single encompassing state of consciousness that subsumes all of my experiences: perceptual, bodily, emotional, cognitive, and any others” (2003, sec. 2). They go on to suggest that the deep unity that exists among experiences is due to their being parts or aspects of these single encompassing states. More recently, Bayne has called this the “mereological” model of the unity of consciousness:

In seeking to account for phenomenal unity it is natural to invoke the notion of subsumption. 
… Whereas treating phenomenal unity as primitive [in the form of the co-consciousness relation] provides us with a “bottom up” approach to the unity of consciousness, one that starts with the multiplicity in consciousness, taking subsumption as our primitive is to adopt a “top down” approach to the unity of consciousness and begin with the unity that subsumes this multiplicity. How should we think of subsumption? It is tempting to think of it in mereological terms—that is, in terms of parts and wholes. … One’s overall phenomenal field is an experience that contains within itself other experiences, nestled like Russian dolls within each other. Indeed, we might venture the thought that total phenomenal states are homeomerous: all the parts of which they are composed share their experiential nature. (2010, 20–21)

At first sight, the subsumptive approach may seem quite different from the account that appeals to a unifying relation in the form of co-consciousness. But how different are these approaches, really?

Figure 12.1 (diagram not reproduced: a total experience E = [e1+e2+e3+e4] whose parts e1, e2, e3, and e4 are joined to one another by double-headed arrows)

In broaching this issue, it will help to have a concrete example on hand. Figure 12.1 represents a (perhaps quite simple) state of unified consciousness containing four experiences, e1, e2, e3 and e4. The fact that each of these parts is experienced with all the others is indicated by the double-headed arrows.

Now, there is a sense in which Bayne and Chalmers’s approach is not what it first seems. A natural (initial) way of construing the theory is along these lines: the unity of consciousness is a product of parthood and inclusion. Experiences are unified when they form parts of more encompassing wholes, in the manner in which e1 is part of E in figure 12.1. This applies in many cases, but not in all—and not, arguably, in the most important case. Each of e1–e4 taken singly belongs to a more inclusive whole, as do more complex compound parts such as e1–e2 and e3–e4. The same applies to e1–e2–e3 and e2–e3–e4. But what about E itself, the whole that consists of all four experiences? Since E is a total experience, we know it is not included in any more inclusive experiences. So if inclusion is the generator of unity, it cannot consist of a unified ensemble of experiences! But of course, it does, for ex hypothesi this is precisely what it is. Alert to this difficulty, Bayne and Chalmers stipulate that subsumption is a reflexive relation. As a consequence, experiential wholes such as E can and do subsume themselves. This may solve the problem on the technical level, but it also means that subsumption is not the relationship it initially seemed. For in the case of maximal or total experiences, phenomenal unity is not a product of inclusion or parthood in anything like the familiar manner. Given this, when it comes to maximal conscious states, it is no longer intuitively clear how the relation of subsumption creates phenomenal unity. It is looking rather more as if total states are basic experiential units, states

whose unity is taken as a primitive. If so, the subsumption account faces a further difficulty: are there any constraints on which combinations of experiences can belong to these basic total states? What is to rule out your current experiences and mine belonging to such a state? This point can be reached from a different direction. A purely mereological account of phenomenal unity would be quite a radical beast indeed. In the standard systems of mereological logic, unrestricted composition applies, i.e., every collection of parts constitutes a whole. If this applied in the experiential realm, then every collection of momentary (or very brief) experiences would constitute a genuinely unified conscious state, irrespective of when and where they occur, or to whom they belong. Your current experiences and mine, for instance, would constitute a unified state; as would Napoleon’s visual experiences at a certain point in time at the battle of Waterloo, and Neil Armstrong’s auditory experiences at a particular moment during the launch of Apollo 11. This is clearly an absurd view, and Bayne and Chalmers have no intention of committing themselves to it. To avoid doing so they impose an additional constraint: not every collection of experiences constitutes a unified experiential whole; only those that enjoy a “conjoint phenomenology” do. A pair of experiences has a conjoint phenomenology when “there’s something that it is like for the subject in question to have both experiences together.” This is starting to sound a good deal more plausible, but much now hangs on what is involved in a subject having both experiences together. There are only two remotely obvious ways of construing this. One option is to say that a pair of experiences has a conjoint phenomenology if, and only if, they are connected by the primitive relationship of experienced togetherness discussed above. But if we take this path, the mereological approach is offering nothing distinctive. 
It is, in effect, telling us that a unified conscious state is one whose constituent experiences are all mutually co-conscious in the manner of e1-e4 in figure 12.1. There is an alternative interpretation to consider. We might say that a subject S has a pair of experiences together if S has both experiences simultaneously, with no further condition or constraint. We are now being offered something that is distinctive, but not very plausible. To see why, suppose e* and e** are experiences that belong to S that are simultaneous but not co-conscious. If this state of affairs could arise, there is a sense in which there is something it is like for S to be having these simultaneous experiences. But since e* and e** are not co-conscious—not experienced together—it would be absurd to claim that e* and e** are parts of a single unified conscious state.

So it is looking very much like the first alternative is the only viable option. It turns out that the phenomenal unification in the mereological model is the product of a phenomenal unity relation that exists among token experiences.4 Indeed, since it is this relation that is responsible for the existence of more complex unified wholes, it looks very much as though it also makes relationships of subsumption (in the intuitive “inclusive” sense) possible.

3 Wholes and Holism

There is one further twist in the tale. Bayne and Chalmers contrast the top-down subsumptive approach with the bottom-up relational alternative. What does this contrast mean in this context? Looking at figure 12.1, it is not obvious, for on both accounts we have a complex phenomenal whole whose parts are bound together into a whole by the co-consciousness relationship. If experiences could be subsumptively related without being co-conscious, then there would be a significant divergence, but as has now become apparent, this is not possible. Bayne and Chalmers hold that co-occurring experiences are unified if, and only if, they have a conjoint phenomenology, and experiences have the latter property if and only if they are co-conscious.

There is, however, a potential divergence between the two approaches that we have not yet considered. It concerns the metaphysical status of experiential wholes vis-à-vis their constituent parts. According to the phenomenal holist, it is experiential wholes that are metaphysically basic and their constituent experiences have only a derivative existence. In contrast, the phenomenal atomist holds that it is the token experiences that constitute experiential wholes that are the basic units of experience, whereas the wholes they constitute have an ontologically derivative status. A third view is that neither wholes nor their parts are metaphysically basic with respect to one another, that both are on an ontological par.
There are different ways of spelling out what “metaphysically basic” amounts to, but we needn’t enter into that issue here. It will suffice to note two generally accepted consequences of these competing metaphysical positions. If the holist is correct, then the experiences that form the (proper) parts of unified phenomenal wholes could not exist independently of those wholes. If the atomist is correct, this is not the case: experiential parts can exist independently of the experiential wholes they happen to find themselves in. If the subsumptive theorist were committed to the view that the basic units of experience are total experiences, and the relational theorist

committed to denying this, then there would be a clear sense in which the former is top-down and the latter bottom-up. But in fact, the situation is not so clear-cut. While it is certainly possible to combine a relational approach to phenomenal unity with an atomistic conception of experiential wholes, it is by no means obligatory, for it is also possible to hold that in the case of wholes generated by the co-consciousness relationship, phenomenal holism obtains. In fact, in Stream of Consciousness I was entirely open to the possibility that experiential wholes are the basic units of experience, and I expended a good deal of space exploring ways in which this form of holism could be motivated (see Dainton, 2010, for further discussion). Similarly, while Bayne and Chalmers are both very sympathetic to holism, they fall short of endorsing it. Chalmers has recently written: “I hold that the total state subsumes the local state, where subsumption is a quasi-mereological relation. But this claim is compatible with different theses about the priority between the two.”5 Bayne does defend a form of holism in chapter 10 of The Unity of Consciousness. His discussion of the issue starts thus:

Theorists who adopt an atomistic orientation assume that the phenomenal field is composed of “atoms of consciousness”—states that are independently conscious. Holists, by contrast, hold that the components of the phenomenal field are conscious only as the components of that field. Holists deny that there are any independent conscious states that need to be bound together to form a phenomenal field. (2011, 225)

However, as becomes clear over the course of the chapter, Bayne’s atomists hold that the constituents of our total conscious states come into existence as full-fledged experiences in their own right, before being bound into experiential wholes. The form of holism Bayne wishes to defend is a thesis about the way in which experiences are produced by the neural systems in our brains: “There are no mechanisms responsible for phenomenal binding because the unity of consciousness is ensured by the very mechanisms that generate consciousness in the first place” (2010, 248). This form of holism is not a thesis about the metaphysical relationship between total experiences and their constituent parts. Indeed, Bayne explicitly rejects the standard claim of the holist that it is impossible for any token experience to exist independently of the larger experiential whole to which (in actuality) it belongs. He argues that while some experiences may not be independent of their broader contexts—a pain in the leg, say, may require a backdrop of bodily feeling—this is not generally the case. Not only is it perfectly possible that “tokens of a single fine-grained phenomenal state type can occur

within the context of various total phenomenal state types,” for Bayne this is “an important feature of consciousness that any account must accommodate” (243).6

4 A Very Peculiar Thread

Since the relational and subsumptive approaches to the unity of consciousness both rely on a unifying phenomenal relationship and are similarly neutral on the issue of holism, the approaches converge on all the essentials. But is the unifying phenomenal relationship to which they appeal itself beyond reproach? In the fifth of a series of articles on “Novelist-Philosophers” penned in 1945 for Horizon, A. J. Ayer said the following about Jean-Paul Sartre:

To say that two objects are separated by nothing is to say that they are not separated; and that is all that it amounts to. What Sartre does, however, is to say that being separated by Nothing, the objects are both united and divided. There is a thread between them; only it is a very peculiar thread, both invisible and intangible. But this is a trick that should not deceive anyone. (1945, 18–19)

One might well sympathize with Ayer’s unwillingness to accept Sartre’s claim that human consciousness contains an ineliminable fissure through which Nothingness (or le Néant) surges. More generally, one might well share Ayer’s suspicion of indiscernible and intangible linkages, particularly when the allegedly real but undetectable entities exist within our own states of consciousness. More worryingly, at least for those sympathetic to the relational account of phenomenal unity, there are those who allege that the co-consciousness relationship falls into this category. In chapter 5 of Sensations, Christopher Hill (1991) describes how he was once strongly tempted to think there is a primitive unifying co-consciousness relationship, but came to realize that there is in fact no such thing. When focusing on his experiencing of pairs of simultaneous sensations, it sometimes seemed that he was aware of them as co-conscious even though he wasn’t—in any obvious way—aware of them being unified in any other way (e.g., by causal or counterfactual relationships, by an act of introspection, by any other sensation, or by a spatial relationship). The sensations in question were experienced together, but the connecting relationship itself seemed devoid of further characteristics. He goes on to say:

I have never felt that I was aware of this new form of co-consciousness as having positive differentiae that distinguish it from other forms. Rather I was aware only that it lacked the positive differentiae that belong respectively to the other forms.

Accordingly, it has seemed to me that this form of co-consciousness is pure—that it has no distinguishing characteristics other than its ability to unite sensations. (1991, 239–240)

What changed Hill’s mind was precisely the “ghostly” nature of co-consciousness, its lack of intrinsic phenomenal features. As he later came to view the situation, when experiencing two sensations belonging to different sensory modalities, he is aware “only of a fact that consists of the simultaneous existence of the two sensations” (1991, 240). If in addition to the two sensations the co-consciousness relation were present in his overall states of consciousness, he would be able to detect it, but he can’t. Hill claims that when he introspects, all he finds are the two sensations and nothing else. If co-consciousness has no phenomenological reality, we should not feel tempted to suppose that it exists. The “very ghostliness of the putative unity relation” is its downfall.

More recently, the elusive character of the co-consciousness relationship has impressed itself upon Philip Goff, but the conclusion to which Goff finds himself drawn is rather different. Co-consciousness—or in Goff’s terms, the “phenomenal bonding” relationship—exists all right, but we know nothing whatsoever of its real nature. As a consequence of this radical ignorance, Goff suggests that our common sense assumptions regarding which experiences are in fact co-conscious may well be entirely misguided. So much so, indeed, that far from being absurd, mereological universalism could well be true in the phenomenal realm. If so, then at any given time, each and every combination of experiences forms a unified co-conscious ensemble no matter how spatially distant their physical subjects may be. The right and left halves of my visual field are co-conscious, but so too are the right half of my visual field and the left half of yours; so too is the left half of my visual field, the twinge in the toe my next-door neighbor is currently feeling, and your total conscious states, and so on, without limit or constraint (Goff, forthcoming a, sec.
VI).7 In venturing this proposal, Goff insists that when it comes to serious metaphysics, common sense should not be seen as a reliable guide. Even if we grant this, the notion that unrestricted phenomenal composition obtains is so bizarre that it is difficult to begin to take seriously. In fact, there is a tension in Goff’s position. If co-consciousness is entirely noumenal, what reason does he have to think it even exists in the first place? Wouldn’t Hill’s position be the more justified? Wouldn’t it make more sense to reject it entirely? Hill’s position is problematic, too, though. Let’s return to the case of the two sensations, and the claim that “when it seems to me that I am aware of

a fact involving two auditory sensations, or two sensations associated with different sense modalities, and a ghostly unity relation, I am aware only of the fact that consists of the simultaneous existence of the two sensations.” In one respect, what Hill says is true. The co-consciousness relationship does not possess any intrinsic phenomenal qualities of its own. When we are aware of experiences related in this way, we are not aware of any connecting filaments or bonding agency possessing distinctive phenomenal qualities. In this sense, co-consciousness is “ghostly.” But it remains the case that the sensations Hill describes don’t merely occur simultaneously; they are also experienced together, in a unified conscious state. If we follow Brentano’s lead and take co-consciousness to consist precisely in this relationship of experienced togetherness, this alone ensures that Hill’s sensations are co-conscious, despite his reluctance to acknowledge as much.

The absence of intrinsic phenomenal qualities in the unifying relationship is what lies behind Goff’s claim that we have no knowledge of the nature of phenomenal unity. Goff maintains that the real nature of some conscious mental items is revealed to us. “When I attend to a pain, it is directly revealed to me what it is for something to feel that way. When I attend to my experience of orange, it is directly revealed to me what it is for something to feel that way” (forthcoming b, sec. V). So in the case of phenomenal qualities—such as pain or orange—we have a transparent understanding of their nature, or so Goff claims. But he denies that we can have a similarly nature-revealing understanding of any relational feature of experience. “Apart from its mathematico-causal structure, the only feature of the world we transparently understand is consciousness. And consciousness is a monadic property” (forthcoming a, sec. V).
If you believe that our conscious states consist of instantiations of monadic phenomenal qualities and nothing else, then clearly you will deny that a relational property such as co-consciousness is found in our experience. It follows that we won’t have the same sort of transparent, essence-revealing grasp of the nature of co-consciousness as we have of pain or color. What is more puzzling is why anyone would think that the only properties to be found in our conscious states are monadic. One of the least controversial claims that one can make about our ordinary states of consciousness is that at any given time they are unified, and that this unity is itself manifest in our experience—it is something we are acquainted with, not a theoretical posit. If we follow Brentano, and a primitive relationship of experienced togetherness is responsible for this unity, then our ordinary conscious states are relational through and through: each of their parts is experienced together with all of the others. Moreover, since we know precisely what it is like for

experiences to be related in this way, it is difficult to see why anyone could be justified in claiming that its nature qua mode of experiencing is entirely mysterious or noumenal. It is true, of course, that we aren’t aware of the co-consciousness relationship in the same way that we are aware of instances of red or pain. But these differences can surely be explained by the simple fact that co-consciousness is not itself a monadic experiential quality, but a relation between experiential qualities. It’s not just any relation: it is a phenomenal relation, consisting as it does of experienced togetherness.8

5 From Unity to Subjects

Experiences are not freestanding, isolated particulars. They have—or are had by—subjects, or so it is generally (if not universally) believed. When it comes to the nature of these subjects of experiences, there is a good deal less agreement. According to a long-standing tradition dating back at least as far as Plato, subjects are metaphysically simple, immaterial substances. In Descartes’s familiar version of this doctrine, the essential nature of these substances is to be conscious in some way or other, and experiences are thus states or modifications of these immaterial substances. It is not surprising that variants of this view have found adherents throughout the centuries. For one thing, it is not easy to see how experience could itself be physical in nature. For another, by virtue of being simple, subjects of this sort cannot lose, gain, or change their parts in the manner of ordinary material things, and so can retain their identity over time in the strictest of ways—a feature that some have always found appealing. No less importantly, the notion that we are fundamentally experiential beings—things that are able to enjoy conscious states, of varying kinds—is itself an idea that has considerable appeal.
Of course this conception has always had plenty of detractors too, even among those who are drawn to the notion that in some way or other we are fundamentally mental beings. Here is Locke voicing one complaint:

The perception of ideas being (as I conceive) to the soul, what motion is to the body; not its essence, but one of its operations. And therefore, though thinking be supposed never so much the proper action of the soul, yet it is not necessary to suppose that it should be always thinking, always in action. … We know certainly, by experience, that we sometimes think; and thence draw this infallible consequence, that there is something in us that has a power to think. But whether that substance perpetually thinks or no, we can be no further assured than experience informs us. ([1690] 1975, sec. 10)

Locke is here taking issue with the Cartesian doctrine that we are essentially conscious beings. While it is highly plausible to suppose that we cannot cease to exist while our current stream of consciousness continues to flow on, by making consciousness the essential property of conscious substances, Descartes also makes it impossible for us ever to stop being conscious. Not surprisingly, given that most of us seem to lose consciousness entirely from time to time, like many others, Locke finds this rather implausible. What is a good deal more plausible, as Locke recognizes, is to suppose that subjects such as ourselves have the power to be conscious. Our bodies have the power (or ability) to move, but although this power is often being exercised—e.g., when we get up in the morning—it is not always being exercised, e.g., when we are asleep or relaxed in an armchair. Similarly, our minds have powers for producing or enjoying very different forms of experience, and these powers are sometimes exercised and sometimes not. While we are awake, our capacities for different forms of perceptual experience are being continually triggered by changes in our environments and bodies, but we also have capacities for inner forms of experience, such as conscious thought, emotions, mental images, memories and so forth, which are also active, from time to time, during our waking hours. When we are asleep but dreaming, our perceptual capacities are all inactive but our capacities for mental imagery are very much active. When we are dreamlessly asleep, all our capacities for experience are completely inactive. In short, on Locke’s view it is not actual or occurrent consciousness that is the essential attribute of subjecthood, but rather the capacity to be conscious. For a given subject to enjoy a conscious mental life, some of his or her experiential capacities have to be active, but it is perfectly possible for a subject to exist without enjoying any occurrent consciousness at all. 
An unconscious subject is one whose capacities for consciousness remain intact but dormant.

A capacity-oriented conception of subjecthood does have a good deal of prima facie appeal. On several occasions—for example, Dainton (1996, 2004, 2008)—I have suggested that an account of the nature of conscious subjects can be constructed on this basis. In saying that “something within us” has capacities for experience, Locke fails to spell out what we are. This is easily remedied: we can say that a subject of experience just is a collection of capacities for experience. I call these collections C-systems. If some form of materialism is true, our own C-systems are grounded in our neural systems. The C-systems of other species—on other planets, if not this one—may well be grounded in systems that are very different from our brains, and the experiences they can produce when active may well be different

from anything we can experience. But it matters not: by virtue of being able to produce some form of consciousness, these alien systems count as C-systems. And the same applies if the nonphysicalists are correct, and experiences are nonphysical in nature. The capacities that produce these experiences also (and obviously) count as C-systems. If we take this step, we have to be able to specify in an informative way the conditions at a time and over time in which experiential capacities belong to a single subject. We can do this in a noncircular way—in a way that does not appeal to the notion of subjecthood—in a straightforward fashion. We can say that at any given time, capacities belong to the same subject if and only if (i) they are active and contributing to unified states of consciousness by virtue of producing experiences that are co-conscious, or (ii) they are dormant, but would be contributing to unified states if they were active. In the diachronic case, we can pursue an analogous path and say that capacities at neighboring times belong to the same subject if and only if they have the ability to contribute to unified streams of consciousness. We can thus appeal to experiential unity in its synchronic and diachronic forms in explicating subject-identity at and over time. I call this account of the nature and persistence of subjects the C-theory. Now, this appealingly simple and straightforward version of the C-theory will not (quite) serve as it stands. In the diachronic case, we need to know more about what the continuity of consciousness involves, and in the synchronic case, there are certain complications that need to be overcome. In Dainton (2008) I say more about the diachronic unity of consciousness and provide the necessary elaborations to the account of synchronic unity. What I want to focus on here is a problem I did not fully deal with there, a problem that concerns the relationship between the unity of consciousness at a time and subjecthood. 
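
The synchronic condition just stated can be set out schematically. What follows is only a sketch under stated assumptions, not an official formulation of the C-theory: the predicate letters Act, Co and Same, the function e_t, and the counterfactual connective are notation introduced here purely for illustration. In particular, reading "dormant" in clause (ii) as "not both active" is an interpretive assumption, since the wording of the condition does not explicitly address the case in which one capacity is active and the other is not.

```latex
% Requires amsmath (\text) and amssymb (\Box).
% Notation introduced for illustration only (not taken from the source text):
%   Act_t(c)  : capacity c is active at time t
%   e_t(c)    : the experience c produces at t (defined when c is active)
%   Co(x, y)  : experiences x and y are co-conscious
%   \boxright : counterfactual conditional ("if it were that ..., it would be that ...")
\newcommand{\boxright}{\mathrel{\Box\!\!\rightarrow}}
\[
\mathrm{Same}_t(c_1, c_2) \iff
\underbrace{\bigl(\mathrm{Act}_t(c_1) \wedge \mathrm{Act}_t(c_2) \wedge
  \mathrm{Co}(e_t(c_1), e_t(c_2))\bigr)}_{\text{clause (i)}}
\;\vee\;
\underbrace{\bigl(\neg(\mathrm{Act}_t(c_1) \wedge \mathrm{Act}_t(c_2)) \wedge
  [(\mathrm{Act}_t(c_1) \wedge \mathrm{Act}_t(c_2)) \boxright
  \mathrm{Co}(e_t(c_1), e_t(c_2))]\bigr)}_{\text{clause (ii), as read here}}
\]
```

On this rendering the right-hand side mentions capacities, experiences, and co-consciousness, but not subjects, which is the sense in which the condition is noncircular. The complications for the synchronic case mentioned above are not captured by the sketch.
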

6 Problematic Multiplicities

Mark Johnston fully appreciates that we find it very natural to identify ourselves with the owners or subjects of our conscious states. He also sees that given this, a natural way forward, at least for those of us who are reluctant to believe in immaterial substances, is to identify ourselves with those physical entities—in our own case, brains or certain neural systems—that have the capacity to produce or “realize” our conscious states. But he also believes that this approach to the self offers us fewer guarantees than we might suppose. He outlines a scenario in which a conscious subject (or what appears to be such) is informed that their streams of consciousness—or enduring

“arenas of presence,” as Johnston calls them—are being artificially induced, and are in fact all hallucinatory. The subject is then given the following information by the people supplying the hallucinations:

Despite our vast psychophysical knowledge, we have never been able to achieve the goal of one realizer/one arena of presence. What we know of psychophysics tells us that our realizers always realize arenas of presence in batches of seven. We can find no way of parsing out subparts of the realizers to overcome this difficulty, and our best psychophysics strongly suggests this problem cannot be overcome. So there is not even a physical system that is especially causally responsible for any one of the seven separated arenas of presence in which these words are now appearing. (2010, 143)

Johnston suggests that if you were to find yourself in the position of this hapless subject, and believe what you have just been told, there is only one conclusion that you should draw: you do not exist; there is no you thinking your thoughts or having your perceptual and emotional experiences; there are just the thoughts and experiences. Johnston goes on to point out that if this is the case, then Descartes was mistaken to suppose that he could infer from the existence of his thoughts and experiences to the existence of a conscious subject. Tim Bayne deploys the hypothetical state of affairs of a single physical system producing multiple streams of consciousness to a similar but more tightly defined end. For Bayne, what this “multiplicity scenario” (as we can call it) demonstrates is the untenability of an otherwise promising approach to the self, namely identifying selves with “the underlying substrate that is responsible for generating the stream of consciousness—the machinery in which consciousness is grounded” (2010, 287). In earlier work, Bayne himself previously subscribed to an account along these lines—see Bayne and Dainton (2005)—but he now feels obliged to reject it. For what the multiplicity scenario demonstrates is that “there is no a priori guarantee that a single consciousness-generating mechanism will produce only one stream of consciousness” (2010, 288). For Bayne there has to be such a guarantee, for there is a “constitutive relation” between the self and the synchronic unity of consciousness, such that it simply makes no sense at all to suppose that a single self could have more than one unified conscious field at any given time. Whatever else they may be, selves are the kind of things that necessarily have a unified consciousness at any given time. If it is possible for consciousness-generating mechanisms to produce experiences that are not phenomenally unified, we cannot identify selves with these underlying mechanisms.

Although the C-theory as outlined in the previous section identifies selves (or subjects) with systems of capacities for consciousness, not “mechanisms” or “substrates,” in this context it is a difference that makes no difference. The capacities for experience with which we are familiar are not independent, free-floating particulars; they are located—or grounded—in objects, of one kind or another. If experiential capacities are possessed by physical things such as human brains, then the multiplicity problem rears its head. If a single, brain-like system can produce several streams of consciousness rather than one, how many subjects of experience does it sustain? Just one subject, several, or none? If experiential capacities are grounded in immaterial substances, the same question arises. Being nonphysical does not, in and of itself, guarantee that conscious substances and unified conscious states necessarily go hand in hand. Unless we can rule out the possibility of a single immaterial substance being able to sustain more than one stream of consciousness at a given time, then we are confronted with the need to make sense of such situations.

7 Circumventions

One tempting response to the multiplicity objection is to deny that the scenarios described by Johnston and Bayne are genuine possibilities. Certainly many dualists would argue that the immaterial substances with which they identify themselves do necessarily house just one stream of consciousness at any given time. However, substance dualism is a highly contentious doctrine, and much of the appeal of the C-theory (such as it is) lies in its neutrality with regard to the precise relationship between the phenomenal and the physical. So let us set substance dualism to one side and consider other options.
If consciousness is a wholly material phenomenon—if our experiential states are wholly physical states in our brain, perhaps in the way Russellian monists suggest, by contributing to the intrinsic natures of these states—then the problematic multiplicities may well be metaphysically impossible, at least for brain-like systems. Let us suppose, as seems likely, that your brain in its current condition (i.e., with corpus callosum intact) can produce just one stream of consciousness at a given time. Now, on current assumptions, your conscious states are identical with certain physical processes in your brain, and the phenomenal properties of your experiences contribute to the intrinsic natures of these physical processes (this is the distinctive claim of monism in its nonreductive Russellian guise). The notion
that these physical processes could remain just as they are in all physical respects but nonetheless give rise to several unified conscious states, rather than the actual one, seems simply absurd. Since your experiences pervade volumes of matter and space in the same way as the physical processes with which they are identical, there simply isn’t room in your brain for the envisaged experiential doubling (or tripling, or more). To suppose otherwise is as absurd as holding that the nine-inch screen on a tablet computer could suddenly produce two entirely distinct nine-inch images, rather than one. The Russellian version of materialism is more plausible than most alternatives, but it is by no means the only game in town. Property dualism remains a popular alternative, particularly in the guise elaborated by Chalmers (1996) and more recently Tononi (2008), where phenomenal states are connected in law-like ways to physically realized information processing. On this view, individual experiences are particulars that are produced by activity in brains but are not to be found in brains, in virtue of being nonphysical. If experiences are related to brains in this way, the puzzling multiplicities envisaged by Johnston and Bayne are far harder to rule out. Certainly, the “there’s simply not enough space” objection has no force at all, since experiences don’t exist in physical space. Could it be argued that the relevant psychophysical laws couldn’t in fact take the form of “one physical state—many experiential states”? It no doubt could, but it is not obvious how to develop a very compelling case along these lines. It might be objected that if we suppose that just one nonphysical experience E is produced by a given physical event P in a given brain at a given time t, then we have a satisfactory account of the identity and individuality of E: it is an experience of a certain phenomenal character, which exists at time t and is caused by P. 
If, on the other hand, we accept that it is possible for P to produce multiple simultaneous experiences E, E*, E** etc., which are all exactly similar in phenomenal character, then we no longer have an account of what distinguishes any of these experiences from the others. However, while some may be swayed by this argument, it is difficult to see why property dualists who regard experiences as basic particulars, fully on par with any material things, need accept it. If experiences are basic entities in this sense, they don’t need to be related to anything else for their identity or individuality to be secure. Given this, if a physical event or process P can cause E to exist, it is not obvious why it can’t cause E*, E** (and perhaps many more E-type experiences) to exist as well, all at the same time. This point aside, there is no need for the experiences P produces to even be qualitatively identical. I have thus far been assuming that this is
what Bayne and Johnston had in mind, but in worlds where the psychophysical laws are indeterministic, it will be perfectly possible for the multiple streams of consciousness to vary in phenomenal character.9 These points aside, there remains a more general one. We don’t yet fully understand the relationship between the experiential and physical realms. Certainly none of the solutions currently on offer are altogether unproblematic—far from it. So it could well be that the real nature of the relationship in question is quite different from anything thus far contemplated. Given this ignorance, it is hard to see how we can be confident, at this juncture, in ruling out the possibility of Bayne and Johnston’s problematic multiplicities.

8 Some Solutions

In discussing the multiplicity problem in “Selfhood and the Flow of Experience” (2012), I noted a number of responses to it that are available to the C-theorist.

(1) Multiple Streams → One Subject: The C-theorist can argue that the relationship between subjects and streams of consciousness is less tightly constrained than Bayne allows, and so it is possible for a single subject to have more than one unified stream of consciousness at a given time. Perhaps hyperselves are possible: single subjects that possess a multiplicity of discrete, unified, conscious states at a given time.

(2) Multiple Streams → No Subject: In the envisaged multiplicious circumstances, there is no subject of experience present, just the illusion of one—an illusion that exists in several simultaneous streams of consciousness. (This is the construal of the situation defended by Bayne and Johnston.)

(3) Multiple Streams → Multiple Subjects: The consciousness-producing systems envisaged by Bayne and Johnston in fact house a multiplicity of distinct C-systems. If we equate individual subjects with individual C-systems, there is thus a multiplicity of subjects.

After sketching these different options, I left the task of deciding between them to another occasion. Here I want to say a little more about their merits and demerits. If the C-theorist were obliged to subscribe to option (2), it would not be the end of the world. Bayne seems to assume that endorsing (2) undermines the whole approach of identifying subjects with systems of consciousness-generating capacities. But without further argument, it is not clear that this is the case. Why can’t the C-theorist argue like this: “Yes, when a C-system
has the capacity for generating multiple streams of consciousness, it doesn’t constitute a subject, but those C-systems that can only generate a single stream at any one time do constitute subjects.” More bluntly, from the fact that some C-systems don’t constitute subjects, it doesn’t follow that none do. The multiplicity objection does not refute the C-theory; what it shows is that not all C-systems are candidates for subjecthood. In fact, I don’t think the C-theorist need opt for (2), for there are alternative construals that, if viable, do not suffer from the disadvantage of obliging us to hold that not all C-systems are subjects. Let’s first consider (3). If the existence of multiple streams entails the presence of multiple C-systems, and hence multiple subjects, the problem is solved. Each of these subjects will have a fully unified consciousness at any given time. As for whether this way of construing the situation is available, the answer is not simple or clear cut. C-systems are collections of experiential capacities that possess the distinctive attribute of generating unified conscious states. An experiential capacity, or so I am assuming, is a dispositional property of a distinctive type; when triggered, it produces an experience of some kind. I have also been assuming that physically realized experiential capacities are individuated by reference to the kinds of experience they can produce and their material bases, so that two capacities—E1 and E2—which exist at the same time, and can produce exactly the same kinds of experiences, can nonetheless be distinct by virtue of possessing distinct material bases (e.g., they are located in different brains, or different parts of the same brain). Now with these points in mind we can return to the issue in question: is it legitimate to regard physical systems that are capable of generating multiple, simultaneous streams as possessing multiple C-systems instead of just one? 
How we answer this question depends on how we conceive of dispositional properties in general. It is widely accepted that an object possesses the dispositional properties it does in virtue of the nature of the object. If a vase has the capacity to resist shattering when dropped, it possesses this ability because of certain causally relevant properties that it possesses. More generally, if an object x is disposed to produce a certain kind of effect E under conditions C, this will be because x possesses some feature F that is causally relevant to the occurrence of E in circumstances C. This F is the material ground or base of the disposition under consideration. So far, so familiar (and plausible), but what is the precise relationship between a dispositional property and its material ground? Here we find a divergence of opinion.10

For those who follow Armstrong (1969) and Mellor (2000), the relationship is identity: the disposition and causal base are one and the same. While this stance has distinct advantages—it is entirely transparent how dispositional properties manage to be causally effective—it also has some disadvantages. A foodstuff has the dispositional property of being fattening if, for a given population, it tends to lead to those who consume it putting on weight. But there are many different foodstuffs, with different chemical compositions, that have this property. Since it is the chemical composition that is causally efficacious, if we identify dispositions with their causal bases, we are obliged to conclude that the property being fattening doesn’t actually exist; what does exist are a vast number of properties of the form has chemical composition X, for each distinct fattening compound. It was to avoid this consequence that Prior, Pargetter, and Jackson (1982) argued in favor of an alternative functionalist account of dispositions. On this view, a dispositional property such as being fattening consists of the second-order property of having a first-order property (such as a certain chemical composition), which plays a certain causal role (in this case: causing an increase in weight). When construed in this way, dispositions have causal bases, but they are not identical with the latter. It is this distinctness that leads to a common complaint. Dispositions are supposed to be causal powers, but if all the causal work is being done by the first-order (base) properties, aren’t the second-order properties themselves causally impotent? This is not the place to enter further into this debate. For present purposes, what matters are the consequences of these different conceptions of capacities for how we can interpret the multiplicity scenario.
To bring the essentials into clear focus, let’s start by introducing a single experiential capacity; this capacity is such that when triggered in circumstances C, it produces a single token E-type experience thanks to a causal basis F. Let’s now suppose that instead of producing a single token E-type experience when triggered, F produces two such token experiences. Can we say that F is now the causal basis of two distinct experiential capacities, P1 and P2, for E-type effects? If we take experiential capacities to be second-order, functional properties, interpreting the situation in the “two-capacity” manner has some justification. There are, after all, two token experiences that can be produced by the envisaged system; given this, holding that the causal basis F grounds two distinct “capacities for E-type experience” seems justified. It is true that P1 and P2 are both grounded in a single causal basis F, and they are of the same type, namely for a single E-type experience. But we can nonetheless hold
that they are distinct by virtue of the simple fact that the token experiences produced by P1 and P2 are themselves numerically distinct. In the envisaged circumstances, however, there is a competing account of the dispositional properties of causal basis F, namely that the latter grounds a single experiential capacity—let’s call it P3—for two, token E-type experiences. The dispositional facts may be compatible with the hypothesis that F possesses P1 and P2, but they are just as compatible with the hypothesis that F possesses only P3. Given this, the hypothesis that F sustains two powers (and by extension, two C-systems), rather than one, may not be incoherent, but neither is it demanded by the facts of the situation. Indeed, there is a case for holding that the single-capacity interpretation is less ontologically costly, since it requires us to acknowledge just one capacity rather than two and that it should be preferred for that reason alone.11 The multicapacity interpretation is more obviously and directly problematic if we follow Armstrong and Mellor, and hold that dispositions are identical with their causal bases. If we now try to say that P1 and P2 refer to numerically distinct powers, we immediately run into a problem. Since both P1 and P2 are numerically identical with F, they must be numerically identical with one another. Clearly, what goes for individual capacities will also apply in the case of complex collections of capacities, such as C-systems. If we construe dispositional properties in Armstrong’s way and try to hold that a single causal base supports two distinct C-systems, we will immediately fall into incoherence: the C-systems will both be identical with the causal base, and hence with each other. Pulling these points together, the “multiple streams → multiple subjects” option is itself looking distinctly problematic. 
It is not obviously incoherent if we construe dispositions as second-order properties; but even if the distinctness of the experiences they produce justifies ascribing multiple experiential powers to a single base, and this is questionable, it remains the case that this way of construing dispositional properties is itself dubious in the eyes of many. Since the main alternative account of dispositions renders the “multiple subjects” interpretation totally untenable, C-theorists who find the second-order view unpalatable need to look elsewhere for a solution to the multiplicity problem.

9 Hyperselves Revisited

We have yet to take a closer look at the first of our three options, the “multiple streams → single subject” interpretation. If we take subjects to be
C-systems, we are now committed to holding that these multiple streams are the product of a single C-system. In so doing, we circumvent the problems associated with claiming that the systems in question are composed of two (or more) collections of distinct (duplicated), experiential capacities. But does this way of viewing matters make sense in other respects? I suggested earlier that we might call subjects such as these “hyperselves.” We ordinarily think of cubes as possessing just six sides and extending through three spatial dimensions, but mathematicians have now made us familiar with the properties of higher-dimensional hypercubes. Although we cannot visualize these, since their geometrical properties are well understood and perfectly consistent, most of us are happy to accept they are logically possible. In analogous fashion, should we not be prepared to accept the logical possibility of hyperselves, subjects possessing more than the usual number of streams of consciousness at a given time, but nonetheless genuine subjects? Limited beings that we are, we cannot imagine what it would be like to be such a subject, but if this is not a barrier to accepting the possibility of hypercubes, why should it prevent our accepting the possibility of hyperselves? After all, we know exactly what their existence would involve: a normal self has one stream of consciousness at a time, selves with additional “dimensions” have two concurrent streams, or more—Johnston’s hyperselves have seven. There is an obstacle, however. Bayne will insist that hyperselves cannot exist because they fail to abide by the constitutive connection between selfhood and phenomenal unity, the connection that guarantees a priori that “a single consciousness-generating mechanism will produce only one stream of consciousness” (2010, 288). 
We can formulate this constraint thus:

Subject-Unity Principle: At any one time, the experiences of a single subject S are all mutually co-conscious, and necessarily so.

Since there is no denying that the subject-unity principle does look to be plausible, it is looking as though the notion of a hyperself is fatally flawed. In fact, the situation is by no means so clear-cut. The subject-unity principle may normally apply, but does it invariably apply? Are there no conceivable circumstances in which it fails? There are indeed. The time travel tale is a staple of the sci-fi genre, and scenarios in which a time traveler travels back and encounters his or her earlier self are themselves staples of the time travel subgenre. The adventures of the protagonist of Heinlein’s “All You Zombies” (1959) are a classic example. Such encounters also feature prominently in the 2012 movie Looper. In the latter, the eponymous loopers are assassins who
travel back in time and dispose of their victims in the past. The organization that runs the operation has a tidy way of disposing of aging loopers: they arrange for their assassination by loopers who are in fact their own younger selves. Needless to say, young loopers who find out that they have “closed the loop” by killing their older selves are none too pleased by the discovery. When a young looper has his or her older self in their sights but has not yet pulled the trigger, we have an instance of one and the same person existing twice over. We have no difficulty at all—given the context and story—in accepting that the old and young loopers are one and the same person during the period when they coexist. Yet their states of consciousness during this period are not unified; they are as distinct as yours and mine. This fact serves to undermine the subject-unity principle. The latter may normally be true, but the time travel case shows us that it is not invariably true: in special circumstances, it can fail to hold. Perhaps Bayne and Johnston’s multiple-stream scenarios are another case in point? In response to the latter suggestion, a critic might reasonably claim:

I grant that the possibility of coexisting time travelers suggests that the subject-unity principle doesn’t always hold, but time travel—assuming it is logically possible at all—is a dramatic and highly distinctive departure from the normal run of things. Clearly there is nothing resembling time travel going on in the multiple-stream case. As a result, there is no reason to suppose that time travel has anything to tell us about the best metaphysical interpretation of multiple-stream cases.

Once again, this turns out not to be entirely true. Time travel gives rise to several apparent paradoxes. Suppose I step into a time machine and travel a thousand years into the past. The machine I am using is quite efficient; the trip only takes half an hour. Is there a paradox here? It might seem so: my departure and arrival are separated by thirty minutes and one thousand years. How could this be possible? However, as David Lewis pointed out in his classic “The Paradoxes of Time Travel” (1976), in making sense of time travel we need to distinguish ordinary objective time, “external time” as Lewis labels it, from the “personal time” of the time traveler. In a time traveler’s personal time, the stages of his or her life are ordered as they usually are—e.g., bodies age in the normal way, earlier events are remembered (sometimes), later ones never, and so forth. From the standpoint of my personal time, my arrival occurs thirty minutes after my departure; from the standpoint of external time, it occurs one thousand years before my departure. Acknowledging the existence of these different temporal frameworks helps with another puzzle. Suppose I travel back (or forward) to a time
when a younger version of myself also exists. The trip has given me a headache—a quite excruciating one. But my younger self is not afflicted; he feels absolutely fine. So it seems that at the same moment in time I am both experiencing a headache and not experiencing a headache. How could this be possible? Doesn’t the principle of the Indiscernibility of Identicals rule it out? Once again, personal time comes to the rescue. My headache-afflicted and headache-free states do coexist in external time, but in my personal time they evidently don’t. From the perspective of the latter, the phase of my life that contains the headache occurs several years later than the headache-free phase enjoyed by my younger self. Resolving these problems in this way may itself seem problematic. “Fine, from the perspective of personal time, your older self’s painful headache occurs later than the pain-free condition of your younger self, but from the perspective of external time they are simultaneous. So the problem hasn’t gone away, for we are still confronted with a paradoxical situation,” or so an objector might argue. As I have pointed out elsewhere, in having recourse to personal time in resolving these paradoxes of time travel, Lewis tacitly relies on a general principle, one that can usefully be made explicit thus: The Temporal Parity Principle: If an object exists within more than one temporal framework, it is sufficient for its career to have an unproblematic form in one of these frameworks for the object to be deemed unproblematic from a metaphysical perspective. (Dainton, 2008, 390)

To appreciate why this Parity Principle is reasonable, it suffices to focus on the rationale for introducing the distinction between personal and external times in the first place. In the case of time travelers, we do so because it is clear that external time is not the only temporal framework within which the lives of these people can intelligibly be seen to exist and unfold. When just prior to stepping into my time machine I anticipate the adventures I will shortly be having in the past, there is a real sense in which these adventures lie in my future, even though from the vantage point of external time they exist in the past. We are led to recognize the existence of personal temporal frameworks precisely because external time cannot accommodate the temporal relationships that hold between the different phases of some people’s lives. In cases such as these, a life that makes (metaphysical) sense in the context of one of the temporal frameworks in which it exists will not do so in the context of the other. But since this is precisely what we should expect in such cases, there is no reason to suppose this state of affairs constitutes a (metaphysical) threat to these people’s lives.

The distinction between personal and external times has application outside the realm of time travel. In a theological context, Pruss (2013) has recently put it to use in solving puzzles associated with saintly bilocations. Elsewhere I have argued that the distinction can be used to make (better) sense of personal fission cases: such divisions can be interpreted as branchings in personal time (2008, ch. 12). When they are so interpreted, there is no longer any insuperable difficulty in holding that the post-fission products are numerically identical. I will not repeat the case for construing fissions in this way here. What I will suggest is that the temporal distinction can assist in rendering the notion of a hyperself intelligible. If such subjects existed, they would have several streams of consciousness running concurrently, not necessarily all of the time, but some of the time. Since they do not undergo any form of fission, there are no grounds for supposing that the personal times of hyperselves instantiate a branching topology. Instead, we can hold that their mental lives unfold within parallel dimensions of personal time. There is a prima facie rationale for this. In the case of parallel physical universes, each with its own distinct (external) temporal dimension, the universe-branches are all entirely isolated from one another, spatiotemporally and causally. In analogous fashion, the experiences in each branch— each stream of consciousness—of a hyperself’s conscious mental life are experientially entirely isolated from one another. The conscious states in each of these streams may exist simultaneously in external time, but from an experiential perspective they might as well exist in separate universes. Moreover, if this construal of hyperselves is accepted, personal time can fulfill all the functions it serves in the case of the time traveler. 
Hyperselves whose streams are not qualitatively identical give rise to a problem: how can a single subject be in pain and not in pain at the same time? Those whose streams are qualitatively identical at any given time also pose a problem: how can a single subject have two (or more) total conscious states that are exactly the same at the same time? We now have an answer: these otherwise problematic states may exist simultaneously in external time, but they exist at nonsimultaneous moments of personal time. Of course, the notion that there could be such a thing as a single subject whose consciousness is distributed across parallel, personal time-series might easily seem so bizarre as to not warrant further consideration. However, the single substrate-multiple stream scenario that Bayne and Johnston invite us to consider is also very bizarre.12 No doubt those who are tempted to identify individual subjects with individual streams of consciousness—of contemporary writers, Galen Strawson comes to mind—will maintain that there are multiple subjects in such cases, one for each distinct stream. But
anyone who rejects this view in favor of identifying subjects with persisting capacities for consciousness, or C-systems, does not have this option. If there is just one C-system in such cases, there is at most one subject. Interpreting matters thus has an additional rationale. As viewed by Lewis, a personal time is determined by physical and psychological factors. “First come infantile stages. Last come senile ones. Memories accumulate. Food digests. Hair grows. Wristwatch hands move. The assignment of coordinates that yields this match is the time traveler’s personal time” (1976, 70). This approach is unobjectionable for anyone who appeals to biological and psychological factors (such as memory) in explicating personal identity over time. But what if we take ourselves to be, fundamentally, subjects of experience, whose identity conditions are specified in experiential terms, as in the C-theory? A subject of experience remains in existence through a given interval of time provided it remains conscious, or retains its capacity for consciousness throughout that interval, irrespective of any physical or psychological transformations it might undergo. Thus construed, subjects can survive changes that are not survivable according to the standard biological and psychological accounts of personal identity.13 Evidently, if this is the sort of entity we are, then in defining a personal time we will need to appeal to experiential rather than biological or psychological factors. According to the C-theory, a subject of experience is a collection of experiential capacities that are such that at any one time these capacities can generate experiences that are synchronically co-conscious, and over time they can contribute to continuous streams of consciousness. In broad outline at least, it is not difficult to see how a personal time can be defined from these basic ingredients. 
Since each stage of a (typical) stream of consciousness is experienced as flowing into the next, we can say of two experiential capacities, C1 and C2, that C1 is subjectively earlier than C2 if the experiences C1 can produce would be experienced as flowing into the experiences C2 can produce. C1 and C2 are subjectively simultaneous only if the experiences they can produce would be experienced as synchronically co-conscious. In this sort of way, the relationships that bind experiences into unified streams—i.e., co-consciousness, in its synchronic and diachronic forms—can be used to define a personal (or subjective) time. There is more to be said on this topic, but this is not the place for it; see Dainton (2008, sec. 12.6) for a more detailed discussion. For present purposes, it is sufficient to note just one consequence of defining personal time in this sort of way. As we saw above, a major obstacle in the way of construing a hyperself as a single self is the subject-unity principle, the thesis that at any one time the experiences of a single subject are all mutually
co-conscious, and necessarily so. If we take hyperselves to be subjects living out their conscious lives in parallel dimensions of personal time, we needn’t reject this principle entirely, for we can endorse a variant of it:

Subject-Unity Principle: At any given moment in the personal time of a subject S, the experiences of S are all mutually co-conscious, and necessarily so.

This revised version of the principle retains the constitutive connection between subjecthood and phenomenal unity which Bayne insists upon, but it is now relativized to personal rather than external times. Indeed, since simultaneity in personal time is defined in terms of synchronic phenomenal unity, it is entirely clear why this constitutive connection exists in the first place. Moreover, irrespective of its application to the exotic cases we have been considering latterly, there are independent reasons for supposing that this is the form the subject-unity principle should take. How else are we going to make sense of the lives and minds of Dr. Who and other time-traveling folk?

Acknowledgments

My thanks to Tim Bayne, Christopher Hill, Daniel Hill, Philip Goff, Thomas Jacobi, and audiences in Fribourg, Manchester, and Oxford.

Notes

1. See Textor (2006) for a good introduction to some of the controversies.

2. In the Theaetetus (185), Plato observes that we know perfectly well that sounds differ from colors in character and wonders how this is possible given that we cannot detect colors with our sense of hearing, nor sounds with our sense of vision. Theaetetus proposes a solution: “it appears to me that the soul views by itself directly what all things have in common”—and Socrates concurs.
In his commentary on the Theaetetus, Burnyeat observes that the unity of the perceiving subject is demonstrated from the unity of the thinker who surveys and judges the proper objects of the different senses, for which purpose the thinker must also be a perceiver capable of exercising and coordinating a plurality of senses. Plato’s achievement in this passage is nothing less than the first unambiguous statement in the history of philosophy of the difficult but undoubtedly important idea of the unity of consciousness. (1990, 58)

3. See Dainton (2003a); for more on Aristotle’s account, see Pavel (2007).

4. Tim Bayne has confirmed in conversation that this is indeed the case.

5. See Chalmers (forthcoming). Chalmers goes on to say that “Bayne and I tentatively favour holism, on the grounds that our basic concept of consciousness is the concept of a total conscious state (what it is like to be a subject at a time).” He adds that he does not think this claim is obviously true, but he finds it does have “some plausibility on reflection.”
6. Interestingly, in Psychology, Brentano also seemed to hold that holism applies to some but not all parts of our total states of consciousness. “If our simultaneous mental acts were never anything but divisives of one and the same unitary thing, how could they be independent of one another? Yet this is the case; they do not appear to be connected with one another either when they come into being or when they cease to be. Either seeing or hearing can take place without the other one, and, if they do occur at the same time, the one can stop while the other one continues” (1874, 157–158). Brentano claimed that there were different degrees of interdependence. Acts of seeing and hearing may be independent, but other kinds of acts are not, e.g., I can only desire something if I have a presentation of it, but I can have a presentation of it without desiring it. For more on Brentano’s holism, see Textor (forthcoming); for a defense of a more extensive form of holism, see Dainton (2010, ch. 9; 2008, sec. 9.5).
7. It should be noted that for Goff, the phenomenal bonding relationship exists between subjects rather than experiences. However, in the context of the form of panpsychism Goff wishes to defend, the bonding relationship exists between conscious subjects, and it is not clear what subjects amount to over and above the experiences they are enjoying at any particular time.
8. Elsewhere Goff writes, “One deploys a direct phenomenal concept when one thinks about a given phenomenal quality by attending to it and thinking about what it’s like to have it. Having formed a direct phenomenal concept of a certain kind of pain, I can then go on to judge that other people feel that way, or that I felt that way yesterday” (forthcoming b, sec. 1). Can we not do the same for the relationship of experienced togetherness, and form a direct phenomenal concept of it?
9. If the streams do vary in phenomenal character, then consciousness had better be epiphenomenal in these worlds or else trouble will ensue! Although the “single substrate-multiple indistinguishable streams” scenario does not suffer from this complication, if events in these streams can causally interact with the physical, then at least some physical events will be causally overdetermined.
10. There are those who deny that dispositional properties need to have categorical bases at all; to simplify, I will resist speculating on whether the Bayne-Johnston multiplicity scenario could be intelligibly formulated in this framework.
11. Yet another way of construing dispositions gives rise to a similarly problematic indeterminacy. I have been assuming that dispositional properties have categorical bases, i.e., that they are grounded in nondispositional properties of some sort. This


B. Dainton

is probably the most common way of construing dispositions, but there are dissenters. Some hold that bare dispositional properties are possible, i.e., dispositions that are not grounded in categorical properties of any sort. (Ontic structuralists hold that reality as a whole is comprised of properties that are, in effect, bare dispositions.) Bare dispositions can be thought of as a nomological relationship between triggering circumstances, spatial regions, and effects. Accordingly, let us suppose that when certain triggering conditions obtain in its vicinity, spatial region R produces two E-type experiences. This pattern of activity is certainly compatible with R possessing two experiential capacities, each for single E-type experiences. But it is equally compatible with R’s possessing a single capacity for two E-type experiences. And once again, the latter hypothesis might be preferred on grounds of simplicity.
12. Pruss (2013) suggests that when saints bilocate, their personal time bifurcates into two parallel sequences; since saintly bilocation is a temporary condition, usually lasting only a few hours, so too are these bifurcations in personal time. Since hyperselves possess multiple streams of consciousness throughout their lives—or so I am assuming—interpreting their mental lives in Pruss’s way is not an option. However, during those periods when the saints in question are bilocated, their minds (temporarily) resemble those of hyperselves.
13. See Dainton (2008) for more on this.

References

Armstrong, D. (1969). Dispositions are causes. Analysis, 30, 23–26.
Ayer, A. J. (1945). Novelist-philosophers. Horizon, July, 12–25.
Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, disassociation. Oxford: Oxford University Press.
Bayne, T., & Dainton, B. (2005). Consciousness as a guide to personal persistence. Australasian Journal of Philosophy, 83(4), 549–571.
Brentano, F. [1874] (1995). Psychology from an empirical standpoint. London: Routledge.
Burnyeat, M. (1990). The Theaetetus of Plato. Indianapolis: Hackett.
Chalmers, D. (1996). The conscious mind. Oxford: Oxford University Press.
Chalmers, D. (Forthcoming). Modality and the mind-body problem: Reply to Goff and Papineau, Lee, and Levine.
Dainton, B. (1996). Survival and experience. Proceedings of the Aristotelian Society, 96, 17–36.


Dainton, B. (2003a). Higher-order consciousness and phenomenal space: Reply to Meehan. Psyche, 9.
Dainton, B. (2003b). Précis of Stream of consciousness. Psyche, 9.
Dainton, B. (2004). The self and the phenomenal. Ratio, 17(4), 365–389.
Dainton, B. (2008). The phenomenal self. Oxford: Oxford University Press.
Dainton, B. (2010). Phenomenal holism. Philosophy, 67, 113–139.
Dainton, B. (2012). Selfhood and the flow of experience. Grazer Philosophische Studien, 84, 173–211.
Goff, P. (Forthcoming a). The phenomenal bonding solution to the combination problem. In G. Brüntrup & L. Jaskolla (Eds.), Panpsychism. Oxford: Oxford University Press.
Goff, P. (Forthcoming b). Real acquaintance and the phenomenal concepts strategy. In P. Coates & S. Coleman (Eds.), The nature of phenomenal qualities. Oxford: Oxford University Press.
Heinlein, R. (1959). All you zombies. The Magazine of Fantasy and Science Fiction, Mercury Press, March.
Hill, C. (1991). Sensations. Cambridge: Cambridge University Press.
Johnston, M. (2010). Surviving death. Princeton: Princeton University Press.
Lewis, D. (1976). The paradoxes of time travel. American Philosophical Quarterly, 13, 145–152.
Locke, J. [1690] (1975). An essay concerning human understanding. P. Nidditch (Ed.). Oxford: Oxford University Press.
Mellor, H. (2000). The semantics and ontology of dispositions. Mind, 104, 523–531.
Pavel, G. (2007). Aristotle on the common sense. Oxford: Oxford University Press.
Prior, E., Pargetter, R., & Jackson, F. (1982). Three theses about dispositions. American Philosophical Quarterly, 19(3), 251–257.
Pruss, A. (2013). Omnipresence, multilocation, the real presence, and time travel. Journal of Analytic Theology, 1(1), 60–73.
Textor, M. (2006). Brentano (and some neo-Brentanians) on inner consciousness. Dialectica, 60(4), 411–432.
Textor, M. (Forthcoming). Unity without self: Brentano on the unity of consciousness.
Tononi, G. (2008). Consciousness as integrated information: A provisional manifesto. Biological Bulletin, 215(3), 216–242.

13 Experiences and Their Parts Geoffrey Lee

1 Introduction

Intuitively, your overall conscious experience at a given time has a complex structure. For example, you might simultaneously have experiences in more than one sensory modality, and your experience within a single modality might involve awareness of a complex of different features of a stimulus, of multiple stimuli, or of relations between stimuli. You might also simultaneously have both perceptual and nonperceptual experiences such as thoughts or conscious emotions; these states have their own structure—for example, a thought might involve using an array of different concepts—and their existence contributes to the overall structure of your conscious state. Perhaps the simplest kind of structure that experiences may have is a quasi-mereological structure that results from some experiences being components of, or “subsumed” by, more complex experiences. For example, your experience of the color and shape of an object might subsume your experience of its shape. There are arguably other kinds of structure that experiences have—for example, perhaps they have a representational structure that is richer than a hierarchy of quasi-mereological relations, such as a sentential structure with elements playing the role that names and predicates play in English, or an imagistic structure (compare Fodor’s [1975] view that beliefs and other propositional attitudes have representational structure). A general understanding of experiential structure should be a central goal of a theory of consciousness—for example, it is necessary for understanding how experiences relate in a systematic way to neural states. The relatively modest task of giving an account of the part/whole structure of experience is my concern here: we can think of this as a first step toward a more general theory of experiential structure. Moreover, my aim is doubly


G. Lee

modest in that I will mostly taxonomize some different views rather than make a positive proposal. My discussion will focus on a number of interrelated core questions about experiences and their parts:

The Identification Question: What is the class of “experiences?” What are the complex experiences whose part/whole structure we are interested in?
The Parthood Question: When we talk about the “parts” of an experience, what is the relevant parthood relation?
The Experiential Decomposition Question: Given an experience, what are the parts of the experience?
The Experiential Composition Question: Given two or more experiences, under what conditions are they components of a more complex experience?
The Priority Question: Is a complex experience constructed from its experiential parts, or are they derivative from the whole?
The Unity Question: What is the relationship between the “unity” of consciousness and the part-whole structure of experiences?

My main concern here will be the priority question, although all the questions will be relevant. A number of recent authors draw a distinction between holistic and atomistic views of consciousness (Searle, 2005; Bayne & Chalmers, 2003; Bayne, 2010), but I think it is fair to say that we lack a detailed account of the different ways this distinction can be drawn and how they are related. For example, Searle (2005) contrasts what he calls the “building block” model of consciousness with his preferred model on which consciousness involves a single, unified “field,” whose parts are not separately existing events, but rather modifications of the whole field. Searle suggests that the field model has important methodological consequences, in particular that attempts to explain consciousness in a piecemeal way by finding local neural correlates of different conscious contents are wrongheaded because the conscious field is not constructed from such independent parts.
The building-block/holistic-field contrast is intuitive, but ideally we would like a rigorous account of what the different distinctions are in this ballpark. I hope to make some progress toward giving such an account here. My main point will be that the building-block metaphor adopted by Searle is misleading because it is suggestive of a number of different theses about the parts of an experience, all of which an anti-holist—understood as someone who thinks that typical experiences do have parts that are independent of the whole—could reject. There is a risk of saddling the anti-holist with the view that a total experience at a time can be broken down into a set of completely independent atomic experiences, which are


somehow glued together to produce the complex whole. More precisely, the building-block metaphor is suggestive of four theses: supplementation, constructivism, disjointness, and unity externalism, which I will discuss in detail, arguing they can all be rejected without accepting the holistic field view. The bricks and other parts of a building satisfy supplementation in the sense that each proper part is accompanied by another disjoint proper part. This further implies (note 1) that the building can be fully broken down into its proper parts in the sense that any given part of the building is accompanied by a set of mutually disjoint parts that sum to the whole. I will discuss how to formulate analogous claims about the parts of an experience, explain why they are probably not correct, and why rejecting holism doesn’t require accepting them. Thus an anti-holist need not think that a total experience can be fully broken down into independent experiential parts. A building is constructed out of a set of disjoint proper parts, in the sense that if the parts exist and stand in certain relations to each other, then we have a building. We might similarly think of a complex experience as “constructed” out of simpler experiences, in virtue of a unifying relation obtaining between them. In this way, we might think that in order to understand experiential composition, we need to figure out what the unifying relation is. As we will see, however, this constructivist approach to complex experiences might be wrongheaded. For example, it might be that experiential wholes are prior to their parts, as holists contend, and we should be starting with the wholes and asking how they decompose into parts. I will argue that anti-holists can also reject the constructivist approach, treating unity in a “top-down” way. On this view, there may be a sense in which there is no separately specifiable unity relation between experiences.
Finally, bricks are disjoint objects, and in order for them to be “unified” into a house, they must stand in certain external relations such as being cemented together. I think the anti-holist can reject the claim that the parts of an experience are disjoint—they may overlap in a sense I will explain. Furthermore, it is consistent with rejecting holism that the unity relation between them is an internal relation, not an external relation, and so the metaphor of unity as a kind of phenomenal cement might be misleading. The plan: In section 2 I discuss some preliminary issues regarding the identification question and the parthood question. In section 3 I explain how I think we should understand the distinction between holism and atomism. In section 4 I discuss supplementation, in section 5 I discuss constructivism, and in section 6 I discuss disjointness and unity externalism.


2 Preliminaries: Identification and Parthood

What are the “experiences” whose part/whole structure we are interested in? I will assume that experiences are events, and they involve the instantiation by subjects of certain special properties, call them “experiential properties,” such that what it’s like to be a subject at a given time is constituted by the experiential properties they enjoy. Thus, experiences are the components of “phenomenal consciousness,” as many philosophers call it. Let’s assume that phenomenal consciousness is a reasonably well-defined phenomenon, acknowledging that some may dissent (see, e.g., Byrne, 2009). Simply making this assumption, however, does not answer some basic—and surprisingly problematic—questions about the extension of the term “experience” that will be salient here. What are the experiences you are having right now? Are you enjoying an overarching über-experience, a “field of awareness” that encompasses all the more local experiences you’re having? Or is this “field” at best a mere sum of more local experiences? And what experiences does an experience have as parts? Does your experience of an object have separate experiences of its features as parts, or are these “experiences” really just aspects of your complex experience of the whole object? Does a multimodal experience decompose neatly into modality-specific parts? Beyond the present moment: do experiences extend through time, perhaps having experiences as proper temporal parts? Or perhaps they are always instantaneous, or at least lacking temporal parts? In short, it is unclear when exactly experiences compose other experiences, and it is unclear how exactly experiences decompose into other experiences. On the one hand, it is natural and sensible to suspect that the answers to at least some of these questions will turn on stipulative decisions about how to use the term “experience” and so are not really substantive.
On the other hand, experience presumably does have some kind of objective structure that we can discover; if there are verbal issues, they presumably have to do with which parts of this objective structure to call “experiences”. Furthermore, it remains true that we need to start with at least some clear instances of conscious experience in order to study it. I’ll assume that we can start with paradigmatic cases of experiences, such as experiences of features like colors and shapes, object experiences, and experiences of relations between objects such as spatial relations. We can then ask under what circumstances these compose more complex experiences, and how they decompose into simpler experiences. We should note right from the beginning that such “paradigmatic experiences” may not be self-standing: that is, their nature and existence might be metaphysically dependent on other experiences


Figure 13.1 Gray cylinders.

(e.g., experiences of color and texture may be interdependent in this way). I think we cannot say in advance what the self-standing experiences are; for example, if holism is true, they may include only whole multimodal “fields” of awareness and therefore be quite distinct from the paradigmatic experiences, which may be mere aspects of self-standing experiences (note 2). As another preliminary, I want to put on the table a distinction between two approaches to the parthood relation on experiences, the type-entailment approach and the realization approach. So that we have a concrete example to consider, contemplate your experience of figure 13.1. It includes as parts experiences of the individual cylinder shapes, experiences of spatial relations between the cylinders, and experiences of distinct features of each cylinder such as their color and shape. In what sense are these parts of the whole experience? Talk of “parts” of experiences is puzzling. Is this the same part/whole relation that links bricks with houses, or is it merely some other relation that is sufficiently parthood-like that speaking in this way is excusable? I’m doubtful whether there is a single part/whole relation that applies in every case where we talk in mereological terms, but I will not take a stand on this issue here (for discussion see Bennett, 2011; Fine, 2011; and Uzquiano, 2006). We’ll consider some specific relations that might hold between experiences (and between events in general), and which are reasonable candidates for being a kind of part/whole relation; we won’t worry too much about whether they are “really” instances of the part/whole relation. By “reasonable candidate” I mean that each relation gives us a partial order on experiences and is a relation of metaphysical containment or inclusion of some kind. It’s not clear that these relations satisfy supplementation, or an analogue of this condition; this is discussed below in section 4. If they


don’t, this makes them less paradigmatically mereological than, say, part/whole on spatial regions. As mentioned, I’m assuming that experiences are events involving objects instantiating experiential properties, perhaps relative to times, so experience parthood is a species of event parthood. The object involved here—which I’ll call the “logical subject” of the experience—may be coincident with a whole human body, but we should leave open the possibility that it is really a spatiotemporal part of a body, such as a brain, or a temporal part of a body or brain. An austere view is that an experience is fully individuated by a logical subject, experiential property (its “type”), and time. As we will see, there may be motivations for a richer conception. For one thing, it is worth noting right away that we tend to think of events as individuated in terms of the determinate way in which the event type is realized. For example, if I’m actually dancing at t by waltzing, then had I instead been dancing at t by belly dancing, intuitively this would have been a different dancing despite involving the same subject, property, and time. Similarly, if I’m experiencing blue by looking at the sea at t, then had I instead been experiencing blue at t by staring at the sky, this would have been a different experience. We might helpfully distinguish states from events; states are fully individuated by a (perhaps determinable) property, subject, and time. Thus, in our first example, we have the same state of my dancing in both scenarios but different dancings. On the type-entailment approach to event parthood, one event is part of another if and only if (1) they have the same subject and time, and (2) the type of the whole entails the type of the part. Thus event type-entailment is directly defined in terms of property entailment.
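The type-entailment condition just stated can be written out schematically. The notation below is an illustrative assumption rather than the author's: an event is treated as a triple of logical subject, type, and time, the subscript TE marks type-entailment parthood, and the box expresses metaphysical necessity. The direction assumed is that the type of the whole entails the type of the part.

```latex
% Illustrative notation only (not from the chapter).
% An event is modeled as a triple: logical subject s, type P, time t.
% \Box requires the amssymb package.
\[
  e = \langle s, P, t \rangle, \qquad e' = \langle s', P', t' \rangle
\]
% e' is a type-entailment part of e iff subject and time match
% and, necessarily, whatever has P also has P'.
\[
  e' \sqsubseteq_{\mathrm{TE}} e
  \;\iff\;
  s' = s \;\wedge\; t' = t \;\wedge\; \Box\,\forall x\,(Px \rightarrow P'x)
\]
```

On this sketch the relation is reflexive and transitive because necessary entailment is; whether it is antisymmetric depends on whether mutually entailing types are counted as the same type.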
For example, perhaps it is necessary that if a person is dancing then they are moving; so the event of the person dancing at time t type-entails the event of them moving at time t. Bayne and Chalmers (2003) consider type-entailment as an account of experiential part/whole, although they do not ultimately endorse it. The view is fairly plausible because in general a complex experiential property does entail the simpler experiential properties involved in the parts of an experience. For example, the property of experiencing a gray cylindrical figure entails the property of experiencing a gray figure and the property of experiencing a cylindrical figure. Note that entailments between properties are grounded in different ways; for example, the relation between a determinate property and its determinables is different from the relation between a complex property like a conjunctive property and any entailed properties out of which it is


defined. On the type-entailment approach, we should ask how entailments between phenomenal properties are grounded. On the main alternative to type-entailment I want to consider, the realization approach, the mereological relations between experiences (or events in general) are defined in terms of mereological relations between their physical realizations. So in the cylinder example, the idea is that the physical realization of your experience of each cylinder is a part of the physical realization of your experience of the whole image, and so in a derivative sense one experience is part of the other. A realization of an event (as I will understand it here) is a minimal set (or mereological sum) of events of a certain type (e.g., of a neural type) that are sufficient for it to exist (note 3). A realizer is one of these events. So for example, a neural realization of an experience is a minimal set or sum of neural events that are sufficient for the experience to exist, and a neural firing might be one of its realizers. There are tricky questions about how exactly to understand realization (for discussion see Gillett, 2002; Kim, 2000; and Shoemaker, 2007), which I do not have space to discuss in full detail here. I will assume that there is a well-defined notion in this ballpark that can play the role I have in mind for it here. I will mostly deal here with what are often referred to as total realizations—realizations that are strictly sufficient for what is realized to exist. But the notion of core realization will be relevant too, a core realizer of an experience being the part of the total realizer that is relevant to fixing the specific type of the experience rather than the part of the total realizer that puts in place more generic conditions required for an experience to exist.
For example, activity in V4 might determine the specific kind of color experience you are having, even though activity outside V4 (e.g., activity along thalamocortical loops) might also be required as a background enabling condition for the experience to exist. If the realizers of the events we are interested in have a well-defined mereological structure, then we can define derivative mereological relations on the events themselves. It is fairly plausible that the realizers of experiences in particular do have a straightforward mereological structure. Suppose in a given case that there are a certain number of independent (note 4) neural events that we can take as “atomic” events. These will each involve a given neuron or neurons having a certain property, or a group of neurons related in a certain way. A fusion of any two of these atomic events will exist in a straightforward sense. For example, if event 1 involves N1 being F at t1, and event 2 involves N2 being G at t2, then their fusion is the event of


N1 being F at t1 and N2 being G at t2 (note 5). Hence we can think of our basic neural events as generating a mereological hierarchy of complex neural events. Assuming we have an adequate base of atomic events, the total realization of an experience will be a mereological sum of these atomic events. The realization of one experience can therefore be a part of the realization of another in a straightforward sense. There are some important differences between the approaches. Note that in general, there is no requirement that if one event is part of another event then they have the same logical subject or happen at the same time, as is required on the type-entailment approach. For example, Bhoomika’s dancing might be part of a performance by a dance troupe, despite their having different logical subjects and her dancing happening at a different time from other parts of the performance. The realization approach can easily allow for this, because the realization of her dancing might be a part of the realization of the whole performance. In my view, it is not unreasonable to think that experiences might turn out to be like this. For example, it might be that visual experiences are not individuated in terms of the whole body but rather in terms of brain areas that include the areas where visual processing happens, whereas auditory experiences are individuated in terms of areas that do not include these visual areas; so different parts of a total experience might have different logical subjects (see Lee, manuscript, for discussion). Or it might be possible for a single unified field of experience to have distinct experiences as proper temporal parts (e.g., Dainton, 2006, defends such a view). If this is correct, it is a reason why the realization approach is preferable.
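The realization approach described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not anything the chapter itself proposes: atomic neural events are represented as (bearer, property, time) triples, a total realization as a set of such events, fusion as set union, and derivative parthood as the subset relation. The names `AtomicEvent`, `fuse`, and `is_part`, and the example realizations, are hypothetical; the minimality requirement on realizations is not modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AtomicEvent:
    """Hypothetical atomic neural event: a bearer having a property at a time."""
    bearer: str  # e.g. a neuron label such as "N1"
    prop: str    # e.g. "F"
    time: str    # e.g. "t1"


Realization = FrozenSet[AtomicEvent]


def fuse(*realizations: Realization) -> Realization:
    """Fusion of realizations, modeled here as the union of their atomic events."""
    return frozenset().union(*realizations)


def is_part(part: Realization, whole: Realization) -> bool:
    """Derivative parthood: the part's realization is included in the whole's."""
    return part <= whole


# Illustrative assumption: each cylinder experience is realized by one atomic event.
e1 = AtomicEvent("N1", "F", "t1")
e2 = AtomicEvent("N2", "G", "t2")
left_cylinder = frozenset({e1})
right_cylinder = frozenset({e2})
whole_image = fuse(left_cylinder, right_cylinder)

print(is_part(left_cylinder, whole_image))
print(is_part(whole_image, left_cylinder))
```

Nothing in `is_part` requires the part and the whole to share a bearer or a time, which mirrors the point that the realization approach, unlike type-entailment, does not demand a common logical subject or a common time.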
Notice that it is possible to instantiate a determinable phenomenal property—such as experiencing the color gray—more than once at a single time; indeed, this is exactly what happens when you look at the gray cylinders. What distinguishes these experiences (call this the “duplication problem”)? A plausible answer is that their determinable types are realized in different determinate ways; for example, each experience of gray involves a more determinate experience of a particular kind of gray cylinder at a specific location. A related problem: the type-entailment theorist won’t want to say that each gray experience is part of the other, despite their types being mutually entailing. This problem can also be solved by taking determinable experiences to be individuated by their determinate realizers and part-whole structure to hold primarily between determinate parts (Bayne & Chalmers [2003] make the same point). This requires that there is a way of distinguishing the determinate parts of an experience (an issue I return to below in sec. 4) and also that determinate phenomenal types


are nonrepeatable within a subject’s experience at a time. The appeal to realization also brings the type-entailment account close to the realization account; the main difference is that here we are appealing to realization at the experiential level (by a determinate phenomenal property instantiated by the same subject), whereas the realization approach descends to the subpersonal level; it would distinguish the different gray experiences in terms of their distinct subpersonal realizations. Both the type-entailment approach and the realization approach are reductive accounts of the part-whole relation on experience, specifying the conditions under which it holds. Why think that such an account is possible? Perhaps the part-whole relation between experiences is primitive. Certainly, in the case of material objects and their parts, it is quite plausible that the relevant part/whole relation can’t be cashed out in other terms. Furthermore, in our account of realization, we appealed to the notion of a fusion of two events, a notion that probably can’t be defined in other terms, suggesting that at least sometimes part/whole on events is primitive. Two points to make about this: first, part/whole could be a primitive determinable relation, with certain determinate relations as instances—in this sense, the two accounts considered might not be incompatible with primitivism. Second, even if the part/whole relation between experiences isn’t identical with either of the relations considered, it’s plausible that it is at least co-extensive with one or another of them in the cases we are interested in, and that thinking in terms of these more substantive conditions is more informative. This suggests that we can reasonably set aside the primitivist view for present purposes. Having put on the table some ideas about how to think about experiential parthood, let’s now move to considering the main distinction that I want to discuss here between holism and atomism.
3 Priority: Holism and Atomism

Phenomenal holism is the view that the experiential parts of an experience exist only in virtue of the whole existing. Each part of the whole will be a phenomenal property instantiation; for the holist, this part exists only in virtue of the instantiation of a total experiential property. Intuitively, it is an aspect of, or abstraction from, the whole. One version of holism is a property dualist version. The property dualist believes that some phenomenal properties are fundamental properties, whose occurrences are basic in the way that fundamental physical occurrences are basic. On the holistic version of this view, it is total phenomenal


properties that are fundamental. This view can be thought of as analogous to a quantum holist view, on which the fundamental physical properties belong to the whole universe and more local states of affairs obtain in virtue of the global state of the world. Another version of holism is a physicalist version on which all phenomenal occurrences happen in virtue of physical occurrences. By way of analogy, consider the center of mass of all the neurons currently firing in your head. This is a three-dimensional quantity whose components depend holistically on all the neural activity happening in your head; a total phenomenal property could be a higher dimensional physical/functional property whose components depend holistically on a global pattern of neural firing in an analogous way. Notice that on either version of holism, the “field” of experience and its phenomenal properties are quite different things from a spatial region and its physical properties; the “phenomenal field” is not itself a concrete particular, but rather involves a concrete particular such as a body, brain, or brain region having a certain complex, structured property. Therefore, “modifications” of the phenomenal field, especially modifications that change over time, are not analogous to the changing physical properties of a spatial region such as a changing magnetic field. Rather, they are the ways in which the total experiential property associated with a concrete object (like a brain) is changing over time.
Holism can be interpreted as entailing a token modal dependence thesis: each subexperience of a total experience necessarily only exists if the whole exists (note 6). This token-level modal claim shouldn’t be confused with a stronger claim of holistic type-necessitation, to the effect that an experience part of a certain type necessarily only exists in a total experience of a certain determinate type (a strong version of a kind of “gestalt unity.” Bennett and Hill [this volume] discuss this type-level holism). The (token) holist could say: this intense toe pain owes its existence to the total experience I’m having right now, even if another intense toe pain of the same type could occur as part of a different kind of total experience. Compare how this instance of blueness may be constituted by, and so modally dependent on, an instance of royal blue, even though it is perfectly possible for a surface to be blue without being royal blue. Suppose we assume that the basic neural events out of which we construct the total realization of an experience are modally independent: they could have existed without each other. Then, given that on holism the parts of an experience are mere aspects of the whole and so depend for their existence on the whole, it follows that each part of a total experience


has the same total realization (see endnote for more discussion).7 Call this “realization holism.”

Realization Holism: Each part of a total experience has the same total neural realization.

The fact that, arguably, the physicalist holist is committed to the parts of an experience having the same total realization should give us reason to question how likely it is to be true. I think that an argument of the following form is often very plausible:

(1) Parts E1 and E2 of total experience T have distinct core realizers.
(2) If E1 and E2 have distinct core realizers, then they have distinct total realizers.
(3) Therefore, E1 and E2 are prior to T.
(4) Therefore, holism is false.

Perhaps some holists will reject premise one in every instance, but it is often very plausible. Take an experience with auditory and visual components. We know that these experiences are associated with different brain areas, and this is fairly good evidence that they are at least core-realized in different places (I think this remains true, even once we acknowledge the role of intermodal feedback in perceptual processing: at most, this shows that in some cases the experience of a given feature is core-realized across different processing streams [see, e.g., Shimojo & Shams, 2001; see also Brogaard, Marlow, & Rice, this volume]). It is also plausible that a total experience is partially realized by one or more structured representations in the brain and that different parts of this representational structure are relevant to different parts of the experience, supporting distinctness of core realization in a different way.
Distinct core realization doesn’t imply distinct total realization, but on the other hand, even acknowledging the role that intermodal feedback sometimes plays, it remains plausible that typically at least some of what is happening in auditory cortex, which is relevant to your auditory experiences, is not relevant to the existence of your visual experiences, so that they have distinct total realizations. I take this to show that we have good reason for taking seriously alternatives to the holistic picture. What anti-holist views are available? The weakest is simply a denial of holism. This is consistent with neither whole nor parts having priority, a view I will set aside here.8 I will consider only “atomist” views, on which whole experiences have experiential parts that don’t depend on the whole. These atomist views are distinguished by how strongly atomistic they are, in ways that I will categorize in what follows. To begin with, consider


the strong view that parts of experiences are always prior to wholes. This requires that experiences have basic parts that don’t have any experiential parts (or perhaps infinitely descending decomposition). What would such basic parts be? One candidate might be experiences corresponding to primitive feature representations such as experiences of edges, colors, or textures (e.g., Zeki [2003] believes in such “micro-experiences”). But surely a view in the spirit of atomism need not be quite so strongly atomistic. It could be that there are local holisms in experience—for example, the experiences of the shape, color, and texture of a surface could be holistically interdependent (e.g., have the same total physical realization)—even though typically a total experience does have at least some independent parts (i.e., parts whose existence and nature do not depend on other experiences). I suggest we construe atomism as consistent with local holism and define it as the view that total experiences typically have some parts whose existence and nature are not grounded in the whole. We might call this “weak atomism” to emphasize that some of the connotations of “atom” are misleading here; I will omit this qualification in what follows. On the atomist view, the total realization of a typical experience will itself have proper parts that themselves realize experiences; I’ll call these “subrealizations.” As above, I’ll assume that two distinct subrealizations are modally independent (that is, one could exist without the other—unless one is a part of the other; note that overlap is consistent with modal independence). It follows that the experiences associated with these distinct subrealizations will be modally independent, unlike the parts of an experience on the holist view. 
So the construal of atomism in terms of experiential parts having metaphysical priority over the whole is very closely related to a construal in terms of modally independent experiential parts realized by distinct, modally independent subrealizations.9 Some further varieties of atomism I want to highlight arise from the distinction between the content of a conscious experience and the fact that it is conscious. This distinction leads naturally to a distinction between consciousness atomism and content atomism. According to some theorists, the character or content of an experience and the fact that it is conscious at all are separable features of an event in the sense that the consciousness of the event is in some way detachable. For example, functional theories of consciousness such as higher-order theories (HOT) and access theories seem to imply a kind of detachability because being metacognized or accessed by working memory is a contingent property of a representation.10 If consciousness and content/character are separable in this way, then it is possible to take holistic/atomistic attitudes about them separately. In particular,


you could combine content atomism and consciousness holism: the content of an experience could be built up from below, even if consciousness is a property that belongs holistically to a total experience and only derivatively to its parts. Note that by adopting a content atomist view, physicalist holists can make their commitment to realization holism more plausible. Content atomism allows the holist to say that parts of a total experience have distinct core realizations even though they have the same total realization.11 It might also allow them to explain such features of total experiences as their variable dimensionality in a systematic way (see below). If they go this route however, then they have a version of holism that doesn’t have the radical methodological consequences for consciousness research suggested by some holists like Searle. As mentioned above, Searle (2005) suggests that trying to explain consciousness from the bottom up by locating neural correlates of different aspects of experience is wrongheaded because consciousness has a holistic structure that requires theorizing about it as a whole field. However, if content atomism is true, we can at least hope to explain the content of experience in a piecemeal way, perhaps in part by locating local neural correlates, even if we need to think about consciousness itself more holistically. I should stress that consciousness holists are still committed to the dubious claim that different parts of a total experience always have the same total realization—a claim that, if anything, becomes more dubious once we accept content atomism. I expect that much of the intuitive appeal of consciousness holism is shared by a different, weaker view that I call the top-down view, and which I will argue atomists can subscribe to. This is the topic of section 5. What would be some concrete examples of holistic and atomistic views of consciousness? 
Most existing theories of consciousness either imply atomism or have atomistic versions. One big divide in theories is between “localist” views such as Block’s (2007) and Lamme’s (2003) on which perceptual experiences are wholly realized in areas concerned with processing perceptual information12 such as the visual areas in the occipital lobe, versus “centralist” views on which areas involved in more “central” forms of processing, such as working memory, are involved in the realization of experience. Examples of centralist views are Baars’ (1997) global workspace view, Dehaene and Naccache’s (2001) neuronal implementation of this view, Prinz’s (2012) attentional view, and HOT theories such as those of Armstrong (1968), Rosenthal (1997), and Carruthers (2000). Localist views imply that experiences associated with different modalities have separate


total realizers and therefore that atomism is true. Centralist views are also consistent with atomism: my visual and auditory experiences might have different total realizations even if these realizations overlap because the experiences exploit some of the same central resources (see sec. 6). Concrete illustrations of the holistic approach are harder to come by. We can abstractly conceive of content holism being true: even if the contents of, say, a visual state are fairly well correlated with local visual activity, it could be that the correct content-determining principles assign content holistically to a total conscious state (imagine, for example, that they require consistency across and within modalities). Even supposing that content holism fails, we can envisage versions of consciousness holism. For example, there will be versions of centralist views on which the availability of a representation to central systems is determined holistically; for example, the relevant kind of central availability might apply primarily to a coalition of representations that are linked in the right way and only derivatively to each member of the coalition (Van Gulick’s [2004] “Higher-Order Global States” view might fit into this category).13 Thus the total realization of, say, a visual experience might include the realization of the whole coalition it is part of, including any nonvisual representations that are members. In this way, what is happening in the auditory system might, after all, be relevant to the existence of your visual experiences. Even though existing theories tend to be weakly atomistic, we shouldn’t rush to reject holism; we probably do not know enough about consciousness to know for sure what mereological form it takes. Having said that, in what follows I want to shine some light on what does and does not follow from rejecting holism. As I said, my main point in this paper is that this does not require accepting a (perhaps) implausible “building-block” model.
To that end, I consider in the following sections various theses that are suggested by the building-block metaphor: supplementation, constructivism, disjointness, and unity externalism, but which I will argue are not implied by atomism as I have defined it here.

4 Decomposition and Supplementation

How does a total experience decompose into parts? I want to consider the different answers we might get from holists and atomists. To that end, we need to first ask: how should holists and atomists think about part-whole relations on experiences? Because holists think that the parts of an experience have the same physical realization, they are likely to think in terms of type-entailment: a total phenomenal type will entail many less-specific


phenomenal types, which correspond to the parts of the experience. Furthermore, the holist should take the relevant entailment relation to be akin to the determinate/determinable relation, not a definitionally grounded entailment relation (assuming that a complex defined property is derivative from the simpler properties it is defined out of). The atomist, by contrast, thinks that (at least some of) the parts of an experience have distinct physical realizations and therefore could think of experiential parthood in terms of mereological relations between these realizations. The same point applies to the mereological relations between the core realizers of experiences, which might be appealed to by a content atomist (this suggests that a consciousness holist who is a content atomist could also adopt a version of the realization approach). If the atomist does nonetheless think of parthood in terms of type-necessitation, it is open to her to hold that parts of a total experience can be definitional parts rather than determinable parts and so more basic components of that experience in a very clear sense. As mentioned, our intuitive notion of parthood includes a commitment to supplementation. A proper part is accompanied by another disjoint proper part, and perhaps the stronger condition of complementation: there is a disjoint accompanying part that makes up the “difference.” For present purposes, a helpful intuitive way to think of supplementation is as the view that every proper part of a whole is one of a set of disjoint parts that sum to the whole. If we assume unrestricted fusion, this is the same as complementation (in what follows, for simplicity, I will ignore the distinction between supplementation and complementation). The question I want to address here is: do total experiences break down into disjoint parts in an analogous way? 
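For reference, the two conditions just described can be written in the notation of classical mereology. This is one common textbook formulation, not the author's own; the symbols $P$ (part), $PP$ (proper part), and $O$ (overlap, i.e., sharing a part) are introduced here for illustration.

```latex
% Weak supplementation: a proper part x of y is accompanied by
% some part z of y that is disjoint from x.
PP(x, y) \rightarrow \exists z \, \bigl( P(z, y) \wedge \neg O(z, x) \bigr)

% Complementation: a proper part x of y has a remainder z in y,
% whose parts are exactly the parts of y disjoint from x.
PP(x, y) \rightarrow \exists z \, \forall w \, \bigl( P(w, z) \leftrightarrow ( P(w, y) \wedge \neg O(w, x) ) \bigr)
```

The second condition is the stronger one: it demands not just some disjoint part but a part that makes up the whole "difference" between $x$ and $y$.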
Intuitively, it is only if some such condition holds that the “building-block” metaphor seems apt, so it is important for our purposes to consider the senses in which holists and atomists may or may not be committed to supplementation. Consider first a version of holism on which the parts of a total experience correspond to those phenomenal properties that are type-entailed by the total phenomenal property. If we include determinable ways of experiencing as “parts” here, then we won’t get a complementary structure: e.g., there is no experience that is the “difference” between determinably seeing something as blue, and determinately seeing it as royal blue.14 Nonetheless, there is, intuitively, a sense in which such a holistic experience has determinate experiences as parts that might break down in a complementary way (furthermore, to solve the duplication problem [see above], the holist needs a way of distinguishing such determinate parts). One way to


cash this out is as follows: Suppose that a total phenomenal property is a high-dimensional quantity with a certain number of independent degrees of freedom. Much as a quantity like three-dimensional location has three independent components, we could think of the determinate parts of an experience as the ways in which it is determined along each phenomenal degree of freedom.15 For example, suppose that a total experience consists of experiencing different colors at different locations and nothing more (an implausible “pixel map” model); the degrees of freedom might then be the ways of specifying the color at each location. Note that this satisfies supplementation in the sense that the total phenomenal property is the conjunction of the ways it is independently determined along each phenomenal dimension (although the whole is prior to each such part (compare: a 3-D location and its components)); we can think of the instantiation of each conjunct as a disjoint experiential part. By contrast, experiencing royal blue is not the conjunction of experiencing blue and some other independent experiential property. One problem with this way of decomposing an experience is that these degrees of freedom, assuming they exist, are probably too fine-grained to correspond to our intuitive idea of a determinate experiential part. For example, color experiences have at least three separate degrees of freedom: brightness, saturation, and hue. But we would not intuitively say that we have separate experiences of these different aspects of a color. Perhaps determinate experiential parts correspond to groups of determination dimensions—but if so, it is unclear what the principle of grouping is. 
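The "pixel map" toy model lends itself to a small sketch. Below, a total phenomenal property is modeled as a mapping from locations to colors, and each determinate part is the value fixed along one dimension. The representation (a Python dict keyed by hypothetical location labels, plus a hand-written determinable table) is an assumption made purely for illustration.

```python
from typing import Dict, List, Tuple

Total = Dict[str, str]       # location label -> color experienced there
Conjunct = Tuple[str, str]   # one dimension fixed to one value


def determinate_parts(total: Total) -> List[Conjunct]:
    """One conjunct per degree of freedom (here: per location)."""
    return sorted(total.items())


def conjoin(parts: List[Conjunct]) -> Total:
    """Rebuild the total property as the conjunction of its parts."""
    rebuilt: Total = {}
    for location, color in parts:
        if location in rebuilt:
            raise ValueError(f"two values for dimension {location!r}")
        rebuilt[location] = color
    return rebuilt


total = {"left": "royal blue", "center": "white", "right": "red"}
parts = determinate_parts(total)

# The independent conjuncts jointly recover the whole.
recovered = conjoin(parts)

# A determinable is entailed by the whole but is not a further conjunct:
# coarsening "royal blue" to "blue" discards information that no other
# dimension supplies.
DETERMINABLE = {"royal blue": "blue", "white": "white", "red": "red"}
coarse = {loc: DETERMINABLE[color] for loc, color in total.items()}
```

The sketch captures only the structural point: the per-dimension values sum to the whole, whereas the determinable version cannot be combined with anything to give back royal blue. It is silent on the priority question, since a dict can be read either as built from its entries or as a whole that has them.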
In response, the holist can point out that since they are already committed to denying that total experiences have independently existing experiences as parts, it is not a huge cost to them to concede that the determinate “parts” of a holistic conscious field do not correspond very precisely with our pretheoretical notion of an “experience”. Another issue is that total experiences have variable complexity, which presumably would correspond to variable dimensionality. One might hope for an explanation of this variable complexity at a deeper level in terms of the way different total experiences can be built from more or less independent components; but this seems to require a way of picking out these independent components other than as determination dimensions of the whole (this might be thought of as a general objection to holism). Relatedly, unlike fundamental physical quantities, it is far from clear that total phenomenal properties can be neatly broken down into dimensions that are completely independent; for example, the hue, brightness, and saturation of color are not completely independent.16 Again, we might hope for


an explanation of this at a deeper level. In response, perhaps a physicalist holist can explain these features at the level of subpersonal realization (for example, they might appeal to representational structure, perhaps by being a content atomist); the property dualist holist, on the other hand, seems to have no choice but to take the variable structure and interdimensional dependence of total experiences as basic and inexplicable.17 The main point to emphasize here, however, is that for the holist, these determinate dimensional parts, even if they involve independent variables, are not independently existing experiences (they all depend on the whole) and therefore not disjoint building blocks in the sense that the holist was concerned to reject. An atomist could also think that total phenomenal properties are separable into independent dimensions, but she is also unlikely to think that every such dimension corresponds to an independently existing experience (e.g., your experiences of the hue and brightness of a color are not independent existents). This raises the question of whether the atomist is committed to an experience having determinate complementary parts that are independent or “disjoint” in a stronger sense. Consider the set of subrealizations associated with the realization of a total experience. For each such subrealization there will be a maximally determinate experiential property that it realizes, which entails all the other experiential properties it realizes. You might think this gives the atomist a nice way of breaking up a total experience into independently existing, fully determinate parts. However, we need to be careful here. We haven’t yet said that the atomist is committed to such experiential parts having complements; if they don’t, then we can’t “break down” a total experience in this way. In what sense might these fully determinate, independent experiential parts have complements? 
We can certainly consider their “realization complements”: the complement of their total realization within the total realization of the associated total experience. However, it is not clear that a realization complement will itself realize any experiences, and if it doesn’t, then it will not give us an experiential complement; for example, it might be that all experiential parts of a total experience have overlapping realizers (see sec. 6). There is another more liberal definition of “experiential complement,” however, that doesn’t require experiential complements to have disjoint realizers. Let’s suppose that T has independent experiential parts E1 and E2. They are experientially disjoint if they don’t share any experiential parts (i.e., the intersection of their realizations doesn’t realize any experiences). One can be thought of as the complement of the other if (1) they are experientially disjoint, and (2) they “cover” the total experience:


that is, either (2a) the fusion of the realizations of E1 and E2 is coincident with the realization of T, or (2b) T is constituted by E1 standing in the phenomenal unity relation to E2. The reason for the disjunction between (2a) and (2b) has to do with the issues discussed below, of whether constructivism and unity externalism are true. For now, it is probably helpful to think of complements as satisfying (2a). I’ll say more to explain the difference between (2a) and (2b) below. Unfortunately, there is no guarantee that an experience will have a complement, even in this more liberal sense. For example, a total experience of a table against a white background might have the experience of the table as an independent proper experiential part but have no other independent parts that are experientially disjoint from the table experience (e.g., the experience of the background might not be an independently existing experience). Thus it is far from clear that merely granting that a total experience has some independent experiential parts implies that it can be fully broken down into a set of experientially disjoint parts. Let’s spell this out in more detail. Suppose your current experience does break down into independent parts. What are they? We can distinguish a bitty view and a chunky view. On the bitty view, the parts involve the experiencing of simple features such as colors, edges, and textures as placed in space in certain ways or as attributed to certain objects or surfaces and perhaps also experiences of the spatiotemporal relations between objects and events. “Simplicity” might be thought of in phenomenological terms, or, perhaps more promisingly, we might postulate that experience is underwritten by a structured sentence-like representation, and the features correspond to primitive predicates in a kind of “perceptual language” used, e.g., by the visual system. There are two problems with this that I want to mention. 
First, even if experience is underwritten by structured representations of some kind, it is unclear that they will have a canonical decomposition into basic parts involving simple predications; as Fodor (2010, ch. 6) points out, there are nonsentential forms of representation—such as imagistic representations—that may resist such decomposition even if their content is a function of their structure. As far as I know, it is an open question whether the representations underwriting conscious experiences (assuming there are such things) are like this. Second, even if we grant such a decomposition involving basic predicates is possible, it is dubious that such basic feature attribution underwrites independent experiences. In particular, even though it may not generally be true that features have to be bound to objects to be experienced,18 when binding does occur, it is plausible that we have a kind of experience of a feature that could not exist outside of


binding, and whose existence depends on the other features that are experienced as bound to the object (think, e.g., about the interplay between color, shape, and texture). It is therefore implausible to think that feature experiences could be independent building blocks. On the alternative chunky approach, the smallest independent experiences are larger parts of experience such as experiences of whole objects having multiple features (as such, there will be “local holisms,” perhaps for quite large parts of experience). One problem with this approach is arbitrariness: it is unclear that there is the right kind of natural joint in experience at an intermediate level. Why an experience of a whole object rather than an experience of a surface, part of the object, or the relations between multiple objects? Despite this problem, it remains plausible that total experiences do have at least some independent parts, so presumably there are some such joints, even if they don’t cleanly divide experience into complementary parts. One possibility is that the smallest independent parts are modality-specific fields of experience. However, it may be that the experiences of some features are realized across different modality-specific processing streams (as mentioned above, this is one possible interpretation of what happens when there is intermodal feedback concerning a single feature). If so, then the real joints are more likely to be at the level of distinct object representations, or representations of distinct spatiotemporal regions, whose features may be interrogated in a multimodal way. Supposing there are such independent parts at a “chunky” level, it is unclear whether they have complements, for reasons already gestured at above. Take experiences of relations (such as spatial distance) between objects or regions. 
Even if the object experiences involved are independent experiential chunks, it is unclear whether my experience of the distance between the two objects could exist on its own without any of my experiences of the intrinsic features of the objects. The same point applies to certain “gestalt” experiences such as perceiving the cylinders in figure 13.1 as forming rows rather than columns; plausibly, your experience of a row formation requires that you also have experience of at least some intrinsic features of the cylinders, such as their shapes; so your “rowish” experience is not an independent “chunk.” A similar point was made above concerning the backgrounds of objects, and we could make the point about temporal relations between events and experiences that attribute high-level features based on lower-level features that are perceived (e.g., my experiences of the low-level features of a face might be an independent experience. If in addition I experience it as a face, this experience may not be independent of the low-level experiences).
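The worries about complements can be illustrated using the liberal definition given earlier (experiential disjointness plus coverage in the sense of clause 2a). The sketch below rests on strong simplifying assumptions that are not the author's: realizations are modeled as finite sets of hypothetical neural-event labels, a fixed table lists which sets realize an independent experience, and an overlap counts as "realizing an experience" only if some listed realization fits inside it.

```python
from typing import Dict, FrozenSet, Optional

Realization = FrozenSet[str]


def experientially_disjoint(a: str, b: str,
                            parts: Dict[str, Realization]) -> bool:
    """True if no listed experience is realized within the overlap of a and b."""
    overlap = parts[a] & parts[b]
    return not any(r <= overlap for r in parts.values())


def complement_of(a: str, total: Realization,
                  parts: Dict[str, Realization]) -> Optional[str]:
    """Name of a part that is disjoint from `a` and, with it, covers `total`."""
    for b in parts:
        if b == a:
            continue
        covers = (parts[a] | parts[b]) == total
        if covers and experientially_disjoint(a, b, parts):
            return b
    return None


TOTAL = frozenset({"n1", "n2", "n3", "n4"})

# Building-block case: table and background are both independent parts.
blocks = {"table": frozenset({"n1", "n2"}),
          "background": frozenset({"n3", "n4"})}

# The background is not an independently existing experience.
table_only = {"table": frozenset({"n1", "n2"})}

# Overlapping realizers (shared event n2) that share no experiential part.
overlapping = {"table": frozenset({"n1", "n2"}),
               "sound": frozenset({"n2", "n3", "n4"})}
```

In the first case the table experience has a complement; in the second it has none, even though it is an independent part; in the third a complement exists despite overlapping realizers, which is what makes the definition liberal.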


To sum up: It is unclear whether experiences have bitty parts, and even if they do, it is implausible that these parts are independent experiences. If the smallest independent parts are at the chunky level, then it becomes unclear whether they have experiential complements. So if we are atomists, we have good reason to be skeptical about supplementation.19 This is the first way in which atomists need not subscribe to a “building-block” picture.

5 Constructivism and the Top-Down Approach

The points in section 4 about supplementation relate closely to another issue concerning the proper interpretation of atomism. I mentioned earlier that we might think that there is a “unity relation” that holds between experiences, such that when they are unified, they are components of a more complex experience. Furthermore, many philosophers assume that specifying the nature of this relation is an independent task for a theory of consciousness. We might think that an atomist in particular is committed to saying what this relation is, as it is needed as a kind of binding agent for fusing together atomic experiences into wholes. I want to argue that atomists needn’t picture things like this. In the literature on the unity of consciousness, you can find a menagerie of different accounts of what the unity relation is. One large class of accounts treat unity in functional terms as a kind of integration between the contents of experiences (Shoemaker, 2003; Prinz, 2012); for example, experiences might be unified if their contents are jointly available to a single working memory system, or jointly represented by a higher-order monitoring system (Rosenthal, 2003). A related approach is the conjunctive representational approach, on which experiences are unified just if they contribute their contents as conjuncts to the complex content of an overarching experience (Hurley, 1994; Tye, 2003).
On a subject-based account, unified experiences are those that belong to a single subject of experience (McDowell, 1997). Or on a primitivist view, unity is a primitive relation that can’t be analyzed in other terms (Dainton, 2006). There are other approaches as well. The idea that we need to specify some such relation would certainly be correct if a constructivist view of complex experiences is correct. The view has two components. First, there must be a basic supply of independently existing experiences that we can treat as “simple.” These simple experiences might be bitty, and so “simple” in an intuitive sense, but they could also be chunky (see above). Second, a “complex” experience is an event consisting in the obtaining of the (independently specified) unity relation between a


set of simple experiences. For example, they might be simple experiences that are functionally integrated in the right way. Thus complex experiences are really “experiences” in a derivative sense: they are complexes that inherit their status as experiences from the simples out of which they are constructed. One noteworthy version of constructivism is unity pluralism (see Bennett & Hill, this volume;20 and Hill, 1991, ch. 10). Unity pluralists think that it is a mistake to think that there is a single, privileged unity relation that holds between experiences. Rather, there are myriad types of integration and unification that experiences can enjoy—such as those mentioned above—and the best we can expect from a theory of the “unity of consciousness” is a theory of each of these different relations. I think that once we accept the constructivist view, this is a plausible way to go: why think that one way of forming complexes from simple experiences has some privileged status? The theory also has the merit of explaining why we find it hard to say in certain cases whether a complex state is an “experience:” it might be unified in some ways, but not in others. I believe that there are at least some uses of the term “experience” for which a kind of constructivist approach is correct. Consider the experience of watching an entire movie. There is a sense of “experience” on which this is a single experience. However, arguably, in another sense this is just a temporally extended series of experiences that stand in the intimate relation of being enjoyed by a single subject during a certain period of time. We might take a constructivist approach, treating temporally local experiences as “simples” and allowing that whole-movie experiences can be “experiences” constructed from these simples in a derivative sense. The general constructivist thinks that an analogous approach is correct for complex experiences at individual moments. 
Constructivism requires supplementation (with some qualification).21 If supplementation fails, then we won’t have the requisite supply of independent building-block experiences out of which we can construct more complex experiences. I think this makes constructivism implausible for the same reasons. Consider again the gestalt experience as of cylinders arranged in rows rather than columns (fig. 13.1). Even if the experiences of cylinders are independently existing parts of the experience, we can’t think of the gestalt experience as obtained by gluing an independently existing “rowish” experience to the individual cylinder experiences.22 In this sense, it is not constructed from below. Fortunately, there is an alternative approach to unity and composition that is independently attractive, even if supplementation does obtain. On


what I’ll call the “top-down” approach, we can formulate a theory of what makes something an experience that applies directly to the range of complex experiences we are interested in so their status as experiences does not derive from their parts. Furthermore, rather than thinking that “complex experience” is defined in terms of an independently specified unity relation, we now take “unity” to be defined in terms of “conscious state”: two states are “unified” just if they are components of a state that is conscious. It’s helpful to think of this in terms of the unity biconditional (see Bayne & Chalmers, 2003):

Unity Biconditional: Experiences E1 and E2 are phenomenally unified if and only if they are components of some complex experience E3.

The constructivist takes the LHS to ground the RHS, whereas the top-down theorist thinks about things in a more “top-down” way, taking the RHS to have priority. What’s nice about this approach is that once we have a theory of what makes a state conscious and what it is for one experience to be part of another, we get a theory of unity for free—so, saying what unity is turns out not to be a separate task for a theory of consciousness. This is the approach advocated by Bayne and Chalmers (2003) and Bayne (2010). Bayne calls it the “mereological” view of unity. To see an example of the top-down approach in action, consider views on which phenomenal consciousness is really a functional property such as access consciousness (the property of being available to postperceptual consuming systems involved in motor planning, memory formation, and reasoning) or higher-order consciousness (the property of being targeted by a higher-order representation).
On the top-down view, each functional account of consciousness delivers a functional account of what phenomenal unity is—for example, if phenomenal consciousness is access-consciousness, then phenomenal unity is access-unity (i.e., two states are phenomenally unified just if they are parts of a state that is access-conscious, i.e., just if they are jointly accessible to consuming systems). Another important example is the view of unity generated by a representational view of consciousness on which all experiences—including complex experiences—are a species of representational states that present the world as being a certain way: i.e., they have a particular propositional content (which may be further taken to determine the qualitative character of the experience). The representational theory combined with the mereological approach naturally generates the view that phenomenal unity is conjunctive unity: experiences are unified when they contribute their contents as conjuncts to the complex content of an experience that subsumes them.
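The unity biconditional and the two instances just described can be written schematically. This is only a notational sketch: the predicates Exp ("is an experience") and AC ("is access-conscious"), the content function c, and the use of ≤ for parthood are shorthand assumed here, not notation taken from Bayne and Chalmers. The block also assumes the amsmath package for `align*` and `\text`.

```latex
\begin{align*}
\text{Unity biconditional:}\quad
  & U(E_1, E_2) \iff \exists E_3\,\bigl(\mathrm{Exp}(E_3) \wedge E_1 \leq E_3 \wedge E_2 \leq E_3\bigr) \\
\text{Access unity:}\quad
  & U(E_1, E_2) \iff \exists S\,\bigl(\mathrm{AC}(S) \wedge E_1 \leq S \wedge E_2 \leq S\bigr) \\
\text{Conjunctive unity:}\quad
  & U(E_1, E_2) \iff \exists E_3\,\bigl(\mathrm{Exp}(E_3) \wedge E_1 \leq E_3 \wedge E_2 \leq E_3
    \wedge c(E_3) = c(E_1) \wedge c(E_2) \wedge r\bigr)
\end{align*}
```

Here r stands for whatever further conjuncts the subsuming content may have. The biconditional itself is silent about priority: the constructivist reads the left-hand side as grounding the right-hand side, while the top-down theorist reads the right-hand side as prior. The second and third lines result from substituting a particular account of what makes a state conscious into the right-hand side.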


This view is noteworthy because it promises to reduce experiential mereological structure to a certain kind of representational structure, conjunctive structure (although for some objections to it, see Bayne, 2010, ch. 3). In addition to avoiding the potential problems for supplementation mentioned above, one advantage of this approach is that it gives a unitary account of what consciousness is for both complexes and more simple experiences. The constructivist approach says, in effect, that something is an experience if it is either a simple experience or is a complex constructed in one or more ways from simple experiences. The top-down approach avoids being disjunctive in this way, which would be a nice feature of a theory of consciousness if we can get it. This plays out in an interesting way in a recent discussion of unity in Prinz (2012, ch. 7). Prinz’s theory is that representations are conscious when they are attended (which for him means being accessible to working memory); he then proposes that attended representations are unified when they are realized by neural populations that have phase-locked firing patterns—unity is “neural resonance.” This initially sounds like a constructivist approach: the “simple” experiences are the attended ones, and the “complex” experiences are collections of attended states realized by resonant neural populations. However, Prinz is careful to argue that his account is really more unitary because neural resonance can be understood as realizing a form of attention, “co-attention,” that can apply to complex multimodal wholes. So this might after all be a top-down approach. The above examples illustrate how the top-down “mereological” view of unity is not necessarily in tension with other views on unity in the literature. Rather, they can be seen as ways of implementing the approach. 
Clearly, whether the top-down approach can be made to work (that is, whether complex experiences directly satisfy the condition for being conscious) depends on what the correct theory is of what makes a state conscious, and I won’t be able to address this big question here. The point I want to emphasize here is that atomists need not be constructivists, but could instead adopt the top-down view. Consider, for example, the view that phenomenal consciousness is access consciousness and play it through the top-down account to get the view that phenomenal unity is access unity. On this view, what makes a complex experience an “experience” might be that it is access-conscious, not that it has access-conscious parts that stand in a separately specified unity relation. Furthermore, a complex access-conscious whole could have parts that have distinct total realizations and which involve distinct access-conscious representations (e.g., representations in visual and auditory areas) so that


the complex whole has distinct atomic experiential parts (this could be true even if what makes one part access-conscious is not independent from what makes the other part access-conscious; see below). This illustrates how atomism is perfectly compatible with a top-down approach. This is the second way in which the building-block metaphor may be a misleading way of picturing the atomic view. Note that holists are automatically committed to the top-down view because they deny that total experiences have any independent parts. If atomists adopt the chunky view rather than the bitty view, then, in effect, they think there are at least some relatively simple experiences for which holism is locally true and for which the top-down approach is therefore also true. As mentioned earlier, most theorists will probably also accept that there are some “complex experiences”—such as experiences of whole movies—for which a restricted constructivist view is true. So there is a question for an atomist who thinks that the top-down view is sometimes true concerning which “complex experiences” it applies to and which complex experiences are really only “experiences” in a derivative sense. We should be open to the possibility that the concept of “consciousness” is indeterminate in such a way that it is not totally clear where exactly this line is drawn. As I discussed at the beginning, thinking that the atomist must believe in supplementation and thinking that they must be constructivists are not the only ways of saddling the atomist with a kind of building-block conception of the structure of experience. Building blocks are disjoint objects, and they require external cement to hold them together. In the next section, I explain how there are plausible versions of atomism that deny that the parts of experiences are separate in the way bricks are and deny that any external glue is required for unification. 
6 Atomism, Unity Overlap, and Internalism

Although atomists think that experiences typically have independent parts, we should be careful about what this independence amounts to. I defined it as existence independently of other experiences, which translates at the level of realization to the claim that two experiences are mutually independent if their realizations are distinct: that is, they are not coincident with each other, nor is one a proper part of the other. Holists think that the parts of a total experience always have completely coincident total realizations. We shouldn’t think that the only alternative is independent parts that have completely disjoint realizations, what we might call “strong


independence.” Experiences can have overlapping realizations even if they don’t share any experiential parts,23 that is, even if the intersection of their realizations does not realize any experiences. An atomist could hold that unified experiences are weakly dependent in the sense that they overlap. Unity Overlap Thesis: Necessarily, if experiences E1 and E2 are unified, then they have overlapping realizations. Such overlapping experiences would not be separate building blocks in quite the sense that bricks are separate parts of a house; the only completely independent building blocks of the total experience might be subexperiential events like individual neural firings. Nonetheless, they are parts of the total experience that are metaphysically prior to the whole and which can exist without each other. (More weakly, the building-block metaphor would be misleading even if the independent elements of a total experience sometimes overlap; that is, if “disjointness,” the view that they are always disjoint, is false.) We noted above that experiences of different types—e.g., visual and auditory—sometimes are core-realized in different areas. This suggests that it would not be plausible to require that the core realizers of unified subexperiences overlap, and that a more specific version of the unity overlap thesis might be correct, linking the consciousness-making parts of the total realizations of experiences (i.e., the complement of the core realization in the total realization). Consciousness-Making Overlap Thesis: The consciousness-making parts of unified experiences overlap. We can think of this as the intuitively attractive claim that if two experiences are unified, then what makes one conscious is not completely independent from what makes the other conscious. Another important thesis in the same vein, and which the atomist could endorse, is unity internalism. Unity Internalism: Unity is an internal relation between experiences. 
For the internalist, the unity between two experiences is not some extra state of affairs in addition to the existence of the experiences themselves. Unity is an internal rather than an external relation between experiences: it holds in virtue of the intrinsic features of each experience, and therefore no substantial process of “binding” is required to make them unified. It is perhaps a little unclear what the “intrinsic” features of an experience are. I will assume that they include having a particular logical subject and


phenomenal type and being physically realized in a certain way. Note that it is plausible that these intrinsic features of an experience are also essential to it. If that’s right, then we can take internalism to imply that once experiences exist, nothing more is required to make them unified (if they are unified at all). Unity supervenes on existence. We can define unity externalism as the view that, except in certain degenerate cases, unity is an external relation. (We need the “degenerate cases” provision because one way for two experiences to be unified is for one experience to be part of the other experience. This is an internal relation.) Internalism should not be confused with the view that phenomenal types are internally related through unity in the sense that a phenomenal type necessarily entails the others it is coinstantiated with, and vice versa. Dainton (2006) construes the idea that unity is an internal relation this way. This view is implausible—it implies, for example, that if a certain type of pain occurs in the presence of a thought about lunch, then that type of pain necessarily always occurs accompanied by a lunch thought. Maybe some phenomenal types necessitate each other in this way—I will not discuss this idea here; the general claim is surely too strong. Phenomenal holism implies unity internalism because on holism, the subexperiences of a total experience only exist as unified parts of that total experience, so their unity supervenes on their existence. But atomists can embrace internalism as well. Consider, for example, the view that two experiences are unified if and only if they have overlapping realizations (I’m not saying this is a plausible view). Overlap is an internal relation between experiences (assuming the realization of an experience is an intrinsic property of it), so this would be a view on which unity is internal. 
Furthermore, it is clearly compatible with atomism: overlapping experiences need not be completely coincident (and therefore their existences need not be mutually dependent), as holism demands. Note that, as this example illustrates, internalism is perfectly compatible with two internally unified experiences being such that each could have existed without the other. It only requires that if they both exist, they are unified. Let a fusion of two events, A and B, be an event whose realization is a fusion of the realizations of A and B (which we are already assuming is well defined). Given the assumption that a fusion of two events always exists,24 internalism is equivalent to another mereological thesis about experiences—the unity fusion thesis. Unity Fusion Thesis: Two experiences are unified if and only if there is an experience that is the fusion of these experiences.
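The mereological vocabulary used in this section (coincidence, proper parthood, overlap, disjointness, fusion) can be illustrated with a toy model in which an event is identified with the set of events that realize it. This is a minimal sketch under loud assumptions: the `Event` class, the function names, and the realizer labels such as `"wm"` are all hypothetical, and treating realizations as finite sets is a simplification, not something the chapter commits to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Toy event, identified with the set of (hypothetical) realizer events."""
    name: str
    realization: frozenset


def overlap(a: Event, b: Event) -> bool:
    # The realizations share at least one realizer event.
    return bool(a.realization & b.realization)


def coincident(a: Event, b: Event) -> bool:
    # The realizations are exactly the same.
    return a.realization == b.realization


def proper_part(a: Event, b: Event) -> bool:
    # a's realization is strictly contained in b's realization.
    return a.realization < b.realization


def independent(a: Event, b: Event) -> bool:
    # Distinct realizations: not coincident, and neither is a proper part of the other.
    return not (coincident(a, b) or proper_part(a, b) or proper_part(b, a))


def strongly_independent(a: Event, b: Event) -> bool:
    # "Strong independence": completely disjoint realizations.
    return not overlap(a, b)


def fusion(a: Event, b: Event) -> Event:
    # An event whose realization is the fusion (here: union) of the two realizations.
    return Event(f"{a.name}+{b.name}", a.realization | b.realization)


# Hypothetical example: two experiences with different core realizers
# that share one consciousness-making component, labeled "wm".
visual = Event("visual", frozenset({"v1", "v2", "wm"}))
auditory = Event("auditory", frozenset({"a1", "a2", "wm"}))
total = fusion(visual, auditory)

print(overlap(visual, auditory))               # True
print(independent(visual, auditory))           # True
print(strongly_independent(visual, auditory))  # False
print(proper_part(visual, total))              # True
```

In the example, the two events overlap without coinciding, so they are independent in the weak sense while failing strong independence. The model deliberately returns a mere `Event` from `fusion`: whether that fused event is itself an experience is exactly what the unity fusion thesis is about, and the sketch does not settle it.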


If an experience fuses two experiences, then they are unified because they are parts of an experience (bear in mind that this is a nontrivial condition: not any two random experiences are parts of an experience, even if they fuse to form an event of some kind or other). Furthermore, if the fusing experience contains nothing other than the two experiential parts, nothing other than their bare existence was required for them to be unified (no external events were needed to bind them together). Thus unity fusion and internalism are equivalent. A stronger thesis that connects these claims with the issue of supplementation raised above is the phenomenal decomposition thesis. Phenomenal Decomposition Thesis: Each independent part of an experience is a member of a set of experientially disjoint experiences whose fusion is the whole experience. This is equivalent to the conjunction of internalism and supplementation. If internalism fails, then a fusion of experiences won’t in general be an experience. There will be extra events in the realization of a total experience needed to “link” the experiential parts. If supplementation fails, then a total experience won’t necessarily be exhausted by its proper experiential parts, even though if internalism is true, then their fusion is itself an experience. If phenomenal decomposition is true, then a total experience is “built up” from its proper experiential parts in a very straightforward sense,25 although we might be skeptical of this for the same reasons we might be skeptical of supplementation. If externalism and supplementation obtain, then we get a somewhat different decomposition thesis. The existence of a total experience will consist in the proper parts of the experience existing and standing in substantive external unity relations to each other.
These experiential parts won’t fuse into a total experience because they need to be externally linked, much as a building is not a mere fusion of bricks because the bricks need to be cemented together. What is the relationship between internalism and the unity overlap thesis? Arguably internalism is a stronger claim in the sense that it implies unity overlap, but there is no reverse implication. If two experiences do not overlap, then surely some external connection is required between them for them to be unified.26 So internalism seems to imply unity overlap. The reason the reverse entailment (from unity overlap to internalism) is less obvious is that it seems perfectly coherent for unity between experiences to require that they overlap, but also to require that certain external events relating the realizations of the experiences be in place as well. That


is, the unity relation might have both internal and external components. In practice, it’s hard to imagine a view on which things work out this way, so I suspect that in the context of most theories of consciousness, internalism and unity overlap will either be jointly accepted or rejected. But nonetheless, unity overlap is arguably a weaker claim for this reason. What, if anything, do existing theories of consciousness tell us about unity overlap and unity internalism? Localist theories like Block’s and Lamme’s suggest that a unified audio-visual experience can have separate components realized entirely in different perceptual systems, suggesting that these realizations are disjoint and that the relevant unity relation is external.27 Centralist theories, on the other hand, are at least consistent with internalism and unity overlap. It might be that in order to be jointly accessible, unified experiences have to be partly realized in the same part of the central system (e.g., in the same working memory systems), and thereby have overlapping realizations—i.e., what makes one experience accessible is not independent of what makes the other accessible. It is also consistent with this that no external link between these realizations is required to make the experiences unified; the overlapping neural activity that contributes to both states being conscious/accessible might be sufficient on its own for the experiences to be unified. Although centralism is consistent with unity overlap and unity internalism, it should be stressed that it is also consistent with them failing. To return to the example of Prinz’s (2012) attentional theory on which consciousness is accessibility to working memory: Prinz emphasizes that working memory is not a single integrated system but a network of different systems and that representations may be conscious in virtue of being accessible to different components of the working memory network. 
This suggests that on Prinz’s view, jointly conscious states need not have overlapping realizations, and their unity may be an external matter (which is consistent with Prinz’s view that unity is neural resonance, an external relation between neural populations). To sum up: Unlike bricks, the independent parts of an experience need not be disjoint, and contrary to the “cement” metaphor, unity need not be an external relation between experiences. It is also helpful to consider views like unity overlap and internalism (as well as the top-down view) because they show how a version of atomism can be developed that has some of the features of holism that I think some authors find attractive, e.g., the idea that the subexperiences of a total experience are not completely independent events.28 Having said this, although unity overlap and unity internalism are attractive views, existing theories of consciousness are mostly either neutral about them or in some cases rule them out.

7 Conclusion

I have tried to clarify the difference between atomism and holism and explain some different ways the atomic view might be developed. In particular, I have emphasized four different theses suggested by Searle’s “building-block” metaphor that the atomist can reject: supplementation, constructivism, disjointness, and unity externalism. This raises the question of whether existing theories of consciousness support more moderate versions of atomism that reject one or more of these theses, and thereby have some of the attractive features of holism. Current theories are at best noncommittal on these issues, even theories where shared central resources such as working memory are implicated in making different kinds of experience conscious. We are still in the early days of understanding consciousness, however, so current theories may not indicate much about how more mature theories will lead us to think about these issues. I hope to have at least made clearer the different options available to an atomist and to have illustrated how the rather abstract concerns of most of the chapter play out in the context of more concrete theories of consciousness.

Acknowledgments

Thanks to Tim Bayne, John Campbell, and Dave Chalmers for helpful comments on an earlier draft.

Notes

1. At least, this is implied given the assumption that there are only finitely many parts in the relevant model and that sets of these parts exist. The complications that arise from dropping these assumptions are not relevant here. 2. This possibility creates an interesting methodological puzzle. We might think that the way to identify nonparadigmatic experiences (such as whole fields of awareness, if they count as experiences), is to figure out the nature of paradigmatic experiences and then identify the states that have the same real nature.
However, if paradigmatic experiences are derivative aspects of nonparadigmatic experiences, then we won’t be able to understand their nature without first identifying the nonparadigmatic experiences that they depend on.


3. Note that contrary to the approach of e.g., Kim (2000), I’m operating with a conception of realization on which realizer events need not involve the same logical subject as the property instantiation that is realized. 4. Independence is important. Without it, there is no guarantee that disjoint sets of atomic realizers do not realize some of the same phenomenal events. Independence at the neural level can be defined in terms of realization at a more basic level—neural events are independent if their fundamental, physical realizations are disjoint. This raises the question of how to understand independence at the fundamental physical level, and also whether there are any independent events at this level, and if so, whether they are the right kind of ingredients to build neural events from. I will not be able to pursue these important questions here. 5. If an object exists that is a fusion of N1 and N2, then we can think of this event as an instantiation by this two-neuron object of a structural property involving one part being F and the other G. 6. This assumes that it isn’t contingent that a token experience is grounded in the whole. Although it is probably coherent to hold a grounding view on which these grounding relations are contingent, I think it is plausible to interpret a holist as believing that the grounding relations between parts of an experience hold necessarily—this certainly fits best with talk of parts as being mere aspects of, or modifications of, the whole field. 7. Suppose two parts of a total experience have different total realizations. Then, given the assumption that these realizations are constructed out of different sets of modally independent neural events, it follows that each realization could have existed without the other (assuming that one realization is not a part of the other), so each experiential part could have existed without the other, contrary to holism. Two notes on this argument. 
First, I am assuming that on holism, it is not contingent that the existence of a token experience is grounded in the existence of the total experience of which it is a part. Second, there are views on which the different parts of a neural realization are not independent events. Consider a version of Schaffer’s (2010) holism, on which local neural events are grounded in global neural events. On such a view, even if two parts of a total experience are grounded in completely disjoint and spatially localized neural events, they might nonetheless be grounded by the total experience, in that they are grounded in the total neural state of the brain. A holism on which parts of an experience can still have extremely localized total realizers is clearly not in the spirit of the views intended by actual holists, even if it technically counts as a form of holism, and so I will set it aside here. (Thanks to Dave Chalmers for discussion here.) 8. Shoemaker’s (2003) view might be interpreted in this way. Whether there is a stable view intermediate between holism and atomism is an interesting question, but I won’t address it here. I would at least point out that my definition of atomism is sufficiently weak that it is hard to find room for an intermediate view.


9. Note that this kind of modal independence is consistent with a given experience putting constraints on the kind of (determinate) experiences that can accompany it; the fact that E1 could have existed without E2 does not imply that E1 can exist on its own or that it can exist with any old experiences accompanying it. 10. I would suggest we understand detachability as the thesis that the experience has a realization part that fully determines its content/character and that could exist as part of an unconscious event. 11. This would also allow them to adopt a version of the realization approach to experience-parthood: they could say that parthood relations between experiences derive from parthood relations between their core realizers (obviously, they can’t adopt this approach using total realizers). 12. I should note that the main concern of localists such as Block and Lamme is to deny that postperceptual frontal areas are involved in conscious perception; this is consistent with the total realization of experience not being limited to cortical perceptual areas but also involving connections with areas such as the thalamus, which may be involved in sustaining conscious awareness (see, e.g., Block, 2007, 482). 13. We need to be careful here to distinguish between a holism in the dynamics of selection for consciousness and a holism in the individuation of conscious states. By way of analogy, imagine an army of zombie soldiers swarming to enter into the fortress of consciousness. What determines which zombies get through the gates might be an extremely holistic process involving interactions across the whole group; nonetheless, once a soldier makes it through the gate, what it is for it to be inside the fortress might be definable without reference to the other soldiers. 14. Note that there is a difference between explicitly experiencing the color as falling under the category “blue,” and merely experiencing it as having a shade that is in fact a shade of blue.
It is really the latter property that is the determinable of experiencing it as royal blue. 15. A complication here is that some multidimensional quantities, such as three-dimensional location, don’t have a privileged coordinatization. Intuitively, total phenomenal properties are not in this category, but if this intuition is wrong, then even if they have a determinate number of independent dimensions, this wouldn’t on its own determine a privileged decomposition into specific dimensions. 16. For example, some levels of saturation are not available for every hue (see, e.g., Palmer, 1999). 17. Perhaps dualist holists who take total experiences to be primitive relations to structured entities such as propositions can avoid this objection. 18. Possible examples of unbound feature experiences are color experience in a ganzfeld, motion experience in peripheral vision, and experiences had in situations where binding failures are prone to happen such as while viewing the stimuli that


prompt “illusory conjunctions” (Treisman, 1998). (I do not say it is obvious that any of these cases really do involve unbound feature experiences.) 19. A more complete discussion would also consider the mereological structure of core realizers; it might satisfy supplementation even if experiences do not have experiential complements in the sense we have been discussing. For example, if a form of content atomism is true on which a total experience is partly realized by a structured representation with various parts joined together through conjunction, these representational parts might have complements even if these complementary parts do not correspond to independently existing experiences. 20. There is a way in which Bennett and Hill’s version of unity pluralism appears to differ from constructivism, as I am understanding it here: for them, not all complex states formed when experiences are unified (by one of the plurality of unity relations) should themselves count as “experiences.” However, I think this is merely a verbal disagreement: my constructivist says that “experience” applies to these complex states, but only in an extended sense. Bennett and Hill say we shouldn’t use the term “experience” to describe them. Clearly, nothing of theoretical importance turns on this. 21. Constructivism does not strictly imply supplementation. As I am understanding supplementation, an experience and its complement are experientially disjoint. I think we can make sense of a version of constructivism on which the “simple” building blocks out of which complexes are constructed are allowed to experientially overlap (which would require them to be somewhat “chunky”), even though they are capable of independent existence. I think this is an odd view, and I do not think it has significant advantages over a version of constructivism on which the building blocks are disjoint; but ideally it would be treated in detail as a separate case. 22. 
That is, this particular token “rowish” experience depends for its existence on the experiences of individual cylinders. That is not to say that one could not experience some different items as forming a row. In that case, we might have a rowish experience of the same type, but it would not be the same token rowish experience. 23. Note that overlapping in this sense requires more than having overlapping logical subjects; even if their subjects overlap, two events might not share any events as parts. 24. We need this assumption to rule out views on which the fusion of two experiences only exists under special circumstances such as their being related by an external unity relation. (Thanks to Dave Chalmers for pointing this out.) 25. Note that this is not equivalent to an even stronger thesis, to the effect that a total phenomenal property is equivalent to a conjunction of determinate phenomenal properties that characterize the parts of the experience. If holism is true, this


thesis may be correct if we can think of each of these phenomenal properties as a way of making determinate the total property along a determination dimension, but if atomism is true, then presumably a total phenomenal property is not a mere conjunction of more basic phenomenal properties. 26. Objection: Isn’t belonging to the same subject sufficient for unity, and also an internal relation? But why think that experiences belonging to the same subject have to overlap? Response: Belonging to the same subject is only sufficient for unity if we individuate subjects in terms of unified fields (e.g., if subjects are just organisms, then there is no guarantee that their experiences are unified—e.g., an organism might have two separate brains). If two experiences belong to a single field-subject but do not overlap, then surely there is some external relation that makes them both unified and belong to the same subject. In other words, in the relevant sense, belonging to the same subject is only sufficient for unity if it is an external relation or the unified experiences overlap. 27. Although, as noted above in endnote 12, localists may in fact think that there are some neural structures such as thalamocortical connections that enable consciousness and which might be shared by different parts of a total experience. This version of the view is probably consistent with internalism and overlap. 28. Also, insofar as these theses illuminate the sense in which experiences have a mereological structure by focusing on mereological relations between the realizations of the experiences, they help motivate the idea that the mereology of experiences should be understood in terms of the mereological relations between their realizations, and furthermore that experiences are partly individuated in terms of their realizations.

References

Armstrong, D. (1968). A materialist theory of the mind. London: Routledge.
Baars, B. J. (1997). In the theatre of consciousness: Global workspace theory, a rigorous scientific theory of consciousness. Journal of Consciousness Studies, 4(4), 292–309.
Bayne, T. (2010). The unity of consciousness. New York: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation (pp. 23–58). Oxford: Oxford University Press.
Bennett, K. (2011). Construction area (no hard hat required). Philosophical Studies, 154(1), 79–104.
Block, N. (2007). Consciousness, accessibility, and the mesh between psychology and neuroscience. Behavioral and Brain Sciences, 30(5), 481–498.
Byrne, A. (2009). Experience and content. Philosophical Quarterly, 59(236), 429–451.
Carruthers, P. (2000). Phenomenal consciousness: A naturalistic theory. Cambridge: Cambridge University Press.
Dainton, B. (2006). Stream of consciousness: Unity and continuity in conscious experience. London: Routledge.
Dehaene, S., & Naccache, L. (2001). Towards a cognitive neuroscience of consciousness: Basic evidence and a workspace framework. Cognition, 79, 1–37.
Fine, K. (2011). Towards a theory of part. Journal of Philosophy, 107(11), 559–589.
Fodor, J. A. (1975). The language of thought. Cambridge, MA: Harvard University Press.
Fodor, J. A. (2010). LOT 2: The language of thought revisited. Oxford: Oxford University Press.
Gillett, C. (2002). The dimensions of realization: A critique of the standard view. Analysis, 62(4), 316–323.
Hill, C. S. (1991). Sensations: A defense of type materialism. Cambridge: Cambridge University Press.
Hurley, S. (1994). Unity and objectivity. In C. Peacocke (Ed.), Objectivity, simulation, and the unity of consciousness: Current issues in the philosophy of mind (pp. 49–77). London: British Academy.
Kim, J. (2000). Mind in a physical world: An essay on the mind-body problem and mental causation. Cambridge, MA: MIT Press.
Lamme, V. A. (2003). Why visual attention and awareness are different. Trends in Cognitive Sciences, 7(1), 12–18.
Lee, G. (manuscript). Selfless experience.
McDowell, J. (1997). Reductionism and the first person. In J. Dancy (Ed.), Reading Parfit (pp. 230–250). Oxford: Oxford University Press.
Palmer, S. E. (1999). Color, consciousness, and the isomorphism constraint. Behavioral and Brain Sciences, 22(6), 923–943.
Prinz, J. (2012). The conscious brain. New York: Oxford University Press.
Rosenthal, D. M. (1997). A theory of consciousness. In N. J. Block, O. J. Flanagan, & G. Güzeldere (Eds.), The nature of consciousness: Philosophical debates. Cambridge, MA: MIT Press.
Rosenthal, D. M. (2003). Unity of consciousness and the self. Proceedings of the Aristotelian Society, 103(1), 325–352.
Schaffer, J. (2010). Monism: The priority of the whole. Philosophical Review, 119(1), 31–76.
Searle, J. (2005). Consciousness: What we still don’t know. New York Review of Books, 52(1).
Shimojo, S., & Shams, L. (2001). Sensory modalities are not separate modalities: Plasticity and interactions. Current Opinion in Neurobiology, 11(4), 505–509.
Shoemaker, S. (2003). Consciousness and co-consciousness. In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation. Oxford: Oxford University Press.
Shoemaker, S. (2007). Physical realization. New York: Oxford University Press.
Treisman, A. (1998). Feature binding, attention and object perception. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 353(1373), 1295–1306.
Tye, M. (2003). Consciousness and persons: Unity and identity. Cambridge, MA: MIT Press.
Uzquiano, G. (2006). The price of universality. Philosophical Studies, 129(1), 137–169.
Van Gulick, R. (2004). Higher-order global states (HOGS): An alternative higher-order model of consciousness. In R. Gennaro (Ed.), Higher-order theories of consciousness. Amsterdam: John Benjamins.
Zeki, S. (2003). The disunity of consciousness. Trends in Cognitive Sciences, 7(5), 214–218.

14 Unity of Consciousness: Advertisement for a Leibnizian View

Farid Masrour

It is common to hold that our conscious experiences at a single moment are often unified. But when consciousness is unified, what are the fundamental facts in virtue of which it is unified? On some accounts of the unity of consciousness, the most fundamental fact that grounds unity is a form of singularity or oneness. I call these Newtonian accounts of unity because of their similarity to Newtonian views of space according to which the most fundamental fact that grounds relations of co-spatiality between various points (or regions) of a space is the fact that these points (or regions) are parts of the same single space.

It is not, however, clear that the unity of consciousness has to be treated in a Newtonian manner. We can imagine an approach to unity that accounts for it in the same manner that one might think of the unity of a chain. Two links make a chain together when they are connected in the right way. Intuitively, the connection between the links is the fact that grounds the oneness of the chain. In this chapter, I sketch and defend an analogous account of unity of consciousness. Very roughly, the view holds that experiences are unified when they are connected in the right way. In this respect, the view is analogous to Leibnizian views of space according to which the oneness of space emerges from certain conditions over spatial relations. I call the view the connectivity view.

The first section elaborates on the task at hand and the terminology that the chapter relies on, and surveys some of the existing accounts. The second section outlines the main theses of the connectivity view. The third section discusses the comparative dialectical advantages of the view. And the last section addresses three potential objections. I end with a few words about the significance of this issue for cognitive science.

1 Unity and the Grounding Question

As I’m writing these lines, I see them appear on the screen, which partly blocks my view of the window behind it. I feel the pressure of the backrest and a slight pain in my neck. I smell the stew cooking downstairs in the kitchen and hear the cars passing in the street. I can hear the singing of the birds outside. The spring has finally arrived and I feel excited about it. There is something that it is like to see the screen, hear the birds, and smell the stew. But there is also something that it is like for me to have all of these experiences together.

Let us refer to this togetherness of my experiences as phenomenal unity of consciousness. Phenomenal unity of consciousness, or phenomenal unity for short, is the main topic of this chapter. The above snippet talks about events such as seeing, hearing, and smelling. These events are assumed to be phenomenally conscious events, marked by the fact that there is something that it is like to undergo them. The snippet claims that these conscious events enjoy a form of “phenomenal togetherness.” The “phenomenal” in “phenomenal togetherness” is meant to indicate that there is something additional that this togetherness contributes to what it is like to undergo these experiences, something the omission of which would render our description of the phenomenal facts incomplete.

Here, I have followed the common practice of introducing phenomenal unity with a phenomenological snippet followed by a few clarificatory remarks.1 The snippet aims at pointing in the direction of our target phenomenon. Whether this attempt to identify a phenomenon succeeds or not is itself a matter of controversy and some theorists are skeptical of it.
I actually agree that there is something correct about this skeptical attitude, but I would like to ask the reader to put her skepticism about the existence of our target phenomenon aside for the moment and regard our characterization as sufficient to serve as a starting point. For reasons that will become clearer in the next sections, the task of identifying phenomenal unity is closely intertwined with offering a substantive metaphysical account of it. Near the end of section 2, we will revisit our initial characterization.

The topic of phenomenal unity has received significant attention during the past two decades.2 Much of the recent discussion has centered on a number of tasks and issues. One task is to provide an account of the more fundamental, personal-level facts, if any, in virtue of which phenomenal unity obtains (more on this soon). A second task is to provide an account of the psychological underpinnings of phenomenal unity. A related issue is whether phenomenal unity obtains by necessity or whether it breaks in some normal or pathological conditions. There has also been some interest in the relationship between phenomenal unity and personal identity.3 Finally, some theorists think that the notion of phenomenal unity is not clearly articulated and that the existence of such a relation has not been properly demonstrated.4 The issue here is whether one can provide a satisfactory response to this skeptical worry and, if so, what the response might look like.

The central question of this chapter is about the first task in the above list. I want to answer the following question:

Grounding Question
When several experiences are phenomenally unified, what are the most fundamental personal-level facts, if any, in virtue of which the experiences are unified?

As I am using the term, personal-level facts are those that are present to consciousness. I will not venture to define what it is for something to be present to consciousness. But I shall assume that facts about the phenomenal character of experiences and the content of conscious states satisfy this demand. Since personal-level facts must be present to consciousness, the answers to the grounding question can only appeal to facts that are present to consciousness. For example, the answer that a set of experiences is phenomenally unified in virtue of the fact that its members are parts of one encompassing experience satisfies the requirement. Subpersonal underpinnings of conscious states that are not available to the subjects of the states, on the other hand, would not qualify. For example, an answer to the effect that a set of experiences is unified in virtue of the fact that the neural correlates of its members are part of the same corticothalamic loop does not satisfy this demand.5

I take the claim that a set of facts obtains in virtue of another set of facts to be equivalent to the claim that the former set is grounded in or depends on the latter set.
I will take the grounding relation as a metaphysically primitive relation.6 The grounding question should be distinguished from a related question that we can call the structural question: what are the necessary and sufficient personal-level conditions for phenomenal unity? Answers to the grounding and structural questions can diverge. For example, one might give a primitivist answer to the grounding question, holding that phenomenal unity is a fundamental personal-level fact that is not grounded in any other facts at the same level, while answering the structural question by holding that a set of experiences is unified if and only if its members are parts of one single, encompassing experience.7

It is worth noting that the grounding question is posed in terms of unity relations among experiences. My usage of the term “experience” is regulated by Nagel’s criterion.8 Accordingly, we are entitled to attribute an experience to a subject, S, whenever we are entitled to hold that there is something that it is like to be S. This is not to take a substantive metaphysical position toward experiences, for example, that they are mental entities such as sense data. I consider talk about experiences to be neutral with respect to the debates between representationalist, adverbialist, naïve realist, and sense datum accounts of experience.9

Formulating the grounding question in terms of experiences might sound problematic to those theorists who hold that we undergo only one experience at each moment. In this one-experience view, one does not have an experience of seeing the screen, an experience of hearing the birds, and yet another experience of smelling the stew. Rather, one has a single experience whose content is only incompletely described in saying that one hears the birds.10 It might seem that if the one-experience view is true, then there is no question to be asked about facts underlying the unity of experiences because there is no multiplicity in experience. But I think that the challenge that the one-experience view poses for the topic at hand is less substantive than it might seem. In order to pose the grounding question, we need elements that have an intimate connection with phenomenal consciousness, are multiple, and stand in unity relations. If the one-experience view is correct, then experience is not the right item to play this role because it fails the multiplicity condition. However, the grounding question can still be posed in terms of contents or their components. One might wonder: when several phenomenally conscious contents together form a unified experience, what is it in virtue of which they do so?
I therefore think that those who subscribe to the one-experience view still face the grounding question, though they have to reformulate it in terms of contents. So, although I do not think that the one-experience view is correct, I don’t think we need to show that it is incorrect in order to pose the question about phenomenal unity. In what follows, I will stick to the formulation in terms of experiences.

Another point to note is that the grounding question does not mention subjects of experience. This raises an important question: does the grounding question disappear if the metaphysical structure of an experience essentially involves a subject?11 No. The idea that each experience involves a subject is compatible with the multiplicity of such subjects. It does not follow from the essential subject-involving nature of experiences that the subject of the act of seeing the screen and the subject of the act of hearing the birds are the same subject.12

A last point to note is that the grounding question does not explicitly mention the temporal relationship between the unified experiences. This diverges from the common practice of separating issues of synchronic unity from diachronic unity. I take it that there is an open question whether a unified account of synchronic and diachronic unities is possible. I therefore do not think that the project of accounting for phenomenal unity should be restricted to synchronic unities. Indeed, we will see later that the connectivity view provides a unified account of synchronic and diachronic unities.

Most recent answers to the grounding question are Newtonian. Here are three examples:

(1) A set of synchronic experiences is phenomenally unified in virtue of the fact that its members are parts of the same single experience (mereological view).
(2) A set of synchronic experiences is phenomenally unified in virtue of the fact that the contents of its members are parts of the same single content (one-content view).
(3) A set of synchronic experiences is phenomenally unified in virtue of the fact that its members are experiences of the same single subject (one-subject view).13

The mereological view grounds phenomenal unity in facts about the oneness of a total experience. Bayne (2010) offers a version of the mereological view.14 The one-content view grounds phenomenal unity in the oneness of a content, a view that Tye (2003) advocates.15 The one-subject view grounds phenomenal unity in the oneness of a subject. Peacocke (2014) seems to offer a view like this.16 These Newtonian views are not primitivist views of unity because they ground it in oneness or singularity, but they are primitivist about oneness. It is natural to wonder whether one can provide an account of phenomenal unity that is more substantive than grounding it in a primitive oneness.
My aim in the following sections is to show that this can be done by sketching and motivating a novel Leibnizian view that builds unity from the ground up without an appeal to a primitive oneness. As we shall see, the view also sheds new light on some of the other issues surrounding phenomenal unity in the above list. Thus, those who are primarily interested in the other problems of unity will find some interest in the view that I offer.

2 The Connectivity View

The previous section distinguished between three different accounts of unity, namely the mereological view, the one-content view, and the one-subject view. As we saw, despite their differences, a common Newtonian thread runs through these three accounts: all three ground phenomenal unity in a global oneness or singularity. My main purpose in this section is to sketch an alternative view of unity that turns the metaphysical order of grounding upside down. In this view, the global unity of experience is grounded in local connections among experiences. The view, therefore, has a Leibnizian structure. As we shall see, this is not the only difference between the connectivity view and its Newtonian rivals. There is a second and equally important difference, but we need to do some stage setting before getting to that.

We can start by highlighting a type of experience that is a common presence in our everyday experiential life. Right now, I see my hand, and I see the keyboard. But I also see my hand as being on the keyboard. The experience of my hand as on the keyboard is an experience of a specific spatial relation between them. At the moment, I also hear the birds singing and feel elated. But that’s not all: I experience the singing as the cause of my elation. My experience of the singing as the cause of my elation is another example of an experience of a specific relation.

In my view, the repertoire of experiences of specific relations that we can have is very rich. For example, we can experience spatial, temporal, causal, dynamical, objectual, intentional, and even rational relations. We can experience objects as occupying the same space or as being in more specific spatial relations with each other. We can experience temporal simultaneity and succession relations. We can experience one event as the cause of another. We can experience two events as unfolding in a lawlike dynamic relation with each other.
We can experience properties as properties of the same object (this is what I am calling an objectual relation).17 We can experience our thoughts and emotions as directed toward objects. And we can experience thoughts or beliefs as based on or justified by the objects and events that we experience.18

Experiences of specific relations play a central role in the connectivity view. The intuition behind the connectivity view is that we can account for phenomenal unity in terms of experiences of specific relations. This is partly motivated by the observation that experiences of specific relations seem to suffice for unity. In the connectivity view, the experience of my hand as being on the keyboard is all that it takes to unify my experience of my hand with my experience of the keyboard. The core idea of an account of unity does not need to go beyond this mundane observation. What follows elaborates on this core intuition.

We can use experiences of specific relations to define a relation that we can call binding.

Binding
Two experiences are bound together if and only if they are connected by an experience of a specific relation.19

My experience of my hand and my experience of the keyboard are bound together by the experience of the spatial relation between them. My experience of the singing and my experience of my elation are bound together by the experience of the causal relation between them. In each case, these experiences are connected by an experience of a specific relation. We can call the relation between an experience of a specific relation and the experiences that it binds “attachment.” Attachment, in the connectivity view, is a primitive relation.

Binding is a relation among experiences, but bindings are not experienced as relations among experiences. They are experiences of relations among the objects and properties that experiences present. So the idea that there are binding relations among experiences does not conflict with the diaphanousness or transparency of experience.

It seems plausible that two experiences can be unified without being bound. Take the earlier example in which you see your hand on the keyboard and hear the singing of the birds as the cause of your elation. Let us add that you hear the singing as coming from a different region in the same space in which you experience the keyboard to be located. Then you are experiencing a spatial relation between the keyboard and the singing. So your visual experience of the keyboard and the auditory experience of the singing are bound together. More importantly, it seems plausible that your experience of the keyboard and your experience of the elation are unified because they are both bound to the experience of the singing.
So your experience of the keyboard and your elation experience are unified but not bound.

One upshot of this observation is that the intuitive notion of unity is weaker than the binding relation. Intuitively, two experiences that are bound are unified with each other, but not all experiences that are unified need to be bound with each other. Therefore, we should not identify unity with binding. Nevertheless, there can be an intimate connection between the two notions. The above example also illustrates this connection. The seeing of the keyboard and the experience of the elation are connected through the mediation of the experience of the singing that is bound to both. There is, as it were, a path between these two experiences that is instantiated in virtue of a chain of bindings. More precisely, there is a unity path that connects these two experiences, where a unity path is defined as follows:

Unity Paths
There is a unity path between two experiences Em and En if and only if Em is bound with En or there is an Er such that Em is bound with Er and there is a unity path from Er to En.

A path consists in a chain of binder experiences (experiences of specific relations) and the experiences that are bound by the path. An experience can be a member of a unity path in either of two ways:

Path Membership
An experience is a member of a path if and only if it is one of the binders in the path or one of the experiences that are bound by the binders in the path.

The notions of a unity path and path membership can be used to define a property of a set of experiences that I call minimal connectivity:

Minimal Connectivity
A set of experiences, S, is minimally connected if and only if there is a unity path, P, such that all the members of S are members of P.

We are now in the position to characterize the central claim of the connectivity view. I call this claim the connectivity thesis:

Connectivity Thesis
A set of experiences is unified in virtue of the fact that it is minimally connected.

It is worth emphasizing the asymmetric relation between unity and connectivity. According to the connectivity thesis, unity is grounded in minimal connectivity. The connectivity thesis is thus an answer to the grounding question. The structural analog of the connectivity thesis would have been: a set of experiences is unified if and only if it is minimally connected. This biconditional thesis is silent about issues of metaphysical priority. In the connectivity view, unity relations are grounded in the existence of unity paths and facts about membership in the unity path.
One way to think about the grounding relation between unity and unity paths is to think of unity paths as determinate versions of the determinable unity. Intuitively, determinables are instantiated in virtue of the instantiation of their determinate versions. This idea is not essential to the connectivity view but provides an additional framework for understanding it.
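
The recursive definitions of binding, unity paths, path membership, and minimal connectivity have the structure of reachability in a graph, and a small sketch may help to make them concrete. The Python fragment below is only an illustration under assumptions that are not part of the view itself: the experience and binder names are hypothetical labels for the hand, keyboard, singing, and elation example, each binder is taken to attach to exactly two experiences, and a unity path is read as a chain that may revisit experiences, so that minimal connectivity comes to membership in one connected cluster.

```python
from collections import deque

# Hypothetical encoding (not from the text): each binder, i.e. each experience
# of a specific relation, is attached to exactly two experiences.
BINDERS = {
    "hand_on_keyboard": ("hand", "keyboard"),
    "singing_located_near_keyboard": ("keyboard", "singing"),
    "singing_causes_elation": ("singing", "elation"),
}


def bound(a, b, binders=BINDERS):
    """Binding: a and b are connected by an experience of a specific relation."""
    return any({a, b} == set(pair) for pair in binders.values())


def path_members(start, binders=BINDERS):
    """Collect every experience and binder on some chain of bindings from start."""
    members = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for binder, pair in binders.items():
            if current in pair:
                members.add(binder)  # binders are themselves path members
                for other in pair:
                    if other not in members:
                        members.add(other)
                        queue.append(other)
    return members


def unity_path(a, b, binders=BINDERS):
    """Unity path: a is bound with b, or bound with some r that has a path to b."""
    return a != b and b in path_members(a, binders)


def minimally_connected(experiences, binders=BINDERS):
    """Minimal connectivity: one unity path has every given experience as a member."""
    experiences = set(experiences)
    if not experiences:
        return True  # assumption: the empty set counts as trivially connected
    first = next(iter(experiences))
    # If the first item is itself a binder, start from an experience it binds.
    start = binders[first][0] if first in binders else first
    members = path_members(start, binders)
    # A path exists only if the starting experience is bound to something.
    return len(members) > 1 and experiences <= members


print(bound("keyboard", "elation"))       # False: no single binder links them
print(unity_path("keyboard", "elation"))  # True: linked via the singing
print(minimally_connected({"hand", "keyboard", "singing", "elation"}))  # True
print(minimally_connected({"keyboard", "unrelated_experience"}))        # False
```

On this toy encoding, the keyboard experience and the elation experience come out unified but not bound, as in the example discussed earlier. The sketch models only the structural conditions; nothing in it captures the grounding claim itself.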

The connectivity view is Leibnizian in the sense that it accounts for unity in terms of local relations. This distinguishes this account from the Newtonian views that we discussed in the previous section. But there is another equally important way in which the account diverges from the existing accounts. To bring out this difference, let us consider the following paragraph from Bayne and Chalmers (2003):

When I look at the book while feeling a pain, there is something it is like to see the book (yielding a phenomenal state A), and there is something it is like to feel the pain (yielding a phenomenal state B). But there is more than this: there is something it is like to see the book while feeling the pain. Here there is a sort of conjoint phenomenology, that carries with it the phenomenology of seeing the book, and the phenomenology of feeling the pain. … We can think of the conjoint state here as involving at least the conjunction A&B of the original phenomenal states A and B.

In the Bayne-Chalmers view, the fact that the feeling of pain and the seeing of the book are unified with each other makes a difference to the overall phenomenology of the subject. But this difference is a matter of seeing the book and feeling the pain in a conjoint manner. In this view, having two experiences in a conjoint manner does not make a substantial contribution to phenomenology or content. Unity is a purely structural or logical matter and the phenomenology associated with it is, as it were, bare phenomenology. This observation generalizes to the other Newtonian views that we considered in the previous section. In all of these views, unity is a purely structural feature. We can call these views bare unity views, where bare unities are connections between experiences that can happen independently of any experience of a specific relation.

On the connectivity view, in contrast, phenomenal unity between experiences makes a substantial contribution to overall phenomenology and content. On this view, if I tell you that my experience of the singing of the birds and my feeling elated are bound, there is something left out in my description of my phenomenology. You can still ask, “How are they bound?” And I can give you an informative answer: “They are bound in that I experience the singing as the cause of my elation.” A similar idea applies when experiences are connected but not bound. In such cases, too, there is a unity path that connects the experiences, and the unity path makes a substantial contribution.

The second way in which the connectivity view diverges from the common Newtonian views has to do with this contrast. We can say that phenomenal unity on the Newtonian views is a bare relation, but on the connectivity view it is a substantial relation. We will see in the next section that this puts the connectivity view in a dialectically advantageous position in comparison to bare unity views. The substantiality of unity under the connectivity view is independent of the fact that the view has a Leibnizian structure. One can easily conjure up a Leibnizian view that is structurally similar to the connectivity view but whose binding relations are bare unity relations.

This ends my characterization of the core concepts and theses of the connectivity view. On this view, the unity of a set of experiences is a matter of the connectivity of the set.

Before ending the section, I want to return to how I introduced phenomenal unity at the beginning of the chapter and remove an ambiguity in the introduction. There, I introduced unity as a form of phenomenal togetherness. We should note that the referent of the phrase “togetherness of the experiences” depends on whether we adopt a Newtonian view or the connectivity view. On the Newtonian view, the togetherness refers to a bare global phenomenal oneness. On the connectivity view, in contrast, the togetherness refers to the connectivity of my experiences. This connectivity is implicitly contained in the passage. The assertion that the screen partly blocks my view of the window is meant to imply that I experience a spatial relation between the screen and the window. It is also implicit that the stew that I smell, the birds and the cars that I hear, and the backrest whose pressure I feel on my back are all experienced to be located in the same space in which I feel my body to be located. The togetherness refers to the fact that all of my experiences are connected in this way. So there was an ambiguity in our initial characterization of phenomenal unity that we were not in the position to remove then.
Now that we have a clear picture of the contrast between the connectivity view and the Newtonian accounts, we are in a position to remove this ambiguity.

3 The Dialectical Advantages of the Connectivity View

We saw in the previous section that the connectivity view diverges from the Newtonian approaches to unity in two ways. First, the view has a Leibnizian character in that it accounts for phenomenal unity in terms of local relations. Second, on the connectivity view, unity relations are substantial and make a positive contribution to the phenomenology and content of the experiences that they unify. These two differences put the connectivity view in a dialectically more advantageous position in comparison to the Newtonian views. My aim in this section is to defend this claim by arguing that the connectivity view is in a better position in three respects.

Consider the following possible account of phenomenal unity:

The one-stream view
A set of experiences is phenomenally unified in virtue of the fact that its members belong to the same stream of consciousness.

There is something about this view that leaves us cold. The notion of phenomenal unity seems to be too close to the notion of a single stream of consciousness in terms of which it is elucidated.20 Many, therefore, might claim that they grasp the idea that a set of experiences is unified and the idea that the set forms a single stream of consciousness in a similar way. In short, the explanandum and the explanans in the one-stream view are too close to each other. This is not to say that the account does not tell us anything substantive. The account gives us a Newtonian picture of unity that is substantively different from a Leibnizian picture. But there is an intuitive sense in which the account does not expand our knowledge of the matter. To use a Kantian term in a slightly different manner, the account is not ampliative.

Most Newtonian accounts suffer from a similar dialectical shortcoming, though to different degrees. Consider, for example, the mereological account according to which experiences are unified in virtue of the fact that they are parts of the same single experience. The account does not seem to lack substance. It is, after all, a Newtonian account according to which a oneness grounds unity. The claim that what makes consciousness unified is this Newtonian structure seems to be substantive. Also, the claim that the unifying oneness is an experience is, on the face of it, a substantive claim. A mountain range is a unified whole whose parts are mountains, but a mountain range itself is not a mountain.
In claiming that a unified set of experiences together form an experience, the account might be making an additional substantive claim.

On closer inspection, however, the mereological view suffers from a shortcoming similar to the one-stream view. Consider a view about the individuation of experiences that we might call object-based individuation. On this view, we should cut up the space of experiences in the same way that we cut up the space of their objects. Under this object-based notion of experience, I have an experience of the keyboard and an experience of elation. But I do not have an experience of the keyboard and elation because my elation and the keyboard do not together form an object of experience. It is not obvious that the object-based notion of experience is correct. Nevertheless, we seem to have an intuitive grasp of the object-based notion of experience. Obviously, though, the notion of a single encompassing experience that is operative in the mereological view is a different notion. For, if we adopt the object-based view, many of the experiences that we often have at the same moment cannot be regarded as forming one single experience together. This gives rise to a question: what is the notion of a single experience that is operative in the mereological view, and is our grasp of this notion sufficiently removed from our grasp of a unified set of experiences? In my view, the notion of a single experience that is operative in this account is too close to the notion of unity, and our grasp of this notion is not independent from our grasp of the notion of unity. If this observation is correct, then the mereological view suffers from a shortcoming similar to the one-stream view. Its explanandum and explanans are not sufficiently distant from each other. Therefore, the account is not sufficiently ampliative.21

The situation for the other Newtonian approaches seems analogous. On the one-content view, experiences are unified in virtue of the fact that their contents are parts of the same total phenomenal content. The view does not ground unity in the oneness of experience. Rather, it grounds it in the oneness of phenomenal content. But it is not clear whether we can have a clear grasp of the oneness of phenomenal content independent of our grasp of unity. Again, the explanandum and the explanans are too close to each other. Arguably, similar issues arise for the one-subject view.22 All Newtonian views thus suffer from the same dialectical shortcoming. We can say that these accounts are not very ampliative.23

The first dialectical advantage of the connectivity view is that it is ampliative. This advantage emerges out of the interplay between its Leibnizian structure and its substantiality, that is, the fact that under the connectivity view, unity is not a bare relation.
The connectivity view grounds phenomenal unity in relations of binding, and those in turn in the attachment between experiences of specific relations and other experiences. The relations of binding and attachment are clearly different from the unity that the account aims at explaining. The account’s explananda and explanans are clearly distinct. Thus, the connectivity view is clearly ampliative and substantive. In this respect, it is dialectically preferable to the Newtonian views that we have considered. The second dialectical advantage of the connectivity view has to do with the fact that those who wish to provide an account of phenomenal unity have to defend their claim that there is a target phenomenon to be accounted for. Here, the dominant Newtonian accounts of unity have met some skepticism. For example, in a commentary on Bayne’s Unity of Consciousness, Hill complains that “[Bayne] has not yet specified an appropriate

relation of phenomenal unity, or even pointed us in a direction in which an appropriate unity relation can be found.”24 In the book, Bayne claims that introspection supports the existence of phenomenal unity. But Hill responds that he never finds Bayne’s phenomenal unity when he introspects, “looking for a phenomenal unity relation.” So the claim that there is phenomenal unity has been challenged, and a well-developed account of unity has to respond to this challenge. In my view, the main reason Bayne’s view has received a skeptical reaction of this sort is a feature that it shares with all other Newtonian accounts. As we noted earlier, in grounding unity in a form of oneness, Newtonian accounts are forced to construe it as a bare relation. The connectivity view, in contrast, grounds phenomenal unity in experiences of specific relations. It is not as easy to be skeptical about substantial experiences of specific relations as it is to be about bare phenomenal unities.25 Thus, the connectivity view is in a better position to convert some of the skeptics about phenomenal unity into friends of phenomenal unity. The third, and perhaps the most important, dialectical advantage of the connectivity view is that it provides a unified account of diachronic and synchronic phenomenal unity. In this view, the conditions for diachronic unity between experiences are exactly the same as the conditions for synchronic unity: the existence of unity paths between them. Proponents of Newtonian views, in contrast, often admit that their account of synchronic unity does not generalize to diachronic unities.26 Clearly, a uniform account of unity is theoretically preferable to a nonuniform one in that it accounts for more phenomena in a uniform manner. This ends my defense of the claim that, in three respects, the connectivity view is in a dialectically more advantageous position than the existing Newtonian accounts.
First, the view is ampliative because we grasp its explanandum and explanans differently. Second, the view is in a better position to face the skeptical challenge about phenomenal unity because unity relations in this view are not bare. Third, the view is theoretically in a better position because it provides a unified treatment of synchronic and diachronic unities. I think it is safe to conclude that we have good reasons to embrace the connectivity view unless there are important objections to it. Whether there are such objections is the focus of the next section.

4 Objections and Replies

The previous section developed the connectivity view and argued that it is in a dialectically better position than its rivals. My aim in this section is to block three possible objections to the view.

Hurley has argued against attempts to account for unity in subjective terms by employing a generic argument style that she aptly calls the “just-more-content” argument.27 The argument is based on the observation that if a set of experiences is not unified, adding more content can unify its members only if the new content is already unified with them. This, Hurley concludes, shows that we cannot account for unity in terms of the contents of experience. I am not concerned with the cogency of this argument, but I would like to consider a possible objection that one who finds Hurley’s argument attractive might submit against the connectivity view. Our imaginary interlocutor might reason in the following manner:

Imagine a situation in which a subject has two streams of consciousness. In one stream, the subject has three experiences: E1, E2, and E3. Let us assume that E3 is an experience of a specific relation between the contents of E1 and E2 and binds the two together. In the other stream, the subject has an experience, E4, which is type-identical with E2. Since E2 and E4 have the same content, and E3 is an experience of a specific relation between the contents of E1 and E2, it follows that E3 is also an experience of a specific relation between the contents of E1 and E4. Under the connectivity view then, it follows that E1 and E4 are bound and unified. However, E4 is in a different stream from E1 and E2, which implies that E4 is not unified with E1 because by stipulation, it is in another stream. Thus, the connectivity view results in a contradiction and should be rejected.

This is an interesting objection, but it does not survive scrutiny.
Consider the thesis that a subject’s two experiences, E1 and E2, with contents p and q, are bound in virtue of the fact that the same subject has an experience with the content R(p, q), where R(p, q) is the content that an experience type that would bind E1 and E2 must have.28 This thesis has many counterexamples, and the above scenario illustrates one of them. But the thesis is different from the connectivity view under which two experiences are bound when an experience with the appropriate content connects them together. Under the connectivity view, having an experience with the appropriate content does not suffice for binding. The experience must be attached to the experiences that it binds. So the connectivity view is compatible with the above scenario that illustrates a situation in which the instantiation of the sufficient content in the subject does not suffice for binding. The connectivity view is not an attempt to ground unity on content. But our imaginary interlocutor might respond: There should be an explanation for the fact that E3 binds E1 and E2 but does not bind E1 and E4. The best explanation is that E4 is not in the same

stream of consciousness as E1 and E3. So experiences of relation can bind other experiences only when the binder and the experiences that it binds happen within a single unified stream. Thus, what partially grounds the fact that the appropriate content binds two experiences is that the binding experience and the experiences that are bound are unified with each other. It thus follows that unity is presupposed by binding and cannot be grounded in it.

The above move extends the intuition behind the just-more-content argument. It is admitted on both sides that adding more content does not unify two experiences, but our imaginary interlocutor takes this a step further by arguing that the best explanation for this is that the additional content is not unified with what it aims to unify. In the connectivity view, however, attachment is a primitive relation in the sense that there is no other fact at the personal level that grounds it. E3 binds E1 and E2 but not E1 and E4 in virtue of the fact that E3 is attached to E2 but not E4. This fact has no further ground at the personal level that metaphysically explains it. In assuming the contrary, the above argument is begging the question against the connectivity view.

The second possible objection to the connectivity view concerns its breadth:

It seems that the connectivity view cannot account for all cases of phenomenal unity. For example, my emotional experiences are unified with my perceptual experiences, but I don’t find any connection between them. A similar point applies to the case of experiences associated with thoughts and other cognitive states. It is not clear how these experiences might be connected to my other experiences through unity paths.

Since the objection argues that the connectivity view does not have sufficient resources to account for the unity between all our experiences, I call it the insufficiency objection. Does the insufficiency objection succeed?
The first thing to note is that a number of different theoretical options are available about the nature of emotional experiences. Under one view, for example, emotional experiences are collections of experiences of bodily dispositions. We can experience a variety of bodily dispositions. For example, we can sometimes feel disposed to laugh, to cry, to embrace, to jump, to dance, to sit, to run, to shout, to punch, and so on.29 On a view like this, there is something that it is like to feel joyful at each moment, but when we make this claim, we are not adding anything to the repertoire of the simple experiences that we can have because feeling joyful is a label for some collection of experiences of bodily dispositions such as feeling disposed to jump, dance, and sing. On a different view, feeling joyful consists

in undergoing an unanalyzable raw feel of joy. On a third view, feeling joyful essentially combines a raw feel of joy with experiences of bodily dispositions. Obviously, the way in which emotional experiences can connect to perceptual experiences depends on which one of these views one takes. For example, if experiences of emotions essentially involve experiences of bodily dispositions, then it is not hard to imagine how they can connect to perceptual experiences. Even under the view that experiences of emotions are simple raw feels, there are several ways in which these experiences can connect to perceptual experiences. The most important potential source of connection is the experience of temporal relations. One can experience one’s emotion to be simultaneous with what one’s perceptual experience presents. One can also experience one’s emotions as temporally succeeding or preceding what one’s perceptual experience presents.30 Also, emotions can be experienced as standing in causal relations with external objects and events as well as our behavior, thoughts, and memories. For example, one might feel the sight of the snake as the cause of one’s fear, or one might feel one’s sadness as the cause of one’s crying. We might feel our happiness as caused by remembering a past joyful event. We can also experience our emotions to be directed at the items that perceptual experiences present. For example, one might experience one’s love as directed toward an individual. We can feel our emotions to be epistemically related to certain judgments and beliefs about oneself. For example, a sincere judgment that “I’m excited” can be felt to be based on one’s excitement. Thus, the proponent of the connectivity view has several resources to account for the connection between emotional and perceptual experiences. First, she might defend an account of the nature of emotional experience that directly links it to bodily dispositions and through them to other perceptual experiences. 
Second, she might hold that experiences of temporal, causal, intentional, and epistemic relations bind emotional and perceptual experiences. Finally, she might find connections between emotions and perceptual experiences through their connections with behavior, thoughts, imaginings, and memories. Arguably, a similar line of response would work for the experiences associated with cognitive states. So we can safely conclude that the insufficiency objection can be resisted. Let us thus turn to the last potential objection. On the connectivity view, a set of experiences cannot be unified unless it is minimally connected. The third potential objection against the connectivity view targets this idea. The objection attempts to show that unified sets of experiences that are not minimally connected are possible. Since the

objection bases this claim on the conceivability of such situations, I call it the conceivability objection. Here is one way that the objection might go: I find unified sets of experiences that are not minimally connected conceivable. For example, I can conceive of an emotional experience that has no connection with my other experiences but yet is unified with them because it is experienced from the same phenomenal point of view. In general, one can conceive of experiences that are disconnected but unified because their contents are experienced as given to a single point of view. It is not entirely clear what a phenomenal point of view is. But, whatever a phenomenal point of view might be, the natural position for the proponent of the connectivity view is that the oneness of a phenomenal point of view is grounded in phenomenal unity. So on the connectivity view, a situation in which an emotional experience is completely disconnected from all other experiences yet experienced from the same phenomenal point of view is impossible. Can we conceive of such a situation? I have been emphasizing that the proponent of the connectivity view has several resources to account for the connection between emotional experiences and other experiences. In order to conceive of an emotion that is disconnected from other experiences, we have to conceive of a situation in which all potential sources of connection are absent. It is not clear to me that after making the absence of all of these connections explicit, we would still confidently assert that the alleged emotion is given to the same point of view as our other experiences. Let me elaborate. 
Typically, when I conceive of an emotion along with other experiences, I conceive of a situation in which I experience the emotion as simultaneous with what my other experiences present: e.g., I felt my fear as I witnessed the biker zigzagging between the cars.31 I can also conceive of an emotion as preceding or succeeding my perceptual experiences. In order to conceive of a fully disconnected emotion, I have to conceive of a situation in which these temporal connections are absent. Typically, when I conceive of an emotion, I also conceive of a situation in which I experience some bodily dispositions. For example, when I conceive of feeling excited, I conceive of being disposed to get up and pace back and forth. In order to conceive of a fully disconnected emotion, I have to conceive of the absence of all of these dispositions. It is not completely clear to me that I can conceive of a situation like this. More importantly, the further that I go down the path of imaginatively cutting down the connections between an emotion and other experiences, the less confident I feel that I can imagine the emotion to be given to the same point of view as my other experiences.

The gist of my reply to the conceivability objection is that if fully conceiving of a situation requires conceiving of all the relevant detail in that situation, then it is not at all clear that we can fully conceive of a situation in which an emotional experience is fully disconnected from our other experiences, yet given to the same point of view. Admittedly, more work needs to be done on what the oneness of a phenomenal point of view is and in virtue of what it emerges. But I think that when we understand this aspect of phenomenal experience in a deeper way, the prima facie conceivability of the situation that the conceivability objection depicts disappears.

To summarize, I have considered three possible objections to the connectivity view. On the just-more-content objection, the fact that an experience of a specific relation can bind two other experiences is itself grounded in the fact that the binding experience is unified with what it binds. I responded that this claim begs the question against the thesis that relations of attachment are primitive relations. On the insufficiency objection, the connectivity view cannot account for the unity of emotional and cognitive experiences with other experiences. In response, I pointed out several ways in which these experiences can connect to other experiences. Lastly, I considered the objection that fully disconnected and yet unified experiences are conceivable and thus possible. I responded by questioning the claim that such situations are conceivable.

5 Conclusion

My aim in this chapter has been to advertise for a novel account of phenomenal unity. To this effect, I outlined a view according to which a set of experiences is unified when there is a unity path that contains all of its members. On this view, unity paths are chains of binding relations, which are mediated by experiences of specific relations. I called the view the connectivity view.
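The structural core of the view just summarized can be illustrated with a small graph sketch. What follows is only an illustration under stated assumptions, not part of the chapter's argument: experiences are modeled as labeled nodes, attachment is taken as a primitive input (a hypothetical `attachments` mapping from each experience of a relation to the experiences it is attached to), and a unity path is assumed to be any chain of such attachments, possibly passing through experiences outside the set being tested. The function name `is_unified` is likewise invented for the illustration. The example reuses the E1 to E4 scenario from the just-more-content objection.

```python
def is_unified(experiences, attachments):
    """Return True if all `experiences` lie on one unity path (toy model)."""
    # Build an undirected graph: each relational experience (binder) is
    # linked to the experiences it is attached to. Content plays no role;
    # only attachment does (an assumption mirroring the reply in the text).
    graph = {}
    for binder, attached in attachments.items():
        for experience in attached:
            graph.setdefault(binder, set()).add(experience)
            graph.setdefault(experience, set()).add(binder)

    experiences = set(experiences)
    if not experiences:
        return True

    # Depth-first search from an arbitrary member of the set.
    start = next(iter(experiences))
    reached = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, ()):
            if neighbour not in reached:
                reached.add(neighbour)
                stack.append(neighbour)
    return experiences <= reached


# E3 is an experience of a relation, attached to E1 and E2 only.
# E4 is type-identical with E2, but nothing is attached to it.
attachments = {"E3": {"E1", "E2"}}

print(is_unified({"E1", "E2", "E3"}, attachments))  # True
print(is_unified({"E1", "E4"}, attachments))        # False
```

In this toy model E4 stays outside the unified set even though it shares its content with E2, because membership is settled by attachment alone. That is the feature of the view that the reply to the just-more-content objection relies on.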
We saw that the connectivity view is different from the common contemporary approaches to phenomenal unity in two respects. First, unlike the Newtonian views, the connectivity view has a Leibnizian structure. Newtonian views ground phenomenal unity in a primitive form of oneness. The connectivity view, in contrast, builds unity from local unity relations and does not appeal to a primitive oneness. I then argued that the connectivity view is dialectically in a better position than its Newtonian rivals in that (a) the view is more substantive, (b) it is in a better position to face the skeptical challenge about phenomenal unity, and (c) it treats synchronic and diachronic unities in a uniform fashion. Finally, I responded to three potential objections to the view.

I want to end the chapter by quickly noting the implications that the choice between the connectivity view and its Newtonian rivals might have for the two other issues surrounding phenomenal unity, namely, the issue of whether consciousness is in some sense necessarily unified and the issue of the psychological underpinnings of unity. It is not uncommon for the Newtonian theorists about unity to hold that consciousness is necessarily phenomenally unified.32 Part of the reason for this is that under these views, the failure of unity seems in some sense inconceivable. The connectivity view, in contrast, gives us a better handle on what it takes for unity to break. Phenomenal unity breaks if there is discontinuity in the stream of consciousness. Arguably, then, one implication of the connectivity view is that the thesis that consciousness is necessarily phenomenally unified is false. The connectivity view also seems to give us a different picture of the cognitive architecture that underlies unity. On a Newtonian view, unity requires a phenomenal oneness that encompasses all experiences. On the assumption that unity structure is mirrored in cognitive architecture, the view seems to require an architecture in which experiences all belong to a single center. Newtonian views thus suggest, borrowing a label from Bayne, an imperial cognitive architecture. The Leibnizian view, in contrast, is compatible with a cognitive architecture in which there is no single center to which all experiences belong. The view is more at home with a federal architecture in which several local centers are minimally connected with each other.33 Thus, if we assume that conscious experiences are often unified and that unity structure is mirrored in cognitive architecture, adopting the connectivity view would have important consequences for the way that we think of the cognitive architecture that underlies our experience.
Acknowledgments

This paper has benefited from discussion with Ned Block, Matthew Boyle, David Chalmers, Güven Güzeldere, Michael Murez, Efrain Lazos, Christopher Peacocke, Axel Seemann, and Susanna Siegel. More than all, I am indebted to Tim Bayne, whose excellent work on unity has influenced my views about the topic.

Notes

1. See Tye (2003, xii) and Bayne (2010, 4–11).
2. See Bayne (2010), Bayne and Chalmers (2003), Dainton (2006), Hurley (1998), Lockwood (1989), Peacocke (1994, 2014), and Tye (2003).

3. See Peacocke (2014, ch. 3), Bayne (2010, ch. 12), and Tye (2003, ch. 6).
4. See Hill (forthcoming) and Hurley (1998). Hurley does not describe her own position as a form of skepticism, but she defends the position that subjective accounts of unity of consciousness are all bound to fail. Her position implies that unity should not be described in phenomenal terms. She would thus be skeptical about our target phenomenon.
5. This, of course, does not mean that there are no difficult cases. Difficulties can emerge in two ways. First, there can be disagreement about how presence to consciousness should be characterized. For example, is introspectability a necessary condition for presence to consciousness? Even if we agree on the requirements for presence to consciousness, there can be disagreement about whether specific cases satisfy this requirement. For example, those who agree on an introspectability criterion might disagree about whether they can find a self or a subject under introspection.
6. The cogency and legitimacy of the use of the notion of grounding and its surrounding notions in formulating metaphysical positions has received ample defense in recent literature. For some of the recent contributions to the issue, see Fine (2010, 2012a, 2012b), Rosen (2010), Schaffer (2009), and the papers in Correia and Schnieder (2012).
7. The structural claim that a set of experiences is unified if and only if its members are parts of one single encompassing experience is compatible with three ways to think about the relationship between the two sides of the biconditional. On one view, unity between the experiences is primitive and grounds the existence of the single encompassing experience and the fact that these experiences are parts of the single encompassing experience. On another view, it is the other way around: mereological facts ground unity. On a third possible view, the two sides ground each other, and neither is more fundamental than the other.
8. Nagel (1974).
9. There are other difficult issues about the individuation of experiences. For example, is it possible for a subject to have two different tokens of the same type of experience at the same moment in time? The response to questions like this partly depends on how we individuate experiences, and there are several options here. However, I do not think that my argument in the chapter will be affected.
10. Tye (2003) argues on this basis that there is something problematic about posing the unity question in terms of experiences. See Bayne (2010, 21–28) for a reply to Tye.
11. See Peacocke (2014) for a defense of the subject-involving nature of experience.
12. Hurley (1998, 99–102) makes a similar claim.

13. Note that the above views can be coextensive with each other. For example, it might be the case that the members of a set of experiences are parts of the same single experience if and only if their contents are parts of the same single content. Nevertheless, the views disagree on which one of the sides of the biconditional is metaphysically more fundamental than the other.
14. See also Bayne and Chalmers (2003).
15. As noted earlier, Tye holds the one-experience view. So he would not formulate his view in the way that I have formulated the one-content view. But as I noted earlier, the grounding question does not disappear by adopting the one-experience view. Tye thus associates the oneness of an experience with the oneness of content and closure under conjunction. He does not explicitly distinguish between grounding and structural questions, but the manner in which he presents his views suggests that on his view the fundamental fact here is the oneness of content, and the closure under conjunction is only a necessary condition for unity.
16. I am not entirely confident about interpreting Peacocke as giving an answer to the grounding question. The main reason for this is that it is not completely clear to me that Peacocke would regard the oneness of a subject as something that is present to consciousness. For, he seems to accept the Humean view that we cannot attend to the self, and it seems plausible that we would have been able to attend to the self if the oneness of a subject were present to consciousness. If this observation is correct, then Peacocke might be holding the view that the oneness of the subject is a subpersonal ground for phenomenal unity and phenomenal unity does not have a ground at the personal level. If so, Peacocke’s view is a primitivist view about phenomenal unity.
17. I borrow this term from Bayne and Chalmers (2003).
18. All of these claims are substantive and can be opposed, but this chapter is not the place to defend them.
19. It is worth pointing out that this binding relation is not the same as the feature-binding relation that is discussed in psychology.
20. Van Gulick (2013) notices a similar point about Bayne’s mereological account but puts it into a different use.
21. Bayne (2010) offers what he calls a tri-partite conception of experiences according to which experiences are individuated on the basis of their subjects, phenomenal properties, and their time of occurrence. In my view, this account pushes the question of individuating experiences to the question of individuating phenomenal properties, and that is an issue about which we may not have clear intuitions.
22. For an argument to the contrary, see Peacocke (2014, ch. 3).
23. The claim that an account has a dialectical shortcoming because it is nonampliative should be understood in the context of the availability of competing more

ampliative and equally plausible accounts of the same phenomenon. In such a context, we have some reason to abandon the nonampliative account in favor of the ampliative one. This is not to say that this reason cannot be overridden by other considerations.
24. Hill (forthcoming).
25. Hill, for example, can find some forms of unity such as spatial unity under introspection. His problem is with finding a unity relation that obtains universally.
26. See Bayne (2010), Peacocke (2014), and Tye (2003).
27. Hurley (1998, 97–102).
28. Here I am using the brain-body notion of a subject under which there is one subject when there is one brain-body. In this sense, there is no contradiction in assuming that one subject has two streams of consciousness.
29. These claims are of course substantive and require proper defense, but this chapter is not the place to do so.
30. In my view, simultaneously experiencing two things is not an experience of simultaneity between them. Neither is successively experiencing two events the experience of succession. So it is not the case that what synchronic experiences present are always experienced as simultaneous, and what successive experiences present are always experienced as successive.
31. Earlier, I said that experiences of relation are not experiences of relations between experiences but experiences of relations between what my experiences represent. One might think that this would make extending the view to emotional experiences or experiences such as moods somewhat difficult. For, it is not completely clear what these experiences represent. However, I do not think that this is a serious worry about the connectivity view. The remark that all experiences of relations are experiences of relations among what experiences represent is presupposing a representationalist view of all experiences. If it turns out that representationalism cannot be extended to some experiences, then we have to say that some experiences of relations are experiences of relations between experiences. Nothing in the connectivity view would change as a result.
32. See Bayne (2010) and Bayne and Chalmers (2003).
33. I also borrow the term “federal” from Bayne (2010).

References

Bayne, T. (2010). The unity of consciousness. New York: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation (pp. 23–58). Oxford: Oxford University Press.

Correia, F., & Schnieder, B. (Eds.). (2012). Metaphysical grounding: Understanding the structure of reality. Cambridge: Cambridge University Press.
Dainton, B. (2006). Stream of consciousness: Unity and continuity in conscious experience. London: Taylor & Francis.
Fine, K. (2010). Some puzzles of ground. Notre Dame Journal of Formal Logic, 51(1), 97–118.
Fine, K. (2012a). Guide to ground. In F. Correia & B. Schnieder (Eds.), Metaphysical grounding: Understanding the structure of reality (pp. 37–80). Cambridge: Cambridge University Press.
Fine, K. (2012b). The pure logic of ground. Review of Symbolic Logic, 5(1), 1–25.
Hill, C. (Forthcoming). Tim Bayne on the unity of consciousness. Analysis Review.
Hurley, S. L. (1998). Consciousness in action. Cambridge, MA: Harvard University Press.
Lockwood, M. (1989). Mind, brain, and the quantum. Oxford: Blackwell.
Nagel, T. (1974). What is it like to be a bat? Philosophical Review, 83(4), 435–450.
Peacocke, C. (Ed.). (1994). Objectivity, simulation, and the unity of consciousness: Current issues in the philosophy of mind. London: British Academy.
Peacocke, C. (2014). The mirror of the world: Subjects, consciousness, and self-consciousness. Oxford: Oxford University Press.
Rosen, G. (2010). Metaphysical dependence: Grounding and reduction. In B. Hale & A. Hoffmann (Eds.), Modality: Metaphysics, logic, and epistemology (pp. 109–136). Oxford: Oxford University Press.
Schaffer, J. (2009). On what grounds what. In D. Chalmers, D. Manley, & R. Wasserman (Eds.), Metametaphysics (pp. 347–383). Oxford: Oxford University Press.
Tye, M. (2003). Consciousness and persons: Unity and identity. Cambridge, MA: MIT Press.
Van Gulick, R. (2013). Phenomenal unity, representation, and the self. Philosophy and Phenomenological Research, 86(1), 209–214.

15 Partial Unity of Consciousness: A Preliminary Defense

Elizabeth Schechter

1 Introduction

Under the experimental conditions characteristic of the “split-brain” experiment, a split-brain subject’s conscious experience appears oddly dissociated, as if each hemisphere is associated with its own stream of consciousness. On the whole, however, split-brain subjects appear no different from “normal” subjects, whom we assume have only a single stream of consciousness. The tension between these impressions gives rise to a debate about the structure of consciousness: the split-brain consciousness debate.1 That debate has for the most part been pitched between two possibilities: that a split-brain subject has a single stream of consciousness, associated with the brain (or with the subject) as a whole, or that she has two streams of consciousness, one associated with each hemisphere. Considerably less attention has been paid to the possibility that a split-brain subject has a single but an only partially unified stream of consciousness, a possibility that has been articulated most clearly by Lockwood (1989) (see also Trevarthen, 1974; Moor, 1982).

The partial unity model of split-brain consciousness is interesting for reasons that extend beyond the split-brain consciousness debate itself. Most saliently, the model raises questions about subjects of experience and phenomenal perspectives, about the relationship between phenomenal structure and the neural basis of consciousness, and about the place for the type/token distinction in folk and scientific psychology. This chapter examines two objections that have been raised to the partial unity model, objections that presumably account for how relatively little attention the model has received. Because I argue that neither of these objections impugns the partial unity model in particular, the chapter constitutes a preliminary defense of the partial unity model, working to show

that it is on par with its clearest contender, a version of the conscious duality model.

2 The Split-Brain Consciousness Debate

The split-brain experimental paradigm typically involves carefully directing perceptual information to a single hemisphere at a time, to the extent possible. (See Lassonde & Ouimet, 2010, for a recent review.) This is relatively simple to understand in the case of tactile perception. Suppose you blindfold a split-brain subject (or in some other way obscure his hands from his sight) and put an object in his left hand, say, a pipe. Since patterned touch information transmits from each hand only to the contralateral (opposite side) hemisphere (Gazzaniga, 2000, 1299), tactile information about the pipe will be sent from the subject’s left hand to his right hemisphere (RH). In a “non-split” subject, the corpus callosum would somehow transfer this information to, or enable access by, the left hemisphere (LH) as well. In the split-brain subject, however, this tactile information more or less stays put in the initial hemisphere that received it. Meanwhile, in a large majority of the population, the right hemisphere is mute. A split-brain subject is therefore likely to say, via his LH, that he cannot feel and doesn’t know what he is holding in his left hand. A few minutes later, however, using the same left hand, and while still blindfolded, the subject can select the object he was holding a minute ago from a box of objects—showing that the object was not only felt but recognized and remembered. The subject may even draw a picture of a pipe, again using the left hand, which is under dominant control of the right hemisphere (Levy, 1969). Visual, auditory, olfactory, pain, posture, and temperature information may all be lateralized, to varying degrees, under some conditions.
What makes such findings interesting for thinking about conscious unity is this: On the one hand, a split-brain subject can respond to stimuli presented to either hemisphere in ways that we think generally require consciousness. On the other hand, a subject can’t respond to stimuli in the integrated way that we think consciousness affords, when the different stimuli are lateralized to different hemispheres (or when a response is elicited not from the hemisphere to which the stimulus was presented, but from the other). For example, a very basic test for the “split-brain syndrome” is a simple “matching” task in which the subject is first required to demonstrate ability to recognize both RH-presented stimuli and LH-presented stimuli by pointing to a picture of the referents of the presented words, by drawing a picture, and so on. After demonstrating this capacity, the subject is

Partial Unity of Consciousness


then finally asked to say whether the two lateralized stimuli are the same or different. In the paradigmatic case, the subject can perform the former, apparently much more complex sort of task, but not the second, apparently simpler task. This is what first suggests (obviously not conclusively) that the hemispheres somehow have different streams of consciousness: after all, I could demonstrate what I was conscious of and you could demonstrate what you were conscious of, without either of us having any idea whether we were conscious of the same thing. Such results notwithstanding, a number of philosophers have defended some kind of unity model (UM) of split-brain consciousness, according to which a split-brain subject (at least typically) has a single stream of consciousness. In the only version of the unity model invariably mentioned in the split-brain consciousness literature, a split-brain subject has a single stream of consciousness whose contents derive exclusively from the left hemisphere. It’s actually not clear that anyone ever defended this version of the model; a couple of theorists (Eccles, 1973, 1965; Popper & Eccles, 1977) are widely cited as having denied RH “consciousness,” but they may have been using the term to refer to what philosophers would call “self-consciousness” (see especially Eccles, 1981). The simple difficulty with that version of the UM is that a lot of RH-controlled behavior so strongly appears to be the result of conscious perception and control. As Shallice once said of RH-controlled performance on the Raven’s Progressive Matrices task (Zaidel, Zaidel, & Sperry, 1981):

If this level of performance could be obtained unconsciously, then it would be really difficult to argue that consciousness is not an epiphenomenon. Given that it is not, it is therefore very likely, if not unequivocally established, that the split-brain right hemisphere is aware. (Shallice, 1997, 264)

Contemporary versions of the unity model (Marks, 1981; Hurley, 1998; Tye, 2003; Bayne, 2008) in fact all assume that conscious contents derive from both hemispheres. I will make this same assumption in this paper.2 The major alternative to the unity model is the conscious duality model (CDM). According to the CDM, a split-brain subject has two streams of consciousness, each of whose contents derive from a different hemisphere. This model appealed particularly to neuropsychologists (e.g., Gazzaniga, 1970; Sperry, 1977; LeDoux, Wilson, & Gazzaniga, 1977; Milner, Taylor, & Jones-Gotman, 1990; Mark, 1996; Zaidel et al., 2003; Tononi, 2004), but several philosophers have defended or assumed it as well (e.g., Dewitt, 1975; Davis, 1997). Since both the CDM and contemporary versions of the UM allow that conscious contents derive from both hemispheres, what is at issue between


them is whether or not RH and LH experiences are unified or co-conscious with each other—that is, whether they belong to one and the same or to two distinct streams of consciousness. Unsurprisingly, there is disagreement about what co-consciousness (or conscious unity) is, and whether there is even any single relation between conscious phenomena that we mean to refer to when speaking of someone’s consciousness as being “unified” (Hill, 1991; Bayne & Chalmers, 2003; Tye, 2003; Schechter, 2013b). It is nonetheless possible to articulate certain assumptions we make about a subject’s consciousness—assumptions concerning conscious unity—that appear to somehow be violated in the split-brain case. As Nagel says, we assume that, “for elements of experience … occurring simultaneously or in close temporal proximity, the mind which is their subject can also experience the simpler relations between them if it attends to the matter” (Nagel, 1971, 407). We might express this assumption by saying that we assume that all of the (simultaneously) conscious experiences of a subject are co-accessible. Marks, meanwhile, notes that we assume that two experiences “belong to the same unified consciousness only if they are known, by introspection, to be simultaneous” (1981, 13). That is, we assume that any two simultaneously conscious experiences of a subject are ones of which the subject is (or can be) co-aware. Finally, we assume that there is some single thing that it is like to be a conscious subject at any given moment, something that comprises whatever multitude and variety of experiences she’s undergoing (Bayne, 2010). We assume, that is, that at any given moment, any two experiences of a subject are co-phenomenal. 
Although the split-brain consciousness debate and this paper are most centrally concerned with co-phenomenality, I will basically assume here that whenever two (simultaneously) phenomenally conscious experiences are either co-aware or co-accessible, then they are also co-phenomenal. (This assumption may be controversial, but its truth or falsity does not affect the central issues under consideration in this chapter, so long as we view these relations as holding of experiences rather than contents; see Schechter, 2013a.) For simplicity’s sake, I will focus only on synchronic conscious unity—the structure of split-brain consciousness at any given moment in time—to the extent possible. Accordingly, I will speak simply of the co-consciousness relation (or conscious unity relation) in what follows.3 Let us say that streams of consciousness are constituted by experiences and structured by the co-consciousness relation.4 According to the unity model of split-brain consciousness, a split-brain subject has a single stream of consciousness: right and left hemisphere experiences are co-conscious, in other words. According to the conscious duality model, co-consciousness


holds intrahemispherically but fails interhemispherically in the split-brain subject, so that the subject has two streams of consciousness, one “associated with” each hemisphere. Despite their disagreements, the CDM and the UM share a very fundamental assumption: that co-consciousness is a transitive relation. In this one respect, these two models have more in common with each other than either of them does with the partial unity model (PUM). The PUM drops the transitivity assumption, allowing that a single experience may be co-conscious with others that are not co-conscious with each other. Streams of consciousness may still be structured by co-consciousness, but it is not necessary that every experience within a stream be co-conscious with every other. In this model, then, conscious unity admits of degrees: only in a strongly unified stream of consciousness is co-consciousness transitive. According to both the UM and the CDM, then, a split-brain subject has some whole number of strongly unified streams of consciousness, while according to the PUM, a split-brain subject has a single but only partly (or weakly) unified consciousness. Note that because there are several possible notions of conscious unity, there are other possible partial unity models. The truth is that conscious unity is (to borrow Block’s [1995] term) a “mongrel concept” (Schechter, 2013b); when we think of what it is to have a “unified” consciousness, we think of a whole host of relations that subjects bear to their conscious experiences and that these experiences bear to each other and to action. Talk of a “dual” consciousness may connote a breakdown of all these relations simultaneously. In reality, though, these relations may not stand or fall all together; in fact, upon reflection, it’s unlikely that they would. 
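The structural contrast among the three models can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: co-consciousness is treated as a symmetric relation over labeled experiences, a "stream" is modeled as a set of experiences linked directly or indirectly by that relation, and the labels `L`, `R`, `S`, `S_L`, and `S_R` (left-hemisphere, right-hemisphere, and shared-content experiences) are hypothetical names introduced here, not terms from the literature.

```python
from itertools import permutations


def symmetric(pairs):
    """Treat co-consciousness as symmetric: (a, b) implies (b, a)."""
    return set(pairs) | {(b, a) for a, b in pairs}


def is_transitive(experiences, pairs):
    """True if, whenever a~b and b~c for distinct a, b, c, also a~c."""
    rel = symmetric(pairs)
    return all(
        (a, c) in rel
        for a, b, c in permutations(experiences, 3)
        if (a, b) in rel and (b, c) in rel
    )


def streams(experiences, pairs):
    """Assumed modeling choice: a stream is a set of experiences that are
    linked, directly or through intermediaries, by co-consciousness."""
    rel = symmetric(pairs)
    remaining, result = set(experiences), []
    while remaining:
        frontier = [remaining.pop()]
        stream = set(frontier)
        while frontier:
            current = frontier.pop()
            linked = {e for e in remaining if (current, e) in rel}
            remaining -= linked
            stream |= linked
            frontier.extend(linked)
        result.append(frozenset(stream))
    return set(result)


# Toy cases (hypothetical labels, one experience per role).
um = (["L", "S", "R"], [("L", "S"), ("S", "R"), ("L", "R")])
cdm = (["L", "S_L", "R", "S_R"], [("L", "S_L"), ("R", "S_R")])
pum = (["L", "S", "R"], [("L", "S"), ("S", "R")])

for name, model in [("UM", um), ("CDM", cdm), ("PUM", pum)]:
    print(name, "transitive:", is_transitive(*model),
          "streams:", len(streams(*model)))
```

On these toy inputs, the unity and duality cases both keep co-consciousness transitive and differ only in the number of streams, whereas the partial unity case yields a single stream in which transitivity fails.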
One intuitive sense of what it means to have a partially unified consciousness, then, is a consciousness in which some of these unity relations still hold, and others do not (Hill, 1991). This is not what I mean by a “partially unified consciousness,” however. In one possible kind of partial unity model, some conscious unity relations, but not others, hold between experiences. In the kind of partial unity model under consideration here, conscious unity relations hold between some experiences, but not between others. This point will be crucial to understanding the choice between the PUM and the CDM.5 The PUM of split-brain consciousness has several prima facie strengths. Most obviously, it appears to offer an appealingly intermediate position between two more extreme models of split-brain consciousness. The UM must apparently implausibly deny failures of interhemispheric co-consciousness; the CDM is apparently inconsistent with the considerable


number of cases in which it is difficult or impossible to find evidence of interhemispheric dissociation of conscious contents. The PUM that I will consider also makes some kind of neurophysiological unity the basis for conscious unity. Against those who would claim that splitting the brain splits the mind, including the conscious mind, some philosophers argued that a putatively single stream of consciousness can be “disjunctively realized” (Marks, 1981; Tye, 2003). Lockwood’s defense of the PUM in contrast appeals explicitly to the fact that the “split” brain is not totally split, but remains physically intact beneath the cortical level: the cortically disconnected right and left hemisphere are therefore associated with distinct conscious experiences that are not (interhemispherically) co-conscious; nonetheless, these are all co-conscious with a third set of subcortically exchanged or communicated conscious contents. Many will be attracted to a model that makes the structure of consciousness isomorphic to the neurophysiological basis of consciousness in this way (Revonsuo, 2000).6 Another significant source of the PUM’s appeal is its empirical sensitivity or flexibility, in a particular sense. Lockwood sought to motivate the PUM in part by considering the possibility of sectioning a subject’s corpus callosum one fiber at a time, resulting in increasing degrees of (experimentally testable) dissociation. Would there be some single fiber that, once cut, marked the transition from the subject’s having a unified to a dual consciousness? Or would the structure of consciousness change equally gradually as did the neural basis of her conscious experience? 
Lockwood implies that nothing but a pre-theoretic commitment to the transitivity of co-consciousness would support the first answer, and simply notes that “there remains something deeply unsatisfactory about a philosophical position that obliges one to impose this rigid dichotomy upon the experimental and clinical facts: either we have just one center, or stream, of consciousness, or else we have two (or more), entirely distinct from each other” (Lockwood, 1989, 86). Lockwood’s thought experiment is in fact not wholly fictitious: callosotomy became routinely performed in stages, with predictable degrees and sorts of dissociation evident following sections at particular callosal locations (e.g., Sidtis et al., 1981). “Partially split” subjects really do seem somehow intermediate between “nonsplit” and (fully) “split-brain” subjects. Surely one appealing characterization of such subjects is that the structure of their consciousness is intermediate between (strongly) unified and (wholly) divided or dual. In light of the apparent strengths of the PUM, it should be puzzling how little philosophical attention it has received. Those who have discussed the


model, however, have not been enthusiastic. Hurley (1994) suggested that there could be no determinate case of partial unity of consciousness; Nagel suggested that even if empirical data suggested partial unity, the possibility would remain inconceivable (and thus unacceptable) from the first-person and folk perspective (1971, 409–410); Bayne (2008, 2010) has questioned whether the model is even coherent. Indeed, Lockwood himself at one point admitted that “in spite of having defended it in print, I am still by no means wholly persuaded that the concept of a merely weakly unified consciousness really does make sense” (1994, 95).7 Of the philosophers just mentioned, Nagel, Bayne, and Lockwood (as well as Dainton, 2000) have been concerned, first and foremost, with what I call the inconceivability challenge. Their charge is, at minimum, that a partially unified consciousness is not possibly imaginable. Hurley’s indeterminacy charge, meanwhile, is that “no … factors can be identified that would make for partial unity” (1998, 175) as opposed to conscious duality. At a glance, these two objections to the PUM look to be in some tension with each other: the indeterminacy challenge suggests that the PUM is in some sense equivalent to (or not distinguishable from) the CDM, while, according to the inconceivability objection, the PUM is somehow uniquely inconceivable. Deeper consideration, however, reveals that the two objections are importantly related. The inconceivability objection is rooted in the fact that there is nothing subjectively available to a subject that makes her consciousness partially unified as opposed to dual; the indeterminacy challenge adds that there is nothing objective that would make it partially unified either. Taken together, these concerns may even imply that there is no such thing as a partial unity model of consciousness. 
Sections 4 and 5 address these twin objections, ultimately arguing that they do not and cannot work against the PUM in the way its critics have thought. The conclusion of the chapter is that the PUM is a distinct model, and one that deserves the same consideration as any other model of split-brain consciousness. In the next section, I will lay out what is most centrally at issue between these models.

3 Experience Types and Token Experiences

The central challenge for the CDM has always been to account for the variety of respects in which split-brain subjects appear to be “unified.” First of all, split-brain subjects don’t seem that different from anyone else: while their behavior outside of experimental conditions isn’t quite normal (Ferguson, Rayport, & Corrie, 1985), it isn’t incoherent or wildly conflicted.


Second of all, even under experimental conditions, bihemispheric conscious contents don’t seem wholly dissociated. Via either hemisphere, for instance, a split-brain subject can indicate certain “crude” visual information about a stimulus presented in a given visual field (Trevarthen & Sperry, 1973; though see also Tramo et al., 1995). Similarly, although finely patterned tactile information from the hand transmits only contralaterally, “deep touch” information (sufficient to convey something about an object’s texture, and whether it is, say, rounded or pointed) transmits ipsilaterally as well. As a result, in such cases, one apparently speaks of what the subject (tout court) sees and feels, rather than speaking of what one hemisphere or the other sees or feels, or of what the subject sees and feels via one hemisphere or the other. Proponents of the CDM, however, have always viewed it as compatible with the variety of respects in which split-brain subjects appear “unified.” Of course a split-brain subject seems to be a single thinker: RH and LH have the same memories and personality by virtue of having the same personal and social history, and so on. And of course split-brain subjects typically behave in an integrated manner: especially outside of experimental situations, the two streams of consciousness are likely to have highly similar contents. In other words, proponents of the CDM long appealed to interhemispheric overlap in psychological types, while maintaining that the hemispheres are subject to distinct token mental phenomena. A primary reason for the persistence of the debate between the CDM and the UM is that proponents of the CDM have readily availed themselves of the type-token distinction in this way. 
Accordingly, the version of the CDM that has been defended by neuropsychologists in particular is one in which a split-brain subject has two entirely distinct streams of conscious experiences, but with many type- (including content-) identical experiences across the two streams. Call this the conscious duality (with some duplication of contents) model, or CDM-duplication. Proponents of the UM have meanwhile sometimes responded by arguing that there is no room for the type-token distinction in this context. (See Schechter, 2010, responding to Marks, 1981, and Tye, 2003, on this point; for a different version of this objection to the CDM, see Bayne, 2010.) At around this point in the dialectic, very deep questions arise about, among other things, the nature of subjects of experience (Schechter, 2013a), and it is not clear how to resolve them. Let’s look at an example. In one experiment, a split-brain subject had an apparently terrifying fire safety film presented exclusively in her LVF (to


her RH). After viewing, V.P. said (via her LH) that she didn’t know what she saw—“I think just a white flash,” she said, and, when prompted further, “Maybe just some trees, red trees like in the fall.” When asked by her examiner (Michael Gazzaniga) whether she felt anything watching the film, she replied (LH), “I don’t really know why but I’m kind of scared. I feel jumpy. I think maybe I don’t like this room, or maybe it’s you. You’re getting me nervous.” Turning to the person assisting in the experiment, she said, “I know I like Dr. Gazzaniga, but right now I’m scared of him for some reason” (Gazzaniga, 1985, 75–76). In this case, there appeared to be a kind of interhemispherically common or shared emotional or affective experience. (And, perhaps, visual experience.) But here the defender of the CDM will employ the type/token distinction: what was common to or shared by V.P.’s two hemispheres was, at most, a certain type of conscious emotional or affective (and perhaps visual) experience—but each hemisphere was subject to its own token experience of that type. Perhaps, for instance, interhemispheric transfer of affect or simply bihemispheric access to somatic representations of arousal meant that each hemisphere generated and was subject to an experience of anxiety while V.P. (or her RH) watched the film—but if so, then there were two experiences of anxiety. Of course, if there really was an RH conscious visual experience of the fire safety film that was not co-conscious with, say, an LH auditory experience of a stream of inner speech that the LH was simultaneously engaging in (“What’s going on over there? I can’t see anything?”), then someone who accepts the transitivity principle has to resort to some kind of strategy like this. If the RH experience and the LH experience are not co-conscious, then they cannot belong to the same stream of consciousness—even if both are co-conscious with an emotional or affective experience of anxiety. 
Because the PUM drops the transitivity principle, however, it can take unified behavior and the absence of conscious dissociation at face value. According to the PUM, the reason V.P. was able to describe, via her left hemisphere, the feeling of anxiety that her RH was (presumably) also experiencing was because V.P. really had a single token experience of anxiety, co-conscious with all of her other token experiences at that time. More generally, wherever the CDM posits two token experiences with a common content (figure 15.1), the PUM posits a single token experience with that content (figure 15.2). To put it differently, where there is no qualitative difference between contents, the PUM posits no numerically distinct experiences.
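The type/token contrast in the V.P. case can be sketched in a few lines. This is a toy illustration, not a model of the experiment: the class `Experience`, the content labels, and the per-hemisphere dictionaries are all assumptions introduced for the example, with token identity represented by object identity and type represented by content.

```python
class Experience:
    """A token experience. Distinct tokens may carry the same content (type)."""

    def __init__(self, content):
        self.content = content


def contents_available(tokens_by_hemisphere):
    """Which contents (types) each hemisphere has access to."""
    return {
        hemisphere: {e.content for e in tokens}
        for hemisphere, tokens in tokens_by_hemisphere.items()
    }


def count_tokens(tokens_by_hemisphere):
    """How many numerically distinct token experiences the model posits."""
    return len({id(e) for tokens in tokens_by_hemisphere.values() for e in tokens})


# CDM-duplication: each hemisphere has its own token experience of anxiety.
cdm = {
    "RH": [Experience("fire safety film"), Experience("anxiety")],
    "LH": [Experience("inner speech"), Experience("anxiety")],
}

# PUM: one token experience of anxiety, shared across the hemispheres.
shared_anxiety = Experience("anxiety")
pum = {
    "RH": [Experience("fire safety film"), shared_anxiety],
    "LH": [Experience("inner speech"), shared_anxiety],
}

print(contents_available(cdm) == contents_available(pum))
print(count_tokens(cdm), count_tokens(pum))
```

In this sketch the two models agree on which contents each hemisphere can access and differ only in how many tokens they count, which is the sense in which the dispute concerns token identity rather than content.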

Figure 15.1 Conscious duality with partial duplication of contents. [Diagram labels: experiences E1, E2, E3, E4; contents A, B, B, C.]

4 The Inconceivability Objection

As Bayne points out (2008, 2010), the inconceivability objection says more than that we cannot imagine what it’s like to have a partially unified consciousness. After all, there may be all kinds of creatures whose conscious experience we cannot imagine (Nagel, 1974) simply because of contingent facts about our own perceptual systems and capacities. According to the inconceivability objection, there is nothing that would even count as successfully imagining what it is like to have partially unified consciousness. Why should this objection face the PUM uniquely? After all, we cannot imagine what it would be like to have (simultaneously) two streams of consciousness, either. This follows from the very concept of co-consciousness: two experiences are co-conscious when there is something it is like to undergo them together. Failures of co-consciousness, in general then, are not the kinds of things for which there is anything that it’s like to be subject to them (see Tye, 2003, 120). (More on subjects of experience below.) As Tye (2003, 120) notes, there is of course a qualified sense in which one can imagine having two streams of consciousness: via two successive acts of imagination. That is, one can first imagine what it’s like to have the one stream of consciousness and then imagine what it’s like to have the other. There is just no single “experiential whole” encompassing both imaginative acts, for the experiences in the two streams aren’t “together” in experience, in the relevant, phenomenological sense. We could say, if we wanted, that having multiple streams of consciousness is sequentially but not simultaneously imaginable. These same remarks apply to the PUM as well, however. Consider figure 15.2 again. We can first imagine what it’s like to undergo experiences E1 and E2 together, and can then imagine what it’s like to undergo E2

Figure 15.2 Partial unity of consciousness. [Diagram labels: experiences E1, E2, E3; contents A, B, C.]

and E3 together. There is just no single “experiential whole” encompassing E1, E2, and E3, because neither E1 and E3 nor their contents, A and C, are together in experience in the relevant, phenomenological sense. Thus having partially unified consciousness is also sequentially if not simultaneously imaginable. On the face of it, then, the inconceivability objection should face the PUM and the CDM equally. The objection concerns what it’s like to be conscious—a subjective matter—and there is nothing in the phenomenology of conscious duality or partial unity to distinguish them. The PUM and the CDM-duplication differ with respect to whether the experience that is carrying the content B and that is co-conscious with the experience that is carrying the content A, is the very same experience as the experience that is carrying the content B that is co-conscious with the experience that is carrying the content C. They differ, that is, with respect to whether the experience that is co-conscious with E1 is the very same (token) experience as the experience that is co-conscious with E3. This is a question about the token identities of experiences, and as Hurley (1998) notes, the identities of experiences are not subjectively available to us.8 The inconceivability objection concerns the phenomenality or subjective properties of experience, but there is no phenomenal, subjective difference between having two streams of consciousness and having a single but only weakly unified stream of consciousness. Why, then, have critics of the PUM—and even its major philosophical proponent (Lockwood, 1994)—found the PUM somehow uniquely threatened by the objection? I think that the reason has to do with personal identity. 
In ordinary psychological thought, the individuation of mental tokens, including conscious experiences, is parasitic upon identifying the subject whose experiences they are, so that if there is a single subject, for example, feeling a twinge of pain at a given time, there is one experience of pain at that time;

Figure 15.3 Conscious duality with partial duplication. [Diagram labels: experiences E1, E2, E3, E4; contents A, B, B, C.]

if there are two subjects feeling (qualitatively identical) pains at that time, then there are two experiences of pain, and so on. The problem is that our thinking about experience is so closely tied to our thinking about subjects of experience that whether or not the “divided” hemispheres are associated with distinct subjects of experience seems just as uncertain as whether or not they share any (token) conscious experiences. Precisely because we ordinarily individuate conscious experiences by assigning them to subjects, one natural interpretation of the CDM has always been that the two hemispheres of a split-brain subject are associated not only with different streams of consciousness but with different subjects of experience (or “conscious selves,” e.g., Sperry, 1985). If that interpretation is correct, then no wonder split-brain consciousness is only sequentially imaginable: when we imagine a split-brain human being’s consciousness, we must in fact imagine the perspectives of two different subjects of experience in turn. The PUM has instead been interpreted as positing a single subject of experience with a single stream of consciousness—but one whose consciousness is not (simultaneously) imaginable. There must at least be two subjective perspectives in the conscious duality case (figure 15.3) because the co-consciousness relation is itself one that appeals to falling within such a perspective. (Think about the origins of this “what it’s like” talk!; Nagel, 1974.) An experience is conscious if and only if it falls within some phenomenal perspective or other; two experiences are co-conscious if and only if they fall within the same phenomenal perspective, if there is some perspective that “includes” them both. Now, either subjects of experience necessarily stand in a one-to-one relation with phenomenal perspectives, or they do not. We might understand subjects of experience in such a way that a subject of experience necessarily has a

Figure 15.4 Partial unity of consciousness. [Diagram labels: experiences E1, E2, E3; contents A, B, C.]

(single) phenomenal perspective at a time. If this is the case, then the CDM posits two subjects of experience, each of whose perspectives is (it would seem) perfectly imaginable. Alternatively we might let go of the connection between subjects of experience and phenomenal perspectives. If so, then the CDM may posit a single subject of experience with two phenomenal perspectives. If we pursue this second course, then we cannot imagine what it is like to be such a subject of experience—but this is unsurprising, since we have already forgone the connection between being a subject of experience and having a phenomenal perspective. As before, however, these remarks apply equally to the PUM. The PUM also posits two phenomenal perspectives, for again failures of co-consciousness—even between two experiences that are mutually co-conscious with a third—mark the boundaries of such perspectives. Only and all those experiences that are transitively co-conscious with each other fall within a single phenomenal perspective (figure 15.4). (As before, the solid lines signify co-consciousness; each dashed oval circumscribes those experiences that fall within a single subjective perspective.) Once again, we can relinquish the connection between being a subject of experience and having a single phenomenal perspective, in which case we can’t imagine what it’s like to be the subject with the partially unified consciousness, but in which case, again, we’ve already forgone the commitment to there being something it’s like to be her. Alternatively, we can insist upon a necessary connection between being a subject of experience and having a phenomenal perspective—but then the PUM must also posit two subjects of experience within any animal that has a partially unified consciousness. And we can imagine the perspective of either of these subjects of experience.9
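The claim that both models yield two phenomenal perspectives can be checked on a toy representation. The sketch below assumes, as one way of reading the passage, that a phenomenal perspective is a maximal set of experiences each of which is co-conscious with every other. The experience labels for the partial unity case follow the text (E2 is co-conscious with both E1 and E3). The pairing of E1 through E4 into streams for the duality case is an assumption made for illustration, since the figure itself is not reproduced here.

```python
from itertools import combinations


def perspectives(experiences, pairs):
    """Maximal sets of pairwise co-conscious experiences (brute force)."""
    rel = {frozenset(p) for p in pairs}

    def mutually_co_conscious(group):
        return all(frozenset(p) in rel for p in combinations(group, 2))

    candidates = [
        frozenset(group)
        for size in range(1, len(experiences) + 1)
        for group in combinations(experiences, size)
        if mutually_co_conscious(group)
    ]
    # Keep only sets that are not strictly contained in another candidate.
    return {c for c in candidates if not any(c < other for other in candidates)}


# Partial unity: E2 is co-conscious with E1 and with E3, but E1 and E3 are not.
partial_unity = perspectives(["E1", "E2", "E3"], [("E1", "E2"), ("E2", "E3")])

# Conscious duality with duplication (assumed pairing of the four experiences).
duality = perspectives(["E1", "E2", "E3", "E4"], [("E1", "E2"), ("E3", "E4")])

print(sorted(sorted(p) for p in partial_unity))
print(sorted(sorted(p) for p in duality))
```

Both cases produce two perspectives. The difference is that in the partial unity case one experience (E2) belongs to both perspectives, whereas in the duality case no experience does.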


Whichever model we accept—that shown in figure 15.3 or in figure 15.4—and whether we identify, for example, the split-brain subject as a whole with a subject of experience or not, the entity to which we would ascribe E1 and E3 in the figures above—the subject in the organismic sense—is not something that has a phenomenal perspective—not in the ordinary sense in which we speak of subjects “having” such perspectives. These remarks suggest an attenuated sense in which the two models can be distinguished on subjective grounds. On the one hand, there is no difference between what it’s like to have a partially unified consciousness versus what it’s like to have two streams of consciousness because there is nothing—no one thing—that it is like to have either of those things. But there is a difference between the models with respect to the role they make for phenomenal perspectives in individuating experiences. Because streams of consciousness are strongly unified, according to the CDM, an experience’s token identity may depend upon the phenomenal perspective that it falls within (or contributes to). The PUM forgoes this dependence: there can be multiple phenomenal perspectives associated with the same stream of consciousness, and a single experience can fall within multiple phenomenal perspectives. The strength of the conceptual connection between experiences and phenomenal perspectives is certainly a consideration that speaks against the PUM. What remains open, however, is whether other considerations could outweigh this one. For the reasons I go on to explain in the next section, I agree with Lockwood that this is at least possible. For now, the important point is that the distinction between subjects of experience and subjective perspectives undercuts the force of the inconceivability objection. Consider figure 15.2 again. 
According to the PUM, the experience that is co-conscious with the experience of A (with E1, in other words) and the experience that is co-conscious with the experience of C (with E3, in other words) is one and the same experience. Since the experience nonetheless contributes to two distinct phenomenal perspectives, there is nothing subjective that makes it true that there is just one experience with that content. It must therefore be an objective fact or feature that makes it the case that the experience that is co-conscious with E1 is one and the same as the experience that is co-conscious with E3. So long as there are properties of experiences that are not subjectively available to us, there is, on the face of it, no reason to think that there could not be any such feature or fact. According to the indeterminacy objection, however, this is just the situation that the PUM is in. That is, there is no fact or feature—subjective or objective—that could make it true that the

Figure 15.5 Conscious duality with partial duplication of contents. [Diagram labels: experiences E1, E2, E3, E4; contents A, B, B, C.]

Figure 15.6 Partial unity of consciousness. [Diagram labels: experiences E1, E2, E3; contents A, B, C.]

experience that is co-conscious with E1 is the experience that is co-conscious with E3. I turn to this objection next.

5 The Indeterminacy Objection

Where the CDM posits two token experiences with a common content, the PUM posits a single token experience with that content. This is where the threat of indeterminacy gets its grip: what would make it the case that a subject had a single token experience that was co-conscious with others that were not co-conscious with each other (figure 15.6)—rather than a case in which the subject had two (or more) streams of consciousness, but with some overlap in contents (figure 15.5)? The conscious duality model and the partial unity model agree that wherever there is a dissociation between contents, there is a failure of co-consciousness between the vehicles or experiences carrying those contents. The models differ with respect to what they say about nondissociated contents: according to the PUM, interhemispherically shared contents are

362

E. Schechter

carried by interhemispherically shared experiences; according to the CDM-duplication, they are not. Neuropsychologists apparently recognized these as distinct possibilities. Sperry, for instance, once commented, “Whether the neural cross integration involved in … for example, that mediating emotional tone, constitutes an extension of a single conscious process [across the two hemispheres] or is better interpreted as just a transmission of neural activity that triggers a second and separate bisymmetric conscious effect in the opposite hemisphere remains open at this stage” (Sperry, 1977, 114). Sperry implies, here, that whether a subject like V.P. (sec. 3) has one or two experiences of anxiety is something we simply have yet to discover.

Hurley (1998), however, suggested that the difficulty of distinguishing between partial unity of consciousness and conscious duality with some duplication of contents is a principled one. According to Hurley, the problem is not at base epistemic, but metaphysical: there is nothing that would make a subject’s consciousness partially unified, as opposed to dual but with some common contents. The PUM thus stands accused, once again, of unintelligibility:

What does the difference between these two interpretations [partial unity of consciousness versus conscious duality with some duplication of contents] amount to? There is no subjective viewpoint by which the issue can be determined. If it is determined, objective factors of some kind must determine it. But what kind? … Note the lurking threat of indeterminacy. If no objective factors can be identified that would make for partial unity as opposed to separateness with duplication, then there is a fundamental indeterminacy in the conception of what partial unity would be, were it to exist. We can’t just shrug this off if we want to defend the view that partial unity is intelligible. (1998, 175)

The difficulty of conceptualizing the difference between partial unity and conscious duality with some duplication of contents is rooted in the purposes to which the type/token distinction is ordinarily put. Generalizations in psychology, whether folk or scientific, are generalizations over psychological types, including contents (Burge, 2009, 248). Mental tokens are just the instantiation of those properties or types within subjects. We assume that two subjects can’t share the same mental token, so if they both behave in ways that are apparently guided by some mental content, we must attribute to each of them a distinct mental token with that content. That is: what entokenings of contents explain is the access that certain “systems” (in ordinary thought, subjects) have to those contents.

The problem is that both the PUM and the CDM-duplication allow that the right and left hemisphere of a split-brain subject have access to some of the same contents. Indeed, while disagreeing about how to individuate tokens, the PUM and the CDM-duplication could in principle be in perfect agreement about which systems have access to which information, and about what role this shared access to information plays in behavioral control. In that case, there would be no predictive or explanatory work, vis-à-vis behavior, for the type/token distinction to do.

Suppose, for the sake of argument, that this is right, and that the two models are predictively equivalent vis-à-vis behavior. I have already argued that they are subjectively indistinguishable as well. Are there any other grounds for distinguishing partial unity from conscious duality with some duplication of contents? The most obvious possibility is that some or other neural facts will “provide the needed objective basis for the distinction” (Hurley, 1998, 175).

In the early days of the split-brain consciousness debate, consciousness was usually assumed to be a basically cortical phenomenon so that the neuroanatomy of the callosotomized brain was taken to support the conscious duality model. Tides have changed, however, and by now the “split” brain, which of course remains physically intact beneath the cortical level, might be taken to provide prima facie support for the claim that split-brain consciousness is partially unified as well. Although my reasons for thinking so differ from hers, I agree with Hurley that the structure of consciousness cannot be read off neuroanatomical structure so straightforwardly. To start with, although subcortical structures are (usually) left intact by split-brain surgery, subcortico-cortical pathways may still be largely unilateral.
Indeed so far as I know, this is largely the case for individual pathways of, for example, individual thalamic nuclei, though subcortico-cortical pathways taken collectively may still ultimately terminate and originate bilaterally.10 Furthermore, although structural connectivity is a good guide to functional connectivity, the latter is what we are really interested in. Now, given how intimately subcortical activities are integrated with cortical activities in the human brain, it is of course natural to hypothesize that the physical intactness of subcortical structures in the “split” brain provides the basis for whatever kind or degree of interhemispheric functional connectivity is needed for conscious unity. On the other hand, one could apparently reason just as well in the opposite direction: given how intimately subcortical activities are integrated with cortical activities, it is reasonable to suspect that a physical (surgical) disruption of cortical activities creates a functional disruption or reorganization of activity even at the subcortical level. Johnston et al. (2008), for instance, found a significant reduction in the coherence of firing activity not just across the two hemispheres of a recently callosotomized subject, but across right and left thalamus, despite the fact that the subject’s thalamus was structurally intact.

What we will ultimately need, in order to determine the side on which the neural facts lie in this debate, is a developed theory of the phenomena of interest (consciousness and conscious unity), including a theory of their physical basis. It is only against the background of such a theory that the relevance of any particular neural facts can be judged, and, of course, only against the background of such a theory that those facts could make it intelligible that the experience that is co-conscious with E1 is the experience that is co-conscious with E3. Suppose, for instance, that we found the neural basis of the co-consciousness relation: suppose we found the neurophysiological relation that holds between neural regions supporting co-conscious experiences, and found that that relation holds between the region supporting consciousness of B on the one hand and the regions supporting consciousness of A and of C on the other. That discovery would weigh in favor of the PUM. But we would first have needed to have a theory of the co-consciousness relation, and we would need to have had some prior if imperfect grip on when experiences are and aren’t co-conscious. Thus, for example, Tononi (2004), who views thalamocortical interactions as a crucial part of the substrate of consciousness, also believes that the split-brain phenomenon involves some conscious dissociation, and this is because Tononi makes the integration of information the basis (and purpose) of consciousness. Behavioral evidence meanwhile strongly suggests that there is more intrahemispheric than interhemispheric integration of information in the split-brain subject.
Depending on whether conscious unity requires some absolute degree of informational integration or instead just some relatively greatest degree, split-brain consciousness could be revealed to have been dual or partially unified.

In her discussion of whether the PUM can appeal to neural facts to defeat the indeterminacy objection, Hurley considers neuroanatomical facts alone. I think there is a dialectical explanation for this: Lockwood himself motivates the PUM by appealing to neuroanatomical facts specifically, and of course the (very gross) neuroanatomy of the “split” brain is relatively simple to appreciate. In the long run though, we will have various facts about neural activity to adjudicate between the PUM and the CDM as well. Consider recent fMRI research investigating the effects of callosotomy on the bilateral coherence of resting state activity. Now as it happens, these studies have thus far yielded conflicting results. Johnston et al. (2008) (cited above) found a significant reduction in the coherence of firing
activity across the two hemispheres following callosotomy, while Uddin et al. (2008) found a very high degree of bihemispheric coherence in a different subject.11 Suppose, however, that one or the other finding were replicated across a number of subjects. This is just the kind of finding that could weigh in favor of one model or the other, assuming some neurofunctional theory of consciousness according to which internally generated, coordinated firing activity across wide brain regions serves as the neural mechanism of consciousness.

Hurley herself has fundamental objections to the notion that neural structure might make it the case that a subject’s consciousness was partially unified. On the basis of considerations familiar from the embodied/extended mind view, she argues that the very same neuroanatomy may be equally compatible with a dual and a unified consciousness. (Though I don’t know if she would say that all neural properties, not just those concerning anatomy, are so compatible!) A discussion of the embodied/extended mind debate would take us too far afield here. Suffice it to say that the position Hurley espouses is controversial from the perspective of the ongoing science of consciousness, and, as for a science of conscious unity, “it seems to me that the physical basis of the unity of consciousness should be sought in whatever we have reason to identify as the physical substratum of consciousness itself” (Lockwood, 1994, 94). Still, whether our best-developed theory of consciousness will necessarily be a theory of the brain is admittedly itself an empirical question.

6 Principles of Conscious Unity

According to the indeterminacy objection, there is nothing that would make it the case that a subject’s consciousness was partially unified. Unfortunately, it is not possible, at present, to respond to the objection by stating what would.
I have argued that if we had an adequate theory of the phenomenon of interest, we could use it to adjudicate the structure of consciousness in hard cases. Because we don’t yet have such a theory, this response, however persuasive in principle, is not fully satisfying at present. I will therefore conclude by offering a very different kind of response to the indeterminacy objection. The basic thought will be that the indeterminacy objection is neutral or symmetric between the PUM and the CDM-duplication: that is, the PUM is no more vulnerable to the objection than is the CDM-duplication. If that is right, then the objection cannot work to rule out the PUM since it can’t plausibly rule out both models simultaneously.

Even on the face of things, it is puzzling that the PUM should be uniquely vulnerable to the indeterminacy objection, since what is purportedly indeterminate is whether a given subject’s consciousness is partially unified or dual with some duplication of contents. In that case, shouldn’t the CDM be just as vulnerable to the objection? Why does Hurley (apparently) think otherwise? Hurley might respond that there are at least hypothetical cases involving conscious dissociation for which the PUM isn’t even a candidate model, cases that are thus determinately cases of conscious duality. These are cases in which there are no contents common to the two streams. Perhaps this suffices to make the CDM invulnerable (or less vulnerable somehow) to the indeterminacy objection. The version of the CDM under consideration here, however—and the version that has been popular among neuropsychologists—is one that does posit some duplicate contents. Moreover, although there may not be any candidate cases of partial unity for which the CDM-duplication is not a possible model as well, there are at least hypothetical cases that look to be pretty strong ones for the PUM. Imagine sectioning just a tiny segment of the corpus callosum, resulting in, say, dissociation of tactile information from the little fingers of both hands, and no more. Now consider a proposed account of the individuation of experiences: for a given content B, there are as many vehicles carrying B as there are “functional sets” of conscious control systems to which that content is made available. What makes a collection of control systems constitute a single functional set, meanwhile, is that they have access to most or all of the same contents. (The prima facie appeal of this account is that it is, I think, consistent with some accounts of the architecture of the mind, according to which all that “unifies” conscious control systems is their shared access to a limited number of contents [Baars, 1988].) 
In the imagined case, in which we section only one tiny segment of the corpus callosum, there is (arguably) a single functional set of conscious control systems, and thus just one vehicle carrying the content B.12, 13

Is there any other reason to think that the indeterminacy challenge faces the PUM uniquely? Hurley’s thought seems to be that the CDM-duplication skirts the indeterminacy challenge by offering a constraint according to which a partially unified consciousness is impossible. The constraint in question is just that co-consciousness is a transitive relation:

What does the difference between these two interpretations [partial unity of consciousness versus conscious duality with some duplication of contents] amount to? … In the absence of a constraint of transitivity, norms of consistency do not here give us the needed independent leverage on the identity of experiences … note the lurking threat of indeterminacy. (Hurley, 1998, 175; emphasis added)

This is a threat, Hurley means, to the intelligibility of the PUM in particular. The transitivity constraint in effect acts as a principle of individuation for the CDM-duplication and rules out the possibility of a partially unified consciousness. If the PUM comes with no analogous constraint or principle of individuation, then the most a proponent of the PUM can do is simply stipulate that a subject has a partially unified consciousness. Such stipulation would of course leave worries about metaphysical indeterminacy intact; the PUM would thus be uniquely vulnerable to the indeterminacy challenge.

There is a constraint that plays an individuating role for the PUM, however, one analogous to that played by the transitivity constraint for the CDM-duplication. For the PUM, the individuating role is played by the nonduplication constraint. This constraint might say simply that, at any moment in time, an animal cannot have multiple experiences with the same content. Such a nonduplication principle falls out of the tripartite account of experiences offered by Bayne (2010), for instance, at least one version of which identifies an experience only by appeal to its content, time of occurrence, and the biological subject or animal to which it belongs. Or the constraint might be formulated in terms of a (prominent though still developing) functional theory of consciousness (Baars, 1988; Dehaene & Naccache, 2001): there is but a single vehicle for each content that is available to the full suite of conscious control systems within an organism. Whatever the ultimate merits of such a nonduplication constraint, it can at least be given a principled defense (see Schechter, 2013a). I cannot see a reason, then, to conclude that the indeterminacy objection faces the PUM uniquely. If that is so, then the objection cannot work in quite the way Hurley suggests.
My reasoning here takes the form of a simple reductio: if the indeterminacy objection makes the PUM an unacceptable model of consciousness, then it should make the CDM-duplication model equally unacceptable, and on the same a priori grounds. Yet a priori grounds are surely the wrong grounds upon which to rule out both the PUM and the CDM-duplication for a given subject: whether there are any animals in whom some but not all conscious contents are integrated in the manner characteristic of conscious unity is surely at least in part an empirical question.
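
The symmetry between the two constraints can be made concrete with a toy model. The following Python sketch is offered only as an illustration and is not part of the argument above. It assumes that experiences are represented by labels (E1 to E4), that each carries one content (A, B, or C) as in figures 15.5 and 15.6, and that co-consciousness is a reflexive, symmetric relation over experiences. All function and variable names are hypothetical. Under those assumptions, the partial unity model satisfies nonduplication but violates transitivity, the duplication model does the reverse, and both yield the same pattern of access to contents.

```python
from itertools import combinations, permutations


def co_conscious(relation, x, y):
    """Treat co-consciousness as reflexive and symmetric (an assumption)."""
    return x == y or frozenset((x, y)) in relation


def is_transitive(contents, relation):
    """True if co-consciousness is transitive over all experience tokens."""
    for x, y, z in permutations(contents, 3):
        if (co_conscious(relation, x, y)
                and co_conscious(relation, y, z)
                and not co_conscious(relation, x, z)):
            return False
    return True


def has_duplicate_contents(contents):
    """True if two distinct tokens carry the same content."""
    carried = list(contents.values())
    return len(carried) != len(set(carried))


def perspectives(contents, relation):
    """Content sets of the maximal groups of mutually co-conscious tokens."""
    groups = []
    tokens = sorted(contents)
    # Visit larger groups first so that smaller subgroups can be skipped.
    for size in range(len(tokens), 0, -1):
        for group in combinations(tokens, size):
            mutual = all(co_conscious(relation, a, b)
                         for a, b in combinations(group, 2))
            if mutual and not any(set(group) <= g for g in groups):
                groups.append(set(group))
    return {frozenset(contents[t] for t in g) for g in groups}


# Partial unity (figure 15.6): one token with content B, shared by both sides.
PUM_CONTENTS = {"E1": "A", "E2": "B", "E3": "C"}
PUM_RELATION = {frozenset(("E1", "E2")), frozenset(("E2", "E3"))}

# Duality with duplication (figure 15.5): two tokens with content B.
CDM_CONTENTS = {"E1": "A", "E2": "B", "E3": "C", "E4": "B"}
CDM_RELATION = {frozenset(("E1", "E2")), frozenset(("E3", "E4"))}

if __name__ == "__main__":
    for name, contents, relation in (
        ("PUM", PUM_CONTENTS, PUM_RELATION),
        ("CDM-duplication", CDM_CONTENTS, CDM_RELATION),
    ):
        print(name,
              "transitive:", is_transitive(contents, relation),
              "duplicates:", has_duplicate_contents(contents),
              "perspectives:", sorted(sorted(p) for p in
                                      perspectives(contents, relation)))
```

In this toy setting each model gives up exactly one constraint, and nothing in the access pattern alone favors one over the other, which is the sense in which the objection applies to both models alike.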

For all the reasons I have discussed, it seems possible that there should be determinate cases of partially unified consciousness. Of course, I have not addressed how we (that is, neuropsychologists) should determine whether a subject has a partially unified stream of consciousness or two streams of consciousness with some duplication of contents. The question is difficult in part because it is, as I have suggested throughout, heavily theoretical rather than straightforwardly empirical. But that is true for many of the most interesting unanswered questions in psychology.

Notes

1. Throughout the chapter I use the term “split-brain subject” (in place of “split-brain patient”) to be synonymous with “split-brain human animal.” I mean the term to be as neutral as possible with respect to personal identity concerns. How many subjects of experience there are within or associated with a split-brain subject will be addressed separately.

2. Marks (1981) and Tye (2003) believe that a split-brain subject usually has one stream of consciousness but occasionally, under experimental conditions involving perceptual lateralization, two. It does not matter here whether we view this as a unity or a duality model. Because Marks and Tye make common contents the basis of conscious unity, their models are interestingly related to the partial unity model, but the version of the partial unity model that I consider also makes some kind of neurophysiological unity the basis of conscious unity, which their models do not.

3. Restricting our attention to synchronic co-consciousness in this way of course yields, at best, a limited view of split-brain consciousness. Moreover, co-accessibility, co-awareness, and co-phenomenality relations are probably more likely to diverge diachronically than synchronically (Schechter, 2012). I still hope that the restricted focus is justified by the fact that the objections to the partial unity model that I treat here don’t particularly concern what’s true across time in the split-brain subject.

4. This way of talking suggests what Searle calls a “building-block” model of consciousness (Searle, 2000; see also Bayne, 2007). If one assumes a unified field model of consciousness, then the distinction between the partial unity model (PUM) and the CDM is, at a glance, less clear, for reasons that will emerge in sec. 4. It nonetheless seems possible to me that the kinds of considerations I discuss in sec. 5 could be used to distinguish partial unity from conscious duality (with some duplication of contents).

5. The two kinds of partial unity models are of course interestingly related, and Hurley (1998), for one, considers a kind of mixed model. Although the objections to the PUM that I discuss here could be raised against either version of the model, I think they emerge most starkly in the context of the second.

6. There is a possible version of the PUM that is (at least on its face) neutral with respect to implementation. I don’t think that’s the version that Lockwood intended (see, e.g., Lockwood, 1994, 93), but nothing hinges on this exegetical claim. A version that is neutral with respect to implementation would be especially vulnerable to the indeterminacy objection (and, thereby, the inconceivability objection), though I suggest in sec. 5 that theoretical constraints and not just neural facts could be brought to bear in support of the PUM.

7. Within the neuropsychological literature on the split-brain phenomenon, the model is occasionally hinted at (e.g., Trevarthen, 1974; Sperry, 1977; Trevarthen & Sperry, 1973), but, interestingly, these writings are on the whole ambiguous, interpretable as endorsing either a model of split-brain consciousness as partially unified or a model in terms of two streams of consciousness with common inputs. Several explanations for this ambiguity will be suggested in this paper.

8. Bayne (2010) disputes this, at least up to a point. See response in Schechter (2013a).

9. The language used in this section implies that we can choose whether and how to revise our concepts, but I don’t mean to commit myself to this (Grice & Strawson, 1956). Perhaps our concept of a subject of experience is basic, even innately specified, and perhaps there just is an essential conceptual connection between it and the concept of a subjective perspective.

10. Certainly this is the case if we read “subcortical” to mean “noncortical,” which most discussions of the role of “subcortical” connections in the split-brain subject appear to do.

11. The subject Johnston et al. (2008) looked at had been very recently callosotomized, while the subject Uddin et al. studied, “N.G.,” underwent callosotomy nearly fifty years ago. One possibility then is that in N.G., other, noncortical structures have come to play the coordinating role that her corpus callosum once played. (Actually N.G. has always been a slightly unusual split-brain subject, but then arguably each split-brain subject is.) A distinct possibility is that the marked reduction in interhemispheric coherence observed by Johnston et al. was simply an acute consequence of undergoing major neurosurgery itself.

12. This particular approach to individuating conscious experiences makes it possible for there to be subjects for whom it is genuinely indeterminate (not just indeterminable) whether they have a dual or a partially unified consciousness. This is because it views the identity of experiences and streams of consciousness as in part a matter of integration, something that comes in degrees. It isn’t clear, for instance, whether a split-brain subject has one or two “functional sets” of conscious control systems. So the structure of split-brain consciousness could be genuinely indeterminate without showing that there are no possible determinate cases of partial unity.

13. It is worth noting that there is in fact some debate about the structure of consciousness in the “normal,” i.e., “nonsplit” case. How certain are we that there won’t turn out to be any failures of co-consciousness in nonsplit subjects? Several psychologists believe that there are (e.g., Marcel, 1993). If we discovered that there were any such failures, my guess is that we would be inclined to conclude that our consciousness was mostly unified, rather than dual—but to admit that our consciousness is mostly unified would be to acknowledge that it is partially not. Thus it is possible that even the normal case will end up being one to which we confidently apply the PUM rather than the CDM-duplication.

References

Baars, B. (1988). A cognitive theory of consciousness. Cambridge: Cambridge University Press.
Bayne, T. (2007). Conscious states and conscious creatures: Explanation in the scientific study of consciousness. Philosophical Perspectives, 21 (Philosophy of Mind), 1–22. Oxford: Wiley.
Bayne, T. (2008). The unity of consciousness and the split-brain syndrome. Journal of Philosophy, 105, 277–300.
Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press.
Bayne, T., & Chalmers, D. (2003). What is the unity of consciousness? In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation (pp. 23–58). Oxford: Oxford University Press.
Block, N. (1995). On a confusion about a function of consciousness. Behavioral and Brain Sciences, 18, 227–287.
Burge, T. (2009). Five theses on de re states and attitudes. In J. Almog & P. Leonardi (Eds.), The philosophy of David Kaplan (pp. 246–316). Oxford: Oxford University Press.
Dainton, B. (2000). Stream of consciousness. London: Routledge.
Davis, L. (1997). Cerebral hemispheres. Philosophical Studies, 87, 207–222.
Dehaene, S., & Naccache, L. (2001). Towards a cognitive neuroscience of consciousness: Basic evidence and a workspace framework. Cognition, 79, 1–37.
Dewitt, L. (1975). Consciousness, mind, and self: The implications of the split-brain studies. British Journal for the Philosophy of Science, 26, 41–47.
Eccles, J. (1965). The brain and the unity of conscious experience: Nineteenth Arthur Stanley Eddington Memorial Lecture. Cambridge: Cambridge University Press.
Eccles, J. (1973). The understanding of the brain. New York: McGraw-Hill.

Eccles, J. (1981). Mental dualism and commissurotomy. Behavioral and Brain Sciences, 4, 105.
Ferguson, S., Rayport, M., & Corrie, W. (1985). Neuropsychiatric observations on behavioural consequences of corpus callosum section for seizure control. In A. Reeves (Ed.), Epilepsy and the corpus callosum (pp. 501–514). New York: Plenum Press.
Gazzaniga, M. (1970). The bisected brain. New York: Appleton-Century-Crofts.
Gazzaniga, M. (1985). The social brain. New York: Basic Books.
Gazzaniga, M. (2000). Cerebral specialization and interhemispheric communication: Does the corpus callosum enable the human condition? Brain, 123, 1293–1326.
Grice, H., & Strawson, P. (1956). In defense of a dogma. Philosophical Review, 65, 141–158.
Hill, C. (1991). Sensations: A defense of type materialism. Cambridge: Cambridge University Press.
Hurley, S. (1994). Unity and objectivity. In C. Peacocke (Ed.), Objectivity, simulation, and the unity of consciousness (pp. 49–77). Oxford: Oxford University Press.
Hurley, S. (1998). Consciousness in action. Cambridge, MA: Harvard University Press.
Johnston, J., Vaishnavi, S., Smyth, M., Zhang, D., He, B., Zempel, J., et al. (2008). Loss of resting interhemispheric functional connectivity after complete section of the corpus callosum. Journal of Neuroscience, 28, 6452–6458.
Lassonde, M., & Ouimet, C. (2010). The split-brain. Wiley Interdisciplinary Reviews: Cognitive Science, 1, 191–202.
LeDoux, J., Wilson, D., & Gazzaniga, M. (1977). A divided mind: Observations on the conscious properties of the separated hemispheres. Annals of Neurology, 2, 417–421.
Levy, J. (1969). Information processing and higher psychological functions in the disconnected hemispheres of human commissurotomy patients. Unpublished doctoral dissertation. California Institute of Technology.
Lockwood, M. (1989). Mind, brain, and the quantum. Oxford: Blackwell.
Lockwood, M. (1994). Issues of unity and objectivity. In C. Peacocke (Ed.), Objectivity, simulation, and the unity of consciousness (pp. 89–95). Oxford: Oxford University Press.
Marcel, A. (1993). Slippage in the unity of consciousness. In G. Bock & J. Marsh (Eds.), Experimental and theoretical studies of consciousness (pp. 168–179). Chichester: John Wiley & Sons.

Mark, V. (1996). Conflicting communicative behavior in a split-brain patient: Support for dual consciousness. In S. Hameroff, A. Kaszniak, & A. Scott (Eds.), Toward a science of consciousness: The first Tucson discussions and debates (pp. 189–196). Cambridge, MA: MIT Press.
Marks, C. (1981). Commissurotomy, consciousness, and unity of mind. Cambridge, MA: MIT Press.
Milner, B., Taylor, L., & Jones-Gotman, M. (1990). Lessons from cerebral commissurotomy: Auditory attention, haptic memory, and visual images in verbal-associative learning. In C. Trevarthen (Ed.), Brain circuits and functions of the mind (pp. 293–303). Cambridge: Cambridge University Press.
Moor, J. (1982). Split-brains and atomic persons. Philosophy of Science, 49, 91–106.
Nagel, T. (1971). Brain bisection and the unity of consciousness. Synthese, 22, 396–413.
Nagel, T. (1974). What is it like to be a bat? Philosophical Review, 83, 435–450.
Popper, K., & Eccles, J. (1977). The self and its brain. New York: Springer International.
Revonsuo, A. (2000). Prospects for a scientific research program on consciousness. In T. Metzinger (Ed.), Neural correlates of consciousness: Empirical and conceptual questions. Cambridge, MA: MIT Press.
Schechter, E. (2010). Individuating mental tokens: The split-brain case. Philosophia, 38, 195–216.
Schechter, E. (2012). The switch model of split-brain consciousness. Philosophical Psychology, 25, 203–226.
Schechter, E. (2013a). The unity of consciousness: Subjects and objectivity. Philosophical Studies.
Schechter, E. (2013b). Two unities of consciousness. European Journal of Philosophy.
Searle, J. (2000). Consciousness. Annual Review of Neuroscience, 23, 557–578.
Shallice, T. (1997). Modularity and consciousness. In N. Block, O. Flanagan, & G. Güzeldere (Eds.), The nature of consciousness (pp. 255–276). Cambridge, MA: MIT Press.
Sidtis, J., Volpe, B., Holtzman, J., Wilson, D., & Gazzaniga, M. (1981). Cognitive interaction after staged callosal section: Evidence for transfer of semantic activation. Science, 212, 344–346.
Sperry, R. (1977). Forebrain commissurotomy and conscious awareness. Journal of Medicine and Philosophy, 2, 101–126.

Sperry, R. (1985). Consciousness, personal identity, and the divided brain. Neuropsychologia, 22, 661–673.
Tononi, G. (2004). An information integration theory of consciousness. BMC Neuroscience, 5, 42.
Tramo, M., Baynes, K., Fendrich, R., Mangun, G., Phelps, E., Reuter-Lorenz, P., et al. (1995). Hemispheric specialization and interhemispheric integration: Insights from experiments with commissurotomy patients. In A. Reeves & D. Roberts (Eds.), Epilepsy and the corpus callosum (Vol. 2, pp. 263–295). New York: Plenum Press.
Trevarthen, C. (1974). Analysis of cerebral activities that generate and regulate consciousness in commissurotomy patients. In S. Dimond & J. Beaumont (Eds.), Hemisphere function in the human brain (pp. 235–263). New York: Halsted Press.
Trevarthen, C., & Sperry, R. (1973). Perceptual unity of the ambient visual field in human commissurotomy patients. Brain, 96, 547–570.
Tye, M. (2003). Consciousness and persons: Unity and identity. Cambridge, MA: MIT Press.
Uddin, L., Mooshagian, E., Zaidel, E., Scheres, A., Margulies, D., Clare Kelly, A., et al. (2008). Residual functional connectivity in the split-brain revealed with resting-state functional MRI. Neuroreport, 19, 703–709.
Zaidel, E., Iacoboni, M., Zaidel, D., & Bogen, J. (2003). The callosal syndromes. In K. M. Heilman & E. Valenstein (Eds.), Clinical neuropsychology 2002 (4th ed., pp. 347–403). New York: Oxford University Press.
Zaidel, E., Zaidel, D., & Sperry, R. (1981). Left and right intelligence: Case studies of Raven’s Progressive Matrices following brain bisection and hemi-decortication. Cortex, 17, 167–186.

16 E pluribus unum: Rethinking the Unity of Consciousness

Robert Van Gulick

Etymology is not always a reliable guide to meaning and even less so to truth, but perhaps there is something to be learned from the fact that the word “conscious” derives from the Latin verb “conscio,” which literally translates as “know together” (con + scio). Indeed, in one archaic use, it could mean knowledge shared among different people. The Oxford English Dictionary (2nd edition, 2000) defines this obsolete use as “sharing knowledge with another” and cites Thomas Hobbes in Leviathan (1651, I. vii. 31) where he wrote, “When two, or more men, know one and the same fact, they are said to be conscious of it,” as well as Robert South slightly later (1693, II.ii.88), “Nothing is to be conceal’d from the other self. To be a friend and to be conscious are terms equivalent.” Being conscious in this sense of knowing together is a mutual or shared mental activity, just as one confides or conspires (literally “breathes together”) with another. We no longer use “conscious” in that way, but perhaps the surviving concept of “conscious” we apply to single individuals retains some sense of being known together, a way in which the very word “conscious” implies some form of unity or integration. The relevant unity would be within one mind or self, but still involve some way in which features or states of mind are shared or integrated.

Consciousness is generally believed to be unified in some important respect, but in what specific ways and to what degree is not as clear. Nor is there agreement about the status of such unity: Is it essential to consciousness as a logical or empirical matter? And if so, how so and why? If not, might unity nonetheless be important to our understanding of consciousness, and how so?
Unity and integration might figure in two distinct but complementary ways in theories of consciousness: either as an explanandum or as an explanans, that is, as a real feature of consciousness that needs to be explained, or as something to which we might appeal in explaining consciousness and its
properties. Indeed, given the complexity of the actual theoretical situation, it might serve as both. As with any complex phenomenon, a theory of consciousness needs to describe, and perhaps model, its many important features and properties. We need a good sense of what consciousness is before we can explain how it can exist or be produced. Unities of various sorts seem likely candidates for inclusion on any adequate list of the properties of consciousness, including representational unity, object unity, and subject unity, as well as introspective, access, and phenomenal unity. Indeed, each of those unities subdivides into yet more specific types. Representational unity might be unity of content or of vehicle, and unity of content in turn can take many forms and degrees. Unity of subject might concern a unified subject of thought or one of action, and each in turn can take many yet more specific forms and degrees of integration. All these various possible unities need to be adequately described or modeled, and each serves as a possible explanandum, a property or feature whose existence and basis needs to be explained by a comprehensive theory of consciousness. Some forms of conscious unity might also serve as an explanans, insofar as we might appeal to one sort of conscious unity to explain another, e.g., explaining phenomenal unity in terms of the representational unity of consciousness (Tye, 2003). Unification and integration of various sorts can also occur at unconscious levels, and some theories try to explain consciousness or its properties in terms of such unconscious unities or integrations. Like conscious unity, unconscious unity comes in many forms both psychological and neural, including representational, spatial, and multimodal unities, as well as many sorts of functional and causal integrations, both within and between modules or subsystems of the mind or brain. 
In answering the “how question,” many theories of consciousness appeal to such nonconscious unities. Indeed, some explain the crucial transition from unconscious mental state to conscious state in terms of such integrative or unifying processes. For example, on Bernard Baars’s global workspace theory (1988, 1997), a specific unconscious mental state becomes conscious when it is brought into that workspace and thus globally “broadcast” for integration with other contentful states in a wide range of different subsystems or modules. Stan Dehaene has further developed the global workspace theory and combined it with a proposed neural model of the brain regions involved in carrying out the relevant integrations (Dehaene & Naccache, 2001).

Integration plays a more direct and essential role in Giulio Tononi’s (2008) integrated information theory of consciousness. On Tononi’s model, a state of a system is conscious just if it has the highest degree of integrated informational content, which Tononi defines in terms of an information-theory-based measure he calls Φ, which depends in part on the degree of interdependence between the states of the system and thus on their integration.

I myself have proposed a model, the Higher Order Global States model (or HOGS), that explains the transition from unconscious to conscious mental state as a matter of its being recruited into the unified global state that constitutes the transient substrate of a subject’s conscious mental stream (Van Gulick, 2004, 2006). Though the HOGS model agrees with the workspace theories of Baars and Dehaene in treating the transition as a matter of increased global integration, it differs in the specific form of self-like unity it proposes.

Thus the unity of consciousness is not one issue or one question. It generates a variety of questions within a problem space defined by the many possible forms of conscious and unconscious integration and their possible explanatory connections. We must determine which types of conscious unity are real and then describe and explain them. As to unconscious forms of unity and integration, they too must be modeled and described. According to many theorists at least, they are likely to play an important role in explaining the “how” of consciousness. Their guiding hypothesis is that consciousness, or at least some of its key features, is realized or produced by underlying nonconscious integrative processes. If so, nonconscious forms of unity and integration may figure as key explanantia in our understanding of consciousness.

Philosophical discussions of the unity of consciousness often concern whether unity of one sort or another is a necessary condition for consciousness, or alternatively whether it is sufficient for it. Both sorts of questions are open to logical as well as empirical readings. If phenomenal unity is a necessary feature of human consciousness, is that a matter of logical necessity, nomic necessity, or merely a contingent fact about the particular structure of human consciousness or its substrate? Some scientific theories of consciousness also assert or imply claims about the necessity or sufficiency of one or another sort of unity or integration. Tononi’s integrated information theory explicitly equates consciousness with having a high Φ value, and global workspace and HOGS models both regard integration into a larger unified state as a necessary element of the transition from unconscious to conscious state.

Unity may bear an important relation to consciousness even if it is not strictly necessary or sufficient. Scientific theories of a complex phenomenon Z often invoke explanatory properties that are in themselves neither necessary nor sufficient for Z, but nonetheless help us understand the nature of Z. The relevant property P, for example, might be a necessary part of some condition S that is sufficient for producing Z, but not uniquely so. Even though there may be other alternative ways to produce or realize Z, doing so in the S-way essentially involves P. For example, consciousness might be realized in one architecture that requires integration of content across modular subsystems; human consciousness may in fact do so. But there nonetheless may be other ways to produce consciousness in systems with a different functional organization—for example, some conditions S* that suffice in systems without a modular structure. Thus, what might be necessary for consciousness in one systemic context might not be required in another.

Even if unity were not necessary for consciousness per se, it might nonetheless be necessary to understanding its function. Given any sort of unity one might initially think essential to consciousness, both clinical evidence and thought experiments may provide reason to believe that some limited cases of consciousness can occur without that form of unity no matter how common it is in ordinary conscious experience. Nonetheless, consciousness may need to be unified in that way to carry out at least some of the functions that make it valuable and adaptive. For example, our normal conscious life involves the unified experience of integrated objects and scenes, and having such experiences surely requires specific forms of representational integration at the conscious and underlying nonconscious levels.
However, we know that patients suffering from perceptive visual agnosia have great difficulty integrating visual stimuli into coherent wholes, though there is no doubt that they have visual experiences. Patients with simultanagnosia, Bálint’s syndrome, cannot see more than one object at a time and thus are incapable of having a unified experience of a scene. Moreover, with unimpaired subjects, it seems possible to have some minimal experience with no integration of object or scene. Imagine having just the experience of a dim flicker that passes so quickly that one cannot say just where it occurred or whether it was of any given color or shape, or the experience of a brief, faint sound whose location and tone one cannot discern. Such stripped-down experience seems possible despite its lack of any parts to integrate or unify. It seems possible to have at least some conscious experiences that do not involve such unity. Thus if we think in terms of necessary conditions, we might conclude that
such unities of object and scene are not essential or central to understanding consciousness. However, that need not follow. Even if consciousness in some pathologically restricted cases lacks such unities, it may be the capacity of consciousness to support and enable such forms of unity and integration that explains why consciousness is important and useful. Enabling and supporting widespread integration in a dynamically unified representation may be one of consciousness’s central powers, even if it can be blocked from doing so in special cases. If so, understanding the nature of consciousness would require explaining how it comes to have that power and exercise it in normal conditions. If that is one of its key functions, then we need to understand what it is about consciousness and its underlying basis that enables it to play that role in normal contexts. The fact that the exercise of that power may be blocked in abnormal cases does not show that its capacity to support such integration is not central to its nature and value. In introducing these issues, I have spoken interchangeably of “unity” and “integration,” and I will continue to do so below. The two notions are closely related, though they may have subtly different associations and convey somewhat different implications. Integration leads us to think in terms of a process, whereas unity may seem more like a basic fact or result. It is also natural to think of integration as admitting of degrees. Unity as well can be treated as a matter of degree, but there is also some pull toward thinking of it as all or none. Once again, etymology is worth noting. The Latin root of “unity” literally invokes the idea of “oneness” from the number “unum.” What is united is one thing; and that might seem like a simple and determinate fact, for example, is there one conscious subject or not? 
“Integration,” which shares its root with “integer,” turns on a slightly different metaphor, that of combining into an integer or whole (a whole as what is literally “untouched”— from “in” meaning not + “tangere”). Especially when one is dealing with complex systems, what constitutes a whole may turn on many factors, and we are accustomed to the idea that new wholes may arise from suitably related or interacting parts. Though the idea of unity as oneness may incline us more to think in terms of what is simple, and integration more in terms of what has an underlying complex basis, each of the two notions can be used to think about the way in which consciousness coheres and how it might result from the coherent interaction of underlying nonconscious processes. Indeed, having both notions may aid our theorizing by offering two slightly different conceptual perspectives on the same basic process.

Table 16.1
Consciousness and unity

1. Nonconscious Unity       2. Relation             3. Conscious Unity
Synchronic/Diachronic       [Sufficiency]           Synchronic/Diachronic
Representational unity      [Necessity]             Representational unity
  Vehicle/Content           [Functional value]        Vehicle/Content
Object unity                [Other relations?]      Object unity
Scene unity                                         Scene unity
Spatial unity                                       Spatial unity
World unity                                         World unity
Multimodal unity                                    Multimodal unity
Subject unity                                       Subject unity
  Thought/Action                                      Thought/Action
Functional unity                                    Phenomenal unity
Neural unity

Before moving on to consider some more specific questions, let me recap the general structure of the problem space. Unity may occur in many conscious and nonconscious forms as shown in table 16.1. Some questions concern the reality of those varying sorts of unity. Which are true of consciousness in general, or of human consciousness? Other questions concern relations between the various sorts of unity, both conscious and nonconscious. Which sorts of unities might be explained fully, or at least partly, in terms of others? Which forms of unity might be necessary or sufficient for consciousness, or human consciousness, or at least important to our understanding of its nature, function, and substrate? Table 16.1 aims to display the general problem space, with column three having a special structure that includes both various types of conscious unity as well as consciousness itself. The table can be read either across the columns or up and down within column 3 (and perhaps within column 1). Reading across, one set of questions can be generated by selecting specific items from each of the three columns: Is unconscious multimodal integration necessary for multimodal conscious integration? Is the unconscious unity of thought and action sufficient for the conscious unity of subject? Is the unconscious representational unity of content sufficient for consciousness itself? Other questions can be generated by applying one of the linking relations from column 2 with various pairings within column 3, either between various specific forms of conscious unity or between such unities and consciousness itself: Is conscious representational unity sufficient for phenomenal unity? Is conscious object unity necessary for
conscious subject unity? Is phenomenal unity necessary for consciousness? Does the unity of the experienced world explain the functional value of consciousness? Some cross pairings generate more interesting and plausible linkages than others, but it is useful to have an overview of the full range of possible connections. Understanding the unity of consciousness requires understanding how its various forms relate to each other and to consciousness itself, as well as to the various sorts of nonconscious unity that may provide their underlying substrate. A comprehensive examination of the full problem space is beyond the scope of the present chapter, and I will instead focus for the remainder of this chapter on a few specific questions about the relations between representational unity, phenomenal unity, and consciousness.

As noted above, the neuroscientist Giulio Tononi has developed an influential theory of consciousness that identifies it with a form of integrated information that his theory defines in purely information theoretic terms (Tononi, 2008). Tononi’s proposal is thus a reductive theory that aims to fully explain consciousness in terms of nonconscious integration. He writes, “The integrated information theory (IIT) of consciousness claims that, at the fundamental level, consciousness is integrated information, and that its quality is given by the informational relationships generated by a complex of elements” (2008). Since the supposed relation is one of identity, relative to table 16.1 Tononi’s theory should be understood as asserting that a type of nonconscious informational unity from column 1 provides both a necessary and sufficient condition (link from column 2) for consciousness itself in column 3.

The key idea in Tononi’s IIT is that of integrated information for which he proposes a mathematical measure he terms “Φ” defined in purely information theoretic terms (with the symbol “Φ” itself composed of two components “I” for information and the circular “O” for integration within a whole). For present purposes, we need not go into the precise mathematical definition of Φ used by IIT. What matters is that Φ concerns the information within a complex or system that results from the interactions and causal dependencies among its parts as opposed to the information in the parts themselves. As Tononi puts it, “In short, integrated information captures the information generated by causal interactions in the whole, over and above the information generated by the parts” (2008, 221). To illustrate his point, Tononi uses the detector in a digital camera as an example of nonintegrated information. The camera’s detector may have five million pixel elements, each with its own information value,
but that information is not integrated; each is an independent unit simply signaling the light value for its small portion of the scene in isolation. By contrast, when one has a conscious visual experience—as when I look at the cluttered desk in front of me—the information about all the parts of the scene is integrated into a unified awareness of the overall environment from a single subjective viewpoint that embodies an understanding of how the parts fit together as well as their connections with all sorts of other stored information, including my knowledge and memory of the various items on my desk. According to Tononi, a complex that embodies such integrated information literally has a point of view, or at least does so if it is not embedded within a yet more integrated complex with a higher Φ value. He writes, “Specifically, a complex X is a set of elements that generate integrated information (Φ > 0) that is not fully contained in some larger set of higher Φ” (2008, 221). A complex, then, can be properly considered to form a single entity having its own, intrinsic “point of view” (as opposed to being treated as a single entity from an outside, extrinsic point of view). The restriction on not being contained within a set of elements with a higher Φ is relevant to the case of the conscious mind or brain. A human brain will contain many subsystems with some significant measure of integrated information such as the visual cortex or auditory cortex, but they do not each have their own separate consciousness or subjective point of view. Only the larger corticothalamic complex of globally integrated elements is conscious and has such a viewpoint, or at least that is the supposed implication of Tononi’s theory.

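The containment restriction just quoted can be made concrete with a small sketch. It does not compute Φ, whose mathematical definition is set aside here. It simply takes Φ values as given and applies the quoted rule: a candidate set counts as a complex if its Φ is above zero and it is not properly contained in a candidate with a higher Φ. The element names, the numerical values, and the function name `find_complexes` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List


def find_complexes(phi: Dict[FrozenSet[str], float]) -> List[FrozenSet[str]]:
    """Apply the quoted rule to a table of assumed phi values.

    A candidate set is kept if its phi is above zero and no other
    candidate properly contains it while having a higher phi.
    """
    complexes = []
    for candidate, value in phi.items():
        if value <= 0:
            continue  # no integrated information, so not a complex
        contained_in_higher = any(
            candidate < other and other_value > value  # '<' is proper subset
            for other, other_value in phi.items()
        )
        if not contained_in_higher:
            complexes.append(candidate)
    return complexes


# Hypothetical elements and phi values, for illustration only.
visual = frozenset({"v1", "v2"})
whole = frozenset({"v1", "v2", "thalamus", "frontal"})

embedded = {visual: 2.0, whole: 5.0}  # visual subsystem inside a larger set
isolated = {visual: 2.0}              # same subsystem, same phi, no larger set

print(find_complexes(embedded))  # only the larger set qualifies
print(find_complexes(isolated))  # the visual subsystem itself qualifies
```

In the first case the visual subsystem is excluded because the larger set has a higher value; in the second the very same subsystem, with the same value, qualifies. On this rule, whether a set counts as a complex depends on what it is embedded in and not only on its own Φ.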
Tononi’s IIT is an interesting attempt to capture our phenomenological intuitions about the integrated nature of consciousness and translate them into a rigorous mathematical theory that might be applied to the brain (though the actual computation of Φ values for any system as complex as a brain is at present not possible in practice). However, as offering a strictly necessary and sufficient condition for consciousness, IIT confronts a number of challenges. First, it is an entirely abstract theory, that is, the conditions it specifies are purely mathematical and highly medium-independent. They might be satisfied by all sorts of physical systems, not just biological ones and electronic systems, but also bizarre realizations of the sorts that have been raised by critics of the computational theory of mind (Searle, 1980). Indeed, John Searle (2013) makes this point himself in reviewing a recent book by another famed neuroscientist, Christof Koch, who is a staunch advocate of IIT (Koch, 2012). Tononi accepts this consequence and does not regard it as a reductio of his system, but others will
surely balk at the idea that being conscious and having a subjective point of view in the “what it is like” sense does not depend on the medium in which a mathematical structure is realized but only on the mathematical structure alone. Tononi’s theory also commits him to a form of panpsychism. Any system that forms a complex with a Φ value that is not itself contained within a system with a higher Φ value will have some sort or degree of consciousness on his theory, and will thus have some sort of point of view. Some might regard this as well as a reductio of his theory, but he again accepts it because he allows that consciousness admits of degrees in quantity and that its quality is determined by the network of elements linked within the complex. Thus he allows that an ant, an amoeba, or even a single isolated photo diode can be conscious in some way; it is just that its consciousness is of a far lower degree in quantity than the consciousness of a human or a mouse because it has a far lower Φ value, and its consciousness will not be similar to ours in quality since it does not involve the same vast network of integrated elements. Despite Tononi’s attempts to make panpsychism acceptable, many may find the implication that photo diodes have any consciousness at all a reason to reject his theory. A third objection concerns IIT’s claims that with a system with overlapping complexes, only the complex with the maximal Φ will be conscious and have a subjective point of view. If a complex C either contains or is contained within a complex C’ with a higher Φ, then only C’ will be conscious and C will not be conscious no matter how high its Φ value. As noted above, this would yield the intuitively correct result for situations like the human brain. 
We take it to have a single conscious point of view perhaps associated with a global pattern of thalamocortical integration, and we do not associate separate points of view with smaller complexes or subsystems such as the visual cortex even though they may have a high Φ value. So far so good for IIT. However, the theory also entails that the visual cortex would have a conscious point of view if it were not contained within the larger, global thalamocortical complex. Thus whether a complex is conscious and whether or not it has a subjective point of view turns not just on intrinsic facts about the level of informational integration internal to the complex, but also on extrinsic facts about what larger complex it may or may not be contained within. Consider two visual cortices, VC1 and VC2, that are exactly alike in all their physical properties and processes over a temporal interval T, but such that VC1 is contained within a complex with a higher Φ value during T while VC2 is not. VC1 and VC2 would have the same Φ value during T, but according to IIT, VC2 would have a conscious
subjective point of view during T while VC1 would not. This seems to conflict both with IIT’s supposed identification of integrated information with consciousness as well as with our strong intuitions about the supervenience of consciousness on a system’s intrinsic properties. It seems odd to suppose that VC1 and VC2 could be physically the same in all respects during T, and yet one of them has a conscious point of view and the other does not. Having such a consequence is not a knockdown argument against IIT, but it does weigh against it. A lot more could be said about IIT, and it continues to be regarded as a serious theory of consciousness by at least some neuroscientists. But overall it does not seem plausible as a reductive proposal to provide necessary and sufficient conditions for consciousness in terms of nonconscious forms of unity and integration.

Issues of a different sort within the problem space of table 16.1 concern relations not between items in columns 1 and 3, but relations solely within column 3. Unlike IIT, these do not involve proposals to explicate consciousness in terms of some form of nonconscious integration but rather raise questions about the relations that various forms of conscious unity bear to each other and to consciousness itself. My discussion will again be selective, with a focus on the relation between conscious representational unity and phenomenal unity, especially as that issue has been recently addressed by Tim Bayne (2010) in The Unity of Consciousness. According to Bayne, phenomenal unity is not identical with conscious representational unity. Though phenomenal unity may typically be accompanied by representational unity, he argues that the former is not merely a special case of the latter. It is a separate and distinct type of conscious unity.
He thus disagrees with representationalists, such as Michael Tye (2003), who argue that phenomenal unity is nothing over and above representational unity and analyze the unity of consciousness in terms of unified representational content. Having developed his nonrepresentational view of phenomenal unity, Bayne goes on to argue for the truth of what he calls the “unity thesis,” namely the claim that all the experiences had by a conscious subject at a time are phenomenally unified. I will argue that careful consideration of the unity thesis reveals that phenomenal unity is in fact a form of representational unity or at least that it depends essentially upon representational unity, though in a way that is different and more indirect than that proposed by standard representationalist accounts. Bayne offers three specifications of what it is for two experiences to be phenomenally unified:

(1) They are subsumed by a single conscious state, i.e., by being parts or components of that single state (Bayne, 2010, 15).
(2) They occur within a single phenomenal field (2010, 11).
(3) They possess a conjoint experiential character (2010, 10).

They are not intended as three distinct conditions but as three ways of explicating one and the same relation. Of the three, the third is the most informative. The meaning of (1) depends crucially upon how one understands the notion of a “conscious state,” and the notion of a state can be interpreted so broadly as to make the unity thesis trivial. One might interpret a subject’s conscious state at a time to be simply the totality of all her conscious states at that moment, just as one might define a subject’s belief state as the totality of all her beliefs. Reading “conscious state” in that way would make it a tautology that all one’s experiences at a time were parts of a single such state. Explication (2) does not fare much better insofar as it relies on the similarly vague metaphor of being part of a single “phenomenal field,” which clearly cannot be interpreted in a spatial sense insofar as it is intended to cover many experiences without any explicit spatial aspect. Thus the notion of conjoint phenomenality invoked by (3) offers the best possibility for unpacking Bayne’s notion of phenomenal unity. The idea is that if two experiences, E1 and E2, are phenomenally unified, there is something that it is like to experience them together, something more than the mere conjunction of experiencing E1 and experiencing E2. As in Bayne’s example, if I smell the coffee in the café and hear the rumba at the same time, there is something it is like for me to experience both of them together. This makes the unity thesis a rather strong claim and less than intuitively obvious.
Given all the many diverse experiences a subject can be having at a given moment, is there always a further experience (or experiential feature) of their conjoint togetherness over and above the mere conjunctive fact that one is having each of them at the same time? It is not at all obvious that there is, especially when one considers peripheral as well as focal experiences. The meaning of the unity thesis also depends on how one interprets the notion of a “conscious subject.” For most of the book, Bayne interprets the subject as the human organism. Thus understood, the unity thesis asserts that all the experiences had by a given human organism at a time are phenomenally unified. Bayne takes this to be an a posteriori claim about actual human beings, and a good part of the book is devoted to considering and replying to empirical examples that might seem to falsify the thesis such
as cases of hypnosis, dissociative identity disorder, and split-brain patients. Only in the last chapter of the book does he turn his attention to a more a priori interpretation of the thesis, which takes the relevant subject to be the “conscious self” rather than the organism. It is by considering this latter version of the thesis that I believe we can see how Bayne’s notion of phenomenal unity essentially depends upon representational unity.

First let us back up a bit, though, and be clear about the basic disagreement between Bayne and representationalists such as Tye who explicate phenomenal unity in terms of the unity of representational content. If one is a representationalist who accepts the so-called transparency of experience, there is a simple direct argument one can give for equating the unity of consciousness with representational unity. According to the transparency thesis, the only properties of our experiences of which we are consciously aware are their contents, that is, how they represent the world as being. They are “transparent” in the sense that we “look right through” them to the represented world without being aware of any intrinsic or nonrepresentational properties of those experiences themselves. Thus if the unity of consciousness is to be a phenomenologically manifest property, that is, one present as part of the “what-it-is-likeness” of experience, then it must be a unity of representational content, a unity of the world as it is represented by experience. If we are aware only of content, then the only unity of which we can be aware is unity of content. Obviously, representationalists like Tye have a lot more to say in defense of their position, but hopefully the basic argument will suffice to motivate the view for present purposes.
Bayne’s contrary view is based in large part on the existence of cases, many of them pathological ones, in which he believes our experience is phenomenally unified despite the presence of profound failures of representational integration. Even in ordinary experience, we fail to make many logical or inferential connections between items we simultaneously experience but which are nonetheless likely to be phenomenally unified in Bayne’s sense. Of course, the representationalist’s claim is not that we make every such connection, only that we make a sufficient number of such connections, e.g., sufficient to form a representation of unified objects possessing multiple properties and relations within unified scenes. Cases of illusion and hallucination might also be raised as objections to the representationalist view of unity since they seem to involve unified experiential states with nonunified contents—the stick looks bent but feels straight. Tye, however, denies there is any problem. He argues that the representational content in such cases is
inconsistent but unified; unity of content need not involve consistency. It requires only that there is a single overall representational state that represents the stick as being both straight and bent (Tye, 2003, 37). In pathological cases, the failures to unify content may be extreme. Neglect patients, agnosics, and schizophrenics may fail to make the most obvious contentful connections, and even the representation of unified objects, spaces, or body parts may be absent. Yet it seems plausible to regard their experience as phenomenally unified. For Bayne, this is further reason to distinguish phenomenal unity from representational unity, but again the representationalist may reply that representational unity does not require the particular sorts of logical connections or integrations that are lacking in such pathological cases. He may argue that his position commits him only to the claim that whatever conscious or phenomenal unity is present is simply a fact about the total content of the subject’s overall representational state, no matter how disorganized or chaotic that content may be. Thus there seems little possibility of resolving the basic dispute on the basis of such evidence. Nonetheless, considering such cases may help to clarify more specific issues about just what sorts and degrees of representational integration are required for phenomenal unity, or even for consciousness itself. Moreover, as noted above, even if a certain type of integration is not strictly necessary for consciousness or for phenomenal unity and is absent (or very limited) in some special cases, it may nonetheless play a major role in explaining the function and value of consciousness. Its presence in normal conscious cases may be essential to understanding what is distinctive and valuable about consciousness. 
Various integrative capacities of consciousness may play a key part in enabling it to fulfill major roles, even if those capacities are not always exercised or are blocked in special cases. Though the empirical evidence about actual failures of integration may not settle the basic dispute, I believe there is another, more a priori route one might follow to show that phenomenal unity of the sort Bayne describes is committed at a deeper level to a form of representational unity on which it essentially depends. Thus even if one stops short of identifying phenomenal unity as just a special case of representational unity, the links between the two may turn out to be tighter than Bayne proposes. At least that is what I hope to show. Recall that Bayne’s unity thesis has two interpretations that depend upon how one interprets the idea of a conscious subject, either as the human organism or as the conscious self. The empirical evidence is relevant largely to the former interpretation, which is an empirical claim

R. Van Gulick

about actual humans—as a factual matter, all the experiences occurring in an actual human at a time are phenomenally unified. The latter interpretation about the conscious self is a more a priori claim, and it is that second version of the thesis that promises to give us a deeper understanding of the link between representational and phenomenal unity. First, as to the empirical version, Bayne defends it against various empirical cases that might seem like counterexamples by offering an interpretation of the data in each instance that is consistent with his thesis. As we just saw, that sometimes involves distinguishing representational integration from phenomenal unity and arguing the latter can be present, even when some forms of the former are absent. Other replies turn on the fact that the unity thesis is a claim about simultaneous phenomenal unity, which is compatible with some measure of disunity across time, a lack of diachronic unity. The interpretations that Bayne gives of the various problem cases seem plausible enough to defend his thesis from refutation with one notable exception: that of split-brain patients. As is well known, in split-brain patients, after the severing of the corpus callosum connecting their two hemispheres, there seem to be at least some times in which phenomenally disunified experiences occur within a single human organism. Each hemisphere is able to act on the basis of experiences to which the other appears to have no access, and the standard view of such cases is that they involve two separate centers of consciousness. Indeed, both Tononi (2008) and Tye (2003) endorse that position. If so, the split-brain cases would refute the unity thesis understood as an empirical claim equating subjects with human organisms. 
Bayne offers an alternative account in terms of a rapid switching model, according to which there are quickly alternating centers of consciousness in the split-brain patients that are distinct and not diachronically unlinked but never simultaneous. If they never occur at the same time, then their distinctness would pose no threat to Bayne’s unity thesis, which is a claim solely about synchronic phenomenal unity. Though it may not be possible to conclusively disprove the switching hypothesis, it seems implausible and somewhat ad hoc. It is the least plausible of Bayne’s various interpretations of the problem cases (see Prinz, 2013, for a similar critique). The split-brain patients seem capable of carrying out independent and contrary actions with their left and right hands at the same time, each of which is complex and nonhabitual to a degree that would indicate conscious control rather than control by a “zombie” system according to Bayne’s own criteria. Though the detailed data may not suffice to rule out rapid switching, it does not seem
to provide evidence to support it. Thus the empirical interpretation of the unity thesis is called into serious question by the split-brain cases. However, from a philosophical point of view, the a priori reading of the thesis that interprets “subjects” as conscious selves may be the more interesting and important claim. That claim, which Bayne addresses in the final chapter of his book (2010, ch. 12), need not conflict with the split-brain cases, since such cases can be viewed as having two conscious selves in one organism, each of whose experiences are phenomenally unified. Indeed that is the view put forward by Tononi (2008), who views the two hemispheres of the split-brain patients as complexes each with maximal Φ value, neither of which is contained within the other and both of which thus have a conscious subjective point of view. In the a priori reading, the unity thesis asserts that all the experiences of a conscious self at a time are phenomenally unified, leaving open the key issue of what counts as a conscious self and how such selves are to be individuated. Bayne argues against both animalist and bundle theory accounts of the self. Discussing Peter van Inwagen’s example of a two-headed dog, which he calls Cerberus (van Inwagen, 1990), Bayne argues convincingly that Cerberus with two disjoint and phenomenally disunified centers of experience—one in each of its two brains—would constitute two selves rather than one, as van Inwagen’s animalist criterion implies. Bayne also offers sound objections to theories that equate selves with mere bundles or streams of experiences, what he calls “naïve phenomenalism.” He notes that they get the ontology of selves wrong, writing that selves “cannot simply be streams of consciousness for selves are things in their own right, whereas streams of consciousness are not—they are modifications of selves” (2010, 281).
Such theories, as he argues, also have difficulty accounting both for the sense in which selves “own” their experiences and for our modal intuitions about how a given self might have a radically different set of experiences yet remain one and the same self (2010, 282–283). Rather than reviewing Bayne’s reasons for rejecting such views, however, I want to focus on the view he supports, that of the self as a virtual entity implicit in the structure of phenomenal intentionality, both because I myself regard that view as the most promising option (Van Gulick, 2004, 2006) and because it allows us to finally see the deep connection between phenomenal and representational unity. The phenomenal, virtual self view is a variant of Daniel Dennett’s theory of the self as the “center of narrative gravity.” That center or point of view on Dennett’s and Bayne’s account is not a character in the story, nor the author of the story, but rather a point
of view implicit in how the parts of the story cohere. They hang together in a way that implies the existence of the relevant observer without needing to explicitly refer to or describe that observer. What is explicit is the story. The point of view itself need not ever be described; rather it is implicit in the narrative stream of experience. Extending the metaphor to the case of conscious experience, the idea is that the self too is a virtual structure, an intentional entity implicit in how our experience coheres as that of a unified subject. Thus it is not a version of the bundle theory or naive phenomenalism. The self is not identical with the stream of experience, but rather an intentional entity implicit in the organization of that stream. Those experiences are unified and coherently connected in their content as if they were the experiences of a single conscious subject, and thus that point of view is implicit in those experiences. They hang together and make sense as the experiences of a single self. Moreover, Bayne argues that each of the experiences has de se intentionality, i.e., its intentionality has an inherently self-referential character that refers each experience to the subject whose experience it is in a direct and nondescriptive way. Given that basic explanation of the virtual self theory, we can now see how it implies a deep connection between phenomenal unity and representational unity. If we understand the unity thesis as an a priori claim about subjects considered as conscious selves, then it asserts that all the experiences had by such a self at a given moment are phenomenally unified. But according to the virtual self theory, whether or not a set of experiences at a given time count as the experiences of a single self will depend on the contentful connections that hold among them. 
Whether or not they imply the existence of a single self, i.e., a single shared experiential point of view, is an intentional fact that depends on what relations of coherence hold among their representational contents. They may fail to be fully integrated in terms of their logical consequences, even some of their obvious logical consequences in the pathological cases, but those contents must at least be integrated so as to imply the existence of a single self as their shared subject. On the virtual self view, the subject unity of consciousness thus depends upon a form of representational or content unity. If one agrees with Bayne and accepts a virtual self view, as I believe one should (Van Gulick, 2004, 2006), then one can give a fairly simple and direct argument linking phenomenal and representational unity:

(P1) The self is an intentional entity implicit in the structure of conscious representations and their integrated contents—contents that are integrated as being from the perspective or point of view of a single self.
(P2) Whether or not a set of conscious representations is integrated in that way—i.e., whether or not there is a virtual self implicit in those representations—depends upon the contents of those representations and how they are linked and integrated, thus on a type of representational unity.
(P3) A set of experiences is phenomenally unified only if they are all experienced from the point of view of one and the same self, only if they are “like something” for one and the same self or subject.
(P4) Therefore, whether two experiences are phenomenally unified ultimately depends on representational facts about whether or not their contents are integrated as implicit parts of one and the same point of view or virtual self.

Indeed, one can extend the argument to show that such representational unity is a necessary condition for conscious experience itself:

(P5) A conscious mental state CM (or experience E) can exist at time t only if there is “something that it is like” to be in the state (or have that experience) at t.
(P6) There can be something that it is like to be in CM (or have experience E) at t only if there is some self or subject for whom it is like some way to be in CM (or have E) at t.
(P7) Therefore, a conscious mental state CM (or experience E) can exist at t only if it is contained within a set of representations whose contents are integrated or unified in a way that implies the existence of a single self or subject.

We can put the latter point in terms of a specific example. There cannot be a conscious pain without some self or subject for whom it is like something to be in or have that pain. But on the virtual self view, the existence of such a subject ultimately depends upon facts about whether the contents of a set of representations are integrated in a way that implies the existence of such a single self or point of view. Thus consciousness per se requires at least some significant measure of representational unity or integration.
Of course, both the basic argument and its extension assume the virtual self view in (P1) and (P2), and that view is far from obvious. Indeed, it is likely a minority view among current views of the self. So the arguments above are perhaps best viewed as conditional arguments that show what follows if one accepts the virtual self theory. Since Bayne seems to do so in his final chapter, he ought to accept the conclusions of both arguments, including the thesis that phenomenal unity depends in a deep way on a type of representational unity. As to others who are more skeptical about the virtual self view, more persuasion will be needed. But I leave that for another time.

References

Baars, B. (1988). A cognitive theory of consciousness. New York: Cambridge University Press.
Baars, B. (1997). In the theater of consciousness: The workspace of the mind. New York: Oxford University Press.
Bayne, T. (2010). The unity of consciousness. Oxford: Oxford University Press.
Dehaene, S., & Naccache, L. (2001). Towards a cognitive neuroscience of consciousness: Basic evidence and a workspace framework. Cognition, 79, 1–37.
Hobbes, T. (1651). Leviathan, or The matter, forme, and power of a common-wealth ecclesiasticall and civill. London: Andrew Crooke.
Koch, C. (2012). Consciousness: Confessions of a romantic reductionist. Cambridge, MA: MIT Press.
Oxford English dictionary (2nd ed.). (2000). Oxford: Oxford University Press.
Prinz, J. (2013). Attention, atomism, and disunity of consciousness. Philosophy and Phenomenological Research, 86, 215–222.
Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–457.
Searle, J. (2013). Can information theory explain consciousness? New York Review of Books, 60.
South, R. (1693). Twelve sermons preached upon several occasions. London: Jonah Bowyer.
Tononi, G. (2008). Consciousness as integrated information: A provisional manifesto. Biological Bulletin, 215, 216–242.
Tye, M. (2003). Consciousness and persons. Cambridge, MA: MIT Press.
Van Gulick, R. (2004). HOGS (higher-order global states)—an alternative higher-order model of consciousness. In R. Gennaro (Ed.), Higher-order theories of consciousness (pp. 67–92). Amsterdam: John Benjamins.
Van Gulick, R. (2006). Mirror-mirror, is that all? In U. Kriegel & K. Williford (Eds.), Self-representational approaches to consciousness (pp. 11–40). Cambridge, MA: MIT Press.
van Inwagen, P. (1990). Material beings. Ithaca, NY: Cornell University Press.

17 Counting Minds and Mental States

Jonathan Vogel

A number of people have undergone surgery that severed the principal connection between the left and right hemispheres of their brains. Their subsequent behavior raises puzzles about what their mental lives are like. These puzzles lead to broader questions about the metaphysics of experience and the structure of consciousness itself.1 Let’s suppose that, ordinarily, the mental life of a human being comprises one unified “stream of consciousness.” The elements of that stream are individual experience tokens with various contents. To fix terminology, experience tokens e and e′ are unified just in case they belong to the same stream. Another important relation between experiences is co-consciousness. To try to be as neutral as possible, we can say that e and e′ are co-conscious just in case they are “experienced together.”2 A straightforward proposal is that experiences are unified just in case they are co-conscious.3 Experiments with split-brain subjects indicate that their mental lives are significantly altered by their surgery. While, for the most part, their conscious experience appears to be normal, there are some striking exceptions. A tactile experience via a subject’s left hand, α, may be divorced from a visual experience via his right eye, γ. Both experiences seem to occur without being co-conscious. There is a further complication. It could be that, at the same time, a siren goes off and the subject is bound to hear it. In that case, α will be co-conscious with an auditory experience of the siren, β, and so will γ. How should we describe the split-brain subject in these circumstances? Does he have one stream of consciousness or two? And how many experiences of the siren does he have? Schechter discusses two competing accounts of the split-brain subject’s mental life. On the partial unity model (PUM), the split-brain subject has a single stream of consciousness, albeit one that exhibits a marked level of dissociation.
The experiences α, β, and γ are unified, even though not all of α, β, and γ are co-conscious (α and γ aren’t).4
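The siren example can be laid out as a small relation, which may make it easier to see exactly where co-consciousness fails. The following is only an illustrative sketch: the names `CO_CONSCIOUS`, `co_conscious`, `pairwise_co_conscious`, and `failing_pairs` are hypothetical, and the relation is assumed to be symmetric and to hold only for the two pairs the example mentions.

```python
from itertools import combinations

# Co-consciousness in the siren example, treated as a symmetric relation.
# Assumption: only the pairs explicitly described in the text are related.
CO_CONSCIOUS = {frozenset({"α", "β"}), frozenset({"β", "γ"})}


def co_conscious(e1: str, e2: str) -> bool:
    """True if the two experiences are 'experienced together'."""
    return frozenset({e1, e2}) in CO_CONSCIOUS


def pairwise_co_conscious(experiences: list) -> bool:
    """True if every pair drawn from the list is co-conscious."""
    return all(co_conscious(a, b) for a, b in combinations(experiences, 2))


def failing_pairs(experiences: list) -> list:
    """The pairs that are not co-conscious."""
    return [(a, b) for a, b in combinations(experiences, 2)
            if not co_conscious(a, b)]


experiences = ["α", "β", "γ"]
print(pairwise_co_conscious(experiences))  # False
print(failing_pairs(experiences))          # [('α', 'γ')]
```

On the straightforward proposal that unity just is co-consciousness, α, β, and γ would not all belong to one stream, since the pair (α, γ) fails the test. The partial unity model counts them as unified all the same.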

The alternative to PUM is the conscious duality model (CDM), according to which the subject hosts two distinct streams of consciousness, each primarily supported by one hemisphere of the subject’s brain. CDM draws support from the principle that co-consciousness is transitive (“transitivity,” for short). That is, if a token experience e is co-conscious with a token experience e′, and e′ is co-conscious with a token experience e″, then e is co-conscious with e″. The thought, roughly, is that if e is experienced together with e′, and e′ is experienced together with e″, then e has to be experienced together with e″.5 According to PUM, there is a single stream of experience which includes (token) experiences α, β, and γ. α is co-conscious with β, and β is co-conscious with γ. If transitivity holds, α must be co-conscious with γ, which isn’t the case. So, PUM is incompatible with transitivity. CDM is able to reconcile transitivity with the experimental finding that α and γ aren’t co-conscious. To do that, CDM posits the existence of two distinct token experiences with the same content, β1 and β2. β1 belongs to one stream of consciousness S1, which contains the members of the set {α, β1, …}. β2 belongs to a different stream of consciousness S2, which contains the members of the set {β2, γ, …}. Within S1, α is co-conscious with β1, and, within S2, β2 is co-conscious with γ. Under these circumstances, transitivity doesn’t require α and γ to be co-conscious. So, transitivity is consistent with CDM. Because of the difference between PUM and CDM in this regard, a commitment to transitivity favors CDM. PUM faces an important criticism due to Susan Hurley. PUM may be taken as a negative thesis, denying that co-consciousness is necessary for unity. But then the advocate of PUM owes us a positive account of unity, and Hurley thinks that there can be no satisfactory account of that sort: “No objective factors can be identified that would make for partial unity.
(Thus) there is a fundamental indeterminacy in the conception of what partial unity would be, were it to exist” (1998, 175). To meet the objection, PUM has to be filled out so that it is unproblematic whether there is a unified stream of consciousness or not.6 Schechter undertakes just that. She entertains a trio of claims:

Claim I: “There is but a single vehicle for each content that is available to the full suite of conscious control systems within an organism” (this vol., 367).

Claim II: “For a given content B, there are as many vehicles carrying B as there are ‘functional sets’ of conscious control systems to which that content is made available” (this vol., 366).

“Functional set” may be glossed this way:

Claim III: “What makes a collection of control systems constitute a single functional set, meanwhile, is that they have access to most or all of the same contents”7 (this vol., 366).

Let’s return to the example discussed above, in which the contents α and β are available to one hemisphere and the contents β and γ are available to the other. Suppose further that there are only two “conscious control systems,” C1 and C2, each based in one hemisphere. Both C1 and C2 have access to the content β. Thus, according to Claim I, there is only one token experience with the content β. This result is inconsistent with CDM, so it points us in the direction of PUM. The trouble is that Claims II and III are inconsistent with Claim I. Suppose again that there are two conscious control systems C1 and C2, each associated with a different hemisphere. Now imagine that the contents available to these systems are very different, though not entirely so. For example: C1 has access to α, β, γ, and δ and C2 has access to δ, ε, ζ, and η. All control systems have access to the content δ, so by Claim I:

(1) There is only one token experience with the content δ.

However, the contents available to C1 and C2 are generally very different. So, according to Claim III:

(2) C1 and C2 don’t constitute one functional set.

From (2) and the description of the case:

(3) More than one functional set has access to the content δ.

From (3) and Claim II:

(4) There is more than one token experience with the content δ.

(1) and (4) are contradictory. The opponent of CDM will want to avoid this outcome, and Schechter suggests another line to take:

Claim IV: “At any moment in time, an animal cannot have multiple experiences with the same content” (this vol., 367).8
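The clash between Claim I and Claims II and III in steps (1) through (4) can be traced in a small sketch. It is offered only as an illustration under stated assumptions: the function names are hypothetical, and “most or all of the same contents” is read, purely for the sake of the example, as an overlap of more than half of the contents available to either system. Nothing in the text fixes that threshold.

```python
from typing import Dict, List, Optional, Set

# Contents available to each conscious control system in the second example.
ACCESS: Dict[str, Set[str]] = {
    "C1": {"α", "β", "γ", "δ"},
    "C2": {"δ", "ε", "ζ", "η"},
}


def share_most_contents(a: Set[str], b: Set[str]) -> bool:
    """Assumed reading of Claim III: more than half of the pooled contents are shared."""
    return len(a & b) / len(a | b) > 0.5


def functional_sets(access: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group systems so that members of a group share most contents (Claim III)."""
    groups: List[Set[str]] = []
    for name in sorted(access):
        for group in groups:
            if all(share_most_contents(access[name], access[m]) for m in group):
                group.add(name)
                break
        else:
            groups.append({name})
    return groups


def vehicles_by_claim_i(content: str, access: Dict[str, Set[str]]) -> Optional[int]:
    """Claim I: one vehicle if the content is available to every system; otherwise silent."""
    if all(content in contents for contents in access.values()):
        return 1
    return None


def vehicles_by_claim_ii(content: str, access: Dict[str, Set[str]]) -> int:
    """Claim II: one vehicle per functional set to which the content is available."""
    return sum(
        1 for group in functional_sets(access)
        if all(content in access[m] for m in group)
    )


print("Functional sets:", functional_sets(ACCESS))
print("Claim I count for δ:", vehicles_by_claim_i("δ", ACCESS))
print("Claim II count for δ:", vehicles_by_claim_ii("δ", ACCESS))
```

With these inputs the two systems fall into separate functional sets, so Claim I yields one vehicle for δ while Claim II yields two, which is the contradiction between (1) and (4). A different threshold for “most” could merge the two systems into one set, so the sketch shows the structure of the objection rather than settling how Claim III should be read.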

Claim IV is a much stronger version of Claim I, and Claim IV by itself is sufficient to rule out CDM. But, as it stands, Claim IV looks like little more than a stipulation that CDM is wrong.9 And it gets worse. Consider the animal described in this passage: “Pushmi-pullyus are now extinct. That means, there aren’t any more. … They had no tail, but a head at each end, and sharp horns on each head.”10 Now suppose a lion roars, which both heads of the pushmi-pullyu hear. It seems that both heads will have auditory experiences of the same type. That is, there will be an animal that has multiple experiences with the same content, contrary to Claim IV.11 Various defensive maneuvers are possible, of course. That the pushmi-pullyu has two completely disconnected brains is an excellent reason to suppose that it has two token experiences with the same content. We might build that in as a permitted exception to Claim IV. But what if the pushmi-pullyu’s nervous system were somewhat different, so that its brains shared a common part? If activity in that common part makes no difference to conscious experience, then it seems we should still say that the pushmi-pullyu is host to two experience tokens with identical contents. A further exception to Claim IV is necessary. Where do the exceptions stop? To settle the issue, the opponent of CDM must come up with a refined neural or functional criterion for determining how many token experiences there are in a given instance. But in that case, Claim IV and its successors would do no work. There is one more possibility to take up. According to PUM, experiences can be unified even though they aren’t co-conscious. There is, however, the ancestral of the co-consciousness relation, call it “co-consciousness*.” E1 and En are co-conscious* just in case E1 and En are co-conscious; or E1 and E2 are co-conscious, and E2 and En are co-conscious; or …. A way to spell out PUM is to say that two experiences are unified just in case they are co-conscious*.12 This proposal allows us to say that, in the example above, (token) experiences α, β, and γ make up one stream of consciousness, even though α and γ aren’t co-conscious. And, it seems, PUM so understood would bring with it no more indeterminacy than CDM does. Apparently, both will be as determinate as the co-consciousness relation is. I have to say that I am suspicious of this version of PUM. For one thing, it seems ad hoc. For another, it is extremely permissive. Two experiences e and e′ could count as unified, despite their being, in an important sense, quite isolated from one another. PUM allows them to be separated by indefinitely many links of co-consciousness between other experiences. Thus, the proponent of PUM might need to set some limit to how etiolated the link between unified experiences can possibly be. But, then, how many
removes are too many? Why set the boundary exactly there? These questions don’t seem to have good answers. To this extent, PUM will introduce a further level of indeterminacy about the number and identity of streams of consciousness, as Hurley had feared. There is an additional difficulty. For the sake of argument, let’s suppose that the maximum number of links between two elements of a stream is three. And imagine that, as things actually are, there is a stream with the following structure (“^” stands for co-consciousness):

α ^ γ ^ δ ^ ε

It is to be understood that there are no relations of co-consciousness besides those explicitly indicated. For example, α and δ aren’t co-conscious. Now, our subject could have had another experience β, which was co-conscious with α and γ, while all the other relations of co-consciousness remained in place. Thus:

α ^ β ^ γ ^ δ ^ ε

But by hypothesis, this four-link chain can’t be a single stream of consciousness. If the facts about co-consciousness are to be respected, there will have to be two streams, involving a duplication of some content or other. This itself is bad news for PUM, if the model is committed to avoiding such duplication. One possibility, “(2 + 2),” is:

α ^ β ^ γ1        γ2 ^ δ ^ ε

Another, “(1 + 3),” is:

α ^ β1        β2 ^ γ ^ δ ^ ε

A host of unwelcome questions arise. What would make it the case that the resulting two streams were (2 + 2) rather than (1 + 3), or vice versa? Both possibilities respect the facts about co-consciousness (with respect to contents). Note that this sort of problem, and potential for indeterminacy, won’t come up for CDM. Given the facts about co-consciousness as stated and the assumption of transitivity, the only possibility is:

α ^ β1        β2 ^ γ1        γ2 ^ δ1        δ2 ^ ε
Let’s return to PUM. Suppose that there is a fact of the matter about whether (2 + 2) or (1 + 3) would be the case if the subject were to have a further experience β. γ actually exists. If experiencing β brought about (2 + 2), would γ be identical to γ1 or to γ2? Neither answer seems right. Should we say, “If γ had been co-conscious with another experience β, γ wouldn’t have existed”? That sounds at least as bad.13 The problem here is worse than the one encountered in describing ordinary fission cases, like the fate of an amoeba that divides. The prospect in this instance is that the existence of γ could depend on whether the subject happens to have an additional experience with the content β, which seems like a wholly extraneous consideration. The difficulty this variant of PUM faces is more like the problem of the Ship of Theseus. In that notorious example, whether the original ship is identical to the repaired ship seems to depend upon whether discarded planks are reassembled—which is a highly unintuitive outcome at best.14 CDM avoids all this trouble, and that is a reason to prefer CDM to PUM in its latest form. Schechter gives us an imaginative and spirited defense of PUM. But despite her efforts, there are still considerable reasons to worry about the cogency of PUM and to prefer CDM. Going beyond what we ought to say about split-brain cases in particular, we see that transitivity is an appealing principle about consciousness and that allowing the possibility of token experiences with the same content may be inevitable.

Notes

1. My thoughts on these topics took shape as reflections on Elizabeth Schechter’s “Partial Unity of Consciousness: A Preliminary Defense” (this vol.), and I’ll present them in that form. See Schechter’s chapter for details about the split-brain experiments and about competing models of the conscious lives of the subjects of the experiments.

2. As Schechter notes, the nature of co-consciousness is a delicate matter. Her preferred formulation is that two experiences are co-conscious insofar as they are “cophenomenal” (this vol., 350).

3. Schechter rejects this equivalence, as we will soon see.

4. Schechter distinguishes PUM from the unity model, according to which the split-brain subject’s experiences are all co-conscious (this vol., 351).

5. See Dainton (2000) for a prominent defense of transitivity.

6. The proponent of PUM may think that some cases are genuinely indeterminate, and a virtue of PUM is that the model can respect that. See Schechter (this vol., 369). Even so, very many cases ought not to count as indeterminate, so the advocate of PUM has work to do.

7. Presumably, some kind of maximality condition has to be added to avoid an unwanted multiplication of functional sets.

8. Schechter goes so far as to say, “Where there is no qualitative difference between contents, the PUM posits no numerically distinct experiences” (this vol., 355). Apart from its own merits or lack thereof, this claim may not sit well with some of Schechter’s other commitments; see note 12.

9. Schechter writes, “If the PUM comes with no analogous constraint or principle of individuation, then the most a proponent of the PUM can do is simply stipulate that a subject has a partially unified consciousness. Such stipulation would of course leave worries about metaphysical indeterminacy intact; the PUM would thus be uniquely vulnerable to the indeterminacy challenge” (this vol., 367).

10. Lofting (2004, 35).

11. Incidentally, examples like the pushmi-pullyu create trouble for the doctrine of animalism that is prominent in the personal identity literature. See Campbell and McMahan (2010).

12. In fact, Schechter herself seems to endorse a position of this sort: “The PUM drops the transitivity assumption, allowing that a single experience may be co-conscious with others which are not co-conscious with each other. Streams of consciousness may still be structured by co-consciousness, but it is not necessary that every experience within a stream be co-conscious with every other” (this vol., 351). However, this view is perfectly consistent with the existence of distinct token experiences with the same contents.

13. David Lewis’s counterpart theory might allow us to maintain that γ could have been both γ1 and γ2. See Lewis (1971). Whether we ought to say that, and whether it is helpful to PUM, is another matter.

14. See Rea (1995).

References

Campbell, T., & McMahan, J. (2010). Animalism and the varieties of conjoined twinning. Theoretical Medicine and Bioethics, 31, 285–301.
Dainton, B. (2000). Stream of consciousness. New York: Routledge.
Hurley, S. (1998). Consciousness in action. Oxford: Oxford University Press.
Lewis, D. (1971). Counterparts of persons and their bodies. Journal of Philosophy, 68, 203–211.
Lofting, H. (2004). The story of Doctor Dolittle. Whitefish, MT: Kessinger Publishing.
Rea, M. (1995). The problem of material constitution. Philosophical Review, 104, 525–552.

Contributors

Tim Bayne
School of Social Sciences, University of Manchester

David J. Bennett
Department of Philosophy, Brown University

Berit Brogaard
Department of Philosophy, University of Miami
Psychology Faculty, Network for Sensory Research, University of Toronto
Director, Brogaard Lab for Multisensory Research, University of Miami

Barry Dainton
Department of Philosophy, University of Liverpool

Ophelia Deroy
Institute of Philosophy, School of Advanced Study, University of London

Frédérique de Vignemont
Institut Jean Nicod

Marc O. Ernst
Professor, Bielefeld University

Richard Held
Department of Brain and Cognitive Sciences, MIT

Christopher S. Hill
Department of Philosophy, Brown University

Geoffrey Lee
Department of Philosophy, UC Berkeley

Kristian Marlow
Associate Director, Brogaard Lab for Multisensory Research, University of Miami
Research Fellow, Initiative on Neuroscience and Law, Baylor College of Medicine

Farid Masrour
Department of Philosophy, Harvard University

Jennifer Matey
Associate Professor of Philosophy, Southern Methodist University

Casey O’Callaghan
Department of Philosophy, Washington University in St. Louis

Cesare V. Parise
Bielefeld University

Kevin Rice
Department of Philosophy, University of Missouri–St. Louis

Elizabeth Schechter
Department of Philosophy, Washington University

Pawan Sinha
Department of Brain and Cognitive Sciences, MIT

Julia Trommershäuser
Brown University

James Van Cleve
University of Southern California

Loes C. J. van Dam
Bielefeld University

Robert Van Gulick
Department of Philosophy, Syracuse

Jonathan Vogel
Department of Philosophy, Amherst College

Jonas Wulff
Max-Planck Institute for Intelligent Systems

Index

Aberrant reentrant processing hypothesis, 50 Access consciousness, 298, 350 Accuracy, sensory, 223 Action, perceptually guided, 79–80 Adaptation, 223 Additive binding, 11 Amygdala, 52 Animalism, 399n11 Armstrong, D. M., 275 Arterberry, M., 207n18 Assumption of unity, 130, 140–142 Atomism (about the structure of experience), 288, 295–300, 306–310 Attention, 37, 42, 43, 44, 45, 46, 47, 52, 56, 59, 61, 62, 63, 64, 129, 130, 136, 142, 145n2, 146n7, 146n9 Audio-visual associations, 183–184, 187 Awareness, 45, 58 Ayer, A. J., 264 Baars, B., 376 Bach-y-Rita, P., 159 Bálint’s syndrome, 378 Banks, M., 6, 9 Bare unity, 331 and ampliativeness, 333–334 and bare unity accounts, 331 and skepticism about unity, 334–335

Bayesian cue combination, 6, 7, 12n6 Bayes’ rule, 5, 213 Bayne, T., 11, 107, 111–112, 115, 120n3, 126, 133, 235–236, 240, 242–243, 246n3, 247n12, 249n13, 259–263, 270–271, 277, 292, 327, 331, 341n16, 349, 350, 353, 355, 356, 367, 368, 369n8, 384–392 Beierholm, V. R., 10 Belief, 59 Bennett, D., 234, 307 Berkeley, G., 193 answer to Molyneux’s question, 196–197 Bias, 12n5, 210 Binding additive, 125, 129–133 atypical, 37 feature, 62 grapheme-color, 46, 53, 55 integrative, 125, 129–133, 134–135, 138–139, 143, 144–145 object- and event-binding unity, 237 Binding parameter, 135–137, 139–141, 144–145 sortal hypothesis, 137, 139–142, 144, 146n9 spatial hypothesis, 137–139, 144 (see also Location) proto-object hypothesis, 142–145


Binding problems, 125, 126–128, 132 encoding problem, 127, 133 parsing problem, 125, 127–128, 131, 133, 135–136, 137, 139–141 recoding problem, 129, 131, 133, 136, 138 weighting problem, 131–133 Blindsight, 63, 64 Block, N., 239, 299, 314 Borton, W., 174, 207n14 Brentano, F., 255–258, 266, 272 inner perception, 256–257 Psychology from an Empirical Standpoint, 255 psychology without a soul, 255 Brightness, 49 Brooks, A., 111 Campbell, John, 135, 137, 139, 142, 146n7, 146n9, 146–147n11 Caudek, C., 12n5 Causal holism, 233–234 Chalmers, D., 111–112, 120n3, 236, 240, 243, 249n13, 259–273, 292, 331 Cheselden, W., 173–174, 201 Cognitive architecture, 341 federal models of, 341 imperial models of, 341 Cognitive penetration, 41, 60, 62, 144 Color, 41, 52 Colombo, M., 10 Common sensibles, 10, 18–20, 130, 135, 140, 238, 259 Complementary information, 209–210, 224–225n1 Conscious duality model, 394–396, 398 Content, 151–161 high-level properties, 152–161 low-level properties, 151–152 of perception, 151–161 Contents, closure under conjunction, 91


Continuum view, 119 Correspondence problem, 11, 215–216, 224 Coupling perceptual, 217 prior, 216–220, 222, 224 uncertainty, 217–218 Cross-modal binding, 106, 109–110, 115 Cross-modal correlation, 216, 221–222 Cross-modal correspondence, 116 Cross-modal illusions, 81 Cross-modality, 41 Cross-modal learning, 182–184, 187 C-systems, 268, 274–277 C-theory, 269 Cutaneous rabbit, 26 Dainton, B., 234, 246n3, 246n4, 258, 263–264, 270, 281, 398n5 Decision rule, 221 Decomposition thesis, 15, 18, 20, 30 Degenaar, M., 201 Dehaene, S., 376 Dennett, D., 389 Deroy, O., 10 Descartes, R., 267–268 Desynchronized speech, 112 Determinate and determinable properties, 292–295 de Vignemont, F., 10, 11, 238–239 DF, 243, 245, 250n23 Diderot, Denis, 194, 195–196, 204 DiLuca, M., 8, 9 Distinctiveness of phenomenal character, 89–90 local, 89 regional, 89–90 Domini, F., 12n5 Dorsal stream, 57 Dorsal visual stream, 57 Dretske, F., 151


Embodied and extended mind views, 365 Encapsulation, 59, 60 Ernst, M. O., 6, 8–10, 11n2, 23 Error signal, 223 Evans, Gareth, 194, 195, 201, 206n9 Experiential holism, 246nn3–4, 247n7 Explaining away, 225n1

Feature binding awareness, 73–75 disruption, 86 empirical evidence, 80–83 illusory, 83–84 and infusion (see Infusion) intermodal, 76–79 intramodal, 75–76 process, 75 Feature dimensions, integral and separable, 99 Feature integration, 63 Feedback, 223 Frame of reference, 128–129, 133, 137–138, 144

Gentaz, E., 202, 207n14 Glenney, B., 207n13 Global workspace theory, 376 Goff, P., 265–266 Goodale, M. A., 241, 243 Hallucination, 49 Heinlein, R., 277 Heterogeneity of objects of sight and touch, 196, 205 Higher-order global states (HOGS) model, 376 Higher-order theory of consciousness, 298 Hill, C., 237, 239–241, 247n10, 250n20, 255–256, 264, 307, 334–335 Hillis, J. M., 6, 9 Hippocampus, 41, 53, 62, 63 Hobbes, T., 375 Holism (about the structure of experience), 234–235, 288, 295–300, 306–310 Hue, 49 Hurley, S., 120n10, 336, 353, 357, 362–367, 368n5, 395 Hyperselves, 273, 277–281

Illusions, 49, 63 Müller-Lyer, 59 Illusory contours, 235 Illusory feature conjunctions, 82, 97n12 Independent estimates, 211 Informational redundancy, 130, 133, 135 Infusion, 92–93 Integrated information theory, 376, 381–384 Intermodal feedback, 297 Intermodal processes, 37

Jackson, C. V., 114 Jackson, F., 275 Johnston, M., 269–273, 277–278 Judgment, perceptual, 79 Kanizsa triangle, 234, 246n5 Knill, D., 3, 6 Koch, C., 382 Kohler, W., 175 Lamme, Victor, 299, 314 Landy, M. S., 6, 9 Lange, F., 255 Learning in multisensory integration, 10, 220–224 of priors, 11n2 Leibniz, G. W., 194 Lewis, D., 278, 280–281, 399n5 Light-from-above prior, 213, 222–223


Likelihood function/distribution, 5, 12n6, 211–214, 217, 219 Linear cue integration, 6 Location, 126, 133, 136–139, 143, 145 Locke, John, 172, 193, 267–268 Lockwood, M., 347, 352, 353, 360, 364, 365, 369n6 Long-term potentiation, 41 LSD, 50 Marks, C., 349, 350, 352, 354, 368n2 Maximum a posteriori estimation, 5, 214, 217 Maximum likelihood estimation, 5, 211–214, 216–217 McGurk effect, 21, 26, 28, 81, 114, 130, 134, 135 Mellor, H., 275 Meltzoff, A., 174, 207n14 Memory, 37, 52, 53, 54, 55 long-term, 41 reactivation model, 41, 53 working, 53, 154–155, 158 Mereological model, 133–135, 146n6 Metamers, perceptual, 214–215, 221 Milner, D. A., 241, 243 Modularity, 59, 60 Molyneux, William, 193 Molyneux’s question, 10, 171–172, 177–178, 187–188, 193–203, 205, 206n9 Motion-induced blindness, 25 Motor-sensory cortex, 57 Movies, multisensory perception of, 85 Multimodality. See also Binding additive multimodality, 125, 129, 133, 134, 136 integrative multimodality, 125, 129, 133, 134–135 minimal, 87–89 unity, 126–127, 129, 133–134, 135


Multisensory fusion, 9, 130, 132–134 Multisensory integration, 22, 25, 109, 110 Nagel, T., 326, 350, 353, 356, 358 Nature vs. nurture, 206n9 Neglect, unilateral, 25, 387 Neo-Humeanism, 107 Neurotransmitters, 40 Nonconceptual, 126, 142–143, 145 Object files, 24–25, 28–30, 82, 97–98n12 Objects, perceptual audio-visual, 81, 94–95, 97–98n12 as mereologically complex individuals, 94 multimodal, 94–95 Object-specific preview effects, 82, 97–98n12 Object unity, 105, 107, 109, 112 O’Callaghan, C., 10, 11, 26 Oddity task, 214, 221 Panpsychism, 383 Pargetter, R., 275 Parietal cortex, 57 Partial integration, 28–30 Partial unity model, 347, 351–370, 393, 395–397, 398n6, 399n9, 399n12 Part-whole relation on experience, 236, 248n11, 291–295 Peacocke, C., 326n11, 327, 327n16 Pearce, K., 207n16 Perception, 41 high-level, 56 low-level, 51, 52 Perceptual content Fregean, 18–20 Russellian, 18–20 Personal identity, 357 Personal vs. external time, 278 Phantom limb, 138–140, 143, 145, 147n13


Phenomenal gradation, 114 Phenomenal holism vs. phenomenal atomism, 262–264 Phenomenal unity, 16–18, 105, 384–392 Phenomenology, 37, 38, 42, 46, 50, 55, 58 Philosophy of mind, 41, 59 Phi phenomenon, 26 Phonemes, 55 Plato, 259, 267 Pop-out, 43, 44, 45, 47, 52, 53, 56 Precision, sensory, 211, 213–214, 223. See also Reliability Prefrontal cortex, 53, 57 Priming, 43 Prinz, J., 309, 314, 388 Prior, E., 275 Prior knowledge, 213–214, 217, 221–224. See also Bayes’ rule coupling, 216–220, 222, 224 (see also Coupling, prior) learning of, 222–223 probability, 214, 217 Prism adaptation, 132, 139, 140, 143, 145 Project Prakash, 176, 193, 202 Property dualism, 295 Proprioception, 131, 132–133, 134, 138–139, 140 Proto-object, 142–145 Pruss, A., 280 Psilocybin, 50 Psychological functioning thesis, 244 Psychological subjects, 242–244, 250n20, 251n26 Psychological unity thesis, 243 Raven’s Progressive Matrices, 349 Raymont, P., 111 Reaction time, 42, 44, 52, 53 Real figure vs. visible figure, 199 Realization, 293–294, 297


Redundant information, 210, 213, 222, 225n1 Reid, T., 193 answer to Molyneux’s question, 197–200, 206n9 Reliability, 4, 6, 211–212, 220–221, 223–224 Representational unity, 376, 384–391 Rescorla, M., 10 Rubber hand illusion, 113, 121n13, 131, 134, 143, 144, 147n13 Russellian monism, 271 Sartre, J.-P., 264 Saunderson, Nicholas, 198–200, 206n8 Schechter, E., 236, 249n19, 250n24, 350, 351, 354, 368n3, 369n8, 394, 398nn1–3, 398n5, 398n8, 399n9, 399n12 Schizophrenia, 387 Searle, J., 288, 299, 382 Segmentation, 182 Sensory conflict, 212, 220–221 Sensory fusion, 214–220 mandatory, 216 Sensory integration breakdown of, 219–220 cost of, 214–215, 221 development of, 222 learning to, 221–222 multisensory, 82, 97n12 window of, 220 Sensory noise, 210–211 Sensory recalibration, 223–224 Sensory weights, 6, 211, 216, 219–222, 224 learning of, 220–221, 224 Series, P., 10 Seydell, A., 6 Shams, L., 10, 128–129, 140, 141 Siegel, S., 233 Simultanagnosia, 378 Singular content, 90–92


Sinha, P., 10, 193 answer to Molyneux’s question, 202–203 questions for, 203–205, 207n14 Size-weight illusion, 224 Sortal concept, 137, 139–142, 144, 146n9 Sound-induced flash illusion, 21, 23, 26 Sound symbolism, 55 South, R., 375 Spatial unity, 238 Spence, C., 107, 115 Sperry, R., 362, 369n7 Split-brain patients, 241, 348–373, 388–389 Statistics of the environment, 213–214, 224 Stein, B. E., 109 Strawson, G., 280 Streri, A., 202, 207n14 Striate cortex, 49 Stroop, 42, 43, 48, 157 Subject unity, 240, 277, 282 Subjective simultaneity, 281 Subsumption, 236 Superadditive effects, 82, 97n12 Switch thesis, 110–111 Synaptic connections, 40 Synesthesia, 37, 152–161 aberrant reentrant processing hypothesis, 41 automaticity, 39, 42, 43, 52 battery, 39, 40 cross-modality, 41 developmental, 50 disinhibited feedback hypothesis, 41, 49 drug-induced, 50 grapheme-color, 37, 43, 44, 46, 50, 52, 56, 60, 61 higher, 152–161 local cross-activation hypothesis, 40, 41, 49


long-term potentiation hypothesis, 41 lower, 152 mechanism, 40 and savantism, 153–158 stability, 39 test-retest reliability, 39, 50 Tactile extinction, 129–130 Tactile-visual sensory substitution (TVSS), 159, 160 Tammet, D., 153–155, 157 Temporal parity principle, 279 Tononi, G., 272, 377, 381–384, 388–389 Touch, 128, 129–130, 131, 132, 137, 142, 143–144, 146n5 Transitivity of co-consciousness, 394, 398 Transparency of experience, 386 Traumatic brain injury, 58 Treisman, A., 125, 127, 137, 145n2, 146n11 Tye, M., 15, 151, 326n10, 327, 327n15, 349, 350, 352, 354, 368n2, 376, 384, 386, 388 Type/token distinction, 245nn1–2, 347, 354, 355, 357, 358, 360, 361, 362, 363 Unity assumption, 22, 23, 29 Unity of consciousness, 323–345 connectivity account of, 328–345 Leibnizian theories of, 323, 332–335 mereological account of, 259–261, 327, 332–335 Newtonian theories of, 323, 332–335 one-content account of, 327, 332–335 one-subject account of, 327, 332–335 relational vs. substantivalist approaches, 257 skepticism about, 334–335 subsumption relation, 259


Unity relation accessibility unity relation, 239–240 constructivist view of, 306–310 as an external relation, 289, 311–314 as an internal relation, 289, 311–314 object and event, binding unity, 237–238 spatial unity, 238–239 subject unity, 240–241 top-down view of, 299, 308–310 van Cleve, J., 10 Van Gulick, R., 300, 376, 389, 390, 391 Van Inwagen, P., 389 Ventral stream, 57 Ventriloquist effect, 21, 23, 213 Virtual self, 389–392 Visible figure vs. real figure, 199 Vision for action, 57 early, 42, 43 low-level, 61, 62 object recognition, 57 Visual cortex, 41, 49, 50, 53, 56, 57, 60, 63 Visual form agnosia, 142, 241–243, 250n21, 250nn23–24, 378, 387 Visual search, 43, 45, 47, 56 Warren, D. H., 113, 115, 130, 132, 133, 140–141 Welch, R. B., 22, 113, 115, 130, 132, 133, 140–141 Yonas, A., 207n18 Zeki, S., 249n16
