English Pages 633 pages: illustrations; 28 cm + 1 stereopticon [642] Year 2012
PERCEIVING IN DEPTH
OXFORD PSYCHOLOGY SERIES
1. The Neuropsychology of Anxiety J. A. Gray
2. Elements of Episodic Memory E. Tulving
3. Conditioning and Associative Learning N. J. Mackintosh
4. Visual Masking B. G. Breitmeyer
5. The Musical Mind J. A. Sloboda
6. Elements of Psychophysical Theory J.-C. Falmagne
7. Animal Intelligence L. Weiskrantz
8. Response Times R. D. Luce
9. Mental Representations Paivio
10. Memory, Imprinting, and the Brain G. Horn
11. Working Memory Baddeley
12. Blindsight L. Weiskrantz
13. Profile Analysis D. M. Green
14. Spatial Vision R. L. DeValois and K. K. DeValois
15. The Neural and Behavioural Organization of Goal-Directed Movements M. Jeannerod
16. Visual Pattern Analyzers N. V. S. Graham
17. Cognitive Foundations of Musical Pitch C. L. Krumhansl
18. Perceptual and Associative Learning G. Hall
19. Implicit Learning and Tacit Knowledge S. Reber
20. Neuromotor Mechanisms in Human Communication D. Kimura
21. The Frontal Lobes and Voluntary Action R. Passingham
22. Classification and Cognition W. K. Estes
23. Vowel Perception and Production B. S. Rosner and J. B. Pickering
24. Visual Stress Wilkins
25. Electrophysiology of Mind Edited by M. D. Rugg and M. G. H. Coles
26. Attention and Memory N. Cowan
27. The Visual Brain in Action D. Milner and M. A. Goodale
28. Perceptual Consequences of Cochlear Damage B. C. J. Moore
29. Perceiving in Depth, Vols. 1, 2, and 3 I. P. Howard with B. J. Rogers
30. The Measurement of Sensation D. Laming
31. Conditioned Taste Aversion J. Bures, F. Bermúdez-Rattoni, and T. Yamamoto
32. The Developing Visual Brain J. Atkinson
33. The Neuropsychology of Anxiety, 2e J. A. Gray and N. McNaughton
34. Looking Down on Human Intelligence J. Deary
35. From Conditioning to Conscious Recollection H. Eichenbaum and N. J. Cohen
36. Understanding Figurative Language S. Glucksberg
37. Active Vision M. Findlay and I. D. Gilchrist
38. The Science of False Memory C. J. Brainerd and V. F. Reyna
39. The Case for Mental Imagery S. M. Kosslyn, W. L. Thompson, and G. Ganis
40. Seeing Black and White Gilchrist
41. Visual Masking, 2e B. Breitmeyer and H. Öğmen
42. Motor Cognition M. Jeannerod
43. The Visual Brain in Action D. Milner and M. A. Goodale
44. The Continuity of Mind M. Spivey
45. Working Memory, Thought, and Action Baddeley
46. What Is Special about the Human Brain? R. Passingham
47. Visual Reflections M. McCloskey
48. Principles of Visual Attention C. Bundesen and T. Habekost
49. Major Issues in Cognitive Aging T. A. Salthouse
PERCEIVING IN DEPTH
VOLUME 2 STEREOSCOPIC VISION
Ian P. Howard
CENTRE FOR VISION RESEARCH
YORK UNIVERSITY
TORONTO
Brian J. Rogers
DEPARTMENT OF EXPERIMENTAL PSYCHOLOGY
OXFORD UNIVERSITY
Oxford University Press, Inc., publishes works that further Oxford University’s objective of excellence in research, scholarship, and education. Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam
Copyright © 2012 by Oxford University Press, Inc. Published by Oxford University Press, Inc. 198 Madison Avenue, New York, New York 10016 www.oup.com Oxford is a registered trademark of Oxford University Press All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press. ____________________________________________ A copy of this book’s Cataloging-in-Publication Data is on file with the Library of Congress.
ISBN: 978-0-19-976415-0 ____________________________________________
987654321 Printed in the United States of America on acid-free paper
CONTENTS OF VOLUME 2
11. Physiology of disparity detection
12. Binocular fusion and rivalry
13. Binocular summation, masking, and transfer
14. Binocular correspondence and the horopter
15. Linking binocular images
16. Cyclopean vision
17. Stimulus tokens for stereopsis
18. Stereoscopic acuity
19. Types of binocular disparity
20. Binocular disparity and depth perception
21. Depth contrast
22. Stereopsis and perceptual organization
23. The Pulfrich effect
24. Stereoscopic techniques and applications
References
Index of cited journals
Portrait index
Subject index
A NOTE ON VIEWING THE STEREOGRAMS
The stereograms presented in this book can be fused with the aid of the prisms provided. The prisms should be held close to the eyes and about 12 cm above the page. The viewer should be parallel to the plane of the page and correctly oriented within that plane. Incorrect orientation shows as an elevation of one image with respect to the other. The act of fusing a side-by-side pair of images by diverging the eyes is known as divergent fusion or uncrossed fusion. The act of fusing images by converging the eyes is convergent fusion or crossed fusion. The prisms fuse the images by divergent fusion. Stereograms may also be free-fused by diverging or converging the eyes. In learning to free-fuse, it helps if the eyes converge on a pencil point held at the correct distance between the stereogram and the eyes. For divergence, one may place a piece of clear plastic over the stereogram and fixate the reflection of a point of light seen beyond the plane of the stereogram. The correct distance can be found by observing how the images move as the pencil is moved in depth. After some practice, readers will find that they can converge or diverge the eyes without an aid. When stereograms are free-fused, one sees each eye’s image on either side of the fused image. The presence of three pictures confirms that correct vergence has been achieved. Free fusion is a skill
well worth acquiring, since it is often the only way to achieve fusion with displays presented at vision conferences or when you have lost your stereoscope. A pair of images has one sign of disparity when fused by convergence and the opposite sign of disparity when fused by divergence. A change in the sign of disparity reverses the apparent depth relationships in the fused image. For stereograms in which the sign of disparity does not matter for the illustration of a phenomenon, only one pair of images is provided. When the effect depends on the sign of disparity, two stereogram pairs are provided—one pair for readers who prefer to converge the eyes, and the other for readers who prefer to diverge. Note that the provided lenses fuse by divergence only. Some stereograms in the book have triple images in a row. These create two side-by-side fused images with opposite signs of disparity, plus flanking monocular images. Therefore, four images are seen when the images are correctly fused. In some cases, it is instructive to compare the image formed by convergent fusion with that formed by divergent fusion. In other cases, only one of the fused images is of interest. In this case, the location of the fused image of interest is indicated by a cross for those who fuse by convergence and by two parallel lines for those who fuse by divergence.
11 PHYSIOLOGY OF DISPARITY DETECTION
11.1 Introduction
11.1.1 Basic terms
11.1.2 Discovery of disparity detectors
11.2 Subcortical disparity-tuned cells
11.2.1 Disparity tuning in the pulvinar
11.2.2 Disparity tuning in the nucleus of the optic tract
11.2.3 Disparity tuning in the superior colliculus
11.3 Disparity detectors in cats
11.3.1 Disparity detectors in areas 17 and 18
11.3.2 Disparity detectors in higher visual areas of cats
11.4 Disparity detectors in primate V1
11.4.1 Disparity tuning functions
11.4.2 Number and homogeneity of detectors
11.4.3 Position- and phase-disparity detectors
11.4.4 Detection of vertical disparity
11.4.5 Orientation and disparity tuning
11.4.6 Disparity tuning and eye position
11.4.7 Disparity in contrast-defined stimuli
11.4.8 Dynamics of disparity detectors
11.5 Disparity detection in higher visual centers of primates
11.5.1 Disparity detectors in V2 and V3
11.5.2 Disparity detectors in the dorsal stream
11.5.3 Disparity detectors in the ventral stream
11.5.4 Parvo- and magnocellular disparity detectors
11.6 Higher-order disparities
11.6.1 Detection of horizontal disparity gradients
11.6.2 Detection of vertical disparity gradients
11.6.3 Spatial modulations of disparity
11.6.4 Joint tuning to disparity and motion
11.6.5 Joint spatial and temporal disparities
11.7 Evoked potentials and stereopsis
11.8 PET, fMRI, and stereopsis
11.8.1 Stationary stimuli
11.8.2 Motion in depth
11.9 Detection of midline disparity
11.9.1 Effects of midline section of the chiasm
11.9.2 Effects of callosectomy
11.10 Models of disparity processing
11.10.1 Energy models
11.10.2 Neural network models

11.1 INTRODUCTION
11.1.1 BASIC TERMS
The term “stereoscopic vision” means literally, “solid sight.” Strictly speaking, it refers to the visual perception of the 3-D structure of the world, when seen by one eye or by two. However, the term is generally used to refer to 3-D depth perception arising from binocular disparities.
Several terms referring to binocular vision are in common use, but their meanings vary from author to author. Strictly speaking, all animals with two eyes have binocular vision. Even animals with laterally placed eyes integrate the information from the two eyes to form a coherent representation of the field of view. Also, the field of view of almost all mammals has some region in which the monocular fields overlap. However, the term “binocular vision” is usually reserved for animals possessing a large area of binocular overlap within which differences between the images are used to code depth.
A monocular stimulus is a distal display seen by only one eye because (a) the other eye is closed or absent, (b) the stimulus falls outside the field of view of the other eye, or (c) the stimulus is occluded to the other eye by a nearer stimulus. A binocular stimulus is one seen at the same time by both eyes.
The term “dichoptic” was originally used to describe the well-separated eyes of insects in contrast to holoptic eyes, which have overlapping visual fields. The term “dichotic” was coined by Stumpf (1916) to denote the stimulation of each ear by a distinct sound. By analogy, the term “dichoptic” has come to mean stimulation of the two eyes by distinct distal stimuli (see Wade and Ono 2005). A dichoptic stimulus consists of distinct distal stimuli, one presented to one eye and one to the other, which an experimenter can control independently.
There are two basic procedures for gaining dichoptic control. The first is to present distinct stimuli to the two eyes in a stereoscope or by an equivalent procedure such as free fusion. The other procedure is to place different filters or lenses in front of the two eyes. Dichoptic stimuli usually differ in some defined way specified by an experimenter. The difference may be (a) a disparity of position, size, or orientation between parts or the whole of the stimuli, or (b) a difference in luminance, contrast, color, shape, or motion.
The term “dioptic stimulus” has been used to mean a pair of identical stimuli in a stereoscope, in contrast to dichoptic stimuli, which differ in some respect (Gulick and Lawson 1976). In the masking literature, a dichoptic stimulus is one in which a mask and a test stimulus are shown to different eyes. In a dioptic stimulus they are shown to both eyes, and in a monoptic stimulus they are shown to one eye. The term “monoptic depth” has been used to denote an impression of depth created by a single eccentric stimulus in one eye (Section 17.6.5).
11.1.2 DISCOVERY OF DISPARITY DETECTORS
Before the 1960s many scientists, including Helmholtz, believed that stereopsis does not involve conjunction of inputs from the two eyes at an early stage (Section 2.10.5). Ramón y Cajal (1911) proposed that inputs from corresponding retinal regions converge on what he called “isodynamic cells” and that this forms the basis of unified binocular vision. This idea was confirmed when Hubel and Wiesel (1959, 1962) reported that cells in the cat’s visual cortex receive inputs from the two eyes and that the receptive fields of these binocular cells occupy corresponding positions in the two eyes.
Figure 11.1. Horace B. Barlow. Born in England in 1921. He graduated from Trinity College, Cambridge. He was a research fellow at Trinity College between 1950 and 1954 and a lecturer at King’s College, Cambridge, between 1954 and 1964. Between 1964 and 1973 he was professor of physiological optics and physiology at the University of California at Berkeley. He then returned to the physiological laboratory at Cambridge University as a Royal Society Research Professor. He became a fellow of the Royal Society of London in 1969.
Figure 11.2. Colin Blakemore. Born in Stratford-upon-Avon, England in 1944. He obtained a B.A. in medical sciences from Cambridge University in 1965 and a Ph.D. in physiological optics from Berkeley in 1968. He also holds a Sc.D. (Cantab) and D.Sc. (Oxon). He was lecturer in physiology at Cambridge from 1972 to 1979, and then Waynflete Professor of Physiology at Oxford University. Between 1990 and 2003 he was director of the McDonnell-Pew Center for Cognitive Neuroscience in Oxford. He is now chief executive of the Medical Research Council. He has been the recipient of the Robert Bing Prize from the Swiss Academy of Medical Sciences, the Netter Prize from the French Académie Nationale de Médecine, the Royal Society Michael Faraday Award, the G.L. Brown Prize from the Physiological Society, and the Charles F. Prentice Award from the American Academy of Optometry.
But consider what would happen if the monocular receptive fields of each binocular cell occupied identical positions in the retinas and were identical in all other respects. Each binocular cell would respond optimally to similar stimuli with zero disparity. Such cells would not be differentially tuned to different disparities and would therefore be incapable of coding relative depth. If images falling on corresponding locations differed, inputs to binocular cells could sum or rival. For stereopsis, the visual system needs binocular cells that are maximally responsive to inputs from receptive fields that are displaced by different amounts from exact correspondence. Any such cell would respond optimally to a stimulus with disparity of a given magnitude and sign (crossed or uncrossed). A set of such cells with different receptive-field offsets in one sign or the other could code different disparities and hence relative depth.
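The argument about receptive-field offsets can be illustrated numerically. The following Python sketch is an illustration only and is not a model taken from the physiological literature. It assumes Gaussian disparity-tuning functions, a hypothetical bandwidth of 0.2°, a sign convention in which crossed disparities are negative, and a simple response-weighted average as the read-out. It compares a population in which every cell prefers zero disparity with a population whose preferred disparities are spread by receptive-field offsets.

```python
import numpy as np

def detector_responses(stimulus_disparity, preferred, bandwidth=0.2):
    """Response of each cell, assuming Gaussian tuning around its preferred disparity.

    All disparities are in degrees. The Gaussian shape and the bandwidth are
    illustrative assumptions, not measured values.
    """
    preferred = np.asarray(preferred, dtype=float)
    return np.exp(-0.5 * ((stimulus_disparity - preferred) / bandwidth) ** 2)

def decode(responses, preferred):
    """Estimate disparity as the response-weighted mean of preferred disparities."""
    preferred = np.asarray(preferred, dtype=float)
    return float(np.sum(responses * preferred) / np.sum(responses))

# Population A: identical receptive-field positions, so every cell prefers zero.
zero_only = np.zeros(9)
# Population B: receptive-field offsets from crossed (negative) to uncrossed (positive).
offsets = np.linspace(-0.8, 0.8, 9)

if __name__ == "__main__":
    for d in (-0.4, 0.0, 0.4):
        est_a = decode(detector_responses(d, zero_only), zero_only)
        est_b = decode(detector_responses(d, offsets), offsets)
        print(f"stimulus {d:+.1f} deg  zero-only: {est_a:+.2f}  with offsets: {est_b:+.2f}")
```

Under these assumptions the zero-offset population returns the same estimate for every stimulus, whereas the population with varied offsets returns an estimate that follows the sign and approximate size of the stimulus disparity.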
In 1967 Jack Pettigrew discovered binocular cells with these properties in the cat’s visual cortex. Such cells are now known as disparity detectors. Pettigrew was a student of Peter Bishop at Sydney University, Australia (Section 2.10.5). He then joined Barlow and Blakemore in Berkeley, California. Working together, they confirmed the existence of disparity detectors in the cat (Barlow et al. 1967) (Portrait Figures 11.1, 11.2, and 11.3). Similar findings were reported about the same time from Sydney by Pettigrew, Nikara, and Bishop (1968) (Portrait Figure 11.4). In 1977, Gian Poggio and his coworkers at Johns Hopkins University discovered disparity detectors in monkey V1.
Figure 11.3. John D. Pettigrew. Born in Wagga Wagga, Australia, in 1943. He obtained his B.Sc. in physiology from the University of Sydney in 1966 and an M.B. from Sydney University Medical School in 1969. He has worked at Berkeley with H. Barlow and C. Blakemore, at the California Institute of Technology, Queen’s University in Canada, and the Zoologisches Institut in Munich. Since 1988 he has been director of the Vision, Touch, and Hearing Research Centre at the University of Queensland, Australia. He is a fellow of the Royal Society of London and of the Australian Academy of Science.
Figure 11.4. Peter O. Bishop. Born in Tamworth, New South Wales, Australia in 1917. He obtained the M.B. and B.S. in 1940 and the D.Sc. in 1967 from the University of Sydney. After serving as a surgeon during the war he studied at University College London. He held academic appointments at the University of Sydney from 1950 to 1967 when he became professor of physiology at the Australian National University in Canberra. He retired in 1983. He is a fellow of the Australian Academy of Sciences, fellow of the Royal Society of London, officer of the Order of Australia, and joint winner of the Australia Prize in 1993.
The search for binocular cells tuned to different disparities was beset with the problem of ensuring that the images in the two eyes are in register. If the images are slightly out of register, a cell tuned to zero disparity will appear to be tuned to a disparity equal to the image misregistration. Also, any movement of the eyes during the recording introduces artifacts. Several procedures have been used to solve this problem. In the anesthetized animal, eye movements are controlled by paralyzing the eye muscles and attaching the eyeball to a clamped ring. A rotating mirror or a prism of variable power controls the effective direction of gaze. In the reference-cell procedure, different electrodes record responses of a test cell and a reference binocular cell, each with receptive fields in the central retinas. Changes in the response of the reference cell indicate when eye movements have occurred (Hubel and Wiesel 1970). In a related procedure, eye drift is monitored by the response of a reference cell to monocular stimulation (Maske et al. 1986a). Image stability can also be indicated by responses of LGN cells of foveal origin, one from each eye (LeVay and Voigt 1988). These procedures indicate when eye drift has occurred, but they do not specify when test stimuli have zero disparity, since the reference cell may not be tuned to zero disparity. One solution to this problem is to use the mean response of several reference cells to define zero disparity (Nikara et al. 1968). Another procedure is to use an ophthalmoscope to project images of retinal blood vessels onto the screen on which the stimuli are presented (Bishop et al. 1962; Pettigrew et al. 1968). The problem is simplified when testing is done on alert monkeys trained to converge their eyes on defined targets.
In identifying a disparity detector one must ensure that changes in responses are not due to incidental changes in stimulation. For example, the stimulus in one eye may move outside the receptive field of the binocular cell when the experimenter changes the disparity of the stimuli. Effects of monocular position can be separated from effects of
disparity by measuring the response of a binocular cell over many combinations of image positions (Ohzawa et al. 1990). Effects of monocular cues can be eliminated by using random-dot stereograms, for which disparity is not related to monocular features of the stimulus.
11.2 SUBCORTICAL DISPARITY-TUNED CELLS
We shall see that cells tuned to binocular disparity occur in several subcortical nuclei. However, it seems that all these cells acquire their disparity sensitivity from cells in the primary visual cortex.
11.2.1 DISPARITY TUNING IN THE PULVINAR
Although there are binocular interactions in the LGN of the cat (Section 5.2.3), disparity-tuned cells have not been found there (Xue et al. 1987). Cells tuned to disparity occur in the cat’s pulvinar, which is a subcortical nucleus closely associated with the LGN. It receives most of its inputs from the superior colliculus and visual cortex (Casanova et al. 1989). In the monkey, the pulvinar provides the major subcortical input to cortical area 18 (V2) (Levitt et al. 1995). Since cells in the pulvinar are tuned to opposite directions of motion in the two eyes, they may be concerned with coding motion-in-depth. The pulvinar has been implicated in controlling attention to salient visual stimuli (Robinson and Petersen 1992). An approaching object is a salient stimulus (Morris et al. 1997). A person with a lesion in the left pulvinar had impaired stereoacuity (Takayama et al. 1994). Cells sensitive to phase disparity of dichoptic gratings have been found in the perigeniculate nucleus (Xue et al. 1988).
11.2.2 DISPARITY TUNING IN THE NUCLEUS OF THE OPTIC TRACT
The nucleus of the optic tract (NOT) is part of the accessory optic system and is concerned with the control of optokinetic eye movements. In the cat, about half the cells of the NOT that respond to moving displays are also tuned to binocular disparity; some show an excitatory response to a limited range of disparities, and others show an inhibitory response (Grasse 1994). In primates, all the cells in the NOT receive inputs from binocular cells of the ipsilateral visual cortex. Monocular or binocular deprivation disrupts these ipsilateral inputs (Grasse and Cynader 1986, 1987). Evidence reviewed in Section 22.6.1 shows that disparity signals conveyed to the NOT in primates serve to link the optokinetic response to the plane in depth on which the eyes are converged.
11.2.3 DISPARITY TUNING IN THE SUPERIOR COLLICULUS
The optic tectum is the primary visual center in vertebrates that do not possess a cerebral cortex. In mammals, the visual cortex is the primary visual center and the homologue of the optic tectum is a pair of subcortical nuclei in the midbrain known as the superior colliculi. These nuclei sit behind the pretectum and in front of the inferior colliculi (structures concerned with auditory functions). The superficial layer of each superior colliculus receives visual inputs from ganglion-cell collaterals (Cowey and Perry 1980). It contains a topographic map of the contralateral hemifield of visual space (Graybiel 1976). But most visual inputs arrive from the visual cortex. The intermediate and deep layers receive inputs from visual centers in the temporal, parietal, and frontal lobes, particularly from the frontal eye fields. Many cells are bimodal or multimodal, receiving visual, auditory, tactile, and proprioceptive inputs. Although the receptive fields of multimodal cells are large, there is some correspondence between the position of a cell’s visual receptive field and the location of a tactile stimulus on the face or paws of the cat (Stein et al. 1976). The directional map of visual space overlays a directional map based on auditory inputs (Updyke 1974). The consistency of this mapping would require visual inputs to be mapped in headcentric rather than oculocentric coordinates. Further work is needed to reveal whether this coordinated mapping extends to the third dimension. Such a mapping could help to initiate vergence eye movements (see Section 10.10.2). Outputs from the superior colliculi project to the LGN, thalamus, and other brainstem nuclei. The main functions of the superior colliculi are control of saccadic and vergence eye movements to the locations of visual, auditory, or tactile stimuli (Mays and Sparks 1980).
Most visually driven cells in the superior colliculi are binocular and tuned to direction of motion. They acquire these tuning properties from cortical cells rather than from direct visual inputs. Directional selectivity and binocularity are lost after removal of the visual cortex (Wickelgren and Sterling 1969) and in dark-reared cats (Flandrin and Jeannerod 1977). Disparity-tuned cells, mostly of the tuned excitatory type, have been found in the superior colliculus of the opossum (Dias et al. 1991). However, stereopsis has not been demonstrated in this animal. Binocular cells in the superior colliculus of the cat showed summation when dichoptic images fell on corresponding receptive fields (Berman et al. 1975). Bacon et al. (1998) found that 65% of binocular cells in the cat’s superior colliculus were sensitive to position disparity of bar stimuli. They found excitatory and inhibitory cells tuned to zero disparity, cells tuned to crossed disparity, and cells tuned to uncrossed disparity. Mimeault et al. (2004) found a similar proportion of cells tuned to the
position disparity of bars, with a mean disparity bandwidth of about 3°. They also found cells sensitive to phase disparity in grating stimuli. Some cells responded to both position disparity and phase disparity. These types of disparity are described in Section 11.4.3.
11.3 DISPARITY DETECTORS IN CATS
11.3.1 DISPARITY DETECTORS IN AREAS 17 AND 18
Barlow et al. (1967) reported that certain binocular cells in the visual cortex of the cat responded selectively to line and bar stimuli with a particular binocular disparity. The disparity-tuning function of a disparity detector is its frequency of firing as a function of the disparity of the retinal images. The disparity selectivity of a detector is indicated by the width of its disparity-tuning function at half its height. The narrower the tuning function, the higher the cell’s selectivity. The response variability of a detector is the mean fluctuation in the firing rate for a constant stimulus (Crawford and Cool 1970). A binocular cell shows facilitation, summation, or occlusion depending on whether its response to a binocular stimulus is greater than, equal to, or less than the sum of responses to monocular stimuli. Binocular summation is discussed in Section 13.1.1. The preferred disparity of a detector is the disparity to which it responds most vigorously. A preferred disparity is indicated by its magnitude, its sign (crossed or uncrossed), and its axis (horizontal, vertical, or oblique). The preferred disparity of a cell is measured by observing its response to stimuli presented simultaneously to the two eyes as a function of the magnitude, sign, and axis of disparity. Early studies of disparity detectors used bar stimuli. Freeman and Robson (1982) introduced the use of disparity between drifting gratings. In the 1980s Poggio introduced the use of random-dot stereograms. In a related procedure, the mean retinotopic position of the receptive field of a binocular cell is determined first in one eye and then in the other. The separation in degrees of visual angle between the two separately determined monocular fields is taken as the cell’s preferred disparity. However, it is not easy to determine the relationship between receptive-field offset and disparity selectivity.
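The definitions given earlier in this section (preferred disparity, selectivity as the width of the tuning function at half its height, and the three kinds of binocular interaction) can be expressed as a short Python sketch. The function names are hypothetical. Two further choices are assumptions made for illustration, since the text does not specify them: height is measured from zero firing rate rather than from a baseline rate, and a 5% tolerance decides when a binocular response counts as "equal to" the monocular sum.

```python
import numpy as np

def preferred_disparity(disparities, rates):
    """Disparity at which the sampled tuning function reaches its peak."""
    return float(np.asarray(disparities)[int(np.argmax(rates))])

def half_height_width(disparities, rates):
    """Width of the tuning function at half its peak height.

    Height is measured from zero firing rate (an assumption). Crossings are
    located by linear interpolation between neighboring samples.
    """
    disparities = np.asarray(disparities, dtype=float)
    rates = np.asarray(rates, dtype=float)
    half = rates.max() / 2.0
    peak = int(np.argmax(rates))

    def crossing(step):
        i = peak
        # Walk away from the peak while the curve stays at or above half height.
        while 0 <= i + step < len(rates) and rates[i + step] >= half:
            i += step
        j = i + step
        if j < 0 or j >= len(rates):
            return float(disparities[i])  # never falls below half on this side
        frac = (rates[i] - half) / (rates[i] - rates[j])
        return float(disparities[i] + frac * (disparities[j] - disparities[i]))

    return crossing(+1) - crossing(-1)

def binocular_interaction(binocular, left, right, tolerance=0.05):
    """Classify a response as facilitation, summation, or occlusion."""
    monocular_sum = left + right
    if binocular > monocular_sum * (1 + tolerance):
        return "facilitation"
    if binocular < monocular_sum * (1 - tolerance):
        return "occlusion"
    return "summation"

if __name__ == "__main__":
    # Hypothetical tuning function: Gaussian, peak 40 spikes/s at +0.3 deg.
    d = np.linspace(-2.0, 2.0, 401)
    r = 40.0 * np.exp(-0.5 * ((d - 0.3) / 0.25) ** 2)
    print(preferred_disparity(d, r), half_height_width(d, r))
    print(binocular_interaction(30.0, 10.0, 10.0))
```

For a Gaussian tuning function the width at half height is about 2.355 times the standard deviation, which gives a quick check on the interpolation.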
The receptive-field offset can be measured only in cells that respond to each eye separately. Many cells tuned to disparity give an excitatory response to stimulation of one eye but not of the other eye. In any case, the disparity selectivity of some binocular cells depends on offsets of subunits within their monocular receptive fields rather than on the offset of the receptive fields as a whole (Section 11.4.3). Each retina projects retinotopically onto the visual cortex. Across the cortical surface within the ocular dominance band of one eye, the retinal location of the receptive
field changes systematically. However, in each small region of the cortical surface there is a random scatter of receptive-field locations. In the cat, the variance of this scatter has been estimated to be 0.12° (Albus 1975). In a more recent study, the scatter of receptive-field positions of binocular cells within each cortical column was about half the mean size of receptive fields (Hetherington and Swindale 1999). Some of this variation in position was correlated between the two eyes. Only the uncorrelated component of variation can contribute to disparity detection. In these studies, the mean variation of receptive field position was measured within a given cortical column. There could also be variation in mean offset between neighboring columns, which could contribute to disparity detection. If the monocular fields in a given small region are paired at random to form receptive fields of binocular cells, then the variance of the offsets of the monocular fields should equal the sum of the variances of the monocular scatters. For the cat, this was found to be approximately true (Bishop 1979). Blakemore and Pettigrew (1970) found that the positions of the receptive fields of binocular cells show greater variance in the ipsilateral retina than in the contralateral retina. Thus, the more recently evolved ipsilateral projection is less precise than the phylogenetically older contralateral projection. Barlow et al. (1967) reported that preferred disparities of binocular cells in the cat’s visual cortex were distributed over a horizontal range of 6.6° and a vertical range of 2.2°. Other investigators found that preferred disparities had a standard deviation of only about 0.5° for both horizontal and vertical disparities for eccentricities of up to 4°, increasing to 0.9° at an eccentricity of 12° (Nikara et al. 1968; Joshua and Bishop 1970; von der Heydt et al. 1978). These values suggest that the full range of disparities is only about 3°.
Ferster (1981) found no cells sensitive to disparities over 1°. There are at least two reasons for these discrepancies. In the first place, the apparent peak disparity to which a given cell is tuned is affected by the extent to which eye movements are controlled. Secondly there may have been differences in the accuracy with which the position of zero disparity was registered in the testing procedure. Blakemore (1970b) reported that cells in the cat’s visual cortex with a given disparity were arranged in distinct columns, which he called constant depth columns. He also described a type of columnar arrangement in which the binocular cells were driven by receptive fields in the contralateral eye that were all in the same region of the retina and by receptive fields in the ipsilateral eye that were scattered over several degrees. The cells in such a column have a variety of preferred disparities, but they all respond to a stimulus lying on an oculocentric visual line of one eye. The column “sees” along a tube of visual space lined up with one eye. This finding should be replicated with more adequate control of receptive field mapping.
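The random-pairing argument made earlier in this section rests on a standard result for the variance of a difference: Var(x_L − x_R) = σ_L² + σ_R² − 2ρσ_Lσ_R, which reduces to the sum of the monocular variances when the two scatters are uncorrelated (ρ = 0). A positive correlation between the eyes reduces the variance of the offsets, which is one way of seeing why only the uncorrelated component of scatter can contribute to disparity detection. The Python sketch below checks the formula by simulation. The Gaussian scatter, the standard deviations, and the correlation values are hypothetical and chosen only for illustration. The helper `approx_full_range` reflects one possible reading, not stated in the text, of how a standard deviation of about 0.5° relates to a full range of about 3° (roughly ±3 standard deviations).

```python
import numpy as np

def offset_variance(sd_left, sd_right, correlation, n_cells=200_000, seed=0):
    """Simulated variance of receptive-field offsets (left minus right), in deg^2.

    Monocular receptive-field positions are drawn from a bivariate Gaussian
    (an illustrative assumption) with the given between-eye correlation.
    """
    rng = np.random.default_rng(seed)
    cross = correlation * sd_left * sd_right
    covariance = [[sd_left ** 2, cross], [cross, sd_right ** 2]]
    positions = rng.multivariate_normal([0.0, 0.0], covariance, size=n_cells)
    offsets = positions[:, 0] - positions[:, 1]
    return float(np.var(offsets))

def predicted_variance(sd_left, sd_right, correlation):
    """Variance of a difference of two correlated variables."""
    return sd_left ** 2 + sd_right ** 2 - 2 * correlation * sd_left * sd_right

def approx_full_range(sd, n_sd=3.0):
    """Approximate full range as plus or minus n_sd standard deviations."""
    return 2 * n_sd * sd

if __name__ == "__main__":
    for rho in (0.0, 0.5):
        print(rho, offset_variance(0.3, 0.4, rho), predicted_variance(0.3, 0.4, rho))
    print(approx_full_range(0.5))
```

With uncorrelated scatters of 0.3° and 0.4°, the simulated offset variance should lie close to 0.09 + 0.16 = 0.25 deg².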
Disparity-tuned cells in areas 17 and 18 of the cat may be classified into three main types.
1. Tuned excitatory cells have narrow disparity-tuning functions centered at zero disparity.
2. Near cells fire maximally to crossed disparities.
3. Far cells respond to uncrossed disparities.
There are no sharp distinctions between the disparity tuning functions of these cell types. The cells probably lie along a continuum. Cells that do not fit these categories are described in later sections of this chapter.
It has been reported that tuned excitatory cells in cat cortical area 17 are ocularly balanced; that is, they respond equally well to either eye. These cells are tuned to disparities close to zero or are nonselective for disparity (Gardner and Raiten 1986; Maske et al. 1986a, 1986b; LeVay and Voigt 1988). The near and far cells show strong ocular dominance. They show excitatory responses to stimulation of one eye and inhibitory responses to stimulation of the other eye, or they respond to stimulation of one eye when only that eye is open but not to the other eye when only that eye is open (Fischer and Krüger 1979; Ferster 1981). It has also been claimed that cells with high ocular dominance are sensitive to nonzero disparity.
More recent evidence from the monkey does not support the idea of a relationship between ocular dominance and the type of disparity tuning. Read and Cumming (2004) pointed out that the energy model of disparity processing requires both excitatory and inhibitory inputs from both eyes. The model predicts that balanced ocular dominance produces stronger disparity tuning but it does not predict a relationship between ocular dominance and the type of disparity tuning. Ohzawa and Freeman (1986b) had found no relationship between disparity tuning and ocular dominance in the cat. Using random-dot stereograms, Prince et al.
(2002a) and Read and Cumming found no relationship between ocular dominance and strength or type of binocular tuning of cells in V1 of the monkey. After ablation of cortical areas 17 and 18, cats could not discriminate depth based on disparity, although abilities such as vernier acuity and brightness discrimination survived (Ptito et al. 1992). About half the cells in cortical area 17 of ferrets were sensitive to horizontal disparity in bar and grating stimuli (Kalberlah et al. 2009). However, there have been no behavioral studies of stereopsis in ferrets.
11.3.2 DISPARITY DETECTORS IN HIGHER VISUAL CENTERS OF CATS
Disparity-tuned cells have been found in cortical area 19 of the cat. Such cells receive inputs from W-type ganglion cells and are tuned to zero or uncrossed disparities compared with the predominant crossed-disparity tuning of cells in area 17 (Pettigrew and Dreher 1987). W cells have large receptive fields and may be involved more with the control of vergence than with stereopsis. Guillemot et al. (1993) found that only about 34% of cells in area 19 of the cat were tuned to disparity, compared with over 70% in area 17. Almost all cells in area 19 lost their disparity tuning following section of the chiasm, suggesting that these cells receive their input from the contralateral eye by this route.
The suprasylvian area (areas 20 and 21) of the parietooccipital cortex (Clare-Bishop area) of cats receives inputs from areas 17 and 18. Most cells in this area are binocular and respond to horizontal disparity. They exhibited the same types of disparity tuning as cells in area 17, although they were not so finely tuned (Bacon et al. 2000). Most cells were sensitive to both position disparity and to phase disparity (Mimeault et al. 2002). Cells in area 21a (in the suprasylvian area) of the cat are orientation selective, and about 75% of them are binocular, with disparity-tuning functions similar to those of cells in area 17. There was no correlation between orientation tuning and disparity tuning (Wang and Dreher 1996). The tuned response of many cells to modulation of spatial phase between dichoptic gratings was retained for differences in orientation of the gratings of up to at least 45° (Vickery and Morley 1999). Some cells in the suprasylvian area of the cat respond to accommodative stimuli, changes in binocular disparity, and motion-in-depth (Bando et al. 1984, 1996; Toyama et al. 1985, 1986). Some cells responded to approaching motion, others to receding motion, others to lateral motion, and others to motion in any of several directions. The cells responsive to motion-in-depth were strongly activated by changing disparity and less strongly activated by monocular looming (Akase et al. 1988).
The suprasylvian area is regarded as the homologue of the middle temporal area (MT) in primates. We will see in Section 11.5.2a that cells in MT are also sensitive to motion-in-depth.
11.4 DISPARITY DETECTORS IN PRIMATE V1
11.4.1 DISPARITY TUNING FUNCTIONS
11.4.1a Basic Findings
Hubel and Wiesel (1970) were the first to look for disparity-tuned cells in the primate visual cortex. They found no such cells in V1 of anesthetized monkeys but found them in V2. Their inability to find them in V1 was probably due to inadequate control of eye alignment. Later, disparity detectors were found in V1 and several other visual areas of anesthetized monkeys. The three types of disparity-tuned cell found in area 17 of cats were also found in V1 of monkeys.
STEREOSCOPIC VISION
Also found was an infrequent type that is inhibited by zero disparity, known as tuned inhibitory cells. These four types of cell occur in V1 but are more prevalent in V2, where at least 70% of neurons are tuned to horizontal disparity (Poggio and Poggio 1984; Hubel and Livingstone 1987). Poggio and Fischer (1977), at Johns Hopkins University, were the first to record from disparity-tuned cells in the visual cortex of an alert animal, the rhesus monkey (Portrait Figure 11.5). Gian Poggio and his coworkers extended these investigations (Poggio and Talbot 1981; Poggio et al. 1985, 1988; Poggio 1991). Monkeys were trained to fixate a small visual target while bar stimuli were presented in an area 2° wide and in different depth planes relative to the fixation target. The problem of aligning the two visual fields was thus greatly simplified. Both bar stimuli and random-dot stereograms were used to determine the disparity tuning functions of the cells. In addition, the sensitivity of cells to changes in dichoptic correlation was determined with dynamic random-dot displays that changed from being correlated in the two eyes to being uncorrelated. More than half the simple and complex cells in V1 were found to be disparity tuned. The proportion of disparity-tuned cells increased as testing progressed into areas V2, V3, V3A, MT, and MST. About equal numbers of simple and complex cells in V1 were disparity-tuned, but complex cells were particularly sensitive to depth in random-dot stereograms. Subfields within the receptive fields of complex cells presumably allow these cells to respond to disparity between microelements of the stereogram. Complex cells were also more sensitive than simple cells to changes in image correlation in a random-dot stereogram, probably for the same reason. Some cells sensitive to differences in image correlation were also sensitive to the sign and degree of disparity. Others were sensitive to a change in correlation only when disparity was zero (Gonzalez et al. 1993a).
Poggio (1991) classified binocular cells in the monkey into six types:
1. Near cells broadly tuned to crossed disparities and inhibited by uncrossed disparities.
2. Far cells broadly tuned to uncrossed disparities and inhibited by crossed disparities.
3. Tuned excitatory cells with a well-defined preferred disparity peaking at a crossed disparity of up to 0.5°. They have an inhibitory flank on the zero-disparity side of the tuning function.
4. Tuned excitatory cells with a well-defined preferred disparity peaking at an uncrossed disparity of up to 0.5°. They have an inhibitory flank on the zero-disparity side of the tuning function.
5. Tuned excitatory cells with a narrow tuning function within ±12 arcmin of zero disparity. This is the most common type.
6. Tuned inhibitory cells that are suppressed by disparities around zero.
Figure 11.5. Gian F. Poggio. Born in Genoa, Italy, in 1927. He received the Doctor of Medicine from the University of Genoa in 1951. He then held fellowships in neurological surgery and in physiology at Johns Hopkins University. In 1960 he joined the faculty of Johns Hopkins University, in the department of physiology from 1960 to 1980 and as professor of neuroscience from 1980 until he retired in 1993. In 1989, with Bela Julesz, he received the Lashley Award of the American Philosophical Society for work on stereopsis.
All six types were inhibited by uncorrelated images. Cells that respond to stimuli moving in opposite directions in the two eyes are described in Section 31.8.2. The tuning functions of these types of cell are shown in Figure 11.6. The tuning functions of the near and far cells form reciprocal pairs, as depicted in Figure 11.7. The tuning functions of excitatory cells tuned to crossed and uncrossed disparities (types 3 and 4) also form reciprocal pairs, as do the tuning functions of types 5 and 6. The tuned inhibitory and the near and far cells tend to have strong monocular dominance, suggesting that inputs from the weaker eye inhibit those from the dominant eye, except when the stimulus is at the appropriate depth relative to the horopter. Classification of disparity detectors into six classes and the scheme depicted in Figure 11.7 are abstractions in terms of standard prototypes. Disparity detectors do not fall neatly into a fixed number of types. Also, disparity detectors are not uniform over the visual field. Some cells in V2 respond to both vertical and horizontal disparities (Section 11.5.1), and some are jointly tuned to disparity and motion (Section 11.6.4).
Figure 11.6. Six classes of disparity-tuned cells in monkey visual cortex. Panels: tuned excitatory near, tuned excitatory zero, tuned excitatory far, tuned inhibitory, near, and far. Each panel plots response (impulses/s) against horizontal disparity (deg). For each function, frequency of nerve impulses is plotted for different horizontal disparities of a bright bar moving in each of two opposed directions across the cell’s receptive field. For each cell, the functions for the two directions of motion are plotted separately. Vertical bars are standard errors. Red horizontal lines are responses of the left eye alone to two directions of motion. Blue lines are responses of the right eye alone. (Redrawn from Poggio 1991)
Poggio suggested that excitatory/inhibitory pairs of disparity detectors finely tuned to near zero disparity provide the basis for fine stereopsis. Also, the broadly tuned near/far pairs provide the basis for coarse stereopsis. This may be true for resolution of disparity modulations (Section 18.6.3). However, very fine discriminations of well-spaced stimuli can be achieved with broadly tuned channels (Section 4.2.7). For instance, the chromatic channels are broadly tuned but achieve fine discriminations of color. The broadly tuned near/far disparity channels should provide good discrimination for disparities of well-spaced stimuli around zero, because their tuning functions are steepest and overlap just at zero. The relative change in signal strength in the two channels is therefore greatest at this point, as can be seen in Figure 11.7. Fine discrimination around zero-disparity based on narrowly tuned detectors would require several detectors tuned to different disparities. See Lehky and Sejnowski (1990a) for more discussion of this issue.
Figure 11.7. Idealized disparity tuning functions. Neural activity is plotted against disparity, from near to far, about the fixation point. Tuned excitatory and tuned inhibitory cells are optimally tuned to zero disparity. Their tuning functions form a reciprocal pair. Tuning functions of “near” and “far” cells (dotted lines) are asymmetrical and do not show a clear preferred disparity. They, also, form a reciprocal pair. (Redrawn from Poggio et al. 1985)
Prince et al. (2002a) measured disparity-tuning functions of cells in V1 of the alert monkey using dynamic random-dot stereograms. These stimuli contain no monocular information about depth. They also contain a broad range of orientations, which allows disparity selectivity to be measured independently of preferred orientation. Disparity selectivity varied from cell to cell in a continuous fashion rather than forming discrete groups. Disparity sensitivity was measured by the discriminability of the maximum and minimum points on the disparity tuning function
(see Section 11.4.1c). Sensitivity was correlated with the degree of tuning for motion direction but not with the orientation preference of the cells. This issue is discussed further in Section 11.6.2. The responses of the cells could, approximately, be accounted for in terms of the energy model described in Section 11.10.1.
11.4.1b Tuning Function Characteristics
The disparity-tuning function of a cortical cell is the number of impulses per second plotted as a function of the horizontal or vertical angular separation of two optimally oriented dichoptic bars or gratings. Zero separation is defined in terms of the averaged response of several binocular cells with receptive fields in the foveal region when the eyes are converged on the stimulus. Tuning functions for orientation disparities are discussed in Section 11.6.2. A disparity-tuning function, like any other sensory tuning function, has eight basic features:
1. The peak amplitude of response.
2. Sensitivity, as indicated by the peak amplitude of response modulation to unit change in disparity. A related measure is the overall extent to which the firing rate is modulated by disparity. Ohzawa and Freeman (1986a) defined a binocular interaction index (BII), analogous to Michelson contrast:
\[ \mathrm{BII} = \frac{R_{\max} - R_{\min}}{R_{\max} + R_{\min}} \tag{1} \]
where $R_{\max}$ and $R_{\min}$ are the maximum and minimum points on the disparity-tuning function. Prince et al. (2002a) pointed out that this index takes no account of variability in firing rate. They defined a disparity discrimination index (DDI):
\[ \mathrm{DDI} = \frac{R_{\max} - R_{\min}}{(R_{\max} - R_{\min}) + 2\,\mathrm{RMS}_{\mathrm{error}}} \tag{2} \]
where $\mathrm{RMS}_{\mathrm{error}}$ is the root mean square of the variance over the whole tuning curve.
3. The tuning width as specified by the width (or half-width) of the tuning function at half its peak amplitude. It indicates the range of disparities to which the cell responds.
4. The relationship between excitatory responses and inhibitory responses.
5. The degree of symmetry of the tuning function about its peak amplitude.
6. The preferred disparity, indicated by the size and sign of disparity that evokes the strongest response. One must also consider possible interactions between the effects of horizontal and vertical disparities.
7. The resting level of response. Most cortical cells have little or no resting level of discharge.
8. Response variability as indicated by the variability of the spike count with repetition of the same stimulus. The coefficient of variation is the standard deviation divided by the mean spike count.
The first five features define the shape of the tuning function, the sixth specifies its position along the disparity axis, and the last two define its position and variability along the response axis. This section deals with factors that determine the shape of disparity-tuning functions. Section 11.4.2 deals with the number of tuning functions required to span the range of detected disparities and the homogeneity of tuning functions over the visual field.
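The two indices in equations (1) and (2) can be illustrated with a short Python sketch. It is not taken from the cited papers: the function names and the firing rates are hypothetical, raw (untransformed) rates are used, and RMSerror is assumed here to be the square root of the residual variance about the mean response at each disparity, pooled over the whole tuning curve.

```python
import math


def binocular_interaction_index(mean_rates):
    """BII = (Rmax - Rmin) / (Rmax + Rmin), from the mean rate at each disparity."""
    r_max, r_min = max(mean_rates), min(mean_rates)
    return (r_max - r_min) / (r_max + r_min)


def disparity_discrimination_index(trials_by_disparity):
    """DDI = (Rmax - Rmin) / ((Rmax - Rmin) + 2 * RMSerror).

    trials_by_disparity holds one list of single-trial rates per disparity.
    Assumption: RMSerror is the square root of the pooled residual variance
    about each disparity's mean response.
    """
    means = [sum(trials) / len(trials) for trials in trials_by_disparity]
    squared_residuals = sum(
        (rate - mean) ** 2
        for trials, mean in zip(trials_by_disparity, means)
        for rate in trials
    )
    n_trials = sum(len(trials) for trials in trials_by_disparity)
    degrees_of_freedom = n_trials - len(trials_by_disparity)
    rms_error = math.sqrt(squared_residuals / degrees_of_freedom)
    r_max, r_min = max(means), min(means)
    return (r_max - r_min) / ((r_max - r_min) + 2.0 * rms_error)


# Two hypothetical cells with identical mean tuning (10, 30, 50 impulses/s)
# but different trial-to-trial variability.
steady_cell = [[9, 11], [29, 31], [49, 51]]
noisy_cell = [[0, 20], [10, 50], [30, 70]]

bii = binocular_interaction_index([10, 30, 50])
ddi_steady = disparity_discrimination_index(steady_cell)
ddi_noisy = disparity_discrimination_index(noisy_cell)
```

Both hypothetical cells have the same BII because their mean tuning curves are identical, but the noisy cell has a much lower DDI. This is the point made by Prince et al. (2002a): the BII ignores response variability.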
11.4.1c Precision of Disparity Tuning
The probability of response of a single neuron in the visual cortex to well-defined stimuli of variable strength can be used to produce a neurometric function, analogous to a psychometric function derived from behavioral data. Prince et al. (2000) compared the sensitivity of binocular cells in V1 of alert monkeys with the ability of monkeys to detect depth in a dynamic random-dot stereogram. The animals fixated a dot and indicated whether a 0.4° central patch in the stereogram was nearer than or beyond a 1° wide surrounding annulus. Neural performance was assessed by the probability that the response of a binocular cell to each of a set of disparities was greater or less than the response to zero disparity. On average, neuronal thresholds were about four times higher than behavioral thresholds. However, thresholds for the best neurons were slightly lower than the behavioral thresholds. It looks as though the monkeys used information from their most sensitive detectors. When the surround was absent or contained uncorrelated dots, the behavioral threshold increased more than the neuronal threshold. This suggests that the neuronal threshold depended on the local absolute disparity of the test patch, while the behavioral threshold depended on the relative disparity between test patch and surround. An increase in vergence instability when the surround was absent or uncorrelated may have been a factor. Although the animals were trained to fixate the central test patch, it is not clear to what extent vergence movements occurred during the 2-second test periods.
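One simple way to build a neurometric function of the kind described above is to estimate, for each disparity, the probability that a single-trial response exceeds a single-trial response to zero disparity. The Python sketch below does this by comparing all pairs of trials, which is equivalent to the area under an ROC curve. The spike counts, the function names, and the rule that ties count as one half are assumptions made for illustration, not the procedure of Prince et al. (2000).

```python
def prob_greater(test, reference):
    """Probability that a test-trial response exceeds a reference-trial response.

    Every test trial is compared with every reference trial. Ties count as
    one half (an assumption), so the result equals the area under the ROC curve.
    """
    wins = 0.0
    for t in test:
        for r in reference:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5
    return wins / (len(test) * len(reference))


def neurometric_function(counts_by_disparity, reference_disparity=0.0):
    """Map each non-reference disparity to P(response > response at reference)."""
    reference = counts_by_disparity[reference_disparity]
    return {
        disparity: prob_greater(counts, reference)
        for disparity, counts in sorted(counts_by_disparity.items())
        if disparity != reference_disparity
    }


# Hypothetical spike counts of a "far" cell on four trials per disparity (deg).
spike_counts = {
    -0.2: [2, 3, 1, 2],
    -0.05: [4, 5, 3, 6],
    0.0: [5, 6, 4, 7],
    0.05: [6, 7, 5, 8],
    0.2: [10, 12, 9, 11],
}
neurometric = neurometric_function(spike_counts)
```

A neuronal threshold can then be read from the function as the disparity at which the probability reaches some criterion, in the same way that a threshold is read from a psychometric function.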
11.4.1d Disparity Tuning and Monocular Receptive Fields
It has been claimed that one can account for disparity-tuning functions of simple cortical cells in terms of the size and strength of the excitatory and inhibitory regions of the cell’s receptive fields in each eye (Bishop et al. 1971;
Ferster 1981). Cells with disparity-tuning functions of this type compute a form of binocular contrast energy and conform to an energy model (Section 11.10.1). The shapes of the monocular receptive fields of binocular cells are well described by elongated Gabor functions. These are sinusoidal modulations within an elongated Gaussian envelope (Section 4.4.2). The energy model of disparity detection predicts that the tuning function of a disparity detector is well described by an elongated Gabor function that represents linear summation of monocular receptive fields. Ohzawa and Freeman (1986a) stimulated simple cells in the cat’s visual cortex with dichoptic drifting sinusoidal gratings of optimal spatial frequency and orientation (Portrait Figures 11.8 and 11.9). Most cells responded most strongly when the gratings were in a particular spatial phase and least when they were 180° away from this phase (see Figure 11.10). Phase-specific interactions were absent for orthogonal gratings. The phase specificity of a cell did not depend on its ocular dominance, except for a few strongly monocular cells that showed a purely inhibitory, phase-independent response to stimulation of the eye that was not dominant for that cell.
Figure 11.8. Ralph D. Freeman. Born in Cleveland, Ohio in 1939. He obtained a degree in optometry from Ohio State University and a Ph.D. in biophysics from the University of California at Berkeley. In 1969 he obtained an academic appointment at the University of California at Berkeley, where he is now professor of vision science and optometry. He has held visiting appointments at Cambridge University and Osaka University.
Figure 11.9. Izumi Ohzawa. Born in Hida Takayama, Japan, in 1955. He graduated in electrical and electronics engineering at Nagoya University in 1978 and obtained a Ph.D. in physiological optics from the University of California, Berkeley, in 1986. He continued postdoctoral research at Berkeley until he moved to Osaka in 2000. He is now a professor in the graduate school of frontier biosciences at Osaka University.
The important point is that binocular interactions of most simple cells could be derived from the linear summation of the excitatory and inhibitory zones revealed by the cell’s response to gratings presented to each eye separately. Cells of this type conform to the energy model. For about 40% of complex cells, linear summation of excitatory and inhibitory zones, revealed when each eye was tested separately, accounted for phase-specific binocular interactions (Ohzawa and Freeman 1986b). About 40% of complex cells exhibited nonphase-specific responses, and about 8% showed a purely inhibitory influence from one eye. Hammond (1991) agreed that most simple cells show a phase-specific response to a moving sine-wave grating of optimal spatial frequency and orientation but variable interocular phase. But he found that most complex cells do not.
Figure 11.10. Cortical responses to dichoptic gratings. Responses of a simple cell (A) and complex cells (B and C) in the visual cortex of a cat to a drifting sinusoidal grating presented dichoptically at various relative spatial phases. Time histograms of the responses are shown on the left, for each eye stimulated separately, and for various relative phases of dichoptic stimulation. Plots show response (spikes/s) against relative phase (deg). Dashed lines represent the level of spontaneous activity. (Reprinted from Freeman and Ohzawa, 1990, with permission from Elsevier)
Figure 11.11. Responses of cortical cells to disparity offset. The top three rows of squares show the firing rates (darker indicates a higher rate) of three types of binocular cell in cat visual cortex (a simple cell and two complex cells) as a function of lateral positions of optimally oriented bars in left and right eyes. Columns show responses to the bar stimuli shown at the top. In the two left columns, stimuli have the same luminance polarity in the two eyes. This causes tuning functions to have a single peak, because stimuli come into register at only one position. In the two right columns, stimuli have opposite polarity. This causes tuning functions to have two peaks, because there is one excitatory region in the eye with a bright bar and there are two in the eye with a dark bar (off flanks of simple cells are excitatory for dark bars). Separation between the peaks reveals the spatial period of receptive-field subunits. Profiles on the edges of some squares represent tuning functions to monocular stimuli. The bottom row of squares indicates responses predicted from a theoretical model. (From Ohzawa et al. 1990. Reprinted with permission from AAAS)
Ohzawa et al. (1990) recorded responses of binocular complex cells in the visual cortex of anesthetized cats as an optimally orientated bar was moved across corresponding receptive fields in each eye. The bars were both black, both white, or black in one eye and white in the other. The responses are shown in Figure 11.11. They developed a model in which monocular receptive fields feed first into simple binocular cells and then into complex binocular cells. There are four types of monocular receptive fields revealed by how the firing of a ganglion cell is modulated as a dark or bright bar is moved over the receptive field. In each eye, one pair of receptive fields has symmetrical (cosine) luminance profiles of opposite polarity, and another pair has asymmetrical (sine) luminance profiles of opposite polarity, as shown in Figure 11.12A. Outputs from each pair of receptive fields feed into simple binocular cells. The output of a simple cell receiving symmetrical profiles is 90° out of spatial phase with respect to that of a cell receiving asymmetrical profiles. The two are said to be in quadrature. The squared (rectified) outputs of the simple cells feed into a complex cell. A model complex
cell with matching properties in the two eyes fires maximally when the stimulus is in the same position in each eye. Such a cell has a symmetrical disparity-tuning function about zero disparity. Ohzawa et al. modeled the responses of complex binocular cells tuned to nonzero disparity by providing receptive field subunits with a double quadrature organization, as shown in Figure 11.12B. In this case, the receptive field for the right eye is 90° phase-shifted relative to that for the left eye for all four subunits. Consequently, the cell fires maximally only when the stimulus in one eye is offset with respect to that in the other. Nomura et al. (1990) and Nomura (1993) developed a similar model.
Figure 11.12. Model of binocular complex cell. (A) Model of receptive field of a binocular complex cell tuned to zero disparity. The receptive field of the cell contains four subunits, each of which receives inputs from both eyes arranged as two mutually inhibitory pairs, one pair operating in phase, the other in quadrature (90°) phase. In the profiles of the monocular receptive fields the dark areas represent excitatory regions and the blank areas inhibitory regions. One pair of subregions receives inputs from symmetrical receptive fields and the other from asymmetrical receptive fields. (B) Hypothetical receptive-field subunits for a binocular complex cell tuned to nonzero disparity. The cell was modeled by receptive-field subunits with double quadrature organization. (From Ohzawa et al. 1990. Reprinted with permission from AAAS)
Prince et al. (2002a) summarized the following lines of evidence related to the energy model.
1. Tuning functions of cells in V1 of the monkey to horizontal disparity can be well described by a Gabor function, as predicted by the energy model.
2. The energy model is supported by the fact that disparity-tuning functions of cells in monkey V1 to stereograms with reversed contrast in the two eyes are inverted. However, the response to reversed-contrast stereograms is weaker than that to normal stereograms. This difference is not predicted by the standard energy model in which early processing is assumed to be linear. However, Read et al. (2002) showed that this result may be explained by simply adding a nonlinear threshold function to the input from each eye. Read and Cumming (2003) produced physiological evidence for this threshold function in monkey V1.
3. The energy model predicts that the response of a cell to binocular inputs from dichoptic uncorrelated random dots is the sum of its responses to each monocular display. However, the binocular response is closer to the mean of the monocular responses. This discrepancy can be accounted for by a process of response normalization, in which the response of a cell is modulated by the mean response of neighboring cells (Fleet et al. 1996b).
4. Some binocular cells are tuned more weakly to binocular disparity than is predicted by the energy model. Weak disparity tuning could be due to the monocular receptive fields of a binocular cell being poorly matched in their response to one or more visual features. It could also be due to the presence of receptive-field subunits that differ in their disparity tuning.
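The behavior of the standard energy model discussed above can be sketched in a few lines of Python. This is a one-dimensional illustration under stated assumptions, not the implementation used in any of the cited studies: the receptive fields are Gabor profiles with arbitrary width and frequency, the preferred disparity is produced by a position shift of the right-eye field, the stimulus is a row of random black and white dots, and all names and parameter values are hypothetical.

```python
import math
import random

SIGMA = 6.0         # envelope width in pixels (illustrative value)
FREQ = 1.0 / 16.0   # carrier frequency in cycles per pixel (illustrative value)
HALF = 24           # each receptive field spans -HALF..HALF pixels
PAD = 40            # margin that keeps shifted images inside the pattern
XS = range(-HALF, HALF + 1)

# Even (cosine) and odd (sine) Gabor profiles: a quadrature pair.
PROFILES = [
    [math.exp(-x * x / (2 * SIGMA ** 2)) * math.cos(2 * math.pi * FREQ * x + phase)
     for x in XS]
    for phase in (0.0, math.pi / 2)
]


def complex_cell(left, right, centre, preferred):
    """Sum of squared binocular simple-cell outputs (position-shift model)."""
    energy = 0.0
    for profile in PROFILES:
        drive = 0.0
        for x, w in zip(XS, profile):
            drive += w * left[centre + x]
            # The right-eye field is displaced by `preferred` pixels.
            drive += w * right[centre + x + preferred]
        energy += drive * drive
    return energy


def tuning_curve(disparities, preferred, anticorrelated=False,
                 n_patterns=100, seed=1):
    """Mean energy response to random-dot rows at each disparity (pixels)."""
    rng = random.Random(seed)
    size = 2 * (HALF + PAD) + 1
    centre = HALF + PAD
    sign = -1.0 if anticorrelated else 1.0
    totals = [0.0] * len(disparities)
    for _ in range(n_patterns):
        left = [rng.choice((-1.0, 1.0)) for _ in range(size)]
        for k, d in enumerate(disparities):
            # Right image: left image shifted by d, contrast-reversed if requested.
            right = [sign * left[(i - d) % size] for i in range(size)]
            totals[k] += complex_cell(left, right, centre, preferred)
    return [t / n_patterns for t in totals]


disparities = list(range(-16, 17))
correlated = tuning_curve(disparities, preferred=4)
anticorrelated = tuning_curve(disparities, preferred=4, anticorrelated=True)
```

With correlated dots the mean response is largest near the preferred disparity and falls to a trough about half a carrier period away, giving a Gabor-like tuning function. With contrast-reversed dots the function inverts. In this purely linear-then-squared sketch the inverted function has the same amplitude as the normal one, which is the discrepancy with the physiological data noted in point 2.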
Menz and Freeman (2004a) measured the strength of lateral connections between pairs of disparity-tuned cells in the cat visual cortex. Some simple-complex pairs of cells with strong monosynaptic connections had similar disparity tuning and were approximately in quadrature. These cells fitted the energy model. Pairs of cells with dissimilar disparity tuning or spatial-frequency tuning or that were tuned to opposite phase had the weakest monosynaptic connections. However, other pairs of cells did not conform to the energy model. A simple cortical cell that receives direct inputs from the two eyes does not produce a pure disparity signal—one not influenced by incidental changes in the stimulus. For example, the response of a binocular simple cell to dichoptic black bars is not the same as the response to white bars. Also, because of their well-defined ON and OFF regions, simple cells are sensitive to slight changes in object location (Qian and Zhu 1997). In the cat, complex cells produce more robust disparity signals than do simple cells. The energy model is discussed further in Section 11.10.1.
11.4.1e Position Invariance of Disparity Tuning
For a simple binocular cell, excitatory and inhibitory zones of the receptive field are spatially segregated. The cell responds strongly to a stimulus that falls on the excitatory regions and weakly to one that falls on the inhibitory regions. Thus, the response of the cell to a given disparity will vary as the stimulus is moved across the receptive field. The cells do not show position invariance. Consequently, simple cells do not have well-defined disparity-tuning functions. If a cell’s receptive field has a periodic pattern of excitatory and inhibitory zones it could show several peaks of response as the relative phase of the images of a dichoptic bar stimulus is varied. This could account for the fact that some simple cortical cells have more than one peak in their disparity-tuning function (Ferster 1981).
For a complex cell, the excitatory zones are coextensive for single bright and dark stimuli, and inhibition is revealed only as an interaction between two stimuli (Movshon et al. 1978). For this reason, the disparity-tuning functions of a complex cell should be reasonably independent of the location of the stimuli in the receptive field. The response of some complex cells with disparity selectivity narrower than the receptive field remained constant as the stimulus was moved over the receptive field (Ohzawa et al. 1990). The disparity tuning of these cells showed position invariance.
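The contrast between simple and complex cells can be illustrated with a reduced model in which the stimulus is a dichoptic grating, so that the input from each eye to a subunit is just the cosine of the phase of the grating relative to the receptive field. The Python sketch below is an assumption-laden illustration rather than a published model: each simple cell half-squares the sum of its two monocular inputs, and the complex cell sums four such subunits with receptive-field phases 90° apart. The function names are hypothetical.

```python
import math


def simple_response(position_phase, disparity_phase, rf_phase=0.0):
    """Half-squared sum of left-eye and right-eye inputs to one simple cell.

    position_phase: phase of the grating in the left eye (stimulus position).
    disparity_phase: interocular phase difference of the grating.
    rf_phase: phase of the cell's receptive field (same in both eyes).
    """
    left = math.cos(position_phase - rf_phase)
    right = math.cos(position_phase + disparity_phase - rf_phase)
    return max(left + right, 0.0) ** 2


def complex_response(position_phase, disparity_phase):
    """Sum of four simple subunits with receptive-field phases 0, 90, 180, 270 deg."""
    return sum(
        simple_response(position_phase, disparity_phase, k * math.pi / 2)
        for k in range(4)
    )


disparity = 0.5  # interocular phase difference in radians (arbitrary)
positions = [2 * math.pi * k / 12 for k in range(12)]
simple_by_position = [simple_response(p, disparity) for p in positions]
complex_by_position = [complex_response(p, disparity) for p in positions]
```

In this sketch the simple-cell response swings between zero and its maximum as the stimulus position changes, whereas the complex-cell response stays at 2 + 2 cos(disparity_phase) for every position, so only the complex cell signals disparity independently of stimulus position.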
11.4.1f Disparity Tuning and Contrast Differences
The disparity-tuning functions of cortical cells are largely independent of changes in the relative contrasts of the monocular images. The response of binocular cells in area 17 of the cat was the same for dichoptic gratings with very different luminance contrasts as for those with equal contrasts in the two eyes (Freeman and Ohzawa 1990). Similar results were obtained from V1 of the monkey (Smith et al. 1997a, 1997b). There must be a gain-control mechanism that keeps monocular inputs to disparity-tuned cells in balance. In contrast gain-control, the contrast range of a detection system is adjusted to the mean level of contrast over a given area. Contrast gain-control occurs at both retinal and cortical levels (Shapley and Victor 1978; Ohzawa et al. 1985). Truchard et al. (2000) recorded responses of binocular simple cells in cat area 17 to drifting sinusoidal gratings presented dichoptically at various phases. A 10-fold increase in contrast of the grating in one eye sharply reduced gain for that eye but had only a small effect on binocular gain. Thus, most control of contrast-gain occurs in monocular pathways. However, effects due to interocular suppression do occur (Section 13.2). Read and Cumming (2003) proposed modifications of the energy model that could explain why unbalanced inputs from the two eyes can produce strong disparity tuning (see Section 11.10.1).
In the above experiments, the sign of contrast was the same in the two eyes. But what happens when the sign of contrast is reversed in one eye? Cumming and Parker (1997) reported that most binocular cells in V1 of the monkey produced a normal disparity-tuning function to a dynamic random-dot stereogram with the same sign of contrast in the images in the two eyes (correlated dots). However, the cells produced an inverted tuning function to a stereogram with contrast sign reversed in the image to one eye (anticorrelated dots).
They predicted this reversal of the tuning function by modeling the response of a binocular cell with monocular receptive fields that match in spatial and temporal properties. For each receptive-field subunit, inputs from the two eyes are summed, squared, and linearly combined to form the response of a complex cell. They assumed that the cells code disparity in terms of receptive-field offset,
but the same predictions follow if disparity is coded in terms of receptive-field phase (Ohzawa et al. 1997). Visually, images with reversed contrast evoke rivalry, not depth (Section 15.3.7b). Also, Cumming et al. (1998) found that human subjects saw no depth in reversed-contrast random-dot stereograms like those that produced inverted tuning functions in the visual cortex of monkeys. However, even though reversed-contrast dots did not produce depth, they affected the depth produced by same-contrast dots with which they were mixed. Depth detectability was reduced when the same-contrast and the reversed-contrast dots had the same disparity but was enhanced when they had different disparities (Neri et al. 1999). Cumming et al. concluded that disparity detectors in V1 respond to a type of disparity that is not used directly for depth perception. However, other interpretations of these findings are discussed in Sections 11.10.1 and 15.3.7.
Inverted-disparity signals could have the following uses.
1. Production of rivalry. Some complex-cell detectors could accept only same-contrast signals to code depth. Other cells could use reversed-contrast signals to evoke rivalry between the images of objects well outside the horopter. Rivalry helps to preserve the monocular images of the most salient objects (Section 12.3.1).
2. Signals for vergence. Images produced by objects well away from the horopter are often opposite in contrast. Opposite-contrast images evoke vergence.
3. Detection of interocular correlation. The balance between responses to same-contrast and opposite-contrast images could help to indicate whether images in the two eyes are in proper register, that is, have a maximum proportion of same-contrast edges.
4. Detection of monocular zones. Monocular zones near vertical depth steps may have opposite contrast to the region of a near surface seen by the other eye. Under these circumstances, a complex cell that accepts opposite-contrast stimuli could indicate the presence of a monocular zone. Monocular zones have an important role in the detection of depth (Section 17.2).
5. Detection of periodic disparities. The images produced by regular texture elements on a slanted surface periodically come into and out of phase. Opposite-contrast detectors could help in the detection of the spatial frequency of disparity modulation, which is a function of slant.
Bridge and Parker (2007) recorded fMRI responses from the human cortex evoked by a sector of disparity-defined depth rotating in a random-dot circular patch. Area V1 was equally responsive to correlated and uncorrelated sectors. Area V4 showed reduced responses to uncorrelated sectors. The lateral occipital area and MT showed little
response to uncorrelated stimuli. Similar results were obtained by Preston et al. (2008). However, we will see in Section 11.5.2a that single-cell recordings have revealed that some cells in MT and MST in the dorsal stream of cortical processing are tuned to anticorrelated stereograms. Among other things, the dorsal stream is concerned with controlling vergence (Section 10.10.1), and vergence is evoked by anticorrelated disparities. On the other hand, it is noted in Section 11.5.3b that cells in the inferior temporal cortex of the ventral processing stream respond only to images with the same contrast polarity. This is the area into which V4 projects. Same-contrast images evoke depth percepts.
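The energy-model account of the inverted tuning function, in which inputs from the two eyes are summed and squared for each subunit and then combined, can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the model fitted in the studies cited: it uses one-dimensional binary noise in place of a random-dot stereogram, identical Gabor profiles in the two eyes (so the model cell prefers zero disparity), and arbitrary parameter values. The names `gabor` and `energy_tuning` are hypothetical.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """1-D Gabor receptive-field profile (x, sigma in deg; freq in c/deg)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def energy_tuning(shifts, contrast_sign, n_trials=2000, seed=0):
    """Mean energy-model response to 1-D binary noise at each disparity.

    shifts: disparities in samples (assumed 0.01 deg per sample).
    contrast_sign: +1 for same-contrast images, -1 for reversed contrast.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, 201)
    pad = int(np.max(np.abs(shifts)))
    # Quadrature pair of subunits, with the same profile in both eyes.
    subunits = [gabor(x, 0.2, 2.0, p) for p in (0.0, np.pi / 2)]
    noise = rng.choice([-1.0, 1.0], size=(n_trials, x.size + 2 * pad))
    left = noise[:, pad:pad + x.size]
    curve = []
    for s in shifts:
        # The right-eye image is a shifted (and possibly inverted) copy.
        right = contrast_sign * noise[:, pad + s:pad + s + x.size]
        # Sum the two eyes' inputs, square, and add across subunits.
        energy = sum((left @ rf + right @ rf) ** 2 for rf in subunits)
        curve.append(energy.mean())
    return np.array(curve)

shifts = np.arange(-30, 31, 5)
same = energy_tuning(shifts, +1)   # same-contrast stereogram
anti = energy_tuning(shifts, -1)   # reversed-contrast stereogram
```

In this sketch the same-contrast curve rises above the uncorrelated baseline at the preferred disparity, while the reversed-contrast curve falls below it, which is the inversion of the tuning function that the model predicts.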
11.4.1g Detection of Absolute Versus Relative Disparity
A disparity detector that receives inputs directly from the two eyes may be called a primary disparity detector. Primary detectors respond to absolute local disparities. Although there are some direct visual inputs to V2 and MT (Section 5.8) it seems safe to conclude that all primary disparity detectors occur in V1. Secondary disparity detectors combine inputs from primary detectors for the detection of disparity gradients. At even higher levels, tertiary disparity detectors combine the outputs of secondary detectors for the detection of complex disparity gradients. Secondary and tertiary detectors respond to second-order and third-order relative disparities. There is conflicting evidence about whether all disparity detectors in V1 respond only to absolute disparity. The following evidence indicates that all disparity detectors in V1 are primary detectors.
Cumming and Parker (1999) (Portrait Figures 11.13 and 11.14) reported that the responses of most binocular cells in V1 of alert monkeys changed when the absolute disparity of a random-dot stereogram was changed, leaving relative disparities constant. None of the cells responded consistently to a specific relative disparity between points over changes in absolute disparity.
Cumming and Parker (2000) placed identical gratings in a pair of dichoptic apertures. The apertures had a horizontal disparity equal to the period of the gratings, as in Figure 11.15. This caused the fused image of the apertures and gratings to appear in front of the fixation cross. A binocular cell in V1 with receptive fields confined to the grating was exposed to a zero-disparity stimulus even though the grating appeared in front of the fixation cross. Binocular cells of this type in the alert monkey responded in the same way as they did to a grating in a zero-disparity aperture.
In other words, these cells registered the local zero disparity of the grating whether or not the grating was perceived as displaced in depth. On the other hand, Sasaki et al. (2010) found that about one-third of complex cells in V1 of the cat pooled inputs from other detectors, especially along the axis of preferred orientation. Some of these cells had receptive fields that showed a gradual shift in preferred disparity along the orientation axis. These cells could code inclination in 3-D space and thus qualify as secondary disparity detectors. We will see in later sections of this chapter that cells sensitive to relative disparity have been found in higher visual centers in both the dorsal and ventral processing streams.
Figure 11.13. Andrew J. Parker. Born 1954 in Burnley, England. He obtained a B.A. in natural sciences in 1976 and a Ph.D. in 1980, both from Cambridge University. He was appointed to a university lectureship in physiology at Oxford and became professor in 1996. He is a fellow of St John’s College and has been awarded a Leverhulme Senior Research Fellowship (2004–5), a Wolfson Research Merit Award by the Royal Society, and the James S. McDonnell Foundation 21st Century Scientists Award.
11.4.2 NUMBER AND HOMOGENEITY OF DETECTORS
Suppose that each region of the visual cortex contains a set of disparity detectors that span the range of disparities detected in that region. The bandwidth of each detector as a fraction of the total bandwidth divided by the fractional overlap of the tuning functions defines the number of detectors required to span the full range of detectable disparities. As we move into the visual periphery, the mean size of receptive fields increases and larger disparities become detectable. Thus, the disparity-detection system, like other spatial systems, is inhomogeneous. Only a few disparity detectors with distinct tuning functions are required at each location, but there are many types of detector over the binocular field.
Figure 11.14. Bruce Cumming. He received a B.A. and M.D. at Oxford University and a Ph.D. with S. Judge, also from Oxford. He conducted postdoctoral work with A. Parker in the Department of Physiology at Oxford University. In 2000 he became an investigator in the National Eye Institute at the National Institutes of Health in Bethesda, Maryland, United States.
Figure 11.15. Detection of interpolated depth. The small circles represent monocular receptive fields of a binocular cell in V1. In (A) the binocular cell receives a zero-disparity input from the gratings and the disk containing the gratings appears coplanar with the fixation cross. In (B) the receptive fields still receive a zero-disparity input but, with crossed fusion, the disk containing the grating appears in front of the fixation cross. (Redrawn from Cumming and Parker 2000)
Cormack et al. (1993) derived the tuning widths of disparity detectors from the threshold-elevation effect evident in the detection of interocular correlation. They first measured the correlation-sensitivity function for a dichoptic random-dot display. This is the degree of correlation between dots in the dichoptic images that could just be detected, as a function of disparity. They then measured the correlation-sensitivity function of a test display with a superimposed near-threshold random-dot display of variable disparity. The degree of threshold summation was at a maximum when the two displays had the same disparity and decreased to the level of probability summation as the disparity difference increased to a certain value. With further increase in disparity, the threshold increased, revealing inhibitory interactions. The derived tuning functions centered on zero disparity were symmetrical, with an excitatory central region and inhibitory surrounds. Those centered on a disparity to one side of zero were asymmetrical, with an inhibitory lobe on the other side of zero disparity. The disparity-tuning widths were approximately 20 arcmin, and the tuning functions showed considerable overlap. The largest disparity that could be used with this procedure was about 30 arcmin.
The discrete nature of chromatic channels is revealed by humps and dips in the hue-discrimination function, because hue discrimination is best where neighboring color channels overlap. Hue-discrimination functions are usually derived from stimuli subtending only 2°. Since the visual pigments are the same over wide areas of the retina, the hue-discrimination function shows the same undulations for stimuli of all sizes. The disparity-detection system is not homogeneous, since the bandwidth of disparity detectors depends on the size of receptive fields, which increases with retinal eccentricity. One would therefore expect the disparity-discrimination function to show humps and dips only for small stimuli. For large stimuli, the humps and dips in different regions would not coincide and would therefore cancel to produce a monotonic tuning function. Humps as a function of disparity were not found in disparity-discrimination functions (Badcock and Schor 1985; Stevenson et al. 1992). Nor were they found in dichoptic correlation-detection functions (Cormack et al. 1993) or in contrast-detection thresholds (Felton et al. 1972). It has been concluded that disparity detection is not achieved by only three detectors in the manner suggested by Richards (1972). But this conclusion is premature.
The width of the tuning functions revealed by Cormack et al. suggests that very few detectors are required to span the detectable range of disparities in any local region of the visual field. It is difficult to see how a local region could accommodate a large number of disparity detectors. Large displays were used in all the studies that showed an absence of humps in disparity-discrimination functions, so their absence may merely reflect the fact that the disparity-detection system is not homogeneous over the binocular field. There could therefore be a small number of discrete detectors in each location but a continuous range of detectors over the visual field. An experiment is needed in which disparity-discrimination functions are derived from small displays in different locations. Channels for the detection of spatial modulations of disparity are discussed in Section 18.6.3.
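The claim that only a few detectors are needed in each local region can be made concrete with a rough count. The sketch below is one possible reading, not a formula given in the text: it assumes detectors of equal tuning width that tile the detectable range with a fixed fractional overlap between neighbors. The function name `detectors_needed` and the example values (a range of 60 arcmin, a width of 20 arcmin, and an overlap of 0.5) are illustrative assumptions loosely based on the figures reported by Cormack et al.

```python
import math

def detectors_needed(total_range, tuning_width, overlap):
    """Rough count of equal-width detectors tiling a disparity range.

    total_range and tuning_width share one unit (e.g., arcmin).
    overlap is the assumed fraction of each tuning function shared
    with its neighbor (0 = no overlap, values near 1 = heavy overlap).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Each additional detector extends coverage by the non-overlapping part.
    step = tuning_width * (1.0 - overlap)
    remaining = max(0.0, total_range - tuning_width)
    # The small tolerance guards against floating-point round-up.
    return 1 + math.ceil(remaining / step - 1e-9)

# Illustrative values only: +/-30 arcmin range, 20-arcmin tuning width.
n_half_overlap = detectors_needed(60.0, 20.0, 0.5)
n_no_overlap = detectors_needed(60.0, 20.0, 0.0)
```

Under these assumptions the count stays in single figures, which is consistent with the argument that a local region need contain only a small number of distinct detectors.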
11.4.3 POSITION- AND PHASE-DISPARITY DETECTORS
11.4.3a Phase-Disparity Detectors
Consider a binocular cell that has receptive fields in the two eyes with the same tuning for orientation, motion, and spatial frequency, and the same distribution of excitatory and inhibitory zones. Disparity tuning of this type of cortical cell can only be due to a difference in position between the receptive fields in the two eyes. These binocular cells are position-disparity detectors. If the preferred disparity of a cortical cell depends on the offset of its receptive fields, one should be able to correlate the two quantities. In practice, receptive-field offset cannot be determined for all disparity-tuned cells, because many of them do not respond when a stimulus is presented only to the nondominant eye. Some disparity-selective cells, known as AND cells, respond only to the joint stimulation of both eyes. Furthermore, many of the cells in which a receptive-field offset can be measured are not disparity selective (von der Heydt et al. 1978).
Computational studies gave rise to the idea that binocular disparity may be coded in terms of differences in spatial phase of ON and OFF regions of monocular receptive fields feeding into a binocular cell (Jenkin and Jepson 1988; Sanger 1988; Jenkin et al. 1991). The receptive fields of a pure phase-disparity detector have the same positions in the two eyes and the same tuning for orientation, motion, and spatial frequency, but different distributions of excitatory and inhibitory zones. For instance, the receptive field of a cell in one eye could have a symmetric (cosine) sensitivity profile and that in the other an asymmetric (sine) profile, as illustrated in Figure 11.12B. Some binocular cells in the visual cortex of the cat have been reported to detect this type of phase disparity (Freeman and Ohzawa 1990).
DeAngelis et al. (1991) mapped the receptive-field profiles of simple cells in the visual cortex of cats for each eye. By fitting the profiles with Gabor functions, they obtained the phase difference between the two eyes (see also Ohzawa et al. 1996). About 30% of binocular simple cells in area 17 showed substantial differences between the phases of the ON and OFF regions in the receptive fields in the two eyes. The cells had matching orientation, motion, and spatial-frequency tuning, and most of them preferred orientations near vertical. Almost all cells tuned to stimuli within 20° of the horizontal had receptive fields with matching or near-matching phases, which suggests that they code position disparity. Cells tuned to stimuli within 20° of the vertical had receptive fields with a wide variety of phase relationships, which suggests that they code phase disparity. Anzai et al. (1999a) found most phase-disparity detectors in the cat to have a phase disparity of 90° or less. Prince et al. (2002b) distinguished between cells tuned to position disparity and cells tuned to phase disparity by the shapes of their disparity-tuning functions. For this purpose they used dynamic random-dot stereograms. Both types of disparity detector were found to be common in V1 of the monkey. They confirmed the distinction between the two types of detectors by measuring the disparity sensitivity of each type to sinusoidal luminance gratings as a function of spatial frequency. Tuning functions of phase-disparity detectors are discussed in Section 11.4.3a, and models of these detectors are discussed in Section 11.10.1.
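The procedure of fitting Gabor functions to the monocular receptive-field profiles and reading off the interocular phase difference can be sketched as follows. This is a simplified illustration, not the fitting procedure used in the studies cited: it assumes that the envelope width and spatial frequency are already known and equal in the two eyes, so that only amplitude and phase are fitted by linear least squares. The profiles are simulated, and the names `gabor`, `fitted_phase_deg`, and `wrap_deg` are hypothetical.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """Gabor profile: Gaussian envelope times a cosine carrier (phase in rad)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def fitted_phase_deg(profile, x, sigma, freq):
    """Phase (deg) of the best-fitting Gabor with known envelope and frequency."""
    env = np.exp(-x**2 / (2 * sigma**2))
    carrier = 2 * np.pi * freq * x
    # env*cos(carrier + phi) = cos(phi)*env*cos(carrier) - sin(phi)*env*sin(carrier)
    basis = np.column_stack([env * np.cos(carrier), env * np.sin(carrier)])
    coeffs, *_ = np.linalg.lstsq(basis, profile, rcond=None)
    a, b = coeffs
    return np.degrees(np.arctan2(-b, a))

def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

# Simulated profiles: a cosine (symmetric) field in the left eye and a
# field shifted by a quarter cycle in the right eye.
x = np.linspace(-1.0, 1.0, 401)           # position in deg
left_rf = gabor(x, 0.25, 2.0, 0.0)
right_rf = gabor(x, 0.25, 2.0, np.pi / 2)
phase_difference = wrap_deg(
    fitted_phase_deg(right_rf, x, 0.25, 2.0)
    - fitted_phase_deg(left_rf, x, 0.25, 2.0)
)
```

With measured profiles the envelope and frequency would also have to be estimated, which makes the fit nonlinear. The linear version is used here only to keep the illustration short.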
11.4.3b Position- and Phase-Disparity Detectors Compared
To make phase disparities comparable with displacement disparities, phase disparity should be expressed in terms of visual angle rather than phase angle. The angular disparity that produces a peak response equals the phase shift, expressed as a fraction of a cycle, divided by the spatial frequency of the stimulus. Uncertainty increases with both the spatial frequency of the stimulus and the size of the phase shift. Pooling responses over a local area or over spatial scale and orientation helps to reduce this uncertainty and produce an angular-disparity signal (Section 18.8). Position- and phase-disparity mechanisms differ in the following ways.
1. Maximum detectable disparities. For position disparity there is no necessary linkage between the size of receptive fields and the preferred disparity of the cortical cell into which they feed. A binocular cell with small receptive fields could have a large receptive-field offset and therefore be tuned to a large disparity, and a cell with large receptive fields could have a zero offset and therefore be tuned to zero disparity. Also, for position disparity there is no necessary linkage between the size of receptive fields and the width of disparity-tuning functions. For instance, Pettigrew et al. (1968) found complex cells with large receptive fields in the cat that were narrowly tuned to disparity. The maximum disparity detectable by a phase-disparity detector is half the spatial period of the zones within the receptive field. Thus, phase-sensitive binocular cells with small receptive fields are necessarily tuned to small disparities expressed in terms of visual angle. Simple cells with large receptive fields could be tuned to large angular disparities, since a simple cell with a large receptive field is essentially a scaled-up version of one with a small receptive field. The receptive field of a complex cell has several subunits within each of which there are excitatory and inhibitory detectors. The disparity preference of a complex cell with a large receptive field would therefore depend on the size and spatial disposition of the subunits within the receptive field rather than on the size of the whole receptive field. A tendency for small receptive fields to detect small disparities and for large receptive fields to detect large disparities could account for why cells at greater retinal eccentricities are tuned to larger disparities. Cells in V1 have smaller receptive fields than those in V2. This could account for the fact that V1 contains more cells tuned to zero disparity than cells tuned to near or far disparities, while V2 contains more cells tuned to near and far disparities than cells tuned to zero disparity (Ferster 1981). Psychophysical evidence on the relationship between disparity and spatial scale is reviewed in Section 18.7.
2. Disparity bias. Since the tuning functions of spatial-frequency channels overlap, a stimulus of a given spatial frequency stimulates one channel strongly and flanking channels less strongly. The bandwidth of spatial-frequency channels increases with increasing preferred spatial frequency. Hence, a stimulus with a given spatial frequency stimulates more flanking channels on the high-frequency side than on the low-frequency side. For phase-disparity detectors, in which high spatial-frequency channels code small disparities, this should cause disparity, and hence depth, to be underestimated. This is known as disparity bias (Section 18.7.3). Disparity bias should be less evident at higher spatial frequencies because of the limited number of spatial-frequency channels. Disparity bias would not necessarily arise in a system consisting of position-disparity detectors.
3. Disparity beats. Tsai and Victor (2003) argued that only a phase-disparity mechanism can account for the appearance of depth based on beats in dichoptic compound gratings, as described by Boothroyd and Blake (1984) (see Section 17.1.1a).
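The conversion from phase disparity to angular disparity, and the half-period limit on a phase-disparity detector, can be written out as a short worked example. This is a minimal sketch: the function names are hypothetical, phase is assumed to be given in degrees of phase angle, and spatial frequency in cycles per degree of visual angle.

```python
def phase_to_angular_disparity(phase_deg, spatial_freq_cpd):
    """Angular disparity (deg of visual angle) for a given phase disparity.

    The phase shift is converted to a fraction of a cycle and divided
    by the spatial frequency in cycles per degree.
    """
    return (phase_deg / 360.0) / spatial_freq_cpd

def max_phase_disparity(spatial_freq_cpd):
    """Largest unambiguous disparity: half the spatial period (deg)."""
    return 0.5 / spatial_freq_cpd

# A 90-degree phase disparity at 2 c/deg and at 0.5 c/deg.
fine = phase_to_angular_disparity(90.0, 2.0)     # 0.125 deg
coarse = phase_to_angular_disparity(90.0, 0.5)   # 0.5 deg
limit_fine = max_phase_disparity(2.0)            # 0.25 deg
```

The same phase disparity corresponds to a fourfold larger angular disparity when the spatial frequency is four times lower, which is why phase detectors tuned to low spatial frequencies can signal large disparities.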
11.4.3c Hybrid Detectors
A binocular cell that responded to both position disparity and phase disparity would be a hybrid disparity detector. Anzai et al. (1999a) measured both phase disparity and position disparity for the same simple cells in the cat’s visual cortex. They measured the position disparity of a given cell with respect to a neighboring reference cell, which was assumed to have zero offset-disparity. Statistical procedures were used to assess the uncertainty of this procedure, although its validity depended on the unconfirmed assumption that position disparities of neighboring binocular cells are uncorrelated. Most phase disparities had phase angles of less than 90°. The sign of phase disparities larger than 90° becomes ambiguous (Blake and Wilson 1991). Phase disparities were mostly within ±1° of visual angle, whereas position disparities were within ±0.5°. This seems to contradict the notion that large disparities are coded by position-disparity detectors. However, it is not clear how eccentric the receptive fields were. Perhaps large disparities are coded by the larger receptive fields of cells serving the peripheral retina. Anzai et al. concluded that phase-disparity detectors code large disparities for low spatial-frequency stimuli and that position-disparity detectors provide a constant limit for high spatial-frequency stimuli for which phase disparities are small. Anzai et al. found no correlation between the preferred position disparities and phase disparities of binocular cells in the cat’s visual cortex. The two types of disparity therefore add in some cells and subtract in others. For any binocular cell, any uncertainty in the registration of the relative positions of its monocular receptive fields would produce a corresponding uncertainty in the calibration of phase disparities of that cell.
Therefore, the joint determination of the two types of disparity is subject to the same uncertainty associated with the determination of position and spatial frequency (Section 4.4.1c). Tsao et al. (2003b) used light and dark bars to map the spatiotemporal monocular receptive fields of simple cells in V1 of alert macaque monkeys. They then measured the disparity-tuning functions of the cells by changing the disparity of a drifting bar. They extracted the position and phase components of the disparity-tuning functions by fitting a Gabor function to the interocular cross-correlogram. Different cells had different combinations of position and phase disparities. For both types of disparity, cells formed a continuum rather than falling in discrete classes. For each cell, the peak of the disparity-tuning curve was related to the sum of the position and phase components. As in the cat, the phase disparity of most cells was less than 90°. Theoretically, the position and phase components of a hybrid disparity detector could be measured by recording the disparity tuning of the cell to drifting sinusoidal gratings of different spatial frequencies. The phase offset can be derived by fitting a cosine function to the tuning function,
and the position offset is given by the slope of the function relating phase offset to the spatial frequency of the stimulus (Fleet et al. 1996a). For a position-disparity detector, peaks in response frequency occur at the same disparity for all spatial frequencies of the grating. For a phase-disparity detector, peaks occur when the phase difference between the gratings in the two eyes reaches a certain value, whatever the spatial frequency. This procedure has been used to investigate disparity detection in the owl (see Section 33.6.3), although its validity has been questioned (Zhu and Qian 1996). Liu et al. (1992b) produced psychophysical evidence that position disparity dominates phase disparity in determining the depth of a stimulus when the two types of disparity are in conflict. Two cycles of a vertical cosine grating were presented in a Gaussian window. The grating and window were moved in one eye to produce a position disparity. The grating was moved relative to the window to produce a phase disparity of the grating relative to a fixed disparity of the window. Phase disparity had to be about three times larger than position disparity to reach the threshold for perceived depth. Thus, a zero position disparity restrained the effect of the phase disparity. It is not clear that the two types of disparity were processed exclusively by position-disparity detectors and phase-disparity detectors respectively. Also, one cannot conclude that phase disparity in a stimulus is processed less efficiently than position disparity, since one cannot produce a stimulus with pure phase disparity. Erwin and Miller (1999) proposed a type of hybrid disparity detector for simple cells, in which position offset is correlated with phase displacement. In this type of disparity detector an ON region of the receptive field in one eye corresponds only with an ON region of the other eye, and an OFF region corresponds to only an OFF region. 
This means that a phase shift must be accompanied by a corresponding position shift and vice versa. They called this a subregion correspondence detector. They distinguished it from a hybrid detector in which phase and position offsets are not correlated, and from a pure displacement or pure phase detector. The four types of detector are depicted in Figure 11.16. Erwin and Miller’s model detectors apply best to tuned excitatory disparity detectors with near balanced inputs from the two eyes, which detect small disparities centered about zero. Erwin and Miller argued that subregion correspondence detectors account for how the receptive fields of tuned simple cells acquire the same orientation and spatial-frequency tuning. The phases of the receptive fields in the two eyes should be correlated to produce the signals that Hebbian synapses require for cortical plasticity. The correspondence detector model accounts for the preponderance of tuned excitatory cells with preferred disparities clustered around 0° in cortical areas 17 and 18 of the cat (Section 11.3.1). Erwin and Miller argued that the narrow clustering of preferred disparities of tuned excitatory cells would not occur if interocular phase and position shifts were uncorrelated, not even if position shifts were negligible. For pure phase detectors, the distribution of peak disparities would be broader for low than for high spatial frequencies. For pure position-disparity detectors, the distribution would have the same width as the distribution of position shifts. For uncorrelated hybrid detectors the distribution would be broad. Perhaps the existence of all types of detector accounts for the existence of both finely tuned and broadly tuned detectors. Erwin and Miller argued that cells tuned to horizontal lines have smaller phase shifts than cells tuned to vertical lines, because horizontal position shifts are larger than vertical position shifts, whatever the orientation of the stimulus. Joshua and Bishop (1970) reported this type of anisotropy in the cat at an eccentricity of about 12° but not in the central field. Erwin and Miller claimed that only the correspondence detector model accounts for the eccentricity-dependent anisotropy. They stated that the model should be tested by recording phase and offset disparities of several cells simultaneously.
Figure 11.16. Types of disparity detectors. Each vertical pair of diagrams represents the monocular receptive fields that feed into a disparity detector. The bars represent the ON and OFF regions within the receptive fields. (a) Detector of pure position disparity. (b) Detector of pure phase disparity. (c) Detector of correlated position and phase disparities. (d) Detector of uncorrelated position and phase disparities.
11.4.4 DETECTION OF VERTICAL DISPARITY
Vertical disparities play a crucial role in stereopsis. For example, differences between horizontal and vertical disparity indicate the relative deformation of the two images, which is used in the perception of the slant and inclination of surfaces (Sections 19.3 and 20.2). Vertical disparities also initiate vertical vergence (Section 10.6).
According to the simple energy model of disparity detection, cells tuned to vertical stimuli should be tuned to horizontal disparities and cells tuned to horizontal stimuli should be tuned to vertical disparities. This issue is discussed in Section 11.4.5. Natural scenes contain a broader range of horizontal disparities than of vertical disparities. This is especially true in the central visual field. One way to cater for this predominance of horizontal disparities would be to have a predominance of binocular cells tuned to vertically oriented stimuli. A second way would be for cortical cells to have broader tuning functions to horizontal disparity than to vertical disparity. Cells sensitive to vertical disparity have been reported in area 17 of the cat. The cells were more broadly tuned to horizontal disparities than to vertical disparities (Barlow et al. 1967). Joshua and Bishop (1970) and Nikara et al. (1968) did not find broader tuning for horizontal disparities. But they measured only position disparities between receptive fields (Uka and DeAngelis 2002). DeAngelis et al. (1991) found that horizontal phase disparities are detected over a larger range than are vertical phase disparities (see Section 11.4.3a). Thus, the vertical-horizontal anisotropy may be a characteristic of only the phase-disparity system. Gonzalez et al. (1993b) trained monkeys to fixate a spot on a random-dot display subtending 24° by 14° as the horizontal and/or vertical disparity of the central region was varied. Thirty percent of cells in V1 and 41% of those in V2 were sensitive to both horizontal and vertical disparity. When horizontal disparity was zero, the response of most of these cells was least for zero vertical disparity and increased with increasing vertical disparity. In the presence of a fixed horizontal disparity, the response decreased with increasing vertical disparity.
Similarly, in the presence of a fixed vertical disparity, the response decreased with increasing horizontal disparity. Thus, tuning for vertical disparity was centered on zero disparity. The response characteristics of these binocular cells are appropriate for the detection of a difference between horizontal and vertical disparities, which is what one would expect of a cell designed to detect deformation disparities. The behavioral significance of these results is discussed in Sections 20.1 and 20.2. With bar stimuli, a cell’s response to horizontal disparity is measured with vertical bars, and its response to vertical disparity is measured with horizontal bars. With this method, responses to horizontal and vertical disparities cannot be measured at the same time. Cumming (2002) overcame this problem by measuring the disparity tuning of cells in V1 of alert monkeys using a random-dot stereogram rather than bar stimuli. When the disparity of the random-dot stimulus was beyond the detection range, the images in the two eyes were effectively uncorrelated and cells fired at a baseline rate. Within the disparity-detection range, firing rate was plotted as a function of the direction of the disparity of the random-dot patch falling within the receptive
field of the cells. Most cells responded to a wider range of horizontal disparities than of vertical disparities. Cumming also measured the orientation tuning of each cell to a bar. There was a predominance of cells tuned to vertical, as predicted by the energy model. For some cells, the preferred direction of disparity was related to preferred orientation. However, most cells were more broadly tuned to horizontal disparity whatever their orientation tuning. For each cell, the preferred direction of disparity (the peak of the two-dimensional tuning function) was decomposed into horizontal and vertical components. The horizontal component varied more between cells than did the vertical component. These results may be reconciled with the simple energy model if most of the cells recorded by Cumming were complex cells, each of which received inputs from several simple cells. The component simple cells could vary widely in their tuning to horizontal disparity but have similar tuning to vertical disparity. Perception of the 3-D structure of natural scenes requires local detection of differences in horizontal disparity. Vertical disparities vary gradually over the visual field and are not required for local processing. It is therefore sufficient to detect mean vertical disparity over a large area. Consequently, it would help if cells tuned to horizontal disparity had smaller receptive fields than those tuned to vertical disparity. Also, it would help if signals arising from local horizontal disparities were conveyed to higher levels of the visual system. Signals arising from vertical disparities could be pooled to economize on signal transmission (see Chapter 20). There seem to be no physiological data on these issues. Stimuli near the center of the visual field do not produce vertical disparity unless the eyes are out of vertical alignment. With vertically aligned eyes, vertical disparities occur in the quadrants of the visual field.
Those produced by a large textured frontal surface increase with increasing eccentricity along oblique meridians (Section 19.6.2). Thus, one would expect to find cells in V1 tuned to vertical disparity to be more numerous in the peripheral visual field than in the central field. Durand et al. (2002) used random-dot stereograms to measure the disparity-selectivity of cells in V1 and V2 of alert monkeys. At retinal eccentricities of between 8° and 22° about half the cells in V1 and V2 were sensitive to both horizontal and vertical disparities, 8% were sensitive only to horizontal disparity, and 23% were sensitive only to vertical disparity. The remaining cells were not sensitive to disparity. Cells tuned to vertical disparity showed the same types of tuning functions as cells tuned to horizontal disparity. However, cells tuned to vertical disparity had narrower tuning functions than those tuned to horizontal disparity. This is consistent with the fact that, in natural scenes, vertical disparities vary over a smaller range of disparity than do horizontal disparities. Thus, the regions of the visual cortex that represent the peripheral visual field contain binocular cells specifically
P H Y S I O L O G Y O F D I S PA R I T Y D ET E C T I O N
tuned to vertical disparity. These cells must be responsible for evoking vertical vergence eye movements (Section 10.6) and for coding the vertical disparities that occur in the oblique quadrants of the binocular visual field (Sections 20.2 and 20.3). Cumming (2002) found that cells in V1 serving the central visual field are more widely tuned to horizontal disparities than to vertical disparities. However, evidence reviewed in Section 5.6.2a indicates that cells serving the peripheral field are preferentially tuned to radially oriented stimuli. They are said to have a radial bias. If it is assumed that a cell’s axis of disparity tuning is determined by its orientation selectivity, then disparity tuning, like orientation tuning, should show a radial organization over the visual field. Durand et al. (2007a) tested this idea by recording from cells in V1 of alert macaque monkeys in response to various combinations of horizontal and vertical disparity in a dynamic random-dot stereogram. The orientation selectivity of the cells was also measured. Within the central 7° of the visual field the range of preferred horizontal disparities was more than twice the range of preferred vertical disparities. This confirms the results of Cumming’s experiment. Also, there was a vertical bias in the orientation tuning of the cells. However, outside the central region, there was no such anisotropy of disparity processing, and the orientation selectivity of cells showed a radial bias. These results conform to the energy model of disparity processing in which the axis of a cell’s preferred disparity is orthogonal to the cell’s preferred orientation. These results do not prove that vertical disparities and horizontal disparities are separately detected. According to the model proposed by Matthews et al. (2003), which is described in Section 20.2.5, vertical disparities are coded as equivalent horizontal disparities. 
They suggested that this accounts for the induced effect described in Section 20.2.3. Read and Cumming (2006) proposed that vertical disparity could be extracted from the activity of neurons sensitive only to horizontal disparity (see Section 11.10.1c). However, vertical disparities help in the coding of distance and surface curvature (Section 20.6). They also trigger vertical vergence (Section 10.6). These functions require the separate extraction of vertical disparities. Serrano-Pedraza and Read (2009) have produced psychophysical evidence that both the magnitude and sign of vertical disparities are explicitly coded in the visual system (see Section 20.2.5).
11.4.5 ORIENTATION AND DISPARITY TUNING
11.4.5a Disparity Direction and Stimulus Orientation The first question is whether a cell’s disparity-tuning function is related to the direction of disparity and to the orientation of the stimulus. For bar stimuli that span several
receptive fields, disparity tuning is necessarily maximal in a direction orthogonal to the bar, especially when the bar is at the preferred orientation of the cell. For bar stimuli shorter than a cell’s receptive field it is reasonable to suppose that the cell responds to smaller disparities in bars at right angles to the receptive field axis and to larger disparities in bars parallel to the receptive field. Maske et al. (1986a) found this to be true for cortical cells in area 17 of the cat that lacked inhibitory end zones in their receptive fields (non-end-stopped cells). End-stopped cells responded to disparities along the receptive-field axis as well as, or almost as well as, to disparities at right angles to the axis, as long as the stimuli were shorter than the receptive field. Also, in V1 of the monkey, binocular cells responded to disparities along the length of a bar as long as the ends of the bar fell within the receptive field (Howe and Livingstone 2006). When the bar extended beyond the receptive field, the cell responded only to disparities orthogonal to the bar. But the direction of the disparity remained ambiguous. This is the stereo aperture problem, which is discussed in Section 18.6.5. Prince et al. (2002a) showed that a linkage between the orientation preference of a cell and the preferred direction of disparity does not hold for stimuli with a broad band of orientations, such as random-dot displays. However, with random-dot displays, Prince et al. found that disparity tuning is related to stimulus orientation in another way. Tuning functions for horizontal disparity resembled Gabor functions for cells with an orientation preference near vertical but resembled Gaussian functions for cells tuned to near horizontal. This can be explained as follows. A Gabor function indicates the multilobed response modulation across the width of the cell’s oriented receptive field. 
A Gaussian function represents the single-lobed response modulation across the length of the cell’s oriented receptive field. A related question is whether cells jointly tuned to horizontal and vertical disparities respond equally to the two types of disparity. This question was discussed in Section 11.4.1d.
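The difference between the two tuning shapes described above can be sketched numerically. The fragment below is only an illustration: the envelope width, carrier frequency, and sampling grid are assumed values chosen for clarity, and the function names are hypothetical. None of them are taken from Prince et al. The sketch counts how often each tuning curve changes sign, as a crude index of whether the modulation is multilobed (Gabor) or single-lobed (Gaussian).

```python
import math

def gabor_tuning(d, sigma=0.5, freq=1.0, phase=0.0):
    """Gabor-shaped modulation: Gaussian envelope times a cosine carrier.

    d is disparity in degrees; sigma (deg) and freq (cycles/deg) are
    illustrative assumptions, not measured values.
    """
    envelope = math.exp(-d ** 2 / (2.0 * sigma ** 2))
    return envelope * math.cos(2.0 * math.pi * freq * d + phase)

def gaussian_tuning(d, sigma=0.5):
    """Single-lobed Gaussian modulation with the same assumed envelope width."""
    return math.exp(-d ** 2 / (2.0 * sigma ** 2))

def count_sign_changes(values):
    """Number of sign changes along a sampled curve (exact zeros are skipped)."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Sample disparities from -1 deg to +1 deg in steps of 0.01 deg.
disparities = [i / 100.0 for i in range(-100, 101)]
gabor = [gabor_tuning(d) for d in disparities]
gauss = [gaussian_tuning(d) for d in disparities]

print("Gabor sign changes:   ", count_sign_changes(gabor))
print("Gaussian sign changes:", count_sign_changes(gauss))
```

With these assumed parameters the Gabor curve alternates between excitatory and inhibitory flanks, whereas the Gaussian curve never changes sign. This is the qualitative distinction drawn in the text between modulation across the width and along the length of an oriented receptive field.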
11.4.5b Sensitivity to Different Orientations A second question is whether, in general, the visual system is more sensitive to horizontal disparities than to vertical disparities. Ohzawa and Freeman (1986a) found that the degree of binocular interaction for cells preferring horizontal stimuli was similar to that for cells tuned to vertical stimuli. Others have found that disparity-tuned cells preferring vertical stimuli (horizontal disparity) responded more strongly than those preferring horizontal stimuli (vertical disparity) (Maske et al. 1986a). Connections between neighboring cortical cells tuned to disparity were found to be stronger for cells tuned to vertically oriented stimuli than for cells tuned to horizontally oriented stimuli (Menz and Freeman 2004a).
STEREOSCOPIC VISION
Ohzawa et al. (1996) reported that about 30% of binocular simple cells in the visual cortex of the cat were sensitive to phase disparity. Almost all these cells preferred stimuli orientated between oblique and vertical. Binocular cells that preferred stimuli near the horizontal had receptive fields that matched in the phases of their ON and OFF regions. These results suggest that vertical disparities are detected in terms of position disparity rather than phase disparity. Anzai et al. (1999a) confirmed that cells in cat area 17 that are tuned to horizontal orientations (vertical disparity) have smaller phase disparities than cells tuned to vertical orientations. They did not find a corresponding anisotropy for position disparity. Barlow et al. (1967) had found the anisotropy using a measure of disparity that included contributions from both position and phase disparities. Others, including von der Heydt et al. (1978) and LeVay and Voigt (1988), found no anisotropy for a measure based on position disparity that did not allow for any contribution from phase disparity. In the cat visual cortex, Joshua and Bishop (1970) found that, with increasing horizontal eccentricity, the horizontal offset of receptive fields of binocular cells increased. Vertical receptive-field offset increased more gradually. Thus, the degree of anisotropy depends on the measure of disparity and stimulus eccentricity. Evidence reviewed in Section 20.3.2 shows that horizontal disparities are extracted locally whereas vertical disparities are pooled over large areas. Putting physiological and psychophysical evidence together suggests that phase disparities are extracted locally while position disparities are pooled.
11.4.6 DISPARITY TUNING AND EYE POSITION
11.4.6a Disparity Tuning and Viewing Distance The horizontal disparity between two objects a fixed distance apart in depth decreases approximately in inverse proportion to the square of the absolute distance of the objects. Therefore, viewing distance must be taken into account when judgments of relative distance are derived from disparity. The angle of vergence provides information about distance, at least for near distances. One might therefore expect that the disparity tuning of binocular cells somewhere in the visual system would be scaled by the vergence angle. Trotter et al. (1992, 1996) recorded from disparity-tuned cells in V1 of alert monkeys as they fixated a visual target at distances of 20, 40, and 80 cm (Portrait Figure 11.17). At each distance, an array of random dots was presented with various degrees of horizontal disparity relative to the fixation target. Dot size, display size, and disparities were scaled for distance so that the retinal images were the same for each distance. The response of most cells
Figure 11.17. Yves Trotter. Born in Quimperlé in Brittany in 1954. He obtained a Ph.D. in neurophysiology from the University of Paris with M. Imbert in 1981. He performed postdoctoral work with G. F. Poggio at Johns Hopkins University from 1981 to 1983. He then moved to the Research Center for Brain and Cognition in Toulouse, where he became a director of research in 1993. In 2007 he became codirector at the Institute Federatif de Recherche (IFR 96) at the Brain Sciences Institute of Toulouse.
was modulated by changes in viewing distance. For example, disparity selectivity emerged at only one distance or was sharper at one distance than at other distances. Gonzalez and Perez (1998a) obtained similar results. Cumming and Parker (1999) reported that changes in vergence had little effect on the disparity tuning of cells in monkey V1. However, they changed vergence angle by only 1 to 3.5° compared with a change of 2° to 10° in the experiment by Trotter et al. The responses of some cells in monkey V1, V2, and V4 to an optimally oriented bar were modulated by changes in vergence (Rosenbluth and Allman 2002). A few cells with similar properties have been found in monkey MT (Roy et al. 1992) and parietal lobe (Section 5.8.4e). However, cyclovergence that accompanies vergence could contribute to this effect. Responses of cells in the posterior parietal cortex of the monkey were modulated by changes in the distance of the visual stimulus (Sakata et al. 1980). Also, the disparity tuning of cells in this area was modulated by the convergence angle of the eyes (Genovesio and Ferraina 2004). Trotter et al. concluded that changes in disparity tuning with distance are mediated by changes in vergence or accommodation. They assumed that the pattern of retinal stimulation was the same at the different distances. However, fixation disparity may change with distance and changes in cyclovergence accompany changes in vergence. Since no precautions were taken to prevent or compensate for
these changes, the resulting changes in the alignment of the images may have caused the observed changes in the responses of cortical cells. Furthermore, the pattern of vertical disparities produced by a display in a frontal plane varies with distance (Section 20.6.3). This factor would contribute to the distance modulation of the response of cortical cells but only for stimuli subtending more than about 20°. Pouget and Sejnowski (1994) developed a neural-network model of disparity detectors modulated by vergence, which codes both relative and absolute distance. The role of vergence in judgments of absolute distance is discussed in Sections 20.4.1 and 25.2.
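The geometry behind the distance scaling discussed in this section can be sketched in a few lines. The fragment below assumes symmetric fixation on the midline and an interocular separation of 3.5 cm. That value is an illustrative assumption, not a figure from the text, and the function names are hypothetical. It computes the vergence angle at the three viewing distances used by Trotter et al. and the disparity produced by a fixed 1-cm depth step at each distance.

```python
import math

# Assumed interocular separation in cm (illustrative; not given in the text).
INTEROCULAR_CM = 3.5

def vergence_deg(distance_cm, interocular_cm=INTEROCULAR_CM):
    """Vergence angle (deg) for symmetric fixation at the given distance."""
    return math.degrees(2.0 * math.atan(interocular_cm / (2.0 * distance_cm)))

def relative_disparity_deg(distance_cm, depth_cm, interocular_cm=INTEROCULAR_CM):
    """Disparity (deg) of a midline point depth_cm beyond the fixated point.

    Computed as the difference between the vergence angles subtended
    by the nearer and the farther point.
    """
    near = vergence_deg(distance_cm, interocular_cm)
    far = vergence_deg(distance_cm + depth_cm, interocular_cm)
    return near - far

for d in (20.0, 40.0, 80.0):
    print(f"{d:5.0f} cm: vergence {vergence_deg(d):5.2f} deg, "
          f"disparity of a 1-cm step {relative_disparity_deg(d, 1.0):.3f} deg")
```

For small depth steps, doubling the viewing distance reduces the disparity to roughly one quarter, which is the inverse-square relation stated at the start of the section. A cell tuned to a fixed retinal disparity therefore signals very different depth intervals at different distances unless its output is scaled by a distance signal such as vergence.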
11.4.6b Disparity Tuning and Gaze Direction The responses of cells in the primary visual cortex of alert cats to a stimulus in a given retinal location have been found to vary with the direction of gaze (Weyand and Malpeli 1993). Trotter and Celebrini (1999) recorded from binocular cells in V1 as alert monkeys fixated a spot at the center of a 6° by 6° random-dot display with +0.6°, −0.6°, or zero horizontal disparity. About half the cells showed a change in response rate when the whole display was moved 10° to left or right of the median plane. The eyes fixated on the center of the display, which was tangential to the horizontal horopter in both positions. A few cells showed a change in their preferred disparity. It is unlikely that changes in vertical disparity would affect the results because the display and the eccentricity were both small. Fixation disparity (Section 10.2.4) may have changed with changing gaze angle, although any such change should have introduced a consistent shift in disparity-tuning functions for all cells. Rosenbluth and Allman (2002) found that the responses of some cells in monkey V1, V2, and V4 to an optimally oriented bar were modulated by changes in the direction of gaze. However, changes in cyclovergence that accompany changes in gaze were not allowed for. Any such change would modulate the disparity of the binocularly observed bar. Trotter et al. (2004) proposed that modulations of disparity coding by angle of gaze could be part of a mechanism that compensates for the fact that, with asymmetrical gaze, the normal to the line of sight does not lie on the tangent to the horopter (Vieth-Müller circle) (see Section 14.2.4).
11.4.7 DISPARITY IN CONTRAST-DEFINED STIMULI
On the left of Figure 11.18 is shown a luminance-defined grating, which is said to be a first-order stimulus. Below it is the receptive field of a cell that responds to the grating. The grating on the right is defined by modulations of contrast of a grating, and is said to be a second-order stimulus. The high-frequency grating is known as the carrier and the
Figure 11.18. Luminance and contrast-defined stimuli. A grating defined by contrast (second-order stimulus) does not stimulate first-order detectors that respond to luminance-defined gratings. If the outputs of spatial-frequency filters are rectified and squared they form a receptive field similar to that responding to luminance. Area 18 of the cat contains cells that respond to both types of grating. (Redrawn from Tanaka and Ohzawa 2006)
low-frequency contrast modulation is known as the envelope. The envelope contains no luminance energy within the receptive field of a cell that is tuned to the frequency of the envelope. However, if the outputs of spatial-frequency filters responding to the carrier are rectified and squared, luminance-defined and contrast-defined gratings of the same spatial frequency and orientation produce similar responses (see Section 5.6.3). Tanaka and Ohzawa (2006) examined the disparity tuning functions of binocular cells in cat area 18 for both luminance-defined and contrast-defined gratings, like those shown in Figure 11.18. Most cells responded to luminance-defined gratings, and 45% of these also responded to contrast-defined gratings. Three cells were selected that did not respond to the 1.3-cpd carrier grating but did respond to a contrast-modulated envelope of 0.11 cpd. While keeping the phase disparity of the carrier constant, they recorded the responses of these cells as the phase disparity of the
contrast-defined envelope was varied. The cells had disparity-tuning functions with peak responses at different phase disparities. The cells were not responsive to changes in the phase disparity of the carrier grating. For each cell the phase-disparity tuning function for a contrast-defined grating was similar to that for a luminance-defined grating of the same spatial frequency. The cells thus showed cue invariance arising from convergence of first-order luminance and second-order contrast processing mechanisms. However, the cells responded more vigorously to the luminance-defined stimuli than to the contrast-defined stimuli. Tanaka and Ohzawa discussed two mechanisms for processing the disparity of contrast-modulated gratings. In mechanism A, the outputs of first-stage filters are rectified and squared to produce cells with linear receptive fields, as depicted in Figure 11.19A. Monocular cells serving each eye then feed to a binocular cell. In mechanism B, rectification and squaring are delayed until after inputs from the two eyes converge on binocular cells, as depicted in Figure 11.19B. Mechanism A predicts the finding that disparity tuning is insensitive to phase disparity of the carrier grating. Mechanism B predicts an effect of phase disparity of the carrier. Also, for mechanism B, disparity tuning depends on position disparities in the first-stage filters. But these filters are tuned to high spatial frequencies and would not be able to detect disparities as large as those of the contrast-defined
envelope. Only small envelope disparities could be detected on the basis of disparities at the level of the carrier. We will see in Section 18.7.2d that people can detect disparities between contrast-defined modulations of disparity under appropriate conditions.
11.4.8 DYNAMICS OF DISPARITY DETECTORS
11.4.8a Response Variance The response rate of cortical cells of anesthetized monkeys to repetition of a given stimulus has a variance about equal to the mean response rate (Tolhurst et al. 1983). However, cells in V1 of alert monkeys have a much smaller response variance than those of anesthetized monkeys, especially when eye movements are minimized (Guret et al. 1997). Furthermore, the response variance of cells in V1 was no less than that of LGN cells. Low response variance improves the capacity of small numbers of cells to discriminate stimuli. Since most response variance was due to eye movements it was correlated between cells. Pooling of inputs at a higher cortical level would reduce effects of uncorrelated noise but not of correlated noise.
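The last point, that pooling reduces uncorrelated noise but not correlated noise, follows from the standard expression for the variance of a mean of correlated variables. The sketch below is a simplification under stated assumptions: every cell is given the same variance and every pair of cells the same correlation coefficient, and the numerical values are illustrative rather than measured. The function name is hypothetical.

```python
def pooled_variance(single_cell_variance, n_cells, correlation):
    """Variance of the mean response of n_cells cells.

    Assumes every cell has the same variance and every pair of cells
    has the same correlation coefficient (both are simplifications).
    """
    if n_cells < 1:
        raise ValueError("n_cells must be at least 1")
    return single_cell_variance * (1.0 + (n_cells - 1) * correlation) / n_cells

# Illustrative single-cell variance of 10 (arbitrary units).
for rho in (0.0, 0.2):
    values = [pooled_variance(10.0, n, rho) for n in (1, 10, 100, 1000)]
    print(f"correlation {rho}: {[round(v, 3) for v in values]}")
```

With zero correlation the pooled variance falls in proportion to the number of cells. With a nonzero correlation it approaches a floor equal to the correlation multiplied by the single-cell variance, however many cells are pooled.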
11.4.8b Time to Reach the Steady-State Response The processes involved in the creation of disparity selectivity of cortical cells could involve feedback signals in addition
Figure 11.19. Two models of processing disparities of contrast-defined stimuli. (A) Outputs of first-stage spatial-frequency filters are rectified to create monocular cells with linear receptive fields. Monocular cells for the two eyes then combine to form binocular cells. (B) Outputs of first-stage filters from the two eyes combine to form binocular cells. The outputs of these cells are then combined to form binocular cells with linear receptive fields. (Adapted from Tanaka and Ohzawa 2006)
to feedforward signals. Thorpe et al. (1991) found that disparity tuning of cortical cells in monkey V1 was fully developed in the first 10 ms of response, which suggests that only feedforward signals are involved. The preferred disparity of simple and complex cells in the visual cortex of the cat did not change over the initial 40 ms. However, the disparity range of the cells decreased and their tuning to the spatial frequency of disparity increased during this period (Menz and Freeman 2004b). In other words, the disparity tuning of cortical cells sharpened over time. These changes could be due to changes in spatial-frequency tuning of monocular cells that feed into binocular cells. Over a period of about 50 ms, cells in V1 of the monkey became more sharply tuned to spatial frequency, and the preferred spatial frequency shifted to a higher value (Bredfeldt and Ringach 2002). But changes at the monocular level could not explain Menz and Freeman’s finding that cells tuned to coarser spatial modulations of disparity had shorter response latencies than cells tuned to finer spatial modulations. Also, lateral connections going from cells with coarse tuning to cells with fine tuning were stronger than connections in the opposite direction (Menz and Freeman 2003). These findings support the notion of a progressive refinement of disparity tuning over a time period of 40 ms after stimulus onset. Psychophysical evidence of coarse-to-fine disparity tuning is discussed in Sections 18.7.2e and 18.12.1c.
11.5 DISPARITY DETECTION IN HIGHER VISUAL CENTERS OF PRIMATES
11.5.1 DISPARITY DETECTORS IN V2 AND V3
The four types of disparity detector that occur in monkey V1 are prevalent in V2, where at least 70% of neurons are tuned to horizontal disparity (Poggio and Poggio 1984; Hubel and Livingstone 1987). Ablation of the foveal region of V2 severely elevated the stereo threshold (Cowey and Wilkinson 1991). Visually evoked potentials in humans have also revealed a larger number of binocular cells in V2 than in V1 (Adachi-Usami and Lehmann 1983). In both macaque and humans, there is a strong fMRI response in V3 and the caudal parietal cortex to disparity in random-dot displays (Tsao et al. 2003a). It was mentioned in Section 11.4.1g that Cumming and Parker found no cells in V1 that were sensitive to relative disparity between neighboring points. Von der Heydt et al. (2000) (Portrait Figure 11.20) found cells in V1 of anesthetized monkeys that responded to local disparities in a random-dot stereogram. However, V1 cells did not respond to the edges of a cyclopean shape defined by disparity, even though they responded to
Figure 11.20. Rudiger von der Heydt. Born in Rauschensamland, Germany, in 1944. He studied physics at the Universities of Göttingen, Marburg, and Munich from 1963 to 1969 and had research training in neurophysiology with G. Baumgartner in the department of neurology at the University Hospital Zurich. He obtained a doctorate from the Swiss Federal Institute of Technology in 1993. He has been professor of neuroscience at the Johns Hopkins University School of Medicine since 1993 and professor at the Krieger Mind/Brain Institute since 1994. He received the Alfred Vogt Preis Award of the Swiss Ophthalmological Society in 1986 and the Golden Brain Award, Minerva Foundation, Berkeley, in 1993.
contrast-defined edges. Many cells in V2 were sensitive to relative disparity in random-dot stereograms, and many responded to contrast-defined edges and disparity-defined edges. They signaled the sign and orientation of the depth step and the orientation of an edge. It seems that detection of a disparity-defined edge requires an extra stage that is achieved in V2. The extra stage involves detection of disparity extending over several detectors. Cells in V2 of the monkey respond to disparities outside their receptive fields when they indicate an edge running across the receptive field (Section 22.2.3a). Thomas, Cumming, and Parker (2002) found cells sensitive to relative disparity in V2 of the alert macaque monkey. They used dynamic random-dot stereograms with a central patch that just covered the smallest area that elicited a response in a cell in V2. For a cell sensitive to only absolute disparity, the peak of its tuning function in response to changing disparity in the central patch would not shift when the disparity in the surrounding area changed. In this case the shift ratio is zero. The largest response in such a cell would be produced by a particular disparity in the center whatever the disparity in the surround. For a cell sensitive to relative disparity, the peak of
Figure 11.21. Detection of relative disparity. Disparity tuning functions of a cell in V2 as a function of disparity in the surround. Firing rate (impulses/s) is plotted against disparity of the central patch (deg) for surround disparities of −0.45°, 0°, and +0.45°. (Adapted from Thomas et al. 2002)
the tuning function to changing disparity of the central patch would shift when the disparity of the surround changed. When the shift is equal in direction and magnitude to the change in the disparity of the surround, the shift ratio is 1. Not all cells in V2 were sensitive to relative disparity. Some cells had a shift ratio of near 1, while others had ratios considerably less than 1. For some cells, the shift ratio was 1 only over the middle of the cell’s disparity-tuning function. These cells respond to relative disparity only over a limited range of disparities. Figure 11.21 shows the disparity-tuning function of one cell to changes in disparity of the central patch of a random-dot stereogram for each of three disparities of the surround. It can be seen that the peak of the cell’s tuning function when the surround had zero disparity was about +0.2°. The peak shifted to about 0.6° when the surround disparity was +0.45°, and to about 0° when the surround disparity was −0.45°. Thus, the shift was greater when the surround disparity was nearer the preferred disparity of the cell. This was a characteristic of several cells. Cells with a preferred disparity of zero showed the greatest sensitivity to relative disparity. This corresponds to the psychophysical finding that disparities are most easily discriminated when presented on a zero-disparity pedestal (Section 18.3.3). Thomas et al. modeled the response characteristics of cells sensitive to relative disparity by summing the outputs of two neighboring cells followed by a nonlinear half-squaring. The nonlinearity ensures that the output is largest when there is an appropriate relative disparity between the two cells. Given that cells in V2 are sensitive to relative disparity, one can ask whether they are specifically sensitive to the
sign of relative depth. The sign of depth of a vertical step of disparity refers to whether the nearer side is on the left or on the right. We saw in Section 5.8.2a that, in monkey V2, responses of many cells to an edge depend on which side of the edge is in a figure region (Zhou et al. 2000). The responses emerged in less than 25 ms, which suggests that integration of figural information starts at this early level. Qiu and von der Heydt (2005) found that many cells in V2 responded specifically to the sign of depth at an edge of a square defined by disparity in a random-dot stereogram. For example, a given cell responded when the surface on the left was nearer than the surface on the right but not to the reverse step. Also, cells selective to the sign of border ownership in 2-D figures showed the same selectivity to border ownership indicated by disparity. When 2-D cues to border ownership conflicted with the sign of disparity, the cells’ response to the sign of disparity was reduced. Responses of binocular cells in V2 to the sign of a disparity-defined edge could depend on their sensitivity to relative disparity rather than to the direction of figure boundaries. Bredfeldt and Cumming (2006) recorded from cells in V2 of alert macaque monkeys in response to a step in disparity along the diameter of a 6° random-dot disk. Many cells responded better to the disparity step than to a disk with uniform disparity and were broadly tuned to the orientation of the edge. In many cases, the response was specific to a particular disparity step. Also, the cells frequently responded to edges with different depth signs, but only when the edges were in different locations. Bredfeldt and Cumming concluded that responses of these cells to disparity-defined edges arise from convergence of signals from cells in V1 with different tuning to disparity. Responses to edges in V2 could support the detection of cue-invariant edges at a higher level. 
However, the simple bipartite stimulus used by Bredfeldt and Cumming contained no figure-ground information. They admitted that complex processing of figure-ground information might occur in V2. Nienborg and Cumming (2007) asked whether depth discrimination in monkeys is related to responses of disparity-selective cells in V2. They measured the ability of monkeys to discriminate the sign of depth in a central area of a random-dot stereogram as a function of the proportion of dots with random disparity in the central area. Monkeys based their decisions mainly on the presence or absence of dots with near disparity rather than on dots with far disparity. At the same time, Nienborg and Cumming recorded from cells in V2. The responses of cells preferring near disparity correlated with the monkey’s bias to use near disparities. After one monkey had been trained to put equal weight on near and far disparities, the responses of cells preferring far disparities became correlated with psychophysical choices. Chen et al. (2008) asked whether cells in V2 tuned to different disparities are arranged in distinct columns.
They used optical imaging and single-cell recording as anesthetized monkeys were exposed to a random-dot stereogram containing different disparities. Cells tuned to crossed disparities were segregated from those tuned to uncrossed disparities, and their separation increased with increasing difference in disparity. Cells in the same cortical column shared the same disparity selectivity. They found no consistent relationship between orientation selectivity and disparity selectivity. In V3 of the monkey, which borders V2, about half the cells were found to be disparity-tuned. Of these, about half were tuned excitatory cells centered on zero disparity. The others were tuned inhibitory cells or were tuned to either crossed or uncrossed disparity (Burkhalter and Van Essen 1986; Felleman and Van Essen 1987). In the macaque, cells in V3 with similar joint tuning to orientation and disparity are organized into columns (Adams and Zeki 2001). This organization is well suited to the extraction of higher-order orientation disparities and disparity gradients required for the perception of 3-D shape. We will see that V3 projects to the posterior parietal cortex, which processes 3-D shape. Poggio (1995) measured the responses of cells in cortical areas V3 and V3A of the monkey to horizontal disparity in a central stimulus. Responses were reduced when vertical disparity was added to the stimulus. The results are consistent with the idea that cells in the central visual field are tuned to horizontal disparity but that a vertical disparity perturbs the matching process and thereby reduces the response to horizontal disparity.
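The shift ratio used by Thomas et al. (2002), described earlier in this section, can be illustrated with the approximate peak positions quoted in the text for the cell in Figure 11.21. The calculation below is a minimal sketch. It assumes that the shift ratio is simply the displacement of the tuning peak divided by the change in surround disparity, which is a simplification and may not match the fitting procedure the authors actually used. The function name is hypothetical.

```python
def shift_ratio(peak_with_surround, peak_zero_surround, surround_disparity):
    """Shift of the tuning peak divided by the change in surround disparity.

    0 indicates coding of absolute disparity; 1 indicates full coding of
    relative disparity. All arguments are in degrees.
    """
    if surround_disparity == 0:
        raise ValueError("surround disparity must differ from zero")
    return (peak_with_surround - peak_zero_surround) / surround_disparity

# Approximate peak positions quoted in the text for the cell in Figure 11.21.
PEAK_ZERO_SURROUND = 0.2      # deg, surround disparity 0
PEAK_PLUS_SURROUND = 0.6      # deg, surround disparity +0.45
PEAK_MINUS_SURROUND = 0.0     # deg, surround disparity -0.45

ratio_plus = shift_ratio(PEAK_PLUS_SURROUND, PEAK_ZERO_SURROUND, 0.45)
ratio_minus = shift_ratio(PEAK_MINUS_SURROUND, PEAK_ZERO_SURROUND, -0.45)
print(f"shift ratio for +0.45 deg surround: {ratio_plus:.2f}")
print(f"shift ratio for -0.45 deg surround: {ratio_minus:.2f}")
```

On these approximate values the ratio is larger for the +0.45° surround than for the −0.45° surround. This agrees with the statement that the shift was greater when the surround disparity was nearer the preferred disparity of the cell.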
11.5.2 DISPARITY DETECTORS IN THE DORSAL STREAM
11.5.2a Disparity Detectors in MT and MST Disparity-tuned cells occur in the dorsal processing stream that leads through MT and MST to the parietal lobe (Section 5.8.4). This system is specialized for coding low spatial frequency, fast flicker and motion, spatial location, and coarse stereopsis. About two-thirds of cells tested in MT of the anesthetized monkey belong to the same four disparity-tuned types found in V1. Disparity-tuned cells in MT occur in patches between 0.5 and 1 mm in diameter interspersed with regions not tuned to disparity. Cells in the same vertical column have similar sensitivity to disparity and similar preferred disparity. Across each patch there is a smooth transition of preferred disparity from crossed to uncrossed (DeAngelis and Newsome 1999). With bar stimuli, most of these cells were as sensitive to vertical disparity as to horizontal disparity (Maunsell and Van Essen 1983). DeAngelis and Uka (2003) recorded from cells in MT of alert monkeys in response to a random-dot stereogram. Of the cells tested, 93% were tuned to horizontal disparity. The tuning functions were broader than those of cells in V1
and tended to be well fitted by odd-symmetric Gabor functions. The preferred disparities were more strongly correlated with the phase disparity of the receptive fields than with their offset disparity (Section 11.4.1). DeAngelis et al. (1998) showed that binocular detectors in MT of the monkey are involved in stereopsis. They electrically stimulated clusters of MT cells possessing similar disparity preference. This biased the responses of monkeys in a near-far depth discrimination task in the direction of the disparity preference of the cells that were stimulated. The psychophysically determined sensitivity of monkeys to disparity-defined depth was similar to the mean sensitivity of MT neurons to depth differences (Uka and DeAngelis 2003). Uka and DeAngelis (2004) recorded from cells in MT that were tuned either to crossed (near) or to uncrossed (far) disparities, while monkeys performed the task of indicating whether a stimulus was nearer than or beyond a fixation point. The “near” and “far” choices were correlated with the activation of “near” and “far” neurons. In other words, the psychometric function could be predicted from the probability of responses of the two types of disparity detector. Responses of disparity detectors maximally tuned to zero disparity were not related to the animals’ choices. Microstimulation of disparity-tuned cells in MT biased the responses of monkeys in the task of detecting the sign of a disparity. However, microstimulation did not affect the ability of monkeys to indicate the relative depths of two stimuli irrespective of their absolute disparities (Uka and DeAngelis 2006). Uka and DeAngelis concluded that MT codes the absolute sign of coarse disparities but not fine relative disparities. They suggested that fine relative disparities are coded in the ventral cortical stream. 
After MT had been inactivated with muscimol, monkeys were not able to discriminate the sign of depth of a drifting patch of random dots with 0.5° of disparity relative to a fixation point (Chowdhury and DeAngelis 2008). However, inactivation of MT had no effect after the monkeys had been trained to detect the sign of depth of a patch of moving dots with fine disparity with respect to a stationary patch with variable disparity. This training did not affect the disparity-tuning functions of MT cells. This suggests that the fine-disparity training had recruited areas outside MT to perform the coarse-disparity task. However, it is not clear just what had been learned. The coarse stimulus had a large disparity, a comparison point with zero disparity, and disparity noise. The fine stimulus had small disparities, a comparison stimulus consisting of a patch of dots with variable disparity, and no disparity noise. The conclusion that MT processes only coarse disparity must be modified, because many cells in MT were found to respond selectively to the disparity-defined 3-D orientation of a random-dot surface (Nguyenkim and DeAngelis 2003). Many cells were tuned to the orientation of the axis of slant. They responded more vigorously when slant was increased,
but the shape of the tuning functions was not much affected by changes in slant magnitude. For these cells, the orientation of disparity-defined slant and the magnitude of slant are coded separately. Other cells were tuned to the orientation of slant, but the tuning functions shifted horizontally with changes in slant magnitude. Many other cells were not tuned to slant. Cells in MT showed similar disparity tuning for both drifting and stationary random-dot stereograms. Although most cells responded more strongly to moving stimuli than to stationary stimuli, some cells preferred stationary stimuli (DeAngelis et al. 2000; Palanca and DeAngelis 2003) (Portrait Figure 11.22). Some cells in the lateral-ventral region of MST are jointly tuned to motion and binocular disparity (Komatsu et al. 1988; Roy et al. 1992). For some cells the disparity preference in the center of the receptive field differs from that in the surround (Bradley and Andersen 1998; Eifuku and Wurtz 1999). Thus, these cells respond to spatial gradients of disparity. They could be involved in perceptual segmentation of moving camouflaged objects. Fernández et al. (2002) developed a model of these mechanisms in MT. Cells in MT did not respond strongly to superimposed patterns of dots moving in opposite directions in the same depth plane. However, the same cells responded strongly to patterns moving in a given direction in a distinct disparity-defined depth plane (Bradley et al. 1995). For example,
some cells preferred upward motion of a pattern with crossed disparity (nearer than the plane of convergence). With a random-dot transparent cylinder rotating about its horizontal axis, these cells responded most strongly when the front of the cylinder moved downward (Bradley et al. 1998). When disparity is set at zero, the direction of rotation of the 2-D image of a rotating cylinder is ambiguous and perceptually alternates. Some cells in MT of the monkey changed their response when the animal signaled that a change in direction of rotation of an ambiguous cylinder had occurred. Dodd et al. (2001) showed that this correlation between perception and neuronal responses is not due to eye movements or attention to specific locations. These cells are thus able to distinguish motion signals arising from surfaces at different actual depths or different perceived depths. In monkeys and humans, fMRI revealed strong responses in MT to a monocular display rotating in depth (Vanduffel et al. 2002). The human MT showed fMRI responses specifically to a random-dot stereogram that generated an impression of motion in depth (Rokers et al. 2009). Other evidence of disparity tuning in MT and MST is presented in Section 22.3.2. The relationship between depth perception and responses of MT neurons prompted Krug et al. (2004) to hypothesize that MT neurons would be insensitive to anticorrelated stereograms that do not evoke depth percepts. Some MT neurons responded only to disparity of correlated stereograms, but some responded to disparity in both correlated and anticorrelated stereograms. However, all cells showed tuning to the direction of rotation of a correlated random-dot cylinder. It is not clear whether these results are due to feedback from higher centers or to the rejection of anticorrelated disparity signals at a higher level in the visual system. Cells in MST, also, show tuning to disparity in anticorrelated stereograms (Takemura et al. 2001).
Evidence reviewed in Section 10.10.1 shows that MT and MST are involved in the control of vergence eye movements and that anticorrelated disparities evoke vergence responses. Motion and depth signals generated in MT feed into the parietal cortex, which is also involved in the control of vergence. All this evidence suggests that the dorsal stream of cortical processing is concerned with processing disparities for the purpose of controlling vergence eye movements, in addition to any contribution it may make to depth perception.

Figure 11.22. Gregory C. DeAngelis. Born in Fairfield, Connecticut, in 1965. He obtained a B.Sc. in biomedical engineering at Boston University in 1987 and a Ph.D. in bioengineering at Berkeley with R. D. Freeman in 1992. He conducted postdoctoral work at Stanford University with W. T. Newsome between 1995 and 1999. He held an academic appointment in the department of neurobiology at Washington University School of Medicine. In 2007 he was appointed professor of brain and cognitive sciences at the University of Rochester.
11.5.2b Disparity Detectors in the Parietal Cortex
Cells selectively responsive to the 3-D orientation of objects have been found in the caudal part of the lateral bank of the intraparietal sulcus (CIP and LIP) of monkeys (Sakata et al. 1999). Some cells responded to the 3-D orientation of the
long axis of thin stimuli. Other cells were tuned to the 3-D orientation of flat surfaces. Some of these cells responded to a surface depicted in a random-dot stereogram, in which depth is specified only by disparity (Taira et al. 2000). In a later study, it was found that most cells in area CIP that were sensitive to surface orientation in depth responded when the cue was either disparity or perspective or when both cues were present (Tsutsui et al. 2001). However, these cells responded more strongly to disparity than to perspective. Some cells responded exclusively to disparity-defined depth and a few cells responded exclusively to perspective. A few cells responded only when both cues were present. Monkeys showed impaired discrimination of surface orientation when CIP was inactivated by injection of muscimol. Areas CIP and LIP connect with the anterior intraparietal cortex, an area concerned with manipulation of objects (Section 5.8.4). Disparity-sensitive neurons in LIP are also related to the control of eye movements in 3-D space (Gnadt and Mays 1995). The neighboring anterior intraparietal area (AIP) of the monkey showed fMRI activation in response to disparity-defined slanted and curved surfaces (Durand et al. 2007b). Single-cell recordings from experiments in the same laboratory showed that most AIP neurons responded preferentially to disparity-defined 3-D shapes independently of the absolute distance or retinal location of the shapes (Srivastava et al. 2009).
11.5.3 DISPARITY DETECTORS IN THE VENTRAL STREAM
11.5.3a Disparity Detection in V4
Area V4 is the major initial stage in the ventral visual pathway. This is the pathway associated with pattern recognition and fine stereopsis (Section 5.8.3). In macaque area V4, 72% of cells were tuned to disparities in the disparity range −1° to +1° (Hinkle and Connor 2005). There was a continuous gradation of tuning functions over the dimensions of tuned excitatory and tuned inhibitory functions and near and far functions. More cells were tuned to crossed disparities (near stimuli) than to uncrossed disparities. Disparity tuning of cells in V4 was similar for different locations of the stimulus in the cell’s receptive field, and cells with similar tuning were clustered together (Watanabe et al. 2002). Hegdé and Van Essen (2005a) found that 67% of cells in V4 of the macaque were tuned to disparity, but that only about 12% of these cells showed similar tuning functions for disparity in bar stimuli as for disparity in random-dot displays. Tanabe et al. (2005) found that about half the cells they recorded from V4 of alert macaque monkeys responded to disparity in dynamic random-dot stereograms. Most of the cells were tuned to near zero disparity, and their tuning functions were narrower for random-dot stereograms than for line stereograms. We saw in Section 11.5.1 that some cells in V2 are specifically sensitive to the relative disparity between two regions. With the same procedure used by Thomas et al. (2002), Umeda et al. (2007) found a greater proportion of cells of this type in V4, although their tuning to relative disparity was not perfect. Hinkle and Connor (2002) found cells in V4 that were specifically tuned to slant in depth of bars. Hegdé and Van Essen (2005b), also, found that the responses of many cells in V4 were modulated by changes in disparity-defined slant of bars and surfaces in random-dot stereograms. However, they found little evidence of cells that were selectively responsive to 3-D bumps and dents in surfaces in random-dot stereograms. We will see that cells sensitive to 3-D shapes occur higher in the ventral processing stream. The tuning of some cells in V1, V2, and V4 to changes in stimulus size depended on the distance of the stimulus. Some cells responded best to changes in size of near stimuli while others preferred far stimuli. Some of the cells retained their distance sensitivity under monocular conditions, but, for most of these cells, it was not possible to decide which depth cues the cells were responding to (Dobbins et al. 1998). Lesions of V4 in monkeys produced no defects in stereopsis as tested with static or dynamic random-dot stereograms or Gaussian patches (Schiller 1993).
11.5.3b Disparity Detection in the Inferior Temporal Cortex
The inferior temporal cortex is concerned with shape recognition (Section 5.8.3b). Some cells in this area in the monkey responded selectively to shapes defined by luminance, texture, or motion (Sáry et al. 1995). Some cells responded selectively to disparity-defined shapes in random-dot stereograms (Tanaka et al. 2001). The responses were the same to the same shapes defined by different patterns of random dots. For some cells, responses to disparity-defined shapes were correlated with responses to shapes defined by luminance or texture. Cells in the lower subregion of monkey inferior temporal cortex (area TE) were selectively responsive to the 3-D slant of textured surfaces defined either by a disparity gradient or by a texture gradient (Liu et al. 2004). They were relatively insensitive to the texture elements that defined the gradient. Cells with similar cue invariance have been reported in the intraparietal sulcus (Section 11.5.2b). About 50% of cells in the inferior temporal cortex of alert rhesus monkeys were found to be selective for the global 3-D structure of convex and concave disparity-defined random-dot displays depicting surfaces curved about a horizontal axis (Janssen et al. 2000a). They were
not sensitive to local changes in disparity. These cells are therefore sensitive to higher spatial derivatives of disparity. Most of them responded selectively either to disparity along the edges of a curved surface or to disparity gradients within the surface. Some cells responded selectively to the magnitude and direction of curvature about a vertical axis (Janssen et al. 2001). Most of the above cells were in the lower part of area TE. Only a few of the cells were found in the lateral part of TE. Cells in both parts were selective for 2-D shapes. Typically, the response of cells to a preferred 3-D shape was greater than the sum of their responses to monocular stimuli. In other words, the cells showed binocular summation. Cells sensitive only to 2-D shapes did not show binocular summation, and often showed binocular inhibition. Most of the cells sensitive to 3-D shape were selective for either disparity gradients or disparity curvature. The response of disparity-curvature detectors was disrupted by disparity discontinuities, such as depth edges and steps. Most cells in TE maintained their response when the stimulus was moved 3.2° in various directions. The response of all cells was affected to some degree by changes in stimulus size or curvature (Janssen et al. 2000b). Responses of disparity-sensitive cells in the inferior temporal cortex of the macaque varied with changes in the animal’s decision about whether a shape with a fixed small disparity was nearer than or beyond a fixation stimulus (Uka et al. 2005). Cells in V1 respond to disparities between anticorrelated texture elements (Section 11.4.1d) and so do some cells in MT and MST (Section 11.5.2a). In the inferior temporal cortex, cells sensitive to disparity-defined 3-D shapes were insensitive to disparity between anticorrelated images (Janssen et al. 2003). It thus seems that dichoptic images are correctly matched for contrast at the level of the inferior temporal cortex.
Even in V4, which feeds into the inferior temporal cortex, the response of cells to anticorrelated images was much less than that of cells in V1 (Tanabe et al. 2004). Uka et al. (2000) found that most neurons in the inferior temporal cortex of the alert monkey were selective for both shape and disparity. Most cells were “near” cells or “far” cells. Only a few cells were tuned to zero disparity. Neighboring neurons had similar disparity selectivity. The receptive fields of most cells included the fovea, and disparity selectivity was reasonably constant when the stimulus was moved 2° in any direction from the fovea. Some cells in the monkey frontal eye fields are sensitive to coarse disparities (Ferraina et al. 2000). These cells may be related to the planning of large vergence eye movements, since the frontal eye fields receive inputs from LIP in the parietal lobe, an area concerned with eye movements (see Section 5.8.4e). Also, the frontal eye fields project to the superior colliculus, an area concerned with saccadic eye movements.
11.5.4 PARVO- AND MAGNOCELLULAR DISPARITY DETECTORS
Livingstone and Hubel (1988) proposed that the parvocellular system is blind to disparity-defined depth. They based their conclusion on the report by Lu and Fender (1972) that depth cannot be seen in an isoluminant random-dot stereogram. However, this argument relies on the false assumption that the parvocellular system is wholly chromatic. The parvocellular system is not merely a color-opponent system. It also codes high spatial-frequency stimuli defined by luminance contrast (Section 5.6.5). An isoluminant stimulus shuts off only the luminance-contrast component of the parvocellular system. In any case, stereopsis could not be confined to the magnocellular system because that system does not have the spatial resolution exhibited by the disparity system (Section 18.3.1). The more reasonable conclusion is that the chromatic component of the parvocellular system does not code depth. Even this conclusion has to be modified, as we will see in the following and in Section 17.1.4. The distinction between the chromatic and luminance channels is not the same as that between the parvocellular and magnocellular systems. While the magnocellular system is wholly or almost wholly achromatic, the parvocellular system is both chromatic and achromatic. It is structurally simple but functionally complex and can be considered to consist of four subchannels. Which of the four subchannels is activated depends on the spatial and temporal characteristics of the stimulus (Ingling and Martinez-Uriegas 1985; Ingling 1991). Consider a retinal ganglion cell in the red-green (r-g) opponent system. For a plain steady stimulus (low spatial and temporal frequency), responses from the red and green zones of the cell’s receptive field are subtracted to yield an opponent chromatic signal.
This subtractive process manifests itself as a photometric subadditivity, in which the threshold for detection of a mixture of green and red light is higher than one would predict from the thresholds of green and red lights presented separately. When the red-green stimulus is flickered, the red and green components begin to add to yield a luminance signal. The cell then loses its spectral opponency and shows photometric additivity. When a high spatial-frequency pattern is added to a steady red-green stimulus, the r-g components again begin to add. The cell again loses its spectral opponency and shows photometric additivity. Therefore, the r-g system is a pure color-opponent system for low spatial and low temporal frequencies, a pure luminance system for either high temporal frequencies or high spatial frequencies, and a mixed system in the middle range of spatial and temporal frequencies. Thus, the parvocellular system is not likely to register binocular disparities in isoluminant stereograms containing high spatial-frequency patterns. We will see in Section 17.1.4
that the parvocellular system can register disparities in stereograms with isoluminant low spatial-frequency patterns. The magnocellular system has no color opponency and cannot process high spatial frequencies. Therefore, this system will not register disparity-defined depth in any isoluminant stimuli or in fine patterns defined by luminance contrast. The magnocellular or parvocellular layers in the monkey LGN can be destroyed selectively by ibotenic acid. Lesions in the parvocellular layers severely impaired detection of depth defined by small disparities in random-dot stereograms containing elements with high spatial frequency. Lesions had less effect on the detection of large disparities, especially in stereograms with low spatial frequency elements. Lesions in the magnocellular layers produced deficits in high-frequency flicker and motion perception but had no effect on stereoscopic vision, even for low spatial frequency stereograms (Schiller et al. 1990). This suggests that depth in luminance-defined low spatial frequency stereograms is detected just as well in the parvocellular system as in the magnocellular system. Disparity in afterimages can create a sensation of depth (Section 18.10.2a). Ingling and Grigsby (1990) reported depth in afterimages of a perspective illusion and in afterimages of a reversible perspective figure. They assumed that afterimages do not arise in the transient magnocellular system and concluded that these sensations must arise in the parvocellular system. Depth sensations in these illusions are absent at isoluminance, so that if we accept Ingling and Grigsby’s assumption, the sensations they observed must have arisen in the luminance component of the parvocellular system. However, this argument is weakened by evidence that afterimages do arise in the magnocellular system (Schiller and Dolan 1994). 
Stereoacuity of human subjects was high in a dynamic random-dot stereogram when the luminance of each element was modulated smoothly over 300 ms (Kontsevich and Tyler 2000). This stimulus was designed to stimulate the sustained parvocellular system, which responds best to low temporal frequencies. Stereoacuity was low when element luminance was modulated in transient steps over the same interval. This stimulus was designed to stimulate the transient magnocellular system. Thus, the parvocellular system provides the predominant contribution to high-resolution stereopsis. Schiller et al. (2007) designed random-dot displays that contained regions in different depths defined by disparity, by motion parallax, or by both cues. One display was achromatic and with low spatial frequency so that it stimulated the magnocellular system. Another display was defined by differences in color and had high spatial frequency so that it stimulated the parvocellular system. Monkeys made saccadic eye movements to a target with defined depths with respect to other targets. The results indicated that the magnocellular system plays a central role in processing
motion parallax on account of that system’s sensitivity to motion. However, the sensitivity of the magnocellular system to motion was degraded with isoluminant stimuli. On the other hand, the results indicated that both the magnocellular and parvocellular systems code disparity-defined depth for low spatial-frequency stimuli but that only the parvocellular system codes disparity at high spatial frequencies. Monkeys responded more rapidly to stimuli in which depth was specified by disparity than to stimuli in which depth was specified by motion parallax. Although ganglion cells of the magnocellular system conduct at a higher velocity than cells of the parvocellular system, the detection of motion parallax involves integration of signals over time. Schiller et al. suggested that one reason for the evolution of stereopsis is that it processes depth rapidly. This evidence, and other evidence cited in Section 17.1.4, leads to the following conclusions. Only the parvocellular system processes disparity in fine patterns defined by luminance contrast or in coarse isoluminant patterns. Both the parvo- and magnocellular systems process disparity in coarse patterns defined by luminance contrast. Verhoef et al. (2010) found that cells in the inferotemporal cortex (ventral stream) responded while monkeys discriminated between convex and concave stereograms. Cells in the intraparietal area (dorsal stream) responded after monkeys had completed the discrimination. They concluded that the ventral stream is responsible for 3-D shape discrimination and that the dorsal stream is concerned with initiating behavior based on discriminations.
11.6 HIGHER-ORDER DISPARITIES
According to the evidence reviewed in Section 11.4.1g, disparity detectors in V1 respond only to local absolute disparities. Cells sensitive to local discontinuities of disparity that occur along steps in depth are found in V2 and V3 (Section 11.5.1).
We will now see that, at higher levels, there are cells specifically sensitive to the following patterns of relative disparity:
1. Disparity defined by a difference in spatial periodicity (dif-frequency disparity). These disparities are produced by surfaces slanted about a vertical axis.
2. Disparity defined by horizontal shear (relative orientation) of the two images. These disparities are produced by surfaces inclined about a horizontal axis.
3. Second-order spatial derivatives of disparity produced by surfaces curved in depth.
4. Relations between disparity and motion.
5. Relations between spatial and temporal disparities.
6. Relations between horizontal and vertical disparities.
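The first three patterns in this list can be thought of as spatial derivatives of a disparity field. The Python sketch below is a rough illustration under stated assumptions: disparity is treated as a smooth function of image position, a slanted or inclined surface is approximated by a linear disparity field, and a curved surface by a quadratic one. The function names and the numerical values are hypothetical and are not taken from any study cited here.

```python
def disparity_derivatives(disparity, x, y, h=0.01):
    """Central-difference derivatives of a disparity field at (x, y).

    disparity: function (x, y) -> horizontal disparity in degrees,
               with x and y in degrees of visual angle.
    Returns (d/dx, d/dy, d2/dx2, d2/dy2).
    """
    here = disparity(x, y)
    right, left = disparity(x + h, y), disparity(x - h, y)
    up, down = disparity(x, y + h), disparity(x, y - h)
    grad_x = (right - left) / (2.0 * h)   # horizontal gradient (slant)
    grad_y = (up - down) / (2.0 * h)      # vertical gradient (inclination)
    curv_x = (right - 2.0 * here + left) / h ** 2  # curvature across x
    curv_y = (up - 2.0 * here + down) / h ** 2     # curvature across y
    return grad_x, grad_y, curv_x, curv_y


# Hypothetical disparity fields (small-angle approximations).
def slanted(x, y):
    """Surface slanted about a vertical axis: width disparity."""
    return 0.1 * x


def inclined(x, y):
    """Surface inclined about a horizontal axis: shear disparity."""
    return 0.2 * y


def vertical_ridge(x, y):
    """Surface curved about a vertical axis: second-order disparity."""
    return -0.5 * x * x


if __name__ == "__main__":
    for field in (slanted, inclined, vertical_ridge):
        values = disparity_derivatives(field, 0.3, -0.2)
        print(field.__name__, [round(v, 4) for v in values])
```

A slanted surface yields only a horizontal gradient, an inclined surface yields only a vertical gradient, and the ridge yields a constant second derivative with a first derivative that changes across the surface.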
11.6.1 DETECTION OF HORIZONTAL DISPARITY GRADIENTS
An evenly textured surface slanted in depth about a vertical axis produces images in the two eyes that differ in horizontal width (Section 20.2). A horizontal-width disparity may be defined as a horizontal gradient of point disparity or as a dif-frequency disparity. A dif-frequency disparity is an interocular difference in the spatial periodicity of the images in the two eyes. This section is concerned with whether the visual system detects width disparities by cells sensitive to gradients of point disparity or by cells with monocular receptive fields that differ in spatial periodicity, as depicted in Figure 11.23A. Psychophysical evidence for dif-frequency detectors is not conclusive, as we will see in Section 20.2.1. But what about the physiological evidence? Blakemore (1970a) proposed that dif-frequency disparities are detected by specialized disparity detectors, distinct from those that detect point and orientation disparities. Hubel and Wiesel (1962) and Maske et al. (1984) failed to find cells in the primary visual cortex with different
receptive-field structures, but their methods were not refined enough to reveal the crucial differences. Hammond and Pomfrett (1991) reported that, for a majority of cells in the cat visual cortex, the spatial frequency evoking the best response from one eye was slightly different from that evoking the best response from the other eye. Most cells of this type were tuned to a higher spatial frequency in the dominant eye than in the other eye. Furthermore, the cells were more likely to be tuned to orientations close to the vertical than were cells with matching spatial-frequency characteristics. However, according to recent evidence reviewed in Section 11.4.1g, disparity detectors in V1 respond to absolute disparities but not to relative disparities between neighboring object points. Cells higher in the visual system are sensitive to surface slant. The cells could derive their sensitivity to slant by detecting dif-frequency disparities. But it is more likely that they combine inputs from sets of cells in V1, each of which is sensitive only to simple point disparity. In other words, they could detect horizontal gradients of horizontal disparity.
11.6.2 DETECTION OF VERTICAL DISPARITY GRADIENTS
Figure 11.23. Second-order disparity detectors. Hypothetical monocular receptive fields of a binocular cell in V1 that would be sensitive to (A) surface slant, (B) a horizontal modulation of disparity, (C) vertical modulation of disparity.
An evenly textured surface inclined in depth about a horizontal axis produces images that are horizontally sheared (Section 20.3). A horizontal-shear disparity may be defined as a vertical gradient of horizontal disparity or as an orientation disparity. An orientation disparity is an interocular difference in the orientation of image features. This section is concerned with whether the visual system detects shear disparities by detectors of gradients of point disparity or by detectors of orientation disparity. Bishop (1979) suggested that, as for horizontal disparities, the range of orientation disparities to which cortical cells are tuned arises from random pairing of monocular receptive fields with a random scatter of preferred orientations (see Hetherington and Swindale 1999). The range of orientation disparities to which a binocular cell in the cat visual cortex responds is about the same as the range of orientation disparities that a cat encounters. Blakemore et al. (1972) measured the orientation tuning of binocular cells in the primary visual cortex of the cat for a bar presented to each eye in turn. For many cells, the optimal orientations for the two monocular receptive fields differed. These differences had a range of over 15°, with a standard deviation of over 6°. Although the binocular cells studied by Blakemore et al. were sensitive to orientation disparities, they were also sensitive to the relative positions of the images. In other words, they were influenced by both positional and orientation disparity. To make the case that these cells signal orientation disparity, it is necessary to show that they are specifically sensitive to orientation disparity, as opposed to position disparity.
Hubel and Wiesel (1973) were unable to confirm these findings, but the eyes of their animals may not have been in torsional alignment. Nelson et al. (1977) replicated Blakemore et al.’s finding after controlling for possible effects due to eye torsion induced by paralysis and anesthesia. For each cell, the widths of the orientation tuning functions for the two receptive fields were very similar. Thus, cells sharply tuned to orientation in one eye were sharply tuned in the other eye, even though the preferred orientations in the two eyes could differ. The response of a binocular cell was facilitated above its monocular level when the stimulus in each eye was centered in the receptive field and oriented along its axis of preferred orientation. As the stimuli were rotated away from this relative orientation, the response of the cell declined to below its monocular level, although this inhibitory effect was not strong. However, the tuning functions of the binocular cells to orientation disparity were no narrower than the monocular orientation-tuning functions. Nelson et al. argued that such broadly tuned orientation-disparity detectors could not play a role in the fine discrimination of inclination about a horizontal axis. However, fine discrimination does not require finely tuned channels when outputs of several channels are compared (Section 4.2.7). Binocular cells with different preferred orientations in the two eyes have been reported also in area 21a of the cat (Wieniawa-Narkiewicz et al. 1992). The orientation disparity functions of these cells, like those of cells in area 17, could be derived from the difference between monocular orientation-tuning functions. The above studies were performed on anesthetized animals. Hänny et al. (1980) found a small number of cells in V2 of the alert monkey that were specifically sensitive to changes in the inclination in depth of small stimuli about a horizontal axis. 
The cells responded maximally to a line in the frontal plane. A 45° forward or backward inclination of the stimulus in the median plane, corresponding to an orientation disparity of 2°, reduced the response by half. By comparison, the smallest tuning width of cells tuned to orientation of monocular stimuli is reported to be 6°, with a mean of 40° (DeValois et al. 1982a). In a second experiment, Hänny et al. (1980) controlled for the effects of horizontal disparity by using dynamic random-dot stereograms that contained an orientation disparity but only randomly distributed horizontal disparities. Von der Heydt et al. (1982), also, used a dynamic random-dot display containing a cyclopean vertical grating with no horizontal disparities. They found five cells in V1 of monkeys that responded to orientation disparities in this stimulus. They concluded that the visual cortex of the monkey contains cells specifically tuned to orientation disparity.
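The correspondence between an inclination in depth and an orientation disparity, such as the 45° and 2° quoted above, depends on the interocular separation and the viewing distance. The Python sketch below shows one way the relation could be computed for a line in the median plane passing through a fixation point straight ahead. It assumes symmetrical viewing and projection of each eye's image onto a frontal plane, which gives an orientation disparity of 2·arctan((a/2d)·tan i) for inclination i, interocular distance a, and viewing distance d. The function names and the numerical values are assumptions chosen for illustration, since the viewing conditions of the experiment are not given in this passage.

```python
import math


def orientation_disparity_deg(inclination_deg, interocular_cm, distance_cm):
    """Orientation disparity (deg) of a line inclined in the median plane.

    inclination_deg: inclination from the frontal plane (top away or near).
    Assumes symmetrical viewing of a line through the fixation point.
    """
    half = math.atan((interocular_cm / (2.0 * distance_cm))
                     * math.tan(math.radians(inclination_deg)))
    return math.degrees(2.0 * half)


def distance_for_disparity_cm(inclination_deg, interocular_cm, disparity_deg):
    """Viewing distance (cm) at which an inclination gives a stated disparity."""
    return (interocular_cm * math.tan(math.radians(inclination_deg))
            / (2.0 * math.tan(math.radians(disparity_deg) / 2.0)))


if __name__ == "__main__":
    a = 6.5  # assumed interocular distance in cm (a typical human value)
    for d in (50.0, 100.0, 200.0):
        theta = orientation_disparity_deg(45.0, a, d)
        print(f"distance {d:5.0f} cm -> orientation disparity {theta:.2f} deg")
```

For a fixed inclination, the orientation disparity falls as viewing distance increases, so any stated pairing of inclination and orientation disparity holds only for a particular viewing geometry.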
Bridge et al. (2001) pointed out that binocular cells that merely respond to the mean orientation of stimuli in the two monocular receptive fields do not qualify as orientation-disparity detectors. The response of such cells is said to be left-right separable. To qualify as a detector for orientation disparity, the cell must respond in a specific way to a given orientation difference, whatever the absolute disparities. A cell that responds specifically to a difference in the orientation of images is left-right inseparable. Its response cannot be produced by summing or multiplying the monocular responses. Bridge et al. developed a model system that detected true orientation disparities, but they showed that it was no more sensitive to changes in slant than was a simpler model based on the extraction of gradients of point disparities. Also, it is not clear how primary disparity detectors could detect orientation disparities in random-dot stereograms that lack oriented texture elements. A secondary disparity detector at a level higher than V1 that combines inputs from primary detectors could derive spatial gradients of position disparity in line stimuli or in random-dot displays. Bridge and Cumming (2001) looked for cells selectively responsive to orientation disparity in V1 of the alert monkey. Of 64 cells, 20 responded to an orientation disparity that was not predictable from monocular orientation selectivity. However, these cells were also selective for position disparity. They concluded that the apparent orientation selectivity of these cells arose from their sensitivity to gradients of position disparity. Sensitivity to differences in the orientation of stimuli is adversely affected by an increase in the orientation bandwidth of the stimulus. If inclination were coded by orientation disparity, one would expect sensitivity to differences in inclination to be similarly affected by the orientation bandwidth of the stimuli. Heeley et al. 
(2003) found that the threshold for detecting inclination of a textured patch was essentially unaffected by the orientation bandwidth of the texture. They concluded that inclination is signaled by the pattern of position disparities rather than by orientation disparity. It seems that there are few if any cells in the primary visual cortex tuned to orientation disparity based on differences in orientation tuning of monocular receptive fields. Higher centers could derive orientation disparities from gradients of position disparity. A cell that detects a gradient of position disparities is, in effect, an orientation disparity detector. There is evidence that cells specifically tuned to orientation disparity occur at higher levels in the visual system. Hinkle and Connor (2002) recorded from cells in V4 of alert monkeys as a bar in various orientations in the frontal plane and various inclinations in depth drifted back and forth over the receptive field. About half of the orientation-tuned cells were tuned to inclination. For most of the neurons, tuning to inclination remained the same for different lengths of line, for different positions in the receptive field, and for different linear disparities of the line images. These cells presumably derive their specific sensitivity to inclination by detecting a vertical gradient of horizontal disparities derived from cells in V1 and V2. Area V4 feeds mainly into the ventral “object-recognition” stream, where higher derivatives of disparity would help in the recognition of the 3-D structure of objects (Section 11.5.3). Higher-order disparities associated with the inclination of surfaces and motion-in-depth are processed in the dorsal stream (Section 11.5.2).
11.6.3 SPATIAL MODULATIONS OF DISPARITY
We readily detect sinusoidal modulations of depth that are defined only by disparity gradients as long as the spatial frequency of the modulations does not exceed about 3 cpd (Section 18.6.3). A centrally placed vertical hemicylinder produces images with opposite second-order disparity gradients. A horizontal hemicylinder produces images with gradients of shear disparity. Figure 11.23B depicts monocular receptive fields of a disparity detector sensitive to a horizontal disparity modulation arising from a disparity-defined vertical ridge. Figure 11.23C depicts receptive fields of a binocular cell sensitive to a vertical modulation of disparity arising from a horizontal ridge. We saw in Section 11.4.1g that cells with these types of receptive fields do not occur in V1. Cells in V1 detect only a simple offset of the monocular receptive fields or a simple phase shift of regions within the receptive fields. Nienborg et al. (2004) produced other evidence in support of this conclusion. They recorded responses of binocular cells in monkey V1 to random-dot stereograms in which disparity was modulated sinusoidally in a vertical direction. The stimulus was 4° in diameter, and the disparity corrugation varied from 0.06 to 4 cpd and drifted at 2 Hz. Only disparity corrugations of low spatial frequency produced a modulation of response as they drifted over a cell’s receptive field. The larger the receptive field, the lower the corrugation frequency that produced response modulation. A drifting corrugation of low spatial frequency modulated the mean disparity over a receptive field even for a cell with position invariance. Otherwise, the cells showed no consistent pattern of change in mean firing rate as a function of corrugation frequency. Nienborg et al. concluded that binocular cells in V1 are not selectively responsive to spatial gradients of disparity. However, they used only horizontal modulations of disparity in a vertical direction. 
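One way to see why larger receptive fields respond only to lower corrugation frequencies is to treat a position-invariant cell as averaging disparity uniformly over its receptive field. Under that simplifying assumption (a boxcar window, which is not a claim about the cells recorded by Nienborg et al.), a sinusoidal corrugation of frequency f averaged over a field of height w modulates the mean disparity with relative amplitude |sin(πfw)/(πfw)|. The function name and the field sizes below are hypothetical.

```python
import math

def mean_disparity_modulation(freq_cpd, rf_height_deg):
    """Relative amplitude of mean-disparity modulation over a receptive field.

    Assumes uniform (boxcar) averaging of a sinusoidal disparity corrugation
    over a field `rf_height_deg` tall; 1.0 means full modulation.
    """
    x = math.pi * freq_cpd * rf_height_deg
    if x == 0.0:
        return 1.0
    return abs(math.sin(x) / x)

# Frequencies spanning the 0.06 to 4 cpd range, for two assumed field sizes.
for rf in (0.5, 2.0):
    row = [round(mean_disparity_modulation(f, rf), 3) for f in (0.06, 0.25, 1.0, 4.0)]
    print(rf, row)
```

In this sketch the first null falls at f = 1/w, so the assumed 2° field loses its modulation at a lower corrugation frequency than the 0.5° field.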
Cells sensitive to disparity gradients and modulations occur at a higher level where responses of primary disparity detectors are combined. However, disparity detectors at any level that show position invariance cannot register a
second-order disparity gradient. This is because disparity selectivity of position-invariant cells remains constant as the stimulus moves over the receptive field. Cells in V1 are tuned to spatial modulations of luminance but not to modulations of disparity. Detection of spatial disparity modulations depends on cells that combine inputs from several disparity detectors in V1. This explains why luminance gratings can be resolved up to a spatial frequency of about 60 cpd while disparity-modulated gratings can be detected only up to a spatial frequency of about 3 cpd (see Section 18.6.3c). Differences in curvature of dichoptic lines evoke a sensation of surface curvature (Rogers and Cagenello 1989). If the receptive fields in the two eyes feeding into a binocular cell were tuned to lines of different length, the cell would be sensitive to differential curvature in the two eyes. Only a few cells of this type were found in the visual cortex of the cat (DeAngelis et al. 1994). Orban et al. (2006) reviewed the topic of detection of 3-D structure from disparity.
11.6.4 JOINT TUNING TO DISPARITY AND MOTION
Cells responsive to both movement direction and disparity occur in several subcortical areas. These include the pretectum, superior colliculus, and pulvinar, a thalamic nucleus associated with the LGN (Casanova et al. 1989). Jointly tuned cells also occur in several cortical areas. All these centers are connected. It is not known whether disparity sensitivity in the subcortical areas survives removal of V1. Visual inputs feed directly to the pretectum, from where they feed to the pulvinar. Other visual inputs feed directly to the superior colliculus. Collicular inputs feed to the visual cortex, and the tectopulvinar pathway feeds to several cortical visual areas, such as areas 18 and 19, MT, MST, and the superior temporal polysensory area (STP) (Section 5.8.4). The fact that a cell responds to both motion and disparity does not prove that it is tuned to motion-in-depth. This would require tuning to particular combinations of motion and disparity. A cell may appear to be selectively tuned to motion-in-depth when tested with stimuli moving along various trajectories for which the mean disparity is not the cell’s preferred disparity (Maunsell and Van Essen 1983). There is evidence that cells tuned to both motion and disparity are involved in the following:
1. Disparity control of optokinetic nystagmus. Some cells in V1 and V2 of the monkey respond selectively to stimuli moving in a given direction in the two eyes, with some responding to only crossed-disparity stimuli and others to only uncrossed-disparity stimuli (Poggio and Fischer 1977; Poggio and Talbot 1981). There is a projection from V1 to the pretectum, a subcortical center involved in generating optokinetic nystagmus. Many cells in the cat pretectum respond to moving displays and are also tuned to binocular disparity; some show an excitatory response to a limited range of disparities and others an inhibitory response (Grasse 1994). These cells exert a disparity-dependent control over optokinetic nystagmus, as described in Section 22.6.1.
2. Detection of approaching objects. One of the functions of visual systems that bypass the primary visual cortex may be the execution of rapid responses to approaching objects. Visual inputs to the colliculus and tectopulvinar pathway survive in destriate animals. Destriate rats and monkeys show responses, such as OKN and avoidance of approaching objects, which are served by this subcortical system (Dean et al. 1989; King and Cowey 1992). However, these responses do not depend on binocular disparity. The system may be responsible for blindsight, in which destriate patients manifest some visual functions (Weiskrantz 1987). Mechanisms sensitive to the direction of approaching objects are reviewed in Section 31.2.
3. Detection of 3-D optic flow. Some cells in the medial superior temporal cortex (MST) of the monkey are jointly tuned to direction of motion and to the sign of disparity (Komatsu et al. 1988). Jointly tuned cells sensitive to crossed disparity are more numerous than those sensitive to uncrossed disparity. Most of the disparity-sensitive cells in MST are tuned to either near or far stimuli rather than to stimuli with zero disparity. In a few of these cells the preferred direction of motion reversed as the disparity of the stimulus was reversed. For example, a cell that responded to rightward motion for stimuli with crossed disparity responded to leftward motion for stimuli with uncrossed disparity (Roy et al. 1992).
Cells in MST have large receptive fields, which suggests that they are more suitable for detecting parallactic motion of large parts of the visual field created by self-motion than for detecting local motion. Saito et al. (1986) found cells in MST of the monkey that respond preferentially to patterns rotating in depth. Cells tuned to both motion and disparity have also been found in area MT of the monkey, together with cells that respond to both crossed and uncrossed disparities but not to zero disparity (Maunsell and Van Essen 1983). Evidence that jointly tuned cells in MT are involved in the perception of 3-D motion was reviewed in Sections 5.8.4b and 11.5.2a. Fernández et al. (2002) modeled the detection of relief structure by cells in MT jointly tuned to motion parallax and disparity. Psychophysical evidence that cells jointly tuned to disparity and motion are involved in the perception of relative depth is reviewed in Sections 28.4 and 31.3.
11.6.5 JOINT SPATIAL AND TEMPORAL DISPARITIES
Anzai et al. (2001) measured responses of binocular cells in the cat’s visual cortex to various spatial disparities and various temporal disparities. They found cells that were sensitive to particular combinations of stimulus speed and binocular disparity. Thus, some cells code motion and depth jointly at an early stage of visual processing. They found that fine disparities were coded by cells tuned to various speeds and temporal frequencies. Coarse disparities were coded by cells tuned to low spatial frequencies and high temporal frequencies (high speed). This finding is consistent with psychophysical evidence reviewed in Section 18.10.3. Anzai et al. proposed that cells with these tuning functions could explain the Pulfrich effect described in Chapter 23 and also dichoptic motion described in Section 16.5. Pack et al. (2003) used dichoptic arrays of bars perpendicular to the preferred direction of motion of binocular cells in V1 and MT of alert monkeys. The responses of subregions within the receptive fields were recorded as a function of interocular temporal delay and of spatial disparity. Some cells in both V1 and MT were tuned to a nonzero spatial disparity and did not respond to nonzero temporal disparities. For other cells, the preferred tuning to spatial disparity changed as a function of temporal disparity and vice versa. This is indicated by the slope of the function that relates responses to the two types of disparity, as shown in Figure 11.24. However, for stimuli with zero horizontal disparity, the peak response to temporal disparity always occurred at zero temporal disparity. Thus, the cells did not code temporal disparity in the absence of a spatial disparity.
11.7 EVOKED POTENTIALS AND STEREOPSIS
There are several problems to be solved in relating changes in visual evoked potentials (VEP) of humans specifically to changes in stereoscopic depth perception based on binocular disparity.
1. It must be demonstrated that the response is not due to stimulus-locked eye movements.
2. One must ensure that monocular cues to depth do not intrude. This can be done by using random-dot stereograms.
3. Changes in the depth in the stereogram must not introduce unwanted motion signals. This can be done by using a dynamic random-dot stereogram, in which the dots in each monocular image are replaced at the frame rate of the display, so that there is no motion of monocular dots related to the change of depth in the stereogram.
Figure 11.24. Responses of cells in V1 and MT to spatial and temporal disparities. (A) These cells in V1 and MT are selective for spatial disparity but not for temporal disparity. (B) These cells show modest tuning for both spatial and temporal disparity. (C) These cells change their preferred spatial disparity when the temporal disparity is changed. (Reprinted from Pack et al. 2003 with permission from Elsevier)
4. One must ensure that changes in the VEP are due to a perceived change in depth rather than to a change in the degree of correlation between the patterns of dots in the two eyes. This can be done by alternating the stereogram between equal and opposite disparities rather than between zero disparity and either crossed or uncrossed disparity. A second control is to compare the VEP evoked by a random-dot stereogram alternating in depth because of a change in horizontal disparity with that evoked by a similar change in vertical disparity, which does not create a change in depth. This control must be applied with caution, because depth sensations can arise from certain types of vertical disparity (see Section 20.3).
Periodically reversing the contrast of dichoptic vertical gratings produced a larger human VEP when the gratings differed in spatial frequency (DeAngelis et al. 1994). The surface appeared to slant about a vertical axis. The magnitude of the VEP increased as the apparent slant of the surface increased. When the dichoptic gratings were spatially separated, differences in spatial frequency did not affect the VEP. They concluded that the magnitude of perceived depth determines the magnitude of the VEP. This conclusion is valid only if the VEP is not affected by
spatial-frequency differences between horizontal gratings that do not produce a perception of slant. Regan and Spekreijse (1970) presented human subjects with a random-dot stereogram, in which the horizontal disparity of the central square alternated between zero and 10, 20, or 40 arcmin. Every half-second the central square appeared to jump forward from the plane of the background and then jump back. A positive-going VEP occurred about 160 ms after each depth change and was followed by a negative-going response. A monocular stimulus with a displacement of less than 20 arcmin produced an appearance of global motion (short-range apparent motion) and an associated monocular VEP. A monocular stimulus with a displacement of 40 arcmin produced no global motion and no VEP. However, a dichoptic stimulus with a shift in horizontal disparity of 40 arcmin produced a large VEP. A change in vertical disparity of 40 arcmin produced a much smaller VEP. Regan and Spekreijse concluded from this and from eye-movement controls that changes in the VEP were related specifically to changes in perceived depth and not to motion of parts of one of the monocular images, to changes in disparity unrelated to depth, or to eye movements. In a dynamic random-dot stereogram the dots are renewed many times per second so that any motion of the cyclopean image is not evident in either monocular image
(Lehmann and Julesz 1977). Lehmann and Julesz (1978) used a dynamic random-dot stereogram, in which a rectangular area appeared to move out from the background and then back every half-second. With a display confined to the left visual hemifield (right hemiretinas), each change in apparent depth was followed by a VEP in the right hemisphere, as one would expect from the fact that the right hemiretinas project to the right hemisphere. There was a smaller mirror-image echo of the response in the left hemisphere (see Figure 11.25). With a display confined to the left hemiretinas, the major VEP occurred in the left hemisphere with a small echo in the right hemisphere. They argued that the two hemispheres process stereopsis in a similar fashion. This runs counter to some clinical evidence that damage to the right hemisphere produces a selective impairment of stereopsis (Section 32.3). There were no controls for possible eye-movement artifacts in this study. Manning et al. (1992) found that the disparity threshold for detection of depth in dynamic random-dot stereograms was lower in the right than in the left visual field and that the VEP had higher amplitude when the stereogram was presented in the right visual field.
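The logic of a dynamic random-dot stereogram can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the display used in any of the studies cited: the function names, image size, rectangle position, and pixel disparity are hypothetical, disparity is restricted to non-negative whole pixels, and a leftward shift in the right eye's image is taken to represent crossed disparity.

```python
import random

def rds_frame(size, rect, disparity_px, rng):
    """Build one (left, right) pair of binary dot images.

    `rect` is (top, bottom, x0, x1); its dots are shifted leftward by
    `disparity_px` in the right eye's image (taken here as crossed disparity).
    Assumes 0 <= disparity_px <= x0.
    """
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    top, bottom, x0, x1 = rect
    for y in range(top, bottom):
        for x in range(x0, x1):
            right[y][x - disparity_px] = left[y][x]
        # Refill the strip uncovered by the shift with fresh random dots.
        for x in range(x1 - disparity_px, x1):
            right[y][x] = rng.randint(0, 1)
    return left, right

def dynamic_rds(n_frames, frame_rate_hz, size, rect, disparity_px, seed=0):
    """Yield frames whose rectangle alternates in depth every half-second.

    Every frame uses a fresh dot pattern, so neither monocular image
    carries a consistent motion signal when the disparity changes.
    """
    rng = random.Random(seed)
    half_second = max(1, frame_rate_hz // 2)
    for i in range(n_frames):
        in_depth = (i // half_second) % 2 == 1
        yield rds_frame(size, rect, disparity_px if in_depth else 0, rng)

frames = list(dynamic_rds(n_frames=60, frame_rate_hz=60, size=32,
                          rect=(8, 24, 8, 24), disparity_px=2))
```

Because the dots are regenerated on every frame, the depth change exists only in the relation between the two images, which is the property that makes the stimulus cyclopean.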
Figure 11.25. Evoked potentials and stereopsis. Averaged evoked potentials in response to changes in depth of a random-dot stereogram presented to either the right or left hemifield and recorded from either the right or left cerebral hemisphere. With a display confined to the left visual hemifield (right hemiretinas), each change in apparent depth (indicated by circles) produced a VEP in the right hemisphere, as one would expect from the fact that the right hemiretinas project to the right hemisphere. A small mirror-image echo of the response is evident in the left hemisphere. Scale bars: 500 ms and 1 μV. (Reprinted from Lehmann and Julesz 1978, with permission from Elsevier)
Skrandies (1997) found that a checkerboard dynamic random-dot stereogram in the right visual field showed a more pronounced tuning of the VEP to disparity than did a stimulus in the left visual field. Like Lehmann and Julesz, Skrandies found little difference between VEPs in the two cerebral hemispheres. For central stimuli, the maximum amplitude of the VEP occurred at smaller disparities than for stimuli in the peripheral field. Skrandies applied no vertical-disparity control. The latencies of potentials evoked by a cyclopean checkerboard pattern in a dynamic random-dot stereogram fluctuating in depth were similar to those evoked by a contrast-reversing luminance-defined checkerboard (Skrandies 1991). However, the cyclopean pattern evoked weaker responses than the luminance-defined pattern, presumably because there are fewer cells tuned to disparity than cells tuned to luminance contrast. Also, the spatial distribution of the potentials was different for the two types of stimuli, presumably because they are processed in distinct cortical areas. Stereoacuity and other forms of acuity are greater in the lower than in the upper visual field (Section 18.6.1b). The magnitude of the VEP was greater for a flashed dynamic random-dot stereogram presented in the lower visual field than for one presented in the upper visual field (Fenelon et al. 1986). Evoked potentials related to stereopsis were obtained from dynamic random-dot stereograms with up to 8° of the central field occluded (Teping and Silny 1987). Brain potentials evoked by motion-in-depth of the central region of a dynamic random-dot stereogram have been found in the primary visual cortex (Neill and Fenelon 1988) and in the central parietal region (Herpers et al. 1981).
A dynamic random-dot correlogram (DRDC) is a display of randomly distributed dots alternating between any two of the following states: (1) being in the same positions in the two eyes (+1 correlation), (2) being uncorrelated in position in the two eyes (zero correlation), (3) being in the same positions but with opposite luminance polarity (−1 correlation) (Julesz and Tyler 1976). These changes between two states are cyclopean, since they are not evident in either monocular image. A much stronger VEP was evoked by a DRDC that changed in state than by one that remained either correlated or uncorrelated (Miezin et al. 1981). Evoked potentials from a DRDC therefore reflect the cyclopean features of the stimulus. Julesz et al. (1980) compared the amplitude of the VEP evoked by a dynamic 2-D random-dot correlogram with that evoked by a dynamic random-dot stereogram, in which alternate squares of a checkerboard pattern appeared to move in and out of the plane of the background squares. Both displays produced VEPs with a dominant latency of about 250 ms and which differed from those produced with one eye occluded (see Figure 11.26). However, the amplitude of the response to the random-dot stereogram was greater than that to the correlogram, and the waveforms generated by the two stimuli differed (see also Skrandies and Vomberg 1985). Julesz et al. concluded that the greater response to the stereogram was related to the appearance of depth, as opposed to the change in the degree of interocular correlation of the dots. These responses occurred only in subjects with functional stereopsis, and the authors suggested that they could be used as a simple nonverbal screening test for stereopsis. However, it is not clear whether or not vergence eye movements influenced the response, and there was no control for the effects of vertical disparities.
Figure 11.26. Evoked potentials and image correlation. (a) VEP in response to a dynamic random-dot pattern alternately correlated and uncorrelated in the two eyes. (b) VEP in response to a cyclopean checkerboard alternating between being flat and in depth. (c) Same stimulus as in (a) but with one eye closed. (d) Same as in (a) but with anaglyph spectacles removed. (c) and (d) were control conditions (N = 2). Vertical axis: VEP amplitude (2.0 μV per division); horizontal axis: time (0 to 1.0 s). (From Julesz et al. 1980)
If binocular facilitation is related to stereopsis based on disparity, it should not occur for a horizontal grating, because extended horizontal gratings do not create horizontal disparities. In conformity with this expectation, Apkarian et al. (1981) found that the binocular VEP response to a horizontal grating was the sum of the monocular responses, while the response to a vertical grating showed facilitation. Norcia et al. (1985) used a dynamic random-dot display that alternated as a whole between a crossed disparity and an equal uncrossed disparity while the subject fixated a stationary point. The amplitude of the VEP increased as a linear function of the amplitude of disparity modulation, for amplitudes up to about 15 arcmin. Above this value, the response first declined and then rose to a second peak at a
disparity of about 70 arcmin. The response to larger amplitudes of disparity alternation had a shorter latency but a greater phase lag than the response to smaller amplitudes of disparity. Norcia et al. argued that the two peaks in the VEP represent two disparity-processing mechanisms, one for fine disparities and one for coarse disparities (Section 7.6.3). Kasai and Morotomi (2001) asked whether the VEP from the occipitotemporal region is affected by stimulus features that subjects were asked to detect. Subjects indicated when a dynamic random-dot stereogram contained a shape in a designated orientation and depth relative to the background. Other stimuli had a nondesignated orientation, depth, or orientation and depth. A stimulus with the designated orientation enhanced the VEP about 175 ms after stimulus presentation, while one with designated depth had an effect at 200 ms. In both cases, the other feature could be either designated or not. This suggests that features are initially processed in distinct channels. Later VEP components were enhanced only when the stimulus had the designated orientation and crossed disparity. Kasai and Morotomi concluded that these VEP components were related to the integration of form and disparity into a figure on a ground. Other aspects of VEPs and stereopsis are discussed in Section 7.6.3. Regan (1989a) reviewed the whole question of evoked potentials.
11.8 PET, FMRI, AND STEREOPSIS
11.8.1 STATIONARY STIMULI
Gulyás and Roland (1994) used positron emission tomography (PET) to map cortical areas in ten human subjects as they viewed random-dot displays containing texture-defined patterns or a central region with binocular disparity. Occipital, parietal, and frontal areas responded specifically to disparity, with no indication of cerebral asymmetry. Fortin et al. (2002) used PET to indicate activity in the human brain produced by viewing a random-dot stereogram. Heightened brain activity was detected in areas 18 and 19, the right superior parietal lobe (area 7), and MT. Sereno et al. (2002) applied functional magnetic resonance imaging (fMRI) to the brains of anesthetized monkeys exposed to a variety of computer-generated 3-D objects. Depth was defined by shading, disparity, texture gradients, or motion. Responses occurred in the occipital, temporal, parietal, and frontal areas. They occurred in both the dorsal and ventral processing streams. Some areas were differentially activated by different depth cues. Kwee et al. (1999) recorded fMRI responses from the brains of eight human subjects viewing a 2-D display and a similar 3-D display in alternation. Three subjects showed activation in the dorsolateral frontal lobe. The fMRI responses may have been related to vergence eye movements rather than to perception of 3-D structure. Only one subject showed responses in the occipital lobe. Four subjects exhibited bilateral activation in the intraparietal sulcus. However, the response was stronger in the right hemisphere. Iwami et al. (2002) found a similar right-hemisphere dominance. Right parietal lobe lesions are implicated in loss of stereopsis (Section 32.3). Backus et al. (2001) examined fMRI responses of human cortical areas V1, V2, V3, and V5 (MT) to random-dot displays in two superimposed depth planes. The minimum and maximum disparities that created an impression of two depth planes were determined psychophysically. Activity in V1 increased as disparity increased above threshold levels and decreased as disparity approached the upper limit for depth perception. The relation between psychophysical and physiological measurements was particularly strong in area V3A. Naganuma et al. (2005) recorded fMRI activity while subjects actively detected the shape, depth order, or direction of slant of a figure in random-dot stereograms. The right intraparietal sulcus was activated when subjects detected the depth order or direction of slant of the stimulus. The right lateral occipital area rather than the parietal area was activated when subjects detected the shape of the disparity-defined stimulus. Cortical activity, as reflected in the fMRI, adapts to repeated presentation of the same stimulus. Neri et al. (2004) used this effect in human subjects to investigate the responses of different visual areas to absolute and relative disparity. They repeatedly presented a dynamic random-dot stereogram depicting two superimposed planes at different depths. Absolute disparity was varied in one set of trials, and relative disparity was varied in a second set.
In each case, either the same stereogram was repeatedly presented or different stereograms were alternated. Areas in the dorsal processing stream (V3A, V5, V7) showed more adaptation to repeated presentation of constant absolute disparity than to repeated presentation of constant relative disparity. Areas in the ventral stream (V4, V8) showed equal adaptation to both types of disparity. Neri et al. concluded that the dorsal stream preferentially processes absolute disparity while the ventral stream processes both absolute and relative disparity. The dorsal stream is involved in the control of vergence eye movements (Section 10.10.1), which are evoked by changes in absolute disparity. Areas V1 and V2 showed only small adaptation to either type of disparity. However, evidence presented in Section 11.4.1g shows that disparity detectors in V1 respond only to absolute disparity. Human V5 and the lateral occipital complex showed fMRI responses to wedge-shaped 3-D surfaces in which depth was defined by either binocular disparity or perspective (Welchman et al. 2005). The responses reflected the perceived differences in the stimuli. Thus, these areas are involved in coding cue-invariant 3-D structure. The intraparietal cortex, especially subareas AIP and LIP, is involved in the control of visually guided grasping of 3-D objects (Section 5.8.4e). Durand et al. (2007b) measured fMRI responses in the intraparietal cortex of alert monkeys to flat stimuli, simple stereoscopic stimuli, and stimuli depicting depth curvature. Both stereoscopic stimuli produced fMRI activity in areas CIP, LIP, and AIP on the lateral bank and in areas PIP and MIP on the medial bank of the intraparietal cortex of monkeys. However, only areas AIP and LIP responded specifically to both the stereoscopic depth structure and 2-D shape of small objects. A later study from the same laboratory showed that areas DIPSM and DIPSA in the human intraparietal cortex responded in a similar way. These areas are believed to be homologous to monkey areas AIP and LIP. Two areas in the human occipitoparietal cortex, VIPS/v7 and POIPS, also responded to stereoscopic depth structure (Georgieva et al. 2009). Tyler et al. (2006) also obtained strong fMRI activation of the human kinetic occipital area (KO), specifically by stimuli that depicted depth structure from disparity or from relative motion. The area was only weakly activated by luminance boundaries or by disparity stimuli lacking depth edges.
11.8.2 MOTION IN DEPTH
A stationary disk offset in depth and a disk that moved back and forth in depth produced fMRI activation of the human dorsal occipital and superior parietal regions. Area V5 (MT) was activated only by motion in depth (Iwami et al. 2002). Likova and Tyler (2007) obtained strong fMRI activation of the human occipital cortex adjacent to MT (V5) by motion-in-depth. They used a dynamic random-dot stereogram depicting alternating motion-in-depth of a plane between disparities of ±20 arcmin at 1 Hz. Areas V1 to V4 and areas MT and MST were strongly activated when two sequentially presented random-dot displays rotating in opposite directions were placed in distinct disparity-defined depth planes. Activation was weaker when the displays were in the same depth plane (Smith and Wall 2008). Tsao et al. (2003a) used a dynamic random-dot stereogram depicting a laterally moving checkerboard with disparity-defined depth or with zero disparity. In both monkeys and humans the disparity-containing stimulus produced strong fMRI activity in V3A, the caudal parietal cortex, and an intermediate area that they designated V4d-topo. Area V4d-topo corresponds to the kinetic occipital area (KO). Activity was confined to these dorsal-stream cortical areas. However, dorsal area MT was not activated in monkeys and only weakly in humans. We saw in Section 11.5.2a
that most cells in MT are tuned to disparity and especially to rotation in depth. The complex stimulus used by Tsao et al. would activate many types of disparity-tuned cells. Perhaps the fMRI signal from MT was weak because it was an average of the responses of a wide variety of disparity-tuned cells in a given area (Tyler 2004). Sprang and Morgan (2008) pointed out that in the stimulus used by Tsao et al. disparity-defined depth was not related to stimulus motion. They used a dynamic random-dot stimulus in which depth was produced by interocular delay, as described in Section 23.3.1. With no interocular delay the stimulus appeared two-dimensional. With delay it appeared as a rotating 3-D cylinder. The 3-D stimulus produced more fMRI activity in V3, the intraparietal sulcus, and MT than did the 2-D stimulus. They concluded that these areas contain cells jointly tuned to disparity produced by interocular delay. The stimuli in the above experiments involved changes in relative disparity. Cottereau et al. (2011) compared EEG and fMRI responses in a variety of human cortical areas to modulations of absolute disparity and of relative disparity. A disc in a random-dot stereogram modulated in disparity with respect to a zero-disparity surround produced changes in relative disparity, while a disparity-modulated disc in a binocularly uncorrelated surround produced changes in absolute disparity. Responses in V1 were the same for modulations of absolute and relative disparity. Other evidence presented in Section 11.4.1g shows that V1 responds only to absolute disparity. However, the amplitude or phase of responses in MT, V4, the lateral occipital area, and V3A were affected specifically by modulations of relative disparity.
11.9 DETECTION OF MIDLINE DISPARITY
The images of an object in the median plane of the head that is beyond the fixation point fall on the nasal halves of each retina.
The images of an object nearer than the fixation point fall on the temporal hemiretinas (see Figure 5.11). In both cases, the images project to opposite cerebral hemispheres. With perfect partitioning of nasal and temporal inputs at the chiasm, cells in the visual cortex would not receive inputs from these disparate images. The binocular disparity, and therefore the stereoscopic depth, of midline objects would be impossible to detect. This is the problem of midline stereopsis. In fact, stereoscopic acuity is particularly good for objects on the median plane near the fixation point. Therefore, there must be cortical cells serving the midline region with receptive fields in opposite hemiretinas. Such cells receive binocular inputs by two routes. First, inputs from the midline region are not perfectly segregated in the chiasm. Some temporal inputs decussate and some nasal
inputs do not, as described in Section 5.3.4. Second, cortical cells that serve the midline region are connected through the corpus callosum, as described in Section 5.3.5.

11.9.1 EFFECTS OF MIDLINE SECTION OF THE CHIASM

One approach to midline stereopsis is to study effects of midsagittal section of the chiasm. This severs all direct inputs from the contralateral eye to each hemisphere, so that any remaining binocular cells receive their contralateral input indirectly through the callosum (Berlucchi and Rizzolatti 1968). In split-chiasm cats there is a complete loss of binocular cells responsive to uncrossed disparity arising from stimuli in the median plane (Lepore et al. 1992). This is because the images of such objects fall on the nasal hemiretinas and inputs from the nasal hemiretinas are severed in the split-chiasm cat. Any remaining binocular cells must receive a direct input from the ipsilateral eye and an input from the contralateral eye through the callosum. Both these inputs arise from the temporal hemiretinas and hence from images with crossed disparity. Estimates of how many binocular cells survive in split-chiasm cats have varied. Lepore and Guillemot (1982) reported that 30% of cells were binocular. All the other cells were monocularly driven by direct inputs from the ipsilateral eye. Cynader et al. (1986) claimed that 76% of cells remained binocular. Milleret and Houzel (2001) found 17% of cells in area 17 were binocular but, in the region of the 17/18 border, 46% were binocular. In area 18, 27% of cells remained binocular. Guillemot et al. (1993) found that almost all cells in area 19 of the cat lost their disparity tuning following section of the chiasm, suggesting that these cells receive their input from the contralateral eye by this route. In cats with convergent strabismus and section of the chiasm, the disparities to which binocular cells responded ranged from 6° to 21° compared with a mean disparity of 3° in normal cats (Milleret and Houzel 2001). Since the mean size of their receptive fields was 12°, most of the cells had nonoverlapping monocular fields. The callosal terminal zones were greatly expanded within the hemisphere ipsilateral to the convergent eye.
None of the cells responded to zero disparity. Split-chiasm cats performed poorly on a depth discrimination test in which random-dot stereograms were shown on a jumping stand. This constituted a test of fine stereopsis (Lepore et al. 1986). Blakemore (1970a) reported the case of a boy in whom decussating pathways in the chiasm were completely sectioned. He could discriminate depth in the region of the midline, as revealed by a coarse test of stereopsis involving disparities of at least 3 arcmin. This suggests that incompletely segregated inputs serve the detection of only fine disparity.
11.9.2 EFFECTS OF CALLOSECTOMY
The role of the corpus callosum in midline stereopsis has been investigated by cutting transcallosal pathways. The physiological effects are discussed first. A lesion introduced into the callosal efferent zone in one hemisphere of cats significantly reduced the number of binocular cells in the 17/18 callosal zone in the other hemisphere. Cooling the border between areas 17 and 18 in one cerebral hemisphere of cats produced a selective loss of binocularity in a significant number of midline cells in the other hemisphere (Blakemore et al. 1983). The cells regained their binocularity when cooling was removed. Transection of one optic tract also led to a loss of binocularity in the contralateral hemisphere (Lepore et al. 1983). Callosectomy in adult cats permanently reduced the number of binocular cells in the region of area 17 receiving callosal inputs, extending 4° to either side of the retinal midline region (Payne et al. 1984a, 1984b). However, there is conflicting evidence on this point. For instance, Elberger and Smith (1985) reported that callosectomy affected binocularity and visual acuity at all retinal eccentricities when performed on cats before the age of about 3 weeks, and had no effect after that age (see also Elberger 1989, 1990). Minciacchi and Antonini (1984), also, failed to find any loss of binocularity in areas 17 and 18 of unanesthetized callosectomized adult cats. However, they did not test cells for disparity sensitivity. Part of the effect of early callosectomy on binocularity may be due to eye misalignment that this procedure introduces (Elberger 1979). Binocular cells in areas beyond areas 17 and 18, such as the suprasylvian area and the superior temporal sulcus, are not affected by neonatal callosectomy in otherwise normal cats. However, they are affected by this procedure in Siamese cats, in which the pathways fully decussate (Zeki and Fries 1980; Marzi et al. 1982; Elberger and Smith 1983). 
Now consider the behavioral effects of callosectomy. In cats with neonatal section of the callosum, coarse stereopsis revealed in reactions to a visual cliff was adversely affected (Elberger 1980). Mitchell and Blakemore (1970) reported a human clinical case in which callosectomy led to a disruption of midline stereopsis, also measured by a test involving only coarse (large) disparities. In other studies, fine stereopsis was not affected by callosal section in the neonatal or adult cat (Timney et al. 1985; Lepore et al. 1986). Four subjects with partial or complete loss of the corpus callosum were able to detect only crossed disparities in the midline (Jeeves 1991). Perhaps detection of crossed disparities is mediated by the anterior commissure, which was intact in these subjects. Three subjects with congenitally absent callosum were as precise as normal subjects in adjusting two textured plates to equidistance when the plates were on opposite sides of
the midline (Rivest et al. 1994). However, perhaps these subjects had intact anterior commissures. In any case, they may have performed the task on the basis of monocular cues to depth, since the plates were actually moved in depth. Midline stereopsis in the monkey was not affected by transection of the part of the corpus callosum known as the splenium (Cowey 1985). Section of all parts other than the anterior commissure was without effect in monkeys (LeDoux et al. 1977) and humans (Bridgman and Smith 1945). It looks as though at least some of the crucial fibers serving midline stereopsis cross in the anterior commissure. It has been suggested that some axons could go directly from each LGN and cross in the corpus callosum to the opposite visual cortex. However, evidence for such connections is controversial (Glickstein et al. 1964; Wilson and Cragg 1967). In cats with unilateral removal of area 17, midline cells tuned to fine disparity were present in the remaining hemisphere but cells tuned to coarse disparity were lost (Gardner and Cynader 1987). These findings support a suggestion made by Bishop and Henry (1971) that the callosal pathway is responsible for midline integration for coarse stereopsis, whereas fine stereopsis in the midline depends on overlap of visual inputs in the midline region. This issue is discussed further in Section 15.3.4.

The physiology of stereopsis has been reviewed by Gonzalez and Perez (1998b), Cumming and DeAngelis (2001), Read (2005), and Parker (2007).

11.10 MODELS OF DISPARITY PROCESSING
(This section was written with Robert Allison)

11.10.1 ENERGY MODELS

Recent models of stereopsis are based on the idea that binocular cells code disparity energy within their receptive fields over a range of spatial frequencies. The models are based on what is known about receptive fields of simple and complex cells in V1. They deal with local disparities and take no account of global features, such as disparity gradients, that extend beyond the receptive fields of cortical cells. Energy models and most other models of disparity detection are designed to work with random-dot stereograms. The models do not consider other depth perception mechanisms or the role of color, motion, and shape in matching binocular images (see Section 17.1). In images of natural scenes, luminance structure and the 3-D layout of the scene are related so that there are significant correlations between contrast and luminance images and the associated
depth maps (Potetz and Lee 2003). Current models of disparity detection do not consider whether stereopsis can exploit such correlations.
11.10.1a Simple Cells

Simple cells in cortical area 17 of the cat or V1 of primates are the earliest cortical neurons showing binocularity. Simple cells respond to patterns of stimulation in each monocular receptive field in an approximately linear fashion with the addition of a threshold nonlinearity. Each monocular receptive field of a binocular simple cell can be described by an elongated Gabor function, which is a sinusoidal function (the carrier) modulated by a Gaussian window (the envelope) (Figure 11.27). The carrier specifies the central spatial-frequency selectivity of the cell and the envelope specifies its bandwidth. Consider a cell with a vertical receptive field at position (xr, yr) in the right eye and a similar receptive field at (xl, yl) in the left eye. The sensitivity profiles across the width and height of the receptive fields can be described by equations (3).

$$RF_r = \frac{1}{2\pi\sigma_{rx}\sigma_{ry}} \exp\left(-\frac{(x - x_r)^2}{2\sigma_{rx}^2} - \frac{(y - y_r)^2}{2\sigma_{ry}^2}\right) \cos\left(\omega_{rx}(x - x_r) + \phi_{rx}\right)$$

$$RF_l = \frac{1}{2\pi\sigma_{lx}\sigma_{ly}} \exp\left(-\frac{(x - x_l)^2}{2\sigma_{lx}^2} - \frac{(y - y_l)^2}{2\sigma_{ly}^2}\right) \cos\left(\omega_{lx}(x - x_l) + \phi_{lx}\right) \qquad (3)$$

The first term in each equation represents the Gaussian envelope of the receptive field, with width σrx and height σry for the right field and width σlx and height σly for the left field (DeAngelis et al. 1991). So the first term describes the size of the receptive field. The second term in each equation reflects the transverse profile of ON and OFF regions within each receptive field. It describes the internal structure of the receptive field. The profile is modeled by a cosine function of frequency ω (in radians per degree) and phase φ relative to the envelope of the receptive field. The binocular cell responds most strongly to a grating of spatial frequency ωr and phase φr in the right eye and frequency ωl and phase φl in the left eye. The cell’s preferred orientation depends on the orientations of its monocular receptive fields. It is usually assumed that the two receptive fields have the same orientations. Different orientation preferences can be obtained by rotation of the profiles in equations (3). Detectors with Gabor-function profiles achieve the minimum product of position selectivity and spatial-frequency selectivity, which makes them optimally sensitive to both spatial frequency and position (see Section 4.4.2). Other functions, such as DOGs, can achieve similar performance.

A binocular simple cell is maximally sensitive to a grating that is aligned with its monocular receptive fields and that matches the spatial periodicity of the receptive fields. A simple cell is most sensitive to a disparity orthogonal to the preferred orientation of the cell (the axis of the Gabor). This point was discussed in Section 11.6. We will consider only disparities orthogonal to the Gabor axis and will assume that the Gaussian envelopes and the preferred spatial frequencies of the monocular receptive fields are the same. Under these conditions, we can simplify the description of the receptive fields to the one-dimensional profile perpendicular to the preferred orientation of the cell, as indicated in equations (4).

$$RF_r = \frac{1}{2\pi\sigma} \exp\left(-\frac{(x - x_r)^2}{2\sigma^2}\right) \cos\left(\omega(x - x_r) + \phi_r\right)$$

$$RF_l = \frac{1}{2\pi\sigma} \exp\left(-\frac{(x - x_l)^2}{2\sigma^2}\right) \cos\left(\omega(x - x_l) + \phi_l\right) \qquad (4)$$

Figure 11.27. Sensitivity profiles of simple-cell receptive fields. The sensitivity profile of each monocular receptive field of a simple cell can be modeled by multiplying a cosine or sine sensitivity profile (carrier) by a Gaussian (normal) envelope. The receptive field is even-symmetric when the carrier is in cosine phase and is odd-symmetric when the carrier is in sine phase. Panels: even-symmetric Gabor filter (cosine sensitivity profile, Gaussian sensitivity envelope); odd-symmetric Gabor filter (sine sensitivity profile, Gaussian sensitivity envelope).

These monocular receptive fields are linear operators, or filters, described in Sections 3.3 and 4.4.1. When a stimulus, I(x), is passed through a Gabor filter, f(x), with width w, the output, r, can be derived by multiplying (convolving) the stimulus strength at each point by the sensitivity of the receptive field at that point and adding (integrating) the resulting products over the receptive field.

$$r = \int_{-w/2}^{w/2} f(x)\, I(x)\, dx \qquad (5)$$

This process results in a number, which represents the strength of response of that receptive field to that stimulus. In the visual system this is the frequency of firing of the cell to a monocular stimulus. The response of a simple cell, rs, can be approximated by adding the inputs from the two eyes:

$$r_s = \int_{-w/2}^{w/2} \left[ f_l(x)\, I_l(x) + f_r(x)\, I_r(x) \right] dx \qquad (6)$$

where l and r denote left and right eyes respectively. If the binocular cell were to sum inputs from the two eyes linearly, the monocular receptive fields would fully specify the response of the cell to dichoptic stimuli. However, the binocular response shows some nonlinearity. In the first place, binocular cells have a threshold and a saturating response at high levels of stimulation. Secondly, both monocular receptive fields that feed into a binocular cell respond only to light increase (ON type) or only to light decrease (OFF type) (Section 5.1.4b). No binocular cell can respond to both types of stimulus because the cells do not have a resting discharge. Thus, the output is halfwave rectified. Combining the outputs of cells sensitive to opposite contrast polarity could compensate for this nonlinearity. Another nonlinearity has been found in single-cell recordings from binocular cells. Anzai et al. (1999a) found that beyond the simple sum of left and right responses, the binocular receptive field contains a term proportional to the product of the left and right responses. This nonlinear process can be modeled as a stage that linearly sums over the monocular receptive fields followed by another stage consisting of a static nonlinearity. The empirical results were modeled with an expansive power function with an exponent close to 2.0. Overall, the response of binocular cells shows half-wave rectified squaring. Squaring (or any multiplicative) operation introduces a term proportional to the interocular cross-correlation of the (filtered) images. A simple cell with identical monocular receptive fields in corresponding locations in the two eyes would be most sensitive to zero binocular disparity. Selectivity to nonzero disparities requires that the monocular receptive fields differ in some way. The simplest type of binocular disparity is a displacement of matching images in the two eyes with respect to corresponding retinal locations. 
This type of disparity could be coded by binocular cells with similar monocular receptive fields, which look for similar image structure at noncorresponding points. However, there is evidence that some binocular cells are sensitive to differences between monocular receptive fields other than position. For example, a binocular cell could be sensitive to interocular differences in spatial frequency, ω. This would render such cells sensitive to dif-frequency disparity produced by slanted surfaces (see Sections 11.6.1, 19.2.2, and 20.2.1). Alternatively, binocular cells could be sensitive to interocular differences in orientation. This would render them sensitive to shear disparities produced by inclined surfaces (see Sections 11.6.2 and 19.2.3). Finally, the monocular receptive fields of a
binocular cell could differ in their Gabor phase (φ). These cells are phase-disparity detectors. The idea of phase-disparity detectors grew out of computer vision (Sanger 1988; Jenkin et al. 1991). The physiological evidence reviewed in Section 11.4.3a suggests that phase disparity is one of the primitive signals for stereopsis.

A pure position-disparity detector has monocular receptive fields with the same size, orientation, frequency sensitivity, and Gabor phase φ. The receptive fields vary only in the relative locations of their centers (xr and xl for horizontal disparity). Thus, disparity is coded by the offset of the left-eye and right-eye receptive fields, as indicated in equations (4). A pure phase-disparity detector has monocular receptive fields with the same size, orientation, frequency sensitivity, and location. The monocular receptive fields vary only in the Gabor phase of their sensitivity profiles. Thus:

$$RF_r = \frac{1}{2\pi\sigma} \exp\left(-\frac{(x - x_0)^2}{2\sigma^2}\right) \cos\left(\omega(x - x_0) + \phi_r\right)$$

$$RF_l = \frac{1}{2\pi\sigma} \exp\left(-\frac{(x - x_0)^2}{2\sigma^2}\right) \cos\left(\omega(x - x_0) + \phi_l\right) \qquad (7)$$

where the monocular receptive fields are both centered on x0 but differ in Gabor phase (Figure 11.28). Differences in phase do not correspond directly to differences in retinal disparity expressed in angular terms. Phase disparity (in radians) provides an indication of disparity only in terms of the proportion of a period. Thus a given phase disparity corresponds to a large position disparity at low spatial frequencies and a small position disparity at high spatial frequencies. If the spatial period of the stimulus, filtered by the Gabor receptive field, is the same as the preferred spatial period of the cell, then the cell responds to a position disparity of:

$$d = \frac{\phi_r - \phi_l}{\omega} \qquad (8)$$

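As a worked illustration of equation (8), the following minimal Python sketch converts a fixed phase disparity into the equivalent position disparity at several spatial frequencies. The function names and the chosen frequencies (0.5, 2, and 8 cycles per degree) are assumptions made for illustration only; they do not come from the models cited here.

```python
import math

def angular_frequency(cycles_per_degree):
    """Convert a spatial frequency in cycles/deg to omega in radians/deg."""
    return 2.0 * math.pi * cycles_per_degree

def position_disparity(phase_r, phase_l, omega):
    """Equation (8): position disparity (deg) equivalent to a phase difference (rad)."""
    return (phase_r - phase_l) / omega

if __name__ == "__main__":
    quarter_cycle = math.pi / 2.0  # the same phase disparity at every scale
    for cpd in (0.5, 2.0, 8.0):    # illustrative frequencies (assumed)
        d = position_disparity(quarter_cycle, 0.0, angular_frequency(cpd))
        print(f"{cpd} c/deg: quarter-cycle phase shift = {60.0 * d:.3f} arcmin")
```

A quarter-cycle phase shift is half a degree at 0.5 cycles per degree but only 1/32 of a degree at 8 cycles per degree, which is the sense in which a given phase disparity signals large disparities at coarse scales and small disparities at fine scales.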
Typically, the Gabor receptive fields in the models (and in the data on which they are based) are wide enough so that the spatial-frequency bandwidth of the resulting filters is about one octave. Narrow-band filters such as this are insensitive to stimuli that differ greatly from the center frequency of the filter. Thus, some models simply scale the phase disparity by the preferred spatial frequency of the detector (Fleet et al. 1991). Although it is unclear how this could be implemented biologically, local spatial frequency is encoded in the responses of a local population of V1 cells, and this normalization may be performed at higher cortical processing stages. This method sometimes overestimates and sometimes underestimates true disparities, depending on the actual spatial-frequency content in the stimulus. Pooling the outputs of disparity detectors at different spatial scales and phases reduces the effects of these errors (Qian 1994; Qian and Zhu 1997).

Figure 11.28. Types of disparity of binocular energy neurones. Panels show left-eye and right-eye inputs followed by rectification for (a) zero disparity, (b) position disparity, and (c) phase disparity. (Redrawn from Fleet et al. 1996a)

Pooling across spatial
scales also reduces the effects of noise, which is typically limited to certain spatial frequencies (Sanger 1988). Hibbard (2008) normalized the responses of binocular energy neurons to natural images or simulations based on 1/f noise images. Natural images have significant energy at lower spatial frequencies. When presented with such stimuli the neurons may respond more to the out-of-band low spatial-frequency components than to the preferred frequency, effectively detuning the neuron to disparity. Interpreting the output of frequency selective binocular cells as disparity depends on the presence of significant energy at the preferred spatial frequency of the cell. This could be assessed by estimating the monocular energy at the preferred spatial frequency. Energy neurons are sensitive to both the monocular energy from each eye and the disparity (from the cross-correlation component). The divisive normalizing procedure used by Hibbard removes the monocular contributions and normalizes response strength but
makes it difficult to assess the absolute strength of the response. An alternative approach to account for the monocular energy components is to compare responses to disparities for a given location and spatial scale. For cells with disparity that differs from the true disparity, the monocular components will be preserved even when the disparity selective component (cross-correlation signal) is small. Thus, the simplest way to recover the disparity is to look for peaks in the population activity, which should reflect a peak in the disparity component (e.g., Read and Cumming 2004). Of course, monocular cortical cells could also provide the signals required to decode the response of binocular cells. A position-based disparity detector has no theoretical limit on the magnitude of disparity that it can detect. The limit is determined by the separation of the monocular receptive fields that feed into the detector. In contrast, the magnitude of disparity that a phase-based detector can detect is limited by the spatial period of the ON and OFF regions within the monocular receptive fields. A phase-based detector cannot distinguish a phase of θ from one of θ + 2π radians. Thus, for unambiguous disparity detection, phase disparities are limited to a range of −π to π radians. Even this range is not achievable because phase-based detectors have a finite bandwidth and respond to spatial frequencies greater than the preferred spatial frequency, typically up to at least twice the preferred spatial frequency. Hence a more realistic range of allowable phase disparities is −π/2 to π/2 radians. Both types of model detector perform well on smooth surfaces but fail at vertical steps in depth where one eye sees part of the far surface not visible to the other eye. But these model disparity detectors do not account for impressions of depth created by monocular occlusion (Section 17.3).
Some computer vision algorithms have explicitly modeled discontinuities, disparity gradients, and monocular occlusion zones (e.g., Belheumer 1996). Phase-based and position-based detectors could coexist or disparity detectors could be hybrid phase-based and position-based detectors (Fleet et al. 1996b). Hybrid detectors were discussed in Section 11.4.3c. A Gabor filter is not an ideal spatial frequency detector since it has a small DC response. This DC response has been noted as a possible problem in computer vision applications and a DC-cleaned Gabor filter has sometimes been used. Empirically, Cozzi et al. (1997) found little benefit of using DC-cleaned filters. Qian and Zhu (1997) have argued that the DC component is beneficial in that it results in a slight bias for small disparities. Analysis of binocular disparities in distance range data from natural or simulated scenes has confirmed that, at least for fixation in the horizontal plane, the probability distribution of disparities is centered on zero disparity and highly peaked so that most disparities are small (Hibbard 2007; Liu et al. 2008).
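The DC response of a Gabor filter can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the envelope width (0.2°), the carrier frequency (2 cycles per degree), and the particular DC-cleaning scheme (subtracting a scaled copy of the envelope from the carrier) are choices made here for illustration and are not taken from Cozzi et al. (1997) or Qian and Zhu (1997).

```python
import math

def envelope(x, sigma):
    """Unnormalized Gaussian envelope."""
    return math.exp(-x * x / (2.0 * sigma ** 2))

def gabor(x, sigma, omega, phase):
    """One-dimensional Gabor profile centered on zero."""
    return envelope(x, sigma) * math.cos(omega * x + phase)

def dc_cleaned_even_gabor(x, sigma, omega):
    """Even Gabor with its mean removed (one common scheme, assumed here)."""
    k = math.exp(-(sigma * omega) ** 2 / 2.0)
    return envelope(x, sigma) * (math.cos(omega * x) - k)

def integrate(profile, sigma, half_width=6.0, n=20001):
    """Riemann sum over +/- half_width * sigma: the response to a uniform field."""
    lo = -half_width * sigma
    dx = 2.0 * half_width * sigma / (n - 1)
    return sum(profile(lo + i * dx) for i in range(n)) * dx

SIGMA = 0.2                    # envelope SD in degrees (assumed)
OMEGA = 2.0 * math.pi * 2.0    # 2 cycles/deg in radians/deg (assumed)

dc_even = integrate(lambda x: gabor(x, SIGMA, OMEGA, 0.0), SIGMA)
dc_odd = integrate(lambda x: gabor(x, SIGMA, OMEGA, -math.pi / 2.0), SIGMA)
dc_clean = integrate(lambda x: dc_cleaned_even_gabor(x, SIGMA, OMEGA), SIGMA)

if __name__ == "__main__":
    print("even-symmetric DC response:", dc_even)
    print("odd-symmetric DC response: ", dc_odd)
    print("DC-cleaned even response:  ", dc_clean)
```

Only the even-symmetric (cosine-phase) profile has a nonzero response to a uniform field. The odd-symmetric profile integrates to zero by symmetry, and the size of the even-symmetric DC term shrinks rapidly as the envelope widens relative to the carrier period.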
Simple cortical cells do not code disparity unambiguously, whether they are phase-based or position-based detectors. This is because their response depends strongly on the location of each monocular stimulus with respect to the ON and OFF regions of the receptive field. For example, a periodic stimulus that matches the period of the receptive field produces a high response when it is in phase with the receptive field and a low response when it is out of phase. Thus, the response of a simple cell is a function of both the position and disparity of the monocular images (Qian 1994). Inverting the contrast of the binocular stimulus causes an inversion of the response profile of a simple cell (Cumming and Parker 1997). Simple cells are said to lack position invariance.
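The lack of position invariance can be illustrated with a minimal sketch of a model binocular simple cell: linear Gabor filtering in each eye, binocular summation, then half-wave rectification and squaring. All parameter values, the function names, and the use of a sinusoidal grating as the stimulus are assumptions chosen for illustration. The cell here has identical receptive fields in the two eyes, so its preferred disparity is zero.

```python
import math

def gabor(x, centre, sigma, omega, phase):
    """One-dimensional Gabor receptive-field profile."""
    return (math.exp(-(x - centre) ** 2 / (2.0 * sigma ** 2))
            * math.cos(omega * (x - centre) + phase))

def filter_output(rf, image, half_width=1.5, n=3001):
    """Numerical version of the filtering integral: sum of rf(x) * image(x) dx."""
    dx = 2.0 * half_width / (n - 1)
    return sum(rf(-half_width + i * dx) * image(-half_width + i * dx)
               for i in range(n)) * dx

def simple_cell(position, disparity, sigma=0.25, omega=2.0 * math.pi * 2.0):
    """Half-squared response to a grating at a given position and disparity (deg)."""
    rf = lambda x: gabor(x, 0.0, sigma, omega, 0.0)   # same field in both eyes
    left = lambda x: math.cos(omega * (x - position + disparity / 2.0))
    right = lambda x: math.cos(omega * (x - position - disparity / 2.0))
    linear = filter_output(rf, left) + filter_output(rf, right)
    return max(0.0, linear) ** 2                      # half-wave rectify, then square

if __name__ == "__main__":
    # Same (zero) disparity throughout; only the grating position changes.
    for p in (0.0, 0.0625, 0.125, 0.25):
        print(f"position {p:.4f} deg: response {simple_cell(p, 0.0):.4f}")
```

With the disparity held at the cell's preferred value, the response falls from its maximum to zero as the grating is shifted by a quarter or half of its 0.5° period, so the output on its own cannot distinguish a change of position from a change of disparity.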
11.10.1b Complex Cells and Disparity Energy

The energy neuron (Adelson and Bergen 1985) has been proposed as a way of achieving position invariance. The energy of a one-dimensional signal I(x) over an interval (−x0, x0) is defined as:

$$\int_{-x_0}^{x_0} |I(x)|^2 \, dx \qquad (9)$$

One way to estimate the energy of a signal at a given frequency is to combine the energy in the sine and cosine Fourier components. Since the sine and cosine components are orthogonal, their energies can be added to get the overall energy in the interval. A similar scheme can be used to combine the outputs of the Gabor filters (or related functions) formed by monocular simple cells. In this scenario, the input to complex cells is formed by squaring and combining the outputs of quadrature pairs of simple cells. A quadrature pair consists of two detectors that differ in phase by 90°, such as sine and cosine profiles illustrated in Figure 11.11. Thus, the response (r) of a binocular complex cell can be modeled by summing the squared outputs of two binocular simple cells, rs1 and rs2, in phase quadrature (Ohzawa et al. 1990; Qian 1994; Anzai et al. 1999c).

$$r = (r_{s1})^2 + (r_{s2})^2 \qquad (10)$$

Given that the disparity of the images (D) is significantly less than the width of the receptive fields:

$$r \approx c^2 \, |I(\omega)|^2 \cos^2\left(\frac{\Delta\phi}{2} - \frac{\omega D}{2}\right) \qquad (11)$$

where c is a constant, ω is the spatial frequency of the stimulus, |I(ω)|² is the Fourier power of the stimulus within the receptive field at the cell’s preferred spatial frequency, and Δφ is the phase difference between the receptive fields of the component simple cells. The response is maximal when the disparity equals the phase difference divided by the spatial frequency of the cell’s receptive field. This is the
preferred disparity of the cell. The equation applies to receptive fields described by a general class of functions, including Gabor functions (Qian and Zhu 1997). Anzai et al. (1999a, 1999b) showed that simple cells may perform the squaring operation, although other models use linear simple cells. The squaring/integration operation incorporates integration of the crossed product of the left and right images—a form of cross-correlation. In contrast to the standard cross-correlation procedure, the left and right images are band-pass filtered by the cell’s receptive field before being multiplied. This renders the algorithm immune to image distortions smaller than the smoothing area. Also, the algorithm is computationally efficient since it is local and there is no integration across the whole stimulus, as there is in the standard cross-correlation procedure. Finally, the provision of two cross products between a quadrature pair of simple cells renders the response of the complex cell approximately the same for two black bars as for two white bars (phase independence) (Ohzawa et al. 1990; Qian 1994, 1997). The preferred disparity of a complex cell is the relative phase shift between the left-eye and right-eye receptive fields divided by the spatial frequency of the receptive-field profiles of the constituent simple cells. A narrow-band stimulus can generate phase disparities only within the range of the spatial period of the cells that it excites. In other words, small phase disparities are coded by cells with high preferred spatial frequency, and large phase disparities by cells with low preferred spatial frequency. Figure 11.28 shows a binocular energy neuron that combines quadrature inputs from binocular simple cells. The disparity selectivity of the cell is modeled by giving the quadrature inputs position differences between the monocular receptive fields, as in Figure 11.28B, or phase differences, as in Figure 11.28C. 
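The disparity-energy computation can be sketched numerically. The following minimal Python example builds a phase-disparity complex cell from a quadrature pair of linear binocular simple cells, as in equation (10), and probes it with a sinusoidal grating. The parameter values, the function names, and the convention that the right-eye image is the left-eye image shifted by the disparity are assumptions made for illustration; the sketch omits rectification and the ON/OFF subunit arrangement discussed in the text.

```python
import math

SIGMA = 0.25                    # envelope SD in degrees (assumed)
OMEGA = 2.0 * math.pi * 2.0     # 2 cycles/deg in radians/deg (assumed)
N = 3001
HALF = 6.0 * SIGMA
XS = [-HALF + 2.0 * HALF * i / (N - 1) for i in range(N)]
DX = XS[1] - XS[0]
ENV = [math.exp(-x * x / (2.0 * SIGMA ** 2)) for x in XS]

def filter_output(rf_phase, image_phase):
    """Gabor with carrier phase rf_phase applied to cos(OMEGA * x + image_phase)."""
    return sum(e * math.cos(OMEGA * x + rf_phase) * math.cos(OMEGA * x + image_phase)
               for x, e in zip(XS, ENV)) * DX

def binocular_simple(phase_l, phase_r, position, disparity):
    """Linear binocular simple cell: sum of left-eye and right-eye filter outputs."""
    left_image = -OMEGA * position                # I_l(x) = cos(OMEGA * (x - position))
    right_image = left_image + OMEGA * disparity  # I_r(x) = I_l(x + disparity)
    return filter_output(phase_l, left_image) + filter_output(phase_r, right_image)

def complex_cell(dphi, position, disparity):
    """Sum of squared outputs of two simple cells in phase quadrature."""
    rs1 = binocular_simple(0.0, dphi, position, disparity)
    rs2 = binocular_simple(math.pi / 2.0, dphi + math.pi / 2.0, position, disparity)
    return rs1 ** 2 + rs2 ** 2

def preferred_disparity(dphi, steps=20, span=0.25):
    """Disparity (deg) on a coarse grid giving the largest complex-cell response."""
    candidates = [span * k / steps for k in range(-steps, steps + 1)]
    return max(candidates, key=lambda d: complex_cell(dphi, 0.0, d))

if __name__ == "__main__":
    dphi = math.pi / 2.0
    print("preferred disparity (deg):", preferred_disparity(dphi))
    print("predicted dphi / omega:   ", dphi / OMEGA)
    for p in (0.0, 0.1, 0.2, 0.3):
        print(f"position {p}: response {complex_cell(dphi, p, 0.125):.6f}")
```

Under these assumptions the response barely changes as the grating is moved, and it peaks at a disparity equal to the interocular phase difference divided by the spatial frequency, in line with equation (11).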
A complication arises from the fact that simple cells are half-wave rectifiers that respond either only to ON stimuli or only to OFF stimuli. This problem can be overcome by composing each quadrature input from a combination of OFF-center and ON-center cells with the same phase and disparity preference. A diagram of the overall model is shown in Figure 11.12A. Chen et al. (2001) described an extension to the disparity energy model to describe the spatiotemporal properties of disparity-selective complex cells (see also Qian 1994; Qian and Andersen 1997). Theoretically, combining the outputs of a single quadrature pair of simple cells into a complex cell is sufficient to code the disparities at that location. However, the reliability of the estimates is still sensitive to the local phase of the signal. For complex stimuli such as random-dot stereograms, a final smoothing can remove high-frequency noise. The smoothing operation involves pooling disparities over an area. However, this blunts the response to sharp disparity discontinuities (Qian 1994). A better approach is to use a
weighted average of the responses of several quadrature pairs of simple cells (Qian and Zhu 1997). With this procedure, final smoothing of complex-cell responses is not necessary, and sharp disparity discontinuities are preserved. Pooling over a set of simple cells also reduces the effects of instantaneous spatial frequency on the disparity estimate (Fleet et al. 1996b) and is compatible with the larger average size of the receptive fields of complex cells compared with those of simple cells (Qian 1997). Cozzi et al. (1997) analyzed the performance of the phase-based disparity algorithms of Sanger (1988) and Fleet et al. (1991) and concluded that the algorithms are quite robust to changes in contrast, noise, interocular differences in luminance, and the spectral structure of the image. Prince and Eagle (2000a) analyzed a version of the energy model of Fleet et al. (1996a). They considered position-disparity detectors as sampling a local disparity energy function. The correct match should result in significant energy in disparity detectors tuned to the corresponding disparity. Thus, we would expect a peak in the local disparity energy function at the true disparity. As Fleet et al. suggested, one can use a peak-finding algorithm that interpolates between the discrete detectors to find a high-resolution estimation of the local disparity. However, there may be maxima at a number of other disparities corresponding to false image matches. To ease this problem of false image matches, Prince and Eagle weighted the disparity energy function to emphasize small disparities. This weighting operation is reminiscent of the model developed by Sperling (1970). It enhances the contribution of small disparities and reduces the probable false response peaks at large disparities (McKee and Mitchison 1988). This is not a disparity-gradient constraint but a simple bias for small disparities. 
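The read-out described here, a small-disparity bias followed by interpolated peak finding, can be sketched as follows. This is a minimal illustration, not the procedure of Prince and Eagle or of Fleet et al.: the Gaussian form of the weighting function, its width, the parabolic interpolation, and the toy periodic energy function are all assumptions introduced for the example.

```python
import math

def weight(d, s):
    """Hypothetical Gaussian bias favoring small disparities (form assumed)."""
    return math.exp(-d * d / (2.0 * s * s))

def estimate_disparity(disparities, energies, s=None):
    """Locate the peak of a sampled disparity-energy function.

    If s is given, the samples are first weighted toward zero disparity.
    A parabola through the peak sample and its neighbors refines the estimate.
    """
    if s is not None:
        energies = [e * weight(d, s) for d, e in zip(disparities, energies)]
    i = max(range(len(energies)), key=energies.__getitem__)
    if i == 0 or i == len(energies) - 1:
        return disparities[i]          # no neighbors on both sides
    y0, y1, y2 = energies[i - 1], energies[i], energies[i + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return disparities[i]
    step = disparities[i + 1] - disparities[i]
    return disparities[i] + 0.5 * (y0 - y2) / denom * step

# Toy energy function with period 0.5 deg: equal peaks at 0.1 and -0.4 deg.
DISPARITIES = [-0.5 + 0.05 * i for i in range(21)]
ENERGIES = [math.cos(2.0 * math.pi * (d - 0.1)) ** 2 for d in DISPARITIES]

if __name__ == "__main__":
    print("weighted estimate (deg):", estimate_disparity(DISPARITIES, ENERGIES, s=0.3))
```

With the weighting applied, the peak near 0.1° wins over the equally tall false peak at −0.4°. The estimate is pulled slightly toward zero, which is the cost of the small-disparity bias.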
Use of position disparity rather than phase disparity provided disparity energy over a range of 7.2°. The model could no doubt be generalized but it was elaborated only for the task of disparity discrimination in a single channel tuned to a specific spatial frequency and orientation. Prince and Eagle argued that their model explains the phenomenon of second-order stereopsis, particularly the sensitivity of depth judgments to the disparity of the envelope of an amplitude-modulated carrier (see Section 18.7.2d). Energy detectors demodulate the carrier. The low-frequency envelope is represented in the pattern of activation over a set of disparities, and in the shape of the disparity-energy function. This predicts strong sensitivity to disparity of the envelope for discriminating the sign of disparity, but subjects would be sensitive to the carrier for discriminating stimuli on disparity pedestals or for depth estimation. One possible problem with any disparity-energy neuron model is the idea that the squared output of simple cells forms the input for complex cells. This proposal is
questionable in the macaque monkey because the required binocular simple cells appear to be rare (Livingstone and Tsao 1999). Even in the cat, complex cells may not always receive their inputs from simple cells. Archie and Mel (2000) developed and analyzed a disparity-energy model for complex cells based solely on direct LGN inputs. They modeled a single cortical neuron with simplified morphology at the level of synaptic currents and membrane potentials. Instead of combining simple-cell subunits in quadrature pairs, the subunits are mapped onto separate branches of the dendritic tree. The excitatory LGN inputs for a given branch correspond to components of one of the four postulated subunits in the disparity-energy model. The compartmentalization provided by the separate branches provides a significant level of independence between subunit components. The required expansive nonlinear interactions between inputs to the cell are mediated by sodium and potassium currents and voltage-dependent NMDA synaptic currents. These form the basis for orientation and phase invariant disparity tuning typical of complex cells. There is a growing body of evidence for compartmentalized processing within the dendritic trees of pyramidal cells (see Section 6.5.5). The possibility of implementing the disparity-energy model either with direct LGN inputs to complex cells or with quadrature pairs of simple cells illustrates the point that the implementation of a model is distinct from the model or the algorithm used to realize the model (Marr 1982).

Grossberg and his associates have constructed neural models that involve the disparity energy model and laminar processing in the visual cortex to explain a wide variety of stereoscopic effects such as Panum’s limiting case, Da Vinci stereopsis, and the perception of 3-D structure (Grossberg and Marshall 1989; Grossberg and McLoughlin 1997; McLoughlin and Grossberg 1998; Grossberg and Howe 2003).
In the standard energy model, the response of a cell to a compound grating is the sum of its responses to each component grating. Haefner and Cumming (2008) demonstrated that this means that energy-model neurons respond to types of disparity that do not occur in the natural world. In particular, they respond optimally to sinusoidal gratings in which the images are displaced by a constant phase, that is, by an amount that is proportional to the spatial period of the grating. Read and Cumming (2007) pointed out that such stimuli do not occur naturally. Haefner and Cumming recorded responses of cells in V1 of alert monkeys to binocular disparities in a drifting compound grating with spatial frequencies in the ratio 1:2. Almost half the cells responded in a way predicted by the energy model, which means that part of their response range was dedicated to stimuli that do not occur. The other cells responded only to naturally occurring disparities. These may be called adapted neurons.
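For readers who want a concrete picture of the standard energy model referred to here, the following is a minimal one-dimensional sketch of a position-disparity energy neuron: a quadrature pair of binocular simple cells whose squared outputs are summed. The Gabor parameters, the binary random-dot stimulus, the trial count, and the function names are illustrative assumptions, not values taken from any of the cited studies.

```python
import math
import random


def gabor(x, sigma, freq, phase):
    """Gabor receptive-field profile evaluated at position x (in samples)."""
    return math.exp(-0.5 * (x / sigma) ** 2) * math.cos(2.0 * math.pi * freq * x + phase)


def energy_response(left, right, center, pref_disp, sigma=4.0, freq=0.125, half=12):
    """Response of one energy neuron with a position disparity of pref_disp samples."""
    energy = 0.0
    for phase in (0.0, math.pi / 2.0):                    # quadrature pair
        simple = 0.0
        for k in range(-half, half + 1):
            w = gabor(k, sigma, freq, phase)
            simple += w * left[center + k]                # left-eye receptive field
            simple += w * right[center + k + pref_disp]   # right-eye field, shifted
        energy += simple * simple                         # squaring nonlinearity
    return energy


def tuning_curve(pref_disp, stim_disps=range(-4, 5), trials=1000, n=80, seed=0):
    """Mean response to random-dot rows as a function of stimulus disparity."""
    rng = random.Random(seed)
    totals = {d: 0.0 for d in stim_disps}
    for _ in range(trials):
        left = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        for d in stim_disps:
            # The right image is the left image shifted by d samples (circularly).
            right = [left[(j - d) % n] for j in range(n)]
            totals[d] += energy_response(left, right, n // 2, pref_disp)
    return {d: total / trials for d, total in totals.items()}


curve = tuning_curve(pref_disp=2)
best = max(curve, key=curve.get)
print(best)
```

Averaged over many random patterns, the response should be largest when the stimulus disparity equals the offset between the two receptive fields, which is the defining property of a tuned position-disparity detector.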
Haefner and Cumming accounted for the responses of adapted cells by a simple extension of the energy model. In the standard model, each complex cell receives inputs from simple cells with the same disparity selectivity. In the extended model, the outputs of two simple cells with different disparity selectivities are first subjected to a compressive nonlinearity and then combined to form the inputs to an adapted neuron. This extended model accounts for the following features of stereoscopic vision, which the standard model does not account for.

1. It explains why the visual system is particularly sensitive to naturally occurring disparities. However, nonadapted cells still respond to nonnatural disparities. Read and Cumming (2007) suggested that these cells could help to find binocularly corresponding images by identifying signals that do not correspond to real objects.

2. The extended model explains why responses of V1 neurons to anticorrelated stereograms are weaker than those to correlated stereograms.

3. It explains why some binocular cells respond only when stimulated by both eyes (AND cells) and why cells with strong ocular dominance respond only to one eye but show disparity selectivity.

The receptive field of a binocular cortical cell not only has a spatial pattern but also a temporal response function. These cells act as spatiotemporal filters on the incoming image stream. Cells could have separable spatial and temporal responses where the spatial receptive field is unchanging, but the magnitude and temporal extent of the response could be modulated. Conversely the spatial receptive field could vary as a function of time. Such a neuron would be said to have a space-time inseparable receptive field (Qian and Andersen 1997). A classical inseparable space-time function is a linear shift in spatial position of the receptive field as a function of time. Such a tilted space-time function gives selectivity to a particular direction of motion. Since binocular cells receive two inputs, the properties of the cell could depend on the spatiotemporal relation between the inputs from the two eyes as well as on each monocular input. Qian and Andersen (1997) noted that direction-selective cells would exhibit delay-dependent shifts in preferred disparity that could produce the Pulfrich effect (Chapter 23). However, Read and Cumming (2005) found that such cells are rare in monkey V1.

Nienborg et al. (2005) found that the modulation of cells in monkey V1 in response to temporal modulation of disparity peaked at 2 Hz with a cutoff at around 10 Hz. These values are about 2.5 times lower than the temporal resolution of cells for luminance changes. Nienborg et al. attributed the temporal resolution limit to the properties of binocular energy neurons in V1 acting on temporally band-pass monocular inputs. The key idea in the energy model involves a squaring (or similar) nonlinearity following the binocular summation of monocular inputs. Linear operations on a signal preserve frequency components in the signal. In contrast, nonlinear operations such as squaring can produce frequency components not found in the original signal. This can be easily demonstrated by applying a sinusoidal input to a squaring operator:

$$[\cos(\omega t)]^2 = \frac{1}{2} + \frac{1}{2}\cos(2\omega t) \qquad (12)$$

The squaring produces a DC or constant component at a frequency of zero and another component at twice the original frequency. Neither component is at the original frequency and indeed there is no component at the original frequency. Similarly, for band-pass stimuli, squaring produces copies of the input spectrum at zero and at twice the dominant frequency. Some AM radios use such a squaring followed by low-pass filtering to demodulate the radio signal and recover the original audio signal. Nienborg et al. used monocular inputs to the energy neurons, which simulated LGN temporal impulse response functions. These functions were causal (there was no response before the stimulus occurred) and biphasic with a large initial response followed by a smaller opposite rebound phase. The output nonlinearity acting on such asymmetric stimuli results in a large component centered at zero and a smaller component at twice the frequency. As the low-frequency component dominated, there was no need to further low-pass filter to approximate the empirical low-pass output power spectrum. This analysis applies to the monocular components of the signal that are squared by the output nonlinearity. Nienborg et al. showed that a similar output power spectrum was predicted for the disparity-selective (cross-correlation) component of the binocular energy response with white noise input sequences. White noise input sequences (spatially and temporally uncorrelated) are favored tools for identifying the underlying response kernel in nonlinear systems (Wu et al. 2006).
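The frequency content of a squared sinusoid, as expressed in equation (12), can be checked numerically. The sketch below is illustrative only: the 5-Hz input, the 200-Hz sampling rate, and the helper name `amplitude_at` are arbitrary choices for this example.

```python
import math


def amplitude_at(signal, freq_hz, rate_hz):
    """Amplitude of the component of `signal` at freq_hz (single-bin Fourier sum)."""
    n = len(signal)
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * i / rate_hz)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * i / rate_hz)
             for i, s in enumerate(signal))
    scale = 1.0 / n if freq_hz == 0 else 2.0 / n   # DC term is not doubled
    return scale * math.hypot(re, im)


rate_hz = 200          # samples per second; one second of signal
f_hz = 5               # frequency of the input sinusoid
carrier = [math.cos(2.0 * math.pi * f_hz * i / rate_hz) for i in range(rate_hz)]
squared = [c * c for c in carrier]

# Amplitudes at zero frequency, at the input frequency, and at twice that frequency.
spectrum = {f: amplitude_at(squared, f, rate_hz) for f in (0, f_hz, 2 * f_hz)}
print(spectrum)
```

The squared signal has components of amplitude one half at zero frequency and at 10 Hz, and essentially nothing at the original 5 Hz.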
11.10.1c From Disparity Energy to Depth

The way the outputs of binocular energy neurons are processed depends on the function that the information supports. Control of a vergence eye movement to a selected object requires an estimate of the object’s absolute disparity. On the other hand, binocular depth perception requires precise estimates of relative disparity at each point in the scene. Energy neurons in V1 are not suitable for this purpose. They respond to absolute rather than relative disparity, and the interpretation of the disparity signal depends on local spatial frequency. In a cluttered visual scene, neurons tuned to the disparity of a pair of correctly matched images may be activated by incorrectly matched images. Furthermore, energy neurons in V1 produce a reversed
disparity signal in response to an anticorrelated random-dot stereogram. The outputs of energy neurons are better suited to controlling eye vergence than to depth perception (Tyler 1991b), but they need further processing even for this function. Typically it is assumed that higher processes in the visual system combine the responses of sets of neurons, or the entire population of neurons, to derive estimates of relative disparity.

In the above models, the depth represented by the response of a local population of cells tuned to different disparities and different spatial frequencies is derived from the peak response or from a weighted average. These procedures can result in a loss of detail, as discussed in Section 15.4.2. Coarse-to-fine processing is one approach to preserving detail. Alternatively, Lehky and Sejnowski (1999) proposed that the pattern of activity over a local population of cells provides a better signal for depth perception.

Tsai and Victor (2003) developed a model of disparity processing based on the same idea. At each location in the visual cortex there is a population of cells tuned to different spatial frequencies. For each spatial frequency there are cells tuned to different phase disparities. They considered only cells tuned to vertical orientation and did not incorporate neural noise. The inputs from these cells are processed in the same way as in the front end of the phase-difference model of Qian (1994), extended to two dimensions. But instead of taking a weighted average of the responses of several quadrature pairs of simple cells, Tsai and Victor’s model preserves the information in the population response at each location in the visual field. The overall response is compared with a set of templates, each of which corresponds to a unique depth. The basic template is the expected response of the phase-difference energy model to binocular white spatial-frequency noise with a given uniform disparity, D.
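A template comparison of this general kind can be sketched as follows. This is a simplified illustration, not Tsai and Victor's model: the template form 1 + cos(2πfD − φ) for a cell tuned to spatial frequency f and phase disparity φ, the sum-of-squares mismatch, the particular frequencies and phases, and all names are assumptions made for this example.

```python
import math

FREQS = [0.5, 1.0, 2.0, 4.0]                      # cycles/deg (illustrative)
PHASES = [k * math.pi / 4.0 for k in range(8)]    # preferred phase disparities (rad)


def template(disparity):
    """Assumed expected population response to noise with uniform disparity (deg)."""
    return [1.0 + math.cos(2.0 * math.pi * f * disparity - p)
            for f in FREQS for p in PHASES]


def perceived_depth(response, candidates):
    """Return the candidate disparity whose template best matches the response."""
    def mismatch(d):
        return sum((r - t) ** 2 for r, t in zip(response, template(d)))
    return min(candidates, key=mismatch)


candidates = [k / 20.0 for k in range(-20, 21)]   # -1 to +1 deg in 0.05-deg steps
# Population response to a 0.3-deg disparity with a small deterministic perturbation.
observed = [r + 0.05 * math.sin(7.0 * i) for i, r in enumerate(template(0.3))]
estimate = perceived_depth(observed, candidates)
print(estimate)
```

Because the whole population response is retained, a mismatch function with more than one deep minimum could in principle signal more than one depth at the same location.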
The perceived depth at any location is the value of D for which the mismatch between the population response and the template is at a minimum. Multiple minima indicate the presence of multiple depths. Predictions of the model conformed to several experimental findings. These include the dependence of stereoacuity on spatial frequency (Section 18.7) and disparity pedestals (Section 18.3.3), depth averaging and transparency (Section 18.8.2), and the appearance of compound gratings (Section 17.1.1a). Since the model is based on processing at each local region it cannot account for long-range effects, such as depth contrast (Chapter 21) or depth interpolation (Section 22.2). Like other models with an energy-based front end, it cannot account for depth created by second-order stimuli (Section 18.7.2d) unless a nonlinear component is added.

In addition to pooling over spatial frequency and orientation, Read and Cumming (2007) proposed that phase-sensitive energy neurons could signal false matches by acting as “lie detectors.” The algorithm relies on hybrid phase-position neurons. The idea is that at the true disparity a
position detector should be most effectively stimulated if it has binocularly matched ON and OFF regions in the receptive field (zero phase disparity). In contrast, false positional matches should be associated with a random phase disparity match. If a peak response is associated with a nonzero phase it is a signal that there is a false match at the preferred positional disparity of the cell. Thus, the algorithm looks for a peak in the position-disparity responses that coincides with a peak phase response at zero phase disparity. Similar logic led Read and Cumming (2006) to propose that the magnitude of vertical disparity (but not its sign) could be extracted from the activity of neurons sensitive only to horizontal disparity. Any increase in vertical disparity reduces the correlation between the inputs to a binocular cell. This reduces the binocular energy in neurons sensitive only to horizontal disparity by an amount that depends on the magnitude of vertical disparity. Decorrelation produced by vertical disparity would affect mainly small receptive fields while that produced by a general loss of interocular correlation would affect all receptive fields equally. Thus, according to this proposal, there is no explicit computation of vertical disparities. However, such a model does not account for the evocation of vertical vergence by vertical disparity. In any case, Serrano-Pedraza and Read (2009) have produced psychophysical evidence that both the magnitude and sign of vertical disparities are explicitly coded in the visual system (see Section 20.2.5). Watanabe and Fukushima (1999) developed a neural model that incorporates detection of depth from disparity and depth from unpaired images (Da Vinci stereopsis). Hayashi et al. (2004) extended this model. In the first stage of the model, disparities from paired images are processed according to the standard energy model. 
Unpaired images are processed by specialized detectors and interactive processes that determine whether they conform to occlusion geometry (see Section 17.2). If they do, they are used to indicate depth. Unmatched images that do not conform to occlusion geometry engage in binocular rivalry.

Energy neurons are essentially localized cross-correlators tuned to constant disparity in local frontal surface patches (more precisely, surface patches lying along the horopter). In this sense they are similar to the initial area-based stereoscopic computer vision algorithms that also rely on filtered cross-correlation. Conceptually the world is modeled as a set of flat surfaces. Some investigators have argued that this model describes some aspects of stereoscopic processing. Banks et al. (2004a) modeled limits on grating resolution and the disparity gradient limits in random-dot stereograms with a simple windowed cross-correlator. The sampling limits reflect fundamental limits from signal theory (Nyquist limit) and are not surprising. The limited stereoscopic resolution suggests spatial pooling or averaging over the window used in estimating cross-correlation (similar effects occur in the disparity energy model, in which cells effectively perform band-pass correlation).
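A windowed cross-correlator of the general kind described above can be sketched in one dimension as follows. This is a generic area-based matcher written for illustration, not the model used by Banks and colleagues: the window size, search range, Gaussian-noise stimulus, and function names are assumptions.

```python
import math
import random


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length patches."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / norm if norm > 0.0 else 0.0


def windowed_disparity(left, right, center, half_window, max_disp):
    """Return (disparity, correlation) of the best match for a window in the left image."""
    patch = left[center - half_window:center + half_window + 1]
    best_disp, best_corr = 0, -2.0
    for d in range(-max_disp, max_disp + 1):
        start = center + d - half_window
        candidate = right[start:start + 2 * half_window + 1]
        corr = normalized_correlation(patch, candidate)
        if corr > best_corr:
            best_disp, best_corr = d, corr
    return best_disp, best_corr


rng = random.Random(1)
left_row = [rng.gauss(0.0, 1.0) for _ in range(200)]
true_disp = 3
# The right image is the left image shifted uniformly by true_disp samples.
right_row = [left_row[(j - true_disp) % 200] for j in range(200)]
disp, corr = windowed_disparity(left_row, right_row, center=100, half_window=7, max_disp=8)
print(disp, corr)
```

The match is exact here because the disparity is constant across the window. If the disparity varied within the window, no single shift would align the two patches and the peak correlation would fall, which is the sense in which such a correlator favors locally flat surfaces.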
Filippini and Banks (2009) simulated a simple windowed cross-correlator model and found that a window size of 3–6 arcmin for the finest scale explained both the limited resolution of stereopsis and effects of the disparity gradient. As noted above, cross-correlation favors detection of disparities that are constant over the correlation area. Nonzero disparity gradients decorrelate the images and therefore degrade the disparity signal. Filippini and Banks (2009) confirmed that increasing the disparity gradient reduced the detectability of sawtooth disparity corrugations in noise both for human observers and for a model based on a windowed correlator. But the model did not explain the finding that performance declined more steeply for humans than for the model. This was probably because the model did not embody the fact that human stereoacuity is degraded for stimuli on a pedestal disparity. However, Palmisano et al. (2001) found that human detection efficiency, relative to an ideal observer provided with correct matches, was higher for sinusoidal corrugations than for square wave corrugations in the presence of additive disparity noise, although this may have been due to the steep disparity gradient at the steps. Similarly, Hibbard (2008) showed that simulated energy neurons can tolerate modest disparity gradients although RMS disparity error for a winner-take-all pooling shows a large increase with any disparity gradient.

One problem with such an approach is that higher-order descriptions of the scenes need to be arrived at from postprocessing of the piecewise linear descriptions, and this has been problematic in computer vision. It is difficult to reconcile such an approach with the saliency of disparity discontinuities corresponding to depth steps and of changes in surface slant and inclination.
If the initial disparity detection is correlation based, as appears to be the case for V1 disparity-selective cells, then higher centers need to recover these psychophysically salient features. Investigators have attempted to recover these features from the population response of V1 disparity detectors. Zhaoping (2002) described a model of V2 disparity processing that acts on noisy and ambiguous disparity inputs. As in the early cooperative stereo models, local excitatory lateral connections between cells tuned to similar disparities enforce the smoothness constraint. Furthermore, mutual inhibition between cells in the same spatial location but with differing disparity tuning enforces the uniqueness constraint. Additional local inhibitory lateral connections between cells with similar disparity tuning enhance stereoscopic discontinuities and depth popout effects. A dynamic nonlinear model was used for each unit and allowed for excitatory and inhibitory connections to have separate effects based on their temporal dynamics. The model was sensitive to the weights of the interneuronal connections, which the authors suggested could be learned or innate.

Perception of monocular zones and disparity discontinuities can be incorporated into the binocular matching
process. Although monocular zones have not been incorporated into biological models, they have been used with some success in computer vision. For example, Belhumeur (1996) and Belhumeur and Mumford (1992) developed a Bayesian model incorporating depth discontinuities, monocular occlusion, and disparity gradient statistics into the statistical prior for the scene. For a given image, determining optimal disparity, disparity gradient, and depth discontinuity requires a nontrivial nonlinear optimization. Although the dynamic programming that Belhumeur used is not biologically plausible, discontinuities of depth and disparity gradients are salient features for human vision that should influence the matching process.

Egnal and Wildes (2002) reviewed the computational literature and described five types of algorithm that identify monocular occlusion zones based on: (1) bimodality of matching, (2) detection of areas of poor matches, (3) mismatched disparity between corresponding points in matches based on left and right images, (4) violation of the ordering constraint, and (5) prediction of occlusion regions from disparity discontinuities (the occlusion constraint).

Physiological mechanisms that detect and process depth discontinuities and monocular occlusion zones are not known. Mechanisms analogous to at least three of Egnal and Wildes’s five classes are theoretically possible. Bimodal or double-duty matches could be evident in the population code in the vicinity of a monocular occlusion, although Tsai and Victor (2003) proposed this as a signature of transparent surfaces. In area-based models such as the energy model, weak cross-correlation relative to monocular energy could signal poor matching. The ordering constraint is related to the disparity gradient limit that has been proposed for models of biological stereopsis (Pollard et al. 1985).

Compared with cells in V1, disparity-selective neurons in visual areas beyond V1 show:

1. Increased sensitivity to relative disparity. Physiological evidence for the processing of relative disparity in the ventral and dorsal streams was reviewed in Section 11.5.

2. Decreased response to anticorrelated stimuli (see Section 11.5). Even for V1 neurons, the response to anticorrelated stimuli is smaller than that to correlated stimuli. Read et al. (2002) modeled this asymmetry in V1 by adding a threshold nonlinearity to the monocular cell inputs prior to combination in binocular simple cells. It has also been proposed that pooling inputs across spatial frequencies underlies the increasing insensitivity to disparity in anticorrelated stimuli. Spatial-frequency pooling has likewise been proposed to reduce the effect of false matches (Fleet et al. 1996a, 1996b). Consistent with this proposal, V4 neurons exhibit a negative correlation between the degree of spatial frequency pooling (as reflected by the neuronal spatial-frequency bandwidth) and the response to disparity in anticorrelated RDS stimuli (Kumano et al. 2008). Essentially the visual system treats anticorrelated stimuli as false matches, since disparity estimates from anticorrelated stimuli are not coherent across spatial frequency channels.

3. Increased sensitivity to odd-symmetric disparity. Tanabe and Cumming (2008) modeled responses of disparity-selective cells that combine spatially coincident subunits with differing disparity tuning. Even if the subunits were only sensitive to position disparity, a cell that combined their outputs could exhibit odd-symmetric responses. Tanabe and Cumming argued that such a mechanism produces the predominance of the odd-symmetric responses in V2. A similar combination of disparity-selective subunits that differ in spatial position as well as disparity selectivity could provide a mechanism for extracting relative disparity.
11.10.2 NEURAL NETWORK MODELS
Several investigators have used neural networks to model aspects of human stereoscopic vision (Becker and Hinton 1992). Model neural networks are only loosely based on real neural systems. They are designed mainly to show how a network can learn to discriminate or recognize stimuli. The methods have been described by Rumelhart and McClelland (1986), Hinton (1989), and Miller et al. (1991).

Lippert et al. (2000) used a three-layer neural net, which they trained with noise patterns to learn disparity tuning similar to that exhibited by binocular cells in the visual cortex. The disparity tuning of the network modeled that of tuned-excitatory neurons to position disparity, phase disparity, or both types of disparity (hybrid type). In the position-disparity model, phase offset is constant. In the phase-disparity model, receptive-field offset is constant. In the hybrid-type model, the input layer consists of monocular cells representing left-eye and right-eye receptive fields with Gabor sensitivity profiles at four scales, five phase offsets, and five position offsets relative to the receptive-field center. The input units provide subunit regions for forming the disparity-tuned response. The eight cells in the hidden layer receive the binocular inputs from these Gabor filters. The hidden layer projects to a single output cell. Input weights are adjusted by supervised learning, using backpropagation, so that the output cell has a preferred disparity and a response similar to a tuned excitatory disparity detector.

After training, disparity tuning was sharper for position-type models than for phase-type models. However, for all models, disparity sensitivity peaked near the peak disparity of the teacher and generalized to novel stimuli, indicating that simple networks can learn disparity tuning from Gabor
inputs. Responses were strongest and most reliable for the hybrid-type model, which was expected because it has a richer input. Sensitivity to ghost images was observed, and the phase detectors exhibited the expected 180° range limitations. Thus, the authors claimed that disparity selectivity can be learned from Gabor inputs differing in phase or position with a simple and unphysiological backpropagation learning paradigm.

The model output cells were much less selective for spatial frequency than are real disparity-selective complex cells. Also, the model cells did not show the linear relationship between phase and frequency expected of binocular energy neurons. As a result, the network did not show the characteristic inverse disparity tuning with anticorrelated random-dot stimuli, as described in Section 11.4.1f. The phase-disparity networks were less precisely tuned and had a smaller range compared with the position-disparity networks.

These results must be interpreted carefully, since only a single neuron was trained, with no training parameters related to spatial-frequency selectivity. Broad spatial-frequency tuning of phase-type detectors decreases the reliability of disparity estimates. A detector would have higher reliability if disparities were scaled by a measure of the actual instantaneous spatial frequency rather than by the preferred spatial frequency of the detector. In the visual cortex there are probably multiple disparity detectors at each location tuned to the same disparity but to different spatial frequencies. Pooling responses from a set of such detectors could improve the reliability of disparity estimates. In addition, real complex cells form the substrate for much of visual processing and the demands of these processes might encourage spatial-frequency selectivity and allow more precise phase-type disparity detection.

Gray et al. (1998) developed a feed-forward network model that selects disparity estimates based on their reliability.
Local disparity estimates from phase-based disparity-energy neurons feed into two separate pathways, a local disparity pathway and a selection pathway. The first pathway computes local disparity by spatially pooling the output of disparity-energy neurons over a local region. Competition between these disparity-sensitive cells ensures support for a unique disparity estimate for the region. The parallel selection pathway estimates the regions with the most reliable evidence for a given disparity. After training, these units became sensitive to step changes in disparity, or disparity contrast. The selection pathway gates the local disparities multiplicatively so that reliable disparity estimates are given a greater weighting when passed to the output layer. Thus, a parallel pathway determines reliability, which is then used to gate the feedforward disparity inputs. The model demonstrates that derivation of disparity estimates can be dissociated from the determination of the reliability of those estimates. The concept of identifying and discarding unreliable disparity estimates is key to other models of stereopsis (Fleet et al. 1991).
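The multiplicative gating idea can be illustrated with a very small sketch. It is not the trained network of Gray et al.: here the reliability values are simply given rather than learned, and the normalization by the summed reliabilities is an assumption made so that the output stays in disparity units.

```python
def gated_disparity(local_disparities, selection):
    """Combine regional disparity estimates, each weighted by its reliability signal.

    local_disparities: disparity estimate from each region (local disparity pathway)
    selection:         non-negative reliability of each region (selection pathway)
    """
    total = sum(selection)
    if total == 0.0:
        raise ValueError("no region carries any reliability weight")
    # Multiplicative gating: each estimate is scaled by its reliability.
    return sum(s * d for s, d in zip(selection, local_disparities)) / total


local = [0.10, 0.55, 0.12]        # the middle region gives a discrepant estimate
reliability = [0.9, 0.05, 0.8]    # ...but is flagged as unreliable
estimate = gated_disparity(local, reliability)
print(estimate)
```

The discrepant middle estimate contributes little to the output because its gate is nearly closed, which shows how estimating disparity and estimating its reliability can be kept as separate computations.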
All models of disparity processing assume that inputs from corresponding regions in the two eyes converge to neighboring locations or onto binocular cells. They make no attempt to account for how a visual system with laterally placed eyes, little or no binocular overlap,
full decussation, and separate representation of the eyes in opposite hemispheres of the brain evolved into a system with frontal vision, a binocular region, hemidecussation, and binocular cells. This topic is discussed in Section 33.8.
12 BINOCULAR FUSION AND RIVALRY
12.1 Binocular fusion
12.1.1 Limits of image fusion
12.1.2 Effects of spatial frequency and contrast
12.1.3 Fusion limits and disparity scaling
12.1.4 Temporal factors in fusion limits
12.1.5 Orientation fusion limits
12.1.6 Hysteresis effects in fusion
12.1.7 Dichoptic moiré patterns
12.2 Dichoptic color mixture
12.2.1 Basic phenomena
12.2.2 Factors affecting binocular color mixing
12.2.3 Dichoptic and dioptic color mixtures
12.3 Binocular rivalry
12.3.1 Introduction
12.3.2 Effects of luminance, contrast, and color
12.3.3 Figural factors in rivalry
12.3.4 Position on the retina
12.3.5 Temporal factors in rivalry
12.3.6 Rivalry between moving stimuli
12.3.7 Rivalry and eye dominance
12.3.8 Monocular rivalry
12.4 Spatial zones of rivalry
12.4.1 Zones of exclusive dominance
12.4.2 Spatial extent of the zone of rivalry
12.4.3 Independence of zones of rivalry
12.4.4 Eye rivalry versus stimulus rivalry
12.5 Interactions between dominant and suppressed images
12.5.1 Pupil and accommodation responses during rivalry
12.5.2 Threshold summation of rivaling stimuli
12.5.3 Effects of changing the suppressed image
12.5.4 Movement signals from suppressed images
12.5.5 Visual beats from a suppressed image
12.5.6 Detection of dichoptically canceled textures
12.6 Effects from suppressed images
12.6.1 Threshold elevation by a suppressed image
12.6.2 Spatial-frequency aftereffect from a suppressed image
12.6.3 Tilt aftereffect from a suppressed image
12.6.4 Motion aftereffect from a suppressed image
12.7 Rivalry, fusion, and stereopsis
12.7.1 The mental theory of fusion and rivalry
12.7.2 Suppression theory of fusion and rivalry
12.7.3 Two-channel and dual-response accounts
12.8 Cognition and binocular rivalry
12.8.1 Voluntary control of rivalry
12.8.2 Attention and rivalry
12.8.3 Binocular rivalry and meaning
12.8.4 Intersensory effects in rivalry
12.9 Physiology of binocular rivalry
12.9.1 Rivalry at the level of the LGN
12.9.2 Rivalry at the cortical level
12.10 Models of binocular rivalry
12.1 BINOCULAR FUSION

12.1.1 LIMITS OF IMAGE FUSION

12.1.1a Basic Terminology

When the eyes are precisely converged on a small object, its images fall on corresponding points in the two retinas and create an impression of one object. The images are said to be binocularly fused. For a given angle of convergence, the locus in the binocular visual field where an object must lie to appear single is known as the horopter. With the eyes symmetrically converged, the horopter has two components. The horizontal horopter is approximately a circle passing through the point of convergence and the nodal points of the two eyes. The vertical horopter is a vertical line passing through the point of convergence, as explained in Chapter 14.

The images of an object that does not lie on the horopter fall on noncorresponding retinal points and are said to have a binocular disparity. An object nearer than the horopter produces a crossed disparity and an object beyond the horopter produces an uncrossed disparity. When the disparity of the images of an object is within a certain limit, the images fuse and appear as one. They are haplopic images. Beyond that limit, the images of an object appear double. They are diplopic images.

In the eleventh century, Alhazen noticed that the images of an object continue to appear single when they do not fall exactly on corresponding points in the two eyes (Section 2.10.2). In his Treatise of Optics, written in 1775, Joseph Harris wrote, “An object that is a little out of the plane of the horopter may yet appear single” (p. 113). Wheatstone (1838) also noticed that images in a stereoscope fuse when they do not fall exactly on corresponding points.
The disparity at which two initially fused images just appear double is the fusion limit. The disparity at which two initially diplopic images fuse is the refusion limit. Ideally, the diplopia threshold is the mean of the fusion and refusion limits. However, the diplopia threshold sometimes denotes only the fusion limit. The diplopia threshold can be measured as the images of a small object are separated horizontally, vertically, or in oblique directions with the gaze held on a fixation point. The total area within which the images of the object are fused is Panum’s fusional area. It is named after Peter Ludvigh Panum, professor of physiology at Kiel, who described the first systematic experiments on this topic in 1858. The diplopia threshold is a radius of the fusional area, and the fusion range is a diameter of the fusional area. The fusional range is larger for stimuli separated horizontally than for images separated vertically. The fusional area is thus elliptical (Panum 1858; Ogle and Prangen 1953). However, at least part of this difference may be due to asymmetries in vergence eye movements (Mitchell 1966b). For a given axis of image separation, the fusion range is the sum of the diplopia threshold for crossed disparity and that for uncrossed disparity. The diplopia threshold is not always symmetrical. For some persons it is greater for uncrossed images, while for others it is greater for crossed images. These asymmetries may be due in part to fixation disparity resulting from incorrect vergence on the fixation target. The relationship between the diplopia threshold and fixation disparity is discussed in Section 10.2.4. Richards (1971) concluded that asymmetries of the diplopia threshold are not due only to fixation disparity but also reflect the independent processing of crossed and uncrossed images. 
We will see that the size of the fusional area depends on many factors, such as retinal eccentricity, stimulus duration, the presence of surrounding stimuli, and the criterion for single vision adopted by the observer. Reported values have ranged from a few minutes of arc to several degrees.
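The definitions in this section reduce to simple arithmetic. The sketch below is illustrative only: the function names and the numerical values are hypothetical, and Panum’s fusional area is idealized as an ellipse whose semi-axes are the horizontal and vertical diplopia thresholds.

```python
import math

def fusion_range(crossed_threshold, uncrossed_threshold):
    """Fusion range (a diameter of the fusional area) along one axis, in arcmin.

    It is the sum of the diplopia thresholds for crossed and uncrossed
    disparity, which need not be equal.
    """
    return crossed_threshold + uncrossed_threshold

def fusional_area(horizontal_threshold, vertical_threshold):
    """Area (arcmin^2) of an idealized elliptical fusional area.

    The diplopia thresholds are treated as the semi-axes (radii).
    """
    return math.pi * horizontal_threshold * vertical_threshold

# Hypothetical observer: thresholds in arcmin, not data from any study.
horizontal_range = fusion_range(crossed_threshold=10.0, uncrossed_threshold=14.0)
area = fusional_area(horizontal_threshold=12.0, vertical_threshold=6.0)
print(horizontal_range, round(area, 1))
```

An asymmetrical observer, as in this example, has a fusion range that is not simply twice either threshold.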
12.1.1b Measuring the Diplopia Threshold Diplopia thresholds are most commonly measured by the method of limits. The subject binocularly fixates a spot in a stereoscope while the experimenter gradually increases the binocular disparity between two other spots until the subject reports diplopia. Then, starting with the stimuli well separated, disparity is decreased until the subject reports fusion. The disparity at which two images appear double when initially fused is greater than the disparity at which they fuse when initially seen double. Thus, the fusion limit is larger than the refusion limit. This hysteresis effect is discussed in Section 12.1.6. Ideally, the diplopia threshold is the mean of the two limits. The diplopia threshold cannot be measured accurately unless convergence is maintained on the fixation point. It is difficult to control vergence eye movements when the stimulus is viewed for some time, as in the method of limits. Nonius lines (Section 14.6.1c) can be used to indicate changes in vergence but they introduce extra stimuli that may contaminate the measurements, since neighboring stimuli can affect the fusion limit.

In the method of constant stimuli, the subject aligns nonius lines just before the disparate stimuli are presented briefly. Stimuli with different disparities are presented in random order, and a psychometric function of percentage of “single stimulus” judgments against disparity is plotted. The diplopia threshold is conventionally defined as the point on the psychometric function where 50% of the judgments are “single stimulus” judgments. This method has the advantage that stimuli can be presented for too brief a period to evoke vergence eye movements. However, fusion can take time to occur (Ellerbrock 1954; Kertesz and Jones 1970). Also, brief exposure introduces a temporal transient into the stimuli. We will see later that Panum’s area is increased when stimuli are rapidly alternated in disparity.

In a criterion-free forced-choice procedure, subjects discriminate between a binocular stimulus with horizontal disparity and a spatially adjacent or subsequently presented pair of stimuli with zero disparity. The disparity giving 75% accuracy is generally taken as the diplopia threshold.
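As a rough illustration of the method of constant stimuli, the sketch below reads the 50% point off a psychometric function by linear interpolation between the two tested disparities that bracket the criterion. The function name and the data are hypothetical, and a real analysis would usually fit a smooth function rather than interpolate.

```python
def diplopia_threshold(disparities, proportion_single, criterion=0.5):
    """Disparity at which the proportion of 'single' reports equals criterion.

    disparities: tested disparities in ascending order (arcmin).
    proportion_single: proportion of 'single stimulus' judgments at each one.
    """
    points = list(zip(disparities, proportion_single))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        # The function falls with disparity, so look for a downward crossing.
        if p0 >= criterion > p1:
            return d0 + (p0 - criterion) * (d1 - d0) / (p0 - p1)
    raise ValueError("criterion is not bracketed by the data")

# Hypothetical data: proportion of 'single' reports at five disparities.
disparities = [5, 10, 15, 20, 25]
proportion_single = [1.0, 0.9, 0.6, 0.2, 0.0]
threshold = diplopia_threshold(disparities, proportion_single)
print(round(threshold, 2))
```

With these made-up data the criterion is crossed between 15 and 20 arcmin, so the estimate falls between those two tested values.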
12.1.1c Criteria for Fusion In an experiment designed to measure the diplopia threshold, subjects may use one or more of the following criteria instead of diplopia.

1. Apparent depth. Just because two stimuli are fused does not mean that their separate identities are lost in the visual system. The fact that we can detect the sign of depth produced by crossed and uncrossed images within the fusion range proves that image identity is retained. When horizontal fusion limits are being measured, subjects may inadvertently judge apparent depth between the stimuli rather than diplopia.

2. Line thickening. As two lines presented to the same eye are moved further apart, the first sensation is of a single line becoming thicker. The separation where thickening is detected is known as width resolution (Section 3.1.3a). Only at a larger separation do the lines appear to separate. The same effect occurs with dichoptic lines. People can discriminate between a pair of fused horizontal lines with a small vertical disparity and a pair of lines exactly superimposed (Kaufman and Arditi 1976; Arditi and Kaufman 1978). Thickening of dichoptic stimuli may be noticed before fusion is lost (Heckmann and Schor 1989a).

3. Rivalry. As dichoptic stimuli are spatially separated, there comes a point where edges of opposite luminance polarity become superimposed. This may evoke a sensation of binocular luster or rivalry. If the two stimuli rival, an apparent change in position may occur.

Generally, smaller fusion limits are obtained when subjects are allowed to use criteria other than diplopia than when they are required to use the criterion of diplopia. The criterion problem is particularly severe with the forced-choice procedure because it forces subjects to use whatever differences are available.

12.1.1d Fusion Limits and Eccentricity All investigators agree that the diplopia threshold increases with increasing eccentricity, although it is difficult to measure the threshold at eccentricities of more than 10°. Studies reviewed by Mitchell (1966b) showed wide variations in the slope and shape of the function relating diplopia threshold to eccentricity. Palmer (1961) asked subjects to fixate between two marks 40 arcmin apart horizontally while a test spot 1.5 arcmin in diameter was presented for 10 ms with various horizontal disparities up to 6°. The fusion limit was about 15 to 20 arcmin in the fovea and increased to about 60 arcmin at an eccentricity of 6°. Mitchell (1966a) reported a similar dependence on eccentricity. Crone and Leuridan (1973) found that, beyond a horizontal eccentricity of 10°, the diplopia threshold increased in proportion to eccentricity. On average, the diplopia threshold was about 7% of the angle of eccentricity. Ogle (1964) reported a ratio of 6%. Hampton and Kertesz (1983) found that the diameter of the fusional area increased linearly with horizontal eccentricity with a slope of about 0.13° per degree. This is similar to the rate of increase of the magnification factor in the human visual cortex (Rovamo and Virsu 1979; Yeshurun and Schwartz 1999). There is evidence that the fusional area increases with eccentricity less rapidly along the vertical meridian than along the horizontal meridian (Ogle and Prangen 1953). We will see later that the fusion limit for a pair of images is smaller when other images are nearby. In recording the fusion limit as a function of eccentricity, subjects fixate a binocular stimulus to hold vergence constant while the test stimuli are moved into more peripheral positions. The increase in the fusion limit with increasing eccentricity may therefore be due, at least partly, to the increasing distance between the fixation stimulus and the test stimuli.

Levi and Klein (1990) described a procedure for separating the effects of these variables in the measurement of vernier acuity. The procedure could be adapted for measurement of stereo acuity or fusion limits. The independent effect of increasing eccentricity could be measured by placing a zero-disparity stimulus at a fixed distance from a dichoptic test stimulus at different eccentricities. The independent effect of changing image proximity could be measured by placing the zero-disparity and test stimuli at different separations on the circumference of a circle centered on the fixation point.

12.1.2 EFFECTS OF SPATIAL FREQUENCY AND CONTRAST

Kulikowski (1978) reported that the fusion limit is greater for gratings with gradual contours than for gratings with sharp contours. In more recent studies, the effects of low and high spatial frequency and contrast on the fusion limit have been investigated in more detail. Schor et al. (1984a) used vertical bars with a difference of Gaussian (DOG) luminance profile (spatial-frequency bandwidth of 1.75 octaves at half-height) superimposed on a small fixation spot. Nonius lines were used to check for changes in vergence, which were claimed to be less than 1 arcmin. Subjects adjusted the disparity between the two bars until they noticed an increase in width, a lateral displacement, or diplopia. Figure 12.1 shows that the fusion limit increased as spatial frequency decreased. The vertical fusion limit was consistently smaller than the horizontal limit. Below a spatial frequency of about 1.5 cpd, the limit of fusion corresponded closely to a 90° phase shift of the stimulus, indicated by the diagonal line. When the measurements were repeated with both Gaussian bars presented
to one eye, the results also fell on the diagonal line. This is not surprising, because a 90° phase shift is the Rayleigh limit for monocular resolution (Section 3.1.2). Schor et al. concluded that, at low spatial frequencies, the limit of binocular fusion is determined by the same factors that determine monocular grating resolution. Both these limits are much coarser than acuity for vernier offset (Heckmann and Schor 1989a). Note that Schor et al. used a liberal criterion for the limit of fusion: even a slight thickening or displacement of the stimulus counted as diplopia. Perhaps the binocular and monocular limits would not match with a stricter criterion for diplopia.

Figure 12.1. Fusion limits and spatial frequency. Diplopia threshold (radius of fusional area) as a function of peak spatial frequency of two Gaussian patches (spatial bandwidth 1.75 octaves) and of the width of two bright bars. For patches with spatial frequency below about 1.5 cpd, the threshold corresponds to a 90° phase shift of the stimulus (bold dashed line). The fusion limit for the bars remains the same as that of the high spatial frequency patch. Results for one subject. (Redrawn from Schor et al. 1984b)

For spatial frequencies higher than about 2.4 cpd, the horizontal fusion limit leveled off to between 5 and 10 arcmin. Thus, for high spatial frequencies, the Rayleigh limit of 90° phase-shift ceased to be the limiting factor for resolution of diplopic images but not for resolution of images in the same eye. In fact, in this study and in that of Schor et al. (1989), the fusion limit at the highest spatial frequencies was about three times the width of the center of the DOG, as depicted in Figure 12.1. The experiment was repeated with sharp-edged bars with widths and luminances equal to those of the bright central component of the DOG patterns. Figure 12.1 shows that the fusion limits for the bars resembled those for the narrowest Gaussian pattern, suggesting that subjects were using the edges of the bars (the highest spatial-frequency component) to make their judgments.

The results of these experiments suggest that, at high spatial frequencies, the spatial resolution of dichoptic stimuli is determined by some factor other than the limit for monocular resolution. Perhaps vergence instability is that other factor. A given vergence instability produces a greater effect on the phase relationship of dichoptic Gabors as spatial frequency increases.
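The relation between a positional offset and a phase shift can be made concrete with a short calculation. The sketch below is illustrative: the function names are hypothetical, the stimulus is treated as if it were a simple periodic pattern at its peak spatial frequency, and the 2-arcmin vergence error is an assumed value rather than one taken from the studies cited.

```python
ARCMIN_PER_DEG = 60.0

def quarter_cycle_offset(freq_cpd):
    """Offset (arcmin) equal to a 90-degree phase shift at freq_cpd."""
    period_arcmin = ARCMIN_PER_DEG / freq_cpd
    return period_arcmin / 4.0

def phase_shift_deg(offset_arcmin, freq_cpd):
    """Phase shift (degrees) produced by a given offset at freq_cpd."""
    return 360.0 * freq_cpd * offset_arcmin / ARCMIN_PER_DEG

# Assumed vergence error of 2 arcmin applied at three spatial frequencies.
for freq in (0.3, 1.5, 9.6):
    print(freq, quarter_cycle_offset(freq), phase_shift_deg(2.0, freq))
```

On this reckoning a 90° phase shift corresponds to 10 arcmin at 1.5 cpd. The same 2-arcmin vergence error that shifts a 0.3 cpd pattern by only a few degrees of phase shifts a 9.6 cpd pattern by more than a quarter cycle.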
At high spatial frequencies subjects will have greater difficulty in distinguishing between the effects of stimulus offset and the effects of vergence instability. Vergence instability has no effect on Gabors presented to the same eye. The dependence of the fusion limit on the highest visible spatial-frequency component of a stimulus might arise because, for a given contrast, high spatial-frequency stimuli have a steeper luminance gradient than low spatial-frequency stimuli. Schor et al. (1989) investigated this question. Subjects decided which of two horizontal sine-wave gratings contained a vertical disparity. This procedure forces subjects to use any available cue such as thickening and displacement of lines, as well as diplopia. They could not use stereo depth because the gratings were horizontal. The fusion limit was measured for each of several spatial frequencies at each of several contrasts. The logic was that if spatial frequency is the crucial factor rather than the luminance gradient, then changing the contrast of a grating with fixed spatial frequency should have no effect. But if the luminance gradient is a crucial factor, then changing
contrast should have an effect, since halving the contrast halves the luminance gradient for a sinusoidal grating of fixed spatial frequency. The results showed almost no effect of changing contrast over a range of spatial frequencies from 0.4 to 3.2 cpd, a result confirmed by Heckmann and Schor (1989a). Furthermore, the fusion limit was not affected by a change in luminance gradient produced by adding a low spatial-frequency component to the sine-wave gratings, even when the added component had the higher contrast. Schor et al. concluded that binocular fusion is based on information in independent spatial-frequency channels rather than on the overall luminance distribution. We will see in Section 17.1.1 that linking disparate images for the detection of stereo depth can involve the overall luminance distribution of the images under certain circumstances and spatial-frequency components under other circumstances. Schor et al. argued that the fusion limit is not affected by changes in contrast because a change in contrast has the same effect on binocular cells that register fused images as on monocular cells that register diplopic images. The effect of contrast thus cancels out. Stereoacuity is adversely affected by a reduction in contrast (Section 18.5). This is presumably because the detection of disparity on which stereoacuity is based depends only on binocular cells. Certainly, the effects of contrast and spatial frequency on diplopia detection are not the same as their effects on disparity detection. Changes in stimulus luminance of up to 3 log units above threshold also have little effect on the fusion limit (Siegel and Duncan 1960; Mitchell 1966a). Perhaps vergence instability was the crucial factor in these experiments. In fact, the effects of vergence instability on the fusion limit of dichoptic gratings should be even more evident than its effect on the fusion limit of Gabor patches. 
Also, vergence instability would account for the dependency of the fusion limit on the spatial frequency of the stimuli rather than their contrast. The effects of vergence instability could be investigated by measuring the effect of imposed random disparity jitter of various mean amplitudes and frequencies. Roumes et al. (1997) confirmed that the fusion limit decreases as the spatial frequency of the stimuli increases, for both crossed and uncrossed disparities. However, with a 0.3 cpd DOG superimposed on a 4.8 cpd DOG, the fusion limit was intermediate between the limits for the separate components. In other words, the low spatial-frequency component of the compound stimulus increased the fusion threshold above the limit set by the high spatial-frequency component. They used DOGs that varied in luminance in two dimensions, as depicted in Figure 4.4. This contrasts with Schor et al.’s finding that the fusion limit depends on the highest visible spatial-frequency component. Roumes et al. argued that the fusion limit is less ambiguous with their 2-D stimulus than with the 1-D Gaussian patch or bar used by Schor et al. They pointed out that random-dot stereograms can be fused over greater disparities than can
isolated dots. The low-frequency dot clusters must play a part in determining the fusion limit of a random-dot display. Roumes et al. also showed that the time taken to fuse DOG stimuli was constant at about 500 ms for all spatial frequencies and for disparities up to about 20 arcmin. Above this disparity, fusion time increased rapidly, especially with stimuli of high spatial frequency. They suggested that the fusion time for stimuli with small disparity depends on a fast neural process but that fusion time for stimuli with large initial disparity depends also on vergence eye movements.
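The reasoning behind the contrast manipulation of Schor et al. (1989), described earlier in this section, rests on how the steepest luminance gradient of a sine-wave grating depends on contrast and spatial frequency. The following is a minimal sketch, assuming a grating of the form L(x) = L0[1 + C sin(2πfx)], for which the maximum gradient is 2πfCL0. The function name and the parameter values are hypothetical.

```python
import math

def max_luminance_gradient(mean_luminance, contrast, freq_cpd):
    """Steepest slope of L(x) = L0 * (1 + C * sin(2*pi*f*x)).

    Units: luminance per degree of visual angle.
    """
    return 2.0 * math.pi * freq_cpd * contrast * mean_luminance

# Hypothetical grating: mean luminance 50, contrast 0.8, 1.6 cpd.
base = max_luminance_gradient(50.0, 0.8, 1.6)
half_contrast = max_luminance_gradient(50.0, 0.4, 1.6)
double_freq = max_luminance_gradient(50.0, 0.4, 3.2)
print(base, half_contrast, double_freq)
```

Halving the contrast halves the gradient, whereas doubling the spatial frequency at the lower contrast restores it. This is why the two factors can be dissociated by varying contrast at a fixed spatial frequency.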
12.1.3 FUSION LIMITS AND DISPARITY SCALING
12.1.3a Disparity Scaling and Disparity Gradients Helmholtz (1909) noticed that disparate points are less likely to fuse when there are other objects nearby. The spatial limitations on the fusion limit can be expressed in terms of the disparity gradient. The disparity gradient is the difference in disparity between the images of one point and the images of a second point divided by the mean angular separation of the image pairs, as explained in Section 19.4. Points on a visual line of one eye have a disparity gradient of 2, and those lying on a line in the median plane have a gradient of infinity. Tyler (1973) was the first to show a disparity-gradient limit for fusion. He presented a straight vertical line to one eye and a sinusoidally curved vertical line to the other eye. This produced a sinusoidal variation in disparity in the fused image. The lateral separation of the lines required to produce diplopia decreased as the spatial frequency of disparity modulation increased. Braddick (1979) found that subjects could readily fuse the two lines in Figure 12.2A when convergence was held on the surrounding circle. At a distance of 30 cm, the lines have a disparity of about 9 arcmin. Subjects could no longer fuse the two lines when a second pair of lines with the same but opposite disparity was added, as in Figure 12.2B. He independently varied the distance between the pairs of lines in one eye and the distance between disparate images in the two eyes. The factor limiting fusion for a given disparity was the monocular spacing of the images rather than competing disparate images. Contaminating effects of vergence changes were avoided by having subjects align nonius lines before the displays were exposed for only 80 ms. This is too short a time for vergence movements to occur. The reduction in the fusion limit was most evident when the closely spaced monocular images were parallel, vertically aligned, and equal in length. 
It is as if two closely spaced lines in one eye evoke responses in detectors of smaller spatial scale than those evoked by a single line. Thus, diplopia detection proceeds within a system of higher spatial resolution when this finer system is recruited.
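The definition of the disparity gradient given above can be expressed as a short calculation. This is a sketch under stated assumptions: image positions are given as (x, y) angles in a common unit, disparity is taken as the horizontal difference between left-eye and right-eye positions, and the separation of the image pairs is measured between the midpoints of each pair. The function name is hypothetical.

```python
import math

def disparity_gradient(left1, right1, left2, right2):
    """Disparity gradient between two distinct points.

    Each argument is the (x, y) position of an image in one eye.
    """
    d1 = left1[0] - right1[0]
    d2 = left2[0] - right2[0]
    # Midpoint of each image pair (mean of left- and right-eye positions).
    mid1 = ((left1[0] + right1[0]) / 2.0, (left1[1] + right1[1]) / 2.0)
    mid2 = ((left2[0] + right2[0]) / 2.0, (left2[1] + right2[1]) / 2.0)
    separation = math.hypot(mid2[0] - mid1[0], mid2[1] - mid1[1])
    if separation == 0:
        # Midpoints coincide, as for points lying in the median plane.
        return math.inf
    return abs(d1 - d2) / separation

# Two points on a visual line of the left eye: left-eye images coincide.
on_visual_line = disparity_gradient((0, 0), (0, 0), (0, 0), (6, 0))
# Two points in the median plane: the image-pair midpoints coincide.
in_median_plane = disparity_gradient((0, 0), (0, 0), (3, 0), (-3, 0))
print(on_visual_line, in_median_plane)
```

The two examples reproduce the limiting cases mentioned in the text: a gradient of 2 for points on a visual line of one eye and an infinite gradient for points in the median plane.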
Figure 12.2. Effects of lateral spacing on fusion. (A) The vertical lines fuse when the eyes converge on the circle. (B) When a second pair of lines with the opposite disparity is added, the lines no longer fuse. At a distance of 30 cm, the lines have a disparity of 9 arcmin. (Adapted from Braddick 1979)
Figure 12.3. Disparity-gradient limit for binocular fusion. Diverge or converge to fuse the columns of dots. Burt and Julesz claimed that if the upper pair of dots in any row is fused, the lower pair fuses only if the disparity gradient is less than about 1. The vertical disparity gradient increases down the columns and may be calibrated for a given viewing distance. The disparity-gradient limit for fusion was specified by the number at which fusion of the lower pair of dots failed. (Derived from Burt and Julesz 1980)

The dependency of the fusion limit on the disparity gradient is known as disparity scaling. It is illustrated in Figure 12.3. Burt and Julesz (1980) reported that a pair of images does not fuse when the disparity gradient with respect to a neighboring fused pair exceeds a value of about 1. Thus, in the lower rows of Figure 12.3, the disparity gradient is steeper than 1 and, although the members of the
fixated pair of dots fuse, those of the other pair remain perceptually distinct. In the bottom rows in Figure 12.3 the image of the fused dots lies more or less between the unfused dots. The spatial intrusion of the fused pair between the unfused pair could prevent the flanking pair from fusing. But the same fusion limit applies when the two disparate points are to one side of the image of the fused points. Burt and Julesz (1980) referred to the orientation of the disparity gradient with respect to the horizontal as the dipole orientation. The largest horizontal disparity gradient for which the nonfixated images could be fused was independent of the dipole angle. Given that the disparity gradient limit for fusion is 1, it follows that each fused object in the visual field creates a forbidden zone, as illustrated in Figure 12.4. Within this zone, the disparity gradient is greater than 1 and disparate images do not fuse, unless the disparity is vanishingly small. Prazdny (1985a) confirmed that the limiting disparity gradient is 1 for similar stimulus elements but found that the largest disparity gradient for fusion increased to 1.4 when the objects differed in size, and to over 2 when they also differed in luminance polarity. We can conclude that fusion of disparate points is limited by a disparity gradient of about 1. However, both fused and unfused images can code relative depth. The disparity gradient limit for fusion is not the limit for stereoscopic vision.
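The forbidden zone can be sketched as a simple test on the disparity gradient. In the illustration below, the function name and the example values are hypothetical. The default limit of 1 is the value reported by Burt and Julesz (1980), and the larger limits are those reported by Prazdny (1985a) for dissimilar elements, with 2 taken as a lower bound for elements of opposite luminance polarity.

```python
def in_forbidden_zone(disparity_difference, separation, gradient_limit=1.0):
    """True if a second image pair falls in the forbidden zone of a fused pair.

    disparity_difference: disparity of the second pair relative to the
        fused pair.
    separation: angular separation of the two image pairs (same unit).
    """
    if separation == 0:
        return disparity_difference != 0
    return abs(disparity_difference) / separation > gradient_limit

# A pair with 8 arcmin of relative disparity, 5 arcmin from a fused pair
# (disparity gradient 1.6).
similar = in_forbidden_zone(8.0, 5.0)
different_size = in_forbidden_zone(8.0, 5.0, gradient_limit=1.4)
opposite_polarity = in_forbidden_zone(8.0, 5.0, gradient_limit=2.0)
print(similar, different_size, opposite_polarity)
```

In this example the second pair would not fuse if it resembled the fused pair or differed only in size, but could fuse if it also differed in luminance polarity.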
Figure 12.4. Disparity-gradient limit for fusion. A fused object, such as A, creates a zone within which the disparate images of a second object, such as C, will not fuse. The images of B fuse in the presence of A, since B is outside A’s “forbidden” zone. (Redrawn from Burt and Julesz 1980)
We will see in Section 18.9 that superimposed random-dot displays with distinct disparities create an impression of two surfaces, one seen through the other. The impression of transparency is maintained when the disparity gradient between neighboring dots is well above 1. McKee and Verghese (2002) demonstrated that transparency occurs with a disparity gradient of up to at least 3. They created two transparent surfaces from well-spaced multiple pairs of random dots. One set of dots had a crossed disparity of 6 arcmin and the other set had an uncrossed disparity of 6 arcmin. The disparity gradient between the pairs of dots was varied by varying their vertical separation. With disparity gradients of up to at least 3, subjects saw two depth planes and were able to detect a test target placed in depth between the two planes. Stereoscopic vision is limited by the spatial frequency of disparity modulations, as we will see in Section 18.6.3. The disparity limits for stereopsis are discussed in Section 18.4.1.
12.1.3b Disparity Scaling and Spatial Frequency Wilson et al. (1991) measured the effects of a background grating of one spatial frequency on the fusion limit of small vertically elongated D6 Gaussian patches, as shown in Figure 12.5. Each patch had a spatial bandwidth of 1 octave with a center spatial frequency of between 0.5 and 12 cpd. The patch on the right had zero disparity while the disparity of the patch on the left was varied. Subjects fixated between the patches and reported whether or not the images of the left patch were fused. The diplopia threshold decreased about 3.9 times when the patches were superimposed on a grating with a spatial frequency half that of the Gaussian patches but was not affected by a grating with a spatial frequency one quarter that of the patches. Further tests also showed that coarse spatial scales constrain disparity processing in fine scales, but only over a spatial-frequency range of 2 octaves. These effects were not due to changes in vergence since the stimuli were presented briefly. They did not depend on the spatial phase of the test patches relative to that of the background grating but did depend on the test and background stimuli having the same orientation. Wilson et al. concluded that binocular disparities are processed in at least three distinct spatial-frequency channels, each with subchannels for near (crossed), zero, and far (uncrossed) disparities. To account for their data, they postulated that far and near cells inhibit far and near cells, respectively, in the next higher spatial-frequency channel and that zero-disparity cells inhibit both near and far cells in the next higher spatial-frequency channel. They argued that these effects could be accomplished by inhibitory feedback suppressing an appropriate subset of monocular inputs.

Scheidt and Kertesz (1993) conducted a similar experiment with an induction stimulus consisting of D10 Gaussian patterns in a 5° circular area around a central fixation point and a similar test pattern in a larger annulus around the induction stimulus. A 0.5° ring separated the inner area and the surrounding annulus. Both sets of patterns had a peak spatial frequency of 0.75 cpd, but the disparity of the induction stimulus varied from trial to trial between ±15 arcmin. When the stimuli were exposed simultaneously for 167 ms, the diplopia threshold of the test stimuli with the Gaussian induction stimulus was reduced relative to that when the induction stimulus was evenly illuminated. These results essentially confirm those obtained by Wilson et al. for this spatial frequency. However, when the stimuli were exposed continuously, the diplopia threshold of the test stimulus was reduced only in the presence of an induction stimulus with uncrossed disparity. Scheidt and Kertesz proposed that interactions between dichoptic stimuli have a fast, wholly inhibitory, component and a slow component that is inhibitory or facilitatory depending on whether the binocular disparities of the interacting stimuli have the same or opposite signs.

The effects of the disparity gradient and of the spatial frequency of superimposed gratings on the fusion range are presumably aspects of the same underlying mechanism. The mechanism ensures that when stimuli of different spatial scale are crowded together, detectors of small spatial scale are devoted to the analysis of disparity between finer elements of the stimulus. This ensures that the fusion mechanism does not combine distinct parts of a dense stimulus pattern. The relationship between stereoacuity and stimulus spacing is discussed in more detail in Section 18.6.2.

Figure 12.5. Fusion with superimposed spatial frequencies. (A) A pair of D6 patches with horizontal disparity. When fused, one patch appears in front of the other. (B) D6 patches are superimposed on a grating 2 octaves lower in spatial frequency. The left-hand D6 can no longer be fused. (C) D6 patches are superimposed on a grating 4 octaves lower in spatial frequency. The patches fuse to create transparent depth. (From Wilson et al. 1991)

12.1.3c Disparity Scaling in the Periphery As eccentricity increased, an object with a given disparity had to be more widely separated laterally from a neighboring object before its images would fuse (Scharff 1997). In other words, the critical disparity gradient for fusion decreased with increasing eccentricity. This effect may be explained by the fact that resolving power decreases (receptive fields become larger) in the periphery so that fine disparities cease to be processed. The linkage between spatial scale and processing of fine and coarse disparities is discussed in Section 18.7.2.

12.1.3d Disparity Scaling in the Blue-Cone System Wilson et al. (1988) measured the horizontal and vertical diplopia thresholds for dots and lines that stimulated the blue cones only. They used a yellow adapting field to adapt out the middle- and long-wavelength cones. The diplopia threshold of the blue-cone system was similar to that obtained when all cone types were stimulated. The blue-cone system also showed the same dependency on the disparity gradient when low spatial-frequency stimuli were used. In other words, the blue-cone system was subject to disparity scaling.

12.1.4 TEMPORAL FACTORS IN FUSION LIMITS
There has been conflicting evidence about the effects of stimulus duration on the fusion limit. Palmer (1961) found that the fusional area of a 1.5 arcmin spot had a horizontal diameter of between 15 and 20 arcmin and did not change significantly when exposure duration increased from 10 to 170 ms. However, Woo (1974a) found that the mean horizontal fusion range of three subjects for a short vertical line increased from about 2 to 4 arcmin when exposure duration increased from 5 to 100 ms. Duwaer and van den Brink (1982a) found that the fusion limit for vertical disparity in one subject decreased from about 10 to 6 arcmin as exposure increased from 20 to 200 ms. These contradictory findings remain unexplained. Woo (1974b) presented a dichoptic narrow slit for 10 ms to each eye with various intervals of time between them. The fusion limit (diplopia threshold) was not affected until the delay was 40 ms, when the stimuli began to appear as discrete temporal events.
Figure 12.6. Spatiotemporal aspects of the fusion limit. Solid wavy lines are images in one eye. Dashed wavy lines are images in the other eye. Images in the two eyes were alternated at between 0.1 and 5 Hz to create a pair of lines undulating in depth, as shown below, with the sign of undulation alternating over time. The lines were 0.5° on either side of a fixation cross. (Adapted from Schor and Tyler 1981)

Figure 12.7. Fusional areas and spatial frequency. Ellipses represent Panum’s fusional areas. Both diameters increase as the spatial frequency of disparity modulation of wavy lines shown in Figure 7.6 decreases. The horizontal but not the vertical diameter decreases as temporal frequency of depth modulation increases, especially at low spatial frequencies. Axes: temporal frequency of disparity modulation (Hz) against spatial frequency of disparity modulation (cpd). (Redrawn from Schor and Tyler 1981)
Schor and Tyler (1981) explored the dependence of fusion limits on the spatiotemporal properties of the stimuli. In investigating horizontal fusion limits, they used dichoptic vertical wavy lines with opposite phases of the waves in the two eyes. From trial to trial, they changed the horizontal disparity between the aligned peaks of the waves by changing the amplitude of the waves. A set of dichoptic lines was placed 0.5° on either side of a fixation cross (Figure 12.6). To measure vertical fusion limits the lines were horizontal. The spatial frequency of the waviness of the lines varied between 0.125 and 2.0 cpd. The sign of disparity of the two sets of lines reversed in counterphase at between 0.1 and 5 Hz. This reduced any tendency to change convergence. They determined the amplitude of disparity modulation at which diplopia became apparent for each spatial and temporal frequency of disparity modulation. The results are shown in Figure 12.7. The horizontal and vertical fusion limits increased as the spatial frequency of the waviness of the lines decreased. With lines of low spatial frequency the horizontal fusion limit decreased by a large amount with increasing temporal frequency, whereas the vertical fusion limit remained fairly constant. Temporal frequency had little effect with high spatial-frequency stimuli.

12.1.5 ORIENTATION FUSION LIMITS
When superimposed dichoptic lines are rotated about their centers in the frontal plane in opposite directions, the angle
at which they appear double is the fusion limit for orientation disparity. If orientation disparities were processed by a distinct mechanism, one would expect the fusion limit for orientation disparity to be independent of the length of the lines. However, if the fusion limit were to depend on point disparities, it would also be independent of line length, if the point-disparity fusion limit were proportional to eccentricity. The measurement of the orientation fusion limit is confounded by cyclovergence, which tends to cancel orientation disparities. Kertesz (1973) found that a larger orientation disparity was required to induce diplopia in lines subtending 2° than in lines subtending 9° when the subject fixated the center of each line. He concluded that point disparities rather than orientation disparities determine the cyclofusional limit and that the fusion limit for point disparity does not increase linearly with eccentricity for eccentricities of less than 10°. The fusion limit of orientation disparity was smaller for a set of parallel lines than for single lines. This could be because the fusion limit for crowded stimuli is smaller than that for single stimuli or because gratings induce more cyclovergence than do single lines (Section 10.7.5). Kertesz did not control for the effects of cyclovergence because, at the time, he did not believe it occurred. It has been reported that the fusion limit for orientation disparity is about 2° for horizontal lines and about 8° for vertical lines (Volkmann in Helmholtz 1909, p. 449; Ames 1929; Beasley and Peckham 1936; Crone and Leuridan 1973; Sen et al. 1980). This is consistent with the fact that the fusion limit for horizontal point disparity is larger than that for vertical point disparity (Kertesz 1981). However, cyclovergence is evoked with greater magnitude by cyclorotated horizontal lines than by cyclorotated vertical lines (Section 10.7.5). 
Unless the effects of cyclovergence are taken into account, comparison between the fusion limits for horizontal and vertical orientation
disparities is not valid. The anisotropy of fusion limits may also be influenced by the fact that acuity is higher for vertical than for horizontal lines. One way to allow for the effects of cyclovergence is to present a dichoptic pair of horizontal lines at the same time as a pair of dichoptic vertical lines and see which pair becomes diplopic as each pair of lines is rotated away from being parallel in the same direction. Cyclovergence affects both lines equally. We have noticed, as did Volkmann (see Helmholtz 1909, vol. 3, p. 449) and O’Shea and Crassini (1982), that horizontal lines appear diplopic before vertical lines. This confirms that the fusion limit is greater for vertical lines (horizontal-shear disparity) than for horizontal lines (vertical-shear disparity).
The corresponding vertical meridians of the retinas are normally excyclorotated about 2°. This should cause the fusion limit to be larger in the direction of the excyclorotation than in the opposite direction when testing is done with respect to the true vertical (Krekling and Blika 1983a). To control for this effect and for effects of cyclovergence, experiments should be performed with stimuli rotated in both directions and with opposite directions of disparity on either side of the fixation point.
12.1.6 HYSTERESIS EFFECTS IN FUSION
The fusion limit when initially fused dichoptic stimuli are moved apart is larger than the refusion limit when the stimuli are moved together from a position of diplopia. There has been a good deal of theorizing about the neural mechanisms responsible for this fusional hysteresis. However, hysteresis is not peculiar to binocular fusion. All psychophysical thresholds exhibit hysteresis according to the direction from which the threshold is approached. Fender and Julesz (1967) measured fusion hysteresis using optically stabilized images that did not move on the retina as the eyes moved. The stimulus to each eye was a horizontal or vertical black line on a 6°-wide white surround. As the vertical lines moved apart, diplopia became apparent at an uncrossed disparity of 65 arcmin. When the vertical lines moved toward each other they fused at a disparity of 42 arcmin. For the horizontal lines the corresponding thresholds were about 19 and 12 arcmin. Diplopia thresholds with stabilized images were at least 20 arcmin smaller than with normal viewing. Fender and Julesz measured changes of vergence and claimed that these could account for the larger fusion limits in normal viewing. The fusion limits reported by Fender and Julesz are larger than those reported by other investigators, but the line was 13 arcmin wide, which may have inflated the values. These measurements were repeated for both crossed and uncrossed disparities of a retinally stabilized black line, but with the border of the surrounding 3° white disk unstabilized (Diner and Fender 1987). The diplopia threshold for increasing disparity, either crossed or uncrossed, was about 20 arcmin, and the refusion limit was about 10 arcmin.
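The difference between the diplopia limit and the refusion limit can be pictured as a two-threshold rule in which the current percept depends on the previous one. The following Python sketch is only an illustration and not a model proposed by Fender and Julesz. The function name `fusion_states` and its parameters are hypothetical, the default limits are the values quoted above for stabilized vertical lines (65 and 42 arcmin), and treating crossed and uncrossed disparities symmetrically is an assumption.

```python
def fusion_states(disparities, break_limit=65.0, refusion_limit=42.0, fused=True):
    """Return the percept (True = fused, False = diplopic) at each disparity.

    disparities    -- sequence of disparities in arcmin, in temporal order
    break_limit    -- disparity above which a fused pair becomes diplopic
    refusion_limit -- disparity below which a diplopic pair fuses again
    fused          -- assumed percept before the first sample
    """
    states = []
    for disparity in disparities:
        magnitude = abs(disparity)  # assumption: limits are symmetric about zero
        if fused and magnitude > break_limit:
            fused = False           # fusion breaks only beyond the larger limit
        elif not fused and magnitude < refusion_limit:
            fused = True            # refusion needs the smaller limit
        states.append(fused)
    return states


# Sweep the disparity outward from 0 to 80 arcmin and back again in 10-arcmin steps.
sweep = list(range(0, 81, 10)) + list(range(70, -1, -10))
states = fusion_states(sweep)
```

In this sweep a disparity of 50 arcmin is reported as fused on the way out but as diplopic on the way back, which is the signature of hysteresis.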
When an unstabilized fixation cross was added just above the line, these limits were reduced by about 5 arcmin. Thus the range of fusion in Fender and Julesz’s study, in which the disparities occurred over the entire contents of the visual field, was larger than in Diner and Fender’s study, where only some of the elements were disparate. This is what one would expect from the disparity gradients in the two types of display. An overall disparity has a disparity gradient of zero, but a locally applied disparity produces a nonzero disparity gradient. Put another way, the diplopia limit is greater when there is no zero-disparity comparison stimulus in view. Fender and his associates argued on the basis of the hysteresis effect that the fusional area becomes elongated in the direction of a gradually increasing disparity. Diner and Fender (1988) asked whether this elongation represents an overall expansion of the fusional area in both directions or an extension of the leading edge of the fusional area in the direction of movement, accompanied by a contraction of the lagging edge. In other words, does the area expand or merely shift ? To answer this question, they presented a fixed vertical line to the fovea of one eye and gradually moved a vertical test line in the other eye in the direction of increasing crossed or uncrossed disparity. Both lines were stabilized on the retina with respect to eye movements. The moving test line was replaced periodically for 2 ms by a probe line at each of several locations on either side of the fixed stimulus line. Subjects reported whether the probe line and stationary line were fused or diplopic. When the disparity of the test stimulus was near zero, the diplopia limits for the probe were approximately symmetrical about zero (Figure 12.8A). When the moving test line had an uncrossed disparity of 12 arcmin, the left and right boundaries of the fusional area shifted in the uncrossed direction (Figure 12.8B). 
When the test line had a crossed disparity of 16 arcmin, the boundaries of the fusional area shifted in the crossed direction (Figure 12.8C). Diner and Fender concluded that the boundaries of the fusional area move in the direction of the overall disparity rather than expand. In fact, the data suggest that the fusional areas may contract rather than expand.
In all the experiments on the diplopia threshold reviewed here the stimuli were lines, bars, or dots. As the images of such stimuli are separated, the contours with the same luminance polarity move further apart and contours of opposite polarity move closer together and eventually coincide. With further separation the opposite polarity contours separate. The diplopia threshold is therefore a threshold for diplopia between contours of opposite luminance polarity. To investigate the diplopia threshold for contours of the same polarity, one would have to use the stimuli shown in Figure 12.9.
12.1.7 DICHOPTIC MOIRÉ PATTERNS
Interference patterns (moiré patterns) appear when two periodic patterns with slightly different spatial frequencies are superimposed (see Spillmann 1993). For example, a grating of 10 cpd superimposed on one of 11 cpd produces a moiré pattern of 1 cpd, the difference between the two spatial frequencies. There are conflicting claims about whether gratings combined dichoptically generate moiré patterns. Oster (1965) found no dichoptic moiré patterns. Bryngdahl (1976) claimed that dichoptic combination of patterns of concentric rings produces a moiré pattern of vertical bars, but he did not provide details or explain how he avoided binocular rivalry. Kaufman (1974, p. 298) found that combining two offset radial patterns dichoptically did not produce the moiré pattern evident when they were combined in one eye. But these were high-contrast patterns, which therefore engaged in rivalry at each location. One would expect to see moiré patterns only if luminance were summed at each location. Low-contrast gratings sum rather than rival (Section 12.3.2c), so moiré patterns may be revealed in dichoptically combined low-contrast patterns. However, we could not see moiré patterns in such stimuli.
With normal viewing, subjects were more sensitive to displacement of a 9-cpd grating relative to a superimposed 10-cpd grating than to displacement of a 10-cpd grating on its own (Badcock and Derrington 1987). The rapid motion of the 1-cpd moiré pattern facilitated perception of relative displacement of the two gratings. However, subjects were less sensitive to the relative displacement of two dichoptic gratings than to the displacement of a single grating. The dichoptic stimulus did not produce a moiré pattern. Subjects saw neither depth nor rivalry in the dichoptic stimulus. Even if the gratings did rival, they were probably too similar for the change to be noticed. Badcock and Derrington concluded that the binocular mechanism does not sum luminance variations in the two eyes.
Figure 12.8. Displacement of Panum’s fusional area. A line in one eye was moved slowly into an eccentric position with respect to a fixed line in the other eye, in the direction of the horizontal arrows. At a disparity indicated by vertical arrows, a briefly exposed probe revealed the fusional limits on either side of the fixed line. Dashed lines represent the rectangular approximations to the curves. The fusional area moves in the direction of changing disparity, as explained in the text. Panels A, B, and C each plot probability of fusion (0 to 1.0) against retinal disparity (–30 to 30 arcmin, labeled convergent and divergent). (Redrawn from Diner and Fender 1988)
Figure 12.9. Diplopia threshold for single contours. Display to investigate the diplopia threshold for contours of the same polarity. Each pair of patterns is fused with vergence on the vertical bars. In successive rows, disparity between the horizontal rectangles increases. At a certain disparity the left-hand edges of the rectangles become diplopic and rival.
12.2 DICHOPTIC COLOR MIXTURE
12.2.1 BASIC PHENOMENA
Under some circumstances, a colored area presented to one eye appears to rival an area of another color presented to the
other eye. This is color rivalry. Under other circumstances, dichoptic colors combine to create a third color. This is binocular color mixing. There has been some dispute about whether binocular color mixing ever occurs. Even those who believe that it occurs disagree about the necessary conditions. Desaguliers (1716) was perhaps the first to investigate color rivalry. He dichoptically superimposed differently colored pieces of silk by viewing them through an aperture. He reported color rivalry rather than color mixing. Taylor (1738) dichoptically viewed differently colored glasses placed in front of candles and claimed to see color rivalry rather than mixing. Dutour (1760), also, observed color rivalry of patches of blue and yellow fabric combined by converging the eyes. On the other hand, Haldat (1806) reported that dichoptically combined glass prisms containing colored liquid appeared in an intermediate hue. Later in the nineteenth century, there was a controversy between those who adopted the Young-Helmholtz theory, which stipulates that all colors can be formed from mixtures of red, green, and blue light, and those who adopted the Hering or Ladd-Franklin theories, which stipulate that the sensation of yellow arises from a distinct process in the retina. The Young-Helmholtz theory predicts that yellow should arise from the dichoptic combination of red and green whereas the latter two theories do not. It is ironic that Helmholtz (1909) believed that binocular yellow was an artifact due to color adaptation, binocular suppression, and unconscious inference, whereas Hering (1879) regarded it as due to interaction of visual inputs at a central location (for a bibliography of early studies see Johannsen 1930). Helmholtz was reluctant to explain binocular yellow in terms of a central combination of inputs from the two eyes because he believed that binocular inputs are not combined physiologically. 
We know now that the trichromatic stage of color processing is followed by two retinal opponent processes: one between red and green receptors and one between blue and yellow. Yellow does not arise from a distinct cone type but is formed by inputs from red and green receptors (Boynton 1979). The basis of binocular color mixture is still not clear. Several investigators have claimed to see binocular yellow. For instance, Hecht (1928) saw yellow when he combined red and green patches. Murray (1939) pointed out that the Wratten filters used by Hecht extended into the yellow region of the spectrum. Dunlap (1944) argued that binocular yellow is an artifact of adaptation of the eye to the red light and claimed to see yellow when both eyes looked at red patches for some time. He concluded that “binocular color mixture can be laid away in the museum of curious superstitions” (p. 561). But the problem lives on. Prentice (1948) claimed to have overcome Murray’s objection by using narrow-band Farrand interference filters centered on 530 nm (green) and 680 nm (red), neither of which extends into the yellow region of the spectrum.
He obtained good binocular yellow even with brief exposure, and the fused image became more yellow with longer exposure. Others have also reported binocular yellow with narrow-band filters and appropriate controls for color adaptation. Hurvich and Jameson (1951) pointed out that the spectral purity of the red and green filters is irrelevant. The crucial factor is the chromatic bandwidth of the receptors, since a receptor cannot distinguish between one wavelength and another within its tuning range (the principle of univariance). The wavelengths selected by Prentice and others evoked sensations of yellowish red and yellowish green, and it was therefore not surprising that they produced binocular yellow. When Hurvich and Jameson used unique red and green, which evoke the purest sensations of red and green, the dichoptic mixture was not yellow but gray. This still represents a form of binocular color mixing that needs to be explained. The colors used by Hurvich and Jameson were close to being opponent colors that produce gray when mixed monocularly. As ordinarily understood, the opponent mechanism resides in the retina, so that the occurrence of binocular gray must depend on a distinct cortical process. DeValois and Walraven (1967) obtained a desaturated yellow when the afterimage of a bright red patch in one eye was superimposed on a green patch in the other eye. The effect faded with the fading of the afterimage and was not present when the eye containing the afterimage was pressure blinded (Gestrin and Teller 1969).
12.2.2 FACTORS AFFECTING BINOCULAR COLOR MIXING
Binocular color mixtures are affected by the following factors:
1. Luminance. Dichoptic color mixtures are more stable at lower than at higher luminance levels and when the luminance in the two eyes is the same (Dawson 1917; Johannsen 1930). They are also more stable when the components of the dichoptic mixture are presented on a dark rather than on a light background (Thomas et al. 1961).
2. Saturation. Dichoptic color mixtures become more stable as the saturation of the colors decreases (Dawson 1917).
3. Stimulus duration. Hering (1861) observed that prolonged inspection of a dichoptic mixture increases the stability of color mixtures. Johannsen (1930) suggested that prolonged inspection causes the color in each eye to become desaturated through adaptation and that this, rather than duration, is responsible for the increased stability of color mixtures. However, dichoptic color mixtures also seem to be more stable with very short exposure times. Rivalry is most evident with intermediate exposure durations. Thus, synchronous flicker of red and green dichoptic stimuli increased the apparent saturation and stability of binocular yellow. The best results were obtained with flash durations of less than 100 ms and interflash durations of more than 100 ms (Gunter 1951). Binocular color rivalry does not occur with brief stimuli (Section 12.3.5). Stimulus asynchrony of more than 25 ms disrupted binocular yellow (Ono et al. 1971). Dichoptically combined regions differing in color between the two eyes induced color in a small achromatic patch superimposed on the region in one eye (Erkelens and van Ee 2002a). The effect took some time to develop and was not produced by mixing colors in one eye. Erkelens and van Ee concluded that the effect is due to adaptation to a color midway between the colors in the two eyes.
4. Color differences. Periods of color rivalry that occur with larger stimuli are more pronounced the greater the color difference. Ikeda and Nakashima (1980) increased the dichoptic difference in the wavelength of a 10° test patch until the subject reported color rivalry. The threshold difference for the occurrence of rivalry varied as a function of wavelength in a manner closely resembling the hue-discrimination curve. In other words, threshold color differences for the production of rivalry were equally discriminable. Sagawa (1982) asked whether the discrimination of two patches of wavelengths λ and λ + Δλ presented to one eye was affected when patches of wavelength λ were superimposed in the other eye. The idea was that if color processing were independent in the two eyes, the addition of the dichoptic masking patches would not affect the discrimination threshold. Wavelength discrimination deteriorated in the presence of the masking stimulus, but the extent of the deterioration was largely independent of the luminance of the masking stimulus. This suggests that dichoptic masking between chromatic signals is independent of the luminance component of the visual stimulus.
5. Stimulus size. Hering (1861) observed that dichoptic color mixtures are more stable with small than with large stimuli, and this has been confirmed (Thomas et al. 1961; Ikeda and Sagawa 1979). With stimuli subtending less than 2°, most subjects reported stable color mixture (Grimsley 1943; Gunter 1951). With larger stimuli, people experience color rivalry. With a display subtending 3.5°, binocular color mixture was unstable but became stable when a fusible micropattern was superimposed on the display, as in Figure 12.10 (De Weert and Wade 1988). One could think of the textured pattern as breaking up the display into small regions and thus preventing rivalry. Binocular color mixture is difficult to see in the presence of rivaling patterns (Dawson 1917).
Figure 12.10. Color rivalry and texture. Fusion of the solid red and green disks produces unstable color rivalry. Fusion of the textured disks produces stable color mixing. (Reprinted from De Weert and Wade 1988, with permission from Elsevier Science)
12.2.3 DICHOPTIC AND DIOPTIC COLOR MIXTURES
Color matches obtained under dichoptic viewing differ in the following ways from those obtained with monocular viewing.
1. Lights combined in the same eye obey Abney’s law, which states that the luminances of differently colored lights add linearly. Lights combined dichoptically, whether of the same color or different colors, do not obey Abney’s law but produce an intermediate brightness, especially when similar in luminance (Section 13.1.4).
2. Dichoptic color matches are less saturated and more variable than similar monocular matches.
3. The proportion of green to red required to match a spectral yellow and the proportion of yellow required to cancel blue were smaller with dichoptic than with monocular viewing (Hoffman 1962; Hovis and Guth 1989a). Hovis and Guth (1989b) argued that less green is required for dichoptic than for monocular yellow because, with increasing luminance, the postreceptor response for green increases faster than that for red. The red/green ratio required for monocular yellow is invariant over changes in luminance because receptor inputs to the monocular opponent mechanism increase at the same rate with increasing luminance.
De Weert and Levelt (1976a) presented a bipartite stimulus on a larger white adapting field, as shown in Figure 12.11. One half of the stimulus was illuminated dichoptically with lights of the same luminance but differing in wavelength. In the other half, the two lights were superimposed and presented to both eyes (dioptically). Subjects adjusted the relative luminances of the two halves until they appeared most similar in hue. This was done for many pairs of wavelengths to yield a set of hue-efficiency functions for dichoptic mixtures. Reasonably good matches of hue were obtained between the dichoptic and dioptic stimuli with the same wavelength components. In general, a smaller amount of the wavelength component nearer the middle of the spectrum was required in the dichoptic mixture than in the dioptic mixture. A colored patch presented to an amblyopic eye contributed less to the dichoptic color than a patch presented to the nonamblyopic eye (Lange-Malecki et al. 1985).
Figure 12.11. Stimuli used by De Weert and Levelt (1976a). The inner bipartite field was presented repeatedly for 0.5 s on the larger white adapting field. The black annulus and bar kept the various parts of the display separate. Labels in the figure: binocular adapting field, dichoptic test field, dioptic comparison field, 5.5°.
12.2.4 SUMMARY
Dichoptic color mixing is a genuine phenomenon but differs in several respects from monocular color mixing. Dichoptic color mixing is more stable with stimuli that are small or textured rather than large and homogeneous, flickering rather than steady, of low luminance and saturation rather than bright and saturated, and of equal luminance and chromaticity. Its occurrence implies that there are color mechanisms in the cortex in addition to those in the retina. Hovis (1989) reviewed the literature on binocular color mixing.
12.3 BINOCULAR RIVALRY
12.3.1 INTRODUCTION
12.3.1a Basic Phenomena of Binocular Rivalry
When the images falling on corresponding regions of the two retinas are sufficiently different, they rival, rather than fuse. In any location only the image in one eye is seen at a given time. This is binocular rivalry. The stimulus seen at a given time is the dominant stimulus, and the stimulus that cannot be seen is the suppressed stimulus. One may distinguish between the following types of rivalry. 1. Area rivalry Equal size homogeneous regions of opposite contrast, as in Figure 12.12A, rival to produce a shimmering region of variable brightness. This effect is known as binocular luster. Dichoptic regions that differ in luminance but which have the same luminance polarity fuse to create a region of intermediate brightness, as in Figure 12.12B. 2. Area-contour rivalry In this case a contrast-defined edge in one eye falls on the same region as a blank area in the other eye. For example, a small black disk on a white ground remains visible when superimposed on a larger black disk presented to the other eye, as shown in Figure 12.12C. The edge of the small disk suppresses the surrounding homogeneous region in the larger disk. The zone of binocular suppression surrounding a contour in one eye results from the inhibitory surrounds of receptive fields responding to the contour. A model of this process was proposed by Welpe et al. (1980). 3. Contour-contour rivalry An example of contour-contour rivalry is shown in Figure 12.12D. In this case, there are two superimposed orthogonal bars. At the intersection, the contours compete so that sometimes the vertical bar appears complete and sometimes the horizontal bar. When a large vertical grating is presented to one eye and a similar horizontal grating to the other, as in Figure 12.12E, one sees vertical lines in some regions and horizontal lines in other regions, with the regions constantly shifting about. This is mosaic rivalry. For short periods, only vertical lines or only horizontal lines may be seen. 
This is exclusive dominance. When both gratings are small, one tends to see only all of one image or all of the other. We will see in Section 12.3.3b that contours defining the boundary of a figure play a more important part in rivalry than contours within figural regions.
Figure 12.12. Basic phenomena of binocular rivalry. For each figure, binocular fusion of the two images on the left produces the effect depicted on the right. Panels: A, binocular luster; B, intermediate brightness; C, area suppression; D, contour suppression; E, mosaic rivalry.
We will see in Section 12.3.2c that brief or low-contrast stimuli do not rival but appear superimposed. The question arises whether dissimilar dichoptic images that appear superimposed should be regarded as fused. The perceived direction of fused similar images is midway between the directions of the two monocular images (Section 16.7.3), and disparity between similar images may code depth. Dissimilar images that appear superimposed have neither of these properties. This suggests that the processes responsible for their apparent superimposition differ from those responsible for fusion of similar images. Therefore, the simultaneous appearance of dissimilar images will be referred to as image superimposition rather than image fusion.
Inspection of any pattern for a period of time with fixed gaze leads to a loss of apparent contrast, and parts and sometimes the whole of the pattern fade completely. When a stimulus on a textured background fades, the space is perceptually filled in by the background texture. This is known as the Troxler effect. It was first described by Troxler in 1804. The effect is particularly evident with blurred edges. Optically stabilized images fade completely. Like binocular rivalry, Troxler fading in a complex pattern is piecemeal and fluctuates. The effect is generally believed to be due to local adaptation. However, there is evidence that it is influenced by the figural features of the stimuli (see González et al. 2007). Also, we will now see that the effect is related to binocular rivalry.
There has been dispute about the role of Troxler fading in binocular rivalry (Crovitz and Lockhead 1967). Liu et al. (1992b) suggested that it is particularly important in rivalry at near-threshold contrasts but not at high contrasts. On the other hand, we will see in Section 12.3.3 that Troxler fading is strongly influenced by binocular rivalry. Prior inspection of a patterned stimulus by one eye decreased the duration of dominance of that stimulus when it was paired with a rival stimulus in the other eye (Wade and De Weert 1986). With rivalrous gratings, prior inspection of one image had an effect on rivalry only when
the prior stimulus was in the same retinotopic location as the subsequent rivalrous stimulus. This supports the idea that the effect is at a low level in the visual system (van Boxtel et al. 2008). However, for complex stimuli, such as a house and a face, prior inspection of one stimulus had an effect when it was presented in the same spatial location but not the same retinal location. In this case the effect must have involved processes at higher levels. Other experiments involving the effects of prior adaptation are discussed in 12.3.6d. Inspection of a moving display evokes pursuit eye movements (OKN). When dichoptic displays move in opposite directions, the eyes follow whichever stimulus is dominant. Each change in image dominance is accompanied by a change in the direction of pursuit eye movements (Enoksson 1963; Fox et al. 1975) (Portrait Figure 12.13). This response has been used to investigate binocular rivalry in the monkey (Logothetis and Schall 1990). Early phenomenological studies of binocular rivalry were carried out by Volkmann (1836), Wheatstone (1838), Panum (1858), Fechner (1860), Hering (1861), Helmholtz (1909, vol. 3, p. 492), and Meenes (1930). The literature on rivalry has been reviewed by Fox (1991) and by Alais and Blake (2005).
Figure 12.13. Robert Fox. Born in Cincinnati, Ohio, in 1932. He received a Ph.D. in experimental psychology from the University of Cincinnati in 1963. That same year he joined the faculty at Vanderbilt University to start a research program in visual perception, which became the Vanderbilt Vision Research Center (VVRC). Fox is also a member of the Center for Integrated and Cognitive Neuroscience (CICN) and a fellow of AAAS, APS, and APA.
12.3.1b Basic Theories of Rivalry
Before 1838, when Wheatstone demonstrated that binocular disparity plays a crucial role in depth perception, people interested in binocular vision were preoccupied with the problem of how a unified percept is formed from two images. Aristotle and Euclid had realized that the eyes have slightly different views of the world. Although a few people before Wheatstone had suggested that this contributes to the perception of depth (Section 2.10.3), this aspect of binocular vision was generally ignored. The following two theories were developed to account for how binocular images combine.
1. The suppression theory. According to this theory, both similar and dissimilar images engage in alternating suppression at an early stage of processing. Thus, in any location in the visual field, only one eye’s input is seen at any one time. The dominant input varies from place to place in the visual field and alternates over time, resulting in a mosaic of alternating dominance and suppression. The suppression theory cannot be tested by simple observation of similar images, since any rivalry that might occur would not be visible. However, it has now been established by an indirect test that similar images do not inhibit each other in the manner required by the suppression theory (see Section 12.7.1). The suppression theory must therefore be discarded.
2. The fusion theory. According to this theory, similar images falling on corresponding retinal points gain simultaneous access to the visual system to form a unitary percept, while dissimilar images engage in alternating suppression. The two images are said to fuse, although it is not always clear what fusion implies. We will see that the two inputs are not simply summed. Nor are the identities of the two signals lost in the fusion process. If they were, we would not be able to distinguish between images with crossed disparity and those with uncrossed disparity. The term fusion will be taken to mean that similar images presented on or near corresponding points appear as one and are processed simultaneously rather than successively.
Evidence reviewed in Section 13.1 shows that similar high-contrast images in the two eyes engage in mutual inhibition. This is not alternating inhibition, as in rivalry. It is simply that binocular inputs onto binocular cells in the visual cortex do not summate. If they did, things would appear twice as bright with two eyes as with one eye. The inputs are subject to mutual gain control (Ding and Sperling 2006). The process is akin to contrast normalization, in which responses of cortical cells are normalized with respect to the activity in analyzing cells (see Moradi and Heeger 2009). Low-contrast similar stimuli summate their energy (Section 13.1.3). Also, low-contrast dissimilar stimuli do not rival (Section 12.3.2c). The important thing at low contrasts is to preserve stimulus energy.
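The difference between energy summation at low contrast and the absence of summation at high contrast can be illustrated with a toy divisive-normalization rule. The sketch below is an assumption made purely for illustration. It is not the model of Ding and Sperling (2006) or of Moradi and Heeger (2009), and the function names, the squared-contrast energy term, and the semisaturation constant `sigma` are all hypothetical.

```python
def normalized_response(c_left, c_right, sigma=0.1):
    """Toy response of a binocular unit to the contrasts in the two eyes.

    The pooled contrast energy is divided by itself plus a constant, so the
    response grows with energy at low contrast and saturates at high contrast.
    sigma is a hypothetical semisaturation constant.
    """
    energy = c_left ** 2 + c_right ** 2
    return energy / (sigma ** 2 + energy)


def binocular_to_monocular_ratio(contrast, sigma=0.1):
    """Response with both eyes stimulated relative to one eye alone.

    contrast must be greater than zero, otherwise the monocular response is zero.
    """
    binocular = normalized_response(contrast, contrast, sigma)
    monocular = normalized_response(contrast, 0.0, sigma)
    return binocular / monocular


low_contrast_ratio = binocular_to_monocular_ratio(0.01)   # contrast well below sigma
high_contrast_ratio = binocular_to_monocular_ratio(1.0)   # contrast well above sigma
```

With these assumed values the binocular response is nearly twice the monocular response at a contrast of 0.01 but almost equal to it at a contrast of 1.0, which parallels the statements that low-contrast stimuli summate their energy whereas high-contrast inputs do not summate.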
Several investigators have noticed that rivaling stimuli combine when presented for less than 200 ms (Section 12.3.5a). This suggests that stimuli are initially processed simultaneously and then separated into low-contrast stimuli that remain fused and high-contrast stimuli that rival. In summary, it can be stated that when identical images fall on corresponding regions in the two retinas, the inputs to binocular cells are subjected to gain control and passed on to higher processes to produce a fused image. When the images differ slightly in position, orientation, or size, the disparities are registered and produce impressions of relative depth. Very different images rival, so that only one of them gains access to higher stages of visual processing at any one time in any one location. Thus, similar and dissimilar binocular images are processed in fundamentally different ways. Rivalry between dissimilar images may not be complete, even for high-contrast stimuli of long duration. Thus, some low-level features of a suppressed image can affect the processing of the dominant image. On the other hand, the fusion process involves both inhibitory and excitatory processes. Thus, the processes underlying binocular vision are more complex than suggested by the fusion theory or suppression theory. The evidence for these statements will be reviewed in the following sections. Models of rivalry are discussed in Section 12.10. Even if we accept that similar images are fused and processed simultaneously, two questions remain. The first question is where in the nervous system fusion occurs. Many investigators, including Helmholtz, believed that fusion is a mental, or psychic, act. We now talk about higher, or cognitive, levels of processing. Since 1959, when Hubel and Wiesel discovered binocular cells in the visual cortex, it has been generally believed that fusion of similar images occurs at this low level. 
We will see that rivalry between dissimilar images most probably involves processes in the primary visual cortex but that it may also involve processes occurring at a higher level. The second remaining question is what rules of stimulus combination are involved in the binocular fusion process. Most recent work on fusion has been concerned with this question.
12.3.1c Functions of Rivalry When the eyes are converged on a particular object, the images of objects in nearby depth planes are processed by the disparity-detection system for the purpose of perceiving relative depth. These images will have varying degrees of disparity and some may be diplopic. Images of objects well outside the horopter are beyond the range of the disparity detectors. Also, images of differently shaped objects well outside the horopter may fall on corresponding retinal locations and therefore rival. Thus, rivalry occurs in many parts of natural scenes containing objects disposed in depth. A binocular mechanism that combined nonmatching images
in each location would produce a meaningless percept. The best strategy is to preserve the most salient image in each local region. Saliency could depend on high-contrast edges, stimulus motion, or flicker. Local competition between the images in the two eyes is known as eye-of-origin rivalry. This preserves a salient image in one eye in one location and a salient image in the other eye in another location. As a result, rivalry occurs in a mosaic of local regions over the stimulus. Eye-of-origin rivalry between simple local features could be achieved early in the visual system, and we will see there is abundant evidence that it occurs in V1. It would be advantageous if well-focused images, especially those in central vision, were dominant over less well-focused images. We will see in Section 12.3.2b that rivalry follows this rule. Continuity of contours across areas of rivalry would help to preserve the image of a large salient object. We will see that, under certain circumstances, a coherent pattern remains dominant even when it is distributed between the two eyes. Rivalry that occurs between patterns rather than between eyes is known as pattern rivalry. Pattern rivalry must occur at a level where figural continuity is processed. Thus, in general, binocular rivalry can be understood as a mechanism for preserving the most salient local images or the most coherent images of objects that are well away from the horopter. Binocular rivalry is also produced near the vertical edges of an opaque object that is some distance in front of a textured background. One eye sees a region of the background that the other eye cannot see. In this case, it is important to preserve the monocular image of the background because it contains information about the distance of the near object from the background. This special type of rivalry constitutes a cue to relative depth, which is known as Da Vinci stereopsis (see Section 17.3).
12.3.2 EFFECTS OF LUMINANCE, CONTRAST, AND COLOR
12.3.2a Interocular Differences in Luminance and Contrast It is generally agreed that the strength of a stationary stimulus in the rivalry process is proportional to its contrast and to the amount of contour per unit area that it contains (Levelt 1965b, 1966). An image also gains strength if it moves or flashes on and off. There are three ways to measure the strength of an image in the rivalry process.
1. Width of the zone of suppression. The zone of suppression produced by a given contour widens with increasing border contrast (Levelt 1966).
2. Durations of the phases of rivalry. Levelt proposed that the strength of a stimulus determines how long it is suppressed rather than how long it suppresses another stimulus (Levelt 1965b; Whittle 1965; Fox and Rasche 1969). This is known as Levelt’s 2nd proposition. The lower the contrast or luminance of a suppressed stimulus the more time it needs before becoming dominant (Blake and Camisa 1979; Hollins and Bailey 1981). An image with no contours has zero strength and is believed to remain suppressed indefinitely by a patterned stimulus in the other eye. We will see in Section 12.3.3a that this is not always true. Bossink et al. (1993) varied stimulus strength by varying the luminance contrast, color contrast, or velocity of a moving dot pattern. They agreed that suppression duration is affected more by the strength of the suppressed image than by the strength of the dominant image. However, the strength of the dominant image had some effect on the duration for which it was dominant. Mueller and Blake (1989) found that the overall rate of alternation of rival patterns depended mainly on the contrast of the patterns in their suppressed phase but that the contrast of the patterns in their dominant phase had some effect. Kang (2009) confirmed Levelt’s 2nd proposition for rivalry between large stimuli but not for rivalry between stimuli subtending less than 1°.
3. The detection threshold of a flashed stimulus. The contrast threshold for detection of a monocular flash was elevated when the flash occurred at about the same time as a sudden change in brightness in the other eye (Bouman 1955). Presumably, the change in the other eye suppressed the response to the test flash. Blake and Camisa (1979) used the contrast threshold for detection of a flash on a suppressed image as a measure of depth of suppression. They found that the contrast threshold for detection of a test flash presented to an eye in its suppressed phase was independent of the relative contrasts of the stimuli in the two eyes.
They concluded that, once a stimulus is suppressed, the degree of suppression, as opposed to its duration, is independent of its contrast. The degree of suppression is also independent of the relative luminances and spatial frequencies of the stimuli (Holopigian 1989). Dimming the image in one eye had no effect on the degree of suppression (Makous and Sanders 1978; Hollins and Bailey 1981). We must distinguish between duration of suppression, which is affected by the contrast or luminance of the suppressed image, and depth of suppression, which is not affected by these features of the suppressed image.
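The consequence of Levelt’s 2nd proposition for phase durations can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the mean time an image stays suppressed is inversely proportional to its own strength. The function names and the constant `k` are hypothetical, and the smaller effect of the dominant image's strength reported by Bossink et al. and by Mueller and Blake is ignored.

```python
def mean_suppression_time(own_strength, k=1.0):
    """Assumed rule: a stronger image escapes suppression sooner."""
    if own_strength <= 0:
        # A contourless image is treated here as staying suppressed,
        # although Section 12.3.3a describes exceptions.
        return float("inf")
    return k / own_strength


def mean_dominance_times(strength_left, strength_right, k=1.0):
    """One eye is dominant for as long as the other eye stays suppressed."""
    return {
        "left": mean_suppression_time(strength_right, k),
        "right": mean_suppression_time(strength_left, k),
    }


if __name__ == "__main__":
    print(mean_dominance_times(0.5, 0.5))
    print(mean_dominance_times(1.0, 0.5))  # only the left image is strengthened
```

In this sketch, strengthening the left image leaves its own mean dominance time unchanged and shortens the dominance time of the right image, so the overall rate of alternation rises. Nothing in the sketch represents depth of suppression, which the evidence above indicates is independent of contrast.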
12.3.2b The Rivalry Contrast Threshold The least contrast in an image that will instigate rivalry is the rivalry contrast threshold. The function relating the rivalry contrast threshold to the spatial frequency of rivaling gratings is similar to the contrast-sensitivity function of monocularly viewed gratings (Blake 1977) (Portrait Figure 12.14). Thus, a stimulus with the spatial frequency for which contrast sensitivity is highest (4 cpd) requires the least contrast to initiate rivalry. However, a sine-wave grating of 4 cpd is not dominant for as long as a stimulus consisting of a broad mixture of spatial frequencies (Fahle 1982a). This supports the idea that rivalry occurs between distinct spatial-scale channels in the visual system. A defocused image is suppressed for longer periods than a sharply focused image (Humphriss 1982). Simpson (1991) presented the horizontal arms of a Rubin cross to one eye and the vertical arms to the other eye. The surrounding frame formed a binocular fusion lock. When one eye was defocused, its image tended to be suppressed by the well-focused image. The area of suppression centered on the fovea and increased as the difference in refraction between the eyes was increased. This effect could be due to the reduced contrast of a defocused image. It could also be due to the narrowed range of spatial frequencies in a defocused image, which means that that image stimulates fewer channels (Fahle 1982b) (Portrait Figure 12.15). Arnold et al. (2007) also found that blurred images tend to be suppressed by sharp images. When sharp and blurred images were swapped between the eyes, the sharp image tended to remain dominant. The fact that stimuli with high contrast or high spatial frequency tend to suppress those with low contrast or low spatial frequency helps to preserve the visibility of objects in the surface within which the eyes are converged. This also helps people who wear a contact lens on one eye for hyperopia and a lens on the other eye for myopia. For near viewing, the image in one eye is in sharper focus than that in the other. For far viewing, the image in the other eye is in sharper focus. Patients nevertheless see in sharp focus at all distances because the well-focused image suppresses the less well-focused image. Under scotopic conditions the less well-focused image is not suppressed (Schor et al. 1987). The suppression of a weak image by a stronger image may explain why stereoacuity is degraded by unequal illumination or contrast in the two eyes (Section 18.5.4).
Figure 12.14. Randolph Blake. Born in Dallas, Texas, in 1945. He obtained a B.Sc. in mathematics and psychology at the University of Texas, Arlington, and a Ph.D. at Vanderbilt University, with Robert Fox. He conducted postdoctoral work at Baylor College of Medicine. From 1974 to 1988 he was professor of psychology at Northwestern University. Since 1988 he has been professor of psychology at Vanderbilt University. He received Vanderbilt’s prestigious Sutherland Prize. He was elected to the American Academy of Arts and Sciences in 2006.
Figure 12.15. Manfred Fahle. Born in Düsseldorf in 1950. He obtained a B.A. in biology from Göttingen University in 1972 and an M.A. in 1975 from Mainz University. He did graduate training in medicine at the universities of Mainz and Tübingen from 1973 to 1977 and postdoctoral work at the Max-Planck-Institute for Biological Cybernetics, Tübingen, until 1981. He was professor of ophthalmology at Tübingen University and has been professor at City University in London since 1999. He is now also the director of the Institute for Brain Research at the University of Bremen. Recipient of the Max Planck Prize for Basic Research (with T. Poggio) in 1992.
12.3.2c Luminance Contrast of Both Images It is known that the inhibitory surround of ganglion-cell receptive fields weakens at low contrast (Section 5.5.6b). Also, it may be that low-contrast stimuli do not have sufficient strength for suppression. Accordingly, low-contrast stimuli should show less rivalry than high-contrast stimuli. We will now see that this is indeed the case. High-contrast orthogonal dichoptic gratings alternate more rapidly than low-contrast gratings, and continuous lines alternate more rapidly than broken lines (Alexander 1951). Alternations of rivalry occur less frequently, and suppression spreads over wider areas when both images are at scotopic rather than photopic light levels (Breese 1909; Kaplan and Metlay 1964; O’Shea et al. 1994a). Orthogonal dichoptic gratings just above the contrast threshold do not begin to rival for many seconds. Initially, they appear superimposed as a plaid pattern (Liu et al. 1992a). As the contrast of the gratings increases, the time before rivalry is experienced becomes shorter (see Figure 12.16). For a given contrast, gratings with higher spatial frequency appear as plaids longer than those with lower spatial frequency, probably because higher spatial-frequency gratings have higher contrast thresholds. This may explain why Burke et al. (1999) found that the plaid percept is more probable with square-wave gratings than with sine-wave gratings. Liu et al. showed that a fused aperture surrounding dichoptic orthogonal gratings significantly enhanced apparent superimposition of the images. It would be worthwhile to study rivalry in large low-contrast gratings in which any possible contribution of an aperture is minimized. The rate of binocular rivalry between orthogonal dichoptic gratings was higher in subjects with high stereoacuity than in those with low acuity (Halpern et al. 1987a). Also, the rate of binocular rivalry was lower after alcohol ingestion (Donnelly and Miller 1995). These effects may be due to loss of contrast sensitivity associated with stereodeficiency and alcohol consumption.
Figure 12.16. Reduced rivalry of low-contrast edges. Fusion of the low-contrast images results in a plaid pattern. The high-contrast images produce mosaic rivalry. (Redrawn from Liu et al. 1992b)
12.3.2d Rivalry and Color Contrast Dichoptic stimuli of similar luminance and contrast fuse, while stimuli that differ widely in luminance or contrast rival. But what happens when two superimposed dichoptic regions have the same physical luminance but differ in perceived whiteness because of contrast with their surroundings? Wallach and Adams (1954) showed that such regions rival rather than fuse. On the other hand, they showed that two regions that appear similar in whiteness but which differ widely in physical luminance also rival. Andrews and Lotto (2004) produced similar results using contrast effects produced by the Rubik's cube stimulus shown in Figure 22.38. These results are to be expected for any contrast effect occurring before the level in the visual system where rivalry occurs. To the extent that contrast occurs beyond the level at which rivalry occurs, one would have to say that contrast affects rivalry by feedback signals.
12.3.2e Color and Rivalry The question addressed in this section is whether the colors of rivaling stimuli affect binocular rivalry and, in particular, whether blue cones contribute to the rivalry process. Borders defined by equiluminant blue and yellow are much less visible than those defined by red and green (Tansley and Boynton 1978; Kaiser and Boynton 1985). Since borders contribute to the strength of a stimulus, one might expect that a blue-yellow grating would be a much weaker stimulus than a red-green grating. Wade (1975b) (Portrait Figure 12.17) used a black and green grating and a black and red grating (3.3 cpd). Periods of exclusive dominance were shorter between orthogonal gratings of the same color than between gratings that differed in color. Also, gratings of the same color tended to appear as a composite plaid. Durations of exclusive dominance increased as the chromatic difference between colored gratings (6 cpd) increased (Hollins and Leung 1978). But yellow stimuli in one eye and blue in the other eye behaved like stimuli of the same color. This latter finding does not prove that the blue-yellow opponent system does not affect rivalry, because the 6-cpd gratings were probably not visible to the blue-yellow system. Stalmeier and de Weert (1988) pitted concentric black-white stripes in one eye against radial stripes of alternating colors rendered equiluminant by the flicker method in the other eye. While red-green color pairs engaged in rivalry with the achromatic stimulus, tritanopic color pairs (blue-yellow) rarely came into dominance. However, the mean spatial frequency of the radial chromatic display was about 5 cpd, which is rather high for the blue-cone system.
Figure 12.17. Nicholas J. Wade. Born in Nottinghamshire, England, in 1942. He obtained a B.Sc. in psychology from the University of Edinburgh and a Ph.D. from Monash University, Australia, with Ross Day. He conducted postdoctoral work at the Max-Planck-Institute für Verhaltensphysiologie, Seewiesen, Germany. In 1970 he went to the University of Dundee, where he is now professor of visual psychology. He is a fellow of the Royal Society of Edinburgh.
Rogers and Hollins (1982) used 3-cpd gratings, to which blue cones are sensitive, but still found no evidence of a contribution of blue cones to rivalry. O’Shea and Williams (1996) stimulated only the blue cones by using violet gratings on a bright yellow background, which bleached the middle- and long-wavelength cones. Since the gratings had negligible luminance contrast for the middle- and long-wavelength cones, they were essentially equiluminant. Under these conditions, orthogonal dichoptic violet gratings of 2 cpd engaged in rivalry but at a slower rate of alternation than luminance gratings. A high-contrast violet grating could be dominant over a luminance-defined grating with sufficiently low contrast. Thus, the blue-cone system can contribute to binocular rivalry. It can also contribute to fusion and stereopsis (see Section 17.1.4). However, its contribution is not evident in stimuli with high spatial frequency. Sagawa (1981) measured the minimum luminance of a patch seen by one eye required to induce rivalry with a patch seen by the other eye as a function of their relative wavelengths. Although the contribution of the blue-yellow system was small when a red or green stimulus was presented to the other eye, it was never zero. We will see in the next section that, in the suppression phase of rivalry, the sensitivity of the blue-cone system is reduced more than that of other chromatic channels.
12.3.2f Chromatic Specificity of Suppression Color rivalry and pattern rivalry may be dissociated. For example, when two postage stamps with similar design but differing in color and value are combined in a stereoscope
the letters denoting the value of one stamp are sometimes seen in the color of the other stamp (Dawson 1917). Also, dichoptic combination of a piece of colored paper with printing on it with a plain piece of paper of another color sometimes produces an impression of print with a background in the color presented to the other eye (Creed 1935). Thus, the print dominates in one region while colors rival in the surrounding region. However, closely spaced black and colored lines in one eye tend to be strongly dominant over an untextured patch of another color in the other eye (De Weert and Levelt 1976a). Since contours and colors are not strictly in the same location in these stimuli, they may not contravene the rule of nonselective rivalry. There is no conclusive evidence that images of one feature of a stimulus can rival while those of another feature of the same stimulus are fused. Stimuli that rival, such as different colors, may be spatially adjacent to stimuli that fuse, such as luminance-defined edges. There is evidence that the chromatic system is more affected by suppression than is the achromatic system. Smith et al. (1982) presented, for 20 ms, small test probes varying in wavelength at the center of rival orthogonal black-white gratings. When the probe fell on the dominant grating, the spectral sensitivity functions showed the three maxima corresponding to the absorption spectra of the cones. This is symptomatic of the chromatic system. But when the probe fell on the nondominant grating, the sensitivity function had a single broad peak near 555 nm, which is symptomatic of the achromatic system. Thus, rivalry suppression caused a greater reduction in the sensitivity of the chromatic mechanism than of the achromatic mechanism. Ooi and Loop (1994) confirmed these results and, with a more refined testing procedure, showed that loss of sensitivity during suppression is greater for blue than for red probes. 
The suppression of the blue-cone system during rivalry is related to the fact that blue cones are only weakly involved in rivalry.
The change in increment-threshold spectral sensitivity was the same for all wavelengths when the test probe was superimposed on a blank field that was permanently suppressed by a grating in the other eye (Ridder et al. 1992). In this respect, permanent suppression resembles amblyopic suppression (Section 8.4.2). Ooi and Loop (1994), also, found that permanent suppression produced a more equal loss of chromatic sensitivity than did alternating rivalry, except that they found some specific loss of blue sensitivity during permanent suppression.
12.3.3 FIGURAL FACTORS IN RIVALRY
12.3.3a Dominance of Homogeneous Fields It is generally believed that a featureless visual field never rivals a patterned stimulus. However, the blank field of a closed eye can suppress a highly textured stimulus. Close the dominant eye and view the black and white grating of Figure 12.18A as it is oscillated slowly up and down at about 2 Hz. A gray patch containing a diagonal meshwork pattern appears to spread out from the center of the grating and blot it out (Howard 1959). The meshwork pattern periodically spreads and then recedes. The effect is not Troxler fading of the grating, because Troxler fading does not occur with moving stimuli. People with only one eye did not experience this effect, supporting the idea that the occluding patch is the dark field of the closed eye. When a stationary coarse grating, like that of Figure 12.18B, is steadily fixated, a central patch of the dark field of the closed eye may occlude the lines but, instead of a meshwork pattern, the occluded region contains a faint phase-reversed image of the grating, as depicted on the right of Figure 12.18B. Not everyone sees this image. The origin of the patterns visible in the field of the closed eye remains a mystery. This transferred negative image may explain why Gilroy and Blake (2005) found that the afterimage
produced by a suppressed grating is weaker than one produced by a visible grating. A blank field in one eye may suppress dynamic visual noise in the other eye (Christopher Tyler, personal communication). Dominance of a blank field over a textured stimulus is a strong violation of Levelt’s proposition that the more highly patterned stimulus is dominant in binocular rivalry. With continuous inspection, parts of a stationary pattern periodically fade. This is known as Troxler fading. Fading is particularly evident and occurs more quickly when one eye is closed. Also, Troxler fading is less evident in one-eyed people (Goldstein 1967). For one-eyed observers, fading times were similar to those of two-eyed observers viewing with both eyes (González et al. 2007). Lou (2008) showed that the longer fading time with binocular viewing is not wholly due to probability summation of binocular stimuli. This suggests that, in people with binocular vision, Troxler fading with monocular viewing is at least partly due to periodic dominance of the dark field of the closed eye. A homogeneous luminous field (Ganzfeld) tends to darken after it has been inspected for some time. Bolanowski and Doty (1987) found that darkening did not occur when the Ganzfeld was viewed with both eyes, and concluded that fading with a monocular stimulus is due to suppression of the luminous field by the dark field of the closed eye. Gur (1991) agreed that sudden blankout in a Ganzfeld occurs only when one eye is closed and that it is due to binocular rivalry, but found that the gradual fading associated with adaptation of a stationary patterned image occurs with both monocular and binocular viewing. Rozhkova et al. (1982) found the same to be true of large textured displays optically stabilized on the retina. In his review of fading of afterimages, Wade (1978) concluded that one of the major factors in the fading of monocular afterimages is rivalry between the afterimage and the dark field of the closed eye.
Figure 12.18. Dominance of the dark field of the closed eye. (A) When the grating is oscillated slowly up and down while viewed with the dominant eye, the dark field of the closed eye occasionally blots out the center of the grating. The dark field appears as a gray diagonal meshwork, as depicted on the right. (B) When the stationary grating is fixated with one eye, the dark field of the closed eye occasionally blots out the center of the grating and produces a low-contrast grating of the same spatial frequency but opposite contrast. (From Howard 1959)
12.3.3b Effects of Figure-Ground Relationships The question asked in this section is whether binocular rivalry is influenced by factors that determine the visibility of a figural region relative to a surrounding region. Cortical cells respond more strongly to a grating that is surrounded by orthogonal lines than to one surrounded by lines with the same orientation (see Section 5.5.6b). Thus, the figural contrast between a grating and its surroundings is enhanced when grating and surround have distinct orientations. From this evidence, one would predict that a grating surrounded by an orthogonal grating would be more dominant than one surrounded by a parallel grating. Several investigators have confirmed this prediction (Ichihara and Goryo 1978; Mapperson and Lovegrove 1991; Fukuda and Blake 1992; Ooi and He 2006). For example, Fukuda and Blake used a stimulus like that shown in Figure 12.19A.
Figure 12.19. Effect of figural dominance on rivalry. (A) The inner disk on the right dominates the disk on the left most of the time. (Adapted from Fukuda and Blake 1992) (B) The figural region in the right image dominates the undifferentiated region in the other image most of the time. (C) A white bounding contour on the left image increases its duration of dominance. (Redrawn from Ooi and He 2006)
The inner disk surrounded by an oppositely orientated annulus was dominant for longer than the disk surrounded by an identically oriented annulus. Ooi and He (2006) used a stimulus like that in Figure 12.19B. The image on the right contains a boundary that defines a figure on a ground while the image on the left contains no such boundary. The bounded region on the right was dominant for over 90% of the time. Thus, rivalry was strongly biased in favor of a contour that defines a figural region. Adding a white circle to the left image, as in Figure 12.19C, increased the durations of dominance of that image, although the disk region on the right was still dominant for more than half the time. The disk on the right is a stronger stimulus because it differs from the surroundings more than does the disk on the left, which is defined by only the white ring. These results indicate that rivalry operates to preserve figural regions and especially figural regions that differ strongly from their surroundings. The response of motion-sensitive neurons in V1 and MT to a stimulus moving in the preferred direction is inhibited by surround motion in the same direction and enhanced by surround motion in the opposite direction
(Allman et al. 1985; Jones et al. 2001). From this, one would predict that the dominance of a moving grating seen by one eye would be enhanced when it is surrounded by a grating moving in the opposite direction relative to when it is surrounded by a grating moving in the same direction. Paffen et al. (2004) found this to be the case. Alais and Blake (1998) investigated the effect on binocular rivalry of figural coherence created by motion in neighboring stimuli. Gratings moving in independent directions in four neighboring disk-shaped windows appeared as one partially occluded grating moving coherently. Covering one window tended to destroy this effect. The motion-coherence effect was also destroyed when the grating in one window was suppressed by a disk in the other eye. Nevertheless, the disk suppressed the grating for shorter periods when it was accompanied by the other three disks than when it was seen in isolation. Physiological evidence reviewed in Section 5.5.6b indicates that inhibition of a central region arising from similarly oriented surrounds or from surrounds with similar motion, weakens or changes to facilitation at low contrasts. The important thing at low contrast is to keep stimuli visible rather than engage in mutual inhibition. According to this evidence, one would expect the inhibitory effects of surrounds on rivalry to weaken at low contrast. Paffen et al. (2006a) found that when contrast was reduced from 100% to 1.5%, a grating surrounded by an annulus with the same direction of motion became more dominant than a grating surrounded by an oppositely moving grating. Also, at low contrast, a grating surrounded by an annulus with the same orientation became more dominant than a grating surrounded by an oppositely oriented grating. A pattern with higher spatial frequency in one eye may appear to stand out in depth relative to a pattern with lower spatial frequency in the other eye (Yang et al. 1992). This is a figure-ground effect.
12.3.3c Relative Orientation Abadi (1976) reported that the contrast of a grating in one eye required to suppress a grating of fixed contrast in the other eye was independent of the relative orientation of the gratings, when the angle between them was larger than 20°. As the angle was reduced below 20°, less contrast was required for suppression. Abadi concluded that suppression is due to inhibition between orientation detectors in the visual cortex and is strongest between detectors tuned to similar orientations. Abadi’s stimuli were at near-threshold contrasts, where rivalry tends not to occur. His results may have been due to a threshold-elevation effect arising from contrast adaptation rather than binocular rivalry. Blake and Lema (1978) pointed out that Abadi’s data could also support the conclusion that inhibition is weaker between lines with similar orientation, since only a low-contrast stimulus in one eye overcame the inhibitory influence of the image in the other eye. Using a detection-threshold procedure with suprathreshold stimuli, Blake and Lema found no evidence that strength of suppression varied as a function of the relative orientation of the stimuli. However, the rate of rivalry alternation has been reported to increase with increasing relative orientation of dichoptic gratings (Thomas 1978). The contrast threshold for detection of a vertical grating is lower than that for detection of an oblique grating. This is the so-called oblique effect. One might expect that a vertical grating would be dominant for longer periods than an oblique grating presented to the other eye. Wade (1974) found no evidence of this. However, an afterimage of a vertical grating showed longer mean periods of dominance than an afterimage of an oblique grating presented to the other eye. Also, the afterimage of a vertical grating lasted longer than that of an oblique grating. Bonneh and Sagi (1999) found that, at high but not at low contrasts, an array of randomly oriented Gabor patches was more dominant than uniformly oriented patches. Ooi and He (1999) reported a similar effect. These effects relate to the fact that stimuli are inhibited by neighboring stimuli with similar orientation (Section 5.5.6). On the other hand, Bonneh and Sagi found that a high-contrast contour formed from collinear Gabor patches was more dominant than a jagged contour. This relates to the fact that the visual system is particularly sensitive to collinear stimuli (Section 5.5.6c).
12.3.3d Effects of Stimulus Complexity A pattern with greater contour complexity tends to be dominant over one of lesser complexity. Nguyen et al. (2003) asked whether the depth of suppression is affected by the complexity of the suppressed image. They presented dichoptic moving stimuli that ranged from a pair of simple orthogonal gratings to a pair of oppositely rotating spirals. They assumed that simple gratings are processed at an earlier stage than spirals. When the suppressed image was replaced by a stimulus moving at a different velocity, the change was more easily detected for the simple moving gratings than for the complex spirals. They concluded that the depth of suppression is shallower for simple stimuli detected at early levels of the visual system than for more complex stimuli detected at higher levels of the system.
12.3.3e Rivalry between Subjective Contours When opposed triangles are dichoptically combined, rivalry occurs between intersecting edges, as in Figure 12.20A. Dichoptic triangles formed from subjective contours, as in Figure 12.20B, form a six-pointed star, as they do when they are combined in one eye. With both dichoptic and monocular viewing, the star sometimes appears to break down into superimposed triangles, which alternate
STEREOSCOPIC VISION
with respect to their foreground-background relationship. Bradley (1982) concluded that this is figure-ground ambiguity rather than binocular rivalry. Fahle and Palm (1991) claimed that binocular rivalry is evident in Figure 12.20C. But Figure 12.20D shows that the same rivalry is evident in the same images superimposed in one eye so that this too could be put down to figure-ground ambiguity (see Section 12.3.8). Sobel and Blake (2003) investigated whether the processes responsible for binocular rivalry occur before or after those responsible for formation of subjective contours. Their first experiment was based on the finding that a sudden movement near a suppressed image brings that image into dominance (Walker and Powell 1979). Sobel and Blake found that a subjective vertical bar swept across a suppressed target in the same eye did not shorten the period
of suppression of the target. The second experiment was based on the finding that the visibility of a near-threshold line is enhanced when the line is adjacent to a subjective triangle (Dresp and Bonnet 1995). Sobel and Blake found that this enhancement did not occur when one of the three pacmen responsible for producing the subjective triangle was suppressed by a complete disk in the other eye. They concluded from these two results that binocular rivalry precedes the synthesis of subjective contours.
Figure 12.20. Binocular rivalry of cognitive contours. (A) Rivalry occurs between the line intersections. (B) The triangles form a six-pointed star or show figure-ground rivalry when combined dichoptically. (C) The triangles alternate. (D) The triangles alternate when combined in the same eye. (Adapted from Bradley 1982 and Fahle and Palm 1991)
12.3.4 POSITION ON THE RETINA
Each visual cortex of the monkey has more binocular cells with a dominant input from the contralateral eye (the nasal hemiretina) than binocular cells with a dominant input from the ipsilateral eye (temporal hemiretina) (Sections 7.1.4 and 11.4). One might therefore expect a stimulus presented to the nasal retina of one eye to dominate that presented to the temporal retina of the other eye. In conformity with this expectation, Köllner (1914) found that when a homogeneous green field was presented to the left eye and a similar red field to the right eye for about 100 ms, the green field dominated in the left visual field (nasal half of the left eye) and the red field dominated in the right visual field (nasal half of the right eye). Thus, the left eye dominated in the left visual field and the right eye dominated in the right visual field. With longer viewing, the colored fields began to rival. When the displays presented to the nasal and temporal visual fields were separated by a vertical black band, the color projected to the nasal half of each eye remained dominant for prolonged periods (Crovitz and Lipscomb 1963a, 1963b). Fahle (1987), also, found that the pattern of alternation of rivaling stimuli varied as a function of position in the visual field. A vertical grating in the right eye was dominant for slightly longer than a horizontal grating in the left eye when they were presented within a 20° radius around the fixation point. This was attributed to the fact that most people have a dominant right eye, but it could also be due to vertical gratings tending to dominate horizontal gratings (Fahle 1982b). When the stimuli were presented more than 20° to the left of the fixation point, the left eye was dominant for about twice as long as the right eye. When they were presented more than 20° to the right, the right eye was about twice as dominant. 
In both cases, the temporal field (nasal hemiretina) of one eye tended to dominate the nasal field (temporal hemiretina) of the other eye. Fahle pointed out that, in convergent strabismus, the fovea of the deviating eye competes with the dominant temporal hemifield of the other eye whereas, in divergent strabismus, the fovea of the deviating eye competes with the nondominant nasal hemifield of the other eye. This could account for why convergent strabismics develop amblyopia in the deviating eye whereas divergent strabismics do not (Section 8.4.1).
BINOCULAR FUSION AND RIVALRY
The rate of rivalry has been found to be higher for stimuli in the lower visual field (upper retina) than for stimuli in the upper field (Chen and He 2003). This may be related to the fact that acuity and sensitivity to flicker are higher in the lower visual field (Payne 1967; Tyler 1987). The blind spot in each eye is about 5° in diameter and is centered 15° from the fovea on the nasal side. One would not expect a stimulus confined to the blind spot of one eye to rival a stimulus presented to the corresponding area in the other eye. We are not normally aware of the blind spot and we do not experience any discontinuity in a textured pattern extending across the blind spot. This is known as blind spot filling in. He and Davis (2001) claimed that a radial pattern extending across the blind spot more effectively rivals a concentric pattern presented to the other eye than does a radial pattern that surrounds the blind spot but does not extend across it. They concluded that filled in information in the blind spot contributes to rivalry. However, it can be seen in Figure 12.21 that the filled pattern extended over the boundary of the blind spot while the other pattern did not extend as far as the boundary of the blind spot. Therefore, in the filled stimulus, it may have been the pattern elements adjacent to the blind spot that contributed to rivalry rather than the elements that fell within the blind spot.
12.3.5 TEMPORAL FACTORS IN RIVALRY
12.3.5a Stimulus Duration Hering (1874) reported that brief rivalrous stimuli appear as two complete superimposed stimuli. Other investigators have since noticed the same phenomenon. For instance, high-contrast dichoptic orthogonal gratings formed a grid pattern when shown for less than about 200 ms but rivaled when shown for more than 400 ms (Anderson et al. 1978). Dawson (1913) noticed that rivalry ceased while he rapidly blinked his eyes. High-contrast dichoptic orthogonal gratings did not rival when flashed on repeatedly for 50 ms at a rate of 2 Hz (Kaufman 1963). Similarly, afterimages of orthogonal gratings did not rival when placed on a ground flashing at 2 Hz (Wade 1973). Thus, suppression takes time to develop and, before it develops, both dichoptic stimuli slip past the suppression mechanism and reach consciousness. Although orthogonal gratings combined into a plaid when exposed dichoptically for 10 ms, they appeared to rival when presented intermittently for 10 ms with intervals of less than 150 ms, just as they did when presented continuously (Wolfe 1983a, 1983b). Thus, the rivalry mechanism integrates over short time intervals and, once switched on, stays on for at least 150 ms, affecting the appearance of subsequently exposed stimuli. Another way to think about these effects is that rivalry is less evident in the transient channel of the visual system than in the sustained channel. Subjects who showed continuous suppression of the stimulus in one eye nevertheless experienced the combined image of dichoptic gratings when they were presented for 150 ms (Wolfe 1986a) (Portrait Figure 12.22). Some amblyopes showed the same time course for development of rivalry as normal subjects. Other amblyopes initially showed partial or complete dominance of the image in the good eye, giving way to rivalry or continued dominance of the good eye with longer exposure. All amblyopes behaved like a normal subject when the contrast of the image in the good eye was reduced appropriately (Leonards and Sireteanu 1993). Even though dissimilar dichoptic patterns appear as a combined image when presented briefly, Blake et al. (1991a) found that subjects could distinguish between brief dichoptic images and both images presented briefly to one eye.
Figure 12.21. Stimuli used to study the spread of rivalry. (A) A concentric pattern in one eye rivals a radial pattern that extends across the blind spot in the other eye. The dotted circle represents the blind spot. (B) The radial pattern does not extend as far as the blind spot. Subjects fixated the white spot. The figures are too small to produce the effects. (Redrawn from He and Davis 2001)
Figure 12.22. Jeremy Wolfe. Born in London, England in 1955. He received his B.A. from Princeton University in 1977 and his Ph.D. in psychology from MIT in 1981. He held academic appointments in psychology and in brain and cognitive sciences at MIT between 1981 and 1991. He is now professor of ophthalmology at Harvard Medical School and director of the Visual Attention Lab at Brigham and Women’s Hospital.
12.3.5b Rivalry of Images Flickering at Different Rates O’Shea and Blake (1986) presented an uncontoured field flickering at 4 Hz to one eye and a similar field flickering at between 0.5 and 16 Hz to the other eye. These stimuli produced very few reports of rivalry. Instead, subjects reported seeing a single field flickering irregularly at a frequency between the frequencies in the two eyes (Portrait Figure 12.23). The effect resembled that produced by superimposing the two flickering fields in one eye. Small monocular probes placed at different positions on the flickering fields in both eyes remained visible all the time. They concluded that rivalry does not occur within the transient channel of the visual system, which is most effectively engaged by flickering, uncontoured stimuli. Orthogonal gratings that were counterphase modulated at different frequencies rivaled in the usual way. A grating flickering at a lower frequency created longer dominance phases than one flickering at a higher frequency.
12.3.5c Rivalry of Contrast-Modulated Images A sudden change in a suppressed image can bring that image into dominance (Section 12.5.3). The question
addressed here is whether the frequency of rivalry can be influenced by reciprocal modulations of the strength of two images. Kim et al. (2006) superimposed a small + in one eye on an X in the other eye. The stimuli were reciprocally modulated in contrast by equal amounts at various frequencies. A 20% opposite modulation of contrast had little effect on the rate of rivalry. Rivalry became fully entrained by stimuli with 100% modulation of contrast. The effect of a 30% modulation of contrast on the rate of rivalry depended on the frequency of modulation, as shown in Figure 12.24. The effect was most pronounced when the frequency of contrast modulation was equal to the mean spontaneous rate of rivalry (1.25 Hz). At this frequency, the mean duration of dominance was shortened and the profile of dominance durations showed secondary peaks at odd multiples of the modulation half-period. This pattern of entrainment is symptomatic of a stochastic process running at that frequency. Odd multiples arise when one or more stimulus modulations fail to entrain rivalry. Kim et al. concluded that rivalry involves a stochastic resonant modulation in signal strength superimposed on the effects of image adaptation and interimage inhibition. They argued that this internal modulation (noise) is equivalent to a 30% modulation of contrast and is distinct from low-level noise in the visual system.
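The relation between modulation half-period, modulation frequency, and the expected positions of peaks in the dominance-duration distribution can be made concrete with a short calculation. The sketch below is illustrative only. The function names are hypothetical, and it assumes that one modulation cycle consists of two half-periods, so that frequency = 1/(2 × HP), and that entrained dominance phases end at odd multiples of the half-period, as indicated in Figure 12.24.

```python
def modulation_frequency_hz(half_period_ms):
    """Modulation frequency, assuming one full cycle equals two half-periods."""
    return 1000.0 / (2.0 * half_period_ms)


def expected_peak_durations_ms(half_period_ms, max_duration_ms=4000):
    """Odd multiples of the half-period (HP, 3HP, 5HP, ...) up to max_duration_ms."""
    peaks = []
    k = 1
    while k * half_period_ms <= max_duration_ms:
        peaks.append(k * half_period_ms)
        k += 2  # skip even multiples: only odd multiples are expected
    return peaks


if __name__ == "__main__":
    for hp in (200, 400, 800):
        print(hp, modulation_frequency_hz(hp), expected_peak_durations_ms(hp))
```

A half-period of 400 ms corresponds to the 1.25 Hz condition, which the text identifies as the mean spontaneous rate of rivalry.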
12.3.5d Images Presented in Temporal Alternation Rivalry occurs between stimuli presented briefly and alternately to each eye, as long as the interval is not too great. Thus, binocular rivalry between orthogonal gratings presented in alternation to the two eyes at rates above about 20 Hz was indistinguishable from that between simultaneously presented stimuli (O’Shea and Crassini 1984). Some rivalry was apparent in stimuli alternating at 3 Hz but, below this frequency, stimuli were seen in alternation, often with apparent motion from one to the other. Rivalry also occurred between stimuli presented alternately for 5 ms to each eye, with up to 100-ms intervals between stimuli. This is the same interval of time over which stereoscopic depth occurs with alternating stimuli to the two eyes (Section 18.12.2). Rivalry has the following dynamic features. 1. Dominance durations follow a gamma distribution, which represents the sum of a set of independent exponentially distributed random variables. 2. Dominance durations are temporally uncorrelated.
3. Changing the contrast of the image in one eye changes the dominance duration only of the image in the other eye (Levelt’s second proposition). Stimuli presented alternately to the eyes with an appropriate interstimulus interval show dichoptic masking.
Figure 12.23. Robert O’Shea. Born in Australia in 1953. He obtained a B.Sc. and Ph.D. at the University of Queensland with Boris Crassini. He conducted postdoctoral work with Peter Dodwell, Randolph Blake, and Don Mitchell. In 1988 he went to the University of Otago in New Zealand. He is now a professor at the Southern Cross University in New South Wales, Australia.
Figure 12.24. Binocular rivalry as a function of reciprocal contrast modulation of competing images. The contrast of the rivalrous images was modulated in antiphase at frequencies of 0.28 to 2.48 Hz. The lower graphs show peaks in the dominance-duration distributions at the odd-integer multiples of the half-periods of contrast modulation (vertical lines) consistent with stochastic resonance. In the upper graphs, the distribution due to spontaneous rivalry has been subtracted to reveal effects due to contrast modulation. N = 3. (Reprinted from Kim et al. 2006 with permission from Elsevier)
[Figure panels: modulation half-period (HP) of 200, 400, 600, 800, 1000, 1200, 1400, 1600, and 1800 ms, with modulation frequency of 2.48, 1.25, 0.83, 0.63, 0.50, 0.42, 0.36, 0.31, and 0.28 Hz respectively. Axes: probability density against dominance duration (0 to 4000 ms). Traces: driven minus control; control (spontaneous rivalry).]
Van Boxtel et al. (2007) reported that dichoptic masking has the same dynamic features as rivalry. They concluded that rivalry and dichoptic masking involve the same inhibitory processes.
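The first two dynamic features listed above can be illustrated by simulation. The sketch below draws each dominance duration independently as the sum of a fixed number of independent exponential variables, which yields gamma-distributed, temporally uncorrelated durations. The number of stages (4) and the mean of each stage (0.5 s) are arbitrary illustrative values, not estimates from any of the cited studies, and the function name is hypothetical.

```python
import random
import statistics


def simulate_dominance_durations(n_phases, n_stages=4, stage_mean_s=0.5, seed=0):
    """Draw n_phases durations, each the sum of n_stages exponential variables.

    The sum of k independent exponentials with a common mean follows a gamma
    distribution with shape k and scale equal to that mean.
    """
    rng = random.Random(seed)
    rate = 1.0 / stage_mean_s  # expovariate takes a rate (1 / mean)
    return [
        sum(rng.expovariate(rate) for _ in range(n_stages))
        for _ in range(n_phases)
    ]


if __name__ == "__main__":
    durations = simulate_dominance_durations(20000)
    # For a gamma distribution: mean = shape * scale, variance = shape * scale**2.
    print("mean (s):", round(statistics.mean(durations), 2))
    print("variance (s^2):", round(statistics.variance(durations), 2))
```

With these assumed parameters the theoretical mean is 4 × 0.5 = 2 s and the theoretical variance is 4 × 0.5² = 1 s², which the sample statistics should approximate.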
Figure 12.25. Stimuli to measure speed of spread of rivalry. The high-contrast spiral pattern seen by one eye was superimposed on either the radial pattern or the concentric pattern in the other eye. (From Wilson et al. 2001. Reprinted by permission from Macmillan Publishers Ltd.)
12.3.5e Spatial Propagation of Suppression Large rivaling patterns produce a patchwork of zones of rivalry, with one eye’s image dominant in some zones and the other eye’s image dominant in other zones. The zones expand or contract as one continues to view the pattern. Wilson et al. (2001) proposed that the rate of expansion of suppression of one image by the other is due to spread of neural activity over lateral connections in the visual cortex. They used the stimuli shown in Figure 12.25 to measure the rate of spread of binocular rivalry. First, a low-contrast radial pattern was dichoptically superimposed on a high-contrast spiral pattern. The subject pressed a key when only the spiral pattern was visible. The key press triggered a brief increment of contrast at one of the eight cardinal points in the radial pattern. This caused the radial pattern to become dominant at that point and initiated a spreading wave of dominance of the radial pattern around the annulus. The subject released the key when the wave of dominance reached a designated point on the annulus. The time taken for the wave to spread from the trigger point to its final point indicated a mean speed of propagation of the wave of suppression of about 3.6°/s. The speed of propagation was 9.6°/s for concentric lines superimposed on the spiral pattern. The angle between these two patterns was 45° as it had been for the radial and spiral patterns. The difference in speed probably arose from the fact that lateral connections in the visual cortex are strongest between cells with collinear orientation preference (Section 5.5.6b). Propagation took about 173 ms longer for a wave that traversed the retinal midline, and which therefore traversed the corpus callosum. When the radius of the radial-plus-spiral annulus was increased from 1.8 to 3.6°, the mean speed of propagation increased to about 8.3°/s. With fixation on the center of the display,
the wave of rivalry for the larger display occurred in a more eccentric retinal location than that for the smaller display. When allowance was made for the cortical magnification factor (the extent of cortex devoted to each degree of visual angle, which decreases with increasing eccentricity), mean propagation speed was 2.24 cm/s over the cortical surface for both display sizes. Lee et al. (2005) recorded fMRI responses from V1 as subjects reported rivalry in a stimulus like that shown in Figure 12.25. The time course of activity over the retinotopic map in V1 coincided with subjects’ reports of the traveling wave of dominance round the stimulus. A wave of dominance in rivalrous concentric and radial gratings tended to start where the contrast or motion of the suppressed image was stronger than that of the suppressor (Paffen et al. 2008a). Also, a wave of dominance in an array of oriented Gabor patches tended to start where orientation contrast in the suppressed image was greatest (Stuit et al. 2010). Arnold et al. (2009) found that local contrast increments triggered dominance changes that spread along illusory contours in a Kanizsa figure. Also, changes spread along contours of a face. When different regions of the face were presented to different eyes, dominance changes spread preferentially along contours in the same eye rather than over contours linked across the eyes. These results link properties of rivalry propagation to known properties of the visual cortex, such as the cortical magnification factor and preferential linkage of collinear images. This supports the notion that rivalry arises primarily between inputs from the two eyes in the visual cortex. However, the results do not rule out some contribution from processes occurring at a higher level in the cortex. Wilson et al. (2001) developed a neural model of rivalry, which goes beyond previous models in providing for speed of rivalry propagation and for the effects of collinearity of stimulus elements.
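The conversion between speed in the visual field and speed over the cortex can be checked with simple arithmetic. The sketch below is a back-of-envelope calculation, not the procedure used by Wilson et al. (2001). It assumes a single linear magnification value for each annulus and asks what magnification, in millimeters of cortex per degree of visual angle, would convert each reported visual-field speed into the common cortical speed of 2.24 cm/s. The function name is hypothetical.

```python
def implied_magnification_mm_per_deg(cortical_speed_cm_s, visual_speed_deg_s):
    """Magnification (mm of cortex per degree) implied by a pair of speeds.

    Assumes cortical speed = magnification * visual-field speed, with one
    magnification value for the whole annulus (a simplification).
    """
    cortical_speed_mm_s = cortical_speed_cm_s * 10.0
    return cortical_speed_mm_s / visual_speed_deg_s


if __name__ == "__main__":
    cortical_speed = 2.24  # cm/s, reported for both display sizes
    # (annulus radius in degrees, reported propagation speed in deg/s)
    for radius_deg, visual_speed in ((1.8, 3.6), (3.6, 8.3)):
        m = implied_magnification_mm_per_deg(cortical_speed, visual_speed)
        print(f"radius {radius_deg} deg: about {m:.2f} mm/deg")
```

Under these assumptions the implied magnification is smaller for the more eccentric annulus, which is the direction expected if less cortex is devoted to each degree of visual angle as eccentricity increases.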
12.3.5f Flash Suppression and Facilitation Wolfe (1984) presented a vertical grating to one eye for 1 to 2 seconds. About 100 ms later a flashed vertical grating was presented to the same eye and a flashed horizontal grating to the other eye. Subjects saw only the horizontal grating. This is known as flash suppression. See Wilkie et al. (2003) for a general discussion of flash suppression. Flash suppression provides an experimenter with some short-term control over which image is dominant. Tsuchiya and Koch (2005) described a technique for prolonging the dominance of a given image in binocular rivalry. They named it continuous flash suppression. An image is flashed to one eye at a rate of about 10 Hz while a different image is presented continuously to the other eye. The continuous image remains suppressed for several minutes. Thus, the dominance of an image is greatly increased by repetitive flashing. This procedure provides a way to effectively control binocular suppression.
Tsuchiya et al. (2006) found that continuous flash suppression elevated the detection threshold 20-fold, but that regular suppression elevated it only 3-fold. They concluded that continuous suppression is due to accumulated effects of multiple flashes. Brascamp et al. (2007) found that prior presentation of a brief, low-contrast oblique grating to one eye increased the probability that that grating would dominate an orthogonal grating in the other eye. They called this flash facilitation. With longer durations or higher contrasts of the prior stimulus, facilitation gave way to suppression. The key factor was stimulus energy, defined as contrast times duration. For example, a high-contrast grating presented for less than about 0.5 s produced facilitation but the same stimulus presented for longer than 0.5 s produced suppression. At lower contrasts, facilitation persisted over longer durations. The effects had the following features: 1. Flash facilitation did not occur when the prior stimulus was not in the same retinal location as the subsequent rivalrous stimuli. 2. Both suppression and facilitation declined to zero when the blank interval between the prior stimulus and the rivalrous stimuli was increased from about 0.2 s to about 2 s. 3. Suppression and facilitation occurred when the prior stimulus and the rivalrous stimuli differed in shape. In this case, the effects were eye-specific rather than pattern-specific. They therefore occurred at a low level in the visual system. 4. But the effects also occurred when both eyes were presented with a prior stimulus that was the same shape as the rivalrous stimuli. In this case the effects were pattern-specific rather than eye-specific. They therefore occurred at a higher level in the visual system. 5. Both suppression and facilitation were reduced when the pattern-specific effect was pitted against the eye-specific effect by having one eye preview the other eye’s stimulus. 
In this case, there was a trade-off between the low-level process and the high-level process. Prior exposure to a single monocular patch of oblique grating produced flash suppression in a subsequently viewed pair of rivalrous gratings. However, flash suppression did not occur when the prior patch was surrounded by similar patches that prevented subjects from detecting the orientation of the surrounded patch (Hancock et al. 2008). This suggests that flash suppression depends on awareness of the orientation of the prior stimulus. Arnold et al. (2008) described another procedure for prolonging the dominance of an image, which they called binocular switch suppression. A blurred target pattern and a sharp black-white random grid were presented
alternately to the two eyes at a rate of 1 Hz. The target pattern remained suppressed for prolonged periods. They concluded that the short exposure times prevented adaptation of the dominant image, which therefore remained dominant. The effect was most pronounced at the alternation rate of 1 Hz, which is considerably lower than the optimal flash rate for continuous flash suppression. Arnold et al. concluded that the two effects depend on distinct processes. Suppression and facilitation produced by prior inspection of a stimulus occur in other cases of alternating ambiguous stimuli. These include Necker cube reversals (Section 26.7.2), the kinetic depth effect (Section 28.5), motion in depth (Section 31.6), and ambiguous motion (Kanai and Verstraten 2005). In these cases, suppression is known as long-term adaptation and facilitation is known as priming.
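The stimulus-energy account of flash facilitation and flash suppression reported by Brascamp et al. (2007) can be expressed as a simple rule. The sketch below is a toy illustration under loud assumptions: contrast is expressed as a fraction between 0 and 1, the function names are hypothetical, and the critical energy of 0.5 contrast-seconds is an assumed value chosen only so that a full-contrast stimulus switches from facilitation to suppression at about 0.5 s. It is not a fitted parameter from the study.

```python
def stimulus_energy(contrast, duration_s):
    """Stimulus energy, defined as contrast times duration."""
    return contrast * duration_s


def predicted_effect(contrast, duration_s, critical_energy=0.5):
    """Classify the effect of a prior stimulus on subsequent rivalry.

    critical_energy is a hypothetical threshold (contrast-seconds), assumed
    here for illustration; energies below it give facilitation.
    """
    if stimulus_energy(contrast, duration_s) < critical_energy:
        return "facilitation"
    return "suppression"


if __name__ == "__main__":
    for contrast, duration in ((1.0, 0.3), (1.0, 1.0), (0.25, 1.0)):
        print(contrast, duration, predicted_effect(contrast, duration))
```

With a fixed energy criterion, lowering the contrast lengthens the range of durations that fall below the criterion, which matches the observation that facilitation persisted over longer durations at lower contrasts.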
12.3.5g Preservation of an Image Over Interruptions The rate of rivalry of orthogonal gratings was greatly reduced when both stimuli were presented for 3-second periods interrupted by 5-second blanks (Leopold et al. 2002). When a blank of 5 s was introduced soon after one stimulus became dominant, that stimulus remained dominant after both stimuli were returned to view and remained dominant for many repetitions of this cycle. Leopold et al. argued that an adaptation process could not explain this effect. But one could argue that the dominant image recovers from adaptation during the 5-s blank interval and therefore reappears when the stimuli are returned. In other words, stimulus interruption forestalls adaptation of the dominant stimulus, which is then preserved over interruptions.
12.3.6 RIVALRY BETWEEN MOVING STIMULI
12.3.6a Rivalry and Eye Movements Sudden motion or increase in luminance contrast of a suppressed image tends to terminate suppression. The eyes constantly execute small saccadic movements. Levelt (1967) proposed that the resulting sudden movements of the suppressed retinal image trigger reversals of binocular dominance. However, binocular rivalry occurs between retinally stabilized images (Blake et al. 1971) and between afterimages (Wade 1973, 1975a). This demonstrates that eye movements are not necessary for rivalry but it does not exclude a role for eye movements. The rate of rivalry between afterimages was lower than that between real images, unless the afterimages were viewed against a background flickering at below 3 Hz (Wade 1977). This suggests that image movements due to eye tremor or image flicker increase the rate of rivalry above a baseline rate. The baseline rate is independent of image movement or flicker.
Sabrin and Kertesz (1983) applied a more critical test of the eye-movement theory by stabilizing the image in only one eye. Rivalry still occurred but the stabilized image was suppressed for longer periods than the unstabilized image. When a motion simulating the effects of microsaccades was imposed on the stabilized image, the periods for which it was suppressed returned to normal. Thus, eye movements do affect the rate of binocular rivalry when their effects are unequal in the two eyes. The spread of suppression from a contour into a noncontoured region in the other eye was more extensive when the eyes executed vergence movements (Kaufman 1963). This effect was explained in terms of a time lag in recovery from suppression, which causes the moving eyes to leave a wake of suppression in their path.
12.3.6b Do Moving Stimuli Rival Like Stationary Stimuli? A moving stimulus was found to be dominant for longer than a similar stationary stimulus seen by the other eye (Breese 1899). A bar seen by one eye suppressed a stationary object seen by the other eye, when the bar moved over the object. The effect increased with increasing eccentricity of the object and was most evident at a velocity of 20°/s (Grindley and Townsend 1965). The dominance duration of a moving grating increased with speed (Wade et al. 1984). However, an advantage of one speed over another was not evident when the displays moved at different speeds in the same direction (Blake et al. 1985). A moving stimulus also suppresses a stationary stimulus in the same eye, an effect known as motion-induced blindness (Section 12.3.8). Rivalry occurs between dichoptic displays of random dots moving at the same speed in directions that differ by more than 30° (Wade et al. 1984). Dichoptic displays of dots moving upward at 1.5°/s in directions that differed by less than 30° fused to produce a display that appeared to move on an inclined depth plane with respect to the circular aperture (Blake et al. 1985). These displays clearly engaged the binocular-disparity system. The following experiments involved the use of orthogonal gratings, each translating in a direction at right angles to the lines of the grating. These will be referred to as orthogonal drifting gratings. Rivalry between orthogonal gratings moving at 1.2 or 4°/s had a time course similar to that between stationary gratings. However, depth of suppression, as measured by the detection threshold of a flashed probe, was greater for moving than for stationary gratings (Norman et al. 2000). Stationary low-contrast orthogonal gratings tend to fuse into a plaid rather than rival (Section 12.3.2c). However, setting the low-contrast orthogonal gratings in motion increased the incidence of rivalry (Cobo-Lewis et al. 2000).
Orthogonal drifting gratings superimposed in the same eye create a plaid moving in an intermediate direction, especially when the gratings have the same spatial frequency (Section 22.3.3). What happens when the drifting gratings are presented dichoptically? Andrews and Blakemore (1999, 2002) superimposed dichoptic orthogonal drifting gratings. Sometimes, one or the other grating dominated and appeared to move orthogonally to its orientation. Sometimes, parts of each grating formed a mosaic of orthogonal gratings that appeared to move as a whole in an intermediate direction. In this case, motion signals in all rivalrous patches were interpreted in the same way to create dichoptic coherent motion. Dichoptic coherent motion could be due to integration of direction information over the dominant images in contiguous patches. On the other hand, it could be due to integration of direction information in each patch between the dominant image in one eye and the suppressed image in the other eye. Motion in the intermediate direction was still perceived when one grating was dominant over the whole display. This suggests that motion signals from an otherwise suppressed grating can influence the perceived direction of motion of a dominant grating. Cobo-Lewis et al. (2000) asked whether dichoptic coherent motion follows the same rules as plaid motion in gratings superimposed in the same eye. They used orthogonal 1-cpd gratings moving at 4.5°/s. When superimposed in the same eye they created plaid motion. Plaid motion gave way to independent motion when the gratings differed in spatial frequency. Also, the direction of plaid motion depended on the relative velocities of the gratings. The occurrence and direction of plaid motion in superimposed dichoptic gratings was affected in the same way by differences in spatial frequency and velocity. Sun et al. (2002) used dichoptic 0.5-cpd orthogonal gratings drifting at 12°/s. 
Sometimes, the slow phase of optokinetic nystagmus was in the direction of one or the other grating. At other times it occurred in an intermediate direction at a higher velocity corresponding to that of the plaid pattern formed by superimposing the gratings. Changes in OKN were accompanied by corresponding changes in the perceived direction of motion. The subjects registered only component motions when the gratings alternated so that each appeared for longer than 200 ms. Thus, dichoptic motion signals were combined only when they were in close synchrony. Tian et al. (2003) obtained similar results. The velocity of OKN in the intermediate direction increased as the angular difference between the component gratings increased. This conforms to predictions from the intersection of constraints account of plaid motion rather than to predictions from the vector-sum account (see Adelson and Movshon 1982). Blake et al. (2003) rotated a pair of off-center 0.6° rivaling radial and concentric gratings round the fixation point at 6 rpm in the same direction. Consequently, they continually
engaged fresh neural tissue. The alternation rate of rivalry was significantly less than when the gratings were stationary or when the observers visually tracked the moving stimuli. The reason for this effect is not known. In a second experiment, Blake et al. moved the rivaling stimuli across a region that had been previously exposed to one of the stimuli for 60 s. A switch in dominance tended to occur when the dominant figure in the moving stimuli was the same as the previously exposed figure but not when the dominant figure and the adapted figure were different. A given stimulus was more dominant when surrounded by an annulus of moving elements, especially when elements in the target stimulus and those in the surround moved in opposite directions (Blake et al. 1998). A related question is whether distinct motion aftereffects in the two eyes show vector summation of perceived direction or show rivalry. This question is discussed in Section 13.3.3d.
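The contrast drawn above between the intersection-of-constraints account and the vector-sum account of plaid motion can be made explicit for the symmetric case. The sketch below is a minimal illustration, assuming two component gratings that drift at the same speed in directions separated by a given angle, so that the plaid moves along the bisector. The function names are hypothetical. Under intersection of constraints, the plaid velocity must project onto each component direction with the component speed, giving speed / cos(angle/2). Under a vector sum, the two component velocities are simply added, giving 2 × speed × cos(angle/2).

```python
import math


def ioc_speed(component_speed, angle_deg):
    """Plaid speed from intersection of constraints (symmetric components).

    The plaid velocity V lies on the bisector and satisfies
    V * cos(angle / 2) = component_speed. Valid for angles below 180 degrees.
    """
    half_angle = math.radians(angle_deg) / 2.0
    return component_speed / math.cos(half_angle)


def vector_sum_speed(component_speed, angle_deg):
    """Plaid speed from the vector sum of two equal-speed component velocities."""
    half_angle = math.radians(angle_deg) / 2.0
    return 2.0 * component_speed * math.cos(half_angle)


if __name__ == "__main__":
    for angle in (60, 90, 120):
        print(angle,
              round(ioc_speed(12.0, angle), 2),
              round(vector_sum_speed(12.0, angle), 2))
```

The two rules coincide for orthogonal components but diverge elsewhere: the intersection-of-constraints speed rises as the angular difference grows, whereas the vector-sum speed falls. An increase in the velocity of OKN with increasing angular difference is therefore the pattern predicted by intersection of constraints.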
12.3.6c Does Rivalry between Moving Stimuli Indicate Anything about the Site of Rivalry? Rivalry between oppositely moving arrays of dots was less evident when all pairs of dots in the two displays moved along intersecting pathways (paired) compared with when the pathways did not intersect (unpaired) (Matthews et al. 2000). With intersecting pathways, the motion signal is lost as the dots intersect, which interferes with the perception of two distinct arrays. Matthews et al. concluded that binocular rivalry involves binocular cells, since only binocular cells register interocular coincidences. But the argument is not conclusive because reduced rivalry with intersecting dots may be due to binocular fusion occurring when intersecting dots fall on or near corresponding points. In a study from the same laboratory Meng et al. (2004) used dichoptic random-dot displays moving in opposite directions along intersecting pathways (paired). This produced monocular motion signals, but the binocular motion signals were canceled. In a second stimulus the dots moved along nonintersecting pathways (unpaired). This produced both monocular and binocular motion signals. If rivalry were only between cells sensitive to monocular motion, these stimuli would show equal rivalry. If rivalry were only between binocular motion cells, only the second stimulus would show rivalry. In fact, both stimuli showed rivalry but those with both types of motion produced stronger rivalry (longer periods of exclusive dominance) than those with only monocular motion. Meng et al. concluded that both monocular and binocular motion signals contribute to rivalry between moving stimuli.
12.3.7 RIVALRY AND EYE DOMINANCE
Eye dominance is defined in a general way as a preference for using one eye over the other. The early literature on eye
dominance was reviewed by Coren and Kaplan (1973), Porac and Coren (1976), and Wade (1998). Most people are right-handed and right-eyed, but there are conflicting claims about whether eye dominance is correlated with handedness (Miles 1930; Eyre and Schmeeckle 1933; Gronwall and Sampson 1971). In animals with hemidecussating visual pathways, eye dominance has nothing to do with cerebral dominance, because each eye projects to both cerebral hemispheres. The following criteria have been used to define eye dominance:
1. The eye with better visual acuity, contrast sensitivity, or other measure of visual functioning. In severe cases, the weaker eye is amblyopic and is permanently suppressed when both eyes are open (Section 8.4.2).
2. The eye used for sighting when, for instance, one looks at a distant object through a ring held in both hands at arm’s length with both eyes open.
3. The eye in which a rivaling stimulus is most often dominant.
4. The perceived direction of fused diplopic images (see Section 16.7.4).
There is controversy about whether tests of eye dominance based on these definitions are correlated. There seems to be no single set of criteria that characterizes eye dominance (Mapp et al. 2003). Coren and Kaplan (1973) conducted a factor analysis on the results of 13 tests of eye dominance given to 57 subjects. The results revealed three principal factors: acuity dominance, sighting dominance, and rivalry dominance. Most of the variance was accounted for by sighting dominance. There is conflicting evidence about whether the sighting eye is related to rivalry dominance. Rivalry tests of ocular dominance were poorly correlated with sighting tests (Washburn et al. 1934). Some other investigators found no relation between sighting dominance and rivalry (Collins and Goode 1994). On the other hand, Porac and Coren (1978) reported that the sighting eye was dominant for longer periods than the nonsighting eye. Handa et al. (2004) obtained the same result for orthogonal gratings with a spatial frequency of 4 cpd but not for gratings with spatial frequencies of 2 or 8 cpd. The durations of dominance in the two eyes could be made equal by reducing the contrast of the grating in the dominant eye relative to that in the nondominant eye. They suggested that the contrast balance point could be used as a test of eye dominance.
It has been claimed that eye dominance switches to the right eye when the gaze is 15° or more to the right and to the left eye when gaze is to the left (Khan and Crawford 2001). However, Banks et al. (2004b) pointed out that the image in the right eye is larger than that in the left eye when the gaze is to the right. When they independently varied gaze angle and relative image size they found that eye dominance was determined by image size rather than by gaze angle. Another factor may be that when the gaze is to the right the image of the nose intrudes into the visual field of the left eye. In amblyopes, the normal eye is the dominant eye and also dominates in rivalry.
12.3.8 MONOCULAR RIVALRY
12.3.8a Basic Findings
Breese (1899) noticed rivalry between a diagonal grid of black lines on a red ground and an orthogonal grid of black lines on a green ground, when they were optically superimposed in the same eye. Sometimes only one or the other set of lines was seen, and sometimes parts of each. The colors associated with each set of lines fluctuated accordingly. Breese introduced the term “monocular rivalry” to refer to rivalry between images in one eye. Monocular rivalry between differently colored gratings was most pronounced when they were orthogonal. When the angle between them was less than about 15°, the colors and lines combined into a stable percept (Campbell and Howell 1972; Campbell et al. 1973). Monocular rivalry was more frequent and more complete when the colors were complementary rather than noncomplementary or black and white (Rauschecker et al. 1973; Wade 1975b). The rate of monocular rivalry is much lower than that of binocular rivalry (Wade 1975b; Kitterle and Thomas 1980). Also, monocular rivalry takes time to develop and requires constant fixation, whereas binocular rivalry occurs straight away and does not require fixation. The rate of alternation of orthogonal gratings superimposed in one eye did not change significantly with changes in contrast (Atkinson et al. 1973). The rate of alternation was highest when both gratings had a spatial frequency of 5 cpd and fell off steeply on both sides of the peak (Kitterle et al. 1974). A grating near the peak frequency was dominant more of the time than an orthogonal grating at another frequency (Thomas 1977). Changes in stimulus size, spatial frequency, or relative orientation had similar effects on alternation rates in monocular and binocular rivalry (Andrews and Purves 1997). Monocular rivalry also occurs between parallel gratings.
For example, a 1-cpd sinusoidal grating superimposed on a 3 cpd (3rd harmonic grating) fluctuated in appearance, with the two gratings alternating in dominance (Atkinson and Campbell 1974). The fluctuation rate was highest when the relative phase of the gratings was 90° and least when it was 0° (peaks coincide) or 180° (peaks subtract). Successive presentation to one eye of a white horizontal bar and a white vertical bar, both on black backgrounds,
produced an afterimage in which the vertical and horizontal bars showed rivalry, complete with white halos at the intersections of black edges. Monocular rivalry was particularly impressive between an afterimage and a real bar. These forms of monocular rivalry appeared similar to rivalry between orthogonal dichoptic afterimages (Sindermann and Lüddeke 1972). When afterimages of a vertical and a horizontal bar are formed successively, there is some retention of the neural activity associated with each of the contours within the region where the afterimages intersect. The competition between these persisting neural processes is presumably responsible for the rivalry seen with monocular or dichoptic afterimages. A stationary pattern periodically disappears when superimposed in the same eye on a moving pattern, such as a rotating display of random dots (Bonneh et al. 2001a). This is motion-induced blindness. The effect shares many features with monocular rivalry (Carter and Pettigrew 2003).
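The compound grating used by Atkinson and Campbell (1974), a 1-cpd fundamental plus its third harmonic at a variable relative phase, can be sketched numerically. The following is only an illustration under stated assumptions: the cosine phase convention, the harmonic amplitude of one third of the fundamental, and the function names are not taken from the text.

```python
import math


def compound_grating(x_deg, phase_deg, f0_cpd=1.0, harmonic=3, rel_amplitude=1 / 3):
    """Luminance modulation (arbitrary units) of a fundamental plus one harmonic.

    Assumptions (hypothetical, for illustration): cosine gratings, with the
    harmonic at one third of the fundamental's amplitude. A phase of 0 deg
    puts a harmonic peak on each fundamental peak; 180 deg puts a harmonic
    trough there instead.
    """
    fundamental = math.cos(2 * math.pi * f0_cpd * x_deg)
    third = rel_amplitude * math.cos(
        2 * math.pi * harmonic * f0_cpd * x_deg + math.radians(phase_deg)
    )
    return fundamental + third


def peak_modulation(phase_deg, samples=3600):
    """Largest modulation found over one period of the 1-cpd fundamental."""
    xs = [i / samples for i in range(samples)]
    return max(compound_grating(x, phase_deg) for x in xs)


if __name__ == "__main__":
    for phase in (0, 90, 180):
        print(phase, round(peak_modulation(phase), 3))
```

Under these assumptions the peak modulation is about 1.33 at 0° (peaks add) and about 0.94 at 180° (peaks subtract), with 90° in between. The sketch describes only the stimulus waveform; it says nothing about why alternation was fastest at 90°.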
12.3.8b Monocular Rivalry and Monocular Diplopia
In monocular diplopia, a stimulus seen by one eye appears double (Section 14.4.2). The image in its normal oculocentric position appears normally bright and the anomalous image appears dim. Bielschowsky (1898) superimposed in one eye the normally bright image of a diplopic red patch on the dim image of a diplopic green patch. Instead of forming the hue created when two normal images of different hue are combined, the colored patches rivaled like dichoptic patches. Ramachandran et al. (1994a, 1994b) made a similar observation on a patient with intermittent exotropia. A patch of vertical lines was superimposed in the same eye on the dim diplopic image of a patch of horizontal lines. The subject experienced mosaic rivalry rather than the checkerboard pattern obtained when two orthogonal gratings of the same contrast are superimposed in the same eye.
12.3.8c Monocular Luster
A dark patch in one eye superimposed on a light patch in the other eye produces binocular luster. Anstis (2000) found that a dark patch alternating with a bright patch at 16 Hz in the same eye also produces luster, which he called monocular luster. Both effects occurred only when the two patches had opposite luminance polarity: one patch had to be darker than the background and the other lighter than the background. Anstis suggested that binocular and monocular luster arise from competition between signals in the ON and OFF visual pathways. Alternating orthogonal gratings to the same eye at 10 Hz produced lustrous diamond elements (Burr et al. 1986). Two gratings flickering in counterphase are physically equivalent to two gratings sliding over each other. Burr et al. suggested that this form of monocular luster arises from competition between signals in distinct motion channels.
12.3.8d Theories of Monocular Rivalry
It has been suggested that monocular rivalry is due to eye movements that cause an afterimage formed with one fixation to reinforce or attenuate parts of the pattern viewed with another fixation (Furchner and Ginsburg 1978; Georgeson and Phillips 1980). But this cannot explain why Breese sometimes observed parts of each grating or why rivalry occurs between gratings at levels of contrast too low to produce afterimages (Mapperson et al. 1982). Others argued that eye movements are not the only factor, since monocular rivalry occurs with afterimages (Atkinson 1972; Wade 1976; Crassini 1982). Georgeson (1984) pointed out that afterimages are poor stimuli for studying monocular rivalry, since they pass through various phases and vary with changes in the background on which they are projected. He produced data in support of the eye-movement theory of monocular rivalry. Subjects fixated a spot as it moved to different positions on two superimposed orthogonal gratings. The grating that was perceptually dominant depended on the retinal position of the image in the preceding fixation position, as predicted by the eye-movement theory. Rates of fluctuations of rivalry were independent of the relative orientation of the gratings when appropriate eye movements were made. The response of a cell in the monkey visual cortex to an optimally oriented bar or grating is suppressed by the superimposition of an orthogonal bar or grating (Section 12.9.2b). This effect is known as cross-orientation inhibition. It is largely independent of the relative spatial phases of superimposed gratings and operates over a wide difference in spatial frequency between the superimposed stimuli (DeAngelis et al. 1992). The monoptic version of this effect could be a factor in monocular rivalry, as suggested by Campbell et al. (1973). Monocular rivalry may also be related to the Troxler effect, in which parts of a steadily fixated stimulus fade from view. O’Shea et al.
(2009) produced evidence that monocular and binocular rivalry arise from the same process. The two effects (1) were affected in a similar way by changes in image size and color, (2) showed the same gamma distribution of dominance periods, and (3) showed threshold elevation for detection of a pulse only during the suppression phase. It seems safe to conclude that monocular rivalry, like binocular rivalry, arises when a region of the visual cortex receives, simultaneously or in rapid succession, signals that are processed in distinct orientation channels or spatial-frequency channels. Signals in the two channels compete for access to later stages of processing. This theory is supported by the fact that the relative orientation (15°) and the relative spatial frequency (1 octave) of gratings where rivalry becomes apparent correspond to the bandwidths of orientation and spatial-frequency channels.
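The gamma distribution of dominance periods mentioned above can be illustrated with a short simulation and a method-of-moments fit. This is a minimal sketch, not the analysis used by O’Shea et al. (2009): the shape and scale values, the seed, and the function names are hypothetical, chosen only to produce durations of a few seconds.

```python
import random
import statistics


def simulate_dominance_periods(n, shape=3.5, scale=0.6, seed=0):
    """Draw n dominance durations (in seconds) from a gamma distribution.

    The shape and scale defaults are illustrative assumptions, not values
    reported in the text.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def fit_gamma_moments(durations):
    """Estimate gamma shape and scale from the sample mean and variance.

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so shape = mean**2 / variance and
    scale = variance / mean.
    """
    mean = statistics.fmean(durations)
    var = statistics.variance(durations)
    return mean * mean / var, var / mean


if __name__ == "__main__":
    periods = simulate_dominance_periods(20000)
    shape_hat, scale_hat = fit_gamma_moments(periods)
    print(round(shape_hat, 2), round(scale_hat, 2))
```

Comparing fitted shape values for durations recorded under monocular and binocular rivalry is one simple way to ask whether the two distributions have the same form. A method-of-moments fit is used here only because it needs no external library.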
12.3.8e Summary
Images in the two eyes rival when they differ in contrast polarity, orientation, or motion direction. At any instant the whole or part of one image is seen and the corresponding region of the other image is suppressed. The strength of an image is determined by the contour it contains, and by its contrast, spatial frequency, and motion. The strength of a suppressed image has more effect on the duration of suppression than does that of the dominant image. However, the dark field of a closed eye has some strength, since it can suppress a grating in the other eye. A stimulus on the nasal retina of one eye tends to dominate that on the temporal retina of the other eye. Low-contrast stimuli and stimuli presented for less than 200 ms do not rival, which suggests that rivalry does not occur in the transient visual channel. Rivalry can occur between stimuli presented successively. Monocular rivalry can occur between orthogonal gratings presented to the same eye.
12.4 SPATIAL ZONES OF RIVALRY
12.4.1 ZONES OF EXCLUSIVE DOMINANCE
Rivalry occurs within discrete areal units, which increase in size with increasing eccentricity. Blake et al. (1992) referred to these units as spatial zones of binocular rivalry. Large displays, like those at the top of Figure 12.26, usually rival in a piecemeal fashion to produce a shifting mosaic of images from the two eyes. This is mosaic rivalry. When the image in one eye totally suppresses that in the other it is said to show exclusive dominance. Small displays, like those at the bottom of Figure 12.26, tend to show exclusive dominance. Blake et al. (1992) measured the percentage of viewing time that a pair of dichoptic orthogonal gratings (6 cpd) in
a circular patch showed exclusive rather than mosaic dominance, as a function of the size and eccentricity of the patch. They noted that the area within which rivalry was exclusive was similar to the zone of rivalry, indicated by the area over which suppression from a contour in one eye encroached on a noncontoured area or on neighboring contours in the other eye. At the fovea, the mean diameter of the largest patch that exhibited exclusive dominance 95% of the time was 8.1 arcmin. According to Schein and Monasterio (1987), this is close to the estimated size of a cortical hypercolumn in the monkey visual cortex (the region containing one complete set of ocular dominance columns). With appropriately scaled spatial frequency, the size of the patch that generated only exclusive rivalry increased with eccentricity in a manner similar to the increasing size of receptive fields and of the cortical magnification factor (the diameter of cortical tissue devoted to each degree of visual angle). The largest size for exclusive rivalry depends on image size, not perceived size. Thus, the limiting size for exclusive rivalry between a pair of orthogonal afterimages did not vary as the apparent size of the afterimages was varied by projecting them onto surfaces at different distances (Blake et al. 1974). Two low-contrast gratings showed longer periods of exclusive dominance than did two high-contrast gratings. For a given contrast, gratings with a spatial frequency of 3 cpd showed longer periods of exclusive dominance than did gratings with higher or lower spatial frequencies (Hollins 1980). Rivalry between random-dot patterns showed a similar dependency on spatial frequency (De Weert and Wade 1988) (Portrait Figure 12.27). In Hollins’s experiment, stimulus size was constant so that spatial frequency was confounded with the number of bars in the grating. O’Shea et al. (1997) confirmed Hollins’s results for a stimulus subtending 2°. 
However, with a 0.5° stimulus, exclusive rivalry peaked at about 4 cpd, and with a 4° stimulus, it peaked at about 1 cpd. Peak spatial frequency for exclusive dominance was inversely related to stimulus area.
Figure 12.26. Effects of area on rivalry. With large displays, the whole of one image is infrequently suppressed totally by the other image. Small displays tend to rival as a whole. (Adapted from Blake et al. 1992)
12.4.2 SPATIAL EXTENT OF THE ZONE OF RIVALRY
Kaufman (1963) measured the extent of the zone of rivalry around a dominant contour. He presented a pair of parallel thin lines to one eye and a single orthogonal line to the other eye. In dichoptic viewing the single line cut across the two parallel lines, and subjects reported when the part of the single line between the two parallel lines was totally suppressed. As the separation of the vertical parallel lines increased from 7 arcmin to 3.8°, the percentage of time for which the single line was visible between them increased exponentially from 3% to 40%. The increase was most rapid for separations of less than 15 arcmin. The zone of suppression in a vertical direction was smaller than that in a horizontal direction. Kaufman ascribed this to a greater instability of horizontal gaze than of vertical gaze.
Figure 12.27. Charles M. M. de Weert. Born in 1942 in Steenbergen, the Netherlands. He studied experimental physics in Utrecht and obtained a Ph.D. with Pim Levelt from Groningen and Nijmegen on binocular color mixing. Between 1987 and 2003 he was director of the Nijmegen Institute for Cognition and Information (NICI). He is now dean of the Faculty of Social Sciences at the University of Nijmegen.
Liu and Schor (1994) used Kaufman’s procedure with elongated DOG patches with a spatial-frequency bandwidth of 1.75 octaves and a variable mean spatial frequency. Two parallel patches that were either vertical or horizontal were presented to one eye for one second. A continuously visible single patch was presented to the other eye. This patch was orthogonal to and midway between the parallel patches. Subjects reported whether they saw part of the single patch between the two parallel patches. The size of the suppression zone decreased linearly with increasing spatial frequency, from about 150 arcmin at a spatial frequency of 1 cpd to about 15 arcmin at 10 cpd. At low spatial frequencies the rivalry zone was about twice as wide vertically as horizontally, at 4 cpd it was circular, and at high spatial frequencies it became a horizontal ellipse. Schor et al. (1984a) had found previously that Panum’s fusional area also decreased with increasing spatial frequency (Section 12.1.2), but the fusional area was much smaller than the zone of rivalry at all spatial frequencies. Liu and Schor (1994) found that the zone of rivalry increased rapidly with increasing contrast of Gaussian patches up to a contrast of about 30%, after which it increased less rapidly. The small zone of rivalry with stimuli of low contrast and high spatial frequency could explain why orthogonal gratings fuse rather than rival when they have low contrast and high spatial frequency (Section 12.3.2c). At scotopic levels of luminance, suppression spreads over larger areas of rivalrous gratings than at photopic levels
(O’Shea et al. 1988). Furthermore, the size of Panum’s fusional area was larger at scotopic levels. Both these effects could be due to the fact that, at scotopic levels, the excitatory regions of receptive fields of ganglion cells are larger because inhibition from the surround is weaker (Barlow et al. 1957). Spread of suppression also depends on figural features. For example, suppression spread further along contours than to locations disconnected from the suppressed figure. Suppression along a bar was arrested by a break in the bar unless the break was seen as due to occlusion (Maruya and Blake 2009). Orthogonal contours always rival while parallel contours with the same contrast polarity tend to fuse. One would therefore expect the zone of rivalry to be more extensive between orthogonal contours than between parallel contours. Using Kaufman’s procedure, Nichols and Wilson (2009) showed that the zone of rivalry spread further from two vertical parallel bars into a horizontal bar when gratings in the two bars were orthogonal compared with when the gratings were parallel.
12.4.3 INDEPENDENCE OF ZONES OF RIVALRY
As a general rule, when the two eyes are exposed to several distinct pairs of rivaling stimuli, the members of each pair rival independently rather than in synchrony. But there are exceptions to this rule. Connected patterns tend to appear and disappear as units. Also, a random mixture of red-green patches presented to one eye and the same pattern with reversed colors presented to the other eye appeared either all red or all green for much of the time (Kulikowski 1992). We will see in the next section that two contour segments forming a continuous cyclopean line or pattern tend to rival in synchrony, even though the segments are seen by different eyes.
Figure 12.28. Rivalry with a surround round one image. The diameter of each eye’s image subtended 3°. The white gap between the disk and annulus was varied between 10 and 40 arcmin. (Adapted from Fukuda and Blake 1992)
Fukuda and Blake (1992) claimed that a textured annulus placed round one of two small rivaling orthogonal gratings, as shown in Figure 12.28, increased the duration of exclusive dominance of the surrounded grating. The effect
of the annulus declined to zero as the gap between annulus and disk increased to about 40 arcmin. Also, the monocular annulus was more clearly visible when the grating it surrounded was fused with a similar grating in the other eye than when it was in rivalry with an orthogonal grating. The two rivalrous stimuli had well-defined contours, and the annular surround was visible for most of the time because it had no competing stimulus in the other eye. The effect could be explained by assuming that the dominant image of the annulus increases the dominance of the adjacent disk in the same eye. It is as if the visual system regards the annulus and the enclosed disk as belonging to the same surface. A related question is whether similarly oriented stimuli rival in synchrony. Alais and Blake (1999) used stimuli like those in Figure 12.29. The two images in one eye that contained collinear gratings, as in (A), tended to come into dominance simultaneously over random-dot images in the other eye. Synchrony of dominance was less evident with parallel gratings (B) and less evident still with orthogonal gratings (C). Also, gratings in which contrast was modulated in synchrony showed higher synchrony of dominance than gratings modulated in counterphase. Synchrony of dominance declined as the distance between the two rivaling patches was increased from 1 to 3°. The tendency for synchronous images to rival in synchrony is probably due to lateral connections between cortical areas responding to similar synchronized stimuli (Section 5.5.6). Alais et al. (2006) presented two Gabor patches with variable spatial frequency to one eye superimposed on random-dot patches in the other eye. Over a 60-s period, subjects indicated when each Gabor was visible. They called
the lateral separation over which the two collinear Gabors tended to rival together the association field, which they related to association fields that describe how people perceive connections between contours (see Section 4.5.2d). The size of the association field decreased monotonically as the separation between the Gabors was increased up to 10 times the wavelength of the carrier waveform. Also, the association field decreased as the Gabors departed from being collinear. With orthogonal Gabors, the association field was about half as wide as with aligned Gabors. The mean duration of dominance was not affected by these changes. The association field was smaller for Gabors presented in opposite retinal hemifields than for those presented in the same hemifield. The boundaries of rivaling subregions within a large display were not determined by the boundaries of meaningful units such as words (Blake and Overton 1979). The influence of meaning on binocular rivalry is discussed in Section 12.8.3. Another question is whether the dominance of a figure in rivalry is enhanced when it is perceived as belonging to neighboring figures. Sobel and Blake (2002) approached this question by using the displays depicted in Figure 12.30. A rotating radial pattern was superimposed on a drifting grating in the other eye. The duration of dominance of the drifting grating increased when it was one of four drifting gratings that together created an impression of coherent motion, as in Figure 12.30B. The other gratings had no effect when their motions were not coherent, as in Figure 12.30C. The context prolonged the dominance phase of a given stimulus but had no effect on the duration of the suppression phase.
Figure 12.29. Relative orientation and synchrony of rivalry. (A) Collinear images tend to rival in synchrony more than parallel but not aligned images (B) or orthogonal images (C). (Adapted from Alais and Blake 1999)
12.4.4 EYE RIVALRY VERSUS STIMULUS RIVALRY
12.4.4a Eye Versus Stimulus Exclusive Rivalry
It has been generally assumed that rivalry occurs between the eyes or independently between distinct regions of the two eyes. This is eye rivalry. An alternative view is that rivalry preserves a coherent pattern even when rivaling patterns are interchanged between the eyes. This is stimulus rivalry. The simplest theory is that eye rivalry occurs early in the visual system while stimulus rivalry occurs at higher levels of visual processing. But we will see that this simple theory has not been firmly established.
Figure 12.30. Effect of motion coherence on rivalry. (A) The rotating radial pattern rivals with the drifting grating. (B) The grating dominates for longer when it is part of a coherently moving display. (C) Dominance of the moving grating is not affected when it is part of an incoherently moving display. (Redrawn from Sobel and Blake 2002, Pion Limited, London)
Blake et al. (1980b) developed an eye-swap procedure to investigate this question. The subject views rivalrous patterns and presses a key when one of them is totally dominant. The patterns are then smoothly reduced in contrast, interchanged between the eyes, and smoothly restored in contrast. Subjects nearly always reported that when horizontal and vertical rival patterns were interchanged there was no change in ocular dominance. If the dominant eye
saw the horizontal grating before the interchange it saw the vertical pattern after the interchange. Thus, eye dominance was preserved at the expense of not preserving a given pattern. Wolfe (1984) presented a vertical grating to one eye for 1 to 2 seconds. About 100 ms after it was removed a flashed vertical grating was presented to the same eye and a flashed horizontal grating to the other eye. Subjects saw only the horizontal grating, which indicates that dominance had switched to the other eye even though this did not preserve figural continuity. Unfortunately, the test gratings were not presented the other way round. Logothetis et al. (1996) produced evidence that, they claimed, shows that rivalry can be determined by competition between stimuli rather than by competition between eyes. Oblique orthogonal gratings flickering at 18 Hz were interchanged between the two eyes three times per second. Subjects tracked changes in rivalry by pressing keys. They mostly perceived rivalry between the gratings, with dominance periods lasting several seconds, in much the same way as when each grating was presented continuously to the
same eye. At higher contrasts, two of the six subjects saw the effect reported by Blake et al. Logothetis et al. concluded that, for most subjects, the gratings competed for dominance independently of the eye to which they were presented so that patterns rivaled rather than eyes. In other words, they concluded that stimulus rivalry is stronger than eye rivalry. But these effects may also be interpreted in terms of eye rivalry. In the period in which one grating was dominant, eye dominance switched three times per second so that there was continuity in the perceived orientation of the grating. Thus, the eyes rivaled, but the rivalry was entrained by the alternation of stimuli between the two eyes so as to preserve continuity of perceived form. Every few seconds subjects switched to seeing the other pattern either because of a breakdown in the entrainment process or because of rivalry between orientation detection systems. Lee and Blake (1999) asked subjects to describe the frequency of rivalry rather than press keys, because it is not easy to track rapid changes in rivalry with a key. They obtained stimulus rivalry only when the orthogonal gratings were flickered at 18 Hz and abruptly interchanged between the eyes, as in the Logothetis et al. experiment. These are the conditions under which eye rivalry is entrained by transient signals. Mostly eye rivalry occurred when the gratings were not flickering or were interchanged gradually. Logothetis has now agreed that both eye rivalry and pattern rivalry occur depending on the conditions. Silver and Logothetis (2007) tagged changes in eye-of-origin or changes in the stimulus pattern by differential rates of flicker. Frequency-tagging eye-of-origin inputs enhanced eye rivalry, while tagging the patterns enhanced pattern rivalry. In a similar experiment, Knapen et al. (2007) found that simultaneous flicker of rivalrous stimuli at between 10 and 25 Hz facilitated pattern rivalry. 
However, flickering the stimuli in counterphase reduced the incidence of pattern rivalry and increased the incidence of eye rivalry. These procedures provide a way to separate low-level eye rivalry from high-level pattern rivalry. Other instances in which rivalry is entrained by the nature of the stimuli are discussed in Section 12.8.3. Nguyen et al. (2001) asked subjects to indicate the location of a brief probe stimulus presented to an eye in the suppressed phase of binocular rivalry produced by orthogonal gratings. Performance was not affected by whether the probe and the grating in the suppressed eye were similar or not similar in color, spatial frequency, or orientation. This supports the idea of eye suppression rather than that of stimulus-specific suppression, at least at the level of simple visual features. We will now see that the situation is different for high-level features. Face recognition depends on the fusiform area, while detection of global motion depends on MT and MST. Alais and Parker (2006) used rivaling faces or an expanding dot
pattern to one eye and a contracting pattern to the other eye. Sensitivity to a brief face probe was severely reduced when it was imposed on a suppressed face. Similarly, sensitivity to a brief global motion probe was reduced when it was imposed on a suppressed global motion stimulus. However, sensitivity to a face probe was not reduced when it was imposed on a suppressed grating and sensitivity to a global-motion probe was not reduced when it was imposed on a suppressed face. Alais and Parker concluded that rivalry suppression is localized to those regions that process particular higher-order stimuli. But these effects may not depend on high-level processes. A face probe on a face stimulus and a motion probe on a motion stimulus do not involve much change at any level of processing and will therefore not break the suppression. A face probe on a motion stimulus and a motion probe on a face stimulus involve large low-level and high-level changes, which will more effectively break suppression and allow the probe to be detected.
12.4.4b Interocular Stimulus Grouping
The above experiments involved exclusive rivalry, in which the whole of one pattern or the whole of another is seen. A modified version of the stimulus rivalry theory is that coherent patterns are preserved in rivalry even though part of the pattern is in one eye and part in the other eye. Effects of this kind have been called interocular grouping. They were first described by Diaz-Caneja (1928; translation by Alais et al. 2000). They have also been described by Whittle et al. (1968), Kovács et al. (1996), Ngo et al. (2000), and de Weert et al. (2005). There are three ways to account for preservation of a figure that is composed of rivaling parts.
1. Propagation of mosaic eye rivalry over contours. Rivalry could propagate along continuous contours that span regions of local eye rivalry. This would tend to preserve figural coherence even for meaningless figures.
2. High-level modulation of mosaic eye rivalry. Local mosaic eye rivalry could be modulated by feedback from a higher center where coherent figures are recognized. This could involve complex factors of familiarity in addition to contour continuity.
3. Deferment of rivalry to higher centers. Instead of occurring between local regions in the visual cortex, rivalry could be deferred to the stage where coherent patterns are recognized. Information from each region in both eyes would be passed to the higher center, where rivalry would be determined only by figural coherence and familiarity.
On any theory one would expect that coherent patterns would be more likely to be preserved in rivalry than
noncoherent patterns. Bonneh et al. (2001b) alternated arrays of orthogonal Gabor patches between the two eyes. Displays with patches that were closer, larger, and more uniform in orientation showed pattern rivalry. Those with small widely spaced patches with less uniform orientation showed eye rivalry. This means that, with patterns with low spatial coherence, distinct regions of the binocular field rival independently while, with patterns with high coherence, rivaling regions interact so as to preserve pattern coherence. Suzuki and Grabowecky (2002) used stimuli that produced coherent patterns both when one or the other eye was dominant (single-eye dominance) and when parts from each eye were combined (mixed-eye dominance), as shown in Figure 12.31. They found that periods of single-eye dominance alternated with periods of mixed-eye dominance. In both periods, the rivaling patterns were symmetrical. It seems that other possible combinations of the images in the two eyes did not occur, presumably because they would not be symmetrical. Silver and Logothetis (2004) presented an array of white dots on a gray background to one eye and an array of black dots on a gray background to the other eye. The two arrays were offset slightly so that dots did not fall on corresponding points. Subjects reported seeing one or other complete array of dots of one contrast for 10.7% of the 60-s viewing time. Any such percept would be very unlikely if the dots rivaled independently. However, the results could perhaps have been due to a tendency for the whole of one eye’s image to dominate the other eye’s image, irrespective of the contents of the images. Ooi and He (2003) used a display that created the impression of a transparent square superimposed on four disks only when the four corners of the square had the same color. 
When similar displays were presented to each eye, subjects tended to see a transparent square with all corners in the same color, even though alternate corners in each eye were red and blue. The impression of a same-colored square was dominant for longer than the impression of a mixed colored square, especially when one eye saw an all-red square and the other eye saw an all-blue square. In all the above studies, pattern elements fell on or near corresponding retinal locations in the two eyes. There was thus local rivalry between local elements and global rivalry between larger coherent patterns. Watson et al. (2004) presented a pattern of moving points of light that created the impression of a person walking. The two eyes were presented with different patterns in different colors. The two patterns were displaced vertically so that pairs of points did not fall on corresponding points. There was thus no local rivalry. Nevertheless, the patterns rivaled, and 55% of the time the whole of one or the other pattern was seen. There was little rivalry when parts of each pattern were presented to each eye or when they were both presented to one eye. In these cases, both patterns were seen at the same time.
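The argument about the dot arrays of Silver and Logothetis (2004) can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that each dot location is dominated by one eye or the other independently and with equal probability, and that the two arrays together contain 20 dot locations. Neither the dot count nor the equal-probability assumption comes from the study, and the function name is hypothetical.

```python
def prob_uniform_percept(n_locations, p_left=0.5):
    """Probability that every location is dominated by the same eye,
    assuming the locations rival independently (illustrative assumption)."""
    p_right = 1.0 - p_left
    return p_left ** n_locations + p_right ** n_locations

# Hypothetical display size; the actual number of dots is not stated in the text.
N_LOCATIONS = 20
p_chance = prob_uniform_percept(N_LOCATIONS)
observed = 0.107  # fraction of viewing time reported by Silver and Logothetis (2004)

print(f"independent rivalry predicts {p_chance:.2e}")
print(f"reported fraction is {observed / p_chance:.0f} times larger")
```

Under these assumptions, the chance of a complete single-contrast array falls off exponentially with the number of locations, so even a modest array makes the reported 10.7% far larger than independent local rivalry would predict.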
Figure 12.31. Effects of symmetry and continuity on rivalry. The stimuli in the two eyes can rival so that one eye or the other is dominant (single-eye dominance) or so that part of the image in one eye is combined with part of the image in the other eye (mixed-eye dominance). The mixed-eye dominance shown here is one that produces symmetrical patterns. Suzuki and Grabowecky (2002) claimed that other patterns do not occur because they lack symmetry or contour continuity. [Panels: stimuli for the right and left eyes; single-eye dominance; mixed-eye dominance (L-half of L-eye + R-half of R-eye, or R-half of L-eye + L-half of R-eye).]
This suggests that the rivalry was basically between the eyes but that its occurrence depended on feedback from a high-level mechanism that recognized the figural coherence of each pattern. Lee and Blake (2004) extended the eye-swap procedure to investigate rivalry between local regions of complementary montages of complex meaningful patterns. They used the head of an ape and printed text, as used by Kovács et al. (1996). The results supported the idea that rivalry in each local region is fundamentally between the eyes, but that figural continuity tended to be preserved between rivaling regions. This implies feedback from a higher center.
On balance, the evidence suggests that rivalry is initially between eyes within each local region but that, under certain circumstances, figural coherence detected at a higher level favors dominance of a pattern that is a composite of elements presented to the two eyes (Pearson and Clifford 2004). A model of multistage rivalry is presented in Section 12.10. This conclusion supports the view that rivalry is designed to preserve the visibility of the most salient features of those parts of the visual scene that fall outside the horopter (Section 12.3.1c).
12.5 INTERACTIONS BETWEEN DOMINANT AND SUPPRESSED IMAGES
According to one view, binocular suppression affects all stimuli and all stimulus features within the suppressed region (Fox and Check 1966a). Although all features of the suppressed image are removed from consciousness, there is considerable evidence that some features are suppressed more than others and that some suppressed features may interact with the dominant stimulus.
12.5.1 PUPIL AND ACCOMMODATION RESPONSES DURING RIVALRY
The pupillary reflex is controlled by visual signals that feed directly into the subcortical pretectal and Edinger-Westphal nuclei. One would therefore not expect the pupillary reflex to be affected by binocular rivalry. Nevertheless, it has been claimed that a flash of light presented to an eye in its suppressed phase of binocular rivalry is less likely to produce a pupillary response than one presented to the eye in its dominant phase (Bárány and Halldén 1948; Lorber et al. 1965; Richards 1966). Lowe and Ogle (1966) could not replicate this effect when the images in the two eyes were equal in luminance. However, when the images of rivalrous concentric rings differed in luminance, a small constriction of the pupil occurred as the brighter field came into dominance. Constriction increased as the dominant image was made brighter. Bradshaw (1969) illuminated one eye with a bright patch and the other eye with a less bright patch. When one or other eye was brought into dominance by the addition of a lattice of lines to the patch, the pupillary response was not affected. Any effect of rivalry on the pupillary response must involve centrifugal pathways from the visual cortex to the subcortical centers controlling the pupil. When accommodation demand differs between the two eyes, the accommodative response represents the vector average of the inputs (Flitcroft et al. 1992). However, when
the eyes are presented with orthogonal gratings that differ in contrast, the response at any time is determined by whichever grating is dominant (Flitcroft and Morley 1997).
12.5.2 THRESHOLD SUMMATION OF RIVALING STIMULI
A small near-threshold test flash superimposed on a grating presented to one eye was detected more frequently when it was accompanied by a similar test flash presented to a corresponding region in the dark field of the other eye (Westendorf et al. 1982). It was concluded that a stimulus in a suppressed eye can summate binocularly with one in a dominant eye. But the dark field of one eye may not have been suppressed on every trial, and the flash on the dark field may therefore have increased the probability of detection even if the two flashes did not summate.
12.5.3 EFFECTS OF CHANGING THE SUPPRESSED IMAGE
When a suppressed image is changed in certain ways, the suppressed phase may rapidly terminate. Blake and Fox (1974a) presented a horizontal grating to the right eye and a vertical grating to the left eye. The horizontal grating had high contrast and was counterphase modulated at 4 Hz, whereas the vertical grating had low contrast and was not modulated. The horizontal grating suppressed the vertical grating. The suppressed phase was terminated by an increase in the contrast of the suppressed image but not by a decrease in its contrast. Changes in the spatial frequency or orientation of the suppressed image did not terminate the suppressed phase. Blake and Fox concluded that only energy increments terminate the suppressive process. This evidence is not conclusive because the other imposed changes may have been insufficient to overcome the flicker and larger contrast in the dominant image.
Walker and Powell (1979) repeated the experiment with the left and right images equally dominant. They found that changes in the contrast, spatial phase, or spatial frequency of the suppressed image terminated the suppressed phase. Changes in spatial frequency and orientation involve local changes in stimulus energy and transient signals. O’Shea and Crassini (1981a) found that termination of suppression required a change in orientation of a suppressed image of at least 20°. Larger changes in orientation could involve stronger transient signals than small changes in orientation. One cannot conclude that spatial frequency and orientation are analyzed as such in the suppressed image or that suppression acts selectively with respect to orientation and spatial frequency.
Freeman and Nguyen (2001) devised a procedure for experimentally controlling binocular rivalry. One eye saw a stationary horizontal grating. The other eye saw a horizontal grating superimposed on a vertical grating. These two
gratings were modulated in contrast in antiphase. As the vertical grating came into view, the stationary horizontal grating in the other eye was suppressed. As the horizontal grating came into view, suppression ceased. They tracked the cyclic suppression of the stationary grating by measuring the visibility of a test spot superimposed on it.
12.5.4 MOVEMENT SIGNALS FROM SUPPRESSED IMAGES
12.5.4a Apparent Movement from Suppressed Images
There has been some dispute about whether rivalry suppresses the generation of apparent movement. Ramachandran (1975) found that a spot superimposed on a suppressed pattern did not generate apparent motion with respect to a nearby spot presented sequentially on the dominant pattern in the other eye. On the other hand, Wiesenfelder and Blake (1991) found apparent motion under these conditions, although it was not as clear as when both spots were viewed normally. Wiesenfelder and Blake concluded that the motion signal is weakened but not eliminated by suppression.
Shadlen and Carney (1986) obtained dichoptic apparent movement by presenting the same flickering gratings to the two eyes but with a 90° spatial and temporal phase shift (Section 16.5.1). Carney et al. (1987) presented a yellow and black grating to one eye and an equiluminant red and green grating to the other. When the two gratings flickered with a 90° spatial and temporal phase shift, apparent motion was seen in a direction consistent with the luminance component of the alternating gratings. But color rivalry was seen at the same time. They concluded that rivalry might occur in the color channel while motion is seen in the luminance channel. This is similar to the argument used by Ramachandran and Sriram (1972) to account for the simultaneous occurrence of stereopsis and color rivalry in a random-dot stereogram. To account for these effects, it may not be necessary to assume that the color-rivalry and luminance-motion channels are activated at the same spatial location. Color rivalry may have occurred locally within areas between the contours that defined the apparent movement. In other words, the two effects may have been segregated spatially rather than by parallel visual channels serving different features in the same location.
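The 90° spatial and temporal phase shift used by Shadlen and Carney (1986) can be illustrated with a short calculation. In the sketch below, each eye's flickering grating is modeled as a counterphase (standing-wave) grating and the two are simply added. Linear summation, the function names, and the parameter values are assumptions made for illustration, not a model of binocular combination taken from the study. The sum of two standing waves in spatial and temporal quadrature is a single drifting grating, which is why neither monocular image contains net motion while their combination does.

```python
import math

def counterphase(x, t, k, w, spatial_shift=0.0, temporal_shift=0.0):
    """Luminance modulation of a counterphase-flickering grating.
    k: spatial frequency (rad/deg), w: temporal frequency (rad/s)."""
    return math.sin(k * x + spatial_shift) * math.cos(w * t + temporal_shift)

def dichoptic_sum(x, t, k, w):
    """Linear sum of the two monocular gratings (illustrative assumption)."""
    left = counterphase(x, t, k, w)
    right = counterphase(x, t, k, w, math.pi / 2, math.pi / 2)  # 90 deg shifts
    return left + right

def drifting(x, t, k, w):
    """A grating drifting in the +x direction."""
    return math.sin(k * x - w * t)

# Hypothetical values: 1 cycle/deg and 4 Hz.
K, W = 2 * math.pi * 1.0, 2 * math.pi * 4.0
max_error = max(
    abs(dichoptic_sum(x / 10, t / 100, K, W) - drifting(x / 10, t / 100, K, W))
    for x in range(20) for t in range(25)
)
print(f"largest difference from a drifting grating: {max_error:.1e}")
```

Reversing the sign of either phase shift reverses the direction of drift in this sketch.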
12.5.4b Combining Orthogonal Dichoptic Motions
In Section 12.3.6 it was described how moving orthogonal gratings appear to rival when presented dichoptically. However, the mosaic patches of the rivaling gratings appear to move in an intermediate direction, like gratings
superimposed in the same eye. Intermediate coherent motion was evident with rivalrous gratings that were too small to generate a mosaic of rivalry (Andrews and Blakemore 1999, 2002). This suggests that the motion signal from an otherwise suppressed grating can influence the perceived direction of motion of a dominant grating in the other eye.
12.5.4c Eye Movements Evoked by a Suppressed Image
Optokinetic eye movements (OKN) are evoked by motion of the whole visual scene. In animals and humans lacking stereoscopic vision, OKN for each eye occurs only in response to retinal motion in a nasal direction. This directionally biased response is mediated by direct visual inputs to the subcortical pretectum, as explained in Section 22.6.1. Therefore, these subcortical inputs should not be subject to binocular rivalry based on motion signals. Brief retinal motion of a grating in its suppressed phase initiated OKN but mostly only in the nasal direction (Zhu et al. 2008). The response was weaker and had longer latency than that evoked by a grating in its dominant phase. This suggests that motion signals from a suppressed image reach the subcortical nuclei responsible for directionally preponderant OKN but that they do not reach the cortical mechanism responsible for evoking symmetrical OKN.
12.5.5 VISUAL BEATS FROM A SUPPRESSED IMAGE
Carlson and He (2000) presented a red triangle facing left to one eye and a green triangle facing right to the other eye. At the same time, the luminances of the two stimuli were modulated sinusoidally at a slightly different temporal frequency. Out-of-phase dichoptic flicker produced a sensation of flicker at the difference frequency (visual beats). Subjects observed rivalry between the form and color of the stimuli but continued to see the visual beats. Thus, a flicker signal penetrated the mechanism that suppressed shape and color. Carlson and He suggested that rivalry occurs in the parvocellular system that carries color and shape signals but that flicker and high-velocity motion signals are carried by the magnocellular system and survive suppression. This separation between parvocellular and magnocellular channels could arise in the retina and depend on the spatial separation between ganglion cells serving the two systems.
12.5.6 DETECTION OF DICHOPTICALLY CANCELED TEXTURES
Kolb and Braun (1995) constructed the displays shown in Figure 12.32A. All texture elements in the two images are orthogonal but in one target region the orientation of the elements is complementary to the orientation of elements in the rest of the display. The target region was clearly visible in each monocular image. However, subjects saw a uniformly textured display of crosses when the images were combined and presented briefly to prevent changes in vergence. Thus, there was dichoptic cancellation of monocular texture boundaries. Nevertheless, in a forced-choice procedure, subjects could report the position of the target region at above chance levels. Kolb and Braun concluded that this dissociation between awareness and performance is analogous to blindsight in which patients with cortical damage can discriminate visual stimuli even though they are not conscious of seeing (Section 5.5.7). However, Kolb and Braun’s subjects may have applied a stricter criterion in reporting texture boundaries than they applied in the forced-choice detection task (Morgan et al. 1997). Also, a slight misalignment of the eyes would produce distinct asymmetrical crosses in the target region and the surround, as shown in Figure 12.32B. A slight tilt of the head is sufficient to misalign the images.
Figure 12.32. Dichoptic cancellation of texture boundaries. (A) When briefly exposed to corresponding regions of the two eyes, the whole display appears composed of Xs. (Adapted from Kolb and Braun 1995) (B) The images in (A) produce identical crosses only when in binocular register, as on the left. When the images are out of register, the crosses differ in the two complementary regions, as on the right. A slight tilt of the head brings the images in (A) out of register.
Solomon and Morgan (1999) presented to each eye alternating rows or alternating columns of orthogonal oblique Gabor patches. To each eye the display appeared as a horizontal or vertical texture-defined grating. Each patch in one eye was orthogonal to that in the other eye. At an exposure of 100 ms the superimposed Gabors appeared as crosses. Nevertheless, subjects could detect the orientation of the dichoptically combined gratings. Solomon and Morgan concluded that monocular inputs are available to a mechanism that computes texture boundaries based on orientation of texture elements. However, this conclusion must be questioned, because a small misalignment of the eyes would produce asymmetries in the dichoptic crosses that would reveal the vertical or horizontal gratings. When Solomon and Morgan moved the horizontal texture gratings vertically, subjects could detect the direction of motion in the monocular image but not in the dichoptic image, even with exposures of 350 ms. They concluded that the visual system does not have access to the monocular motion signals in a dichoptic display in which texture boundaries are canceled. This is complementary to the finding that the visual system does not have access to binocular motion signals when the motion is not visible monocularly (see Section 16.5.1). Spurious signals generated by vergence instability would not operate with a moving display, and this may be why static dichoptic textured regions are visible with 100 ms exposure, but moving ones are not.
Color rivalry is produced when a red-on-green shape is presented to one eye and the same shape in green-on-red is presented to the other eye. However, if the dichoptic stimuli are presented for less than about 100 ms they fuse into a homogeneous color and the shape is no longer visible. Moutoussis and Zeki (2002) used the fMRI to enquire whether an invisible shape of this kind activates the same visual centers as a visible shape created by presenting identical stimuli to the two eyes. In general, they found that an invisible face or house activated the same stimulus-specific areas as a visible face or house. However, in both cases, the invisible stimuli produced weaker responses than visible stimuli. Thus, higher visual centers responded in a stimulus-specific way to invisible stimuli but at a level below that required for conscious report.
12.6 EFFECTS FROM SUPPRESSED IMAGES
Binocular suppression eliminates conscious awareness of suppressed stimuli. However, since binocular suppression depends on the relative positions, orientations, and contrasts of images in the two eyes, it is to be expected that some processing of these simple features precedes suppression. The evidence reviewed in this section shows that certain aftereffects that depend on the spatial frequency, orientation, and motion of stimuli become evident even though the induction stimulus had been suppressed. There are three ways to account for these effects.
1. Early interaction. Dichoptic stimuli could interact before they rival. If they did, effects produced by a suppressed image should be as strong as those produced by a nonsuppressed image.
2. Distinct rivalry channels. Stimulus interactions could occur in parallel channels distinct from the channel in which perceptual rivalry occurs.
3. Incomplete suppression. Although a suppressed image is below the level of conscious awareness there could be sufficient signal strength to induce an aftereffect in a subsequently seen stimulus. According to this view, aftereffects produced by a suppressed image should be weaker than those produced by a nonsuppressed image.
12.6.1 THRESHOLD ELEVATION BY A SUPPRESSED IMAGE
Blake and Fox (1974b) found that a threshold-elevation aftereffect could be created by a suppressed stimulus. They used the stimulus shown in Figure 12.33. In the first condition, the adaptation grating was presented continuously or intermittently to one eye for 30 s, with no stimulus in the other eye. After adaptation, subjects adjusted the contrast of a newly exposed comparison grating in an analyzing position in the same eye to match the apparent contrast of the adaptation grating. The adaptation grating showed the familiar loss of apparent contrast, which was greater for continuous exposure than for intermittent exposure. In the second condition, the adaptation grating was suppressed for at least half the 30-s adaptation period by a rival grating presented to the other eye. The magnitude of the aftereffect was the same in the two conditions, showing that adaptation occurred even when the adaptation grating was suppressed. For two strabismic subjects who could suppress the adaptation grating for the full 30 s, the adaptation effect was as strong as when the grating was visible for the whole period (Blake and Lehmkuhle 1976).
Figure 12.33. Threshold elevation from a suppressed image. Stimuli used by Blake and Fox (1974b) to reveal that a suppressed grating may generate a threshold-elevation effect. While adapting, subjects fixated within the horizontal bar and the left-eye grating was suppressed by the right-eye grating. In testing, the right-eye grating was removed and subjects matched the contrast of a comparison grating in the left eye to that of the test grating. [Panel labels: left-eye image (test and adaptation grating, comparison grating); right-eye image (suppressor grating); scale bar 1°.]
12.6.2 SPATIAL-FREQUENCY AFTEREFFECT FROM A SUPPRESSED IMAGE
In the spatial-frequency aftereffect the perceived spatial frequency of a test grating shifts away from that of an induction stimulus with a higher or lower spatial frequency. Blake and Fox (1974b) found that the aftereffect was at full strength when the induction grating was suppressed for a good part of the induction period. They concluded that binocular suppression occurs at a site after that responsible for contrast adaptation and the spatial-frequency aftereffect. However, they measured these aftereffects only in the previously suppressed eye, and the results could be due to adaptation of cortical neurons that receive inputs from only one eye, which are therefore not affected by suppression. Perhaps the contrast of the adaptation grating was reduced by the afterimage of the suppressor grating. On the other hand, Crabus and Stadler (1973) had found no evidence of the figural aftereffect (apparent repulsion of a test line away from the location of a previously inspected line) when the monocular induction stimulus occurred wholly in periods of binocular suppression. Blake and Overton (1979) overcame the problem of unsuppressed monocular neurons by showing that the threshold-elevation aftereffect in the eye opposite to that exposed to the induction stimulus was also not weakened when the induction stimulus was suppressed for a substantial part of the induction period. They also found that adaptation to a grating presented to one eye lessened the dominance of that grating when it was pitted against an orthogonal grating in the other eye. These results strengthen the conclusion that processes responsible for contrast adaptation and the spatial-frequency aftereffect occur before binocular suppression. Blake and Overton were convinced that contrast adaptation and the spatial-frequency aftereffect are cortical and therefore occur at least in area V1. Since the aftereffects preceded binocular suppression, they concluded that binocular suppression occurs beyond V1. 
They agreed that rivalry between gratings must be processed fairly early, because it is not a cognitive process and does not occur in meaningful units (Section 12.8.3). But let us examine the claim that binocular suppression occurs beyond V1. Both contrast and spatial frequency are, at least to some extent, coded at the retinal level—contrast by lateral inhibitory processes and spatial frequency in terms of receptive fields of different sizes. The two aftereffects may therefore arise in the retina or LGN. If they do, the argument for placing binocular suppression beyond V1 collapses. The belief that spatial-frequency aftereffects are cortical is based on the fact that they show interocular transfer. This, in itself, is not a convincing argument. An afterimage shows interocular transfer, in the sense that an afterimage impressed on
one eye is visible when that eye is closed and the other eye opened. This does not prove that afterimages are cortical in origin, but simply that activity arising in a closed eye still reaches the visual cortex. An afterimage is no longer visible when the eye in which it was formed is pressure paralyzed (Oswald 1957). Thus, a cortical aftereffect is one that survives pressure blinding the eye to which the induction stimulus was presented. The threshold-elevation effect survived pressure blinding (Blake and Fox 1972). But this still leaves open the possibility of LGN involvement. Spatial-frequency aftereffects seem to be cortical, because a period of exposure to a grating of a given spatial frequency and orientation reduced the firing rates of neurons in the visual cortex but not of neurons in the LGN (Movshon and Lennie 1979). Even if contrast and spatial-frequency aftereffects are cortical and precede binocular suppression, we still do not have to conclude that the site of rivalry is beyond V1. The processes responsible for contrast and spatial-frequency coding and those responsible for the combination of binocular inputs could occur sequentially within V1.
The square-wave illusion provides an example of spatial-phase adaptation (Leguire et al. 1982). When a grating with a triangular luminance profile is viewed for some time it begins to look like a square-wave grating, namely, a grating with the same sine wave components but in antiphase rather than in phase. Binocular suppression of the induction stimulus severely reduced the illusion (Blake and Bravo 1985). It looks as though complex effects that depend on interactions between several spatial frequencies occur after suppression.
12.6.3 TILT AFTEREFFECT FROM A SUPPRESSED IMAGE
A disk of vertical lines appears tilted clockwise when surrounded by an annulus of lines tilted a few degrees counterclockwise. This is simultaneous tilt contrast. There is conflicting evidence as to whether tilt contrast persists when the induction annulus is suppressed by a stimulus presented to the other eye. Rao (1977) used a high-contrast horizontal grating to suppress a low-contrast annulus containing lines tilted 10°. Tilt contrast in the central disk occurred only during periods when the tilted induction annulus was not suppressed. Wade (1980) used a similar stimulus and found that tilt contrast occurred during periods when the induction stimulus was suppressed. However, Wade did not use such a strong rivalrous stimulus so that parts of the induction stimulus may have survived suppression. Pearson and Clifford (2005) presented a central vertical grating to one eye and an annular tilted grating to the other eye. Tilt contrast was still evident when the annulus was suppressed by static noise. However, the effect required much more contrast in the suppressed induction stimulus
compared with when the induction stimulus was dominant during the inspection period. Vertical lines appear tilted clockwise after lines tilted counterclockwise in the same location have been inspected for some time. This is the tilt aftereffect. The tilt aftereffect was not weakened when the tilted induction stimulus was suppressed for a good part of the induction period by a rival horizontal grating presented to the other eye (Wade and Wenderoth 1978). The tilt aftereffect must be cortical because cells in precortical sites have little or no orientation tuning (Section 5.2.2c). The McCollough effect, described in Section 13.3.5, occurred at full strength when the induction stimulus was perceptually suppressed for much of the induction period by a rival stimulus presented to the other eye (White et al. 1978).
12.6.4 MOTION AFTEREFFECT FROM A SUPPRESSED IMAGE
It takes a person longer to react to a moving stimulus that starts to move while it is suppressed than to react to one that starts to move in a nonsuppressed period. Presumably, the suppressed image must become dominant before its motion can be detected (Fox and Check 1968). But we will now see that, even though motion remains unavailable to consciousness during suppression, some processing of motion signals occurs during the stage of suppression.
When a moving textured display is viewed for some time, a stationary display in the same location appears to move in the opposite direction. This is the motion aftereffect. The duration of the aftereffect increases with the duration of the induction stimulus. The duration of the aftereffect in a test grating viewed by one eye was not reduced when the moving induction stimulus in that eye was suppressed for most of the inspection period by a stationary stimulus presented to the other eye (Lehmkuhle and Fox 1975). Also, interocular transfer of the motion aftereffect was the same when the induction stimulus was suppressed by rivalry as when it was visible for the whole inspection period (O’Shea and Crassini 1981b). Thus, some processing of visual motion must survive suppression.
On the other hand, the motion aftereffect produced by a rotating display or by a rotating spiral was attenuated when the induction stimulus was suppressed (Lack 1974; Wiesenfelder and Blake 1990). Also, the motion aftereffect induced by coherent motion of a plaid was reduced when the induction stimulus was exposed to binocular rivalry (van der Zwan et al. 1993). The visual processes for detecting rotary, spiral, and plaid motions are more complex than those for detecting simple linear motion, and may occur in MT (Section 5.8.3d). This evidence suggests that binocular suppression occurs after the site of simple motion detection but before the site where complex motion is detected.
The motion aftereffect lasts longer when a dark interval is interspersed between the induction and test stimuli. Thus a stationary test stimulus drains the motion aftereffect, presumably because it readapts the visual system to its normal state. In the dark, the visual system returns to its normal state more slowly. The motion aftereffect in an adapted eye also lasted longer when the stationary test stimulus in the adapted eye was suppressed by a rival stimulus in the other eye (Wiesenfelder and Blake 1992). Since a motion aftereffect can be induced by a suppressed stimulus, the inability of a suppressed stationary test stimulus to readapt the visual system suggests that induction of the motion aftereffect occurs at an earlier site than its decay. But perhaps the ineffectiveness of a stationary test stimulus is due to its being stationary. Perhaps a suppressed stimulus moving in another direction would cause a motion aftereffect to decay.
12.6.4a Summary
It seems that simple aftereffects such as the threshold-elevation effect, the spatial-frequency aftereffect, and the motion aftereffect are produced at full strength by a suppressed image. Other aftereffects, such as the tilt aftereffect, occur in weakened form when produced by suppressed stimuli, especially when the contrast of the suppressing stimulus is higher than that of the suppressed stimulus. This suggests that these aftereffects depend, at least to some extent, on processes occurring before those responsible for binocular rivalry. Complex aftereffects, such as the spiral aftereffect and the square-wave illusion are not produced by suppressed images. These effects must occur beyond the level where rivalry occurs.
12.7 RIVALRY, FUSION, AND STEREOPSIS
Galen’s idea that images from the two eyes fuse in the optic chiasm held sway for about 1,500 years and was replaced in the 17th century by the view that fusion occurs in the brain (Section 2.10.3). After 1838, when Wheatstone proved that binocular disparity forms the basis of stereoscopic vision, the problem arose of reconciling the fact that similar images fuse with the fact that image differences are used for stereopsis. If images were truly fused their separate identities and disparity information would be lost. A related problem is that some images fuse or remain diplopic and create depth, while other images rival and do not produce depth. Four basic accounts of the relationship between binocular fusion, disparity detection, and binocular rivalry have been proposed. They will be referred to here as the mental theory, the suppression theory, the two-channel theory, and the dual-response theory.
STEREOSCOPIC VISION
12.7.1 THE MENTAL THEORY OF FUSION AND RIVALRY
Helmholtz (1909) objected to the idea of fusion of images at an early stage of visual processing. He wrote, The content of each separate field comes to consciousness without being fused with that of the other eye by means of organic mechanisms; and therefore, the fusion of the two fields in one common image, when it does occur, is a psychic act. (Vol. 3, p. 499) His argument was based on the fact that black in one eye and white in the other produce binocular luster, whereas the fusion theory predicts gray. This objection does not apply to fusion of similar images. However, Helmholtz also pointed out that, with very short exposures, depth based on crossed disparity can be distinguished from that based on uncrossed disparity. This means that we register which eye receives which image, which should not be possible if images fused. This objection to the fusion theory does not hold for a mechanism containing some binocular cells tuned to crossed disparities and others tuned to uncrossed disparities. Such cells create a unified set of signals from the two inputs but preserve disparity information required to code depth. In talking about a “psychic act” of fusion, Helmholtz observed that, although we do not normally notice diplopic images of objects well out of the plane of the horopter, they become visible when we make a special effort to see them. On the other hand, Helmholtz did not support the suppression theory, in which the images are processed in alternation. However, his only argument against the theory was that, “the perception of solidity given by the two eyes depends upon our being at the same time conscious of the two different images” (Helmholtz 1893, p. 262). 
Sherrington (1904) wrote that binocular fusion results from “a psychical synthesis that works with already elaborated sensations contemporaneously proceeding.” From his experiments on binocular flicker (Section 13.1.5) he concluded that fusion is not based on a physiological mechanism like the convergence of nerve impulses in the final common path of the motor system. The mental theory is now only of historical interest.
12.7.2 SUPPRESSION THEORY OF FUSION AND RIVALRY
According to the suppression theory, rivalry is the only form of binocular interaction and operates for both similar and dissimilar images. In one account, the position of each image is sampled intermittently and a subsequent comparison process detects disparity. In another account, information from the dominant and suppressed images is processed
for purposes of disparity detection (Kaufman 1964). In this form, the suppression theory is equivalent to the two-channel account. Duke-Elder (1968, p. 684) reported that the earliest references to the suppression theory were by Porta (1593), Gassendi (1658), and Dutour (1760, 1763). Washburn (1933) and Verhoeff (1935) gave fuller accounts of the theory, although they produced no evidence for it other than general observations about binocular rivalry. Asher (1953) provided a lively supportive account of the suppression theory. One cannot observe binocular rivalry between identical stimuli because it is not possible to tell which eye is seeing when the images are alike (see Section 16.8). But the suppression theory can be tested by measuring the strength of a stimulus presented to an eye both in its suppressed phase and in its dominant phase of rivalry. Three indicators of the strength of suppression are available:

1. The contrast threshold of a test flash. Wales and Fox (1970) measured the duration threshold for detection of a small flash of light superimposed on the image of one eye when the eyes were exposed to rivalrous stimuli. The threshold was elevated by about 0.5 log units when the flash occurred in the eye’s suppressed phase relative to when it occurred in the eye’s dominant phase. The threshold for detection of a change in contrast of a grating increases as the baseline (pedestal) contrast increases. The threshold for detection of a pulsed change in contrast of a grating was elevated when the grating was suppressed by a rivalrous image in the other eye (Watanabe et al. 2004). However, the threshold increased with increasing pedestal contrast of the suppressed grating. Thus, the visual system must register the pedestal contrast of a suppressed image that is briefly increased in contrast.

2. Reaction time of a response to a test flash. When a vertical grating was presented to one eye and a horizontal grating to the other, observers took longer to respond to a flash superimposed on the suppressed image than to one superimposed on the dominant image. When the orientation of the gratings was the same for both eyes, the reaction time to the flash was the same for either eye and the same as when the flash was superimposed on a monocularly viewed grating (Fox and Check 1966a). This suggests that neither image is suppressed when the two images are identical. When a vertical grating was presented to one eye and superimposed vertical and horizontal gratings were presented to the other eye, reaction time for detection of a decrement in contrast in any one of the gratings was the same for all three gratings. In other words,
B I N O C U L A R F U S I O N A N D R I VA L RY
neither of the vertical gratings was suppressed, nor was the horizontal grating that was superimposed on one of the vertical gratings. The vertical grating, being fused with the vertical grating in the other eye, kept the superimposed horizontal lines dominant (Blake and Boothroyd 1985). This demonstrates that fusion takes precedence over rivalry. One could say that fusion of the vertical lines left no stimulus to rival the horizontal line. According to the suppression theory, reaction times to a monocular flash superimposed on a binocularly viewed stimulus should show a skewed distribution because the image on which the flash is superimposed is sometimes in its suppressed phase and sometimes in its dominant phase. O’Shea (1987) found the expected skewed distribution when the stimuli were rivaling but not when they were similar and fused.

3. Recognition of a flashed pattern. A flash was detected equally well by either eye when dichoptic gratings had a disparity that produced a slanted surface (Blake and Camisa 1978). They concluded that superimposed dichoptic images do not suppress each other when they yield an impression of depth. In another study, subjects recognized a letter superimposed on a dominant image but not one superimposed on a suppressed image (Fox and Check 1966b). A letter superimposed on either one of a pair of fused images was consistently recognized just as well as when it was superimposed on a stimulus presented to only one eye. On the other hand, Makous and Sanders (1978) reported that a test flash presented to one eye when both eyes viewed identical patterns was not detected as frequently as when the eyes viewed different patterns, with the flash presented to the eye in its dominant phase of rivalry. This does not prove that identical patterns engage in alternating rivalry. It may simply reflect some constant mutual inhibition between identical stimuli.
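The prediction about skewed reaction times can be illustrated with a small simulation. The sketch below is only a toy model under stated assumptions: reaction times are drawn from a normal distribution, a flash that lands in a suppressed phase incurs a fixed extra delay, and the eye is suppressed on 30% of trials. None of these values come from O’Shea (1987) or any other study cited here; the function names and parameters are hypothetical.

```python
import random


def skewness(samples):
    """Return the population skewness (third standardized moment)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    return m3 / m2 ** 1.5


def simulate_reaction_times(n, p_suppressed, rng,
                            base_ms=300.0, sd_ms=30.0, delay_ms=100.0):
    """Draw n reaction times; a suppressed-phase flash incurs an extra delay.

    All parameter values are illustrative assumptions, not measured data.
    """
    times = []
    for _ in range(n):
        rt = rng.gauss(base_ms, sd_ms)
        if rng.random() < p_suppressed:
            rt += delay_ms  # flash arrived while the image was suppressed
        times.append(rt)
    return times


rng = random.Random(0)
# Fused stimuli: assumed never suppressed, so one symmetric distribution.
fused_rts = simulate_reaction_times(20000, 0.0, rng)
# Rivaling stimuli: assumed suppressed on a minority of trials.
rival_rts = simulate_reaction_times(20000, 0.3, rng)

print("fused skewness:", round(skewness(fused_rts), 2))
print("rival skewness:", round(skewness(rival_rts), 2))
```

In this toy model the rivalry condition produces a long right tail because a minority of trials carry the extra delay, whereas the fused condition stays close to symmetric. Note that an exactly equal split between suppressed and dominant phases, with equal spread, would give a bimodal but unskewed distribution, so the skew depends on the assumed proportions.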
12.7.3 TWO-CHANNEL AND DUAL-RESPONSE ACCOUNTS

According to the two-channel theory, binocular rivalry and stereopsis are distinct processes in separate neural channels that may be engaged simultaneously in the same location. For instance, dissimilar dichoptic images may engage a channel devoted to alternating suppression while similar images engage a channel devoted to disparity-detection (Wolfe 1986b). According to the dual-response theory, rivalry and stereopsis are distinct processes, only one of which is engaged at a given time in a given location. The processes may occur at the same time in different parts of the visual field. The disparity-detection mechanism may involve interocular inhibitory processes in the ocular dominance columns of the visual cortex but not a gross alternation between the images in the two eyes. A less stringent version of this theory would allow rivalry to occur in the chromatic channel while stereopsis occurs in the same location in the achromatic channel. This would account for why depth is produced by anaglyph stereograms. According to the two-channel account of rivalry, stereopsis and rivalry are distinct processes that can coexist in the same location in the visual field. According to the dual-response account, stereopsis and rivalry do not occur simultaneously in the same location, even when stimuli that evoke stereopsis are superimposed on stimuli that evoke rivalry. Locally, the engagement of one channel preempts the engagement of the other channel. Evidence on the essential difference between these two accounts will now be reviewed. Kaufman (1964) prepared a random-dot stereogram in which disparity between black dots generated a form in depth while the red and green backgrounds produced color rivalry. He concluded that color rivalry and fusion stereopsis occur simultaneously in the same location. However, we do not have to conclude that the dots engaged in both rivalry and fusion-stereopsis but only that, while rivalry occurs in the chromatic channel, fusion stereopsis can occur in the achromatic pattern channel. This would be a weak version of the two-channel theory. A stronger version, espoused by Tyler and Sutter (1979) and Wolfe (1986b), is that rivalry and stereopsis occur simultaneously for patterned stimuli. Some of the evidence for this strong version of the two-channel theory will now be considered. Other literature on the coincidence of rivalry and stereopsis is reviewed in Section 17.5.

Depth is apparent in disparate similar lines superimposed on a background of rivaling lines, as in Figure 12.34 (Ogle and Wakefield 1967). This suggests that rivalry and stereopsis can coexist in the same location. However, the effect could also be interpreted as another example of stereopsis taking precedence over rivalry in regions where corresponding images are located.

Figure 12.34. Stereopsis with rivalry. Depth is apparent in nonrivalrous vertical lines superimposed on rivalrous oblique lines. (Adapted from Ogle and Wakefield 1967)
Julesz and Miller (1975) found that depth was seen in a random-dot stereogram when random-dot noise with a spatial frequency 2 octaves higher than that of the stereogram was added to one eye. They argued that, in a given region, stimuli of one spatial frequency can generate stereopsis while stimuli of another spatial frequency generate rivalry. Buckthought and Wilson (2007) arrived at the same conclusion. They superimposed identical vertical gratings seen by both eyes on rivalrous diagonal orthogonal gratings. Binocular disparities in the vertical gratings produced an impression of depth, while the diagonal gratings rivaled. But rivalry and stereopsis coexisted only when the spatial frequency of the vertical gratings and that of the orthogonal gratings differed by more than one octave. Blake et al. (1991b) had subjects view a random-dot stereogram that yielded depth. When low-contrast random-dot noise was added to one eye’s image, subjects saw depth but no rivalry. At intermediate contrasts, regions of rivalry and of depth were seen but not in the same place. At high contrasts, the noisy display was dominant and there was rivalry but not much evidence of depth. It has been claimed that suppressed images may contribute to stereoscopic depth. Blake et al. (1980b) presented a vertical grating to one eye and a horizontal grating to the other. A vertical grating was presented for 1 s to the same eye as the horizontal grating during periods when only the horizontal grating was visible. During these periods, subjects reported seeing a vertical grating slanted in depth according to the disparity between the two vertical gratings. Rivalry between an oblique grating and two vertical lines, as in Figure 12.35A, was rapidly terminated when a stimulus was introduced into one image that formed a fusible pair with the image in the other eye, as in Figure 12.35B (Harrad et al. 1994). Thus, binocular fusion took precedence over rivalry.
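The octave criteria mentioned above are simple to compute: two spatial frequencies are separated by log2 of their ratio, in octaves. The following sketch shows the arithmetic. The function names, the example frequencies, and the treatment of the one-octave figure as a hard cutoff are illustrative assumptions, not values or rules taken from the cited studies.

```python
import math


def octave_difference(freq_a_cpd, freq_b_cpd):
    """Return the separation of two spatial frequencies in octaves."""
    return abs(math.log2(freq_a_cpd / freq_b_cpd))


def coexistence_predicted(freq_a_cpd, freq_b_cpd, min_octaves=1.0):
    """Hypothetical rule of thumb: coexistence needs more than min_octaves.

    The default of one octave follows the criterion described in the text;
    treating it as a sharp threshold is an assumption made for illustration.
    """
    return octave_difference(freq_a_cpd, freq_b_cpd) > min_octaves


# Noise at four times the stereogram frequency lies 2 octaves above it.
print(octave_difference(8.0, 2.0))

# Example frequencies in cycles per degree (hypothetical values).
print(coexistence_predicted(1.0, 4.0))  # separated by 2 octaves
print(coexistence_predicted(2.0, 3.0))  # separated by less than 1 octave
```

Because the measure depends only on the ratio, a grating at 2 cpd paired with one at 8 cpd is as far apart as one at 0.5 cpd paired with one at 2 cpd.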
When the fusible images formed an image that appeared in depth, as in Figure 12.35B, the stereo threshold was initially elevated. Thus, it took time for stereopsis to fully recover from rivalry. Dichoptic vertical gratings that differ slightly in orientation fuse to create an impression of inclination. As the orientation difference between the gratings is increased there comes a point when the impression of depth is replaced by rivalry. Buckthought et al. (2008) showed a sequence of dichoptic gratings at a rate of between 0.5 and 2 Hz. Orientation disparity was increased or decreased during the sequence. As disparity increased, the transition from fusion to rivalry occurred at a higher disparity than that at which the reverse transition occurred when disparity was decreased. In other words, the transition from fusion and stereopsis to rivalry was subject to hysteresis. They explained this effect in terms of mutual inhibition between the two mechanisms. The currently active mechanism inhibits the nonactive mechanism.
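The hysteresis account can be made concrete with a deliberately minimal model. The sketch below assumes a normalized orientation disparity between 0 and 1, a fusion drive that falls linearly with disparity, a rivalry drive that rises linearly with it, and a fixed inhibition that the active mechanism applies to the inactive one. These choices and all names are hypothetical; they are not the model of Buckthought et al. (2008), only an illustration of how mutual inhibition yields different transition points in the two sweep directions.

```python
def transition_point(disparities, start_state, inhibition=0.2):
    """Return the disparity at which the active mechanism first loses control.

    Assumptions (illustrative only): fusion drive = 1 - d, rivalry drive = d,
    and the currently active mechanism subtracts a fixed amount from the
    drive of the inactive mechanism.
    """
    for d in disparities:
        fusion_drive = 1.0 - d
        rivalry_drive = d
        if start_state == "fusion" and rivalry_drive - inhibition > fusion_drive:
            return d  # rivalry overcomes inhibition from fusion
        if start_state == "rivalry" and fusion_drive - inhibition > rivalry_drive:
            return d  # fusion overcomes inhibition from rivalry
    return None


steps = [i / 100 for i in range(101)]

# Increasing disparity, starting in the fused state.
up = transition_point(steps, "fusion")
# Decreasing disparity, starting in the rivalrous state.
down = transition_point(list(reversed(steps)), "rivalry")

print("fusion -> rivalry at disparity", up)
print("rivalry -> fusion at disparity", down)
```

With the inhibition term set to zero both transitions would fall at the midpoint. With inhibition present, the upward transition falls above the midpoint and the downward transition below it, which is the qualitative pattern described in the text.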
[Figure: left-eye and right-eye images for panels A, B, and C.]

Figure 12.35. Precedence of fusion over rivalry. (A) The vertical lines rival with the oblique lines. (B) The fused vertical lines do not rival the grating but create an impression of relative depth. (C) The vertical lines do not rival the fused oblique lines.
12.7.3a Summary Most of the evidence supports the view that alternating suppression does not occur when congruent or near-congruent patterns are fused. We must therefore reject this form of the suppression theory of binocular fusion. That is not to deny that mutual inhibition of images may be involved in the fusion of similar images. Physiological evidence reviewed in Section 11.4.1 shows that some binocular cells are dominated by excitatory inputs from only one eye and are inhibited by inputs from the other eye. The psychophysical evidence suggests that, over an extended area, any mutual inhibition between the two eyes when they view similar images is the same for both eyes and does not result in the alternation of suppression seen when the images are dissimilar. The view that fusion and rivalry occur simultaneously in the same location and in distinct channels has been championed by Wolfe (1986b) and contested by Blake and O’Shea (1988) and Timney et al. (1989). Readers are referred to these sources for a detailed assessment of the evidence. On balance, the evidence favors the dual-response account, namely, that rivalry and stereopsis are mutually exclusive outcomes of visual processing in any given location. At least this seems to be the case for visual processing of patterned stimuli. However, it is possible that rivalry within the purely chromatic channel can proceed simultaneously with stereopsis based on disparity between patterned images in the same location. Chromatic and patterned
stimuli are processed in partially distinct neural channels, even at the level of the retina and geniculate nucleus (Section 5.6.5). Evidence reviewed in Section 17.1.4 suggests that stereoscopic processing is weak in the purely chromatic channel. It is also possible that rivalry and stereopsis proceed simultaneously in distinct spatial-frequency channels.

12.8 COGNITION AND BINOCULAR RIVALRY

This section is concerned with the influence of high-level cognitive factors on binocular rivalry. There are two types of cognitive variables. The first type refers to the characteristics and temporary state of the observer, such as the direction of attention, expectations, and emotional state. The second type refers to high-level attributes of the stimulus, such as its familiarity, meaning, and emotional significance. Binocular rivalry has been used as a convenient tool for studying the effects of both types of cognitive factors. Each type will be dealt with in turn.

12.8.1 VOLUNTARY CONTROL OF RIVALRY
Several investigators enquired whether the rate of alternation of rivaling stimuli is under voluntary control. Breese (1899) reported that subjects could influence the duration for which one or the other of two rivaling stimuli was seen, but he noticed that the eyes moved whenever subjects exercised this control. The more the eyes moved over a dominant stimulus, the longer that stimulus remained in view. Furthermore, a moving stimulus was dominant over a stationary stimulus. Eye movements may therefore have mediated voluntary control over rivalry. Others have agreed that one can exercise some voluntary control over the rate of binocular rivalry (Meredith and Meredith 1962; Lack 1969). However, eye movements and blink rate were not controlled in these experiments, and there was no verification of subjects’ reports of the dominance of rival stimuli. Van Dam and van Ee (2006a) found that saccadic eye movements tended to occur at about the time of changes in binocular rivalry of orthogonal gratings. However, the relationship between saccades and image reversals did not change when subjects were instructed to hold a given image in dominance. Van Dam and van Ee (2006b) went on to ask whether eye movements affect rivalry directly or indirectly by the image motion that eye movements create. They used orthogonal dichoptic gratings and selected only those saccades that were orthogonal to one or the other image. Saccades orthogonal to the left eye image effectively moved only that image and caused it to come into dominance. Saccades orthogonal to the right-eye image effectively moved only that image and caused it to come into dominance. This result is consistent with evidence reviewed in
Section 12.5.3 that an abrupt change in one eye’s image brings that image into dominance. Voluntary control of rivalry could perhaps be due to changes in stimulus blur caused by changes in accommodation. Fry (1936) reported that voluntary control of rivalry was abolished after paralysis of the ciliary muscles with homatropine. On the other hand, Lack (1971) found that voluntary control of rivalry was still evident for stimuli viewed through small artificial pupils, which eliminate the need for accommodation, or after paralysis of the muscles serving accommodation. A flashed stimulus may not be visible when superimposed on a suppressed image (Fox and Check 1966b). Thus, invisibility of a flash indicates that a stimulus is indeed suppressed. With this procedure, Collyer and Bevan (1970) obtained a 10% improvement in detection of a test flash superimposed on a given image after subjects were given 3 s to bring that image into dominance over an image in the other eye. However, voluntary control of dominance was not always achieved, was not continuous, and may have involved only part of the stimulus. Meng and Tong (2004) reported that the relative durations of dominance of rivalrous stimuli could be modified by attention only to a small extent. The control of rivalry was weaker than that over the reversible perspective of a Necker cube.

12.8.2 ATTENTION AND RIVALRY
Voluntary attention to an object is known as endogenous attention. Involuntary attention to a salient stimulus is known as exogenous attention. There is evidence that both types of attention influence binocular rivalry. Consider first the effects of endogenous attention. Ooi and He (1999) presented an array of gratings to one eye and a blank field to the other eye. An apparent motion stimulus in part of the otherwise blank field suppressed one of the gratings. This suppression was less evident when subjects attended to that grating rather than to one of the other gratings. Ooi and He concluded that attending to a stimulus helps to resist its suppression by a distinct stimulus presented in the same location to the other eye. Sasaki and Gyoba (2002) obtained the opposite result. They presented a mixed array of 36 vertical and horizontal circular gratings to one eye. When subjects attended to the gratings in one orientation, those gratings became suppressed more readily than the nonattended gratings when a rivalrous stimulus was briefly presented to the other eye. The effect was evident even when the stimuli in the two eyes did not fall on corresponding regions. Sasaki and Gyoba exposed the test stimulus for a longer time than did Ooi and He. It is possible that attention induces short-term facilitation and long-term inhibition of the attended stimulus. Paffen et al. (2006b) asked whether the rate of rivalry of orthogonal 0.6° central gratings is affected when subjects
are instructed to detect changes in the motion of random dots in a surrounding annulus. The rate of rivalry was reduced in proportion to the difficulty of the distracting task. They showed that the effect was not due to an inability of subjects to track changes in rivalry. They concluded that it was due to effects of attention on the effective contrast of the rivalrous stimuli. Paffen et al. (2008b) asked whether a feature becomes more dominant after subjects have been trained to discriminate between values of the feature. They trained subjects in a speed discrimination task using dots moving in a given direction. After training, subjects became more sensitive to dots moving in the trained direction. Stimuli moving in the trained direction were presented to one eye and stimuli moving in the orthogonal direction were presented to the other eye. Stimuli moving in the trained direction showed stronger dominance. Chong et al. (2005) enquired whether rivalry is influenced by endogenous attention (voluntary attention to a feature of a stimulus). They used rivalrous orthogonal gratings that varied independently in spatial frequency from frame to frame. When subjects counted changes in spatial frequency in one grating, the dominance duration of that grating increased by a mean value of 50%. Merely attending to the rivalrous stimuli had no effect. When subjects counted the changes in both gratings, the dominance duration of both gratings increased. Consider next the effects of exogenous attention. Attention is automatically drawn to a stimulus that suddenly increases in contrast. Chong and Blake (2006) showed that when one of two superimposed gratings viewed with both eyes briefly increased in contrast, this grating was initially dominant when the gratings were presented one to each eye (dichoptically). Attention is also drawn automatically to an object that is suddenly moved. Mitchell et al. (2004) used this fact to investigate whether attentional cueing affects rivalry. 
Subjects viewed with both eyes two superimposed random-dot displays rotating in opposite directions. Attention was drawn to one display by briefly translating it. The stimulus was then changed to a display rotating in one direction in one eye superimposed on a display rotating in the other direction in the other eye. These displays rivaled but, during the initial one-second period, the display rotating in the direction of the cued stimulus tended to be dominant, whichever eye it appeared in. Thus rivalry can be influenced by exogenous attention due to prior cueing of one stimulus. An approaching object grabs the attention more effectively than a receding object. Parker and Alais (2007) presented an expanding concentric grating to one eye and a contracting grating to the other eye. Each grating subtended 2° and had a spatial frequency of 3 cpd. The expanding grating was dominant for longer than the contracting grating.
12.8.3 BINOCULAR RIVALRY AND MEANING
12.8.3a Effect of Meaning on Periods of Dominance This section deals with the question of whether the duration for which an image is dominant is influenced by its meaning. Rommetveit et al. (1968) presented a word such as “wine” to one eye and a typographically similar word such as “nine” to the same location in the other eye. The words were presented for 370 ms next to a binocularly viewed word such as “red” that made a meaningful phrase, such as “red wine,” with one of the rival pair but not the other. The semantically relevant word was more frequently reported than the irrelevant word. It was concluded that a more meaningful word tends to be more dominant than a less meaningful word. However, a person may be less likely to read “red nine” than “red wine” when both words are presented side by side to the same eye. There was no control for this possibility. Furthermore, stimulus duration may have been too brief to allow rivalry to develop. When an erect face was dichoptically superimposed on an inverted face, the erect face was seen more frequently than the inverted face (Engel 1956; Hastorf and Myro 1959). But this may not mean that the basic rivalry process is affected by meaning. For much of the time, dominance was incomplete and parts of each face were visible. Parts of an erect face are more familiar than parts of an inverted face and are therefore more likely to form the basis of the decision about whether the face is erect or inverted. Furthermore, certain features of a face, such as an eye or the mouth, appear in an unusual location in an inverted face. Other features, such as the nose and ears remain more or less in their usual location. These factors could bias judgments in favor of an erect face quite apart from any influence of binocular rivalry. This idea could be tested by seeing whether a binocularly viewed face consisting of an equal mixture of erect and inverted regions is more often reported to be erect or inverted. Ono et al. 
(1966) presented subjects with pairs of photographs of faces combined dichoptically. More rivalry was reported between the faces that had been rated as less similar on a variety of criteria, such as pleasant and unpleasant, than between those that had been rated as more similar. But it is not clear from this whether the crucial variable was similarity in terms of the semantic criteria or similarity in terms of low-level features such as the relative positions of contours. There are also reports that the relative dominance of dichoptically combined words matched for number of letters and frequency of usage depends on the emotional impact of the words as determined by their sexual or aggressive significance (Kohn 1960; Van de Castle 1960). Such results may reflect the willingness of subjects to report a certain word rather than a greater visual dominance of one type of word over another.
Several studies have reported that the personal significance of a picture affects which picture predominates when two different pictures are combined dichoptically. For example, a picture with North American content, such as a boy playing baseball, was presented to one eye and a picture with Mexican content, such as a matador, was presented to the other eye (Bagby 1957). North American subjects reported seeing more North American pictures while Mexican subjects reported more Mexican pictures. There are several problems with these studies. The frequency with which one stimulus is reported may reflect the greater salience of local features of that stimulus during those periods when parts of each stimulus are visible. Because of this greater salience of partial features, one stimulus is more likely to be reported than is the other during the periods of mixed dominance. This could increase the probability that a given stimulus is reported but this will not necessarily reflect any basic effect of meaning on the rivalry process itself. Another problem is that recognition will be more rapid when the more familiar picture is totally dominant than when the unfamiliar picture is totally dominant. Yu and Blake (1992) attempted to overcome these problems. They found that a face showed longer periods of exclusive dominance over a geometrical comparison pattern than did a control stimulus with the same spatial-frequency content and mean contrast as the face. They obtained the same result when the relative dominance of the patterns was assessed by the reaction time for detection of a probe flashed on the geometrical pattern. The face may have been more dominant than the control stimulus because it was a face or because it had a coherent shape. They investigated this issue by comparing the duration of exclusive dominance of a hidden “Dalmatian-dog” figure before it was seen as a dog and again after it was seen as a dog. 
The dog pattern became more dominant after subjects had been shown that it was a dog by placing a tracing of a dog over it. However, a scrambled version of the dog pattern also became more dominant after subjects had seen the same tracing placed over it. This effect must have been due to suggestion since it was not based on any objective feature of the stimuli. An upright version of the dog pattern was more dominant than an inverted version, even though subjects were not aware of the dog. Whatever this set of experiments reveals about the effects of stimulus configuration on rivalry, the effects are small compared with effects of stimulus variables such as contrast, spatial frequency, color, motion, and orientation. A related issue is whether a meaningful stimulus rivals as a whole (globally) rather than in a mosaic fashion. Large orthogonal gratings rival in a mosaic fashion. Alais and Melcher (2007) found that distinct meaningful stimuli, such as faces or houses, rival globally so as to preserve the coherence of each stimulus. Also, the depth of suppression, as indicated by the visibility of a probe stimulus
superimposed on the suppressed image, was greater for meaningful stimuli than for rivalrous gratings. However, rivalry between a meaningful stimulus and a grating or between a meaningful stimulus and a scrambled version of the same stimulus was shallow and piecemeal like that between orthogonal gratings. Thus global rivalry occurred only between two meaningful stimuli. They concluded that rivalry is basically a local eye-based process but that high-level processes globally organize regions of mosaic rivalry so as to preserve a coherent and meaningful stimulus.
12.8.3b Semantic Processing of Suppressed Images
Another question is whether we are sensitive to the semantic content of suppressed images. A person can follow a verbal message presented to one ear when a different message is presented to the other ear (Cherry 1953; Lewis 1970). However, Blake (1988) found that subjects could not read a message presented to one eye when a different message was presented to the other eye, even when the messages were in distinct fonts or when the message to be read started 5 seconds before the other. Sections 12.4 and 12.5 dealt with how a suppressed image might influence certain effects such as binocular summation and interocular aftereffects. These processes involve only low-level features of the stimulus such as luminance, contrast, and motion. What is the evidence that high-level, or semantic, features of a suppressed stimulus are processed? Zimba and Blake (1983) addressed this question by using semantic priming. In this effect, prior presentation of a word shortens the time needed to decide whether a subsequently presented stimulus is a random letter string or a word semantically related to the priming word. They found that priming operated only when the priming word was presented to an eye in its dominant phase of binocular rivalry. This suggests that rivalry occurs before the level of semantic analysis so that the semantic content of a suppressed image is not processed.

12.8.4 INTERSENSORY EFFECTS IN RIVALRY
One way to bias attention to a visual stimulus is to combine the stimulus with a matching stimulus in another sensory modality. Lunghi et al. (2010) reported that a grating was maintained longer in its dominant phase when it was accompanied with a tactile grating in the same orientation. Also, the suppressed phase of a grating was shortened when it was accompanied with an orthogonal tactile grating. Zhou et al. (2010) reported that the dominance time of a rose over a marker pen was prolonged by the presence of a rose scent and the dominance of the pen was prolonged by the smell of butanol. Van Ee et al. (2009) reported that subjects had more attentional control over rivalry between a rotating radial pattern and a looming concentric pattern when the
temporal modulation of looming matched the temporal modulation of a sound. Unlike the other intersensory effects, this effect worked only when subjects attended to the modulated sound.
12.9 PHYSIOLOGY OF BINOCULAR RIVALRY
12.9.1 RIVALRY AT THE LEVEL OF THE LGN
Anything that weakens the signals from one eye will influence binocular rivalry. Fry (1936) formed afterimages of orthogonal lines in the two eyes. When pressure was applied to one eye, the dominance of the afterimage in that eye was reduced. Therefore, anything that weakens the signal arriving from one eye will have an effect on binocular rivalry. The lateral geniculate nucleus (LGN) is the earliest stage where signals from the two eyes can compete for entry into subsequent stages of processing. Each cell in the LGN receives a direct input from only one eye and does not respond when only the other eye is stimulated. Nevertheless, for many LGN cells, the response is modified by stimulation of the eye from which the cell does not receive a direct input (Section 5.2.3). This could be mediated either by intrageniculate inhibitory connections or by descending influences from the visual cortex. There is some dispute about the role of cortical influences. Some investigators found that binocular interactions in the LGN of the cat require an intact visual cortex (Varela and Singer 1987), while others found that they do not (Tumosa et al. 1989; Tong et al. 1992). A report that interocular influences in the LGN are greatest when the stimuli presented to the two eyes differ in orientation, contrast, and movement prompted the suggestion that the LGN is involved in binocular rivalry, since binocular rivalry is affected by interocular differences between the same features (Varela and Singer 1987). Subsequent experiments revealed that interocular influences in the cat LGN are not much affected by changes in stimulus orientation or direction of motion, but are affected by changes in spatial frequency (Moore et al. 1992; Sengpiel et al. 1995b). Interocular influences in the LGN could serve to balance responses from the two eyes to small interocular differences in contrast by adapting the relative contrast gains of the inputs from the eyes. 
Rivalry and contrast gain could be served by the same or by different interactive processes in the LGN. If inhibitory interactions responsible for binocular rivalry occurred in the LGN, the degree of inhibitory coupling would have to depend on the degree of correlation between the images, as detected by feature detectors in the visual cortex. In particular, detection of differences in orientation or motion would have to depend on signals
descending from higher centers. A high correlation would weaken the inhibitory coupling and lead to fusion, while a low or negative correlation would produce rivalry (Lehky and Blake 1991). Lehky and Maunsell (1996) found no difference in the responses of single neurons in the LGN of alert monkeys as they fixated rivalrous or similar gratings. This seems to eliminate the LGN as the site of binocular rivalry, at least in the monkey. However, Lehky and Maunsell did not determine which eye was dominant at a given time. It seems that fMRI reveals effects of rivalry in the LGN that single-cell recordings do not detect. Haynes et al. (2005) detected differences in fMRI responses from the human LGN as a function of which eye was stimulated. They then showed subjects rivalrous dichoptic gratings. The fMRI signals in the LGN associated with a specific eye were modulated according to which stimulus was dominant. They obtained similar results from the visual cortex. Wunderlich et al. (2005) found that fMRI signals from the human LGN were stronger during periods in which a high-contrast grating was dominant over a low-contrast orthogonal grating. These rivalry-dependent responses in the LGN could be due to feedback from the visual cortex. This is the most likely explanation, because orientation is not coded by cells in the LGN.
12.9.2 RIVALRY AT THE CORTICAL LEVEL
12.9.2a Relating Rivalry to Cortical Activity Logothetis and Schall (1989) trained monkeys to press one key when a display moved to the left and another key when it moved to the right. When shown dichoptic displays moving in opposite directions, the monkeys changed their response, as the displays alternated in dominance. At the same time, the experimenters recorded the activity of motion-sensitive binocular cells in the superior temporal sulcus, probably MT. Some cells that normally responded to a given direction of motion responded only when the monkey indicated that the stimulus moving in that direction was dominant. However, most cells remained unaffected by rivalry. In a later study, alert monkeys were presented with orthogonal gratings and responded according to whether they saw the vertical or horizontal grating (Leopold and Logothetis 1996). About 20% of cells in V1 and V2, and 38% of cells in V4 increased their activity when the orientation of the grating that the animal saw corresponded to the preferred orientation of the cell. A few cells in V4 responded when the orientation of the suppressed image corresponded to the preferred orientation of the cell. Almost all these cells were binocular cells. These results suggest that, at these levels in the nervous system, rivalry occurs between a minority of binocular cells and almost no monocular cells. On the other hand, Sheinberg and Logothetis (1997) found that responses of 90% of cells in the inferior temporal
cortex and superior temporal sulcus of the monkey were contingent on the perceptual dominance of the effective stimulus. The stimuli were a radial pattern in one eye and the picture of an animal in the other. Thus, as one proceeds to higher levels of the ventral visual pathway involved in pattern recognition, rivalry involves a greater proportion of neurons tuned to a particular complex stimulus. This would explain why images in their suppressed phase may influence certain visual processes occurring at lower levels of the visual system (Section 12.5). Crick (1996) speculated that these results suggest that the seat of visual consciousness is at a higher level than V1 (see Section 4.8.4). Pearson et al. (2007) found that local transcranial magnetic stimulation of the human visual cortex perturbed binocular rivalry in the corresponding region of the visual field. Magnetic stimulation did not produce any time-locked effects on stimulus rivalry that occurred during eye swapping of the stimuli. This evidence supports the notion that eye rivalry but not stimulus rivalry is contingent on neural activity in early visual areas.
12.9.2b Rivalry and Cross-Orientation Inhibition The response of a cell in the visual cortex to an optimally oriented bar or grating is suppressed by the superimposition of an orthogonal bar or grating in the same eye (Bishop et al. 1973; Morrone et al. 1982; Bonds 1989). This is monoptic cross-orientation inhibition, mentioned in Section 12.3.8. It operates over a wide difference in spatial frequency between the gratings and increases with the contrast of the superimposed grating (Snowden and Hammett 1992). Some evidence suggests that it results from intracortical inhibition (DeAngelis et al. 1992; Morrone et al. 1987). But Ferster (1987) suggested that the effect originates in the LGN. In support of this idea, Freeman et al. (2002) found that cross-orientation inhibition occurred when the stimuli drifted at a velocity that exceeded the limits of cortical neurons but not of LGN neurons. Also, it was not much reduced after adaptation to one of the gratings. This accords with the fact that adaptation has little effect on the response of LGN neurons but has a large effect on the response of almost all cortical neurons. It has been suggested that monoptic cross-orientation inhibition is related to binocular rivalry and dichoptic masking (Legge 1984a). However, the following findings do not support this view.
1. In monoptic cross-orientation inhibition, the images of differently oriented gratings are superimposed on the classical receptive fields of ganglion cells and of cortical cells in layer 4 before these cells converge onto binocular cells. In dichoptic suppression, the images are carried by distinct ganglion cells and project to cortical monocular cells in distinct ocular dominance columns. They become superimposed only at the level of binocular cells.
2. Cross-orientation inhibition is as strong with monocular as with binocular viewing (Walker et al. 1998).
3. Monoptic cross-orientation inhibition was maintained to higher stimulus drift rates and higher temporal frequencies than was dichoptic suppression (Li et al. 2005; Sengpiel and Vorobyov 2005). Sengpiel and Vorobyov also reported the following two differences.
4. Unlike cross-orientation inhibition, interocular suppression was reduced after adaptation to one of the orthogonal gratings.
5. Dichoptic suppression was selectively reduced by application of the GABA antagonist bicuculline, which suggests that it depends on inhibitory cortical interactions.
6. Cross-orientation inhibition was not produced by orthogonal bars or gratings shown to different eyes of anesthetized cats, even though inhibition was evident when both stimuli were presented to the same eye (Burns and Pritchard 1968; Ferster 1981; DeAngelis et al. 1992). This suggests that cross-orientation inhibition is generated before the signals from the eyes are combined and is not the basis for binocular rivalry.
However, we will now see that there has been some debate on this issue. Unlike previous investigators, Sengpiel and Blakemore (1994) found that the response of a binocular cell in area 17 of anesthetized cats to a grating presented in its preferred orientation to the cell’s dominant eye diminished when an orthogonal grating was suddenly presented to the other eye. This suppression was replaced by facilitation when the gratings had the same orientation in the two eyes. In strabismic cats, dichoptic suppression occurred at all relative orientations of the gratings (see also Sengpiel et al. 1994, 1995a). Interocular suppression was not evident in monocular cells in layer 4 of the visual cortex, before the cells converge onto binocular cells.
Logothetis (1998) pointed out that Sengpiel and Blakemore (1994) found a large number of cells in area 17 of anesthetized cats, which exhibited orientation-specific interocular suppression only when the cells had been preadapted to their preferred orientation. Also, the short exposures that they used were not typical of conditions under which rivalry occurs. Sengpiel et al. (1995b) presented an optimally orientated drifting grating to one eye of cats while an induction grating of variable orientation and spatial frequency was presented intermittently to the other eye. When the spatial frequency of the induction grating was too high or too low
to produce an excitatory response in the visual cortex, it continued to produce interocular suppression, which occurred equally at all relative orientations of the two gratings. Sengpiel et al. concluded that dichoptic interaction is produced by the sum of facilitation for stimuli of similar orientation and by suppression that is independent of orientation. Other evidence suggests that binocular rivalry involves a change in synchrony of cortical responses rather than a change in response amplitude. Fries et al. (1997) recorded simultaneously from many cells in area 17 of alert strabismic cats. The direction of optokinetic eye movements indicated which of two dichoptic gratings moving in counterphase was dominant at any time. Neurons that fired in synchrony to a grating presented to one eye continued to do so when that eye was dominant in the rivalry condition. The activity of neurons responsive to the stimulus that was not dominant became desynchronized. Changes in eye dominance were not associated with changes in the rates of discharge of cortical cells. Fries et al. suggested that the change in firing rate found by Sengpiel and Blakemore was due to their use of anesthetized animals and reflected the presence of rivalrous stimuli rather than the outcome of rivalry. A high-contrast grating in one eye suppresses the response to a low-contrast grating in the other eye, when the two gratings are set at a small angle to each other. Binocular cells in areas 17 and 18 of the cat modulated their firing rate in response to a 4.8-Hz phase reversal in the luminance of a low-contrast grating presented to one eye when a homogeneous field was present in the other eye. When a high-contrast grating was added to the other eye, however, the response modulation due to the low-contrast grating was no longer present (Berardi et al. 1986). 
These effects may have more to do with dichoptic masking, which occurs between similar patterns, than with binocular rivalry, which occurs between distinct patterns.
12.9.2c Rivalry and Lateral Cortical Connections Axon collaterals of pyramidal cells in area 17 of the cat project up to 8 mm horizontally within layers 2, 3, and 5. This represents several receptive-field diameters. These collaterals form spaced clusters of predominantly excitatory synapses that link cells with similar orientation preference, as described in Section 5.5.6a. Lateral connections are therefore unlikely to serve as a basis for binocular rivalry, which occurs between stimuli differing in orientation. They could build large receptive-field units that respond to lines in a particular orientation, or they could modify the response of cells according to the nature of surrounding stimuli (Gilbert et al. 1991). Binocular rivalry is presumably served by shorter inhibitory connections between cortical cells (Section 5.5.6).
12.9.2d Interhemispheric Rivalry Miller et al. (2000) proposed that rivalry occurs at a high level of visual processing and that the rivaling percepts are processed in opposite hemispheres. According to this view, rivalry is between hemispheres rather than between eyes. In support of the theory they used rivaling vertical and horizontal lines. They reported that activation of one hemisphere by caloric stimulation of one vestibular system or transcranial magnetic stimulation altered the duration of one percept relative to that of the other. However, O’Shea and Corballis (2003) found that binocular rivalry occurs in callosotomized (split-brain) human observers and that it is similar for stimuli confined to one or the other of the two hemispheres. For normal observers, adjacent patches tend to rival in synchrony, even when they project to opposite cerebral hemispheres. For a split-brain patient, synchrony occurred only for patches presented to the same hemisphere (the same hemifield) (O’Shea and Corballis 2005). This evidence suggests that rivalry is not due to switching between hemispheres and that, in split-brain patients, rivalry occurs independently in each hemisphere.
12.9.2e Rivalry and the Cortical VEP and MEG Visual evoked potentials (VEPs) recorded from the surface of the head above the visual cortex allow an experimenter to track cortical events associated with rivalry in animals or preverbal humans. The use of VEPs to study the development of binocular rivalry in infants was discussed in Section 7.6.3. Lansing (1964) presented a 50°-wide illuminated area flashing at 8 Hz continuously to the left eye. When a steady striped pattern was presented for periods of 5 s to the right eye, subjects reported that it dominated the flickering field. When this happened, there was an 82% reduction in VEPs synchronized with the flickering field. Similar findings were reported by Lehmann and Fender (1967, 1968) and by MacKay (1968). The amplitude of the VEP associated with a counterphase-modulated pattern presented to one eye was strongly reduced when it was perceptually suppressed by a steady pattern presented to the other eye compared with when it was presented with a blank field in the other eye (Spekreijse et al. 1972). Since misaccommodation can cause a large reduction in the VEP, the reduced VEP from suppressed images could have been due to the suppressed eye becoming misaccommodated. However, Spekreijse et al. obtained the same result when the rivalrous stimuli were presented in the context of correlated stimuli that served to keep both eyes properly accommodated. Similar gratings presented to the two eyes produce a larger VEP than does a grating presented to one eye. Apkarian et al. (1981) found that the VEP produced by dichoptic orthogonal gratings showed no interocular summation. The binocular response fell to the monocular level
when the two gratings differed in orientation by more than about 20° (Tyler and Apkarian 1985). A VEP was detected when a dynamic random-dot display alternated between being correlated and anticorrelated (contrast reversed) between the two eyes. The response was undiminished when the displays were composed of correlated and anticorrelated equiluminant red and green dots (Livingstone 1996). This suggests that there are binocular cells sensitive to interocular correlation that are not involved in stereopsis, since stereopsis is weak at equiluminance (Section 17.1.4). Fries et al. (2002) recorded from multiple neurons in area 17 of cats exposed to rivaling stimuli. The stimulus that activated the neurons could be made dominant by increasing its contrast or by presenting it to a strabismic eye. When this was done, the responses of the neurons became more highly synchronized in the gamma frequency range. A patient with sagittal section of the chiasm showed no change in VEP responses to monocular flicker when the other eye was stimulated by a structured stimulus (Lehmann and Fender 1969). Brown and Norcia (1997) presented a 2-cpd, 12°-diameter grating to each eye. One grating oscillated at 5.5 Hz, and the other at 6.6 Hz. Because of these distinct temporal labels, the VEP evoked by each grating could be recovered from the VEP spectrum. When the gratings were orthogonal, the responses from the two eyes alternately waxed and waned in synchrony with reports of rivalry. The alternation disappeared when the gratings were aligned. The effect was not evident with a 2°-diameter display, although Lawwill and Biersdorf (1968) had obtained similar results with orthogonal 3° gratings superimposed on a 12° steady surround. Other investigators who had used only a small display failed to find a connection between rivalry and the VEP (Cobb and Morton 1967; Riggs and Whittle 1967; Martin 1970). Kaernbach et al.
(1999) recorded VEPs when dichoptic gratings changed from being orthogonal to being congruent. When the congruent gratings were parallel to the member of the rivaling gratings that was dominant when the gratings changed from orthogonal to parallel, subjects did not perceive a change in stimulus orientation. When the congruent gratings were orthogonal to the dominant grating, subjects perceived a change in orientation. The N1 component of the VEP, which is evoked in extrastriate cortex, changed only when subjects perceived a change in orientation. The use of VEPs to study binocular summation is discussed in Section 13.1.8b. Magnetoencephalography (MEG) has also been used to investigate binocular rivalry. Srinivasan et al. (1999) presented a red 13° vertical grating to one eye and a blue horizontal grating to the other eye, each flickering at a distinct frequency. Neuromagnetic responses (Section 5.4.3d) associated with a particular grating increased coherently in
widely separated areas of both cerebral hemispheres when the subject reported that that grating was dominant. Srinivasan et al. concluded that conscious awareness is mediated by widespread intra- and interhemispheric synchrony of neural activity. Cosmelli et al. (2004) extended the MEG procedure in a way that allowed them to detect the temporal propagation of activity over the cerebral cortex from the occipital pole. Kamphuisen et al. (2008) also found widespread MEG activity that was tagged to a stimulus undergoing binocular rivalry. However, unlike earlier investigators, they applied a phase analysis that filtered out activity not phase-locked to the stimulus. This revealed that activity phase-locked to the stimulus was produced by a limited set of sources in the occipital lobe. They concluded that this casts doubt on the theory that awareness is mediated by widespread synchronous activity. This conclusion rests on the assumption that activity generated by a given stimulus in different centers of the brain does not differ widely in phase.
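The frequency-tagging logic behind the Brown and Norcia (1997) study, and behind the MEG studies that flickered each grating at a distinct frequency, can be illustrated with a minimal sketch. It uses a synthetic signal rather than recorded data: two sinusoids at the 5.5 Hz and 6.6 Hz tag frequencies given in the text, plus Gaussian noise. The function names, the 200 Hz sampling rate, the 10 s window, the noise level, and the amplitudes are all illustrative assumptions.

```python
import math
import random

TAG_A_HZ = 5.5  # oscillation rate of one grating (from the text)
TAG_B_HZ = 6.6  # oscillation rate of the other grating (from the text)


def synth_response(amp_a, amp_b, sample_rate_hz=200.0, seconds=10.0,
                   noise_sd=0.5, seed=0):
    """Hypothetical recording: two tagged sinusoids plus Gaussian noise."""
    rng = random.Random(seed)
    n = int(sample_rate_hz * seconds)
    signal = []
    for i in range(n):
        t = i / sample_rate_hz
        signal.append(amp_a * math.sin(2 * math.pi * TAG_A_HZ * t)
                      + amp_b * math.sin(2 * math.pi * TAG_B_HZ * t)
                      + rng.gauss(0.0, noise_sd))
    return signal


def tagged_amplitude(signal, freq_hz, sample_rate_hz=200.0):
    """Amplitude of the component at freq_hz (single-bin Fourier projection)."""
    n = len(signal)
    re = 0.0
    im = 0.0
    for i, x in enumerate(signal):
        phase = 2 * math.pi * freq_hz * i / sample_rate_hz
        re += x * math.cos(phase)
        im += x * math.sin(phase)
    return 2.0 * math.hypot(re, im) / n


if __name__ == "__main__":
    # Assumed case: the 5.5 Hz grating is dominant, so its response is larger.
    recording = synth_response(amp_a=2.0, amp_b=0.5)
    print("5.5 Hz amplitude:", round(tagged_amplitude(recording, TAG_A_HZ), 2))
    print("6.6 Hz amplitude:", round(tagged_amplitude(recording, TAG_B_HZ), 2))
```

The 10 s window is chosen so that both tags complete a whole number of cycles, which keeps the two estimates from leaking into each other. Tracking the waxing and waning of each response over time would require shorter sliding windows, at the cost of coarser frequency resolution.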
12.9.2f Rivalry and Magnetic Resonance Imaging It was mentioned in Section 12.9.2a that the responses of most cells in V1 of the monkey are not affected by binocular rivalry. However, we will see that the human fMRI is strongly reduced when the stimulus is in its suppressed phase. Maier et al. (2008) found that, although single unit responses and fMRI in monkeys were affected in a similar way when a stimulus was physically removed, only the fMRI was reduced when the stimulus was suppressed. The reasons for this dissociation of responses to the same stimulus change are not known. Polonsky et al. (2000) measured fMRI signals from V1, V2, V3, and V4 of the human brain while subjects experienced rivalry between a low-contrast grating and an orthogonal high-contrast grating. In each area, activity increased when subjects reported that the high-contrast grating was dominant. The increase was about half as large as that observed when the monocular images were presented alternately without rivalry. These results support the idea that the neural events evoked by rivalry occur at several levels, starting at V1. Lee and Blake (2002) found that the amplitude of the fMRI response from V1 to rivalrous gratings was midway between that produced by gratings that alternated in one eye and that produced by superimposing the gratings in one eye. The same results occurred with rivalry between meaningful patterns. Tong and Engel (2001) recorded the fMRI response from the region of V1 corresponding to the blind spot— a region that receives inputs from only the ipsilateral eye. A response occurred when a grating in the ipsilateral eye was dominant but not when a grating in the contralateral eye was dominant. They concluded that rivalry occurs between inputs from the two eyes, primarily in V1.
Lumer et al. (1998) found no modulations of fMRI in V1 associated with rivalry between a grating in one eye and a face in the other eye. They concluded that rivalry occurs after analysis of monocular stimuli in V1. But rivalry may not show in the fMRI from V1 with large stimuli, because overall activity may be as high whichever of two strong stimuli is dominant. It is only a question of how mosaic rivalry is distributed over binocular cells. In other words, Lumer et al. were probably not recording the site where rivalry occurs but rather the relative levels of activity triggered by the contents of the stimuli after rivalry had been achieved. Lee et al. (2005) recorded traveling waves of fMRI activity in V1 that corresponded to subjects’ reports of traveling waves of rivalry round a spiral grating, as described in Section 12.3.5e (see Figure 12.25). In this case, rivalry was between a high-contrast grating and a low-contrast grating so that one would expect activity in V1 when dominance changed between the two gratings. When subjects directed their attention away from the traveling waves, cortical activity associated with the waves persisted in V1 but was eliminated in visual area V2 (Lee et al. 2007). Thus, attention was required for rivalry to activate visual areas beyond V1. At higher levels of the nervous system, where the perceptual contents of the rivaling images are analyzed in distinct areas, the level of cortical activity will depend on the nature and complexity of the stimulus. For example, cells that respond to faces become active when the face is dominant, just as they do when attention is voluntarily directed to a face (Wojciulik et al. 1998). Several experiments have revealed that particular suppressed images may show fMRI activity in selective cortical areas. Lumer et al. (1998) measured brain activity with fMRI while human subjects reported changes in rivalry between a grating in one eye and a face in the other. 
Modulations of fMRI signals corresponding to perceptual changes occurred in the fusiform gyrus, the right inferior and superior parietal areas, and bilaterally in the inferior and middle frontal areas and insular cortex. Activity in the same areas was modulated when subjects did not make any motor responses while observing the rivalrous stimuli (Lumer and Rees 1999). Zaretskaya et al. (2010) found that transcranial magnetic stimulation over the human right intraparietal sulcus prolonged periods of dominance during rivalry between pictures of a face and a house. The fMRI response from the human fusiform “face” area increased when a face presented to one eye was dominant, and the response from the parahippocampal place area increased when the picture of a house presented to the other eye was dominant (Tong et al. 1998). This does not mean that these widely separated areas engage in rivalry but only that, when a given stimulus becomes dominant, it gains access to the area that processes it. The amygdala is activated by emotive stimuli. The fMRI activity in the human amygdala was stronger to a
threatening face than to a neutral face, even when the face was suppressed by the picture of a house shown to the other eye (Williams et al. 2004). Also, while the suppressed image of a threatening face activated the amygdala, it did not activate the inferotemporal cortex (Pasley et al. 2004). This evidence suggests that visual inputs from threatening stimuli are conveyed to the amygdala along a distinct subcortical pathway that is immune to rivalry. Areas in the dorsal processing stream (V3A, V7, and the intraparietal area) were still well activated by an image that was continuously suppressed by a flashing stimulus in the other eye (Fang and He 2005). However, responses in the ventral stream (anterior occipital cortex and fusiform gyrus) were almost eliminated for all types of suppressed images.
12.9.2g Summary Rivalry occurs at more than one level of the nervous system and probably involves several distinct processes. At the level of the primary visual cortex, rivalry occurs between the eyes and is determined by simple properties of the images, such as contrast, motion, and orientation. At this level rivalry preserves the stimulus with the most salient simple features. Rivalry is evident in higher visual centers when it involves higher visual features, such as figural grouping, continuity, and meaning. In this case, rivalry may be between features rather than between eyes. Suppressed emotive stimuli may selectively activate the amygdala, and the dorsal cortical stream may be selectively activated by suppressed images.
12.10 MODELS OF BINOCULAR RIVALRY
The simplest way to think about binocular rivalry is to suppose that it is due to mutual inhibition between competing stimuli arising in the two eyes. This can be called the mutual inhibition mechanism. The mutual inhibition must be time-dependent in the manner of a bistable oscillator (Sugie 1982; Matsuoka 1984). To account for the alternation between rivaling images, the dominant image’s potency to inhibit the suppressed image must gradually weaken but without affecting the visibility of the dominant image. At the same time, the suppressed image recovers its potency to inhibit the dominant image without affecting the visibility of the suppressed image while it is still suppressed. At a certain threshold point, the suppressed image becomes visible and the dominant image becomes suppressed. The two images then reverse their roles and the process is repeated. According to this account, a visible image gradually loses its inhibitory potency while the nonvisible image gains in inhibitory potency and the rate of alternation depends on the time constants of these processes and the relative visual strengths of the two images. The strength of an image is determined by its luminance, contrast, and figural
complexity. The inhibitory potency of an image depends on its strength and its changing value in the duty cycle. Whether an image is visible or suppressed depends on its strength and inhibitory potency relative to the image in the other eye. Any change in the relative strengths of the two images results in a corresponding change in the relative durations of their dominance and suppression phases. Blake et al. (1990) used a sharp transient stimulus to force an eye to return to dominance whenever it became suppressed. For that eye, the dominance periods became unusually brief during the procedure and unusually long for a short time after the procedure. This result is consistent with the idea that transitions from dominance to suppression are due to a short-term adaptation or fatigue process. An alternative view is that an image does not compete when it is suppressed. The duration of dominance of an image depends only on the strength of the dominant image and not on its strength relative to the suppressed image. This can be called the dominant suppression mechanism. A third alternative is that the durations of the dominance and suppression phases depend only on the strength of the suppressed image. This view conforms to Levelt’s second proposition, which was presented in Section 12.3.2a. This can be called the suppression recovery mechanism. These three views generate different predictions when the contrast of the image in one eye (say the left eye) is increased while that in the other eye is kept the same. According to the mutual inhibition account, the duration of left-eye dominance should increase and that of right-eye dominance should decrease by a proportional amount, leaving the overall rate of alternation the same. According to the dominant suppression account, the periods when the strengthened stimulus is seen should increase but the periods when the constant stimulus is seen should remain unchanged.
According to the suppression recovery account, periods when the strengthened stimulus is seen should be constant and periods when the constant stimulus is seen should be shortened. There is evidence in favor of the last prediction and hence of the suppression recovery account of binocular rivalry (Levelt 1966; Fox and Rasche 1969). However, Mueller and Blake (1989) found that the contrast of a pattern in its dominant phases exerts some influence on the rate of alternation. Bossink et al. (1993) also cited evidence that the strength of the dominant image has some effect on the duration of the dominant phase, although they agreed that the influence of a dominant image is less than that of a suppressed image. Mueller and Blake suggested that a mutual suppression mechanism for rivalry exists, but that it contains an element that makes frequency of rivalry increase with contrast and a nonlinear element sensitive to unbalanced contrast in the two eyes. Fox and Check (1972) determined the magnitude of suppression of concentric rings by contrast-reversed rings by measuring the threshold for detecting a test flash placed
on the suppressed image. The magnitude of suppression, tested at several times during the suppression period, was constant. Norman et al. (2000) obtained the same result for rivaling orthogonal gratings. They argued that this contradicts the mutual suppression account of rivalry. But the mutual-inhibition theory does not predict that the visibility of the suppressed image increases but only that its potency to inhibit the dominant image decreases. The converse of this finding is that a dominant image does not become visibly weaker during its dominance phase even though its inhibitory potency weakens. It is only at the point when the images change dominance-suppression roles that there is a change in visibility. In other words, the gradual change in relative inhibitory potential finally results in a saltatory, or winner-take-all, change in visibility (Wilson 2003). One must distinguish between the visibility of an image (whether it is dominant or suppressed), the strength of an image, and the inhibitory potency of an image. More recently, Alais et al. (2010) produced evidence that the visibility of a dominant image decreases and the visibility of a suppressed image increases over time. They argue that the flashed spot of light used by previous investigators is inappropriate for measuring visibility during rivalry. First, it does not resemble the rivalrous images. Second, the probe was applied in only the first half of rivalrous episodes. Alais et al. measured the threshold for detection of an increase in contrast in one half of the dominant or suppressed image over the whole period of dominance or suppression. With this improved probe, they found that the contrast-increment threshold of the dominant image increased while that of the suppressed image decreased during rivalrous episodes (see also Ling et al. 2010). This evidence supports the simple theory of reciprocal inhibition between dominant and suppressed images.
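The distinction drawn above between inhibitory potency and visibility can be made concrete with a toy relaxation oscillator. This is a minimal sketch and not a published model: the exponential decay and recovery rules, the fixed switching margin that stands in for the threshold point, and every parameter value are assumptions chosen for illustration. In the sketch, the dominant image's potency decays toward zero, the suppressed image's potency recovers toward its stimulus strength, and the roles reverse abruptly once the suppressed potency leads by the margin.

```python
import math


def simulate_rivalry(strength_left=1.0, strength_right=1.0, tau_s=2.0,
                     margin=0.5, dt_s=0.001, total_s=60.0):
    """Return completed dominance phases as (eye, duration_s) tuples.

    All parameters are illustrative assumptions, not fitted values.
    Each strength must exceed the margin, or no switch can occur.
    """
    strength = {"L": strength_left, "R": strength_right}
    # Assumed start: left image dominant at full potency, right image at zero.
    potency = {"L": strength_left, "R": 0.0}
    dominant, suppressed = "L", "R"
    phases = []
    phase_start = 0.0
    for step in range(1, int(round(total_s / dt_s)) + 1):
        t = step * dt_s
        # The dominant image's inhibitory potency decays toward zero.
        potency[dominant] -= dt_s * potency[dominant] / tau_s
        # The suppressed image's potency recovers toward its strength.
        potency[suppressed] += (dt_s * (strength[suppressed]
                                        - potency[suppressed]) / tau_s)
        # Roles reverse once the suppressed potency leads by the margin.
        if potency[suppressed] - potency[dominant] > margin:
            phases.append((dominant, t - phase_start))
            dominant, suppressed = suppressed, dominant
            phase_start = t
    return phases


def steady_state_duration(strength_suppressed, tau_s=2.0, margin=0.5):
    """Closed-form phase duration of this toy model after the first switch."""
    return tau_s * math.log((strength_suppressed + margin)
                            / (strength_suppressed - margin))


if __name__ == "__main__":
    for eye, duration in simulate_rivalry(strength_left=2.0)[:6]:
        print(eye, round(duration, 2))
```

Visibility in this sketch changes only at the switch, while potency changes continuously, which is the point made in the text. Under these particular assumptions the duration of each dominance phase depends only on the strength of the currently suppressed image, so strengthening the left image shortens right-eye dominance and leaves left-eye dominance unchanged. The sketch therefore does not reproduce the proportional trade-off described for the mutual inhibition account; a different choice of decay rule would give different predictions.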
The dominant suppression and suppression recovery mechanisms make the duration of the dominance phases of the two images independent, because each depends only on its own strength and not on its relation to the other stimulus. Autocorrelation procedures and fitting data to a gamma distribution have revealed that the durations of succeeding dominance phases vary randomly and independently (Fox and Herrmann 1967; Walker 1975). Lehky (1995) derived a time series of dominance durations evoked by dichoptic orthogonal gratings. He then applied two tests to determine whether this time series conformed to a stochastic noisy process or to a process of deterministic chaos (Section 3.5). An autocorrelation function derived from a stochastic noise process increases linearly as a function of the dimension over which the signals are correlated. The autocorrelation function derived from deterministic chaos reaches an asymptote. Also, short-term predictions are not possible with a stochastic process but are possible with deterministic chaos. These tests showed that rivalry data conformed more to a stochastic process than to a chaotic process.
Durations of dominance usually lie between 1 and 3 s. They form a distribution skewed toward shorter durations with a mode at about 2 s. Levelt (1967) described the distribution of rivalry durations by a gamma function. This can be thought of as the probability density function produced by an internal clock with randomly varying tick durations. A Poisson distribution describes the number of randomly occurring events occurring in a fixed interval of time. However, some investigators found that other theoretical distributions fit data from binocular rivalry or from alternation of ambiguous figures as well as or better than the gamma distribution (De Marco et al. 1977). Brascamp et al. (2005) found that the gamma distribution gave a better fit to rates of binocular rivalry than to reversal durations (reciprocal of rates). Also, the gamma distribution produced a reasonable fit to rates of alternation of ambiguous figures. However, it is not clear what these distributions tell us about the underlying neural machinery. Since the duration of suppression depends on features of the suppressed image, Walker (1978b) argued that these features must somehow evade suppression. But this would follow only if high-level features were involved, and the evidence for this is not very convincing (Section 12.8.3). Several of the features affecting suppression such as contrast, spatial frequency, and flicker are already coded in the retina. One could explain their effect on the recovery from suppression by supposing that inputs from the suppressed eye charge a buffer mechanism until the charge breaks through the inhibitory barrier of the other eye’s image. The rate of charging of the buffer depends on the firing frequency of the inputs from the suppressed eye, which in turn depends on features such as contrast, spatial frequency, and flicker. These features would not have to be processed separately, since each contributes to the undifferentiated mean rate of afferent discharge. 
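The two statistical claims made above, that dominance durations follow a gamma-like distribution and that successive durations are uncorrelated, can be illustrated with a small simulation. This is only a sketch: the shape and scale values are assumptions chosen to place the mode near 2 s, not values fitted to any of the cited data, and the function names are hypothetical.

```python
import random
import statistics

# Assumed, illustrative parameters (not fitted to published data).
SHAPE = 4.0          # gamma shape parameter
SCALE = 2.0 / 3.0    # gamma scale parameter, in seconds


def simulate_dominance_durations(n, shape=SHAPE, scale=SCALE, seed=0):
    """Draw n independent dominance durations (s) from a gamma distribution."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def lag1_autocorrelation(xs):
    """Correlation between each duration and the duration that follows it."""
    mean = statistics.fmean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


durations = simulate_dominance_durations(5000)
mean_duration = statistics.fmean(durations)
mode_theory = (SHAPE - 1.0) * SCALE      # mode of a gamma distribution
r1 = lag1_autocorrelation(durations)

print(f"mean duration  : {mean_duration:.2f} s")
print(f"theoretical mode: {mode_theory:.2f} s")
print(f"lag-1 autocorrelation: {r1:.3f}")
```

Because the samples are drawn independently, the lag-1 autocorrelation should be close to zero, which is the pattern reported for real rivalry data. A reliably nonzero value in measured durations would argue against independence of successive phases.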
The evidence presented by Kim et al. (2006), which was reviewed in Section 12.3.5c, suggests that rivalry involves spontaneous stochastic modulations in the relative strengths of the two images in addition to image adaptation and mutual inhibition. Any theory of rivalry must take account of:
1. Adaptation of the dominant image coupled with recovery of the suppressed image.
2. Mutual inhibition between the images that is a function of their relative strengths but which gives greater weight to the suppressed image and is sensitive to correlation between the images.
3. Nonlinear, winner-take-all switching of image dominance.
4. Replacement of rivalry by image superimposition at low contrasts and brief durations.
5. Random fluctuations in the strengths of the competing images and a lack of correlation between successive dominance durations.
6. Retention of a dominant image over repeated brief interruptions of stimulation.
Neural models of binocular rivalry based on oscillation in reciprocal inhibitory connections have been proposed by Lehky (1988), Blake (1989), and Mueller (1990). The model proposed by Kalarickal and Marshall (2000) accounts for the relationship between the strengths of the two images and their dominance durations. In the model proposed by Laing and Chow (2002), these relationships are explicitly derived from the properties of a network of neurons described by the Hodgkin-Huxley equations (Section 4.2.1). Wilson (2007) developed a model that, like other models, incorporates mutual inhibition between rivalrous images and adaptive recovery of the suppressed image. At low contrasts, excitation falls below the threshold for instigation of mutual inhibition and rivalry gives way to image superimposition. The model also incorporates a compressive response nonlinearity and recurrent excitation between neurons responding to a given stimulus. When recurrent excitation of one image is sufficiently strong, it prevents the recovery of the suppressed image, which therefore remains permanently suppressed. The model also incorporates a process of rapid onset of excitation followed by slow recovery to baseline after removal of the stimulus. This feature of the model provides a plausible explanation of the finding that an image remains dominant over repeated brief interruptions of stimulation (Section 12.3.5g).
The models discussed so far were concerned with rivalry between the eyes. We saw in Section 12.4.4 that, under certain circumstances, rivalry occurs between coherent patterns even though parts are distributed between the two eyes. Eye rivalry most likely occurs in V1, while pattern rivalry presumably occurs at a higher level.
We saw in Section 12.9.2a that there is physiological evidence that rivalry occurs at more than one level in the visual system. Dayan (1998) proposed that rivalry is based on competing high-level interpretations of the rivaling stimuli. In this model, competition occurs between competing top-down interpretations of the stimuli rather than between the inputs themselves. Perhaps top-down processes influence rivalry under certain circumstances, but most of the evidence reviewed in Section 12.4.4 supports the idea that rivalry is mainly due to processes occurring at an early level. Wilson (2003) produced a model of rivalry occurring at two levels. At the first level, neurons in V1 dominated by inputs from a stimulus in one eye compete with neurons dominated by inputs from a stimulus in the other eye. Excitatory neurons driven by the stimulus in one eye drive inhibitory neurons, which reduce the excitatory response
to the stimulus in the other eye. In addition, the excitatory response in the dominant eye slowly self-adapts, which eventually allows the other eye to come into dominance. Time constants for these processes were derived from physiological recordings of spike frequencies. At the second level, rivalry occurs between binocular neurons at a higher center in the visual system. At this level, responses from left-eye and right-eye neurons in V1 that are sensitive to the same orientation are pooled. The same equations were used, but rivalry now occurs between summed monocular responses to similar patterned inputs. This means that mutual inhibition must be stronger at this stage. A computer simulation of Wilson’s model replicated the basic phenomena of rivalry. In the initial 150-ms period orthogonal gratings both passed the network and generated a composite plaid, as they do for human observers. The network then settled into rivalry with 2.4-s periods of dominance and suppression. The model was then presented with
the stimuli used by Logothetis et al. (1996), described in Section 12.4.4. Orthogonal gratings flickering at 18 Hz were interchanged between the eyes three times per second. With this stimulus, human subjects mostly perceive rivalry between the gratings, with dominance periods lasting over several reversals between the eyes. The computer simulation replicated these results. Rapid temporal flicker weakens the mutual inhibitory signals relative to the excitatory signals. This weakens rivalry between the eyes at the first level. Thus, inputs from both eyes reach the second level. At this level, the strong inhibition between competing patterns survives attenuation produced by flicker and produces rivalry between the patterns in the two eyes. Freeman (2005) developed a model consisting of two parallel channels driven by the left eye and two channels driven by the right eye. Each channel has several successive processing stages with mutual inhibition between channels at each stage.
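The ingredients shared by the models described above, mutual inhibition between two populations plus slow self-adaptation of the dominant one, can be sketched with a pair of rate units. This is a generic toy oscillator and not an implementation of any of the cited models: the equations, the rectified-linear response, and every parameter value are assumptions chosen only so that dominance alternates.

```python
def simulate_rivalry(duration_s=20.0, dt=0.001, tau=0.01, tau_adapt=1.0,
                     drive=1.0, inhibition=2.0, adapt_gain=3.0):
    """Euler-integrate two mutually inhibiting, self-adapting rate units.

    Returns, for every time step, the index (0 or 1) of the unit with
    the larger response. All parameters are illustrative assumptions.
    """
    e = [0.1, 0.0]   # responses; the small head start breaks the symmetry
    a = [0.0, 0.0]   # slow adaptation variables
    dominant = []
    for _ in range(round(duration_s / dt)):
        # Each unit is driven by its stimulus, inhibited by the other
        # unit's response, and weakened by its own adaptation.
        inputs = [drive - inhibition * e[1] - adapt_gain * a[0],
                  drive - inhibition * e[0] - adapt_gain * a[1]]
        new_e = [e[i] + dt * (-e[i] + max(0.0, inputs[i])) / tau
                 for i in range(2)]
        new_a = [a[i] + dt * (-a[i] + e[i]) / tau_adapt for i in range(2)]
        e, a = new_e, new_a
        dominant.append(0 if e[0] >= e[1] else 1)
    return dominant


def dominance_durations(dominant, dt=0.001):
    """Lengths (s) of completed runs during which one unit stayed dominant."""
    durations, run = [], 1
    for prev, cur in zip(dominant, dominant[1:]):
        if cur == prev:
            run += 1
        else:
            durations.append(run * dt)
            run = 1
    return durations


dominant = simulate_rivalry()
durations = dominance_durations(dominant)
print(f"number of completed dominance phases: {len(durations)}")
```

With these assumed values the adapted response of the winning unit falls low enough to release the suppressed unit, so dominance switches back and forth. The sketch is deterministic, so it cannot reproduce the random variation of dominance durations, which requires an added noise term.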
13 BINOCULAR SUMMATION, MASKING, AND TRANSFER
13.1 Binocular summation
13.1.1 Theoretical background
13.1.2 Binocular summation of contrast sensitivity
13.1.3 Summation at suprathreshold contrasts
13.1.4 Binocular summation of brightness
13.1.5 Flicker fusion and binocular summation
13.1.6 Sensitivity to pulsed stimuli and motion
13.1.7 Monocular and binocular reaction times
13.1.8 Physiology of binocular summation
13.2 Dichoptic visual masking
13.2.1 Types of visual masking
13.2.2 Masking from even illumination and the state of light adaptation
13.2.3 Masking between adjacent figured stimuli
13.2.4 Masking with superimposed patterns
13.2.5 Dichoptic visual crowding
13.2.6 Threshold-elevation
13.2.7 Meta- and paracontrast
13.2.8 Transfer of chromatic adaptation
13.3 Transfer of figural effects
13.3.1 Experimental paradigms
13.3.2 Transfer of tilt contrast
13.3.3 Transfer of the motion aftereffect
13.3.4 Transfer of the spatial-frequency shift
13.3.5 Transfer of contingent aftereffects
13.4 Transfer of perceptual learning
13.4.1 Transfer for simple visual tasks
13.4.2 Transfer of pattern discrimination
13.4.3 Transfer of visual-motor learning
13.1 BINOCULAR SUMMATION
13.1.1 THEORETICAL BACKGROUND
Binocular summation occurs when a stimulus appears brighter when viewed with two eyes rather than one eye or when two eyes are better than one eye at detecting or discriminating between stimuli. Blake and Fox (1973) thoroughly reviewed the work on binocular summation to 1972 and Blake et al. (1981a) extended the review to 1980. Binocular summation could be affected by the following factors:
1. Low-level factors such as steadier fixation or improved accommodation with binocular viewing.
2. Probability summation.
3. Neural summation of signals from the two eyes.
4. Binocular rivalry, as discussed in Chapter 12.
5. Mutual inhibition (gain control) between similar suprathreshold stimuli in the two eyes.
13.1.1a Low-Level Factors in Binocular Summation
The following low-level factors may contribute to binocular summation:
1. The pupil of an eye, illuminated at a given luminance, constricts more when the other eye is also illuminated than when it is in darkness (Thomson 1947). Thus, there is binocular summation in the subcortical centers controlling pupil size. Changes in pupil size must be taken into account when comparing monocular and binocular detection or acuity. A reduction in pupil size increases diffraction, which reduces acuity, but produces less spherical aberration, which increases acuity. The monocular-binocular difference in acuity was reduced when an artificial pupil removed effects of changing pupil size in going from monocular to binocular viewing (Horowitz 1949). Another way to eliminate effects of changes of pupil size is to stimulate the nontested eye with a blank field at the same mean luminance as the eye being tested.
2. Fixation may be steadier with binocular vision than with monocular vision. The binocular superiority of acuity shows only when the images are in good register. Binocular Snellen acuity fell to the level of monocular acuity when fixation disparities were introduced by placing a prism before one eye (Jenkins et al. 1992). Also, acuity improved when naturally occurring fixation disparities were reduced by a prism (Jenkins et al. 1994).
3. If the astigmatic axis differs in the two eyes, grating acuity may be better with binocular than with monocular viewing. There does not seem to be any systematic evidence on this possibility.
13.1.1b Probability Summation
Two sensory processes are said to be independent when the probability that one of them detects a stimulus is unaffected by the simultaneous stimulation of the other. Independence also requires that the internal noise associated with one process is uncorrelated with that associated with the other. When a weak stimulus is presented to a sensory detector, the probability of detection at a given instant depends on the level of internal noise at that instant. When the same weak stimulus is presented to two distinct detectors with uncorrelated noise the probability that at least one signal will be detected is higher than when the stimulus is presented to one detector. This is because, at a given instant, the signal from one detector may exceed the threshold when the signal from the other detector is below threshold. This is probability summation with independent detectors. The inputs from two distinct detectors with uncorrelated noise may converge before a decision has been made about the signal. In this case the noise signals partially cancel out, so that noise after convergence is less than the sum of the noise in the two detectors. This is probability summation with converged detectors. If the probability of detecting a stimulus by two detectors is greater than that of detecting the stimulus by only one detector by an amount that exceeds probability summation, then either the two detectors are not independent or the signals from the two detectors sum before a decision is made about the presence of a stimulus. Pirenne (1943) introduced the concept of probability summation into visual science. For two detectors of equal sensitivity, the probability of detecting a stimulus using both (Pb), relative to that of detecting it with either one alone (P1 and P2), is given by
$$P_b = (P_1 + P_2) - P_1 P_2 \tag{1}$$
For example, if the probabilities P1 and P2 are 0.5, Pb = 0.75, an improvement of 50%. For the same reason, one is more likely to get at least one head by throwing two coins rather than one. This is classical probability summation. Bárány (1946) proposed the same idea independently. Dutour had expressed a similar idea in 1763. He argued that a piece of paper seen by both eyes appears brighter than when viewed with one eye because points not well registered by one eye may be registered by the other. Quick (1974) provided the following simplified formula for calculating the effect of probability summation on stimulus detectability (s) when there are several detectors with different sensitivities,
$$s = \left[\sum_{i=1}^{N} s_i^{\,p}\right]^{1/p} \tag{2}$$
where N is the number of independent detectors responding to a set of stimuli, s_i is the sensitivity of the ith detector, and p is the slope of the psychometric function at the point where a stimulus is detected 50% of the time. There has been some debate about the form of the probability function most appropriate for understanding binocular summation. Eriksen (1966) pointed out that formula (1) does not make proper allowance for guessing behavior, and Eriksen et al. (1966) proposed a version to take this factor into account. Another problem is that the calculation is invalid if judgments made with one eye are not independent of those made with the other. For instance, if the thresholds in the two eyes fluctuate together because of some central process, such as fatigue or inattention, then the assumption of independence is violated. If the spatial and temporal correlation between noise-related activity in the two eyes were 1 then there would be no statistical advantage in having two eyes. Even a weak correlation between the noise-related activity of neighboring neurons in a single eye can restrict the statistical advantage of probability summation (Zohary et al. 1994).
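Formulas (1) and (2) are simple enough to check numerically. The following sketch implements both as plain functions. The function names are hypothetical, and the example sensitivities and exponent passed to the pooling formula are arbitrary illustrative values.

```python
def probability_summation(p1, p2):
    """Formula (1): probability that at least one of two independent
    detectors detects the stimulus."""
    return p1 + p2 - p1 * p2


def quick_pooling(sensitivities, p):
    """Formula (2): pooled detectability for detectors with the given
    sensitivities, where p is the psychometric-function slope."""
    return sum(s ** p for s in sensitivities) ** (1.0 / p)


# Worked example from the text: two detectors, each with probability 0.5.
pb = probability_summation(0.5, 0.5)
print(pb)                      # 0.75
print(pb / 0.5 - 1.0)          # 0.5, that is, a 50% improvement

# Two equally sensitive detectors pooled with an assumed slope of 2.
print(quick_pooling([1.0, 1.0], 2.0))
```

In the pooling formula, a larger exponent gives less benefit from adding detectors, and with a very large exponent the pooled value approaches the sensitivity of the single best detector.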
13.1.1c Empirical Probability Summation
Theoretical complications in defining probability summation can be sidestepped by measuring the contribution of probability summation empirically and using this to assess the contribution of other factors. This is done by measuring the effect of one stimulus on another stimulus when they are separated spatially or in time. This procedure is based on the idea that probability summation occurs for well-separated stimuli but that neural summation does not. It has been shown that binocular summation falls to the level of probability summation when the time interval between two brief stimuli is increased to more than about 100 ms (Matin 1962; Thorn and Boynton 1974). Temporal aspects of binocular summation are discussed in more detail in Section 13.1.6. In the preceding analysis it was assumed that signals from the two eyes do not combine before a decision about the presence of each monocular stimulus has been made, and that a central decision process has access only to these
independently processed signals. Now consider what might happen if the neural signals are combined before a decision about the presence of the stimulus is made.
13.1.1d Linear Summation of Dichoptic Inputs Assume a simple linear summation of neural signals and a source of internal noise that is independent of stimulus strength (additive noise). Combining two weak stimuli in the same area of one eye doubles the probability of detection because signal strength doubles at the level of the generator potential within the linear range, while internal additive noise stays the same. Thus, the signal-to-noise ratio is doubled. The combination of signals in the brain from the two eyes involves the combination of discrete nerve impulses, not generator potentials. Suppose that two stimuli are presented dichoptically and the neural signals converge at a central site. If noise in the two eyes is perfectly correlated, neural summation confers no advantage because, although stimulus signals add, so do noise signals, leaving the signal-to-noise ratio the same. If the noise in the two eyes is uncorrelated, neural convergence provides an advantage because two uncorrelated noise sources partially cancel when combined. When two equal stimuli with equal but uncorrelated noise are combined, the noise level increases √2 times while signal strength doubles. If neural signals from the two eyes combine this way, then binocular signal-to-noise ratio is 2/√2, so that binocular sensitivity should be √2 times monocular sensitivity (Campbell and Green 1965). With convergence of monocular signals there is no classical probability summation because there are no independent decision processes. Campbell and Green realized that their analysis rested on the assumption that noise does not arise from a closed or evenly illuminated eye. If it did, binocular performance would be twice as good as monocular performance, since the same binocular noise would be present with both monocular and binocular viewing, with the signal arising from the binocular stimulus being twice as strong as that from the monocular stimulus. 
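The argument above covers two limiting cases, perfectly correlated noise (no advantage) and uncorrelated noise (an advantage of √2). Both follow from the standard result that the variance of the sum of two equal-variance noise sources with correlation r is 2σ²(1 + r). The sketch below, with hypothetical function names, computes the resulting signal-to-noise gain and checks the uncorrelated case by simulation. Unit-variance Gaussian noise is an assumption made for illustration.

```python
import math
import random
import statistics


def binocular_snr_gain(noise_correlation):
    """Binocular/monocular signal-to-noise ratio for linear summation of
    equal signals with equal-variance noise (valid for -1 < r <= 1).

    The signal doubles, while the noise s.d. grows by sqrt(2 + 2r).
    """
    return 2.0 / math.sqrt(2.0 + 2.0 * noise_correlation)


def simulated_snr_gain(n=20000, seed=1):
    """Monte Carlo estimate of the gain for uncorrelated Gaussian noise."""
    rng = random.Random(seed)
    left = [rng.gauss(0.0, 1.0) for _ in range(n)]
    right = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mono_sd = statistics.pstdev(left)
    bino_sd = statistics.pstdev([l + r for l, r in zip(left, right)])
    return 2.0 * mono_sd / bino_sd


print(binocular_snr_gain(0.0))   # uncorrelated noise: about 1.414
print(binocular_snr_gain(1.0))   # perfectly correlated noise: 1.0
print(simulated_snr_gain())      # close to 1.414
```

Intermediate correlations give intermediate gains, so any shared noise between the eyes reduces the benefit of convergence below √2.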
Evidence reviewed in the last chapter suggests that a contour in one eye suppresses activity arising from a corresponding area of even illumination in the other eye. If this is true, then noise from an eye lacking contoured stimuli should be either attenuated or completely switched off. However, the only way to be sure that an unstimulated eye has no effect is to pressure blind it. Evidence reviewed in Section 12.3.3a suggests that noise does arise in a closed eye. It was also assumed in Campbell and Green’s analysis that there is no internal noise peculiar to the channel in which the signals from the two eyes are summed. A mechanism that sums inputs from the two eyes gains a binocular advantage greater than √2 over the single-eye performance in the following three ways:
1. The binocular advantage would be 2 if there were no internal noise before the summation of signals and no severe saturation effects.
2. The binocular advantage would be greater than √2 if nerve impulses below the sensation threshold summed to a suprathreshold value at a central site. Light quanta are summed at the level of the receptor generator potential before nerve impulses are generated. Under ideal experimental conditions, noise-free stimuli are summed completely within the limits set by Bloch’s law of temporal summation and Ricco’s law of spatial summation (Schwarz 1993). But for subthreshold neural summation to operate at a central site one would have to assume that a stimulus strong enough to generate nerve impulses in one eye would be subthreshold for detection at a higher level when only one eye is open.
3. There could be a facilitatory nonlinear summation of inputs from the two eyes. A binocular AND-gate mechanism works this way since it responds only to signals arriving simultaneously from both eyes. If all binocular cells were AND-gates we would see nothing unless both eyes were open. This would be superadditivity. Some binocular cells respond only when both eyes are stimulated (Grüsser and Grüsser-Cornehls 1965). These cells may determine the level of binocular summation. On the other hand, there are binocular cells that are excited by inputs from one eye and inhibited by inputs from the other. These cells would counteract the influence of the AND cells. There are reasons for believing that AND cells respond best to similar inputs from the two eyes and that inhibition is strongest when the inputs differ. We will see that the advantage of binocular vision over monocular vision is greatest when stimuli in the eyes are similar in shape, size, and contrast.
13.1.1e Signal-Detection Theory and Binocular Summation
An account of binocular summation has been provided in terms of signal-detection theory. In this theory d´ is the criterion-free measure of the detectability of a stimulus, defined as the mean fluctuation of noise plus signal combined, minus the mean fluctuation of noise alone, divided by the common standard deviation of the two fluctuations, as described in Section 3.1.1d. When two independent detectors with the same variance of signal and of noise are exposed to the same stimulus, the joint detectability of the summed stimulus d´b is related to the detectabilities of the stimulus in each detector acting alone, d´1 and d´2, by
$$d'_b = \left[(d'_1)^2 + (d'_2)^2\right]^{1/2} \tag{3}$$
This is equivalent to saying that the precision with which the mean of a population is estimated increases in proportion to the square root of the number of observations. Under optimal conditions, this formulation gives the same √2 advantage of two eyes over one eye predicted by Campbell and Green’s formulation. Green and Swets (1966) referred to this formulation as the integration model because it is based on the idea that signals are perfectly summed before a detection decision is made. It is a model of neural summation rather than of simple probability summation because, in simple probability summation, all signal-to-noise processing is done in each eye and only “yes” or “no” signals are finally combined. Guth (1971) argued that when the probability of correct detection for each eye (Pm) is the same, the probability of correct detection with both eyes (Pb) is
$$P_b = P_m + d\,(P_m - 0.5) \tag{4}$$
where d is the difference between the miss rate and the false-alarm rate. With Pm > 0.5, it follows from this equation that when the false-alarm rate is smaller than the miss rate, Pb > Pm, but when the miss rate is smaller than the false-alarm rate, Pb < Pm. Guth also argued that whether two eyes perform better than one eye depends on the relative performance of the two eyes and on the relative frequencies of no-signal catch trials and signal trials. In this discussion it has been assumed that neural signals from each eye reach the brain along a single channel and that signal and noise sum algebraically, at least from spatially congruent contoured stimuli. There is some support for the idea that uncorrelated external noise in the two eyes sums algebraically (Braccini et al. 1980). However, visual inputs are grouped into different channels defined by color, size, and luminance-polarity, each with its own source of noise, and inputs combine by both summation and inhibition into partially distinct mechanisms in the visual cortex. We will see that several models of these processes have been proposed.
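The integration model of equation (3) and Guth's expression in equation (4) can be compared in a few lines. The sketch below uses hypothetical function names, and the miss and false-alarm rates in the example are arbitrary illustrative values, not data from the cited studies.

```python
import math


def integration_model_dprime(d1, d2):
    """Equation (3): joint detectability of two independent detectors."""
    return math.sqrt(d1 ** 2 + d2 ** 2)


def guth_binocular_probability(p_mono, miss_rate, false_alarm_rate):
    """Equation (4): binocular probability of correct detection, where
    d is the miss rate minus the false-alarm rate."""
    d = miss_rate - false_alarm_rate
    return p_mono + d * (p_mono - 0.5)


# Equal monocular detectabilities give the sqrt(2) binocular advantage.
print(integration_model_dprime(1.0, 1.0))

# Assumed rates: more misses than false alarms, so Pb exceeds Pm.
print(guth_binocular_probability(0.8, miss_rate=0.3, false_alarm_rate=0.1))

# Assumed rates reversed: more false alarms than misses, so Pb falls below Pm.
print(guth_binocular_probability(0.8, miss_rate=0.1, false_alarm_rate=0.3))
```

The second and third calls illustrate the sign dependence described in the text: with Pm above 0.5, the direction of the binocular effect follows the sign of d.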
13.1.2 BINOCULAR SUMMATION OF CONTRAST SENSITIVITY
13.1.2a Basic Studies of Contrast Summation
Lythgoe and Phillips (1938) found that the monocular luminance threshold for detection of a white disk 12.5° in diameter was 1.4 times the binocular threshold at all times during 20 minutes of dark adaptation. Crawford (1940a) obtained similar results for detection but found substantial binocular summation for brightness discrimination only outside the fovea. Campbell and Green (1965) measured the contrast sensitivity (reciprocal of threshold contrast) for a sinusoidal grating of various spatial frequencies. The pupils were atropinized, and stimuli were viewed through
artificial pupils. In monocular testing, one eye viewed a diffuse field with the same mean luminance as the grating in the other eye. Binocular sensitivity was √2 times higher than monocular sensitivity, in conformity with simple summation of signals from the two eyes, as mentioned previously. When the spatial frequency of the gratings was the same in the two eyes, the ratio of monocular to binocular contrast sensitivity was constant over the visible range of spatial frequencies, a result confirmed by Blake and Levinson (1977). In these experiments, and in others mentioned later, contrast is Michelson contrast, as defined in Figure 3.8. As the Michelson contrast of a grating is varied, its mean luminance remains constant. When the mean luminance of a monocular grating was doubled, Campbell and Green found that contrast sensitivity increased by a ratio of only 1.17. We can explain this low ratio by saying that, although the rate of neural firing increases when luminance increases, sensitivity to a difference in luminance (contrast) does not increase in proportion, because the differential threshold increases with increasing luminance. Thus, it is not signals representing luminance that are summed in binocular summation of contrast, because this would not produce the observed improvement in contrast sensitivity. Instead, signals representing contrast are derived in each eye, and it is these signals that sum. The processes of lateral inhibition in the retina generate signals related to contrast that are relatively unaffected by changes in the level of illumination. The binocular advantage can be explained if it is assumed that contrast signals sum and that noise sums only by √2. Legge (1984a) (Portrait Figure 13.1) found that the monocular contrast-detection threshold for a 0.5-cpd sinewave grating was about 1.5 times the binocular threshold (similar to the value reported by Campbell and Green). 
Thus, a monocular grating would need about 50% more contrast than a binocular grating to be as visible. Monocular and binocular psychometric functions had the same maximum slope of about 2 (Figure 13.2A). The horizontal separation of the two functions indicates the difference in threshold for a given rate of detection. In this example, for a detection rate of 75%, the monocular threshold is 1.4 times the binocular threshold. The vertical separation of the functions indicates the difference in detection rate for a given contrast. The functions in Figure 13.2B are plotted on log-log coordinates, and the % rate of detection is converted into the detectability measure, d´. This reveals that when both curves have a slope of 2, the ratio of binocular to monocular detectability for a given contrast is 2 to 1, as indicated by the vertical separation of the functions. For functions with slope 1, the ratio of thresholds is the same as the ratio of detectabilities. In general, the detectability (d´) for a grating of contrast C presented to one eye is:
$$d' = \left(\frac{C}{C'}\right)^{n} \tag{5}$$
where C´ is the threshold contrast (at 76% correct detection), and n is the slope of the psychometric function, which was 2 in Legge’s experiment. Thus, when the psychometric function for detection has a slope of 2, the monocular contrast threshold is 1.4 times the binocular threshold, as predicted by neural summation. However, detectability of a binocular grating at the contrast threshold is twice that of a monocular grating at the monocular contrast threshold. According to equation (2), neural summation predicts that binocular and monocular detectabilities should have a ratio of 1.4 to 1. The difference between the ratio of contrast-detection thresholds and the ratio of detectabilities arises because the detectability of a stimulus is not a linear function of its contrast.
Figure 13.1. Gordon E. Legge. Born in Toronto in 1948. He received a B.S. in physics from MIT in 1971, and a Ph.D. in experimental psychology from Harvard in 1976. He conducted postdoctoral work with Fergus Campbell at Cambridge University. In 1977 he joined the faculty of the University of Minnesota, where he is now a professor of psychology and neuroscience and director of the Minnesota Laboratory for Low-Vision Research.
Figure 13.2. Grating detection as a function of contrast. (A) Psychometric functions showing percent correct detection of a 0.5-cpd grating as a function of its contrast, for monocular and binocular viewing. For 75% detection, the monocular contrast threshold is about 1.5 times the binocular threshold, as indicated by the separation between the vertical lines. At the inflection points, both functions have a slope of about 2. Results for one subject. The symbols represent results from four sessions. (B) The same functions plotted on log-log coordinates with percent detection expressed as detectability, d´. For d´ of 1.0 (76% detection) horizontal separation of the functions indicates that the monocular contrast threshold is about 1.5 times the binocular threshold. Vertical distance between functions indicates that, for a given contrast, detectability of a binocular grating is about twice that of a monocular grating. (Redrawn from Legge 1984a)
13.1.2b Effects of Interocular Differences in Contrast
Legge (1984b) proposed that the effective binocular contrast of a grating (Cb) is the quadratic sum of the monocular contrasts (Cl and Cr), or
$$C_b = \sqrt{(C_l)^2 + (C_r)^2} \tag{6}$$
Quadratic summation implies that the contrast signal in each eye is squared before the two signals are combined with a compressive nonlinearity. When dichoptic stimuli have the same contrast and the eyes have the same contrast threshold, the monocular contrast threshold is √2 times the binocular threshold. The quadratic summation rule assumes that the stimuli are the same except in contrast. The model could be generalized to accommodate other differences between the stimuli by adding weighting functions to the two monocular contrasts.
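Equations (5) and (6) together account for the two different ratios quoted above, a threshold ratio of about 1.4 and a detectability ratio of about 2. The sketch below makes the arithmetic explicit. It assumes that binocular detectability is obtained by inserting the effective binocular contrast of equation (6) into equation (5) with the monocular threshold, which is an interpretive assumption, and the contrast values and function names are arbitrary.

```python
import math


def detectability(contrast, threshold_contrast, slope=2.0):
    """Equation (5): d' = (C / C')**n."""
    return (contrast / threshold_contrast) ** slope


def binocular_contrast(c_left, c_right):
    """Equation (6): quadratic sum of the two monocular contrasts."""
    return math.hypot(c_left, c_right)


mono_threshold = 1.0     # monocular threshold contrast C' (arbitrary units)
c = 0.5                  # contrast shown to each eye (arbitrary units)

d_mono = detectability(c, mono_threshold)
d_bino = detectability(binocular_contrast(c, c), mono_threshold)
print(d_bino / d_mono)   # detectability ratio at a fixed contrast: about 2

# Per-eye contrast at which the effective binocular contrast reaches C'.
bino_threshold = mono_threshold / math.sqrt(2.0)
print(mono_threshold / bino_threshold)   # threshold ratio: about 1.414
```

With a slope of 1 instead of 2, the two ratios coincide at √2, in line with the statement that the ratios of thresholds and of detectabilities are the same for functions with slope 1.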
Legge and Rubin (1981) used a standard binocular grating with the same contrast in the two eyes alternating with a test grating with fixed contrast in the right eye and variable contrast in the left eye. Each grating was presented for 180 ms with 600-ms intervals. Subjects adjusted the contrast of the test grating in the left eye until the contrast of the fused test grating appeared the same as that of the standard binocular grating. All gratings had the same phase and a spatial frequency of 1 or 8 cpd. Figure 13.3 shows an equal-contrast curve for 8-cpd gratings and three contrasts of the standard grating. Results for gratings of 1 cpd were similar except that the departure from averaging was more severe at low contrast. The results lie close to the curve representing a summation index of 2 but are well inside the diagonal, which represents an index of 1 (contrast averaging). This means that greater weight is given to the grating with higher contrast. The degree of summation decreased until, when the difference in contrast was considerable, the more luminous grating became less detectable than when the other eye was closed. This shows as some of the data points turning toward the origin as they approach the axes. This is the contrast-detection analog of Fechner’s paradox described in Section 13.1.4a. In Levelt’s averaging formula, stimuli are weighted in proportion to their contrasts. Perhaps the extra weight given to the dominant contrast in Legge and Rubin’s experiment arose because their stimulus was a grating with many
contrast borders, whereas Levelt used disks with contours only around the edges. The reduction in binocular summation as a function of the reduction of contrast in one eye has been found to be constant across the visible range of contrasts and spatial frequencies (Gilchrist and Pardhan 1987; Pardhan et al. 1989).
Figure 13.3. Dichoptic equal-contrast function. Example of an equal-contrast curve for 8-cpd gratings. For each contrast of the right-eye grating the contrast of a left-eye grating was varied until the fused image appeared equal in contrast to a binocular standard grating. Both test contrasts are expressed as a percentage of that of the standard stimulus for three contrasts of the standard grating. The blue line is the result expected from averaging contrasts in the two eyes. The red circle is the result expected from a quadratic summation rule. The green lines indicate results expected when the match is determined solely by the image with the higher contrast (N = 1). (Adapted from Legge and Rubin 1981) [Axes: right-eye contrast (horizontal) and left-eye contrast (vertical).]
Figure 13.4. Binocular summation contours. Detection of a dichoptic grating in the presence of uncorrelated and correlated noise. For each datum point, interocular contrast ratio was fixed and subjects adjusted the contrast of both dichoptic gratings until the fused grating was visible. Arrows on the axes indicate thresholds for monocular stimuli masked by monocular noise. The blue line indicates perfect linear summation of binocular signals with correlated noise. The green lines indicate loci of probability summation (N = 1). (Redrawn from Anderson and Movshon 1989) [Axes: threshold contrast in the left eye (horizontal) and in the right eye (vertical). Labeled loci: probability summation, summation with correlated noise, summation with uncorrelated noise, linear summation.]
Anderson and Movshon (1989) used superimposed dichoptic vertical sinusoidal gratings with the same luminance, phase, and spatial frequency in each eye. In each trial, the interocular contrast ratio was set at some value and the subject adjusted the contrast of both stimuli, while the ratio remained the same, until the grating was visible. One subject’s data are shown in Figure 13.4. The threshold contrasts of the two monocular gratings for each interocular contrast ratio define a binocular summation contour. For perfect linear summation of binocular signals, the contour should fall on a diagonal line with slope of -1. For complete independence, with no probability summation, it should fall along lines parallel to each axis, as shown in Figure 13.4. In fact, the data fell between these two limits. The data were fitted with the following power equation:
\[ \left(\frac{m_l}{a_l}\right)^{s} + \left(\frac{m_r}{a_r}\right)^{s} = 1 \qquad (7) \]
where $m_l$ and $m_r$ are the threshold contrasts of the left- and right-eye gratings when presented as a dichoptic pair, $a_l$ and $a_r$ are the contrast thresholds of each grating measured separately, and $s$ is a parameter inversely related to the
magnitude of binocular summation. When the monocular contrasts are equal and s = 2, the formula is equivalent to Legge’s quadratic summation formula, and the ratio of monocular to binocular thresholds is √2. The mean value of s was close to 2.
Anderson and Movshon argued that, if binocular summation represents the combined action of several visual mechanisms, it should be possible to probe the contribution of each mechanism by selective masking or by adaptation. They measured the binocular summation contour when noise was added to the dichoptic gratings. In one condition the noise was the same in both eyes (correlated) and in another condition it was uncorrelated. When contrast was similar in the two eyes, the threshold with uncorrelated noise was about √2 times lower than that with correlated noise, as one would predict from Campbell and Green’s formula for neural summation. These results agree with those of Braccini et al. (1980). Pardhan and Rose (1999) also obtained similar results and, in addition, found that binocular summation decreases with increasing levels of both correlated and uncorrelated noise.
The lower threshold with uncorrelated noise in the two eyes can be explained as follows. Correlated noise stimulates the same zero-disparity detectors used to detect the zero-disparity grating, whereas uncorrelated noise stimulates a variety of disparity detectors because the stimulus elements combine in various ways to produce lacy depth. Thus, it is easier to detect a grating in uncorrelated noise than a grating in correlated noise.
Anderson and Movshon found that, as the contrasts in the dichoptic stimuli became more different, the difference between correlated and uncorrelated noise became smaller. This result can be explained in the following way. When the contrasts in the two eyes are similar, binocular cells that summate inputs from the two eyes are maximally stimulated, and signals and noise summate to give an advantage to inputs with uncorrelated noise.
But when interocular contrasts differ, the summation mechanism is turned off and mutual inhibition responsible for binocular rivalry is turned on. The eye with the stronger signal now suppresses the other eye so that, in the extreme case, signal and noise from only one eye are available. Under these circumstances, it makes no difference whether the noise in the two eyes is correlated or uncorrelated. Actually, as we have already seen, some such mechanism must be assumed in the Campbell and Green model to account for why noise from a closed eye does not affect the monocular contrast threshold.
Anderson and Movshon produced evidence that there are several ocular-dominance channels, each of which can be selectively adapted. They dubbed this the distribution model of binocular summation. Binocular cells with equal monocular contrast thresholds are optimally stimulated by dichoptic stimuli of equal contrast. Cells strongly dominated by one eye are optimally stimulated by dichoptic
stimuli that differ in contrast. The general form of the psychophysically determined binocular summation contour represents the summed response of the ocular-dominance channels. According to evidence summarized in Section 13.1.8, most binocular cells in the visual cortices of cats and monkeys show linear binocular summation. In spite of linear summation within each channel, binocular summation assessed psychophysically is approximately quadratic. The deviation from linearity could be due to the low monocular contrast thresholds for the nondominant eye of cells with strong monocular dominance.
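The power equation (equation 7) can be solved for the dichoptic thresholds at a fixed interocular contrast ratio, which is how each point on a summation contour is defined. The sketch below is illustrative only: the function name, the example thresholds, and the chosen exponents are assumptions, not values from Anderson and Movshon.

```python
def dichoptic_thresholds(ratio, a_left, a_right, s):
    """Solve (m_l / a_l)**s + (m_r / a_r)**s == 1 with m_r = ratio * m_l.

    ratio   -- interocular contrast ratio m_r / m_l (fixed within a trial)
    a_left  -- threshold of the left-eye grating measured separately
    a_right -- threshold of the right-eye grating measured separately
    s       -- summation exponent (smaller s means more summation)
    """
    m_left = ((1.0 / a_left) ** s + (ratio / a_right) ** s) ** (-1.0 / s)
    return m_left, ratio * m_left


# Hypothetical monocular thresholds of 0.01 in each eye, equal contrasts.
linear = dichoptic_thresholds(1.0, 0.01, 0.01, s=1.0)      # linear summation
quadratic = dichoptic_thresholds(1.0, 0.01, 0.01, s=2.0)   # quadratic summation
near_indep = dichoptic_thresholds(1.0, 0.01, 0.01, s=50.0) # approaches independence
print(linear[0], quadratic[0], near_indep[0])
```

With s = 1 the per-eye threshold is half the monocular threshold, with s = 2 it is lower by a factor of √2, and as s grows the contour approaches the lines parallel to the axes that represent independence.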
13.1.2c Effects of Other Interocular Differences
The tuning characteristics of a binocular cell for orientation, spatial-frequency, and other stimulus features are fundamentally the same for stimuli presented to each eye separately. It is therefore not surprising that interocular summation of contrast sensitivity occurs only for stimuli with similar orientations and spatial frequencies (Julesz and Miller 1975; Westendorf and Fox 1975; Blake and Levinson 1977), similar directions of motion and temporal properties (Blake and Rush 1980), similar clarity of focus (Harwerth and Smith 1985), and similar wavelength (Trick and Guth 1980).
13.1.2d Effects of Stimulus Spacing
Binocular summation is reduced when the stimuli in the two eyes are spatially separated. Thus, the detectability of a small flashed target fell to the level of summation defined by Green and Swets’s integration model when the images of the flashed target were more than about 20 arcmin apart. This is a disparity at about the limit of binocular fusion (Westendorf and Fox 1977). Binocular summation of low-contrast gratings, as reflected in the reaction time for detection, fell to the level of probability summation as disparity increased beyond the limits of Panum’s fusional area (Harwerth et al. 1980). The disparity limit of binocular fusion increases as the spatial frequency of the stimulus is reduced (Section 12.1.2). The range of disparities over which binocular summation occurred showed a similar dependency on spatial frequency (Rose et al. 1988). It seems that binocular summation between dichoptic stimuli occurs above the level of probability summation only when the stimuli are fused.
13.1.2e Effects of Stimulus Position and Eccentricity
Wolf and Zigler (1963, 1965) measured the detectability of a 1° test patch at various positions on a circle with a radius of 10° around the fovea. Detectability was greater for binocular than for monocular viewing except when the test patch fell on the midvertical meridian. They argued that the
two halves of a test patch on the midvertical meridian project to opposite cerebral hemispheres so that, although each hemisphere receives inputs from both eyes, it receives only half the total area. The retina has a radial organization. Responses of orientation-selective cells in the cat visual cortex were largest when the stimulus grating was orientated radially (Levick and Thibos 1982). Also, in the peripheral retina, human observers are more sensitive to a radial grating than to a grating in any other orientation (Rovamo et al. 1982; Temme et al. 1985). Binocular summation for detection of the orientation of a grating at the fovea did not vary with grating orientation. However, at an eccentricity of 8°, summation was greater when the grating was parallel to a radial meridian than when it was orthogonal to it (Pardhan 2003). As a spot increased in horizontal eccentricity to 75°, binocular summation decreased with targets subtending 0.1° but increased with targets subtending 1.7° (Wood et al. 1992). Thus, binocular summation is greater when the size of the stimulus is more closely matched to the size of receptive fields. In the foveal region of the retina, resolution is limited by the optical properties of the eye. With increasing retinal eccentricity, the limiting factor becomes the sampling density of receptive fields (Wang et al. 1997; Williams et al. 1996). Undersampling causes aliasing in which a grating is seen as one of lower spatial frequency and, typically, in a different orientation (Section 9.1.5). Thus, a grating in the periphery may be detected but its orientation may be misperceived. Moving a grating into the periphery should have less effect on detection than on identification of orientation. In the fovea, viewing with both eyes improved both grating detection and identification of grating orientation about 5% (Zlatkova et al. 2001). This is in accord with probability summation. 
At 25° away from the fovea, binocular viewing improved detection by 6% but improved orientation identification by 16%. The combination of the two monocular fields must have improved sampling density at some higher level in the visual system.
13.1.2f Binocular Summation and Breaking Camouflage
Binocular summation could enhance the visibility of a fixated object relative to that of a binocularly disparate background in the following way. Spatial-frequency components in the background with a spatial period twice the disparity of the background would be in antiphase in the two eyes. Therefore, binocular summation would make them less visible. The visibility of components with the same spatial period as the disparity would be relatively enhanced by binocular summation. Overall, the spectral density function for the background should show undulations with cancellation at odd multiples of the disparity and summation at even multiples. The fused images of the fixated object show binocular summation across the whole range of spatial frequencies. Schneider and Moraglia (1994) showed experimentally that the visibility of a fused target is enhanced when its spatial frequency falls within a furrow of the spectral density function of the disparate background. In other words, spatial-frequency components of a fused target that are odd multiples of the disparity of the background have enhanced visibility relative to a background.
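The antiphase argument can be checked numerically. The sketch below assumes, as a simplification that is not part of Schneider and Moraglia's account, that the two monocular signals are equal-amplitude sinusoids that add linearly; a disparity d then shifts a component of spatial period P by an interocular phase of 2πd/P, and the summed amplitude is proportional to |cos(πd/P)|. The function name and the example values are hypothetical.

```python
import math


def summed_amplitude(period: float, disparity: float) -> float:
    """Relative amplitude (0 to 1) of the binocular sum of two equal sinusoids.

    The interocular phase difference is 2*pi*disparity/period, so the sum of
    the two sinusoids has relative amplitude |cos(pi * disparity / period)|.
    Linear addition of the monocular signals is an illustrative assumption.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return abs(math.cos(math.pi * disparity / period))


d = 10.0  # hypothetical background disparity in arcmin
for period in (2 * d, d, 2 * d / 3):
    print(f"period {period:6.2f} arcmin -> amplitude {summed_amplitude(period, d):.3f}")
```

A component whose period is twice the disparity cancels, a component whose period equals the disparity sums fully, and the pattern repeats for shorter periods, giving the undulating spectral density described above.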
13.1.2g Binocular Summation of Isoluminant Stimuli
Simmons and Kingdom (1998) measured monocular and binocular detection thresholds for 0.5-cpd Gabor patches that were either isochromatic or red-green isoluminant. Binocular summation for both stimuli was above the level of probability summation, and was particularly high for the isoluminant stimuli. They suggested that binocular summation is high in the chromatic system because binocular inhibition is weaker in that system than in the achromatic system. Jiménez et al. (2002a) found binocular summation above the level of probability summation for both isochromatic and isoluminant stimuli. They then measured binocular summation by recording reaction times to changes in luminance or chroma of 1.5° patches. This revealed greater summation in the luminance system than in the chromatic system.
13.1.2h Summary
The monocular contrast-detection threshold for a light spot or grating is about 1.5 times the binocular threshold, in conformity with the formulation of Campbell and Green. The magnitude of binocular summation decreases with increasing difference in contrast between the images in the two eyes. Binocular summation of contrast sensitivity is greater for dichoptic images that are close together in space and in time, have similar stimulus characteristics, and stimulate the same cerebral hemisphere. In the periphery, summation is greater when a grating is parallel to a radial meridian and larger in size. Also, in the periphery, binocular summation improves identification of orientation more than it improves grating detection. Binocular summation may help to break camouflage and occurs for both luminance and chromatic stimuli.
13.1.3 SUMMATION AT SUPRATHRESHOLD CONTRASTS
At low levels of luminance, it is an advantage to sum inputs from the two eyes. That may be one reason why frontal vision evolved in the nocturnal ancestors of primates. However, it would be disturbing if suprathreshold stimuli
from the two eyes were to sum linearly. Things would appear to double in contrast from monocular to binocular viewing. Also, things in the binocular field would appear brighter than those in the monocular crescents. The processes described in the next two sections contribute to the absence of linear summation of contrast at suprathreshold contrasts.
13.1.3a Binocular Summation and Response Saturation
At low levels of contrast, the neural response to a stimulus is a positively accelerating function of contrast. The response saturates at higher levels of contrast. For a monocularly viewed grating with a contrast of up to about 0.25, the contrast-difference threshold is much smaller than the absolute level of contrast required for detection of the grating (Nachmias and Sansbury 1974). Legge (1984a) used a forced-choice procedure to measure the threshold for detection of an increment of contrast in a suprathreshold grating at various levels of contrast. Initially, both binocular and monocular psychometric functions had a slope of 1. As the pedestal contrast of the grating increased to 0.25 the advantage of binocular over monocular discrimination fell to zero, as one would expect of a response that saturates at high levels of contrast. Discrimination of the orientation of a grating also saturates at high contrasts. This would explain why orientation discrimination of high-contrast gratings was similar for monocular and binocular viewing (Andrews 1967). Bearse and Freeman (1994) measured orientation discrimination for one-dimensional Gaussian patches. Binocular performance was 66% better than monocular performance for stimuli that were both brief (50 ms) and low in contrast (8%). When either duration or contrast was increased beyond a certain level, binocular and monocular performances became equal. In summary, it seems that binocular energy summation occurs in the contrast or temporal threshold region where the response of the visual system is an accelerating function of stimulus energy. Response saturation limits binocular summation for discrimination of stimuli well above detection threshold. See Meese et al. (2006) for data and theory on this issue.
13.1.3b Binocular Summation and Gain Control
Ding and Sperling (2006) developed a gain-control theory of binocular combination. They proposed that, in each neighborhood, inputs from the two eyes engage in mutual inhibition. The inhibitory strength of each stimulus is proportional to its contrast. When both stimuli have low contrasts, inhibitory influences are reduced and the images show strong summation.
Ding and Sperling tested their theory by presenting a horizontal sine-wave grating to each eye. The gratings had a spatial frequency of 0.68 cpd but varied in amplitude and phase. Subjects judged the phase of the fused grating relative to a fixed marker. The perceived phase of the fused grating was taken as an indication of the relative contributions of the two images to the fused image. The results provided strong evidence for their model. Moradi and Heeger (2009) produced evidence for a gain-control mechanism of binocular combination in the fMRI responses to dichoptic gratings. They related this process to response normalization in which responses of cells are normalized with respect to the activity of neighboring cells.
13.1.3c Binocular Summation for Acuity
Given that binocular summation of contrast sensitivity is much reduced at high contrasts and that vernier acuity depends on contrast sensitivity, one would expect binocular summation for hyperacuities to decline at high contrasts. Landolt C acuity showed a binocular advantage of about 40% at a contrast of 0.01 but of less than 10% at a contrast of 0.8 (Home 1984). Vernier acuity was between 40 and 60% better with binocular than with monocular viewing for contrasts up to about 20 times above threshold. The binocular advantage declined at higher contrasts (Banton and Levi 1991). Binocular summation for a 3-dot alignment task varied between 0% and 35% but was not much affected by an increase in dot separation up to 5 arcmin. Summation for the task of placing a high-contrast horizontal line midway between two other lines occurred only for separations of the outer lines of less than 2 arcmin, probably because this task involves an element of luminance discrimination (Lindblom and Westheimer 1989).
13.1.3d Dichoptic Detection of Luminance Ratios
When a difference in luminance of superimposed dichoptic patches reaches a certain value the effect of the difference becomes evident as a change in perceived luminance or binocular luster. Formankiewicz and Mollon (2009) asked whether the detection of interocular luminance difference obeys Weber’s law, Ricco’s law, and Bloch’s law. Subjects detected which patch in an array of 16 patches viewed in a stereoscope was differently illuminated in the two eyes. When the data were expressed in terms of differences in contrast, the difference threshold was a linear function of the log of the fixed contrast of the image in one eye, with a slope close to unity. This indicated that judgments followed Weber’s law. In conformity with Ricco’s law, detection of an interocular luminance difference improved with increasing area for patches up to a certain area and for short durations. In conformity with Bloch’s law, detection
improved with increasing stimulus duration up to a limiting duration. When the general illumination of a scene changes, the light reflected from one surface expressed as a ratio of the light reflected from a neighboring surface remains almost exactly constant. If the reflectance of a surface changes, ratios of reflected light from neighboring objects usually change. Thus, the constancy of luminance ratios could allow people to distinguish between changes in illumination and changes in surface reflectance (whiteness). Nascimento and Foster (2001) asked whether changes in the luminance ratio of two adjacent surfaces are detected when one surface is presented to one eye and the other surface to the other eye. Two pairs of neighboring squares were presented in succession with a change in luminance that represented either a change in illumination or a change in reflectance. In a second condition, subjects indicated which of the two sequences appeared more like an illumination change. Subjects could perform the task dichoptically, but were not as precise as when both eyes saw both test squares. Dichoptic and binocular performances became similar when the test squares were spatially separated by 3°.
13.1.3e Binocular Summation for Pattern Recognition
Binocular summation is near the level of probability summation for more complex visual tasks performed at a suprathreshold level of contrast, such as recognition of letters on a noisy background (Berry 1948; Carlson and Eriksen 1966; Townsend 1968; Frisén and Lindblom 1988). Cagenello et al. (1993) used a letter recognition task with contrasts between 0.3 and 1 and obtained a mean binocular advantage of 11%, but there were wide differences between the four subjects. The mean advantage diminished as the images were made to differ in contrast. Briefly exposed letters were recognized more accurately with binocular than with monocular presentation (Williams 1974). The advantage fell to the level of probability summation when letters fell on noncorresponding areas in the two retinas or were separated by temporal intervals of more than 50 ms (Eriksen et al. 1966; Eriksen and Greenspon 1968). Also, the binocular advantage was not evident in two subjects with strabismus. As one would expect, the ability to recognize two different letters was reduced when they were superimposed dichoptically but not when they were presented to noncorresponding areas or successively (Greenspon and Eriksen 1968).
Uttal et al. (1995) dichoptically combined identical aircraft silhouettes, subtending 1°. One image was degraded by low-pass spatial-frequency filtering. The other image was degraded by intensity averaging within local areas. Each pair of images was presented for 100 ms. When two dichoptic pairs were presented sequentially, subjects could detect whether the silhouettes were the same or different aircraft with fewer errors than when two silhouettes with the same single degradation were presented monocularly in sequence. The same advantage was evident when the images with two types of degradation were physically superimposed in one eye. Thus a degraded image of an object can be seen more clearly when combined, either monocularly or dichoptically, with the same image degraded in a different way rather than in the same way.
13.1.4 BINOCULAR SUMMATION OF BRIGHTNESS
If the inputs from the two eyes were to sum in a simple fashion, an illuminated area would appear about twice as bright when viewed with two eyes as when viewed with one eye. In fact, Jurin in 1755 and Fechner in 1860 observed that an illuminated area appears only slightly brighter when viewed with two eyes (see Robinson 1895; Sherrington 1904; De Silva and Bartley 1930). A bright patch presented to one eye may actually appear less bright when a dim patch is presented to the same region in the other eye. This effect is known as Fechner’s paradox.
13.1.4a Levelt’s Experiments on Brightness Summation
To investigate binocular brightness summation, Levelt (1965a) presented a 3° luminous disk on a dark ground to corresponding regions in each eye. The disk in one eye had a fixed luminance, and the subject adjusted the luminance of the disk in the other eye until the combined image appeared the same brightness as a comparison stimulus with the same fixed luminance in the two eyes. The test and comparison stimuli were presented sequentially in the center of the visual field. The results for one subject for a comparison stimulus with a luminance of 20 cd/m² are shown in Figure 13.5. Over the straight part of the curve, the brightness of the binocular comparison disk was equal to the mean brightness of the dichoptic test disk. When the test disks had the same luminance, the comparison and test disks were, necessarily, identical. As the luminance of the test disk was increased in one eye, it had to be decreased in the other eye by a proportionate amount to maintain the match with the comparison disk. Some subjects gave a greater weighting to one of the dichoptic images than to the other. Look what happened when the luminance of the disk in one eye was set at or near zero, as indicated by the orthogonal dashed lines in Figure 13.5. The slope of the linear function reversed, which means that more luminance was required in the brighter disk when the dimmer disk was presented to the other eye compared with when there was no stimulus in the other eye. This is Fechner’s paradox.
Figure 13.5. Dichoptic equal-brightness curve. Equal-brightness curve for a 3° luminous disk presented to both eyes at a luminance of 20 cd/m² with respect to a pair of fused dichoptic disks set at various luminance ratios. Dashed lines indicate boundaries of the region in which dichoptic summation fails and Fechner’s paradox is evident (N = 1). (Redrawn from Levelt 1965b) [Axes: right test field luminance (horizontal) and left test field luminance (vertical), in cd/m².]
For the data in Figure 13.5, a luminance of about 32 cd/m² was required to match a monocular disk with a comparison
stimulus of 20 cd/m², whereas a higher luminance was required when a dimly illuminated disk was visible in the other eye. Fechner’s paradox can be explained by assuming that the processes underlying dichoptic brightness averaging involve both summation and inhibition. When border contrast is similar in the two eyes, summation predominates, with greater weight given to the stimulus with greater contrast, but when the contours in the two eyes differ greatly in contrast, inhibition outweighs summation. When the contrasts are opposite in sign, inhibition becomes evident as rivalry (Fry and Bartley 1933). Inputs from a totally uncontoured region in one eye are usually suppressed by inputs from a contoured region in the other eye. When Levelt (1965b) superimposed a 2°-diameter black ring on one of the dichoptic disks, the contribution of that disk to the brightness match increased. Furthermore, in the immediate neighborhood of a contour presented to only one eye, binocular brightness was determined wholly by the luminance in that eye. The effect of an added contour can also be understood in terms of Fechner’s paradox. The influence of contour is illustrated in Figure 13.6. A white disk with a black perimeter combines dichoptically with a black disk to form a gray disk. But a black disk combines with an uncontoured white region to form a black disk, which resembles that formed by two black disks (Levelt 1965a). One cannot be sure that a closed eye makes no contribution to a binocular match. Zero contribution from an eye can be guaranteed only if the eye is pressure blinded. Levelt explained his results by stating that binocular brightness (B) depends on a weighted sum of the luminances
of the monocular stimuli ($E_l$ and $E_r$). The weights ($w_l$ and $w_r$) sum to 1 and depend on the relative dominance of the two eyes and the relative strengths of the two stimuli, determined mainly by the contours they contain. Thus,
\[ B = w_l E_l + w_r E_r \qquad (8) \]
Figure 13.6. Influence of contour on dichoptic brightness. (A) The fused black and white disks appear brighter than either the fused black disks or the monocular black disk because the strong rim around the white disk adds to the dominance of the white disk. (B) The black and white disks rival because they have equally strong rims. Sometimes the fused black and white disks are as dark as the fused black disks. (From Levelt 1965a)
This is a purely formal theory since it can describe many results if appropriate weights are selected, and there is no independent procedure for deciding the weights. Furthermore, it assumes a linear transduction of stimulus luminance and contrast into neural signals signifying brightness. Since the weights sum to 1, the formula cannot account for binocular brightness in excess of the average of the monocular luminances. De Weert and Levelt (1974) added a parameter to the simple luminance-averaging formula to account for brightness summation being slightly better than predicted by averaging, and to account for Fechner’s paradox. De Weert and Levelt (1976b) provided evidence that stimuli from the middle of the chromatic spectrum contribute more to dichoptic brightness than stimuli from the ends of the spectrum. Levelt assumed that dichoptic luminances rather than dichoptic brightnesses were averaged. Teller and Galanter (1967) held the luminance of monocular patches constant while varying their brightness, either by changing the adaptive state of the eye or the contrast between the patches and
their background. In both cases the brightness of dichoptically viewed patches varied with the imposed change in monocular brightness. In particular, the level of luminance of the stimulus in one eye at which Fechner’s paradox was evident did not depend on the absolute luminance of the stimulus but on its luminance relative to the luminance threshold.
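Levelt’s weighted-sum rule (equation 8) is simple enough to work through numerically. The following is a minimal sketch with hypothetical function names and example luminances; the equal weights are an assumption for illustration and are not fitted to Levelt’s data.

```python
def levelt_brightness(e_left, e_right, w_left=0.5):
    """Weighted sum B = w_l * E_l + w_r * E_r, with w_l + w_r = 1."""
    if not 0.0 <= w_left <= 1.0:
        raise ValueError("w_left must lie between 0 and 1")
    return w_left * e_left + (1.0 - w_left) * e_right


def matching_luminance(e_fixed, target, w_fixed=0.5):
    """Luminance needed in the other eye for the weighted sum to equal target."""
    w_other = 1.0 - w_fixed
    if w_other == 0.0:
        raise ValueError("the adjustable eye must have a nonzero weight")
    return (target - w_fixed * e_fixed) / w_other


# Hypothetical match to a 20 cd/m^2 comparison with 30 cd/m^2 fixed in one eye.
other = matching_luminance(e_fixed=30.0, target=20.0)
print(other, levelt_brightness(30.0, other))
```

With equal weights, raising the luminance in one eye by 10 cd/m² requires lowering it by 10 cd/m² in the other eye, which corresponds to the straight part of the equal-brightness curve. Because the weights sum to 1, the result can never exceed the larger of the two monocular luminances, and the rule with fixed weights does not by itself produce the reversal seen in Fechner’s paradox.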
13.1.4b Other Models of Brightness Summation
Erwin Schrödinger (1926), as a change from his work in fundamental physics, proposed that each monocular input ($f_l$ and $f_r$) is weighted by the ratio of the signal strength from that eye to the sum of the strengths of the signals from the two eyes. The binocular result, $B$, is then given by:
\[ B = f_l \, \frac{f_l}{f_l + f_r} + f_r \, \frac{f_r}{f_l + f_r} \qquad (9) \]
MacLeod (1972) added to this account by proposing that the strength of a neural signal, $f$, is a logarithmic transform of the stimulus contrast, as specified by:
\[ f = f_0 + \log\left(\frac{l}{l_0}\right) \qquad (10) \]
where $f_0$ is the internal noise, $l$ is the difference in luminance across the contour, and $l_0$ is the threshold luminance difference. A good fit to Levelt’s data in Figure 13.5 was obtained by setting $B$ = 1.36, $f_0$ = 0.34, and $l_0$ = 2 cd/m² for each eye.
Several other models of binocular brightness summation have been proposed. Engel (1967, 1969, 1970a) used a weighted quadratic sum model to account for binocular summation of brightness. In this formulation, the brightness of a binocular stimulus derived from magnitude estimations ($\psi_b$) is related to the brightness of monocular stimuli ($\psi_l$ and $\psi_r$) by the expression
\[ \psi_b^2 = (W_l \psi_l)^2 + (W_r \psi_r)^2 \qquad (11) \]
The weighting functions ($W_r$ and $W_l$) were derived from normalized autocorrelation functions of the image in each eye and reflected the amounts of contour and contrast in each image. They thus served the same function as the weights in Levelt’s formula except that Engel provided a process for determining their values. Engel’s function resembles the quadratic summation model used by Legge to describe binocular summation of contrast sensitivity (equation 5), and by Legge and Rubin (1981) to describe summation of contrast in suprathreshold gratings.
Tanner (1956) proposed that the detectabilities of single stimuli in each of two detectors sum like vectors to predict the discriminability between stimuli presented to the two detectors. According to this formulation, monocular contrasts of magnitude $C_l$ and $C_r$ sum like vectors to produce binocular contrast, $C_b$ (Curtis and Rule 1978), or
\[ (C_b)^2 = (C_l)^2 + (C_r)^2 + 2 C_l C_r \cos\phi \qquad (12) \]
Correlation between noise in the two eyes is represented by $\cos\phi$. When $\phi$ is between 90° and 120° the function reduces to averaging and accounts for Fechner’s paradox. An angle of 90° ($\cos\phi = 0$) signifies that contrast is detected independently in the two eyes with uncorrelated noise and the formula reduces to Legge’s quadratic sum formula. An angle of 0° ($\cos\phi = 1$) signifies that the noise is perfectly correlated in the eyes and binocular contrast becomes the simple sum of the monocular contrasts or,
\[ (C_b)^2 = (C_l)^2 + (C_r)^2 + 2 C_l C_r, \quad \text{or} \quad C_b = C_l + C_r \qquad (13) \]
Curtis and Rule did not propose a physiological representation of the vector-addition process. This formula contains no weightings to allow for differences between the images in the two eyes but it could easily be modified to do so. Other models of binocular brightness summation specify binocular processes that extract differences between binocular stimuli, and other processes that extract sums of binocular stimuli. For instance, in a model proposed by Lehky (1983), dichoptic stimuli with matching contours are processed in the summing channel while those with opposite luminance polarity are processed in the differencing, or rivalry, channel. Lehky also used a vector-sum formula and interpreted the angle between the vectors as the relative contributions of the summing and differencing channels. Cohn et al. (1981) found that stimuli in the summing channel, such as binocular increments of luminance, were selectively masked by noisy fluctuations of luminance that were correlated in the two eyes. On the other hand, signals in the differencing channel, such as a luminance increment in one eye and a decrement in the other, were selectively masked by uncorrelated noise. They argued that this evidence supports a two-process model of binocular combination. Cogan (1987) proposed that the differencing channel receives an excitatory input from one eye and an inhibitory input from the other, and that the summing channel receives only an excitatory input from both eyes. He assumed that there are no purely monocular cells in the binocular field and that the net binocular response is the pooled output of the two channels. Sugie (1982) developed a neural network model of these processes.
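The special cases of the vector-sum formula of equation (12) can be checked numerically. The following is a minimal sketch, not part of any published model code. The function name `binocular_contrast` and the unit contrasts are illustrative assumptions, and the cross term is taken to carry a factor of 2, as required for the formula to reduce to a simple sum when the angle is 0°.

```python
import math


def binocular_contrast(c_left, c_right, phi_deg):
    """Combine two monocular contrasts as vectors separated by phi_deg degrees.

    Implements (C_b)^2 = (C_l)^2 + (C_r)^2 + 2 * C_l * C_r * cos(phi).
    The name and signature are illustrative assumptions.
    """
    phi = math.radians(phi_deg)
    squared = c_left ** 2 + c_right ** 2 + 2.0 * c_left * c_right * math.cos(phi)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


# Equal unit contrasts in the two eyes at three vector angles.
for phi_deg in (0, 90, 120):
    print(phi_deg, round(binocular_contrast(1.0, 1.0, phi_deg), 3))

# Unequal contrasts at 120 degrees: the combined value falls below the larger input.
print(round(binocular_contrast(1.0, 0.5, 120), 3))
```

With equal inputs, the three angles should give approximately the simple sum, the quadratic sum, and the value of a single input (averaging). With unequal inputs at 120°, the combined contrast is smaller than the larger monocular contrast, which is the pattern described as Fechner’s paradox.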
13.1.4c Summation With and Without Contours
The models proposed by Fry and Bartley (1933), De Weert and Levelt (1974), and Engel (1970a) stress the role of contour in determining the amount of binocular brightness summation. However, this factor was not explored systematically. Leibowitz and Walker (1956) found that the amount of binocular summation of brightness decreased as the size of the stimulus was reduced from 1° to 15 arcmin. They attributed this effect to the increase in the proportion of contour to area, as area was decreased. Bolanowski (1987) obtained estimates of binocular summation when all contours were removed from the visual field. He used a Ganzfeld produced by illuminating table tennis balls trimmed to fit over the eyes. Subjects rated the apparent brightness of the Ganzfeld presented for 1 s either to one eye or to both. The results for different levels of illumination are shown in Figure 13.7. When each eye received the same luminance, the apparent brightness of the binocular Ganzfeld was about twice that of the monocular Ganzfeld. Binocular summation of brightness was thus complete. When the diameter of the stimulus was reduced to 2°, binocular brightness was about the same as monocular brightness, as found by Levelt. Bourassa and Rule (1994) confirmed these results and also noted that Fechner’s paradox was absent when binocular summation was measured with Ganzfeld stimuli. They also noted that summation with small stimuli with gradual borders was less than with Ganzfeld stimuli but more than with small stimuli with sharp borders. Binocular facilitation did not occur when the density of a grid pattern was different in the two eyes, or when one eye was exposed to a grid and the other to a flashed diffuse light (Harter et al. 1974). These results can be described by Curtis and Rule’s equation (12) if the angle separating the vectors increases in relation to the presence of inhibitory influences arising from sharp contours. Thus the various models of brightness summation can accommodate this result if appropriate weights are assigned to visual contours. Grossberg and Kelly (1999) developed a model of binocular brightness perception in terms of neural dynamics.
Figure 13.7. Dichoptic apparent brightness. (A) Magnitude estimation of apparent brightness of a Ganzfeld presented to both eyes or to one eye as a function of log luminance. (B) Magnitude estimations of apparent brightness of a 2° spot presented to both eyes and to one eye. Bars are standard error (N = 8). (Redrawn from Bolanowski 1987)
13.1.4d Interactions Between Orthogonal Dichoptic Edges
In the experiments reviewed so far the dichoptic stimuli were similar in orientation. In Section 12.3.2 it was mentioned that low-contrast orthogonal dichoptic gratings do not rival but rather combine to create a stable plaid pattern (see Figure 12.16). Liu and Schor (1995) assumed that, at low contrast, the contrast of each orthogonal edge in a stimulus like that shown in Figure 13.8 is preserved in the dichoptic image because, at each intersection, the edge in one eye dominates the evenly illuminated region in the other eye. From this they predicted that the apparent contrast of a low-contrast dichoptic pattern should match that of a similar monocularly viewed pattern of twice the contrast combined with a black disk in the other eye. Their results agreed with this prediction rather than with one based on averaging of luminance within each region of the display. At higher contrasts, the apparent contrast of the dichoptic stimulus was reduced relative to that of the monocular stimulus. This was presumably because of spreading rivalry between the orthogonal dichoptic edges. Subjects could distinguish between the dichoptic and monocular stimuli even when their contrasts matched. This could be due to the relative motion of dichoptic images arising from vergence instability.
Figure 13.8. Stimuli used by Liu and Schor (1995). Upper stimuli are dichoptic orthogonal edges, which fuse to create a plaid pattern. Lower stimuli are a plaid pattern in one eye and a blank field in the other. At low contrast, the contrast of the monocular plaid pattern had to be twice that of the dichoptic patterns to produce fused images with the same apparent contrast. At high contrasts, more contrast was required in the dichoptic patterns to produce a match. (Panels show the left-eye image, the right-eye image, and the fused image.)
13.1.5 FLICKER FUSION AND BINOCULAR SUMMATION
The frequency at which a flickering light appears to fuse into a continuous light is the critical fusion frequency (CFF). With increasing luminance, the CFF increases up to a limit of about 50 Hz (Crozier and Wolf 1941). Sherrington (1904) hypothesized that, if inputs from the two eyes converge on the same cells in the same way that motor efferents converge in a final common path, then the CFF should be higher for a flickering light viewed binocularly than for one viewed monocularly. However he found monocular CFF and binocular CFF to be about the same. He also hypothesized that the CFF should be about twice as high for flickering lights presented in phase to the two eyes than for lights in antiphase. Sherrington found the CFF to be only about 3% higher for in-phase than for antiphase dichoptic flicker and concluded that there is very little convergence of binocular inputs. He wrote, “The binocular sensation attained seems combined from right and left uniocular sensations elaborated independently.” Sherrington underestimated the difference between in-phase and antiphase binocular CFF. More recently, the CFF for in-phase flicker was found to be between 4.5% and 10% higher than for antiphase flicker (Ireland 1950; Baker 1970). This difference is due to neural summation in binocular cells, since it is higher than predicted by probability summation (Peckham and Hart 1960). Also, no significant difference between in-phase and antiphase flicker sensitivity was found in subjects lacking stereoscopic vision (Levi et al. 1982). We now know that many inputs from the two eyes do converge on common cells. Nevertheless, Sherrington’s main conclusion still stands, namely, that lights flickering in phase well below the CFF do not simply sum to produce a flicker sensation of twice the frequency. On the other hand, the partial elevation of in-phase over antiphase flicker could be due to sensations of flicker arising from monocular cells. 
Even a few monocular cells would retain a signal of flicker after all binocular cells have ceased
to register it. Thus, the CFF is not a sensitive measure of binocular summation. Furthermore, Sherrington worked at suprathreshold levels, where inhibitory as well as excitatory interactions occur between inputs to binocular cells. Interocular summation of flicker sensitivity is more likely to be revealed at threshold levels of luminance. Another factor could be the presence of contours in the stimuli. It has already been noted that dichoptic brightness summation is increased when the images contain no contours. Thomas (1956) found that the CFF with in-phase dichoptic flicker was increased by the addition of parallel lines to each image, even when the lines in one eye were orthogonal to those in the other. The absence of dichoptic summation of flicker represents the temporal limit of the ability of binocular cells to receive alternating flashes from the two eyes, possibly because of mutual inhibitory processes evoked by stimuli in antiphase. One would expect that asynchronous dichoptic flashes below a certain frequency would not engage in mutual inhibition and would produce a sensation of double the flicker frequency. Andrews et al. (1996) obtained this result for alternating dichoptic flashes at frequencies of 2 Hz or less. Above this frequency, asynchronous and synchronous flashes were increasingly judged to have the same frequency. The temporal contrast-sensitivity function is the least modulation of luminance required for the detection of flicker plotted over a range of temporal frequencies. It is also known as a De Lange function (De Lange 1954). It is the temporal analog of the spatial contrast-sensitivity function and has a similar band-pass shape. This measure has been used to investigate dichoptic flicker. Cavonius (1979) found that, with a homogeneous foveal field, flicker sensitivity increased with increasing flicker rate up to about 2% luminance modulation at about 10 Hz. It then fell rapidly to zero at a frequency of about 50 Hz. 
For flicker rates above about 10 Hz, sensitivity for in-phase dichoptic flicker was about 40% higher than for antiphase flicker or for monocular flicker (Figure 13.9). At low flicker rates, sensitivity for in-phase dichoptic flicker was up to four times higher than that for antiphase flicker. This could be because of summation of signals arising from lights flickering in phase in the two eyes, which is dichoptic summation of in-phase signals, or because of summation of opposite-sign signals from lights flickering in antiphase, which is dichoptic summation of antiphase flicker. The summation of in-phase signals seems to be the crucial factor, because sensitivity to antiphase dichoptic flicker was about the same as sensitivity to monocular flicker with the other eye exposed to a steady field (van der Tweel and Estévez 1974; Cavonius 1979). Other evidence reviewed in the next section supports the idea of dichoptic summation of in-phase flicker but not of antiphase flicker. The difference probably arises from interactions between the temporal phases of responses to flashes, as discussed in Section 13.1.6c.
Figure 13.9. Detection of dichoptic flicker. Threshold contrast modulation for detection of in-phase and counterphase flicker of a 1° illuminated spot as a function of flicker frequency (N = 1). (Adapted from Cavonius 1979)
13.1.6 SENSITIVITY TO PULSED STIMULI AND MOTION
Anstis and Ho (1998) repeated Levelt’s experiment using a 0.7° dichoptic test spot flickering in phase to the two eyes at 15 Hz, rather than a constantly illuminated spot. The spot presented to one eye was set at one of several luminance modulations. Subjects adjusted the luminance modulation of the spot presented to the other eye until the fused image of the two flickering spots matched the luminance of a steady gray comparison spot seen subsequently by both eyes. For light spots on a dark ground, the results were similar to those obtained by Levelt. The apparent luminance of the dichoptic flickering spot was the mean of the luminance of the component spots. At extreme values there was evidence of Fechner’s paradox (see Figure 13.4). However, for dark spots on a light ground, the apparent luminance of the dichoptic spot was equal to the luminance of the spot with higher luminance. Anstis and Ho suggested that the mean weighting function for light-on-dark spots compared with the winner-take-all weighting for dark-on-light spots is due to an underlying asymmetry in the ON and OFF visual pathways. However, the state of adaptation of the eyes differed between the two conditions.
13.1.6a Dichoptic Flashes
The temporal sensitivity of the visual system can be explored with single flashed stimuli, which may be a light spot that is momentarily extinguished (negative polarity) or a dark spot that is momentarily increased in luminance (positive polarity). The threshold for detection of dichoptically flashed test spots that either both increased or both decreased in luminance was lower than the threshold of flashed spots that increased in luminance in one eye and decreased in the other (Westendorf and Fox 1974). Same-sign flashes were detected at a level above that of probability summation, whereas opposite-sign flashes were detected at about the level of probability summation. This provides further support for summation of dichoptic in-phase signals and independence of dichoptic antiphase signals. When one of the flashed targets was a vertical bar and the other was a horizontal bar, flash detection was at the level of probability summation for both same- and opposite-sign flashes (Westendorf and Fox 1975). The receptive field of a ganglion cell has a spatial sensitivity profile. For example, an on-center receptive field has an excitatory center with a Gaussian profile and an inhibitory surround with a wider Gaussian profile. A receptive field also has a temporal-sensitivity profile. A flash of light in the center of an on-center receptive field produces an excitatory discharge followed by an inhibitory phase. An off-center field responds in the same way to a briefly darkened spot. The durations of these phasic responses to light pulses can be estimated by measuring either the probability of seeing or the threshold for detection of a pair of flashes as the interstimulus interval is increased. The procedure is analogous to Westheimer’s procedure for measuring the spatial properties of receptive fields (Section 13.2.3).
13.1.6b Bloch’s Law Under Dichoptic Conditions
Under the most favorable conditions, the visibility of a single flash is proportional to its duration up to about 100 ms. During this period, visibility depends on the product of intensity and duration, a relationship known as Bloch’s law. The limiting period of temporal integration is decreased by increasing the area of the stimulus, keeping luminance constant, or by increasing the luminance of the background (Barlow 1958). Two flashed stimuli with the same size and luminance polarity presented to one eye physically sum their stimulus energy (Bouman and van den Brink 1952). Similarly, within the limits of Bloch’s law, the visibility of dichoptic flashes depends on the total energy in each flash, which is the product of the duration and intensity of the flash (Westendorf et al. 1972). Cogan et al. (1982) found that the detectability of low-contrast dichoptic flashes set within fused binocular contours was at least twice that of a monocular flash. The binocular advantage was not as large for high-contrast flashes. Binocular detectability, even for low-contrast flashes, was only 41% better than monocular detectability when the background contours were omitted from one eye. It was suggested that binocularly fused contours engage cells responsible for binocular fusion, and which sum low-contrast stimulus energy. Contours in only one eye engage monocular mechanisms or the binocular rivalry system, which reduces binocular summation.
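The trade-off between intensity and duration described by Bloch’s law can be illustrated with a short sketch. The 100 ms limit is taken from the text. The assumption that no further integration occurs beyond that limit is a common simplification rather than a claim from the studies cited, and the function names and arbitrary units are hypothetical.

```python
def effective_energy(intensity, duration_ms, critical_duration_ms=100.0):
    """Flash energy integrated by the visual system, in arbitrary units.

    Within the critical duration, energy is intensity times duration.
    Beyond it, integration is assumed to stop (a simplifying assumption).
    """
    if intensity < 0 or duration_ms < 0:
        raise ValueError("intensity and duration must be non-negative")
    return intensity * min(duration_ms, critical_duration_ms)


def threshold_intensity(duration_ms, threshold_energy, critical_duration_ms=100.0):
    """Intensity needed to reach a fixed threshold energy at a given duration."""
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    return threshold_energy / min(duration_ms, critical_duration_ms)


# A brief intense flash and a longer dim flash carry the same energy.
print(effective_energy(2.0, 50.0), effective_energy(1.0, 100.0))
# Halving the duration within the integration period doubles the threshold intensity.
print(threshold_intensity(50.0, 100.0), threshold_intensity(100.0, 100.0))
```

Under these assumptions, two flashes are equally visible whenever the product of intensity and duration is the same, provided both durations fall within the integration period.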
13.1.6c Detection of Flashes Separated in Time
A flash generates an initial response of one sign and a secondary response of the opposite sign. The sign of the
responses depends on the contrast polarity of the flash. For short interstimulus intervals, the initial excitatory responses of two bright flashes in the same eye summate. As the interstimulus interval is increased, the mutual facilitation decreases to zero at about 35 ms. At this interval the excitatory phase of one flash coincides with the inhibitory phase of the other. With a longer interval, the stimuli show mutual inhibition as the two inhibitory phases come into coincidence. When the interval reaches 100 ms there is zero interaction because the responses no longer overlap in time. Beyond this interval, the probability of detecting at least one of the flashes is influenced only by probability summation. This is the zero level of neural interaction. Flashes with opposite luminance polarity presented to the same eye physically cancel when they are simultaneous. They show inhibitory interactions with a short interstimulus interval and facilitatory interactions with a longer interstimulus interval (Ikeda 1965; Rashbass 1970; Watson and Nachmias 1977). These results indicate that interactions between successive flashes depend on how the excitatory and inhibitory phases of neural responses interact. Matin (1962) measured the probability of detecting dichoptic flashes 35 arcmin in diameter and of 2 ms duration as a function of the time interval between them. Binocular summation was greater than probability summation only for interstimulus intervals less than about 100 ms. Thorn and Boynton (1974) obtained similar results. Note that inhibitory effects with flashes presented to the same eye were not present with these dichoptic stimuli. These experiments provide evidence that similar signals falling simultaneously on corresponding points exhibit real neural summation. Blake and Fox (1973) reviewed other early experiments on this topic. 
The study of interactions between flashed stimuli has been extended to flashed gratings for which stimulus alternation may involve a spatiotemporal displacement, not merely a temporal displacement. This is because two gratings of opposite luminance polarity presented in succession can be regarded as having been displaced spatially by one half-period of the grating. When a 0.75 cpd sine-wave grating was flashed to the same eye for two periods of 5 ms, there was summation up to an interstimulus interval of about 50 ms, followed by a small inhibitory effect (Green and Blake 1981). When the gratings were presented in the same way to opposite eyes there was a similar but weaker facilitatory effect, but the inhibitory phase was absent, as shown in Figure 13.10. Gratings with opposite luminance polarity (180° spatial phase shift) presented to the same eye showed an initial inhibitory phase followed by a facilitatory phase, as shown in the figure. Opposite-polarity gratings presented dichoptically showed no facilitation or inhibition. A similar result was reported for two light flashes with the same and opposite polarity (Cogan et al. 1990).
Figure 13.10. Dichoptic masking and spatial phase. Top graph: the contrast threshold for detection of two spatially in-phase, 0.75-cpd gratings presented for 5 ms each, as a function of interstimulus interval. The gratings were presented to different eyes (empty symbols) or the same eye (solid symbols). Bottom graph: the same functions with the gratings in spatial counterphase. Dashed lines indicate thresholds for the first grating alone (N = 1). (Adapted from Green and Blake 1981)
Although Blake and Levinson (1977) found dichoptic interactions between gratings of opposite polarity
(180° phase), they were high spatial-frequency gratings. A slight misconvergence may have brought them into phase. The contrast threshold for detection of flicker or movement in a sinusoidal grating between 0.5 and 7 cpd, which reversed in spatial phase at 3.5 Hz, was 1.9 times lower with binocular than with monocular viewing (Rose 1978). This is a much greater binocular advantage than that for detection of a stationary grating or than that reported by previous workers for detection of a counterphase grating. This advantage of binocular flicker detection over pattern detection was independent of spatial frequency but was lost at temporal frequencies above 10 Hz (Rose 1980). Responses to light onset and light offset are processed in visual channels that remain distinct at least up to the visual cortex (Section 5.1.4e). Nevertheless, these channels must interact to account for inhibitory interactions that occur when opposite-sign flashes are presented to the same eye with a small interflash interval. From the preceding evidence it seems that dichoptic interactions between transient signals of opposite sign in the two eyes do not occur at any level. Investigators have concluded from these results that opposite-polarity stimuli arising from the two eyes are processed independently. But this is the wrong way to look at it.
Think of two steady square-wave gratings presented 180° out of spatial phase to one eye. Clearly, the gratings are invisible because they physically cancel to a homogeneous gray. When the same gratings are presented dichoptically they do not physically cancel, but rival. At any instant, the dominant grating is seen just as well as when there is no grating in the other eye. Thus, the suppressed grating does not weaken the visibility of the dominant grating (Bacon 1976). From the point of view of visibility, the two gratings are processed independently, but only one of them is processed at any one time in a given location. Although opposite polarity dichoptic stimuli do not engage in simultaneous mutual inhibition, they do engage in alternating total suppression, or rivalry. The same argument can be applied to superimposed flashes of opposite polarity. They physically cancel when presented simultaneously to the same eye, and the excitatory and inhibitory phases of their neural responses interact when the flashes are presented successively to one eye. When presented dichoptically, opposite-polarity flashes rival, but, during the dominant phase of either one, the stimulus remains just as visible as a flash presented to only one eye. As the interstimulus interval is increased, rivalry ceases and both stimuli become visible as independent events; they do not, as with monocular viewing, engage in mutual inhibition. It is interesting to note in this context that the threshold for detecting a low-contrast grating was lowered when it was presented just after a similar grating with a spatial phase offset of 90°, but that this facilitation was not evident when the two gratings were presented dichoptically (Georgeson 1988). The facilitation occurred between sequentially presented dichoptic gratings when they were in spatial register. 
According to this evidence, although the binocular detection mechanism combines dichoptically superimposed stimuli, it does not combine dichoptic stimuli in spatiotemporal quadrature. Wehrhahn et al. (1990) asked subjects to decide which of two suprathreshold vertical lines, either 5 or 40 arcmin apart, was presented first. When the stimuli were presented binocularly rather than monocularly, the temporal threshold was lower by a factor of 1.4.
13.1.6d Binocular Summation of Motion Detection
It is believed that local processing of motion occurs in V1 and that global motion is processed in MT. The following evidence suggests that global motion is processed mainly, if not entirely, by binocular cells. Almost all cells in MT are binocular. The motion aftereffect shows almost complete interocular transfer (Section 13.3.3). The perception of global motion is adversely affected by amblyopia (Section 8.4.4). This evidence suggests that global motion is detected better by two eyes than by one eye.
Hess et al. (2007) used random-dot kinematograms undergoing translational, radial, or rotational motion with various degrees of contrast. They measured the proportion of coherently moving dots to randomly moving dots required for motion detection (the motion-coherence threshold). At low contrasts, binocular viewing showed a 1.7-to-1 advantage over monocular viewing. The thresholds were the same when signal dots were presented to one eye and the noise dots to the other eye (dichoptically) as when all dots were presented to one eye. A purely monocular motion processing mechanism should show lower thresholds under the dichoptic condition. Hess et al. concluded that there is no monocular mechanism for detection of global motion that is not influenced by stimuli in the other eye.
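The structure of a motion-coherence stimulus, and of the dichoptic condition in which signal and noise dots go to different eyes, can be sketched as follows. This is an illustrative construction only, not the procedure used by Hess et al. It handles translational motion alone, treats coherence as the fraction of all dots that are signal dots (an assumption about how the proportion is defined), and uses hypothetical function names and parameters.

```python
import math
import random


def make_kinematogram(n_dots, coherence, direction_deg=0.0, step=1.0, seed=0):
    """Return (signal, noise) lists of per-frame dot displacements (dx, dy).

    Signal dots share one direction; noise dots move in random directions.
    """
    if not 0.0 <= coherence <= 1.0:
        raise ValueError("coherence must lie between 0 and 1")
    rng = random.Random(seed)
    n_signal = round(n_dots * coherence)
    common = math.radians(direction_deg)
    signal = [(step * math.cos(common), step * math.sin(common))
              for _ in range(n_signal)]
    noise = []
    for _ in range(n_dots - n_signal):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        noise.append((step * math.cos(angle), step * math.sin(angle)))
    return signal, noise


def dichoptic_split(signal, noise):
    """Assign signal dots to one eye and noise dots to the other."""
    return {"left": list(signal), "right": list(noise)}


signal, noise = make_kinematogram(n_dots=200, coherence=0.25)
eyes = dichoptic_split(signal, noise)
print(len(eyes["left"]), len(eyes["right"]))
```

A motion-coherence threshold would then be estimated by lowering `coherence` until the direction of the signal dots can no longer be reported reliably. A purely monocular mechanism would see only signal dots in the dichoptic condition, which is why it should show lower thresholds there.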
13.1.6e Summary
For people with normal binocular vision, binocular thresholds for luminance increments in discrete stimuli and for contrast detection in gratings are lower than monocular thresholds to a greater extent than predicted by probability summation. Binocular summation is greatest when the stimuli have similar shapes, sizes, contrasts, and locations. In other words, summation is greatest when the visual mechanisms responsible for fusion are engaged rather than those responsible for rivalry. Binocular summation of brightness is most evident when the dichoptic stimuli lack contours. Binocular summation is not evident for discrimination between stimuli well above the detection threshold, presumably because of response saturation. In stereoblind animals and humans, binocular summation is no more than one would predict from probability summation (Section 32.4.1). This further supports the idea that in people with normal vision, near-threshold excitatory signals from the two eyes are at least partially summed when they impinge on cortical binocular cells. There is considerable interaction between inputs from the two eyes in response to flicker, especially for low frequencies and within the modulation threshold region. In-phase flicker with similarly shaped stimuli is detected above the level of probability summation, whereas antiphase flicker is detected at or below this level. Binocular summation of stimulus energy occurs under conditions that foster binocular fusion but not under conditions that foster binocular rivalry. Light flashes or gratings with similar luminance polarity show summation when presented in quick succession either to the same eye or to opposite eyes. Flashes or gratings with opposite polarity show summation and inhibition phases when presented to the same eye but not when presented dichoptically. However, opposite-polarity dichoptic flashes do engage in alternating suppression.
As far as visibility is concerned, they are processed in distinct channels, but these channels engage in suppressive rivalry. The binocular advantage for flicker detection seems to be greater than that for pattern detection.
13.1.7 MONOCULAR AND BINOCULAR REACTION TIMES
The reaction time of the visual system decreases as the luminance of the stimulus is increased (Alpern 1954). If the decrease in latency with increasing luminance depended only on retinal processes, the latencies of monocular and binocular stimuli of the same luminance would be the same. One would expect binocular latencies to be shorter than monocular latencies only if luminances of signals from the two eyes summate and the latency of cortical processes decreases with increasing luminance. Several investigators have reported that button-pressing reaction times to the onset of a stimulus are shorter with both eyes open than when only one eye is open (Minucci and Connors 1964). The reaction time to a flashed binocular stimulus was about 35 ms shorter than that to a monocular stimulus (Haines 1977). Reaction time for detection of the onset of low-contrast gratings of various spatial frequencies was about 1.4 times longer with monocular viewing than with binocular viewing. As contrast was increased, reaction times decreased exponentially. At higher contrasts, the binocular advantage varied between subjects. Some subjects showed a monocular advantage at some levels of contrast (Harwerth et al. 1980). For subjects with normal stereoscopic vision, the binocular advantage in reaction time for detection of a grating was about 10%. This is a considerable advantage when transformed into the difference in contrast required to produce the same advantage for stimuli presented to one eye. For stereoblind subjects, the binocular advantage corresponded to probability summation (Blake et al. 1980c). Justo et al. (2004) obtained a binocular advantage of 15 ms for the task of responding to a change in the shape of a stimulus. Reaction time increased when texture was superimposed on an outline stimulus. But the increase was less when the outline and the texture were presented to different eyes than when they were presented to the same eye or both eyes.
Westendorf and Blake (1988) measured reaction times to an increase in contrast of a vertical grating rather than to the onset of the grating. This allowed them to measure the effects of the magnitude of the contrast increment and of the magnitude of the pedestal contrast. At low contrasts, the binocular advantage exceeded probability summation. As pedestal contrast was increased, the binocular advantage for near-threshold increments declined to that predicted by probability summation. With high-contrast increments, binocular summation, as indicated by reaction times, remained slightly above the level of probability summation for all pedestal contrasts.
13.1.8 PHYSIOLOGY OF BINOCULAR SUMMATION
13.1.8a Summation of Single-Cell Responses
According to Hubel and Wiesel (1962), binocular cells in the primary visual cortex receive inputs from both eyes. However, electrophysiological recording from single binocular cells has revealed various forms of binocular interaction, including facilitation and inhibition (Crawford and Cool 1970; Ohzawa and Freeman 1986a, 1986b). Some cells respond only when both eyes are stimulated. Other cells show binocular interactions but respond to only one eye when the eyes are tested separately. Ohzawa and Freeman proposed that monocular inputs to binocular cells sum linearly according to the energy model described in Section 11.10.1. They argued that departures from linearity are mainly due to threshold mechanisms operating at the level of spike production after the initial linear combination. Haefner and Cumming (2008) showed that this departure from linearity can be accounted for by a modified energy model involving the combination of distinct disparity signals (see Section 11.10.1b). Anzai et al. (1995) measured the response of single binocular cells in the visual cortex of the cat as a function of the contrast, spatial frequency, and orientation of a drifting sinusoidal grating. They applied a signal-detection analysis to derive monocular and binocular neurometric functions and contrast-sensitivity functions for each cell. These functions are analogous to behaviorally determined psychometric and contrast-sensitivity functions. They concluded that the contrast threshold is reached when the response of a small number of cells reaches a criterion level, and that the contrast-sensitivity function depends on the number and sensitivities of cells tuned to each spatial frequency. The contrast thresholds of binocular cells revealed a binocular advantage similar to that found in human psychophysics. Anzai et al.
concluded that binocular summation is due to binocular cells being more sensitive to binocular than to monocular stimulation, so that the criterion number of cells required for contrast detection is achieved at lower contrasts. Smith et al. (1997b) performed similar experiments on single binocular cells in area V1 of the monkey. The stimuli were drifting sinusoidal gratings with optimal orientation and spatial frequency for each cell. For simple cells and for complex cells tuned to phase disparity the contrast signals from the two eyes were initially combined linearly. Nonlinearities due to signal rectification arose at the postsynaptic level. For cells with balanced ocular dominance, the binocular summation contour (see Section 13.1.2c) had a slope of -1. For cells with strong ocular dominance, the summation contour had a greater or lesser slope, according to which eye was dominant. Binocular summation of contrast was greatest for stimuli with an interocular phase difference that conformed to the peak of the phase-disparity
STEREOSCOPIC VISION
13.1.8b Evoked Potentials and Binocular Summation In electroencephalography, electrodes are applied to the scalp and the pooled activity of cells in the underlying cortex is recorded as temporally modulated visual stimuli are presented (Section 5.4.3c). Visual evoked potentials (VEPs) recorded from the visual cortex have been used to reveal properties of two binocular mechanisms—summation and suppression between inputs from the two eyes on the one hand, and stereopsis on the other. In these investigations, two basic comparisons are made. First, the magnitudes of potentials evoked by monocular stimulation of each eye are compared with those evoked by binocular stimulation. Second, the response when the two eyes receive identical stimuli is compared with that when the stimuli in the two eyes are uncorrelated. The idea is that only congruent stimuli summate their inputs, whereas rivalrous stimuli compete for access to binocular cells. The results are used to assess the following outcomes. 1. Summation A binocular VEP that is simply the sum of the monocular responses could signify that visual inputs are processed by independent mechanisms in the visual cortex. But it could also signify that binocular cells sum the inputs from the two eyes linearly. Partial summation signifies that only some inputs converge or that summation is less than complete. A binocular response that equals the mean of the monocular responses (zero summation) signifies binocular rivalry, in which the input from one eye suppresses that from the other when both are open, or in which the two eyes gain alternate access to cortical cells. It could also arise from binocular cells that average the inputs from the two eyes. 2. Inhibition A binocular response that is less than the mean of the monocular responses signifies mutual inhibition between left- and right-eye inputs. It could occur in binocular rivalry when a strong stimulus in one eye is suppressed by a weak stimulus in the other eye. 3. 
Imbalance A monocular response from one eye that is stronger than that from the other eye signifies a weakened input from one eye or suppression of one eye by the other. This condition arises from anisometropia and strabismus (Section 8.5). 4. Facilitation A binocular response that is greater than the sum of the monocular responses indicates the presence of a facilitatory binocular interaction that one might
expect in a mechanism for detecting binocular disparity. The most extreme facilitation arises in binocular AND cells that respond only to excitation from both eyes and give no response to monocular inputs. Apkarian et al. (1981) conducted a thorough investigation of binocular facilitation of the VEP in human adults. Subjects were shown a vertical grating of various spatial frequencies with luminance modulated in counterphase at various temporal frequencies. The stimulus was shown to one eye or to both eyes. Binocular facilitation of the amplitude of the VEP as a function of spatial frequency for a fixed temporal frequency of 30 Hz is shown for one subject in Figure 13.11. For this subject, binocular facilitation was limited to spatial frequencies in the region of 2 cpd, which is not the region where the monocular response has its peak. Binocular facilitation was also found in a specific range of temporal frequencies of contrast modulation, generally between 40 and 50 Hz, and was higher at higher contrasts. The range of spatial and temporal frequencies within which facilitation occurred varied from subject to subject. When a 2-log luminance difference was introduced between the images of a dichoptic reversing checkerboard pattern the amplitude of the VEP was less for binocular than for monocular presentation (Trick and Compton 1982). The effect was more evident at higher temporal frequencies of stimulus reversal. They explained the effect in terms of a relative phase shift in the inputs from the two eyes due to a difference in latency. This means that a difference in retinal illumination due to opacities in the eye could lead to a misinterpretation of VEP data, if not allowed for. The dependence of binocular facilitation on the spatial and temporal properties of the stimulus probably explains the wide variation in the degree of binocular facilitation reported in previous studies (Cigánek 1970; Harter et al. 1973; Srebro 1978). 
A sharply focused flashed pattern produced more binocular summation than a defocused pattern, except at high luminances (White and Bonelli 1970). Potentials evoked by diffuse flashes did not exhibit binocular summation, whereas flashed patterns did (Ellenberger et al. 1978). Binocular/monocular ratio
tuning function of the cell. Most cells showed binocular suppression when the phase disparity was 180° away from optimal phase. In this case, the binocular response was less than the monocular response to stimulation of only the dominant eye.
4 3 2 1 0 0.1
0.2
0.5 1 2 Spatial frequency (c/deg)
5
10
Evoked potentials and spatial frequency. Ratio of binocular to monocular VEP as a function of the spatial frequency of a vertical grating counterphase modulated at a temporal frequency of 30 Hz. (Redrawn from Apkarian et al. 1981)
Figure 13.11.
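The four outcomes listed in Section 13.1.8b (summation, inhibition, imbalance, facilitation) amount to comparing the binocular VEP amplitude with the sum and the mean of the two monocular amplitudes. The sketch below is one way to express those comparisons. The function names and the 5% tolerance used to decide when two amplitudes count as equal are assumptions made for illustration, and the source specifies no such criterion.

```python
def classify_binocular_vep(left, right, binocular, tol=0.05):
    """Label a binocular VEP amplitude relative to the monocular amplitudes.

    `tol` is a hypothetical relative tolerance for treating amplitudes as equal.
    """
    if left < 0 or right < 0 or binocular < 0 or left + right == 0:
        raise ValueError("amplitudes must be non-negative and not both zero")
    total = left + right
    mean = total / 2.0
    if abs(binocular - total) <= tol * total:
        return "summation"          # equals the sum of monocular responses
    if binocular > total:
        return "facilitation"       # greater than the sum
    if abs(binocular - mean) <= tol * mean:
        return "zero summation"     # equals the mean of monocular responses
    if binocular > mean:
        return "partial summation"  # between the mean and the sum
    return "inhibition"             # less than the mean


def monocular_imbalance(left, right, tol=0.05):
    """Return the eye with the stronger monocular response, or None if balanced."""
    if abs(left - right) <= tol * max(left, right):
        return None
    return "left" if left > right else "right"


if __name__ == "__main__":
    print(classify_binocular_vep(1.0, 1.0, 2.6))
    print(monocular_imbalance(1.0, 0.5))
```

As the text notes, a given label does not identify a unique mechanism: a binocular response equal to the sum is compatible both with independent monocular mechanisms and with binocular cells that sum linearly.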
Tests of binocular functioning that rely on the comparison of monocular and binocular VEPs are suspect if they are not based on appropriate stimulus parameters. One solution is to record evoked potentials while the stimulus is swept through a range of stimulus values, a procedure first used by Regan (1973). In that study, the amplitude of the evoked potential was recorded as the refractive power of a lens in front of an eye was varied for each of several astigmatic axes. This provided a rapid determination of the required refractive correction. In a second study, VEPs were recorded while a checkerboard pattern was optically zoomed in size, with brightness held constant (Regan 1977). Norcia and Tyler (1985) recorded the VEP as the spatial and temporal frequencies of a grating were swept through a range of values. This gave a spatiotemporal VEP profile. This method is particularly useful with children too young to be tested by psychophysical procedures. In addition to providing a better basis for comparison of VEPs under a range of stimulus values, the sweep method is much faster than the presentation of different stimulus values in discrete trials.
After a stationary pattern has been presented for some time, the VEP evoked by stimulus onset is attenuated. Smith and Jeffreys (1979) found almost complete transfer of this attenuation to the other eye for the CII component of the VEP, which is thought to originate in the prestriate cortex. But there was only partial transfer for the CI component, which is thought to originate in V1. This suggests that monocularly driven neurons are more common in V1 than in the prestriate cortex in humans, just as they are in the monkey (Zeki 1978).
Katsumi et al. (1986) investigated the effects of optically induced aniseikonia on the degree of binocular summation revealed in the human VEP. The stimulus was a checkerboard pattern undergoing contrast reversal at 12 Hz. When the pattern was more than 5% larger in one eye than in the other, there was no evidence of a greater VEP to binocular than to monocular stimulation. The tolerance of the stereoscopic system for aniseikonia is discussed in Section 9.9.
Another approach to using VEPs for detecting binocular interactions is to look for evidence of nonlinear interactions in the response to dichoptic flicker. Suppose that the left eye views sinusoidal flicker of frequency F1 and the right eye views sinusoidal flicker of frequency F2. Nonlinear processes produce harmonics of F1 in the left-eye channel and harmonics of F2 in the right-eye channel. Nonlinear processes occurring after the monocular signals are combined produce cross-modulation terms of the general form nF1 + mF2, for integral values of n and m. The relative amplitudes of these terms depend on the nature of the nonlinearities both before and after binocular convergence. A mathematical analysis of these processes has shown that nonlinear processing occurring after binocular convergence can be isolated from that occurring before convergence (Regan and Regan 1988, 1989). In contrast with the random-dot techniques described later, this procedure allows one to explore binocular functions even when acuity is low in one or both eyes. Figure 13.12 shows an (F1 + F2) component of the VEP recorded by a form of Fourier analysis, with ultrahigh resolution, known as the zoom fast-Fourier transform (zoom-FFT). This component of the response must arise from nonlinear processes occurring after binocular convergence (Regan and Regan 1989) (see also Zemon et al. 1993). The (F1 + F2) component of the VEP to dichoptic flicker is weak in stereoblind subjects (Baitch and Levi 1988). Normal infants show nonlinear responses to dichoptic flicker by 2 months of age, but the response is absent in esotropic infants, especially in those who have not had corrective surgery (France and Ver Hoeve 1994). Thus, only subjects with normal stereopsis show nonlinear combination of monocular inputs, which arises from the way in which binocular cells combine signals from the two eyes. Presumably, the more linear addition of signals in stereoblind subjects arises from two pools of monocularly driven cells.
Figure 13.12. Nonlinear processing of dichoptic flicker. One eye viewed a homogeneous field flashing at 8 Hz (F1) with 17% amplitude modulation while the other viewed a light flashing at 7 Hz (F2) with 12% modulation. The VEP spectrum was recorded at a resolution of 0.004 Hz by zoom-FFT. The F1 + F2 component in the VEP indicates a nonlinear process sited after binocular convergence. The plot shows power against VEP frequency (Hz), with labeled peaks at 2F2, F1 + F2, and 2F1. (From Regan and Regan 1989)
Another manifestation of binocular summation is that VEPs are greater when a grating moving in one direction is presented to one eye at the same time as a grating moving in the opposite direction is presented to the other eye, compared with when either moving grating is presented alone (Ohzawa and Freeman 1988).
Odom and Chao (1995) measured human VEPs generated by full-field modulations of luminance at 2 Hz. A nonlinear second harmonic was evident in the VEP when the peaks were either in phase or 180° out of phase in the two eyes but was much reduced at a phase of 90°. They argued that these data support the idea of a magnocellular pathway, which sums nonlinear monocular inputs, and a parvocellular pathway, which sums linear monocular inputs, followed by a nonlinear stage. They produced further evidence for this model by measuring the threshold luminance modulation required for detection of flicker with dichoptic flashes at various phase differences. At 2 Hz, the threshold fell to a minimum when the phase difference was 90°. They argued that this biphasic response is evidence for the activation of both visual pathways. At 16 Hz, the threshold fell monotonically with increasing phase difference. They argued that this monotonic function occurred because only the magnocellular pathway was activated at this high frequency.
Büchert et al. (2002) found that the fMRI response from the visual cortex of human subjects was smaller when flickering checkerboards were presented simultaneously to the two eyes compared with when they were presented in alternation. They suggested that simultaneous stimuli engage in mutual inhibition. There was no evidence of this effect in extrastriate cortex.
The use of VEPs to study binocular rivalry was discussed in Section 12.9.2e.
13.2 DICHOPTIC VISUAL MASKING
13.2.1 TYPES OF VISUAL MASKING
An induction stimulus with near-threshold contrast lowers the detection threshold of a superimposed test stimulus. This is threshold summation. In visual masking, a suprathreshold induction stimulus, or mask, reduces the visibility of a briefly exposed test stimulus. The mask is usually presented briefly, and the test stimulus is presented either at the same time as the mask, slightly before it, or slightly after it. The mask can be a disk of uniform luminance, an edge, or a grating with sinusoidal luminance profile. The presence of masking between two stimuli is interpreted as evidence that the stimuli are detected by the same channel or by partially overlapping channels. In dioptic masking, the mask and test stimulus are presented to both eyes. In dichoptic masking, the mask is presented to one eye and the test stimulus to the other. The main types of masking paradigm are listed in Table 13.1. Dichoptic masking differs from binocular rivalry in two ways. First, in masking the test stimulus is usually presented for less than 200 ms, which is too short a time for binocular rivalry to manifest itself (Section 12.3.5). Second, dichoptic masking is maximal when the test and masking stimuli have similar visual features, whereas binocular rivalry is most evident between stimuli that differ widely in shape, orientation, spatial frequency, or color. Dichoptic masking probably occurs at an early stage in the combination of binocular signals whereas rivalry occurs later, at a stage when patterned inputs are compared. Masking and rivalry could occur at different stages of processing within V1.
Table 13.1. TYPES OF VISUAL MASKING
SIMULTANEOUS MASKING
  Induction and test stimuli superimposed
  Test and induction stimuli adjacent: crowding
SUCCESSIVE MASKING
  Induction and test stimuli superimposed
    Forward masking: 1st stimulus masks 2nd
    Backward masking: 2nd stimulus masks 1st
  Induction and test stimuli spatially adjacent
    Paracontrast: 1st stimulus masks 2nd
    Metacontrast: 2nd stimulus masks 1st
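The taxonomy in Table 13.1 depends on two things: whether mask and test are simultaneous or successive, and whether they are superimposed or spatially adjacent. The sketch below encodes that lookup, together with the dioptic and dichoptic distinction from the text. Treating equal onset times as "simultaneous" and ignoring stimulus durations is a simplifying assumption, and all names, including the added "monocular" and "mixed" labels, are illustrative.

```python
def classify_masking(mask_onset_ms, test_onset_ms, superimposed):
    """Name the masking paradigm of Table 13.1 from timing and spatial layout."""
    if mask_onset_ms == test_onset_ms:
        # Simultaneous masking
        return "simultaneous superimposed masking" if superimposed else "crowding"
    mask_first = mask_onset_ms < test_onset_ms
    if superimposed:
        # Successive masking, superimposed stimuli
        return "forward masking" if mask_first else "backward masking"
    # Successive masking, spatially adjacent stimuli
    return "paracontrast" if mask_first else "metacontrast"


def eye_arrangement(mask_eyes, test_eyes):
    """Describe which eyes receive the mask and the test stimulus."""
    mask_eyes, test_eyes = set(mask_eyes), set(test_eyes)
    both = {"left", "right"}
    if mask_eyes == both and test_eyes == both:
        return "dioptic"      # mask and test presented to both eyes
    if len(mask_eyes) == 1 and len(test_eyes) == 1:
        # One eye each: the same eye (monocular) or opposite eyes (dichoptic)
        return "monocular" if mask_eyes == test_eyes else "dichoptic"
    return "mixed"            # any other combination (label is an assumption)


if __name__ == "__main__":
    print(classify_masking(0, 50, superimposed=False))
    print(eye_arrangement(["left"], ["right"]))
```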
13.2.2 MASKING FROM EVEN ILLUMINATION AND THE STATE OF LIGHT ADAPTATION
First consider whether continuous and homogeneous illumination of one eye affects the visibility of stimuli in the other eye. Grating detection by one eye is adversely affected by uniform illumination in the same eye. But grating detection was not affected by uniform illumination in the other eye (Blake et al. 1980a). This suggests that masking by uniform illumination occurs in the retina. The attenuation of the contrast sensitivity function (CSF) at low spatial frequencies is also believed to occur in the retina.
Yang and Stevenson (1999) questioned the view that masking by uniform illumination and the roll-off of the CSF are only retinal in origin. Masking of one grating by another in the same eye declines with increasing difference in spatial frequency between mask and test gratings. Uniform illumination contains low spatial frequency components because of retinal inhomogeneities. Therefore, masking by uniform illumination should be most evident with test gratings of low spatial frequency. Also, insofar as this type of masking involves the detection of spatial frequencies, it should show interocular transfer. Yang and Stevenson measured the monocular contrast-sensitivity function for a grating of low mean luminance, modulated in contrast at 2 Hz. The other eye was illuminated at each of two levels of steady uniform illumination. Figure 13.13 shows that interocular masking was evident at spatial frequencies below 1 cpd. Blake et al. (1980a) had not tested below 1 cpd. There is thus a cortical component to masking by uniform illumination. Yang and Stevenson explained the smaller level of interocular masking compared with same-eye masking in terms of a postretinal interocular gating mechanism.
Figure 13.13. Dichoptic masking of contrast sensitivity. A sine-wave grating of mean luminance 8 trolands and contrast modulated at 2 Hz was presented to one eye and a mask consisting of even illumination was presented to the other eye. With steady even illumination, there was a loss in contrast sensitivity at low spatial frequency. The loss increased with increased luminance of the mask, as indicated by the numbers on each graph. The loss was greatest when the even illumination was, like the grating, modulated at 2 Hz. Results for one subject. The plot shows log amplitude sensitivity (1/td) against spatial frequency (cpd) for masks of 0.5 td steady, 433 td steady, and 433 td at 2 Hz. (Adapted from Yang and Stevenson 1999)
A related question is whether masking depends on the state of dark adaptation of the eye. Exposing one eye to an overall brightness of between zero and 132 c/ft² did not affect the detectability of a 0.67° spot presented in dark surroundings to the other eye (Crawford 1940a). Also, the detectability of a 2° square was not affected by prior light adaptation of the other eye (Mitchell and Liaudansky 1955). Wolf and Zigler (1955) and Whittle and Challands (1969) found small interocular effects, but their adapting stimuli were not devoid of visual contours.
Contrast sensitivity for gratings with spatial frequencies higher than 2 cpd presented to one eye improved when the other eye had been light adapted (Denny et al. 1991). For spatial frequencies over 10 cpd, improved sensitivity required brighter adapting fields in the other eye. Light adapting the eye containing the stimulus had little effect on sensitivity. Also, monocular or binocular light adaptation had little effect on binocular contrast sensitivity. Improvement in contrast sensitivity was also achieved by pressure blinding the eye not containing the stimulus (Makous et al. 1976). This suggests that the effect is due to removal of inhibitory influences from the opposite eye rather than to facilitatory influences arising from light adapting the opposite eye.
It was noted in Section 12.3.3a that the dark field of a closed eye may rival stimuli in the other eye. The potential evoked from the human scalp by a monocular checkerboard reversing in contrast at 5 Hz was weaker when the other eye was closed and dark adapted than when it was adapted to a dim homogeneous light (Eysteinsson et al. 1993). This suggests that signals arising from rods in a fully dark-adapted eye exert a small inhibitory influence on the contrast sensitivity of the other eye. Removal of this inhibitory influence when both eyes are stimulated may contribute to binocular summation.
The visibility of a flash varies with the state of light adaptation of the eye. An eye was more sensitive to a local test flash when the flash fell within a light-adapted region of the other eye than when the other eye was dark-adapted (Lansford and Baker 1969; Paris and Prestrude 1975). Adaptation of one eye to red light lowered the dark-adapted threshold for a test flash presented to the other eye by about 0.15 log units (Auerbach and Peachey 1984; Reeves et al. 1986).
Yang and Stevenson (1999) proposed that the crucial factor is the similarity between the temporal features of test and mask, rather than the presence or absence of a flickering mask. It can be seen in Figure 13.13 that interocular masking was greatest when mask and test grating both flickered at 2 Hz.
13.2.3 MASKING BETWEEN ADJACENT FIGURED STIMULI
Crawford (1940b) and Westheimer (1965) introduced the paradigm of exposing a test spot briefly on the center of a featureless disk-shaped conditioning stimulus. The luminance threshold of the test spot was measured as a function of the duration, luminance, size, and eccentricity of the conditioning stimulus. This procedure has revealed inhibitory interactions between the receptive fields of neighboring ganglion cells within the same eye (Makous and Boothe 1974). Inhibition of rod receptor potentials by stimulation of cones involves inhibitory interneurons in the retina. It also involves short-latency responses from cones occluding long-latency responses from rods as the two responses converge on ganglion cells (Gouras and Link 1966; Whitten and Brown 1973). Inhibition of rods by cones clears the postreceptor pathway so that it carries only cone signals at photopic levels. Cone-cone inhibitory interactions within the receptive fields of ganglion cells render the system particularly sensitive to luminance gradients.
Westheimer (1967) found that the detection threshold for a 1-arcmin flash on a central disk was elevated as the luminance of an annular surround was increased, but the effect was weakened when flash and surround were in opposite eyes. Buck and Pulos (1987) reported that the detection threshold for a 5-arcmin, 200-ms test flash was elevated up to 0.6 log units in the presence of a 1° photopic background only when flash and background were presented to the same eye. They concluded that rod-cone interactions occur only in the retina. However, in these experiments, the background stimulus was homogeneous and of long duration. We will see in the next section that dichoptic masking occurs with figured or flashed conditioning stimuli.
It has been suggested that there are two temporal components of masking: a steady state component and a transient component at the onset and offset of the masking stimulus (Sperling 1965). Green and Odum (1984) found that masking of a drifting grating by a steady field of illumination showed little interocular transfer, but that masking by a flickering field showed substantial transfer, like that produced by a steady patterned mask.
Battersby and Wagman (1962) measured the detection threshold for a 5-ms, 40-arcmin test flash presented at various time intervals before, during, or after a larger concentric illuminated disk was presented for 500 ms to the same eye. The results for one subject are shown in Figure 13.14. The threshold was elevated when the test flash was presented in the period between 100 ms before and 100 ms after the onset of the conditioning disk, and was maximal when the two events overlapped in time. A conditioning stimulus 4.7° in diameter had little effect on the threshold of the test stimulus. As the conditioning stimulus was reduced in size, bringing its border closer to the test flash, the threshold for seeing the test flash became increasingly elevated during the whole period of the conditioning stimulus.
Figure 13.14. Dichoptic masking and interstimulus delay. Luminance threshold for detection of a 40-arcmin, 5-ms test flash as a function of time before or after a 500-ms dichoptic conditioning stimulus. The four curves are for different diameters of the conditioning stimulus, indicated by numbers on the curves (N = 1). The plot shows log luminance threshold (ML) against the delay between test and conditioning flashes (sec); plus indicates conditioning stimulus before test stimulus. (Redrawn from Battersby and Wagman 1962)
Markoff and Sturr (1971) investigated the effects of changing the size of the conditioning stimulus, with the foveal test flash and the conditioning stimuli presented simultaneously. A 5-ms, 3.5-arcmin test flash was presented at the same time as a conditioning stimulus exposed for 50 ms, 200 ms, or continuously. The stimuli were presented either to the same eye or dichoptically. The top graphs of Figure 13.15 show that, with both monocular and dichoptic viewing, the luminance threshold for detection of the test flash rose as the diameter of the conditioning patch increased from 10 to about 21 arcmin, after which it declined to a value that depended on the luminance of the conditioning spot. This can be explained in terms of the structure of the ON-center receptive fields of ganglion cells. A small conditioning stimulus adds to the stimulation of the ON-center and elevates the differential threshold. As the area of the conditioning stimulus increases, its edge encroaches on the inhibitory surround of the receptive field. Its edge eventually extends beyond the inhibitory surround, and masking declines to a level that depends on the luminance of the mask. The stimulus producing peak masking was larger at scotopic than at photopic levels of luminance, presumably because the inhibitory surround is weaker at scotopic levels. Peak masking size was also larger with peripheral than with foveal viewing, presumably because receptive fields get larger in the periphery (lower graphs in Figure 13.15).
Figure 13.15. Dichoptic masking as a function of stimulus size and duration. Luminance threshold for detection of a 5-ms test flash presented with a conditioning stimulus to the same eye (left two graphs) or to the opposite eye (right two graphs). Curves in each graph are for three durations of the conditioning stimulus (50 ms, 200 ms, and continuous). The top two graphs are for a foveal test flash, the bottom two for one at 10° eccentricity. Dashed lines indicate the semi-interquartile range of the resting threshold for that position for monocular and binocular conditions (N = 1). Each graph shows log luminance threshold (ML) against the diameter of the conditioning stimulus (arcmin). (Redrawn from Markoff and Sturr 1971)
It can be seen in Figure 13.15 that detection of a test spot in one eye is not affected by a conditioning disk wider than about 3° presented to the other eye. Therefore, as we saw in Section 13.2.2, pure luminance masking within homogeneous areas does not occur between the eyes. One can infer that both monocular and dichoptic masking are due to interactions between the contiguous edges of the conditioning and test stimuli. We will return to this topic shortly.
It can also be seen in Figure 13.15 that a conditioning disk produces very little dichoptic masking when it is visible continuously. Similar evidence was reported by Fiorentini et al. (1972) and Sturr and Teller (1973). One can infer that dichoptic masking is due to rivalrous interactions between stimulus onsets or offsets of contiguous edges. Chromatically selective dichoptic masking occurs with large masking flashes, but only within the blue-cone system (Boynton and Wisowaty 1984).
13.2.4 MASKING WITH SUPERIMPOSED PATTERNS
13.2.4a Effects of Contrast and Spatial Frequency
In a typical masking experiment with superimposed patterns, a subject is shown a sinusoidal masking grating twice in succession to the same eye. The subject reports which masking grating has a test grating superimposed on it. The contrast of the test grating required for 75% success in this forced-choice task is its threshold contrast. When the masking and test gratings have the same spatial frequency and phase, the measurement is that of the increment threshold. When the contrast of the mask is low, the increment threshold contrast is lower than when the mask is not present. In other words, the mask facilitates detection of the test grating. As the contrast of the mask increases above about 0.3, the increment threshold contrast increases linearly, as shown in Figure 13.16 (Legge 1979). In effect, this function expresses Weber’s law for incremental contrast. The results for four spatial frequencies follow the same function when the data are rescaled in units of the absolute threshold.
In dichoptic masking, the superimposed test and masking gratings are presented simultaneously to opposite eyes. Figure 13.16 shows that, for dichoptic viewing with 200-ms exposures of the stimuli, the facilitatory effect at low contrast is weaker than for monocular viewing. However, with higher contrasts, dichoptic masking is stronger than monocular masking (Legge 1979). The weak facilitatory effect represents binocular summation. Dichoptic masking at higher contrasts represents interocular inhibition.
In Fechner’s paradox, the brightness of a high luminance patch presented to one eye declines as the luminance of a similar patch in the other eye increases from zero to the upper limit of the mesopic range, which is the range over which rods respond (Curtis and Rule 1980). Fechner’s paradox does not occur when the eyes are evenly illuminated (Bourassa and Rule 1994). It thus seems that, with contoured stimuli, inputs from rods in one eye inhibit responses of cones in the other eye. For figured stimuli, this inhibition is least from an unstimulated dark-adapted eye and increases when a contoured stimulus within the mesopic range is introduced to the contralateral eye. Beyond the mesopic range, inputs from cones predominate and these exhibit binocular summation. There are inhibitory interactions at the retinal level (Section 13.2.2), but these involve inhibition of rods by cones or cones by cones, rather than of cones by rods.
Figure 13.16. Monocular and dichoptic contrast thresholds. Incremental contrast thresholds for gratings presented successively to one eye and for a grating presented to one eye and then the other eye. Each grating was presented for 200 ms with a 750-ms interstimulus interval. Both gratings had a spatial frequency of 1 cpd or 16 cpd. The plot shows incremental threshold against initial contrast for monocular and dichoptic gratings. (Redrawn from Legge 1979)
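The threshold contrast in Section 13.2.4a is defined as the test contrast giving 75% correct in the forced-choice task. The sketch below shows one simple way such a threshold could be read off a set of measurements, by interpolating linearly in log contrast between the two data points that bracket the criterion. This interpolation scheme, the function names, and the sample data are assumptions made for illustration. They do not describe the procedure Legge (1979) used.

```python
import math


def threshold_contrast(contrasts, proportion_correct, criterion=0.75):
    """Estimate the contrast giving `criterion` proportion correct.

    Interpolates linearly in log10(contrast) between the bracketing points.
    """
    points = sorted(zip(contrasts, proportion_correct))
    for (c0, p0), (c1, p1) in zip(points, points[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            frac = (criterion - p0) / (p1 - p0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("criterion is not bracketed by the data")


def mask_facilitates(threshold_with_mask, threshold_without_mask):
    """True when the mask lowers the threshold, that is, facilitates detection."""
    return threshold_with_mask < threshold_without_mask


if __name__ == "__main__":
    # Hypothetical data: proportion correct at four test contrasts.
    contrasts = [0.005, 0.01, 0.02, 0.04]
    correct = [0.52, 0.64, 0.86, 0.98]
    print(threshold_contrast(contrasts, correct))
```

Comparing the estimate obtained with a low-contrast mask against the estimate obtained without a mask is the facilitation test described in the text. Repeating the estimate across mask contrasts traces out the increment-threshold function.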
STEREOSCOPIC VISION
Relative threshold elevation
Dichoptic masking is not the same as binocular suppression occurring in binocular rivalry, because suppression is greatest when the dichoptic patterns are dissimilar whereas, as we will now see, masking is greatest when dichoptic patterns are similar. Like binocular summation of simultaneously presented threshold stimuli, the masking effect of a suprathreshold grating is greatest when the spatial frequency and orientation of the test and mask are the same (Gilinsky and Doherty 1969). Threshold elevation as a function of the spatial frequency of a masking grating for a given spatial frequency of a test grating is the spatial-frequency masking function. In general, spatial-frequency masking functions give an indication of the bandwidth of channels tuned to different spatial frequencies (Legge 1979). However, it is difficult to compare the spatial-frequency bandwidth of monocular and dichoptic masking functions, since the two functions have very different slopes (Figure 13.17). Harris and Willis (2001) asked whether a 1-cpd contrast-modulated grating presented to one eye masks a 1-cpd
Figure 13.17. Dichoptic spatial-frequency masking functions. Both panels plot relative threshold elevation against spatial frequency (cpd). (a) Dichoptic masking. Each curve shows elevation of contrast threshold of a test grating of a particular spatial frequency presented to the right eye, as a function of the spatial frequency of a masking grating presented to the left eye. Arrows indicate spatial frequencies of test gratings. The masking grating had a contrast of 0.19. The two gratings were presented at the same time for 200 ms. Threshold elevation is the ratio of threshold with the mask to threshold without the mask. Numbers on the curves are the spatial frequencies of the test grating (N = 2). (b) Monocular masking. Spatial-frequency masking functions obtained when mask and test gratings were presented to the same eye. (Redrawn from Legge 1979)
luminance-modulated grating presented to the other eye. The contrast-modulated grating was the beat pattern (moiré pattern) formed by superimposing gratings of 8 and 9 cpd. Interocular masking was as strong as when mask and test grating were presented to the same eye. The contrast-modulated grating produced results similar to those produced by a luminance-modulated mask of similar effective contrast.
The contrast threshold for detecting a 2°-diameter counterphase-modulated vertical grating was elevated by addition of an annular radial grating extending out to 20° in the same eye, especially when the radial grating was moving (Marrocco et al. 1985). Significant threshold elevations for both stationary and moving annular gratings were also found with dichoptic viewing.
Vernier acuity is degraded when the stimulus is masked by a superimposed grating or by flanking lines in the same eye. Performance on a vernier stimulus presented to one eye was similarly degraded by masks presented to the other (Mussap and Levi 1995). This suggests that the neural processes responsible for vernier acuity receive binocular inputs. However, it is logically possible that a neural signal arising from a vernier offset is formed before binocular convergence but that the signal is then subject to degradation by inputs from the other eye.
Visual evoked potentials recorded from the scalps of human subjects in response to a grid pattern presented briefly to one eye were reduced while a similar grid pattern was presented continuously to the other eye. This interocular suppression of the VEP was reduced when the grids in the two eyes differed in density and was positively correlated with stereoacuity (Harter et al. 1977). Towle et al. (1980) used psychophysical tests and the VEP to measure masking of a flashed grating by a stationary grating applied to the same or the opposite eye. In the initial 100 ms, masking depended only on the relative orientations of the gratings.
After about 200 ms, both relative orientation and relative spatial frequency had an effect. It has been argued that the component of masking due to pattern features of the mask occurs in the cortex, where pattern features are processed (Bowen and Wilson 1994). However, there is a general problem with all arguments about the site of visual processes based on masking experiments. Neural processes responsible for performance on a given visual task may reside at many levels in the nervous system. Some processes occur in series and some in parallel, and a mask may affect any or all of these levels or may have an effect only after visual processing of a given visual feature is complete. Single-cell recordings in the monkey revealed that masking of similar stimuli presented to the same eye occurs at many levels in the visual system, beginning in the retina (Macknik and Martinez-Conde 2004). Monocular masking should therefore increase as stimuli pass through successive processing stages. However, dichoptic masking was not evident in single-cell recordings from binocular cells in V1.
The strong dichoptic masking that is known to occur must therefore arise beyond V1. The following evidence that processing of binocular disparity precedes dichoptic masking supports this conclusion.
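The 1-cpd contrast modulation in the beat-pattern mask of Harris and Willis (2001), described earlier in this section, follows from the sum-to-product identity. As a sketch only, assume two vertical sinusoidal gratings of equal contrast m that are in phase at x = 0 and superimposed on a mean luminance L_0. The symbols L_0, m, and x are introduced here for illustration and are not taken from the original study.

```latex
\begin{aligned}
L(x) &= L_0\left[1 + m\sin(2\pi\cdot 8x) + m\sin(2\pi\cdot 9x)\right] \\
     &= L_0\left[1 + 2m\cos(2\pi\cdot 0.5x)\,\sin(2\pi\cdot 8.5x)\right]
\end{aligned}
```

Here x is position in degrees. Under these assumptions the carrier has a spatial frequency of 8.5 cpd, and its contrast envelope, |2m cos(πx)|, repeats once per degree. The contrast is therefore modulated at 1 cpd even though the luminance profile contains components only at 8 and 9 cpd.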
13.2.4b Masking and Relative Disparity A tone embedded in noise is easier to detect with two ears than with one, especially when the tone is presented to the two ears in antiphase. This is binaural unmasking. The difference between masking with in-phase and antiphase tones is the binaural masking-level difference. Henning and Hertz (1973) revealed a visual analog of these effects. A visual noise stimulus consisting of a mixture of vertical gratings within a narrow band of spatial frequencies, and slowly varying in contrast and phase, was presented to both eyes. A sinusoidal test grating of fixed contrast at the mean spatial frequency of the noise was superimposed on the noise. The test grating was either spatially in phase in the two eyes or in antiphase. As with auditory stimuli, the test grating was detected at much lower contrasts when in antiphase than when in phase in the two eyes, but only for spatial frequencies of less than about 6 cpd. The binocular masking-level difference did not depend on the temporal modulation of signal and noise (Henning and Hertz 1977). Since vergence was not controlled, one cannot be sure whether the signal or the noise was in antiphase. In other words, we do not know whether antiphase noise is a less effective mask or whether an antiphase test stimulus is more resistant to masking. Also, Henning and Hertz did not mention whether the antiphase signal and the noise appeared in different depth planes; any misconvergence on the antiphase grating would induce either a crossed or uncrossed disparity into the noise display. Moraglia and Schneider (1990) controlled for vergence and investigated the role of perceived depth in masking. They used a luminance-modulated Gabor patch as the test stimulus and a background area of broadband Gaussian noise as the mask. With convergence held on the background, the visibility of the superimposed test patch was augmented when it had a horizontal disparity of 13.5 arcmin. 
It was augmented less when its disparity was 40.5 arcmin. In both cases, the test patch appeared out of the plane of the noise. However, the appearance of depth was not required for augmentation of visibility because 13.5 arcmin of vertical disparity also augmented visibility, although to a lesser extent (Moraglia and Schneider 1991). Visibility of a patch with a horizontal or vertical disparity of 67.5 arcmin was no better than with zero disparity. The augmentation effect was absent when test patch and noise were orthogonal (see also Schneider et al. 1989). These results and others (Schneider and Moraglia 1992) suggest that dichoptic masking occurs at the cyclopean level within the disparity-detecting mechanism and that a stimulus is released from masking when its disparity differs from
that of the mask. For disparity to help in the recognition of superimposed patterns, as opposed to helping in their detection, all spatial frequencies required for pattern recognition must be unmasked (Schneider et al. 1999). Backward masking is also reduced when the mask and the test stimulus are in different depth planes (Section 13.2.7). A depth-dependent effect has also been found for simultaneous masking of a pattern by a surrounding annulus (Fox and Patterson 1981). In a related experiment, McKee et al. (1994) found that a test bar in one eye is masked to a lesser degree by a superimposed mask in the other eye when a second bar, similar to the mask, is placed adjacent to the test bar. They argued that the mask becomes stereoscopically matched with the adjacent bar rather than with the test bar, which releases the test bar from masking. This suggests that binocular images are linked before the site of dichoptic contrast discrimination. Masking of disparity detection by monocular noise is discussed in Section 18.7.4.
13.2.4c Masking of Suppressed and Dominant Images Instead of asking how a stimulus in one eye masks a stimulus in the other eye, one can ask how a stimulus in the suppressed phase of binocular rivalry masks a stimulus that is in either the suppressed phase or the dominant phase. Westendorf (1989) measured the reaction time to the onset of a monocular probe superimposed on one of a pair of rivalrous dichoptic patterns. When the probe was presented in the dominant phase, the suppressed image in the other eye had no effect on the visibility of the probe. When the probe was presented during the suppressed phase, however, its detection was delayed, and the delay was greater when the probe and suppressed image were identical rather than different. The greater effect with identical stimuli was ascribed to masking, since masking is greatest between identical stimuli and rivalry suppression is maximal between dissimilar stimuli. Thus, a suppressed image does not mask a probe on a dominant image but does mask a suppressed probe. It was concluded that dichoptic pattern masking occurs at a more central site than binocular suppression.
13.2.4d Summary When a test pattern in one eye is superimposed on a masking pattern in the other eye for about 200 ms, detectability of the test pattern is enhanced at low contrasts and reduced at high contrasts. The masking effect is greatest when test and masking stimuli have the same spatial frequency and orientation. In this respect, dichoptic pattern masking differs from binocular rivalry, which is most evident when the stimuli differ. Dichoptic pattern masking is evident in visual evoked potentials, but single-cell recordings have revealed that dichoptic masking does not occur in monkey V1.
A test stimulus presented to both eyes is released from masking when the mask is in a distinct disparity-defined depth plane. A suppressed image may mask a superimposed suppressed image but not a dominant image.
13.2.5 DICHOPTIC VISUAL CROWDING
The features of a stimulus are more easily recognized when it is presented in isolation than when it is flanked by other similar stimuli. This effect is known as crowding, or contour interaction. It was discussed in Section 4.8.3a. For example, an isolated letter was more easily recognized than a letter presented among other letters (Eriksen and Eriksen 1974). Also, the orientation of an isolated letter T was more easily recognized than that of a T flanked by other letters (Toet and Levi 1992). Furthermore, vernier acuity was better for an isolated visual target than for one flanked by other lines (Levi et al. 1985). Crowding differs from masking that occurs when superimposed stimuli are presented in succession. A masked stimulus is not visible, whereas a crowded stimulus is visible but not recognizable or not well registered. Masking is more local than crowding. It seems that masking occurs at a low level of processing, where stimuli compete for access to the same feature detector. In crowding, all the stimuli are detected but interfere with each other at the level where spatial features are integrated (Pelli et al. 2004). Several studies have revealed that crowding effects are evident when the test stimulus is presented to one eye and the distracters to the other eye. For example, detection of the gap in a Landolt C was impaired when four short bars flanked the stimulus. The effect fell to zero when the distance between the bars and stimulus reached about 5 arcmin (Flom et al. 1963). The effect was as strong when the test stimulus and distracters were in different eyes. Changes in the tilt of an isolated line were more easily discriminated than changes in the tilt of the same line flanked on both sides by other lines, even when the flanking lines had the same orientation as the test line (Westheimer et al. 1976). This effect also showed when the test stimulus and distracters were in different eyes (Westheimer and Hauske 1975). 
Interocular crowding suggests that crowding is cortical in origin. Crowding could be mediated by the lateral cortical connections discussed in Section 5.5.6. The spatial range of crowding is similar to the spatial range of cortical lateral connections, and both processes show a similar dependence on stimulus eccentricity (Tripathy and Levi 1994). Also, the range of crowding, as revealed by the dichoptic effects of flanking lines on vernier acuity, corresponded to the spacing of ocular dominance columns in the visual cortex (Levi et al. 1985). The key idea is that, although vernier acuity and other hyperacuities are a few seconds of arc, they depend on the registration of the relative positions of two stimuli.
Registration of position involves integrating information over a larger area. Levi et al. referred to this cortical integration area as a “perceptive hypercolumn.” Also, neighboring stimuli tend to be perceived as forming a larger pattern. Tripathy and Levi (1994) presented a test letter T to the monocular region in the left eye that corresponds to the blind spot in the right eye. Subjects reported the orientation of the test letter less accurately when three Ts were placed in the region surrounding the blind spot in the right eye. This suggests that lateral cortical connections run from the region surrounding the blind spot in one eye into the monocular region corresponding to the blind spot in the other eye. The cortical area corresponding to the blind spot contains only monocular cells in the sense that each cell receives direct inputs from only one eye (LeVay et al. 1985). Nevertheless, if lateral connections run into the monocular area, this area is not strictly monocular. The relationship between crowding and attention was discussed in Section 4.8.3. The effects of crowding on stereoacuity will be discussed in Section 18.6.2a.
13.2.6 THRESHOLD-ELEVATION
In the threshold-elevation effect a period of inspection of a suprathreshold masking grating elevates the contrast threshold of a subsequently exposed test grating in the same region of the same eye. The effect is observed only when the adaptation and test gratings have a similar spatial frequency and orientation. The effect shows interocular transfer of about 65% of its monocular value (Blakemore and Campbell 1969; Hess 1978). The degree of interocular transfer remained the same when the eye seeing the induction grating was pressure blinded in the test period, showing that the effect is cortical in origin (Blake and Fox 1972). The functions relating the threshold-elevation effect to the contrast and duration of the induction grating were the same for the monocular aftereffect as for the transferred aftereffect (Bjorklund and Magnussen 1981). The elevation of the contrast threshold of a test grating produced by an induction grating presented to the same eye was reduced when the induction grating was accompanied by a grating of a different spatial frequency presented to the other eye. This interocular effect operated only when the gratings were vertical, which suggests that it is related to stereopsis. The effect was not evident in three subjects who lacked stereoscopic vision (Ruddock and Wigley 1976; Ruddock et al. 1979). The threshold-elevation effect for a 4-cpd grating measured binocularly was the same whether the induction grating was presented to one eye or the other, or alternately to the two eyes, as long as the total duration was the same (Sloane and Blake 1984). This suggests that the binocular aftereffect represents the pooled effect from binocular cells differing in ocular dominance.
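As a minimal sketch of how the degree of interocular transfer is expressed, the transferred effect can be computed as a percentage of the same-eye (monocular) effect. The function name and the numerical values below are hypothetical, chosen only to illustrate a transfer of about 65%. They are not data from the studies cited.

```python
def percent_transfer(same_eye_effect, other_eye_effect):
    """Return the transferred effect as a percentage of the same-eye effect.

    Both arguments are effect magnitudes in the same units, for example the
    size of a threshold elevation measured under each viewing condition.
    """
    if same_eye_effect == 0:
        raise ValueError("same-eye effect must be nonzero")
    return 100.0 * other_eye_effect / same_eye_effect


# Hypothetical effect sizes (illustrative values only, not measured data).
monocular_effect = 0.40    # induction and test gratings in the same eye
transferred_effect = 0.26  # induction in one eye, test in the other

print(round(percent_transfer(monocular_effect, transferred_effect), 1))
```

Expressing transfer as a ratio makes results comparable across subjects whose monocular effects differ in absolute size, but the ratio becomes unstable when the same-eye effect is small.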
Threshold elevation produced by prior inspection of a line or a small number of dots did not show interocular transfer (Fiorentini et al. 1976). Also, threshold elevation produced by inspection of a region of high contrast did not show interocular transfer (Durgin 2001). All this evidence suggests that the threshold-elevation effect for gratings occurs at a site more central than that at which visual inputs are combined. Neurons in visual areas 17 and 18 of the cat responded less vigorously to a low-contrast drifting grating after they had been exposed to a similar drifting high-contrast grating (Ohzawa et al. 1985). Maffei et al. (1986) found the same level of interocular transfer of contrast adaptation in areas 17 and 18 of cats after section of either the corpus callosum or the optic chiasm. There was no transfer after both interhemispheric pathways were severed.
13.2.7 META- AND PARACONTRAST
In another form of successive masking, the induction and test stimuli are presented only briefly and in adjacent locations rather than in the same location. For instance, a visual pattern presented for a small fraction of a second is not seen when followed by a stimulus in a nearby location when the interstimulus interval (ISI) is between 40 and 80 ms. This is known as metacontrast or backward masking. Under certain circumstances a test stimulus is masked by a stimulus that precedes it. This is known as paracontrast or forward masking. Metacontrast seems to have been first observed by Exner (1868). The early literature is reviewed in Alpern (1952). Werner (1940) used the disk-ring configuration in which a black test disk is not seen when followed by a masking black annulus. It is as if the new inner edge of the annulus desensitizes the system for subsequent processing of the edge of the disk. The two contours have opposite luminance polarity. The superimposition or close proximity of similar contours of opposite polarity seems to be a general feature of stimuli that generate metacontrast. The effects of metacontrast are evident in the response of cells in the visual cortex of anesthetized monkeys, so that the suppression must occur early in the visual system (Macknik and Haglund 1999). Various theories of metacontrast and paracontrast have been proposed and were reviewed by Weisstein (1972).
13.2.7a Metacontrast with Cyclopean Images
Metacontrast can occur between a cyclopean shape defined by disparity and a binocular shape defined by luminance contrast, although this interdomain masking is less than when induction and test stimuli are either both cyclopean or both binocular (Patterson and Fox 1990). Lehmkuhle and Fox (1980) constructed a dynamic random-dot stereogram depicting an annular mask exposed for 160 ms and a central Landolt C test stimulus presented for 80 ms.
Detectability of the gap in the Landolt C was reduced when the mask followed the test stimulus (backward masking) by less than 100 ms or preceded the test stimulus (forward masking) by less than about 300 ms. Backward masking was similar to that observed with regular contours, but forward masking extended over a larger span of time than with regular contours. Masking declined as the test stimulus moved nearer to the viewer than the mask, but stayed constant when the test stimulus was moved beyond the mask (Section 22.5.1). Metacontrast can affect the perception of depth in a random-dot stereogram. A mask consisting of a noisy 3-D array of dots reduced the accuracy with which depth in a random-dot stereogram was detected. Masking was greatest when the interstimulus interval was less than 50 ms (Uttal et al. 1975a). It is not clear from this result whether the masking effect depended on disparity in the mask—a 2-D mask may produce the same effect. The results were interpreted in terms of the time it takes to perceive depth in a random-dot stereogram.
13.2.7b Dichoptic Metacontrast Werner found that metacontrast was the same when the test stimulus and mask were seen by the same eye as when they were presented dichoptically (see also Kahneman 1968). Kolers and Rosner (1960) reported dichoptic metacontrast with a variety of stimuli. Like monocular masking, dichoptic masking occurs between a test flash stimulating only rods and an adjacent masking flash stimulating only cones (Foster and Mason 1977). However, there are differences between dichoptic and monocular metacontrast. Although letters are masked by a flash of light only when both stimuli are presented to the same eye, they are masked by a pattern presented either to the same eye or to the other eye (Schiller 1965). For small interstimulus intervals, dichoptic masking is more pronounced than monocular masking (Schiller and Smith 1968), perhaps because of binocular rivalry. Dichoptic, but not monocular, metacontrast decreases with repetition of the stimuli (Schiller and Wiener 1963). One way to think about dichoptic metacontrast is that the newly delivered stimulus switches dominance to the eye seeing the new stimulus before the processing of the first stimulus is complete in the other eye. Oppositely polarized luminance edges rival, even when presented simultaneously. Werner (1940) argued against explaining dichoptic metacontrast in terms of binocular rivalry on the grounds that masking due to rivalry should occur when the mask precedes the test stimulus as well as when it follows it. This is not a strong argument. When the test stimulus follows the mask, it may cause a switch of dominance to the eye containing the test stimulus. This should remove the contribution that rivalry makes to forward masking, leaving only the contribution of other factors.
The three types of cones act independently in increment thresholds (Stiles 1939). For example, different annular masking flashes raised the threshold of a preceding test flash detected by red cones by equal amounts only if the chromatic content of the flashes was adjusted to produce an equal effect on red cones (Alpern et al. 1970; McKee and Westheimer 1970). This suggests that metacontrast involves interactions in the retina between cones of the same type. Yellott and Wandell (1976) found that a flashed masking stimulus raised the threshold of a preceding test flash by the same amount whether the flashes were presented to the same eye or to different eyes. But the inhibition was not cone-type specific in either case, which suggests some involvement of a central chromatic mechanism. However, Alpern et al. used a 1° test stimulus superimposed on a 9° masking disk, whereas Yellott and Wandell used two 1°-wide by 3°-high bars flanking a 1°-wide test bar. Thus, there was more opportunity for contour interactions in the latter stimulus than in the former, and dichoptic interactions are stronger when there are contiguous contours (Section 13.2.3). The degree of masking of a 0.25° test disk, flashed on just before a slightly larger disk presented briefly to the same eye, was greatly reduced when the larger disk was immediately followed by a surrounding annulus, also in the same eye (Schiller and Greenfield 1969). Presumably the outer annulus masked the larger disk, which was then less able to mask the inner disk. When the conditioning and test stimuli were presented to one eye and the outer annulus to the other, the effect of the conditioning stimulus on the test stimulus was not weakened (Robinson 1968). Any postchiasma effect of the annulus on the conditioning stimulus did not relieve the prechiasma inhibitory effect of the conditioning stimulus on the test stimulus. This aspect of metacontrast does not show interocular transfer.
13.2.7c Summary
A briefly presented test stimulus may be more difficult to see when a brief masking stimulus occurs in a neighboring location, either just before the test stimulus (paracontrast) or just after it (metacontrast). The effect seems to be due to the close proximity of contours of opposite contrast polarity. Both forms of successive masking also occur between two shapes defined by disparity or between a disparity-defined shape and a shape defined by luminance. Both forms of masking also occur when the test and mask are presented to different eyes. Dichoptic masking may be due to rivalry between edges with opposite contrast polarity in the two eyes.
13.2.8 TRANSFER OF CHROMATIC ADAPTATION
A small gray patch appears green and a green patch appears a supersaturated green when superimposed on a larger area
previously exposed to red light. This effect is usually explained in terms of bleaching of the red receptors. DeValois and Walraven (1967) suggested that the effect is due largely to contrast produced in the test patch by an afterdischarge from “red” receptors in the surrounding area. When a red adapting patch and a green test patch were the same size, the color of the test patch was desaturated because of the combination of the red afterdischarge and the green of the test patch within the area of the test patch. When they placed the adapting red patch in one eye, a subsequently seen green patch in the other eye appeared gray whether it was smaller than or equal in area to the adapting patch. DeValois and Walraven argued that the contrast effect did not show dichoptically because it is retinal and that the grayness of the dichoptic test patch was produced by interocular color combination of green from the test patch and a red afterdischarge from the same region in the other eye.
13.3 TRANSFER OF FIGURAL EFFECTS
In figural induction, an induction stimulus affects the figural properties of a test stimulus, rather than its visibility. For instance, an induction stimulus may cause an apparent change in the orientation, size, or movement of a test stimulus. As with masking, the induction and test stimuli can be presented simultaneously or successively. Geometrical illusions are simultaneous figural effects. The tilt aftereffect and the motion aftereffect are successive figural effects. Interocular transfer of an effect occurs when an induction stimulus applied to one eye affects a test stimulus applied to the other. Interocular transfer of figural effects has been studied for the following purposes:
1. To reveal the site of processes responsible for a particular effect.
2. To investigate how inputs from the two eyes combine.
3. To reveal effects of visual pathology on the combination of inputs from the two eyes.
The first two purposes are discussed in this section.
The third is discussed in Section 32.4.
13.3.1 EXPERIMENTAL PARADIGMS
Physiological studies reviewed in Chapter 11 have revealed four types of cell in the visual cortex: 1. Binocular OR cells, which respond to excitatory inputs from either eye but no more strongly to both eyes than to either eye alone. Some respond equally to either eye, while others respond more vigorously to one eye than to the other.
2. Binocular AND cells, which respond only when excited by inputs from both eyes. 3. Excitatory/inhibitory cells, which receive an excitatory input from one eye and an inhibitory input from the other. 4. Monocular cells, which receive input from only one eye. These types of cell probably form a continuum of types, and their response properties may not be fixed; for instance, their response to fusible stimuli may differ from their response to rivalrous stimuli, and the excitatory/inhibitory ratio may vary as a function of stimulus strength. Psychophysical experiments on interocular transfer of figural induction effects have been conducted to explore the role of the various types of cell in binocular processing. A major issue is the types of cortical cell required to account for the results of interocular transfer experiments. Wolfe and Held (1981, 1983) championed the view that AND cells are required in addition to OR cells and monocular cells. Moulden (1980) developed an account based on monocular cells and three types of OR cells that differ in their degree of ocular dominance, but no AND cells. Cogan’s (1987) model of binocular processing involves only binocular cells. There is something arbitrary about such accounts, because they explain different sets of data and could probably accommodate new data by adjusting parameters or by adding extra assumptions. The five basic experimental paradigms used in studies of interocular transfer of figural induction effects are as follows: 1. Interocular transfer paradigm One first measures the figural induction effect with the induction and test stimuli in the same eye then with the two stimuli in different eyes. The transferred effect is expressed as a percentage of the same-eye effect. The following logic is then applied. 
The percentage transfer should (a) increase according to the extent to which the induction and test stimuli excite the same binocular OR cells, (b) diminish according to the extent to which the test stimulus excites unadapted monocular cells, (c) be influenced by the extent to which the different classes of cortical cells inhibit each other, (d) be influenced by the presence of postinduction effects in the closed eye during the test period, and (e) not be affected by the presence of binocular AND cells, since such cells do not contribute to the strength of either the monocular effect or the transferred effect. 2. Monocular versus binocular test The induction stimulus is presented monocularly, and the test stimulus is presented either to the same eye or to both. For example, Wolfe and Held (1981) argued that
unadapted AND cells, which are not activated by a monocular induction stimulus, cause the tilt aftereffect to be less with binocular testing than that with monocular testing. The aftereffect with binocular testing after monocular induction would also be reduced through the activation of unadapted monocular cells associated with the unadapted eye. Wolfe and Held determined the reduction of the aftereffect due to this latter factor by measuring the amount of transfer from one eye to the other. The degree of transfer was insufficient to account for the reduction from one eye to two eyes. The extra reduction in the aftereffect in two eyes was put down to the effect of recruitment of unadapted AND cells in binocular testing. 3. Binocular recruitment The aftereffect is first measured with both the induction and test stimuli viewed binocularly and then with a monocular induction stimulus and a binocular test stimulus. The presence of AND cells should make the aftereffect with binocular induction larger than with monocular induction. Wilcox et al. (1990) pointed out that this argument cannot be used to infer the presence of binocular AND cells. While it is true that binocular testing after monocular induction brings unadapted AND cells into play, which reduces the aftereffect, it also brings monocular cells of the adapted eye into play, which increases the aftereffect. Since there is no way of knowing the relative contributions of these opposed influences, it is not possible to draw conclusions about what types of cortical cell are involved in the aftereffect. 4. Alternating monocular induction The induction stimulus is presented alternately to each eye for a period of time, and the test stimulus is then presented to either one eye or both simultaneously (Blake et al. 1981b). For comparison, the same procedure is followed with the induction stimulus presented intermittently to both eyes. The following logic is then applied. 
The alternating induction sequence adapts binocular OR cells and monocular cells for both eyes. Therefore, for these cells, the aftereffect should be the same for monocular and binocular testing. Since the alternating induction sequence does not adapt AND cells, the aftereffect is diluted by the activation of unadapted AND cells during binocular testing. With monocular testing, the AND cells are not excited and therefore do not dilute the aftereffect. Thus, any reduction in the aftereffect with binocular testing relative to monocular testing indicates the presence of AND cells. This logic is not subject to the ambiguity of the binocular-recruitment paradigm. 5. Cyclopean stimuli A cyclopean induction or test stimulus is defined by disparities in a random-dot stereogram.
Such cyclopean stimuli do not excite purely monocular cells. Both induction and test stimuli can be cyclopean or either one can be cyclopean and the other a conventional contrast-defined stimulus. The logic underlying the use of cyclopean stimuli in interocular transfer experiments will become apparent in what follows. There are many pitfalls in applying these arguments, and the literature has become complex and rather contentious. Any asymmetry in ocular dominance should result in more interocular transfer from one eye than from the other. A person lacking binocular cells should show neither interocular transfer nor binocular recruitment of cortically mediated aftereffects (see Section 32.4). A basic problem in all studies of interocular effects is that merely closing an eye does not stop inputs from that eye reaching the cortex. For instance, an afterimage impressed on one eye is visible when that eye is closed and the other eye opened, a fact noted by Isaac Newton in 1691 (see Walls 1953). This does not prove, as was once thought, that afterimages are cortical. It simply means that activity arising in a closed eye still reaches the cortex (Day 1958). We now know that an afterimage is no longer visible when the eye in which it was formed is blinded by pressing the finger against the eye for about 30 s (Oswald 1957). Pressure cuts off the blood supply to the retina and the eye remains blind until the pressure is relieved. It is dangerous to keep the pressure on for more than about one minute. Another problem in studies of interocular transfer of successive induction effects is that, in nontransfer trials the same eye is used in both induction and test periods, whereas in transfer trials the adapted eye is open during the adaptation period and closed during the test period, and the tested eye is at first closed and then opened. 
The sudden transition in the state of adaptation of the eyes in transfer trials could cause a spurious weakening of the aftereffect being tested. This problem can be solved by keeping both eyes in the same state of light adaptation at all times. As we will see, this precaution has been applied in only one study. There are three ways to prove that an aftereffect is cortical in origin: 1. Show that the effect survives pressure blinding the eye seeing the induction stimulus. 2. Show that the effect depends on visual features that are processed only in the visual cortex. Since orientation is first coded in the visual cortex, one might argue that an orientation-specific induction effect must be cortical. But orientation depends on the alignment of retinal receptive fields so that any retinal process that changes the activity over the set of receptive fields would also produce an orientation-specific aftereffect. Examples of truly cortical processes include phosphenes produced
by electrical stimulation of the cortex, and alternations in ambiguous figures. 3. Show electrophysiologically that the first adaptive changes to an induction stimulus arise at the level of the cortex. Other issues concerning drawing conclusions from studies of interocular transfer were discussed by Long (1979).
13.3.2 TRANSFER OF TILT CONTRAST
13.3.2a Psychophysical Studies In orientation contrast the apparent orientation of a test line in a frontal plane changes when it is intersected by or placed adjacent to a second line in a slightly different orientation. In simultaneous orientation contrast the stimuli are presented at the same time. In successive orientation contrast, also known as the tilt aftereffect, the test stimulus is presented after the induction stimulus. These effects could be produced, at least in part, by processes operating in the retina. However, it is generally believed that simultaneous orientation contrast is due to inhibitory interactions between orientation detectors in the visual cortex. In the tilt aftereffect, an added factor is believed to be the selective adaptation of the orientation detectors responding to the induction stimulus, leading to a bias in the activity of the population of orientation detectors responding to the test stimulus. Inspection of an off-vertical line induces an apparent tilt of a vertical line in the opposite direction. This form of the tilt aftereffect may be due partly to orientation contrast but it is also due to normalization of the tilted line to the vertical and a concomitant shift in the scale of apparent tilt with respect to the vertical. In a contrast effect, the angle between the induction and test lines apparently increases, whereas in tilt normalization both lines appear displaced in the same direction. See Howard (1982) for a review of this topic. Any orientation contrast when the induction and test stimuli are in different eyes must be cortical in origin. Binocular cells tuned to orientation could mediate interocular transfer of orientation contrast (Coltheart 1973). Virsu and Taskinen (1975) used two intersecting lines that were 3° long and subject to binocular rivalry and fusion. 
Walker (1978a) used a display free from rivalry, consisting of an annular induction grating with an outer diameter of 4.8° surrounding a 1.6° diameter test grating. There seems to be general agreement that simultaneous tilt contrast does not occur when the induction stimulus and the test stimulus are presented to different eyes with the angle between the lines less than about 10°. For larger angles between induction and test stimuli, Walker obtained a small amount of interocular transfer, while Virsu and Taskinen obtained transfer of about 60% of that obtained when both lines were presented to the same eye. For small
angles there was little interocular transfer when the lines were presented dichoptically. For two of the three subjects tested by Virsu and Taskinen the tilted induction line presented to one eye attracted, rather than repulsed, the test line in the other eye. Westheimer (2011) also found that, for small angles, a tilted induction line in one eye attracted a vertical test line in the other eye. But attraction occurred only when the lines intersected and when the test line was vertical rather than horizontal. One reason for this may be that lines close together tend to fuse or rival. A textured surface rotating in the frontal plane about the visual axis causes a superimposed vertical line to appear tilted in a direction opposite to the background motion. When the rotating surface was presented to one eye and the line to the other, this effect remained at full strength in subjects with normal binocular vision but was significantly reduced in stereoblind subjects (Marzi et al. 1986). Motion-induced tilt seems to be a more strongly binocular effect than static tilt contrast. The tilt aftereffect shows interocular transfer. Estimates of transfer vary between 40 and 100% (Gibson 1937; Campbell and Maffei 1971). In this case, binocular rivalry is not a factor because the induction and test displays are presented successively. The strength of the transferred aftereffect, like that of the monocular aftereffect, did not depend on whether the induction and test stimuli had the same or opposite contrasts (O’Shea et al. 1993). This suggests that the tilt aftereffect occurs at a level where contrast polarity is unimportant. Interocular transfer of the tilt aftereffect is positively correlated with stereoacuity and is absent in people lacking stereoscopic vision (Section 32.4.3). When the world is viewed through prisms, the images of straight lines are made convex toward the base of the prism.
After some time the lines appear straight. After the prisms are removed, straight lines appear curved in the opposite direction for a while. This curvature aftereffect has been reported to show between 60 and 100% interocular transfer (Gibson 1933; Hajos and Ritter 1965). Wolfe and Held (1981) used an induction stimulus like that in Figure 13.18a. After exposure to the induction stimulus, a test stimulus with straight lines appeared as a chevron bent in the opposite direction. Subjects adjusted a chevron test stimulus until the lines appeared straight and
Figure 13.18. Stimuli used to measure a dichoptic tilt aftereffect. Wolfe and Held (1981) used a chevron pattern (A) as induction stimulus. In the test phase of the experiment, subjects set a chevron pattern to appear straight. Moulden (1980) used tilting lines (B) as induction stimulus, and subjects set lines to vertical in the test phase.
parallel. Wolfe and Held applied paradigm 2 (Section 13.3.1) and found that the aftereffect showed about 70% interocular transfer, but only 40% transfer when both eyes saw the test stimulus. They concluded that unadapted monocular cells were responsible for the dilution of the aftereffect in going from one eye to the other, and the recruitment of unadapted binocular AND cells was responsible for the extra dilution in going from one eye to two. Moulden (1980) performed a similar experiment using a single set of parallel tilted lines, like those in Figure 13.18b, rather than a chevron pattern. Moulden obtained the opposite result, namely, more aftereffect when both eyes viewed the test stimulus than when only the unadapted eye viewed it. However, Wolfe and Held pointed out that tilt normalization might be involved in the stimulus used by Moulden. Wilcox et al. (1990) used both the chevron and tilted-lines stimuli and obtained the same result as Moulden. They concluded that, because of the uncertain role of unadapted monocular cells, the existence of binocular AND cells cannot be established by this procedure. In addition, Wolfe and Held applied paradigm 4 and found that, after alternating monocular adaptation, the binocular aftereffect was less than either monocular aftereffect. They argued that unadapted AND cells were also responsible for this result. The argument is more convincing in this case since there were no unadapted monocular cells. Wilcox et al. (1990) obtained a similar result and agreed with Wolfe and Held that this supports the idea of the existence of binocular AND cells. Gratings presented in alternation to the two eyes produced equal threshold elevations in monocular and binocular test stimuli whereas gratings presented intermittently to both eyes gave different elevations (Blake et al. 1981b). This argues against the existence of AND cells.
Wolfe and Held suggested that the stimuli used for the threshold-elevation effect, unlike those used for the suprathreshold tilt aftereffect, fall below the contrast threshold of AND cells. Wilcox et al. (1994) found the difference between the monocular and binocular tilt aftereffects and that between the monocular and binocular threshold-elevation effects to be similar both at and above the contrast threshold. They suggested that Blake et al.’s failure to find evidence of AND cells with the threshold-elevation effect was due to the blank intervals in their intermittent exposure condition, which were not present in the alternating condition. When Wilcox et al. controlled for this factor they found evidence for AND cells using the threshold-elevation effect. Wolfe and Held (1982) provided further support for the existence of AND cells. The induction stimulus was a cyclopean chevron pattern defined by binocular disparities in a random-dot stereogram so that the pattern was not visible to either eye. The test stimulus was a chevron pattern defined by luminance, which subjects set into alignment. The aftereffect showed only when the test pattern was viewed binocularly.
Wolfe and Held argued that the effect would have been visible with a monocularly viewed test stimulus if the cyclopean induction pattern had stimulated binocular OR cells. A cyclopean induction stimulus does not adapt monocular cells. They concluded that a cyclopean image involves the stimulation of mainly AND cells that require a simultaneous input from both eyes. They admitted that the procedure might have been insensitive to the response of a small percentage of OR cells to the cyclopean image. Binocular viewing of a luminance-defined induction stimulus produced equal monocular and binocular aftereffects. Although monocular cells and binocular OR cells were adapted in this case, there should have been some advantage of binocular viewing, since it alone brought in AND cells. They suggested that the absence of a binocular advantage was due to a ceiling effect. Wolfe and Held did not examine the case in which the induction stimulus is monocular and the test stimulus is cyclopean. According to their theory there should be no aftereffect because monocular stimulation cannot adapt binocular AND cells, and only AND cells are excited by a cyclopean image. Burke and Wenderoth (1989) conducted this experiment with the stimuli shown in Figure 13.19. All subjects saw an aftereffect in the vertical line defined by disparity in the random-dot stereogram after monocular inspection of tilted lines defined by luminance. Binocular inspection of the tilted lines produced a similar result. They used an induction stimulus with a single set of tilted lines, whereas Wolfe and Held used a chevron pattern. Adaptation to a single set of tilted lines brings in tilt normalization— the high-level process that causes a tilted display to appear vertical. The chevron-pattern effect depends on orientation
contrast—interaction between the two halves of the pattern. Whether or not this makes a difference remains to be investigated. Wolfe and Held (1983) concluded from other experiments on the diminution of the tilt aftereffect when going from monocular adaptation to binocular testing that AND cells are more sensitive to low than to high spatial frequencies and are not responsive to near-threshold stimuli or to stimuli that are blurred in one eye. They finally concluded that, since stereoscopic vision is sensitive to the same stimulus features, AND cells are involved in stereopsis. Paradiso et al. (1989) obtained 92% interocular transfer of the tilt aftereffect with a test stimulus consisting of subjective contours, but only 46% when the test stimulus was a real bar. The same difference was found whether the induction stimulus was a subjective bar or a real bar. They related this finding to the fact that von der Heydt and Peterhans (1989) had found cells responding to subjective contours only in V2 of the alert monkey, an area containing a greater proportion of binocular cells than are contained in V1.
13.3.2b Physiological Data on Orientation Contrast The response of a cell in the visual cortex to an optimally oriented bar or grating is suppressed by a superimposed orthogonal bar or grating presented to the same eye (Bishop et al. 1973; Bonds 1989). This is known as cross-orientation inhibition. It is largely independent of the relative spatial phases of the gratings and operates over a wide difference in spatial frequency. Cross-orientation inhibition was not elicited when the test and orthogonal gratings were presented to different eyes (DeAngelis et al. 1992). This suggests that the effect is generated in the visual cortex before signals from the two eyes are combined, and is not responsible for interocular transfer of orientation aftereffects. The response of a cell in the cat’s visual cortex to an optimal stimulus centered on the receptive field is larger when surrounding lines outside the cell’s normal receptive field have a contrasting orientation than when they have the same orientation (Section 5.6.2). For some cells, the inhibitory effect of similarly oriented lines could be evoked dichoptically (DeAngelis et al. 1994). This suggests that it depends on intracortical inhibitory connections. This could provide a physiological basis for interocular transfer of orientation aftereffects.
13.3.2c Summary
Figure 13.19. A cyclopean tilt aftereffect. Inspection of the fused upper gratings for about 3 minutes causes the cyclopean bar in the stereogram to appear tilted to the left. (Redrawn from Burke and Wenderoth 1989)
Simultaneous tilt contrast, motion-induced tilt, the tilt aftereffect, and the curvature and chevron aftereffects show about 60% interocular transfer. Considerable effort has been expended in using interocular transfer of these effects to reveal binocular AND cells. Binocular AND cells are known to exist from physiological evidence (Section 11.4.1),
but they seem to be few in number. Perhaps the role of binocular AND cells in binocular vision will never be resolved by psychophysical means. There are so many factors to be taken into account. The same data can be explained by making different assumptions about the types of cell present, their inhibitory or facilitatory interactions, their differential contrast thresholds, their differential dependence on stimulus features such as spatial frequency, and the degree of binocular congruence of the images in the two eyes.
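The dependence of these conclusions on the assumed cell populations can be made concrete with a toy pooling model. The sketch below is an illustrative assumption, not a model proposed by any of the cited authors: each pool of cells (left monocular, right monocular, binocular OR, binocular AND) is given a hypothetical weight, and the predicted aftereffect is taken to be the adapted share of the total weight of the pools driven by the test stimulus. The weights are chosen only so that the numbers resemble the 70% and 40% values reported by Wolfe and Held.

```python
# Toy pooling model. The linear pooling rule and all weights are assumptions
# made for illustration; they are not taken from the studies cited above.

def pools_driven(eyes):
    """Return the set of cell pools excited when the given eyes are stimulated.

    `eyes` is "L", "R", or "LR". AND cells need both eyes at the same time.
    """
    driven = set()
    if "L" in eyes:
        driven |= {"mono_L", "OR"}
    if "R" in eyes:
        driven |= {"mono_R", "OR"}
    if {"L", "R"} <= set(eyes):
        driven.add("AND")
    return driven


def aftereffect(weights, adapted, test_eyes):
    """Adapted share of the weighted pools driven by the test stimulus."""
    driven = pools_driven(test_eyes)
    total = sum(weights[p] for p in driven)
    return sum(weights[p] for p in driven & adapted) / total


# Hypothetical pool weights, picked to mimic Wolfe and Held's percentages.
weights = {"mono_L": 3, "mono_R": 3, "OR": 7, "AND": 12}

# Paradigm 2: adapt the left eye only, then test the other eye or both eyes.
adapt_left = pools_driven("L")
print("transfer to right eye:", aftereffect(weights, adapt_left, "R"))
print("binocular test:       ", aftereffect(weights, adapt_left, "LR"))

# The same paradigm with no AND cells gives the opposite ordering.
no_and = dict(weights, AND=0)
print("binocular test, no AND:", aftereffect(no_and, adapt_left, "LR"))

# Paradigm 4: alternating monocular adaptation never adapts AND cells.
adapt_alternating = pools_driven("L") | pools_driven("R")
print("alternating, monocular test:", aftereffect(weights, adapt_alternating, "L"))
print("alternating, binocular test:", aftereffect(weights, adapt_alternating, "LR"))
```

With a large AND pool the binocular test gives less aftereffect than the unadapted eye alone, as Wolfe and Held found. With no AND pool it gives more, as Moulden found, because the adapted monocular cells of the induction eye are recruited. Under this toy rule only the alternating paradigm isolates the AND pool, since the binocular aftereffect falls below the monocular one only when the AND weight is greater than zero.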
13.3.3 TRANSFER OF THE MOTION AFTEREFFECT
13.3.3a Basic Studies In the motion aftereffect, inspection of a textured display moving in one direction causes a stationary display seen subsequently to appear to move in the opposite direction. Aristotle (trans. 1931) saw the effect after looking at a flowing river, and mentioned it in his book Parva naturalia. Purkinje (1825) noticed that after looking at a cavalry parade the houses appeared to move in the opposite direction. Addams (1834) experienced the landscape moving after looking at the falls of Foyers in Scotland. Purkinje and Addams believed that the effect was due to continuation of nystagmic eye movements. Wohlgemuth (1911) reviewed the early work on the motion aftereffect. Holland (1965) and Mather et al. (1998) have provided more recent reviews. The motion aftereffect can be conveniently observed by inspecting the center of a rotating spiral for about a minute and then transferring the gaze to a stationary pattern. An inwardly rotating spiral causes an apparent expansion of a stationary pattern and an outwardly rotating spiral causes an apparent contraction (Plateau 1850). The spiral aftereffect cannot be due to eye movements since it occurs in all radial directions simultaneously. The effect is generally believed to be due to the selective fatigue of one set of motion detectors and the subsequent disturbance in the balance of activity across the population of motion detectors. The magnitude of the motion aftereffect has been measured by recording its duration (Pinckney 1964), by estimating its apparent velocity (Wright 1986), by nulling it with a real motion in the opposite direction (Taylor 1963), and by measuring its effect on the threshold for detection of motion in the adapted and unadapted directions (Levinson and Sekuler 1975). In a variant of the latter procedure, Raymond (1993) measured the motion aftereffect by the elevation in the motion-coherence threshold. 
This is the percentage of coherently moving dots in a dynamic random-dot display required for detecting unidirectional motion. A large part of the motion aftereffect shows when the inspection display is presented to one eye and the stationary test display to the other. This fact was first noted by
Dvorák (1870) and has been confirmed by several investigators (Ehrenstein 1925; Freud 1964; Lehmkuhle and Fox 1976). The aftereffect shows interocular transfer only when the induction and stationary test stimuli fall on corresponding areas of the two retinas (Walls 1953). Estimates of the magnitude of the transferred aftereffect relative to that elicited in the same eye vary between zero and 96%, with a mean of about 50% (Wade et al. 1993). All zero-transfer results were obtained from experiments that used moving square-wave gratings and the rather insensitive criterion of duration of the aftereffect. For all other stimuli and criteria, interocular transfer was at least 40%. Transfer of the motion aftereffect of 78% was obtained by Lehmkuhle and Fox using the following procedure. During the induction period the nonadapted eye was presented with a stationary illuminated aperture with the same space-average luminance as a moving display presented in the same aperture to the other eye. This prevented a sudden change in the state of light adaptation of the nonadapted eye in the transition from induction period to test period. Without this control, there was only 52% interocular transfer. The crucial factor may not be the presence or absence of a sudden change of luminance but the presence or absence of contours in the nonadapted eye. The nonadapted eye saw the illuminated aperture within which the moving display was presented to the other eye. Timney et al. (1996) argued that, in the Lehmkuhle and Fox experiment, conjugate eye movements induced by the moving display caused the images of the aperture to move and generate a motion aftereffect in the supposedly nonadapted eye. They supported their view with two observations. First, illuminating the nonadapted eye had no effect on the interocular transfer of the tilt aftereffect or the threshold elevation effect, neither of which involves motion. 
Second, when they removed all contours from the illuminated display presented to the nonadapted eye, the interocular transfer of the motion aftereffect was the same as when the nonadapted eye was occluded. It seems therefore that interocular transfer of the motion aftereffect is best reflected by the 52% value, which Lehmkuhle and Fox obtained with the nonadapted eye occluded. Interocular transfer of the motion aftereffect produced by rotary motion was found to be less for a stimulus subtending 62° than for a stimulus subtending 30° or 5° (Grove et al. 2008). This suggests that binocular interactions are reduced in the periphery. Another possibility is that the duration of the motion aftereffect decreases as the induction stimulus is moved into the retinal periphery (Van de Grind et al. 1994).
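The percentages quoted in this section express the aftereffect measured in the nonadapted eye relative to that measured in the adapted eye. A minimal sketch of that ratio follows. The function name and the durations are hypothetical and were chosen only so that the result resembles the 52% figure discussed above.

```python
def percent_transfer(transferred, same_eye):
    """Transferred aftereffect as a percentage of the same-eye aftereffect.

    Both arguments must be in the same units (for example, seconds of
    aftereffect duration or nulling velocity).
    """
    if same_eye <= 0:
        raise ValueError("same-eye aftereffect must be positive")
    return 100.0 * transferred / same_eye


# Hypothetical aftereffect durations in seconds, for illustration only.
same_eye_duration = 12.5
other_eye_duration = 6.5
print(percent_transfer(other_eye_duration, same_eye_duration))
```

Because the index is a ratio, an insensitive measure that pushes the transferred aftereffect to its floor yields zero transfer whatever the underlying mechanism, which is one reading of the zero-transfer results obtained with the duration criterion.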
13.3.3b Motion Aftereffects at Different Processing Levels If motion detectors were in the retina, one might expect some interocular transfer of the motion aftereffect, since
closing the eye in which the induction stimulus was presented would not prevent inputs from the adapted detectors in the closed eye from reaching the visual cortex. However, the motion aftereffect transferred from one eye to the other when the retina of the eye exposed to the induction stimulus was pressure blinded just after the exposure period (Barlow and Brindley 1963; Scott and Wood 1966). This proves that at least some motion detectors responsible for motion aftereffects in humans are at a higher level than the retina. See Section 22.3.2 for a discussion of the interocular transfer of adaptation to plaid motion in superimposed orthogonal gratings. Raymond (1993) obtained 96% transfer of the motion aftereffect from a random-dot display when the aftereffect was assessed by elevation of the motion-coherence threshold. There are no sudden changes of luminance with stimuli of this kind, which may account for the high interocular transfer. Raymond favored the view that the site of this interocular transfer is the extrastriate area MT, which is known to contain only binocular cells and to be sensitive to the degree of coherent motion (Murasugi et al. 1993). Steiner et al. (1994) found less interocular transfer of the motion aftereffect from translatory motion than from expansion or rotation and argued that this is because the latter types of motion are processed at higher levels of the visual system. Nishida et al. (1994) cited evidence that the motion aftereffect tested with a static grating involves a lower level of neural processing than that tested with a directionally ambiguous grating flickering in counterphase. In conformity with this evidence, they found that the duration of the aftereffect induced by a drifting grating showed 30 to 50% interocular transfer when tested with a static grating but almost 100% transfer when tested with a flickering pattern. However, Nishida and Ashida (2000), in agreement with Hess et al. 
(1997), found that interocular transfer with a flickering test pattern was not complete in the retinal periphery. They also found, in agreement with Steiner et al. (1994), that transfer was not complete when the aftereffect was measured with a nulling method nor when the observer’s attention was distracted by a secondary task. If a dynamic test display, such as a counterphase flicker test, reflects activity at a higher level in the nervous system than does a stationary test display, one should be able to use these tests to reveal motion aftereffects at different neural levels. Motion of a luminance-defined pattern is first-order motion, and motion of a pattern defined by modulation of contrast, or by edges defined by texture, is second-order motion. It is believed that first-order motion is processed at a lower level in the nervous system than is second-order motion. A motion aftereffect induced by first-order motion shows with a static test stimulus but that induced by a second-order stimulus shows only with a dynamic test stimulus (McCarthy 1993). Furthermore, simultaneous
adaptation to first-order motion in one direction and second-order motion in the opposite direction produced an aftereffect with a static pattern in the opposite direction to the first-order stimulus and an aftereffect with a dynamic pattern in the opposite direction to the second-order stimulus (Nishida and Sato 1995). See Nishida and Ashida (2001) for further discussion of interocular transfer of different types of motion aftereffects. The motion aftereffect is greatly enhanced when the induction stimulus is surrounded by a stationary texture rather than by a blank field. This effect may be due to the fact that a display consisting only of moving elements induces pursuit eye movements, which cancel the motion of the stimulus on the retina. Enhancement of the motion aftereffect by a stationary surround does not occur when the induction stimulus is presented to one eye and the textured stationary surround to the other (Symons et al. 1996). This suggests that relative motion signals are required to be present in the same eye for the generation of the aftereffect.
13.3.3c Motion Aftereffect Transfer and Stimulus Velocity Tao et al. (2003) investigated the effect of the velocity of the induction stimulus on the degree of interocular transfer of the motion aftereffect. They used a nulling procedure and a dynamic-noise test pattern. As the velocity of the induction stimulus increased from 1.5°/s to 24°/s, mean interocular transfer increased from 18% to 63%. They concluded that binocular cells play a more dominant role in the motion aftereffect at higher velocities. The following evidence suggests that slow motion and fast motion are processed in distinct channels. The standard motion aftereffect does not occur for induction stimuli moving at velocities in excess of 30°/s. However, an induction stimulus moving faster than 30°/s induced an aftereffect in a test stimulus consisting of dynamic random noise (Verstraten et al. 1998). Inspection of superimposed dot patterns moving at the same velocity in different directions produces a motion aftereffect that represents the vector sum of the two motion signals. When the induction stimuli move in opposite directions, the aftereffects cancel (Verstraten et al. 1994a). However, adaptation to two superimposed random-dot patterns moving at different velocities (4 and 12°/s) in orthogonal directions produced an aftereffect opposite the slow motion component in a static test display and an aftereffect opposite the fast motion in a dynamic test display. When the test stimulus contained both static and dynamic elements, the aftereffect was that of two superimposed patterns moving orthogonally (van der Smagt et al. 1999). Dichoptic patterns moving in opposite directions engage in binocular rivalry. However, a rapid pattern in one eye and a slow pattern in the other eye did not rival but
produced the impression of transparent motion of two arrays (van de Grind et al. 2001). They called this effect dichoptic motion transparency. It suggests that there are two motion channels, each with a distinct rivalry stage.
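The vector-sum rule described above for superimposed patterns moving at the same velocity can be sketched as follows. This covers only that equal-speed case. It does not model the separate slow and fast channels, and the returned magnitude is simply the length of the summed vector in arbitrary units, not a prediction of perceived speed. The function name is hypothetical.

```python
import math


def aftereffect_vector(adapting_motions):
    """Direction and length of the vector opposite the summed adapting motion.

    adapting_motions: list of (direction_deg, speed) pairs.
    Returns (direction_deg, length), or None when the components cancel.
    """
    x = sum(s * math.cos(math.radians(d)) for d, s in adapting_motions)
    y = sum(s * math.sin(math.radians(d)) for d, s in adapting_motions)
    length = math.hypot(x, y)
    if length < 1e-9:
        return None  # opposite motions of equal speed cancel
    direction = math.degrees(math.atan2(-y, -x)) % 360.0
    return direction, length


# Two patterns at the same speed, moving rightward (0 deg) and upward (90 deg).
print(aftereffect_vector([(0.0, 4.0), (90.0, 4.0)]))

# Two patterns at the same speed moving in opposite directions.
print(aftereffect_vector([(0.0, 4.0), (180.0, 4.0)]))
```

In the first case the predicted aftereffect points down and to the left, opposite the sum of the two adapting motions. In the second case the function returns None, corresponding to the cancellation reported by Verstraten et al. (1994a).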
13.3.3d Distinct Motion Aftereffects in each Eye Random-dot stimuli moving in different directions and superimposed in the same eye produce an aftereffect in a direction opposite the vector sum of the induction stimuli. The aftereffects cancel when the induction stimuli move at the same velocity in opposite directions. A vector sum aftereffect occurs also in a binocular test stimulus when the two induction motions are presented to different eyes (Grunewald and Mingolla 1998). However, when only one eye views the test stimulus, the direction of the aftereffect is opposite that of the induction stimulus that was presented to that eye (Wohlgemuth 1911; Anstis and Moulden 1970). When the test stimulus consisted of stationary rivalrous dichoptic orthogonal gratings, whichever image was dominant appeared to move in a direction opposite to the motion to which that eye had been exposed (Ramachandran 1991). These results support the idea that motion adaptation occurs in distinct directions in the two monocular sites, which then sum at a binocular site. The monocular aftereffects could be due to monocular motion-sensitive cells in the visual cortex, which selectively adapt to input from only one eye. But they could also represent adaptation in that subset of binocular cells for which that eye forms the dominant input. Anstis and Duncan (1983) extended this paradigm as follows. A rotating textured disk was seen rotating clockwise for 5 s by the left eye, then for 5 s by the right eye, and finally rotating counterclockwise for 5 s by both eyes. The sequence was repeated 40 times. In the test period a stationary disk appeared to rotate counterclockwise when viewed by either eye alone and clockwise when viewed by both eyes. Similar eye-specific aftereffects were reported by Jiao et al. (1984). The binocular aftereffect must have arisen in binocular cells. 
The monocular effects in Anstis and Duncan’s study could not have been induced in monocular cells in a straightforward way, since each eye was exposed to equal clockwise and counterclockwise motion. Three processes have been proposed to account for these monocular aftereffects: 1. Anstis and Duncan suggested that the response of monocular cells responding to clockwise motion was inhibited by binocular cells responding to clockwise motion. 2. Tyler suggested that these effects are explained by the results of his 1971 experiment in which binocular motion signals suppressed monocular motion signals
(Section 18.12.3). Thus, the counterclockwise binocular motion signals do not excite monocular cells and therefore do not cancel effects of monocular exposure to clockwise motion. 3. Van Kruysbergen and de Weert (1993) distinguished between a pure monocular system, a simple binocular system, and a pure binocular system. In Anstis and Duncan’s experiment, the pure monocular system for each eye was exposed to equal amounts of clockwise and counterclockwise motion and therefore did not exhibit an aftereffect. The simple binocular system received 5 s of clockwise motion from each eye separately and 5 s of counterclockwise motion from the two eyes simultaneously. Overall, the simple binocular system was exposed to clockwise motion for a longer period than it was exposed to counterclockwise motion. It therefore generated a counterclockwise aftereffect with monocular testing. The pure binocular system was activated by only counterclockwise motion and therefore exhibited a clockwise aftereffect with binocular testing. It must be assumed that the pure binocular system gives a stronger aftereffect than the simple binocular system; otherwise the two stimuli would cancel with binocular testing. Van Kruysbergen and de Weert repeated Anstis and Duncan’s experiment with subjects not aware of conditions they were exposed to. They produced further evidence for the pure monocular system and for the two types of binocular system. In addition, their results suggest the existence of a central monocular system for each eye. These various systems are depicted in Figure 13.20 (see also van Kruysbergen and de Weert 1994).
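The bookkeeping behind the third account can be laid out explicitly. The sketch below tallies the signed exposure that each proposed system receives in one cycle of Anstis and Duncan's sequence, then predicts the direction of the aftereffect for a given test. The additive combination rule and the gain given to the pure binocular system are assumptions made for illustration, and the central monocular systems are omitted.

```python
# One cycle of the induction sequence described in the text (seconds).
# Positive values are clockwise, negative values counterclockwise.
CYCLE = [
    ({"L"}, +5),       # left eye alone, clockwise
    ({"R"}, +5),       # right eye alone, clockwise
    ({"L", "R"}, -5),  # both eyes together, counterclockwise
]


def net_exposure(cycle):
    """Signed exposure received by each system over one cycle."""
    net = {"pure_mono_L": 0, "pure_mono_R": 0, "simple_bin": 0, "pure_bin": 0}
    for eyes, signed_seconds in cycle:
        if "L" in eyes:
            net["pure_mono_L"] += signed_seconds
        if "R" in eyes:
            net["pure_mono_R"] += signed_seconds
        net["simple_bin"] += signed_seconds      # driven by either eye or both
        if eyes == {"L", "R"}:
            net["pure_bin"] += signed_seconds    # needs both eyes together
    return net


def predicted_aftereffect(net, test_eyes, pure_bin_gain=2.0):
    """Direction of the aftereffect for a test seen by the given eyes.

    pure_bin_gain is a hypothetical weight for the pure binocular system.
    """
    test_eyes = set(test_eyes)
    gains = {"simple_bin": 1.0}
    if "L" in test_eyes:
        gains["pure_mono_L"] = 1.0
    if "R" in test_eyes:
        gains["pure_mono_R"] = 1.0
    if test_eyes == {"L", "R"}:
        gains["pure_bin"] = pure_bin_gain
    # The aftereffect is opposite in sign to the weighted net exposure.
    signal = -sum(gain * net[name] for name, gain in gains.items())
    if signal > 0:
        return "clockwise"
    if signal < 0:
        return "counterclockwise"
    return "none"


net = net_exposure(CYCLE)
print(net)
print("left eye test: ", predicted_aftereffect(net, {"L"}))
print("binocular test:", predicted_aftereffect(net, {"L", "R"}))
print("binocular test, equal gains:",
      predicted_aftereffect(net, {"L", "R"}, pure_bin_gain=1.0))
```

With equal gains the simple and pure binocular contributions cancel in the binocular test, which is why the account must assume that the pure binocular system gives the stronger aftereffect.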
Figure 13.20. Monocular and dichoptic motion detection. Neural systems proposed by van Kruysbergen and de Weert (1993) to account for monocular and binocular motion aftereffects. (Diagram labels: left monocular, right monocular, simple binocular system, pure binocular system, central left monocular system, central right monocular system.)
Blake et al. (1998) obtained motion aftereffects in the two eyes that showed binocular rivalry rather than vector summation. Subjects were adapted for 10 minutes to dichoptic random-dot displays moving in orthogonal oblique directions. For most subjects, a binocularly viewed test display of randomly moving dots appeared to move alternately in one direction and the other direction. In other words, there was binocular rivalry between two distinct motion aftereffects. With a stationary test display, the aftereffect was in the direction of the vector sum of the two monocular aftereffects. It is not clear why these test stimuli produced different results. Motion of cyclopean contours not present in either monocular image can create a motion aftereffect (Papert 1964; Anstis and Moulden 1970). This aftereffect must depend on binocular cells, since only binocular cells register a cyclopean image.
13.3.3e Interocular Transfer of Induced Visual Motion
A small stationary object superimposed on a moving background appears to move in the opposite direction, an effect known as induced visual motion. Levi and Schor (1984) found only weak induced motion in a stationary grating presented to one eye that was flanked by moving gratings presented to the other eye. They controlled for effects of vergence but provided no data. Swanston and Wade (1985) presented a stationary test line to one eye and a symmetrically expanding background to the other. Induced visual motion evident with ordinary binocular viewing was not present in the dichoptic condition. However, binocular rivalry complicated the dichoptic judgments. Confining the moving display to an annulus surrounding a stationary disk eliminated the effects of rivalry. With this display, dichoptic induced motion occurred at about 30% of its normal value (Day and Wade 1988). It is possible, however, that part of the dichoptic effect resulted from cyclorotation of the eyes induced by the moving annulus. A stimulus subjected to induced motion in one eye generated a sensation of motion-in-depth when dichoptically combined with a stimulus that was perceived as stationary (Wade and Swanston 1993). In other words, dichoptic stimuli that differed only in their perceived motion generated stereoscopic motion-in-depth. Further experiments are required to determine whether vergence eye movements are responsible for some or all of this effect.
13.3.3f Physiology of Transfer of the Motion Aftereffect
Physiological evidence indicates that motion detectors in primates occur in V1 and other cortical visual areas. Motion detection at each location is mediated by a set of motion detectors. Each detector is optimally sensitive to a
particular direction and speed of motion, but the detectors have overlapping tuning functions (see Sekuler et al. 1978). A stimulus is perceived as stationary when detectors of rightward and leftward motion are stimulated equally. The motion aftereffect is believed to be due to selective adaptation of the motion detectors responding to the induction stimulus, which causes an imbalance in the response of the set of motion detectors to a stationary stimulus. The motion aftereffect is not induced by inspection of a moving display that fills the visual field (Wohlgemuth 1911). This is perhaps not surprising, since the optokinetic response of the eyes would tend to null the motion of the image over the retina. Retinal motion is preserved when there is relative motion in the visual field. In any case, relative motion is a more potent stimulus than absolute motion (Snowden 1992). Interocular transfer of the motion aftereffect must be mediated by motion-sensitive binocular cells in V1 or at a higher level. Experiments on motion-sensitive cells in the visual cortex of unanesthetized cats have provided direct evidence for this idea. For about 30 s after being stimulated by a moving grating for 30 s, the cells remained less sensitive to motion in the adapted direction and more sensitive to motion in the opposite direction (Vautin and Berkley 1977; Hammond et al. 1988). Cells that had been adapted to a moving grating presented to one eye showed similar but weaker aftereffects when tested with stimuli presented to the other eye. Interocular transfer in a cell was stronger when the induction stimulus was presented to the eye that provided the dominant input to that cell than when it was presented to the nondominant eye for that cell. No transfer of the aftereffect occurred for cells classified as monocular (Hammond and Mouat 1988). 
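The adaptation account of the motion aftereffect described above can be illustrated with a toy opponent model. This is a minimal sketch, not a model from the literature cited here: the exponential gain decay, the `floor` and `time_constant_s` values, and the function names are all assumptions chosen for illustration.

```python
import math


def adapted_gain(exposure_s, floor=0.6, time_constant_s=20.0):
    """Gain of a detector after `exposure_s` seconds of stimulation.

    Hypothetical rule: gain decays exponentially from 1.0 toward `floor`.
    """
    return floor + (1.0 - floor) * math.exp(-exposure_s / time_constant_s)


def opponent_response(gain_right, gain_left, drive_right=1.0, drive_left=1.0):
    """Rightward minus leftward detector output.

    A stationary stimulus is assumed to drive both detectors equally.
    """
    return gain_right * drive_right - gain_left * drive_left


def perceived_direction(signal, tolerance=1e-9):
    if signal > tolerance:
        return "rightward"
    if signal < -tolerance:
        return "leftward"
    return "stationary"


if __name__ == "__main__":
    # Before adaptation: balanced detectors, so the test appears stationary.
    print(perceived_direction(opponent_response(1.0, 1.0)))

    # After 30 s of rightward motion only the rightward detector is adapted,
    # so a stationary test produces a net leftward signal (the aftereffect).
    g_right = adapted_gain(30.0)
    print(perceived_direction(opponent_response(g_right, 1.0)))
```

The point of the sketch is only that a stationary test stimulus yields a nonzero opponent signal once one detector's gain has been reduced; the particular decay rule carries no empirical weight.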
Interocular transfer of direction-specific motion adaptation was most evident in simple cortical cells and showed only when the cells were adapted to motion in the preferred direction (Cynader et al. 1993). The interocular motion aftereffect, like the monocular aftereffect, is specific to the direction of motion of the induction stimulus. Thus, the directional tuning of binocular cortical cells must be the same for both eyes. This explains why binocular contrast sensitivity for moving gratings presented dichoptically is better than monocular contrast sensitivity only when the gratings move in the same direction (Arditi et al. 1981a).
13.3.3g Summary
The motion aftereffect shows considerable interocular transfer when the moving induction stimulus is confined to one eye. The extent of interocular transfer is greater when the aftereffect is tested with a dynamic pattern with ambiguous motion rather than with a stationary pattern. A dynamic test pattern seems to tap aftereffects occurring at a higher level in the nervous system or at higher velocities compared with the effect produced by a stationary pattern.
BINOCULAR SUMMATION, MASKING, AND TRANSFER
The motion aftereffect produced by second-order motion shows only with a dynamic test pattern, presumably because second-order motion is processed at a higher level than first-order motion. Distinct motion aftereffects may be generated in each eye, and seem to be at least partially distinct from an aftereffect generated by binocular viewing. Induced visual motion also shows considerable interocular transfer.
13.3.4 TRANSFER OF THE SPATIAL-FREQUENCY SHIFT
After inspection of a grating of a given spatial frequency, a grating of lower spatial frequency appears coarser than it normally appears, and a grating of higher spatial frequency appears finer (Blakemore and Sutton 1969). This is known as the spatial-frequency shift. The aftereffect shows interocular transfer, although the transferred effect relative to the ordinary aftereffect does not seem to have been measured (Murch 1972). Cyclopean shapes generated in a random-dot stereogram can also induce the spatial-frequency shift (Section 16.3). The aftereffect shows the same interocular transfer when the adapted eye is pressure blinded during the test period (Meyer 1974). Therefore transfer must be central rather than arise from activity in the adapted eye. Favreau and Cavanagh (1983, 1984) obtained interocular transfer of the spatial-frequency shift using isoluminant colored gratings, but only when the test grating was exposed for less than 400 ms or was flickering. They argued that the interocular transfer reflected the activity of color-coded binocular cells with transient characteristics that have been found in area V4 of the monkey. Transfer of other effects involving color is discussed in the next section. The issue of whether stereopsis occurs with isoluminant stimuli is discussed in Section 17.1.4.
Adaptation to a high-density random-dot display reduces the apparent density of a subsequently seen display of lower dot density. This effect is not a spatial-frequency aftereffect because it occurs with displays that differ in density but not spatial frequency (Durgin and Huk 1997). This aftereffect showed interocular transfer of about 70%. An aftereffect due to a simple difference in the contrast of texture elements showed no interocular transfer (Durgin 2001).
13.3.5 TRANSFER OF CONTINGENT AFTEREFFECTS
A contingent aftereffect is one that depends on a particular combination of two stimulus features. The first contingent aftereffect to be reported involved color and orientation. The human eye is not corrected for chromatic aberration, which means that blue light is brought into focus nearer the lens than red light. This produces color fringes along black-white borders away from the optic axis. However, we do not see these color fringes, presumably because the visual system
applies a correction at a neural level. Color fringes specific to the luminance polarity of edges appear when we view the world through prisms, because prisms increase the degree of chromatic aberration above its normal level. For instance, base-left prisms produce blue fringes on the right of light regions and red fringes on the left. These fringes disappear after the prisms have been worn for a few days, which reinforces the idea that the neural system compensates for them. In a footnote to a paper on adaptation to prismatically induced curvature, Gibson (1933) reported that color fringes of opposite sign were seen for several hours after removal of prisms that had been worn for 3 days. These so-called phantom fringes must be neural rather than optical in origin since they show in monochromatic light (Hay et al. 1963). Phantom fringes represent the first-known example of a contingent aftereffect: the colors are contingent on the polarity of the edge. Adaptation to prism-induced chromatic fringes does not transfer from one eye to the other (Hajos and Ritter 1965). Celeste McCollough (1965) discovered a contingent aftereffect that is induced in a few minutes. Subjects view an orange and black vertical grating for 10 s, then a blue and black horizontal grating for 10 s. After these alternating stimuli have been viewed for several minutes, a vertical achromatic grating appears black and blue-green (complementary color of the vertical induction grating) and a horizontal grating appears black and orange (complementary color of the horizontal induction grating). This contingent aftereffect is known as the McCollough effect. It is not an ordinary color aftereffect obtained by gazing steadily at a colored grating, since both colors stimulate all regions of the retina during the induction period. Also, the effect lasts for several hours, days, or even months, unlike aftereffects produced by single stimulus features, which last only minutes. See Stromeyer (1978) and Vul et al.
(2008) for reviews of this topic. McCollough noted that there was no color-contingent aftereffect when the inspection and test stimuli were presented to opposite eyes, except for one subject who reported an aftereffect in which the same rather than the complementary colors appeared in the test stimuli. This is known as the positive contingent aftereffect. Other investigators have confirmed that the negative McCollough effect does not transfer to a nonadapted eye (Murch 1974; Stromeyer 1978). Nine out of 27 subjects tested by Mikaelian (1975) observed a weak positive aftereffect in an eye that was not exposed to the induction stimulus. Fifteen of the subjects observed the positive aftereffect in the unstimulated eye after scanning the achromatic test stimulus with both eyes. White et al. (1978) argued that monocular viewing during the induction period does not provide a fair test of interocular transfer, since the eyes receive different levels of luminance and color. They avoided this problem by presenting the pattern component of the induction stimulus to one eye and the color component to both eyes. Thus, one eye
was exposed to red vertical and green horizontal gratings, as in the regular McCollough effect, while the other eye saw only alternating colors, with no patterns. White et al. obtained a negative aftereffect when the eye seeing only the color was tested. The positive aftereffect was seen when the eye that had been exposed to the patterned stimulus was pressure blinded during testing. This provides strong evidence that the figural processes responsible for the McCollough effect occur, at least partially, in binocular cells. The conclusion could be tested if the experiment were also done the other way around, exposing one eye to the regular McCollough induction stimuli and the other eye only to alternating achromatic gratings. MacKay and MacKay (1975) fully partitioned the two stimulus components between the two eyes during a 20-minute inspection period. One eye saw only black and white gratings alternating in orientation while the other eye saw only a uniform green field alternating with a uniform red field. A black-white test grating presented to the eye exposed to only uniform color appeared in the color complementary to that of the inspection stimulus with the same orientation. However, a test grating presented to the eye exposed only to achromatic gratings appeared in the same color as its matching inspection stimulus (positive aftereffect). Both transferred effects were less than one-third the strength of a normal McCollough aftereffect. Over et al. (1973) failed to observe interocular transfer. They used a similar method, but with a higher rate of alternation between the pairs of right and left eye stimuli. Potts and Harris (1979) used a similar procedure and obtained the negative aftereffect when the test grating was presented to the eye previously exposed to color, but no aftereffect when the eye exposed to color was closed. Shattuck and Held (1975) failed to find interocular transfer of a color-contingent tilt aftereffect.
Thus, interocular transfer of the McCollough effect is not always observed and, when observed, is much weaker than the direct effect, suggesting that the aftereffect depends mainly on processes before binocular fusion. Positive contingent aftereffects accord with a well-known property of afterimages. An ordinary colored afterimage is in the complementary hue when projected on a light background and in the same hue when projected on a dark background or seen in the dark field of a closed eye (Sumner and Watts 1936; Robertson and Fry 1937). This dependence of the hue of an afterimage on the luminance of the background has nothing to do with binocular interactions (Howard 1960). Thus, it is not surprising that the contingent aftereffect is positive when the eye exposed to color in the induction period is closed during the test period. Note that MacKay and MacKay (1975) obtained the ordinary negative aftereffect when the eye exposed to color in the induction period was open during the test period. Transferred contingent aftereffects should be in their negative rather than their positive form if, during the inspection and
test periods, the unstimulated eye is evenly illuminated rather than closed. Vidyasagar (1976) presented a red vertical grating and a blue horizontal grating alternately to both eyes. Complementary gratings (blue vertical and red horizontal) were then presented alternately to each eye separately. An aftereffect complementary to the binocular stimuli was obtained. He argued that the complementary gratings canceled monocular components of the aftereffect, leaving only the component arising from activation of binocular AND cells. White et al. (1978) used a similar logic. Each eye was exposed to a conventional McCollough inspection stimulus with the gratings and colors changing either in phase or in antiphase in the two eyes. After in-phase inspection the binocular aftereffect was greater than either of the monocular aftereffects. After antiphase inspection the binocular aftereffect was weaker than either of the monocular aftereffects. It is surprising that there was any binocular aftereffect following antiphase inspection. These results provide evidence of binocular facilitation and cancellation of the McCollough effect, since only the binocular stimulus varied between the in-phase and antiphase induction conditions. However, binocular facilitation was not large, suggesting that most of the McCollough effect is induced before binocular convergence. On the basis of binocular facilitation, one would expect monocular aftereffects to be greater after in-phase than after antiphase induction, since only in-phase induction provides an effective stimulus for either binocular OR cells or AND cells. White et al. did not mention this point, but a scrutiny of their data reveals no difference of this type. Kavadellas and Held (1977) also failed to find that monocular aftereffects after binocular induction with identical stimuli are greater than those after binocular induction with opposed stimuli.
They did not compare monocular and binocular aftereffects and used a color-contingent tilt aftereffect rather than the conventional McCollough effect. White et al. also demonstrated that aftereffects specific to each eye could be induced at the same time. Furthermore, when a black and white stimulus was presented to only one eye after a binocular induction period, the extinction of the aftereffect was largely confined to that eye (Savoy 1984). This evidence strengthens the conclusion that the McCollough effect arises before the site of binocular convergence. This conclusion is also supported by the fact that White et al. found the aftereffect to be at full strength when, for much of the induction period, the induction stimulus presented to one eye was perceptually suppressed by a rivalrous stimulus presented to the other eye. Inspection of a red disk rotating clockwise alternating with a green disk rotating counterclockwise produces a color-contingent motion aftereffect in a stationary patterned disk (Hepler 1968; Mayhew and Anstis 1972). Stromeyer and Mansfield (1970) found that this aftereffect
did not transfer to an eye that was closed during the induction period. Favreau (1978) agreed that the negative aftereffect does not transfer to an unstimulated eye, but did find a transferred positive contingent aftereffect. In these studies a stationary test stimulus appeared to move in one direction when shown in one color and in the opposite direction when shown in another color. Smith (1983) found that the threshold for detecting motion of a rotating colored spiral was elevated by prior inspection of a spiral moving in the same direction and with the same color. This color- and direction-specific threshold-elevation effect showed significant interocular transfer. This suggests that there are double-duty binocular cells jointly tuned to motion and color revealed only when both the induction and test stimulus are moving, presumably because the cells do not respond to stationary stimuli.
13.3.5a Summary
The complex series of experiments and theoretical arguments reviewed in Section 13.3 yield the following conclusions. (1) Some aftereffects show more interocular transfer than others. (2) The magnitude of an aftereffect is determined by some pooling of activity from different types of cortical cells, including monocular cells, binocular OR cells, and binocular AND cells if they exist. (3) There is some evidence from studies on interocular transfer that both binocular OR cells and binocular AND cells do exist. (4) Contingent aftereffects are due largely to processes occurring before binocular matching. Evidence reviewed in Section 32.4 shows that people with defective binocular vision show less interocular transfer than those with normal vision.
13.4 TRANSFER OF PERCEPTUAL LEARNING
13.4.1 TRANSFER FOR SIMPLE VISUAL TASKS
It was explained in Section 4.9 that practice leads to improvement in a variety of simple visual tasks, such as orientation discrimination, contrast detection, Snellen acuity, vernier acuity, and the ability to see a shape defined by relative motion. For some individuals, vernier acuity and resolution acuity improve with practice, and this effect shows complete interocular transfer (Beard et al. 1995). Improvement in the ability to discriminate briefly flashed gratings is specific to the spatial frequency and orientation of the grating but shows complete interocular transfer (Fiorentini and Berardi 1981). Improvement in the ability to discriminate between directions of motion of dots is specific to a narrow range of directions of motion and to position but shows almost complete interocular transfer (Ball and Sekuler 1987).
Improvement in identifying the orientation of oblique lines within a circular disk is orientation and position specific but shows complete or almost complete interocular transfer (Schoups et al. 1995). Similarly, improvement in detecting a line that differs in orientation from a set of surrounding lines is specific to the position, size, and orientation of the stimuli but shows interocular transfer (Ahissar and Hochstein 1995, 1996; Schoups and Orban 1996). These two latter results are not surprising because monocular cells show only weak orientation tuning (Blasdel and Fitzpatrick 1984). The only report of lack of interocular transfer of improvement in a simple visual task is one in which improvement of vernier acuity with practice was specific to the trained eye (Fahle 1994). Interocular transfer of perceptual learning is to be expected because eye-specific learning would imply that learning involves only monocular cells. Learning is more likely to involve binocular cells since there are more binocular cells than monocular cells in the visual cortex. Furthermore, visual learning is improved by attention to the task, which implies the activity of centers beyond the primary visual cortex (Fahle 2004). Nevertheless, learning of simple tasks has generally been found to be specific to the shape, size, and position of the visual stimulus, which suggests that an early stage of visual processing is involved along with higher-order attention processes.
13.4.2 TRANSFER OF PATTERN DISCRIMINATION
The trained ability to select one of two shapes transfers from one eye to the other in a variety of submammalian species, including the octopus (Muntz 1961), goldfish (Sperry and Clark 1949), and pigeon (Catania 1965). Transfer may be less than 100% for incompletely learned tasks or for more difficult tasks. Since the visual pathways in these species decussate fully, interocular transfer must depend on interactions between the hemispheres, conveyed through the corpus callosum. In animals with hemidecussation of visual pathways, the contralateral pathways from the nasal hemiretinas may be severed by section of the chiasm, leaving only the ipsilateral pathways from the temporal hemiretinas. Each hemisphere then receives inputs from only one eye. Cats with section of the chiasm are still capable of interocular transfer of pattern discrimination (Myers 1955). The same is true of monkeys with chiasm section (Lehman and Spencer 1973). Interocular transfer of shape discrimination in the monkey is abolished after section of the anterior commissure or splenium but not after section of other parts of the interhemispheric commissures (Black and Myers 1964). Pigeons trained to peck one of two different shapes seen only with one eye continued to select the correct shape when tested with the other eye. However, if the two shapes
were right-left mirror images, the animals selected the incorrect mirror-image shape when tested with the untrained eye. When trained with up-down mirror-image shapes the animals selected the correct shape when tested with the untrained eye (Mello 1966). Similar effects have been found in goldfish (Campbell 1971). In both fish and pigeon, inputs from each eye project fully to the contralateral hemisphere. Interocular transfer must therefore be mediated by the corpus callosum. In a monkey with section of the chiasm, the temporal half of each retina projects to the ipsilateral hemisphere and the nasal inputs are almost entirely lost. As in the normal pigeon, interocular transfer must be mediated by the corpus callosum. Noble (1966, 1968) found that monkeys with section of the chiasm when trained with one eye to select one of two mirror-image shapes, such as > versus
Figure 14.28. Curvature and skew of the horopter. (a) Various curvatures of the horopter, indicated by values of H (labeled curves: H = a/d; a/d > H > 0; H = 0, the Vieth-Müller circle; 0 > H > –a/d), drawn for left and right eyes with interocular separation a and distance d. (b) An example of a skewed horopter (labels: fixation point, θL, θR). (Adapted from Ogle 1964)
the horopter produced by a difference in the spacing of retinal points is illustrated in Figure 14.21. When R0 = 1, the apparent frontal plane is symmetrical about the median plane of the head. Let R denote the ratio of the tangents of the angles φL and φR for a given pair of points on the horopter.

$$R = \frac{\tan \phi_R}{\tan \phi_L} \qquad (8)$$
If tan φR is plotted on the x-axis and the tangent ratio (R) of each point along the empirical horopter is plotted on the y-axis, then H is the slope of the resulting linear function and R0 is its intercept on the y-axis, as illustrated in Figure 14.29. When R = 1 and H = 0, the apparent frontal plane is circular and lies on the Vieth-Müller circle for all viewing distances. For other values of H, it follows from equation (2) in Section 14.2.3 that, as viewing distance increases, a given angular difference between a pair of corresponding points signifies a greater deviation of the apparent frontal plane from the Vieth-Müller circle. For positive values of H, the apparent frontal plane is an ellipse with its long axis
BINOCULAR CORRESPONDENCE AND THE HOROPTER
Figure 14.29. Plot of horopter data. For data in Figure 14.27, the tangent of the subtense of the right-eye image (tan ϕr) is on the x-axis and the tangent ratio of the subtense of the two images (R) for each point on the empirical horopter is on the y-axis. H is the slope of the linear function, and R0 is its y-axis intercept. Separate functions are plotted for viewing distances of 20 cm, 40 cm, 76 cm, and 6 m, across the left and right fields. (Redrawn from Ogle 1964)
extending laterally beyond the Vieth-Müller circle at near viewing distances. As viewing distance increases, the frontal plane becomes flat and then convex to the viewer, as illustrated in Figure 14.28a. Thus, for positive values of H, there is a viewing distance, d, where the apparent frontal plane lies in the frontal plane. This is the abathic distance. This occurs when d = a/H, where a is the interpupillary distance and d is the distance of the fixation point from the interpupillary line. For negative values of H, the horopter, as determined by the apparent frontal plane, is an ellipse compressed laterally with respect to the Vieth-Müller circle. There are three questions about the constancy of H: its constancy at different eccentricities of the stimulus, its constancy at different viewing distances, and its constancy in asymmetrical convergence. Ogle (1964) assumed that H was constant at different eccentricities, and derived a single average value of H for each viewing distance. This average value was used to obtain linear plots of equation (6). Shipley and Rawlings (1970a) objected to this averaging procedure on the grounds that it hides potentially interesting fluctuations in the pattern of retinal correspondence, evident in data from their own experiments (Rawlings and Shipley 1969).
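The way H and R0 are read off a plot of the tangent ratio, and the abathic distance d = a/H that follows, can be sketched numerically. The code below is a minimal illustration assuming an ordinary least-squares line fit; the function names and the sample values (H = 0.1, R0 = 1, a = 6.5 cm) are hypothetical and are not data from Ogle (1964).

```python
def fit_horopter(tan_phi_r, ratios):
    """Least-squares fit of R = H * tan(phi_R) + R0.

    Returns (H, R0): the slope and the y-axis intercept of the line.
    """
    n = len(tan_phi_r)
    if n != len(ratios) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(tan_phi_r) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in tan_phi_r)
    if sxx == 0:
        raise ValueError("tan(phi_R) values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(tan_phi_r, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def abathic_distance(interpupillary_distance, h):
    """Viewing distance d = a / H at which the apparent frontal plane is flat.

    Defined only for positive H; the result has the same units as a.
    """
    if h <= 0:
        raise ValueError("abathic distance is defined only for H > 0")
    return interpupillary_distance / h


if __name__ == "__main__":
    # Synthetic, noise-free points generated from H = 0.1 and R0 = 1.0.
    xs = [-0.2, -0.1, 0.0, 0.1, 0.2]
    ys = [1.0 + 0.1 * x for x in xs]
    h, r0 = fit_horopter(xs, ys)
    print(f"H = {h:.3f}, R0 = {r0:.3f}")
    print(f"abathic distance = {abathic_distance(6.5, h):.1f} cm")
```

With real horopter settings the points scatter about the line, which is the averaging that Shipley and Rawlings objected to; a single fitted slope necessarily hides any systematic variation of H with eccentricity.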
Ogle (1964) showed that data from several investigators reveal that H is not constant as viewing distance changes. He estimated a change in correspondence of about 12 arcmin for a stimulus at a retinal eccentricity of 12° (see also Reading 1983). This suggests that there is some instability in the pattern of retinal correspondence. However, changes in H could be due to differential changes in the optics of the eyes resulting from changes in accommodation that accompany changes in vergence, or to changes in the positions of the nodal points with respect to the centers of rotation of the eyes. The instability of H with viewing distance, revealed by the criterion of setting points in a frontal plane, may be due to a change in this criterion with viewing distance. These points are discussed in more detail below. Flom and Eskridge (1968) formed afterimages of a vertical line in the upper field of one eye and of a vertically aligned line in the lower field of the other eye. The afterimages thus formed a vernier nonius display. Since afterimages do not move with the eyes, any departure from alignment should represent a change in binocular correspondence. When vergence changed from 10 cm to 6 m, afterimages at a retinal eccentricity of 12° became misaligned by about 3 arcmin. This was about twice the discriminable misalignment. At an eccentricity of 24° the misalignment was about 9 arcmin. They did not use centrally placed afterimages. A related question is whether binocular correspondence in the central visual field shifts after the eyes have been held in an extreme vergence position for some time. Wick (1990) projected nonius afterimages on a random-dot stereogram with a central area of 20 arcmin of disparity. Three of their six subjects noticed a change in afterimage alignment when the eyes were held in a maximally diverged position for several minutes. All subjects reported a change in the alignment of Haidinger brushes in the two eyes under the same circumstances. 
Duwaer (1982, 1983) did not observe any misalignment of afterimages under similar circumstances, although it is not clear how long vergence was maintained. Hillis and Banks (2001) pointed out that the afterimages used by Flom and Eskridge and by Wick may not have been aligned. The perceived magnitude of misalignment may have changed with vergence because of vergence micropsia. Hillis and Banks also pointed out that the settings of nonius lines might have been influenced by the textured display on which they were superimposed. This artifact in setting nonius lines was discussed in Section 10.2.4b. They repeated the experiment and found no change in binocular correspondence when the contaminating effects were eliminated. Other evidence of temporary changes in binocular correspondence was discussed in Section 10.2.4c. With regard to the constancy of H with asymmetrical convergence, Ames et al. (1932b) reported that the shape of the apparent frontal plane changed when the eyes were fixated on an eccentric target, although it changed in different ways for the two subjects. Herzau and Ogle (1937) and Lehnert (1941) investigated this question over a wide range
of angles of eccentric gaze, using the apparent frontal plane and nonius lines. In both studies, the apparent frontal plane changed with changing angle of gaze but the nonius horopter remained constant. Flom and Eskridge (1968), also, used a nonius method and found that binocular correspondence is stable to within 6 arcmin for eccentric angles of gaze up to an eccentricity of 12°. Lehnert (1941), Linksz (1952), and Shipley and Rawlings (1970a, 1970b) concluded that only the nonius method reveals the shape of the horopter. The nonius method equates visual directions in the two eyes, which is the defining characteristic of the horopter. The apparent frontal plane is derived from setting points in a common plane, which may have little to do with whether or not those points have zero disparity. The question of the perception of slant as a function of eccentricity is discussed in greater detail in Section 20.2.2.
14.6.2b Optical Factors in Hering-Hillebrand Deviation
The second explanation of the Hering-Hillebrand deviation is that it is due to optical factors. This was mentioned by Hillebrand (1893). The retinal image is subject to barrel distortion because the entrance pupil of the eye is not aligned with the anterior nodal point. This means that small objects are magnified more than large objects. If the barrel distortion is symmetrical about the optic axis it will be asymmetrical about the visual axis (fovea) because the visual axis is displaced several degrees in the temporal direction from the optic axis (angle alpha). Thus, the nasal half image in one eye is optically compressed relative to the corresponding temporal half image in the other eye. This has the same effect on the horopter as a relative compression of corresponding points in the nasal hemiretinas. For the central part of the horizontal horopter, Halldèn (1956) related Ogle’s coefficient, H, the radius of the retina, r, the distance of the nodal point from the fovea, n, and the angle between the visual and optic axes, α, by the following function:

$$H = 2\,\frac{n - r}{r}\,\sin\alpha \qquad (9)$$
Stimulus color also affects the position and form of the horizontal horopter (Bourdy 1978), a fact related to chromostereopsis discussed in Section 17.8.
14.6.2c Hering-Hillebrand Deviation and Vertical Disparities
The third explanation of the Hering-Hillebrand deviation is that the vertical rods used by investigators such as Ames et al. (1932b) to determine the apparent frontal plane contained no vertical disparities. Helmholtz (1909, vol. 3, p. 318) showed that apparent and actual frontal planes closely match at any distance for stimuli containing vertical disparities. Evidence that vertical disparities are involved in the perception of frontal planes is reviewed in Section 20.6.4.
14.7 EMPIRICAL VERTICAL HOROPTER
The geometrical vertical horopter is a single vertical line intersecting the horizontal horopter in the median plane, as shown in Figure 14.15. However, we will now show that the empirical vertical horopter is inclined top away in the median plane because corresponding vertical meridians in the two eyes are rotated with respect to corresponding horizontal meridians. The nonius method is the best procedure for determining the empirical vertical horopter. In one form of the method, a vertical nonius line is presented above a fixation point to one eye and a second line is presented below the fixation point to the other eye. Circles round each nonius line hold horizontal and vertical vergence steady. These are known as Volkmann disks (Figure 14.30). One of the lines is stationary, and the other is rotated in a frontal plane about the point of fixation until the subject reports that the lines appear aligned. It is assumed that dichoptic lines that appear aligned fall on corresponding vertical meridians.
Figure 14.30. Volkmann disks. Angular misalignment of the two nonius lines in the binocular image provides a measure of the declination of corresponding vertical meridians. In most people the corresponding meridians are relatively extorted about 2°. (Diagram labels: nonius lines, vergence lock, left-eye stimulus, right-eye stimulus.)
Helmholtz (1909 vol. 3, p. 408) first set horizontal nonius lines collinear to determine the alignment of the corresponding horizontal meridians. He then set vertical nonius lines collinear to determine the alignment of the corresponding vertical meridians. With the eyes symmetrically converged, the corresponding horizontal meridians were almost parallel in his subject, but the corresponding vertical meridians were extorted by about 2° relative to each other. Nakayama (1977) confirmed the opposite tilt of corresponding vertical meridians by measuring the horizontal disparity between two dichoptically presented points of light as a function of their distance above and below the fixation point. The lights were presented up to vertical eccentricities of 30° at a viewing distance of 2 m. They were flashed onto the two eyes in alternation and their lateral
separation was adjusted until there was no apparent motion between them. At this point, it was assumed that their images fell on corresponding points. Ledgeway and Rogers (1999) used the same procedure with points of light 21° above and below a central fixation point at various horizontal eccentricities up to ±16° and viewing distances of between 28 cm and infinity. They found that corresponding vertical meridians were extorted about 2° at all eccentricities and viewing distances. A similar procedure revealed that corresponding horizontal meridians were aligned over the same range of vertical eccentricities and viewing distances. Grove et al. (2001) used the same procedure and reported a backward inclination of the vertical horopter of about 30° at a distance of 65 cm, corresponding to an extorsion of corresponding vertical meridians of 1.7°. The inclination of the vertical horopter may explain why people prefer to read with a book inclined backward to the line of sight. Schreiber et al. (2008) used the same procedure with dichoptic horizontal or vertical line segments at eccentricities of up to 8° in different radial directions and at different distances. After plotting corresponding points they determined the horizontal, vertical, and torsional vergence positions of the eyes and the distribution of points in space that minimize the root mean square of disparities over the tested area of the visual field. They confirmed that the tilt of corresponding vertical meridians is the same for all horizontal azimuths and that the Hering-Hillebrand deviation occurs at all elevations. The optimal surface was a surface lying close to the ground and seen at a distance that varied from subject to subject. All the above evidence shows that when corresponding horizontal meridians are aligned, corresponding vertical meridians are tilted about 1° on either side of the true vertical, with the top of each meridian tilted to the temporal side. 
The angle between the corresponding meridians is known as the angle of declination. In other words, the mapping of corresponding points contains a horizontal shear distortion. The distortion cannot be due to eye torsion since this would affect horizontal meridians in the same way. For images of a line in the median plane of the head to fall on the extorted vertical meridians, the line must be inclined away from the observer, as shown in Figure 14.31.

Figure 14.31. The empirical vertical horopter. The empirical vertical horopter for near convergence is a straight line inclined top away within the median plane of the head. The dotted lines on the retinas indicate the corresponding vertical meridians. The solid lines indicate the true vertical meridians.

Figure 14.32. Inclination of the vertical horopter. Relationship between the inclination, $i$, of the vertical horopter and angle of declination, $\theta$, of corresponding vertical meridians. Fixation is on F at distance $d$. The vertical horopter extends from F to A at distance $x$ below the eyes, where planes containing corresponding vertical meridians intersect. Interocular distance is $a$ and $\phi$ is the angle of convergence.

In Figure 14.32, $\theta$ is the angle of declination between the corresponding vertical meridians and $i$ is the angle of inclination of the vertical horopter relative to the vertical. If $a$ is the interocular distance, $d$ the viewing distance, and $x$ is the distance below the eyes where planes containing corresponding vertical meridians intersect, then:

$$\tan\frac{\theta}{2} = \frac{a}{2x} \quad\text{and}\quad \tan i = \frac{d}{x}$$

It follows that

$$\tan\frac{\theta}{2} = \frac{a \tan i}{2d} \tag{10}$$

Since the tangent of half the angle of convergence, $\phi/2$, equals $a/2d$, equation (10) can be written

$$\tan\frac{\theta}{2} = \tan\frac{\phi}{2}\,\tan i$$

For small values of $\theta$, with $\theta$ in radians,

$$\theta = \frac{a}{d}\tan i \quad\text{or}\quad i = \arctan\frac{\theta d}{a} \tag{11}$$
Therefore, geometrically, the vertical horopter for near viewing is nearly vertical and, as viewing distance increases, it becomes increasingly inclined until, for far viewing, it is parallel with the ground. The increased outward inclination of the vertical horopter with increasing viewing distance was confirmed empirically by Amigo (1974). The horopter inclination increased from 3 to 12° as viewing distance increased from 50 cm to 2 m (Siderov et al. 1999). This change was not accompanied by any change in the declination of the corresponding vertical meridians, as measured by a minimum stereoacuity criterion or by a nonius procedure similar to that used by Nakayama. Nor was it accompanied by any change in the state of cyclovergence. In the study by Siderov et al. the inclinations of the vertical horopter for two subjects were less than those predicted by Helmholtz. They were also less than those determined empirically on one subject by Nakayama. Individual differences may account for the different results.
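The dependence of horopter inclination on viewing distance can be checked numerically. The sketch below evaluates the relation between the tangent of half the declination, the interocular distance, the viewing distance, and the tangent of the inclination given in equation (10). The function names and the values used (a declination of 2° and an interocular distance of 6.5 cm) are illustrative assumptions, not data from any of the studies cited.

```python
import math

def horopter_inclination_deg(declination_deg, interocular_cm, distance_cm):
    """Inclination of the vertical horopter from the vertical, in degrees.

    Rearranges tan(declination / 2) = (a / 2d) * tan(inclination).
    """
    half_declination = math.radians(declination_deg) / 2.0
    tan_i = (2.0 * distance_cm / interocular_cm) * math.tan(half_declination)
    return math.degrees(math.atan(tan_i))

def declination_from_inclination_deg(inclination_deg, interocular_cm, distance_cm):
    """Inverse relation: declination implied by a measured inclination."""
    tan_half = (interocular_cm / (2.0 * distance_cm)) * math.tan(
        math.radians(inclination_deg))
    return math.degrees(2.0 * math.atan(tan_half))

if __name__ == "__main__":
    # Assumed values for illustration: 2 deg declination, 6.5 cm interocular distance.
    for d in (50.0, 100.0, 200.0):
        print(d, round(horopter_inclination_deg(2.0, 6.5, d), 1))
```

With a fixed declination, the computed inclination grows with viewing distance, which is the geometrical prediction described above.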
The shear of corresponding vertical meridians relative to corresponding horizontal meridians is probably a consequence of the fact that most horizontal surfaces are below eye level and are inclined top away, like ground surfaces. The pattern of correspondence may be shaped in early development by alignment of the vertical horopter with the ground plane (Krekling and Blika 1983b). This topic was discussed in Section 7.5. Cooper and Pettigrew (1979) showed electrophysiologically that the corresponding vertical meridians of owls and cats are rotated by an amount that places the vertical horopter along the ground for the eye height and viewing distance of these animals. The tuning of cortical cells to rotational misalignment of the images in the two eyes was modified in kittens exposed for some time after birth to prisms that rotated the images in opposite directions. Tyler (1991a) reviewed the literature on the vertical horopter and the effects of cyclovergence.
15 LINKING BINOCULAR IMAGES

15.1 Introduction
15.2 Correlating binocular images
15.2.1 Interocular correlation
15.2.2 Detection of interocular correlation
15.3 Local matching rules
15.3.1 Unique-linkage rule
15.3.2 Nearest-neighbor images
15.3.3 Image adjacency in slanted surfaces
15.3.4 Image order
15.3.5 Similarity of orientation
15.3.6 Similarity of spatial frequency
15.3.7 Relative image contrast
15.3.8 Matching color
15.3.9 Relative motion
15.3.10 Epipolar images
15.3.11 Effects of texture inhomogeneity
15.3.12 Ecological factors
15.3.13 Ambiguity in linking oblique lines
15.4 Global matching rules
15.4.1 Minimizing unpaired images
15.4.2 Coarse-to-fine spatial scales
15.4.3 Coarse-to-fine disparities
15.4.4 Edge continuity
15.4.5 Surface smoothness
15.4.6 Image linkage determined by convergence

15.1 INTRODUCTION
Corresponding images are binocular images that originate from the same location in space. Ideally, corresponding retinal points are points that have the same location in the two retinas and that project to the same location in the visual cortex. Corresponding images fall on corresponding points only when the object is on the horopter. Since there can be only one object in each spatial location, corresponding images necessarily have similar size, shape, orientation, and color. Images with similar features will be called matching images. Matching images that fall on or near corresponding points create the impression of one object; they are fused images. Images falling on corresponding points are matching and corresponding images only when they arise from the same object. The images of two identical objects outside the horopter may fall on corresponding retinal points. Such noncorresponding matching images produce false linkages in the disparity-detection system. Images of distinct objects with different shapes that lie outside the horopter may fall on corresponding retinal points. These images show binocular rivalry; they are rivalrous images. Rivalry is the manifestation of the refusal of the visual system to link grossly dissimilar images.

The stereoscopic system has the task of finding corresponding images in the two eyes and rejecting noncorresponding images. Images that the visual system treats as corresponding images will be referred to as linked images. Two images that are treated as arising from distinct objects are unlinked images. Images are correctly linked only when they are corresponding images. Matching images that fall on corresponding retinal points or that fall within the range of the binocular fusion mechanism are necessarily linked, whether or not they are corresponding images. Thus images that are fused are linked but images that are linked are not necessarily fused. When a stimulus in one eye is linked to a blank area in the other eye the image of the stimulus is an unpaired image. Unpaired images can arise because the correct corresponding image has not been found or because an object is visible to only one eye. An object is visible to only one eye when it is in the monocular visual field, in the blind spot of one eye, or near the edge of a vertical step in depth. The definitions of points and images are summarized below.

Corresponding images: Images arising from the same location in space.
Corresponding points: Points in the same location in the two retinas.
Matching images: Images that are similar in shape, size, orientation, and color.
Fused images: Matching images falling on or near corresponding points and appearing as one.
Diplopic images: Corresponding images that are not fused.
Rivalrous images: Nonmatching images falling on corresponding points.
Linked images: Images that are treated as corresponding images.
Unlinked images: Images that are treated as arising from different objects.
Unpaired image: An image of an object visible to only one eye or on a blank area in the other eye.

For a set of distinct objects on the horopter the visual system has no difficulty linking corresponding images. They are simply matching images that fall on corresponding points. But how does the visual system find (link) corresponding images of objects that do not lie on the horopter? It clearly does so in order to trigger vergence eye movements and code disparity. This is known as the binocular correspondence problem.

The correspondence problem is essentially unsolvable for a long display of equally spaced identical elements distributed along the horizontal horopter. There is no unique linkage of images, because a match can be found for all dots at any multiple of the interdot separation. There is no way of knowing which of these linkages is the correct one. The Keplerian projection defines all ways of linking pairs of images in the two eyes for horizontal arrays of point-like objects for a given state of convergence (Section 14.2.2). A regular array of identical objects along the horopter produces matching images whatever the convergence of the eyes. This is the basis of the wallpaper illusion shown in Figure 14.3. For a short horizontal row of elements incorrect linkages leave unpaired images at each end. The apparent depth of a region with a repetitive texture may be primed by unambiguous disparities present at the edges of the region (Section 22.2.3).

The correspondence problem is severe for a display of identical dots randomly distributed in 3-D space over the whole binocular field.
We must look for matching dot patterns at different spatial scales within a particular range of disparities within a particular region. There is no single transformation of one image that can bring all pairs of images in 3-D space onto corresponding retinal points. The problem is eased when identical texture elements are distributed over a single surface. The correspondence problem can be reduced to a minimum by transforming the image in one eye relative to that in the other eye to bring the highest number of images into binocular congruence. The type of transformation required depends on how the surface is oriented. For a set of objects lying on the horopter, the images can be completely linked by translating or rotating one image relative to the other. Once the appropriate transformation
has been detected it can be used to initiate horizontal vergence, vertical vergence, or cyclovergence to bring all the images onto corresponding points (Section 10.7). Vergence can bring all images simultaneously onto corresponding points only for objects that lie on a horopter. A large frontal surface contains a complex pattern of horizontal and vertical compression and shear disparities. The images of a slanted surface are relatively compressed and those of an inclined surface are relatively sheared. Vergence can minimize the mean disparity but no type of vergence can eliminate relative image compression or shear. Images with these types of disparity must be linked by neural computation of the required transformation. The transformation defines the pattern of disparity. The stimuli discussed so far were arrays of identical elements. Such stimuli may contain unpaired images, but there are no rivalrous images. Natural scenes contain objects that differ in size, shape, orientation, and color. Therefore, most incorrectly linked images do not match and, if they fall on corresponding retinal points, they rival. This greatly simplifies the task of correctly linking images in the two eyes. In viewing a natural scene we rarely link the wrong images, because such images rival. This chapter reviews the processes that allow us to solve the binocular correspondence problem.
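The ambiguity of linking a regular row of identical elements, described earlier in this section, can be illustrated with a short sketch. It counts how many elements are paired and how many are left unpaired when one image of a row is displaced relative to the other. The function name, the row of ten elements, and the spacing of 4 units are hypothetical choices made for illustration.

```python
def count_linkages(left, right, shift):
    """Pair left-eye elements with right-eye elements displaced by `shift`.

    Returns (paired, unpaired). Unpaired counts elements in either image
    that land on a blank position in the other image.
    """
    left_set = set(left)
    shifted = {x + shift for x in right}
    paired = len(left_set & shifted)
    unpaired = (len(left_set) - paired) + (len(shifted) - paired)
    return paired, unpaired

# A short row of ten identical elements spaced 4 units apart (assumed units).
regular_row = list(range(0, 40, 4))

if __name__ == "__main__":
    for shift in (0, 2, 4, 8):
        print(shift, count_linkages(regular_row, regular_row, shift))
```

Displacements equal to a multiple of the element spacing pair every element in the overlapping region and leave unpaired elements only at the ends of the row, so for a long row the alternative linkages are nearly indistinguishable.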
15.2 CORRELATING BINOCULAR IMAGES
15.2.1 INTEROCULAR CORRELATION
15.2.1a Cross-Correlation
This section deals with how well people discriminate different degrees of correlation between random-element dichoptic displays. Mathematically, correlating two superimposed random-element displays involves assigning to each element in each display a signed normalized value of luminance or some derivative of luminance, multiplying the values of each pair of points, and averaging the products. In the simplest case, each array consists of closely packed squares. Half the squares are white and are assigned a value of 1. The others are black and are assigned a value of –1. The black and white squares are randomly distributed. When all corresponding pairs of squares have the same luminance, the correlation coefficient is 1. When 50% of pairs of squares have the same luminance and 50% have the opposite luminance, the correlation coefficient is 0. A coefficient of –1 means that all pairs of squares have opposite luminance; wherever one square is dark the other is light. The same measure can be applied to well-spaced black and white dots randomly distributed on a grey background.

The above measure works only if the display elements have the same distribution in the two displays and differ only in the sign of luminance contrast. A different measure of correlation is required for displays consisting of well-spaced randomly distributed elements that are all the same contrast but not identically distributed. Elements falling on corresponding points are assigned a value of 1. Elements in one display that fall in spaces between the elements in the other display are assigned a value of –1. Such dots are unpaired because the dot in one image falls in a blank area in the other image. The correlation could be defined as the mean of these values. No simple measure of image correlation embodies both the criterion of matching luminance polarity and that based on paired and unpaired points.

A correlation coefficient based on congruency of pairs of image points may be derived with the images in one relative position. The process can be repeated as one image is transformed relative to the other image. The set of correlation coefficients plotted as a function of the transformation defines a cross-correlation function over that transformation. For extracting patterns of disparity on surfaces, the most significant relative image transformations are translation, shear, compression, and rotation. The cross-correlation function defined over a given relative transformation of two images resembles a 3-D landscape, with local maxima occurring as depressions: the higher the correlation the deeper the depression (see Sperling 1970). Once a depression is found, by either vergence eye movements or a neural process, the visual system locks on to that value, because any departure from it increases mean disparity. Each depression can be thought of as a position of least energy representing a stable state in the image-linkage process. Similar images that fall on corresponding points in the two retinas serve to lock vergence. Once vergence is locked, matches found between images of objects outside the plane of vergence serve for judging relative depth.
When the images are randomly positioned dots, as in a random-dot stereogram, each depth plane has a unique best solution because only one relative position of the images gives the highest correlation between them. However, the best match may be hard to find because there could be several lesser maxima in the correlation function defined over a given transformation. In other words, a spurious match between a sizable subset of dots may be found, which prevents the system from finding a better match for the whole set. Analogous problems arise in a variety of contexts, such as predicting the way atoms anneal into crystals as they cool and the problem of planning the most efficient route for a salesman traveling between a series of towns (Kirkpatrick et al. 1983; Barnard 1987).
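The correlation measure for black and white elements, and a cross-correlation function over horizontal translation, can be sketched as follows. This is a minimal one-dimensional illustration, not a model of any specific experiment. The function names, the row length of 400 elements, and the use of a circular shift to displace one image are assumptions made to keep the example short.

```python
import random

def interocular_correlation(left, right):
    """Mean product of paired elements coded +1 (white) or -1 (black)."""
    return sum(l * r for l, r in zip(left, right)) / len(left)

def flip_contrast(row, n_flipped, rng):
    """Copy of `row` with exactly `n_flipped` elements reversed in polarity."""
    out = list(row)
    for i in rng.sample(range(len(row)), n_flipped):
        out[i] = -out[i]
    return out

def cross_correlation(left, right, max_shift):
    """Correlation at each relative translation (circular shift, an assumption)."""
    n = len(left)
    return {s: sum(left[i] * right[(i + s) % n] for i in range(n)) / n
            for s in range(-max_shift, max_shift + 1)}

if __name__ == "__main__":
    rng = random.Random(0)
    left = [rng.choice((-1, 1)) for _ in range(400)]
    # Half the pairs opposite in polarity gives a coefficient of zero.
    print(interocular_correlation(left, flip_contrast(left, 200, rng)))
    # Right image is the left image displaced by 3 elements.
    right = [left[(i - 3) % 400] for i in range(400)]
    ccf = cross_correlation(left, right, 8)
    print(max(ccf, key=ccf.get))
```

In this sketch the best match appears as a peak of the function at the imposed displacement. The text describes the same extremum as a depression in an energy-like landscape.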
15.2.1b Hill-Climbing Procedures
Ambiguities of image matching may be resolved by an iterative hill-climbing procedure. Starting with the two images in a particular relative configuration, random changes are
applied until a configuration is found that improves the cross-correlation. The process is then repeated until no further improvement can be found. This stochastic algorithm does not guarantee the best solution because it may become trapped in lesser maxima in the correlation function. One solution is to repeat the whole process several times from different starting positions and select the best outcome. Metropolis (1953) developed an algorithm based on the way crystals anneal when cooled. The elements of the system (local image linkages) are initially subjected to large random fluctuations of amplitude T (disparity). At each location, match x is accepted if it increases the correlation. If the match significantly decreases the correlation, it is rejected. The random fluctuations ensure that the system is periodically jolted out of small maxima in the correlation function. The process is repeated for decreasing values of T until the system converges on a stable solution. In the binocular correspondence problem, random fluctuations of image matches occur because of positional jitter of the two images caused by fluctuations in vergence. There could also be random fluctuations in neural processes that link images in a given neighborhood. Both processes could shake the system out of a shallow depression in the correlation landscape into a more stable, deeper depression. Once the best match between binocular images has been found, it is like being in a deep hollow, and the system is unlikely to be shaken out of it. The spontaneous alternations of perspective in the perception of the Necker cube, the alternations in ambiguous figures, and those in ambiguous motion probably arise from a process akin to noisy jitter (Section 26.7). A given interpretation of a multistable stimulus also seems to be subject to spontaneous weakening over time—a process sometimes referred to as adaptation or satiation. See Kruse et al. (1996) for a model of multistability. 
These processes may partly explain why it sometimes takes a long time to see depth in a random-dot stereogram lacking easily identified matching features (Section 18.14.2). With natural scenes, the visual system can be more intelligent and rely on the recognition of similar diplopic images rather than on a random search over the correlation landscape. Computer algorithms and neural-network models of this type of process have been developed (Hopfield 1982; Kienker et al. 1986).
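The difference between simple hill climbing and an annealing procedure can be sketched on a toy correlation landscape. The landscape values below are hypothetical, and the acceptance rule (accept a worse match with probability exp(change / T)) is the standard Metropolis rule, assumed here because the description above does not spell it out. The sketch reports the best state visited rather than the final state.

```python
import math
import random

def hill_climb(score, start):
    """Greedy ascent: move to the better neighbor until none improves."""
    s = start
    while True:
        neighbors = [n for n in (s - 1, s + 1) if 0 <= n < len(score)]
        best = max(neighbors, key=lambda n: score[n])
        if score[best] <= score[s]:
            return s
        s = best

def anneal(score, start, temperatures, steps_per_temperature, rng):
    """Metropolis-style search with a decreasing temperature schedule."""
    s = start
    best = s
    for t in temperatures:
        for _ in range(steps_per_temperature):
            candidate = s + rng.choice((-1, 1))
            if not 0 <= candidate < len(score):
                continue
            delta = score[candidate] - score[s]
            # Improvements are always accepted; losses only occasionally.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                s = candidate
                if score[s] > score[best]:
                    best = s
    return best

# Hypothetical correlation at eight candidate disparities: a lesser
# maximum at index 1 and the best match at index 6.
landscape = [0.10, 0.30, 0.20, 0.10, 0.20, 0.50, 0.90, 0.40]

if __name__ == "__main__":
    print(hill_climb(landscape, 1))
    print(anneal(landscape, 1, (1.0, 0.3, 0.1, 0.03), 2000, random.Random(0)))
```

Started at the lesser maximum, the greedy procedure stays there, whereas the random fluctuations at high temperature allow the annealing search to reach the deeper solution.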
15.2.1c Subdividing the Matching Problem
Correlation may be performed on local areas rather than on the binocular field as a whole. For example, when objects occur in multiple overlapping planes, as when we look into the branches of a tree, there are distinct peaks in the cross-correlation function in each direction of the visual field. At any instant, we pay attention to one location and are content if the images in that location are correctly linked.
Let a point at lateral position $X_L$ in the left retina contain a pair of receptive fields with Gabor-function profiles, one in cosine phase (even) and the other in sine phase (odd), $S_{LE}$ and $S_{LO}$. Let a point at lateral position $X_R$ in the right retina contain a similar pair of receptive fields, $S_{RE}$ and $S_{RO}$. Assume that the four receptive fields feed into binocular simple cells and then into a complex binocular cell after the inputs have been summed and squared. The horizontal offset of the receptive fields in the left and right eyes, $X_L - X_R$, defines the disparity to which the complex cell is maximally tuned. The response of the complex cell is:

$$\text{Response} = (S_{LE} + S_{RE})^2 + (S_{LO} + S_{RO})^2$$

or

$$(S_{LE})^2 + (S_{LO})^2 + (S_{RE})^2 + (S_{RO})^2 + 2S_{LE}S_{RE} + 2S_{LO}S_{RO} \tag{1}$$

The first four terms represent the response to the energy in each monocular receptive field. The last two terms express the local cross-correlation (binocular energy) between the images over the left and right receptive fields. A high correlation indicates that the stimulus disparity matches the peak of the disparity tuning function of the cell. A local set of binocular complex cells with different preferred disparities sample the binocular energy over the dimension of disparity to generate a local binocular energy function, as depicted in Figure 15.1. The peak of the function usually indicates the disparity of the stimuli and the best solution to the correspondence problem. Some of the spurious peaks in the disparity energy function are attenuated when large disparities are given less weight relative to small disparities (Prince and Eagle 2000a). This is equivalent to the nearest-neighbor rule, which stipulates that the visual system links images with the least binocular disparity.

Although local correlation is more subject to the effects of noise than is a global process, it can detect local variations that a global process might miss. These conflicting requirements can be resolved if local correlations are performed independently within distinct stimulus domains. The domains must be independent so that the solution in one does not affect that in the others. For example, one can find a best fit between binocular images at a coarse scale of spatial frequency or a coarse scale of disparity before seeking a fit at a fine scale (Section 15.4.2). Also, image linking could be done independently as a function of orientation, motion, color, or other stimulus features (Jones and Malik 1992; Fleet et al. 1996a). It also helps if the linkage process is related to changes in vergence (Section 15.4.6).

Mousavi and Schalkoff (1994) built three interacting neural networks. The first locates luminance-defined boundaries in each image. The second multilayered network extracts the features for linking the two images. The third network solves the correspondence problem by imposing a set of constraints on possible linkages. The output of the third network feeds back to the second network, and the iterative process continues until the output of the third network converges to a stable state.

Figure 15.1. Local disparity energy function. The monocular images of a random-dot stereogram are sampled by receptive fields of complex binocular cells in that location. Receptive fields of three cells are represented by circles. One cell prefers crossed disparity, one zero disparity, and one uncrossed disparity. The stereogram has an uncrossed disparity relative to corresponding meridians (vertical lines). The binocular energy of each cell depends on the correlation between the images in its receptive fields. (Axes: local binocular energy against disparity of receptive fields, from crossed through 0 to uncrossed. Redrawn from Prince and Eagle 2000a.)
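The response of a binocular complex cell, and the local binocular energy function formed by a set of such cells, can be sketched in one dimension. The sketch below is an illustration under stated assumptions, not the model of any cited study: the Gabor parameters, the image length, the use of random black and white elements, and all function names are hypothetical. Responses are averaged over many random patterns because the peak for a single pattern only usually, not always, falls at the stimulus disparity.

```python
import math
import random

def gabor_pair(sigma, freq, half_width):
    """Even (cosine) and odd (sine) Gabor profiles at integer positions."""
    xs = range(-half_width, half_width + 1)
    env = [math.exp(-x * x / (2.0 * sigma * sigma)) for x in xs]
    even = [e * math.cos(2.0 * math.pi * freq * x) for e, x in zip(env, xs)]
    odd = [e * math.sin(2.0 * math.pi * freq * x) for e, x in zip(env, xs)]
    return even, odd

def simple_response(image, profile, center):
    """Linear response of one receptive field centered at `center`."""
    half = len(profile) // 2
    return sum(w * image[center - half + k] for k, w in enumerate(profile))

def complex_response(left, right, center, offset, even, odd):
    """Binocular energy of a cell whose right-eye field is displaced by `offset`."""
    s_le = simple_response(left, even, center)
    s_lo = simple_response(left, odd, center)
    s_re = simple_response(right, even, center + offset)
    s_ro = simple_response(right, odd, center + offset)
    return (s_le + s_re) ** 2 + (s_lo + s_ro) ** 2

def mean_energy_function(true_disparity, offsets, trials, rng):
    """Average response of cells with different preferred disparities."""
    even, odd = gabor_pair(sigma=4.0, freq=0.125, half_width=12)
    n, center = 80, 40
    totals = {o: 0.0 for o in offsets}
    for _ in range(trials):
        left = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Right image is the left image displaced by the stimulus disparity.
        right = [left[(i - true_disparity) % n] for i in range(n)]
        for o in offsets:
            totals[o] += complex_response(left, right, center, o, even, odd)
    return {o: t / trials for o, t in totals.items()}

if __name__ == "__main__":
    energy = mean_energy_function(2, range(-4, 5), 400, random.Random(1))
    print(max(energy, key=energy.get))
```

When the offset of the right-eye receptive field equals the stimulus disparity, the two eyes' responses are identical and the cross terms of equation (1) are at their largest, so the averaged energy function peaks at that offset.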
15.2.1d Phase and Cepstral Filters
Estimates of disparity based on cross-correlation of the images are perturbed by noise and by differences in contrast or luminance between the images. These problems can be reduced by performing a cross-correlation at each of several spatial frequencies and then combining the results. This reduces the effects of noise, which is usually confined to high spatial frequencies. Sanger (1988) developed a model of disparity detection in which the image in each eye is first convolved with a set of Gabor filters of differing spatial frequency. Disparity at each location is then registered at each spatial scale in terms of the interocular phase shift between the Gabor functions. The outputs at each location are then combined across all spatial scales. This process reduces the effects of band-limited noise. Jenkin and Jepson (1994) proposed a phase filter for the computation of disparities. A Fourier transform of each
image is derived and the amplitudes of all frequency components are normalized to unity. This process acts like a high-pass filter and emphasizes the high-frequency components to provide a better estimate of image position. The Fourier transforms of the two images are then multiplied. This provides a Fourier domain measure of the correlation, and hence the disparity, of the two images. Models of this kind were discussed in more detail in Section 11.10.1. A cepstrum is the power spectrum of the logarithm of the Fourier transform of a time-varying or space-varying signal. Cepstrum is an anagram of “spectrum.” The concept was developed by Bogert et al. (1963) and originally applied to the detection of echoes in seismic signals reflected from layers of the earth’s crust. The method is suited to the characterization of repetitive structures such as echoes. Each echo generates an easily located peak in the cepstrum. The location of this peak signifies the echo delay. The echo signal can then be removed by filtering, and the original signal reconstituted by the inverse transform. Yeshurun and Schwartz (1989, 1990) developed a model of disparity detection using cepstral filters. The unit visual input in the model consists of a patch of image about 5 arcmin in diameter spread across neighboring left-eye and right-eye ocular dominance columns. One half the patch is derived from one eye, and the other half from a corresponding region in the other eye. It is assumed that the disparity to be detected lies within this region. The Fourier transform of the pair of images is first derived. Then the power spectrum of the logarithm of the Fourier transform is plotted to yield the cepstrum. A horizontal disparity shows as a localized peak in the cepstrum spectrum with a position on the X-axis that signifies disparity magnitude in angular units. 
The nonlinearity introduced by the logarithmic transformation renders this procedure specifically sensitive to repetitive structures in the input signal that take the form of a binocular disparity. Since the logarithmic transformation is compressive, it emphasizes high-frequency components of the image. Furthermore, unlike correlation procedures, the disparity signal in cepstral analysis is not subject to interference from spatial frequency terms in the component images. For these reasons, the signal-to-noise ratio is several times higher than that achieved by correlating the images. The method is highly resistant to blurring, rotation, and differential magnification of the images. Image degradation smears the disparity signal in the cepstrum but does not shift its mean position.

The visual system could derive a Fourier transform of the visual input only if it possessed a large number of narrowly tuned detectors of spatial frequency, each with very large receptive fields (Section 4.4.2). However, an estimate of the spatial-frequency power spectrum of a local binocular image patch may be derived by a small set of detectors each with a 1.5 octave bandwidth. The output of this stage would then have to be transformed logarithmically to yield the cepstrum with its peak signal representing the disparity. The process would be carried out in parallel at each location.
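The way a cepstrum locates an echo, and hence a disparity, can be sketched with a one-dimensional toy signal. The sketch assumes the usual power-cepstrum computation (the squared magnitude of the transform of the log power spectrum) and is not the implementation of Yeshurun and Schwartz. The composite signal places a left-eye patch next to a copy displaced by the patch width plus the disparity, so that the copy acts as an echo. The function names, the small constant added before taking the logarithm, and the smooth test patch are assumptions. A smooth patch is used so that its own cepstrum stays at low values of the delay axis.

```python
import cmath
import math

def dft(values):
    """Naive discrete Fourier transform (adequate for short patches)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(values))
            for k in range(n)]

def power_cepstrum(signal, eps=1e-12):
    """Squared magnitude of the transform of the log power spectrum."""
    log_power = [math.log(abs(c) ** 2 + eps) for c in dft(signal)]
    n = len(signal)
    return [abs(c / n) ** 2 for c in dft(log_power)]

def composite_patch(patch, disparity, total_length):
    """Left-eye patch plus a copy delayed by the patch width plus disparity."""
    out = [0.0] * total_length
    delay = len(patch) + disparity
    for i, v in enumerate(patch):
        out[i] += v
        out[i + delay] += v
    return out

def estimate_disparity(signal, patch_width):
    """Find the cepstral peak beyond the patch width; convert it to disparity."""
    cep = power_cepstrum(signal)
    search = range(patch_width, len(signal) // 2 + 1)
    peak = max(search, key=lambda q: cep[q])
    return peak - patch_width

if __name__ == "__main__":
    patch = [0.5 ** i for i in range(16)]  # smooth test patch (assumption)
    signal = composite_patch(patch, 4, 64)
    print(estimate_disparity(signal, 16))
```

The search is restricted to delays at or beyond the patch width, because smaller delays reflect structure within a single eye's patch rather than the displacement between the two eyes' patches.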
15.2.2 DETECTION OF INTEROCULAR CORRELATION
15.2.2a Time to Detect Changes in Interocular Correlation
Sensitivity to interocular correlation has been investigated by measuring how long it takes to detect changes in correlation in random-dot displays. Julesz and Tyler (1976) measured the stimulus duration required for detection of two types of change in interocular correlation in dynamic square matrix displays. First, a display in which half the squares had the same luminance in the two eyes and half had opposite luminance (zero correlation) changed to a display in which all pairs had opposite luminance (–1 correlation), or vice versa. These transitions involve the engagement or disengagement of the binocular rivalry mechanism. Second, a fully correlated display changed to a zero-correlated display, or a fully correlated display changed to a negatively correlated display. These transitions involve the engagement or disengagement of the binocular correspondence mechanism. In each case, the display remained in the changed state for a variable time interval and then changed back. Subjects required longer intervals to detect the first type of change than to detect the second type. Julesz and Tyler concluded that transitions are detected more efficiently by the rivalry mechanism than by the binocular correspondence mechanism.

Tyler and Julesz (1978) used dichoptic displays consisting of randomly distributed well-spaced dots, all of the same luminance. The displays were dynamic, meaning that they changed every 1.5 ms. When the dots in each exposure were identically distributed in the two eyes the correlation was 1 and the fused image appeared flat. When the relative distribution of the dots in the two eyes was random, the correlation was 0 and the fused image appeared as a 3-D array of swirling dots. Tyler and Julesz measured the ability of subjects to detect a change in the correlation between two sequentially presented displays as a function of stimulus area and duration.
The time taken to detect a change in correlation decreased as display area increased to about 5 deg2, after which the duration threshold remained constant. Because they used only one dot density of 6%, it is not clear whether the crucial factor was the number of dots or the area of the display. With larger displays, adding extra area may have been ineffective because of the retinal eccentricity of the added dots. Tyler and Julesz also found that subjects could detect changes of correlation of only 1%, sometimes in only 3 ms, suggesting that correlation detection is done in parallel over the display, rather than serially. In other words, it suggests
STEREOSCOPIC VISION
Figure 15.2. Binocular luster as a preattentive feature. Detection of a patch with rivalrous luminance does not require a serial search.
that dichoptic luminance polarity is a preattentive visual feature. The ease with which dichoptic dots with reversed polarity can be found among a set of dots with the same polarity is illustrated in Figure 15.2. A change of a random-dot display from a state of correlation to one of decorrelation was detected about 10 times faster than a change from decorrelation to correlation. This could be because all elements must be checked to establish that a display is correlated, whereas establishing that it is uncorrelated requires the detection of only one unlinked image.
15.2.2b Discrimination of Interocular Correlation
Sensitivity to interocular correlation has also been investigated by measuring the threshold for detecting a difference in the interocular correlation of random-dot displays. Cormack et al. (1991) presented a dynamic random-dot display to each eye with various degrees of correlation, as shown in Figure 15.3. At contrasts above about 10 times the contrast threshold, two subjects could discriminate between an 8% correlated display and a display with zero correlation. The correlation threshold was independent of changes in contrast. They explained this constancy of the correlation threshold using a model in which the signal (the peak in the correlation function) and the extrinsic noise (lesser peaks due to spurious matches in the display) both vary with the square of contrast, as shown in Figure 15.4. This similar dependence of signal and noise on contrast means that the signal-to-noise ratio remains constant as contrast is varied. As contrast fell below about 10 times the contrast threshold, the correlation threshold increased rapidly. A similar drop in stereoacuity occurs at low contrasts (see Figure 18.17). A rapid rise in the stereo threshold occurs near the contrast threshold because intrinsic noise necessarily becomes stronger than the signal at low contrasts.
Cormack et al. (1994) used dynamic high-density random-square displays. The degree of interocular correlation was varied by varying the amount of dynamic noise present in the displays. Subjects decided in which of two time intervals the mean correlation changed from zero to some positive value. The area of the display varied between 0.14 and 7 deg², and its duration varied between 49 and 800 ms.
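The contrast argument of Cormack et al. (1991) described above can be illustrated with a small numerical sketch. It assumes, for illustration only, that signal and extrinsic noise are each a constant times contrast squared and that intrinsic noise is a fixed additive term. The gain values and the additive form are assumptions made here, not parameters reported by the authors.

```python
def correlation_snr(contrast, signal_gain=1.0, extrinsic_gain=0.25,
                    intrinsic_noise=1e-4):
    """Signal-to-noise ratio under the assumed model.

    Signal and extrinsic noise both scale with contrast squared;
    intrinsic noise is assumed to be independent of contrast.
    """
    signal = signal_gain * contrast ** 2
    extrinsic = extrinsic_gain * contrast ** 2
    return signal / (extrinsic + intrinsic_noise)

if __name__ == "__main__":
    for c in (0.6, 0.3, 0.1, 0.03, 0.01):
        print(f"contrast {c:5.2f}  SNR {correlation_snr(c):.3f}")
```

With these assumed values the ratio stays close to signal_gain / extrinsic_gain at high contrasts, because contrast squared cancels, and falls once the contrast-independent intrinsic term dominates at low contrasts.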
Figure 15.3. Different degrees of interocular correlation. (a) Correlation 1.0. (b) Correlation 0.5. (c) Correlation 0.0. When the correlation between the images is 1.0, they fuse to create a smooth plane. When the correlation is 0.5, some dots appear out of the plane. When the correlation is 0.0, a plane is not seen. Cormack et al. (1991) investigated the ability of subjects to detect transitions between different degrees of correlation. (Reproduced with permission from Elsevier Science)
For displays with up to about 10,000 elements, the correlation threshold was inversely proportional to the number of elements in the display, regardless of its size or duration. With more than 10,000 elements, the correlation threshold leveled off to a constant value of about 0.1. These results did not reveal well-defined integration areas or integration times but conformed to an ideal observer that detects the statistical attributes of the total number of elements in the display for up to 10,000 elements. In Section 18.3.5 it is mentioned that subjects detected a disparity-defined depth edge in a sparse random-dot stereogram using the information in only about 20 elements, even though the display contained many more elements (Harris and Parker 1992). In this task the crucial information is contained in the region of the depth edge.
LINK ING BINOCULAR IMAGES
However, in the correlation-detection task used by Cormack et al., the information is distributed over the whole display. An experiment on sensitivity of the stereoscopic system to added disparity noise is described at the end of Section 18.6.3.
Cormack et al. (1997b) varied the correlation between two dynamic random-element displays by varying the proportion of element pairs with the same luminance relative to those with opposite luminance. Subjects reported which of two sequentially presented displays was more highly correlated. The threshold for detection of a difference in interocular correlation of a 2° by 2° display decreased as element density increased up to a limiting value. At low densities, human subjects were as efficient as an ideal observer that sampled all regions. However, they were not as efficient as an ideal observer that used only the minority elements in the display. As dot density increased, human efficiency fell with respect to both ideal observers. Use of a constant fraction of dots would result in constant efficiency. Use of a constant number of dots would result in a decline in efficiency with a slope of –1, which was steeper than that observed. Cormack et al. concluded that human observers undersample the information contained in high-density random-dot stereograms.
Palmisano et al. (2006) examined effects of image decorrelation on both detection of decorrelation and detection of disparity-defined 3-D sinusoidal surfaces. They used static and dynamic random-dot stereograms. Detection of a 3-D surface required a much larger number of correlated dots than did detection of interocular correlation. Surface detection tolerated more image decorrelation as: (1) dot density increased from 23 to 676 dots/deg², (2) spatial frequency of depth modulation decreased from 0.88 to 0.22 cpd, (3) disparity amplitude increased above 0.5 arcmin, and (4) dot lifetime decreased from 1.6 s (static RDS) to 80 ms (dynamic RDS). Palmisano et al. concluded that increasing dot density or decreasing dot lifetimes improves tolerance for decorrelation because both manipulations prevent undersampling, increase the signal-to-noise ratio, and reduce the incidence of false matches.
Figure 15.4. Cross-correlation functions. Family of cross-correlation functions for a random-dot stereogram with 80% interocular correlation but decreasing contrast (indicated by numbers on the right). Signal (peak in correlation function) and the extrinsic noise (lesser peaks due to spurious matches) both vary with the square of contrast. The functions are separated vertically. (Adapted from Cormack et al. 1991) Axes: correlation amplitude against disparity; curves for contrasts of 30%, 24%, 18%, 12%, and 6%.
15.2.2c Detection of Interocular Correlation in Horizontal and Vertical Arrays
Cormack and Riddle (1996) measured the threshold for detection of interocular correlation in dynamic random-line displays. Each display was a 2° by 2° square with randomly spaced vertical or horizontal lines extending across the display. Subjects reported which of two displays, presented sequentially for 1.2 s each, contained nonzero spatially correlated patterns. The threshold for horizontal lines was similar to that for vertical lines. However, the slope of the psychometric function for vertical lines, which involved horizontal disparities, was about twice that for horizontal lines, which involved vertical disparities. In other words, interocular correlations are discriminated less well by the vertical-disparity system than by the horizontal-disparity system. Cormack and Riddle concluded that the decision stage is similar for horizontal and vertical disparities but that there is more noise in the correlation signal in the vertical-disparity system than in the horizontal-disparity system.
There may be another way to think about these results. Section 20.2.4 reviews evidence that horizontal disparities are extracted locally but that vertical disparities are extracted as a mean value over large areas. In the horizontal-disparity system, disparities of noncorresponding stimulus elements, if detected, will be detected at their full value in each location and then used to determine the correlation threshold. In the vertical-disparity system, only a mean value of disparity will be extracted. This will usually be less than the full value used by the horizontal-disparity system.
15.2.2d Detection of Interocular Correlation in Displays Displaced in Depth
Stevenson et al. (1992) asked subjects to indicate which of two sequentially presented dynamic random-dot displays contained some degree of correlation. Subjects detected a correlation of about 0.1 when the two displays were in the plane of fixation. When the displays had a horizontal disparity in excess of 1°, a fully correlated display became indistinguishable from a zero-correlated display. Stevenson et al. then used a depth-adaptation procedure to estimate the disparity tuning of channels that detect interocular correlation. Subjects inspected a 9°-diameter circular array of fully correlated dynamic random dots for 60 s, with 0.5-s topping-up exposures to this adaptation stimulus.
Figure 15.5. Effects of adaptation on detection of interocular correlation. The threshold for detection of interocular correlation is elevated at the disparity of the adaptation stimulus and lowered for disparities beyond about 5 arcmin. Error bars indicate + 1 SE. Results for one subject. (Redrawn from Stevenson et al. 1992) Axes: log correlation threshold against disparity of test stimulus (arcmin); curves before adaptation and after adaptation to a zero-disparity stimulus.
The array had different degrees of crossed or uncrossed disparity with respect to a surrounding aperture. Between adaptation top-ups, subjects were tested for detection of a partially correlated surface embedded in a zero-correlation cloud of dots. When the test and adaptation stimuli had the same disparity, the percentage of correlated dots required for detection of correlation was about twice as high after adaptation as before. The effect of adaptation declined to zero when disparity between adaptation and test stimuli was increased to between 5 and 10 arcmin, as shown in Figure 15.5. With higher disparity differences between the stimuli, adaptation lowered the correlation-detection threshold. Stevenson et al. concluded that interocular correlation is processed in several channels, each with a disparity tuning width of about 5 arcmin, but with a broader center-surround organization.
The above experiments tested the ability to detect correlation under stringent conditions. In a typical random-dot stereogram, all the dots are the same and the patterns of dots in the two images have an overall similarity. This may partly explain why it can take time to see depth in these stereograms. Depth latency is shortened when clearly discriminable elements, such as corners and blobs, are added to random-dot stereograms (Saye and Frisby 1975). Natural images do not consist of randomly distributed identical elements but patterns of lines, edges, and distinctly colored surfaces. These natural redundancies simplify the task of linking binocular images because they impose constraints on the number of ways in which images may be linked. Incorrectly linked images arising from natural scenes tend to rival. The linkage rules described in the following sections show how the visual system exploits these constraints. Processing disparities for detection of the best fit between complex images may have little to do with subsequent processing of disparities for the detection of depth.
Correlating images, even within a given region and over a restricted range of disparities, necessarily entails pooling disparity information. If this were the only procedure, the visual system would be unable to detect the depth of an isolated dot set among dots in distinct depth planes. In fact, the relative depth of a dot can be detected when it has a disparity of a few arcmin, even when the surrounding dots are partially uncorrelated (Harris et al. 1997). This suggests that processes for detection of fine disparity are distinct from those responsible for detection of image correlation. In a similar way, the process of image fusion, which is limited by the spatial gradient of disparity (Section 12.1.3), is distinct from the processes of image correlation and disparity detection, which are not limited by the disparity gradient in the same way. The disparity processes that drive vergence are probably different from all the other disparity processes in the extent to which they spatially integrate disparity information. Thus the same disparity information is processed in different ways for different purposes. Also, there are distinct mechanisms that process different types of disparity in parallel or hierarchically.
15.3 LOCAL MATCHING RULES
15.3.1 UNIQUE-LINKAGE RULE
It is a basic property of the world that a small object has a unique position in 3-D space at any instant and that there is no more than one object in one place. This constraint allows the binocular matching process to operate with the rule that each point in the image of one eye corresponds to at most one point in the image of the other eye at any one time. According to this unique-linkage rule, once a pair of image points has been linked by the disparity-detection system, all other potential linkages for those images are excluded. Current physiological theories of disparity detection allow multiple linkages, insofar as neighboring binocular cells receive overlapping inputs from monocular receptive fields. Perhaps multiple linkages are made initially, and the unique-linkage rule is applied at a later stage in which the best set of linkages is retained by the application of other criteria described in later sections (Grimson 1981). Weinshall (1991) claimed that multiple depth planes seen in an ambiguous random-dot display are due to double-duty linkage. However, Pollard and Frisby (1990) pointed out that multiple planes might be seen without violating the unique-linkage rule. Each point in one eye could be linked with only one point in the other eye but different linkages could occur in different parts of the stereogram, creating two planes. Later, Weinshall (1993) showed that the perceived density of dots in each depth plane was
There is no conclusive evidence that the visual system simultaneously registers multiple linkages between images (see Section 15.3.1).
Figure 15.6. Display used to study double-duty matching. The display in each eye was 1.5° long by 0.8° high. Each target consisted of a white line 2.75 arcmin wide flanked by black lines 1.38 arcmin wide. Background and targets had the same mean luminance. Panels: left eye and right eye. (Adapted from McKee et al. 1995)
consistent with each dot having been linked with only one dot. Thus, the unique-linkage rule holds for this type of display. Stereo transparency is discussed in more detail in Section 18.9.
McKee et al. (1995) claimed that a single vertical bar presented to one eye can simultaneously mask two bars presented to the other eye. Each vertical white bar was flanked with black bars and presented on a gray background. Each set of three bars is a target (Figure 15.6). This arrangement was designed to eliminate low spatial frequencies. Two targets were presented 7 arcmin apart to the right eye. The contrast-increment threshold for each of these targets was elevated by the presence of a similar target in an intermediate position in the left eye. The threshold elevation was as great as when the right eye had only one target, suggesting that the single white bar in the left eye simultaneously masked both white bars in the right eye. Furthermore, perceived depth between the two targets, as measured by a depth probe, indicated that the single target in the left eye was linked at the same time with both targets in the right eye. Vergence was controlled by having subjects align nonius lines before the stimulus was presented for only 200 ms. One problem here is that each target contained two black lines. The single target in the left eye was thus not really a single stimulus. Perhaps each of the two black lines of the target in the left eye independently masked the white lines in the right eye. Also, the target in the left eye had four edges and the targets in the right eye had eight edges. The stimuli therefore provided multiple disparities. This would not violate the unique-linkage rule.
Panum’s limiting case looks as though it presents a problem for the unique-linkage rule. The eyes converge on a vertical line seen by both eyes, and a second vertical line is presented to only one eye, slightly to the temporal side of the fused line (see Figure 17.45).
The second line appears to lie in a depth plane beyond the binocular fused line. Hering (1865) proposed that the image in the eye that sees only one line is linked with each of the two images in the other eye, one linkage signifying one depth plane and the other linkage a second depth plane. If true, this would violate the unique-linkage rule. However, we will see in Section 17.6 that Panum’s limiting case can be explained without assuming double-duty linkage of this type.
15.3.2 NEAREST-NEIGHBOR IMAGES
According to the nearest-neighbor rule, the visual system links dichoptic images with the least disparity. The images of objects in or near the horopter are automatically fused and linked. As an object moves away from the horopter, its images acquire a disparity and eventually become diplopic. When there are several similar objects in the same neighborhood, there will be several ways to link their images. It is a good strategy to give priority to nearest-neighbor images because the larger the disparity the more likely it is that the images arise from distinct objects. For a surface covered with evenly spaced identical texture elements, the nearest-neighbor linkage rule produces the wrong match when the eyes converge so that the disparity between corresponding images is more than half the spacing of the texture elements. The surface will then appear in an anomalous depth plane. The wallpaper illusion illustrates this point (Sections 14.2.2 and 25.2.6b). However, if the surface also contains well-spaced lines or edges, each with a distinct shape, it should be easy to find the correct match for these features. Once the well-spaced features have been linked, the same disparity could be applied to more finely spaced features on the same surface. The images of particular elements on a textured surface may be constrained to link with non-nearest neighbors to optimize the match of the whole surface. This strategy works when it is correctly assumed that the sparse features and the dense features belong to the same surface. Usually, the communality of different textural elements on a surface is apparent because of the coherence or familiarity of the overall pattern. There are stimuli for which a linkage that produces the smallest absolute disparity between image features in a given stimulus is not the same as the linkage that produces the minimum relative disparity between the images of features in adjacent stimuli. Zhang et al. 
(2001) devised the stereograms shown in Figure 15.7. In Figure 15.7A, the two gratings have the same phase disparity of 135° with respect to the vertical reference lines. When cross-fused, they appear nearer than the reference lines. The grating in Figure 15.7B has the same disparity but with opposite sign. When cross-fused it appears beyond the reference lines. Each of these depth impressions conforms to nearest-neighbor linkage between the peaks of the gratings that are 135° apart. When all three gratings in Figure 15.7C are cross-fused they all appear either nearer than or beyond the reference lines. For either the center or the flanking gratings, image linking must be between non-nearest-neighbor peaks. For example, when all the gratings appear nearer than the reference lines, the match in the flanking gratings is a nearest-neighbor (135°) match. However, the match in the center
Figure 15.7. Matching non-nearest-neighbor stimuli. (A) The gratings have the same phase disparity of 135° with respect to the vertical reference lines. When cross-fused, they appear nearer than the reference lines. (B) The grating has the same disparity but of opposite sign to those in (A). When cross-fused it appears beyond the reference lines. Both these depth impressions conform to nearest-neighbor matching between peaks of the gratings 135° apart. (C) When cross-fused, all three gratings appear either nearer than or beyond the reference lines. For either the center or the flanking gratings, image matching must be between non-nearest-neighbor peaks. (D) With crossed fusion, the upper figure appears nearer than the black lines. The +135° phase disparity in the middle figure causes the fused image to appear beyond the black lines. In both cases, nearest-neighbor images are matched. However, when the two stereograms abut in the bottom figure, the grating appears nearer than the random-dot displays. (From Zhang et al. 2001. Reprinted with permission from Elsevier Science)
gratings must be between peaks 225° apart (360° – 135°) because this match has the same sign as that in the flanking gratings. This way of linking the images minimizes the relative disparity between the center and flanking gratings. Thus, minimizing relative disparities between stimuli takes precedence over minimizing absolute local disparities within stimuli. In a subsequent experiment Zhang et al. (2004) used the stereograms shown in Figure 15.7D. With crossed fusion, the disparity in the stereogram in the upper figure causes the patterns to appear nearer than the black lines.
The +135° phase disparity in the images of the grating in the middle figure causes the fused image to appear beyond the black lines. In both cases, nearest-neighbor images are matched. However, when the two stereograms abut in the bottom figure, the grating appears nearer than the random-dot displays. Thus, a match between peaks of the grating –225° apart replaces the nearest-neighbor match between peaks +135° apart. This brings the relative disparity between the grating and the flanking random-dot stereograms down to 90° (225° minus 135°) from 270° (135° plus 135°). Thus, the unambiguous disparity of the random-dot stereograms determined which match was made between the images of the ambiguous grating.
Zhang et al. went on to ask whether the influence of a flanking display on a neighboring display declines when two briefly exposed displays are separated in time. The random-dot images and the grating images were each presented for 166 ms with a variable interstimulus interval (ISI). The flanking images ceased to affect the perceived depth of the grating when the ISI exceeded about 50 ms. Zhang et al. used short exposures to prevent changes in vergence. With prolonged viewing, the eyes could converge so as to reduce the disparity of the images of the random-dot stereogram to zero, which would automatically change the disparity of the grating to bring the fused image nearer. This would not involve any spatial interactions between the disparities of the two displays. However, the initial 166-ms exposure of the random-dot stereogram may have triggered a vergence response, which could have changed the disparity of the subsequently exposed grating.
Goutcher and Mamassian (2005) pointed out that, with three adjacent surfaces, observers could have minimized local relative disparities between the surfaces or the global change in disparity over the whole display.
They devised a set of dot stereograms with the same global disparity, defined as the largest disparity across the whole display, but different patterns of local disparity. Matches conformed to a nearest-disparity constraint, rather than a nearest-neighbor constraint. They concluded that binocular matches minimize local relative disparities rather than global disparity. Goutcher and Hibbard (2010) produced further evidence that the visual system minimizes relative disparities.
Figure 15.8A shows an ever-ascending staircase produced by misleading perspective (Penrose and Penrose 1958). Figure 15.8B is a stereo version of this illusion (Papathomas and Julesz 1989). The construction depends on the rule that linkage of binocular images in a region of constant disparity minimizes the perceived difference in depth between that region and a neighboring region. Most people see a continuously ascending set of steps in depth as they slowly move their gaze from the top of the stereogram to the bottom, in spite of the fact that the disparity periodically jumps back to its previous value. Part of the reason for this is probably that the eyes change convergence as they scan the display.
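The arithmetic behind the Zhang et al. (2004) result described above can be sketched as a choice between two candidate matches for a periodic grating. The sketch assumes a sign convention in which positive phase disparity is uncrossed (beyond) and negative is crossed (nearer), and it expresses the disparity of the flanking random-dot display in the same phase units, as the text's own arithmetic does. The function names are illustrative.

```python
def candidate_matches(phase_disparity_deg):
    """Candidate matches for a grating: the stated phase disparity
    and its complement of opposite sign (one full cycle away)."""
    d = phase_disparity_deg
    complement = d - 360 if d > 0 else d + 360
    return [d, complement]

def nearest_neighbor_match(phase_disparity_deg):
    """Match with the smallest absolute disparity."""
    return min(candidate_matches(phase_disparity_deg), key=abs)

def min_relative_disparity_match(phase_disparity_deg, flank_disparity_deg):
    """Match with the smallest disparity relative to a flanking display."""
    return min(candidate_matches(phase_disparity_deg),
               key=lambda c: abs(c - flank_disparity_deg))

if __name__ == "__main__":
    grating, flank = 135, -135   # grating uncrossed, flank crossed
    print(nearest_neighbor_match(grating))                # 135
    print(min_relative_disparity_match(grating, flank))   # -225
```

In isolation the grating takes the +135° match. Next to a flank at –135°, the –225° match leaves a relative disparity of 90° rather than 270°, which is the outcome reported for the abutting displays.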
Figure 15.8. Ever-ascending staircases. (A) An ever-ascending staircase created by illusory perspective. (Redrawn from Penrose and Penrose 1958) (B) A stereo version of the ever-ascending staircase. When the images are fused, one sees continuously ascending steps in depth as the gaze is moved slowly from the top to the bottom, even though disparity jumps back periodically to its previous value. (From Papathomas and Julesz 1989, Pion Limited, London)
Linkage of nearest-neighbor images in a periodic stimulus would be facilitated if the disparity were less than half the largest spatial period to which disparity detectors respond. Phase-disparity detectors embody this half-cycle limit (Section 11.4.3). Phase-disparity detectors with a half-cycle (180°) limit could not process large disparities. For example, to process a disparity of 10° unambiguously would require a cell tuned to a spatial frequency of 0.05 cpd (Section 18.4.1). This suggests that position-disparity detectors are used to process large disparities, especially large disparities between the images of small objects. Edwards and Schor (1999) argued that the transient stereoscopic system responds to short-duration, large-disparity stimuli and is not subject to the half-cycle limit, while the sustained system responds to long-duration, small-disparity stimuli and is subject to the half-cycle limit (see Sections 18.12.1a and 18.12.3). To test this hypothesis they presented two gratings with various matching
spatial frequencies. One grating had a crossed phase disparity and the other had an equal uncrossed disparity. Each grating also contained a complementary phase disparity. For example, a grating with a 90° phase disparity also had a 270° phase disparity. Subjects reported which grating appeared nearest in depth. For the long-duration stimuli, perceived depth almost always corresponded to that predicted by nearest-neighbor linkages (less than 180° phase disparity). For the short-duration stimuli (140 ms), perceived depth most often corresponded to phase disparities of more than 180°. In other words, the sustained system responded selectively to the smaller phase disparity, while the transient system responded selectively to the larger phase disparity present in the stimulus. The tendency for the transient system to respond to the larger disparity increased as phase disparity approached 180°. The tendency also increased as the angular subtense of the display increased from 15 to 30°, presumably because the peripheral visual field contains proportionately more detectors with large receptive fields.
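The half-cycle limit and the complementary phase disparities used by Edwards and Schor (1999) reduce to simple arithmetic, sketched here. The function names are illustrative. The only assumption is that the limit is taken as exactly half the spatial period of the detector's preferred frequency.

```python
def max_unambiguous_disparity_deg(spatial_frequency_cpd):
    """Largest disparity (deg) within the half-cycle limit of a
    detector tuned to the given spatial frequency (cycles/deg)."""
    period_deg = 1.0 / spatial_frequency_cpd
    return period_deg / 2.0

def frequency_needed_cpd(disparity_deg):
    """Highest spatial frequency whose half cycle still spans the disparity."""
    return 1.0 / (2.0 * disparity_deg)

def complementary_phase_disparity(phase_deg):
    """The other phase disparity present in a periodic grating."""
    return 360.0 - phase_deg

if __name__ == "__main__":
    print(max_unambiguous_disparity_deg(0.05))   # half of a 20 deg period
    print(frequency_needed_cpd(10.0))
    print(complementary_phase_disparity(90.0))
```

A detector tuned to 0.05 cpd has a 20° period, so its half-cycle limit is 10°, in agreement with the example in the text. A grating with a 90° phase disparity also carries a 270° phase disparity of opposite sign.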
15.3.3 IMAGE ADJACENCY IN SLANTED SURFACES
A conflict between linking nearest-neighbor images and linking corresponding images can arise in viewing slanted or inclined surfaces that contain regular texture. For example, the images in Figure 15.9 have a horizontal size disparity, which causes the fused image to appear slanted. When the leftmost dots are fused, the rightmost dot in the right image coincides with the next to the end dot in the left image. According to the nearest-neighbor rule, these dots should be linked. But this would leave the last dot in the left image and one of the dots in the right image unpaired. Ramachandran et al. (1973b) claimed that this stereogram produces a continuous depth ramp and concluded that linking occurs between corresponding dots rather than between dots that are nearest neighbors. According to this claim, the process that minimizes the number of unpaired images by forcing nonadjacent images to be linked wins out over the process that seeks nearest-neighbor linkages and tolerates unpaired images. This evidence is not conclusive. When fixation is held on the two leftmost dots in Figure 15.9, only the first few dots appear slanted—the others appear to lie on a frontal plane. This is because the disparities become ambiguous and hard to detect in the periphery. When the gaze is allowed to wander over the whole row, a coherent slanted row of dots is seen. By inserting pairs of nonius lines above the dots one can show that vergence changes as the gaze scans over the row of dots. This brings successive pairs of dots into correspondence and allows one to build up an integral impression of a slanted surface, by storing successive partial impressions of slant in a buffer store. At each stage of this process the nearest-neighbor rule is maintained, and unpaired images some distance from the point of convergence are simply ignored. This process of sequential scanning can be further explored by using vertical lines instead of dots and varying their number and spacing. 
Let the number of lines per degree of visual angle be f in the left eye and f + x in the right eye. With the eyes stationary, the lines in the two eyes fall on corresponding points x times per degree of visual angle. Let each of these positions be called a node. When nodes occur frequently, as in Figure 15.10, fusion of the images creates a Venetian blind in which short ramps between
Figure 15.9. Stereogram of dots with only one disparity node. With gaze on the left end dot, a slanted surface is seen extending over the first few dots, beyond which depth becomes vague. A continuous depth ramp is seen when the gaze moves across the display. As the gaze moves from left to right, the nonius lines on the left part and those on the right become aligned. This indicates that vergence has changed.
Figure 15.10. The Venetian-blind effect. A stereogram of vertical lines with closely spaced nodes of zero disparity creates a Venetian blind. Short ramps between nodes, which occur every three lines of the set of lines on the left, are interspersed with step returns.
nodes are interspersed with step returns to a frontal plane at each node. This is what one would expect from the pattern of nearest-neighbor disparities over the array. When the nodes are far apart, as in Figure 15.9, and when the gaze is fixed, a slanted surface is seen extending over the first few dots, beyond which depth impressions become vague. A similar effect occurs when one inspects a surface covered with a regular pattern of dots inclined in depth about a horizontal axis, as in Figure 14.3. The surface breaks up into a set of horizontal planes with a step between each plane. Within each plane, each dot in one eye fuses with its nearest neighbor in the other eye to create an inclined surface. However, at the boundary between one plane and the next, the nearest-neighbor linkages are shifted by one interdot spacing. These visual effects suggest that the visual system prefers to link adjacent images rather than nonadjacent images. However, we will now see that nonadjacent images are linked when there are no alternatives.
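The pattern of nearest-neighbor disparities in arrays of lines with f lines per degree in one eye and f + x in the other can be worked through numerically. The sketch below uses illustrative values (f = 6, x = 2) chosen so that x divides f and lines coincide exactly at the nodes. Both the values and the function names are assumptions made for the example.

```python
from fractions import Fraction

def line_positions(lines_per_degree, extent_deg=1):
    """Positions (deg) of evenly spaced lines, starting at zero."""
    n = lines_per_degree * extent_deg
    return [Fraction(i, lines_per_degree) for i in range(n)]

def nearest_neighbor_disparities(left, right):
    """For each left-eye line, the signed offset to the nearest right-eye line."""
    result = []
    for p in left:
        q = min(right, key=lambda r: abs(r - p))
        result.append(q - p)
    return result

def count_nodes(left, right):
    """Number of positions at which lines in the two eyes coincide."""
    return len(set(left) & set(right))

if __name__ == "__main__":
    f, x = 6, 2
    left, right = line_positions(f), line_positions(f + x)
    print(count_nodes(left, right))
    print(nearest_neighbor_disparities(left, right))
```

With these values there are two nodes per degree, one every three left-eye lines, and the nearest-neighbor disparities form a sawtooth that repeats between nodes rather than a single continuous ramp.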
15.3.4 IMAGE ORDER
15.3.4a Topological Order
Corresponding lines of sight (visual lines) from the two eyes intersect in either the horizontal or vertical horopter. Consider a line AB through point A on the horopter and lying between the two corresponding visual lines that meet at A, as in Figure 15.11. Any pair of objects, A and B, lying on such a line has a horizontal disparity gradient greater than 2, as explained in Section 19.4. The disparity gradient is greater than 2 for a pair of objects lying between any pair of corresponding visual lines, not only the visual axes. For objects with a disparity gradient greater than 2, the two images in the left eye are in reversed left–right order with respect to the images in the right eye. The topological continuity of these images is not preserved. Objects A and C have a horizontal disparity gradient less than 2. For these objects the image of A is to the left of the image of C in one eye and also to the left of the image of C in the other eye. The topological continuity of these images is preserved. It follows that, for objects having a horizontal disparity gradient greater than 2, the images of the nonfixated object are separated by the images of the fixated object, as shown in Figure 15.11. This should complicate the problem of
LINK ING BINOCULAR IMAGES
•
193
Corresponding visual lines
Horopter
A
C
B
BL AL
CL
AR BR
CR
Disparity with loss of topological continuity. Object B lies between the visual axes intersecting at object A. The relative order of the images of A and B is not the same in the two eyes. Also, the images of B are separated by the fused images of A.
Figure 15.11.
linking these images. For a phase-disparity mechanism that relies on the offset of subregions within corresponding monocular receptive fields (Section 11.4.3), linking would become impossible, since it is difficult to see how a binocular cell could operate when a fused pair of images is inserted between the disparate images.
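The link between a disparity gradient of 2 and reversed image order can be checked numerically. The sketch below assumes the usual definition of the disparity gradient (difference in disparity divided by the separation of the cyclopean, or mean, image positions; see Section 19.4). The function names and the example image positions are illustrative assumptions, not values from Figure 15.11.

```python
def disparity_gradient(left_a, right_a, left_b, right_b):
    """Horizontal disparity gradient between two objects A and B.

    Arguments are horizontal image positions (same angular units) of each
    object in the left and right eyes. The gradient is the difference in
    disparity divided by the separation of the cyclopean (mean) positions.
    """
    disparity_a = left_a - right_a
    disparity_b = left_b - right_b
    cyclopean_a = (left_a + right_a) / 2.0
    cyclopean_b = (left_b + right_b) / 2.0
    separation = abs(cyclopean_b - cyclopean_a)
    if separation == 0:
        return float("inf")  # both objects share one cyclopean direction
    return abs(disparity_b - disparity_a) / separation


def order_reversed(left_a, right_a, left_b, right_b):
    """True when A and B appear in opposite left-right order in the two eyes."""
    return (left_b - left_a) * (right_b - right_a) < 0


if __name__ == "__main__":
    # A is fixated (images at 0 in both eyes). B's images straddle those of A;
    # C's images both lie to the right of A. Positions are hypothetical, in degrees.
    print(disparity_gradient(0, 0, 0.25, -0.125), order_reversed(0, 0, 0.25, -0.125))
    print(disparity_gradient(0, 0, 0.5, 0.25), order_reversed(0, 0, 0.5, 0.25))
```

Writing a for the separation of the two images in the left eye and b for their separation in the right eye, the gradient equals 2|a − b|/|a + b|, which exceeds 2 exactly when a and b have opposite signs, that is, when the image order is reversed.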
15.3.4b Interhemispheric Order
The images of any object that lies to the left or to the right of both visual axes project to the same side of the brain. The images of any object between the visual axes, other than the fixated object, fall on opposite retinal hemifields and therefore project to opposite cerebral hemispheres (Section 5.3.4). Thus, there are two problems in linking the images of objects lying between the visual axes. (1) The image pairs are in reversed order so that the image of the fixated object lies between those of the nonfixated object, and (2) the images of the nonfixated object project to opposite sides of the brain. Evidence reviewed in Section 11.9 supports a suggestion made by Bishop and Henry (1971) that the callosal pathway is responsible for midline bilateral integration for coarse stereopsis, whereas fine stereopsis in the midline region depends on overlap of direct visual inputs in the midline region. Linksz (1971) made the same suggestion and added that impressions of depth from coarse disparities arise from command centers controlling vergence. The idea that signals carried in the corpus callosum generate vergence for
midline objects is supported by the fact that a patient with section of the corpus callosum failed to converge on midline targets but responded when images were in the same hemisphere (Westheimer and Mitchell 1969). However, this does not prove that these vergence signals mediate midline stereopsis. If coarse stereopsis were contingent on vergence, subjects would be unable to detect depth of more than one object at a time relative to a fixation point. Ziegler and Hess (1997) presented Gabor patches 3° above and 3° below or 3° to the left and 3° to the right of a fixation cross for 150 ms. The images of each patch had disparities of between 2 and 3°—well above the fusion limit. The patches were well separated from each other and from the fixation cross so that they did not form steep disparity gradients. Subjects could indicate the depth of each patch relative to the fixation cross for both the up-down and the left-right patches. The disparity of the left patch was detected on one side of the brain and the disparity of the right patch was detected on the other side of the brain. Interhemispheric connections must have been involved in comparing the two depth intervals. The images from each patch above and below the fixation cross went to opposite hemispheres. Signals from opposite sides of the midline with 3° of disparity are more likely to be conveyed to the opposite hemisphere through the corpus callosum than directly from midline regions with mixed ipsilateral-contralateral projections (Section 5.3.5). This evidence suggests that coarse disparities well beyond the fusion limit code depth directly, even when the disparate images project to opposite sides of the brain. However, it does not prove that vergence signals make no contribution to coarse stereopsis. Also, Ziegler and Hess showed only that subjects could detect the sign of interhemispheric disparities. 
They did not investigate whether the magnitudes of interhemispheric disparities are processed as efficiently as same-hemisphere disparities.
It is explained in Section 19.4 that images arising from a disparity gradient greater than 1 do not fuse. When the dots and lines in columns A and B of Figure 15.12 are combined with convergence, with the gaze firmly on the dots, the images of the thin vertical lines in column A appear on each side of the fused dot and do not fuse. These lines have an interhemispheric disparity and are separated by the fused image of the dots. Perhaps nonadjacent images are linked only for the purpose of evoking vergence responses. In column B the disparities are the same but the disparity gradient is less than 2; the images of the lines fall to one side of the fused image of the dots and project to the same hemisphere. When vergence is firmly on the dots, these lines also appear diplopic. However, the diplopia is not as easy to see because the lines are further from the fovea than the lines in column A. Depth between the dots and lines is clearly evident in all the fused images when the eyes change convergence from dots to lines. Vergence could help in two ways. First, even when the eyes do not change convergence, they may not be accurately converged on either the dots or the lines because the disparity of the low spatial-frequency content of the images in the two eyes induces a fixation disparity into the fused images. With crossed convergence, this causes the dots to appear nearer than the monocular lines. Second, when the eyes change convergence from the dots to the lines, the vergence movement and the resulting change in the pattern of disparity create depth.
Figure 15.12. Depth from high- and low-disparity gradients. The thin vertical lines in fused column A and column B appear in front of the dots with convergent fusion and behind them with divergent fusion. Depth increases down each column. When convergence is firmly on the dots, the vertical lines in column A appear diplopic and in the same plane as the dots. Lines in column B tend to remain fused and in depth. Disparities in the two columns are the same, but images in column A form a steeper disparity gradient and are topologically out of order compared with those in column B.
15.3.5 SIMILARITY OF ORIENTATION
The stereograms discussed in the previous sections consisted of arrays of identical points lacking all features except their spatial distribution and disparity. But natural scenes consist of objects that differ in spatial position, color, orientation, contrast, motion, and spatial frequency. The rule that each object has a unique position in space implies that the correctly linked binocular images of an object are fundamentally similar in size, orientation, shape, color, and motion. Images that differ markedly in any of these respects most likely arise from different objects. When images on corresponding retinal points differ in shape they tend to rival rather than fuse (see Chapter 12). The similarity constraint drastically reduces the correspondence problem for natural scenes. The image-linkage process makes good use of this natural constraint. Small differences in orientation of the images are the natural cue for inclination in depth. The pairs of corresponding lines in the upper stereogram in Figure 15.13 have orientation disparities appropriate to the inclinations of the sides of the pyramid that are defined in terms of horizontal disparities between the lines. The lines appear to lie smoothly on the sides of the pyramid (Ninio 1985). In the lower stereogram, corresponding lines have the same orientation and the lines appear to lie on steps on the sides of the pyramid. Thus, line elements can have both a horizontal disparity and a small orientation disparity, each of which produces its own depth effect.
Figure 15.13. Orientation disparities conforming to a surface. In the upper stereogram the relative orientations of line elements produce disparities appropriate to the slope of the pyramid produced by disparities in the continuous lines. This creates a smooth-sided pyramid. In the lower stereogram there are no orientation disparities between line elements. This creates a stepped pyramid. (From Ninio 1985, Pion Limited, London)
The question asked here is whether stereopsis created by horizontal disparity is disrupted when the elements of a stereogram differ in orientation. Frisby and Roth (1971) used a stereogram consisting of randomly oriented lines, each about 12 arcmin long. The central region had either a crossed or uncrossed horizontal disparity of 18 arcmin and appeared clearly in depth when the lines had the same orientation in the two eyes. The impression of depth began to deteriorate when the lines differed in orientation by 10°, and little or no depth was reported when they differed by more than 45°. It has been claimed that depth can be seen in a stereogram made of lines that are orthogonal in the two eyes, as in Figure 17.9. However, there are several artifacts in this type of display (see Section 17.1.2a). Some depth is seen in a stereogram with short orthogonal lines, as in Figure 15.14. Stereoacuity is severely reduced when the orthogonal lines are more than 3 arcmin long (Mitchell and O’Hagen 1972; Frisby and Julesz 1975a, 1975b, 1976). It seems that horizontal disparity between boundaries of regions containing oppositely oriented lines does not serve as a cue to depth, unless the lines are short. The disruptive effects of differences in orientation are probably due to the inability of binocular cells to sustain their response to the disparity of stimuli that differ in orientation by more than a critical amount. The receptive fields of binocular cells have similar orientation tuning in the two eyes (Section 11.4.1). Stereopsis based on horizontal disparity between differently oriented short lines probably depends on the registration of disparities between the low spatial-frequency components of the stimuli. The large point-disparities between the ends of the lines in different orientations begin to obtrude when the lines become too long. Also, long orthogonal lines engage in rivalry.
Figure 15.14. Stereopsis with orthogonal line elements. Some depth is still evident in a random-line stereogram with orthogonal line elements, as long as the lines are short. (From Frisby and Julesz 1975a. Perception. Pion, London.)
Mitchell (1969) reported that depth order relative to a fixation point could be detected in a stereogram consisting of a fixation point and a horizontal line to one eye and a vertical line to the other, as shown in Figure 15.15A. The disparity between the lines was sufficient to ensure that their images fell on distinct regions of the retinas. Thus, the lines did not rival. The magnitude of perceived depth was not measured. In this case, depth judgments may have been based on the transient vergence that nonmatching disparate stimuli induce (see Section 10.5.10), rather than directly on disparity. Even though the stimuli were exposed for only 100 ms, perceived depth could have been based on vergence occurring after the stimulus was switched off. One cannot argue that vergence plays no part in depth perception with briefly exposed stimuli. Depth is seen with short orthogonal lines, as in Figure 15.15B, or with vertical lines, as in Figure 15.15C. However, even with these images, depth is less apparent when fixation is firmly held on the central fixation dot. Schor et al. (1998) referred to the transient impression of depth created by disparity between orthogonal stimuli as transient stereopsis, as opposed to sustained stereopsis generated by similar stimuli. This issue is discussed in Sections 10.5.10 and 18.12.1.
Figure 15.15. Depth from orthogonal dichoptic images. (A) A stimulus derived from that used by Mitchell (1969). He claimed that fusion of the images with fixation on the dot causes the upper lines to appear nearer than the dot and the lower lines to be more distant, even with an exposure of only 100 ms. Depth impressions are stronger with short lines, as in (B) or with parallel lines, as in (C). Depth impressions are strengthened when the eyes change convergence from one pair of lines to the other pair.
Reimann and Haken (1994) described a computer algorithm that uses the similarity constraint to solve the binocular correspondence problem.
15.3.6 SIMILARITY OF SPATIAL FREQUENCY
Schor et al. (1984b) investigated the effect on stereoacuity of differences in spatial frequency between the images in the two eyes. The stereo threshold was a linear function of the difference in the center spatial frequency of bar-like patterns (DOGs) in the two eyes. However, the stereo threshold was independent of spatial-frequency differences when the spatial frequency of each pattern was above 2.5 cpd. Perhaps, above 2.5 cpd, the visual system begins to rely on the disparity between the envelopes of the DOGs. Below this limit, the disparity between the luminance modulations within the DOGs is the dominant factor. Hess and Wilcox (1994) found that, with Gabor patches, stereoacuity depends on the disparity of the envelope when the center spatial frequency of the contrast modulation of the patches exceeds 4 cpd (see Section 18.7.2c). Another factor is that the effective contrast of a grating varies with its spatial frequency, according to the contrast-sensitivity function. Kontsevich and Tyler (1994) argued that the lowest spatial-frequency channel for disparity detection has a peak spatial frequency of 2.5 cpd and that this channel alone detects all stimuli with lower spatial frequencies. Below this frequency, the effectiveness of a stimulus depends on its equivalent contrast. They argued from masking data that the effective contrast of a grating is inversely related to the square of its spatial frequency. They measured the stereoscopic threshold as a function of the relative contrasts of dichoptic DOG stimuli with peak spatial frequencies of 0.6 and 1.2 cpd, giving a spatial-frequency ratio of 2:1. As predicted, the lowest threshold occurred when the contrast ratio of the stimuli was 4:1. Since effective contrast is a linear function of contrast, stereoacuity is inversely related to the square root of effective contrast. It follows that stereoacuity should vary inversely with spatial frequency under those conditions where stimulus contrast is the dominant factor. This was the result obtained by Schor et al. (1984b). However, Hess et al. (2002) obtained evidence that the stereo threshold depends on luminance channels tuned to peak spatial frequencies below 2.5 cpd. Schor et al. (1998) asked subjects to detect depth between two 1° circular Gabor patches with a relative disparity of 6° and presented for 140 ms. Performance was impaired when the spatial frequencies of the sinusoidal components of the left-eye and right-eye Gabors differed. However, performance with nonmatching images improved when the contrast of the low spatial-frequency images was reduced relative to that of the high spatial-frequency images. They concluded that this provides further support for the Kontsevich and Tyler theory.
15.3.7 RELATIVE IMAGE CONTRAST
15.3.7a Images Differing in Contrast
Images from the same location, such as those produced by a glossy surface, often differ in contrast. The stereoscopic system tolerates moderate interocular differences in contrast. Stereoacuity is degraded only when there is a large difference in contrast between the images in the two eyes (Section 18.5.4). Smallman and McKee (1995) asked whether linking of binocular images is influenced by their relative contrast. Two white vertical lines, L1 and L2, 10 arcmin apart were presented for 200 ms on a dark background to the left eye and two similar lines, R1 and R2, were presented to corresponding points to the right eye. Subjects saw two lines in the same depth plane when all the lines had the same contrast. When R2 was not present, the stimulus conformed to Panum’s limiting case and the unfused line L2 appeared beyond the fused image of L1 and R1. As the contrast of R2 was increased, the impression of two lines at different distances suddenly changed into the impression of coplanar lines. Before the switch occurred, the low contrast line R2 was seen as an unfused “ghost” near the plane of fixation. This impression was also created when the contrast of line R2 was increased well above that of line L2. The contrast of R2 required to bring the lines into one plane increased as a linear function of the contrast of the other three lines. Thus, L2 and R2 were matched when the ratio of their contrasts fell between certain limits. When each eye saw only one line, the images fused into a single line even when their contrasts differed widely. However, when there was more than one line, linkages between images that differed in contrast by more than a certain ratio were rejected. Similar phenomena are described in Section 15.4. But the above experiment does not prove that same-contrast linkages are preferred when the contrast difference is not large.
Petrov (2004) also used stimuli conforming to Panum’s limiting case.
A pair of dots in one eye that differed in contrast was shown next to a single dot in the other eye with a contrast equal to that of the dot with lower contrast. An array of 200 three-dot units was presented. In each unit
the dots were arranged so that if same-contrast dots were linked the dots would appear oriented in depth in one direction and if the single dot were linked with the dot with higher contrast the dots would appear oriented the other way. The results indicated that, for contrast ratios below 3.5, linkages between dots of different contrast were preferred to those between dots of the same contrast. In other words, a single dot in one eye was preferentially linked to the dot with higher contrast in the other eye. Petrov concluded that the stereoscopic system uses a correlation measure that maximizes the scalar product of image features rather than one that minimizes differences between the images.
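The contrast between the two kinds of matching measure can be illustrated with a toy calculation. The sketch below is not Petrov's model; it simply treats each dot as a single contrast value and compares a rule that maximizes the product of contrasts with one that minimizes their squared difference. The function name, the rule labels, and the contrast values are hypothetical, and the sketch makes no attempt to reproduce the limit at a contrast ratio of 3.5.

```python
def preferred_partner(single, candidates, rule):
    """Pick the candidate contrast that a given matching rule links to `single`.

    rule = "max_product":    maximize the scalar product of the two contrasts.
    rule = "min_difference": minimize the squared contrast difference.
    """
    if rule == "max_product":
        return max(candidates, key=lambda c: single * c)
    if rule == "min_difference":
        return min(candidates, key=lambda c: (single - c) ** 2)
    raise ValueError(f"unknown rule: {rule}")


if __name__ == "__main__":
    single_dot = 0.2        # contrast of the lone dot in one eye (hypothetical)
    dot_pair = [0.2, 0.5]   # contrasts of the two dots in the other eye (ratio 2.5)
    print(preferred_partner(single_dot, dot_pair, "max_product"))     # higher-contrast dot
    print(preferred_partner(single_dot, dot_pair, "min_difference"))  # same-contrast dot
```

Under the product rule the lone dot is linked to the higher-contrast dot, which is the direction of preference reported for contrast ratios below 3.5, whereas the difference rule selects the same-contrast dot.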
15.3.7b Images with Reversed Contrast
According to the evidence cited in Section 11.4.1, binocular cells in V1 of the monkey respond to disparity in a random-dot stereogram in which the images in the two eyes have reversed contrast (anticorrelated). The disparity-tuning functions are reversed relative to those produced by same-contrast images. However, reversed-contrast random-dot stereograms do not create impressions of depth. Cumming and Parker (1997) concluded that disparities between reversed-contrast images are not used for perception even though they are coded in V1. However, we shall now see that the evidence for this conclusion is conflicting.
Helmholtz (1909, vol. 3, p. 512) produced the stereogram shown in Figure 15.16. White lines on a black background are presented to one eye, and black lines on a white background to the other. Depth is seen in the fused image, and Helmholtz concluded that depth is coded by disparity between contours of opposite luminance contrast. But depth in a reversed-contrast stereogram may be due to the linkage of the images of opposite edges of the lines, which have the same contrast polarity (Kaufman and Pitblado 1965; Howe and Watanabe 2003). The depth created by reversed-contrast images is opposite to that corresponding to the disparity between same-contrast images (Anstis and Rogers 1975; Rogers and Anstis 1975). This is because the visual system registers the disparity between edges of similar contrast rather than that between edges of opposite contrast.
Figure 15.16. Stereogram with opposite luminance polarity. Fusion of these images can create an impression of a convex or concave polyhedron. (From Helmholtz 1909)
According to the above hypothesis, depth should not be seen when reversed-contrast lines are so wide that edges with similar contrast polarity are too far apart, as illustrated in Figure 15.17B. Treisman (1962) failed to find stereopsis under these conditions. However, Kaufman and Pitblado (1969) reported that some subjects saw depth at a fully reversed-contrast border in spite of not being able to fuse the images. Lateral inhibition at a high-contrast edge generates Mach bands. Thus, there is a darker region on the dark side of the border and a lighter region on the light side. Kaufman and Pitblado suggested that depth at a reversed-contrast border is due to disparity between edges of similar polarity selected from these Mach bands. Also, areas with opposite contrast produce binocular luster that may contribute to the impression of depth.
Figure 15.17. Reversed contrast and line width. (A) Depth is produced from same-contrast and reversed-contrast thin lines. (B) Depth is not produced by reversed-contrast wide lines. (C) Depth is produced by wide lines that differ in contrast. (Adapted from Krol and van de Grind 1983)
Krol and van de Grind (1983) obtained depth at a reversed-contrast border when contrast was too low to generate Mach bands and suggested a different explanation. They found that the eyes misconverge on the reversed-contrast border, producing two effects. First, the two diplopic contrast boundaries are treated as unpaired monocular images and are defaulted to the plane of zero disparity. Second, the lateral edges of the display (with similar-contrast polarity) acquire a disparity. The monocular boundary lines are therefore seen in depth relative to the disparity induced into the lateral edges of the display by vergence. The sign of the depth depends on whether the eyes overconverge or underconverge on the reversed-contrast border. This is the same default rule for unpaired images that is discussed in connection with the unique-linkage rule in Section 15.3.1. This explanation preserves the same-polarity linkage rule.
Levy and Lawson (1978) obtained valid stereopsis when the contrast at a wide dichoptic border was only partially reversed, with gray-on-white in one eye and gray-on-black in the other. Cogan et al. (1995) presented a vertical bar with matching difference-of-Gaussian luminance profiles (DOGs) in the two eyes next to a vertical DOG bar with opposite luminance polarity in the two eyes. A DOG bar contains a bright central region flanked by two dark regions. Stereoacuity was an order of magnitude worse than when both bars had matching images. However, disparity-dependent relative depth was still evident with opposite-polarity bars as wide as 1 cpd. If this depth were due to linking same-sign edges in the DOG profiles, perceived depth in reversed DOG images would be a nonmonotonic function of disparity. This is because, with increasing disparity, different components of the luminance profiles come into coincidence. Cogan et al. obtained a monotonic function and therefore rejected the same-sign hypothesis. Also, they obtained the same effects from opposite-polarity DOGs as from opposite-polarity simple Gaussian bars. Simple Gaussian bars contain only a single bright or dark region and therefore do not contain same-sign components. Cogan et al. did not propose an alternative mechanism.
Subjects noted that reverse-polarity images appeared diplopic at disparities for which matched images fused. Under some circumstances, rivalry between regions with reversed luminance polarity can be a stereoscopic cue in its own right (see Section 17.5).
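The reversed disparity-tuning functions mentioned at the start of this section are what a purely linear correlation between the two eyes' images would produce. The sketch below is a one-dimensional illustration of that point only, not the receptive-field model of Cumming and Parker: it correlates a random row of light and dark dots with a shifted copy and with a contrast-reversed shifted copy. All function names, the row length, and the 3-sample shift are hypothetical choices.

```python
import random


def make_row(n, seed=0):
    """A row of n random dots, coded +1 (light) or -1 (dark)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]


def shift(row, d):
    """Circularly shift a row by d samples to the right."""
    d %= len(row)
    return row[-d:] + row[:-d] if d else list(row)


def correlation(left, right, d):
    """Mean product of the left row and the right row sampled d positions along."""
    n = len(left)
    return sum(left[i] * right[(i + d) % n] for i in range(n)) / n


def tuning_curve(left, right, max_d):
    """Correlation as a function of candidate disparity from -max_d to +max_d."""
    return {d: correlation(left, right, d) for d in range(-max_d, max_d + 1)}


if __name__ == "__main__":
    left = make_row(512)
    right_correlated = shift(left, 3)                  # same contrast, disparity of 3
    right_anticorrelated = [-v for v in right_correlated]  # contrast reversed
    normal = tuning_curve(left, right_correlated, 6)
    inverted = tuning_curve(left, right_anticorrelated, 6)
    for d in sorted(normal):
        print(f"d={d:+d}  correlated {normal[d]:+.2f}  anticorrelated {inverted[d]:+.2f}")
```

At the true shift the correlated pair gives a correlation of +1 and the anticorrelated pair gives −1, and at every candidate disparity the second curve is the first with its sign reversed. Whether such a signal is used for depth perception is a separate question, as the evidence reviewed above indicates.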
15.3.7c Transient Stereopsis with Reversed Contrast
Briefly exposed dichoptic stimuli with orthogonal contours can generate transient stereopsis (Section 15.3.5). Also, stimuli differing in shape can generate transient vergence (Section 10.5.10). We will now see that opposite-contrast stimuli can generate transient stereopsis. Pope et al. (1999) presented one pair of dichoptic Gaussian patches 2.2° above a fixation point and a second pair 2.2° below the fixation point. One pair had 0.5° of crossed disparity and the other had 0.5° of uncrossed disparity. The presence of patches with equal and opposite disparity was designed to prevent the initiation of vergence.
The patches had either the same or opposite contrast polarity in the two eyes and were presented with various durations and contrasts. Subjects indicated which patch appeared nearer. Performance with matching stimuli was generally perfect for all durations and contrasts. Performance with reverse-contrast stimuli declined as duration increased above 0.2 s or as contrast was reduced. Thus, images with matching contrasts evoked both transient and sustained stereopsis, while those with reversed contrasts evoked only transient stereopsis. Temporal transients were present in stimuli modulated by a square temporal window, but absent in stimuli modulated by a sinusoidal window. Reverse-contrast stereopsis was maintained over a longer period in the presence of temporal transients. Reverse-contrast stereopsis was also prolonged when stimulus contrast was increased. Thus, the duration of transient stereopsis depended on the temporal energy (determined by the temporal frequency and amplitude of contrast modulation) in the stimulus, rather than on stimulus duration alone. Pope et al. argued that random-dot stereograms do not evoke transient stereopsis because the transient system responds only to low spatial frequencies. This issue is discussed further in Section 18.12.3.
15.3.7d Reversed Polarity Random-Dot Stereograms
Stereopsis does not occur in a dense random-dot stereogram in which the images have reversed luminance polarity (Figure 15.18). However, Cumming and Parker (1997) reported that most binocular cells in V1 of the monkey produce an inverted disparity tuning function to a stereogram with contrast sign reversed in one eye (anticorrelated dots). They modeled this response with monocular receptive fields that match in spatial and temporal properties (Section 11.4.1). Cumming et al. (1998) concluded that disparity detectors in V1 respond to a type of disparity not used directly for depth perception. However, Julesz (1971, p. 157) observed stereopsis in reversed-polarity stereograms with low dot density or to which correlated color was added. Cogan et al. (1993) also observed a weak impression of depth in reversed-polarity stereograms with 20% dot density, although the dots did not fuse. Neri et al. (1999) found that noise consisting of anticorrelated dots reduced the detectability of depth in a random-dot stereogram when noise and signal dots had the same disparity and enhanced detectability when the noise dots had a disparity of +9 arcmin relative to the signal dots. Disparity noise consisting of correlated dots had the opposite effects. Thus, the effects of anticorrelated noise resembled the inverted disparity tuning functions of cortical cells produced by disparities in anticorrelated stimuli. Neri et al. suggested that the effects of anticorrelated noise arise at an early stage of visual processing, where binocular inputs of opposite luminance polarity are combined in a local, linear fashion for the detection of binocular correlation. They proposed that a later, nonlinear stage is involved in the detailed linkage of images for stereopsis.
Figure 15.18. Stereopsis and reversed luminance polarity. The depth seen in the normal stereogram in (A) is not seen when the images have reversed luminance polarity in (B).
The impression of motion produced by a reversed-contrast kinematogram is in the opposite direction to that induced by a normal kinematogram. This is because displacing an anticorrelated sine-wave grating of wavelength λ through distance d to the right is equivalent to displacing it λ/2 − d to the left and reversing its contrast. With complex anticorrelated stimuli, the size of the reversed displacement is different for different spatial frequency components, but all components signal the same direction of motion. Also, a complex moving display generates motion signals in several directions. Read and Eagle (2000) proposed that the motion system combines different spatial-frequency or different orientation components of motion to produce a coherent motion signal for both correlated and anticorrelated displays. On the other hand, they suggested that stereoscopic depth is weakened when there is disagreement between disparity signaled by different spatial-frequency channels or by different orientation channels. This would explain why an anticorrelated kinematogram produces reversed motion while an anticorrelated stereogram produces rivalry rather than reversed depth. Read and Eagle presented a patch of vertical lines of mixed spatial frequency to one eye and the same patch with contrast reversed to the other eye against a gray background. One image was displaced sideways relative to the other in a 130-ms exposure. Subjects indicated which of two successively presented stimuli moved and had depth. For small displacements with anticorrelated patches with a narrow band of spatial frequencies, subjects perceived both reversed motion and reversed depth, although the depth was not as clear as that produced by correlated patches. Like other investigators, they found that an anticorrelated random-dot display produced reversed motion but not reversed depth. The presence of reversed depth with anticorrelated lines suggests that the absence of reversed depth with anticorrelated dots is due to a reluctance to combine discrepant disparities arising from different orientation components of stimulus motion.
Depth is seen even in a dense polarity-reversed random-dot stereogram when the briefly exposed images are presented with an interocular delay of about 75 ms (Cogan et al. 1993). With such a delay, the negative component of the biphasic temporal response of the visual system to the first stimulus coincides with the positive initial component of the second stimulus. The two simultaneous images therefore have the same contrast polarity.
15.3.8 MATCHING COLOR
15.3.8a Stereopsis with Images Differing in Color
Depth seen in anaglyphs demonstrates that corresponding elements in the two eyes need not be the same color. Treisman (1962) obtained stereo depth from the stereogram in Figure 15.19 and concluded that color rivalry can occur while disparity between the colored regions evokes a sensation of depth. A random-dot stereogram viewed with a red filter in front of one eye and a green filter in front of the other eye appeared to alternate in color by binocular rivalry while the central square appeared to stand out in depth (Ramachandran and Sriram 1972). They concluded that information from the suppressed image remains available for stereopsis. This conclusion is open to question. Rivalry is a spatially local process and may occur between the colored regions lying between the dots but not between the dots themselves. Dots in a random-dot stereogram are not subject to rivalry. Thus, rivalry between colored regions would not affect detection of disparity between the luminance-defined edges of the dots. In the above experiments, stimuli were not closely matched for luminance. Yang et al. (1996) produced a random-dot stereogram in which equiluminant red and green dots were presented on an equiluminant yellow background. Depth was perceived when dot pairs had the same color. When all dot pairs were opposite in color, subjects could not fuse them and could not see the depth. Introduction of some same-polarity luminance contrast between the images restored fusion and stereopsis.
Figure 15.19. Stereopsis with color rivalry. The differently colored rings rival while disparity between their edges evokes a sensation of relative depth. (Redrawn from Treisman 1962)
15.3.8b Color as an Aid in Matching Images
The question addressed in this section is whether stereopsis is facilitated when texture elements defining one depth plane differ in color from those defining other depth planes. The texture elements differ in luminance from the background. Stereopsis produced by stimuli totally lacking in achromatic contrast is discussed in Section 17.1.4. Domini et al. (2000) placed a set of random red dots with a Gaussian distribution of disparities on a dark ground. A set of random green dots was added with a distinct but overlapping Gaussian distribution of disparities. The red and green dots were the same luminance. Subjects could distinguish the two depth distributions. If they had ignored color differences they would have seen a single depth distribution. Inspection of an array in which the red dots were nearer produced a bias to see the green dots as nearer in a subsequent test display. This color-specific depth aftereffect indicates that processing of disparity is associated with color processing. They obtained the same aftereffect when one eye saw chromatic stimuli during adaptation and the other eye saw chromatic stimuli in the test stimulus. Domini et al. concluded that, although image matching is based mainly on luminance, the aftereffect occurs at a higher level in the visual system, where disparity-defined depth is related to color. Over et al. (1973) had failed to find a color-contingent depth aftereffect. However, in their stimuli, color did not play an essential role in depth segregation. Kingdom et al. (2001) set a random pattern of target dots with a defined disparity among random dots with variable disparity. The disparity threshold for detection of depth between the two surfaces was lower when the dots in one depth plane were one color and those in the other depth plane were another color compared with when the dots within each depth plane varied in color.
But this threshold difference showed only when the target dots were sparse so that, when they were all the same color, they could be identified in each monocular image. In other words, stereopsis was facilitated when depth planes were distinguished by color but only when this provided monocular information about the shapes of the depth regions.
If a difference in color helps the visual system to link the images correctly, one would expect a greater number of false matches when a color difference is not present. Den Ouden et al. (2005) produced evidence in support of this proposition. Image matching based on color alone can induce an impression of motion-in-depth. Chen and Cicerone (2002) presented identical 8° random-dot squares to the two eyes. The dots were red except for those in a 2° central circular region, which were green. In successive frames, none of the dots changed position but the color assigned to the dots changed to produce an apparent 0.5° sideways motion of the central region in one eye. This introduced a disparity between the central regions in the two eyes, which caused the central region to appear to move in depth. Also, the achromatic background in the central region appeared green by a process related to neon spreading (Section 22.1.2). Since the red and green dots were not matched for luminance, one cannot conclude that this effect depended only on the chromatic component.
15.3.9 RELATIVE MOTION
Superimposed coplanar textured surfaces moving in opposite directions may create an impression of flicker rather than of motion. Disparity between superimposed textured surfaces moving in opposite directions creates an impression of surfaces in different depth planes with opposed motion (see Section 22.3.2). The present section is concerned with whether relative motion between dichoptic displays with differing disparity helps in the detection of disparity-defined depth. Depth in a dense random-dot stereogram depicting a transparent stationary cylinder was difficult to detect because it was difficult to find corresponding images. However, the depth of the cylinder was readily detected when it rotated about its central axis (Pong et al. 1990). This suggests that differential motion facilitates image matching. Bradshaw and Cumming (1997) approached the same question. They used a random-dot stereogram containing a square-wave modulation of disparity that was too fine (15 cpd) to create horizontal ridges. Instead, it created a surface with an apparent thickness in depth, or pyknostereopsis (Section 18.8.2). The far and near rows of the horizontal square-wave grating each contained two rows of pixels. The pixels moved horizontally either in the same direction at the same or different speeds or in opposite directions. Subjects could detect depth thickening at a lower disparity when the dots in the different depth planes moved in different directions. Differences in speed of dots moving in the same direction did not improve performance. Bradshaw and Cumming concluded that common direction of motion of dichoptic images increases the probability that the images will be linked by the visual system.
Van Ee and Anderson (2001) performed a similar experiment using the projected image of an array of rods differently orientated within a few degrees of the vertical and with disparities that placed them in a series of superimposed planes. They used the following conditions: (1) motion parallax and disparity-defined depth were correlated, (2) motion parallax and disparity were uncorrelated, (3) the rods moved independently left or right at various speeds, (4) half the rods moved to the left and half moved to the right at the same speed, (5) all the rods moved together in the same direction at the same speed, (6) the rods did not move. The perceived depth of the display was greater when the rods moved rather than remained stationary. But, perceived depth was greater when the rods moved in different ways rather than coherently. Thus, disparity-defined depth was facilitated by any pattern of motion that differentiated between the rods, not only opposite motion. Van Ee and Anderson concluded that differential motion helps the visual system to link the images in the two eyes correctly. These motion signals did not help stereoanomalous subjects, who were unable to perceive depth from either crossed disparity or uncrossed disparity. They relied completely on motion parallax (Van Ee 2003). Lankheet and Palmen (1998) asked subjects to discriminate between a dynamic random-dot display in which disparity between dots had a bimodal distribution and a superimposed display in which disparity had a continuous distribution over the same range. Detection of the depth interval between the two displays was degraded by addition of disparity noise. When the dot displays moved at different speeds or in different directions, discrimination improved and was less degraded by disparity noise. In the display used by Bradshaw and Cumming, rows of dots superimposed in the two eyes always moved at the same speed and in the same direction, while dots that differed in motion occupied adjacent rows. 
In the display used by Lankheet and Palmen, dichoptic texture elements with the same motion were spatially correlated in the two eyes, while those with opposite motion were spatially uncorrelated. Thus, linkages between elements that moved in different ways were discouraged because the elements were either not spatially correlated or not superimposed. Under other circumstances, dichoptic images with opposed motion readily combine to produce motion-in-depth (Section 31.3.5). These experiments tell us that disparity-defined depth is more easily detected when the texture elements in two depth planes move relative to each other. The effect can be explained by assuming that the probability of false linkages is reduced when superimposed stimulus elements move in different ways. Thus, texture elements that have the same motion are preferentially linked to form a coherent surface in depth, while those that differ in motion are less likely to be linked. However, there is disagreement about whether the crucial factor is a difference in speed or a difference in direction.
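The idea that common motion reduces false linkages can be illustrated with a toy calculation. The following is only a sketch under stated assumptions: dots are reduced to a horizontal position and a horizontal velocity, the function name `candidate_links`, the disparity limit, and the dot coordinates are all hypothetical, and nothing here is taken from the cited experiments.

```python
from typing import List, Tuple

Dot = Tuple[float, float]  # (horizontal position, horizontal velocity); illustrative units


def candidate_links(left: List[Dot], right: List[Dot],
                    max_disparity: float,
                    use_motion: bool = False,
                    velocity_tol: float = 0.0) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of left/right dots that could be linked."""
    links = []
    for i, (xl, vl) in enumerate(left):
        for j, (xr, vr) in enumerate(right):
            # Position constraint: the pair must lie within the disparity limit.
            if abs(xr - xl) > max_disparity:
                continue
            # Optional motion constraint: link only dots with (nearly) equal velocity.
            if use_motion and abs(vr - vl) > velocity_tol:
                continue
            links.append((i, j))
    return links


# Two superimposed planes (hypothetical data): plane A has disparity +1 and
# drifts rightward; plane B has disparity -1 and drifts leftward.
left_dots = [(0, +1), (6, +1), (3, -1), (9, -1)]
right_dots = [(1, +1), (7, +1), (2, -1), (8, -1)]

position_only = candidate_links(left_dots, right_dots, max_disparity=2)
with_motion = candidate_links(left_dots, right_dots, max_disparity=2, use_motion=True)
print(len(position_only), len(with_motion))
```

With position alone, each left dot has two candidate partners. Requiring a common velocity leaves one candidate per dot, which in this contrived display is the correct one.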
Most likely, this interaction between disparity and motion occurs in V1 because disparity-tuned cells in V1 have monocular receptive fields that match in orientation tuning and direction-of-motion tuning. However, most binocular cells in V1 respond independently to variations in disparity and motion. They are therefore not tuned to a particular combination of disparity and motion (see Read and Cumming 2005b). On the other hand, many binocular cells in MT and MST of the monkey are jointly tuned to particular combinations of disparity and motion (Section 11.5.2). For example, a cell may respond best to a stimulus with crossed disparity moving to the left. A few cells in the visual cortex and elsewhere respond best to dichoptic images that move in opposite directions. These cells therefore respond to motion-in-depth (Section 31.8.2). Some cells in MST reverse their direction-of-motion preference as disparity is reversed. For example, a cell that responded to rightward motion of crossed-disparity stimuli responded to leftward motion of uncrossed-disparity stimuli. These cells therefore respond preferentially to patterns rotating in depth.
15.3.10 EPIPOLAR IMAGES
For any binocular system in which the visual axes intersect, the point of convergence, the two nodal points of the eyes, and the two visual axes lie in the horizontal plane of regard, as in Figure 14.11. A plane containing the two nodal points and any object, whether or not it is in the plane of regard, can be called a binocular plane. The plane of regard passes through the centers of the two foveas. Any other binocular plane cuts the two retinas in horizontal meridians of longitude an equal distance above or below the foveas. These are known as epipolar meridians. It is assumed that the angle of vertical vergence is zero so that the visual axes lie in the same plane. Note that the elevation of binocular planes may be defined in retinocentric coordinates with the foveas as origin, or in headcentric coordinates. Epipolar lines are specified in headcentric coordinates. They are corresponding retinal meridians only when the eyes are in torsional alignment. But, whatever the alignment of the eyes, all objects in a given binocular plane produce images lying in a pair of headcentric epipolar meridians and all objects in other planes necessarily produce images lying in other pairs of epipolar meridians. Thus, for a given image point in one eye, the search for a corresponding image may be confined to the epipolar meridian of the other eye. The horizontal disparity of the images lying in a pair of epipolar meridians is zero only for an object lying on the horizontal or vertical horopter. For all other objects, the disparities of images on a pair of epipolar meridians are not zero. This epipolar constraint reduces the search for corresponding images from two dimensions to one. The constraint may readily be exploited in a machine vision system in which epipolar meridians are easily computed. Prazdny (1983) devised a
purely visual algorithm for assigning images to epipolar meridians. The epipolar constraint is not easy to apply in the visual system because of the difficulty of knowing which retinal meridians lie in a given headcentric binocular plane (Section 14.3.2). The difficulty arises because the eyes, when converged, do not remain in torsional alignment when the gaze is elevated or depressed. According to Listing’s law, the eyes rotate about an axis in a frontal plane fixed to the head. If the eyes were to move this way when converged on a near point they would excyclorotate on upward gaze and incyclorotate on downward gaze. However, we saw in Section 10.1.2d that the deviation from torsional alignment is not as large as one would predict from Listing’s law because Listing’s planes in the two eyes relatively rotate about a vertical axis, like a hinge, when the eyes elevate or depress. Nevertheless, the deviation from Listing’s law is not large enough to ensure that corresponding retinal meridians fall on epipolar meridians. The visual system could compensate for this type of image misalignment in any of the following ways:
1. It could execute torsional eye movements of sufficient magnitude to null the cyclodisparity in the images. Schreiber et al. (2001) found that cyclovergence varied only 0.07° for each degree of cyclodisparity in a stereogram. Cyclovergence of high gain occurs in response to cyclodisparity only for large stimuli when no other stimuli are in view (Howard and Zacher 1991; Howard et al. 1994).
2. It could register the relative torsional positions of the eyes to allow for effects of cyclovergence.
3. It could restrict its search for corresponding images to within the zone over which corresponding retinal meridians rotate as the gaze shifts. Any corrective cyclovergence would reduce the size of this zone (Schreiber and Tweed 2003). However, cyclovergence would not reduce the zone to correspond to epipolar lines.
Schreiber et al.
(2001) devised the stereogram shown in Figure 15.20 to illustrate that the visual system uses strategy (3) rather than strategy (2). The images of the stereogram are incyclorotated 4° (2° in each eye) when fused by convergence and viewed straight ahead. When viewed with depressed gaze but with the stereogram orthogonal to the line of sight, the eyes incyclorotate. This brings the images into closer torsional alignment, and the arrow depicted in the stereogram becomes visible. When viewed with elevated gaze, the eyes excyclorotate. This increases the cyclodisparity in the stereogram and the arrow is not visible. If the visual system registered the cyclovergence of the eyes it would have no difficulty in linking the cyclorotated images.
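The difference between a strictly one-dimensional epipolar search and a search within a limited vertical zone can be sketched in machine-vision terms. This is a minimal illustration, not a model of the visual system. It assumes a uniform vertical offset between the images (real cyclodisparity varies with eccentricity), uses a sum-of-absolute-differences patch cost chosen here for simplicity, and all names, image sizes, and offsets are hypothetical.

```python
import random
from typing import List, Tuple

Image = List[List[float]]


def patch_cost(left: Image, right: Image, row: int, col: int,
               dx: int, dy: int, half: int) -> float:
    """Sum of absolute differences between a left patch and a shifted right patch."""
    total = 0.0
    for r in range(-half, half + 1):
        for c in range(-half, half + 1):
            total += abs(left[row + r][col + c] - right[row + dy + r][col + dx + c])
    return total


def best_match(left: Image, right: Image, row: int, col: int,
               max_dx: int, max_dy: int, half: int = 2) -> Tuple[int, int, float]:
    """Search horizontally, within a band of +/- max_dy rows around the nominal row."""
    best = None
    for dy in range(-max_dy, max_dy + 1):
        for dx in range(-max_dx, max_dx + 1):
            cost = patch_cost(left, right, row, col, dx, dy, half)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


random.seed(0)
size = 40
left_img = [[random.random() for _ in range(size)] for _ in range(size)]
right_img = [[random.random() for _ in range(size)] for _ in range(size)]

# Hypothetical misalignment: the right image is the left image displaced
# 3 pixels horizontally and 1 pixel vertically.
true_dx, true_dy = 3, 1
for r in range(size - true_dy):
    for c in range(size - true_dx):
        right_img[r + true_dy][c + true_dx] = left_img[r][c]

strict = best_match(left_img, right_img, 20, 20, max_dx=5, max_dy=0)  # one row only
zone = best_match(left_img, right_img, 20, 20, max_dx=5, max_dy=2)    # vertical zone
print(strict, zone)
```

The strict search never visits the true match because that match lies one row off the nominal line. The zone search finds it at the price of examining more candidates, which is the trade-off the paragraph above describes.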
Figure 15.20. Effects of cyclovergence on binocular fusion. (Panel labels: No image in depth; Image in depth.) The images in the stereogram are relatively tilted inward by 4°. The obliquely pointing arrow in the stereogram should be visible when the stereogram is viewed with downward gaze, keeping the images normal to the line of sight, as shown below. This is because cyclovergence occurring with downward gaze tends to cancel the relative tilt of the images and bring the disparity within the range of the stereoscopic system. The arrow should be difficult to see with upward gaze because cyclovergence occurring with upward gaze adds to the tilt of the images and makes the disparity too large. (Adapted from Schreiber et al. 2001)
The above observations illustrate that matching images can be found as long as the corresponding retinal meridians are not too far out of alignment. Schreiber et al. concluded that the visual system relies on cyclovergence to keep corresponding retinal meridians within a range that allows corresponding images to be found. The oculomotor system strikes a balance between the simplification of motor control that obeying Listing’s law allows and the simplification of the search for corresponding images. The visual system therefore looks for matching images within the range of vertical disparities produced by cyclovergence. In other words, the visual system must be able to detect horizontal disparities between images with some degree of vertical disparity. The literature on this topic is discussed in Section 18.4.2. Thus, the visual system does not register the cyclovergence state of the eyes when matching binocular images. In other words, it does not register precise headcentric epipolar meridians. Van Ee and van Dam (2003) produced further evidence in support of this conclusion. Other supporting evidence is provided in Section 20.2.2a.
15.3.11 EFFECTS OF TEXTURE INHOMOGENEITY
Anything that helps an observer to distinguish between texture elements should help the image linking process. False image linkages are more likely to occur between a set of identical stimulus elements than between a set of elements that differ in features such as orientation, size,
or motion. It was mentioned in Section 15.3.9 that stereoscopic depth is facilitated when texture elements move at different speeds. We now ask whether differences in orientation help. Ninio and Herlin (1988) and Herbomel and Ninio (1993) created stereograms from different types of textured surface. In each surface, the texture was the same in the two eyes and subjects had to identify whether each of five protuberances was convex or concave. The main factors that shortened the time taken to complete the task were discontinuities of texture and diversity in the positions and orientations of pairs of matching texture elements. Correct depth percepts were evoked more readily by vertical than by horizontal line elements but only when there was some irregularity in line orientation or spacing. Monocular cues to the location of the protuberances played only a minor role. Van Ee and Anderson (2001) also found that the perception of disparity-defined depth between a set of rods is facilitated if the rods vary in their orientation about the vertical. It is not known whether variations in other features, such as flicker and size, help the image-linking process.
15.3.12 ECOLOGICAL FACTORS
In natural scenes, objects below eye level are more likely than objects above eye level to be nearer than the fixation distance (crossed disparity). One reason is that there is almost always a ground surface in view that extends to the feet of the viewer. A second reason is that the vertical horopter is inclined top away from the viewer (Section 14.7). Also, in natural scenes, the number of objects producing crossed disparity increases as fixation distance increases (Yang and Purves 2003). Thus there are more objects beyond the horopter (uncrossed disparities) when the eyes fixate a near object and more objects nearer than the horopter (crossed disparities) when the eyes fixate a far object. This relationship is especially true for objects located below rather than above the horizon. Hibbard and Bouzit (2005) asked whether statistical features of the world bias the interpretation of ambiguous stereograms. The stimulus consisted of 3-cpd square-wave gratings presented 180° out of spatial phase to the two eyes with a central fixation cross. Therefore, the disparity signal was ambiguous with regard to sign. The images could be linked to produce a vertical grating nearer than the fixation cross or a grating beyond the cross. Subjects were more likely to see the grating as closer than the cross when it was presented below eye level and beyond the cross when it was presented above eye level. Also, for stimuli below eye level, subjects were more likely to see the grating as closer than a fixation cross at 5 m and more distant than a cross at 1 m. Thus, a person’s implicit knowledge of the statistics of disparities can bias the interpretation of an ambiguous stimulus.
15.3.13 AMBIGUITY IN LINKING OBLIQUE LINES
A horizontal disparity between two long smooth oblique lines is indistinguishable from a vertical disparity between the lines. Thus, in Figure 15.21, a vertical disparity in the oblique lines, as in (a), produces the same depth as a horizontal disparity in the lines, as in (b). However, a vertical disparity in short oblique lines, as in (c), does not produce a clear impression of depth because the ends of the lines indicate that the disparity is vertical rather than horizontal. A vertical disparity in horizontal lines flanking zero-disparity long oblique lines, as in (d), produces clear depth. In this case, the eyes change their vertical vergence so as to fuse the horizontal lines. A slight tilting of the head one way or the other affects the impression of depth because this has the same effect as a change in vertical vergence. After the horizontal images are vertically fused, the two images contain the same horizontal disparities as those in (b). A vertical disparity in the horizontal lines does not produce clear depth in short oblique lines, as in (e), but it produces depth in a long dotted line if the disparity equals the spacing between the dots, as in (f). Ito (2005) claimed that the depth induced into long oblique lines by vertical disparity in flanking images is not due to vertical fusion of the flanking images. Ito obtained the effect with exposures too brief for vergence and reported that, with long exposure, opposite effects are seen at the same time, as in Figure 15.21d. However, Ito did not control for tilt of the head or for changes in gaze with long exposure. In Figure 15.21a the depth difference between the two fused images is clear only when one gazes first at one fused image and then at the other. Depth is weak or absent when fixation is held firmly on the central cross.
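The equivalence of horizontal and vertical disparity for a long oblique line follows from simple geometry and can be checked numerically. The sketch below assumes an idealized infinite straight line through the origin at a given angle from the horizontal. The function names and the sign convention (rightward and upward shifts positive) are assumptions made for this illustration.

```python
import math


def equivalent_vertical_shift(horizontal_shift: float, angle_deg: float) -> float:
    """Vertical shift that displaces an infinite line at angle_deg (measured from
    the horizontal) to the same place as the given horizontal shift."""
    return -horizontal_shift * math.tan(math.radians(angle_deg))


def line_y(x: float, angle_deg: float, dx: float = 0.0, dy: float = 0.0) -> float:
    """Height at x of a line through the origin after shifting it by (dx, dy)."""
    return (x - dx) * math.tan(math.radians(angle_deg)) + dy


# A 30-degree line shifted 1.5 units horizontally coincides everywhere with the
# same line shifted vertically by the equivalent amount.
angle, h = 30.0, 1.5
v = equivalent_vertical_shift(h, angle)
for x in (-10.0, 0.0, 2.5, 10.0):
    print(x, line_y(x, angle, dx=h), line_y(x, angle, dy=v))
```

Because the two shifted lines coincide at every point, only line ends or other features along the line can reveal which displacement was applied. This is consistent with the loss of the effect for short lines.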
15.4 GLOBAL MATCHING RULES
15.4.1 MINIMIZING UNPAIRED IMAGES
The best linkage between binocular images is the one that links the greatest number of corresponding images in the depth plane to which attention is directed. Objects far from the horopter produce images that are so diplopic that their images remain unlinked by the visual system. Such images are usually blurred because they are not in focus. The blurring probably helps us to disregard them (see Arnold et al. 2007). Also, in most natural scenes there are images in the binocular field of one eye for which there is no matching image in the other eye, even though the images may be in focus. Such unpaired images occur in the neighborhood of a step in depth between two textured surfaces, unless the step is parallel to the interocular axis. These are monocular zones discussed in Section 17.2. In this case, the image linkage process is helped by the fact that monocular zones occur
Figure 15.21. Ambiguous disparity in oblique lines. Panels: (a) vertical disparity on oblique lines; (b) horizontal disparity on oblique lines; (c) vertical disparity on oblique lines; (d) vertical disparity on horizontal lines; (e) vertical disparity on horizontal lines; (f) vertical disparity on horizontal lines equals the interdot spacing. A vertical disparity between long oblique lines in (a) creates the same depth as a horizontal disparity between oblique lines in (b). However, a vertical disparity between short oblique lines, as in (c), does not create clear depth because the vertical disparity can be seen between the line ends. Depth is created by vertical disparity between the flanking horizontal lines with no disparity in the oblique lines, as in (d). In this case, vertical fusion of the horizontal lines creates the same disparity as in (b). The same disparity in short flanking lines, as in (e), creates little or no depth. A vertical disparity in the horizontal lines creates depth in a line of dots only when the vertical disparity is equal to the interdot spacing, as in (f).
in coherent patches near disparity discontinuities and obey other rules described in Section 17.2.1. Monocular zones confirm that a correct linkage has been found, since coherent regions of unpaired images are not likely to arise by chance. Under certain circumstances a compelling impression of depth arises from monocular zones in the absence of disparity (Section 17.3). Noncorresponding images falling on corresponding retinal points arise from two objects outside the horopter and tend to be distributed at random. When the objects
have different shapes, the images undergo binocular rivalry, although it is not something that we are normally conscious of.
15.4.2 COARSE-TO-FINE SPATIAL SCALES
There has been much theorizing about whether disparities are extracted at a coarse spatial scale before they are extracted at a fine spatial scale. Evidence for distinct spatial-scale channels for stereopsis will be reviewed in Section 18.7. A
hierarchical, coarse-to-fine search within the domain of spatial scale is an algorithm used in computer vision (Rosenfeld and Vanderbrug 1977). However, the visual system is not a Fourier analyzer, so we cannot define visual spatial scale in terms of spatial frequencies. There are four related criteria of spatial scale.
1. The size of stimuli. The stimulus may be a single object or a textured surface. A large homogeneous object does not stimulate receptive fields within the boundaries of its image. On the other hand, small objects stimulate small receptive fields more effectively than they stimulate large receptive fields. A given cell is stimulated most effectively when the stimulus has the same structure as the cell’s receptive field and fills the receptive field.
2. The density of stimulus elements. Consider the density of textural elements in a random-dot stereogram. Well-spaced and pronounced luminance-defined features in a surface are less likely to give rise to false linkages than are closely spaced features. Once found, linkages between well-spaced features could propagate along connected edges and lines or to other edges and lines in the same depth plane. Linkages between well-spaced and pronounced features could guide the process of linking closely spaced features in a given depth plane, especially for similar features that could be linked in more than one way. Mallot et al. (1996a) used a stereogram containing periodic patterns of differing spatial scale. Well-spaced features provided a more efficient basis for finding the correct match than did high-density features. However, subjects sometimes used fine detail to disambiguate the matching of coarse detail. The general effects of spatial scale on stereopsis are discussed in Sections 18.6.2 and 18.7.
3. The spatial frequency content of stimuli. Marr and Poggio (1979) proposed that low spatial-frequency features of binocular images are linked before high spatial-frequency features. There is a natural confound between spatial frequency defined by element density and spatial frequency defined by sharpness of elements. A display with well-spaced elements has low spatial frequency content even if the elements have sharp boundaries. Furthermore, closely packed elements are not discriminable unless they have sharply defined edges.
4. The spatial scale of disparity. The spatial frequency of disparity-defined depth modulations is discussed in Section 18.6.3. The next section considers the influence of the spatial scale of disparity on the process of linking binocular images.
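A coarse-to-fine search of the kind used in computer vision can be sketched for a one-dimensional signal. This is an illustrative toy, not the Marr and Poggio model or any published algorithm. Block averaging stands in for a low spatial-frequency channel, mean squared difference is an assumed matching cost, and the function names, signal length, and disparity value are hypothetical.

```python
import random
from typing import Iterable, List


def block_average(signal: List[float], factor: int) -> List[float]:
    """Crude coarse-scale version: mean of non-overlapping blocks of samples."""
    return [sum(signal[k * factor:(k + 1) * factor]) / factor
            for k in range(len(signal) // factor)]


def mismatch(left: List[float], right: List[float], d: int) -> float:
    """Mean squared difference between left[i] and right[i + d] over valid i."""
    pairs = [(left[i], right[i + d]) for i in range(len(left))
             if 0 <= i + d < len(right)]
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def best_disparity(left: List[float], right: List[float],
                   candidates: Iterable[int]) -> int:
    """Candidate disparity with the smallest mismatch."""
    return min(candidates, key=lambda d: mismatch(left, right, d))


def coarse_to_fine(left: List[float], right: List[float],
                   factor: int = 4, coarse_range: int = 8) -> int:
    # Stage 1: wide search at the coarse scale (few samples, few candidates).
    coarse = best_disparity(block_average(left, factor),
                            block_average(right, factor),
                            range(-coarse_range, coarse_range + 1))
    # Stage 2: narrow search at the fine scale, centred on the coarse estimate.
    centre = coarse * factor
    return best_disparity(left, right, range(centre - factor, centre + factor + 1))


random.seed(1)
n, true_d = 512, 13
left_sig = [random.random() for _ in range(n)]
right_sig = [random.random() for _ in range(n)]
for i in range(n - true_d):
    right_sig[i + true_d] = left_sig[i]  # right signal is the left one shifted by true_d

estimate = coarse_to_fine(left_sig, right_sig)
print(estimate)
```

The fine-scale stage examines only nine candidate disparities around the coarse estimate, rather than every disparity in the full range covered by the coarse stage.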
15.4.3 COARSE-TO-FINE DISPARITIES
To a first approximation the disparity between the images of an object increases as a linear function of the distance of the object from the horopter. The process of linking binocular images in one of several depth planes is simplified if it is performed within the channel tuned to the magnitude of disparity within that depth plane. A match found between a pair of images with a given disparity applies to all image pairs within a local area in the same depth plane. There is also a natural correlation between disparity magnitude and image blur, since images of objects farther from the plane of zero disparity are correspondingly out of focus. Images of out-of-focus objects lack fine detail (steep luminance gradients) and cannot be detected by small receptive fields. This natural constraint could be used to advantage by the image-linking system. Linking of fine disparities could be done within the fine-scale system (small receptive fields in the central retina), and linking of coarse disparities could be done within the coarse-scale system. Such a coupling of two correlated stimulus features would help in the segregation of distinct depth planes. Evidence that disparity processing is carried out in distinct disparity-tuned and size-tuned channels is reviewed in Section 18.7. The spatial frequency of disparity modulation is discussed in Section 18.6.3. Mayhew and Frisby (1980, 1981) implemented a computer algorithm based on cooperation between different visual channels, which they called “stereoedge.” The algorithm could be reiterative in that it started by selecting all feature pairs having the same disparity by the nearest-neighbor rule. It then retained only those conforming to a well-formed surface. The criterion used to define a well-formed surface could be figural continuity or meaningfulness. Finally, the algorithm looked for non-nearest-neighbor matches conforming to the same surface.
When there are several objects at different depths, the best procedure is to converge the eyes on a selected object and attend to images with zero or near-zero disparity. Such images fall on or near corresponding points and are fused and linked automatically. No decision process is required. The process fails when the stimulus is an array of similar elements, especially regularly distributed elements. In this case there is nothing to help the eye converge correctly on a particular object. In most natural scenes, images that match in size, shape, and color are corresponding images. By using disparity detectors narrowly tuned about zero disparity, most of the spurious linkages between neighboring image features are eliminated. Once a zero-disparity plane is well defined, the system can explore linkages sharing other disparities. The whole process is greatly simplified when the eyes converge sequentially on each depth plane. Vergence eye movements to a given plane could be evoked by disparities between well-spaced and prominent features, leaving the linking of finer and less well-pronounced features until after
the eyes have moved. Once the images in one depth plane have been linked, the unique-linkage rule forbids those same images from being linked to other images. All other image points (except in monocular zones) are disparate, and further linkages will be sought and vergence movements evoked only from this residual pool of images. As the eyes converge from one plane to another, the sets of corresponding images and the depth information extracted from each of them could be retained in a buffer memory, leading to a progressive reduction in the pool of points that have not been linked. Tyler (1990) reported that people became less sensitive to depth modulations as stimulus duration was reduced from 1 s to 50 ms. However, the form of the function relating depth sensitivity to the spatial frequency of modulation of disparity was the same for different stimulus durations. Glennerster (1996) found that a square with uniform 4 arcmin of disparity in a random-dot stereogram was detected in the same time as a similar square with alternating columns of 6 and 3 arcmin of disparity. The two stimuli had the same mean disparity pedestal. However, both stimuli were seen sooner than a square with alternating columns of plus and minus 1° of disparity. The pedestal disparity of this square was zero. Glennerster argued that subjects had to detect the depth modulations within this square before they could detect the square and that fine depth modulations are detected before coarse depth modulations. But another possibility is that the three squares were detected by the spatial decorrelation of the dots within them before depth was perceived. The mean magnitude of decorrelation was higher for the two squares with a disparity gradient than for the square with zero mean disparity. Further experiments are required.
15.4.4 EDGE CONTINUITY
The perception of edges provides strong evidence that images have been correctly linked. The image-linking process is facilitated when each image contains continuous edges or contours. For instance, depth in a random-dot stereogram is seen more rapidly when a continuous line surrounds the monocular images that define the disparate region (Julesz 1960). Depth discontinuities normally occur along the edges of objects and are often accompanied by discontinuities of texture, color, and motion. Even in the absence of monocular discontinuities, a continuous step of disparity within a random-dot stereogram confirms that the correct linkage has been found, since a continuous step of disparity is unlikely to occur in incorrectly linked images. Local discontinuities of disparity are not suppressed; it is only that a continuous depth edge prevents other interpretations of the matched points that define that edge. Local perturbations in an edge are usually detected. Discovery of a depth edge prompts a search for its continuation,
especially when the edge is interpreted as belonging to an identifiable object. We will see in Section 22.2 that continuous depth edges are perceived across gaps when the resulting percept is one of a complete figure in depth. The “stereoedge” algorithm developed by Mayhew and Frisby (1980) exploits this constraint. The effects of continuity constraints on depth judgments are discussed in Section 22.1.1.
15.4.5 SURFACE SMOOTHNESS
The disparity produced by a smooth textured surface is continuous over each local neighborhood—it is differentiable. The disparity field produced by objects viewed through a textured transparent surface is not continuous. However, there is continuity within the set of elements that define the transparent surface. This is continuity over a particular plane in 3-D space. A dense 3-D array of random dots can be said to contain all possible disparity surfaces, since any plane cutting such a display contains a set of points with a constant or continuously varying disparity. We might fleetingly see these planes, just as we fleetingly see coherent patterns of motion in a field of randomly moving dots, but the general impression is of a 3-D array of dots. Most of the time, the natural world is sparsely populated with objects that are seen against opaque surfaces, such as walls, and the ground, or against the sky. The presence of isolated and relatively smooth surfaces simplifies the image-linking process. Once a surface is found, the image points constituting it become linked, and the unique-linkage rule forbids other linkages for these points. In their computational model of the stereoscopic system, Marr and Poggio (1976) postulated a surface continuity constraint in conjunction with the unique-linkage rule and the similarity constraint (see also Marr et al. 1978). The continuity principle was based on the idea that people assume that most surfaces are smooth. But this assumption is not required. Even if most surfaces were highly convoluted or discontinuous, detection of a surface with constant or gradually changing disparity would be taken as evidence that the images had been correctly linked. The only assumption required is that a highly structured, low-entropy, sensory signal is most likely to arise from a structure in the world. Points arranged in a regular 3-D lattice, even though they do not define a surface, may also confirm that a proper link has been made. 
The crucial factor is not the smoothness of a 2-D surface but the regularity of the disparity field, whether in two dimensions or three. We can readily distinguish between continuous and discontinuous disparity fields. The world contains more smooth surfaces than irregular surfaces. Through experience, this will add to the probability that a linkage that yields a smooth surface is correct. Marr and Poggio also argued that slightly mismatched images on an otherwise smooth surface tend to be perceived as lying on the surface by a cooperative process.
Neighboring binocular cells with similar disparity tuning were said to be mutually facilitatory, and those with different tuning were said to be mutually inhibitory. A few images with an unambiguous disparity can bias the interpretation of a larger array of images with ambiguous disparity (Julesz and Chang 1976). But this does not support Marr and Poggio’s view, since the biasing process does not involve a perceived change in disparity between a set of matched points but rather a change in which sets of images are linked. Only one of the sets of matches creates a smooth surface, and therefore that is the one preferred (see also Kontsevich 1986). Perceptual smoothing occurs when the difference in disparity is too small to be resolved. This is disparity averaging described in Section 18.8. Detected relative disparities are, if anything, perceptually enhanced. The idea that smooth disparity gradients have precedence over disparity discontinuities is an unwarranted extension of the continuity principle. The continuity principle does not forbid us from seeing discontinuities when they exist. The world is full of disparity discontinuities and we have no difficulty seeing them. In fact, discontinuities produce stronger signals than smooth gradients because they are more informative, just as luminance discontinuities are more informative than luminance gradients. The idea that smooth disparity gradients have precedence led to the proposition that disparity detectors with similar disparity tuning are mutually facilitatory while those differing in tuning mutually inhibit each other (Dev 1975; Nelson 1975; Marr and Poggio 1976). Mutual facilitation occurs in the sense that linked images conforming to a disparity gradient confirm that the correct linkage has been found. Other linkages are then not possible for those image points. Mutual inhibition is not required.
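The cooperative scheme described above (facilitation between neighboring cells tuned to the same disparity, inhibition between cells lying on the same line of sight) can be illustrated with a toy network. The following is only a minimal sketch under stated assumptions, not Marr and Poggio's published implementation: it uses a single row of image points with wraparound, a uniform true disparity, "dots" with 16 grey levels rather than binary dots, and hypothetical values for the neighborhood radius, inhibition weight (`epsilon`), and threshold (`theta`). All function names are invented for the illustration.

```python
import random


def initial_matches(left, right, disparities):
    """c0[x][k] is 1 where left[x] matches right[x - d] for d = disparities[k]."""
    n = len(left)
    return [[1 if left[x] == right[(x - d) % n] else 0 for d in disparities]
            for x in range(n)]


def cooperative_step(state, c0, disparities, radius, epsilon, theta):
    """One synchronous update of every (position, disparity) cell."""
    n = len(state)
    new_state = [[0] * len(disparities) for _ in range(n)]
    for x in range(n):
        for k, d in enumerate(disparities):
            # Facilitation: neighboring cells tuned to the same disparity.
            excite = sum(state[(x + j) % n][k]
                         for j in range(-radius, radius + 1) if j != 0)
            # Inhibition: cells with other disparities on the same left-eye
            # line of sight (same x) or right-eye line of sight (same x - d).
            inhibit = 0
            for k2, d2 in enumerate(disparities):
                if k2 != k:
                    inhibit += state[x][k2]
                    inhibit += state[(x - d + d2) % n][k2]
            net = excite - epsilon * inhibit + c0[x][k]
            new_state[x][k] = 1 if net >= theta else 0
    return new_state


def run_network(left, right, disparities, radius=6, epsilon=0.5,
                theta=9.0, iterations=5):
    """Return the initial match array and the state after the iterations."""
    c0 = initial_matches(left, right, disparities)
    state = [row[:] for row in c0]
    for _ in range(iterations):
        state = cooperative_step(state, c0, disparities, radius, epsilon, theta)
    return c0, state


def decode(state, disparities):
    """Disparity at each position, or None unless exactly one cell is active."""
    decoded = []
    for row in state:
        active = [d for d, on in zip(disparities, row) if on]
        decoded.append(active[0] if len(active) == 1 else None)
    return decoded


def make_pair(n=64, levels=16, true_d=1, seed=0):
    """Random row of grey levels and a copy displaced by a uniform disparity."""
    rng = random.Random(seed)
    left = [rng.randrange(levels) for _ in range(n)]
    right = [left[(x + true_d) % n] for x in range(n)]
    return left, right


left, right = make_pair()
disparities = [-2, -1, 0, 1, 2]
c0, final = run_network(left, right, disparities)
print("initial candidate matches:", sum(map(sum, c0)))
print("decoded disparities:", decode(final, disparities))
```

In this toy case the spurious matches are sparse and scattered, so they receive little support from neighbors in their own disparity layer and drop below threshold, while the coherent layer sustains itself. The sketch does not address disparity discontinuities or transparent surfaces, which are the cases the surrounding argument is concerned with.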
When a transparent textured surface is superimposed on a more distant textured surface, image pairs with one disparity are interspersed among image pairs with another disparity. We have no difficulty seeing the two surfaces, as long as dot density is not too great. This shows that the disparity in one set of image points does not suppress the disparity in an interspersed set of image points (Prazdny 1985b). The phenomenon of depth contrast discussed in Section 21.5 suggests that neighboring disparities enhance rather than inhibit each other. Superimposed surfaces with random-texture elements are seen as two surfaces once an appropriate linkage has been found for each set of interspersed elements. Other linkages for that set of elements are not allowed. In each local region of overlapping transparent surfaces there is severe disparity discontinuity, but each surface is continuous within itself. We will see in Section 22.2 that people prefer to interpret ambiguous stereoscopic displays in terms of flat surfaces rather than in terms of smoothly curved surfaces. If there are no regions of coherent disparity or other regularities in the scene, as for instance in a swarm of gnats or a cloud of snowflakes, we must use constraints other than surface continuity in 3-D space to guide the search for correct image matches.
15.4.6 IMAGE LINKAGE DETERMINED BY CONVERGENCE
Two identical objects aligned in depth in the midline, as shown in Figure 15.22, present special problems. The nearest-neighbor and image-similarity rules do not ensure that only corresponding images are linked. When the eyes converge on point A, well beyond the two pins, the left eye sees the pins on the right of A and the right eye sees them on the left, making a total of four images. Thus, the images of both pins have crossed disparity. All pairs of images may be too far apart for fusion or for the disparity-detection system. The images appear somewhere beyond the actual pins, and their apparent direction conforms to the rules of egocentric direction (Section 16.7). Their appearance beyond the actual pins is probably due to the effect of far convergence. We also see four images when the eyes converge on point C, well in front of the two pins, except that now the image pairs are uncrossed and appear somewhere in front of the actual pins. When the eyes converge on the far pin, its images fuse and those of the near pin have a crossed disparity.
Figure 15.22. The double-nail illusion. Two pins are held in the median plane about 30 cm from the eyes, with one pin about 2 cm further away than the other. Four images are seen when the eyes converge at distance A or C. When the far pin is fixated, its images fuse and the near pin has a crossed disparity. When the near pin is fixated, the far pin has an uncrossed disparity. When convergence is at B, about halfway between the pins, the two pairs of images are matched inappropriately and appear as two pins side by side. A third pin between the other two helps bring the point of fixation to the intermediate position. (Diagram labels, from far to near: A, far pin, B, near pin, C.)
Similarly, when the eyes converge on the near pin, its images fuse and the images of the far pin have an
uncrossed disparity. In both cases, a pair of fused images separates the disparate images, and the disparate images project to opposite cerebral hemispheres. In these cases, phase-disparity detectors could not code depth, and it is difficult to see how position-disparity detectors could be involved. Vergence could be involved in two ways. First, the asymmetry of the whole stimulus could cause the eyes to converge toward restoring overall image symmetry. This would induce disparity into the images of the object that the observer is trying to fixate. It may be this disparity that codes relative depth rather than the disparity between the images of the object that the observer is not attempting to fixate. This issue is discussed in Section 17.6.2. Midline stereopsis may also be aided by changing vergence between the two objects. The vergence changes may code relative depth directly or they may help by sequentially bringing the images of each object into binocular register (Section 18.10.4). Although vergence may contribute to the detection of the relative depth of midline objects, it is not necessary since depth can still be detected when the possible contribution of vergence has been prevented (Ziegler and Hess 1997). There remains one special case: convergence on point B, midway between the two pins. Normally, the eyes converge on one or the other pin, and it is difficult to converge the eyes on a point midway between them. A small fixation object at the halfway point helps. However, if the pins are about 1 cm apart in depth at a viewing distance of about 30 cm, the gaze tends to slip into a point between them, even when no fixation point is provided. The image of the near pin in the left eye fuses with the image of the far pin in the right eye, and vice versa. There are no unfused images and hence no disparities.
This set of stimuli acts as a strong vergence lock since both pairs of images are fused, rather than only one pair, as in the other ways of linking the images.
The appearance is that of two pins, side-by-side in the plane of fixation. This is known as the double-nail illusion. It was first described by French (1923, p. 231) and has also been described by Cogan (1978), Krol and van de Grind (1980), and Mallot and Bideau (1990). The illusion occurs because a side-by-side pair of pins and an in-line pair of pins create identical proximal stimuli (Ono 1984). This illusion belongs to the class of illusions that arise because two distal stimuli create the same proximal stimulus. Other examples are the Ames room and Ames window, which depend on the projective equivalence of different distal stimuli (Section 29.3.2). The side-by-side pins are ghost images in the sense that they arise from incorrectly linked images. However, we do not see them at the same time as pins in other positions, and the unique-linkage rule is preserved. If the pins differ in shape, the illusion does not work, because the images of an in-line pair of pins are no longer identical to those of a side-by-side pair. If one pin is taller than the other, the images of the top of the taller pin form a disparate pair. As Krol and van de Grind reported, this part of the pin appears to float out of the plane of convergence to the location in depth appropriate to the disparity of its images. If the two pins are tilted slightly in a frontal plane in opposite directions, one sees two side-by-side pins inclined in depth in opposite directions. This occurs because the proximal stimuli for the two situations are identical. There is a large literature on algorithms for finding corresponding images for computer reconstruction of stereo depth. For a selection of papers on stereo algorithms see volume 47 of the International Journal of Computer Vision and Di Stefano et al. (2004). Intel has an open-source library of stereo drivers with camera calibration, image rectification, epipolar alignment, and matching (http://www.intel.com/research/mrl/research/opencv/).
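The projective equivalence behind the double-nail illusion can be checked with a little plane geometry. The sketch below is illustrative only and rests on assumptions not given in the text: the eyes are treated as points on a horizontal baseline, the interocular distance is taken as 6.5 cm, and the pins are placed at 30 cm and 31 cm. The function names are hypothetical. Each "ghost" pin is found where the left eye's line of sight through one pin crosses the right eye's line of sight through the other.

```python
import math


def intersect(eye_a, target_a, eye_b, target_b):
    """Intersection of the line eye_a->target_a with the line eye_b->target_b."""
    (x1, y1), (x2, y2) = eye_a, target_a
    (x3, y3), (x4, y4) = eye_b, target_b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        raise ValueError("lines of sight are parallel")
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    px = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
    py = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
    return px, py


def ghost_pins(interocular, near, far):
    """Locations where mismatched lines of sight to two midline pins cross.

    Coordinates are (lateral position, distance from the interocular axis).
    """
    left_eye = (-interocular / 2.0, 0.0)
    right_eye = (interocular / 2.0, 0.0)
    near_pin = (0.0, near)
    far_pin = (0.0, far)
    # Left-eye image of the near pin paired with right-eye image of the far pin.
    ghost_a = intersect(left_eye, near_pin, right_eye, far_pin)
    # Left-eye image of the far pin paired with right-eye image of the near pin.
    ghost_b = intersect(left_eye, far_pin, right_eye, near_pin)
    return ghost_a, ghost_b


def directions(eye, points):
    """Sorted visual directions (radians from straight ahead) of the points."""
    return sorted(math.atan2(px - eye[0], py - eye[1]) for px, py in points)


# Assumed values: 6.5 cm interocular distance, pins at 30 cm and 31 cm.
ghost_a, ghost_b = ghost_pins(6.5, 30.0, 31.0)
print("ghost pins (cm):", ghost_a, ghost_b)
```

Under these assumptions the two ghost pins lie at the same distance, the harmonic mean of the two pin distances, and are placed symmetrically about the midline. From each eye, their visual directions coincide with those of the real in-line pins, which is the sense in which the two arrangements create identical proximal stimuli.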
16 CYCLOPEAN VISION
16.1 The cyclopean domain
16.1.1 Introduction
16.1.2 Types of dichoptic cyclopean stimuli
16.2 Acuities in the cyclopean domain
16.2.1 Cyclopean vernier acuity
16.2.2 Discrimination of cyclopean features
16.3 Cyclopean figural effects
16.3.1 Figural effects with binocular composites
16.3.2 Cyclopean geometrical illusions
16.3.3 Cyclopean aftereffects
16.4 Cyclopean motion
16.4.1 Types of cyclopean motion
16.4.2 Dichoptic apparent motion
16.4.3 Aftereffects of dichoptic apparent motion
16.5 Motion of cyclopean shapes
16.5.1 Sensitivity to cyclopean motion
16.5.2 Speed discrimination of cyclopean motion
16.5.3 Contrast effects from cyclopean motion
16.6 Cyclopean texture segregation
16.6.1 Cyclopean pop out
16.6.2 Inverse cyclopean phenomena
16.7 Binocular visual direction
16.7.1 Introduction
16.7.2 Headcentric direction and the egocenter
16.7.3 Visual directions of disparate images
16.7.4 Perceived direction of monocular stimuli
16.7.5 Effects of phoria, fixation disparity, and strabismus
16.7.6 Locating the cyclopean eye
16.7.7 Controversy over the cyclopean eye
16.8 Utrocular discrimination
16.1 THE CYCLOPEAN DOMAIN
16.1.1 INTRODUCTION
The Cyclops was a one-eyed giant in Homer’s Odyssey. Cyclopia is a birth defect in which there is only one central eye (Section 6.4.2a). Hering and Helmholtz used the term “cyclopean eye” to denote the point midway between the eyes that serves as a center of reference for headcentric directional judgments. Ptolemy and Alhazen had the same idea (Sections 2.10.1, 2.10.2). Julesz (1971) generalized the term “cyclopean” to denote the processing of visual information after inputs from the two eyes are combined. He also defined a cyclopean stimulus as one formed centrally and which is not present in either retinal image. A cyclopean stimulus is necessarily processed at or beyond the primary visual cortex. There are three ways in which a stimulus can be presented to the visual cortex without a corresponding stimulus on the retina.
1. Paralysis of an eye. For instance, if a bright light is presented to one eye and the afterimage is still visible after the retina of that eye has been pressure paralyzed, one must conclude that afterimages can be generated at a central site.
2. Direct stimulation of the visual cortex. Direct stimulation of the visual cortex by an electric current produces visual phosphenes. Migraine fortification illusions are naturally occurring cyclopean stimuli, since they arise from a direct disturbance of the visual cortex.
3. Use of dichoptic stimuli. In this procedure, which may be called the dichoptic cyclopean procedure, distinct images are presented to the two eyes in a stereoscope to create an effect not evident, in whole or in part, in either monocular image. A dichoptic cyclopean stimulus can be a region of binocular rivalry, binocular disparity, dichoptic motion or flicker, or any combination of these features. Once synthesized, a cyclopean shape can be made to move by simply moving the dichoptic boundaries that define it. Dichoptic stimuli will now be discussed in more detail.
16.1.2 TYPES OF DICHOPTIC CYCLOPEAN STIMULI
16.1.2a Levels of Binocular Processing
There are two basic levels of binocular processing: noninteractive and interactive. A noninteractive dichoptic stimulus is one in which the monocular images do not affect each other. For instance, lines presented to one eye can be combined dichoptically with lines in the other eye to form a dichoptic composite shape, as in Figure 16.1. The dichoptic shape is cyclopean because it is not present in either monocular image, but it is not exclusively cyclopean because combining stimuli in the same eye also produces it. However, vergence eye movements disrupt the cyclopean shape but not the monocular shape. Binocular composite shapes cannot be constructed by superimposing intersecting contours, because intersecting contours rival.
A second type of noninteractive binocular process is comparison of dichoptic stimuli. For instance, alignment of nonius lines involves interocular comparison (Section 14.6.1). The task is essentially the same as when both stimuli are presented to the same eye (vernier acuity). The main difference is that changes in vergence affect the composite stimulus but not the same stimulus seen by one eye.
In an interactive binocular process the appearance of at least one dichoptic stimulus is changed in the presence of the other. For example, certain geometrical illusions may be formed as binocular composites (Section 16.3.1). Also, cyclopean apparent motion can be elicited by dichoptic alternation of two flashing lights with a spatial offset between them. Each eye sees only a stationary flashing light. Motion is not perceived until the images are combined. Cyclopean motion may be created by dichoptically combining moving stimuli to form motion in a direction not evident in either monocular image. These dichoptic effects are not exclusively cyclopean because combining the same stimuli in one eye produces similar effects. However, careful study of an interactive dichoptic effect is required to reveal whether the neural processes differ from those operating in the monocular image.
Figure 16.1. A dichoptic composite stimulus. Fusion of the two upper displays produces the lower shape.
An exclusively binocular process is a dichoptic interactive process that is not evident when the stimuli are combined in one eye. An exclusively binocular process depends on a cortical mechanism activated only by binocular inputs. For example, the sensation of depth arising from static binocular disparity is an exclusively binocular process, as is binocular rivalry produced by combining distinct images in the two eyes. The detection of binocular disparity is not exclusively binocular, because the difference in location of the images is evident when dichoptic stimuli are combined in one eye. However, some cells in the visual cortex are specialized for detection of binocular disparity, and this mechanism differs from that involved in the spatial discrimination of monocular stimuli. Thus, in these respects, the detection of disparity is an exclusively binocular process. Dichoptic color mixing has some features in common with monocular color mixing, but in other respects it is exclusively binocular (Section 12.2).
16.1.2b Superimposed Dichoptic Stimuli
Similar stimuli fuse into a single stimulus when they are superimposed, either in the same eye or dichoptically. Dichoptically superimposed dissimilar stimuli engage in binocular rivalry. Under certain conditions, dissimilar stimuli superimposed in one eye engage in monocular rivalry (Section 12.3.8). Dichoptically superimposed regions differing in luminance produce binocular luster. Superimposed stimuli in the same eye can produce an impression akin to binocular luster (Section 12.3.8c).
16.1.2c Binocular Disparity
The disparate areas in a random-dot stereogram are evident when the two halves of the stereogram are superimposed in one eye so that, in this sense, disparity detection is not exclusively binocular. On the other hand, although binocular disparity detection has features in common with monocular spatial resolution, it has features unique to the binocular process. Physiological evidence has revealed that it involves specialized disparity detectors. The sense of depth arising from static disparity is exclusively cyclopean, although a similar perceptual effect can be produced by monocular motion parallax (Section 28.1.2). The depth seen in both standard stereograms and random-dot stereograms is cyclopean. The hallmark of a random-dot stereogram is that the shape defined by a region of disparity is not evident in either monocular image. In a pictorial stereogram, the locations of disparity discontinuities are usually evident as luminance boundaries in each monocular image.
Cyclopean shapes defined by binocular rivalry can be formed by placing a region of dichoptically uncorrelated texture elements within a surround of correlated elements, as in Figure 16.2B. This reveals whether boundaries defined by rivalry can generate stereoscopic depth. The correlated rectangular region in Figure 16.2A produces depth but the uncorrelated region in Figure 16.2B appears to float at an indeterminate depth with respect to the correlated region. This has been called rivaldepth (O’Shea and Blake 1987). Some people have a consistent impression of depth order with this stimulus but, as is pointed out in Section 17.5, this is probably due to fixation disparity. When the boundary of a disparate uncorrelated region is outlined, as in Figure 16.2C, the outline appears in depth but the texture within it still appears at an indeterminate depth. A consistent impression of stereoscopic depth occurs when a region of dichoptically uncorrelated dots has a different spatial frequency with respect to that of an uncorrelated surround, as in Figure 17.12. But in this case the disparate region is not cyclopean, since it is visible in both images.
Figure 16.2. Cyclopean shapes defined by rivalry. (A) The correlated images produce a rectangle in depth. (B) A central rectangle of uncorrelated dots has a disparity, but its perceived depth is indeterminate. (C) An outline round the uncorrelated rectangle appears in depth but not the dots in the rectangle.
An interesting phenomenon occurs when black annuli filled with white in one eye are combined with black disks in the other eye, as in Figure 17.39. The outer rims fuse to form a set of holes through each of which one sees a rivalrous region of black and white. Howard (1995) dubbed this the sieve effect (Section 17.5). Here is a case in which opposite luminance polarity in the two eyes creates an impression of depth in its own right. This is an exclusively cyclopean phenomenon.
16.1.2d Cyclopean Visual Features
Tyler (1975b) introduced the concept of a hypercyclopean level of analysis. He defined a hypercyclopean feature as a feature other than depth that is produced by a pattern of differences between the images in the two eyes that is not visible when one eye is closed. The dichoptic difference could be spatial disparity, binocular rivalry, motion, or color. The feature can be shape, orientation, curvature, size, spatial frequency, or motion. Rather than the term “hypercyclopean,” the terms “cyclopean shape,” “cyclopean tilt,” “cyclopean spatial frequency,” and “cyclopean motion” will be used here. Tyler (1991a) reviewed cyclopean vision. Random-dot stereograms contain a cyclopean shape defined by disparity. The shape can be tilted, curved, spatially modulated, or moved by adjusting the disparity boundaries that define it. A cyclopean shape defined in terms of a feature other than disparity, such as motion, can have disparity imposed on it (see Section 17.1.5). Dichoptic combination of patches that differ in color produces color rivalry or color mixing. Binocular yellow, formed by presenting red to one eye and green to the other, may be regarded as a cyclopean color. Binocular color mixing does not obey the same laws as ordinary color mixing so, in this respect, it is an exclusively cyclopean process (Section 12.2). Cyclopean shapes defined by color rivalry can be constructed from dichoptic arrays of different colored disks, with the colors matching between the two eyes in one region and not matching in another region (Figure 16.3A). The cyclopean region defined in this way is not as clear as that defined by reversed luminance polarity, as shown in Figure 16.3B. This is probably because color rivalry is not a preattentive feature, whereas reversed luminance polarity is (Section 22.8.2). We failed to obtain stereopsis by imposing a disparity on a region of color rivalry.
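The construction of a random-dot stereogram containing a disparity-defined rectangle can be sketched in a few lines. This is a minimal illustration under assumptions of our own (binary dots, whole-pixel disparity, a leftward shift of the region in the right-eye image, and hypothetical function and parameter names), not a description of any particular published stimulus. The region is displaced in one image only, and the strip uncovered by the displacement is refilled with fresh random dots so that neither image alone contains a boundary.

```python
import random


def make_rds(width, height, rect, disparity, seed=0):
    """Return (left, right) binary dot arrays forming a random-dot stereogram.

    rect is (x0, y0, x1, y1) with half-open ranges. Inside rect, the dots of
    the left image appear in the right image shifted leftward by `disparity`
    pixels. Everywhere else the two images are identical.
    """
    x0, y0, x1, y1 = rect
    if disparity <= 0 or x0 - disparity < 0 or x1 > width or y1 > height:
        raise ValueError("rectangle and disparity must fit inside the image")
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    for y in range(y0, y1):
        # Copy the region's dots to their displaced positions.
        for x in range(x0, x1):
            right[y][x - disparity] = left[y][x]
        # Refill the strip uncovered by the shift with uncorrelated dots.
        for x in range(x1 - disparity, x1):
            right[y][x] = rng.randint(0, 1)
    return left, right


# Assumed example values: a 40 x 20 array with a 15 x 10 rectangle shifted 3 pixels.
left, right = make_rds(40, 20, rect=(15, 5, 30, 15), disparity=3)
for row_l, row_r in zip(left, right):
    print("".join(".#"[v] for v in row_l), " ", "".join(".#"[v] for v in row_r))
```

Moving or reshaping the cyclopean figure amounts to changing `rect` (or making the shift vary with position) before the displacement is applied, which corresponds to adjusting the disparity boundaries as described above.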
This chapter deals with the perception of cyclopean visual features, other than depth, that are defined by binocular rivalry or disparity. The effects to be discussed occur between shapes defined by disparity but do not involve interactions between disparity detectors after the point at which the cyclopean shape has been detected. Contrast effects in the cyclopean domain that do involve interactions between different disparities are reviewed in Chapter 21.
Figure 16.3. Cyclopean shape defined by rivalry. (A) The dots have reversed colors in the central region. (B) The dots have reversed contrast polarity in the central region. The reversed region is more evident in (B) than in (A).
16.2 ACUITIES IN THE CYCLOPEAN DOMAIN
16.2.1 CYCLOPEAN VERNIER ACUITY
Vernier acuity is normally measured with two vertical black lines seen against a white background. Thus, the lines are defined by luminance contrast. Morgan (1986) found that vernier acuity fell off rapidly as black dots were added to white test bars seen against a random-dot background. Acuity fell to zero when sufficient dots had been added to make the bars indistinguishable from the background. This can be appreciated by looking down the set of stereograms in Figure 16.4 without fusing them. When the vertical bars were set in stereo relief (3.3 arcmin disparity) with respect to the background, vernier acuity fell off less rapidly from an initial value of about 18 arcsec, as dots were added to the bars. This can be appreciated by looking down the set of stereograms in Figure 16.4 after fusing them. Thus, acuity was less disturbed by added dots when the test bars were seen in stereo relief. Acuity remained at about 40 arcsec when the bars were defined by disparity alone so that they were not visible in either monocular image. Morgan concluded that vernier acuity with lines defined by disparity alone is not as high as it is for lines defined by luminance alone. One problem is that the vertical border is jittered by spurious dot correspondences in the cyclopean case, while the luminance-defined test lines have well-defined edges. This experiment should be repeated with horizontal boundaries defined by shear disparity, which have no border jitter. Cyclopean grating acuity is discussed in Section 18.6.3.
Figure 16.4. Cyclopean vernier acuity. (A) Vernier bars well defined by luminance and disparity. (B) Bars defined poorly by luminance and by disparity. (C) Bars defined only by disparity. (From Morgan 1986, Pion Limited, London)
Several visual tasks, including grating acuity, grating detection, orientation discrimination, and vernier acuity, are performed more precisely with stimuli oriented vertically or horizontally than with obliquely oriented stimuli (see Howard 1982). This anisotropy is referred to as the oblique effect. Mustillo et al. (1988) measured observers’ ability to discriminate differences in the orientation of a vertical, horizontal, or oblique bar defined by a region of either crossed or uncrossed disparity in a random-dot stereogram. The mean discrimination threshold for vertical or horizontal bars was about 1.13° and that for oblique bars was about 2.3°. These effects of orientation are similar to those reported for bars defined by luminance contrast. Bars with crossed disparity yielded lower orientation-discrimination thresholds than those with uncrossed disparity (see Section 18.6.4).
Figure 16.5. Aspect-ratio discrimination and disparity. Aspect-ratio discrimination as a function of disparity of a test rectangle relative to a zero-disparity background for (a) a cyclopean rectangle defined by disparity, (b) a luminance-defined rectangle, and (c) a motion-defined rectangle. Each fine dashed line separates disparities where all dots of the stereogram were fused and those where only dots in the rectangle were fused. Bars are standard errors. (Adapted from Regan and Hamstra 1994) (Axes: aspect-ratio discrimination threshold in percent, plotted against disparity in arcmin.)
16.2.2 DISCRIMINATION OF CYCLOPEAN FEATURES
16.2.2a Aspect-Ratio Discrimination
Regan and Hamstra (1994) constructed a cyclopean rectangle in a random-dot stereogram and measured the just-discriminable difference in the ratio of its height to its width (aspect ratio) over a wide range of crossed and uncrossed disparities. Also, subjects set the depth of the cyclopean rectangle to match that of a luminance-defined bar. The aspect-ratio threshold remained high until disparity was increased from zero to well above the threshold for detecting the rectangle (indicated by the arrow in Figure 16.5a). The threshold then fell to a minimum at intermediate disparities and increased again at high disparities. Matched depth was linearly proportional to disparity until fusion was lost for the rectangle. Regan and Hamstra concluded that these differences between threshold and suprathreshold performance reveal a distinction between two neural mechanisms. One mechanism supports suprathreshold depth perception and the other involves spatial interactions among local disparity-sensitive mechanisms.
Although the effect of disparity on aspect-ratio discrimination was different for crossed and uncrossed disparities, the lowest threshold (about 4%) was similar for the two types of disparity. This value implies that each edge of the rectangle was localized with a precision of about 0.6 arcmin. This is much better than the 3 to 5-cpd maximum acuity for cyclopean gratings defined by modulation of disparity (Tyler 1974a). This distinction between edge localization and grating resolution in the cyclopean domain parallels the distinction between hyperacuity and grating resolution in the luminance domain (Section 3.1.2).
In another condition, Regan and Hamstra switched off the dots surrounding the cyclopean rectangle, leaving a luminance-defined rectangle with a bold frame. With this display, disparity had much less effect on aspect-ratio discrimination than with the cyclopean rectangle, as can be seen in Figure 16.5b. The lowest discrimination threshold (3%) was similar for crossed and uncrossed disparities and was only a little lower than that for the cyclopean rectangle. In Figure 16.5c, dots within the cyclopean rectangle were moved obliquely at a velocity equal and opposite to that of dots outside the rectangle, creating a motion-defined rectangle. Again, the effect of disparity was minimal. The lowest discrimination threshold (1.9%) was less than for the luminance-defined rectangle, possibly because the edges were defined by more dots during the presentation interval.
16.2.2b Orientation Discrimination
Hamstra and Regan (1995) measured people’s ability to detect tilt-from-vertical of a bar defined by disparity between bar and background in a random-dot stereogram. The bar was shown for 1.5 s in different orientations in the frontal plane with 0.5-s intervals. The effect of disparity on the tilt threshold was similar to the effect of disparity on aspect-ratio discrimination shown in Figure 16.5a. Tilt thresholds lay between 0.6° and 1.5°, were similar for crossed and uncrossed disparities, and fell within the range of thresholds for luminance-defined bars or gratings. Kohly and Regan (2001) measured the detectability of differences in the orientation, separation, and location of two cyclopean bars presented at the same time, with a distracter bar placed midway between them. Stimuli were presented for only 82 ms to prevent subjects from scanning the stimuli. Subjects could discriminate small trial-to-trial changes in each of these stimulus features while ignoring variations in the orientation, width, and location of the distracter bar. Kohly and Regan concluded that comparison of features of objects in different locations is mediated by long-distance lateral connections in the visual cortex (Section 5.5.6). Familiar shapes, such as letters, take longer to identify when in an unusual orientation (Howard 1982; Shepard and Cooper 1982). The function relating recognition time to frontal-plane orientation was the same for letters defined by luminance as for letters defined by color, motion, or disparity (Jolicoeur and Cavanagh 1992). Thus, this task depends on processes that are independent of low-level coding.
16.3 CYCLOPEAN FIGURAL EFFECTS
16.3.1 FIGURAL EFFECTS WITH BINOCULAR COMPOSITES
A binocular composite stimulus is one in which part of a stimulus is presented to one eye and part to the other, where both parts are required for a given visual effect. The effect produced by a binocular composite display is cyclopean only in a weak sense, because a similar effect is produced when the two parts are combined in one eye. Witasek (1899) was the first person to use binocular composite stimuli. He created a composite of the Zöllner illusion shown in the bottom stereogram in Figure 16.6. This display produces severe rivalry, but Witasek claimed that the illusion became evident after some practice. Ohwaki (1960) found that binocular composite illusions are smaller than normally viewed versions. He drew the improbable conclusion that the illusions are largely retinal in origin. Day (1961) obtained similar results but concluded that the smaller illusions with dichoptic viewing are due to binocular rivalry. Springbett (1961) did not see the Zöllner
Left eye
Right eye
Fused image
Dichoptic composite visual illusions. Fusion of the left- and right-eye patterns produces the complete patterns on the right. In the bottom display binocular rivalry makes it difficult to decide whether the illusion occurs.
Figure 16.6.
and Hering illusions in binocular composites even in the brief moments of the rivalry sequence when both components were clearly visible. The problem is that, in these illusions, the lines in the two eyes overlap and rival. Springbett saw the Müller-Lyer illusion when the fins were presented to one eye and the connecting lines to the other. Rivalry is less of a problem in this case because the component lines do not overlap. He concluded that this illusion depends on processes occurring after binocular fusion. This is not a convincing argument, because the Müller-Lyer illusion is evident in a figure consisting only of fins, as in Figure 16.7. The vertical-horizontal illusion, in which a vertical line appears longer than an equal horizontal line, also survives with dichoptic lines. In this case there is no rivalry and no intruding monocular effect (Harris et al. 1974). The most thorough investigation of binocular composite illusions was conducted by Schiller and Wiener (1962). For the first three displays in Figure 16.6 the dichoptic components do not overlap, and, when these were presented briefly to further minimize binocular rivalry, the illusory
Figure 16.7. The Müller-Lyer illusion. The illusion is evident in a figure consisting only of fins.
effects were almost as strong as with normal viewing. Very little illusion was evident in the last display, in which the component lines overlap. Schiller and Wiener concluded that these illusions depend primarily on central processes, but that the illusion is not seen in the last display because of binocular rivalry.
16.3.2 CYCLOPEAN GEOMETRICAL ILLUSIONS
Several geometrical illusions occur in cyclopean shapes generated in random-dot stereograms. For example, Figure 16.8 shows that the Müller-Lyer illusion is preserved in disparity-defined cyclopean shapes, in which no part of the illusory figure is present in either monocular image (Papert 1961, 1964; Julesz 1971). In Chapter 7 of Foundations of Cyclopean Perception, Julesz (1971) presented random-dot stereograms depicting the Müller-Lyer, Poggendorff, vertical-horizontal, Ebbinghaus, Ponzo, and Zöllner illusions. All these illusions except the Zöllner illusion were evident in the cyclopean form. This demonstrates that cortical processes can generate the illusions but does not exclude the possibility that precortical processes may also be involved. Papathomas et al. (1996) found that the luminance-defined Ebbinghaus illusion was about 20% greater than the cyclopean version shown in Figure 16.9. Also, the apparent size of a cyclopean test disk was affected by surrounding luminance-defined disks to a greater extent than by surrounding cyclopean disks. Quantitative comparisons have not been made for other geometrical illusions. In Hermann’s grid, dark spots are seen at the intersections of a grid pattern, as in Figure 21.13a. They are usually explained in terms of lateral inhibition in the retina. However, Lavin and Costall (1978) and Troscianko (1982) obtained a weak effect in dichoptic displays in which the
Figure 16.8. A cyclopean Müller-Lyer illusion.
monocular components did not produce an effect. Thus, at least a part of this luminance-contrast effect can arise centrally.
16.3.3 CYCLOPEAN AFTEREFFECTS
Inspection of luminance-defined stimuli can produce aftereffects such as the tilt aftereffect and the spatial-frequency aftereffect. Analogous aftereffects occur after inspection of cyclopean shapes produced by disparity-defined regions in random-dot stereograms. A cyclopean aftereffect of this type is neither a depth aftereffect nor a disparity aftereffect. Rather it is an aftereffect arising from inspection of some feature, such as orientation, spatial frequency, or motion defined by a pattern of disparity. In a figural aftereffect, a test bar is apparently displaced away from a previously inspected bar in a neighboring location. The same aftereffect occurs between cyclopean bars defined by disparity (Walker and Kruger 1972). In the tilt aftereffect, inspection of a tilted line or grating causes a subsequently seen vertical line or grating to appear tilted in the opposite direction. Tyler (1975b) generated a cyclopean tilt aftereffect. In the version shown here, observers inspect a random-dot stereogram depicting gratings tilted in opposite directions at 20˚ to the vertical, as shown in Figure 16.10. After scanning across this stimulus for 30 s, a vertical cyclopean grating appears tilted in the opposite direction. The aftereffect transferred 50% from a cyclopean induction grating to a noncyclopean test grating defined by luminance. In the spatial-frequency aftereffect, a grating appears finer after inspection of a coarser grating, and coarser after inspection of a finer grating. Tyler (1975b) described a spatial-frequency aftereffect in the cyclopean domain. In the version shown here, observers scan the horizontal boundary lying midway between two disparity-defined gratings with different corrugation frequencies, as shown
(From Julesz 1971. Copyright by Bell Telephone Laboratories Inc.)
Figure 16.9. A cyclopean Ebbinghaus illusion. (From Papathomas et al. 1996, Pion Limited, London)
in Figure 16.11. After adaptation, identical disparity gratings above and below the fixation point are perceived to have different corrugation frequencies. Tyler provided no data on the magnitude or frequency tuning of the effect but he suggested that the cyclopean spatial-frequency shift does not transfer to disparity gratings of different orientations.
16.4 CYCLOPEAN MOTION
16.4.1 TYPES OF CYCLOPEAN MOTION
Several lines of evidence suggest that, in primates, motion detectors are in the visual cortex, not in the retina. For instance, the motion aftereffect is still visible after the eye exposed to the moving induction stimulus has been pressure blinded in the postinduction period, although it is weak and is not seen by all observers (Barlow and Brindley 1963). Furthermore, a movement aftereffect is not seen when the eye exposed to the inspection stimulus is pressure blinded during the induction period (Pickersgill and Jeeves 1964). Given that motion detection is a cortical process, one would expect motion sensations to arise in the cyclopean domain. Lu and Sperling (1995) distinguished between three types of visual motion and three types of motion detector. First-order motion is motion of a form defined by luminance. Models of first-order motion detectors are derived from the Reichardt detector, which detects the temporal sequence of stimulation of spatially adjacent regions differing in luminance (Reichardt 1961; Adelson
and Bergen 1985) (Section 5.6.4). The temporal tuning of such detectors is band-pass, and they are true motion detectors because the detection threshold depends on stimulus velocity rather than on a change in position. Second-order motion is motion of a form defined by a feature such as texture, color, or flicker, with no difference in mean luminance between form and background. Second-order stimuli are sometimes called non-Fourier stimuli. Models of detectors of first-order motion cannot account for the perceived motion of a second-order stimulus. Such stimuli must be processed by second-order mechanisms, unless they are first converted by a nonlinear mechanism into signals that can be processed by a first-order mechanism. First-order motion and second-order motion share many properties. Both are processed by mechanisms sensitive to motion that operate over different spatial scales, and both produce aftereffects with similar properties, as listed in Section 16.5.3a. One would expect that properties of cyclopean motion that differ from those of luminance-defined motion would be shared with other types of second-order motion. Two such properties have been identified. (1) For all forms of second-order motion, the threshold for detecting the direction of motion of a drifting grating is higher than the threshold for detecting the orientation of the grating (Lindsey and Teller 1990; Smith and Scott-Samuel 1998). (2) Sensitivity to all forms of second-order motion is limited to temporal frequencies of up to 10 Hz, whereas sensitivity to luminance-defined motion extends to 50 Hz. Third-order motion is motion of a visual feature without regard to the stimuli that define it. Third-order
Figure 16.10. Cyclopean tilt aftereffect. The sinusoidal corrugations in the upper and lower halves of the adaptation stereogram are oriented counterclockwise and clockwise to vertical. Together they form a cyclopean chevron pointing to the right. Upper and lower corrugations in the test stereogram are vertical. After 30 s viewing of the adaptation stereogram (allowing the eyes to track along the boundary between the upper and lower corrugations), corrugations in the upper and lower halves of the test stereogram should appear as a chevron in the opposite direction. This cyclopean tilt aftereffect was first demonstrated by Tyler (1975b).
Figure 16.11. Cyclopean spatial-frequency aftereffect. Vertically oriented sinusoidal corrugations in the two halves of the adaptation stereogram have different spatial frequencies. The corrugations in the two halves of the test stereogram have the same spatial frequency. After 30 s viewing of the adaptation stereogram (allowing the eyes to track along the boundary between upper and lower corrugations), the corrugations in upper and lower halves of the test stereogram should appear to be of different spatial frequencies. This cyclopean spatial-frequency aftereffect was first demonstrated by Tyler (1975b).
motion is detected by tracking the displacement of the feature rather than by specialized motion detectors. The temporal tuning of a position-tracking mechanism is low-pass, and the detection threshold depends on stimulus displacement. There has been a lively controversy about whether cyclopean motion is processed by second-order motion detectors or by detection of stimulus displacement (position tracking). Patterson (1999) (Portrait Figure 16.12) reviewed this controversy and concluded that evidence against the
idea of true motion detectors for cyclopean motion was based on the use of inappropriate stimuli, especially stimuli that were too slow or too brief to stimulate the cyclopean system. Some of that evidence will now be reviewed. There are three types of cyclopean motion:
1. Dichoptic apparent motion. Stimuli presented in rapid succession in neighboring locations appear as one stimulus in continuous motion. This is known as apparent motion. Dichoptic apparent motion is created by presenting a stimulus, such as a white spot on a dark background, to one eye followed by a similar stimulus to the other eye in a neighboring position. When the stimuli are presented repeatedly, an impression of to-and-fro movement is created. Each monocular image appears as a flickering stimulus, but the stereoscopically combined images appear to move. The object is visible in each eye, but the motion is visible only when both eyes view the stimuli. This type of cyclopean motion is discussed in the next section.
2. Stereoscopic motion-in-depth. This motion is created by changing the disparity of a stimulus relative to an unchanging surround. Each eye’s image moves laterally so that the sensation of motion in the fused image is not cyclopean. However, the sensation of motion-in-depth is cyclopean. This topic is discussed in Section 31.3.
3. Motion of a cyclopean shape. This type of motion is created by a moving cyclopean shape defined by rivalry or disparity in a dynamic random-dot stereogram. The dots are correlated between the two eyes at any moment, but a new set of dots is presented on every frame, typically every 20 ms (Julesz and Payne 1968). This stimulus ensures that there are no monocular cues to the shape or to the motion of the elements comprising the shape. In other words, the shape and the motion signals are not carried by the microelements of the stereogram but by the cyclopean macropattern. They are therefore not visible in either eye. Only random motion is seen in each monocular image. The visual properties of motion-defined shapes viewed binocularly have been investigated by Regan (1986, 1989b).
A cyclopean shape can be moved in the frontal plane or in depth. Motion of cyclopean shapes in depth is discussed in Section 31.3. Motion of cyclopean shapes in the frontal plane is discussed in Section 16.5.
Figure 16.12. Robert Patterson. Born in East Orange, New Jersey, in 1951. He obtained a B.A. in behavioral science from San Jose State University in 1976 and a Ph.D. in experimental psychology from Vanderbilt University in 1984. He conducted postdoctoral work at Northwestern University. He is now a professor in psychology and neuroscience at Washington State University, Pullman.
16.4.2 DICHOPTIC APPARENT MOTION
16.4.2a Short- and Long-Range Apparent Motion
There is a vast literature on the dependence of apparent motion on the spatial and temporal properties of the stimuli (Anstis 1986). Apparent motion occurs within a plane inclined in depth as readily as within a frontal plane (Section 22.5.3). Dichoptic apparent motion occurs between a stimulus presented to one eye and a succeeding stimulus presented in a neighboring location to the other eye (Shipley et al. 1945; Pantle and Picciano 1976; Braddick and Adlard 1978). The shape is visible in each eye, but the motion of the shape is not. The monocular shapes may be defined by luminance boundaries or by texture boundaries with mean luminance held constant (Ramachandran 1973). It has been claimed that apparent motion does not occur between dichoptic random-dot patterns (Braddick 1974) or between dichoptic sinusoidal gratings (Green and Blake 1981). Braddick (1974) (Portrait Figure 16.13) proposed that there are two types of apparent motion—short-range motion and long-range motion. Detectors of short-range motion respond to local spatiotemporal distributions of luminance. Long-range motion is detected after the shape of the stimulus has been extracted. In particular, Braddick proposed that the short-range process is responsible for the apparent motion of a displaced region of dots within a larger display of randomly flickering dots, when both eyes see all the dots. This type of stimulus is known as a random-dot kinematogram. Using the criterion of perceptual segregation of the displaced region of dots in a random-dot kinematogram, Braddick found that short-range apparent motion was seen for displacements of successive images of less than 15 arcmin. With larger displacements it became difficult to find the correct pairing of dots in successive images of the random-dot kinematogram. Long-range apparent motion between well-defined stimuli can occur over several degrees. 
In a random-dot kinematogram the region of random dots seen in motion is not discriminable until motion occurs, and therefore shape discrimination occurs after motion detection.
Figure 16.13. Oliver J. Braddick. Born in London, England, in 1944. He obtained a B.A. in experimental psychology in 1965 and a Ph.D. in 1968, both from Cambridge University. He conducted postdoctoral work at Brown University. He was lecturer and then reader in psychology in Cambridge University between 1969 and 1993 and professor of psychology in University College, London, between 1993 and 2001. He is now professor of psychology in Oxford University and a fellow of the Academy of Medical Sciences. He was winner of the Kurt-Koffka medal in 2009.
Figure 16.14. Mark Georgeson. Born in England in 1948. He obtained a B.A. in mathematics and experimental psychology from Cambridge University in 1970 and a Ph.D. in spatial vision from Sussex University in 1975. He was a lecturer in psychology at the University of Bristol from 1976 to 1991 and a reader at Aston University, Birmingham, from 1991 to 1995. After being professor of psychology at Birmingham University from 1995 to 2001 he moved back to Aston University as professor of vision sciences.
Using the same criterion of perceptual segregation of the displaced region of dots, Braddick found that apparent motion was not seen with dichoptic random-dot patterns, that is, when the successive images of the displaced central region of dots were seen alternately by the two eyes. He concluded that the short-range process does not operate with dichoptic stimuli. Perhaps, in the dichoptic domain, processes that seek correspondences between closely spaced successive images are concerned with detection of disparity rather than of motion. Apparent motion occurs with well-spaced and well-defined dichoptic stimuli. Thus, the long-range system works with dichoptic stimuli. Cavanagh and Mather (1989) questioned whether there are distinct channels for short- and long-range motion. Instead, they suggested that all motion is detected by the same initial detectors and then processed in different ways. As we shall now see, there has been a lively debate between those who support the two-channel idea and those who support the single-channel idea of motion detection. Shadlen and Carney (1986) obtained apparent movement with dichoptic sinusoidal gratings flickering in alternation with a 90˚ temporal phase lag and a 90˚ spatial offset (in spatiotemporal quadrature). Each monocular image appeared as a flickering grating that lacked direction-of-motion information. In contradiction to Braddick
and to Green and Blake, they concluded that the short-range motion system operates on dichoptic inputs. Although Georgeson and Shackleton (1989) confirmed this result, we shall see in the next section that they concluded that this type of dichoptic motion is long-range, not short-range (Portrait Figure 16.14). Derrington and Cox (1998) compared the contrast required for detection of the direction of motion of a grating in which the motion was visible in one eye with that required when motion was visible only after two gratings were combined dichoptically. Results were very similar for the two types of motion over a wide range of temporal frequencies. They concluded that dichoptic motion is processed efficiently by a low-level motion-energy based system. A stationary Gabor patch containing a moving grating appears shifted as a whole in the direction of the motion when the patch is placed between two stationary stimuli (Ramachandran and Anstis 1990). Hess et al. (2009) obtained a similar apparent shift of a stationary Gabor containing a grating with cyclopean motion. The motion was not visible to either eye alone. Thus a motion-induced shift can arise at the cyclopean level of processing.
16.4.2b Dichoptic Motion with Missing Fundamental
A square-wave grating can be decomposed into a sine-wave grating of the same spatial frequency (the fundamental) plus sine-wave gratings at odd multiples of the fundamental
frequency (odd harmonics). The luminance modulation of the harmonics decreases in proportion to the increase in spatial frequency. A square-wave grating appears to move in the direction of a repeated 90˚ phase shift of the fundamental spatial frequency. However, when the fundamental spatial frequency is removed, the phase-shifted grating appears to move in the opposite direction (Adelson 1982). This is because, when the fundamental of a square wave shifts 90˚, the third harmonic shifts 270˚, which is equivalent to a 90˚ shift in the opposite direction. It is believed that, in the absence of the fundamental, the short-range motion system is engaged by displacement of the third harmonic. The visual system ignores the long-range displacement of the pattern of the grating as a whole, which is in the direction of the missing fundamental (see Figure 16.15). When Georgeson and Shackleton (1989) presented alternating missing-fundamental gratings dichoptically, the apparent motion was in the direction of the long-range displacement of the pattern as a whole, rather than in the direction of the third harmonic. They concluded that only the long-range motion system operates dichoptically. When the contrast of the grating was low, the dichoptic motion was in the direction of the third harmonic. This is because, at low contrast, the higher spatial-frequency components that define the shape of the grating as a whole are below threshold. The binocular system detects the disparity between the overall shapes of missing-fundamental gratings, rather than that between the third harmonics, even when the latter
Figure 16.15. Spatial frequency and dichoptic motion. A grating with a square luminance profile possesses a fundamental sine-wave component superimposed on odd-harmonic components. If the fundamental is removed, the luminance profile resembles that shown in the three bottom rows. Motion of this grating to the right, as signified by the bottom two arrows, carries the peaked luminance bands to the right. Peaks of the third harmonic also move to the right but appear to move to the left because of the relative proximity of peaks in that direction, as indicated in the top three rows. The way the grating appears to move reveals which aspect carries the motion signal in the visual system. (Redrawn from Georgeson and Shackleton 1989)
disparity is smaller (see Section 17.1.1). Perhaps this preference arises from a property of the disparity mechanism. It may not reflect a basic inability of the binocular system to detect short-range apparent motion (see Carney and Shadlen 1992; Georgeson and Shackleton 1992).
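The phase arithmetic behind the reversed motion of a missing-fundamental grating can be checked with a minimal sketch. It assumes only the standard Fourier series of a square wave (odd harmonics with amplitude proportional to 1/n). The function names are illustrative and do not come from the studies cited.

```python
import math

def harmonic_phase_shift(n, fundamental_shift_deg=90.0):
    """Phase shift of the n-th harmonic, in degrees wrapped to (-180, 180],
    when the whole pattern is displaced by fundamental_shift_deg of the
    fundamental's cycle. A negative value is a step in the opposite direction."""
    shift = (n * fundamental_shift_deg) % 360.0
    if shift > 180.0:
        shift -= 360.0
    return shift

def square_wave_amplitude(n):
    """Relative amplitude of the n-th harmonic of a square wave (zero for even n)."""
    return 4.0 / (math.pi * n) if n % 2 == 1 else 0.0

# List harmonic number, relative amplitude, and wrapped phase shift.
for n in (1, 3, 5, 7):
    print(n, round(square_wave_amplitude(n), 3), harmonic_phase_shift(n))
```

With the fundamental removed, the third harmonic is the largest remaining component, and its 270˚ step is indistinguishable from a 90˚ step in the opposite direction.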
16.4.2c Dichoptic Motion in Spatiotemporal Quadrature
Carney and Shadlen (1993) designed a dichoptic motion stimulus, which they claimed taps the short-range motion system. A random textured surface flickering in counterphase at 3 Hz was presented to one eye. The same pattern shifted vertically 90˚ in spatial phase and 90˚ in temporal phase was presented to the other eye. Since two stationary patterns in spatiotemporal quadrature are theoretically equivalent to a traveling wave, these stimuli combined in the same eye are equivalent to a vertically moving grating. An impression of a vertically moving grating was also created when the stimuli were combined dichoptically. The dichoptic motion had all the characteristics of short-range apparent motion, as defined by Braddick. Carney and Shadlen proposed that others had failed to obtain short-range motion in dynamic random-dot stereograms because of the following problems with their displays. In earlier experiments, monocular motion signals were avoided by (1) use of a dynamic random-dot kinematogram, (2) reversing the direction of motion after every two frames (Braddick 1974), (3) using only two frames (Green and Blake 1981), or (4) employing spatial displacements too large for monocular detection of motion (Georgeson and Shackleton 1989). Carney and Shadlen argued that none of these methods is completely satisfactory. In their experiment, they overcame the problem of monocular motion by making the motion signals in their monocular images directionally ambiguous. Another problem is that the earlier displays required detection of a moving figural region in a stationary surround rather than discrimination of motion direction. These criteria are not equivalent. Chang and Julesz (1983) found that direction discrimination for apparent motion in a random-dot kinematogram operates over larger distances than pattern discrimination.
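The equivalence between two counterphase patterns in spatiotemporal quadrature and a traveling wave follows from the identity sin(a)cos(b) + cos(a)sin(b) = sin(a + b). The sketch below checks it numerically for a single sinusoidal component. It is an illustration under that simplifying assumption, not a model of the textured stimulus used in the experiment, and the function names are hypothetical.

```python
import math

def left_eye(x, t, k=1.0, w=1.0):
    """Counterphase grating: fixed in space, contrast modulated in time."""
    return math.sin(k * x) * math.cos(w * t)

def right_eye(x, t, k=1.0, w=1.0):
    """The same grating shifted 90 deg in spatial and in temporal phase."""
    return math.cos(k * x) * math.sin(w * t)

def traveling_wave(x, t, k=1.0, w=1.0):
    """A single grating drifting at speed w/k (toward negative x here)."""
    return math.sin(k * x + w * t)

# Sample one full spatial and temporal cycle and record the largest mismatch
# between the sum of the two flickering gratings and the drifting grating.
samples = [i * 2.0 * math.pi / 64 for i in range(64)]
max_err = max(
    abs(left_eye(x, t) + right_eye(x, t) - traveling_wave(x, t))
    for x in samples for t in samples
)

# Each eye's grating alone is the average of two opposite drifting gratings,
# which is why it carries no net direction signal.
max_ambiguity_err = max(
    abs(left_eye(x, t) - 0.5 * (math.sin(x + t) + math.sin(x - t)))
    for x in samples for t in samples
)
print(max_err, max_ambiguity_err)
```

Both mismatches are at the level of floating-point rounding. The direction of the combined wave depends on the signs chosen for the two phase shifts.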
Also, Carney and Shadlen found that, when their dichoptic display contained a familiar figure, it was not recognized even though direction of motion was detected. Figure-ground perception is perhaps disrupted by rivalry between microelements of the display. Whatever the reason for disruption of figure perception in dichoptic kinetic displays, evidence about dichoptic motion based on the criterion of figure-ground segregation is not reliable. The criterion of directional discrimination shows that the dichoptic system supports apparent motion, both long-range and short-range. Hayashi et al. (2007) used a dynamic random-dot display in which all spatial frequencies were shifted 90˚
both spatially and temporally. The dichoptically combined images could be made to move in any specified direction without providing any recognizable features that subjects could track. Carney and Shadlen modulated each of their stimuli at only one temporal frequency, and dichoptic motion ceased at 11.3 Hz. Each stimulus used by Hayashi et al. contained a wide range of temporal frequencies, and motion was visible up to at least 60 Hz. Hayashi et al. concluded that dichoptic motion in their displays provides further evidence for an energy-based motion-detection system at the level in the nervous system after monocular inputs are combined.
16.4.2d Dichoptic Motion with Added Pedestal Grating
The debate between those who believe that dichoptic motion involves a low-level process and those who believe that it involves only a high-level process took another turn with publication of a paper by Lu and Sperling (1995). They distinguished between three types of visual motion: first-order motion based on moving luminance modulations, second-order motion based on moving modulations of texture contrast, and third-order motion based on the tracking of higher-order visual features. A moving grating, defined by modulation of luminance or of texture contrast, appeared to move when presented dichoptically in spatiotemporal quadrature (90˚ shifts of spatial phase and 90˚ shifts of temporal phase). For this stimulus, the direction of motion seen by each eye was ambiguous because the phase shift for each eye was 180˚. The contrast threshold for detecting the direction of motion was much higher for dichoptic motion than for when the alternating stimuli were presented to the same eye. To reveal whether dichoptic motion was due to the detection of second-order motion or to feature tracking, Lu and Sperling devised a pedestal stimulus in which feature tracking was not possible. This consisted of a stationary sinusoidal grating with a superimposed moving grating of the same spatial frequency and half the contrast. For this stimulus, a third-order, feature-tracking system responding to motion of features such as peaks or zero crossings would “see” only a zigzag motion, whereas first- or second-order motion systems would “see” the coherent motion of the moving grating. With monocular viewing, this stimulus produced coherent motion with both first- and second-order moving gratings. However, coherent motion was not seen with dichoptic viewing of gratings presented in spatiotemporal quadrature with a superimposed pedestal grating.
Lu and Sperling concluded that dichoptic motion evident in the nonpedestal stimulus was detected by a feature-tracking mechanism rather than by first- or second-order motion detection systems. Coherent motion was not evident with dichoptic viewing of the pedestal stimulus because feature tracking did not produce coherent motion.
However, more recent evidence on this issue has once again “returned the ball to the opposite court.” When Carney (1997) repeated Lu and Sperling’s experiment with increased frame rate and stimulus duration, he found that observers correctly discriminated the direction of dichoptic motion, even in the presence of a static pedestal grating. The pedestal disrupts cyclopean motion based on feature tracking, but not that based on a low-level energy-based system. Performance was better for low spatial-frequency stimuli and when the contrast of the pedestal grating was reduced. Carney suggested that Lu and Sperling’s subjects failed to discriminate the direction of dichoptic motion because the stimulus was presented for only five frames.
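The reason a pedestal defeats feature tracking can be sketched with a little trigonometry. A stationary grating sin(x) plus a moving grating of half the contrast, 0.5 sin(x - θ), is a single sinusoid whose peaks sit at a phase φ = atan2(0.5 sin θ, 1 + 0.5 cos θ). The code below is a simplified illustration under that assumption, with hypothetical function and parameter names. It is not the analysis used by Lu and Sperling.

```python
import math

def feature_phase_deg(theta_deg, pedestal=1.0, moving=0.5):
    """Spatial phase (deg) of the peaks of
    pedestal*sin(x) + moving*sin(x - theta), for a moving-grating phase theta."""
    th = math.radians(theta_deg)
    return math.degrees(math.atan2(moving * math.sin(th),
                                   pedestal + moving * math.cos(th)))

# Step the moving grating through one full cycle in 5-degree steps.
phases = [feature_phase_deg(th) for th in range(0, 361, 5)]
print(max(phases), min(phases))

# Without the pedestal the peaks simply follow the moving grating.
print(feature_phase_deg(90, pedestal=0.0))
```

With the pedestal present, the peaks wobble between about +30˚ and -30˚ of spatial phase and return to their starting position after each cycle, so a mechanism that tracks peaks registers only to-and-fro motion. The motion energy of the moving component is unchanged by adding a stationary grating.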
16.4.2e Dichoptic Motion with the Ternus Display
The display shown in Figure 16.16 was first described by Ternus (1926). The two rows of vertical lines are shown in succession with two of the lines superimposed. When the interstimulus interval is long (80 ms), the whole group of lines appears to move from side to side (group movement). When the interval is short (20 ms), the outer line appears to jump over the two stationary inner lines (element motion). Several investigators have identified group movement with long-range apparent motion and element motion with short-range apparent motion. Group motion is more likely when the stimulus elements are grouped into a single figure (He and Ooi 1999), although grouping may also produce an impression of motion out of the plane of the display (Dodd et al. 2000). If one assumes that short-range apparent motion is not generated in cyclopean stimuli, one would expect only group motion to be seen when the two sets of lines of the Ternus display are presented dichoptically. Pantle and Picciano (1976) obtained this result. However, Ritter and Breitmeyer (1989) obtained both group motion and
Figure 16.16. Ambiguous apparent motion (Ternus display). The two rows of vertical lines, A and B, are presented in succession with the two aligned lines superimposed. With an interstimulus interval of over 80 ms the group of lines appears to move from side to side as a whole. When the interval is short, the outer lines appear to jump over the two stationary inner lines.
element motion dichoptically. For both ordinary and dichoptic viewing, group motion became more likely as either frame duration or the size of the stimulus elements was increased. Patterson et al. (1991) constructed a Ternus display with lines defined by disparity in a random-dot stereogram. Thus, both the lines and the apparent motion between them were cyclopean. They, too, obtained both group and element motion, with element motion predominating at short interstimulus intervals. It seems, therefore, that one cannot identify group motion with the long-range motion system and element motion with the short-range system, unless one abandons the notion that the short-range system does not operate in the cyclopean domain.
16.4.2f Dichoptic Motion between Hemifields
When two equilateral triangles abutting on a vertical line are presented in succession one sees one triangle flipping over out of the plane of the display. The same apparent motion occurs when the two triangles are presented dichoptically, as shown in Figure 16.17. When the dichoptic triangles were presented to the nasal hemiretinas, the triangle most often appeared to move behind the plane of a central fixation line (Ferris and Pastore 1971). When the images were presented to the temporal hemiretinas, the triangle appeared to move out in front of the fixation line. Since images on the nasal hemiretinas have an uncrossed disparity they are perhaps more likely to produce an impression of being behind the plane of fixation than images on the temporal hemiretinas, which have a crossed disparity.
16.4.2g Dichoptic Interpolation of Flashes
Another approach to dichoptic apparent motion is to ask whether it occurs with a sequence of flashed images presented out of phase to the two eyes. Green (1992) set up the
Figure 16.17. Stimuli for dichoptic apparent movement. Successive presentation of dichoptic images on the nasal hemiretinas appeared most often to move behind the binocular fixation line. Images on the temporal hemiretinas appeared to move in front of the line. (Adapted from Ferris and Pastore 1971)
Figure 16.18. Stimulus displays used by Green (1992). (A) Each eye sees the same bars shown in succession at f Hz. (B) Each eye sees the same set presented at 2f Hz. (C) Each eye sees a set shown at f Hz but out of spatial and temporal phase in the two eyes. Dichoptically, the stimuli are seen at 2f Hz. However, the sensation of apparent motion produced by display (C) resembles that produced by display (A) rather than that produced by display (B).
stimuli shown in Figure 16.18. In the top two displays, each eye receives the same spatial sequence of bars presented in succession at a frequency f Hz or 2f Hz. The two bars move at the same velocity, but the eyes sample the 2f Hz bar at twice the temporal and spatial frequency of the f Hz bar. In the bottom display, each eye sees the bar displacing at a frequency of f Hz, but, since the images in the two eyes are out of phase, the moving bar is sampled at 2f Hz when the images are combined. Observers reported that the smoothness of apparent motion produced by the dichoptic display resembled that of the f Hz bar rather than the 2f Hz bar. In other words, the apparent movement depended on the sampling rate in each monocular image rather than on the sampling rate in the cyclopean image. Green concluded that apparent movement interpolation is not achieved dichoptically. The frequency of 2f Hz used by Green may have been too low to reveal effects of dichoptic exposure. Wolfe and Held (1980) found that both normal and stereoblind observers experienced a stronger illusion of self-motion when exposed to a large moving display illuminated stroboscopically out of phase to the two eyes than when the
display was illuminated in phase. The enhancement was strongest at frequencies between about 5 and 10 Hz. Thus, stereoblind subjects integrated information from the two eyes to create a dichoptic motion signal.
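The sampling argument behind Green's displays (Figure 16.18) can be sketched numerically. The snippet below is only an illustration: the function names and the value of 5 Hz for f are assumptions, not parameters taken from Green (1992). It lists the times at which each eye receives a bar, with the right eye delayed by half a period, and compares the presentation rate in one eye with the rate in the merged sequence.

```python
from fractions import Fraction

def sample_times(rate_hz, n, offset=Fraction(0)):
    """Times (s) of n successive bar presentations at rate_hz, starting at offset."""
    period = Fraction(1, rate_hz)
    return [offset + k * period for k in range(n)]

def mean_rate(times):
    """Average presentation rate (Hz) of a sorted list of times."""
    span = times[-1] - times[0]
    return (len(times) - 1) / span

f = 5  # hypothetical monocular presentation rate in Hz
left = sample_times(f, 4)
right = sample_times(f, 4, offset=Fraction(1, 2 * f))  # half a period out of phase
combined = sorted(left + right)

print(mean_rate(left))      # rate seen by one eye: f
print(mean_rate(combined))  # rate in the combined sequence: 2f
```

The combined sequence has twice the rate of either monocular sequence, which is why the smoothness reported by Green's observers is informative: it followed the monocular rate, not the combined one.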
16.4.2h Dichoptic Motion in ON and OFF Channels
Anstis et al. (2000) presented a white vertical line on a gray background undergoing apparent motion in one direction superimposed on a black line on the same background undergoing apparent lateral motion in the opposite direction, as illustrated in Figure 16.19A. Motion appeared in the direction of the white line when the background was dark and in the direction of the black line when the background was light. Observers adjusted the luminance of the background until the motion signals canceled. The nulling luminance was the arithmetic mean of the luminances of the two lines rather than the geometric mean. Anstis et al. concluded that signals from the ON contrast system and those from the OFF system sum linearly rather than in a nonlinear logarithmic way. Only compressive nonlinearities in each system cancel to produce a linear difference signal.
In a second condition, the apparent motion of the white and black lines was dichoptic, as depicted in Figure 16.19B. Observers adjusted the background luminance until motion-in-depth ceased. As before, the nulling luminance was the arithmetic mean of the luminances of the lines. Interpretation of this result is complicated by the possibility of monocular interactions between the line ends.
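The contrast between the two predictions can be made concrete with a small sketch. It assumes, for illustration only, that a linear system signals each line by its luminance difference from the background, while a logarithmic system signals it by the log of the luminance ratio. The function names and the luminance values are hypothetical and are not taken from Anstis et al. (2000).

```python
import math

def null_background_linear(l_white, l_black):
    """Background luminance at which (l_white - bg) equals (bg - l_black)."""
    return (l_white + l_black) / 2

def null_background_log(l_white, l_black):
    """Background luminance at which log(l_white / bg) equals log(bg / l_black)."""
    return math.sqrt(l_white * l_black)

l_white, l_black = 80.0, 5.0  # hypothetical luminances in cd/m^2
print(null_background_linear(l_white, l_black))  # arithmetic mean
print(null_background_log(l_white, l_black))     # geometric mean
```

With widely separated line luminances the two predictions differ substantially, so the measured nulling point can distinguish between them.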
Figure 16.19. Apparent motions with opposed polarity. (A) Black-on-gray bars undergoing apparent motion are superimposed in the same eye on white-on-gray bars undergoing apparent motion in the opposite direction. (B) Black bars undergoing dichoptic apparent motion are superimposed on white bars undergoing dichoptic apparent motion in the opposite direction. (Adapted from Anstis et al. 2000)
16.4.3 AFTEREFFECTS OF DICHOPTIC APPARENT MOTION
A motion aftereffect can be induced by dichoptic motion. Anstis and Moulden (1970) presented a ring of six lights flashing in synchrony to one eye and an interleaved ring of six lights to the other eye, flashing in counterphase with respect to the first set. The display in each eye appeared as a set of flashing lights, but the dichoptic display appeared to rotate. After the dichoptic display had been exposed for 30 s, a stationary ring of lights appeared to rotate in the opposite direction. This aftereffect must depend on binocular cells because only binocular cells register dichoptic motion.
When dynamic random noise is viewed with a neutral filter over one eye, observers see two planes of dots moving in opposite directions and separated in depth. In effect, there are two opposite Pulfrich effects (Chapter 23). Inspection of the more distant plane of dots for 20 s induced a motion aftereffect lasting about 1.3 s when the neutral filter was removed (Zeevi and Geri 1985). It was shown that the aftereffect was not due to the dark-adapted state of the filtered eye.
Carney and Shadlen (1993) produced dichoptic motion by presenting a counterphase-modulated horizontal sine-wave grating to each eye in spatiotemporal quadrature, as described in Section 16.4.2c. This produced the predicted vertical cyclopean motion. Five minutes of inspection of this stimulus produced a motion aftereffect only slightly weaker than that produced by nondichoptic motion. They suggested that the difference was due to differential effects of eye movements in the two conditions and to dilution of the cyclopean aftereffect by activity in monocular cells.
After similar dichoptic stimuli have been observed moving in opposite directions in the two eyes for some time, the two opposed motion aftereffects cancel when both eyes view a stationary test display. When only one eye views the stationary test display, an aftereffect appropriate to that eye is seen (Wohlgemuth 1911).
Thus, opposite motion aftereffects may be induced simultaneously into the two eyes. Each aftereffect could be built up in cortical monocular cells or in binocular cells dominated by one eye. Anstis and Moulden (1970) set up dichoptic circular displays of lights, which appeared to rotate in the same direction when viewed with one eye. When viewed with both eyes, the phase relationships between the two displays caused the combined display to appear to rotate in the opposite direction. The direction of the aftereffect after binocular viewing was opposite that of the binocular motion. Thus, dichoptic motion induced in binocular cells dominated that induced in monocular cells. Motion of disparity-defined gratings also produces aftereffects (Section 16.5.3a).
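The spatiotemporal quadrature stimulus of Carney and Shadlen (1993) rests on a trigonometric identity: two counterphase gratings that differ by 90˚ in both spatial and temporal phase sum to a single drifting grating. The sketch below checks this numerically. It assumes that the two monocular images combine by simple addition, which is a simplification of binocular combination, and the frequencies and function names are arbitrary choices made for illustration.

```python
import math

K = 2 * math.pi * 1.0   # spatial frequency in rad/deg (assumed 1 cpd)
W = 2 * math.pi * 2.0   # temporal frequency in rad/s (assumed 2 Hz)

def left_eye(y, t):
    """Counterphase grating: a standing wave with no net direction."""
    return math.sin(K * y) * math.cos(W * t)

def right_eye(y, t):
    """The same grating shifted 90 deg in spatial and temporal phase."""
    return math.cos(K * y) * math.sin(W * t)

def drifting(y, t):
    """A grating drifting along the vertical axis y."""
    return math.sin(K * y + W * t)

# The sum of the two monocular images equals the drifting grating everywhere.
max_error = max(
    abs(left_eye(y / 50, t / 50) + right_eye(y / 50, t / 50) - drifting(y / 50, t / 50))
    for y in range(100) for t in range(100)
)
print(max_error < 1e-9)
```

Each monocular image is a standing wave, so a directional signal exists only after the two images are combined.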
16.5 MOTION OF CYCLOPEAN SHAPES
This section is concerned with motion of cyclopean shapes in dynamic random-dot stereograms. There is no monocular information about the shape or the motion of the shape.
16.5.1 SENSITIVITY TO CYCLOPEAN MOTION
Julesz and Payne (1968) used a dynamic random-dot stereogram depicting a disparity-defined vertical grating of 0.95 cpd. The temporal conditions for optimal apparent motion of a cyclopean grating differed from those for motion of a binocularly viewed luminance-defined grating. At low temporal frequencies, cyclopean and luminance displays appeared to oscillate from side to side. At high temporal frequencies (> 10 Hz) both displays appeared as two superimposed stationary gratings. At frequencies between 4 and 7 Hz, the cyclopean grating appeared as a single stationary grating, an effect not evident with the luminance grating. This effect is known as stereo motion standstill. The stationary effect was not seen when the dot displays were not renewed on every frame, but in this case the motion was no longer exclusively cyclopean. This difference between the two types of motion was probably due to the fact that motion of cyclopean shapes is carried only by the boundaries of the bars of the grating while motion of a luminance grating is carried by both the boundaries and the texture within the bars. There was also a difference in effective contrast between the two types of display.
The stimulus used by Julesz and Payne alternated between two locations between frame changes. Thus, the frequency of side-to-side motion was confounded with the frame rate. The opposite motions might have canceled. Tseng et al. (2006) continued the motion in each direction for five frames. They confirmed that stereo motion standstill occurs at frequencies of side-to-side motion of between 4 and 6 Hz. Thus, the grating was detected in the rapidly alternating stimulus at frequencies above which the cyclopean motion could not be detected. They argued that motion of cyclopean shapes is detected by the motion-detection system that depends on tracking. They claimed that motion standstill occurs in an oscillating luminance-defined grating of sufficiently high spatial frequency.
Phinney et al.
(1994) measured upper and lower spatial displacement limits (dmax and dmin) for cyclopean apparent motion of a disparity-defined disk in a dynamic random-dot stereogram. The 2˚ disk moved laterally with either a crossed or uncrossed disparity of 22.8 arcmin with respect to the random-dot background. Apparent motion for a disk with crossed disparity was perceived over a range of displacement two to three times larger than the range for
a luminance-defined disk. Motion quality for a disk with uncrossed disparity was much poorer and the displacement range (dmax) was much shorter than for a disk with crossed disparity. This was put down to suppressive effects of occlusion of the edges of the disk with uncrossed disparity because it appeared beyond the textured background.
Donnelly et al. (1997) created a disparity-defined array of moving dots in a dynamic random-dot stereogram. The proportion of signal dots (moving in the same direction) to noise dots (moving in random directions) required for motion detection was 25% for cyclopean motion but only 5% for luminance-defined motion. The direction-discrimination thresholds were the same for the two types of motion when both were set at equal multiples above the detection threshold.
Chang (1990) generated a horizontal grating defined by disparity in a random-dot stereogram and caused it to move upward or downward. All the dots within the disparity-defined grating were stationary, or moved in the same, opposite, or orthogonal direction as the grating, or were dynamic (replaced on every frame). The motion of the grating was most pronounced and smooth when the grating and dots moved together. When they moved in different directions, the predominant motion was of the dots rather than of the grating. With dynamic random dots, the grating appeared to move, but the motion was not smooth, and the addition of a small percentage of coherently moving dots was sufficient to bias the overall perceived direction of motion. When the dots were stationary, the grating appeared to undulate in depth, with only a weak sensation of vertical motion within the frontal plane. Note that, although the dots within the grating were stationary, there were opposite motion signals in the two eyes along the disparity boundaries of the moving grating. Chang concluded that luminance-defined motion is required to create motion of cyclopean shapes.
He also concluded that motion of cyclopean shapes within the frontal plane is processed by interactions between modulations in disparity and luminance-defined motion rather than by a dedicated motion-detection system. In other words, signals from moving disparity-defined contours are too weak to override conflicting motion signals arising from luminance-defined texture elements.
Ito (1997) questioned the generality of Chang’s conclusion. He hypothesized that, when disparity-defined cyclopean motion occurs simultaneously with luminance-defined cyclopean motion, the percept will depend on the relative strengths of the two signals. He identified luminance-defined cyclopean motion with the short-range system that prefers short interstimulus intervals, and disparity-defined cyclopean motion with the long-range system that operates over longer interstimulus intervals. He created a vertical square grating in a dynamic random-dot stereogram that alternated in dichoptic phase by 90˚. The disparity-defined motion signal caused the grating to appear
to move laterally, while the luminance-defined motion signal caused dot elements to appear to move to-and-fro in depth. Lateral motion of the disparity-defined grating dominated with large dot displacements, when dot displacement was incoherent, or when interstimulus intervals were large. These features are symptomatic of the long-range motion system. Motion-in-depth of luminance-defined dots dominated when dot displacement was small, when the direction of dot displacement was coherent, or when the interval between frame changes was small. These stimulus features are symptomatic of the short-range motion system. Thus, Chang’s conclusion that luminance-defined motion dominates cyclopean motion is true only under certain conditions.
Fox et al. (1978) demonstrated that motion of cyclopean stimuli induces involuntary pursuit motion of the eyes, or optokinetic nystagmus. The stimulus was a dynamic random-dot stereogram depicting a vertical grating in depth with a 30-arcmin peak-to-trough disparity. The grating drifted across the display at 9.3˚/s. Stereoblind observers do not respond to motion of cyclopean shapes, presumably because they do not see the shape (Archer et al. 1987). However, stereoblind observers showed optokinetic nystagmus in response to dichoptic motion, a response that does not require prior detection of a cyclopean shape (Wolfe et al. 1981). Other aspects of the relation between pursuit eye movements and stereoscopic vision are discussed in Section 22.6.1.
Smith and Scott-Samuel (1998) moved a cyclopean square-wave grating with the fundamental frequency of disparity modulation missing. The grating appeared to move in the opposite direction, as one would expect if its direction were determined by the displacements of the third harmonic. This is also true for motion of luminance-defined and contrast-defined gratings. Thus cyclopean motion, like other forms of motion, is coded by mechanisms operating over different spatial scales.
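The reversal reported for the missing-fundamental grating follows from simple phase arithmetic. The sketch below assumes that the grating is displaced in steps of a quarter cycle of the fundamental, which is the usual arrangement for this kind of stimulus but is an assumption here, since the step size used by Smith and Scott-Samuel is not stated above. The function name is hypothetical. Each harmonic's step is wrapped to the nearest equivalent shift, on the further assumption that motion is seen along the shortest path.

```python
def wrapped_phase_step(harmonic, step_cycles):
    """Phase step of a harmonic, in cycles of that harmonic, wrapped to [-0.5, 0.5)."""
    step = harmonic * step_cycles
    return (step + 0.5) % 1.0 - 0.5

step = 0.25  # assumed displacement per frame, in cycles of the fundamental
print(wrapped_phase_step(1, step))  # fundamental: a forward quarter cycle
print(wrapped_phase_step(3, step))  # third harmonic: a backward quarter cycle
```

With the fundamental removed, the third harmonic is the strongest remaining component of a square wave, so its backward step determines the perceived direction.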
On balance, the evidence supports the idea that cyclopean motion is processed by true motion detectors rather than by change in position.
16.5.2 SPEED DISCRIMINATION OF CYCLOPEAN MOTION
Patterson et al. (1992b) measured the duration threshold for detecting direction of motion of a cyclopean grating in a dynamic random-dot stereogram. The threshold decreased with increasing velocity and did not depend on temporal frequency or displacement magnitude. They concluded that a motion-detection system operates at the level of cyclopean motion. The threshold increased as disparity in the moving grating increased, in both the crossed and uncrossed direction. This could be because there are fewer detectors of large disparities than of small disparities. The speed of cyclopean motion of a random-dot display
could be discriminated under conditions in which dot displacement could not be discriminated (Patterson et al. 1997).
Portfors-Yeomans and Regan (1997) compared speed discrimination for motion of a cyclopean shape with that for motion of a noncyclopean shape. They used a disk-shaped dynamic random-dot stereogram subtending 8.5˚ containing a central square subtending 0.75˚ with a crossed disparity of 10 arcmin. The cyclopean shape moved up, down, left, or right through various amplitudes and at various speeds around a mean speed of 0.25 or 0.5˚/s. Observers reported whether the shape moved faster or slower than the mean of 64 trials and whether it moved through a greater or lesser distance than the mean. Observers discriminated displacement whatever the speed when that was the task, and discriminated speed whatever the displacement when that was the task. Portfors-Yeomans and Regan concluded that displacement and speed of cyclopean shapes are processed independently and in parallel. Similar measures were obtained for a noncyclopean shape created by switching off the dots surrounding the central square. No difference was found between motion toward or away from the fovea. Weber fractions for discrimination of displacement ranged from 0.09 to 0.13 for the cyclopean shape, and from 0.07 to 0.09 for the noncyclopean shape. Weber fractions for discrimination of speed ranged from 0.11 to 0.15 for the cyclopean shape, and from 0.08 to 0.097 for the noncyclopean shape. The differences for the two types of shape, although significant, were small. The results justify the conclusion that detection of speed of a cyclopean shape in the frontal plane involves a specialized velocity-sensitive mechanism comparable in sensitivity to that involved in detection of speed of a noncyclopean shape.
Harris and Watamaniuk (1996) reported that Weber fractions for discrimination of speed of a cyclopean grating were at least 0.3 compared with between 0.05 and 0.1 for a noncyclopean grating.
They used a dynamic random-dot stereogram depicting a cyclopean vertical grating modulated in disparity at 0.5 cpd and moving at between 0.7 and 3˚/s. The noncyclopean stimulus was the same except that the stereogram was not dynamic, so that the dots were seen to move in each monocular image. Initial phase and exposure duration were varied to discourage observers’ basing their judgments on displacement magnitude. A vertical random-dot grating is not strictly cyclopean, since relative compressions are evident in each monocular image—only horizontal gratings are fully cyclopean. Observers may have been judging the monocularly perceived motion of the pattern of compression. The depth modulations of the grating may not have been properly registered, since the grating subtended only 0.38˚ in height and there were only five dots on average along the vertical dimension. Kohly and Regan (1999) obtained the same results using a similar display but found that several observers could specifically
discriminate the speed of a cyclopean grating that was 2.8˚ high.
Seiffert and Cavanagh (1998) found that, while detection of luminance-defined motion (first-order motion) was determined by velocity, detection of cyclopean motion (second-order motion) was determined by stimulus displacement. But their stimuli may have moved too slowly to be detected by the cyclopean velocity system.
16.5.3 CONTRAST EFFECTS FROM CYCLOPEAN MOTION
16.5.3a Cyclopean Motion Aftereffect
Dichoptic motion between monocularly visible stimuli can produce an opposite motion aftereffect in a stationary display (Section 16.4.3). Papert (1964) generated a disparity-defined cyclopean bar in a random-dot stereogram and caused it to move down over the background, taking care to remove all monocular cues to motion. After a period of inspection, a stationary cyclopean bar appeared to move in the opposite direction. It has been claimed that motion of a cyclopean shape produces only a weak and brief aftereffect (Anstis 1980) (Portrait Figure 16.20). However, Nishida and Sato (1995) found that, while a luminance-defined moving stimulus produces an aftereffect with both stationary and dynamic (flickering) test patterns, a cyclopean motion aftereffect requires a dynamic test pattern. Patterson et al. (1994) produced a robust motion aftereffect from the motion of a
disparity-defined vertical square-wave grating in a 60-Hz dynamic random-dot display. The dynamic display removed monocular motion signals. The cyclopean aftereffect required a longer induction time than a regular motion aftereffect, which is another reason why earlier investigators failed to find it. Otherwise the two types of motion aftereffect have several similar properties.
1. Both aftereffects are direction selective and confined to the location of the induction stimulus.
2. Although the luminance aftereffect lasted longer than the cyclopean aftereffect, the duration of both aftereffects was proportional to the square root of adaptation duration (Bowd et al. 1996).
3. Both aftereffects depend on the temporal frequency of the stimulus rather than on its speed. Shorter and Patterson (2001) measured the duration of the aftereffect produced by motion of a disparity-defined grating of various spatial frequencies moving at various speeds. The duration of the aftereffect peaked at a temporal frequency of between 1 and 2 Hz, whatever the speed. The magnitude of the aftereffect from a luminance-defined grating peaked at a frequency of 5 Hz (Pantle 1974). The lower temporal sensitivity of the cyclopean aftereffect reflects the lower temporal response of the disparity system (Section 18.12.3).
4. Both aftereffects occur only when the spatial frequency and orientation of the moving adapting stimulus and the stationary test stimulus are similar (Shorter et al. 1999).
5. The duration of neither aftereffect was reduced much when observers performed a distracting word-recognition task while viewing the induction stimulus (Patterson et al. 2005).
6. The aftereffect produced by a moving cyclopean shape is evident in a test pattern defined by luminance (Fox et al. 1982). Also, adaptation to a moving cyclopean plaid or its component gratings affects the perceived coherence of a luminance test plaid, and vice versa (Bowd et al. 2000).
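The square-root relation in item 2 has a simple consequence that is easy to check: quadrupling the adaptation time should only double the aftereffect duration. The sketch below assumes a pure square-root law with a free scale factor k. The value of k and the adaptation times are hypothetical, and k would differ between the luminance and cyclopean aftereffects.

```python
import math

def aftereffect_duration(adapt_s, k):
    """Aftereffect duration (s) under a square-root law; k is a free scale factor."""
    return k * math.sqrt(adapt_s)

k = 1.0  # hypothetical constant of proportionality
ratio = aftereffect_duration(120, k) / aftereffect_duration(30, k)
print(ratio)  # quadrupling adaptation time doubles the predicted duration
```

Because k cancels in the ratio, the prediction holds for both aftereffects even though their absolute durations differ.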
Figure 16.20. Stuart Anstis. Born in Southall, England, in 1934. He obtained a B.A. from Corpus Christi College, Cambridge in 1955 and a Ph.D. in psychology from Cambridge University with R. L. Gregory in 1963. Between 1964 and 1991 he held academic appointments in psychology at the University of Bristol, England, and at York University, Toronto. Since 1991 he has been a professor of psychology at the University of California at San Diego.
This evidence suggests that motion of cyclopean shapes and luminance-defined motion share a common neural substrate at some level of the nervous system. Both aftereffects depend on adaptation of a relatively low-level mechanism sensitive to temporal frequency rather than on adaptation of a higher-level mechanism that codes speed. But we cannot conclude that they share the same low-level system. A motion aftereffect induced by a moving cyclopean grating was weaker when the induction and stationary test displays did not have the same sign of binocular disparity,
that is, when they were on opposite sides of the zero-disparity plane (Fox et al. 1982). However, a later study from the same laboratory revealed that the motion aftereffect was reduced when both the induction and test displays were in the same disparity-defined plane on the same side of the zero-disparity plane (Patterson et al. 1996). It seems therefore that a difference in disparity between induction and test displays is not the crucial factor that influences the strength of the aftereffect. Perhaps there are fewer motion-sensitive cells among those tuned to nonzero disparity than among those tuned to zero disparity.
(a) Stereoscopic depth is preattentive.
16.5.3b Cyclopean Direction-of-Motion Contrast
In direction-of-motion contrast the difference in direction of motion of two patterns appears exaggerated (Marshak and Sekuler 1979). Cyclopean shapes are also subject to direction contrast. Thus, the difference in direction of motion of two cyclopean dynamic random-dot shapes appeared larger by between 5 and 20˚ when the angle between the motions was 30˚. These effects are probably due to mutual inhibition between motion detectors.
In the direction-of-motion aftereffect, viewing a moving grating shifts the apparent direction of a subsequently seen grating. Adaptation to a cyclopean display produced similar aftereffects in a cyclopean display as in a luminance-defined display (Patterson and Becker 1996). These cross-domain aftereffects indicate that luminance-defined motion and cyclopean motion are processed at least in part by the same neural mechanism.
Two superimposed arrays of dots moving in directions that are only 1 or 2˚ apart appear as one array moving in an intermediate direction. This is due to motion metamerism (Watamaniuk et al. 1989) (Section 4.2.7). It would be interesting to see whether motions of cyclopean shapes metamerize.
16.6 CYCLOPEAN TEXTURE SEGREGATION
16.6.1 CYCLOPEAN POP OUT
A stimulus with a given value of a feature is said to pop out when it is immediately detected among an array of stimuli with different values of the same feature. For example, a vertical line pops out when presented in an array of horizontal lines. Pop out indicates that values of the feature are processed in parallel without the need to attend to the stimuli sequentially. Processing is said to be preattentive. Preattentive processing of binocular disparity is discussed in Section 22.8.2. The present section is concerned with preattentive processing of cyclopean stimuli other than those involving disparity.
(b) Contour rivalry requires serial search (Wolfe and Franzel 1988).
(c) Luminance rivalry is preattentive.
Figure 16.21. Testing for preattentive cyclopean features. A feature is processed in parallel, or preattentively, when the time taken to find a stimulus exhibiting the feature is independent of the number of other stimuli.
16.6.1a Pop Out of Distinct Cyclopean Elements
A patch in stereoscopic depth pops out, as in Figure 16.21a. Detection of a patch of rivalrous gratings in a display of nonrivalrous gratings, as in Figure 16.21b, requires a serial search. A patch with reversed contrast in the two eyes that produces binocular luster pops out when set among patches with the same contrast polarity, as in Figure 16.21c (Wolfe and Franzel 1988). This suggests that binocular luster is a basic visual feature. However, a patch of rivalrous color among patches of matching color does not pop out, as can be seen in Figure 16.3a.
Wolfe and Franzel presented the stimuli shown in Figure 16.22. Two simple crosses were presented separately to the two eyes to produce a cyclopean double cross. The time taken to detect a double cross set in an array of single crosses was independent of the number of single crosses. This indicates that parallel search is possible for a stimulus that exists only after the inputs from the two eyes have been combined.
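The criterion for pop out, that search time is independent of the number of distractors, is usually assessed from the slope of reaction time against set size. The following sketch fits that slope by least squares. The data are invented for illustration, the function names are hypothetical, and the criterion of 10 ms per item is an assumed rule of thumb, not a value reported by Wolfe and Franzel (1988).

```python
def search_slope(set_sizes, reaction_times_ms):
    """Least-squares slope of reaction time against set size, in ms per item."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(reaction_times_ms) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, reaction_times_ms))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx

def looks_parallel(slope_ms_per_item, criterion=10.0):
    """True when the slope magnitude is below an assumed criterion (ms per item)."""
    return abs(slope_ms_per_item) < criterion

# Made-up data for illustration only.
sizes = [4, 8, 16, 32]
flat_rts = [510, 505, 515, 512]      # roughly independent of set size
steep_rts = [520, 680, 1000, 1640]   # grows by 40 ms for each added item

print(looks_parallel(search_slope(sizes, flat_rts)))
print(looks_parallel(search_slope(sizes, steep_rts)))
```

A near-zero slope is consistent with parallel, preattentive processing, whereas a steep slope suggests that items are examined one after another.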
Figure 16.22. Pop-out of a cyclopean shape. A single cyclopean double cross was set among a variable number of distractors consisting of single crosses. The panels show the left-eye image, the right-eye image, and the binocular stimulus for targets and for distractors. (Redrawn from Wolfe and Franzel 1988)
16.6.1b Pop-Out of Cyclopean Textured Regions
In monocular vision, differently textured regions may immediately stand out. This indicates that certain types of texture are processed in parallel and preattentively. Nothdurft (1985) claimed that cyclopean textured regions not visible in either monocular image of a random-dot stereogram perceptually segregate only when the different texture elements are seen in different disparity-defined depth planes. However, when Pérez-Martinez (1995) used texture elements that were more discriminable than those used by Nothdurft he obtained texture segregation within a single depth plane of a random-dot stereogram, as shown in Figure 16.23a. Pérez-Martinez devised the stereogram shown in Figure 16.23b, in which texture segregation occurs within each depth plane even though the boundaries between the textured regions are not evident in the monocular images.
Petrov (2003) created a stereogram depicting two superimposed textured surfaces separated in depth. In one stereogram, texture elements were horizontal in one surface and vertical in the other, except for a circular patch in which the orientation of the lines was reversed, as shown in Figure 16.23c. Other stereograms contained patches reversed in contrast polarity, as shown in Figure 16.23d, or in color, or in motion. All five subjects detected patches defined by reversed contrast, and most of them detected the patches defined by reversed motion. None of the five subjects detected patches defined by orientation or color. However, the patch defined by reversed orientation in the top right-hand corner of Figure 16.23c becomes visible once the two depth planes have been clearly segregated. Petrov used denser textures with shorter line elements than those in Figure 16.23c.
16.6.2 INVERSE CYCLOPEAN PHENOMENA
In a random-dot stereogram, a pattern visible to one eye can become invisible when the two images are fused. Julesz (1971) referred to this effect as inverse cyclopean stimulation. This can happen when monocular patterns are scattered in depth. The symmetry evident in one image of the stereogram in Figure 16.24 is lost when the images are fused (Julesz 1966). Monocular patterns may also disappear because their boundaries form complementary pairs, as in Figure 16.25 (Frisby and Mayhew 1979a). In this case, the regions are in the same depth plane. Frisby and Mayhew concluded that texture discrimination does not occur before monocular images are combined. But even if the textured regions were segregated before fusion they would surely form a homogeneously textured pattern at the cyclopean level.
Glass patterns illustrate how monocular structure may be disguised in dichoptic images. A Glass pattern is formed by first constructing a random array of dots and then superimposing on it a transformed copy of the same array. A radial Glass pattern emerges when the superimposed copy is slightly expanded radially. A concentric pattern emerges when the copy is slightly rotated (Glass and Perez 1973). In each case, the Glass pattern is defined by aligned pairs of dots in the combined pattern. In the stereogram shown in Figure 16.26a, the basic set of dots is the same in the two eyes. A radially expanded copy of the basic set is superimposed on both displays and displaced laterally in the right-eye display to produce a binocular disparity. Thus, in the fused image, the pairs of dots that define the Glass pattern are separated in stereoscopic depth. Fusion of these displays creates two depth planes with no evidence of the Glass pattern in either one. Thus, the pairing of the dots by disparity to produce depth has precedence over the pairings in each eye that create the Glass pattern (Earle 1985).
However, when the disparity between the pairs of dots was less than 17 arcmin, the Glass pattern could still be seen in the fused image (Khuu and Hayes 2005). In Figure 16.26b, the basic dot pattern is the same in both eyes. An expanded version of the basic pattern is superimposed on the left eye’s display. A different set of random dots is superimposed on the right eye’s display. The fused image appears as dots dispersed at random in depth, again with the Glass pattern disguised. In this case, the perception of depth supersedes the perception of aligned dot pairs. When pairs of dots defining the Glass pattern are shifted as a pair relative to matching pairs in the other eye, as in Figure 16.26c, the Glass pattern is preserved in the fused image. A Glass pattern not evident in either monocular image may be created in the cyclopean domain, as shown in Figure 16.26d. Adjusting the disparity between pairs of dots in Glass patterns can create a cyclopean Glass pattern in depth.
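The construction of Glass patterns described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to make the published figures: the dot count, expansion factor, rotation angle, and horizontal shift are arbitrary assumptions, and all function names are hypothetical. The last two lines build a pair of images in the manner described for Figure 16.26a, in which only the expanded copy is displaced in the right eye.

```python
import math
import random

def random_dots(n, seed=0):
    """n dots uniformly distributed in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

def radial_copy(dots, scale=1.05):
    """Copy of the dots expanded slightly about the center."""
    return [(x * scale, y * scale) for x, y in dots]

def rotated_copy(dots, angle_deg=3.0):
    """Copy of the dots rotated slightly about the center."""
    a = math.radians(angle_deg)
    return [(x * math.cos(a) - y * math.sin(a),
             x * math.sin(a) + y * math.cos(a)) for x, y in dots]

def shifted(dots, dx):
    """Dots displaced horizontally by dx, standing in for binocular disparity."""
    return [(x + dx, y) for x, y in dots]

base = random_dots(200)
radial_glass = base + radial_copy(base)        # radial Glass pattern
concentric_glass = base + rotated_copy(base)   # concentric Glass pattern

# Only the expanded copy is displaced in the right eye, so the two dots of
# each Glass pair fall in different depth planes after fusion.
left_image = base + radial_copy(base)
right_image = base + shifted(radial_copy(base), dx=0.02)
```

Plotting `radial_glass` or `concentric_glass` as a scatter of points shows the global structure, which arises only from the local alignment of each dot with its transformed partner.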
(a) Cyclopean regions defined by shape can be perceived. (From Pérez-Martinez 1995, Pion Limited, London)
(b) The stereogram forms two depth planes with distinct textured regions, as on the right. (Pérez-Martinez 1995)
(c) One depth plane has a patch of vertical lines among horizontal lines (as on the right), the other has a patch of horizontal lines among vertical lines. The patches become visible in the fused image.
(d) A stereogram like (c) but with patches defined by differences in contrast polarity. (From Petrov 2004)
Figure 16.23. Cyclopean texture segregation.
1 6 . 7 B I N O C U L A R VI S UA L D I R E C T I O N (This section was written with Alistair P. Mapp and Hiroshi Ono)
are egocentric reference frames because they involve some part of the observer’s body.
16.7.1 I N T RO D U C T I O N
1. Oculocentric frame Oculocentric judgments of direction are with respect to the visual axis or one of the principal retinal meridians.
People can judge the direction of an isolated object in any of the following reference frames, the first three of which
2. Headcentric frame Headcentric judgments of direction are with respect to the median plane of the head and
230
•
STEREOSCOPIC VISION
Inverse cyclopean effect. Symmetry evident in the monocular images becomes difficult to detect when the images are fused.
(a) Left image is a random pattern plus an expanded copy. Right image is the same with one image displaced laterally. Fusion creates two depth planes but no radial Glass pattern.
Figure 16.24.
(Adapted from Julesz 1966)
(b) Left image is a random pattern plus an expanded copy. Right image is the same pattern plus another random pattern. Fusion creates a 3-D dot array with little visible Glass pattern.
Inverse cyclopean effect in complementary images. Boundaries between the textured regions in each monocular image are less evident after fusion because they form complementary pairs and the bold regions suppress the less bold. (Based on a figure in Frisby and Mayhew 1979b)
Figure 16.25.
the midtransverse plane through the eyes. A headcentric judgment requires the observer to register the position of the images in the eyes (oculocentric component) and the angular position of the eyes in the head (eyeposition component). When an observer fixates an object, its images can be expected to fall precisely on the foveas, and there is little uncertainty about the oculocentric direction of the object. In this condition, a headcentric judgment reduces to the task of registering the direction of gaze. The direction of gaze could be provided by proprioceptors in the extraocular muscles or by efference to those muscles. The images of the orbital ridges and tip of the nose could also provide it. However, the variability of judgments of straight ahead was not affected by whether or not these structures were in view (Shebilske and Nice 1976; Wetherick 1977). It seems that the nose is too eccentric in each monocular visual field to be effective, since an external object fixed with respect to the head did reduce variability of judgments when it was near the target. 3. Torsocentric frame Torsocentric judgments are made with respect to the median plane of the body and some arbitrary horizontal plane. They must now take account of the position of the head on the torso.
Figure 16.26(c). The two images are similar Glass patterns but with disparities between dot pairs. Fusion creates a 3-D Glass pattern.
Figure 16.26(d). Disparity creates a Glass pattern not present in either image.
Figure 16.26. Cyclopean effects on Glass patterns. (From Earle 1985, Pion Limited, London)
4. Exocentric frame In an exocentric judgment the direction of one visual object is judged with respect to a second object or with respect to an external reference frame. When the reference frame is visual, only the relative locations of retinal images are required. In order to interpret responses from an experiment on visual direction the experimenter must know which frame of reference observers are using. For example, when asked to set a stimulus to “straight ahead” an observer could set
it (1) on the visual axis, (2) on the median plane of the head, (3) on the median plane of the torso, or (4) in the center of the visual field. The responses would coincide only when object, visual axis, and median planes were aligned. Oculocentric judgments require only one source of information—image position. Headcentric judgments require additional information about the position of the eyes in the head. Torsocentric judgments require, in addition, information about the position of the head on the torso. Exocentric judgments require only information about the relative locations of retinal images. One would expect precision to be less when more sources of information are required, since each source of information adds noise. The following discussion is concerned mainly with headcentric directional judgments, but there is some discussion of exocentric direction tasks. It seems that the only experiments that involve oculocentric judgments of direction are experiments on saccades and visual pursuit to targets defined with respect to the fovea. We will see that the failure to clearly distinguish between egocentric and exocentric frames of reference has produced confusion.
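The expectation that precision falls as more sources of information are required can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the noise sources are independent, so that their variances add. The function name and the three standard deviations are hypothetical values chosen only for illustration, not measurements.

```python
import math

def combined_sd(*component_sds):
    """Root-sum-square of noise standard deviations (assumes independent sources)."""
    return math.sqrt(sum(sd * sd for sd in component_sds))

# Hypothetical standard deviations in degrees (illustrative values only).
IMAGE_SD = 0.3          # registering image position on the retina
EYE_POSITION_SD = 0.4   # registering the position of the eyes in the head
HEAD_POSITION_SD = 1.2  # registering the position of the head on the torso

oculocentric_sd = combined_sd(IMAGE_SD)
headcentric_sd = combined_sd(IMAGE_SD, EYE_POSITION_SD)
torsocentric_sd = combined_sd(IMAGE_SD, EYE_POSITION_SD, HEAD_POSITION_SD)

print(oculocentric_sd, headcentric_sd, torsocentric_sd)
```

Under these assumptions the predicted variability grows from oculocentric to headcentric to torsocentric judgments, whatever the particular component values.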
16.7.2 HEADCENTRIC DIRECTION AND THE EGOCENTER
16.7.2a. Basic Law of Visual Direction
When a near object is fixated binocularly, the eyes point in different directions with respect to the median plane of the head, and yet the visual object appears to have a single direction in space. Somehow directional information from the two eyes combines to produce a unitary sense of visual direction. One is then confronted with the question of which location in the head serves as the origin for directional judgments. One possibility is that it is the dominant eye, but the evidence reviewed next suggests that directional judgments are referred to a point midway between the eyes, known as the cyclopean eye or visual egocenter. Therefore, information from each eye must be transferred to the cyclopean eye. This section is concerned with how this is done, both when the eyes fixate the target, so that the positions of the two images correspond, and when directional information is derived from disparate images. Analysis of headcentric direction starts with the basic unit of the visual line, which is common to directional judgments in all frames of reference. A visual line is any straight line passing through the pupil and the nodal point of an eye. A visual line is the locus of all points, fixed relative to the eye, which stimulate a given point on the retina. The visual line through the center of the fovea is the visual axis. Any other visual line may be specified in terms of its angle of azimuth with respect to the eye’s median plane, and of its angle of elevation with respect to the eye’s mid-transverse plane. Wade et al. (2010) suggest that the term “ocular line”
is less ambiguous than the term “visual line,” but we will retain the term “visual line.” A visual line may also be specified in terms of its angle of eccentricity and meridional angle (Section 14.3.1). For a given position of an eye, each fixed point in space has only one physical direction and only one apparent direction. An exception to this rule is provided by monocular diplopia or polyopia, in which single objects appear double or multiple, either because of an optical defect or because of defective neural processing (see Section 14.4.2). All points on the same visual line have the same visual direction and appear visually superimposed. Objects on different visual lines of one eye appear in distinct locations, except objects that are closer together than the resolution threshold of the visual system. The preceding statements are summed up by the basic law of visual direction, which states that all objects on the same visual line are judged to be in the same direction, which is unique to that set of objects. The law does not specify where the objects in the set appear with respect to any of the four frames of reference. It simply states that the objects appear aligned (superimposed) in each frame of reference. The law of visual direction does not state that two images on different visual lines in one eye retain the same relative apparent direction under all conditions. For example, two lines that are aligned or parallel on the retina do not necessarily appear aligned or parallel and may change their apparent relative orientation. Apparent changes in orientation or position of fixed images are illustrated by figural aftereffects, geometrical illusions, and the tilt aftereffect.
16.7.2b. Laws of Headcentric Direction
For a given angular position of an eye, points on the same visual line are also judged to be in the same headcentric direction. Thus, the law of headcentric direction states that, for a given position of the eye in the head, objects lying on the same visual line are judged to be in the same headcentric direction with respect to the cyclopean eye, which is unique to that visual line. The basic demonstration of this law was reported by Ptolemy (c. AD 150) (see Howard and Wade 1996; and Tyler 1997), Alhazen (see Howard, 1996), and Wells (1792). The concept of the cyclopean eye was proposed by Towne (1865, 1866), Hering (1868, 1879), and LeConte (1871, 1881) at approximately the same time (see Wade et al. 2006; Ono et al. 2009). When a person sights with one eye along a thin rod held about 30 cm from the eye, the rod lies along the visual axis and is experienced as pointing directly at the cyclopean eye. One may also align a rod with a visual line other than the visual axis by maintaining fixation on a point in the median plane, while aligning an eccentrically placed rod. In this case, too, the rod appears to point directly at the cyclopean eye.
Figure 16.27. Loss of alignment with motion of an eye. Points A and B are aligned when the eye is in the primary position. They become misaligned when the eye rotates. This is because the nodal point is in front of the eye’s center of rotation.
A rod aligned with a visual line for one position of the eye will not be aligned with any visual line, after the eye has rotated to another position. This is because the center of rotation of the eye is behind the nodal point, so that rotating the eye displaces the nodal point in the same direction as the eye movement (Figure 16.27). Brewster (1844b) called this ocular parallax. The displacement of the center of rotation from the nodal point is revealed by the fact that an object near the nasal limit of the visual field with gaze straight ahead is not visible when the eye is turned in toward the nose (Mapp and Ono 1986). The cyclopean eye, or visual egocenter, is the location in the head toward which visually aligned objects appear to point. A more precise definition is given later. The cyclopean eye is not necessarily the point toward which monocularly aligned objects actually point, which is, of course, the nodal point of the eye. In fact, as we will see later, the cyclopean eye is normally in the median plane of the head. Next, assume that, in the binocular field, images falling on corresponding points in the two retinas have a common visual direction. Each pair of corresponding points is associated with a pair of corresponding visual lines. It follows from the law of visual direction and the principle of corresponding points that all objects on either of a pair of corresponding visual lines appear spatially superimposed. This is the law of common binocular directions applied to corresponding lines. In itself this does not prove that objects lying along corresponding visual lines will appear to be in the same headcentric direction for the two eyes. For instance, if each eye were a center of reference for headcentric direction, an object seen by one eye and an
object on a corresponding line in the other eye would seem to be in a different direction even though the objects appeared to occupy the same position in space. In fact, corresponding visual lines are referred to the cyclopean eye, which is normally midway between the eyes. These principles can be summed up by the law of the cyclopean eye. Points on any visual line of either eye appear aligned with the cyclopean eye midway between the eyes. Any line through the egocenter is a cyclopean line. Since the egocenter does not correspond to the nodal point of either eye, cyclopean lines and visual lines do not coincide. The direction of a cyclopean line can be specified with respect to the coordinates of the cyclopean eye or with respect to headcentric coordinates (see Section 14.3.2). To say that a set of points appears aligned with the egocenter does not specify the apparent direction of the points relative to the median plane of the head, since direction relative to the median plane cannot be specified by one point on the median plane. A metric for headcentric direction is provided if the direction of the point of binocular fixation is judged correctly. This is a point on the horopter where the two visual axes intersect. This idea can be generalized by the statement that the headcentric directions of all points on the horopter (points where corresponding lines intersect) are judged correctly. Thus, the apparent direction of lines within the horizontal plane of regard may be specified if (1) the directions of points on the horizontal horopter are judged correctly and (2) points lying on the same visual line are perceived as collinear. From the law of common binocular directions and from these assumptions one can derive the law of cyclopean projection. This states that points on any visual line appear to be aligned with the cyclopean eye and the physically defined point where the visual line intersects the horopter. 
This can be regarded as a corollary to the law of common binocular directions. With symmetrical convergence, points on the visual axis of either eye appear in the median plane of the head. The angle ϑ, between the visual axis and the median plane, is half the angle of convergence, as shown in Figure 16.28. Assume that the horopter conforms to the Vieth-Müller circle and that the cyclopean eye lies on this circle, midway between the eyes. Then ϑ is the angle between any visual line and the cyclopean line on which objects on the visual line appear to lie. Thus, for a given convergence, any visual line will appear displaced by half the vergence angle with respect to that visual line. The cyclopean line passing through the intersection of the two visual axes is called the cyclopean axis or the common axis. The law of differences in headcentric directions states that the angle formed by a visual axis and a visual line is seen as the difference in the visual directions between the cyclopean axis and the cyclopean line. A corollary of this law is that an angle formed by two visual lines of an eye is seen as the difference in the visual directions at the
Figure 16.28. The egocenter and perceived direction. Assume that the visual egocenter lies midway between the eyes on the Vieth-Müller circle and that the headcentric directions of points on the horopter are correctly judged. Points, such as A and B, lying on a visual axis will appear aligned with the egocenter and the point where the visual axis intersects the horopter. The angle ϑ between the visual axis and the line on which the objects appear is half the vergence angle. Objects, such as C and D, on another visual line, also appear displaced by angle ϑ.
cyclopean eye. In Figure 16.28, the angle formed by A and C is equal to the angle formed by A′ and C′.
16.7.2c Demonstrations of Laws of Headcentric Direction
Hering stated the law of the cyclopean eye as:
For any given two corresponding lines of direction, or visual lines, there is in visual space a single visual direction line upon which appears everything which actually lies in the pair of visual lines. (Hering 1879, p. 41).
The truth of this statement was demonstrated in the following way:
Let the observer stand about half a meter from a window which affords a view of outdoors, hold his head very steady, close the right eye, and direct the left to an object located somewhat to the right. Let us suppose it is a tree, which is well set off from its surroundings. While fixing the tree with the left eye a black mark is made on the windowpane at a spot in line with the tree. Now the left eye is closed and the right opened and directed at the spot on the window, and beyond that to some object in line with it, for example, a chimney. Then with both eyes open and directed at the spot, this latter will appear to cover parts of the tree and chimney. Both will be seen simultaneously, now the tree more distinctly, now the chimney, and sometimes both equally well, according to which eye’s image is victor in the conflict. One sees therefore, the spot on the pane, the tree and the chimney in the same direction. (Hering 1879, p. 38)
Figure 16.29 illustrates this situation. Another way to illustrate the concept of the cyclopean eye is to draw two lines on a card so that when the card is held in front of the eyes, the lines extend precisely from the center of each pupil to an apex, as in Figure 16.30. A thin vertical separator down the center of the card ensures that each eye sees only its own line. If the lines are visually distinct, for instance in different colors, and if fixation is maintained on the point where they intersect, the two lines appear as one line extending from a point between the eyes. Ptolemy used this display in the second century, Alhazen used it in the 11th century (Section 2.10.2), and Towne used it in the 19th century.
Figure 16.29. Hering’s illustration of cyclopean direction. While fixating a distant object (tree) with only the left eye open, a black spot on the pane of glass is aligned with the tree. When both eyes fixate the spot, a distant object (a house) in line with the spot for the right eye, and the object aligned with the left eye, appear superimposed.
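Hering’s window demonstration can be worked through numerically as a minimal sketch. It applies the law of differences in headcentric directions for the special case of symmetrical convergence, in which the cyclopean axis lies in the median plane. The coordinate system, the function names, the 6.5 cm interocular distance, and the positions of the spot and tree are all assumptions made for illustration. For simplicity the physical direction is measured from the midpoint of the interocular axis rather than from a point on the Vieth-Müller circle.

```python
import math

def azimuth_deg(origin, target):
    """Azimuth of target from origin in degrees (x rightward, y straight ahead)."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    return math.degrees(math.atan2(dx, dy))

def apparent_azimuth_deg(eye, point, fixation):
    """Apparent headcentric azimuth of a point seen by one eye.

    Assumes symmetrical convergence on a fixation point in the median
    plane (x = 0), so the cyclopean axis has an azimuth of zero.
    """
    if abs(fixation[0]) > 1e-9:
        raise ValueError("this sketch handles only fixation in the median plane")
    # Angle between the point's visual line and the eye's visual axis.
    oculocentric = azimuth_deg(eye, point) - azimuth_deg(eye, fixation)
    cyclopean_axis = 0.0
    return cyclopean_axis + oculocentric

HALF_IPD_CM = 3.25               # assumed half interocular distance
LEFT_EYE = (-HALF_IPD_CM, 0.0)
SPOT = (0.0, 50.0)               # spot on the pane, half a meter away
TREE = (9.75, 200.0)             # lies on the left eye's visual axis, beyond the spot

half_vergence = azimuth_deg(LEFT_EYE, SPOT)        # angle between visual axis and median plane
tree_apparent = apparent_azimuth_deg(LEFT_EYE, TREE, SPOT)
tree_physical = azimuth_deg((0.0, 0.0), TREE)

print(half_vergence, tree_apparent, tree_physical)
```

In this example the tree is physically to the right of the median plane, yet its computed apparent direction is straight ahead, in line with the spot, as in Hering’s description.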
Figure 16.30. Illustration of the egocenter. Demonstration that visual directions are referred to an egocenter. Each line must point to the pupil of an eye, and fixation should be on the point where the lines meet. The two lines appear superimposed in the median plane of the head.
We produce a unified sense of direction from the distinct vantage-points of the two eyes by judging directions with reference to the cyclopean eye. The directions of objects on any pair of corresponding visual lines are judged as though the objects are seen by the cyclopean eye, as shown in Figures 16.30 and 16.31. When 2-year-old children sight an object through a tube, they place the tube midway between the eyes (Church 1966). This is called the cyclops effect. Barbeito (1983) found that about one-third of a group of 3-year-old
Figure 16.31. Illustrating Hering’s law of visual direction. A pinhole in a card is held several centimeters in front of the right eye. A black dot on a pane of glass is fixated directly by the left eye and through the pinhole by the right eye. Object A, on the visual axis of the left eye, appears at A′ in the median plane beyond the fixation point, even when the right eye is closed.
children behaved this way but only about one in ten of 4-year-olds. The cyclops effect has also been reported in young strabismic children and in children under 4 years of age, two years after they had one eye removed (Dengis et al. 1993a). Older observers (5.8 to 22.8 years) showed some cyclops effect in visual tasks such as aligning a line or a moving stimulus with a landmark of the head (the bridge of the nose or edge of the pinna) but less so than age-matched binocular observers (Gonzalez et al. 1999). Thus, young children behave as if they see out of the cyclopean eye and they must learn to bring a tube to one or the other eye. That is not to say that they consciously believe that their eyes are in the center of the head. Even after children have learned to bring a tube to one eye they must learn to close the other eye (Dengis et al. 1996, 1997). Moreover, if visual feedback is eliminated as the tube is raised toward the face, then adults behave as children do and place the tube midway between the eyes (Dengis et al. 1998). A corollary of the law of the cyclopean eye is that two objects at different distances, that appear aligned when viewed with one eye, will not appear aligned when viewed with the other eye. This follows from the fact that two objects at different depths cannot fall simultaneously on corresponding visual lines in the two eyes, because corresponding visual lines intersect in only one point. When one sights a distant object through a ring with both eyes open, there is conflicting information about the alignment of the ring and the object. In this situation, binocular disparity is too large to allow fusion. Most people accept the information in one eye—the sighting eye—and ignore what they see with the other eye. The sighting eye is therefore the eye one uses preferentially in making judgments about the alignment of objects well separated in depth. 
This is not to say that the sighting eye becomes the location in the head that serves as the origin of directional judgments, for it is not. Points lying on a horizontal line extending away from the observer in the median plane of the head stimulate noncorresponding points in the two retinas, except where the line intersects the horopter. When the line is just below eye level it appears as a cross with its intersection point on the horopter. This cross is easily observed by taking a card with a line drawn on it and holding it just below eye level with one end of the line touching the bridge of the nose. It is as if the space before one eye had rotated scissors-fashion about the fixation point over the space before the other eye. This has the effect of apparently transferring the objects on each visual axis to the median plane and, for each eye, transferring objects in the objective median plane of the head to the visual axis of the opposite eye. Ptolemy and Alhazen described this effect (Section 2.10.2). With symmetrical convergence, all visible objects imaged on the foveas are judged to have the same headcentric direction, which lies approximately in the median plane of the head. This is true even when only one eye is open or
when, because of an obstruction, the object can be seen by only one eye. Hering demonstrated this in the following manner. A card with a pinhole at its center is held several centimeters in front of the right eye. A black dot, F, on a pane of glass is fixated directly by the left eye and by the right eye through the pinhole, as illustrated in Figure 16.31. A small object, A, is placed beyond the glass on the visual axis of the left eye. Although A is seen by only the left eye and is to the right of the median plane, it appears in the median plane behind the point F. If the right eye is closed, the impression remains the same. The apparent position of A changes only if the eyes change their positions. Results discussed so far in this section have not been universally accepted. See section 16.7.7 for a discussion of recent controversies. There are three possible reasons for these controversies: 1. The distinction between relative (exocentric frame) and absolute (headcentric or torsocentric frame) directions is sometimes not made. 2. Observers often have knowledge of the actual location of stimuli and which eye is being used to view the stimuli. 3. Tasks used in visual direction experiments lack ecological validity (they are not used in daily life). To address the first two reasons, Khokhotva et al. (2005) presented a fixation point in the median plane and flashed one of thirteen monocular stimuli closer to the face. Two of the thirteen stimuli were collinear with respect to one of the eyes and the fixation stimulus, and one of them was collinear with respect to the bridge of the nose. For the relative direction task, observers reported whether the closer stimulus appeared to the left of, to the right of, or in the same direction as the fixation stimulus. For the headcentric task, observers reported to which part of their face (or beside their face) the imaginary line passing through the two stimuli appeared to point. 
The relative direction judgments were correct in over 90% of the trials. The headcentric direction judgments followed the laws of headcentric direction. Specifically, the line passing through the fixation point and an eye appeared to point to the nose and the line passing through the fixation point and the nose appeared to point to an eye. To address the third reason, Mapp et al. (2007) used rifle aiming, dart throwing, and pistol aiming tasks. These tasks are unlike visual direction tasks such as those used by Khokhotva et al. (2005), which require making judgments about nonfixated stimuli. Rifle aiming is a relative direction task in which observers are required to make the two sights of the rifle collinear with a target. The task is essentially one of Vernier acuity. Dart throwing and pistol aiming are headcentric or torsocentric tasks in which observers are required to localize the target with respect to their head or body. The task of aiming a rifle was performed precisely and accurately independent of whether fixation was on the target or the rifle sights. The dart throwing and pistol aiming tasks were performed less accurately when viewing the target monocularly than when viewing it binocularly, and the extent of the constant errors in the monocular condition was predictable from the extent of phoria (the deviation of the nonviewing eye). See Section 16.7.5.
16.7.2d Summary
Here are five laws or principles of visual direction for distinct pointlike stimuli:
1. The law of visual direction. Objects on a given visual line have the same visual direction and appear aligned, or superimposed in any frame of reference. Objects falling on discriminably different visual lines appear spatially separate in any frame of reference.
2. The law of headcentric direction. For an eye in a fixed direction of gaze relative to the head, objects lying on the same visual line are judged to be in the same headcentric direction, which is unique to that visual line.
3. The law of the cyclopean eye. In monocular or binocular viewing, all visual lines of either eye appear to point to a common cyclopean eye midway between the eyes.
4. The law of cyclopean projection. Points on a visual line appear to lie on the cyclopean line that geometrically intersects the visual line on the horopter. It follows that objects on the visual axes of the two symmetrically converged eyes appear to extend in the median plane of the head from a point midway between the eyes. In general, for asymmetrical stimuli and asymmetrical convergence, objects on any pair of corresponding visual lines appear on a cyclopean line passing through the cyclopean eye and the point in the horopter contained in both visual lines. An object seen and fixated by only one eye is judged to be in the direction of a line that intersects the cyclopean eye and the point of binocular convergence. There are complications due to phoria and strabismus.
5. The law of differences in headcentric directions. The angle formed by a visual axis and a visual line is seen as the difference in the visual directions between the cyclopean axis and the cyclopean line.
Ono (1979, 1991) and Ono and Mapp (1995) described a similar set of principles. Although Hering is usually credited with first formulating principles of visual direction, Ptolemy described cyclopean projection in the 2nd century AD and so did Alhazen in the 11th century (Section 2.2.4c). Principles of cyclopean projection were also illustrated by William Briggs in 1676 and by Wells in his book Essay upon
Figure 16.32. Wim A. van de Grind. Born in Rotterdam in 1936. He received a B.Sc. in electronics from the University of Rotterdam in 1958 and a Ph.D. on “Retinal Machinery” with M. A. Bouman from Utrecht University in 1970. He then worked with O. J. Grüsser at the University of Western Berlin. He was appointed professor of sensory physiology and psychophysics at the Physiological Institute in 1974. In 1987 he was appointed professor of comparative physiology at Utrecht University, where he stayed until his retirement in 2001. Since then he has worked as a research professor in the Laboratory for Functional Neurobiology, Utrecht University.
Single Vision with Two Eyes, written in 1792. This was 87 years before Hering wrote his account (Ono 1981; van de Grind et al. 1995) (Portrait Figure 16.32). Moreover, LeConte (1871, 1881) independently proposed the laws of visual direction that make the same predictions and suggested the term “the cyclopean eye” for the origin of visual direction (see Wade et al. 2006). These laws must be modified to account for the perceived directions of points on a surface or of points in monocularly occluded areas (Sections 16.7.3, 16.7.4). Any account of visual direction must distinguish between absolute and relative directions and between physical and perceptual variables.
16.7.3 VISUAL DIRECTIONS OF DISPARATE IMAGES
16.7.3a Directional Averaging of Disparate Images
The fact that the directions of objects are judged in relation to a cyclopean eye means that the directions of all objects other than those in the horopter are misjudged (see Ono and Angus 1974) (Portrait Figure 16.33). This seems intolerable from a behavioral point of view. However,
Figure 16.33. Hiroshi Ono. Born in Kyoto, Japan in 1936. He obtained a B.A. in psychology from Dartmouth College, New Hampshire in 1960 and a Ph.D. from Stanford University in 1965. He conducted postdoctoral work at Stanford University before taking an academic appointment at the University of Hawaii in 1966. In 1968 he moved to York University, Toronto where he is now a Distinguished Research Professor of psychology.
people usually fixate an object to which they are attending, so that the illusory directions of other objects are of little consequence. We can think of the visual fields of the two eyes as uniting to form a single cyclopean field. Cyclopean lines of sight of points within this cyclopean visual field are formed by the superimposition of the sets of binocularly corresponding lines from each retina. However, slightly disparate images also fuse in the cyclopean field. One may then ask whether the perceived direction of two slightly disparate but fused images is that of the image in one or the other eye, or the average of the visual-line values of the two images, as originally proposed by Hering (1865). Ono et al. (1977) investigated this question by presenting observers with the stereogram shown in Figure 16.34. When the eyes converge to fuse the two outer circles, the fused image of the enclosed dots appears centered in the fused image of the circle seen by both eyes. This indicates that the cyclopean oculocentric direction of fused disparate images is the average of the visual-line values of the two monocular images. With large disparities, the images appear double. Sheedy and Fry (1979), also, found that the cyclopean direction of fused disparate images is essentially an equally
Figure 16.34. The visual direction of disparate images. When the top two circles are fused, the black dots fuse and appear at the center of the circle. Directions of monocular images are averaged. In the lower pair of circles, the black dots do not fuse, and their directions are not averaged. Nonius lines in small circles indicate correct vergence. (Adapted from Ono et al. 1977)
Figure 16.35. Helmholtz’s stereogram. This stereogram was used by Helmholtz (1909) to support Wheatstone’s claim that stimuli falling on corresponding points in the two retinas can be seen in two different directions.
weighted compromise between the directions of the two monocular images. The perceived orientation of the fused image of a stereogram consisting of gratings with a cyclodisparity is also the mean of the orientations of the disparate images (Kertesz 1980). Rose and Blake (1988) found that dichoptic vertical lines with a horizontal disparity of up to 1˚ appeared displaced toward each other. Thus, a tendency to average the oculocentric directions of dichoptic images is evident even in unfused images. If we did not average the visual directions of disparate images, 3-D objects would appear distorted. For example, when we look straight down on a pyramid we see a regular pyramid even though the image of the apex is displaced in opposite directions relative to the image of the base in the two eyes. The laws of headcentric visual direction must be modified to deal with this averaging process. Corresponding lines in the two eyes are seen in a single direction. But so also are binocularly disparate visual lines that are fused by the stereoscopic system. Moreover, Wheatstone (1838) argued that the converse of this is sometimes true. Two corresponding visual lines, one in each eye, can yield the percept of two visual directions (Ono and Wade 1985). To demonstrate Wheatstone’s claim, Helmholtz (1909) constructed a stereogram consisting of two parallelograms, as shown in Figure 16.35. The top and bottom edges are horizontal, and the sides are oriented so that their upper portions produce crossed disparities with respect to the lower portions. The left half of each parallelogram is green and the right half red. The division between the two halves is parallel to the sides of the parallelogram. Thus, stimulus elements near the color boundary stimulate corresponding points in the two retinas but in different colors. The fused image appears as a rectangle inclined in depth about a horizontal axis. 
This indicates that stimulus elements falling on corresponding points are seen in different colors and therefore in
different halves of the display, that is, in different directions. Ono et al. (2000) obtained a similar effect using a random-dot stereogram. The above phenomenon is best understood as a necessary consequence of the fact that a surface with disparity is binocularly fused into a single surface. When two different sets of visual lines associated with elements of the retinally disparate surface are averaged, a single surface emerges. A stimulus element that falls on disparate visual lines is seen in a single direction. Different elements on corresponding visual lines are seen in two different directions. Ono et al. (2000) called this change of a visual line value into a cyclopean direction a “transformation of the visual-line value.” Erkelens and van Ee (1997a, 1997b) called it “capture of visual direction.”
16.7.3b Ocular Prevalence
When the cyclopean direction of fused disparate images is an equally weighted compromise, as discussed above, the directional transformation of the image in each eye is one half the binocular disparity. However, the perceived direction of a pair of fused disparate images is not always an equally weighted compromise. The term ocular prevalence refers to the extent to which the perceived location of a fused disparate image is pulled toward one eye or the other. The Freiburg Ocular Prevalence Test is a computer-based version of a test developed by Sachsenweger (1958). The images shown in Figure 16.36 are presented in a crystal-shutter stereoscope. The two triangles have a relative disparity of 7 arcmin, so that one appears nearer than the other. The subject adjusts the horizontal positions of the two triangles until they appear vertically aligned. Using this test, Kommerell et al. (2003) found a displacement of the triangles from midline alignment of more than 10% in 13 of 20 subjects. In 15 of the 20 subjects the displacement was in the direction of the dominant eye, as assessed by pointing to a distant object.
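The weighting idea described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names and the `right_weight` parameter are hypothetical, directions are expressed in degrees along a single horizontal axis, and the linear weighting is the simplest form consistent with the text rather than the scoring used in the Freiburg test.

```python
def fused_direction(left_deg, right_deg, right_weight=0.5):
    """Direction of a fused image as a weighted mean of the two monocular directions.

    right_weight = 0.5 gives the equally weighted compromise; other values
    model (hypothetically) a prevalence of one eye over the other.
    """
    if not 0.0 <= right_weight <= 1.0:
        raise ValueError("right_weight must lie between 0 and 1")
    return (1.0 - right_weight) * left_deg + right_weight * right_deg


def per_eye_shift(left_deg, right_deg, right_weight=0.5):
    """Directional transformation applied to each eye's image (left, right)."""
    fused = fused_direction(left_deg, right_deg, right_weight)
    return fused - left_deg, fused - right_deg


# Equal weighting: each image is shifted by half of the 1-degree disparity.
equal = per_eye_shift(0.0, 1.0)
# Right-eye prevalence: the fused image lies nearer the right eye's direction.
prevalent = per_eye_shift(0.0, 1.0, right_weight=0.75)
```

With equal weights the two shifts are equal and opposite; with unequal weights the image in the prevalent eye is transformed less than the image in the other eye.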
Figure 16.36. The Freiburg ocular prevalence test. When fused, one of the triangles appears in front of the frame and the other appears beyond the frame. (From Kommerell et al. 2003, with permission from Elsevier)

Using a similar test of ocular prevalence, Ehrenstein et al. (2005) found that 35.5% of 103 subjects showed right-eye prevalence, 27.5% showed left-eye prevalence, and the rest showed no prevalence. They also found that ocular prevalence was correlated with sighting dominance. If the images in the two eyes differ in luminance or blur, the direction of the fused image is biased toward the more intense or less blurred image (Verhoeff 1933; Charnwood 1949; Ono and Barbeito 1982; Kommerell et al. 2003). This shift may be responsible for the adverse effect of unequal image luminance on stereo acuity, as described in Section 18.5.4. Similar biasing effects occur in any metameric sensory system. For example, when two points are applied to neighboring areas of the skin they feel like one point in a position that depends on the relative strengths of the two stimuli (Section 4.2.7). Mansfield and Legge (1996) provided evidence that the weight assigned to a monocular image is a function of the uncertainty of localization of that image, which depends on such factors as its luminance and blur. Khan and Crawford (2003) asked subjects to point with an unseen hand at a visual target in dark surroundings. The head was fixed, and subjects fixated the target, which appeared at various horizontal eccentricities. When the target was in the midline, subjects aligned the finger with the target and a cyclopean point between the eyes. Khan and Crawford (2001) claimed that the sighting dominant eye, which they found to be gaze-dependent, determined the location of the egocenter. Also, as the target became more eccentric, the finger moved into alignment with the eye on the side of the target. The asymmetrical position of the arm may have been a factor, because they used only one arm.
Erkelens and van de Grind (1994) produced evidence that the rules governing the apparent visual direction of binocular and monocular images must be modified for images within a monocular occlusion zone bordering a vertical step of disparity, as in Panum’s limiting case. However, Nakamizo et al. (1994) found that the laws of visual direction do account for the apparent directions of images in Panum’s limiting case. The results of an experiment by Takeichi and Nakazawa (1994) can be interpreted in the same way. For further discussion of the rules governing visual direction see Section 16.7.7.

16.7.4 PERCEIVED DIRECTION OF MONOCULAR STIMULI
16.7.4a Direction of Simple Monocular Stimuli
It was explained in Section 16.7.3a that the perceived direction of fused disparate images is intermediate between the directions of the component monocular images. Thus, in Figure 16.37A, the fused images of the vertical lines appear vertically aligned but in different depth planes. The question addressed in this section is whether the perceived direction of a stimulus presented to only one eye changes when the horizontal disparity of the background changes. In certain stimulus conditions, the transformation of visual direction that occurs with fused disparate images is the same for a monocular stimulus. To illustrate this point, imagine a random-dot stereogram with an inner area that has crossed disparity and appears in front of a surrounding area. Further, imagine a vertical line superimposed on the center of the image of the inner area in one eye. This line will appear in the center of the fused area, although it does not have a fusible partner in the other eye. It is as if the line is glued to the inner area in one eye and undergoes the same visual-line transformation as that of the fused parts of the image.

Figure 16.37. Perceived direction of monocular elements. (A) After fusion, the lines appear in the midline even though the monocular images are not aligned. (B) The type of stimulus used by Shimono and Wade (2002) to investigate whether the perceived direction of a monocular line is affected by the disparity of the background. In the fused images, subjects adjusted the horizontal position of one line until it appeared aligned with the other line.

The term “glued on” aptly describes how a monocular line appears at the same inclination as a binocular field (Domini and Braunstein 2001) or how it appears to move with a part of the binocular field that is apparently moving concomitantly with a head movement (Shimono et al. 2007). The degree to which the monocular image is glued on depends on whether the monocular and binocular stimuli are presented together or sequentially (Domini and Braunstein 2001), the spatial frequency composition of the monocular image (Raghunandan et al. 2009), and on the proximity of other fused stimuli (Erkelens and van Ee 1997a, 1997b; Shimono et al. 1998). The closer they are, the greater the transformation of the monocular image, to a maximum of one half the disparity (Erkelens and van Ee 1997b). This last finding indicates a limitation in monitoring binocular eye positions with nonius lines. Nonius lines must not be near binocular stimuli (Ono and Mapp 1995; Shimono et al. 1998). The capture of a monocular line was strongly reduced when it was flashed briefly at each peak of the change in disparity (Jaschinski et al. 2006). Thus flashed nonius lines provide a more reliable indication of changes in vergence than do continuously visible nonius lines. In this description of visual capture, the monocular stimulus is treated as part of the binocular process of transforming the visual-line value into a headcentric direction value. Erkelens and van Ee (1997b) conclude that “binocular visual directions of monocular stimuli are captured by the binocular visual directions of adjacent binocular stimuli” (p. 1742). (However, they also describe it as “capture of binocular visual direction” (p. 1744), which suggests that the monocular stimulus captures the binocular stimulus.) Hariharan-Vilupuru and Bedell (2009) asked observers to judge the alignment of two monocular lines, both of which were presented to the same eye, when these lines were embedded in different depth planes of a binocularly fused random-dot stereogram. They found that the degree to which the headcentric directions of these monocular
stimuli were captured varied as a function of the magnitude and direction of both the horizontal and vertical disparities within the stereogram. A monocular stimulus embedded in a random-dot stereogram is “captured” in depth as well as in direction. Shimono and Wade (2002) used a stimulus like that shown in Figure 16.37B. Two vertical bars were presented one above the other to the same eye but on random-dot backgrounds that differed in horizontal disparity. They reported that the apparent depth of the monocular lines covaried highly with the extent of binocular depth but poorly with apparent directional displacement. That is, the apparent displacement did not follow one half the disparity. In a subsequent study, Shimono et al. (2005) found that the apparent displacement covaried closely with the disparity of the stereograms only in a specific condition, in which the density of the dots was 10% and the width of the binocular area was relatively narrow (30 min of arc).
16.7.4b Direction of Monocular Occlusion Zones
Monocular zones occur when an occluder prevents one eye from seeing a region visible to the other eye, as in Figure 16.38. The role of monocular occlusion in stereopsis is discussed in Section 17.3. Monocular occlusion also affects visual direction.

Figure 16.38. Displacement and compression with veridical shape perception. When the images of the stereogram are fused by divergent or convergent fusion, one fused image appears in depth with seven dots, which is one more than in either monocular image. This shows that the monocular dots are displaced inward and the space between the vertical flanking occluders is compressed. However, the occluded diagonal line appears straight and the occluded square appears square in the fused image.

Figure 16.39. The cyclopean view of a monocular zone. (A) Shows the positions of object points, as seen by each eye. (B) Shows the perceived directions of points from the cyclopean eye. Point (d) is displaced rightward to (d´) and the area (d) to (g) shrinks to fit into the area (d´) to (g´). Compression is indicated by the distance from (d´) to (g´) being smaller than the distance from (d) to (g). Similar displacements and compressions occur in monocular zones seen by the left eye, but to simplify the figure they are not illustrated. (Adapted from Mapp and Ono 1999)

Interest in monocular zones can be traced back to Euclid (Section 2.1.3b). Leonardo da Vinci (1452–1519) drew eight diagrams to analyze what the two eyes can see behind an occluder. He also diagrammed what is seen when looking through an aperture (see Figure 17.20). He noted that it is impossible to represent a 3-D scene with occluders or through an aperture on a flat canvas (Wade et al. 2001). Boring (1942) noted that, in one of Leonardo’s diagrams, one can see the entire background beyond an opaque sphere whose diameter is less than the interocular distance. He called this “Leonardo’s paradox,” because the background can be seen behind the sphere. Ono et al.
(2002c, 2003) examined what is seen and where it is seen with respect to the cyclopean eye in these stimulus conditions (for example, see Figure 16.39). They suggest that the term “Leonardo’s constraint” is a better description of such situations, in that the visual system is operating with the constraint that no two opaque objects can be seen in the same visual direction. First consider the application of Leonardo’s constraint to monocular zones created when an occluder is fixated. If zones b to c and d to e and the vertical edges of the occluder in Figure 16.39 were seen in their actual locations, the constraint would be violated. The laws of headcentric direction discussed in Section 16.7.2 have a provision to satisfy this constraint, when fixation is on the nearer surface, as shown in Figure 16.39B. The laws specify that d is apparently displaced to d´, where it is not behind the occluder. The monocular regions produced by a fixated foreground occluder are displaced outward in a headcentric frame. Now consider the application of Leonardo’s constraint to the direction of the edges of an occluder when the background is fixated. In this case, neither the laws of visual direction nor the averaging of directions of fused images are sufficient to satisfy the constraint. The laws of direction predict that the monocular and binocular zones of the background are seen in their correct directions with respect to the cyclopean eye. The averaging process predicts that the edges of the occluder are seen in their correct directions with respect to the cyclopean eye. These predictions conflict with the requirements of Leonardo’s constraint. The constraint requires that the edges or background zones
be displaced. Ono et al. (2003) showed that the binocularly seen edges were displaced when the background was fixated in accordance with the constraint but against the prediction based on averaging. Although these displacements would satisfy Leonardo’s constraint, this is not the whole story. Another way to think about the edges of the nonfixated occluder is that the angle subtended by the occluder is too large to fit into the cyclopean scene. Also, the nonfixated monocular zones of the background are too large to fit into the cyclopean scene if they are seen in their correct directions and extents. In other words, the cyclopean scene cannot accommodate all parts of the binocular visual scene. Ohtsuka (1995a, 1995b) and his colleagues (Ohtsuka and Yano 1994; Ono et al. 1998, 2000c, 2003) hypothesized that perceptual displacement is accompanied by perceptual compression of some parts of the scene. Figure 16.38 demonstrates this displacement and compression (Ohtsuka 1995a, 1995b; Ono et al. 1998). Viewing more than two vertical lines through two apertures in a card provides a more convincing demonstration of compression. One aperture is large enough to allow both eyes to see all the lines, while the smaller aperture allows the two end lines to be seen by only one eye. When fixation is held on a point on the card, the two monocular lines are displaced inward, and the region seen through the smaller aperture is compressed as indicated by the apparently smaller separation between the lines compared with that seen through the larger aperture (Ono et al. 2003). The displacement and compression of a portion of the visual scene satisfies Leonardo’s constraint, but in doing so creates deformation of shape and misalignment of lines. Two possible distortions caused by displacement are illustrated in Figure 16.40. 
The gray square in Figures 16.40A and B is occluded in its midportion and should appear horizontally elongated, that is, rectangular, because both ends are displaced outward. In addition, the two segments of the oblique line extending to the left and right of the occluder should no longer appear collinear. The butted end of the top segment to the left of the occluder is displaced leftward more than its far end, and the butted end of the lower segment at the right side of the occluder is displaced rightward more than its far end. As a result, both segments of the oblique line should be displaced clockwise. Studies reported in the literature, however, show that these two possible consequences are limited to visual direction and do not occur for visual shape and alignment in 3-D perception (Ohtsuka and Yano 1994; Ohtsuka 1995a). In 3-D perceptual space a square seen behind an occluder is seen as a square (Ono et al. 1998; van Ee and Erkelens 2000), and a line behind an occluder appears continuous (Drobnis and Lawson 1976; Gyoba 1978; Liu and Kennedy 1995; Ohtsuka 1995b). See the fused oblique line and fused square panel in Figure 16.40B.
Figure 16.40. Image displacement and compression. (A) The stimulus situation. (B) The expected front view. The effect of compression is illustrated by the perceived width of the monocular regions being smaller than the width of the binocular regions, whereas their physical widths are the same. (C) The Kanizsa illusion and (D) the Poggendorff illusion, two illusions that are opposite to what is expected from the displacement (and compression) shown in (B).

These results suggested to Ohtsuka and his colleagues that the visual system has a mechanism to correct the directional distortions when judging shape and alignment. The existence of such a mechanism is also suggested by the directions of the Kanizsa and Poggendorff illusions, which are opposite to what is expected from displacement and compression. Instead of a horizontally elongated rectangle, as shown in Figure 16.40B, a horizontally shortened rectangle is seen in Figure 16.40C. Furthermore, instead of the two oblique lines being displaced clockwise, as in Figure 16.40B, an anticlockwise displacement is seen in Figure 16.40D. According to this hypothesis, the direction of the illusions is the result of the unnecessary operation of the correcting mechanism when the stimulus is represented in 2-D. That is, in viewing the drawing there is no need for the displacement and compression to satisfy Leonardo’s constraint. However, an inappropriate correction is applied, which causes the illusions. This hypothesis, if correct, places the Kanizsa and Poggendorff illusions into Robinson’s (1972) category of illusions caused by a mechanism that usually operates to create veridical perception. According to Gregory (1970), the Müller-Lyer illusion falls in the same category. Moreover, it appears that the “correction” is small enough for the visual system to tolerate the contradiction between visual direction and the perception of shape or alignment. Gonzalez et al. (2005) found support for this hypothesis using Kanizsa’s amodal shrinkage illusion. They found that (a) one-eyed observers (enucleated due to retinoblastoma) experienced a smaller illusion than did age-matched binocular observers, (b) observers viewing the stimulus monocularly experienced less illusion than when viewing binocularly, and (c) the illusion almost disappeared when the stimulus was rotated 90 degrees. Finding (c) is consistent with the hypothesis, because the correction is proposed for only horizontal displacement. Further tests of the hypothesis as it applies to the Poggendorff illusion are yet to be conducted.

16.7.5 EFFECTS OF PHORIA, FIXATION DISPARITY, AND STRABISMUS
The law of cyclopean projection states that objects on a corresponding pair of visual lines appear on a line through the cyclopean eye and the point in the horopter contained in both visual lines. It is assumed that the eyes are correctly converged on a defined visual object. This situation is complicated by two conditions. One of the conditions is phoria, which is a deviation of a closed eye inward (esophoria) or outward (exophoria) from the intended point of convergence. The other condition is a deviation of one or the other eye from the intended point of convergence when both eyes can see the same stimulus. This is a fixation disparity when the deviation is within Panum’s fusional area and a tropia, or strabismus, when the deviation is larger. Consider, first, effects of phoria on headcentric localization. An esophoric eye deviates inward when occluded and, according to the law of cyclopean projection, the target should appear to shift toward the seeing eye (Figure 16.41A). For an occluded exophoric eye, the target should shift toward the occluded eye (Figure 16.41B). In both cases, the shift should be half the angle of changed vergence. As predicted, observers experienced apparent movement of a target in the direction of the phoria, as the phoric eye was occluded and, in the opposite direction, as the phoric eye was uncovered (Ono et al. 1972; Ono and Gonda 1978; Park and Shebilske 1991). Furthermore, when pointing to monocular visual targets, observers made constant errors related to the direction of phoria (Ono and Weber 1981; Khokhotva et al. 2005; Mapp et al. 2007). However, they made some corrective adjustments when provided with error feedback. These findings confirm that headcentric directions are judged in terms of information about the positions of the two eyes and not in terms of the position of the dominant eye or of an eye that happens to be open. 
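The prediction that the apparent shift is about half the change in vergence can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: the eyes are treated as points 6.5 cm apart, the cyclopean eye is placed midway between them, the left eye is the seeing eye, and the function name and sign convention (positive `phoria_deg` for esophoria, positive directions rightward) are hypothetical.

```python
import math


def apparent_direction_deg(target_xy, phoria_deg, half_ipd=3.25):
    """Headcentric direction (deg, rightward positive) predicted for a target
    fixated by the left eye while the occluded right eye deviates by phoria_deg.

    The target is assumed to appear on the line through the cyclopean eye
    (placed at the origin) and the intersection of the two visual axes.
    Positive phoria_deg means esophoria (inward deviation of the right eye).
    """
    tx, ty = target_xy
    # Visual axis of the seeing left eye, at (-half_ipd, 0), passes through the target.
    theta_left = math.atan2(tx + half_ipd, ty)
    # Visual axis of the occluded right eye: the fixating angle plus the deviation.
    theta_right = math.atan2(tx - half_ipd, ty) - math.radians(phoria_deg)
    denom = math.tan(theta_left) - math.tan(theta_right)
    if denom <= 0.0:
        raise ValueError("visual axes do not intersect in front of the head")
    y = 2.0 * half_ipd / denom              # distance of the intersection
    x = -half_ipd + y * math.tan(theta_left)
    return math.degrees(math.atan2(x, y))


# Midline target at 100 cm with 2 degrees of esophoria or exophoria.
shift_eso = apparent_direction_deg((0.0, 100.0), 2.0)    # toward the seeing (left) eye
shift_exo = apparent_direction_deg((0.0, 100.0), -2.0)   # toward the occluded (right) eye
```

In both cases the magnitude of the predicted shift is close to 1 degree, which is half of the 2-degree change in vergence, and its sign matches the directions described for Figure 16.41.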
A person with no phoria may observe a shift of a monocularly fixated object when the eyes change convergence with one eye occluded. A fixation disparity (overconvergence or underconvergence) has a predictable consequence on the visual direction of a monocularly seen stimulus. When viewing a group of vertical lines at a distance of 30 cm through a ring, one sees the outermost line segments (one on each side) within the ring monocularly. These segments appear displaced outward with respect to the center of the ring, as shown in Figure 16.42. Nakamizo et al. (2008) showed that this displacement was due to underconvergence with respect to the binocular lines. When an overconvergence was induced by a haploscope or nonius stimuli in a stereoscope, the apparent displacement shifted toward the center.

Figure 16.41. Visual direction and phoria. (A) An esophoric eye deviates inward when occluded. An object fixated by the seeing eye appears shifted toward the seeing eye, onto a line through the cyclopean eye and the intersection of the visual axes. (B) An exophoric eye deviates outward when occluded. An object fixated by the seeing eye appears displaced from the seeing eye, onto a line through the cyclopean eye and the intersection of the visual axes. (Redrawn from Ono and Gonda 1978)

Figure 16.42. Apparent displacement of monocular images. (A) Observer binocularly viewing vertical lines through a ring. (B) Typically, the monocularly seen lines on each side in the ring appear misaligned with respect to the binocularly seen lines above and below the ring. (Adapted from Nakamizo et al. 2008)
A person with a constant strabismus of long standing learns to suppress the visual input from the deviating eye when both eyes are open. When only the deviating eye is open, that eye is able to fixate on an object and see it, although the covered normal eye will not be converged on the object. The question arises as to whether, under these circumstances, the direction of the object fixated by the strabismic eye is judged in terms of the position of that eye, the position of the covered normal eye, or their combined positions. Mann et al. (1979b) showed that the second strategy is adopted by constant strabismics; they use the direction of their normal eye whether or not that eye is viewing the target. Alternating strabismics judge the direction of a visual target in terms of whichever eye is open; that is, they dissociate the two eyes in judging directions. People who have had surgery for correction of strabismus show shifts in pointing with the unseen hand to visual targets, although the direction and magnitude of these shifts are not related in a simple way to the type of surgical correction (Steinbach et al. 1988). However, the cyclopean eye of strabismic children was in the same location as that of children with normal vision. Furthermore, children showed no change in the cyclopean eye after they underwent surgery for strabismus (Dengis et al. 1993b). Thus, pointing errors in strabismics must be due to a change in the registered position of the eye rather than to a shift of the cyclopean eye. In children who had lost one eye at an early age, the cyclopean eye was shifted toward the remaining eye. Normal children tested with only one eye open showed no such shift (Moidell et al. 1988).

16.7.6 LOCATING THE CYCLOPEAN EYE
16.7.6a Methods for Locating the Cyclopean Eye
Four procedures have been used to determine the location of the cyclopean eye.
1. In the method devised by Funaishi (1926) the observer fixates a point in the median plane at a distance of 2 meters and is shown other nonfixated targets in the same frontal plane. The observer aligns an unseen finger with each of the nonfixated targets. The lines joining each pair of finger positions are extended toward the observer, and their intersection defines the cyclopean eye.

2. In the method devised by Roelofs (1959), the observer sights down a tube with one eye and indicates the place on the face where the tube appears to be aimed. This is repeated for different positions of the tube, and the cyclopean eye is the intersection of the set of lines joining the front of the tube to each location on the face to which it appears to point. This method assumes the law of cyclopean projection to be valid without the assumption that the cyclopean eye is located midway between the eyes.

3. In the method proposed by Howard and Templeton (1966) the observer fixates the near end of a rod lying in a horizontal plane at eye level and swings it within the horizontal plane until it appears to point directly at the self. This is repeated for different positions of the rod. The cyclopean eye is defined as the intersection of the projections of the rod in its various positions. Mitson et al. (1976) removed the distracting double images of the nonfixated parts of the rod by replacing the rod with two stimuli at different distances. In this revised method the observer switches on the near and far stimuli in alternation and moves the near stimulus until the imaginary axis joining the stimuli appears to point at the self. The procedure is repeated for different directions, and the intersection of the projected axes is taken to be the cyclopean eye.

4. In the method proposed by Fry (1950), the observer fixates a far stimulus on the median plane and indicates the apparent locations of diplopic images of a near stimulus also on the median plane. A line is drawn from each apparent location toward the face so that the bisector of the angle formed is equal to the angle subtended by the two stimuli to each eye. The location of the cyclopean eye is the intersection of the two lines. See Bailey (1958) for attempts to locate the cyclopean eye using this method. This method assumes the law of differences in headcentric directions to be valid without the assumption that the cyclopean eye is located midway between the eyes.

Barbeito and Ono (1979) tested the success of each method in predicting performance on other spatial tasks that depend on the location of the cyclopean eye. They used three tasks: judging the straight ahead, setting a point at one fixed distance to be midway in direction between two other points side-by-side at another distance, and judging the extent of apparent movement of visual targets during accommodative vergence. Only method 3, involving multiple alignments of targets with the cyclopean eye, successfully predicted performance on these tasks. This is the only purely visual method, which may be why it relates best to other purely visual tasks. The Howard and Templeton method, as modified by Mitson et al., was also the most precise method. This is probably because it does not involve highly variable pointing responses. Taking the four methods together, their power to predict individual differences was not impressive, but all four methods found the mean location of the cyclopean eye to be between the two eyes, which is relevant for the next section.
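Several of the methods described above define the cyclopean eye as the intersection of a set of projected lines. With real settings the lines will not meet in exactly one point, so some estimate of the common intersection is needed. The sketch below shows one possible approach, a least-squares intersection of lines in the horizontal plane. It is an assumption made for illustration, not the analysis used in the studies cited, and the function name and data layout are hypothetical.

```python
import math


def least_squares_intersection(lines):
    """Point minimizing the summed squared perpendicular distance to 2-D lines.

    `lines` is an iterable of ((x1, y1), (x2, y2)) pairs, for example the near
    and far stimulus positions for each setting in the horizontal plane.
    """
    sxx = sxy = syy = bx = by = 0.0
    for (x1, y1), (x2, y2) in lines:
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            raise ValueError("each line needs two distinct points")
        dx, dy = dx / norm, dy / norm
        # Projector onto the direction normal to the line: P = I - d d^T.
        pxx, pxy, pyy = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        sxx += pxx
        sxy += pxy
        syy += pyy
        bx += pxx * x1 + pxy * y1
        by += pxy * x1 + pyy * y1
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; no unique intersection")
    # Solve the 2 x 2 normal equations by Cramer's rule.
    return (syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det


# Three noise-free lines constructed to pass through the point (0.5, -1.0).
settings = [((1.5, 1.0), (2.5, 3.0)),
            ((-0.5, 1.0), (-1.5, 3.0)),
            ((0.5, 2.0), (0.5, 5.0))]
estimate = least_squares_intersection(settings)
```

For noise-free lines the estimate coincides with the true common point. With variable settings it gives a single compromise location, and the spread of the lines around that location could serve as a rough index of precision.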
16.7.6b Location of the Cyclopean Eye
Consider two points F and P on the horizontal horopter, conforming to the Vieth-Müller circle as in Figure 14.14. The points subtend the same angle at the nodal point of one eye as at the nodal point of the other eye, because a chord of a circle subtends the same angle at any point on the circumference. Furthermore, F and P subtend this same angle at the point between the eyes where the Vieth-Müller circle cuts the median plane. According to Hering’s law of cyclopean projection, the direction of P with respect to F is judged as if both eyes were in the median plane of the head. If the cyclopean eye is on the Vieth-Müller circle, midway between the eyes, then the relative directions of any two points on the circle with respect to the cyclopean eye are the same as their relative directions with respect to either eye. This provides a convenient theoretical definition of the cyclopean eye. The headcentric azimuth of any point with respect to the cyclopean eye is the dihedral angle between the median plane of the head and the vertical plane containing the line joining the point to the cyclopean eye. The headcentric elevation of any point with respect to the cyclopean eye is the dihedral angle between the horizon plane and the line joining the point to the cyclopean eye. If the cyclopean eye is assumed to be the midpoint of the line joining the nodal points of the eyes, these two angles become the headcentric cyclopean azimuth and elevation, as defined in Section 14.3.2. By the law of cyclopean projection, objects on corresponding visual lines appear on a common line through the cyclopean eye and the point where the two visual lines intersect. When the eyes are symmetrically converged and the objects lie on the two visual axes, all the objects appear in the median plane of the head. It also follows from the law of cyclopean projection that any object not on the horopter has two visual directions, one for each eye.
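The equal-angle property of the Vieth-Müller circle can be verified with a small numerical check. The sketch below rests on simplifying assumptions that are not part of the text: the nodal points are treated as points 6.5 cm apart, fixation is on the midline at 100 cm, and the function names are hypothetical.

```python
import math


def vieth_muller_circle(half_ipd, fixation_distance):
    """Center height and radius of the circle through both eyes and a midline
    fixation point, with the eyes at (-half_ipd, 0) and (half_ipd, 0)."""
    center_y = (fixation_distance ** 2 - half_ipd ** 2) / (2.0 * fixation_distance)
    radius = fixation_distance - center_y
    return center_y, radius


def subtended_angle_deg(vantage, p, q):
    """Horizontal angle (deg) between the directions of p and q seen from vantage."""
    a_p = math.atan2(p[0] - vantage[0], p[1] - vantage[1])
    a_q = math.atan2(q[0] - vantage[0], q[1] - vantage[1])
    return abs(math.degrees(a_p - a_q))


half_ipd, distance = 3.25, 100.0
center_y, radius = vieth_muller_circle(half_ipd, distance)

# F is the fixation point; P lies on the circle 20 degrees of arc away from F.
F = (0.0, distance)
phi = math.radians(20.0)
P = (radius * math.sin(phi), center_y + radius * math.cos(phi))

left_eye = (-half_ipd, 0.0)
right_eye = (half_ipd, 0.0)
# Point where the circle cuts the median plane between the eyes.
cyclopean_point = (0.0, center_y - radius)

angles = [subtended_angle_deg(v, F, P) for v in (left_eye, right_eye, cyclopean_point)]
```

All three angles equal 10 degrees, half of the 20-degree arc between F and P, as the inscribed-angle theorem requires. The point where the circle cuts the median plane lies slightly behind the line joining the two eyes, at a distance equal to the squared half-separation divided by the fixation distance, which is about 0.1 cm in this example.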
When the eyes are asymmetrically converged, the law applies only if the center of rotation of each eye is at the same location as the nodal point. In fact, the nodal point is in front of the eye’s
center of rotation (see Figure 16.27), so the geometry of visual direction for asymmetrical convergence is more complex than indicated here. If the eyes move into a tertiary position, the geometry of the headcentric directions of visual targets becomes even more complicated by the changed shape of the horopter with oblique gaze and by the fact that the eyes obey Listing’s law (Section 10.1.2d). The fact that most people consistently use the same eye for sighting or aligning objects led Walls (1951) to suggest that the dominant eye is the cyclopean eye. But most people judge directions as if both eyes are located in the median plane of the head, no matter which eye is used for sighting (Charnwood 1949; Ono and Barbeito 1982; Mapp et al. 2003). Barbeito and Simpson (1991) investigated this question quantitatively by asking observers to point to a visual target seen by only one eye, for different angles of gaze of the occluded eye. The angle of gaze of the occluded eye was varied by having observers fixate with both eyes on spots at different distances along the line of sight of the open eye, as shown in Figure 16.43. The apparent direction of the monocular target closely coincided with that predicted from the principle that the visual direction of any point along the visual axis of one eye is specified by the line passing through the point of binocular convergence and the cyclopean eye.
Figure 16.43. Testing Hering’s laws of visual direction. Subjects pointed to the monocular target, which was occluded to the right eye. Empty dots indicate the apparent position of the target for each of three points of convergence, indicated by black dots. (Adapted from Barbeito and Simpson 1991)
The predicted locations are illustrated in Figure 16.43. For all observers there was a linear relationship between changes in the angular position of the occluded eye and the apparent direction of the target seen by the open eye. However, for some observers, the slope of this function was not the same for both eyes. This could be because the cyclopean eye was slightly off-center for these observers or because the observers gave more weight to one eye than to the other, a tendency that may be related to eye dominance (Porac and Coren 1986). It has already been indicated that the cyclopean eye can be defined as the point in the median plane of the head to which the relative visual directions of objects are projected unchanged. This definition places the cyclopean eye just behind the corneal plane of the eyes. On other theoretical grounds, one would predict that the cyclopean eye is on the axis of rotation of the head. Directional judgments made in relation to such a center would not be affected by rotation of the head on a vertical axis, as they would be if the cyclopean eye were situated outside this axis. Funaishi (1927) and Roelofs (1959) reported that the cyclopean eye lies on the axis of vertical rotation of the head. On the other hand, Fry (1950) placed it several centimeters in front of the corneal plane. Barbeito and Ono (1979) found that all methods for determining the position of the cyclopean eye are critically dependent on accurate binocular fixation. In the earlier methods, fixation was difficult to control, and this probably accounted for the divergent results. Barbeito and Ono controlled fixation and found that, on average, the cyclopean eye was very close to a point midway between the eyes in the corneal plane.

16.7.7 CONTROVERSY OVER THE CYCLOPEAN EYE
Ever since Kepler (1604) described image formation in the eye there has been a widespread belief that objects seen with one eye are always seen in their correct locations. This claim was refuted by observations made by Ptolemy in the 2nd century, by Alhazen in the 11th century, and by Wells (1792), Hering (1868, 1879), and others (see Section 16.7.2b). However, this erroneous belief was retained by British researchers, particularly Porterfield (1737), Brewster (1830, 1844b), and Wheatstone (1838). They did not understand or appreciate Hering’s law of headcentric direction (see Ono et al. 2009). The recent controversy over the cyclopean eye, discussed below, is reminiscent of this historical debate.

Recently, several investigators have claimed that the cyclopean eye is not fixed in the head, but moves along the interocular axis as a function of the stimulus situation (Erkelens and van de Grind 1994; Erkelens et al. 1996; Mansfield and Legge 1996, 1997; Erkelens 2000; Khan and Crawford 2001). Erkelens and van Ee (2002b) assert that the concept of the cyclopean eye is inappropriate
and irrelevant. We will now see that all these investigators confused (1) exocentric direction and headcentric direction and/or (2) physical descriptions and perceptual descriptions of direction (Banks et al. 1997; Mapp and Ono 1999; Khokhotva et al. 2005; Ono et al. 2002a, 2002b; Mapp et al. 2007). This section deals with consequences of these confusions. It has already been shown that the cyclopean eye is not the location in the head to which apparently aligned objects are actually aligned. Moreover, inferences about the location of the cyclopean eye cannot be based only on observers’ reports that objects appear aligned. Inferences about the location of the cyclopean eye can be made only on reports of where objects lie with respect to the median plane of the head. The task must be a headcentric one. Despite the long history of demonstrations and experiments illustrating this point, some investigators make claims about the location of the cyclopean eye on the basis of exocentric tasks, which do not bear on the question of the location of the cyclopean eye. Indeed, all the studies cited above that claim that the location of the cyclopean eye is stimulus specific are based on only exocentric tasks. For example, Mansfield and Legge (1996, 1997) claimed that the cyclopean eye coincides with the location in the head with which their stimuli were physically aligned. Erkelens et al. (1996) claimed that since the edge of a binocularly seen near surface and the edge of a monocularly seen distant area appeared aligned when they were physically aligned to one eye, the cyclopean eye moved to that eye. Erkelens and colleagues questioned the validity of the concept of the cyclopean eye (Erkelens and van de Grind 1994; Erkelens et al. 1996). After conducting a series of experiments using exocentric tasks they concluded that, “The concept of the cyclopean eye is sometimes inappropriate and always irrelevant as far as vision is concerned” (Erkelens and van Ee 2002b). 
They claimed that all experiments dealing with this issue since Ptolemy were poorly done and stated that, “Indeed we are astounded that results of many poor experiments from the literature carry so much weight” (p. 1162).

Figure 16.39 shows the type of stimulus used by Erkelens and colleagues. Note that, since the near surface in the figure partially occludes the distant surface, some of the stimulus elements do not physically project to a cyclopean eye located midway between the eyes (Figure 16.39A). For example, the area from d to e is visible to the right eye but it is not projected to the centrally located cyclopean eye. Also, note that in the right eye’s view, point d is physically aligned with the right edge of the near surface. From an exocentric direction task Erkelens et al. (1996) concluded that, “binocular space perception near monocularly occluded areas is veridical and the cyclopean eye does not have a fixed position in the head, but is located between the eyes for certain visual directions and in one of the eyes for other directions” (p. 2145).
If they had measured the headcentric directions of point d and of the right edge of the near surface, they would have found that when fixation is on the near surface, point d appears displaced to d′ and is seen from a fixed cyclopean eye, as shown in Figure 16.39B. For further discussion of displacement of monocular areas see Section 16.7.4b.

Although the monocular zone experiments led Erkelens and his colleagues to conclude erroneously that the location of the cyclopean eye is stimulus specific, it was their cyclopean illusion experiment that led them to conclude that the concept of the cyclopean eye is irrelevant. The cyclopean illusion is the apparent shift in the headcentric direction of visual stimuli that occurs when the eyes change convergence, as shown in Figure 16.44. Erkelens (2000) found that with binocular viewing all observers experienced the illusion both in dark surroundings and when the room lights were on. With monocular viewing, however, only 33% of observers experienced the illusion in darkness, and none experienced it when the room lights were on. Erkelens concluded that, “perceived direction during monocular viewing is based on signals of the viewing eye only” (p. 2411). Erkelens and van Ee (2002b) concluded that, “The concept of the cyclopean eye is . . . always irrelevant as far as vision is concerned.” Erkelens’s conclusion conflicts with all published findings that judgments of headcentric direction are based on information from the two eyes with both monocular and binocular viewing (Section 16.7.5). Moreover, Erkelens’s data are inconsistent with those of other investigators who have found the cyclopean illusion to be robust with
monocular viewing (Wells 1792; Hering 1879; Helmholtz 1910; Ono et al. 1972; Carpenter 1988; Enright 1988). These inconsistencies led Ono et al. (2002b, 2007a) to examine the stimulus used by Erkelens. They found that with the viewing distances he used (near stimulus 15 cm, far stimulus 30 cm) it was very difficult to produce a monocular cyclopean illusion. This is because (1) the occluded eye was exophoric, (2) the extent of the exophoria was greater at 15 cm than at 30 cm, and (3) accommodation was relatively ineffective in driving vergence for the stimuli used. Thus, with these distances, the magnitude and velocity of eye movements in his monocular condition (an accommodative vergence situation) were much smaller than those in his binocular condition (a disparity vergence situation). Therefore, the rarity of the monocular cyclopean illusion in Erkelens’s stimulus is not surprising because the magnitude of the illusion depends on the magnitude of eye movements. Figure 16.45 illustrates the interaction between the extents of phoria, eye movements, and the cyclopean illusion. If Erkelens had presented his stimuli at a greater distance, where phoria is smaller and accommodation is more effective, he would have found that virtually all observers experience the monocular cyclopean illusion (Ono et al. 2007a). Thus, he would have concluded, as have others, that judgments of headcentric direction are based on signals from both eyes and not on signals from only the eye that happens to be open.

Figure 16.44. The cyclopean illusion. When fixation changes from the near stimulus, as in (A), to the far stimulus, as in (B), the headcentric direction of the far stimulus shifts to the left. The two stimuli on the visual line of the right eye appear on the cyclopean line through the point of fixation (dashed lines). The location of the cyclopean line changes with the change in the point of binocular fixation. Therefore, the concept of the cyclopean eye is needed in explaining the illusion.
Figure 16.45. The cyclopean illusion and phoria. The motion of the cyclopean line (bold dashed line) with a change in convergence is smaller when the occluded eye is exophoric (A) than when it is not phoric (B). The general principle is that the perceived direction of an object is on that line of sight that passes through the intersections of the lines of sight from the two eyes.
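The geometry of the cyclopean illusion and its dependence on phoria can be sketched numerically. The following is a minimal illustration, not the procedure used in any of the cited studies. It assumes a cyclopean eye midway between the eyes, a viewing right eye that looks straight ahead at the target, and an occluded left eye whose line of sight deviates outward by a given exophoria. The interocular distance and the phoria values are hypothetical; only the 15 cm and 30 cm viewing distances come from the text.

```python
import math

HALF_IOD = 0.032  # assumed half interocular distance in metres (hypothetical)


def intersection_depth(target_depth, exophoria_deg=0.0):
    """Depth (m) at which the two lines of sight intersect.

    The viewing right eye looks straight ahead at a target at target_depth.
    The occluded left eye converges less than required by exophoria_deg.
    """
    required = math.atan2(2 * HALF_IOD, target_depth)  # left-eye angle for true fixation
    actual = required - math.radians(exophoria_deg)    # exophoria reduces convergence
    return 2 * HALF_IOD / math.tan(actual)


def cyclopean_direction_deg(target_depth, exophoria_deg=0.0):
    """Direction (deg, positive = rightward) of the line from the cyclopean eye
    through the intersection of the two lines of sight."""
    depth = intersection_depth(target_depth, exophoria_deg)
    return math.degrees(math.atan2(HALF_IOD, depth))


# Change in apparent direction when gaze moves from the near (15 cm)
# to the far (30 cm) stimulus on the right eye's line of sight.
shift_no_phoria = cyclopean_direction_deg(0.30) - cyclopean_direction_deg(0.15)

# Hypothetical exophoria: 6 deg at the near distance, 2 deg at the far distance.
shift_with_phoria = (cyclopean_direction_deg(0.30, exophoria_deg=2.0)
                     - cyclopean_direction_deg(0.15, exophoria_deg=6.0))

print(f"shift without phoria: {shift_no_phoria:.2f} deg")
print(f"shift with exophoria: {shift_with_phoria:.2f} deg")
```

Both shifts are negative, that is, leftward, and with these assumed values the shift is smaller when the occluded eye is exophoric, which is the pattern drawn in Figure 16.45.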
It can be concluded that small and slow eye movements can account for the rarity of the monocular illusion with near stimuli in dark surroundings. But why did no observers experience the illusion when the room lights were on? Under these conditions, the stimuli would appear fixed with respect to the surroundings and thus inhibit the occurrence of the cyclopean illusion. The external frame of reference provided by the background overrides the headcentric information from oculomotor and retinal signals (Ono et al. 2007a).

It has been reported that observers do not experience the cyclopean illusion shown in Figure 16.44 when tracking a smoothly moving target from the near point to the far point, if the stimulus is an afterimage or if the stimuli are illuminated stroboscopically at a temporal frequency of 5 Hz (Enright 1988). Ono et al. (2007a) found, however, that if the afterimage is “projected” to a screen near the far stimulus, the illusion occurs. In Enright’s experiment, the afterimage was “projected” to the moving stimulus, and therefore the illusion did not occur. They also found that presenting the stimulus stroboscopically produced the predicted illusion. The reason for Enright’s second result is not clear.

16.8 UTROCULAR DISCRIMINATION

Helmholtz (1909) asked whether one can tell which eye sees a monocular stimulus when there is no information, apart from the ocular site of the stimulus. This task is referred to as utrocular discrimination, and the information on which it is based is called “eye-of-origin information” or “eye signature.” Helmholtz did not doubt that eye-of-origin information is available to the stereoscopic system since, if it were not, one would not be able to distinguish between depth based on crossed disparities and that based on uncrossed disparities.
This information must also be available to the vergence system, since vergence occurs appropriately to crossed and uncrossed disparities, even in an open-loop situation (Howard 1970). The question is whether eye-of-origin information is available to consciousness when all extraneous cues are eliminated. Helmholtz concluded that eye-of-origin information is not available to consciousness, but people have continued to investigate the question. It is not easy to design a test lacking give-away cues. If the eyes differ in any way, a person can tell which eye is stimulated. For instance, a person color blind in one eye will make consistent judgments about which eye sees a colored stimulus. If the person is confused about which eye is color blind, the answers may be wrong (invalid) even though discrimination is perfect (reliable). Even if answers are correct it does not mean that there is eye-of-origin information, but only that the person knows that one eye is color blind. A difference in contrast
sensitivity between the two eyes can also serve as an extraneous cue (Porac and Coren 1984; Steinbach et al. 1985). Oculomotor cues and binocular parallax may also serve as extraneous cues. For instance, if the stimulus extends to the edges of the visual field, one may identify the stimulated eye by noting how far the stimulus extends to the left or right. Blinking an eye immediately reveals which eye is seeing. Pickersgill (1961) placed an occluder in front of one eye and asked observers to report which eye saw a small flash of light. Some observers performed at above chance level, but Templeton and Green (1968) pointed out that, for an observer with phoria, the apparent direction of the stimulus would vary depending on which eye was occluded. A person using this cue would be able to give consistent answers even if they were not correct. Smith (1945) overcame the problem of phoria by presenting an identical line grid to each eye. Observers fused the lines and reported which eye was stimulated by a flash presented on the fovea of one eye. Some observers performed at above chance level, which Smith explained in terms of differential eye-movement tendencies. However, it is possible that an observer with a fixation disparity (Section 10.2.4) would see the flash displaced with respect to the fused fixation target and could use this as an extraneous cue for utrocular discrimination. This interpretation is supported by the fact that performance fell to chance when the position of the test flash was varied. Performance also fell to chance when fixation disparity was controlled (Templeton and Green 1968; Ono and Barbeito 1985). Wolfe and Franzel (1988) presented a single light spot to one eye and an array of identical spots to the other eye. Subjects were unable to detect the spot that was presented to only one eye. 
Blake and Cormack (1979a, 1979b) reasoned that, because stereoblind observers have a preponderance of monocular cortical cells, they should have better utrocular discrimination than people with normal vision. Both eyes were exposed to the same homogeneously illuminated region on which a grating was superimposed in one eye.
Observers with normal vision could report which eye saw the grating when the spatial frequency of the grating was low but performed at chance level when it was high. Stereoblind observers performed well at all spatial frequencies. If valid, this test could be used to diagnose stereodeficiency.

Barbeito et al. (1985) interpreted the above results in a different way. They pointed out that many stereodeficient people are amblyopic in one eye and some have local scotomata. The appearance of the test grating would therefore vary according to whether it is presented to the good eye or the deficient eye and this could serve as an extraneous cue. Furthermore, the addition of a grating to one eye involved a sudden change of luminance to that eye. Observers may have detected this sudden change rather than the grating. Barbeito et al. investigated this question by changing the intensity of the patch in one eye at the same time that a grating was added to the patch in the other eye. Observers chose the eye in which the larger change in luminance occurred, rather than the eye with the grating, even though they were asked to select the eye seeing the grating. Observers reported a “feeling” in the eye receiving the change in luminance. Similar reports can be found in the older literature (Enoch et al. 1969). Martens et al. (1981) provided further evidence supporting the role of transient stimuli in utrocular discrimination.

Summary

Although eye-of-origin information is used by the stereoscopic and vergence systems, attempts to show that people are aware of which eye is stimulated have been beset with difficulties. Extraneous cues, such as differences in sensitivity and visual parallax due to oculomotor imbalance, could allow people to make consistent judgments, though not necessarily correct ones. It seems that whatever genuine ability people have to make accurate utrocular discriminations is based on the occurrence of a transient stimulus in one eye. It may turn out that this also provides a spurious cue in the form of reflex movements of the eye or pupil.
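The distinction drawn in this section between consistent (reliable) and correct (valid) utrocular judgments can be made concrete with a small sketch. The trial sequence and the observer below are hypothetical, and the two measures are illustrative definitions rather than those used in the cited studies.

```python
def proportion_correct(stimulated, reported):
    """Validity: fraction of trials on which the reported eye is the stimulated eye."""
    hits = sum(s == r for s, r in zip(stimulated, reported))
    return hits / len(stimulated)


def consistency(stimulated, reported):
    """Reliability: agreement with the stimulated eye under the better of the two
    possible labelings (as reported, or with left and right exchanged)."""
    p = proportion_correct(stimulated, reported)
    return max(p, 1.0 - p)


# Hypothetical sequence of stimulated eyes.
stimulated = ["L", "R", "R", "L", "R", "L", "L", "R"]

# Hypothetical observer who discriminates perfectly but has the labels reversed,
# like the person who is confused about which eye is color blind.
reported = ["R" if eye == "L" else "L" for eye in stimulated]

validity = proportion_correct(stimulated, reported)
reliability = consistency(stimulated, reported)
print(validity, reliability)
```

For this observer every answer is wrong, yet the answers track the stimulated eye perfectly, so an above-chance or below-chance score alone does not show which cue the observer used.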
17 STIMULUS TOKENS FOR STEREOPSIS
17.1 Stimuli for disparity-stereopsis
17.1.1 Luminance-defined edges
17.1.2 Stereopsis and stimulus orientation
17.1.3 Texture-defined regions
17.1.4 Stereopsis in the chromatic channel
17.1.5 Motion and flicker as tokens for stereopsis
17.1.6 Disparity between specularities
17.2 Interactions between monocular occlusion and binocular disparity
17.2.1 Basic rules of monocular occlusion
17.2.2 Monocular zones and depth discontinuity
17.2.3 Occlusion, camouflage, and rivalry
17.2.4 Monocular zones and surface opacity
17.3 Da Vinci stereopsis
17.4 Depth from cyclopean transparency
17.5 Depth from binocular rivalry
17.6 Panum’s limiting case
17.6.1 Monocular figural repulsion
17.6.2 Vergence-induced disparity
17.6.3 Double-duty image linkage
17.6.4 Panum’s limiting case as Da Vinci stereopsis
17.6.5 Monoptic depth
17.7 Stereopsis from geometrical illusions
17.8 Chromostereopsis
17.9 Irradiation stereopsis

17.1 STIMULI FOR DISPARITY-STEREOPSIS
This chapter is about stimulus tokens for detection of disparity in stationary stimuli. Tokens for detection of disparity-defined motion-in-depth are discussed in Section 31.3. In all stereograms, the essential binocular disparity that produces depth is that between contours visible in each image. It is often claimed that random-dot stereograms are devoid of monocularly visible contours that generate disparity, but this is not true. The essential disparity is between the matching dots visible in each monocular image. Depth is seen in a random-dot stereogram even when the pattern of disparities between the dots does not define a cyclopean shape, as in Figure 17.1A. A cyclopean shape is not evident in the monocular images but emerges only when the disparate dots are arranged in coherent regions that have the same disparity or form a disparity gradient. The cyclopean shape is defined by the pattern of disparities in the microelements of the stereogram and is therefore not the primary stimulus for the detection of disparity. The cyclopean shape acts as a powerful confirmation that the random-dot images have been correctly linked. Julesz (1960) coined the term “global stereopsis” for this process.

By chance, each monocular dot pattern in a typical random-dot stereogram contains dot clusters that are recognizably the same in the two eyes. These clusters must also help in the binocular linking process. However, the addition of a small proportion of extra dots in one image can camouflage similar dot clusters in the two eyes, as in Figure 17.1B. In spite of this, depth is still evident in the stereogram. Recognizable dot clusters in each monocular image are therefore not essential for the linking process, as long as there are enough matching dots in the two images to produce a recognizable overall correlation at some scale. Blurring one image disrupts the match between high spatial-frequency components, as in Figure 17.1C. However, the dot patterns can be matched in terms of their common low spatial-frequency components.

Depth may also be seen in a random-dot stereogram containing a disparity between a region of uncorrelated dots and a region in which the dots are correlated in position between the two eyes, as in Figure 17.12. For this to work, the two regions must be defined by differential dot density or by spatial frequency to create regions visible in each eye.

The general rule is that stereograms must have disparity-producing edges evident in each monocular image. They can be at the micropattern level and/or the macropattern level. They can be edges defined by luminance contrast, motion, color, or texture, and they can have low or high spatial frequency. One may not be aware of the features that define disparate regions.

Figure 17.1. Properties of random-dot stereograms. (A) Depth is seen in a random-dot stereogram when disparities between the dots do not define a cyclopean shape. (Based on a figure in Brookes and Stevens 1989) (B) A few extra dots are added to the right-eye image so as to camouflage the dot clusters. In spite of this, depth is still evident in the fused stereogram. (C) Stereopsis survives considerable blurring of one image. (From Julesz 1971)

The first aim of this chapter is to define stimulus tokens used to detect binocular disparity. The second aim is to describe the strategies used by the visual system to find binocularly matching elements.

17.1.1 LUMINANCE-DEFINED EDGES

17.1.1a Nature of Luminance-Defined Edges

An edge defined by luminance contrast is the most prevalent token for detecting disparity. In the simplest case, an object produces images with disparity between well-defined edges. In a more complex case, disparity occurs between textured regions, as in the random-dot stereogram. Dichoptic images differing in luminance contrast may be fused to yield stereopsis, although stereoacuity is degraded (Section 18.5.4). It is not surprising that some difference in contrast is tolerated, since the two images of a natural scene sometimes differ locally in contrast because of the different vantage points of the eyes. However, the contrast polarity of the images must be the same in the two eyes (Section 15.3.7). Furthermore, the orientation of contrast-defined edges must be similar in the two eyes (Section 15.3.5).

The question remains about what types of local discontinuity in luminance are used for linking binocular images. Marr and Poggio (1979) proposed that the visual system links regions where the second spatial derivative of luminance is zero. These regions are known as zero crossings. They occur not at the peaks of luminance but at the inflection points between two different levels of luminance where the luminance gradient is steepest. In the Marr and Poggio model, linking of zero crossings is done separately on each output of four spatial filters, ranging from fine to coarse. This helps to minimize ambiguities in the linking process. Linked images are stored in a buffer memory, referred to as the 2½-D sketch. The output of the low spatial-frequency filter is also used to control vergence. Grimson (1981) used a computer implementation of this model and tested it on a variety of stereograms.

Mayhew and Frisby (1981) showed that zero crossings are not the only tokens for detection of disparity (Portrait Figures 17.2 and 17.3). They made a stereogram consisting of two phase-shifted pseudo-square-wave gratings (gratings with the fundamental sine wave missing), as shown in Figure 17.4. Perceived depth corresponded to the disparity between the clearly defined contrast edges of the original square-wave grating rather than to the disparity between the nearest-neighbor zero crossings, even though the latter disparity was smaller. This is probably because a match based on luminance peaks brings all features of the two images into correspondence, whereas a match based on the nearest zero crossings in the composite luminance profile brings only parts of the displays into correspondence. In addition, the nearest-neighbor match may not have been favored because the luminance gradients of the nearest-neighbor zero crossings were very different in the two eyes. Boothroyd and Blake (1984) obtained the same result for a pseudo-square-wave grating of 1 cpd but, as we will now see, they obtained different results with patterns of higher spatial frequency.

Figure 17.2. John P. Frisby. Born in Northampton, England, in 1941. He obtained a B.A. from Cambridge University in 1964 and a Ph.D. from Sheffield University in 1969. He stayed at Sheffield University, where he is now professor of psychology. He invented the Frisby Stereotest in 1977.

Figure 17.3. John E. W. Mayhew. Born in South Africa in 1941. After 10 years in the Royal Navy he studied at Bristol University, where he obtained a Ph.D. in 1972. In 1972 he was appointed lecturer in psychology at Sheffield University, where he became professor in 1982.

Figure 17.4. Pseudo-square-wave stereogram. Stereogram composed of square-wave grating with the fundamental frequency component missing. Luminance profiles for each image are shown below. The peaks of the luminance distributions are phase shifted 90°. The zero crossing at C can be matched with that at A but also with the nearer one at B. Thus the nearest zero crossings are phase shifted less than 90°. (From Mayhew and Frisby 1981 with permission from Elsevier)

Rather than think about disparities between zero crossings of the composite waveform of a compound grating, one can think about disparities between component spatial frequencies. Figure 17.5A shows the component sine waves of a pseudo-square-wave grating. Note that the disparity between the nearest-neighbor zero crossings of the composite luminance profile of the grating is approximately the same as that between the 3f components. With a pseudo-square-wave grating based on a 3-cpd square grating, Boothroyd and Blake found that most subjects saw depth corresponding to disparity between the 3f components. This may be because the luminance gradients of the zero crossings determined by the 3f component and of those in the composite luminance profile are more similar when the fundamental frequency is 3 cpd rather than 1 cpd.

A simpler approach to the question of stereopsis in compound gratings is to use gratings with only two spatial-frequency components and vary their phase and contrasts. Boothroyd and Blake found that when each image consisted of a 3-cpd sine-wave grating superimposed on a 9-cpd sine-wave grating, perceived depth corresponded to the disparity between the 3-cpd components. With this display, both components are brought into register only when the 3-cpd components are linked. Thus, with equal contrasts, a linkage was preferred that brought both components into register rather than one based on a match of only the 9-cpd component, which provided a nearest-neighbor linkage. When the contrast of the 3-cpd images was reduced, perceived depth corresponded to disparity between the 9-cpd images, even though the 3-cpd images evoked depth when presented alone. In this case, the nearest-neighbor match of the higher-contrast components won out over the best overall pattern match.

When two component gratings are multiples of each other (harmonics), as in a 3-cpd plus 9-cpd grating, linking the lower component necessarily brings the higher component into register. However, linking the higher component does not necessarily bring the lower component into register. With a 6-cpd plus 9-cpd grating an observer has three linking options, as can be seen in Figure 17.5B. When the
contrasts of the component gratings were the same, perceived depth corresponded to the disparity in the compound grating, that is, the disparity giving the best overall pattern match between the images. When the contrast of the components was unequal, observers perceived depth corresponding to disparity between either the 6-cpd or the 9-cpd component gratings.

In the cases mentioned so far, all spatial-frequency components were shifted together. In Figure 17.6A, the component gratings are shifted by different amounts so that they cannot be matched simultaneously, as illustrated in Figure 17.6B. The pattern in each eye is a moiré pattern of high- and low-contrast bars with a periodicity that differs from that of either component grating. The bars are in different locations in the two eyes. Subjects saw two depth planes: one produced by disparity between the high-contrast bars in the moiré pattern and the other by disparity between the low-contrast bars. Percepts associated with disparities between the component gratings were not evident.

Figure 17.5. Types of phase-shifted disparity. (A) Disparities between the component spatial frequencies of a square-wave grating with missing fundamental and a 165° phase shift between the two eyes. (B) A 6-cpd grating and a 9-cpd grating are presented to each eye with a 130° interocular phase shift. In the binocular image either or both of the sine-wave components can be brought into correspondence, as indicated by the arrows. (Redrawn from Boothroyd and Blake 1984)

Figure 17.6. Components shifted differently in each eye. (A) Stereogram of two sine-wave gratings of different spatial frequency in each eye, with the two gratings in one eye shifted by different amounts. Subjects saw two depth planes based on disparities formed from the high-contrast and low-contrast lines of the moiré patterns in each eye. (B) Luminance profiles of gratings in the stereogram. Components in one eye are shifted by different amounts relative to those in the other eye. They cannot be in binocular correspondence simultaneously. (Adapted from Boothroyd and Blake 1984 with permission from Elsevier)
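The linking options for compound gratings described in this section can be illustrated with a short calculation. This is a sketch under simplifying assumptions, not a model of the experiments: each component is an ideal sine-wave grating, one image is displaced as a whole by `shift_deg` (a hypothetical value in degrees of visual angle), contrast is ignored, and the function names are illustrative.

```python
import math


def nearest_match(freq_cpd, shift_deg):
    """Smallest-magnitude disparity (deg) that brings a sine-wave grating of
    freq_cpd into register after one image is displaced by shift_deg."""
    period = 1.0 / freq_cpd
    return math.remainder(shift_deg, period)


def registers(freq_cpd, shift_deg, match_deg, tol=1e-9):
    """True if a match at match_deg brings the freq_cpd component into register."""
    period = 1.0 / freq_cpd
    return abs(math.remainder(shift_deg - match_deg, period)) < tol


shift_deg = 0.1  # hypothetical displacement of one image

# 3-cpd plus 9-cpd (harmonics): compare the two nearest-neighbor matches.
match_3 = nearest_match(3, shift_deg)
match_9 = nearest_match(9, shift_deg)
low_links_high = registers(9, shift_deg, match_3)   # linking 3 cpd also registers 9 cpd
high_links_low = registers(3, shift_deg, match_9)   # linking 9 cpd need not register 3 cpd

# 6-cpd plus 9-cpd: the compound pattern repeats at their common divisor, 3 cpd.
pattern_freq = math.gcd(6, 9)
options = {
    "compound pattern": nearest_match(pattern_freq, shift_deg),
    "6-cpd component": nearest_match(6, shift_deg),
    "9-cpd component": nearest_match(9, shift_deg),
}

print(low_links_high, high_links_low)
for name, disparity in options.items():
    print(f"{name}: {disparity:+.4f} deg")
```

With the assumed displacement, the nearest-neighbor match of the 9-cpd component has a smaller magnitude than the match of the 3-cpd component but leaves the 3-cpd component out of register, and the 6-cpd plus 9-cpd pattern yields three distinct candidate disparities.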
17.1.1b Stereogram with a Mixture of Contrasts

Signals produced by on-center ganglion cells, which respond to bright patches on a dark ground, and those produced by off-center ganglion cells, which respond to dark patches on a lighter ground, remain distinct as far as the visual cortex (Section 5.2.2). These contrast-polarity pathways expand the dynamic range of the visual system beyond what a single pathway could achieve. Motion seems to be processed in distinct contrast-polarity channels, and Harris and Parker (1995) asked whether the same is true of disparity. They used the method described in Section 18.3.5 to measure the efficiency with which the stereo system makes use of dots in a random-dot stereogram. Half the dots were black in both eyes and half were white in both eyes. The background was gray. If images are linked in distinct contrast-polarity channels, stereo efficiency with a mixed-polarity stereogram should be twice that for a stereogram with all the dots of the same contrast polarity. This is because two independently processed sets of dots provide two samples of the depth edge. The efficiency advantage was, in fact, close to 2. Thus, binocular images are linked on the basis of two statistically independent samples of contrast polarity, one black-on-gray and one white-on-gray.
17.1.1c Luminance-Defined Gradients
Depth can be perceived in stereograms in which disparity is between smoothly graded changes in luminance rather than between well-defined edges. Disparities of this type arise in natural scenes from differences in the positions of shadows. Shipley (1971) drew attention to the following translation of a passage from a paper by Ernst Mach (1866, p. 1492), which shows that he had the concept of making a stereogram with disparity between two gradients of luminance rather than between sharp contours: The plastic monocular effect of the rotating discs has led me to construct binocular situations of this type using rotating cylinders and to observe them under a stereoscope. If I move a vertical straight line as directrix through a sinusoid, a wavy cylinder surface results. This I illuminate from the side and provide for the two eyes two such illuminated surfaces next to one another on the same rotating cylinder. In this case all light intensities are continuous from one level to another. Each image alone gives the impression of the plastic cylindrical surface referred to above. They appear even more plastic when I superimpose the two images by crossing my eyes. In this way, the stereoscopic images can be constructed without any contours. Puerta (1989) noted that disparity between cast shadows created stereoscopic depth even when there was no disparity between the edges that cast them. For example, craters on the moon appear in depth when pictures of the moon taken 25 hours apart are stereoscopically fused. The only difference between the pictures is the position of shadows. Arndt et al. (1995) devised stereograms consisting of circular patches with the three luminance profiles shown in Figure 17.7. The first luminance profile is a parabola, which lacks zero crossings because the second derivative of a parabola is a constant. Stereograms embodying this type of disparity are shown in Figure 17.8. 
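The statement that a parabolic profile has no zero crossings can be checked numerically. The sketch below is only an illustration: it approximates the second spatial derivative with a discrete second difference, uses a tanh profile as a hypothetical stand-in for a smooth luminance edge, and omits the spatial filtering stage of the Marr and Poggio (1979) model.

```python
import math


def second_difference(samples):
    """Unscaled discrete approximation to the second spatial derivative."""
    return [samples[i - 1] - 2 * samples[i] + samples[i + 1]
            for i in range(1, len(samples) - 1)]


def count_zero_crossings(samples):
    """Number of sign changes in the second difference of the samples."""
    d2 = second_difference(samples)
    return sum(1 for a, b in zip(d2, d2[1:]) if a * b < 0)


# Sample positions chosen so that no sample falls exactly at x = 0.
xs = [(i - 20 + 0.5) * 0.1 for i in range(40)]

parabola = [1.0 - x * x for x in xs]   # parabolic luminance profile
edge = [math.tanh(x) for x in xs]      # hypothetical smooth luminance edge

parabola_crossings = count_zero_crossings(parabola)
edge_crossings = count_zero_crossings(edge)
print(parabola_crossings, edge_crossings)
```

The second difference of the parabola has the same sign everywhere, so no zero crossing is found, whereas the edge profile yields a single zero crossing at its inflection point.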
In the parabolic stereograms, the disparity is between the peaks of the two luminance distributions, as shown in Figure 17.7A. The second luminance profile is a cubic profile. It also lacks zero crossings. In addition, the peaks of the functions are set at zero disparity. The only disparity is between the local luminance gradients, which can be thought of as a disparity between the luminance centroids of the images. The third luminance profile is derived from an asymmetrical function and contains no disparities of zero crossings, of peak luminance, or of centroids. The disparity is defined as the shift that minimizes the mean square difference between the luminance intensity profiles.

In a control condition, the luminance gradients were identical in the two eyes. In a forced-choice procedure, subjects judged whether two stereograms were the same or different and described the kind of difference they perceived. In a second procedure, subjects indicated which of two stereograms appeared nearer. Apart from an overall difference in depth, the stereograms appeared curved in depth, but this effect was due to shading rather than to disparity. Although discrimination of depth was easiest with the parabolic luminance profile, subjects reported depth in all three types of stereogram. It seems that stereopsis can be based on disparities between luminance peaks, between centroids, and between mean square differences of luminance.

Figure 17.7. Luminance intensity profiles for stereograms. Each panel plots luminance against retinal position (-1 to 1). (A) Parabolic profiles with no zero crossings. Disparity is between the peaks of the profiles. (B) Parabolic profiles with no zero crossings. Disparity is between the centroids of the profiles. (C) Profiles with no zero crossings and no disparity between peaks or centroids, which are balanced by the cubic term in the distribution. Disparity is between the mean square luminance distributions.

The disparity threshold for detection of depth based on mean square differences
of luminance was inversely related to stimulus contrast (Mallot et al. 1996b). Since this type of disparity involves the comparison of intensity distributions over a relatively large area, it cannot be used to register local variations in depth that define the 3-D shape of the surface. It can be used to register only the overall distance of the smoothly shaded surface with respect to its surroundings. The 3-D shape of a shaded surface is determined by the monocularly perceived shading and by the tendency to perceive brighter parts nearer than dimmer parts (Mallot 1997). Other stereoscopic effects produced by luminance gradients are described in Section 17.5. The resolution of conflicts between edge-based stereo and stereo based on gradual luminance gradients is reviewed in Section 30.6.

Figure 17.8. Intensity-based stereopsis. With uncrossed fusion, the upper stereogram is of a shaded convex surface. With crossed fusion, the lower stereogram is a convex surface. In both cases, the other stereogram is reversed and should therefore produce a concave surface. Most subjects saw the convex surface but not the concave surface. However, the fact that subjects saw a difference in depth between the two stereograms shows that a disparity between gradually shaded images influences depth perception. (From Arndt et al. 1995 with permission of Springer Science+Business Media)

17.1.1d Summary

It seems that linkages between images that optimize the overall pattern match are preferred, whether these matches are based on zero crossings or on luminance peaks. Linkages between component spatial frequencies are used only when these are the most prominent features. When an overall match between the images is not possible, component elements are often linked to generate multiple depth planes or depth corrugations. Stereopsis can also be based on disparities between low spatial-frequency peaks of luminance distributions, luminance centroids, and other derivatives of luminance gradients. These latter types of disparity cannot register fine variations in depth.

17.1.2 STEREOPSIS AND STIMULUS ORIENTATION

17.1.2a Stereopsis with Orthogonally Oriented Lines
It has been claimed that depth can be seen in a stereogram containing horizontal disparity between regions in which lines in one eye are orthogonal to those in the other eye, as in Figure 17.9 (Ramachandran et al. 1973a; Kaufman 1974, p. 306). However, there are potential artifacts. In the figure used by Kaufman, the lines in the central square were thicker than those in the surround, so there was a difference in space-average luminance (Portrait Figure 17.10). O’Shea (1989) did not obtain stereopsis when he eliminated this artifact and the effects of vergence. In the displays used by Kaufman and by Ramachandran, the ends of the lines of the inner squares abutted those of the surround to create a series of v-shapes, as in Figure 17.9. The tips of these v-shapes could serve as fusible stimuli. Furthermore, the region in which the v-shapes are formed does not have the same space-average luminance as the rest of the display. It would be unwise to draw strong conclusions from this type of display. Setting lines in a stereogram at right angles did not affect the stereo threshold when the lines were shorter than 3 arcmin (Mitchell and O’Hagan 1972). However, the stereo threshold rose rapidly as the orthogonal lines
became longer. It seems that disparity between oppositely oriented lines serves as a reliable cue to depth only when the lines are very short. This issue was discussed in Section 15.3.5.

Figure 17.9. Stereogram with orthogonal lines. With crossed fusion, disparity in the upper stereogram should bring the inner square forward and that in the lower stereogram should take the square back. The impression of depth is unstable and may be due to fusion of features along the boundary between rivaling regions. (Redrawn from Kaufman 1974)

Figure 17.10. Lloyd Kaufman. Born in New York in 1927. He obtained a B.A. in psychology from San Diego State University in 1950 and a Ph.D. from New School University, New York, in 1961. He was a design engineer and flight test engineer for Sperry Rand from 1952 to 1962 and on the research staff at the Sperry Rand Center in Sudbury, Massachusetts, from 1962 to 1969. He became associate professor of psychology at Yeshiva University in 1967 and professor of psychology at New York University in 1969. He has been an emeritus professor and senior scientist at Long Island University since 1995.
17.1.2b Orientation Specificity of Stereopsis

One approach to orientation specificity of disparity processing is to investigate orientation specificity of stereoscopic aftereffects. Julesz (1971, p. 92) reported that depth-contrast (Section 21.4) produced by prior inspection of a random-line stereogram with the lines in a particular orientation did not vary with the orientation of the line elements in the test stimulus. However, lines in a random-line stereogram contain a broad band of contrast energy at all orientations. They are therefore not appropriate for investigating orientation specificity (Mansfield and Parker 1993).

A second approach to the question of orientation specificity in stereopsis is to measure the effects of adding noise, consisting of line elements with specified orientation, to a random-line stereogram. One might expect masking would be most severe when the orientation of the noise lines is similar to that of the disparity-defining lines. However, Mayhew and Frisby (1978) reported that the
quality of stereopsis was the same when the noise elements were set at an angle of 45˚ to the disparity-defining elements of the stereogram as when they had the same orientation. They concluded that the masking of stereopsis by noise is not orientation specific and that human disparity coding, at least for random-line stereograms, uses nonoriented visual channels. This conflicts with physiological evidence that most binocular cells have similar orientation tuning in the two eyes (Section 11.4.1).

Parker et al. (1991) challenged Mayhew and Frisby’s conclusion. They found that the effectiveness of masking noise, as indicated by a forced-choice depth-discrimination task, was clearly reduced as the angle between noise elements and disparity elements of a line stereogram was increased from 0˚ to 90˚ (see Frisby and Pollard 1991). Only part of the masking was independent of the orientation of the masking noise. This isotropic component was more evident with low than with high spatial-frequency elements (Mansfield and Parker 1993).

Mayhew and Frisby (1979b) used another argument to support their conclusion that disparity coding does not involve orientation detectors. They pointed out that an orientation-tuned system would have difficulty detecting closely packed horizontal depth corrugations in a random-dot stereogram, because the receptive fields suitable for detecting orientation would extend across several depth modulations. This argument is not conclusive because horizontally oriented detectors could code horizontal depth corrugations. Horizontal disparity may be coded by end-stopped binocular cells tuned to horizontal lines (Section 11.4.5).

On balance, physiological and psychophysical evidence suggests that detectors tuned to orientation are involved in disparity coding. Orientation-tuned disparity detectors are certainly involved in the image-linkage process, because similarly oriented lines fuse and orthogonal lines rival (Section 15.3.5).
Furthermore, detectors tuned to disparity of orientation may code depth in their own right (Chapter 22). Nevertheless, we saw in Section 11.4.5 that detectors tuned to different orientations are all equally sensitive to disparity in stimuli with a broad range of orientations, such as random-dot stereograms.

17.1.3 TEXTURE-DEFINED REGIONS
The texture of a homogeneously patterned surface may be defined in terms of the shapes, orientations, sizes, or spacing of the texture elements. Disparity between boundaries of differently textured regions can create a step in depth when the texture for each region is the same in the two eyes. In this section, we ask whether disparity between boundaries of textured regions can code depth when the contents of each textured region differ in the two eyes.
Figure 17.11. Stereogram formed from nonmatching letters. The letters in the two eyes differ but the spacing is the same. The central region of I’s in the left image is laterally offset relative to a similar region of o’s in the right image. Depth is seen in spite of letter dissimilarity. (Redrawn from Kaufman and Pitblado 1965)
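The layout described in the caption of Figure 17.11 can be sketched in a few lines of code. This is only an illustration of the principle, not a reconstruction of Kaufman and Pitblado's figure: the grid size, the position of the central region, the two-column offset, and the background letters are hypothetical. The point is that no letter matches at any position in the two half-images, yet the boundary of the central region carries a consistent horizontal offset.

```python
def letter_stereogram(rows=12, cols=24, box=(3, 9, 8, 16), shift=2,
                      left_letters=("x", "I"), right_letters=("s", "o")):
    """Return (left, right) half-images as lists of strings.

    box is (top, bottom, first_col, end_col) of the central region in the
    left image; the same region is displaced by `shift` columns in the
    right image. Each letters pair is (background, central region).
    All default values are hypothetical.
    """
    top, bottom, c0, c1 = box

    def half_image(background, figure, offset):
        lines = []
        for r in range(rows):
            line = []
            for c in range(cols):
                inside = top <= r < bottom and c0 + offset <= c < c1 + offset
                line.append(figure if inside else background)
            lines.append("".join(line))
        return lines

    left = half_image(left_letters[0], left_letters[1], 0)
    right = half_image(right_letters[0], right_letters[1], shift)
    return left, right


left, right = letter_stereogram()

# Print the two half-images side by side for free fusion.
for l_row, r_row in zip(left, right):
    print(l_row + "    " + r_row)

# Disparity of the region boundary, in character columns.
boundary_disparity = right[3].index("o") - left[3].index("I")
# No individual letter matches between the two eyes.
any_letter_match = any(a == b
                       for l_row, r_row in zip(left, right)
                       for a, b in zip(l_row, r_row))
print("boundary disparity (columns):", boundary_disparity)
print("any matching letters:", any_letter_match)
```

Because the element spacing is identical in the two half-images, the only interocular difference that is consistent across the display is the position of the boundary between the two letter clusters.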
Figure 17.11 shows that depth may be created by disparity between the boundaries of textured regions, even though the texture elements in each region are dissimilar in the two eyes. The clustering of the letters is the same in the two images except that the boundary of the central region in one image is displaced with respect to that of the central region in the other. The disparity is therefore defined by the boundary between element clusters, a low spatial-frequency feature, rather than by the elements themselves, a high spatial-frequency feature.

Figure 17.12A produces good stereopsis in spite of the mismatch of the high spatial-frequency elements within the squares (Mayhew and Frisby 1976). Thus stereopsis is processed by disparity between the boundaries of the squares, in the presence of pattern mismatches in the high-frequency contents of the squares. The spatial-frequency content of the boundaries of the squares is not well defined. Stereopsis is also achieved with the stereogram in Figure 17.12B, which has texture rivalry in the background as well as within the squares. However, this works only when the spatial frequency of the background texture differs from that of the texture in the squares. In other words, the boundaries of the squares must be visible in the monocular images. Mayhew et al. (1977) devised an algorithm for searching within rivalrous textured regions for a subset of points possessing point-for-point correspondences sufficient for computation of disparity.

In the stereogram in Figure 17.13A the inner region contains elements with a spatial frequency of 2.5 cpd that match in the two eyes. Elements in the surround have a mean spatial frequency of 10 cpd, and also match in the two eyes (Ramachandran et al. 1973a; Frisby and Mayhew 1978b). This stereogram produces good depth for both crossed and uncrossed disparities. In Figure 17.13B, the spatial frequencies of corresponding regions match, but the positions of elements do not match.
This produces a vague impression of depth, which is probably due, not to disparity, but to a figure-on-ground effect. In Figure 17.13C, corresponding regions differ in both spatial frequency and position of elements. This does not produce depth, presumably because the different spatial frequencies in corresponding regions in the two eyes produce binocular rivalry.

Figure 17.12. Effects of uncorrelated texture on stereopsis. (A) Stereogram with uncorrelated texture. Depth is seen. (B) Stereogram with uncorrelated texture in the squares and background. Depth is seen only if, as here, boundaries of the disparate regions are visible in each eye. (From Mayhew and Frisby 1976. Reprinted by permission from Macmillan Publishers Ltd.)

Summary

Stereopsis is best when corresponding regions of stereograms have matching pattern elements, as in a regular random-dot stereogram. In that case, the texture of the disparate figure can be made the same as that of the zero-disparity background, because stereopsis is based on point-for-point linkage of microelements and not on the boundaries of regions defined by differences in the size, density, or shape of elements. If the micropattern elements do not match, stereopsis is still possible and is based on the disparity between discriminably different regions of the stereogram. For this to work, these boundaries must be visible in each eye. Differences of spatial frequency (size and density of elements) can make these boundaries visible. However, spatial-frequency differences are useful only if (1) corresponding regions in the two eyes are not too dissimilar in spatial frequency and (2) the parts of the stereogram within each
eye have sufficiently distinct spatial frequencies. Thus, regions segregated in each eye on the basis of spatial frequency form the basis for disparity detection if the spatial frequencies of corresponding regions are similar in the two eyes. Spatial frequency may not be the only texture feature serving perceptual segregation of regions in stereograms lacking point-for-point correspondence in the microelements that make up a pattern.

Figure 17.13. Disparities in spatial frequency and location. (A) A stereogram with matching elements in the inner and surround regions creates good depth. (B) The spatial frequencies in the inner and surround regions match but the locations of elements do not. This creates a vague impression of depth that may be due to a figure-ground effect. (C) Inner and outer regions differ in spatial frequency. This creates rivalry, not depth. (From Frisby and Mayhew 1978b. Perception Pion Limited London)

17.1.4 STEREOPSIS IN THE CHROMATIC CHANNEL

17.1.4a Stereopsis at Isoluminance

The visual system has a set of channels designed to detect color differences rather than luminance gradients, and channels designed to detect detail arising from differences in luminance independently of color. Both channels involve binocular cells in V1. See Peirce et al. (2008) for recent evidence about these two channels. Several investigators have inquired whether disparity-based stereopsis is confined to achromatic channels. A second question is whether color, in the presence of luminance contrast, helps the image-linkage process.

Two differently colored patches are isoluminant when they excite the luminance channel of the visual system to the same degree. There are two main criteria for deciding when two patches differing in hue are isoluminant. The first is the point of minimum perceived flicker when the two patches are rapidly alternated. The second is the point where the border between two abutting patches is minimally distinct (Kaiser 1971). There are several technical problems associated with making isoluminant matches.

1. After two patches have been set to isoluminance, luminance fringes may be introduced by chromatic aberrations in the optical components of the eye or apparatus.
2. Rapid alternation of a display on a color monitor may produce luminance fringes because of the differential rise time of phosphors.
3. An isoluminant match under continuous conditions will not hold when the display is moved, because the different color channels of the visual system have different dynamics. Image movements arising from eye movements may be sufficient to upset an isoluminant match.
4. Isoluminance is not the same for different regions of the retina.
5. Chromatic and achromatic borders cannot be compared unless their effective contrasts are equated, and we will see that there is no simple way to do this.
6. The relative effectiveness of chromatic and achromatic stimuli may depend on the spatial-frequency content of the stimuli and on the linearity or nonlinearity of disparity processing.

Lu and Fender (1972) presented random-dot stereograms with the dots in one color and the background in another color. There were 100 dots in the display, each subtending 0.1˚. Various color combinations were tried, but in no case was depth reported when the dots and background differed in color but not in luminance (see also Gregory 1979). Lu and Fender found some weak stereopsis with line stereograms at isoluminance and concluded that color contributes very little to stereopsis. Brain potentials in human subjects evoked by depth reversal of a red-green
dynamic random-dot stereogram disappeared or were greatly reduced at isoluminance (Livingstone 1996).

If stereopsis depended only on luminance, the contrast required for stereopsis should be independent of chromatic differences between the dichoptic stimuli. However, Lu and Fender’s data showed that some color combinations required more luminance contrast than others did before stereopsis was evident. This suggests that different color systems have different luminance functions. Russell (1979) explained Lu and Fender’s data in terms of the higher efficiency of the red/green system relative to other color systems in processing luminance contrast for high spatial-frequency stimuli, such as a random-dot stereogram.

Comerford (1974) used the criterion of minimally distinct borders to equate the luminance of a red, green, or blue wheel-shaped object relative to that of a white or green background. The disparity of the object relative to the surrounding aperture was set at either 7 or 30 arcmin. The percentage of correct depth judgments was better for the red-green stimulus than for other color combinations. However, performance was not reduced at the isoluminant point except for the red-on-white stimulus. The disparity in these displays was between low spatial-frequency patterns, namely, the disk and the surrounding aperture. Since the chromatic system operates most effectively at low spatial frequencies, the chromatic signal would be stronger in this stimulus than in the high spatial-frequency, random-dot display used by Lu and Fender. This could account for the discrepant results. Comerford recorded percentage of correct scores for fixed disparities, not stereoscopic acuity. Therefore, he did not prove that stereoscopic acuity is as high under isoluminant conditions as under luminance-contrast conditions.
Osuobeni and O’Leary (1986) measured stereoacuity using a red-green line stereogram with various levels of luminance contrast and found that stereoacuity at isoluminance was five times worse than with an adequate luminance difference. Kingdom and Simmons (1996) used vertical Gabor patches and found stereoacuity with isoluminant stimuli to be at least one quarter of that with luminance patches with equivalent contrast, defined as multiples of contrast threshold.

De Weert (1979) used a random-dot stereogram with the dots and background in various color combinations. For all color pairs, the impression of depth disappeared in the neighborhood of the isoluminant point defined by minimum perceived contrast, although not at the isoluminant point defined by minimum flicker. In a stereogram consisting of two large bars, depth was perceived for all values of relative luminance. However, this comparison was biased in favor of the bar stereogram, since it had a disparity of 33 arcmin compared with a disparity of only 13 arcmin in the random-dot stereogram. In a later study, De Weert and Sadza (1983) reported that subjects could correctly identify the depth in both an isoluminant random-dot
stereogram and an isoluminant stereogram consisting of a large monocularly defined shape when the disparity in both was +3.6 arcmin. Subjects rated the sensation of depth to be very poor in both stereograms at isoluminance. There may be problems here. Both stereograms subtended only about 3˚ and were surrounded by a high-contrast border. Subjects may have responded to the disparity in the border created by changing convergence on the elements of the stereogram. They had ample opportunity to learn to do this, with 500 forced-choice presentations, each followed by a signal indicating whether the response was correct. Furthermore, the effects of luminance artifacts and chromatic aberration in the video display were not assessed.

Kovács and Julesz (1992) obtained depth when a color difference was added to a random-dot stereogram with reversed luminance contrast between the two eyes. A reversed-contrast stereogram does not produce depth in the absence of color.

Kingdom et al. (1999) found that, at isoluminance, contrast thresholds for the perception of depth were no higher for random-dot stereograms than for line stereograms. However, contrast thresholds for judging the form of a cyclopean shape in an isoluminant random-dot stereogram were elevated relative to those for judging form in an isochromatic stereogram. Thus, there seems to be a specific impairment of processing stereoscopic form at isoluminance, presumably because form processing is weak in the chromatic domain.

Jiménez et al. (1997) found that the upper disparity limit for detection of depth in a random-dot stereogram was highest for a luminance-defined display, next highest for a display that was isoluminant along the red-green locus, and least for a display isoluminant along the yellow-blue locus.

Monkeys detected depth in random-dot stereograms at isoluminance, but with a much reduced level of success.
Recordings from microelectrodes showed that magnocellular cells in the LGN retained some response to isoluminant stimuli (Logothetis et al. 1990). The following factors should be taken into account when comparing isoluminant stimuli with stimuli defined by luminance.
17.1.4b Efficiency of Luminance and Chromatic Detectors

Because the spectral sensitivities of red and green cones overlap, the information at the receptors is reduced when an edge defined by color replaces one defined by luminance. Scharff and Geisler (1992) used an ideal-observer analysis of the photoreceptor mosaic to calculate the equivalent contrasts of color-defined and luminance-defined edges. They measured stereo discrimination for six subjects with an isoluminant red-green random-dot stereogram blurred
to reduce the effects of chromatic aberration. For three subjects who could fuse the stereogram, depth discrimination was worst at the point of isoluminance, but they used the information in the isoluminant display as efficiently as the equivalent contrast of the display would allow. Jordan et al. (1990) found that subjects used chromatic cues more efficiently than luminance cues in matching the images of an ambiguous stereogram. Simmons and Kingdom (1994) equated the contrasts of isoluminant and isochromatic Gabor patches in terms of multiples of the detection threshold. With this criterion, depth discrimination with chromatic Gabor patches required slightly higher contrast relative to the detection threshold. Krauskopf and Forte (2002) found that depth thresholds were ten times higher for chromatic targets than for luminance targets, when the contrast of the targets was an equal multiple of the detection threshold. The measure of equivalent contrast used by Scharff and Geisler gives greater weight to chromatic stimuli than does that used by Simmons and Kingdom. In a subsequent paper Simmons and Kingdom (1995) obtained the same result as in their earlier paper for isoluminant (red-green) and isochromatic (yellow-black) vertical Gabor patches in which the crucial disparity was between modulations within the patches. When the patches were horizontal so that the crucial disparity was between the envelopes of the patches, the contrast threshold for stereo discrimination for the isoluminant stimulus was considerably higher than for the isochromatic stimulus. They proposed that large disparities between the Gaussian envelopes are detected by a nonlinear mechanism that operates most effectively with luminance contrast. Disparity detection in random-dot stereograms may involve nonlinear processes because the disparity in these stereograms is usually greater than the mean separation of the dots. 
This could explain why depth is not seen in random-dot stereograms with isoluminant stimuli. In a third paper, Simmons and Kingdom (1997) measured the contrast threshold for detection of stereoscopic depth using Gabor patches with different mixtures of color and luminance contrast (Portrait Figures 17.14 and 17.15). The data indicated that the two forms of contrast sum only at the level of probability summation. Using similar stimuli, Simmons and Kingdom (2002) showed that stereoacuity was disrupted when luminance contrast was anticorrelated and chromatic contrast was correlated or vice versa. These results suggest that luminance and chromatic stereo channels are independent.
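Two ideas in this section lend themselves to a short numerical illustration: expressing contrast in multiples of the detection threshold, and probability summation between independent channels. The sketch below is not the analysis used by Simmons and Kingdom. The psychometric function (a Weibull-type form with a hypothetical slope) and the contrast values are assumptions made only to show how the two quantities are computed.

```python
import math


def threshold_multiples(contrast, detection_threshold):
    """Express a contrast as a multiple of its own detection threshold."""
    return contrast / detection_threshold


def detect_probability(contrast_in_thresholds, slope=3.0):
    """Hypothetical Weibull-type psychometric function.

    Takes contrast in threshold units; the slope value is an assumption.
    """
    return 1.0 - math.exp(-(contrast_in_thresholds ** slope))


def probability_summation(p_a, p_b):
    """Probability that at least one of two independent channels responds."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


# Hypothetical stimuli: each component sits exactly at its own threshold,
# although the physical contrasts differ.
lum_units = threshold_multiples(contrast=0.02, detection_threshold=0.02)
chrom_units = threshold_multiples(contrast=0.05, detection_threshold=0.05)

p_lum = detect_probability(lum_units)
p_chrom = detect_probability(chrom_units)
p_both = probability_summation(p_lum, p_chrom)

print("luminance alone:  ", round(p_lum, 3))
print("chromatic alone:  ", round(p_chrom, 3))
print("both, independent:", round(p_both, 3))
```

With independent channels the combined stimulus is detected more often than either component alone, but only because there are two chances to detect it. Summation within a single channel would add the two contrasts before the psychometric function is applied, which predicts a larger improvement.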
Figure 17.14. Fred Kingdom. Born in East Molesey, England, in 1953. He obtained a B.A. from Cambridge University in 1974 and a Ph.D. with Bernard Moulden from the University of Reading in 1984. After holding research positions at the Universities of Reading and Cambridge he joined the department of ophthalmology at McGill University in 1990, where he is now a professor.

Figure 17.15. David Simmons. Born in Norwich, England, in 1963. He obtained a B.Sc. in physics from Imperial College, London, in 1985 and a Ph.D. in physiological science with M. Hawken from Oxford University in 1993. He held research appointments with D. Foster at Keele University and with F. Kingdom at McGill University. He obtained an academic appointment at Glasgow Caledonian University in 1995 and moved to the University of Glasgow, department of psychology in 1999.

17.1.4c Relative Density of Cone Types

The second factor to be taken into account when comparing chromatic and achromatic stimuli is the relative spacing of different types of cone. The blue cones are virtually absent in the fovea, and are much less dense than red or green cones in the rest of the retina. In the baboon, at eccentricities over 5˚, 13% of receptors are blue cones, 33% are red cones, and 54% are green cones (Marc and Sperling 1977). Blue cones also have larger receptive fields and poorer contrast sensitivity than red and green cones, and these factors could also contribute to the low stereoacuity of the blue-cone system. It is not surprising that the stereo threshold for isoluminant blue-on-yellow random-dot stereograms was 43 arcsec, which is about nine times that for stimuli defined by luminance (Grinberg and Williams 1985). Another factor that may be related to receptor density is the instability of gaze when a person attempts to fixate an isoluminant pattern (Tyler and Cavanagh 1991).
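The figures quoted above can be turned into rough arithmetic. The sketch below assumes that each cone class forms a regular mosaic, so that linear spacing varies as the inverse square root of density. That scaling rule is an assumption introduced here, not a claim from the studies cited, and it combines baboon cone proportions with a stereo threshold measured under different conditions.

```python
import math

# Cone proportions quoted for the baboon beyond 5 degrees eccentricity.
cone_share = {"blue": 0.13, "red": 0.33, "green": 0.54}


def relative_spacing(share, reference_share):
    """Linear spacing relative to a reference cone class.

    Assumes a regular mosaic, so spacing scales as 1 / sqrt(density).
    """
    return math.sqrt(reference_share / share)


blue_vs_green = relative_spacing(cone_share["blue"], cone_share["green"])
blue_vs_red = relative_spacing(cone_share["blue"], cone_share["red"])

# Threshold arithmetic from the values quoted in the text.
blue_yellow_threshold_arcsec = 43.0
ratio_to_luminance = 9.0
implied_luminance_threshold = blue_yellow_threshold_arcsec / ratio_to_luminance

print("blue spacing relative to green:", round(blue_vs_green, 2))
print("blue spacing relative to red:  ", round(blue_vs_red, 2))
print("implied luminance threshold (arcsec):",
      round(implied_luminance_threshold, 1))
```

On these assumptions the blue-cone mosaic is sampled roughly twice as coarsely as the green-cone mosaic, which is well short of the ninefold difference in threshold. This is in keeping with the point made above that larger receptive fields and poorer contrast sensitivity may also contribute.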
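Figure 17.16 plot. Vertical axis: depth modulation threshold (phase). Horizontal axis: temporal frequency of depth modulation (Hz), 0.1 to 100. Curves: stereo and monocular thresholds for a red-green chromatic grating and for a luminance grating.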
17.1.4d Dynamics of Chromatic and Achromatic Stimuli

A third factor in comparing chromatic and achromatic stimuli is the relative sensitivity of chromatic and luminance channels to transient stimuli. The chromatic channel of the parvocellular system is relatively insensitive to transient stimuli and should therefore be most sensitive to depth oscillations at low temporal frequencies. The luminance channel, especially the magnocellular part of the channel, is sensitive to transient stimuli and most sensitive to oscillations in depth at higher frequencies.

Tyler and Cavanagh (1991) investigated this question by measuring the amplitude of disparity modulation required for detecting depth in a red-green grating as a function of the temporal frequency of depth modulation. Figure 17.16 shows that, with a grating of 10% contrast, the stereo threshold was lowest at a temporal frequency of about 3 Hz, and increased steeply below this frequency. With an isoluminant grating, the lowest threshold was at about 1 Hz and showed no significant increase at lower frequencies. Below 1 Hz, the depth threshold for the isoluminant grating was equal to or less than that for the luminance grating.

With luminance-defined stimuli, the threshold for detection of monocularly viewed lateral motion was lower than that for detection of motion-in-depth defined by disparity. This is stereomovement suppression (see Section 31.3.3c). With isoluminant red-green gratings, the threshold for oscillations of spatial phase was the same when the gratings were viewed by one eye as when they were viewed dichoptically to produce oscillation in depth (Tyler and Cavanagh 1991). Thus, stereomovement suppression was not present in isoluminant gratings. They concluded that stereomovement is processed separately in luminance and chromatic channels.
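The kind of stimulus described above, a pair of dichoptic gratings whose relative spatial phase oscillates over time, can be sketched as follows. This is a generic illustration rather than Tyler and Cavanagh's display: apart from the 10% contrast, the spatial frequency, modulation amplitude, temporal frequency, and function names are hypothetical, and only a luminance grating is modeled.

```python
import math


def phase_disparity(t, amplitude_deg, temporal_hz):
    """Interocular phase difference (degrees) at time t in seconds."""
    return amplitude_deg * math.sin(2 * math.pi * temporal_hz * t)


def grating(x_deg, spatial_cpd, phase_deg, contrast=0.1, mean=0.5):
    """Luminance of a sinusoidal grating at visual angle x_deg."""
    angle = 2 * math.pi * spatial_cpd * x_deg + math.radians(phase_deg)
    return mean * (1 + contrast * math.cos(angle))


def dichoptic_frame(t, xs, spatial_cpd=1.0, amplitude_deg=20.0,
                    temporal_hz=1.0):
    """Left and right half-images with equal and opposite phase shifts."""
    d = phase_disparity(t, amplitude_deg, temporal_hz)
    left = [grating(x, spatial_cpd, +d / 2) for x in xs]
    right = [grating(x, spatial_cpd, -d / 2) for x in xs]
    return left, right, d


def phase_to_arcmin(phase_deg, spatial_cpd):
    """Convert a phase shift to arcmin; one cycle spans 60 / cpd arcmin."""
    return phase_deg / 360.0 * 60.0 / spatial_cpd


xs = [i * 0.05 for i in range(41)]             # 0 to 2 degrees of visual angle
left0, right0, d0 = dichoptic_frame(0.0, xs)   # zero disparity at t = 0
left1, right1, d1 = dichoptic_frame(0.25, xs)  # peak disparity at 1 Hz

print("phase disparity at t = 0.25 s (deg):", round(d1, 3))
print("equivalent disparity (arcmin):", round(phase_to_arcmin(d1, 1.0), 3))
```

Presenting only one of the two half-images gives the monocular condition, in which the same phase oscillation is seen as lateral motion rather than motion in depth. That is the comparison on which the measurement of stereomovement suppression depends.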
Figure 17.16. Chromatic and achromatic depth-modulation. Filled symbols show the amplitude of disparity modulation required for detection of depth in a red-green chromatic grating (upper curves) and a luminance grating (lower curves) as a function of temporal frequency of depth modulation. At higher temporal frequencies the luminance channel is more sensitive to depth modulations than the chromatic channel. Unfilled symbols show the threshold for monocular oscillatory motion in the chromatic (upper curve) and luminance channels (lower curve), as a function of temporal frequency. In the chromatic channel, the function for depth modulation is similar to that for oscillatory monocular motion. However, depth modulation thresholds in the luminance channel are higher than those for monocular motion, an effect dubbed stereomovement suppression. (Adapted from Tyler and Cavanagh 1991)

17.1.4e Effects of Chromatic and Achromatic Noise

If chromatic contours are not used in stereopsis, then stereopsis should not be disturbed when some of the dots in a
random-dot stereogram differ in color in the two eyes. Stuart et al. (1992) measured the time taken to discriminate two depth planes in a random-dot stereogram as a function of the percentage of dots that differed in color and/or had reversed contrast in the two eyes. The effective contrasts of the luminance and chromatic elements were equated by the procedure used by Scharff and Geisler, mentioned in Section 17.1.4b. At low levels of contrast (14.6%) the task took longer to complete or depth could not be detected when 20% of the dots differed in color. However, subjects could discriminate depth when 50% of the dots were contrast reversed and the only correlated dots were defined by color. At luminance contrasts of 28 and 37% and chromatic contrast at 14.6%, the addition of chromatic noise had little effect, but only a small percentage of luminance noise was tolerated. Stuart et al. argued that this decline in the contribution of color to stereopsis at high levels of luminance contrast is to be expected if color and luminance are processed within a “double duty” system, such as the parvocellular system, rather than by two independent systems.

Even if color and luminance are processed in the same parvocellular system, they each pass through several stages in the ventral processing stream, some of which may be specific for either luminance or color (Section 5.8.3). Thus, distinct aspects of color are processed in the retina, V1, V4, and the inferotemporal cortex.
Patients with lesions in a region analogous to V4 in monkeys suffer a form of color blindness known as cerebral achromatopsia (Zeki 1990). Patients describe the world in terms of gray, although all three cone mechanisms are intact. One patient could see depth in a random-dot stereogram presented with red-green anaglyph filters even though he could not detect the red stereogram elements when he looked through the red filter (Hendricks et al. 1981). Disparity detection must depend on processes at an earlier stage than that responsible for achromatopsia.
Isoluminant stimuli do not reveal the full contribution of color to visual performance. Gur and Akri (1992) showed that contrast sensitivity is enhanced when color contrast is added to a grating defined by luminance contrast. They explained this in terms of the contribution of cells of the parvocellular system that respond to both the chromatic and luminance components of the stimuli.
Summary
There are three ways to interpret the presence of some stereopsis at isoluminance. (1) The chromatic component of the parvocellular system is capable of some disparity processing. (2) The magnocellular system is capable of responding weakly to isoluminant stimuli. (3) Disparity is processed in both channels. In any case, we cannot assume that the parvo- and magnocellular systems are totally segregated, and one cannot draw firm conclusions about the site of disparity processing from stereo performance at isoluminance. Isoluminance degrades stereoscopic vision, and the loss seems to be greater for disparities defined by small elements than for those defined by large elements. However, there is dispute about whether stereoacuity is the same for luminance-defined and chromatic-defined targets when the stimuli are an equal distance above the detection threshold. The upper disparity limit for depth is higher for luminance-defined than for isoluminant random-dot displays.
Little is known about the magnitude of the loss in stereoacuity at isoluminance because stereoscopic acuity has usually not been measured with isoluminant displays. The loss may be no more than one would predict from the loss in monocular resolution at isoluminance for the same type of pattern, because of the lower contrast sensitivity. Artifacts such as chromatic aberration and the presence of luminance-contrast borders may have contributed to the depth sensations reported in isoluminant displays. Chromatic contrast may enhance depth induced by luminance contrast.
17.1.5 MOTION AND FLICKER AS TOKENS FOR STEREOPSIS
The question addressed in this section is whether disparate regions defined only by motion can generate impressions of depth. Julesz (1971, p. 83) exposed each eye to columns of short vertical lines moving in opposite horizontal directions
in alternate columns. The motion of each column in one eye was out of phase with the motion of the corresponding column in the other eye. This procedure eliminated positional disparities from the stereogram. One square region of the display in one eye was horizontally shifted to produce a disparity. The motion in the two eyes was correlated in the shifted region, and uncorrelated in the surround region. The square was not seen in depth. It was concluded that motion is processed after disparity and is not a token for disparity-based stereopsis. However, Julesz pointed out that a negative result is not conclusive. Note that the motion signals had the same magnitude and direction in the two regions of the stereogram. Presumably, the visual system failed to detect the dichoptic correlation of the motion signals in the disparate region of the stereogram and hence failed to detect the disparity. Lee (1970c) exposed each eye to a vertical strip of random dots that oscillated sinusoidally from side-to-side, occluding and exposing a background of similar but stationary random dots. The boundaries of the vertical strip were not visible in any snapshot of the display. The dot patterns of both strip and background were uncorrelated between the two eyes, but the locations of the boundaries between the moving strip and the stationary background were correlated. When a phase difference was introduced between the motions of the oscillating strips, the vertical strip appeared to rotate in an elliptical path in depth similar to that seen in the Pulfrich effect (Section 23.1). Lee concluded that kinetic boundaries constitute a token that can be used by the stereoscopic system. Prazdny (1984) obtained impressions of depth from a static disparity between vertical arrays of dots that were correlated from frame to frame in random-dot arrays that were otherwise uncorrelated over time and between the eyes.
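The elliptical path reported for Lee's display follows from simple trigonometry, and a short numerical sketch may make this concrete. The Python fragment below is illustrative only and is not drawn from Lee's paper. It assumes that each eye sees a kinetic boundary oscillating sinusoidally with the same amplitude and frequency, that the right-eye motion leads by a fixed phase, and that disparity is taken as left-eye position minus right-eye position (positive meaning crossed). The function name, parameters, and units are hypothetical.

```python
import math


def dichoptic_strip(t, amplitude, freq_hz, phase_rad):
    """Return (lateral, disparity) of a strip oscillating in each eye.

    t          time in seconds
    amplitude  monocular oscillation amplitude (e.g., arcmin); assumed value
    freq_hz    oscillation frequency in Hz
    phase_rad  interocular phase difference in radians
    """
    w = 2.0 * math.pi * freq_hz
    x_left = amplitude * math.sin(w * t)
    x_right = amplitude * math.sin(w * t + phase_rad)
    lateral = 0.5 * (x_left + x_right)   # mean (cyclopean) direction
    disparity = x_left - x_right         # assumed sign: positive = crossed
    return lateral, disparity


if __name__ == "__main__":
    # Sample one cycle: lateral position and disparity are in quadrature,
    # so the plotted pairs trace an ellipse.
    for k in range(8):
        lat, disp = dichoptic_strip(k / 8.0, 10.0, 1.0, math.pi / 3)
        print(f"t={k / 8.0:.3f}  lateral={lat:7.3f}  disparity={disp:7.3f}")
```

Under these assumptions the lateral position varies as A cos(φ/2) sin(ωt + φ/2) and the disparity as −2A sin(φ/2) cos(ωt + φ/2). The two are 90˚ out of phase, so the boundary traces an ellipse in the plane of lateral position and disparity, and the disparity vanishes when the phase difference is zero.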
Halpern (1991) devised a stereogram with disparity between forms defined by motion alone. Each eye saw a display of random dots in which a central square moved from side to side through 20 arcmin with respect to surrounding dots, with deletion of dots along the square’s leading edge and accretion of dots along the trailing edge. The squares in the two eyes moved out of phase to produce 1, 3, or 5 arcmin of disparity. Subjects correctly identified depth produced by crossed and uncrossed disparities. However, settings with a depth probe revealed that, even with the largest disparity, very little depth was perceived for uncrossed disparities in the square. Halpern suggested that the monocular cue of accretion and deletion of dots at the edges of the square indicated that the square was in front, and that this detracted from depth created by uncrossed disparities. When accretion and deletion were removed by using a cyclopean moving square formed from dynamic random-dot displays, both crossed and uncrossed disparities produced impressions of depth commensurate with the imposed disparity. In these experiments, the motion signals in the two regions of the stereogram were different.
The visual system had simply to detect this difference to detect the disparity in the moving region. The fact that it did so suggests that motion can be a token for disparity-based stereopsis. Poom (2002) pointed out that the displays used by Lee and by Halpern contained elements visible to only one eye. This occurred where the moving dots occluded or exposed dots of the stationary background. The monocular regions rather than disparity between kinetic edges may have generated impressions of depth (Section 17.3). Poom constructed a stereogram in which random dots rotated back and forth within a diamond-shaped region in one direction while the surrounding dots rotated in antiphase. In both regions the dots were uncorrelated between the two eyes. A crossed or uncrossed disparity between the kinetic boundaries in the two eyes created an impression of a near or far diamond frame. Along the kinetic edge of the diamond, dots inside and outside the diamond underwent accretion and occlusion. The accretion and deletion cues were ambiguous with respect to border ownership and therefore provided no information about depth. Subjects were able to identify the sign of depth created by the kinetic boundary for disparities above 0.1˚. Rogers (1987) showed that a binocular phase difference between standing wave patterns of shearing motion may also be detected by the visual system. Each eye saw a similar pattern of shearing motion in which the dots moved up and down along vertical paths with sinusoidal motion. The amplitude of vertical motion varied sinusoidally with horizontal position to create a standing wave of shearing motion. Monocularly, the standing wave was seen as a 3-D vertical corrugation, which rocked back and forth in depth about a horizontal axis through the center of the pattern, like a kinetic depth effect display (Section 28.5).
The dot patterns seen by the two eyes were uncorrelated, but there was a correlation between the positions of the nodes and antinodes of the standing wave motion. The vertical corrugations appeared to rock back and forth in depth in front of the fixation point when the nodes and antinodes had a crossed disparity, and beyond the fixation point when they had an uncrossed disparity. There are two possible explanations of this effect. The visual system might detect the phase difference between the correlated vertical motions in the two eyes. If so, the effect produced by the standing wave of motion suggests that the binocular system can link smooth (low frequency) kinetic boundaries as well as the sharp (high frequency) kinetic boundaries used in Lee’s (1970c) experiment. Alternatively, the structure-from-motion could be interpreted separately in the two eyes before the disparity between the 3-D structures is derived. That is, the visual system could use structure-from-motion disparity rather than kinetic-boundary disparity. No evidence was provided to distinguish between these two possibilities. Disparity between boundaries defined by flicker also creates impressions of depth. Prazdny (1984) presented uncorrelated dynamic random-dot displays to the two eyes. In a bar-shaped region the black and white dots continuously changed their contrast polarity. At any instant, the frames presented to the two eyes appeared as rivalrous displays. In a sequence of frames, disparity between the bar regions created an impression of depth. Poom (1987) created an impression of depth from disparity between images of a diamond-shaped region of dots undergoing correlated flicker.
17.1.6 DISPARITY BETWEEN SPECULARITIES
The types of binocular disparity discussed so far are between the contrast, texture, or motion boundaries of surfaces. But disparity can also arise from luminance-defined edges due to differences in illumination (Section 27.3). A perfectly matte surface reflects light falling on each point equally in all directions; the surface is said to be Lambertian. Gradients of luminance (shading) are created when an undulating matte surface intersects the principal direction of illumination at different angles. Moreover, the gradients of luminance change when the angle of illumination changes and, for a given angle of illumination, the gradients differ between the eyes to create disparity. A surface that reflects light more in one direction than in others is specular. In the limit of specularity, the surface is a mirror and the angle of reflection at each point is equal to the angle of incidence. Flat or undulating semispecular surfaces create luminance highlights that are reflections of light sources. These are referred to as specularities. For a given angle of illumination, the two eyes see specularities in different locations on the surface, thus creating binocular disparities. If a specular surface is convex to the observer, any specularity forms a virtual stereoscopic image beyond the surface. If the surface is concave, the specularity forms a real image, which is generally nearer than the surface (Figure 17.17). Koenderink and van Doorn (1980) analyzed the properties of specularities during movement of the observer. Blake and Bülthoff (1991) extended this model to binocular viewing situations. Their analysis shows that disparities between specularities provide information about the direction and amount of curvature in a surface, which the visual system could, in principle, exploit. Blake and Bülthoff (1990) tested whether human observers use specularity disparity. 
In the first experiment, test surfaces were stereoscopic textured convex and concave ellipsoids and a convex sphere. Observers adjusted the disparity of specularity to maximize the perceived glossiness of the surface. All seven observers judged the convex sphere to be most glossy when the specularity disparity was uncrossed (placing it behind the surface), as their model predicted. The average value of disparity selected by the subjects was only slightly less than that predicted by the model.
For most trials involving the convex ellipsoids, maximum glossiness was also reported when the specularity was stereoscopically beyond the surface, as predicted. The results for the concave surfaces were not consistent with the model. Four of the observers judged the surface to be most glossy when the specularity had zero disparity (placing it on the surface), and two when it had an uncrossed disparity, in the opposite direction to the predictions of the model. Sakano and Ando (2010) found that a temporal change in the luminance of an evenly illuminated specular surface produced by head motion or by binocular disparity enhanced the perceived glossiness of the surface.
In their second experiment, Blake and Bülthoff (1990) showed observers a shaded and textured surface in which all texture elements had zero disparity. The shading and texture information was ambiguous as to whether it was generated by a convex or a concave surface. Five observers viewed a random sequence of such stereograms in which the specularity was stereoscopically either in front of or beyond the surface by a disparity of 5 arcmin. They made a forced-choice decision whether the surface appeared convex or concave. At first, responses were not consistent with the direction of the specularity but, after about 20 exposures, all observers could reliably report the direction of surface curvature. These results provide some evidence that we use the disparity of specular reflections to judge the direction of surface curvature. However, the fact that, initially, none of the observers gave appropriate responses to the concave surfaces weakens this conclusion. The results of the forced-choice experiment show that observers can learn to discriminate between stereoscopic stimuli that have opposite directions of specularity disparity but this does not mean that these differences create percepts of convexity or concavity. Todd et al. (1997) used stereograms of more complex shapes that were textured or plain and with Lambertian or specular reflections. Subjects adjusted the slant of a monocular circular patch so that it appeared tangential to the surface at each of several points. Like Bülthoff and Mallot (Section 27.3.2b) they found that surface relief on smooth surfaces was underestimated about 30% relative to that on textured surfaces. However, unlike Bülthoff and Mallot they found that relief was more accurately perceived with specular reflection than with Lambertian reflection, even with untextured surfaces.
Figure 17.17. Specular disparity. (A) Reflection off a convex surface creates a virtual image of the light source behind the surface. (B) Reflection off a concave surface usually creates a real image of the light source in front of the surface. (Redrawn from Blake and Bülthoff 1990)
17.2 INTERACTIONS BETWEEN MONOCULAR OCCLUSION AND BINOCULAR DISPARITY
17.2.1 BASIC RULES OF MONOCULAR OCCLUSION
Next to a vertical edge of an opaque object there is a region of a far surface that is visible to only one eye, as shown in Figure 17.18A. This is monocular occlusion. A region visible to only one eye will be referred to as a monocular zone. A region visible only to the left eye is a left monocular zone and a region visible only to the right eye is a right monocular zone. A region visible to both eyes is a binocular zone, and a region not visible to either eye is a binocular occlusion zone. Leonardo da Vinci drew the diagrams shown in Figure 17.19 (Strong 1979). They illustrate occlusion zones on a surface viewed through an aperture. An object lying in front of a surface is not visible if the object and the surface have very similar textures and luminances so that the near object is camouflaged against the far surface. In Figure 17.18C the near object is camouflaged to the left eye because its image is superimposed on a matching far surface, but the right eye can see it because the near and far surfaces do not overlap for this eye. This is monocular camouflage.
Figure 17.18. Monocular zones from occlusion and camouflage. (A) When a near object is shorter than the interocular distance, there is a binocular region between the monocular zones. (B) When a near object is longer than the interocular distance, neither eye sees the region between the monocular zones. (C) A near object is not visible to the left eye when near and far surface are the same. The object is camouflaged for the left eye. The near object is visible to the right eye against the different background. It appears on the nasal side of the far object.
Figure 17.19. Drawings by Leonardo da Vinci. The drawings illustrate monocular occlusion zones created by looking at a surface through each of the two apertures. With the narrow aperture there is a region not visible to either eye. (Traced from the original drawings in Strong 1979)
Monocular occlusion and camouflage obey the following simple geometrical rules:
1. On a far surface, a monocular zone seen by the left eye is to the left of a near binocular object and one seen by the right eye is to the right of a near surface (see Figure 17.18A). Thus, for occlusion, the monocular zone in each eye is on the temporal side of the near binocular surface. A near surface camouflaged to one eye is seen by the other eye on the nasal side of the far surface.
2. A monocular zone due to occlusion is necessarily part of an object beyond the binocular object. A monocular zone due to camouflage is nearer than the object against which it is camouflaged.
3. The size of a monocular zone due to either occlusion or camouflage is affected only slightly by changes of accommodation, version, or vergence. There is a slight change because an eye’s center of rotation and its nodal point are not coincident.
4. For a far surface at a given distance, the angular subtense of a monocular zone (angle ϑ in Figure 17.18A) is inversely proportional to the distance of the occluding object from the viewer. A monocular zone becomes vanishingly small as the near surface approaches the far surface, as the far surface approaches the near surface, or as both surfaces are moved further away from the viewer. A viewer who sees a monocular zone changing in angular size knows only that one or some combination of these three things is happening. The size, or change in size, of a monocular zone is a potential source of unambiguous information about relative distance only if the viewer knows the distance of the occluder or of the occluded surface.
5. A near object narrower than the interocular distance produces two monocular zones on a far surface. Beyond a certain distance the two zones are separated by a binocular zone (Figure 17.18A). An object wider than the interocular distance produces monocular zones flanking a zone that neither eye sees (Figure 17.18B).
6. The width of a monocular zone limits the minimum possible depth between an occluder and an occluded object, but does not limit the maximum possible depth. See Tsirlin et al. (2010a) for empirical evidence on this point.
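The geometry behind rules 1, 4, and 5 can be sketched numerically. The following Python fragment is an illustration only and is not taken from the sources cited above. It assumes pinhole eyes separated by an interocular distance a, a flat occluder of width s centered on the midline at distance d, and a frontal far surface at distance D. The function name, parameter names, and example values are hypothetical.

```python
import math


def monocular_zone(a, s, d, D):
    """Left-eye monocular zone on a frontal far surface (same units throughout).

    a  interocular distance      s  occluder width
    d  occluder distance         D  far-surface distance (D > d)
    Returns (width, angle_deg, middle), where middle describes the region
    lying between the left and right monocular zones.
    """
    if not 0 < d < D:
        raise ValueError("require 0 < d < D")
    k = D / d
    # Intervals on the far surface hidden from each eye (eyes at x = -a/2, +a/2).
    left_hidden = (-a / 2 + (a - s) / 2 * k, -a / 2 + (a + s) / 2 * k)
    right_hidden = (a / 2 - (a + s) / 2 * k, a / 2 - (a - s) / 2 * k)
    # Left monocular zone: hidden from the right eye, visible to the left eye.
    x1 = right_hidden[0]
    x2 = min(right_hidden[1], left_hidden[0])
    width = x2 - x1
    # Angle subtended by the zone at the left eye.
    angle = math.atan2(x2 + a / 2, D) - math.atan2(x1 + a / 2, D)
    if left_hidden[0] > right_hidden[1]:
        middle = "binocular"
    else:
        middle = "hidden from both eyes"
    return width, math.degrees(angle), middle


if __name__ == "__main__":
    # Assumed example values in centimeters: narrow versus wide occluder.
    print(monocular_zone(6.5, 3.0, 50.0, 100.0))
    print(monocular_zone(6.5, 10.0, 50.0, 100.0))
```

In this sketch the left-eye zone lies to the left of the occluder, as in rule 1. For an occluder wider than the interocular distance the zone width is a(D − d)/d, so its angular subtense is approximately a(1/d − 1/D) radians, which shrinks to zero as the two surfaces approach each other or as both recede. For an occluder narrower than the interocular distance, the two zones are separated by a binocular region once D exceeds ad/(a − s), as in rule 5.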
17.2.2 MONOCULAR ZONES AND DEPTH DISCONTINUITY
The question discussed in this section is how monocular occlusion and camouflage influence impressions of depth created by binocular disparity. Monocular occlusion occurs wherever two surfaces are separated by a depth discontinuity that is not horizontal. A depth discontinuity must have a disparity gradient greater than 2 to generate a monocular zone. Figure 17.20 illustrates the role of monocular zones in creating a sharp depth edge. In the upper stereogram, the monocular zone of three rectangles creates a straight depth step. The lower stereogram has the same disparities, but both eyes see all the rectangles, and a straight edge is not seen (Nakayama and Shimojo 1990). With crossed fusion of the stereogram in Figure 17.21A, the vertical grating appears beyond the diamond frame on the right (Anderson 1999a). This interpretation is supported by the fact that monocular zones created by disparity conform to the occlusion configuration. On the left, the white bars separate from the black bars and appear nearer than the plane of the diamond. This interpretation is supported by the fact that the monocular zones conform to the camouflage configuration. In Figure 17.21B, the black bars come forward on the left because only they can be interpreted as being camouflaged against the black background. Figure 17.21C does not produce depth on the left because the intermediate luminance of the background does not support either an occlusion or a camouflage interpretation.
Figure 17.20. A monocular zone creates sharp depth edge. (A) One of the fused images creates a step with a straight edge with a subjective contour abutting four rectangles that are visible to only one eye. In the other fused image the monocular rectangles are seen by the wrong eye and do not create an edge. (B) Both fused images create a step in depth but there are no monocular elements to define a sharp subjective contour. (Adapted from Nakayama and Shimojo 1990)
Figure 17.21. Effects of background on occlusion and camouflage. (A) In the right fused image with crossed fusion, both white and black bars of the grating appear beyond the diamond frames because monocular zones conform to the occlusion configuration. In the left image, only white bars are seen in front because only they can be seen as extending across the white background. (B) In the left fused image black bars come forward because only they can be seen as extending across the black background. (C) The left fused image does not create a definite impression of depth because the background does not support either an occlusion or a camouflage interpretation. The right fused image supports only an occlusion interpretation. (Redrawn from Anderson 1999a)
Gillam and Borsting (1988) argued that a monocular zone is more visible and thus more effective as a cue for a depth discontinuity if it is textured rather than blank. It took less time to recognize a depth edge in a random-dot stereogram when the monocular zone was filled with dots, like those in the background, than when the monocular zone was left blank. The crucial factor, however, may be the similarity between the texture in the monocular zone and that in the rest of the far surface rather than the absence of texture in the occluded zone. Grove and Ono (1999) tested this suggestion by adding a condition in which the texture density of the monocular zone was higher than that of the rest of the stereogram. For this condition, latency was higher than when the monocular zone was blank or had matching texture. In a second experiment they used the stereograms shown in Figure 17.22. In the upper stereogram, the monocular zone has the same texture as that of the far surface, and depth is readily seen. In the lower stereogram, the monocular zone matches the near surface, and depth is hard to see. Grove et al. (2002) had subjects set a depth probe to match the perceived depth of a step in a random-dot stereogram. The disparity of the probe equaled that of the depth step when the monocular zone had the same texture as the rest of the stereogram. However, very little depth was seen in the stereogram when the texture density of the monocular zone was double that of the rest of the stereogram. A change in texture density just at a monocular zone boundary is a rare (accidental) event. This is presumably why stereo
latency is longer and perceived depth is reduced in this situation. In the above experiments the monocular occlusion zone was part of a far surface. But monocular occlusion also occurs on the side of a solid object, when a given side is visible to one eye but not to the other. Wilcox and Lakra (2007) found that subjects detected the depth in a stereoscopic image of a textured box more rapidly when the monocular zone created by the side of the box was present than when the zone was omitted.
17.2.3 OCCLUSION, CAMOUFLAGE, AND RIVALRY
According to the rule mentioned above, a left monocular zone occurs on a left-facing occluding edge, and a right zone occurs on a right-facing occluding edge. In Figure 17.23 one eye’s image has a monocular crescent on both sides of the black disk, but only one of them would be created when the two images are fused to form a black disk standing in front of a textured background (Shimojo and Nakayama 1990a). This crescent survives rivalry with the dots of the foreground and, as proposed by Julesz (1964), appears to be part of the far textured surface. The other monocular crescent is on the wrong side to be interpreted as a monocular occlusion zone. It therefore engages in binocular rivalry with the dot pattern in the other eye. Both monocular crescents tend to be suppressed by the dot pattern when the stereogram is fused so that the black disk is seen beyond the random-dot surface. Thus, rivalry is coupled to the interpretation of monocular occlusion. Precedence is given to ecologically valid images. The contrast threshold for a small dot presented briefly in the half occluded region of a stereoscopic image is higher than that for a dot presented in a nonoccluded region or monocularly (Emoto and Mitsuhashi 1998).
Figure 17.22. Appropriate and inappropriate monocular zones. (A) Depth is easier to see with convergent than with divergent fusion. Only convergent fusion places the light-density monocular zone with its matching far surface. (B) Depth is easier to see with divergent than with convergent fusion. Only divergent fusion places the light-density monocular zone with its matching far surface. (Adapted from Grove and Ono 1999)
Figure 17.23. Normal and anomalous occlusion zones. In one fused image, the black disk appears in front of the background. The striped monocular zone on the appropriate (temporal) side is seen clearly in the plane of the background. The striped crescent on the inappropriate (nasal) side of the disk engages in binocular rivalry with the textured background and is periodically suppressed. In the other fused image, the disk is seen beyond the background. Both occlusion zones are now inappropriately placed and both engage in rivalry. (Adapted from Shimojo and Nakayama 1990a)
17.2.4 MONOCULAR ZONES AND SURFACE OPACITY
Fusion of the two images in Figure 17.24A creates an opaque white square in front of a background of dots (Lawson and Mount 1967; Gulick and Lawson 1976). A monocular column of dots occurs along each vertical edge of the fused image. The two vertical edges are linked by top and bottom rows of dots seen by both eyes. The foreground square appears opaque, which is what it would have to be to create the occlusion zones. The occluded far surface appears to continue behind the white square, which is the most parsimonious conclusion. Figure 17.25 shows similar displays using rows of letters instead of dots (Kaufman 1965). The stereogram in Figure 17.24B creates a frame of dots in front of another frame of dots. The impression of an opaque square and a continuous surface is not evoked, because each column of dots has a matching set in the other eye, so there are no monocular zones. Thus, monocular zones provide effective information about the opacity of foreground objects and about texture continuity in background surfaces. One cannot conclude that the impression of depth in Figure 17.24A is due entirely to the monocular zones. As Lawson and Gulick pointed out, a disparity exists between the two inner white squares relative to the whole image even though there are no disparities between the dots that define the squares. The next section considers the role of monocular zones as a depth cue in more detail.
Figure 17.24. Monocular occlusion and surface continuity. (A) In one fused image the blank area appears as an opaque surface occluding a dotted background. This is due to the zones in columns X and Y that are seen only by one eye. (B) The fused images appear as outline squares in two depth planes. The monocular zones are blank. (Adapted from Lawson and Mount 1967, and Gulick and Lawson 1976)
Figure 17.25. Stereopsis from nonmatching letters. The central region of letters appears to be in front of or behind the unpaired upper and lower rows. The disparity is defined with respect to unpaired columns of letters on the left- and right-hand edges of the images. (Adapted from Kaufman 1965)
17.3 DA VINCI STEREOPSIS
Euclid noticed, in 300 BC, that one eye sees part of a sphere not seen by the other eye. Galen, in the second century AD, described how part of a more distant surface is seen by only one eye. Leonardo da Vinci noticed the same thing in the 15th century and commented on the role of monocular zones in creating depth (Section 2.10.3a). Thus the idea that monocular zones play a role in depth perception was mentioned before anyone suggested that disparity between corresponding images had anything to do with depth perception. After Wheatstone demonstrated the role of disparity, people forgot about monocular zones. Interest in this factor revived only recently. It has now been demonstrated that monocular zones can indeed create impressions of depth in the absence of conventional binocular disparity. Nakayama and Shimojo (1990) coined the term da Vinci stereopsis to denote stereopsis based on monocular zones. Liu et al. (1994b) claimed to have obtained stereopsis from monocular occlusion, using a stereogram like that
shown in Figure 17.26A. In one fused image, a white rectangle stands out from a black rectangle. In the other fused image, a white rectangle appears through a hole in the black rectangle. There are no corresponding lateral edges in the white rectangle by which horizontal disparity can be generated. Liu et al. argued that the effects arise from the black monocular zones at each end of the white rectangle in the fused image. The proximal stimulus is equivalent to that produced by a white rectangle partially occluding or seen beyond a larger black rectangle. There are two possible artifacts in Liu et al.’s display. First, disparities exist between the inner and outer horizontal boundaries. Gillam (1995) found that a simple line stereogram consisting of horizontal lines with these same disparities creates the same relative depth as that created by the Liu et al. figure (see Figure 17.26B). Liu et al. (1997) argued that, whereas the terminators of the lines in Gillam’s display have the same luminance polarity, those in their display have opposite polarity. They then argued, with the aid of a computer model, that corners with opposite luminance polarity stimulate conventional disparity detectors in a manner similar to corners with the same polarity. They also argued that this low-level system is supplemented by a high-level system that responds to the monocular zones and creates a continuous surface in depth. The second possible artifact in Liu et al.’s display is misconvergence on the black shapes due to the asymmetry of the images. It is as if edges with opposite luminance polarity repel each other. With attention focused on the illusory depth of the white rectangle, nonius lines placed on the images separate horizontally. This shows that a disparity is induced into the images of the partially occluded black rectangle: a crossed disparity when the black shape appears
Figure 17.26. Stereopsis without corresponding vertical edges. (A) One fused image appears as a white rectangle in front of a black rectangle. The other image appears as a white rectangle beyond the black rectangle. (Adapted from Liu et al. 1994b) (B) The horizontal boundaries extracted from (A) contain disparities that create the same depth impressions as those in (A). (Redrawn from Gillam 1995) (C) In one column of fused images the white rectangles come forward. In the other column the rectangles become more distant. Note that the nonius lines become more separated with increasing depth down the columns.
nearer than the white rectangle and an uncrossed disparity when it appears beyond the rectangle. The images of the white rectangle do not acquire a disparity since they have no overlapping vertical edges. The illusory depth in this figure could therefore arise from vergence-induced disparity in the black shape relative to the white rectangle. The white rectangle defaults to the plane of convergence (zero disparity). When the width of the white element in each image is decreased, as in Figure 17.26C, the white rectangle becomes more displaced in depth with respect to the black rectangle. In one column of fused images the white rectangles come further forward and in the other column they go further backward. Note that the white elements are on the inside of the black shape in the first set and on the outside in the other set. If the two sets of stereograms are viewed with
STEREOSCOPIC VISION
convergent fusion, the perceived depth relations are the same in the two sets. In both cases, the offset of the nonius lines above each stereogram increases with the increasing apparent depth of the white rectangles, as one would expect if the effects were due to vergence-induced disparity in the black shapes. To prove that the depth effect is due to monocular occlusion, as suggested by Liu et al., one would have to show that it survives when the eyes are correctly converged on the black shapes. The nonius offset could be due to conventional disparity signals or it could be due to proximal vergence induced by the apparent depth of the white rectangle. Liu et al. (1998) found that 0.5-Hz alternation of the white-rectangle-in-front view with the white-rectangle-behind view in the stereograms of Figure 17.26A induced horizontal vergence. However, when the stereograms were rotated 90˚ so that the white rectangles were offset vertically, vertical vergence was not induced. In this case, the vertical disparity was the same magnitude as the horizontal disparity in the previous stimulus. They concluded that the depth in their original display was due to occlusion cues and not spurious disparities. Gillam and Nakayama (1999) sidestepped these complexities by designing the stereogram shown in Figure 17.27. A white rectangle appears to stand out in one of the fused pairs of images, depending on the direction of vergence. The top and bottom sides of the rectangle are formed by subjective contours. This stereogram seems to be free of disparity artifacts, and the depth must therefore arise from monocular zones only. Mitsudo et al. (2005) found that the magnitude of depth produced by the stereogram in Figure 17.27 was greater than that predicted by the width of the lines. That is,
Figure 17.27. Phantom square from monocular occlusion. Depending on the direction of vergence, in one of the fused pairs of images a phantom white rectangle appears in front of the black lines. (Adapted from Gillam and Nakayama 1999)
it was greater than the width of the monocular occlusion zones. A phantom rectangle was detected more easily when set among distracters than was a rectangle created by ordinary disparity. Mitsudo et al. concluded that the greater depth from the phantom rectangle is processed at the preattentive stage of visual processing. However, it is not clear what actually causes the phantom rectangle to have more depth than that predicted by the width of the monocular zones. Crossed fusion of the upper stereogram in Figure 17.27 creates an impression of a single occluding surface with subjective contours along its upper and lower edges. This is consistent with the display that produces these images. In the lower stereogram of Figure 17.27, the order of the images is reversed. The display that produces these images consists of two separate occluding surfaces. That is the impression created, but, in this case, the impression of depth is less stable. Thus, although both types of surface are ecologically valid, we are particularly sensitive to coherent occluding surfaces that connect two occlusion zones. Detection of such a coherent surface is less perturbed by external noise than is detection of separated occluding surfaces (Mitsudo et al. 2006). Presumably the strong subjective contours in a coherent surface enhance the visibility of the surface. Brooks and Gillam (2006a) produced a dynamic version of this effect, which they called sequential monocular decamouflage. A white vertical line moved horizontally over a black surface. For one eye, the center of the line disappeared and then reappeared as if moving beyond an occluding black rectangle. The line in the other eye did the same thing with a delay, as depicted in Figure 17.28. The stimulus sequence is the same as would be produced by a white line passing some distance beyond a camouflaged black rectangle. 
The sequential disappearance of the line created a pseudodisparity along the vertical edges of the camouflaged black rectangle. The pseudodisparity increased as the interocular delay in the disappearance of the lines increased. A depth probe revealed that the perceived depth between line and rectangle increased accordingly. Anderson (1994) provided an example of monocular occlusion as a cue to depth in the absence of horizontal disparity. In Figure 17.29A the black dots are identical in the two eyes, but the lines differ in length in the two eyes. This difference in length could induce apparent slant, as discussed in Section 20.2. But, with appropriate fusion of the images, instead of slant one sees the lines and dots in a frontal plane through an aperture. The difference in length of the lines is what would be created if the lines and dots were seen beyond a diamond shaped aperture, as shown in Figure 17.29B. The visual system therefore creates a diamond shaped aperture to account for the difference in line length. This depth effect seems to be free of vergence artifacts.
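The dependence of pseudodisparity on interocular delay in sequential monocular decamouflage can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the text, that the line moves at a constant angular speed, so that a delay between the eyes shifts the point of disappearance by speed times delay. The function name and the speed and delay values are hypothetical and chosen only for illustration.

```python
def pseudodisparity_arcmin(speed_deg_per_s: float, delay_s: float) -> float:
    """Horizontal offset (arcmin) between the two eyes' disappearance points.

    Assumes the line moves at constant angular speed, so an interocular
    delay of `delay_s` seconds shifts the disappearance point in the
    delayed eye by speed * delay (an illustrative simplification).
    """
    return speed_deg_per_s * delay_s * 60.0  # degrees -> arcmin


if __name__ == "__main__":
    speed = 5.0  # deg/s, hypothetical value
    for delay_ms in (0, 10, 20, 40):
        offset = pseudodisparity_arcmin(speed, delay_ms / 1000.0)
        print(f"delay {delay_ms:3d} ms -> pseudodisparity {offset:.1f} arcmin")
```

Under this assumption the pseudodisparity grows linearly with the delay, which is consistent with the reported increase in perceived depth as the delay increased.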
S T I MU LU S TO K E N S F O R S T E R E O P S I S
Figure 17.28. Sequential monocular decamouflage. In the motion sequence T1 to T5 the white line moves with an interocular delay. In each eye it disappears and then reappears as it passes a camouflaged black rectangle (dashed lines). Plan views of the equivalent 3-D stimuli are shown on the right. (Adapted from Brooks and Gillam 2006a) [Panel labels: right eye; left eye; plan view; T1 to T5.]
Figure 17.29. Depth created by vertical disparity. (A) The horizontal disparity of the contents of the diamonds relative to the diamonds takes the contents beyond the diamonds in one fused image and brings them forward in the other image. (B) The same effect occurs when the frames round the images have been removed. (Adapted from Anderson 1994)
Gillam et al. (1999) constructed displays like those shown in Figure 17.30A. One eye sees two black rectangles with a white gap between them, while the other eye sees one black rectangle equal in width to the combined width of the two rectangles seen in the other eye. The overall width disparity between the two images should produce a surface slanted about a vertical axis. Instead, one sees two frontal rectangles separated by a step in depth. These are the actual surfaces that would produce these images, as illustrated in the diagram on the left. This type of da Vinci stereopsis is referred to as monocular-gap stereopsis. In Figure 17.30B the images in the two eyes are equal in overall width. There are therefore no real disparities. The real surfaces that would create these images are slanted in depth with a depth step between them, as illustrated on the right. It is as if the visual system partitions the rectangle seen in one eye into sections and matches each section with one of the squares in the other eye. The width of the monocular gap may be called an occlusion disparity. The effect has a superficial resemblance to Panum’s limiting case, in that one eye sees a bar not seen by the other eye. However, in Panum’s limiting case, the monocular
element is perceived as an object in depth, while the monocular white bar in Figure 17.30A is perceived as a gap between objects separated in depth. Pianta and Gillam (2003b) used stereograms like those shown in Figure 17.30A to measure the minimum width of gap required for reliable discrimination of the sign of depth. The threshold occlusion disparity was similar to the minimum difference in gap width (real disparity) in a stereogram with gaps in both images. The dependence of depth detection on the duration of the stimulus was similar for the two types of stimuli (Sachtler and Gillam 2007). Also, inspection of a stereogram with 4 arcmin of occlusion disparity for 15 s produced the same depth aftereffect as that produced by inspection of an equivalent real disparity. These results indicate that the visual system treats occlusion disparities and real disparities in the same way. As the width of the gap is increased, the magnitude of perceived depth between the rectangles increases.
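Because an occlusion disparity is simply the angular width of the monocular gap, it can be related to the physical width of the gap on a display. The following sketch shows the conversion in both directions. The 4 arcmin value is the one quoted above; the viewing distance and the function names are hypothetical, introduced only for illustration.

```python
import math


def visual_angle_arcmin(width_m: float, viewing_distance_m: float) -> float:
    """Angle (arcmin) subtended by a frontal extent centred on the line of sight."""
    angle_rad = 2.0 * math.atan(width_m / (2.0 * viewing_distance_m))
    return math.degrees(angle_rad) * 60.0


def width_for_angle_m(angle_arcmin: float, viewing_distance_m: float) -> float:
    """Physical width (m) that subtends `angle_arcmin` at the given distance."""
    half_angle_rad = math.radians(angle_arcmin / 60.0) / 2.0
    return 2.0 * viewing_distance_m * math.tan(half_angle_rad)


if __name__ == "__main__":
    distance = 1.0  # m, hypothetical viewing distance
    gap = width_for_angle_m(4.0, distance)
    print(f"gap width for 4 arcmin at {distance} m: {gap * 1000:.2f} mm")
    print(f"round trip: {visual_angle_arcmin(gap, distance):.2f} arcmin")
```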
Figure 17.30. Depth created by a gap in one image. (A) The stereogram produces frontal surfaces with a depth step, as shown in the diagram on the left. (B) The stereogram produces slanted surfaces with a depth step, as shown on the right. (C) A real disparity creates two surfaces with a depth step. (D) As the width of the gap increases, depth magnitude increases to produce inclination in depth. (Adapted from Pianta and Gillam 2003a)
The series of gaps of increasing width in Figure 17.31 produce multiple depth planes, as illustrated in the lower figure (Pianta and Gillam 2003a). A gap that increases in width along its length creates a rectangle inclined in depth, as shown in Figure 17.30D. See Tsirlin et al. (2010a) for evidence on the effects of the width of an occlusion zone on the perceived shape and depth of an occluding surface. A continuous change in the width of the gap should produce an impression of motion-in-depth. Brooks and Gillam (2006b) devised dynamic versions of the displays shown in Figures 17.30A and 17.30B. The gap in the image in one eye was reduced in width until it closed. The gap in the image in the other eye then increased from zero. Repetition of this reciprocal change in gap width produced an impression of motion in depth of one rectangle relative to the other. The magnitude of the motion in depth, as assessed by a depth probe, equaled that produced by a real change in disparity in a display in which the gap was shown to both eyes. In Figure 17.30 the white gap in one eye’s image is perceived as part of the white surrounding area. Grove et al. (2002) predicted that an impression of depth would not occur when the gap differs from the background. This situation could arise only where the part of the surface seen through the gap accidentally differs from the rest of the surface. They confirmed this prediction by using the stereograms shown in Figure 17.32. It can be seen that a good impression of depth occurs only when the gap matches the background. Cook and Gillam (2004) devised a stereogram like that in Figure 17.33. This creates an impression of a white rectangle either in front of or beyond a black shape. The magnitude of perceived depth increases as the white
Figure 17.31. Multiple depths created by gaps in one image. With divergent fusion the upper stereogram should appear stepped down to the left and the lower stereogram stepped down to the right. Convergent fusion should create the opposite result. Step depth should vary with gap width. The lower figure shows real objects that would create the stereograms. (Adapted from Gillam et al. 1999) [Panel labels: real 3-D objects; left eye’s image; right eye’s image.]
element protrudes further into the black shape. This is evident when the two stereograms in Figure 17.33 are compared. To-and-fro lateral movement of the white element produced an impression of motion-in-depth (Brooks and Gillam 2007). Cook and Gillam argued that the effect
Figure 17.32. Da Vinci stereopsis and the background. Fusion of the images produces a stronger impression of depth when the monocular gap matches the surround than when it differs from the surround. (Adapted from Grove et al. 2002)
Kuroki and Nakamizo (2006) asked whether depth created by da Vinci stereograms scales with distance in the same way. They used a phantom-surface stereogram, a monocular-gap stereogram, and a regular random-dot stereogram. Apparent distance was manipulated by changing the convergence required to fuse the images. The vergence angle varied between 3.56˚ and 1.25˚, corresponding to viewing distances of between about 1 and 3 m. The distal and proximal sizes of the images remained constant. Subjects estimated the depth in each stereogram at each distance by adjusting a tape measure. For each type of stereogram, perceived depth decreased as vergence angle increased, as shown in Figure 17.35. Depth was overestimated with large vergence angles (near) and underestimated with small angles (far). Other investigators have found the same contraction of perceived distance to a central value (see Section 29.2.2b). Computational models of da Vinci stereopsis have been developed by Hayashi et al. (2004), Malik et al. (1999), and Watanabe and Fukushima (1999). Assee and Qian (2007) developed a physiologically plausible model of the detection of the location and sign of monocular zones bordering vertical discontinuities in disparity. Interactions between disparity and monocular occlusion are discussed in Section 30.4.
17.4 DEPTH FROM CYCLOPEAN TRANSPARENCY
Figure 17.33. Da Vinci stereopsis. With crossed fusion, the fused images on the left show a white object in front of the black shapes. The fused images on the right show a white object beyond the black shape. Some people find this difficult to see. In both cases, the smaller intrusion of the white region produces less depth. There are no conventional disparities in these fused images. (Adapted from Cook and Gillam 2004)
cannot be due to fusion of the vertical edge of the white element seen by one eye with the curved edge of the black shape seen by the other eye because this would produce a 3-D shape with a curved surface. In one of the fused images in Figure 17.34A an inner square region of dots appears in front of the surrounding region. In the other fused image, a far dotted region is seen in the four portholes created by the incomplete black disks. Since there are no binocular disparities, the depth effects must be created by differential monocular occlusions of the black disks (Häkkinen and Nyman 2001). The depth effects are not evident in Figure 17.34B, which suggests that monocular occlusion acts as a depth token only when occluded regions and neighboring nonoccluded regions form a visibly coherent figure. The relative depth signaled by a given binocular disparity increases approximately with the square of viewing distance (equivalently, the disparity produced by a given depth interval varies inversely with the square of viewing distance).
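The geometry behind the distance scaling of disparity can be sketched numerically. For a fixed depth interval, disparity falls roughly with the square of viewing distance; equivalently, a fixed disparity corresponds to a depth interval that grows roughly with the square of distance. The sketch below assumes symmetric convergence on a point in the median plane and an interocular distance of 6.5 cm; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

IOD_M = 0.065  # assumed interocular distance in metres (not from the text)


def vergence_angle_rad(distance_m: float, iod_m: float = IOD_M) -> float:
    """Vergence angle for symmetric fixation on a midline point."""
    return 2.0 * math.atan(iod_m / (2.0 * distance_m))


def distance_from_vergence_m(angle_rad: float, iod_m: float = IOD_M) -> float:
    """Inverse of vergence_angle_rad."""
    return iod_m / (2.0 * math.tan(angle_rad / 2.0))


def depth_from_disparity_m(disparity_rad: float, distance_m: float,
                           iod_m: float = IOD_M) -> float:
    """Depth beyond fixation specified by an uncrossed disparity."""
    far_angle = vergence_angle_rad(distance_m, iod_m) - disparity_rad
    if far_angle <= 0.0:
        raise ValueError("disparity too large for this fixation distance")
    return distance_from_vergence_m(far_angle, iod_m) - distance_m


if __name__ == "__main__":
    one_arcmin = math.radians(1.0 / 60.0)
    for angle_deg in (3.56, 1.25):  # vergence angles quoted for Kuroki and Nakamizo (2006)
        d = distance_from_vergence_m(math.radians(angle_deg))
        depth_cm = depth_from_disparity_m(one_arcmin, d) * 100.0
        print(f"vergence {angle_deg:.2f} deg -> distance {d:.2f} m, "
              f"depth for 1 arcmin = {depth_cm:.2f} cm")
```

With the assumed interocular distance, the two vergence angles map onto distances of roughly 1 and 3 m, and the depth specified by a fixed disparity is about nine times larger at three times the distance.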
The impression of depth created by fusion of the images in Figure 17.36A is due only to monocular zones. There are no binocular disparities of the conventional type. However, although the sign of depth is unambiguous, the magnitude of depth is unspecified, as shown in Figure 17.36C and D. Although the images in the eyes are the same, the two displays are at different relative depths. The basic problem is that the stimulus in Figure 17.36A contains no information about the horizontal extent of the occluded object, although the viewer may make a default assumption about how much of the far object is occluded. Howard and Duke (2003) removed this ambiguity by constructing the stereogram in Figure 17.37. Crossed or uncrossed fusion creates two pairs of fused images. In the upper pair, one gray square appears beyond a transparent gray surface, while the other square appears transparent and nearer than the surrounding surface. In the image in one eye, the square just fills the white gap so that the vertical edges of the square are not visible. In the image in the other eye, the square is laterally displaced relative to the gap. There are no conventional disparities in these fused images. In the lower pair of fused images, a black square is visible in both eyes, so that there is a conventional disparity in the fused image. Note that the apparent depth created by the upper squares that lack disparity is the same
Figure 17.34. Stereo capture arising from monocular occlusion. (A) In one fused image, an inner square comes forward. In the other image, corners of a far square appear in four portholes. (B) Depth is not evident in either fused image. (Adapted from Häkkinen and Nyman 2001)
Figure 17.35. Perceived depth as a function of vergence. Perceived depth produced by two types of Da Vinci stereograms and a random-dot stereogram as a function of the angle of convergence required to fuse the images. Mean of 8 subjects. Error bars are standard errors of the mean. (Adapted from Kuroki and Nakamizo 2006) [Plot: judged depth (cm) against convergence angle (deg) for the phantom-surface, monocular-gap, and random-dot stereograms, with the theoretical depth curve; n = 8.]
as that created by the lower squares that contain conventional disparity. When the images in Figure 17.38 are fused, the squares on one side appear beyond the surface and those on the other side appear in front of the surface. The greater the horizontal offset of the square relative to the vertical gap in one eye, the greater the perceived depth. These effects do not arise from monocular occlusion because nothing is occluded. They arise because the fused image contains information that either the square or the
surface is transparent. Note that the strong impression of transparency arises after the images are fused. The effect is still evident when the monocular image does not contain information about transparency (Grove et al. 2006). One eye sees the whole square so that it receives information about the size of the square. The other eye does not see the vertical edges of the square because it just fills the gap. Both the sign and magnitude of depth are specified if the viewer uses the available information. Figure 17.37 shows a plan view of the physical arrangement that would create the depth impressions seen in the stereograms. The horizontal offset of the square relative to the gap creates a “pseudodisparity” of the square with respect to the “hidden” square in the other eye that just fills the gap. It is not an actual disparity, because only one eye sees the vertical edges of the square. The position of the edges of the square in the other eye must be inferred. Howard and Duke used a depth probe to measure the magnitude of perceived depth in stimuli like those in Figure 17.37. The perceived depth produced by transparency and that produced by conventional disparity were almost the same for disparities up to 4˚, the highest value tested. Both depth magnitudes were close to theoretical values.
17.5 DEPTH FROM BINOCULAR RIVALRY
This section is concerned with whether depth can be created by differences in luminance polarity in the absence of binocular disparity.
Figure 17.37. Stereopsis without disparity or occlusion. The upper squares in the stereogram create a square beyond a transparent surface in one fused image and a transparent square in front of a surface in the other fused image. There are no conventional disparities in the images. The depth is the same as that created by conventional disparity in the lower squares. The lower figure shows the arrangement that would create these depth effects. (Redrawn from Howard and Duke 2003) [Panel labels: square beyond a transparent surface; nearer transparent square.]
Figure 17.36. The ambiguity of depth magnitude. (A) One fused image produces the depth shown in (B). However, the magnitude of depth is unspecified. Arrangements (C) and (D) produce the same images, but the nearer object is closer to the eyes in (C) than it is in (D).
Depth is created with a stereogram consisting of a set of black circular rims filled with white in one eye and the same set of black rims filled with black (equivalent to black disks) in the other eye, as in Figure 17.39 (Howard 1995). The black rims form a set of “holes” within which are seen rivalrous black and white regions. For most people the fused image creates a black-and-white dotted surface seen through black-rimmed holes in a nearer surface. There are no disparities between any of the edges in the display. Contrast rivalry serves as a disparity cue in its own right. Howard dubbed this the sieve effect. There are three depth effects to which the sieve effect could be related.
Figure 17.38. Graded transparency depth. Fusion creates squares beyond a transparent surface and transparent squares in front of a surface. Depth increases with the extent of displacement of the square relative to the vertical white bar in the image of one eye. There are no conventional disparities because the left image has no matching vertical edges in the right image.
Figure 17.39. The sieve effect. For most people, fusion of each of these displays creates an impression of a surface with holes, with a black and white surface seen through the holes. (Redrawn from Howard 1995)
Figure 17.40. Rivaldepth. Stereograms similar to those used by O’Shea and Blake (1987). The dots in the inner square of (A) are uncorrelated. Those in the inner square of (B) have reversed luminance. Each stereogram creates a sensation of depth but with indeterminate sign.
1. Depth can be produced by disparity between thin lines with opposite luminance polarity (Section 15.3.7). But this works only when the opposite-polarity stereograms have a conventional horizontal disparity. Since none of the stereograms of Figure 17.39 has a disparity, depth in the sieve effect is created by rivalry alone.
2. A random-dot stereogram with a central square of binocularly uncorrelated dots set in a surround of correlated dots, as in Figure 17.40, produces fluctuating depth, even though there is no disparity between the uncorrelated regions (Julesz 1960; Frisby and Mayhew 1978b). O’Shea and Blake (1987) dubbed this rivaldepth. In the top stereogram, the central region contains uncorrelated dots. In the bottom stereogram, the central dots have opposite luminance polarity. They found that the direction of depth is a function of the direction of fixation disparity. This suggests that subjects were responding to the disparity between the surrounding regions induced by fixation disparity, relative to the indeterminate disparity in the inner square. Misconvergence would not induce a detectable disparity into the uncorrelated region since this region has no defined disparity to begin with. This effect cannot account for the perceived depth in the sieve effect because the sieve effect has a definite and consistent depth, even without misconvergence.
Figure 17.41. The display used by Panum (1858). With crossed fusion of the outer circles the inner circle appears to slant about a vertical axis, right side forward. Uncrossed fusion should create the opposite effect. There is no binocular rivalry in this display.
Figure 17.42. A sieve creates rivalry without disparity. A black and white surface seen through holes in a near surface creates a pattern of binocular rivalry without disparity. (Redrawn from Howard 1995) [Panel labels: black and white surface; plate with holes.]
binocular luster. It occurs because luminance rivalry in large areas does not show exclusive dominance but rather mosaic dominance, which creates luster (Section 12.3.1). Figure 17.43b shows that the sieve effect is replaced by indeterminate depth when there are no binocular rims (the gray background keeps the white disks visible). This is probably because there is no longer an impression of a rimmed porthole through which the rivalrous surface is seen. When the thickness of the black rims is increased, the sieve effect gives way to a very different impression. This is seen most clearly in Figure 17.43d, in which the white disks are reduced to small spots. After a period of viewing, the white spots appear to float in depth, sometimes in front of the background and sometimes beyond it. The white spots remain in view and show dominance rivalry, rather than exclusive alternating rivalry or luster. The spots remain visible because there are no nearby contours in the other eye to compete with them. The lack of a good fusion lock for the white dots produces instability of vergence. This causes each dot to sometimes come closer to that edge of the black disk with which it has an uncrossed disparity and at other times to come closer to the edge with which it has a crossed disparity. These changes in disparities could account for the fluctuations in depth of the dots relative to the rims. The depth effect with small dots is not due to rivalry but to either Panum’s limiting case or disparities induced by vergence. When the contrast between the rivalrous disks is decreased, as in Figure 17.44A, the sieve effect is reduced or absent. This could be because, for small rivalrous regions of low contrast, exclusive rivalry is replaced by luminance mixture (Liu et al. 1992a). Matsumiya et al. (2007) provided further evidence that the sieve effect occurs under conditions of exclusive
Figure 17.43. Factors affecting the sieve effect. (a) Large discs produce luster not depth. (b) The effect is not clear without the black rims. (c) The effect is clear with black rims. (d) Small white discs float at an indeterminate depth.
alternating rivalry. The depth effect was maximal when the sizes and contrasts of the stimuli induced the highest rate of exclusive rivalry. Tsai and Victor (2000) asked subjects to set the disparity of a patch in a random-dot display so that its perceived depth matched that produced by the rivalrous elements in the sieve effect. Although very variable, the settings indicated that the rivalrous elements in the sieve effect were perceived as more distant than the plane containing the black rims.
Summary
The sieve effect occurs when disks of opposite contrast are superimposed. The disks must be within the range of sizes and contrasts for exclusive rivalry. With large disks, the
Figure 17.44. Sieve effect with reduced contrast. The sieve effect is less impressive when contrast between the rivalrous disks is reduced in (A) compared with (B).
Figure 17.45. Panum’s limiting case. When fused by divergence, the top stereogram is in the occlusion configuration and the bottom stereogram is in the camouflage configuration. The thin vertical line appears beyond the wide bar in the occlusion configuration and nearer than the black bar in the camouflage configuration. The white nonius lines are aligned when the eyes are accurately converged on the black bar. Note that there is a tendency for the nonius lines to drift apart in opposite directions in the two stereograms.
sieve effect gives way to binocular luster. If the monocular disks are small relative to the binocular disks, there is permanent dominance of the monocular disks, and the sieve effect is replaced by a variable-depth effect that probably depends on vergence instability.
17.6 PANUM’S LIMITING CASE
In Panum’s limiting case, one eye views a single vertical bar and the other eye views the same vertical bar flanked on the temporal side by a vertical line, as in the upper stereogram of Figure 17.45. This is the occlusion configuration. When the two images of the bar are fused by divergence, the monocular line appears beyond the bar, in accordance with the rules mentioned in Section 17.2.1. The white lines in the bars are nonius lines, which appear aligned when the eyes are properly converged on the bar. In the lower stereogram the monocular line is on the nasal side of the binocular bar when the images are fused by divergence. This is the camouflage configuration. For many people this causes the line to appear in front of the bar, also in accordance with the rule mentioned in Section 17.2.1. Figure 17.46 is a random-dot stereogram version of Panum’s limiting case (Allik 1992). For each dot in one eye there is a matching dot in the other eye plus a neighboring unmatched dot. When fused, the stereogram creates two planes of dots, one containing the fused dots and the other containing the monocular dots. Panum’s limiting case has been regarded as a puzzle because there are no obvious disparities in the display: the binocularly fused bar has zero disparity, and the monocular line has no corresponding image in the other eye.
Figure 17.46. Random-dot stereogram of Panum’s limiting case. For each dot in one eye there is a matching dot in the other eye plus a neighboring unmatched dot. The single and double dots are distributed between the two eyes so that the number of dots is the same in the two eyes. When fused, the stereogram creates two depth planes. (From Allik 1992, Perception, Pion)
The following four theories have been proposed to account for the impression of depth in Panum’s limiting case. Each theory can be extended to account for depth in the camouflage configuration.
17.6.1 MONOCULAR FIGURAL REPULSION
In the figural aftereffect, two neighboring parallel lines seen by the same eye appear displaced from each other. If this effect operates in Panum’s limiting case, the monocular line would repulse the image of the bar in the same eye and thus create a disparity between the two images of the bar. Westheimer (1986a) found that the figural repulsion effect is too small to account for Panum’s limiting case.
17.6.2 VERGENCE-INDUCED DISPARITY
Perceived depth in Panum’s limiting case could be due to disparity induced into the images of the binocular bar by vergence eye movements (Lau 1925; Kaufman 1976). The two images (bar and line) in one eye have a low spatial-frequency centroid determined by the mean of their luminance distributions. The disparity between this centroid and the image of the bar in the other eye evokes divergence, which injects a crossed disparity into the images of the bar. The disparity between the images of the bar causes it to appear nearer than the monocular line. When the monocular line is on the nasal side of the bar, the resulting image asymmetry induces convergence, which injects an uncrossed disparity into the images of the bar. This causes the bar to appear beyond the monocular line. In both cases the unpaired image of the line conveys no disparity information to the stereoscopic system and is therefore treated like a binocular image with zero disparity and, like a zero-disparity image, appears in the horopter (the plane of convergence). In this account it is assumed that the low-frequency disparity induces vergence but does not play a direct role in perceived depth, but the theory would not be substantially altered if this assumption were relaxed.

The rule that unpaired monocular images are defaulted to the horopter was proposed by Aguilonius in 1613 and has wide application. It may be called the horopter default rule for unpaired images. This rule does not apply if the monocular images are seen as belonging to a textured surface. In that case, monocular images in the occlusion configuration are seen as part of a surface behind an occluding object, and those in the camouflage configuration are seen as part of a transparent textured surface lying in front of an object. In both cases the images are defaulted to the depth of those surfaces, whether or not they lie in the horopter. This can be called the similar-surface default rule for unpaired images.
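The centroid step in this account can be made concrete with a small sketch. It treats each eye's image as a set of (position, weight) pairs, which is a simplification of a luminance distribution assumed here purely for illustration. The weights, the separation, and the function names are hypothetical and are not taken from the studies cited. The result is the offset between the low spatial-frequency centroid in the eye that sees two images and the bar in the other eye, which on the vergence account is the signal that evokes the vergence change.

```python
def luminance_centroid(features):
    """Luminance-weighted mean position (deg) of (position_deg, weight) pairs."""
    total_weight = sum(weight for _, weight in features)
    return sum(pos * weight for pos, weight in features) / total_weight


def centroid_offset(bar_weight, line_weight, separation_deg):
    """Offset (deg) between the centroids of the two eyes' images.

    Positions are measured from the bar. A positive separation places the
    monocular line on one side of the bar, a negative one on the other side.
    """
    # Eye that sees only the bar.
    one_image_eye = luminance_centroid([(0.0, bar_weight)])
    # Eye that sees the bar plus the monocular line.
    two_image_eye = luminance_centroid(
        [(0.0, bar_weight), (separation_deg, line_weight)]
    )
    return two_image_eye - one_image_eye


# Hypothetical values: a bar three times as "heavy" as the line, 0.5 deg apart.
offset_one_side = centroid_offset(3.0, 1.0, 0.5)
offset_other_side = centroid_offset(3.0, 1.0, -0.5)
print(offset_one_side, offset_other_side)
```

Moving the line to the opposite side of the bar reverses the sign of the offset, which corresponds to the change from divergence to convergence described above.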
The vergence theory of Panum’s limiting case does not violate the unique-linkage rule for images of similar spatial scale and avoids the problem of having to explain how disparities far beyond the fusion limit can code depth. Howard and Ohmi (1992) demonstrated that misconvergence contributes to depth in Panum’s limiting case. It was first established that a monocular image induces fixation disparity in a neighboring binocular line. The subject fused the dichoptic images of a black vertical bar. A white nonius line was superimposed on each image of the bar, as shown in Figure 17.45. A black vertical line was added about 0.5° to the right of the bar. This line was visible either only to the right eye (temporal line) or only to the left eye (nasal line). All subjects reported that the line on the temporal side induced a displacement of the nonius lines in the direction of a crossed disparity. Not all subjects saw depth, but those who did saw the binocular bar nearer than the monocular line. This is monocular occlusion, or the
classic Panum effect. The line on the nasal side induced an uncrossed disparity, which showed as an uncrossed displacement of the two nonius lines. The binocular bar now appeared to be beyond the monocular line. This is monocular camouflage. Thus, the expected fixation disparities predicted by the vergence account certainly occur and in directions appropriate to the resulting depth impressions. However, this in itself does not disprove the other two theories. Both the configuration theory and Hering’s theory, which is reviewed in the next section, state that the direction of perceived depth in Panum’s limiting case depends on whether the monocular line is on the temporal or the nasal side of the binocular bar. The vergence theory states that the direction of perceived depth depends on the sign of the disparity induced into the binocular bar. Thus, according to this theory, reversing the sign of fixation disparity will reverse the sign of perceived depth, whether the monocular line is on the temporal or the nasal side of the binocular bar. To test these conflicting predictions, Howard and Ohmi used the stereograms shown in Figure 17.47. A vertical line was presented on each side of the bar in one eye. The line on the temporal side was in the occlusion configuration and that on the nasal side was in the camouflage configuration. A white fixation dot was presented dichoptically on the bar with various crossed or uncrossed disparities so that the dot appeared to be in various positions in depth. The offset of two nonius lines was tied to the disparity in the fixation dot so that when the subject converged on the dot the nonius lines appeared to be aligned. In this way vergence was controlled. For each setting of the dot, the subject fixated it and reported the relative positions in depth of the bar and each monocular line and also the depth of one monocular line relative to the other. 
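The conflicting predictions just described can be tabulated in a short sketch. The function names and string labels are hypothetical, and the "no depth difference" entry for vergence held on the bar is a simplifying assumption of this sketch (it follows from the horopter default rule when no disparity is induced in the bar); it is not a result reported by Howard and Ohmi.

```python
def vergence_theory(vergence, line_side):
    """Depth of the monocular line relative to the bar on the vergence account.

    line_side is deliberately unused: only the sign of the disparity that
    vergence induces in the images of the bar matters.
    """
    return {
        "nearer_than_bar": "line nearer than bar",  # bar acquires uncrossed disparity
        "on_bar": "no depth difference",            # simplifying assumption
        "beyond_bar": "line beyond bar",            # bar acquires crossed disparity
    }[vergence]


def configuration_theories(vergence, line_side):
    """Depth predicted when only the side of the monocular line matters."""
    return {
        "temporal": "line beyond bar",       # occlusion configuration
        "nasal": "line nearer than bar",     # camouflage configuration
    }[line_side]


for vergence in ("nearer_than_bar", "on_bar", "beyond_bar"):
    for line_side in ("temporal", "nasal"):
        print(vergence, line_side,
              vergence_theory(vergence, line_side),
              configuration_theories(vergence, line_side), sep=" | ")
```

The two accounts disagree in the rows where the eyes are converged away from the bar, which is why a display with a line on each side of the bar and controlled vergence can separate them.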
When the white dot forced subjects to converge nearer than the bar, they reported that both monocular lines appeared closer than the bar and when they converged beyond the bar, they reported that both monocular lines appeared beyond the bar. Howard and Ohmi concluded that the apparent depth of a monocular object relative to an adjacent binocular object is determined mainly by disparity induced into the images of the binocular object by vergence. Shimono et al. (1999) produced evidence supporting this conclusion. Monocular objects simply default to the plane of zero disparity (the plane of convergence). Only the vergence theory can account for the fact that both the temporal and nasal monocular lines appeared nearer or beyond the binocular bar, depending on the state of convergence. Contour repulsion effects would be canceled with the display used in this experiment because a monocular line was on each side of the image of the bar in one eye. Jaensch (1911) had also noticed that depth in Panum’s limiting case is reversed when the eyes are unnaturally converged. He concluded that convergence causes the monocular image to fuse with the image in the other eye, bringing
Figure 17.47. Panum’s limiting case and misconvergence. A monocular line is presented on both sides of one image of the binocular bar. With crossed fusion, the line to the right of the bar is in the occlusion configuration, and that to the left is in the camouflage configuration. Divergent fusion reverses the left-right order of the effects. (a) With crossed fusion of the white spots, the black bars have zero disparity. The line on the left appears nearer than the bar and that on the right beyond the bar. (b) Crossed fusion of the offset white spot creates an uncrossed disparity in the black bars, which makes both monocular lines appear nearer than the bar. (c) Crossed fusion of the white spot creates a crossed disparity in the black bars, which makes both monocular lines appear beyond the bar. (Redrawn from Howard and Ohmi 1992)
the unfused image onto the nasal side and hence into the camouflage configuration. He argued that an unfused image on the temporal side evokes a “convergence impulse,” which causes it to appear behind. When the unfused image is on the nasal side, it evokes a “divergence impulse” which causes it to appear in front. He did not mention the disparity in the fused images produced by these vergence impulses. Häkkinen and Nyman (1996) found that, in both the occlusion and camouflage configurations of Panum’s limiting case, the depth of the monocular stimulus was shifted according to whether there was a crossed or uncrossed binocular object adjacent to the display. Although the authors did not interpret their findings in terms of
changes in vergence induced by the adjacent display, these results are consistent with such an interpretation and confirm those obtained by Howard and Ohmi (1992). We conclude that the apparent depth of a monocular line relative to a binocular bar in Panum’s limiting case with relaxed vergence can be due to the effects of vergence induced by the dichoptic asymmetry of the images. The vergence induces a disparity into the binocular bar, which appears displaced in depth relative to the monocular line. Since the monocular line has no matching image in the other eye, it defaults to the plane of zero disparity. Perhaps stereoscopic depth induced by disparities of more than about 30 arcmin is also due to vergence-induced disparities of much smaller magnitude. However, induced vergence is not the only cause of Panum’s limiting case. When subjects in Howard and Ohmi’s experiment converged in the plane of the bar, the line in the occlusion position still appeared beyond the occluding bar, and the line in the camouflage configuration appeared nearer than the bar. Thus, occlusion and camouflage create depth after one allows for the effects of misconvergence. Nakayama and Shimojo (1990) also controlled vergence and found that apparent depth in the occlusion configuration was a function of the lateral separation between line and bar. The depth in Figure 17.48 cannot be explained in terms of vergence-induced disparity. Three of the vertical lines appear in front of the vertical bar, and three appear behind it. Since the images in the two eyes are bilaterally symmetrical they do not induce vergence. Also, vergence-induced disparity in the vertical bar cannot account for the simultaneous appearance of lines at different depths. Depth effects that occur when vergence is controlled may be due to one or more of the following processes.

17.6.3 DOUBLE-DUTY IMAGE LINKAGE
Hering (1879) proposed that one image of the binocular bar is paired both with its partner in the other eye and with the monocular line in the other eye. It is assumed that the
Figure 17.48. Mixed occlusion and camouflage configurations. In the fused image, the vertical lines on one side appear in front of the vertical bar while those on the other side appear behind. Vergence should be stable because the images are symmetrical and because of the outer frame.
eyes are accurately converged on the bar so that its images fall on corresponding points, and that the bar appears in the plane of convergence. The uncrossed disparity between one image of the bar and a monocular line on the temporal side creates the impression that the line is beyond the bar. The crossed disparity between one image of the bar and a monocular line on the nasal side creates the impression that the line is nearer than the bar. This explanation violates the unique-linkage rule, which states that an image in one eye is matched with only one image in the other eye at any one time (Section 15.3.1). Note that Hering’s theory does not imply that the image in one eye fuses with two images in the other eye, but only that disparities are detected between an image in one eye and each of two images in the other. In Hering’s theory the binocular bar has zero disparity and appears in the plane of convergence, while the monocular line appears displaced in depth because of disparity between it and one of the images of the bar. According to the vergence theory, the monocular line appears in the plane of convergence and the binocular bar is displaced in depth because of a disparity induced in its images. Gettys and Harker (1967) found that perceived depth in Panum’s limiting case was proportional to the lateral separation between the binocular and monocular stimuli up to a limiting value, which was as high as 55 arcmin when free eye movements were allowed. Westheimer (1986a) found that perceived depth declined to zero when the lateral separation between the monocular and binocular images was increased to 30 arcmin. It was concluded in both of these studies that binocular and monocular images engage in double-duty linkage. But vergence was not controlled. One need only assume that the tendency for the monocular line to induce a change in vergence increases as its distance from the binocular bar increases up to a certain point. 
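To relate angular separations of this size to depth intervals, the standard small-angle geometry of binocular disparity can be used. The sketch below assumes, as double-duty linkage implies, that the lateral separation between the monocular line and the bar acts as a horizontal disparity. The 70 cm viewing distance and the 6.5 cm interocular distance are illustrative assumptions, not values taken from Gettys and Harker or Westheimer, and the function name is hypothetical.

```python
import math


def depth_interval_m(disparity_arcmin, distance_m, interocular_m=0.065, crossed=False):
    """Depth interval (m) signalled by a horizontal disparity.

    Small-angle approximation for an object near the midline:
    uncrossed (farther): depth = eta * d**2 / (I - eta * d)
    crossed   (nearer):  depth = eta * d**2 / (I + eta * d)
    where eta is the disparity in radians, d the fixation distance, and
    I the interocular distance.
    """
    eta = math.radians(disparity_arcmin / 60.0)
    sign = 1.0 if crossed else -1.0
    return eta * distance_m ** 2 / (interocular_m + sign * eta * distance_m)


# Hypothetical example: a 30-arcmin separation treated as a disparity at 70 cm.
far_interval = depth_interval_m(30.0, 0.70, crossed=False)
near_interval = depth_interval_m(30.0, 0.70, crossed=True)
print(f"uncrossed: {far_interval * 100:.1f} cm beyond fixation")
print(f"crossed:   {near_interval * 100:.1f} cm nearer than fixation")
```

Under these assumptions the same angular disparity corresponds to a larger depth interval beyond fixation than in front of it, so a proportional relation between separation and perceived depth is only approximately linear in distance units.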
Weinshall (1991) claimed that multiple depth planes seen in an ambiguous random-dot display are due to double-duty linkage. However, she subsequently produced evidence that this is not the case (Weinshall 1993). The double-nail illusion described in Section 15.4.6 has also been interpreted as being due to double-duty linkage of images (Krol and van de Grind 1980). But this effect is due to convergence slipping into a plane midway between the rods, which causes the image of the near rod in the left eye to fuse with the image of the far rod in the right eye and vice versa. There are no unfused images, and hence no disparities with respect to the plane of convergence. The stimulus is identical to that produced by two rods in the plane of convergence (Ono 1984). Kumar (1996) devised several demonstrations based on Panum’s limiting case, which he interpreted as evidence of double-duty fusion. However, he did not control vergence and ignored possible effects of zero linear perspective that some of his demonstrations contained. He obtained depth of monocular lines in both the occlusion and camouflage configurations, as in Figure 17.45.
In Figure 17.49A, lines A and B have the same spacing as lines A’ and B’. With convergent fusion, monocular line C is in the camouflage configuration and comes forward. In Figure 17.49B, line C is in the occlusion configuration and recedes. Helmholtz (1910, p. 446) devised the stereogram shown in Figure 17.49C. When lines A and A’ are fused, the image of line B lies midway between the images of lines C and D in the other eye. He noted that this creates two lines, one beyond and one nearer than the fused image of lines A and A’. He concluded that line B fuses simultaneously with lines C and D. Gillam et al. (1995) devised a similar display. They presented a single vertical line to one eye and two intersecting tilted lines to the other eye, as shown in Figure 17.50A. The single line falls between the intersecting lines and causes each tilted line to appear inclined in depth in opposite directions. Like Helmholtz’s display (Figure 17.49C), this suggests that the vertical line is matched with both the tilted lines, supporting the idea of double-duty linkage of images.
Figure 17.49. From occlusion to camouflage configuration. (A) When lines A and A’ are fused by convergence, lines B and B’ correspond. Monocular dashed line C is in the camouflage configuration and appears nearer. (B) When A and A’ are fused, B and B’ correspond. Monocular dashed line C is in the occlusion configuration and appears far. (C) When A and A’ are fused, B falls midway between the two thin lines in the other eye. Helmholtz observed that one line appears beyond fused line A plus A’ and the other appears nearer.
Figure 17.50. Panum’s limiting case with inclined lines. (A) Figures used by Gillam et al. (1995) to produce inclination in two tilted lines in one eye combined with a vertical line in the other eye. (B) The same effect is produced when the vertical line is prevented from being matched with both tilted lines. The effects must be due to depth contrast rather than double-duty matching.
However, the same effect of two inclined lines is produced when a matching line is provided for one of the tilted lines, as in Figure 17.50B. In this case the vertical line can fuse with only one tilted line to produce an apparently inclined line. The fused inclined lines have no disparity and should appear frontal. But they appear inclined in the opposite direction. This effect must be due to depth contrast, rather than double-duty fusion. Perhaps the Helmholtz effect and the effect in Figure 17.50A are also due to depth contrast. Gillam et al. used a probe to measure the apparent depth of both the fused and unfused lines of a Panum’s limiting-case display. Subjects aligned a pair of nonius lines to control vergence. But one nonius line was below the single line and the other was above the matching line in the other eye. This would produce inaccurate settings because of the large separation and because of confusion between alignment of the two nonius lines and alignment of each nonius line and its adjacent line. When subjects matched the depth of the probe to that of the unfused line there was an almost perfect fit between the disparity of the probe and the separation between the fused lines and the unfused line. Gillam et al. argued that only double-duty linkage of images would produce such results. There is a problem in the use of a depth probe defined by disparity. Any misconvergence would affect the disparity of the probe, and hence its apparent depth, in just the same way as it would affect the
apparent depth of the single line relative to the fused line. In other words, one cannot use a depth probe defined by changing disparity to measure an effect due to disparity. The apparent depth of the probe is not independent of the effect being measured. Wang et al. (2001) used displays like those shown in Figure 17.51B. The single line in the left image coincides with the line tilted to the left in the right image. Their disparity is therefore zero. The monocular line in the right image tilts to the right. If double-duty fusion occurred, the monocular line should appear in the midsagittal plane and inclined in depth in accordance with the stereogram in Figure 17.51A. However, the monocular line appears tilted to the right, as long as the nonius lines are vertically aligned. This indicates that double-duty fusion has not occurred. Also, when the left-tilting line in the right image is removed, the two remaining lines fuse to create a line inclined in depth. This confirms that double-duty fusion had not occurred before the line was removed. The monocular line also appears this way when stereogram (B) is fused so as to superimpose the line in the left image with the right-tilted line in the right image.
Figure 17.51. Stereograms adapted from Wang et al. (2001). (A) When fused, the two tilted lines create a single line inclined in depth. The fused line appears vertical in the frontal plane, which is the mean orientation of the two lines. (B) When the nonius lines are vertically aligned, the two left-tilted lines fuse with no disparity. The monocular tilted line appears inclined in depth but still tilted to the right. If the monocular line had engaged in double-duty fusion it should appear vertical in the frontal plane as in (A).
Other evidence reviewed in Section 15.3.1 provides no conclusive support for the idea of double-duty linkage of images. In the classic Panum display, the images of the binocular line and the monocular line can be considered to form one low spatial-frequency image with its centroid between the bar and the line. The disparity between this low spatial-frequency image and the single image in the other eye could be compared with the zero disparity between the images of the bar. This involves a type of double-duty linkage of images, but the two linkages occur within distinct spatial-scale channels of the visual system and therefore could be regarded as occurring between distinct pairs of stimuli. Double-duty linkage of images in different spatial scales could be tolerated, since it does not seriously disrupt the image-linkage process. There is no evidence bearing on this theory.

17.6.4 PANUM’S LIMITING CASE AS DA VINCI STEREOPSIS
Panum’s limiting case could be related to da Vinci stereopsis, in which monocular occlusion or monocular camouflage creates impressions of relative depth in the absence of disparity information. Two vertical lines at different distances in the midline could create images corresponding to Panum’s limiting case, as shown in Figure 17.54. The visual system could interpret the depth of the monocularly occluded line according to the rules listed in Section 17.2.1. Nakayama and Shimojo (1990) used stereograms such as those shown in Figure 17.52. The monocular line appeared beyond the binocular rectangle when the line was on the temporal side (monocular occlusion configuration), but only one of the three subjects saw the line nearer than the rectangle when it was on the nasal side (monocular camouflage configuration). Subjects maintained their vergence on the binocular rectangle, as indicated by the alignment of nonius lines. The perceived depth between the rectangle and the line, as indicated by the settings of a depth probe, was a function of the distance between the line and the edge of the rectangle for distances of up to about 30 arcmin. This result is compatible with both da Vinci
stereopsis and Hering’s theory of double-duty fusion. It is not compatible with the vergence theory when vergence is held on the binocular rectangle. Ono et al. (1992) found an occlusion depth effect when vergence was held constant in the plane of a binocular bar but only when the monocular object was one that could be occluded by the binocular object. The display had to be ecologically valid. This supports the da Vinci stereopsis account. However, they did not investigate whether the amount of perceived depth varied with the separation between the monocular and binocular elements. Gillam et al. (2003) used a display similar to that in Figure 17.52 except that the binocular rectangle was textured to aid binocular fusion. The magnitude of perceived depth of the monocular line, as indicated by a binocular depth probe, increased with increasing distance from the binocular rectangle. This is predicted by both double-duty fusion and da Vinci stereopsis. However, when the monocular element was a black dot, it appeared beyond the binocular rectangle but its perceived depth did not vary with its distance from the binocular rectangle. They concluded that Panum’s limiting case occurs only in the presence of fusible edges and is therefore due to double-duty fusion rather than to da Vinci stereopsis. In Figure 17.53 the monocular bars occur in the occlusion configuration on both sides of a binocular rectangle. When fused by convergence, the bars appear beyond the binocular rectangle, and the magnitude of depth increases with the width of the gap between monocular and binocular elements. In this case, depth cannot be due to double-duty fusion. The inner vertical edge of each of the two lower monocular bars is opposite in contrast to the edge of the binocular rectangle.

Figure 17.52. Stimuli from Nakayama and Shimojo (1990). In one fused image the monocular line is on the nasal side of the rectangle and produces the camouflage configuration. In the other fused image the monocular line is on the temporal side of the binocular rectangle and produces the occlusion configuration. The line appeared more distant than the rectangle. The nonius lines within the rectangles indicate the state of vergence. The rectangle subtended 137 x 417 arcmin at 70 cm.
Figure 17.53. Depth in the absence of fusible images. With crossed fusion, the monocular bars are in the occlusion configuration on both sides of the binocular rectangle. Their apparent depth beyond the rectangle cannot be due to double-duty fusion because the upper bars have no fusible vertical edges and, in the lower bars, the nearest vertical edges in the two eyes have opposite contrast.
In summary, it can be stated that depth in Panum’s limiting case can certainly be created by disparity induced by misconvergence. However, an effect remains when this factor is controlled. This residual effect could be due to one or more other possible causes of the effect, but the evidence is somewhat confusing.

17.6.5 MONOPTIC DEPTH
Consider the images formed by two vertical bars lying at different distances in the midline, as shown in Figure 17.54a. When the eyes are converged on the near bar each image of the far bar lies on the nasal side of the retina. When the eyes are converged on the far bar each image of the near bar lies on the temporal half of the retina. Thus, two matching monocular images falling on the nasal retinas invariably indicate an object beyond the fixation plane and two monocular images on the temporal retinas indicate an object nearer than the fixation plane. When one monocular image is removed, as in Figure 17.54b, the stimulus conforms to Panum’s limiting case. A monocular bar on the nasal side of the binocular bar appears beyond the binocular bar, and a monocular bar on the temporal side appears nearer than the binocular bar. When the fixation point is removed, as in Figure 17.54c, a monocular bar on a nasal retina arises from an object beyond where the eyes are converged and one on a temporal retina arises from a nearer object. If the visual system embodies this ecologically valid rule, then a monocular image on the nasal retina should appear more distant than one on the temporal retina, even when there are no other objects in view.
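The geometry behind this rule can be checked with a short sketch. It places the two eyes on a horizontal axis, puts both the fixation point and the object on the midline, and reports whether the image of the object falls on the nasal or the temporal half of each retina. The function name, the coordinate convention, and the 6.5 cm interocular distance are assumptions made for illustration.

```python
import math


def retinal_side(eye, fixation_dist_m, object_dist_m, interocular_m=0.065):
    """Return 'nasal', 'temporal', or 'fovea' for a midline object's image.

    eye is 'left' or 'right'. Distances are measured along the midline from
    the interocular axis. Direction angles are positive to the right.
    """
    eye_x = -interocular_m / 2 if eye == "left" else interocular_m / 2
    # Direction of the fixation point and of the object as seen from this eye.
    fixation_dir = math.atan2(0.0 - eye_x, fixation_dist_m)
    object_dir = math.atan2(0.0 - eye_x, object_dist_m)
    offset = object_dir - fixation_dir  # > 0: object lies right of fixation
    if offset == 0:
        return "fovea"
    # The temporal visual field is leftward for the left eye, rightward for the right.
    in_temporal_field = offset < 0 if eye == "left" else offset > 0
    # The temporal visual field projects to the nasal half of the retina.
    return "nasal" if in_temporal_field else "temporal"


for label, obj in (("far object", 1.0), ("near object", 0.3)):
    print(label, retinal_side("left", 0.5, obj), retinal_side("right", 0.5, obj))
```

With fixation at 0.5 m, the farther object lands on the nasal retina of both eyes and the nearer object on the temporal retina of both eyes, in line with the description above.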
Figure 17.54. Stimulus condition for monoptic depth. Panels: (a) Midline disparity; (b) Panum’s limiting case; (c) Monoptic depth. Labels in each panel: fixation point, FL, FR. Monoptic depth could be responsible for depth impressions in (a) and (c). Monoptic depth alone creates a depth impression in (c).
Hering (1851) was the first to propose that an image on the temporal retina of one eye will appear to lie in front of the fixation plane, and an image on the nasal retina appears to lie beyond the fixation plane. According to Hering, the perceived depth increases with increasing horizontal distance of the image from the fovea. This relationship between the positions of a monocular image and its perceived depth signs corresponds to the analysis presented above. Helmholtz (1910) ridiculed this notion, citing numerous examples where the theory would fail. However, Kaye (1978) showed that depth percepts could be obtained from monoptic images, thus confirming Hering’s theory. Wilcox et al. (2007) introduced the term “monoptic depth” to describe the depth percept created by a monocular stimulus in one eye while the other eye views a blank field. They replicated Kaye’s results and found that the monoptic depth percept disappeared when the unstimulated eye was patched, and that shifting the fixation point away from the midline reversed perceived depth due to the shift in the retinal location of the stimulus. These results suggest that monoptic depth is a binocular phenomenon. Wilcox et al. went on to show that the monocular image is matched to the fovea of the unstimulated eye using a stereoscopic mechanism that is able to match dissimilar features, or contrast envelopes.

17.7 STEREOPSIS FROM GEOMETRICAL ILLUSIONS

One may ask whether depth can be created by identical dichoptic lines that have a pseudodisparity arising from an apparent difference in shape or length. In the Zöllner illusion, parallel lines with cross-hatching appear nonparallel. Lau (1922) presented the same set of oblique parallel lines to the two eyes, but one set contained cross-hatching, as in Figure 17.55A. The display did not create a consistent impression of depth.
When he combined two oppositely oriented Zöllner illusions, as in Figure 17.55B, the predominant impression was of rivalry rather than of depth. However, he did obtain consistent impressions of differential inclination of the parallel lines when the crosshatching was only slightly different in the two eyes, as in Figure 17.55C. He concluded that the visual system uses the pseudodisparity between the parallel lines induced by the configurational properties of the two images. However, the slight difference in orientation of the crosshatching constitutes a real disparity, and it is not clear what effect this has on the appearance of the parallel lines. Also, the nonparallel appearance of the lines of the Zöllner illusion produces a pseudolinear perspective, which complicates the stimulus. Lau (1925) presented the stereogram shown in Figure 17.56, which he claimed produced curvature in depth in the intersecting straight lines.
Squires (1956) reported stereoscopic depth in dichoptic Zöllner and Poggendorff illusions. Others have not been able to replicate these effects (Ogle 1962; Julesz 1971). Glennerster and Rogers (1993) designed the stereogram in Figure 17.57, which creates two lines slanted in depth. The perceived depth was commensurate with the disparity predicted from the 2-D Müller-Lyer illusion in each eye. However, the whole image with outgoing fins is objectively longer than the whole image with ingoing fins. Depth could be due to disparity between the whole images rather than to a pseudodisparity between the central lines. Overall, the evidence that impressions of depth can be produced by pseudodisparities arising from geometrical illusions is not conclusive.

Figure 17.55. Stereopsis from the Zöllner illusion. Lau (1922) could not obtain a convincing impression of depth by combining the Zöllner illusion in one eye with parallel lines in the other eye, as in (a), nor by dichoptically combining oppositely oriented Zöllner illusions, as in (b). He claimed to see depth in combined Zöllner illusions in which the angle of cross-hatching was slightly different in the two eyes, as in (c).

Figure 17.56. Stereopsis from the Hering illusion. Lau (1925) claimed that straight lines made to appear curved by different amounts in the two eyes by the Hering illusion generate an impression of curvature in depth when fused stereoscopically.

Figure 17.57. Dichoptic combination of Müller-Lyer illusions. Disparity between the low spatial-frequency components creates an impression of slant in depth. (Glennerster and Rogers 1993)

17.8 CHROMOSTEREOPSIS
For most people, a red patch on a blue patch appears to stand out in depth when viewed with both eyes. This effect is known as chromostereopsis. It was described by Goethe (1810) and Brewster (1851). It is believed that it is due to chromatic aberration. The refractive power of the eye’s optical system for short wavelengths (blue light) is about 1.8 diopters larger than that for long wavelengths (red light) (Howarth and Bradley 1986). This means that blue light is brought to focus nearer to the lens than red light. This is longitudinal chromatic aberration (Thibos et al. 1990; Zhang X et al. 1991). Chromostereopsis could be due to blue objects’ requiring a greater degree of accommodation than red objects. Thus, one or other of the colored patches is blurred, depending on which patch is in focus. Helmholtz (1909, vol. 3, p. 294) claimed to get an effect due to this cause, although it is not clear from his account whether he used monocular or binocular viewing. Any effect due to differential blur would be evident with both monocular and binocular viewing. It cannot be the only cause of binocular chromostereopsis, since depth is seen with small artificial pupils, which eliminate the effects of differential image blur due to longitudinal chromatic aberration. For most people, the eye’s optic axis (the line through the centers of the eye’s four refractive surfaces) intersects the retina on the nasal side of the fovea. The angle between the optic axis and the visual axis through the fovea is the angle alpha. The nasal offset of the optic axis creates a parallactic displacement of the images of red objects toward the temporal retina relative to those of blue objects. This is transverse chromatic aberration. The visual angle between the image of a blue point and the image of a superimposed red point defines the magnitude
of transverse chromatic aberration. An object illuminated by both red and blue light produces chromatic diplopia in each eye. Consider a small blue object and a red object placed vertically above it in the same frontal plane. The opposite transverse chromatic aberration in the two eyes induces a crossed disparity in the image of the red object with respect to that of the blue object (Einthoven 1885; Hartridge 1918). This causes the red object to appear nearer than the blue object. We refer to this as red-in-front-of-blue chromostereopsis. A red object acquires an uncrossed disparity relative to a blue object in those people for whom the angle alpha has the opposite sign. They therefore experience blue-in-front-of-red chromostereopsis. A random mixture of small red and blue squares or of squares in other pairs of colors does not produce depth within the pattern because the differently colored squares share the same border and therefore do not produce a relative disparity signal. The pattern as a whole may appear to stand out in depth when viewed against a region containing other colors (Faubert 1994). The direction of chromostereopsis can be reversed by moving artificial pupils, both in a nasal direction or both in a temporal direction with respect to the centers of the natural pupils (Vos 1960). Moving the artificial pupils nasally induces blue-in-front-of-red chromostereopsis and moving them the other way has the opposite effect. This is because moving a pupil changes the position of the optic axis but not of the visual axis, and thus changes the sign of transverse chromatic aberration. Changes in the magnitude and sign of transverse chromatic aberration brought about by changing the lateral distance between small artificial pupils are accompanied by equivalent changes in chromostereopsis (Owens and Leibowitz 1975; Bodé 1986; Simonet and Campbell 1990a; Ye et al. 1991).
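The way the two monocular aberrations combine into a binocular disparity can be sketched as follows. The sign convention (positive when the red image is displaced toward the temporal retina relative to the blue image), the function name, and the numerical values are assumptions made for illustration and are not measurements from the studies cited.

```python
def chromostereopsis(tca_left_arcmin, tca_right_arcmin):
    """Relative disparity of red with respect to blue, and the predicted depth order.

    Each argument is the transverse chromatic aberration in one eye:
    positive means the red image is displaced toward the temporal retina
    relative to the blue image. Temporalward shifts in both eyes add up to a
    crossed disparity of red relative to blue.
    """
    disparity = tca_left_arcmin + tca_right_arcmin  # > 0: crossed
    if disparity > 0:
        order = "red in front of blue"
    elif disparity < 0:
        order = "blue in front of red"
    else:
        order = "no chromostereopsis"
    return disparity, order


# Hypothetical observer with a temporalward displacement of red in both eyes.
print(chromostereopsis(1.0, 1.0))
# The same observer after artificial pupils are moved so that the sign reverses.
print(chromostereopsis(-1.0, -1.0))
```

On this simplified account, equal and opposite-signed aberrations in the two eyes would cancel, so the depth effect depends on the binocular sum rather than on the aberration in either eye alone.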
Thus, it is well established that chromostereopsis can be explained in terms of transverse chromatic aberration for small pupils. Chromostereopsis is often enhanced by spectacles that contain chromatically uncorrected lenses. Viewing a display through prisms also increases the chromatic aberration. Thus, when blue and red patches are viewed through base-in prisms of about 5 diopters the blue patch appears closer than the red patch. When the prisms are base-out, the depth order of the colors is reversed. Covering the temporal half of each pupil produces the same effect as looking through converging prisms (Kishto 1965). The red-in-front-of-blue stereopsis experienced by most people reverses to a blue-in-front-of-red effect at low levels of illumination (Kishto 1965). Four explanations of this effect have been proposed. 1. The effect could be due to an increased sensitivity of the eye to blue light relative to red light in low illumination (the Purkinje shift). Sundet (1972) disproved this
theory by showing that chromostereopsis still reverses when pupil size is varied without a change in retinal illumination. 2. Reversal of chromostereopsis could be due to the pupils moving in a nasal direction when they dilate in response to reduced illumination. Changes of pupil centration of about 0.2 mm occur during pupil dilation or constriction, but they are very variable from person to person, tend to be the same in both eyes, and are not related to the direction of pupil change (Walsh 1988). Sundet (1976) found that chromostereopsis reversed when the sizes of artificial pupils were changed without a change in position. Thus, although lateral shifts of the pupils may affect chromostereopsis, the reversal of chromostereopsis with decreasing illumination occurs when such shifts are not present. 3. When the pupils dilate, the effective center of each pupil shifts in a nasal direction with respect to its geometrical center (Vos 1960, 1966). In the human eye, light passing through the center of the pupil is transmitted to the retina with less loss than light passing through peripheral parts of the pupil. This is the well-known Stiles-Crawford effect. The exact point in the pupil through which light is transmitted most efficiently is known as the effective center of the pupil. This can be determined by observing a flashed pattern of concentric rings reflected from the corneal surface of a person’s eye. Foveal cones automatically align themselves with the point of maximum luminance. If we define the optic axis in terms of the effective center of the pupil, a shift of this center is equivalent to a shift in the optic axis. The nasalward shift of the effective centers of the pupils reverses the sign of the angle alpha for most people and thus reverses the sign of chromostereopsis. Simonet and Campbell (1990b) reported that not all changes in chromostereopsis with changing illumination were accompanied by changes in monocular transverse chromatic aberration. 
However, Ye et al. (1992) found that changes in chromostereopsis with changes in pupil size were closely matched by changes in chromatic diplopia. They also found that chromostereopsis could be predicted from transverse chromatic aberration, with the Stiles-Crawford effect playing an important role. 4. Finally, the reversal of chromostereopsis at low levels of illumination could be due to the border contrast between the colored stimuli and the background. Verhoeff (1928) observed blue-in-front-of-red chromostereopsis when red and blue letters were printed on white paper and a red-in-front-of-blue effect when they were printed on black paper. Dengler and Nitschke (1993) found the same transition when the color of the background was changed. They suggested
that changes in border contrast cause this reversal and argued that Kishto did not control contrast properly when he reduced overall luminance. Winn et al. (1995) also obtained a reversal of chromostereopsis when the background was changed from black to the sum of the spectral colors of the test patches. A given colored target appeared in different depth planes when placed on different backgrounds in the same display. They noted that the location of a red bar on a black background is determined by the red image, but that the location of a red bar on a white background is determined by the white minus red spectrum. From this point of view, changes in chromostereopsis due to changes in the background are explained in terms of transverse chromatic aberration. Dengler and Nitschke’s explanation of chromostereopsis in terms of border contrast can be explained in the same way. It seems that transverse chromatic aberration can explain all forms of chromostereopsis.

Chromatic fringes seen when prisms are first put on gradually fade because of neural adaptation. However, the increased chromostereopsis remains, which shows that perceptual adaptation to color fringes has no effect on optical dispersion produced by prisms (Hajos 1962). Colorblind people experience chromostereopsis since the effect is optical and has nothing to do with whether color is correctly coded in the retina (Kishto 1965). Although chromatic aberration produces depth, removing chromatic aberration by achromatizing lenses does not significantly affect stereoacuity based on disparity defined by luminance (Osuobeni 1991). Chromostereoscopy can be used to create stereoscopic images, but objects are not represented in their natural colors (Steenblik 1993; Idesawa et al. 2005).

17.9 IRRADIATION STEREOPSIS

When an illuminated square in dark surroundings is viewed with a neutral filter in front of one eye, the square appears slanted about a vertical axis.
Münster (1941) first reported this effect, and Ogle (1962) gave it the name irradiation stereoscopy. It can be explained in terms of optical irradiation, in which the brighter image is optically larger than the dim image. Any difference in horizontal size of the images is equivalent to the horizontal-size disparity produced by a surface slanting in depth about a vertical axis (Section 19.2.2). Irradiation stereoscopy may also be due, at least in part, to neural interactions in the retina or at higher levels in the nervous system. Békésy (1970) suggested that irradiation stereopsis might occur naturally because of differences in the sizes of the pupils.
Figure 17.58. Irradiation stereoscopy. When viewed with a neutral density filter in front of one eye, the images of the white bars are relatively larger in the well-illuminated eye. This causes each white bar to appear slanted in depth to create a Venetian blind effect. When the filter is in front of the other eye, the white bars slant in the opposite direction.
Irradiation stereoscopy is due to unequal sizes of the images of each affected object, so each object appears to slant about its own central axis. On the other hand, aniseikonia is an overall meridional difference in image size, so that the contents of the whole binocular field appear to slant about a single axis. When a set of alternating black and white bars, like those shown in Figure 17.58, is viewed with a neutral filter over one eye, each white bar appears to slant about its own vertical axis, making the set of bars appear like a Venetian blind. If the black bars are seen as objects, they too appear to slant but in the opposite direction, so the set of black and white bars appears like a folded screen. The direction of the folds reverses when the filter is moved to the other eye. The effect persists when the gaze moves over the display. A similar effect occurs with overall magnification of one eye’s image of a vertical grating (Section 19.2.2). Some people experience the irradiation Venetian-blind effect in a vertical grating with no filter. Cibis and Haber (1951) suggested that this is due to a natural difference in the illumination or focus of the eyes. They called this anisopia (see also Miles 1953). The apparent slant of a square produced by a difference of illumination in the two eyes can be nulled by a change in the width of one of the images. Cibis and Haber (1951) used this procedure and found that the apparent slant of a square increased as a linear function of filter density up to a density of 1.25, and saturated at a density of about 2.5. When a textured disk rotating in a frontal plane is viewed with a neutral filter over one eye, it appears inclined about a horizontal axis. This is a manifestation of the Pulfrich effect (Section 23.1). At the same time, the disk appears slanted about a vertical axis because of irradiation stereoscopy (Walker 1976).
18 STEREOSCOPIC ACUITY
18.1 Terminology and tasks
18.2 Tests of stereoscopic vision
18.2.1 Tests using real depth
18.2.2 Tests using standard stereograms
18.2.3 Tests using random-dot stereograms
18.2.4 Correlations between stereo tests
18.3 Basic features of stereoacuity
18.3.1 Limits of stereoacuity
18.3.2 Detection of absolute and relative disparities
18.3.3 Stereoacuity away from the horopter
18.3.4 Stereoacuity and relative image size
18.3.5 Ideal observer for stereoacuity
18.4 Upper disparity limit
18.4.1 Upper limit of horizontal disparity
18.4.2 Tolerance for added vertical disparity
18.5 Luminance, contrast, and stereopsis
18.5.1 Effects of luminance and contrast
18.5.2 Contrast-sensitivity function for stereopsis
18.5.3 Simple detection and detection of depth
18.5.4 Effects of interocular differences
18.5.5 Stereoacuity and color
18.6 Spatial factors in stereoacuity
18.6.1 Stereoacuity and stimulus location
18.6.2 Stimulus spacing
18.6.3 Spatial frequency of disparity modulation
18.6.4 Stereoacuity with crossed and uncrossed disparities
18.6.5 Stereoacuity and stimulus orientation
18.6.6 Discrimination of disparity gradients
18.6.7 Stereoacuity and viewing distance
18.7 Stereoacuity and spatial scale
18.7.1 Introduction
18.7.2 Spatial scale and disparity detection
18.7.3 Spatial scale, contrast, and perceived depth
18.7.4 Spatial scale and stereopsis masking
18.8 Disparity pooling
18.8.1 Disparity pooling and noise reduction
18.8.2 Disparity metamerism
18.9 Transparency and stereoacuity
18.10 Stereoacuity and eye movements
18.10.1 Effects of lateral eye movements
18.10.2 Detection of depth across shifts of gaze
18.10.3 Stereoacuity and vergence stability
18.10.4 Stereo integration over vergence changes
18.10.5 Stereoacuity and head movements
18.11 Stereoacuity and other acuities
18.12 Temporal factors in stereoacuity
18.12.1 Stimulus duration and processing time
18.12.2 Effects of stimulus delays
18.12.3 Sustained and transient stereopsis
18.13 Attention and stereoacuity
18.14 Learning and stereopsis
18.14.1 Practice and stereoacuity
18.14.2 Practice and stereo latency

18.1 TERMINOLOGY AND TASKS
The depth-discrimination threshold is the smallest depth interval between two stimuli that a subject can detect. With the method of adjustment, the subject adjusts the distance of a variable stimulus until it appears at the same distance as a comparison stimulus. The depth threshold is the mean unsigned error or the variance of the equidistance settings. With the method of constant stimuli, the subject reports the relative distance of two stimuli. The threshold is defined as the depth interval discriminated on 75% of trials. The double-staircase method further refines the measuring procedure (Blakemore 1970c). Criterion problems are eliminated by a two-alternative forced-choice procedure in which subjects decide which of two stimuli contains a depth interval (Blackwell 1952). The two stimuli are presented either side by side or one after the other. In a related task, subjects decide whether a stimulus object is nearer than or beyond a comparison object. The statistical reliability of interpolation of the threshold point in the psychometric function is improved by curve-fitting procedures coupled with assignment of weights derived from probit analysis (Section 3.1.1b). The end result is a measure of the least separation in depth between two stimuli that evokes a sensation of depth for a given mean distance of the stimuli from the viewer. Stereoacuity (η) is a measure based on the depth-discrimination threshold when binocular disparity is
the only cue to depth. It is the difference between the binocular subtense of a fixed stimulus and that of a variable stimulus in its mean threshold position, as illustrated in Figure 18.1A. To a first approximation, the just-discriminable difference in depth (Δd) and stereoacuity (η) are related by

$$\eta = \frac{a\,\Delta d}{d^{2}} \ \text{radians} \qquad (1)$$
where a is the interpupillary distance and d is the viewing distance (Section 14.2.3). These values can be converted into seconds of arc by multiplying by 206,000. The equation shows that, for a given Δd , stereoacuity is approximately proportional to the stereo base (interocular distance) and inversely proportional to the square of viewing distance. In standard tests of stereoacuity, the subject fixates one of two targets or is allowed to look from one to the other. Thus, at any instant, one of the targets has zero disparity. In other tests, the subject fixates a third target so that both test targets have a standing disparity, known as a disparity pedestal. In this case, the depth-discrimination threshold is the just discriminable depth between the test targets with the subject converged on the fixation target, as in Figure 18.1b. Stereoscopic accuracy is the signed difference, or constant error, between a judged depth interval and an actual depth interval. The smaller the difference (error), the higher is the accuracy. There are three types of stereoscopic inaccuracy. 1. Scale expansion or contraction Zero disparity is correctly judged as indicating no depth between two stimuli, but depth intervals on either side of zero are under- or overestimated by a fixed ratio. Thus, the scale of perceived depth is expanded or contracted relative to that of actual depth. Stereoscopic gain is the magnitude of perceived depth between two stimuli one unit of depth apart. In more general terms, it is the ratio of perceived depth to actual depth. Usually, one of the test stimuli is fixated and assumed to have zero disparity. Gain has a value of 1 when perceived depth and actual depth are equal. A gain of between 0 and 1 indicates that depth is underestimated. Perceived depth can be determined from subjects’ estimates of the depth between two targets, or by asking subjects to place a probe at the same apparent depth as a test target. 
Ideally, in the matching method, the distance of the probe must be judged correctly and the probe and test targets should not interact. It is not easy to check these requirements, but even if they are not met the method allows one to compare stereo gains of different test targets. 2. Scale shift The scale of perceived depth may be shifted by a constant amount with respect to that
Figure 18.1. Absolute and relative binocular disparities. (A) Binocular disparity with reference to zero disparity. Angles and distances that define the disparity of a point, P, with respect to a fixation point, F, at distance d. Δd is the distance between F and P, a is the interocular distance, θ is the binocular subtense of P, and ω is the binocular subtense of F. Disparity equals ω − θ. (B) Binocular disparity with respect to a disparity pedestal. With convergence on P, the disparity between A and B is on a disparity pedestal, defined as the disparity between A and B.
of actual depth. Thus, two objects with the same disparity would appear to lie in different depth planes, and all other depth intervals would be shifted accordingly. Something like this occurs when surfaces in different depth planes are juxtaposed (Chapter 21). 3. Reversal of the sign of disparity A person who perceived crossed disparities as uncrossed and uncrossed disparities as crossed would see objects in reversed depth in the
absence of other depth cues. They would have constant errors everywhere except at zero disparity. People seem to be incapable of adjusting to prisms that optically reverse the signs of disparities. Thus, it seems that this type of inaccuracy does not occur. The disparity sign mechanism must be hard-wired. A stereoscopic visual system is symmetrical when discrimination and gain are the same for crossed and uncrossed disparities. We will see that the normal human stereoscopic system has systematic asymmetries. Some people have severe asymmetries, to the extent of being able to process only crossed or only uncrossed disparities. A point with crossed disparity is not necessarily closer than a point with either zero or uncrossed disparity. The disparity of a point specifies its location with respect to the Vieth-Müller circle. At large headcentric eccentricities, a point with uncrossed disparity can be closer than a point on the midline, and a point with crossed disparity can be further away (Figure 14.4). Also, an eccentrically positioned surface that is slanted with respect to the cyclopean direction (closer on the left and farther away on the right, or vice versa) can have a disparity gradient opposite in sign to that of a surface with the same slant positioned close to the midline. In general, the visual system seems to be capable of interpreting both the relative distances of points and the slants of surfaces with respect to the cyclopean direction, even when the surfaces are eccentric with respect to the head. However, observers may report slant reversals when information about the surface’s eccentricity, derived from eye position and vertical disparities, is weak (Gillam 1967). Conflicting perspective information plays an important role in slant reversals (see Section 20.3.2c) (Gillam 1993). The upper limit of disparity is the largest disparity that can evoke an impression of depth. Images that evoke an impression of depth are not necessarily fused. 
In other words, the limit of fusion is smaller than the upper limit of disparity.
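Equation (1) of this section can be turned into a small numerical sketch. The code below compares the small-angle form with the exact difference in binocular subtense for two points on the midline, and inverts the relation to give the depth interval that corresponds to a given stereoacuity. The function names and the interocular distance of 65 mm are assumptions made for illustration; they do not come from the text.

```python
import math

# Conversion factor from radians to seconds of arc (about 206,265).
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi


def disparity_small_angle(a, d, delta_d):
    """Equation (1): disparity in radians for interocular distance a,
    viewing distance d, and depth interval delta_d (same length units)."""
    return a * delta_d / d ** 2


def disparity_exact(a, d, delta_d):
    """Difference in binocular subtense between a midline point at
    distance d and a midline point at distance d + delta_d, in radians."""
    omega = 2 * math.atan(a / (2 * d))              # subtense of nearer point
    theta = 2 * math.atan(a / (2 * (d + delta_d)))  # subtense of farther point
    return omega - theta


def depth_interval(a, d, eta_arcsec):
    """Invert equation (1): depth interval for a stereoacuity in arcsec."""
    return (eta_arcsec / ARCSEC_PER_RADIAN) * d ** 2 / a


if __name__ == "__main__":
    a = 0.065  # assumed interocular distance in metres (not from the text)
    approx = disparity_small_angle(a, 6.0, 0.009) * ARCSEC_PER_RADIAN
    exact = disparity_exact(a, 6.0, 0.009) * ARCSEC_PER_RADIAN
    print(f"9 mm depth interval at 6 m: {approx:.2f} arcsec (exact {exact:.2f})")
    print(f"2 arcsec at 6 m: {depth_interval(a, 6.0, 2.0) * 1000:.2f} mm")
```

Under these assumptions, a depth interval of 9 mm at 6 m corresponds to a little over 3 arcsec, and a stereoacuity of 2 arcsec at 6 m corresponds to a depth interval of roughly 5 mm. The two forms agree closely whenever the depth interval is small relative to the viewing distance.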
18.2 TESTS OF STEREOSCOPIC VISION The tests of stereoscopic acuity described in this section are designed for subjects who can respond to verbal instructions. Tests of stereopsis for preverbal infants were described in Section 7.6.1. Tests used on animals are described in Chapter 33.
18.2.1 TESTS USING REAL DEPTH

18.2.1a Howard-Dolman Test
The best-known test of stereoacuity, first used by Helmholtz, is to set a vertical rod to appear in the same frontal plane as two flanking rods. Heine (1900) used three rods at a distance of 5 m and obtained a stereoacuity of 6 arcsec in his best subjects. Bourdon (1902) used three pins at a distance of 2 m and obtained a stereoacuity of 5 arcsec. In the Howard-Dolman test adopted by the American Air Force during the 1914–18 war, the subject views two 1-cm diameter vertical rods, 6 cm apart, from a distance of 6 m. A horizontal aperture near the eyes occludes the ends of the rods. In the original test the rods were exposed briefly and the subjects made a judgment about the depth of one rod relative to the other (Howard 1919). In testing American Air Force pilots, Howard showed for the first time that stereoacuity could be as fine as 2 arcsec. Andersen and Weymouth (1923) obtained similar values. In the standard Howard-Dolman test, the subject pulls a string attached to one rod until the two rods appear equidistant. In a two-alternative forced-choice procedure, subjects decide which of two pairs of rods has a depth difference, with the pairs presented sequentially or simultaneously (Fan et al. 1996). Larson and Giroux (1982) designed a Howard-Dolman test in which pairs of rods with different separations in depth could be changed rapidly. At least two monocular cues are present in this test: the relative widths of the images of the rods and motion parallax due to head movements. Differences in image blur would not be detectable at the large viewing distance used in the Howard-Dolman test. Woodburne (1934) found that controlling the width of test threads had very little effect on the results. Howard claimed that the binocular depth threshold was about 20 times lower than the monocular threshold, but Pierce and Benton (1975) found it to be only about 2.4 times lower. See Sloan and Altman (1954) for other evidence on this point. Kaye et al. (1999) developed a procedure for taking monocular depth acuity into account when measuring binocular stereoacuity, which involves setting two small spheres to equidistance under monocular and binocular conditions.

18.2.1b Stereopter
The Verhoeff (1942) Stereopter (The American Optical Co., Scientific Instrument Division, Buffalo, NY, 14215) uses three rods seen against a back-illuminated screen. The rods differ in thickness, thus eliminating the size cue to relative distance. Sloane and Gallagher (1945) found the Verhoeff test to be more sensitive than the Howard-Dolman test. Another procedure is to use two single high-contrast edges rather than two rods.

18.2.1c Diastereo Test
This test consists of a flashlight with a diffusing lens. Three opaque dots are mounted on the front of the lens, one of which is 9 mm nearer than the other two. The instrument is rotated around the visual axis into each of eight orientations. The subject has to say which dot is in front.
The viewing distance is reduced from 6 m until answers for the eight orientations are correct. This test is inexpensive and easy to use (Pardon 1962).
18.2.1d Falling Bead Test In Hering’s falling bead test, the subject looks through a tube and fixates on a bead hanging on a thread. The experimenter drops a second bead in front of or behind the thread and slightly to one side (Hering 1865). The drop takes about 200 ms. The subject reports the relative depths of the two beads.
18.2.1e Frisby Stereo Test The Frisby Stereo Test (Clement-Clark, Ltd., Airmed House, Edinburgh Way, Harlow, Essex, CM20 2ED, UK) consists of three clear plastic plates of variable thickness. Three circular disks with randomly arranged dots are placed all on the front or all on the back surface of each plate, and a fourth disk is placed on the opposite surface of each plate. The plate creates a binocular disparity of between 15 and 340 arcsec, depending on its thickness and the viewing distance. For each plate, subjects identify the single disk that differs in depth from the other three. A stereoscope is not needed because depth between the disks is real. This overcomes the problem that some people have in fusing dichoptic images, and makes the test easy to use with children who resist wearing viewing devices. Care must be taken to ensure that subjects do not generate parallax by moving the head or the plate. None of 40 children who passed the test under binocular conditions passed with monocular viewing (Manny et al. 1991). Only one of 32 adults who passed the test with binocular viewing performed above chance with monocular viewing (Cooper and Feldman 1979). The Frisby stereotest is understood
by most children and has the best test/retest reliability of the available clinical tests (Heron et al. 1985). It has been found to correlate highly with the TNO test (Rosner and Clift 1984).

Figure 18.2. The Titmus Stereo Fly Test.

18.2.2 TESTS USING STANDARD STEREOGRAMS
18.2.2a Stereoscopic Howard-Dolman Test Monocular cues to distance may be eliminated in the Howard-Dolman test by creating rods in a stereoscope and varying the binocular disparity of the images of the variable rod. There are several tests of this type, including the Bausch and Lomb Ortho-Rater used by the U.S. Navy during the 1940–45 war (Fletcher and Ross 1953). It is impossible to create disparities smaller than about 20 arcsec using film or printed paper, because they are subject to expansion and contraction with changes in humidity. Also, stereograms are subject to distortions from the optics of the stereoscope. For example, a disparity of 4 arcsec requires a difference between stereograms of less than 0.004 mm.
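The size of the image offset needed for a given disparity can be checked with a short calculation. The sketch below converts a disparity in arcsec into a lateral offset on the stereogram for a chosen viewing distance. The passage does not state the viewing distance behind the 0.004 mm figure, so the distances of 200 mm and 400 mm used here are assumptions, as is the function name.

```python
import math

# Conversion factor from radians to seconds of arc (about 206,265).
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi


def stereogram_offset_mm(disparity_arcsec, viewing_distance_mm):
    """Lateral offset between the two half-images (in mm) that subtends
    the given disparity at the given viewing distance."""
    disparity_rad = disparity_arcsec / ARCSEC_PER_RADIAN
    return math.tan(disparity_rad) * viewing_distance_mm


if __name__ == "__main__":
    # Viewing distances are illustrative assumptions, not values from the text.
    for distance_mm in (200, 400):
        for disparity in (4, 20):
            offset = stereogram_offset_mm(disparity, distance_mm)
            print(f"{disparity} arcsec at {distance_mm} mm: {offset:.4f} mm")
```

On these assumptions, 4 arcsec at 200 mm works out at just under 0.004 mm, which is of the same order as the figure quoted above. In a lens stereoscope the effective viewing distance depends on the optics, so the numbers should be read as order-of-magnitude estimates only.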
18.2.2b Stereo Fly Test The Stereo Fly Test (Stereo Optical Co., Chicago, IL, 60641) consists of three subtests: the Fly test, the Circle test (derived from the Wirt test), and the Animal Stereo test, as illustrated in Figure 18.2. All the stereograms are viewed through polarizing spectacles at a distance of 40 cm. The Fly test is a qualitative test for young children. A fly appears in depth when the images are properly fused. Stereopsis is indicated if the child reaches in front of the plane of the stereogram when asked to touch the fly’s wings. The Circle test has nine numbered diamonds, each containing four circles. One of the circles in each diamond has a disparity, ranging from 40 to 400 arcsec, and the subject indicates
(Stereo Optical Company, Inc.)
which of the circles appears out of the plane of the other three. The Animal test consists of three rows of animals. One animal in each row has a disparity of 100, 200, or 300 arcsec. These disparity ranges can be reduced by increasing the viewing distance beyond 40 cm. Note that when the viewing distance of a stereogram is doubled, the disparity is halved. Objects in real depth obey an inverse square law by which disparity is quartered when distance is doubled. With coarser disparities it is possible to pick out the correct stimulus in the Circle and Animal tests with only one eye open. This is done by observing the lateral displacement of the correct image relative to the other images (Cooper and Warshowsky 1977). Simons and Reinecke (1974) found that many amblyopes, who were presumed to lack stereopsis, could pick out the correct stimulus in the Circle and Animal tests, and assumed that they were using this monocular cue. True stereopsis is indicated if subjects can report the relative depth plane of the odd circle and not merely that it is odd.
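The two scaling rules mentioned above, halving for a stereogram and quartering for real depth when the viewing distance is doubled, can be sketched as follows. Both rules are small-angle approximations, and the function names are hypothetical.

```python
def rescale_stereogram_disparity(disparity, d_ref, d_new):
    """Disparity of a fixed image offset on a stereogram scales as 1/d."""
    return disparity * d_ref / d_new


def rescale_real_depth_disparity(disparity, d_ref, d_new):
    """Disparity of a fixed real depth interval scales as 1/d**2."""
    return disparity * (d_ref / d_new) ** 2


if __name__ == "__main__":
    # Circle test range quoted in the text: 40 to 400 arcsec at 40 cm.
    for disparity in (40, 400):
        at_80 = rescale_stereogram_disparity(disparity, 40, 80)
        print(f"stereogram: {disparity} arcsec at 40 cm -> {at_80:g} arcsec at 80 cm")
    # The same starting disparity produced by a real depth interval instead.
    real = rescale_real_depth_disparity(400, 40, 80)
    print(f"real depth: 400 arcsec at 40 cm -> {real:g} arcsec at 80 cm")
```

With the Circle test viewed at 80 cm instead of 40 cm, the nominal range of 40 to 400 arcsec would become 20 to 200 arcsec, whereas a real depth interval giving 400 arcsec at 40 cm would give 100 arcsec at 80 cm.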
18.2.2c AO Vectographic Card Test The AO Vectographic Card Test (Armstrong Optical Co., 14 Mechanic Street, Southbridge, MA 01550) consists of 10 rows of 5 circles. For each row, the subject reports which circle is stereoscopically displaced out of the depth plane of the others. At a viewing distance of 40 cm, the disparity range extends from 5 to 739 arcsec.
18.2.2d Keystone Visual Skills Test
The Keystone Visual Skills Test (Mast/Keystone Co., 2212 E. 12th Street, Davenport, IA 52803) consists of cards viewed in a hand-held stereoscope. The DB card has 12 rows of 5 symbols (star, square, cross, heart, and disk) set within a rectangular frame. In each row, one symbol is stereoscopically out of the depth plane of the others. At a viewing distance of 20 cm, the disparity range is 52 to 1076 arcsec. The DC card has three rows of characters, which vary in size according to different Snellen acuities. By changing the viewing distance, disparity can be varied between 10 and 1300 arcsec.

18.2.2e Freiburg Stereoacuity Test
This test was developed by Bach et al. (2001). A computer-generated stereoscopic image of a vertical line is set at random horizontal positions in a rectangular frame with a random-dot surround. The threshold depth disparity of the line is determined by a staircase procedure. Antialiasing allows the disparity of the line relative to the frame and surround to be set between 1 and 1000 arcsec.

18.2.3 TESTS USING RANDOM-DOT STEREOGRAMS

18.2.3a Random-Dot Stereograms
Although random-dot stereograms were described before 1960, it was in that year that Bela Julesz introduced them as a research tool (Portrait Figure 18.3). Methods of constructing random-dot stereograms are described in Section 24.1.5. The two images of a cyclopean stereogram contain matching texture that allows the visual system to find the correct correspondence and detect the local disparity in each region. The cyclopean shape defined by disparity between groups of elements emerges only after the images have been correctly fused, as illustrated in Figure 18.4. However, detection of the cyclopean shape does not require detection of depth. When one of the regions of the stereogram is in binocular register, the disparity in the disparate region creates a higher density of dots. This point is readily demonstrated by superimposing the two images of a random-dot stereogram in one eye, as in Figure 18.5. Also, as Regan and Hamstra (1994, p. 2289) pointed out, the cyclopean shape in a polaroid or anaglyph random-dot stereogram is visible without the polaroid or color filters before the eyes. In other words, detection of disparity or depth does not have to precede detection of the cyclopean shape in a random-dot stereogram.
Figure 18.3. Bela Julesz. Born in Budapest in 1928. He obtained a diploma in electrical engineering at the University of Budapest in 1953 and a doctorate from the Hungarian Academy of Science in 1956. He worked at the Telecommunication Institute in Budapest from 1951 to 1956 and at the Bell Laboratories in Murray Hill, New Jersey, from 1956 to 1989, where he was head of the Sensory and Perceptual Processes Division. In 1989 he became professor of psychology at Rutgers University. Recipient of the H. P. Heineken Prize of the Royal Netherlands Academy of Arts and Science in 1985 and the Karl Spencer Lashley Award of the American Philosophical Society in 1989. He died in 2003.
Figure 18.4. A random-dot stereogram. A central square stands out in one of the fused images and appears through an aperture in the other fused image. (Adapted from Julesz 1971)
A well-designed random-dot stereogram contains no monocular evidence about the shape of the cyclopean form. If the images are not properly matched, the stereogram is seen as a random array of elements in depth, known as lacy depth. However, a random-dot stereogram contains several monocular depth cues, all of which indicate that the display is flat. Depth cues other than disparity have the following effects in a random-dot stereogram:

1. Conflict between accommodation and vergence In a real 3-D scene, accommodation changes with changes in vergence. In a stereogram, vergence and disparity are linked, as in a real 3-D stimulus. However, the accommodation required to keep the images of a stereogram in focus does not change with vergence. Also, in a 3-D scene, objects at different distances have different degrees of image blur. In a stereogram, all depth planes are in focus.

2. Absence of perspective Lack of size and linear perspective and texture gradients in a random-dot stereogram indicates that the display is flat.

3. Absence of motion parallax This point is vividly illustrated when one moves the head from side to side while viewing a random-dot stereogram containing a central square in depth. The zero motion parallax causes the square to appear to move with the head, rather than remain in a stable position.

4. Figure-ground effects A cyclopean form in a random-dot stereogram may appear in front of the background because of the general tendency for a form to appear in front of a ground. Thus the monocular cue of figure-ground operates on the cyclopean form. This effect may be responsible for the perception of depth in a random-dot stereogram in which the cyclopean form is generated by rivalry rather than by disparity, an effect known as rival depth (Section 17.5).

5. Monocular zones In standard random-dot stereograms, a region of dots in one eye is shifted by an integral number of dots to create a sharp discontinuity of disparity. Along vertical discontinuities, a column of dots is visible to one eye but not to the other. Monocular zones can create depth in a random-dot stereogram, in addition to, or instead of, depth created by disparity (Section 17.2). While a monocular zone is a binocular depth cue, it is not a disparity cue.
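The construction of a standard random-dot stereogram, in which a region of dots is shifted by an integral number of dots, can be sketched in a few lines. This is a minimal illustration rather than the procedure of Section 24.1.5: the function name `make_rds`, its parameters, and the choice to refill the uncovered strip with fresh random dots are all assumptions.

```python
import random


def make_rds(size=64, square=24, shift=2, seed=0):
    """Return (left, right) binary dot arrays of size x size.

    A central square of the left image is copied into the right image
    displaced `shift` dots to the left. The strip uncovered by the shift
    is filled with fresh random dots, so it has no match in the left image.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]  # surround is identical in both images

    lo = (size - square) // 2  # first row/column of the square
    hi = lo + square           # one past the last row/column
    if lo - shift < 0:
        raise ValueError("shift is too large for this square and image size")

    for r in range(lo, hi):
        for c in range(lo, hi):
            right[r][c - shift] = left[r][c]  # shifted copy of the square
        for c in range(hi - shift, hi):
            right[r][c] = rng.randint(0, 1)   # uncovered strip, one eye only
    return left, right


if __name__ == "__main__":
    left, right = make_rds(size=16, square=8, shift=2, seed=1)
    for l_row, r_row in zip(left, right):
        to_text = lambda row: "".join("#" if v else "." for v in row)
        print(to_text(l_row), " ", to_text(r_row))
```

Neither image alone contains a visible square. Shifting the region leftward in the right eye's image gives it a crossed disparity relative to the surround. The columns just left of the square in the left image, and the refilled strip in the right image, are the monocular zones described in item 5.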
Figure 18.5. Images of a random-dot stereogram in one eye. The shape of the disparate region is visible in superimposed images presented to one eye. Therefore, detection of the shape in dichoptic images does not require detection of disparity. It only requires correct matching of one set of random dots.
In his 1960 paper, Julesz reported that the impression of depth in a random-dot stereogram survives a considerable addition of noise to one of the images (Figure 18.6A). The noise creates lacy depth superimposed on the depth planes defined by disparity. He also reported that depth is seen when alternate points in alternate lines are black in one eye and white in the other, so that no pair of corresponding
points has identical neighbors. However, stereopsis fails when all pairs of dots have reversed luminance polarity. Julesz found that depth is difficult to see when the dots in the disparate region are isolated among dots that are spatially uncorrelated in the two eyes (Figure 18.6B). Uncorrelated dots thus destroy the impression of a coherent depth plane. However, lacy depth is seen in random-dot stereograms that do not form coherent depth planes, as long as dot density is not too great. Relative motion of totally uncorrelated dot patterns can create motion-in-depth (see Section 31.3.5). Stereopsis also survives considerable blurring of one image. Julesz found that depth in a random-dot stereogram takes longer to see than that in a normal stereogram. However, the time needed to see cyclopean depth becomes shorter after repeated trials (Section 18.14.2). Depth is much easier to see when an outline is placed around the disparate region or when the disparate region is made lighter in one eye than in the other. Julesz realized that corresponding points in the two images cannot be found by a simple cross-correlation process in which zones of defined size in the two eyes are compared. If the compared zones are small, one cannot account for the resistance of the impression of depth to perturbations of the dots. If the zones are large, one cannot account for how very small
regions of disparity can give rise to depth. In normal scenes, the image-linking process is carried out at several levels of image size and with respect to a variety of tokens (see Section 17.1). The following clinical screening tests for stereoscopic vision are based on the random-dot stereogram. They have the great advantage over traditional tests of containing no monocular cues to indicate the correct response. Some care must be taken in interpreting the results, since some people with otherwise normal stereoscopic vision have difficulty fusing random-dot stereograms, especially if they cannot correctly focus and fuse the stimulus.
18.2.3b TNO Test The TNO Test (Alfred Poll Inc., 40 W. 5th Street, New York, NY 10019) was developed by Walraven (1975) of the Institute for Perception at Soesterberg in the Netherlands. It consists of six random-dot stereograms printed as red/green anaglyphs to be viewed through spectacles fitted with red and green filters. Three of the stereograms are designed for children and have a monocularly visible object, in addition to a cyclopean object. The monocularly visible object gives children something to see and lets them believe they have done well when they do
Figure 18.6. Effects of noise on random-dot stereogram. (A) A square is still seen in depth after the addition of considerable Gaussian noise. (B) Depth of the square is difficult to see after addition of uncorrelated dots. Lacy depth is seen over the whole display. (From Julesz 1960)
not see the cyclopean object. Each of the three plates designed for adults contains four cyclopean disks, each with a missing sector in one of four positions. When viewed at a distance of 40 cm, the disparities in the disks range from 15 to 480 arcsec. For a given viewing distance, disparity is changed in discrete steps rather than continuously. A continuous change in disparity can be achieved by rotating each stereogram so that horizontal disparity is gradually converted into vertical disparity (Reading and Tanlamai 1982). However, this procedure introduces unwanted vertical disparities. It has been claimed that the TNO test diagnoses stereoblindness in children more reliably than the traditional Stereo Fly test (Walraven 1975). The cutoff point of 240 arcsec correctly identified 63% of children with stereo deficiency associated with squint or amblyopia (Williams et al. 1988). However, Heron et al. (1985) found that children had difficulty understanding this test and that it showed poor test/retest reliability. It underestimates stereoacuity relative to other tests and may fail to reveal amblyopia in children (Avilla and von Noorden 1981). There is evidence that stereoacuity is adversely affected by red-green anaglyphs, and children under 2 years of age tend to refuse to wear the glasses (Broadbent and Westall 1990). Red-green filters reduce stereoacuity when worn for the Randot test, in which image separation is achieved by polaroids (Cornforth et al. 1987), and when worn for the Howard-Dolman test (Larson 1988). The effective contrast of the image seen through the green filter is twice that of the red image. Simons and Elhatton (1994) found 8 out of 12 anisometropic amblyopes showed higher stereoacuity on the TNO test when the amblyopic eye saw the green image rather than the red image. It seems that the green image partially compensates for the reduced contrast sensitivity of the amblyopic eye.
This artifact can be at least partially overcome by measuring stereoacuity with the anaglyph glasses in both positions.
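The rotation procedure mentioned above for the TNO plates can be made concrete with simple trigonometry: rotating a plate in its own plane splits a fixed disparity into horizontal and vertical components. The sketch below assumes a pure in-plane rotation and ignores everything else about the plates; the function name `rotated_disparity` is hypothetical.

```python
import math


def rotated_disparity(disparity_arcsec, rotation_deg):
    """Split a disparity into (horizontal, vertical) components.

    Assumes the stereogram is rotated in its own plane by `rotation_deg`,
    so the original horizontal disparity acts as a vector of fixed length.
    """
    theta = math.radians(rotation_deg)
    horizontal = disparity_arcsec * math.cos(theta)
    vertical = disparity_arcsec * math.sin(theta)
    return horizontal, vertical
```

Under this assumption, any reduction of the horizontal component is accompanied by a growing vertical component, which is the unwanted vertical disparity that the text notes.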
18.2.3c Random-Dot E Stereo Test The Random-Dot E Test (Stereo Optical Co., N. Kenton Avenue, Chicago, IL, 60641) was devised by Reinecke and Simons (1974) as a screening test for amblyopia (Figure 18.7). The subject views a random-dot stereogram with polaroid spectacles. A capital letter E stands out from a background with a disparity that varies with viewing distance. The subject indicates the position of the letter or which of two cards has a letter. Subjects are shown a 3-D model of a letter E on a random-dot background so that they know what to look for in the stereogram. At viewing distances between 20 cm and 6 m, the test has a disparity range between 42 and 1261 arcsec. Fricke and Siderov (1997) found that most subjects could distinguish between the plates of the Random-Dot E test with monocular viewing, although they could not select the
Figure 18.7. The Random-dot E Stereo Test. (Stereo Optical Company, Inc.)
plates containing the letter E. The effectiveness of this test in diagnosing visual deficiencies in children has been assessed by Hammond and Schmidt (1986) and Schmidt (1994).
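The dependence of the Random-Dot E disparity on viewing distance follows from the fact that the lateral offset printed on the card is fixed, so its angular size varies inversely with distance. A minimal sketch under the small-angle approximation, with the hypothetical helper `scaled_disparity`:

```python
def scaled_disparity(ref_arcsec, ref_distance_m, distance_m):
    """Angular disparity of a fixed printed offset at a new viewing distance.

    Small-angle approximation: angle is proportional to offset / distance,
    so disparity scales with ref_distance_m / distance_m.
    """
    return ref_arcsec * ref_distance_m / distance_m
```

Taking the 42 arcsec quoted for 6 m as the reference, `scaled_disparity(42, 6.0, 0.2)` gives about 1260 arcsec at 20 cm, which agrees with the quoted upper value to within rounding.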
18.2.3d Randot Test The Randot Test (Stereo Optical Co.) consists of three sets of random-dot stereograms. The first set consists of six stereograms containing various shapes with 10 arcmin of disparity when viewed at 40 cm. The second set consists of eight diamonds, each containing four stereoscopic circles, one of which has a crossed disparity of between 20 and 400 arcsec. The third set consists of 15 random-dot stereograms with between 100 and 400 arcsec of disparity. Subjects must identify the stereoscopic shapes or detect which circle is in crossed disparity. Zanoni and Rosenbaum (1991) devised a version of the test that involves the use of liquid crystal shutter glasses. A preschool version of the Randot test has been developed (Birch and Salomao 1998).
18.2.3e Lang Stereo Test The Lang (1983) Stereo Test (Forch, Switzerland) consists of random-dot stereograms of silhouettes of a star, car, and cat presented as a lenticular-sheet stereogram on a 9.5-by-14.5-cm card with 23 stripes per centimeter for each eye. The test does not require a viewing device or spectacles. It is not suitable for measuring stereoacuity, but only as a screening device. Lang and Lang (1988) provided some norms.
18.2.4 CORRELATIONS BETWEEN STEREO TESTS
There have been several investigations of correlations between different stereo tests. Fagin and Griffin (1982) devised charts for converting the results of six tests of stereopsis to a common scale of disparity. Although the Howard-Dolman and the Keystone tests showed good test-retest reliability, there was only a low correlation between them (Warren 1940; Hirsch 1947; Harwerth and Rawlings 1977; Reading and Tanlamai 1982). Tests differ in their efficiency. For example, Hall (1982) found that only 9% of a group of 67 adults attained a stereoacuity of 15 arcsec on the TNO test but that 21% attained this level on the Frisby stereotest. Broadbent and Westall (1990) found that the Frisby test was passed by 20% of children under 1 year of age, while no child under that age passed the TNO test. Simons (1981) compared the efficiency of the Random-Dot E test, the Frisby test, and the TNO test in diagnosing stereo deficiencies. The Random-Dot test was best in grading patients with defective stereopsis. Marsh et al. (1980) compared the ability of different tests to discriminate between normal children and children with strabismic or anisometropic amblyopia. The low correlation between different stereo tests may be due to any of the following factors:
1. Tests differ in the size, shape, and familiarity of the figures, and in the procedure for image fusion.
2. Most laboratory tests involve discrimination of opposite disparities, while clinical tests involve detection of a simple disparity (Fahle et al. 1994).
3. Stereo tests determine the least disparity for the perception of depth, but suprathreshold stereo efficiency may be a more useful measure. Somers and Hamilton (1984) found that accuracy in pointing to the tip of the wing of the Titmus fly or to the “E” in the Random Dot E test was highly correlated with stereoacuity determined in the usual way. This suprathreshold measure of stereoscopic ability is more difficult to apply but is immune to effects of spurious depth cues evident in the forced-choice procedure of standard tests.
4. Standard clinical tests involve stationary stimuli. Zinn and Solomon (1985) used 33 subjects to compare the Titmus and TNO tests with a dynamic test in which subjects had to select the nearest of four stereoscopic circles approaching along a linear track at about 30 cm/s. Neither of the static tests showed a significant correlation with the dynamic test. Stereoscopic acuity for motion-in-depth is discussed in Section 31.2.5.
5. Subjects have no difficulty fusing the images in the Howard-Dolman test. They usually have no difficulty in standard dichoptic tests, because the outline of each image is well formed in each eye. Some subjects showing evidence of stereoscopic ability on a standard test, such as the Titmus test, fail a random-dot test. However, some of these subjects pass the random-dot test when the disparate regions are outlined in each eye, presumably because this helps them to converge correctly on the disparate region (Frisby et al. 1975).
6. Performance on stereo tests involving real objects improves with increasing interocular distance, since this increases the stereobase and hence the disparity. However, performance on stereogram tests becomes worse with increasing interocular distance (Lang et al. 1991). This is because increasing interocular distance in a stereogram has no effect on disparity but increases vergence and causes the stimulus to appear nearer and with less depth.
7. Stereoacuity could perhaps be affected by differences in viewing distance. However, Wong et al. (2002) found that changing viewing distance from 40 cm to 518 cm had no significant effect on stereoacuity measured by the Random-Dot E test, with other variables held constant.
8. In some tests, such as the Titmus Circle test and the Randot test, that use polaroid viewers, the overlapping images may be detected with one eye. Therefore these tests may indicate that there is some stereoscopic vision when other tests reveal that there is none (Fawcett and Birch 2003).
18.3 BASIC FEATURES OF STEREOACUITY
18.3.1 LIMITS OF STEREOACUITY
About 97% of adults have a stereoacuity of 2 arcmin or better, and 80% have a stereoacuity of 30 arcsec or better when tested with the Keystone stereo test (Coutant and Westheimer 1993). For selected subjects under the best conditions, stereoacuity is in the range 2 to 6 arcsec. A stereoacuity of 2 arcsec is equivalent to detecting a depth interval of 4 mm at a distance of 5 m. Monkeys have stereoacuities in the same range (Sarmiento 1975; Harwerth and Boltz 1979a, 1979b; Harwerth et al. 1995). Under favorable conditions, stereoacuity is at least as good as vernier acuity (Section 18.11). Tests of stereopsis used in the clinic are not designed to measure the stereo threshold since they do not present disparities less than 15 arcsec. In clinical practice, stereoacuity better than 40 arcsec is regarded as an indication of stereoefficiency in adults.
For a stereoacuity of 1 arcmin, the depth interval between a point at a distance of 215 m and a point at infinity can just be detected. At distances beyond 215 m, no depth intervals can be detected on the basis of a stereoacuity of 1 arcmin. With a stereoacuity of 1 arcsec, depth intervals between objects beyond a distance of about 13 kilometers cannot be detected. Teichner et al. (1955) found that the ability to discriminate depth between two objects placed on an airstrip ceased to be better with binocular viewing than with monocular viewing at a distance of about 1000 m. Differences in stereoacuity between crossed and uncrossed disparities are reviewed in Section 18.6.4. The effects of stimulus variables on stereoacuity are discussed in Sections 18.5, 18.6, and 18.7. The development of stereoacuity was reviewed in Section 7.5.
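The distances quoted in this section can be checked approximately with the standard small-angle geometry of binocular disparity. The sketch below assumes an interocular distance of 6.3 cm, a value chosen for illustration and not stated in the text, and uses the approximations that the disparity between a point at distance d and a point at infinity is a/d radians, and that a small depth step Δd at distance d produces a disparity of about aΔd/d². The function names are hypothetical.

```python
import math

ARCSEC = math.radians(1 / 3600)  # one second of arc in radians
IOD_M = 0.063  # assumed interocular distance in metres (not from the text)


def depth_interval(stereoacuity_arcsec, distance_m, iod_m=IOD_M):
    """Smallest detectable depth step at a given distance (small angles)."""
    return stereoacuity_arcsec * ARCSEC * distance_m ** 2 / iod_m


def max_stereo_range(stereoacuity_arcsec, iod_m=IOD_M):
    """Distance beyond which a point cannot be told apart from infinity."""
    return iod_m / (stereoacuity_arcsec * ARCSEC)
```

With these assumptions, `depth_interval(2, 5)` comes to a little under 4 mm, `max_stereo_range(60)` to roughly 217 m, and `max_stereo_range(1)` to roughly 13 km, which is close to the values given in the text. The exact figures shift with the interocular distance assumed.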
18.3.2 DETECTION OF ABSOLUTE AND RELATIVE DISPARITIES
18.3.2a Sensitivity to Absolute Disparity The visual system is particularly sensitive to relative disparities between objects at different distances. The threshold for the detection of a change in depth of a single line stepped from zero disparity to either a crossed or uncrossed disparity was found to be about 1 arcmin. When the stepped stimulus was flanked by parallel lines that remained in the fixation plane, the threshold was only a few seconds of arc (Westheimer 1979a) (Portrait Figure 18.8). Thus, a temporal sequence of absolute disparity in a single object is detected less well than a sequence of relative disparities between two or more objects. An absolute disparity in the whole of a stimulus provides no information about viewing distance. However, absolute disparity provides a stimulus for vergence eye movements. Theoretically, people could detect the depth interval between a single target outside the horopter and the plane of convergence when there is no comparison stimulus in view. In other words, they could perceive relative depth based on the absolute disparity of a single object relative to where the eyes are converged. Although horizontal disparity provides no information about viewing distance, a change in disparity contains information about the motion of an object in depth. It has been claimed that changing of disparity in the whole of a textured surface creates no change in perceived depth (Erkelens and Collewijn 1985; Regan et al. 1986). However, we will see in Section 31.3.2 that changes in absolute disparity do produce motion-in-depth when the inhibitory effects of unchanging perspective are removed. The visual system is less sensitive to simple disparity gradients than to spatial discontinuities of disparity. For example, a disparity difference between a test line and comparison line was more precisely detected when the
Figure 18.8. Gerald Westheimer. Born in Berlin in 1924. In 1938 the Westheimer family fled to Australia from Nazi Germany. He trained as an optometrist at Sydney Technical College and obtained a B.Sc. at Sydney University. In 1953 he obtained a Ph.D. in physiological optics with Glenn Fry at Ohio State University. He was a faculty member at the University of Houston and Ohio State University before joining the faculty at the School of Optometry in Berkeley in 1960, where he remained until he retired. He is a fellow of the Royal Society of London and recipient of the Tillyer medal, Proctor Medal, Prentice Medal, Bicentennial medal, and Sallmann Prize.
lines were superimposed on a frontal plane display of dots than when presented in isolation (Glennerster and McKee 1999). A system that responds to a disparity gradient uncouples relative disparity signals from an absolute disparity distributed over the stimulus. The uncoupled signal is independent of fluctuations in vergence and provides a mechanism suited for detecting local depth gradients that indicate the shapes of 3-D objects. This issue is discussed in Section 20.5. The perception of relative disparities is discussed further in Section 21.3.
18.3.2b Disparity Discrimination and Reference Surfaces Local perturbations of depth applied to a smooth surface are detected more easily than when applied to a rough surface (Norman et al. 1991). Thus, subjects easily discriminated between a fully coherent random-dot surface with a low spatial-frequency depth modulation and the same
surface with 4% of incoherent dots. As more incoherent points were added, the threshold for detecting a difference in coherence increased. A coherent surface provides a consistent reference against which to judge changes in disparity. This is intuitively obvious with bolder stimuli. A single stimulus in a set is easily detected when it is the only one that does not lie in a plane, but is lost when added to a set of stimuli lying in different planes. We will see in Section 22.8.2 that simple depth features are processed in parallel and that an object with a given depth immediately pops out when presented in the context of other objects with distinct depth features. Glennerster and McKee (1999) superimposed two 1˚ vertical lines on a regular lattice of black dots, which had a horizontal disparity appropriate to a surface slanted about a vertical axis. The surface appeared more frontal than indicated by disparity, and the two lines appeared frontal when the disparity gradient between them was similar to that of the surface. In other words, the surface normalized to the frontal plane and shifted the apparent relative depth of the lines accordingly (Section 21.4.1). The depth discrimination threshold for the two lines was at a minimum when they appeared to lie in the same depth plane rather than when they were actually in the same depth plane. This suggests that depth normalization of the lattice rescaled disparities. Petrov and Glennerster (2004) investigated detection of the 3-D position of a point relative to two other points, as shown in Figure 18.9. Subjects reported in which of two sequential displays point T was displaced from position 0 to either position 1 or position 2. Performance was not much affected by the relative disparities between point T and the reference points A and B, or by the disparity curvature of the three points. The most important factor was the disparity of point T with respect to the imaginary inclined line joining points A and B.
In other words, the stereoscopic system is designed to detect departures
Figure 18.9. Display used by Petrov and Glennerster (2004). Subjects reported in which of two intervals object T was displaced from position 0 to position 1 or 2 relative to fixed points A and B. (Diagram labels: reference points A and B; test positions T0, T1, and T2; viewing direction.)
from real or imaginary surfaces or lines in 3-D space. Glennerster and McKee (2004) found that this result did not depend on the perceived slant of the reference surface. Petrov and Glennerster (2006) extended their previous findings. When the reference points were in a frontal plane, detection of the relative depth of the test point (point T) was best when it was near one or the other reference point. When the imaginary line between the reference points was inclined in depth, depth acuity was mainly a function of the distance of the test point relative to the imaginary line rather than of its proximity to one or the other reference point. However, instructions were not clear in these experiments. Subjects could have judged the depth of the test point with respect to the imaginary line or with respect to the nearest reference point. The two types of judgment would be the same when the reference points were on a frontal plane, but not when they were on an inclined plane. Perhaps a change in instructions would change the results. Discrimination of differences in slant about a vertical axis is discussed in Section 20.2.2. Discrimination of inclination and curvature about a horizontal axis is discussed in Sections 20.3 and 20.5. Depth discrimination is compared with equivalent monocular tasks in Section 18.11.
18.3.3 STEREOACUITY AWAY FROM THE HOROPTER
18.3.3a Disparity Pedestals Subjects can be asked to fixate on a point and detect a difference in depth between two neighboring stimuli, both with crossed or uncrossed disparity. The two stimuli are said to be on a disparity pedestal equal to the disparity in the nearer stimulus with respect to the fixated point, as illustrated in Figure 18.1B. Ogle (1953) asked subjects to set a thin line to the same apparent depth as a comparison line set at various pedestal disparities with respect to a fixation point. The depth-discrimination threshold increased exponentially with increasing pedestal disparity. Blakemore (1970c) placed two 2.25-arcmin-wide luminous slits, one above and one below a fixation point. They were exposed for 100 ms against a dark background. He measured depth discrimination as a function of the disparity in both slits with respect to a fixation point. Like Ogle, he found that the discrimination threshold increased exponentially with increasing pedestal disparity. For centrally placed stimuli and stimuli at 5˚ eccentricity, the rate of increase of the threshold on a logarithmic scale was steeper for crossed than for uncrossed disparity pedestals (Figure 18.10). Krekling (1974) and Westheimer and McKee (1978) reported similar results. Depth is produced in random-dot stereograms with up to 2˚ of overall disparity (Section 18.4.1).
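An exponential rise of threshold with pedestal disparity means that equal steps of pedestal multiply the threshold by a constant factor, so the function plots as a straight line on a logarithmic threshold axis. The sketch below illustrates this form only. The base threshold and the two rate constants are hypothetical values chosen for the example, not fitted to the data of Ogle or Blakemore, and the convention that positive pedestals are crossed is also an assumption.

```python
import math


def pedestal_threshold(pedestal_arcmin, base_arcmin=0.3,
                       rate_crossed=0.04, rate_uncrossed=0.03):
    """Illustrative exponential threshold function (hypothetical parameters).

    Positive pedestals are treated as crossed, negative as uncrossed.
    A larger rate for crossed pedestals gives the steeper slope on a
    logarithmic threshold axis.
    """
    rate = rate_crossed if pedestal_arcmin >= 0 else rate_uncrossed
    return base_arcmin * math.exp(rate * abs(pedestal_arcmin))
```

Because the log of the threshold is linear in the pedestal, the ratio between thresholds at 20 and 40 arcmin equals the ratio between those at 40 and 60 arcmin in this model.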
Figure 18.10. Stereoacuity and absolute disparity. Disparity threshold for detecting relative depth of two vertical lines as a function of the absolute disparity in the lines (depth pedestal). Plots are for central stimuli (red), and stimuli at 5˚ (blue) and 10˚ (green) eccentricity. Filled symbols, subject 1, closed symbols, subject 2. (Adapted from Blakemore 1970c) Axes: threshold disparity (arcmin) against disparity relative to the horopter (arcmin), convergent and divergent.
Popple and Findlay (1999) measured the threshold for detecting depth between a 1.7˚ central test disk and a surrounding disk on a disparity pedestal of +24 arcmin. The threshold fell as the diameter of the comparison disk increased from 2.6 to 8˚. Andrews et al. (2001) used a vertical test line and a comparison line placed below it. They controlled for effects of vergence by exposing the stimuli for only 150 ms. The stereo threshold increased with increasing pedestal disparity of the comparison line. The lines were then presented in a 1˚-high by 3˚-wide blank region in a random-dot display. For a given disparity of the comparison line, the stereo threshold was about halved when the random-dot display had the same disparity. The random-dot display, like the larger disk in the Popple and Findlay study, increased the reliability of estimates of the disparity of the comparison stimulus. Badcock and Schor (1985) used two elongated DOG patches with center spatial frequencies between 0.15 and 11.6 cpd. Note that the width of a DOG increases with decreasing spatial frequency. With fixation on a point, the standard patch had a crossed or uncrossed pedestal disparity of between 10 and 80 arcmin. The test patch was pulsed on for 750 ms with various disparities. Figure 18.11 shows that the stereo threshold increased exponentially with pedestal disparity, with a slope that was not much affected by the spatial frequency of the stimulus (see also Siderov and Harwerth 1995). Badcock and Schor found that the threshold tended to level off for disparities of more than about 20 arcmin. On the other hand, Ogle (1953) and
Blakemore (1970c) had found that the disparity discrimination threshold increased exponentially up to a pedestal disparity of 2˚. The break in the continuity of the discrimination function reported by Badcock and Schor suggests that disparity coding involves two processes. One possibility is that disparity detectors for fine disparities rapidly lose sensitivity with increasing disparity, and that detectors of large disparities have a more constant response with increasing disparity. Another possibility is that, as stimuli become diplopic with large disparities, subjects do not see depth but consciously use the relative separations of the diplopic images (see McKee et al. 1990a). Blakemore controlled for this factor by jittering the separation of the images from trial to trial. Siderov and Harwerth (1993b) found that, after jittering image separation, depth-discrimination functions were still discontinuous, as reported by Badcock and Schor, rather than continuous, as Blakemore reported. When Siderov and Harwerth took the extra precaution of randomly varying crossed and uncrossed disparities, the discrimination functions became exponential and continuous. They concluded that the disparity discrimination threshold shows a continuous exponential rise with increasing pedestal disparity and that the leveling off of the function at larger disparities reported by Badcock and Schor was due to the intrusion of judgments based on explicit width matching. McKee et al. (2005) asked whether sensitivity to relative disparity between a bar and a grating depends on the phase disparity of the grating or on the disparity of the grating as a whole, as depicted in Figure 18.12. Subjects fixated a small point, which was then replaced by the grating and a comparison bar placed 1˚ below it at a specified disparity. Exposure was 200 ms to prevent vergence changes.
Subjects reported the depth of the bar relative to the grating as the disparity of the grating was varied from trial to trial around the disparity of the bar. If depth-discrimination depended on the phase disparity of the grating, the threshold should have first increased and then decreased as the images came into phase again at a disparity of 20 arcmin. However, the depth-discrimination threshold increased monotonically as the disparity of the whole grating increased to 20 arcmin, as did the perceived depth of the grating relative to the fixation plane. Thus, the way the images of the grating were linked in the visual system was determined by their overall disparity rather than by the local phase disparity. Presumably, local phase disparities are used when there are no competing disparities. The stereogram shown in Figure 20.10C does not create an impression of slant about a vertical axis even though the whole image in one eye is magnified 10% with respect to that in the other eye (Julesz 1963). In Section 19.3.3 it is argued that this is because perceived slant arises from the difference between horizontal-size disparity and vertical-size
Figure 18.11. Depth discrimination as a function of the disparity of the standard stimulus. The threshold for discrimination of depth between test and standard stimuli as a function of the pedestal disparity of the standard. In panels A and B both stimuli were DOG patches with a center spatial frequency indicated by each curve. In panels C and D the standard stimulus was a thin bright bar, and the test stimulus was a DOG. Panels B and D show the same data as panels A and C with thresholds plotted on linear scales. Results for one subject. (Redrawn from Badcock and Schor 1985) Axes: stereo threshold (arcsec) against disparity pedestal (arcmin), uncrossed and crossed.
disparity rather than from horizontal-size disparity alone. Superimposing a step disparity on a differential magnification of the images is equivalent to placing the disparity on a disparity pedestal. One would expect stereoacuity to be lower than for a step disparity superimposed on a zero disparity display. In line with this expectation, Reading and Tanlamai (1980) found that the stereo threshold in random-dot stereograms and in several standard stereoacuity tests was elevated in proportion to the overall magnification of the image in one eye. The question of how well people extract different components of disparity, such as horizontal disparity, orientation disparity, and differential magnification when they are presented simultaneously has not been systematically explored.
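The contrast drawn above between the phase disparity and the overall disparity of a grating can be illustrated numerically. For a periodic pattern, the local phase disparity wraps around every period, whereas the overall disparity keeps growing. The sketch below assumes a grating period of 20 arcmin, inferred from the statement that the images of the McKee et al. (2005) grating came back into phase at 20 arcmin; the function name `phase_disparity` is hypothetical.

```python
def phase_disparity(disparity_arcmin, period_arcmin):
    """Wrap an overall disparity into the range [-period/2, period/2).

    This is the smallest shift that brings the two periodic images into
    register, ignoring which cycle is matched to which.
    """
    half = period_arcmin / 2
    return (disparity_arcmin + half) % period_arcmin - half
```

With a 20-arcmin period, overall disparities of 0, 5, 10, 15, and 20 arcmin give phase disparities of magnitude 0, 5, 10, 5, and 0 arcmin. A threshold governed by phase disparity would therefore rise and then fall over this range, whereas the measured threshold rose monotonically.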
18.3.3b A Dipper Function in Disparity Discrimination A stimulus is most easily detected when it excites a detector at the peak of its tuning function. However, a change in
a stimulus along a feature continuum is discriminated best when the stimulus falls on a steep flank of the tuning function. Also, the difference between two stimuli is easiest to detect when the stimuli fall in the region where neighboring tuning functions overlap (Section 3.1.4b). As the value of the stimulus moves in either direction away from the steepest part of the tuning function, the discrimination threshold increases. The resulting U-shaped discrimination function is known as a dipper function. Consider a disparity detector for which the peak of the tuning function falls at zero disparity. It will respond best to a zero-disparity stimulus but it will be most sensitive to a change in disparity on the flank of its tuning function, where disparity is not zero. Accordingly, the ability to detect a difference in disparity between two neighboring stimuli should at first improve as both stimuli are moved away from the horopter (on a pedestal) before it finally falls. In all the experiments reported in the previous section, discrimination of depth was best when the comparison stimulus had zero horizontal disparity. The dip in the depth-discrimination function occurred at zero disparity.
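The argument that a single detector discriminates best on the flank of its tuning function can be sketched with a toy model. The code assumes one detector with a Gaussian tuning function peaking at zero disparity and a threshold equal to a fixed criterion change in response divided by the local slope. The Gaussian shape, the width of 5 arcmin, and the criterion are illustrative assumptions, not values from the text.

```python
import math


def tuning(disparity, peak=0.0, sigma=5.0):
    """Gaussian tuning function of a hypothetical disparity detector."""
    return math.exp(-((disparity - peak) ** 2) / (2 * sigma ** 2))


def increment_threshold(pedestal, peak=0.0, sigma=5.0, criterion=0.05):
    """Disparity change needed to alter the response by `criterion`.

    Uses the local slope of the tuning function (a linear approximation).
    At the peak the slope is zero, so the threshold is infinite.
    """
    slope = abs((pedestal - peak) / sigma ** 2 * tuning(pedestal, peak, sigma))
    return math.inf if slope == 0 else criterion / slope
```

In this model the threshold is lowest when the pedestal equals the width parameter `sigma`, where the Gaussian is steepest, and rises on either side, which gives the dipper shape. A population of detectors with peaks on either side of zero would shift the dip accordingly.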
Figure 18.12. Local phase disparity and overall disparity. For these images, the local phase disparity is zero, but the overall disparity is one period of the gratings. Perceived depth of the fused images relative to a bar in the plane of fixation is determined by the overall disparity. (Adapted from McKee et al. 2005)
This suggests that the tuning functions of detectors of horizontal disparity overlap at zero disparity and have their peaks on either side of zero. However, the smallest pedestal disparity used in these experiments was 10 arcmin. A dipper function that might occur over a range of smaller disparities would not be revealed. Also, the stimuli had a broad range of spatial frequencies. Small disparities engage detectors of high spatial frequency, and large disparities engage detectors of low spatial frequency (Section 18.7). Differences in discrimination functions for the different spatialfrequency components of the stimuli would wash out dipper functions that may be evident when each spatial frequency is tested alone. Farell et al. (2004a) used 8˚ by 8˚ vertical sinusoidal gratings and pedestal disparities of between 0 and 15 arcmin. The disparity of a test grating with respect to its boundary varied with respect to that of a comparison grating with a fixed pedestal disparity. The two gratings were presented sequentially in a temporal forced-choice procedure. The results for one subject for a grating of 1 cpd are shown in Figure 18.13. It can be seen that both the upper and lower discrimination thresholds declined as the pedestal disparity of the comparison stimulus increased to about 30˚ of phase disparity (equivalent to 5 arcmin of disparity). Similar dipper functions occurred for gratings with spatial frequencies of 2 and 3 cpd. The dipper function was not evident for a random-dot display containing many spatial-frequency components. These results suggest that disparity-tuning functions intersect at disparities on either side of zero, which agrees with tuning functions determined physiologically (Section 11.4.1). Duwaer and van den Brink (1982b) measured the threshold for discriminating between the vertical disparity of two horizontal test lines and that of two horizontal 300
Figure 18.13. Disparity threshold and pedestal disparity. (Adapted from Farell et al. 2004)
comparison lines. The threshold fell to a minimum as the disparity of the comparison lines increased from zero to 1.7 times above the threshold for detection of disparity. The threshold increased with larger disparities of the comparison lines. This is a dipper function for vertical disparity. It suggests that tuning functions of vertical-disparity detectors also intersect at disparities on either side of zero.
Georgeson et al. (2008) used a 10˚ by 10˚ random-dot stereogram depicting a surface with 3 or 6 cycles of horizontal depth corrugation. They obtained a depth-discrimination dipper function similar to that obtained by Farell et al. when the phase (vertical location) of depth corrugations was varied from trial to trial. However, the dipper function was not much in evidence when the phase of the corrugations was always the same. They argued that the dipper function was due to uncertainty about the location of the peaks and troughs (disparity sign) of the corrugation when phase was varied from trial to trial. A pedestal disparity helped to resolve this uncertainty. When phase was constant, the sign of disparity was easier to detect and a pedestal disparity was not needed. Their stimulus contained many spatial frequencies and was not optimal for revealing a dipper function arising from the distribution of tuning functions of disparity detectors.
18.3.4 STEREOACUITY AND RELATIVE IMAGE SIZE
There has been some dispute about how much a difference in the sizes of the images in the two eyes (aniseikonia) is
tolerated without loss of stereopsis. Ogle (1964) put the upper limit of tolerated aniseikonia for the perception of depth in the space eikonometer at 5%. Others reported that stereopsis in a normal visual environment is impossible when aniseikonia exceeds between 5 and 8% (Campos and Enoch 1980). However, Highman (1977) reported that stereopsis was present with aniseikonia up to 19% and Julesz (1960) reported that depth is still apparent in a random-dot stereogram when the images differed in size by 15%. Lovasik and Szymkiw (1985) found that all subjects retained stereopsis with up to 13% aniseikonia on the Randot and Titmus tests. Some subjects retained some stereoacuity with up to 22% of aniseikonia. The Randot test was more resistant to aniseikonia than the Titmus test. It seems that random-dot stereotests are particularly resistant to aniseikonia. Reading and Tanlamai (1980) found that the stereo threshold in random-dot stereograms and in several standard stereoacuity tests was elevated in proportion to the magnification of one eye’s image. Jiménez et al. (2002b) measured the range of disparities for which depth in a random-dot stereogram could be detected as a function of the magnification of the image in one eye. All subjects perceived depth in the stereogram at all levels of magnification up to 8˚. However, the range of disparities over which depth was reported decreased steeply as aniseikonia exceeded about 2˚. The degree of binocular summation of the human visual evoked potential started to decrease when the size of one image of a checkerboard stereogram was increased by 3% (Katsumi et al. 1986). With more than 8% magnification, binocular summation was replaced by binocular inhibition.
When an object is fixated with eccentric gaze, the image in the nasally turned eye is larger than that in the other eye. Ogle (1964) proposed that the visual system compensates for the size difference. Vlaskamp et al. (2009) argued that, if this compensation occurred before disparities are registered, the difference in image size produced by eccentric gaze would not degrade stereoacuity. They found that a difference in image size degraded stereoacuity by the same amount whatever the direction of gaze. This suggests that there is no compensation for eccentric gaze. However, a woman with a constant difference in image size had compensated for the difference.
18.3.5 IDEAL OBSERVER FOR STEREOACUITY
The concept of the quantal efficiency of a detector was discussed in Section 5.1.5. Quantal efficiency is the ratio of the performance of a human observer to that of an ideal detector limited only by the statistical fluctuations in the arrival of quanta (quantal noise). It provides an absolute reference for assessing the performance of a visual system.
The quantal efficiency of the human visual system is influenced by (1) quantal noise in the test stimulus, (2) the area and duration of the stimulus and the presence of extraneous stimuli, (3) the efficiency of the visual system in capturing and transducing light quanta, and (4) internal noise (Barlow 1958). Barlow (1978) introduced the related concept of statistical efficiency. This measure is designed for tasks involving detection of a suprathreshold patterned stimulus in the presence of extraneous stimuli. If the stimulus is well above threshold and adequately resolved, the ideal detector is not limited by quantal noise or optical factors, but only by extraneous stimuli. The method is applicable only to tasks defined with enough precision to allow one to specify the theoretical limit of performance.
Harris and Parker (1992) measured the statistical efficiency of two human subjects in detecting a vertical step of disparity in a random-dot pattern. The dots on one side of the step were set at zero disparity, and those on the other side were set at a disparity that varied from trial to trial. An extra horizontal disparity was imposed at random on all the dots, with a mean of zero and a standard deviation that varied from trial to trial. An ideal detector uses all information in the stimulus to detect the disparity step, given that it knows where the step will be when it is present. The performance of the ideal detector was specified by the mean disparity difference, obtained by subtracting the mean disparity of all the dot pairs on one side of the step from the mean disparity of pairs on the other side. The addition of noise with a standard deviation of $s_{\mathrm{noise}}$ caused the mean disparity of the $N$ dots on each side of the step to vary from trial to trial with a standard error of
$$ s = \frac{s_{\mathrm{noise}}}{\sqrt{N}} \qquad (2) $$
This is the standard error of the disparity signal. The mean disparity difference, $\Delta d$, divided by the standard error of the disparity signal, $s$, is the signal-to-noise ratio. This ratio defines the ideal discriminability of the stimulus, or
$$ d'_{\mathrm{ideal}} = \frac{\Delta d}{s} \qquad (3) $$
The statistical efficiency, $F$, of the human observer is the square of the ratio of the discrimination performance of the observer, $d'$, to the ideal discriminability of the stimulus:
$$ F = \left( \frac{d'}{d'_{\mathrm{ideal}}} \right)^{2} \qquad (4) $$
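Equations 2 to 4 can be chained in a few lines. The following is a minimal sketch, not code from Harris and Parker; the function names and the numerical values in the usage example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def standard_error(s_noise, n_dots):
    """Equation 2: standard error of the mean disparity of n_dots dots."""
    return s_noise / math.sqrt(n_dots)


def ideal_dprime(delta_d, s_noise, n_dots):
    """Equation 3: signal-to-noise ratio of the disparity step."""
    return delta_d / standard_error(s_noise, n_dots)


def statistical_efficiency(observer_dprime, delta_d, s_noise, n_dots):
    """Equation 4: squared ratio of observed to ideal discriminability."""
    return (observer_dprime / ideal_dprime(delta_d, s_noise, n_dots)) ** 2


# Hypothetical values: a 1-arcmin disparity step, disparity noise with a
# standard deviation of 2 arcmin, 100 dots on each side of the step, and
# a human observer performing at d' = 1.
efficiency = statistical_efficiency(1.0, 1.0, 2.0, 100)
print(f"statistical efficiency = {efficiency:.2f}")
```

With these illustrative values the standard error is 0.2 arcmin, the ideal d' is 5, and the efficiency is 0.04, or 4%.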
Stereoscopic efficiency was 20 to 30% when there were fewer than 30 dots, and fell to 2% as the number of dots increased to about 200. Changes in dot density, with number of dots constant, had little effect. Increasing the width of the stimulus, with dot density constant,
caused a drop in efficiency but only for high dot densities. It looks as though people use only a limited set of dots from the total available. Presumably, they use the dots closest to the disparity edge, since disparities in this region define a disparity gradient that is not subject to adaptation. Noise makes it difficult to detect a second-order gradient because it injects second-order disparities over the whole display. Also, with fixation on the disparity step, dots in that region fall on the fovea. We saw in Section 15.2.2 that observers use up to 10,000 elements when detecting interocular correlation in a random-dot display. In that case, the crucial information is not confined to one region of the display.
In a further study, Harris and Parker (1994a) found that the stereoefficiency of human observers declined as the variance of disparity noise increased. They explained this in terms of increased difficulty in matching the dots. When this factor was controlled by confining the dots to two well-spaced columns, stereo efficiency remained constant as noise amplitude increased. They argued that efficiency is limited by processes beyond image matching in these circumstances (see Harris and Parker 1994b).
18.4 UPPER DISPARITY LIMIT
18.4.1 UPPER LIMIT OF HORIZONTAL DISPARITY
18.4.1a Disparity Limits for Line Images
The upper disparity limit for stereopsis goes beyond the limit of binocular fusion. See Ono et al. (2007b) for an account of early disputes about the role of binocular fusion in stereopsis. Ogle (1952) measured the maximum disparity (dmax) between the images of a thin vertical line with respect to a fixation point that produced (a) depth with fused images, (b) clear depth with diplopic images, and (c) a vague impression of depth combined with diplopia (Portrait Figure 18.14). He called the strong impression of depth created with fused or diplopic images patent stereopsis and the vague impression of depth with more obvious diplopia qualitative stereopsis. In the foveal region, the fusional area extended to about ±5 arcmin, patent stereopsis extended to about ±10 arcmin, and qualitative stereopsis extended to about ±15 arcmin. At an eccentricity of 6˚, patent stereopsis extended to about 70 arcmin and qualitative stereopsis to about 2˚. We will see that these estimates are lower than those found in recent studies.
Westheimer and Tanzman (1956) improved on Ogle’s procedure by presenting the stimuli only briefly, to eliminate effects of changing vergence, and by using the method of constant stimuli to eliminate errors of anticipation. For disparities up to about 7˚, most subjects could detect whether the test target was nearer or further away than the
Figure 18.14. Kenneth N. Ogle. Born in Colorado in 1902. He obtained a B.A. from Colorado College in 1925 and a Ph.D. in physiological optics at the Dartmouth Medical School in 1930. He remained at Dartmouth, where he became professor of physiological optics. In 1947 he moved to the Mayo Clinic at the University of Minnesota, where he remained until he died in 1968. Recipient of the Tillyer Award of the Optical Society of America in 1967.
fixation spot. Subjects detected the relative depth of a stimulus with uncrossed disparity more reliably than that of a stimulus with crossed disparity. Using a similar procedure, Blakemore (1970c) found that the sign of depth of a midline slit relative to a fixation point was categorized well above chance for crossed disparities of between 4 and 7˚ and for uncrossed disparities of between 9 and 12˚. Richards and Foley (1971) used a signal-detection paradigm and found even higher disparity limits. The ability to detect the sign of depth of a 100-ms stimulus relative to a fixation stimulus at a distance of 170 cm began to deteriorate at a disparity of 1˚ or less. However, the single subject still performed above chance with an uncrossed disparity of 8˚ or a crossed disparity of 16˚. This degree of disparity corresponds to that produced by an object at a distance of 21 cm with the eyes converged at 170 cm. Foley et al. (1975) (Portrait Figure 18.15) exposed stereoscopic images of a vertical bar for 40 ms with various disparities relative to a fixation point. Subjects indicated whether the bar was nearer than or beyond the fixation point and pointed with unseen hand to the bar. Categorization of depth order began to deteriorate at
Figure 18.15. John M. Foley. Born in Springfield, Massachusetts, in 1936. He obtained a B.A. in physics at the University of Notre Dame in 1958 and a Ph.D. in psychology at Columbia University in 1963. Since 1963 he has been at the Department of Psychology at the University of California, Santa Barbara, where he is now a research professor.
disparities of about 2˚, whereas pointing accuracy continued to improve up to a disparity of about 4˚ and was still above chance at a disparity of 8˚, the highest disparity tested. It is unlikely that detection of depth from disparities of several degrees is mediated by a conventional disparity-detection mechanism. Pointing to a midline rod with a disparity well beyond the diplopia threshold could be based on the image in only one eye. This issue was discussed in Section 17.6.5. The following sections are concerned with stimulus factors that determine the upper limit of disparity that creates depth (dmax).
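The correspondence quoted above for Richards and Foley (1971), between a crossed disparity of about 16˚ and an object at 21 cm with the eyes converged at 170 cm, follows from the difference between two vergence angles. The sketch below assumes a midline object, symmetric convergence, and an interocular distance of 6.5 cm; the interocular distance is an assumption, since the text does not state the value used, and the function names are illustrative.

```python
import math


def vergence_angle_deg(distance_cm, interocular_cm=6.5):
    """Angle subtended at a midline point by the two eyes, in degrees."""
    return math.degrees(2.0 * math.atan((interocular_cm / 2.0) / distance_cm))


def absolute_disparity_deg(object_cm, fixation_cm, interocular_cm=6.5):
    """Disparity of a midline object relative to the fixation distance.

    Positive values are crossed disparities (object nearer than fixation).
    """
    return (vergence_angle_deg(object_cm, interocular_cm)
            - vergence_angle_deg(fixation_cm, interocular_cm))


disparity = absolute_disparity_deg(object_cm=21.0, fixation_cm=170.0)
print(f"crossed disparity = {disparity:.1f} deg")
```

With the assumed 6.5-cm interocular distance the result lies between 15˚ and 16˚. A slightly larger interocular distance brings it closer to the 16˚ cited.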
18.4.1b Hysteresis in the Disparity Limit for Stereopsis
The present section is concerned with the relationship between the diplopia threshold and the limiting disparity required for the perception of depth in random-dot stereograms. Fender and Julesz (1967) used a 3.4˚-wide retinally stabilized, random-dot stereogram. A central square stood out in depth by a fixed disparity, and the disparity of the
whole display was increased or decreased at 2 arcmin/s. They concluded, “for random-dot stereoscopic images there is no difference between fusion thresholds and the thresholds for stereopsis.” This is a strange result since it is well known that stereoscopic depth can be perceived with diplopic images. A diplopic array of closely spaced random dots has a hazy rivaling appearance compared with the planar appearance of a fused array. The fusion limit for random-dot displays is ill defined. Oddly, the stereogram in their paper consists of two uncorrelated random-dot displays. One subject saw depth in retinally stabilized stereograms when the images were separated horizontally up to 2˚ and vertically up to about 20 arcmin. We are not told whether the square in depth still appeared smooth or whether it took on a hazy appearance. Depth was not seen in initially unfused images until they were within 6 arcmin of each other horizontally and 1 arcmin vertically. Thus, a larger horizontal disparity limit and a larger hysteresis effect were obtained with the criterion of perceived depth using a random-dot stereogram than with the criterion of diplopia using the line target. However, it is not clear from this comparison whether the crucial factor was the criterion or the type of display. Using a similar procedure, Piantanida (1986) measured crossed and uncrossed disparity limits for seeing a cyclopean form in a retinally stabilized random-dot stereogram. The range (sum of crossed and uncrossed limits) was between 68 and 150 arcmin for increasing disparity and between 46 and 96 arcmin for decreasing disparity. Thus, there was a small hysteresis effect, although a cyclopean form in depth was regained at a much larger disparity than in the Fender and Julesz study. Piantanida reported that the stereograms still appeared fused after the cyclopean shape could no longer be seen. However, the criterion for fusion was the elongated appearance of the square outline of the stereogram. 
He did not report whether the surface of the stereogram appeared as a flat plane or as hazy depth. Loss of fusion of the matching sets of dots may have occurred before, not after, the loss of the cyclopean shape. Hyson et al. (1983) approached the issue of stereopsis hysteresis with a different procedure. They presented a 9.8˚-wide random-dot stereogram with a pattern of disparities that created a spiral in depth. Subjects were free to change convergence between different depth planes within the stereogram. The two images were slowly separated laterally while vergence eye movements were measured. The extent to which vergence failed to keep up with the separation of the images gave a measure of the residual overall disparity between the images. The spiral in depth could be seen with up to 3˚ of absolute disparity. As soon as depth was lost, the eyes returned to their original converged position. The displays were then brought slowly together until the impression of depth returned. On average, it returned 2.6˚ in from the point where depth had been lost. Thus, the disparity limit for maintained depth
and the hysteresis effect were even larger than with the smaller stereogram used by Fender and Julesz. These large tolerated disparities need not be regarded as extensions of Panum’s fusional area, since the criterion was perceived depth, not diplopia. The subjects saw depth produced by a fixed relative disparity in a random-dot stereogram with up to 3˚ of overall horizontal disparity. This is equivalent to the task of registering a disparity superimposed on a disparity pedestal, as discussed in Section 18.3.3. For instance, reliable relative depth judgments were made between two lines when they were up to 2˚ of disparity away from the fixation point (Blakemore 1970c). Hyson et al. argued that although random-dot images must fall on nearly corresponding retinal regions before depth is first registered, a record of matching dot clusters might be retained over large disparities well outside the normal fusion limits, once depth has been perceived. This process would occur only if the visual system registered large dot clusters or used the edges of the stereogram. Hyson et al. called this process “neural remapping.” This term is misleading because it suggests that the pattern of neural correspondence has been remapped. But this is not established by these results. The relative disparity that defined the depth in the stereograms remained constant—only the overall disparity changed. It is not necessary to assume that corresponding points are remapped but only that, up to a point, overall disparities are disregarded in favor of relative disparities. It was mentioned in Section 12.1.6 that Diner and Fender (1987) found that the fusional area for lines shifted in the direction of a slowly moving disparity. Erkelens (1988) investigated the same issue using a 30˚-wide random-dot stereogram. The images were retinally stabilized for vergence movements but not for version. 
The subjects could thus look at different parts of the stereogram but could not change convergence appropriate to the disparity. The disparity limits for slowly increasing crossed and uncrossed pedestal disparities were measured with the criterion of perceived depth. The same limits were also measured for randomly presented static pedestal disparities. The limits for increasing disparity were similar to those for static disparities, but the limits for regaining the impression of depth were lower than those for either increasing or static disparities. Erkelens concluded that a history of perceiving fused images does not shift the disparity limit for the perception of depth, but a history of perceiving disparate images contracts the limit for that same disparity. These results confirm the hysteresis effect and also confirm Piantanida’s claim that limits for regaining the impression of depth are higher than those reported by Fender and Julesz. But the results contradict Diner and Fender’s claim that the disparity range for stereopsis with an increasing disparity is shifted relative to that for a static disparity. However, Diner and Fender did not investigate the refusion limit relative to the static disparity limit.
Duwaer (1983) pointed out that the disparity limit indicated by the criterion of detected depth is a limit of stereo depth rather than of fusion. He found that the diplopia limit for a square superimposed on a random-dot stereogram was within normal limits of about 0.3˚, while depth was seen in the stereogram up to a limiting disparity of about 1.3˚. He argued that the major hysteresis effect observed with random-dot stereograms does not represent a change in the fusional limits, as Fender and Julesz believed, but is due to the difficulty of regaining the correct binocular match once images have become disparate. However, Piantanida (1986) and Erkelens (1988) claimed that random-dot stereograms remain fused even after the impression of depth is lost. This is difficult to reconcile with the small fusional limits for displays with high spatial-frequency content, as reported in Section 12.1.2.
Summary
However this debate is resolved, to perceive depth in a random-dot stereogram the two images must first be linked. Fender and Julesz claimed that this initial linking does not occur unless the images are within a few arcmin of being in binocular register, but Erkelens claims that it can occur with more than 1˚ of disparity between the images. In any case, once the disparity that defines the pattern in depth has been detected, up to 2˚ of overall disparity between the two images is tolerated before the sensation of relative depth is lost. The visual system detects relative disparities within a stereogram despite the presence of a disparity over the stereogram as a whole. When overall disparity in a random-dot stereogram is reduced from a state of diplopia, depth is not perceived until disparity has reached a lower level than that at which depth disappears when disparity is increased. 
Some investigators interpret this hysteresis effect as a shift in the limits of stereoscopic fusion as disparity is slowly increased, but Erkelens interprets it as a contraction of the limits of fusion due to previous exposure to unfused images.
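The hysteresis described in this section can be pictured as a system with two limits: depth, once seen, persists until overall disparity exceeds a loss limit, and depth, once lost, returns only when disparity falls to a smaller regain limit. The sketch below is a purely descriptive toy, not a model proposed by any of the authors cited. The function name is hypothetical, and the two limits in the usage example are the horizontal values reported for the Fender and Julesz subject (about 2˚, or 120 arcmin, for loss and 6 arcmin for regain).

```python
def depth_visible_trace(disparities_arcmin, loss_limit, regain_limit,
                        visible=True):
    """Return, for each overall disparity in a sweep, whether depth is seen.

    Depth that is currently seen is lost only when the disparity magnitude
    exceeds loss_limit. Depth that has been lost is regained only when the
    magnitude falls to regain_limit or below.
    """
    trace = []
    for disparity in disparities_arcmin:
        magnitude = abs(disparity)
        if visible and magnitude > loss_limit:
            visible = False
        elif not visible and magnitude <= regain_limit:
            visible = True
        trace.append(visible)
    return trace


# Overall disparity is slowly increased past the loss limit and then
# slowly reduced again (values in arcmin).
sweep = [0, 60, 120, 130, 60, 10, 6, 0]
print(depth_visible_trace(sweep, loss_limit=120, regain_limit=6))
```

In this sweep, depth is seen at 60 arcmin on the way out but not at 60 arcmin on the way back, which is the signature of hysteresis.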
18.4.1c Luminance, Contrast, and the Disparity Limit
A reduction in luminance from mesopic to scotopic levels, with contrast held constant, degraded detection of the sign of depth in a bar with 0.5˚ of disparity but improved detection for a bar with 4˚ of disparity. A reduction in luminance contrast at photopic levels of luminance had a similar differential effect (Richards and Foley 1974). There is less lateral inhibition, and therefore more neural summation, at low luminance or contrast. This increased summation could facilitate processing of large disparities. On the other hand, Wilcox and Hess (1995) found almost no effect of varying the contrast of Gabor patches from 3 to 30 dB above threshold on the upper disparity limit for stereopsis. Effects of contrast on stereoacuity are discussed in Section 18.5.
18.4.1d Spatial Frequency and the Disparity Limit
Receptive fields that process low spatial frequencies are larger than those that process high spatial frequencies. Therefore, the low spatial-frequency system should be able to process larger disparities than the high spatial-frequency system. Richards and Kaye (1974) found that the upper disparity limit (dmax) for bars increased in proportion to the width of the bars. Schor and Wood (1983) investigated dmax using small patches with difference-of-Gaussian (DOG) luminance profiles with center spatial frequencies between 0.075 and 19.2 cpd, each with a bandwidth at half-height of 1.75 octaves. Figure 18.38 shows that dmax for stereopsis was constant for spatial frequencies above about 2.4 cpd and increased for lower spatial frequencies. With decreasing spatial frequency, the depth-discrimination threshold (lower disparity limit) rose faster than the upper disparity limit, so that the range of disparities evoking depth sensations became narrower as spatial frequency decreased.
Since the spatial frequency and width of the Gaussian patches used by Schor and Wood covaried, it is not clear which variable was responsible for the variation in dmax. Wilcox and Hess (1995) investigated the separate effects of these two variables. A Gabor patch with variable disparity was exposed for 0.33 s between two similar zero-disparity patches placed above and below it. Either the width of the envelope of the patches varied between 5.7 and 45.8 arcmin while the spatial frequency of luminance modulation within the patch was held constant, or spatial frequency was varied between 0.03 and 10 cpd while patch width was held constant. The largest disparity for which depth was reported did not vary with the spatial frequency within the patches but was proportional to patch width. 
At the disparity limit, the disparity was several multiples of the period of the luminance modulation of the patches so that, with small patches, subjects were forced to base depth judgments on the disparity of the envelopes of the patches. With very wide patches, subjects would be forced to base judgments of depth on the modulations within the patches, and the impression of depth would vary cyclically as the bars of the Gabor patch came into and out of phase. Prince and Eagle (2000b) obtained similar results.
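The relation between a positional disparity and the phase disparity of a periodic carrier underlies both the cyclic variation of depth described above and the equivalence quoted in Section 18.3.3 (30˚ of phase at 1 cpd corresponds to 5 arcmin). The following sketch shows the conversion; the function names and the second numerical example are illustrative assumptions, not values from the studies cited.

```python
def disparity_to_phase_deg(disparity_arcmin, spatial_freq_cpd):
    """Express a positional disparity as a phase shift of the carrier."""
    period_arcmin = 60.0 / spatial_freq_cpd  # one cycle, in arcmin
    return 360.0 * disparity_arcmin / period_arcmin


def wrapped_phase_deg(disparity_arcmin, spatial_freq_cpd):
    """Phase disparity modulo one cycle, which is all the carrier can signal."""
    return disparity_to_phase_deg(disparity_arcmin, spatial_freq_cpd) % 360.0


# The case cited from Farell et al.: 5 arcmin at 1 cpd.
print(disparity_to_phase_deg(5.0, 1.0))

# A hypothetical large disparity of 45 arcmin on a 4-cpd carrier spans
# three full periods, so the carrier alone cannot distinguish it from zero.
print(wrapped_phase_deg(45.0, 4.0))
```

Because the carrier phase repeats every period, a disparity of several periods is ambiguous at the level of the bars, which is consistent with judgments being based on the envelope when patches are small.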
18.4.1e Element Density and the Disparity Limit
Multielement displays, such as random-dot stereograms, become subject to spurious binocular matches as disparity exceeds half the mean dot spacing (Stevenson et al. 1992). Glennerster (1998) measured the largest horizontal disparity (dmax) in a random-dot stereogram exposed for 150 ms for which subjects could detect the direction of depth relative to a frame. The maximum disparity declined from about 5˚ with only two dots in the 21˚-wide by 16˚-high display, to about 50 arcmin with a density of 50% dots. These values were very similar to the maximum displacement of sequentially displayed stimuli that gave rise to apparent motion.
Ziegler et al. (2000a) asked subjects to report the orientation of a disparity-defined grating in a random-element stereogram constructed from small Gabor patches. As luminance, spatial frequency, or patch size was increased, dmax decreased, but dmax was constant as a function of stimulus bandwidth. This and other data demonstrated that, at each luminance, dmax for detection of a depth modulation is limited by the number of local luminance cycles. As the number of cycles in each patch increases, false matches become evident at smaller disparities. The false matches between the contents of the Gabor patches limited dmax rather than the disparity between the envelopes of the patches. Related findings are discussed in Section 18.7.2d.
18.4.1f Disparity Gradients and the Disparity Limit
A disparity gradient is defined as the ratio of the disparity between neighboring stimuli to their difference in visual direction (Section 19.4). The disparity gradient limit is the steepest disparity gradient for which stimulus elements can be perceptually fused. Ziegler et al. (2000b) enquired whether a disparity gradient limit applies to the detection of depth in a disparity-defined grating. They measured dmax for detection of a disparity-defined grating in an array of Gabor patches. The grating had a trapezoidal, triangular, sinusoidal, or square-wave profile. If spatial frequency of depth modulation is the crucial factor, then dmax should be independent of the type of gradient, but if the spatial gradient is the dominant factor, dmax should vary with the shape of the grating at each spatial frequency. As spatial frequency increased from 0.04 to 0.35 cpd, dmax for trapezoidal and sine-wave gratings increased but, for square-wave gratings, it remained essentially constant. The results suggest that the upper disparity limit for detection of smooth and discontinuous cyclopean shapes depends on a disparity gradient plus low-pass disparity filtering.
18.4.1g Eye Movements and the Disparity Limit
Stereoscopic gain and the upper disparity limit for stereopsis increase if subjects scan between targets rather than fixate one target (Section 18.10.2). In most experiments on the upper disparity threshold, stimuli were presented for a period shorter than the latency of vergence. Therefore eye movements did not affect the stimulus. However, briefly exposed stimuli with large disparities could evoke an appropriate vergence response that occurs after the stimulus has been shut off (Section 10.5.7).
Subjects’ judgments may be prompted by the vergence movement rather than by a direct appreciation of the disparity. If coarse stereopsis were contingent on vergence, subjects would be unable to detect the depth of more than one object at a time relative to a fixation point. Ziegler and Hess (1997) found that when subjects were shown two stimuli that had disparities of between 2 and 3˚, they could indicate the depth of each stimulus relative to the fixation cross. This suggests that vergence signals are not required for the detection of depth from disparities well outside the range of binocular fusion.
18.4.2 TOLERANCE FOR ADDED VERTICAL DISPARITY
To the extent that detectors for horizontal disparity have vertically elongated receptive fields, they should possess some tolerance for vertical disparity. Given that receptive fields increase in size with eccentricity, tolerance for vertical disparity should increase with increasing size of a display. Also, one would expect some tolerance for vertical disparity because vertical disparities occur naturally. In investigating tolerance for vertical disparity, vertical vergence must not null the vertical disparity. In Figure 18.16 the surrounding texture locks vergence. The images marked 0 have no vertical disparity, and the relative depth produced by horizontal disparity is readily perceived. Depth becomes difficult to detect as vertical disparity increases down the column of numbers.
18.4.2a Vertical Disparities in Dot and Line Displays
Ogle (1955) exposed a test point for 200 ms with different horizontal and vertical disparities relative to a fixation point. Subjects could detect the relative depth between the points with vertical disparity of up to about 25 arcmin. Depth was still evident when the point appeared diplopic.
Mitchell (1970) displayed a pair of 40-arcmin dichoptic vertical lines for 120 ms with a horizontal disparity of up to 5˚ and variable vertical disparity. Subjects could judge the depth of the line relative to a fixation point when the vertical disparity was up to about 3˚. Fukuda et al. (2009) produced evidence that the large tolerance for vertical disparity reported by Mitchell was due to depth having been coded in terms of monoptic depth rather than in terms of horizontal disparity. It was explained in Section 17.6.5 that a single image in one eye creates an impression of depth with a sign that depends on whether the image falls on the nasal or on the temporal retina. As point images with a horizontal disparity become widely separated vertically, they cease to engage the system that detects horizontal disparity. Instead the depth of each image relative to the point of fixation between them is coded in terms of monoptic depth. The image on the nasal retina of one eye appears beyond the fixation point and the image on the temporal retina of the other eye appears nearer than the fixation point. When the images of a point contain horizontal and vertical disparity, depth judgments may be based on the oblique disparity. This issue is discussed in Section 18.6.5. For large displays, the visual system relies on the difference between horizontal and vertical disparity (Chapter 19).
18.4.2b Vertical Disparities in Random-Dot Stereograms
Boltz et al. (1980) measured the vertical disparity that could be tolerated before loss of depth in a random-dot stereogram with 54 arcmin of horizontal disparity but unspecified size. The tolerated vertical disparity for both humans and monkeys was about 1.5˚ for a stimulus duration of 500 ms. Nielsen and Poggio (1984) showed subjects a random-dot stereogram subtending 54 arcmin that depicted a central square with up to 10.8 arcmin of crossed or uncrossed horizontal disparity. Subjects failed to detect the sign of depth when the central region had a vertical disparity of
Figure 18.16. Tolerance for vertical disparity. The relative depth between the numbers becomes more difficult to see as the vertical disparity increases from 0 to 6. The surrounding display is designed to hold vertical vergence constant.
more than 3.5 arcmin or when the stereogram as a whole had a vertical disparity of more than 6.5 arcmin. Subjects fixated a spot before the stereogram was exposed for only 117 ms. Thus, vergence would not have contributed to the result. The tolerance for vertical disparity was slightly extended when monocular cues to the depth of the disparate region were visible. It is not clear whether subjects experienced diplopia at the point where depth discrimination broke down. Prazdny (1985c) showed subjects a random-dot stereogram subtending 6.5˚. It contained two vertical bars containing binocularly correlated dots in a surround of uncorrelated dots. Only one bar could be seen when the vertical disparity difference between the bars exceeded about 10 arcmin. These results suggest that the range of tolerated vertical disparity is smaller for random-dot stereograms than for isolated dots or lines. This is probably because even small vertical disparities in a random-dot stereogram produce false matches. False matches are most likely when vertical disparity exceeds the vertical separation of the elements. Another factor is stimulus area. The ability to detect a match between dichoptic random-dot displays improves as the size of the displays increases (Section 15.4.2). Stevenson and Schor (1997) asked whether the tolerance of stereopsis for vertical disparity in dynamic random-dot stereograms increases with increasing size of the display. Subjects judged whether a dynamic random-dot display presented in one half of a circular aperture for 200 ms was near or far relative to a vertical line across the center of the aperture. The other half of the aperture contained uncorrelated dots. For a display subtending 12˚, vertical disparities of up to about 30 arcmin were tolerated. For a display subtending only 6˚, disparities of up to about 20 arcmin were tolerated. This range is similar to that reported by Ogle but is larger than that reported by Nielson and Poggio and by Prazdny. 
However, those studies used static random-dot displays rather than the dynamic display used by Stevenson and Schor. The increased tolerance for vertical disparity with increasing size of display may occur because it is easier to detect whether a random-dot display is dichoptically correlated when it has more elements. The ability to detect whether dichoptic random-dot displays are correlated showed a similar tolerance for vertical disparity (Stevenson and Schor 1997). One could distinguish between these two possible contributions to vertical disparity tolerance by independently changing stimulus size, number of stimulus elements, and spacing of stimulus elements. For horizontal disparities of less than about 4 arcmin, the ability to detect horizontal depth corrugations in a random-dot stereogram was less affected by vertical-disparity noise than by horizontal-disparity noise (Palmisano et al. 2001). However, as both signal disparities and noise
disparities were increased, vertical-disparity noise became more disruptive than horizontal-disparity noise. Perhaps vertical disparities soon reach their upper limit for fusion in the presence of a large horizontal disparity.

18.5 LUMINANCE, CONTRAST, AND STEREOPSIS

18.5.1 EFFECTS OF LUMINANCE AND CONTRAST
Nagel (1902) noted that the displacement in depth of a central rod relative to two flanking rods was detected at scotopic levels of luminance. Berry et al. (1950) measured stereoacuity using two vertical black rods against an illuminated background. Increasing the luminance of the background up to 10 millilamberts improved stereoacuity at least threefold, after which further increases had little effect. The function relating stereoacuity to luminance was steeper than that relating vernier acuity to luminance, but the two functions had a similar shape. A discontinuity in the function relating stereoacuity to luminance is evident as luminance is reduced from mesopic to scotopic levels (between 0.1 and 0.01 millilamberts) (Mueller and Lloyd 1948; Lit and Hamm 1966). Livingstone and Hubel (1994) obtained similar results when subjects detected the depth offset of a central region in a random-dot stereogram. Stereoacuity declined from about 10 arcsec at 1 foot-lambert to about 100 arcsec at 10⁻⁴ foot-lamberts. Line-bisection acuity declined in a similar fashion but less steeply. An increase in illumination contracts the pupil and increases depth of focus. Increased depth of focus should make it more difficult to detect when a rod moving in depth has gone out of focus (Howarth 1951). On this account, an increase in luminance should degrade stereoacuity measured with real rods but not that measured with dichoptic stimuli. However, increases in luminance improve stereoacuity for both types of stimuli. Reduction in pupil size of one or both eyes adversely affected stereoacuity on the Randot and Titmus tests but only when pupil size of one or both pupils was reduced below 2.5 mm (Lovasik and Szymkiw 1985). In this case, pupil size must have influenced stereoacuity by affecting retinal illumination.
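The two endpoints quoted above from Livingstone and Hubel (about 10 arcsec at 1 foot-lambert and about 100 arcsec at 10⁻⁴ foot-lamberts) can be summarized as an average slope on log-log axes. The sketch below is only an illustration: it assumes that a single straight line joins the two points, which the text does not claim, and the function name is hypothetical. The conversion constants are the standard photometric definitions, included because the studies cited use both millilamberts and foot-lamberts.

```python
import math

# Standard photometric conversions to candelas per square metre.
CD_M2_PER_MILLILAMBERT = 10.0 / math.pi   # 1 mL is about 3.18 cd/m^2
CD_M2_PER_FOOT_LAMBERT = 3.4262591        # 1 fL is about 3.43 cd/m^2


def loglog_slope(x1, y1, x2, y2):
    """Slope of the straight line joining two points on log-log axes."""
    return (math.log10(y2) - math.log10(y1)) / (math.log10(x2) - math.log10(x1))


# Endpoints quoted in the text (luminance in fL, stereo threshold in arcsec).
# Assumption: one power law spans the whole range between them.
slope = loglog_slope(1.0, 10.0, 1e-4, 100.0)
print(round(slope, 3))
```

Under that assumption, the threshold rises by one log unit for a four-log-unit drop in luminance, an average exponent of about -0.25.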
Ogle and Weil (1958) found that the threshold disparity for detection of depth in a stereoscopic display remained constant until contrast was reduced to near the contrast threshold. Lit et al. (1972) obtained similar results using a two-rod Howard-Dolman apparatus. They found that the increase in the disparity threshold near the contrast threshold was steeper at higher levels of luminance, as one would expect from the fact that the Weber fraction for luminance decreases as luminance increases. For stereograms consisting of vertical sinusoidal gratings, the stereo threshold was approximately inversely
STEREOSCOPIC ACUIT Y
related to the square root of Michelson contrast, for contrasts between 0.01 and 1.0 (Legge and Gu 1989). Cormack et al. (1991) used a random-dot stereogram and found a cube-root dependency of stereoacuity on contrast at contrasts above about five times the threshold (Figure 18.17). The difference was probably due to the different spatial-frequency content of the stimuli. At contrasts below five times the threshold, Cormack et al. found that stereoacuity was proportional to the square of contrast. This is what one would expect in the threshold region, where performance is limited by a constant level of intrinsic noise. There is thus agreement that stereoacuity has a weak dependence on contrast at suprathreshold levels but declines rapidly as contrast approaches the contrast threshold. The perceived depth produced by a given disparity is also affected by contrast. In general, low-contrast stimuli appear more distant than high-contrast stimuli (see Section 18.7.3b). The effects of reduced contrast on stereoacuity depend on the spatial frequency of the stimulus, as we shall now see.

18.5.2 CONTRAST-SENSITIVITY FUNCTION FOR STEREOPSIS
Frisby and Mayhew (1978a) derived a contrast-sensitivity function for stereopsis by measuring the luminance contrast required for detection of depth in random-dot stereograms. The center spatial frequency of a filtered dot pattern was varied between 2.5 and 15 cpd, and the disparity in the stereograms varied between 3 and 22 arcmin. Since contrast thresholds did not vary as a function of disparity, results were averaged over the disparities. They also measured
contrast sensitivity for detection of a dot pattern when both eyes saw the same pattern. The results for one subject are shown in Figure 18.18. The dot patterns were detected at a contrast between 0.3 and 0.4 log units below that required for the detection of depth. However, the forms of the two functions were similar (correlation 0.96), which suggests that the mechanism for stereopsis is not sensitive to a particular spatial frequency as far as contrast thresholds are concerned. A similar high correlation between the contrast-sensitivity function for dot detection and the stereo contrast-sensitivity function was obtained by Legge and Gu (1989). They used a stereogram consisting of identical vertical sine-wave gratings with variable horizontal disparity. Subjects reported the sign of depth of the test grating with respect to a zero-disparity comparison grating placed just below it. Stereoacuity was thus determined as a function of the spatial frequency of the grating, for each of several contrasts. For a given contrast, on a log/log plot, threshold disparity was inversely proportional to spatial frequency, reaching a minimum at about 3 cpd, after which the threshold increased in an irregular fashion. As the spatial frequency of a regular grating increases, disparity becomes ambiguous, since it is unclear which bar should be paired with which. This is the wallpaper illusion discussed in Section 14.2.2, and may account for the irregularity of the stereo threshold above 3 cpd. The binocular contrast-sensitivity function for detecting a luminance modulation also peaked at about 3 cpd and, like the stereo
Figure 18.17. Stereoacuity and luminance contrast. At contrasts less than about five times the threshold, acuity is proportional to the square of contrast. At contrasts more than about 10 times the threshold, acuity is proportional to the cube root of contrast (N = 1). Axes: stereoacuity (arcsec) against contrast (threshold multiples). (Redrawn from Cormack et al. 1991)

Figure 18.18. A contrast-sensitivity function for stereopsis. The upper curve (Detection) is the contrast sensitivity for detection of a binocularly viewed random-dot pattern as a function of spatial frequency of the pattern. The lower curve (Stereopsis) is the contrast sensitivity for detection of depth in a random-dot stereogram as a function of spatial frequency of the dot pattern. Stereograms had disparities between 3 and 22 arcmin (N = 1). Axes: contrast sensitivity against spatial frequency (cpd). (Redrawn from Frisby and Mayhew 1978a)
threshold, fell off at lower spatial frequencies. This further supports the idea of a close link between the binocular detectability of a grating and the detectability of disparity in a grating. Legge and Gu found that the spatial frequency that produced the lowest stereo threshold was about the same for all contrasts. However, for all spatial frequencies, the stereo threshold was approximately inversely related to the square root of Michelson contrast. This latter finding is consistent with the idea that, as contrast is reduced, the internal noise in the visual system becomes proportionately larger and adversely affects signal detection. Halpern and Blake (1988) reported similar results using elongated D10 Gaussian patches, which resemble small patches of sinusoidal grating. They used a wider range of spatial frequencies and found that stereoacuity is less affected by changes in contrast at higher spatial frequencies. This may explain why stereoacuity has been found to vary only with the cube root of contrast in a random-dot stereogram (Cormack et al. 1991). Halpern and Blake concluded that disparity is processed by spatial-frequency tuned mechanisms with a compressive nonlinear dependence on contrast. Heckmann and Schor (1989b) confirmed that spatial frequency and contrast, rather than luminance gradients, are the crucial factors determining stereoacuity. On the one hand, stereo thresholds were the same for sinusoidal gratings with the same spatial frequency and contrast but with different luminance gradients. On the other hand, for targets with the same luminance gradient, thresholds were lower for the target with higher contrast. Westheimer and McKee (1980b) found that stereoacuity is adversely affected by filtering of high spatial-frequency components of the stimulus in a way that cannot be accounted for in terms of reduced contrast. Hess et al. (2002) produced evidence for the same conclusion. 
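The power-law relations described in this section can be compared numerically. The sketch below is illustrative only: it treats stereoacuity as the reciprocal of the disparity threshold, so that a threshold inversely related to the square root of contrast (Legge and Gu 1989) has an exponent of -1/2, a cube-root dependence of acuity (Cormack et al. 1991) has a threshold exponent of -1/3, and the square-law region near the contrast threshold has an exponent of -2. The function names are hypothetical, and the Michelson definition is the standard one.

```python
def michelson_contrast(l_max, l_min):
    """Michelson contrast of a grating from its peak and trough luminances."""
    return (l_max - l_min) / (l_max + l_min)


def threshold_change(contrast_from, contrast_to, exponent):
    """Factor by which the disparity threshold changes when contrast changes,
    assuming threshold is proportional to contrast ** exponent."""
    return (contrast_to / contrast_from) ** exponent


# Effect of halving contrast under each assumed power law.
square_root_law = threshold_change(0.5, 0.25, -0.5)    # gratings, Legge and Gu
cube_root_law = threshold_change(0.5, 0.25, -1.0 / 3)  # random dots, high contrast
square_law = threshold_change(0.5, 0.25, -2.0)         # near contrast threshold

print(round(square_root_law, 2), round(cube_root_law, 2), round(square_law, 2))
```

On these assumptions, halving the contrast raises the threshold by a factor of about 1.4 under the square-root law and about 1.3 under the cube-root law, but by a factor of 4 in the square-law region, which is the sense in which the dependence is weak at suprathreshold contrasts and steep near threshold.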
Stereoacuity for a grating with a spatial frequency of 8 cpd was impaired when the stimulus was counterphase flickered at between 5 and 20 Hz, but the same flicker had little effect at lower spatial frequencies (Patterson 1990). It seems safe to conclude from this evidence that there is a close relationship between stereoacuity and the contrast and spatial frequency of the stimulus. However, we will see in Section 18.7.3b that contrast has little effect on stereoacuity for contrast-defined (second-order) stimuli.

18.5.3 SIMPLE DETECTION AND DETECTION OF DEPTH
The contrast required for detection of a stimulus may differ from that required for detection of the value of a stimulus feature. For example, the contrast required for detection of a line may differ from that required for detection of the
orientation of the line. It is generally believed that, if the two thresholds are the same, the feature-detection mechanism has access to the earliest stages of signal generation. The primary visual cortex is the earliest level at which binocular signals are combined. If this is also the level at which disparities are detected, one would expect the contrast threshold for detection of stimuli arising simultaneously from both eyes to be the same as the threshold for stereopsis. Data from Frisby and Mayhew (1978a), depicted in Figure 18.18, show that a random-dot pattern seen with both eyes was discriminated from an evenly illuminated area at a contrast of 0.3 to 0.4 log units below that required for detection of depth in a random-dot stereogram. They concluded that, at all spatial frequencies, the stereoscopic system requires a stronger stimulus than that required for detecting the dots in the stereogram. However, the area of the stimulus for detection of depth was not the same as that required for detection of dots. Also, the possible influence of eye movements was not controlled. Smallman and MacLeod (1994) found a difference of only 0.25 log units between dot-detection thresholds and depth-detection thresholds in a random-dot display. Halpern and Blake (1988) found a similar small difference using D10 Gaussian patches. In all the above studies, binocular detection thresholds for dots should be lower than stereo thresholds, because binocular detection operates when the stimulus in either eye is above threshold, whereas stereopsis requires both stimuli to be above threshold (Mansfield and Simmons 1989). The appropriate comparison is the product of the probabilities of detection obtained from each eye tested separately. Simmons (1998) used a pair of 1˚ dichoptic Gabor patches with a spatial-frequency bandwidth of 1.1 octaves presented for 200 ms at a crossed or uncrossed disparity of 30 arcmin with respect to a 3˚ surrounding circle.
When the contrasts of the dichoptic patches were similar, the threshold for discriminating depth was equal to or below the combined monocular detection thresholds. When the contrasts of the images differed, the threshold for stereopsis at first improved relative to the detection threshold, and then depth became undetectable at a contrast ratio of 12 dB. Simmons explained these effects in terms of excitatory and inhibitory interactions between the eyes. These results support the idea that the sign of disparity is detected at the earliest level at which binocular signals are combined. This is true for disparity based on first-order luminance contrast. For disparity based on chromatic contrast or for second-order stereopsis based on the envelope of Gabor patches, the threshold for stereopsis is higher than the detection threshold (Section 17.1.4b). In other words, these types of stereopsis require more signal strength than that required for simple detection or for contrast-based stereopsis.
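The argument about why binocular detection should be easier than stereopsis can be made concrete with elementary probabilities. The sketch below assumes that the two eyes detect independently, which is a simplification, and the function names are hypothetical. It also converts a contrast ratio in decibels to a linear ratio, assuming the common convention of 20 times the base-10 logarithm of the ratio; the text does not state which convention was used.

```python
def p_detect_either_eye(p_left, p_right):
    """Probability that at least one eye detects the stimulus
    (sufficient for simple binocular detection)."""
    return 1.0 - (1.0 - p_left) * (1.0 - p_right)


def p_detect_both_eyes(p_left, p_right):
    """Probability that both eyes detect the stimulus
    (required before a disparity can be registered)."""
    return p_left * p_right


def db_to_contrast_ratio(decibels):
    """Linear contrast ratio, assuming dB = 20 * log10(ratio)."""
    return 10.0 ** (decibels / 20.0)


# Hypothetical monocular detection probabilities of 0.5 in each eye.
either = p_detect_either_eye(0.5, 0.5)
both = p_detect_both_eyes(0.5, 0.5)
ratio_12_db = db_to_contrast_ratio(12.0)
print(either, both, round(ratio_12_db, 2))
```

With monocular probabilities of 0.5, detection by either eye succeeds on 75% of trials but detection by both eyes on only 25%, so a raw comparison of binocular detection with stereopsis is biased against stereopsis. On the stated convention, 12 dB corresponds to a contrast ratio of about 4 to 1.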
18.5.4 EFFECTS OF INTEROCULAR DIFFERENCES
18.5.4a Differences in Luminance and Contrast

Stereoacuity, measured by the Howard-Dolman apparatus, is reduced when luminance differs in the two eyes (Rady and Ishak 1955; Lit 1959a). Stereoacuity measured by the Titmus and Randot tests was not significantly affected by a neutral filter of approximately 3% transmission placed before one eye (Lovasik and Szymkiw 1985). However, these two tests are not designed to detect disparities of less than 20 arcsec. Stereoacuity is also reduced when luminance contrast is not the same in the two eyes (Simons 1984; Halpern and Blake 1988; Legge and Gu 1989). However, we will now see that there has been some dispute about whether the reduction is greater than when both images are reduced in contrast. It has been claimed that a given reduction of contrast applied to one eye reduces stereoacuity about twice as much as the same reduction applied to both eyes. It has also been claimed that the effect of unequal contrast on detection of a dichoptic grating is relatively the same for spatial frequencies of 0.5 cpd and 2.5 cpd, as can be seen in Figure 18.19 (Legge and Gu 1989). However, Schor and Heckmann (1989) found that interocular differences in contrast produced a greater loss of stereoacuity for a 0.8-cpd grating than for a 3.2-cpd grating. Cormack et al. (1997a) used dynamic random-dot stereograms and narrow-band Gabor patches with various center frequencies. For all stimuli containing high spatial
Figure 18.19. Stereoacuity and relative contrast. Disparity threshold for discriminating depth in a grating relative to a zero-disparity grating as a function of relative contrast of the images in the two eyes. Contrast: 0.125 or 0.25. Spatial frequency: 0.5 or 2.5 cpd (N = 1). Axes: disparity threshold (arcmin) against interocular contrast ratio (left/right or right/left, from 1:1 to 4:1). (Redrawn from Legge and Gu 1989)
frequencies, stereo thresholds were not affected by inequality of contrast but were limited only by the lower of the two contrasts. For Gabor patches under 5 cpd, an interocular difference in contrast had a larger effect than equal low contrast in the two eyes. They argued that differences in contrast at high spatial frequencies arise naturally as a result of instability of vergence and are corrected by a compressive nonlinearity in the contrast signals before binocular combination. Interocular mismatches at low spatial frequencies are preserved because they indicate a large vergence error. Hess et al. (2003) argued that monocular low contrast affects stereoacuity more than binocular low contrast only with narrow-band 1-D stimuli. With the fractal textures shown in Figure 18.22A, contrast reduction in one image had no more effect than contrast reduction in both images, even when the images were low-pass or high-pass filtered. Some impression of depth persists with low-contrast stimuli or with stimuli with quite large interocular differences in contrast. The relative immunity of stereopsis to low contrast could be due to a contrast-gain mechanism in the LGN. Immunity to interocular differences in contrast could be due to binocular interactions within the LGN. The response per unit change in contrast (contrast gain) of many cells in the LGN of the cat to stimulation of one eye, changed when the other eye was stimulated at the same time. This effect survived removal of the visual cortex (Tong et al. 1992). Vernier acuity and detection of two-frame apparent motion were also adversely affected when one stimulus had a lower contrast than the other stimulus (Stevenson and Cormack 2000). In these cases, both stimuli (Gabor patches) were presented to the same eye. As with stereoacuity, the effect of unequal contrast was larger with stimuli of low narrow-band spatial frequency. 
Wilson (1977) revealed a hysteresis effect that depends on the relative contrast of the images in the two eyes (Portrait Figure 18.20). The stimuli were 4.5˚-wide stereograms consisting of vertical sinusoidal gratings. The spatial frequency of the grating in one eye was set at various values between 0.5 and 8 cpd, and the spatial frequency of the grating in the other eye was set 1.2 times higher. This created a surface slanted in depth about a vertical axis (Section 19.2.4). The grating in one eye had a constant contrast of 0.5. When the contrast of the grating in the other eye was increased from zero to 0.5 the impression of slant was delayed beyond the point where the impression of slant was lost when contrast in the other eye was reduced from 0.5 to zero (Figure 18.21). Wilson modeled the hysteresis effect in terms of a neural network with positive feedback among disparity detectors generated by disinhibitory circuits. Wilson argued that this contrast-dependent hysteresis is not due to differences in vergence eye movements,
Figure 18.20. Hugh R. Wilson. Born in Fort Monmouth, New Jersey, in 1943. He received a B.A. in chemistry and physics from Wesleyan University in 1965 and a Ph.D. in physical chemistry from the University of Chicago in 1969. In 1972 he was appointed to the faculty in the University of Chicago, where he became professor of visual science. In 2000 he became professor of biology at York University, Toronto. He is a fellow of the Optical Society of America.
because eye movements do not affect the difference in spatial frequency between the images. This argument is not conclusive. A high-contrast image in one eye and a low-contrast image in the other eye provide a weak stimulus for convergence. It is argued in Section 19.2.4 that
vergence allows one to scan effectively over a disparity gradient created by a difference in spatial frequency of the two images, and that these scanning movements aid stereopsis. Once an effective pattern of vergence movements has been formed, with images of equal contrast, it should be easy to maintain it after considerable loss of contrast in one eye. However, effective eye movements may be difficult to establish when the images differ initially in contrast. This interpretation explains why Wilson found no hysteresis when the gratings had a spatial frequency of only 0.5 cpd. At this spatial frequency, there would be only eight cycles in the display. With a frequency ratio of 1 to 2 this would produce only two nodes where the displays are in phase (produce zero disparity). This should make it easy to see depth without vergence movements. The multiple zero-disparity nodes in high spatial-frequency depth ramps make it difficult to see a single slanted surface without the aid of vergence. Furthermore, a low spatial-frequency display is more visible in the peripheral retina than a high spatial-frequency display. For both these reasons, scanning eye movements are not as necessary for the perception of depth in low spatial-frequency depth ramps as in high spatial-frequency depth ramps.

Figure 18.21. Contrast hysteresis in stereopsis. The curve with upward-pointing arrows (slant impression gained) indicates the contrast of a sinusoidal-grating stereogram at which slant was perceived as contrast was increased. The curve with downward-pointing arrows (slant impression lost) indicates the contrast at which the impression of slant was lost as contrast was reduced (N = 1). Axes: contrast of variable display against spatial frequency (cpd). (Adapted from Wilson 1977)

The effects of differences in illumination, contrast, and spatial frequency on stereoacuity may be due to any or all of the following factors. Some of these factors are specific to stereoacuity, while others could account for effects in other acuities. Some apply only to narrow-band stimuli.

1. Cancellation of uncorrelated noise. Visual acuities involve detection of a difference between two signals, each with its own noise. Devices such as differential amplifiers, which combine signals in this way, produce a signal with an improved signal-to-noise ratio because the uncorrelated noise signals tend to cancel. When the signal from one eye is stronger than that from the other, the noise from the two eyes would not be canceled as effectively and stereoacuity would therefore be degraded.
2. Interstimulus suppression. A stimulus with higher illumination or contrast may suppress a weaker stimulus either in the same eye or in the opposite eye (Section 12.3.2). A related possibility discussed in Section 15.3.7 is that stimuli of widely differing contrast do not fuse. This would interfere with the extraction of disparity information.
3. Location shift. A brighter stimulus may cause a shift in the apparent location of the dimmer stimulus in the direction of reducing the disparity or spatial offset between them. Verhoeff (1933) presented a black line to each eye, one tilted clockwise and the other counterclockwise with respect to vertical, so that they fused into a single line apparently inclined in
the median plane. When one line was viewed through a neutral filter, the fused line appeared to tilt in the frontal plane toward the brighter line, and the apparent inclination of the line in depth was reduced. When the filter covered only the bottom half of one line, the bottom half of the fused line appeared tilted in the frontal plane and less inclined in depth relative to the top half. The change in the angular position of the dim line required to account for the tilt of the fused line was similar to that required to account for the change in its inclination. Thus, the loss of stereoscopic acuity when the images differ in luminance could be due to a shift in the apparent lateral location of the dim image relative to the brighter image (Section 17.9).
4. Contrast normalization. Contrast normalization weakens the weaker of two neighboring or sequential stimuli (Stevenson and Cormack 2000).
5. Differential latency. The visual latency for a dim stimulus is longer than that for a bright stimulus. Thus, with unequal illumination, the signals from the two eyes arrive at the visual cortex at different times. This could directly interfere with the detection of disparity. For instance, it has been suggested that the response of a binocular cell to a stimulus from the nasal hemiretina is more rapid than its response to a matching stimulus from the temporal hemiretina. This difference allows the brain to identify the eye of origin of each signal and thus distinguish between crossed and uncrossed images (Bower 1966). Reading and Woo (1972) measured the elevation of stereo threshold due to unequal illumination of dichoptic stimuli presented for 40 ms or less and then tested whether this could be nulled by introducing an interocular time delay. They found no nulling of one effect by the other and concluded that the effect of unequal illumination on stereoacuity is not due to differential latency.
They did not consider a second way in which differential latency may affect stereoacuity. A differential latency of signals from the two eyes translates into a disparity if the two stimuli move, as in the Pulfrich stereophenomenon (Section 23.1). Movement of the stimuli in a frontal plane in the direction of the eye receiving the weaker stimulus creates a crossed disparity, and movement the other way creates an uncrossed disparity. With a stationary stimulus, to-and-fro eye movements impose a variable disparity signal on the test disparity and reduce the signal-to-noise ratio. The differential latencies in the Reading and Woo experiment did not translate into disparity because the stimuli were exposed for only 40 ms, and the effects of eye movements would not be significant with such
short exposures. The experiment should be repeated with longer exposure times.
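The way a differential latency turns into a disparity for a moving image can be sketched with simple kinematics: during the extra delay in one eye, the image moves on by velocity times delay, and that offset between the two eyes' images acts as a disparity. The example below is a minimal illustration, assuming constant image velocity. The function name and the numerical values are hypothetical and are not taken from the studies cited.

```python
def latency_to_disparity_arcmin(velocity_deg_per_s, latency_difference_ms):
    """Magnitude of the disparity equivalent to an interocular latency
    difference for an image moving at constant angular velocity."""
    offset_deg = velocity_deg_per_s * (latency_difference_ms / 1000.0)
    return offset_deg * 60.0  # degrees to minutes of arc


# Hypothetical values: image motion of 10 deg/s and a 5-ms latency difference.
equivalent_disparity = latency_to_disparity_arcmin(10.0, 5.0)
print(round(equivalent_disparity, 2))
```

With these illustrative values the equivalent disparity is about 3 arcmin, which is large relative to stereo thresholds measured in seconds of arc. This is why image motion produced by eye movements during long exposures could add appreciable disparity noise, whereas little motion accumulates in a 40-ms exposure.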
18.5.4b Differences in Image Blur

Stereoacuity is reduced by induced anisometropia—the blurring of one eye’s image (Westheimer and McKee 1980b; Wood 1983). As the image in one eye was optically blurred, the loss in Snellen acuity in that eye was proportional to the loss in stereoacuity, as assessed by the Titmus Fly test (Levy and Glick 1974). When acuity in one eye was optically degraded to 20/200, stereoacuity was reduced to the level attainable by only monocular cues. For Snellen acuities between 20/25 and 20/50, there was greater loss of stereoacuity with monocular blur than with binocular blur (Goodwin and Romano 1985). The deterioration of stereoacuity with increasing monocular blur was more rapid for the Titmus test than for the Randot test (Lovasik and Szymkiw 1985). Most subjects maintained a stereoacuity of 40 arcsec with a 1.0 D blur for the Randot test, or 0.5 D blur for the Titmus test. They maintained a moderate degree of stereopsis on both tests with 2 D of monocular blur. These results agree with those of Levy and Glick but not with those of Peters (1969), who found that 80% of subjects showed complete loss of stereoacuity with 1 D of monocular blur. Donzis et al. (1983) developed nomograms of stereoacuity for 30 normal subjects for various combinations of image blur in the two eyes. Lam et al. (1996) obtained a correlation of 0.76 between stereoacuity and visual acuity in 30 subjects with natural interocular differences in acuity. Hess et al. (2003) used fractal textures containing a wide range of spatial frequencies and orientations, as shown in Figure 18.22. Stereoacuity was reduced when one image was severely blurred by filtering off high spatial frequencies, as in Figure 18.22B. Loss of acuity was much less severe when both images were blurred. Stereoacuity was not reduced when the images had the same high spatial frequencies but different low spatial frequencies, as in Figure 18.22C.
The extent of naturally occurring interocular differences in optical aberrations, including coma and spherical aberration, was found to be associated with decreased binocular summation of contrast sensitivity and a reduction in the upper limit of disparity detection (Jiménez et al. 2008).

18.5.5 STEREOACUITY AND COLOR
Stereopsis for isoluminant stimuli was discussed in Section 17.1.4. This section is concerned with whether stereoacuity is affected by the color of the stimuli. Pennington (1970) measured stereoacuity as a function of color using the Howard-Dolman apparatus, with the vertical rods emitting
a narrow band of wavelengths in the red, green, or blue parts of the spectrum. The colored rods were equated for apparent luminance and viewed against a dark background. Stereoacuity for the red rods relative to that for the green rods varied from subject to subject, but all subjects had a lower stereoacuity with blue rods. A similar experiment was conducted by Dwyer and Lit (1970) using black rods on a red, green, yellow, or blue background, and by Young and Lit (1972) using red, green, yellow, or blue rods on a dark background. Stereoacuity was lower with blue, but improved with increasing luminance in a similar way for all four colors. The lower acuity with blue background or blue rods could be due to the low density of blue cones in the retina and their relatively large receptive fields. In these experiments, the rods or the background varied in color, but the borders were defined by luminance contrast.

Figure 18.22. Interocular differences in spatial frequency. (From Hess et al. 2003. With permission from Elsevier)

18.6 SPATIAL FACTORS IN STEREOACUITY

18.6.1 STEREOACUITY AND STIMULUS LOCATION

18.6.1a Stereoacuity and Stimulus Eccentricity

Figure 18.23 shows the threshold for discrimination of disparity-defined depth between two neighboring points as a function of their horizontal distance from a fixation point (Rawlings and Shipley 1969). As eccentricity increased to 8˚, stereoacuity declined at about the same rate as other spatial hyperacuities such as vernier acuity (Fendick and Westheimer 1983; Levi et al. 1985). Furthermore, all hyperacuities were approximately independent of eccentricity when allowance was made for the mean size of receptive fields (the cortical magnification factor) (Levi et al. 1985). The whole retina is probably paved with large receptive fields, which means that stereoacuity for stimuli of low spatial-frequency should be relatively independent of eccentricity. On the other hand, since small receptive fields are confined to the central retina, stereoacuity for stimuli of high spatial-frequency should decline rapidly with increasing eccentricity. Siderov and Harwerth (1995) found stereoacuity for 0.5 cpd Gaussian patches declined much more gradually than that for 8 cpd patches as eccentricity increased up to 10˚.

Figure 18.23. Stereoacuity and horizontal eccentricity. Disparity threshold for discrimination of relative depth of two neighboring points as a function of horizontal distance from the fixation point (N = 3). Axes: mean stereoscopic threshold (arcmin) against horizontal eccentricity of target (deg), left and right of fixation. (Redrawn from Rawlings and Shipley 1969)
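The statement that hyperacuities become approximately independent of eccentricity once cortical magnification is allowed for can be illustrated with a simple scaling rule. The sketch below assumes that thresholds grow linearly with eccentricity, doubling at an eccentricity E2. This linear form is a common way of expressing magnification scaling, but both the form and the value of E2 used here are assumptions for illustration, not values reported in the studies cited, and the function names are hypothetical.

```python
def threshold_at_eccentricity(foveal_threshold, eccentricity_deg, e2_deg):
    """Threshold predicted by linear scaling: it doubles at e2_deg."""
    return foveal_threshold * (1.0 + eccentricity_deg / e2_deg)


def magnification_corrected(threshold, eccentricity_deg, e2_deg):
    """Threshold re-expressed in units scaled by the same factor."""
    return threshold / (1.0 + eccentricity_deg / e2_deg)


E2_DEG = 1.0                        # hypothetical scaling constant
ECCENTRICITIES = (0.0, 2.0, 4.0, 8.0)

# Raw thresholds rise with eccentricity from a hypothetical foveal value of 10.
raw = [threshold_at_eccentricity(10.0, e, E2_DEG) for e in ECCENTRICITIES]

# After correction the thresholds no longer depend on eccentricity.
corrected = [magnification_corrected(t, e, E2_DEG)
             for t, e in zip(raw, ECCENTRICITIES)]
print(raw, corrected)
```

On this account, a larger E2 describes a mechanism whose thresholds rise more slowly with eccentricity, as the text suggests for low spatial-frequency stimuli compared with high spatial-frequency stimuli.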
Figure 18.9 shows that stereoacuity decreases less steeply with increasing size of a depth pedestal as the stimuli are moved into the periphery (Blakemore 1970c). Krekling (1974) pointed out that this leads to the paradoxical result that stimuli on a depth pedestal of 80 arcmin have a higher stereo threshold when they are in the central visual field than when they are at an eccentricity of 5˚. Stereoscopic depth produced by dichoptic images of a small vertical bar oscillating in depth at 2 Hz between 0 and 0.4˚ was detected out to at least 20˚ of eccentricity (Richards and Regan 1973). Figure 18.24 shows the disparity threshold for detecting relative depth between vertical lines, 4 arcmin in length, presented at various eccentricities above and below a fixation point. Stereoacuity improved as the lines increased in length up to 20 arcmin. Further increases produced little improvement (McKee 1983) (Portrait Figure 18.25).
18.6.1b Stereoacuity in Upper and Lower Visual Fields

Stereoacuity has been found to be higher for a dynamic random-dot stereogram in the lower visual field than for one in the upper visual field (Manning et al. 1992). In the upper visual field, a region of 6 arcmin of uncrossed disparity (far) in a random-dot stereogram was detected with shorter exposure durations than a similar region of crossed disparity (Julesz et al. 1976). In the lower visual field, a region of crossed disparity was detected more rapidly than a region of uncrossed disparity. Julesz et al. found no left-right differences and concluded that the up-down difference reflects an anisotropy in the distribution of disparity detectors in the human visual cortex. These results are just what one would expect from the backward inclination of the vertical horopter (Section 14.7).
Figure 18.24. Stereoacuity and vertical eccentricity. Disparity threshold for detection of depth between lines at vertical eccentricities up to 30 arcmin (N = 1). (Redrawn from McKee 1983) [Plot: disparity threshold (arcsec) against eccentricity (arcmin) below and above center, for 4-arcmin lines.]
Figure 18.25. Suzanne McKee. Born in Vallejo, California, in 1941. She graduated in psychology from Vassar College in 1963 and obtained a Ph.D. in psychology with G. Westheimer from Berkeley in 1970. After working at the Polaroid Corporation in Boston, she returned to Berkeley to continue working with G. Westheimer. In 1981 she joined the Smith-Kettlewell Eye Research Institute in San Francisco, where she is now a senior scientist. She was vice president of ARVO from 1998 to 1999.
In the upper visual field, an object with uncrossed disparity is closer to the vertical horopter than one with crossed disparity while, in the lower visual field, an object with crossed disparity is closer to the vertical horopter. All the stimuli used by Julesz et al. were within 1˚ of the fixation point. Previc et al. (1995) confirmed that, in the lower visual field, a crossed-disparity region in a random-dot stereogram is detected more rapidly than an uncrossed disparity region. Manning et al. (1987) failed to replicate this finding using random-dot test regions subtending 2˚, with 15 arcmin of disparity, and placed 3˚ above or below fixation. Instead, they found that, over the whole visual field, crossed-disparity regions were more easily detected than uncrossed-disparity regions. Breitmeyer et al. (1977) argued that, because of the greater sensitivity to uncrossed disparity in the upper hemifield, and to crossed disparities in the lower hemifield, a truly vertical plane should appear inclined top away. They obtained a mean apparent displacement of the apparent visual vertical of 1.6˚ in the expected direction. But they failed to realize that this effect could also be due to the backward inclination of the vertical horopter. However, other investigators have not confirmed the effect (see Section 10.7.2b).
18.6.2 STIMULUS SPACING
This section is concerned with the effects of stimulus spacing on stereoacuity. The concepts of disparity gradient, disparity ramp, and ramp density are defined in Section 19.4.
18.6.2a Crowding Effects
The topic of crowding was discussed in Sections 4.8.3 and 13.2.5. A depth interval between two vertical lines can be detected most easily when the lines are an optimal lateral distance apart. Stereoacuity improved when the vertical distance between a test line and a comparison line was increased up to about 0.2˚ (Westheimer and McKee 1980a). Similar crowding effects occur with monocular acuities. As a pair of stereo targets becomes more removed from the fovea, an even greater separation between comparison and test elements is required for optimal stereoacuity (Westheimer and Truong 1988).
Westheimer and McKee (1979) found that the disparity threshold for discrimination of depth between a vertical test line and two flanking comparison lines increased steeply as the distance between the lines was reduced below about 5 arcmin. For separations of less than 2.5 arcmin, depth discrimination became impossible. Kumar and Glaser (1995) confirmed these results. However, for a test line with a single adjacent comparison line, they obtained thresholds better than 20 arcsec when the two lines were only 1 arcmin apart. This is about the same separation between two lines that is required to resolve them as two lines (the line-resolution threshold). This suggests that the stereo and line-resolution tasks are limited by the same factors.
Norman and Todd (1998) found that stereoacuity for two dots was higher when the dots were isolated than when they were on a curved textured surface. Also, stereoacuity declined more rapidly with increased separation of the dots when they were placed on a textured surface than when they were isolated.
As the lateral separation between a test stimulus and comparison stimuli was increased beyond the optimal value, stereoacuity declined, slowly at first and then more steeply (Hirsch and Weymouth 1948a, 1948b). The decline in stereoacuity with increasing stimulus separation was confirmed by Ogle (1956) and Enright (1991b).
With wide separations, one or the other stimulus is displaced from the fovea unless they are fixated in succession. This issue is discussed in Section 18.10.2.
Addition of flanking lines to a test stimulus can also degrade stereoacuity. The stereo threshold for a vertical test line relative to a fixated vertical line placed below it was elevated when the test line was flanked by vertical lines in the plane of fixation (Butler and Westheimer 1978). This effect reached a peak value of a sixfold increase in stereo threshold when the test line and flanking lines were about 2.5 arcmin apart, and declined with greater separations.
The effect fell to zero as the flanking lines were moved from the fixation plane into the plane of the test line. Well-spaced flanking lines can improve stereoacuity. Thus, the ability to detect depth between a vertical test line and a previously seen fixation point was improved when other vertical lines were added within lateral distances up to 40 arcmin and in depth planes within 10 arcmin of the test line. Four well-placed flanking stimuli improved performance more than twice (Kumar and Glaser 1992). Kumar and Glaser (1992) found that when a test line and two flanking lines are separated by less than 5 arcmin, the stereo threshold was elevated when the luminance of the test line was less than that of the flanking lines. The threshold was lowered when the luminance of the test line was moderately greater than that of the flanking lines. A central line and two flanking lines in a different depth plane constitute a spatial modulation of disparity. Effects of spatial modulation of disparity on stereoacuity are discussed in Section 18.6.3. The preceding experiments were conducted at threshold levels of disparity. Suprathreshold effects of crowding are discussed in Section 21.2.
18.6.2b Effects of Density of Stimulus Elements
Experiments reviewed so far involved only two or three target elements. It is not clear whether the results were due to the increasing depth of the disparity ramp or to the increasing separation between the stimuli (ramp density). The effects of varying ramp density (the number of stimulus points per unit distance along the ramp) are now considered. The disparity threshold for discriminating depth between two fixed targets separated horizontally or vertically increased from about 10 arcsec to about 20 arcsec when a third point was interpolated between them to form a linear depth ramp. The threshold continued to increase to about 300 arcsec as more points were interpolated on the ramp until the crowded points created the impression of a continuous line, as shown in Figure 18.26 (Fahle and Westheimer 1988). The effect of increasing the number of points was enhanced when the separation between the endpoints was increased from 10 arcmin to 1˚. This effect occurred for both horizontal and vertical arrays of points. The addition of points between two fixed points does not alter the disparity gradient. It was concluded that the depth between neighboring points (disparity density) is the most important variable determining the depth threshold between the end points of a stimulus. This result suggests the operation of a local process akin to crowding. Fahle and Westheimer also found that the disparity threshold for detecting depth between the ends of a horizontal or vertical line was higher for a line 1˚ long than for one 10 arcmin long. Thus, a second factor is the length of the depth ramp, which suggests the operation of a global factor.
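The distinction between the disparity gradient of a ramp and the disparity between neighboring points can be illustrated with a small calculation. The following is a minimal sketch, not a reconstruction of the stimuli used by Fahle and Westheimer. The function names and the example values (an end-to-end disparity of 20 arcsec over 10 arcmin) are hypothetical and chosen only for arithmetic convenience.

```python
def linear_ramp(end_disparity_arcsec, separation_arcmin, n_interpolated):
    """Positions (arcmin) and disparities (arcsec) of points on a linear depth ramp.

    The two end points are fixed; n_interpolated points are placed at equal
    intervals between them. All names and values here are illustrative.
    """
    n_gaps = n_interpolated + 1
    positions = [separation_arcmin * i / n_gaps for i in range(n_gaps + 1)]
    disparities = [end_disparity_arcsec * i / n_gaps for i in range(n_gaps + 1)]
    return positions, disparities


def ramp_summary(end_disparity_arcsec, separation_arcmin, n_interpolated):
    """Summarize a ramp by its gradient, neighbor disparity, and point density."""
    positions, disparities = linear_ramp(
        end_disparity_arcsec, separation_arcmin, n_interpolated
    )
    return {
        # End-to-end disparity per unit distance: fixed by the end points alone.
        "gradient_arcsec_per_arcmin": (disparities[-1] - disparities[0])
        / (positions[-1] - positions[0]),
        # Disparity between adjacent points: shrinks as points are added.
        "neighbor_disparity_arcsec": disparities[1] - disparities[0],
        # Ramp density: number of points per unit distance along the ramp.
        "points_per_arcmin": len(positions) / separation_arcmin,
    }


if __name__ == "__main__":
    for n in (0, 1, 3, 7):
        print(n, ramp_summary(20.0, 10.0, n))
```

In this sketch the gradient stays the same for every value of n, while the disparity between neighboring points falls and the ramp density rises, which is the dissociation the argument above relies on.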
The effect of an interposed dot on the depth-discrimination threshold for two dots exposed for 2 s decreased as the insertion of the interposed dot was delayed, although the dot still had an effect when delayed by 1 s (Fahle and Westheimer 1995). In these experiments, the depth threshold was elevated by increasing element density. This suggests that the crucial factor is the number of disparate elements within the range of some integrative mechanism. This could be a disparity-averaging mechanism discussed in Section 18.8.2 or a lateral-inhibition mechanism. If lateral inhibition occurs only between detectors with similar disparity sensitivities, then the effects of crowding should be larger for flat surfaces than for curved surfaces, such as a sinusoidal disparity corrugation. This issue is discussed in the following section. Another factor in these crowding effects may be the spacing of elements in the monocular images in addition to factors operating at the level of disparity detection. Finally, vergence eye movements may be involved. The whole issue is an aspect of two broader questions: depth contrast, discussed in Chapter 21, and the relationship between stereoscopic vision and the spatial frequency of disparity modulation, discussed in the next section.
Figure 18.26. Disparity detection and ramp density. A representation of the dichoptic stimuli used by Fahle and Westheimer (1988) to study the effects of dot density on the disparity threshold for detection of depth. According to their results, depth should be more evident between the endpoints of the stimuli when there are no intervening dots.
18.6.3 SPATIAL FREQUENCY OF DISPARITY MODULATION
This section deals with how the detection of depth created by spatial modulations of disparity varies as a function of the spatial frequency of the modulations. The modulation transfer function for detection of luminance modulations is the threshold peak luminance contrast required for detection of a grating as a function of its spatial frequency. We can define an analogous modulation transfer function for detection of modulations of binocular disparity. This function is the threshold peak disparity required for detection of depth produced by a sinusoidal modulation of binocular disparity as a function of the spatial frequency of the disparity modulation.
18.6.3a Disparity Modulation of a Line
Tyler (1973, 1975a) was the first person to investigate this topic. He also introduced the general concept of the scaling of stereoacuity by the spatial frequency of depth modulation (Portrait Figure 18.27). He presented a straight 15˚ vertical line to one eye and a wavy vertical line to the other eye. When fused, the lines produced a line curved sinusoidally in depth. He found that the threshold amplitude of disparity modulation decreased with increasing spatial frequency of disparity modulation, reaching a minimum at about 1 cpd. With higher frequencies of disparity modulation, the threshold rose to a limiting value at a spatial frequency of about 3 cpd, beyond which the depth modulation was no longer evident. This function defines the sensitivity function for the detection of disparity modulation in a line. The amplitude threshold for the monocular detection of undulations in a line showed a similar dependency on the frequency of undulation at spatial frequencies below about 0.5 cpd. However, monocular undulations could be detected best at about 3 cpd and were visible up to about 12 cpd. It looks as though the upper frequency limit of disparity modulation for stereopsis is determined by factors other than the capacity of each eye to detect undulations in a line.
Figure 18.27. Christopher W. Tyler. Born in Leicester, England, in 1943. He obtained a B.A. in psychology from the University of Leicester in 1966 and a Ph.D. in communications with D. Regan from the University of Keele, England, in 1970. He held research appointments at Northeastern University in Boston, University of Bristol in England, and the Bell Laboratories in New Jersey. In 1975 he became a research scientist at the Smith-Kettlewell Institute in San Francisco.
Tyler also showed that the greater the spatial period of each cycle of disparity modulation became, the larger was the upper limit of disparity that elicited depth. In other words, the upper disparity limit is scaled in proportion to the spatial period of disparity modulation: the greater the spatial period, the larger the disparity that can be processed. Tyler suggested that the upper threshold is based on a disparity-gradient limit. Figure 18.28 shows both the lower and upper disparity limits for perception of depth in the vertical line combined into one graph for one subject. One set of data was obtained with a constant size of display and hence a variable number of depth modulations. The other set was obtained with a constant number of depth modulations and hence a variable size of display. For higher frequencies of disparity modulation, both the lower and upper depth-modulation detection limits were somewhat elevated when the number of cycles was constant compared with when the size of the display was constant. Tyler concluded that disparity information is integrated over about two cycles of disparity-defined depth modulation. A related finding is that the larger the area of a region of constant disparity in a random-dot stereogram, the higher is the upper limit of disparity (Tyler and Julesz 1980).
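The link between a gradient limit and an upper disparity limit that scales with spatial period can be made explicit for a sinusoidal profile. The sketch below rests on two assumptions that are not taken from Tyler's papers: the gradient is treated as the local slope of the disparity profile, and the limit value of 1.0 is a hypothetical placeholder. For a profile d(x) = A sin(2πfx), the steepest slope is 2πfA, so a fixed slope limit G allows a peak-to-peak disparity of at most G/(πf), which is proportional to the period 1/f.

```python
import math


def peak_slope(freq_cpd, peak_to_peak_deg):
    """Steepest slope of a sinusoidal disparity profile (deg disparity per deg)."""
    amplitude = peak_to_peak_deg / 2.0
    return 2.0 * math.pi * freq_cpd * amplitude


def max_peak_to_peak(freq_cpd, gradient_limit):
    """Largest peak-to-peak disparity (deg) whose steepest slope stays within the limit."""
    return gradient_limit / (math.pi * freq_cpd)


if __name__ == "__main__":
    limit = 1.0  # hypothetical gradient limit, for illustration only
    for f in (0.25, 0.5, 1.0, 2.0):
        print(f"{f} cpd: upper limit {max_peak_to_peak(f, limit):.3f} deg")
```

Under these assumptions, halving the frequency doubles the permitted disparity, which gives a line of slope -1 when the upper limit is plotted against frequency on logarithmic axes.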
Figure 18.28. Stereopsis and depth modulation. Disparity threshold and upper disparity limit of stereopsis as a function of the spatial frequency of depth modulation in a vertical line. Filled circles: fixed size of aperture. Open circles: aperture varied in size so that one period of depth modulation was visible at each frequency. The dashed line indicates a slope of 1 (N = 1). (Redrawn from Tyler 1975a) [Plot: peak-to-peak disparity difference against frequency of depth modulation (cpd), with a region labeled "Perception of depth".]
18.6.3b Disparity Modulation in Random-Dot Stereograms
A wavy line is not ideal for studying spatial modulations of disparity because a change in the frequency of disparity modulation is accompanied by changes in the spatial frequency of position modulation in the monocular image. A better stimulus is provided by a random-dot stereogram, with a sinusoidally modulated disparity, which appears as a corrugated surface when fused. A surface with horizontal corrugations is created by presenting identical random-dot displays to each eye and horizontally shearing rows of dots sinusoidally between crossed and uncrossed disparities, as in Figure 18.30A. The sheared rows, indicated in the figure by wavy edges, are not visible in the monocular image and therefore do not provide a cue to the spatial frequency or amplitude of the depth corrugations. A surface with vertical corrugations is created by oppositely compressing columns of dots in two stereograms. This is not a useful stimulus because the columns can be seen in each monocular image as modulations in dot density, which introduce visible modulations of mean luminance if the disparity gradient is steep.
A depth-threshold function for disparity modulations is obtained by plotting the threshold disparity modulation as a function of the spatial frequency of sinusoidal modulation. The stereogram shown in Figure 18.29 depicts a frequency-swept grating of disparity modulation, analogous to a frequency- and contrast-swept grating of luminance modulation. Tyler (1974a) determined a threshold function by asking subjects to mark the boundary between the region to the left where the surface appears corrugated and the region to the right where corrugations are not visible. This showed a fall-off in sensitivity for corrugation frequencies above 1 cpd. The upper limit of disparity modulation for stereopsis, found by asking subjects to mark the highest row of the frequency-swept corrugation that appeared corrugated, was about 4 cpd, close to the value obtained with a disparity-modulated line. This upper limit was not due to the inability of subjects to resolve the dots in the monocular image. Nor was it much affected by a 10-fold reduction in dot density. This latter finding does not accord with Fahle and Westheimer’s report, discussed in Section 18.6.2, that stereoacuity is degraded as the number of elements defining a given depth gradient increases. The crucial difference between the experiments may be that the display used by Fahle and Westheimer was a single depth ramp (first-order disparity) and the effects may have been due to a tendency to see isolated inclined surfaces as lying in the frontal plane (Section 21.3.2). In Tyler’s display, several depth undulations with second-order disparities prevent normalization to the frontal plane. Tyler showed with his stimulus that the upper disparity limit for detection of depth modulations is also scaled in proportion to the spatial period of disparity modulation. The upper limit of spatial frequency for detection of depth corrugations is not the upper limit for perceived depth. Above the frequency where corrugations are seen, one is aware of dots dispersed in several depth planes. This is depth transparency, which is discussed in Section 18.9.
Figure 18.29. Frequency-swept grating of disparity modulation. At each level in the fused image, the amplitude where a corrugation can just be seen indicates the amplitude threshold. In each column, the level above which depth modulations are not visible in the fused image defines the upper spatial frequency of disparity modulation for stereopsis. (From Tyler 1974a. Reprinted by permission from Macmillan Publishers Ltd.)
Figure 18.30 shows the disparity-corrugation sensitivity function (the reciprocal of the disparity-modulation threshold) obtained by Schumer and Julesz (1984) for corrugated surfaces offset on 25- and 40-arcmin crossed and uncrossed disparity pedestals. With the pedestal corrugations, the corrugation frequency giving peak depth sensitivity was progressively reduced and the high-frequency loss became more severe.
These results further support the conclusion that, as a depth-modulated display is removed further from the zero-disparity plane, the perception of depth occurs only with more gradual disparity gradients. Note also that sensitivity for displays on crossed disparity pedestals is greater than sensitivity for displays on uncrossed disparity pedestals, which confirms evidence discussed in Section 18.6.4. Schumer and Julesz suggested that this asymmetry between crossed and uncrossed disparities is due to subjects’ misconverging about 5 arcmin in front of the fixation target. Figure 18.31 shows the disparity threshold function obtained by Bradshaw and Rogers (1999) over a range of corrugation frequencies from 0.0125 cpd (80˚ period) to 3.2 cpd, for three sizes of display (10˚, 20˚, and 80˚). Detection of disparity modulations extended down to corrugation frequencies of at least 0.0125 cpd. Low-frequency falloff in sensitivity was not due to a reduction in the number of modulations, since the threshold was not significantly affected by changing the size of the display. For several observers, thresholds for detecting peak-to-trough disparity corrugations were less than 3 arcsec at optimal frequencies of between 0.3 and 0.5 cpd, when the peaks and troughs were separated by a visual angle of more than 1˚. Disparity thresholds for detection of vertical corrugations of low spatial frequency were higher than for horizontally oriented corrugations (Figure 18.31) but there were considerable individual differences in the magnitude of this anisotropy (see Section 20.4.2).
Figure 18.30. Sensitivity to depth modulations. (A) Appearance of the display. (B) Sensitivity for detecting disparity modulations as a function of frequency of modulations for displays superimposed on disparity pedestals. (Adapted from Schumer and Julesz 1984) [Panel B plots sensitivity (1/arcmin) against corrugation spatial frequency (cpd), with separate curves for no pedestal and for the two sizes of crossed and uncrossed pedestal.]
Figure 18.31. Detecting vertical and horizontal corrugations. Threshold disparity required for detection of depth corrugation in a random-dot stereogram as a function of frequency of corrugation for horizontal and vertical corrugations (N = 6). (From Bradshaw and Rogers 1999) [Plot: threshold disparity (arcsec) against spatial frequency of depth modulation (cpd), for vertical ridges (compression disparity) and horizontal ridges (shear disparity) in 10˚, 20˚, and 80˚ displays.]
Prince and Rogers (1998) measured sensitivity to disparity modulation as a function of stimulus eccentricity. With a forced-choice procedure, subjects detected radial depth modulations in an annulus of variable diameter while fixating a central spot (Figure 18.32A). Dot size was scaled for eccentricity. When plotted as a function of corrugation frequency per millimeter of cortex, the peak sensitivity to disparity modulation occurred at the same frequency for all eccentricities (Figure 18.32B). Thus, like shifts in peak sensitivity to luminance modulation with increasing eccentricity, shifts in peak sensitivity to disparity modulation can be accounted for in terms of the cortical magnification factor.
Figure 18.32. Sensitivity to depth corrugations. (A) The stimulus. (B) Peak-to-trough threshold as a function of corrugation spatial frequency for several eccentricities (N = 3). (From Prince and Rogers 1998, with permission from Elsevier)
Tyler and Kontsevich (2001) measured the modulation transfer function (MTF) for detection of Gabor-shaped spatial modulations of disparity in a random-textured display. The MTF for a one-cycle modulation of disparity was the same for a horizontal as for a vertical wavelet, as shown in Figure 18.33a. For a multicycle vertical grating defined by compression disparity, the MTF did not change when the grating was increased in vertical or horizontal extent, as shown in Figure 18.33b. However, the MTF for a horizontal grating defined by shear disparity did change with a change in vertical or horizontal extent, as shown in Figure 18.33c. The disparity threshold at all spatial frequencies was lowered by an increase in the height of the grating. The threshold was lowered even more by an increase in the width of the grating. Thus, disparity information is pooled over a larger extent for horizontal-shear disparity than for horizontal-compression disparity. This agrees with the finding that shear-motion signals in a monocular random-dot display are integrated over a larger area than compression-motion signals (Nakayama et al. 1985).
Schlesinger and Yeshurun (1998) reported that, in a random-dot stereogram with 15% white dots and pixel size 1.5 arcmin, a patch of dots with 9 arcmin of disparity could not be perceived in depth when its diameter was less than 10 arcmin. Such a patch would contain about 10 white dots. This result is difficult to understand since we have no difficulty seeing depth produced by a single pixel in a random-dot stereogram. They related their result to the 3-cpd (10 arcmin distance between peaks and troughs) peak sensitivity to depth corrugations. But there is no theoretical reason why the two thresholds should be related. Similarly, the threshold for detection of a single luminance spot has no relation to the ability to detect luminance modulations. The relationship between sensitivity to spatial disparity modulations and the spatial frequency of luminance modulation is discussed further in Section 18.7.2.
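The construction of a horizontally corrugated random-dot stereogram described in this section can be sketched in a few lines. This is a minimal illustration rather than the procedure of any of the studies cited. The function names, the image size, the whole-pixel disparities, the circular wrapping of shifted rows, and the choice of which shift direction counts as crossed are all assumptions made for the example.

```python
import math
import random


def rotate(row, k):
    """Circularly shift a row k pixels to the right (negative k shifts left)."""
    k %= len(row)
    return row[len(row) - k:] + row[:len(row) - k]


def corrugated_stereogram(width=64, height=64, cycles=2, amplitude_px=4,
                          dot_prob=0.5, seed=0):
    """Return (left, right, disparities) for a horizontally corrugated surface.

    Both eyes receive the same random dots. Each row of the right image is
    shifted horizontally by a disparity that varies sinusoidally with row
    number, so the shear is defined only by the relation between the images.
    """
    rng = random.Random(seed)
    left = [[1 if rng.random() < dot_prob else 0 for _ in range(width)]
            for _ in range(height)]
    disparities = [
        round(amplitude_px * math.sin(2 * math.pi * cycles * y / height))
        for y in range(height)
    ]
    # Rows are shifted whole, so every row keeps the same dots and dot count.
    right = [rotate(row, d) for row, d in zip(left, disparities)]
    return left, right, disparities


if __name__ == "__main__":
    left, right, disparities = corrugated_stereogram()
    print(disparities[:17])
```

Because each row is shifted as a unit and the dots in each row are random, neither image alone carries a visible trace of the corrugation in this sketch. Compressing columns to produce vertical corrugations would instead change the local dot density in each image, which is the drawback noted above.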
18.6.3c Disparity Resolution and Contrast Resolution
The function relating the contrast threshold for detecting depth corrugations to the frequency of disparity modulation shows a band-pass characteristic, with the lowest threshold occurring between 0.3 and 0.5 cpd and the highest detectable depth corrugation at about 3 cpd. A luminance modulation is detected best at a spatial frequency of about 3 cpd and can be detected up to about 50 cpd (Campbell and Robson 1968). Thus, disparity resolution is at least 10 times worse than luminance resolution.
Both types of resolution are affected by the optical properties of the eye and the size of the detectors. A set of detectors can resolve a periodic stimulus only if the spatial period of the stimulus is at least twice the spacing of the detectors. This is the Nyquist limit. For detection of luminance modulations, the highest detectable spatial frequency approaches the Nyquist limit set by the density of foveal receptors. It is not limited by the size of the receptive fields of ganglion cells because their receptive fields have excitatory and inhibitory subregions and it is these that limit resolution. Each ganglion cell is tuned to a particular spatial frequency of luminance modulation. Thus, luminance modulations are detected early in visual processing.
For detection of disparity, the monocular elements in each image must be no denser than the Nyquist limit. Thus, in a random-dot stereogram, each eye must resolve the dots in the stimulus. However, the basic detector for disparity is a binocular cell in the primary visual cortex. These cells also have excitatory and inhibitory subregions. For detectors of phase disparity, these subregions determine the preferred disparity and disparity tuning width of the cell. For detectors of position disparity, the sizes of the monocular receptive fields determine the width of the disparity tuning function. The evidence reviewed in Section 11.4.1 shows that cells in V1 respond only to local absolute disparities. They are not tuned to a particular spatial frequency of disparity modulation or to a particular gradient of disparity. Thus, disparity gradients and modulations are not detected early in the visual system. It is as if each binocular cell in the primary visual cortex assumes that the disparity within its boundaries is evenly distributed. Cells tuned to gradients of disparity exist at higher levels of the visual system (Section 11.5.2). Each of these cells combines inputs from several simple disparity detectors in V1.
Therefore, for detection of disparity gradients there is an extra low-pass filter that limits the detection of modulations of disparity to a frequency set by the sizes of the receptive fields of cells in V1. This is the basic reason why resolution of modulations of disparity is much poorer than resolution of modulations of luminance (Banks et al. 2004a; Filippini and Banks 2009).
Figure 18.33. Disparity modulation thresholds. Mean threshold for detection of disparity modulation of a textured display as a function of the spatial frequency of the modulation. Stimuli consisted of (a) single cycle of modulation, (b) vertical gratings of different heights and widths, and (c) horizontal gratings of different heights and widths (N = 1). (Adapted from Tyler and Kontsevich 2001) [Each panel plots disparity threshold (arcsec).]
18.6.3d Suprathreshold Functions
The perceived contrast of a grating is not much affected by changes in spatial frequency when the contrast is more than 100 times the threshold (Georgeson and Sullivan 1975). Ioannou et al. (1993) reported an analogous effect for disparity modulations. Observers matched the peak-to-trough depth seen in a fixed amplitude disparity corrugation at one of a number of corrugation frequencies between 0.1 and 1.6 cpd to a variable amplitude reference corrugation. When the depth of the corrugations was just above threshold, the low- and high-spatial-frequency corrugations appeared to have less depth than the reference corrugation of intermediate corrugation frequency. However, when the peak-to-trough depth was a factor of a hundred or more above threshold (4 arcmin), the matching functions flattened out and corrugations of all spatial frequencies appeared to have approximately the same depth (Figure 18.34). As with luminance, a suprathreshold mechanism compensates for the low threshold sensitivity of the disparity system to both low and high spatial frequency corrugations.
Figure 18.34. Perceived depth and depth modulation. Bottom curve shows disparity thresholds for detection of corrugations as a function of corrugation frequency. For the upper curves subjects set the disparity of a test surface to match that of a comparison surface. Peak-to-trough disparity is shown on each curve. This subject saw less depth in high- than in low-frequency corrugations. (Adapted from Ioannou et al. 1993) [Plot: threshold/matched disparity (arcsec) against corrugation spatial frequency (cpd); curves labeled Threshold, 30 arcsec, 2 arcmin, and 8 arcmin.]
Lankheet and Lennie (1996) determined the amount of disparity noise required to eliminate the impression of depth in a dynamic random-dot stereogram depicting a corrugated surface. Disparity noise was introduced by random (Gaussian) perturbation of disparity, as illustrated in Figure 18.35. Figure 18.36A shows the results for a stationary grating with peak-to-peak disparity of 5.1 arcmin. The function relating frequency of depth modulation to the effects of added noise had a band-pass characteristic. Stereopsis was most immune to added noise at a spatial frequency of depth modulation between 1.5 and 2 cpd. The upper limit of stereopsis occurred at about 4 cpd. At disparities above 5.1 arcmin, the function showed a low-pass characteristic. Figure 18.36B shows the results for a grating of 1.8 cpd and disparity of 5.1 arcmin drifting at various velocities. The upper temporal limit of stereopsis occurred when the bars of the grating drifted past the fixation point at about 6 Hz. The upper temporal limit declined sharply at higher amplitudes of disparity modulation. The form of the function relating noise sensitivity to the spatial frequency of depth modulation was not affected by the rate at which the grating drifted. Similarly, the form of the function relating noise sensitivity to drift rate was not affected by the spatial frequency of depth modulation. This suggests that spatial and temporal requirements for binocular correlation are independent.
Figure 18.35. Effects of disparity noise on depth detection. The horizontal depth grating has a disparity of ± 4 pixels. The standard deviation of Gaussian noise is 0 in (A), 4 pixels in (B), and 8 pixels in (C). (Reprinted from Lankheet and Lennie 1996 with permission from Elsevier)
18.6.3e Visual Channels for Disparity Modulation
The question now arises whether there are one, two, or more independent channels sensitive to the full range of spatial modulation of disparity. Evidence from single-cell recording in the visual cortex points to the existence of at least three populations of disparity-tuned cells, tuned to zero, crossed, and uncrossed disparities (Section 11.4.1). But this evidence does not establish that there are channels specifically tuned to the spatial modulation of disparity as opposed to local disparity sign and amplitude. Two methods have been used to measure the bandwidth of disparitymodulation channels. One method uses adaptation, the other uses masking. Schumer and Ganz (1979) found that the disparity threshold for detection of depth in a random-dot stereogram with a given corrugation frequency, f, was not affected
STEREOSCOPIC ACUIT Y
•
321
Figure 18.36. Binocular correlation sensitivity. (A) Correlation sensitivity as a function of spatial frequency of disparity modulation of a stationary cyclopean grating, with a disparity of 2.55 arcmin. Results for three subjects. (B) Sensitivity as a function of temporal frequency of local disparity modulation produced by a drifting grating. Spatial frequency 1.8 cpd. Points on the ordinate are for a stationary grating. Sensitivity indicates noise amplitude at 85% detection (N = 3). Axes: correlation sensitivity (arcmin) against spatial frequency (cpd) in A and against temporal frequency (Hz) in B. (Redrawn from Lankheet and Lennie 1996)
by superimposition of another disparity grating of frequency 3f. They concluded that the spatial modulation of disparity is processed in several channels, each with a full bandwidth at half amplitude of between 2 and 3 octaves. Schumer and Ganz also measured the effect of prolonged viewing of a grating of fixed corrugation frequency on the threshold of a test grating, as a function of the corrugation frequency of the test grating. The threshold elevation was centered on the frequency of the adapting grating and confined to about two octaves within the visual system’s total corrugation bandwidth. This result also supports the notion of broadly tuned disparity-modulation channels with overlapping tuning functions. There could be a multistage system involving low-level disparity-modulation detectors, each with a narrow
disparity range, feeding into broadly tuned channels at a higher level. But the final outcome is that the visual system responsible for detecting spatial modulations of disparity is metameric, like color, orientation, and many other sensory systems (see Section 4.2.7). Two or more superimposed stimuli with disparity modulations falling well within the bandwidth of a single channel should be metamerically combined by that channel into one signal. This is related to the topic of disparity averaging discussed in Section 18.8.2. Two stimuli processed by distinct channels could interact by mutual inhibition, or by opponency, as in color opponency. On the other hand, distinct channels could interact by mutual facilitation or a combination of inhibition and facilitation, depending on the relative frequencies of disparity modulation in the two stimuli.
Tyler (1983) used a masking paradigm. The disparity threshold for detection of depth produced by a disparity-modulated grating was determined in the presence of a masking grating of variable spatial frequency. The data indicated that disparity-modulation channels have a bandwidth of about one octave. A masking procedure underestimates the width of sensory channels because the mask affects the channel on which it is centered but leaves overlapping channels on the side opposite the mask relatively unaffected. The signal may be detected by the unaffected off-center channel, since this channel has the highest signal-to-noise ratio. The problem of off-channel viewing can be overcome by use of a notched mask consisting of two nonoverlapping band-pass masks symmetrically positioned about the frequency of depth modulation of the test stimulus. Off-channel viewing is prevented because the channel with the highest signal-to-noise ratio is the one centered on the frequency of the signal. The threshold of the test stimulus is measured as a function of the width of the notch between the two masks.
With this procedure Cobo-Lewis and Yeh (1994) obtained channel bandwidths for disparity modulations of about twice those reported by Tyler, and more in agreement with those reported by Schumer and Ganz. However, comparison of data derived from the various methods depends on certain assumptions, such as the linearity of the mechanisms involved and the symmetry of the disparity-modulation tuning functions. Hibbard (2005) also used the notched-mask procedure to measure the orientation bandwidth of channels tuned to modulations of disparity. The signal consisted of a 5.4˚ diameter random-dot stereogram containing a vertical or horizontal disparity-defined depth modulation of 0.73 cpd. The signal was masked by two superimposed random-dot displays of the same spatial frequency but oriented symmetrically around the orientation of the signal stereogram. The orientation of the signal fell in a gap between the orientations of the masks. The ability of subjects to detect whether the signal was present improved rapidly as the
width of the gap (notch) between the masks increased to about 10˚. The half-width at half-height of the orientation tuning of disparity-modulation channels was estimated to be about 12˚, which is similar to the width of channels tuned to modulations of luminance. Also, there was no evidence that orientation tuning widths of channels tuned to horizontal depth modulations differed from those tuned to vertical depth modulations. Tyler and Konsevich found that disparity information is pooled over a larger extent for horizontal-shear disparity than for horizontal-compression disparity (see Section 20.4.2). Hibbard argued that this anisotropy occurs because of some nonlinear process at a higher level of processing.
18.6.3f Discrimination of Disparity Modulations
If spatial modulations of disparity are detected by distinct channels with overlapping tuning functions, one should expect a disparity-modulated grating to be most easily detected when its spatial frequency of disparity modulation corresponds to the peak of one of the channels. Also, masking of one disparity-modulated grating by another should be most marked when they have the same modulation frequency. On the other hand, a change in modulation frequency should be most easily discriminated when the spatial frequency falls where the tuning functions of neighboring channels intersect (see Section 4.2.8). Also, the threshold elevation produced by inspection of a grating should be greatest at modulation frequencies offset from that of the adapting grating. Grove and Regan (2002) measured the least detectable difference in modulation frequency of a horizontal cyclopean grating with a fixed peak-to-trough disparity of 4.2 arcmin. For each of a set of gratings, subjects indicated whether its modulation frequency was greater or less than the mean of the set. Modulation frequency discrimination of the cyclopean grating was between 0.25 and 0.5%. The discrimination threshold was only slightly lower for a luminance-defined grating. The discrimination threshold was almost constant over the modulation frequency range of 0.16 to 2.0 cpd. In a second experiment, the modulation-frequency discrimination threshold rose sharply when the modulation was reduced to near the disparity-detection threshold and when the disparity became so large that the images could not be fused. In a final experiment, Grove and Regan showed that preadapting to a grating of one frequency of disparity modulation elevated the frequency discrimination threshold most when the modulation frequency of the test grating fell to one side of that of the inspection grating.
These results support the idea that cyclopean disparity-modulated gratings, like luminance-defined gratings, are detected by spatial-frequency channels with overlapping tuning functions.
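The reasoning about overlapping channels can be made concrete with a minimal sketch. It assumes, purely for illustration, that a disparity-modulation channel has a Gaussian tuning function on a log-frequency axis with a full bandwidth at half height of 2 octaves; the function names and the Gaussian form are hypothetical and are not taken from the studies cited above. Under that assumption the response is largest at the channel's peak, where detection should be best, while the response changes fastest with frequency on the flank, where discrimination of a frequency change should be best.

```python
import math


def channel_response(freq_cpd, peak_cpd, fwhm_octaves=2.0):
    """Response of a hypothetical channel with Gaussian tuning in log2 frequency.

    The Gaussian shape and the 2-octave default bandwidth are assumptions.
    """
    sigma = fwhm_octaves / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    offset_octaves = math.log2(freq_cpd / peak_cpd)
    return math.exp(-0.5 * (offset_octaves / sigma) ** 2)


def response_slope(freq_cpd, peak_cpd, fwhm_octaves=2.0, step_octaves=1e-4):
    """Central-difference slope of the response, per octave of frequency change."""
    upper = channel_response(freq_cpd * 2.0 ** step_octaves, peak_cpd, fwhm_octaves)
    lower = channel_response(freq_cpd * 2.0 ** -step_octaves, peak_cpd, fwhm_octaves)
    return (upper - lower) / (2.0 * step_octaves)


if __name__ == "__main__":
    peak = 0.5  # cpd, an arbitrary illustrative peak frequency
    for f in (0.25, 0.5, 1.0):
        print(f, channel_response(f, peak), response_slope(f, peak))
```

With these assumed parameters the slope is zero at the peak and steep one octave away, which is the qualitative pattern the channel account predicts for detection versus discrimination.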
18.6.4 STEREOACUITY WITH CROSSED AND UNCROSSED DISPARITIES
Larson (1990) found that stereo acuity on the Frisby and TNO stereo tests was, on average, no better for crossed than for uncrossed disparity. But these tests are not very sensitive. Woo and Sillanpaa (1979) found that mean stereoacuity was 5.6 arcsec for crossed disparity targets and 14.5 arcsec for uncrossed disparity targets. Grabowska (1983) and Landers and Cormack (1997) reported a similar difference. Schumer and Julesz (1984) found that sensitivity to disparity modulations in displays on crossed-disparity pedestals was greater than for displays on uncrossed-disparity pedestals (see Figure 18.30). They suggested that this asymmetry is due to a tendency of subjects to misconverge about 5 arcmin in front of the fixation target. Westheimer and Tanzman (1956) presented stereoscopic stimuli only briefly to eliminate monocular cues and effects of changing vergence. Subjects detected the depth order of a test spot relative to a fixation spot. On average, subjects detected uncrossed disparities more reliably than crossed disparities. Performance fell to chance for both types of disparity at a disparity of about 7˚. Using a similar procedure, Blakemore (1970c) found that subjects correctly categorized the disparity-defined depth of a centrally placed slit relative to a fixation point at well above chance levels for crossed disparities of between 4 and 7˚ and for uncrossed disparities of between 9 and 12˚. Schor and Wood (1983) investigated this question using small patches with difference-of-Gaussian luminance profiles of varying center spatial frequency. Figure 18.37 shows that the stereo threshold (lower dotted line) increased more rapidly with decreasing spatial frequency for crossed disparities than for uncrossed disparities. Also, the efficiency with which disparity was coded into depth deteriorated with decreasing spatial frequency. The deterioration was more marked for crossed than for uncrossed disparities. 
This means that, for low spatial frequencies, less depth was evoked by a given disparity when it was crossed than when it was uncrossed. Thus, the distance from fixation of low spatial-frequency stimuli with crossed disparity was underestimated. An explanation of this disparity bias in terms of a correlation between disparity tuning and spatial scale was provided in Section 11.4.3b. Figure 10 shows that, for a line stimulus, stereoacuity declined more rapidly with increasing crossed pedestal disparity than with increasing uncrossed pedestal disparity. These results are discussed more fully in Section 18.7.3. One would expect a relationship between the sign of phoria and differences in stereoacuity for crossed and uncrossed images. Of 53 observers who had better stereoacuity when the circles of Wirt’s stereotest had uncrossed disparities, 75% were esophoric. Of 27 observers who had better acuity with the circles in crossed disparity, 74.5% were exophoric (Shippman and Cohen 1983).
disparities between +4˚ and derived estimates of d’ from the percentage of correct scores. Some subjects performed better with crossed disparities and others with uncrossed disparities but, on average, crossed disparities were more precisely categorized than uncrossed disparities. Several investigators have reported that, in a randomdot stereogram, depth intervals based on crossed disparities were more rapidly detected and generally more accurately estimated than depth intervals based on uncrossed disparities, which tended to be underestimated (Manning et al. 1987; Finlay et al. 1989; Patterson et al. 1992a, 1995).
18.6.5 STEREOACUITY AND STIMULUS ORIENTATION
Summary The balance of evidence suggests that the disparity threshold for detection of depth is lower for crossed disparities than for uncrossed disparities. Also, at least in random-dot stereograms, depth from crossed disparities is estimated more accurately than depth from uncrossed disparities. Mustillo (1985) reviewed differences between the responses of the visual system to crossed and uncrossed disparities.
Figure 18.37. Stereopsis and spatial frequency. Dotted curves indicate the upper and lower disparity limits for stereopsis as a function of spatial frequency of DOG patches for uncrossed (upper figure) and crossed disparities (lower figure). Solid lines indicate the efficiency with which disparity in a DOG patch is coded into depth, measured by the disparity in the patch required to match the depth of a line depth probe. Disparities of the depth probes for which efficiency was measured are indicated by arrows (N = 1). Axes: disparity (arcmin) against center spatial frequency (cpd) and center spatial period (deg). (Adapted from Schor and Wood 1983)
There is also evidence that depth intervals based on crossed disparity are more precisely categorized than depth intervals based on uncrossed disparities. Herring and Bechtoldt (1981) used a five-category scale of relative distance of bar and disk-shaped stimuli. Crossed and uncrossed disparities of 15 and 45 arcmin were categorized with equal precision. However, subjects had practice trials with feedback, so they would soon have learned that the two pairs of depth intervals were equal. Lasley et al. (1984) asked people to categorize the depth of a test stimulus with respect to a comparison stimulus for
Consider a line of dots tilted at angle θ to the horizontal and set at a certain distance beyond a circular aperture on which the eyes are converged. This creates parallel disparate images of lines of dots, as shown in Figure 18.38. There are two distinct issues. The first concerns the dots that the visual system links in the two eyes. The second concerns the component of disparity between linked dots that is used to code depth. If horizontally aligned dots are linked, there is only a horizontal component of disparity, d, which is independent of θ. If nearest-neighbor dots orthogonally opposite each other are linked, two disparities could code depth. The first is the oblique disparity, d cos θ, as shown in Figure 18.38. In this case perceived depth should be proportional to cos θ and therefore decrease with increasing θ for a constant d. But when subjects use only the horizontal disparity component of the oblique disparity, d cos² θ, then perceived depth should decline more rapidly with increasing θ. There are several reports that stereoacuity declines in proportion to the cosine of the angle of tilt of test rods in the frontal plane (see Ogle 1955, p. 496; Ebenholtz and Walchli 1965). Patel et al. (2003) reported that, initially, disparities in random-dot displays are detected by sets of binocular cells tuned to different orientations and spatial frequencies. They are then converted into equivalent horizontal disparities. The following factors must be considered before definite conclusions can be drawn:
1. Effects of vertical disparity on detection of horizontal disparity If the visual system links orthogonal, nearest-neighbor points, and codes depth in terms of only the
were not affected when up to 15 arcmin of vertical disparity was introduced into the images of the test disk. This is less than the maximum 0.7˚ of vertical disparity used by Friedman et al. We saw in Section 18.4.2a that the addition of overall vertical disparity disrupts the detection of depth from horizontal disparity.
2. Increased image density A reduction in stereoacuity with increasing tilt of a test rod could arise because a given horizontal disparity is more difficult to detect when images are crowded. As a test rod with a given horizontal disparity becomes more tilted, the disparate images come closer together. There seems to be no evidence on this point other than that reported in Section 18.6.2.
3. Preponderance of cells tuned to vertical Horizontal disparities between vertical lines may be detected more efficiently than disparities between tilted lines. This would be so if there were a preponderance of binocular cells tuned to vertically oriented lines. The evidence on this point is equivocal (Section 11.4.5). This factor would not operate with a stimulus consisting of a line of points.
Figure 18.38. Disparities produced by an oblique row of dots. The left-eye and right-eye images are shown, with the disparities d, d cos θ, and d cos² θ marked.
horizontal component of the orthogonal disparity, namely d cos² θ, perceived depth should decrease rapidly with increasing angle of tilt. But this horizontal disparity occurs between images that also have a vertical disparity component of d cos θ sin θ. This may make horizontal disparity difficult to detect, as explained in Section 18.4.2a. Friedman et al. (1978) investigated this question. They presented a disk 15 arcmin in diameter to each eye with a separation of 1˚ and a fixation point midway between them. The disks were exposed for 100 ms. Subjects estimated the distance of the disks relative to the screen as the axis joining the centers of the disks was tilted at various angles (θ) to the horizontal. The horizontal component of disparity was cos θ, in degrees. Perceived depth declined more rapidly than cos θ. They concluded that depth is evoked by the horizontal component of an oblique disparity but that vertical disparity attenuates the impression of depth evoked by a given horizontal disparity. Van Ee and Schor (2000) conducted a similar experiment in which subjects adjusted the horizontal disparity of the images of a 15-arcmin test disk until it appeared at the same depth as a comparison stimulus with 15 arcmin of horizontal disparity. Depth settings
4. Anisotropy of monocular acuity Monocular contrast sensitivity and vernier acuity are higher for vertical lines than for oblique lines (see Howard 1982). Insofar as stereoacuity depends on monocular acuity, one might expect the same anisotropy for stereoacuity based on line stimuli. Anisotropy in the perception of horizontal and vertical depth modulations is discussed in detail in Section 20.4.2.
5. Effect of line length Blake et al. (1976) proposed that the loss of stereoacuity with tilt is due to a reduction in the height of the image of the test rods as the angle of tilt increases. They cited Andersen and Weymouth (1923) to support their proposal that stereoacuity declines with a reduction in line length. This is an unlikely factor with rods as long as those used by Blake et al. Davis et al. (1992) found that stereoacuity for oblique difference-of-Gaussian patches was not improved by an increase in the length of the patches. This also argues against an effect of line length.
6. Effect of type of disparity detector A disparity detector cannot detect a disparity along the length of a bar when the image extends beyond the receptive field (see Section 11.4.5). The effects of stimulus tilt on stereoacuity for a grating should depend on whether depth is coded by displacement disparities or phase disparities. Consider, first, a system using point disparities. Stereoacuity for a grating should vary with the cosine of the tilt angle
when based on horizontal point disparities, but should not vary when based on point disparities orthogonal to the grating. For a system using phase disparities, the relative horizontal phase shift of the images of a tilted grating is the same as the relative orthogonal phase shift. This is because the horizontal spatial frequency of a grating decreases in proportion to the angle of tilt. Thus, for a system using phase disparities derived along either the horizontal axis or the orthogonal axis of a tilted grating, stereoacuity should not vary with angle of tilt. Morgan and Castet (1997) found that stereoacuity for a grating expressed as a function of phase shift was independent of tilt angle up to about 80˚. Stereoacuity as a function of horizontal point disparity was proportional to the cosine of the angle of tilt. These results suggest that subjects were using phase disparities rather than position disparities. But Morgan and Castet pointed out that horizontal point disparities may have been detected but scaled with respect to the increase in the horizontal spatial period of the grating.
7. Ease of image linkages The effects of stimulus tilt on stereoacuity should depend on the ease with which stimulus points can be linked horizontally as opposed to orthogonally, as illustrated in Figure 18.38. With short oblique lines, perceived depth is determined by the horizontal disparity between the ends of the lines. As the length of oblique lines is increased, the disparity of the end points becomes undetectable, and perceived depth conforms more and more to the disparity between horizontally aligned points on the lines (van Ee and Schor 2000). Morgan and Castet found that, with an oriented Gaussian patch, stereoacuity based on the horizontal component of disparity did not vary with the angle of tilt but that acuity based on the orthogonal component varied as a function of the cosine of the tilt angle.
This is presumably because the images of a localized tilted Gaussian patch match better along the horizontal axis than along the orthogonal axis. The stereoacuity for two dichoptic circular Gaussian patches with variable orientations of the interpatch axis depended on the horizontal component of disparity between the centroids of the patches. In this case, the patches could be matched only along the interpatch axis, but depth sensations depended on the horizontal component of this disparity.
8. Effect of line-end occluders The true ends of a featureless rod seen through an aperture are not visible. When the rod is tilted, the visual system could link nearest-neighbor, orthogonally opposite points or horizontally opposite points or other pairs of points. This is the stereo aperture problem. It is analogous to the aperture problem that occurs in the ambiguity of the
perceived direction of motion of a grating seen through an aperture (Section 22.3.1). A drifting oblique grating seen through an aperture tends to be seen as moving parallel to the boundary of the aperture. Van Dam and van Ee (2004) had subjects match the depth of a disparity probe to that of an oblique line seen beyond two vertical textured occluders. The perceived depth of the line conformed to that predicted from the horizontal disparity between the images of the line. Subjects continued to use the horizontal linkage when the orientation of the occluders was varied. If subjects had used the disparity between the points where the line intersected the edges of the occluder, perceived depth would have varied with the orientation of the occluders. When the true ends of the lines were visible, subjects relied on the horizontal component of disparity between the line ends. A distinct issue is whether sensitivity to a difference in the disparity-defined depth of two gratings is affected by the relative orientations of the two gratings. Farell (2006) asked subjects to judge whether a 3-cpd central grating was nearer or more distant than a surround grating. An orientation difference of 10˚ increased the relative-disparity threshold. The mean threshold disparity was 1.1 arcmin when the gratings had the same orientation and about 3.7 arcmin when their orientations differed by more than 45˚. Farell concluded that this supports the idea of nonseparable coding of disparity and orientation. But perhaps subjects became uncertain about the directions of the disparities in the two gratings when they differed in orientation. Another issue is whether stereoacuity is affected when lines comprising a stereogram have different orientations in the two eyes. This issue was discussed in Section 15.3.5. Another issue is whether detection of disparity-defined depth modulations is affected by the orientation of the modulations. This issue is discussed in Section 20.4.2.
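The disparity components described earlier in this section for an oblique row of dots can be tabulated with a short sketch. It takes the expressions exactly as the text gives them: a horizontal-link disparity d, an orthogonal (nearest-neighbor) disparity d cos θ, and the horizontal and vertical parts of that orthogonal disparity, d cos² θ and d cos θ sin θ. The function name and dictionary keys are hypothetical, and θ (the tilt angle) is simply whatever angle makes the orthogonal disparity equal d cos θ; on the usual geometry that corresponds to tilt measured away from vertical, which is an assumption of this sketch.

```python
import math


def oblique_disparity_components(d, theta_deg):
    """Disparity components for a tilted row of dots, using the text's formulas.

    d is the horizontal disparity between horizontally aligned dots (any unit).
    theta_deg is the tilt angle in degrees, defined so that the orthogonal
    disparity equals d * cos(theta) (an assumed reading of the angle).
    """
    theta = math.radians(theta_deg)
    orthogonal = d * math.cos(theta)
    return {
        "horizontal_link": d,                                  # independent of tilt
        "orthogonal_link": orthogonal,                         # d cos(theta)
        "orthogonal_horizontal_part": orthogonal * math.cos(theta),  # d cos^2(theta)
        "orthogonal_vertical_part": orthogonal * math.sin(theta),    # d cos(theta) sin(theta)
    }


if __name__ == "__main__":
    for tilt in (0, 30, 60):
        print(tilt, oblique_disparity_components(10.0, tilt))
```

The table makes the competing predictions easy to compare: depth that follows the horizontal linkage stays constant with tilt, depth that follows the orthogonal disparity falls as cos θ, and depth that follows only its horizontal part falls as cos² θ.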
Summary Perceived depth and stereoacuity generated by line images oriented at various angles depend on the horizontal component of their disparity. Stereoacuity for images consisting of oriented Gaussian patches that are tilted but lie on the same horizontal axis depends on their horizontal disparity rather than on the disparity orthogonal to the orientation axis of the patches. Stereoacuity for grating images seen through circular apertures depends largely on the phase disparity of the images rather than on their positional disparity.
18.6.6 DISCRIMINATION OF DISPARITY GRADIENTS
Lunn and Morgan (1997) compared discrimination thresholds for relative depth with those for a depth gradient and
depth curvature. In the first experiment, subjects saw a random-dot stereogram containing a sinusoidal modulation of disparity that created horizontal ridges in depth. In such a stimulus, disparity amplitude is independent of modulation frequency, disparity gradient is proportional to frequency, and disparity curvature is proportional to frequency squared. Thus, performance over a range of frequencies indicates which cue is used. Two subjects indicated which of two intervals contained a stimulus with greater depth for depth corrugations of 0.25 and 0.5 cpd and peak-to-trough disparities from zero to 13.3 arcmin. In terms of disparity amplitude, the Weber fraction increased with increasing disparity in a similar way for both frequencies of depth modulation. When the data were plotted in terms of disparity gradient or curvature, the increase in the Weber fraction with increasing disparity was steeper for the lower than for the higher frequency of depth modulation. These results suggest that subjects based their judgments on disparity amplitude rather than on the first or second derivatives of disparity. This does not prove that subjects do not use disparity gradients or depth curvatures when these alone are available. In their second experiment, Lunn and Morgan isolated each of the three types of disparity by using random-dot stereograms with spatial modulations of disparity with profiles shown in Figure 18.39. The frequency of disparity modulation varied between 0.25 and 0.37 cpd. Subjects discriminated (1) depth amplitudes of square disparity modulations, (2) depth gradients in triangular disparity modulations, and (3) depth curvatures in parabolic disparity modulations. The spatial period of each waveform was jittered so that subjects could use only the defined cue. For both subjects, the Weber fraction was 4%–10% for the depth step, 6%–12% for the depth ramp, and 13%–30% for the depth curvature.
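The scaling of the three cues in a sinusoidal corrugation can be checked with a small worked sketch. It assumes a disparity profile of the form (A/2) sin(2πfx), where A is the peak-to-trough disparity in arcmin, f is the corrugation frequency in cpd, and x is position in degrees; this parameterization and the function name are illustrative assumptions, not the stimulus code of the study. The maximum first and second derivatives of that profile are (A/2)(2πf) and (A/2)(2πf)².

```python
import math


def sinusoid_peak_cues(peak_to_trough_arcmin, freq_cpd):
    """Peak amplitude, gradient, and curvature of an assumed sinusoidal profile.

    Profile assumed: (A / 2) * sin(2 * pi * f * x), with x in degrees.
    """
    half_amplitude = peak_to_trough_arcmin / 2.0
    omega = 2.0 * math.pi * freq_cpd  # radians of phase per degree of visual angle
    return {
        "amplitude_arcmin": peak_to_trough_arcmin,
        "max_gradient_arcmin_per_deg": half_amplitude * omega,
        "max_curvature_arcmin_per_deg2": half_amplitude * omega ** 2,
    }


if __name__ == "__main__":
    low = sinusoid_peak_cues(13.3, 0.25)
    high = sinusoid_peak_cues(13.3, 0.5)
    print(low)
    print(high)
```

Doubling the corrugation frequency from 0.25 to 0.5 cpd leaves the amplitude unchanged, doubles the peak gradient, and quadruples the peak curvature, which is why comparing performance across the two frequencies can indicate which cue subjects used.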
These results suggest that people possess mechanisms tuned to disparity-defined depth ramps and curvatures, but that they are best at discriminating depth steps and relatively poor at discriminating depth gradients and depth curvatures. However, the stimuli differed in the area over which information was distributed. For depth steps, the information is available at every point. For gradients and curvatures, information has to be integrated over an area. Rogers and Cagenello (1989) obtained Weber fractions of 5% for discrimination of depth curvature in a horizontal parabolic cylinder (Section 20.6.5). However, Lunn and Morgan pointed out that subjects could have detected local disparities rather than disparity curvature. Another factor may be that Rogers and Cagenello used larger parabolas constructed from lines rather than random dots. Warren et al. (2002) measured the precision with which subjects could set a dot until it appeared to lie on a straight or parabolic contour defined by a series of dots.
Figure 18.39. Stimuli used by Lunn and Morgan (1997). Depth profiles of (a) square-wave, (b) triangular-wave, and (c) parabolic-wave disparity modulations.
The contour was oriented at different angles in 3-D space. The mean separation of the dots was 25 mm for the line and 30 mm for the parabola. The precision of interpolation was comparable to Vernier acuity. Precision was higher for straight contours than for parabolic contours and higher for contours in the frontal plane than for those orientated in depth. But here also, the stimuli differed in the area over which information was distributed. Interpolation into a straight contour requires information from only two points. Interpolation into a curved surface requires information from several points. Vreven (2006) compared subjects’ ability to discriminate disparity modulations that differ in depth but not shape with their ability to discriminate disparity modulations that differ in shape but not depth. The stimuli are shown in Figure 18.40. In the within-shape task, subjects reported whether a depth modulation with fixed shape had more or less depth than a standard. In the between-shape task, subjects reported whether a stimulus with fixed depth had more or less curvature than a standard. Disparity thresholds were larger for within-shape discrimination than for between-shape discrimination.
18.6.7 STEREOACUITY AND VIEWING DISTANCE
For a stereoacuity of 1 arcmin, the depth interval between a point at a distance of 215 m and a point at infinity can just be detected. At distances beyond 215 m, no depth intervals can be detected on the basis of disparity. With a stereoacuity of 1 arcsec, depth intervals of objects beyond about 13 kilometers cannot be detected. The angular disparity created by two objects with a fixed linear separation in depth is approximately inversely proportional to the square of viewing distance. However, stereoacuity should not vary with viewing distance because it is defined in terms of angular subtense, not linear depth.
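The limiting distances quoted above follow from the small-angle relation between disparity and distance. The sketch below is a worked check, assuming an interocular separation of 62.5 mm; the text does not state the value used, but that figure reproduces the 215 m limit. The function names are hypothetical. The disparity of a point at distance D relative to a point at infinity is taken as a/D radians, and the disparity between two points is the difference of two such terms.

```python
import math

ARCSEC_PER_RADIAN = 3600.0 * 180.0 / math.pi


def limiting_distance_m(stereoacuity_arcsec, interocular_m=0.0625):
    """Distance beyond which a point cannot be distinguished from infinity.

    Uses the small-angle approximation disparity = a / D; the 62.5 mm
    interocular separation is an assumption, not a value given in the text.
    """
    threshold_rad = stereoacuity_arcsec / ARCSEC_PER_RADIAN
    return interocular_m / threshold_rad


def disparity_arcsec(depth_interval_m, distance_m, interocular_m=0.0625):
    """Relative disparity between points at distance_m and distance_m + depth_interval_m."""
    near_vergence = interocular_m / distance_m
    far_vergence = interocular_m / (distance_m + depth_interval_m)
    return (near_vergence - far_vergence) * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    print(limiting_distance_m(60.0))   # stereoacuity of 1 arcmin
    print(limiting_distance_m(1.0))    # stereoacuity of 1 arcsec
    print(disparity_arcsec(0.01, 2.0), disparity_arcsec(0.01, 4.0))
```

Under these assumptions a 1-arcmin threshold gives a limit close to 215 m and a 1-arcsec threshold gives a limit close to 13 km, and doubling the viewing distance reduces the disparity of a fixed depth interval by about a factor of four.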
18.7 STEREOACUITY AND SPATIAL SCALE
18.7.1 INTRODUCTION
Figure 18.40. Stimulus profiles redrawn from Vreven (2006). (A) Subjects reported whether each of the gray shapes differed from the standard (bold shape) in depth. (B) Subjects reported whether each of the gray shapes differed from the standard (bold shape) in shape. Only one shape was presented at a time.
Most investigators have found that stereoacuity is not affected significantly by changes in the angle of vergence required to fixate the visual target with other cues to distance eliminated (Ogle 1958; Brown et al. 1965). However, the nearest distance in these studies was only 40 cm. Some deterioration in stereoacuity has been reported at viewing distances less than 50 cm (Amigo 1963; Lit and Finn 1976). In assessing the effects of changes in vergence induced by base-in or base-out prisms, one must allow for chromatic dispersion produced by the prisms, changes in fixation disparity (Section 10.2.4), and stimulus blur due to changes in accommodation that accompany changes in vergence (Section 10.4.2) (Fry and Kent 1944). Horizontal or vertical decentration of spectacle lenses reduced the range of disparity in a random-dot stereogram within which stereo depth was produced (Jiménez et al. 2000). Bradshaw and Glennerster (2006) found some elevation in the threshold for detection of disparity-defined depth corrugations when the separation of the stereoscope screens required subjects to converge to a simulated distance of 28 cm rather than 57 cm. The actual distance of the screens, and hence accommodation, was constant. The loss of acuity at the near distance was larger when they varied the actual distance of the screens and added other cues to distance. They eliminated effects of changing accommodation by using lenses. Bradshaw and Glennerster concluded that stereoacuity is not determined wholly by angular disparity at near distances. However, the loss of stereoacuity at near distances could be due to increased fixation disparity arising from inadequate convergence. Also there is greater instability of vergence at near distances, because of the greater vergence demand (see Ukwade et al. 2007).
In monocular vision, acuity, receptive field size, and the spatial scale of the stimulus are related. For instance, vernier acuity for a given stimulus is constant over the visual field when scaled for the mean size of receptive fields (the cortical magnification factor) (Levi et al. 1985). As one moves into the retinal periphery, receptive fields become larger and there is a concomitant loss in the ability to process fine disparities. Felton et al. (1972) suggested that disparity is processed by distinct size channels: small disparities by small receptive fields and coarse disparities by large receptive fields. This is referred to as size-disparity correlation. Marr and Poggio (1979) and Frisby and Mayhew (1980) developed models of the stereoscopic system based on this idea. There are at least four functional reasons for expecting different magnitudes of disparity to be processed in distinct spatial-scale channels.
1. Multiple image-matching channels If the range of disparities to which a disparity detector responds is equal to one period of its spatial-frequency tuning function, then there will be only one potential match of dichoptic images within this range of disparities. Consequently, different spatial frequencies would be processed by distinct disparity detectors, which would simplify the process of image matching.
2. Use-specific channels Size-disparity correlation would allow each channel to be used for specific purposes. For instance, the large-size channel could control large transient vergence, while the small-size channel could control the vergence locking mechanism (Section 10.5.10) and pursuit eye movements in depth (Section 22.6.1).
3. Interchannel facilitation A given channel could help to resolve ambiguities in the linking of images processed in another channel. With periodic visual patterns, disparity detection is necessarily linked to spatial scale.
The largest disparity over which the two images of a periodic grating can be matched in a particular way is half the spatial period of the grating. If the disparity is greater than this, the images of the grating match up in a different way, as in the wallpaper illusion. A low spatial-frequency component of a periodic display allows the visual system to identify matches between the images at disparities greater than the periodicity of a high spatial-frequency component. Since there are necessarily fewer matching contours in low-frequency components than in high-frequency
components of a display, there is less chance of finding the wrong match in the large spatial scale, coarse-disparity system than in the small spatial scale, fine-disparity system. It is therefore an efficient strategy to first find correspondences between the low spatial-frequency components of a display and use these to drive vergence eye movements to reduce these disparities to a minimum so that residual fine disparities can be detected by the fine-scale system (Marr and Poggio 1979).
4. Channels in distinct disparity-detection systems. In stereograms consisting of isolated elements rather than periodic patterns, there is another reason for expecting the scale of disparity processing to be linked to the scaling of the sizes of receptive fields. First, assume that disparity coding depends only on the spatial offset of identical monocular receptive fields (inter-receptive-field disparity). Theoretically, a binocular cell with small receptive fields in the two eyes could code large disparities if the monocular receptive fields were far apart, although, with large disparities, the system would be prone to register noncorresponding images. A binocular cell with large receptive fields in the two eyes should be able to detect small disparities unless the cell’s positional tuning is too coarse to produce a fine disparity signal. Thus, when disparity detection is based on receptive-field offset, the fine and coarse limits of disparity tuning are not closely linked to the spatial scale of the stimulus. Predictions are different if disparity tuning depends on the phase disparity of subregions within the monocular receptive fields. Phase-disparity detectors cannot detect a disparity larger than the spatial period of their ON and OFF regions (Section 11.4.3). For a cell with a single spatial period, the upper limit of disparity detection is determined by the diameter of its receptive fields.
However, a cell could be sensitive to disparities smaller than excitatory and inhibitory subregions within its receptive fields. Thus, for phase-disparity detectors, the upper limit but not the fine limit of disparity tuning is tied to the spatial scale of stimulus elements. Psychophysical evidence for a linkage between disparity magnitude and the spatial scale of the visual display is reviewed in the following three sections.
18.7.2 SPATIAL SCALE AND DISPARITY DETECTION
18.7.2a Correlation of Disparity and Spatial Scale
A linkage between disparity detection and the spatial frequency of luminance modulation leads to the following prediction. Small disparities should be detected at a lower contrast in stimuli with high spatial frequency and large
disparities should be detected at a lower contrast in stimuli with low spatial frequency. But we will see that the question is complicated by the fact that spatial scale has three interrelated factors, which have been confused in most experiments. 1. The spatial frequency of a sinusoidal grating or the center spatial frequency of a Gabor patch or of a filtered random-dot display. 2. The size of an isolated stimulus or of the envelope of a Gabor patch. 3. The mean spacing of stimulus elements. A linkage between spatial frequency and disparity magnitude was not evident in the data reported by Frisby and Mayhew (1978a) from random-dot stereograms, presented in Section 18.5.2. However, subjects may have altered their vergence slightly to bring disparities into the most sensitive range for each spatial frequency. Smallman and MacLeod (1994) had subjects stabilize convergence with nonius lines before seeing two bandpass-filtered random-dot patches for 150 ms. One patch had a crossed disparity, and the other an uncrossed disparity. They determined the contrast required for 75% correct identification of the nearer patch for center spatial frequencies between 1 and 15 cpd and for disparities between 1 and 20 arcmin. Peak stereo sensitivity in the resulting stereo contrast-sensitivity function was at 3 cpd, the same value obtained by other investigators. However, unlike Frisby and Mayhew, they found that threshold disparities were linked to spatial scale. Thus, at a center spatial frequency of 15 cpd, the range of detectable disparities was confined to between 1 and 7 arcmin. At a center spatial frequency of 1 cpd, sensitivity was very low for 1 arcmin of disparity and best for 20 arcmin of disparity. As spatial frequency increased, coarser disparities became undetectable. Thus, above a frequency of 3 cpd, disparities greater than 20 arcmin were not detected. Above 7 cpd, disparities greater than 15 arcmin were not detected. 
Over the range of center spatial frequencies from 1 to 15 cpd there was a fivefold change in the disparity that was most easily detected. The data were transformed into spatial phase. For instance, a disparity of 5 arcmin at a spatial frequency of 3 cpd is equivalent to a phase disparity of 90˚. The resulting functions provided some support for the quadrature model of disparity detection proposed by Ohzawa et al. (1990) (Section 11.4.1d). In any case, these data support the idea of size-disparity correlation. Prince and Eagle (1999) argued that, in filtered-noise stereograms like those used by Smallman and MacLeod, the number of false matches is proportional to mean frequency, which may confound spatial frequency with the complexity of matching the images. They derived contrast thresholds for depth discrimination as a function of disparity and
spatial frequency for an isolated Gabor patch, for which the matching problem is minimal. Thresholds were low and constant up to disparities 10 times larger than one period of the sinusoidal modulation of luminance within the Gabor. For a band-pass noise stereogram, the results were more like those reported by Smallman and MacLeod. Subjects may have used the disparity between the Gabor envelopes rather than that between the sinusoidal carrier wave within the Gabors. Prince and Eagle found that subjects could not perform the task when the sinusoids in the two eyes differed in spatial frequency or orientation, with the envelopes held constant. But this does not prove that subjects did not use envelope disparity when the sinusoids were similar. Differences between the sinusoids would induce rivalry, which would spread to the envelope, and this may be why subjects could not perform the task. It is difficult to accept Prince and Eagle’s conclusion that, in the early stages of disparity processing, a wide range of disparities is detected at each spatial frequency and that false matches are eliminated at a later stage. If there is a size-disparity correlation, both the stereoscopic threshold and the upper limit of stereopsis should be related to the spatial frequency of the stimulus. Schor and Wood (1983) investigated this question using small patches with difference-of-Gaussian (DOG) luminance profiles with center spatial frequencies between 0.075 and 19.2 cpd, each with a bandwidth at half-height of 1.75 octaves. Figure 18.37 shows that both the depth-discrimination threshold for such a patch relative to a comparison stimulus, and the upper disparity limit for stereopsis were constant for spatial frequencies above about 2.4 cpd. Below this value, both quantities increased as spatial frequency decreased. 
With decreasing spatial frequency, the depth-discrimination threshold rose faster than the upper disparity limit, with the result that the range of disparities evoking depth sensations became narrower. These results support the idea of size-disparity correlation, but only for low spatial frequencies. Schor and Wood varied only the center spatial frequency of the DOG patches. But decreasing the spatial scale of an array of patches also involves increasing the sizes of the patches and their spatial separation. Hess and Wilcox (2006) investigated the interactions between these three spatial-scale factors. Subjects reported whether a central Gabor patch presented for 1336 or 122 ms was nearer than or beyond two other identical patches. In one condition, Gabor spatial frequency, patch size, and patch separation were scaled together, as when viewing distance is varied. Stereoacuity increased with increasing spatial frequency. Stereoacuity increased less steeply when the Gabor patches had no internal structure. It also increased less steeply when the separation between the Gabors was not reduced in proportion to the increase in spatial frequency. Thus all three scaling factors (spatial frequency, patch size, and patch separation) affected stereoacuity.
18.7.2b Pedestal Disparity and Spatial Frequency
The depth-discrimination threshold increases as pedestal disparity is increased. If disparities are processed in distinct spatial-frequency channels, one would expect the depth-discrimination threshold to increase more rapidly for high than for low spatial-frequency stimuli. Badcock and Schor (1985) investigated this question using DOG patches with center spatial frequencies between 0.15 and 9.6 cpd. Vergence was controlled with a fixation point and nonius lines. The results are shown in Figure 18.11. The rate of increase of the depth-discrimination threshold with increasing pedestal disparity of the test and comparison stimuli was not much influenced by the spatial frequency of the display for spatial frequencies of 0.6 cpd and above. Badcock and Schor concluded that these results do not support the idea that low spatial-frequency channels most effectively process large disparities. However, at large disparities, subjects may have based their judgments on the relative separations of the diplopic images rather than on impressions of depth (see McKee et al. 1990a). Blakemore (1970c) had controlled for this factor by jittering the relative separations of the images from trial to trial and found that the discrimination threshold increased exponentially up to 2˚ of disparity. Siderov and Harwerth (1993a) also took this precaution plus the precaution of randomly varying the crossed and uncrossed disparity of the stimuli. They used two vertical bars with a DOG luminance profile. One had a fixed peak spatial frequency of 2 cpd. The spatial frequency of the other varied over 2 octaves. They also found that the increase of the depth-discrimination threshold as a function of pedestal disparity was exponential and continuous. They concluded that the leveling off of the function at larger disparities reported by Badcock and Schor was due to the intrusion of judgments based on width-matching.
Although the depth-discrimination threshold decreased as the spatial frequency of both stimuli increased from 0.5 to about 3 cpd, a difference in spatial frequency of up to 2 octaves between the test and comparison bars did not affect the depth-discrimination threshold. Smallman and MacLeod (1997) conducted a similar experiment. They measured thresholds for discriminating depth between two adjacent random-dot displays, exposed for 250 ms, as a function of their pedestal disparity. The results are shown in Figure 18.41A. The stereo threshold increased more rapidly with increasing pedestal disparity for displays with high mean spatial frequency than for displays with low spatial frequency. In contrast to the results in Figure 18.11, stereo thresholds for pedestal stimuli were higher for high spatial-frequency stimuli than for low spatial-frequency stimuli. When plotted in terms of phase disparities, as in Figure 18.41B, all the data fell on a single line.
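The conversion between positional disparity and interocular phase that underlies plots such as Figure 18.41B is simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a single dominant spatial frequency so that one period spans 60 divided by the spatial frequency (in cpd) arcmin. It also computes the half-period limit on unambiguous matching of a periodic grating mentioned in Section 18.7.2.

```python
def disparity_to_phase_deg(disparity_arcmin, spatial_freq_cpd):
    """Convert a positional disparity to interocular phase (deg).

    One spatial period spans 60 / spatial_freq_cpd arcmin, and a shift
    of one full period corresponds to 360 deg of phase.
    """
    period_arcmin = 60.0 / spatial_freq_cpd
    return 360.0 * disparity_arcmin / period_arcmin

def half_period_arcmin(spatial_freq_cpd):
    """Half the spatial period (arcmin): the largest disparity at which
    the images of a periodic grating match without ambiguity."""
    return 0.5 * 60.0 / spatial_freq_cpd

# Worked example from the text: 5 arcmin at 3 cpd is a quarter period.
phase = disparity_to_phase_deg(5.0, 3.0)
print(phase)

# The same 10-arcmin pedestal is a larger phase shift at 8 cpd than at 2 cpd.
for sf in (2.0, 8.0):
    print(sf, disparity_to_phase_deg(10.0, sf), half_period_arcmin(sf))
```

Expressed this way, a fixed pedestal disparity occupies a larger fraction of the period for a high spatial-frequency display, which is one way to read the finding that the data collapse onto a single function when plotted as phase.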
These results provide strong support for size-disparity correlation, with disparity coded by a phase-disparity mechanism. Smallman and MacLeod suggested that Badcock and Schor’s stimuli contained spurious monocular cues to depth and that vergence may not have been perfectly stable. Prince et al. (2002b) produced physiological evidence for a correlation between the width of the disparity-tuning functions of cells in V1 of the monkey and the spatial scale of the receptive fields. Lee et al. (2007) measured the relationship between spatial frequency and disparity as a function of the temporal frequency of the stimuli. Subjects judged which of two drifting vertical gratings was nearer. At a spatial frequency of 0.23 cpd the threshold contrast required for detection of a depth difference was lowest at a disparity of about 30 arcmin. At a spatial frequency of 7.5 cpd the threshold contrast was lowest at a disparity of about 1 arcmin. This confirms the linkage between spatial frequency and disparity detection reported by others. The effects of the temporal frequency of the stimuli varied with spatial frequency. For high spatial frequencies, sensitivity to depth was highest for a temporal frequency of between 5 and 10 Hz. For low spatial frequencies, depth sensitivity was highest at a temporal frequency below 1 Hz. Lee et al. concluded that disparity is processed in two spatiotemporal channels, the magno- and parvocellular channels.
Figure 18.41. Stereo thresholds and spatial frequency. (A) Stereo thresholds for detecting depth in a textured surface as a function of pedestal disparity for a range of spatial frequencies of the texture (indicated on each graph). With high spatial frequencies (5, 8, 11 cpd), thresholds rose faster away from the fixation plane than for low spatial frequencies (1, 2, 3 cpd) (N = 1). (B) When replotted as threshold interocular phase differences as a function of standing interocular phase, data for the different spatial frequencies form a single function. The dashed line depicts Weber’s law. (Redrawn from Smallman and MacLeod 1997)
18.7.2c Modulation of both Disparity and Luminance
Pulliam (1981) revealed another aspect of size-disparity correlation. He measured the disparity threshold for detection of depth produced by disparity-defined corrugations of a vertical grating defined by sinusoidal variations in luminance. Disparity sensitivity increased as luminance spatial frequency increased from 0.3 to 7 cpd. Also, the peak sensitivity to disparity modulation shifted to a higher frequency of disparity modulation as luminance spatial frequency increased. He concluded that channels tuned to high frequencies of luminance modulation are also tuned to high frequencies of disparity corrugation. Hess et al. (1999) proposed four ways in which spatial frequency may be linked to disparity modulations, as shown in Figure 18.42. In Figure 18.42A there is no linkage between spatial frequency and disparity processing. Pulliam’s results conform to Figure 18.42D. A vertical grating is not the best stimulus for this type of experiment because horizontal disparity modulations are evident in each eye’s image as changes in line spacing. Lee and Rogers (1997) conducted a similar experiment using a random-dot display containing vertical modulations of shear disparity. The dot pattern was filtered by a 2-D narrow-band spatial filter. Disparity thresholds were derived from the lowest amplitude of disparity modulation at which subjects detected the sign of the depth modulation. Disparity thresholds for detection of a disparity corrugation of a given frequency as a function of luminance spatial frequency showed a band-pass characteristic with a minimum at a luminance spatial frequency of about 4 cpd. The minimum was the same for all frequencies of disparity modulation between 0.25 and 1 cpd, as shown in Figure 18.43. For a frequency of 0.125 cpd the minimum luminance spatial frequency shifted to about 3 cpd.
This suggests that surfaces with a spatial frequency in the range 1 to 4 cpd and a disparity modulation in the range 0.25 to 1 cpd are detected by the same channel. Spatial frequencies greater than 4 cpd and disparity modulations of less than 0.25 cpd appeared to stimulate a different channel. They concluded that stereoefficiency is reduced for low-frequency depth corrugations defined by high spatial frequencies. These results conform to Figure 18.42B. Subjects also set the disparity amplitude of a comparison disparity-modulated random-dot display to match the depth of each of a set of test displays with suprathreshold peak-to-peak disparity amplitudes of 2 or 4 arcmin. The comparison stimulus was not spatially filtered and always had the same frequency of disparity modulation as the test stimulus. It can be seen in Figure 18.43 that the depth-matching functions were largely flat, indicating an absence of interaction between spatial frequency and disparity-modulation frequency. Lee and Rogers concluded that, at suprathreshold disparities, a scaling process equates the efficiency of the different luminance channels serving stereopsis. Hess et al. (1999) pointed out that Lee and Rogers introduced shear disparity into their stimuli after they had been spatially filtered and that this may have imported contaminating high spatial frequencies. Rather than using a random-dot grating, Hess et al. modulated the disparity of a field of randomly positioned Gabor patches set on a background with the same mean luminance. The mean spatial frequency and size of each of the patches and their overall density could be independently varied, keeping apparent contrast constant. In a random-dot grating, spatial frequency and dot density covary. A 16-fold increase in patch density increased sensitivity to disparity modulations for patches with high spatial frequency. A 16-fold increase in patch area produced only a slight change in the disparity modulation function. When patch density and size were held constant, sensitivity to high-frequency disparity modulations increased with increasing luminance spatial frequency. However, low-frequency disparity modulations were detected equally well at all luminance spatial frequencies. According to these results, low-frequency depth modulations are supported equally well by low- and high-frequency luminance modulations but high-frequency depth modulations are enhanced at high frequencies of luminance modulation. These results conform to Figure 18.42C.
Figure 18.42. Linkages between spatial frequency and disparity. Four possible relationships (panels A to D) between spatial frequency of luminance modulation and spatial frequency of disparity modulation, as they affect the threshold for detection of disparity modulations. (Adapted from Hess et al. 1999)
Figure 18.43. Depth detection as a function of the spatial scale of disparity modulation and luminance modulation. (A) Threshold for detection of depth in a disparity-modulated random-dot surface as a function of the center spatial frequency of the surface. Results for each of four spatial frequencies of disparity modulation. (B) The same data plotted as a function of the spatial frequency of disparity modulation for each of four spatial frequencies of luminance modulation. (Adapted from Lee and Rogers 1997)
18.7.2d First- and Second-Order Stereopsis
First-order stimuli are defined by modulations of color or luminance that can be processed by a linear mechanism. Second-order stimuli are defined by modulation of contrast, texture, or motion with mean luminance held constant. The distinction between first- and second-order stimuli was first made for motion (Cavanagh and Mather 1989). The processing of second-order stimuli requires initial filtering followed by a nonlinear stage, such as rectification, and a second filtering. Second-order motion can produce linear artifacts (Scott-Samuel and Georgeson 1999), and first-order motion could be processed by a nonlinear system. Nevertheless, there is evidence that the two types of motion are processed by distinct and parallel channels (Wilson et al. 1992; Lu and Sperling 1995). In monocular vision, the effect of spatial frequency of luminance modulation on vernier acuity depends on the separation between the stimuli being compared. Toet et al. (1987) measured the displacement threshold for patches with a Gaussian contrast profile at the contrast threshold as a function of the blur and spatial separation of the patches. For a constant ratio of blur to patch separation, the threshold increased linearly with increasing blur. It was concluded that the relative spatial position of stimuli at a given resolution is detected with an accuracy that is a constant fraction of receptive-field width at that level of blur (resolution). The displacement threshold was independent of the distance between the stimuli when the distance was sufficient to allow them to be seen as two but was less than 25 times the spread of the stimuli. When the stimulus separation was more than 25 times the blur parameter, the displacement threshold increased as a linear function of separation for a given value of blur. Toet et al. concluded that there are two processes: a linear process that provides a direct measure of relative position when the distance between the stimuli is small relative to their size, and a nonlinear process that involves indirect comparison of stimulus positions when the stimuli are further apart than a critical value. This second process is adversely affected by increasing the separation between the stimuli. Hess and Wilcox (1994) used stimuli similar to those used by Toet et al. Instead of the threshold for lateral offset, they measured the disparity threshold for perception of relative depth between a Gabor patch and similar patches presented above and below it for 0.5 s. Each patch contained a sinusoidal grating in sine phase (the carrier) confined within a window with a Gaussian luminance profile. The diameter of the window (indicated by its standard deviation) varied between 5 and about 100 arcmin. As the size of the window increased for a given spatial frequency of the carrier, the spatial-frequency bandwidth of the Gabor patch decreased (Section 4.4.2). The disparity threshold was independent of the separation between the patches for targets with center spatial frequencies of 0.66 and 5.24 cpd
and a 1.13-octave bandwidth but increased rapidly with increasing separation for a target of 5.24 or 10.4 cpd and a 0.18-octave bandwidth. Thus, narrow-band stimuli (large patches) showed a strong dependence on the separation of the targets. For a given separation of the targets and a fixed diameter of the Gaussian window, the stereo threshold decreased linearly as the center spatial frequency of the Gabor patches increased from 0.1 to 10 cpd. This dependence of stereoacuity on spatial frequency was the same for different sizes of Gaussian window, showing that it was a function of spatial frequency and not of the number of sine waves in the window. These results differ from those for monocular vernier acuity, which was found to be independent of the spatial frequency of fixed-diameter Gabor patches (Hess and Holliday 1992). For a given separation of the targets and a fixed spatial frequency, Hess and Wilcox found that the stereo threshold increased steeply as the size of the Gaussian window increased from a value corresponding to a bandwidth of 0.5 octaves. For small Gaussian windows (broadband stimuli) the stereo threshold was independent of the size of the window. Thus, for small Gaussian windows, the stereo threshold depended on the spatial frequency of the carrier but not on the size of the window or the spatial separation of the stimuli. For large Gaussian windows (narrow band stimuli), the threshold depended more on the size of the window and the stimulus separation than on spatial frequency. Hess and Wilcox proposed that the dependence of stereoacuity on spatial frequency with broadband stimuli reflects the operation of a linear, or first-order, system based on luminance modulation, and its dependence on the size of the Gabor patch with narrow-band stimuli reflects the operation of a nonlinear, or second-order system. 
This is because the envelope of a Gaussian patch in a surround of the same mean luminance is not visible to a system that integrates luminance changes within the patch in a linear fashion. The crossover between the two processes occurred with a Gaussian envelope corresponding to a bandwidth of 0.5 octaves or about four cycles of the sine wave within the window. The dependence of stereoacuity on spatial frequency with broadband stimuli could reflect the operation of a phase-disparity detection system (Section 11.4.3). Perhaps one could account for these results by saying that the Gaussian envelope is a stimulus of low spatial frequency, which is processed preferentially by the coarse-disparity system. The sine-wave display inside the envelope is a stimulus of fine spatial scale, which is processed preferentially by the fine-disparity system. The linear-nonlinear distinction may not be the most important factor. Wilcox and Hess (1996) found that disparity detection based on disparities of the envelopes of Gaussian patches was possible with dichoptic patches containing uncorrelated noise. They argued that this is possible only if the nonlinear envelope signal is extracted
before binocular combination. They also found that stereoacuity was degraded when the noise line-elements in the Gaussian patches were orthogonal and argued that this demonstrates that the envelope signals are derived from oriented detectors. Langley et al. (1999) found that depth transparency and depth adaptation generated by disparity between second-order Gaussian envelopes are influenced by stimulus orientation and spatial frequency. They concluded that the principal nonlinearity in second-order stereopsis is cortical since it occurs after orientation and spatial-frequency linear filtering. Stereoacuity for a disparity step between first-order patches of constant luminance on a dark ground was about 10 times better than for second-order patches of uncorrelated vertical lines in a surround of the same mean luminance (Wilcox et al. 2000). It is not clear whether poor stereoacuity for the second-order stimuli was due to difficulties in image matching caused by uncorrelated lines or to a lack of luminance-defined contour. One would have to use a patch with a luminance-defined contour filled with uncorrelated vertical lines. Increasing blur degraded both types of stereopsis. Increasing the size of the patches degraded only second-order stereopsis, probably because larger patches contained more uncorrelated lines. Changes in contrast affected stereoacuity for luminance-defined patches more than for second-order patches (Wilcox and Hess 1998). Ziegler and Hess (1999) confirmed that depth is created by a disparity step between Gaussian patches with uncorrelated luminance modulations. Modulations were uncorrelated because their spatial frequencies differed by two octaves. However, subjects could detect a depth corrugation produced by modulation of disparity in a random display of patches only when the contents of the patches were the same in the two eyes. When they differed, false linkages occurred between luminance modulations.
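The filter-rectify-filter scheme described at the start of this section, and the finding that envelope disparity can be recovered from patches with uncorrelated contents, can be illustrated with a toy one-dimensional sketch. Everything here is an assumption made for illustration: the binary noise carrier, the box-blur second stage, the centroid readout, and all function names are hypothetical, and the first filtering stage is reduced to subtracting the known background luminance. It is not the model used in any of the studies cited.

```python
import math
import random

BACKGROUND = 0.5  # assumed mean luminance of patch and surround

def make_patch(center, sigma, n=400, seed=0):
    """1-D Gaussian-windowed binary noise on a uniform background.

    Patches built with different seeds have uncorrelated carriers but
    each has an envelope centered on `center` (in samples).
    """
    rng = random.Random(seed)
    patch = []
    for x in range(n):
        envelope = math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
        carrier = rng.choice((-1.0, 1.0))
        patch.append(BACKGROUND + 0.4 * envelope * carrier)
    return patch

def box_blur(signal, half_width):
    """Low-pass filter: running mean over 2 * half_width + 1 samples."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def envelope_position(patch, half_width=10):
    """Filter-rectify-filter estimate of envelope position (samples)."""
    stage1 = [v - BACKGROUND for v in patch]  # stage 1: remove mean luminance
    rectified = [abs(v) for v in stage1]      # nonlinearity: full-wave rectification
    stage2 = box_blur(rectified, half_width)  # stage 2: low-pass filter
    return sum(i * v for i, v in enumerate(stage2)) / sum(stage2)  # centroid

# Left- and right-eye patches: different noise, envelopes offset by 6 samples.
left = make_patch(center=200, sigma=20, seed=1)
right = make_patch(center=206, sigma=20, seed=2)
envelope_disparity = envelope_position(right) - envelope_position(left)
print(round(envelope_disparity, 2))
```

In this idealized case the rectified signal depends only on the envelope, so the estimate should recover an offset close to 6 samples even though the two carriers share no consistent luminance-defined disparity. With a graded noise carrier the estimate would be noisier.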
Wilcox and Hess (1997) used a stereogram consisting of a coarse luminance-modulation superimposed on a finer sinusoidal grating, all in a Gaussian envelope. The stimulus contained no consistent first-order luminance-based disparity. Stereoacuity depended on the disparity of the coarser of the two stimulus components unless this component was degraded by blurring or by extending the fine sinusoidal grating beyond the boundary of the Gaussian envelope. Thus, while luminance-based stereoacuity depended on the highest available spatial-frequency component, nonlinear stereoacuity relied on the coarser available component. Wilcox and Hess proposed that the nonlinear stereo system is a back-up system concerned with estimating relative depth in a cluttered environment of complex textures. Hess and Wilcox (2008) asked whether first- and second-order disparities take different times to process. They used a Gaussian envelope containing a noisy vertical grating that was correlated between the eyes to produce first-order disparity, or uncorrelated to produce second-order disparity. Subjects reported whether the test stimulus was nearer or more distant than two flanking stimuli in the fixation plane. It can be seen in Figure 18.44 that sensitivity for the first-order stimulus increased as stimulus duration increased from about 60 ms to about 150 ms. But sensitivity for the second-order stimulus was much lower and remained more or less constant as stimulus duration increased. The results were similar for stimuli of different sizes, except that sensitivity to the second-order stimuli increased as the stimuli were reduced in size. Hess and Wilcox concluded that second-order disparity is transient, while first-order disparity is sustained. But the second-order disparity showed little evidence of being transient. Perhaps the initial impression of depth from the two types of disparity was reached at the same time, because it was based on the disparity of the envelope. But it then took more time to process the finer disparity contained in the first-order stimulus. Since the second-order stimulus did not contain this finer disparity it had nothing more to process. Evidence for binocular cells tuned to disparity between contrast-modulated gratings was discussed in Section 11.4.7.
Figure 18.44. Stereo sensitivity as a function of stimulus duration for first- and second-order disparities. The stimuli were Gaussian-windowed patches with 1-D vertical noise that was correlated in the two eyes (first-order disparity) or uncorrelated (second-order disparity). Results for one subject for each of three stimulus sizes (8.28, 24.9, and 49.6 arcmin). (Adapted from Hess and Wilcox 2008)
18.7.2e Fine-Coarse Disambiguation
Stimuli with low spatial frequency and coarse disparity could be used for the preliminary linking of images to bring them within the range of mechanisms sensitive to high spatial frequency and fine disparity. Marr and Poggio (1979) suggested that vergence responds to coarse disparity in a display and brings the fine residual disparities into the range of the disparity-detection system. Perhaps, without vergence, a purely neural mechanism uses coarse disparities to scale fine disparities (Quam 1987). The idea is that coarse disparities shift the scale of neural processing into a range in which fine disparities can be detected. On this basis, Rohaly and Wilson (1993) argued that people should be better able to discriminate depth in a high spatial-frequency stimulus set on a depth pedestal in the presence of a low spatial-frequency surround than in a stimulus containing only high spatial frequencies. They tested this prediction using elongated D6 Gaussian patches with spatial frequencies separated by 2 octaves. The low spatial-frequency surround was in the stereo depth plane of either the high spatial-frequency test patch or the zero-disparity comparison stimulus. The depth-discrimination threshold increased exponentially as the pedestal disparity of the test patch increased relative to the zero-disparity comparison stimulus. This occurred both when the low spatial-frequency stimulus was present and when it was absent. Rohaly and Wilson concluded that disparity in low spatial-frequency stimuli does not shift the scaling of disparity, at least for stimuli more than two octaves higher in spatial frequency. Smallman and MacLeod (1997) came to the same conclusion. Two adjacent random-dot patches were presented for 250 ms on various disparity pedestals. Both patches had a mean spatial frequency of 2 or 8 cpd, or contained both spatial frequencies. Subjects reported which patch was nearer.
The stereo threshold was much the same for all stimuli when the disparity pedestal was 4 arcmin. The threshold for all stimuli increased as the pedestal disparity increased. For pedestals of 12 arcmin or more, the threshold for the compound stimulus was lower than that for the 8-cpd stimulus but higher than that for the 2-cpd stimulus. In other words, the addition of a high spatial-frequency component degraded stereo sensitivity in a display away from the plane of zero disparity. Presumably, the pedestal stimuli were out of range of the fine disparity system accessed by the high spatial-frequency elements.
The elements therefore acted like noise. Addition of a low spatial-frequency component improved stereo sensitivity for pedestal stimuli because low-frequency stimuli accessed the coarse disparity system, which could cope with the pedestal disparities. These results support the idea of a coupling between spatial frequency and the scale of disparity processing but do not support the idea that coarse disparities rescale fine disparities. Smallman (1995) asked whether disparity in stimuli with fine detail disambiguates coarser disparity. He presented a 2-cpd vertical sine-wave grating to each eye. The gratings were in antiphase so that they could be seen with either a crossed or an uncrossed disparity with respect to a zero-disparity stimulus placed below them. The stimuli were presented for 220 ms to avoid vergence eye movements. An 8-cpd filtered random-dot pattern was superimposed on each grating, and subjects judged the depth of both the fine and coarse patterns for several disparities of the fine pattern. The perceived sign of disparity of the coarse grating was the same as that of the fine pattern, showing that an unambiguous disparity in a fine pattern can be used to resolve an ambiguous disparity in a coarse pattern. Farell et al. (2004b) produced evidence that the interaction between different spatial frequencies in depth discrimination depends on the magnitude of the pedestal disparity. They used a vertical 2˚ Gabor patch with a single mean spatial frequency of either 0.5 cpd or 2 cpd and a compound patch with the two spatial frequencies superimposed with equal horizontal space disparities. The lower half of the single Gabor patch was at a given pedestal disparity, and the upper half was set at disparities above or below the pedestal disparity. The stimuli were presented for 150 ms to prevent vergence eye movements. Subjects judged the relative depth of the two halves of the patch. 
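Because gratings are periodic, a given position disparity corresponds to a different phase disparity at each spatial frequency, and the match becomes ambiguous when the phase disparity reaches 180˚ (half a cycle). The following minimal sketch shows the conversion for the two spatial frequencies used by Farell et al. The function names are illustrative and do not come from the original study.

```python
def phase_disparity_deg(position_disparity_arcmin, spatial_freq_cpd):
    """Phase disparity (degrees of phase) of a grating for a position disparity."""
    period_arcmin = 60.0 / spatial_freq_cpd  # length of one cycle in arcmin
    return 360.0 * position_disparity_arcmin / period_arcmin


def ambiguous_disparity_arcmin(spatial_freq_cpd):
    """Position disparity at which the phase disparity reaches 180 deg."""
    return 0.5 * 60.0 / spatial_freq_cpd  # half a cycle in arcmin


if __name__ == "__main__":
    for freq in (0.5, 2.0):
        print(freq, "cpd: ambiguous at",
              ambiguous_disparity_arcmin(freq), "arcmin")
```

On this arithmetic, a 2-cpd grating becomes ambiguous at a quarter of the position disparity at which a 0.5-cpd grating does, because its period is four times shorter.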
For all pedestal disparities, the compound grating was seen as a coherent surface at a depth determined by the disparity of the 0.5-cpd grating rather than as two superimposed surfaces at different depths. Figure 18.45 shows that, for pedestal disparities below about 10 arcmin, the depth-discrimination threshold for the compound grating was well below that for the 0.5-cpd grating but similar to that for the 2-cpd grating. Thus the threshold depended only on the high spatial-frequency component. For pedestal disparities between 10 and 20 arcmin, the threshold for the 2-cpd grating rose above that for the compound grating. As one would expect, the threshold for the 2-cpd grating was especially high at a pedestal disparity of 15 arcmin, corresponding to the ambiguous phase disparity of 180˚. Thus, the presence of the low spatial-frequency grating facilitated detection of depth when the phase disparity of the high spatial-frequency grating became ambiguous. As the pedestal disparity rose above 20 arcmin, the threshold for the compound grating began to rise until it was well above that for either of the simple gratings. This is to be expected because, as the pedestal disparity increased from 20 to 60 arcmin, the phase disparity of the 0.5-cpd grating rose from 60˚ to an ambiguous value of 180˚.
Figure 18.45. Disparity thresholds as a function of pedestal disparity for simple and compound gratings. The phase disparity scale is for the 0.5-cpd grating. Phase disparities for the 2-cpd grating are four times larger. Note that the threshold for the 2-cpd grating peaks when its phase disparity is 180˚ (N = 1). (Adapted from Farell et al. 2004b) Axes: disparity increment threshold against pedestal disparity (0 to 60 arcmin; 0˚ to 180˚ of phase), with curves for the 0.5-cpd, 2-cpd, and 0.5 + 2 cpd gratings.
18.7.3 SPATIAL SCALE, CONTRAST, AND PERCEIVED DEPTH
18.7.3a Spatial Scale and Perceived Depth Linkage between disparity processing and spatial frequency would suggest that the magnitude of perceived depth produced by a given disparity (stereoscopic gain) is related to spatial frequency. Stereoscopic gain in a Julesz stereogram was found not to vary significantly when the spatial frequency of luminance modulation was varied between center frequencies of 2.5 and 16.2 cpd (Mayhew and Frisby 1979a). However, the disparity range of 2.6 to 20.8 arcmin may have been too small to reveal any effect. Schor and Wood (1983) used small patches with difference-of-Gaussian luminance profiles with a wider range of disparities. Stereo gain was measured by adjusting the depth of the Gaussian test patch to match that of a thin black line comparison stimulus. The stimulus was shown for as long as required, and nonius targets were used to monitor the accuracy of convergence. It is unlikely that vergence was held perfectly steady by this procedure. Whereas stereoacuity, as indicated by the lower dotted line in Figure 18.37, began to deteriorate when spatial frequency fell below 2.4 cpd, stereo gain, as indicated by the solid lines, did not begin to fall until spatial frequencies were much lower. The solid lines also indicate that stereo gain began to deteriorate at a lower spatial frequency for fine disparities than for
coarse disparities. The finding that gain deteriorates at a lower spatial frequency for uncrossed disparities than for crossed disparities was mentioned in Section 18.6.4. It looks as though stereo gain is higher for the high spatial-frequency/fine-disparity system than for the low spatial-frequency/coarse-disparity system. Furthermore, low spatial-frequency Gaussian patches with zero disparity appeared to lie behind the fixation plane defined by a fixation spot (Schor and Howarth 1986). Note that a low spatial-frequency stimulus has a lower apparent contrast than a high spatial-frequency stimulus. When the apparent contrasts of the stimuli were made equal, the differential loss of stereo gain was much reduced. The different stereo channels therefore seem to be linked to different effective contrasts as much as to different spatial frequencies. There were three potentially confounding factors in this experiment. (1) The spatial frequency of the Gaussian patches covaried with their width, so that width rather than spatial frequency may have been the crucial variable. (2) Stimuli with different spatial frequencies were not equated for visibility. (3) The spatial frequency and width of the test stimulus were varied, while the comparison stimulus remained constant. At least part of the effect of the spatial frequency of the test stimulus on perceived depth could have been due to monocular perspective—a tendency to see smaller objects or finer textures as being further away than large objects or coarse textures. One must be cautious in drawing conclusions from experiments in which test and comparison stimuli differ, because the results may not reflect properties of the stereoscopic system alone. They may arise from the interplay between the disparity-based stereoscopic system and other cues to depth. Perceived depth could be the outcome of a trading relationship between distinct cues (Chapter 30). 
There is no reason to expect that a trading relation between cue systems would affect the discrimination threshold for either of the cues presented alone. For instance, just because lines in the Müller-Lyer illusion appear to differ in length does not mean that a change in the relative lengths of the two lines is difficult to detect. Discrimination reflects a person’s ability to detect changes in a stimulus feature rather than the ability to assess the magnitude of a sensory effect relative to a standard.
18.7.3b Contrast and Perceived Depth Fry et al. (1949) reported that the perceived depth of a stereoscopic rectangle relative to a surrounding aperture increased as the contrast of the rectangle decreased relative to the fixed contrast of the aperture surround. They concluded that relative contrast acts as a cue to distance. Rohaly and Wilson (1999) measured the effects of contrast on stereo discrimination and stereo gain (perceived depth per unit disparity). Two vertical D6 Gaussian bars were presented 0.67˚ on either side of a fixation spot for 500 ms. One was a test bar with a contrast of between 10 and 100% and a disparity within ±4 arcmin. The other was a comparison bar of 50% contrast and variable disparity. Subjects indicated which bar appeared nearer. Thresholds for the detection of an increase in disparity relative to a 4-arcmin pedestal indicated a power law dependence on contrast, with exponents between 0.15 and 0.42. For both crossed and uncrossed disparities, a low-contrast test bar appeared more distant than a high-contrast test bar relative to the comparison bar of higher contrast. This effect also showed a power law dependence on the contrast of the test bar. In other words, lowering the contrast of a stimulus of a given disparity caused the stimulus to appear more distant. This is the same effect reported by Fry et al. and by Schor and Howarth (1986). However, Rohaly and Wilson obtained some effect at high spatial frequencies, whereas Schor and Howarth found the effect only for low spatial-frequency stimuli. Control experiments revealed that the effect of relative contrast on perceived relative distance was not due to fixation disparity produced by vergence or merely to an apparent difference in size of the two stimuli created by the difference in contrast. If the effect of relative contrast were due to a direct effect of contrast on perceived relative distance, one would expect the same effect with monocular viewing. O’Shea et al. (1994b) obtained a greater effect with monocular viewing, but Rohaly and Wilson and Schor and Howarth obtained a greater effect with binocular viewing than with monocular viewing. There is a monocular effect, but there seems also to be an effect operating through the disparity system. Weaker stereo stimuli are perceived as further away than stronger stimuli. The effects of contrast on perceived depth are discussed further in Section 27.3.4.
18.7.4 SPATIAL SCALE AND STEREOPSIS MASKING
The spatial-frequency selectivity of stereoscopic vision may be investigated by measuring the effects of adding noise of specified spatial-frequency content to stereograms composed of patterns confined to a specified spatial-frequency bandwidth. Random texture added to band-pass-filtered random-dot stereograms abolished stereopsis when the signal-to-noise ratio was less than 1:1.4 and the spatial frequency of the noise overlapped that of the stereogram. But stereopsis was not affected when the spatial frequencies of signal and noise were 2 octaves or more apart (Julesz and Miller 1975). In Section 17.1.3 we saw that depth is not perceived in a random-dot stereogram when the spatial frequencies of the dichoptic images do not overlap. Thus, we can draw the general conclusion that binocular interactions of any sort do not occur between images that differ widely in spatial frequency. Widely different spatial frequencies neither interfere with each other nor create depth.
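Separations between spatial frequencies are expressed here in octaves, that is, as the base-2 logarithm of the frequency ratio. The short sketch below computes the separation and applies the 2-octave criterion reported by Julesz and Miller. The function names and the hard cutoff are illustrative simplifications, since real masking functions fall off gradually.

```python
import math


def octave_separation(f1_cpd, f2_cpd):
    """Separation between two spatial frequencies in octaves."""
    return abs(math.log2(f1_cpd / f2_cpd))


def expected_to_mask(signal_cpd, noise_cpd, limit_octaves=2.0):
    """True when noise lies less than limit_octaves from the signal.

    The hard cutoff is an assumption made for illustration only.
    """
    return octave_separation(signal_cpd, noise_cpd) < limit_octaves


if __name__ == "__main__":
    print(octave_separation(2.0, 8.0))   # 2 cpd and 8 cpd
    print(expected_to_mask(2.0, 8.0))
    print(expected_to_mask(2.0, 4.0))
```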
Yang and Blake (1991) used a masking procedure to measure the spatial-frequency bandwidth of channels devoted to disparity detection in random-dot stereograms. They determined the luminance contrast of random-dot noise added to one eye that just masked the depth in a random-dot stereogram, as a function of the central spatial frequency of the noise and of the stereogram. They concluded from the forms of the masking functions that there are only two spatial-frequency channels for crossed disparities and two for uncrossed disparities. In each case, the channels peak at 3 and 5 cpd. They concluded that both the low and high spatial-frequency channels serve in the detection of both fine and coarse disparities. This conclusion runs counter to the idea of a coupling between fine disparities and high spatial frequency and coarse disparity and low spatial frequencies, discussed in Section 18.7. The above argument may not be conclusive, because the noise may not have gained access to postfusional mechanisms responsible for the detection of disparity, and their stimulus may have induced luminance rivalry. Also, the monocular noise may have introduced confusing occlusion signals. Spatial-frequency interactions in binocular fusion were discussed in Section 12.1.2. Furthermore, the noise was presented to only one eye and may therefore have interacted with the image in one eye before the stage at which binocular disparity is detected. Shioiri et al. (1994) used a similar masking procedure with noise in the form of uncorrelated dots in the two eyes. They found similar channels peaking at 3 and 5 cpd but they also found a channel tuned to a peak spatial frequency of between 1.5 and 2 cpd. They suggested that Yang and Blake did not find this channel because their display was too small. Shioiri et al. found the bandwidth of these channels to be similar to those of corresponding channels devoted to the detection of luminance contrast. 
But they found only three spatial-frequency channels devoted to disparity detection rather than seven channels claimed to exist for the detection of luminance contrast (Wilson and Gelb 1984). Glennerster and Parker (1997) reanalyzed Yang and Blake’s data after taking account of the visual system’s initial modulation transfer function (MTF). They concluded that the data are consistent with Julesz and Miller’s conclusion. Over a 2.8-octave range of spatial frequencies, the most effective mask was one with the same spatial frequency as the disparity signal. Prince et al. (1998) used masks containing dichoptically uncorrelated noise scaled by each subject’s MTF. They determined the depth discrimination contrast threshold as a function of signal and mask spatial frequencies. The spatial bandwidths for stereo masking were wider than those for simple luminance masking. These results suggest that the band-pass spatial-frequency channels feeding into the stereo mechanism have wider tuning widths than those serving pattern vision.
Orientation selectivity of effects of masking noise could be investigated by measuring effects of adding noise consisting of lines with specified orientation with respect to the orientation of lines that define depth in a random-line stereogram. The effect of masking should be a function of the orientation of the noise relative to that of the disparity-defining lines. Dichoptic masking was discussed in Section 13.2.
Summary Smallman and MacLeod’s results suggest that coarse disparity detectors engaged by low spatial-frequency stimuli are required for optimal discrimination of depth differences well away from the horopter. Fine disparity detectors engaged by high spatial-frequency stimuli operate most effectively near the horopter. Adding high spatial-frequency stimuli to a display well away from the horopter degrades rather than enhances depth discrimination. Low-frequency stimulus components do not rescale the disparity-detection system to bring it within range of high spatial-frequency components, when vergence movements are not allowed. Disparity is processed in distinct spatial-frequency channels, but the number of channels seems to be fewer than for processing of luminance contrast.
18.8 DISPARITY POOLING
There are three basic ways in which information about a given visual feature may be pooled.
1. Metamerism This occurs when a feature is coded by detectors with overlapping tuning functions (Section 4.2.7). Distinct stimuli falling within the Nyquist limit of such a system are not resolved but produce a signal that is a weighted average of the activity in the set of stimulated detectors. Thus, the stimuli are combined metamerically into one signal. This issue is discussed in Section 18.8.2.
2. Stimulus pooling over large areas The outputs of distinct detectors could be combined over a large area. For example, horizontal disparities may be pooled over a large area for the initiation of vergence, even though they are processed locally for detection of relative disparities (Chapter 10). Also, evidence reviewed in Section 20.3.2 shows that vertical disparities are pooled over much larger areas than horizontal disparities, both for vergence control and for stereopsis. This is because vertical disparities tend not to vary over small areas.
3. Other types of interactions between detectors These could be mutual facilitation or inhibition between the outputs of detectors for a given feature.
18.8.1 DISPARITY POOLING AND NOISE REDUCTION
Spurious disparity signals due to rapid fluctuations of vergence or to noise in the visual system are randomly distributed about some mean value and therefore cancel when inputs from disparity detectors tuned to different disparities are pooled (Tyler and Julesz 1980) (Section 15.4.1). A related point is that spatial modulations of disparity of more than 5 cpd are not registered by the visual system (Section 18.6.3). This helps in the rejection of spurious disparity signals, which tend to be modulated at high spatial frequency. Disparity detectors produce a strong primary response when the disparity in a random-dot pattern matches their preferred disparity. Detectors produce secondary response peaks at other disparities when, by chance, the disparity between noncorresponding images (false matches) coincides with the preferred disparity of the detectors. For position-disparity detectors tuned to different spatial frequencies and orientations, the same primary response occurs at the same disparity. For cells tuned to different spatial frequencies and orientations, the secondary false peaks occur at different disparities. Therefore, pooling the responses of position-disparity detectors tuned to a given disparity but to different spatial frequencies and orientations would reinforce the primary response and eliminate the secondary responses. Pooling disparity signals over spatial scale and orientation would also help to suppress the secondary false peaks in phase-disparity detectors (Fleet et al. 1996b). Pooling over a local area may also help. Response normalization (division of a cell’s response by the pooled responses of neighboring cells) could also help in removing false responses in both position- and phase-disparity detectors. Physiological evidence for this type of pooling is lacking. Pooling or normalization does not eliminate false peaks arising from a regular periodic stimulus. Such a stimulus has an essential ambiguity. 
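The argument that pooling across spatial scale reinforces the primary response while suppressing false peaks can be illustrated with a toy calculation. In the sketch below, each channel's tuning curve is modeled as a cosine that peaks at the preferred disparity and repeats once per cycle of the channel's spatial frequency. This cosine form, the set of channel frequencies, and all function names are assumptions made for illustration; they are not the model of Fleet et al. or a description of real cortical cells.

```python
import math


def channel_response(d_deg, d_pref_deg, freq_cpd):
    """Toy tuning curve: unit peak at the preferred disparity, with
    repeated (false) peaks every 1/freq_cpd degrees of disparity."""
    return math.cos(2.0 * math.pi * freq_cpd * (d_deg - d_pref_deg))


def pooled_response(d_deg, d_pref_deg, freqs_cpd):
    """Mean response of channels that share one preferred disparity."""
    total = sum(channel_response(d_deg, d_pref_deg, f) for f in freqs_cpd)
    return total / len(freqs_cpd)


def largest_secondary(response_fn, d_pref_deg=0.0, exclude_deg=0.15,
                      span_deg=0.5, step_deg=0.005):
    """Largest response at sampled disparities outside the primary lobe."""
    n = int(round(span_deg / step_deg))
    offsets = [i * step_deg for i in range(-n, n + 1)]
    return max(response_fn(d_pref_deg + o)
               for o in offsets if abs(o) >= exclude_deg)


FREQS = [1.0, 1.5, 2.5, 4.0]  # cpd; hypothetical set of channels

single = largest_secondary(lambda d: channel_response(d, 0.0, 4.0))
pooled = largest_secondary(lambda d: pooled_response(d, 0.0, FREQS))

if __name__ == "__main__":
    print(f"largest false peak, single 4-cpd channel: {single:.2f}")
    print(f"largest false peak, pooled channels:      {pooled:.2f}")
```

In this toy model the single channel has false peaks as large as its primary peak, which is the periodic-stimulus case. The pooled response still reaches its full value at the shared preferred disparity, but its false peaks are much reduced because they fall at different disparities in each channel.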
False matches do not occur with stimuli that are well spaced in comparison to the disparity.
18.8.2 DISPARITY METAMERISM
18.8.2a Introduction A random-dot display with two intermingled sets of dots, each set with a distinct disparity, can produce one of three impressions.
1. Lacy depth The display appears as a display of dots at different depths. This is also called pyknostereopsis.
2. Depth transparency One smooth surface is seen through another surface (Julesz and Johnson 1968). This is known as diastereopsis. Lacy depth and depth transparency in random-dot stereograms are discussed in the next section. Transparency produced by superimposed gratings is discussed in Section 22.1.3, and that produced by superimposed moving displays is discussed in Sections 22.3.2 and 22.3.3.
3. Disparity averaging, or metamerism The display appears in one plane at an intermediate depth, as discussed in the present section. At each location, disparity is detected by a limited number of channels with overlapping disparity-tuning functions. This is the condition for metamerism.
Some of the evidence for disparity metamerism is open to other interpretations. But we will see that, under certain circumstances, similar disparities in neighboring locations average, or metamerize, to produce one intermediate disparity.
Consider the simplest case of a display of random dots with disparity d1 superimposed on a second display with disparity d2. Seven variables are likely to affect whether disparity metamerism occurs.
1. The disparities d1 and d2 of the images of the dots in each component display.
2. The difference in disparity between the component displays, (d1 – d2), or Δd. As Δd increases, the component disparities should begin to excite detectors with nonoverlapping tuning functions and be seen as distinct depth planes. If there were only three types of disparity detector with broadly overlapping tuning functions, as in the color system, then, for all detectable values of Δd, the components would not be resolved when they fall in the same region in the visual field.
3. The mean of the two disparities, (d1 + d2)/2.
4. Density of the dots in the monocular images of each component display.
5. The mean distance between the pairs of disparate dots in one display and the pairs of disparate dots in the other. In a random-dot stereogram with superimposed depth planes, variables 4 and 5 are confounded. One would predict that a region of mixed disparities is more easily resolved into distinct depth planes when the pairs of disparate dots in one display are well separated laterally from pairs of dots in the other.
6. Separation in time between stimuli with different disparities should facilitate their separation into distinct depth planes.
7. Similarity between the texture elements in the two displays, including such features as shape, size, color, and motion.
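The dependence of metamerism on Δd (variable 2) can be illustrated with a toy population of detectors that have overlapping Gaussian disparity-tuning functions. When two component disparities are close relative to the tuning width, the summed population response has a single peak at an intermediate disparity. When they are far apart, two peaks emerge. The Gaussian form, the tuning width of 60 arcsec, and the function names below are assumptions chosen for illustration, not values fitted to any data.

```python
import math

SIGMA = 60.0  # arcsec; hypothetical width of each tuning function
PREFERRED = [float(p) for p in range(-600, 601, 5)]  # preferred disparities


def population_response(d1, d2, sigma=SIGMA, w1=1.0, w2=1.0):
    """Summed response of Gaussian-tuned detectors to two disparities.

    w1 and w2 are the relative strengths of the two components.
    """
    return [w1 * math.exp(-0.5 * ((p - d1) / sigma) ** 2)
            + w2 * math.exp(-0.5 * ((p - d2) / sigma) ** 2)
            for p in PREFERRED]


def count_peaks(response):
    """Number of strict local maxima in the population response."""
    return sum(1 for i in range(1, len(response) - 1)
               if response[i] > response[i - 1]
               and response[i] > response[i + 1])


def peaks_for(delta_d, mean_d=0.0):
    """Peaks in the population response for a given disparity difference."""
    d1 = mean_d - delta_d / 2.0
    d2 = mean_d + delta_d / 2.0
    return count_peaks(population_response(d1, d2))


if __name__ == "__main__":
    for delta in (100.0, 300.0):
        print(delta, "arcsec ->", peaks_for(delta), "peak(s)")
```

For two equally weighted Gaussians of equal width, the sum has one peak only when the separation is no more than twice the tuning width, so in this toy model the transition from averaging to two resolved disparities is set entirely by the assumed value of `SIGMA`.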
18.8.2b Monocular Averaging Versus Disparity Averaging What is claimed to be disparity averaging could be due to spatial interactions between the elements in each monocular image. Spatial attraction and repulsion between monocular images are well-known phenomena (Section 21.2). Kaufman et al. (1973) presented a random-dot stereogram in which a central region with a crossed disparity of between 4 and 10 arcmin was superimposed on a region with an equal uncrossed disparity. When the images were equally bright, the central region appeared to be in the same depth plane as the surrounding zero-disparity region. Kaufman et al. concluded that the two oppositely signed disparities combined, or metamerized, into an average value of zero. When the crossed images were made to differ in brightness, the central region appeared to move in depth in the direction of the uncrossed images, and vice versa. In other words, more weight was given to the disparity components with the same brightness in the two eyes. It is not clear whether this effect was due to the fact that one pair of images differed in brightness and one pair did not or to an image in one of the pairs being dim. Another problem with this procedure is that when one of the images of a pair is made very dim, a bright monocular image remains in the other eye, which adds uncorrelated noise. It would be interesting to know what would happen if the brightness of both crossed images were varied relative to that of both uncrossed images. If the brightness of one pair of images is reduced, perceived depth must eventually conform to the disparity of the brighter pair. The only uncertainty concerns the slope of the function relating perceived depth to the relative brightness of images with competing disparities. Metameric pooling in other sensory systems depends on the relative strengths of component stimuli (Section 4.2.7). A further problem with the procedure used by Kaufman et al. 
is that the effect may be due to averaging of locations of images in each eye rather than to averaging of disparities at the binocular level. Rogers and Anstis (1975) favored the monocular averaging interpretation. In their own experiments, the stimulus to one eye was a random-dot pattern with dots subtending 25 arcmin while the stimulus to the other eye was a composite of two identical versions of the same image but in slightly different horizontal positions. When the composite images were separated by less than 2 arcmin, the perceived depth of the surface changed smoothly as the luminance balance between the two images in the composite was varied from 100:0% to 0:100%. When the composite images in one eye were separated by more than 2 arcmin, the depth changed abruptly near the 50:50% balance point of relative luminance from being determined principally by the (zero) disparity of one of the images in the composite to being determined by the nonzero disparity of the other (Figure 18.46A). This does not prove that the change in perceived depth was due to monocular luminance averaging rather than to disparity averaging. In their second experiment, Rogers and Anstis presented to one eye a single (positive) pattern and to the other a composite of two slightly displaced images, one of which was contrast reversed (negative). Whenever the luminance balance of the contrast-reversed image was higher than that of the positive image, subjects reported rivalry rather than depth. This is to be expected given previously reported findings on stereopsis with opposite contrast stimuli (Section 15.3.7). However, when the luminance of the positive image was higher in the composite image, the perceived depth changed smoothly but in the opposite direction to that predicted by the disparity of the negative image in the composite (Figure 18.46B). There are three reasons for doubting that the observed depth change was due to disparity averaging between the linked positive-positive and linked positive-negative images. First, it would require the single image in one eye to be paired with both the positive and negative images in the other eye, thereby violating the unique-linkage rule (Section 15.3.1). Second, it would require that opposite-contrast images in the two eyes be linked, which was shown to be impossible when the negative picture dominated in the composite image. Third, even if the positive and negative images were linked, the depth predicted from disparity averaging would be in the direction of displacement of the negative image rather than in the opposite direction. Rogers and Anstis modeled the consequences of simple spatial summation or averaging of the locations of contours of the composite positive-positive and positive-negative images, before they are compared by the stereoscopic system, and found a good agreement with their empirical results. They also argued that a monocular averaging mechanism more parsimoniously explains the comparable reversed effects in moving patterns (reversed apparent motion) and in vernier alignment of positive-negative composite contours. This mechanism modifies the positions of contours before they are used by the stereoscopic, motion, or vernier alignment mechanisms.
Figure 18.46. Perceived depth with composite images. Left eye saw a 50% density random-dot pattern; right eye saw the same pattern plus a version with a slight uncrossed disparity. The contrast ratio of the patterns in the right eye is given on the x-axis and the disparity is indicated on each curve. In (A) the two right-eye patterns had the same sign of contrast. As the contrast ratio changed, the matched depth shifted toward the disparity of the displaced positive pattern with the higher contrast. In (B) the displaced pattern seen by the right eye was contrast reversed (negative). The matched depth shifted away from the disparity of the displaced negative pattern. (Redrawn from Rogers and Anstis 1975) Axes: matched depth (cm) against the ratio of positive to positive (A) or positive to negative (B) in the composite image, from 100:0 to 0:100; curves are labeled 1.3, 2.6, 3.9, and 5.2.
18.8.2c Disparity Averaging and Image Linking Foley and Richards (1978) superimposed two vertical lines in one eye on two vertical lines in the other eye. Each pair of lines was 1˚ apart. One dichoptic pair of lines presented alone appeared as a single line beyond a central fixation point. The other pair presented alone appeared as a line in front of the fixation point. When the two pairs of lines had equal brightness, subjects saw two lines in the plane of the fixation point placed midway between them. When one left-eye, right-eye pair of images was dimmed, the line produced by the brighter pair appeared closer to the position in which it appeared when the images were presented alone. Foley and Richards concluded that perceived depth in this display represented the mean of the two component disparities, weighted for luminance (see also Foley 1976a). Krol and van de Grind (1986) suggested that the effect reported by Foley and Richards is an artifact of changing vergence. They found that vergence is pulled away from the fixation point toward fusing the brighter images, and concluded that this, not disparity averaging, causes the images to appear displaced in the same direction. Birch and Foley (1979) obtained similar results when vergence was controlled by nonius lines. However, Tam and Ono (1987) spotted another artifact in this type of display. When fixation is held at a point midway between the component lines, the two pairs of images fall on corresponding points. This is the double-nail illusion (Figure 15.22). It is meaningless to talk about disparity averaging in this case, since there are no disparities. When fixation is slightly nearer or farther than the midpoint, the left-left and the right-right images, which signify
lines near the plane of fixation, are closer to each other than the left-right and right-left images, which signify two lines in different depth planes. As a left image in one eye and a right image in the other eye are dimmed, the left-left and right-right matches become less probable and the left-right, right-left matches more probable. This produces a shift from seeing two coplanar lines to seeing two lines separated in depth. This has nothing to do with disparity averaging; it is simply a question of changing the way pairs of images are combined. A sudden shift in the way the images are combined should cause the lines to appear to flip from one position to the other. Tam and Ono observed a sudden flip rather than the gradual transition reported by Foley and Richards. Evidence to be cited next suggests that true disparity averaging occurs only over disparity differences of a few minutes of arc rather than over the 2˚ of disparity difference used by Foley and Richards. The experiments reported so far do not provide conclusive evidence for disparity averaging. Parker and Yang (1989) designed a better display for studying disparity averaging and transparency. In this display, horizontal rows of dots with disparity d1 alternated with rows of dots with disparity d2, as shown in Figure 18.47. The patch containing the alternating rows was set in a surrounding region in which all dots had zero disparity. The difference in disparity between the rows was Δd, and the average disparity was (d1 + d2)/2. In this display, Δd, (d1 + d2)/2, and the lateral separation between the component rows could be varied independently. Subjects set the depth of a comparison patch placed in the zero-disparity surround to equal the depth of the mixed-disparity test patch. For values of Δd up to about 114 arcsec, the apparent depth of the test patch was equal to the average disparity, showing that disparity averaging had occurred within this range. 
Disparity averaging occurred at values of Δd of up to about 200 arcsec, but only when the average disparity of the test display was displaced from zero (when the fixation point was not midway between the two depth planes). With larger Δd values, subjects saw two depth planes with depth transparency. Within the range where averaging occurred, changes in the stereoscopic appearance of the display contingent on changes in Δd could still be discriminated. These results
Figure 18.47. Display used to study disparity averaging. Rows of dots with 1 unit of disparity alternate with rows with 2 units. The line ends have no disparity. Panels show the left-eye and right-eye images of the odd rows (disparity 1) and of the even rows (disparity 2). (Redrawn from Parker and Yang 1989)
support the view that disparity averaging occurs for disparities up to about 114 arcsec and suggest that what was taken for disparity averaging in previous experiments resulted from competition between conflicting disparity matches. Since Parker and Yang did not vary the vertical separation between the rows of their display, they did not determine the spatial range of disparity averaging, as opposed to its disparity range. Rohaly and Wilson (1994) approached this question by superimposing a dichoptic pair of cosine gratings of one spatial frequency on a second pair with a different spatial frequency. One of the fused pairs had the same zero disparity as the fixation point, and the other had a disparity of 112 arcsec. Subjects compared the perceived depth in the test display with that in a comparison display in which the disparity of the two cosine gratings was always the same. The test gratings appeared as a single display at an intermediate depth with respect to the fixation point (disparity averaging) when the spatial frequencies of the gratings differed by less than 3.5 octaves. No thickening of the display in depth was noticed. With greater differences in spatial frequency, the two gratings appeared in different depth planes, one seen through the other. When the contrast of one grating was increased, perceived depth moved in the direction of that grating. Rohaly and Wilson proposed a multichannel model of disparity averaging in which the stimulus components are processed at different spatial scales. They posited a small amount of cross-channel inhibition to account for the data at high contrasts. Two dichoptic vertical gratings with different spatial frequencies create a surface slanted about a vertical axis (Section 20.2.1). 
Superimposition of two disparity ramps composed of similar gratings but with different slants created a slanted surface at an intermediate angle for one observer and a surface that slanted at a greater angle than either ramp for another observer (Richards and Foley 1981). The increase in perceived slant was explained in terms of inhibitory side bands in the monocular spatial-frequency channels. But these different impressions, and others mentioned in the paper, could arise out of confusion over which of the pair of gratings in one eye to match with which of the two gratings in the other eye. When the spatial frequency of the gratings comprising one of the ramps differed from that comprising the other by more than 2 octaves, two slanted surfaces were created. It was concluded that the bandwidth of the monocular spatial-frequency channels feeding into the slant perception mechanism is about 2 octaves. Anderson (1992) constructed a random-dot stereogram in which half the dots had 12.5 arcmin of crossed disparity and half had 12.5 arcmin of uncrossed disparity. This appeared as two superimposed planes, one beyond and one nearer than a surrounding display of zero-disparity dots. In a second stereogram, half the dots in each set of dots were given random values of disparity between plus
and minus 12.5 arcmin, while the other half of each set remained in the nearer or farther plane. This display appeared as a volume of dots rather than as two planes. As the two-plane display was replaced by the volume display, the depth appeared to shrink. During the opposite transition, the depth appeared to expand. This apparent change in perceived depth was nulled when the range of disparities in the volume display was increased by about 50% and each display was exposed for 140 ms. With longer exposure, subjects could detect this transition because they had time to notice whether or not some of the dots were at an intermediate depth. Anderson explained the contraction of apparent depth during the transition between the two displays in terms of depth averaging in the volume display. The magnitude of depth averaging revealed by this procedure is about three times greater than that reported by Parker and Yang. However, there may be an artifact in Anderson’s measure. During the transition from the volume display to the two-plane display, half the dots migrated from an intermediate position to one or other of the limiting surfaces. This motion should generate expansion in depth. The opposite transition should generate contraction in depth. These impressions would add to any effect due to disparity averaging. Control measurements are required in which the depth of a constantly visible two-plane display is nulled against that of a constantly visible volume display. In fact, Stevenson et al. (1991) did an experiment of this type (Portrait Figure 18.48). They superimposed a random-dot surface with a disparity of between zero and 15 arcmin on a zero-disparity random-dot surface. Subjects adjusted the disparity of a comparison random-dot surface until it appeared at the same depth as the zero-disparity surface. When the superimposed surfaces were less than about 4 arcmin apart, they appeared as one surface at an intermediate depth. 
The degree of disparity averaging was about one-third that reported by Anderson and was therefore in line with that reported by Parker and Yang.
Summary
In disparity metamerism, neighboring stimuli with similar disparities evoke distributions of activity within the set of disparity detectors. When these distributions are not sufficiently distinct, they coalesce into a single distribution of activity with a peak at an intermediate position. Such a process would occur beyond the stage where monocular images are combined. But what is claimed to be disparity averaging could also be due to spatial interactions between the elements in each monocular image. The best evidence suggests that disparity metamerism occurs over a range of disparities of about 2 arcmin. However, it is not yet clear whether this process occurs before or after binocular images are combined. What appears like disparity averaging between larger disparities is probably due to changes in the way binocular images are linked.
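The coalescence described in the summary can be illustrated with a toy population model. The sketch below assumes, hypothetically, that each stimulus evokes a Gaussian profile of activity across disparity detectors with a common width `sigma`. The Gaussian shape, the width of 1 arcmin, and the function names are illustrative assumptions, not values taken from the studies cited. For two equal Gaussians, the summed profile has a single central peak when the centers are separated by less than 2 × `sigma`, and two peaks when the separation is larger.

```python
import math

def population_response(x, centers, sigma):
    """Summed activity at preferred disparity x (assumed Gaussian profiles)."""
    return sum(math.exp(-0.5 * ((x - c) / sigma) ** 2) for c in centers)

def find_peaks(centers, sigma, lo=-6.0, hi=6.0, step=0.01):
    """Return the disparities at which the summed profile has a local maximum."""
    n = int(round((hi - lo) / step))
    xs = [lo + i * step for i in range(n + 1)]
    ys = [population_response(x, centers, sigma) for x in xs]
    peaks = []
    for i in range(1, len(ys) - 1):
        if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]:
            peaks.append(xs[i])
    return peaks

# Hypothetical detector width of 1 arcmin.
merged = find_peaks(centers=(-0.5, 0.5), sigma=1.0)    # 1 arcmin apart
separate = find_peaks(centers=(-2.0, 2.0), sigma=1.0)  # 4 arcmin apart
```

With these assumed values, `merged` holds one peak at the intermediate disparity and `separate` holds two peaks near the component disparities.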
Figure 18.48. Scott B. Stevenson. Born in Huntsville, Alabama, in 1959. He graduated in psychology/biology from Rice University in 1981 and obtained a Ph.D. in experimental psychology from Brown University with Lorrin Riggs in 1987. He did postdoctoral work at Berkeley with Clifton Schor. In 1995 he joined the faculty of the University of Houston, where he is now associate professor of optometry.
18.9 TRANSPARENCY AND STEREOACUITY
When one textured surface is seen through another, image pairs with one disparity are interspersed among those with another disparity. We have no difficulty seeing superimposed transparent surfaces, as long as the density of texture elements is not too great. We saw in Section 12.1.3a that transparency is evident when the disparity gradient between neighboring dots is up to about 3, which is well above the gradient limit for binocular fusion. Weinshall (1991) claimed that multiple depth planes seen in an ambiguous random-dot display are due to double-duty linkage between the images. However, she later showed that the perceived density of dots in the depth planes was consistent with each dot having been linked with only one other dot (Weinshall 1993). Thus, the unique-linkage rule holds in transparency displays. The present section is concerned with the effects of dot density on stereoscopic transparency. Akerstrom and Todd (1988) used two superimposed random-dot surfaces, which created transparent depth, or two side-by-side depth planes, as shown in Figure 18.49. Subjects fixated a dot at an intermediate depth between the displays. Combinations of crossed and uncrossed
Figure 18.49. Superimposed and adjacent depth planes. One stereogram has superimposed depth planes and the other has side-by-side depth planes. (From Akerstrom and Todd 1988. Copyright Psychonomic Society, Inc)
disparities ranged from –14, +28 arcmin to –49, +63 arcmin. As the disparity between the surfaces increased, it became more difficult to see transparency in the superimposed surfaces but not more difficult to see a depth step between the adjacent surfaces. As disparities became very large, the displays formed a mishmash of randomly matched images, which created an impression of lacy depth. Akerstrom and Todd also found that depth transparency became less likely as dot density increased. Making the dots in the component depth planes different in color facilitated the perception of depth transparency. Images differing in color are less likely to form spurious matches. Depth transparency was not facilitated by differences in orientation of line elements in the two depth planes. The simplest explanation of the increased difficulty of seeing transparent surfaces is that, as the images in each depth plane become more disparate or more numerous, it becomes more likely that elements belonging to one plane are paired with elements belonging to the other plane. Also, it becomes more likely that images of elements on one plane abut or overlap images of elements in the other plane to form spurious clusters. Spurious clustering is most likely with rectangular texture elements. Spurious matches between texture elements can be avoided by assigning the elements in each surface to alternate rows. Tsirlin et al. (2010b) avoided spurious matches by restricting the distribution of rectangular texture elements so that elements from two transparent surfaces did not abut or overlap. Subjects perceived two surfaces in distinct depth planes at smaller disparity separations between the planes when spurious matches were eliminated than when they were present. Gepshtein and Cooperman (1998) presented subjects with random-dot stereograms depicting a cylinder seen through a flat textured surface. The display was presented
for between 100 and 180 ms with fixation in the plane of the near surface. Subjects were asked to distinguish between two orientations of the cylinder. For a given disparity-defined depth between the flat and cylindrical surfaces, performance deteriorated with increasing density of dots in the near surface. As the relative depth of the stimuli increased, perception failed at lower and lower dot densities. Performance improved with increasing density of dots in the far cylindrical surface. This asymmetry in the effect of dot density may have been due to the fact that only the near surface appeared transparent. The dots of the far surface would be seen against an opaque white background. The effects of dot density persisted when the dots in one surface had opposite luminance polarity to those in the other surface. Gepshtein and Cooperman argued that opposite contrast dots do not interact in the disparity domain and concluded from this that the effects of changing dot density cannot be due to difficulties of correctly linking the dots in the two planes arising from the nearest-neighbor constraint or the order constraint. However, opposite contrast dots do interact even though they do not generate impressions of depth (see Section 11.4.1f). Gepshtein and Cooperman proposed that performance deteriorates with increasing dot density of the near surface because this increases the disparity gradient over each local region. They argued that neurons tuned to different disparities inhibit each other over small regions of space. In their model, the zone of inhibition arising from each point in the near surface grows wider as the depth separation between surfaces increases so that an increasingly large part of the far surface becomes affected. It is not clear how these inhibitory interactions operate between dots with opposite luminance polarity. Metameric pooling of inputs from neighboring disparity detectors would have the same effect as mutual inhibition. 
Wallace and Mamassian (2004) pointed out that Gepshtein and Cooperman did not assess performance when the cylinder was not viewed through a transparent surface. It is therefore not clear whether detection of depth in the cylinder at different distances depended on overall density. Wallace and Mamassian compared the performance of human observers and an ideal observer on the task of discriminating which of two random-dot displays was more distant from a fixation cross placed between them. In one condition, the two displays were superimposed to produce the impression of one surface seen through the other. In a second condition, the two displays were presented sequentially. Each display was presented for 2 s with five levels of added disparity noise consisting of unpaired dots. For both types of display, as dot density increased, human depth-discrimination thresholds increased slightly while those of the ideal observer decreased slightly. Thresholds were higher for the superimposed displays than for the sequential displays for both human observers and for the
ideal observer. Performance of the ideal observer was determined only by the probability of making false image linkages. It processed local disparities independently and therefore did not embody cross-disparity inhibition. Wallace and Mamassian concluded that the higher depth discrimination threshold with transparent displays arises from the difficulty of correctly linking the images rather than from inhibitory interactions between disparity detectors. One problem is that the ideal observer took no account of the fixation point between the test surfaces. Subjects may have judged the relative depth of the fixation point and one or other surface rather than the depth between the surfaces. Tsirlin et al. (2008) asked how many superimposed dot displays separated in stereoscopic depth can be identified. The displays contained distinct Glass patterns, which subjects had to distinguish. This ensured that they had identified each display. Subjects were allowed to change convergence. Under the best conditions, observers identified up to six superimposed displays. Increasing overall element density degraded performance. For a given overall dot density, performance was worst when the disparity between the planes was 1.9 arcmin. This disparity falls within the range of disparity pooling (Section 18.8). Performance improved as disparity was increased up to a peak value, after which it declined. The peak disparity was smaller for dense stimuli than for sparse stimuli. However, analysis of the results revealed that the disparity gradient over the displays was not the crucial factor. The adverse effects of dot density and large interplane disparities agreed with the results of Akerstrom and Todd. They can be accounted for in terms of the properties of fine-scale and coarse-scale disparity detectors. Depth transparency produced by motion parallax is discussed in Section 28.3.3a.
18.10 STEREOACUITY AND EYE MOVEMENTS
18.10.1 EFFECTS OF LATERAL EYE MOVEMENTS
18.10.1a Stereoacuity with Stabilized Images
Grating acuity and vernier acuity do not change when the stimulus is stabilized on the retina (Keesey 1960; Gilbert and Fender 1969). Thus, microsaccades and drifting motions of the eyes during fixation neither aid nor degrade visual acuity. Furthermore, imposed motion of a vernier target up to a velocity of 2.5˚/s did not degrade vernier acuity (Westheimer and McKee 1977). Stereoacuity, also, is not affected when the images of the target are optically stabilized. Depth can be seen in briefly exposed random-dot stereograms even when the subject has no prior knowledge of the stimulus (Julesz 1963;
Mayhew and Frisby 1979a; Tyler and Julesz 1980). Furthermore, depth may be seen in afterimages of line stereograms (Wheatstone 1838; Ferree and Rand 1934; Ogle and Reiher 1962; Bower et al. 1964). Depth may also be seen in afterimages of random-dot stereograms. However, the dots must be well spaced, because a closely spaced random-dot pattern is not visible in an afterimage (Evans and Clegg 1967). Ogle and Weil (1958) found that the mean stereo threshold rose from about 10 to 40 arcsec as stimulus duration was reduced from 1 s to 7.5 ms. They concluded that, although not essential for stereopsis, small involuntary eye movements help because they keep the images in motion. Shortess and Krauskopf (1961), however, found a similar dependence of stereoacuity on stimulus duration when the images were retinally stabilized. It seems that small movements of the eyes during fixation neither aid nor hinder stereopsis, but that disparity is integrated over time in a manner analogous to integration of luminance over time, as described by Bloch’s law. This question is discussed further in Section 18.12.1.
18.10.1b Stereoacuity with Laterally Moving Images
Grating acuity is affected by retinal motion of the grating. Burr and Ross (1982) measured the contrast required for the detection of the direction of motion of a computer-generated vertical sinusoidal grating drifting sideways at various velocities. Observers fixated a stationary spot. The results for one of their two observers are shown in Figure 18.50. The contrast sensitivity curve shifted down the spatial frequency scale as the velocity of the grating increased from 0 to 800˚/s. Low spatial frequencies became easier to detect, while high spatial frequencies became more difficult to detect. Grating acuity is not affected by pursuit eye movements in the absence of retinal image motion (Flipse et al. 1988). Low-velocity motion improves the visibility of a grating presumably because it prevents neural adaptation. But, beyond a certain velocity, visibility begins to deteriorate because the temporal frequency (velocity × spatial frequency) at each location approaches the temporal resolution limit. The higher the spatial frequency, the sooner this limiting velocity is reached. In addition, high spatial frequencies take longer to process than low spatial frequencies. Moving a grating is equivalent to removing high spatial-frequency components from the stimulus. That explains why the loss in contrast sensitivity produced by a given image motion increases as the spatial frequency of the grating increases. Lit (1960a, 1964) had subjects move a rod to the same apparent depth as a test rod. The variability of settings increased linearly as the sideways velocity of the test rod increased from 1.5 to 40˚/s. The effects of increased
Figure 18.50. Effect of image motion on contrast sensitivity. Contrast-sensitivity curves for a vertical sinusoidal grating drifting laterally at various speeds, indicated on each curve. At all drift speeds, the curves have the same height, width, and general shape (N = 1). Axes: contrast sensitivity against spatial frequency (cpd), with curves for drift speeds of 0, 1, 10, 100, and 800˚/s. (Adapted from Burr and Ross 1982)
velocity could be offset by an increase in the illumination of the rods (Lit and Hamm 1966). When the width of the aperture across which the test rod moved was wider than about 10˚, the effects of velocities between 7 and 40˚/s were independent of aperture width (Lit and Vicars 1970). A rod moving at 40˚/s over a 10˚ aperture is seen for 0.25 s. With apertures less than 10˚, variability of settings increased rapidly for all velocities, showing that, for stimulus exposure times under 0.25 s, the crucial variable was total time of exposure rather than velocity. When the stimulus was visible for more than 0.25 s the critical variable for this range of velocities was stimulus velocity rather than exposure time. Stereoacuity was not affected when either the fixation target or the comparison target placed just below it moved at velocities of up to 2˚/s (Westheimer and McKee 1978). Ramamurthy et al. (2005) measured the threshold for detecting stereo depth between two vertical lines presented for 200 ms and moving laterally at between 0 and 12˚/s. The threshold was not affected by velocities up to 2˚/s but increased exponentially eightfold as velocity increased from 2 to 12˚/s. Control experiments revealed that the effects of motion were not due primarily to exposure duration or to the increasing eccentricity of the target. The results were consistent with a loss in sensitivity to high spatial-frequency components of the stimulus resulting from image blur. The effects of motion and exposure time are unconfounded if the stimulus consists of a vertical grating moving continuously horizontally. Morgan and Castet (1995) used a vertical sinusoidal grating about 30˚ wide and 20˚ high.
The grating was exposed for 0.5 s as it moved horizontally with various crossed or uncrossed disparities with respect to stationary random-dot surfaces placed above and below it. For a grating with a spatial frequency of 0.04 cpd, subjects could detect the sign of depth for velocities up to 640˚/s. With higher spatial frequencies, stereopsis was maintained only up to lower velocities. However, the crucial factor was not velocity but the spatial phase of the disparity. Depth was detected in gratings that generated a temporal frequency peak of less than 30 Hz, which is equivalent to a spatial phase of 5˚ and an interocular delay of about 0.5 ms. Morgan and Castet concluded that stereopsis for a moving stimulus depends on neurons with a spatial-temporal phase shift in the monocular receptive fields in the two eyes. It is not clear whether Morgan and Castet controlled eye movements in this experiment. A moving grating would evoke pursuit eye movements with a latency of under 0.5 s. In the preceding experiments, the moving objects were seen in depth. Hadani and Vardi (1987) devised the display depicted in Figure 18.51. The vertical square-wave grating with 1.8 arcmin disparity at each vertical contour was stationary while the dots defining it moved at various velocities over its surface, changing their disparity each time they crossed a disparity boundary. The motion of the dots induced optokinetic nystagmus, causing the eyes to pursue the moving dots with periodic saccadic returns. Stereoacuity was impaired at dot velocities between 1 and 3˚/s. At higher velocities, performance improved until, at a velocity of 11˚/s, it was equal to that with stationary dots. Optokinetic nystagmus was most evident at those velocities where the loss in stereoacuity was greatest. Sinusoidal motion of the dots from side to side did not affect stereoacuity. With this type of motion, saccadic phases of optokinetic nystagmus do not occur. 
This suggests that impairment of stereoacuity with continuous motion of texture over the surface of a stereogram is due to intrusion of saccades that disrupt the binocular fusion of the display. In these experiments the whole stimulus display was moved, leaving relative disparities unchanged. A related question concerns the effect of temporal modulations of relative disparity on stereoacuity. This question was discussed in Section 18.6.3.
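The arithmetic behind several figures quoted in this section can be checked with a short sketch. It uses the relation stated above, temporal frequency = velocity × spatial frequency, and exposure time = aperture width / velocity. The conversion of a phase shift into an interocular delay (phase as a fraction of a cycle, divided by temporal frequency) is an assumed reading of the Morgan and Castet equivalence, and all function names are hypothetical.

```python
def temporal_frequency_hz(velocity_deg_per_s, spatial_freq_cpd):
    """Temporal frequency at a fixed retinal location for a drifting grating."""
    return velocity_deg_per_s * spatial_freq_cpd

def interocular_delay_ms(phase_deg, temporal_freq_hz):
    """Delay equivalent to a phase shift at a given temporal frequency (assumed reading)."""
    cycle_fraction = phase_deg / 360.0
    return cycle_fraction / temporal_freq_hz * 1000.0

def exposure_time_s(aperture_deg, velocity_deg_per_s):
    """Time for which a moving target is visible within an aperture."""
    return aperture_deg / velocity_deg_per_s

# Values quoted in the text.
tf = temporal_frequency_hz(640.0, 0.04)    # 0.04 cpd grating moving at 640 deg/s
delay = interocular_delay_ms(5.0, 30.0)    # 5 deg of phase at 30 Hz
exposure = exposure_time_s(10.0, 40.0)     # 40 deg/s across a 10 deg aperture
```

Under these assumptions, the 0.04-cpd grating at 640˚/s gives a temporal frequency just under the 30-Hz limit, the 5˚ phase shift at 30 Hz corresponds to a delay a little under 0.5 ms, and the rod is visible for 0.25 s.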
18.10.2 DETECTION OF DEPTH ACROSS SHIFTS OF GAZE
18.10.2a Gaze Shifts Between Distinct Objects
Consider the task of detecting a difference in depth between two targets. A loss in stereoacuity occurs as the lateral separation between the targets increases beyond about 0.2˚ (Section 18.6.2). However, the loss is not as great when the
Figure 18.51. Stereoacuity and microtexture motion. A random-dot stereogram defined a square-wave depth grating with 1.8 arcmin of disparity. The grating was stationary, but dots defining it moved at various velocities over its surface, changing their disparity as they crossed a disparity boundary. (Adapted from Hadani and Vardi 1987)
eyes are allowed to move from one target to the other compared with when one target is fixated (Ogle 1939c; Hirsch and Weymouth 1948a, 1948b). Some data on this point are shown in Figure 18.52 (Rady and Ishak 1955). The effects of eye movements on the detection of a difference in depth between spatially separated targets could be influenced by the following factors:
Figure 18.52. Stereoacuity and lateral separation. Acuity for perception of relative depth between two illuminated apertures as a function of their lateral separation. Blue curve: subjects looked from one stimulus to the other. Green curve: subjects fixated one stimulus. Red curve: the difference between conditions (N = 10). Acuity is the reciprocal of the standard deviation of depth judgments. Higher numbers represent higher acuity. Axes: stereoacuity (1/σ) against separation of targets (deg). (Redrawn from Rady and Ishak 1955)
1. Changes in vergence A change in vergence as gaze moves from one target to the other could improve depth discrimination through the mediation of signals provided by either motor efference or sensory feedback from the extraocular muscles (Wright 1951). However, a change in vergence required to bring images onto corresponding points is possible only after the disparity of the target has been detected. Therefore, changes in vergence do not provide an independent source of information about the relative depth of two objects.
Backus and Matza-Brown (2003) used a cue-conflict procedure to measure the relative contributions of disparity and changing vergence to judgments of the relative depth of two vertically separated dots. As subjects changed their gaze from one dot to the other the absolute disparity (required vergence) of both dots changed in the same way, leaving relative disparity constant. Subjects tended to base judgments of relative depth on disparity for dot separations up to 10˚ and on changing vergence for larger separations.
The vergence state of the eyes fluctuates. This should not affect depth discrimination for simultaneously visible objects because vergence fluctuations affect the disparities of all objects in the same way, leaving relative disparities unchanged (see Section 18.10.3a). However, when two separated targets are presented sequentially, any fluctuation in vergence in the interstimulus interval affects the relative disparities of the targets. This would degrade stereoacuity if the vergence change were unregistered. Evidence reviewed in Section 18.12.2b shows that the ability to discriminate a depth interval between two adjacent targets declines as the interstimulus interval (ISI) is increased. But this effect could also be due to fading of the memory trace of the first stimulus. Zhang et al. (2003) found that the ability to detect a depth interval between two dots declined with increasing dot separation when the ISI was constant. This effect must have been due to vergence instability increasing with increasing amplitude of gaze shift.
2. Sequential foveal fixation The images of a fixated object fall on the fovea but the images of the nonfixated object fall in the periphery, where stereoacuity is reduced. When the gaze is allowed to move, the images of first one object and then the other are brought onto the fovea. The disparity in the images of one object must be retained in short-term memory and compared with the disparity in the images of the other object. Stereo depth perception is certainly possible with objects presented sequentially in the same position, as we will see in Section 18.12.2.
There would be no need for sequential comparison of disparities if the images were compared halfway through the eye movement when both are relatively near the fovea (Ogle 1956). This presupposes that one can sample disparities during a saccadic eye movement, which is most unlikely. Enright (1991a) asked subjects to alternate their gaze between two targets positioned so that, when one of
them was fixated, one of the images of the other target fell on the blind spot of one eye. Thus, the disparities produced by the two targets were never in view at the same time. Subjects adjusted the distance of one target until the two targets appeared at the same distance. They could look back and forth between the targets. The standard deviation of settings was about 5 arcmin. Wright (1951) had conducted a similar experiment.
Stronger evidence in support of the idea that disparities are retained in memory over changes in version was provided by Enright (1996). He showed that depth resulting from 45 arcmin of disparity between finely textured surfaces seen through 3.5˚ apertures separated by 9˚ could be detected when subjects were allowed to look from one surface to the other. When subjects fixated on one surface the texture of the other was too fine to be discriminated in peripheral vision. Frisby et al. (1997) obtained similar results when the textured patterns contained no low spatial-frequency components.
Brenner and van Damme (1998) asked subjects to look back and forth between an object that was visible only before the eye movement began and a second object that was visible only after the completion of the eye movement. The objects were 20˚ apart. Subjects adjusted one of the lights to appear equal in distance to the other with a standard deviation of about 10 arcmin. To perform this task subjects could set the distance of the adjustable target until no change in vergence was detected as the eyes moved between the targets. Otherwise they could set the target until they detected no change in disparity as they moved between the targets. Subjects set the adjustable target to be half or double the distance of the fixed target with a standard deviation of about 20 arcmin. Taroyan et al. (2000) obtained similar results when the head rotated between presentations. In these cases, subjects would have to make comparative estimates of vergence angles, of changing disparities, or of a combination of the two.
3. Disparity normalization If the eyes are stationary, the images of the test objects remain in the same location. Under these circumstances, perceived depth between the test objects adapts out and they come to appear nearer to a frontal plane (Section 21.3.2). If the eyes move back and forth between the targets, the pattern of disparity changes in position. This should prevent depth normalization to the frontal plane. There is no direct evidence on this point.
4. Troxler fading With very steady fixation the whole or part of a stimulus may fade, an effect known as Troxler fading. Eye movements prevent this from happening.
5. Interstimulus masking Sequential adjacent targets are subject to masking, which reduces stereoacuity (Section 18.12.2b). The effects of masking are reduced with increasing separation of the targets. This would explain why the loss of stereoacuity with successive presentation was found to be much less for targets separated by 10˚ than for spatially adjacent targets (Enright 1991b).
6. Number of judgments Weale (1956) argued that when subjects look from one target to the other they make twice as many mental comparisons compared with when they fixate one target. This should improve stereoacuity.
It is well known that objects are mislocalized when flashed on just before or just after a saccadic eye movement. Teichert et al. (2008) asked whether perception of relative depth is maintained about the time of a saccade. Observers reported the depth of a target flashed on for 10 ms relative to a fixation target or to a target to which they made a saccadic eye movement. For stimuli flashed on up to 25 ms before or after a horizontal saccade, relative depth was underestimated. Relative depth of stimuli outside this time interval was overestimated.
18.10.2b Gaze Shifts and the Perception of Slant
As the gaze scans over a slanted or inclined surface, the angle of vergence changes so that matching features are maintained on the foveas. Detection of the direction of slant of a large smooth random-dot surface about a horizontal axis did not improve when the gaze was allowed to wander over the surface rather than remain at one location (Berends et al. 2003). However, scanning eye movements improved performance when disparity noise was added to the surface to produce depth irregularity. The improvement occurred when the gaze moved horizontally over the slanted surface (along its in-depth dimension) but not when it moved vertically. The crucial factor seems to be the sampling of disparity gradients at different azimuth locations. In viewing an inclined or slanted surface the eyes tended to move more along the in-depth dimension than parallel to the slope axis (Wexler and Ouarti 2008). This would help in the detection of the angle of slope. For complex surfaces and multiple surfaces, sequentially acquired samples of disparity produced by scanning eye movements provide a basis for building an internal representation of the 3-D layout of the scene. This internal representation facilitates the control of subsequent sweeps of gaze, and these confirm and refine impressions of depth structure. One may also ask whether the detection of a difference in the slant of two separated surfaces is improved when the gaze is allowed to move from one surface to the other.
Zhang et al. (2003) presented an 8˚-wide by 1.5˚-high random-dot stereogram depicting a slanted test surface vertically above a comparison frontal surface. They were presented for 734 ms. When subjects fixated the comparison surface the slant-detection threshold for the test surface increased rapidly as the vertical separation of the surfaces increased. This must have been due, at least partly, to the increasing eccentricity of the test surface. But the threshold still increased, though less steeply, when subjects looked from one surface to the other. This could have been due to memory loss as the time interval (ISI) between successive fixations increased or to increasing effects of instability of gaze. The surfaces were then presented for 167 ms, either simultaneously or with an ISI of 400 ms with fixation on the comparison surface. In both cases, the slant-detection threshold increased with increasing separation of the surfaces. At all separations, the threshold was lower for the simultaneous than for the successive presentation. If this difference had been due to memory loss it should have been independent of surface separation, because the ISI was constant. However, the difference was not constant. Zhang et al. concluded that an additional factor is that, with increasing separation of the surfaces, subjects become less sensitive to the relative slants of the two surfaces. Zhang et al. (2003) also showed that vergence instability was not a factor in slant detection. Any change in vergence leaves the disparity gradient over a surface unchanged. We saw in the previous section that vergence instability affects detection of a depth interval between separated simple targets. This latter task involves comparison of absolute disparities, which are affected by changes in vergence.
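The distinction between absolute disparities and the disparity gradient can be illustrated numerically. The following is a minimal sketch, not a model from the studies cited: it assumes that a vergence error adds the same disparity offset at every sampled location, and the sample positions, gradient, and offset are hypothetical values chosen only for illustration.

```python
import numpy as np

def disparity_gradient(absolute_disparities, positions):
    """Finite-difference estimate of the disparity gradient (arcmin per degree)."""
    return np.diff(absolute_disparities) / np.diff(positions)

# Hypothetical samples across a slanted surface (illustrative values only).
azimuth_deg = np.linspace(-4.0, 4.0, 9)      # sample positions, degrees
true_gradient = 1.5                          # arcmin of disparity per degree
surface_disparity = true_gradient * azimuth_deg

# Assumption: a vergence error shifts every disparity by the same amount.
vergence_error_arcmin = 7.0
shifted_disparity = surface_disparity + vergence_error_arcmin

absolute_change = shifted_disparity - surface_disparity
gradient_before = disparity_gradient(surface_disparity, azimuth_deg)
gradient_after = disparity_gradient(shifted_disparity, azimuth_deg)

print("change in absolute disparity:", absolute_change)
print("gradient before vergence error:", gradient_before)
print("gradient after vergence error:", gradient_after)
```

Under this assumption every absolute disparity changes by the full vergence error while the gradient is untouched, which is why a slant judgment could be immune to vergence instability when a comparison of absolute disparities is not.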
18.10.3 STEREOACUITY AND VERGENCE STABILITY
18.10.3a Compensating for Vergence Instability
Changes in vergence are not required for stereoscopic vision, since depth is apparent in stereograms presented for durations much shorter than the latency of vergence, which is at least 150 ms (Dove 1841; Westheimer and Mitchell 1969). Also, stereoacuity is not improved when the images of the target are optically stabilized. Nevertheless, stereoacuity is degraded by excessive vergence instability in patients with congenital nystagmus (Ukwade and Bedell 1999). The small movements of an eye that occur when a small visual target is fixated are almost as large with binocular fixation as with monocular fixation (St Cyr and Fender 1969). Uncorrelated movements of the two eyes produce a corresponding variation in the disparity between the images of a binocularly fixated object. Motter and Poggio (1984) found that the eyes of a monkey were misconverged by more than 7 arcmin in both the horizontal and vertical directions about 60% of the time when the animal fixated a
small target. Such eye movements create fluctuations in disparity equally over the whole visual field. Motter and Poggio suggested that a dynamic feedback process prevents drifts in vergence from interfering with stereoscopic vision. Anderson and van Essen (1987) proposed a specific neural model of this process, called a shifter circuit. The idea is that the receptive-field boundaries of cells in the LGN or striate cortex are dynamically adjusted to null the effects of image motion. In the shifter-circuit model, the relative topographic mapping changes, not the size of receptive fields. Motter and Poggio (1990) produced physiological evidence in favor of the shifter circuit hypothesis, although this has been questioned on the grounds that eye movements were inadequately monitored and stimuli were inappropriately large (Gur and Snodderly 1987, 1997). There are three strong arguments against the shifter-circuit idea, as applied to nulling the effects of involuntary eye movements:
1. No visual feedback mechanism could respond quickly enough to null the moment-to-moment effects of microsaccades, which constitute a major portion of involuntary eye movements.
2. To correct for vergence-induced fluctuations in disparity the mechanism should operate globally over the whole visual field. However, vernier acuity, and therefore probably stereoacuity also, is normal for targets oscillating simultaneously in opposite directions in the same retinal region (Fahle 1991, 1995). For this purpose the shifter mechanism would have to operate locally. But it would be disadvantageous to null slowly changing local disparities, since such changes usually signify real differences in depth.
3. Disparity changes too large to be accommodated by the proposed shifter circuit, when applied evenly over the whole visual field, do not provide a strong signal for the detection of changing depth (see Section 31.3.2).
Furthermore, stereoscopic vision is not much disturbed by normally occurring fixation disparities (Section 10.2.4) or by experimentally imposed fixation disparities of over 1˚ (Fender and Julesz 1967). If shifter circuits nulled fixation disparities, the disparities would no longer be visible; but they are visible when tested with nonius lines. All that is required to account for the fact that stereoscopic vision is not unduly disturbed by overall conjugate or disconjugate motions or displacements of retinal images is that the stereoscopic system registers the first or higher spatial derivatives of disparity, as discussed in Chapter 21. Such a mechanism simply responds to local disparity differences and discontinuities and therefore automatically rejects absolute disparities applied equally over a
given region. Note that, as we approach a 3-D visual scene, disparities are not changed by the same amount over the whole scene because the disparity per unit depth separation between a pair of objects is inversely proportional to the square of viewing distance. Disparities in different areas are not changed homogeneously even when we approach a flat surface in a frontal plane, because the gradients of vertical and horizontal disparities in such a surface vary with viewing distance and eccentricity (Section 20.2.2). Homogeneous changes in disparity are produced only by horizontal or vertical misconvergence or by inappropriate cyclovergence and are therefore best ignored by the depth perception system. However, homogeneous changes of disparity must be detected at some level because they evoke vergence movements and can give rise to sensations of changing size. Also, homogeneous changes of disparity are not accompanied by changes in image size that normally accompany changes in the distance of an object. We will see in Section 31.3.2 that overall changes in disparity do produce sensations of motion-in-depth when there is no conflicting information. Another possibility is that rapid changes in disparity produced by fluctuations of vergence are not detected because disparity signals pass through a low-pass temporal filter. The time-averaged response would smooth out rapid fluctuations. A third possibility is that an estimate of image motion arising from eye tremor (baseline motion) in each eye is derived from the region of lowest velocity. This multidirectional motion signal is then subtracted from motion signals over the whole retina to give an estimate of motion arising from external sources. The baseline motion signal can be derived as a mean estimate over an area and over a period of time. Murakami and Cavanagh (1998) produced evidence for this idea. 
Subjects inspected a small patch of dynamic random dots for 30 s, after which they looked at a larger test patch of stationary dots. In the adapted inner region of the test patch the dots appeared stationary but in the unadapted surround they appeared to jitter rigidly in random directions. The effect can be explained if it is assumed that, in the adapted region of the test patch, image motion due to eye tremor is undetected because the motion system has been adapted in this region. Relative to this lowered motion signal, the image motion due to eye tremor in the surrounding unadapted region appears as motion. It seems that the baseline motion signal is derived in each eye separately, since adaptation to dynamic noise in one eye did not affect perceived motion of a test patch presented to the other eye.
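The inverse-square relation between disparity and viewing distance mentioned above can be checked with a short calculation. This is a sketch under stated assumptions: both points lie on the midline, the interocular distance is taken as 6.5 cm, and the function names are introduced here for illustration only. The exact disparity is the difference between the vergence angles of the two points; the small-angle approximation is interocular distance times depth interval divided by distance squared.

```python
import math

ARCMIN_PER_RADIAN = 60.0 * 180.0 / math.pi
INTEROCULAR_M = 0.065  # assumed interocular distance, metres

def vergence_angle(distance_m, interocular_m=INTEROCULAR_M):
    """Vergence angle (radians) needed to fixate a midline point."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))

def exact_disparity_arcmin(distance_m, depth_interval_m):
    """Relative disparity between points at distance and distance + interval."""
    near = vergence_angle(distance_m)
    far = vergence_angle(distance_m + depth_interval_m)
    return (near - far) * ARCMIN_PER_RADIAN

def approx_disparity_arcmin(distance_m, depth_interval_m):
    """Small-angle approximation: a * delta_d / d**2, converted to arcmin."""
    return INTEROCULAR_M * depth_interval_m / distance_m ** 2 * ARCMIN_PER_RADIAN

depth_interval = 0.01  # the same 1-cm depth interval at both viewing distances
at_1m = exact_disparity_arcmin(1.0, depth_interval)
at_2m = exact_disparity_arcmin(2.0, depth_interval)
ratio = at_1m / at_2m

print(f"disparity at 1 m: {at_1m:.3f} arcmin")
print(f"disparity at 2 m: {at_2m:.3f} arcmin")
print(f"ratio (close to 4 under the inverse-square rule): {ratio:.3f}")
```

Doubling the viewing distance reduces the disparity of a fixed depth interval to roughly one quarter. Approaching a scene therefore changes disparities unevenly across objects at different distances, unlike the homogeneous change produced by misconvergence.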
18.10.3b Stereoacuity and Vergence Accuracy
Evidence reviewed in Section 18.3.3 reveals that the threshold for discriminating a difference in disparity-defined depth between two stimuli increases rapidly when both stimuli are on a disparity pedestal. However, up to a pedestal value of about 5 arcmin, the stereo discrimination threshold decreases; it shows a dipper function. This, along with physiological evidence (Section 11.4.1), suggests that the tuning functions of disparity detectors overlap on either side of zero disparity. Accordingly, the stereodiscrimination threshold should be adversely affected only by a fixation disparity of more than about 5 arcmin. Fixation disparity induced by base-in or base-out prisms adversely affected the ability to detect the sign of depth of a vertical line with respect to flanking lines. However, almost all the fixation disparities were greater than 5 arcmin (Cole and Boisvert 1974). Ukwade et al. (2003a, 2003b) used prisms to impose fixation disparity or oscillating mirrors to impose sinusoidal or random modulations of vergence. Both procedures increased the threshold for discrimination of depth between two vertical lines when the fixation disparity or the mean vergence instability exceeded 3.8 arcmin. There is some indication in their data of a lowering of the threshold at values between zero and 3.8 arcmin of disparity, but they do not mention this. Insofar as fixation disparity is positively correlated with phoria, stereoacuity could also be adversely affected by phoria. Saladin (1995) measured Howard-Dolman stereoacuity in over 1,700 people with phoria. Exophoria up to 7 diopters had no effect on stereoacuity. However, stereoacuity worsened, with the mean threshold rising from 10 arcmin to 20 arcmin, as the magnitude of esophoria increased from 1 to 7 diopters. Even 1 diopter of vertical phoria reduced stereoacuity. Saladin explained the asymmetric effects of esophoria and exophoria in terms of the asymmetry of fixation disparity in the two types of phoria.
Exophoria is accompanied by very little fixation disparity while esophoria is accompanied by an increase of approximately 1 arcmin of fixation disparity for each diopter of phoria (Jampolsky et al. 1957; Ogle 1964) (see Section 10.2.4). Lam et al. (2002) also found some loss of Howard-Dolman stereoacuity in esophores, especially for uncrossed disparity. Exophores showed a smaller loss than esophores.
18.10.4 STEREO INTEGRATION OVER VERGENCE CHANGES
Although depth can be seen in displays that are too brief to allow vergence changes (Section 18.12.1), vergence plays a role when sufficient time is allowed. Vergence is particularly important in a display with a large initial disparity or when there are several superimposed displays at widely different depths. The angle of convergence changes from one depth plane to another, guided by binocular disparity and other cues to depth. Large disparities provide a phasic signal,
which initiates vergence in the right direction but is unable to maintain steady vergence. Phasic vergence may be initiated by widely dissimilar images (Section 10.5.10). A change in vergence reduces the disparities in the target plane until they come within the range of fine disparity detectors. Disparities arising from objects not too distant from the plane of convergence are used to code depth. Within a selected plane a more or less steady state of vergence is maintained by error feedback from tonic disparity detectors operating on local detail. Large disparities in other depth planes initiate another vergence movement only when the observer decides to change vergence. To a first approximation, and for small areas, the images in the two eyes arising from each depth plane are similar so that, once one pair of matching images with a given disparity has been found, the task of finding other matching images with similar disparity is eased. These ideas have been applied to robotic stereo systems (Theimer and Mallot 1994). Vergence provides a disparity and spatial-resolution zoom mechanism. Marr and Poggio (1979) proposed that distinct analyses done at each vergence angle are integrated and stored in a buffer memory, which they called the 2½-D sketch. In this way the viewer builds up a representation of the 3-D scene, which directs further exploratory eye movements and other types of behavior. According to the above account, phasic vergence is initiated by disparities too large to code depth or by disparities between nonmatching images. Depth is coded only from the pattern of finer disparities near the plane of the horopter (Tyler and Kontsevich 1995). Accordingly, it should take longer to recognize a strongly slanted or inclined shape than a frontal shape because a sloping shape would require vergence eye movements. Uttal et al.
(1975b) found that a plane in a random-dot stereogram depicting a cube of dots was identified just as rapidly whatever its inclination to the horopter (Portrait Figure 18.53). However, the range of disparities may have been too small to reveal any effect. A reasonable view is that all disparities within a certain range are used to code relative depth. Phasic vergence initiated by disparities beyond this depth-detection range is required to bring the images of a selected object within the range of the depth-detection mechanism. However, there is no clear evidence to indicate the boundary between disparities used for depth detection and those used to initiate phasic vergence. The fine depth structure derived from each location is stored in buffer memory as we scan a complex scene to build up an appreciation of the depth structure of the whole scene. A second function of vergence scanning is that it can aid the image linkage process (Section 15.4.6). Conjugate saccades are under voluntary control and bring the gaze onto an object of interest to the viewer (Yarbus 1967). Horizontal vergence, also, is under voluntary control for the same reason. Version and vergence
Figure 18.53. William R. Uttal. Born in Mineola, New York, in 1931. He obtained a B.Sc. in physics at the University of Cincinnati in 1951 and a Ph.D. in experimental psychology and biophysics at Ohio State University in 1957. He worked at the IBM Research Laboratory, New York, from 1957 to 1963, the University of Michigan from 1963 to 1985, and the Naval Ocean Systems Center, Hawaii, from 1985 to 1988. He has been a professor in the Department of Engineering at Arizona State University since 1988.
form a unitary search mechanism for exploring the 3-D layout of a scene. We can also mentally scan a visual scene without moving the eyes or we can attend to something in a given location even though the eyes are not allowed to move. For vertical rods at a fixed distance, depth discrimination thresholds were lower when subjects fixated the rod that was adjusted in depth rather than the stationary rod (Lit 1959b). Presumably, the change in vergence supplemented the change in disparity. The role of attention in stereopsis is discussed in Section 18.13. The importance of vergence in stereopsis is mentioned in Sections 20.4.1 and 23.5.
18.10.5 STEREOACUITY AND HEAD MOVEMENTS
When the head rotates in the dark about a given axis, the eyes execute compensatory movements in the opposite direction, interspersed with saccadic return movements. This is the vestibulo-ocular response (VOR). Stimuli evoking VOR originate in the semicircular canals of the vestibular system. The gain of the compensatory phase of the response is the ratio of eye velocity to head angular velocity. When the gain is 1, the images of stationary distant objects remain stationary on the retinas. The gain of VOR is low for low frequencies of head oscillation but is about 1 for frequencies between 2 and 5 Hz.
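The definition of gain given above leads to a simple estimate of how much image motion remains when compensation is incomplete. The sketch below is illustrative only: the function names are hypothetical, the head velocity and gains are example values, and it assumes a distant stationary target, so that perfect compensation corresponds to a gain of 1 in each eye.

```python
def retinal_slip(head_velocity_deg_s, gain):
    """Residual image velocity (deg/s) of a distant stationary object.

    The eye counter-rotates at gain * head velocity, so the uncompensated
    fraction of the head velocity appears as image motion on the retina.
    """
    return head_velocity_deg_s * (1.0 - gain)

def vergence_drift(head_velocity_deg_s, gain_left, gain_right):
    """Rate (deg/s) at which the two lines of sight move relative to each other
    when the compensatory gains of the two eyes differ."""
    return head_velocity_deg_s * abs(gain_left - gain_right)

head_velocity = 20.0                # deg/s, hypothetical head rotation
gain_left, gain_right = 0.87, 0.66  # illustrative, unequal gains

slip_perfect = retinal_slip(head_velocity, 1.0)
slip_left = retinal_slip(head_velocity, gain_left)
slip_right = retinal_slip(head_velocity, gain_right)
drift = vergence_drift(head_velocity, gain_left, gain_right)

print(f"slip with gain 1: {slip_perfect:.2f} deg/s")
print(f"slip in left eye: {slip_left:.2f} deg/s")
print(f"slip in right eye: {slip_right:.2f} deg/s")
print(f"relative drift between the eyes: {drift:.2f} deg/s")
```

With a gain of 1 the image of a distant object is stationary. Unequal gains in the two eyes leave different residual image velocities, and the difference between them accumulates as a vergence error for as long as the head keeps turning in one direction.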
When the head is rotated in illuminated surroundings the VOR is supplemented by optokinetic nystagmus (OKN). This reflex response consists of pursuit movements of the eyes evoked by the motion of the image of the stationary scene over the retina interspersed with saccadic return movements. The gain of OKN is highest at low frequencies of scene motion, so that when VOR and OKN occur together the gain of the combined response remains high over a wider range of frequencies of head rotation than does the gain of either response alone. Visual performance during head rotation is illustrated by the fact that one can read a stationary page of print while the head is oscillated at over 5 Hz. However, when a page of print is oscillated in front of a stationary observer, reading is impaired at 2 Hz. This indicates that the VOR supplements OKN at high frequencies (Benson and Barnes 1978). Performance is intermediate when a person reads a page of print that moves with the head. In this case, VOR is evoked but OKN is not (Barnes et al. 1978). People vary in the extent to which retinal images are stabilized during head rotation, and the two eyes do not always move by the same amount (Steinman et al. 1982). Steinman and Collewijn (1980) used scleral coils to measure eye and head movements as subjects fixated an object while rotating the head from side-to-side at between 0.25 and 5 Hz. In one subject, the gain of eye movement relative to head movement was 0.87 in one eye but only 0.66 in the other eye. Vergence errors were up to 3˚. In spite of imperfect image stabilization and vergence control, all subjects reported that the scene appeared stable and fused. Furthermore, stereoacuity and the ability to register depth in random-dot stereograms were not much disturbed by imperfections of image stability during side-to-side head rotations up to 2 Hz (Patterson and Fox 1984; Steinman et al. 1985). Steinman and Collewijn’s results have not been confirmed. 
Duwaer (1982) found that head oscillations through 20˚ at 0.66 Hz produced vergence shifts of only between 5 and 13 arcmin, as indicated by an afterimage method. Ciuffreda and Hokoda (1985) used a nonius procedure and found that head oscillations through 20˚ at 4 Hz produced vergence errors of only between 5 and 13 arcmin. Instability of vergence introduces an overall, or zero-order, disparity into the visual scene. A mechanism that responds only to first or higher spatial derivatives of disparity would therefore be immune to changes of vergence that do not cause the images to become diplopic. Collewijn et al. (1991) reviewed literature on the effects of head movements on stereoacuity.
18.11 STEREOACUITY AND OTHER ACUITIES
It seems reasonable to suppose that visual processes involved in stereoacuity are similar to those involved in other forms of pattern acuity, such as vernier acuity.
Indeed, stereoacuity and vernier acuity are very similar for targets in the foveal region (Stratton 1900; Walls 1943). It is difficult to compare vernier acuity and stereoacuity because they are not comparable measures. In vernier acuity, one judges the offset between two parallel lines both seen by the two eyes (binocular acuity) or both seen by one eye (monocular acuity). The equivalent dichoptic task is the nonius task in which one eye sees one of the lines and the other eye sees the other. An unavoidable difference is that fluctuations in convergence disturb the alignment of nonius lines but not of binocularly viewed vernier targets. In a typical stereoacuity task the separation between one pair of dichoptic targets is compared with that between a second pair of dichoptic targets. In a vernier acuity task the position of one target is compared with that of another target. The binocular task that is comparable to a stereoacuity task is the task of comparing the separation between one pair of targets with the separation between a second pair of targets, viewed binocularly (McKee et al. 1990a). The literature will now be reviewed with these points in mind. Schor and Badcock (1985) measured both stereoacuity and vernier acuity using elongated DOG (difference of Gaussians) patches placed one above the other. As the stimulus was moved up to 40 arcmin away from the fovea, vernier acuity declined rapidly but stereoacuity remained reasonably constant. However, when the disparity of the stereo targets was increased by an amount that placed the monocular images 40 arcmin on either side of the fovea, stereoacuity was severely reduced. In other words, stereoacuity is high for targets slightly displaced from the fovea as long as they remain on the horopter. That must mean that cortical cells tuned to fine disparities are well represented over a reasonably wide area of the central visual field. 
Detectors for binocular vernier offset must be more tightly clustered in the fovea. However, we have already pointed out that stereoacuity and vernier acuity are not comparable measures. Therefore, differences between them may not reflect a fundamental difference in the processing of dichoptic and binocular stimuli. Schor and Badcock measured stereoacuity only out to an eccentricity of 40 arcmin. We saw in Section 18.6.1 that stereoacuity declines as eccentricity is increased to 10˚, especially for high spatial-frequency stimuli. McKee and Levi (1987) compared monocular vernier acuity with dichoptic vernier acuity (nonius acuity) for different vertical separations between the vertical target lines. In the nonius task, convergence was first stabilized by having the subject look at the aligned vertical nonius lines in the context of a binocular frame. The frame was then removed, and the nonius lines were shown briefly in one of several offset positions. For small vertical separations of the target lines, monocular vernier acuity was higher than nonius acuity, but at separations greater than about 1˚, the two acuities were identical and both decreased as a power function of target separation (see Figure 18.54). When the vernier targets were oscillated from side to side at a
Figure 18.54. Vernier acuity and stereoacuity compared. Thresholds for detection of lateral separation between two monocular vertical lines and two dichoptic nonius lines as a function of the vertical separation of the lines. The inset shows threshold for detection of change in position of a monocular line in the absence of a reference line (N = 1). Axes: threshold (arcmin) against vertical separation (arcmin). (Redrawn from McKee and Levi 1987)
frequency and amplitude that mimicked the effect produced by fluctuations in vergence, dioptic vernier acuity matched nonius acuity even for small separations of the test lines. Thus, when effects of vergence jitter are taken into account, monocular and dichoptic versions of vernier acuity depend on the same limiting process in the nervous system, which is probably the uncertainty of position-detecting units in the visual cortex. Fahle (1991) obtained similar results. McKee et al. (1990a) used comparable dichoptic and monocular stimuli as shown in Figure 18.55. In the stereo task, the least detectable change in depth was determined between one pair of dichoptic vertical lines and a second pair of dichoptic lines. In the monocular task, the least detectable difference in lateral separation was determined between one pair of vertical lines and a second pair of lines. Performance on these two tasks was the same and remained the same as the lateral separation between the two pairs of lines increased to 4.8˚. Thus, the ability to compare the distance between one set of lines with that between another set is the same, whether the sets of lines are dichoptic or seen by the same eye. McKee et al. (1990b) designed a second set of comparable stereoscopic and monocular stimuli (Figure 18.56). In the stereo task, subjects binocularly fixated a line while the disparity in a pair of dichoptic lines varied from trial to trial about each of several disparity pedestals. In the monocular task, subjects fixated a line while the lateral distance between two other lines varied from trial to trial about each of several mean values. Since the nonfixated lines were not resolved as two, subjects detected the change in width of the apparently single line. The Weber fraction
Figure 18.55. Stereoacuity and relative separation. Stimuli used to compare stereoacuity with acuity for relative separation. In the stereoacuity task, subjects detected the depth between one pair of dichoptic lines and a second pair of dichoptic lines. In the relative-width task, subjects detected a difference in the distance between one pair of lines and the distance between a second pair of lines, seen by the same eye. Both tasks were performed for a range of lateral separations between the two pairs of lines. Panel labels: stereoscopic stimuli (dichoptic distances 1 and 2) and relative separation stimuli (monocular distances 1 and 2). (Adapted from McKee et al. 1990a)
for detection of depth change was several times higher than that for detection of width change in the pair of nonfixated lines. The depth threshold fell with longer viewing time but remained above the monocular threshold. Thus, while stereoacuity and lateral separation acuity are similar, disparity thresholds based on disparity pedestals are much higher than increment thresholds for lateral distance. This suggests that stereoscopic vision is most useful for small depth intervals centered on the horopter. The stimuli in these experiments had fixed spatial-frequency content. Evidence reviewed in Section 18.7.2 suggests that depth discrimination about a disparity pedestal is better for low than for high spatial-frequency stimuli. These experiments therefore need repeating at a variety of spatial frequencies. It has been claimed that stereoacuity and vernier acuity are affected in different ways by changes in the spatial disposition of test targets. Stigmar (1971) reported that when test lines were brought closer together, increasing blur had more effect on vernier acuity than on stereoacuity. However, Westheimer (1979b) found that blur had more effect on stereoacuity than on vernier acuity. Westheimer and Pettet (1990) found that a reduction of contrast or of exposure time also had a more adverse effect on stereoacuity than on vernier acuity. Patients with reduced visual acuity due to disease of the optic nerve showed a disproportionate loss of stereoacuity as tested with the Titmus test (Friedman et al. 1985).
Figure 18.56. Stereoacuity and width discrimination. Displays used to compare a stereo-increment task with a monocular width-increment task. In the stereo task, subjects fixated a line and detected a change in depth of a second line placed above it, about each of several depth pedestals. In the monocular task, subjects fixated a line and detected a change in the distance between two other lines about each of several initial separations. Panel labels: horizontal disparity stimuli (D ± ΔD) and monocular disparity stimuli (S ± ΔS), each shown as half images and as the combined view. (Adapted from McKee et al. 1990b)
Stereo and vernier acuities are affected to different extents by changes in target separation. Vernier acuity falls off more steeply than stereoacuity as the distance between target lines increases (Berry 1948; Westheimer and McKee 1979). It has been claimed that for vertical lines separated vertically by less than about 4 arcmin the vernier threshold is two to three times lower than the stereo threshold (Berry 1948; Stigmar 1970). For separations greater than 4 arcmin, Berry found stereo thresholds were lower than vernier thresholds, but Stigmar found them to be similar. Krauskopf and Forte (2002) found that stereo thresholds decreased by a factor of 2 to 3 as target separation increased from 1 to 20 arcmin. But these results may not reflect fundamental differences between stereoacuity and vernier acuity because the tasks were not basically comparable. Heinrich et al. (2005) asked whether the ability to detect a lateral offset between two vertical lines (vernier acuity) decreases when the lines are separated in stereoscopic depth. They found that vernier acuity at a level of about 7 arcsec was maintained until the horizontal disparity between the two lines reached about 1 arcmin. As disparity was increased to 4.5 arcmin, the mean vernier acuity decreased to 34 arcsec. All subjects fused the lines up to a disparity of 2 arcmin, but some subjects retained fusion up to 4.5 arcmin. For some subjects, the perceived alignment of the two lines was determined more by one of the
disparate images than by the other image. They referred to this as ocular prevalence. Ocular prevalence was not stable over repeated measurements and was not related to normal vernier acuity (Kromeier et al. 2006). It was explained in Section 3.1.2 that different types of monocular acuity differ in the level of performance they allow. Thus, grating resolution cuts out at about 60 cpd (about 1 arcmin). The detection of thickening as two lines separate has a lower threshold, and vernier acuity can be a few arcsec. The question now is whether there are analogous types of stereoacuity with similar differences. Stevenson et al. (1989) used random-dot stereograms to create the three stereo tasks illustrated in Figure 18.57. In the first task, subjects detected a step between two adjacent depth planes. This is analogous to vernier acuity, although not strictly equivalent, as the preceding discussion shows. The threshold for this task was about 3 arcsec. The second task was a super-resolution task in which subjects discriminated between a flat stereo surface and one in which disparities were just sufficient to cause a visible thickening of the surface but not two surfaces. Tyler (1983) called this effect pyknostereopsis. This is analogous to the monocular detection of a small separation of two parallel lines by the apparent thickening of the perceived single line (Section 3.1.2). The threshold for pyknostereopsis was between 15 and 30 arcsec. The third task was that of detecting whether two overlapping depth surfaces have an empty gap between them. Tyler called the detection of such a gap diastereopsis. This is analogous to monocular grating acuity. The lower bound for diastereopsis is the upper bound for pyknostereopsis. The effects of relative motion on a diastereopsis task are discussed in Section 17.1.5.
Figure 18.57. Three types of stereoacuity. Hypothetical distributions of neural activity corresponding to zero disparity and to a just-discriminated disparity difference. The threshold stimuli for each task are shown on the right. Panels: stereo hyperacuity for a depth step; stereo width acuity for a thickening in depth; depth resolution for superimposed planes. (Adapted from Stevenson et al. 1989)
Overall, the order of performance on the three stereo tasks resembled the order of performance on the analogous monocular tasks. Acuities within the cyclopean domain were discussed in Section 16.2.
18.12 TEMPORAL FACTORS IN STEREOACUITY
18.12.1 STIMULUS DURATION AND PROCESSING TIME
This section deals with two related questions concerning the processing time for stereopsis. First, how long must a stimulus be exposed for depth to be detected? Second, how long does it take to process depth information after a brief stimulus has been switched off?
18.12.1a Effects of Stimulus Duration
In Section 18.12.1a, it was mentioned that depth may be perceived in stereograms illuminated for a few milliseconds (Dove 1841). Subjects can recognize a form in depth in a random-dot stereogram exposed for less than 1 ms if the eyes are properly converged before the stimulus is presented (Uttal et al. 1994). This section is concerned with how long a stimulus must be shown for the disparity threshold for stereopsis to reach its asymptotic value.
According to Bloch’s law, detection of a stimulus up to a critical duration depends on the product of intensity and duration. This means that stimulus energy is completely summed within this critical time. For durations longer than the critical duration, further increases in duration have less effect on the detection threshold. Vernier acuity improves with increasing duration and with increasing contrast. However, provided total stimulus energy (product of contrast and duration) was constant, 3-dot alignment acuity was equally good for stimulus durations between 2 and 200 ms (Hadani et al. 1984), and vernier acuity was equally good for durations between 12 and 2000 ms (Waugh and Levi 1993).
Langlands (1926, 1929) was the first to measure stereoacuity as a function of stimulus duration. Threshold disparity in a Howard-Dolman test was constant for exposures up to 0.1 s and decreased for longer exposures up to 3 s, beyond which it again stayed constant. Hertel and Monjé (1947) found that the stereo threshold in a modified Howard-Dolman test increased as exposure time was reduced from 500 to 40 ms. In these studies, there was inadequate control of eye fixation, the state of light adaptation, and the luminance and contrast of the test objects.
Ogle and Weil (1958) controlled fixation, and controlled the state of light adaptation by keeping the luminance of the background constant while systematically varying the contrast between test object and background. With the method of constant stimuli, subjects reported the relative depth between two vertical lines placed 30 arcmin on either side of a central vertical fixation line. The mean stereo threshold, h, rose from about 10 to 50 arcsec as the duration of exposure, t, of the test lines was reduced from 1 s to 7.5 ms, in approximate conformity with the expression
$$h = k\,t^{a} \qquad (5)$$
where k is the threshold at one second and a is an exponent, which was about −0.3. A similar relationship was found with random-dot stereograms in humans (Harwerth and Rawlings 1977) and monkeys (Harwerth and Boltz 1979a). Shortess and Krauskopf (1961) found a similar dependence of stereoacuity on stimulus duration when the images were stabilized on the retinas. It seems that disparity information is integrated over time in a manner analogous to the integration of luminance over time, as described by Bloch’s law, although the exponent for luminance is −1 rather than −0.3.
With brief exposure, vernier acuity was similar to stereoacuity for detection of depth between two lines (Foley and Tyler 1976). Since a given angle of disparity is twice the angle of stimulus displacement in each eye, the offset detected in each eye at the stereo threshold is about half the vernier threshold. Both thresholds decreased in a similar way as exposure duration increased from 25 to 200 ms.
Tyler (1991b) measured the stimulus duration required for detection of depth in a random-dot stereogram as a function of the magnitude and sign of disparity. For both crossed and uncrossed disparity, the threshold was inversely proportional to stimulus duration. It decreased from about 50 arcmin at a duration of 7 ms to about 1 arcmin at 160 ms. Thus, fine disparities took longer to detect than coarse disparities, and disparity was integrated over about 180 ms. Tyler cited evidence that temporal integration in the luminance domain occurs over only about 40 to 50 ms.
Harwerth et al. (2003) measured the binocular disparity required for detection of whether test stimuli were nearer than or beyond a background. For humans and monkeys, the disparity threshold decreased as exposure duration increased to about 100 ms, after which it remained constant. Disparity thresholds were lower for stimuli of high spatial frequency and high contrast than for those with low spatial frequency and contrast.
However, the time required to reach a constant threshold was the same for all stimuli, as shown in Figure 18.58. The data were fitted with a quadratic summation model in which the threshold at stimulus duration t is:
$$\text{Threshold} = h_0\,(\,\ldots + \ldots\,)^{0.5} \qquad (6)$$
Figure 18.58. Stereoacuity and stimulus duration. The stereothreshold as a function of stimulus duration for two Gabor patches differing in spatial frequency (0.5 cpd and 4 cpd) at four levels of contrast (6%, 12%, 25%, and 50%). Results for one monkey. The lines represent the best-fitting curves derived from a quadratic summation model. Axes: stereothreshold (arcsec) against stimulus duration (ms). (Adapted from Harwerth et al. 2003)
where $t_0$ is a constant that determines the position of the function on the x-axis, and $h_0$ is a constant that determines its position on the y-axis (threshold when $t = t_0$). Harwerth et al. replotted data from Ogle and Weil (1958), Shortess and Krauskopf (1961), and Harwerth and Boltz (1979a) and found that the replotted data also fitted the quadratic summation model.
Patterson et al. (1995) found that depth in a random-dot stereogram was detected more rapidly for a central square with crossed disparity than for a square with uncrossed disparity (see Section 18.6.4). Physiological data on processing time for stereopsis were reviewed in Section 11.4.8. The effects of practice on the latency of stereopsis are discussed in Section 18.14.2, and the effects of stimulus duration on stereo anomalies are discussed in Section 32.2.1.
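The power-law relation in expression (5) can be checked numerically. The sketch below is illustrative only: it assumes k = 10 arcsec (the approximate threshold at 1 s reported by Ogle and Weil) and a = −0.3, and the function names `power_law_threshold` and `loglog_slope` are hypothetical, not taken from any of the cited studies. The second function estimates the exponent implied by two (duration, threshold) points, here the endpoint values quoted above for Tyler (1991b).

```python
import math


def power_law_threshold(t_s: float, k_arcsec: float = 10.0, a: float = -0.3) -> float:
    """Stereo threshold h = k * t**a (expression 5).

    t_s      -- exposure duration in seconds
    k_arcsec -- threshold at t = 1 s (assumed value: about 10 arcsec)
    a        -- exponent (assumed value: about -0.3)
    """
    if t_s <= 0:
        raise ValueError("duration must be positive")
    return k_arcsec * t_s ** a


def loglog_slope(t1: float, h1: float, t2: float, h2: float) -> float:
    """Exponent of the power law passing through two (duration, threshold) points."""
    return math.log(h2 / h1) / math.log(t2 / t1)


if __name__ == "__main__":
    # Thresholds predicted by expression (5) at a few durations.
    for t_ms in (7.5, 100.0, 1000.0):
        h = power_law_threshold(t_ms / 1000.0)
        print(f"{t_ms:7.1f} ms -> {h:5.1f} arcsec")

    # Exponent implied by the two endpoints quoted for Tyler (1991b):
    # about 50 arcmin at 7 ms and about 1 arcmin at 160 ms.
    print(f"implied exponent: {loglog_slope(7.0, 50.0, 160.0, 1.0):.2f}")
```

Under these assumed parameters, the predicted threshold at 7.5 ms is about 43 arcsec, close to the reported value of about 50 arcsec. The two endpoints quoted for Tyler's data imply an exponent of about −1.25, which is much nearer to the reciprocity of Bloch's law (−1) than to the −0.3 found with line targets.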
18.12.1b Processing Time for Stereopsis
Depth can be seen in simple stereograms exposed for only 1 ms. Stimulus duration is therefore not a crucial factor as long as the luminance is sufficient to ensure that the stimulus is seen. However, stereoacuity decreases when there is uncertainty about which of several sequentially presented displays contains the test disparity. Thus, stereoacuity for a single 100-ms display of dots containing disparity was higher than when the test display was presented along with three 100-ms zero-disparity displays in a random sequence (Lindblom and Westheimer 1992).
Whatever the stimulus duration, it takes time to process depth in a stereogram. Julesz (1964) measured stereoscopic processing time in the following way. A briefly presented unambiguous stereogram was followed by an ambiguous stereogram in which the central square could be seen as further or nearer than the surround. When the interstimulus interval was greater than about 50 ms, the unambiguous stereogram biased the interpretation of the ambiguous stereogram. It was concluded that 50 ms were required to process depth in the first stereogram. Uttal et al. (1975a) measured the masking effect of random dots that were uncorrelated in the two eyes on the detection of depth in a random-dot stereogram. Performance was degraded when the mask followed the test stimulus by less than about 50 ms. They also confirmed that the crucial time is not the duration of the stimulus, but the time provided for unimpeded processing of the disparity, while the stimulus remains on or after it has been turned off. These conclusions refer to optimal conditions of viewing and to people who readily see depth in stereograms. Some people require many seconds or minutes to see depth in random-dot stereograms (Section 18.14.2).
Disparity discontinuities are more readily detected than simple disparity gradients. Thus, depth in a random-dot stereogram representing a single surface slanted or inclined in depth takes longer to see than depth in a stereogram representing two adjacent surfaces slanted or inclined in opposite directions (Section 21.4.2d). Also, depth of a single surface inclined about a horizontal axis is seen more quickly than that of a surface slanted about a vertical axis (Rogers and Graham 1983; Gillam et al. 1988b; van Ee and Erkelens 1996a). Other instances of this anisotropy are described in Section 20.4.
Monkeys responded more rapidly to stimuli in which depth was specified by disparity than to stimuli in which depth was specified by motion parallax (Schiller et al. 2007). The detection of motion parallax involves integration of signals over time. One reason for the evolution of stereopsis is that it processes depth rapidly.
18.12.1c Spatial Scale and Sequential Processing
Watt (1987) proposed that, in a mixed stimulus, low spatial frequencies are processed before high spatial frequencies in vernier acuity, orientation acuity, and stereoacuity tasks. Watt found that each type of acuity improved as stimulus exposure time increased, up to about one second. Dot resolution did not improve with longer stimulus durations. This may be because resolution involves detection of an inhomogeneity, whereas acuities involve comparing one stimulus with another.
In the stereoacuity test, subjects detected the disparity-defined depth of a vertical test line relative to a comparison line beneath it. It can be seen in Figure 18.59 that, with short durations, only large disparities evoked a sensation of depth but that, as stimulus duration increased, depth was evoked by finer disparities in the display. This result was interpreted as being due to an initial processing of only coarse disparity by low spatial-frequency visual filters followed by processing at a finer scale within a high spatial-frequency system. The theory suggests that depth in a stereogram containing only low spatial frequencies is processed more rapidly than the same depth in a stereogram containing only high spatial frequencies. However, the time-dependent probability of detecting a disparity at a given spatial scale has not been determined. Although Watt provided stimuli that were designed to stabilize vergence, vergence was not measured and nonius targets were not provided. This experiment should be repeated with better control over vergence.
Figure 18.59. Stereoacuity and exposure duration. Disparity threshold for detecting depth in a line relative to a zero-disparity line, as a function of exposure duration. As duration increased, depth was evoked by finer disparities. The dashed line is the slope expected if threshold varies linearly with duration (N = 1). Axes: stereo-disparity threshold (arcmin) against exposure duration (ms). (Redrawn from Watt 1987)
Hess and Wilcox (2006) measured stereoacuity by asking subjects to report the depth order of a central Gabor patch with respect to two identical flanking patches. Stereoacuity improved as stimulus duration was increased from 85 to 1336 ms. A masking stimulus was presented after each stimulus to prevent subjects from processing stimuli after they were extinguished. The form of this relationship was basically the same for different center spatial frequencies of the Gabors. Hess and Wilcox concluded that, “the dynamics of stereo processing do not vary systematically with spatial frequency/spatial scale.” But this does not prove that low spatial frequencies were not processed before high spatial frequencies within the period of stimulus exposure. Perhaps all spatial frequencies had been processed even in the shortest 85-ms exposure duration. We will see in Section 18.12.3 that Hess and Wilcox found that the dynamics of stereo processing are different for first- and second-order disparities.
Physiological evidence for the theory that coarse disparities are processed more rapidly than fine disparities was presented in Section 11.4.8. Menz and Freeman (2004b) found that, for binocular cells in the cat’s visual cortex, the disparity range decreased and disparity spatial frequency tuning increased over the first 40 ms. They speculated that this represents a coarse-to-fine temporal analysis of disparity. However, it is not clear how a process occurring in such a short time could serve any useful function. The changes may merely represent the time needed to recruit the neural connections that determine the steady-state responses of disparity detectors.
18.12.2 EFFECTS OF STIMULUS DELAYS
18.12.2a Effects of Interocular Delay
Several early investigators reported that stereoscopic vision is possible when there is a delay between the images presented to the two eyes (see Stevenson and Stanford 1908). However, lack of precise timing devices did not allow the effect to be measured. Ogle (1963) asked subjects to fixate a binocular target and judge the relative depth of a second target after it was exposed briefly to each eye in succession. For disparities between 30 and 150 arcsec, depth judgments were not affected by delays of up to about 25 ms between the offset of the left-eye image and the onset of the right-eye image. With longer interocular delays, performance declined to chance levels at a delay of about 100 ms. The limiting delay was increased to about 250 ms when the successive dichoptic targets were exposed several times, a result previously reported by Efron (1957).
Wist and Gogel (1966) asked subjects to adjust the depth of a disk to that of one of a pair of continuously visible comparison targets separated in depth. The disk was seen alternately by each eye for 4 ms with various interocular delays and various intervals between pairs of flashes. With long intervals between pairs of interocular flashes, depth settings remained accurate for interocular delays up to 32 ms, beyond which they were more variable and indicated that less depth was perceived in the disk. With shorter intervals between pairs of flashes, performance was maintained with interocular delays up to 65 ms. Thus, it is agreed that longer interocular delays are tolerated when several stimuli are presented in rapid succession. This indicates that information about depth is integrated
over short time intervals. Part of the reason for this may be that afterimages of the alternately exposed images build up and remain more visible when the stimuli are presented in quick succession. When the duration of each monocular image was increased from 20 to 500 ms, the interstimulus interval for which depth was still evident decreased from over 150 ms to zero (Herzau 1976). Presumably, when each monocular image is presented for an appreciable period it becomes well registered as a distinct object and does not fuse with an image presented sequentially to the other eye, even with a zero time interval. In the above studies the shapes of the targets were visible monocularly. Ross and Hogben (1974) used random-dot stereograms in which the shape was visible only after binocular fusion. The stimuli were presented once to each eye for 10 ms with various interocular delays. Detection of the depth of the cyclopean shape was not affected by interocular delays of between 36 and 72 ms and remained above chance level for delays of up to about 150 ms. Ludwig et al. (2007) used a three-rod stimulus to measure stereoacuity as a function of the frequency of alternation of the monocular images and of interocular delay. They used electronically controlled shutters that allowed independent control of stimulus duration and interocular delay. Stereoacuity improved from about 90 arcsec at an interocular frequency of 1 Hz to 18.5 arcsec at 16 Hz. Increasing interocular delay from 0 to 50 ms degraded stereoacuity. Like Herzau (1976), they found that, as interocular frequency increased (stimulus duration decreased), longer interocular delays were tolerated without loss of stereoacuity. The most obvious explanation for stereopsis with an interocular delay is that signals from one eye persist long enough to interact with those from the other. Engel (1970b) presented some evidence that monocular visual persistence times are similar to tolerated interocular delays.
18.12.2b Effects of Interstimulus Delay
A related issue concerns the effects of introducing a delay between the presentation of a dichoptic stereo test target and a dichoptic comparison target. As we have already seen, stereoacuity with simultaneously presented targets can be as fine as 3 arcsec. Stereoacuity with targets presented sequentially but with no interval of time between them was up to 10 times coarser than with simultaneous presentation (Westheimer 1979a). This degradation of stereoacuity could be due to mutual masking of adjacent sequential stimuli (Section 13.2). The loss of stereoacuity with successive presentation was much less for targets separated by 10˚ than for spatially adjacent targets (Enright 1991b). Monocular masking also decreases with increasing spatial separation of the stimuli. Enright explained the difference between adjacent and separated targets in terms of the greater instability of fixation with adjacent targets.
Stereoacuity declines as the dark interval between two successively presented targets increases. Foley (1976b) presented a line in the plane of fixation for 2 s and then, after a dark interval of between 0 and 32 s, a second line for 100 ms in a different depth plane. For interstimulus intervals up to about 0.1 s, the minimum detectable depth between the two lines increased slowly from an initial value of about 1 arcmin. With longer intervals, the threshold increased more rapidly to a final value of about 30 arcmin. In two of the three subjects, the threshold for vernier offset of lines presented to the same eye was lower than that for detection of relative depth. However, the two thresholds increased in a similar way with an increasing time interval between the presentations of the two lines. Foley explained this effect in terms of noise in the vergence system and a loss of memory for the position of the first stimulus.
It seems that the effect of an interstimulus delay is reduced with repeated presentation of the stimuli. Kumar and Glaser (1994) used a vertical test line set at various disparities with respect to two comparison lines 13 arcmin on either side of the test line. The test line and the comparison lines were presented in alternation for periods of up to 50 ms. The stereo threshold exceeded 120 arcsec when the stimuli were presented in succession only once in the dark. The threshold was about 50 arcsec when the room lights were on, which is similar to the value obtained by Westheimer under the same conditions. However, when the alternating stimuli were presented repeatedly, the stereo threshold declined rapidly as the number of repetitions increased. After five repetitions, the threshold was similar to that with simultaneous viewing. Subjects must have integrated information over this number of repetitions.
In conclusion, it seems that the ability to detect the relative depth of two sequentially presented adjacent objects depends on the interplay of three factors: interstimulus masking, vergence instability, and memory. With well-separated targets, the only factors are vergence instability and memory.
18.12.3 SUSTAINED AND TRANSIENT STEREOPSIS
In Section 10.5.10 a distinction was drawn between transient vergence evoked by stimuli with large disparity that do not necessarily match in shape or luminance polarity, and sustained vergence evoked by matching stimuli with small disparity. Ogle made a similar distinction between transient impressions of qualitative depth produced by stimuli outside Panum’s area, and sustained quantitative depth impressions produced by disparate but fused stimuli. Pope et al. (1999) presented a pair of dichoptic Gaussian patches with 0.5˚ of crossed disparity above a fixation point and a second pair with 0.5˚ of uncrossed disparity below the fixation point. Each pair of patches had the same or
opposite contrast polarity and had various durations and contrasts. With matching stimuli, subjects correctly indicated which patch was nearer for all durations and contrasts. With reverse-contrast stimuli, performance declined as duration increased above 0.2 s or as contrast was reduced. Thus, images with matching contrasts evoked sustained stereopsis, while those with reversed contrasts evoked transient stereopsis. Reverse-contrast stereopsis was maintained for longer when stimulus contrast was increased or when temporal transients were introduced into the stimuli. Thus, the duration of transient stereopsis is determined by the temporal energy in the stimulus, rather than by just stimulus duration. Pope et al. argued that random-dot stereograms do not evoke transient stereopsis because the transient system responds only to low spatial frequencies. Edwards et al. (1999) investigated transient stereopsis by asking subjects to detect the depth order of two 1˚ circular Gabor patches presented for 140 ms. One patch had 2 to 4˚ of crossed disparity and the other 2 to 4˚ of uncrossed disparity relative to a previously exposed fixation point. Since the disparities were balanced about zero, the task was immune to any effects of induced vergence. Performance was worse when the carrier sinusoids within the patches were orthogonal rather than parallel. From the effects of varying the relative spatial frequencies and contrasts of the orthogonal sinusoidal gratings Edwards et al. concluded that transient stereopsis involves a first-order mechanism tuned to the orientation of the carrier but with two broadband spatial-frequency channels, and a second-order process based on detection of the low spatial frequency of the Gabor envelope. Schor et al. 
(2001) enquired whether transient stereopsis is more affected by an interocular difference in the sizes of Gabor envelopes than by differences in the orientation or spatial frequency of contrast modulation of the grating within the patches. They used narrow bandwidth Gabor patches of unequal size in the two eyes. Subjects made a forced-choice decision about the distance of the Gabor relative to a fixation cross. With a disparity of 0.5˚, both sustained and transient stereopsis tolerated a twofold size difference, as long as the carrier gratings were parallel in the two eyes. However, with a disparity of 5˚, transient stereopsis, but not sustained stereopsis, tolerated a threefold size difference, even when the gratings were orthogonal. Nevertheless, transient stereopsis was more dependent on the similarity of the sizes of the patches than on their similarity of contrast polarity, or the similarity of the spatial frequency or orientation of contrast modulations within the patches. Schor et al. concluded that image linking for transient stereopsis depends mainly on image size and temporal synchrony. The capacity of the transient stereoscopic system to detect depth between stimuli with opposite luminance polarity suggests that it depends on nonlinear, or second-order, disparity processing.
Edwards et al. (2000) asked whether the transient stereo system is sensitive to both first- and second-order stimuli. The first-order stimulus was a vertical, luminance-defined 0.25-cpd grating in a random-dot display presented to each eye with a 1˚ (90˚ phase) disparity. The second-order stimulus was a contrast-defined grating presented in the same way. Each stimulus was presented for 200 ms with either crossed or uncrossed disparity. Subjects reported whether it appeared nearer or beyond a fixation point. Performance was perfect for the first-order stimulus and nearly perfect for the second-order stimulus. Thus both stimuli were processed in under 200 ms. When a first-order display was presented to one eye and a second-order stimulus to the other, subjects performed above chance over part of the range of contrasts. They performed at chance when one display had twice the spatial frequency of the other. It was concluded that first- and second-order stimuli combine before the stage where disparities are detected. First- and second-order stimuli presented in sequence to the same eye do not combine to produce an impression of motion (Ledgeway and Smith 1994). Thus, for motion, first- and second-order stimuli are processed in distinct channels.
18.13 ATTENTION AND STEREOACUITY
Uncertainty about which of several sequential stimuli is the test stimulus adversely affects stereoacuity. Thus, when subjects were not told which of five similar sequential stimuli was the test stimulus, the threshold for detection of a disparity-defined depth between a line and two flanking lines was elevated by up to 26%, compared with when subjects knew which was the test stimulus (Westheimer and Ley 1996). Uncertainty about the time of occurrence of a single test stimulus after an onset signal had very little effect on the stereo threshold. There are two ways to control the locus of attention.
The first is to instruct the subject to attend to a particular location or to look for a particular stimulus feature. This is known as endogenous control of attention. The second method is to show the subject a stimulus that contains the visual feature to which attention should be directed in the subsequently exposed test stimulus. This is known as exogenous control of attention.
Julesz and Chang (1976) showed that exposure of a random-dot stereogram depicting a surface at a given depth renders subjects more likely to see that depth plane when subsequently shown an ambiguous stereogram. They interpreted this effect as due to disparity averaging between the disparities in the two displays rather than as an effect of attention. Tyler and Kontsevich (1995) designed an experiment to dissociate the effects of disparity averaging and attention. A random-dot stereogram depicting a surface corrugated in depth was stereoscopically nearer than or
beyond a superimposed flat surface. Subjects detected the spatial phase of the depth corrugation, which had two values 180˚ apart. The exogenous attention cue was a flat surface presented before the test surfaces for 150 ms in the plane of either the undulating or flat test surface. Performance improved when attention was brought to within 10 arcmin of the disparity of the plane of the corrugated surface. This occurred only when the test corrugation was outside the plane of the horopter, which suggests that attention defaults to the horopter when there is no attentional cue. Disparity averaging between the priming plane and the test corrugation would have reduced rather than enhanced the apparent depth of the corrugation and would thus have degraded performance. In further experiments Tyler and Kontsevich showed that attentional priming also operates for perception of inclined surfaces and for a task in which subjects attended simultaneously to stimuli at two locations and with distinct disparities. The effect of depth on the ability to attend to objects in the visual field is discussed in Section 22.8.
18.14 LEARNING AND STEREOPSIS
18.14.1 PRACTICE AND STEREOACUITY
It was mentioned in Section 4.9 that vernier acuity improves with practice, even in the absence of feedback. There have been several reports that stereoacuity, also, improves with practice. Practice was found to improve stereoacuity in the Howard-Dolman test but only for one of two subjects (Lit and Vicars 1966). Wittenberg et al. (1969) asked subjects to set two objects seen in a stereoscope to appear at the same distance. After 20 training sessions spread over 2 weeks, subjects showed a significant improvement in precision compared with subjects in a control group who were tested and retested without intervening training. Foley and Richards (1974) trained a stereoanomalous subject over 12 one-hour sessions to discriminate between crossed disparities, uncrossed disparities, and zero disparity in stimuli presented for as long as the subject wished. After training, the subject’s stereoanomaly, revealed with a flashed target, was considerably reduced. Fendick and Westheimer (1983) had two subjects perform a criterion-free stereoacuity test in sessions of 900 trials, repeated over a period of 9 weeks. With stereo targets removed 5˚ from the fovea, performance improved over about the first 2,000 trials for both subjects. With fixated targets, only one subject showed improvement. Whatever the reason for improvement of stereoacuity with practice, the effect is clearly subject to large individual differences. It has been claimed that improvement in stereoacuity with practice is specific to the spatial arrangement of the stimuli (Fahle 1993b). However, Snowden et al. (1996)
found that learning was not specific to the direction of disparity. Specificity of learning suggests that it occurs at an early stage of visual processing, but Snowden et al. suggested that it could also be due to selective spatial attention. Gantz et al. (2007) measured depth-discrimination thresholds with a random-dot stereogram, using repeated blocks of 1,000 trials. Over about 6,000 trials all subjects showed a significant reduction in the stereo threshold. The improvement occurred uniformly across random-dot stereograms containing various levels of decorrelation. They concluded that a reduction in neuronal noise could account for the improvement. The rate and magnitude of improvement in stereoacuity was found to be the same for a dense random-dot stereogram as for a stereogram consisting of well-spaced dots. Also, improvement transferred from one type of stereogram to the other type (Gantz and Bedell 2010). The decline of the stereo threshold with practice is particularly evident with very brief stimuli. Even for practiced observers, some practice trials were required for stimulus durations of less than 100 ms before performance stabilized (Kumar and Glaser 1993a). For stimuli exposed for only 5 ms, the disparity threshold declined about 10-fold during several hundred trials. There was little effect of practice for exposures of 1000 ms (Kumar and Glaser 1994). In these experiments, luminance was kept constant so that the effects of short stimulus duration could have been due to a reduction of total retinal flux. One can also ask whether practice improves the accuracy of depth judgments for suprathreshold stimuli. Van Ee (2001) approached this question by measuring the accuracy of judgments of slant about a vertical axis and inclination about a horizontal axis of a textured surface as a function of repeated trials over a 3-week period. Disparity provided the only cue to depth—other cues indicated a frontal surface. 
When the test surface was adjacent to a frontal surface, accuracy was very high and practice had no effect. It is well known that slant and inclination of isolated surfaces are greatly underestimated (Section 21.1). Accuracy for an isolated surface 65˚ in diameter improved markedly over the 3-week practice period. Subjects may have learned to attend to the only stimulus feature that varied, namely disparity, or, what amounts to the same thing, they may have learned to give more weight to disparity than to the other depth cues.
Stereoacuity is adversely affected for a short time after a period in which stereoscopic vision is prevented. Thus, subjects showed a 40% increase in variability in a depth-matching task after wearing a patch alternately on the two eyes for 24 hours (Wallach and Karsh 1963). Subjects exposed to 8 hours of monocular patching also showed a large increase in variability in a stereo depth-matching task (Herman et al. 1974). The effect cannot be due to simple disuse, since 8 hours of binocular occlusion had no effect. The time course for the buildup and decline of the effect of
monocular occlusion was not studied, and the cause of the effect remains obscure. Monocular occlusion leads to permanent loss of stereoscopic vision in young animals but not in mature animals (see Chapter 8).
18.14.2 PRACTICE AND STEREO LATENCY
18.14.2a Introduction In his original paper, Julesz (1960) reported that it took longer to see depth in a random-dot stereogram than in a normal stereogram or in a random-dot stereogram in which the cyclopean form was outlined in the monocular images. Several authors have agreed that it can take several seconds or even minutes to see depth in a random-dot stereogram portraying a “complex” surface such as the spiral (Figure 18.60) or hyperbolic paraboloid (saddle-shaped) surfaces presented in Julesz (1971, p. 156) (Frisby and Clatworthy 1975; MacCracken and Hayes 1976; Ramachandran 1976; MacCracken et al. 1977). It is not clear whether the long latencies are due to the large range of disparities present in these stereograms or to the complexity of the surfaces. Also, we shall see later that the stereograms used by Julesz and others contained a good deal of random noise. Julesz (1960) also reported that the time to see cyclopean depth shortens after repetitive trials. The reduction of latency persisted during the same day but only partially from one day to the next (MacCracken and Hayes 1976). Part of the learning is more or less specific to the particular stereogram but there seems also to be a general improvement in viewing random-dot stereograms, at least for people who are slow initially (Weiman and Cooke 1982). Several factors have been proposed to account for the effects of practice in viewing random-dot stereograms, as we shall now see.
18.14.2b Familiarity with the pattern Learning to see depth in a random-dot stereogram could be due to subjects becoming familiar with the shape of the cyclopean object. However, telling subjects about what they can expect to see or showing them a model of the cyclopean object had no effect on how long it took them to see depth in a random-dot stereogram (Frisby and Clatworthy 1975). In ordinary viewing, increasing the uncertainty about a letter by increasing the size of the set from which it is selected increases the time required for recognition. Staller et al. (1980) found that the effect of increasing the set size was the same for monocularly defined letters as for cyclopean letters and concluded that the latency for random-dot stereograms is not specifically related to uncertainty about the cyclopean form.
Figure 18.60. Stereogram of a complex spiral.
18.14.2c Eye movements and practice Julesz (1971, p. 217) suggested that the perception of depth in random-dot stereograms is delayed because there are no clear visual features to guide vergence eye movements, and because it takes time to learn the sequence of vergence movements required to fuse the image. He wrote, “In order to obtain fusion one has to shift areas in registration . . . If one proceeds in the wrong sequence it will take a very long time . . . . However, if one learns the proper sequence of vergence movements . . . the step by step interlocking of areas follows rapidly.” Some support for this hypothesis is provided by the fact that stereo latency can be much shorter when depth features in a random-dot stereogram are made monocularly conspicuous by drawing a line round them or by adding shapes in the same depth plane (Saye and Frisby 1975; Saye 1976; Kidd et al. 1979). Stereo latency was shortened by the introduction of a difference in dot density between the stereo regions, even when the difference was below the monocular threshold (Julesz and Oswald 1978). The eye-movement theory also gains support from the finding that reduction in stereo latency does not transfer fully across a change in the sign of the disparity of the stereo pattern (O’Toole and Kersten 1992). To test the eye-movement hypothesis, Bradshaw et al. (1995) varied the peak-to-base disparity of a complex spiral surface. Response times for 80- and 40-arcmin spirals were
not significantly different. Contrary to the eye-movement hypothesis, latencies were slightly longer for 20-arcmin spirals in which the details of the shape were more difficult to see. Goryo and Kikuchi (1971) and Saye and Frisby (1975) reported that latencies were longer for stereograms with larger disparities, but their stereograms depicted a square rather than a spiral. It seems likely that the more objective and well-learned criterion used in the Bradshaw et al. experiment was partly responsible for shorter response times. Christophers et al. (1993) measured vergence movements to view the tip of the spiral surface and found them to be slower and less complete with a 30% decorrelated stereogram (Figure 18.61). Christophers and Rogers (1994) recorded the eye movements of one experienced and one naïve observer and found that vergence changes were slower for the naïve observer (Figure 18.62). When viewing stereograms with large disparities, the shape of the surface was typically not reported by either observer until several seconds after the eyes had made the appropriate vergence movements to the most disparate part of the surface. This effect was particularly evident with more distant surfaces. Figure 18.63 shows that, at a viewing distance of 400 cm, the subject had converged close to the tip of the 90-arcmin “wedding cake” shape many seconds before the shape was reported. The amplitude and latency of vergence changes were similar for the two viewing distances. The time taken to identify the shape of the surface increased from 2.4 to 8.1 s when viewing distance increased from 57 to 400 cm, even though the visual angle and disparities of the stereogram remained the same. This suggests that stereo latency varies with the perceived magnitude of the relative depth in the stereogram. In extreme cases, the shape was never reported even though vergence had brought the disparate dots into the fusional range.
Figure 18.62. Individual differences in vergence changes. Vergence changes in response to viewing a random-dot stereogram depicting a 90-arcmin “wedding cake” of concentric disks of decreasing size and increasing disparity, viewed at 170 cm. Vergence was more rapid and the latency for identifying the shape shorter for the experienced subject (left).
Chung and Berbaum (1984) found that the latency for detecting a square in depth in a random-dot stereogram increased as a function of the distance in depth between the fixation point adopted before exposure of the stereogram and the visual target, but that fixation distance had no effect for a line stereogram. Christophers and Rogers (1994) concluded that, while subjects require vergence changes to see the shape of surfaces spanning a large disparity range, vergence changes do not guarantee that the shape is seen. A clearer idea of the role of eye movements in the perception of complex random-dot stereograms would be gained by making the vergence system open loop so that the subject’s vergence movements have no effect on the disparities in the surface. Van Ee and Erkelens (1999) found that the time taken to see overall slant or inclination in large random-dot surfaces containing various types of horizontal or vertical disparity, with or without a zero-disparity reference surface, was not significantly influenced by whether subjects fixated the center or periphery of the surface or moved the
Figure 18.61. Vergence and decorrelation. Vergence changes in response to a random-dot stereogram depicting a 60-arcmin spiral (1-arcmin dots, dot density 10%, viewing distance 57 cm). Vergence was more rapid and complete for the 100% correlated version (A) than for the 30% decorrelated version (B). Latencies for identifying the direction of the spiral, a, the complete 3-D shape, b, and the discrete steps on the spiral surface, c, were shorter with the fully correlated stereogram.
Figure 18.63. Vergence and viewing distance. Vergence changes in response to a random-dot stereogram depicting a 90-arcmin “wedding cake” of concentric disks of decreasing size and increasing disparity. Vergence changes were similar when the stereogram was displayed at 170 cm (A) and 400 cm (B) but response times for identifying the shape were much shorter at the closer distance. The two stereograms had the same angular size and disparity.
[Figure panels plot vergence (deg) against time (sec).]
gaze over the surface. Thus, the time taken to scan a large stereogram does not seem to be a factor in stereo latency, at least for overall slant or inclination.
18.14.2d Pattern-Specific Learning The eye-movement theory does not explain why the reduction in stereo latency with repeated exposure to a stereogram made up of randomly positioned oblique lines transferred to other stereograms made up of similarly oriented line elements, but not to those with line elements oriented along the opposite oblique, even though they depicted the same shape (Ramachandran and Braddick 1973). This orientation-specific effect suggests a purely sensory process. For example, if stereopsis processing proceeds independently in different orientation-tuned channels, learning could be confined to one stereo-orientation channel. Ramachandran (1976) reported that the reduction in stereo latency after practice was fully preserved when the pattern of dots was changed in a random-dot stereogram, without changing dot density or the macropattern. A more sensitive forced-choice discrimination procedure revealed that latency reduction did not transfer fully when 50% of the dots were changed but did transfer fully when the luminance polarity of all the dots was changed (O’Toole and Kersten 1992). In this case one would have to assume that subjects learn the local clusters of dots. However, changing the micropattern of a random-dot stereogram had less effect than changing the orientation of elements in a random-line stereogram. The reduction in latency did not transfer when the stereogram was moved from one retinal location to another (O’Toole and Kersten 1992) or when the subject fixated on a different point in Julesz’s hyperbolic paraboloid (Ramachandran 1976). Blurring one image of a random-dot stereogram, keeping disparity constant, also disrupted transfer of latency reduction (Long 1982). This evidence suggests that local sensory factors rather than higher cognitive factors play a part in the reduction of response time with repeated exposure of random-dot stereograms. Eye movements may also play a part.
An experiment is needed in which transfer is tested across stereograms filtered to different size ranges. If disparity processing occurs in distinct size channels, one would expect little transfer under these conditions.
18.14.2e Criterion Changes The reduction in response times with repeated exposure to complex stereograms may be due in part to a change in the criterion for seeing “the object in depth.” Bradshaw et al. (1995) minimized criterion effects in two ways. First, they used the more objective criterion of whether the spiral wound clockwise or counterclockwise, and second, they gave subjects a series of practice trials for discriminating clockwise and counterclockwise spirals defined by luminance. Response times averaged about 3 s, even on the first presentation of the stereogram to naïve subjects. There was a practice effect but it was necessarily small, given the short latencies on the first trial.
18.14.2f Stimulus Decorrelation Bradshaw et al. (1995) observed that the spiral stereograms in Julesz (1971), and especially the hyperbolic paraboloid stereogram, contain many uncorrelated elements (Figure 18.60). Frisby and Clatworthy (1975), Ramachandran (1976), and MacCracken and Hayes (1976) all used these stereograms to measure response times. Julesz (1971) drew attention to the importance of binocular correlation and devised a test of stereoscopic ability based on a set of stereograms with decreasing correlation. The average time to identify the shape of the fully correlated hyperbolic paraboloid used by Bradshaw et al. (1995) was just 3 s, compared with 9.1 s for the version in Julesz (1971). Christophers et al. (1993) measured response times for (1) discriminating the direction in which the spiral unwound and (2) detecting the presence of small discrete steps on the spiral surface while independently manipulating the binocular decorrelation. With a 30% decorrelation, latencies increased by 42% for discriminating the spiral direction and by almost 100% for detecting the small discrete steps.
In summary, familiarity with the cyclopean shape does not seem to be a factor in the reduction of depth latency in random-dot stereograms. Learning the pattern of eye movements required to fuse the images probably accounts for some of the reduction of depth latency. Several sensory factors are also involved, such as the orientation, density, and location of the dots, the presence of uncorrelated dots, and viewing distance.
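To make the notion of binocular decorrelation concrete, the following minimal sketch builds a random-dot stereogram in which a central square carries a horizontal disparity and a chosen fraction of right-image dots is replaced by fresh random dots. The names `make_rds` and `mismatch_fraction`, all parameter values, and the definition of decorrelation as the probability that a dot is replaced are illustrative assumptions; they are not the procedure used in the studies cited above.

```python
import random


def make_rds(size=64, square=24, disparity=4, decorrelation=0.0,
             density=0.5, seed=0):
    """Return (left, right) binary dot arrays for a random-dot stereogram.

    Assumed definition: `decorrelation` is the probability that a dot in the
    right image is replaced by a fresh random dot.
    """
    rng = random.Random(seed)

    def dot():
        return 1 if rng.random() < density else 0

    left = [[dot() for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]

    lo = (size - square) // 2
    hi = lo + square
    for r in range(lo, hi):
        # Shift the central square leftward in the right image.
        for c in range(lo, hi):
            right[r][c - disparity] = left[r][c]
        # Fill the strip uncovered by the shift with new dots.
        for c in range(hi - disparity, hi):
            right[r][c] = dot()

    # Replace a random subset of right-image dots to decorrelate the pair.
    for r in range(size):
        for c in range(size):
            if rng.random() < decorrelation:
                right[r][c] = dot()
    return left, right


def mismatch_fraction(left, right, rows):
    """Fraction of dots that differ between the two images in the given rows."""
    total = sum(len(left[r]) for r in rows)
    diff = sum(a != b for r in rows for a, b in zip(left[r], right[r]))
    return diff / total


left, right = make_rds(decorrelation=0.0)
noisy_left, noisy_right = make_rds(decorrelation=0.3)
background = range(0, 20)  # rows above the disparate square
print(mismatch_fraction(left, right, background))  # 0.0 for a fully correlated pair
print(mismatch_fraction(noisy_left, noisy_right, background))
```

With a dot density of 0.5, about half of the replaced dots match their partners by chance, so under this assumed definition a 30% decorrelation leaves roughly 15% of dots mismatched.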
19 TYPES OF BINOCULAR DISPARITY
19.1 Point disparity
19.1.1 Introduction
19.1.2 Disparities on flat coplanar retinas
19.1.3 Disparities on converged retinas
19.1.4 Point disparities and vergence
19.2 Relative and size disparities
19.2.1 Definitions
19.2.2 Size disparities on a slanted surface
19.2.3 Size ratios
19.2.4 Dif-frequency disparity
19.3 Other types of disparity
19.3.1 Orientation disparities
19.3.2 Angular disparity
19.3.3 Deformation disparities
19.3.4 Polar disparity
19.3.5 Headcentric disparity
19.4 Linear disparity gradients
19.5 Higher-order disparities
19.6 Vertical disparity
19.6.1 Vertical disparities due to the visual system
19.6.2 Vertical disparities from the visual scene
19.6.3 Relative vertical disparities and size ratios
19.6.4 Vertical disparity as a cue to eccentricity
19.6.5 Vertical disparities as a cue to eye position
19.6.6 Vertical disparities as a cue to 3-D structure
19.1 POINT DISPARITY
19.1.1 INTRODUCTION
A point disparity is the angular separation between the images in the two eyes produced by a single object point. The images of a point have a binocular disparity if either their azimuths ($\alpha_L$ and $\alpha_R$) or elevations ($\beta_L$ and $\beta_R$) differ. A difference in azimuth is an absolute horizontal disparity, and a difference in elevation is an absolute vertical disparity.
$$\text{Absolute horizontal disparity} = \alpha_L - \alpha_R \tag{1}$$
$$\text{Absolute vertical disparity} = \beta_L - \beta_R \tag{2}$$
An absolute disparity is a zero-order disparity. Values assigned to absolute disparities depend on the axis system used to measure them (Section 14.3). However, just as azimuth and elevation can be transformed from one axis system to another, so too can measures of horizontal and vertical disparity, if the axes in the two eyes are aligned. Binocular disparities created by any scene can be described by point disparities. Each point disparity has a magnitude and a direction and may therefore be represented as a vector. The distribution of point disparities created in the image of a visual scene is a disparity vector field (see Figures 19.2 and 19.3). The distances to each point in a scene from a point midway between the eyes constitute a depth map or range map. Point disparities do not specify the range map. Any point disparity is a function of the depth interval between the point and the point upon which the eyes are converged. Also, the disparity produced by a point a given distance from the fixation point varies with the absolute distance and headcentric eccentricity of the fixation point. A disparity vector field is analogous to a gray-level representation of a scene, which specifies the luminance of each point (Marr 1982). If the visual system merely created images, like a stereocamera, a gray-level representation and a disparity vector field would be sufficient. The visual system must start with receptors detecting the luminance of each point and with disparity detectors in V1 detecting point disparities. However, spatial and temporal derivatives of luminance and disparity must be detected if the 3-D structure of a scene is to be perceived. Patterns of point disparities from a given stimulus depend on the shape of the surface on which the images are projected. Cameras have a flat image surface, and eyes have a spherical image surface. For convenience, they will both be referred to as retinas.
19.1.2 DISPARITIES ON FLAT COPLANAR RETINAS
Binocular disparity arises because the eyes view the world from different positions, so that the perspective produced in one eye differs from that produced in the other eye. The binocular disparities produced by a scene may be called binocular perspective. The image of a frontal surface on a flat retina has no perspective: parallel lines remain parallel, and angles are unchanged. Therefore, a frontal surface produces no binocular disparities on coplanar flat retinas when the center of each retina is on the visual axis, as in Figure 19.1A. Thus, the horopter for two coplanar retinas is the frontal plane on which the visual axes converge. Convergence is achieved by opposed rotation of the lenses and horizontal translation of the retinas to bring matching images onto corresponding retinal points. It is assumed that the retinas are vertically and rotationally aligned.
Convergence of images on coplanar camera planes could be achieved by translating the digital images in a computer. This eliminates vertical disparities, which simplifies computation. However, there are three disadvantages to translating digital images produced on coplanar retinas.
1. After convergence, all disparities are of one sign because the cameras are converged at infinity. This reduces the sensitivity of the system.
2. Light detectors must have a uniform density, which would not allow the system to have a high concentration of detectors in the part of the system used for fine discriminations.
3. The visual fields of coplanar retinas have little overlap at near distances.
Therefore, cameras of robotic systems use rotational convergence. Disadvantages of linear convergence can be reduced or eliminated by physical translation of the camera image planes rather than of the digital image.
Figure 19.1. Types of convergence. (A) Flat coplanar retinas. (B) Flat, noncoplanar retinas. (C) Spherical retinas.
Patterns of disparity produced by slanted and inclined surfaces on coplanar retinas are shown in Figure 19.2. An inclined surface produces gradients of horizontal disparity above and below the horizon. These are equivalent to horizontally shearing one image relative to the other. There are no vertical disparities. A slanted surface produces gradients of horizontal disparity in which one image is compressed horizontally with respect to the other. Again, there are no vertical disparities, although horizontal lines in one eye are rotated with respect to those in the other, with one sign of rotation above the horizon and the opposite sign below the horizon.
19.1.3 DISPARITIES ON CONVERGED RETINAS
Assume that flat retinas are able to rotate about vertical axes so that the visual axes converge on any specified point, as in Figure 19.1B. A frontal plane projects onto each retina as a trapezoid with a horizontal taper because the object plane is not parallel to the image plane. The taper is opposite in sign in the two retinas and produces the pattern of disparities shown in Figure 19.3A. Note that there are gradients of both horizontal and vertical disparities. For a given angle of vergence, horizontal disparity is proportional to horizontal eccentricity, and vertical disparity is also proportional to horizontal eccentricity with a secondary dependence on vertical eccentricity. Both types of disparity reduce to zero as a point moves toward the horizontal or vertical line passing through the point of fixation. These lines therefore constitute the horopter for flat retinas. All disparities reduce to zero as the frontal display moves to infinity. At infinity the visual axes are parallel and the retinas are coplanar.
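The contrast between flat coplanar retinas and flat retinas converged by rotation can be checked numerically. The sketch below assumes an idealized pinhole projection onto a flat image plane at unit distance from each nodal point; the function names `project` and `disparity`, the coordinate conventions, and the numerical values are hypothetical choices made only for illustration.

```python
import math


def project(point, eye_x, gamma, f=1.0):
    """Project a 3-D point (x, y, z) onto a flat image plane at distance f.

    The nodal point is at (eye_x, 0, 0); gamma is the rotation of the
    optical axis about the vertical axis (positive toward +x).
    """
    x, y, z = point
    dx = x - eye_x
    x_cam = dx * math.cos(gamma) - z * math.sin(gamma)
    z_cam = dx * math.sin(gamma) + z * math.cos(gamma)
    return f * x_cam / z_cam, f * y / z_cam


def disparity(point, a, fixation_distance, converged):
    """Return (horizontal, vertical) disparity, left image minus right image.

    With converged=False the image planes stay coplanar; with converged=True
    each plane rotates so that its axis passes through a midline fixation point.
    """
    gamma = math.atan((a / 2) / fixation_distance) if converged else 0.0
    xl, yl = project(point, -a / 2, gamma)
    xr, yr = project(point, a / 2, -gamma)
    return xl - xr, yl - yr


a, D = 0.065, 0.5  # assumed interocular distance and viewing distance (m)
for x, y in [(0.2, 0.2), (-0.2, 0.2), (0.2, 0.0), (0.0, 0.2)]:
    point = (x, y, D)  # points on a frontal surface
    print((x, y), disparity(point, a, D, False), disparity(point, a, D, True))
```

Under these assumptions the coplanar case gives the same horizontal disparity at every point of the frontal surface and no vertical disparity, whereas the converged case gives vertical disparities of opposite sign in adjacent quadrants that vanish on the horizontal and vertical lines through the fixation point.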
Figure 19.2. Patterns of disparity produced on flat coplanar retinas. (A) An inclined surface produces a horizontal shear disparity. There are no vertical disparities. (B) A slanted surface produces opposite gradients of horizontal disparity. There are no vertical disparities.
Patterns of disparity produced by inclined and slanted surfaces on flat retinas converged by rotation are shown in Figures 19.3B and C. An inclined surface horizontally shears one image relative to the other, as on coplanar retinas. Whereas coplanar retinas produce no vertical disparities, converged retinas produce gradients of vertical disparities in the four quadrants of the combined image. A slanted surface compresses one image horizontally relative to the other, as on coplanar retinas, but there are also vertical disparity gradients in the quadrants. Spherical retinas converge by rotation about the center of the sphere, as in Figure 19.1C. The disparities produced on spherical retinas by frontal, inclined, and slanted surfaces are similar to those produced on converged flat retinas. The main difference is that, on spherical retinas, horizontal disparities are larger relative to vertical disparities than they are on converged flat retinas. This is because the distance from the nodal point to the retina is constant for all lines of sight for a spherical retina, but it is not constant for a flat retina. Disparities produced on a flat retina are specified in terms of rectangular Cartesian axes. We saw in Section 14.3 that disparities produced on a spherical retina depend on the axis system used to measure them.
Figure 19.3. Disparities produced on flat retinas orthogonal to converged visual axes. (A) A frontal surface produces opposite gradients of horizontal disparity. Each quadrant contains a diagonal gradient of vertical disparity. Horizontal lines have opposite gradients of orientation disparity in the upper and lower fields. (B) A surface inclined top away produces a horizontal shear disparity superimposed on the disparities produced by a frontal surface. (C) A surface slanted about a vertical axis produces a horizontal compression disparity superimposed on the disparities produced by a frontal plane.
19.1.4 POINT DISPARITIES AND VERGENCE
The directions of any point in space from two vantage points are always different, unless the point lies at infinity. Thus, all points, except those at infinity, have different headcentric directions at the two eyes. But this does not mean that the points have different directions with respect to the two retinas. An appropriate rotation of the eyes can bring the images of any object within the binocular visual field onto corresponding retinal locations in the two retinas. In particular, the images of any binocularly fixated object fall on the foveas and have zero horizontal and vertical disparities. For a given position of the eyes, the horopter is the locus of all points with zero disparity. The position in space of a binocularly fixated point relative to the head is fully specified by the angles of horizontal and vertical version of the two eyes. For an object on the median plane of the head, the distance, d, of a fixated point from the cyclopean eye midway between the eyes is given by:
$$d = \frac{a/2}{\tan(\theta/2)} \tag{3}$$
where $\theta$ is the angle of vergence and a is the interocular distance. In theory, the visual system could determine the absolute distance of any fixated object from the cyclopean point by registering the version and vergence angles of the eyes. It would not be necessary to register binocular disparities except to indicate that the eyes are converged on the object. If the visual system were to operate in this way, it would be acting like a range finder. The extent to which
human judgments of distance can be based on vergence is discussed in Section 25.2. In some animals the angle of vergence is fixed, so that the fixation point is at a fixed distance from the point midway between the eyes. The azimuth angles and elevations of the two images of a point fully specify the position of the point with respect to this fixed distance. If the animal registers the fixed distance, horizontal disparities provide
information about the absolute distances of objects. We will see in Chapter 33 that some animals use this simple system for judging the distances of objects. When the eyes are free to move, specification of the headcentric location of a point in space requires information about the positions and orientations of the eyes and about the locations of the images of the point in the two eyes. Eye-position information may be provided by proprioceptors (feedback) in the extraocular muscles or by efference copy (feedforward) signals from the oculomotor nuclei. Alternatively, we will see in Section 19.6.5 that it has been suggested that the positions of the eyes may be specified by patterns of binocular disparity produced by a visual scene. A change of horizontal vergence causes the retinas to rotate around vertical axes in opposite directions. Vertical vergence causes the retinas to rotate about horizontal axes. Consequently, the images in the two eyes translate, horizontally or vertically, in opposite directions over the retinas. These movements change the disparity of the images of all points in the visual scene equally. Cyclovergence causes the retinas to rotate about the visual axes in opposite directions. Consequently, the images rotate in opposite directions over the retinas. This produces an overall cyclodisparity. Any overall linear disparity or cyclodisparity is due only to the vergence state of the eyes. Therefore, such disparities convey no information about the depth structure of the scene. Any overall linear disparity or cyclodisparity may be reduced to zero by an appropriate vergence eye movement. Cameras have a flat image surface, and eyes have a spherical image surface. Each type of image surface requires particular types of vergence movements to cancel overall disparities. Flat retinas should remain coplanar, and vergence should involve an opposed horizontal, vertical, or rotary motion of the image surface, each within its own plane. 
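The range-finder relation in equation (3), and the statement that a change of horizontal vergence shifts the disparity of all points equally, can be sketched as follows. The sketch is restricted to the horizontal plane of regard and assumes that each eye rotates about a vertical axis through its nodal point; the function names and numerical values are hypothetical.

```python
import math


def vergence_angle(distance, a):
    """Vergence angle (rad) needed to fixate a midline point; inverse of equation (3)."""
    return 2 * math.atan((a / 2) / distance)


def fixation_distance(theta, a):
    """Equation (3): d = (a/2) / tan(theta/2)."""
    return (a / 2) / math.tan(theta / 2)


def binocular_subtense(point, a):
    """Angle subtended at a point (x, z) by the two nodal points."""
    x, z = point
    return math.atan2(x + a / 2, z) - math.atan2(x - a / 2, z)


def absolute_disparity(point, theta, a):
    """Horizontal disparity of a point when the eyes are converged by theta."""
    return binocular_subtense(point, a) - theta


a = 0.065                          # assumed interocular distance (m)
near, far = (0.0, 0.4), (0.1, 0.8)
for d in (0.5, 1.0):
    theta = vergence_angle(d, a)
    d_near = absolute_disparity(near, theta, a)
    d_far = absolute_disparity(far, theta, a)
    # Absolute disparities change with vergence; their difference does not.
    print(d, d_near, d_far, d_near - d_far)
```

In this simplified geometry a vergence change adds the same constant to every absolute disparity, which is why the difference between the disparities of two points is left unchanged.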
The two lenses should remain fixed so that each retina moves with respect to a stationary image and the stereobase remains constant. It is usual to achieve convergence in binocular video cameras by rotating each camera like an eye. But this is not a true convergence, since each image surface does not move within the image plane. The movement therefore alters both overall and relative disparities. The spurious relative disparities due to vergence may be computed out by registering the movements of the cameras. Spherical retinas should converge by rotating in opposite directions about the nodal points of the eyes, as in Figure 19.1C. Each retina then moves within the surface of the stationary spherical image. Our eyes converge approximately in this way. It ensures that changes in vergence do not introduce spurious changes in perspective or relative disparity, and that the stereobase remains constant. Relative disparities between the images of different objects are not affected by changes in vergence. Any relative disparities that remain when vergence movements have
reduced the overall disparity to a minimum are determined wholly by the 3-D structure of the scene. Bringing the overall disparities to a minimum helps to bring the residual disparities within the detection range of disparity detectors.
19.2 RELATIVE AND SIZE DISPARITIES
19.2.1 DEFINITIONS
This section is concerned with spatial derivatives of point disparities produced by two or more object points. Detection of a spatial derivative of disparity involves gathering information from a set of point disparities. Detection of higher-order spatial or temporal derivatives of disparity requires complex detectors that gather information over large areas. The simplest spatial derivative of disparity is the difference between the disparity produced by one point and that produced by another point. This is a first-order disparity, or relative disparity. If retinal directions are specified by azimuth/longitude-elevation/latitude coordinates (gun-turret model), azimuth and elevation differences are invariant over horizontal eye rotations. The horizontal disparity of point P1 relative to point P2 is the difference between the absolute horizontal disparities of the two points:
$$\text{Relative horizontal disparity} = (\alpha_{1L} - \alpha_{1R}) - (\alpha_{2L} - \alpha_{2R}) \tag{4}$$
where $\alpha_{1L}$ and $\alpha_{1R}$ are the azimuths of P1 in the left and right eyes and $\alpha_{2L}$ and $\alpha_{2R}$ are the azimuths of P2 in the left and right eyes, as shown in Figure 19.4A. For each eye, the azimuth of a point is the angle between a sagittal plane of the head and the line of sight through the point. A relative horizontal disparity can also be expressed as the horizontal separation of the images of the two points in the left eye, $\phi$, minus their separation in the right eye, $\theta$, as in Figure 19.5. This involves rearranging the terms in equation (4). It represents a difference in angular width, which will be referred to as a horizontal-size disparity.
$$\text{Horizontal-size disparity} = \phi - \theta \quad \text{or} \quad (\alpha_{1L} - \alpha_{2L}) - (\alpha_{1R} - \alpha_{2R}) \tag{5}$$
Similarly, the relative vertical disparity of the images of a pair of points is the difference between the absolute disparities of the two points:
$$\text{Relative vertical disparity} = (\beta_{1L} - \beta_{1R}) - (\beta_{2L} - \beta_{2R}) \tag{6}$$
Figure 19.4. The Vieth-Müller and isodisparity circles. (A) The headcentric azimuth of a point for each eye is specified with respect to a sagittal plane of the head. Point P1 has zero disparity and falls on the horopter (Vieth-Müller circle). (B) Midline point P2 is more distant than P1. It therefore subtends a smaller angle than P1 and produces an uncrossed disparity. Point P3 falls on the same isodisparity locus as P2 and has the same binocular subtense. However, it is nearer to the cyclopean point than P2.
Figure 19.5. Isoazimuth planes and horizontal disparities. In an azimuth-longitude/elevation-latitude system, all points in a vertical plane through the y axis of one eye have the same azimuth. Points in different vertical planes have different azimuths. A point in plane B has zero horizontal disparity with respect to a point in plane C when the eyes are converged on a point where the planes intersect. Points in plane D have a horizontal disparity of $\theta$ with respect to points in plane B. Points in planes A and B are separated by azimuth angle $\phi$ and points in planes C and D are separated by angle $\theta$. The difference between $\phi$ and $\theta$ defines the horizontal-size disparity of the intervals between the two pairs of points.
If the vertical separation between the points is measured separately in each eye we have:
$$\text{Vertical-size disparity} = (\beta_{1L} - \beta_{2L}) - (\beta_{1R} - \beta_{2R}) \tag{7}$$
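The equivalence of the relative-disparity form in equation (4) and the size-disparity form in equation (5) can be illustrated with a short sketch. It is confined to the horizontal plane, places the eyes at hypothetical positions (-a/2, 0) and (a/2, 0), and measures headcentric azimuth clockwise from straight ahead as seen from above; the function names and test points are assumptions made for illustration.

```python
import math


def azimuth(point, eye_x):
    """Headcentric azimuth (rad) of a point (x, z) seen from an eye at (eye_x, 0)."""
    x, z = point
    return math.atan2(x - eye_x, z)


def relative_disparity(p1, p2, a):
    """Equation (4): difference between the absolute disparities of P1 and P2."""
    abs1 = azimuth(p1, -a / 2) - azimuth(p1, a / 2)
    abs2 = azimuth(p2, -a / 2) - azimuth(p2, a / 2)
    return abs1 - abs2


def size_disparity(p1, p2, a):
    """Equation (5): left-eye separation (phi) minus right-eye separation (theta)."""
    phi = azimuth(p1, -a / 2) - azimuth(p2, -a / 2)
    theta = azimuth(p1, a / 2) - azimuth(p2, a / 2)
    return phi - theta


a = 0.065                              # assumed interocular distance (m)
p1, p2 = (0.0, 0.5), (0.05, 1.0)
# The two expressions agree up to rounding error.
print(relative_disparity(p1, p2, a), size_disparity(p1, p2, a))

# A point on the circle through both nodal points and the fixation point
# has zero disparity relative to the fixation point.
D = 0.5
c = (D * D - a * a / 4) / (2 * D)      # circle center on the midline
R = D - c                              # circle radius
on_circle = (R * math.sin(0.5), c + R * math.cos(0.5))
print(relative_disparity((0.0, D), on_circle, a))
```

Because both expressions are rearrangements of the same four azimuths, neither depends on where the eyes are pointing in this headcentric description.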
Any pair of points in any orientation can produce a size disparity. A single point disparity varies with changes in horizontal or vertical vergence. A size disparity is the separation of two images in one eye minus the separation of two images in the other eye. Vergence changes do not affect size disparities. Specification of size disparity requires definitions of “horizontal” and “vertical.” In the gun-turret axis system (Section 14.3.1), vertical lines lie on vertical azimuth planes (Figure 19.5). A line is horizontal if points on the line have the same elevation. In this system, horizontal lines lie on lines of longitude. In the gun-turret axis system the horizontal-size disparities and the vertical-size disparities of a pair of object points acquire different values when the eyes change their elevation. The angular separation of the points does not change, nor does the physical separation of their images on the spherical retinas. However, the axes used to define what is vertical or horizontal rotate, and hence the measurements of horizontal and vertical separation will change.
19.2.2 SIZE DISPARITIES ON A SLANTED SURFACE
Three factors affect the horizontal-size disparity of horizontally separated points:
1. The slant of the line joining the points relative to the tangent to the horopter.
2. The eccentricity of the points with respect to the median plane of the head.
3. The distance of the points from the viewer.
19.2.2a Size Disparity and Relative Depth

For two objects in similar directions, like P1 and P2 in Figure 19.4B, the sign of their relative disparity specifies which object is nearer to the viewer. However, for objects in widely different directions, like P2 and P3, the sign of relative disparity specifies the relative distances of the two objects with respect to an isodisparity locus, not with respect to the cyclopean point. In general, measuring azimuth in a clockwise direction from above, a positive value of (α1L − α1R) − (α2L − α2R) specifies that point P2 is beyond the isodisparity locus passing through P1. A negative value specifies that it falls inside that circle.

A horizontal-size disparity between the images of a horizontal line at a given distance and eccentricity varies with the slant of the line with respect to the tangent to the horopter. Disparity indicates slant of the line to the frontal plane only near the median plane of the head, where the tangent to the horopter and the frontal plane are parallel. In this case, the slant of a line with respect to a frontal plane is:

\[ \text{Slant} = \arctan\left[\frac{(M - 1)\,d}{(M + 1)\,a}\right] \tag{8} \]

where M is the ratio of the horizontal size of one image to that of the other image, d is viewing distance, and a is the interocular distance (Ogle 1964, p. 162).

Figure 19.6. [Diagram: a short line AB viewed by one eye at eccentricity ε while keeping a constant angle to the line of sight. Its distance from the eye increases from d to d′ = d/cos ε, so its angular subtense θ, which is proportional to 1/d′, is proportional to cos ε.]
19.2.2b Size Disparity and Stimulus Eccentricity

Now consider the effect of eccentricity. Let one eye view a short horizontal line, AB, as in Figure 19.6. The visual angle subtended by the line decreases as it moves horizontally within a frontal plane to an angle of eccentricity ε. Two factors are responsible for this. The first operates alone when the line maintains a constant angle to the line of sight, as in Figure 19.6. The distance of the line from the eye increases in inverse proportion to cos ε. Since the angular subtense of AB is also inversely proportional to the distance of the line, it is proportional to cos ε. The second factor is that, as the line AB moves into an eccentric position while remaining in a frontal plane, the visual axis intersects the line at a sharper angle (Figure 19.7). This effect is also proportional to cos ε. Together, these two factors cause the image of line AB to decrease in proportion to cos² ε.

The image of AB also changes in the other eye. When the horizontal line is centered on the median plane of the head, its ends have an equal and opposite angle of eccentricity. The cosine functions for the two eyes therefore have the same absolute value and the images in the two eyes are equal in size. As the line moves within a frontal plane to a horizontally eccentric position, the two angles of eccentricity change by different amounts. There is thus a phase difference between the two cos² ε functions, and the image in one eye is larger than that in the other. The horizontal-size disparity between images of a line thus varies with eccentricity.

At infinity, the difference between the two angles of eccentricity reduces to zero whatever the eccentricity, as does the difference between the cosine functions. There are thus no size differences between the images of objects at infinity. As the line approaches the eyes along a hyperbola of Hillebrand (locus of equal version) the two angles of eccentricity increase by equal and opposite amounts, as does the phase difference between the cosine functions. Thus, the disparity between the images of a horizontal line depends on both its eccentricity and its distance.

A horizontal line orthogonal to the cyclopean axis has the same angle with respect to each visual axis. Therefore, the horizontal disparity of such a line occurs only because it is nearer to one eye than to the other. Its horizontal-size disparity is proportional to the difference between two cos ε functions. A horizontal line that remains tangential to the horizontal horopter projects equal images in the two eyes at all eccentricities, and therefore creates zero horizontal-size disparity. For such a line the change in image size due to the changing angle of the line to the lines of sight cancels the change in image size due to the changing eccentricity of the line.

A horizontal line moving vertically in a frontal plane and a vertical line moving laterally change in visual subtense but maintain a constant angle to the line of sight. Thus, as these lines move away from the straight ahead, their images acquire a size disparity proportional to the difference between two cos ε functions.

Figure 19.7. Angular subtense of a frontal line. The angular subtense, θ, of frontal line AB decreases as it moves into an eccentric position. Its angular subtense decreases in proportion to cos ε because its distance from the eye increases (Figure 19.6). It also decreases because the line becomes inclined to the line of sight by angle ε. This effect is also proportional to cos ε. Therefore, the angular subtense of a line in the frontal plane decreases in proportion to cos² ε.

Figure 19.8. Slant as a function of size disparity and distance. Each graph shows how the slant of a short central surface about a vertical axis varies with the magnitude of horizontal-size disparity at the viewing distance indicated on the curve. [Axes: slant (deg) against size disparity (%); curves for viewing distances from 10 cm to 5 m.]
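The cos² ε relation described in this section can be checked numerically. The following is a minimal sketch, assuming a point eye, a line that is short relative to its distance, and illustrative values (57 cm distance, 6.5 cm interocular separation); the function names are hypothetical and not taken from the text.

```python
import math

def subtense(x_center, length, distance, eye_x=0.0):
    """Visual angle (radians) of a short horizontal line in a frontal plane.

    The line is centred at lateral position x_center, at orthogonal
    distance `distance`; the eye sits at lateral position eye_x.
    """
    first_end = math.atan2(x_center - length / 2.0 - eye_x, distance)
    second_end = math.atan2(x_center + length / 2.0 - eye_x, distance)
    return second_end - first_end

def relative_subtense(eccentricity_deg, distance=57.0, length=0.1):
    """Subtense at a given eccentricity divided by the subtense straight ahead."""
    x_center = distance * math.tan(math.radians(eccentricity_deg))
    return subtense(x_center, length, distance) / subtense(0.0, length, distance)

def interocular_size_ratio(x_center, distance, interocular=6.5, length=0.1):
    """Left-eye subtense divided by right-eye subtense for the same line."""
    left = subtense(x_center, length, distance, eye_x=-interocular / 2.0)
    right = subtense(x_center, length, distance, eye_x=interocular / 2.0)
    return left / right

if __name__ == "__main__":
    # Compare the computed subtense with the cos^2 prediction for one eye.
    for ecc in (0, 15, 30, 45, 60):
        predicted = math.cos(math.radians(ecc)) ** 2
        print(ecc, round(relative_subtense(ecc), 4), round(predicted, 4))
    # The interocular ratio for an eccentric line approaches 1 with distance.
    for distance in (28.5, 57.0, 500.0):
        print(distance, round(interocular_size_ratio(30.0, distance), 4))
```

Under these assumptions the single-eye subtense follows cos² ε closely, the two images are equal for a line on the median plane, and the interocular ratio tends toward 1 as distance grows.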
19.2.2c Size Disparity and Viewing Distance

The horizontal disparity between two points a fixed distance apart in depth is inversely proportional to the square of the distance of the points. A near pair of points with a small depth separation can create the same horizontal disparity as created by a more distant pair with a large depth separation. Consequently, the slant produced by a given size disparity in a central horizontal line element increases with viewing distance, as shown in Figure 19.8. The distance between isodisparity circles with equal intervals of disparity becomes larger as viewing distance increases. The spacing between isodisparity circles also varies with eccentricity, as shown in Figure 19.4B. Thus, horizontal disparity provides information about relative depth at a particular eccentricity and distance. The locations of points in space can be recovered only when the absolute distance and headcentric eccentricity of points are known.
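The inverse-square relation between disparity and viewing distance can be illustrated with a short calculation. This is a sketch only, assuming two points on the median plane and an interocular distance of 6.5 cm; the function names and sample distances are illustrative choices, not values from the text.

```python
import math

def binocular_subtense(distance, interocular=6.5):
    """Angle (radians) between the two visual lines to a midline point."""
    return 2.0 * math.atan(interocular / (2.0 * distance))

def relative_disparity(near, depth_interval, interocular=6.5):
    """Disparity (radians) between midline points at `near` and `near + depth_interval`."""
    return (binocular_subtense(near, interocular)
            - binocular_subtense(near + depth_interval, interocular))

def to_arcmin(radians):
    """Convert radians to minutes of arc."""
    return math.degrees(radians) * 60.0

if __name__ == "__main__":
    # The same 1 cm depth interval viewed from increasing distances (cm).
    for distance in (50.0, 100.0, 200.0, 400.0):
        print(distance, round(to_arcmin(relative_disparity(distance, 1.0)), 3))
```

With these assumptions, doubling the distance reduces the disparity of a fixed depth interval to about one quarter, and a depth interval about four times larger at twice the distance gives roughly the same disparity.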
19.2.3 SIZE RATIOS

A horizontal-size disparity is the difference between the horizontal angular sizes of the images of an object. A horizontal size ratio (HSR) is the ratio of the angular sizes of the images.

\[ \text{HSR} = \frac{\alpha_{1L} - \alpha_{2L}}{\alpha_{1R} - \alpha_{2R}} \tag{9} \]
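The two measures can be compared on a simple case. The sketch below computes the horizontal and vertical size ratios of a small element on a frontal surface, taking left-eye size over right-eye size, together with the size disparity divided by the right-eye separation, which is algebraically the HSR minus 1. It assumes point eyes 6.5 cm apart, azimuth measured as a longitude and elevation as a latitude, and a small element centred on the horizon plane. All names and values are illustrative.

```python
import math

def azimuth(x, z, eye_x):
    """Azimuth (radians) of a horizon-plane point seen from an eye at (eye_x, 0, 0)."""
    return math.atan2(x - eye_x, z)

def elevation(x, y, z, eye_x):
    """Elevation (radians), taken as a latitude above that eye's horizon plane."""
    return math.atan2(y, math.hypot(x - eye_x, z))

def image_sizes(x, distance, interocular=6.5, size=0.01):
    """Horizontal and vertical angular sizes in each eye of a small frontal element."""
    left_eye, right_eye = -interocular / 2.0, interocular / 2.0
    half = size / 2.0
    h_left = azimuth(x + half, distance, left_eye) - azimuth(x - half, distance, left_eye)
    h_right = azimuth(x + half, distance, right_eye) - azimuth(x - half, distance, right_eye)
    v_left = elevation(x, half, distance, left_eye) - elevation(x, -half, distance, left_eye)
    v_right = elevation(x, half, distance, right_eye) - elevation(x, -half, distance, right_eye)
    return h_left, h_right, v_left, v_right

def size_ratios(x, distance, interocular=6.5):
    """Return (HSR, VSR), each as left-eye size over right-eye size."""
    h_left, h_right, v_left, v_right = image_sizes(x, distance, interocular)
    return h_left / h_right, v_left / v_right

def disparity_gradient(x, distance, interocular=6.5):
    """Horizontal-size disparity divided by the right-eye angular separation."""
    h_left, h_right, _, _ = image_sizes(x, distance, interocular)
    return (h_left - h_right) / h_right

if __name__ == "__main__":
    for x in (0.0, 15.0, 30.0, 57.0):
        hsr, vsr = size_ratios(x, 57.0)
        print(x, round(hsr, 4), round(vsr ** 2, 4), round(disparity_gradient(x, 57.0), 4))
```

For this frontal-surface geometry the computed HSR agrees with the square of the VSR, which is the relation the text states for frontal surfaces.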
Both measures of disparity have the same sign and are affected by the slant of a line and by its eccentricity and distance. Figure 19.9 plots the joint effects of horizontal eccentricity and viewing distance on the horizontal size ratio of the images of a horizontal line element in a frontal plane.

The HSR is a first-order disparity, which provides information about the disparity gradient. Consider two closely spaced and horizontally separated points whose angular subtense to the two eyes is α1L − α2L and α1R − α2R respectively. The horizontal-size disparity is defined as (α1L − α2L) − (α1R − α2R). Dividing the horizontal-size disparity by the angular separation in the right eye (α1R − α2R) yields the disparity gradient, as explained in Section 19.4.

\[ \text{Disparity gradient} = \frac{(\alpha_{1L} - \alpha_{2L}) - (\alpha_{1R} - \alpha_{2R})}{\alpha_{1R} - \alpha_{2R}} \tag{10} \]

Thus, the disparity gradient of a pair of closely spaced points equals the HSR minus 1.

\[ \text{Disparity gradient} = \text{HSR} - 1 \tag{11} \]

Equivalently, the HSR of a pair of closely spaced points is the disparity gradient plus 1. For a surface with a given degree of slant, HSR varies inversely with distance rather than with distance squared.

A vertical size ratio (VSR) is the ratio of the vertical separation between the images of two points in one eye and the vertical separation of the images in the other eye. Figure 19.10 plots the joint effects of horizontal eccentricity and viewing distance on the vertical size ratio of the images of a vertical line element in a frontal plane.

The orientation of a vertical line to the line of sight does not change with horizontal eccentricity (ε). Thus, the size disparity of a vertical line varies as a function of cos ε rather than of cos² ε, as it does for a horizontal line, and HSR = VSR². This fact has implications for the perception of frontal surfaces, as we will see in Section 20.6.

Figure 19.9. Horizontal size ratios (HSRs) for a frontal surface. The graph shows how the HSR of the images of a horizontal line on a frontal surface varies with eccentricity and the orthogonal distance to the surface. For any eccentricity and distance, the HSR is equal to VSR². [Axes: horizontal size ratio (0.8 to 1.2) against eccentricity (−90° to 90°) and distance to the surface (28.5 cm to infinity).]

19.2.4 DIF-FREQUENCY DISPARITY
Consider a centrally placed surface in a nearby frontal plane of the head with the eyes converged on the center. With respect to the visual axes, the surface slants to the right for the left eye and to the left for the right eye. Therefore, the perspective texture gradients in the two images are in opposite directions. This introduces a gradient of horizontal disparity extending in both directions out from the midline. The disparity gradient increases as the slant of the grating relative to the frontal plane of the head increases, as shown in Figure 19.3. If the surface is a regular vertical grating, the images in the two eyes will differ in spatial frequency. Tyler and Sutter (1979) referred to a dichoptic difference of spatial frequency as dif-frequency disparity. The dif-frequency disparity produced by a slanted regular grating is superimposed on monocular perspective gradients. In Figure 19.11 the left-eye image has a regular periodicity, while the right-eye image has a lower mean spatial frequency. The right-eye image has a gradient of spatial frequency as well as a mean difference with respect to the left-eye image.
Figure 19.10. Vertical size ratios (VSRs) for a frontal surface. The graph shows how the VSR of the images of a vertical line on a frontal surface varies with headcentric eccentricity and the distance to the surface. The VSR is maximal at eccentricities around ±45° and decreases back to 1.0 at eccentricities of either +90° or −90°, or as the distance to the surface approaches infinity. [Axes: vertical size ratio (0.8 to 1.2) against eccentricity (−90° to 90°) and distance to the surface (28.5 cm to infinity).]

Figure 19.11. Disparity for a slanted vertical grating. Bars on the slanted vertical grating are graded in width so that they subtend equal angles in the left eye. The image of the grating in the right eye has a lower mean spatial frequency and shows a gradation of spatial frequency across the image. For a grating subtending a small visual angle, the difference in mean spatial frequency of the images is the main effect and is known as a dif-frequency disparity. The dashed lines are the visual axes.
A textured surface slanted about a vertical axis produces a pattern of horizontal point disparities. If the eyes remain converged on the near edge of the surface, the horizontal disparity of corresponding texture elements is zero at that position and increases linearly across the slanted surface to the far edge. This is a cumulative horizontal disparity. A slanted surface covered with a regular periodic pattern, such as a vertical grating, produces images in which similar pattern elements periodically fall on corresponding retinal regions. The result is a periodic modulation (beat) of horizontal disparity rather than a steady accumulation of disparity across the whole surface.

Consider a slanted surface covered with a pattern of evenly and closely spaced elements interspersed with a distinct set of similar widely spaced elements, as in Figure 19.12. The closely spaced elements show a disparity modulation, and the widely spaced elements a cumulative disparity. In other words, high spatial-frequency components signify slant only in each local region, whereas low spatial-frequency components signify slant over a larger region. In more eccentric regions of the retina only large disparities between coarse surface features can be detected. Note that the slant is with respect to the frontal plane and has to be scaled with distance. Evidence that dif-frequency disparity is used in the perception of slant is discussed in Section 20.2.1. Evidence for the existence of specialized dif-frequency detectors was discussed in Section 11.6.1.

Figure 19.12. Beat patterns of horizontal disparity. The images of the short lines come into binocular correspondence every four lines. The row of fused images tends to break up into slanted segments like a Venetian blind. The images of the long lines have the same relative size disparity as the short lines but are perceived as one slanted surface because they do not come into correspondence more than once.

19.3 OTHER TYPES OF DISPARITY

19.3.1 ORIENTATION DISPARITIES

19.3.1a Orientation Disparities on Frontal Surfaces

An orientation disparity between the images of a line can be generated by torsional misalignment of the eyes, by slant of the line about a vertical axis, or by inclination of the line about a horizontal axis. A single orientation disparity is therefore ambiguous. However, we will see that these ambiguities may be resolved by patterns of orientation disparities produced by lines on a surface. On a frontal surface, orientation disparities of horizontal line elements are zero along the horizon but increase above and below the horizon, as shown in Figure 19.2. These disparities are of opposite sign above and below the horizon and could signal the absolute distance of the surface. Overall magnification of one image with respect to the other, as created by a surface normal to an eccentric cyclopean line of sight, has no effect on orientation disparities. The orientation disparity of all radial lines on any cyclopean-normal surface is zero, whatever the distance of the surface from the observer.

If corresponding vertical meridians in the two eyes were parallel, a vertical line would have zero orientation disparity. However, when the horizontal meridians are parallel, corresponding vertical meridians are tilted top outward (declined) about 1°. Therefore, the images of a vertical line in the median plane have an orientation disparity of 2°. Theoretically, this disparity should cause the line to appear inclined top away by an amount that varies with viewing distance, as explained in Section 14.7. However, frontal surfaces appear frontal, which means that the visual system makes allowance for these orientation disparities.

19.3.1b Orientation Disparities on Slanted and Inclined Surfaces

Vertical lines on a slanted surface have zero orientation disparities and so do horizontal lines close to the horizontal meridian, assuming that the eyes are torsionally aligned. Slant creates orientation disparities for lines in other orientations. The function relating orientation disparity, θ, to line orientation, φ, is approximately θ ∝ sin 2φ, as shown in Figure 19.13. Thus, orientation disparities on the two diagonals have maximum magnitude and opposite sign.

Figure 19.13. Orientation disparity and line orientation. Orientation disparity of line elements on a small surface inclined or slanted 17°, as a function of their orientation. On an inclined surface, vertical lines produce the largest disparity. On a slanted surface, 45° lines produce the largest disparity. The largest orientation disparity on a slanted surface is just half the largest on an inclined surface. [Axes: orientation disparity (deg, 0 to 2.0) against orientation of line elements (deg, 0 to 90); one curve for a surface inclined about a horizontal axis and one for a surface slanted about a vertical axis.]
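The dependence of orientation disparity on line orientation summarized in Figure 19.13 can be explored with a small numerical sketch. It rests on a simplifying assumption that is not stated in this form in the text: for a small central surface, slant about a vertical axis is modeled as a horizontal magnification of one image relative to the other, and inclination about a horizontal axis as a horizontal shear of one image relative to the other, both of the same magnitude. The function names and the value 0.03 are illustrative.

```python
import math

def image_orientation(phi_deg, matrix):
    """Orientation (deg) of a line element after a 2x2 linear image transformation."""
    phi = math.radians(phi_deg)
    x, y = math.cos(phi), math.sin(phi)
    (a, b), (c, d) = matrix
    return math.degrees(math.atan2(c * x + d * y, a * x + b * y))

def orientation_disparity(phi_deg, matrix):
    """Orientation of the transformed element minus that of the original (deg)."""
    return image_orientation(phi_deg, matrix) - phi_deg

MAGNITUDE = 0.03  # assumed relative size or shear difference between the images

# Assumed model of slant: one image is wider than the other.
SLANT = [[1.0 + MAGNITUDE, 0.0], [0.0, 1.0]]
# Assumed model of inclination: one image is sheared horizontally.
INCLINATION = [[1.0, MAGNITUDE], [0.0, 1.0]]

if __name__ == "__main__":
    for phi in range(0, 91, 15):
        slant_disp = orientation_disparity(phi, SLANT)
        incl_disp = orientation_disparity(phi, INCLINATION)
        print(phi, round(slant_disp, 3), round(incl_disp, 3))
```

Under these assumptions the slant disparity is zero for horizontal and vertical elements and largest near 45°, the inclination disparity grows from zero for horizontal elements to a maximum for vertical elements, and the largest slant disparity is about half the largest inclination disparity.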
The spatial pattern of orientation disparities created by a slanted surface is similar to the spatial pattern of polar disparities described in Section 19.3.4.

The images of horizontal lines on a frontal or inclined surface have a vertical gradient of orientation disparity with one sign above the horizon, and with the opposite sign below the horizon. When a surface is inclined top away, the images of vertical lines in the left eye tilt to the left, and those in the right eye tilt to the right, as shown in Figure 19.2B. For an inclined surface, the function relating orientation disparity, θ, to the orientation of line elements to the horizontal, φ, is approximately θ ∝ sin² φ. Thus, the images of vertical lines are sheared with respect to the images of horizontal lines.

A line in the median plane inclined at angle i with respect to vertical produces images with an orientation disparity θ. If a is the interocular distance and d is the viewing distance, then for small values of θ

\[ \theta = \frac{a}{d}\tan i \quad \text{or} \quad i = \arctan\frac{\theta d}{a} \quad (\theta \text{ in radians}) \tag{12} \]

as shown in Figure 14.32. Figure 19.14 contains a family of curves showing orientation disparity (θ) for different values of i and d, for an interocular distance of 6.5 cm. Estimation of inclination from this pattern of disparities requires a nonlinear scaling by viewing distance. This assumes that corresponding vertical meridians are parallel. However, allowance must be made for the declination of corresponding vertical meridians. The issue is complicated further by the fact that cyclovergence accompanies horizontal vergence and also changes with elevation of gaze (Section 10.7.4). Effects of these factors are discussed later.

Cyclovergence can null the orientation disparity of vertical lines on an inclined surface, since the disparity is the same over the whole surface. However, this would introduce an equivalent horizontal orientation disparity. Cyclovergence cannot null the orientation disparity of horizontal lines on an inclined surface, because the disparity is not constant and is opposite in sign above and below the horizon plane. Differential patterns of orientation disparity are preserved over changes in vergence. These patterns contain information for slant and inclination.

An orientation disparity between images along the horizontal meridian can be due only to misalignment of the eyes. These disparities do not contain information about the world. It would therefore be a good strategy for the visual system to allow orientation disparities along the horizontal meridian of the visual field to evoke cyclovergence. Once these cyclodisparities are nulled, any residual orientation disparities can be used to code depth. This strategy has the added advantage that it brings the residual disparities within range of finer disparity detectors. Cyclovergence is indeed evoked more effectively by cyclorotation of horizontals than of verticals (see Section 10.7.5a). This could also be true of the neural image-linking process, which could initially link the images of horizontal lines and edges along the central horizontal meridians before linking corresponding images in other orientations.

Orientation disparities could be detected by cells sensitive to gradients of point disparities or by cells with elongated receptive fields tuned to orientation disparity. The evidence for cells tuned to orientation disparities is reviewed in Sections 11.6.2 and 20.3.1.

Fantoni (2008) has described how the distribution of local orientation disparities over a small surface, or a large surface in orthographic projection, could be used to recover surface slant and inclination.

Figure 19.14. Orientation disparity and inclination. Orientation disparity in the images of a line inclined top away at various angles and at various distances in the median plane, for an interocular distance of 6.5 cm. It is assumed that there is no cyclovergence. [Axes: orientation disparity (deg) against inclination of the line to the frontal plane (0° to 90°); curves for viewing distances from 5 cm to 160 cm.]

19.3.2 ANGULAR DISPARITY
For lines on a slanted or inclined surface, the angle between the images of a pair of lines in one eye differs from the angle between the corresponding pair of lines in the other eye. This is an angular disparity. For a slanted surface, angular disparity is at a maximum for pairs of lines close to +45° and −45°, and is zero for lines close to 0° and 90°, as shown in Figure 19.15. For an inclined surface, angular disparity is at a maximum for pairs of lines close to 0° and 90°, and is zero for lines close to +45° and −45°. Thus, the patterns of angular disparity created by lines on an inclined surface and those created by lines on a slanted surface are complementary. Therefore, the way angular disparities of orthogonal line elements vary as a function of their absolute orientations provides distinct signatures for slanted and inclined surfaces.
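The complementary patterns of angular disparity for slanted and inclined surfaces can be illustrated numerically. The sketch below is a simplified model rather than the construction used for Figure 19.15: it assumes that slant about a vertical axis acts as a relative horizontal magnification of one image and that inclination about a horizontal axis acts as a relative horizontal shear. The function names and the magnitude 0.03 are illustrative.

```python
import math

def image_orientation(phi_deg, matrix):
    """Orientation (deg) of a line element after a 2x2 linear image transformation."""
    phi = math.radians(phi_deg)
    x, y = math.cos(phi), math.sin(phi)
    (a, b), (c, d) = matrix
    return math.degrees(math.atan2(c * x + d * y, a * x + b * y))

def angular_disparity(phi_deg, matrix):
    """Change (deg) in the angle between a line at phi and its partner at phi + 90.

    The untransformed image contains a 90 deg angle, so the result is the
    transformed angle minus 90 deg.
    """
    first = image_orientation(phi_deg, matrix)
    second = image_orientation(phi_deg + 90.0, matrix)
    return ((second - first) % 360.0) - 90.0

MAGNITUDE = 0.03  # assumed relative size or shear difference between the images
SLANT = [[1.0 + MAGNITUDE, 0.0], [0.0, 1.0]]   # assumed model of slant
INCLINATION = [[1.0, MAGNITUDE], [0.0, 1.0]]   # assumed model of inclination

if __name__ == "__main__":
    for phi in (-45.0, 0.0, 45.0):
        print(phi,
              round(angular_disparity(phi, SLANT), 3),
              round(angular_disparity(phi, INCLINATION), 3))
```

In this model the slant case gives its largest angular disparity for the ±45° pair and none for the 0°/90° pair, whereas the inclination case gives the reverse, in line with the complementary signatures described above.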
Figure 19.15. Angular disparity and relative line orientation. In the detection of angular disparity, the angle between pairs of lines in each eye’s image is first registered. Angular disparity is the difference between these separate monocular angles. For inclined surfaces, angular disparities between the images of orthogonal lines are maximal when the lines are oriented at 0° and 90° and zero when they are oriented at ±45°. For slanted surfaces, angular disparities are maximal for ±45° lines and zero for 0° and 90° lines. [Axes: angular disparity (deg, 0 to 2.0) against orientation of orthogonal line elements (±45°, 0°/90°, ±45°); one curve for inclination about a horizontal axis and one for slant about a vertical axis.]

Figure 19.16. Orientation disparity on an inclined surface. On an inclined surface, the orientation disparity of horizontal lines is the same as on a frontal surface. A vertical line produces a disparity of the same sign and magnitude at all locations. An oblique line produces a disparity of half the magnitude of that produced by a vertical line.

Figure 19.17. Orientation disparity on a slanted surface. On a slanted surface, a vertical line does not produce orientation disparity. A central horizontal line produces no disparity, while those above and below the center produce disparities of opposite sign. Oblique lines produce orientation disparities of opposite sign. Solid lines represent the right eye’s images. Dashed lines represent the left eye’s images.
Koenderink (1985, 1986) suggested that orientation disparities of vertical line elements, but not of horizontal elements, are the signature of an inclined surface (Figure 19.16). Orientation disparities of one sign produced by +45° line elements and of the opposite sign produced by −45° line elements are the signature of a slanted surface (Figure 19.17). Orientation disparities of the same sign and magnitude for all line elements indicate torsional misalignment of the eyes.

19.3.3 DEFORMATION DISPARITIES
The most useful pattern of disparities created by a slanted surface would be one unique to slanted surfaces and invariant to location of the surface in the visual field, and to the direction of gaze. The characteristic that comes closest is deformation disparity, which is defined as the deformation needed to map one eye’s image onto the other. The location invariance of deformation disparity arises because deformation is immune to overall size differences between the images of eccentric surfaces.

Koenderink and van Doorn (1976) applied a useful decomposition of first-order disparity fields. Instead of describing the four horizontal and vertical disparity gradients in the horizontal and vertical directions, they described the disparity field in terms of four differential components. These are: expansion, or dilatation; curl, or rotation; and two components of deformation, or shear, as in Figure 19.18.

Figure 19.18. Differential transformations. Koenderink (1985) pointed out that the differential characteristics of (1) the optic flow field over time and (2) the disparity field (the mapping of one eye’s image on to the other) can be fully described in terms of these transformations. [Panels: expansion, rotation, shear 1, shear 2.]

These patterns of disparity can also be expressed in terms of disparity gradients. The size disparities can be specified by the ratio of ∂η/∂φh to ∂η/∂φv. The shear disparities can be specified by the ratio of ∂η/∂φv to ∂η/∂φh. We saw in Section 19.2.3 that:

\[ \text{HSR} = \frac{\partial \alpha}{\partial \phi_h} + 1 \quad \text{and} \quad \text{VSR} = \frac{\partial \beta}{\partial \phi_v} + 1 \]

where ∂α/∂φh is the gradient of horizontal disparity in a horizontal direction and ∂β/∂φv is the gradient of vertical disparity in a vertical direction. Disparity-gradient fields and differential components are mathematically equivalent descriptions. One can be transformed into the other by the appropriate algebra. Deformation disparities may also be defined in terms of differences between horizontal disparities or between vertical disparities, rather than in terms of ratios.

The following terms will be used to describe first-order disparity gradients over a flat textured surface:

Horizontal-size disparity is a shear component due to a width difference between the images.
Vertical-size disparity is a shear component due to a height difference between the images.
Size-deformation disparity is the difference between horizontal- and vertical-size disparities.
Overall-size disparity is the expansion component due to one image being larger than the other. It represents equal horizontal-size and vertical-size disparities of the same sign.
Horizontal-shear disparity is a shear component due to relative shearing of images in a horizontal direction.
Vertical-shear disparity is a shear component due to relative shearing of images in a vertical direction.
Shear-deformation disparity is the difference between horizontal-shear and vertical-shear disparities.
Rotation disparity is the curl, or rotation, component due to rotation of one image relative to the other. It represents equal horizontal- and vertical-shear disparities of the same sign.

Representing surface slant by size-deformation disparity and surface inclination by shear-deformation disparity has the following advantages.

1. Immunity to overall disparity. Like all first-order disparities, deformation disparities are immune to overall horizontal or vertical disparity, as might be created by horizontal or vertical misalignments of the eyes or the cameras.

2. Immunity to effects of eccentric viewing. Deformation disparities are immune to magnification of one image relative to the other, because a magnification has no deformation. The invariance to dilatation is useful for discounting the effects of eccentric viewing. A surface that is eccentric with respect to the head is closer to one eye than to the other, whatever the angle of gaze. Its image is therefore larger in one eye than in the other. An eccentric surface close to the horizontal plane of regard and normal to the cyclopean line of sight is uniformly larger in one eye. Consequently, there is no component of deformation. This is also true if the eccentric surface lies in an elevated direction, but only if azimuth and elevation angles are measured in the direction of the elevated surface. It follows that a surface with zero deformation disparity must be normal to the cyclopean direction. Any local dilation disparity is part of an overall gradient of dilation disparity across the whole binocular field. Thus, it need not be corrected locally; a single correction could be applied to all surfaces in the visual field.

3. Immunity to aniseikonia. Use of deformation disparities renders the visual system immune to the effects of dilation disparity produced by aniseikonia.

4. Immunity to relative rotation of the images. Deformation disparities are immune to rotation of one image relative to the other, because rotation has no deformation. Rotation disparity does not reflect 3-D features of the world. Overall rotation disparity is due only to torsional misalignment of the eyes, and may be eliminated by cyclovergence.

5. Detection of slant to the cyclopean direction. Size-deformation disparity captures information about the slant of a surface with respect to the cyclopean direction wherever the surface lies in the visual field and wherever the eyes are pointing. This is a very useful invariance that any visual system might exploit.
In contrast, the gradient of horizontal disparity in a horizontal direction (∂α/∂φh) specifies the slant of the surface (scaled by distance) with respect to an isodisparity circle rather than with respect to the cyclopean direction. Additional information about the eccentricity of the surface and its viewing distance is needed to correct the slant estimate from the horizontal gradient alone. This fact is relevant to explanations of the induced effect (Section 20.2.3).

6. Direct scaling for distance. A horizontal disparity does not specify the depth interval between two points because the relative disparity is inversely related to the square of viewing distance (Section 14.2.3). Viewing distance also affects disparity gradients, including the two components of deformation. Consider two points on a surface slanted 45° and close to the median plane (Figure 19.19). Doubling the distance of the surface reduces the disparity between the points to one quarter but the angular separation of the points is halved. To a first approximation, the disparity gradient is an inverse function of distance rather than of distance squared. The same is true of deformation disparities. Consequently, deformation disparities need to be scaled by absolute distance to derive the slant or inclination of a surface. This issue is discussed in Section 20.6.4.

Figure 19.19. Disparity curvature. (A) Disparity difference between two points is the difference in their binocular subtenses (θ1 − θ2). It is inversely proportional to the square of viewing distance. (B) Disparity gradient of two points is the difference in disparity between two points divided by their separation (w). It is inversely proportional to viewing distance. (C) Disparity curvature is the difference in disparity gradient of two surface patches divided by their separation. It remains constant with changes in viewing distance.
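The decomposition of a first-order disparity field into expansion, rotation, and two deformation components can be sketched in a few lines. The example assumes that the field is summarized by a 2 × 2 matrix of disparity gradients and uses one common sign convention. The function name, dictionary keys, and sample values are illustrative and are not the notation of Koenderink and van Doorn.

```python
def decompose(gradient):
    """Split a 2x2 disparity-gradient matrix into differential components.

    gradient = [[hx, hy], [vx, vy]], where hx and hy are the gradients of
    horizontal disparity in the horizontal and vertical directions, and
    vx and vy are the corresponding gradients of vertical disparity.
    """
    (hx, hy), (vx, vy) = gradient
    return {
        "expansion": hx + vy,          # overall-size (dilatation) component
        "rotation": vx - hy,           # curl component
        "size_deformation": hx - vy,   # horizontal-size minus vertical-size
        # With shears signed by the direction in which they rotate lines,
        # hy + vx is the difference between horizontal and vertical shear.
        "shear_deformation": hy + vx,
    }

def add(first, second):
    """Element-wise sum of two 2x2 gradient matrices."""
    return [[first[r][c] + second[r][c] for c in range(2)] for r in range(2)]

# Illustrative first-order fields (assumed values).
MAGNIFICATION = [[0.05, 0.0], [0.0, 0.05]]   # one image uniformly larger
ROTATION = [[0.0, -0.02], [0.02, 0.0]]       # one image rotated slightly
HORIZONTAL_SIZE = [[0.04, 0.0], [0.0, 0.0]]  # width difference only
HORIZONTAL_SHEAR = [[0.0, 0.03], [0.0, 0.0]] # horizontal shear only

if __name__ == "__main__":
    for name, field in (("magnification", MAGNIFICATION), ("rotation", ROTATION),
                        ("horizontal size", HORIZONTAL_SIZE),
                        ("horizontal shear", HORIZONTAL_SHEAR)):
        print(name, decompose(field))
    # Adding magnification and rotation leaves both deformation terms unchanged.
    combined = add(add(HORIZONTAL_SIZE, MAGNIFICATION), ROTATION)
    print("combined", decompose(combined))
```

In this sketch a uniform magnification and a small rotation each contribute nothing to either deformation component, which is the sense in which deformation disparities are immune to eccentric viewing, aniseikonia, and torsional misalignment.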
19.3.4 POLAR DISPARITY

Disparity produced by a point can be represented by the polar coordinates of meridional direction (φ) and angle of radial eccentricity (θ). The pattern of polar disparities created by a slanted surface is shown in Figure 19.20. The pattern created by an inclined surface is shown in Figure 19.21. Local polar disparities provide no information about slant or inclination. However, Liu et al. (1994a) claimed that the direction and magnitude of slant or inclination could be derived from the radial patterns of polar disparities. Like other types of disparity, the disparity pattern would have to be scaled by viewing distance. This interesting possibility presents the following problems, as Liu et al. acknowledged.

1. The pattern of polar disparities is not invariant over changes in direction of gaze, including small vergence changes. Polar direction disparities indicate the slant and inclination of a surface patch that is fixated. With fixation, the pattern of polar direction disparities created by a slanting surface resembles the pattern of orientation disparities. But there is an important difference. Polar disparities are expressed in terms of disparities in the directions of points with respect to the foveas. Line elements used to specify orientation disparity need not pass through the origin.

2. Polar disparities are necessarily zero along the horizontal meridian (assuming the eyes are torsionally aligned). Therefore, it would be impossible to determine the slant of a row of dots along the horizontal meridian if polar direction disparities were the only information available.

3. Absolute polar disparities are affected by torsional misalignment of the eyes.
Figure 19.20. Polar angle disparity map of a slanted surface. Grayscale values indicate polar angle disparities of points on a surface slanted 10° away to the right (A) or to the left (B) as a function of their location in a frontal plane. The surface subtended 40 × 40 cm at a distance of 40 cm. (Reprinted from Liu et al. 1994a, with permission of Elsevier Science)
Figure 19.21. Polar angle disparity map of an inclined surface. Grayscale values indicate polar angle disparities of points on a surface inclined 10° top away (A) or 10° top toward (B) as a function of their location in a frontal plane. The surface subtended 40 × 40 cm at a distance of 40 cm. (Reprinted from Liu et al. 1994a, with permission from Elsevier)
19.3.5 HEADCENTRIC DISPARITY
It is generally assumed that stereoscopic depth is coded in terms of retinal disparities. However, Zhang et al. (2010) have produced evidence that depth can also be coded in terms of headcentric disparities. They exploited the fact that a point flashed within 30 ms of the start of a saccade appears displaced in the direction of the saccade. To one eye they presented a 1° square less than 30 ms before a saccade and to the other eye a similar square in the same location 80 ms before the saccade. Although the flashed squares had zero retinal disparity they had a crossed or uncrossed headcentric disparity depending on the direction of the saccade. Subjects reported that the test square with headcentric disparity appeared in depth relative to a pair of synchronously presented comparison squares just above the test square. It is worth considering whether this effect is related to depth produced by temporal disparities, as discussed in Section 23.3.

19.4 LINEAR DISPARITY GRADIENTS

A first-order spatial change of disparity is a linear disparity gradient (Burt and Julesz 1980). Marks on a horizontal slanted line generate a horizontal gradient of horizontal disparity. Marks on an inclined line in the median plane generate a vertical gradient of horizontal disparity. A disparity gradient, G, is the difference in binocular disparity, η, between the images of two points divided by the difference, θ, between the mean direction of the images of one point and the mean direction of the images of the other point, as shown in Figure 19.22A. Thus, G = η/θ.
The relative horizontal disparity, $\eta_h$, between the images of two horizontally separated points is:

$$\eta_h = (\alpha_{1L} - \alpha_{2L}) - (\alpha_{1R} - \alpha_{2R})$$

where α is the azimuth angle of the images of point 1 and point 2 in the left (L) and right (R) eyes. The average horizontal separation of the images ($\theta_h$) is:

$$\theta_h = \frac{(\alpha_{1L} - \alpha_{2L}) + (\alpha_{1R} - \alpha_{2R})}{2}$$

Therefore, the horizontal disparity gradient is:

$$\frac{\eta_h}{\theta_h} = \frac{(\alpha_{1L} - \alpha_{2L}) - (\alpha_{1R} - \alpha_{2R})}{\left[(\alpha_{1L} - \alpha_{2L}) + (\alpha_{1R} - \alpha_{2R})\right]/2} \tag{13}$$
A disparity gradient can be a gradient of horizontal disparity in a horizontal direction, $\eta_h/\theta_h$, or in a vertical direction, $\eta_h/\theta_v$, or a gradient of vertical disparity in a horizontal direction, $\eta_v/\theta_h$, or in a vertical direction, $\eta_v/\theta_v$. It can also be in any intermediate direction. A horizontal gradient of horizontal disparity at a given distance specifies the slant of a surface with respect to the tangent to a locus of isodisparity in the form of a vertical cylinder (see Figure 14.22b). The distance of the surface and its headcentric eccentricity are needed to derive slant with respect to the cyclopean direction. A vertical gradient of horizontal disparity specifies the inclination of a surface with respect to the vertical horopter, at a specified distance.
Figure 19.22. Disparity gradients. A disparity gradient is the disparity difference, η, between the images of two objects divided by the angle θ between the mean angular direction of the images of one object and the mean direction of the images of the other object. The red object is fixated. The blue object has disparity η, and a mean direction shown by vertical dotted lines. (A) Fixated object and a second object: disparity gradient = η/θ. (B) Objects on a visual line of an eye: disparity gradient = 2. (C) Objects on the median plane: θ = 0, infinite disparity gradient.
The disparity gradient produced by points in a plane containing the interocular axis is their relative horizontal disparity divided by their horizontal separation. The disparity gradient produced by points in a vertical plane is their relative horizontal disparity divided by their vertical separation. All points on the space horopter or any other isodisparity locus have a zero-disparity gradient. Points lying along a visual line of one eye have a horizontal disparity gradient of 2, as shown in Figure 19.22B (Burt and Julesz 1980; Trivedi and Lloyd 1985). Two objects with a horizontal disparity gradient of 2 correspond to Panum’s limiting case. They form one image in one eye and two images in the other eye (Section 17.6).
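The definition in equation (13) can be checked numerically. The following is a minimal Python sketch, not taken from the text: it assumes a plan-view coordinate frame (x to the right, z straight ahead, eyes on the x-axis), an interocular distance of 6.5 cm, and hypothetical function names. It computes the horizontal disparity gradient of two points from their azimuths in each eye.

```python
import math


def azimuth(point, eye_x):
    """Azimuth (radians) of a plan-view point (x, z) seen from an eye at (eye_x, 0)."""
    x, z = point
    return math.atan2(x - eye_x, z)


def horizontal_disparity_gradient(p1, p2, interocular=0.065):
    """Horizontal disparity gradient of two points, following equation (13).

    The interocular distance (metres) is an assumed value.
    """
    half = interocular / 2.0
    sep_left = azimuth(p1, -half) - azimuth(p2, -half)   # separation in the left eye
    sep_right = azimuth(p1, +half) - azimuth(p2, +half)  # separation in the right eye
    eta = sep_left - sep_right            # relative horizontal disparity
    theta = (sep_left + sep_right) / 2.0  # mean separation of the images
    if abs(theta) < 1e-12:
        return math.inf                   # e.g. two points in the median plane
    return eta / theta


half = 0.0325
# Two points on one visual line of the left eye (Panum's limiting case).
on_visual_line = horizontal_disparity_gradient((-half + 0.05, 0.5), (-half + 0.08, 0.8))
# Two points in the median plane of the head.
in_median_plane = horizontal_disparity_gradient((0.0, 0.5), (0.0, 0.8))
print(abs(on_visual_line), in_median_plane)
```

Under these assumptions the first pair gives a gradient of magnitude 2 and the second an infinite gradient, in line with Figure 19.22B and C. The sign of the gradient depends on the arbitrary choice of which point is labeled 1.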
Points between corresponding visual lines produce a gradient greater than 2. The horizontal disparity gradient cannot exceed 2 for points on an opaque slanted surface because, beyond this value, the surface is invisible to one eye. Distinct objects at different distances can produce disparity gradients greater than 2. The gradient limit does not apply to points on an opaque inclined surface, since all points remain visible to both eyes until the surface becomes aligned with the visual axes. The vertical gradient of horizontal disparity along such a surface approaches infinity. Points lying in the median plane of the head produce a disparity gradient of infinity, as shown in Figure 19.22C. The images of the fixated point have the same mean direction as the images of the nonfixated point, but the images of the nonfixated point occur on opposite sides of the fovea. So the separation between the two mean positions is zero. We saw in Section 12.1.3 that dichoptic images do not fuse when the disparity gradient between them and a neighboring fused pair of images exceeds a value of about 1. Nevertheless, unfused images can code relative depth. The disparity gradient imposes a limit on binocular fusion, not on stereoscopic vision. A set of points in space for which the disparity between successive points is a constant proportion of the lateral separation of their images is a linear disparity ramp. A disparity ramp has a disparity gradient, as defined earlier; a depth, which is the difference in disparity between the near and far edges of the ramp; and a lateral extent, which is the angular subtense between the mean position of the images of the near edge and the mean position of the images of the far edge of the ramp. A disparity ramp also has an image density, which is the number of points for each 1° of visual angle (spatial frequency), and a surface density, which is the number of points per visual angle when the display is viewed in a frontal plane. 
A disparity ramp also has an orientation when projected into a frontal plane; it can be horizontal, vertical, or at an oblique angle. Burt and Julesz (1980) used the term “dipole angle” to refer to the angle that a disparity ramp makes with the horizontal in a frontal plane. A constant disparity gradient is the first spatial derivative of disparity. A visual line is any straight line through the nodal point of an eye. The visual axes are the principal visual lines. Corresponding visual lines project to corresponding points in the retinas. The region between any two corresponding visual lines is an inter-visual-line region. The region between the visual axes is the inter-visual-axis region. Figure 19.22C shows that the images of two objects at different depths between the visual axes have a reversed left-right order in the two eyes. Thus, the topological continuity of corresponding images is not preserved for objects with a disparity gradient greater than 2. In general, two objects lying on any line passing through the intersection of two visual lines and contained in the space between the two
vertical planes containing those visual lines do not preserve the relative order of images in the two eyes. Such objects can have a disparity gradient of less than 2 if they do not lie in the plane of two corresponding visual lines. This rule applies to objects between any pair of corresponding visual lines, not only to objects between the visual axes. For all other pairs of objects, the image of A is to the left of the image of B in both eyes, and the relative order of the images is the same in the two eyes. Objects lying between the two vertical planes containing the visual axes have another distinctive property. When the eyes converge on one of the objects, the images of the other object fall on opposite sides of the vertical retinal meridians. Thus, images of any nonfixated object lying in this region project to opposite cerebral hemispheres (Section 5.3.4). The disparate images of any object lying outside the region between the two visual axes project to the same side of the brain.
Figure 19.23. Components of second-order disparities. Second-order disparities specify changes of disparity gradient over visual angle. The upper changes are known as hinges. The lower changes are known as twists. Panel titles: change of slant in a horizontal direction; change of slant in a vertical direction; change of inclination in a horizontal direction; change of inclination in a vertical direction.

19.5 HIGHER-ORDER DISPARITIES
Second-order spatial derivatives of disparity describe a change of disparity gradient over visual angle. They are created by surfaces curved in depth and may be called disparity curvatures. Changes in surface curvature over a surface give rise to third-order spatial derivatives of disparity. There are four second-order partial derivatives of the horizontal disparity field. These are: $\partial^2\eta_h/\partial\theta_h^2$, or a change of a horizontal disparity gradient in a horizontal direction; $\partial^2\eta_h/\partial\theta_h\,\partial\theta_v$, or a vertical change of a horizontal disparity gradient that changes in a horizontal direction; $\partial^2\eta_h/\partial\theta_v^2$, or a change of a horizontal disparity gradient in a vertical direction; and $\partial^2\eta_h/\partial\theta_v\,\partial\theta_h$, or a horizontal change of a horizontal disparity gradient that changes in a vertical direction. The four partial derivatives are best illustrated diagrammatically in terms of the surfaces that would generate them, as in Figure 19.23. Similarly, there are four second-order partial derivatives of the vertical disparity field.

There are several advantages in describing disparities in terms of second spatial derivatives. Like disparity gradients, second spatial derivatives are not affected by horizontal or vertical misalignments of the eyes or by small torsional misalignments of the eyes. This is a useful feature that any visual system could potentially exploit. Moreover, the local value of a second spatial derivative remains roughly constant with changes in viewing distance. This can be seen most easily by considering the plan view of a surface curved about a vertical axis, as shown in Figure 19.19. The disparity gradients of the two sections of the surface are both approximately halved with a doubling of the viewing distance, as shown previously, but the distance between the two sections is also halved, making the change of disparity gradient roughly constant. This means that, unlike zero- and first-order disparities, second-order
disparities do not have to be scaled by viewing distance (Rogers and Cagenello 1989). However, this is true only for surfaces close to the median plane of the head. In general, second-order disparities have to be scaled by some estimate of headcentric eccentricity, as we will see in Section 20.6.5.
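The different distance scaling of first- and second-order disparities can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a plan view with the eyes on the x-axis, an interocular distance of 6.5 cm, a slanted plane and a parabolic surface near the median plane, and hypothetical function names. It uses the angle that the interocular baseline subtends at each surface point, which differs from absolute disparity only by a constant and a sign, and differentiates it numerically with respect to cyclopean azimuth.

```python
import numpy as np


def vergence_profile(depth_fn, x, interocular=0.065):
    """Cyclopean azimuth and binocular subtense of plan-view points (x, depth_fn(x))."""
    z = depth_fn(x)
    half = interocular / 2.0
    azimuth = np.arctan2(x, z)
    # Angle subtended by the interocular baseline at each point; it differs
    # from absolute disparity only by a constant and a sign.
    subtense = np.arctan2(x + half, z) - np.arctan2(x - half, z)
    return azimuth, subtense


def central_derivatives(depth_fn, half_width=0.02, n=401):
    """First and second derivatives of subtense with respect to azimuth at x = 0."""
    x = np.linspace(-half_width, half_width, n)
    azimuth, subtense = vergence_profile(depth_fn, x)
    first = np.gradient(subtense, azimuth)   # disparity gradient
    second = np.gradient(first, azimuth)     # disparity curvature
    mid = n // 2
    return first[mid], second[mid]


def slanted_plane(distance, slope=0.5):
    """Plane slanted about a vertical axis (assumed slope dz/dx)."""
    return lambda x: distance + slope * x


def curved_surface(distance, curvature=20.0):
    """Surface curved about a vertical axis (assumed curvature in 1/m)."""
    return lambda x: distance + 0.5 * curvature * x**2


g_near, _ = central_derivatives(slanted_plane(0.5))
g_far, _ = central_derivatives(slanted_plane(1.0))
_, c_near = central_derivatives(curved_surface(0.5))
_, c_far = central_derivatives(curved_surface(1.0))

gradient_ratio = g_near / g_far      # first-order term: near / far
curvature_ratio = c_near / c_far     # second-order term: near / far
print(gradient_ratio, curvature_ratio)
```

For these assumed values, doubling the viewing distance roughly halves the disparity gradient of the slanted plane, whereas the second derivative for the curved surface changes by much less than a factor of two. The residual change is expected, because a frontal plane is not itself an isodisparity locus.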
19.6 VERTICAL DISPARITY

Since the eyes are separated horizontally, the principal disparities are horizontal (parallel to the interocular axis). Vertical disparities may arise from eye misalignment, anisometropia, and from the structure of the visual scene. Indeed, the images of most natural scenes contain vertical differences. The purpose of this section is to outline the circumstances under which vertical disparities arise and to determine the information they contain. Physiological evidence for the existence of binocular cells sensitive to vertical disparity was reviewed in Section 11.4.4.

19.6.1 VERTICAL DISPARITIES DUE TO THE VISUAL SYSTEM
Vertical disparities can arise from any of the following causes intrinsic to the visual system. We assume that the eyes remain on a horizontal plane with respect to a vertical head.

1. Vertical misalignment of the eyes. When the angles of vertical gaze of the two eyes are not equal, the visual axes do not intersect and do not lie in the same plane. A vertical disparity is imposed on the images of all points in the binocular field. On a flat image plane, the imposed vertical disparity is approximately the same over the whole visual field. With spherical retinas, vertical disparity arising from vertical eye misalignment decreases with increasing eccentricity when image elevation is specified in terms of horizontal lines of longitude. This pattern of disparities evokes vertical vergence, which eliminates disparities due to misalignment of vertical gaze but leaves any residual vertical disparities arising from the structure of the visual scene.

2. Torsional misalignment of the eyes. This creates an overall cyclodisparity that contains a constant gradient of vertical disparity along all horizontal meridians. This evokes compensatory cyclovergence, which normally eliminates the disparity (Howard and Zacher 1991; Rogers and Howard 1991; De Bruyn et al. 1992). However, there may be a residual cyclotropia.

3. Anisometropia. The image in one eye may be magnified more than that in the other eye, a condition known as anisometropia (Section 9.9.1). This creates an overall vertical-size disparity that cannot be eliminated by any type of vergence.

Vertical disparities arising from the above causes provide no information about the visual world. They are best eliminated by eye movements or compensated for in the visual system so that vertical disparities arising from the scene may be processed more effectively.

19.6.2 VERTICAL DISPARITIES FROM THE VISUAL SCENE

Vertical disparities arise whenever an object subtends a larger vertical angle to one eye than to the other eye because it is nearer to one eye than to the other. This occurs for objects outside the median plane of the head, as shown in Figure 19.24. Objects in the median plane do not produce vertical disparities because they are an equal distance from the two eyes. Also, objects at eye level do not produce vertical disparities. With flat image planes, vertical disparities arising from the visual scene are easily specified with respect to Cartesian coordinates. However, with spherical retinas, vertical disparities depend on the axis system used to measure them (Section 14.3.1). Two image points in an eye may have the same elevation in one axis system but different elevations in another axis system. Although retinal points can be mapped from one axis system to another, the system in use should be specified to avoid confusion. When the positions of the eyes are not known, the vertical disparity produced by a single point provides no useful information, as we will see in the next section. However, the relative vertical disparities of two or more points provide information about the headcentric eccentricity of those points at a given distance.

The absolute vertical disparity in the images of a point is the angular elevation of the image in one eye minus the elevation of the image in the other eye. Elevation is specified with respect to the horizontal plane of regard. Figure 19.3A shows that vertical disparities produced by an extended surface arise from differential perspective, or the difference in perspective from the two eyes (Rogers and Bradshaw 1993, 1993). In any axis system, all objects in the horizontal plane of regard or in the median plane of the head produce images with zero vertical disparity when the eyes are vertically and torsionally aligned (see Figure 14.22a).

Assume that elevation is specified by lines of latitude, as in the gun-turret axis system. Then the vertical disparity produced by a point at a given eccentricity increases with increasing distance of the point from the horizontal plane of regard. Also, the vertical disparity increases from zero as the point moves from the median plane along a locus of constant elevation, as shown in Figures 19.24 and 19.25. The sign of disparity on one side of the median plane differs from that on the other side. Consequently, vertical disparities are greatest in the oblique quadrants of the visual field. Finally, the vertical disparity produced by a point decreases as the point moves along a cyclopean line of sight, and becomes zero at infinity. With elevation specified by lines of longitude, these changes in vertical disparity are independent of the horizontal convergence of the eyes (see Figure 19.25).

Figure 19.24. Vertical disparity over the visual field. Theoretical vertical disparity of point P with respect to fixation point, F, as a function of position over a frontal surface. Viewing distance, D, is 57 cm, and the interpupil distance is 6 cm. Axes: azimuth and elevation (0° to 40°) and vertical disparity (−2° to 2°). (Adapted from Gonzalez et al. 1993b)
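The qualitative pattern described in this section can be reproduced with a short Python sketch. It is an illustration only, under assumptions that are not from the text: elevation is taken as the angle of a point above the plane of regard as seen from each eye (in the spirit of the gun-turret convention), the eyes lie on the x-axis 6 cm apart, the frontal surface is at 57 cm as in Figure 19.24, and the function names are hypothetical.

```python
import math


def elevation(point, eye_x):
    """Elevation (radians) of point (x, y, z) above the plane of regard, seen from an eye at (eye_x, 0, 0)."""
    x, y, z = point
    return math.atan2(y, math.hypot(x - eye_x, z))


def vertical_disparity(point, interocular=0.06):
    """Absolute vertical disparity: elevation in the left eye minus elevation in the right eye."""
    half = interocular / 2.0
    return elevation(point, -half) - elevation(point, +half)


D = 0.57  # assumed distance of the frontal surface (metres)
print(vertical_disparity((0.0, 0.2, D)))    # median plane
print(vertical_disparity((0.2, 0.0, D)))    # plane of regard
print(vertical_disparity((0.2, 0.2, D)))    # oblique quadrant, right
print(vertical_disparity((-0.2, 0.2, D)))   # oblique quadrant, left
```

With these assumptions the disparity is zero in the median plane and in the plane of regard, changes sign across the median plane, is largest in the oblique quadrants, and shrinks as a point recedes along a cyclopean line of sight.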
Figure 19.25. Relative vertical disparity of a pair of points. The relative vertical disparity of points P1 and P2 is the difference between their absolute disparities: $\beta_{1L} - \beta_{1R}$ and $\beta_{2L} - \beta_{2R}$. This is equivalent to the difference between $\beta_{1L} - \beta_{2L}$ and $\beta_{1R} - \beta_{2R}$ measured separately in the two eyes. Distances to the points from each eye are represented by dL and dR.
19.6.3 RELATIVE VERTICAL DISPARITIES AND SIZE RATIOS
The relative vertical disparity between the images of a pair of vertically separated points is the difference of their absolute vertical disparities (equations 6 and 7), as shown in Figure 19.25. Absolute vertical disparities are affected by the state of vertical and torsional alignment of the eyes. However, relative vertical disparities are not much affected by changes in vertical vergence because such changes introduce a constant vertical disparity over the central region of the binocular field. The constant disparity arising from vertical vergence is easy to distinguish from disparities arising from the structure of the scene, because the latter differ from place to place. The relative vertical disparity of a closely spaced pair of vertically separated points is not affected by small torsional misalignment of the eyes. Vertical disparities specified by horizontal lines of latitude are not affected by changes of gaze in the plane of regard. For some purposes it is useful to consider the ratio of the vertical separations of the images of two points, rather than their differences. This is the vertical size ratio or VSR.

$$\mathrm{VSR} = \frac{\beta_{1L} - \beta_{2L}}{\beta_{1R} - \beta_{2R}} \tag{14}$$
Read et al. (2009) have provided a detailed analysis of the relation between vertical disparities and the axis system used to measure them.

Figure 19.26. Vertical size ratios as a function of eccentricity and distance. Each curve connects points in the scene with a particular vertical size ratio (VSR) given by its parameter. Close to the midline, VSRs increase with increasing eccentricity (for constant distance). The VSR created by an object close to the observer and to the median plane of the head can be the same as that of an object farther away and at a greater eccentricity. Axes: distance from midline (cm) and distance from observer in median plane (cm); the curves are labeled with VSRs from 1.01 to 1.2. (Adapted from Gillam and Lawergren 1983)

The locus of all points with the same VSR in the plane of regard is shown in Figure 19.26. The description of vertical disparities in terms of VSRs is very useful (Bishop 1989, 1994). The VSR of a pair of closely spaced points depends on just one factor: the relative distance of the points from the two eyes. Therefore, VSRs are not affected by vertical misalignment of the eyes or by horizontal vergence. Also, they are not affected by the horizontal eccentricity of the
stimulus when elevation is specified by the gun-turret axis system, since this system is rotationally symmetric about the y-axis. For two closely spaced points on a surface (so that tan θ = θ in radians), the VSR equals the ratio of their inverse distances from the two eyes (dR/dL). The VSR of points on a surface is not affected by the slant of the surface, and is not much affected by the inclination of the surface (Gillam and Lawergren 1983).

Consider a short vertical line standing on the horizontal plane of regard. In general, the VSR of the images of the line increases with increasing eccentricity from the median plane of the head because the line becomes closer to one eye than to the other. The VSR reaches a maximum at an eccentricity of 90° when the difference in distance from the eyes is maximal, as can be seen in Figure 19.27. However, the VSR alone does not specify eccentricity since it also varies as a function of absolute distance along a cyclopean line of sight. As the viewing distance increases to infinity, the distance of the line from the two eyes approaches the same value and the VSR tends toward one. The same VSR can be created by a near line at a small (headcentric) eccentricity as by a more distant line at a large eccentricity. Ogle (1939c) calculated that, at an eccentricity of 30°, the VSR varies from 1.004 for an object at 6 m to 1.18 for an object at 20 cm. Hence a given VSR specifies the eccentricity at a distance. Howard (1970 p. 203) turned the argument around and observed that if eccentricity is known, the VSR “could form the basis for judgments of absolute distance.” The VSR and the local horizontal gradient of VSR specify the absolute distance to a surface for points close to the plane of regard (see Section 20.6.5). The VSR of the images of a vertical line standing on the plane of regard also changes with the length of the line, that is, by the elevation of its upper end. The effect of elevation
on the VSR depends in part on the axis system used to define visual direction. If azimuth and elevation are measured with a latitude/latitude system, VSRs are not affected by elevation. In the other axis systems, VSRs vary with elevation. Hence the graph shown in Figure 19.26 has a general applicability only when the points are close to the horizontal plane of regard.

We saw in Section 19.2.3 that relative horizontal disparity, the horizontal disparity gradient, and the HSR for two horizontally separated points are closely related. The same is true for vertical disparity. Equation (6) can be rearranged to express relative vertical disparity as a difference between the angular-size differences subtended at the two eyes (equation 7). If the two points are close together, then dividing by $(\beta_{1R} - \beta_{2R})$, the vertical separation of the points in the right eye, yields the vertical disparity gradient of the two points in the vertical direction:

$$\text{Vertical disparity gradient} = \frac{(\beta_{1L} - \beta_{2L}) - (\beta_{1R} - \beta_{2R})}{(\beta_{1R} - \beta_{2R})} = \frac{(\beta_{1L} - \beta_{2L})}{(\beta_{1R} - \beta_{2R})} - 1 \tag{15}$$

But the ratio in equation (15) is the vertical size ratio (VSR) (see equation 14). It follows that VSR = 1 + vertical disparity gradient.

Figure 19.27. Vertical size ratios as a function of distance. The vertical size ratio (VSR) of the images of an object is the ratio of their angular extents. All objects in the median plane of the head (eccentricity of 0) have VSRs of 1.0. For a particular viewing distance, VSRs increase (to the left) and decrease (to the right) with increasing eccentricity. For all eccentricities, VSRs tend to 1.0 with increasing distance because the distances of the object from the two eyes become similar. Axes: VSR (0.8 to 1.2), eccentricity (−90° to 90°), and distance along the cyclopean line of sight (cm). (Redrawn from Rogers and Bradshaw 1994b)
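The relations among equations (14) and (15) and the distance ratio dR/dL can be checked with a small Python sketch. It is an illustration under assumptions that are not in the text: a short vertical line of assumed height 5 mm stands on the plane of regard at plan-view position (x, z), the eyes lie on the x-axis 6.5 cm apart, angular extent is measured as elevation above the plane of regard from each eye, and the function names are hypothetical.

```python
import math


def vertical_extent(x, z, height, eye_x):
    """Angular height (radians) of a vertical line at (x, z), seen from an eye at (eye_x, 0)."""
    return math.atan2(height, math.hypot(x - eye_x, z))


def vsr_and_gradient(x, z, height=0.005, interocular=0.065):
    """Return (VSR, vertical disparity gradient, dR/dL) for a short vertical line."""
    half = interocular / 2.0
    ext_left = vertical_extent(x, z, height, -half)
    ext_right = vertical_extent(x, z, height, +half)
    vsr = ext_left / ext_right                      # equation (14)
    gradient = (ext_left - ext_right) / ext_right   # equation (15)
    d_left = math.hypot(x + half, z)                # distance from the left eye
    d_right = math.hypot(x - half, z)               # distance from the right eye
    return vsr, gradient, d_right / d_left


vsr, gradient, distance_ratio = vsr_and_gradient(0.2, 0.5)
print(vsr, 1.0 + gradient, distance_ratio)
```

For a line to the right of the median plane the sketch gives a VSR below 1, equal to 1 plus the vertical disparity gradient and very close to dR/dL. For a line in the median plane the VSR is 1.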
19.6.4 VERTICAL DISPARITY AS A CUE TO ECCENTRICITY
One might suppose that a frontal surface viewed straight ahead and containing a pattern of vertical disparities appropriate to a surface in a horizontally eccentric position would appear eccentric (Howard 1970). Banks et al. (2002) asked subjects to fixate the center of a random-dot surface presented at various horizontal eccentricities and frontal to the line of sight. At each eccentricity, the surface contained vertical disparities appropriate to each of several eccentricities. Subjects set an unseen pointer to indicate the azimuth of each surface. Settings varied with the actual eccentricity of the surface but were not affected by changes in the pattern of vertical disparities. In other words, patterns of vertical disparity were not used directly to indicate headcentric eccentricity. Any influence that vertical disparities may have on perceived visual direction is presumably overridden by information about stimulus eccentricity arising from the oculomotor system. But can an unusual association between vertical disparities and angle of gaze lead to a recalibration of the oculomotor system? Ebenholtz (1970) predicted that prolonged magnification of the image in one eye would change the apparent horizontal eccentricity of stimuli. Berends et al. (2002) tested this idea by having subjects inspect a large frontal
random-dot display containing different mixtures of horizontal and vertical size disparities. Five minutes’ exposure caused a shift in the apparent straight-ahead of a superimposed test spot in five out of nine subjects. The shift was a function of the extent to which vertical-size disparity in the display did not conform to the horizontal disparity. The effect might have been larger if the test spot had been presented without the fixed frame. They suggested that the effect is due to recalibration of the relation between eye position and perceived direction.

Ishii (2009) asked whether altering the reliability of the eye-position signal changes the effect of disparity on perceived direction of gaze. Subjects held their eyes in an extreme left or right position for 15 s. They were then shown a 60° by 60° random-dot display with patterns of horizontal disparity, vertical disparity, or both types of disparity. The patterns of disparity conformed to those produced by flat displays at horizontal eccentricities between −30° and 30°. Subjects were asked to judge the direction of a cross at the center of each display. The disparities had no effect on perceived direction when there had been no prior extreme gaze. But after holding the eyes in extreme gaze in a given direction, the perceived direction of the central fixation cross changed in a direction by an amount related to the pattern of disparity in the surface. The effect was greatest when both horizontal and vertical disparities indicated an eccentric surface, and was absent when only vertical disparities indicated an eccentric surface. Thus, the overall pattern of disparity in a frontal surface affects the perceived direction of the surface when oculomotor signals have been disturbed.

19.6.5 VERTICAL DISPARITIES AS A CUE TO EYE POSITION
Mayhew and Longuet-Higgins (1982) produced a computational theory of vertical disparities. Extending a proof from the optic flow situation, Longuet-Higgins (1982) showed that the positions in space of three or more points that do not all lie on the same meridian are fully determined by the horizontal and vertical coordinates of their images on two flat retinas. For this to be true, the corresponding horizontal meridians of the two retinas must be coplanar. Moreover, if only two nonmeridional points are visible, the retinal images generally admit only two distinct 3-D interpretations of which one is usually unrealistic. Mayhew (1982) suggested that vertical disparities provide information about convergence distance and the angle of eccentric gaze. It had previously been thought that only information from the oculomotor system could specify eye position. Mayhew showed that the absolute vertical disparity, V, of a single point depends on both its eccentricity and the direction of gaze. Specifically,

$$V = \frac{ahv}{d} + \frac{avg}{d} \tag{16}$$
where a is the interocular distance, h and v are the horizontal and vertical eccentricities respectively, d is the viewing distance from the cyclopean point, and g is the azimuth of the cyclopean line of sight in radians. If the vertical disparities and eccentricities of a pair of points are known, the equations can be solved and distance and cyclopean direction of gaze can be determined. This method is based on the use of absolute vertical disparities on flat retinas, and works only if the retinas are vertically and torsionally aligned.

On spherical retinas, the registration of vertical disparities depends on the axis system used to specify visual directions. If elevation is specified by parallel horizontal lines of latitude, vertical disparities do not vary with changes in horizontal gaze. Therefore, horizontal lines of latitude cannot be used to indicate gaze direction or vergence angle. This is because parallel horizontal lines on the retina remain parallel when an eye rotates about a vertical axis through its center. Vertical disparities vary with horizontal gaze only when elevation is specified by lines of longitude. This issue was discussed in Sections 14.3.1a and 19.6.2b.

Clement (1992) argued that the requirement that the eyes be torsionally aligned makes Mayhew’s theory an unlikely model of the visual system. Instead, he suggested that Petrov’s (1980) fusional scheme, which first checks whether the disparities of a number of points correspond to a possible object and gaze angle, provides a more likely solution (but see Porrill and Mayhew 1994).

19.6.6 VERTICAL DISPARITIES AS A CUE TO 3-D STRUCTURE
Gillam and Lawergren (1983) suggested a different method for determining 3-D scene structure. They pointed out that the gradient of VSRs on spherical retinas is approximately constant as a vertical line is moved from the median plane of the head over a frontal surface. However, the gradient decreases with increasing distance of the surface. Therefore, this gradient could be used to indicate the absolute distance to a surface. Moreover, the gradient of the VSR-eccentricity function does not change substantially if the surface is slanted or inclined by up to 40°. They also pointed out that VSRs are a function of only the headcentric eccentricity and absolute distance of the surface and therefore do not depend on where the eyes are converged or on the direction of gaze. This is because a spherical retina, unlike a flat retina, rotates within its own surface when the eyes rotate. Convergence distance (d) and angle of eccentric gaze (g) could be derived from this information. However, information about the positions of the eyes is not required to determine the 3-D structure of the scene, because the gradient of VSRs does not vary with vergence or gaze angle. According to Mayhew’s implementation, specifications of convergence and gaze eccentricity derived from vertical disparities on flat retinas are required to scale horizontal
disparities and determine the structure of the scene. This implementation was based on flat retinas and is relevant to cameras rather than to eyes. There is a second difference between the two analyses. In Mayhew and Longuet-Higgins’s analysis, all points in the scene could contribute to the registration of eye positions. The points need not lie on continuous surfaces. For Gillam and Lawergren, the gradient of the VSRs over a surface provides a local estimate of the distance to the surface. This issue is discussed further in Section 20.2.4. Gillam and Lawergren’s analysis can be extended to take into account the effects of large eccentricities and different elevations. Figure 19.27 shows how the VSR of a small (< 5°) surface patch normal to the cyclopean direction varies as a function of its horizontal eccentricity and absolute distance. The VSR is 1.0 for a patch that lies across the median plane (0° eccentricity), and increases with eccentricity reaching a maximum at ± 90° eccentricity, when the ratio of the distances from the two eyes is maximal. For a surface patch at a very large distance from the observer,
the VSR is 1.0 for all eccentricities, because the distances from the two eyes approach the same value. The center portion of this graph, where the VSR-eccentricity function is approximately linear, represents the situation described by Gillam and Lawergren. The gradient in this center portion is a simple function of the absolute distance to the surface. In the general case, the absolute distance to the surface can be derived from the VSR of a surface patch together with the local horizontal gradient of the VSR at that point. For a surface close to the horizontal plane of regard these two quantities uniquely determine its absolute distance and headcentric eccentricity. For a scene containing a number of surfaces at different absolute distances, the magnitude of the VSR and the local horizontal gradient of the VSR will be different for each surface. In other words, a computational theory based on VSRs does not yield a single viewing-system parameter that applies to all parts of the visual scene, but provides local information to specify the absolute distance and headcentric eccentricity of each surface.
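The dependence of the VSR on eccentricity and distance described above can be illustrated numerically. The following is a minimal sketch, not the analysis of Gillam and Lawergren. It assumes a point in the horizontal plane of regard, an interocular distance of 6.5 cm (a value not given in the text), and the small-patch approximation that the vertical angular size of a patch varies inversely with its distance from each eye, so that the VSR is the ratio of the two eye-to-patch distances. The function names are hypothetical.

```python
import math

IOD = 0.065  # assumed interocular distance in metres (not given in the text)


def eye_distances(ecc_deg, dist, iod=IOD):
    """Distances from the left and right eye to a point in the horizontal plane.

    ecc_deg: headcentric eccentricity in degrees (0 = median plane).
    dist: distance of the point from the midpoint between the eyes, in metres.
    """
    e = math.radians(ecc_deg)
    x, z = dist * math.sin(e), dist * math.cos(e)
    d_left = math.hypot(x + iod / 2, z)
    d_right = math.hypot(x - iod / 2, z)
    return d_left, d_right


def vsr(ecc_deg, dist, iod=IOD):
    """Small-patch approximation of the vertical size ratio.

    Image height is taken to vary inversely with distance, so the VSR is the
    ratio of the farther to the nearer eye distance (always >= 1).
    """
    d_left, d_right = eye_distances(ecc_deg, dist, iod)
    return max(d_left, d_right) / min(d_left, d_right)


def vsr_gradient(dist, step_deg=5.0, iod=IOD):
    """Mean change in VSR per degree of eccentricity near the median plane."""
    return (vsr(step_deg, dist, iod) - vsr(0.0, dist, iod)) / step_deg


for d in (0.25, 0.5, 1.0, 2.0):
    print(f"distance {d:4.2f} m: VSR at 10 deg = {vsr(10, d):.4f}, "
          f"gradient = {vsr_gradient(d):.5f} per deg")
```

Under these assumptions the sketch reproduces the qualitative pattern in the text: the VSR is 1.0 in the median plane, grows toward 90° eccentricity, approaches 1.0 at large distances, and has a central gradient that shrinks as distance increases.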
STEREOSCOPIC VISION
20 BINOCULAR DISPARITY AND DEPTH PERCEPTION
20.1 Introduction
20.1.1 Uses of binocular vision and stereopsis
20.1.2 Absolute disparities
20.2 Perception of slant
20.2.1 Horizontal-size and dif-frequency disparities
20.2.2 Slant constancies as a function of eccentricity
20.2.3 Size-disparity induced effect
20.2.4 Local versus global vertical-size disparity
20.2.5 Physiological theories of the induced effect
20.3 Perception of inclination
20.3.1 Orientation disparity and inclination
20.3.2 Deformation disparity and inclination
20.4 Stereoscopic anisotropies
20.4.1 Slant-inclination anisotropy
20.4.2 Anisotropy of depth modulations
20.5 Disparity-defined 3-D shape
20.5.1 Shape index and curvedness
20.5.2 Curvature discrimination thresholds
20.5.3 3-D shape discrimination thresholds
20.6 Constancy of disparity-defined depth
20.6.1 Introduction
20.6.2 Procedures
20.6.3 Constancy of relative depth magnitude
20.6.4 Depth constancy of sloping planar surfaces
20.6.5 Depth constancy of surface curvature
20.1 INTRODUCTION
20.1.1 USES OF BINOCULAR VISION AND STEREOPSIS
All vertebrates with eyes have binocular vision in the sense of having two eyes. Animals with eyes placed laterally on the head have panoramic vision, which can extend to behind the head. Animals with frontal eyes lack panoramic vision but gain a large region of binocular overlap; they have true binocular vision. There is another price to pay for frontal vision. Corresponding inputs from the eyes must converge onto cells in the visual cortex. This requires precise axon guidance and control of vergence eye movements. Strabismus disrupts the binocular system and produces amblyopia in the deviating eye. Animals with lateral eyes do not need to converge the visual inputs and are not subject to amblyopia. In spite of these costs, frontal binocular vision confers the following advantages.
1. Visual detection, resolution, and discrimination are slightly better when both eyes see the stimulus (Section 13.1.2). This would be especially important for nocturnal animals, such as tarsiers, from which primates are believed to have evolved.
2. Reaction times to the onset of a stimulus are shorter with binocular viewing than with monocular viewing (Section 13.1.7).
3. Many tasks, such as reading and eye-hand coordination, are performed better with two eyes (Jones and Lee 1981; Sheedy et al. 1986).
4. Convergence of the visual axes on an object of interest helps us to attend to the object and disregard objects nearer or further away (Section 22.8).
5. A defect in one eye can be compensated for by the other eye. For example, a monocular scotoma (such as the blind spot) corresponds with an intact region in the other eye. People with unequal refractive power in the two eyes can use one eye for near viewing and the other eye for far viewing.
6. Binocular vision provides the basis for stereoscopic vision.
Stereoscopic vision and depth perception in general provide the following advantages.
1. Detection of the 3-D structure of the world. Stereopsis allows us to discriminate small differences in depth. Under the best conditions, we can detect the depth between a fixated object and a second object when the images of the objects are only 2 to 6 arcsec apart. We will see in Section 28.2 that motion parallax created by moving the head from side to side is the only other cue to depth providing this degree of precision.
We will see in Chapter 33 that the eyes of some animals have a fixed angle of vergence. Thus, the binocular disparity produced by an object varies in a consistent way with changes in the distance of the object from the eyes. The ability to judge depth on this basis is referred to as range-finding stereopsis, since it works like a rangefinder. The eyes of humans and many other animals can change their angle of vergence, so that the visual axes intersect on the object of interest. The images of an object have a disparity with a sign that depends on whether it is nearer or further away than a fixated object, and a magnitude proportional to its distance in depth from the fixated object. Thus disparity provides information about depth between objects. It does not indicate the distance of a small object from the viewer. Nor does it indicate the absolute distance between objects, because the disparity produced by two objects a given distance apart decreases as an inverse function of the distance of the objects (Section 14.2.3). There is ample evidence that depth intervals are more precisely detected with two eyes than with one eye (Section 18.3). Also, judgments of the inclination and smoothness of textured surfaces are more precise with binocular than with monocular viewing (Allison et al. 2009b). We will see in Section 20.3.2 that the pattern of disparity over an extended surface provides information about the absolute distance of the surface, even when no other information is available. Absolute disparity between the images of a particular object, or that between the retinal images as a whole, controls vergence movements of the eyes. The absolute distance of an isolated object could be provided by the convergence angle of the eyes.
2. Breaking camouflage. Stereopsis helps to break camouflage in stationary objects. This is particularly important for predators. A predator could also break camouflage by generating parallax through motion of the head. But such motion would reveal the presence of the predator to the prey.
3. Guidance of movements. Stereoscopic vision helps in the performance of fine motor tasks, such as guiding a ring over a contorted loop of wire (Murdoch et al. 1991). We will see in Chapter 34 that prehensile movements of the hand toward binocularly viewed targets show shorter movement times, higher peak velocities, and a more accurate final grasp position of the hand than do hand movements toward targets viewed with one eye. Binocular vision confers some advantage during the planning stage of a motion of the hand toward an object or of stepping over an obstacle. Also, subjects walking with only one eye open walked more slowly and fixated obstacles for longer than when walking with both eyes open (Hayhoe et al. 2009).
4. Detecting direction and speed of approach. People with good stereoscopic vision are better at catching balls, especially high-velocity balls, than are people with weak stereovision (Mazyn et al. 2004). Also, people with good stereoscopic vision show greater improvement with practice (Mazyn et al. 2007). This general topic is discussed in Section 31.3.
5. Allowing for effects of distance. Some stimulus features covary with distance. For example, the size of the image of an object varies with the object’s distance. This must be taken into account when we judge the relative sizes of two objects, as described in Section 29.2. We take account of motion parallax produced by objects at different distances when judging the relative motions of objects, as described in Chapter 28. Also, we allow for the effects of depth relationships when judging the relative lightness (whiteness) of surfaces (Section 22.4).
6. Processing in distinct depth planes. For many purposes stimuli are processed in distinct depth planes. For example, threshold elevation and metacontrast produced by successive stimuli occur most strongly between stimuli in the same depth plane (Section 22.5.1). Apparent motion occurs preferentially between coplanar stimuli (Section 22.5.3). Gogel (1956) used the term “adjacency principle” for the general finding that stimuli interact most when in the same depth plane. For other purposes it is advantageous to be influenced more by the more distant parts of a visual scene, even when the eyes are converged on a near object. For instance, the impression of self-motion (vection) is evoked by movement of the more distant of two moving visual displays. This is because the more distant parts of a natural scene provide a more stable frame of reference than do near parts (Section 22.7).
7. Processing in the attended depth plane. It is an advantage to be able to process information in the depth plane to which one is attending without interference from stimuli in other planes. Usually, we attend to objects in the plane of convergence and we are helped by the fact that objects outside this plane produce diplopic images. We will see in Section 22.6.1 that optokinetic nystagmus is evoked preferentially by motion within the plane of zero disparity. We will also see that visual pursuit is degraded by stationary objects in the same depth plane as the moving stimuli but not by stationary objects in a distinct depth plane.
8. Rapid processing of depth information. Schiller et al. (2007) suggested that one reason for the evolution of stereopsis is that it processes depth rapidly. Detection of depth by motion parallax involves integration of signals over time. Schiller et al. found that monkeys identified stimuli at different relative depths more rapidly when depth was indicated by disparity than when it was indicated by motion parallax.
Stereopsis does not aid the performance of all depth-related tasks. A lack of stereoscopic vision has no adverse effects on driving a car at normal velocities (Bauer et al. 2000). Experienced or newly trained pilots with one eye covered land an aircraft just as accurately as when they use both eyes (Lewis et al. 1973; Grosslight et al. 1978). The Italian skier Fausto Radici won two World Cup slalom races in the 1970s even though he had only one eye. In driving, flying, and skiing, motion parallax is a very effective cue and can substitute for lack of stereovision. The task of recognizing a familiar shape when it is rotated to very different orientations in 3-D space may be performed just as well by people who lack stereoscopic vision as by people with normal vision (Klein 1977). In this case performance is presumably determined by perspective.
20.1.2 ABSOLUTE DISPARITIES
Most binocular cells in V1 respond to absolute disparity rather than relative disparity (Section 11.4.1g). The visual system registers the disparity of single points for the purpose of triggering an appropriate vergence movement of the eyes. But can humans judge the location of an isolated point in depth with respect to the point where the eyes are converged (the horopter) when there is only one object in view? Detection of whether an object is nearer than or beyond the horopter on the basis of the sign of disparity would not require knowledge of the angle of vergence. But the task of judging how far an object is from the point of convergence requires knowledge of the vergence state of the eyes, including horizontal and vertical vergence, and cyclovergence. See Section 17.6.5 for a discussion of the ability of people to judge the depth of an isolated point with respect to the point of convergence. The absolute distance of a nearby object in the midline at eye level can be perceived with reasonable accuracy on the basis of information about horizontal vergence (see Section 25.2). Sousa et al. (2010) found that the distance of an object was judged more accurately when a second more distant object was added but not when a nearer object was added. The two objects were about 34 and 54 cm from the subject. They concluded that the relative disparity between the two objects restricts the range of possible distances of the nearer object but not of the more distant object.
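The relation between horizontal vergence, absolute distance, and the relative disparity between two objects can be made concrete with a short calculation. The following is a minimal sketch under simplifying assumptions that are not taken from the text: both objects lie in the midline at eye level, the interocular distance is 6.5 cm, and relative disparity is computed as the difference between the vergence angles required to fixate each object. The function names are hypothetical.

```python
import math

IOD = 0.065  # assumed interocular distance in metres


def vergence_deg(dist, iod=IOD):
    """Vergence angle (deg) needed to fixate a midline point at eye level."""
    return math.degrees(2.0 * math.atan(iod / (2.0 * dist)))


def distance_from_vergence(angle_deg, iod=IOD):
    """Inverse of vergence_deg: midline distance (m) implied by a vergence angle."""
    return iod / (2.0 * math.tan(math.radians(angle_deg) / 2.0))


def relative_disparity_deg(near, far, iod=IOD):
    """Relative disparity (deg) between two midline points, taken as the
    difference between the vergence angles that would fixate each of them."""
    return vergence_deg(near, iod) - vergence_deg(far, iod)


# The two viewing distances reported for Sousa et al. (2010), followed by
# the same 20-cm depth interval placed one metre further away.
print(relative_disparity_deg(0.34, 0.54))
print(relative_disparity_deg(1.34, 1.54))
```

The second value is much smaller than the first, which illustrates why a relative disparity cannot specify a depth interval unless the absolute distance is known or can be estimated, for example from vergence.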
Things are more complex when the eyes or head are off center. A point at different distances from a point midway between the eyes can produce the same disparity for different combinations of eye orientation and head orientation. Blohm et al. (2008) tested whether the perceived distance of a point with a given disparity varies with changes in the orientation of eye or head. They first measured, for each subject, the 3-D orientations of eyes and head for different positions of a fixated point in a volume of space. They then calculated, for each subject, the binocular projections of a test point at a distance of 50 cm in a 5° oblique position in the upper right quadrant with respect to a straight-ahead fixation point at a distance of 1 m. While the subject fixated the straight-ahead point, a flashed LED impressed an afterimage on the eye at the location of the test point. The subject then rotated the head and angle of gaze to a defined position and pointed with unseen hand to the apparent location of the afterimage. This was repeated for different head orientations and gaze angles. The afterimages were always in the same retinal location. As predicted, the perceived distance of the afterimage with respect to the cyclopean point changed as a function of head-eye orientation. However, distances were mostly underestimated, and there was considerable variability within and between subjects. A dim fixation point was in view in addition to the afterimage, so that there was a relative disparity present. Also pointing with one hand introduced an asymmetry, which may have affected the results. Blohm et al. concluded that judgments of the distance of a point object are based on nonvisual information about the 3-D orientation of head and eyes. Patterns of relative disparities are immune to the effects of vergence and, as we will see, it is these that are used for coding depth. 
Wallach and Lindauer (1962) championed the view that relative patterns of disparity rather than absolute values are important for stereopsis. The rest of this chapter is concerned with the perception of relative disparity.
20.2 PERCEPTION OF SLANT
20.2.1 HORIZONTAL-SIZE AND DIF-FREQUENCY DISPARITIES
This section is concerned with the types of disparity that contribute to the perception of slant. The term “slant” is used here to refer to the angle of rotation of a surface about a vertical axis with respect to the frontal plane. The term “inclination” refers to rotation about a horizontal axis. Horizontal magnification of the image in one eye relative to that in the other eye is a horizontal-size disparity. The geometry of size disparity was discussed in Section 19.2. This type of disparity causes a frontal stimulus to appear slanted in depth (Ogle 1964). One of the original
stereograms devised by Wheatstone in 1838, shown in Figure 2.59, contains a disparity of this type. Horizontal magnification of one image reduces the mean spatial frequency of that image relative to that of the other image. This is a dif-frequency disparity, which is most evident when the stimulus is a regular vertical grating. There can be no doubt that horizontal-size disparity contributes to the perception of slant. The following is a review of the evidence that dif-frequency disparity also contributes to the perception of slant. Blakemore (1970a) proposed that dif-frequency disparities are detected by specialized disparity detectors, distinct from those that detect point and orientation disparities. A typical ganglion cell has a receptive field of concentric excitatory and inhibitory zones, and is most sensitive to a local periodic pattern that matches the position and periodicity of these zones. Psychophysical evidence suggests that, for each retinal location, there are at least four channels with distinct but overlapping spatial-periodicity tuning functions (Wilson and Bergen 1979; Georgeson and Harris 1984). Physiological evidence from cats (Movshon et al. 1978) and monkeys (DeValois et al. 1982b) supports this conclusion. Each channel has a half-amplitude bandwidth of between 1 and 2 octaves of spatial periodicity. The mean size of the receptive fields of ganglion cells increases with increasing retinal eccentricity. Excitatory regions of receptive fields subtend about 6 arcmin at the fovea and about 12 arcmin at an eccentricity of 4° (Wilson and Giese 1977). When the monocular receptive fields of a binocular cell differ in the spatial periodicity of ON and OFF regions, the cell responds best to a specific local difference in spatial period in the images of a slanted surface. Blakemore proposed that such cells respond best to vertical lines, because slanted horizontal gratings do not produce differential spatial frequencies. This argument is misleading. 
The images of horizontal line elements on a slanted surface differ in length in the two eyes and are detected by cortical cells tuned to horizontal stimuli. Differences in horizontal spatial periodicity occur as differences in the horizontal size of the images of texture elements and of the intervals between them. Unless texture elements are long featureless lines, orientation is immaterial, as long as binocular cells are tuned to a similar orientation in the two eyes. A real slanted vertical grating produces images with a texture gradient in addition to a dif-frequency disparity. However, dichoptic vertical gratings differing in spatial frequency create a slanted surface even when they contain no monocular texture gradient. This can be confirmed by fusing the upper stereogram in Figure 20.1, although it usually takes some time before the impression of depth emerges. The fused image of a rectangular textured surface with disparity-defined slant or inclination appears trapezoidal, because a really sloping surface would have to
Figure 20.1. Dif-frequency disparity. Stereograms with vertical gratings that differ in spatial frequency in the two eyes. When the disparity is not too great, as in the upper stereogram, it creates a slanted surface, even with no monocular texture gradient. When the disparity is large, as in the lower stereogram, the impression of a slanted surface is replaced by that of a series of steps, like a Venetian blind. (Reprinted from Blakemore 1970b, with permission from Elsevier)
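The series of steps seen with large dif-frequency disparities can be related to the difference (beat) frequency of the two gratings, since the bars of the two images slip by one full cycle over a visual angle equal to the reciprocal of that difference. The following is a minimal sketch of that arithmetic. The function names and the example frequencies are hypothetical and are not taken from the stereograms in Figure 20.1.

```python
def beat_frequency(f_left, f_right):
    """Difference frequency (cycles/deg) of two gratings given in cycles/deg."""
    return abs(f_left - f_right)


def wrap_period_deg(f_left, f_right):
    """Visual angle (deg) over which the two gratings slip by one full cycle."""
    fb = beat_frequency(f_left, f_right)
    if fb == 0:
        raise ValueError("identical gratings never slip out of register")
    return 1.0 / fb


def cycles_of_modulation(f_left, f_right, width_deg):
    """Number of disparity-modulation cycles across a display of given width."""
    return beat_frequency(f_left, f_right) * width_deg


# Hypothetical examples: a 10% and a 50% frequency difference on a
# 3 cycles/deg grating, each shown in a display 4 deg wide.
for f_r in (3.3, 4.5):
    print(f_r, wrap_period_deg(3.0, f_r), cycles_of_modulation(3.0, f_r, 4.0))
```

On this account a wide display or a large frequency difference contains several cycles of disparity modulation, whereas a narrow display with a small difference contains less than one cycle and so can be seen as a single slanted surface.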
be trapezoidal in the opposite direction to produce rectangular images. Also, for the same reason, the fused image appears to have a texture gradient. The effect is very striking with a large isolated surface. An illusory difference in spatial frequency produced by preadapting one eye to a grating of a different frequency does not generate depth (Sloane and Blake 1987). Fiorentini and Maffei (1971) claimed that an impression of depth is created by presenting gratings of the same spatial frequency to the two eyes and reducing the contrast of one of them, but Blake and Cormack (1979c) failed to replicate this effect. Blakemore (1970a) asked subjects to increase or decrease the spatial frequency of a vertical grating presented to one eye relative to that of a fixed grating presented to the other eye until the impression of slant broke down. This was done for a range of spatial frequencies of the fixed grating. Note that the gratings in the two eyes remained the same overall width. Figure 20.2 shows that the range of spatial frequency ratios (expressed as ratios of spatial period—HSRs) over which the impression of slant persisted was greatest for a fixed grating with a spatial frequency of about 3 cpd, and fell off for smaller and larger frequencies. For gratings between 2 and 6 cpd, a spatial-frequency disparity of about 10% produced the largest slant, as can be seen in Figure 20.3. Blakemore argued that if slant were coded in terms of the cumulative horizontal disparity over the grating,
Figure 20.2. The limits of dif-frequency disparity. The maximum ratio of spatial periods of dichoptic vertical gratings that created a slanted surface, as a function of the spatial frequency of the grating presented to the left eye (N = 1). Bars are standard errors. (Redrawn from Blakemore 1970b) [Plot: binocular period ratio (0.7 to 1.4) against spatial frequency in the left eye (0.5 to 20 cpd). Settings to the frontal plane lie at 1.0. The upper region is labeled "spatial frequency of right-eye image < left-eye image" and the lower region "spatial frequency of left-eye image < right-eye image".]
depth impressions would not depend on the spatial frequency of the grating, and the curves in Figure 20.2 would have been flat. He wrote, “If the computation is really a point-for-point affair I can see no reason why it should depend on the spatial frequency of the pattern, as long as it can be distinctly resolved.” (p. 1187) It is now known that thresholds for detection of local horizontal disparity vary with spatial frequency in just the way Blakemore’s results revealed (Section 18.7). Thus, the results of his experiment
Figure 20.3. Dif-frequency and magnitude of slant. Perceived slant on a vertical axis of a dichoptic vertical grating as a function of the percentage difference in spatial frequency between the two images. The left-eye grating had a fixed spatial frequency of 2 or 6 cpd, as indicated on the curves. For both gratings, a spatial-frequency difference of about 10% produced the largest slant. Bars are standard deviations. (Adapted from Fiorentini and Maffei 1971) [Plot: perceived slant (0 to 40 deg) against percentage frequency difference (0 to 25%), with curves for 2 cpd and 6 cpd.]
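For comparison with perceived slant, the slant that a given horizontal size ratio specifies geometrically can be computed. The following is a minimal sketch, not a model of the data in Figure 20.3. It assumes a surface centered in the median plane at eye level, slanted about a vertical axis, an interocular distance of 6.5 cm, and a viewing distance chosen only for illustration, since the viewing distance of the experiment is not stated here. Under these assumptions the ratio of horizontal image sizes at the center of the surface is (1 + k)/(1 − k), where k = (a/2D) tan θ, a is the interocular distance, D is the viewing distance, and θ is the slant. The function names are hypothetical.

```python
import math

IOD = 0.065  # assumed interocular distance in metres


def size_ratio_from_slant(slant_deg, dist, iod=IOD):
    """Ratio of horizontal image sizes in the two eyes at the center of a
    median-plane surface slanted by slant_deg about a vertical axis."""
    k = (iod / (2.0 * dist)) * math.tan(math.radians(slant_deg))
    return (1.0 + k) / (1.0 - k)


def slant_from_size_ratio(ratio, dist, iod=IOD):
    """Slant (deg) specified geometrically by a horizontal size ratio."""
    k = (ratio - 1.0) / (ratio + 1.0)
    return math.degrees(math.atan(k * 2.0 * dist / iod))


# Slant specified by size ratios of 1.05 to 1.25 at an assumed viewing
# distance of 0.5 m (a hypothetical value chosen for illustration).
for ratio in (1.05, 1.10, 1.15, 1.20, 1.25):
    print(ratio, round(slant_from_size_ratio(ratio, 0.5), 1))
```

Because D appears in the expression, the same size ratio specifies a larger slant at a greater viewing distance, so a comparison with perceived slant requires the viewing distance to be known.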
did not justify his conclusion that global dif-frequency disparity codes slant. In any case, the same evidence could support the idea that local size or width disparities code slant. This is because saying that the visual system is most sensitive to a spatial frequency of about 3 cpd is equivalent to saying it is most sensitive to a local spatial period of about 0.3°. There is no evidence that the primary visual system contains detectors of global spatial frequency. The visual system is not a Fourier analyzer. Another point is that the two images in Blakemore’s stereogram had the same width in the two eyes. This biases the system against using the cumulative horizontal disparity, because a true cumulative disparity results in a narrower overall display in one eye than in the other, with the number of pattern elements remaining the same in the two eyes. When the dichoptic difference in spatial frequency is too high, as in the lower stereogram of Figure 20.1, an impression of steps or of several slanted surfaces, like a Venetian blind, replaces the impression of a single slanted surface (Blakemore 1970d). The Venetian-blind effect is particularly evident when the mean spatial frequency is high and the display is wide. This is what one would expect, because, under these circumstances, there is a periodic modulation of disparity over the surface, corresponding to the beat frequency of the two gratings (f1 – f2). Van der Meer (1978) compared the apparent slant produced by vertical gratings of various spatial frequencies. He used gratings with the same width in each eye, but a different number of bars, like those used by Blakemore. He then used gratings with the same number of bars in each eye, but different overall widths. With the first type of grating he found the same drop-off in perceived slant at higher spatial frequencies that Blakemore had found. But there was no drop-off with the differential-width gratings. 
This suggests that people use the overall horizontal disparity produced by a slanted surface when that disparity is properly presented. Further evidence for this conclusion will now be reviewed. The stereograms in Figure 20.4 produce two lines slanting in opposite directions (Wilde 1950). In Figure 20.4A, the lines appear to slant even though there is no disparity information along the lines, as there is in Figure 20.4B. This is an example of the principle enunciated in Section 22.2 that, in the absence of information to the contrary, depth impressions are interpolated into figural areas bounded by disparity discontinuities. Ramachandran and Nelson (1976) devised the stereogram shown in Figure 20.5A. When the images are properly fused, they create the impression of a slanted row of dots because one row is longer than the other. The corresponding pairs of dots within the rows have slight disparities that cause each pair to appear independently slanted in the opposite direction to the slant of the whole row. The horizontal disparity between the pairs signifies slant of the whole row, while the disparities within the pairs signify independent slant of the pairs. Note that when the gaze is
Figure 20.4. Depth from differences in line length. (A) When fused, the two lines appear to slant in opposite directions, even though information about relative spatial frequency of the images is reduced to a minimum. (B) The rows of dots have the same spacing in the two eyes, but each row in one eye has two extra dots. Fusion of the images produces opposite slants. (Redrawn from Wilde 1950)
held in one location, especially at one end of the row, the impression of overall slant of the row is lost, and the slant of only the pair of dots being fixated is apparent. This may be interpreted as evidence that overall depth of the row of dots is signaled by the accumulation of information from several local areas only if the gaze is allowed to wander over the row. See Section 18.10.4 for more discussion of this point. Blakemore also found that perceived slant remained constant when he increased the width of his vertical-grating display. He argued that this was what one would expect from the constant difference in frequency but not what one would expect from the increasing cumulative horizontal disparity. A system that sequentially scans horizontal
Figure 20.5. Global and local slant. Fusion of the images in (A) creates a slanted row of dots because one row is more widely spaced than the other, as shown in (B). The corresponding pairs of dots within the rows have slight disparities that cause each pair to appear independently slanted in the opposite direction to the slant of the whole row. (After Ramachandran and Nelson 1976) [Panel A shows the left-eye and right-eye images.]
disparities would also be independent of the length of the surface. Furthermore, as Tyler and Sutter (1979) pointed out, wider displays extend into the peripheral retina, where depth impressions require larger disparities, and it may be this factor that accounts for Blakemore’s flat function. Tyler and Sutter produced their own evidence in favor of the global dif-frequency theory of slant perception. They presented a luminance-modulated grating of 1.5 cpd to each eye, in which the fine texture consisted of random dynamic vertical lines. The gratings were caused to drift from side to side at 1 Hz at a velocity of 4°/s either in phase or in antiphase. Both cases produced good static slant. Since the grating patterns in the two eyes were correlated, this display contained both dif-frequency disparity and point-for-point disparities. Tyler and Sutter argued that the point-for-point disparity mechanism was inactive with opposed movement because the back-and-forth movement in depth that it should have produced was not present. There was only the impression of slant. The argument collapses with the evidence cited in Section 31.3.2 that overall changes in horizontal disparity do not produce motion-in-depth when they are not accompanied by image looming. Halpern et al. (1996) obtained no impression of slant from a similar display containing only dif-frequency disparity. In this case, point-for-point disparities were eliminated because the grating patterns in the two eyes were uncorrelated. Tyler and Sutter obtained an impression of slant when the dynamic lines in the images were uncorrelated between the two eyes. However, the effect was present only for large differences in spatial frequency. They argued that the impression of slant arose from a primitive pure dif-frequency mechanism, which they called protostereopsis. Halpern et al. 
obtained the same result but showed that there are sufficient point-for-point disparities in such a display to account for the weak impressions of slant. In summary, the evidence does not support the idea that the perception of slant is based on global dif-frequency disparities, as opposed to local gradients of horizontal disparity or local dif-size disparities. Also, all investigators ignored the possible contribution of sequentially scanned local disparity. Wilson (1976) objected to the view that the perception of slant relies on dif-frequency disparities. He pointed out that Blakemore’s stimuli omitted the texture gradients that occur in actual surfaces. A natural texture gradient has a wide range of spatial frequencies that should disrupt the visual system’s ability to compare spatial frequencies by narrowly tuned spatial-frequency detectors. He found that surfaces with monocular texture gradients, such as those shown in Figure 20.6, were more easily detected as slanting than were surfaces with only dichoptic size differences and no texture gradients. He argued that this could not have been due to supplementary depth information provided by the texture gradients, because a texture gradient by itself is ambiguous with regard to the direction of slant. This is not
Figure 20.6. Texture gradients plus a dif-frequency disparity. (Reprinted from Wilson 1976, with permission from Elsevier)
a convincing argument, because a texture gradient does provide extra information about slant magnitude, even though it is unsigned. Wilson proposed that we use local dif-size disparity detectors. Since the size of receptive fields increases in a more or less linear fashion with increasing retinal eccentricity, the position on the retina best suited to detect a dif-size disparity depends on the absolute size of the images along the texture gradient. People could optimize their ability to detect local dif-size disparities by moving the eyes to bring the texture gradients into correspondence with the retinal gradient of receptive field size. The slant of a surface covered with a regular fine pattern would be difficult to detect by a pure dif-frequency mechanism because the ability to detect a fine pattern declines rapidly with increasing eccentricity. More to the point, the ability to discriminate between two high spatial-frequency patterns declines even faster than grating acuity with increasing eccentricity, as can be seen in Figure 20.7 (Greenlee 1992). It has been argued that the size-matching strategy has some advantages over the point-for-point system for
Figure 20.7. Spatial-frequency discrimination. Spatial-frequency discrimination thresholds, Δf/f, as a function of retinal eccentricity, for four baseline spatial frequencies. The stimuli were Gaussian-truncated sine-wave gratings with a spatial-frequency bandwidth of 0.5 octaves, at five times the contrast threshold. (Redrawn from Greenlee 1992) [Plot: Δf/f (0 to 0.4) against retinal eccentricity (0 to 20 deg), with one curve for each baseline spatial frequency.]
coding slant. The point-for-point system requires a precise linkage of specific texture elements in the two eyes. In a long slanted surface covered with a repetitive texture containing a mixture of spatial frequencies, the images may be difficult to match because corresponding elements become increasingly separated at locations on the surface more remote from the point of fixation. The size-matching process works even if the corresponding elements are not found, because the mean size difference between elements can be detected between noncorresponding elements. However, this advantage applies only if the viewer holds fixation constant, which is most unnatural. When the gaze wanders, point-for-point disparity can be sampled at several locations along the surface, and disparities outside the foveal region can be ignored while each sample is being registered.

It has also been argued that moving one image relative to the other should perturb the point-for-point system, whereas the size-matching process should be relatively immune to this procedure. Blakemore (1970b) confirmed that motion of one image at 1°/s does not seriously degrade slant perception. This is not a very convincing argument, because the horizontal point-for-point system is probably capable of registering horizontal disparities within short periods.

Levinson and Blake (1979) reported that dichoptic gratings with similar harmonic content and different cycle width produce slant. They concluded from this that dif-frequency disparity codes slant. However, a group of investigators, including Blake, pointed out that these stimuli produce similar responses in size-tuned channels and that slant could arise from dif-size disparity or from horizontal disparity derived from the output of these size-tuned channels (Halpern et al. 1987b).
The degree of apparent slant created by a difference of spatial frequency between two vertical gratings decreased when they introduced an overall horizontal disparity into the grating with respect to the edges of the circular aperture. They presented Figure 20.8 to illustrate this effect. The stimuli were presented briefly with convergence held in the plane of the aperture. They argued that this decrement of perceived slant would not occur if slant were coded in terms of dif-frequency disparity, but would occur if it were coded in terms of horizontal disparity. This argument is not well founded. Adding a horizontal disparity made the grating appear beyond the circular frame. The disparity within the grating was thus placed on a disparity pedestal with respect to the circular frame, which was in the plane of convergence. Disparities on disparity pedestals are registered less efficiently than are those on a base of zero-disparity, as we saw in Section 18.3.3.

Figure 20.8. Dif-frequency plus position disparity. Panels, from top to bottom: position disparity; spatial-frequency disparity; spatial-frequency + position disparity. With crossed fusion, the horizontal disparity in the top stereogram places the bars beyond the aperture. A dif-frequency disparity in the center stereogram produces slant of the grating. In the lower stereogram, a horizontal disparity is combined with a dif-frequency disparity. Observers reported less slant than in the center stereogram. (Redrawn from Halpern et al. 1987b)

Summary

Although several investigators have suggested that global dif-frequency disparity may underlie the perception of slant, the arguments are not convincing. First, there is no physiological evidence for spatially extended receptive fields that register global spatial frequency rather than local spatial period. Second, dichoptic gratings that differ in spatial frequency are typically seen as a Venetian blind (with periodic modulation in depth) when the eyes are prevented from scanning the grating. This suggests that local size differences are detected rather than global dif-frequencies. Third, the perception of the direction of the slant of a surface is not above chance level when care is taken to exclude all types of disparity but dif-frequency disparity. The issue of whether binocular cells are sensitive to differences in spatial periodicity was discussed in Section 11.6.1.

Vertical gratings that have been used to study dif-frequency disparity are not good stimuli for investigating slant perception. Slant of a surface covered with vertical lines is difficult to see, takes long to build up, and does not reach the same magnitude as the slant of a surface covered with random texture. Also, the fused image of a vertical grating contains no vertical disparities. We will see in the following sections that vertical disparity plays an important role in judgments of surface slant.

20.2.2 SLANT CONSTANCIES AS A FUNCTION OF ECCENTRICITY

20.2.2a Introduction

Consider an upright observer judging the slant of a vertical test surface. Figure 20.9 illustrates the stimulus variables and tasks that could be involved in such a judgment. The stimulus variables are:

1. The nature of the test surface, including its size, its shape, and the depth cues that it contains.
2. The orientation of the torso.
3. The orientation of the head.
4. The direction of gaze.
5. The distance of the test surface from the observer.

Directions are all specified with respect to the test surface. Only a subset of these variables has been investigated.

Figure 20.9. The general task of judging slant. The figure shows the spatial stimulus variables that could be involved in judging the slant of a vertical surface. The task could be setting an unseen plate parallel to the target surface or setting the target surface to a defined orientation, such as the frontal plane of the head or the torso, or the gaze normal. Labeled elements: test surface, variable distance, variable lateral location, angle of gaze, angle of head, angle of torso, hidden hand.

The task that the observer is asked to perform could be:

1. Setting a plate held in the unseen hand to be parallel with the test surface.
2. Setting the test surface to the apparent frontal plane of the head or the torso. This method has been used to investigate frontal-plane slant constancy.
3. Setting the test surface to be orthogonal to the cyclopean line of sight. We will see below that this method has been used to investigate gaze-normal slant constancy.

In general, slant constancy refers to the ability of an observer to perceive the slant of a surface as remaining constant over changes in any of the stimulus variables listed above. However, most experiments have been concerned with slant constancy with respect to the frontal plane of the head or the gaze-normal plane.

In frontal-plane slant constancy the perceived orientation of a line or surface with respect to the frontal plane of the head remains constant at all eccentricities. A frontal surface at a horizontal eccentricity of θ° is slanted by 2θ° with respect to the tangent to the horizontal horopter, and thus contains both horizontal and vertical disparities (see Figure 14.6). This pattern of disparities is a joint function of eccentricity and distance and must be allowed for in judging the orientation of a surface with respect to the frontal plane of the head. We are able, within limits, to do this. The reader can verify this by fixating a point on a small flat surface while translating the head from side to side. The surface does not appear to change its orientation with respect to the head, although we can see that its orientation to the gaze normal has changed.
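The geometric claim that a frontal surface at eccentricity θ lies at about 2θ to the horopter tangent can be checked numerically. The sketch below is only an illustration: it assumes that the horizontal horopter is the Vieth-Müller circle through the two eyes and the fixation point, and the function name, the 100-cm fixation distance, and the 3.25-cm half interocular distance are hypothetical values chosen for the example, not taken from the text.

```python
import math

def horopter_tangent_angle_deg(theta_deg, distance_cm=100.0, half_ipd_cm=3.25):
    """Angle (deg) between the frontal plane and the tangent to the
    Vieth-Mueller circle at headcentric azimuth theta_deg.

    Assumed geometry: eyes at (-a, 0) and (+a, 0), cyclopean point at the
    origin, fixation point straight ahead at (0, d).
    """
    a, d = half_ipd_cm, distance_cm
    theta = math.radians(theta_deg)
    # Centre of the circle through both eyes and the fixation point.
    centre_y = (d * d - a * a) / (2.0 * d)
    # Distance from the cyclopean point to the circle along azimuth theta
    # (positive root of t^2 - 2 t c cos(theta) - a^2 = 0).
    c_cos = centre_y * math.cos(theta)
    t = c_cos + math.sqrt(c_cos * c_cos + a * a)
    px, py = t * math.sin(theta), t * math.cos(theta)
    # The tangent is perpendicular to the radius, so its angle to the frontal
    # (x) axis equals the angle of the radius to the median (y) axis.
    return math.degrees(math.atan2(px, py - centre_y))

if __name__ == "__main__":
    for theta in (0.0, 5.0, 10.0, 20.0):
        tangent = horopter_tangent_angle_deg(theta)
        # A frontal surface is slanted by `tangent` relative to the horopter
        # tangent; a gaze-normal surface by `tangent - theta`.
        print(f"theta={theta:5.1f}  frontal vs tangent={tangent:6.2f}  "
              f"gaze-normal vs tangent={tangent - theta:6.2f}")
```

Under these assumptions the tangent at θ = 10° is rotated by about 20° from the frontal plane, so a gaze-normal surface at that eccentricity is about 10° from the tangent. The relation is approximate because the cyclopean point lies slightly inside the circle.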
A vertical line in the median plane contains a horizontal-shear disparity due to the inclination of corresponding vertical retinal meridians (Section 14.7). A vertical line to one side of the median plane contains an extra horizontal-shear disparity because the line is nearer to one eye than to the other. A vertical frontal surface also contains vertical-shear disparity. These disparities must be allowed for in judging the frontality of a line or surface (Section 20.6.4).

In gaze-normal slant constancy the perceived orientation of a line or surface with respect to the normal to the cyclopean line of sight remains constant at all eccentricities. Consider a cyclopean line of sight at a headcentric horizontal eccentricity of θ. A surface normal to the line of sight is slanted by θ° with respect to the tangent to the horizontal horopter. The surface therefore contains both horizontal and vertical disparities, whether it is viewed with central gaze or with eccentric gaze. A surface normal to a cyclopean line of sight at a vertical retinal eccentricity of θ is at an angle of approximately θ + 2° with respect to the vertical horopter. The 2° is due to the inclination of the vertical horopter.

In slant constancy over changes in distance the perceived slant of a surface remains constant over changes in the absolute distance of the surface. It requires that horizontal-size disparities be scaled by distance. This type of constancy is discussed in Sections 20.6.4 and 20.6.5.

The following sections discuss sources of information that could be used to indicate the headcentric eccentricity of a textured flat surface.
20.2.2b Deformation Disparity

A horizontal-size disparity produced by a slanted line can be expressed as the difference between the angles subtended by the line at the two eyes. It can also be expressed as a ratio of the two angles, or a horizontal size ratio (HSR). An eccentric vertical line also produces an image in one eye that is longer than that in the other eye. This is a vertical-size disparity. It too can be expressed as an angular disparity or as a vertical size ratio (VSR). A size ratio multiplied by 100 is a percentage size difference. The difference between horizontal-size disparity and vertical-size disparity of corresponding images is a size-deformation disparity. A size-deformation disparity may also be expressed as the horizontal size ratio (HSR) divided by the vertical size ratio (VSR) of the two images (Section 19.2.3).

A vertical surface normal to an eccentric cyclopean line of sight is closer to one eye than to the other. One image is therefore magnified uniformly relative to the other. This is an overall size disparity. An overall size disparity contains equal horizontal-size disparity and vertical-size disparity. Thus, a surface normal to any cyclopean line of sight has a size-deformation disparity of zero when it is expressed as a difference, or of 1 when it is expressed as a ratio.

The vertical-size disparity produced by a surface patch at a given distance from the cyclopean point varies with its horizontal eccentricity but not with its slant (see Figure 19.24). The horizontal-size disparity of a surface patch varies with both its eccentricity and its slant. Therefore the size-deformation disparity of a patch at a given distance supplies information about surface slant to the cyclopean line of sight that is independent of eccentricity.

Backus et al. (1999) derived the following expression relating the slant of a surface to the cyclopean normal, to deformation disparity and distance to the surface, defined as the angle of vergence in radians, μ, required to fixate its center. Slant in the direction of the larger image is signed positive.

$$\text{Slant} \approx -\arctan\left(\frac{1}{\mu}\,\log_e\frac{\mathrm{HSR}}{\mathrm{VSR}}\right) \tag{1}$$
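Equation (1) can be evaluated directly once the vergence angle is known. The following sketch assumes that equation (1) reads Slant ≈ −arctan[(1/μ) ln(HSR/VSR)] and that vergence for a target straight ahead is 2 arctan(a/d), with a the half interocular distance. The function names, the 6.5-cm interocular distance, and the 57-cm viewing distance are hypothetical choices for illustration.

```python
import math

def vergence_rad(distance_cm, ipd_cm=6.5):
    """Vergence angle mu (radians) needed to fixate a point straight ahead
    at distance_cm, for an assumed interocular distance ipd_cm."""
    return 2.0 * math.atan((ipd_cm / 2.0) / distance_cm)

def slant_from_deformation_deg(hsr, vsr, mu_rad):
    """Slant (deg) relative to the cyclopean normal, from equation (1):
    slant = -arctan((1 / mu) * ln(HSR / VSR))."""
    return -math.degrees(math.atan(math.log(hsr / vsr) / mu_rad))

if __name__ == "__main__":
    mu = vergence_rad(57.0)
    # Horizontal magnification only: deformation disparity signals slant.
    print(slant_from_deformation_deg(hsr=1.02, vsr=1.00, mu_rad=mu))
    # Overall magnification (HSR == VSR): zero deformation, zero slant.
    print(slant_from_deformation_deg(hsr=1.02, vsr=1.02, mu_rad=mu))
```

With these assumed values a 2% horizontal size ratio on its own corresponds to a slant of roughly 10° in magnitude, whereas the same ratio applied to both dimensions gives zero slant, in line with the statement that an overall size disparity has a deformation ratio of 1.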
Knowledge of viewing distance allows an observer to scale slant for distance, and knowledge of VSR allows an observer to correct for changes in horizontal disparity arising from changes in stimulus eccentricity, or to correct for aniseikonia. Slant perceived in this way is subject to the induced effect discussed in the next section. Ebenholtz and Paap (1973) investigated the frontal-plane constancy of lines. They placed a 29-cm vertical test line in the median plane, with its center 25° above or below eye level. In a second condition, a horizontal test line at eye level was placed with its center 25° to the left or right of center. Disparity was the major cue to the line’s inclination or slant. Subjects matched the orientation of each test line relative to the frontal plane to that of a central coplanar comparison line 62 cm away at eye level. They could move their gaze from one line to the other. Subjects performed with almost complete accuracy in judging the inclination of the vertical lines. The slants of eccentric horizontal lines were slightly underestimated. The authors concluded that, in achieving orientation constancy to the frontal plane, people assess the pattern of binocular disparity in an eccentric line with respect to the angle of gaze. Since subjects were allowed to fixate each line, there would be little uncertainty about the retinal location of the image. However, if subjects had been required to remain fixated on the central target it is possible that the retinal eccentricity of the test lines would have been misregistered. It is not surprising that orientation constancy was better for vertical lines than for horizontal lines. A vertically eccentric vertical line contains horizontal-shear disparity along its whole length. A horizontally eccentric horizontal line contains disparities only at its ends. Performance should be better with a dotted horizontal line. 
A horizontal line on the visual horizon contains no vertical disparities, so that orientation constancy cannot be based on vertical disparity for such a line.
20.2.2c Oculomotor Signals

Information about object eccentricity, λ, could be derived from oculomotor signals involved in fixating the object. If the object is not fixated, oculomotor signals must be combined with signals that code the retinal position of the image. Stimulus distance could be obtained from horizontal vergence required to bring the images into zero disparity. Since log VSR is approximately equal to μλ, the predicted slant based on these signals can be derived by substitution from equation (1):

$$\text{Slant} \approx -\arctan\left(\frac{1}{\mu}\,\log_e \mathrm{HSR} - \tan\lambda\right) \tag{2}$$
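The substitution that leads from equation (1) to equation (2) can be checked numerically. The sketch below assumes the relation ln VSR ≈ μ tan λ, which reduces to μλ for small eccentricities. The function names and the vergence value of 0.1 rad are assumptions introduced for the example.

```python
import math

def slant_eq1_deg(hsr, vsr, mu_rad):
    """Equation (1): slant from deformation disparity and vergence."""
    return -math.degrees(math.atan(math.log(hsr / vsr) / mu_rad))

def slant_eq2_deg(hsr, mu_rad, eccentricity_deg):
    """Equation (2): slant from HSR, vergence, and eccentricity lambda."""
    lam = math.radians(eccentricity_deg)
    return -math.degrees(math.atan(math.log(hsr) / mu_rad - math.tan(lam)))

def vsr_from_eccentricity(mu_rad, eccentricity_deg):
    """VSR implied by the assumed relation ln(VSR) = mu * tan(lambda)."""
    return math.exp(mu_rad * math.tan(math.radians(eccentricity_deg)))

if __name__ == "__main__":
    mu = 0.1            # assumed vergence in radians
    eccentricity = 15.0  # degrees
    vsr = vsr_from_eccentricity(mu, eccentricity)
    print(f"VSR at {eccentricity} deg: {vsr:.4f}")
    for hsr in (1.00, 1.02, vsr):
        s1 = slant_eq1_deg(hsr, vsr, mu)
        s2 = slant_eq2_deg(hsr, mu, eccentricity)
        print(f"HSR={hsr:.4f}  eq(1)={s1:7.3f}  eq(2)={s2:7.3f}")
```

When the vertical size ratio is the one that the eccentricity would produce, the two expressions agree, and a surface whose HSR equals its VSR comes out gaze normal under either expression. The two estimates diverge only when eye position and vertical disparity signal different eccentricities.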
The component of horizontal disparity in an overall size disparity would create an impression of slant if no account were taken of the vertical disparity. This would be the case if horizontal disparities at different eccentricities were scaled only by eye position. But an overall disparity produces little or no slant. So horizontal disparities are not scaled only by eye position. A surface with overall size disparity appears normal to the line of sight, because the component of horizontal disparity is compensated for by the presence of an equal vertical disparity. Put another way, the zero deformation disparity indicates that the surface is gaze normal, whatever the angle of gaze. Miller and Ogle (1964) formed an afterimage of a centrally fixated frontal display. After the eyes had moved into a position of eccentric gaze, subjects set a test surface to appear coplanar with the afterimage. They set the test surface close to the tangent to the horopter. Miller and Ogle concluded that the angle of gaze is not used to scale disparities in eccentric gaze so as to preserve gaze-normal slant constancy. But subjects were not asked to report the apparent orientation of the afterimage or of the test surface. Matching the disparities of the two displays tells us nothing about slant constancy. Although eye-position signals are not used for slant constancy in the presence of a strong deformation disparity, they may be used when the deformation signal is weak. Backus et al. (1999) found that estimates of the slant of eccentric targets varied with changes in eye-position when vertical disparity was weakened by use of vertical lines or by reducing the diameter of the stimulus to less than 6°. They found that the two cues were approximately additive in a cue-conflict situation. Herzau and Ogle (1937) had also found slant constancy when vertical disparities were removed. However, they did not allow for possible effects of perspective. 
We will see that effects of perspective can be strong. We saw in Section 19.6.4 that a central frontal surface with vertical disparities appropriate to a surface in an eccentric position does not appear eccentric. Thus vertical disparities are not used to indicate gaze angle. Insofar as they are used for slant constancy, they must indicate stimulus eccentricity directly rather than through the mediation of eye position. Any cyclovergence induced by horizontal-shear disparity in a vertically eccentric surface would reduce the disparity and perhaps contribute to orientation constancy. Ogle (1964, p. 118) suggested that orientation constancy of frontal-plane objects as one raises or lowers the eyes is due to cyclovergence. However, horizontal-shear disparity evokes only weak cyclovergence (see Section 10.7.5a), so that orientation constancy must depend mainly on other processes. Cyclovergence cannot cancel the total pattern of disparity produced by a horizontally eccentric surface.
20.2.2d Intrusion of Other Depth Cues

Many stimuli used to investigate the role of binocular disparity in slant perception contained monocular cues that indicated that the test surface was not slanted. There was cue conflict between disparity indicating slant and other cues indicating no slant. For a spherical retina, perspective gradients depend only on the orientation of the surface to the line of sight. Because perspective in a cyclopean-normal surface does not vary with changes in distance or eccentricity, slant estimates based on perspective do not need to be scaled for these factors.

Backus et al. (1999) eliminated the effects of perspective. They varied deformation disparity in a random-dot display with VSRs of 1/1.027, 1, and 1.027, corresponding to cyclopean-normal surfaces at horizontal eccentricities of −15, 0, and +15°. They presented each disparity at each of three eccentricities by rotating each arm of the amblyoscope about the center of rotation of the eye. Subjects rotated each stereoscopic surface until it appeared to lie in a cyclopean-normal orientation. Since perspective was constant, it could not help in the performance of this task. Also, at the end-point, there was no conflict between perspective and the other cues to slant. Slant estimates conformed to slant indicated by deformation disparity. They were affected only slightly by conflicting information about eccentricity provided by gaze angle.

Overall, the evidence suggests that, when available, vertical-size disparity, rather than angle of gaze, is used to rescale horizontal-size disparities when a surface is moved into an eccentric location. Rescaling could be achieved locally or globally.
Figure 20.10. Slant in a single surface. (A) Horizontal-size disparity produces slant. (B) Vertical-size disparity produces slant in the opposite direction. (C) Overall size disparity produces little or no slant.
20.2.3 SIZE-DISPARITY INDUCED EFFECT
20.2.3a Basic Facts

A frontal surface appears to slant about a vertical axis when the image in one eye is horizontally magnified relative to the image in the other eye. The surface appears to slant away from the eye seeing the narrower image, as shown in Figure 20.10A. Ogle called this the geometric effect because it is predicted from the geometry of the situation: the image of a near-left, far-right surface has a larger horizontal extent in the right eye than in the left. The same surface appears to slant in the opposite direction when the image that was previously smaller along the horizontal meridian is made smaller along the vertical meridian, as in Figure 20.10B. Ogle (1938) called this the induced effect. The effect involves no horizontal disparities, so that any theory of binocular stereopsis that considers only horizontal disparities would not predict it. The effect was first reported by Lippincott (1889) and by Green (1889) but the first systematic experiments on the induced effect were conducted by Ogle (1938) at the Dartmouth Eye Institute. Little or no slant is seen
when one image is magnified equally along both axes, as in Figure 20.10C. Westheimer (1978) could not obtain an induced effect with vertical lines or with a square subtending 24 arcmin and exposed for 500 ms, although horizontal magnification of the image in one eye produced the expected slant. Westheimer argued that the visual system does not have the necessary sensitivity to extract vertical disparities. In a later study using an inverted V configuration of five dots, Westheimer (1984) found that sensitivity to vertical disparities was at least an order of magnitude less than for horizontal differences (threshold for vertical magnification = 11.9%; threshold for horizontal magnification = 1.6%). However, in both these studies the stimulus subtended less than 1° × 1°. Stimuli this small do not produce detectable vertical disparities. Figure 20.11 shows the angles subtended in the two eyes by a vertical line sitting on the horizontal plane of regard as a function of its length, (z), its linear horizontal eccentricity from the cyclopean point, (x), and its distance
Figure 20.11. Images of an eccentric vertical line. The angle λR subtended at the right eye by a vertical line of length z, x cm to the right of cyclopean point P, and y cm from the interocular axis is given by:

$$\lambda_R = \arctan\frac{z}{\sqrt{(x-a)^2+y^2}}$$

The angle λL subtended in the left eye is given by:

$$\lambda_L = \arctan\frac{z}{\sqrt{(x+a)^2+y^2}}$$

Figure 20.12. Apparatus used to measure the induced effect. Subjects viewed the textured transparent surface through a slit and set it to the apparent frontal plane. Aniseikonia in the horizontal and vertical meridians was indicated by errors about each of the two axes. (Adapted from Ogle 1939a)
from the cyclopean point (y). The difference between these two angles defines the vertical disparity of the images. For small angles of eccentricity and near distances, the disparity of a vertical line is approximately proportional to its angle of eccentricity and inversely proportional to its distance. Ogle derived the predicted slant, ψ, for a given horizontal magnification or vertical magnification, M, for one eye’s image using the expression:

$$\psi = -\arctan\left(\frac{d\,(M-1)}{a\,(M+1)}\right) \tag{4}$$

in which d is viewing distance, and a is half the interpupillary distance (Householder 1943).

The induced effect has three characteristics: (1) gain, or slant produced by 1% of vertical magnification of one image, (2) the range over which the effect is proportional to image magnification, and (3) peak value. Ogle measured the geometrical and induced effects by asking subjects to set a 30 cm by 30 cm glass plate covered with random dots to the apparent frontal plane while viewing the plate through a lens that magnified the image in one eye horizontally or vertically. The plate was 40 cm from the eyes and was viewed through a horizontal slit that restricted the visual field to the plate, as shown in Figure 20.12. Ogle ignored the fact that perspective indicated that the surface was frontal. Figure 20.13 shows results for one subject. For moderate degrees of vertical disparity, perceived slant increased in proportion to relative vertical magnification and distance, with a gain of between 3 and 3.5° of slant per 1% of magnification. For small disparities, the gain for horizontal magnification was the same as that for vertical magnification. Thus, a given vertical disparity produced
slant similar to that produced by the same degree of horizontal disparity. However, the linear range extended over 10% for horizontal magnification but over only about 3% for vertical magnification. The peak value was also larger for horizontal magnification. This point is discussed later in this Section. Overall magnification of one eye’s image induced little or no apparent slant of the test surface. In this case, the geometric effect produced by horizontal disparity canceled the opposite induced effect produced by vertical magnification. Ogle noted that differences in image magnification due to anisometropia rarely exceed 6%. Ogle concluded, “some mechanism compensates for the difference in the sizes of the images in the vertical meridian but can only do so by an overall change in the relative sizes of the ocular images.” He conjectured that the induced effect might be related to the fact that the relative size of the images changes as a stimulus is moved into the periphery of the visual field. Ogle (1939a) obtained an induced effect with a stimulus consisting of two vertical black bars in the median plane symmetrically arranged above and below a horizontal row of dots. Fixation was on the center dot. The vertical eccentricity of each bar varied between 1 and 11.4°. Geometrically, the vertical disparity in the images of the bars was proportional to the magnification of one image and the eccentricity of the bars. For a 10% difference in image size, the induced effect increased slightly with increasing eccentricity of the test bars. Ogle argued that the increasing vertical disparity with increasing image eccentricity was largely offset by the decrease in sensitivity of the peripheral retina to vertical disparity. The other possibility is that image magnification is derived by scaling disparity with eccentricity. In a third paper Ogle (1939b) found that the induced effect decreased as the inclination of a textured surface about a horizontal axis increased. 
The maximum effect occurred when the test surface was inclined slightly backward to conform to the inclined vertical horopter. Also, the
induced effect produced by a horizontal row of dots and two vertical test bars decreased as horizontal disparity was added to the bars so that they were removed from the depth plane of the row of dots. Thus, in both cases, the induced effect was greatest when the whole display contained zero horizontal disparities.

Ogle (1940) found no consistent shifts in the induced effect when a surface was moved 10° to the left or right of the midline. Subjects set the surface to the normal to the cyclopean axis. He expected a difference, because a surface normal to the cyclopean axis at an eccentricity of 10° is at an angle of about 10° to the tangent to the horizontal horopter. It therefore contains a gradient of horizontal disparity. He suggested that we compensate for this gradient of disparity when setting an eccentric surface orthogonal to the cyclopean line of sight. A surface away from the median plane of the head subtends a larger vertical angle in one eye than in the other. Ogle reasoned that for correct registration of the slant of the surface, the retinal images are scaled (isotropically) to the same height. If this were the case, a vertical magnification of one eye’s image should be interpreted as a consequence of eccentric viewing. The resulting scaling induces a horizontal disparity difference, which is interpreted as slant with respect to the direction of gaze. Another way of thinking about the induced effect is that the only real-world surface at eccentricity θ that creates a vertical size difference but no horizontal size difference is one tangential to the horizontal horopter and at a slant of θ with respect to the cyclopean normal (see Figure 14.6) (Gillam and Lawergren 1983).

Ogle (1964, p. 248) pointed out that, for a pair of crossed oblique lines at ±45°, a vertical magnification of one image is geometrically equivalent to a horizontal magnification of the other image.
Thus, for oblique lines, the induced effect can be explained without reference to vertical disparities. See Arditi et al. (1981b) and Arditi (1982). However, the induced effect occurs for stimuli in which this explanation does not apply. For example it occurs in a set of vertical lines. Although a pair of oblique lines provides horizontal disparities consistent with slant about a vertical axis, multiple oblique lines that maintain their horizontal separation when vertically magnified, are more consistent with inclination about a horizontal axis, as seen in the induced effect (Mayhew and Frisby 1982; Gillam and Lawergren 1983). Mayhew (1982) suggested that vertical disparities provide information about convergence distance and the angle of horizontal eccentric gaze. Mayhew and Longuet-Higgins (1982) proposed that a vertical-size disparity in the induced effect indicates that the gaze is eccentric. Since horizontal disparities are a function of gaze angle, the perceived slant of all surfaces in the scene would be affected by the information that the eyes are in eccentric gaze. According to this theory, the induced effect is due to the effect of vertical
disparity on perceived angle of gaze, rather than to a direct effect of vertical-size disparity on perceived slant. There are two good reasons to reject this theory. In the first place, the conclusion that vertical-size disparity indicates angle of gaze was based on vertical disparities on flat retinas (see Section 19.6.5). On spherical retinas, vertical disparities depend on the axis system used to specify visual directions. If gaze elevation is specified by parallel horizontal lines of latitude, vertical disparities do not vary with changes in horizontal gaze. This is because parallel horizontal lines on the retina remain parallel when an eye rotates about a vertical axis through its center. In the second place, the theory predicts that the induced effect affects all surfaces in a visual scene because registration of horizontal gaze is a single viewing-system parameter. We will see in Section 22.7 that distinct induced effects may be produced in different parts of the visual scene. The alternative theory is that vertical disparities are used directly to code depth. This theory was discussed in Section 19.6.6. Supporting evidence will be presented in what follows.
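Ogle's expression (4) for the predicted slant can be evaluated for the conditions of his experiment. The sketch below takes the expression to be ψ = −arctan[d(M − 1) / (a(M + 1))], with d the viewing distance and a half the interpupillary distance. The function name and the 3.2-cm value for a are assumptions made for illustration. Only the 40-cm viewing distance comes from the text.

```python
import math

def ogle_predicted_slant_deg(magnification, distance_cm, half_ipd_cm=3.2):
    """Predicted slant psi (deg) for magnification M of one eye's image,
    using psi = -arctan(d * (M - 1) / (a * (M + 1)))."""
    m = magnification
    ratio = distance_cm * (m - 1.0) / (half_ipd_cm * (m + 1.0))
    return -math.degrees(math.atan(ratio))

if __name__ == "__main__":
    distance_cm = 40.0  # viewing distance used by Ogle (1938)
    for percent in (1, 2, 3, 5, 10):
        m = 1.0 + percent / 100.0
        psi = ogle_predicted_slant_deg(m, distance_cm)
        print(f"{percent:2d}% magnification -> predicted slant {psi:7.2f} deg")
```

Under these assumptions the prediction is about 3.6° of slant for 1% magnification and about 31° for 10%, which is of the same order as the reported gain of 3 to 3.5° and the range of the ordinate in Figure 20.13. Because the arctangent compresses large values, the predicted relation departs from strict proportionality at the larger magnifications.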
20.2.3b The Induced Effect and Disparity Magnitude

All theories of the induced effect, based on vertical size disparity, predict that the effect should be the same as the geometric effect for the same magnification. However, Ogle (1938) found perceived slant to be a linear function of horizontal magnification over a 10% range but that the induced effect was linear only up to about 3% of vertical magnification (Figure 20.13). Other investigators have observed the same difference, and the following theories have been proposed to account for it.

1. Mayhew and Longuet-Higgins (1982) and Frisby (1984) argued that vertical magnifications larger than 5% are created only by impossibly large gaze angles and very close viewing distances. The drop-off in the induced effect beyond this 5% value is therefore consistent with the characteristics of binocular images within the normal working range of distances and angles of gaze.

2. Gillam et al. (1988a) attributed the roll-off in the induced effect to conflicting indicators of eccentricity given by the eye movement system. They used a slant-matching technique rather than the nulling procedure. They found that the induced effect was greater than the geometric effect for magnifications under 1% and reached an asymptote at approximately 2% magnification.

3. Banks and Backus (1998) attributed the roll-off of the induced effect to conflict between deformation disparity and perspective and between deformation
Figure 20.13. Differential magnification and apparent slant. Errors in setting a textured surface to the frontal plane as a function of % horizontal or vertical magnification of the left- or right-eye image. The stimulus filled the binocular field at a distance of 40 cm (N = 1). Axes: % image magnification of the left-eye or right-eye image (0 to 10) against rotation of the object plane (deg, −30 to 30), with separate curves for horizontal and vertical magnification. (Adapted from Ogle 1938)
disparity and gaze angle. The linear range of the induced effect increased when they removed linear perspective cues from the stimulus. Slant induced by horizontal-size disparity was the same with strong as with weak perspective. To reveal why conflicting perspective affects only the induced effect, they independently varied disparity and the angle of eccentric gaze by rotating the arms of a haploscope. When vertical-size disparity corresponded to the angle of gaze that would normally produce it, the induced effect was a linear function of disparity over a large range. This was so even with strong conflicting perspective cues. Also, conflict between horizontal disparity and angle of gaze attenuated the geometric effect. Berends and Erkelens (2001a) measured the horizontalsize disparity required to null perceived slant produced by a given magnitude of vertical-size disparity. This procedure avoids contaminating effects of conflicting perspective cues because, at the null point, the surface appears frontal. They found large individual differences in the null point, suggesting that different subjects assign different weights to horizontal-size and vertical-size disparities.
20.2.3c The Induced Effect and Viewing Distance
Theoretically, slant produced by a given horizontal-size disparity or by a given vertical-size disparity should decline with increasing viewing distance. Ogle (1938) found that
slant produced by horizontal disparity declined with distance. The induced effect also declined with distance, although not by the predicted amount. Gillam et al. (1988a) and Rogers et al. (1995) found that slant produced by horizontal disparity varied with distance but that the induced effect did not vary with distance. Ogle’s subjects adjusted the physical slant of the test surface until it appeared to lie in the frontal plane. When the induced effect is measured by this procedure, the rotation of the surface to the apparent frontal plane introduces a horizontal-size disparity. Gillam et al. and Rogers et al. used a slant matching procedure, which leaves horizontal and vertical disparities in the test surface unchanged. Backus and Banks (1999) obtained Ogle’s results when they used a nulling procedure and the other results when they used a matching procedure in which subjects set the angle between two frontal plane lines to match the slant of the previously seen test surface. Frisby (1984) suggested that the induced effect is not correctly scaled for distance because it involves disparity and gaze-angle combinations that do not occur naturally. Backus and Banks (1999) pointed out that, in the induced effect produced in a random-dot stereogram, deformation disparity indicates a slanted surface, but horizontal disparity plus eye-position signals indicate a frontal surface. Thus, for the induced effect, the two estimates do not agree. For the geometric effect, both signals indicate a slanted surface. For both the geometric and induced effects, slant estimates based on monocular cues indicate that the surface is in a frontal plane. Backus and Banks independently varied disparity and angle of eccentric gaze by rotating the arms of a haploscope. The induced effect was scaled correctly with distance when the angle of gaze was appropriate to the vertical disparity in the stereogram but not when gaze was frontal. 
On the other hand, the geometric effect scaled with distance when gaze was frontal but not for eccentric gaze. For the geometric effect in eccentric gaze, slant estimates based on deformation disparity did not agree with those based on horizontal disparity plus eye-position signals. These results suggest that cue conflict is responsible for lack of distance scaling of the induced effect. Backus and Banks developed a model of slant estimates based on a weighted average of the available information, with weights determined by reliabilities of the signals. The reliability of deformation disparity decreases with increasing distance more rapidly than does the reliability of horizontal disparity plus eye position. Thus, as distance increases, information that a surface is frontal gains in strength over that indicating that it is slanted. The geometric effect scales with distance because deformation disparity and horizontal disparity agree at all distances. We need also to explain why the induced effect measured by the matching procedure is not scaled with distance while the effect measured by the nulling procedure is.
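The weighted-average idea can be made concrete with a small numerical sketch. This is not the fitted model of Backus and Banks: the function name, the use of inverse variance as the measure of reliability, and every numerical value below are assumptions chosen only to illustrate how the balance between the two estimates could shift with distance.

```python
def combine_slant_estimates(estimates, variances):
    """Reliability-weighted average of slant estimates (deg).

    Reliability is taken to be inverse variance. That choice is an
    assumption of this sketch; the text says only that the weights
    depend on the reliabilities of the signals.
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    return sum(w * s for w, s in zip(weights, estimates))


# Induced effect: deformation disparity signals a slanted surface, while
# horizontal disparity plus eye position signals a frontal surface.
slant_from_deformation = 20.0   # deg (hypothetical)
slant_from_eye_position = 0.0   # deg (frontal)
estimates = [slant_from_deformation, slant_from_eye_position]

# Hypothetical variances: the deformation estimate becomes unreliable
# faster with distance than the eye-position estimate does.
near = combine_slant_estimates(estimates, [1.0, 4.0])
far = combine_slant_estimates(estimates, [16.0, 8.0])
```

With these made-up variances the combined estimate at the far distance lies closer to frontal than the estimate at the near distance, which is the qualitative pattern described for the induced effect.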
Backus and Banks suggested that nulling shows some distance scaling because the surface is rotated until deformation disparity is zero, its most reliable value. Monocular depth cues could also contribute to the difference between slant nulling and slant matching. In slant nulling, the surface is rotated until it appears frontal, where there is no conflict with monocular cues. In the matching task, monocular cues remain in conflict.
At a given viewing distance, the same apparent slant can be created by different combinations of horizontal and vertical size disparities. The crucial factor determining apparent slant is the difference between the two types of disparity (size-deformation disparity). Backus (2002) confirmed that subjects cannot distinguish between the slants of textured surfaces with the same deformation disparity but differing proportions of horizontal and vertical disparities. The distinct patterns of disparity metamerize into the same percept. However, two metameric surfaces became distinguishable when moved to a different viewing distance, because the two types of disparity vary in different ways with viewing distance. Disparity metamerism is discussed further in Section 21.6.2d.
20.2.4 LOCAL VERSUS GLOBAL VERTICAL-SIZE DISPARITY
Koenderink and van Doorn (1976) proposed that vertical-size disparity is detected at each location and is used directly to scale local horizontal-size disparities. The opposite theory, proposed by Mayhew (1982) and Frisby (1984), is that vertical-size disparities are used to compute viewing distance and gaze angle, rather than being used directly to scale horizontal disparities. We will see that although horizontal disparities are detected locally, vertical disparities are pooled over large areas, although not over the whole visual field. Horizontal-size disparities must be registered at each small location because they allow us to detect the multitude of rapidly changing disparities that indicate the fine structure of the 3-D scene. Local vertical-size disparities produced by surfaces at different distances are always accompanied by larger horizontal disparities and are therefore not needed for coding relative depth. Vertical-size disparity on a frontal surface changes only gradually with increasing eccentricity. The mean vertical-size disparity over a large area is sufficient for estimating eccentricity, for rescaling local horizontal disparities, and for compensating for aniseikonia (see Sections 9.9.3 and 19.6.1). Let us look at the evidence on this question.
20.2.4a Spatial Resolution of Vertical Disparities
Kaneko and Howard (1997b) showed that vertical-size disparities are averaged over large areas. They used a
random-dot display that subtended 55° by 55° and sinusoidally modulated vertical-size disparity over the display at various spatial frequencies. They then determined the amplitude of modulation of vertical disparity required to produce an impression of depth modulation as a function of the spatial frequency of the disparity modulation. For their two subjects, depth impressions were not elicited at any disparity amplitude for modulation frequencies higher than about 0.04 cpd. This indicates that vertical-size disparities in the central visual field were averaged over an area up to about 20° wide. Serrano-Pedraza et al. (2010) tested 10 subjects with square-wave modulations of vertical disparity. Disparity averaging occurred over widths of between 9° and 58°. Six subjects were maximally sensitive to a frequency of modulation of about 0.04 cpd rather than to a constant level of magnification (they showed a band-pass function). This may be because neighboring regions with different induced slants produce slant contrast. An overall induced slant would tend to normalize to the frontal plane. Modulations of horizontal-size disparity are perceived up to about 4 cpd with a peak sensitivity at about 0.3 cpd (Section 18.6.3).
20.2.4b Distinct Adjacent Vertical Disparities
The area over which vertical-size disparities are pooled can be determined by using two abutting surfaces with different vertical-size disparities but the same horizontal-size disparity. If the surfaces fall within the area over which vertical disparities are pooled they will appear to have the same slant. The perceived relative slant of the two surfaces will be greatest when they are separated so that they occupy distinct regions of disparity pooling. A vertical-size disparity over the whole visual scene was traditionally created by an aniseikonic lens in front of one eye (Ogle 1938; Gillam et al. 1988a), but computer-generated displays are now used. Stenton et al. (1984) reported that pooling of vertical-size disparities occurs in a 7.2° by 7.2° display of 16 points with a range of vertical disparities. Rogers and Koenderink (1986) found that different slants could be seen in two patches within the same stereogram that had different vertical-size disparities but the same horizontal-size disparity. The effect may be observed in Figure 20.14. However, the slant was much less than predicted by the local deformation disparity or that produced by two patches with opposite horizontal-size disparity. These results indicate that vertical-size disparities are pooled over a large area, although not over the whole binocular field. Gårding et al. (1995) came to the same conclusion.
Figure 20.14. Opposite induced effects. In the upper stereogram, the left-eye pattern is vertically magnified. In the lower stereogram, the right-eye pattern is vertically magnified (with divergent viewing). There are no horizontal disparities. If perception of slant is based on local deformation, separate slants should be seen in the two stereograms. If a whole-field parameter, corresponding to the angle of eccentric gaze, is applied to all disparity gradients in the scene, there should be no difference in slant in the two stereograms.
Kaneko and Howard (1996) produced further evidence that vertical-size disparities are pooled over a large area. Subjects set an unseen tactile paddle to match the perceived
slant of a textured surface subtending 60° by 60° in totally dark surroundings. Figure 20.15A shows the perceived slant created by various magnitudes of horizontal-size disparity, vertical-size disparity, and overall-size disparity. These results are similar to those produced by Ogle (1939b) and by Gillam et al. (1988a). If vertical-size disparities are pooled over a large area, the slant produced by a given vertical-size disparity should be reduced if the display with vertical disparity is surrounded by an annulus with zero disparity, as shown in Figure 20.16A. On the other hand, slant produced by horizontal-size disparity should be enhanced by a surrounding zero-disparity annulus, as in Figure 20.16B, because of the strong relative disparity signal at the boundary between the two displays. Overall magnification of one image should produce some slant in a display surrounded by a zero-disparity annulus. This is because the visual system will average out the vertical disparity but register the horizontal disparity. In a second experiment, Kaneko and Howard had subjects set the tactile paddle to match the slant of each of two simultaneously presented random-texture displays. The first was a 30°-diameter central display with various vertical or horizontal size disparities. The second was a surrounding display with zero disparity. The results are shown in Figure 20.15B. As predicted, the zero-disparity surrounding display reduced the slant produced by vertical disparity of the central display, because it reduced the mean
vertical disparity. But the zero-disparity surround enhanced slant produced by horizontal disparity, because it provided a contrasting slant.
Figure 20.15. Perceived slant and size disparity. (A) Perceived slant of a 60°-diameter random-dot display as a function of the magnitude and type of size disparity. (B) Perceived slant of a 30°-diameter random-dot display with various horizontal-size disparities or vertical-size disparities as a function of the presence (solid dots) or absence (hollow dots) of a surround with zero disparity. (C) Perceived slant of a random-dot display containing dots with 4% horizontal-size disparity or vertical-size disparity mixed with a variable ratio of dots with zero disparity. Dots with horizontal-size disparity and nondisparate dots formed distinct slanted surfaces that increased in perceived slant as the percentage of disparate dots increased. Dots with vertical-size disparity and zero-disparity dots formed one surface that increased in perceived slant as the percentage of disparate dots increased (N = 3). (Adapted from Kaneko and Howard 1996)
Figure 20.16. Slant in a surface with zero-disparity surround. (A) Vertical disparity in the central square produces no slant. (B) Horizontal disparity of the central square produces slant. (C) Overall disparity of the central square produces some slant.
20.2.4c Distinct Superimposed Vertical Disparities
A random-dot display containing a mixture of two horizontal-size disparities creates two superimposed slanted surfaces. This is because horizontal disparities are detected locally. However, Kaneko and Howard (1996) showed that a random-dot display with a mixture of two vertical-size disparities appears as one surface at an intermediate slant. The results are shown in Figure 20.15C for displays containing a mixture of zero-disparity dots and dots with 4% vertical-size disparity. The surface with zero-disparity dots appeared to slant in the opposite direction to the surface with dots with horizontal-size disparity. This is a slant contrast effect. Adams et al. (1996) measured horizontal-size disparity required to null the perceived curvature in depth of a random-dot display in which half the dots possessed vertical-size disparity and the other half zero vertical-size disparity. For moderate vertical disparities, the perceived curvature depended on the mean vertical disparity. For large vertical-size disparities, the dots with zero disparity were more heavily weighted. Porrill et al. (1999) confirmed that mixed vertical-size disparities in a random-dot display are averaged for disparity differences up to about 8°. Beyond this point, perceived slant was determined by one or the other of the disparities. Observers did not see two transparent planes, as they do when dots with distinct horizontal-size disparities are superimposed. Thus, when the difference in vertical-size disparity in two sets of superimposed dots exceeds a certain value, subjects disregard one or the other disparity.
20.2.4d Vertical Disparities in Distinct Depth Planes
The above evidence shows that vertical-size disparities in the same depth plane are averaged over large areas. But what happens when texture elements with distinct vertical-size disparities are distributed in two distinct depth planes created by horizontal disparity? This question requires some theoretical background. The pattern of horizontal disparities over a large textured surface provides accurate information about surface curvature only if the distance of the surface is registered (Section 20.6.5). The pattern of vertical disparities can provide the required distance information. But how that information is registered depends on the axis system used to measure vertical disparities. Vertical disparities registered in terms of horizontal lines of latitude do not change with changes in horizontal vergence because lines of latitude lie in planes orthogonal to the vertical axis of eye rotation. But vertical disparities registered this way do change with increasing distance along a cyclopean line of sight. Therefore, distinct patterns of vertical-size disparity will be produced at each distance in depth. These distinct patterns could provide the distance information required for scaling the curvature in depth of each surface. Vertical disparities registered in terms of horizontal lines of longitude vary with changes in vergence but not with changes in distance. Vertical disparities registered in this way would indicate the vergence state of the eyes and would therefore provide only indirect information about distance, as proposed by Mayhew and Longuet-Higgins (1982). However, the single parameter of vergence state would not allow one to detect different curvatures at different distances. Duke and Howard (2005) asked whether the apparent curvature in depth of two superimposed surfaces can be independently manipulated by introducing distinct patterns of vertical disparity in each surface. 
In the first experiment, a row of dots was superimposed along the midhorizontal axis of a 30 by 30° random-dot stereoscopic display.
The display had a pattern of horizontal disparity corresponding to a frontal surface at a distance of 45 cm and a pattern of vertical disparity that produced a concave or convex curvature in depth of the surface. When the central row of dots (which had no vertical disparity) was in the same depth plane as the display, it had the same apparent curvature as the display. But when the array had a horizontal disparity of 20 arcmin that moved it out of the plane of the row of dots, the vertical disparities in the array had no effect on the perceived curvature of the row of dots. Thus, the effects of the vertical disparities in the array were confined to the region of depth within which the display lay. In the second experiment, two 30 by 30° random-texture displays were superimposed in a stereoscope. The texture elements of the two arrays were presented in alternate columns so that they did not coincide, and they differed in shape and color so that subjects could distinguish them. When the two displays were in the same depth plane they appeared as a single surface with a curvature in depth that depended on the average of the vertical disparities in the two displays. But when they were separated in depth by more than 20 arcmin of horizontal disparity, the curvature of each surface was determined only by the vertical disparities in that surface. These results are incompatible with the idea that vertical disparities are registered in terms of lines of longitude and pooled over the visual field to provide information about the state of vergence. Rather, they support the idea that vertical disparities are registered in terms of lines of latitude. Vertical disparities of this type do not vary with horizontal vergence, because lines of latitude are parallel to the direction of the eye movements. These vertical disparities indicate distance directly. 
Vertical disparities are pooled locally within each depth plane, as discussed in Section 20.2.4a, but they are not pooled across sufficiently separated depth planes. They thus allow the curvatures of surfaces at different distances to be coded independently.
20.2.5 PHYSIOLOGICAL THEORIES OF THE INDUCED EFFECT
Matthews et al. (2003) proposed a physiological theory of the induced effect. They started by pointing out that cortical cells detect disparity orthogonal to their preferred orientation. The cells cannot separate horizontal and vertical disparity components. This is the disparity aperture problem. Populations of cells with different preferred orientations could produce separate estimates of horizontal and vertical disparities. However, Matthews et al. suggested that cortical cells simply pool horizontal and vertical disparities. Accordingly, a cell registers a vertical disparity as an equivalent horizontal disparity. A vertical disparity in stimuli with a predominant oblique orientation will produce an equivalent horizontal disparity in cells tuned to that orientation. A vertical disparity in vertical lines will not be detected, and
a vertical disparity in horizontal lines will produce a very large equivalent horizontal disparity that is beyond the range of detection. Vertical disparity in stimuli lacking a dominant orientation, such as dots, activates cells tuned to all orientations. Assume that preferred orientations are isotropically distributed in the visual cortex. Then the equivalent horizontal disparities produced over any local set of cells should average to zero, and there should be no induced effect. However, there is a radial bias of orientation selectivity of monosynaptically driven simple and complex cells in the cat’s visual cortex (Vidyasagar and Henry 1990). From this radial bias, Matthews et al. predicted that vertical disparity in a diagonal array of dots should produce an oblique effect. Figure 20.17 shows the predicted induced effects. In each array of dots, the center dot has a vertical disparity with respect to the other dots. For the reasons given above, vertical disparities in horizontal or vertical dot arrays produce no depth, as can be seen in Figure 20.17A. For a given oblique orientation of the dots, the sign of the induced depth effect depends on the sign of the vertical disparity, as can be seen in Figure 20.17B. However, with the same vertical disparity, the sign of the induced effect is reversed when the array is oriented in the opposite direction, as in Figure 20.17C. Some predictions can be made from this theory. A large display with vertical size disparity should appear curved in depth because the radial bias increases with eccentricity. The effect should vary with changes in orientation of texture elements. Alternating the sign of vertical disparity should produce wavy undulations in depth. Read and Cumming (2006) proposed that detectors of vertical disparity are not required to explain the size-disparity induced effect. 
They argued that vertical disparity reduces the correlation between the inputs from the two eyes to binocular cells sensitive only to horizontal disparity. This reduces the binocular energy in these cells by an amount that depends on the magnitude of vertical disparity. Decorrelation produced by vertical disparity would affect mainly small receptive fields while that produced by a general loss of interocular correlation affects all receptive fields equally. According to this proposal, there is no explicit computation of vertical disparities. However, it does not account for the evocation of vertical vergence by vertical disparity. In any case, more recently, Serrano-Pedraza and Read (2009) produced evidence that vertical disparities are explicitly coded. They produced an induced effect in a stimulus that simulated a surface at infinity for which the sign of slant produced by the induced effect could not be detected simply by image decorrelation. The induced effect produced by this stimulus was indistinguishable from that produced by a standard stimulus. They concluded that both the magnitude and sign of vertical disparities are explicitly coded in the visual system.
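The notion of an equivalent horizontal disparity in the account of Matthews et al. can be illustrated with a little geometry. The sketch below is an illustration under stated assumptions, not their model: it supposes that a cell registers only the component of disparity orthogonal to its preferred orientation, and the function name, the sign convention, and the numerical values are all hypothetical.

```python
import math


def equivalent_horizontal_disparity(vertical_disparity, orientation_deg):
    """Horizontal disparity that gives an oriented detector the same
    orthogonal disparity component as a purely vertical disparity.

    orientation_deg is the preferred orientation, measured
    counterclockwise from horizontal (90 = vertical). The sign
    convention is an assumption of this sketch. An orientation of
    exactly 0 is excluded because the result is unbounded.
    """
    theta = math.radians(orientation_deg)
    # Component of disparity orthogonal to the preferred orientation:
    #   a horizontal disparity h contributes  h * sin(theta)
    #   a vertical disparity v contributes   -v * cos(theta)
    # Setting the two equal gives h = -v * cos(theta) / sin(theta).
    return -vertical_disparity * math.cos(theta) / math.sin(theta)


v = 2.0  # vertical disparity in arcmin (hypothetical value)

oblique_45 = equivalent_horizontal_disparity(v, 45)
oblique_135 = equivalent_horizontal_disparity(v, 135)    # opposite sign
vertical = equivalent_horizontal_disparity(v, 90)        # effectively zero
near_horizontal = equivalent_horizontal_disparity(v, 5)  # very large

# With preferred orientations sampled evenly and symmetrically about
# vertical, the equivalent horizontal disparities cancel.
orientations = range(10, 180, 10)
pooled = sum(equivalent_horizontal_disparity(v, a) for a in orientations) / len(orientations)
```

Under these assumptions the sketch reproduces the qualitative claims in the text: no equivalent disparity for vertical elements, a very large one for near-horizontal elements, a sign reversal between the two obliques, and cancellation when orientations are isotropically distributed.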
Figure 20.17. An induced effect from a small array of dots. (A) Vertical disparities in horizontal or vertical dot arrays do not produce depth. (B) In oblique arrays, outer dots with vertical disparity appear beyond or nearer than the center dot. (C) Sign of depth is reversed when the array of dots is reversed. The patterns of disparity produced by the stereogram are shown on the right. (Adapted from Matthews et al. 2003)
20.3 PERCEPTION OF INCLINATION
20.3.1 ORIENTATION DISPARITY AND INCLINATION
A surface inclined to the frontal plane of the head produces a horizontal-shear disparity (Figure 19.2b). Thus, horizontal-shear disparity is the deformation disparity that is specifically related to inclination. A horizontal-shear disparity can be defined as an orientation disparity of vertical elements or as a vertical gradient of horizontal disparity. The local inclination of a surface, as specified by horizontal-shear disparity (HSh), for a surface at a convergence distance m is approximately:
$$\text{Inclination} \approx -\arctan\left(\frac{1}{m}\tan\left(\mathrm{HSh} - w\right)\right) \tag{4}$$
where w is the magnitude of cyclovergence (Banks et al. 2001).
The following psychophysical procedures have been used to investigate whether people use orientation disparities, gradients of horizontal disparity, or point disparities in perceiving surface inclination.
20.3.1a Conflicting Orientation and Positional Disparities
In this procedure the stimuli contain orientation disparities between line elements, but the horizontal disparities of individual points in a local area are either zero or average to zero. An unfortunate consequence of this strategy is that information from orientation disparity conflicts with that from point disparities, at least at some spatial scales. Von der Heydt et al. (1978) used this procedure with randomly spaced vertical lines that were uncorrelated in the two eyes. The lack of binocular correlation between individual bars meant that point disparities based on linking nearest neighbors were distributed symmetrically around zero disparity. Differences in orientation of the dichoptic gratings of between 0.3° and 20° created impressions of inclination. Consistent with these psychophysical findings, Hänny et al. (1980) found six cells in monkey striate cortex that responded to stereoscopic patterns containing orientation disparity but randomly distributed horizontal disparities. Although these results appear to be good evidence for the use of orientation disparity, the argument is not conclusive. At any moment, a pair of bars with an orientation disparity also creates a vertical gradient of horizontal point disparities. The perception of inclination could be based on this gradient rather than on orientation disparity. Ninio (1985) adopted a slightly different technique for investigating orientation disparities. He used a static and dense array of short line elements (needles) distributed over a four-sided pyramid. The needles were binocularly correlated and the positional disparity of each of the needle centers was always appropriate to its location on the surface of the pyramid. When there was an appropriate disparity between the ends of the needles and when the needles had orientation disparities appropriate for line elements covering the faces of the pyramid, observers judged the pyramid to be smooth. 
When needles had identical orientations in the two eyes (signaling a frontal orientation), the pyramids were judged to be “bristly” with the needles protruding out of the faces (see Figure 15.13). In the two experimental conditions, the needles had either (1) orientation disparities appropriate for the inclination of the pyramid faces but no appropriate horizontal disparities between the tips of the needles or (2) appropriate horizontal disparities between the tips of the needles but no orientation disparities. These manipulations were achieved by independently varying the orientation and length of the needles in one eye’s image. Initially, all of the needles were approximately the same orientation (+45° or −45°).
Under these circumstances, observers judged the surfaces of the pyramids to be slightly smoother in the first condition than in the second. Ninio interpreted these results in terms of an orientation-disparity mechanism that is at least as important as the positional disparity mechanism in the visual system. However, there are two problems with this interpretation. First, it assumes that positional disparities are derived only from the tips of the needles, and second, it assumes that inappropriate vertical disparities do not disrupt the horizontal disparity mechanism. In another experiment, the same judgments were made using pyramids covered with both +45° and −45° needles. In this case, angular disparities between pairs of oriented needles provided unambiguous information about direction of surface slant and ruled out torsional misalignment as a possible cause. As a result, this stimulus provided a better test for the use of orientation and angular disparities. However, the opposite pattern of results was reported. The pyramid appeared more bristly when orientation disparities of the needles were appropriate for a smooth surface and positional disparities were inappropriate than vice versa (Ninio 1985).
Figure 20.18. Inclination threshold and display size. Thresholds for discrimination of inclination (ground/sky plane) as a function of display size. The display was a grid of white lines on a dark ground oriented at ±45° or 0°/90°. The threshold decreased sharply with increasing display size up to 5°, beyond which it remained constant (N = 2). (Adapted from Cagenello and Rogers 1993)
20.3.1b Effects of Element Orientation
In the second procedure for investigating the role of orientation disparity in the perception of inclination, the stimuli contain point and orientation disparities that are both consistent with the inclination of the depicted surface. The oriented features are then adjusted to maximize or minimize orientation disparities. In this procedure there is no conflict between point and orientation disparities. Using this procedure, Cagenello and Rogers (1993) showed that slant-detection thresholds for a surface covered with a grid of ±45° lines were considerably lower (±1.5° from a frontal plane) than for a surface covered with a grid of vertical and horizontal lines (±3.5°). This result is consistent with the use of orientation disparities since there are no orientation disparities between either 0° or 90° elements on a slanted surface. Thresholds for detection of surface inclination were approximately the same (1.25° from a frontal plane) whether the surface was covered with a grid of ±45° lines or of vertical and horizontal lines (Figure 20.18). The orientation disparities created by the vertical lines in the vertical and horizontal line grid are twice as large as the orientation disparities created by the 45° lines (see Figure 19.13). If the maximum orientation disparity were the limiting factor, inclination thresholds should have been lower by a factor of 2 for the 0°/90° stimuli (Section 20.4.1). The results of the experiment are therefore inconsistent with the use of orientation disparities, at least for the detection of inclination at threshold. However, the results are compatible with the use of disparity gradients.
20.3.1c Effects of Stimulus Size
Braddick (1979) measured the horizontal-shear disparity of a vertical grating at which diplopia became apparent. He reasoned that if the perception of inclination were based on orientation disparity, the diplopia threshold would not depend on the size of the grating, since an orientation disparity does not vary with line length. If the diplopia threshold depended on the linear disparity at the ends of the lines of the grating, diplopia would occur sooner with larger gratings. In fact, the diplopia threshold did not depend on the size of the grating, but on a critical value of orientation disparity. However, these results are confounded by the fact that larger gratings stimulate more peripheral retinal regions, where the diplopia threshold for point stimuli is higher than in foveal regions. This point was discussed in Section 12.1.1d. If detection of surface inclination depends only on orientation disparity, it should improve as the size of the surface increases, because there are more samples of orientation disparity in a larger surface. If detection of inclination depends only on point disparities along the upper and lower edges of the surface, it should improve with increasing size of the surface. This assumes that disparity thresholds are not affected by retinal eccentricity. But linear disparity thresholds increase with increasing retinal eccentricity because of the increase in the size of receptive fields. The two factors may cancel, leaving inclination thresholds unaffected by surface size. The threshold for detecting the direction of inclination (ground plane or sky plane) produced by horizontal-shear
disparity in a stereoscopic surface covered with a grid of white lines decreased as the diameter of the surface increased from 1° to 5° (Cagenello and Rogers 1989; Cagenello and Rogers 1993). However, the threshold remained fairly constant at around 0.8° of inclination (shear disparity of 0.1°) for diameters between 5° and 22° (Figure 20.19). They concluded from this constancy of the threshold that the threshold is limited by orientation disparity rather than positional disparity. However, the threshold would also stay roughly constant if linear disparity thresholds increased with retinal eccentricity. If detection of inclination were limited by the threshold for detection of orientation disparity, one would expect the threshold orientation disparity to be similar to the threshold for detecting a difference in orientation of grids presented to both eyes. In other words, when observers can just discriminate that a surface covered with dichoptic vertical and horizontal lines is inclined, they should be able to discriminate whether the same lines presented to both eyes are sheared clockwise or counterclockwise. Also, thresholds for the two stimuli should vary in the same way as a function of stimulus size. However, Figure 20.19 shows that, at the point where observers could reliably judge direction of inclination (±1° from a frontal plane), the orientation disparity threshold (∼6 arcmin) was 2 to 3 times smaller than the shear threshold for lines seen by both eyes (∼15 arcmin). These results suggest that the inclination threshold and the relative shear threshold are not limited by the same factors. It is thus unlikely that orientation disparities limit the inclination discrimination threshold.
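The correspondence quoted above between about 1° of inclination and an orientation disparity of a few arcmin can be checked with a simple geometric approximation for a vertical line inclined in the median plane: tan(θ/2) = (a/2d) tan i, where θ is the orientation disparity, i is the inclination, a is the interocular distance, and d is the viewing distance. The sketch below assumes a = 6.5 cm and d = 57 cm. Neither value is stated for this experiment in the text, so the numbers are illustrative only.

```python
import math

def orientation_disparity_arcmin(inclination_deg, interocular_cm=6.5, distance_cm=57.0):
    """Orientation disparity of a vertical line inclined in the median plane.

    Uses tan(theta / 2) = (a / 2d) * tan(i). The default a and d are
    assumed values, not parameters reported in the text.
    """
    half = math.atan(
        (interocular_cm / (2.0 * distance_cm)) * math.tan(math.radians(inclination_deg))
    )
    return math.degrees(2.0 * half) * 60.0

for inclination in (0.8, 1.0):
    print(inclination, round(orientation_disparity_arcmin(inclination), 1))
```

Under these assumptions, 1° of inclination gives an orientation disparity of the same order as the threshold of about 6 arcmin quoted above, and 0.8° of inclination gives a shear disparity of roughly 0.1°.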
[Figure 20.19 plot: threshold orientation difference (deg, 0 to 0.5) against size of circular display (deg, 0 to 20); curves labeled 2-D shear and 3-D inclination.]
Figure 20.19. Inclination and dioptic shear thresholds. The lower curve shows the orientation disparity threshold for detecting the direction of inclination, as a function of stimulus size. The upper curve is the threshold difference in orientation of dioptic lines for detection of clockwise/counterclockwise shear, as a function of stimulus size. The stimulus was a grid of white lines on a dark ground with 0° and 90° orientations (N = 3). (Redrawn from Cagenello and Rogers 1989)
20.3.1d Evidence from Aftereffects
DeValois et al. (1975) used aftereffects to investigate the role of orientation disparities. An inclined induction surface was covered with a triangular pattern, which created orientation disparities between corresponding lines in the two eyes. After prolonged viewing of this surface a test surface appeared inclined in the opposite direction. The aftereffect was still seen in a test surface that had a horizontal disparity with respect to the fixation point. It was also seen after the induction surface was moved vertically during the induction period so that each retinal region was exposed to the entire range of crossed and uncrossed disparities. These results suggest that the depth aftereffects derive from orientation disparities or gradients of disparity rather than from disparities of points (Section 21.7.2), but they do not provide conclusive evidence for the role of orientation disparities. For example, Rogers and Graham (1985) and Lee and Rogers (1992) reported that depth aftereffects produced by adaptation to a sinusoidally corrugated random-dot-covered surface generalized to test surfaces in other depth planes. However, there were no orientation disparities in their displays.
A change in the torsional alignment of the eyes affects the orientation disparity of all line elements in a scene in the same way. Slant or inclination of a surface is indicated by the pattern of orientation disparities created by lines in various orientations. In most studies on the perception of slant and inclination, stimuli contained lines of more than one orientation. They therefore provided an unambiguous pattern of orientation disparities. In other studies, the surround provided a reference for interpreting ambiguous disparities of oriented features. Consequently, these experiments were not concerned with simple orientation disparity. Rather, they revealed the role of relative orientation disparity, or angular disparity. This is the binocular difference in the angular separation of pairs of oriented elements in each of the monocular images. Angular disparities, unlike orientation disparities, are not affected by torsional misalignment of the eyes (cyclovergence).
20.3.2 DEFORMATION DISPARITY AND INCLINATION
20.3.2a Shear-Disparity Induced Effect
A vertical gradient of horizontal disparity is a horizontal-shear disparity. A horizontal gradient of vertical disparity is a vertical-shear disparity. A rotation of one image relative to the other is a rotation disparity. These disparities can also be expressed in terms of orientation disparities. An overall rotation disparity is normally created only by torsional misalignment of the eyes. A rotation disparity can be thought of as composed of horizontal-shear and vertical-shear disparities of the same size and sign. The difference between horizontal-shear disparity and vertical-shear disparity for a given stimulus is a shear-deformation disparity. If perceived inclination were to depend only on horizontal-shear disparity, we could predict the following:
1. A frontal surface should appear inclined when the eyes are torsionally misaligned because rotation disparity contains a component of horizontal shear.
2. A surface containing only vertical-shear disparity should appear frontal.
On the other hand, if perceived inclination were to depend on deformation disparity we could predict the following:
1. A surface containing only horizontal-shear disparity, as in Figure 20.20A, should appear inclined. A surface containing the same vertical-shear disparity, as in Figure 20.20B, should appear inclined by the same amount but in the opposite direction. In these two cases, deformation disparities are equal and opposite. Inclination produced by vertical-shear disparity would be a shear-disparity induced effect analogous to the size-disparity induced effect described in Section 20.2.3.
2. A frontal surface should appear to remain frontal when the eyes are torsionally misaligned or when images in a stereoscope are rotated in opposite directions to produce a rotation disparity, as in Figure 20.20C. In this case the two types of shear disparity are equal and therefore shear-deformation disparity is zero.
3. A surface containing horizontal-shear disparity of one sign and vertical-shear disparity of the opposite sign, as in Figure 20.20D, should appear inclined more than a surface containing only one or the other type of shear disparity.
Therefore, these stimuli can be used to test whether shear-deformation disparity is used in the perception of surface inclination.
Figure 20.20. Four types of shear disparity. (A) Horizontal shear, no vertical shear. (B) Vertical shear, no horizontal shear. (C) Rotation (same H and V shear). (D) Deformation (opposite H and V shear). The red and blue spots represent the left-eye and right-eye images. (Redrawn from Howard and Kaneko 1994)
Cagenello and Rogers (1990) reported that a stereogram consisting of a pair of 20°-diameter random-dot patterns containing vertical-shear disparity appeared inclined, as predicted by the shear-deformation hypothesis. As viewing time increased from 1 s to 6.5 s, perceived inclination increased from about 30% to 75% of that predicted. They argued that the increase in perceived inclination was due to a change in cyclovergence, which converted the vertical-shear disparity into a horizontal-shear disparity. No similar increase in perceived inclination occurred with a pair of 20°-diameter random-dot patterns related by a horizontal shear, suggesting that the increase was not the result of a generalized build-up in the perception of inclination. Cagenello and Rogers also reported that inclination could be seen initially in a 20°-diameter random-dot stereogram containing a rotation disparity. However, perceived inclination decreased as exposure time was increased from 1 s to 6.5 s. The results of these experiments suggest that the visual system does not use shear-deformation disparity (Rogers 1992).
There would be no need for the visual system to use deformation disparity if the eyes were always in perfect rotational alignment (Portrait Figure 20.21). An overall vertical-shear disparity arises only when the eyes are torsionally misaligned. Such a disparity could evoke cyclovergence and restore the torsional alignment of the eyes. We saw in Section 10.7.5a that cyclovergence is evoked more strongly by vertical-shear disparity than by horizontal-shear disparity. However, many people have a cyclophoria (Section 10.2.3) and, in all people with normal sight, the eyes change their torsional alignment during elevations of gaze (Section 10.7.4). The use of deformation disparity would allow the visual system to disregard eye misalignment when judging inclination.
Figure 20.21. Brian J. Rogers. Born in London in 1947. He obtained his B.A. in psychology in 1969 and his Ph.D. in psychology in 1976, both from the University of Bristol in England. In 1973 he was appointed lecturer in psychology at the University of St. Andrews in Scotland. In 1984 he was appointed professor in the Department of Experimental Psychology in Oxford University.
Figure 20.22. Ian P. Howard. Born in Rochdale, England, in 1927. He obtained a B.Sc. in psychology from Manchester University in 1952 and a Ph.D. in psychology from Durham University, England, in 1966. In 1953 he was appointed lecturer in psychology at Durham University. After a year in the Department of Psychology at New York University he moved to York University in Canada in 1966, where he founded the Centre for Vision Research. He is now Distinguished Research Professor emeritus. In 2009 he was awarded a D.Sc. from New York State University.
Howard and Kaneko (1994) used a random-dot stereoscopic display 75° in diameter with a totally black surround. Subjects judged the perceived inclination of the fused image by setting a tactile paddle. A visual comparison stimulus was not used, because it would introduce unwanted disparities (Portrait Figure 20.22). The four types of disparity used in the experiment are illustrated in Figure 20.20. The results in Figure 20.23A show that, for disparities of up to about 4°, the magnitude of perceived inclination of a display containing only vertical-shear disparities was almost as large as that produced by only horizontal disparities. With disparities greater than about 4°, the displays became disparate and inclination became difficult to judge. Porrill et al. (1989), also, reported that a large stereoscopic display containing vertical-shear disparity appeared inclined. Figure 20.23B shows that, for all disparities, there was little or no perceived inclination when the image in one eye was rotated with respect to that in the other eye. Berends and Erkelens (2001a) confirmed that inclination produced by a given vertical-shear disparity is nulled by addition of an equal horizontal-shear disparity. People who give more weight to horizontal-shear disparity than to vertical-shear disparity should perceive some inclination when the images of a textured surface are rotated in opposite directions. There is a hint of this in Figure 20.23B. Also, Van Ee and Erkelens (1998) found that two out of six subjects saw some inclination in such a stimulus, although it was substantially less than that created by horizontal shear.
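The four stimulus types of Figure 20.20 and the signals they provide under the two hypotheses can be tabulated in a few lines. This is only an illustrative sketch. It assumes that both shear disparities are expressed as signed angles with a common sign convention, so that equal values describe a rotation, and it does not model the mapping from disparity to perceived inclination angle. The class name, function names, and the 2° magnitude are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ShearStimulus:
    name: str
    h_shear: float  # horizontal-shear disparity (deg), signed
    v_shear: float  # vertical-shear disparity (deg), same sign convention

def rotation_component(s):
    # Part common to both shears: equal H and V shear of the same sign is a rotation.
    return (s.h_shear + s.v_shear) / 2.0

def deformation(s):
    # Shear-deformation disparity: horizontal shear minus vertical shear.
    return s.h_shear - s.v_shear

d = 2.0  # illustrative disparity magnitude (deg), not a value from the experiments
stimuli = [
    ShearStimulus("A: horizontal shear only", d, 0.0),
    ShearStimulus("B: vertical shear only", 0.0, d),
    ShearStimulus("C: rotation", d, d),
    ShearStimulus("D: deformation", d, -d),
]

for s in stimuli:
    # Signal under the horizontal-shear-only hypothesis vs. the deformation hypothesis.
    print(f"{s.name:26s} h-only: {s.h_shear:+.1f}  deformation: {deformation(s):+.1f}")
```

On the deformation hypothesis, stimuli A and B give equal and opposite signals, C gives none, and D gives twice the signal of A. On the horizontal-shear hypothesis, A, C, and D give the same signal and B gives none.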
Figure 20.23B also shows that when the images had a vertical-shear disparity of one sign and a horizontal-shear disparity of the opposite sign (deformation) of up to about 4°, the perceived inclination was almost twice that produced by horizontal shear or vertical shear alone. These results would be explained if the eyes underwent cyclovergence so as to transfer vertical disparities into horizontal disparity. Howard and Kaneko checked this possibility by using four subjects in whom incyclovergence was much stronger than excyclovergence. If cyclovergence were the only factor in the inclination judgments the same asymmetry would have occurred in the psychophysical results. But no such asymmetry was present. These results provide strong support for the idea that the visual system possesses a neural mechanism that detects the difference between gradients of vertical and horizontal disparities (deformation). The difference signal is used to code inclination. This mechanism compensates for rotation disparity that occurs when the eyes are not torsionally aligned. For rotation disparity produced by cyclovergence, the compensatory mechanism could rely on oculomotor signals that indicate the state of cyclovergence. For rotation disparity produced by rotation of stereoscopic images the compensatory mechanism would have to rely on deformation
disparity to the extent that cyclovergence does not correct for the misalignment of the images. The idea of a neural compensatory mechanism would be further supported if it could be shown that torsional misalignment of the eyes (cyclovergence) does not affect the perceived inclination of a frontal surface. We will now see that there are two ways to induce cyclovergence. One way is to rely on the fact that the eyes remain out of torsional alignment for some time after viewing a cyclorotated stimulus. Kaneko and Howard (1997a) found that induction of cyclovergence just before exposing a display with horizontal-shear or vertical-shear disparity did not affect judgments of inclination.
[Figure 20.23 plots. (A) Perceived inclination (deg) against shear disparity (deg) for horizontal-shear only and vertical-shear only, with the inclination predicted from deformation disparity. (B) Perceived inclination (deg) against horizontal-shear disparity (deg) for rotation and deformation, with the two predicted curves.]
Figure 20.23. Perceived inclination and shear disparity. (A) Perceived inclination produced by horizontal-shear disparity and vertical-shear disparity. The sigmoid line is the inclination for both shear conditions predicted from deformation disparity. (B) Perceived inclination produced by rotation disparity and deformation disparity. The shallow sigmoid curve is the inclination for rotation and deformation disparities predicted from the horizontal disparity component. The steep sigmoid curve is the inclination for deformation disparity predicted from deformation-disparity hypothesis (N = 4). Bars are standard errors. (Redrawn from Howard and Kaneko 1994)
Figure 20.24. Martin S. Banks. Born in Salt Lake City, Utah, in 1948. He obtained a B.A. in psychology from Occidental College in 1970 and a Ph.D. in developmental psychology from the University of Minnesota in 1976. He held academic appointments in the psychology department of the University of Texas at Austin and moved to the University of California at Berkeley in 1985, where he is professor of optometry, visual science, and psychology.
Banks et al. (2001) (Portrait Figure 20.24) had subjects inspect a large display of lines with 8° of vertical-shear disparity for 10 s. This induced about 2° of cyclovergence, as measured by a nonius procedure. The lines were replaced for repeating periods of 100 ms by a circular textured stereoscopic display in a frontal plane. Subjects adjusted the horizontal-shear disparity in the surface until it appeared frontal. This procedure eliminated conflict between disparity and the monocular cue of texture gradient because, at the end point, both cues signified a frontal surface. Estimates of inclination were nearly veridical, showing that subjects had compensated for induced cyclovergence.
In a second experiment, Banks et al. again induced cyclovergence but, in addition, they added extra vertical-shear disparity into the test display. If only the oculomotor signals were used to estimate vertical-shear disparity, the extra vertical disparity should have had no effect. The results indicated that judgments of inclination depended on the total magnitude of vertical-shear disparity. The visual system must therefore use deformation disparity in compensating for rotation disparity.
The second way to induce cyclovergence is to rely on the fact that the eyes show incyclovergence with elevated gaze and excyclovergence with downward gaze. Mitsudo (2007) found that when the pattern in Figure 20.25 was viewed with elevated gaze in a gaze-normal plane it appeared to bulge out in depth. The depth was found to correspond to the vertical-shear disparity induced into the images by incyclovergence, as depicted in Figure 20.25B (Mitsudo et al. 2009). A pattern with dashed lines, as in Figure 20.25C, did not produce an impression of depth. In this case, the images had the equivalent of a rotation disparity, which does not produce inclination. In this case, it is clear that the eyes remained in a state of constant cyclovergence, so that any vertical-shear disparity was not converted into horizontal-shear disparity.
Figure 20.25. Effect of gaze elevation on shear disparity. When figure (A) is viewed with elevated gaze in a gaze-normal plane, the images in the two eyes acquire a vertical-shear disparity, as illustrated in (B). This produces an impression that the pattern bulges in depth. The same pattern with dashed lines, as in (C), does not produce an impression of depth. Its images contain both vertical-shear and horizontal-shear disparities, as shown in (D). (Redrawn from Mitsudo 2007)
20.3.2b Local Versus Global Vertical-Shear Disparity
In Section 20.2.4 it was shown that vertical-size disparities are pooled over large areas. Is the same true of vertical-shear disparities? Vertical-shear disparity in a large display when there are no other stimuli in view produces an impression of inclination. The stereogram in Figure 20.26B contains a vertical-shear disparity but produces little or no inclination. This is because the display is too small and is seen in the context of other objects. The horizontal-shear disparity in Figure 20.26A does produce inclination because horizontal-shear disparity is processed locally. Cagenello and Rogers (1990) used a stereoscopic display with vertical-shear disparity of one sign in the right half and vertical-shear disparity of opposite sign in the left half, as shown in Figure 20.27B. If perceived inclination
were based on the local deformation disparity, opposite inclinations would be seen in the two halves of the dumbbell figure. On the other hand, if perceived inclination were based only on horizontal-shear disparities, no inclination would be seen. Subjects perceived only a very small differential inclination between the two halves of the dumbbell figure, and it was in the opposite direction to that predicted by the deformation hypothesis. This result is compatible with the idea that vertical-shear disparities are pooled over a large area of the visual field rather than being derived from separate parts of the scene. The opposite horizontal-shear disparities in Figure 20.27A produced opposite inclinations in the two surfaces, showing that horizontal-shear disparities are processed locally.
Kaneko and Howard (1997a) produced further evidence that vertical-shear disparity is detected globally. They used a 60° random-dot display containing dots with vertical-shear, horizontal-shear, or rotation disparity mixed with a variable proportion of dots with zero disparity. Subjects set a paddle to match the perceived inclination of one surface or of each of the two surfaces that they saw in the display. The results are shown in Figure 20.28. Dots with horizontal-shear disparity segregated into a distinct inclined surface relative to the frontal surface produced by the zero-disparity dots, as shown in Figure 20.28A. Displays containing dots with vertical-shear disparity mixed with zero-disparity dots appeared as a single surface with an inclination that increased as the proportion of disparate dots increased, as shown in Figure 20.28B. However, a threshold proportion of dots was required to produce the first signs of inclination. Thus, vertical-shear disparities were averaged over the whole display and, if the mean exceeded a certain value, it was used to scale the zero-horizontal disparity of the display.
This produced inclination in the opposite direction to that produced by dots with only horizontal disparity. Displays containing dots with a rotation-disparity mixed with dots with zero-disparity segregated into two surfaces, as shown in Figure 20.28C. This is because each set of dots had a distinct horizontal disparity. As the proportion of disparate dots increased, the inclination of the surface defined by disparate dots decreased because the ratio of horizontal to vertical disparity for that surface moved closer to 1. However, the perceived inclination of the surface defined by zero-disparity dots increased because, for that surface, the ratio of zero horizontal disparity was combined with a large mean vertical disparity in the whole display. Thus, the horizontal-shear disparity in the images of each set of dots was scaled by the mean vertical disparity of all the dots in the display. Howard and Kaneko (1994) produced evidence that vertical-shear disparity in the retinal periphery is given more weight than that in the center of the visual field. They used a 30° circular random-dot display with 2.3° of horizontal-shear disparity, vertical-shear disparity,
or rotation disparity. This display was surrounded by a black background or by a random-dot display with zero disparity. Subjects judged the inclination of the central display and of the annulus. The results are shown in Figure 20.29. The central display with horizontal disparity appeared inclined at the same angle whatever the outer diameter of the annulus, and the surround appeared frontal. The central display with vertical disparity appeared inclined in the opposite direction but only when the annulus was black (Figure 20.29A). When the zero-disparity annulus was present, it and the central display appeared almost frontal. When the annulus had vertical-shear disparity and the central display had zero disparity, both regions appeared inclined. The inclination increased with increasing outer diameter of the annulus. These results are what one would expect if vertical-shear disparity in the retinal periphery is given greater weight than that in the center. With rotation disparity, the central display appeared inclined only when the zero-disparity surround was present, as shown in Figure 20.29B. In this case, inclination occurred because the horizontal-shear disparity in the central display outweighed the low mean overall vertical-shear disparity over the center and surround. Van Ee and Erkelens (1995) obtained similar results.
Figure 20.26. Binocular images related by shear disparities. (A) The fused image has a 4° horizontal-shear disparity, which creates an impression of inclination. (B) The fused image has a 4° vertical-shear disparity, which produces little or no inclination. A large display with vertical-shear disparity, when viewed in isolation, does produce an impression of inclination.
Figure 20.27. Equal and opposite shear disparities in the half-fields. (A) Crossed or uncrossed fusion creates two surfaces with opposite horizontal-shear disparities. The surfaces appear inclined in opposite directions. (B) The two surfaces have opposite vertical-shear disparities. The surfaces appear slightly inclined in opposite directions. But the sign of the inclinations does not comply with predictions from the local deformation hypothesis. (Derived from Rogers 1992)
[Figure 20.28 plots. Three panels of perceived inclination (deg) against % of dots with disparity: (A) horizontal-shear disparity, (B) vertical-shear disparity, (C) rotation disparity. Panels A and C show separate curves for dots with disparity and zero-disparity dots; panel B shows one curve for all dots.]
Figure 20.28. Perceived inclination with mixed disparities. (A) For a random-dot surface, perceived inclination defined by horizontal-shear disparity increased slightly in the presence of zero disparity dots, because of the relative depth signal. (B) With vertical-shear disparity, all dots appeared as one surface that increased in inclination as the % of disparate dots increased. (C) The addition of zero-disparity dots increased the perceived inclination of the surface defined by dots with rotation disparity (N = 3). (Adapted from Kaneko and Howard 1997a)
Gillam and Rogers (1991), also, obtained results inconsistent with the idea that stereopsis uses local deformation disparity. Subjects matched the perceived inclination of a probe to that of a 10°-diameter stereoscopic display. The zero-disparity surroundings were dimly in view. They reported little inclination with a vertical-shear disparity, which contained deformation. However, cyclorotated images, which contained no deformation, appeared inclined.
These results demonstrate that vertical-shear disparity is not used locally to scale horizontal-shear disparity. The vertical-shear disparity signal used to scale horizontal-shear disparity is derived from the display as a whole, especially from the periphery of the visual field. This is what one would expect if relative-shear disparities were used to protect against torsional misalignment of the eyes that occurs in cyclophoria and oblique gaze. Such misalignments affect the whole image and are therefore most reliably detected by vertical-shear disparity over the whole visual field. Vertical-shear disparity over the whole visual field does not usually arise from any cause other than misalignment of the eyes. Any difference between overall vertical disparity in the scene and local horizontal-shear disparity in the images of a particular surface must be due to inclination of the surface. Local gradients of vertical-shear disparity are also produced on a slanted surface, as shown in Figure 19.3. The disparities above eye level have one sign and those below eye level have the opposite sign. Local vertical-shear disparity is therefore not a reliable indication that the eyes are out of alignment.
Horizontal-shear disparities must be extracted locally, because surfaces inclined to different extents can be in view at the same time. But each horizontal disparity should be scaled with reference to the mean vertical-shear disparity over the whole field because it is this that indicates the torsional misalignment of the eyes. Thus, rotation disparity arising from torsional misalignment of the eyes is a single parameter of the viewing system, which has consequences for all visible surfaces. Analogous effects have been reported for slant and inclination produced by patterns of optic flow (Section 28.3.3c).
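The scheme of local horizontal shear scaled by a globally pooled vertical shear can be sketched for the mixed-dot displays of Kaneko and Howard (1997a). This is a simplified illustration, not the authors' model. It assumes an unweighted mean of vertical-shear disparity over all dots and omits both the threshold proportion and the greater weight given to the periphery that are reported in the text. The function name and the 2° value are hypothetical.

```python
def deformation_signals(proportion_disparate, h_shear, v_shear):
    """Deformation signal for each dot population in a mixed display.

    proportion_disparate: fraction of dots carrying the shear disparities (0..1);
                          the remaining dots have zero disparity.
    h_shear, v_shear: shear disparities (deg) of the disparate dots.
    Assumes horizontal shear is taken locally, per population, while vertical
    shear is a plain average over all dots (no detection threshold modelled).
    """
    mean_v = proportion_disparate * v_shear  # zero-disparity dots contribute 0
    return {
        "disparate dots": h_shear - mean_v,
        "zero-disparity dots": 0.0 - mean_v,
    }

for p in (0.25, 0.5, 0.75):
    print(p, "vertical shear:", deformation_signals(p, 0.0, 2.0),
          "rotation:", deformation_signals(p, 2.0, 2.0))
```

With vertical shear alone, both populations receive the same signal, which grows with the proportion of disparate dots and has the opposite sign to that produced by horizontal shear. With rotation disparity, the signal for the disparate dots shrinks and that for the zero-disparity dots grows as the proportion increases, in line with the pattern described for Figure 20.28.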
B I N O C U L A R D I S PA R I T Y A N D D E P T H P E R C E P T I O N
•
411
Perceived inclination (deg)
40
40
Vertical-shear disparity With black surround
20
Rotation disparity With zero-disparity surround
20
0
Perceived inclination from shear disparity as a function of type of surround. Perceived inclination of
Figure 20.29.
0 With black surround
With zero-disparity surround
–20
–20
–40
–40 5
0 2.5 –2.5 Vertical-shear disparity (deg)
–5
5
2.5 0 –2.5 Rotation disparity (deg)
A
–5
B
20.3.2c Temporal Aspects of Shear-Disparity Detection Allison et al. (1998) asked whether inclination produced by vertical-shear disparity takes longer to see than inclination produced by horizontal-shear disparity. They used a circular randomly textured display subtending 60° in diameter. A horizontal-shear disparity or a vertical-shear disparity of 0.73° or 1.46° was introduced into the display for durations of between 0.1 s and 30 s. After each presentation, the subject set the inclination of a real textured surface to match the perceived inclination of the test surface. For both types of disparity, perceived inclination increased in a similar way from zero for a duration of 0.1 s to about 10° for a duration of 30 s. In a second experiment, the display was sinusoidally oscillated at various frequencies through a peak-to-peak amplitude of horizontal-shear disparity or vertical-shear disparity. Subjects matched the inclination of the real surface to the perceived maximum amplitude of inclination of the test surface. For eight of 11 subjects, the perceived amplitude of oscillation of inclination declined from about 10° at an oscillation frequency of 0.1 Hz to about 3° at a frequency of 2 Hz. There was no significant difference between the two types of disparity. Further experiments revealed no temporal differences between the processing of horizontal-size disparity and vertical-size disparity in the creation of slant about a vertical axis. Thus, vertical-shear disparity and vertical-size disparity are processed at the same speed as the corresponding 412
•
(Adapted from Howard and Kaneko 1994)
horizontal-disparities, at least for frequencies between 0.1 and 2 Hz. Three of the subjects saw little modulation of depth even at the lowest temporal frequencies. However, these subjects saw the surface rocking in depth in antiphase to the disparity-defined oscillation. They were probably judging depth in terms of the pseudoperspective induced into the display by the oscillation of disparity. Gillam (1993) reported similar depth reversals. Allison et al. used frequencies of disparity oscillation only up to 2 Hz. Fukuda et al. (2006) asked whether oscillations of size disparity are perceived to higher frequencies than oscillations of vertical disparity. Instead of sinusoidal oscillations of disparity they used step oscillations of size disparity in a circular random-dot display 40° in diameter. The horizontal- or vertical-size disparity stepped back and forth from 2% to 8%. Subjects set a manual paddle to match the maximum and minimum perceived slants of the surface. The results for one subject are shown in Figure 20.30. All four subjects showed similar results. It can be seen that surface oscillation was evident for temporal oscillations of horizontal size-disparity up to 10 Hz. However, the oscillating
Matched slant of paddle
The perception of relative inclinations of surfaces does not require registration of absolute inclination. Judgments of relative inclination could be based on differences of horizontal-shear disparity. Our perception of the absolute inclination of a surface can be modified by the inclination of neighboring surfaces, as in simultaneous depth contrast (Section 21.5).
a 30°-diameter random-dot display as a function of (A) vertical-shear disparity and (B) rotation disparity. In one condition the test display was surrounded by a zero-disparity random-dot annulus extending out to 60° and in a second condition the surround was black (N = 3).
Figure 20.30. Oscillations of size disparity. A surface oscillating in horizontal-size disparity at up to 10 Hz produced a surface oscillating in slant. A surface oscillating in vertical-size disparity at more than about 2 Hz collapsed into one surface at an intermediate slant. The two panels (oscillation of horizontal-size disparity and oscillation of vertical-size disparity) plot the maximum and minimum matched slant of the paddle against temporal frequency (Hz). (Adapted from Fukuda et al. 2006)
surface collapsed into one surface at an intermediate angle of slant for oscillations of vertical-size disparity of more than about 2 Hz. This suggests that temporal resolution of vertical-size disparities is lower than that of horizontal-size disparities. Fukuda et al. then altered the duty cycle of the oscillations for step modulations of vertical-size disparity at 5 Hz. The perceived slant of the single surface produced by this frequency was pulled in the direction of the disparity that was exposed for a longer percentage of the duty cycle. However, there was a threshold effect. When one of the disparities was exposed for less than about 20% of the duty cycle, it did not affect the angle of perceived slant. These results are analogous to those reported in Section 20.2.4c, where it was shown that perceived slant produced by a spatial mixture of two vertical disparities is a weighted mean of the two disparities. There was a threshold effect in that case also.
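The duty-cycle result can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the size disparity governing perceived slant is a duty-weighted mean of the two alternating disparities, and that a disparity shown for less than a threshold fraction of the cycle (about 20% in the text) has no effect. The function name, the linear weighting, and the hard cutoff are assumptions, not the analysis used by Fukuda et al.

```python
def effective_size_disparity(d_a, d_b, duty_a, threshold=0.2):
    """Toy estimate of the size disparity (%) governing perceived slant.

    d_a, d_b  : the two size disparities (%) that alternate.
    duty_a    : fraction of each cycle (0..1) for which d_a is shown.
    threshold : assumed minimum duty fraction needed to have any effect.
    """
    if not 0.0 <= duty_a <= 1.0:
        raise ValueError("duty_a must lie between 0 and 1")
    duty_b = 1.0 - duty_a
    if duty_a < threshold:   # d_a shown too briefly to register (assumption)
        return d_b
    if duty_b < threshold:   # d_b shown too briefly to register (assumption)
        return d_a
    # Assumed linear weighting by exposure time.
    return duty_a * d_a + duty_b * d_b
```

With the 2% and 8% size disparities used in the experiment, an equal duty cycle gives the midpoint, and the estimate moves toward whichever disparity is exposed for longer.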
20.4 STEREOSCOPIC ANISOTROPIES
20.4.1 SLANT-INCLINATION ANISOTROPY
20.4.1a Basic Findings
Wallach and Bacon (1976) were the first to report that we are more sensitive to inclination about a horizontal axis arising from shear disparity than we are to slant about a vertical axis arising from compression disparity. Several investigators have subsequently reported this orientation anisotropy in stereopsis. Wallach and Bacon also reported that latencies for detection of inclination are typically shorter than those for detection of slant. Gillam et al. (1988b) confirmed this latency difference (Portrait Figure 20.31).
Cagenello and Rogers (1993) measured thresholds for discriminating the direction of slant (left wall versus right wall) of a stereoscopic random-dot surface seen in semidarkness with minimal disparity in surrounding surfaces. Thresholds were about ±2.1° from a frontal plane for surfaces subtending 20° by 20°. At the 57-cm viewing distance, this corresponds to a horizontal disparity gradient of about 0.17 arcmin/deg. Thresholds for detecting the direction of inclination (ceiling versus floor) of similar surfaces were about ±1.3° for the two subjects. Hibbard and Langley (1998) produced similar evidence.
Mitchison and McKee (1990) used textured surfaces subtending only between 0.5° and 2°. Slant thresholds for 10 observers were 10 times higher than those reported by Cagenello and Rogers when expressed in terms of disparity gradient. However, the thresholds of the best observers were more similar to those reported by Cagenello and Rogers. Four of the 10 observers showed a marked anisotropy as a function of surface orientation and were unable to detect any slant in surfaces slanted several tens of degrees away from the frontal plane.
Orientation anisotropy is also evident in surfaces with suprathreshold slant or inclination (Rogers and Graham 1983). The slant of a single surface is typically more severely underestimated than is its inclination. This should be evident in Figure 20.32.
One would expect from the above results that we would be more sensitive to shearing motion than to compressive motion in a monocular random-dot display. However, Nakayama et al. (1985) found that, when the spatial frequency of motion modulation was higher than 1 cpd, the amplitude threshold for detection of shear motion rose well above that for detection of compressive motion. It seems that the detection of different types of motion is not the same as the detection of different types of disparity.
Slant about a vertical axis is produced by a difference in lateral compression of the two images, which is equivalent to a difference in spatial frequency. Inclination about a horizontal axis is produced by a difference in the orientation of corresponding images. Therefore, one would expect spatial-frequency discrimination thresholds to be higher than orientation discrimination thresholds in subjects showing higher sensitivity to inclination than to slant. Hibbard et al. (2002) found such a positive correlation between the two types of anisotropy.
Figure 20.31. Barbara Gillam. She obtained a B.A. in psychology at the University of Sydney and a Ph.D. at the Australian National University. She was a lecturer at the University of Reading, England, before moving to Columbia University to become a research associate with Clarence Graham. She then joined the College of Optometry of the State University of New York, where she became professor and served a term as head of the Department of Behavioral Science and Public Health. In 1986 she moved back to Sydney to become professor and head of psychology at the University of New South Wales.
Figure 20.32. Perception of slanted and inclined planes. (A) The fused image contains horizontal-shear disparity, which creates inclination about a horizontal axis. (B) The fused image contains a horizontal-size disparity, which creates slant about a vertical axis. With only one surface in view, most observers perceive that a given disparity produces less slant than inclination. Gillam et al. (1984) reported that perceived slant builds up over several seconds.
20.4.1b Anisotropies in Oblique Surfaces
The higher sensitivity to inclined surfaces could perhaps be due to the greater importance of inclined surfaces for the guidance of behavior. If this is so, the higher sensitivity may be independent of the particular stimuli that signify inclination and slant. Bradshaw et al. (2002b) approached this question by asking subjects to discriminate between two random-dot surfaces slanted about oblique axes. Such surfaces contain components of shear disparity and of compression disparity. Subjects took about 6 seconds to discriminate between two oblique surfaces with the same shear disparity but different compression disparities but only about 1 second to discriminate between two surfaces with the same compression disparity but different shear disparities. This result suggests that the perceptual anisotropy between slanted and inclined surfaces arises from a difference in the processing of shear and compression (size) disparities rather than from an anisotropy related to the orientation of the axis about which surfaces are inclined.
20.4.1c Cue Conflict and Slant-Orientation Anisotropy
The above experiments used textured surfaces in which slope was specified only by disparity. The zero texture gradient indicated frontal surfaces. The following evidence suggests that inclination-slant anisotropy shows only in this cue-conflict situation. Bradshaw et al. (2002a) asked subjects to point to various locations along a previously seen slanted or inclined textured square surface, which subtended 11°. When the stimulus was a real board with all depth cues present, pointing and a matching task revealed that slant and inclination were perceived accurately without anisotropy. However, when the stimulus was a random-dot stereogram, an inclined surface appeared steeper than a slanted surface, as indicated both by pointing and by a matching task. These results would be explained if, in a full-cue situation, anisotropy in the disparity system is canceled by an opposed anisotropy in the texture-gradient system. This would be confirmed if surfaces with only texture gradients showed the opposite anisotropy. But this has not been shown. The results would also be explained if texture gradients did not show an anisotropy but were heavily weighted when in conflict with disparity.
20.4.1d Effects of the Orientation of Surface Lines
The stimuli discussed so far were random-dot displays. A surface with horizontal and vertical lines contains only shear disparity when inclined and only horizontal compression disparity when slanted. A surface ruled with oblique lines contains shear disparity whether it is inclined or slanted (see Figure 19.13). Arditi (1982) found that thresholds for slant detection were lower for a ±45° crosshair pattern than for a 0/90° crosshair pattern. Cagenello and Rogers (1993) obtained similar results with gridline stimuli. The threshold for discriminating the direction of slant was about twice that for detecting the direction of inclination when the surfaces were covered with horizontal and vertical lines rather than random dots. When the surfaces were covered with a grid of ±45° lines, thresholds for slant discrimination were similar to those for random dots, about ±1.5° (Figure 20.33). Cagenello and Rogers interpreted these results as providing evidence for the use of either orientation or angular disparity in slant perception.
Hibbard and Langley (1998) measured thresholds for detection of slant or inclination of surfaces containing a sinusoidal grating. When the grating was tilted more than 45° to the horizontal, thresholds for inclination (created by the shear-disparity component) were lower than those for slant. At a tilt of less than 45°, slant thresholds were lower than inclination thresholds. Thresholds for stimuli with two superimposed gratings (plaids) could be predicted from the additive contributions of the two gratings.
Line convergence is a more powerful perspective cue than foreshortening (Gillam 1968) (see Section 26.3). From this, Gillam and Ryan (1992) predicted that the perceived inclination of a stereoscopically inclined surface should be reduced more by the presence of vertical parallel lines, which have zero convergence perspective, than by the presence of equally spaced horizontal lines, which have zero foreshortening perspective. Also, the perceived slant of a stereoscopically slanted surface should be reduced more by horizontal lines than by vertical lines (Ryan and Gillam 1994). These predictions were confirmed. However, when perspective was congruent with disparity, slant about a vertical axis was still underestimated relative to inclination about a horizontal axis. Thus, slant-inclination anisotropy is not due only to differential effects of perspective. Conflicting perspective would have little effect at threshold values of slant, but it is possible that observers were influenced by its presence in the experiment by Cagenello and Rogers.
Disparities between oriented line elements may contribute to the anisotropy of slanted and inclined surfaces. However, they are not sufficient, because the anisotropy is evident in random-dot surfaces lacking oriented features.
Figure 20.33. Slant/inclination thresholds with grid patterns. Thresholds for detection of slant and inclination of planar surfaces as a function of the orientation of surface grid lines, either 0° and 90° or ±45°. For slanted surfaces, thresholds for ±45° grids were half those for 0/90° grids. For inclined surfaces, thresholds were similar for the two grids. The plot shows slant/inclination threshold (deg) against grid line orientation for slanted and inclined surfaces. (Redrawn from Cagenello and Rogers 1993)
20.4.2 ANISOTROPY OF DEPTH MODULATIONS
Stereoacuity for a random-dot stereogram depicting a sinusoidal depth corrugation has been reported to be about twice as high when the grating was horizontal as when it was vertical. Also, the perceived depth of a vertically oriented suprathreshold sinusoidal corrugation was found to be less than that of a horizontally oriented corrugation (Rogers and Graham 1983; Bradshaw and Rogers 1999) (Portrait Figure 20.34). The same anisotropy has been found for detecting horizontal and vertical square-wave corrugations in a random-dot stereogram (White and Odom 1985).
Bradshaw et al. (2006) measured the binocular disparity required for the detection of a depth corrugation in a random-dot stereogram as a function of the spatial frequency and orientation of the corrugation. The mean results of four subjects are shown in Figure 20.35A. For depth modulations of 0.4 and 0.8 cpd, the disparity threshold did not vary with the orientation of the depth modulation. For the 0.1 and 0.2 cpd modulations, the threshold rose as orientation increased from 0° (horizontal) to 90° (vertical). Thus orientation anisotropy was evident in only low-frequency depth modulations. At low frequencies, the anisotropy was a monotonic function of the orientation of the corrugation.
Figure 20.35B shows that the disparity in a horizontal depth-modulated grating required to match the depth of a test grating was maximal when the test grating was at 45°. In other words, a grating of a given disparity produced more depth when it was oblique than when it was horizontal or vertical. Thus, unlike the disparity threshold, the magnitude of perceived depth in a suprathreshold stimulus was not a monotonic function of orientation.
Threshold and suprathreshold anisotropies are evident in the stereograms with swept corrugation frequency and swept amplitude depicted in Figure 20.36. The range of corrugation frequencies and the depth modulation changes are the same in the two stereograms but most observers report (1) more depth in the low-frequency horizontal corrugations than in the vertical corrugations and (2) that the boundary for detecting low-frequency vertical corrugations extends a shorter distance across the stereogram than the boundary for detecting horizontal corrugations.
For most observers the anisotropy is especially evident with low corrugation frequencies and planar surfaces and may not be evident for all readers in Figure 20.36 because the surfaces subtend only a small visual angle.
Ecologically it is important to detect changes in the inclination of a horizontal surface that arise, for instance, when we walk over undulating ground. Detection of undulations in the slant of vertical surfaces is less important. Serrano-Pedraza and Read (2010) produced evidence that the stereo mechanism is specifically adapted for detection of horizontal modulations of disparity. They found that the anisotropy for square-wave modulations of disparity was much weaker than that for sine-wave modulations. Also, while the anisotropy for sine-wave modulations was greater for low than for high modulation frequencies, the anisotropy for square-wave modulations was almost independent of modulation frequency. They concluded from these findings that the stereo mechanism contains several modulation-frequency channels for detection of horizontal corrugations but only one channel for detection of vertical corrugations.
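The sine-wave and square-wave disparity corrugations discussed in this section can be described by a simple function of position in the display. The following sketch is illustrative only: the function name, its parameters, and the convention that 0° denotes a horizontal corrugation (disparity varying with vertical position) follow the description in the text but are not taken from any of the cited studies.

```python
import math


def corrugation_disparity(x_deg, y_deg, amplitude_arcsec, freq_cpd,
                          orientation_deg, waveform="sine"):
    """Disparity (arcsec) at field position (x_deg, y_deg).

    orientation_deg = 0 gives horizontal ridges (modulation along y);
    orientation_deg = 90 gives vertical ridges (modulation along x).
    """
    phi = math.radians(orientation_deg)
    # Signed distance (deg) measured at right angles to the ridges.
    across = y_deg * math.cos(phi) - x_deg * math.sin(phi)
    s = math.sin(2.0 * math.pi * freq_cpd * across)
    if waveform == "sine":
        return amplitude_arcsec * s
    if waveform == "square":
        # Same period as the sine wave; zero crossings map to +amplitude.
        return amplitude_arcsec * math.copysign(1.0, s)
    raise ValueError("waveform must be 'sine' or 'square'")
```

Sampling this function at the positions of random dots, and shifting each dot horizontally in the two eyes' images by half the returned disparity in opposite directions, is one conventional way to build such a stereogram.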
Figure 20.34. Mark Bradshaw. Born in Larne, Northern Ireland, in 1961. At the age of 17, he joined the merchant navy. He obtained an M.A. from Glasgow University and a Ph.D. in psychology from Sheffield University with J. Mayhew and J. Frisby in 1989. He conducted postdoctoral work with B. Rogers at Oxford University. In 1995 he obtained an academic appointment in psychology at the University of Surrey, where he became professor in 2003. Mark Bradshaw died in 2004.
Figure 20.35. Effects of orientation of a depth corrugation on disparity threshold and perceived depth. (A) Disparity threshold as a function of the orientation of a depth-modulated grating for four spatial frequencies of modulation. (B) The disparity of a horizontal depth-modulated grating required to match the depth of a test grating, as a function of the orientation of the test grating. (Adapted from Bradshaw et al. 2006)
20.5 DISPARITY-DEFINED 3-D SHAPE
20.5.1 SHAPE INDEX AND CURVEDNESS
Koenderink (1990) proposed that a smooth quadratic surface may be specified by two orthogonal features: its shape index and its curvedness. The shape index is a scale-independent quantity that does not depend on the size of the surface. It is related to the so-called principal curvatures of the surface by the following expression:
$$\text{Shape index} = -\frac{2}{\pi}\arctan\left(\frac{K_{\max} + K_{\min}}{K_{\max} - K_{\min}}\right) \qquad (5)$$
where $K_{\max}$ and $K_{\min}$ are the principal curvatures. Differently shaped surface patches have different shape indices, as shown in Figure 20.37. A symmetrical ellipsoid (such as a sphere) has a shape index of +1 or −1 because it has the same curvature in all directions. A paraboloid (cylinder) has a shape index of +0.5 or −0.5 because it has no curvature along its axis and maximum curvature in the orthogonal direction. A hyperboloid (saddle shape) has a shape index of 0 because it is convex in one direction and concave in the orthogonal direction. Two surfaces with the same shape index but of opposite sign have opposite depths like a stamp and a mold.
The curvedness of a surface is a measure of how curved the surface is. It depends on the size of the surface and on the units used to measure size. It is specified by the following expression:
$$\text{Curvedness} = \sqrt{\frac{K_{\max}^{2} + K_{\min}^{2}}{2}} \qquad (6)$$
Shape index and curvedness are independent descriptions of a local smooth surface patch rather than of a whole surface. Together, they uniquely define the shape of the patch. A flat surface and a sphere are the only objects for which the shape index and curvedness are the same over their entire surface. Koenderink’s classification of surface patches provides a criterion for distinguishing between “simple” and “complex” random-dot surfaces (Julesz 1971) (Section 18.14.2). It also allows us to investigate whether response times for identifying simple and complex shapes differ (Uttal 1987; Uttal et al. 1988). It also allows shape discrimination performance to be compared for different depth cues (Erens et al. 1991).
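The two descriptors can be computed directly from a pair of principal curvatures. The sketch below is a minimal illustration, assuming the shape index of expression (5) and curvedness defined as the root mean square of the two principal curvatures, as in Koenderink's scheme. The function names are hypothetical, and which sign of the index corresponds to a convex patch depends on the sign convention adopted for curvature.

```python
import math


def shape_index(k_max, k_min):
    """Shape index in [-1, 1] from two principal curvatures."""
    if k_max < k_min:                 # enforce k_max >= k_min
        k_max, k_min = k_min, k_max
    if k_max == 0.0 and k_min == 0.0:
        return float("nan")           # flat patch: shape index is undefined
    # atan2 handles the spherical case k_max == k_min without dividing by zero.
    return -(2.0 / math.pi) * math.atan2(k_max + k_min, k_max - k_min)


def curvedness(k_max, k_min):
    """Root mean square of the principal curvatures (same units as input)."""
    return math.sqrt((k_max ** 2 + k_min ** 2) / 2.0)
```

Scaling both curvatures by the same positive factor leaves the shape index unchanged and multiplies the curvedness by that factor, which is the sense in which the two descriptors are independent.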
Figure 20.36. Frequency-amplitude swept disparity gratings in two orientations. (A) Horizontally oriented corrugations are swept in spatial frequency from low (bottom) to high (top). Peak-to-trough amplitude is swept from maximal (left) to zero (right) for all corrugations. Depth modulations of medium frequency corrugations (0.3–0.5 cpd) can be seen farther to the right than either low or high frequency corrugations. (B) Vertically oriented corrugations swept in spatial frequency from low (left) to high (right). Peak-to-trough amplitude is swept from maximal (bottom) to zero (top). Most observers report that the vertical corrugations have less depth and can be seen less far up the stereogram than the horizontal corrugations can be seen across the stereogram in (A).
The visual system may not classify local surface patches according to their shape index and curvedness. However, we will see that observers can be trained to use these criteria. Surfaces curved in depth produce images with second-order disparities, which are referred to as disparity curvatures (Section 19.5). The present section deals with the detection of the curvature and shape of 3-D surfaces arising from higher-order spatial patterns of disparity.
20.5.2 CURVATURE DISCRIMINATION THRESHOLDS
Thresholds for discriminating the sign of curvature (convex or concave) in random-dot stereograms depicting cylindrical surfaces with a parabolic profile are very low. For a 20°-diameter random-dot display depicting a horizontal cylinder with a parabolic profile, the sign of curvature could be reliably discriminated even when the radius of curvature at the peak of the cylinder was greater than 400 cm. This corresponds to a disparity curvature (rate of change of disparity gradient per visual angle) of less than 0.02 arcmin/deg² (Rogers and Cagenello 1989). Thresholds varied with display size. They were 10 times higher when the display subtended only 2.66° (Figure 20.38A). Thresholds for vertical cylinders were typically 1.5 times higher than those for horizontal cylinders. Thresholds were also influenced by the orientation of gridlines (0/90° or ±45°) on the surface, although to a lesser extent than thresholds for detecting the direction of slant of a flat surface (Section 20.4.1).
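The correspondence between radius of curvature and disparity curvature quoted above can be checked with a short calculation. The sketch below assumes the small-angle approximation in which disparity curvature (in radians per radian squared) is roughly the interocular distance divided by the radius of curvature. The interocular distance of 6.5 cm and the function name are assumptions, since the text does not state the value used.

```python
import math

DEG_PER_RAD = 180.0 / math.pi
ARCMIN_PER_RAD = 60.0 * DEG_PER_RAD


def disparity_curvature(radius_cm, interocular_cm=6.5):
    """Approximate disparity curvature in arcmin/deg^2.

    Uses the small-angle approximation: curvature ~ I / r (rad per rad^2),
    which does not depend on viewing distance.
    """
    per_radian = interocular_cm / radius_cm
    # Convert the disparity to arcmin and each visual angle to degrees.
    return per_radian * ARCMIN_PER_RAD / DEG_PER_RAD ** 2


if __name__ == "__main__":
    for radius in (400.0, 2.25):
        print(radius, round(disparity_curvature(radius), 3))
```

Under these assumptions a 400-cm radius gives about 0.017 arcmin/deg², which is below the 0.02 arcmin/deg² quoted in the text, and a 2.25-cm radius gives about 3 arcmin/deg².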
Figure 20.37. Shape and curvedness indices. The local curvature of a surface patch can be characterized by its shape index (angular coordinate) and its curvedness (radial coordinate). Convex spheres have a shape index of 1 and concave spheres (hollows) have a shape index of –1. Cylinders with convex and concave curvature have shape indices of 0.5 and –0.5 respectively, while saddles have a shape index of 0. Curvedness increases radially from the center in this representation. (Redrawn from de Vries et al. 1994)
Thresholds for discriminating a difference in curvature of surfaces with suprathreshold curvatures have also been determined for cylinders with parabolic profiles. It can be seen in Figure 20.38B that, at the optimal curvature of around 3 arcmin/deg², a curvature difference of less than 5% could be detected between two cylinders (Rogers and Cagenello 1989). Lunn and Morgan (1997) obtained Weber fractions of at least 15% for discrimination of a change in curvature of a horizontal cylinder.
20.5.3 3-D SHAPE DISCRIMINATION THRESHOLDS
De Vries et al. (1993) investigated the discriminability of disparity-defined surfaces that differed in shape and curvedness. Observers were first shown wire-frame pictures of surfaces with different shape indices to familiarize them with the scale. They then assigned random-dot stereograms of surfaces to eight equal intervals along the shape index scale. The curvedness of the surfaces had four values that varied between blocks of trials. Between 70 and 90% of the categorizations were correct. Performance was best toward the cylinder ends of the shape scale (±1). Performance was worst for saddle shapes with an index close to zero. Shape discrimination was better for surfaces with larger curvedness, when the trials were blocked according to curvedness. Discriminations showed a more random pattern in the second experiment when curvedness was randomized over trials. When observers assigned indices to shapes, rather than categorizing them, an almost perfect linear relationship was found with little influence of surface curvedness.
De Vries et al. (1994) extended these investigations. Rather than categorizing or labeling surfaces in random-dot stereograms with different shape indices, observers made forced-choice discriminations between surfaces with similar shape indices. Figure 20.39 shows that performance was best for cylindrical surfaces with shape indices of –0.5 or +0.5. All observers found saddle-shaped surfaces (index 0) and symmetrical ellipsoids (index ±1) more difficult to discriminate than cylinders. Performance was very impressive. The best observer could detect a shape-index difference of 0.01, which corresponds to 0.5% of the entire shape index scale from –1.0 to +1.0.
Figure 20.38. Surface curvature discrimination thresholds. (A) Thresholds for discriminating between convex and concave surface curvature in horizontally oriented parabolic corrugations as a function of display size. Thresholds decreased with increasing display size reaching a minimum of less than 0.002 cm⁻¹ (500 cm radius of curvature) for the largest display (21.6°) (N = 2). (B) Thresholds for discriminating between one suprathreshold parabolic curved surface and another expressed as a fraction of the reference curvature. Weber fractions were smallest (< 5%) for curved surfaces with 3 arcmin/deg/deg disparity curvature (2.25 cm radius). Results for one subject. Panel A plots surface curvature threshold (cm⁻¹) and the equivalent radius of curvature (cm) against display size (deg). Panel B plots threshold curvature as a Weber fraction against the disparity curvature of the reference surface (arcmin/deg/deg). (Redrawn from Rogers and Cagenello 1989)
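The categorization task and the size of the smallest detectable difference can be made concrete with a small sketch. It assumes that the eight categories are contiguous intervals of width 0.25 numbered from 0 at the –1.0 end of the scale. The numbering scheme and the function names are illustrative assumptions and are not taken from de Vries et al.

```python
def shape_category(shape_index, n_bins=8):
    """Index (0 .. n_bins - 1) of the equal-width interval containing shape_index."""
    if not -1.0 <= shape_index <= 1.0:
        raise ValueError("shape index must lie between -1 and +1")
    width = 2.0 / n_bins
    # The upper end of the scale (+1.0) is assigned to the last interval.
    return min(int((shape_index + 1.0) / width), n_bins - 1)


def fraction_of_scale(difference):
    """A shape-index difference as a fraction of the full scale (-1 to +1)."""
    return difference / 2.0
```

On this scale a difference of 0.01 is 0.005 of the full range, which is the 0.5% quoted above.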
20.6 CONSTANCY OF DISPARITY-DEFINED DEPTH
Consider first a simple depth interval of $\Delta d$ between two objects A and B. The difference in disparity, $\mu_A - \mu_B$, is inversely proportional to the square of the distance of the objects, $D$, from the viewer, or $\mu_A - \mu_B = k/D^2$ (given that $\Delta d$ is small relative to $D$). Therefore judgments of a simple depth interval along a line of sight must be scaled by distance squared. An inclined flat surface produces a disparity gradient, defined as the change in disparity per unit visual angle, or $(\mu_A - \mu_B)/\theta$ (Section 19.4). The visual angle subtended by a pair of objects is inversely proportional to their distance from the eye, or $\theta = k/D$. Therefore, $(\mu_A - \mu_B)/\theta$ is proportional to $1/D$. In other words, the disparity gradient between two objects is inversely proportional to their distance along a given line of sight. Put another way, the disparity gradient is the first spatial derivative of the absolute difference in disparity between two points, and the first derivative of any squared function is a linear function. Therefore a judgment of the slant or inclination of a surface at different distances along a line of sight must be scaled by distance. A surface curved in depth produces a disparity curvature, defined as the rate of change of a disparity gradient (second spatial derivative of disparity) as a function of visual angle, or $(\mu_A - \mu_B)/\theta^2$. We saw in Section 19.5 that the second derivative of disparity, defined as the rate of change of disparity gradient, is independent of viewing distance. Therefore a judgment of the local curvature of a surface in depth does not need to be scaled by distance. Distance scaling also depends on the cues used to detect the distance of the stimulus. These cues are listed in Section 29.2.1, where it can be seen that they code depth in different ways.
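The three scaling rules can be compared numerically. The sketch below uses the usual small-angle approximations for a stimulus straight ahead: a depth step gives a disparity of about I·Δd/D², a slanted plane gives a disparity gradient of about (I/D)·tan(slant), and a curved patch of radius r gives a disparity curvature of about I/r. The interocular distance I of 6.5 cm, the function names, and the example values are assumptions made for illustration.

```python
import math

I_CM = 6.5  # assumed interocular distance (cm)


def step_disparity(depth_cm, distance_cm):
    """Relative disparity (rad) of a small depth interval: ~ I * d / D^2."""
    return I_CM * depth_cm / distance_cm ** 2


def gradient_disparity(slant_deg, distance_cm):
    """Disparity gradient (rad/rad) of a slanted plane: ~ (I / D) * tan(slant)."""
    return (I_CM / distance_cm) * math.tan(math.radians(slant_deg))


def curvature_disparity(radius_cm, distance_cm):
    """Disparity curvature (rad/rad^2) of a curved patch: ~ I / r (no D term)."""
    return I_CM / radius_cm


if __name__ == "__main__":
    near, far = 100.0, 200.0  # doubling the viewing distance
    print(step_disparity(5.0, far) / step_disparity(5.0, near))
    print(gradient_disparity(30.0, far) / gradient_disparity(30.0, near))
    print(curvature_disparity(50.0, far) / curvature_disparity(50.0, near))
```

Doubling the viewing distance reduces the disparity of the depth step to a quarter and the disparity gradient to a half, and leaves the disparity curvature unchanged, which is why the three kinds of judgment require scaling by distance squared, by distance, and not at all.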
20.6.1 INTRODUCTION
This section deals with the perceived constancy of disparity-defined depth intervals and 3-D shapes over changes in viewing distance. The constancy of depth intervals defined by other depth cues is discussed in Section 29.2.2. A change in vergence either increases or decreases horizontal disparity by an equal amount over the whole binocular field. Therefore, absolute horizontal disparities do not indicate absolute distance. Relative disparities between objects at different distances are not affected by changes in either horizontal or vertical vergence. However, they do not provide unambiguous information about depth intervals or about the slant or inclination of a surface because they are affected by the absolute distance and headcentric eccentricity of the stimuli. The accurate perception of a disparity-defined depth interval requires the viewer to register the absolute distance of the stimuli and scale the disparity accordingly. However, the required distance scaling of disparity depends on whether the stimulus is a depth step, a depth gradient, or a depth curvature.
20.6.2 PROCEDURES
De Vries et al. (1994) also reported the surprising result that shape discrimination thresholds were not higher when (1) the curvedness of the surfaces was varied from trial to trial with values between 0.3 and 1.25 cm⁻¹, or (2) the slant of the differently shaped surfaces was randomly varied with values between ±30°, or (3) both curvedness and slant were varied.
Figure 20.39. Thresholds for discriminating shape differences. Shape discrimination thresholds for 21 reference shapes with shape indices between –1.0 and +1.0. Curvedness was fixed at 0.5 cm⁻¹ (red) or 1.0 cm⁻¹ (blue). Thresholds were lowest for cylindrical shapes (shape index +0.5) and highest for saddle shapes (index 0.0) and symmetrical ellipsoids (index ±1.0). The plot shows shape-discrimination threshold against shape index. (Redrawn from de Vries et al. 1994)
20.6.2a Use of Depth Probes
In the depth-probe procedure, subjects adjust the distance of a visual comparison object, or probe, until it appears to be at the same distance as the near edge of the test object and then at the same distance as the far edge. There is a logical problem in using a probe in this way. The probe will also be subject to depth scaling. The fact that subjects match the distance of the probe to that of the test object accurately does not allow any conclusions to be made about the perceived distance of the test object. However, the probe usually contains more depth information than the test object. It therefore provides a standard for comparing the precision and accuracy of different sources of depth information.
20.6.2b Use of a Frontal-Plane Comparison Stimulus
In this procedure, subjects set an interval between two stimuli presented in a frontal plane at a fixed distance to match the depth dimension of a test object presented at various distances. The procedure requires the subject to change vergence in looking from the comparison stimulus to the test stimulus. Changes in vergence are not required if the comparison object is a pair of calipers or other object that subjects set by touch. For example, Bradshaw et al. (1998) used three LEDs placed in the plane of the visual horizon at viewing distances from 1.5 to 3 m. Subjects adjusted the positions of the LEDs until they formed a forward-pointing triangle in the horizon plane that matched the shape of a triangle held in the hand. They performed the task with depth provided by head-generated motion parallax alone, disparity alone, and with both cues. The results indicated that subjects underestimated the size of the triangle. When this factor was allowed for, size constancy and depth constancy were remarkably good. Performance with both cues present was consistent with that predicted by averaging of information from the two cues.
20.6.2c Comparing Depth Intervals
In this procedure, a depth interval between two objects at one distance is matched with the interval between an identical pair of objects seen at each of several other distances. In one version of the procedure, the depth cues are the same for the two sets of stimuli. In another version, one member of each pair of targets is presented with all cues to distance while the other member is presented with only the depth cue under investigation (see Gogel 1960; Glennerster et al. 1996). This procedure gets directly at the main question, namely, for a given cue to relative depth, does the perceived depth interval between two objects remain the same when the objects are placed at different absolute distances? Theoretically, information about absolute viewing distance is not required to set a depth interval at one distance to match a depth interval at another distance (Glennerster et al. 1996). In this task the viewer is not required to judge either depth interval accurately but only to judge that they are the same. All that is required is information about the ratio of viewing distances of the two sets of objects.
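The sufficiency of the distance ratio follows from the inverse-square relation between disparity and distance. A minimal sketch, assuming that a small depth interval is proportional to disparity multiplied by distance squared, is given below. The function name and the example values are illustrative.

```python
def matching_disparity(disparity_1, distance_ratio):
    """Disparity at the second distance that signals the same depth interval.

    distance_ratio is D2 / D1. Assuming depth ~ disparity * D^2 / I, equal
    depth intervals require disparity_2 = disparity_1 / (D2 / D1)^2, so
    neither absolute distance nor the interocular distance is needed.
    """
    if distance_ratio <= 0:
        raise ValueError("distance_ratio must be positive")
    return disparity_1 / distance_ratio ** 2
```

For example, a 20-arcmin disparity at one distance is matched in depth by a 5-arcmin disparity at twice that distance, whatever the two distances actually are.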
20.6.2d Pointing with Unseen Hand
In this method, subjects point with an unseen finger to each test object and the positions of the finger are recorded. It must be determined independently how the felt position of the hand is related to depth.
20.6.3 CONSTANCY OF RELATIVE DEPTH MAGNITUDE
20.6.3a Relative-Depth Constancy at Near Distance
Ritter (1979) asked subjects to fixate an object while a test object was presented for 100 ms at a constant crossed disparity of 20 arcmin. The fixation and test objects were
then removed and subjects set a depth probe to the remembered position of each object. Subjects set the probe accurately and it was concluded that people have almost perfect constancy of disparity-defined depth over the 60 to 180 cm range of absolute distance used. These conclusions are suspect because they were based on the use of a depth probe. In an earlier experiment, Ritter (1977) asked subjects to match a comparison depth interval at a given absolute distance with the depth dimension of a 3-D wire object presented at different distances. This is a better procedure because the comparison stimulus was immune to changes in depth scaling, assuming that it was perceived as remaining at the same distance. Although there was no significant effect of distance on perceived relative depth, results were very variable over the distance range of 60 to 180 cm. Collett et al. (1991) asked subjects to estimate the disparity-defined depth between two frontal textured surfaces that abutted along the horizontal meridian. The display was between 2 and 10° wide, and was viewed from a distance of between 45 and 130 cm. Estimates of the depth interval declined with increasing viewing distance when the display had a constant angular size, but were nearly independent of viewing distance when linear size was constant. At any distance, the disparity-defined depth between the surfaces appeared greater as the angular size of the display was reduced and this effect became more pronounced with increasing viewing distance. Collett et al. concluded that their subjects used two sources of information to register the viewing distance of the displays. The first was eye vergence, and the second was the change in angular size with changing distance. The second factor became dominant as viewing distance increased. Mon-Williams et al. (2000) asked subjects to point with unseen finger to the near and far end of a concave wire pyramid extending 8 cm in depth from its base at a distance of 32.5 cm. 
Viewing was normal or with 6-diopter base-in prisms with lenses that increased the vergence distance of the base of the pyramid to 66.5 cm. For nine of 15 subjects, the perceived depth of the pyramid increased as predicted as convergence-specified distance was increased. Thus, at least for some subjects, vergence was used to scale a depth interval at a distance of 35 cm. It has been suggested that the judgment of absolute distance that is required for estimating a disparity-defined depth interval is influenced by the range of disparities in a scene (Glennerster et al. 1998; Harris 2004). A wide range of disparities is likely to arise from near objects and a narrow range is likely to arise from far objects. However, O’Kane and Hibbard (2010) found that estimates of a disparity-defined depth interval were not affected by the presence of frontal displays of LEDs placed at various distances relative to the test stimulus. But the perception of a depth interval surely improves in the presence of a textured ground plane that provides more adequate information about the range of disparities.
STEREOSCOPIC VISION
20.6.3b Relative-Depth Constancy Beyond 2 m In studying relative-depth constancy over large viewing distances it is difficult to keep disparity constant and eliminate the effects of other depth cues such as accommodation. Cormack (1984) overcame these problems by using a pair of disparate afterimages as the test object. Good stereo depth can be obtained from afterimages, and their use has the advantage that a given disparity remains the same over changes in vergence or accommodation (Wheatstone 1838; Ogle and Reiher 1962). Afterimages with disparities of 16.3 or 4.5 arcmin were viewed with reference to a fixation point at various distances up to 27 m in a corridor or up to 6 km outside at night. In both cases, surrounding objects were sufficiently visible to provide perspective information about distance. Subjects either matched the perceived depth of the afterimage with a variable depth probe or gave verbal estimates of its absolute distance and its depth relative to a fixation point set at each of several distances. The only results reported were derived from the use of the depth probe. Therefore, the fact that they revealed almost perfect depth constancy up to 27 m must be regarded with suspicion. Allison et al. (2009a) compared monocular and binocular estimates of depth intervals using two LEDs viewed at eye level. One LED was at a distance of 4.5 or 9 m. The other was between 0.05 and 1.7 m beyond the first. The LEDs were in dark surroundings or the foreground of the near LED was illuminated. Monocular estimates of the depth interval between the LEDs were grossly underestimated and did not vary as a function of the interval or whether the foreground was dark or illuminated. Binocular estimates of the depth interval were a linear function of the actual interval. Estimates were compressed about 75% in dark surroundings but only by about 50% when the foreground was visible. 
Depth-interval estimates were similar at the 4.5 and 9 m viewing distances in spite of the four-fold increase in relative disparities. There was thus considerable constancy of perceived relative depth over these two viewing distances. The better performance with the foreground in view must have been due to perspective information from the foreground because vergence and absolute disparity would not be available at viewing distances of 4.5 or 9 m. Palmisano et al. (2010) performed a similar experiment with two LEDs in a disused railway tunnel. The near LED was at a distance of 20 or 40 m. The tunnel was either dark or illuminated up to the distance of the near LED. With illuminated foreground, average depth-interval estimates were 59% of the actual interval at a viewing distance of 20 m and 52% at a distance of 40 m. There was thus considerable constancy of perceived relative depth over these two large viewing distances. Theoretically, information about absolute distance is not required to set a depth between two objects at one distance to match a depth between two objects at another.
Figure 20.40. Stimuli used by Bradshaw et al. (2000). For each test stimulus the subject set the adjustable LED so that the depth between it and the two base LEDs matched the depth within the comparison stimulus. In each stimulus, the base LEDs were 40 cm apart. (Diagram labels: observer; test stimulus 1 at 1.5 m; comparison stimulus at 2.1 m; test stimulus 2 at 3 m.)
All that is required is information about the ratio of viewing distances of the two sets of objects. Bradshaw et al. (2000) used the stimuli depicted in Figure 20.40. Subjects set the depth interval between the LEDs of a test stimulus at 1.5 m or 3 m to equal the depth interval between the LEDs of a comparison stimulus at 2.1 m. Mean errors were 7% at 1.5 m and 12% at 3 m. Subjects performed equally well when depth information was supplied by motion parallax, by disparity, or by both cues. Subjects performed much less accurately when asked to set the width of a set of LEDs to match the in-depth dimension of the set. This task requires knowledge of viewing distance, not merely ratios of distance.
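The point that only the ratio of viewing distances is needed can be illustrated numerically. The following Python sketch assumes the usual small-angle approximation, in which the relative disparity of a depth interval Δd at distance d is approximately aΔd/d², where a is the interocular distance. The value of 6.5 cm for a, the 10-cm depth interval, and the function names are illustrative assumptions, not details of the Bradshaw et al. study.

```python
import math

IOD_CM = 6.5  # assumed interocular distance (cm); not given in the text


def disparity_rad(depth_cm, distance_cm, iod_cm=IOD_CM):
    """Small-angle relative disparity (radians) of a depth interval."""
    return iod_cm * depth_cm / distance_cm ** 2


def matching_disparity(comparison_disparity, distance_ratio):
    """Disparity at the test distance that signals the same depth interval.

    distance_ratio is test distance / comparison distance. Neither absolute
    distance nor the interocular distance is needed.
    """
    return comparison_disparity / distance_ratio ** 2


# Hypothetical 10-cm depth interval in a comparison stimulus at 2.1 m.
comparison = disparity_rad(10.0, 210.0)
for test_cm in (150.0, 300.0):
    from_ratio = matching_disparity(comparison, test_cm / 210.0)
    from_distance = disparity_rad(10.0, test_cm)
    print(f"test at {test_cm:.0f} cm: "
          f"{math.degrees(from_ratio) * 60:.2f} arcmin from ratio, "
          f"{math.degrees(from_distance) * 60:.2f} arcmin from absolute distance")
```

Under this approximation the two routes give the same disparity, so a matching task of this kind could be performed from the distance ratio alone.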
20.6.3c Use of Vertical Disparity and Vergence in Scaling Relative Depth Michaels (1986) provided an ecological and nonmathematical analysis of binocular vision, which revealed the importance of patterns of vertical disparity in specifying the absolute distances of surfaces. She also presented evidence that manipulations of vertical disparities in stereograms depicting wall surfaces influence judgments of apparent slant and judgments of where the surface intersects the observer’s midfrontal plane. Sobel and Collett (1991) reported that setting vertical disparities to conform to viewing distances of 12.5 and 100 cm had no effect on the perceived depth between two abutting frontal surfaces at a vergence distance of 50 cm. Cumming et al. (1991) reported that the perceived curvature of a stereoscopic cylinder was not affected by changing vertical disparities corresponding to viewing distances between 37.5 cm and infinity (Figure 20.41A). However, changing vergence did affect the perceived shape of the cylinder by an amount equivalent to 25% of the change required for complete depth constancy (Figure 20.41B). The display used by Sobel and Collett subtended 25°. That used by Cumming et al. subtended about 11°. The maximum vertical-size disparity for their closest distance (37.5 cm) was less than 1.5%. This is probably below the
B I N O C U L A R D I S PA R I T Y A N D D E P T H P E R C E P T I O N
Figure 20.41. Effects of vertical disparity and vergence on depth scaling. Observers judged whether a stereoscopic horizontal cylinder appeared more, or less, curved than a semicircular cylinder at each of three distances of the computer monitors. (A) Changes in vertical disparity had no effect on perceived curvature, as indicated by the scaling distance subjects applied. (B) Manipulations of vergence had an effect equivalent to 25% of complete constancy (dotted diagonal line). Results for one subject. (Redrawn from Cumming et al. 1991) (Axes: scaling distance (cm) against viewing distance (cm) simulated by vertical disparity in A and by vergence in B.)

Figure 20.42. Effects of vertical disparity on perceived depth. Observers estimated the peak-to-trough depth in horizontal corrugations occupying the central (25° × 20°) region of the 75° × 75° display. When the vertical and horizontal disparity components were appropriate for a surface at infinity, depth was judged to be approximately double that when cues were appropriate for 28 cm. Most of this effect was due to the presence of the vertical component. Depth scaling was absent when only the 25° × 20° test area was visible but was still found when disparity cues were limited to the area surrounding the 25° × 20° test area in the “window” condition. (Redrawn from Rogers and Bradshaw 1993) (Axes: normalized judged depth (cm) for four conditions: vertical + horizontal disparity, vertical disparity only, horizontal disparity only, and the “window” condition.)
detection threshold, or at least below the threshold for depth scaling. Figure 20.47 shows that the vertical size ratio (VSR) on a frontal plane increases with increasing eccentricity to a maximum at an eccentricity of about ±45°. Rogers and Bradshaw (1993) used stimuli subtending 75° that produced a VSR of about 1.12 (a 12% size difference) at the closest distance (28 cm). The stimuli contained a sinusoidal modulation of shear disparity that produced horizontal ridges and troughs. The surface appeared more distant when the pattern of vertical disparities was appropriate to a surface at infinity than when it was appropriate to a surface at 28 cm. Accordingly, texture elements covering the surface appeared larger for the apparently more distant surface than for the nearer surface. When the pattern of vertical disparities was appropriate for a surface at infinity, the perceived depth of a given corrugation was approximately twice that when vertical disparities were appropriate for a surface 28 cm from the observer (Figure 20.42). Although scaling of perceived relative depth was much less
than that required for complete constancy, the results provided the first clear evidence that vertical disparity manipulations affect perceived depth in stereoscopic surfaces. This evidence demonstrates that stimulus size is an important factor in the perception of the depth structure of surfaces. Other evidence showing the importance of stimulus size is presented below. In previous studies, vergence and accommodation were constant while disparity-defined distance was varied. Therefore, there was conflict between information from disparity and that from vergence and accommodation. Bradshaw et al. (1996) assessed the effects of conflicting information on depth scaling. Observers adjusted calipers to indicate the perceived depth of disparity-defined horizontal corrugations in a random-dot stereogram that subtended 75° or 10°. For each size there were three conditions: (1) vertical-disparity changed, with vergence constant; (2) vergence changed, with vertical disparity constant; and (3) both cues changed. The results were clear. With displays subtending 75°, changing vergence alone and changing vertical disparity alone produced depth scaling equivalent to ∼20% of that required for complete constancy (Figure 20.43). When both cues signaled changes in viewing distance, depth scaling rose to over 30%. This suggests that contradictory vergence information in previous experiments was responsible, at least in part, for the small amount of depth constancy. The results for small (10°) displays were quite different. The effect of changing vertical-disparity on depth scaling
was negligible, as Cumming et al. had found. The effects of changing vergence, which were now free of the contradictory influence of vertical disparities, rose to over 30% of that required for complete scaling (Figure 20.43). Unfortunately, it does not appear to be possible to eliminate the effects of vergence in order to assess the true role of vertical disparity. O’Kane and Hibbard (2007) asked whether vertical disparities in one area affect perceived depth in a neighboring area. Subjects judged the size and shape of a stereoscopic ellipsoid at a vergence-defined distance of between 20 and 45 cm. Judgments were more consistent when vertical disparity in a surrounding random-dot surface indicated a viewing distance of 16 cm than when it indicated an infinite viewing distance. Thus a large area of vertical disparity in one depth plane affected the perceived distance of a small central object in another depth plane.

Figure 20.43. Disparity scaling and display size. Disparity scaling is expressed as a percentage of that needed for perfect constancy of perceived relative depth at different distances. When both vertical disparities and the vergence angle indicated the viewing distance, constancy was about 35% for all sizes of display. When vertical disparities alone specified viewing distance and vergence was held constant, constancy was about 15% for the largest (80° diameter) displays and decreased to zero for 10° displays. Constancy was around 35% with vergence manipulations with the smallest (10°) display. The effectiveness of vergence decreased with increasing display size (N = 3). (Redrawn from Bradshaw et al. 1996) (Axes: effective scaling (%) against display size (deg), for vertical disparity + vergence, vergence alone, and vertical disparity alone.)

20.6.3d Depth Constancy in Stereoscopes
The effects of an incorrect relationship between relative disparity and viewing distance on the perception of depth intervals are vividly illustrated by the cardboard cutout phenomenon. Familiar objects seen through binoculars tend to look their normal size but appear nearer. This is because magnification causes a linear increase in binocular disparity rather than the squared increase in disparity that results when objects are actually brought nearer. This shortfall in disparity scaling relative to that expected by depth constancy causes scenes viewed through binoculars to appear flattened (Wallach and Zuckerman 1963). Also, a telephoto lens causes objects to appear flattened in depth (Harper and Latto 2001). In general, if distance is underestimated, objects look flattened (Foley 1980; Johnston 1991). Objects and people in stereograms typically appear flattened, like a cardboard cutout (Figure 20.44). This is still true when the retinal images have the same size and disparities as those created by the original scene. As the distance of a stereogram is increased, the sizes of images in the stereogram decrease approximately in proportion to distance, as with real objects. However, disparities in a stereogram decrease linearly with distance, not by distance squared. For a person viewing a stereogram using the inappropriate distance-squared correction, perceived depth intervals should increase linearly as the distance of the stereogram increases. Psychophysical judgments conform approximately to this prediction (Wallach et al. 1979; Bishop 1996). Another factor is that, in many stereograms, distances indicated by accommodation and convergence are much less than distances of objects in the original scene. This reduction in distance cues causes objects to look flattened. Objects should look elongated in depth and more sharply inclined when cues to distance in the stereoscope signal a
Figure 20.44. The “cardboard cut-out” phenomenon. People and objects in pictorial stereograms often appear as cardboard cut-outs even when the visual angle and the disparities of the original scene are reproduced correctly. The phenomenon is a consequence of the fact that size scales with 1/d whereas disparity scales with 1/d², where d is the distance to the object. Size and depth scaling are often inappropriate in a stereoscope because vergence and vertical disparity signal a closer viewing distance than in the original scene.
greater distance than that of the objects in the original scene. Van Damme and Brenner (1997) showed from the way subjects judged the shape of a stereogram depicting a tennis ball, that the distance used to scale disparity was also used to scale frontal size.
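The scaling argument behind the flattening seen through binoculars and in stereograms can be made concrete with a small numerical sketch in Python. It assumes the small-angle relation between disparity, depth, and distance (disparity ≈ aΔd/d²), an interocular distance a of 6.5 cm, and, for binoculars, that the visual system scales disparity by the apparent distance d/M of an object magnified M times. These are simplifying assumptions for illustration, and the function names are hypothetical.

```python
IOD_CM = 6.5  # assumed interocular distance (cm)


def disparity_rad(depth_cm, distance_cm, iod_cm=IOD_CM):
    """Small-angle disparity (radians) produced by a real depth interval."""
    return iod_cm * depth_cm / distance_cm ** 2


def scaled_depth_cm(disparity, scaling_distance_cm, iod_cm=IOD_CM):
    """Depth signaled by a disparity when scaled by distance squared."""
    return disparity * scaling_distance_cm ** 2 / iod_cm


def binocular_depth_cm(depth_cm, distance_cm, magnification):
    """Predicted depth of an object seen through binoculars.

    Disparity is magnified linearly, but the object appears at
    distance / magnification (assumption), so depth is underestimated.
    """
    magnified = magnification * disparity_rad(depth_cm, distance_cm)
    return scaled_depth_cm(magnified, distance_cm / magnification)


def stereogram_depth_cm(disparity_at_ref, ref_distance_cm, viewing_distance_cm):
    """Predicted depth in a stereogram moved to a new viewing distance.

    Disparities in the stereogram fall linearly with distance, while the
    distance-squared correction is still applied.
    """
    disparity = disparity_at_ref * ref_distance_cm / viewing_distance_cm
    return scaled_depth_cm(disparity, viewing_distance_cm)


print(binocular_depth_cm(20.0, 1000.0, 8.0))     # 20-cm object at 10 m, 8x binoculars
print(stereogram_depth_cm(0.001, 50.0, 50.0))    # stereogram at 50 cm
print(stereogram_depth_cm(0.001, 50.0, 100.0))   # same stereogram at 100 cm
```

Under these assumptions, depth seen through binoculars is reduced by a factor equal to the magnification, and predicted depth in a stereogram grows in proportion to viewing distance.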
20.6.4 DEPTH CONSTANCY OF SLOPING PLANAR SURFACES
A useful task in investigating depth constancy for slanted or inclined surfaces is to set the dihedral angle between two abutting surfaces to appear 90°, as illustrated in Figure 20.45. Durgin et al. (1995) used real objects seen in a brightly lit, fully structured environment. The stimuli were wooden cones with the apex pointing toward the observer. The cones had depths that varied from 50% to 200% of the diameter of the base and were positioned at one of five viewing distances between 1 and 3 m. Observers adjusted a 2-D icon on a computer screen to match the perceived angle of the apex of the cone. At distances less than 3 m, settings indicated that subjects tended to overestimate depth in relation to width. However, overall, stereoscopic constancy was very high. Todd and Norman (2003) used a textured dihedral angle and a pyramid with depth defined by disparity or motion parallax. In the first task, subjects set the disparity of the stimulus to make its width equal its height. In a second task, they adjusted the disparity until the surfaces appeared to intersect at 90°. There were large individual differences but, on average, the depth of the dihedral angle was overestimated by 27% in the depth-to-width scaling task and by 40% in the angle-scaling task. Similar overestimations were obtained with the pyramid. Todd and Norman also asked subjects to adjust the disparity of a stereoscopic pyramid at a viewing distance of 115 cm to match the depth of each of three real pyramids at distances of 75 or 235 cm. Equating the depths of two 3-D objects at different distances only requires information about the relative distances of the objects. Although the repeat reliability of judgments was high, there were large constant errors. For 9 of 10 subjects, perceived depth within the real pyramids relative to that in the stereoscopic pyramid decreased by an average of 17% as the real pyramids were moved from 75 to 235 cm. Todd and Norman concluded that people do not have accurate perceptions of 3-D metric structure. Perhaps these failures were due to use of computer-generated objects rather than real objects and lack of adequate information about viewing distance. Further work is needed to clarify these issues.
20.6.5 DEPTH CONSTANCY OF 3-D SURFACE CURVATURE
20.6.5a Judging the Flatness of Frontal Surfaces
Rogers and Bradshaw (1993) pointed out that frontal surfaces have two invariant properties that could allow one to judge when a surface at any distance is frontal rather than curved in depth. The function relating the horizontal size ratio (HSR) to horizontal eccentricity across a frontal surface is shown in Figure 20.46. The function relating the vertical size ratio (VSR) to eccentricity is shown in Figure 20.47. The gradient of the HSR-eccentricity function across the midline is double that of the VSR-eccentricity function. Thus, when the HSR-gradient created by an extended smooth surface is double that of the VSR-gradient, the surface must lie in a frontal plane. In addition, if a surface is flat and frontal, the HSR for each local patch is simply the VSR squared:
HSR = VSR²   (8)
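The relation in equation (8) can be checked with a simple geometrical sketch in Python. It treats each eye as a point, places the eyes 6.5 cm apart (an assumed value), and considers short line elements lying in a frontal plane at eye level. The ratio convention (left-eye image size divided by right-eye image size) and the function names are assumptions made for illustration.

```python
import math

IOD_CM = 6.5  # assumed interocular distance (cm); not specified in the text


def eye_distances(ecc_deg, distance_cm, iod_cm=IOD_CM):
    """Distances (cm) from the left and right eyes to a point on a frontal plane.

    The point lies at eye level at headcentric eccentricity ecc_deg
    (negative = left of the midline); distance_cm is the orthogonal
    distance from the interocular axis to the plane.
    """
    x = distance_cm * math.tan(math.radians(ecc_deg))
    d_left = math.hypot(x + iod_cm / 2, distance_cm)   # left eye at x = -iod/2
    d_right = math.hypot(x - iod_cm / 2, distance_cm)  # right eye at x = +iod/2
    return d_left, d_right


def vsr(ecc_deg, distance_cm, iod_cm=IOD_CM):
    """Vertical size ratio (left / right) of a short vertical line."""
    d_left, d_right = eye_distances(ecc_deg, distance_cm, iod_cm)
    return d_right / d_left  # angular height is proportional to 1 / distance


def hsr(ecc_deg, distance_cm, iod_cm=IOD_CM):
    """Horizontal size ratio (left / right) of a short horizontal line in the plane."""
    d_left, d_right = eye_distances(ecc_deg, distance_cm, iod_cm)
    # Angular width is foreshortened by distance_cm / d, so it is
    # proportional to distance_cm / d**2 in each eye.
    return (distance_cm / d_left ** 2) / (distance_cm / d_right ** 2)


for ecc in (-10.0, -37.5, -45.0, -60.0):
    v, h = vsr(ecc, 28.0), hsr(ecc, 28.0)
    print(f"ecc {ecc:6.1f} deg at 28 cm: VSR = {v:.4f}, HSR = {h:.4f}, VSR^2 = {v * v:.4f}")
```

In this simplified geometry the HSR equals the square of the VSR at every eccentricity and distance, and the VSR departs most from 1.0 at an eccentricity near 45°.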
Thus, when the HSR of a single patch is the square of the VSR, the patch must lie in a frontal plane whatever its distance from the observer (see Section 19.2.2b). This relationship between HSRs and VSRs holds only for surfaces close to the horizontal plane of regard. To determine whether a surface in a frontal plane is flat, the visual
Figure 20.45. Two abutting slanted surfaces. Crossed fusion creates a concave dihedral angle.

Figure 20.46. Horizontal size ratios (HSRs) for a frontal surface. The graph shows how the HSR of the images of a horizontal line on a frontal surface varies with headcentric eccentricity and the orthogonal distance to the surface. The pattern of HSRs is very similar to the pattern of VSRs on a frontal surface (Figure 20.47) except for magnitude. For any eccentricity and distance, the HSR equals the VSR². (Axes: horizontal size ratio (HSR) against eccentricity (deg) and distance to surface (cm).)

Figure 20.47. Vertical size ratios (VSRs) for a frontal surface. The graph shows how the VSR of the images of a vertical line on a frontal surface varies with headcentric eccentricity and the distance to the surface. The VSR is maximal at eccentricities around ±45° and decreases back to 1.0 at eccentricities of either +90° or –90°, or as the distance to the surface approaches infinity. (Axes: vertical size ratio (VSR) against eccentricity (deg) and distance to surface (cm).)
system does not need to compute the distance to the surface directly. It would only have to determine whether the invariant relationship between the VSR and the HSR holds. The human visual system may use these simple invariants, at least for large frontal surfaces. A frontal surface at different viewing distances creates different patterns of horizontal disparities according to the extent to which the surface deviates from the curved horizontal horopter (see Figure 20.48). Hering, Helmholtz, and Hillebrand had noticed the deviation of the apparent frontal plane from the actual frontal plane. This is the Hering-Hillebrand deviation discussed in Section 14.6.2. Helmholtz (1909, vol. 3, p. 318) reported that three parallel vertical threads in a frontal plane did not always appear to lie in a frontal plane. When the threads were close to the
observer, the central thread appeared closer than the outer threads, as if on a convex surface. When the threads were far away, the central thread appeared slightly farther away than the outer threads, as if on a concave surface. This is an example of the Hering-Hillebrand deviation. Other investigators, including Ames et al. (1932b), Ogle (1964), and Foley (1980) obtained similar results using vertical lines. However, the magnitude and direction of the deviations from frontality do not match those predicted by the horizontal disparities of frontal surfaces. On the basis of these disparities alone, we would expect near frontal surfaces to appear convex and far frontal surfaces flat. If we assume opposite gradients of lateral compression of points in the two retinas (see Section 14.5.2h), the predicted direction of the deviations from flatness is correct, but their magnitudes are too small (Tyler 1991a). This suggests that the pattern of horizontal disparities alone does not constitute a sufficient basis for explaining frontal-plane judgments of coplanar vertical rods. Helmholtz (1909) appreciated the fact that observers need to scale horizontal disparities with information about absolute distance to judge whether objects lie in a frontal plane. He attributed deviations from frontality to the poor information about distance provided by convergence. He was also aware that vertical-size disparities of eccentric objects might supply information about distance. The stimuli used by Ogle and others were untextured vertical rods seen through an aperture close to the observer. They therefore contained no vertical-size disparities. Helmholtz (1909, vol. 3, p. 320) used the stereograms shown in Figure 20.49 to demonstrate that vertical disparity can affect the perceived shape of surfaces. These stereograms are not ideal because they contain some perspective.
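The dependence of the horizontal disparities of a frontal surface on viewing distance can be sketched numerically. The Python example below computes the disparity of a point on a frontal plane relative to the Vieth-Müller circle through a fixation point straight ahead on the plane. It assumes point eyes 6.5 cm apart and a sign convention in which negative values denote points lying beyond the circle; both the convention and the function names are illustrative assumptions.

```python
import math

IOD_CM = 6.5  # assumed interocular distance (cm)


def binocular_subtense_deg(x_cm, z_cm, iod_cm=IOD_CM):
    """Angle (deg) subtended at a point by the two eyes.

    The eyes lie at x = -iod/2 and x = +iod/2 with z = 0; the point lies
    at lateral position x_cm and distance z_cm, at eye level.
    """
    left_azimuth = math.atan2(x_cm + iod_cm / 2, z_cm)
    right_azimuth = math.atan2(x_cm - iod_cm / 2, z_cm)
    return math.degrees(left_azimuth - right_azimuth)


def frontal_plane_disparity_deg(ecc_deg, distance_cm, iod_cm=IOD_CM):
    """Disparity (deg) of a frontal-plane point relative to the Vieth-Müller circle.

    All points on the circle subtend the same angle as the fixation point,
    so the disparity is the difference between the two subtenses.
    """
    x = distance_cm * math.tan(math.radians(ecc_deg))
    at_point = binocular_subtense_deg(x, distance_cm, iod_cm)
    at_fixation = binocular_subtense_deg(0.0, distance_cm, iod_cm)
    return at_point - at_fixation


for distance in (28.0, 57.0, 1e6):
    row = ", ".join(f"{frontal_plane_disparity_deg(e, distance):7.3f}"
                    for e in (0, 10, 20, 30))
    print(f"distance {distance:>9.0f} cm: disparity (deg) at 0, 10, 20, 30 deg = {row}")
```

The disparities grow with eccentricity, are larger for nearer planes, and vanish as distance becomes very large, so the same frontal plane produces different disparity patterns at different distances.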
Figure 20.48. Frontal planes and Vieth-Müller circles. At infinity, a frontal plane coincides with the Vieth-Müller circle: all points on the plane have zero disparity. At closer distances (57 and 28 cm), all eccentric points on a frontal plane are disparate with respect to the circle or (as illustrated) the Vieth-Müller circle can be thought of as disparate with respect to a frontal plane. The pattern of horizontal disparities varies with distance. Thus, horizontal disparities alone do not specify whether a surface is frontal. (Axes: disparity with respect to frontal plane (deg) against eccentricity (deg), for Vieth-Müller circles at 28 cm, 57 cm, and infinity.)

Figure 20.49. Helmholtz’s display of vertical disparity effects. The stereograms have vertical distortions, which create vertical disparities. With divergent fusion, the upper pattern (mimicking a surface at a large distance) is seen as convex in depth from left to right. The lower stereogram has the opposite vertical distortion (mimicking a near surface) and appears concave from left to right.
Helmholtz repeated his experiment with three vertical threads with gilt beads attached to them. He wrote, “Thereupon the illusion described (that the threads appear not frontal) disappeared almost entirely” (vol. 3, p. 322). In other words, when the threads contained both horizontal and vertical disparities, they appeared frontal at different distances. Ogle repeated Helmholtz’s experiment but obtained a negative result. However, he placed beads on only the central rod, which would not generate vertical disparities (Bishop 1989). This evidence has been either ignored or forgotten, as Bishop (1989) pointed out. Helmholtz also observed coplanar rods through diverging or converging prisms, and concluded that vergence also affects judgments of frontality. He noted that the effect of vergence was diminished when the surface was covered with figures or letters. In investigating the role of vertical disparities in judgments of frontality the stimuli should have the following properties:
1. They should contain features that generate vertical disparity (see Epstein 1952).
2. They should be large. Vertical disparities in small displays may be below the detection threshold (see Westheimer 1978, 1984).
3. They should not be masked by a fixed aperture that eliminates vertical disparity at the boundaries of the stimuli.
Rogers and Bradshaw (1995) used large, densely textured patterns, extending 75° horizontally and vertically. Rear-projected patterns were combined in a mirror stereoscope at a viewing distance of 57 cm, so that accommodation was constant. Distance information was provided by (1) vertical disparity alone, (2) vergence alone, or (3) both cues together. In each case, the patterns were synthesized to create images appropriate to each of a set of viewing distances between 28 cm and infinity. The results were clear. When both vertical disparities and vergence were adjusted to simulate viewing distance, frontality judgments were almost perfect at all distances (Figure 20.50A). There was no tendency to see near frontal surfaces as convex and far surfaces as concave. When vertical disparity was adjusted to simulate various viewing distances, and vergence was held at 57 cm, distance scaling was reduced to about 70% of complete constancy. When vergence distance was adjusted and vertical disparity was held constant at 57 cm, distance scaling was less than 30% of complete constancy. These results indicate that vertical disparity is much more effective than vergence in large displays. The experiment was repeated with displays ranging in size from 10° to 80° (Figure 20.50B). With a 10° display, changing vertical disparity had only a small effect, since vertical disparities in a display this size are very small. However, vergence had a large effect now that contradictory information from vertical disparities was removed. Vergence provided more than 70% of scaling for complete constancy.

Figure 20.50. Frontal surface scaling and display size. (A) When both cues specified distance, the pattern of horizontal disparities chosen was close to that created by a real surface at that distance (dashed line). Constancy was poorer when only one cue specified distance. Results for one subject. Display size 80°. (B) Frontal surface scaling as a percentage of that needed for perfect constancy at different distances, as a function of display size. When vertical disparities and vergence angle indicated the viewing distance, constancy was close to 100% for all display sizes. When vertical disparities alone specified viewing distance and vergence was constant, constancy was nearly 70% for the largest displays and decreased for smaller displays. Vergence manipulations had a large effect (∼90% scaling) on the shape of frontal surfaces only when the display was small (N = 3). Observers adjusted the pattern of horizontal disparities across the surface until it appeared flat. The viewing distance, expressed as a vergence angle, was specified by either vertical disparities, vergence angle, or both cues together. (Redrawn from Rogers and Bradshaw 1995) (Axes: in A, frontal setting expressed as vergence angle (arcmin) against distance expressed as vergence angle (arcmin); in B, effective scaling (%) against display size (deg).)
Rogers et al. (1993) compared sensitivity to horizontal disparities with sensitivity to vertical disparities. They used an 80°-diameter textured surface with vergence fixed at 57 cm (6.5°). In the first condition, vertical disparities were set for the viewing distance of 57 cm while horizontal disparities were varied. Observers reported whether the surface appeared convex or concave along the horizontal meridian. The best threshold was 6 arcmin (0.1°). This means that observers could reliably discriminate the curvature of a surface that had the pattern of horizontal disparities appropriate to a surface 6 arcmin (< 1 cm) in front of or beyond a surface at a distance of 57 cm. In the second condition, sensitivity to changes in vertical disparity was measured with horizontal disparities set to the 57-cm viewing distance. The best threshold was 8 arcmin (0.13°). This means that observers could discriminate the curvature of a surface that had vertical disparities appropriate to a surface 8 arcmin (1.2 cm) in front of or beyond a surface at a distance of 57 cm. The results were expressed in terms of the difference in equivalent vergence angle of the surface that can be reliably discriminated as concave or convex. This allowed sensitivities to horizontal and vertical disparities to be compared. For surfaces subtending 80°, sensitivity to gradients of vertical disparity was about 70% of sensitivity to gradients of horizontal disparity. For smaller surfaces, sensitivity to vertical disparity fell off more rapidly than sensitivity to horizontal disparity for judgments of surface curvature in a horizontal direction. The lower sensitivity to vertical than to horizontal disparities with large surfaces closely matches the 70% figure obtained when observers adjusted the pattern of horizontal disparities until a surface appeared flat (Figure 20.50). This is not coincidental. 
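The conversion between thresholds expressed as vergence angle and the equivalent distances in centimeters can be reproduced with a short Python sketch. It assumes symmetrical convergence on a point straight ahead and an interocular distance of 6.5 cm (an assumed value, chosen because it gives a vergence angle of about 6.5° at 57 cm). The function names are hypothetical.

```python
import math

IOD_CM = 6.5  # assumed interocular distance (cm)


def vergence_deg(distance_cm, iod_cm=IOD_CM):
    """Vergence angle (deg) for symmetrical fixation at distance_cm."""
    return math.degrees(2 * math.atan(iod_cm / (2 * distance_cm)))


def distance_from_vergence_cm(vergence_degrees, iod_cm=IOD_CM):
    """Fixation distance (cm) corresponding to a vergence angle (deg)."""
    return iod_cm / (2 * math.tan(math.radians(vergence_degrees) / 2))


def depth_offset_cm(distance_cm, delta_arcmin, iod_cm=IOD_CM):
    """Distance change (cm) toward the observer equivalent to a vergence increase."""
    nearer = distance_from_vergence_cm(
        vergence_deg(distance_cm, iod_cm) + delta_arcmin / 60.0, iod_cm)
    return distance_cm - nearer


print(f"vergence at 57 cm: {vergence_deg(57.0):.2f} deg")
for threshold_arcmin in (6.0, 8.0):
    offset = depth_offset_cm(57.0, threshold_arcmin)
    print(f"{threshold_arcmin:.0f} arcmin at 57 cm corresponds to about {offset:.2f} cm")
```

With these assumptions, a 6-arcmin difference in equivalent vergence at 57 cm corresponds to a little under 1 cm and an 8-arcmin difference to a little over 1 cm, in line with the values quoted in the text.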
Discrimination data reveal the trading function between vertical and horizontal disparities when both cues are close to the actual viewing distance of 57 cm. The adjustment data reveal the effects of setting one cue (vertical disparities) to simulate a different viewing distance (28 cm or ∞) on the horizontal disparities needed to make the surface appear flat. For displays subtending 10°, changing vertical disparity had no effect on perceived relative depth (Figure 20.43) but changing vertical disparity did affect the perceived curvature of frontal surfaces, for this size of display (Figure 20.50B). This suggests that, while vertical disparities are used in both depth scaling and frontal-plane judgments, different processes and mechanisms are involved. Rogers and Bradshaw’s results with 10° displays are compatible with those obtained by Westheimer and Pettet (1992). Westheimer and Pettet used a stereoscopic display of four dots at the corners of a square subtending 7° × 7°. Observers adjusted the horizontal disparity of a dot at the center of the square until it appeared in the same depth plane as the corner dots. Vertical disparities in the four dots were appropriate to either (1) a very close surface or
(2) a physically impossible surface lying beyond infinity. They found that the horizontal disparity that had to be added to the center dot for all dots to appear in a frontal plane was only 25% of the vertical disparity of the surrounding dots. As can be seen in Figure 19.3A, a real frontal surface creates vertical disparities with the same value as the horizontal disparities along the major (±45°) diagonals of the surface. Westheimer and Pettet concluded that vertical disparities are weighted less than horizontal disparities in the detection of surface slant. However, the vertical disparities in their display corresponded to a surface at a viewing distance of only 6.5 cm with a convergence angle of 50°, or to an impossible surface beyond infinity with a divergence angle of –50°! Therefore, the 25% effectiveness of the vertical disparities may have been due to the physical impossibility of their stimuli. The small size of their displays (7° × 7°) also contributed to the low weighting of vertical disparity. Rogers and Bradshaw (1995) found a similar low weighting of vertical disparity with a 10° display (Figure 20.43). A comparison of relative-depth scaling and scaling for the perception of frontal surfaces reveals two similarities and one difference: 1. Both types of depth scaling are maximal when both vertical disparities and vergence angle specify viewing distance. 2. For both types of scaling, vertical disparities become more effective with increasing display size while vergence becomes more effective with smaller displays. 3. In the presence of both vergence and vertical-disparity, scaling of frontal-plane judgments was close to 100% of complete constancy. Scaling of relative-depth, on the other hand, was never greater than 40%. Why should this be so? Other cues, such as accommodation, provided contradictory information about viewing distance for both types of scaling. 
One possibility is that, while relative-depth scaling requires an explicit estimation of viewing distance, the frontal-plane task could rely on a direct computation from the pattern of disparities, which does not involve the explicit estimate of distance.
20.6.5b Invariant Properties of Curved Surfaces The second spatial derivative of disparity over a local smoothly curved patch is the disparity curvature. It remains invariant over changes in viewing distance. However, this invariant property does not tell us much about the shape of a surface for the following reasons. 1. Effect of eccentricity The disparity-defined curvature of a surface varies with its eccentricity with respect to the head.
BINOCULAR DISPARITY AND DEPTH PERCEPTION
2. Effect of surface orientation The disparity curvature of a surface depends on both its radius of curvature and its orientation with respect to the line of sight. The orientation of a surface patch may be specified by the point on the surface with its tangent orthogonal to the line of sight. A sphere is the only surface for which disparity curvature does not vary with orientation. But a sphere does not have constant disparity curvature over its surface. A surface with constant local disparity curvature has a parabolic profile, since the second derivative of a parabola is a constant. But the disparity curvature of a parabolic surface is constant only when the surface is symmetrical about a line of sight. Thus detection of the disparity-defined curvature of a surface requires registration of disparity curvature and of surface orientation.
3. Effect of surface size Only local disparity curvature is invariant over changes in distance along a line of sight. Consider a sinusoidally corrugated surface. The disparity curvature along any line of sight is the same at different distances. However, the disparity curvatures of points over the surface change with distance. This is because the images of all points, except one, on a receding surface change their eccentricity. Hence, while the disparity curvature of a local patch is invariant as the patch moves along a line of sight, disparity curvatures over an extended surface change as the surface is moved in depth. In general, the term “surface shape” is used to describe the changes of curvature over a surface rather than the local curvature at a particular point. The third spatial derivatives of disparity (change of curvature over space) increase directly with viewing distance and must therefore be scaled to achieve shape constancy. Consequently, if estimates of viewing distance are inaccurate, judgments of surface shape should be adversely affected. This is a consequence of the geometry of binocular stereopsis and is true whether shape estimates are based on disparities, disparity gradients, or disparity curvatures.
20.6.5c Constancy of Perceived Depth Curvature Since disparity curvature over a large surface changes with viewing distance, detection of the shape of a large surface requires an estimate of distance. This is easy to see for a half cylinder. With each doubling of viewing distance, the image of the front of the cylinder is approximately halved, whereas the base-to-peak disparity of the cylinder reduces to approximately one-quarter. When distance is underestimated, surface curvature defined by disparity should be underestimated. When distance is overestimated, surface curvature should be overestimated. Johnston (1991) presented subjects with a horizontal convex cylindrical surface defined solely by disparity in a random-dot stereogram. The width/depth ratio of the cylinder was varied and subjects decided whether it was flatter or more peaked in depth than a circular half cylinder. This was done at viewing distances of 53, 107, and 214 cm. Vergence was the only cue to absolute distance. The cylinder that was perceived to be semicircular at a viewing distance of 53 cm actually had a depth that was 67% of its radius. The cylinder perceived to be semicircular at a distance of 214 cm actually had a depth that was 175% of the radius (Figure 20.51). There was only one distance (between 75 cm and 1 m) at which performance was veridical. Johnston attributed this poor performance to use of an incorrect estimate of viewing distance coupled with incorrect scaling of size for distance. Vreven and Welch (2001) obtained similar results using surfaces created by disparities between illusory contours or contrast-defined contours. Similar distortions of curvature as a function of distance occur in the horizontal horopter (Section 14.6.2). On the face of it, these are not the results one would expect of judgments based on the second derivative of disparity. Rather, the results suggest that judgments were based on depth differences in the display with an incorrect scaling for distance: overestimation of short distances and underestimation of longer distances. This pattern of errors has often been reported (Gogel 1977; Foley 1980; Tittle et al. 1995), but never explained. When Glennerster et al. (1996) asked subjects to set the depth of a cylinder to equal its radius, depth constancy was about 75% (Figure 20.52). This is higher than that reported by Johnston (< 30%). The following factors may have contributed to the poor depth constancy obtained by Johnston.
1. Disparity curvature is invariant over distance only for local surface curvature. The overall shape of a large surface is specified by changes of disparity curvature over space (the third spatial derivative of disparity). This derivative does not remain invariant with changes in distance, so that perceived shape will vary if the distance of the surface is judged incorrectly.
2. One could argue that the task used by Johnston was not that of detecting surface shape, because observers matched only the perceived depth of the cylinder until it appeared to be equal to its radius.
3. A circular cylinder provides a standard for judging the depth to height ratio. However, it is an unsatisfactory stimulus because the disparity gradient becomes infinitely large toward the edges of the surface. Very steep disparity gradients are difficult or impossible to detect. Sinusoidal or parabolic depth corrugations are better stimuli.
4. There may have been conflicting depth cues in Johnston’s stimulus. The screen was at different physical distances from the observer, so that both vergence and accommodation were appropriate. However, vertical disparities were not varied and, because of the diverging optical pathways to the screen, were probably appropriate to a distance beyond infinity. This may not have been important, because vertical disparities are not effective for displays smaller than 10°.
5. Johnston’s stimuli were viewed in darkness, while Glennerster et al. presented stimuli on a textured surface, which provided extra cues to distance. However, Glennerster et al. (1998) found that only naïve subjects showed improved depth constancy (from 46% to 62%) for a disparity-defined circular cylinder when surrounding objects were in view. Experienced subjects performed just as well when the room and monitor frames were obscured.
Glennerster et al. (1998) used the method of constant stimuli. Depth constancy over a distance range of 38 to 228 cm was better when the range of disparities centered around the correct value at each viewing distance than when the same range was used at all distances. This suggests that subjects assumed that the disparity in the middle of the stimulus range was the correct setting at each distance.
Figure 20.51. Apparent semicircular cylinder results. Observers made forced-choice judgments about the shape of horizontal cylindrical surfaces with different depth:height ratios as a function of viewing distance. Perfect constancy is indicated by the dashed lines. At the close viewing distance, a cylinder with less depth than the radius was judged semicircular; at the far viewing distance, a cylinder which had more depth than the radius was judged semicircular. (Adapted from Johnston 1991)
Figure 20.52. Depth-width judgment of a cylinder. (A) Observers adjusted peak-to-trough depth of a horizontal elliptical cylinder until it appeared equal to its radius, for cylinders at distances between 38 and 228 cm. The increase in chosen depth with increasing distance indicates a departure from perfect constancy (horizontal dashed line). (B) Same data replotted in terms of scaling distance, that is, the distance the cylinder would have to be for that disparity-to-height ratio to be truly semicircular. The dashed line indicates perfect constancy. The extent of constancy was estimated from the slope of the best-fitting straight line through the data points. Overall constancy was better than 70% for cylinders at distances between 57 and 228 cm. Results plotted for three observers. (Adapted from Glennerster et al. 1996)
20.6.5d Comparative Judgments of Curvature The task of equating the depth to width of a 3-D surface requires information about viewing distance. However, the task of equating the depths of two 3-D surfaces at different distances requires only information about the relative distances of the surfaces. Glennerster et al. (1996) addressed this issue using random-dot stereograms depicting surfaces with horizontally oriented depth ridges with sine, square, triangle, or cylindrical profiles. Two pairs of 12-inch computer monitors were viewed in a mirror stereoscope. The monitors stood on a textured horizontal surface that provided cues to absolute depth. One pair of monitors was at a distance of 57 cm and the other pair was at various distances between 38 and 228 cm. Subjects set the peak-to-trough depth of one surface to match that of the other surface. Depth constancy was close to 100% for all depth profiles (Figure 20.53). Glennerster et al. concluded that when subjects compare depth intervals at different distances they estimate the relative distances of the two stimuli rather than their absolute distances. In judging relative distances they could use the visual angles subtended by objects of the same known physical size that happen to be present, such as the display monitors. The act of changing vergence between the stimuli in the matching task did not affect the degree of depth constancy. These results show that depth constancy can be close to perfect when observers judge relative depth. In this experiment only two experienced subjects were used. Scarfe and Hibbard (2006) asked subjects to match the depth of a horizontal stereoscopic cylinder at 40 cm to that of a cylinder at 70 cm. All but two of seven subjects set more depth in the far cylinder to match that in the near cylinder. But the stimuli were all presented on the same monitor at one distance. There were thus fewer cues to viewing distance than in the experiment by Glennerster et al. Scarfe and Hibbard also asked subjects to report whether an approaching or receding horizontal stereoscopic cylinder was expanding or contracting. On average, an approaching cylinder needed to contract in depth by about 25% to be perceived as constant in 3-D shape. A receding object needed to expand in depth by the same percentage to appear constant. But there were wide individual differences.
Figure 20.53. Matching depths at different distances. Observers adjusted the peak-to-trough depth of sine-wave corrugations at 57 cm to match the perceived depth of reference corrugations at distances between 38 and 228 cm. The horizontal dashed line indicates perfect constancy. Results for two observers. (Adapted from Glennerster et al. 1996)
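The notion of scaling distance used in Section 20.6.5c can be illustrated with a short calculation. The sketch below is not taken from Johnston or Glennerster et al.; it assumes the small-angle approximations that disparity is proportional to depth divided by distance squared and that angular size is proportional to radius divided by distance. Under those assumptions the distance at which a chosen stimulus would be a true half cylinder is the viewing distance divided by the depth-to-radius ratio. The function names and the use of a least-squares slope on linear axes as an index of constancy are illustrative assumptions.

```python
def scaling_distance(viewing_distance_cm, depth_to_radius):
    """Distance at which the chosen stimulus would be a true half cylinder.

    Small-angle assumption: disparity ~ I * depth / D**2 and angular
    size ~ radius / D. An observer who scales with distance D_used then
    sees depth/radius = (depth/radius) * (D_used / D). Setting that to 1
    gives D_used = D / (depth/radius).
    """
    return viewing_distance_cm / depth_to_radius


def constancy_slope(pairs):
    """Least-squares slope of scaling distance against viewing distance.

    A slope of 1 corresponds to perfect constancy, a slope of 0 to none.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


# (viewing distance in cm, depth/radius of the cylinder judged semicircular),
# as quoted in the text for Johnston (1991).
johnston_settings = [(53.0, 0.67), (214.0, 1.75)]
pairs = [(d, scaling_distance(d, ratio)) for d, ratio in johnston_settings]
slope = constancy_slope(pairs)

for viewing, scaled in pairs:
    print(f"viewed at {viewing:.0f} cm -> scaling distance {scaled:.0f} cm")
print(f"constancy slope = {slope:.2f}")
```

With the two settings quoted in the text, the scaling distances come out at roughly 79 cm and 122 cm, which is the pattern of overestimating short distances and underestimating long ones, and the slope is below 0.3. This is only a two-point illustration, not a reanalysis of the original data.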
•
20.6.5e Disparity Correction and Normalization The difference between the accuracy of frontal plane judgments and the poor constancy of shape and depth judgments led Gårding et al. (1995) to propose that disparities are processed in two stages. The first is disparity correction, which uses the horizontal and vertical components of the disparity field to compute scaled relative nearness. This does not give the precise metric structure of objects and their layout but, rather, a description of surface shape up to a scaling factor of distance, or relief transformation (Koenderink and van Doorn 1991). They proposed that the disparity correction process, like the calculation of deformation, uses vertical disparities pooled over a region smaller than the entire visual field. In these respects, scaled relative nearness is similar to deformation disparity, which provides information about the slant and inclination of surface patches when scaled by distance (Koenderink and van Doorn 1976) (Section 19.3.3). However, the two measures are not identical. Deformation provides information about surface shape with respect to the cyclopean direction. Scaled relative nearness (like horizontal disparities) provides information about shape with respect to isodisparity circles. Consequently, scaled relative nearness does not specify the actual shape of the surface since the curvatures of isodisparity circles vary with the viewing distance, as shown in Figure 19.4B. The second stage of disparity processing proposed by Gårding et al. is disparity normalization, in which the complete metric structure of the scene is determined. They suggested that many judgments, including that of flatness of a frontal plane, can be made on the basis of scaled relative nearness without the second stage. Disparity normalization would indicate the amount of depth in surfaces. To support their model, Gårding et al. cited empirical evidence that adjustments of vertical disparity to simulate surfaces at different distances affect the perceived curvature of vertical cylinders but not of horizontal cylinders. However, a similar result is also predicted by the viewing-system parameter model of Mayhew and Longuet-Higgins and by the deformation model. All three models predict that changes in vertical size disparity will affect the perceived slant (Ogle’s induced effect) but not the perceived inclination of surfaces. Thus, all three models predict that introducing vertical size disparities to match disparities produced by surfaces at different distances will affect perceived curvature in a horizontal but not in a vertical direction. However, Frisby et al. (1999) argued that the Mayhew and Longuet-Higgins theory predicts that, for small displays, an added vertical disparity will affect horizontal and
vertical curvature in the same way. They used a random-dot stereogram subtending 11.4°, which represented a horizontal or vertical half-cylinder or parabolic surface. Vertical disparities were added to simulate different viewing distances. Subjects adjusted the depth of the cylinder until it appeared circular in profile, or judged the depth of the parabolic ridge. Judgments of the vertical cylindrical ridge were affected at simulated distances under 25 cm, and judgments of the vertical parabolic ridge were affected at simulated distances of less than 12.5 cm. But changes in simulated viewing distances had only a small effect on judgments of horizontal ridges.
20.6.5f Effects of Stimulus Eccentricity The vertical size ratio (VSR) in a binocular image increases with horizontal eccentricity and decreases with viewing distance. Figure 20.54a shows the functions for frontal surfaces at three viewing distances. If the VSR of a local stimulus were used to judge distance, the eccentricity of the stimulus would have to be registered. Figure 20.54b shows the horizontal gradient of the VSR as a function of horizontal eccentricity and distance. The gradient is fairly constant over an extended frontal surface at a given distance. If distance estimates were based on this gradient it would not be necessary to register the eccentricity of the stimulus. Figures 20.54c and d show how the VSR and the horizontal gradient of the VSR vary with eccentricity within each of three isovergence loci. An isovergence locus is, approximately, a circle through the centers of rotation of the eyes. The gradient of the VSR is not constant over an isovergence locus but would provide a reasonable estimate of distance in an area within 15° of the midline. Brenner et al. (2001) investigated this issue by asking whether the depth scaling of a test object, as reflected in its apparent size, varies as a function of its headcentric eccentricity. The test object was a stereoscopic textured ellipsoid. Subjects adjusted its disparity and size until it appeared the same size and shape as a tennis ball. The test object was surrounded by a 33°-diameter random-dot annulus with zero horizontal disparity, as shown in Figure 20.55. The size settings of the test object were more accurate when the annulus contained an appropriate gradient of vertical disparities than when the positions of the dots in the annulus were uncorrelated in the two eyes. However, settings were no more accurate when the eyes turned to view the display at a headcentric eccentricity of 30° than when it was straight ahead. Brenner et al. concluded from these two results that subjects were using the gradient of VSRs rather than the absolute value of VSRs to scale distance.
Figure 20.54. The horizontal gradient of the VSR. (a) The vertical size ratio over a frontal surface increases with horizontal eccentricity. The increase is more rapid at nearer distances. (b) The horizontal gradient of the VSR over a frontal surface decreases with increasing distance but is fairly constant across any frontal plane. (c) The vertical size ratio over an isovergence locus increases with horizontal eccentricity and decreases with increasing distance. (d) The horizontal gradient of the VSR over an isovergence locus is reasonably constant up to an eccentricity of 15°. Curves are for viewing distances of 35, 50, and 65 cm (vergence angles of 10.6°, 7.4°, and 5.7°), plotted against horizontal eccentricity (deg). (Adapted from Brenner et al. 2001)
Figure 20.55. Stimuli used by Brenner et al. (2001). The upper images are for convergent fusion. The lower images are for divergent fusion. The actual stimulus subtended 33°. Subjects adjusted the size and disparity of the central stimulus until it appeared to have the size and shape of a tennis ball. (Reprinted with permission from Elsevier)
Summary The question addressed in this section was how well people judge the disparity-specified 3-D structure of an object at various viewing distances. Ideally, in an experiment, the 3-D structure of the test object should be specified only by disparity, but the distance of the object should be specified by a full range of depth cues. This has not usually been done. In many experiments, cues to absolute distance were impoverished. In any case, in a stereoscopic display, other depth cues, such as accommodation, vergence, and perspective, although unchanging, may influence depth judgments. The only way to avoid all cue conflicts is to use real stimuli actually moving in depth in the manner described in Section 24.1.8. Many different tasks have been used, including judging simple depth intervals, setting dihedral angles between slanted surfaces, setting depth-to-width ratios, and setting the depth in one object to equal that in another. Estimates of depth constancy have varied between tasks, which suggests that we do not have a consistent representation of visual depth but rather a set of strategies that we use in different tasks. Also, most experiments have found wide individual differences even for a given stimulus and task. None of the failures in depth constancy revealed in the laboratory seem to be evident in the real world. In a laboratory, subjects are well aware that objects are not constant. This is obvious to them because they are asked to adjust the 3-D structure of test objects. In the real world we can usually safely assume that objects remain the same at different distances. Also, we can usually see several similar objects, such as people, houses, or cars, at different distances at the same time. With such reliable assumptions and rich displays we do not need refined mechanisms for distance scaling of disparity.
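As a supplement to Section 20.6.5f, the dependence of the VSR on eccentricity and distance can be sketched numerically. The model below is a simplifying assumption, not the computation used by Brenner et al.: a short vertical element on a frontal surface at eye level subtends a vertical angle inversely proportional to its distance from each eye, so the VSR is taken as the ratio of the two eye-to-point distances. The interocular distance of 6.5 cm, the function names, and the finite-difference gradient are likewise illustrative choices.

```python
import math


def vertical_size_ratio(distance_cm, eccentricity_deg, interocular_cm=6.5):
    """Ratio of vertical angular sizes in the two eyes (nearer eye / farther eye).

    Assumes a short vertical element on a frontal surface at eye level, so
    that its angular height in each eye is inversely proportional to its
    distance from that eye. Eccentricity is measured from the point midway
    between the eyes, positive to the right.
    """
    x = distance_cm * math.tan(math.radians(eccentricity_deg))
    half = interocular_cm / 2.0
    d_left = math.hypot(x + half, distance_cm)   # distance from the left eye
    d_right = math.hypot(x - half, distance_cm)  # distance from the right eye
    return d_left / d_right


def vsr_gradient(distance_cm, eccentricity_deg, step_deg=0.5, interocular_cm=6.5):
    """Central-difference estimate of the horizontal VSR gradient (per degree)."""
    hi = vertical_size_ratio(distance_cm, eccentricity_deg + step_deg, interocular_cm)
    lo = vertical_size_ratio(distance_cm, eccentricity_deg - step_deg, interocular_cm)
    return (hi - lo) / (2.0 * step_deg)


for distance in (35.0, 50.0, 65.0):
    ratios = [vertical_size_ratio(distance, ecc) for ecc in (0, 15, 30, 45)]
    central = vsr_gradient(distance, 0.0)
    print(distance, [round(r, 3) for r in ratios], round(central * 1000, 2))
```

In this model the VSR is 1 straight ahead, grows with eccentricity, and is smaller at larger distances, and the gradient near the midline is close to the interocular distance divided by the viewing distance (in radians). The model is not guaranteed to reproduce the curves of Figure 20.54 at large eccentricities, which may rest on a different definition of the gradient.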
21 DEPTH CONTRAST
21.1 Types of depth contrast
21.2 Short-range effects
21.3 Depth contrast with points and lines
21.3.1 Basic findings
21.3.2 Frames of reference and norms
21.3.3 Cyclovergence and inclination contrast
21.3.4 Temporal properties
21.3.5 Disparity masking
21.4 Depth contrast between surfaces
21.4.1 Contrast between constant disparity areas
21.4.2 Contrast between sloping surfaces
21.4.3 Depth contrast and cue conflict
21.5 Disparity contrast mechanisms
21.5.1 Disparity receptive fields
21.5.2 Modeling the empirical evidence
21.5.3 An analogy between depth and brightness
21.6 Successive depth contrast
21.6.1 Depth aftereffects
21.6.2 Disparity-specific aftereffects
21.6.3 Mechanisms of depth aftereffects
21.6.4 Phase-independent depth adaptation
21.7 Depth contrast and deformation disparities
21.7.1 Depth contrast and size disparities
21.7.2 Depth contrast and shear disparities
21.1 TYPES OF DEPTH CONTRAST
Simultaneous contrast is an apparent change in a feature of a test stimulus produced by an induction stimulus with a distinct value of the same feature presented at the same time. Contrast effects occur for most, if not all, sensory features including spatial frequency (MacKay 1973), position, curvature (Gibson 1933), tilt (Gibson 1937), color, and motion (Ptolemy 2nd century; Duncker 1929). Successive contrast is an apparent change in a feature of a test stimulus produced by prior exposure to an induction stimulus with a distinct value of the same feature. Examples are the motion aftereffect (Wohlgemuth 1911), color aftereffects (Hering 1861), the spatial frequency aftereffect (Blakemore and Sutton 1969), figural aftereffects (Köhler and Wallach 1944), the curvature aftereffect (Gibson 1933), and the tilt (orientation) aftereffect (Gibson and Radner 1937). Contrast effects occur locally or globally. In local contrast, the induction and test stimuli are either in the same location or near each other, and the effect weakens rapidly as interstimulus distance is increased. Distinct local contrast effects may be produced in different regions of the visual field. Local contrast effects occur at the level of local feature-detectors within the particular sensory system. For example, a vertical line superimposed on a line tilted a few degrees appears tilted up to about 2° in the opposite direction. This is local tilt contrast due to interactions between visual orientation detectors.
In global contrast, a test stimulus is affected by an induction stimulus, which may be some distance from it. Typically, only one global contrast effect can be produced at one time because the effect depends on integrating information from a large area. Global contrast effects are usually much larger than local effects. For example, a vertical line appears tilted 20° or more when seen in the context of a large tilted scene, such as a tilted furnished room (Howard and Hu 2001). Global contrast effects are due to the induction stimulus acting as a headcentric or allocentric frame of reference against which the test stimulus is judged. Induced motion can occur at the level of local motion detectors, within a global headcentric frame of reference, or within a global external frame of reference (Section 22.7). Local contrast effects are interesting because they reveal characteristics of low-level sensory coding (Sutherland 1961; Mollon 1974; Anstis 1975; Frisby 1979). They are manifestations of the visual system’s strategy of coding the temporal and spatial changes of sensory stimuli in preference to steady values (Bekesy 1967; Over 1971; Anstis 1975). Successive and simultaneous contrast effects may interact. For example, Anstis and Reinhardt-Rutland (1976) found that an object showing induced movement can generate a motion aftereffect, and that an object manifesting a motion aftereffect may induce illusory motion in a neighboring object. Analogous interactions have been reported between color aftereffects and color contrast (Anstis et al. 1978).
Prior exposure to a stimulus with depth created by one depth cue can influence the depth created by another depth cue. Contrast effects may be used to investigate interactions between depth cues. Global contrast effects reveal the nature of high-level perceptual processes by which the outputs of local mechanisms are assessed in terms of broad frames of reference or in terms of activity in other sense organs. The interpretation of contrast effects is fraught with difficulties. Effects that are superficially alike may arise from the following very different causes.
1. Enhancement of differences between local stimuli In its simplest form contrast arises from local inhibitory interactions between neighboring cells that code the same feature. This enhances the perceived difference between neighboring stimuli. For example, a grey patch appears whiter when on a black surround than when on a white surround.
2. Rescaling a feature with reference to a norm For example, inspection of a line tilted to the vertical causes the line to appear more vertical than it is and a vertical line to appear tilted in the opposite direction. There is no change in the perceived tilt of one line with respect to another line.
3. Influence of context on perceived values of a stimulus For example, a dark patch on a surface will appear darker when it is interpreted as a mark on the surface rather than a shadow (Section 22.4).
4. Effects of cue conflict We will see in Section 21.4.3 that depth contrast between surfaces that differ in disparity-defined slant may be due to conflicting cues such as perspective.
5. Pooling of sensory signals over different areas Effects that superficially resemble depth contrast may be due to the
Stimulus 1 4.0 log trolands and intensity was suddenly reduced by 1 or 2 log units, visual latency of that eye became longer, as might be expected. However, as the eye became dark adapted down to the new level, the latency became significantly shorter over a period of about 60 s (the hump on the lower right of the positive diagonal in Figure 23.17). Rogers and Anstis offered no explanation for these paradoxical effects that occurred only at very high luminance levels when most of the pigment in the receptors would have been bleached.
Figure 23.18. Visual latencies as a function of time following prior dark or light adaptation. Time course of visual latency following prior adaptation to either (a) a dark field of 54 trolands or (b) a light field of 5400 trolands. Prior dark adaptation had little effect on latencies, whereas prior light adaptation resulted in shorter latencies for the first 30 s. Axes: visual latency (ms) against time after adaptation (minutes), with test field luminance (trolands) as the parameter. (Redrawn from Rogers and Anstis 1972)
In conclusion, there is evidence that prior light adaptation of one eye produces a Pulfrich effect when the target has the same luminance in the two eyes. It remains unclear why the effect of adaptation on visual latency was more rapid in the Rogers and Anstis study (< 60s) than in Standing et al.’s study (15–20 min). The contrast of the moving target relative to the background may have been a factor. Prestrude and Baker (1971) observed changes in visual latency over the first 100 s after adapting to an intense bleaching source in one eye. The moving target was a dark line on a light background, as in the Rogers and Anstis study. On the other hand, Prestrude and Baker found longer lasting changes (over 5 min) when the moving target was a light line on a dark background, as in the Standing et al. study. The virtual absence of an effect of prior dark adaptation found in most studies is consistent with light adaptation being faster than dark adaptation. But it has not been possible to use the Pulfrich effect to measure changes in latency within the first few hundred milliseconds of adaptation to a different intensity level. Hence, all that can be concluded is that changes in latency occurring after the first second are generally much smaller than those found under steady-state conditions.
23.4.2b Long-Term Adaptation Effects There have been several reports of long-term adaptation to differences in visual latency created by differential illumination of the eyes. Douthwaite and Morrison (1975) found that the Pulfrich effect decreased by 25–50% over a 5-day period of wearing a 0.7-log-unit tinted lens (80% attenuation) over one eye. Using two observers, Wolpert et al. (1993) showed that the computed interocular delay decreased approximately linearly over a 9-day period of wearing a 0.6-log-unit filter (75% attenuation) to less than half that on the first day. The rate of adaptation for the particular conditions used by Wolpert et al. was around 1 ms per day. Upon removal of the filter after 9 days, so that target illumination in the two eyes was equal, there was a small Pulfrich effect in the opposite direction, equivalent to a 4- to 5-ms reduction in the latency of the previously filtered eye. This effect decreased to less than half over the first 24 hours and to zero over the following 3 days. Douthwaite and Morrison, on the other hand, reported that the reversed Pulfrich effect disappeared within a few minutes. Heard and Papakostopoulos (1993) also reported a slight decline in the magnitude of the Pulfrich effect during the wearing of a 1.8-log-unit filter over one eye for 7 days and a subsequent aftereffect on removal of the filter, which lasted for several further days. The adaptive change in the Pulfrich effect is not accompanied by any change in the temporal resolution of either eye, as measured by the frequency at which a flickering light fuses (Douthwaite and Morrison 1975).
The fact that the course of the initial long-term adaptation effect and the subsequent recovery both have time constants of several days suggests that these effects have a different origin from those responsible for the changes of sensitivity found during dark and light adaptation. Wolpert et al. suggested that the most likely site of the long-term adaptation effect is the retina rather than the cortex, and that it does not result from changes in pupil diameter.
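The filter densities quoted in Section 23.4.2b are given in log units together with the corresponding attenuation. The conversion can be checked with a minimal sketch, assuming that a filter of density d log units transmits a fraction 10^(-d) of the incident light. The function name is illustrative.

```python
def attenuation_from_log_units(density):
    """Fraction of light removed by a filter of the given density in log10 units."""
    transmittance = 10.0 ** (-density)
    return 1.0 - transmittance


for density in (0.6, 0.7, 1.8):
    percent = 100.0 * attenuation_from_log_units(density)
    print(f"{density:.1f} log units -> {percent:.1f}% attenuation")
```

This gives about 75% for 0.6 log units and about 80% for 0.7 log units, in agreement with the values in the text, and over 98% for the 1.8-log-unit filter used by Heard and Papakostopoulos.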
23.4.3 ROLE OF CONTRAST
Dodwell et al. (1968) measured the effect of target contrast on the Pulfrich effect with minimal differential adaptation of the two eyes. Four reference rectangles of the same luminance in the two eyes surrounded the differentially filtered moving targets. The targets consisted of a series of bright bars moving across a very dark background. Although a Pulfrich effect was seen, the experiment does not provide clear evidence for the role of target contrast. Neither does it rule out the possibility that target luminance or adaptation level was responsible, since the adaptive state of the retinal receptors may have been differentially affected by the differently illuminated bars. A better test for the role of contrast in the Pulfrich effect would be to use dark target bars of unequal luminance, which move over a light background of the same luminance in the two eyes. If the highest luminance in the binocular displays is crucial, there should be no effect. If space-average luminance is crucial, there should be a Pulfrich effect with the eye seeing the darker bars subject to a longer latency. If contrast is crucial, there should be a Pulfrich effect with the eye seeing the darker target bars (with higher contrast) subject to a shorter latency. Evidence produced by Prestrude and Baker (1971) suggests that visual latency does not depend only on target contrast. They measured the phase difference between dichoptic target disks rotating about the same center, each with a superimposed radial line. In one condition, white radii (3.8 log trolands) were presented on a dark background (2.3 log trolands). In a second condition, dark radii were presented on a white background. The luminances of the dark and white areas and target contrasts were the same in the two conditions. With a filter covering one eye, a greater latency difference was found for the white targets on a dark background (which had the lower space-average luminance). 
Also, the latency difference was the same when the background in the dark target/light background configuration was reduced to the same retinal luminance as the dark background in the light target/dark background configuration. These results suggest that the crucial variable is neither peak luminance nor the direction or magnitude of target contrast. The single variable that appears to account for all these results is the space-average luminance level.
THE PULFRICH EFFECT
Note that these results were obtained using a nonstereoscopic alignment task and the experiment needs to be repeated in a Pulfrich situation.
23.5 EYE MOVEMENTS AND THE PULFRICH EFFECT
The Pulfrich effect can be seen if the eyes fixate a stationary point close to the path of the swinging pendulum bob. However, it can also be seen when the observer tracks the pendulum bob, provided there is some stationary object in the field of view (Kahn 1931; Kirkwood et al. 1969). When the observer tracks the pendulum bob, both images of the bob remain close to the foveas and hence the latency difference cannot be translated into a spatial disparity (Figure 23.19B). Thus, when the bob is tracked, the Pulfrich effect should not be seen. However, the eyes may track the apparent elliptical path in depth predicted by the visual-latency spatial-disparity hypothesis, rather than the physical path of the bob (Figure 23.19C). In this case, there would be a continuous change in depth signaled by changes in vergence, but it would be accompanied by a corresponding change in binocular disparity between the images of the pendulum bob. No Pulfrich effect should occur in this situation, either. Even if tracking were poor, the speed of the target over the retinas would be considerably less than that under fixation conditions and only a small Pulfrich effect should be observed. However, Kirkwood et al. (1969) and others have reported that the effect is typically similar whether subjects fixate a fixed point or track the moving bob. Rogers et al. (1974) suggested that the Pulfrich effect seen when the observer tracks the target is due to the effects of the filter on stationary objects in the field of view. As the eyes track the moving bob, the images of any stationary object move over the retinas, and the increased latency of the filtered eye creates a disparity in those images with respect to the moving bob. To test this hypothesis, observers were instructed to track a moving target of unequal luminance in the two eyes, but with the background maintained at a constant luminance. No Pulfrich effect was seen.
In the converse situation in which the moving target was of equal luminance in the two eyes and the background was differentially filtered, a Pulfrich effect was seen, but only while the eyes tracked the target; it disappeared as soon as the stationary background was fixated. These findings provide clear evidence that the Pulfrich effect seen when observers track the pendulum bob results from differential filtering of the stationary background rather than of the moving target. Wallach and Goldberg (1977) obtained an apparently contradictory result. They claimed that a Pulfrich effect was seen when observers tracked an oscillating target viewed against a differentially illuminated but featureless background. The estimated depth of the elliptical path was significantly smaller with a featureless than with a featured background. In addition, four of their 24 observers “had trouble getting a clear effect” in the latter situation. The authors’ prior expectations may have influenced the results since, when the observer’s description of the path was faulty, the correct motion path was suggested to the subject and the exposure was repeated (Wallach and Goldberg 1977). The differentially illuminated edges of the aperture surrounding the moving target may also have provided a stereoscopic reference frame.
Figure 23.19. Fixating and tracking the Pulfrich pendulum. (A) The latency hypothesis predicts an elliptical path when the subject fixates a stationary point. (B) No illusion is predicted with perfect tracking of the bob. (C) No illusion is predicted when the eyes track the bob in depth, because corresponding changes in disparity accompany changes in vergence. (Redrawn from Rogers et al. 1974)
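The argument that tracking should leave at most a small effect follows from the visual-latency account, in which the equivalent disparity is the speed of the image over the retina multiplied by the interocular latency difference. The following Python sketch illustrates that arithmetic only. The function name, the two retinal speeds, and the 10 ms latency difference are illustrative assumptions, not values reported by the studies cited above.

```python
def equivalent_disparity_arcmin(retinal_speed_deg_per_s, latency_difference_ms):
    """Equivalent spatial disparity under the latency hypothesis.

    disparity = retinal image speed x interocular latency difference.
    Speed is in degrees per second, delay in milliseconds, result in arcmin.
    """
    arcmin_per_second = retinal_speed_deg_per_s * 60.0
    return arcmin_per_second * (latency_difference_ms / 1000.0)


# Hypothetical values: during fixation the bob sweeps over the retina quickly;
# during imperfect tracking only the residual retinal slip remains.
LATENCY_MS = 10.0          # assumed filter-induced delay
FIXATION_SPEED = 20.0      # assumed retinal speed of the bob, deg/s
TRACKING_SLIP = 1.0        # assumed residual retinal slip, deg/s

disparity_fixating = equivalent_disparity_arcmin(FIXATION_SPEED, LATENCY_MS)
disparity_tracking = equivalent_disparity_arcmin(TRACKING_SLIP, LATENCY_MS)

print(f"fixating: {disparity_fixating:.2f} arcmin")
print(f"tracking: {disparity_tracking:.2f} arcmin")
```

Under these assumed numbers the equivalent disparity of the bob shrinks in proportion to its retinal speed, which is why a full-sized effect during tracking points to the stationary background rather than the tracked target.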
STEREOSCOPIC VISION
What movements do the eyes make when tracking a moving target that is differentially filtered in the two eyes? In two brief accounts, Reading (1973, 1975) claimed that vergence eye movements occurred when observers tracked a differentially filtered target. She speculated that the Pulfrich effect under these conditions was a consequence of proprioceptive feedback from the extraocular muscles. Rogers et al. (1974) recorded eye movements while observers tracked (1) a differentially filtered oscillating target, with background luminance equal in the two eyes or (2) a target moving round an elliptical path in depth. Eye-movement records from condition (2) verified that observers could make vergence movements to track a target moving in an elliptical path in depth. In condition (1), Rogers et al. found no changes in vergence, which suggests that observers tracked the physical path of the moving target, as would be expected from theoretical considerations. It was also noted that the Pulfrich effect disappeared as soon as the observer started to track, which is consistent with the fact that the luminance of the background was the same in the two eyes. Ono and Steinbach (1983) also reported that there were no vergence changes when an observer tracked a pendulum bob with both the background and target filtered. In this case, a Pulfrich effect could be seen. They suggested that the electrooculogram used by Reading to record eye position was incapable of resolving small vergence movements, and therefore her results were an artifact. Enright (1985) confirmed that continuous changes of vergence are not elicited when an observer tracks a differentially filtered pendulum bob, except during the first few hundred milliseconds of tracking. However, after the initial period, Enright found systematic errors of vergence such that the eyes maintained constant divergence of between 34 and 55 arcmin.
These vergence errors were several times larger than those seen under conditions of steady fixation and seemed to be a consequence of interocular differences in illumination (Enright 1985).
23.6 DYNAMIC NOISE PULFRICH EFFECT
23.6.1 INTRODUCTION
Julesz and White (1969) and Ross and Hogben (1974) showed that the stereoscopic system tolerates a time difference between binocularly correlated random-dot frames of up to 50 ms. Ross (1974) showed that, under certain circumstances, a time difference is not merely tolerated but can itself produce a stereoscopic effect. The display consisted of a central square region of dynamic noise seen at the same time by the two eyes with surrounding dynamic noise delayed to one eye. When the delay was longer than 70 ms, the dots in the surround appeared to lie in a plane behind
the central square. The depth effect could be canceled by a spatial disparity between the spatially correlated but temporally mismatched frames and was abolished when the surround dots were completely uncorrelated between the two eyes. In a later paper, Ross (1976) described the dots in the surround as appearing to lie on an upright cylinder rotating around its vertical axis. Other investigators have supported the original description and stressed the appearance of discrete planes of moving dots that lie predominantly (for some observers) behind the fixation point (Falk and Williams 1980; Neill 1981; Zeevi and Geri 1985). Ross interpreted his results as showing that the visual system detects temporal as well as spatial disparities between images reaching the two eyes, which is the temporal disparity hypothesis (Section 23.3.1). Tyler (1974b) reported a related dynamic noise Pulfrich effect. If a display of random dynamic noise, such as that created on a detuned television receiver, is viewed with a neutral-density filter over, say, the left eye, the display is seen in depth. The dots seen in front appear to stream to the left and those seen behind appear to stream to the right. Overall, the dots appear to swirl in a clockwise direction around the fixation point, with the apparent velocity of each dot linked to its apparent displacement in depth from the screen. With the filter over the right eye, the dots swirl in a counterclockwise direction. The only difference between the Ross effect and the Tyler effect is that Ross used a physical delay and Tyler used a filter-induced delay. Zeevi and Geri (1985) showed that when dynamic visual noise is viewed with a filter over one eye the apparent movement of the dots in an uncrossed disparity plane is a sufficient stimulus to create a movement aftereffect (Section 16.4.3). The following three hypotheses have been proposed to explain dynamic noise Pulfrich effects.
23.6.2 TEMPORAL DISPARITIES
Ross (1974, 1976) suggested that the visual system is capable of detecting the temporal disparity of images that stimulate corresponding spatial locations on the two retinas. The rationale for a separate temporal disparity mechanism is that temporal disparities are created in the normal viewing of a three-dimensional scene when the eyes move (Portrait Figure 23.20). During an eye movement to the right, a point with uncrossed disparity stimulates corresponding retinal regions in the right eye earlier than in the left, and vice versa for a crossed point. One problem with this hypothesis, as Neill (1981) pointed out, is that when the eyes move in a given direction, the direction of movement of crossed and uncrossed points is the same, but the temporal disparities are in opposite directions. In the dynamic-noise Pulfrich effect, the temporal disparity is the same for crossed and uncrossed points, but the direction of perceived motion is opposite.
Figure 23.20. John Ross. He obtained a B.A. from the University of Sydney in 1953 and a Ph.D. from Princeton University in 1961. He then gained an academic appointment in psychology at the University of Western Australia, where he became professor in 1969. He retired in 1994.
Figure 23.21. Tyler’s random spatial disparity hypothesis. At time t1, the random pairing of the two closest-spaced dots (x and y) creates either an uncrossed disparity (A) or a crossed disparity (B). At time t2, the left eye sees the delayed image of dot y, which is to the right of x in (A) and to the left of x in (B). Hence, there is a necessary link between uncrossed disparity and motion to the right and crossed disparity and motion to the left.
Therefore, the two situations are not necessarily equivalent. In addition, the temporal disparity hypothesis predicts no depth when the eyes are stationary, and yet the dynamic-noise Pulfrich effect can still be seen during fixation (Tyler 1977). However, the temporal-disparity hypothesis can be reformulated in terms of temporal disparities of moving objects rather than of those created by eye movements. For example, an object with uncrossed disparity moving from right to left stimulates a region in the right retina before the corresponding region in the left retina, as will an object with crossed disparity moving left to right (Figure 23.8A). This emphasizes the fact that for correct interpretation of temporal disparities, the direction of motion of the image must be registered.
23.6.3 RANDOM SPATIAL DISPARITIES
Tyler (1974b) interpreted his results in terms of spatial disparities. He argued that, at any instant, uncorrelated dots in the images in the two eyes are paired according to the nearest-neighbor principle (Section 15.3.2). These random pairings generate both crossed and uncrossed disparities with a range of magnitudes, and should give rise to a cloud of dots at different distances in front of and beyond the fixation plane (Figure 23.21). Random correspondences
between dots in different frames to the same eye would also produce apparent motion in all directions and at different velocities, according to the dot separations between frames. Tyler showed that there is a geometric link between the predicted disparity of randomly paired dots between the eyes and the direction and velocity of apparent motion between dots in the same eye (Figure 23.21). For example, a dot seen by the left eye paired with a dot displaced to the right in the right eye (on a nearest-neighbor basis) creates uncrossed disparity. When the left eye is subsequently stimulated by the right eye’s temporally delayed dot, the nearest dot for apparent motion must be to the right. Tyler argued that this association is consistent with the appearance of dots rotating or shearing in depth around the fixation point. He termed this explanation the random spatial-disparity hypothesis. One problem with this explanation is that stereopsis should also be possible between uncorrelated patterns of random dots presented to the two eyes, either dynamically or statically. The random pairing of dots seen by the eyes should be sufficient for observers to see a dense cloud of dots lying at different depths. The only difference between the appearance of binocularly uncorrelated patterns and the dynamic-noise Pulfrich effect is that the dots should not appear to swirl in a consistent direction round the fixation point in the former case. Unfortunately, there is dispute as to what is seen when binocularly uncorrelated dot patterns are presented to the eyes. Tyler (1977) claimed that depth is seen “after some initial confusion” and that the appearance is enhanced when the display is seen alongside a binocularly correlated noise field, which appears as a single depth plane. MacDonald (1977) and Neill (1981), on the other hand, claimed that only rivalry and no depth is seen under these conditions. 
A crucial test between the temporal-disparity and random spatial-disparity hypotheses was carried out by
Tyler (1977) using a display consisting of a sequence of different random-dot frames presented with opposite contrasts to the two eyes. With a neutral-density filter over one eye, observers reported that the dots appeared to swirl in depth around the fixation point, as with the original effect, but in the opposite direction. According to the temporal-disparity hypothesis, depth would be expected only if the stereo system could correlate the temporally displaced dots of opposite contrast to the two eyes. Results obtained with static opposite-contrast random-dot stereograms suggest that this is unlikely (Section 15.3.7). But, in any case, the hypothesis fails because it predicts the same direction of three-dimensional swirling with complementary as with correlated noise. Falk and Williams (1980) claimed that the reversed direction of swirling seen with opposite-contrast dynamic-noise patterns to the two eyes is compatible with the apparent-motion hypothesis described below.
23.6.4 APPARENT-MOTION CASCADES
The third hypothesis to explain the dynamic noise Pulfrich effect was proposed by Mezrich and Rose (1977) and Ward and Morgan (1978). According to this hypothesis, the effect depends on the apparent motion in “cascades” of dots in the noise and is therefore equivalent to the apparent-motion version of the Pulfrich effect (Section 23.3.4). According to Tyler’s random spatial-disparity hypothesis, apparent motion merely accompanies the retinal disparity between points. According to the apparent-motion hypothesis, disparity is created between the spatiotemporal interpolated positions of the moving dots. To test the two hypotheses, Morgan and Ward (1980) created a sequence of random-dot frames in which each dot survived for between 2 and 30 frames. In addition, each dot was displaced 3.6 arcmin to the right between each pair of frames during its lifetime. Instead of the Brownian motion seen in a sequence of uncorrelated random-dot frames, observers perceived the direction of drift under both monocular and binocular viewing, even when dot lifetime was only 2 frames. The time between frames was 25 ms, and there was an interocular delay of 12 ms. If dots in the leading eye’s image had been paired with dots in the same image in the lagging eye, no depth should have been seen. If dots in the leading eye’s image had been paired with dots in the preceding image seen by the other eye, the depth should have corresponded to a disparity of 3.6 arcmin. In fact, the disparity corresponding to the matched depth varied between 0.5 and 1.7 arcmin and increased with increasing dot lifetime. Tyler (1977) suggested that the compromise judgments between 0 and 3.6 arcmin could be due to disparity averaging, but Morgan and Ward pointed out that this would not account for the fact that the depth effect increased with the lifetime of the dots. Instead, they suggested that the results are more consistent with spatial interpolation of a target
undergoing apparent motion, which has been demonstrated for both binocular and monocular moving targets (Burr and Ross 1979; Morgan 1979). In other words, their hypothesis suggests that a simultaneous spatial disparity exists between the interpolated positions of the discretely moving targets in the two eyes. Apart from containing the additional idea of interpolation or spatiotemporal averaging, this hypothesis is equivalent to Fertsch’s original explanation of the normal Pulfrich effect. Overall, Morgan and Ward’s explanation seems to account for the effects seen with displays containing apparent-motion cascades. However, it is not clear that it is superior to Tyler’s random spatial-disparity hypothesis in accounting for the original dynamic-noise effect in which there are no explicit motion cascades. To distinguish between the predictions of the three hypotheses, Falk and Williams (1980) measured the effects of changes in filter density, viewing distance, and dot rate. The predicted velocity of disparate points calculated from the magnitude of the filter-induced delay was an order of magnitude larger than any velocity observed in their own or previous studies of the effect. Also, the apparent velocity of the streaming dots increased with the density of the filter covering one eye. According to the random spatial-disparity hypothesis, the speed of the streaming dots should be influenced only by the distance between nearest-neighbor dots that are paired in apparent motion. Yet another problem for the random spatial-disparity hypothesis is that it does not predict the depth effect seen when the filter-induced delay is significantly less than the interframe interval (Falk and Williams 1980). Neill (1981) provided additional evidence of spatiotemporal averaging by varying the frame rate of the uncorrelated noise patterns. 
If spatiotemporal averaging is important, the apparent velocity of the streaming dots should be independent of the frame rate as long as the frequency is high enough to allow averaging over several frames. According to the random spatial-disparity hypothesis, the apparent velocity should increase, since each random dot is displaced to its nearest neighbor in a shorter period of time. In Neill’s experiment, most observers opted to match the velocity of the uncrossed coherent sheet of dots that lay closest to the display screen. Varying the frame rate between 20 and 120 Hz had little effect on the matched velocity. Also, Neill pointed out that, according to the random spatial-disparity hypothesis, the effect should collapse if the filter-induced lag exceeds the interframe interval. Random disparities are still created between nearest corresponding dots in the simultaneously occurring uncorrelated dot frames. However, because of the intervening frame, the direction of apparent motion is no longer systematically related to disparity. Neill’s finding that the frame rate has little effect on the appearance of the dynamic-noise Pulfrich illusion is therefore not consistent with this prediction from the random spatial-disparity model.
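The competing predictions for the Morgan and Ward (1980) display described earlier in this section can be put side by side with a little arithmetic. The Python sketch below uses the stimulus values quoted in the text (a 3.6 arcmin step per frame, 25 ms between frames, and a 12 ms interocular delay). The linear interpolation rule and the function name are assumptions made for illustration; the source states only that disparity arises between spatiotemporally interpolated positions and does not give a formula.

```python
def interpolated_disparity_arcmin(step_arcmin, frame_interval_ms, delay_ms):
    """Disparity between linearly interpolated dot positions in the two eyes.

    Assumes (for illustration) that each eye's dot position is interpolated
    linearly between frames, so a delay of delay_ms corresponds to a spatial
    offset of step * delay / frame_interval.
    """
    return step_arcmin * delay_ms / frame_interval_ms


STEP_ARCMIN = 3.6         # displacement per frame (from the text)
FRAME_INTERVAL_MS = 25.0  # time between frames (from the text)
DELAY_MS = 12.0           # interocular delay (from the text)

same_frame_pairing = 0.0               # pairing dots within the same frame
previous_frame_pairing = STEP_ARCMIN   # pairing with the preceding frame
interpolation = interpolated_disparity_arcmin(
    STEP_ARCMIN, FRAME_INTERVAL_MS, DELAY_MS
)

print(f"same-frame pairing:     {same_frame_pairing:.2f} arcmin")
print(f"previous-frame pairing: {previous_frame_pairing:.2f} arcmin")
print(f"linear interpolation:   {interpolation:.2f} arcmin")
```

Under this assumed rule the interpolated disparity is about 1.73 arcmin, which lies at the top of the 0.5 to 1.7 arcmin range of matched disparities reported in the text for the longest dot lifetimes.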
Morgan and Tyler (1995) reported that the dynamic-noise Pulfrich effect was reduced when the stimuli were filtered to reduce vertically oriented Fourier components in the noise. This minimized horizontal apparent motion between frames, since the preferred direction of motion detectors is orthogonal to the orientation of their receptive fields. Reducing horizontal components had little effect. They concluded that all three explanations of the dynamic noise effect might be combined into a more general model in which there are disparity detectors tuned both to a particular direction of horizontal motion and a direction of disparity.
23.7 CLINICAL ASPECTS OF PULFRICH EFFECT
The fact that a swinging pendulum bob is seen displaced in depth when viewed with a filter over one eye shows that the effect is stereoscopic and involves comparison of the spatiotemporal characteristics of the images in the two eyes. Pulfrich could not experience the effect himself because he was blind in one eye (Gregory 1966). It is therefore of considerable interest that Thompson and Wood (1993) found that four stereoblind subjects saw a Pulfrich effect under certain conditions. In particular, all four could set a marker under the forward path of the pendulum bob when the dominant eye was filtered with a 1-log-unit filter and the bob was tracked with the eyes. Only one stereoblind subject could set the marker when the nondominant eye was filtered, and she was the only subject who could set the marker when fixating a stationary point. The most obvious explanation of these results is that the stereoblind subjects were not completely stereoblind but stereo anomalous or simply poor at seeing depth in random-dot stereograms. Thompson and Wood’s criterion for stereoblindness was the fusion of a simple random-dot stereogram depicting a central square standing in front of the surround, which all four subjects failed to achieve.
In addition, three out of the four subjects showed no interocular transfer of the movement aftereffect. However, their subjects may have been able to see depth in simple line stereograms, which are more similar in their spatial characteristics to the pendulum bob. Naturally occurring or pharmacologically induced unequal pupil size (anisocoria) produces unequal illumination of the retinas and a corresponding Pulfrich effect (Heron et al. 1995). The Pulfrich effect can occur without a filter in patients with anisometropic amblyopia, glaucoma, cataract, or optic nerve dysfunction arising from such conditions as multiple sclerosis and optic neuritis (Frisen et al. 1973; Sokol 1976; Slagsvold 1978). Tredici and von Noorden (1984) found three patients with anisometropic amblyopia who experienced a spontaneous Pulfrich effect in a direction determined by which
eye was weak. These patients had stereoacuity of at least 550 arcsec on a random-dot stereogram. None of 34 stereoblind patients experienced the Pulfrich effect, even with a filter in front of one eye. A spontaneous Pulfrich effect arises from delay of neural conduction in one visual pathway relative to that in the other. Hofeldt et al. (1985) reported that four patients with retinal detachment in one eye experienced a spontaneous Pulfrich effect, accompanied by delay in the cortical potential evoked by stimulation of the affected eye. The symptoms ceased when the retinopathy was resolved. A spontaneous Pulfrich effect arising from a lesion pressing on the visual pathways depends on the location of the lesion. Feinsod et al. (1979) reported the following effects in patients with variously positioned lesions and in subjects with normal vision wearing appropriate filters. A defect in one retina or one optic nerve weakened vision in one eye and created a conventional Pulfrich effect, as in Figure 23.22A. A lesion in the region of the chiasma created loss of vision in either the two nasal hemiretinas (crossed pathways) or the two temporal hemiretinas
Figure 23.22. The Pulfrich effect and filter placement. Plan views of the apparent movement of a Pulfrich pendulum bob produced by filters in various positions. (A) Left-eye filter and right-eye filter. (B) Binasal filter and bitemporal filter. (C) Binocular hemifield filter. (D) Monocular hemifield filter. (Redrawn from Feinsod et al. 1979)
(uncrossed pathways). The pendulum appeared to move in a figure eight. The same effects were produced by binasal or bitemporal filters, as in Figure 23.22B. A lesion in one optic tract created weak vision in one half of space (hemianopia). The pendulum appeared to move in an ellipse moving away in the hemianopic field. The same effects were produced by hemifield filters, as in Figure 23.22C, or by placing a filter over one half of one eye, as in Figure 23.22D. These latter two effects did not involve any binocular disparity and were presumably due to differences in perceived speed. The effects were accompanied by differences in latency of visually evoked potentials between one eye and the other or between one cortical hemisphere and the other. For example, a 1.0 neutral density filter placed before one eye of a person with normal vision delayed the first negative VEP response from that eye by 10 ms relative to the response from the unfiltered eye. However, abnormal differences in latency in the two eyes, revealed by visually evoked potentials, did not correlate with those determined by the Pulfrich phenomenon (Rushton 1975).
Mojon et al. (1998) devised a bedside test involving the Pulfrich stereophenomenon. A patient with multiple sclerosis experienced the Pulfrich effect when viewing a moving spot with one eye (Ell and Gresty 1982). Perceived depth did not vary with spot velocity, and the causes of this effect remain unexplained. The Pulfrich pendulum is not suitable for detecting localized visual defects because it requires a large stimulus. Regan et al. (1976a) introduced the delay campimeter for diagnosing local visual defects that cause delay in transmission of visual inputs. The patient fixates a central light while two LEDs are presented, one to each eye for 10 ms, in various positions on a screen. The patient adjusts the interflash interval of the two LEDs in 10-ms steps until they appear simultaneous. The test was used in the diagnosis of unilateral retrobulbar neuritis and local areas of demyelination associated with multiple sclerosis (Regan et al. 1976b). Carkeet et al. (1997) devised a similar test with a temporal resolution of 0.24 ms for centrally placed stimuli.
24 STEREOSCOPIC TECHNIQUES AND APPLICATIONS
24.1 Stereoscopic techniques
24.1.1 Geometry of stereoscopic displays
24.1.2 Types of stereoscope
24.1.3 Lenticular plate methods
24.1.4 Volumetric displays
24.1.5 Random-dot stereograms
24.1.6 Random-dot autostereograms
24.1.7 Monocular stereoscopy
24.1.8 The dichoptiscope
24.2 Applications of stereoscopy
24.2.1 Photogrammetry
24.2.2 Telescopes and rangefinders
24.2.3 Stereomicroscopy
24.2.4 Stereoendoscopy and stereo MRI
24.2.5 Stereosculpting and stereolithography
24.2.6 Television, virtual reality, and telepresence
24.1 STEREOSCOPIC TECHNIQUES
24.1.1 GEOMETRY OF STEREOSCOPIC DISPLAYS
The image of a photograph viewed with one eye is similar to that created by the original object if the visual axis is orthogonal to the photograph and passes through the principal vanishing point of the photograph. The image is the same size as the image of the original object if the distance of the eye from the photograph equals the focal length of the camera lens multiplied by the magnification of the photograph relative to the image on the film. A stereoscopic system displays to each eye a 2-D picture of a 3-D scene. The two pictures are created by taking two photographs of the scene with the cameras separated by distance e on a horizontal axis parallel to the camera image planes. The following conditions hold when the two pictures recreate the impression of the original scene for a given viewer. (1) Distance e equals the interocular distance. (2) Each eye is in the same position with respect to the picture as the original center of projection for that eye’s picture. (3) The angle of convergence at each eye of all pairs of corresponding visual lines in the two pictures equals the angle of convergence of those visual lines in the original scene. (4) Points straight ahead in the scene project as a point straight ahead in the fused picture. Figure 24.1A depicts a photograph of a 3-D scene projected onto a screen by a projector at distance –d. Construct x-, y-, and z-axes with an origin at point O, where the optical axis cuts the screen. The position of the projector is (0, 0, –d). Point (x, y) is the image of point (X, Y, Z) in 3-D space. By similar triangles,
$$x = \frac{Xd}{d + Z} \quad \text{and} \quad y = \frac{Yd}{d + Z} \tag{1}$$
Now consider two projectors at positions (–e/2, 0, –d) and (+e/2, 0, –d), as in Figure 24.1B, projecting the same scene taken by cameras distance e apart. Point (xL, yL), formed by the left projector, and point (xR, yR), formed by the right projector, originate from the same location in the scene. A point in 3-D space is created where the visual lines through the two image points meet. For any point (X, Y, Z) in 3-D space, the coordinates of the image for the left eye are:
$$x_L = \frac{Xd - Ze/2}{d + Z} \quad \text{and} \quad y_L = \frac{Yd}{d + Z} \tag{2}$$
The coordinates of the image for the right eye are:
$$x_R = \frac{Xd + Ze/2}{d + Z} \quad \text{and} \quad y_R = \frac{Yd}{d + Z} \tag{3}$$
Since the left and right images have the same vertical coordinates, vertical disparities do not exist on a flat screen. The horizontal disparity for point P is the difference between xL and xR, namely,
$$\text{Horizontal disparity} = x_R - x_L = \frac{Ze}{d + Z} \tag{4}$$
A simpler derivation of this relationship is shown in Figure 24.1C, in which projectors have been replaced by
Stereoscopic images are difficult to fuse if the distance between the nearest and farthest planes, distance range (D), is too great. The acceptable distance range is proportional to the square of the distance to the nearest plane and inversely proportional to the stereo base of the camera. Thus, D decreases rapidly as the distance of the nearest plane is reduced. For example, for a stereo base of 2.5 inches, a reduction of the distance to the nearest plane from 10 feet to 8 feet reduces D from 238 feet to about 26 feet (Dudley 1965). Figure 24.2 illustrates the effects of changing the distance between stereoscopic projectors and the screen. When the screen is at a distance corresponding to the distance of the cameras from the original object, the stereoscopic image recreates the original scene. If the screen is nearer than this, objects appear compressed in depth and tapered. If the screen is too distant, objects appear magnified and tapered in the opposite direction. Changing the distance of stereograms in a stereoscope produces the same effects. If the stereogram is magnified n times, then the focal
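Figure 24.1. Projection onto a plane. (A) Monocular projection of an object point (X, Y, Z) onto the projection plane by a projector at distance –d. (B) Binocular projection by left and right projectors separated by distance e. (C) Derivation of the disparity x on the screen for a point at distance z beyond the screen, with the eyes at distance d from the screen. By similar triangles, (x/2)/(e/2) = z/(d + z), and hence x = ze/(d + z). (Adapted from Hodges 1992)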
the eyes. The linear horizontal disparity produced on a flat screen by a point in a 3-D scene is proportional to the distance of the point from the screen and to the interocular distance, and inversely proportional to the distance of the point from the eyes (Hodges 1992). The point in the 3-D scene will appear at position (X, Y, Z) in the above formula if the viewer correctly registers the disparity, the interocular distance, and the distance to the screen. If the picture was produced by cameras in the same positions as the eyes and with similar optics then the perceived 3-D structure of the scene should be the same as that produced by the original scene.
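The projection geometry of this section can be checked numerically. The following Python sketch projects a scene point from two centers of projection at (–e/2, 0, –d) and (+e/2, 0, –d) onto the screen plane z = 0 and compares the resulting horizontal disparity with the closed form Ze/(d + Z). The function names and the numerical values (in centimeters) are illustrative assumptions, not taken from the source.

```python
def project_to_screen(point, center_x, d):
    """Project a scene point onto the screen plane z = 0.

    The center of projection is at (center_x, 0, -d). The ray from the
    center to the point (X, Y, Z) reaches z = 0 at fraction t = d / (d + Z)
    of the way along the ray.
    """
    X, Y, Z = point
    t = d / (d + Z)
    x = center_x + t * (X - center_x)
    y = t * Y
    return x, y


def stereo_image_points(point, e, d):
    """Left and right image points for centers of projection e apart."""
    left = project_to_screen(point, -e / 2.0, d)
    right = project_to_screen(point, +e / 2.0, d)
    return left, right


def horizontal_disparity(Z, e, d):
    """Closed-form screen disparity x_R - x_L = Z e / (d + Z)."""
    return Z * e / (d + Z)


# Hypothetical example: a point 50 cm beyond a screen viewed from 100 cm,
# with a 6.5 cm separation between the centers of projection.
scene_point = (10.0, 5.0, 50.0)
E_CM, D_CM = 6.5, 100.0

(x_left, y_left), (x_right, y_right) = stereo_image_points(scene_point, E_CM, D_CM)
disparity = x_right - x_left

print(f"x_L = {x_left:.4f}, x_R = {x_right:.4f}")
print(f"disparity from projection = {disparity:.4f}")
print(f"disparity from closed form = {horizontal_disparity(50.0, E_CM, D_CM):.4f}")
```

In this sketch the two vertical coordinates are identical, so only a horizontal disparity arises on the flat screen, and a point in the screen plane (Z = 0) has zero disparity.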
Figure 24.2. Effects of screen distance. Two projectors project the same stereogram images of a square on a ground plane onto a screen at three distances. (A) Screen is at the correct distance to create a square in depth. (B) Screen is too near. Square appears compressed and tapered. (C) Screen is too distant. Square is magnified and tapered.
length of the projector lens or stereoscopic lens should be n times that of the lens used to take the pictures. If a magnified stereogram is viewed with a lens equal to the camera lens, the picture will appear near, magnified, and flattened in depth. With a minified stereogram, the picture will appear far off and depth intervals will appear exaggerated. Viewing a stereoscopic display to one side of the correct station point also distorts the angular disparities in the image. People partially compensate for these distortions (Bereby-Meyer et al. 2000).
For a pair of images with a given disparity, the distance from the spectator to the screen divided by the distance of the spectator to the stereoscopically fused image in 3-D space is constant. This ratio is known as the nearness factor (N). For images with zero disparity, N = 1, and the fused image lies in the plane of the screen. When N = 2, the image is halfway between the spectator and the screen, except for small effects due to individual differences in interocular distance. The total depth depicted in the image divided by the resolving power of the optical system defines the number of discriminably distinct depth planes. Most films do not allow for more than about 20 discriminable depth planes for a nearness factor of 2.
A pair of disparate images on film creates a disparity on the screen that is proportional to the magnification of the images, the focal length of the projection lenses, and the lateral separation of the lenses. Since these factors are constant for a given system, their product can be represented by a constant, C. For a crossed disparity, the distance from the spectator, d, of the fused image of an object that was distance D from the camera is given by:

$$d = \frac{AD}{C - BD} \qquad (5)$$

where A is a factor that depends on the distance of the viewer from the screen and B is a factor governed by the convergence of the optical axes of the stereoscopic cameras and projectors. When the axes of the two instruments are parallel, B = 0. When B is positive, objects appear more distant than they were in the original scene and appear to elongate as they recede into the distance. When B is negative, objects appear nearer than they were in the original scene (Spottiswoode and Spottiswoode 1953).
An orthoscopic projection, also known as an orthostereoscopic projection, is one in which the projected 3-D image is congruent with the scene it represents. Under this condition, the magnification of images in depth equals the magnification of images in width and height so that the 3-D shapes of objects are preserved. There are three conditions for orthoscopic projection: (1) B should be zero, (2) the interpupillary distance of the observer must equal the separation of the camera and projection lenses, and (3) the distance of the viewer from the screen must equal the magnification of the image (M) times the focal length of the lenses (f). Orthoscopic projection cannot be achieved in practice because of limitations in M and f. For example, if M = 300 (screen size 20 by 8 feet) and the lenses have a focal length of 2 inches, the nearest object to the camera must be at least 25 feet away. Even if orthoscopic projection is achieved for a viewer in one location, it will not hold in locations nearer to or farther from the screen. Orthoscopic projection also varies with the interocular distance of the viewer.
Strict orthoscopic projection is less important than the more general requirement that the height, width, and depth of objects have the same proportions in image space as in object space. For projectors at distance d from the viewing screen and an observer with interocular distance e at distance d_O from the screen (Figure 24.3), the distance between the projectors, b, that produces the correct proportions of images is given by:

$$\frac{b}{e} = \frac{d + a}{d_O + a} \qquad (6)$$

where a is the stereoscopic distance of an image point behind the screen. This equation determines the interaxial spacing of cameras and projectors so that all dimensions in the image appear in their correct proportions and so that perspective and disparity are in balance (Hill 1953). When the observer is at the position of the projector, the equation becomes b = e.
Figure 24.3. Positions of projectors, observer, and screen. The object point lies at distance a behind the screen. The projectors are a distance b apart at distance d from the screen, and the observer, with interocular distance e, is at apparent distance d_O from the screen. The apparent distance of the observer from the screen is that distance which would produce an image with no magnification. (Redrawn from Hill 1953)
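The nearness factor, equation (5), and the third orthoscopic condition lend themselves to a short numerical illustration. This is only a sketch: it assumes equation (5) reads d = AD/(C − BD), and the function names and the sample values of A, B, and C are hypothetical, chosen to show the direction of the effects described in the text.

```python
def distance_from_nearness(screen_distance, nearness):
    """Distance of the fused image from the spectator, given N."""
    return screen_distance / nearness


def fused_image_distance(D, A, B, C):
    """Equation (5), assumed to read d = A*D / (C - B*D)."""
    denominator = C - B * D
    if denominator <= 0:
        raise ValueError("C - B*D must be positive for a fusible image")
    return A * D / denominator


def orthoscopic_viewing_distance(magnification, focal_length):
    """Condition (3): viewing distance = M times f (same unit as f)."""
    return magnification * focal_length


if __name__ == "__main__":
    print(distance_from_nearness(10.0, 2.0))            # halfway to the screen
    for B in (-0.1, 0.0, 0.1):                          # hypothetical values
        print(B, fused_image_distance(D=10.0, A=2.0, B=B, C=4.0))
    print(orthoscopic_viewing_distance(300, 2.0) / 12)  # inches to feet
```

With M = 300 and f = 2 inches the required viewing distance is 600 inches, or 50 feet, which shows how restrictive the orthoscopic condition is for a whole audience.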
Stereoscopic display systems are reviewed in Merritt and Fisher (1992, 1993). The basic principles of large-scale stereo projection are described by Kurtz (1937), Spottiswoode et al. (1952), Spottiswoode and Spottiswoode (1953), Norling (1953), Valyus (1966), and Lipton (1982). See Hayes (1989) for an account of more recent developments in stereo movies. See Allison (2007) and Held and Banks (2008) for recent discussions of the perceptual effects of distortions in stereoscopic displays.
24.1.2 TYPES OF STEREOSCOPE
24.1.2a The Mirror Stereoscope
The optical features of the mirror stereoscope are shown in Figure 24.4A. Mirror stereoscopes are often referred to as Wheatstone stereoscopes. They are the most precise and versatile type of stereoscope for research purposes. Displays of up to 90˚ of visual angle can be viewed in a mirror stereoscope. Successive views of a wider display may be obtained by moving the monocular displays in opposite directions, as pointed out by Ramón y Cajal (1901). If the mirrors are semisilvered, the stereoscopic image may be combined with a display presented in the frontal plane. When the side displays are mounted on arms hinged about a point beneath each eye, the observer’s angle of vergence can be varied without changing the disparities. Clinical instruments used in orthoptics, such as amblyoscopes and synoptoscopes, are essentially mirror stereoscopes, with adjustable vergence and devices for controlling the luminance and size of each image (Section 10.2.3b). Léon Pigeon of Dijon developed a simplified mirror stereoscope in 1910 (see Lockett 1913) in which the right eye views one picture directly and the left eye views the other picture reflected in a right-angle prism. One picture must be printed in reverse.
24.1.2b The Prism Stereoscope
Wheatstone described a prism stereoscope in 1832. Later, Brewster developed the instrument. In his original prism stereoscope, dichoptic displays were printed side-by-side about 6.5 cm apart on a rectangular card and viewed through two base-out prisms, as shown in Figure 24.4B. The partition between the eyes allows each eye to see only one display. Magnifying lenses incorporated into the prisms enlarge the pictures and allow the viewer to accommodate on the stereogram. For this reason, this type of stereoscope is also known as a lenticular stereoscope. Prisms cause vertical lines to appear curved. In the prism stereoscope, the apparent curvature is in opposite directions in the two eyes, which causes a rectangular grid to appear concave. This was first noticed by Antoine Claudet (1856). The problem is solved by using lenses rather than prisms. Each eye sees through the center of a convex lens, which allows the viewer to accommodate on the images at close distance but does not displace the image.
Side-by-side pictures may be fused by diverging or converging the eyes. This is known as free fusion. Free fusion is facilitated by holding a card between the eyes so that each eye sees only its own half-image. With a partition, the images can be fused only by diverging beyond the plane of the stereogram. Most people have no difficulty doing this, since the partition between the eyes removes the tendency to remain converged in the plane of the images. Thus, the essential element in a prism stereoscope is the partition rather than the prisms.
The size of the picture in a prism stereoscope is limited by the interocular distance, which subtends about 20˚ of visual angle at a viewing distance of 20 cm. Wide-angle lenses produce a broad view of the scene, but the picture suffers from fish-eye distortion. However, if the distorted pictures are viewed in a Brewster stereoscope through similar wide-angle lenses, an undistorted wide-angle view of the original scene is produced. Stereoscopes constructed in this way are available commercially.
In a field lens stereoscope, two stereo images are projected onto a field lens at least 0.5 m in diameter. A stereoscopic picture is seen by a person situated the correct distance from the other side of the lens (Burrows and Hamilton 1974).
Figure 24.4. Types of stereoscope. (A) Wheatstone’s mirror stereoscope. (B) Brewster’s prismatic stereoscope. (C) Polaroid projection stereoscope.
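The figure of about 20˚ for the picture width in a prism stereoscope follows from simple trigonometry. The sketch below assumes a 6.5 cm interocular distance, the card separation quoted above. The function name is chosen for this illustration.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle subtended by an extent `size` centered on the line of
    sight at viewing distance `distance` (same unit for both)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


if __name__ == "__main__":
    # A picture as wide as an assumed 6.5 cm interocular distance, seen at 20 cm.
    print(round(visual_angle_deg(6.5, 20.0), 1))
```

The result is a little under 19˚, which is consistent with the rounded value given in the text.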
24.1.2c Anaglyphs
In anaglyph stereoscopy, the picture for one eye is red and that for the other eye is green. The subject wears red and green filters, so that one eye sees only the green picture and the other eye sees only the red picture. In 1841 the German physiologist Heinrich Dove developed a subtractive process of color printing, and presented one half of a stereogram in blue on a white ground and the other in red on a white ground. When the images were viewed in a stereoscope through red and blue filters he saw a black object in depth on a white ground (see Gosser 1977, p. 64). In 1858 the French physicist Joseph-Charles d’Almeida described a similar process in which images projected by two magic lanterns, one through a red filter and one through a green filter, were viewed through red and green filters. Note that color mixing achieved with printed colors is subtractive, whereas that achieved by projection is additive. In 1891 the Frenchman Louis Ducos du Hauron patented a method for superimposing red and green stereo photographs on the same film, which was then viewed through colored filters (see Gosser 1977). He called his stereograms “anaglyphs,” a term that originally meant a vessel, or other object, adorned with low-relief sculpture. The term is now used for superimposed colored stereo images, whether in printed or projected form.
Anaglyph stereograms are convenient as 3-D illustrations in books, since they require only a simple pair of red-green filters. They have been popular in comic books (see Morgan and Symmes 1982). There is some loss of resolution and some color rivalry with anaglyphs, and they cannot be used when color is important.
24.1.2d Polaroid Stereoscope
The polaroid stereoscopic system was first described and patented in 1891 by John Anderton of Birmingham, England (see Gernsheim 1969). Left- and right-eye images are projected onto the same screen through two projectors with oppositely oriented polaroid filters (Figure 24.4C). The display is viewed through cross-polarized spectacles so that each eye sees only one image. The screen must have an aluminized surface, since other types of screen do not preserve the plane of polarization of reflected light. There is no limitation on the size of display. Crossed polaroid filters attenuate luminance by only 1 to 1.5 log units. Each eye therefore sees a faint version of the other eye’s image in addition to its own. The effect increases if the viewer tilts the head so that the axis of polarization of the glasses is no longer aligned with that of the projected image. In 1938 Joseph Mahler, an inventor from Czechoslovakia, introduced the Vectograph, in which oppositely polarized images are superimposed on one film, but this has not had much commercial success (see Lipton 1982). Stereoscopic movies based on the polaroid system are discussed in Section 2.11.4.
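The cross talk in the polaroid system can be put in numbers. This is a rough sketch with two parts. The first converts the attenuation of 1 to 1.5 log units quoted for crossed filters into a leaked fraction of luminance. The second applies Malus's law to ideal linear polarizers to show how head tilt raises the leak. The idealization and the function names are assumptions made for this example, since real filters leak even at zero tilt.

```python
import math


def leak_fraction(log_units):
    """Fraction of luminance passed by crossed filters that attenuate
    by the given number of log10 units."""
    return 10 ** (-log_units)


def tilt_transmission(tilt_deg):
    """Ideal linear polarizers (Malus's law).

    Returns (unwanted, wanted): the fractions of the other eye's image
    and of the eye's own image passed when the spectacles are rotated
    by tilt_deg relative to the projected polarization axes.
    """
    t = math.radians(tilt_deg)
    return math.sin(t) ** 2, math.cos(t) ** 2


if __name__ == "__main__":
    print(leak_fraction(1.0), leak_fraction(1.5))
    for tilt in (0, 10, 20, 45):
        print(tilt, tilt_transmission(tilt))
```

An attenuation of 1 log unit leaves one tenth of the unwanted image, and 1.5 log units leaves about 3%, which is why a faint ghost image remains visible.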
24.1.2e Field-Sequential Systems
In the shutter system, left- and right-eye images are projected onto the same screen in rapid succession and viewed through binocular shutters that open alternately at the same rate. This system was described by Joseph-Charles d’Almeida in 1858 and also by Stroh (1886) and Münsterberg (1894). Early devices used a rotating disk or drum as the shutter. In the modern system, dichoptic pictures are presented in rapid alternation on an interlaced monitor by writing the left-eye image on even lines and the right-eye image on odd lines. The subject views the display through electro-optical shutters that alternately occlude the eyes in phase with the alternation of the images. The images can be made to fill the whole binocular field.
Electro-optical shutters were first introduced in the early 1980s. They consisted of a chemically doped ceramic wafer sandwiched between two polarizers with a 90˚ relative rotation. In the “off” period, the polarizers blocked the light. In the “on” period, a voltage applied to the ceramic wafer rotated the light 90˚, which allowed it to pass through the second polarizer. These shutters transmitted only about 15% of the light. In about 1985 they were succeeded by liquid crystal shutters that transmit twice as much light. Wire connections to the electro-optical shutters can be avoided by use of a radio signal or a flickering source of infrared light. Flicker is not perceptible with systems running at 60 Hz in each eye.
Polarizing plates can be used instead of active electro-optical shutters. A liquid-crystal polarizing plate is placed over the red tube of a television monitor, and two more plates with their plane of polarization at 90˚ to the first are placed over the green and blue tubes. The left-eye image is presented on the red tube, and the right-eye image on the green and blue tubes. The polarizing plates reverse their polarizing angle by 90˚, at 60 Hz, as the left and right images are reversed between the tubes at the same rate. The viewer wears a pair of passive polaroid spectacles. In a related procedure, which minimizes flicker, the polarizers are interdigitated and aligned with alternate left-eye-right-eye stripes on the monitor display. Electro-optical shutters in the optical path of a video camera allow alternate left- and right-eye views to be recorded on one videotape.
Field-sequential systems are subject to cross talk due to incomplete switching of the images. Starks (1995) reviewed these systems. In a commercial device known as Crystal Eyes, a stereoscopic display is modified by feedback from sensors that detect the position of the viewer’s head in 3-D space. The device enables one to move the head and see different parts of an object.
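The line assignment used on an interlaced monitor can be illustrated with a few lines of code. This is a minimal sketch in which images are represented as lists of rows. The function names are hypothetical and no particular display hardware or driver is implied.

```python
def interleave_rows(left, right):
    """Build one interlaced frame: even rows from the left-eye image,
    odd rows from the right-eye image. Both images are lists of rows."""
    if len(left) != len(right):
        raise ValueError("left and right images must have the same height")
    return [left[i] if i % 2 == 0 else right[i] for i in range(len(left))]


def field_rate_hz(per_eye_rate_hz):
    """Fields per second the display must present when the two eyes'
    images alternate in time."""
    return 2 * per_eye_rate_hz


if __name__ == "__main__":
    left = ["L0", "L1", "L2", "L3"]
    right = ["R0", "R1", "R2", "R3"]
    print(interleave_rows(left, right))
    print(field_rate_hz(60))
```

Each eye receives only half the rows, and a rate of 60 Hz in each eye requires 120 fields per second in total.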
24.1.2f Claudet’s Stereomonoscope
Antoine Claudet (1797–1867) was born in Lyons, France, but lived mainly in London. He was photographer to Queen Victoria and developed several photographic procedures and instruments. In 1858, in response to a challenge by Brewster, he reported that some stereoscopic relief is produced when two stereoscopic pictures are projected on a ground glass screen from projectors some distance apart (Claudet 1858a). He explained that each eye sees best those rays that fall on the ground glass in the direction of its visual axis. Nine years later, in 1867, Clerk Maxwell described a similar instrument to the British Association in Dundee (see Gill 1969). He called it the “real image stereoscope.” Two lenses of 6-inch focal length with centers separated by 1.25 inches formed aerial images of two stereoscopic pictures, which were viewed through a large 8-inch-focal-length lens from a distance of about 2 feet. The original apparatus is in the Cavendish Laboratory in Cambridge, England.
24.1.3 LENTICULAR PLATE METHODS
24.1.3a Parallax Stereograms
The parallax autostereogram was patented by Frederic E. Ives in 1903, although the principle had been described by M. Berthier (1896). The object is first photographed from two vantage points with a plate of fine vertical slits placed over the film. In the resulting stereogram, left-eye and right-eye views of the scene are arranged in alternating vertical strips. The correct distance, D, between the grid and the film so that the alternate strips are adjacent but nonoverlapping is given by:

$$D = \frac{\text{lens-to-plate distance} \times \text{width of one strip}}{\text{distance between camera apertures}}$$
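As a quick check of the grid-to-film relation above, the following sketch evaluates it for one set of values. The numbers and the function name are illustrative assumptions and are not taken from Ives's patent.

```python
def grid_to_film_distance(lens_to_plate, strip_width, aperture_separation):
    """Distance between the slit grid and the film that makes the
    left-eye and right-eye strips adjacent but nonoverlapping.
    All lengths must share one unit."""
    return lens_to_plate * strip_width / aperture_separation


if __name__ == "__main__":
    # Assumed example: lens 300 mm from the plate, 0.1 mm strips,
    # camera apertures 65 mm apart.
    print(grid_to_film_distance(300.0, 0.1, 65.0))
```

The relation is a similar-triangles statement: seen through one slit, the two apertures must be imaged one strip width apart on the film.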
A similar slit plate is placed a short distance in front of the developed photograph. Since the periodicity of the picture strips corresponds to the periodicity of the slits in the plate, each eye sees only its own image when the stereogram is viewed from the correct direction. The geometry of a parallax autostereogram is depicted in Figure 24.5. The image seen by a particular eye through each slit is centered on a line of sight passing through the slit. The intersections of lines of sight from the two eyes through neighboring slits define the first depth plane. The second depth plane is defined by the set of intersections through every second slit, and the third depth plane by intersections of lines through every third slit, and so on for other depth planes. Thus, the perceived image consists of discrete depth planes of increasing separation. The separation between the depth planes depends on the spacing of the slits in relation to the interocular distance and on the distance between the slit plate and picture. The viewing distance is governed by these same factors but is not critical for the production of a stereoscopic effect. If the head of the viewer is moved laterally, the eyes see the incorrect images, which produces a pseudostereoscopic effect. If the head moves further, the images return to their correct order. Projection systems based on this principle were used in the cinema (Section 2.11.4), but they had several drawbacks, including darkening of the image, image diffraction, and dependence on viewing position.
Liquid crystal (LCD) computer monitors may be modified to create 3-D images. A second LCD layer is added in front of the image-producing LCD layer. This second layer creates a slit plate of transparent vertical stripes about 60 μm wide interspersed with opaque stripes about 120 μm wide. The image is composed of stripes so that each eye sees only alternate stripes.
Since the viewer must be in the correct location, the monitors are suitable only for devices such as computer notebooks. Multiplying the number of image stripes allows more viewers to see the stereoscopic image but at the cost of degrading image resolution. The system could be used for computer games, but general television viewing would require the transmission of appropriate signals. Specially devised software is being developed that can convert images produced by a single camera into images that create stereoscopic depth.
A type of parallax stereogram can be constructed by placing a transparent vertical grating a short distance in front of a similar vertical grating. When the two gratings are in parallel planes one sees an interference pattern (Moiré pattern) because the images of the two gratings differ slightly in spatial frequency. The spatial frequency of the Moiré pattern equals the difference between the spatial frequencies of the images of the component gratings. If the more distant grating is placed on a surface with depth modulations, each eye sees a slightly different interference pattern, which creates a pattern of disparities larger than those produced by the depth modulations of the more distant grating. This results in a greatly exaggerated stereoscopic impression of the depth modulation in the more distant grating (Chiang 1967). Depth is also created by modulating the spatial frequency of the more distant pattern. The German artist Ludwig Wilding has produced works of art based on this process (see Spillmann 1993; Wade 2007).
Figure 24.5. The geometry of a parallax stereogram. A slit plate is placed a defined distance in front of a picture composed of alternate left-eye and right-eye vertical strips. Images are produced in quantized depth levels, indicated by the dotted lines, with a separation that depends on the spacing of the slits in relation to the interocular distance. Levels +1 to +4 correspond to uncrossed disparities and levels −1 to −4 to crossed disparities. For instance, a picture element seen at A by the left eye and at B by the right eye has an uncrossed disparity corresponding to that produced by a point P. An image seen at B by the right eye and at C by the left eye has a crossed disparity corresponding to that produced by a point at Q.
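The quantized depth planes of a parallax stereogram and the Moiré beat frequency can both be computed from the definitions given above. The sketch below is a geometric illustration under stated assumptions: the eyes are a distance V from the slit plate and are placed symmetrically, the slits are ideally narrow, and the k-th depth plane is where the left eye's line of sight through one slit meets the right eye's line of sight through the slit k places to its right. The symbols V, s, and k and the function names are introduced here and do not appear in the text.

```python
def depth_plane_distance(V, e, s, k):
    """Distance from the eyes of the k-th depth plane.

    V: distance from the eyes to the slit plate
    e: interocular distance
    s: center-to-center spacing of the slits
    k: slit offset (positive = uncrossed, negative = crossed, 0 = slit plate)
    """
    denominator = e - k * s
    if denominator <= 0:
        raise ValueError("lines of sight do not converge for this offset")
    return V * e / denominator


def moire_frequency(f1, f2):
    """Spatial frequency of the Moire pattern formed by two gratings whose
    images have spatial frequencies f1 and f2."""
    return abs(f1 - f2)


if __name__ == "__main__":
    planes = [depth_plane_distance(500.0, 65.0, 1.0, k) for k in range(4)]
    print(planes)
    print([b - a for a, b in zip(planes, planes[1:])])  # growing separations
    print(moire_frequency(10.0, 10.5))
```

For uncrossed offsets the gaps between successive planes grow with k, which matches the description of depth planes of increasing separation.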
24.1.3b Lenticular-Sheet Stereograms
The lenticular-sheet autostereogram was patented by the Swiss ophthalmologist Walter R. Hess in 1912 (Hess 1914). The lenticular sheet is a sheet of transparent plastic with multiple cylindrical lenses molded into its surface. In the original design there were 46 cylinders per centimeter, giving 23 stripes per centimeter for each eye. The geometry of lenticular-sheet autostereograms is similar to that of parallax stereograms, as depicted in Figure 24.5. Unexposed photographic film is cemented to the back of the lenticular sheet in the focal plane of the lenses. Left- and right-eye photographs of a scene are placed 6.5 cm apart in the focal plane of a pair of lenses so that collimated beams from each picture are superimposed on the lenticular plate. Rays of light from a pair of corresponding points in the two photographs converge on a given cylindrical lens and form two distinct point images on the film behind the lenticular sheet, in the focal plane of the cylindrical lenses, as shown in Figure 24.6. When developed, the picture is composed of narrow vertical strips arranged in alternating left-eye right-eye columns. This is achieved when:

$$f = \frac{\mu d D}{d + 2a} \qquad (7)$$

where f is the focal length of the cylindrical lenses, d is the width of one cylindrical lens element, μ is the refractive index of the plate, D is the distance of the projection lenses from the plate, and a is the distance between the projection lenses. When the eyes are placed in the same locations as the lenses of the projectors, each eye sees bundles of light rays, which reconstitute the scene seen by that eye.
Lenticular stereograms have two advantages over parallax stereograms: the effects of diffraction are less severe, and they produce a brighter image, since there are no occlusions. They do not need a viewing instrument and are used commercially for 3-D picture postcards, record album covers, and magazines. However, the pictures are subject to chromatic aberration in the lenses, and pseudoscopic images occur if the picture is viewed from the wrong direction. Projection systems using lenticular sheets have been developed, but sheets for large pictures are expensive, and several projectors are required. A lenticular sheet can be placed over a high-resolution monitor carrying a computer-generated display or a specially designed television image (see Higuchi and Hamasaki 1978; McKenna and Zeltzer 1992). The construction and projection of lenticular-sheet stereograms are described in detail in Okoshi (1976) and Valyus (1966).
Figure 24.6. Stereoscopic images on a lenticular plate. The letters are defined in the text. (Adapted from Valyus 1982)
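To give a feel for the magnitudes involved, the focal-length condition can be evaluated for a plate like the one described. This sketch assumes that equation (7) reads f = μdD/(d + 2a). The refractive index of 1.5 and the 50 cm projection distance are illustrative guesses, and the function name is hypothetical.

```python
def lenticular_focal_length(mu, d, D, a):
    """Assumed form of equation (7): f = mu*d*D / (d + 2*a).

    mu: refractive index of the plate
    d:  width of one cylindrical lens element
    D:  distance of the projection lenses from the plate
    a:  distance between the projection lenses
    Lengths must share one unit.
    """
    return mu * d * D / (d + 2 * a)


if __name__ == "__main__":
    d = 1.0 / 46.0   # cm, for 46 cylinders per centimeter
    f = lenticular_focal_length(mu=1.5, d=d, D=50.0, a=6.5)
    print(f)  # focal length in cm under the assumed values
```

Because d is tiny compared with 2a, the result is close to μdD/(2a), so under these assumptions the required focal length is on the order of a millimeter.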
24.1.3c Parallax Panoramagrams
Clarence Kanolt described the principles of the parallax panoramagram in a 1915 patent. A single-lens camera has a slit plate in front of the film. As the camera is moved horizontally through an interocular distance of 6.4 cm or more, the slit plate moves one slit width across the film. The resulting photograph consists of a series of vertical strips, each as wide as the slits in the grid. One edge of each strip is a view taken at the start of the motion of the camera and the other edge is a view taken at the end of the motion of the camera. Thus each strip contains views of one element of the scene taken from many vantage points. When the picture is viewed through a slit plate like the one in the camera, a stereoscopic effect is produced. As the head moves to the right there is a progressive change from a left to a right aspect of the scene and vice versa. In other words, the scene undergoes the same parallactic changes that occur in the natural scene. If the head moves too far the images fall in the wrong eyes to produce a pseudoscopic effect. Use of a lenticular sheet rather than a slit plate improves light transmission.
Because of the time taken to move the camera, Kanolt’s method is not suitable for rapid exposures or for cinematography. Hubert Ives, son of Frederic Ives, patented a panoramagram in 1932 (Ives 1931) that overcame these limitations. He later described a version for motion pictures (Ives 1933). Instead of sweeping a single camera, the scene is photographed by an array of synchronized stationary cameras arranged in an arc. Each camera produces a picture from a particular vantage point. Each picture is then projected by a corresponding projector in an array of projectors arranged in an arc like the arc in which the cameras were arranged. All the pictures are projected through a lenticular plate onto a screen. The screen is viewed from the other side through a similar lenticular plate, as shown in Figure 24.7. The display creates images with correct parallax when the head moves (McAllister and Robbins 1987). This is known as lateral multiplexing. Because of the small aperture of each imaging element, the region over which the effect is pseudoscopic is much narrower than the region over which the effect has the correct sign of disparity. The image formed by multiple projectors may be photographed by a single camera from the other side of the screen. This film may then be projected to an audience through a single projector.
Figure 24.7. Multiprojector panoramagram. The projectors lie on one side of the screen and the viewer on the other, with a lenticular plate on each side of the screen.
In a related method, described by Gabriel Lippmann in 1908, images of a scene are formed by a series of cylindrical lenses on curved surfaces behind the lenses. Each image is a narrow segment of the scene taken by a camera from a series of vantage points. When viewed from the same distance as the camera, each eye sees a series of segments, which form a complete image of the scene appropriate to the position of the eye (Figure 24.8). As the head moves laterally, both eyes see the scene as if from the new position. The lenticular plates required for this system could not be made in Lippmann’s time. The density of images that can be placed behind each lenticular lens is limited by the resolution of the film. This limitation is overcome by time-sequential multiplexing,
in which a liquid crystal display is placed just in front of the display. The liquid crystal display operates as an assembly of slits, which scans sideways to reveal successive parts of the display. Successive views are projected in rapid succession to adjacent areas of the display, so that each view is visible through the slits of the liquid crystal display only within a narrow arc for each eye. With a rapid succession of images, a complete stereoscopic pair of images is built up for each viewing position (Meacham 1986). One problem is that most of the light is absorbed by the liquid crystal display. This problem could be overcome by using a laser beam to scan the display instead of a scanning liquid crystal display (Travis 1990). The early history of autostereogram methods was reviewed by Dudley (1951).
Figure 24.8. The lenticular panoramagram. The screen is composed of vertical cylindrical lenses, shown here enlarged. Behind each lens there is an ordered series of images of a segment of a 3-D scene, taken as the camera swept over the scene. Each eye sees the set of images in the array of lenses appropriate to its position. As the head moves laterally, the images seen by both eyes change appropriately to create the correct parallax, leaving the binocular disparity unchanged.
24.1.4 VOLUMETRIC DISPLAYS
In volumetric stereoscopic systems the image is distributed in real 3-D space rather than being projected onto a 2-D screen. These systems produce true parallax in which the form of the image changes appropriately with changes in vantage point. Also, vergence and accommodation change appropriately when the gaze is directed to different parts of the object, just as they do with a natural scene (see Harris 1988; McAllister 1993). Also, several observers in different locations may view the display.
24.1.4a Holography
In 1948 Dennis Gabor invented what he called “wavefront reconstruction.” We now call it holography (Gabor 1949). At that time Gabor was not able to produce good 3-D images with his method because a good image requires coherent light, such as that provided by a laser. In holography, all the information required to reconstruct a 3-D scene is stored on a single photographic plate. With the proper illumination, the photograph creates a true 3-D image that allows the viewer to see different sides of an object by moving the head. A normal photograph stores only a focused image of the 2-D distribution of light arriving from the scene. Optically, information about the distance of each light ray from an object is contained in its phase angle. However, no detector can operate fast enough to record phase at the frequency of light. Detectors register only the time-averaged intensity of light at each location. In holography, spatial modulations of phase are converted into a spatial interference pattern, which can be recorded on film. In creating a hologram, a coherent monochromatic beam of laser light is first passed through a beam splitter to form two identical beams, as shown in Figure 24.9A. One beam is reflected off the object onto a photographic plate. Since there is no lens, there is no focused image. Instead, each point on the object reflects light to all parts of the photographic plate. The rays arriving at each image point from different parts of the object have different path lengths and therefore different phases. The second laser beam, known as the reference beam, is directly superimposed on the same photographic plate. Since this beam has not been reflected off the object, all the rays retain the same phase. The two superimposed beams form an interference pattern that is recorded on the photographic plate. The pattern in each region of the film depends on both the amplitude and phase of the light from all points of the object visible from a given angle.
Views of an object from different angles are imaged in different locations. Thus, light from each object point goes to all points in the photograph, and any region of the photograph contains information about the object seen from one viewpoint. When the photograph is illuminated by the same coherent reference beam, as in Figure 24.9B, the hologram acts like a diffraction grating and creates two first-order diffracted beams. One of these beams is the same as the beam of light that was reflected off the object. Therefore, an observer viewing this beam will see a 3-D image of the original object. It is difficult to create moving holographic images in real time because of the computational load. There has been some progress in solving this problem (see Blanche et al. 2010). Colored holograms have been created but at the cost of lower resolution. In a related procedure, two infrared laser beams intersect in a transparent cube of metal-doped fluoride glass to produce a red, green, or blue luminous point. The point of intersection of the beams can be made to scan out a 3-D form (Downing et al. 1996). For reviews of holography see Tricoles (1987) and Saxby (1988). Jones and Aitken (1994) compared the data requirements of different types of 3-D imaging system.
Figure 24.9. Holography. (A) Recording a hologram. (B) Viewing a hologram.
24.1.4b Slice-Stacking Stereo Imagery
Gregory (1961) designed a microscope with narrow depth of focus in which the objective scanned the specimen in depth at 50 Hz to produce a repeating sequence of images. The images were projected in order onto a screen that oscillated in depth in synchrony with the objective to produce a volumetric display. The confocal scanning microscope described in Section 24.2.3b operates in a similar way.
The varifocal mirror system was developed by A. C. Traub of the Mitre Corporation in 1967 (Traub 1967; Okoshi 1976; Sher 1993). A 3-D picture is built up by scanning an optical display onto a flexible mirror. The flexible mirror is either a metalized stretched membrane or a flexible metal plate, typically about 30 cm in diameter. The mirror vibrates at 30 Hz in response to an acoustic signal. The viewer looks into the mirror, which is set at 45° to the screen of a cathode-ray tube. As the focal length of the mirror oscillates, the image on the cathode-ray tube cycles through a series of in-depth slices within a defined volume of the object. For each position of the mirror, the appropriate part of the scanned picture is displayed on the screen. The image is scanned twice in each complete vibration of the mirror, producing an effective scan frequency of 60 Hz. The effect is equivalent to that produced by tracing out a 3-D form by rapid movements of an LED. The fineness of continuous gradations of depth is limited only by the bandwidth of the cathode-ray tube and the persistence of the phosphor. However, only green phosphors are of sufficiently short duration. Opaque objects look transparent because the far parts are seen through the near parts. The changing focal length of the mirror introduces unwanted changes in the size of the image. These distortions can be compensated for in computer-generated images.
The swept volume system was developed by the companies Actuality, Felix 3D, and Genex.
A high-definition projector or an array of lasers projects an image onto a screen rotating rapidly about its midvertical axis. As the screen rotates at 15 revs/s, images appropriate to each depth plane of the mirror are projected onto it 200 times per revolution. The mirror must be well balanced to prevent vibrations. On a moving platform, such as a ship, the device is subject to gyroscopic precession, which can severely degrade the image. The Light-Space Depth Cube is being developed by LightSpace Technologies, Inc. It uses a stack of 20 stationary screens rather than a rotating mirror. Each screen consists of voltage-sensitive crystals sandwiched between two plates of glass. A screen becomes transparent when voltage is applied and reflective when the voltage is switched off. At any moment only one screen reflects the projected image; the others are transparent. As each screen becomes reflective, an appropriate image is projected onto it. An antialiasing procedure smooths transitions between the
STEREOSCOPIC TECHNIQUES AND APPLICATIONS
successive images. The display contains all depth cues including motion parallax and effects of changing accommodation and vergence. Since it contains no moving parts, it is not subject to the effects of vibration. However, since an image in one depth plane cannot occlude an image in another depth plane, the 3-D images are translucent rather than opaque. See Sullivan (2004) for more details on these systems.
24.1.5 RANDOM-DOT STEREOGRAMS
The earliest known random-dot stereogram was produced by Ramón y Cajal and published in a photographic journal in Madrid in 1901 (Ramón y Cajal 1901). It was brought to light by Bergua and Skrandies (2000). Cajal produced a pair of stereo photographs of lines of print defined by dots placed in front of a background of random dots, as shown in Figure 24.10. In each image, the dotted letters melted into the background but became visible when the two images were combined in a stereoscope. He wrote, “My aim was to achieve a mysterious writing, which could only be deciphered with the stereoscope. . . . My little invention is, in fact, a puerile game unworthy of publishing, but it really amused me.” In 1939, Boris Kompaneysky, a member of the Russian Academy of Fine Arts, published the random-dot stereogram shown in Figure 24.11a. The cyclopean image of the face of Venus is not perfectly hidden in the monocular images (Tyler 1994). In 1954, Claus Aschenbrenner published the cyclopean stereogram shown in Figure 24.11b. When fused, the word “leak” appears in relief. Aschenbrenner worked on photographic reconnaissance and made the stereogram out of pieces of paper from a paper punch. Like Cajal, he saw the application to cryptography. This is not a perfect procedure, because elements of the pattern are dissected along the borders of the shifted region, and this could provide monocular information about these borders, although none is identifiable in this example. In computer-generated random-dot stereograms, the disparate region is shifted an
Figure 24.10. Ramón y Cajal’s random-dot stereogram. (From Ramón y Cajal 1901)
Figure 24.11. Early cyclopean stereograms. (A) Stereogram made by Boris Kompaneysky in 1939. It reveals the face of Venus when fused by divergence. (B) Stereogram made by Aschenbrenner in 1954. It contains the word “leak” when fused by divergence.
integral number of dots, so that the region is not evident in either eye’s image. In 1960 Bela Julesz introduced random-dot stereograms as research tools. This had a profound effect on research and testing. Random-dot stereograms were discussed in Section 18.2.3. The essential stages in creating a Julesz random-dot stereogram are illustrated in Figure 24.12. A similar process is easily programmed in a computer (Gonzalez and Krause 1994). Shapes visible only after monocular images are combined are cyclopean shapes. Julesz used the term “global stereopsis” to refer to stereopsis in cyclopean images. Although detection of the cyclopean shape in a random-dot stereogram requires detection of disparities, it does not require detection of depth. The shape is visible when the two images are superimposed in the same eye (see Section 18.2.3). The stereogram can be composed of dots, lines, crosses, or other texture elements, as long as they are randomly arranged. There should be no monocular evidence about the shape or depth of the cyclopean form. If the texture elements are not properly matched, the visual system links pairs of elements at random, which creates a random 3-D distribution of elements, an effect known as lacy depth. Julesz’s original random-dot stereogram consisted of equal numbers of black and white squares packed at random in a regular matrix. Before disparities are introduced, each eye sees the same random-dot pattern. This allows the visual system to find the correct correspondence between
Figure 24.13. A random-line stereogram. A stereogram formed from lines generated by a random walk. When fused by convergence it creates the impression of a crater. (From Ninio 1981, Pion Limited, London)
Figure 24.12. Creating a random-dot stereogram. (A) Arrange black and white squares at random and duplicate the pattern. (B) Shift a region in one image laterally a whole number of squares. Transfer the overlapped squares to the empty space on the other side. (C) In a stereoscope, the shifted region appears in front of or beyond the background, depending on the way it was shifted.
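The steps listed in the caption of Figure 24.12 can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the program of Gonzalez and Krause (1994): the function name, the matrix size, the position of the shifted region, and the choice to shift the region leftward in the right-eye image are all hypothetical.

```python
import random


def make_random_dot_stereogram(size=20, region=(6, 14, 6, 14), shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0 (white) or 1 (black).

    region = (top, bottom, x0, x1) marks rows top..bottom-1 and columns
    x0..x1-1 of the area to be shifted; it requires x0 >= shift.
    """
    rng = random.Random(seed)
    # (A) Arrange black and white squares at random and duplicate the pattern.
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]

    top, bottom, x0, x1 = region
    for y in range(top, bottom):
        # Squares that the shifted region will cover in the right-eye image.
        overlapped = left[y][x0 - shift:x0]
        # (B) Shift the region laterally by a whole number of squares ...
        right[y][x0 - shift:x1 - shift] = left[y][x0:x1]
        # ... and transfer the overlapped squares to the gap on the other side.
        right[y][x1 - shift:x1] = overlapped
    return left, right


left_image, right_image = make_random_dot_stereogram()
for l_row, r_row in zip(left_image, right_image):
    # Print the two half-images side by side ('#' = black, '.' = white).
    to_text = lambda row: "".join("#" if v else "." for v in row)
    print(to_text(l_row), " ", to_text(r_row))
```

Because the shift is a whole number of squares and the overlapped squares are reused, each row of the right-eye image contains the same squares as the corresponding row of the left-eye image, so neither image alone reveals the region.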
the two images. A simple cyclopean shape is created by shifting a region of dots in one eye horizontally by an integral number of dots. This creates a sharp discontinuity of disparity with respect to the surrounding dots. Along vertical edges of the shifted region a set of dots is visible to one eye but not to the other. These monocular occlusion zones play a crucial role in depth perception (Section 17.2).
Monocular occlusion zones can be avoided if random-dot stereograms are generated by continuous modulations of disparity. In this case, the stereogram consists of well-spaced randomly distributed black dots on a white background rather than a regular cellular matrix. Disparity can now be a fraction of the interdot spacing. For example, Ninio (1981) composed stereograms from lines generated by a random walk and deformed in one image according to the disparities produced by a 3-D shape. The example shown in Figure 24.13 creates a protruding annulus with smooth sides. This type of stereogram can represent surfaces inclined or curved in depth with no monocular occlusion zones arising from sharp depth discontinuities (Tyler 1974a; Tyler and Raibert 1975; Brookes and Stevens 1989b).
A continuous gradation of disparity across the horizontal extent of a textured surface introduces variations in dot density in each monocular image. The shapes seen in depth are therefore not purely cyclopean. This occurs, for example, in a stereogram representing a vertical depth-modulated grating. Variations in dot density are not evident in a stereogram of a horizontal grating, as shown in Figure 18.29, because horizontal shear disparities do not create variations in dot density. De Vries et al. (1994) attempted to minimize monocular density cues in projection stereograms, but their procedure does not eliminate the problem (Cobo-Lewis 1996). Monocular cues to relative displacement can be virtually eliminated by making the dot spacing about 20 times the disparity (Westheimer and McKee 1980a). It is often stated that random-dot stereograms are devoid of depth information other than disparity. But it was shown in Section 18.2.3a that this is not so.
Clinical screening tests for stereoscopic vision based on the random-dot stereogram were discussed in Section 18.2.3. Their great advantage over traditional tests is that they contain no monocular cues to indicate the correct response. Some people with otherwise normal stereoscopic vision have difficulty fusing random-dot stereograms, especially if they cannot correctly focus on the stimulus.
24.1.6 RANDOM-DOT AUTOSTEREOGRAMS
When a repetitive pattern, such as that shown in Figure 14.3, is viewed with both eyes, the images can be fused with various angles of convergence. As convergence changes, the images come into correspondence at every multiple of the spatial period of the pattern. The pattern appears closer and smaller when the eyes are converged, and larger and more distant when they are diverged. This is the wallpaper effect first described by Smith in 1738. Brewster (1844a, 1844b) and Locke (1849) created relative depth by introducing
small horizontal deviations of opposite sign into adjacent elements of a repeating pattern. In 1970, Masayuki Ito, a Japanese graphic designer, produced the autostereogram shown in Figure 24.14, using four bands of random-dot patterns. The autostereogram shown in Figure 24.15 was produced in 1977 by Alfons Schilling, a Swiss-American artist now living in Vienna (Sakane 1994). These autostereograms produce simple squares or rectangles in depth. In 1979 Christopher Tyler and the computer programmer Maureen Clarke showed that almost any stereoscopic figure may be generated by programming a computer to produce suitably designed deviations of a repetitive pattern of random dots (Tyler and Clarke 1990). Figure 24.16 shows a one-dimensional random-dot autostereogram consisting of five repetitions of a random sequence of 21 black and white squares. An extra white square has been added to the third sequence. When the eyes converge or diverge by the width of one sequence indicated by the red lines, the sequence containing the extra square in one eye is superimposed on a sequence containing 21 dots in the other eye. This disparity causes this central region to appear displaced in depth relative to the other dots in the display. A 2-D random-dot autostereogram consists of rows of dots, each containing a repeating sequence. The width of the basic repeating sequence is the same in all the rows. Depth is created by adding or subtracting dots in selected sequences of each row, keeping dot density constant.
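The construction of a 2-D random-dot autostereogram can be sketched as follows. This is a minimal illustration, not the procedure of Tyler and Clarke (1990): it uses the common scheme of copying each dot from one repetition period to its left and shortening that period inside the region that is to stand out in depth. The function name, image size, and region are hypothetical; the periods of 32 and 30 dots are borrowed from the example of Figure 24.17.

```python
import random


def make_autostereogram(width=160, height=40, period=32, reduced=30,
                        region=(10, 30, 50, 110), seed=0):
    """Return rows of 0 (white) or 1 (black) forming a repeating dot pattern.

    Outside the region each row repeats every `period` dots. Inside the
    region (rows top..bottom-1, columns x0..x1-1) the repetition cycle is
    shortened to `reduced` dots, which creates a disparity of
    period - reduced dots when neighbouring cycles are fused.
    """
    rng = random.Random(seed)
    top, bottom, x0, x1 = region
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            inside = top <= y < bottom and x0 <= x < x1
            cycle = reduced if inside else period
            if x < cycle:
                # First cycle of the row: a fresh random dot.
                row.append(rng.randint(0, 1))
            else:
                # Later dots repeat the dot one local cycle to the left.
                row.append(row[x - cycle])
        image.append(row)
    return image


for row in make_autostereogram():
    print("".join("#" if v else "." for v in row))
```

Because every dot is either random or a copy of an earlier dot in the same row, dot density stays approximately constant across the display.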
For instance, in Figure 24.17 the repetition cycle is 32 black and white dots wide. In certain regions in each row the width of the repetition cycle is reduced to 30 dots. These regions can differ in position in the different rows. But in Figure 24.17 the regions occur in the same rectangular area in the center of the display. When the eyes diverge by 32 dots, each end of the rectangular area contains a region with 32 dots in one eye and 30 dots in the other eye. This creates a crossed disparity two dots wide at each end of the rectangle. The rectangular region therefore appears to be raised in depth relative to the surrounding region. When the eyes diverge by 64 dots, an added central region is created in which the disparity is four dots wide. This creates the impression of a raised rectangle superimposed on a larger raised rectangle. For every 32-dot increase in convergence, an extra plane of depth is created. The autostereogram in Figure 24.18 creates parabolic ridges or valleys. See Tyler and Clarke (1990) for further information on the construction of autostereograms. A short history of the subject is provided in the book Stereogram (Cadence Books, San Francisco, 1994). Stork and Rocca (1989) described a computer program for generating autostereograms. Ninio (2007) has described refinements in the craft of producing autostereograms. The autostereogram has been a commercial success.
24.1.7 MONOCULAR STEREOSCOPY
Even without binocular disparity, a picture of natural objects creates a strong sense of realism and depth when viewed so that information indicating that it is flat and near is removed. Under the best viewing conditions and with a rich variety of monocular cues to depth it is difficult to tell the difference between a picture with disparity and a picture without disparity (Ames 1925; Carr 1935; Schlosberg 1941; Schwartz 1971). The following viewing conditions help to create depth in a picture:
1. The eyes should be opposite the point in the picture corresponding to the point in the original scene that was straight ahead of the camera.
Figure 24.14. Autostereogram drafted by Masayuki Ito in 1970. An array of squares in depth is produced when the eyes misconverge on the display. An array of rectangles is produced when the display is rotated 90°. (From Sakane 1994)
2. The picture should be viewed through lenses that restore the visual angles of pictured objects to equal those of the original objects (Rule 1941). The zograscope was an 18th-century device in which prints were viewed through a large lens and a mirror to bring the picture back into its correct projection angle at the eye. Similar devices were sold during the 19th century for viewing photographs (see Coe 1981). One such device, known as the Cosmoscope, was marketed by the English photographer Francis Frith in about 1870. Another device called the “Verant” was made by Zeiss in the early part of the 20th century (see von Aster 1906).
Figure 24.16. The principle of the random-dot autostereogram. An extra white space has been added to the third repeating sequence of 20 black and white spaces. When the eyes are converged or diverged by the period of the dot sequence, the central sequence of dots stands out from the surrounding dots.
Figure 24.15. Autostereogram made by Alfons Schilling in 1977. Recessed and elevated rectangles are seen when the two dots are fused. (From Sakane 1994)
3. Lenses should place the picture at optical infinity so that eye accommodation is more appropriate for viewing a scene rather than a nearby picture.
4. A picture appears more solid when viewed through a tube or frame that covers the edges of the picture (Schlosberg 1941). The far end of the tube or the frame should be nearer to the viewer than the picture. A frame in the plane of the picture makes it look like a picture. Thus, a monocularly viewed perspective drawing of a slanted surface appeared less slanted when it was surrounded by a frame in the same plane compared with when it was presented without a frame (Eby and Braunstein 1995; Reinhardt-Rutland 1999). A frame placed round a real 3-D scene reduces the perceived depth within the scene (Hagen et al. 1978). In an art festival in California, objects and live actors were used to recreate famous paintings. The 2-D impression was enhanced when a large frame was placed round the scene (Braunstein 1976).
5. A stationary viewing point should be provided that prevents one’s noticing the absence of parallax that would occur in a real scene. This can be done by looking through a small aperture. It can also be done by placing the picture at the focal plane of a lens, and with the center of rotation of the eye at the exit pupil of the lens. The impression of depth in a 2-D display, such as a scene on a television screen, is greatly enhanced when motion parallax is added.
6. Finally, realism is improved if the picture is viewed through mirrors or reflecting prisms that bring the two visual axes into coincidence, as shown in Figure 24.19. Such an instrument is known as an iconoscope. Unlike the telestereoscope, which increases the stereobase, the iconoscope reduces the stereobase so that it is as if the eyes were closer together. In the limit, it is as if the eyes were both located at a point midway between the eyes. Apparent depth in real 3-D objects is reduced when they are viewed through an iconoscope, because disparity is reduced. However, the iconoscope increases perceived depth in a 2-D picture because it causes the visual axes to be diverged as if one is looking at a distant scene rather than converging on a nearby picture. In 1907 the Zeiss Company patented an instrument called the “synopter,” which altered vergence in this way, as shown in Figure 24.19.
Koenderink et al. (1994) measured the ability of subjects to set an elliptical depth probe to conform to the local gradient of perceived depth at various locations on a 2-D picture of a complex 3-D object viewed monocularly, binocularly, or binocularly through the synopter illustrated in Figure 24.19b. The settings indicated that all subjects saw greater depth in the pictures with monocular viewing than with binocular viewing. Several subjects saw greater depth still when viewing with both eyes through the synopter, which brought the eyes into parallel vergence. Under these conditions, the zero disparity in the picture does not contradict other depth cues, since disparity is zero when the eyes are in parallel vergence and view a 3-D scene at infinity. With normal binocular viewing, the absence of disparity in the picture made it appear flatter. With real objects, perceived depth was greatest with binocular viewing, less with monocular viewing, and less still with objects viewed through a synopter (Koenderink et al. 1995).
The camera obscura, viewing boxes, and dioramas, described in Section 2.11.1, produce vivid depth without disparity, as do computer-generated graphics and video games.
24.1.8 THE DICHOPTISCOPE
A stereoscope is an instrument that combines two flat dichoptic displays with appropriate disparities to produce a fused image that appears three-dimensional. While an
Figure 24.17. Autostereogram of a rectangle. A single rectangle emerges when the eyes converge or diverge by the distance between the two dots. For each extra interval of convergence, an extra rectangular layer emerges.
object viewed in a stereoscope has realistic depth, it lacks three cues to depth possessed by the actual object. The 3-D object created in a stereoscope from two flat images differs from the real object in several ways:
1. One can converge the eyes on different parts of a stereoscopic object. But one cannot accommodate on different parts of the object because focal distance is fixed in the plane of the monocular images.
2. Lateral motion of the head is not accompanied by parallax in a stereoscopic object. The stereoscopic object appears to rotate with the head.
3. When a real object moves in depth, the disparity between near and far parts of the object changes as an inverse function of the square of viewing distance. In a stereoscope, disparity changes only as an inverse function of the distance of the stereograms from the viewer.
4. When a real object moves in depth, perspective and occlusion relationships between near and far parts of the object change. These changes do not occur in a stereoscopic object.
5. The only way to continuously change disparities in a stereoscopic object is to use computer-generated images, which are subject to problems of pixelation and frame rates.
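The difference described in point 3 can be checked with a short calculation. The sketch below rests on assumptions not given in the text: an interocular distance of 6.5 cm, an object lying on the midline, and hypothetical function names. It compares the disparity between the near and far parts of a real object with the angular disparity of a fixed lateral offset between stereogram images when the viewing distance is doubled.

```python
import math

INTEROCULAR = 0.065  # metres; assumed typical value, not taken from the text


def vergence_angle(distance, interocular=INTEROCULAR):
    """Angle (radians) subtended by the two eyes at a point on the midline."""
    return 2 * math.atan(interocular / (2 * distance))


def real_object_disparity(distance, depth, interocular=INTEROCULAR):
    """Relative disparity between the near part of a real object at
    `distance` and its far part at `distance + depth`."""
    return (vergence_angle(distance, interocular)
            - vergence_angle(distance + depth, interocular))


def stereogram_disparity(distance, image_offset):
    """Angular disparity produced by a fixed lateral offset between the two
    stereogram images when they are viewed from `distance`."""
    return 2 * math.atan(image_offset / (2 * distance))


# Doubling the viewing distance from 1 m to 2 m.
real_ratio = real_object_disparity(1.0, 0.01) / real_object_disparity(2.0, 0.01)
stereo_ratio = stereogram_disparity(1.0, 0.005) / stereogram_disparity(2.0, 0.005)

print("real object, disparity ratio:", real_ratio)    # roughly 4 (inverse square)
print("stereogram, disparity ratio:", stereo_ratio)   # roughly 2 (inverse)
```

For a depth interval that is small relative to the viewing distance, the real-object disparity is well approximated by the interocular distance times the depth interval divided by the square of the distance, which is the inverse-square relation stated above.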
All these limitations can be overcome by optically combining two identical 3-D objects. An instrument containing two 3-D objects is not a stereoscope. The first author of this volume has designed such an instrument, which he calls a dichoptiscope. A dichoptiscope is any instrument that allows one to independently control the binocular images of any 2-D or 3-D object. Two identical objects in a dichoptiscope will be called image objects and the fused image will be called the dichoptic object. When the image objects are precisely aligned and oriented, the dichoptic object is indistinguishable from either of them. One can converge and accommodate on different parts of the dichoptic object, and head motion produces normal parallax. A dichoptic object contains all the shading, color, and fine texture of the actual object. When the image objects move in depth, the dichoptic object also moves in depth. The changes in perspective and disparity are identical to the changes in an actual moving object. The dichoptiscope allows one to independently control the following types of binocular disparity in the images of a stationary object or of an object moving in depth.
Absolute disparity. This is the angular separation of the images of an object lying outside the horopter. As an object moves in depth, with the eyes converged on a fixed point, its absolute disparity changes. If the eyes fixate the moving object, the change in absolute disparity is canceled by a change in vergence.
Figure 24.18. An autostereogram of parabolic valleys or ridges. Four parabolic valleys or ridges emerge when the eyes diverge or converge by the interval indicated by the two dots. Extra depth planes emerge when the eyes diverge or converge by two dot intervals.
Internal disparity. This is relative disparity arising from different parts of an object. It is an inverse function of the object’s distance squared. It is not affected by changes in vergence.
External disparity. This is relative disparity of one object with respect to another object. It is proportional to the depth separation between the objects.
Figure 24.20 shows a plan view of the instrument. The two image objects are mounted on opposite sides of a pair of mirrors set at 45°. Each image object is mounted on a vertical carriage and adjusted so that the images fuse into a single dichoptic object when viewed through the mirrors. Each carriage is mounted on a linear horizontal motion track. Each motion track is pivoted about a vertical axis. The pivot point can be anywhere along the midline of the track as long as it is in the same location on the two tracks. In what follows the pivot point is set at 57 cm from each eye. Each carriage can also be rotated about a vertical axis with respect to the central axis of the motion track and translated along a short track orthogonal to the motion track. Each type of disparity listed above is controlled as follows:
Absolute disparity (vergence). As a real object approaches along the midline of the head its images translate in opposite directions in the two eyes to
produce a change in absolute disparity. If the object is binocularly fixated, the eyes converge so as to cancel the change in disparity. When the motion tracks of the dichoptiscope are parallel and aligned with the apex of the mirrors the image objects fuse to form a dichoptic object that moves along the midline, as shown in Figure 24.20. Now consider what happens when each motion track is turned about a point 57 cm from an eye until it is aligned with a visual axis. Each image object now moves along a visual axis so that there is no change in absolute disparity. The eyes must remain converged at 57 cm to maintain a fused image. When the motion tracks are turned further, the change in absolute disparity and the change in vergence are reversed. The observer must diverge when the dichoptic object approaches and converge when it retreats. When the motion tracks are rotated in the opposite direction from the initial parallel position, the change in absolute disparity is greater than normal. Thus, by simply pivoting the motion tracks one can set the change in absolute disparity to any value without changing any other cue to motion in depth.
Internal disparity. The near parts of a 3-D object produce images that have binocular disparity with respect to the images of far parts of the object. These internal disparities arise because the eyes see the object from different angles. Therefore, the internal disparities in a dichoptic object can be changed by
simply rotating the image objects in opposite directions about their midvertical axes. When each image object is aligned with the cyclopean axis, the internal disparities are correct. When each image object is aligned with a visual axis, all internal disparities are eliminated and the dichoptic object appears flat. Rotation of the image objects beyond this point reverses disparities and turns the dichoptic object inside out. Rotation in the opposite direction increases disparities and elongates the dichoptic object. Thus, by simply turning the image objects in opposite directions the internal disparity of the dichoptic object can be set to any value and the object can be made to appear to shrink, reverse, or expand in depth without affecting any other cues to depth.
As an object approaches along the midline, the angle that each image makes with respect to each visual axis changes. In effect, the images in the two eyes rotate in opposite directions. This means that internal disparity increases as an object approaches. It increases approximately in inverse proportion to the square of viewing distance. The rate at which internal disparity changes can be controlled by rotating each image object as it moves along the motion track. This is achieved by extending a rod from under each eye through a linear bearing in each carriage, as shown in Figure 24.20. When each eye rod is pivoted about a vertical axis under the apex of the mirrors the image objects remain orthogonal to the midline of the head, and the change in internal disparity is the same as that
Figure 24.19. Instruments for combining the visual axes. Panels: the iconoscope; the synopter. Label: five right prisms cemented together.
Figure 24.20. The dichoptiscope. Two identical objects are mounted on motion tracks. When the tracks are aligned with the midline, the mirrors combine the images to create a dichoptic object moving along the cyclopean axis. Rotation of the motion tracks in opposite directions changes the absolute disparity of the images (vergence). Rotation of the image objects in opposite directions changes internal disparity in the cyclopean object. Adjusting the pivot point of the eye rods changes the way internal disparity changes as the object moves along the track. Moving the image objects in opposite directions along tracks orthogonal to the motion track changes the absolute disparity of the object.
produced by an object moving along the midline. When the pivot point of each eye rod is positioned so that the rod lies along the visual axis, changes in internal disparity are eliminated. The value of the unchanging internal disparity depends on the position of the pivot point of the eye rod and on the orientation of the image object on its carriage. When the pivot of the eye rod is set beyond the image object the change in internal disparity can be reversed so that it decreases rather than increases as the object approaches.
External disparity. As an object approaches, the disparity in its images changes with respect to the images of a stationary object. At any instant, the external disparity is proportional to the distance between the two objects. In the dichoptiscope, external disparity is provided by placing an object in front of the viewer so that it is seen by both eyes through the semisilvered mirrors.
24.2 APPLICATIONS OF STEREOSCOPY
24.2.1 PHOTOGRAMMETRY
The development of stereophotography and stereoscopic movies was discussed in Section 2.11.4. Photogrammetry is the technology of deriving measurements from photographs. An example is aerial photography. In 1849, Colonel Aimé Laussedat of the French Army Corps of Engineers conducted the first experiments in preparing maps from aerial photographs. During the Second World War, stereoscopic photographs of enemy territory revealed structures, such as rocket sites, that were not evident in ordinary photographs. Aerial photography is also used in geological surveys, mapping, and forestry. Stereoscopic film sequences of clouds are used to reveal their structure, relative height, and their motions relative to the ground. An example of an aerial stereogram is shown in Figure 24.21 (Wanless 1965). In aerial photography, the stereobase is the distance flown by the aircraft between exposures. The disparity in the pictures depends on the stereobase, the height of the aircraft, the height of the feature on the ground, and the angle with respect to the vertical at which the object is viewed. It may be advantageous to increase the stereobase so as to exaggerate the height of ground features and make them more visible. The actual height of ground structures can be calculated only if the camera remains level and the aircraft flies straight and at a constant known height. The elements of a typical instrument for measuring depth in stereoscopic photographs are shown in Figure 24.22. Each stereogram is viewed through a pair of mirrors and a magnifier. Two of the mirrors are partially silvered so that a fixed point of light is seen on the optic axis of one eye and the other eye sees a movable point of light. The observer adjusts
the movable light by a micrometer so that the two lights fuse to form one image in the depth plane of a selected feature of the stereoscopic display. The stereoscopic photographs are then moved together along orthogonal tracks in the plane of the horizontal surface on which they are mounted, and the depths of other features are measured. In this way a depth contour map of the stereoscopic display can be constructed. Automatic digital plotters can be used to draw maps with elevation contours. Stereo images can be synthesized from single displays that contain elevation data, such as those of the U.S. Geological Survey, or from a contour map. Each point in a duplicate image is shifted laterally through a distance proportional to the specified elevation. The constant of proportionality determines the scale of depth values obtained. The original and shifted images are then viewed in a stereoscope. Stereoscopic displays can be constructed to represent any kind of statistical data that varies over the terrain (Jensen 1980; Usery 1993). For example, stereo relief maps showing regional variations in population density, annual rainfall, or temperature can be created. Computer-generated stereograms based on the calculated dispositions of atoms in a molecule help chemists to visualize the structure of complex organic molecules (Figure 24.23). Physicists have used stereograms to gauge the angle through which an atomic particle is deflected in a photographic emulsion (Martin and Wilkins 1937). A stereoscope can be used to detect whether two objects are identical. For example, when a forged bank note is stereoscopically fused with a genuine bank note, slight differences become visible as surface relief. Cisne (2009) suggested that monks in Ireland and the North of England in the 7th and 8th centuries used a stereoscopic procedure to make exact copies of patterns in illuminated manuscripts, such as the Book of Kells.
Inaccuracies in a set of repeating patterns would be revealed by free fusing the patterns.
24.2.2 TELESCOPES AND RANGEFINDERS
Binoculars are stereoscopic telescopes that magnify the scene and increase the distance range of stereopsis. A good pair of field binoculars doubles the normal stereobase and magnifies about 12 times. If we accept 1.5 km as the distance range over which stereopsis normally operates, a pair of binoculars with these features would, theoretically, extend the range to 36 km. Since increasing magnification proportionately reduces the diameter of the visual field, it is better to improve depth sensitivity in stereoscopic telescopes by increasing the stereobase rather than by increasing magnification. Stereoscopic telescopes used in gunnery consist of a pair of telescopes with a magnification of about 10 and a field of view of about 5˚. The telescopes are linked by a hinge so that the eyepieces remain at a fixed distance apart. The stereobase can be increased by moving
S T E R E O S C O P I C T E C H N I Q U E S A N D A P P L I C AT I O N S
Figure 24.21.
Aerial photograph. Designed for divergent fusion.
(From Wanless 1965)
the objectives apart up to about 0.75 m. Such an instrument does not indicate the absolute distances of objects but does allow a gunner to see whether an exploding projectile undershot or overshot its target. A stereoscopic rangefinder is an instrument for measuring the absolute distance of an object. The basic principles are illustrated in Figure 24.24. Left-eye and right-eye images of the distant object are formed in the plane of the graticule plates by objectives and eyepieces of a Keplerian telescope. The pentaprisms act as mirrors and compensate for the inversion of the images by the telescope. Movement of the wedge prism along the axis of one optical system moves the image of the object in that eye horizontally relative to the image in the other eye. The distance of the object is given by the movement of the prism on a calibrated scale required to bring the object to appear at the same distance as a pair of marks on the graticule plates. A less sensitive procedure is to move one of the images by rotating one of the mirrors about a vertical axis. The sensitivity of the instrument may be increased by increasing the magnification of the images, but at the price of reduced field of view, or by lengthening the instrument’s stereobase, but at the price of increased weight. In a stereocoincidence rangefinder, a normal stereoscopic image is presented in the upper half of the field, and a pseudoscopic image in the lower half. Because the pseudoscopic image reverses depth, any movement of the object toward the viewer causes the upper image to move nearer and the lower image to recede. The prism compensator is then adjusted to bring the two images to the same depth plane. This doubles the sensitivity of the instrument because the relative movement of the upper and lower images is twice that of one image relative to a fixed calibration mark. In applications to astronomy the stereobase must be greatly enlarged. For example, a stereophotograph of Saturn’s rings can be taken by allowing an interval of 24 hours between the two pictures. Movement of the earth during this period produces a stereobase of 1,730,000 km. Astronomical objects that move with respect to the fixed stars may be detected stereoscopically. The motion produces a disparity in two photographs separated by a period
Figure 24.22. Elements of a photogrammetry instrument. (Labels in the figure: fixed point of light, movable point of light, micrometer, left image, right image.)
STEREOSCOPIC VISION
Figure 24.23. Stereoscopic structure of a protein molecule. The resolution is 3 Å. For divergent fusion. (From Lawrence et al. 1995. Reprinted with permission from AAAS)
of time. The planetoid “Stereoscopia” was discovered in this way, and so was the comet Shoemaker-Levy 9, which collided with Jupiter in 1994 (Mestel 1994). Stereoscopic star charts may be synthesized by introducing into a pair of identical charts disparities derived from the known distances of stars, in terms of an assumed stereobase of, say, 100 light years.
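The range estimate for field binoculars given at the start of this section follows from a simple scaling rule, which can be sketched as below. The function name and parameters are illustrative, and the sketch assumes, as the worked example in the text implies, that the range limit of stereopsis scales linearly with both the stereobase and the magnification.

```python
def stereo_range_km(unaided_range_km, stereobase_gain, magnification):
    """Theoretical range of stereopsis through a stereoscopic telescope.

    Assumes the range limit scales linearly with the gain in stereobase
    and with the magnification. This is an idealization: optical quality
    and atmospheric conditions reduce the usable range in practice.
    """
    return unaided_range_km * stereobase_gain * magnification


# Field binoculars from the text: doubled stereobase, 12x magnification.
print(stereo_range_km(1.5, 2, 12))  # 36.0

# Gunnery telescope with objectives 0.75 m apart and 10x magnification.
# The 0.065 m interocular distance is an assumed typical value.
print(stereo_range_km(1.5, 0.75 / 0.065, 10))
```

The second call shows why lengthening the stereobase is attractive: the gain comes without the loss of field of view that accompanies higher magnification.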
24.2.3 STEREOMICROSCOPY
24.2.3a Binocular Microscopes Stereophotographs taken through a binocular microscope reveal the fine structure of animals and plants, as in Figure 24.25. The magnified image of a small object is optically close to the eyes, which greatly enlarges the effective angle of convergence. The two eyes therefore see very different views of the object, and this may prevent binocular fusion. This problem is avoided by reducing the effective interpupillary distance. In microscopes magnifying up to 50 times, the stereobase is reduced by prisms. In higher power microscopes, the stereobase is reduced by bringing the objective lenses close together. However, two objectives cannot be closer than their diameter. At very high magnifications, one objective lens is used with a beam-splitting device located in its exit pupil. The rays from the object are optically divided into two bundles, one for the left eye and one for the right eye. One bundle crosses over the other so that each eye receives the correct image. The result is that each eye views the object through its own half of the objective (Figure 24.26). M. M. Nachet
Figure 24.24. Optical components of a stereoscopic rangefinder. (Labels in the figure: objective lens, wedge prism and scale, graticule, eyepieces, pentaprism.)
invented this procedure in the 1850s (Claudet 1858b, p. 238). Microscopes of this type have been in general use since about 1880. At very large magnifications, the depth of field is very limited. The eccentric projection of light rays through the two halves of the objective lens introduces distortions, which can be compensated for in digitized images. Recent improvements are described in Sroczynski (1990). Stereoscopic images may be reconstructed from serial optical sections of a 3-D object placed in a monocular microscope (Inoué and Inoué 1986). A series of images of closely spaced cross sections of the object is obtained and converted into digitized images in a computer (see Agard 1984; Danuser 1999). The images are superimposed to form a stack, which is sheared to the left. A duplicate stack is sheared to the right. The two stacks are then viewed in a stereoscope. Parts of the object occluded by nearer parts may be removed in the digitized images.
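The shearing of a stack of optical sections described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited author: the function names are hypothetical, sections are plain nested lists, the shear is a whole number of pixels per section, and superimposition is taken to be a maximum-intensity projection, which is only one of several possible choices.

```python
def shift_row(row, dx, fill=0):
    """Shift a row of pixels horizontally by dx pixels (positive = right)."""
    n = len(row)
    out = [fill] * n
    for x, value in enumerate(row):
        nx = x + dx
        if 0 <= nx < n:
            out[nx] = value
    return out


def sheared_projection(stack, shear_px):
    """Shear a stack of sections and superimpose them into one image.

    stack    : list of 2-D sections (lists of rows), first section unshifted
    shear_px : horizontal shift added per section; the sign sets direction
    """
    height, width = len(stack[0]), len(stack[0][0])
    image = [[0] * width for _ in range(height)]
    for k, section in enumerate(stack):
        for y, row in enumerate(section):
            shifted = shift_row(row, k * shear_px)
            image[y] = [max(a, b) for a, b in zip(image[y], shifted)]
    return image


def stereo_pair(stack, shear_px=1):
    """Return (stack sheared left, stack sheared right) as two images."""
    return sheared_projection(stack, -shear_px), sheared_projection(stack, shear_px)


# Three one-row sections, each with a bright point at x = 3.
stack = [[[0, 0, 0, 9, 0, 0, 0]] for _ in range(3)]
sheared_left, sheared_right = stereo_pair(stack)
print(sheared_left[0])   # [0, 9, 9, 9, 0, 0, 0]
print(sheared_right[0])  # [0, 0, 0, 9, 9, 9, 0]
```

A point lying deeper in the stack is displaced further in opposite directions in the two images, so its disparity grows with section number. Which image goes to which eye, and hence the sign of the perceived depth, depends on the viewing arrangement.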
24.2.3b Confocal Scanning Microscope The optical separation between successive image planes can be improved by using the confocal scanning microscope, invented by Marvin Minsky in 1955 (Minsky 1988). A point of light is focused in an aperture in the object plane. The illuminated point or fluorescence emitted from the specimen is imaged in a confocal aperture in front of the detector, which may be a video camera or a photomultiplier (Boyde et al. 1990; Diaspro 2002). In a conventional microscope, light from out-of-focus parts of a specimen dilutes the focused image, although computer algorithms can remove out-of-focus images. In the confocal microscope, light from planes other than the plane of focus does not enter the image aperture, as illustrated in Figure 24.27. There is little optical interference from out-of-focus planes because, with a high-intensity laser beam, the aperture can be kept small and the illumination distribution constrained by the diffraction limit of the lens. Resolution is improved by using a laser beam rather than light from a conventional light source (Brakenhoff et al. 1986). The focused beam must raster scan the specimen to produce a wide image. In the on-axis method the object is mechanically moved within the object plane relative to a fixed diffraction-limited laser beam focused to a diameter of under 1 μm. On-axis scanning at TV frame rates can be obtained by using digitally controlled acousto-optic deflectors rather than mechanical scanners (Goldstein et al. 1990). In the off-axis method, the laser beam scans over the stationary specimen. Although this allows faster scans, the light beam does not remain on the axis of the optical system. The laser beam can scan over a succession of planes through the object. The stack of digitized images is converted into images that would be seen by the other eye
Figure 24.25.
Stereomicrograph of squirrel retina. For divergent fusion.
(By S. K. Fisher and K. A. Linberg, UC Santa Barbara)
by laterally shifting each image in the stack through the appropriate distance. The related images in each stack are superimposed and the composite image stacks are viewed in a stereoscope. Multiaperture confocal microscopes use many pinholes at the same time, rather than scanning one pinhole over the specimen. This allows images to be collected more rapidly. In conventional microscopes, lateral resolution does not exceed the Rayleigh limit, which is approximately one-third the wavelength of light, or about 180 nm. Depth resolution is no better than about 500 nm. Scanning the specimen with a laser beam in the confocal microscope yields a 1.4-fold improvement in resolution compared with the standard microscope. Lateral and depth resolution can also be improved by using two opposed high-aperture lenses (Hell et al. 1997). A further 1.5-fold improvement in lateral resolution and improved imaging speed have been achieved by superimposing on the specimen a meshlike interference pattern produced by four lasers. Images produced from five positions of the interference pattern are postprocessed (Frohn et al. 2000). The harmonic excitation of the specimen contributes additional high spatial frequency contents of the specimen to the processed image. More recently, depth resolution has been improved down to 100 nm, and improvements down to 50 nm are in the offing (Egner and Hell 2005). Recent advances in high-resolution charge-coupled detectors (CCDs) and data processing have opened up the field of digital holography. With this procedure a 3-D image of a biological specimen may be acquired in one scan of the detector. Speed is important when studying the dynamics of cellular processes. However, fluorescent chemicals used to identify tissues do not produce the coherent light required for holography. This limitation can be overcome by use of a two-aperture scanning system. For details see Indebetouw and Zhong (2006).
Figure 24.26. A stereoscopic microscope. (Redrawn from Valyus 1966)
Figure 24.27. The essentials of a scanning confocal microscope. A laser beam reflected from a dichroic mirror (one that reflects certain wavelengths but transmits other wavelengths) forms the image of a small aperture in a defined plane of the specimen. The laser beam causes light to be emitted from fluorescent dyes in the specimen. Light from the in-focus plane is transmitted through the dichroic mirror to form an image in a small aperture in front of the detector. Light from other planes in the specimen (dotted lines) falls outside the image aperture. Mirrors (not shown) scan the laser beam over one plane of the specimen. The specimen is then moved along the visual axis and the scanning process repeated to create a 3-D stack of 2-D images. (Labels in the figure: detector, image plane, confocal apertures, laser light source, dichroic mirror, beam expander, objective, out-of-focus planes, in-focus plane, specimen.)
24.2.3c Nonlinear Optical Microscopes In a conventional scanning microscope that uses fluorescent dyes, the dye is not closely confined to the aperture of the microscope. This reduces axial resolution. Also, fluorescent dyes have a toxic effect on living tissue. These problems have been overcome by nonlinear optical techniques made possible by the development of ultrashort pulsed lasers. The two-photon scanning microscope uses fluorescent dyes that are excited only when they absorb two photons almost simultaneously (Denk et al. 1990). The specimen is irradiated by a focused pulsed laser beam with pulses of about 100-femtosecond duration (1 fs = 10⁻¹⁵ s) at a repetition rate of 80 MHz. The wavelength of the laser is long so as not to damage living tissue. Excitation of dye molecules
normally requires dangerous short-wavelength ultraviolet light. However, a dye molecule is excited when two photons of the long-wavelength laser light combine their energy by striking the molecule at the same time. Axial resolution is improved over that of conventional scanning microscopes because the probability of two-photon absorption decreases rapidly away from where the laser is focused. Photobleaching and tissue damage are also confined to the neighborhood of the focused laser beam. The local excitation of dye molecules allows the image to be detected without the use of a confocal pinhole aperture. Resolution can be increased still further by simultaneous excitation of fluorescent dyes with three photons (Hell et al. 1996). This procedure allows one to direct a recording electrode to particular cells in cell cultures or in the living brain, as described in Section 5.4.3. In a conventional laser-scanning microscope, resolution is limited because the width of the focused beam cannot be less than the point-spread function, as defined by diffraction (Section 3.2.4). In the stimulated-emission scanning microscope a principal focused laser excites the fluorescent dye over a region defined by the point-spread function, while a second laser forms two beams on the flanks of the main beam. The secondary beams deplete the excited state of the dye molecules before fluorescence occurs. This confines phosphor emission to a small region at the center of the point-spread function of the main beam. Resolution is improved by a factor of 4.5 relative to that of a single beam-scanning microscope (Hell and Wichmann 1994). The third-harmonic generation (THG) microscope depends on the fact that three photons of a given frequency can be absorbed at an optical boundary and emitted as one photon at three times the frequency. Focused pulses of near-infrared laser light of about 100-fs duration are transmitted through a transparent specimen.
Wherever the focused pulse of light coincides with a refractive boundary, third-harmonic light is emitted in the forward direction and recorded on a video camera. The method can be used only for transparent specimens illuminated by transmitted light. Since emission of third-harmonic light is proportional to the cube of light intensity, emission falls off rapidly away from where the laser light is focused. Thus, lateral and axial resolutions are very good. Fluorescent or other staining agents do not need to be injected into the specimen, because refractive boundaries generate third-harmonic light and therefore serve as a natural label. Organic molecules do not absorb the near-infrared laser light, so the method can be used on living tissue. The rapid 2-D scanning of the laser beam across the specimen creates a 2-D image of about 250 μm by 250 μm and prevents local heating. The method lends itself to the production of a 3-D stack of images that can be converted into stereo displays of boundaries and surfaces in the specimen (Müller et al. 1998; Millard et al. 1999).
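The reason a signal that depends on the square or cube of light intensity is confined to the focal region can be illustrated numerically. The sketch below is a simplified model, not taken from the cited papers: it assumes an ideal Gaussian beam whose cross-sectional area grows as 1 + (z/zR)², where zR is the Rayleigh range, and an incoherent signal proportional to intensity raised to the number of photons involved. The function name is hypothetical. The cubic intensity dependence of third-harmonic generation gives the same local falloff, although that signal is coherent and arises only at boundaries.

```python
def plane_signal(z_over_zr, n_photons):
    """Relative signal generated in a whole transverse plane at defocus z.

    For a Gaussian beam of fixed power, peak intensity falls as
    1 / (1 + (z/zR)^2) while the illuminated area grows by the same
    factor. Integrating intensity**n over the plane therefore gives
    (1 + (z/zR)^2) ** -(n - 1), normalized to 1 at the focus.
    """
    return (1.0 + z_over_zr ** 2) ** -(n_photons - 1)


# Defocus in units of the Rayleigh range, for 1-, 2- and 3-photon signals.
for z in (0, 1, 2, 4):
    print(z, [round(plane_signal(z, n), 4) for n in (1, 2, 3)])
```

With one-photon excitation every plane contributes equally, which is why a conventional fluorescence microscope needs a confocal aperture to reject out-of-focus light. With two or three photons the contribution of out-of-focus planes falls steeply, so the excitation itself provides the optical sectioning.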
24.2.3d Electron Microscopes In the transmission electron microscope (TEM) electrons are transmitted through a thin section of the specimen. Differential absorption of electrons reveals the internal structure of that section of tissue. The 3-D structure of a specimen can be inferred by inspecting a sequential series of sections. A single section can be very misleading: a series of apparently disconnected elements may be parts of a connected loop, spiral, or reticulum. Also, elements that appear connected may only overlap each other. In the field-emission scanning electron microscope (SEM), electrons are reflected from the surface of the specimen. This affords an immediate impression of depth because of the presence of shadowing, perspective, and the concealment of more distant features by nearer features (Gaunt and Gaunt 1978; Beil and Carlsen 1990). However, the procedure reveals only the surface topography of the specimen, not its internal structure. Stereoscopic photographs can be obtained from a transmission or field-emission electron microscope by taking two successive photographs of a thin section (usually between 1 and 5 microns thick) as the microscope stage is tilted through an angle between 10 and 20˚. The stage is first tilted in one direction and then in the opposite direction (Boyde 1973). An example is shown in Figure 24.28. To obtain a quantitative measure of depth intervals in the stereogram, the linear disparity, d, between a pair of corresponding features is measured with a traveling microscope. It can be seen in Figure 24.29 that a point P at distance D in front of reference point X is displaced horizontally by D sin θ when the object is tilted by θ. The displacement in the image produced by the microscope with magnification M is MD sin θ. Point P is displaced by an equal amount in the opposite direction when the object is tilted the other way. The total linear disparity, d, between the images of P with respect to those of point X is 2MD sin θ. Since θ and M are known, and d has been measured, the depth, D, of point P above point X is
\[ D = \frac{d}{2M\sin\theta} \qquad (8) \]
Features such as the sign of a helix can be discerned only when the 3-D structure of objects is revealed. Shading can be introduced by evaporation of a metal onto the specimen in a high vacuum chamber. The metal particles are evaporated onto the specimen at an oblique angle so that, like light cast at an oblique angle, they create shadows of protruding or recessed elements of the specimen. Finally, carbon is evaporated onto the metal and the metal-carbon replica of the surface of the specimen is then stripped off and viewed in the electron microscope. The process, known as shadowing, creates a compelling impression of relief. The specimen may be a molecule, a virus, a cell organelle, or the surface of a larger object shadowed at ambient temperature. The 3-D structure of spherical objects is most effectively revealed by evaporating metal from a low angle but from all directions. This is known as rotary shadowing. Resolution of features is limited to 2–3 nm. Too thin a metal film fails to reveal small features, and too thick a film causes features to fuse. Resolution also depends on the size of the evaporated metal particles. In the process known as freeze fracturing, the specimen is frozen and then fractured to reveal inner surfaces and structures. Frozen membranes usually fracture in the middle. One half is called the protoplasm face, and the other half is called the exoplasm face. Proteins spanning the membrane cleave into two parts. With this procedure, Heuser et al. (1979) fixed tissue within a few milliseconds to reveal synaptic events associated with a single nerve impulse. General methods of biological electron microscopy are described in Maunsbach and Afzelius (1999). Methods for extracting 3-D images from scanning electron microscopes are described by Minnich et al. (1999).
Figure 24.28. Stereoscopic images from a scanning electron microscope. A sarcoma virus grown in culture. Taken with 10˚ of tilt between a pair of photographs. Field width 50 μm. (From Boyde 1973)
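Equation 8 can be applied directly to a measured disparity. The following is a minimal sketch with a hypothetical function name and made-up measurement values; it assumes that θ is the tilt applied in each direction, so a stereo pair taken with 10˚ of tilt between the photographs corresponds to θ = 5˚.

```python
import math


def depth_from_tilt_disparity(d, magnification, tilt_deg):
    """Depth D of a feature relative to a reference point (Equation 8).

    d             : linear disparity measured between the two micrographs
    magnification : magnification M of the microscope
    tilt_deg      : tilt angle theta applied in each direction, in degrees
    Returns D in the same length unit as d.
    """
    return d / (2.0 * magnification * math.sin(math.radians(tilt_deg)))


# Hypothetical measurement: 0.5 mm disparity on the prints,
# magnification 10,000, stage tilted 5 degrees each way.
depth_mm = depth_from_tilt_disparity(0.5, 10_000, 5.0)
print(f"{depth_mm * 1e6:.1f} nm")  # about 287 nm
```

Because sin θ is small for the tilts used in practice, the computed depth is sensitive to errors in the tilt angle, so θ needs to be known as accurately as d.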
Figure 24.29. Disparity from tilting an object in a microscope. Tilting an object through angle θ displaces point P at distance D in front of reference point X by horizontal distance D sin θ. The binocular disparity in two pictures produced by tilting an object in two directions is 2D sin θ. (Panels: left object, right object.)
24.2.3e Stereo X-ray Photographs Scanning procedures may be used to obtain stereoscopic photographs from an X-ray microscope or large scale X-ray machine (Hallert 1970). Stereoscopic X-ray photographs of the human body may be obtained either by taking photographs successively from two positions or by using two X-ray tubes. These methods were popular in the 1930s, but their use declined because of the additional dosage required. After the advent of X-ray intensifiers in the 1950s, stereoscopic X-ray images could be obtained with low dosage (Hardy et al. 1996).
24.2.3f Atomic Force Microscope The atomic force microscope was invented in 1986. By 1995 it had achieved atomic resolution. In the scanning mode of operation, a fine silicon or silicon nitride tip mounted on a cantilever scans over the surface of the specimen while a servomechanism displaces the sample vertically to keep the cantilever deflection constant (Fernandez 1997). The resulting movements produce 3-D images with a resolution down to 1 nm. In the tapping mode of operation, the tip of the instrument is moved toward and away from the surface of the specimen to measure forces of molecular attraction. The instrument can be used to study molecular dynamics in living cells in real time. For example, it has been used to record changes in the cell membrane associated with the docking and release of secretory vesicles (Schneider et al. 1997). It can be used to probe single protein or DNA molecules in the specimen. The specimen must be protected against uncontrollable drift of the cantilever caused by thermal instability in living tissue and acoustic noise. The atomic force microscope can be used in conjunction with a light microscope (Dvorak and Nagao 1998). Stereoscopic displays can be produced from images from an atomic force microscope (Smith et al. 1999). The atomic force microscope operating in dynamic mode can achieve higher resolution. The specimen is placed in a high vacuum and the probe vibrates at atomic amplitudes as it scans the surface. Changes in chemical bonding forces between the tip and the molecules of the surface of the specimen modulate the vibration frequency (FM) and/or amplitude (AM) and reveal the atomic structure of the surface (Giessibl et al. 2000).
24.2.4 STEREOENDOSCOPY AND STEREO MRI
An endoscope is a surgical probe that conveys light to a surgical site inside the body. Basil Hirschowitz developed the first practical endoscope in 1961. It consisted of a fiberoptic bundle with a lens on its distal end (see Diner and Fender 1993). Endoscopic surgery began in the late 1970s and by the 1990s it was used for viewing most abdominal tissues. Until recently, endoscopes conveyed only a monocular view of the body cavity, and the surgeon had to establish relative distances by motion parallax (Voorhorst et al. 1997) or by touching a landmark near the operative site. Stereoscopic endoscopes are now coming into use (Durrani and Preminger 1995). In one procedure, a small prism and a pair of mirrors are attached to the exit pupil of a conventional endoscope to produce left-eye and right-eye views of the scene. The stereobase in this procedure is small. In another procedure, two charge-coupled video cameras are mounted at the distal end of the endoscope. Even when miniaturized, these cameras increase the diameter of the endoscope. In a third procedure, a pair of lenses is mounted on the end of the instrument, and the images are conveyed along the probe to a prism and mirrors, which deliver the images to right and left cameras. The surgeon views the stereo images on a monitor. Image separation is achieved either by optical shutters or by polarizing filters. In spectrally encoded endoscopy (SEE) polychromatic light is conveyed down a single optic fiber no more than 250 μm in diameter. Each wavelength is projected to a distinct location along a single axis on the tissue. The reflected light is decoded by a spectrometer to form a spatial image. The second dimension is formed by rotating the head of the probe (Yelin et al. 2006). There are conflicting data on the extent to which stereoendoscopy allows surgical operations to be carried out in less time and with fewer errors.
According to one estimate, stereoendoscopy reduces the duration of a typical surgical procedure by about 36% relative to monocular endoscopy (Griffin 1995). However, other studies found no time or accuracy advantage for stereoendoscopy (Chan et al. 1997; Hanna and Cuschieri 2000). Thomsen and Lang (2004) found that, although stereoendoscopes provided no time advantage, they produced some improvement in accuracy.
Jourdan et al. (2004) had eight experienced operators perform well-defined tasks, such as threading a needle and tying a knot. They completed the tasks more rapidly and with fewer errors with a stereoendoscope than with a monocular instrument. It is technically feasible to combine a real-time stereoscopic image produced by an endoscope with a stereo MRI image stored in a computer. The two images could be presented in sequence, side-by-side, or superimposed (Griffin 1995). The MRI image would provide a roadmap for the surgeon and indicate blockages, calcifications, and tumors. In magnetic resonance imaging (MRI), the object of interest is scanned by three orthogonal linear magnetic fields to produce an isometric 3-D image (Section 5.4.3f). Stereoscopic images may be constructed from two scans taken from positions about 10˚ apart (Moseley et al. 1989). Since the resulting image is derived from isometric projections, it lacks perspective. Chen (1998) has proposed a method for producing correct perspective in MRI stereoscopic images. Stereoscopic ophthalmoscopes are used to inspect the retina. By analyzing 3-D images of the fundus over time, the clinician can detect small changes of the optic nerve head that indicate glaucoma (see Duke-Elder 1962, vol. 7, p. 62). In modern instruments, digitized stereoscopic images are produced on a computer monitor.
24.2.5 STEREOSCULPTING AND STEREOLITHOGRAPHY
Many industrial devices use digital information about the 3-D structure of an object to control a milling machine that creates a model of the object. This stereosculpting process is stereopsis in reverse. In stereolithography a 3-D object is built layer by layer on a platform suspended in a vat of liquid photosensitive polymer. A layer of polymer is formed on the platform and a computer-controlled laser beam traces out and hardens a region corresponding to the first cross section of the object. The platform is then lowered into the vat by between 0.05 and 0.15 mm, and a blade wipes a second layer of polymer over the first layer. This layer is scanned to form the second cross section of the object. The process is repeated until the complete 3-D object is formed (Peterson 1991). In a related process an inkjet printer head sprays a pattern of binder fluid that hardens successive layers of powder deposited by a wiper blade. After the object has been formed, unhardened powder is blown away. In more recent machines, microdots of hot plastic are ejected at 6,000 dots per second by a pair of heads that move rapidly back and forth to form successive layers of the object. The first layer is formed on a cooled platform, which rapidly solidifies the plastic. After each layer, the platform is lowered.
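For the vat-based process described above, the step by which the platform is lowered fixes the number of cross sections that must be traced. The short sketch below works through that arithmetic. The function name and the 30-mm object height are illustrative assumptions, and the calculation ignores any support structures a real machine would add.

```python
import math


def layer_count(object_height_mm, layer_thickness_mm):
    """Number of layers needed to build an object of the given height."""
    # The small tolerance guards against floating-point noise in the
    # division before rounding up to a whole number of layers.
    return math.ceil(object_height_mm / layer_thickness_mm - 1e-9)


# A hypothetical 30-mm-tall object at the two platform steps in the text.
for thickness in (0.05, 0.15):
    print(thickness, layer_count(30.0, thickness))  # 600 and 200 layers
```

Thinner layers give a smoother surface along the build axis at the cost of proportionally more scanning passes.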
At present, these devices are used mainly for making 3-D prototypes. Machines now being developed will be able to construct finished products in metal or ceramics. See Gaunt and Gaunt (1978) for an account of methods used to construct 3-D models from stereomicrographs. Stereolithography is described in Serope and Schmid SR (2006).
24.2.6 TELEVISION, VIRTUAL REALITY, AND TELEPRESENCE
Although several methods have been used to create stereoscopic displays on television screens, none of them is available on a commercial scale. Stereoscopic television requires an extended bandwidth for signal transmission because there is more information in a two-eye display than in a monocular display. Several procedures have been proposed for compressing the information. One method makes use of redundancies in binocular images. One need transmit only one eye’s image plus the differences between the two images. In practice, this method has achieved transmission gains of only 30% (Puri et al. 1997). A second method uses standard compression algorithms. The algorithm known as JPEG is used for static images and that known as MPEG is used for moving images. There is some loss of information when the compressed images are reconstructed, but the effects are usually not detectable in the images of natural scenes. A compression algorithm divides the image into small blocks, which are processed independently. When the blocks are recombined there may be visible edges where they join, known as blocking artifacts. Meegan et al. (2001) found that blocking artifacts introduced into one eye’s image remain visible in the fused image. They would presumably interfere with stereopsis. A third method relies on the fact that stereopsis is relatively unaffected by a reduction in the quality of one eye’s image (see Section 18.2.3a). Thus, the amount of information in one image may be reduced without affecting the quality of the fused image. Meegan et al. found that considerable blurring (low-pass filtering) of one eye’s image did not degrade the quality of the fused image. Presumably, stereopsis would not be affected significantly. One of the most active applications of stereopsis is in the design of stereoscopic imaging devices for virtual-reality systems. 
In a virtual-reality system two small monitors are carried on a helmet and viewed through lenses, which magnify the images to create a binocular field about 40˚ wide with flanking monocular fields. The display is coupled to the movements of the head to allow the viewer to look around the virtual environment. Objects in the display may be coupled to the movements of the hand as detected by sensors in a glove, to allow the viewer to manipulate virtual objects or to initiate motion of the self through the virtual space.
Stereoscopic virtual-reality systems are used for entertainment. They are also used for simulators for learning skills, such as flying, athletics, and surgery, where practice in a real environment is dangerous or costly. CAE of Montreal have made a helmet-mounted flight simulator. This technology is reviewed by Earnshaw et al. (1993, 1995), Burdea and Coiffet (1994), and Barfield and Furness (1995). In a related technology, known as telepresence, video cameras convey stereoscopic information from a real-world scene to a monitor viewed at some distance away. The operator can then control machinery in dangerous or inaccessible environments such as mines, nuclear reactors, fires, and inside human bodies. A problem arises in the control of vergence in telepresence systems. If a camera rotates so that the optic axis moves to an oblique angle to a frontal surface, the flat image of the surface suffers keystone distortion. For example, the image of a vertical square in a nonvertical camera is tapered. If two cameras designed to produce stereoscopic images change their convergence, the opposite keystoning in each camera produces an unusual pattern of disparity in the stereo display seen by the operator. Images on converging spherical retinas are not subject to this type of keystone distortion. The distortion would not occur in a camera with a spherical image plane.
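The opposite keystoning produced by converged cameras can be illustrated with a pinhole-camera sketch. This is a simplified model under stated assumptions, not a description of any particular telepresence system: the cameras have flat image planes, lie symmetrically about the midline, and are either parallel or toed in to fixate a point on the midline. All function and parameter names are hypothetical.

```python
import math


def project(point, cam_x, yaw, f=1.0):
    """Pinhole projection of a scene point onto a flat image plane.

    point : (X, Y, Z) with Z measured along the midline from the baseline
    cam_x : horizontal position of the camera's nodal point
    yaw   : rotation about the vertical axis in radians
            (positive turns the optic axis toward +X)
    Returns (x, y) image coordinates for focal length f.
    """
    vx, vy, vz = point[0] - cam_x, point[1], point[2]
    xc = vx * math.cos(yaw) - vz * math.sin(yaw)
    zc = vx * math.sin(yaw) + vz * math.cos(yaw)
    return f * xc / zc, f * vy / zc


def vertical_disparity(point, base, converge_dist=None):
    """Left-minus-right vertical image position of one scene point.

    converge_dist=None models parallel cameras; otherwise both cameras
    are toed in to fixate the midline point at that distance.
    """
    half = base / 2.0
    yaw = 0.0 if converge_dist is None else math.atan2(half, converge_dist)
    _, y_left = project(point, -half, yaw)
    _, y_right = project(point, half, -yaw)
    return y_left - y_right


# Corners of a frontal square 0.4 units wide at a distance of 1 unit,
# viewed by cameras 0.1 units apart.
corners = [(-0.2, 0.2, 1.0), (0.2, 0.2, 1.0), (-0.2, -0.2, 1.0), (0.2, -0.2, 1.0)]
for c in corners:
    parallel = vertical_disparity(c, 0.1)
    toed_in = vertical_disparity(c, 0.1, converge_dist=1.0)
    print(c, round(parallel, 5), round(toed_in, 5))
```

With parallel cameras the vertical disparity of every corner is zero. With toed-in cameras the corners acquire vertical disparities whose sign alternates between diagonally opposite quadrants, which is the unusual disparity pattern referred to above.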
Stereoscopic displays improve the capacity of the operator to judge the 3-D layout of a virtual-reality scene, as indicated for instance by the ability to track a moving object manually. However, people have reported eyestrain and nausea when using virtual-reality systems. Any mismatch between the separation of the images and the viewer’s interocular distance and phoria induces a state of forced vergence. This problem can be overcome by asking the observer to adjust prisms in the viewer so that nonius lines are aligned (Mon-Williams et al. 1993; Wann et al. 1995). But there is still the problem that changes in accommodation accompanying changes in vergence in a natural scene do not occur when one views a virtual scene (Section 10.4.3d). This conflict between vergence and accommodation also strains the vergence system (Hasebe et al. 1996). When monocular cues to depth such as perspective and shading are enhanced, performance may be as good without stereoscopic information as with it (Kim et al. 1987). Stereoscopic display systems are reviewed in Merritt and Fisher (1992, 1993). Another application of stereopsis is the design of artificial stereoscopic systems, especially those attached to robots operating in a 3-D environment in industry and in the extraterrestrial environment (SPIE 1992; Harris and Jenkin 1993; Milios et al. 1993).
REFERENCES
Numbers in square brackets indicate the sections where the references are cited.
Abadi RV (1976) Induction masking—a study of some inhibitory interactions during dichoptic viewing Vis Res 16 299–75 [12.3.3c] Adachi-Usami E, Lehmann D (1983) Monocular and binocular evoked average potential field topography: upper and lower hemiretinal stimuli Exp Brain Res 50 341–6 [11.5.1] Adams DL, Zeki S (2001) Functional organization for macaque V3 for stereoscopic depth J Neurophysiol 86 2195–203 [11.5.1] Adams WJ, Mamassian P (2002) Common mechanisms for 2D tilt and 3D slant after-effects Vis Res 42 2563–8 [21.6.1a] Adams WJ, Frisby JP, Buckley D, et al. (1996) Pooling of vertical disparities by the human visual system Perception 25 165–76 [20.2.4c] Addams R (1834) An account of a peculiar optical phenomenon seen after having looked at a moving body Lond Edin Philos Mag J Sci 5 373–4 [13.3.3a] Adelson EH (1982) Some new illusions and some old ones analyzed in terms of their Fourier components Invest Ophthal Vis Sci 22 (Abs) 144 [16.4.2b] Adelson EH (1993) Perceptual organization and the judgment of brightness Science 262 2042–44 [22.4.5] Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion J Opt Soc Am A 2 284–99 [16.4.1, 11.10.1b] Adelson EH, Movshon JA (1982) Phenomenal coherence of moving visual patterns Nature 300 523–5 [12.3.6b, 22.3.3] Adelson EH, Movshon JA (1984) Binocular disparity and the computation of two-dimensional motion J Opt Soc Am A 1 1266 [22.3.3] Adler FH (1945) Pathologic physiology of convergent strabismus: motor aspects of the nonaccommodational type Arch Ophthal 33 362–77 [14.4.1d] Adrian ED, Matthews R (1927) The action of light on the eye. Part II J Physiol 64 279–301 [23.2.2] Agard DA (1984) Optical sectioning microscopy: cellular architecture in three dimensions Ann Rev Biophys Bioeng 13 191–219 [24.2.3a] Ahissar M, Hochstein S (1995) How early is early vision? Evidence from perceptual learning In Early vision and beyond (ed TV Papathomas, C Chubb, A Gorea, E Kowler) pp 199–206 MIT Press, Cambridge Mass [13.4.1] Ahissar M, Hochstein S (1996) Learning pop-out detection: specificities to stimulus characteristics Vis Res 36 3487–500 [13.4.1] Akase E, Inokawa H, Toyama K (1988) Neuronal responsiveness to three-dimensional motion in cat posterior lateral suprasylvian cortex Exp Brain Res 122 214–26 [11.3.2] Akerstrom RA, Todd JT (1988) The perception of stereoscopic transparency Percept Psychophys 44 421–32 [18.9] Alais D, Blake R (1998) Interactions between global motion and local binocular rivalry Vis Res 38 637–44 [12.3.3b] Alais D, Blake R (1999) Grouping visual features during binocular rivalry Vis Res 39 4341–53 [12.4.3] Alais D, Blake R (2005) Binocular rivalry MIT Press, Cambridge MA [12.3.1a] Alais D, Melcher D (2007) Strength and coherence of binocular rivalry depends on shared stimulus complexity Vis Res 47 269–79 [12.8.3a] Alais D, Parker A (2006) Independent binocular rivalry processing for motion and form Neuron 52 911–20 [12.4.4a]
Alais D, van der Smagt MJ, Verstraten FAJ (1996) Monocular mechanisms determine plaid motion coherence Vis Neurosci 13 615–26 [22.3.3] Alais D, O’Shea RP, Mesana-Alais C, Wilson IG (2000) On binocular alternation Perception 29 1437–45 [12.4.4b] Alais D, Lorenceau J, Arrighi R, Cass J (2006) Contour interactions between pairs of Gabors engaged in binocular rivalry reveal a map of the association field Vis Res 46 1473–87 [12.4.3] Alais D, Cass J, O’Shea RP, Blake R (2010) Visual sensitivity underlying changes in visual consciousness Curr Biol 20 1362–7 [12.10] Albus K (1975) A quantitative study of the projection area of the central and paracentral visual field in area 17 of the cat. I. The precision of the topology Exp Brain Res 27 159–79 [11.3.1] Alexander LT (1951) The influence of figure-ground relationships on binocular rivalry J Exp Psychol 41 376–81 [12.3.2c] Allik J (1992) Resolving ambiguities in orientation motion and depth domains Perception 21 731–46 [17.6] Allison RS (2007) Analysis of the influence of vertical disparities arising in toed-in stereoscopic cameras J Im Sci Technol 51 317–27 [24.1.1] Allison RS, Howard IP, Rogers BJ, Bridge H (1998) Temporal aspects of slant and inclination perception Perception 27 1287–304 [20.3.2c] Allison RS, Gillam BJ, Vecellio E (2009a) Binocular depth discrimination and estimation beyond interaction space J Vis 9(1) Article 10 [20.6.3b] Allison RS, Gillam BJ, Palmisano SA (2009b) Stereoscopic discrimination of the layout of ground surfaces J Vis 9 1–11 [20.1.1] Allman JM, Miezin F, McGuinness EL (1985) Direction- and velocity-specific responses from beyond the classical receptive field in the middle temporal visual area (MT) Perception 14 105–29 [12.3.3b] Alpern M (1952) Metacontrast: historical introduction Am J Optom Arch Am Acad Optom 29 631–46 [13.2.7] Alpern M (1954) Relation of visual latency to intensity Arch Ophthal 51 369–74 [13.1.7, 23.2.2] Alpern M (1968) A note on visual latency Psychol Rev 75 260–4 [23.2.2, 23.4.1, 23.4.2a] Alpern M, Hofstetter HW (1948) The effect of prism on esotropia—a case report Am J Optom Arch Am Acad Optom 25 80–91 [14.4.1d] Alpern M, Rushton WAH, Torii S (1970) Signals from cones J Physiol 207 463–75 [13.2.7b] Ames A (1925) The illusion of depth from single pictures J Opt Soc Am 10 137–48 [24.1.7] Ames A (1929) Cyclophoria Am J Physiol Opt 7 3–38 [12.1.5] Ames A, Glidden GH, Ogle KN (1932a) Size and shape of ocular images. I. Methods of determination and physiologic significance Arch Ophthal 7 576–97 [14.6.1c] Ames A, Ogle KN, Glidden GH (1932b) Corresponding retinal points the horopter and size and shape of ocular images J Opt Soc Am 22 538–574; 575–631 [14.6.2, 14.6.2a, 14.6.2c, 20.6.5a] Amigo G (1963) Variation of stereoscopic acuity with observation distance J Opt Soc Am 53 630–5 [18.6.7] Amigo G (1974) A vertical horopter Optica Acta 21 277–92 [14.7] Andersen EE, Weymouth FW (1923) Visual perception and the retinal mosaic. I Retinal mean local sign—an explanation of the fineness of
binocular perception of distance Am J Physiol 64 561–91 [18.2.1a, 18.6.5] Andersen GJ (1990) Focused attention in three-dimensional space Percept Psychophys 47 112–20 [22.5.1e] Andersen GJ, Kramer AF (1993) Limits of focussed attention in threedimensional space Percept Psychophys 53 658–67 [22.8.1] Anderson BL (1992) Hysteresis cooperativity and depth averaging in dynamic random–dot stereograms Percept Psychophys 51 511–28 [18.8.2c] Anderson BL (1994) The role of partial occlusion in stereopsis Nature 367 365–7 [17.3, 22.3.1] Anderson BL (1997) A theory of illusory lightness and transparency in monocular and binocular images: the role of contour junctions Perception 26 419–53 [22.4.5] Anderson BL (1999a) Stereoscopic surface perception Neuron 24 919–28 [17.2.2] Anderson BL (1999b) Stereoscopic occlusion and the aperture problem for motion: a new solution Vis Res 39 1273–84 [22.3.1] Anderson BL (2003) Perceptual organization and White’s illusion Perception 32 269–84 [22.4.5] Anderson BL, Julesz B (1995) A theoretical analysis of illusory contour formation in stereopsis Psychol Rev 102 705–43 [22.2.4a] Anderson CH, Van Essen DC (1987) Shifter circuits: a computational strategy for dynamic aspects of visual processing Proc Natl Acad Sci 84 6297–301 [18.10.3a] Anderson JD, Bechtoldt HP, Dunlap GL (1978) Binocular integration in line rivalry Bull Psychonom Soc 11 399–402 [12.3.5a] Anderson PA, Movshon JA (1989) Binocular combination of contrast signals Vis Res 29 1115–32 [13.1.2b] Andrews DP (1967) Perception of contour orientation in the central fovea Part 1: Short Lines Vis Res 7 975–97 [13.1.3a] Andrews TJ, Blakemore C (1999) Form and motion have independent access to consciousness Nat Neurosci 2 405–6 [12.3.6b, 12.5.4b] Andrews TJ, Blakemore C (2002) Integration of motion information during binocular rivalry Vis Res 42 301–9 [12.3.6b, 12.5.4b] Andrews TJ, Lotto RB (2004) Fusion and rivalry are dependent on the perceptual meaning of visual stimuli Curr Biol 14 418–23 
[12.3.2d] Andrews TJ, Purves D (1997) Similarities in normal and binocularly rivalrous viewing Proc Natl Acad Sci 94 9905–8 [12.3.8a] Andrews TJ, White LE, Binder D, Purves D (1996) Temporal events in cyclopean vision Proc Natl Acad Sci 93 3689–92 [13.1.5] Andrews TJ, Glennerster A, Parker AJ (2001) Stereoacuity thresholds in the presence of a reference surface Vis Res 41 3051–61 [18.3.3a] Anstis SM (1975) What does visual perception tell us about visual coding In Handbook of psychobiology (ed C Blakemore, MS Gazzaniga) pp 269–323 Academic Press, New York [21.1, 21.4.1] Anstis SM (1980) The perception of apparent movement Philos Tr R Soc B 290 153–68 [16.5.3a] Anstis SM (1986) Motion perception in the frontal plane In Handbook of human perception and performance (ed KR Boff, L Kaufman, JP Thomas) Vol 1 Chap 16 Wiley, New York [16.4.2a] Anstis SM (2000) Monocular lustre from flicker Vis Res 40 2551–6 [12.3.8c] Anstis SM, Duncan K (1983) Separate motion aftereffects from each eye and from both eyes Vis Res 23 161–9 [13.3.3d] Anstis SM, Harris JP (1974) Movement aftereffects contingent on binocular disparity Perception 3 153–68 [22.5.4] Anstis SA, Ho A (1998) Nonlinear combination of luminance excursions during flicker simultaneous contrast afterimages and binocular fusion Vis Res 38 523–9 [13.1.6a] Anstis SM, Moulden BP (1970) After–effect of seen movement: evidence for peripheral and central components Quart J Exp Psychol 22 222–9 [13.3.3d, 16.4.3] Anstis SM, Reinhardt-Rutland AH (1976) Interactions between motion aftereffects and induced movement Vis Res 16 1391–4 [21.1] Anstis SM, Rogers BJ (1975) Illusory reversal of visual depth and movement during changes of contrast Vis Res 15 957–61 [15.3.7b]
Anstis SM, Howard IP, Rogers B (1978) A Craik–Cornsweet illusion for visual depth Vis Res 18 213–17 [21.1, 21.4.2e, 21.5.1, 21.5.2] Anstis SM, Smith DRR , Mather G (2000) Luminance processing in apparent motion, Vernier offset and stereoscopic depth Vis Res 40 657–75 [16.4.2h] Anzai A, Bearse MA, Freeman RD, Cai D (1995) Contrast coding by cells in the cat’s striate cortex: monocular vs binocular detection Vis Neurosci 12 77–93 [13.1.8a] Anzai A, Ohzawa I, Freeman RD (1999a) Neural mechanisms for encoding binocular disparity: field position versus phase J Neurophysiol 82 874–90 [11.4.3a, 11.4.3c, 11.4.5b, 11.10.1b] Anzai A, Ohzawa I, Freeman RD (1999b) Neural mechanisms for processing binocular information I. Simple cells J Neurophysiol 82 891–908 [11.10.1b] Anzai A, Ohzawa I, Freeman RD (1999c) Neural mechanisms for processing binocular information II. Complex cells J Neurophysiol 82 909–24 [11.10.1b] Anzai A, Ohzawa I, Freeman RD (2001) Joint-encoding of motion and depth by visual cortical neurons: neural basis of the Pulfrich effect Nat Neurosci 4 513–18 [11.6.5, 23.3.2] Apkarian PA, Nakayama K , Tyler CW (1981) Binocularity in the human visual evoked potentials: facilitation summation and suppression EEG Clin Neurophysiol 51 32–48 [11.7, 12.9.2e, 13.1.8b] Archer SM, Miller KK , Helveston EM (1987) Stereoscopic contours and optokinetic nystagmus in normal and stereoblind subjects Vis Res 27 841–4 [16.5.1] Archie KA, Mel BW (2000) A model for intradendritic computation of binocular disparity Nat Neurosci 3 54–63 [11.10.1b] Arditi A (1982) The dependence of the induced effect on orientation and a hypothesis concerning disparity computations in general Vis Res 22 247–56 [20.2.3a, 20.4.1d] Arditi A, Kaufman L (1978) Singleness of vision and the initial appearance of binocular disparity Vis Res 18 117–20 [12.1.1c] Arditi A, Anderson PA, Movshon JA (1981a) Monocular and binocular detection of moving sinusoidal gratings Vis Res 21 329–36 [13.3.3f ] Arditi A, Kaufman L, 
Movshon JA (1981b) A simple explanation of the induced size effect Vis Res 21 755–64 [20.2.3a] Aristotle (1931) Parva naturalia De somni. In The works of Aristotle translated into English Vol III Oxford University Press, London [13.3.3a] Arndt PA, Mallot HA, Bülthoff HH (1995) Human stereovision without localized image features Biol Cyber 72 279–93 [17.1.1c] Arnold DH, Grove PM, Wallis TSA (2007) Staying focused: a functional account of perceptual suppression during binocular rivalry J Vis 7(7) Article 7 [12.3.2b, 15.4.1] Arnold DH, Law P, Wallis TSA (2008) Binocular switch suppression: A new method for persistently rendering the visible ‘invisible’ Vis Res 48 994–1001 [12.3.5f ] Arnold DH, James B, Rosenboom W (2009) Binocular rivalry: spreading dominance through complex images J Vis 9(13) Article 4 [12.3.5e] Arnott SR , Shedden JM (2000) Attention switching in depth using random-dot autostereograms: attention gradient asymmetries Percept Psychophys 62 1459–73 [22.8.1] Aschenbrenner CM (1954) Problems in getting information into and out of air photographs Photogram Engin 20 398–401 [24.1.5] Asher H (1953) Suppression theory of binocular vision Br J Ophthal 37 37–49 [12.7.2] Assee A, Qian N. (2007) Solving da Vinci stereopsis with depthedge-selective V2 cells Vis Res 47 2585–602 [17.3] Atchley P, Kramer AF (2001) Object and space-based attentional selection in three-dimensional space Vis Cognit 8 1–32 [22.8.1] Atchley P, Kramer AF, Andersen GJ, Theeuwes J (1997) Spatial cuing in a stereoscopic display: evidence for a “depth aware” attentional focus Psychonom Bull Rev 4 524–9 [22.8.1] Atkinson J (1972) Visibility of an afterimage in the presence of a second afterimage Percept Psychophys 12 257–62 [12.3.8d] Atkinson J, Campbell FW (1974) The effect of phase on the perception of compound gratings Vis Res 14 159–62 [12.3.8a]
Atkinson J, Campbell FW, Fiorentini A, Maffei L (1973) The dependence of monocular rivalry on spatial frequency Perception 2 127–33 [12.3.8a] Attneave F, Block G (1973) Apparent movement in tridimensional space Percept Psychophys 13 301–7 [22.5.3a] Auerbach E, Peachey NS (1984) Interocular transfer and dark adaptation to long-wave test lights Vis Res 27 1043–8 [13.2.2] Avilla CW, von Noorden GK (1981) Limitation of the TNO random dot stereo test for visual screening Am Orthopt J 31 87–90 [18.2.3b] Azar RF (1965) Postoperative paradoxical diplopia Am Orthopt J 15 64–71 [14.4.1e] Bach M, Schmitt C, Kromeier M, Kommerell G (2001) The Freiburg test: automatic measurement of stereo threshold Graefe’s Arch Clin Exp Ophthal 239 562–6 [18.2.2e] Backus BT (2002) Perceptual metamers in stereoscopic vision In Advances in neural information processing systems 14 (ed G Dietterich, S Becker, Z Ghahramani) MIT Press, Cambridge, MA [20.2.3c] Backus BT, Banks MS (1999) Estimator reliability and distance scaling in stereoscopic slant perception Perception 28 217–42 [20.2.3c] Backus BT, Matza-Brown D (2003) The contribution of vergence change to the measurement of relative disparity J Vis 3 737–50 [18.10.2a] Backus BT, Banks MS, van Ee R , Crowell JA (1999) Horizontal and vertical disparity, eye position, and stereoscopic slant perception Vis Res 39 1143–70 [20.2.2b, 20.2.2d] Backus BT, Fleet DJ, Parker AJ, Heeger, DJ (2001) Human cortical activity correlates with stereoscopic depth perception J Neurophysiol 86 2054–68 [11.8.1] Bacon BA, Villemagne J, Bergeron A, et al. 
(1998) Spatial disparity coding in the superior colliculus of the cat Exp Brain Res 119 333–44 [11.2.3] Bacon BA, Lepore F, Guillemot JP (2000) Neurons in the posteromedial lateral suprasylvian area of the cat are sensitive to binocular positional depth cues Exp Brain Res 134 464–76 [11.3.2] Bacon JH (1976) The interaction of dichoptically presented spatial gratings Vis Res 16 337–44 [13.1.6c] Badcock DR , Derrington AM (1987) Detecting the displacements of spatial beats: a monocular capability Vis Res 27 793–7 [12.1.7] Badcock DR , Schor CM (1985) Depth–increment detection function for individual spatial channels J Opt Soc Am A 2 1211–15 [11.4.2, 18.3.3a, 18.7.2b] Bagby JW (1957) A cross–cultural study of perceptual predominance in binocular rivalry J Abn Soc Psychol 54 331–4 [12.8.3a] Bagolini B (1967) Anomalous correspondence: definition and diagnostic methods Doc Ophthal 23 346–98 [14.4.1e] Bagolini B (1976) Part I Sensorial anomalies in strabismus. Part II. Sensori-motorial anomalies in strabismus Doc Ophthal 41 1–41 [14.4.1b, 14.4.1e] Bagolini B, Capobianco NM (1965) Subjective space in comitant squint Am J Ophthal 59 430–42 [14.4.1b] Bailey NJ (1958) Locating the center of visual direction by binocular diplopia method Am J Optom Arch Am Acad Optom 35 484–95 [16.7.6a] Baitch LW, Levi DM (1988) Evidence for nonlinear binocular interactions in human visual cortex Vis Res 28 1139–43 [13.1.8b] Baker CH (1970) A study of the Sherrington effect Percept Psychophys 8 406–10 [13.1.5] Bakin JS, Nakayama K , Gilbert CD (2000) Visual responses in monkey areas V1 and V2 to three-dimensional surface configurations J Neurosci 20 8188–98 [22.2.4c] Balch W, Milewski A, Yonas A (1977) Mechanisms underlying the slant aftereffect Percept Psychophys 21 581–5 [21.6.1a] Ball K , Sekuler R (1987) Direction-specific improvement in motion discrimination Vis Res 27 953–65 [13.4.1] Bando T, Yamamoto N, Tsukahara N (1984) Cortical neurons related to lens accommodation in posterior 
lateral suprasylvian area in cats J Neurophysiol 52 879–91 [11.3.2]
Bando T, Hara N, Takagi M, Yamamoto K , Toda H (1996) Roles of the lateral suprasylvian cortex in convergence eye movements in cats Prog Brain Res 112 143–56 [11.3.2] Banks MS, Backus BT (1998) Extra-retinal and perspective cues cause the small range of the induced effect Vis Res 38 187–94 [20.2.3b] Banks MS, van Ee R , Backus BT (1997) The computation of binocular visual direction: a re–examination of Mansfield and Legge (1996) Vis Res 37 1605–10 [16.7.7] Banks MS, Hooge ITC, Backus BT (2001) Perceiving slant about a horizontal axis from stereopsis J Vis 1 55–79 [20.3.1, 20.3.2a] Banks MS, Backus BT, Banks RS (2002) Is vertical disparity used to determine azimuth? Vis Res 42 801–7 [19.6.4] Banks MS, Gephstein S, Landy MS (2004a) Why is stereoresolution so low? J Neurosci 24 2077–89 [11.10.1c, 18.6.3c] Banks S, Ghose T, Hillis M (2004b) Relative image size, not eye position, determines eye dominance switches Vis Res 44 229–34 [12.3.7] Bannister H (1932) Retinal reaction time Physical and Optical Societies report of a joint discussion on vision pp 227–34 The Physical Society, London [23.4.1] Banton T, Levi DM (1991) Binocular summation in vernier acuity J Opt Soc Am A 8 673–80 [13.1.3c] Bárány EH (1946) A theory of binocular visual acuity and an analysis of the variability of visual acuity Acta Ophthal 27 63–92 [13.1.1b] Bárány EH, Halldén U (1948) Phasic inhibition of the light reflex of the pupil during retinal rivalry J Neurophysiol 11 25–30 [12.5.1] Barbeito R (1983) Sighting from the cyclopean eye: the cyclops effect in preschool children Percept Psychophys 33 561–4 [16.7.2c] Barbeito R , Ono H (1979) Four methods of locating the egocentre: a comparison of their predictive validities and reliabilities Behav Res Meth Instrum 11 31–6 [16.7.6a, 16.7.6b] Barbeito R , Simpson TL (1991) The relationship between eye position and egocentric visual direction Percept Psychophys 50 373–82 [16.7.6b] Barbeito R , Levi D, Klein S, Loshin D, Ono H (1985) Stereo–deficients and 
stereoblinds cannot make utrocular discriminations Vis Res 25 1345–8 [16.8] Barfield W, Furness TA (1995) Virtual environments and advanced interface design Oxford University Press, New York [24.2.6] Barlow HB (1958) Temporal and spatial summation in human vision at different background intensities J Physiol 141 337–50 [13.1.6b, 18.3.5] Barlow HB (1978) The efficiency of detecting changes of density in random dot patterns Vis Res 18 637–50 [18.3.5] Barlow HB, Brindley GS (1963) Inter–ocular transfer of movement aftereffects during pressure blinding of the stimulated eye Nature 200 1349–50 [13.3.3b, 16.4.1] Barlow HB, Fitzhugh R , Kuffler SW (1957) Change of organization in the receptive fields of the cat’s retina during dark adaptation J Physiol 137 338–54 [12.4.2] Barlow HB, Blakemore C, Pettigrew JD (1967) The neural mechanism of binocular depth discrimination J Physiol 193 327–42 [11.1.2, 11.3.1, 11.4.4, 11.4.5b] Barnard ST (1987) A stochastic approach to stereo vision In Readings in computer vision (ed MA Fischler, O Fischein) pp 21–5 Kauffman Los Altos CA [15.2.1a] Barnes GR , Benson AJ, Prior ARJ (1978) Visual vestibular interaction in the control of eye movement Aviat Space Environ Med 49 557–64 [18.10.5] Battersby WS, Wagman IH (1962) Neural limitations of visual excitability IV: spatial determinants of retrochiasmal interaction Am J Physiol 203 359–65 [13.2.3] Bauer A, Kolling G, Dietz K , et al. (2000) Are squinters second-class motorists? Influence of stereoscopic disparity on driving performance Klin Monat Augenheil 217 183–9 [20.1.1] Baumgartner G (1964) Neuronale mechanismen des Kontrast- und Bewegungssehens Ber D Ophthal Ges 66 111–25 [21.5.2] Beard BL, Levi DM, Reich LN (1995) Perceptual learning in parafoveal vision Vis Res 35 1679–90 [13.4.1]
Bearse MA, Freeman RD (1994) Binocular summation in orientation discrimination depends on stimulus contrast and duration Vis Res 34 19–29 [13.1.3a] Beasley WC, Peckham RH (1936) An objective study of “cyclotorsion” Psychol Bull 33 741–2 [12.1.5] Beck J (1965) Apparent spatial position and the perception of lightness J Exp Psychol 59 170–9 [22.4.3b] Beck J (1967) Perceptual grouping produced by line figures Percept Psychophys 2 491–5 [22.8.2a] Beck J (1972) Similarity grouping and peripheral discrimination under uncertainty Am J Psychol 85 1–19 [22.8.2a] Becker S, Hinton GE (1992) Self-organizing neural network that discovers surfaces in random-dot stereograms Nature 355 161–3 [11.10.2] Bedell HE, Klopfenstein JF, Yuan N (1989) Extraretinal information about eye position during involuntary eye movement: optokinetic afternystagmus Percept Psychophys 46 579–86 [22.7.2] Behrens F, Grüsser OJ (1988) The effect of monocular pattern deprivation and open-loop stimulation on optokinetic nystagmus in squirrel monkeys In Post-lesion neural plasticity (ed H Flohr) pp 455–72 Springer, Berlin [22.6.1b] Beil W, Carlsen IC (1990) A combination of topographical contrast and stereoscopy for the reconstruction of surface topographies in SEM J Micros 157 127–33 [24.2.3d] Békésy G von (1967) Sensory inhibition Princeton University Press, Princeton N J [21.1] Békésy G von (1970) Apparent image rotation in stereoscopic vision: the unbalance of the pupils Percept Psychophys 8 343–7 [17.9] Belheumer PN (1996) A Bayesian approach to binocular stereopsis Int J Comp Vis 19 237–60 [11.10.1a, 11.10.1c] Belheumer PN, Mumford D (1992). A Bayesian Treatment of the Stereo Correspondence Problem Using Half occluded Regions. Proc IEEE Conf. 
CVPR , 506–12, Champaign, I [11.10.1c] Benson AJ, Barnes GR (1978) Vision during angular oscillation; the dynamic interaction of visual and vestibular mechanisms Aviat Space Environ Med 49 340–5 [18.10.5] Berardi N, Galli L, Maffei L, Siliprandi R (1986) Binocular suppression in cortical neurons Exp Brain Res 63 581–4 [12.9.2b] Bereby-Meyer Y, Leiser D, Meyer J (2000) Perception of artificial stereoscopic stimuli from an incorrect viewing point Percept Psychophys 61 1555–63 [24.1.1] Berends EM, Erkelens CJ (2001a) Strength of depth effects induced by three types of vertical disparity Vis Res 41 37–45 [20.2.3b, 20.3.2a] Berends EM, Erkelens CJ (2001b) Adaptation to disparity but not to perceived depth Vis Res 41 883–92 [21.6.2d] Berends EM, van Ee R , Erkelens CJ (2002) Vertical disparity can alter perceived direction Perception 31 1323 33 [21.6.2d] Berends EM, Zhang ZL, Schor CM (2003) Eye movement facilitate stereo-slant discrimination when horizontal disparity is noisy J Vis 3 780–94 [18.10.2b] Berends EM, Liu B, Schor CM (2005) Stereo-slant adaptation is high level and does not involve disparity coding J Vis 5 71–80 [21.6.2c] Bergman R , Gibson JJ (1959) The negative aftereffect of the perception of a surface slanted in the third dimension Am J Psychol 72 364–74 [21.6.1b, 21.6.3a, 21.6.3b, 21.6.4] Bergua A, Skrandies W (2000) An early antecedent to modern random dot stereograms—’the secret stereoscopic writing’ of Ramon y Cajal Int J Psychophysiol 36 69–72 [24.1.5] Berlucchi G, Rizzolatti G (1968) Binocularly driven neurons in visual cortex of split–chiasm cats Science 159 308–10 [11.9.1] Berman N, Blakemore C, Cynader M (1975) Binocular interaction in the cat’s superior colliculus J Physiol 276 595–615 [11.2.3] Berry RN (1948) Quantitative relations among vernier real depth and stereoscopic depth acuities J Exp Psychol 38 708–21 [13.1.3e, 18.11] Berry RN, Riggs LA, Duncan CP (1950) The relation of vernier and depth discriminations to field brightness J Exp 
Psychol 40 349–54 [18.5.1]
Bertamini M, Lawson R (2008) Rapid figure-ground responses to stereogrms reveal an advantage for a convex foreground Perception 37 483–94 [22.1.3] Berthier A (1896) Images stéréoscopiques de grand format Cosmos 34 227–33 [24.1.3a] Bielschowsky A (1898) über monokuläre Diplopie ohne physikalische Grundlage nebst Bemerkungen über das Sehen Schlielender Graefes Arch klin exp Ophthal 46 143–83 [12.3.8b, 14.4.1d, 14.4.2] Bielschowsky A (1937) Application of the after-image test in the investigation of squint Arch Ophthal 17 408–19 [14.4.1b] Birch EE, Foley JM (1979) The effects of duration and luminance on binocular depth mixture Perception 8 293–7 [18.8.2c] Birch EE, Salomao S (1998) Infant random dot stereoacuity cards J Pediat Ophthal Strab 35 86–90 [18.2.3d] Bishop PO (1979) Stereopsis and the random element in the organization of the striate cortex Proc R Soc B 204 415–44 [11.3.1, 11.6.2] Bishop PO (1989) Vertical disparity egocentric distance and stereoscopic depth constancy: a new interpretation Proc R Soc B 237 445–69 [19.6.3, 20.6.5a] Bishop PO (1994) Size constancy depth constancy and vertical disparities: a further quantitative interpretation Biol Cyber 71 37–47 [19.6.3] Bishop PO (1996) Can random-dot stereograms serve as a model for the perception of depth in relation to real three-dimensional objects Vis Res 36 1473–7 [20.6.3d] Bishop PO, Henry GH (1971) Spatial vision Ann Rev Psychol 22 119–60 [11.9.2, 15.3.4b] Bishop PO, Kozak W, Vakkur GJ (1962) Some quantitative aspects of the cat’s eye: axis and plane of reference visual field coordinates and optics J Physiol 163 466–502 [11.1.2] Bishop PO, Henry GH, Smith CJ (1971) Binocular interaction fields of single units in the cat’s striate cortex J Physiol 216 39–68 [11.4.1d] Bishop PO, Coombs JS, Henry GH (1973) Receptive fields of simple cells in the cat striate cortex J Physiol 231 31–60 [12.9.2b, 13.3.2b] Bjorklund RA, Magnussen S (1981) A study of interocular transfer of spatial adaptation Perception 10 
511–18 [13.2.6] Black P, Myers RE (1964) Visual functions of the forebrain commissures in the chimpanzee Science 146 799–800 [13.4.2] Blackwell HR (1952) Studies of psychophysical methods for measuring thresholds J Opt Soc Am 42 606–16 [18.1] Blake A, Bülthoff H (1990) Does the brain know the physics of specular reflection? Nature 343 165–8 [17.1.6] Blake A, Bülthoff H (1991) Shape from specularities: computation and psychophysics Philos Tr R Soc B 331 237–52 [17.1.6] Blake R (1977) Threshold conditions for binocular rivalry J Exp Psychol HPP 3 251–7 [12.3.2b] Blake R (1988) Dichoptic reading: the role of meaning in binocular rivalry Percept Psychophys 44 133–41 [12.8.3b] Blake R (1989) A neural theory of binocular rivalry Psychol Rev 96 145–67 [12.10] Blake R , Boothroyd K (1985) The precedence of binocular fusion over binocular rivalry Percept Psychophys 37 114–27 [12.7.2] Blake R , Bravo M (1985) Binocular rivalry suppression interferes with phase adaptation Percept Psychophys 38 277–80 [12.6.2] Blake R , Camisa J (1978) Is binocular vision always monocular? Science 200 1497–99 [12.7.2] Blake R , Camisa J (1979) On the inhibitory nature of binocular rivalry suppression J Exp Psychol HPP 5 315–23 [12.3.2a] Blake R , Cormack RH (1979a) On utrocular discrimination Percept Psychophys 29 53–68 [16.8] Blake R , Cormack RH (1979b) Psychophysical evidence for a monocular visual cortex in stereoblind humans Science 203 274–5 [16.8] Blake R , Cormack RH (1979c) Does contrast disparity alone generate stereopsis? Vis Res 19 913–15 [20.2.1] Blake R , Fox R (1972) Interocular transfer of adaptation to spatial frequency during retinal ischaemia Nat New Biol 270 76–7 [12.6.2, 13.2.6]
Blake R , Fox R (1973) The psychophysical inquiry into binocular summation Percept Psychophys 14 161–85 [13.1.1, 13.1.6c] Blake R , Fox R (1974a) Binocular rivalry suppression: insensitive to spatial frequency and orientation change Vis Res 14 687–92 [12.5.3] Blake R , Fox R (1974b) Adaptation to invisible gratings and the site of binocular rivalry suppression Nature 279 488–90 [12.6.1] Blake R , Lehmkuhle SW (1976) On the site of strabismic suppression Invest Ophthal 15 660–3 [12.6.1] Blake R , Lema SA (1978) Inhibitory effect of binocular rivalry suppression is independent of orientation Vis Res 18 541–4 [12.3.3c] Blake R , Levinson E (1977) Spatial properties of binocular neurons in the human visual system Exp Brain Res 27 221–32 [13.1.2a, 13.1.2c, 13.1.6c] Blake R , O’Shea RP (1988) “Abnormal fusion” of stereopsis and binocular rivalry Psychol Rev 95 151–4 [12.7.4] Blake R , Overton R (1979) The site of binocular rivalry suppression Perception 8 143–52 [12.4.3, 12.6.2] Blake R , Rush C (1980) Temporal properties of binocular mechanisms in the human visual system Exp Brain Res 38 333–40 [13.1.2c] Blake R , Wilson HR (1991) Neural models of stereoscopic vision TINS 14 445–52 [11.4.3c] Blake R , Fox R , McIntyre C (1971) Stochastic properties of stabilized– image binocular rivalry alternations J Exp Psychol 88 327–32 [12.3.6a] Blake R , Fox R , Westendorf D (1974) Visual size constancy occurs after binocular rivalry Vis Res 14 585–6 [12.4.1] Blake R , Camisa JM, Antoinetti DN (1976) Binocular depth discrimination depends on orientation Percept Psychophys 20 113–18 [18.6.5] Blake R , Breitmeyer B, Green M (1980a) Contrast sensitivity and binocular brightness: dioptic and dichoptic luminance conditions Percept Psychophys 27 180–1 [13.2.2] Blake R , Westendorf DH, Overton R (1980b) What is suppressed during binocular rivalry? 
Perception 9 223–31 [12.4.4a, 12.7.3]
Blake R, Martens W, Di Gianfilippo A (1980c) Reaction time as a measure of binocular interaction in human vision Invest Ophthal Vis Sci 19 930–41 [13.1.7]
Blake R, Sloane M, Fox R (1981a) Further developments in binocular summation Percept Psychophys 30 266–76 [13.1.1]
Blake R, Overton R, Lema-Stern S (1981b) Interocular transfer of visual aftereffects J Exp Psychol HPP 7 367–81 [13.3.1, 13.3.2a]
Blake R, Zimba L, Williams D (1985) Visual motion binocular correspondence and binocular rivalry Biol Cyber 52 391–7 [12.3.6b]
Blake R, Westendorf D, Fox R (1990) Temporal perturbations of binocular rivalry Percept Psychophys 48 593–602 [12.10]
Blake R, Yang Y, Westendorf D (1991a) Discriminating binocular fusion from false fusion Invest Ophthal Vis Sci 32 2821–25 [12.3.5a]
Blake R, Yang Y, Wilson HR (1991b) On the coexistence of stereopsis and binocular rivalry Vis Res 31 1191–203 [12.7.3]
Blake R, O’Shea RP, Mueller TJ (1992) Spatial zones of binocular rivalry in central and peripheral vision Vis Neurosci 8 469–78 [12.4.1]
Blake R, Yu K, Lokey M, Norman H (1998) Binocular rivalry and motion perception J Cog Neurosci 10 46–60 [12.3.6b, 13.3.3d]
Blake R, Sobel KV, Gilroy LA (2003) Visual motion retards alternations between conflicting perceptual interpretations Neuron 39 869–78 [12.3.6b]
Blakemore C (1970a) Binocular depth perception and the optic chiasm Vis Res 10 43–7 [11.6.1, 11.9.1, 20.2.1]
Blakemore C (1970b) The representation of three-dimensional visual space in the cat’s striate cortex J Physiol 209 155–78 [11.3.1, 14.6.1b, 20.2.1]
Blakemore C (1970c) The range and scope of binocular depth discrimination in man J Physiol 211 599–622 [18.1, 18.3.3a, 18.4.1a, 18.6.1a, 18.6.4, 18.7.2b]
Blakemore C (1970d) A new kind of stereoscopic vision Vis Res 10 1181–99 [20.2.1]
Blakemore C, Campbell FW (1969) On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images J Physiol 203 237–60 [13.2.6, 21.6.4]
Blakemore C, Hague B (1972) Evidence for disparity detecting neurones in the human visual system J Physiol 225 437–55 [22.5.1b]
Blakemore C, Julesz B (1971) Stereoscopic depth aftereffect produced without monocular cues Science 171 286–8 [21.6.2a]
Blakemore C, Pettigrew JD (1970) Eye dominance in the visual cortex Nature 225 426–9 [11.3.1]
Blakemore C, Sutton P (1969) Size adaptation: a new aftereffect Science 166 275–7 [13.3.4, 21.1, 21.6.4]
Blakemore C, Fiorentini A, Maffei L (1972) A second neural mechanism of binocular depth discrimination J Physiol 229 725–49 [11.6.2]
Blakemore C, Diao Y, Pu M, et al. (1983) Possible functions of the interhemispheric connections between visual cortical areas in the cat J Physiol 337 331–49 [11.9.2]
Blakeslee B, McCourt ME (1999) A multiscale filtering account of the White effect, simultaneous brightness contrast and grating induction Vis Res 39 4361–77 [22.4.5]
Blanche PA, Bablumian A, Voorakranam R, et al. (2010) Holographic three-dimensional telepresence using large-area photorefractive polymer Nature 468 80–3 [596] [24.1.4a]
Blasdel GG, Fitzpatrick D (1984) Physiological organization of layer 4 in macaque striate cortex J Neurosci 4 880–95 [13.4.1]
Blaser E, Domini F (2002) The conjunction of feature and depth information Vis Res 42 273–79 [21.6.2f]
Blohm G, Khan AZ, Ren L, et al. (2008) Depth estimation from retinal disparity requires eye and head orientation signals J Vis 8(16) Article 3 [20.1.2]
Bloj MG, Kersten D, Hurlbert AC (1999) Perception of three-dimensional shape influences colour perception through mutual illumination Nature 402 877–79 [22.4.6]
Blomfield S (1973) Implicit features and stereoscopy Nature B275 256 [22.2.4a]
Bodé DD (1986) Chromostereopsis and chromatic dispersion Am J Optom Physiol Opt 63 859–66 [17.8]
Boeder P (1964) Anomalous retinal correspondence refuted Am J Ophthal 58 366–73 [14.4.1d]
Boeder P (1966) Single binocular vision in strabismus Am J Ophthal 61 78–86 [14.4.1d]
Bogert BP, Healy WJR, Tukey JW (1963) The frequency analysis of time series for echoes: cepstrum pseudoautocovariance cross cepstrum and saphe cracking In Proceedings of symposium on time series analysis (ed M Rosenblatt) pp 209–43 Wiley, New York [15.2.1d]
Bolanowski SJ (1987) Contourless stimuli produce binocular brightness summation Vis Res 27 1943–51 [13.1.4c]
Bolanowski SJ, Doty RW (1987) Perceptual “blankout” of monocular homogeneous fields (Ganzfelder) is prevented with binocular viewing Vis Res 27 967–82 [12.3.3a]
Boltz RL, Smith EL, Bennett MJ, Harwerth RS (1980) Vertical fusional vergence ranges of the rhesus monkey Vis Res 20 83–5 [18.4.2b]
Bonds AB (1989) Role of inhibition in the specification of orientation selectivity of cells in the cat striate cortex Vis Neurosci 2 41–55 [12.9.2b, 13.3.2b]
Bonneh YS, Sagi D (1999) Configuration saliency revealed in short duration binocular rivalry Vis Res 39 271–81 [12.3.3c]
Bonneh YS, Cooperman A, Sagi D (2001a) Motion induced blindness in normal observers Nature 411 798–801 [12.3.8a]
Bonneh YS, Sagi D, Karni A (2001b) A transition between eye and object rivalry determined by stimulus coherence Vis Res 41 981–9 [12.4.4b]
Boothroyd K, Blake R (1984) Stereopsis from disparity of complex grating patterns Vis Res 24 1205–22 [17.1.1a]
Boring EG (1942) Sensation and perception in the history of experimental psychology Appleton-Century-Crofts, New York [16.7.4b]
Bossink CJH, Stalmeier PFM, de Weert CMM (1993) A test of Levelt’s second proposition for binocular rivalry Vis Res 33 1413–9 [12.10, 12.3.2a]
Boucher JA (1967) Common visual direction horopters in exotropes with anomalous correspondence Am J Optom Arch Am Acad Optom 44 547–72 [14.4.1b]
Bouman MA (1955) On foveal and peripheral interactions in binocular vision Optica Acta 1 177–83 [12.3.2a]
Bouman MA, van den Brink G (1952) On the integrate capacity in time and space of the human peripheral retina J Opt Soc Am 42 617–20 [13.1.6b]
Bourassa CM, Rule SJ (1994) Binocular brightness: a suppression-summation trade off Can J Exp Psychol 48 418–34 [13.1.4c, 13.2.4a]
Bourdon B (1902) La perception visuelle de l’espace Reinwald, Paris [18.2.1a]
Bourdy C (1978) Horopter-vernier et couleur Vis Res 18 445–51 [14.6.2b]
Bowd C, Rose D, Phinney R, Denny M, Patterson R (1996) Enduring stereoscopic motion aftereffects induced by prolonged adaptation Vis Res 36 3655–60 [16.5.3a]
Bowd C, Donnelly M, Shorter S, Patterson R (2000) Cross-domain adaptation reveals that a common mechanism computes stereoscopic (cyclopean) and luminance plaid motion Vis Res 40 331–9 [16.5.3a]
Bowen RW, Wilson HR (1994) A two process analysis of pattern masking Vis Res 34 645–57 [13.2.4a]
Bower TGR (1966) A local sign for depth Nature 210 1081–2 [18.5.4a, 23.3.1]
Bower TGR, Goldsmith WM, Hochberg J (1964) Stereodepth from afterimages Percept Mot Skills 19 510 [18.10.1a]
Boyaci H, Maloney LT, Hersh S (2003) The effect of perceived surface orientation on perceived surface albedo in binocularly viewed scenes J Vis 3 541–3 [22.4.3b]
Boyde A (1973) Quantitative photogrammetric analysis and qualitative stereoscopic analysis of SEM images J Micros 98 452–71 [24.2.3d]
Boyde A, Jones SJ, Taylor ML, Wolfe LA (1990) Fluorescence in the tandem scanning microscope J Micros 157 39–49 [24.2.3b]
Boynton RM (1979) Human color vision Holt Rinehart and Winston, New York [12.2.1]
Boynton RM, Wisowaty JJ (1984) Selective color effects in dichoptic masking Vis Res 24 667–75 [13.2.3]
Braccini C, Gambardella G, Suetta G (1980) A noise masking experiment in grating perception at threshold: the implications for binocular summation Vis Res 20 373–6 [13.1.1e, 13.1.2b]
Braddick OJ (1974) A short-range process in apparent movement Vis Res 14 519–27 [16.4.2a, 16.4.2c]
Braddick OJ (1979) Binocular single vision and perceptual processing Proc R Soc B 204 503–12 [12.1.3a, 20.3.1c]
Braddick OJ, Adlard A (1978) Apparent motion and the motion detector In Visual psychophysics and physiology (ed JC Armington, J Krauskopf, BR Wooten) pp 417–29 Academic Press, New York [16.4.2a]
Bradley DC, Andersen RA (1998) Center-surround antagonism based on disparity in primate area MT J Neurosci 18 7552–65 [11.5.2a, 22.3.2]
Bradley DC, Qian N, Andersen RA (1995) Integration of motion and stereopsis in middle temporal cortical area of macaques Nature 373 609–11 [11.5.2a, 22.3.2]
Bradley DC, Chang GC, Andersen RA (1998) Encoding of three-dimensional structure-from-motion by primate MT neurons Nature 392 714–17 [11.5.2a]
Bradley DR (1982) Binocular rivalry of real vs subjective contours Percept Psychophys 32 85–7 [12.3.3d]
Bradshaw JL (1969) Brightness of the dominant field, and pupillary reflexes in retinal rivalry Br J Psychol 60 351–6 [12.5.1]
Bradshaw MF, Cumming BG (1997) The direction of retinal motion facilitates binocular stereopsis Proc R Soc B 264 1421–7 [15.3.9]
Bradshaw MF, Glennerster A (2006) Stereoscopic acuity and observation distance Spat Vis 19 21–36 [18.6.7]
Bradshaw MF, Rogers BJ (1999) Sensitivity to horizontal and vertical corrugations defined by binocular disparity frequency Vis Res 39 3049–56 [18.6.3b, 20.4.2, 21.4.2e]
Bradshaw MF, Frisby J, Mayhew JEW (1987) The recovery of structure from motion: no evidence for a special link with the convergent disparity mechanism Perception 16 351–7 [22.3.4]
Bradshaw MF, Rogers BJ, De Bruyn B (1995) Perceptual latency and complex random-dot stereograms Perception 24 749–59 [18.14.2c, 18.14.2f]
Bradshaw MF, Glennerster A, Rogers BJ (1996) The effect of display size on disparity scaling from differential perspective and vergence cues Vis Res 36 1255–64 [20.6.3c]
Bradshaw MF, Parton AD, Eagle RA (1998) The interaction of binocular disparity and motion parallax in determining perceived depth and perceived size Perception 27 1317–31 [20.6.2b]
Bradshaw MF, Parton AD, Glennerster A (2000) The task-dependent use of binocular disparity and motion parallax information Vis Res 40 3725–34 [20.6.3b]
Bradshaw MF, Hibbard PB, van der Willigen R, et al. (2002a) The stereoscopic anisotropy affects manual pointing Spat Vis 15 443–58 [20.4.1c]
Bradshaw MF, Hibbard PB, Gillam B (2002b) Perceptual latencies to discriminate surface orientation in stereopsis Percept Psychophys 64 32–40 [20.4.1b]
Bradshaw MF, Hibbard PB, Parton AD, et al. (2006) Surface orientation, modulation frequency and the detection and perception of depth defined by binocular disparity and motion parallax Vis Res 46 2636–44 [20.4.2]
Brakenhoff GJ, van der Voort HTM, van Spronsen EA, Nanninga N (1986) Three-dimensional imaging by confocal scanning fluorescence microscopy Ann N Y Acad Sci 483 405–15 [24.2.3b]
Brandt T, Dichgans J, Koenig E (1973) Differential effects of central versus peripheral vision on egocentric motion perception Exp Brain Res 16 476–91 [22.7.3]
Brandt T, Wist ER, Dichgans J (1975) Foreground and background in dynamic spatial orientation Percept Psychophys 17 497–503 [22.7.3]
Brascamp JW, van Ee R, Pestman WR, van den Berg AV (2005) Distributions of alternation rates in various forms of bistable perception J Vis 5 287–98 [12.10]
Brascamp JW, Knapen THJ, Kanai R, et al. (2007) Flash suppression and flash facilitation in binocular rivalry J Vis 7(12) Article 12 [12.3.5f]
Brauner JD, Lit A (1976) The Pulfrich effect simple reaction time and intensity discrimination Am J Psychol 89 105–14 [23.2.2]
Braunstein ML (1976) Depth perception through motion Academic Press, New York [24.1.7]
Bredfeldt CE, Cumming BG (2006) A simple account of cyclopean edge responses in macaque V2 J Neurosci 26 7581–96 [11.5.1]
Bredfeldt CE, Ringach DL (2002) Dynamics of spatial frequency tuning in macaque V1 J Neurosci 22 1976–84 [11.4.8b]
Breese BB (1899) On inhibition Psychol Rev Monogr Supp 3 (whole number 11) [12.3.6b, 12.3.8a, 12.8.1]
Breese BB (1909) Binocular rivalry Psychol Rev 16 410–15 [12.3.2c]
Breitmeyer B, Battaglia F, Bridge J (1977) Existence and implication of a tilted binocular disparity space Perception 6 161–4 [18.6.1b]
Brenner E, van Damme WJM (1998) Judging distance from ocular convergence Vis Res 38 493–8 [18.10.2a]
Brenner E, Smeets JBJ, Landy MS (2001) How vertical disparities assist judgements of distance Vis Res 41 3455–65 [20.6.5f]
Brewster D (1830) Optics. In the Edinburgh Encyclopedia Vol. 15 Blackwoods, Edinburgh pp 460–662 [16.7.7]
Brewster D (1844a) On the knowledge of distance given by binocular vision Trans Roy Soc Edinb 15 663–74 [24.1.6]
Brewster D (1844b) On the law of visible position in single and binocular vision and on the representation of solid figures by the union of dissimilar plane pictures on the retina Trans Roy Soc Edinb 15 349–68 [16.7.2b, 16.7.7, 24.1.6]
Brewster D (1851) Notice of a chromatic stereoscope Philosophical Magazine 4th series 3 31 [17.8]
Bridge H, Cumming BG (2001) Responses of macaque V1 neurones to binocular orientation differences J Neurosci 21 7293–302 [11.6.2]
Bridge H, Parker AJ (2007) Topographical representation of binocular depth in the human visual cortex using fMRI J Vis 7(14) Article 15 [11.4.1f]
Bridge H, Cumming BG, Parker AJ (2001) Modeling V1 neuronal responses to orientation disparity Vis Neurosci 18 879–91 [11.6.2]
Bridgman CS, Smith KU (1945) Bilateral neural integration in visual perception after section of the corpus callosum J Comp Neurol 83 57–68 [11.9.2]
Briggs W (1676) Ophthalmographia London 2nd edition 1685 [16.7.2d]
Brill MH (1978) A device performing illuminant-invariant assessment of chromatic relations J Theor Biol 71 473–8 [22.4.6]
Broadbent H, Westall C (1990) An evaluation of techniques for measuring stereopsis in infants and young children Ophthal Physiol Opt 10 3–7 [18.2.3b, 18.2.4]
Brock FW, Givner I (1952) Fixation anomalies in amblyopia Arch Ophthal 47 775–86 [14.4.1b]
Brookes A, Stevens KA (1989a) Binocular depth from surfaces versus volumes J Exp Psychol HPP 15 479–84 [21.4.2e]
Brookes A, Stevens KA (1989b) The analogy between stereo depth and brightness Perception 18 601–14 [21.4.1, 21.4.2c, 21.5.1, 21.5.3, 24.1.5]
Brooks KR, Gillam BJ (2006a) Quantitative perceived depth from sequential monocular decamouflage Vis Res 46 605–13 [17.3]
Brooks KR, Gillam BJ (2006b) The swinging doors of perception: stereomotion without binocular matching J Vis 6 685–95 [17.3]
Brooks KR, Gillam BJ (2007) Stereomotion perception for a monocularly camouflaged stimulus J Vis 7(13) Article 13 [17.3]
Brown JP, Ogle KN, Reiher L (1965) Stereoscopic acuity and observation distance Invest Ophthal 4 894–900 [18.6.7]
Brown KT (1953) Factors affecting differences in apparent size between opposite halves of a visual meridian J Opt Soc Am 43 464–72 [14.6.2a]
Brown KT (1955) An experiment demonstrating instability of retinal directional values J Opt Soc Am 45 301–7 [14.6.2a]
Brown RJ, Norcia AM (1997) A method for investigating binocular rivalry in real-time with the steady-state VEP Vis Res 37 2701–8 [12.9.2e]
Bryngdahl O (1976) Characteristics of superposed patterns in optics J Opt Soc Am 66 87–94 [12.1.7]
Büchert M, Greenlee MW, Rutschmann RM, et al. (2002) Functional magnetic resonance imaging evidence for binocular interactions in human visual cortex Exp Brain Res 145 334–9 [13.1.8b]
Buck SL, Pulos E (1987) Rod-cone interaction in monocular but not binocular pathways Vis Res 27 479–82 [13.2.3]
Buckley D, Frisby JP, Mayhew JEW (1989) Integration of stereo and texture cues in the formation of discontinuities during three-dimensional surface interpolation Perception 18 563–88 [22.2.2]
Buckley D, Frisby JP, Freeman J (1994) Lightness perception can be affected by surface curvature from stereopsis Perception 23 869–81 [22.4.4]
Buckthought A, Wilson HR (2007) Interaction between binocular rivalry and depth in plaid patterns Vis Res 47 2543–56 [12.7.3]
Buckthought A, Kim J, Wilson HR (2008) Hysteresis effects in stereopsis and binocular rivalry Vis Res 48 819–30 [12.7.3]
Bülthoff HH, Fahle M, Wegmann M (1991) Perceived depth scales with disparity gradient Perception 20 145–53 [21.2]
Burbeck CA (1987) Locus of spatial-frequency discrimination J Opt Soc Am A 4 1807–13 [22.5.1d]
Burdea G, Coiffet P (1994) Virtual reality technology Wiley, New York [24.2.6]
Burian HM (1943) Influence of prolonged wearing of meridional size lenses on spatial localization Arch Ophthal 30 645–68 [14.4.2]
Burian HM (1951) Anomalous retinal correspondence Am J Ophthal 34 237–53 [14.4.1e]
Burian HM (1958) Normal and anomalous correspondence In Strabismus (ed JH Allen) pp 184–200 Mosby, St Louis MO [14.4.1e]
Burian HM, Capobianco NM (1952) Monocular diplopia (binocular triplopia) in concomitant strabismus Arch Ophthal 47 23–30 [14.4.2]
Burke D, Wenderoth P (1989) Cyclopean tilt aftereffects can be induced monocularly: is there a purely binocular process? Perception 18 471–82 [13.3.2a]
Burke D, Alais D, Wenderoth P (1999) Determinants of fusion of dichoptically presented orthogonal gratings Perception 28 73–88 [12.3.2c]
Burkhalter A, Van Essen DC (1986) Processing of color form and disparity information in visual areas VP and V2 of ventral extrastriate cortex in the macaque monkey J Neurosci 6 2327–51 [11.5.1]
Burns BD, Prichard R (1968) Cortical conditions for fused binocular vision J Physiol 197 149–71 [12.9.2b]
Burr DC (1979) Acuity for apparent vernier offset Vis Res 19 835–37 [23.3.6]
Burr DC, Ross J (1979) How does binocular delay give information about depth? Vis Res 19 523–32 [23.3.1, 23.3.6, 23.6.4]
Burr DC, Ross J (1982) Contrast sensitivity at high velocities Vis Res 22 479–82 [18.10.1b]
Burr DC, Ross J, Morrone MC (1986) A spatial illusion from motion rivalry Perception 15 59–66 [12.3.8c]
Burrows AA, Hamilton VE (1974) Stereopsis using a large aspheric field lens App Optics 13 739–40 [24.1.2b]
Burt P, Julesz B (1980) Modifications of the classical notion of Panum’s fusional area Perception 9 671–82 [12.1.3a, 19.4]
Burt P, Sperling G (1981) Time distance and feature trade-offs in visual apparent motion Psychol Rev 88 171–95 [22.5.3a]
Busettini C, Masson GS, Miles FA (1996) A role for stereoscopic depth cues in the rapid visual stabilization of the eyes Nature 380 342–5 [22.6.1e]
Butler TW, Westheimer G (1978) Interference with stereoscopic acuity: spatial temporal and disparity tuning Vis Res 18 1387–92 [18.6.2a]
Buttner-Ennever JA, Cohen B, Horn AK, Reisine H (1996) Efferent pathways from the nucleus of the optic tract in monkey and their role in eye movements J Comp Neurol 373 90–107 [22.6.1a]
Cagenello R, Rogers BJ (1989) Binocular discrimination of line orientation and the stereoscopic discrimination of surface slant and curvature Invest Ophthal Vis Sci 30 (Abs) 252 [20.3.1c]
Cagenello R, Rogers BJ (1990) Orientation disparity cyclotorsion and the perception of surface slant Invest Ophthal Vis Sci 31 (Abs) 97 [20.3.2a]
Cagenello R, Rogers BJ (1993) Anisotropies in the perception of stereoscopic surfaces: the role of orientation disparity Vis Res 33 2189–201 [20.3.1b, 20.3.1c, 20.4.1a, 20.4.1d]
Cagenello R, Arditi A, Halpern DL (1993) Binocular enhancement of visual acuity J Opt Soc Am A 10 1841–8 [13.1.3e]
Campbell A (1971) Interocular transfer of mirror-images by goldfish Brain Res 33 486–90 [13.4.2]
Campbell FW, Green DG (1965) Monocular versus binocular visual acuity Nature 208 191–2 [13.1.1d, 13.1.2a]
Campbell FW, Howell ER (1972) Monocular alternation: a method for the investigation of pattern vision J Physiol 225 19–21P [12.3.8a]
Campbell FW, Maffei L (1971) The tilt aftereffect: a fresh look Vis Res 11 833–40 [13.3.2a]
Campbell FW, Robson JG (1968) Application of Fourier analysis to the visibility of gratings J Physiol 197 551–66 [18.6.3c]
Campbell FW, Gilinsky AS, Howell ER, et al. (1973) The dependence of monocular rivalry on orientation Perception 2 123–5 [12.3.8a, 12.3.8d]
Campos EC (1978) On the reliability of some tests of binocular sensorial status in strabismic patients J Ped Ophthal Strab 15 8–14 [14.4.1b]
Campos EC (1982) Binocularity in comitant strabismus: binocular visual fields studies Doc Ophthal 53 279–81 [14.4.1a]
Campos EC, Enoch JM (1980) Amount of aniseikonia compatible with fine binocular vision: some old and new concepts J Ped Ophthal Strab 17 44–7 [18.3.4]
Carkeet A, Wildsoet CF, Wood JM (1997) Inter-ocular temporal asynchrony (IOTA): psychophysical measurement of inter-ocular asymmetry of visual latency Ophthal Physiol Opt 17 255–62 [23.7]
Carlson TA, He S (2000) Visible binocular beats from invisible monocular stimuli during binocular rivalry Curr Biol 10 1055–8 [12.5.5]
Carlson WA, Eriksen CW (1966) Dichoptic summation of information in the recognition of briefly presented forms Percept Psychophys 5 67–8 [13.1.3e]
Carman GJ, Welch L (1992) Three-dimensional illusory contours and surfaces Nature 360 585–7 [22.2.4a]
Carney T (1997) Evidence for an early motion system which integrates information from the two eyes Vis Res 37 2361–8 [16.4.2d]
Carney T, Shadlen MN (1992) Binocularity of early motion mechanisms: comments on Georgeson and Shackleton Vis Res 32 187–91 [16.4.2b]
Carney T, Shadlen MN (1993) Dichoptic activation of the early motion system Vis Res 33 1977–95 [16.4.2c, 16.4.3]
Carney T, Shadlen MN, Switkes E (1987) Parallel processing of motion and colour information Nature 328 647–9 [12.5.4a]
Carney T, Paradiso MA, Freeman RD (1989) A physiological correlate of the Pulfrich effect in cortical neurons of the cat Vis Res 29 155–65 [23.3.2]
Carpenter RHS (1988) Movements of the eyes Pion, London [16.7.7]
Carr HC (1935) An introduction to space perception Longmans-Green, New York [24.1.7]
Carter DB (1958) Studies of fixation disparity. II. Apparatus, procedure and the problem of constant error Am J Optom Arch Am Acad Optom 35 590–8 [14.6.1c]
Carter OL, Pettigrew JD (2003) A common oscillator for perceptual rivalries? Perception 32 295–305 [12.3.8a]
Casanova C, Freeman RD, Nordmann JP (1989) Monocular and binocular response properties of cells in the striate-recipient zone of the cat’s lateral posterior-pulvinar complex J Neurophysiol 62 544–57 [11.2.1, 11.6.4]
Cass EE (1941) Monocular diplopia occurring in cases of squint Br J Ophthal 25 565–77 [14.4.2]
Castelo-Branco M, Formisano E, Backes W, et al. (2002) Activity patterns in human motion-sensitive areas depend on the interpretation of global motion Proc Natl Acad Sci 99 13914–19 [22.3.3]
Catania AC (1965) Interocular transfer of discriminations in the pigeon J Exp Anal Behav 8 145–55 [13.4.2]
Cavanagh P, Mather G (1989) Motion: the long and short of it Spat Vis 4 103–29 [16.4.2a, 18.7.2d]
Cavanagh P, Arguin M, Treisman A (1990) Effect of surface medium on visual search for orientation and size features J Exp Psychol HPP 16 479–91 [22.8.2a]
Cavonius CR (1979) Binocular interactions in flicker Quart J Exp Psychol 31 273–80 [13.1.5]
Chan ACW, Chung SCS, Yim APC, et al. (1997) Comparison of two-dimensional vs three-dimensional camera systems in laparoscopic surgery Surgical Endoscopy 11 438–40 [24.2.4]
Chang JJ (1990) New phenomena linking depth and luminance in stereoscopic motion Vis Res 30 137–47 [16.5.1]
Chang JJ, Julesz B (1983) Displacement limits direction anisotropy and direction versus form discrimination in random-dot cinematograms Vis Res 23 639–46 [16.4.2c]
Chapanis A, McCleary RA (1953) Interposition as a cue for the perception of relative distance J Gen Psychol 48 113–32 [22.1.1]
Charnwood JRB (1949) Observations on ocular dominance The Optician 116 85–8 [16.7.3b, 16.7.6b]
Chen CN (1998) Generation of depth-perception information in stereoscopic nuclear magnetic resonance imaging by non-linear magnetic field gradients Magnetic Resonance Imaging 16 405–12 [24.2.4]
Chen G, Lu HD, Roe AW (2008) A map for horizontal disparity in monkey V2 Neuron 58 442–50 [11.5.1]
Chen VJ, Cicerone CM (2002) Depth from subjective color and apparent motion Vis Res 42 2131–5 [15.3.8a]
Chen X, He S (2003) Temporal characteristics of binocular rivalry: visual field asymmetries Vis Res 43 2207–12 [12.3.4]
Chen Y, Wang Y, Qian N (2001) Modeling V1 disparity tuning to time-varying stimuli J Neurophysiol 86 143–55 [11.10.1b]
Cherry EC (1953) Some experiments on the recognition of speech with one and two ears J Acoust Soc Am 25 975–9 [12.8.3b]
Chevreul ME (1839) The principles of harmony and contrast of colors Based on the English translation of 1854. Reinhold, New York [22.4.1]
Chiang C (1967) Stereoscopic Moiré patterns J Opt Soc Am 57 1088–90 [24.1.3a]
Chong SC, Blake R (2006) Exogenous attention and endogenous attention influence initial dominance in binocular rivalry Vis Res 46 1794–803 [12.8.2]
Chong SC, Tadin D, Blake R (2005) Endogenous attention prolongs dominance durations in binocular rivalry J Vis 5 1004–12 [12.8.2]
Chowdhury SA, DeAngelis GC (2008) Fine discrimination training alters the causal contribution of macaque area MT to depth perception Neuron 60 367–77 [11.5.2a]
Christianson S, Hofstetter HW (1972) Some historical notes on Carl Pulfrich Am J Optom Arch Am Acad Optom 49 944–7 [23.1.1]
Christophers RA, Rogers BJ (1994) The effect of viewing distance on the perception of random dot stereograms Invest Ophthal Vis Sci 35 (Abs) 1627 [18.14.2c]
Christophers RA, Rogers BJ, Bradshaw MF (1993) Perceptual latencies vergence eye movements and random-dot stereograms Invest Ophthal Vis Sci 34 (Abs) 1438 [18.14.2c, 18.14.2f]
Chung CS, Berbaum K (1984) Form and depth in global stereopsis J Exp Psychol HPP 10 258–75 [18.14.2c]
Church J (1966) Language and the discovery of reality Vintage Press, New York [16.7.2c]
Cibis PA, Haber H (1951) Anisopia and perception of space J Opt Soc Am 41 676–83 [17.9]
Cigánek L (1970) Binocular addition of the visually evoked response with different stimulus intensities in man Vis Res 10 479–87 [13.1.8b]
Cisne JI (2009) Stereoscopic comparison as the long-lost secret to microscopically detailed illumination like the Book of Kells’ Perception 38 1087–103 [24.2.1]
Ciuffreda KJ, Hokoda SC (1985) Subjective vergence error at near during active head rotation Ophthal Physiol Opt 5 411–15 [18.10.5]
Claudet A (1856) On various phenomena of refraction through semilenses or prisms producing anomalies in the illusion of stereoscopic images Proc R Soc 8 104–111 [24.1.2b]
Claudet A (1858a) On the stereomonoscope: a new instrument by which an apparently single picture produces the stereoscopic illusion Proc Roy Soc 10 194–6 [24.1.2f]
Claudet A (1858b) Binocular vision The Edinburgh Review 107 223–41 [24.2.3a]
Clement RA (1985) The geometry of specific horopters Ophthal Physiol Opt 5 397–401 [14.5.3]
Clement RA (1987) Line correspondence in binocular vision Perception 16 193–9 [14.5.3]
Clement RA (1992) Gaze angle explanations of the induced effect Perception 21 355–7 [19.6.5]
Cobb WA, Morton HB, Ettlinger G (1967) Cerebral potentials evoked by pattern reversal and their suppression in visual rivalry Nature 216 1123–5 [12.9.2e]
Cobo-Lewis AB (1996) Monocular dot-density cues in random-dot stereograms Vis Res 36 345–50 [24.1.5]
Cobo-Lewis AB, Yeh YY (1994) Selectivity of cyclopean masking for the spatial frequency of binocular disparity modulation Vis Res 34 607–20 [18.6.3e]
Cobo-Lewis AB, Gilroy LA, Smallwood TB (2000) Dichoptic plaids may rival, but their motions can integrate Spat Vis 13 415–29 [12.3.6b]
Coe B (1981) The history of movie photography Eastview Editions, Westfield NJ [24.1.7]
Cogan AI (1978) Fusion at the site of the “ghosts” Vis Res 18 657–64 [15.4.6]
Cogan AI (1987) Human binocular interaction: towards a neural model Vis Res 27 2125–39 [13.1.4b, 13.3.1]
Cogan AI, Silverman G, Sekuler R (1982) Binocular summation in detection of contrast flashes Percept Psychophys 31 330–8 [13.1.6b]
Cogan AI, Clarke M, Chan H, Rossi A (1990) Two-pulse monocular and binocular interactions at the differential luminance threshold Vis Res 30 1617–30 [13.1.6c]
Cogan AI, Lomakin AJ, Rossi AF (1993) Depth in anticorrelated stereograms: effects of spatial density and interocular delay Vis Res 33 1959–75 [15.3.7d]
Cogan AI, Kontsevich LL, Lomakin AJ, et al. (1995) Binocular disparity processing with opposite-contrast stimuli Perception 24 33–47 [15.3.7b]
Cohn TE, Leong H, Lasley DJ (1981) Binocular luminance detection: availability of more than one central interaction Vis Res 21 1017–23 [13.1.4b]
Cole RG, Boisvert RP (1974) Effect of fixation disparity on stereoacuity Am J Optom Physiol Opt 51 206–13 [18.10.3b]
Collett TS (1985) Extrapolating and interpolating surfaces in depth Proc R Soc B 227 43–56 [22.2.2]
Collett TS, Schwarz U, Sobel EC (1991) The interaction of oculomotor cues and stimulus size in stereoscopic depth constancy Perception 20 733–54 [20.6.3a]
Collewijn H (1975) Direction-selective units in the rabbit’s nucleus of the optic tract Brain Res 100 489–508 [22.6.1a, 22.6.1e]
Collewijn H, Steinman RM, Erkelens CJ, Regan D (1991) Binocular fusion stereopsis and stereoacuity with a moving head In Vision and visual dysfunction Vol 9 Binocular vision (ed D Regan) pp 121–36 MacMillan, London [18.10.5]
Collins MJ, Goode A (1994) Interocular blur suppression and monovision Acta Ophthal 72 376–80 [12.3.7]
Collyer SC, Bevan W (1970) Objective measurement of dominance control in binocular rivalry Percept Psychophys 8 437–9 [12.8.1]
Coltheart M (1971) Visual feature-analyzers and after-effects of tilt and curvature Psychol Rev 78 114–21 [21.6.3a]
Coltheart M (1973) Colour-specificity and monocularity in the visual cortex Vis Res 13 2595–8 [13.3.2a]
Comerford JP (1974) Stereopsis with chromatic contours Vis Res 14 975–82 [17.1.4a]
Cook M, Gillam B (2004) Depth of monocular elements in a binocular scene: the conditions for da Vinci stereopsis J Exp Psychol HPP 30 92–103 [17.3]
Cooper J, Feldman J (1979) Assessing the Frisby stereo test under monocular viewing conditions J Am Optom Assoc 50 807–9 [18.2.1e]
Cooper J, Warshowsky J (1977) Lateral displacement as a response cue in the Titmus stereo test Am J Optom Physiol Opt 54 537–41 [18.2.2b]
Cooper ML, Pettigrew JD (1979) A neurophysiological determination of the vertical horopter in the cat and owl J Comp Neurol 184 1–29 [14.7]
Corballis MC, Beale IL (1970) Monocular discrimination of mirror-image obliques by pigeons: evidence for lateralized stimulus control Anim Behav 18 563–6 [13.4.2]
Corbin HH (1942) The perception of grouping and apparent movement in visual depth Arch Psychol 273 1–50 [22.5.3a]
Coren S, Kaplan CP (1973) Patterns of ocular dominance Am J Optom Arch Am Acad Optom 50 283–92 [12.3.7]
Coren S, Porac C (1983) Subjective contours and apparent depth: a direct test Percept Psychophys 33 197–200 [22.2.4a]
Cormack LK, Riddle RB (1996) Binocular correlation detection with oriented dynamic random-line stereograms Vis Res 36 2303–10 [15.2.2c]
Cormack LK, Stevenson SB, Schor CM (1991) Interocular correlation luminance contrast and cyclopean processing Vis Res 31 2195–207 [15.2.2b, 18.5.1, 18.5.2]
Cormack LK, Stevenson SB, Schor CM (1993) Disparity-tuned channels of the human visual system Vis Neurosci 10 585–96 [11.4.2]
Cormack LK, Stevenson SB, Schor CM (1994) An upper limit to the binocular combination of stimuli Vis Res 34 2599–608 [15.2.2b]
Cormack LK, Stevenson SB, Landers DD (1997a) Interactions of spatial frequency and unequal monocular contrasts in stereopsis Perception 26 1121–1136 [18.5.4a]
Cormack LK, Landers DD, Ramakrishnan S (1997b) Element density and the efficiency of binocular matching J Opt Soc Am A 14 723–30 [15.2.2b]
Cormack R (1984) Stereoscopic depth perception at far viewing distances Percept Psychophys 35 423–28 [20.6.3b]
Cormack R, Fox R (1985a) The computation of retinal disparity Percept Psychophys 37 176–8 [14.2.3]
Cormack R, Fox R (1985b) The computation of disparity and depth in stereograms Percept Psychophys 38 375–80 [14.2.3]
Cornforth LL, Johnson BL, Kohl P, Roth N (1987) Chromatic imbalance due to commonly used red-green filters reduces accuracy of stereoscopic depth perception Am J Optom Physiol Opt 64 842–5 [18.2.3b]
Cornsweet TN (1970) Visual perception Academic Press, New York [21.4.1, 21.4.2e, 22.4.1]
Cosmelli D, David O, Lachaux JP, et al. (2004) Waves of consciousness: ongoing cortical patterns during binocular rivalry Neuroimage 23 128–140 [12.9.2e]
Cottereau BR, McKee SP, Ales JM, Norcia AM (2011) Disparity-tuned population responses from human visual cortex J Neurosci 31 954–65 [11.8.2]
Coutant BE, Westheimer G (1993) Population distribution of stereoscopic ability Ophthal Physiol Opt 13 3–7 [18.3.1]
Cowey A (1985) Disturbances of stereopsis by brain damage In Brain mechanisms and spatial vision (ed DJ Ingle, M Jeannerod, N Lee) pp 259–78 Nijhoff, Dordrecht [11.9.2]
Cowey A, Perry VH (1980) The projection of the fovea to the superior colliculus in rhesus monkeys Neuroscience 5 53–61 [11.2.3]
Cowey A, Wilkinson F (1991) The role of the corpus callosum and extrastriate visual areas in stereoacuity in macaque monkeys Neuropsychologia 29 465–79 [11.5.1]
Cozzi A, Crespi B, Valentinotti F, Wörgötter F (1997) Performance of phase-based algorithms for disparity estimation Mach Vis Appl 9 334–40 [11.10.1a, 11.10.1b]
Crabus H, Stadler M (1973) Untersuchungen zur Localisierung von Wahrnehmungsprozessen: figurale Nachwirkungen bei binocularen Wettstreit-Bedingungen Perception 2 67–77 [12.6.2]
Craik KJW (1966) The nature of psychology Cambridge University Press, Cambridge [21.4.2e]
Crassini B, Broerse J (1982) Monocular rivalry occurs without eye movements Vis Res 22 203–4 [12.3.8d]
Crawford BH (1938) Some observations on the rotating pendulum Nature 141 792–3 [23.4.2a]
Crawford BH (1940a) Ocular interaction in its relation to measurements of brightness threshold Proc R Soc B 128 552–9 [13.1.2a, 13.2.2]
Crawford BH (1940b) The effect of field size and pattern on the change of visual sensitivity with time Proc R Soc B 129 94–106 [13.2.3]
Crawford MLJ, Cool SJ (1970) Binocular stimulation and response variability of striate cortex units in the cat Vis Res 10 1145–53 [11.3.1, 13.1.8a]
Creed RS (1935) Observations on binocular fusion and rivalry J Physiol 84 381–92 [12.3.2f]
Crick F (1996) Visual perception: rivalry and consciousness Nature 379 485–6 [12.9.2a]
Crone RA, Leuridan OMA (1973) Tolerance for aniseikonia. I. Diplopia thresholds in the vertical and horizontal meridians of the visual field Graefes Arch klin exp Ophthal 188 1–16 [12.1.1d, 12.1.5]
Crovitz HF, Lipscomb DB (1963a) Binasal hemianopia as an early stage in binocular color rivalry Science 139 596–7 [12.3.4]
Crovitz HF, Lipscomb DB (1963b) Dominance of the temporal visual fields at a short duration of stimulation Am J Psychol 76 631–7 [12.3.4]
REFERENCES
Crovitz HF, Lockhead GR (1967) Possible monocular predictors of binocular rivalry of contours Percept Psychophys 2 83–5 [12.3.1a] Crozier WJ, Wolf E (1941) Theory and measurement of visual mechanisms: IV Critical intensities for visual flicker monocular and binocular J Gen Physiol 27 505–34 [13.1.5] Cumming BG (2002) An unexpected specialization for horizontal disparity in primate primary visual cortex Nature 418 633–6 [11.4.4] Cumming BG, DeAngelis GC (2001) The physiology of stereopsis Ann Rev Neurosci 24 303–38 [11.9.2] Cumming BG, Parker AJ (1997) Responses of primary visual cortical neurons to binocular disparity without depth perception Nature 389 280–3 [11.10.1a, 11.4.1f, 15.3.7b, 15.3.7d] Cumming BG, Parker AJ (1999) Binocular neurons in V1 of awake monkeys are selective for absolute, not relative, disparity J Neurosci 19 5602–18 [11.4.1g, 11.4.6a] Cumming BG, Parker AJ (2000) Local disparity not perceived depth is signaled by binocular neurons in cortical area V1 of the macaque J Neurosci 20 4758–67 [11.4.1g] Cumming BG, Johnston EB, Parker AJ (1991) Vertical disparities and the perception of three–dimensional shape Nature 349 411–13 [20.6.3c] Cumming BG, Shapiro SE, Parker AJ (1998) Disparity detection in anticorrelated stereograms Perception 27 1367–77 [11.4.1f, 15.3.7d, 21.6.2e] Cüppers C (1956) Moderne Schielbehandlung Klin Monat Augenheilk 129–579 [14.4.1d] Curran W, Johnston A (1996) Three-dimensional curvature contrastgeometric or brightness illusion Vis Res 36 3641–53 [21.4.2f ] Curtis DW, Rule SJ (1978) Binocular processing of brightness information: a vector–sum model J Exp Psychol: HPP 4 132–43 [13.1.4b] Curtis DW, Rule SJ (1980) Fechner’s paradox reflects a nonmonotone relation between binocular brightness and luminance Percept Psychophys 27 293–6 [13.2.4a] Cynader M, Gardner J, Douglas R (1978) Neural mechanisms underlying stereoscopic depth perception in cat visual cortex In Frontiers in visual science (ed SJ Cool, EL Smith) pp 373–86 
Springer, Berlin [23.3.2] Cynader M, Gardner JC, Mustari M (1984) Effects of neonatally induced strabismus on binocular responses in cat area 18 Exp Brain Res 53 384–99 [14.4.1c] Cynader M, Gardner JC, Dobbins A, et al. (1986) Interhemispheric communication and binocular vision: functional and developmental aspects In Two hemispheres – one brain: functions of the corpus callosum (ed F Lepore, M Ptito, HH Jasper) pp 198–209 Liss, New York [11.9.1] Cynader M, Giaschi DE, Douglas RM (1993) Interocular transfer of direction–specific adaptation to motion in cat striate cortex Invest Ophthal Vis Sci 34 (Abs) 1188 [13.3.3f ] d’Almeida MJC (1858) Nouvel appareil stéréoscopique Comp Rendu Acad Sci 47 61–3 Also, On a new stereoscopic apparatus Photographic Journal 5 2 [24.1.2c, 24.1.2e] D’Zmura M, Iverson G (1993) Color constancy. I. Basic theory of two-stage linear recovery of spectral descriptions for lights and surfaces J Opt Soc Am A 3 1662–72 [22.4.6] Dalby TA, Saillant ML, Wooten BR (1995) The relation of lightness and stereoscopic depth in a simple viewing situation Percept Psychophys 57 318–32 [22.4.2] Danuser G (1999) Photogrammetric calibration of a stereo light microscope J Micros 193 62–83 [24.2.3a] Daum KM (1982) Covariation in anomalous correspondence with accommodative vergence Am J Optom Physiol Opt 59 146–51 [14.4.1d] Davis ET, King RA, Anoskey A (1992) Oblique effect in stereopsis SPIE 166 Human vision, visual processing, and digital display III 465–75 [18.6.5] Davis G, Driver J (1998) Kanizsa subjective figures can act as occluding surfaces at parallel stages of visual search J Exp Psychol HPP 27 169–84 [22.1.2]
Dawson S (1913) Binocular and uniocular discrimination of brightness Br J Psychol 6 78–108 [12.3.5a] Dawson S (1917) The experimental study of binocular colour mixture I Br J Psychol 8 510–51 [12.2.2, 12.3.2f ] Day RH (1958) On interocular transfer and the central origin of visual after–effects Am J Psychol 71 784–9 [13.3.1] Day RH (1961) On the stereoscopic observation of geometrical illusions Percept Mot Skills 13 277–58 [16.3.1] Day RH, Wade NJ (1988) Binocular interaction in induced rotary motion Aust J Psychol 40 159–64 [13.3.3e] Dayan P (1998) A hierarchical model of binocular rivalry Neural Comput 10 1119–35 [12.10] De Bruyn B, Rogers BR , Howard IP, Bradshaw MF (1992) Role of positional and orientational disparities in controlling cyclovergent eye movements Invest Ophthal Vis Sci 33 (Abs) 1149 [19.6.1] De Lange H (1954) Relationship between critical flicker frequency and a set of low–frequency characteristics of the eye J Opt Soc Am 44 380–9 [13.1.5] De Marco A, Penengo P, Trabucco A, et al. 
(1977) Stochastic models and fluctuations in reversal time of ambiguous figures Perception 6 645–56 [12.10] De Silva HR , Bartley SH (1930) Summation and subtraction of brightness in binocular perception Br J Psychol 20 271–50 [13.1.4] de Vries SC, Kappers AM, Koenderink JJ (1993) Shape from stereo: a systematic approach using quadratic surfaces Percept Psychophys 53 71–80 [20.5.3] De Vries SC, Kappers AML, Koenderink JJ (1994) Influence of surface attitude and curvature scaling on discrimination of binocularly presented surfaces Vis Res 34 2709–23 [20.5.3, 24.1.5] De Weert CMM (1979) Colour contours and stereopsis Vis Res 19 555–64 [17.1.4a] De Weert CMM, Levelt WJM (1974) Binocular brightness combinations: additive and nonadditive aspects Percept Psychophys 15 551–62 [13.1.4a] De Weert CMM, Levelt WJM (1976a) Comparison of normal and dichoptic color mixing Vis Res 16 59–70 [12.2.3, 12.3.2f ] De Weert CMM, Levelt WJM (1976b) Dichoptic brightness combination for unequal coloured lights Vis Res 16 1077–86 [13.1.4a] De Weert CMM, Sadza KJ (1983) New data concerning the contribution of colour differences to stereopsis In Colour vision (ed JD Mollon, LT Sharpe) pp 553–62 Academic Press, New York [17.1.4a] De Weert CMM, Wade NJ (1988) Compound binocular rivalry Vis Res 28 1031–40 [12.2.2, 12.4.1] De Weert CMM, Snoeren PR , Koning A (2005) Interactions between binocular rivalry and Gestalt formation Vis Res 45 2571–2579 [12.4.4b] Dean P, Redgrave P, Westby GWM (1989) Event or emergency? 
Two response systems in the mammalian superior colliculus TINS 12 137–47 [11.6.4] DeAngelis GC, Newsome WT (1999) Organization of disparity-selective neurons in macaque area MT J Neurosci 19 1398–415 [11.5.2a] DeAngelis GC, Uka T (2003) Coding of horizontal disparity and velocity by MT neurons in the alert monkey J Neurophysiol 89 1094–111 [11.5.2a] DeAngelis GC, Ohzawa I, Freeman RD (1991) Depth is encoded in the visual cortex by a specialized receptive field structure Nature 352 156–9 [11.4.3a] DeAngelis GC, Robson JG, Ohzawa I, Freeman RD (1992) Organization of suppression in receptive fields of neurons in cat cortex J Neurophysiol 68 144–163 [12.3.8d, 12.9.2b] DeAngelis GC, Ohzawa I, Freeman RD (1993) Spatiotemporal organization of simple-cell receptive fields in the cat’s striate cortex. II. Linearity of temporal and spatial summation J Neurophysiol 69 1118–35 [23.3.2] DeAngelis GC, Freeman RD, Ohzawa I (1994) Length and width tuning of neurones in the cat’s primary visual cortex J Neurophysiol 71 347–74 [11.6.3, 11.7, 13.3.2b]
DeAngelis GC, Cumming BG, Newsome WT (1998) Cortical area MT and the perception of stereoscopic depth Nature 394 677–80 [11.5.2a] DeAngelis GC, Cumming BG, Newsome WT (2000) A new role for cortical area MT: the perception of stereoscopic depth In The new cognitive neurosciences (ed MS Gazzaniga) pp 305–313 MIT Press, Cambridge MA [11.5.2a] Delicato LS, Qian N (2005) Is depth perception of stereo plaids predicted by intersection of constraints, vector average or second-order feature? Vis Res 45 75–89 [22.1.4] den Ouden HEM, van Ee R, de Haan EHE (2005) Colour helps to solve the binocular matching problem J Physiol 567 665–71 [15.3.8a] Dengis CA, Steinbach MJ, Goltz HC, Stager C (1993a) Visual alignment from the midline: a declining developmental trend in normal strabismic and monocularly enucleated children J Ped Ophthal Strab 30 323–6 [16.7.2c] Dengis CA, Steinbach MJ, Ono H, et al. (1993b) Egocenter location in children with strabismus: in the median plane and unchanged by surgery Invest Ophthal Vis Sci 34 2990–5 [16.7.5] Dengis CA, Steinbach MJ, Ono H, et al. (1996) Learning to look with one eye: the use of head turn by normals and strabismics Vis Res 36 3237–42 [16.7.2c] Dengis CA, Steinbach MJ, Ono H, Gunther LN (1997) Learning to wink voluntarily and to master monocular tasks: a comparison of normal versus strabismic children Binoc Vis 12 113–18 [16.7.2c] Dengis CA, Simpson TL, Steinbach MJ, Ono H (1998) The cyclops effect in adults: sighting without visual feedback Vis Res 38 327–31 [16.7.2c]
Dengler M, Nitschke W (1993) Color stereopsis: a model for depth reversals based on border contrast Percept Psychophys 53 150–6 [17.8] Denk W, Strickler JH, Webb WW (1990) Two-photon laser scanning fluorescence microscopy Science 248 73–6 [24.2.3c] Denny N, Frumkes TE, Barris MC, Eysteinsson T (1991) Tonic interocular suppression and binocular summation in human vision J Physiol 437 449–60 [13.2.2] Derrington AM, Cox M (1998) Temporal resolution of dichoptic and second-order motion mechanisms Vis Res 38 3531–9 [16.4.2a] Desaguliers JT (1716) A plain and easy experiment to confirm Sir Isaac Newton’s doctrine of the different refrangibility of the rays of light Philos Tr R Soc 29 448–52 [12.2.1] Dev P (1975) Perception of depth surfaces in random-dot stereograms: A neural model Int J Man-Mach Stud 7 511–28 [15.4.5] DeValois KK, von der Heydt R, Adorjani CS, DeValois RL (1975) A tilt aftereffect in depth Invest Ophthal Vis Sci 15 (ARVO Abs) 90 [20.3.1d] DeValois RL, Walraven J (1967) Monocular and binocular aftereffects of chromatic adaptation Science 155 463–5 [12.2.1, 13.2.8] DeValois RL, Yund EW, Hepler N (1982a) The orientation and direction selectivity of cells in macaque visual cortex Vis Res 22 531–44 [11.6.2] DeValois RL, Albrecht DG, Thorell LG (1982b) Spatial frequency selectivity of cells in macaque visual cortex Vis Res 22 545–559 [20.2.1] Di Stefano L, Marchionni M, Mattoccia S (2004) A fast area-based stereo matching algorithm Image Vis Comp 22 983–1005 [15.4.6] Di Stefano M, Lepore F, Ptito M, et al.
(1991) Binocular interactions in the lateral suprasylvian visual area of strabismic cats following section of the corpus callosum Eur J Neurosci 3 1016–24 [14.4.1c] Diamond AL (1958) Simultaneous brightness contrast and the Pulfrich phenomenon J Opt Soc Am 48 887–90 [23.4.2a] Dias EC, Rocha–Miranda CE, Bernardes RF, Schmidt SL (1991) Disparity selective units in superior colliculus of the opossum Exp Brain Res 87 546–52 [11.2.3] Diaspro A (ed) (2002) Confocal and two photon microscopy Wiley, New York [24.2.3b] Diaz-Caneja E (1928) Sur l’alternance binoculaire Annales d’Oculistique 165 721–31 [12.4.4b]
Dichgans J, Brandt T (1978) Visual–vestibular interaction: effects on self motion perception and postural control In Handbook of sensory physiology (ed R Held, W Leibowitz, HL Teuber) Vol VII pp 755–804 Springer, New York [22.7.3] Diener HC, Wist ER , Dichgans J, Brandt T (1976) The spatial– frequency effect on perceived velocity Vis Res 16 169–76 [23.3.6] Diner DB, Fender DH (1987) Hysteresis in human binocular fusion: temporalward and nasalward ranges J Opt Soc Am A 4 1814–19 [12.1.6] Diner DB, Fender DH (1988) Dependence of Panum’s fusional area on local retinal stimulation J Opt Soc Am A 5 1163–9 [12.1.6] Diner DB, Fender DH (1993) Human engineering in stereoscopic viewing devices Plenum, New York [24.2.4] Ding J, Sperling G (2006) A gain-control theory of binocular combination Proc Natl Acad Sci 103 1141–6 [12.3.1b, 13.1.3b] Distler C, Mustari MJ, Hoffmann KP (2002) Cortical projections to the nucleus of the optic tract and dorsal terminal nucleus and to the dorsolateral pontine nucleus in macaques: a dual retrograde tracing study J Comp Neurol 444 144–58 [22.6.1b] Dixon HH (1938) A binocular illusion Nature 141 792 [14.2.2] Dobbins AC, Jeo RM, Fiser J, Allman JM (1998) Distance modulation of neural activity in the visual cortex Science 281 552–5 [11.5.3a] Dodd JV, Krug K , Cumming BG, Parker AJ (2001) Perceptually bistable three-dimensional figures evoke high choice probabilities in cortical area MT J Neurosci 21 4809–21 [11.5.2a] Dodd MD, McAuley T, Pratt J (2000) An illusion of 3-D motion with the Ternus display Vis Res 45 969–73 [16.4.2e] Dodge R (1900) Visual perception during eye movements Psychol Rev 7 454–65 [23.2.4] Dodwell PC, Harker GS, Behar I (1968) Pulfrich effect with minimal differential adaptation of the eyes Vis Res 8 1431–43 [23.4.3] Domini F, Braunstein M (2001) Influence of a stereo surface on the perceived tilt of a monocular line Percept Psychophys 63 607–24 [14.6.1c, 16.7.4a, 21.6.2c] Domini F, Blaser E, Cicerone CM (2000) 
Color-specific depth mechanisms revealed by a color-contingent depth aftereffect Vis Res 40 359–64 [15.3.8b] Domini F, Adams W, Banks MS (2001) 3D after-effects are due to shape and not disparity adaptation Vis Res 41 2733–9 [21.6.2c] Donnelly M, Miller RJ (1995) Ingested ethanol and binocular rivalry Invest Ophthal Vis Sci 36 1548–54 [12.3.2c] Donnelly M, Bowd C, Patterson R (1997) Direction discrimination of cyclopean (stereoscopic) and luminance motion Vis Res 37 2041–6 [16.5.1] Donzis PB, Rappazzo A, Burde RM, Gordon M (1983) Effect of binocular variations of Snellen’s visual acuity on Titmus stereoacuity Arch Ophthal 101 930–2 [18.5.4b] Douthwaite WA, Morrison LC (1975) Critical flicker frequency and the Pulfrich phenomenon Am J Optom Physiol Opt 52 745–49 [23.4.2b] Dove HW (1841) Uber die Combination der Eindrücke beider Ohren und beider Augen zu einem Eindruck Monat Ber Akad 251–2 [18.10.3a, 18.12.1a] Downing CJ, Pinker S (1985) The spatial structure of visual attention In Attention and performance (ed MI Posner, OS Marin) Vol XI pp 171–87 Erlbaum, Hillsdale NJ [22.8.1] Downing E, Hesselink L, Ralston J, Macfarlane R (1996) A three-color solid-state three-dimensional display Science 273 1185–8 [24.1.4a] Dresp B, Bonnet C (1995) Subthreshold summation with illusory contours Vis Res 35 1071–8 [12.3.3d] Drobe B, Monot A (1997) Partition of perceived space within the fusional area on apparent fronto-parallel plane criterion Ophthal Physiol Opt 17 340–7 [14.6.2] Drobnis BJ, Lawson RB (1976) The Poggendorff illusion in stereoscopic space Percept Mot Skills 42 15–18 [16.7.4b] Dudley LP (1951) Stereoptics: an introduction MacDonald, London [24.1.3c]
Dudley LP (1965) Stereoscopy In Applied optics and optical engineering (ed R Kingslake) pp 77–117 Academic Press, New York [24.1.1] Duffy CJ, Wurtz RH (1993) An illusory transformation of optic flow fields Vis Res 33 1481–90 [22.7.4] Duke PA, Howard IP (2005) Vertical-disparity gradients are processed independently in different depth planes Vis Res 45 2025–35 [20.2.4c] Duke PA, Wilcox LM (2003) Adaptation to vertical disparity induced depth: implications for disparity processing Vis Res 43 135–47 [21.6.2d] Duke PA, Oruc I, Haijiang Q, Backus BT (2006) Depth aftereffects mediated by vertical disparities: evidence for vertical disparity driven calibration of extraretinal signals during stereopsis Vis Res 46 228–41 [21.7.1] Duke-Elder S (1962) System of ophthalmology Vol VII The foundations of ophthalmology Kimpton, London [24.2.4] Duke-Elder S (1968) System of ophthalmology Vol IV The physiology of the eye and of vision Kimpton, London [12.7.2] Duncan J (1984) Selective attention and the organization of visual information J Exp Psychol Gen 113 501–17 [22.5.1e] Duncan J, Martens S, Ward R (1997) Restricted attentional capacity within but not between sensory modalities Nature 387 808–10 [22.8.2a] Duncan RO, Albright TD, Stoner GR (2000) Occlusion and the interpretation of visual motion: perceptual and neuronal effects of context J Neurosci 20 5885–97 [22.3.1] Duncker K (1929) Über induzierte Bewegung Psychol Forsch 22 180–259 [21.1, 22.7] Dunlap K (1944) Alleged binocular mixing Am J Psychol 57 559–63 [12.2.1] Durand JB, Zhu S, Celebrini S, Trotter Y (2002) Neurons in parafoveal areas V1 and V2 encode vertical and horizontal disparities J Neurophysiol 88 2874–9 [11.4.4] Durand JB, Celebrini S, Trotter Y (2007a) Neural basis of stereopsis across visual field of the alert monkey Cereb Cortex 17 1260–73 [11.4.4] Durand JB, Nelissen K, Joly O, et al.
(2007b) Anterior regions of monkey parietal cortex process visual 3D shape Neuron 55 493–505 [11.5.2b, 11.8.1] Durgin FH (2001) Texture contrast aftereffects are monocular; texture density aftereffects are binocular Vis Res 41 2619–30 [13.2.6, 13.3.4] Durgin FH, Huk AC (1997) Texture density aftereffects in the perception of artificial and natural textures Vis Res 23 3273–82 [13.3.4] Durgin FH, Proffitt DR, Olson TJ, Reinke KS (1995) Comparing depth from motion with depth from binocular disparity J Exp Psychol HPP 21 679–99 [20.6.4] Durrani AF, Preminger GM (1995) Three-dimensional video imaging for endoscopic surgery Comput Biol Med 25 237–47 [24.2.4] Dürsteler MR, Wurtz RH (1988) Pursuit and optokinetic deficits following chemical lesions of cortical areas MT and MST J Neurophysiol 60 940–65 [22.6.1b] Dutour EF (1760) Discussion d’une question d’optique L’Académie des Sciences. Mémoires de Mathématique et de physique présentés par Divers Savantes 3 514–30. An English translation by O’Shea RP (1999) of both Dutour papers is available at http://psy.otago.ac.nz:800/r-oshea.dutour63.html. [12.2.1, 12.7.2] Dutour EF (1763) Addition au Mémoire intitulé, Discussion d’une question d’optique L’Académie des Sciences. Mémoires de Mathématique et de physique présentés par Divers Savantes 4 499–511 [12.7.2, 13.1.1b] Duwaer AL (1982) Assessment of retinal image displacement during head movement using an afterimage method Vis Res 22 1379–88 [14.6.2a, 18.10.5] Duwaer AL (1983) Patent stereopsis with diplopia in random-dot stereograms Percept Psychophys 33 443–54 [14.6.2a, 18.4.1b]
Duwaer AL, van den Brink G (1982a) The effect of presentation time on detection and diplopia thresholds for vertical disparities Vis Res 22 183–9 [12.1.4] Duwaer AL, van den Brink G (1982b) Detection of vertical disparities Vis Res 22 467–78 [18.3.3b] Dvorak JA, Nagao E (1998) Optimization and utilization of the atomic force microscope for living systems Scanning 20 138–9 [24.2.3f] Dvorák V (1870) Versuche über Nachbilder von Reizveränderungen Sitzungsbericht der Kaiserlichen Akademie der Wissenschaften: Mathematisch-Naturwissenschaftliche Klasse, II Abteilung (Wien) 61 257–62. Translation in Broerse et al. (1994) [13.3.3a] Dwyer WO, Lit A (1970) Effect of luminance-matched wavelength on depth discrimination at scotopic and photopic levels of target illumination J Opt Soc Am 60 127–31 [18.5.5] Earle DC (1985) Perception of Glass pattern structure with stereopsis Perception 14 545–52 [16.6.2] Earnshaw RA, Gigante MA, Jones H (1993) Virtual reality systems Academic Press, London [24.2.6] Earnshaw RA, Vince JA, Jones H (1995) Virtual reality applications Academic Press, London [24.2.6] Ebenholtz SM (1970) On the relation between interocular transfer of adaptation and Hering’s law of equal innervation Psychol Rev 77 343–7 [13.4.3, 19.6.4] Ebenholtz SM, Paap KR (1973) The constancy of object orientation: compensation for ocular rotation Percept Psychophys 14 458–70 [20.2.2b] Ebenholtz SM, Walchli RM (1965) Stereoscopic thresholds as a function of head- and object-orientation Vis Res 5 455–61 [18.6.5] Eby DW, Braunstein ML (1995) The perceptual flattening of three-dimensional scenes enclosed by a frame Perception 27 981–93 [24.1.7] Edwards M, Schor CM (1999) Depth aliasing by the transient-stereopsis system Vis Res 39 4333–40 [15.3.2] Edwards M, Pope DR, Schor CM (1999) Orientation tuning of the transient-stereopsis system Vis Res 39 2717–27 [18.12.3] Edwards M, Pope DR, Schor CM (2000) First- and second-order processing in transient stereopsis Vis Res 40 2645–51 [18.12.3] Efron R
(1957) Stereoscopic vision. I. Effects of binocular temporal summation Br J Ophthal 41 709–30 [18.12.2a] Egnal G, Wildes R (2002) Detecting binocular half-occlusions: empirical comparisons of five approaches IEEE Tr Patt Anal Mach Intel 24 1127–33 [11.10.1c] Egner A, Hell SW (2005) Fluorescence microscopy with super-resolved optical sections Trends in Cell Biol 15 207–15 [24.2.3b] Ehrenstein W (1925) Versuche über beziehungen zwischen Bewegungsund Gestaltwahrnehmung. Erste Abhandlung Z Psychol 96 305–52 [13.3.3a] Ehrenstein W (1941) Uber Abwandlungen der L Hermannschen Helligkeitserscheinung Z Psychol 150 83–91 [22.2.4a] Ehrenstein WH, Gillam BJ (1999) Early demonstrations of subjective contours, amodal completion, and depth from half-occlusions: “Stereoscopic experiments with silhouettes” by Adolf von Szily (1921) Perception 27 1407–16 [22.2.4a] Ehrenstein WH, Arnold-Schulz-Gahman BE, Jaschinski W (2005) Eye preference within the context of binocular functions Graefe’s Arch Clin Exp Ophthal 243 926–32 [16.7.3b] Eifuku S, Wurtz RH (1999) Response to motion in extrastriate area MSTl: disparity sensitivity J Neurophysiol 82 2762–75 [11.5.2a, 22.3.2] Einthoven W (1885) Stereoscopie durch Farbendifferenz Graefe’s Arch Klin Exp Ophthal 31 211–38 [17.8] Elberger AJ (1979) The role of the corpus callosum in the development of interocular eye alignment and the organization of the visual field in the cat Exp Brain Res 36 71–85 [11.9.2] Elberger AJ (1980) The effect of neonatal section of the corpus callosum on the development of depth perception in young cats Vis Res 20 177–87 [11.9.2]
Elberger AJ (1989) Binocularity and single cell acuity are related in striate cortex of corpus callosum sectioned and normal cats Exp Brain Res 77 213–16 [11.9.2] Elberger AJ (1990) Spatial frequency thresholds of single striate cortical cells in neonatal corpus callosum sectioned cats Exp Brain Res 82 617–27 [11.9.2] Elberger AJ, Smith EL (1983) Binocular properties of lateral suprasylvian cortex are not affected by neonatal corpus callosum section Brain Res 278 259–98 [11.9.2] Elberger AJ, Smith EL (1985) The critical period for corpus callosum section to affect cortical binocularity Exp Brain Res 57 213–23 [11.9.2] Ell JJ, Gresty MA (1982) Uniocular Pulfrich phenomenon: an abnormality of visual perception Brit J Ophthal 66 610–13 [23.7] Ellenberger C, Duane MD, Shuttlesworth E (1978) Electrical correlates of normal binocular vision Arch Neurol 35 834–7 [13.1.8b] Ellerbrock VJ (1954) Inducement of cyclofusional movements Am J Optom Arch Am Acad Optom 31 553–66 [12.1.1b] Emerson PL, Pesta BJ (1992) A generalized visual latency explanation of the Pulfrich phenomenon Percept Psychophys 51 319–27 [23.2.1] Emoto M, Mitsuhashi T (1998) Interocular suppression of a halfoccluded region of stereoscopic images J Opt Soc Am A 15 2257–62 [17.2.3] Engel E (1956) The role of content in binocular resolution Am J Psychol 69 87–91 [12.8.3a] Engel GR (1967) The visual processes underlying binocular brightness summation Vis Res 7 753–67 [13.1.4b] Engel GR (1969) The autocorrelation function and binocular brightness mixing Vis Res 9 1111–30 [13.1.4b] Engel GR (1970a) Tests of a model of binocular brightness Can J Psychol 27 335–52 [13.1.4b] Engel GR (1970b) An investigation of visual responses to brief stereoscopic stimuli Quart J Exp Psychol 22 148–66 [18.12.2a] Engelking E, Poos F (1927) Uber die Bedeutung des Stereophaenomens für die isochrome und heterochrome Helligkeitsvergleichung Graefe’s Arch Klin Exp Ophthal 114 340–79 [23.4.1, 23.4.2a] Enns JT, Rensink RA (1990) Influence 
of scene-based properties on visual search Science 247 721–3 [22.8.2c] Enns JT, Rensink RA (1991) Preattentive recovery of three-dimensional orientation from line drawings Psychol Rev 98 335–51 [22.8.2c] Enoch JM, Goldmann H, Sunga R (1969) The ability to distinguish which eye was stimulated by light Invest Ophthal Vis Sci 8 317–31 [16.8] Enoksson P (1963) Binocular rivalry and monocular dominance studied with optokinetic nystagmus Acta Ophthal 41 544–63 [12.3.1a] Enright JT (1970) Distortions of apparent velocity: a new optical illusion Science 168 464–7 [23.2.1] Enright JT (1985) On Pulfrich–illusion eye movements and accommodation vergence during visual pursuit Vis Res 25 1613–22 [23.5] Enright JT (1988) The cyclopean eye and its implications: vergence state and visual direction Vis Res 28 925–30 [16.7.7] Enright JT (1990) Stereopsis cyclotorsional “noise” and the apparent vertical Vis Res 30 1487–97 [21.3.2] Enright JT (1991a) Exploring the third dimension with eye movements: better than stereopsis Vis Res 31 1549–62 [18.10.2a] Enright JT (1991b) Stereo–thresholds: simultaneity target proximity and eye movements Vis Res 31 2093–100 [18.10.2a, 18.12.2b, 18.6.2a] Enright JT (1996) Sequential stereopsis: a simple demonstration Vis Res 36 307–12 [18.10.2a] Epelbaum M, Teller DY (1995) Infant eye movement asymmetries: temporal-nasal asymmetry is reversed at isoluminance in 2-month-olds Vis Res 35 1889–95 [22.6.1b, 22.6.1e] Epstein LI (1952) Space perception and vertical disparity J Opt Soc Am 42 145–6 [20.6.5a] Epstein W (1961) Phenomenal orientation and perceived achromatic color J Psychol 52 51–3 [22.4.3a]
Epstein W, Morgan-Paap CL (1974a) Aftereffect of inspection of a perspectival stimulus for slant depth: a new normalization effect Percept Psychophys 16 299–302 [21.6.3c] Epstein W, Morgan-Paap CL (1974b) The effect of depth processing and degree of information discrepancy on adaptation to uniocular image magnification J Exp Psychol 102 585–94 [21.6.3c] Erens RGF, Kappers AML, Koenderink JJ (1991) Limits on the perception of local shape from shading In Studies in perception and action (ed PJ Beek, RJ Bootsma, PCW van Wieringen) pp 72–5 Rodopi, Amsterdam [20.5.1] Eriksen BA, Eriksen CW (1974) Effects of noise-letters on identification of a target letter in a nonsearch task Percept Psychophys 16 143–9 [13.2.5] Eriksen CW (1966) Independence of successive inputs and uncorrelated error in visual form perception J Exp Psychol 72 29–35 [13.1.1b] Eriksen CW, Greenspon TS (1968) Binocular summation over time in the perception of form at brief durations J Exp Psychol 76 331–6 [13.1.3e] Eriksen CW, Greenspon TS, Lappin J, Carlson WA (1966) Binocular summation in the perception of form at brief durations Percept Psychophys 1 415–9 [13.1.1b, 13.1.3e] Erkelens CJ (1988) Fusional limits for a large random–dot stereogram Vis Res 28 345–53 [18.4.1b] Erkelens CJ (2000) Perceived direction during monocular viewing is based on signals of the viewing eye only Vis Res 40 2411–19 [16.7.7] Erkelens CJ, Collewijn H (1985) Motion perception during dichoptic viewing of moving random–dot stereograms Vis Res 25 583–8 [18.3.2a] Erkelens CJ, van de Grind WA (1994) Binocular visual direction Vis Res 34 2963–9 [16.7.3b, 16.7.7] Erkelens CJ, van Ee R (1997a) Capture of visual direction: an unexpected phenomenon in binocular vision Vis Res 37 1193–6 [16.7.3a] Erkelens CJ, van Ee R (1997b) Capture of the visual direction of monocular objects by adjacent binocular objects Vis Res 37 1735–45 [16.7.3a, 16.7.4a] Erkelens CJ, van Ee R (1998) A computation model of depth perception based on headcentric 
disparity Vis Res 38 2999–18 [14.3.1c] Erkelens CJ, van Ee R (2002a) Multi-coloured stereograms unveil two binocular colour mechanisms in human vision Vis Res 42 1103–12 [12.2.2] Erkelens CJ, van Ee R (2002b) The role of the cyclopean eye in vision: sometimes inappropriate, always irrelevant Vis Res 42 1157–63 [16.7.7] Erkelens CJ, Muijs AJM, van Ee R (1996) Binocular alignment in different depth planes Vis Res 36 2141–7 [16.7.7] Erwin E, Miller KD (1999) The subregion correspondence model of binocular simple cells J Neurosci 19 7212–29 [11.4.3c] Evans CR, Clegg JM (1967) Binocular depth perception of “Julesz patterns” viewed as perfectly stabilized retinal images Nature 215 893–5 [18.10.1a] Exner S (1868) Über die zu einer Gesichtswahrnehmung nöthige Zeit Sitzungsbericht der Akademie Wissenschaft Wien 58 601–32 [13.2.7] Eyre MB, Schmeeckle MM (1933) A study of handedness eyedness and footedness Child Devel 4 73–8 [12.3.7] Eysteinsson T, Barris MC, Denny N, Frumkes TE (1993) Tonic interocular suppression binocular summation and the evoked potential Invest Ophthal Vis Sci 34 2743–8 [13.2.2] Fagin RR, Griffin JR (1982) Stereoacuity test: comparison of mathematical equivalents Am J Optom Physiol Opt 59 427–35 [18.2.4] Fahle M (1982a) Cooperation between different spatial frequencies in binocular rivalry Biol Cyber 44 27–9 [12.3.2b] Fahle M (1982b) Binocular rivalry: suppression depends on orientation and spatial frequency Vis Res 22 787–800 [12.3.2b, 12.3.4] Fahle M (1987) Naso-temporal asymmetry of binocular inhibition Invest Ophthal Vis Sci 28 1016–17 [12.3.4]
Fahle M (1991) Psychophysical measurement of eye drifts and tremor by dichoptic or monocular vernier acuity Vis Res 31 209–222 [18.10.3a, 18.11] Fahle M (1993) Visual learning in the hyperacuity range in adults Ger J Ophthal 2 83–6 [18.14.1] Fahle M (1994) Human pattern recognition: parallel processing and perceptual learning Perception 23 411–27 [13.4.1] Fahle M (1995) Perception of oppositely moving verniers and spatiotemporal interpolation Vis Res 35 925–37 [18.10.3a] Fahle M (2004) Perceptual learning: a case of early selection J Vis 4 879–90 [13.4.1] Fahle M, Palm G (1991) Perceptual rivalry between illusory and real contours Biol Cyber 66 1–8 [12.3.3d] Fahle M, Westheimer G (1988) Local and global factors in disparity detection of rows of points Vis Res 28 171–8 [18.6.2b] Fahle M, Westheimer G (1995) On the time-course of inhibition in the stereoscopic perception of rows of dots Vis Res 35 1393–9 [18.6.2b] Fahle M, Fahle SH, Harris J (1994) Definition of thresholds for stereoscopic depth Br J Ophthal 78 572–6 [18.2.4] Falk DS, Williams R (1980) Dynamic visual noise and the stereophenomenon: interocular time delays depth and coherent velocities Percept Psychophys 28 19–27 [23.6.1, 23.6.3, 23.6.4] Fan WCS, Brown B, Yap MKH (1996) A new stereotest: the double two rod test Ophthal Physiol Opt 16 196–202 [18.2.1a] Fang F, He S (2005) Cortical responses to invisible objects in the human dorsal and ventral pathways Nat Neurosci 8 1380–5 [12.9.2f] Fantoni C (2008) 3D surface orientation based on a novel representation of the orientation disparity field Vis Res 48 2509–22 [19.3.1b] Farell B (1998) Two-dimensional matches from one-dimensional stimulus components in human stereopsis Nature 395 689–93 [22.1.4] Farell B (2003) Detecting disparity in two-dimensional patterns Vis Res 43 1009–26 [22.1.4] Farell B (2006) Orientation-specific computation in stereoscopic vision J Neurosci 26 9098–106 [18.6.5] Farell B, Li S (2004) Seeing depth coherence and transparency J Vis 4
209–23 [22.1.4] Farell B, Li S, McKee SP (2004a) Disparity increment thresholds for gratings J Vis 4 156–68 [18.3.3b] Farell B, Li S, McKee SP (2004b) Coarse scales, fine scales, and their interactions in stereo vision J Vis 4 488–99 [18.7.2e] Faubert J (1994) Seeing depth in colour: more than just what meets the eye Vis Res 34 1165–86 [17.8] Faugeras O (1995) Stratification of three-dimensional vision: projective, affine, and metric representations J Opt Soc Am 12 465–84 [14.2.3] Favreau OE (1978) Interocular transfer of color–contingent motion aftereffects; positive aftereffects Vis Res 18 841–4 [13.3.5] Favreau OE, Cavanagh P (1983) Interocular transfer of a chromatic frequency shift Vis Res 23 951–7 [13.3.4] Favreau OE, Cavanagh P (1984) Interocular transfer of a chromatic frequency shift: temporal constraints Vis Res 27 1799–804 [13.3.4] Fawcett SL, Birch EE (2003) Validity of the Titmus and Randot circles tasks in children with known binocular disorders J AAPOS 7 333–8 [18.2.4] Fechner GT (1860) Uber einige Verhältnisse des binokularen Sehens Berichte Sächs gesamte Wissenschaft 7 337–564 [12.3.1a] Feinsod M, Bentin S, Hoyt WF (1979) Pseudostereoscopic illusion caused by interhemispheric temporal disparity Arch Neurol 36 666–8 [23.7] Felleman DJ, Van Essen DC (1987) Receptive field properties of neurons in area V3 of macaque monkey extrastriate cortex J Neurophysiol 57 889–920 [11.5.1] Felton TB, Richards W, Smith RA (1972) Disparity processing of spatial frequencies in man J Physiol 225 349–62 [11.4.2, 18.7.1, 22.5.1b] Fender D, Julesz B (1967) Extension of Panum’s fusional area in binocularly stabilized vision J Opt Soc Am 57 819–30 [12.1.6, 18.10.3a, 18.4.1b]
Fendick M, Westheimer G (1983) Effects of practice and the separation of test targets on foveal and peripheral stereoacuity Vis Res 23 145–50 [18.14.1, 18.6.1a] Fenelon B, Neill RA, White CT (1986) Evoked potentials to dynamic random dot stereograms in upper centre and lower fields Doc Ophthal 63 151–6 [11.7] Fernandez JM (1997) Cellular and molecular mechanics by atomic force microscopy: capturing the exocytotic fusion pore in vivo? Proc Natl Acad Sci 94 9–10 [24.2.3f] Fernández JM, Watson B, Qian N (2002) Computing relief structure from motion with a distributed velocity and disparity representation Vis Res 42 883–98 [11.5.2a, 11.6.4] Ferraina S, Paré M, Wurtz RH (2000) Disparity sensitivity of frontal eye field neurons J Neurophysiol 83 625–9 [11.5.3b] Ferree CE, Rand G (1934) Perception of depth in the after-image Am J Psychol 46 329–32 [18.10.1a] Ferris SH, Pastore N (1971) Interocular apparent movement in depth: a motion preference effect Science 174 305–7 [16.4.2f] Ferster D (1981) A comparison of binocular depth mechanisms in areas 17 and 18 of the cat visual cortex J Physiol 311 623–55 [11.3.1, 11.4.1d, 11.4.1e, 11.4.3b, 12.9.2b] Ferster D (1987) Origin of orientation selective EPSP’s in simple cells of cat visual cortex J Neurosci 7 1780–91 [12.9.2b] Filippini HR, Banks MS (2009) Limits of stereopsis explained by local cross-correlation J Vis 9(1) Article 8 [11.10.1c, 18.6.3c] Fincham EF (1963) Monocular diplopia Br J Ophthal 47 705–12 [14.4.2] Finlay DC, Manning ML, Dunlop DP, Dewis SAM (1989) Difficulties in the definition of ‘stereoscotoma’ using temporal detection of thresholds of dynamic random dot stereograms Doc Ophthal 72 161–73 [18.6.4] Fiorentini A, Berardi N (1981) Learning in grating waveform discrimination: specificity for orientation and spatial frequency Vis Res 21 1149–58 [13.4.1] Fiorentini A, Maffei L (1971) Binocular depth perception without geometrical cues Vis Res 11 1299–305 [20.2.1] Fiorentini A, Bayly EJ, Maffei L (1972)
Peripheral and central contributions to psychophysical spatial interactions Vis Res 12 253–9 [13.2.3] Fiorentini A, Sireteanu R , Spinelli D (1976) Lines and gratings: different interocular after-effects Vis Res 16 1303–9 [13.2.6] Fischer B, Krüger J (1979) Disparity tuning and binocularity of single neurons in cat visual cortex Exp Brain Res 35 1–8 [11.3.1] Fischer FP (1927) Experimentelle Beitrage zum Begriff der Schrichtungsgemeinschaft der Netzhäute auf Grund der binocularen Noniusmethode Pflügers Arch ges Physiol 204 233–290 [14.6.1c] Fischer FP, Wagenaar JW (1954) Binocular vision and fusion movements Doc Ophthal 7 359–91 [14.1] Flandrin JM, Jeannerod M (1977) Developmental constraints of motion detection mechanisms in the kitten Perception 6 513–27 [11.2.3] Fleet DJ, Jepson AD, Jenkin M (1991) Phase-based disparity measurement Comput Vis Gr Im Proc 53 198–210 [11.10.1a, 11.10.1b] Fleet DJ, Wagner H, Heeger DJ (1996a) Neural encoding of binocular disparity: energy models position shifts and phase shifts Vis Res 36 1839–57 [11.10.1b, 11.4.3c, 15.2.1c, 18.8.1] Fleet DJ, Wagner H, Heeger DJ (1996b) Modelling binocular neurons in the primary visual cortex In Computational and biological mechanisms of visual coding (ed M Jenkin, L Harris) Cambridge University Press, London [11.4.1d, 11.10.1a, 11.10.1b] Fletcher JL, Ross S (1953) Tests of stereoscopic vision: a review Int Rec Med Quart Rev Ophthal 166 551–62 [18.2.2a] Flipse JP, van der Wildt GJ, Rodenburg M, et al. (1988) Contrast sensitivity for oscillating sine wave gratings during ocular fixation and pursuit Vis Res 28 819–26 [18.10.1b] Flitcroft DI, Morley JW (1997) Accommodation in binocular contour rivalry Vis Res 37 121–5 [12.5.1]
Flitcroft DI, Judge SJ, Morley JW (1992) Binocular interactions in accommodation control: effects of anisometropic stimuli J Neurosci 12 188–203 [12.5.1] Flock HR , Freedberg E (1970) Perceived angle of incidence and achromatic surface color Percept Psychophys 8 251–6 [22.4.3a] Flom MC (1980) Corresponding and disparate retinal points in normal and anomalous correspondence Am J Optom Physiol Opt 57 656–65 [14.4.1b] Flom MC, Eskridge JB (1968) Change in retinal correspondence with viewing distance J Am Optom Assoc 39 1094–7 [14.6.2a] Flom MC, Kerr KE (1967) Determination of retinal correspondence Multiple-testing results and the depth of anomaly concept Arch Ophthal 77 200–13 [14.4.1a, 14.4.1b] Flom MC, Weymouth FW (1961a) Centricity of Maxwell’s spot in strabismus and amblyopia Arch Ophthal 66 290–8 [14.4.1b] Flom MC, Weymouth FW (1961b) Retinal correspondence and the horopter in anomalous correspondence Nature 189 34–6 [14.6.1d] Flom MC, Heath GG, Takahashi E (1963) Contour interaction and visual resolution: contralateral effects Science 142 979–89 [13.2.5] Flom MC, Kirschen DG, Williams AT (1978) Changes in retinal correspondence following surgery for intermittent exotropia Am J Optom Physiol Opt 55 456–62 [14.4.1a] Foley JE (1974) Factors governing interocular transfer of prism adaptation Psychol Rev 81 183–6 [13.4.3] Foley JE, Miyanshi K (1969) Interocular effects in prism adaptation Science 165 311–12 [13.4.3] Foley JM (1966) Locus of perceived equidistance as a function of viewing distance J Opt Soc Am 56 822–7 [14.6.1e] Foley JM (1970) Loci of perceived equi– half– and double–distance in stereoscopic vision Vis Res 10 1201–9 [14.6.1e] Foley JM (1976a) Binocular depth mixture Vis Res 16 1293–7 [18.8.2c] Foley JM (1976b) Successive stereo and vernier position discrimination as a function of dark interval duration Vis Res 16 1299–73 [18.12.2b] Foley JM (1980) Binocular distance perception Psychol Rev 87 411–34 [20.6.3d, 20.6.5a] Foley JM, Richards W (1974) 
Improvement in stereoanomaly with practice Am J Optom Physiol Opt 51 935–8 [18.14.1] Foley JM, Richards W (1978) Binocular depth mixture with non– symmetric disparities Vis Res 18 251–6 [18.8.2c] Foley JM, Tyler CW (1976) Effect of stimulus duration on stereo and vernier displacement thresholds Percept Psychophys 20 125–8 [18.12.1a] Foley JM, Applebaum TH, Richards WA (1975) Stereopsis with large disparities: discrimination and depth magnitude Vis Res 15 417–21 [18.4.1a] Formankiewicz MA, Mollon JD (2009) The psychophysics of detecting binocular discrepancies of luminance Vis Res 49 1929–38 [13.1.3d] Forte J, Peirce JW, Lennie P (2002) Binocular integration of partially occluded surfaces Vis Res 42 1225–35 [22.1.2] Fortin A, Ptito A, Faubert J, Ptito M (2002) Cortical areas mediating stereopsis in the human brain: a PET study Neuroreport 13 895–7 [11.8.1] Foster DH, Mason, RJ (1977) Interaction between rod and cone systems in dichoptic masking Neurosci Lett 4 39–42 [13.2.7b] Fox R (1991) Binocular rivalry In Vision and visual dysfunction Vol 9 Binocular vision (ed D Regan) pp 93–110 MacMillan, London [12.3.1a] Fox R , Check R (1966a) Binocular fusion: a test of the suppression theory Percept Psychophys 1 331–4 [12.7.2, 12.5] Fox R , Check R (1966b) Forced–choice form recognition during binocular rivalry Psychonom Sci 6 471–2 [12.7.2, 12.8.1] Fox R , Check R (1968) Detection of motion during binocular rivalry suppression J Exp Psychol 78 388–95 [12.6.4] Fox R , Check R (1972) Independence between binocular rivalry suppression duration and magnitude of suppression J Exp Psychol 93 283–9 [12.10]
Fox R, Herrmann J (1967) Stochastic properties of binocular rivalry alternations Percept Psychophys 2 432–6 [12.10] Fox R, Patterson R (1981) Depth separation and lateral interference Percept Psychophys 30 513–20 [13.2.4b, 22.5.1c] Fox R, Rasche F (1969) Binocular rivalry and reciprocal inhibition Percept Psychophys 5 215–17 [12.10, 12.3.2a] Fox R, Todd S, Bettinger LA (1975) Optokinetic nystagmus as an objective indicator of binocular rivalry Vis Res 15 849–53 [12.3.1a] Fox R, Lehmkuhle SW, Leguire LE (1978) Stereoscopic contours induce optokinetic nystagmus Vis Res 18 1189–92 [16.5.1] Fox R, Patterson R, Lehmkuhle S (1982) Effect of depth position on the motion aftereffect Invest Ophthal Vis Sci 22 (Abs) 144 [16.5.3a, 22.5.4] France TD, Ver Hoeve JN (1994) VECP evidence for binocular function in infantile esotropia J Ped Ophthal Strab 31 225–31 [13.1.8b] Frank H (1923) Über die Beeinflussung von Nachbildern durch die Gestalteigenschaften der Projektionsfläche Psychol Forsch 3 33–7 [22.5.1a] Frank M (1905) Beobachtungen betreffs der Übereinstimmung der Hering-Hillebrand’schen Horopterabweichung und des Kundt’schen Teilungsversuches Pflügers Arch ges Physiol 109 63–72 [14.6.2a] Freeman AW (2005) Multistage model for binocular rivalry J Neurophysiol 94 4412–20 [12.10] Freeman AW, Nguyen VA (2001) Controlling binocular rivalry Vis Res 41 2943–50 [12.5.3] Freeman RB (1967) Contrast interpretations of brightness constancy Psychol Bull 67 165–87 [22.4.2] Freeman RD, Ohzawa I (1990) On the neurophysiological organization of binocular vision Vis Res 30 1661–76 [11.4.1f, 11.4.3a] Freeman RD, Robson JG (1982) A new approach to the study of binocular interactions in visual cortex: normal and binocularly deprived cats Exp Brain Res 48 296–300 [11.3.1] Freeman TCB, Durand S, Kiper DC, Carandini M (2002) Suppression without inhibition in visual cortex Neuron 35 759–71 [12.9.2b] French JW (1923) Stereoscopy re-stated Trans Opt Soc 24 226–56 [15.4.6] Freud SL (1964)
The physiological locus of the spiral aftereffect Am J Psychol 77 422–8 [13.3.3a] Fricke T, Siderov J (1997) Non-stereoscopic cues in the Random-Dot E stereotest: results for adult observers Ophthal Physiol Opt 17 122–7 [18.2.3c] Friedman JR , Kosmorsky GS, Burde RM (1985) Stereoacuity in patients with optic nerve disease Arch Ophthal 103 37–8 [18.11] Friedman RB, Kaye MG, Richards W (1978) Effect of vertical disparity upon stereoscopic depth Vis Res 18 351–2 [18.6.5] Fries P, Roelfsema PR , Engel AK , et al. (1997) Synchronization of oscillatory responses in visual cortex correlates with perception in interocular rivalry Proc Natl Acad Sci 94 12999–704 [12.9.2b] Fries P, Schröder JH, Roelfsema PR , Singer W, Engel AK (2002) Oscillatory neuronal synchronization in primary visual cortex as a correlate of stimulus selection J Neurosci 22 3739–54 [12.9.2e] Frisby JP (1979) Seeing Houghton Mifflin, London [21.1] Frisby JP (1984) An old illusion and a new theory of stereoscopic depth perception Nature 307 592–3 [20.2.3b, 20.2.3c, 20.2.4] Frisby JP, Clatworthy JL (1975) Learning to see complex random–dot stereograms Perception 4 173–8 [18.14.2a, 18.14.2b, 18.14.2f ] Frisby JP, Julesz B (1975a) Depth reduction effects in random–line stereograms Perception 4 151–8 [15.3.5] Frisby JP, Julesz B (1975b) The effect of orientation difference on stereopsis as a function of line length Perception 4 179–86 [15.3.5] Frisby JP, Julesz B (1976) The effect of length differences between corresponding lines on stereopsis from single and multi–line stimuli Vis Res 16 83–7 [15.3.5] Frisby JP, Mayhew JEW (1978a) Contrast sensitivity function for stereopsis Perception 7 423–9 [18.5.2, 18.7.2a] Frisby JP, Mayhew JEW (1978b) The relationship between apparent depth and disparity in rivalrous texture stereograms Perception 7 661–78 [17.1.3, 17.5]
Frisby JP, Mayhew JEW (1979a) Does visual texture discrimination precede binocular fusion Perception 8 153–6 [16.6.2] Frisby JP, Mayhew JEW (1979b) Depth inversion in random-dot stereograms Perception 8 397–99 [21.6.2g] Frisby JP, Mayhew JEW (1980) Spatial frequency tuned channels: implications for structure and function from psychophysical and computational studies of stereopsis Philos Tr R Soc 290 95–116 [18.7.1] Frisby JP, Pollard SB (1991) Computational issues in solving the stereo correspondence problem In Computational models of visual processing (ed MS Landy, JA Movshon) pp 331–57 MIT Press, Cambridge MA [17.1.2b] Frisby JP, Roth B (1971) Orientation of stimuli and binocular disparity coding Quart J Exp Psychol 23 367–72 [15.3.5] Frisby JP, Mein J, Saye A, Stanworth A (1975) Use of random-dot stereograms in the clinical assessment of strabismic patients Br J Ophthal 59 545–52 [18.2.4] Frisby JP, Catherall C, Porrill J, Buckley D (1997) Sequential stereopsis using high-pass spatial frequency filtered textures Vis Res 37 3109–16 [18.10.2a] Frisby JP, Buckley D, Grant H, et al. 
(1999) An orientation anisotropy in the effects of scaling vertical disparities Vis Res 39 481–92 [20.6.5e] Frisén L, Lindblom B (1988) Binocular summation in humans: evidence for a hierarchical model J Physiol 402 773–82 [13.1.3e] Frisén L, Hoyt WF, Bird AC, Weale RA (1973) Diagnostic uses of the Pulfrich phenomenon Lancet 2 385–6 [23.7] Frohn JT, Knapp HF, Stemmer A (2000) True optical resolution beyond the Rayleigh limit achieved by standing wave illumination Proc Natl Acad Sci 97 7232–6 [24.2.3b] Fry GA (1936) The relationship of accommodation to suppression of vision in one eye Am J Ophthal 19 135–8 [12.8.1, 12.9.1] Fry GA (1950) Visual perception of space Am J Optom Arch Am Acad Optom 27 531–53 [16.7.6b] Fry GA, Bartley SH (1933) The brilliance of an object seen binocularly Am J Ophthal 16 687–93 [13.1.4a] Fry GA, Kent PR (1944) The effects of base-in and base-out prisms on stereo-acuity Am J Optom Arch Am Acad Optom 21 492–507 [18.6.7] Fry GA, Bridgman CS, Ellerbrock VJ (1949) The effect of atmospheric scattering on binocular depth Am J Optom Arch Am Acad Optom 29 9–15 [18.7.3b] Fukuda H, Blake R (1992) Spatial interactions in binocular rivalry J Exp Psychol HPP 18 362–70 [12.4.3] Fukuda K, Kaneko H, Matsumiya K (2006) Vertical-size disparities are temporally integrated for slant perception Vis Res 46 2749–56 [20.3.2c] Fukuda K, Wilcox LM, Allison RS, Howard IP (2009) A reevaluation of the tolerance to vertical misalignment in stereopsis J Vis 9(2) Article 1 [18.4.2a] Funaishi S (1926) Weiteres über das Zentrum der Sehrichtungen Graefe’s Arch Klin Exp Ophthal 117 296–303 [16.7.6a] Funaishi S (1927) Über die falsche Lichtlokalisation bei geschlossenen Lidern sowie über das subjektive Zyklopenauge Graefe’s Arch Klin Exp Ophthal 119 227–34 [16.7.6b] Funt B, Drew M, Ho J (1991) Color constancy from mutual reflection Int J Comp Vis 6 5–24 [22.4.6] Furchner CS, Ginsburg AP (1978) “Monocular rivalry” of a complex waveform Vis Res 18 1641–8 [12.3.8d] Gabor D
(1949) Microscopy by reconstructed wavefronts Proc Roy Soc A 197 454–87 [24.1.4a] Gantz L, Patel SS, Chung STL, Harwerth RS (2007) Mechanisms of perceptual learning of depth discrimination in random-dot stereograms Vis Res 47 2170–8 [18.14.1] Gantz L, Bedell HE (2010) Transfer of perceptual learning of depth discrimination between local and global stereograms Vis Res 50 1891–9 [18.14.1] Gårding J, Porrill J, Mayhew JEW, Frisby JP (1995) Stereopsis, vertical disparity and relief transformations Vis Res 35 703–22 [20.2.4b, 20.6.5e]
Gardner JC, Cynader MS (1987) Mechanisms for binocular depth sensitivity along the vertical meridian of the visual field Brain Res 413 60–74 [11.9.2] Gardner JC, Raiten EJ (1986) Ocular dominance and disparity– sensitivity: why there are cells in the visual cortex driven unequally by the two eyes Exp Brain Res 64 505–14 [11.3.1] Gardner JC, Douglas RM, Cynader MS (1985) A time–based stereoscopic depth mechanism in the visual cortex Brain Res 328 154–57 [23.3.2] Gassendi P (1658) Gassendi; opera omnia Vol 2 p 395 Lyon [12.7.2] Gaunt WA, Gaunt PN (1978) Three dimensional reconstruction in Biology University Park Press, Baltimore MD [24.2.3d, 24.2.5] Gawryszewski L de G, Riggio L, Rizzolatti G, Umiltá C (1987) Movements of attention in the three spatial dimensions and the meaning of “neutral cues” Neuropsychologia 25 19–29 [22.8.1] Genovesio A, Ferraina S (2004) Integration of retinal disparity and fixation-distance related signals toward an egocentric coding of distance in the posterior parietal cortex of primates J Neurophysiol 91 2670–84 [11.4.6a] Georgeson MA (1984) Eye movements, afterimages and monocular rivalry Vis Res 27 1311–19 [12.3.8d] Georgeson MA (1988) Spatial phase dependence and the role of motion detection in monocular and dichoptic forward masking Vis Res 28 1193–1205 [13.1.6c] Georgeson MA, Harris MG (1984) Spatial selectivity of contrast adaptation: models and data Vis Res 27 729–41 [20.2.1] Georgeson MA, Phillips R (1980) Angular selectivity of monocular rivalry: experiment and computer simulation Vis Res 20 1007–13 [12.3.8d] Georgeson MA, Shackleton TM (1989) Monocular motion sensing, binocular motion perception Vis Res 29 1511–23 [16.4.2a, 16.4.2c] Georgeson MA, Shackleton TM (1992) No evidence for dichoptic motion sensing: a reply to Carney and Shadlen Vis Res 32 193–8 [16.4.2b] Georgeson MA, Sullivan GD (1975) Contrast constancy: deblurring in human vision by spatial frequency J Physiol 252 627–56 [18.6.3d] Georgeson MA, Turner RSE (1985) 
Afterimages of sinusoidal squarewave and compound gratings Vis Res 25 1709–20 [21.6.2b] Georgeson MA, Yates TA, Schofield AJ (2008) Discriminating depth in corrugated stereo surfaces: Facilitation by a pedestal is explained by removal of uncertainty Vis Res 48 2321–8 [18.3.3b] Georgieva S, Peeters R , Kolster H, et al. (2009) The processing of threedimensional shape from disparity in the human brain J Neurosci 29 727–42 [11.8.1] Gepshtein S, Cooperman A (1998) Stereoscopic transparency: a test for binocular vision’s disambiguating power Vis Res 38 2913–32 [18.9] Gerbino W (1984) Low–level and high–level processes in the perceptual organization of three–dimensional apparent motion Perception 13 417–28 [22.5.3d] Gernsheim H (1969) History of photography McGraw-Hill, New York [24.1.2d] Gerstmann J, Kestenbaum A (1930) Monokuläres Doppeltsehen bei cerebralen Erkrankungen Z Neurol Psychiat 128 42–56 [14.4.2] Gestrin PJ, Teller DY (1969) Interocular hue shifts and pressure blindness Vis Res 9 1297–71 [12.2.1] Gettys CF, Harker GS (1967) Some observations and measurements of the Panum phenomenon Percept Psychophys 2 387–95 [17.6.3] Gibbs T, Lawson RB (1974) Simultaneous brightness contrast in stereoscopic space Vis Res 14 983–7 [22.4.2] Gibson JJ (1933) Adaptation after–effect and contrast in the perception of curved lines J Exp Psychol 16 1–31 [13.3.2a, 13.3.5, 21.1, 21.6.1a] Gibson JJ (1937) Adaptation aftereffect and contrast in the perception of tilted lines. II. Simultaneous contrast and the areal restriction of the aftereffect J Exp Psychol 20 553–69 [13.3.2a, 21.1, 21.6.1a] Gibson JJ (1961) Ecological optics Vis Res 1 253–62 [14.1] Gibson JJ (1966) The senses considered as perceptual systems HoughtonMifflin, Boston, MA [14.1]
Gibson JJ, Radner M (1937) Adaptation aftereffect and contrast in the perception of tilted lines. I. Quantitative studies J Exp Psychol 20 453–67 [21.1] Giessibl FJ, Hembacher S, Bielefeldt H, Mannhart J (2000) Subatomic features on the silicon (111)-(7x7) surface observed by atomic force microscopy Science 289 422–5 [24.2.3f ] Gilbert CD, Ts’o DY, Wiesel TN (1991) Lateral interactions in visual cortex In From pigments to perception (ed A Valberg , BB Lee) pp 239–47 Plenum, New York [12.9.2c] Gilbert DS, Fender DH (1969) Contrast thresholds measured with stabilized and non–stabilized sine–wave gratings Optica Acta 16 191–204 [18.10.1a] Gilchrist AL (1977) Perceived lightness depends on perceived spatial arrangement Science 195 185–7 [22.4.3b] Gilchrist AL (1980) When does perceived lightness depend on perceived spatial arrangement? Percept Psychophys 28 527–38 [22.4.3b] Gilchrist AL (2006) Seeing black and white Oxford University Press, New York [22.4.3b] Gilchrist AL, Kossyfidis C, Bonato F, et al. 
(1999) An anchoring theory of lightness perception Psychol Rev 106 795–834 [22.4.3b] Gilchrist J, Pardhan S (1987) Binocular contrast detection with unequal monocular illuminance Ophthal Physiol Opt 7 373–7 [13.1.2b] Gilinsky AS, Doherty RS (1969) Interocular transfer of orientational effects Science 164 454–5 [13.2.4a] Gill AT (1969) Early stereoscopes Photograph J 109 546–59, 606–14, 641–51 [24.1.2f] Gillam B (1967) Changes in the direction of induced aniseikonic slant as a function of distance Vis Res 7 777–83 [18.1] Gillam B (1968) Perception of slant when perspective and stereopsis conflict: experiments with aniseikonic lenses J Exp Psychol 78 299–305 [20.4.1d] Gillam B (1993) Stereoscopic slant reversals: a new kind of ‘induced’ effect Perception 22 1025–36 [18.1] Gillam B (1995) Matching needed for stereopsis Nature 373 202–4 [17.3] Gillam B, Blackburn SG (1998) Surface separation decreases stereoscopic slant but a monocular aperture increases it Perception 27 1297–86 [21.4.2d] Gillam B, Borsting E (1988) The role of monocular regions in stereoscopic displays Perception 17 603–8 [17.2.2] Gillam B, Lawergren B (1983) The induced effect, vertical disparity, and stereoscopic theory Percept Psychophys 34 121–30 [19.6.3, 19.6.6, 20.2.3a] Gillam B, Nakayama K (1999) Quantitative depth for a phantom surface can be based on cyclopean occlusion cues alone Vis Res 39 109–12 [17.3] Gillam B, Pianta MJ (2005) The effect of surface placement and surface overlap on stereo slant contrast and enhancement Vis Res 45 3083–95 [21.4.2d] Gillam B, Rogers B (1991) Orientation disparity deformation and stereoscopic slant perception Perception 20 441–8 [20.3.2b] Gillam B, Flagg T, Finley D (1984) Evidence for disparity change as the primary stimulus for stereoscopic processing Percept Psychophys 36 559–64 [21.4.2b] Gillam B, Chambers D, Lawergren B (1988a) The role of vertical disparity in the scaling of stereoscopic depth perception: an empirical and theoretical study Percept
Psychophys 44 473–83 [20.2.3b, 20.2.4b, 21.3.1] Gillam B, Chambers D, Russo T (1988b) Postfusional latency in slant perception and the primitives of stereopsis J Exp Psychol HPP 14 163–75 [18.12.1b, 20.4.1a, 21.4.2b, 21.5.2] Gillam B, Blackburn S, Cook M (1995) Panum’s limiting case: double fusion, convergence error, or ‘da Vinci stereopsis’ Perception 24 333–46 [17.6.3] Gillam B, Blackburn S, Nakayama K (1999) Stereopsis based on monocular gaps: metrical encoding of depth and slant without matching contours Vis Res 39 493–502 [17.3]
Gillam B, Cook M, Blackburn S (2003) Monocular discs in the occlusion zones of binocular surfaces do not have quantitative depth: a comparison with Panum’s limiting case Perception 32 1009–19 [17.6.4] Gillam B, Blackburn S, Brooks K (2007) Hinge versus twist: The effects of ‘reference surfaces’ and discontinuities on stereoscopic slant perception Perception 36 596–616 [21.4.2d] Gilroy LA, Blake R (2005) The interaction between binocular rivalry and negative afterimages Curr Biol 15 1740–4 [12.3.3a] Glass L, Perez R (1973) Perception of random-dot interference patterns Nature 246 360–2 [16.6.2] Glennerster A (1996) The time course of 2-D shape discrimination in random dot stereograms Vis Res 36 1955–68 [15.4.3] Glennerster A (1998) dmax for stereopsis and motion in random dot displays Vis Res 38 925–34 [18.4.1e] Glennerster A, McKee SP (1999) Bias and sensitivity of stereo judgements in the presence of a slanted reference plane Vis Res 39 3057–69 [18.3.2a, 18.3.2b, 21.4.2d] Glennerster A, McKee SP (2004) Sensitivity to depth relief on slanted surfaces J Vis 4 378–87 [18.3.2b] Glennerster A, Parker AJ (1997) Computing stereo channels from masking data Vis Res 37 2143–52 [18.7.4] Glennerster A, Rogers BJ (1993) New depth to the Müller–Lyer illusion Perception 22 691–704 [17.7] Glennerster A, Rogers BJ, Bradshaw MF (1996) Stereoscopic depth constancy depends on the subject’s task Vis Res 36 3441–56 [20.6.2c] Glennerster A, Rogers BJ, Bradshaw MF (1998) Cues to viewing distance for stereoscopic depth constancy Perception 27 1357–66 [20.6.3a, 20.6.5c] Glickstein M, Miller J, Smith OA (1964) Lateral geniculate nucleus and cerebral cortex: evidence for a crossed pathway Science 145 159–61 [11.9.2] Gnadt JW, Mays LE (1995) Neurons in monkey parietal area LIP are tuned for eye-movement parameters in three-dimensional space J Neurophysiol 73 280–97 [11.5.2b] Goethe JW von (1810) Zur Farbenlehre Tübingen. English translation in Matthaei R (1971) Goethe’s color theory.
Van Nostrand Reinhold, New York [14.2.2, 17.8] Gogel WC (1956) The tendency to see objects as equidistant and its inverse relation to lateral separation Psychol Monogr 70 (Whole No 411) [20.1.1, 21.3.2, 22.5] Gogel WC (1960) The perception of a depth interval with binocular disparity cues J Psychol 50 257–69 [20.6.2c] Gogel WC (1963) The visual perception of size and distance Vis Res 3 101–20 [21.3.1] Gogel WC (1965) Equidistance tendency and its consequences Psychol Bull 64 153–63 [21.3.2] Gogel WC (1975) Depth adjacency and the Ponzo illusion Percept Psychophys 17 125–32 [22.5.2] Gogel WC (1977) An indirect measure of perceived distance from oculomotor cues Percept Psychophys 21 3–11 [20.6.5c] Gogel WC, MacCracken PJ (1979) Depth adjacency and induced motion Percept Mot Skills 48 343–50 [22.7.2] Gogel WC, Mershon DH (1969) Depth adjacency in simultaneous contrast Percept Psychophys 5 13–17 [22.4.2] Gogel WC, Mershon DH (1977) Local autonomy in visual space Scand J Psychol 18 237–50 [21.3.1] Gogel WC, Newton RE (1975) Depth adjacency and the rod– and– frame illusion Percept Psychophys 18 163–71 [22.5.2] Goldstein AG (1967) Retinal rivalry and Troxler’s effect Psychonom Sci 7 427–8 [12.3.3a] Goldstein SR , Hubin T, Rosenthall S, Washburn C (1990) A confocal video-rate laser-beam scanning reflected-light microscope with no moving parts J Micros 157 29–38 [24.2.3b] González EG, Steinbach MJ, Gallie, BL, Ono H (1999) Egocentric localization: visually directed alignment to projected head landmarks in binocular and monocular observers Binoc Vis Strab Quart 14 127–36 [16.7.2c]
González EG, Ono H, Lam L, Steinbach MJ (2005) Kanizsa’s shrinkage illusion produced by a misapplied 3D corrective mechanism Perception 34 1181–92 [16.7.4b] González EG, Weinstock M, Steinbach MJ (2007) Peripheral fading with monocular and binocular viewing Vis Res 47 136–44 [12.3.1a, 12.3.3a] Gonzalez F, Krause F (1994) Generation of dynamic random-element stereograms in real time with a system based on a personal computer Med Biol Engin Comput 32 373–76 [24.1.5] Gonzalez F, Perez R (1998a) Modulation of cell responses to horizontal disparities by ocular vergence in the visual cortex of the awake macaca mulatta monkey Neurosci Lett 275 101–4 [11.4.6a] Gonzalez F, Perez R (1998b) Neural mechanisms underlying stereoscopic vision Prog Neurobiol 55 191–227 [11.9.2] Gonzalez F, Krause F, Perez R , et al. (1993a) Binocular matching in monkey visual cortex: single cell responses to correlated and uncorrelated dynamic random dot stereograms Neurosci 52 933–9 [11.4.1a] Gonzalez F, Revola JL, Perez R , et al. 
(1993b) Cell responses to vertical and horizontal retinal disparities in the monkey visual cortex Neurosci Lett 160 167–70 [11.4.4] Goodwin RT, Romano PE (1985) Stereoacuity degradation by experimental and real monocular and binocular amblyopia Invest Ophthal Vis Sci 29 917–23 [18.5.4b] Gorea A, Conway TE, Blake R (2001) Interocular interactions reveal the opponent structure of motion mechanisms Vis Res 41 441–8 [22.3.2] Goryo K , Kikuchi T (1971) Disparity and training in stereopsis Jap Psychol Res 13 148–52 [18.14.2c] Gosser HM (1977) Selected attempts at stereoscopic moving pictures and their relationship to the development of motion picture technology 1852–1903 Arno Press, New York [24.1.2c] Gouras P, Link K (1966) Rod and cone interaction in dark adapted monkey ganglion cells J Physiol 184 499–510 [13.2.3] Goutcher R , Hibbard PB (2010) Evidence for relative disparity matching in the perception of an ambiguous stereogram J Vis 10(12) [15.3.2] Goutcher R , Mamassian P (2005) Selective biasing of stereo correspondence in an ambiguous stereogram Vis Res 45 469–83 [15.3.2] Grabowska A (1983) Lateral differences in the detection of stereoscopic depth Neuropsychologia 21 279–57 [18.6.4] Graf EW, Adams WJ, Lages M (2004) Prior depth information can bias motion perception J Vis 4 427–33 [22.3.1] Graham ME (1983) Motion parallax and the perception of threedimensional surfaces Ph.D. 
Thesis University of St Andrews [21.6.2b] Graham ME, Rogers BJ (1982) Simultaneous and successive contrast effects in the perception of depth from motion-parallax and stereoscopic information Perception 11 247–62 [21.4.1, 21.4.2c, 21.5.2, 21.6.2b, 21.6.4] Graham ME, Rogers BJ (1983) Phase-dependent and phase-independent depth aftereffects Perception 12 (Abs) A16 [21.6.4] Grant S, Berman NEJ (1991) Mechanisms of anomalous retinal correspondence: maintenance of binocularity with alteration of receptive-field position in the lateral suprasylvian (LS) visual area of strabismic cats Vis Neurosci 7 259–81 [14.4.1c] Grasse KL (1991) Pharmacological isolation of visual cortical input to the cat accessory optic system: effects of intravitreal tetrodotoxin on DTN unit responses Vis Neurosci 6 175–83 [22.6.1b] Grasse KL (1994) Positional disparity sensitivity of neurons in the cat accessory optic system Vis Res 34 1673–89 [11.2.2, 11.6.4, 22.6.1e] Grasse KL, Cynader MS (1986) Response properties of single units in the accessory optic system of the dark-reared cat Devel Brain Res 27 199–210 [11.2.2] Grasse KL, Cynader MS (1987) The accessory optic system of the monocularly deprived cat Devel Brain Res 31 229–41 [11.2.2, 22.6.1b]
Grasse KL, Cynader MS (1990) The accessory optic system in frontal-eyed animals In Vision and visual dysfunction (ed AL Leventhal) Vol IV pp 111–39 MacMillan, London [22.6.1a] Gray MS, Pouget A, Zemel RS, et al. (1998) Reliable disparity estimation through selective integration Vis Neurosci 15 511–28 [11.10.2] Graybiel AM (1976) Evidence for banding of the cat’s ipsilateral retinotectal connections Exp Brain Res 114 318–27 [11.2.3] Green DM, Swets JA (1966) Signal detection theory and psychophysics Wiley, New York [13.1.1e] Green J (1889) On certain stereoscopical illusions evoked by prismatic and cylindrical spectacle-glasses Tr Am Ophthal Soc 449–56 [20.2.3a] Green M (1986) What determines correspondence strength in apparent motion Vis Res 26 599–607 [22.5.3a] Green M (1989) Color correspondence in apparent motion Percept Psychophys 45 15–20 [22.5.3a] Green M (1992) Temporal sampling requirements for stereoscopic displays In Stereoscopic displays and applications III Proc Int Soc Opt Engin 1669 101–11 [16.4.2g] Green M, Blake R (1981) Phase effects in monoptic and dichoptic temporal integration: flicker and motion detection Vis Res 21 365–72 [16.4.2a, 16.4.2c] Green M, Odom JV (1984) Comparison of monoptic and dichoptic masking by light Percept Psychophys 35 265–8 [13.2.3] Green M, Odom JV (1986) Correspondence matching in apparent motion: evidence for three-dimensional spatial representation Science 233 1427–9 [22.5.3b] Greene RT, Lawson RB, Godek CL (1972) The Ponzo illusion in stereoscopic space J Exp Psychol 95 358–64 [22.5.2] Greenlee MW (1992) Spatial frequency discrimination of band-limited periodic targets: effects of stimulus contrast bandwidth and retinal eccentricity Vis Res 32 275–83 [20.2.1] Greenspon TS, Eriksen CW (1968) Interocular nonindependence Percept Psychophys 3 93–6 [13.1.3e] Gregory RL (1961) The solid-image microscope Res Devel 1 101–3 [24.1.4b] Gregory RL (1966) Eye and brain World University Library, London [23.2.1, 23.4.2a, 23.7]
Gregory RL (1970) Distortion of visual space as inappropriate constancy scaling Nature 199 678–80 [16.7.4b] Gregory RL (1972) Cognitive contours Nature 238 51–2 [22.2.4a] Gregory RL (1973) Fusion and rivalry of illusory contours Perception 2 235–42 [22.2.4a] Gregory RL (1979) Stereo vision and isoluminance Proc R Soc B 204 467–76 [17.1.4a] Gregory RL, Harris JP (1974) Illusory contours and stereo depth Percept Psychophys 15 411–16 [22.2.4a] Griffin JR , Grisham JD (1995) Binocular anomalies Diagnosis and vision therapy Butterworth-Heinemann, Boston [14.4.1b] Griffin WP (1995) Three-dimensional imaging in endoscopic surgery Biomed Instrum Technol 29 183–9 [24.2.4] Grigo A, Lappe M (1998) Interaction of stereo vision and optic flow processing revealed by an illusory stimulus Vis Res 38 281–90 [22.7.4] Grimsley G (1943) A study of individual differences in binocular color fusion J Exp Psychol 32 82–7 [12.2.2] Grimson WEL (1981) A computer implementation of a theory of human stereo vision Philos Tr R Soc B 292 217–53 [15.3.1, 17.1.1a] Grinberg DL, Williams DR (1985) Stereopsis with chromatic signals from the blue–sensitive mechanism Vis Res 25 531–7 [17.1.4c] Grindley GC, Townsend V (1965) Binocular masking induced by a moving object Quart J Exp Psychol 17 97–109 [12.3.6b] Gronwall DMA, Sampson H (1971) Ocular dominance: a test of two hypotheses Br J Psychol 62 175–85 [12.3.7] Grossberg S, Howe DL (2003) A laminar cortical model of stereopsis and three-dimensional surface perception Vis Res 43 801–29 [11.10.1b] Grossberg S, Kelly F (1999) Neural dynamics of binocular brightness perception Vis Res 39 3796–816 [13.1.4c]
Grossberg S, Marshall JA (1989) Stereo boundary fusion by cortical complex cells: a system of maps, filters, and feedback networks for multiplexing distributed data Neural Networks 2 29–51 [11.10.1b] Grossberg S, McLoughlin NP (1997) Cortical dynamics of three-dimensional surface perception: binocular and half-occluded scenic images Neural Networks 10 1583–605 [11.10.1b] Grosslight JH, Fletcher HJ, Masterton RB, Hagen R (1978) Monocular vision and landing performance in general aviation pilots: cyclops revisited Hum Factors 20 27–33 [20.1.1] Grove PM, Ono H (1999) Ecologically invalid monocular texture leads to longer perceptual latencies in random-dot stereograms Perception 28 627–39 [17.2.2] Grove PM, Regan D (2002) Spatial frequency discrimination in cyclopean vision Vis Res 42 1837–46 [18.6.3f] Grove PM, Kaneko H, Ono H (2001) The backward inclination of a surface defined by empirical corresponding points Perception 30 411–29 [14.7] Grove PM, Gillam B, Ono H (2002) Content and context of monocular regions determine perceived depth in random dot, unpaired background and phantom stereograms Vis Res 42 1859–70 [17.2.2, 17.3] Grove PM, Brooks KR, Anderson BL, Gillam BJ (2006) Monocular transparency and unpaired stereopsis Vis Res 46 1695–705 [17.4] Grove PM, Ashida H, Kaneko H, Ono H (2008) Interocular transfer of a rotational motion aftereffect as a function of eccentricity Perception 37 1152–9 [13.3.3a] Grunewald A, Mingolla E (1998) Motion after-effect due to binocular sum of adaptation to linear motion Vis Res 38 2963–71 [13.3.3d] Grüsser OJ, Grüsser–Cornehls U (1965) Neurophysiologische Grundlagen des Binocularsehens Arch Psychiat Z ges Neurol 207 296–317 [13.1.1d] Guillemot JP, Paradis MC, Samson A, et al. 
(1993) Binocular interaction and disparity coding in area 19 of visual cortex in normal and split–chiasm cats Exp Brain Res 94 405–17 [11.3.2, 11.9.1] Gulick WL, Lawson RB (1976) Human stereopsis Oxford University Press, New York [11.1.1, 14.5.2a, 17.2.4] Gulyás B, Roland PE (1994) Binocular disparity discrimination in human cerebral cortex: functional anatomy by positron emission tomography Proc Natl Acad Sci 91 1239–43 [11.8.1] Gunter R (1951) Binocular fusion of colours Br J Psychol 42 363–72 [12.2.2, 2] Gur M (1991) Perceptual fade–out occurs in the binocularly viewed Ganzfeld Perception 20 645–54 [12.3.3a] Gur M, Akri V (1992) Isoluminant stimuli may not expose the full contribution of color to visual functioning: spatial contrast sensitivity measurements indicate interaction between color and luminance processing Vis Res 32 1253–62 [17.1.4e] Gur M, Snodderly DM (1987) Studying striate cortex neurons in behaving monkeys: benefits of image stabilization Vis Res 27 2081–7 [18.10.3a] Gur M, Snodderly DM (1997) Visual receptive fields of neurons in primary visual cortex (V1) move in space with the eye movements of fixation Vis Res 37 257–65 [18.10.3a] Gur M, Beylin A, Snodderly DM (1997) Response variability of neurons in primary visual cortex (V1) of alert monkeys J Neurosci 17 2914–20 [11.4.8a] Guth SL (1971) On probability summation Vis Res 11 747–50 [13.1.1e] Gyoba J (1978) The Poggendorff illusion under stereopsis Tohoku Psychol Folia, 37, 94–101 [16.7.4b] Hadani I, Vardi N (1987) Stereopsis impairment in apparently moving random dot patterns Percept Psychophys 42 158–65 [18.10.1b] Hadani I, Meiri AZ , Guri M (1984) The effects of exposure duration and luminance on the 3-dot hyperacuity task Vis Res 24 871–4 [18.12.1a] Haefner RM, Cumming BG (2008) Adaptation to natural binocular disparities in primate V1 explained by a generalized energy model Neuron 57 147–158 [11.10.1b, 13.1.8a]
Hagen MA, Jones RK , Reed ES (1978) On a neglected variable in theories of pictorial perception: truncation of the visual field Percept Psychophys 23 329–30 [24.1.7] Haines RF (1977) Visual response time to colored stimuli in peripheral retina: evidence for binocular summation Am J Optom Physiol Opt 54 387–98 [13.1.7] Hajos A (1962) Farbunterscheidung ohne “Farbigsehen” Naturwissenschaften 49 93–7 [17.8] Hajos A (1968) Sensumotorische Koordinationsprozesse bei Richtungslokalisation Z Exp Angew Psychol 15 435–61 [13.4.3] Hajos A, Ritter M (1965) Experiments to the problem of interocular transfer Acta Psychol 27 81–90 [13.3.2a, 13.3.5, 13.4.3] Häkkinen J, Nyman G (1996) Depth asymmetry in da Vinci stereopsis Vis Res 36 3815–19 [17.6.2] Häkkinen J, Nyman G (2001) Phantom surface captures stereopsis Vis Res 41 187–99 [17.3] Häkkinen J, Liinasuo M, Kojo I, Nyman G (1998) Three-dimensionally slanted illusory contours capture stereopsis Vis Res 38 3109–15 [22.2.4b] Haldat C (1806) Expériences sur la double vision J de Physique 63 387–401 [12.2.1] Hall C (1982) The relationship between clinical stereotests Ophthal Physiol Opt 2 135–43 [18.2.4] Halldén U (1952) Fusional phenomena in anomalous correspondence Acta Ophthal Supp 32 1–93 [14.4.1a, 14.4.1b, 14.4.1e] Halldén U (1956) An optical explanation of the Hering–Hillebrand horopter deviation Arch Ophthal 55 830–5 [14.6.2b] Hallert B (1970) X-ray photogrammetry—basic geometry and quality Elsevier, New York [24.2.3e] Halpern DL (1991) Stereopsis from motion–defined contours Vis Res 31 1611–17 [17.1.5] Halpern DL, Blake R (1988) How contrast affects stereoacuity Perception 17 483–95 [18.5.2, 18.5.3, 18.5.4a] Halpern DL, Patterson R , Blake R (1987a) Are stereoacuity and binocular rivalry related? 
Am J Optom Physiol Opt 64 41–4 [12.3.2c] Halpern DL, Patterson R , Blake R (1987b) What causes stereoscopic tilt from spatial frequency disparity Vis Res 27 1619–29 [20.2.1] Halpern DL, Wilson HR , Blake R (1996) Stereopsis from interocular spatial frequency differences is not robust Vis Res 36 2293–70 [20.2.1] Hamilton CR , Tieman SB, Winter HL (1973) Optic chiasm section affects discriminability of asymmetric patterns by monkeys Brain Res 49 427–31 [13.4.2] Hammond P (1991) Binocular phase specificity of striate corticotectal neurones Exp Brain Res 87 615–23 [11.4.1d] Hammond P, Mouat GSV (1988) Neural correlates of motion after– effects in cat striate cortical neurones: interocular transfer Exp Brain Res 72 21–8 [13.3.3f ] Hammond P, Pomfrett CJD (1991) Interocular mismatch in spatial frequency and directionality characteristics of striate cortical neurones Exp Brain Res 85 631–40 [11.6.1] Hammond P, Mouat GSV, Smith AT (1988) Neural correlates of motion after–effects in cat striate cortical neurones: monocular adaptation Exp Brain Res 72 1–20 [13.3.3f ] Hammond RS, Schmidt PP (1986) A random dot E stereogram for the vision screening of children Arch Ophthal 104 54–60 [18.2.3c] Hampton DR , Kertesz AE (1983) The extent of Panum’s area and the human cortical magnification factor Perception 12 161–65 [12.1.1d] Hamstra SJ, Regan D (1995) Orientation discrimination in cyclopean vision Vis Res 35 365–74 [16.2.2b] Hancock S, Whitney D, Andrews TJ (2008) The initial interactions underlying binocular rivalry require visual awareness J Vis 8 1–9 [12.3.5f ] Handa T, Mukuno K , Uozato H, et al. (2004) Effects of dominant and nondominant eyes in binocular rivalry Optom Vis Sci 81 377–82 [12.3.7]
Hanna GB, Cuschieri A (2000) Influence of two-dimensional and three-dimensional imaging on endoscopic bowel suturing World Journal of Surgery 27 444–8 [24.2.4] Hänny P, von der Heydt R, Poggio GF (1980) Binocular neuron responses to tilt in the monkey visual cortex Evidence for orientation disparity processing Exp Brain Res 41 A29 [11.6.2, 20.3.1a] Hansell R (1991) Stereopsis and ARC Am Orthopt J 41 122–7 [14.4.1e] Hardy JE, Dodds SR, Roberts AD (1996) An objective evaluation of the effectiveness of different methods of displaying three-dimensional information with medical x-ray images Invest Radiol 31 433–45 [24.2.3e] Hariharan-Vilupuru S, Bedell HE (2009) The perceived visual direction of monocular objects in random-dot stereograms is influenced by depth and allelotropia Vis Res 49 190–201 [16.7.4a] Harker GS (1962) Apparent frontoparallel plane stereoscopic correspondence and induced cyclorotation of the eyes Percept Mot Skills 14 75–87 [21.3.2] Harker GS (1967) A saccadic suppression explanation of the Pulfrich phenomenon Percept Psychophys 2 423–6 [23.2.4, 23.3.3] Harker GS, Jones PD (1985) Interocular intermittence, retinal illuminance, and apparent depth displacement of a moving object Percept Psychophys 37 50–8 [23.3.3] Harker GS, O’Neal OL (1967) Some observations and measurements of the Pulfrich phenomenon Percept Psychophys 2 438–40 [23.2.1] Harper B, Latto R (2001) Cyclopean vision, size estimation, and presence in orthostereoscopic images Presence 10 312–29 [20.6.3d] Harrad RA, McKee SP, Blake R, Yang Y (1994) Binocular rivalry disrupts stereopsis Perception 23 15–28 [12.7.3] Harris (2004) Binocular vision: moving closer to reality Trans Roy Soc A 362 2721–39 [20.6.3a] Harris JM, Morgan MJ (1993) Stereo and motion disparities interfere with positioning averaging Vis Res 33 309–12 [22.5.2] Harris JM, Parker AJ (1992) Efficiency of stereopsis in random–dot stereograms J Opt Soc Am A 9 1–12 [18.3.5] Harris JM, Parker AJ (1994a) Constraints on human 
stereo dot matching Vis Res 34 2761–72 [18.3.5] Harris JM, Parker AJ (1994b) Objective evaluation of human and computational stereoscopic visual systems Vis Res 34 2773–85 [18.3.5] Harris JM, Parker AJ (1995) Independent neural mechanisms for bright and dark information in binocular stereopsis Nature 374 808–11 [17.1.1b] Harris JM, Watamaniuk SNJ (1996) Poor speed discrimination suggests that there is no specialized speed mechanism for cyclopean motion Vis Res 36 2149–57 [16.5.2] Harris JM, Willis A (2001) A binocular site for contrast-modulated masking Vis Res 41 873–81 [13.2.4a] Harris JM, McKee SP, Smallman HS (1997) Fine-scale processing in human binocular stereopsis J Opt Soc Am 14 1673–83 [15.2.2d] Harris JP, Gregory RL (1973) Fusion and rivalry of illusory contours Perception 2 235–47 [22.2.4a] Harris L (1988) Varifocal mirror display integrated into a high speed image processor Proc Soc Photo Opt Instrum Engin 902 2–9 [24.1.4] Harris LR , Cynader M (1981) The eye movements of the dark-reared cat Exp Brain Res 44 41–56 [22.6.1b] Harris LR , Jenkin M (1993) Spatial vision in humans and robots Cambridge University Press, Cambridge [24.2.6] Harris VA, Hayes W, Gleason JM (1974) The horizontal–vertical illusion: binocular and dichoptic investigations of bisection and verticality components Vis Res 14 1323–6 [16.3.1] Harter MR , Seiple WH, Salmon L (1973) Binocular summation of visually evoked responses to pattern stimuli in humans Vis Res 13 1433–46 [13.1.8b] Harter MR , Seiple WH, Musso M (1974) Binocular summation and suppression: visually evoked cortical responses to dichoptically presented patterns of different spatial frequency Vis Res 14 1169–80 [13.1.4c]
Harter MR , Towle VL, Zakrzewski M, Moyer SM (1977) An objective indicant of binocular vision in humans: size-specific interocular suppression of visual evoked potentials EEG Clin Neurophysiol 43 825–36 [13.2.4a] Hartridge H (1918) Chromatic aberration and resolving power of the eye J Physiol 52 175–276 [17.8] Harwerth RS, Boltz RL (1979a) Stereopsis in monkeys using random dot stereograms: the effect of viewing duration Vis Res 19 985–91 [18.12.1a, 18.5.4a] Harwerth RS, Boltz RL (1979b) Behavioral measures of stereopsis in monkeys using random dot stereograms Physiol Behav 22 229–234 [18.3.1] Harwerth RS, Rawlings SC (1977) Viewing time and stereoscopic threshold with random–dot stereograms Am J Optom Physiol Opt 54 452–7 [18.12.1a, 18.2.4] Harwerth RS, Smith EL (1985) Binocular summation in man and monkey Am J Optom Physiol Opt 62 439–46 [13.1.2c] Harwerth RS, Smith EL, Levi DM (1980) Suprathreshold binocular interactions for grating patterns Percept Psychophys 27 43–50 [13.1.2d, 13.1.7] Harwerth RS, Smith EL, Siderov J (1995) Behavioral studies of local stereopsis and disparity vergence in monkeys Vis Res 35 1755–70 [18.3.1] Harwerth RS, Fredenburg PM, Smith EL (2003) Temporal integration for stereoscopic vision Vis Res 43 505–17 [18.12.1a] Hasebe H, Oyamada H, Ukal K , et al. (1996) Changes in oculomotor functions before and after loading of a 3-D visually-guided task by using a head-mounted display Ergonomics 39 1330–43 [24.2.6] Hastorf AH, Myro G (1959) The effect of meaning on binocular rivalry Am J Psychol 72 393–400 [12.8.3a] Hatta S, Kumagami T, Qian J, et al. 
(1998) Nasotemporal directional bias of V1 neurons in young infant monkeys Invest Ophthal Vis Sci 39 2259–67 [22.6.1b] Hay JC, Pick HL, Rosser E (1963) Adaptation to chromatic aberration by the human visual system Science 141 167–9 [13.3.5] Hayashi R, Miyawaki Y, Maeda T, Tachi S (2003) Unconscious adaptation: a new illusion of depth induced by stimulus features without depth Vis Res 43 2773–82 [21.6.2e] Hayashi R, Maeda T, Shimojo S, Tachi S (2004) An integrative model of binocular vision: a stereo model utilizing interocularly unpaired points produces both depth and binocular rivalry Vis Res 44 2367–80 [11.10.1c, 17.30] Hayashi R, Nishida S, Tolias A, Logothetis NK (2007) A method for generating a “purely first-order” dichoptic motion stimulus J Vis 7 1–10 [16.4.2c] Hayes RM (1989) 3-D movies A history and filmography of stereoscopic cinema McFarland, London [24.1.1] Hayhoe M, Gillam B, Chajka K, Vecellio E (2009) The role of binocular vision in walking Vis Neurosci 26 73–80 [20.1.1] Haynes JD, Deichmann R, Rees G (2005) Eye-specific effects of binocular rivalry in the human lateral geniculate nucleus Nature 438 496–499 [12.9.1] He S, Davis WL (2001) Filling-in at the natural blind spot contributes to binocular rivalry Vis Res 41 835–40 [12.3.4] He ZJ, Nakayama K (1994a) Perceiving textures: beyond filtering Vis Res 34 151–62 [22.1.2] He ZJ, Nakayama K (1994b) Apparent motion determined by surface layout not by disparity or 3–dimensional distance Nature 367 173–4 [22.5.3c] He ZJ, Nakayama K (1994c) Perceived surface shape not features determines correspondence strength in apparent motion Vis Res 34 2125–35 [22.5.3d] He ZJ, Nakayama K (1995) Visual attention to surfaces in three-dimensional space Proc Natl Acad Sci 92 11155–9 [22.8.1] He ZJ, Ooi TL (1999) Perceptual organization of apparent motion in the Ternus display Perception 28 887–92 [16.4.2e, 22.5.3b] He ZJ, Ooi TL (2000) Perceiving binocular depth with reference to a common surface Perception 29 
1313–34 [21.4.3]
Heard PF, Papakostopoulos D (1993) Long term adaptation of the Pulfrich illusion Invest Ophthal Vis Sci 34 (Abs) 1053 [23.4.2b] Hecht S (1928) On the binocular fusion of colors and its relation to theories of color vision Proc Natl Acad Sci 14 237–41 [12.2.1] Heckmann T, Howard IP (1991) Induced motion: isolation and dissociation of egocentric and vection–entrained components Perception 20 285–305 [22.7.2] Heckmann T, Post RB (1988) Induced motion and optokinetic afternystagmus; parallel response dynamics with prolonged stimulation Vis Res 28 681–94 [22.7.2] Heckmann T, Schor CM (1989a) Panum’s fusional area estimated with a criterion–free technique Percept Psychophys 45 297–306 [12.1.1c, 12.1.2] Heckmann T, Schor CM (1989b) Is edge information for stereoacuity spatially channelled Vis Res 29 593–607 [18.5.2] Heeley DW, Scott-Brown KC, Reid G, Maitland F (2003) Interocular orientation disparity and the stereoscopic perception of slanted surfaces Spat Vis 16 183–207 [11.6.2] Hegdé J, Van Essen DC (2005a) Stimulus dependence of disparity coding in primate visual area V4 J Neurophysiol 93 620–6 [11.5.3a] Hegdé J, Van Essen DC (2005b) Role of primate visual area V4 in the processing of 3-D shape characteristics defined by disparity J Neurophysiol 94 2856–2866 [11.5.3a] Heider B, Spillmann L, Peterhans E (2002) Stereoscopic illusory contours—cortical neuron responses and human perception J Cog Neurosci 14 1018–29 [22.2.4c] Heine L (1900) Sehschärfe und Tiefenwahrnehmung Pflügers Arch ges Physiol 51 146 [18.2.1a] Heinrich SP, Kromeier M, Bach M, Kommerell G (2005) Vernier acuity for stereodisparate objects and ocular prevalence Vis Res 45 1321–8 [18.11] Held RT, Banks MS (2008) Misperceptions in stereoscopic displays: a vision science perspective ACM Trans, APGV08, 23–31 [24.1.1] Hell SW, Wichmann J (1994) Breaking the diffraction resolution limit by stimulated emission: stimulated–emission-depletion fluorescence microscopy Optics Letters 19 780–2 [24.2.3c] Hell SW, 
Bahlmann K , Schrader M, et al. (1996) Three-photon excitation in fluorescence microscopy J Biomed Opt 1 71–4 [24.2.3c] Hell SW, Schrader M, van der Voort HTM (1997) Far-field fluorescence microscopy with three-dimensional resolution in the 100-nm range J Microsc 187 1–7 [24.2.3b] Helmholtz H von (1893) Popular lectures on scientific subjects (Translated by E Atkinson) Longmans Green, London [12.7.1] Helmholtz H von (1909) Helmholtz’s treatise on physiological optics Dover, New York 1962 (Translation by JPC Southall from the 3rd German edition of Handbuch der Physiologischen Optik) Vos Hamburg [12.1.3a, 12.2.1, 14.6.1b, 14.5.2g, 15.3.7b, 16.8, 17.8, 20.6.5a] Helson H (1963) Studies of anomalous contrast and assimilation J Opt Soc Am 53 179–84 [22.4.5] Hendricks JM, Holliday IE, Ruddock KH (1981) A new class of visual defect: spreading inhibition elicited by chromatic light stimuli Brain 104 813–40 [17.1.4e] Henning GB, Hertz BG (1973) Binocular masking level differences in sinusoidal grating detection Vis Res 13 2755–63 [13.2.4b] Henning GB, Hertz BG (1977) The influence of bandwidth and temporal properties of spatial noise on binocular masking–level differences Vis Res 17 399–402 [13.2.4b] Hepler N (1968) Color: a motion–contingent aftereffect Science 162 376–7 [13.3.5] Herbomel P, Ninio J (1993) Processing of linear elements in stereopsis: effects of positional and orientational distinctiveness Vis Res 33 1813–25 [15.3.11] Hering E (1861) Beitrage zur Physiologie Vol 5. Engelmann, Leipzig [12.2.2, 12.3.1a, 21.1] Hering E (1865) Die Gesetze der binocularen Tiefenwahrnehmung Arch für Anat Physiol Wissen Med 152–165 [15.3.1, 16.7.3a, 18.2.1d]
Hering E (1868) The theory of binocular vision (Translated by B Bridgeman) B Bridgeman & L Stark Eds, Plenum, New York 1977 [16.7.2b] Hering E (1874) Outlines of a theory of the light sense (Translated by L Hurvich, D Jameson) Harvard University Press, Cambridge MA 1964 [12.3.5a, 22.4.1] Hering E (1879) Spatial sense and movements of the eye (Translated by CA Radde) Am Acad Optom, Baltimore 1942 [12.2.1, 14.4.1b, 14.6.2a, 16.7.2b, 16.7.2c, 17.6.3] Herman JH, Tauber ES, Roffwarg HP (1974) Monocular occlusion impairs stereoscopic acuity but total visual deprivation does not Percept Psychophys 16 225–8 [18.14.1] Heron G, Dholakia S, Collins DE, McLaughlan H (1985) Stereoscopic threshold in children and adults Am J Optom Physiol Opt 62 505–15 [18.2.1e, 18.2.3b] Heron G, McQuaid M, Morrice E (1995) The Pulfrich effect in optometric practice Ophthal Physiol Opt 15 425–9 [23.7] Herpers MJ, Caberg HB, Mol JMF (1981) Human cerebral potentials evoked by moving dynamic random dot stereograms EEG Clin Neurophysiol 52 50–6 [11.7] Herring RD, Bechtoldt HP (1981) Categorical perception of stereoscopic stimuli Percept Psychophys 29 129–37 [18.6.4] Hertel K, Monjé M (1947) Über den Einfluss des Zeitfactors auf das räumliche Sehen Pflügers Arch ges Physiol 279 295–306 [18.12.1a] Herzau V (1976) Stereosehen bei alternierender Bildarbietung Graefe’s Arch Klin Exp Ophthal 200 85–91 [18.12.2a] Herzau W, Ogle KN (1937) Über den Grössenunterschied der Bilder beider Augen bei asymmetrischer Konvergenz und seine Bedeutung für das Zweiäugige Sehen Graefe’s Arch Klin Exp Ophthal 137 327–63 [14.6.2a, 20.2.2c] Hess C (1904) Untersuchungen über den Erregungsvorgang im Sehorgan bei kurz– und bei längerdauernder Reizung Pflügers Arch ges Physiol 101 229–62 [23.2.3] Hess RF (1978) Interocular transfer in individuals with strabismic amblyopia: a cautionary note Perception 7 201–5 [13.2.6] Hess RF, Holliday I (1992) The coding of spatial position by the human visual system Vis Res 32 1085–97 
[18.7.2d] Hess RF, Wilcox LM (1994) Linear and non-linear filtering in stereopsis Vis Res 34 2731–8 [15.3.6, 18.7.2d] Hess RF, Wilcox LM (2006) Stereo dynamics are not scale-dependent Vis Res 46 1911–23 [18.12.1c, 18.7.2a] Hess RF, Wilcox LM (2008) The transient nature of 2nd-order stereopsis Vis Res 48 1327–34 [18.7.2d] Hess RF, Demanins R , Bex PJ (1997) A reduced motion aftereffect in strabismic amblyopia Vis Res 37 1303–11 [13.3.3b] Hess RF, Kingdom FAA, Ziegler LR (1999) On the relationship between the spatial channels for luminance and disparity processing Vis Res 39 559–68 [18.7.2c] Hess RF, Liu CH, Wang YZ (2002) Luminance spatial scale and local stereo-sensitivity Vis Res 42 331–42 [15.3.6, 18.5.2] Hess RF, Liu CH, Wang YZ (2003) Differential binocular input and local stereopsis Vis Res 43 2303–13 [18.5.4a, 18.5.4b] Hess RF, Hutchinson CV, Ledgeway T, Mansouri B (2007) Binocular influences on global motion processing in the human visual system Vis Res 47 1682–92 [13.1.6d] Hess RF, Huang PC, Maehara G (2009) Spatial distortions produced by purely dichoptic-based visual motion Perception 38 1012–18 [16.4.2a] Hess WR (1914) Direct wirkende Stereoskopbilder Z Wissen Photog Photophy Photochem 14 34–8 [24.1.3b] Hetherington PA, Swindale NV (1999) Receptive field and orientation scatter studied by tetrode recordings in cat area 17 Vis Neurosci 16 637–52 [11.3.1, 11.6.2] Heuser JE, Reese TS, Dennis MJ, et al. (1979) Synaptic vesicle exocytosis captured by quick freezing and correlated with quantal transmitter release J Cell Biol 81 275–300 [24.2.3d] Hibbard PB (2005) The orientation bandwidth of cyclopean channels Vis Res 45 2780–2785 [18.6.3e]
Hibbard PB (2007) A statistical model of binocular disparity Vis Cogn 15 149–65 [11.10.1a] Hibbard PB (2008) Binocular energy responses to natural images Vis Res 48 1427–39 [11.10.1a, 11.10.1c] Hibbard PB, Bouzit S (2005) Stereoscopic correspondence for ambiguous targets is affected by elevation and fixation distance Spat Vis 18 399–411 [15.3.12] Hibbard PB, Bradshaw MF (1999) Does binocular disparity facilitate the detection of transparent motion? Perception 28 183–91 [22.3.5] Hibbard PB, Langley K (1998) Plaid slant and inclination thresholds can be predicted from components Vis Res 38 1073–84 [20.4.1d] Hibbard PB, Bradshaw MF, De Bruyn B (1999) Global motion processing is not tuned for binocular disparity Vis Res 39 961–74 [22.3.5] Hibbard PB, Bradshaw MF, Langley K, Rogers BJ (2002) The stereoscopic anisotropy: individual differences and underlying mechanisms J Exp Psychol: HPP 28 469–76 [20.4.1a] Highman VN (1977) Stereopsis and aniseikonia in uniocular aphakia Br J Ophthal 61 30–3 [18.3.4] Higuchi H, Hamasaki J (1978) Real-time transmission of 3-D images formed by parallax panoramagrams App Optics 17 3895–902 [24.1.3b] Hill AJ (1953) A mathematical and experimental foundation for stereophotography J Soc Motion Pict Televis Engin 61 461–87 [24.1.1] Hillebrand F (1893) Die Stabilität der Raumwerte auf der Netzhaut Z Psychol Physiol Sinnesorg 5 1–60 [14.6.2a, 14.6.2b] Hillebrand F (1929) Lehre von den Gesichtsempfindungen auf Grund hinterlassener Springer, Vienna [14.6.2a] Hillis JM, Banks MS (2001) Are corresponding points fixed? 
Vis Res 41 2457–73 [14.6.2a] Hine T (1985) The binocular contribution to monocular optokinetic nystagmus and after nystagmus asymmetries in humans Vis Res 25 589–98 [22.6.1e] Hinkle DA, Connor CE (2002) Three-dimensional orientation tuning in macaque area V4 Nat Neurosci 5 665–70 [11.5.3a, 11.6.2] Hinkle DA, Connor CE (2005) Quantitative characterization of disparity tuning in ventral pathway area V4 J Neurophysiol 94 2726–37 [11.5.3a] Hinton GE (1989) Connectionist learning procedures Artificial Intelligence 40 185–234 [11.10.2] Hiris E, Blake R (1996) Direction repulsion in motion transparency Vis Neurosci 13 187–97 [22.7.4] Hirsch MJ (1947) The stereoscope as a means of measuring distance discrimination Am J Optom Arch Am Acad Optom 27 442–6 [18.2.4] Hirsch MJ, Weymouth FW (1948a) Distance discrimination. I. Theoretical consideration Arch Ophthal 39 210–23 [18.10.2a, 18.6.2a] Hirsch MJ, Weymouth FW (1948b) Distance discrimination. II. Effect on threshold of lateral separation of the test objects Arch Ophthal 39 227–31 [18.10.2a, 18.6.2a] Ho WA, Berkley MA (1991) Interactions between channels as revealed by ambiguous motion stimuli Invest Ophthal Vis Sci 32 (Abs) 829 [22.3.1] Hochberg JE, Beck J (1954) Apparent spatial arrangement and perceived brightness J Exp Psychol 47 293–6 [22.4.3b] Hodges LF (1992) Time-multiplexed stereoscopic computer graphics IEEE Tr Comput Graph App 14 20–30 [24.1.1] Hofeldt AJ, Leavitt J, Behrens MM (1985) Pulfrich stereo-illusion phenomenon in serous sensory retinal detachment of the macula Am J Ophthal 100 576–80 [23.7] Hoffman CS (1962) Comparison of monocular and binocular color matching J Opt Soc Am 52 75–80 [12.2.3] Hoffmann KP (1979) Optokinetic nystagmus and single-cell responses in the nucleus tractus opticus after early monocular deprivation in the cat In Developmental neurobiology of vision (ed RD Freeman) pp 63–72 Plenum, New York [22.6.1b] Hoffmann KP (1982) Cortical versus subcortical contributions to the optokinetic 
reflex in the cat In Functional basis of ocular motility
disorders (ed G Lennerstrand, DS Zee, EL Keller) pp 303–11 Pergamon, New York [22.6.1b, 22.6.1e] Hoffmann KP, Distler C (1986) The role of direction selective cells in the nucleus of the optic tract of cat and monkey during optokinetic nystagmus In Adaptive processes in vision and oculomotor systems (ed EL Keller, DS Zee) pp 291–7 Pergamon, New York [22.6.1b] Hoffmann KP, Distler C (1989) A quantitative analysis of visual receptive fields of neurons in nucleus of the optic tract and dorsal terminal nucleus of the accessory optic tract in macaque monkey J Neurophysiol 62 416–28 [22.6.1b] Hoffmann KP, Stone J (1985) Retinal input to the nucleus of the optic tract of the cat assessed by antidromic activation of ganglion cells Exp Brain Res 59 395–403 [22.6.1b] Hoffmann KP, Bremmer F, Thiele A, (2002) Directional asymmetry of neurons in cortical areas MT and MST projecting to the NOT-DTN in macaques J Neurophysiol 87 2113–23 [22.6.1b] Holland HC (1965) The spiral after–effect Pergamon, Oxford [13.3.3a] Holliday IE, Braddick OJ (1991) Pre-attentive detection of a target defined by stereoscopic slant Perception 20 355–62 [22.8.2a] Hollins M (1980) The effect of contrast on the completeness of binocular rivalry suppression Percept Psychophys 27 550–6 [12.4.1] Hollins M, Bailey GW (1981) Rivalry target luminance does not affect suppression depth Percept Psychophys 30 201–3 [12.3.2a] Hollins M, Leung EHL (1978) The influence of color on binocular rivalry In Visual psychophysics and physiology (ed JC Armington, J Krauskopf, BR Wooten) pp 181–90 Academic Press, New York [12.3.2e] Holopigian K (1989) Clinical suppression and binocular rivalry suppression: the effects of stimulus strength on the depth of suppression Vis Res 29 1325–33 [12.3.2a] Home R (1984) Binocular summation: a study of contrast sensitivity visual acuity and recognition Vis Res 18 579–85 [13.1.3c] Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities Proc Natl 
Acad Sci 79 2554–8 [15.2.1b] Horowitz MW (1949) An analysis of the superiority of binocular over monocular visual acuity J Exp Psychol 39 581–96 [13.1.1a] Householder AS (1943) A theory of the induced size effect Bull Math Biophys 5 155–9 [20.2.3a] Hovis JK (1989) Review of dichoptic color mixing Optom Vis Sci 66 181–90 [12.2.4] Hovis JK , Guth SL (1989a) Dichoptic opponent hue cancellations Optom Vis Sci 66 304–19 [12.2.3] Hovis JK , Guth SL (1989b) Changes in luminance affect dichoptic unique yellow J Opt Soc Am A 6 1297–301 [12.2.3] Howard HJ (1919) A test for the judgment of distance Am J Ophthal 2 656–75 [18.2.1a] Howard IP (1959) Some new subjective phenomena apparently due to interocular transfer Nature 184 1516–17 [12.3.3a] Howard IP (1960) Attneave’s interocular color–effect Am J Psychol 73 151–2 [13.3.5] Howard IP (1961) An investigation of a satiation process in the reversible perspective of a revolving skeletal cube Quart J Exp Psychol 13 19–33 [21.6.2g] Howard IP (1970) Vergence, eye signature, and stereopsis Psychonom Monogr Supp 3 201–4 [16.8, 19.6.3, 19.6.4] Howard IP (1982) Human visual orientation Wiley, Chichester [13.3.2a, 13.4.3, 16.2.1, 16.2.2b, 18.6.5, 22.7.3] Howard IP (1993) The optokinetic system In The vestibulo–ocular reflex nystagmus and vertigo (ed JA Sharpe, HO Barber) pp 163–84 Raven Press, New York [22.6.1a] Howard IP (1995) Depth from binocular rivalry without spatial disparity Perception 27 67–74 [16.1.2c, 17.5] Howard IP (1996) Alhazen’s neglected discoveries of visual phenomena Perception 25 1203–17 [16.7.2b] Howard IP, Duke PA (2003) Monocular transparency generates quantitative depth Vis Res 43 2615–21 [17.4]
Howard IP, Gonzalez EG (1987) Optokinetic nystagmus in response to moving binocularly disparate stimuli Vis Res 27 1807–17 [22.6.1e] Howard IP, Heckmann T (1989) Circular vection as a function of the relative sizes distances and positions of two competing visual displays Perception 18 657–67 [22.7.3] Howard IP, Howard A (1994) Vection; the contribution of absolute and relative visual motion Perception 23 745–51 [22.7.3] Howard IP, Hu G (2001) Visually induced reorientation illusions Perception 30 583–600 [21.1] Howard IP, Kaneko H (1994) Relative shear disparities and the perception of surface inclination Vis Res 34 2505–17 [20.3.2a, 21.7.2] Howard IP, Marton C (1992) Visual pursuit over textured backgrounds in different depth planes Exp Brain Res 90 625–9 [22.6.2] Howard IP, Ohmi M (1984) The efficiency of the central and peripheral retina in driving human optokinetic nystagmus Vis Res 27 969–76 [22.6.1e] Howard IP, Ohmi M (1992) A new interpretation of the role of dichoptic occlusion in stereopsis Invest Ophthal Vis Sci 33 (Abs) 1370 [17.6.2] Howard IP, Pierce BJ (1998) Types of shear disparity and the perception of surface inclination Perception 27 129–45 [21.7.2] Howard IP, Simpson WS (1989) Human optokinetic nystagmus is linked to the stereoscopic system Exp Brain Res 78 309–14 [22.6.1e] Howard IP, Templeton WB (1964) The effect of steady fixation on the judgment of relative depth Quart J Exp Psychol 16 193–203 [21.6.1a] Howard IP, Templeton WB (1966) Human spatial orientation Wiley, London [16.7.6a] Howard IP, Wade NJ (1996) Ptolemy’s contributions to the geometry of binocular vision Perception 25 1189–1201 [16.7.2b] Howard IP, Zacher JE (1991) Human cyclovergence as a function of stimulus frequency and amplitude Exp Brain Res 85 445–50 [19.6.1, 21.7.2] Howard IP, Giaschi D, Murasugi CM (1989) Suppression of OKN and VOR by afterimages and imaginary objects Exp Brain Res 75 139–45 [22.7.2] Howard IP, Ohmi M, Sun L (1993) Cyclovergence: a comparison of 
objective and psychophysical measurements Exp Brain Res 97 349–55 [21.3.3] Howarth E (1951) The role of depth of focus in depth perception Brit J Psychol (General) 42 11–20 [18.5.1] Howarth PA, Bradley A (1986) The longitudinal chromatic aberration of the human eye and its correction Vis Res 29 361–6 [17.8] Howe PDL, Livingstone MS (2006) V1 partially solves the stereo aperture problem Cereb Cortex 16 1332–7 [11.4.5a] Howe PDL, Watanabe T (2003) Measuring the depth induced by an opposite-luminance (but not anticorrelated) stereogram Perception 32 415–421 [15.3.7b] Hubel DH, Livingstone MS (1987) Segregation of form color and stereopsis in primate area 18 J Neurosci 7 3378–415 [11.4.1a, 11.5] Hubel DH, Wiesel TN (1959) Receptive fields of single neurones in the cat’s visual cortex J Physiol 148 574–91 [11.1.2] Hubel DH, Wiesel TN (1962) Receptive fields binocular interaction and functional architecture in the cat’s visual cortex J Physiol 160 106–54 -215, 235, 255 [11.1.2, 11.6.1] Hubel DH, Wiesel TN (1970) Stereoscopic vision in macaque monkey Nature 225 41–2 [11.1.2, 11.4.1a] Hubel DH, Wiesel TN (1973) A re-examination of stereoscopic mechanisms in area 17 of the cat J Physiol 232 29–30P [11.6.2] Hughes A (1977) The topography of vision in mammals of contrasting life style: comparative optics and retinal organization In Handbook of sensory physiology Vol VII/5 (ed F Crescitelli) pp 615–756 Springer, New York [14.1] Humphriss D (1982) The psychological septum An investigation into its function Am J Optom Physiol Opt 59 639–41 [12.3.2b] Hurvich LM, Jameson D (1951) The binocular fusion of yellow in relation to color theories Science 114 199–202 [12.2.1]
Hyson MT, Julesz B, Fender DH (1983) Eye movements and neural remapping during fusion of misaligned random–dot stereograms J Opt Soc Am 73 1665–73 [18.4.1b] Iavecchia HP, Folk CL (1994) Shifting visual attention in stereographic displays: a time course analysis Hum Factors 36 606–18 [22.8.1] Ibbotson MR , Marotte LR , Mark RF (2002) Investigations into the source of binocular input to the nucleus of the optic tract in an Australian marsupial, the Wallaby Macropus eugenii Exp Brain Res 147 80–88 [22.6.1a] Ichihara S, Goryo K (1978) The effects of relative orientation of surrounding gratings on binocular rivalry and apparent brightness of central gratings Jap Psychol Res 20 159–66 [12.3.3b] Ichikawa M, Egusa H (1993) How is depth perception affected by long– term wearing of left–right reversing spectacles Perception 22 971–84 [21.6.2g] Idesawa M, Uchida M, Watanabe T (2005) 3-D illusory objects viewed with integrated prism glasses Perception 34 (Suppl) 186 [17.8] Ikeda M (1965) Temporal summation of positive and negative flashes in the visual system J Opt Soc Am 55 1527–34 [13.1.6c] Ikeda M, Nakashima Y (1980) Wavelength difference limit for binocular color fusion Vis Res 20 693–7 [12.2.2] Ikeda M, Sagawa K (1979) Binocular color fusion limit J Opt Soc Am 69 316–20 [12.2.2] Indebetouw G, Zhong W (2006) Scanning holographic microscopy of three-dimensional fluorescent specimens J Opt Soc Am 23 1699–707 [24.2.3b] Ingling CR (1991) Psychophysical correlates of parvo channel function In From pigments to perception (ed A Valberg , BB Lee) pp 413–27 Plenum, New York [11.5.4] Ingling CR , Grigsby SS (1990) Perceptual correlates of magnocellular and parvocellular channels: seeing form and depth in afterimages Vis Res 30 823–8 [11.5.4] Ingling CR , Martinez–Uriegas E (1985) The spatiotemporal properties of the r–g X–cell channel Vis Res 25 33–8 [11.5.4] Inoué S, Inoué TD (1986) Computer-aided stereoscopic video reconstruction and serial display from high-resolution 
light-microscope optical sections Ann N Y Acad Sci 483 392–404 [24.2.3a] Ioannou GL, Rogers BJ, Bradshaw MF, Glennerster A (1993) Threshold and supra-threshold sensitivity functions for stereoscopic surfaces Invest Ophthal Vis Sci 34 (Abs) 1186 [18.6.3d] Ireland FH (1950) A comparison of critical flicker frequencies under conditions of monocular and binocular stimulation J Exp Psychol 40 282–6 [13.1.5] Ishii M (2009) Effect of a disparity pattern on the perception of direction: Non-retinal information masks retinal information Vis Res 49 1563–8 [19.6.4] Ishikawa H, Geiger D (2006) Illusory volumes in human stereo perception Vis Res 46 171–8 [22.2.1] Ito H (1997) The interaction between stereoscopic and luminance motion Vis Res 37 2553–59 [16.5.1] Ito H (2003) The aperture problems in the Pulfrich effect Perception 32 367–75 [23.1.2] Ito H (2005) Illusory depth perception of oblique lines produced by overlaid vertical disparity Vis Res 45 931–42 [15.3.13] Ives HE (1931) Optical properties of a Lippmann lenticulated sheet J Opt Soc Am 21 171–6 [24.1.3c] Ives HE (1933) An experimental apparatus for the projection of motion pictures in relief J Soc Motion Pict Engin 21 106–15 [24.1.3c] Iwabuchi A, Shimuzu H (1997) Antiphase flicker induces depth segregation Percept Psychophys 59 1312–29 [22.1.1] Iwami T, Nishida Y, et al. (2002) Common neural processing regions for dynamic and static stereopsis in human parieto-occipital cortices Neurosci Lett 327 29–32 [11.8.1] Jaensch ER (1911) Über die Wahrnehmung des Raumes Eine experimentell–psychologische Untersuchung nebst Anwendung auf Ästhetik und Erkenntnislehre Z Psychol Physiol Sinnesorg 6 (Supp) 1–448 [17.6.2]
Jampolsky A, Flom BC, Freid AN (1957) Fixation disparity in relation to heterophoria Am J Ophthal 43 97–106 [18.10.3b] Janssen P, Vogels R , Orban GA (2000a) Selectivity for 3D shape that reveals distinct areas within macaque inferior temporal cortex Science 288 2054–6 [11.5.3b] Janssen P, Vogels R , Orban GA (2000b) Three-dimensional shape coding in inferior temporal cortex Neuron 27 385–97 [11.5.3b] Janssen P, Vogels R , Liu Y, Orban GA (2001) Macaque inferior temporal neurons are selective for three-dimensional boundaries and surfaces J Neurosci 21 9419–29 [11.5.3b] Janssen P, Vogels R , Liu Y, Orban GA (2003) At least at the level of inferior temporal cortex, the stereo correspondence problem is solved Neuron 37 693–701 [11.5.3b] Jaschinski W, Bröde P, Griefahn B (1999) Fixation disparity and nonius bias Vis Res 39 669–77 [14.6.1c] Jaschinski W, Jainta S, Schürer M (2006) Capture of visual direction in dynamic vergence is reduced with flashed monocular lines Vis Res 46 2608–14 [16.7.4a] Javal E (1865) De la neutralisation dans l’acte de la vision Annals d’ Oculistique Paris 54 5–16 [14.4.2] Jeeves MA (1991) Stereo perception in callosal agenesis and partial callosotomy Neuropsychologia 29 19–34 [11.9.2] Jenkin MR , Jepson AD (1988) The measurement of binocular disparity In Computational processes in human vision (ed ZW Pylyshyn) pp 69–98 Ablex Publishing , Norwood NJ [11.4.3a] Jenkin MR , Jepson AD (1994) Recovering local surface structure through local phase difference measurements Comp Vis Gr Im Proc: Im Underst 59 72–93 [15.2.1d] Jenkin MR , Jepson AD, Tsotsos JK (1991) Techniques for disparity measurement Comp Vis Im Proc: Im Underst 53 14–30 [11.10.1a, 11.4.3a] Jenkins TCA, Pickwell LD, Abd-Manan F (1992) Effect of induced fixation on binocular visual acuity Ophthal Physiol Opt 12 299–301 [13.1.1a] Jenkins TCA, Abd-Manan F, Pardhan S, Murgatroyd RN (1994) Effect of fixation disparity on distance binocular visual acuity Ophthal Physiol Opt 14 129–31 
[13.1.1a] Jennings JAM (1985) Anomalous retinal correspondence: a review Ophthal Physiol Opt 5 357–68 [14.4.1e] Jensen JR (1980) Stereoscopic statistical maps The American Cartographer 7 25–37 [24.2.1] Jiao SL, Han C, Jing QC, Over R (1984) Monocular–contingent and binocular–contingent aftereffects Percept Psychophys 35 105–10 [13.3.3d] Jiménez JR , Rubino M, Hita E, Jiménez del Barco L (1997) Influence of the luminance and opponent chromatic channels on stereopsis with random-dot stereograms Vis Res 37 591–6 [17.1.4a] Jiménez JR , Rubino M, Díaz JA, et al. (2000) Changes in stereoscopic depth perception caused by decentration of spectacle lenses Optom Vis Sci 77 421–7 [18.6.7] Jiménez JR , Medina JM, Jiménez DD, Diaz JA (2002a) Binocular summation of chromatic changes as measured by visual reaction time Percept Psychophys 64 140–7 [13.1.2g] Jiménez JR , Ponce A, Del Barco J, et al. (2002b) Impact of induced aniseikonia on stereopsis with random-dot stereogram Optom Vis Sci 79 121–5 [18.3.4] Jiménez JR , Castro JJ, Jiménez R , Hita E (2008) Interocular differences in higher-order aberrations on binocular visual performance Optom Vis Sci 85 174–9 [18.5.4b] Johannsen DE (1930) A quantitative study of binocular color vision J Gen Psychol 4 282–308 [12.2.1, 12.2.2] Johansson G (1964) Perception of motion and changing form Scand J Psychol 5 181–208 [22.1.1] Johnston A, Curran W (1996) Investigating shape-from-shading illusions using solid objects Vis Res 36 2827–35 [21.4.2f ] Johnston AW (1971) Clinical horopter determination and the mechanism of binocular vision in anomalous correspondence Ophthalmologica 163 102–119 [14.4.1b]
Johnston EB (1991) Systematic distortions of shape from stereopsis Vis Res 31 1351–60 [20.6.3d] Jolicoeur P, Cavanagh P (1992) Mental rotation physical rotation and surface media J Exp Psychol HPP 18 371–84 [16.2.2b] Jones DG, Malik J (1992) Computational framework for determining stereo correspondence from a set of linear spatial filters Image Vis Comp 10 699–708 [15.2.1c] Jones HE, Grieve KL, Wang W, Sillito AM (2001) Surround suppression in primate V1 J Neurophysiol 86 2011–28 [12.3.3b] Jones PF, Aitken GJM (1994) Comparison of three-dimensional imaging systems J Opt Soc Am A 11 2913–21 [24.1.4a] Jones RK , Lee DN (1981) Why two eyes are better than one: the two views of binocular vision J Exp Psychol HPP 7 30–40 [20.1.1] Jones RM, Tulunay-Keesey U (1980) Phase selectivity of spatial frequency channels J Opt Soc Am 70 66–70 [21.6.4] Jordan JR , Geisler WS, Bovik AC (1990) Color as a source of information in the stereo correspondence process Vis Res 30 1955–70 [17.1.4b] Joseph JS, Chun MM, Nakayama K (1997) Attentional requirements in a ‘preattentive’ feature search task Nature 387 805–7 [22.8.2a] Joshua DE, Bishop PO (1970) Binocular single vision and depth discrimination Receptive field disparities for central and peripheral vision and binocular interaction on peripheral single units in cat striate cortex Exp Brain Res 10 389–416 [11.3.1, 11.4.3c, 11.4.4, 11.4.5b] Jourdan IC, Dutson E, et al. 
(2004) Stereoscopic vision provides a significant advantage for precision robotic laparoscopy Brit J Surg 91 879–85 [24.2.4] Julesz B (1960) Binocular depth perception of computer generated patterns Bell System Technical Journal 39 1125–62 [15.4.4, 17.1, 17.5, 18.3.4, 18.14.2a, 24.1.5] Julesz B (1963) Stereopsis and binocular rivalry of contours J Opt Soc Am 53 994–9 [18.3.3a, 18.10.1a] Julesz B (1964) Binocular depth perception without familiarity cues Science 145 356–62 [17.2.3, 18.12.1b, 21.7.2] Julesz B (1966) Binocular disappearance of monocular symmetry Science 153 657–9 [16.6.2] Julesz B (1971) Foundations of cyclopean perception University of Chicago Press, Chicago [15.3.7d, 16.1.1, 16.3.2, 16.6.2, 18.14.2a, 18.14.2c, 18.14.2f, 20.5.1, 21.4.1] Julesz B, Bergen JR (1983) Textons the fundamental elements on preattentive vision and perception of texture Bell System Technical Journal 62 1619–45 [22.8.2a] Julesz B, Chang JJ (1976) Interaction between pools of binocular disparity detectors tuned to different disparities Biol Cyber 22 107–19 [15.4.5, 18.13] Julesz B, Johnson SC (1968) Stereograms portraying ambiguous perceivable surfaces Proc Natl Acad Sci 61 437–41 [18.8.2a] Julesz B, Miller JE (1975) Independent spatial frequency tuned channels in binocular fusion and rivalry Perception 4 125–43 [12.7.3, 13.1.2c, 18.7.4] Julesz B, Oswald HP (1978) Binocular utilization of monocular cues that are undetectable monocularly Perception 7 315–22 [18.14.2c] Julesz B, Payne RA (1968) Differences between monocular and binocular stroboscopic movement perception Vis Res 8 433–44 [16.4.1, 16.5.1] Julesz B, Tyler CW (1976) Neurontropy an entropy–like measure of neural correlation in binocular fusion and rivalry Biol Cyber 22 107–19 [11.7, 15.2.2a] Julesz B, White B (1969) Short term visual memory and the Pulfrich phenomenon Nature 222 639–41 [23.3.1, 23.6.1] Julesz B, Kropfl W, Petrig B (1980) Large evoked potentials to dynamic random–dot correlograms permit quick 
determination of stereopsis Proc Natl Acad Sci 77 2348–51 [11.7] Julesz B, Breitmeyer B, Kropfl W (1976) Binocular disparity–dependent upper–lower hemifield anisotropy and left–right hemifield isotropy as revealed by dynamic random–dot stereograms Perception 5 129–41 [18.6.1b]
Justo MS, Bermudez MA, Perez R , Gonzalez F (2004) Binocular interaction and performance of visual tasks Ophthal Physiol Opt 24 82–90 [13.1.7] Kaernbach C, Schröger E, Jacobsen T, Roeber U (1999) Effects of consciousness on human brain waves following binocular rivalry Neuroreport 10 713–6 [12.9.2e] Kahn RH (1931) Über den Stereoeffekt von Pulfrich Pflügers Arch ges Physiol 227 213–27 [23.5] Kahneman D (1968) Methods findings and theory in studies of visual masking Psychol Bull 70 693–7 [13.2.7b] Kaiser P (1971) Minimally distinct border as a preferred psychophysical criterion in visual heterochromatic photometry J Opt Soc Am 61 966–71 [17.1.4a] Kaiser P, Boynton RM (1985) Role of the blue mechanism in wavelength discrimination Vis Res 25 523–9 [12.3.2e] Kalarickal GJ, Marshall JA (2000) Neural model of temporal and stochastic properties of binocular rivalry Neurocomputing 32–33 843–53 [12.10] Kalberlah C, Distler C, Hoffmann KP (2009) Sensitivity to relative disparity in early visual cortex of pigmented and albino ferrets Exp Brain Res 192 379–89 [11.3.1] Kamphuisen A, Bauer M, van Ee R (2008) No evidence for widespread synchronized networks in binocular rivalry: MEG frequency tagging entrains primarily early visual cortex J Vis 8(5) Article 4 [12.9.2e] Kanai R , Verstraten FA (2005) Perceptual manifestations of fast neural plasticity: motion priming, rapid motion aftereffect and perceptual sensitization Vis Res 4 3109–16 [12.3.5f ] Kaneko H, Howard IP (1996) Relative size disparities and the perception of surface slant Vis Res 36 1919–30 [20.2.4b] Kaneko H, Howard IP (1997a) Spatial properties of shear disparity processing Vis Res 37 315–23 [20.3.2a, 20.3.2b] Kaneko H, Howard IP (1997b) Spatial limitation of vertical-size disparity processing Vis Res 37 2871–78 [20.2.4a] Kang MS (2009) Size matters: a study of binocular rivalry dynamics J Vis 9(1) Article 17 [12.3.2a] Kanizsa G (1979) Organization in vision: Essays on Gestalt perception Praeger, New York 
[22.2.4a] Kaplan IT, Metlay W (1964) Light intensity and binocular rivalry J Exp Psychol 67 22–6 [12.3.2c] Kasai T, Morotomi T (2001) Event-related potentials during selective attention to depth and form in global stereopsis Vis Res 41 1379–88 [11.7] Katsumi O, Tsuyoshi T, Hirose T (1986) Effect of aniseikonia on binocular function Invest Ophthal Vis Sci 27 601–4 [13.1.8b, 18.3.4] Katz MS, Schwartz I (1955) New observation of the Pulfrich effect J Opt Soc Am 45 523–27 [23.2.1] Kaufman L (1963) On the spread of suppression and binocular rivalry Vis Res 3 401–15 [12.3.5a, 12.3.6a, 12.4.2] Kaufman L (1964) Suppression and fusion in viewing complex stereograms Am J Psychol 77 193–205 [12.7.2, 12.7.3] Kaufman L (1965) Some new stereoscopic phenomena and their implications for theories of stereopsis Am J Psychol 78 1–20 [17.2.4] Kaufman L (1974) Sight and mind An introduction to visual perception Oxford University Press, London [12.1.7, 17.1.2a] Kaufman L (1976) On stereopsis with double images Psychologia 19 227–33 [17.6.2] Kaufman L, Arditi A (1976) The fusion illusion Vis Res 16 353–43 [12.1.1c] Kaufman L, Pitblado C (1965) Further observations on the nature of effective binocular disparities Am J Psychol 78 379–91 [15.3.7b] Kaufman L, Pitblado CB (1969) Stereopsis with opposite contrast conditions Percept Psychophys 6 10–12 [15.3.7b] Kaufman L, Bacon J, Barroso F (1973) Stereopsis without image segregation Vis Res 13 137–47 [18.8.2b] Kavadellas A, Held R (1977) Monocularity of color–contingent tilt aftereffects Percept Psychophys 21 12–14 [13.3.5] Kawano K , Sasaki M, Yamashita M (1984) Response properties of neurons in posterior parietal cortex of monkey during visual–vestibular
stimulation. I. Visual tracking neurons J Neurophysiol 51 340–51 [22.6.1d] Kawano K, Inoue Y, Takemura A, Miles FA (1994) Effect of disparity in the peripheral field on short-latency ocular following responses Vis Neurosci 11 833–7 [22.6.2] Kaye M (1978) Stereopsis without binocular correlation Vis Res 18 1013–22 [17.6.5] Kaye SB, Siddiqui A, Ward A, et al. (1999) Monocular and binocular depth discrimination thresholds Optom Vis Sci 76 770–82 [18.2.1a] Keesey UT (1960) Effects of involuntary eye movements on visual acuity J Opt Soc Am 50 769–74 [18.10.1a] Kellman PJ, Garrigan P, Shipley TF, et al. (2005) 3-D Interpolation in object perception: Evidence from an objective performance paradigm J Exp Psychol HPP 31 558–83 [22.2.1] Kennedy H, Courjon JH, Flandrin JM (1982) Vestibulo-ocular reflex and optokinetic nystagmus in adult cats reared in stroboscopic illumination Exp Brain Res 48 279–87 [22.6.1b] Kepler J (1604) Ad Vitellionem Paralipomena Marinium and Aubrii, Frankfurt (Translated in Donahue 2000) [16.7.7] Kerr KE (1980) Accommodative and fusional vergence in anomalous correspondence Am J Optom Physiol Opt 57 676–80 [14.4.1d] Kerr KE (1998) Anomalous correspondence—the cause or consequence of strabismus Optom Vis Sci 75 17–22 [14.4.1d] Kertesz AE (1973) Disparity detection within Panum’s fusional areas Vis Res 13 1537–43 [12.1.5] Kertesz AE (1980) Human fusional vergence Proceedings of the eye movement conference (OMS 80) California Institute of Technology, Pasadena [16.7.3a] Kertesz AE (1981) Effect of stimulus size on fusion and vergence J Opt Soc Am 71 289–93 [12.1.5] Kertesz AE, Jones R (1970) Human cyclofusional response Vis Res 10 891–6 [12.1.1b] Kham K (2004) An opaque surface influences the depth from the Pulfrich phenomenon Perception 33 1201–13 [23.1.3] Kham K, Blake R (2000) Depth capture by kinetic depth and by stereopsis Perception 29 211–20 [22.2.4b] Khan AZ, Crawford JD (2001) Ocular dominance reverses as a function of horizontal gaze angle Vis
Res 41 1743–8 [12.3.7, 16.7.7] Khan AZ, Crawford JD (2003) Coordinating one hand with two eyes: Optimizing for field of view in a pointing task Vis Res 43 409–17 [16.7.3b] Khokhotva M, Ono H, Mapp AP (2005) The cyclopean eye is relevant for predicting visual direction Vis Res 45 2339–45 [16.7.7] Khuu SK, Hayes A (2005) Glass-pattern detection is tuned for stereodepth Vis Res 45 2461–9 [16.2.2] Khuu SK, Li WO, Hayes A (2006) Global speed averaging is tuned for binocular disparity Vis Res 46 407–16 [22.3.5] Kidd AL, Frisby JP, Mayhew JEW (1979) Texture contours can facilitate stereopsis by initiating vergence eye movements Nature 280 829–32 [18.14.2c] Kienker PK, Sejnowski TJ, Hinton GE, Schumacher LE (1986) Separating figure from ground with a parallel network Perception 15 197–216 [15.2.1b] Kim WS, Ellis SR, Tyler ME, et al. (1987) Quantitative evaluation of perspective and stereoscopic displays in three-axis manual tracking tasks IEEE Tr Sys Sci Cybern 17 61–72 [24.2.6] Kim YJ, Grabowecky M, Suzuki S (2006) Stochastic resonance in binocular rivalry Vis Res 46 392–406 [12.3.5c] Kimmig HG, Miles FA, Schwarz U (1992) Effects of stationary textured backgrounds on the initiation of pursuit eye movements in monkeys J Neurophysiol 68 2147–64 [22.6.2] King SM, Cowey A (1992) Defensive responses to looming visual stimuli in monkeys with unilateral striate cortex ablation Neuropsychologia 30 1017–27 [11.6.4] Kingdom FAA (1999) Old wine in new bottles? Some thoughts on Logvinenko’s “Lightness induction revisited” Perception 28 929–34 [22.4.5]
Kingdom FAA, Simmons DR (1996) Stereoacuity and colour contrast Vis Res 36 1311–19 [17.1.4a] Kingdom FAA, Blakeslee B, McCourt ME (1997) Brightness with and without perceived transparency: when does it make a difference? Perception 26 493–506 [22.4.5] Kingdom FAA, Simmons DR, Rainville S (1999) On the apparent collapse of stereopsis in random-dot-stereograms at isoluminance Vis Res 39 2127–41 [17.1.4a] Kingdom FAA, Li HCO, MacAulay EJ (2001) The role of chromatic contrast and luminance polarity in stereoscopic segmentation Vis Res 41 375–83 [15.3.8a] Kiorpes L, Walton PJ, O’Toole LP, et al. (1996) Effects of early-onset strabismus on pursuit eye movements and on neuronal responses in area MT of macaque monkeys J Neurosci 16 6537–53 [22.6.1e] Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing Science 220 671–80 [15.2.1a] Kirkwood B, Ellis A, Nicol B (1969) Eye movement and the Pulfrich effect Percept Psychophys 8 206–8 [23.2.4, 23.5] Kishto BN (1965) The colour stereoscopic effect Vis Res 5 313–30 [17.8] Kitazaki M,
Shimojo S (1996) The ‘Generic View Principle’ for Three-dimensional motion perception: Optics and inverse optics of a single straight bar Perception 25 797–814 [22.1.2] Kitterle FL, Thomas J (1980) The effects of spatial frequency orientation and color upon binocular rivalry and monocular pattern alternation Bull Psychonom Soc 16 405–7 [12.3.8a] Kitterle FL, Kaye RS, Nixon H (1974) Pattern alternation: effects of spatial frequency and orientation Percept Psychophys 16 543–6 [12.3.8a] Klein R, Stein R (1934) Über einen Tumor des Kleinhirns mit anfallsweise auftretendem Tonusverlust und monokulärer Diplopie bzw binokulärer Triplopie Arch Psychiat Nervenkrank 102 478–92 [14.4.2] Klein RM (1977) Attention and visual dominance: a chronometric analysis J Exp Psychol HPP 3 365–78 [20.1.1] Knapen T, Paffen C, Kanai R, van Ee R (2007) Stimulus flicker alters interocular grouping during binocular rivalry Vis Res 47 1–7 [12.4.4a] Knill DC, Kersten D (1991) Apparent surface curvature affects lightness perception Nature 351 228–30 [22.4.4] Koenderink JJ (1985) Space form and optical deformations In Brain mechanisms and spatial vision (ed D Ingle, M Jeannerod and D Lee) pp 31–58 Martinus Nijhoff, The Hague [19.3.2] Koenderink JJ (1986) Optic flow Vis Res 29 161–80 [19.3.2] Koenderink JJ (1990) Solid shape MIT Press, Cambridge Mass [20.5.1] Koenderink JJ, van Doorn AJ (1976) Geometry of binocular vision and a model for stereopsis Biol Cyber 21 29–35 [19.3.3, 20.2.4, 20.6.5e] Koenderink JJ, van Doorn AJ (1980) Photometric invariants related to solid shape Optica Acta 27 981–96 [17.1.6] Koenderink JJ, van Doorn AJ (1991) Affine structure from motion J Opt Soc Am A 8 377–85 [20.6.5e] Koenderink JJ, van Doorn AJ, Kappers AML (1994) On so-called paradoxical monocular stereoscopy Perception 23 583–94 [24.1.7] Koenderink JJ, van Doorn AJ, Kappers AML (1995) Depth relief Perception 27 115–29 [24.1.7] Koffka K (1935) Principles of Gestalt psychology Harcourt Brace, New York [22.1.1,
21.4.2c] Köhler W, Emery DA (1947) Figural aftereffects in the third dimension of visual space Am J Psychol 60 159–201 [21.6.1a, 21.6.3a] Köhler W, Wallach H (1944) Figural aftereffects: an investigation of visual processes Proc Am Philos Soc 88 299–357 [21.1, 21.6.1a, 21.6.3a] Kohly RP, Regan D (1999) Evidence for a mechanism sensitive to the speed of cyclopean form Vis Res 39 1011–27 [16.5.2] Kohly RP, Regan D (2001) Long-distance interactions in cyclopean vision Proc Roy Soc B 268 213–19 [16.2.2b] Kohn H (1960) Some personality variables associated with binocular rivalry Psychol Rec 10 9–13 [12.8.3a]
Kolb FC, Braun J (1995) Blindsight in normal observers Nature 377 336–8 [12.5.6] Kolehmainen K , Keskinen E (1974) Evidence for the latency–time explanation of the Pulfrich phenomenon Scand J Psychol 15 320–21 [23.1.2] Kolers PA, Rosner BS (1960) On visual masking (metacontrast): dichoptic observation Am J Psychol 73 2–21 [13.2.7b] Köllner H (1914) Das funktionelle überwiegen der nasalen Netzhauthälften im gemeinschaftlichen Sehfeld Arch Augenheilk 76 153–64 [12.3.4] Komatsu H, Wurtz RH (1988) Relation of cortical areas MT and MST to pursuit eye movements. I. Localization and visual properties of neurons J Neurophysiol 60 580–603 [22.6.1d] Komatsu H, Roy JP, Wurtz RH (1988) Binocular disparity sensitivity of cells in area MST of the monkey Soc Neurosci Abstr 14 202 [11.5.2a, 11.6.4] Kommerell G, Schmidt C, Kromeier M, Bach M (2003) Ocular prevalence versus ocular dominance Vis Res 43 1397–403 [16.7.3b] Kontsevich LL (1986) An ambiguous random-dot stereogram which permits continuous change of interpretation Vis Res 29 517–19 [15.4.5] Kontsevich LL, Tyler CW (1994) Analysis of stereothresholds for stimuli below 2.5 c/deg Vis Res 34, 2317–29 [15.3.6] Kontsevich LL, Tyler CW (2000) Relative contributions of sustained and transient pathways to human stereoprocessing Vis Res 40 3245–55 [11.5.4] Kooi FL, Toet A, Tripathy SP, Levi DM (1994) The effect of similarity and duration on spatial interaction in peripheral vision Spat Vis 8 255–79 [13.2.5] Kovács I, Julesz B (1992) Depth motion and static-flow perception at metaisoluminant color contrast Proc Natl Acad Sci 89 10390–4 [17.1.4a] Kovács I, Papathomas TV, Yang M, Fehér A (1996) When the brain changes its mind: interocular grouping during binocular rivalry Proc Natl Acad Sci 93 15508–11 [12.4.4b] Krauskopf J, Forte JD (2002) Influence of chromaticity on vernier and stereo acuity J Vis 2 645–52 [17.1.4b, 18.11] Krekling S (1973) Some aspects of the Pulfrich effect Scand J Psychol 14 87–90 [23.1.2] Krekling S (1974) 
Stereoscopic threshold within the stereoscopic range in central vision Am J Physiol Opt 51 629–34 [18.3.3a, 18.6.1a] Krekling S, Blika S (1983a) Meridional anisotropia in cyclofusion Percept Psychophys 34 299–300 [12.1.5] Krekling S, Blika S (1983b) Development of the tilted vertical horopter Percept Psychophys 34 491–3 [14.7] Krol JD, van de Grind WA (1980) The double–nail illusion: experiments on binocular vision with nails needles and pins Perception 9 651–69 [15.4.6, 17.6.3] Krol JD, van de Grind WA (1983) Depth from dichoptic edges depends on vergence tuning Perception 12 425–38 [15.3.7b] Krol JD, van de Grind WA (1986) Binocular depth mixture: an artifact of eye vergence? Vis Res 29 1289–93 [18.8.2c] Kromeier M, Heinrich SP, Bach M, Kommerell G (2006) Ocular prevalence and stereoacuity Ophthal Physiol Opt 26 50–6 [18.11] Kröncke K (1921) Zur Phänomenologie der Kernfläche des Sehraums Z Psychol Physiol Sinnesorg 52 217–28 [14.6.2] Krug K, Cumming BG, Parker AJ (2004) Comparing perceptual signals of single V5/MT neurons in two binocular depth tasks J Neurophysiol 92 1586–96 [11.5.2a] Kruse P, Carmesin HO, Pahlke L, et al. (1996) Continuous phase transitions in the perception of multistable visual patterns Biol Cyber 75 321–30 [15.2.1b] Kubie LS, Beckmann JW (1929) Diplopia without extra-ocular palsies caused by heteronymous defects in the visual fields associated with defective macular vision Brain 52 317–33 [14.4.2] Kuffler SW (1953) Discharge patterns and functional organization of mammalian retina J Neurophysiol 16 37–68 [22.4.1]
Kulikowski JJ (1978) Limit of single vision in stereopsis depends on contour sharpness Nature 275 129–7 [12.1.2] Kulikowski JJ (1992) Binocular chromatic rivalry and single vision Ophthal Physiol Opt 12 168–70 [12.4.3] Kumano H, Tanabe S, Fujita I (2008) Spatial frequency integration for binocular correspondence in macaque area V4 J Neurophysiol 99 402–8 [11.10.1c] Kumar T (1996) Multiple matching of features in simple stereograms Vis Res 36 675–98 [17.6.3] Kumar T, Glaser DA (1991) Influence of remote objects on local depth perception Vis Res 31 1687–99 [21.3.1] Kumar T, Glaser DA (1992) Depth discrimination of a line is improved by adding other nearby lines Vis Res 32 1667–76 [18.6.2a] Kumar T, Glaser DA (1993a) Initial performance learning and observer variability for hyperacuity tasks Vis Res 33 2287–300 [18.14.1] Kumar T, Glaser DA (1993b) Temporal aspects of depth contrast Vis Res 33 947–57 [21.3.4] Kumar T, Glaser DA (1994) Some temporal aspects of stereoacuity Vis Res 34 913–25 [18.12.2b, 18.14.1] Kumar T, Glaser DA (1995) Depth discrimination of a crowded line is better when it is more luminant than the lines crowding it Vis Res 35 657–66 [18.6.2a] Kundt A (1863) Untersuchungen über Augenmaass und optische Täuschungen Poggendorff ’s Ann Physik Chem 196 118–158 [14.6.2a] Kuroki D, Nakamizo S (2006) Depth scaling in phantom and monocular gap stereograms using absolute distance information Vis Res 46 4206–16 [17.3] Kurtz HF (1937) Orthostereoscopy J Opt Soc Am 27 323–39 [24.1.1] Kuu SK , Hayes A (2005) Glass-pattern detection is tuned for stereodepth Vis Res 45 2451–9 [16.6.2] Kwee IL, Fujii Y, Matsuzawa H, Nakada T (1999) Perceptual processing of stereopsis in humans: high-field (3.0-tesla) functional MRI study Neurology 53 1599–601 [11.8.1] Lack LC (1969) The effect of practice on binocular rivalry control Percept Psychophys 6 397–400 [12.8.1] Lack LC (1971) The role of accommodation in the control of rivalry Percept Psychophys 10 38–42 [12.8.1] Lack LC 
(1974) Selective attention and the control of binocular rivalry Percept Psychophys 15 193–200 [12.6.4] Laing CR , Chow CC (2002) A spiking neuron model for binocular rivalry J Comput Neurosci 12 39–53 [12.10] Lam AKC, Chau ASY, Lam WY, et al. (1996) Effects of naturally occurring visual acuity differences between two eyes in stereoacuity Ophthal Physiol Opt 16 189–95 [18.5.4b] Lam AKC, Tse P, Choy E, Chung M (2002) Crossed and uncrossed stereoacuity at distance and the effect from heterophoria Ophthal Physiol Opt 22 189–93 [18.10.3b] Lamme VAF (1995) The neurophysiology of figure-ground segregation in primary visual cortex J Neurosci 15 1605–15 [22.5.1a] Land EH (1986) Recent advances in retinex theory Vis Res 26 7–21 [22.4.6] Landers DD, Cormack LK (1997) Asymmetries and errors in perception of depth from disparity suggest a multicomponent model of disparity processing Percept Psychophys 59 219–31 [18.6.4] Landrigan DT, Bader IA (1981) The Pulfrich effect: filtering portions of both eyes J Psychol 109 165–72 [23.2.1] Lang J (1983) A new stereotest J Ped Ophthal Strab 20 72–4 [18.2.3e] Lang J, Lang TJ (1988) Eye screening with the Lang stereotest Am Orthopt J 38 48–50 [18.2.3e] Lang J, Rechichi C, Stürmer J (1991) Natural versus haploscopic stereopsis Graefe’s Arch Klin Exp Ophthal 229 115–18 [18.2.4] Lange–Malecki B, Creutzfeldt OD, Hinse P (1985) Haploscopic colour mixture with and without contours in subjects with normal and disturbed binocular vision Perception 14 587–600 [12.2.3] Langer T, Fuchs AF, Chubb MC, et al. (1985) Floccular efferents in the rhesus macaque as revealed by autoradiography and horseradish peroxidase J Comp Neurol 235 29–37 [22.6.1d]
Langlands NMS (1926) Experiments on binocular vision Medical Research Council Special Report Series No 133 His Majesty’s Stationery Office, London [18.12.1a] Langlands NMS (1929) Experiments on binocular vision Tr Opt Soc Lon 28 45–82 [18.12.1a] Langley K, Fleet DJ, Hibbard PB (1999) Stereopsis from contrast envelopes Vis Res 39 2313–27 [18.7.2d] Lankheet MJM, Lennie P (1996) Spatio-temporal requirements for binocular correlation in stereopsis Vis Res 36 527–38 [18.6.3d] Lankheet MJM, Palmen M (1998) Stereoscopic segregation of transparent surfaces and the effect of motion contrast Vis Res 38 659–68 [15.3.9] Lansford TG, Baker HD (1969) Dark adaptation: an interocular light–adaptation effect Science 164 1307–9 [13.2.2] Lansing RW (1964) Electroencephalographic correlates of binocular rivalry in man Science 146 1325–7 [12.9.2e] Lappe M (1996) Functional consequences of an integration of motion and stereopsis in area MT of monkey extrastriate visual cortex Neural Comput 8 1449–61 [22.3.2] Lappin JS, Craft WD (1997) Definition and detection of binocular disparity Vis Res 37 2953–74 [21.4.2a] Larson WL (1988) Effect of TNO red-green glasses on local stereoacuity Am J Optom Physiol Opt 65 946–50 [18.2.3b] Larson WL (1990) An investigation of the difference in stereoacuity between crossed and uncrossed disparities using Frisby and TNO tests Optom Vis Sci 67 157–61 [18.6.4] Larson WL, Giroux R (1982) A precision instrument for the clinical measurement of stereoscopic acuity Can J Optom 44 100–4 [18.2.1a] Lasley DJ, Kivlin J, Rich L, Flynn JT (1984) Stereo–discrimination between diplopic images in clinically normal observers Invest Ophthal Vis Sci 25 1316–20 [18.6.4] Lau E (1922) Versuche über das stereoskopische Sehen Psychol Forsch 2 1–4 [17.7] Lau E (1925) Über das stereoskopische Sehen Psychol Forsch 6 121–6 [17.6.2, 17.7] Lavin E, Costall A (1978) Detection thresholds of the Hermann grid illusion Vis Res 18 1061–2 [16.3.2] Lawrence CM, Rodwell VW, Stauffacher CV
(1995) Crystal structure of Pseudomonas mevalonii HMG-CoA reductase at 3.0 angstrom resolution Science 298 1760–1763 [24.2.1] Lawson RB, Gulick WL (1967) Stereopsis and anomalous contour Vis Res 7 271–97 [17.2.4] Lawson RB, Mount DC (1967) Minimum conditions for stereopsis and anomalous contour Science 158 804–6 [17.2.4] Lawson RB, Cowen E, Gibbs TD, Whitmore CG (1974) Stereoscopic enhancement and erasure of subjective contours J Exp Psychol 103 1142–6 [22.2.4a] Lawwill T, Biersdorf WR (1968) Binocular rivalry and visual evoked responses Invest Ophthal 7 378–85 [12.9.2e] LeConte J (1871) On some phenomena of binocular vision; The mode of representing the position of double images Am J Sci 1 33–44 [16.7.2b] LeConte J (1881) Sight: An exposition of the principles of monocular and binocular vision Appleton, New York 2nd edition 1879 [16.7.2b] Ledgeway T, Rogers BJ (1999) The effects of eccentricity and vergence angle upon the relative tilt of corresponding vertical and horizontal meridia revealed using the minimum motion paradigm Perception 28 143–53 [14.7] Ledgeway T, Smith AT (1994) Evidence for separate motion-detecting mechanisms for first- and second-order motion in human vision Vis Res 34 2727–40 [18.12.3] LeDoux JE, Deutsch G, Wilson DH, Gazzaniga MS (1977) Binocular stereopsis and the anterior commissure in man The Physiologist 20 55 [11.9.2] Lee B (1994) Aftereffects and the representation of stereoscopic surfaces D Phil Thesis Oxford University [21.6.2b]
REFERENCES
Lee B (1999) Aftereffects and the representation of stereoscopic surfaces Perception 28 1155–69 [21.6.2c] Lee B, Rogers BJ (1992) Aftereffects of stereoscopic surfaces are selectively tuned to the plane of the adapting surface Invest Ophthal Vis Sci 33 (Abs) 1372 [20.3.1d, 21.6.3b] Lee B, Rogers BJ (1997) Disparity modulation sensitivity for narrowband-filtered stereograms Vis Res 37 1769–77 [18.7.2c] Lee DN (1970a) A stroboscopic stereophenomenon Vis Res 10 587–93 [23.3.3, 23.3.6] Lee DN (1970b) Spatio–temporal integration in binocular–kinetic space perception Vis Res 10 65–78 [23.3.4, 23.3.6] Lee DN (1970c) Binocular stereopsis without spatial disparity Percept Psychophys 9 216–8 [17.1.5] Lee S, Shioiri S, Yaguchi H (2007) Stereo channels with different temporal frequency tunings Vis Res 47 289–97 [18.7.2b] Lee SH, Blake R (1999) Rival ideas about rivalry Vis Res 39 1447–54 [12.4.4a] Lee SH, Blake R (2002) V1 activity is reduced during binocular rivalry J Vis 2 618–26 [12.9.2f] Lee SH, Blake R (2004) A fresh look at interocular grouping during binocular rivalry Vis Res 44 983–91 [12.4.4b] Lee SH, Blake R, Heeger DJ (2005) Traveling waves of activity in primary visual cortex during binocular rivalry Nat Neurosci 8 22–3 [12.3.5e, 12.9.2f] Lee SH, Blake R, Heeger DJ (2007) Hierarchy of cortical responses underlying binocular rivalry Nat Neurosci 10 1048–54 [12.9.2f] Legge GE (1979) Spatial frequency masking in human vision: binocular interactions J Opt Soc Am 69 838–47 [13.2.4a] Legge GE (1984a) Binocular contrast summation. I. Detection and discrimination Vis Res 27 373–83 [12.9.2b, 13.1.2a, 13.1.3a] Legge GE (1984b) Binocular contrast summation. II. 
Quadratic summation Vis Res 27 385–94 [13.1.2b] Legge GE, Gu Y (1989) Stereopsis and contrast Vis Res 29 989–1004 [18.5.1, 18.5.2, 18.5.4a] Legge GE, Rubin GS (1981) Binocular interactions in suprathreshold contrast perception Percept Psychophys 30 49–61 [13.1.4b] Leguire LE, Blake R, Sloane M (1982) The square–wave illusion and phase anisotropy of the human visual system Perception 11 547–56 [12.6.2] Lehky SR (1983) A model of binocular brightness and binaural loudness perception in humans with general applications to nonlinear summation of sensory inputs Biol Cyber 49 89–97 [13.1.4b] Lehky SR (1988) An astable multivibrator model of binocular rivalry Perception 17 215–28 [12.10] Lehky SR (1995) Binocular rivalry is not chaotic Proc R Soc B 259 71–6 [12.10] Lehky SR, Blake R (1991) Organization of binocular pathways: modeling and data related to rivalry Neural Comput 3 44–53 [12.9.1] Lehky SR, Maunsell JHR (1996) No binocular rivalry in the LGN of alert macaque monkeys Vis Res 36 1225–34 [12.9.1] Lehky SR, Sejnowski TJ (1990) Neural model of stereoacuity and depth interpolation based on a distributed representation of stereo disparity J Neurosci 10 2281–99 [11.4.1a, 21.2] Lehky SR, Sejnowski TJ (1999) Seeing white: qualia in the context of decoding population codes Neural Comput 11 1261–80 [11.10.1c] Lehman RAW, Spencer DD (1973) Mirror-image shape discrimination: interocular reversal of responses in the optic chiasm sectioned monkey Brain Res 52 23–41 [13.4.2] Lehmann D, Fender DH (1967) Monocularly evoked electroencephalogram potentials: influence of target structure presented to the other eye Nature 215 204–5 [12.9.2e] Lehmann D, Fender DH (1968) Component analysis of human averaged evoked potentials: dichoptic stimuli using different target structure EEG Clin Neurophysiol 27 542–53 [12.9.2e] Lehmann D, Fender DH (1969) Averaged visual evoked potentials in humans: mechanism of dichoptic interaction studied in a subject with a split chiasma EEG Clin Neurophysiol 27 142–45 [12.9.2e]
Lehmann D, Julesz B (1977) Human average evoked potentials elicited by dynamic random-dot stereograms EEG Clin Neurophysiol 43 469 [11.7] Lehmann D, Julesz B (1978) Lateralized cortical potentials evoked in humans by dynamic random–dot stereograms Vis Res 18 1295–71 [11.7] Lehmkuhle SW, Fox R (1975) Effect of binocular rivalry suppression on the motion aftereffect Vis Res 15 855–9 [12.6.4] Lehmkuhle SW, Fox R (1976) On measuring interocular transfer Vis Res 16 428–30 [13.3.3a] Lehmkuhle SW, Fox R (1980) Effect of depth separation on metacontrast masking J Exp Psychol HPP 6 605–21 [13.2.7a, 22.5.1c] Lehnert K (1941) Uber wahre und Scheinhoropteren Pflügers Arch ges Physiol 275 112–20 [14.6.2a] Leibowitz H, Walker L (1956) Effect of field size and luminance on the binocular summation of suprathreshold stimuli J Opt Soc Am 46 171–2 [13.1.4c] Leonards U, Sireteanu R (1993) Interocular suppression in normal and amblyopic subjects: the effect of unilateral attenuation with neutral density filters Percept Psychophys 54 65–74 [12.3.5a] Leopold DA, Logothetis NK (1996) Activity changes in early visual cortex reflect monkeys’ percepts during binocular rivalry Nature 379 549–53 [12.9.2a] Leopold DA, Wilke M, Maier A, Logothetis NK (2002) Stable perception of visually ambiguous patterns Nat Neurosci 5 605–9 [12.3.5g] Lepore F, Guillemot JP (1982) Visual receptive field properties of cells innervated through the corpus callosum in the cat Exp Brain Res 46 413–27 [11.9.1] Lepore F, Samson A, Molotchnikoff S (1983) Effects on binocular activation of cells in visual cortex of the cat following the transection of the optic tract Exp Brain Res 50 392–6 [11.9.2] Lepore F, Ptito M, Lassonde M (1986) Stereoperception in cats following section of the corpus callosum and/or the optic chiasma Exp Brain Res 61 258–64 [11.9.1, 11.9.2] Lepore F, Samson A, Paradis MC, Ptito M (1992) Binocular interaction and disparity coding at the 17–18 border: contribution of the corpus callosum Exp Brain 
Res 90 129–40 [11.9.1] LeVay S, Voigt T (1988) Ocular dominance and disparity coding in cat visual cortex Vis Neurosci 1 395–414 [11.1.2, 11.3.1, 11.4.5b] LeVay S, Connolly M, Houde J, Van Essen DC (1985) The complete pattern of ocular dominance stripes in the striate cortex and visual field of the macaque monkey J Neurosci 5 486–501 [13.2.5] Levelt WJM (1965a) Binocular brightness averaging and contour information Br J Psychol 56 1–13 [13.1.4a] Levelt WJM (1965b) On binocular rivalry Institute for Perception, Soesterberg, The Netherlands [12.3.2a] Levelt WJM (1966) The alternation process in binocular rivalry Br J Psychol 57 225–38 [12.10, 12.3.2a] Levelt WJM (1967) Note on the distribution of dominance times in binocular rivalry Br J Psychol 58 143–5 [12.3.6a] Levi DM, Klein S (1990) The role of separation and eccentricity in encoding position Vis Res 30 557–85 [12.1.1d] Levi DM, Schor CM (1984) Spatial and velocity tuning of processes underlying induced motion Vis Res 24 1189–96 [13.3.3e] Levi DM, Pass AF, Manny RE (1982) Binocular interactions in normal and anomalous binocular vision: effects of flicker Br J Ophthal 66 57–63 [13.1.5] Levi DM, Klein S, Aitsebaomo AP (1985) Vernier acuity crowding and cortical magnification Vis Res 25 963–77 [13.2.5, 18.6.1a, 18.7.1] Levick WR , Thibos LN (1982) Analysis of orientation bias in the cat retina J Physiol 329 243–61 [13.1.2e] Levick WR , Cleland BG, Coombs JS (1972) On the apparent orbit of the Pulfrich pendulum Vis Res 12 1381–8 [23.2.1] Levinson E, Blake R (1979) Stereopsis by harmonic analysis Vis Res 19 73–8 [20.2.1] Levinson E, Sekuler R (1975) The independence of channels in human vision selective for direction of movement J Physiol 250 347–66 [13.3.3a, 22.3.2]
Levitt JB, Yoshioka T, Lund JS (1995) Connections between the pulvinar complex and cytochrome oxidase-defined compartments in visual area V2 of macaque monkey Exp Brain Res 104 419–30 [11.2.1] Levy MM, Lawson RB (1978) Stereopsis and binocular rivalry from dichoptic stereograms Vis Res 18 239–46 [15.3.7b] Levy NS, Glick EB (1974) Stereoscopic perception and Snellen visual acuity Am J Ophthal 78 722–4 [18.5.4b] Lewis CE, Blakeley WR, Swaroop R, et al (1973) Landing performance by low-time private pilots after the sudden loss of binocular vision: cyclops II Aviat Space Environ Med 44 1271–45 [20.1.1] Lewis JL (1970) Semantic processing of unattended messages using dichotic listening J Exp Psychol 85 225–8 [12.8.3b] Lewis P (1944) Bilateral monocular diplopia with amblyopia Am J Ophthal 27 1029–7 [14.4.2] Li B, Peterson MR, Thomson JK, Duong T, et al. (2005) Cross-orientation suppression: monoptic and dichoptic mechanisms are different J Neurophysiol 94 2645–50 [12.9.2b] Lidén L, Mingolla E (1998) Monocular occlusion cues alter the influence of terminator motion in the barber pole phenomenon Vis Res 38 3883–98 [22.3.1] Lie I (1969) Psychophysical invariants of achromatic colour vision: IV Depth adjacency and simultaneous contrast Scand J Psychol 10 282–6 [22.4.2] Liebermann P von (1910) Beitrag zur Lehre von der binocularlen Tiefenlokalization Z Psychol Physiol Sinnesorg 44 428–43 [14.6.2] Likova LT, Tyler CW (2007) Stereomotion processing in the human occipital cortex. 
Neuroimage 38 293–305 [11.8.2] Lindblom B, Westheimer G (1989) Binocular summation of hyperacuity tasks J Opt Soc Am A 6 585–9 [13.1.3c] Lindblom B, Westheimer G (1992) Spatial uncertainty in stereoacuity tests: implications for clinical vision test design Acta Ophthal 70 60–65 [18.12.1b] Lindsey DT, Teller DY (1990) Motion at isoluminance: Discrimination/ detection ratios for moving isoluminant gratings Vis Res 30 2727–40 [16.4.1] Ling S, Hubert-Wallander B, Blake R (2010) Detecting contrast changes in invisible patterns during binocular rivalry Vis Res 50 2421–9 [12.10] Linksz A (1952) Physiology of the eye Vol II Vision Grune and Stratton, New York [14.6.2a] Linksz A (1971) Comments on the papers by C Blakemore (1969 1970) and DE Mitchell and C Blakemore (1970) Survey Ophthal 15 348–53 [15.3.4b] Lippert J, Fleet DJ, Wagner H (2000) Disparity tuning as simulated by a neural net Biol Cybern 83 61–72 [11.10.2] Lippincott JA (1889) On the binocular metamorphopsia produced by correcting glasses Arch Ophthal 18 18–30 [20.2.3a] Lipton L (1982) Foundations of the stereoscopic cinema Van Nostrand Reinhold, New York [24.1.1, 24.1.2d] Lit A (1949) The magnitude of the Pulfrich stereophenomenon as a function of binocular differences in intensity at various levels of illumination Am J Psychol 62 159–81 [23.2.1, 23.2.2, 23.4.1] Lit A (1959a) Depth–discrimination thresholds as a function of binocular differences of retinal illumination at scotopic and photopic levels J Opt Soc Am 49 746–52 [18.5.4a] Lit A (1959b) The effect of fixation conditions on depth discrimination thresholds at scotopic and photopic illuminance levels J Exp Psychol 58 476–81 [18.10.4] Lit A (1960a) Effect of target velocity in a frontal plane on binocular spatial localization at photopic retinal illuminance levels J Opt Soc Am 50 970–3 [18.10.1b, 23.2.1] Lit A (1960b) The magnitude of the Pulfrich stereo-phenomenon as a function of target velocity J Exp Psychol 59 165–75 [23.1.2, 23.2.1] Lit A (1964) 
Equidistance settings at photopic retinal-illuminance levels as a function of target velocity in a frontal plane J Opt Soc Am 54 83–8 [18.10.1b] Lit A (1968) Illumination effects on depth discrimination Optometric Weekly 59 42–54 [23.2.1]
Lit A, Finn JP (1976) Variability of depth–discrimination thresholds as a function of observation distance J Opt Soc Am 66 740–2 [18.6.7] Lit A, Hamm HD (1966) Depth–discrimination for stationary and oscillating targets at various levels of illuminance J Opt Soc Am 56 510–16 [18.5.1, 18.10.1b] Lit A, Hyman A (1951) The magnitude of the Pulfrich stereophenomenon as a function of distance of observation Am J Optom Arch Am Acad Optom 28 564–80 [23.2.1] Lit A, Vicars WM (1966) The effect of practice on the speed and accuracy of equidistance–settings Am J Psychol 72 464–9 [18.14.1] Lit A, Vicars WM (1970) Stereoacuity for oscillating targets exposed through apertures of various horizontal extents Percept Psychophys 8 348–52 [18.10.1b] Lit A, Finn JP, Vicars WM (1972) Effect of target-background luminance contrast on binocular depth discrimination at photopic levels of illumination Vis Res 12 1271–51 [18.5.1] Liu CH, Kennedy JM (1995) Misalignment effects in 3-D versions of Poggendorff displays Percept Psychophys 57 409–15 [16.7.4b] Liu L, Schor CM (1994) The spatial properties of binocular suppression zone Vis Res 34 937–47 [12.4.2] Liu L, Schor CM (1995) Binocular combination of contrast signals from orthogonal orientation channels Vis Res 35 2559–67 [13.1.4d] Liu L, Schor CM (1998) Functional division of the retina and binocular correspondence J Opt Soc Am A 15 1740–55 [14.3.1c] Liu L, Tyler CW, Schor CM (1992a) Failure of rivalry at low contrast: evidence of a suprathreshold binocular summation process Vis Res 32 1471–9 [12.3.2c, 17.5] Liu L, Tyler CW, Schor CM, Ramachandran VS (1992b) Position disparity is more efficient in coding depth than phase disparity Invest Ophthal Vis Sci 33 (Abs) 1373 [11.4.3c] Liu L, Stevenson SB, Schor CM (1994a) A polar coordinate system for describing disparity Vis Res 34 1205–22 [19.3.4] Liu L, Stevenson SB, Schor CM (1994b) Quantitative stereoscopic depth without binocular correspondence Nature 367 66–9 [17.3] Liu L, Stevenson SB, 
Schor CM (1997) Binocular matching of dissimilar features in phantom stereopsis Vis Res 37 633–44 [17.3] Liu L, Stevenson SB, Schor CM (1998) Vergence eye movements elicited by stimuli without corresponding features Perception 27 7–20 [17.3] Liu Y, Vogels R , Orban GA (2004) Convergence of depth from texture and depth from disparity in macaque inferior temporal cortex J Neurosci 24 3795–800 [11.5.3b] Liu Y, Bovik AC, Cormack LK (2008) Disparity statistics in natural scenes J Vis 8 (11) Article 19 [11.10.1a] Livingstone MS (1996) Differences between stereopsis interocular correlation and binocularity Vis Res 36 1127–40 [12.9.2e, 17.1.4a] Livingstone MS, Hubel DH (1988) Segregation of form, color movement and depth: anatomy physiology and perception Science 270 740–9 [11.5.4] Livingstone MS, Hubel DH (1994) Stereopsis and positional acuity under dark adaptation Vis Res 34 799–802 [18.5.1] Livingstone MS, Tsao DY (1999) Receptive fields of disparity selective neurons in macaque striate cortex Nat Neurosci 2 825–32 [11.10.1b] Locke J (1849) On single and double vision produced by viewing objects with both eyes: and on an optical illusion with regard to the distance of objects Am J Sci Arts 7 68–74 [14.2.2, 24.1.6] Lockett A (1913) The evolution of the modern stereoscope Sci Am Supplement Number 76 276–9 [24.1.2a] Logan GD (1994) Spatial attention and the apprehension of spatial relations J Exp Psychol: HPP 20 1015–34 [22.8.2c] Logothetis NK (1998) Single units and conscious vision Philos Tr R Soc 353 1801–18 [12.9.2b] Logothetis NK , Schall JD (1989) Neuronal correlates of subjective visual perception Science 275 761–3 [12.9.2a] Logothetis NK , Schall JD (1990) Binocular motion rivalry in macaque monkeys: eye dominance and tracking eye movements Vis Res 30 1409–19 [12.3.1a]
Logothetis NK, Schiller PH, Charles ER, Hurlbert AC (1990) Perceptual deficits and the activity of the color–opponent and broadband pathways at isoluminance Science 277 214–17 [17.1.4a] Logothetis NK, Leopold DA, Sheinberg DL (1996) What is rivalling during binocular rivalry Nature 380 621–4 [12.10, 12.4.4a] Logvinenko AD (1999) Lightness induction revisited Perception 28 803–16 [22.4.5] Logvinenko AD, Menshikova G (1994) Trade-off between achromatic colour and perceived illumination as revealed by the use of pseudoscopic inversion of apparent depth Perception 23 1007–23 [22.4.4] Long GM (1979) The dichoptic viewing paradigm: do the eyes have it Psychol Bull 86 391–403 [13.3.1] Long NR (1982) Transfer of learning in transformed random–dot stereostimuli Perception 11 409–14 [18.14.2d] Long NR, Over R (1973) Stereoscopic depth aftereffects with random-dot patterns Vis Res 13 1283–7 [21.6.2a, 21.6.3a] Long NR, Over R (1974a) Stereospatial masking and aftereffect with normal and transformed random–dot patterns Percept Psychophys 15 273–8 [22.5.1b] Long NR, Over R (1974b) Disparity masking with ambiguous random-dot stereograms Vis Res 14 31–4 [21.6.2a] Longuet–Higgins HC (1982) The role of the vertical dimension in stereoscopic vision Perception 11 371–6 [19.6.5] Lorber M, Zuber BL, Stark L (1965) Suppression of the pupillary light reflex in binocular rivalry and saccadic suppression Nature 208 558–60 [12.5.1] Lotto RB, Purves D (1999) The effects of color on brightness Nat Neurosci 2 1010–14 [22.4.4] Lou L (2008) Troxler effect with dichoptic stimulus presentations: Evidence for binocular inhibitory summation and interocular suppression Vis Res 48 1514–21 [12.3.3a] Lovasik JV, Szymkiw M (1985) Effects of aniseikonia, anisometropia, accommodation, retinal illuminance, and pupil size on stereopsis Invest Ophthal Vis Sci 29 741–50 [18.3.4, 18.5.1, 18.5.4a, 18.5.4b] Lowe KN, Ogle KN (1966) Dynamics of the pupil during binocular rivalry Arch Ophthal 75 395–403 [12.5.1] 
Lu C, Fender DH (1972) The interaction of color and luminance in stereoscopic vision Invest Ophthal 11 482–90 [11.5.4, 17.1.4a] Lu ZL, Sperling G (1995) The functional architecture of human visual motion perception Vis Res 35 2997–722 [16.4.1, 16.4.2d, 18.7.2d] Ludwig I, Pieper W, Lachnit H (2007) Temporal integration of monocular images separated in time: stereopsis, stereoacuity, and binocular luster Percept Psychophys 69 92–102 [18.12.2a] Lumer ED, Rees G (1999) Covariation of activity in visual prefrontal cortex associated with subjective visual perception Proc Natl Acad Sci 96 1669–73 [12.9.2f] Lumer ED, Friston KK, Rees G (1998) Neural correlates of perceptual rivalry in the human brain Science 280 1930–4 [12.9.2f] Lunghi C, Binda P, Morrone MC (2010) Touch disambiguates rivalrous perception at early stages of visual analysis Cur Biol 20 R143–4 [12.8.4] Lunn PD, Morgan MJ (1995) The analogy between stereo depth and brightness: a reexamination Perception 27 901–4 [21.4.1] Lunn PD, Morgan M (1997) Discrimination of the spatial derivatives of horizontal binocular disparity J Opt Soc Am A 14 360–71 [18.6.6, 20.5.2] Lyle TK, Wybar KC (1967) Lyle and Jackson’s practical orthoptics in the treatment of squint and other anomalies of binocular vision Charles C Thomas, New York [14.4.1b] Lythgoe RJ (1938) Some observations on the rotating pendulum Nature 141 474 [23.4.1, 23.4.2a] Lythgoe RJ, Phillips LR (1938) Binocular summation during dark adaptation J Physiol 91 427–36 [13.1.2a] MacCracken PJ, Hayes WN (1976) Experience and latency to achieve stereopsis Percept Mot Skills 43 1227–31 [18.14.2a, 18.14.2f] MacCracken PJ, Bourne JA, Hayes WN (1977) Experience and latency to achieve stereopsis: a replication Percept Mot Skills 45 291–292 [18.14.2a]
MacDonald RI (1977) Temporal stereopsis and dynamic visual noise (Letter to the editor) Vis Res 17 1127–8 [23.6.3] Mach E (1866) Über die physiologische Wirkung räumlich vertheilter Lichtreize (Dritte Abhandlung) Sitzungsbericht der Ostereichischen Akademie der Wissenschaft 54 393–408 (Translation in F Ratliff Mach bands pp 285–98) Holden–Day, San Francisco 1965 [17.1.1c] Mach E (1886) The analysis of sensations and the relation of the physical to the psychical English translation. Dover, New York, 1959 [22.4.1] Mach E, Dvorak V (1872) Über Analoga der persönlichen Differenz zwischen beiden Augen und den Netzhautstellen desselben Auges Sitzungsbericht der königlichen böhmischen Gesellschaft der Wissenschaft Prague 65–74 [23.1.1, 23.3.6] Mack A, Chitayat D (1970) Eye–dependent and disparity adaptation to opposite visual–field rotation Am J Psychol 83 352–69 [21.6.1a] MacKay DM (1968) Evoked potentials reflecting interocular and monocular suppression Nature 217 81–3 [12.9.2e] MacKay DM (1973) Lateral interaction between neural channels sensitive to texture density Nature 275 159–61 [21.1] MacKay DM, MacKay V (1975) Dichoptic induction of McCollough-type effects Quart J Exp Psychol 27 225–33 [13.3.5] Mackensen G (1953) Untersuchungen zur Physiologie des optokinetischen Nystagmus Klin Monat Augenheilk 123 133–43 [22.6.1e] Macknik SL, Haglund MM (1999) Optical images of visible and invisible percepts in the primary visual cortex of primates Proc Natl Acad Sci 96 15208–10 [13.2.7] Macknik SL, Martinez-Conde S (2004) Dichoptic visual masking reveals that early binocular neurons exhibit weak interocular suppression: implications for binocular vision and visual awareness J Cog Neurosci 16 1049–59 [13.2.4a] MacLeod DIA (1972) The Schrödinger equation in binocular brightness combination Perception 1 321–4 [13.1.4b] Maffei L, Berardi N, Bisti S (1986) Interocular transfer of adaptation after effect in neurons of area 17 and 18 of split chiasm cats J Neurophysiol 55 966–76 
[13.2.6] Maier A, Wilke M, Aura C, et al. (2008) Divergence of fMRI and neural signals in V1 during perceptual suppression in the awake monkey Nat Neurosci 11 1193–200 [12.9.2f] Makous N, Boothe R (1974) Cones block signals from rods Vis Res 14 285–94 [13.2.3] Makous W, Sanders RK (1978) Suppressive interactions between fused patterns In Visual psychophysics and physiology (ed JC Armington, J Krauskopf, BR Wooten) pp 167–79 Academic Press, New York [12.3.2a, 12.7.2] Makous W, Teller D, Boothe R (1976) Binocular interaction in the dark Vis Res 16 473–6 [13.2.2] Malach R, Strong NP, van Sluyters RC (1981) Analysis of monocular optokinetic nystagmus in normal and visually deprived kittens Brain Res 201 367–72 [22.6.1b] Malik J, Anderson BL, Charowhas CE (1999) Stereoscopic occlusion junctions Nat Neurosci 2 840–3 [17.3] Mallett RFJ (1973) Anomalous retinal correspondence Br J Physiol Opt 28 1–10 [14.4.1b] Mallot HP (1997) Spatial scale in stereo and shape from shading: image input mechanisms and tasks Perception 29 1137–46 [17.1.1c] Mallot HA, Bideau H (1990) Binocular convergence influences the assignment of stereo correspondences Vis Res 30 1521–3 [15.4.6] Mallot HA, Gillner S, Arndt PA (1996a) Is correspondence search in human stereo vision a coarse-to-fine process? Biol Cyber 74 95–106 [15.4.2] Mallot HA, Arndt PA, Bülthoff HH (1996b) A psychophysical and computational analysis of intensity-based stereo Biol Cyber 75 187–98 [17.1.1c] Mann VA, Hein A, Diamond R (1979a) Localization of targets by strabismic subjects: contrasting patterns in constant and alternating suppressors Percept Psychophys 25 29–34 [13.4.3] Mann VA, Hein A, Diamond R (1979b) Patterns of interocular transfer of visuomotor coordination reveal differences in the representation of visual space Percept Psychophys 25 35–41 [13.4.3, 16.7.5]
Manning ML, Finlay DC, Neill RA, Frost BG (1987) Detection threshold differences to crossed and uncrossed disparities Vis Res 27 1683–6 [18.6.1b, 18.6.4] Manning ML, Finlay DC, Dewis SAM, Dunlop DB (1992) Detection duration thresholds and evoked potential measures of stereosensitivity Doc Ophthal 79 161–75 [11.7, 18.6.1b] Manny RE, Martinez AT, Fern KD (1991) Testing stereopsis in the preschool child: is it clinically useful? J Ped Ophthal Strab 28 223–31 [18.2.1e] Mansfield JS, Legge GE (1996) The binocular computation of visual direction Vis Res 36 27–41 [16.7.3b, 16.7.7] Mansfield JS, Legge G (1997) Binocular visual direction, the cyclopean eye, and vergence: Reply to Banks, van Ee and Backus (1997) Vis Res 37 1610–13 [16.7.7] Mansfield JS, Parker AJ (1993) An orientation–tuned component in the contrast masking of stereopsis Vis Res 33 1535–44 [17.1.2b] Mansfield JS, Simmons DR (1989) Contrast thresholds for the identification of depth in bandpass stereograms Invest Ophthal Vis Sci 30 (Abs) 251 [18.5.3] Mapp AP, Ono H (1986) The rhino–optical phenomenon: ocular parallax and the visible field beyond the nose Vis Res 29 1163–5 [16.7.2b] Mapp AP, Ono H (1999) Wondering about the wandering cyclopean eye Vis Res 39 2381–6 [16.7.7] Mapp AP, Ono H, Barbeito R (2003) What does the dominant eye dominate? A brief and somewhat contentious review Percept Psychophys 65 310–17 [12.3.7, 16.7.6b] Mapp AP, Ono H, Khokhotva M (2007) Hitting the target: Relatively easy, yet absolutely impossible? 
Perception 36 1139–51 [16.7.5] Mapperson B, Lovegrove W (1991) Orientation and spatial-frequency-specific surround effects on binocular rivalry Bull Psychonom Soc 29 95–7 [12.3.3b] Mapperson B, Bowling A, Lovegrove W (1982) Problems for an afterimage explanation of monocular rivalry Vis Res 22 1233–4 [12.3.8d] Marc RE, Sperling HG (1977) Chromatic organization of primate cones Science 196 454–6 [17.1.4c] Markoff JI, Sturr JF (1971) Spatial and luminance determinants of the increment threshold under monoptic and dichoptic viewing J Opt Soc Am 61 1530–7 [13.2.3] Marr D (1982) Vision Freeman San Francisco [19.1.1] Marr D, Poggio T (1976) Cooperative computation of stereo disparity Science 194 283–7 [15.4.5] Marr D, Poggio T (1979) A computational theory of human stereo vision Proc R Soc B 204 301–28 [15.4.2, 17.1.1a, 18.7.1, 18.7.2e, 18.10.4] Marr D, Palm G, Poggio T (1978) Analysis of a cooperative stereo algorithm Biol Cyber 28 223–39 [15.4.5] Marrara MT, Moore CM (2000) Role of perceptual organization while attending in depth Percept Psychophys 62 786–99 [22.8.1] Marrocco RT, Carpenter MA, Wright SE (1985) Spatial contrast sensitivity: effects of peripheral field stimulation during monocular and dichoptic viewing Vis Res 25 917–27 [13.2.4a] Marsh WR, Rawlings SC, Mumma JV (1980) Evaluation of clinical stereoacuity tests Ophthalmology 87 1295–72 [18.2.4] Marshak W, Sekuler R (1979) Mutual repulsion between moving visual targets Science 205 1399–401 [16.5.3b, 22.7.4] Martens W, Blake R, Sloane M, Cormack RH (1981) What masks utrocular discrimination Percept Psychophys 30 521–32 [16.8] Martin JI (1970) Effects of binocular fusion and binocular rivalry on cortically evoked potentials EEG Clin Neurophysiol 28 190–201 [12.9.2e] Martin LC, Wilkins TR (1937) An examination of the principles of orthostereoscopic photomicrography and some applications J Opt Soc Am 27 340–9 [24.2.1] Maruya K, Blake R (2009) Spatial spread of interocular suppression is guided by stimulus 
configuration Perception 38 215–31 [12.4.2] Maruyama M, Kobayashi T, Katsura T, Kuriki S (2003) Early behavior of optokinetic responses elicited by transparent motion stimuli during depth-based attention Exp Brain Res 151 411–19 [22.6.1f ]
Marzi CA, Antonini A, Di Stefano M, Legg CR (1982) The contribution of the corpus callosum to receptive fields in the lateral suprasylvian visual areas of the cat Behav Brain Res 4 155–76 [11.9.2] Marzi CA, Antonucci G, Pizzamiglio L, Santillo C (1986) Simultaneous binocular integration of the visual tilt effect in normal and stereoblind observers Vis Res 29 477–83 [13.3.2a] Maske R , Yamane S, Bishop PO (1984) Binocular simple cells for local stereopsis: a comparison of receptive field organizations for the two eyes Vis Res 27 1921–9 [11.6.1] Maske R , Yamane S, Bishop PO (1986a) End–stopped and binocular depth discrimination in the striate cortex of cats Proc R Soc B 229 257–76 [11.1.2, 11.3.1, 11.4.5a, 11.4.5b] Maske R , Yamane S, Bishop PO (1986b) Stereoscopic mechanisms: binocular responses of the striate cells of cats to moving light and dark bars Proc R Soc B 229 227–56 [11.3.1] Masson GS, Busettini CM, Yang DS, Miles FA (2001) Short-latency ocular following in humans: sensitivity to binocular disparity Vis Res 41 3371–87 [22.6.1e] Mather G (1989) The role of subjective contours in capture of stereopsis Vis Res 29 143–6 [22.2.4b] Mather G, Verstraten F, Anstis S (1998) The motion aftereffect MIT Press Cambridge, MA [13.3.3a] Matin L (1962) Binocular summation at the absolute threshold for peripheral vision J Opt Soc Am 52 1276–86 [13.1.1c, 13.1.6c] Matsumiya K , Howard IP, Kaneko H (2007) Perceived depth in the ‘sieve effect’ and exclusive binocular rivalry Perception 36 990–1002 [17.5] Matsuoka K (1984) The dynamic model of binocular rivalry Biol Cyber 49 201–8 [12.10] Matthews N, Geesaman BJ, Qian N (2000) The dependence of motion repulsion and rivalry on the distance between moving elements Vis Res 40 2025–36 [12.3.6c, 22.7.4] Matthews N, Meng X , Xu P, Qian N (2003) A physiological theory of depth perception from vertical disparity Vis Res 43 85–99 [11.4.4, 20.2.5] Maunsbach AB, Afzelius BA (1999) Biomedical electron microscopy Academic Press, New York 
[24.2.3d] Maunsell JHR , Van Essen DC (1983) Functional properties of neurons in middle temporal visual area of the macaque monkey. II. Binocular interactions and sensitivity to binocular disparity J Neurophysiol 49 1148–67 [11.5.2a, 11.6.4] Mayhew JEW (1982) The interpretation of stereo–disparity information: the computation of surface orientation and depth Perception 11 387–404 [19.6.5, 20.2.3a, 20.2.4] Mayhew JEW, Anstis SM (1972) Movement aftereffects contingent on color intensity and pattern Percept Psychophys 12 77–85 [13.3.5] Mayhew JEW, Frisby JP (1976) Rivalrous texture stereograms Nature 294 53–6 [17.1.3] Mayhew JEW, Frisby JP (1978) Stereopsis masking in humans is not orientationally tuned Perception 7 431–6 [17.1.2b] Mayhew JEW, Frisby JP (1979a) Convergent disparity discriminations in narrow–band–filtered random–dot stereograms Vis Res 19 63–71 [18.7.3a, 18.10.1a] Mayhew JEW, Frisby JP (1979b) Surfaces with steep variations in depth pose difficulties for orientationally tuned disparity filters Perception 8 691–8 [17.1.2b] Mayhew JEW, Frisby JP (1980) The computation of binocular edges Perception 9 69–86 [15.4.3, 15.4.4] Mayhew JEW, Frisby JP (1981) Psychophysical and computational studies towards a theory of human stereopsis Artificial Intelligence 17 349–85 [15.4.3, 17.1.1a] Mayhew JEW, Frisby JP (1982) The induced effect: arguments against the theory of Arditi, Kaufman and Movshon Vis Res 22 1225–8 [20.2.3a] Mayhew JEW, Longuet–Higgins HC (1982) A computational model of binocular depth perception Nature 297 376–8 [14.2.3, 19.6.5, 20.2.3a, 20.2.3b, 20.2.4]
Mayhew JEW, Frisby JP, Gale P (1977) Computation of stereodisparity from rivalrous texture stereograms Perception 6 207–8 [17.1.3] Mays LE, Sparks DL (1980) Dissociation of visual and saccade-related responses in superior colliculus neurones J Neurophysiol 43 207–32 [11.2.3] Mazyn LIN, Lenoir M, Montagne G, Savelsbergh GJP (2004) The contribution of stereo vision to one-handed catching Exp Brain Res 157 383–90 [20.1.1] Mazyn LIN, Lenoir M, Montagne G, et al. (2007) Stereo vision enhances the learning of a catching skill Exp Brain Res 179 723–6 [20.1.1] McAllister DF (1993) Stereo computer graphics and other true 3D technologies Princeton University Press, Princeton NJ [24.1.4a] McAllister DF, Robbins WE (1987) Three-dimensional imaging techniques and display technologies Proc Int Soc Opt Engin 761 35–43 [24.1.3c] McCarthy JE (1993) Directional adaptation effects with contrast modulated stimuli Vis Res 33 2653–62 [13.3.3b] McCollough C (1965) Colour adaptation of edge–detectors in the human visual system Science 149 1115–16 [13.3.5] McConkie AB, Faber JM (1979) Relation between perceived depth and perceived motion in uniform flow fields J Exp Psychol: HPP 5 501–508 [22.7.3] McCormack G (1990) Normal retinotopic mapping in human strabismus with anomalous retinal correspondence Invest Ophthal Vis Sci 31 559–68 [14.4c] McKee SP (1983) The spatial requirements for fine stereoacuity Vis Res 23 191–8 [18.6.1a, 21.4.3] McKee SP, Levi DM (1987) Dichoptic hyperacuity: the precision of nonius alignment J Opt Soc Am A 4 1104–8 [14.6.1c, 18.11] McKee SP, Mitchison GJ (1988) The role of retinal correspondence in stereoscopic matching Vis Res 28 1001–12 [11.10.1b, 22.2.3b] McKee SP, Verghese P (2002) Stereo transparency and the disparity gradient limit Vis Res 42 1963–77 [12.1.3a] McKee SP, Westheimer G (1970) Specificity of cone mechanisms in lateral interaction J Physiol 206 117–28 [13.2.7b] McKee SP, Welch L, Taylor DG, Bowne SF (1990a) Finding the common bond: stereoacuity 
and the other hyperacuities Vis Res 30 879–91 [18.3.3a, 18.7.2b, 18.11] McKee SP, Levi DM, Bowne SF (1990b) The imprecision of stereopsis Vis Res 30 1763–79 [18.11] McKee SP, Bravo MJ, Taylor DG, Legge GE (1994) Stereo matching precedes dichoptic masking Vis Res 34 1047–60 [13.2.4b] McKee SP, Bravo MJ, Smallman HS, Legge GE (1995) The “uniqueness constraint” and binocular masking Perception 24 49–65 [15.3.1] McKee SP, Watamaniuk SNJ, Harris JM, et al. (1997) Is stereopsis effective in breaking camouflage for moving objects Vis Res 37 2047–55 [22.3.5] McKee SP, Verghese P, Farell B (2005) Stereo sensitivity depends on stereo matching J Vis 5 783–92 [18.3.3a] McKenna M, Zeltzer D (1992) Three dimensional visual display systems for virtual environments Presence 1 421–58 [24.1.3b] McLaughlin NP, Grossberg S (1998) Cortical computation of stereo disparity Vis Res 38 91–9 [11.10.1b] Meacham GBK (1986) Autostereoscopic displays—past and future Proc Soc Photo Opt Instru Engin 627 90–101 [24.1.3c] Meadows JC (1973) Observations on a case of monocular diplopia of cerebral origin J Neurol Sci 18 279–53 [14.4.2] Meegan DV, Stelmach LB, Tam WJ (2001) Unequal weighting of monocular inputs in binocular combination: implications for the compression of stereoscopic imagery J Exp Psychol: App 7 143–53 [24.2.6] Meenes M (1930) A phenomenological description of retinal rivalry Am J Psychol 42 290–9 [12.3.1a] Meese TS, Smith V, Harris MG (1995) Induced motion may account for the illusory transformation of optic flow fields found by Duffy and Wurtz Vis Res 35 981–4 [22.7.4] Meese TS, Georgeson MA, Baker DH (2006) Binocular contrast vision at and above threshold J Vis 6 1224–43 [13.1.3a]
Mehdorn E (1982) Nasal–temporal asymmetry of the optokinetic nystagmus after bilateral occipital infarction in man In Functional basis of ocular motility disorders (ed G Lennerstrand, DS Zee, EL Keller) pp 321–4 Pergamon, New York [22.6.1b] Meissner G (1854) Beiträge zur Physiologie des Sehorganes Engleman, Leipzig [14.6.1a] Mello NK (1966) Concerning inter-hemispheric transfer of mirror-image patterns in pigeon Physiol Behav 1 293–300 [13.4.2] Meng M, Tong F (2004) Can attention selectively bias bistable perception? Differences between binocular rivalry and ambiguous figures J Vis 4 539–51 [12.8.1] Meng X, Chen Y, Qian N (2004) Both monocular and binocular signals contribute to motion rivalry Vis Res 44 45–55 [12.3.6c] Menz MD, Freeman RD (2003) Stereoscopic depth processing in the visual cortex: a coarse-to-fine mechanism Nat Neurosci 6 59–65 [11.4.8b] Menz MD, Freeman RD (2004a) Functional connectivity of disparity-tuned neurons in the visual cortex J Neurophysiol 91 1794–807 [11.4.1d, 11.4.5b] Menz MD, Freeman RD (2004b) Temporal dynamics of binocular disparity processing in the central visual pathway J Neurophysiol 91 1782–93 [11.4.8b, 18.12.1c] Meredith GM, Meredith CGW (1962) Effect of instructional conditions on rate of binocular rivalry Percept Mot Skills 15 655–64 [12.8.1] Merritt JO, Fisher SS (1992) Stereoscopic displays and applications III. Proc Soc Photo Opt Instru Engin Vol 1669 [24.1.1, 24.2.6] Merritt JO, Fisher SS (1993) Stereoscopic displays and applications IV.
Proc Soc Photo Opt Instru Engin Vol 1915 [24.1.1, 24.2.6] Mershon DH (1972) Relative contributions of depth and direction adjacency to simultaneous whiteness contrast Vis Res 12 969–79 [22.4.2] Mershon DH, Gogel WC (1970) Effect of stereoscopic cues on perceived brightness Am J Psychol 83 55–67 [22.4.2] Mestel R (1994) Night of the strangest comet New Scientist 143 (no 1933) 23–25 [24.2.2] Mestre DR, Masson GS (1997) Ocular responses to motion parallax stimuli: the role of perceptual and attentional factors Vis Res 37 1627–41 [22.6.1f] Metropolis N, Rosenbluth A, Rosenbluth M, et al. (1953) Equations of state calculations by fast computing machines J Chem Physics 21 1087–92 [15.2.1b] Metzger W (1975) Gesetze des Sehens Woldemar Kramer Verlag, Frankfurt [22.1.3] Meyer GE (1974) Pressure blindness and the interocular transfer of size aftereffects Percept Psychophys 16 222–4 [13.3.4] Meyer (1842) Über einige Täuschungen in der Entfernung und Grösse der Gesichtsobjekte Arch Physiol Heil 1 316–26 [14.2.2] Meyer H (1852) Über die Schätzung der Grösse und Entfernung Poggendorff’s Ann Physik Chem 25 198–207 [14.2.2] Mezrich JJ, Rose A (1977) Coherent motion and stereopsis in dynamic visual noise Vis Res 17 903–10 [23.6.4] Michaels CF (1986) An ecological analysis of binocular vision Psychol Res 48 1–22 [20.6.3c] Michaels CF, Carello C, Shapiro B, Steitz C (1977) An onset to onset rule for binocular integration in the Mach-Dvorak illusion Vis Res 17 1107–13 [23.3.3] Miezin FM, Myerson J, Julesz B, Allman JM (1981) Evoked potentials to dynamic random–dot correlograms in monkey and man: a test for cyclopean perception Vis Res 21 177–9 [11.7] Mikaelian HH (1975) Interocular generalization of orientation specific color aftereffects Vis Res 15 661–3 [13.3.5] Mikaelian S, Qian N (2000) A physiologically-based explanation of disparity attraction and repulsion Vis Res 40 2999–3116 [21.2] Miles PW (1953) Anomalous binocular depth perception due to unequal image brightness Arch
Ophthal 50 475–8 [17.9] Miles WR (1930) Ocular dominance in human adults J Gen Psychol 3 412–30 [12.3.7]
Milewski A, Yonas A (1977) Texture size specificity in the slant aftereffect Percept Psychophys 21 47–9 [21.6.1a] Milios E, Jenkin M, Tsotsos J (1993) Design and performance of TRISH, a binocular robot head with torsional eye movements Int J Patt Recog Artif Intell 7 51–68 [24.2.6] Millard AC, Wiseman PW, Fittinghoff DN, et al. (1999) Third-harmonic generation microscopy by use of a compact, femtosecond fiber laser source App Optics 38 7393–7 [24.2.3c] Miller SM, Liu BB, Ngo TT, et al. (2000) Interhemispheric switching mediates perceptual rivalry Curr Biol 10 383–92 [12.9.2d] Miller TJ, Ogle KN (1964) Stereoscopic localization of afterimages with eyes in asymmetric convergence Invest Ophthal Vis Sci 3 339–53 [20.2.2c] Miller WT, Sutton RS, Werbos PJ (1991) Neural networks for control MIT Press, Cambridge MA [11.10.2] Milleret C, Houzel JC (2001) Visual interhemispheric transfer to areas 17 and 18 in cats with convergent strabismus Eur J Neurosci 13 137–52 [11.9.1] Mimeault D, Lepore F, Guillemot JP (2002) Phase- and position-disparity coding in the posteromedial lateral suprasylvian area of the cat Neurosci 110 59–72 [11.3.2] Mimeault D, Paquet V, et al.
(2004) Disparity sensitivity in the superior colliculus of the cat Brain Res 1010 87–94 [11.2.3] Minciacchi D, Antonini A (1984) Binocularity in the visual cortex of the adult cat does not depend on the integrity of the corpus callosum Behav Brain Res 13 183–92 [11.9.2] Minnich B, Leeb H, Bernroider EWN, Lametschwandtner A (1999) Three-dimensional morphometry in scanning electron microscopy: a technique for accurate dimensional and angular measurements of microstructures using stereopaired digitized images and digital image analysis J Micros 195 23–33 [24.2.3d] Minsky M (1988) Memoirs on inventing the confocal scanning microscope Scanning 10 128–38 [24.2.3b] Minucci PK , Connors MM (1964) Reaction time under three viewing conditions: binocular dominant eye and nondominant eye J Exp Psychol 67 298–75 [13.1.7] Mitchell DE (1966a) Retinal disparity and diplopia Vis Res 6 441–51 [12.1.2] Mitchell DE (1966b) A review of the concept of “Panum’s fusional areas” Am J Optom Arch Am Acad Optom 43 387–401 [12.1.1a, 12.1.1d] Mitchell DE (1969) Qualitative depth localization with diplopic images of dissimilar shape Vis Res 9 991–4 [15.3.5] Mitchell DE (1970) Properties of stimuli eliciting vergence eye movements and stereopsis Vis Res 10 145–62 [18.4.2a] Mitchell DE, Baker AG (1973) Stereoscopic aftereffects: evidence for disparity–specific neurones in the human visual system Vis Res 13 2273–88 [21.6.1a, 21.6.2b] Mitchell DE, Blakemore C (1970) Binocular depth perception and the corpus callosum Vis Res 10 49–54 [11.9.2] Mitchell DE, O’Hagan S (1972) Accuracy of stereoscopic localization of small line segments that differ in size or orientation for the two eyes Vis Res 12 437–54 [15.3.5, 17.1.2a] Mitchell JF, Stoner GR , Reynolds JH (2004) Object-based attention determines dominance in binocular rivalry Nature 429 410–13 [12.8.2] Mitchell RT, Liaudansky LH (1955) Effect of differential adaptation of the eyes upon threshold sensitivity J Opt Soc Am 45 831–4 [13.2.2] Mitchison GJ 
(1988) Planarity and segmentation in stereoscopic matching Perception 17 753–82 [22.2.3b] Mitchison GJ (1993) The neural representation of stereoscopic depth contrast Perception 22 1415–29 [21.5.1, 21.5.2] Mitchison GJ, McKee SP (1987a) The resolution of ambiguous stereoscopic matches by interpolation Vis Res 27 285–94 [22.2.3b] Mitchison GJ, McKee SP (1987b) Interpolation and the detection of fine structure in stereoscopic matching Vis Res 27 295–302 [22.2.3b] Mitchison GJ, McKee SP (1990) Mechanisms underlying the anisotropy of stereoscopic tilt perception Vis Res 30 1781–91 [20.4.1a]
Mitchison GJ, Westheimer G (1984) The perception of depth in simple figures Vis Res 24 1063–73 [21.3.1] Mitchison GJ, Westheimer G (1990) Viewing geometry and gradients of horizontal disparity In Vision: coding and efficiency (ed C Blakemore) pp 302–9 Cambridge University Press, Cambridge [21.5.3] Mitson L, Ono H, Barbeito R (1976) Three methods of measuring the location of the egocentre: their reliability comparative locations and intercorrelations Can J Psychol 30 1–8 [16.7.6a] Mitsudo H (2007) Illusory depth induced by binocular torsional misalignment Vis Res 47 1303–14 [14.5.2f, 20.3.2a] Mitsudo H, Nakamizo S, Ono H (2005) Greater depth seen with phantom stereopsis is coded at the early stages of visual processing Vis Res 45 1365–74 [17.3] Mitsudo H, Nakamizo S, Ono H (2006) A long-distance detector for partially occluding surfaces Vis Res 46 1180–6 [17.3] Mitsudo H, Kaneko H, Nishida S (2009) Perceived depth of curved lines in the presence of cyclovergence Vis Res 49 348–61 [20.3.2a] Moidell B, Steinbach MJ, Ono H (1988) Egocenter location in children enucleated at an early age Invest Ophthal Vis Sci 29 1348–51 [16.7.5] Mojon DS, Rösler KM, Oetliker H (1998) A bedside test to determine motion stereopsis using the Pulfrich phenomenon Ophthalmology 105 1337–44 [23.7] Mollon J (1974) Aftereffects and the brain New Scientist 61 479–82 [21.1] Mon-Williams M, Wann JP, Rushton S (1993) Binocular vision in a virtual world: visual deficits following the wearing of a head-mounted display Ophthal Physiol Opt 13 387–91 [24.2.6] Mon-Williams M, Tresilian JR, Roberts A (2000) Vergence provides veridical depth perception from horizontal retinal image disparities Exp Brain Res 133 407–13 [20.6.3a] Moore CM, Elsinger CL, Lleras A (2001) Visual attention and the apprehension of spatial relations: the case of depth Percept Psychophys 63 595–606 [22.8.2c] Moore RJ, Spear PD, Kim CBY, Xue JT (1992) Binocular processing in the cat’s dorsal lateral geniculate nucleus. III.
Spatial frequency orientation and direction sensitivity of nondominant–eye influences Exp Brain Res 89 588–98 [12.9.1] Moradi F, Heeger DJ (2009) Inter-ocular contrast normalization in human visual cortex J Vis 9(3) Article 13 [12.3.1b, 13.1.3b] Moraglia G, Schneider B (1990) Effects of direction and magnitude of horizontal disparities on binocular unmasking Perception 19 581–93 [13.2.4b] Moraglia G, Schneider B (1991) Binocular unmasking with vertical disparity Can J Psychol 45 353–66 [13.2.4b] Morgan H, Symmes D (1982) Amazing 3-D Little Brown Co, Boston [24.1.2c] Morgan MJ (1975) Stereoillusion based on visual persistence Nature 256 639–40 [23.3.4] Morgan MJ (1976) Pulfrich effect and the filling in of apparent motion Perception 5 187–95 [23.3.6] Morgan MJ (1977) Differential visual persistence between the two eyes: a model for the Fertsch–Pulfrich effect J Exp Psychol HPP 3 484–95 [23.3.6] Morgan MJ (1979) Perception of continuity in stroboscopic motion: a temporal frequency analysis Vis Res 19 491–500 [23.3.6, 23.6.4] Morgan MJ (1980) Spatiotemporal filtering and the interpolation effect in apparent motion Perception 9 161–74 [23.3.6] Morgan MJ (1981) Vernier acuity and stereopsis with discontinuously moving stimuli Acta Psychol 48 57–67 [23.3.6] Morgan MJ (1986) Positional acuity without monocular cues Perception 15 157–62 [16.2.1] Morgan MJ, Castet E (1995) Stereoscopic depth perception at high velocities Nature 378 380–3 [18.10.1b] Morgan MJ, Castet E (1997) The aperture problem in stereopsis Vis Res 37 2737–44 [18.6.5] Morgan MJ, Fahle M (2000) Motion-stereo mechanisms sensitive to inter-ocular phase Vis Res 40 1667–75 [23.3.2]
Morgan MJ, Thompson P (1975) Apparent motion and the Pulfrich effect Perception 4 3–18 [23.1.1, 23.3.2, 23.3.4, 23.3.6] Morgan MJ, Tyler CW (1995) Mechanisms for dynamic stereomotion respond selectively to horizontal velocity components Proc Roy Soc B 262 371–6 [23.6.4] Morgan MJ, Ward R (1980) Interocular delay produces depth in subjectively moving noise patterns Quart J Exp Psychol 32 387–95 [23.6.4] Morgan MJ, Mason AJS, Solomon JA (1997) Blindsight in normal subjects Nature 385 401–2 [12.5.6] Morgan MW (1955) A unique case of double monocular diplopia Am J Optom Arch Am Acad Optom 32 70–87 [14.4.2] Morgan MW (1961) Anomalous correspondence interpreted as a motor phenomenon Am J Optom Arch Am Acad Optom 38 131–48 [14.4.1d] Morris JS, Friston KJ, Dolan RJ (1997) Neural responses to salient visual stimuli Proc R Soc 294 769–75 [11.2.1] Morrison LC (1977) Stereoscopic localization with the eyes asymmetrically converged Am J Optom Physiol Opt 54 556–66 [14.6.2] Morrone MC, Burr DC, Maffei L (1982) Functional implications of cross–orientation inhibition of cortical visual cells. I. Neurophysiological evidence Proc R Soc B 216 335–54 [12.9.2b] Morrone MC, Burr DC, Speed HD (1987) Cross-orientation inhibition in cats is GABA mediated Exp Brain Res 67 635–44 [12.9.2b] Moseley ME, White DL, Wang SC, et al.
(1989) Stereoscopic MR imaging J Comput Assist Tomog 13 167–73 [24.2.4] Motter BC, Poggio GF (1984) Binocular fixation in the rhesus monkey: spatial and temporal characteristics Exp Brain Res 54 304–14 [18.10.3a] Motter BC, Poggio GF (1990) Dynamic stabilization of receptive fields of cortical neurons during fixation of gaze in the macaque Exp Brain Res 83 37–43 [18.10.3a] Moulden BP (1980) After–effects and the integration of patterns of neural activity within a channel Philos Tr R Soc B 290 39–55 [13.3.1, 13.3.2a] Mousavi MS, Schalkoff RJ (1994) ANN implementation of stereo vision using a multi-layer feedback architecture IEEE Tr Man Mach Cybern 24 1220–38 [15.2.1c] Moutoussis K , Zeki S (2002) The relationship between cortical activation and perception investigated with invisible stimuli Proc Natl Acad Sci 99 9527–32 [12.5.6] Movshon JA, Lennie P (1979) Pattern–selective adaptation in visual cortical neurones Nat New Biol 278 850–2 [12.6.2] Movshon JA, Thompson ID, Tolhurst DJ (1978) Spatial and temporal contrast sensitivity of neurones in areas 17 and 18 of the cat’s visual cortex J Physiol 283 101–120 [11.4.1e, 20.2.1] Movshon JA, Adelson EH, Gizzi MS, Newsome WT (1985) The analysis of moving visual patterns In Pattern recognition mechanisms (ed C Chigas, R Gattas, C Gross) pp 117–51 Springer, New York [22.3.3] Mowatt MH (1940) Configurational properties considered ‘good’ by naïve subjects Am J Psychol 53 46–69 [22.1.1] Mueller CG, Lloyd VV (1948) Stereoscopic acuity for various levels of illumination Proc Natl Acad Sci Washington 34 223–7 [18.5.1] Mueller TJ (1990) A physiological model of binocular rivalry Vis Neurosci 4 63–73 [12.10] Mueller TJ, Blake R (1989) A fresh look at the temporal dynamics of binocular rivalry Biol Cyber 61 223–32 [12.10, 12.3.2a] Muller C, Lankheet MJM, van de Grind WA (2004) Binocular correlation does not improve coherence detection for fronto-parallel motion Vis Res 44 1961–69 [22.3.5] Müller M, Squier J, Wilson KR , Brakenhoff 
GJ (1998) 3D microscopy of transparent objects using third-harmonic generation J Micros 191 296–74 [24.2.3c] Münster C (1941) Über den Einfluss von Helligkeitsunterschieden in beiden Augen auf die stereoscopische Wahrnehmung Z Sinnesphysiol 69 275–60 [17.9] Münsterberg H (1894) A stereoscope without mirrors or prisms Psychol Rev 1 56–60 [24.1.2e]
Muntz WRA (1961) Interocular transfer in Octopus vulgaris J Comp Physiol Psychol 54 49–55 [13.4.2] Murakami I (1999) Motion-transparent inducers have different effects on induced motion and motion capture Vis Res 39 1671–81 [22.7.2] Murakami I, Cavanagh P (1998) A jitter after-effect reveals motion-based stabilization of vision Nature 395 798–801 [18.10.3a] Murasugi CM, Salzman CD, Newsome WT (1993) Microstimulation in visual area MT: effects of varying pulse amplitude and frequency J Neurosci 13 1719–29 [13.3.3b] Murch GM (1972) Binocular relationships in a size and color orientation specific aftereffect J Exp Psychol 93 30–4 [13.3.4] Murch GM (1974) Color contingent motion aftereffects: single or multiple levels of processing Vis Res 14 1181–4 [13.3.5] Murdoch JR, McGhee CNJ, Glover V (1991) The relationship between stereopsis and fine manual dexterity: pilot study of a new instrument Eye 5 642–43 [20.1.1] Murray E (1939) Binocular fusion and the locus of ‘yellow’ Am J Psychol 52 117–21 [12.2.1] Mussap AJ, Levi DM (1995) Binocular processes in vernier acuity J Opt Soc Am A 12 225–33 [13.2.4a] Mustari MJ, Fuchs AF (1990) Discharge patterns of neurons in the pretectal nucleus of the optic tract NOT in the behaving primate J Neurophysiol 64 77–90 [22.6.1b] Mustari MJ, Fuchs AF, Wallman J (1988) Response properties of dorsolateral pontine units during smooth pursuit in the Rhesus macaque J Neurophysiol 60 664–86 [22.6.1d] Mustillo P (1985) Binocular mechanisms mediating crossed and uncrossed stereopsis Psychol Bull 97 187–201 [18.6.4] Mustillo P, Francis E, Oross S, et al.
(1988) Anisotropies in global stereoscopic orientation discrimination Vis Res 28 1315–21 [16.2.1] Mutch K, Smith IM, Yonas A (1983) The effect of two–dimensional and three–dimensional distance on apparent motion Perception 12 305–12 [22.5.3a] Myers RE (1955) Interocular transfer of pattern discrimination in cats following section of crossed optic fibres J Comp Physiol Psychol 48 470–3 [13.4.2] Nachmias J, Sansbury RV (1974) Grating contrast: discrimination may be better than detection Vis Res 14 1039–42 [13.1.3a] Naegele J, Held R (1982) The postnatal development of monocular optokinetic nystagmus in infants Vis Res 22 341–6 [22.6.1b, 22.6.1e] Naganuma T, Nose I, Inoue K, et al. (2005) Information processing of geometrical features of a surface based on binocular disparity cues: an fMRI study Neurosci Res 51 147–55 [11.8.1] Nagel WA (1902) Stereoskopie und Tiefenwahrnehmung im Dämmerungssehen Z Psychol Physiol Sinnesorg 27 294–6 [18.5.1] Nakamizo S, Shimono K, Kondo M, Ono H (1994) Visual directions of two stimuli in Panum’s limiting case Perception 23 1037–48 [16.7.3b] Nakamizo S, Ono H, Ujike H (1999) Subjective staircase: a multiple wallpaper illusion Percept Psychophys 61 13–22 [14.2.2] Nakamizo S, Kawabata H, Ono H (2008) Misconvergence to the stimulus plane causes apparent displacement of the stimulus element seen monocularly Japanese Psychol Res 51 49–62 [16.7.5] Nakamura S, Shimojo S (1999) Critical role of foreground stimuli in perceiving visually induced self-motion (vection) Perception 28 893–902 [22.7.3] Nakayama K (1977) Geometric and physiological aspects of depth perception Proceedings of Society of Photo–Optical Instrumentation Engineers 120 2–9 [14.6.1d, 14.7] Nakayama K (1996) Binocular visual surface perception Proc Natl Acad Sci 93 634–9 [22.1.2] Nakayama K, Shimojo S (1990) Da Vinci stereopsis: depth and subjective occluding contours from unpaired image points Vis Res 30 1811–25 [17.2.2, 17.6.2, 17.6.4] Nakayama K, Silverman GH (1986) Serial
and parallel processing of visual feature conjunctions Nature 320 294–5 [22.5.1e, 22.8.2b]
Nakayama K, Tyler CW (1978) Relative motion induced between stationary lines Vis Res 18 1663–8 [22.7.1] Nakayama K, Silverman GH, MacLeod DIA, Mulligan J (1985) Sensitivity to shearing and compression motion in random dots Perception 14 225–38 [18.6.3b, 20.4.1a] Nakayama K, Shimojo S, Silverman GH (1989) Stereoscopic depth: its relation to image segmentation grouping and the recognition of occluded objects Perception 18 55–8 [22.1.2] Nascimento SMC, Foster DH (2001) Detecting changes of spatial cone-excitation ratios in dichoptic viewing Vis Res 41 2601–6 [13.1.3d] Neary C (1992) The effect of a binocular disparate background on smooth pursuit eye movements Perception 21 (Supplement 2) 52 [22.6.2] Neill RA (1981) Spatio–temporal averaging and the dynamic visual noise stereophenomenon Vis Res 21 673–82 [23.6.1, 23.6.2, 23.6.3, 23.6.4] Neill RA, Fenelon B (1988) Scalp response topography to dynamic random dot stereograms EEG Clin Neurophysiol 69 209–217 [11.7] Nienborg H, Cumming BG (2007) Psychophysically measured task strategy for disparity discrimination is reflected in V2 neurons Nat Neurosci 10 1608–14 [11.5.1] Nelson JI (1975) Globality and stereoscopic fusion in binocular vision J Theor Biol 49 1–88 [15.4.5] Nelson JI (1977) The plasticity of correspondence: after–effects illusions and horopter shifts in depth perception J Theor Biol 66 203–66 [21.7.1, 21.7.2] Nelson JI (1981) A neurophysiological model for anomalous correspondence based on mechanisms of sensory fusion Doc Ophthal 51 3–100 [14.4.1e] Nelson JI, Kato H, Bishop PO (1977) Discrimination of orientation and position disparities by binocularly activated neurons in cat striate cortex J Neurophysiol 40 290–83 [11.6.2] Neri P, Parker AJ, Blakemore C (1999) Probing the human stereoscopic system with reverse correlation Nature 401 695–8 [11.4.1f, 15.3.7d] Neri P, Bridge H, Heeger DJ (2004) Stereoscopic processing of absolute and relative disparity in human visual cortex J Neurophysiol 92 1880–91
[11.8.1] Newsome WT, Wurtz RH, Komatsu H (1988) Relation of cortical areas MT and MST to pursuit eye movements. II Differentiation of retinal from extraretinal inputs J Neurophysiol 60 604–20 [22.6.1d] Ngo TT, Miller SM, Liu GB, Pettigrew JD (2000) Binocular rivalry and perceptual coherence Curr Biol 10 R134–6 [12.4.4b] Nguyen VA, Freeman A, Wenderoth P (2001) The depth and selectivity of suppression in binocular rivalry Percept Psychophys 63 348–60 [12.4.4a] Nguyen VA, Freeman AW, Alais D (2003) Increasing depth of binocular rivalry suppression along two visual pathways Vis Res 43 2003–8 [12.3.3d] Nguyenkim JD, DeAngelis GC (2003) Disparity-based coding of three-dimensional surface orientation by macaque middle temporal neurons J Neurosci 23 7117–28 [11.5.2a] Nichols DF, Wilson HR (2009) Stimulus specificity in spatially-extended interocular suppression Vis Res 49 2110–20 [12.4.2] Nickalls RWD (1986) The rotating Pulfrich effect and a new method of determining visual latency differences Vis Res 29 367–72 [23.1.1] Nickalls RWD (1996) The influence of target angular velocity on visual latency difference determined using the rotating Pulfrich effect Vis Res 36 2865–72 [23.1.2] Nielsen KRK, Poggio T (1984) Vertical image registration in stereopsis Vis Res 27 1133–40 [18.4.2b] Nienborg H, Bridge H, Parker AJ, Cumming BG (2004) Receptive field size in V1 neurons limits acuity for perceiving disparity modulation J Neurosci 24 2065–76 [11.6.3] Nienborg H, Bridge H, Parker AJ, Cumming BG (2005) Neuronal computation of disparity in V1 limits temporal resolution for detecting disparity modulation J Neurosci 25 10207–19 [11.10.1b] Nijhawan R (1995) “Reversed” illusion with three-dimensional Müller-Lyer shapes Perception 27 1281–96 [22.5.2]
Nikara T, Bishop PO, Pettigrew JD (1968) Analysis of retinal correspondence by studying receptive fields of binocular single units in cat striate cortex Exp Brain Res 6 353–72 [11.1.2, 11.3.1, 11.4.4] Ninio J (1981) Random–curve stereograms: a flexible tool for the study of binocular vision Perception 10 403–10 [24.1.5] Ninio J (1985) Orientational versus horizontal disparity in the stereoscopic appreciation of slant Perception 14 305–14 [15.3.5, 20.3.1a] Ninio J (2007) The science and craft of autostereograms Spat Vis 21 185–200 [24.1.6] Ninio J, Herlin I (1988) Speed and accuracy of 3D interpretation of linear stereograms Vis Res 28 1223–33 [15.3.11] Nishida S, Ashida H (2000) A hierarchical structure of motion system revealed by interocular transfer of flicker motion aftereffects Vis Res 40 295–78 [13.3.3b] Nishida S, Ashida H (2001) A motion aftereffect seen more strongly by the non-adapted eye: evidence of multistage adaptation in visual motion processing Vis Res 41 561–70 [13.3.3b] Nishida S, Sato T (1995) Motion aftereffect with flickering test patterns reveals higher stages of motion processing Vis Res 35 477–90 [13.3.3b, 16.5.3a] Nishida S, Ashida H, Sato T (1994) Complete transfer of motion aftereffect with flickering test Vis Res 34 2707–16 [13.3.3b] Nishina S, Okada M, Kawato M (2003) Spatio-temporal dynamics of depth propagation on uniform region Vis Res 43 2493–503 [22.2.3a] Noble J (1966) Mirror-images and the forebrain commissures of the monkey Nature 211 1293–6 [13.4.2] Noble J (1968) Paradoxical interocular transfer of mirror-image discrimination in the optic chiasm sectioned monkey Brain Res 10 127–51 [13.4.2] Noda H (1986) Mossy fibres sending retinal slip eye and head velocity signals to the flocculus of the monkey J Physiol 379 39–60 [22.6.1d] Nomura M (1993) A model for neural representation of binocular disparity in striate cortex: distributed representation and veto mechanism Biol Cyber 69 165–71 [11.4.1d] Nomura M, Matsumoto G, Fugiwara S 
(1990) A binocular model for the simple cell Biol Cyber 63 237–42 [11.4.1d] Noorden GK von (1970) Etiology and pathogenesis of fixation anomalies in strabismus. I. Relationship between eccentric fixation and anomalous retinal correspondence Am J Ophthal 69 210–22 [14.4.1d] Norcia AM, Tyler CW (1985) Spatial frequency sweep VEP: visual acuity during the first year of life Vis Res 25 1399–408 [13.1.8b] Norcia AM, Sutter EE, Tyler CW (1985) Electrophysiological evidence for the existence of coarse and fine disparity mechanisms in human Vis Res 25 1603–11 [11.7] Norling JS (1953) The stereoscopic art J Soc Motion Pict Televis Engin 60 298–308 [24.1.1] Norman HF, Norman JF, Bilotta J (2000) The temporal course of suppression during binocular rivalry Perception 29 831–41 [12.10, 12.3.6b] Norman JF, Todd JT (1998) Stereoscopic discrimination of interval and ordinal depth relations on smooth surfaces and in empty space Perception 27 257–72 [18.6.2a] Norman JF, Lappin JS, Zucker SW (1991) The discriminability of smooth stereoscopic surfaces Perception 20 789–807 [18.3.2b] Nothdurft HC (1985) Texture discrimination does not occur at the cyclopean retina Perception 14 527–37 [16.6.1b] O’Brien V (1958) Contour perception illusion and reality J Opt Soc Am 48 112–19 [21.4.2e] O’Kane LM, Hibbard PB (2007) Vertical disparity affects shape and size judgments across surfaces separated in depth Perception 36 696–702 [20.6.3c] O’Kane LM, Hibbard PB (2010) Contextual effects on perceived three-dimensional shape Vis Res 50 1095–100 [20.6.3a] O’Shea RP (1987) Chronometric analysis supports fusion rather than suppression theory of binocular vision Vis Res 27 781–91 [12.7.2]
O’Shea RP (1989) Depth with rival Kaufman–type stereograms Invest Ophthal Vis Sci 30 (Abs) 389 [17.1.2a] O’Shea RP, Blake R (1986) Dichoptic temporal frequency differences do not lead to binocular rivalry Percept Psychophys 39 59–63 [12.3.5b] O’Shea RP, Blake R (1987) Depth without disparity in random–dot stereograms Percept Psychophys 42 205–14 [16.1.2c, 17.5] O’Shea RP, Corballis PM (2003) Binocular rivalry in split-brain observers J Vis 3 610–15 [12.9.2d] O’Shea RP, Corballis PM (2005) Visual grouping on binocular rivalry in a split-brain observer Vis Res 45 247–61 [12.9.2d] O’Shea RP, Crassini B (1981a) The sensitivity of binocular rivalry suppression to changes in orientation assessed by reaction–time and forced–choice techniques Perception 10 283–93 [12.5.3] O’Shea RP, Crassini B (1981b) Interocular transfer of the motion after–effect is not reduced by binocular rivalry Vis Res 21 801–4 [12.6.4] O’Shea RP, Crassini B (1982) The dependence of cyclofusion on orientation Percept Psychophys 32 195–6 [12.1.5] O’Shea RP, Crassini B (1984) Binocular rivalry occurs without simultaneous presentation of rival stimuli Percept Psychophys 36 296–76 [12.3.5d] O’Shea RP, Williams DR (1996) Binocular rivalry with isoluminant stimuli visible only via short-wavelength-sensitive cones Vis Res 36 1561–71 [12.3.2e] O’Shea RP, Wilson RG, Duckett A (1993) The effects of contrast reversal on the direct, indirect, and interocularly-transferred tilt aftereffect NZ J Psychol 22 94–100 [13.3.2a] O’Shea RP, Blake R, Wolfe JM (1994a) Binocular rivalry and fusion under scotopic luminances Perception 23 771–84 [12.3.2c] O’Shea RP, Blackburn SG, Ono H (1994b) Contrast as a depth cue Vis Res 34 1595–604 [18.7.3b] O’Shea RP, Simms AJH, Govan DG (1997) The effect of spatial frequency and field size on the spread of exclusive visibility in binocular rivalry Vis Res 37 175–83 [12.4.1] O’Shea RP, Parker A, La Rooy D, Alais D (2009) Monocular rivalry exhibits three hallmarks of binocular
rivalry: Evidence for common processes Vis Res 49 671–81 [12.3.8d] O’Shea WF, Ciuffreda KJ, Fisher SK , et al. (1988) Relation between distance heterophoria and tonic vergence Am J Optom Physiol Opt 65 787–93 [12.4.2] O’Toole AJ, Kersten DJ (1992) Learning to see random–dot stereograms Perception 21 227–43 [18.14.2c, 18.14.2d] O’Toole AJ, Walker CL (1997) On the preattentive accessibility of stereoscopic disparity: evidence from visual search Percept Psychophys 59 202–18 [22.8.2a] Odom JV, Chao GM (1987) A stereo illusion induced by binocularly presented gratings: effects of number of eyes stimulated spatial frequency orientation field size and viewing distance Percept Psychophys 42 140–9 [14.2.2] Odom JV, Chao GM (1995) Models of binocular luminance interaction evaluated using visually evoked potential and psychophysical measures: a tribute to M Russell Harter Int J Neurosci 80 255–80 [13.1.8b] Ogle KN (1932) An analytical treatment of the longitudinal horopter; its measurement and application to related phenomena especially to the relative size and shape of the ocular images J Opt Soc Am 22 665–728 [14.6.2a] Ogle KN (1938) Induced size effect. I. A new phenomenon in binocular space–perception associated with the relative sizes of the images of the two eyes Arch Ophthal 20 604–23 [20.2.3a, 20.2.3b, 20.2.4b] Ogle KN (1939a) Induced size effect. II. An experimental study of the phenomenon with restricted fusion stimuli Arch Ophthal 21 604–25 [20.2.3a] Ogle KN (1939b) Induced size effect. III. A study of the phenomenon as influenced by horizontal disparity of the fusion contours Arch Ophthal 22 613–35 [20.2.3a] Ogle KN (1939c) Relative sizes of ocular images of the two eyes in asymmetrical convergence Arch Ophthal 22 1046–67 [18.10.2a, 19.6.3]
Ogle KN (1940) Induced effect with the eyes in asymmetrical convergence Arch Ophthal 23 1023–8 [20.2.3a] Ogle KN (1946) The binocular depth contrast phenomenon Am J Psychol 59 111–29 [21.3.3, 21.5.2, 21.3.1] Ogle KN (1952) On the limits of stereoscopic vision J Exp Psychol 44 253–9 [18.4.1a] Ogle KN (1953) Precision and validity of stereoscopic depth perception from double images J Opt Soc Am 43 906–13 [18.3.3a] Ogle KN (1955) Stereopsis and vertical disparity Arch Ophthal 53 495–504 [18.4.2a, 18.6.5] Ogle KN (1956) Stereoscopic acuity and the role of convergence J Opt Soc Am 46 269–73 [18.6.2a, 18.10.2a] Ogle KN (1958) Note on stereoscopic acuity and viewing distance J Opt Soc Am 48 794–8 [18.6.7] Ogle KN (1962) The optical space sense. In The eye (ed H Davson) Vol 4 pp 211–432 Academic Press, New York [17.7, 17.9, 23.2.1] Ogle KN (1963) Stereoscopic depth perception and exposure delay between images to the two eyes J Opt Soc Am 53 1296–304 [18.12.2a] Ogle KN (1964) Researches in binocular vision Hafner, New York [12.1.1d, 14.6.1c, 14.6.2a, 18.3.4, 18.10.3b, 20.2.1, 20.2.2c, 20.6.5a] Ogle KN, Prangen A de H (1953) Observations on vertical divergences and hyperphorias Arch Ophthal 49 313–34 [12.1.1a, 12.1.1d] Ogle KN, Reiher L (1962) Stereoscopic depth perception from after–images Vis Res 2 439–47 [18.10.1a, 20.6.3b] Ogle KN, Wakefield JM (1967) Stereoscopic depth and binocular rivalry Vis Res 7 89–98 [12.7.3] Ogle KN, Weil MP (1958) Stereoscopic vision and the duration of the stimulus Arch Ophthal 59 4–17 [18.10.1a, 18.12.1a, 18.5.1] Ohmi M, Howard IP (1991) Induced visual motion; dissociation of oculocentric and headcentric (oculomotor) components Invest Ophthal Vis Sci 32 (Abs) 1272 [22.7.1] Ohmi M, Howard IP, Everleigh B (1986) Directional preponderance in human optokinetic nystagmus Exp Brain Res 63 387–94 [22.6.1c] Ohmi M, Howard IP, Landolt J (1987) Circular vection as a function of foreground–background relationships Perception 16 17–22 [22.7.3] Ohtsuka S 
(1995a) Perception of direction in three–dimensional space with occlusion The Institute of Electronics, Information and Communication Engineers Tech Rep 95 31–6 (Abstract in English) [16.7.4b] Ohtsuka S (1995b) Relationship between error in inclination perception in observing Poggendorff figures and stereopsis The Institute of Electronics, Information and Communication Engineers Tech Rep 95 24–6 (Abstract in English) [16.7.4b] Ohtsuka S, Yano S (1994) The phenomenon causing the Poggendorff illusion compensates geometrical error in reconstructed 2D image from stereopsis. The Institute of Television Engineers of Japan (ITE) Tech Rep 18–60 25–30 (Abstract in English) [16.7.4b] Ohwaki S (1960) On the destruction of geometrical illusions in stereoscopic observation Tohoku Psychol Folia 29 27–36 [16.3.1] Ohzawa I, Freeman RD (1986a) The binocular organization of simple cells in the cat’s visual cortex J Neurophysiol 56 221–42 [11.4.1d, 11.4.5b, 13.1.8a] Ohzawa I, Freeman RD (1986b) The binocular organization of complex cells in the cat’s visual cortex J Neurophysiol 56 243–60 [11.3.1, 11.4.1d, 13.1.8a] Ohzawa I, Freeman RD (1988) Cyclopean visual evoked potentials: a new test of binocular vision Vis Res 28 1167–70 [13.1.8b] Ohzawa I, Sclar G, Freeman RD (1985) Contrast gain control in the cat’s visual system J Neurophysiol 54 651–67 [11.4.1f, 13.2.6] Ohzawa I, DeAngelis GC, Freeman RD (1990) Stereoscopic depth discrimination in the visual cortex: neurons ideally suited as disparity detectors Science 249 1037–41 [11.1.2, 11.4.1d, 11.10.1a, 11.10.1b] Ohzawa I, DeAngelis GC, Freeman RD (1996) Encoding of binocular disparity by simple cells in the cat’s visual cortex J Neurophysiol 75 1779–805 [11.4.3a, 11.4.5b]
Ohzawa I, DeAngelis GC, Freeman RD (1997) Encoding of binocular disparity by complex cells in the cat’s visual cortex J Neurophysiol 77 2879–910 [11.4.1f] Okoshi T (1976) Three–dimensional imaging techniques Academic Press, New York [24.1.3b, 24.1.4b] Ono H (1979) Axiomatic summary and deductions from Hering’s principles of visual direction Percept Psychophys 25 473–7 [16.7.2d] Ono H (1981) On Wells’s (1792) law of visual direction Percept Psychophys 30 403–6 [16.7.2d] Ono H (1984) Exorcising the double–nail illusion: giving up the ghost Perception 13 763–8 [15.4.6, 17.6.3] Ono H (1991) Binocular visual directions of an object when seen as single or double In Vision and visual dysfunction Vol 9 Binocular vision (ed D Regan) pp 1–18 MacMillan, London [16.7.2d] Ono H, Angus R (1974) Adaptation to sensory–motor conflict produced by the visual direction of the hand specified from the cyclopean eye J Exp Psychol 103 1–9 [16.7.3a] Ono H, Barbeito R (1982) The cyclopean eye vs the sighting–dominant eye as the center of visual direction Percept Psychophys 32 201–10 [16.7.3b, 16.7.6b] Ono H, Barbeito R (1985) Utrocular discrimination is not sufficient for utrocular identification Vis Res 25 289–99 [16.8] Ono H, Gonda G (1978) Apparent movement eye movements and phoria when two eyes alternate in viewing a stimulus Perception 7 75–83 [16.7.5] Ono H, Mapp AP (1995) A restatement and modification of Wells–Hering’s laws of visual direction Perception 24 237–52 [14.6.1c, 16.7.2d] Ono H, Steinbach MJ (1983) The Pulfrich phenomenon with eye movement Vis Res 23 1735–7 [23.5] Ono H, Wade NJ (1985) Resolving discrepant results of the Wheatstone experiment Psychol Res 47 135–42 [16.7.3a] Ono H, Weber EU (1981) Nonveridical visual direction produced by monocular viewing J Exp Psychol HPP 7 937–47 [16.7.5] Ono H, Hastorf A, Osgood CE (1966) Binocular rivalry as a function of incongruity of meaning Scand J Psychol 7 225–33 [12.8.3a] Ono H, Komoda M, Mueller ER (1971) 
Intermittent stimulation of binocular disparate colors and central fusion Percept Psychophys 9 343–7 [12.2.2] Ono H, Wilkinson A, Muter P, Mitson L (1972) Apparent movement and change in perceived location of a stimulus produced by a change in accommodative vergence Percept Psychophys 12 187–92 [16.7.5, 16.7.7] Ono H, Angus R , Gregor P (1977) Binocular single vision achieved by fusion and suppression Percept Psychophys 21 513–21 [16.7.3a] Ono H, Shimono K , Shibuta K (1992) Occlusion as a depth cue in the Wheatstone–Panum limiting case Percept Psychophys 51 3–13 [17.6.4] Ono H, Ohtsuka S, Lillakas L (1998) The visual system’s solution to Leonardo da Vinci’s paradox and to the problems created by the solution Proceeding for Workshop on Visual Cognition (Tsukuba, Japan: Science and Technology Association and National Institute of Bioscience and Human-Technology) 125–36 [16.7.4b] Ono H, Shimono K , Saida S, Ujike H (2000) Transformation of the visual-line value in binocular vision: stimuli on corresponding points can be seen in two different directions Perception 29 421–36 [16.7.3a] Ono H, Lillakas L, Mapp AP (2002a) The making of the direction sensing system for the Howard eggmobile In L Harris and M Jenkin (Eds) Levels of perception Springer Verlag , New York in press [16.7.7] Ono H, Mapp AP, Howard IP (2002b) The cyclopean eye in vision: the new and old data continue to hit you right between the eyes Vis Res 42 1307–24 [16.7.7] Ono H, Wade NJ, Lillakas L (2002c) The pursuit of Leonardo’s constraint Perception 31 83–102 [16.7.4b] Ono H, Lillikas L, Grove PM, Suzuki M (2003) Leonardo’s constraint: two opaque objects cannot be seen in the same direction J Exp Psychol Gen 132 253–65 [16.7.4b]
Ono H, Lillikas L, Wade NJ (2007a) The cyclopean illusion unleashed Vis Res 47 2067–75 [16.7.7] Ono H, Lillikas L, Wade NJ (2007b) Seeing double and depth with Wheatstone’s stereograms Perception 36 1611–23 [18.4.1a] Ono H, Wade NJ, Lillakas L (2009) Binocular vision: Defining the historical direction Perception 38 492–507 [16.7.2b, 16.7.7] Ooi TL, He ZJ (1999) Binocular rivalry and visual awareness: the role of attention Perception 28 551–74 [12.3.3c, 12.8.2] Ooi TL, He ZJ (2003) A distributed intercortical processing of binocular rivalry: psychophysical evidence Perception 32 155–66 [12.4.4b] Ooi TL, He ZJ (2006) Binocular rivalry and surface-boundary processing Perception 35 581–603 [12.3.3b] Ooi TL, Loop MS (1994) Visual suppression and its effect upon color and luminance sensitivity Vis Res 34 2997–3003 [12.3.2f ] Orban GA, Janssen P, Vogels R (2006) Extracting 3D structure from disparity TINS 29 466–73 [11.6.3] Oster G (1965) Optical art App Optics 4 1359–69 [12.1.7] Osuobeni EP (1991) Effect of chromatic aberration on isoluminance stereothreshold Optom Vis Sci 68 552–5 [17.8] Osuobeni EP, O’Leary DJ (1986) Chromatic and luminance difference contribution to stereopsis Am J Optom Physiol Opt 63 970–7 [17.1.4a] Oswald I (1957) After–images from retina and brain Quart J Exp Psychol 9 88–100 [12.6.2, 13.3.1] Over R (1971) Comparison of normalization theory and neural enhancement explanation of negative aftereffects Psychol Bull 75 225–43 [21.1] Over R , Long N, Lovegrove W (1973) Absence of binocular interaction between spatial and color attributes of visual stimuli Percept Psychophys 13 534–40 [13.3.5, 15.3.8a] Owens DA, Leibowitz HW (1975) Chromostereopsis with small pupils J Opt Soc Am 65 358–9 [17.8] Paap KR , Ebenholtz SM (1977) Concomitant direction and distance aftereffects of sustained convergence: a muscle potentiation explanation for eye–specific adaptation Percept Psychophys 21 307–14 [13.4.3] Pack CC, Born RT, Livingstone MS (2003) Two-dimensional 
substructure of stereo and motion interactions in macaque visual cortex Neuron 37 525–35 [11.6.5] Paffen CLE, te Pas SF, Kanai R, et al. (2004) Center-surround interactions in visual motion processing during binocular rivalry Vis Res 44 1635–39 [12.3.3b] Paffen CLE, Tadin D, te Pas SF, et al. (2006a) Adaptive center-surround interactions in human vision revealed during binocular rivalry Vis Res 46 599–604 [12.3.3b] Paffen CLE, Alais D, Verstraten FA (2006b) Attention speeds binocular rivalry Psychol Sci 17 752–6 [12.8.2] Paffen CLE, Naber M, Verstraten FAJ (2008a) The spatial origin of a perceptual transition in binocular rivalry PLoS ONE 3 e2311 [12.3.5e] Paffen CLE, Verstraten FAJ, Vidnyánszky Z (2008b) Attention-based perceptual learning increases binocular rivalry suppression of irrelevant visual features J Vis 8(4) Article 25 [12.8.2] Palanca BJA, DeAngelis GC (2003) Macaque middle temporal neurons signal depth in the absence of motion J Neurosci 23 7647–58 [11.5.2a] Palmer DA (1961) Measurement of the horizontal extent of Panum’s area by a method of constant stimuli Optica Acta 8 151–9 [12.1.1d, 12.1.4] Palmisano S, Allison RS, Howard IP (2001) Effects of horizontal and vertical additive disparity noise on stereoscopic corrugation detection Vis Res 41 3133–43 [11.10.1c, 18.4.2b] Palmisano S, Allison RS, Howard IP (2006) Effect of 3-D grating detection with static and dynamic random-dot stereograms Vis Res 46 57–70 [20.6.3b] Palmisano S, Gillam B, Govan DG, et al. (2010) Stereoscopic perception of real depths at large distances J Vis 10(6) Article 19 [15.2.2b]
Pantle A (1974) Motion aftereffect magnitude as a measure of the spatio-temporal response properties of direction-selective analyzers Vis Res 14 229–36 [16.5.3a] Pantle A, Picciano L (1976) A multistable movement display: evidence for two separate motion systems in human vision Science 193 500–2 [16.4.2a, 16.4.2e] Panum PL (1858) Physiologische Untersuchungen über das Sehen mit zwei Augen. Schwerssche Buchhandlung , Kiel [12.1.1a, 14.2.1] Papathomas TV, Julesz B (1989) Stereoscopic illusion based on the proximity principle Perception 18 589–94 [15.3.2] Papathomas TV, Feher A, Julesz B, Zeevi Y (1996) Interactions of monocular and cyclopean components and the role of depth in the Ebbinghaus illusion Perception 25 783–95 [16.3.2, 22.5.2] Papert S (1961) Centrally produced geometrical illusions Nature 191 733 [16.3.2] Papert S (1964) Stereoscopic synthesis as a technique for locating visual mechanisms MIT Quart Prog Rep 73 239–43 [13.3.3d, 16.3.2, 16.5.3a] Pardhan S (2003) Binocular recognition summation in the peripheral visual field: contrast and orientation dependence Vis Res 43 1249–55 [13.1.2e] Pardhan S, Rose D (1999) Binocular and monocular detection of Gabor patches in binocular two-dimensional noise Perception 28 203–15 [13.1.2b] Pardhan S, Gilchrist J, Douthwaite W (1989) The effect of spatial frequency on binocular contrast inhibition Ophthal Physiol Opt 9 46–9 [13.1.2b] Pardon HR (1962) A new testing device for stereopsis J Am Optom Assoc 33 510–12 [18.2.1c] Paris J, Prestrude AM (1975) On the mechanism of the interocular light adaptation effect Vis Res 15 595–603 [13.2.2] Park K , Shebilske WL (1991) Phoria Hering’s Laws and monocular perception of direction J Exp Psychol HPP 17 219–31 [16.7.5] Parker A, Alais D (2007) A bias for looming stimuli to predominate in binocular rivalry Vis Res 37 2661–74 [12.8.2] Parker AJ (2007) Binocular depth perception and the cerebral cortex Nat Rev Neurosci 8 379–91 [11.9.2] Parker AJ, Yang Y (1989) Spatial properties of 
disparity pooling in human stereo vision Vis Res 29 1525–38 [18.8.2c] Parker AJ, Johnston EB, Mansfield JS, Yang Y (1991) Stereo surfaces and shape In Computational models of visual processing (ed MS Landy, JA Movshon) pp 359–81 MIT Press, Cambridge MA [17.1.2b] Parks TE, Rock I (1990) Illusory contours from pictorially threedimensional inducing elements Perception 19 119–21 [22.2.4a] Pasino L, Maraini G (1966) Area of binocular vision in anomalous retinal correspondence Br J Ophthal 50 646–50 [14.4.1a] Pasley BN, Mayes LC, Schultz RT (2004) Subcortical discrimination of unperceived objects during binocular rivalry Neuron 42 163–72 [12.9.2f ] Pastore N (1964) Induction of a stereoscopic depth effect Science 144 888 [21.3.1] Pastore N, Terwilliger M (1966) Induction of stereoscopic depth effects Br J Psychol 57 201–2 [21.3.1] Patel SS, Ukwade MT, Stevenson SB, et al. (2003) Stereoscopic depth perception from oblique phase disparities Vis Res 43 2479–92 [18.6.5] Patterson R (1990) Spatiotemporal properties of stereoacuity Optom Vis Sci 67 123–8 [18.5.2] Patterson R (1999) Stereoscopic (cyclopean) motion sensing Vis Res 39 3329–45 [16.4.1] Patterson R , Becker S (1996) Direction-selective adaptation and simultaneous contrast induced by stereoscopic (cyclopean) motion Vis Res 36 1773–81 [16.5.3b] Patterson R , Fox R (1983) Depth separation and the Ponzo illusion Percept Psychophys 34 25–8 [22.5.2] Patterson R , Fox R (1984) Stereopsis during continuous head motion Vis Res 27 2001–3 [18.10.5]
Patterson R , Fox R (1990) Metacontrast masking between cyclopean and luminance stimuli Vis Res 30 439–48 [13.2.7a] Patterson R , Hart P, Nowak D (1991) The cyclopean Ternus display and the perception of element versus group movement Vis Res 31 2085–92 [16.4.2e] Patterson R , Moe L, Hewitt T (1992a) Factors that affect depth perception in stereoscopic displays Hum Factors 34 655–67 [18.6.4] Patterson R , Ricker C, McGary J, Rose D (1992b) Properties of cyclopean motion perception Vis Res 32 149–56 [16.5.2] Patterson R , Bowd C, Phinney R , et al. (1994) Properties of the stereoscopic (cyclopean) motion aftereffect Vis Res 34 1139–47 [16.5.3a] Patterson R , Cayko R , Short GL, et al. (1995) Temporal integration differences between crossed and uncrossed stereoscopic mechanisms Percept Psychophys 57 891–7 [18.6.4, 18.12.1a] Patterson R , Bowd C, Phinney R , et al. (1996) Disparity tuning of the stereoscopic (cyclopean) motion aftereffect Vis Res 36 975–83 [16.5.3a] Patterson R , Donnelly M, Phinney RE, et al. (1997) Speed discrimination of stereoscopic (cyclopean) motion Vis Res 37 871–8 [16.5.2] Patterson R , Bowd C, Donnelly M (1998) The cyclopean (stereoscopic) barber pole illusion Vis Res 38 2119–25 [22.3.1] Patterson R , Fournier LR , Wiediger M, et al. 
(2005) Selective attention and cyclopean motion processing Vis Res 45 2601–2607 [16.5.3a] Payne BR , Pearson HE, Berman N (1984a) Role of corpus callosum in functional organization of cat striate cortex J Neurophysiol 52 570–94 [11.9.2] Payne BR , Pearson HE, Berman N (1984b) Deafferentation and axotomy of neurons in cat striate cortex: time course of changes in binocularity following corpus callosum transection Brain Res 307 201–15 [11.9.2] Payne WH (1967) Visual reaction times on a circle about the fovea Science 155 481–82 [12.3.4] Pearson J, Clifford CWG (2004) Determinants of visual awareness following interruptions during rivalry J Vis 4 96–202 [12.4.4b] Pearson J, Clifford CWG (2005) Suppressed patterns alter vision during binocular rivalry Curr Biol 15 2142–48 [12.6.3] Pearson J, Tadin D, Blake R (2007) The effects of transcranial magnetic stimulation on visual rivalry J Vis 7 1–11 [12.9.2a] Peckham RH, Hart WM (1960) Binocular summation of subliminal repetitive visual stimulation Am J Ophthal 49 1121–5 [13.1.5] Pei F, Pettet MW, Norcia AM (2002) Neural correlates of object-based attention J Vis 2 588–96 [22.8.1] Peirce JW, Solomon SG, Forte JD, Lennie P (2008) Cortical representation of color is binocular J Vis 8 1–10 [17.1.4a] Pelli DG, Palomares M, Majaj NJ (2004) Crowding is unlike ordinary masking: distinguishing feature integration from detection J Vis 4 1136–69 [13.2.5] Pennington J (1970) The effects of wavelength on stereoacuity Am J Optom Arch Am Acad Optom 47 288–94 [18.5.5] Penrose LS, Penrose R (1958) Impossible objects: a special type of illusion Br J Psychol 49 31–3 [15.3.2] Péres-Martinez D (1995) Texture discrimination at the cyclopean retina Perception 27 771–86 [16.6.1b] Perez R , Gonzalez F, Justo M, Ulibarrena C (1999) Interocular temporal delay sensitivity in the visual cortex Eur J Neurosci 11 2593–9 [23.3.1] Peterhans E, Heitger F (2001) Simulation of neuronal responses defining depth order and contrast polarity at illusory contours 
in monkey area V2 J Comp Neurosci 10 195–211 [22.2.4c] Peters HB (1969) The influence of anisometropia on stereosensitivity Am J Optom Arch Am Acad Optom 46 120–3 [18.5.4b] Peterson I (1991) Plastic math. Growing plastic models of mathematical formulas Science News 140 72–3 [24.2.5] Petrov AP (1980) A geometrical explanation of the induced size effect Vis Res 20 409–13 [19.6.5] Petrov Y (2002) Disparity capture by flanking stimuli: a measure for the cooperative mechanism of stereopsis Vis Res 42 809–13 [22.2.3b]
Petrov Y (2003) Is there a pop-out of exclusively binocular (cyclopean) contours and regions Perception 32 1441–50 [16.6.1b] Petrov Y (2004) Higher-contrast is preferred to equal-contrast in stereomatching Vis Res 44 775–84 [15.3.7a] Petrov Y, Glennerster A (2004) The role of a local reference in stereoscopic detection of depth relief Vis Res 44 367–76 [18.3.2b] Petrov Y, Glennerster A (2006) Disparity with respect to a local reference plane as a dominant cue for stereoscopic depth relief Vis Res 46 4321–32 [18.3.2b] Pettet MW (1997) Spatial interactions modulate stereoscopic processing of horizontal and vertical disparities Perception 29 693–706 [21.4.2g] Pettigrew JD, Dreher B (1987) Parallel processing of binocular disparity in the cat’s retinogeniculate pathways Proc R Soc B 232 297–321 [11.3.2] Pettigrew JD, Nikara T, Bishop PO (1968) Binocular interaction on single units in cat striate cortex: simultaneous stimulation by single moving slit with receptive fields in correspondence Exp Brain Res 6 391–410 [11.1.2, 2, 11.4.3b] Phinney R , Wilson R , Hays B, et al. 
(1994) Spatial displacement limits for cyclopean (stereoscopic) apparent-motion perception Perception 23 1287–300 [16.5.1] Pianta MJ, Gillam BJ (2003a) Monocular gap stereopsis: manipulation of the outer edge disparity and the shape of the gap Vis Res 43 1937–50 [17.3] Pianta MJ, Gillam BJ (2003b) Paired and unpaired features can be equally effective in human depth perception Vis Res 43 1–6 [17.3] Piantanida TP (1986) Stereo hysteresis revisited Vis Res 26 431–7 [18.4.1b] Pick HL, Hay JC, Willoughby RH (1966) Interocular transfer of adaptation to prismatic distortion Percept Mot Skills 23 131–5 [13.4.3] Pickersgill MJ (1961) On knowing with which eye one is seeing Quart J Exp Psychol 13 168–72 [16.8] Pickersgill MJ, Jeeves MA (1964) The origin of the after–effect of movement Quart J Exp Psychol 16 90–103 [16.4.1] Pierce BJ, Howard IP (1997) Types of size disparity and the perception of surface slant Perception 26 1503–17 [21.7.1] Pierce BJ, Howard IP, Feresin C (1998) Depth interactions between inclined and slanted surfaces in vertical and horizontal orientations Perception 27 87–103 [21.4.2d] Pierce DM, Benton AL (1975) Relationship between monocular and binocular depth acuity Ophthalmologica 170 43–50 [18.2.1a] Pieron H (1947) Recherches sur la latence de la sensation lumineuse par la méthode de l’effet chronostereoscopique Ann Psychol 48 1–51 [23.2.1] Piggins D (1978) Moirés maintained internally by binocular vision Perception 7 679–81 [14.2.2] Pinckney GA (1964) Reliability of duration as a measure of the spiral aftereffect Percept Mot Skills 18 375–6 [13.3.3a] Pirenne MH (1943) Binocular and uniocular threshold of vision Nature 152 698–9 [13.1.1b] Pizlo Z, Li Y, Francis G (2005) A new look at binocular stereopsis Vis Res 45 2244–55 [22.1.3] Plateau JAF (1850) Vierte Notiz über neue sonderbare Anwendungen des Verweilens der Eindrücke auf die Netzhaut Poggendorff’s Ann Physik Chem 80 287–92 [13.3.3a] Poggio GF (1991) Physiological basis of stereoscopic vision 
In Vision and vision dysfunction Vol 9 Binocular vision (ed D Regan) pp 227–38 MacMillan, London [11.4.1a] Poggio GF (1995) Mechanisms of stereopsis in monkey visual cortex Cereb Cortex 5 193–204 [11.5.1] Poggio GF, Fischer B (1977) Binocular interaction and depth sensitivity in striate and prestriate cortex of behaving rhesus monkey J Neurophysiol 40 1392–405 [11.4.1a, 11.6.4] Poggio GF, Poggio T (1984) The analysis of stereopsis Ann Rev Neurosci 7 379–412 [11.4.1a, 11.5.1]
Poggio GF, Talbot WH (1981) Mechanisms of static and dynamic stereopsis in foveal cortex of the rhesus monkey J Physiol 315 469–92 [11.4.1a, 11.6.4] Poggio GF, Motter BC, Squatrito S, Trotter Y (1985) Responses of neurons in visual cortex (VI and V2) of the alert Macaque to dynamic random–dot stereograms Vis Res 25 397–406 [11.4.1a] Poggio GF, Gonzalez F, Krause F (1988) Stereoscopic mechanisms in monkey visual cortex: binocular correlation and disparity selectivity J Neurosci 8 4531–50 [11.4.1a] Pollard SB, Frisby JP (1990) Transparency and the uniqueness constraint in human and computer stereo vision Nature 347 553–6 [15.3.1] Pollard SB, Mayhew JEW, Frisby JP (1985) PMF: a stereo correspondence algorithm using a disparity gradient limit Perception 14 449–70 [11.10.1c] Polonsky A, Blake R , Braun J, Heeger DJ (2000) Neuronal activity in human primary visual cortex correlates with perception during binocular rivalry Nat Neurosci 3 1153–9 [12.9.2f ] Pong TC, Kenner MA, Otis J (1990) Stereo and motion cues in preattentive vision processing–some experiments with random-dot stereographic image sequences Perception 19 161–70 [15.3.9] Poom L (2002) Seeing stereoscopic depth from disparity between kinetic edges Perception 31 1439–48 [17.1.5] Poom L, Olsson H, Börjesson E (2007) Dissociations between slantcontrast and reversed slant-contrast Vis Res 47 746–54 [21.4.3] Pope DR , Edwards M, Schor CS (1999) Extraction of depth from opposite-contrast stimuli: transient system can, sustained system can’t Vis Res 39 4010–17 [15.3.7c, 18.12.3] Popple AV, Findlay JM (1999) ‘Coarse-to-fine’ cyclopean processing Perception 28 155–65 [18.3.3a] Porac C, Coren S (1976) The dominant eye Psychol Bull 83 880–97 [12.3.7] Porac C, Coren S (1978) Sighting dominance and binocular rivalry Am J Optom Physiol Opt 55 208–13 [12.3.7] Porac C, Coren S (1984) Monocular asymmetries in vision: a phenomenal basis for eye signature Can J Psychol 38 610–27 [16.8] Porac C, Coren S (1986) Sighting 
dominance and egocentric localization Vis Res 29 1709–13 [16.7.6b] Porrill J, Mayhew JEW (1994) Gaze angle explanations of the induced effect Perception 23 219–22 [19.6.5] Porrill J, Mayhew JEW, Frisby JP (1989) Cyclotorsion, conformal invariance, and induced effects in stereoscopic vision In Image Understanding (ed S Ullman, W Richards) pp 185–96 Ablex, Norwood NJ [20.3.2a] Porrill J, Frisby JP, Adams WJ, Buckley D (1999) Robust and optimal use of information in stereo vision Nature 397 63–6 [20.2.4c] Porta GB della (1593) De refractione Optices Parte Carlinum and Pacem Naples [12.7.2] Porterfield W (1737) An essay concerning the motions of our eyes. Part 1. Of their external motions Edinburgh Medical Essays and Observations 3 160–263 [16.7.7] Portfors-Yeomans CV, Regan D (1997) Just-noticeable differences in the speed of cyclopean motion in depth and the speed of cyclopean motion within a frontoparallel plane J Exp Psychol HPP 23 1074–86 [16.5.2] Posner MI, Snyder CRR , Davidson BJ (1980) Attention and the detection of signals J Exp Psychol HPP 109 160–174 [22.8.1] Potetz B, Lee TS (2003) Statistical correlations between twodimensional images and three-dimensional structures in natural scenes J Opt Soc Am A 20 1292–30 [11.10.1] Potts MJ, Harris JP (1979) Dichoptic induction of movement aftereffects contingent on color and on orientation Percept Psychophys 29 25–31 [13.3.5] Pouget A, Sejnowski TJ (1994) A neural model of the cortical representation of egocentric distance Cereb Cortex 4 314–29 [11.4.6a] Prablanc C, Tzavaras A, Jeannerod M (1975) Adaptation of the two arms to opposite prism displacements Quart J Exp Psychol 27 667–71 [13.4.3]
Prazdny K (1983) Stereoscopic matching eye position and absolute depth Perception 12 151–60 [15.3.10] Prazdny K (1984) Stereopsis from kinetic and flicker edges Percept Psychophys 36 490–2 [17.1.5] Prazdny K (1985a) On the disparity gradient limit for binocular fusion Percept Psychophys 37 81–3 [12.1.3a] Prazdny K (1985b) Detection of binocular disparities Biol Cyber 52 93–9 [15.4.5] Prazdny K (1985c) Vertical disparity tolerance in random-dot stereograms Bull Psychonom Soc 23 413–14 [18.4.2b] Prazdny K (1986) Three–dimensional structure from long–range apparent motion Perception 15 619–25 [22.3.4] Prentice WCH (1948) New observations of binocular yellow J Exp Psychol 38 284–8 [12.2.1] Preston TJ, Li S, Kourtzi Z , Welchman AE (2008) Multivoxel pattern selectivity for perceptually relevant binocular disparities in the human brain J Neurosci 28 11315–27 [11.4.1f ] Prestrude AM (1971) Visual latencies at photopic levels of retinal illuminance Vis Res 11 351–61 [23.2.3, 23.4.1] Prestrude AM, Baker HD (1971) Light adaptation and visual latency Vis Res 11 363–9 [23.4.2a, 23.4.3] Previc FH, Breitmeyer BG, Weinstein LF (1995) Discriminability of random-dot stereograms in three-dimensional space Int J Neurosci 80 277–53 [18.6.1b] Prévost A (1843) Essai sur la theorie de la vision binoculaire Ramboz, Geneva [14.5.2c] Price TJ, O’Toole AJ, Dambach KC (1998) A moving cast shadow diminishes the Pulfrich phenomenon Perception 27 591–3 [23.1.3] Prince SJD, Eagle RA (1999) Size-disparity correlation in human binocular depth perception Proc R Soc B 296 1361–5 [18.7.2a] Prince SJD, Eagle RA (2000a) Weighted directional energy model of human stereo correspondence Vis Res 40 1143–55 [11.10.1b, 15.2.1c] Prince SJD, Eagle RA (2000b) Stereo correspondence in onedimensional Gabor stimuli Vis Res 40 913–27 [18.4.1d] Prince SJD, Rogers BJ (1998) Sensitivity to disparity corrugations in peripheral vision Vis Res 38 2533–7 [18.6.3b] Prince SJD, Eagle RA, Rogers BJ (1998) Contrast masking 
reveals spatial-frequency channels in stereopsis Perception 27 1293–87 [18.7.4] Prince SJD, Pointon AD, Cumming BG, Parker AJ (2000) The precision of single neuron responses in cortical area V1 during stereoscopic depth judgments J Neurosci 20 3387–3400 [11.4.1c] Prince SJD, Pointon AD, Cumming BG, Parker AJ (2002a) Quantitative analysis of the responses of V1 neurons to horizontal disparity in dynamic random-dot stereograms J Neurophysiol 87 191–208 [11.4.1a, 11.4.1b, 11.4.1d, 11.4.5a] Prince SJD, Cumming BG, Parker AJ (2002b) Range and mechanism of encoding of horizontal disparity in macaque V1 J Neurophysiol 87 209–21 [11.4.3a, 18.7.2b] Prins N, Juola JF (2001) Relative roles of 3-D and 2-D coordinate systems in solving the correspondence problem in apparent motion Vis Res 41 759–769 [22.5.3b] Ptito M, Lepore F, Guillemot JP (1992) Loss of stereopsis following lesions of cortical areas 17–18 in the cat Exp Brain Res 89 521–30 [11.3.1] Puerta AM (1989) The power of shadows: shadow stereopsis J Opt Soc Am 6A 309–11 [17.1.1c] Pulfrich C (1922) Die stereoskopie im Dienste der isochromen und heterochromen Photometrie Naturwissenschaften 10 553–64 [23.1.1, 23.1.2, 23.4.1] Pulliam K (1981) Spatial frequency analysis of three-dimensional vision. Visual stimulation and image realism II Proc Soc Photo Opt Instru Engin 303 71–7 [18.7.2c] Purdy DM (1934) Double monocular diplopia J Gen Psychol 11 311–27 [14.4.2, 2] Purghé F (1995) Illusory figures from stereoscopically three-dimensional inducers depicting no occlusion event Perception 27 905–18 [22.2.4a]
Puri A, Kollaris RV, Haskell BG (1997) Basics of stereoscopic video, new compression results with MPEG-2 and a proposal for MPEG-4 Sig Proc: Image Comm 10 201–34 [24.2.6] Purkinje J (1825) Beobachtungen und Versuche zur Physiologie der Sinne Vol 2 p 60 JG Calve, Prague [13.3.3a] Purves D, Shimpi A, Lotto RB (1999) An empirical explanation of the Cornsweet effect J Neurosci 19 8542–51 [22.4.4] Pylyshyn ZW, Storm RW (1988) Tracking multiple independent targets: evidence for a parallel tracking mechanism Spat Vis 3 179–97 [22.8.2c] Qian N (1994) Computing stereo disparity and motion with known binocular cell properties Neural Comput 6 390–404 [11.10.1a, 11.10.1b, 11.10.1c] Qian N (1997) Binocular disparity and the perception of depth Neuron 18 359–68 [11.10.1b] Qian N, Andersen RA (1994) Transparent motion perception as detection of unbalanced motion signals. II. Physiology J Neurosci 14 7367–80 [22.3.2] Qian N, Andersen RA (1997) A physiological model for motion-stereo integration and a unified explanation of Pulfrich-like phenomena Vis Res 37 1683–98 [11.10.1b, 23.3.2] Qian N, Freeman RD (2009) Pulfrich phenomena are coded effectively by a joint motion-disparity process J Vis 9(5) Article 24 [23.3.2] Qian N, Zhu Y (1997) Physiological computation of binocular disparity Vis Res 37 1811–27 [11.4.1d, 11.10.1a, 11.10.1b] Qian N, Andersen RA, Adelson EH (1994a) Transparent motion perception as detection of unbalanced motion signals 1 Psychophysics J Neurosci 14 7357–66 [22.3.2] Qian N, Andersen RA, Adelson EH (1994b) Transparent motion perception as detection of unbalanced motion signals. III. 
Modeling J Neurosci 14 7381–92 [22.3.2] Qiu FT, von der Heydt R (2005) Figure and ground in the visual cortex: V2 combines stereoscopic cues with gestalt rules Neuron 47 155–66 [11.5.1] Quam LH (1987) Hierarchical warp stereo In Readings in computer vision (ed MA Fischler, O Firschein) pp 80–86 Los Altos, California [18.7.2e] Quick RF (1974) A vector–magnitude model of contrast detection Kybernetik 16 65–7 [13.1.1b] Radonjic A, Todorovic D, Gilchrist A (2010) Adjacency and surroundedness in the depth effect on lightness J Vis 10(9) 12 [22.4.3b] Rady AA, Ishak IGH (1955) Relative contributions of disparity and convergence to stereoscopic acuity J Opt Soc Am 45 530–4 [18.5.4a, 18.10.2a] Raghunandan A, Anderson CS, Saladin JJ (2009) Spatial scaling of the binocular capture effect Optom Vis Sci [16.7.4a] Ramachandran VS (1973) Apparent movement with subjective contours Vis Res 13 1399–401 [16.4.2a] Ramachandran VS (1975) Suppression of apparent movement during binocular rivalry Nature 256 122–3 [12.5.4a] Ramachandran VS (1976) Learning–like phenomena in stereopsis Nature 292 382–4 [18.14.2a, 18.14.2d, 18.14.2f ] Ramachandran VS (1986) Capture of stereopsis and apparent motion by illusory contours Percept Psychophys 39 361–73 [22.2.4a, 22.2.4b] Ramachandran VS (1987) Visual perception of surfaces: a biological approach. In The perception of illusory contours (ed S Petry, GE Meyer) pp 93–108 Springer–Verlag , New York [22.2.4a] Ramachandran VS (1991) Form motion and binocular rivalry Science 251 950–1 [13.3.3d] Ramachandran VS, Anstis SM (1990) Illusory displacement of equiluminous kinetic edges Perception 19 611–16 [16.4.2a] Ramachandran VS, Braddick OL (1973) Orientation–specific learning in stereopsis Perception 2 371–6 [18.14.2d] Ramachandran VS, Cavanagh P (1985) Subjective contours capture stereopsis Nature 317 527–30 [22.2.4b] Ramachandran VS, Nelson JI (1976) Global grouping overrides point–to–point disparities Perception 5 125–8 [20.2.1]
Ramachandran VS, Sriram S (1972) Stereopsis generated with Julesz patterns in spite of rivalry imposed by colour filters Nature 237 347–8 [12.5.4a, 15.3.8a] Ramachandran VS, Rao VM, Vidyasagar TR (1973a) The role of contours in stereopsis Nature 242 412–14 [17.1.2a, 17.1.3] Ramachandran VS, Rao VM, Sriram S, Vidyasagar TR (1973b) The role of colour perception and “pattern” recognition in stereopsis Vis Res 13 505–9 [15.3.3] Ramachandran VS, Cobb S, Levi L (1994a) The neural locus of binocular rivalry and monocular diplopia in intermittent exotropes Neuroreport 5 1141–44 [12.3.8b, 14.4.2] Ramachandran VS, Cobb S, Levi L (1994b) Monocular double vision in strabismus Neuroreport 5 1418 [12.3.8b, 14.4.2] Ramamurthy M, Bedell HE, Patel SS (2005) Stereothresholds for moving line stimuli for a range of velocities Vis Res 45 789–99 [18.10.1b] Ramón y Cajal S (1901) Recreaciones estereoscópicas y binoculares La Fotografía 27 41–8 [24.1.2a, 24.1.5] Ramón y Cajal S (1911) Histologie du système nerveux de l’homme et des vertébrés A Maloine, Paris [11.1.2] Rao VM (1977) Tilt illusion during binocular rivalry Vis Res 17 327–8 [12.6.3] Rashbass C (1970) The visibility of transient changes of luminance J Physiol 210 165–86 [13.1.6c] Ratliff F (1965) Mach Bands: Quantitative studies on neural networks in the retina Holden–Day, San Francisco [22.4.1] Rauschecker JP, Campbell FW, Atkinson J (1973) Colour opponent neurones in the human visual system Nature 245 42–3 [12.3.8a] Rawlings SC, Shipley T (1969) Stereoscopic acuity and horizontal angular distance from fixation J Opt Soc Am 59 991–3 [14.6.2a, 18.6.1a] Raymond JE (1993) Complete interocular transfer of motion adaptation effects on motion coherence thresholds Vis Res 33 1865–70 [13.3.3a, 13.3.3b] Read JCA (2005) Early computational processing in binocular vision and depth perception Prog Biophys Molec Biol 87 77–108 [11.9.2] Read JCA, Cumming BG (2003) Testing quantitative models of binocular disparity selectivity in primary 
visual cortex J Neurophysiol 90 2795–817 [11.4.1d] Read JCA, Cumming BG (2004) Ocular dominance predicts neither strength nor class of disparity selectivity with random-dot stimuli in primate V1 J Neurophysiol 91 1271–81 [11.3.1, 11.10.1a] Read JCA, Cumming BG (2005a) All Pulfrich-like illusions can be explained without joint encoding of motion and disparity J Vis 5(11) Article 1 [23.3.2] Read JCA, Cumming BG (2005b) The stroboscopic Pulfrich effect is not evidence for the joint encoding of motion and depth J Vis 5 417–34 c18 [15.3.9, 23.3.2] Read JCA, Cumming BG (2006) Does depth perception require vertical-disparity detectors? J Vis 6 1323–55 [11.4.4, 11.10.1c, 20.2.5] Read JCA, Cumming BG (2007) Sensors for impossible stimuli may solve the stereo correspondence problem Nat Neurosci 10 1322–28 [11.10.1b, 11.10.1c] Read JCA, Eagle RA (2000) Reversed stereo depth and motion direction with anti-correlated stimuli Vis Res 40 3345–58 [15.3.7d] Read JCA, Parker AJ, Cumming BG (2002) A simple model accounts for the reduced response of disparity-tuned V1 neurons to anticorrelated images Vis Neurosci 19 735–53 [11.4.1d, 11.10.1c] Read JCA, Phillipson GP, Glennerster A (2009) Latitude and longitude vertical disparities J Vis 9(13) Article 11 [19.6.2] Reading RW (1983) Possible alterations in correspondence associated with asymmetric convergence Ophthal Physiol Opt 3 121–7 [14.6.2a] Reading RW, Tanlamai T (1980) The threshold of stereopsis in the presence of differences in magnification of the ocular images J Am Optom Assoc 51 593–5 [18.3.3a, 18.3.4]
Reading RW, Tanlamai T (1982) Finely graded binocular disparities from random-dot stereograms Ophthal Physiol Opt 2 47–56 [18.2.3b, 18.2.4] Reading RW, Woo GS (1972) Some of the time factors associated with stereopsis Am J Optom Arch Am Acad Optom 49 20–8 [18.5.4a] Reading VM (1973) An objective correlate of the Pulfrich stereo-illusion Proc R Soc Med 66 1043–4 [23.5] Reading VM (1975) Eye movements and the Pulfrich stereo-illusion J Physiol 246 40P [23.5] Redding GM, Lester CF (1980) Achromatic color matching as a function of apparent test orientation, test and background luminance, and lightness or brightness instructions Percept Psychophys 27 557–63 [22.4.3a] Reed MJ, Steinbach MJ, Anstis SM, et al. (1991) The development of optokinetic nystagmus in strabismic and monocularly enucleated subjects Behav Brain Res 46 31–42 [22.6.1e] Reeves A, Peachey NS, Auerbach E (1986) Interocular sensitization to a rod-detected test Vis Res 26 1119–27 [13.2.2] Regan D (1973) Rapid objective refraction using evoked potentials Invest Ophthal 12 669–79 [13.1.8b] Regan D (1977) Speedy assessment of visual acuity in amblyopia by the evoked potential method Ophthalmologica 175 159–64 [13.1.8b] Regan D (1986) Form from motion parallax and form from luminance contrast: vernier discrimination Spat Vis 1 305–18 [16.4.1, 22.3.4] Regan D (1989a) Human brain electrophysiology: Evoked potentials and evoked magnetic fields in science and medicine Elsevier, New York [11.7] Regan D (1989b) Orientation discrimination for objects defined by relative motion and objects defined by luminance contrast Vis Res 29 1389–400 [16.4.1] Regan D, Hamstra SJ (1994) Shape discrimination for rectangles defined by disparity alone by disparity plus luminance and by disparity plus motion Vis Res 34 2277–91 [16.2.2a, 18.2.3a] Regan D, Spekreijse H (1970) Electrophysiological correlate of binocular depth perception in man Nature 225 92–4 [11.7] Regan D, Varney P, Purdy J, Kraty N (1976a) Visual field analyser: 
assessment of delay and temporal resolution of vision Med Biol Engin January 8–14 [23.7] Regan D, Milner BA, Heron JR (1976b) Delayed visual perception and delayed evoked potentials in the spinal form of multiple sclerosis and in retrobulbar neuritis Brain 99 43–66 [23.7] Regan D, Erkelens CJ, Collewijn H (1986) Necessary conditions for the perception of motion in depth Invest Ophthal Vis Sci 27 584–97 [18.3.2a] Regan MP, Regan D (1988) A frequency domain technique for characterizing nonlinearities in biological systems J Theor Biol 133 293–317 [13.1.8b] Regan MP, Regan D (1989) Objective investigation of visual function using a nondestructive zoom–FFT technique for evoked potential analysis Can J Neurol Sci 16 168–79 [13.1.8b] Reichardt W (1961) Autocorrelation, a principle for the evaluation of sensory information by the central nervous system In Sensory communication (ed WA Rosenblith) pp 303–318 Wiley, New York [16.4.1] Reimann D, Haken H (1994) Stereo vision by self organization Biol Cyber 71 17–29 [15.3.5] Reinecke RD, Simons K (1974) A new stereoscopic test for amblyopia screening Am J Ophthal 78 714–21 [18.2.3c] Reinhardt-Rutland AH (1999) The framing effect with rectangular and trapezoidal surfaces: actual and pictorial surface slant, frame orientation, and viewing condition Perception 28 1361–71 [24.1.7] Richards W (1966) Attenuation of the pupil response during binocular rivalry Vis Res 6 239–40 [12.5.1] Richards W (1970) Stereopsis and stereoblindness Exp Brain Res 10 380–8 [21.6.2a] Richards W (1971) Independence of Panum’s near and far limits Am J Optom Arch Am Acad Optom 48 103–9 [12.1.1a]
Richards W (1972) Response functions for sine– and square–wave modulations of disparity J Opt Soc Am 62 907–11 [11.4.2, 21.3.1, 21.3.5] Richards W, Foley JM (1971) Interhemispheric processing of binocular disparity J Opt Soc Am 61 419–21 [18.4.1a, 18.4.1c] Richards W, Foley JM (1974) Effect of luminance and contrast on processing large disparities J Opt Soc Am 64 1703–5 [18.4.1a] Richards W, Foley JM (1981) Spatial bandwidth of channels for slant estimated from complex gratings J Opt Soc Am 71 274–9 [18.8.2c] Richards W, Kaye MG (1974) Local versus global stereopsis: two mechanisms Vis Res 14 1345–7 [18.4.1d] Richards W, Lieberman HR (1985) Correlation between stereo ability and the recovery of structure–from–motion Am J Optom Physiol Opt 62 111–18 [22.3.4] Richards W, Regan D (1973) A stereo field map with implications for disparity processing Invest Ophthal Vis Sci 12 904–9 [18.6.1a] Ridder WH, Smith EL, Manny RE, et al. (1992) Effects of interocular suppression on spectral sensitivity Optom Vis Sci 69 227–36 [12.3.2f] Riggs LA, Whittle P (1967) Human occipital and retinal potentials evoked by subjectively faded visual stimuli Vis Res 7 441–51 [12.9.2e] Ripamonti C, Bloj M, Hauk R, et al. 
(2004) Measurement of the effect of surface slant on perceived lightness J Vis 4 747–63 [22.4.3a] Ritter AD, Breitmeyer BG (1989) The effects of dichoptic and binocular viewing on bistable motion percepts Vis Res 29 1215–19 [16.4.2e] Ritter M (1977) Effect of disparity and viewing distance on perceived depth Percept Psychophys 22 400–7 [20.6.3a] Ritter M (1979) Perception of depth: processing of simple positional disparity as a function of viewing distance Percept Psychophys 25 209–14 [20.6.3a] Rivest J, Cavanagh P, Lassonde M (1994) Interhemispheric depth judgments Neuropsychologia 32 69–76 [11.9.2] Robertson VM, Fry GA (1937) After–images observed in complete darkness Am J Psychol 49 265–76 [13.3.5] Robinson DL, Petersen SE (1992) The pulvinar and visual salience TINS 15 127–32 [11.2.1] Robinson DN (1968) Visual disinhibition with binocular and interocular presentation J Opt Soc Am 58 254–7 [13.2.7b] Robinson JO (1972) The psychology of visual illusions Hutchinson, London [16.7.4b] Robinson TR (1895) Experiments with Fechner’s paradoxon Am J Psychol 7 9–23 [13.1.4] Rock ML, Fox BH (1949) Two aspects of the Pulfrich phenomenon Am J Psychol 62 279–84 [23.4.1, 23.4.2a] Roelofs C, van der Waals HG (1935) Veränderung der haptischen und optischen Lokalisation bei optokinetischer Reizung Z Psychol 136 5–49 [22.7.2] Roelofs CO (1959) Considerations on the visual egocentre Acta Psychol 16 229–34 [16.7.6a, 16.7.6b] Rogers BJ (1987) Motion disparity and structure-from-motion disparity Invest Ophthal Vis Sci 28 (Abs) 233 [17.1.5] Rogers BJ (1992) The perception and representation of depth and slant in stereoscopic surfaces In Artificial and biological vision systems (ed GA Orban, HH Nagel) pp 271–296 Springer-Verlag, Berlin [20.3.2a] Rogers BJ, Anstis SM (1972) Intensity versus adaptation and the Pulfrich stereophenomenon Vis Res 12 909–28 [23.1.1, 23.2.1, 23.2.2, 23.3.2, 23.4.1, 23.4.2a] Rogers BJ, Anstis SM (1975) Reversed depth from positive and negative stereograms 
Perception 4 193–201 [15.3.7b, 18.8.2b] Rogers BJ, Bradshaw MF (1993) Vertical disparities, differential perspective and binocular stereopsis Nature 361 253–5 [19.6.2, 20.6.3c, 20.6.5a] Rogers BJ, Bradshaw MF (1995) Disparity scaling and the perception of frontoparallel surfaces Perception 24 155–79 [20.6.5a] Rogers BJ, Bradshaw MF (1999) Disparity minimisation, cyclovergence, and the validity of nonius lines as a technique for measuring torsional alignment Perception 28 127–41 [14.6.1c]
Rogers B, Brecher K (2007) “Straight lines, ‘uncurved lines’, and Helmholtz’s ‘great circles on the celestial sphere’” Perception 36 1275–89 [14.3.1c] Rogers BJ, Cagenello R (1989) Disparity curvature and the perception of three–dimensional surfaces Nature 339 135–7 [11.6.3, 18.6.6, 19.5, 20.5.2] Rogers BJ, Graham ME (1983) Anisotropies in the perception of three–dimensional surfaces Science 221 1409–11 [18.12.1b, 20.4.1a, 20.4.2, 21.4.2b, 21.4.2e, 21.5.2] Rogers BJ, Graham ME (1985) Motion parallax and the perception of three–dimensional surfaces In Brain mechanisms and spatial vision (ed D Ingle, M Jeannerod, D Lee) pp 95–111 Martinus Nijhoff, The Hague [20.3.1d, 21.6.3a, 21.6.3b, 21.6.4] Rogers BJ, Howard IP (1991) Differences in the mechanisms used to extract 3–D slant from disparity and motion parallax cues Invest Ophthal Vis Sci 32 (Abs) 695 [19.6.1] Rogers BJ, Koenderink J (1986) Monocular aniseikonia: a motion parallax analogue of the disparity-induced effect Nature 322 62–3 [20.2.4b] Rogers BJ, Steinbach MJ, Ono H (1974) Eye movements and the Pulfrich phenomenon Vis Res 14 181–5 [23.3.4, 23.5] Rogers BJ, Cagenello R, Rogers S (1988) Simultaneous contrast effects in stereoscopic surfaces: the role of tilt slant and surface discontinuities Quart J Exp Psychol 40A 417 [21.4.2c, 21.5.2] Rogers BJ, Bradshaw MF, Glennerster A (1993) Differential perspective disparity scaling and the perception of fronto-parallel surfaces Invest Ophthal Vis Sci 34 (Abs.) 1438 [20.6.5a] Rogers BJ, Bradshaw MF, Gillam B (1995) The induced effect does not scale with viewing distance Perception 24 (Suppl) 33 [20.2.3c] Rogers DC, Hollins M (1982) Is the binocular rivalry mechanism tritanopic? 
Vis Res 22 515–20 [12.3.2e] Rohaly AM, Wilson HR (1993) Nature of coarse-to-fine constraints on binocular fusion J Opt Soc Am A 10 2733–41 [18.7.2e] Rohaly AM, Wilson HR (1994) Disparity averaging across spatial scales Vis Res 34 1315–25 [18.8.2c] Rohaly AM, Wilson HR (1999) The effects of contrast on perceived depth and depth discrimination Vis Res 39 9–18 [18.7.3b] Rokers B, Cormack LK, Huk C (2009) Disparity- and velocity-based signals for three-dimensional motion perception in human MT+ Nat Neurosci 12 1050–5 [11.5.2a] Rommetveit R, Toch H, Svendsen D (1968) Semantic syntactic and associative context effects in a stereoscopic rivalry situation Scand J Psychol 9 145–9 [12.8.3a] Rose D (1978) Monocular versus binocular contrast thresholds for movement and pattern Perception 7 195–200 [13.1.6c] Rose D (1980) The binocular: monocular sensitivity ratio for movement detection varies with temporal frequency Perception 9 577–80 [13.1.6c] Rose D, Blake R (1988) Mislocation of diplopic images J Opt Soc Am A 5 1512–21 [16.7.3a] Rose D, Blake R, Halpern DL (1988) Disparity range for binocular summation Invest Ophthal Vis Sci 29 283–90 [13.1.2d] Rose D, Bradshaw MF, Hibbard PB (2003) Attention affects the stereoscopic depth aftereffect Perception 32 635–40 [21.6.2a] Rosenbluth D, Allman JM (2002) The effect of gaze angle and fixation distance on the responses of neurons in V1, V2, and V4 Neuron 33 143–9 [11.4.6a, 11.4.6b] Rosenfeld A, Vanderbrug GJ (1977) Coarse–fine template matching IEEE Tr Man Mach Cybern 7 104–7 [15.4.2] Rosner J, Clift GD (1984) The validity of the Frisby stereotest as a measure of precise stereoacuity J Am Optom Assoc 55 505–6 [18.2.1e] Ross J (1974) Stereopsis by binocular delay Nature 248 363–4 [23.3.1, 23.6.1, 23.6.2] Ross J (1976) The resources of perception Sci Amer 234 80–6 [23.6.1, 23.6.2] Ross J, Hogben JH (1974) Short–term memory in stereopsis Vis Res 14 1195–201 [18.12.2a, 23.2.2, 23.6.1]
Ross J, Hogben JH (1975) The Pulfrich effect and short-term memory in stereopsis Vis Res 15 1289–90 [23.2.2] Roumes C, Planter J, Menu JP, Thorpe S (1997) The effects of spatial frequency on binocular fusion: from elementary to complex images Hum Factors 39 359–73 [12.1.2] Rovamo J, Virsu V (1979) An estimation and application of the human cortical magnification factor Exp Brain Res 37 495–510 [12.1.1d] Rovamo J, Virsu V, Laurinen P, Hyvarinen L (1982) Resolution of gratings oriented along and across meridians in peripheral vision Invest Ophthal Vis Sci 23 666–670 [13.1.2e] Roy JP, Komatsu H, Wurtz RH (1992) Disparity sensitivity of neurons in monkey extrastriate area MST J Neurosci 12 2778–92 [11.4.6a, 11.5.2a, 11.6.4] Rozhkova GI, Nickolayev PP, Shchadrin VE (1982) Perception of stabilized retinal stimuli in dichoptic viewing conditions Vis Res 22 293–302 [12.3.3a] Rubin E (1921) Figure and ground In Readings in perception (ed DC Beardslee, M Wertheimer) pp 194–203 Van Nostrand, Princeton NJ [22.1.1] Ruddock KH, Wigley E (1976) Inhibitory binocular interaction in human vision and a possible mechanism subserving stereoscopic fusion Nature 260 604–6 [13.2.6] Ruddock KH, Waterfield VA, Wigley E (1979) The response characteristics of an inhibitory binocular interaction in human vision J Physiol 290 37–49 [13.2.6] Rule JT (1941) The shape of stereoscopic images J Opt Soc Am 31 124–29 [24.1.7] Rumelhart DE, McClelland JL (1986) Parallel distributed processing MIT Press, Cambridge MA [11.10.2] Rushton D (1975) Use of the Pulfrich pendulum for detecting abnormal delay in the visual pathway in multiple sclerosis Brain 98 283–96 [23.7] Russell PW (1979) Chromatic input to stereopsis Vis Res 19 831–4 [17.1.4a] Rutstein RP, Marsh–Tootle W, Scheiman MM, Eskridge JB (1991) Changes in retinal correspondence after changes in ocular alignment Optom Vis Sci 68 325–30 [14.4.1e] Ryan C, Gillam B (1993) A proximity-contingent stereoscopic depth aftereffect: evidence for adaptation 
to disparity gradients Perception 22 403–18 [21.6.3b, 21.6.4] Ryan C, Gillam B (1994) Cue conflict and stereoscopic surface slant about horizontal and vertical axes Perception 23 645–58 [20.4.1d] Rychkova SI, Ninio J (2009) Paradoxical fusion of images and depth perception with a squinting eye Vis Res 49 530–5 [14.4.2] Sabrin HW, Kertesz AE (1983) The effect of imposed fixational eye movements on binocular rivalry Percept Psychophys 34 155–7 [12.3.6a] Sachsenweger R (1958) Sensorische Fusion und Schielen Graefe’s Arch Klin Exp Ophthal 159 502–28 [16.7.3b] Sachtler WLB, Gillam B (2007) The stereoscopic sliver: a comparison of duration thresholds for fully stereoscopic and unmatched versions Perception 36 135–44 [17.3] Sagawa K (1981) Minimum light intensity required for color rivalry Vis Res 21 1467–74 [12.3.2e] Sagawa K (1982) Dichoptic color fusion studied with wavelength discrimination Vis Res 22 945–52 [12.2.2] Saito H, Yukie M, Tanaka K, et al. (1986) Integration of direction signals of image motion in the superior temporal sulcus of the macaque monkey J Neurosci 6 145–57 [11.6.4] Sakai K, Ogiya M, Hirai Y (2005) Perception of depth and motion from ambiguous binocular information Vis Res 45 2471–80 [23.3.5] Sakane I (1994) The random-dot stereogram and its contemporary significance: new directions in perceptual art In Stereogram pp 73–82 Cadence Books, San Francisco [24.1.6] Sakano Y, Ando H (2010) Effect of head motion and stereo viewing on perceived glossiness J Vis 10(9) 15 [17.1.6]
Sakata H, Shibutani H, Kawano K (1980) Spatial properties of visual fixation neurons in posterior parietal association cortex of the monkey J Neurophysiol 43 1654–72 [11.4.6a] Sakata H, Taira M, Kusunoki TM, et al. (1999) Neural representation of three-dimensional features of manipulation objects with stereopsis Exp Brain Res 128 160–9 [11.5.2b] Saladin JJ (1995) Effects of heterophoria on stereopsis Optom Vis Sci 72 487–92 [18.10.3b] Sanger TD (1988) Stereo disparity computation using Gabor filters Biol Cyber 59 405–18 [11.4.3a, 11.10.1b, 15.2.1d] Sarmiento RF (1975) The stereoacuity of macaque monkey Vis Res 15 493–8 [18.3.1] Sáry G, Vogels R, Kovács G, Orban GA (1995) Responses of monkey inferior temporal neurons to luminance-, motion-, and texture-defined gratings J Neurophysiol 73 1341–54 [11.5.3b] Sasaki H, Gyoba J (2002) Selective attention to stimulus features modulates interocular suppression Perception 31 409–19 [12.8.2] Sasaki KS, Tabuchi Y, Ohzawa I (2010) Complex cells in the cat striate cortex have multiple disparity detectors in the three-dimensional binocular receptive fields J Neurosci 30 13826–37 [11.4.1g] Sato M, Howard IP (2001) Effects of disparity-perspective cue conflict on depth contrast Vis Res 41 415–26 [21.4.3] Savoy RL (1984) “Extinction” of the McCollough effect does not transfer interocularly Percept Psychophys 36 571–6 [13.3.5] Saxby G (1988) Practical holography Prentice Hall, New York [24.1.4a] Saye A (1976) Facilitation of stereopsis from a large disparity random-dot stereogram by various monocular features: further findings (A short note) Perception 5 461–5 [18.14.2c] Saye A, Frisby JP (1975) The role of monocularly conspicuous features in facilitating stereopsis from random–dot stereograms Perception 4 159–71 [15.2.2d, 18.14.2c] Scarfe P, Hibbard PB (2006) Disparity-defined objects moving in depth do not elicit three-dimensional shape constancy Vis Res 46 1599–610 [20.6.5d] Scharff LFV (1997) Decreases in the critical disparity 
gradient with eccentricity may reflect the size-disparity correlation J Opt Soc Am A 14 1205–12 [12.1.3c] Scharff LFV, Geisler WS (1992) Stereopsis at isoluminance in the absence of chromatic aberrations J Opt Soc Am A 9 868–76 [17.1.4b] Scheidt RA, Kertesz AE (1993) Temporal and spatial aspects of sensory interactions during human fusional response Vis Res 33 1259–70 [12.1.3b] Schein SJ, De Monasterio FM (1987) Mapping of retinal and geniculate neurons onto striate cortex of macaque J Neurosci 7 996–1009 [12.4.1] Schiff B, Cohen T, Raphan T (1988) Nystagmus induced by stimulation of the nucleus of the optic tract in the monkey Exp Brain Res 70 1–14 [22.6.1b] Schiller PH (1965) Monoptic and dichoptic visual masking by patterns and flashes J Exp Psychol 69 193–9 [13.2.7b] Schiller PH (1993) The effects of V4 and middle temporal (MT) area lesions on visual performance in the rhesus monkey Vis Neurosci 10 717–46 [11.5.3a] Schiller PH, Dolan RP (1994) Visual aftereffects and the consequences of visual system lesions on their perception in the rhesus monkey Vis Neurosci 11 643–65 [11.5.4] Schiller PH, Greenfield A (1969) Visual masking and the recovery phenomenon Percept Psychophys 6 182–4 [13.2.7b] Schiller PH, Smith M (1968) Monoptic and dichoptic metacontrast Percept Psychophys 3 237–9 [13.2.7b] Schiller PH, Wiener M (1962) Binocular and stereoscopic viewing of geometrical illusions Percept Mot Skills 15 739–47 [16.3.1] Schiller PH, Wiener M (1963) Monoptic and dichoptic masking J Exp Psychol 66 386–93 [13.2.7b] Schiller PH, Logothetis NK, Charles ER (1990) Role of the color-opponent and broad-band channels in vision Vis Neurosci 5 321–46 [11.5.4]
Schiller PH, Slocum WM, Weiner VS (2007) How the parallel channels of the retina contribute to depth processing Eur J Neurosci 26 1307–21 [11.5.4, 18.12.1b, 20.1.1] Schirillo JA, Shevell SK (1993) Lightness and brightness judgments of coplanar retinally noncontiguous surfaces J Opt Soc Am A 10 2742–52 [22.4.2] Schirillo J, Reeves A, Arend L (1990) Perceived lightness but not brightness of achromatic surfaces depends on perceived depth information Percept Psychophys 48 82–90 [22.4.3b] Schlerf J, Domini F, Caudek C (2004) 3D shape-contingent processing of luminance gratings Vis Res 44 1079–91 [22.4.4] Schlesinger BY, Yeshurun Y (1998) Spatial size limits in stereoscopic vision Spat Vis 11 279–93 [18.6.3b] Schlosberg H (1941) Stereoscopic depth from single pictures Am J Psychol 54 601–5 [24.1.7] Schmidt M, Schiff D, Bentivoglio M (1995) Independent efferent populations in the nucleus of the optic tract: an anatomical and physiological study in the rat and cat J Comp Neurol 360 271–85 [22.6.1a] Schmidt PP (1994) Vision screening with the RDE stereotest in pediatric populations Optom Vis Sci 71 273–81 [18.2.3c] Schmidt WC (1997) Artificial looming yields improved performance over lateral motion: implications for stereoscopic display techniques Hum Factors 39 352–8 [22.3.5] Schneider B, Moraglia G (1992) Binocular unmasking with unequal interocular contrast: the case for multiple cyclopean eyes Percept Psychophys 52 639–60 [13.2.4b] Schneider B, Moraglia G (1994) Binocular vision enhances target detection by filtering the background Perception 23 1267–86 [13.1.2f] Schneider B, Moraglia G, Jepson A (1989) Binocular unmasking: an analogue to binaural unmasking Science 243 1479–81 [13.2.4b] Schneider B, Moraglia G, Speranza F (1999) Binocular vision enhances phase discrimination by filtering the background Percept Psychophys 61 468–89 [13.2.4b] Schneider SW, Sritharan KC, Geibel JP, et al. 
(1997) Surface dynamics in living acinar cells imaged by atomic force microscopy: identification of plasma membrane structures involved in exocytosis Proc Natl Acad Sci 94 316–21 [24.2.3f] Schoessler JP (1980) Disparity-induced vergence responses in normal and strabismic subjects Am J Optom Physiol Opt 57 666–75 [14.4.1d] Schoppmann A (1985) Functional and developmental analysis of a visual corticopretectal pathway in the cat: neuroanatomical and electrophysiological study Exp Brain Res 60 363–74 [22.6.1b] Schor CM (1991) Binocular sensory disorders In Vision and visual dysfunction Vol 9 Binocular vision (ed D Regan) pp 179–223 MacMillan, London [14.4.1b, 14.4.1e] Schor CM, Badcock DR (1985) A comparison of stereo and vernier acuity within spatial channels as a function of distance from fixation Vis Res 25 1113–19 [18.11] Schor CM, Heckmann T (1989) Interocular differences in contrast and spatial frequency: effects on stereopsis and fusion Vis Res 29 837–47 [18.5.4a] Schor CM, Howarth PA (1986) Suprathreshold stereo–depth matches as a function of contrast and spatial frequency Perception 15 249–58 [18.7.3a, 18.7.3b] Schor CM, Levi DM (1980) Disturbances of small–field horizontal and vertical optokinetic nystagmus in amblyopia Invest Ophthal Vis Sci 19 668–83 [22.6.1e] Schor CM, Tyler CW (1981) Spatio–temporal properties of Panum’s fusional area Vis Res 21 683–92 [12.1.4] Schor CM, Wood I (1983) Disparity range for local stereopsis as a function of luminance spatial frequency Vis Res 23 1649–54 [18.4.1d, 18.6.4, 18.7.2a, 18.7.3a] Schor CM, Wood IC, Ogawa J (1984a) Binocular sensory fusion is limited by spatial resolution Vis Res 24 661–5 [12.1.2, 12.4.2] Schor CM, Wood IC, Ogawa J (1984b) Spatial tuning of static and dynamic local stereopsis Vis Res 24 573–8 [15.3.6]
Schor CM, Landsman L, Erickson P (1987) Ocular dominance and the interocular suppression of blur in monovision Am J Optom Physiol Opt 64 723–30 [12.3.2b] Schor CM, Heckmann T, Tyler CW (1989) Binocular fusion limits are independent of contrast luminance gradient and component phases Vis Res 29 821–35 [12.1.2] Schor CM, Edwards M, Pope DR (1998) Spatial-frequency and contrast tuning of the transient-stereopsis system Vis Res 38 3057–68 [15.3.5, 15.3.6] Schor CM, Edwards M, Sato M (2001) Envelope size tuning for stereodepth perception of small and large disparities Vis Res 41 2555–67 [18.12.3] Schoups AA, Orban GA (1996) Interocular transfer in perceptual learning of a pop-out discrimination task Proc Natl Acad Sci 93 7358–62 [13.4.1] Schoups AA, Vogels R, Orban GA (1995) Human perceptual learning in identifying the oblique orientation: retinotopy orientation specificity and monocularity J Physiol 483 797–810 [13.4.1] Schreiber KM, Tweed DB (2003) Influence of eye position on stereo matching Strabismus 11 9–16 [15.3.10] Schreiber KM, Crawford JD, Fetter M, Tweed D (2001) The motor side of depth vision Nature 410 819–22 [15.3.10] Schreiber KM, Tweed DB, Schor CM (2006) The extended horopter: quantifying retinal correspondence across changes of 3D eye position J Vis 6 64–74 [14.5.2f] Schreiber KM, Hillis JM, Filippini HR, et al. (2008) The surface of the empirical horopter J Vis 8 1–20 [14.7] Schrödinger E (1926) Die Gesichtsempfindungen In Mueller–Pouillet’s Lehrbuch der Physik (11th edn) Book 2 Part 1 pp 456–560 Vieweg, Braunschweig [13.1.4b] Schumann F (1900) Beiträge zur Analyse der Gesichtswahrnehmungen: 1. Einige Beobachtungen über die Zusammenfassung von Gesichtseindrücken zu Einheiten Z Psychol Physiol Sinnesorg 23 1–32. 
English translation by A Hogg (1987) In The perception of illusory contours (ed S Petry, GE Meyer) pp 21–34 Springer, New York [22.2.4a] Schumer RA, Ganz L (1979) Independent stereoscopic channels for different extents of spatial pooling Vis Res 19 1303–14 [18.6.3e, 21.5.1, 21.6.4] Schumer RA, Julesz B (1984) Binocular disparity modulation sensitivity to disparities offset from the plane of fixation Vis Res 24 533–42 [18.6.3b, 18.6.4, 21.5.1] Schwartz AH (1971) Perception with single pictures Optical Spectra 5 25–7 [24.1.7] Schwarz W (1993) Coincidence detectors and two–pulse visual temporal integration: new theoretical results and comparison data Biol Cyber 69 173–82 [13.1.1d] Scott TR, Wood DZ (1966) Retinal anoxia and the locus of the aftereffect of motion Am J Psychol 79 435–42 [13.3.3b] Scott-Samuel NE, Georgeson MA (1999) Does an early non-linearity account for second-order motion? Vis Res 39 2853–65 [18.7.2d] Seiffert AE, Cavanagh P (1998) Position displacement, not velocity, is the cue to motion detection of second-order stimuli Vis Res 38 3569–82 [16.5.2] Sekuler R, Pantle A, Levinson E (1978) Physiological basis of motion perception In Handbook of sensory physiology (ed R Held, HW Leibowitz, HL Teuber) Vol VIII pp 67–98 Springer–Verlag, Berlin [13.3.3f] Sen DK, Singh B, Mathur GP (1980) Torsional fusional vergences and assessment of cyclodeviation by synoptophore method Br J Ophthal 64 354–7 [12.1.5] Sengpiel F, Blakemore C (1994) Interocular control of neuronal responsiveness in cat visual cortex Nature 368 847–50 [12.9.2b] Sengpiel F, Vorobyov V (2005) Intracortical origins of interocular suppression in the visual cortex J Neurosci 25 6394–400 [12.9.2b] Sengpiel F, Klauer S, Hoffmann PK (1990) Effects of early monocular deprivation on properties and afferents of nucleus of the optic tract in the ferret Exp Brain Res 83 190–9 [22.6.1b]
Sengpiel F, Blakemore C, Kind PC, Harrad R (1994) Interocular suppression in the visual cortex of strabismic cats J Neurosci 14 6855–71 [12.9.2b] Sengpiel F, Blakemore C, Harrad R (1995a) Interocular suppression in the primary visual cortex: a possible neural basis of binocular rivalry Vis Res 35 179–95 [12.9.2b] Sengpiel F, Freeman TCB, Blakemore C (1995b) Interocular suppression in cat striate cortex is not orientation selective Neuroreport 6 2235–9 [12.9.1, 12.9.2b] Sereno ME, Trinath T, Augath M, Logothetis NK (2002) Three-dimensional shape representation in monkey cortex Neuron 33 635–52 [11.8.1] Serope K, Schmid SR (2006) Manufacturing engineering and technology 5th edit. pp 586–7 Pearson Prentice Hall, New Jersey [24.2.5] Serrano-Pedraza I, Read JCA (2009) Stereo vision requires an explicit encoding of vertical disparity J Vis 9(4) Article 3 [11.4.4, 11.10.1c, 20.2.5] Serrano-Pedraza I, Read JCA (2010) Multiple channels for horizontal, but only one for vertical corrugations? A new look at the stereo anisotropy J Vis 10(2) 10 [20.4.2] Serrano-Pedraza I, Phillipson GP, Read JCA (2010) A specialization for vertical disparity discontinuities J Vis 10(3) Article 2 [20.2.4a] Shadlen M, Carney T (1986) Mechanisms of human motion perception revealed by a new cyclopean illusion Science 232 95–7 [12.5.4a, 16.4.2a] Shapley R, Victor JD (1978) The effect of contrast on the transfer properties of cat retinal ganglion cells J Physiol 285 275–98 [11.4.1f] Shattuck S, Held R (1975) Color and edge sensitive channels converge on stereo-depth analyzers Vis Res 15 309–11 [13.3.5] Shebilske WL, Nice DS (1976) Optical insignificance of the nose and the Pinocchio effect in free–scan visual straight–ahead judgments Percept Psychophys 20 17–20 [16.7.1] Sheedy JE, Fry GA (1979) The perceived direction of the binocular image Vis Res 19 201–11 [16.7.3a] Sheedy JE, Bailey IL, Buri M, Bass E (1986) Binocular vs monocular task performance Am J Optom Physiol Opt 63 839–46 [20.1.1] 
Sheinberg DL, Logothetis NK (1997) The role of temporal cortical areas in perceptual organization Proc Natl Acad Sci 94 3408–13 [12.9.2a] Sheni DD, Remole A (1986) Field of vergence limits Am J Optom Physiol Opt 63 252–8 [14.1] Shepard RN, Cooper LA (1982) Mental images and their transformation MIT Press, Cambridge MA [16.2.2b] Sher LD (1993) The oscillating-mirror technique for realizing true 3D In Stereo computer graphics and other true 3D technologies (ed DF McAllister) pp 196–213 Princeton University Press, Princeton NJ [24.1.4b] Sherrington CS (1904) On binocular flicker and the correlation of activity of corresponding retinal points Br J Psychol 1 29–60 [12.7.1, 13.1.4, 13.1.5] Shevell SK, Miller PR (1996) Color perception with test and adapting lights perceived in different depth planes Vis Res 36 949–54 [22.4.6] Shiffrar M, Li X, Lorenceau J (1995) Motion integration across differing image features Vis Res 35 2137–46 [22.3.1] Shimojo S, Nakajima Y (1981) Adaptation to the reversal of binocular depth cues: effects of wearing left–right reversing spectacles on stereoscopic depth perception Perception 10 391–402 [21.6.2g] Shimojo S, Nakayama K (1990a) Real world occlusion constraints and binocular rivalry Vis Res 30 69–80 [17.2.3] Shimojo S, Nakayama K (1990b) Amodal representation of occluded surfaces: role of invisible stimuli in apparent motion correspondence Perception 19 285–99 [22.5.3e] Shimojo S, Silverman GH, Nakayama K (1988) An occlusion–related mechanism of depth perception based on motion and interocular sequence Nature 333 265–8 [23.3.5] Shimojo S, Silverman GH, Nakayama K (1989) Occlusion and the solution to the aperture problem for motion Vis Res 29 619–29 [22.3.1] Shimono K, Wade NJ (2002) Monocular alignment in different depth planes Vis Res 42 1127–35 [16.7.4a]
Shimono K, Ono H, Saida S, Mapp AP (1998) Methodological caveats for monitoring binocular eye position with nonius stimuli Vis Res 38 591–600 [14.6.1c] Shimono K, Tam J, Nakamizo S (1999) Wheatstone-Panum limiting case: occlusion, camouflage, and vergence-induced disparity cues Percept Psychophys 61 445–55 [17.6.2] Shimono K, Tam WJ, Asakura N, Ohmi M (2005) Localization of monocular stimuli in different depth planes Vis Res 45 2631–2641 [16.7.4a] Shimono K, Tam WJ, Ono H (2007) Apparent motion of monocular stimuli in different depth planes with lateral head movements Vis Res 47 1027–35 [16.7.4a] Shioiri S, Hatori T, Yaguchi H, Kubo S (1994) Spatial frequency channels for stereoscopic depth Optical Review 1 311–13 [18.7.4] Shipley T (1961) An experimental study of the frontal reference curves of binocular visual space Doc Ophthal 15 321–50 [14.6.1e] Shipley T (1971) The first random–dot texture stereogram Vis Res 11 1491–2 [17.1.1c] Shipley T, Rawlings SC (1970a) The nonius horopter. I. History and theory Vis Res 10 1225–62 [14.6.2a] Shipley T, Rawlings SC (1970b) The nonius horopter. II. 
An experimental report Vis Res 10 1293–99 [14.6.2a] Shipley WC, Kenney FA, King ME (1945) Beta apparent movement under binocular monocular and interocular stimulation Am J Psychol 58 545–9 [16.4.2a] Shippman S, Cohen KR (1983) Relationship of heterophoria to stereopsis Arch Ophthal 101 609–10 [18.6.4] Shorter S, Patterson R (2001) The stereoscopic (cyclopean) motion aftereffect is dependent upon the temporal frequency of adapting motion Vis Res 41 1809–16 [16.5.3a] Shorter S, Bowd C, Donnelly M, Patterson R (1999) The stereoscopic (cyclopean) motion aftereffect is selective for spatial frequency and orientation of disparity modulation Vis Res 39 3745–51 [16.5.3a] Shortess GK, Krauskopf J (1961) Role of involuntary eye movements in stereoscopic acuity J Opt Soc Am 51 555–9 [18.10.1a, 18.12.1a] Siderov J, Harwerth RS (1993a) Effects of the spatial frequency of test and reference stimuli on stereo–thresholds Vis Res 33 1545–51 [18.7.2b] Siderov J, Harwerth RS (1993b) Precision of stereoscopic depth perception from double images Vis Res 33 1553–60 [18.3.3a] Siderov J, Harwerth RS (1995) Stereopsis spatial frequency and retinal eccentricity Vis Res 35 2329–37 [18.3.3a, 18.6.1a] Siderov J, Harwerth RS, Bedell HE (1999) Stereopsis, cyclovergence and the backward tilt of the vertical horopter Vis Res 39 1347–57 [14.7] Siegel H, Duncan CP (1960) Retinal disparity and diplopia vs luminance and size of target Am J Psychol 73 280–4 [12.1.2] Silver MA, Logothetis NK (2004) Grouping and segmentation in binocular rivalry Vis Res 44 1675–92 [12.4.4b] Silver MA, Logothetis NK (2007) Temporal frequency and contrast tagging bias the type of competition in interocular switch rivalry Vis Res 47 532–43 [12.4.4a] Simmons DR (1998) The minimum contrast requirements for stereopsis Perception 27 1333–43 [18.5.3] Simmons DR, Kingdom FAA (1994) Contrast thresholds for stereoscopic depth identification with isoluminant and isochromatic stimuli Vis Res 34 2971–82 [17.1.4b] Simmons DR,
Kingdom FAA (1995) Differences between stereopsis with isoluminant and isochromatic stimuli J Opt Soc Am A 12 2094–2104 [17.1.4b] Simmons DR, Kingdom FAA (1997) On the independence of chromatic and achromatic stereopsis mechanisms Vis Res 37 1271–80 [17.1.4b] Simmons DR, Kingdom FAA (1998) On the binocular summation of chromatic contrast Vis Res 38 1063–71 [13.1.2g] Simmons DR, Kingdom FAA (2002) Interactions between chromatic- and luminance-contrast-sensitive stereopsis mechanisms Vis Res 42 1535–45 [17.1.4b]
Simonet P, Campbell MCW (1990a) The optical transverse chromatic aberration of the fovea of the human eye Vis Res 30 187–206 [17.8] Simonet P, Campbell MCW (1990b) Effect of illuminance on the directions of chromostereopsis and transverse chromatic aberration observed with natural pupils Ophthal Physiol Opt 10 271–9 [17.8] Simons K (1981) A comparison of the Frisby Random–Dot E TNO and Random Circles stereotests in screening and office use Arch Ophthal 99 446–52 [18.2.4] Simons K (1984) Effects on stereopsis of monocular versus binocular degradation of image contrast Invest Ophthal Vis Sci 25 987–9 [18.5.4a] Simons K , Elhatton K (1994) Artifacts in fusion and stereopsis testing based on red/green dichoptic image separation J Ped Ophthal Strab 31 290–7 [18.2.3b] Simons K , Reinecke RD (1974) A reconsideration of amblyopia screening and stereopsis Am J Ophthal 78 707–13 [18.2.2b] Simonsz HJ, Tonkelaar D (1990) 19th Century mechanical models of eye movements, Donders’ law, Listing’s law and Helmholtz’ direction circles Docum Ophthal 74 95–112 [14.3.1a] Simpson JI (1984) The accessory optic system Ann Rev Neurosci 7 13–41 [22.6.1a] Simpson T (1991) The suppression effect of simulated anisometropia Ophthal Physiol Opt 11 350–8 [12.3.2b] Simpson WA, Swanston MT (1991) Depth–coded motion signals in plaid perception and optokinetic nystagmus Exp Brain Res 86 447–50 [22.3.3] Sindermann F, Lüddeke H (1972) Monocular analogues to binocular contour rivalry Vis Res 12 763–72 [12.3.8a] Sireteanu R , Best J (1992) Squint-induced modification of visual receptive fields in the suprasylvian cortex of the cat: binocular interaction, vertical effect and anomalous correspondence Eur J Neurosci 4 235–42 [14.4.1c] Sireteanu R , Fronius M (1989) Different patterns of retinal correspondence in the central and peripheral visual field of strabismics Invest Ophthal Vis Sci 30 2023–33 [14.4.1a] Skrandies W (1991) Contrast and stereoscopic visual stimuli yield lateralized scalp potential 
fields associated with different neural generators EEG Clin Neurophysiol 78 274–83 [11.7] Skrandies W (1997) Depth perception and evoked brain activity: the influence of horizontal disparity and visual field location Vis Neurosci 14 527–32 [11.7] Skrandies W, Vomberg HE (1985) Stereoscopic stimuli activate different cortical neurones in man: electrophysiological evidence Int J Psychophysiol 2 293–6 [11.7] Slagsvold JE (1978) Pulfrich pendulum phenomenon in patients with a history of acute optic neuritis Acta Ophthal 56 817–29 [23.7] Sloan LL, Altman A (1954) Factors involved in several tests of binocular depth perception Arch Ophthal 52 527–44 [18.2.1a] Sloane AE, Gallagher JR (1945) Evaluation of stereopsis: a comparison of the Howard-Dolman and the Verhoeff test Arch Ophthal 34 357–9 [18.2.1b] Sloane ME, Blake R (1984) Selective adaptation of monocular and binocular neurons in human vision J Exp Psychol HPP 10 406–42 [13.2.6] Sloane ME, Blake R (1987) Perceptually unequal spatial frequencies do not yield stereoscopic tilt Percept Psychophys 42 569–75 [20.2.1] Smallman HS (1995) Fine-to-coarse scale disambiguation in stereopsis Vis Res 35 1047–60 [18.7.2e] Smallman HS, MacLeod DIA (1994) Size–disparity correlation in stereopsis at contrast threshold J Opt Soc Am A 11 2169–83 [18.5.2, 18.5.3, 18.7.2a] Smallman HS, MacLeod DIA (1997) Spatial scale interactions in stereo sensitivity and the neural representation of binocular disparity Perception 29 977–94 [18.7.2b, 18.7.2e, 21.6.3a] Smallman HS, McKee SP (1995) A contrast ratio constraint on stereo matching Proc R Soc B 290 295–71 [15.3.7a] Smith AT (1983) Interocular transfer of colour–contingent threshold elevation Vis Res 23 729–34 [13.3.5]
Smith AT, Jeffreys DA (1979) Evoked potential evidence for differences in binocularity between striate and prestriate regions of human visual cortex Exp Brain Res 36 375–80 [13.1.8b] Smith AT, Scott-Samuel NE (1998) Stereoscopic and contrast-defined motion in human vision Proc R Soc B 295 1573–81 [16.5.1] Smith AT, Wall MB (2008) Sensitivity of human visual cortical areas to the stereoscopic depth of a moving stimulus J Vis 8(10) Article 1 [11.8.2] Smith EL, Levi DM, Harwerth RS, White JM (1982) Color vision is altered during the suppression phase of binocular rivalry Science 218 802–4 [12.3.2f] Smith EL, Chino YM, Ni J, et al. (1997a) Binocular spatial phase tuning characteristics of neurons in the macaque striate cortex J Neurophysiol 78 351–65 [11.4.1f] Smith EL, Chino YM, Ni J, Cheng H (1997b) Binocular combination of contrast signals by striate cortical neurons in the monkey J Neurophysiol 78 366–82 [11.4.1f, 13.1.8a] Smith JR, Connell SD, Swift JA (1999) Stereoscopic display of atomic force microscope images using anaglyph techniques J Micros 196 347–51 [24.2.3f] Smith R (1738) A compleat system of opticks in four books Cambridge [14.2.2, 24.1.6] Smith S (1945) Utrocular or “which eye” discrimination J Exp Psychol 35 1–14 [16.8] Snowden P, Davies I, Rose D, Kaye M (1996) Perceptual learning of stereoacuity Perception 25 1043–52 [18.14.1] Snowden RJ (1992) Sensitivity to relative and absolute motion Perception 21 563–8 [13.3.3f, 22.7.3] Snowden RJ, Hammett ST (1992) Subtractive and divisive adaptation in the human visual system Nature 355 278–50 [12.9.2b] Snowden RJ, Rossiter MC (1999) Stereoscopic depth cues can segment motion information Perception 28 193–201 [22.3.5] Sobel EC, Collett TS (1991) Does vertical disparity scale the perception of stereoscopic depth?
Proc R Soc B 244 87–90 [20.6.3c] Sobel KV, Blake R (2002) How context influences predominance during binocular rivalry Perception 31 813–24 [12.4.3] Sobel KV, Blake R (2003) Subjective contours and binocular rivalry suppression Vis Res 43 1533–40 [12.3.3d] Sohn W, Seiffert AE (2006) Motion aftereffects specific to surface depth order: beyond binocular disparity J Vis 6 119–31 [22.5.4] Sokol S (1976) The Pulfrich stereo-illusion as an index of optic nerve dysfunction Survey Ophthal 20 432–4 [23.7] Solomon JA, Morgan MJ (1999) Dichoptically cancelled motion Vis Res 39 2293–7 [12.5.6] Solomons H (1975a) Derivation of the space horopter Br J Physiol Opt 30 56–80 [14.5, 14.5.2g] Solomons H (1975b) Properties of the space horopter Br J Physiol Opt 30 81–100 [14.5.2g] Somers WW, Hamilton MJ (1984) Estimation of the stereoscopic threshold utilizing perceived depth Ophthal Physiol Opt 4 275–50 [18.2.4] Sousa R , Brenner E, Smeets JBJ (2010) A new binocular cue for absolute distance: disparity relative to the most distant structure Vis Res 50 1786–92 [20.1.2] Sparks DL, Mays LE, Gurski MR , Hickey TL (1986) Long-term and short-term monocular deprivation in the rhesus monkey: effects on visual fields and optokinetic nystagmus J Neurosci 6 1771–80 [22.6.1b] Spehar B, Zaidi Q (1996) New configurational effects on perceived contrast and brightness: second-order White’s effects Perception 25 409–417 [22.4.5] Spehar B, Gilchrist A, Arend L (1995) The critical role of relative luminance relations in White’s effect and grating induction Vis Res 35 2903–14 [22.4.5] Spekreijse H, van der Tweel LH, Regan D (1972) Interocular sustained suppression: correlations with evoked potential amplitude and distribution Vis Res 12 521–6 [12.9.2e]
Sperling G (1965) Temporal and spatial masking. I. Masking by impulse flashes J Opt Soc Am 55 541–59 [13.2.3] Sperling G (1970) Binocular vision: a physical and a neural theory Am J Psychol 83 461–534 [11.10.1b, 15.2.1a] Sperry RW, Clark E (1949) Interocular transfer of visual discrimination habits in a teleost fish Physiol Zool 22 372–8 [13.4.2] SPIE (1992) Applications of artificial intelligence X: machine vision and robotics Proc Int Soc Opt Engin 1708 20 [24.2.6] Spiegler JB (1983) Distance, size and velocity changes during the Pulfrich effect Am J Optom Physiol Opt 60 902–7 [23.1.3] Spiegler JB (1986) Apparent path of a Pulfrich target as a function of the slope of its plane of motion: a theoretical note Am J Optom Physiol Opt 63 209–16 [23.1.2] Spillmann L (1993) The perception of movement and depth in moiré patterns Perception 22 287–308 [12.1.7, 24.1.3a] Spillmann L, Redies C (1981) Random–dot motion displaces Ehrenstein illusion Perception 10 411–15 [22.2.4b] Spottiswoode R, Spottiswoode N (1953) The theory of stereoscopic transmission and its application to the motion picture University of California Press, Berkeley CA [24.1.1] Spottiswoode R, Spottiswoode N, Smith C (1952) Basic principles of the three-dimensional film J Soc Motion Pict Televis Engin 59 279–86 [24.1.1] Spang K, Morgan M (2008) Cortical correlates of stereoscopic depth produced by temporal delay J Vis 8(9) Article 10 [11.8.2] Springbett BM (1961) Some stereoscopic phenomena and their implications Br J Psychol 52 105–9 [16.3.1] Squires PC (1956) Stereopsis produced without horizontally disparate stimulus loci J Exp Psychol 52 199–203 [17.7] Srebro R (1978) The visually evoked response: binocular facilitation and failure when binocular vision is disturbed Arch Ophthal 96 839–44 [13.1.8b] Srinivasan R, Russell DP, Edelman GM, Tononi G (1999) Increased synchronization of neuromagnetic responses during conscious perception J Neurosci 19 5435–48 [12.9.2e] Srivastava S, Orban GA, De Mazière
PA, Janssen P (2009) A distinct representation of three-dimensional shape in macaque anterior intraparietal area: fast, metric, and coarse J Neurosci 29 10613–26 [11.5.2b] Sroczynski SF (1990) Methods for obtaining high quality stereoscopic images of microscopic objects J Micros 157 163–79 [24.2.3a] St Cyr GF, Fender DH (1969) The interplay of drifts and flicks in binocular fixation Vis Res 9 275–65 [18.10.3a] Staller JD, Lappin JS, Fox R (1980) Stimulus uncertainty does not impair stereopsis Percept Psychophys 27 361–7 [18.14.2b] Stalmeier PFM, de Weert CMM (1988) Binocular rivalry with chromatic contours Percept Psychophys 44 456–62 [12.3.2e] Standing LG, Dodwell PC, Lang D (1968) Dark adaptation and the Pulfrich effect Percept Psychophys 4 118–20 [23.4.1, 23.4.2a] Starks M (1995) Stereoscopic imaging technology: A review of patents and the literature Int J Virtual Reality 1 2–25 [24.1.2e] Starr BS (1971) Veridical and paradoxical interocular transfer of left/ right mirror image discriminations Brain Res 31 377 [13.4.2] Steenblik RA (1993) Chromostereoscopy In Stereo computer graphics and other true 3D technologies (ed DF McAllister) pp 183–95 Princeton University Press, Princeton NJ [17.8] Stein BE, Magalháes-Castro B, Kruger L (1976) Relationship between visual and tactile representations in the cat superior colliculus J Neurophysiol 39 401–19 [11.2.3] Steinbach MJ, Howard IP, Ono H (1985) Monocular asymmetries in vision: we don’t see eye–to–eye Can J Psychol 39 476–8 [16.8] Steinbach MJ, Musarella MA, Gallie BL (1988) Extraocular muscle proprioception and visual function: psychophysical aspects In Strabismus and amblyopia: Experimental basis for advances in clinical management (ed G Lennerstrand, GK von Noorden, EC Campos) MacMillan, New York [16.7.5] Steiner V, Blake R , Rose D (1994) Interocular transfer of expansion rotation and translation motion aftereffects Perception 23 1197–202 [13.3.3b]
Steinman RM, Collewijn H (1980) Binocular retinal image motion during active head rotation Vis Res 20 415–29 [18.10.5] Steinman RM, Cushman WB, Martins AJ (1982) The precision of gaze Hum Neurobiol 1 97–109 [18.10.5] Steinman RM, Levinson JZ , Collewijn H, van der Steen J (1985) Vision in the presence of known natural retinal image motion J Opt Soc Am A 2 229–33 [18.10.5] Steinman SB (1987) Serial and parallel search in pattern vision? Perception 16 389–98 [22.8.2b] Stenton SP, Frisby JP, Mayhew JEW (1984) Vertical disparity pooling and the induced effect Nature 309 622–4 [20.2.4b] Stevenson SB, Cormack LK (2000) A contrast paradox in stereopsis, motion detection, and vernier acuity Vis Res 40 2881–4 [18.5.4a] Stevenson SB, Schor CM (1997) Human stereo matching is not restricted to epipolar lines Vis Res 37 2717–23 [18.4.2b] Stevenson SB, Cormack LK , Schor CM (1989) Hyperacuity superresolution and gap resolution in human stereopsis Vis Res 29 1597–605 [18.11] Stevenson SB, Cormack LK , Schor CM (1991) Depth attraction and repulsion in random dot stereograms Vis Res 31 805–13 [18.8.2c, 21.2] Stevenson SB, Cormack LK , Schor CM, Tyler CW (1992) Disparity tuning in mechanisms of human stereopsis Vis Res 32 1685–94 [11.4.2, 14.6.1b, 15.2.2d, 18.4.1e] Stevenson TJ, Sanford EC (1908) A preliminary report of experiments on time relations in binocular vision Am J Psychol 19 130–7 [18.12.2a] Stigmar G (1970) Observations on vernier and stereo acuity with special reference to their relationship Acta Ophthal 48 979–98 [18.11] Stigmar G (1971) Blurred visual stimuli. II. 
The effect of blurred visual stimuli on vernier and stereo acuity Acta Ophthal 49 364–79 [18.11] Stiles WS (1939) The directional sensitivity of the retina and the spectral sensitivities of the rods and cones Proc R Soc B 127 64–105 [13.2.7b] Stoner GR, Albright TD (1997) Luminance contrast affects motion coherency in plaid patterns by acting as a depth-from-occlusion cue Vis Res 38 387–401 [22.3.3] Stoner GR, Albright TD, Ramachandran VS (1990) Transparency and coherence in human motion perception Nature 344 153–5 [22.3.3] Stork DG, Rocca C (1989) Software for generating auto–random–dot stereograms Behav Res Meth Instrum Comput 21 525–34 [24.1.6] Stratton GM (1900) A new determination of the minimum visible and its bearing on localization and binocular depth Psychol Rev 7 429–35 [18.11] Stroh A (1886) On a new form of stereoscope Proc R Soc 40 317–19 [24.1.2e] Stromeyer CF (1978) Form–color aftereffects in human vision In Handbook of sensory physiology (ed H Teuber, R Held) Vol VII pp 97–142, Springer, New York [13.3.5] Stromeyer CF, Mansfield RJW (1970) Colored aftereffects produced with moving edges Percept Psychophys 7 108–14 [13.3.5] Stromeyer CF, Kronauer RE, Madsen JC, Klein SA (1984) Opponent-movement mechanisms in human vision J Opt Soc Am A 1 876–84 [22.3.2] Strong DS (1979) Leonardo on the eye Garland, New York [17.2.1] Stuart GW, Edwards M, Cook ML (1992) Colour inputs to random–dot stereopsis Perception 21 717–29 [17.1.4e] Stuit SM, Verstraten FAJ, Paffen CLE (2010) Saliency in a suppressed image affects the spatial origin of perceptual alternation during binocular rivalry Vis Res 50 1913–21 [12.3.5e] Stumpf C (1916) Binaurale Tonmischung, Mehrheitsschwelle und Mitteltonbildung Z Psychol 75 330–50 [11.1.1] Stumpf P (1911) Über die Abhängigkeit der Bewegungsempfindung und ihres negativen Nachbildes von den Reizvorgängen auf der Netzhaut (Vorläufige Mitteilung) Z Psychol 59 321–330 (Translated by D Todorovic) Perception 25 1235–42 [22.3.1]
Sturr JF, Teller DY (1973) Sensitization by annular surrounds: dichoptic properties Vis Res 13 909–18 [13.2.3] Sugie N (1982) Neural models of brightness perception and retinal rivalry in binocular vision Biol Cyber 43 13–21 [12.10, 13.1.4b] Sugita Y (1995) Contrast assimilation on different depth planes Vis Res 35 881–4 [22.4.5] Sullivan A (2004) 3-deep. New displays render images you can reach out and touch IEEE Spectrum May 30–5 [24.1.4b] Sumner FC, Watts FP (1936) Rivalry between uniocular negative after–images and the vision of the other eye Am J Psychol 48 109–16 [13.3.5] Sun F, Tong J, Yang Q, et al. (2002) Multi-directional shifts of optokinetic responses to binocular-rivalrous motion stimuli Brain Res 944 56–64 [12.3.6b] Sundet JM (1972) The effect of pupil size variations on the colour stereoscopic phenomenon Vis Res 12 1027–32 [17.8] Sundet JM (1976) Two theories of colour stereoscopy Vis Res 16 469–72 [17.8] Sutherland NS (1961) Figural aftereffects and apparent size Quart J Exp Psychol 13 222–8 [21.1] Suzuki DA, Keller EL (1984) Visual signals in the dorsolateral pontine nucleus of the alert monkey: their relationship to smooth–pursuit eye movements Exp Brain Res 53 473–8 [22.6.1d] Suzuki S, Grabowecky M (2002) Evidence for perceptual “trapping” and adaptation in multistable binocular rivalry Neuron 36 243–57 [12.4.4b] Swanston MT, Wade NJ (1985) Binocular interaction in induced line rotation Percept Psychophys 37 363–8 [13.3.3e] Symons LA, Pearson PM, Timney B (1996) The aftereffect to relative motion does not show interocular transfer Perception 25 651–60 [13.3.3b] Szily A von (1921) Stereoscopische Versuche mit Schattenrissen Graefe’s Arch Klin Exp Ophthal 105 964–72 See Ehrenstein and Gillam (1999) for English translation [22.2.4a] Taira M, Tsutsui KI, Jiang M, et al. (2000) Parietal neurons represent surface orientation from the gradient of binocular disparity J Neurophysiol 83 3140–46 [11.5.2b] Takayama Y, Sugishita M, Kido T, et al.
(1994) Impaired stereoacuity due to a lesion in the left pulvinar J Neurol Neurosurg Psychiat 57 652–4 [11.2.1] Takeichi H, Nakazawa H (1994) Binocular displacement of unpaired region Perception 23 1025–36 [16.7.3b] Takemura A, Inoue Y, Kawano K, et al. (2001) Single-unit activity in cortical area MST associated with disparity-vergence eye movements: evidence for population coding J Neurophysiol 85 2245–66 [11.5.2a] Tam WJ, Ono H (1987) Zero horizontal disparity in binocular depth mixture stimuli Vis Res 27 1207–10 [18.8.2c] Tanabe S, Cumming BG (2008) Mechanisms underlying the transformation of disparity signals from V1 to V2 in the macaque J Neurosci 28 11304–14 [11.10.1c] Tanabe S, Umeda K, Fujita I (2004) Rejection of false matches for binocular correspondence in macaque visual cortical area V4 J Neurosci 24 8170–80 [11.5.3b] Tanabe S, Doi T, Umeda K, Fujita I (2005) Disparity-tuning characteristics of neuronal responses to dynamic random-dot stereograms in macaque visual area V4 J Neurophysiol 94 2683–99 [11.5.3a] Tanaka H, Ohzawa I (2006) Neural basis for stereopsis from second-order contrast cues J Neurosci 26 4370–82 [11.4.7] Tanaka H, Uka T, Yoshiyama K, Kato M, Fujita I (2001) Processing of shape defined by disparity in monkey inferior temporal cortex J Neurophysiol 85 735–44 [11.5.3b] Tanner WP (1956) Theory of recognition J Acoust Soc Am 28 882–888 [13.1.4b] Tansley BW, Boynton RM (1978) Chromatic border perception: the role of red- and green-sensitive cones Vis Res 18 683–97 [12.3.2e] Tao R, Lankheet MJM, van de Grind WA, van Wezel RJA (2003) Velocity dependence of the interocular transfer of dynamic motion aftereffects Perception 32 855–66 [13.3.3c]
Taroyan NA, Buckley D, Porrill J, Frisby JP (2000) Exploring sequential stereopsis for co-planarity tasks Vis Res 40 3373–90 [18.10.2a] Tauber ES, Atkin A (1968) Optomotor responses to monocular stimulation: relation to visual system organization Science 160 1365–7 [22.6.1a] Taya R, Ehrenstein WH, Cavonius CR (1995) Varying the strength of the Munker-White effect by stereoscopic viewing Perception 27 685–94 [22.4.5] Taya S, Sato M, Nakamizo S (2005) Stereoscopic depth aftereffects without retinal position correspondence between adaptation and test stimuli Vis Res 45 1857–66 [21.6.1b] Taylor J (1738) Le mechanisme ou le nouveau Traité de l’anatomie du globe de l’oeil avec l’usage de ses différentes parties et de celles qui lui sont contigues David, Paris [12.2.1] Taylor MM (1963) Tracking the neutralization of seen rotary movement Percept Mot Skills 16 513–19 [13.3.3a] Te Pas SF, Kappers AML (2001) First-order structure induces the 3-D curvature contrast effect Vis Res 41 3829–35 [21.4.2f] Teichert T, Klingenhoefer S, Wachtler T, Bremmer F (2008) Depth perception during saccades J Vis 8(14) Article 27 [18.10.2a] Teichner WH, Kobrick JL, Wehrkamp RF (1955) The effects of terrain and observation distance on relative depth perception Am J Psychol 68 193–208 [18.3.1] Teller DY, Galanter E (1967) Brightness luminances and Fechner’s paradox Percept Psychophys 2 297–300 [13.1.4a] Temme LA, Malcus L, Noell WK (1985) Peripheral visual field is radially organized Am J Optom Physiol Opt 62 545–54 [13.1.2e] Templeton WB, Green FA (1968) Chance results in utrocular discrimination Quart J Exp Psychol 20 200–3 [16.8] Teping C, Silny J (1987) Evidence of pericentral stereopsis in random dot VECP Doc Ophthal 66 291–66 [11.7] Ternus J (1926) Experimentelle Untersuchungen über phänomenale Identität Psychol Forsch 7 81–136 [16.4.2e] Theeuwes J, Atchley P, Kramer AF (1998) Attentional control within 3-D space J Exp Psychol: HPP 27 1476–85 [22.5.1e] Theimer WM, Mallot HA (1994)
Phase-based vergence control and depth reconstruction using active vision Comput Vis Gr Im Proc: Im Underst 60 343–58 [18.10.4] Thibos LN, Bradley DL, Still DL, et al. (1990) Theory and measurement of ocular chromatic aberration Vis Res 30 33–49 [17.8] Thomas FH, Dimmick FL, Luria SM (1961) A study of binocular color mixture Vis Res 1 108–20 [12.2.2] Thomas GJ (1956) Effect of contours on binocular CFF obtained with synchronous and alternate flashes Am J Psychol 69 369–77 [13.1.5] Thomas J (1977) A reciprocal model for monocular pattern alternation Percept Psychophys 22 310–12 [12.3.8a] Thomas J (1978) Binocular rivalry: the effects of orientation and pattern color arrangement Percept Psychophys 23 360–2 [12.3.3c] Thomas OM, Cumming BG, Parker AJ (2002) A specialization for relative disparity in V2 Nat Neurosci 5 472–8 [11.5.1] Thompson P, Wood V (1993) The Pulfrich pendulum phenomenon in stereoblind subjects Perception 22 7–14 [23.7] Thomsen MN, Lang RD (2004) An experimental comparison of 3-dimensional and 2-dimensional endoscopic systems in a model Arthroscopy 20 419–23 [24.2.4] Thomson LC (1947) Binocular summation within the nervous pathway of the pupillary light reflex J Physiol 106 59–65 [13.1.1a] Thorn F, Boynton RM (1974) Human binocular summation at absolute threshold Vis Res 14 445–58 [13.1.1c, 13.1.6c] Thorpe SJ, Celebrini S, Trotter Y, Imbert M (1991) Dynamics of stereo processing in area V1 of the awake primate J Neurosci 4 (Supp) 83 [11.4.8b] Tian J, Wang C, Sun F (2003) Interocular motion combination for dichoptic moving stimuli Spat Vis 16 407–18 [12.3.6b] Timney B, Elberger AJ, Vandewanter ML (1985) Binocular depth perception in the cat following early corpus callosum section Exp Brain Res 60 19–29 [11.9.2]
Timney B, Wilcox LM, St John R (1989) On the evidence for a ‘pure’ binocular process in human vision Spat Vis 4 1–15 [12.7.4] Timney B, Symons LA, Wilcox LM, O’Shea RP (1996) The effect of dark and equiluminant occlusion on the interocular transfer of visual aftereffects Vis Res 36 707–15 [13.3.3a] Tittle JS, Todd JT, Perotti VJ, Norman JF (1995) Systematic distortion of perceived three-dimensional structure from motion and binocular stereopsis J Exp Psychol: HPP 21 663–78 [20.6.5c] Todd JT, Norman JF (2003) The visual perception of 3-D shape from multiple cues: are observers capable of perceiving metric structure Percept Psychophys 65 31–47 [20.6.4] Todd JT, Norman JF, Koenderink JJ, Kappers AML (1997) Effects of texture illumination and surface reflectance on stereoscopic shape perception Perception 29 807–22 [17.1.6] Toet A, Levi DM (1992) The two–dimensional shape of spatial interaction zones in the parafovea Vis Res 32 1349–57 [13.2.5] Toet A, van Eekhout MP, Simons HLJJ, Koenderink JJ (1987) Scale invariant features of differential spatial displacement discrimination Vis Res 27 441–51 [18.7.2d] Tolhurst DJ, Movshon JA, Dean AF (1983) The statistical reliability of signals in single neurons in cat and monkey visual cortex Vis Res 23 775–85 [11.4.8a] Tong F, Engel SA (2001) Interocular rivalry revealed in the human cortical blind-spot representation Nature 411 195–9 [12.9.2f ] Tong F, Nakayama K , Vaughan JT, Kanwisher N (1998) Binocular rivalry and visual awareness in human extrastriate cortex Neuron 21 753–9 [12.9.2f ] Tong L, Guido W, Tumosa N, et al. (1992) Binocular interactions in the cat’s dorsal lateral geniculate nucleus. II. 
Effects on dominant–eye spatial–frequency and contrast processing Vis Neurosci 8 557–66 [12.9.1, 18.5.4a] Towle VL, Harter MR , Previc FH (1980) Binocular interaction of orientation and spatial frequency channels: evoked potentials and observer sensitivity Percept Psychophys 27 351–60 [13.2.4a] Towne J (1865) The stereoscope, and stereoscopic results – Section VI. Guy’s Hospital Reports 11 144–80 [16.7.2b] Towne J (1866) Contributions to the physiology of binocular vision – Section VII Guy’s Hospital Reports 12 285–301 [16.7.2b] Townsend JT (1968) Binocular information summation and the serial processing model Percept Psychophys 4 125–8 [13.1.3e] Toyama K , Komatsu Y, Kasai H, et al. (1985) Responsiveness of Clare-Bishop neurons to visual cues associated with motion of a visual stimulus in three-dimensional space Vis Res 25 407–14 [11.3.2] Toyama K , Fugii K , Kasai S, Maeda K (1986) The responsiveness of Clare-Bishop neurons to size cues for motion stereopsis Neurosci Res 4 83–109 [11.3.2] Traub AC (1967) Stereoscopic display using varifocal mirror oscillations Applied Opt 6 1085–7 [24.1.4b] Travis ARL (1990) Autostereoscopic 3-D display App Optics 29 4341–3 [24.1.3c] Tredici TD, von Noorden GK (1984) The Pulfrich effect in anisometropic amblyopia and strabismus Am J Ophthal 98 499–503 [23.7] Treisman A (1962) Binocular rivalry and stereoscopic depth perception Quart J Exp Psychol 14 23–37 [15.3.7b, 15.3.8a] Treisman A (1988) Features and objects Quart J Exp Psychol 40A 201–38 [22.8.2a] Trick GL, Compton JR (1982) Analysis of the effect of temporal frequency on the dichoptic visual-evoked response Am J Optom Physiol Opt 59 155–61 [13.1.8b] Trick GL, Guth SL (1980) The effect of wavelength on binocular summation Vis Res 20 975–80 [13.1.2c] Tricoles G (1987) Computer generated holograms: an historical review App Optics 29 4351–60 [24.1.4a] Trincker D (1953) Light dark adaptation and space perception. I. 
The Pulfrich effect as an asymmetrical phenomenon Pflügers Arch ges Physiol 257 48–69 [23.2.1]
Tripathy SP, Levi DM (1994) Long-range dichoptic interactions in the human visual cortex in the region corresponding to the blind spot Vis Res 34 1127–38 [13.2.5] Trivedi HP, Lloyd SA (1985) The role of disparity gradient in stereo vision Perception 14 685–90 [19.4] Troscianko T (1982) A stereoscopic presentation of the Hermann grid Vis Res 22 485–9 [16.3.2] Trotter Y, Celebrini S (1999) Gaze direction controls response gain in primary visual-cortex neurons Nature 398 239–42 [11.4.6b] Trotter Y, Celebrini S, et al. (1992) Modulation of neural stereoscopic processing in primate area V1 by the viewing distance Science 257 1279–81 [11.4.6a] Trotter Y, Celebrini S, et al. (1996) Neural processing of stereopsis as a function of viewing distance in primate visual cortical area V1 J Neurophysiol 76 2872–85 [11.4.6a] Trotter Y, Celebrini S, Durand JB (2004) Evidence for implication of primate area V1 in neural 3-D spatial localization processing J Physiol Paris 98 125–34 [11.4.6b] Truchard AM, Ohzawa I, Freeman RD (2000) Contrast gain control in the visual cortex: monocular versus binocular mechanisms J Neurosci 20 3017–32 [11.4.1f ] Trueswell JC, Hayhoe MM (1993) Surface segmentation mechanisms and motion perception Vis Res 33 313–28 [22.3.3] Tsai JJ, Victor JD (2000) Neither occlusion constraint nor binocular disparity accounts for the perceived depth in the ‘sieve effect’ Vis Res 40 2265–76 [17.5] Tsai JJ, Victor JD (2003) Reading a population code: a multi-scale neural model for representing binocular disparity Vis Res 43 445–66 [11.4.3b, 11.10.1c] Tsao DY, Vanduffel W, Sasaki Y, et al. 
(2003a) Stereopsis activates V3A and caudal intraparietal areas in macaque and humans Neuron 39 555–68 [11.5.1, 11.8.2] Tsao DY, Conway BR , Livingstone MS (2003b) Receptive fields of disparity-tuned simple cells in macaque V1 Neuron 38 103–14 [11.4.3c] Tschermak–Seysenegg A von (1899) über anomale Sehrichtungsgemeinschaft der Netzhäute bei einem Schielenden Graefe’s Arch Klin Exp Ophthal 47 508–50 [14.4.2] Tschermak–Seysenegg A von (1900) Beiträge zur Lehre vom Längshoropter Pflügers Arch ges Physiol 81 328–48 [14.6.1b, 14.6.1c] Tschermak-Seysenegg A von (1952) Introduction to physiological optics Thomas, Springfield IL [14.3.1c] Tse PU, Logothetis NK (2002) The duration of 3-D form analysis in transformational apparent motion Percept Psychophys 64 244–65 [22.5.3d] Tseng CH, Gobell JL, Lu ZL, Sperling G (2006) When motion appears stopped: stereo motion standstill Proc Natl Acad Sci 103 14953–8 [16.5.1] Tsirlin I, Allison RS, Wilcox LM (2008) Stereoscopic transparency: constraints on the perception of multiple surfaces J Vis 8(5) Article 5 [18.9] Tsirlin I, Wilcox LM, Allison RS (2010a) Monocular occlusions determine the perceived shape and depth of occluding surfaces J Vis 10(6) 11 [17.2.1, 17.3] Tsirlin I, Wilcox LM, Allison RS (2010b) Perceptual artifacts in random-dot stereograms Perception 39 349–55 [18.9] Tsuchiya N, Koch C (2005) Continuous flash suppression reduces negative afterimages Nat Neurosci 8 1096–101 [12.3.5f ] Tsuchiya N, Koch C, Gilroy LA, Blake R (2006) Depth of interocular suppression associated with continuous flash suppression, flash suppression, and binocular rivalry J Vis 6 1068–78 [12.3.5f ] Tsutsui KI, Jiang M, Yara K , Sakata H, Taira M (2001) Integration of perspective and disparity cues in surface-orientation selective neurons of area CIP J Neurophysiol 86 2856–67 [11.5.2b] Tumosa N, McCall MA, Guido W, Spear PD (1989) Responses of lateral geniculate neurons that survive long–term visual cortex damage in kittens and adult cats J 
Neurosci 9 280–98 [12.9.1]
REFERENCES
Tychsen L, Lisberger SG (1986) Maldevelopment of visual motion processing in humans who had strabismus with onset in infancy J Neurosci 6 2495–508 [22.6.1e] Tyler CW (1971) Stereoscopic depth movement: two eyes less sensitive than one Science 174 958–61 [13.3.3d] Tyler CW (1973) Stereoscopic vision: cortical limitations and a disparity scaling effect Science 181 276–8 [12.1.3a, 18.6.3a] Tyler CW (1974a) Depth perception in disparity gratings Nature 251 140–2 [16.2.2a, 18.6.3b, 24.1.5] Tyler CW (1974b) Stereopsis in dynamic visual noise Nature 250 781–2 [23.6.1, 23.6.3] Tyler CW (1975a) Spatial organization of binocular disparity sensitivity Vis Res 15 583–90 [18.6.3a] Tyler CW (1975b) Stereoscopic tilt and size aftereffects Perception 4 187–92 [16.1.2d, 16.3.3] Tyler CW (1977) Stereomovement from interocular delay in dynamic visual noise: a random spatial disparity hypothesis Am J Optom Physiol Opt 54 374–86 [23.6.2, 23.6.3, 23.6.4] Tyler CW (1980) Binocular Moiré fringes and the vertical horopter Perception 9 475–8 [14.2.2] Tyler CW (1983) Sensory processing of binocular disparity In Vergence eye movements: Basic and clinical aspects (ed CM Schor, KJ Ciuffreda) pp 199–296 Butterworth, Boston [14.5.2g, 18.6.3e, 18.11] Tyler CW (1987) Analysis of visual modulation sensitivity. III. 
Meridional variations in peripheral flicker sensitivity J Opt Soc Am A 4 1612–19 [12.3.4] Tyler CW (1990) A stereoscopic view of visual processing streams Vis Res 30 1877–95 [15.4.3] Tyler CW (1991a) Cyclopean vision In Vision and visual dysfunction Vol 9 Binocular Vision (ed D Regan) pp 38–74 MacMillan, London [14.7, 16.1.2d, 20.6.5a] Tyler CW (1991b) The horopter and binocular fusion In Vision and visual dysfunction Vol 9 Binocular Vision (ed D Regan) pp 19–37 MacMillan, London [11.10.1c, 14.5, 14.5.2g, 18.12.1a] Tyler CW (1994) The birth of computer stereograms for unaided stereovision In Stereogram pp 86–9 Cadence Books, San Francisco [24.1.5] Tyler CW (1997) On Ptolemy’s geometry of binocular vision Perception 26 1579–81 [16.7.2b] Tyler CW (2004) Representation of stereoscopic structure in human and monkey cortex TINS 27 116–18 [11.8.2] Tyler CW, Apkarian PA (1985) Effects of contrast orientation and binocularity in the pattern evoked potential Vis Res 25 755–66 [12.9.2e] Tyler CW, Cavanagh P (1991) Purely chromatic perception of motion in depth: two eyes as sensitive as one Percept Psychophys 49 53–61 [17.1.4c, 17.1.4d] Tyler CW, Clarke MB (1990) The autostereogram Proc Int Soc Opt Engin 1256 182–97 [24.1.6] Tyler CW, Julesz B (1978) Binocular cross–correlation in time and space Vis Res 18 101–5 [15.2.2a] Tyler CW, Julesz B (1980) On the depth of the cyclopean retina Exp Brain Res 40 196–202 [18.6.3a, 18.8, 18.10.1a] Tyler CW, Kontsevich LL (1995) Mechanisms of stereoscopic processing: stereoattention and surface perception in depth reconstruction Perception 27 127–53 [18.10.4, 18.13] Tyler CW, Kontsevich LL (2001) Stereoprocessing of cyclopean depth images: horizontally elongated summation fields Vis Res 41 2235–43 [18.6.3b] Tyler CW, Raibert M (1975) Computer technology: generation of random–dot stereogratings Behav Res Meth Instrum 7 37–41 [24.1.5] Tyler CW, Sutter EE (1979) Depth from spatial frequency difference: an old kind of stereopsis? 
Vis Res 19 859–65 [12.7.3, 19.2.4, 20.2.1] Tyler CW, Likova LT, Kontsevich LL, Wade AR (2006) The specificity of cortical region KO to depth structure Neuroimage 30 228–38 [11.8.1] Uhlarik JJ, Canon LK (1971) Influence of concurrent and terminal exposure conditions on the nature of perceptual adaptation J Exp Psychol 91 233–9 [13.4.3]
Uka T, DeAngelis GC (2002) Binocular vision: an orientation to disparity coding Cur Biol 12 R764–6 [11.4.4] Uka T, DeAngelis GC (2003) Contribution of middle temporal area to coarse depth discrimination: comparison of neuronal and psychophysical sensitivity J Neurosci 23 3515–30 [11.5.2a] Uka T, DeAngelis GC (2004) Contribution of area MT to stereoscopic depth perception: choice-related response modulations reflect task strategy Neuron 42 297–310 [11.5.2a] Uka T, DeAngelis GC (2006) Linking neural representation to function in stereoscopic depth perception: role of the middle temporal area in coarse versus fine disparity discrimination J Neurosci 26 6791–802 [11.5.2a] Uka T, Tanaka H, Kato M, Fujita I (1999) Behavioral evidence for visual perception of 3-dimensional surface structures in monkeys Vis Res 39 2399–410 [22.1.2] Uka T, Tanaka H, Yoshiyama K , Kato M, Fujita I (2000) Disparity selectivity of neurons in monkey inferior temporal cortex J Neurophysiol 84 120–32 [11.5.3b] Uka T, Tanabe S, Watanabe M, Fujita I (2005) Neural correlates of fine discrimination in monkey inferior temporal cortex J Neurosci 25 10796–802 [11.5.3b] Ukwade MT, Bedell HE (1999) Stereothresholds in persons with congenital nystagmus and in normal observers during comparable retinal image motion Vis Res 39 2963–73 [18.10.3a] Ukwade MT, Bedell HE, Harwerth RS (2003a) Stereopsis is perturbed by vergence error Vis Res 42 181–93 [18.10.3b] Ukwade MT, Bedell HE, Harwerth RS (2003b) Stereothresholds with simulated vergence variability and constant error Vis Res 42 195–204 [18.10.3b] Ukwade MT, Harwerth RS, Bedell E (2007) Stereoscopic acuity, observation distance and fixation disparity: a commentary on ‘Stereoscopic acuity and observation distance’ by Bradshaw and Glennerster (2006) Spat Vis 20 489–92 [18.6.7] Ullman S (1978) Two dimensionality of the correspondence problem Perception 7 683–93 [22.5.3a] Ullman S (1979) The interpretation of visual motion MIT Press, Cambridge MA [22.5.3a] 
Ullman S (1980) The effect of similarity between line segments on the correspondence strength of apparent motion Perception 9 617–29 [22.5.3a] Umeda K , Tanabe S, Fujita I (2007) Representation of stereoscopic depth based on relative disparity in macaque area V4 J Neurophysiol 98 241–52 [11.5.3a] Updyke BV (1974) Characteristics of unit responses in superior colliculus of the Cebus monkey J Neurophysiol 37 896–908 [11.2.3] Usery E (1993) Virtual stereo display techniques for three-dimensional geographic data Photogram Engin Rem Sens 59 1737–44 [24.2.1] Uttal WR (1987) The perception of dotted forms Erlbaum, Hillsdale NJ [20.5.1] Uttal WR , Fitzgerald J, Eskin TE (1975a) Parameters of tachistoscopic stereopsis Vis Res 15 705–12 [13.2.7a, 18.12.1b] Uttal WR Fitzgerald J, Eskin TE (1975b) Rotation and translation effects on stereoscopic acuity Vis Res 15 939–44 [18.10.4] Uttal WR , Davis SN, Welke C, Kakarala R (1988) The reconstruction of static visual forms from sparse dotted samples Percept Psychophys 43 223–40 [20.5.1] Uttal WR , Davis NS, Welke C (1994) Stereoscopic perception with brief exposures Percept Psychophys 56 599–604 [18.12.1a] Uttal WR , Baruch T, Allen L (1995) Dichoptic and physical information combination: a comparison Perception 27 351–62 [13.1.3e] Vallortigara G, Bressan P (1994) Occlusion transparency and stereopsis: a new explanation for stereo capture Vis Res 34 2891–6 [22.2.4b] Valmaggia C, Proudlock F, Gottlob I (2003) Optokinetic nystagmus in strabismus: are asymmetries related to binocularity? Invest Ophthal Vis Sci 44 5142–50 [22.6.1e] Valyus NA (1966) Stereoscopy Focal Press, London [24.1.1, 24.1.3b]
van Boxtel JJ, van Ee R , Erkelens CJ (2007) Dichoptic masking and binocular rivalry share common perceptual dynamics J Vis 7 3.1–11 [12.3.5d] van Boxtel JJA, Alais D, van Ee R (2008) Retinotopic and nonretinotopic stimulus encoding in binocular rivalry and the involvement of feedback J Vis 8 (5) Article 17 [12.3.1a] van Dam LCJ, van Ee R (2004) Stereoscopic matching and the aperture problem Perception 33 769–87 [18.6.5] van Dam LCJ, van Ee R (2006a) The role of saccades in exerting voluntary control in perceptual and binocular rivalry Vis Res 46 787–99 [12.8.1] van Dam LCJ, van Ee R (2006b) Retinal image shifts, but not eye movements per se, cause alternations in awareness during binocular rivalry J Vis 6 1172–9 [12.8.1] Van Damme W, Brenner E (1997) The distance used for scaling disparities is the same as the one used for scaling retinal size Vis Res 37 757–64 [20.6.3d] Van de Castle RL (1960) Perceptual defense in a binocular–rivalry situation J Person 28 448–62 [12.8.3a] Van de Grind WA, Verstraten FAJ, Zwamborn KM (1994) Ensemble models of the movement aftereffect and the influence of eccentricity Perception 23 1171–9 [13.3.3a] Van de Grind WA, Erkelens CJ, Laan AC (1995) Binocular correspondence and visual direction Perception 27 215–35 [16.7.2d] Van de Grind WA, van Hof P, van der Smagt MJ, Verstraten FA (2001) Slow and fast visual motion channels have independent binocular rivalry stages Proc Roy Soc B 268 437–43 [13.3.3c] Van der Meer HC (1978) Linear combinations of stereoscopic depth effects in dichoptic perception of gratings Vis Res 18 707–14 [20.2.1] Van der Smagt MJ, Verstraten FA, van der Grind WA (1999) A new transparent motion aftereffect Nat Neurosci 2 595–6 [13.3.3c] Van der Tweel LH, Estévez O (1974) Subjective and objective evaluation of flicker Ophthalmologica 169 70–81 [13.1.5] Van der Willigen RF, Harmening WM, Vossen S, Wagner H (2010) Disparity sensitivity in man and owl: psychophysical evidence for equivalent perception of 
shape-from-stereo J Vis 10(1) Article 10 [20.4.2] Van der Zwan R, Wenderoth P, Alais D (1993) Reduction of a pattern-induced motion aftereffect by binocular rivalry suggests the involvement of extrastriate mechanisms Vis Neurosci 10 703–9 [12.6.4] Van Die GC, Collewijn H (1986) Control of human optokinetic nystagmus by the central and peripheral retina: effects of partial visual field masking, scotopic vision and central retinal scotomata Brain Res 383 185–94 [22.6.1c] Van Ee R (2001) Perceptual learning without feedback and the stability of stereoscopic slant estimation Perception 30 95–114 [18.14.1] Van Ee R (2003) Correlation between stereoanomaly and perceived depth when disparity and motion interact in binocular matching Perception 32 67–84 [15.3.9] Van Ee R, Anderson BL (2001) Motion direction, speed and orientation in binocular matching Nature 410 690–4 [15.3.11, 15.3.9] Van Ee R, Erkelens CJ (1995) Binocular perception of slant about oblique axes relative to a visual frame of reference Perception 27 299–14 [20.3.2b] Van Ee R, Erkelens C (1996a) Temporal aspects of binocular slant perception Vis Res 36 45–51 [18.12.1b, 21.4.2b] Van Ee R, Erkelens CJ (1996b) Stability of binocular depth perception with moving head and eyes Vis Res 36 3827–42 [21.4.2b] Van Ee R, Erkelens CJ (1996c) Anisotropy in Werner’s binocular depth-contrast effect Vis Res 36 2253–62 [21.4.2d] Van Ee R, Erkelens CJ (1998) Temporal aspects of stereoscopic slant estimation: an evaluation and extension of Howard and Kaneko’s theory Vis Res 38 3871–82 [20.3.2a] Van Ee R, Erkelens CJ (1999) The influence of large scanning eye movements on stereoscopic slant estimation of large surfaces Vis Res 39 467–79 [18.14.2c]
Van Ee R , Erkelens CJ (2000) Is there an interaction between perceived direction and perceived aspect ratio in stereoscopic vision? Percept Psychophys 62 910–26 [16.7.4b] Van Ee R , Schor, CM (2000) Unconstrained stereoscopic matching of lines Vis Res 40 151–62 [18.6.5] Van Ee R , van Dam LCJ (2003) The influence of cyclovergence on unconstrained stereoscopic matching Vis Res 43 307–19 [15.3.10] Van Ee R , Banks MS, Backus BT (1999) An analysis of binocular slant contrast Perception 28 1121–45 [21.4.3] van Ee R , van Boxtel JJA, Parker AL, Alais D (2009) Multisensory congruency as a mechanism for attentional control over perceptual selection J Neurosci 29 11641–9 [12.8.4] Van Hof–van Duin J, Mohn G (1982) Stereopsis and optokinetic nystagmus In Functional basis of ocular motility disorders (ed G Lennerstrand, DS Zee, EL Keller) pp 113–9 Pergamon, New York [22.6.1e] Van Hof-van Duin J, Mohn G (1986) Monocular and binocular optokinetic nystagmus in humans with defective stereopsis Invest Ophthal Vis Sci 27 574–83 [22.6.1e] Van Kruysbergen NAWH, de Weert CMM (1993) Apparent motion perception: the contribution of the binocular and monocular systems. An improved test based on motion aftereffects Perception 22 771–84 [13.3.3d] Van Kruysbergen NAWH, de Weert CMM (1994) Aftereffects of apparent motion: the existence of an AND-type binocular system in human vision Perception 23 1069–83 [13.3.3d] Vanduffel W, Fize D, Peuskens H, Denys K , Sunaert S, Todd JT, Orban GA (2002) Extracting 3D from motion: differences in human and monkey intraparietal cortex Science 298 413–5 [11.5.2a] Varela FJ, Singer W (1987) Neuronal dynamics in the visual corticothalamic pathway revealed through binocular rivalry Exp Brain Res 66 10–20 [12.9.1] Vargas CD, Volchan E, Hokoc JN, et al. 
(1997) On the functional anatomy of the nucleus of the optic tract-dorsal terminal nucleus commissural connections in the opossum (Didelphis marsupialis aurita) Neuroscience 76 313–21 [22.6.1a] Vautin RG, Berkley MA (1977) Responses of single cells in cat visual cortex to prolonged stimulus movement: neural correlates of visual aftereffects J Neurophysiol 40 1051–65 [13.3.3f ] Verhoef BE, Vogels R , Janssen P (2010) Contribution of inferior temporal and posterior parietal activity to three-dimensional shape perception Curr Biol 20 909–13 [11.5.4] Verhoeff FH (1928) An optical illusion due to chromatic aberration Am J Ophthal 11 898–900 [17.8] Verhoeff FH (1933) Effect on stereopsis produced by disparate retinal images of different luminosities Arch Ophthal 10 640–4 [16.7.3b, 18.5.4a] Verhoeff FH (1935) A new theory of binocular vision Arch Ophthal 13 151–75 [12.7.2] Verhoeff FH (1942) Simple quantitative test for acuity and reliability of binocular stereopsis Arch Ophthal 28 1000–19 [18.2.1b] Verstraten FAJ, Fredericksen RE, van de Grind WA (1994a) Movement aftereffect of bi-vectorial transparent motion Vis Res 34 349–58 [13.3.3c, 22.3.2] Verstraten FAJ, Verlinde R , Fredericksen RE, van de Grind WA (1994b) A transparent motion aftereffect contingent on binocular disparity Perception 23 1181–8 [22.3.2, 22.5.4] Verstraten FA, van der Smagt MJ, van de Grind WA (1998) Aftereffect of high speed motion Perception 27 1055–66 [13.3.3c] Verstraten FAJ, van der Smagt MJ, Fredericksen RE, van de Grind WA (1999) Integration after adaptation to transparent motion: static and dynamic test patterns result in different aftereffect directions Vis Res 39 803–10 [22.3.2] Vickery RM, Morley JW (1999) Binocular phase interactions in area 21a of the cat J Physiol 514 541–549 [11.3.2] Vidyasagar TR (1976) Orientation specific colour adaptation at a binocular site Nature 291 39–40 [13.3.5]
Vidyasagar TR, Henry GH (1990) Relationship between preferred orientation and ordinal position in neurons of cat striate cortex Vis Neurosci 5 565–9 [20.2.5] Virsu V, Taskinen H (1975) Central inhibitory interactions in human vision Exp Brain Res 23 65–74 [13.3.2a] Viswanathan L, Mingolla E (2002) Dynamics of attention in depth: evidence from multi-element tracking Perception 31 1415–37 [22.8.2c] Vlaskamp BNS, Filippini HR, Banks MS (2009) Image-size differences worsen stereopsis independent of eye position J Vis 9(2) Article 17 [18.3.4] Volkmann AW (1836) Neue Beiträge zur Physiologie des Gesichtssinnes Breitkopf, Leipzig [12.3.1a] Von Aster E (1906) Beiträge zur Psychologie der Raumwahrnehmung Z Psychol 43 161–203 [24.1.7] Von Bezold W (1876) The theory of colour Prang, Boston [22.4.5] Von der Heydt R, Peterhans E (1989) Mechanisms of contour perception in monkey visual cortex. I. Lines of pattern discontinuity J Neurosci 9 1731–48 [13.3.2a, 22.2.4c] Von der Heydt R, Adorjani CS, Hänny P, Baumgartner G (1978) Disparity sensitivity and receptive field incongruity of units in the cat striate cortex Exp Brain Res 31 523–45 [11.3.1, 11.4.3a, 11.4.5b, 20.3.1a] Von der Heydt R, Hänny P, Dürsteler MR, Poggio GF (1982) Neuronal responses to stereoscopic tilt in the visual cortex of the behaving monkey Invest Ophthal Vis Sci 22 (Abs) 12 [11.6.2] Von der Heydt R, Zhou H, Friedman HS (2000) Representation of stereoscopic edges in monkey visual cortex Vis Res 40 1955–67 [11.5.1] Von Grünau MW, Dubé S, Kwas M (1993) The effect of disparity on motion coherence Spat Vis 7 227–41 [22.3.3] Voorhorst FA, Overbeeke K, Smets GJF (1997) Using movement parallax for 3D laparoscopy Med Prog Technol 21 211–18 [24.2.4] Vos JJ (1960) Some new aspects of color stereoscopy J Opt Soc Am 50 785–90 [17.8] Vos JJ (1966) The color stereoscopic effect Vis Res 6 105–7 [17.8] Vreven D (2006) 3D shape discrimination using relative disparity derivatives Vis Res 46 4181–92 [18.6.6] Vreven D, 
Welch L (2001) The absence of depth constancy in contour stereograms Perception 30 693–705 [20.6.5c] Vul E, Krizay E, MacLeod DIA (2008) The McCollough effect reflects permanent and transient adaptation in early visual cortex J Vis 8 (12) Article 4 [13.3.5] Wade NJ (1973) Binocular rivalry and binocular fusion of after–images Vis Res 13 999–1000 [12.3.5a, 12.3.6a] Wade NJ (1974) The effect of orientation in binocular contour rivalry of real images and afterimages Percept Psychophys 15 227–32 [12.3.3c] Wade NJ (1975a) Binocular rivalry between single lines viewed as real images and afterimages Percept Psychophys 17 571–7 [12.3.6a] Wade NJ (1975b) Monocular and binocular rivalry between contours Perception 4 85–95 [12.3.2e, 12.3.8a] Wade NJ (1976) Monocular and dichoptic interaction between afterimages Percept Psychophys 19 149–54 [12.3.8d] Wade NJ (1977) Binocular rivalry between after-images illuminated intermittently Vis Res 17 310–12 [12.3.6a] Wade NJ (1978) Why do patterned afterimages fluctuate in visibility? Psychol Bull 85 338–52 [12.3.3a] Wade NJ (1980) The influence of colour and contour rivalry on the magnitude of the tilt illusion Vis Res 20 229–33 [12.6.3] Wade NJ (1998) Early studies of eye dominance Laterality 3 97–108 [12.3.7] Wade NJ (2007) The stereoscopic art of Ludwig Wilding Perception 36 479–82 [24.1.3a] Wade NJ, de Weert CMM (1986) Aftereffects in binocular rivalry Perception 15 419–34 [12.3.1a] Wade NJ, Ono H (2005) From dichoptic to dichotic: historical contrasts between binocular vision and binaural hearing Perception 34 645–68 [11.1.1]
Wade NJ, Swanston MT (1993) Monocular and dichoptic interactions between moving and stationary stimuli Perception 22 1111–19 [13.3.3e] Wade NJ, Wenderoth P (1978) The influence of colour and contour rivalry on the magnitude of the tilt after–effect Vis Res 18 827–35 [12.6.3] Wade NJ, De Weert CMM, Swanston MT (1984) Binocular rivalry with moving patterns Percept Psychophys 35 111–22 [12.3.6b] Wade NJ, Swanston MT, de Weert CMM (1993) On interocular transfer of motion aftereffects Perception 22 1365–80 [13.3.3a] Wade NJ, Ono H, Lillakas L (2001) Leonardo da Vinci’s struggles with representations of reality Leonardo 34 231–5 [16.7.4b] Wade NJ, Ono H, Mapp AP (2006) The lost direction in binocular vision: The neglected signs posted by Wells, Towne, and LeConte J Hist Behav Sci 42 61–86 [16.7.2b] Wade NJ, Ono H, Mapp AP, Lillakas L (2011) The singular vision of William Charles Wells (1757–1817) J History Neurosci 20 1–15 [16.7.2a] Waespe W, Henn V (1979) The velocity response of vestibular nucleus neurons during vestibular visual and combined angular acceleration Exp Brain Res 37 337–47 [22.6.1a] Wales R , Fox R (1970) Increment detection thresholds during binocular rivalry suppression Percept Psychophys 8 90–4 [12.7.2] Walker GA, Ohzawa I, Freeman RD (1998) Binocular cross-orientation suppression in the cat’s striate cortex J Neurophysiol 79 227–39 [12.9.2b] Walker JT (1976) Slant perception and binocular brightness differences: some aftereffects of viewing apparent and objective surface slants Percept Psychophys 20 395–402 [17.9] Walker JT, Kruger MW (1972) Figural aftereffects in random–dot stereograms without monocular contours Perception 1 187–92 [16.3.3] Walker P (1975) Stochastic properties of binocular rivalry alternations Percept Psychophys 18 467–73 [12.10] Walker P (1978a) Orientation–selective inhibition and binocular rivalry Perception 7 207–14 [13.3.2a] Walker P (1978b) Binocular rivalry: central or peripheral selective processes? 
Psychol Bull 85 376–89 [12.10] Walker P, Powell DJ (1979) The sensitivity of binocular rivalry to changes in the nondominant stimulus Vis Res 19 277–9 [12.3.3d, 12.5.3] Wallace JM, Mamassian P (2004) The efficiency of depth discrimination for non-transparent and transparent stereoscopic surfaces Vis Res 44 2253–67 [18.9] Wallach H (1935) über visuell wahrgenommene Bewegungsrichtung Psychol Forsch 20 325–80 (Translated by S Wuerger, R Shapley, N Rubin) Perception 25 1317–67 [22.3.1, 22.3.3] Wallach H (1948) Brightness constancy and the nature of achromatic colors J Exp Psychol 38 310–24 [22.4.3b] Wallach H (1976) The direction of motion of straight lines. In On perception (H Wallach) pp 200–216 New York Times Book Co, New York [22.3.1] Wallach H, Adams PA (1954) Binocular rivalry of achromatic colors Am J Psychol 67 513–6 [12.3.2d] Wallach H, Bacon J (1976) Two forms of retinal disparity Percept Psychophys 19 375–82 [20.4.1a] Wallach H, Frey KJ (1972) Adaptation in distance perception based on oculomotor cues Percept Psychophys 11 77–83 [13.4.3] Wallach H, Goldberg J (1977) An exploration of the Pulfrich effect Scand J Psychol 18 231–6 [23.5] Wallach H, Karsh EB (1963) Why the modification of stereoscopic depth–perception is so rapid Am J Psychol 76 413–20 [18.14.1] Wallach H, Lindauer J (1962) On the definition of retinal disparity Psychol Beit 6 521–30 [20.1.2] Wallach H, Zuckerman C (1963) The constancy of stereoscopic depth Am J Psychol 76 404–12 [20.6.3d] Wallach H, Bacon J, Schulman P (1978) Adaptation in motion perception: alteration of induced motion Percept Psychophys 27 509–14 [22.7]
Wallach H, Gillam B, Cardillo L (1979) Some consequences of stereoscopic depth constancy Percept Psychophys 29 235–40 [20.6.3d] Walls GL (1943) Factors in human visual resolution J Opt Soc Am 33 487–505 [18.11] Walls GL (1951) A theory of ocular dominance Arch Ophthal 45 387–412 [16.7.6b] Walls GL (1953) Interocular transfer of after–images Am J Optom Arch Am Acad Optom 30 57–64 [13.3.1, 13.3.3a] Walraven J (1975) Amblyopia screening with random–dot stereograms Am J Ophthal 80 893–9 [18.2.3b] Walsh G (1988) The effect of mydriasis on the pupillary centration of the human eye Ophthal Physiol Opt 8 178–82 [17.8] Walton NH (1952) A study of retinal correspondence by after-image methods Am J Optom Arch Am Acad Optom 29 90–103 [14.4.1b] Wang C, Dreher B (1996) Binocular interactions and disparity coding in area 21a of cat extrastriate visual cortex Exp Brain Res 108 257–72 [11.3.2] Wang YZ , Thibos LN, Bradley A (1997) Effects of refractive error on detection acuity and resolution acuity in peripheral vision Invest Ophthal Vis Sci 38 2134–43 [13.1.2e] Wang Z , Wu X, Ni R , Wang U (2001) Double fusion does not occur in Panum’s limiting case: evidence from orientation disparity Perception 30 1143–9 [17.6.3] Wanless HR (1965) Aerial stereo photographs Hubbard Scientific Company Northbrook Illinois [24.2.1] Wann JP, Rushton S, Mon-Williams M (1995) Natural problems for stereoscopic depth perception in virtual environments Vis Res 35 2731–6 [23.6.4] Ward R , Morgan MJ (1978) Perceptual effect of pursuit eye movements in the absence of a target Nature 274 158–9 [21.6.1a] Ware C, Mitchell DE (1974) The spatial selectivity of the tilt aftereffect Vis Res 14 735–7 [21.6.1a] Warren N (1940) A comparison of standard tests of depth perception Am J Optom Arch Am Acad Optom 17 208–11 [18.2.4] Warren PA, Maloney LT, Landy MS (2002) Interpreting sampled contours in 3-D: analysis of variability and bias Vis Res 42 2431–46 [18.6.6] Washburn MF (1933) Retinal rivalry as a neglected factor 
in stereoscopic vision Proc Natl Acad Sci 19 773–7 [12.7.2] Washburn MF, Faison C, Scott R (1934) A comparison between the Miles A–B–C method and retinal rivalry as tests of ocular dominance Am J Psychol 46 633–6 [12.3.7] Watamaniuk SNJ, Sekuler R , Williams DW (1989) Direction perception in complex dynamic displays: the integration of direction information Vis Res 29 47–59 [16.5.3b, 22.7.4] Watanabe K (1999) Optokinetic nystagmus with spontaneous reversal of transparent motion perception Exp Brain Res 129 156–60 [22.6.1f ] Watanabe K , Paik Y, Blake R (2004) Preserved gain control for luminance during binocular rivalry suppression Vis Res 44 3065–71 [12.7.2] Watanabe M, Tanaka H, Uka T, Fujita I (2002) Disparity-selective neurones in area V4 of macaque monkeys J Neurophysiol 87 1960–73 [11.5.3a] Watanabe O, Fukushima K (1999) Stereo algorithm that extracts a depth cue from interocularly unpaired points Neural networks 12 569–78 [11.10.1c, 17.3] Watanabe T, Cavanagh P (1992) Depth capture and transparency of regions bounded by illusory and chromatic contours Vis Res 32 527–32 [22.2.4b] Watanabe T, Nanez JE, Moreno MA (1995) Depth release of illusory contour shape in the Ehrenstein grid Vis Res 35 2845–51 [22.2.4b] Watson AB, Nachmias J (1977) Patterns of temporal interaction in the detection of gratings Vis Res 17 893–902 [13.1.6c] Watson SE, Kramer AF (1999) Object-based visual attention and perceptual organization Percept Psychophys 61 31–49 [22.8.1] Watson TL, Pearson J, Clifford CWG (2004) Perceptual grouping of biological motion promotes binocular rivalry Curr Biol 14 1670–4 [12.4.4b]
Watt RJ (1987) Scanning from coarse to fine spatial scales in the human visual system after the onset of a stimulus J Opt Soc Am A 4 2006–21 [18.12.1c] Waugh SJ, Levi DM (1993) Visibility, timing and vernier acuity Vis Res 33 505–26 [18.12.1a] Weale RA (1954) Theory of the Pulfrich effect Ophthalmologica 128 380–8 [23.2.1] Weale RA (1956) Stereoscopic acuity and convergence J Opt Soc Am 46 907 [18.10.2a] Wehrhahn C, Westheimer G, Abulencia A (1990) Binocular summation in temporal-order detection J Opt Soc Am A 7 731–2 [13.1.6c] Weinman J, Cooke V (1982) A nonspecific learning effect in the perception of random-dot stereograms Perception 11 93–5 [18.14.2a] Weinshall D (1991) Seeing “ghost” planes in stereo vision Vis Res 31 1731–48 [15.3.1, 17.6.3, 18.9] Weinshall D (1993) The computation of multiple matching of doubly ambiguous stereograms with transparent planes Spat Vis 7 183–98 [15.3.1, 17.6.3, 18.9] Weiskrantz L (1987) Blindsight: a case study and implications Oxford University Press, London [11.6.4] Weisstein N (1972) Metacontrast In Handbook of sensory physiology (ed D Jameson, LM Hurvich) Vol VII/4 pp 233–72 Springer, New York [13.2.7] Weitzman B (1963) A threshold difference produced by a figure-ground dichotomy J Exp Psychol 66 201–5 [22.5.1a] Welchman AE, Deubelius A, Conrad V, et al. 
(2005) 3D shape perception from combined depth cues in human visual cortex Nat Neurosci 8 820–7 [11.8.1] Wells W C (1792) An Essay upon Single Vision with Two Eyes: Together with Experiments and Observations on several other Subjects in Optics (Cadell, London) (Reprinted in Wade 2003) [16.7.2b, 16.7.2d] Welpe E, von Seelen W, Fahle M (1980) A dichoptic edge effect resulting from binocular contour dominance Perception 9 683–93 [12.3.1a] Wenderoth PM (1970) A visual spatial aftereffect of surface slant Am J Psychol 83 576–90 [21.6.1b] Wenderoth PM (1971) Studies of a stereoscopic aftereffect of a contour slanted in the median plane Aust J Optom 54 114–23 [21.6.1a] Wenderoth PM, Rodger RS, Curthoys IS (1968) Confounding of psychophysical errors and sensory effects in adjustment measures of spatial aftereffects Percept Psychophys 4 133–8 [21.6.2b] Werner A (2006) The influence of depth segmentation on colour constancy Perception 35 1171–84 Werner H (1935) Studies on contour: I Qualitative analysis Am J Psychol 47 40–64 [22.4.6] Werner H (1937) Dynamics in binocular depth perception Psychol Monogr 49 1–120 [21.3.1, 21.3.4, 21.5.2] Werner H (1938) Binocular depth contrast and the conditions of the binocular field Am J Psychol 51 489–97 [21.3.1, 21.5.2] Werner H (1940) Studies on contour: strobostereoscopic phenomena Am J Psychol 53 418–22 [13.2.7, 13.2.7b] Wertheimer M (1912) Experimentelle Studien über das Sehen von Bewegung Z Psychol Physiol Sinnesorg 61 161–295 [22.5.3d] Wertheimer M (1923) Untersuchungen zur Lehre von der Gestalt Psychol Forsch 4 301–50. English translation in WD Ellis (1967) A source book of Gestalt psychology Humanities Press, New York [22.1.1] Westall CA, Schor CM (1985) Asymmetries of optokinetic nystagmus in amblyopia: the effect of selected retinal stimulation Vis Res 25 1431–8 [22.6.1c] Westall CA, Eizenman M, Kraft SP, et al. 
(1998) Cortical binocularity and monocular optokinetic asymmetry in early-onset esotropia Invest Ophthal Vis Sci 39 1352–9 [22.6.1e] Westendorf DH (1989) Binocular rivalry and dichoptic masking: suppressed stimuli do not mask stimuli in a dominating eye J Exp Psychol HPP 15 485–92 [13.2.4c] Westendorf DH, Blake R (1988) Binocular reaction times to contrast increments Vis Res 28 355–9 [13.1.7] Westendorf DH, Fox R (1974) Binocular detection of positive and negative flashes Percept Psychophys 15 61–5 [13.1.6a]
Westendorf DH, Fox R (1975) Binocular detection of vertical and horizontal line segments Vis Res 15 471–76 [13.1.2c, 13.1.6a] Westendorf DH, Fox R (1977) Binocular detection of disparate light flashes Vis Res 17 697–702 [13.1.2d] Westendorf DH, Blake R , Fox R (1972) Binocular summation of equivalent energy flashes of unequal duration Percept Psychophys 12 445–8 [13.1.6b] Westendorf DH, Blake R , Sloane M, Chambers D (1982) Binocular summation occurs during interocular suppression J Exp Psychol HPP 8 81–90 [12.5.2] Westheimer G (1965) Spatial interaction in the human retina during scotopic vision J Physiol 181 881–94 [13.2.3] Westheimer G (1967) Spatial interaction in human cone vision J Physiol 190 139–54 [13.2.3] Westheimer G (1978) Vertical disparity detection: is there an induced size effect? Invest Ophthal Vis Sci 17 545–51 [20.2.3a, 20.6.5a] Westheimer G (1979a) Cooperative neural processes involved in stereoscopic acuity Exp Brain Res 36 585–97 [18.3.2a, 18.12.2b] Westheimer G (1979b) The spatial sense of the eye Invest Ophthal Vis Sci 18 893–912 [18.11] Westheimer G (1984) Sensitivity for vertical retinal image differences Nature 307 632–4 [20.2.3a, 20.6.5a] Westheimer G (1986a) Panum’s phenomenon and the confluence of signals from the two eyes in stereoscopy Proc R Soc B 228 289–305 [17.6.1, 17.6.3] Westheimer G (1986b) Spatial interaction in the domain of disparity signals in human stereoscopic vision J Physiol 370 619–29 [21.2, 21.5.1] Westheimer G (2011) Reversed tilt effect for dichoptic stimulation in vertical meridian Vis Res 51 101–4 [13.3.2a] Westheimer G, Hauske G (1975) Temporal and spatial interference with vernier acuity Vis Res 15 1137–41 [13.2.5] Westheimer G, Levi DM (1987) Depth attraction and repulsion of disparate stimuli Vis Res 27 1361–8 [21.2] Westheimer G, Ley E (1996) Temporal uncertainty effects on line orientation discrimination and stereoscopic thresholds J Opt Soc Am 13 884–6 [18.13] Westheimer G, McKee SP (1977) Integration 
regions for visual hyperacuity Vis Res 17 89–93 [18.10.1a] Westheimer G, McKee SP (1978) Stereoscopic acuity for moving retinal images J Opt Soc Am 68 450–5 [18.3.3a, 18.10.1b] Westheimer G, McKee SP (1979) What prior uniocular processing is necessary for stereopsis? Invest Ophthal Vis Sci 18 614–21 [18.6.2a, 18.11] Westheimer G, McKee SP (1980a) Stereogram design for testing local stereopsis Invest Ophthal Vis Sci 19 802–9 [18.6.2a, 24.1.5] Westheimer G, McKee SP (1980b) Stereoscopic acuity with defocused and spatially filtered retinal images J Opt Soc Am 70 772–8 [18.5.2, 18.5.4b] Westheimer G, Mitchell DE (1969) The sensory stimulus for disjunctive eye movements Vis Res 9 749–55 [15.3.4b, 18.10.3a] Westheimer G, Pettet MW (1990) Contrast and duration of exposure differentially affect vernier and stereoscopic acuity Proc R Soc 27 42–6 [18.11] Westheimer G, Pettet MW (1992) Detection and processing of vertical disparity by the human observer Proc R Soc B 250 273–7 [20.6.5a] Westheimer G, Tanzman IJ (1956) Qualitative depth localization with diplopic images J Opt Soc Am 46 116–17 [18.4.1a, 18.6.4] Westheimer G, Truong TT (1988) Target crowding in foveal and peripheral stereoacuity Am J Optom Physiol Opt 65 395–9 [18.6.2a] Westheimer G, Shimamura K , McKee SP (1976) Interference with line-orientation sensitivity J Opt Soc Am 66 332–8 [13.2.5] Wetherick NE (1977) The significance of the nose for certain phenomena of visual perception Nat New Biol 296 442–3 [16.7.1] Wexler M, Quarti N (2008) Depth affects where we look Curr Biol 18 1872–6 [18.10.2b]
Weyand TG, Malpeli JG (1993) Responses of neurons in primary visual cortex are modulated by eye position J Neurophysiol 69 2258–60 [11.4.6b] Wheatley C, Cook ML, Vidyasagar TR (2004) Surface segregation influences pre-attentive search in depth Neuroreport 15 303–5 [22.8.2b] Wheatstone C (1838) Contributions to the physiology of vision – Part the first On some remarkable and hitherto unobserved phenomena of binocular vision Philos Tr R Soc 128 371–94 [12.1.1a, 12.3.1a, 16.7.3a, 16.7.7, 18.10.1a, 20.6.3b] White CT, Bonelli L (1970) Binocular summation in the evoked potential as a function of image quality Am J Optom Arch Am Acad Optom 47 304–9 [13.1.8b] White KD, Odom JV (1985) Temporal integration in global stereopsis Percept Psychophys 37 139–44 [20.4.2] White KD, Petry HM, Riggs LA, Miller J (1978) Binocular interactions during establishment of McCollough effects Vis Res 18 1201–15 [12.6.3, 13.3.5] White M (1979) A new effect of pattern on perceived lightness Perception 8 413–16 [22.4.5] Whitten DN, Brown KT (1973) Photopic suppression of monkey’s rod receptor potential, apparently by a cone-initiated lateral inhibition Vis Res 13 1629–58 [13.2.3] Whittle P (1965) Binocular rivalry and the contrast at contours J Exp Psychol 17 217–29 [12.3.2a] Whittle P, Challands PDC (1969) The effect of background luminance on the brightness of flashes Vis Res 9 1095–1110 [13.2.2] Whittle P, Bloor DC, Pocock S (1968) Some experiments on figural effects in binocular rivalry Percept Psychophys 4 183–8 [12.4.4b] Wick B (1990) Stability of retinal correspondence during divergence: evaluation with afterimages and Haidinger brushes Optom Vis Sci 67 779–86 [14.6.2a] Wick B (1991) Stability of retinal correspondence in normal binocular vision Optom Vis Sci 68 146–58 [14.4] Wickelgren BG, Sterling P (1969) Influence of visual cortex on receptive fields in the superior colliculus of the cat J Neurophysiol 32 16–22 [11.2.3] Wieniawa-Narkiewicz BM, Wimborne BM, Michalski A, Henry GH (1992) 
Area 21a in the cat and the detection of binocular orientation disparity Ophthal Physiol Opt 12 299–72 [11.6.2] Wiesenfelder H, Blake R (1990) The neural site of binocular rivalry relative to the analysis of motion in the human visual system J Neurosci 10 3880–8 [12.6.4] Wiesenfelder H, Blake R (1991) Apparent motion can survive binocular rivalry suppression Vis Res 31 1589–99 [12.5.4a] Wiesenfelder H, Blake R (1992) Binocular rivalry suppression disrupts recovery from motion adaptation Vis Neurosci 9 143–8 [12.6.4] Wilcox LM, Hess RF (1995) Dmax for stereopsis depends on size not spatial frequency content Vis Res 35 1061–9 [18.4.1c, 18.4.1d] Wilcox LM, Hess RF (1996) Is the site of non-linear filtering in stereopsis before or after binocular combination Vis Res 36 391–9 [18.7.2d] Wilcox LM, Hess RF (1997) Scale selection for second-order (nonlinear) stereopsis Vis Res 37 2981–92 [18.7.2d] Wilcox LM, Hess RF (1998) When stereopsis does not improve with increasing contrast Vis Res 38 3671–79 [18.7.2d] Wilcox LM, Lakra DC (2007) Depth from binocular half-occlusions in stereoscopic images of natural scenes Perception 36 830–9 [17.2.2] Wilcox LM, Timney B, St John R (1990) Measurement of visual aftereffects and inferences about binocular mechanisms in human vision Perception 19 43–55 [13.3.1, 13.3.2a] Wilcox LM, Timney B, Girash M (1994) On the contribution of a binocular ‘AND’ channel at contrast threshold Perception 23 659–69 [13.3.2a] Wilcox LM, Elder JH, Hess RF (2000) The effects of blur and size on monocular and stereoscopic localization Vis Res 40 3575–84 [18.7.2d] Wilcox LM, Harris JM, McKee, SP (2007) The role of binocular stereopsis in monoptic depth perception Vis Res 47 2367–77 [17.6.5]
Wilde K (1950) Der Punktreiheneffekt und die Rolle der binocularen Querdisparation beim Tiefensehen Psychol Forsch 23 223–62 [20.2.1, 22.2.3a] Wilkie M, Logothetis NK , Leopold DA (2003) Generalized flash suppression of salient visual targets Neuron 39 1043–52 [12.3.5f ] Williams DR , Artal P, Navarro R , et al. (1996) Off-axis optical quality and retinal sampling in the human eye Vis Res 36 1103–14 [13.1.2e] Williams JM, Lit A (1983) Luminance–dependent visual latency for the Hess effect the Pulfrich effect and simple reaction time Vis Res 23 171–9 [23.2.3] Williams MA, Morris AP, McGlone F et al. (2004) Amygdala responses to fearful and happy facial expressions under conditions of binocular suppression J Neurosci 24 2898–904 [12.9.2f ] Williams R (1974) The effect of strabismus on dichoptic summation of form information Vis Res 14 307–9 [13.1.3e] Williams S, Simpson A, Silva PA (1988) Stereoacuity levels and vision problems in children from 7 to 11 years Ophthal Physiol Opt 8 386–9 [18.2.3b] Wilson HR (1976) The significance of frequency gradients in binocular grating perception Vis Res 16 983–9 [20.2.1] Wilson HR (1977) Hysteresis in binocular grating perception: contrast effects Vis Res 17 843–51 [18.5.4a] Wilson HR (2003) Computational evidence for a rivalry hierarchy in vision Proc Natl Acad Sci 100 14499–503 [12.10] Wilson HR (2007) Minimal physiological conditions for binocular rivalry and rivalry memory Vis Res 47 2741–50 [12.10] Wilson HR , Bergen JR (1979) A four mechanism model for threshold spatial vision Vis Res 19 19–32 [20.2.1] Wilson HR , Gelb DJ (1984) Modified line element theory for spatial– frequency and width discrimination J Opt Soc Am A 1 127–31 [18.7.4] Wilson HR , Giese SC (1977) Threshold visibility of frequency gradient patterns Vis Res 17 1177–90 [20.2.1] Wilson HR , Kim J (1994) A model for motion coherence and transparency Vis Neurosci 11 1205–20 [22.3.3] Wilson HR , Blake R , Pokorny J (1988) Limits of binocular fusion in the short 
wave sensitive (“blue”) cones Vis Res 28 555–62 [12.1.3d] Wilson HR , Blake R , Halpern DL (1991) Coarse spatial scales constrain the range of binocular fusion on fine scales J Opt Soc Am A 8 229–36 [12.1.3b] Wilson HR , Ferrera VP, Yo C (1992) A psychophysically motivated model for two-dimensional motion perception Vis Neurosci 9 79–97 [18.7.2d] Wilson HR , Blake R , Lee SH (2001) Dynamics of travelling waves in visual perception Nature 412 907–10 [12.3.5e] Wilson JA, Anstis SM (1969) Visual delay as a function of luminance Am J Psychol 82 350–8 [23.2.3, 23.4.1] Wilson JA, Robinson JO (1986) The impossibly twisted Pulfrich pendulum Perception 15 503–4 [23.1.3] Wilson ME, Cragg BG (1967) Projections from the lateral geniculate nucleus in the cat and monkey J Anat 101 677–92 [11.9.2] Winn B, Bradley A, Strang NC, et al. (1995) Reversals of the colourdepth illusion explained by ocular chromatic aberration Vis Res 35 2975–84 [17.8] Wist ER (1968) The influence of the equidistance tendency on depth shifts resulting from an interocular delay in stimulation Percept Psychophys 3 89–92 [23.3.1] Wist ER (1970) Do depth shifts resulting from an interocular delay in stimulation result from a breakdown of binocular fusion? Percept Psychophys 8 15–19 [23.3.1] Wist ER (1974) Mach bands and depth adjacency Bull Psychonom Soc 3 97–9 [22.4.2] Wist ER (1975) Convergence and stereoscopic depth shifts produced by interocular delays in stimulation Bull Psychonom Soc 5 251–3 [23.3.1] Wist ER , Gogel WC (1966) The effect of interocular delay and repetition interval on depth perception Vis Res 6 325–34 [18.12.2a]
Wist ER , Brandt TH, Diener HC, Dichgans J (1977) Spatial frequency effect on the Pulfrich stereophenomenon Vis Res 17 391–7 [23.3.6] Witasek St (1899) über die Natur der geometrisch-optischen Täuschungen Z Psychol Physiol Sinnesorg 19 81–174 [16.3.1] Wittenberg S, Brock FW, Folsom WC (1969) Effect of training on stereoscopic acuity Am J Optom Arch Am Acad Optom 46 645–53 [18.14.1] Wohlgemuth A (1911) On the after–effect of seen movement Br J Psychol Monogr Supp No 1 1–117 [13.3.3a, 13.3.3d, 13.3.3f, 16.4.3, 21.1] Wojciulik E, Kanwisher N, Driver J (1998) Modulation of activity in the fusiform face area by covert attention: an MRI study J Neurophysiol 79 1574–8 [12.9.2f ] Wolf E, Zigler MJ (1955) Course of dark adaptation under various conditions of pre–exposure and testing J Opt Soc Am 45 696–702 [13.2.2] Wolf E, Zigler MJ (1963) Effects of uniocular and binocular excitation of the peripheral retina with test fields of various shapes on binocular summation J Opt Soc Am 53 1199–205 [13.1.2e] Wolf E, Zigler MJ (1965) Excitation of the peripheral retina with coincident and disparate test fields J Opt Soc Am 55 1517–19 [13.1.2e] Wolfe JM (1983a) Afterimages binocular rivalry and the temporal properties of dominance and suppression Perception 12 439–45 [12.3.5a] Wolfe JM (1983b) Influence of spatial frequency luminance and duration on binocular rivalry and abnormal fusion of briefly presented dichoptic stimuli Perception 12 447–56 [12.3.5a] Wolfe JM (1984) Reversing ocular dominance and suppression in a single flash Vis Res 27 471–8 [12.3.5f, 12.4.4a] Wolfe JM (1986a) Briefly presented stimuli can disrupt constant suppression and binocular rivalry suppression Perception 15 413–17 [12.3.5a] Wolfe JM (1986b) Stereopsis and binocular rivalry Psychol Rev 93 299–82 [12.7.3] Wolfe JM, Franzel SL (1988) Binocularity and visual search Percept Psychophys 44 81–93 [16.6.1a, 16.8] Wolfe JM, Held R (1980) Cyclopean stimulation can influence sensations of self-motion in normal and 
stereoblind subjects Percept Psychophys 28 139–42 [16.4.2g] Wolfe JM, Held R (1981) A purely binocular mechanism in human vision Vis Res 21 1755–9 [13.3.1, 13.3.2a] Wolfe JM, Held R (1982) Binocular adaptation that cannot be measured monocularly Perception 11 287–95 [13.3.2a] Wolfe JM, Held R (1983) Shared characteristics of stereopsis and the purely binocular process Vis Res 23 217–27 [13.3.1, 13.3.2] Wolfe JM, Held R , Bauer JA (1981) A binocular contribution to the production of optokinetic nystagmus in normal and stereoblind subjects Vis Res 21 587–90 [16.5.1] Wolpert DM, Miall RC, Cumming B, Boniface SJ (1993) Retinal adaptation of visual processing time delays Vis Res 33 1421–30 [23.4.2b] Wong BP, Woods RL, Peli E (2002) Stereoacuity at distance and near Optom Vis Sci 79 771–8 [18.2.4] Wong E, Weisstein N (1982) A new perceptual context–superiority effect: line segments are more visible against a figure than against a ground Science 218 587–9 [22.5.1a] Wong E, Weisstein N (1983) Sharp targets are detected better against a figure, and blurred targets are detected better against a background J Exp Psychol HPP 9 194–202 [22.5.1a] Wong E, Weisstein N (1985) A new visual illusion: flickering fields are localized in a depth plane behind nonflickering fields Perception 14 13–17 [22.1.1] Woo GCS (1974a) The effect of exposure time on the foveal size of Panum’s area Vis Res 14 473–80 [12.1.4] Woo GCS (1974b) Temporal tolerance of the foveal size of Panum’s area Vis Res 14 633–5 [12.1.4] Woo GCS, Sillanpaa V (1979) Absolute stereoscopic thresholds as measured by crossed and uncrossed disparities Am J Optom Physiol Opt 56 350–5 [18.6.4]
Wood ICJ (1983) Stereopsis with spatially-degraded images Ophthal Physiol Opt 3 337–40 [18.5.4b] Wood JM, Collins MJ, Carkeet A (1992) Regional variations in binocular summation across the visual field Ophthal Physiol Opt 12 46–51 [13.1.2e] Woodburne LS (1934) The effect of constant visual angle upon the binocular discrimination of depth differences Am J Psychol 46 273–86 [18.2.1a] Wood CC, Spear PD, Braun JJ (1973) Direction-specific deficits in horizontal optokinetic nystagmus following removal of visual cortex in the cat Brain Res 60 231–7 [22.6.1b] Woods RL, Bradley A, Atchison DA (1996) Monocular diplopia caused by ocular aberrations and hyperopic defocus Vis Res 36 3597–606 [14.4.2] Worth C (1903) Squint Blakiston, Philadelphia [14.4.2] Wright MJ (1986) Apparent velocity of motion aftereffects in central and peripheral vision Perception 15 603–12 [13.3.3a] Wright WD (1951) The role of convergence in stereoscopic vision Proc Physics Soc 64B 289–97 [18.10.2a] Wu MC, David, SV, Gallant, JL (2006) Complete functional characterization of sensory neurons by system identification Ann Rev Neurosci 29 477–505 [11.10.1b] Wu X, Zhou Q, Lin X, Wang YJ (1998) Stereo capture: local rematching driven by binocularly attended 3-D configuration rather than retinal images Vis Res 38 2081–5 [22.2.4b] Wunderlich K , Schneider KA, Kastner S (2005) Neural correlates of binocular rivalry in the human lateral geniculate nucleus Nat Neurosci 8 1595–602 [12.9.1] Würger SM, Landy MS (1989) Depth interpolation with sparse disparity cues Perception 18 39–54 [22.2.1] Xue JT, Ramoa AS, Carney T, Freeman RD (1987) Binocular interaction in the dorsal lateral geniculate nucleus of the cat Exp Brain Res 68 305–10 [11.2.1] Xue JT, Carney T, Ramoa AS, Freeman RD (1988) Binocular interaction in the perigeniculate nucleus of the cat Exp Brain Res 69 497–508 [11.2.1] Yang J, Stevenson SB (1999) Post retinal processing of background luminance Vis Res 39 4045–51 [13.2.2] Yang JN, Maloney LT (2001) 
Illuminant cues in surface color perception: tests of three candidate cues Vis Res 41 2581–600 [22.4.6] Yang JN, Shevell SK (2002) Stereo disparity improves color constancy Vis Res 42 1979–89 [22.4.6] Yang M, Papathomas TV, Kovács J, Julesz B (1996) No fusion in reversecolor-polarity stereograms: symmetries in luminance and color contributions Invest Ophthal Vis Sci 37 (Abs) 284 [15.3.8a] Yang Y, Blake R (1991) Spatial frequency tuning of human stereopsis Vis Res 31 1177–89 [18.7.4] Yang Y, Rose D, Blake R (1992) On the variety of percepts associated with dichoptic viewing of dissimilar monocular stimuli Perception 21 47–62 [12.3.3b] Yang Z , Purves D (2003) A statistical explanation of visual space Nat Neurosci 6 632–40 [15.3.12] Yantis S (1992) Multielement visual tracking; attention and perceptual organization Cog Psychol 24 295–340 [22.8.2c] Yarbus AL (1967) Eye movements and vision (Translated by LA Riggs) Plenum, New York [18.10.4] Ye M, Bradley A, Thibos LN, Zhang X (1991) Interocular differences in transverse chromatic aberration determine chromostereopsis for small pupils Vis Res 31 1787–96 [17.8] Ye M, Bradley A, Thibos LN, Zhang X (1992) The effect of pupil size on chromostereopsis and chromatic diplopia: interaction between the Stiles–Crawford effect and chromatic aberrations Vis Res 32 2121–8 [17.8] Yelin D, Rizvi I, White WM, et al. (2006) Three-dimensional miniature endoscopy Nature 443 765 [24.2.4] Yellott JI, Kaiwi JL (1979) Depth inversion despite stereopsis: the appearance of random-dot stereograms on surfaces seen in reverse perspective Perception 8 135–42 [21.6.2g]
Yellott JI, Wandell BA (1976) Color properties of the contrast flash effect: monoptic vs dichoptic comparisons Vis Res 16 1275–80 [13.2.7b] Yeshurun Y, Schwartz EL (1989) Cepstral filtering on a columnar image architecture: a fast algorithm for binocular stereo segmentation IEEE Tr Patt Anal Mach Intel 11 759–67 [15.2.1d] Yeshurun Y, Schwartz EL (1990) Neural maps as data structures Fast segmentation of binocular images In Computational neuroscience (ed EL Schwartz) pp 256–66 MIT Press, Cambridge MA [15.2.1d] Yeshurun Y, Schwartz EL (1999) Cortical hypercolumn size determines stereo fusion limits Biol Cyber 80 117–29 [12.1.1d] Yin C, Kellman PJ, Shipley TF (2000) Surface integration influences depth discrimination Vis Res 40 1969–78 [22.1.3] Young RH, Lit A (1972) Stereoscopic acuity for photometrically matched background wavelengths at scotopic and photopic levels Percept Psychophys 11 213–16 [18.5.5] Yu K, Blake R (1992) Do recognizable figures enjoy an advantage in binocular rivalry? J Exp Psychol 18 1158–73 [12.8.3a] Zanoni D, Rosenbaum AL (1991) A new method for evaluating stereo acuity J Ped Ophthal Strab 28 255–60 [18.2.3d] Zaretskaya N, Thielscher A, Logothetis NK, Bartels A (2010) Disrupting parietal function prolongs dominance durations in binocular rivalry Curr Biol 20 2106–11 [12.9.2f] Zee DS, Tusa RJ, Herdman SJ, et al. 
(1987) Effects of occipital lobotomy upon eye movements in primate J Neurophysiol 58 883–906 [22.6.1b] Zeevi YY, Geri GA (1985) A purely central movement aftereffect induced by binocular viewing of dynamic visual noise Percept Psychophys 38 433–7 [16.4.3, 23.6.1] Zeki SM (1978) Uniformity and diversity of structure and function in rhesus monkey prestriate visual cortex J Physiol 277 273–90 [13.1.8b] Zeki SM (1990) A century of achromatopsia Brain 113 1721–77 [17.1.4e] Zeki SM, Fries W (1980) A function of the corpus callosum in the Siamese cat Proc R Soc B 207 279–58 [11.9.2] Zemon V, Pinkhasov E, Gordon J (1993) Electrophysiological tests of neural models: evidence for nonlinear binocular interactions in humans Proc Natl Acad Sci 90 2975–8 [13.1.8b] Zhang X, Bradley A, Thibos LN (1991) Achromatizing the human eye: the problem of chromatic parallax J Opt Soc Am A 8 686–91 [17.8] Zhang ZL, Edwards M, Schor CM (2001) Spatial interactions minimize relative disparity between adjacent surfaces Vis Res 41 2995–307 [15.3.2] Zhang ZL, Berends EM, Schor CM (2003) Thresholds for stereo-slant discrimination between spatially separated targets are influenced mainly by visual and memory factors but not oculomotor instability J Vis 3 710–24 [18.10.2a, 18.10.2b] Zhang ZL, Cantor C, Ghose T, Schor CM (2004) Temporal aspects of spatial interactions affecting stereo-matching solutions Vis Res 44 3183–92 [15.3.2] Zhang ZL, Cantor C, Schor CM (2010) Perisaccadic stereo depth with retinal disparity Curr Biol 20 1176–81 [19.3.5] Zhaoping L (2002) Pre-attentive segmentation and correspondence in stereo Philos Trans R Soc B 357 1877–83 [11.10.1c] Zhou H, Friedman HS, von der Heydt R (2000) Coding of border ownership in monkey visual cortex J Neurosci 20 6594–611 [11.5.1] Zhou W, Jiang Y, He S, Chen D (2010) Olfaction modulates visual perception in binocular rivalry Curr Biol 20 1356–8 [12.8.4] Zhu M, Hertle RW, Kim CH, et al. 
(2008) Effect of binocular rivalry suppression on initial ocular following responses J Vis 8 (4) Article 19 [12.5.4b] Zhu Y, Qian N (1996) Binocular receptive field models, disparity tuning and characteristic disparity Neural Comput 8 1647–77 [11.4.3c] Ziegler LR, Hess RF (1997) Depth perception during diplopia is direct Perception 26 1225–30 [15.3.4b, 18.4.1g]
Ziegler LR, Hess RF (1999) Stereoscopic depth but not shape perception from second order stimuli Vis Res 39 1491–507 [18.7.2d] Ziegler LR, Kingdom FAA, Hess RF (2000a) Local luminance factors that determine the maximum disparity for seeing cyclopean surface shape Vis Res 40 1157–65 [18.4.1e] Ziegler LR, Hess RF, Kingdom FAA (2000b) Global factors that determine the maximum disparity for seeing cyclopean surface shape Vis Res 40 493–502 [18.4.1f] Zimba LD, Blake R (1983) Binocular rivalry and semantic processing: out of sight out of mind J Exp Psychol HPP 9 807–15 [12.8.3b]
Zinn WJ, Solomon H (1985) A comparison of static and dynamic stereoacuity J Am Optom Assoc 56 712–15 [18.2.4] Zlatkova MB, Anderson RS, Ennis FA (2001) Binocular summation for grating resolution in foveal and peripheral vision Vis Res 41 3093–100 [13.1.2e] Zohary E, Shadlen MN, Newsome WT (1994) Correlated neuronal discharge rate and its implications for psychophysical performance Nature 370 140–3 [13.1.1b] Zuber BL, Stark L (1966) Saccadic suppression: elevation of visual threshold associated with saccadic eye movements Exp Neurol 16 72–79 [23.2.4]
INDEX OF CITED JOURNALS
Acta Ophthal Acta Ophthal Supp Acta Otolaryngol Acta Physiol Acta Psychol Adv Child Devel Behav Adv Ophthal Acoustica Akustique Zeitschrift Am J Hum Genet Am J Islamic Soc Sci Am J Ophthal Am J Optom Arch Am Acad Optom Am J Optom Physiol Opt Am J Philol Am J Physics Am J Physiol Am J Physiol Opt Am J Psychol Am J Sci Arts Am Nat Am Orthopt J Am Psychol Am Zool An Acad Bras Cienc Anim Behav Ann Biomed Engin Ann Hum Genet Ann Med Hist Ann N Y Acad Sci Ann Neurol Ann Ophthal Ann Rev Biophys Bioeng Ann Rev Cell Dev Bio Ann Rev Neurosci Ann Rev Physiol Ann Physik Ann Physik Chem Ann Sci Ann Psychol Ann Neurol App Optics Arch Augenheilk Arch für Anat Physiol Wissen Med Arch Ges Psychol Arch für Ophthal Arch Psychiat Nervenkrank Arch Neurol Arch Neurol Psychiat Arch Ophthal Arch Ophthal Otol Arch Psychiat Z ges Neurol Arch Psychol Arch Hist Exact Sci Arch Mikros Anat Ent
Art History Art Quarterly Artif Intell Atten Percept Psychophys Auk Aust J Optom Aust J Psychol Aviat Space Environ Med Behav Biol Behav Ecol Sociobiol Behav Brain Res Behav Res Meth Instrum Behav Res Meth Instrum Comput
Acta Ophthalmologica Acta Ophthalmologica Supplement Acta Otolaryngologica Acta Physiologica Acta Psychologica Advances in Child Development and Behavior Advances in Ophthalmology Akust Z American Journal of Human Genetics American Journal of Islamic Social Sciences American Journal of Ophthalmology American Journal of Optometry and Archives of American Academy of Optometry American Journal of Optometry and Physiological Optics American Journal of Philology American Journal of Physics American Journal of Physiology American Journal of Physiological Optics American Journal of Psychology American Journal of Science and Arts American Naturalist American Orthoptic Journal American Psychologist American Zoologist Anais da Academia Brasileira de Ciencias Animal Behaviour Annals of Biomedical Engineering Annals of Human Genetics Annals of Medical History Annals of the New York Academy of Science Annals of Neurology Annals of Ophthalmology Annual Review of Biophysics and Bioengineering Annual Review of Cell and Developmental Biology Annual Review of Neuroscience Annual Review of Physiology Annalen der Physik Annalen der Physik und Chemie Annals of Science Année Psychologie Annals of Neurology Applied Optics Archiv Augenheilkunde Archiv für Anatomie Physiologie und Wissenschaftliche Medicin Archiv für gesamte Psychologie Archiv für Ophthalmologie Archiv für Psychiatrie und Nervenkrankheiten Archives of Neurology Archives of Neurology and Psychiatry AMA Archives of Ophthalmology Archives of Ophthalmology and Otology Archiv für Psychiatrie und Zeitschrift für die gesamte Neurologie Archives of Psychology Archive for History of Exact Sciences Archiv Mikroskopie Anatomie und Entwicklungsmechanisme
Behav Proc Ber D Ophthal Ges Binoc Vis Binoc Vis Strab Quart Biochem Cell Biol Biol Behav Biol Cybern Biomed Instrum Technol Biophys J Brain Behav Evol Brain Cogn Brain Res Brain Res Rev Br J Anim Behav Br J Ophthal Br J Philos Sci Br J Photog Br J Physiol Opt Br J Psychol Bull Acad R Sci Belg Bull Math Biophys Bull Psychonom Soc Bull Soc Fran Photo Can J Exp Psychol Can J Neurol Sci Can J Ophthal Can J Optom Can J Physiol Pharmacol Can J Psychol Can J Zool Cell Tissue Res Cereb Cortex Child Devel Clin Exp Optom Clin Vis Sci Cog Psychol Comptes Rendus Acad Sci Comput Graph Comput Biol Med Comput Vis Gr Im Proc Copeia Curr Biol Curr Opin Neurobiol Development Devel Biology
Art Bulletin
Artificial Intelligence Attention, Perception, and Psychophysics Auk Australian Journal of Optometry Australian Journal of Psychology Aviation Space and Environmental Medicine Behavioural Biology Behavioral Ecology and Sociobiology Behavioural Brain Research Behavior Research Methods and Instrumentation Behavior Research Methods, Instruments, and Computers Behavioural Processes Berichte der Deutschen Ophthalmologie Gesellschaft Binocular Vision Binocular Vision and Strabismus Quarterly Biochemistry and Cell Biology Biology and Behaviour Biological Cybernetics Biomedical Instrumentation and Technology Biophysical Journal Brain Behavior and Evolution Brain and Cognition Brain Research Brain Research Review British Journal of Animal Behaviour British Journal of Ophthalmology British Journal for the Philosophy of Science British Journal of Photography British Journal of Physiological Optics British Journal of Psychology Bulletin de l’Académie Royale des Sciences de Belgique Bulletin of Mathematical Biophysics Bulletin of the Psychonomic Society Bulletin de la Société Française de Photographie Canadian Journal of Experimental Psychology Canadian Journal of Neurological Sciences Canadian Journal of Ophthalmology Canadian Journal of Optometry Canadian Journal of Physiology and Pharmacology Canadian Journal of Psychology Canadian Journal of Zoology Cell and Tissue Research Cerebral Cortex Child Development Clinical and Experimental Optometry Clinical Vision Science Cognitive Psychology Comptes Rendus de l’Académie des Sciences Computer Graphics Computers in Biology and Medicine Computer Vision, Graphics and Image Processing Current Biology Current Opinion in Neurobiology Developmental Biology
Devel Brain Res Devel Psychol Displays Doc Ophthal Ecol Psychol Edinb J Sci EEG Clin Neurophysiol Eur J Neurosci Exp Brain Res Exp Neurol Exp Psychol Monogr Forsch Zool Ger J Ophthal Graefe’s Arch Klin Exp Ophthal Herpetologica Hum Factors Hum Genet Hum Neurobiol IEEE Tr Patt Anal Mach Intel IEEE Tr Comput Graph App IEEE Tr Biomed Engin IEEE Tr Sys Sci Cybern IEEE Tr Man Mach Cybern IEEE Tr Man-Mach Syst Image Vis Comp Infant Behav Devel Int J Comput Vis Int J Devel Biol Int J Man-Mach Stud Int J Neurosci Int J Optom Int J Patt Recog Artif Intell Int J Psychophysiol Int J Robot Res Int J Virtual Reality Int Rec Med Quart Rev Ophthal Internat J Neurosci Invest Ophthal Invest Ophthal Vis Sci Invest Radiol Isis Jap Psychol Res J Am Intra-ocular Implant Soc J Abn Soc Psychol J Acoust Soc Am J AAPOS J Am Optom Assoc J Anat J Audio Engin Soc J Audit Res J Biomed Opt J Cell Biol J Cell Comp Physiol J Chem Physics J Clin Invest J Cog Neurosci J Comp Neurol J Comp Physiol J Comp Physiol Psychol J Comput Assist Tomog J Comput Neurosci J de Physique J Display Technol
Developmental Brain Research Developmental Psychology Documenta Ophthalmologica Ecological Psychology Edinburgh Journal of Science Electroencephalography and Clinical Neurophysiology European Journal of Neuroscience Experimental Brain Research Experimental Neurology Experimental Psychology Monograph Fortschritte der Zoologie German Journal of Ophthalmology Graefes Archiv für klinische und experimentelle Ophthalmologie Human Factors Human Genetics Human Neurobiology IEEE Transactions on Pattern Analysis and Machine Intelligence IEEE Transactions on Computer Graphics Applications IEEE Transactions on Biomedical Engineering IEEE Transactions on Systems Science and Cybernetics IEEE Transactions on Man, Machines and Cybernetics IEEE Transactions on Man-Machine Systems Image and Vision Computing Infant Behavior and Development International Journal of Computer Vision International Journal of Developmental Biology International Journal of Man-Machine Studies International Journal of Neuroscience International Journal of Optometry International Journal of Pattern Recognition and Artificial Intelligence International Journal of Psychophysiology International Journal of Robotic Research International Journal of Virtual Reality International Record of Medicine and Quarterly Review of Ophthalmology International Journal of Neuroscience Investigative Ophthalmology Investigative Ophthalmology and Visual Science Investigative Radiology Japanese Psychological Research Journal of the American Intra-ocular Implant Society Journal of Abnormal and Social Psychology Journal of the Acoustical Society of America Journal of the American Association of Pediatric Ophthalmology and Strabismus Journal of the American Optometric Association Journal of Anatomy Journal of the Audio Engineering Society Journal of Auditory Research Journal of Biomedical Optics Journal of Cell Biology Journal of Cellular and Comparative Physiology Journal of Chemical Physics Journal of Clinical Investigation Journal 
of Cognitive Neuroscience Journal of Comparative Neurology Journal of Comparative Physiology Journal of Comparative and Physiological Psychology Journal of Computer Assisted Tomography Journal of Computational Neuroscience Journal de Physique Journal of Display Technology
J Electrophysiol J Exp Anal Behav J Exp Biol J Exp Child Psychol J Exp Psychol J Exp Psychol Gen J Exp Psychol HPP J Franklin Inst J Gen Physiol J Gen Psychol J Hist Behav Sci J Hist Med Allied Sci J Hist Philos J Im Sci Technol J Inst Elec Engin J Math Imag Vis J Math Psychol J Micros J Morphol J Mot Behav J Nat Philos Arts J Navig J Neurobiol J Neurochem J Neurocytol J Neurol J Neurol Neurosurg Psychiat J Neurol Sci J Neurophysiol J Neurosci J Opt Soc Am J Orn J Ped Ophthal Strab J Person J Physiol J Psychol J Soc Motion Pict Engin J Soc Motion Pict Televis Engin J Warb Court Inst J Sports Sci J Theor Biol J Ultrastruct Res J Vestib Res J Vis Klin Monat Augenheilk Lond Edin Philos Mag J Sci Lon Edin Dub Philos Mag J Sci Mar Biol Med Biol Engin Med Biol Engin Comput Med Prog Technol Med Sci Res Medieval Studies Mem Am Philos Soc MIT Quart Prog Rep Molec Neurobiol Monat Ber Akad Müller’s Arch Anat Nature Med Nat Neurosci Nat New Biol Nat Rev Neurosci Naturwissenschaften Neural Comput Neural Networks Neurocase Neurocomputing
Journal of Electrophysiology Journal of the Experimental Analysis of Behavior Journal of Experimental Biology Journal of Experimental Child Psychology Journal of Experimental Psychology Journal of Experimental Psychology: General Journal of Experimental Psychology: Human Perception and Performance Journal of the Franklin Institute Journal of General Physiology Journal of General Psychology Journal of the History of the Behavioral Sciences Journal of the History of Medicine and Allied Sciences Journal of the History of Philosophy Journal of Image Science and Technology Journal of the Institute of Electrical Engineers Journal of Mathematical Imaging and Vision Journal of Mathematical Psychology Journal of Microscopy Journal of Morphology Journal of Motor Behavior Journal of Natural Philosophy and Arts Journal of Navigation Journal of Neurobiology Journal of Neurochemistry Journal of Neurocytology Journal of Neurology Journal of Neurology Neurosurgery and Psychiatry Journal of the Neurological Sciences Journal of Neurophysiology Journal of Neuroscience Journal of the Optical Society of America Journal of Ornithology Journal of Pediatric Ophthalmology and Strabismus Journal of Personality Journal of Physiology Journal of Psychology Journal of the Society of Motion Picture Engineers Journal of the Society of Motion Picture and Television Engineers Journal of the Warburg and Courtauld Institutes Journal of Sports Science Journal of Theoretical Biology Journal of Ultrastructure Research Journal of Vestibular Research Journal of Vision Klinische Monatsblätter für Augenheilkunde London and Edinburgh Philosophical Magazine and Journal of Science London Edinburgh and Dublin Philosophical Magazine and Journal of Science Marine Biology Medical and Biological Engineering Medical and Biological Engineering and Computing Medical Progress Through Technology Medical Science Research Memoirs of the American Philosophical Society MIT Quarterly Progress Report Molecular Neurobiology 
Monatsberichte der Berliner Akademie Müller’s Archiv für Anatomie Nature Medicine Nature Neuroscience Nature New Biology Nature: Reviews Neuroscience Neural Computation Neural Networks
NeuroImage Neuro-Ophthal Neuropsychologia Neurosci Neurosci Lett Neurosci Res News Physiol Sci Nucleic Acids Res NZ J Psychol NZ J Zool Ophthal Physiol Opt Ophthal Res Optica Acta Optom Vis Sci Percept Mot Skills Percept Psychophys Pflügers Arch ges Physiol Philos Stud Philos Tr R Soc Photogram Engin Photogram Engin Rem Sen Photograph J Physiol Behav Physiol Rev Physiol Zool Poggendorff’s Ann Physik Chem Primates Proc Soc Neurosci Proc Am Philos Soc Proc Ass Res Nerv Dis Proc Internat Soc Opt Engin Proc Natl Acad Sci Proc Physics Soc Proc R Soc Proc R Soc Med Proc Soc Photo Opt Instrum Engin Proc Physiol Soc Prog Brain Res Prog Neurobiol Prog Psychobiol Physiol Psychol Psychol Beit Psychol Bull Psychol Forsch Psychol Monogr Psychol Rec Psychol Res Psychol Rev Psychol Rev Monogr Supp Psychol Sci
Neuro-Ophthalmology Neuroscience Neuroscience Letters Neuroscience Research News in Physiological Sciences Nucleic Acids Research New Zealand Journal of Psychology New Zealand Journal of Zoology Ophthalmic and Physiological Optics Ophthalmic Research Optometry and Vision Science Perceptual and Motor Skills Perception and Psychophysics Pflügers Archiv für die gesamte Physiologie Philosophische Studien Philosophical Transactions of the Royal Society London Photogrammetric Engineering Photogrammetric Engineering and Remote Sensing Photographic Journal Physiology and Behavior Physiological Reviews Physiological Zoology Poggendorff ’s Annalen der Physik und Chemie Proceedings of the Society of Neuroscience Proceedings of the American Philosophical Society Proceedings of the Association for Research in Nervous Diseases Proceedings of the International Society of Optical Engineers Proceedings of the National Academy of Sciences Proceedings of the Physics Society Proceedings of the Royal Society London Proceedings of the Royal Society of Medicine Proceedings of the Society of Photo-optical Instrumentation Engineers Proceedings of the Physiological Society Progress in Brain Research Progress in Neurobiology Progress in Psychobiology and Physiological Psychology Psychologische Beitrage Psychological Bulletin Psychologische Forschung Psychological Monographs Psychological Record Psychological Research Psychological Review Psychological Review Monograph Supplements Psychological Science
Psychol Stud    Psychologische Studien (Leipzig)
Psychonom Bull Rev    Psychonomic Bulletin and Review
Psychonom Monogr Supp    Psychonomic Monograph Supplements
Psychonom Sci    Psychonomic Science
Quart J Exp Physiol    Quarterly Journal of Experimental Physiology
Quart J Exp Psychol    Quarterly Journal of Experimental Psychology
Res Devel    Research Development
Rev Mod Phys    Review of Modern Physics
Rev Neurosci    Review of Neuroscience
Rev Neurol    Revue Neurologique
Scand J Psychol    Scandinavian Journal of Psychology
Sci Am    Scientific American
Sem Neurosci    Seminars in Neuroscience
Sen Proc    Sensory Processes
Sig Proc: Image Comm    Signal Processing: Image Communication
Soc Neurosci Abstr    Society of Neuroscience Abstracts
Soc Neurosci Symp    Society of Neuroscience Symposium
Spat Vis    Spatial Vision
Survey Ophthal    Survey of Ophthalmology
System Zool    Systematic Zoology
Art Quart    The Art Quarterly
The American Cartographer
Tohoku Psychol Folia    Tohoku Psychologica Folia
Tr Am Ophthal Soc    Transactions of the American Ophthalmological Society
Tr Am Philos Soc    Transactions of the American Philosophical Society
Tr Microscop Soc Lon    Transactions of the Microscopical Society of London
Tr Ophthal Soc UK    Transactions of the Ophthalmological Society of the United Kingdom
Tr Opt Soc Lon    Transactions of the Optical Society of London
Tr R Soc Edinb    Transactions of the Royal Society of Edinburgh
Tr Soc Motion Pict Engin    Transactions of the Society of Motion Picture Engineers
TINS    Trends in Neuroscience
Trends Cognit Sci    Trends in Cognitive Science
Vis Cognit    Visual Cognition
Vis Neurosci    Visual Neuroscience
Vis Res    Vision Research
Vis Sci Soc Abstracts    Vision Sciences Society Abstracts
Z Exp Angew Psychol    Zeitschrift für experimentelle und angewandte Psychologie
Z Gesch Arabi-Islam Wissen    Zeitschrift für Geschichte der Arabisch-Islamischen Wissenschaft
Z Naturforsch    Zeitschrift für Naturforschung
Z Neurol Psychiat    Zeitschrift für Neurologie und Psychiatrie
Z Psychol    Zeitschrift für Psychologie
Z Psychol Physiol Sinnesorg    Zeitschrift für Psychologie und Physiologie der Sinnesorgane
Z Sinnesphysiol    Zeitschrift für Sinnesphysiologie
Z Tierpsychol    Zeitschrift für Tierpsychologie
Z Wissen Photog Photophy Photochem    Zeitschrift für Wissenschaftliche Photographie Photophysik und Photochemie
PORTRAIT INDEX VOLUME 2
Adelson, E. H., 495
Anderson, B. L., 486
Anstis, S. M., 227
Banks, M. S., 408
Barlow, H. B., 2
Bishop, P. O., 3
Blake, R., 67
Blakemore, C., 2
Braddick, O. J., 220
Bradshaw, M. F., 416
Burian, H. M., 165
Burr, D. C., 521
Cumming, B. G., 15
De Weert, C. M. M., 83
DeAngelis, G. C., 27
Fahle, M., 68
Foley, J. M., 303
Fox, R., 65
Freeman, R. D., 10
Frisby, J. P., 250
Georgeson, M. A., 220
Gilchrist, A. L., 492
Gillam, B., 413
Howard, I. P., 407
Julesz, B., 291
Kaufman, L., 255
Kingdom, F. A. A., 259
Legge, G. E., 111
Lit, A., 519
Mayhew, J. E. W., 251
McKee, S., 314
Morgan, M. J., 523
Nakayama, K., 472
Ogle, K. N., 302
O’Shea, R. P., 75
Ohzawa, I., 10
Ono, H., 237
Parker, A. J., 14
Patterson, R., 219
Pettigrew, J. D., 3
Poggio, G. F., 7
Ramachandran, V. S., 484
Rogers, B. J., 407
Ross, J., 534
Simmons, D. R., 259
Stevenson, S. B., 342
Trotter, Y., 21
Tyler, C. W., 316
Ullman, S., 500
Uttal, W. R., 350
Van de Grind, W. A., 237
Verstraten, F. A. J., 503
Von der Heydt, R., 24
Wade, N. J., 69
Westheimer, G., 296
Wilson, H. R., 311
Wolfe, J. M., 74
SUBJECT INDEX
Abathic distance, 178 for frontal plane, 176 Absolute disparity, 190, 191f, 387 horizontal, 363 vertical, 363, 380 Absolute slant, 445f Accommodation, binocular rivalry and, 87–88 Achromatic stimuli chromatic stimuli compared to, 258–60 depth modulation and, 260f dynamics of, 260 noise and, 260–61 Achromatopsia, 261 Acuity. See also Stereoacuity binocular summation for, 115 in cyclopean domain, 213–15 stereoacuity and other, 351–54 Adaptation to disparity depth aftereffects from, 456 depth perception versus, 458–59 Adelson, Edward H., 495f Adjacency principle, 386, 438, 497 Adjacent vertical disparities, distinct, 399–401 Aerial photography, 555, 556f Afferent visual signal, 149 Afoveate mammals, OKN in, 503–4 Allocentric frame of reference, 439 Alternating monocular induction, 136 Anaglyph stereoscopes, 542 AND cells binocular, 136 cyclopean image and, 139 Anderson, Bart, 486f Angle alpha, 284 Angle of anomaly, 163 Angle of declination, 180 Angle of strabismic deviation, 163 Angular disparity, 373–74, 374f Angular subtense, 155 Anisometropia, 380 stereoacuity reduced by, 312 Anisopia, 286 Anisotropy of depth modulations, 415, 416f–417f of monocular acuity, 325 in oblique surfaces, 414 orientation, 413 slant-inclination, 413–15, 414f–415f slant-orientation, 414 stereoscopic, 413–15 Anomalous retinal correspondence (ARC) causes of, 164–65 harmonious, 163 measurement of, 163–64 monocular diplopia and, 163 physiological indicators of, 164 treatment of, 165 types of, 162–63 Anstis, Stuart, 227f
Anticorrelated stereograms, depth aftereffects from, 460 AO Vectographic Card Test, 291 Aperture problem, 484, 485f Apparent brightness, dichoptic, 119, 119f Apparent motion configurational factors in, 502 coplanarity factor and, 501–2 interstimulus distance and, 499–501, 501f occlusion and, 502f three-dimensional, 502f from suppressed images, 88 Apparent-motion cascades, dynamic noise Pulfrich effect and, 535–36 Approaching objects, detection of, 34 ARC. See Anomalous retinal correspondence Area-contour rivalry, 63 Area rivalry, 63 Area TE. See Inferior temporal cortex Aristotle, 140 Aschenbrenner, Claus, 548 Aspect-ratio discrimination, cyclopean domain and, 213, 213f Association field, 84 Atomic force microscope, 561 Attention binocular rivalry and, 96–97 controlling locus of, 358 endogenous, 96, 358 exogenous, 96, 358 OKN and shifts in, 507–8 stereoacuity and, 358–59 in 3-D space, 511–14 Attraction, depth, 434f, 436, 436f Autostereograms, random-dot, 549–50, 550f–553f Azimuth, 155 cyclopean, 160 Azimuth-latitude/elevation-latitude system, 156, 156f Azimuth-latitude/elevation-longitude system, 156, 156f Azimuth-longitude/elevation-latitude system, 155–56, 156f Azimuth-longitude/elevation-longitude system, 155, 156f
Backward masking. See Metacontrast Bagolini striated glasses test, 163 Banks, Martin S., 408f Barber pole illusion, 486, 486f Barlow, Horace B., 2f Bar stimuli, disparity detectors and orientation of, 20 Binaural masking-level difference, 132 Binaural unmasking, 132 Binocular cells AND, 136 OR, 135
discovery of, 2–3 disparity detectors and, 42 left-right inseparable, 32 left-right separable, 32 model of, 11, 12f monocular receptive fields of V1 and, 15f primates, classification of, 7 receptive field of, 5 Binocular color mixing dispute over, 61 factors affecting, 61–62 Young-Helmholtz theory on, 61 Binocular composites, cyclopean figural effects with, 215–16, 215f Binocular correlation-sensitivity function, 322f Binocular correspondence normal, 162 problem, 152, 183 stability of, 161–66 Binocular disparity, 51 absolute and relative, 288, 288f causes of, 152 dichoptic cyclopean stimuli and, 211–12, 212f geometrical definition of, 153–54, 153f monocular occlusion interactions with, 263–67 OKN and, 506 retinal axis systems measuring, 157, 157t Binocular ecological optics, 149 Binocular energy neurons disparity detector types of, 43f, 44 output processing of, 46 Binocular field of fixation, 149 Binocular fusion, 51–60 binocular rivalry and precedence of, 95, 95f cyclovergence and, 203, 203f disparity gradients and, 55, 55f dual-response theory of, 94–95 Helmholtz on, 93 mental theory of, 93 spatial frequency and, 54 stereopsis and, 92–96 suppression theory of, 93–94 two-channel theory of, 94–95 Binocular images correlating, 183–89 cross-correlation of, 183–84 hill-climbing procedure for, 184 interocular correlation of, 183–86 phase filters and, 185–86 properties of, 150–55 shear disparities and, 410f types of, 151t Binocular lateral gaze, 160 Binocular luster, 63, 276 Binocularly linked images, 151 Binocular masking-level difference, 132 Binocular perspective, 364
Binocular plane, 202 Binocular processing dichoptic cyclopean stimuli and, 210–11 exclusive, 211 interactive, 211 Binocular reaction times, 124 Binocular recruitment, 136 Binocular rivalry, 150. See also Monocular rivalry; Spatial zones of binocular rivalry accommodation during , 87–88 attention and, 96–97 basic phenomena of, 63–65, 64f binocular fusion precedence over, 95, 95f chromatic specificity and, 69–70 cognition and, 96–99 color and, 69 color contrast and, 69 contrast and, 66–69 of contrast modulated images, 75, 76f cross-orientation inhibition and, 100–101 depth from, 273–77, 275f–277f dichoptic masking and, 75–76, 127 dual-response theory of, 94–95 dynamic features of, 75 eye dominance and, 79–80 eye movements and, 78 figural factors in, 70–73 figure-ground relationships and, 71–72, 71f flash facilitation and, 77–78 flash suppression and, 77–78 flickering and, 75 fMRI and, 102–3 functions of, 66 fusion theory of, 65 homogenous visual field dominance in, 70–71, 70f image interruptions and, 78 interhemispheric, 101 intersensory effects of, 98–99 lateral cortical connections and, 101 LGN and, 99 luminance and, 66–69 meaning and, 97–98 MEG and, 102 mental theory of, 93 models of, 103–6 moving stimuli and, 78–79 neural network models of, 105 orientation and synchrony of, 84f orthogonal drifting gratings and, 78–79 physiology of, 99–103 pupils and, 87–88 relative orientation and, 72 retina position in, 73–74, 74f rivalry contrast threshold and, 67–68 spatial propagation of, 76–77, 76f stereopsis and, 92–96 stimulus complexity and, 72
stimulus duration in, 74–75 subjective contours and, 72–73, 73f suppression theory of, 65, 93–94 temporal factors in, 74–78 theories of, 65–66 threshold summation of, 88 Troxler effect and, 64 two-channel theory of, 94–95 VEP and, 101–2 visual cortex and, 99–103 voluntary control of, 96 Binocular stereomicroscopy, 557, 558f Binocular stimulus, 1, 149 Binocular subtense, 161 Binocular summation for acuity, 115 of brightness, 116–20, 123 camouflage and, 114 contour, 112, 112f contours with and without, 118–19 of contrast sensitivity, 110–14 distribution model of, 113 flicker fusion and, 120, 121f gain control and, 115 integration model and, 110 isoluminant stimuli and, 114 low-level factors in, 107–8 of motion detection, 123 for pattern recognition, 116 physiology of, 124–27 response saturation and, 115 signal-detection theory and, 109–10 of single-cell responses, 124–25 stimulus position and eccentricity affecting, 113–14 stimulus spacing and, 113 at suprathreshold contrasts, 114–16 theoretical background for, 107–10 VEP and, 125–27, 126f Binocular suppression, zone of, 63 width of, 66 Binocular switch suppression, 77–78 Binocular tilt aftereffect, 138 Binocular vision terminology of, 1 uses of, 385–87 Binocular visual direction, 230–32 Binocular visual fields, 149 Binocular zone, 263, 264f Bipolar axes, 159 Bipolar elevation, 160 Bipolar parallax, 161 Bishop, Peter O., 3f Blake, Ralph, 67f Blakemore, Colin, 2f, 388–89 Blindness motion-induced, 78, 81 of Pulfrich, 516 Bloch’s law, 115, 354 in dichoptic conditions, 121 Blue-cone system, disparity scaling in, 57 Blue-in-front-of-red chromostereopsis, 285 Blur, image, 312 Braddick, Oliver J., 220f Bradshaw, Mark, 416f Brewster’s prism stereoscopes, 541–42, 541f Brightness. See also Luminance binocular summation of, 116–20, 123 depth analogy with, 454 dichoptic, 117, 117f dichoptic apparent, 119, 119f models, 453–54 whiteness and, 490 Brightness summation Fechner’s paradox and, 116–17, 117f
Levelt’s experiments on, 116–18, 117f models of, 118 Burian, Herman M., 165f Burr, David, 521, 521f Cajal, Ramón y, 548, 548f Callosotomy, midline stereopsis and, 40 Camouflage. See also Subjective contours background effects on, 265, 265f binocular summation and, 114 monocular occlusion and, 263–64, 264f in Panum’s limiting case, 279, 279f Cardboard cut-out phenomenon, 423, 423f Carrier, 22 Cats disparity detectors in, 5–6 split-chiasm, 39 visual cortex responses to dichoptic stimulus in, 10, 11f CCDs. See Charge-coupled detectors Centripetal preponderance, of OKN, 505 Cepstrum, 186 CFF. See Critical fusion frequency Charge-coupled detectors (CCDs), 558 Chiasm cats with split, 39 midline stereopsis and, 39 Chromatic aberration longitudinal, 284 transverse, 284–85 Chromatic adaptation, transfer of, 135 Chromatic channel, stereopsis in, 257–61 Chromatic contrast, 490 Chromatic detectors, luminance and, 258–59 Chromatic diplopia, 285 Chromatic Gabor patches, 259 Chromatic stimuli achromatic stimuli compared to, 258–60 depth modulation and, 260f dynamics of, 260 noise and, 260–61 Chromostereopsis blue-in-front-of-red, 285 causes of, 284–85 red-in-front-of-blue, 285–86 reversal of, 285–86 spectacles enhancing , 285 Circularvection, 509 Claudet, Antoine, 543 Claudet’s stereomonoscope, 543 Coarse-to-fine disparities, 206–7 Coarse-to-fine spatial scales, 205–6 Cognition, binocular rivalry and, 96–99 Coincident patterns, 487 Color. See also Isoluminance; Luminance binocular mixture of, 61–62 binocular rivalry and, 69 dichoptic mixture of, 60–63 dioptic mixture of, 62–63 matching , 200–201 for matching images, 201 pattern rivalry and, 69–70 rivalry, 61, 62f, 90, 200–201, 200f stereoacuity and, 312–13 Color blindness, 261 Color-contingent depth aftereffects, 460 Color-contingent motion aftereffect, 145–46 Color contrast binocular rivalry and, 69 constancy and, 496–97 Comitant esotropia, 164 Compression, image displacement and, 241–42, 242f Cones, relative density of, 259–60
Confocal scanning microscope, 557–58, 559f Conjunctions of disparity, 513 Constant depth column, 5 Contiguous temporal disparity, 520, 524, 524f Contingent aftereffects interocular transfer of, 144–46 positive, 144–45 Continuous flash suppression, 77 Contour-contour rivalry, 63 Contours. See Subjective contours Contrast. See also Color contrast; Depth contrast; Orientation contrast; Tilt contrast binocular rivalry and, 66–69 binocular rivalry and image modulation of, 75, 76f binocular summation at suprathreshold, 114–16 chromatic, 490 cyclopean motion and, 227–28 cyclovergence and inclination, 440, 440f depth contrast compared to luminance, 442f depth perception and, 336–37 dichoptic masking and, 130–31, 130f direction-of-motion, 228 disparity, 451–54 disparity limit and, 304 disparity modulation and resolution of, 320 disparity tuning and differences in, 13–14 effects, 434 global, 433–34 image fusion and effects of, 53–55 images differing in, 197–98 inclination, 452f interocular differences in, 111–13, 310–12 at isoluminance, 258–59 local, 433 motion-direction contrast effect, 511 perceived 3-D layout, 493, 493f Pulfrich effect and role of, 531 relative image, 197–200 reversed, 198–99, 198f rivalry contrast threshold and, 67–68 rivalry from reversed, 13 sieve effect with reduced, 276, 277f simultaneous, 433 simultaneous tilt, 91–92 slant, 437–38, 438f, 448f, 456f stereoacuity and, 307–8, 308f, 310, 310f stereogram with mixture of, 252–53 successive, 433 Contrast-defined stimuli, disparity in, 22–23, 22f–23f Contrast sensitivity binocular summation of, 110–14 dichoptic masking and, 128, 128f temporal contrast-sensitivity function, 120 Contrast-sensitivity function, for stereopsis, 308–9, 308f Converged retinas, 364–65, 364f Convergence image linkage determined by, 208–9, 208f symmetrical, 235–36 theoretical point horopter and, 168 types of, 364f Convergent disparity, 153 Convergent strabismus, 73 Coordinate transformations, retinal axis systems and, 157–58
Coplanar flat retinas, 364, 366f Coplanarity factor, 501–2 Coplanar ratio principle, 491 Correlation-sensitivity function, 15 binocular, 322f Corresponding images, 150–52, 182 Corresponding points, 182 cyclopean line of, 154 empirical, 150 geometrical, 150 Hering-Hillebrand deviation and shear of, 176–79, 177f–178f horopter and, 152–53 retinal, 182 theoretical horopters and noncongruent, 170–71, 172f Corresponding visual lines, 167 Covariation effect, 163–64 Craik-O’Brien-Cornsweet illusion, 448–49, 449f, 493 Criterion changes, 362 Critical fusion frequency (CFF), 120 Cross-correlation of binocular images, 183–84 disparity detectors and, 48 energy model and, 48–49 function, 184 Crossed disparity, 153, 153f stereoacuity with, 323–24, 324f Cross-orientation inhibition, 81, 139 binocular rivalry and, 100–101 monoptic, 100 Crowding dichoptic masking and, 133 relative depth and, 498 stereoacuity and, 315 Cue conflict depth contrast and, 450–51, 451f, 464 Pulfrich effect and, 518 Cumming, Bruce, 15f Curl disparity, 374 Curvature comparative judgments of, 429–30, 430f depth aftereffects of, in depth, 459, 459f discrimination thresholds, 417–18, 418f disparity, 427–28 principal, 416 Curvedness, 416–17, 418f Curved surfaces depth contrast between, 450, 450f invariant properties of, 427–28 Cyclopean axes, 233 retinal axis systems and, 159–61, 160f Cyclopean azimuth, 160 Cyclopean domain, 210 acuities in, 213–15 aspect-ratio discrimination and, 213, 213f orientation discrimination and, 214 visual features of, 212 Cyclopean Ebbinghaus illusion, 216, 217f Cyclopean elevation, 159 headcentric, 160–61 oculocentric, 160 Cyclopean eye, 232 controversy over, 245–47 Hering on, 234, 234f law of, 233, 236 locating, 243–45, 244f Cyclopean figural effects with binocular composites, 215–16, 215f figural aftereffects and, 216 geometrical illusions and, 216, 216f–217f spatial frequency aftereffects and, 216–17, 218f tilt aftereffects and, 216, 218f
Cyclopean illusion, 246, 246f phoria and, 247, 247f Cyclopean images AND cells and, 139 OR cells and, 139 metacontrast with, 134 Cyclopean line, of corresponding points, 154 Cyclopean motion. See also Dichoptic apparent motion aftereffect, 227–28 contrast effects from, 227–28 direction-of-motion contrast and, 228 sensitivity to, 225–26 speed discrimination of, 226–27 types of, 217–19 Cyclopean pop out, 228–29, 229f Cyclopean projection, law of, 233, 236 Cyclopean shape, 212, 213f, 548 motion of, 225–28 pop out of, 228–29, 229f random-dot stereograms and, 249, 250f Cyclopean stimuli, 136–37, 210. See also Dichoptic cyclopean stimuli Cyclopean texture segregation, 228–29, 230f Cyclopean tilt aftereffect, 139, 139f Cyclopean transparency, depth from, 272–73, 274f Cyclopean vernier acuity, 213, 213f Cyclops effect, 235 Cyclovergence binocular fusion and, 203, 203f inclination contrast and, 440, 440f induced tilt and, 440, 440f inducing , 408–9 orientation constancy and, 394 orientation disparity and, 58–59 rotation disparity and, 407–8 theoretical horopters with, 169, 170f vertical-shear disparity and, 406 D6 Gaussian patches, 56, 57f d’Almeida, Joseph Charles, 542 Dark adaptation, Pulfrich effect and prior, 529–31, 530f da Vinci, Leonardo, 263, 264f on monocular zones, 267 da Vinci stereopsis, 66, 267–72 background and, 271, 272f depth perception and, 272, 273f with fused images, 271–72, 272f monocular-gap stereopsis, 270 Panum’s limiting case and, 270, 271f, 282, 282f DeAngelis, Gregory C., 27f Deformation disparities advantages of, 375–76 components of, 374 depth contrast and, 466–69 inclination and, 405–13 slant and, 393–94 types of, 375 De Lange function, 120 Depth achromatic and chromatic stimuli and modulation of, 260f anisotropy of modulations of, 415, 416f–417f assimilation, 447, 447f attraction, 434f, 436, 436f from binocular rivalry, 273–77, 275f–277f brightness analogy with, 454 from cyclopean transparency, 272–73, 274f
disparity-defined, 419–32 enhancement, 442 fMRI of motion in, 38–39 gaze shifts and detection of, 345–48 graded transparency, 273, 274f intervals, 420 lacy, 338, 548 matching, at different distances, 430, 430f monocular zones and discontinuity of, 265–66, 266f monoptic, 2, 283, 283f motion between planes of, 499–503 from objects passing aperture, 525–26, 525f Panum’s limiting case and, 277–84 in pictures with and without disparity, 550–51 practice improving judgment of, 359 primates and sign of relative, 25 probes, 419 in Pulfrich effect, 527 in random-dot stereograms, 201–2 repulsion, 434f, 435–36, 436f rivaldepth, 212, 275 segregation, 474 shading relationship with, 492–94 sieve effect and, 274–76, 275f–276f size-disparity and relative, 369–70 slant and intrusion of, 395 sloping planar surfaces and constancy of, 424 spatial frequency and discrimination of, 335, 498 stereoacuity and detection of, 309 stereoscopes and constancy of, 423–24, 423f stereoscopic vision tests of real, 289–90 subjective contours and, 478–81, 479f–482f transparency, 338–39 from vertical disparities and monocular occlusion, 269, 270f vertical disparities in distinct planes of, 401–2 whiteness and, 490–97 Depth aftereffects adaptation and, 461–63 from adaptation disparity, 456 from anticorrelated stereograms, 460 color-contingent, 460 of curvature in depth, 459, 459f from depth corrugations, 457–58, 457f from disparity discontinuity, 456–57 from disparity metamerism, 459 from inclined surfaces, 455–56, 456f with line stimuli, 454–55, 455f mechanisms of, 461–64 multichannel model of, 462f normalization and, 461–63 phase dependency and, 464–66, 465f superimposing, on corrugated surfaces, 462f texture-contingent, 460 twining function for, 463, 463f–464f in two-channel system, 461f zero-, first-, second-order disparities and, 462–63 Depth capture, 474 Depth contrast between constant disparity areas, 441–42 cue conflict and, 450–51, 451f, 464 between curved surfaces, 450, 450f deformation 
disparities and, 466–69
between disparity gradients, 443–45, 446f disparity masking and, 441, 441f disparity-specific aftereffects and, 456–60 frames of reference and, 439–40, 451f with inclined surfaces, 446f luminance contrast compared to, 442f–443f models, 453–54 norms and, 439 with points and lines, 436–41, 437f second-order disparity detection and, 453–54 with shading-defined shape, 450, 450f shear disparities and, 468–69, 468f short-range, 434–36 simultaneous, 436–37, 436f size-disparity and, 466–68, 467f between sloping surfaces, 442–50 spatial arrangement of surfaces and, 445–48, 446f–448f stimuli producing , 438–39, 438f successive, 454–66 between surfaces, 441–51 in surface without discontinuities, 448f temporal properties of, 441 types of, 433–34 vertical disparity inducing , 450 zero-disparity circle inducing , 445f Depth corrugations, depth aftereffects from, 457–58, 457f Depth curvature, constancy of, 428–29, 428f–429f Depth-discrimination threshold, 287 pedestal disparity increase and, 330 Depth interpolation ambiguous, 477f, 483f figural continuity and, 476f between subjective contours, 482–83 Depth map, 363 Depth perception adaptation disparity versus, 458–59 binocular, 46 contrast and, 336–37 da Vinci stereopsis and, 272, 273f from disparity energy, 46–49 figural continuity and, 473f models, 453–54 in Panum’s limiting case, 278 spatial scale and, 336 symmetry and, 474f uses of, 385–87 Depth-specific geometrical illusions, 498–99 motion aftereffects, 503 threshold effects, 497–98 visual processing , 497–503 Depth-threshold function, for disparity modulation, 317–18, 317f Deviation coefficient, 177 de Weert, Charles M. M., 83f Diastereopsis, 338–39 detecting , 353 Diastereo test, 289–90 Dichoptic apparent brightness, 119, 119f Dichoptic apparent motion with added pedestal grating , 222 aftereffects of, 224 ambiguous, 222f flashes and, 223–24 long-range, 219–20 with missing fundamental, 220–21 in ON and OFF channels, 224 with opposed polarity, 224f short-range, 219–20
spatial frequency and, 221f in spatiotemporal quadrature, 221–22 stimuli for, 223f with Ternus display, 222–23 Dichoptic brightness, contour influencing , 117, 117f Dichoptic cancellation of texture boundaries, 89–90, 89f Dichoptic coherent motion, 79 orthogonal motion combined with, 88–89 Dichoptic color mixture basic phenomena of, 60–61 dioptic color mixture and, 62–63 luminance and, 61 saturation and, 61 stimulus duration and, 61–62 stimulus size and, 62 Dichoptic composite shape, 211, 211f Dichoptic cyclopean procedure, 210 Dichoptic cyclopean stimuli binocular disparity and, 211–12, 212f binocular processing and, 210–11 superimposed, 211 Dichoptic equal-contrast function, 112, 112f Dichoptic flashes, 121 Dichoptic masking, 100. See also Metacontrast; Paracontrast between adjacent figured stimuli, 128–30, 129f binocular rivalry and, 75–76, 127 chromatic adaptation transfer and, 135 contrast and, 130–31, 130f of contrast sensitivity, 128, 128f crowding and, 133 of dominant images, 132 interstimulus delay and, 129, 129f light adaptation and, 127–28 relative disparity and, 132 spatial frequency and, 131–32, 131f spatial scale and stereopsis, 337–38 with superimposed patterns, 130–33 of suppressed images, 132 threshold elevation and, 133–34 Dichoptic metacontrast, 134–35 Dichoptic moiré patterns, 59–60 Dichoptic motion empirical horizontal horopter and locus of zero, 175 transparency, 141–42 Dichoptic nonius procedures, 174 Dichoptic objects, 552 Dichoptic stimulus, 1–2, 149 cat visual cortex responses to, 10, 11f dioptic stimulus compared to, 2 rivalry and, 52–53 thickening of, 52 Dichoptic summation of antiphase flicker, 120 of in-phase signals, 120 Dichoptiscope, 551 dichoptic objects in, 552 disparity controls with, 552–55, 554f image objects in, 552 stereoscope compared to, 551 Difference-of-Gaussian (DOG), 305 luminance profiles, 330 Differential magnification, apparent slant and, 397, 398f Differential perspective, 380 Dif-frequency disparity, 31, 371–72, 
371f limits of, 389f position disparity and, 391–92, 392f slant and, 388–91, 389f texture gradients and, 391f
Digital holography, 558 Dilatation disparity, 374 Dioptic color mixture, 62–63 Dioptic nonius procedures, 173–74, 174f Dioptic shear threshold, inclination and, 405, 405f Dioptic stimulus, 149 dichoptic stimulus compared to, 2 Diplopia threshold eccentricity increasing , 53 hysteresis and, 59, 60f measuring , 52 spatial frequency and, 53f symmetry and, 52 Diplopic images, 51, 150, 183 Dipole angle, 378 Dipper function, in disparity discrimination, 299–300, 300f Directional averaging, of disparate images, 237–38 Directional preponderance, in OKN, 504 Direction circles, 157, 159f Direction-of-motion aftereffect, 228 Direction-of-motion contrast, 228 Disambiguation, fine-coarse, 335–36 Disparate images, 150 directional averaging of, 237–38 visual direction of, 237–39, 238f Disparity averaging display used for, 341f linked images and, 340–42, 341f monocular averaging versus, 339–40, 340f Disparity beats, 17 Disparity bias, 17, 323 Disparity capture by flanking stimuli, 480 occurrence of, 482, 483f strength of, 482 Disparity contrast, 451–54 Disparity correction, 430–31 Disparity-corrugation sensitivity function, 318, 318f Disparity curvature, 376f, 379, 427–28 Disparity-defined depth, 419–32 3-D shape, 416–19 Disparity detectors. See also Horizontal disparities; Vertical disparities adaptation of, 458f in area TE, 28–29 bar stimuli orientation and, 20 binocular cells and, 42 binocular energy neurons and types of, 43f, 44 in cats, 5–6 classification of, 7–8, 8f cross-correlation and, 48 discovery of, 2–4 disparity selectivity of, 5 in dorsal stream, 26–28 dynamics of, 23–24 energy models of, 40–49 with Gabor function, 41 Gaussian patches and, 333–34 hierarchy of, 442 high-order, 30–34 homogeneity of, 14–16 of horizontal gradients, 31 hybrid, 17–18 identifying , 3–4 in magnocellular system, 29–30 in MT and MST, 26–27 neural network models of, 49–50 number of, 14–16 in parietal cortex, 27–28
in parvocellular system, 29–30 phase, 16–17, 42 position, 16–17, 42–43 preferred disparity of, 5 primary, 14 in primates high visual centers, 24–30 response variability of, 5, 23 secondary, 14, 31f spatial scale and, 329–36 stereoacuity sensitivity to absolute and relative, 296–97 for stereopsis, 8 terminology of, 1–2 tertiary, 14 types of, 18f, 325–26 in V1, 6–25 in V2 and V3, 24–26, 25f in V4, 28 in ventral stream, 28–29 of vertical gradients, 31–33 Disparity discontinuity, aftereffects from, 456–57 Disparity discrimination dipper function in, 299–300, 300f reference surfaces and, 296–97 Disparity energy binocular depth perception and, 46 complex cells and, 44–46 depth perception from, 46–49 local function of, 185, 185f recent models on, 40 Disparity gradients arrangements of, 446f binocular fusion and, 55, 55f depth contrast between, 443–45, 446f discrimination of, 326–27, 327f–328f disparity limit and, 305 disparity scaling and, 55–56, 56f image fusion and, 56, 56f limit, 305 linear, 377–79, 378f random-dot stereogram for, 327 underestimation of gradual, 448–50, 449f Disparity interpolation, 480f Disparity limits contrast and, 304 disparity gradient and, 305 element density and, 305 eye movements and, 305–6 hysteresis in stereopsis and, 303–4 for line images, 302–3 luminance and, 304 spatial frequency and, 305 Disparity magnitude induced effect and, 397–98, 398f in spatial scale, 328–29 Disparity masking, depth contrast and, 441, 441f Disparity metamerism depth aftereffects from, 459 production of, 338–39 variables of, 339 Disparity modulation depth-threshold function for, 317–18, 317f discrimination of, 323 disparity resolution and, 319–20 frequency-swept grating of, 317f of line, 316–17, 317f of luminance, 331–32, 332f Nyquist limit and, 320 in random-dot stereograms, 317–19, 317f–320f sensitivity to, 318, 318f spatial frequency of, 316–23
suprathreshold functions and, 320–21, 321f–322f visual channels for, 321–23 Disparity normalization, 347, 430–31 Disparity pedestal, 288 stereoacuity and, 297–300, 298f–299f Disparity pooling mechanisms for, 338 metamerism and, 338–42 noise reduction and, 338 Disparity receptive fields, 451–53, 453f Disparity resolution, 319–20 Disparity scaling in blue-cone system, 57 disparity gradients and, 55–56, 56f display size and, 423f fusion limits and, 55–57 in periphery, 57 spatial frequency and, 56–57, 57f Disparity selectivity, 5 Disparity-specific aftereffects, 456–60 Disparity-stereopsis, 249–63 Disparity threshold function, 318, 319f for lateral offset, 333 pedestal disparity and, 335, 336f Disparity tuning contrast differences and, 13–14 eye position and, 21–22 gaze direction and, 22 monocular receptive fields and, 9–12 in NOT, 4 orientation and, 20–21 position invariance of, 12–13 precision of, 9 in pulvinar, 4 of subcortical cells, 4–5 in superior colliculus, 4–5 in V1, 6–14, 8f viewing distance and, 21–22 Disparity tuning function, 5 characteristics/features of, 9 energy model and, 10–12, 11f idealized, 7–8, 8f in V1, 8–9 Disparity vector field, 363, 365f–366f Distal stimulus, 148 Distance. See also Viewing distance paradox, 462 scaling , 419 Distance-specific adaptation, 458, 458f Distinct adjacent vertical disparities, 399–401 Distinct superimposed vertical disparities, 401 Distribution model of binocular summation, 113 Divergent disparity, 153 DOG. See Difference-of-Gaussian Dominant images dichoptic masking of, 132 potency to inhibit, 103 suppressed images interacting with, 87–90 visibility of, 103 Dominant stimulus, 63 Dominant suppression mechanism, 104–5 Dorsal stream, disparity detectors in, 26–28 Double-duty image linking , 279–82, 280f–281f Double-nail illusion, 208f DRDC. 
See Dynamic random-dot correlogram Dynamic noise Pulfrich effect, 533–36 apparent-motion cascades and, 535–36 random spatial-disparity hypothesis for, 534–35, 534f temporal disparity hypothesis for, 533–34
Dynamic random-dot correlogram (DRDC), 36, 219 Dynamic random-dot stereogram, 35–36 Eccentricity. See Stimulus eccentricity Eccentricity at distance, 381–82, 381f–382f Ecological optics, 149 Edge continuity, 207 Egocentric frame of reference, 439 Ehrenstein’s figure, 478, 480f Electron microscopes, 560, 560f Electro-optical shutters, 542 Element density, disparity limit and, 305 Element orientation, 404, 404f Elevation, 155. See also Threshold elevation bipolar, 160 cyclopean, 159–61 theoretical horopters and gaze, 169–70, 170f Ellipsoid, 416 Empirical horizontal horopter criteria for, 172–76 equal-distance criterion for, 175–76 locus of fused images and, 172, 173f locus of maximum stereoscopic acuity and, 172–73 locus of zero dichoptic motion and, 175 nonius procedures and, 173–75, 174f–175f Empirical probability summation, 108–9 Empirical vertical horopter, 179–81, 180f Endogenous attention, 96, 358 Energy models cross-correlation and, 48–49 of disparity detectors, 40–49 disparity tuning function and, 10–12, 11f for simple cells, 41–44 for stereoscopic vision, 46 vertical disparities and, 19 Envelope, 22 Epipolar constraint, 161 Epipolar geometry, 154 Epipolar images, 202–3 Epipolar meridians, 160–61, 202 Epipolar planes, 160–61 Equidistance tendency, 440 Esotropia, comitant, 164 Euclid, 267 Ever-ascending staircase, 191, 192f Exclusive binocular process, 211 Exclusive dominance, 63 spatial zones of binocular rivalry and, 82 Exocentric frame, 231 Exogenous attention, 96, 358 Expansion disparity, 374 Exposure duration, stereoacuity and, 356, 356f Extrinsic ends, 485 Eyes dark field of closed, 70–71, 70f differential perspective of, 380 distinct motion aftereffect in each, 142–43 holoptic, 1 paralysis of, 210 torsional misalignment of, 380 vertical misalignment of, 379–80 Eye dominance binocular rivalry and, 79–80 criteria defining , 80 Eye drift, 3 Eye movements binocular rivalry and, 78 disparity limit and, 305–6 lateral, 344–45
OKN, 89, 509 out of register, 3 practice and, 360–62, 361f Pulfrich effect and, 532–33, 532f stereoacuity and, 344–51 suppressed images evoking , 89 Eye-of-origin information, 247–48 rivalry, 66 Eye position disparity tuning and, 21–22 vertical disparities as cue to, 383 Eye rivalry, stimulus rivalry compared to, 84–86 Eye-swap procedure, 84–85 Face recognition, 85–86 Fahle, Manfred, 68f Falling bead test, 290 Fechner’s paradox, 116–17, 117f Fertsch, F., 515–16, 518 Field-emission scanning electron microscope, 560 Field-sequential stereoscopes, 542–43 Figural continuity depth interpolation and, 476f depth perception and, 473f stereopsis and, 471–72, 471f–472f Figural effects experimental paradigms and, 135–37 interocular transfer of, 135–46 Figural induction, 135 Figure-ground relationships binocular rivalry and, 71–72, 71f patterns and, 470 stereopsis and, 472–74, 473f Figure-ground reversal, 497–98 Figure perception, stereopsis and, 470–74 Fine-coarse disambiguation, spatial scale and, 335–36 First-order disparities, 462–63 First-order motion, 217 First-order stereopsis, 333–35, 334f Fixating, Pulfrich effect and, 532, 532f Fixation disparity, 150 visual direction and, 242–43, 243f Fixation point, 167 Flashes dichoptic, 121 dichoptic apparent motion and, 223–24 time separating , 121–23 Flash facilitation, binocular rivalry and, 77–78 Flash suppression binocular rivalry and, 77–78 continuous, 77 Flicker fusion, 120, 121f Flickering binocular rivalry and, 75 CFF and, 120 as stereopsis token, 261–62 fMRI. See Functional magnetic resonance imaging Foley, John M., 303f Forward masking. See Paracontrast Fovea, sequential fixation of, 346–47 Fox, Robert, 65f Frames of reference, depth contrast and, 439–40, 451f Free fusion, 541–42 Freeman, Ralph D., 10f Freeze fracturing , 560 Freiburg ocular prevalence test, 239f Freiburg stereoacuity test, 291 Frisby, John P., 250f Frisby Stereo test, 290
Frontal meridian, 155 Frontal motion, indepth motion competing with, 501, 501f Frontal plane abathic distance for, 176 comparison stimulus, 419–20 Hering-Hillebrand deviation and, 176–79, 176f slant constancy, 393 Vieth-Müller circle and, 425f Frontal surface HSR for, 424–25, 424f judging flatness of, 424–27 scaling , 426–27, 426f VSR for, 424–25, 425f Front effect, 497 Functional magnetic resonance imaging (fMRI) binocular rivalry and, 102–3 of motion in depth, 38–39 stationary stimuli and, 37–38 stereopsis and, 37–39 Fused images, 150, 182 da Vinci stereopsis with, 271–72, 272f empirical horizontal horopter and locus of, 172, 173f Fusion. See Image fusion Fusion limits, 52, 53f disparity scaling and, 55–57 eccentricity and, 53 for orientation disparity, 58–59 of spatial frequency and effects of vergence instability, 54–55 temporal factors in, 57–58, 58f Fusion range, 52 Fusion theory of binocular rivalry, 65 Gabor, Dennis, 546f Gabor filter, 43 combined outputs of, 44 Gabor function, disparity detectors with, 41 Gabor patches, 72 association field and, 84 chromatic, 259 Gabor receptive fields, 42, 43f Gain binocular summation and control of, 115 induced effect and, 396 stereoscopic, 288 Galen, 267 Gaussian patches, disparity detectors and, 333–34 Gaussian window, spatial scale for, 333 Gaze binocular lateral, 160 depth detection across shifts of, 345–48 disparity tuning and direction of, 22 sequential foveal fixation and, 346–47 shear disparity and elevation of, 409f shifts between distinct objects, 345–47, 346f slant perception and shifts of, 347–48 theoretical horopters and elevation of, 169–70, 170f theoretical horopters and oblique, 170, 171f Troxler fading and, 347 vergence changes and, 346 Gaze-normal slant constancy, 393 Geometrical corresponding points, 150 Geometric effect, 395 Geometry binocular disparity definition in, 153–54, 153f cyclopean illusions of, 216, 216f–217f
depth-specific illusions of, 498–99 epipolar, 154 of Pulfrich effect, 516–18, 517f stereopsis from illusions of, 283–84, 284f of stereoscopic displays, 538–41, 539f–540f Georgeson, Mark, 220f Gestalt psychology, 470, 487 Ghosts, 152 Gilchrist, Alan, 492f Gillam, Barbara, 413f Global contrast, 433–34 Global matching rules coarse-to-fine disparities, 206–7 coarse-to-fine spatial scales, 205–6 edge continuity, 207 image linkage determined by convergence, 208–9, 208f minimizing unpaired images, 204–5 surface smoothness, 207–8 Global slant, 389–90, 390f Global stereopsis, 249, 548 Global vertical-shear disparity, 409, 410f, 411–12 Global vertical-size disparity, 399–402 Graded transparency depth, 273, 274f Gradual disparity gradients, underestimation of, 448–50, 449f Gray-level representation, 363 Grid nonius procedures, 173–74 Grouping, perceptual, 513–14, 513f Halldén test, 163–64 Haplopic images, 51 Harmonious anomalous correspondence, 163 Harris, Joseph, 51 Headcentric cyclopean coordinates, 160 Headcentric cyclopean elevation, 160–61 Headcentric direction demonstration of laws of, 234–37, 234f–235f law of differences in, 233–34, 236 laws of, 232–36, 233f–234f visual line in, 232 Headcentric disparity, 377 Headcentric frame, 230–31 of reference, 439 Head movements image stabilization and, 351 stereoacuity and, 350–51 Helmholtz, Hermann Ludwig von on binocular fusion, 93 stereogram of, 238f vertical disparities display of, 425–26, 425f Helmholtz checkerboard, 158, 159f Hering, E. 
on cyclopean eye, 234, 234f visual direction law of, 235f, 245, 245f Hering-Bielschowsky afterimage test, 163 Hering-Hillebrand deviation frontal plane and, 176–79, 176f optical factors in, 179 shear of corresponding points and, 176–79, 177f–178f vertical disparities and, 179 Hering illusion, 283–84, 284f Hermann grid, 441–42, 444f Hess effect, 520 High-order disparities, 30–34, 379 horizontal gradients of, 31 joint tuning to motion and, 33–34 spatial modulations of, 33 vertical gradients of, 31–33 Hill-climbing procedure, 184
Holography, 546–47, 547f digital, 558 Holoptic eyes, 1 Homogenous visual fields, 70–71, 70f Horizontal arrays, interocular correlation in, 188 Horizontal disparities, 152–53 absolute, 363 upper limit of, 302–6 in V1, 20 vertical disparity effects on detection of, 324–25 viewing distance and, 21 Horizontal gradient, 72 of point disparity, 31 of VSR, 431, 431f Horizontal horopter, 166 empirical, 172–79 Vieth-Müller circle and, 167, 167f Horizontal-line horopters, 171, 173f Horizontal meridian, 155 Horizontal OKN, 506 Horizontal plane, 167 of regard, 160 Horizontal-shear disparity, 375, 405 inclination and, 407 Horizontal-size disparity, 267, 375 isoazimuth plane and, 368f slant and, 387–88 Horizontal size ratio (HSR), 370–71, 371f, 393 for frontal surface, 424–25, 424f Horopter. See also specific horopters; theoretical horopters corresponding points and, 152–53 curvature and skew of, 177, 177f default rule, 278 image fusion and, 51 Ogle’s, 176f stereoacuity away from, 297–300 theoretical, 166–72 Howard, Ian P., 407f Howard-Dolman test, 289, 354, 359 stereoscopic, 290 HSR. See Horizontal size ratio Hue-discrimination functions, 15 Hybrid detectors, 17–18 Hyperboloid, 416 Hypercyclopean feature, 212 Hypercyclopean level of analysis, 212 Hysteresis diplopia threshold and, 59, 60f in disparity limit for stereopsis, 303–4 image fusion and effects of, 59 stereopsis and effect of, 310–11, 311f Ibotenic acid, 30 Iconoscope, 551 Ideal observer for stereoacuity, 301 Illumination, dichoptic masking from even, 127–28 Image adjacency, in slanted surfaces, 193, 193f Image blur, interocular differences in, 312 Image correlation, VEP and, 36, 37f Image density, 378 Image displacement, compression and, 241–42, 242f Image fusion. See also Fusion limits binocular, 51–60 contrast effecting, 53–55 criteria for, 52–53 disparity gradient and, 56, 56f eccentricity and limits of, 53 horopter and, 51 hysteresis effects in, 59
Image fusion (continued) spatial frequency effecting , 53–55 terminology of, 51–52 Image interruptions, binocular rivalry and, 78 Image linking. See Linked images Image order interhemispheric, 194–95, 195f local matching rules and, 193–95, 193f–194f topological, 193–94, 194f Image size, stereoacuity and relative, 300–301 Image stabilization, 344 head movements and, 351 Image superimposition, 64 Inclination aftereffects and, 405 contrast, 452f cyclovergence and contrast of, 440, 440f deformation disparities and, 405–13 dioptic shear threshold and, 405, 405f element orientation and, 404, 404f horizontal- and vertical-shear disparity and, 407 orientation disparity and, 403–5 perception of, 403–13 shear disparity and, 407–8, 408f stimulus size and, 404–5 terminology of, 387 Inclined lines, Panum’s limiting case with, 280, 281f Inclined surfaces depth aftereffects from, 455–56, 456f depth contrast and, 446f orientation disparity on, 372–73, 374f Indepth motion, frontal motion competing with, 501, 501f Induced effect disparity magnitude and, 397–98, 398f gain and, 396 measuring , 396, 396f opposite, 400f peak value and, 396 physiological theories of, 402, 403f range and, 396 shear disparity, 405–9 size-disparity, 395–99 from small array of dots, 403f viewing distance and, 398–99 Induced tilt, cyclovergence and, 440, 440f Induced visual motion interocular transfer of, 143 oculomotor, 509 retinocentric, 509 stereopsis and, 508–11 vection-entrained, 509–11 Inferior temporal cortex (area TE), 28–29 In-series processing of disparity, 512–14 Integration model, binocular summation, 110 Interactive binocular process, 211 Interhemispheric image order, 194–95, 195f Interocular axis, 167 Interocular correlation adaptation and, 189, 189f of binocular images, 183–86 degrees of, 187, 187f detection of, 186–89 discrimination of, 187–88 in horizontal and vertical arrays, 188 matching problem and, 184–85 time to detect, 186–87 Interocular delays, 356–57
Interocular differences in contrast, 111–13, 310–12 in image blur, 312 in luminance, 310–12 in spatial frequency, 312, 313f in stereoacuity, 310–13 Interocular grouping , 86–87, 87f Interocular transfer of contingent aftereffects, 144–46 of figural effects, 135–46 of induced visual motion, 143 of motion aftereffect, 140–44 paradigm, 136 of pattern discrimination, 146–47 of perceptual learning , 146–47 for simple visual tasks, 146 of spatial frequency shift, 144 of tilt contrast, 137–40 of visual-motor learning , 147 Interstimulus delays, 357 Interstimulus distance, apparent motion and, 499–501, 501f Interstimulus interval (ISI), 346–47 Interstimulus masking , 347 Inter-visual-axis region, 378 Inter-visual-line region, 378 Intrinsic ends, 485 Inverse cyclopean stimulation, 229, 231f Inverted-disparity signals, 13 Ipsiversive preponderance, 504 Irradiation stereoscopy, 286, 286f ISI. See Interstimulus interval Isoazimuth plane, 155 horizontal-size disparity and, 368f Isodisparity circles, Vieth-Müller circle and, 368f Isodynamic cells theory, 2 Isoelevation, 155 Isoelevation lines, 158 Isoluminance binocular summation and stimuli of, 114 contrast at, 258–59 matches with, 257 stereopsis at, 257–58 Joint tuning, to motion and high-order disparities, 33–34 Julesz, Bela, 291f, 548 Kanizsa triangle, 478, 480f Kaufman, Lloyd, 255f Keplerian projection, 150–52, 151f Keystone Visual Skills Test, 291 Kinetic boundary disparity, 262 Kinetic depth effect, 489 Kingdom, Fred, 259f Koenderink’s classification of surface patches, 416 Kompaneysky, Boris, 548 Kundt partition effect, 177–78 Lacy depth, 338, 548 Lambertian surface, 262 Lang Stereo Test, 294 Lateral cortical connections, binocular rivalry and, 101 Lateral extent, 378 Lateral eye movements, 344–45 Lateral geniculate nucleus (LGN), 99 Lateral multiplexing , 545 Lateral offset, disparity threshold function for, 333 Law of cyclopean eye, 233, 236 Law of differences in headcentric direction, 233–34, 236 Law of visual 
direction, 232, 236
Laws of headcentric direction, 232–36, 233f–234f demonstration of, 234–37, 234f–235f Learning. See also Practice pattern-specific, 362 stereopsis and, 359–62 Left-right inseparable cells, 32 Left-right separable cells, 32 Legge, Gordon E., 111f Lenticular-sheet stereograms, 544–45, 545f Leonardo’s constraint, 241 Lesions, Pulfrich effect and, 546–47, 546f Levelt, W. J. M. averaging formula of, 112 brightness summation experiments of, 116–18, 117f LGN. See Lateral geniculate nucleus Light adaptation dichoptic masking and, 127–28 Pulfrich effect and prior, 529–31, 530f Lightness assimilation effect, 496f constancy, 491 shading and perceived, 494f transparency and, 494–96, 494f–495f Light sources, surface orientation effects and, 491–92 Linear disparity gradients, 377–79, 378f Linear disparity ramp, 378 Linear summation of dichoptic inputs, 109 Linear vection, 509 Line-end occluders, 326 Line horopter, 166 theoretical, 171–72, 173f Line orientation, orientation disparity and, 372f Lines of latitude, 155 Lines of longitude, 155 Line stimuli, depth aftereffects with, 454–55, 455f Linked images, 182–83 binocularly, 151 convergence determining , 208–9, 208f disparity averaging and, 340–42, 341f ease of, 326 Panum’s limiting case and double-duty, 279–82, 280f–281f Lit, Alfred, 519f Local contrast, 433 Local disparity energy function, 185, 185f Local matching rules ecological factors and, 204 ever-ascending staircase and, 191, 192f image adjacency in slanted surfaces in, 193, 193f image order and, 193–95, 194f–195f nearest-neighbor images, 190–92, 191f–192f oblique line linking ambiguity, 204, 205f texture inhomogeneity and, 203–4 unique-linkage rule, 189–90, 190f Local slant, 389–90, 390f Local vertical-shear disparity, 409, 410f, 411–12 Local vertical-size disparity, 399–402 Longitudinal chromatic aberration, 284 Long-range motion, 219–20 Long-term adaptation, 78 Luminance. 
See also Brightness; Isoluminance binocular rivalry and, 66–69 chromatic detectors and, 258–59 contrast of depth compared to, 442f–443f dichoptic color mixture and, 61 disparity limit and, 304
disparity modulation of, 331–32, 332f DOG profiles of, 330 interocular differences in, 310–12 Pulfrich effect with visual latency depending on, 528–29, 528f–529f relative motion and, 520 spatial frequency of modulation of, 333 stereoacuity and, 307–8, 308f stereograms and intensity profiles of, 253, 253f stereopsis and reversed, 199–200, 200f stimuli, 260 visual latency and, 519–20, 519f Luminance-defined edges, 250–52 Luminance-defined gradients, 253–54, 253f Luminance ratios, dichoptic detection of, 115–16 Mach-Dvorak effect, 515, 523f nullifying, 524 Pulfrich effect compared to, 523–24 Maddox rod cover test, 163 Magnetic resonance imaging (MRI), 561–62 Magnetoencephalography (MEG), 102 Magnification, differential, 397, 398f Magnocellular system disparity detectors in, 29–30 ibotenic acid destroying, 30 Masking. See Dichoptic masking; Visual masking Matching images, 182. See also Global matching rules; Local matching rules color as aid in, 201 visual system and, 203, 203f Maxwell, Clark, 543 Maxwell’s spot, 164 Mayhew, John E. W., 251f McCollough effect, 144 McKee, Suzanne, 314f Median plane of head, 160, 167 MEG. See Magnetoencephalography Metacontrast with cyclopean images, 134 dichoptic, 134–35 relative depth and, 498 Metamerism, 338 disparity pooling and, 338–42 variables of, 339 Microtexture motion, stereoacuity and, 345, 346f Middle superior temporal area (MST), 26–27 Middle temporal area (MT) disparity detectors in, 26–27 motion sensitivity in, 487 spatial and temporal disparities in, 34, 35f Midline stereopsis callosotomy and, 40 chiasm and, 39 problem of, 39 Mirror stereoscopes, 541, 541f Misconvergence, Panum’s limiting case and, 278, 279f Modulation transfer function (MTF), 337 Morgan, Michael, 523f Moiré patterns, dichoptic, 59–60 Monocular acuity, anisotropy of, 325 Monocular averaging, disparity averaging versus, 339–40, 340f Monocular camouflage, 263–64, 264f Monocular decamouflage, sequential, 269, 270f
Monocular diplopia ARC and, 163 causes of, 165 development of, 165–66 monocular rivalry and, 81 Monocular figural repulsion, 277 Monocular flash, 67 Monocular-gap stereopsis, 270 Monocular induction, alternating , 136 Monocular luster, 81 Monocular occlusion background effects on, 265, 265f binocular disparity interactions with, 263–67 camouflage and, 263–64, 264f depth from vertical disparities and, 269, 270f in Panum’s limiting case, 279, 279f phantom square from, 269, 269f rivalry and, 266–67, 266f rules of, 263–64, 264f surface continuity and, 475f Monocular occlusion zones cyclopean view of, 241f identifying , 48 random-dot stereograms and, 549 visual direction of, 240–42, 240f–242f Monocular reaction times, 124 Monocular receptive fields of binocular cell in V1, 15f disparity tuning and, 9–12 position-disparity detectors and, 42 Monocular rivalry basic findings on, 80–81 monocular diplopia and, 81 monocular luster and, 81 theories of, 81–82 Monocular stereoscopy, 550–51 Monocular stimulus, 1 perceived direction of, 239–42, 239f Monocular tilt aftereffect, 138 Monocular zones, 263–64, 264f. See also Monocular occlusion zones appropriate and inappropriate, 266, 266f da Vinci on, 267 depth discontinuity and, 265–66, 266f left, 263–64, 264f normal and anomalous, 266–67, 266f right, 263–64, 264f surface opacity and, 267, 267f Monoptic cross-orientation inhibition, 100 Monoptic depth, 2 stimulus condition for, 283, 283f Mosaic rivalry, 63, 82, 86 Motion. 
See also Apparent motion; Cyclopean motion; Dichoptic apparent motion; Dichoptic motion; Orthogonal motion ambiguity of visual direction of, 484–87, 485f–486f binocular summation and detection of, 123 of cyclopean shapes, 225–28 in depth, fMRI, 38–39 between depth planes, 499–503 depth-specific aftereffects of, 503 extrinsic ends and, 485 first-order, 217 frontal competing with indepth, 501, 501f intrinsic ends and, 485 joint tuning to high-order disparities and, 33–34 microtexture, 345, 346f MT and sensitivity to, 487 OKN velocity and changes in, 79
Pulfrich effect for oblique, 517–18, 517f relative, 201–2 second-order, 217 spatial zones of binocular rivalry and coherence of, 84, 85f specularities during , 262 stereopsis and detection of coherent, 489–90 stereopsis and induced visual, 508–11 stereopsis and opposed, 487–88 stereopsis and perception of, 484–90 as stereopsis token, 261–62 third-order, 217–18 visual sensitivity to, 121–24 Motion aftereffect basic studies on, 140 at different processing levels, 140–41 eyes and distinct, 142–43 interocular transfer of, 140–44 physiology of transfer of, 143 stimulus velocity and transfer of, 141–42 suppressed images and, 92 Motion-defined shapes kinetic depth effect and, 489 stereopsis and, 488–89 Motion-direction contrast effect, 511 Motion-induced blindness, 78, 81 Motion segregation, patterns of, 486, 486f MRI. See Magnetic resonance imaging MST. See Middle superior temporal area MT. See Middle temporal area MTF. See Modulation transfer function Müller-Lyer illusion, 216, 216f, 336 projections of 3-D, 499, 500f stereopsis from, 284, 284f, 499, 499f Multiaperture confocal microscope, 558 Multiplexing, lateral, 545 Mutual inhibition mechanism, 103–6 Nakayama, Ken, 472f Nearest-neighbor images, 190–92, 191f–192f Nearness factor, 540 Neural network models of binocular rivalry, 105 of disparity detectors, 49–50 for stereoscopic vision, 49 Neural remapping , 304 Nodal point, 148 Nodes, short ranges between, 193, 193f Noise reduction, disparity pooling and, 338 Noncoincident patterns, 487 Noninteractive dichoptic stimulus, 210–11 Nonius lines, 173 Nonius procedures dichoptic, 174 dioptic, 173–74, 174f empirical horizontal horopter and, 173–75, 174f–175f grid, 173–74 Nonlinear optical microscopes, 559 Normalization, 439 depth aftereffects and, 461–63 disparity, 347, 430–31 equidistance tendency and, 440 global versus local, 437f of single surfaces, 442 Norms depth contrast and, 439 in stereopsis, 440 NOT. 
See Nucleus of optic tract Notched mask, 322–23 Nucleus of optic tract (NOT) disparity tuning in, 4 signals from, 504
Nulling procedure, Pulfrich effect and, 516 Nyquist limit, 320 Objective spatial frame of reference, 439 Oblique effect, 72, 213 Oblique gaze, theoretical horopters and, 170, 171f Oblique lines, ambiguity in linking , 204, 205f Oblique motion, Pulfrich effect for, 517–18, 517f Oblique quadrants, disparities in, 169 Oblique surfaces, anisotropy in, 414 Occlusion apparent motion and, 502f disparity, 270, 271f Ocular parallax, 233 Ocular prevalence Freiburg test for, 239f visual direction and, 238–39 Oculocentric cyclopean coordinates, 160 Oculocentric cyclopean elevation, 160 Oculocentric frame, 230 Oculomotor induced visual motion, 509 Oculomotor signals, slant and, 394 Odyssey (Homer), 210 OFF channel, dichoptic apparent motion in, 224 Off-channel viewing , 322 Ogle, Kenneth N., 302f Ogle’s horopter, 176f Ohzawa, Izumi, 10f OKN. See Optokinetic nystagmus ON channel, dichoptic apparent motion in, 224 Ono, Hiroshi, 237f Opacity, surface, 267, 267f Opposed motion, stereopsis and, 487–88 Opposite induced effect, 400f Optic array, 148–49 Optic array horopter, 167 theoretical, 167–68, 167f Optic axis, 148 Optic flow field, 148–49 Optic tectum, 4 Optic visual field, 148 Optokinetic nystagmus (OKN) in afoveate mammals, 503–4 attentional shifts and, 507–8 binocular disparity and, 506 centripetal preponderance of, 505 cortical inputs in, 504–5 directional preponderance in, 504 disparity control of, 33–34 eye movements, 89, 509 horizontal, 506 motion changes and velocity of, 79 smooth-pursuit and, 505 stereopsis and, 503–8 vertical, 506, 507f VOR and, 350 OR cells binocular, 135 cyclopean image and, 139 Orientation. See also Surface orientation anisotropy, 413 binocular rivalry and relative, 72 binocular rivalry synchrony and, 84f cyclovergence and constancy of, 394 discrimination, 214 disparity detectors and bar stimuli, 20 disparity tuning and, 20–21 element, 404, 404f line, 372f sensitivity to different, 20–21 similarity of, 195–97, 195f–196f
stereoacuity and stimulus, 324–26, 325f stereopsis and specificity of, 255 stereopsis and stimulus, 254–55 Orientation contrast physiological data on, 139 simultaneous, 137 Orientation disparity, 31, 152 conflicting, 403–4 conforming to surface, 195, 195f cyclovergence and, 58–59 on frontal surfaces, 372 fusion limits for, 58–59 inclination and, 403–5 on inclined surfaces, 372–73, 374f line orientation and, 372f on slanted surfaces, 372–73, 374f Orthogonal dichoptic edges, 119–20, 119f Orthogonal drifting gratings, binocular rivalry between, 78–79 Orthogonal lines, stereopsis and, 196, 196f, 254–55, 255f Orthogonal motion dichoptic coherent motion combined with, 88–89 stereopsis and, 488 Orthoscopic projection, 540, 540f O’Shea, Robert, 75f Overall-size disparity, 375, 393 Panoramagrams, parallax, 545–46, 545f–546f Panum, Peter Ludvigh, 52 Panum’s fusional area, 52, 150 displacement of, 59, 60f Panum’s limiting case camouflage in, 279, 279f da Vinci stereopsis and, 270, 271f, 282, 282f depth and, 277–84 depth perception in, 278 double-duty linked images in, 279–82, 280f–281f with inclined lines, 280, 281f misconvergence and, 278, 279f monocular occlusion in, 279, 279f occlusion configuration of, 277, 277f random-dot stereograms of, 277, 277f sieve effect and, 275 vergence theory of, 278–79, 279f Ponzo illusion, 499f Paraboloid, 416 Paracontrast, 134 Parallax panoramagrams, 545–46, 545f–546f Parallax stereograms, 543–44, 544f Parallel processing of disparity, 512–14 Paralysis of eye, 210 Parietal cortex, disparity detectors in, 27–28 Parietooccipital cortex, suprasylvian area of, 6 Parker, Andrew J., 14f Parvocellular system disparity detectors in, 29–30 ibotenic acid destroying, 30 Patent stereopsis, 302 Pattern discrimination, interocular transfer of, 146–47 Pattern familiarity, 360 Pattern recognition, binocular summation for, 116 Pattern rivalry, 66, 85 color and, 69–70 Pattern-specific learning, 362 Patterson, Robert, 219f
Pedestal disparity depth-discrimination threshold increase and, 330 disparity threshold function of, 335, 336f spatial frequency and, 330–31 Pedestal grating, dichoptic apparent motion with added, 222 Perceived coplanarity, 491–92 Perceived direction, of monocular stimulus, 239–42, 239f Perceived occlusion, subjective contours and, 481f Perceptive hypercolumn, 133 Perceptual grouping, in 3D space, 513–14, 513f Perceptual learning, interocular transfer of, 146–47 Perimetric system, 156–57, 156f PET. See Positron emission tomography Pettigrew, John D., 3f Phantom fringes, 144 Phase-dependent detectors, 464–65 Phase-disparity detectors function of, 16 position-disparity detectors compared to, 16–17 stereopsis and, 42 Phase filters, 185–86 Phase-independent detectors, 465–66 Phasic vergence, 350 Phoria cyclopean illusion and, 247, 247f visual direction and, 242–43, 243f Phosphenes, 210 Photogrammetry, 555, 556f Planar surfaces, sloping , 424 Poggio, Gian F., 7 Point disparities, 363–67 on converged retinas, 364–65, 364f on coplanar flat retinas, 364, 366f distribution of, 363, 365f–366f patterns of, 363 vergence and, 365–67 Point horopter, 166 theoretical, 168–71 Pointing with unseen hand, 420 Polar axis system, 156–57, 156f Polar disparity, 376, 376f–377f Polarizing plates, 542–43 Polaroid stereoscope, 541f, 542 Pooling. 
See Disparity pooling Position disparity, dif-frequency disparity and, 391–92, 392f Position-disparity detectors function of, 16 limit of, 43 monocular receptive fields and, 42 phase-disparity detectors compared to, 16–17 Position invariance of disparity tuning , 12–13 simple cells lacking , 44 Positive contingent aftereffects, 144–45 Positron emission tomography (PET) stationary stimuli and, 37 stereopsis and, 37–39 Practice criterion changes and, 362 depth judgment improved with, 359 eye movements and, 360–62, 361f pattern familiarity and, 360 stereoacuity and, 359–60 stereo latency and, 360–62 stimulus decorrelation and, 361f, 362 Preferred disparity, 5
Primary disparity detectors, 14 Primates. See also V1; V2; V3; V4 binocular cell classification in, 7 disparity detectors in high visual centers of, 24–30 sign of relative depth and, 25 Priming, 78 Principal curvatures, 416 Principal line, 167 Principal plane, 155 Prism stereoscopes, 541–42, 541f Probability summation classic, 108 empirical, 108–9 Probes, depth, 419 Processing time spatial scale and sequential, 355–56 for stereopsis, 355 stimulus duration and, 354–56 Projection orthoscopic, 540, 540f screen distance and stereoscopic, 539–40, 540f Projectively congruent visual lines, 168 Proximal visual stimulus, 149 Pseudo-square-wave stereograms, 251, 251f Pulfrich, Carl, 515 blindness of, 516 Pulfrich effect clinical aspects of, 546–47 contrast and, 531 cue conflict and, 518 dark adaptation and, 529–31, 530f depth in, 527 discovery of, 515–16, 516f dynamic noise, 533–36 eye movements and, 532–33, 532f filter placement and, 546, 546f fixating and, 532, 532f geometry of, 516–18, 517f lesions and, 546–47, 546f light adaptation and, 529–31, 530f long-term adaptation effects and, 531 Mach-Dvorak effect compared to, 523–24 nulling procedure and, 516 for oblique motion, 517–18, 517f saccadic expression and, 520 size variations of, 522 spatiotemporal disparity hypothesis and, 521–23 stroboscopic, 524–25, 524f, 526f–527f temporal-disparity hypothesis and, 520–21, 521f tracking and, 532–33, 532f visual latency dependency on luminance with, 528–29, 528f–529f visual latency hypothesis for, 518–20 Pulsed stimuli, 121–24 Pulvinar, disparity tuning in, 4 Pupillary line, 148 Pupils, binocular rivalry and, 87–88 Pyknostereopsis, 338 threshold for, 353 Qualitative stereopsis, 302 Radial-line horopters, 171–72, 173f Ramachandran, Vilayanur S., 484f Ramp density, 315, 316f Random-dot autostereograms, 549–50, 550f–553f Random-Dot E StereoTest, 294, 294f Random-dot kinematogram, 219 Random-dot stereograms aftereffect from, 457f
Cajal’s, 548, 548f creating , 548, 549f cyclopean shape in, 249, 250f depth in, 201–2 for disparity gradients, 327 disparity modulation in, 317–19, 317f–320f history of, 548–49, 548f inverse cyclopean stimulation and, 229 monocular occlusion zones and, 549 of Panum’s limiting case, 277, 277f properties of, 250f reversed polarity, 199–200, 200f stereoscopic vision tests with, 291–94, 292f–293f vertical disparities in, 306–7 Random spatial-disparity hypothesis, 534–35, 534f Randot Test, 294 Range map, 363 Rangefinder, 366, 555–57, 557f Range-finding stereopsis, 386 Reaction times, monocular and binocular, 124 Receptive fields. See also Monocular receptive fields disparity, 451–53, 453f Gabor, 42, 43f sensitivity profiles of simple cells and, 41f Refusion limit, 52 Regular disparity, absolute disparity compared to, 14 Relative depth aperture problem and, 484, 485f constancy of, 420–24 constancy of, at near-distance, 420–21 constancy of, beyond 2m, 421 crowding and, 498 metacontrast and, 498 perceived whiteness and, 490–91, 492f scaling , 427 sign of, 25 size-disparity and, 369–70 threshold elevation and, 498 vertical disparity and vergence in scaling , 421–23, 422f–423f Relative disparity, 190, 191f, 367–68 dichoptic masking and, 132 stereopsis and, 157 Relative image contrast, 197–200 Relative image size, stereoacuity and, 300–301 Relative motion, 201–2 luminance and, 520 Relative orientation, binocular rivalry and, 72 Relative separation, stereoacuity and, 352f Relative slant, 445f Relief transformation, 430 Repulsion, depth, 434f, 435–36, 436f Resolution contrast, 320 disparity, 319–20 width, 52 Response saturation, binocular summation and, 115 Response variability, 5, 23 Retina. See also Anomalous retinal correspondence binocular rivalry and position on, 73–74, 74f converged, 364–65, 364f coplanar flat, 364, 366f corresponding points of, 182 Retinal axis systems binocular disparity measured by, 157, 157t
coordinate transformations and, 157–58 cyclopean axes and, 159–61, 160f off-center axes in, 158–59, 158f–159f origin changes along visual axis in, 158, 159f types of, 155–57, 156f Retinal visual field, 148 Retinocentric frame of reference, 439 Retinocentric induced visual motion, 509 Reversed contrast, 198–99, 198f Ricco’s law, 115 Rivaldepth, 212, 275 Rivalrous images, 182–83 Rivalry. See Binocular rivalry; Monocular rivalry Rivalry contrast threshold, 67–68 Rogers, Brian J., 407f Ross, John, 534f Rotation disparity, 152, 374–75, 405 cyclovergence and, 407–8 Rubin’s cross, 470 Saccadic expression, Pulfrich effect and, 520 Sagittal meridian, 155 Salience of objects, 438, 453 Saturation dichoptic color mixture and, 61 response, 115 Scaled relative nearness, 430 Scaling. See also Spatial scale disparity, 55–57, 423f distance, 419 expansion, 288 frontal surface, 426–27, 426f relative depth, 427 shift, 288 vertical disparity and vergence in relative depth, 421–23, 422f–423f Schilling, Alfons, 550, 551f Screen distance, and stereoscopic projection, 539–40, 539f Secondary disparity detectors, 14, 31f Second-order disparities, 379, 379f depth aftereffects and, 462–63 depth contrast and detecting, 453–54 Second-order motion, 217 Second-order stereopsis, 45, 333–35, 334f SEE. See Spectrally encoded endoscopy Semantic processing, of suppressed images, 98 Sequential monocular decamouflage, 269, 270f Sequential processing, 355–56 Shading depth relationship with, 492–94 perceived 3-D scene structure and, 493, 493f perceived lightness and, 494f Shading-defined shape, depth contrast with, 450, 450f Shadowing, 560 Shape index, 416–17, 418f Shear-deformation disparity, 375, 406 Shear disparity, 374–75 binocular images and, 410f depth contrast and, 468–69, 468f equal and opposite, in half-fields, 410f gaze elevation and, 409f inclination and, 407–8, 408f induced effect, 405–9 temporal aspects of detecting, 412–13, 412f types of, 406f Shifter circuit model, 348 Short-range depth contrast, 434–36
Short-range motion, 219–20 Sieve effect, 212 depth and, 274–76, 275f–276f factors affecting, 276f Panum’s limiting case and, 275 with reduced contrast, 276, 277f Signal-detection theory, binocular summation and, 109–10 Sign of disparity, reversal of, 288–89, 460 Sign of relative depth, 25 Similar-surface default rule, 278 Simmons, David, 259f Simple cells energy models for, 41–44 position invariance lacking in, 44 sensitivity profiles of receptive fields in, 41f squaring operation in, 44 Simulated emission-scanning microscopes, 559 Simultaneous contrast, 433 Simultaneous depth contrast, 436–37, 436f Simultaneous orientation contrast, 137 Simultaneous spatial disparity, 520 Simultaneous tilt contrast, 91–92 Single-cell responses, binocular summation of, 124–25 Single surfaces normalization of, 442 perception of, 442–43 Size-deformation disparity, 375 Size-disparity correlation, 328 depth contrast and, 466–68, 467f horizontal, 267–68, 268f, 375, 387–88 induced effect, 395–99 oscillations of, 412, 412f perceived slant and, 400, 400f relative depth and, 369–70 on slanted surfaces, 368–70 stimulus eccentricity and, 369–70, 370f vertical, 368, 375 viewing distance and, 370, 370f Size ratios, 370–71, 371f. See also Horizontal size ratio; Vertical size ratio relative vertical disparities and, 381, 381f Skew coefficient, 177, 177f Slant absolute, 445f adaptation, 458f assimilation, 447, 447f deformation disparities and, 393–94 depth intrusion and, 395 differential magnification and apparent, 397, 398f dif-frequency disparity and, 388–91, 389f geometric effect of, 395 global, 389–90, 390f horizontal-size disparity and, 387–88 judging, 392–93, 392f local, 389–90, 390f oculomotor signals and, 394 perception of, 387–92, 445f relative, 445f size-disparity and perceived, 400, 400f size-disparity induced effect and, 395–99 stimulus eccentricity and constancies of, 392–95 terminology of, 387 Slant constancy over changes in distance, 393 Slant contrast, 437–38, 438f, 448f successive, 456f
Slanted surfaces abutting, 424f image adjacency in, 193, 193f orientation disparity on, 372–73, 374f size-disparity on, 368–70 Slant-inclination anisotropy, 413–15, 414f–415f Slant perception, gaze shifts and, 347–48 Slice-stacking stereo imagery, 547–48 Sloping planar surfaces, depth constancy of, 424 Sloping surfaces, depth contrast between, 442–50 Smooth-pursuit, OKN and, 505 Space horopter, 166 theoretical, 166–67 Spatial frequency binocular fusion and, 54 cyclopean aftereffect of, 216–17, 218f depth discrimination and, 335, 498 dichoptic apparent motion and, 221f dichoptic masking and, 131–32, 131f diplopia threshold and, 53f discrimination thresholds, 391f disparities between component, 251, 252f disparity limit and, 305 of disparity modulation, 316–23 disparity scaling and, 56–57, 57f image fusion and effects of, 53–55 interocular differences in, 312, 313f of luminance modulation, 333 masking function, 131, 131f pedestal disparity and, 330–31 similarity of, 197 stereoacuity and, 309 stereoacuity of differences in, 197 stereo thresholds and, 330, 331f suppressed images and aftereffect of, 91 venetian-blind effect and, 389 vergence instability and, 54 Spatial frequency shift, interocular transfer of, 144 Spatial resolution, of vertical disparities, 399 Spatial scale coarse-to-fine, 205–6 depth perception and, 336 disparity correlation with, 329–30 disparity detectors and, 329–36 disparity magnitudes in, 328–29 fine-coarse disambiguation and, 335–36 for Gaussian window, 333 sequential processing and, 355–56 stereoacuity and, 328–38 stereopsis masking and, 337–38 Spatial zones of binocular rivalry exclusive dominance in, 82 extent of, 82–83 independence of, 83–84, 83f–84f motion coherence and, 84, 85f Spatiotemporal averaging stroboscopic Pulfrich effect and, 526–28, 526f–527f virtual disparity and, 526 Spatiotemporal disparity hypothesis, Pulfrich effect, 521–23 Spatiotemporal quadrature, dichoptic apparent motion in, 221–22 Spatiotemporal VEP profile, 126
Spectacles, for chromostereopsis, 285 Spectrally encoded endoscopy (SEE), 561 Specularities disparity between, 262–63, 263f motion and, 262
surface curvature and, 263 Square-wave illusion, 91 Stabilized images, stereoacuity with, 344 Station point, 148 Statistical efficiency, 301 Stereoacuity, 57 absolute and relative disparity detection and, 296–97 anisometropia reducing, 312 attention and, 358–59 color and, 312–13 contrast and, 307–8, 308f, 310, 310f with crossed and uncrossed disparities, 323–24, 324f crowding and, 315 depth detection and, 309 disparity pedestal and, 297–300, 298f–299f exposure duration and, 356, 356f eye movements and, 344–51 features of, 295–302 Freiburg test for, 291 away from horopter, 297–300 head movements and, 350–51 ideal observer for, 301–2 interocular differences in, 310–13 lateral eye movements and, 344–45 limits of, 295–96 locus of maximum, 172–73 luminance and, 307–8, 308f measuring, 287–88 microtexture motion and, 345, 346f other acuities and, 351–54 practice and, 359–60 relative image size and, 300–301 relative separation and, 352f spatial factors in, 313–28 spatial frequency and, 309 of spatial frequency differences, 197 spatial scale and, 328–38 with stabilized images, 344 stimulus duration and, 355, 356f stimulus eccentricity and, 313–14, 313f stimulus orientation and, 324–26, 325f stimulus spacing and, 315–16 temporal factors in, 354–58 terminology and tasks of, 287–89 transparency and, 342–44, 343f types of, 353f in upper and lower visual fields, 314 vergence accuracy and, 349 vergence changes and, 449–50 vergence stability and, 348–49 vernier acuity compared to, 351–52, 352f viewing distance and, 327–28 width discrimination and, 353f Stereo aperture problem, 20, 326 Stereoendoscopy, 561–62 Stereo Fly Test, 290–91, 290f Stereograms. See also Random-dot stereograms of complex spiral, 360f with contrast mixtures, 252–53 depth aftereffects from anticorrelated, 460 of Helmholtz, 238f lenticular-sheet, 544–45, 545f luminance intensity profiles for, 253, 253f of nonmatching letters, 256f parallax, 543–44, 544f pseudo-square-wave, 251, 251f stereoscopic vision tests with, 290–91 Stereo latency, practice and, 360–62 Stereolithography, 562
Stereomicroscopy atomic force microscope, 561 binocular, 557, 558f confocal scanning microscope, 557–58, 559f electron microscopes, 560, 560f nonlinear optical microscopes, 559 stereo x-ray photographs, 561 Stereomonoscope, 543 Stereo motion standstill, 225 Stereo MRI, 561–62 Stereopsis. See also da Vinci stereopsis; Disparity-stereopsis binocular fusion and rivalry and, 92–96 in chromatic channel, 257–61 chromostereopsis, 284–86 coherent motion detection and, 489–90 with color rivalry, 200–201, 200f contrast-sensitivity function for, 308–9, 308f without corresponding vertical edges, 268, 268f disparity detectors for, 8 figural continuity and, 471–72, 471f–472f figure-ground relationships and, 472–74, 473f figure perception and, 470–74 first-order, 333–35, 334f flicker as token for, 261–62 fMRI and, 37–39 from geometrical illusions, 283–84, 284f global, 249, 548 from Hering illusion, 283–84, 284f hysteresis effect in, 310–11, 311f hysteresis in disparity limit for, 303–4 induced visual motion and, 508–11 irradiation, 286, 286f at isoluminance, 257–58 learning and, 359–62 midline, 39–40 monocular-gap, 270 motion as token for, 261–62 motion-defined shapes and, 488–89 motion-direction contrast effect and, 511 motion perception and, 484–90 from Müller-Lyer illusion, 284, 284f, 499, 499f from nonmatching letters, 268f norms in, 440 OKN and, 503–8 opposed motions and, 487–88 orientation specificity of, 255 with orthogonal line elements, 196, 196f, 254–55, 254f orthogonal motion and, 488 patent, 302 PET and, 37–39 phase-disparity detectors and, 42 processing time for, 355 qualitative, 302 range-finding, 386 relative disparity and, 157 reversed luminance polarity and, 199–200, 200f second-order, 45, 333–35, 334f simple detection for, 309 spatial scale and dichoptic masking, 337–38 stimulus adjacency and, 435, 435f stimulus orientation and, 254–55 subjective contours and, 478–84, 481f sustained, 197, 357–58 texture-defined regions and, 255–57, 256f–257f
texture segregation and, 473f transient, 197, 199, 357–58 transparency from, 495, 496f uncorrelated texture and, 256, 256f uses of, 385–87 VEP and, 34–37, 36f–37f vertical disparity and, 18–19 visual pursuit and, 503–8 voluntary pursuit and, 508 from Zöllner illusion, 283–84, 284f Stereopter, 289 Stereoscopes anaglyph, 542 applications of, 555–63 depth constancy in, 423–24, 423f dichoptiscope compared to, 551 field-sequential, 542–43 geometry of displays of, 538–41, 539f–540f lenticular plate methods for, 543–46 mirror, 541, 541f monocular, 550–51 polaroid, 541f, 542 prism, 541–42, 541f screen distance and projection of, 539–40, 539f types of, 541–43, 541f volumetric, 546–48 Stereoscopic accuracy, 288 Stereoscopic acuity. See stereoacuity Stereoscopic anisotropy, 413–15 Stereoscopic gain, 288 Stereoscopic Howard-Dolman test, 290 Stereoscopic inaccuracy, 288–89 Stereoscopic interpolation in ambiguous regions, 476–78, 479f over blank areas, 474–75 over monocular areas, 475–76 over rows of dots, 477–78, 479f in similar regions, 476–77 Stereoscopic subjective contours, 483–84 Stereoscopic vision energy model for, 46 neural network models for, 49 random-dot stereogram tests for, 291–94, 292f–293f real depth tests of, 289–90 stereogram tests for, 290–91 terminology of, 1 test correlations for, 295 uses of, 385–87 Stereosculpting, 562 Stereo x-ray photographs, 561 Stevenson, Scott B., 342f Stiles-Crawford effect, 285 Stimuli. See also Achromatic stimuli; Binocular stimulus; Chromatic stimuli; Dichoptic cyclopean stimuli; Dichoptic stimulus; Dioptic stimulus bar, 20 binocular rivalry and complexity of, 72 binocular rivalry and duration of, 74–75 binocular rivalry and moving, 78–79 binocular summation and spacing of, 113 contrast-defined, 22–23, 22f–23f cyclopean, 136–37, 210 depth aftereffects with line, 454–55, 455f depth contrast produced by, 438–39, 438f for dichoptic apparent motion, 223f dichoptic color mixture and duration of, 61–62 dichoptic color mixture and size of, 62 dichoptic masking between adjacent figured, 128–30, 129f
disparity capture by flanking, 480 for disparity-stereopsis, 249–63 distal, 148 dominant, 63 fMRI and stationary, 37–38 frontal plane comparison, 419–20 inclination and size of, 404–5 isoluminant, 114 luminance, 260 monocular, 1 motion aftereffect and velocity of, 141–42 PET and stationary, 37 pooling, 338 proximal visual, 149 pulsed, 121–24 stereoacuity and location of, 313–14 stereoacuity and orientation of, 324–26, 325f stereoacuity and spacing of, 315–16 stereopsis and orientation of, 254–55 suppressed, 63 Stimulus adjacency, stereopsis and, 435, 435f Stimulus decorrelation, practice and, 361f, 362 Stimulus delays effects of, 356–57 interocular, 356–57 interstimulus, 357 Stimulus duration effects of, 354–55 processing time and, 354–56 stereoacuity and, 355, 356f Stimulus eccentricity binocular summation, position of, 113–14 effects of, 431 size-disparity and, 369–70, 370f slant constancies as function of, 392–95 stereoacuity and, 313–14, 313f vertical disparity as cue to, 382–83 Stimulus rivalry, eye rivalry compared to, 84–86 Strabismic amblyopia, 150 Strabismic deviation, angle of, 163 Strabismus convergent, 73 visual direction and, 243 Stroboscopic Pulfrich effect, 524–28, 524f spatiotemporal averaging and, 526–28, 526f–527f Structure-from-motion disparity, 262 Subcortical cells, disparity tuning of, 4–5 Subjective angle, 163 Subjective contours binocular rivalry between, 72–73, 73f depth and, 478–81, 479f–482f depth interpolation between, 482–83 disparity creating, 481f Ehrenstein’s figure and, 478, 480f in multiple planes, 481f perceived occlusion and, 481f stereopsis and, 478–84, 481f stereoscopic, 483–84 Subjective frame of reference, 440 Subregion correspondence detector, 18 Successive contrast, 433 Successive depth contrast, 454–66 Successive slant contrast, 456f Superimposed dichoptic stimuli, 211 Superimposed gratings, disparity in, 474f Superimposed patterns, dichoptic masking with, 130–33 Superimposed vertical disparities, distinct, 401
Superior colliculus disparity tuning in, 4–5 functions of, 4 Suppressed images apparent movement from, 88 dichoptic masking of, 132 dominant images interacting with, 87–90 effects from, 90 effects of changing, 88 eye movement evoked by, 89 motion aftereffect from, 92 movement signals from, 88–89 semantic processing of, 98 spatial frequency aftereffect from, 91 threshold elevation from, 90, 90f tilt aftereffect from, 91–92 visual beats from, 89 Suppressed stimulus, 63 Suppression recovery mechanism, 104–5 Suppression theory of binocular fusion, 93–94 Suppression theory of binocular rivalry, 65, 93–94 Suprasylvian area, 6 Suprathreshold contrasts, binocular summation at, 114–16 Suprathreshold functions, disparity modulation and, 320–21, 321f–322f Surface continuity, monocular occlusion and, 475f Surface curvature, specularities and, 263 Surface density, 378 Surface opacity, monocular zones and, 267, 267f Surface orientation lightness constancy and, 491 light sources and effects of, 491–92 Surface patches, Koenderink’s classification of, 416 Surface smoothness, 207–8 Sustained stereopsis, 197, 357–58 Swept volume system, 547 Symmetrical convergence, 235–36 Symmetry depth perception and, 474f diplopia threshold and, 52 Telepresence, 562–63 Telescopes, 555–57 Television, 562–63 TEM. See Transmission electron microscopes Temporal contrast-sensitivity function, 120 Temporal disparity hypothesis dynamic noise Pulfrich effect, 533–34 Pulfrich effect, 520–21, 521f Ternus display, dichoptic apparent motion with, 222–23 Tertiary disparity detectors, 14 Texture dif-frequency disparity and gradients of, 391f inhomogeneity, 203–4 stereopsis and regions of, 255–57, 256f–257f stereopsis and uncorrelated, 256, 256f Texture-contingent depth aftereffects, 460 Texture segregation, 471 stereopsis and, 473f Theoretical horopters, 166–72 with cyclovergence, 169, 170f with elevation of gaze, 169–70, 170f with noncongruent corresponding points, 170–71, 172f oblique gaze and, 170, 171f
Theoretical line horopters, 171–72, 173f Theoretical optic array horopter, 167–68, 167f Theoretical point horopter, 168–71 conditions for, 168 convergence and, 168 Theoretical space horopter, 166–67 Theoretical vertical horopter, 169, 169f Third-harmonic generation microscope (THG microscope), 559 Third-order motion, 217–18 3-D optic flow, detection of, 34 3-D scene structure shading and, 493, 493f vertical disparities as cue to, 383–84 3-D shape discrimination thresholds, 418–19, 419f disparity defined, 416–19 Threshold elevation dichoptic masking and, 133–34 relative depth and, 498 from suppressed images, 90, 90f Threshold summation, 127. See also Visual masking Tilt aftereffect cyclopean, 139, 139f, 216, 218f monocular and binocular, 138 stimuli measuring, 138, 138f from suppressed images, 91–92 test stimulus in, 137 Tilt contrast interocular transfer of, 137–40 psychophysical studies on, 137–39 simultaneous, 91–92 Tilt test, Wallach’s, 508, 508f Time flashes separated by, 121–23 interocular correlation detection and, 186–87 processing, 354–56 TNO Test, 293–94 Topological image order, 193–94, 194f Torsional misalignment of eyes, 380 Torsocentric frame, 231 of reference, 439 Total visual field, 149 Tracking, Pulfrich effect and, 532–33, 532f Transient stereopsis, 197, 357–58 reversed contrast and, 199 Transmission electron microscopes (TEM), 560 Transparency depth, 338–39 depth from cyclopean, 272–73, 274f graded transparency depth, 273, 274f lightness and, 494–96, 494f–495f stereoacuity and, 342–44, 343f stereopsis creating, 495, 496f Transverse chromatic aberration, 284–85 Traub, A. C., 547 Treatise of Optics (Harris), 51 Trotter, Yves, 21f Troxler fading, 64, 71 gaze and, 347 Tuned inhibitory cells, 7 2½-D sketch, 350 Twining function, for depth aftereffects, 463, 463f–464f Twist configuration surfaces, 446–47 Two-photon scanning microscopes, 559 Tyler, Christopher W., 316f Ullman, Shimon, 500f Uncrossed disparity, 153, 153f stereoacuity with, 323–24, 324f Unique-linkage rule, 189–90, 190f
Unlinked images, 182–83 Unpaired images, 182–83 minimizing, 204–5 Unseen hand, pointing with, 420 Upper limit of disparity, 289 Utrocular discrimination, 247–48 Uttal, William R., 350f V1 disparity detectors in, 6–25 disparity tuning function in, 8–9 disparity tuning in, 6–14, 8f horizontal disparities in, 20 monocular receptive fields of binocular cells in, 15f spatial and temporal disparities in, 34, 35f tuned inhibitory cells in, 7 V2, disparity detectors in, 24–26, 25f V3, disparity detectors in, 24–26, 25f V4, disparity detectors in, 28 van de Grind, Wim A., 237f Vantage point, 148 Varifocal mirror system, 547 Vection, 509–11 Vection-entrained induced visual motion, 509–11 Venetian-blind effect, 193, 193f spatial frequency and, 389 Ventral stream, disparity detectors in, 28–29 VEP. See Visual evoked potentials Vergence accuracy, stereoacuity and, 349 Vergence angle, 167 Vergence instability fusion limit of spatial frequency and, 54–55 spatial frequency and, 54 stereoacuity and, 348–49 Vernier acuity cyclopean, 213, 213f monocular compared with dichoptic, 351 stereoacuity compared to, 351–52, 352f Verstraten, Frans A. J., 503f Vertical arrays, interocular correlation in, 188 Vertical disparities, 152–53 absolute, 363, 380 adaptation to, 467f depth contrast induced by, 450 detection of, 18–20 distinct adjacent, 399–401 in distinct depth planes, 401–2 distinct superimposed, 401 in dot and line displays, 306 energy model and, 19
eye position and, 383 frontality judgment with, 426 Helmholtz’s display of, 425–26, 425f Hering-Hillebrand deviation and, 179 horizontal disparity detection and effects of, 324–25 monocular occlusion and depth from, 269, 270f in random-dot stereograms, 306–7 in relative depth scaling, 421–23, 422f–423f size ratios and relative, 381, 381f spatial resolution of, 399 stereopsis and, 18–19 stimulus eccentricity and, 382–83 3-D scene structure and, 383–84 tolerance for, 306–7, 306f in visual cortex, 19–20 from visual scene, 380–81 visual system causing, 379–80 Vertical gradient of horizontal disparity, 31–33 Vertical grating, 72 Vertical horopter, 166 empirical, 179–81, 180f inclination of, 180–81, 180f theoretical, 169, 169f Vertical-line horopters, 171, 173f Vertical misalignment of eyes, 379–80 Vertical OKN, 506, 507f Vertical plane, 167 principal, 154–55, 155f of regard, 160 Vertical-shear disparity, 375, 405 cyclovergence and, 406 inclination and, 407 local versus global, 409, 410f, 411–12 Vertical-size disparity, 368, 375 local versus global, 399–402 Vertical size ratio (VSR), 371, 371f, 393 eccentricity at distance and, 381–82, 381f–382f for frontal surface, 424–25, 425f horizontal gradient of, 431, 431f Vestibulo-ocular response (VOR), 350 Vieth-Müller circle, 153, 161 frontal planes and, 425f horizontal horopter and, 167, 167f isodisparity circles and, 368f Viewing distance disparity tuning and, 21–22 eccentricity at, 381–82, 381f–382f horizontal disparities and, 21 induced effect and, 398–99 size-disparity and, 370, 370f stereoacuity and, 327–28 vergence and, 361, 361f
Viewing-system parameters, 154 Virtual disparity, 526 Virtual reality, 562–63 Visual axis, 148 retinal axis systems origin and changes along, 158, 159f Visual beats, from suppressed images, 89 Visual channels, for disparity modulation, 321–23 Visual cortex binocular rivalry and, 99–103 cats responses to dichoptic stimulus in, 10, 11f direct stimulation of, 210 vertical disparities in, 19–20 Visual direction. See also Binocular visual direction ambiguity of motion and, 484–87, 485f–486f aperture problem and, 484, 485f basic law of, 232 controversies over, 236 of disparate images, 237–39, 238f fixation disparity and, 242–43, 243f Hering’s law of, 235f, 245, 245f laws of, 236–37 of monocular occlusion zones, 240–42, 240f–242f ocular prevalence and, 238–39 phoria and, 242–43, 243f strabismus and, 243 Visual egocenter, 232, 234f–235f Visual evoked potentials (VEP) binocular rivalry and, 101–2 binocular summation and, 125–27, 126f DRDC and, 36 image correlation and, 36, 37f spatiotemporal profile of, 126 stereopsis and, 34–37, 36f–37f Visual fields binocular, 149 binocular rivalry and dominance of homogenous, 70–71, 70f optic, 148 retinal, 148 stereoacuity in upper and lower, 314 total, 149 visual line in, 148 Visual latency from judged simultaneity, 519f luminance and, 519–20, 519f Pulfrich effect and hypothesis of, 518–20 Pulfrich effect with luminance and, 528–29, 528f–529f Visual line corresponding, 167
in headcentric direction, 232 projectively congruent, 168 in visual fields, 148 Visual masking. See also Dichoptic masking dichoptic, 127–35 from even illumination, 127–28 types of, 127, 127t Visual-motor learning, interocular transfer of, 147 Visual processing, depth-specific, 497–503 Visual pursuit, stereopsis and, 503–8 Visual sensitivity figure-ground reversal and, 497–98 to pulsed sensitivity and motion, 121–24 Visual system ghosts and, 152 matching images and, 203, 203f vertical disparities due to, 379–80 Volkmann disks, 179, 179f Volumetric stereoscopes, 546–48 Voluntary pursuit, stereopsis and, 508 von der Heydt, Rudiger, 24f VOR. See Vestibulo-ocular response VSR. See Vertical size ratio Wade, Nicholas J., 69f Wallach’s tilt test, 508, 508f Wallpaper illusion, 152, 152f Weber’s law, 115 Wertheimer, Max, 470 Westheimer, Gerald, 296f Wheatstone’s mirror stereoscopes, 541, 541f Whiteness brightness and, 490 constancy, 491 depth and, 490–97 relative depth and perceived, 490–91, 492f Width discrimination, stereoacuity and, 353f Width resolution, 52 Wilson, Hugh R., 310–11, 311f Wolf, Max, 515 Wolfe, Jeremy, 74f Young-Helmholtz theory, 61 Zero crossings, 250–51 Zero-disparity circle, depth contrast induced in, 445f Zero-order disparities, 462–63 Zöllner illusion, 283–84, 284f Zone of binocular suppression, 63 width of, 66 Zone of rivalry. See Spatial zones of binocular rivalry