The McGurk Universe: The Physiological and the Psychological in Audiovisual Culture (Palgrave Studies in Audio-Visual Culture) 303118632X, 9783031186325

This book reconsiders audiovisual culture through a focus on human perception, with recourse to ideas derived from recen

English Pages 230 [227] Year 2023

Table of contents :
Preface
Contents
List of Figures
Chapter 1: Introduction
Preamble
Audiovisual Culture
Chapter 2: The McGurk Universe: Neuro and Aesthetic Theory
Neuroscience, Aesthetics and the Study of Film
Evolutionary Psychology and the Brain
Perception
Neuroscience
Conclusion
Chapter 3: Perpetual Realism: Mediating Fantasy and Reality
The Reality Effect
Audiovisual Traditions and Realism
Evidence for the Real
Doubling Perception: The Technical Analogue
Conclusion
Chapter 4: Mediating the Psychological and the Physiological
Bridging ‘The Gap’
Mediating: Physiological Reality, Psychological Fantasy
Toggling the Phantasmagorical Gap: ‘Fantasy’ and ‘Reality’
Conclusion
Chapter 5: Gestalt, Spandrels and Synergy
Audiovisuals and Gestalt Psychology
Gestalt Extrapolation
Spandrels and Sweet Spots
Extrapolating Off-Screen Sound: The Technological Supernatural
Conclusion
Chapter 6: ‘Gymnasium for the Senses’: The Artificiality of Audiovisual Space
Experiencing Audiovisual Spaces
Rural Sights and Sounds
Nonindifferent Nature
Changed Perception, Underload and Overload
Perceptual Health
Conclusion
Chapter 7: Conclusion

PALGRAVE STUDIES IN AUDIO-VISUAL CULTURE

The McGurk Universe The Physiological and the Psychological in Audiovisual Culture K.J. Donnelly

Palgrave Studies in Audio-Visual Culture

Series Editor

K. J. Donnelly School of Humanities University of Southampton Southampton, UK

The aesthetic union of sound and image has become a cultural dominant. A junction for aesthetics, technology and theorisation, film’s relationship with music remains the crucial nexus point of two of the most popular arts and richest cultural industries. Arguably, the most interesting area of culture is the interface of audio and video aspects, and that film is the flagship cultural industry remains the fount and crucible of both industrial developments and critical ideas. Palgrave Studies in Audio-Visual Culture has an agenda-setting aspiration. By acknowledging that radical technological changes allow for rethinking existing relationships, as well as existing histories and the efficacy of conventional theories, it provides a platform for innovative scholarship pertaining to the audio-visual. While film is the keystone of the audio visual continuum, the series aims to address blind spots such as video game sound, soundscapes and sound ecology, sound psychology, art installations, sound art, mobile telephony and stealth remote viewing cultures.

K. J. Donnelly Humanities University of Southampton Southampton, UK

ISSN 2634-6354     ISSN 2634-6362 (electronic) Palgrave Studies in Audio-Visual Culture ISBN 978-3-031-18632-5    ISBN 978-3-031-18633-2 (eBook) https://doi.org/10.1007/978-3-031-18633-2 © The Editor(s) (if applicable) and The Author(s), under exclusive licence to Springer Nature Switzerland AG 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Cover illustration: United Archives GmbH / Alamy Stock Photo This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The McGurk Universe was a long labour, involving many diversions into blind alleys, some of which were negotiated and others labyrinthine paths to dead ends. The journey involved various hardware vessels, including my 12-year-old Acer Aspire laptop that was held together with sellotape. Sadly, it gave up the ghost and the book had to be finished on a brand-new MacBook Air. During the Covid pandemic my village has been extremely quiet apart from the ritual of visiting Bennett’s mobile fish and chip van on Fridays. This was such an attraction that the queues have at times been gargantuan. Queueing up to an hour gave me plenty of ‘dead time’ to write one-fingered on my iPhone. More of the book came from there than I could ever have imagined. Having said that, most of the core arguments and ideas have been with me for years, including several failed funding bids and slow and difficult progress during the Covid-19 pandemic. Indeed, while this book’s gestation period has been long, its birth difficult, many have helped both directly and indirectly. Lina Aboujieb at Palgrave has been remarkably patient and consistently encouraging. My department at the University of Southampton has also helped me, not least by granting me a period of research leave to develop the ideas that form the basis of this book. I spent that at New York University (Steinhardt) and especially hearty thanks to Ron Sadoff for welcoming me there. His annual ‘Music and the Moving Image’ conferences are a constant source of inspiration to many and to me in particular. I also spent some happy time at Lund University, where Ann-Kristin Wallengren was also extremely welcoming and helpful.

Thanks to my students, who have had to hear me rattling on about the McGurk Effect for years. Also, to all those who have discussed relevant matters with me but in particular: Emilio Audissino, Louis Bayman, Beth Carroll, Erik Daubenton, Joan Donnelly, Patrick Donnelly, Robert Donnelly, Mary Kennelly, Danijela Kulezic-Wilson, Neil Lerner, Maggie Xiaoge Li, Wilfred Marlow and Nora Tourey. Very special thanks for everything to Mandy Marler.

Southampton, UK

K. J. Donnelly

Contents

1 Introduction
2 The McGurk Universe: Neuro and Aesthetic Theory
3 Perpetual Realism: Mediating Fantasy and Reality
4 Mediating the Psychological and the Physiological
5 Gestalt, Spandrels and Synergy
6 ‘Gymnasium for the Senses’: The Artificiality of Audiovisual Space
7 Conclusion

List of Figures

Figs. 2.1–2.3 Cry by Godley and Creme
Figs. 2.4 and 2.5 Singin’ in the Rain
Fig. 3.1 The Lumière brothers’ L’arrivée d’un train en gare de La Ciotat
Fig. 3.2 In Bruges
Fig. 3.3 A Clockwork Orange
Figs. 3.4 and 3.5 Gimme Shelter
Fig. 3.6 ‘We’re on Top of the World’
Figs. 3.7 and 3.8 Lady in the Lake
Figs. 4.1 and 4.2 Silent Hill 3
Figs. 4.3 and 4.4 Escape into Night
Fig. 4.5 The Singing Detective
Figs. 4.6 and 4.7 Sucker Punch
Fig. 5.1 Nosferatu, Phantom der Nacht
Figs. 5.2 and 5.3 The Phantom Carriage
Figs. 5.4 and 5.5 The Innocents
Fig. 6.1 Wallander with Krister Henriksson
Fig. 6.2 Midsomer Murders
Figs. 6.3–6.6 Performance
Fig. 7.1 Nanook of the North

CHAPTER 1

Introduction

Preamble

The irony is not lost on me that as I have been researching, thinking and writing this book about audiovisual culture and perception, my own perceptual faculties have been degenerating. Of course, everyone’s faculties are degenerating all the time but mine have been moving apace. I suppose this has emphasized for me the importance of our senses, perhaps making me more aware than I would have been otherwise of the absolute significance of these human faculties in relation to audiovisual culture. I’ve had intermittent issues with hearing for decades, and over the past few years my eyesight, which was once excellent, now is worsening consistently. As if I didn’t have enough to remind me of issues pertaining to perception, the 2020 Covid-19 pandemic gave me more to think about. Initially, the move to video conferencing offered the possibility of live subtitling software, which I quickly discovered can deliver ludicrous guesstimates as to what is being said. Upon returning to teaching classes ‘live in person’, I was confronted by classrooms of students wearing face masks. Not only could I not discern what they were saying but often I could not even work out which student was talking. I was painfully aware of the irony that I was teaching them about ‘the McGurk Effect’ and discussing the importance of perception.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_1

In the introduction to one of my earlier books, The Spectre of Sound, I noted that the practice bombing runs of RAF jets illustrated the confusion and shock of a lack of unity of sound and image. Since moving to Dorset on the south coast of England, I have had further military illustration of sound and image aspects. The military regularly train with heavy ordnance nearby. Actually, this takes place at some distance from where I live but the artillery generates extremely low-frequency vibrations which can barely be heard but have the shocking effect of shaking buildings, so sound is felt more than heard. On rare fortuitous occasions, wild shaking has enhanced my video game playing, meaning I don’t need to buy a rumbleseat. More often it is simply a shock. As I have been finishing writing this book in 2022, at the same time as the Russian invasion of Ukraine, the guns fell silent and perhaps had moved. The silence is welcome but has been replaced by loud isolated bass sounds from illegal rave parties and corporate music festivals, which are held about five miles away. Some of the festival bands can be heard quite well. It has been suggested that evolved perception and cognition are such that vision is necessary to confirm or validate distant sound sources. I’m not so sure. Do I imagine the sound sources? Seeing as these are mostly reformed bands from nearly 40 years ago, I try not to do so.

Audiovisual Culture

Tom Gunning memorably suggested that the welding of recorded sound to film in the late 1920s came from a desire for ‘reintegrating senses’, which had been pulled asunder by the technologies of the phonograph and film. There was “… a desire to heal the breach”, as he put it in “Doing for the Eye What the Phonograph Does for the Ear”.1 Yet I would argue that this was not returning to the normality our senses might expect but moving to something altogether stronger and possibly more potent: the wholehearted aesthetic merger of sound and the moving image. Film and other audiovisual culture should not be conceived as ‘film and accompanying music’, ‘multimedia’ or ‘multimodality’ but more ‘cross-wired media’, unified by fusion at an almost genetic level. This book is about how film, video games, television and other instances of moving images are precisely audiovisual, a combination of sounds and images, rather than being primarily visual with some audio occasionally being of interest. To read many books on the subject you would think that sound and music are peripheral to audiovisual culture. Michel Chion’s subject-defining book in English, Audio-Vision, establishes a strong sense of film and other media being audiovisual, manifesting what he calls ‘audiovision’, with sound being a significant partner
rather than a bit-part player.2 He discusses the process of ‘synchresis’ as “… the forging of an immediate and necessary relationship between something one sees and something one hears”.3 While audiovision appears to be a distinctive object, it coheres around an aesthetic core and psychological response or means of involving the user. This study takes the unification of sound and image as the key defining feature of audiovisual culture. I get very annoyed when beholding university courses or publishers’ catalogues that are called ‘visual culture’ or ‘screen studies’ when they mean audiovisual studies. Residual ideas about the primacy of images remain, despite research that appears to prove that sound and hearing do not play second fiddle to images and seeing for humans. For instance, neuroscientists Shams, Kamitani and Shimojo state that contrary to dominant common discourse on the matter, vision does not dominate human multisensory perception. They prove that audio signals can alter the perception of an unambiguous visual stimulus.4 So, vision can be changed by sound, rendering it secondary in some situations. A study by Stein, London, Wilkinson and Price showed that the intensity of visual images can be enhanced significantly by added sounds.5 Indeed, sound can change qualitative sensory experience. Many of us are of this opinion, but there is a strong body of research where the importance of sound is underlined by empirical experimentation. A relevant example is Jousmaki and Hari’s research into how sound can fundamentally change human perception and feeling. Adding different rubbing sounds as a test subject rubbed their hands made them think their skin was drier or moister.6 So sound has the potential to change us psychologically as well as converting signals into something else. 
This book is not an attempt to perorate an understanding of audiovisual culture from a point of view of scientific knowledge, although this study draws upon recent perceptual and cognitive insights. I am only interested in neuroscience and scientific approaches insofar as they allow me to rethink audiovisual culture and its aesthetics. So, this book is not a work of ‘cognitive film theory’. It lacks the scientific endeavour or the reference points in philosophy that characterize this area of scrutiny and approach to the subject. Instead, this is a fairly traditional investigation of audiovisual aesthetics, but using some scientifically derived approaches as well as those long-established. These include Gestalt psychology, which in my opinion has been surprisingly underused in the field, and taking some inspiration from Evolutionary Psychology, which is at least explicit about its embrace of functionalism that tacitly is adhered to although not registered in most analyses of culture. I do not intend to become lost in

scientific and para-scientific theories but merely to take enough to sustain my discussion. My approach eschews the dominant and overwhelmingly culture-focused conceptualization and analysis, and the narratological strategy which understands everything in films and beyond as explicable only through narrative development. Instead, adopting an approach inspired by science-based ideas, this book addresses the clear fact that audiovisual culture very directly exploits aspects not restricted to narrative and representation, which have demanded so much attention from arts and humanities researchers and writers. However, through necessity, it mixes analysis in order to include such concerns while reorienting these through an appeal to the fundamental requirements of human neurology and perception. My argument is certainly not that cultural analysis is irrelevant, merely that it has shaded out the central determinant upon the format of audiovisual culture. In essence, this book is about the audiovisual culture that, from the advent of synchronized recorded sound with film onwards, marries moving images with sounds and in particular music.7 My argument is that this is a pervasive form and a cultural dominant and that it is crucial to understand that this plays to human perception in the first place. A video game is indeed very different from a film. Yet in terms of the relationship between audio and visuals, film, television, music video and related audiovisual objects on the Internet all share a particular repertoire of aesthetics and related psychologies. While I certainly don’t want to deny its significance in terms of cultural issues or narrative involvement, I am more interested here in the primary psychological level exploited often explicitly in audiovisual culture.
This is why my discussion looks elsewhere, to insights from neuroscience and neuropsychology, in an attempt to better engage the subject, as well as having recourse to Gestalt psychology and phenomenological approaches, which would have to be central in any such endeavour. I am under no illusions that I am presenting incontrovertible fact. If it does nothing, spending a lot of time reading about scientific theories and experimentation makes one more aware of just how far scientific ideas may be conceived as temporary and anticipate being overtaken by new insights. Indeed, this is some way from the caricaturing of ‘Science’ as context-free beliefs in primordial absolute truth that sometimes can be encountered in Arts debates. Having stated that much experimental knowledge appears temporary but authoritative, I have tried to draw less upon precise

instances of neuroscience and more taken general principles as useful in my discussion here. The phenomenon of the McGurk Effect is a fundamental fact of audiovisual culture, demonstrating as it does, that sound and image are never separate but always mutually affecting (we ‘see with our ears’ and ‘hear with our eyes’). As electronic audiovisual culture has become not only the dominant form of modern culture but also the very heart of modern experience itself, this book aims to investigate the ‘McGurk Universe’ by addressing the interface of the human and audiovisual culture, both in terms of psychological aspects and the more fundamental physiological aspects, and crucially, the fundamental interaction and relationship between the two. Audiovisual culture’s seductive illusions are more than enough to make audiences forget its mechanical-electronic basis. They are also more than enough to make theorists forget about the crucial fact that electronic audiovisual culture is built upon and fundamentally exploits human perception rather than necessarily being simply stories about the depiction of ‘things’. On one level, it is a tautology that audiovisual culture is cut to the measure of human perception and cognition. However, once its illusions are bypassed, its procedures can tell us a great deal about human ‘hardware’. Little research has followed this path. I would argue that audiovisual culture is in essence about the integration of sound and image, both in aesthetic terms (production) and psychologically (in our heads). I proceed from the reasonable assumption that rather than simply working on a straightforward cognitive level audiovisual culture also functions more fundamentally on a physiological level, strongly exploiting precise aspects of human perception. Although this pervades every aspect of films, television programmes and videogames, it is most evident in its direct mimicking of human senses. 
Electronic sound and vision both double and mimic human perception. Sound and image technologies may be separate, like hearing and seeing, but form a unified illusion the same way that vision and hearing are unified in the brain through being processed in the same areas.8 This is illustrated vividly by the McGurk Effect, whereby the perception of spoken sound is changed by its accompanying image, and also by counterpart perceptual effects which demonstrate that what we see is affected by different accompanying sounds.9 A focus on sound and the audiovisual nature of film and related culture can make us notice just how far it is determined by being geared very directly towards the physiological aspects of humanity, namely,

human perception. This may not seem a surprising thing to say, but discussions of audiovisual culture have become so mired in debates about artistic creativity, culture’s relationship to the social, technology, ‘thought’, history and narrative, to name a few, that it has forgotten the essential level, that film works through exploring the limits and the affordances of human perception. A focus on perception can rethink all aspects of film as effects of perception rather than as integral parts of systems of film language, narrative, representation and so on. The traditions of sound film have moved across into other media, where they remain as a core, despite differences in the formats. This book addresses films, television and video games, which as sister media together comprise the dominant forms of audiovisual culture, whether appearing as an embedded YouTube clip on the Internet experienced on a cell phone or on an IMAX screen with 11.1 sound (which is available at some specialist theatres). Of course, all audiovisual culture is certainly not the same, yet what is formed by the combination of sound and music with moving images follows the same core processes, both in terms of dominant syntax and construction as well as psychology and perception. Indeed, the use of the term ‘film’ has become commonplace for short clips on Internet platforms. Indeed, it makes far more sense to think of film as being a perceptual object, based on the unity of screen and speakers/headphones, than to think of it as something defined by its production, distribution and exhibition technologies. Audiovisual culture continues expanding its domination of human existence. Originally, film had the most significant impact, although this mantle arguably was taken later by television, and perhaps more recently by video games. Currently, all of this has been bolstered rather than replaced by the dominance of Internet culture. 
One clear attraction of such culture is the compulsive desire for sound and image sensory stimulation. Indeed, human beings appear far happier with an ‘overload’ of sensory data than they are with a paucity of it, which has ramifications for mental health and physical well-being. Films, television and video games clearly serve a basic human need. Following Torben Grodal, I would argue that audiovisual culture not only is important for its utter pervasiveness, but also for the fact that it unifies so many aesthetic approaches into a coherent and powerful whole. He notes, “Films have a special position within aesthetics due to the fact that film is ‘total art’”, using music, colour, form, perspective, narrative, acting and many other things to produce “the most

sophisticated simulation of human experience by the interaction of the embodied brain and the world”.10 The ‘binding’ of audio and visual signals in human perception is at the heart of film and other audiovisual culture. I discussed this in some detail in my earlier book Occult Aesthetics: Synchronization in Sound Film.11 Pourtois, de Gelder, Vroomen, Rossion and Crommelinck investigated intermodal binding and audiovisual integration. Showing faces concurrently with auditory fragments showed a very rapid onset of combination of sound and image in the brain and also suggested that the character and content of each channel was important for speed of binding.12 Ramachandran and Hirstein note that this process directly follows the tenets of the Gestalt laws of perception, and that the less ambiguous and more seemingly salient the material the more solid the binding of sound and vision.13 It appears that even though simultaneous sound and vision trigger specialist sound and vision areas of the brain, heteromodal areas of the brain, which receive inputs from multiple senses, are where sound and image converge and become unified.14 Sensory signals are unified in a number of ‘association areas’, parts of the cerebral cortex that receive and integrate incoming sensory information, which make connections between sensory and motor areas of the brain and are often linked to the most complex brain functions.15 The development of audiovisual culture appears highly significant. The conceptual and emotional possibilities of the novel were not supplanted but were shifted into a fundamentally different register, one more direct and more visceral, more mentally and emotionally immersive. This allowed for a more enveloping and overwhelming experience, making audiovisual culture into the consummate vehicle for dreams about humanity’s possibilities and humanity’s limits, which have changed notably over the last century. 
Sometimes, in terms of subject matter, issues of evolution and the evolutionary gap (which I will discuss later) have been addressed directly. However, perhaps more clearly, they have been addressed in a more circumspect manner and engaged through sensual, perceptual and cognitive means. Sometimes, I almost wonder if this has been ‘missed’ in the whole process of the dominant approach of thinking audiovisual culture into existing traditions of ‘stories’, ‘histories’, ‘cultures’, ‘issues’ and so forth. As I noted earlier, inspiration for my approach will come from Gestalt psychology, which is premised upon attending essentially to perception and the processing of sensory impulses. It is necessary to acknowledge this

as a defining aspect of human beings, and perhaps one of the most defining.16 If sensory input is deprived, we not only are put into a negative mental state, but our perception and cognition also start to compensate for the lack of sensory input. Gestalt psychology accounts precisely for this process, where we make wholes from fragments, as well as finding patterns where they may not exist. This last phenomenon, known as pareidolia, is the extreme result of our innate drive to find meaningful patterns, even when there are none. This surely is of prime importance for dealing with audiovisual culture,17 if not a defining characteristic of being human. In terms of perceptual drives, it is interesting to note that one of the current explanations of the hearing illness tinnitus is that perception and post-perceptual overcompensation produce an effect of continuous sound.18 As I’ve noted, it seems surprising that Gestalt approaches are marginal in the study of audiovisual culture.19 Maria Poulaki provides a pertinent commentary:

Films simulate the processes of the mind and ‘synchronise’ with them: this is a trope followed from early film psychology till today’s psychological film studies. Among the first psychologists who showed interest in cinema, Rudolf Arnheim considered the ‘art of film’ (as in the original title of his 1932 book, Film als Kunst) to be a case for understanding how visual art reflects and embodies the ways in which the mind organises the perceived world … Before Arnheim, Psychologist Hugo Münsterberg already in 1916 has argued that film moulds reality according to the inner laws of the mind, which do not follow natural laws of causality and continuity in space and time but transform them through attention, memory, imagination and emotion. Both these ‘grandfathers’ of film psychology were in some way associated with the Gestalt theory.20

By implication many of these ideas remain somewhere in Film Theory, neither discredited nor dispelled. Indeed, the phenomenon of seeing successive film images as movement comes from Hugo Münsterberg adapting Gestalt Psychologist Max Wertheimer’s theorization of Phi Phenomenon, in the second decade of the twentieth century. Following Gestalt approaches, I understand perception and cognition as not being a single process but a dual operation. Indeed, a number of Gestalt experiments prove that there is some division between these two procedures. Perception, as the primary and fully unconscious process, is of

more interest to me, although higher-level cognition is beyond being discounted. In terms of other theory and analysis, I’m quite happy with some existing categories, as long as we are aware that these are not absolutes or immutable. I see no reason to problematize or pull apart ideas of physiology and psychology. I understand that some might like to see them as aspects of the same thing, and indeed, I am proposing something along those lines here, although for the purposes of discussion at least, it still seems important to retain a sense of difference. What I’m suggesting is that physiology is the key determinant on psychology and that conceiving human psychology as somehow independent and autonomous, a dominant approach in the arts, is not a viable assumption when put under any sustained scrutiny. As it seems I keep saying, it is surprising that ‘audiovision’ has been so neglected by scholars, as the backdrop of the last couple of decades has been sound’s increasingly important role in recent years (with spatialized multi-channel, multi-speaker sound systems in cinemas and at home and the use of headphones for small screens). Perhaps this has led to some sound studies that neglect the central audiovisual relationship, too. In recent years, there has been far more interest in analysing sound and addressing audiovision, although some of my theoretical touchstones are older theorists who have posed the sort of fundamental questions that remain unanswered but important to keep in mind rather than jettison as they have not provided a simple answer to a complex question. Marshall McLuhan has been a notable influence, although not one engaged with explicitly in my analysis and discussion. The book’s title, The McGurk Universe, was inspired by Marshall McLuhan’s The Gutenberg Galaxy.21 His subtitle ‘The Making of Typographic Man’ makes his argument explicit. 
McLuhan argued that in ‘the West’ the printed book had changed humanity’s consciousness, not only translating all human experience into the format of language and books but also through controlling imagination and ultimately fostering alienation. McLuhan also pointed to electronic media as a continuation of this, with their own specific conceptual and organizational models and limitations, which ultimately have a profound social effect. I’m sure McLuhan would have seen the overwhelming dominance of screen and speaker culture as a continuation and development of this, a recomposition of humanity’s consciousness through a medium built around our fundamental perceptual affordances. The McGurk Universe is

notably less ambitious than McLuhan’s book. I also decided not to retain the term ‘Galaxy’ due to Francesco Casetti’s impressive book The Lumière Galaxy (2015), but there is something apt about using the term ‘Universe’, as audiovisual culture has become so utterly universal worldwide. Like McLuhan, and other associates of the ‘Toronto School’, I take it that dominant forms of culture create and shape psychological states as well as social states, and that electronic audiovisual culture creates a distinct form of human consciousness. Yet I am happy to suggest that reciprocally it has evolved to fit our physiological requirements at least as much as our intellectual desires.22 Like one of McLuhan’s other influential books, Understanding Media: The Extensions of Man, this book has a coherent structure but can also be ‘dipped into’ in a non-linear manner. After all, isn’t this the way with all culture these days? Where the option exists the linear is rarely the mode of choice, and perhaps with good reason, as a ‘hypertextual’ approach appears possibly to be an analogue of the human brain’s activities. I don’t want to overstate this: if you want to you can also move between successive chapters in a straightforward linear manner. This book’s second chapter addresses aspects pertaining to perception and cognition, and how we might use Evolutionary Psychology to better understand the human processes of perception upon which audiovisual culture is built. The third chapter is about the so-called reality effect of audiovisual culture, whereby we take it on some level as capturing, recording and replaying the ‘real’. The fourth chapter addresses how audiovisual culture mediates between the physiological and the psychological, via perception inculcating psychological disposition as well as higher-level cognition.
The ensuing chapter, number five, explicates one of the principal theoretical approaches adopted, that of Gestalt psychology, and also engages audiovisual culture through the lens of Stephen Jay Gould’s articulation of the notion of the ‘spandrel’, which in evolutionary terms is something formed fortuitously rather than developed as an evolutionary adaptation to benefit humanity. The sixth chapter looks at sound and image dynamics and how the ‘spaces’ of audiovisual landscapes are constructed as a textural manifestation of these dynamics that are not only at the heart of human perception but also at the centre of audiovisual techniques and culture. If the McGurk Effect hadn’t been ‘discovered’ as a reality, they would have had to invent it. As a concept, it sums up the essence of film’s effect and confirms what most of us probably know about audiovisual culture already: that sound and vision work in a synergetic manner to enhance each other and produce an effect of seamless unity. A fundamental point is

that the human senses have a need for activity, and this is the crucial backdrop to the radical development and expansion of electronic audiovisual culture from the end of the nineteenth century into the coming future. Although they are not the focus of this study, inevitably I will engage briefly later on with discourses about physical and mental health and well-being at the interface between the human body and audiovisual screen and speaker culture. This book rather neglects dialogue. This was no accidental oversight, though. It is partly because dialogue in some cases registers as an emanation from the image rather than audio and often is less neglected in analysis than other sonic aspects. It is also because dialogue appears dominated by its semantic rather than aesthetic valence, and thus is taken as ‘communication’ rather than aesthetics, engaging higher brain functions more readily than the immediacy of other sounds. Of course, it also has a crucial aesthetic dimension which deserves a dedicated study somewhere else. It should also be noted that sound can be conceptualized as an aesthetic rather than ‘realistic’ element. This is more my concern in discussion here, as this can enable a stronger synergetic effect in tandem with image, and might be understood in a manner closer to music, although it is also worth registering that music combined with images is not the same as music on its own and the audiovisual has a compelling logic of its own. The aim of this book, though, is to illustrate that audiovisual culture is precisely a mixture of audio and visual, through neglecting the concerns of narrative, representation and character in favour of focusing more on the relationship of sound and image and how these merge into a single perceptual signal.

Notes

1. Tom Gunning, “Doing for the Eye What the Phonograph Does for the Ear” in Richard Abel and Rick Altman, eds., The Sounds of Early Cinema (Bloomington, IN: Indiana University Press, 2001), p. 16.
2. Michel Chion, Audio-Vision: Sound on Screen (New York: Columbia University Press, 1994). A variation on this is Robert Miklitsch’s use of the term ‘audiovisuality’ in Roll Over Adorno: Critical Theory, Popular Culture, Audiovisual Media (Albany, NY: SUNY Press, 2006).
3. Chion, op. cit., 1994, p. 5.
4. Ladan Shams, Yukiyasu Kamitani and Shinsuke Shimojo, “What You See is What You Hear” in Nature, vol. 408, 2000, p. 788.
5. Barry E. Stein, Nancy London, Lee K. Wilkinson and Donald D. Price, “Enhancement of Perceived Visual Intensity by Auditory Stimuli: A Psychophysical Analysis” in Journal of Cognitive Neuroscience, vol. 8, no. 6, 1996, pp. 497–506.
6. Veikko Jousmäki and Riitta Hari, “Parchment-Skin Illusion: Sound-Biased Touch” in Current Biology, vol. 8, no. 6, March 1998, R190.
7. Of course, silent cinema also followed these same procedures. This discussion is largely relevant for silent cinema but I have focused more on the ‘recorded’ cinema and later audiovisual culture as the marriage of technologies appears to offer certain aesthetic affordances geared around precise synchronization and unity. Having noted that, a later chapter addresses at length an instance of silent film, although one which was made available on disc with a novel synchronized recorded soundtrack.
8. Petra Vetter, Fraser W. Smith and Lars Muckli, “Decoding Sound and Imagery Content in Early Visual Cortex” in Current Biology, vol. 24, no. 11, 2 June 2014, pp. 1256–62.
9. Harry McGurk and John MacDonald, “Hearing Lips and Seeing Voices” in Nature, vol. 264, 1976, pp. 746–748; see further discussion in K. J. Donnelly, Occult Aesthetics: Synchronization in Sound Film (New York: Oxford University Press, 2013), pp. 25–26.
10. Torben Grodal, “Film Aesthetics and the Embodied Brain” in Martin Skov and Oshin Vartanian, eds., Neuroaesthetics (Amityville, NY: Baywood Publishing, 2009), p. 249.
11. Donnelly, op. cit., 2013.
12. Gilles Pourtois, Beatrice de Gelder, Jean Vroomen, Bruno Rossion and Marc Crommelinck, “The Time-Course of Intermodal Binding between Seeing and Hearing Affective Information” in Neuroreport, vol. 11, no. 6, 27 April 2000, pp. 1329–1333.
13. Vilayanur S. Ramachandran and William Hirstein, “The Science of Art, A Neurological Theory of Aesthetic Experience” in Journal of Consciousness Studies, vol. 6, nos. 6–7, 1999, pp. 15–51.
14. Gilles Pourtois, Beatrice de Gelder, Anne Bol and Marc Crommelinck, “Perception of Facial Expressions and Voices and of their Combination in the Human Brain” in Cortex, vol. 41, no. 1, 2005, pp. 49–59.
15. “Association Areas” at Neuroscientifically Challenged: Neuroscience Made Simpler, https://neuroscientificallychallenged.com/glossary/association-areas [accessed 02/02/2022].
16. Perceptual faculties have to be understood as the norm in our species, even though as I noted some people may lack acute perception due to degeneration or disability.
17. Donnelly, op. cit., 2013, p. 6.
18. Josef P. Rauschecker, Amber M. Leaver and Mark Mühlau, “Tuning out the Noise: Limbic-Auditory Interactions in Tinnitus” in Neuron, vol. 66, no. 6, 24 June 2010, pp. 819–826.
19. I don’t want to suggest that it has totally been ignored, and indeed, university Music departments are far more familiar with this approach than Film departments. Colleagues and erstwhile colleagues at the University of Southampton’s Film department have adopted approaches inspired by Gestalt in their research and publications, including Beth Carroll, Emilio Audissino and Jady Jiang.
20. Maria Poulaki, “The ‘Good Form’ of Film: The Aesthetics of Continuity from Gestalt Psychology to Cognitive Film Theory” in Gestalt Theory, vol. 40, no. 1, 2018, p. 29.
21. Marshall McLuhan, The Gutenberg Galaxy: The Making of Typographic Man (Toronto: University of Toronto Press, 1962).
22. Another more general influence along these lines is Susanne K. Langer’s approach in Feeling and Form: A Theory of Art (New York: Scribners, 1953).

CHAPTER 2

The McGurk Universe: Neuro and Aesthetic Theory

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_2

The McGurk Effect established that not only are vision and hearing active but also that they merge in a ‘synaesthetic’ manner in human perception. Electronic audiovisual culture can tell us much about human physicality. Indeed, to look into audiovisual culture is to know what it is to be human: not just our innermost desires that may be represented, but the electrochemical physical processes that are our often unacknowledged essence. This chapter begins by discussing a dominant notion in Evolutionary Psychology (EP): that art and culture are an ‘adaptation’.1 This holds that one way or another they serve a useful function for survival, perhaps as a repository of valuable ideas and simulations of possible real-world scenarios. Going further, I would suggest a physiological function, too. Although many voices have been raised against the implications of Evolutionary Psychology, it remains highly influential, although sometimes in different guises. Despite the general pervasiveness of its ideas, it is firmly on the margins of humanities research, arguably due to distaste for its social implications rather than any theoretical objection. Although hardly an evangelist for it, I nevertheless believe EP can furnish audiovisual research with a different and fruitful perspective. Doing so will not be founded in approaching film and other audiovisual culture in representational terms but perceptually, at the point of translation of audiovisual style and techniques into human terms. I would suggest that studying sound, music and the moving image as a physical/physiological object

necessitates dealing with audiovisual phenomena and the highly specific aspects of sound and vision that are processed through sight and hearing into a merged signal before reaching a viable level for complex discussion of aspects such as ‘representation’ and ‘reference’. Laboratory experiments have furnished us with some important points of perceptual-cognitive ‘reality’ which ought to be approached as absolutely fundamental for any study of film, television and video games that accepts the essentially audiovisual basis of the medium. Relations between sound and image do not follow simple, ‘common-sense’ patterns, and thus the fundamentals of the relationship need more concentrated and careful thought. In an article in Nature in 1976, McGurk and MacDonald first explicitly identified the perceptual phenomenon whereby images of a person’s mouth saying a sound, added to a recording of a different vocal sound, add up to the perception of a new sound that objectively is absent.2 The so-called McGurk Effect is a startling illustration of the unified essence of audiovisual culture. Sounds can be changed by images: a phoneme (vocal sound) combined with an almost matching viseme (image of sound being produced by a human face), when added together, yields a different phoneme for the perceiver.3 Most commonly, the repeated sound of a person saying ‘ba’ is accompanied successively by images of the same person saying ‘ga’ and ‘da’, and almost universally human beings perceive that the sound is changing, even though in reality it is exactly the same.4 So, the objective fact of the sound is altered for us by the accompanying image. Illustrating directly and conclusively that visuals can change perception of sounds, this phenomenon significantly also demonstrates the fundamental interrelatedness of sound and image. Indeed, it suggests that fully isolating sound or image may be invalid. 
It certainly suggests that blanket assertions of ‘film being a visual medium’ (or similar assertions) should be long banished. The implications for sound may in fact be more profound. Indeed, any appeal to the ‘purity’ of musical experience is, of course, also thrown into question.5 A related phenomenon is called the ‘Bounce Inducing Effect’, which was demonstrated by Sekuler, Sekuler and Lau through having two concentric target shapes moving towards and through each other on a screen.6 Once sound is added at the point of coincidence, it seems as if they are bouncing off each other, whereas without the sound, it does not appear this way. This

illustrates clearly how sound drastically affects visual perception. These effects show a sense of ‘congruence’, where adding two media together produces the effect of their direct relationship. They also vividly illustrate the centrality of precise temporal synchronization in the primary perceptual-cognitive effects.7 So, the truth of it is that we partly hear with our eyes and partly see with our ears. ‘Film music’ in this light is not simply music that accompanies film, but a sonic element integrated on a genetic level. This means a potential problem for conceptualizations and experiments that approach the subject as moving images plus music, as one added to a pre-existing other. These effects are without doubt at the centre of film and have been exploited consistently although unsystematically and with only an intuitive and unconscious awareness of their importance. Annabel Cohen notes that the study of film music has been dominated by an ‘associationist’ approach, which divines “… the effects of complex configurations from the properties of the underlying simple ideas. Thus, the effects of combining dimensions of music (e.g., rhythms and harmonies) or of combining music and film can be predicted. This is consistent with the view that music independently adds meaning”.8 The McGurk Effect informs us that the addition of image track and soundtrack is not a simple product. Adding two and two makes five in this case, or the two tracks add up to something different: a ‘third term’, which is more than each individually, and quite possibly the magic product at the heart of cinema itself.9 Despite not knowing about the McGurk Effect, as a young person I was aware that something strange was going on. I can remember seeing and hearing the music video for Godley and Creme’s Cry in 1985. The words are all sung by Kevin Godley but the images consist of a succession of many different cross-faded faces singing parts of the song. 
This includes the images of a young man with a clear frontal lisp and a man who looks like he has had an operation for a cleft palate. I can remember finding it strange that at the point where these two ‘sing’ the ‘s’ in the song, words were not vocalized the way they were in the rest of the song. Godley’s singing voice appears to change at that point. This is only a passing effect but we hear the change; the inability to produce high-pressure sounds, such as ‘s’, is expressed in both men’s mouth movements; and if we are watching closely, on a large screen, we register the sound of the image rather than the sound of the sound itself (Figs. 2.1–2.3). The McGurk Effect ought to have significant implications for understanding how audiovisual culture works. Aspects of sound and image merge. Musicians and music video makers have been aware of this on an

Figs. 2.1–2.3  Cry by Godley and Creme

instinctive level for decades—that a poor song can be enhanced by adding impressive images, while filmmakers have understood that good music might help prop up an unengaging film. Indeed, since Hollywood’s studio era, there has been an (unproven) assumption that a good musical score might be able to ‘save’ a poor film. For analysis and deeper understanding of audiovisual culture, the McGurk Effect forces us to focus on perception, emphasizing the physiological rather than film as a merely ‘cerebral’ experience. The foundation of audiovisual culture in human perception and cognition that mixes audio and visual aspects surely is highly significant. According to Gestalt Psychology proponent Wolfgang Köhler, “sensory fields have in a way their own social psychology”.10 This confirms and underlines the specificity of audiovisual culture as something distinctive in itself rather than a ‘multimedia’ conglomeration, an affixing of separate parts into a composite whole that still shows the joins. The psychology of audiovisual culture, I would suggest, should be founded upon a solid grounding: acknowledging the importance of music and sound as the heart of audiovisual culture as testified to by specifically audiovisual phenomena which drive ‘synchresis’ and the redoubling of effect that is often evident when films use sound and music in impressive combination with moving images. It should be based on ideas derived from these specifically audiovisual phenomena, allied with experiments designed and calibrated for media music and sound rather than jerry-built from the prevailing methods in psychology and scientific analysis. In other words, roughly welding together film psychology plus music psychology does not equal film music psychology. It is no sum and far from a simple equation. On the other hand, an approach that registers the primary importance of the yoking together of sound and image for the mixed perception of hearing and seeing would seem far more appropriate.

The McGurk Effect works through the integration of sound and image impulses into a unity in our head. Rather than being a cognitive (‘top-down’) process, it is part of perception (a ‘bottom-up’ process), using primary principles of perceptual organization, which is the same whether we are beholding something in a cinema or beholding the original object in actuality. Different terms have been used for the close integration, or fusion, of sound and image: ‘audiovisual integration’, ‘bimodal integration’ or ‘multimodal integration’.11 For thinking about the workings of audiovisual culture, this process clearly ought to have a profound influence. Registering the significance of the McGurk Effect also allows an acknowledgement of the centrality of perception and thus a ceding of suitable importance to it in analysis.12 Indeed, this has notable implications for understanding how hearing and vision work, and particularly for audiovisual culture’s yoking of the two together, with much analysis assuming a supporting role for sound, with occasional but significant ‘mutual implication’. However, such theorizations are insufficient as the ‘merged signal’ makes for a far more complex and nuanced effect than sound and image separately. That we hear with our eyes, and see with our ears, is a remarkable revelation, but the importance of this for understanding audiovisual culture and its powerful effect is that the signal is far from straightforward and consequently aesthetics are able to produce a highly complex effect.13 On occasions, film has played around with the potential of the McGurk Effect, knowing about it instinctively before it was theorized in the 1970s. For instance, Singin’ in the Rain (1951) is all about the coming of synchronized recorded sound to film and concludes with a famous sequence that addresses the illusory unity of sound and image. 
Lina Lamont sings the title song on stage, but she is miming and her voice is provided by Kathy Selden, who is hidden behind a curtain. This sequence initially unifies sound and image, which is revealed as an illusion to the film audience but not the audience depicted in the film. So, this illustrates how sound and vision form a perceptual unity for the diegetic audience. This is fine until the point when Don, Cosmo and R.F. Simpson pull the partitioning screen up to reveal both singers and the actual origins of the singing voice (Figs. 2.4 and 2.5). This is irony upon irony as when Kathy (Debbie Reynolds) has to dub film dialogue for Lina (Jean Hagen) earlier in the film, it is actually Hagen’s speaking voice we hear in a double bluff. Furthermore, immediately after the song on stage, Kathy and Don sing the duet You Are My Lucky Star.

Figs. 2.4 and 2.5  Singin’ in the Rain

Yet it is not Debbie Reynolds’ voice that we now hear but rather a recording by Betty Noyes. For a film about the coming of sound and synchronization, it is a riot of technique, artifice and illusion. A couple of years later, a similar surrogate voice appeared in British film Mad About Men (1954), a sequel to the successful comedy film Miranda (1948). Teacher Caroline sings in a show on a pier but the singing voice we hear is not hers but rather belongs to her mermaid double, Miranda, who is under the pier and using a microphone. In this film, things fall apart when the microphone drops in the water and is picked up by Miranda’s sister who sings but in a totally different voice, redoubled by outlandish reverb. The diegetic audience are confused as to what is going on, although they realize the spectacle has been a sham. However, both Caroline and Miranda are played by Glynis Johns. So, we haven’t been party to a lip-synch illusion bringing together different sound and image but to the reality of the actress and her own singing voice. This is more about sound than image and foregrounds technology’s ability to cross time and space. In both cases, singing and apparent singer initially appear as a unity for a duped diegetic audience, who behold the enhanced image of the miming women through the addition of a good singing voice. Engaging with these sequences for a second (or even third) time, and looking away from the screen intermittently, I thought the sound of the singing voice changed slightly, also registering that this was the case between hearing the song during Hagen’s miming and Reynolds’ actual singing—which nevertheless was miming to her own recording on playback. The sequences are all about telling us that sounds can change images, yet reciprocally images can also change sounds here, and the complexities of sound-image relations and perception embodied by the McGurk Effect are illustrated notably.

Neuroscience, Aesthetics and the Study of Film

The study of film and audiovisual culture, like the arts and humanities more generally, has been slow and often unwilling to engage with scientific approaches and insights. Ted Nannicelli and Paul Taberham note in the introduction to Cognitive Media Theory that “we ought to acknowledge that while the greater collaboration with the sciences presents enormous opportunities, it also requires us to engage in a kind of meta-theoretical reflection”.14 Over the years, the study of film and other media has largely eschewed approaches originating in science, although the development of so-called Cognitive Film Theory in recent years has opened up a new approach to the subject. Many writings in this area have clustered around certain concerns, however, such as how audiences might understand films on a cognitive level, and the place of emotion in this process, often leavened with a heavy dose of philosophy. Some scientific meta-theories have attracted less interest. Evolutionary Psychology, for example, is broadly accepted in the sciences although it has had little significant impact on the arts. Malcolm Turvey, one of the few film theorists to have engaged with it notably, makes a careful discussion of this situation and notes that there is “the possibility that, as with mental modules, evolutionary psychologists might discover new design features of the abilities cinema depends on by postulating problems they evolved to solve in the Pleistocene, and that this in turn might lead to fresh discoveries about the design features of films”.15 He is certainly willing to entertain the notion that the arts can learn something significant from scientific developments, which is not a common position among film theorists, and yet he also points to a fundamental difference between theory in the sciences and arts and how conjugation is no simple matter.16 Edward O. 
Wilson’s Consilience: The Unity of Knowledge17 attempted to bridge the culture gap between the sciences and the humanities that was the subject of C.P. Snow’s notion of ‘the two cultures’.18 Yet attempts at unifying the two houses have arguably been only minimally effective. Evolutionary Psychology centrally considers the psychology of modern humans in the same way as the rest of human biology.19 Considerations of film theory using EP, often alongside cognitive theory and insights from

neuroscience, include Torben Grodal’s influential Moving Pictures (from 1997), Joseph D. Anderson’s The Reality of Illusion from 1998, Ira Konigsberg’s ‘Film Studies and the New Science’ (2007), Grodal’s ‘Pain, Sadness, Aggression, and Joy: An Evolutionary Approach to Film Emotions’ (2007) and Embodied Visions: Evolution, Emotion, Culture and Film (2009) and Carol Fry’s The Primal Roots of Horror Cinema: Evolutionary Psychology and Narratives of Fear (from 2019).20 Some writers are quite sanguine. Joseph Anderson, for example, points to how such an approach can prove “a crucial corrective to the view that cinema and our responses to it are primarily, if not exclusively, determined by culture”.21 His focus is on how features of film interact with our evolved perceptual mechanisms to achieve certain effects, and this goes against the dominant tide in the Arts that responses are ‘plastic’ and overwhelmingly determined by cultural aspects and developments. As Turvey notes, most EP film theorists start by saying “film is dependent upon innate perceptual and psychological capacities for their effects”.22 This approach then forms a foundation for noting that the traits on which a general attraction to film is built developed in the Pleistocene era, but ultimately goes on to tell us little about film or why film has a big effect on people. I suppose to some degree that’s what I intend, too. However, moving to conclusions is difficult, and extremely problematic to prove to any satisfactory degree. This book is not about EP, but rather about the particularities of the audiovisual as it has become established in mass culture. Consequently, appeals to EP from me here will only be parts of an attempt to understand the audiovisual process, how recorded images and sounds fit together into a highly effective format that has come to dominate the world. 
I don’t want to draw a simplistic paper man of ‘dominant orthodox film theory’, while Murray Smith rightly also warns about seduction by scientific novelty and potential subsequent misunderstandings about what conclusions scientific ideas can allow us to draw.23 Broadly I agree with him and like many others I would suggest that scientific insights might make us reconsider certain aspects of film theory and reappraise aspects that have been discarded over the years.24 In recent years, there has been a cross-disciplinary interest in the possibilities for investigating film using scientific experiments, most of which incorporate a neurological focus and have recourse to hardware for auguring the human brain. One such project is so-called neurocinematics which has inspired a plethora of laboratory experiments addressing film, with

proponents such as Tim Smith and Uri Hasson prominent.25 As an approach, it brings cognitive neuroscience, and in particular the insights available from new electronic body auguring and measuring hardware used in medical research, to bear on film and its surrounding theories. It is premised upon empirical experiments using traditional scientific and psychological laboratory techniques. It aims to build a stronger understanding of the psychological process of experiencing films, as well as exploring theories about film in a scientific manner. While it has been one of the most interesting new developments in the study of film, it has been limited by an overwhelming domination of research geared almost exclusively around vision. This is not necessarily due to the unfortunate persistence in film studies of the notion that film is ‘visual culture’, but more likely is due to the ascendancy of neuropsychological techniques pertaining to vision, and the affordances of some of the dominant technological hardware, which concentrate on vision. However, Tim Smith has addressed audio as well as visual aspects; a defining collection, The Psychology of Music in Multimedia, was edited by Siu-Lan Tan, Annabel Cohen, Scott Lipscomb and Roger Kendall; and there is work by Miguel Mera, and a special issue of the journal Music and the Moving Image co-edited by Ann-Kristin Wallengren and me, as well as a few others.26 I don’t want to suggest I’m doing something totally different and novel here. I am not a scientist and my discussion only aims to bring some of these insights to the study of audiovisual culture. Indeed, the last couple of decades have involved both Arts and Science researchers addressing the perception and understanding of film and other forms of audiovisual culture. This is often generically labelled ‘Cognitive Film Theory’. This has taken some remarkable steps to try to integrate artistic forms of research with more scientific ones. 
However, creating a unified field of research has not proven easy and I am aware that there is a division (sometimes acknowledged, sometimes not) between the two kinds of researcher. While arts and humanities researchers might be unhappy with the ‘generalist’ approaches to films and the desire for positivistic answers, (broad) scientists might dismiss some of the others’ writing as ‘simply philosophy’, and indeed, there is an insistent strain of philosophy that has dominated organs such as the journal Projections and the Society for Cognitive Studies of the Moving Image (SCSMI).27 They are far from alone among organizations and publications dealing with audiovisual culture in failing to register sound as a notable component in the organization’s title or the journal’s name.

As I have already noted, there has been less of an interest in sonic aspects and a focus on the visual. Sometimes there has been an interest in adding/subtracting music in experiments. There have been related experiments investigating how far music can reorient an activity, such as by Pehrs et al., who found that musical accompaniment can alter the effects and feeling of a kiss.28 This followed several studies that appear to prove music increases and elicits emotional engagement. For instance, an fMRI-based study by Kreifelts, Ethofer, Grodd, Erb and Wildgruber showed that nonverbal emotional information was communicated more strongly by audiovisual signals than by sound or vision alone, and also that the affected areas of the brain were more sensitive to emotional content than to non-emotional content.29 Underlining the ‘merged signal’ proven by the McGurk Effect, Oliver Vitouch completed an experiment that showed that changing a component changes the audiovisual object.30 He showed that different music can change film audience expectation, which illustrated that not only expectation and conjecture but also the psychological disposition of the audience could be altered through music.

Evolutionary Psychology and the Brain

These days, it is de rigueur for film scholars either to use an assemblage of diverse, unrelated (indeed, sometimes contradictory) theories but care (or understand) nothing about them or their wider implications, or to espouse one theory or another and to filter everything through that theory and denigrate other approaches. I may be evangelical about one or two approaches but Evolutionary Psychology is not one of them. I have been sceptical about it and noncommittal, and yet I’m happy to invoke it here, as it offers the possibility of a different and radical approach to film, approaching physiology with no half measures. A general definition of Evolutionary Psychology, particularly for my purposes here, is the broad assumption that evolution has shaped the human physical form and the brain is no less a part of this, and human behaviour, emotions and thoughts equally have been shaped and constrained by evolutionary concerns.31 Steven Pinker states that the mind’s “operation was shaped by natural selection to solve the problems of the hunting and gathering life led by our ancestors in most of our evolutionary history”.32 This seems reasonable, particularly in the light of a lack of evidence that the human brain has developed in that time. One of the most important things about Evolutionary Psychology, particularly for my

study here, is that it reinserts a sense of the physiological to psychology. A number of prominent, indeed dominant, schools of psychology are premised upon models that are wholly abstract and based on theoretical structures with no interest in the physicality of the human brain or how humans relate on a basic level with the outside world.33 Probably the most influential account of Evolutionary Psychology was made by Tooby, Cosmides and Barkow. They state that “Evolutionary psychology is simply psychology that is informed by the additional knowledge that evolutionary biology has to offer, in the expectation that understanding the process that designed the human mind will advance the discovery of its architecture”.34 Indeed, Leda Cosmides and John Tooby were instrumental in establishing Evolutionary Psychology (often known as ‘EP’). In the same book, with Jerome Barkow, they set their approach in opposition to what they call the ‘Standard Social Science Model’ (or the ‘SSSM’), which is a general approach that remains dominant in social and psychological research. In short, the SSSM has retained and bolstered a sense that almost everything is determined by environmental concerns against a backdrop of solid demarcation between ‘nature’ and ‘nurture’. According to the SSSM, evolution produced a ‘general-purpose’ mind (a tabula rasa) with little innate knowledge, specialist activities or predispositions. This is almost diametrically opposed to the concept of mind in EP.  This schismatic division is ‘reified’, made into something seemingly solid and insurmountable, allowing for social and anthropological research to amass evidence and utterly marginalize social and psychological research and discussions that bolster an instinctual or physiological position to account for human activities.35 So, what might EP do for analysis? 
There has been a schematic debate from time to time about the primacy of either ‘nature’ or ‘nurture’, of human activity being determined by physiological and innate tendencies, or of being unshackled from these and creatively determining humanity’s own activities and future. In the study of the Arts, the latter has held sway in a remarkable manner. While most would accept that innate, ‘natural’ aspects have a significance, there is a consensus that the Arts are an embodiment of the power and value of human culture and are in essence an index of creativity and human psychology, and so a tacit negation of the determining power of the innate. Evolutionary Psychology as an approach takes a different view. It adopts the principles of evolutionary biology in order to research the processes and structure of the human mind. In the same way that evolutionary


biologists explain physical features of animals by using evolutionary arguments about adaptation and survival, Evolutionary Psychologists aim to explain human behaviour, thinking and emotion through addressing the human ‘hardware’ of the brain and its adaptations towards certain functions that have aided the survival and persistence of humanity. An adoption of approaches from EP can redress the balance to some degree and register an acceptance that both aspects are important. Joseph Anderson notes that “both the genes and the developmentally relevant environment are the product of evolution”, and this ‘ecological approach’ takes culture and human endeavour into account as an evolutionary product in itself.36 Indeed, Anderson adopts what is known as an ecological approach, derived from J.J. Gibson, which posits a dynamic interaction between the persisting genes and environment, rather than a simple reaction to environment. Humans change their environment, and culture is one notable aspect of this. Yet for Arts or Humanities research, EP has been relegated firmly to the periphery. It appears antihumanist and often involves railing against what adherents call the SSSM, which insists on the mind as an empty machine with little in the way of directional tendencies while being ready and able to tackle almost anything. Of course, there are clear questions about how any task might be dealt with by a ‘general-purpose’ brain, rather than one that used many pathways with particular functions. The first approach sounds utterly abstract and vague. In contrast, Evolutionary Psychology-based approaches focus on ‘hard-wired’ processes, reminiscent of the ‘Read-Only Memory’ of computers, and downplay newly learned and ‘developed’ elements of human psychology, and indeed, can often radically limit focus to explanations that only have origins in the development of ancient humans.
I am aware of criticisms of EP, and there are many.37 Hilary and Steven Rose’s book Alas Poor Darwin: Arguments Against Evolutionary Psychology included a chapter by Stephen Jay Gould (in reply to Daniel Dennett), Steven Rose’s chapter ‘Escaping Evolutionary Psychology’ (focusing on its limitation of ideas) and Hilary Rose’s chapter ‘Colonising the Social Sciences’ (about the perceived attack on Social Science method and theory).38 What might be called ‘hardcore Darwinist’ approaches have proven divisive and polarizing. Edward O. Wilson’s Sociobiology: The New Synthesis, released in 1975, caused a controversy, particularly its final chapter where Wilson shifts analysis from animal behaviour to human behaviour.39 While on the surface similar to Evolutionary Psychology, and identified as such by some detractors, those who espouse it have pointed out that it explains


mentality rather than ‘explaining away’ (perhaps even excusing) certain forms of behaviour that are antisocial.40 Sharp criticism has been levelled at EP for being speculative rather than based on plain data. Indeed, its nature is certainly speculative and it deals with complex and abstract situations that are not easily addressed by empirical research.41 Current understanding of the human brain registers that it is a parallel processor, with physical sections carrying distinctive functions, while sometimes it is not wholly apparent what their specific activities might be.42 Its operation appears extremely complex. Brain scans show ‘active areas’ during certain functions, and reveal a picture far more complex than initial ideas of how brain sections, or modules, might work. The same brain sections can be used for multiple activities and more than one area can regularly be triggered at the same time. So, the traditional idea, which was embodied by the model head of phrenology, of direct mapping between a brain region and a mental state has been jettisoned.43 Rather than simply addressing the ‘meat’ of the brain and its function, more theoretical and abstract understandings of the brain exist, too. One influential and largely accepted model is the so-called Bayesian brain.44 The brain is conceived as being dominated by a process of hierarchical top-down inference which predicts on the basis of past experience. The brain makes sense of the world by anticipating and deciding how likely different hypotheses are to fit and explain perceptual information, in a process of constantly reworking and reapplying a large number of hypotheses derived from previous experience. This seems reasonable and would certainly help account for the range of abilities of the human brain, and its overtaking of the basic functions hunter-gatherers required for survival.
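The kind of inference attributed to the Bayesian brain can be illustrated with a toy calculation. What follows is only a minimal sketch of Bayes' rule, not a model taken from the neuroscience literature: the two hypotheses, the prior weights, the likelihood values and the function name `bayes_update` are all invented for illustration.

```python
def bayes_update(priors, likelihoods):
    """Re-weight hypotheses after one observation using Bayes' rule.

    priors:      dict mapping hypothesis -> probability held before the observation
    likelihoods: dict mapping hypothesis -> probability of the observation
                 if that hypothesis were true
    """
    # Weight each prior belief by how well it predicts the observation.
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("observation is impossible under every hypothesis")
    # Normalize so that the updated beliefs sum to one.
    return {h: weight / evidence for h, weight in unnormalized.items()}


# Hypothetical scenario: a rustle is heard in the grass.
# Past experience (the prior) says it is usually just the wind.
priors = {"wind": 0.9, "predator": 0.1}
# Assumed likelihoods: this kind of rustle is more typical of a predator.
rustle = {"wind": 0.2, "predator": 0.8}

posterior = bayes_update(priors, rustle)   # belief after one rustle
second = bayes_update(posterior, rustle)   # belief after a second, similar rustle

print({h: round(p, 2) for h, p in posterior.items()})
print({h: round(p, 2) for h, p in second.items()})
```

With these assumed numbers, a single rustle raises the 'predator' hypothesis from 0.1 to about 0.31, and a second similar rustle raises it to 0.64, so that it overtakes 'wind'. The point of the sketch is simply that the previous conclusion becomes the next starting assumption, which is the constant reworking of hypotheses that the Bayesian brain model describes.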
From a point of view of evolution, the brain and its processes are conceived as having developed in precisely the same way as the rest of the human body. This approach involves a fundamentally different way of conceiving mind activity. Rather than the dominant notion of the tabula rasa (the empty slate), where the brain is able to adapt and deal with whatever comes its way, Evolutionary Psychologists think of the brain as having physiological properties like the rest of the human body and having characters that are predispositions towards dealing with problems that faced humans in the Pleistocene era. Tooby and Cosmides argue that the dominant way of thinking (of the SSSM) is that the human mind consists of ‘domain-general psychological mechanisms’, which are general in character and use, and are able to be put to a variety of different purposes.45 These don’t


have any particular existing disposition and therefore adapt to the task in hand, no matter how different it is. On the other hand, Tooby and Cosmides support the idea that the mind is made up of ‘domain-specific mechanisms’, where different modules have different functions or propensities. These are not general purpose but specific, with specialisms that might be combined quickly and easily to confront a task in hand.46 Confer et al. note that evolutionary theories which are premised upon the brain model of numerous domain-specific brain operations have been empirically confirmed by scientific experiment, while the theory of domain-general thinking has produced no scientific confirmation.47 They also note that the rapidity of mental responses appears to support domain-specific dedicated modules to deal with certain situations rather than general modules, which would have to rationally work through every situation from scratch.48 So the theory goes, domain-specific mechanisms have proliferated exponentially to allow for abilities in a massive number of domains, and the brain works through patching together a number of these for novel requirements. This is sometimes called the ‘Swiss Army Knife brain’, where new issues can be addressed with a mixture of domain-specific modules, which can be linked and might even be repurposed to complete an operation. Conceptually, this appears to be an important point for EP, as the notion of domain-specific adaptations appears to underline the importance of evolution, whereas the domain-general could just adapt to anything and so had no need to evolve, or to compensate for its limitations. There is some disagreement about the effects of evolution.
While it is clear from archaeological records that the human skull has not noticeably changed since the Pleistocene era, some think that evolution has carried on but less visibly, as a process of brain interconnectedness and patching abilities which allow for increasingly high-level and complex brain activity. On the other hand, though, judging from evidence about the human physiological form, it does not appear that the human form has developed for 100,000 years.49 This poses a ‘gap’ between the capabilities of the human form and the current environment, and this will be discussed in further detail in the following chapter. It is accepted that humans took a ‘Great Leap Forward’ around 40,000 years ago, which has been called by some a ‘cultural revolution’.50 In the Upper Palaeolithic era, more sophisticated tools and cave art appear. There is evidence for a monumental change in human activities, particularly in the realms of culture, religion and organization of


communities. This may well have been determined by ecological aspects, including more efficient tool making leading to a higher level of hunting success and correspondingly increasing density of human population, adding together to produce new ways of living. Some see this point as more physiological. In The Prehistory of the Mind, Steven Mithen suggests that a development in brains took place at this point, where previously separate domains of the mind became accessible to one another, allowing for novel and creative combinations, allowing for an entirely new range of creative cognitive activities.51 Fossil and bone records are isolated and hardly conclusive but current consensus suggests that human brain cases took on the current form, allowing a larger and wider functioning brain, no earlier than 200,000 years ago, while the development of basic but clear signs of human culture can be dated to likely no greater than 40,000 years ago, although some isolated cave paintings appear to be much older.52 There has been plenty of discussion (and disagreement) about the function of culture for humanity. While John Cage may have declared art to be purposeless play,53 some assign more concrete functions for cultural activities. Mithen, for example, sees all art, literature and music as ways to develop and regulate the complex cognitive brain hardware upon which our more highly developed functions depend.54 Steven Pinker disagrees, and accords more with Cage. However, he is particularly dismissive about music:

Compared with language, vision, social reasoning, and physical know-how, music could vanish from our species and the rest of our lifestyle would be virtually unchanged. Music appears to be a pure pleasure technology, a cocktail of recreational drugs that we ingest through the ear to stimulate a mass of pleasure circuits at once.55

In contrast, Joseph Carroll agrees with Mithen and envisages a significant function for music and other culture, as being able to “provide models of behavior and help regulate the complex cognitive machinery through which humans negotiate their social and cultural environments”.56 We can see here embodiments of the wider notions of culture: those that see it as a mark of high-level human achievement and, at the other extreme, those that see it merely as commerce pandering to the lowest common denominator among humanity. Yet this notion that it serves an organizational function for the human mind is compelling and engaging. It also has significant


implications for thinking about audiovisual culture itself. Understanding it in this way, we should attend to its material aspects and the manner in which these engage with the human mind, which as I have noted is initially and significantly based on the faculties of perception. It is significant that EP potentially can provide a different point of view on aspects of culture that have come to be about themselves, and are assumed to mean only in relation to themselves. It is easy to forget that human activity has a physiological horizon and this retains a hard but rarely acknowledged determination to culture and other human endeavours. We should remember that culture is built upon something (our evolutionary foundations) rather than on nothing, or simply on ‘traditions’ that ultimately are assumed to come from virtually nothing and nowhere. Like hidden rock and soil formations below the landscape surface, fields, flora and forests, and cities, the underlying topography determines the ultimate structure of the surface even if people are unaware of it. In The Adapted Mind, Barkow, Cosmides and Tooby state:

Culture is not causeless and disembodied. It is generated in rich and intricate ways by information-processing mechanisms situated in human minds. These mechanisms are, in turn, the elaborately sculpted product of the evolutionary process. Therefore, to understand the relationship between biology and culture one must first understand the architecture of our evolved psychology.57

This seems a reasonable statement. Yet debates about why humans have developed and fostered culture are far from complete, and I haven’t got the space to go into these in detail here. Evolutionary Psychology’s notion is of art as adaptation, rather than as something that fills humanity’s idle time, or as something that marks its high-water mark of achievement. Instead, it is taken to be a functional manifestation of the central drive of humanity to continue its species. Art and wider culture are approached as an aid for humanity’s persistence.58 This EP view might allow a replacement and realignment of ideas, an antidote to simplistic humanistic ideas which litter the subject: the director as a God figure, art as an ‘embodiment of the people’, a reflection of the ‘spirit of the age’, a valorization of ‘individual creativity’ which is at the heart of the human and a destiny of the species (etc.). Central is the idea of culture as an emanation from the human form and its striving for


survival and development. Yet there are polarized views on art and culture from an EP point of view. It may be an adaptation itself, something that developed and retained an important function for the primary aspects of survival (social order, position in the hierarchy, securing power and stable reproduction, etc.). Or, on the other hand, it might be viewed as something that has had little evolutionary value and has developed into a leisurely ‘time-filler’, although I suppose a functional approach to it is clear in both views.59 An Evolutionary Psychology approach to culture would likely assume that due to its importance, it essentially must be functional and aiding us in some way. Such an approach looks for aspects of culture being useful, no matter how minor or how tangential. But how? Perhaps most clearly this could be through stories and morality plays and cautionary tales such as Grimm’s fairy tales.60 These provide experiences by proxy, indeed, knowledge about situations that we are yet to encounter. Yet an approach derived from Evolutionary Psychology might look further, focusing on (and for many proponents of EP simply assuming) how it might be a ‘physiological’ adaptation. Can it be of physical help to us, perhaps as a form of both physical exercise and mental exercise? Evidence suggests we might engage with illusionistic narratives and situations in a way that matches very precisely how we would react in the real world. So, it supplies a simulation of sorts, making us think about situations depicted and what might happen, setting us to wondering about answers to narrative enigmas, but also forcing us to make sense of shapes, colours and sounds, engaging them on a representational level but also appreciating them on a fundamental level of perception.
If there is a move to divide ‘culture’ from physiological and evolutionary processes, at heart that appears to emphasize a difference between the ‘primitive’ human body and the ‘destiny of science’ as a transcendent progression. This appears to be the tension between our meat and our nature and intellectual capability, and potentially between our past and our future. It seems overly simple and schematic to divide ‘thinking in the brain’ and physiology, and indeed, much scientific as well as some Arts theory has loudly declared this—what is commonly known as the ‘Cartesian Divide’. A focus on perception acknowledges the intimacy of cognitive and physiological processes. Furthermore, considering art as an adaptation should lead us to think about the physiological level, rather than the softer and perhaps less thought-provoking ‘behavioural’ or ‘cultural’ evolution.


In Homo Aestheticus: Where Art comes from and Why, Ellen Dissanayake pointed to the crucial evolutionary role of culture. She posits not only that it is a central part of humanity but also that it was crucial in the development of humanity beyond a basic subsistence survival state.61 Going further, Dissanayake also suggested that art and culture have a function of making the banal extraordinary, and thus retaining interest through ritual and defamiliarization, while also solving intractable issues symbolically.62 She states that “the evolutionary appearance and development of ‘artification’ – a behavioral predisposition for deliberate use of aesthetic operations by adult individuals and groups in contexts of uncertainty (occurring perhaps 200 kya [thousand years ago]) – has implications for prevalent cognitivist and evolutionary ideas about the relationship between ritual, religion, and the arts”.63 A number of other commentators have taken a similar position to Dissanayake, and averred the origins of culture in human development, while some have gone further and situated the origin in physiologically based processes. For instance, Colwyn Trevarthen points to the psychobiological origins of music, in the unique way human beings move, and their body’s physical relationships with the world and other bodies. Rhythm oriented around the ‘Intrinsic Motive Pulse’ (IMP) articulates all human experience and thinking, as well as regulating all aspects of human remembering and communicating.64 Gestural mimesis and rhythmic narrative expression of purposes and images of awareness, regulated by, and regulating, dynamic emotional processes, form the foundations of human activities, with culture being an expression and development from this. Some have posed strong notions of how we could understand culture as more physiological than necessarily ‘mental’, indeed, as being genetically ‘secreted’ outside the body.
Richard Dawkins posited that genes could have a manifestation outside the human body as an ‘external phenotype’ as opposed to the ‘internal phenotype’, where genes affect the human body.65 While aimed primarily at built objects and extensions of perception, culture clearly is understood as functioning in this way, perhaps most clearly through Dawkins’ concept of the ‘meme’, where cultural ideas follow a process not unlike that of genes.66 In its pervasiveness, it is almost as if the term has proven Dawkins’ point precisely. In perhaps a similar manner, the so-called extended mind thesis (EMT) posits that the mind does not exclusively exist in the brain, or other parts of the human body, but rather extends outwards into the physical world.67 For example, a spider uses its web as an extension of its limbs’ ability to feel


touch, or in human cases a shopping list or Wikipedia on a smartphone might be understood as extensions of a human being’s memory. This might open up many possible avenues of thought. For instance, we might adopt a more diffuse idea of the mind, which might move perception to a more central position rather than its traditional place on the periphery. And we might also include electronic culture as part of that perception. Indeed, what lies outside the body might be understood as wholly integrated with the self rather than as a separate entity. In other words, this implies that audiovisual culture, and indeed wider culture, might be understood as an essential part of the self rather than as wholly separate objects that are the self’s products. This accords with McLuhan’s postulations of media as extensions to the human physiology and mind. Conversely, perhaps, we might think of the outside as a fairly direct effect of the inside, and we might approach perception as an externalization of an internal process.68 And this might be figured directly and immediately in audiovisual culture. Perhaps audiovisual culture generally might be understood as an external phenotype, as a manifestation of genes, the essence of humanity, that occurs outside of the organism that possesses those genes. Yet this would minimize any importance ceded to the ‘gap’ between human physiological makeup and contemporary living. It is incontrovertible that the environment is changed (and unpredictable change is set in motion) and humans, while instituting momentous changes, have not simply adapted the world to their requirements as much as set in train a number of crises and dialectical tensions. I would suggest it is more likely that culture might embody an externalization of the struggle and indirect process of evolution, while dealing with its bequests, its ill-fitting aspects and the pressures of trying to adapt to contemporary living.

Perception

According to Collins’ English dictionary, “Your perception of something is the way that you think about it or the impression you have of it. … Perception is the recognition of things using your senses ...”.69 Perception involves signals that travel from the sensory system through the nervous system. It involves physical stimulus on our senses, with, for example, hearing coming from air waves hitting and vibrating the eardrum and body, and sight coming from light striking the eye’s retina. These physical energies are converted into signals in the nervous system that end up at the brain. Traditional notions of perception saw it as mechanical and direct,


providing raw data for the brain to address. However, since experiments in Gestalt Psychology early in the twentieth century, it has been accepted that the process is not passive but involves active operation which shapes and partially ‘digests’ the impulses ready for higher operations in the brain.70 So, perception involves organization, identification and interpretation of sensory information in order to provide rapid ‘pre-digested’ understanding of information and environment. Differentiating between ‘perception’ and ‘cognition’ is not a simple procedure. Most current writing (to do with film) subsumes the former under the latter, or simply ignores it. As perception is a precognitive procedure, there’s a tendency to imagine that it is simply raw data fed into our computer-like brain. We understand an equation like that easily. Yet perception is a more complex activity. Rather than channelling raw data into our heads, the processes of perception are already partly digesting those impulses, and, significantly, adding to and enhancing that material. Perhaps the most basic formulation of perception and cognition is that the former is ‘bottom-up’ and the latter is ‘top-down’. Perception can provide us with ready-processed stimuli, already organized in a particular way; however, the mental ‘processing’ and mulling over of this is achieved by so-called higher-level processes. ‘Top-down’ means that these are cognitive, conscious, ‘thinking’ activities, while ‘bottom-up’ processes are unconscious ones pertaining to perceptual processes founded in our senses. Traditionally, the latter were understood as being mechanical until the point where Gestalt psychologists proved that perception is an active and constructive process. More recent neuroscientific research appears to back up their ideas.
Schroeder and Foxe note that neurobiologists traditionally assumed that multisensory integration was a higher-order process that took place after sensory signals had already been through extensive processing, yet substantial recent research demonstrates that it takes place in low-level cortical structures that were generally believed to deal with one sense only.71 ‘Top-down’ processing can be understood as cognition, engaging the higher and more conscious processes of the brain, which is influenced by the goals of the person, while the ‘bottom-up’ processing can be understood as perception, which depends directly on external stimuli.72 There has been a debate about how far activities are determined by top-down or bottom-up processes.73 The view that bottom-up processes dominate (the ‘Attentional Capture Hypothesis’) suggests that immediate capture of attention is not overridden by the slower process of top-down control, while the ‘Contingent Capture Hypothesis’ suggests that the goals of the perceiving subject can shape and change the signal.74 Traditionally, there has also


been something of an assumption that bottom-up processes are something fairly banal and can be influenced by the more sophisticated top-down processes of ‘thinking’ (including desires, background knowledge, predispositions and intentions). However, research by Chaz Firestone and Brian Scholl suggests otherwise. It appears that there is no evidence that visual processing can be directly influenced by cognition (that is, it is not ‘cognitively penetrable’).75 Such research suggests a notable demarcation between perception and cognition, with different processes. A strong difference between the two might be significant for the study of culture. Understanding audiovisual culture as perception-led, with physiological concerns as the main determinant upon it, can pull away from overly ‘Cartesian’ approaches which interpret films and other objects as narratives to be ‘cognized’. While there may be disagreement and differing experimental results, it is clear that these two processes work together in a manner that is central not only to human functioning but also to encounters with audiovisual culture. Scientific and psychological paradigms tend to yield approaches based on their overall conceptions. For example, a Behaviourist approach understands the human mind as being in effect the same as (other) physiological reflexes, and so relevant primarily in its ‘bottom-up’ aspects and being direct in its operations. Such a conception is defined at least partly by the analytical techniques favoured by Behaviourist approaches, which often deal exclusively in physiological reactions. Approaches inspired by Cognitive Psychology tend to subsume ‘bottom-up’, lower-level processes into the ‘top-down’, higher-level processes, seeing them as part of the same thing and perhaps even irrelevant.76 Yet Gestalt diagrams and the McGurk Effect itself show us the power of perception over cognition.
Once we have experienced a film of the McGurk Effect in action, we know how it works, but when we repeat it, despite knowing the reality of the sound remaining the same, we nevertheless still perceive it as changing. This can be illustrated immediately with reference


to the Müller-Lyer illusion, where straight lines with arrows stuck on the end are perceived as being of different length when they are the same. Even when we know that the lines are the same length, they appear to be of different sizes. This indicates that the conscious, thinking part of us is in thrall to a powerful process situated in our perceptual faculties. This plays to what might be a problematic distinction but one that nevertheless dominates ideas, that of the so-called mind-body problem. This often coheres around taking the brain as a physical part of the human body and trying to square that with the mind, consciousness and emotions. Of course, this is a problem only when mind and the body are considered fundamentally different in nature. There is a danger of over-emphasizing the brain or the body and falling into the trap of the Cartesian Divide, following René Descartes’ declaration of ‘I think therefore I am’, and its dualism of active mind and passive body. Although Gilbert Ryle ridiculed this as the mind in the physical meat being the ‘Ghost in the Machine’,77 the dominance of approaches deriving from Cognitive Psychology has firmly concentrated focus on the mind and cognition. This ‘brain alone’ approach remains most evident in the cognitive approach to film, which mirrors the dominant approach to spectatorship of most Arts, where the audience is a distant and even a coldly comprehending receiver. This approach has become less popular in recent years. Phenomenological concerns have come to some prominence since the 1990s. These include so-called affect theory and what some have named the ‘bodily turn’, with its focus on how emotion and cognition are not merely mental phenomena but engage essentially with the physical. This is often described through the process of ‘embodiment’.
For phenomenologists, affect relates directly to body activity and reactions, engaging emotions, feeling and mood, and while in some cases this can literally focus on physiological reactions, perhaps more often it can be metaphorical, or synaesthetic, about how culture can make us ‘feel’ the physicality of its representations. Highly influential here has been Lakoff and Johnson’s Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought.78 The premise of ‘embodiment’ fuses materiality with mental abstraction, deeming the physiological body not to be a vessel for the human but being the human itself. Such an approach has roots in scientific approaches such as Behaviourism with its focus on physiology. This is related to the ‘Embodied Mind Thesis’ in philosophy, which states that the mind and its processes are in essence determined


by the body, an approach that runs counter to Cartesianism and to dominant approaches in cognitivism and computationalism. A related concept influential in the study of film is the notion of the haptic and the focus on the sensual, where the spectator’s contact with media is conceived of as touching, as opposed to simply seeing at a distance. While some assign this approach to Gilles Deleuze, it has a far older history in both art and film. Indeed, this idea in relation to film appears to have been first articulated by Noël Burch in the 1960s. Borrowing from art historian Aloïs Riegl, Burch considered the term ‘haptic’ to encompass the optical and spatial elements of film that ‘touch’ the spectator. He argued that a ‘haptic’ space is created through the development of depth of field “with the movement of actors, camera placement and lighting”.79 Burch’s inspiration, Aloïs Riegl, discussed an opposition between the ‘haptic’ (or tactile) and the ‘optical’ modes of representation in art (looking at Roman art). Several writers have addressed the notion of the haptic and embodiment as a way of perceiving and understanding film. These include Laura Marks, Vivian Sobchack, Steven Shaviro and Jennifer Barker.80 These all focus on the sensual aspects of film, shifting analytical emphasis onto phenomenology, experience and feeling of film. Marks’ The Skin of the Film discusses the notion of ‘haptic visuality’ in detail, where “vision itself can be tactile, as though one were touching a film with one’s eyes”,81 although she says little about aurality. This is where a film can cause a sensuous feeling or memory of physicality, particularly an idea of touch. She notes that “in haptic visuality, the eyes themselves function like organs of touch”,82 which makes for a highly immersive situation. There are similarities to Vivian Sobchack’s approach.
According to her, ‘affect’ lies in the corporeal relationship between film and its viewer-auditor, underlined by the film apparatus’ similarities to a bodily organ. Film possesses rhythm and movement of the sort experienced by human bodies, and it also presents image movement and metaphorical movement.83 This ‘physiological’ approach has proven to be an inspiration to many other scholars, and has engaged with a larger grouping of theories which converge around phenomenological approaches, affect theory and methods that focus on embodiment.84 This cluster of theories is heavily indebted to underlying concepts proposed by Edmund Husserl and Maurice Merleau-Ponty.85 Merleau-Ponty and Henri Bergson are credited with inspiring a philosophical approach that countered the strong tradition of putting human consciousness at the heart of understanding and engaging with the world.
Merleau-Ponty posited that the human body was central to processes of knowing and understanding, while also noting that the body and what it perceived were not easily separated. So, affect is directly correlated to perception. In Matter and Memory, Henri Bergson defined affect as “… that part or aspect of the inside of our bodies which mix with the image of external bodies”. This puts perception at the heart of the human, with there being “no perception without affection”.86 Later, Merleau-Ponty utilized an approach inspired by Gestalt psychology, although perhaps not without some reservations. This is most evident in The Phenomenology of Perception where he not only focuses on the body and perception but also points to Gestalts as fundamental and directly uses Gestalt psychology ideas.87

Neuroscience

Wojciehowski and Gallese point to a general ‘biocultural turn’ since the Millennium.88 Investigations into the human mind have been a notable part of this. I don’t intend to provide a full course about perception and the brain here and aim to take merely what I need for the book’s arguments about audiovisual culture. An approach that embraces neuroscience can provide us with certain useful insights. Neuroscience is the scientific study of the human nervous system. It is closely related to neuropsychology, which is concerned with how a person’s cognition and behaviour are related to the brain and the rest of the nervous system. It is a relatively young field, less than half a century old. It aims to uncover the relationship between the workings of the human mind and the nervous system, and as an endeavour has been accelerated by the development of relevant experimental hardware. This includes neuroimaging techniques such as fMRI (functional magnetic resonance imaging), which makes images of blood flow; MEG (magnetoencephalography), which maps changes to magnetic fields caused by electrical currents in the brain; and EEG (electroencephalography), which measures electrical charge and changes in this on the scalp, indicative of brain activity. The link between mental functions and neural regions is not simple and straightforward, as the brain is a multiple parallel processor, regularly activating a number of areas at the same time. Tasks require multiple cortical areas in action simultaneously, while there is also the phenomenon of ‘plasticity of the brain’, where an activity can be rerouted to another area if the original area is impaired. A great deal of excitement has been
generated in the last decades about the existence of a particular form of neuron in the brain, as well as about knowledge of the brain’s architecture and function being the key to understanding art and culture. These potential insights come from the field of neuroscience, and in particular neuropsychology, which addresses the human nervous system, searching for a biological foundation for perception, memory, understanding and consciousness, looking to physical processes underlying cognition and behaviour. Neuropsychology assumes that the electronic images and data produced by auguring hardware (fMRI, EEG, etc.) constitute an accurate and incontrovertible indication of the human mind and its activities. This is certainly contestable, as is the assumption that physiological responses equate to mental content and aspects. While the results from these devices certainly tell us something about our physiological and mental activities, they remain open to interpretation. This is compounded by consciousness remaining an enigma. The way that mind relates to the meat of the brain is not solved, and there is no known location in the brain assigned to consciousness or understanding of human subjective experience more widely.89 The field has not been without criticism, particularly in its application beyond its immediate area of expertise, such as when dealing with ideas of culture and human activity. The relation of the meat of the brain to the human mind and consciousness is a clear issue, and there have been some criticisms that its application has both scientific and logical shortcomings.90 There have also been some disagreements on the grounds of some researchers and writers using the frame of experimental science and others using one of philosophy. Some applications of science to areas of culture are very direct and have proven to be very controversial. One such novel area of investigation is so-called neuroaesthetics.
The term ‘neuroaesthetics’ was coined by Semir Zeki in 1999,91 and assumes that it is possible to understand aesthetic experiences as a brain-based physiological reaction. Using brain-analysing hardware, it seeks neural correlates of aesthetic judgement and creativity. Consequently, art can be understood through our neurological reactions, and reciprocally, the brain can be understood better through analysing responses to art. The emphasis has tended to lie more strongly on the former, however. It is worth pointing out that there has been little scrutiny of films and audiovisual culture.92 Most of the experiments have focused on visual art and used traditional categories such as beauty.
In Inner Visions: An Exploration of Art and the Brain, Semir Zeki posits that artists such as Claude Monet were able to ‘bypass’ human consciousness to seemingly ‘plug directly’ into human emotions, in an immediate and perhaps even unmediated manner. On the other hand, he suggests, modernist art, such as Salvador Dali’s paintings, set off neural conflicts in that they work counter to the natural processes of perception as well as cognition.93 Some of these conclusions may not appear surprising but their foundation in empirical science and the artist as unconscious experimenter are radical. An often-repeated quotation from Zeki declares, “the artist is in a sense, a neuroscientist, exploring the potentials and capacities of the brain, though with different tools. How such creations can arouse aesthetic experiences can only be fully understood in neural terms. Such an understanding is now well within our reach”.94 Potentially revolutionary in many ways, Zeki’s approach believes that ages-old mysteries of art, culture and affect can be assessed coldly and objectively through brain-auguring hardware and theories about evolutionary processes. Broadly, neuroaesthetics uses the frame of Evolutionary Psychology, looking for a reason for the existence and form of these aesthetic responses and structures. Zeidel and Nadal note, “One of neuroaesthetics’ main objectives is to characterize the neural underpinnings of such varied artistic activities and aesthetic experiences. This endeavour, however, also requires understanding the evolutionary history of the crucial neural systems: how, when, and why did they come into place”.95 This explicitly introduces EP into the equation, as a means of finding function and a sense of historical development.
Through the concept of neuroaesthetics, Zeki claims to have rethought our relationship to art, establishing “the biological basis of aesthetic experience … [and a] neurobiological definition of art”.96 He states: “Aesthetics, like all other human activities, must obey the rules of the brain of whose activity it is a product, and it is my conviction that no theory of aesthetics is likely to be complete, let alone profound, unless it is based on an understanding of the workings of the brain”.97 As an approach, neuroaesthetics has been heavily criticized. For instance, John Hyman dismissed Ramachandran and Zeki’s analysis as explaining some features of paintings but doing nothing more than that. He points out that Ramachandran doesn’t approach art as art, and in fact misrecognizes it in a rush to make extravagant overgeneralizations about art and how it works.98 Pointing out the shortcomings of neuroaesthetics, Ellen Dissanayake stated: “Evolutionary aesthetics and neuroaesthetics
studies are typically concerned with perceptual and cognitive preferences for features that were or are adaptive  - e.g., associated with salubrious landscapes, healthy mates, nutritious food, cognitively-satisfying forms, or works - e.g., pictorial or literary depictions of adaptively relevant subjects such as romance or resolved conflict …”.99 Overall, neurological investigations, and neuroaesthetics in particular, have been dominated by a focus on visual art and have been beset with extremely general statements about art and culture. While refreshing, some of the theories are overambitious, and neuroaesthetics’ ‘big theories of art’ have not become pervasive the way some imagined they might.100 There has been a great deal of interest in ‘mirror neurons’, which are sensorimotor neurons that register a direct understanding of a physical action. In watching an action by someone, the same neurons fire precisely as if that action was being done by the watcher. As Carl Plantinga puts it, “brain processes involving mirror neurons enable us to understand faces and bodies in action and link us to other people’s activities and feelings. Such processes allow us to understand and respond affectively to human events and behaviour, whether on the screen or in the extrafilmic world”.101 This seems to amount to an ‘embodied simulation’ that allows us to understand other people’s activities and could account not only for empathy but also for situations of profound identification and caring for others’ situations. Indeed, the discovery of mirror neurons could prove crucial in that they may be central in allowing human beings to relate to others, learn skills and language and develop our minds more generally. 
The experiments, pioneered by Vittorio Gallese and others, were conducted with monkeys, but it appears that human beings have similar neuron activity.102 Mirror neurons might explain a great many issues to do with how we understand and empathize with other people, and potentially how we become so involved in audiovisual culture. Rizzolatti and Craighero point to mirror neurons as the basis of understanding and imitating action, and so at the heart of perception and cognition.103 Going further, Vilayanur Ramachandran sees mirror neurons as the key to the ‘Great Leap Forward’ in human evolution, and predicts “that mirror neurons will do for psychology what DNA did for biology: they will provide a unifying framework and help explain a host of mental abilities that have hitherto remained mysterious and inaccessible to experiments”.104 This remains to be seen but the notion of mirror neurons increasingly has been taken up to solve a host of issues, including possibly even being the key to social
competence and cohesion.105 The overwhelming majority of experiments and discussions focus on vision, yet the isolated writings that include sound as well draw some potentially significant conclusions. An interesting possibility is suggested by Gallese in Kohler, Keysers et al. Experimenting with monkeys, they found that neurons fired in their premotor cortex when they did an action, and then later when they heard the sounds of that action separately. They concluded that these ‘audiovisual motor neurons’ fire in the same way when an action is performed, seen or heard.106 This confirms the unification of sound and image in the brain, rather than any retention of them as separate ‘channels’, while also indicating how sounds alone can carry significant value as well as elicit empathetic reactions. While mirror neurons appear to have momentous implications for the understanding of the human mind, and in particular how we deal with other people, understanding and empathizing, there have been some dismissals of conclusions drawn and questioning of mirror neurons’ significance. Gregory Hickok, for instance, posed ‘eight problems’ for mirror neuron theory, and a notable one was the paucity of actual evidence that was being used as a basis for a large amount of speculation. Furthermore, the physiological information was derived wholly from a limited grouping of small monkey species and was being extrapolated directly and unproblematically to human beings.107 Others are equally dismissive of the ‘mirror neuron project’, such as Rizzolatti and Sinigaglia, who state: “to speak of the mirror mechanism as the basis of action understanding is mere nonsense”.108 They point out that these neurons are part of a larger, much more complex system yet are being approached as a simple answer to a complex question. However, they concede that the debate and research it has inspired have given further insights into how we understand and empathize with others and their actions.
Film theorist Malcolm Turvey has been more sceptical, pointing to some logical shortcomings and an inability to explain things like how we are able to understand emotions we have never previously experienced.109 There are distinct question marks over mirror neurons but they appear to have caught researchers’ imagination as they appear to offer some solutions to aspects of understanding audiovisual culture. In many ways, mirror neurons seem very ‘filmic’, and if they hadn’t been discovered, we would have had to invent them.

Conclusion

Arthur Koestler’s book The Ghost in the Machine posits that modern humans are haunted by their primitive past, and inheritances that have received little attention. Following but reorienting Gilbert Ryle, one of Koestler’s central concepts is that as the human brain developed, it was built upon a foundation of earlier, more primitive brain structures. The reptilian primitive brain is the oldest and joins the later emotional mammalian brain, alongside the most recent, the thinking cortex. The older parts constitute the eponymous ‘Ghost in the Machine’.110 Koestler developed Paul MacLean’s triune brain model, which posited that this triple structure came from a succession of evolutionary brain sections developing, with all mammals gaining some new limbic elements and all of the neocortical, which added to the basal ganglia, an ancient and primitive reptilian basis. These structures can have significant agency, overpowering higher functions, and being responsible for destructive impulses. While a diverting idea, this serves to reinforce the sense of ‘machine and me’, meat and human consciousness. And perception is firmly part of the non-transcendent meat. These appear to remain defining categories for understanding the human, and perhaps audiovisual culture plays a role in unifying and mediating, holding together both body and mind. Or on the other hand, perhaps it perpetuates and diversifies between the two. Joseph Anderson, using J.J. Gibson, states that perception is direct, and indeed perception must show the real world precisely as it is. This veridical perception is necessary, and logically it needs to be close to reality for the purposes of survival.111 It is certainly effective but that is not the same thing as saying that it is veridical. It is enhanced through complex perceptual operations, and I would suggest that audiovisual culture is similarly ‘souped up’.
It is enhanced and ‘better than the real thing’, adding up to a ‘modded perception’, if you like. This idea might help account for perception’s effectiveness and engagement, rather than simply understanding it as a relatively banal but easily mentally distributed mode of communication. The McGurk Effect confirms that something more is going on than a straightforward process of recording and adding things together for the process of momentary diversion. The ‘synergy’ taking place in audiovisual culture defies simple description let alone explanation yet we are constantly reminded of its power (and that is despite programmes to banalize it and insist that it is everyday rather than magical). Sound and vision once married cannot
easily be separated without loss of the auratic effect, the synergetic weld. The audiovisual, as in ‘audiovisual culture’, is premised upon the perceptual phenomenon described and illustrated by the McGurk Effect, which marries two of our senses through conjoining the signals of two separate technologies, even with digital recording. I am slightly uneasy about using Evolutionary Psychology and am aware that it has a strong presence across medical and scientific approaches that aim to understand the human through a horizon of physiological traits. It certainly supplies answers, although sometimes I’m not sure that the answers are applied to the appropriate questions. The vast majority of writings that bring together art and neuroscience use a sense of EP to guide them. One instance is Anjan Chatterjee’s The Aesthetic Brain: How We Evolved to Desire Beauty and Enjoy Art.112 This is because EP can supply a reason for art and culture, as well as a reason for brains and bodies working in a certain way. While I am not wholly at ease with it, we must register that evolution has been perhaps the dominant idea of the last couple of centuries,113 to the point where it may sometimes be a metaphor or an ideal rather than necessarily a description or theory about development over time. As I already noted, I don’t like functionalism, yet EP appears to be the epitome of it, where everything is retrofitted to simple survival requirements. Yet, it brings up interesting answers and sets up new questions. I’m not going to couple up with EP as an ‘off the peg’ theory. This methodology is common these days, where a theory is used in an instrumental manner and as something that is a fait accompli: as long as it provides the required answer it is not worth questioning the theory itself. Having said that, I don’t want to get mired in a substantial debate about the merits and problems of EP.
However, it has influenced certain theory that has helped make physiology part of our psychology while simultaneously making psychology part of physiology. It is hardly a revolutionary statement to note that audiovisual culture is geared around perception and human hardware, although this appears little in discussions on the subject. As Malcolm Turvey declares, “If emotional responses had not evolved to be rapid, films would be designed very differently”.114 While it might be easy to become enrapt by the possibilities of ‘full understanding’ of film and other media through scientific endeavours, this seems simplistic and idealistic, and in some cases might be inspired by a desire to wipe away existing film theory and the traditions of understanding audiovisual culture that have grown from an aesthetic and humanities-based concern.115 This would seem an ill-fated undertaking, as
the symbiotic relationship between cultural theory and culture is far from a random exercise and has provided highly significant insights into our relationship with culture. The possibilities for enriching these ideas and approaches are certainly offered by recourse to scientific-based theories. Some of the studies under discussion in this chapter, certainly those under the umbrella term of ‘neurocinematics’, are food for thought even if they do not definitively recast ideas in the study of audiovisual culture. The ‘bio’ turn in recent years has proven that inspiration from the natural sciences is able to fertilize the study of culture and film studies.116 Yet there are also issues. It is difficult to ignore the often-absent interest in aesthetics, and the tendency towards strikingly simplistic ideas about how films and other cultures function. In some cases, the studies adopt an extremely naïve approach to art and culture. Furthermore, in some cases, it is clear that film can simply be used to justify scientific theories. And I should remain vigilant because I am aware that perhaps this book follows a similar strategy of using science to justify or rethink film theory, which is not a necessity and not my aim. I am interested rather in the nature of audiovisual culture as an object and attempting to understand it further. As such, the McGurk Effect has significant implications for understanding audiovisual culture. It forces us to focus on perception, emphasizing the physiological rather than film as a merely ‘cerebral’ experience. It tells us about the unified audiovisual signal, and perhaps something about why audiovisual culture is so effective.

Notes

1. Leda Cosmides, John Tooby and Jerome Barkow, eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture (Oxford University Press, 1992). 2. Harry McGurk and John MacDonald, “Hearing Lips and Seeing Voices” in Nature, no. 264, 1976, pp. 746–748. 3. Ibid. 4. People with hearing difficulties who tend to partially lipread will also confirm that this phenomenon is not limited to audiovisual culture and yet it is particularly clear when rendered as sound and screen. 5. It is worth emphasizing that while the McGurk Effect is based on speech, there is plenty of experimental evidence for audio changing visual perception, and vice versa. Another point worth noting is that while the McGurk Effect appears to be universal, I would be interested to understand more about whether it has a weaker effect in countries where dubbing foreign
language films is common. While it would be reasonable to suggest it might be, the effect is not limited to audiovisual culture, although it is extremely clear here. 6. Allison B. Sekuler, Robert W. Sekuler and Renee Lau, “Sound Alters Visual Motion Perception” in Nature, no. 385, 1997, p. 308; Allison B. Sekuler and Robert W. Sekuler, “Collision Between Moving Visual Targets: What Controls Alternative Ways of Seeing an Ambiguous Display?” in Perception, no. 28, 1999, pp. 415–432. 7. Other notable research into this includes Katsumi Watanabe and Shinsuke Shimojo, “When Sound Affects Vision: Effects of Auditory Grouping on Visual Motion Perception” in Psychological Science, vol. 12, no. 2, March 2001, pp. 109–116; Massimo Grassi and Clara Casco, “Audiovisual Bounce-Inducing Effect: Attention Alone Does Not Explain Why the Discs are Bouncing” in Journal of the Experimental Psychology of Human Perception Performance, vol. 35, no. 1, February 2009, pp. 235–43; Massimo Grassi and Clara Casco, “Audiovisual Bounce-Inducing Effect: When Sound Congruence Affects Grouping in Vision” in Attention Perception Psychophysics, vol. 72, no. 2, February 2010, pp. 378–86; Marcello Maniglia, Massimo Grassi, Clara Casco and Gianluca Campana, “The Origin of the Audiovisual Bounce Inducing Effect: a TMS Study” in Neuropsychologia, vol. 50, no. 7, June 2012, pp. 1478–82. 8. Annabel J. Cohen, “Associationism and Film Soundtrack Phenomena” in Contemporary Music Review, vol. 9, parts 1 and 2, 1993, p. 164. 9. There are a couple of qualifications. One is that the effect is not limited to speech. There is plenty of research (some cited in the introductory chapter) that illustrates how sound changes image and vice versa.
Another point to note is that from what I can gather, and there seems to be no solid research here, countries where dubbing is prevalent still have the McGurk Effect for their own language but compensate with higher brain functions to avoid the potential issues dubbing might cause. More research is needed, though. 10. Wolfgang Köhler, Gestalt Psychology. An Introduction to New Concepts in Modern Psychology (New York: Liveright, 1947), 20. 11. Gerard B. Remijn, Hiroyuki Ito and Yoshitaka Nakajima, “Audiovisual Integration: An Investigation of the ‘Streaming-Bouncing’ Phenomenon” in Journal of Physiological Anthropology and Applied Human Science, vol. 23, no. 6, November 2004, pp. 243–7; David Alais and David Burr, “The Ventriloquist Effect Results from Near-Optimal Bimodal Integration” in Current Biology, vol. 14, no. 3, 2004, pp. 257–262; Daniel Sanabria, Ángel Correa, Juan Lupiáñez and Charles Spence, “Bouncing or Streaming? Exploring the Influence of Auditory Cues on the Interpretation
of Ambiguous Visual Motion” in Experimental Brain Research, vol. 157, no. 4, August 2004, pp. 537–541. 12. A useful approach here is taken in Thomas Elsaesser and Malte Hagener’s Film Theory: An Introduction Through the Senses (New York: Routledge, 2010). 13. The brain merges the signals in a ‘multisensory’ area. J. Driver and Toemme Noesselt, “Multisensory Interplay Reveals Crossmodal Influences on ‘Sensory-Specific’ Brain Regions, Neural Responses, and Judgments” in Neuron, no. 57, 2008, pp. 11–23; Asif A. Ghazanfar and Charles E. Schroeder, “Is Neocortex Essentially Multisensory?” in Trends in Cognitive Sciences, vol. 10, issue 6, 2006, pp. 278–285. 14. Ted Nannicelli and Paul Taberham, “Introduction: Contemporary Cognitive Media Theory” in Ted Nannicelli and Paul Taberham, eds., Cognitive Media Theory (New York: Routledge, 2014), p. 3. 15. Malcolm Turvey, “Evolutionary Film Theory” in Ted Nannicelli and Paul Taberham, eds., Cognitive Media Theory (New York: Routledge, 2014), p. 52. 16. Turvey, Ibid., p. 52; Malcolm Turvey, “Can Scientific Models of Theorizing Help Film Theory” in Thomas E. Wartenberg and Angela Curran, eds., The Philosophy of Film: Introductory Text and Readings (Malden, MA: Blackwell, 2005), pp. 21–32. 17. Edward O. Wilson, Consilience: The Unity of Knowledge (New York: Knopf, 1998). 18. C.P. Snow, The Two Cultures and the Scientific Revolution (Cambridge: Cambridge University Press, 1959). 19. David Buller, Adapting Minds: Evolutionary Psychology and the Persistent Quest for Human Nature (Cambridge, MA: MIT Press, 2005). 20. Torben Grodal, Moving Pictures: A New Theory of Film Genres, Feelings, and Cognition (Oxford: Oxford University Press, 1997); Joseph D. Anderson, The Reality of Illusion: An Ecological Approach to Cognitive Film Theory (Carbondale, IL: Southern Illinois University Press, 1998); Torben Grodal, “Pain, Sadness, Aggression, and Joy: An Evolutionary Approach to Film Emotions” in Projections, vol.
1, issue 1, 2007; Torben Grodal, Embodied Visions: Evolution, Emotion, Culture, and Film (Oxford: Oxford University Press, 2009); and Carol L. Fry, The Primal Roots of Horror Cinema: Evolutionary Psychology and Narratives of Fear (Jefferson, NC: McFarland, 2019). 21. Anderson, op. cit., 1998, p. 50. 22. Turvey, op. cit., 2014, p. 49. 23. Murray Smith, “‘The Pit of Naturalism’: Neuroscience and the Naturalized Aesthetics of Film” in Ted Nannicelli and Paul Taberham, eds., Cognitive Media Theory (New York: Routledge, 2014), p. 42.
24. This has already been suggested for the study of sound in film. “The study of film sound theory, historically marginalized and thus underdeveloped in cinema studies, has only recently started to evolve, and it offers numerous possibilities for advancing, revisiting, and revising current feminist, Marxist, psychoanalytic, queer and apparatus theories. ... we recognize work on sound as a clarion call for a return to theory, one that allows for a number of innovative and original approaches to theoretical perspectives that have otherwise been regarded as defunct”. Jay Beck and Tony Grajeda, “The Future of Film Sound Studies” in Jay Beck and Tony Grajeda, eds., Lowering the Boom: Critical Studies in Film Sound (Chicago: University of Illinois Press, 2008), p. 18. 25. Uri Hasson, Ohad Landesman, Barbara Knappmeyer, Ignacio Vallines, Nava Rubin, and David J. Heeger, “Neurocinematics: The Neuroscience of Film” in Projections, no. 2, 2008, pp. 1–26; John P. Hutson, Tim J. Smith, Joseph P. Magliano and Lester C. Loschky, “What is the Role of the Film Viewer? The Effects of Narrative Comprehension and Viewing Task on Gaze Control in Film” in Cognitive Research: Principles and Implications, vol. 2, no. 46, 2017; Steven J. Hinde, Tim J. Smith, and Iain D. Gilchrist, “In Search of Oculomotor Capture during Film Viewing: Implications for the Balance of Top-Down and Bottom-Up Control in the Saccadic System” in Vision Research, no. 134, 2017, pp. 7–17. 26. Sermin Ildirar, Daniel T. Levin, Stefan Schwan and Tim J. Smith, “Audio Facilitates the Perception of Cinematic Continuity by First-Time Viewers” in Perception, vol. 47, no. 3, 2017, pp. 276–295; Siu-Lan Tan, Annabel Cohen, Scott D. Lipscomb and Roger A. Kendall, eds., The Psychology of Music in Multimedia (Oxford: Oxford University Press, 2013); Miguel Mera and Simone Stumpf, “Eye-Tracking Film Music” in Music and the Moving Image, vol. 7, no. 3, 2014, pp. 3–23; K. J.
Donnelly and Ann-Kristin Wallengren, eds., ‘Laboratory Experiments and Psychology of Film Music’ special issue of Music and the Moving Image, vol. 8, issue 2, Summer 2015; Marianne G. Boltz, “Musical Soundtracks as a Schematic Influence on the Cognitive Processing of Filmed Events” in Music Perception, vol. 18, 2001, pp. 427–454; Claudia Bullerjahn and Markus Güldenring, “An Empirical Investigation of Effects of Film Music using Qualitative Content Analysis” in Psychomusicology, vol. 13, nos. 1–2, 1994, pp. 99–118; and Berthold Hoeckner, Emma W. Wyatt, Jean Decety and Howard Nusbaum, “Film Music Influences How Viewers Relate to Movie Characters” in Psychology of Aesthetics, Creativity, and the Arts, vol. 5, 2011, pp. 146–153. 27. Some relevant articles include Asif A. Ghazanfar and Stephen V. Shepherd, “Monkeys at the Movies: What Evolutionary Cinematics Tells Us about
Film” in Projections, vol. 5, issue 2, 2011, pp. 11–25; Gal Raz and Talma Hendler, “Forking Cinematic Paths to the Self: Neurocinematically Informed Model of Empathy in Motion Pictures” in Projections, vol. 8, issue 2, 2014, pp. 89–114; Jane Stadler, “‘Mind the Gap’: Between Movies and Mind, Affective Neuroscience, and the Philosophy of Film” in Projections, vol. 12, issue 2, 2018, pp. 86–94.
28. Corinna Pehrs, Lorenz Deserno, Jan-Hendrik Bakels, Lorna H. Schlochtermeier, Hermann Kappelhoff, Arthur M. Jacobs, Thomas Hans Fritz, Stefan Koelsch and Lars Kuchinke, “How Music Alters a Kiss: Superior Temporal Gyrus Controls Fusiform–Amygdala Effective Connectivity” in Social Cognitive and Affective Neuroscience, vol. 9, issue 11, November 2014, pp. 1770–1778.
29. Benjamin Kreifelts, Thomas Ethofer, Wolfgang Grodd, Michael Erb and Dirk Wildgruber, “Audiovisual Integration of Emotional Signals in Voice and Face: An Event-Related fMRI Study” in NeuroImage, vol. 37, no. 4, 2007, pp. 1445–1456.
30. Oliver Vitouch, “When Your Ear Sets the Stage: Musical Context Effects in Film Perception” in Psychology of Music, vol. 29, no. 1, 2001, pp. 70–83.
31. Evolutionary psychology is well-established with textbooks for taught courses. Stephen V. Shepherd, The Wiley Handbook of Evolutionary Neuroscience (Hoboken, NJ: Wiley Clinical Psychology Handbooks, 2016); David Buss, Evolutionary Psychology: The New Science of the Mind (Hove: Psychology Press, 2015).
32. Steven Pinker, How the Mind Works (New York: Norton, 1997), p. 21.
33. Indeed, evolutionary psychology offers a metatheory, which can encompass any area of inquiry. Annemie Ploeger, “Evolutionary Psychology as a Metatheory for the Social Sciences” in Integral Review, vol. 6, no. 3, 2010, pp. 164–174.
34. Leda Cosmides, John Tooby and Jerome H. Barkow, “Introduction: Evolutionary Psychology and Conceptual Integration” in Jerome Barkow, Leda Cosmides and John Tooby, eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture (Oxford University Press, 1992), p. 3.
35. Discussed in detail in John Tooby and Leda Cosmides, “The Psychological Foundations of Culture” in Jerome H. Barkow, Leda Cosmides and John Tooby, eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture (Oxford University Press, 1992), pp. 19–136.
36. Anderson, op. cit., 1998, p. 84.
37. For example, see Jean Suplizio, “Evolutionary Psychology: The Academic Debate” in Science in Context, vol. 19, no. 2, 2006, pp. 269–294.
38. Hilary Rose and Steven Rose, eds., Alas Poor Darwin: Arguments Against Evolutionary Psychology (London: Vintage, 2001).
39. Edward O. Wilson, Sociobiology: The New Synthesis (Cambridge, MA: Belknap Press, 1975).
40. There were accusations of Darwinists being reactionaries. However, Tybur, Miller and Gangestad’s empirical study into the political leanings of ‘adaptationists’ found that rather than espousing the approach due to a right-wing agenda, the reality was that they were much less politically conservative than average Americans. Joshua M. Tybur, Geoffrey F. Miller and Steven W. Gangestad, “Testing the Controversy” in Human Nature, vol. 18, no. 4, 2007, pp. 313–328.
41. Sometimes criticized as ‘speculative fictions’. Cf. Stephen Jay Gould, “Evolution: The Pleasures of Pluralism” in New York Review of Books, vol. 44, no. 11, 26 June 1997, pp. 47–52.
42. Confirmed by experiments with two impulses simultaneously. Claude Alain, Karen Reinke, Yu He, Chenghua Wang and Nancy Lobaugh, “Hearing Two Things at Once: Neurophysiological Indices of Speech Segregation and Identification” in Journal of Cognitive Neuroscience, vol. 17, no. 5, 2005, p. 815.
43. Michael Shermer, “The Brain Is Not Modular: What fMRI Really Tells Us: Metaphors, Modules and Brain-Scan Pseudoscience” in Scientific American, 1 May 2008. https://www.scientificamerican.com/article/a-new-phrenology/ [accessed 1/3/2022].
44. Kenji Doya, Shin Ishii, Alexandre Pouget and Rajesh P. N. Rao, eds., Bayesian Brain: Probabilistic Approaches to Neural Coding (Cambridge, MA: MIT Press, 2006).
45. Leda Cosmides and John Tooby, “Origins of Domain Specificity: The Evolution of Functional Organization” in Lawrence A. Hirschfeld and Susan A. Gelman, eds., Mapping the Mind: Domain Specificity in Cognition and Culture (Cambridge: Cambridge University Press, 1994), pp. 89–91, 107–109.
46. Ibid., pp. 107–109.
47. Jaime C. Confer, Judith A. Easton, Diana S. Fleischman, Cari D. Goetz, David M. G. Lewis, Carin Perilloux and David M. Buss, “Evolutionary Psychology: Controversies, Questions, Prospects, and Limitations” in American Psychologist, vol. 65, no. 2, 2010, pp. 110–126.
48. Lawrence A. Hirschfeld and Susan A. Gelman, “Toward a Topography of Mind: An Introduction to Domain Specificity” in Lawrence A. Hirschfeld and Susan A. Gelman, eds., Mapping the Mind: Domain Specificity in Cognition and Culture (Cambridge: Cambridge University Press, 1994), p. 3.
49. Drew H. Bailey and David C. Geary, “Hominid Brain Evolution” in Human Nature, vol. 20, no. 1, 2009, pp. 67–79.
50. Francesco d’Errico, “The Origin of Humanity and Modern Cultures: Archaeology’s View” in Diogenes, vol. 54, no. 2, 2007, pp. 122–133; Kate Wong, “The Morning of the Modern Mind” in Scientific American, vol. 292, no. 6, June 2005, pp. 74–83; Vilayanur S. Ramachandran, “Mirror Neurons and Imitation Learning as the Driving Force Behind ‘The Great Leap Forward’ in Human Evolution” in Edge, no. 69, 2000. www.edge.org/3rd_culture/ramachandran/ramachandran_p1.html [accessed 20/2/2020].
51. Steven Mithen, The Prehistory of the Mind: The Cognitive Origins of Art, Religion, and Science (London: Thames and Hudson, 1996), p. 194.
52. Brian Handwerk, “An Evolutionary Timeline of Homo Sapiens” in ‘Science’ in Smithsonian Magazine, 2 February 2021. www.smithsonianmag.com/science-nature/essential-timeline-understanding-evolution-homo-sapiens-180976807/ [accessed 17 March 2021].
53. John Cage, Silence: Lectures and Writings (Middletown, CT: Wesleyan University Press, 1973), p. 12.
54. Mithen, op. cit., 1996, p. 194.
55. Pinker, op. cit., 1997, p. 528.
56. Joseph Carroll, “The Deep Structure of Literary Representations” in Evolution and Human Behavior, no. 20, 1999, p. 159.
57. Leda Cosmides, John Tooby and Jerome Barkow, “Introduction: Evolutionary Psychology and Conceptual Integration” in Jerome Barkow, Leda Cosmides and John Tooby, eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture (Oxford University Press, 1992), p. 3.
58. Pascal Boyer, “Evolutionary Psychology and Cultural Transmission” in American Behavioral Scientist, vol. 43, no. 6, 2000, pp. 987–1000.
59. According to Joseph D. Anderson, the human ability to imagine but separate fiction from reality was developed through culture. Op. cit., 1998, p. 114.
60. Bruno Bettelheim criticized the removal of some of the darker features of fairy stories in children’s versions, which had allowed children to develop emotionally through engaging their fears in remote, symbolic terms. The Uses of Enchantment: The Meaning and Importance of Fairy Tales (London: Thames and Hudson, 1976).
61. Ellen Dissanayake, Homo Aestheticus: Where Art Comes From and Why (Seattle, WA: University of Washington Press, 1992).
62. Ellen Dissanayake, “If Music is the Food of Love, What About Survival and Reproductive Success?” in Musicae Scientiae, vol. 12, no. 1 supplement, March 2008, pp. 169–195; Ellen Dissanayake, “Art in Global Context: An Evolutionary/Functionalist Perspective for the Twenty-First
Century” in International Journal of Anthropology, vol. 18, no. 4, 2003, pp. 245–258.
63. Ellen Dissanayake, “The Artification Hypothesis and Its Relevance to Cognitive Science, Evolutionary Aesthetics, and Neuroaesthetics” in Cognitive Semiotics, issue 5, Fall 2009, pp. 148–173; pp. 149–150.
64. Colwyn Trevarthen, “Musicality and The Intrinsic Motive Pulse: Evidence from Human Psychobiology and Infant Communication” in Musicae Scientiae, vol. 3, no. 1, supplement, September 1999, pp. 155–215.
65. Richard Dawkins, The Extended Phenotype (Oxford: Oxford University Press, 1982).
66. Richard Dawkins, The Selfish Gene (Oxford: Oxford University Press, 1989), p. 192; Raghavendra Gadagkar, “The Evolution of Culture (or the Lack Thereof): Mapping the Conceptual Space” in Journal of Genetics, vol. 96, no. 3, July 2017, p. 513.
67. Andy Clark and David J. Chalmers, “The Extended Mind” in Richard Menary, ed., The Extended Mind (Cambridge, MA: MIT Press, 2010), pp. 27–42; Andy Clark, Supersizing the Mind: Embodiment, Action, and Cognitive Extension (Oxford University Press, 2008).
68. Pia Tikka, “Cinema as Externalization of Consciousness” in Robert Pepperell and Michael Punt, eds., Screen Consciousness: Mind, Cinema and World (Amsterdam: Rodopi Press, 2006), pp. 139–162.
69. Collins English Language Dictionary. https://www.collinsdictionary.com/dictionary/english/perception [accessed 7/7/2021].
70. Willis D. Ellis, ed., A Source Book of Gestalt Psychology (London: Routledge and Kegan Paul, 1950); Mitchell G. Ash, Gestalt Psychology in German Culture, 1890–1967: Holism and the Quest for Objectivity (Cambridge: Cambridge University Press, 1995).
71. Charles E. Schroeder and John Foxe, “Multisensory Contributions to Low-Level, ‘Unisensory’ Processing” in Current Opinion in Neurobiology, vol. 15, issue 4, August 2005, p. 454.
72. Michael I. Posner, “Orienting of Attention” in Quarterly Journal of Experimental Psychology, vol. 32, no. 1, 1980, pp. 3–25.
73. Steven Yantis and John Jonides, “Attentional Capture by Abrupt Onsets: New Perceptual Objects or Visual Masking?” in Journal of Experimental Psychology: Human Perception and Performance, vol. 22, no. 6, 1996, pp. 1505–1513.
74. Laurent Itti, Christof Koch and Ernst Niebur, “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis” in Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, December 1998, pp. 1254–1259; Jan Theeuwes, “Top-Down Search Strategies Cannot Override Attentional Capture” in Psychonomic Bulletin and Review, vol. 11, no. 1, 2004, pp. 65–70.
75. Chaz Firestone and Brian J. Scholl, “Cognition Does Not Affect Perception: Evaluating the Evidence for ‘Top-Down’ Effects” in Behavioral and Brain Sciences, vol. 39, e229, 2016, pp. 1–77; Chaz Firestone and Brian J. Scholl, “Can You Experience ‘Top-Down’ Effects on Perception?: The Case of Race Categories and Perceived Lightness” in Psychonomic Bulletin and Review, vol. 22, no. 3, 2015, pp. 694–700.
76. Arguably, a Cognitive Psychology approach to emotion also subsumes it under ‘Cognitive’ processing.
77. Gilbert Ryle, The Concept of Mind (London: Hutchinson, 1949), p. 17.
78. George Lakoff and Mark Johnson, Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought (New York: Basic Books, 1999).
79. Noël Burch, Life to Those Shadows, translated by Ben Brewster (Princeton, NJ: Princeton University Press, 1969), p. 180.
80. Laura Marks, The Skin of the Film: Intercultural Cinema, Embodiment, and the Senses (Durham, NC: Duke University Press, 2000) and Touch: Sensuous Theory and Multisensory Media (Minneapolis, MN: University of Minnesota Press, 2002); Vivian Sobchack, The Address of the Eye: A Phenomenology of Film Experience (Princeton, NJ: Princeton University Press, 1992) and Carnal Thoughts: Embodiment and Moving Image Culture (Los Angeles, CA: University of California Press, 2004); Steven Shaviro, The Cinematic Body (Minneapolis, MN: University of Minnesota Press, 1993) and Jennifer M. Barker, The Tactile Eye: Touch and the Cinematic Experience (Berkeley, CA: University of California Press, 2009).
81. Marks, op. cit., 2000, p. xi.
82. Ibid., p. 162.
83. Sobchack, op. cit., 1992, p. 206.
84. Marina Grishakova and Maria Poulaki, eds., Narrative Complexity: Cognition, Embodiment, Evolution (Lincoln: University of Nebraska Press, 2019); Maarten Coëgnarts, Film as Embodied Art: Bodily Meaning in the Cinema of Stanley Kubrick (Brookline: Academic Studies Press, 2019); Maarten Coëgnarts and Peter Kravanja, eds., Embodied Cognition and Cinema (University of Leuven Press, 2015).
85. Edmund Husserl, Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy (The Hague: Martinus Nijhoff, 1980 [f.p. 1912]); Maurice Merleau-Ponty, The Phenomenology of Perception (London: Routledge, 1966 [f.p. 1945]).
86. Henri Bergson, Matter and Memory (London: Zone, 1990 [f.p. 1896]), p. 60.
87. Maurice Merleau-Ponty, The Phenomenology of Perception (London: Routledge, 1966 [f.p. 1945]).
88. Hannah Wojciehowski and Vittorio Gallese, “How Stories Make Us Feel: Toward an Embodied Narratology” in California Italian Studies, vol. 2, issue 1, 2011. https://escholarship.org/uc/item/3jg726c2 [accessed 12/12/2021].
89. Common criticisms are of the widespread assumption of naïve realism in terms of perception and the identity and causal theories of mental and neural states. See further discussion in John Smythies, “The Metaphysical Foundations of Contemporary Neuroscience: A House Built on Sand” in John Smythies and Robert French, eds., Direct versus Indirect Realism: A Neurophilosophical Debate on Consciousness (Cambridge, MA: Elsevier Academic Press, 2018), pp. 5–15; Jan De Vos and Ed Pluth, eds., Neuroscience and Critique: Exploring the Limits of the Neurological Turn (New York: Taylor & Francis, 2016).
90. Maxwell R. Bennett and Peter M. S. Hacker, Philosophical Foundations of Neuroscience (Oxford: Blackwell, 2003).
91. Semir Zeki, Inner Visions: An Exploration of Art and the Brain (Oxford: Oxford University Press, 1999).
92. An isolated instance is Torben Grodal and Mette Kramer, “Film, Neuroaesthetics and Empathy” in Recherches sémiotiques/Semiotic Inquiry, vol. 30, nos. 1, 2 and 3, 2010, pp. 19–35.
93. Zeki, op. cit., 1999.
94. Semir Zeki, “Statement on Neuroesthetics” in Neuroesthetics, 24 November 2009. http://www.neuroesthetics.org/statement-on-neuroesthetics.php [accessed 5/12/2009].
95. Dahlia W. Zaidel and Marcos Nadal, “An Evolutionary Approach to Art and Aesthetic Experience” in Psychology of Aesthetics, Creativity, and the Arts, vol. 7, no. 1, 2013, p. 100.
96. Zeki, op. cit., 1999, pp. 2, 22.
97. Ibid., p. 94.
98. John Hyman, “Art and Neuroscience” in Roman Frigg and Matthew Hunter, eds., Beyond Mimesis and Convention: Representation in Art and Science (London: Springer, 2008), pp. 254, 260.
99. Ellen Dissanayake, “The Artification Hypothesis and Its Relevance to Cognitive Science, Evolutionary Aesthetics, and Neuroaesthetics” in Cognitive Semiotics, issue 5, Fall 2009, p. 159.
100. Cinzia Di Dio and Vittorio Gallese, “Neuroaesthetics: A Review” in Current Opinion in Neurobiology, vol. 19, no. 6, 2009, pp. 682–687.
101. Carl R. Plantinga, “The Affective Power of Movies” in Arthur Shimamura, ed., Psychocinematics: Exploring Cognition at the Movies (Oxford: Oxford University Press, 2013), pp. 94–111; p. 101.
102. Vittorio Gallese, Luciano Fadiga, Leonardo Fogassi and Giacomo Rizzolatti, “Action Recognition in the Premotor Cortex” in Brain, vol. 119, issue 2, April 1996, pp. 593–609; Vittorio Gallese and Alvin I. Goldman, “Mirror Neurons and the Simulation Theory of Mind-Reading” in Trends in Cognitive Sciences, vol. 2, issue 12, 1 December
1998, pp. 493–501; Vittorio Gallese, “The ‘Shared Manifold’ Hypothesis: From Mirror Neurons to Empathy” in Journal of Consciousness Studies, vol. 8, 2001, pp. 33–50.
103. Giacomo Rizzolatti and Laila Craighero, “The Mirror-Neuron System” in Annual Review of Neuroscience, vol. 27, no. 1, 2004, p. 172.
104. Vilayanur S. Ramachandran, “Mirror Neurons and Imitation Learning as the Driving Force Behind ‘The Great Leap Forward’ in Human Evolution” in Edge, no. 69, 2000. http://www.edge.org/3rd_culture/ramachandran/ramachandran_p1.html [accessed 20/1/2021].
105. Jonas T. Kaplan and Marco Iacoboni, “Getting a Grip on Other Minds: Mirror Neurons, Intention Understanding, and Cognitive Empathy” in Social Neuroscience, vol. 1, no. 3, 2006, pp. 175–183.
106. Evelyne Kohler, Christian Keysers, Maria Alessandra Umiltà, Leonardo Fogassi, Vittorio Gallese and Giacomo Rizzolatti, “Hearing Sounds, Understanding Actions: Action Representation in Mirror Neurons” in Science, no. 297, 2002, pp. 846–848.
107. Gregory Hickok, “Eight Problems for the Mirror Neuron Theory of Action Understanding in Monkeys and Humans” in Journal of Cognitive Neuroscience, vol. 21, no. 7, July 2009, pp. 1229–1243. See further discussion in Gregory Hickok, The Myth of Mirror Neurons: The Real Neuroscience of Communication and Cognition (New York: W.W. Norton, 2014).
108. Giacomo Rizzolatti and Corrado Sinigaglia, “The Space of Mirrors” in Pier Francesco Ferrari and Giacomo Rizzolatti, eds., New Frontiers in Mirror Neurons Research (Oxford: Oxford University Press, 2015), p. 529.
109. Malcolm Turvey, “Mirror Neurons and Film Studies: A Cautionary Tale from a Serious Pessimist” in Projections, vol. 14, issue 3, 2020, pp. 21–46.
110. Arthur Koestler, The Ghost in the Machine (London: Hutchinson, 1967).
111. Anderson, op. cit., 1998, p. 14.
112. Anjan Chatterjee, The Aesthetic Brain: How We Evolved to Desire Beauty and Enjoy Art (New York: Oxford University Press, 2013).
113. See, for example, Daniel Dennett, Darwin’s Dangerous Idea (New York: Simon and Schuster, 1995).
114. Turvey, op. cit., 2014, p. 50.
115. As David Rodowick notes. “An Elegy for Theory” in October, vol. 122, Fall 2007, p. 95.
116. Torben Grodal, “Film Aesthetics and the Embodied Brain” in Martin Skov and Oshin Vartanian, eds., Neuroaesthetics (Amityville, NY: Baywood Publishing, 2009), p. 271.

CHAPTER 3

Perpetual Realism: Mediating Fantasy and Reality

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_3

The Lumière brothers’ L’Arrivée d’un train en gare de La Ciotat (Arrival of a Train at La Ciotat, 1896) depicts a train pulling into a station in the French coastal town of La Ciotat. It was 50 seconds long and shot from a single camera setup without editing. The camera was placed on the station platform to capture the train’s arrival diagonally from right to left, where it comes close to the camera as it passes. Films such as these were often known as ‘actualités’ or actuality films and this was one of the first. So the story goes, when the Lumière brothers screened their films, audiences were enthralled by the seeming effect of capturing reality, to the point where they reacted, allegedly, to the train in L’Arrivée d’un train en gare de La Ciotat. This is one of the founding myths of film at the birth of the medium, although it is extremely likely to be an apocryphal story. In “An Aesthetic of Astonishment: Early Film and the (In)credulous Spectator”, Tom Gunning notes that there is no solid evidence for the train response and that this is something of an imaginary ‘primal scene’: “The terrorised spectator of the Grand Café still stalks the imagination of film theorists who envision audiences submitting passively to an all-dominating apparatus, hypnotised and transfixed by its illusionist power”.1 It is more of a symbolic idea perhaps than an actual instance of audiences flinching.2 Yet, like many apocryphal stories, it captures something of a truth: in this case, the amazement of audiences and the magic of new technology. Gunning states that “The astonishment derives from a magical metamorphosis rather than a seamless reproduction of reality”.3 So, it is not simply about being fooled into thinking the train is real. He goes on to argue that this marks a disowning of our own belief in audiovisual culture generally through projecting this idea onto the ‘naïve’ early spectators, with the “fetishistic viewer, wavering between the credulous position of believing the image and the repressed, anxiety-causing knowledge of its illusion”.4 This dichotomy potentially describes ‘top-down’ and ‘bottom-up’ processes, which I discuss later, where belief is split (‘This appears to be true but I know it is not true’) between the perception of audiovisual culture, which processes it as if it were real (on the most basic levels), and the higher-order acceptance and reasoning that it is not real, making for a mental contradiction or paradox. A seeming contradiction forces us to expend more energy in continually disagreeing with our perception and ultimately it is easier to go along with it, particularly if there are other benefits on offer. Audiovisual culture has always seen to it that there are (Fig. 3.1).

Fig. 3.1  The Lumière brothers’ L’Arrivée d’un train en gare de La Ciotat

Yet there remains an important truth: that film can appear very real to us, to the point where we are enrapt by its illusion. Indeed, film has a remarkable relationship with reality, or rather audiovisual culture has a remarkable relationship with what our perception registers as reality. Film certainly inaugurated revolutionary ‘new ways of seeing’ (and hearing) but these were built upon the (perhaps previously underutilized) affordances of age-old human physiological hardware. The phenomenon of film having an illusionary effect was a significant foundation for the medium, as it had been for the sound recording of the wax cylinder and phonograph in earlier decades. Audiovisual culture has a strong relationship with a sense of realism. In phenomenological terms there is an involuntary effect, but it may be based on the desire to believe, the strength of conventional repetition and the human desire for order and recognition. On a basic level, we take sounds and images to be real. The audience’s processing of the projected images and sounds of a film is precisely the same as if they were responding to visual and aural stimuli in the world outside the cinema. This should not be surprising. We can only use the faculties we possess and audiovisual culture is geared towards these. Although possible, it is unlikely that we would use radically different aspects of our bodies and brains to process something that looks and sounds like the world. Torben Grodal notes, “what enters the eyes and the visual cortex are not representations, but light waves that cause neural activation. The humble neurons in the rear of the brain cannot distinguish representations from the real thing; the emotion-inducing limbic system will be activated whether we are confronted by a real wolf or by an audiovisual simulation of the wolf”.5 This emphasizes the centrality of perception to audiovisual culture and also to being human more generally.
I don’t want to get into a debate about the nature of reality but need to register that there is a strong sense of the ‘real world’ outside electronic culture, and a pervasive sense that culture is representing or perhaps replacing it.6 It is crucial that much culture sets up a state of ambiguity and some confusion between ‘real’ and interior states of mind, and indeed arguably relies upon this. We approach diegetic worlds as on some level being ‘real’, through the so-called suspension of disbelief. Indeed, much culture, such as literature and film, is premised upon making fantasy seem real. On the reverse side of this, we can become confused by manifestations of psychological ‘inner states’ in a seemingly ‘realistic’ format. In some diegeses, two different worlds can be present and set apart, which illustrates precisely the process of formulating constructions of
‘fantasy’ and ‘reality’ within the same space. These can play to or across the traditional tendency for audiovisual culture to use sound to indicate interior states of mind and image for more objective situations.

The Reality Effect

Probably the earliest theory of film to address the compulsive psychology of the medium was the book The Photoplay: A Psychological Study by psychiatrist Hugo Münsterberg, first published in 1915.7 This focused on the visual language of film and its direct relation to human perception and understanding, seeing the close-up as an equivalent of the eye making closer scrutiny of something, and declaring that at heart film reconfigured the world using the format of human consciousness. Perhaps surprisingly, this line of inquiry did not lead to significant later theoretical developments. As Münsterberg noted, one of the most startling and immediate aspects of film is what has been termed its ‘reality effect’: on some level, when confronted by it, we take it as being ‘real’. In ‘Film, Reality, and Illusion’, Gregory Currie differentiates between the notions of realism that appear in film theory discussions. There are three: ‘transparency’, ‘illusionism’ and ‘perceptual realism’.8 Realism as transparency is what André Bazin points to, where the photographic basis of film reproduces the reality of the world mechanically. Illusionism is where the audience believes (on some level) what film presents. The third, perceptual realism, is where the audience uses the same abilities and processes for recognizing elements in a film as they would in the real world. These three notions clearly are connected, and Currie sees them as flawed, yet each facet of the idea of realism is deeply entrenched and has origins that appeared with the medium of film itself. Realism for Bazin was both film’s ontology (its essence) and a language, based on the principles of simplicity, purity and transparency.9 Bazin’s sense of realism in film can be tied directly to certain film techniques. These include most clearly the use of deep focus and the long take, as well as other aspects including colour film, location shooting and location-recorded sound.
The use of these techniques, particularly in combination, enables a certain way of showing the world on film. Bazin discussed the alliance of deep space within the frame and the long take as a distinctive mode of discourse in films, one that allows the audience to see for itself, to look around within the frame rather than be directed by a clear sense in each successive shot of what should be looked at.10 This
visual strategy is premised upon a unity of time and space, with integrity for each, as well as a reality of recorded performance in front of the camera. This makes plain that there is a world beyond the screen as the limits of the frame are evident and not hidden by tight focus on objects and editing. Furthermore, the lack of editing and sense of moving frame more closely resemble the spectator’s perception of reality, and the nature of the film space makes for more active watching as the spectator’s wandering eye can decide what might be important within the frame. So, rather than manipulating the ‘reality effect’,11 the cinema feted by Bazin allows latitude to the spectator, in terms of finding their own focus and interpretation. He stated:

I will distinguish, in the cinema between 1920 and 1940, between two broad and opposing trends: those directors who put their faith in the image and those who put their faith in reality … [and that] … depth of focus brings the spectator into a relation with the image closer to that which he enjoys to reality.12

Bazin declares cinema to be ‘objectivity in time’ and identifies it as the potential fulfilment of the human aspiration for realistic representation.13 This seems a grand statement yet it is easy to understand how people would want stories to appear more real to them. What Bazin does not suggest is that this desire or craving derives from our senses reacting strongly to reality and to a version of it that in many ways was better (larger, time compressed, more emotional, more sustained excitement). Siegfried Kracauer’s theories about film are often understood as continuous with Bazin’s. He also saw film as a unique medium in its ability to record physical reality. He saw this ‘realist tendency’ as something innate to the medium, inheriting an interest in capturing reality from sculpture, painting and literature. Kracauer’s book Theory of Film has the subtitle ‘The Redemption of Physical Reality’, which points to his abiding interest in the relationship of film to the real world. Indeed, he saw its ability to engage with and transmit the real world as the most important aspect of film. However, his theory is not a simple ‘naïve realist’ concept of film as being ‘like reality’ but rather posits that both the form of film and the conditions of its consumption in cinemas correspond with the experience of the fragmented and ‘damaged’ reality of modernity.14

Kracauer acknowledged that film was about mixing formalist and realist tendencies, but he concluded that the films that allow the audience to “experience aspects of physical reality are the most valid aesthetically”.15 This could often be fortuitous and a capturing of undetermined aspects of profilmic reality, sometimes as an incidental part of the film. Kracauer cites Laurence Olivier’s Hamlet (1948), when the camera shows the actual sea through the window of Elsinore castle, which interjects a remarkable moment of realism into the Shakespearean production.16 Another good example of this is Tintin et le mystère de la toison d’or (Tintin and the Golden Fleece, 1961), when Snowy the dog excitedly, aggressively and unselfconsciously chases a rolling barrel in what was no doubt far in excess of any script directions. The camera simply catches the action through not cutting and giving space for the activity. For Bazin, staying true to the ontological realism of film was achieved through a limited stylistic palette, which included the long take and the use of deep focus, and which meant a limited amount of editing and few sequences of montage. Bazin noted that deep space can make for an ambiguity of narrative information, even though its sense of transmitting the real is stronger than the classical breakdown of space (so-called analytical editing or continuity editing).17 The aim is to secure a realistic sense of space and time, through establishing its unified integrity.
This cohered around the potential of deep space for dramatizing distance and proximity in a shot, the use of the hand-held or agile camera yielding a mobile frame and the long take’s ‘capture’ of real time.18 A number of films have espoused this long take strategy, including Hitchcock’s Rope (1948), with edits only taking place once the reel of celluloid film loaded into the camera ran out, and Alexander Sokurov’s Russian Ark (2002), which was shot as a meticulously choreographed single take with a digital camera. Other films such as The Blair Witch Project (1999) and Cloverfield (2008) were premised upon the notion of the film footage emanating from a hand-held camera and ‘catching’ the events that unfold. These two films follow the logic that hand-held cameras on someone’s shoulder will run continuously, and thus retain time and space integrity. Both films also play heavily on off-screen space, and the integrity of time and space emphasizes what can’t be seen. Similarly, sound emphasizes what can’t be seen and triggers the audience’s imagination. Bazin particularly feted certain filmmakers who ‘put their faith in reality’, and one was Orson Welles, who famously used deep space shots in
Citizen Kane (1941). For instance, when Thatcher gets Kane’s mother to sign a document to give him guardianship of the child, the action is divided into three planes. In the foreground, the stoic mother and Thatcher; in the middle ground, the complaining but ineffectual father; and in the background, the oblivious child Kane playing in the snow. While this shot has been celebrated and remains widely used in film analysis, the focus on the visual setup has often overshadowed the fact that sound is also in three planes and has a crucial role in directing attention within the visual frame. Furthermore, the three planes are given higher definition by sound, into units of agreement, disagreement and oblivion. In that sequence from Citizen Kane, while our eyes may be drawn to movements or the perceived sources of sounds on screen, they are still free to roam between the planes at will. This is clearer with the image track than the soundtrack. However, other instances of deep space and real, chronological time in Welles’ films are not so ‘open’ and can be far more focused. Perhaps the most famous instance in a Welles film is the celebrated opening of Touch of Evil (1958). The film starts with a six-and-a-half-minute shot without cutting. The shot starts with a close-up framing of a bomb being put into the boot of a car and the car being driven to the border post, where it explodes, along the way engaging with the walking Mr. and Mrs. Vargas (Charlton Heston and Janet Leigh). Using a crane, the camera can remain in almost constant motion following the carefully choreographed action not only through travelling but also through raising and lowering. It is less well known that a longer take appears later in the film for activity in an apartment. Touch of Evil’s opening appears on the television in Martin McDonagh’s In Bruges (2008).
Its appearance inaugurates the film’s own single-shot long take sequence where we see and hear Ken (Brendan Gleeson) undertake a real-time telephone conversation with his off-screen boss Harry (Ralph Fiennes) (Fig. 3.2). Although the Bruges hotel room is cramped, there is still scope for the audience to look around within the frame—although this perhaps remains a limited scope. This is due to the sequence being a long take but not using deep space, and so not having the option for the eye to ‘change focus’ between background and foreground. Brian De Palma opens his The Bonfire of the Vanities (1990) with a five and a half minute travelling shot long take as an homage to Welles. Replacing Welles’ crane with a Steadicam, De Palma’s long take shot is startling. A high degree of artifice is required to shoot it. In this case, it was meticulously choreographed to the point where director De Palma had no choice but to appear in the shot


Fig. 3.2  In Bruges

as a uniformed guard. Yet the phenomenological effect is to capture something of reality through its time and space integrity.19 It captures the reality of performance, and not only of actors but also of camera and its interaction with space—although all these discussions focus on visuals and neglect sounds. The reality of performance relies on our understanding of an event taking place in the profilmic space in front of the camera, as is the case with sporting events on television. If they are not based on filming an actual event, they are premised upon the reality of the staging. Dancing sequences in musicals regularly rely on this, as we marvel at the skills of a dancer such as Gene Kelly. In Singin’ in the Rain (1952), he is upstaged by Donald O’Connor in Make ‘Em Laugh, where the athletic spectacle of his performance in front of the camera remains in long shot to catch the action, helped only by a couple of edits. Similarly, in Evil Dead II (1987), Bruce Campbell’s energetic performance as he battles against his possessed hand keeps the camera running to catch his play of comic violence rather than rushing to cut. Lea Jacobs and Richard de Cordova discussed spectacle in film, pointing to crucial stylistic aspects of an audiovisual regime they call ‘performance’.20 This includes the deactivation of off-screen space, the standing down of point of view structures and the drawing of the audience outside the moral and narrative position established by the film in order to directly appreciate the spectacle. They state, “Performance is not constituted through an entity but rather through an activity. It is the point at which the production of the discourse becomes the main event of the fiction.


The enunciation is itself dramatized and theatricalized as a part of the meaning enounced”. So, spectacle and ‘performance’ are constituted directly by film style and mark a different regime from the dominant dramatic mode. An example of this is in A Clockwork Orange (1971), where protagonist Alex (Malcolm McDowell) has been released from prison and has met his old droogs who are now policemen. They take him to a field containing a cattle trough full of water and then dunk his head in the water while hitting him with a stick. The startling aspect of this sequence is that apart from one cut the camera holds without cutting. This serves directly to show the violence of holding his head under the water in real time. This is not the reality of performance as in an actor or actress excelling at their trade, or a film musical where a talented dancer exhibits their abilities. Instead, this is the reality of violence as performance. This appears to be director Stanley Kubrick ‘stepping outside’ the illusion of the film to make a statement directly to the audience about film style and the transmission of the actual profilmic event. This sequence is all about film style and its transmission of real time and capturing the reality of what unfolded in front of the eye of the camera. However, Kubrick does not keep things that simple. The horror of the reality of violence is counterpointed and undermined by the soundtrack, which consists of strange highly echoed synthesizer sounds by Wendy Carlos. These are a replacement for the diegetic soundtrack, and the impacts of the stick upon Alex’s body are matched by these metallic, slap back echo sounds, which appear to parody the sense of reality in the sequence (Fig. 3.3). Sound is clearly crucial in framing this sequence from A Clockwork Orange, and as I noted, many discussions fail to address sound, or to understand how far sequences such as this work audiovisually. 
Indeed, Bazin’s theory and many discussions in its wake have failed to address sound as an integral part of this ‘regime of realism’. Perhaps it is worth noting that in the same way as with the image, technology has set limits on sound but it has also been dominated by production conventions and aesthetic fashions. According to Theo van Leeuwen, in Speech, Music, Sound, film (and radio) sound dubbing technicians conventionally have divided the soundtrack into three zones of proximity:
1. Close; ‘immediate’
2. Middle; ‘support’
3. Far distance; ‘background’


Fig. 3.3  A Clockwork Orange

The added-up whole of the soundtrack is known as the ‘scenic’, and traditionally, the immediate (1) is to be listened to while the other two are simply to be heard.21 This is certainly the conventional structure of the film soundtrack yet is not always the case. It homologizes the structure of the visual frame into focused-on foreground and background (see later discussion of ‘figure-ground’ structure). Indeed, sound regularly can be structured around a sense of sound planes, which functions not only to give a spatial stability to the image track but also to furnish a hierarchy of sound importance. This sense of proximity is there in the image, too. Spatialized sound of the sort we are used to, with stereo sound and cinema surround sound, always allows the listener some latitude with ‘wandering around’ the sonic space in mental terms, which I will come on to shortly. While this is remarkable, it is perhaps even more remarkable that sound has so rarely been discussed when dealing with film and realism. It is astonishing how film theorists have failed to understand sound as sonic representation and approached it more as reality itself. For instance, Gerald Mast noted, “There is clearly a difference between a filmed object or action (it is a photograph of the thing or act) and a recorded musical sound. For (the latter) is the sound itself. There is no ontological difference between hearing a violin in a concert hall and hearing it on a sound


track in the movie theatre”.22 Christian Metz also similarly claimed that there was no difference between a recorded sound in a film and the way it was in the real world, providing it was ‘well-recorded’.23 This seems a remarkably naïve approach to recorded sound. No recorded sound will sound the same as its original in the real world, although sometimes we can be fooled into thinking loud recorded sounds are not recordings. The sonic profile of particular sounds might be very similar but spatial aspects will be different, as they come from speakers in a formal space that aims not to be a significant part of the sound’s character. While many might assume that sound in film and television is ‘natural’ to the images we are seeing, that is often far from the case. Much sound is added in post-production, and in some cases, whole soundtracks are constructed in post-production, particularly where traditions of shooting without location sound are habitual (such as in Italian popular cinema). In many cases, for audiovisual drama, the function of the soundtrack is to bolster a sense of reality and foster a sense of predictability and expectation. Dialogue should be audible and diegetic sounds should remain in the background and not impinge on voices or other elements in the foreground. While some films stylistically make something of soundtrack conventions, such as Robert Altman’s Nashville (1975) with its simultaneous dialogue, the vast majority retain a conventional sense of structured soundtrack with elements in hierarchical order of volume and audibility, and a lack of simultaneous sound ‘clashing’. Indeed, a blurred wall of sound obscuring individual sounds can be difficult to concentrate on and comprehend. 
As van Leeuwen notes, on many occasions sound design will also include some ‘masking noise’, as audiences may find very discrete noises disturbing.24 Indeed, psychoacoustics is a pervasive influence on audiovisual sound in recent years, even if it was only by implication earlier in the twentieth century. Indeed, it is tempting to map the development of modern film theatres by the ability and desire to generate low-frequency sound. This is crucial for the audience’s psychological immersion and establishing a sense of ‘wrap-around sound’, as low frequencies fill spaces more and are less easily assigned to a specific spatial origin. Since late in the twentieth century, audiences have become used to spatialized sound on multiple speakers or in headphones. This is based upon stereo, the procedure of having more than one sound channel, each with different signals, which merge into a single perceived soundscape for the audience. A sense of directionality has been available since two-source stereo but became more precise and elaborate with more speakers and


more channels. Indeed, surround sound can be understood as something close to an actual experience of 360-degree sounds wrapping around the listener, yet, of course, even 20 or 30 speakers will not equate with real life listening to multiple sources from different directions. However, it is a remarkable experience in comparison with the single-speaker setups that arrived in cinemas in the first half of the twentieth century. Many discussions of sound, particularly for films, tend to approach it as if it were an absolute rather than a technological effect. It is all too easy to ignore the effect of different speakers and systems, and to fail to engage with directionality and the impact of the space on the sound.25 Stereophonic sound is based on having more than one speaker playing different sounds, often closely related and with different volumes of the same objects. This furnishes an effect of spatiality. Experiments took place in the late nineteenth century, but modern developments were pioneered by Alan Blumlein at EMI in Britain in the 1930s.26 By 1940, Disney had released the animated film Fantasia with a stereo soundtrack (three-channel ‘Fantasound’) of specially recorded classical music. At the time of release, the film had to be roadshown as the specialist sound equipment was far from standard. In the USA, two-channel stereo sound began being adopted by film theatres in the early 1950s, while the music industry was slower due to the dominance of single-speaker disc players, although stereo sound in music recordings expanded exponentially in the 1960s.27 Two speakers and sound sources (such as headphones) have remained dominant in home consumption, despite quadraphonic and larger speaker systems being used in cinemas and then for home entertainment systems. Expensive headphones can give a remarkable impression of the reality of sound, with solid spatialization on a clear sound stage and strong separation.
One form of stereo sound that aims to be realistic through basing itself on the mechanics of human hearing is binaural recording.28 This process entails basing both recording and playback on the placement of the human ears. For recording, this can mean a so-called dummy head recording with two microphones facing in different directions as if on the sides of a human head (and indeed, sometimes they are on a dummy head). For playback, headphones give a faithful reproduction of the positioning of sounds recorded and registered by the dummy head. A variant on this was the Zuccarelli holophonic system, which asked for the two speakers to be set on each side of the listener at a 90-degree angle.29 Pink Floyd’s album The Final Cut (1983) and Psychic TV’s Dreams Less Sweet (1983) were


recorded using this, the latter of which included instructions for listening with the record. This furnishes a direct analogue to human hearing and supplies precise ‘point-of-view’ sound. The theory behind it was more controversial and remains unproven: that human ears produce sounds that mix with sounds from outside to make interference patterns, which are processed by the cochlea as being 3-D, as an ‘acoustic hologram’. Nevertheless, binaural recording has persisted and is used in virtual reality (VR). The audio system of so-called Ambisonics involves full three-dimensional immersion in 360-degree sound. This is constructed around a centre point where the microphone is placed while recording, yielding point-of-view sound as standard in 360 video and VR.30 The way that modern film soundtracks are assembled in post-production from multiple recordings militates against the use of binaural sound, which would need everything recorded from the same position by the same dummy head or device holding the two omnidirectional microphones apart. Such an approach is in effect unworkable for multi-speaker setups such as in cinemas, yet it is certainly an option for audiovisual culture that uses headphones, while headphone-based culture increasingly has taken into account the requirements for point-of-view sound evident in VR.31 However, cheaper or more experimental productions are not dominated by complex post-production sound and thus would be able to exploit this system.32 Listening to a particular location and then comparing with its representation in feature films will bring a surprise. The latter doesn’t sound much like the former, more an exaggerated and heavily simplified version. Indeed, a processed version. In some ways, this represents the way our perception sifts and orders a soundscape into an easily consumed, perhaps even unconsciously registered, whole, or wholes, for us. This can be illustrated easily by making a basic monophonic recording (e.g. 
on a cell phone) of that same location. This can often be a surprise, too, where the recording sounds not only dynamically flat and lacking in sound separation, but a chaotic melange of sounds that might even lose individual identity. While part of this is down to what the recording makes of the soundscape, as Bazin might note, it has a certain objectivity in its mechanical process. Perhaps the important part of this is that the recording must be re-processed by our perceptual faculties using new criteria, and indeed, after a while, a chaotic sounding recording of a location will start to make more listening sense.


What might be understood as ‘Bazinian’ sound techniques would include Son Direct, live sound or direct sound, which is recorded at the same time the film is shot, often using a single microphone. This was a tradition in French cinema and became a cultural dominant by the 1980s for amateur videotape films across the world with integrated sound and vision camcorders. This approach yields a sound that matches the perspective of the image but can prove to be a fragmented experience upon editing. Related to this is so-called point-of-view sound. This remains a rarity although it can be used as a momentary effect in audiovisual drama. During the 1930s, Hollywood established ‘omniscient sound’, where we hear everything in a scene from a vague centre point, as remains the norm for dramatic films. Rick Altman charted the debate among sound personnel in the early 1930s between ‘scale matching’, where sounds appear as close as their images on screen, and the ‘intelligibility-oriented’ approach, based around the narrative importance of what was to be heard and with a sense of seeming to be a more static general sound for a space.33 The latter won out and became the standard strategy for auditory perspective. This rendered ‘point of audition’ as an occasional effect, which is what we are used to these days. Yet there have been isolated recurrences, such as the widescreen cinema in the 1950s. Cinerama, for example, used magnetic tape for sound and pioneered the practice of ‘travelling’ sound. This was recorded with five microphones attached to the camera, effectively binding visual and auditory perspective together and putting the spectator in the midst of the on-screen action. 
Twentieth Century Fox, on the other hand, adapted the existing system to allow more spatiality but retain the impression of omniscient sound occupying a seemingly objective position rather than giving an impression of entering a character’s subjectivity.34 The advent and widespread adoption of stereo sound in the cinema began in the 1950s. As I noted, this involved more than one speaker channel, each containing slightly different signals, which yields a sense of space. Voices were retained in mono and appeared centrally rather than being spatialized, while initially two channels gave a sense of stereo space and movement through their similar, complementary but slightly different sounds. This broad overall format has been retained although we are now used to more speakers supplying a wider sound space (or ‘sound stage’ as it is known among sound technicians). So-called positional audio (aka spatial audio) is when an audiovisual object has sounds diffused in 3-D space, yielding the appearance of sounds having a distinct position in relation to the listener. Sonic elements are regularly recorded singly in isolation and


then, for cinema, are mixed into surround channels and with separate feeds for low frequency (LFE). Music is in stereo but not often mixed for full surround, while dialogue is given little spatialization and put in the centre channel. This system for cinema sound is a compromise between keeping dialogue central and understandable, while allowing for spatial effects and musical enveloping, which enhance the atmosphere and redouble emotional effects, as well as immersing the audience physiologically as well as mentally. It is also a compromise between what immediately seems real and what affects us in other ways. Broadly, this has always been the case but more recent cinema and some other audiovisual culture has emphasized these aspects as central to their nature.35

Audiovisual Traditions and Realism The status of ‘consensus’ reality became widely questioned as a central characteristic of modernism. Whereas earlier the demarcation between real and not-real regularly was assumed to be simple and straightforward, questions began to appear, inspired partly at least by instances such as the Gestalt experiments with perception, other forms of psychology, notions of time and space relativity and then later Quantum Physics, and also political analysis. For instance, Marxism questioned the way societies were ordered and how these legitimized themselves, establishing and following increasingly complex theories about how our sense of reality was being manipulated and regulated.36 Dramatic films and television have a certain synoptic ability, potentially to see everyone and to open a window to all areas of society. Realism in film habitually brings to the screen individuals and situations often marginalized by mainstream cinema and society. This is what Raymond Williams has called the ‘social extension’ of realism, its intention to represent not just people of rank but also the spectators’ “equals”.37 Such Social Realism gives profile to parts of society that have been unseen and allows their voices to be heard. Having belief in the ontological realism of the image, these filmmakers assumed that ‘realist’ films were able to serve progressive political agendas through demystifying the fog of ideologically primed pro-establishment images promulgated by mainstream film. A group of filmmakers in Denmark got together in 1995 and made a solemn vow to reject a number of stylistic aspects of film that they thought obscured and diverted film from giving the audience direct access to truth. Including Lars von Trier and Thomas Vinterberg, among others, Dogme


95 declared a manifesto for limiting the audiovisual language, which included the following38:
1. Shooting must be done on location. Props and sets must not be brought in.
2. The sound must never be produced apart from the image or vice versa.
3. The camera must be hand-held. Any movement or mobility attainable in the hand is permitted.
4. The film must be in colour. Special lighting is not acceptable.
5. Optical work and filters are forbidden.
6. The film must not contain superficial action.
7. Temporal and geographical alienation are forbidden.
8. Genre movies are not acceptable.
9. The film format must be Academy 35 mm.
10. The director must not be credited.
This tells us about what techniques were considered ‘honest’ and conducive to a sense of realism and which were considered obfuscatory, mannered or manipulative. The Dogme filmmakers clearly were interested in eschewing formalist approaches, as well as being concerned about post-production, particularly for sound. Many of the aspects of the Dogme 95 manifesto pushed towards a sense of exploiting and retaining film’s ability to capture a sense of reality. Is the aim the ‘real’ or naturalism? In terms of theory, by the 1970s, the critical tide had turned against Bazin and his conceptualization of realism.39 The new orthodoxy of film theory, inspired by Lacan’s rerouting of Freud allied with Althusser’s rerouting of Marx, unveiled the seeming reality of film as a dangerous insidious mirage under the yoke of dominant ideology. Christian Metz was influential in his declaration that the ‘reality effect’ of film was not an effect of the medium’s unique relationship with the real, but rather was caused by audience expectation set by the standardization of conventions.40 This semiotic-dominated approach understood film as being structured like a language and requiring the techniques of something approaching linguistic analysis. 
This approach had little time for a sense of direct communication and took all of film to be something that was coded in production and decoded in reception. Indeed, Metz’s influential approach understood the screen not as a window onto the world but instead as more like a mirror, reflecting back to spectators their own sense of themselves filtered through ideology.


For Laura Mulvey, a sense of realism in film is structured by the unconscious processes of patriarchal society. Film is not a record of reality, but a particular casting of it which reinforces certain social relations. Mulvey’s response was to make films that espoused anticlassical techniques to bypass the dominant ideology imbricated in the mode of film. For instance, she co-directed Riddles of the Sphinx (1977) with Peter Wollen, which included diegetic sequences with actors that depicted the action through continuously revolving a camera on a tripod (360-degree pans). These long takes guarantee time and space integrity, underscoring the way that the acting is taking place in real time and in a real space. While this destroys the expectations of vision, diegetic sound is not obscured in a similar way, which allowed the audience to follow narrative developments which were obscured by the image strategy. Incidental music, on the other hand, followed a far more unconventional approach. Soft Machine’s Mike Ratledge made a score based on drones and repetition that at times was reminiscent of Terry Riley’s A Rainbow in Curved Air. The pulsating, repetitive music that accompanies diegetic scenes eschews the main functions of the traditional classical film score, as it did not help to frame and emphasize action and emotions. A significant discussion about realism took place in the mid-1970s in the journal Screen. The so-called Days of Hope debate cohered around Ken Loach, Tony Garnett and Jim Allen’s television miniseries Days of Hope (1975), which followed a working-class family through the social and political landscape of the 1920s and 1930s. Its approach was largely that of classical Hollywood, a highly conventional strategy making for ‘transparent’ style and the drama in many ways looking and sounding very much like other television dramas of the time. 
The relevant critical articles were reprinted in a volume and were built upon the debate about the way that mainstream film transmitted a sense of realism and authoritative ‘truth’. The debate was built largely upon Colin MacCabe’s characterization of the ‘classical realist text’ in ‘Realism and Cinema: A Note on Some Brechtian Theses’.41 The debate focused on the social impact of the dominant form of realism in audiovisual culture, which was largely derived from the model of narration and representation established by classical Hollywood. Colin MacCabe suggested that the standardized form for film and television drama, sometimes called ‘realism’ and sometimes called ‘naturalism’, was unable to offer actual social critique but instead functioned as an apologist for the current social order. This was imbricated in the form itself, its sense


of authority and its tacit values. MacCabe noted that ‘the classic realist text’ is based on a transparent structure that is not registered as such, and appears to channel knowledge and truth (structured as a hierarchy of discourses).42 Colin McArthur helpfully sums up: the ‘classical realist text’ cannot deal with the real in its contradiction and that in the same moment it fixes the subject in its point of view from which everything becomes obvious. ... It is [Days of Hope’s] this retention of classical features which render it accessible and open it up to charges of ‘recuperation’ (meaning something like ‘absorption’ or ‘nullification’, usually in a political context). … [however a] progressive realist text, such as Days of Hope, might be a more appropriate agitational weapon than the (utopian?) Revolutionary text canvassed by Screen.43

McArthur noted that conventional forms of realism were able to inspire viewers politically through an accepted, ‘transparent’ showing of negative social situations, while MacCabe claimed that the conventional mode of representing itself produced a passivity and acceptance in the viewer who was unable to grasp the politics of the situation represented apart from in a romanticized manner. While social and political contradictions might: appear temporarily in texts like Days..., for example, as McArthur argues, but that these temporary contradictions are still resolved in the usual way in the end by the narrative. There are specific problems with the way history and the past are depicted in Days... too - as ‘fixed and immutable’.44

The format ‘works’ in the same way as any costume drama, and the only way of stepping outside and having an impact is through a more Brechtian-inspired approach of emotional distanciation allowing the audience to adopt a critical approach. McArthur and MacCabe both shared a focus on the political/ideological potential of certain aesthetic film and television forms, and yet perhaps it might be more interesting to focus on the regimes of the separate aesthetic models and their relationship to a sense of realism. The continued importance of this debate is clear for representations of political events, as these can be solidified in public consciousness, and spun in a certain direction by interested parties. What is at stake is the notion that following a


highly conventional approach to representation supplies its own ideological position, in effect irrespective of the material being represented. MacCabe noted the “classic relation between narrative and vision in which what we see is true and this truth confirms what we see”.45 This addresses the notion of convention as well as implicating something of the phenomenon of accepting, or desiring, what we are presented with as being on some level reality and related intimately with an idea of ‘the truth’. At least partly, as I have been arguing, this is due to the dominance of perception in setting out a strong image and sound composite for us, which may or may not inspire more critical activity by higher brain functions. The very audiovisual language can be adopted in the hope of removing critical or extreme responses through its very normality. So-called television naturalism is the traditional zero-degree style of television drama, clearly dialogue-led and without extraneous noise on the soundtrack (much like radio drama), accompanied by a preponderance of shot-reverse shot structures of ‘talking heads’. While these function as a clear analogue of close quarters conversation, they are far more formalized: one actor speaks, then the other replies, and so on. These nearly straight talking heads almost address us personally; and we, sitting at home, are on the way to being their reverse shot. The very conventionality of this approach renders it almost invisible, and while it has clear advantages for audience involvement, while giving a certain impression of the real, it also limits the possibilities of engaging with reality. In the early 1960s, screenwriter Troy Kennedy Martin argued for television drama rejecting ‘naturalism’ in favour of using modernist techniques, and so aiming to confront the spectator intellectually rather than simply engaging with their emotions through narrative and characterization. 
Kennedy Martin noted that television drama was (and still is) dominated by an approach that aims simply to dramatize content, and with a format that is stylistically invisible and predictable and follows unremarkable linear time.46 His remedy is to suggest that drama should follow audiovisual techniques associated with modernism, such as montage and partly autonomous camera movement.47 Writing over a decade later, in 1977, John McGrath pointed to the same issue and characterized naturalism as ‘encapsulating the status quo’ and lacking the emotional involvement of cinema.48 He continues that “Naturalism contains everything within a closed system of relationships. Every statement is mediated


through the situation of the character speaking. Mediated to the point of triviality … it imposes a certain neutrality about life on the writer, the actor and the audience”.49 This is a strong description of the form of naturalism that dominates television drama and has been dominant in the stylistic norms of classical cinema, too. This zero-degree style is consensually accepted through the limited vocabulary and repetition of conventions. This yields a believable illusion, invisible technique, invisible acting and overall, a concern with surface, avoiding jarring ‘alienating’ surprises, and a pervasive sense of ‘normality’ which allows the audience to concentrate on the illusion of ‘what happens’ in the drama.50 Classical narration, as described in detail by David Bordwell in Narration in the Fiction Film,51 became a dominant narrative and spatial system that enabled an economy of production and economy of mental energy in consumption. The industrialization of image and sound was adapted a little by television drama while retaining both economies. Classical Hollywood reserved stylistic anomalies for exceptional and for subjective situations, with good examples being the protagonist’s alcoholism in The Lost Weekend (1945) or for extraordinary otherworldly events in The Day the Earth Stood Still (1951). In both cases, there are visual novelties that transcend the plateau of stylistic normality, and, crucially, in both cases, there is a sonic indication of the unusual. Both Miklós Rózsa’s and Bernard Herrmann’s orchestral scores showcase the eerie electronic sound of the theremin. 
The pace and aesthetic style of films have changed since the era of classical Hollywood.52 David Bordwell updated his stylistic description of different forms of narration in the article ‘Intensified Continuity’.53 He focuses on four main developments: more rapid editing, bipolar extremes of lens length, more close framings in dialogue scenes and a free-ranging camera.54 Surprisingly there is no mention of the importance of omniscient sound for continuity in a regime of fast-cutting and mobile camerawork. Bordwell notes: “… building dialogue scenes out of brief shots, the new style has become slightly more elliptical, utilizing fewer establishing shots and long held two-shots”.55 Yet he does not notably address the change in dialogue delivery and visuals that accompany them, either, although being fair his article explicitly addresses ‘visual style’. He is certainly correct in his conclusion that films retain a sense of general continuity and follow many of the foundational conventions, both in terms of the

image and sound. It would have been useful to have read more about the exponential rise in off-screen sound in mainstream film since the 1980s, particularly as this has had an effect on the affordances available for the accompanying images. Jeff Smith notes this while broadly agreeing with Bordwell. He points to significant changes in sound techniques and affordance that include “increased volume, low frequency effects, expanded frequency range, the spatialization of sound, the ‘hyperdetail’ of contemporary Foley work, and the use of non-diegetic sound effects as stylistic punctuation”.56 Despite these notable developments, the sonic elements of intensified continuity still retain the broad principles of storytelling in film that were established a century ago. These aesthetic developments in mainstream cinema have their antithesis in so-called slow cinema, where directors such as Béla Tarr revel in the extremity of long takes and slow pacing. Good examples include Michelangelo Frammartino’s Le Quattro Volte (2010) and Tarr and Ágnes Hranitzky’s The Turin Horse (2011). Slow cinema often appears to use direct sound, although it can be difficult to tell in some cases. The retention of often static images on screen allows a contemplation of the details of the image, but also allows for careful consideration of the soundtrack by the audience. The conventional aspects of audiovisual culture certainly cannot be denied, but they cannot be used to dismiss or downplay the phenomenological impact of the recording of image and sound. While these aspects of classical style or naturalism are based on a consensus of industrial production and audience consumption, there is a clear perceptual basis to these conventions.
The television genre of so-called reality TV, with its often hidden or studiously ignored cameras and microphones, adopts an audiovisual style that aims to make the frame of the image invisible and is widely accepted as ‘capturing reality’ as if the shows were documentaries. Yet it is hard not to think of Baudrillard’s declaration of the simulation as having replaced the real, and McLuhan’s characterization of media as having become integrated parts of the human. Debates about the political effect of realism remain, though, and some of these old discussions don’t deserve to be forgotten or ignored by current scholars. The question of the relation of audiovisual culture to reality is crucial, although increasingly difficult to define as the two have become inseparable.

Evidence for the Real
The sense of reality associated with audiovisual culture gives it a sense of status as a form of ‘witnessing’ as well as ‘recording’ reality. The power of recording reality relies upon the intensity of the reality effect of film, as well as its ‘mechanical’ aspects. Recordings have a certain legal status and can be an important part of ‘citizen journalism’ and comprise visual and audio evidence for legal cases. Indeed, audiovisual evidence has become one of the most conclusive pieces of evidence in modern court cases. CCTV (closed-circuit television) appears to have a status as accepted and guaranteed reality, as alluded to in the ‘sound and videotape recordings’ legal guide published by the UK Government, Health and Safety Executive.57 While we tend to think of visual recordings, sound recordings have also had an important place as a record of real events in courts, such as the torture recordings made by the notorious serial killers ‘the Moors Murderers’ that were played in court in their 1966 trial. The sound without images can, and likely does, inspire the listener to imagine accompanying images. One of the most famous films of all time is the ‘Zapruder film’ of the assassination of John F. Kennedy in Dallas on 22 November 1963. It is surely one of the most highly regarded pieces of visual evidence. Abe Zapruder shot the events on regular 8 mm film, which had no sound component. He was shooting from an elevated concrete pedestal, on Elm Street at Dealey Plaza, with an expensive camera: a Model 414 PD Bell & Howell Zoomatic Director Series 8 mm camera. Zapruder shot film from the moment when the presidential limousine turned into Elm Street, capturing 26.6 seconds of film with a fairly direct sight of Kennedy being shot.58 It is a single take, providing a single point of view with no edits or manipulation of the film, and yet it does not solve or dispel the mysteries and conspiracies that surround the assassination.
Life magazine immediately bought the footage, although Zapruder insisted that frame 313 be excluded from publication. The importance of the Zapruder film is not simply that it caught a remarkable historical moment but that it was solid evidence used to counter the ‘official version’ of the Kennedy assassination, although it has been far from conclusive in itself.59 Over 30 people took film footage in Dealey Plaza (with another well-known one being shot by Orville Nix),60 but the Zapruder film is the most highly regarded. In 1994, the Zapruder film footage was deemed ‘culturally, historically or

aesthetically significant’ by the United States Library of Congress and was selected for permanent preservation in the National Film Registry.61
Another important incident that was captured on film was the attack by police on Rodney King. On 3 March 1991, Los Angeles Police Department officers beat King badly after a high-speed chase. George Holliday, a plumbing salesman, filmed the incident with his new Sony camcorder from his nearby apartment and sent the footage to local news station KTLA.62 At the subsequent state trial, four officers were charged but none convicted despite clear evidence of the assault on video, and the acquittals sparked the 1992 Los Angeles Riots.
A more recent case is that of George Floyd Jr. In Minneapolis on 25 May 2020, Floyd was suspected of having passed a counterfeit $20 bill. One of four arresting police officers, Derek Chauvin, knelt on Floyd’s neck and back for 9 minutes and 29 seconds, causing him to die of suffocation. Bystanders recorded video of the horrifying event on their mobile phones. Teenager Darnella Frazier’s video was the most prominent, being shared on the Internet and used at the murder trial.63 One striking aspect was the existence of a soundtrack that made it clear that the policeman was aware of Floyd’s physical situation. In some ways, this was more iconic to the campaigns that followed, where the repeated phrase ‘I can’t breathe’ made by Floyd as he was dying became a clarion call.
The relationship between film and reality has perhaps been encapsulated in the documentary film format. Early films such as those by the Lumière brothers, and so-called Actualities, aimed simply to capture the images set in front of the camera. ‘Scenics’ were travelogue films that were popular after the turn of the twentieth century.
Also, around this period, medical and scientific films detailed operations and the natural world, with early ones made by Eugène-Louis Doyen, who filmed medical operations, while in the 1920s, Jean Painlevé made remarkable underwater films. There was a burst of anthropological documentaries made in the 1920s and 1930s, in the wake of Robert Flaherty’s Nanook of the North (1922). Flaherty’s film focused on the activities of the titular Inuk, although restaging and dramatizing aspects of Inuit life rather than simply recording their way of life. Films exoticizing cultures remote from America and Europe were popular among filmmakers, such as Moana (1926) and the Thailand-set Chang: A Drama of the Wilderness (1927), about a Thai farmer, directed by Merian C. Cooper and Ernest Schoedsack, who later made King Kong (1933). While considered remarkable and educational at the time, it is difficult not to be aware of the filmmakers imposing

themselves and their requirements on distinct cultures that are then cast as strange, ‘primitive’ and exotic. The figure of the ‘Noble Savage’ is never far from sight, if not in plain sight. As a pioneer of documentary film making, Flaherty had no issue with restaging activities. Perhaps one of the best known was the shark hunt in the small boat in Man of Aran (1934), a practice that had not been carried out on the islands for decades.64 Man of Aran was shot without any location sound and had all its sound completed in post-production. To modern ears, apart from John Greenwood’s score, the sounds do not have a strong weld with the image.65 Dialogue in Irish and English is intermittent, and often incidental. It sounds like it was recorded in a different space and crucially is not synchronized, only roughly matching the situation. I have described this strategy as ‘plesiochrony’, a loose and generalized matching of sound to image rather than aiming for the sort of precise synchronization dominant in cinema.66 More recent approaches to the documentary, as evident in the cinéma vérité and closely related Direct Cinema movements, took a profoundly different approach. Their techniques were enabled by technological developments: lighter and more portable hand-held (16 mm) cameras, and portable sound recording using magnetic tape. Perhaps the best-known proponent of cinéma vérité was Jean Rouch in France, while Direct Cinema in North America involved directors such as Richard Leacock, Frederick Wiseman and Albert and David Maysles. While there are strong similarities of approach and assumptions among these filmmakers, their films vary greatly as do their approaches to making them. However, they shared a strong interest in ‘catching reality’ rather than simply illustrating reality or showing something unfamiliar.
Developments in popular music and the advent of large-scale festivals in the mid- and late-1960s led to the development of rock documentary as a format, which was clearly influenced by Direct Cinema. Festivals such as Monterey, Woodstock, Altamont and Wattstax gave filmmakers the opportunity not only to ‘catch’ the reality of the event but also to showcase the exotic aspects of the new youth culture. D.A. Pennebaker exploited the idea of the camera as observer in Don’t Look Back (1967) by showing behind the scenes of Bob Dylan’s controversial 1965 British Tour. While the footage of Dylan performing onstage was dramatic, aided by booing from the audience who disliked his addition of electric instruments, the intimacy of the backstage events added a remarkable dimension to the format of the music documentary. Indeed, arguably Pennebaker set its

format of ‘frontstage concert’ and ‘backstage’ that continues to dominate the genre. Using hand-held camera and portable sound recording, Pennebaker’s crew aimed to be unobtrusive and give the audience the feeling of presence at the events. The Maysles brothers and Charlotte Zwerin followed a similar procedure in the making of Gimme Shelter (1970). The desire to capture the ‘reality’ of the Altamont free festival in 1969 appears to be at the heart of the film. There is no clarifying voice-over narration and the film was assembled from a massive amount of footage taken by a large number of cameras (including one operated by George Lucas, well before Star Wars). Gimme Shelter documents the degeneration of the event into chaos and violence, and highly memorably shows the killing of a man (named Meredith Hunter) in front of the stage as the Rolling Stones perform. At this point, the linear presentation of events in the film halts and steps outside the footage of the concert to show Mick Jagger watching the film on a Steenbeck-type flatbed editing suite. “Take that back, will you, David,” he asks. The images roll backwards and David Maysles then talks him (and the film audience) through the images, moving them in slow motion and holding a frame to show a gun silhouetted against a woman’s dress (Figs. 3.4 and 3.5). Gimme Shelter shows explicitly a dangerous situation, and is at times shocking and horrifying, illustrating the potential for the new uncontrolled protean culture. The reaction of the film’s audience is perhaps not simply intrigued but also horrified, as the film provides some understanding of the danger of unstable crowds. In Gimme Shelter, it is striking how the desire to have the cameras in among the audience looking at the stage, the classic analogue of the concert experience, is quickly overtaken by

Figs. 3.4 and 3.5  Gimme Shelter

the dangerous situation, meaning that the cameras essentially shoot most of the film from behind the musicians on the stage. The film has a number of long shots that take in a large amount of the stage and the front portion of the audience. These shots contain a large amount of simultaneous action and movement, and the lack of a framing hierarchy telling us what to look at makes for something different. The audience is able to scan the image and so can see different things each time the film is viewed. Bazin noted that the ability to move our eyes around the frame is an important aspect of his conception of film realism, where, in deep focus, we can shift our focus from foreground to background, as we might when confronted with a real-life vista. The concert film and ‘rockumentary’ arguably came out of Direct Cinema, and the urge for ‘visual anthropology’, or keeping for posterity. For instance, Woodstock (1970) depicted the overall ‘event’, with sometimes as much interest in the audience as in the musical performance. Such nurturing of ‘history’ has guaranteed the 1960s’ pop festival films a perennial status, although some historical events were significantly enhanced, such as Led Zeppelin: The Song Remains the Same (1976).67 So, documentaries remain with a question mark over them yet still exploit the abilities of film to capture mechanically what is in front of the lens and wreathe that in a sense of accessing reality. While certain areas of audiovisual culture remain founded on a notion of the neutral ‘capture’ of reality, digital post-production, embodied by CGI special effects, has tested belief in sound and image. The move to digital recording has changed image and sound media from, in basic terms, capturing something of reality to a focus on processing and alteration after initial recording.
Lev Manovich noted that digital cinema ought to be understood as more like ‘animation’ and involves an integral aspect of ‘painting’.68 This reflects a move away from a strong sense of understanding audiovisual culture as having a direct relationship with reality. In the 1980s, Jean Baudrillard declared that the world, through the integration of electronic media, was collapsing into a state of ‘simulation’ where any meaningful distinction between reality and electronic fabrication had been lost.69 Yet this is not to say that we simply believe everything but that perhaps we are more cynical overall. Stephen Prince states that CGI and other forms of digital ‘enhancement’ pose new challenges to realism and its foundational theories of resemblance and undermine traditional approaches to cinema that think of its techniques as realist or not.70

Peter Jackson’s They Shall Not Grow Old (2018) used as raw material film footage from World War I that had been changed by digital techniques to supply the sense of realism expected from modern films. The phenomenological effect is startling, giving a sense of immediacy to the film footage and making remote events seem far closer than before. While the colourization and image and movement enhancement are clear, audiences were probably less likely to note the soundtrack. The footage, of course, was shot silent but is transformed by totally newly created sounds. The soundtrack assembled music and wild track interviews with soldiers along with diegetic sound effects as accompaniment to the actions depicted in the images. The soundtrack is reminiscent of Robert Flaherty’s soundtrack to Man of Aran (1934). In this case, Flaherty shot the film with no location sound, which would have been difficult in the early years of synchronized recorded sound cinema. As I mentioned earlier, instead, he fabricated diegetic sound in a studio recording which had a plesiochronous relationship to the images, roughly fitting but never synchronizing precisely. During activities in Man of Aran, we hear voices that we assign to characters on screen despite the quality of the sound suggesting the voices are not from that actual space, and despite never seeing a character’s lips move precisely at the point that words are spoken on the soundtrack. This film was made in the early years of talkies and so perhaps did not feel a need to follow strict expectations. Jackson’s film had no such leeway and by necessity had to approximate the expectations of modern audiences for audiovisual representations being understood as real.
Silent films have had something of a rejuvenation with the opportunity to add something new to the film through the addition of novel music (more about this in a later chapter).71 Following the tenets of the McGurk Effect and the composite audiovisual signal, different music makes the film different. Rather than it being a modulated experience, it is a transformed experience. The way that the real is transmitted has changed perhaps, now including seemingly ‘invisible’ aspects: not simply definition of image but also movement, colour and suitable sound to hit the standard of modern high-quality audiovisual culture. When there are no ‘issues’ for perception to focus on, cognition perhaps moves onto another level involuntarily. They Shall Not Grow Old was released at the centenary of the conclusion of World War I, and despite its modern adaptation will serve as surrogate experience for future generations, with its remarkable emotional and immediate impact.
In 2021, a 33-second audiovisual advertisement for a Dubai-based airline showed a woman on the pinnacle of a skyscraper. First, she is shown

in medium shot and looking relatively normal as she holds up and shows a succession of cards containing writing, like Bob Dylan in Don’t Look Back (1967). Then the camera moves backwards and outwards to reveal her actual position.72 Upon seeing this advertisement (called ‘We’re on Top of the World’ and made by the agency Prime Productions AMG), I, like many, thought it was faked through CGI as it was so dramatic and impressive. It turns out that it was not faked. Stuntwoman Nicole Smith-Ludvik stood 823 metres high, on the top of the world’s tallest building, the Burj Khalifa in Dubai. She stood on an extremely small platform at the top of the building’s pinnacle that allows only enough space for a person to stand without moving. While a short film was made to show how the startling imagery was achieved, perhaps its main function was to assure the audience of the veracity of the images (Fig. 3.6).73 Our initial reaction is being impressed by the camera pulling out to extreme long shot to reveal the woman’s actual position. This is a physiological reaction, as a form of acrophobia. Then we recover our composure, and our cynicism. Even when it is explained to the incredulous spectator that this is not software trickery, there remains a desire to find something less than straightforward. Having watched it and the ‘making of’ documentary a few times, I am aware that there is a slight jerky moment to the frame, which likely comes from the transition between footage from the fixed camera on the platform next to the tower and the drone footage which moves rapidly away. On initial viewing, this all looks like a single

Fig. 3.6  ‘We’re on Top of the World’

shot. This short film is interesting in terms of its function, too. It promotes visiting the United Arab Emirates but doesn’t make it look an interesting or beautiful place to visit. Instead, the impact of the advertisement and the difficulty of its production reflect positively on the company, giving an impression of sophistication and inventiveness and promoting the company rather than the location. Stephen Prince notes that there has been a critical anxiety in the audiovisual industries about digital imagery undermining the sense of reality in images, which has brought a sense of crisis to the established modes of representing things.74 While we may perceive CGI as initially seeming real, as a phenomenological effect, our intellectual reaction may be to question our vision and hearing, and be suspicious. This appears to be a situation where we might not believe although our senses tell us it’s true, where there is a conflict between our top-down cognition and our bottom-up perception. We may believe less but we nevertheless want to believe and perhaps need to believe. Yet this situation is hardly new for film,75 and perhaps this fundamental aspect of splitting belief and the so-called suspension of disbelief is a defining aspect. Indeed, the proliferation of short films on Internet sites such as YouTube and TikTok means that people are highly accustomed to seeing and hearing what appears to be amateur footage and are far from naïve about whether it has been enhanced or not. These debates almost always focus on imagery. Believable sounds can make us accept images more. Are we more likely to believe sounds? Of course, there is a strong tradition in audiovisual culture of sound retaining a certain power of credibility, for instance, in voice-overs on television advertisements, on news or on party political broadcasts. These promote a sense of being authoritative and trustworthy, converting the image into a bearer of the authentic.
The United Arab Emirates advertisement soundtrack has been noticed far less than the images. The sound is not about furnishing a sense of the real so much as it functions for continuity. There is no sound of wind, which surely would have been present judging by the stuntwoman’s flapping headband, and the music proceeds and furnishes a sense of seamless progress perhaps more than drama. Consequently, I would be tempted to suggest that the filmmakers are sensitive about continuity in this advertisement, underlining hidden ‘joins’ in the visual track. Certainly, if you turn the sound down, there is a tendency to notice the slight sense of possible post-production aspects. This is far less apparent with the soundtrack, which not only forms a perceptual unity with the image but also uses up more mental processing power and perhaps makes us less critical.

We have a strong tendency to believe sound, even though some of us might be extremely aware of how mainstream films build their soundtracks in post-production. An interesting and persuasive case in point is the call of the Australian bird, the kookaburra. It was likely first used in a Tarzan film from the 1930s. The industry wisdom since has been that jungle scenes ‘don’t sound right without it’. It is endemic. Strangely, though, it is also evident beyond African, Asian or South American jungles, appearing for the Himalayas in Powell and Pressburger’s Black Narcissus (1947), for North Carolina in Cape Fear (1962) and for the Borgo Pass (Transylvania’s Tihuța Pass) in Jess Franco’s Count Dracula (1970).76 Some useful research completed by The Sound and the Foley Internet blog points to a likely first use of the kookaburra call as a general sound. Tarzan and the Green Goddess (1938), starring Herman Brix in the title role, and re-edited into a film from its original serial format, was very likely the first time the sound was used in a Tarzan film.77 This film was largely re-edited from the serial The New Adventures of Tarzan (1935) and was notably later for the film kookaburra than general wisdom suggests, with Tarzan the Tiger (1929) having been the first sound film featuring Tarzan and Tarzan the Ape Man (1932) proving a massive success and leading to sequels featuring its star Johnny Weissmüller. Tarzan and the Green Goddess was well into the run of tens of Tarzan films that went on into the following decade. In Creature from the Black Lagoon (1954), the first shot that shows ‘the Amazon’ (really Florida) is inaugurated sonically by a recording of a kookaburra call. The same procedure was followed precisely in the sequel, Revenge of the Creature (1955), with the initial image of what’s meant to be the Amazonian rainforest being matched instantly by the sound of the kookaburra.
A kookaburra can be heard in The Treasure of the Sierra Madre (1948) as the characters enter a forest in Mexico, and more recently, kookaburras are heard on a tropical island near Costa Rica in The Lost World: Jurassic Park (1997). Its consistent appearance where it is nonnative shows that the sound of the kookaburra is a general jungle sound rather than being taken to be the sound of a particular animal. As an acousmatic sound, without an on-screen source, rather than making us interested in its origin, we merely understand it as a sonic texture of the jungle. For the audience, its origin is the jungle rather than a particular animal.78 I have heard this referred to as the ‘Coconut Effect’, where audiences expect certain sounds. Perhaps the clearest example is for punches to sound like pieces of wood slapped together. This is particularly evident in Kung Fu films of the 1970s and 1980s, where the sounds sometimes can

be heavily exaggerated, with a tail of reverb that does not fit the space of the image.79 Such post-production Foley work and library sounds can have a hyperbolic or exaggerated effect but nevertheless are highly effective and are a conventional requirement if action films are going to be accepted by their audiences.80 Indeed, punches and other impacts show how sound and image fit together in precise synchronization. They also underline how sound can ‘validate’ the image, and vice versa. A big sound for an explosion or gunshot can make innocuous images seem something more extreme. Similarly, a close-up insert shot of an animal can make the ambient sounds of nature somehow more tangible and less simply ‘background noise’.
During the Covid-19 football season in the UK, 2020–2021, audiences were not allowed to attend stadia and games were mostly streamed live on the Internet. Quickly, there was an anxiety as the lack of the expected soundtrack for the games disturbed the audience’s sense of ‘the real’. This showed just how important expectation is for experience, while also underlining the psychological importance of sound. The answer in some stadia was for an electronic soundtrack to be piped in, giving an approximation of the experience of watching a normal game on television. Informal evidence suggested that football players were also happier with something to mask the lack of sound, even though the soundtrack could vary from a continuum of crowd recording to an alternation between an encouraging crowd and a highly excited crowd, the latter of which often didn’t match the tenor of the action well.
The added sounds supplied something of the ‘psychological realism’ required for the experience of sport, whereas the silence was eerie and alienating, and brings to mind Adorno and Eisler’s characterization of silent films devoid of live music as suffering from film’s inherent ‘Ghostly Effect’.81 It is probably also worth noting that sound had a reassuring effect, supplying the impression that everything was ‘as normal’, a vital contribution during the anxieties of the onset of Covid-19. We tend to think of sound as ‘belonging’ to an image or becoming anchored once we see the origins of the sound. This suggests that sound is firmly secondary to image. However, the examples discussed here suggest something more: sound can make violence physical (punches), can embody a particular location (jungle) and can validate images (enhanced crowd noise at sports events).

Doubling Perception: The Technical Analogue
In contrast with this last situation, there has been an intermittent but insistent strain in audiovisual culture that involves direct homology, copying the phenomenological effect of the hearing and seeing subject. Audiovisual culture can often follow the idea of ‘the Technical Analogue’, giving an impression of ‘reality’ through particular audio and image style, which mimics human perception. For visuals, this follows the strategy of exploiting point of view, sometimes in a sustained manner and sometimes only at privileged times. Traditionally for film, this has been understood as a means of ‘suturing’ the audience emotionally to a character on screen. The dominant approach to film sound, and subsequent audiovisual drama, was standardized by the Hollywood studio system and its industrial and aesthetic standardization. It is based on the principle of audible clarity and the hearing of relevant narrative information rather than on sound being heard from the point of view of a character or on a sense of what the actual location would sound like. This can serve as a contrast to dominant stylistic naturalism, yet often works as a leavening of the latter. As already mentioned, ‘subjective sound’ is far less common than ‘objective sound’, and this gives the impression that we are an omniscient but invisible bystander, while providing clarity of salient sound in the whole scenic location. The soundtrack is ordered conventionally, which is most evident in the way that we always hear dialogue clearly, and lines are not spoken simultaneously. In terms of volume, dialogue is mixed loudest, with sound effects, ambience and non-diegetic music giving way. When dialogue is absent, there can be a more interesting relationship between the other sounds present.
At highly emotional moments, the musical score can swell to encourage an emotional reaction in the audience to the narrative situation and images, while a featured song recording can often inspire volume and presentation of the music in the same way as it would sound on a music system outside the cinema. These are also evident on headphones with films as home media. However, sound perspective often remains the same: omniscient and ‘objective’, as if a consensus ‘reality’. We rarely notice changes in sound perspective unless they are made very clear. For example, in Reservoir Dogs (1992), Mr. Blonde has been torturing a captive policeman while listening to Stealers Wheel’s Stuck in the Middle with You on the radio. He then goes outside to fetch a can of gasoline from his car. The sound of the song illustrates the change in sound perspective, following Mr. Blonde as subjective sound, and so the

sound recedes in volume and dynamic range as he goes outside, although we still hear it proceeding. As he walks back into the building, the song’s sound moves back towards its original state, which nevertheless was a fulsome sound in the cinema’s speakers rather than the sound that would have come from a radio.82 A notable example of film attempting a direct analogue of human perception was Lady in the Lake (1947). Directed by and starring Robert Montgomery as Philip Marlowe, this film’s strategy is based on almost everything being shown as a point of view from the film’s detective protagonist. This strategy was also evident in another film noir at the time, Dark Passage (1947), where consistent point of view was used early in the film to avoid showing the protagonist’s face. Based on Raymond Chandler’s story, Lady in the Lake is pedestrian in narrative terms, telling of a missing wife and a dead body at the lake. The film appears to be based wholly on the point of view structure in film, and bolsters this with something approaching point of audition sound. The audience only sees or hears what the protagonist Marlowe sees and hears. This is an experiment, with the camera and microphone as the protagonist of the film. Other characters talk directly to the camera, with intermittent returns to Marlowe addressing the camera and telling what happened next. This device serves to hold together narrative development, but in aesthetic terms, it serves to provide variation from the visual point of view shots and simultaneously remind the audience of the protagonist and of our imminent return to the detective’s point of view. These shots showing Marlowe’s perception of events make for a very static film, with a succession of scenes rendered as extremely long takes and with a conventional composition of a medium shot with another character in the centre of the frame.
The overwhelming majority of these are static camera setups but sometimes they include dollying and more elaborate movement, and of course almost all are shot from the same elevation. There were clearly several issues for camera movement, and the film might be understood as a compendium of technical and technological answers to the questions posed by its strategy. Lady in the Lake is a film dominated by mirrors and disembodied hands, the former allowing short glimpses of Marlowe and the latter cementing the camera view as being from his head position (Figs. 3.7 and 3.8). We are regaled with ‘subjective’ effects. The camera moves downwards as Marlowe sits down, smoke blows from under the camera into a policeman’s face and doors are opened allowing the camera to move through them.

Figs. 3.7 and 3.8  Lady in the Lake

When Marlowe visits Chris Lavery, the camera pans right, analogous to Marlowe looking across at the clock on the mantelpiece, and then when the camera turns back, Lavery lays a punch almost straight into it, showing his fist in big close-up. Sometimes these visual experiments are less startling and less successful. Marlowe goes to look at a telegram on the table in Adrienne’s apartment. The frame dollies left and tilts downwards towards the telegram but is forced to cut mid-movement, as clearly the smooth, human-like movement desired is not achievable in a single fluid camera movement. As might be expected, Lady in the Lake’s personal, perception-centred viewpoint has implications for film sound, too. An interesting sequence is the first phone call, in the ‘Press Room’ at the Bay City Police Station. We see Marlowe’s hand ringing the number on the telephone. Then, for the duration of the call, the camera retains a static frame, stuck looking at the floor and table in a ‘dead shot’ of unimportant visuals as he talks. This shot underlines just how far this film has been based on sound, namely, dialogue, and how easily it would work as a radio play. The film does not emphasize the use of point-of-view sound and aims to retain a sense of normality to the sound, seeming unremarkable and much like dialogue in any other film of the time. In the first scene with other actors, Adrienne Fromsett has a relatively close voice, while Marlowe has a deep and quite close-sounding voice. However, this is not an extremely close ‘in-head’ sound, so retains a sense of the dominant ‘objective sound’ convention in classical Hollywood cinema. Indeed, the view looks slightly less close than it probably would be as an authentic point of view from the camera position. The dialogue scenes are remarkable anyway, consisting simply of a Bazinian long take and uncut actor performance as they directly address the camera. At one point in the film, Marlowe looks through a partly open door (we see his hand on screen left) to hear two men talking beyond. Their voices do not sound any more distant and betray that narrative expediency has dictated objective rather than subjective sound. Music has an unusual place in the film, though. David Snell provided the music, which is demoted from its ‘wall-to-wall’ conventional appearance in Hollywood films of the time to Christmas carols sung by a choir for the title sequences, Christmas songs at the office party and a few instances of incidental music cues. The dialogue sequences that dominate the film lack the incidental music that would have been conventional at the time, and thus sonically the film marks itself out as different. Even the sequences that use non-diegetic music are unusual. For instance, Marlowe drives his car and we are shown another car in the rear-view mirror that then comes alongside and causes a crash. The music is choral and undulating rather than the dramatic orchestral music we might expect. Indeed, it is reminiscent of the kind of dissonant choir music evident in Ligeti’s music as used in 2001: A Space Odyssey (1968) or at the conclusion of Performance (1970). It is mixed low, however, and it is less easy to note how remarkably unconventional it sounds, as well as how its use in this situation is also singular. The removal of a conventional classical film score for a film such as this is notable, and there is an instance of a further musical element that tries to set the film apart. From time to time, Marlowe whistles an angular melody, which sometimes takes on an eerie character.83 Whistling gives an extremely close sound, an intermittent but inconsistent confirmation of point-of-view sound. This is a strangely unmusical film, made during a period when Hollywood films habitually included scores of over an hour and some left just a few minutes unscored.
This serves to emphasize the ‘realistic’ audiovisual style in the reduced style of the dialogue sequences, and underlines how what we see and hear is what Marlowe sees and hears. This displaces the omniscient narration that dominated classical Hollywood cinema. A handful of startling set-pieces also underline this, such as when the violent policeman DeGarmot slaps Marlowe twice, and then suddenly Marlowe hits him back, where we see an arm appear and strike the policeman. A stranger instance that emphasizes the subjective view is when Marlowe talks with Adrienne and she asks, ‘why are you looking at me like that?’ Momentarily, we might expect to see a reverse shot, to confirm the look on Marlowe’s face that inspired the comment. None is forthcoming and the audience is reminded of the film’s audiovisual conceit. Indeed, this is reminiscent of an ‘alienation effect’, as used in modernist drama and associated with German playwright Bertolt Brecht. Indeed, the whole film’s premise, rather than being a personal implication of the audience, might be addressed as an alienation effect, particularly as the audience is constantly dealing with being directly addressed, in a manner similar to Brecht’s plays breaking with the illusion of the action to speak directly to the audience and reassert the drama as construct rather than illusion. So, the effect of making it look like what our eye might see has a converse effect. Aspects that might illustrate or guarantee the reality of a sequence come over as extraordinary and are laid bare precisely as a set-piece effect. This is most clear in moments such as the point when Marlowe pretends that he’s going to kiss Adrienne. The subjective camera moves towards her but then veers off. Similarly, the injured Marlowe crawling away after the car crash supplies a blurred view of road racked into focus and then we see his hands crawling along; then after calling Adrienne, he passes out and the image blurs and then fades out. Rather than being immersed in the action, we are interested in how the point of view will render extraordinary events. To dramatize but also emphasize the sequence as construct, it is accompanied by some unusual angular choral non-diegetic music, rather than the conventional Hollywood orchestral score prevalent at the time. Bazin thought Lady in the Lake an interesting experiment but a failure as a film experience. According to Angela Dalle Vacche, Bazin’s verdict on Lady in the Lake was that:

realism requires Otherness, namely the unraveling of the physical world on both sides of the axis of action through editing, because the cut marks the shift from the subject of the gaze to the object of the gaze. Without a subject, the object does not exist, since nobody is looking at it. Likewise, without an object, the subject cannot come to terms with the subjectivity of its own perceptions.84

Indeed, does the point-of-view shot not cease to be meaningful once everything is a point-of-view shot? Point of view, certainly in the classical film sense, is integrated as a part of a system rather than being a logic in itself. Lady in the Lake’s use of direct address does not have value as an ‘alienation effect’, and the audience quickly adapts to its bizarreness, becoming only intermittently aware of its highly singular visual style. It has been only rarely noted that, a few isolated moments apart, all sound in Lady in the Lake is also point of audition. Rick Altman noted that point of audition is a clumsy ocularcentric term, which seems only a derivative of the point-of-view shot.85 Often this can be less than precise for a diegetic character and Michel Chion pointed this out in his 1994 book on film sound, Audio-Vision: Sound on Screen, where he noted that this ‘point’ should be understood as a general space or zone rather than a point.86 Yet ‘interior sound’ provides a precise structural equivalent of the point-of-view shot, as well as giving a privileged view of the interior of the character’s mind. Richard Raskin distinguishes five different types of subjective interior sound: distorted sounds, inner voice, remembered sound, imagined sound and spoken writing. Interior voices are always closely recorded to sound highly intimate, although they can often be given some electronic reverb to make sure the audience knows this is not within the diegetic space on screen but within the character’s head.87

Yet audiovisual culture often stylistically puts us into a character’s position without explicitly putting us ‘inside their head’. Alfred Hitchcock’s Rear Window (1954) is celebrated not only as an engaging crime drama but as a metaphorical drama of the way that films are consumed by the audience. L.B. Jeffries (James Stewart) is immobile with a broken leg and looks out of his apartment window at the surrounding apartments. He begins to think that he witnessed a murder. Most analyses concentrate on the film’s visual aspects. As Rear Window has a clear restriction on vision (the audience only gets to see what Jeffries gets to see), sound has a potentially different function. The soundtrack seems chaotic through much of the film and yet it is very carefully organized.
Temporal progression is marked through the gradual composing of the song Lisa, and the weave of diegetic sounds from the courtyard serves a sense of realism as well as calling attention to certain events. The audience hears almost everything from Jeffries’ point of view, including the film’s music, as there is no non-diegetic score but instead fragments of music, not only including the song Lisa but also music from radios in different apartments and a slightly mysterious vocal scale being sung by a woman we never get to see.88 As Michel Chion notes, starting the film with the protagonist waking makes it all seem like a dream.89 Rear Window is an exercise in restricted narration, having medium shots and close-ups in Jeffries’ apartment but only long shots of events outside. A momentary anomaly is to show us a far closer shot than we might expect across the courtyard of the key in Lisa’s hand. The film is dominated by point-of-view shots surveying the courtyard. Sound calls attention to things, such as when a dog’s owner laments its mysterious death. The sound is always from Jeffries’ point of view, evident when he sees that Lisa is caught in Thorwald’s apartment, where we can see in a point-of-view long shot that they are speaking but we hear no voices. The same goes for some of the other people Jeffries watches from his window, confirming that sound retains the same subjective point of view as image. Indeed, we hear an array of distant sounds from the other apartments but, as Elisabeth Weis notes, the music that we hear gives the audience useful information about the different characters in the various apartments as we have an absolutely minimal amount of character development for each of them.90 Paramount used the ‘Perspecta’ sound system at this point, an optical sound pseudo-stereo system which allowed mono soundtracks to be diffused in stereo and furnished some sense of space.91 While the soundtrack may sound full and chaotic as we watch the film, listening to it alone, it is clearer that it is far more structured and mixed to place important narrative sounds in the foreground. The soundtrack to Rear Window is a good illustration of the perceptual phenomenon of ‘the cocktail party effect’, whereby we can perceive important things in a mêlée of different sounds. This is evidence for how perception hierarchizes sounds, producing a sense of order out of a cacophony. Indeed, this is clear with film soundtracks and even more clear in the less dense tradition of television drama soundtracks. While Rear Window is a claustrophobic film, with its limited space and vision, the sounds provide a greater sense of space, corresponding to one of the principal evolutionary functions of hearing: that of perceiving beyond the distance we can see.
In the same way that Rear Window’s restricted viewpoint aims to immerse us in Jeffries’ experience and psychology, video games have sought to double our perception. Realism has not only been something that video games have aspired to, but also a term that has a valence in discussions of video games by both designers and game players, while it is also often of paramount importance in their publicity and marketing.92 For a medium that regularly aims at player immersion, a sense of being involved in something close to reality can be crucial for the game’s efficacy. In a more sophisticated approach, an article from Eurogamer notes that ‘realism’ is a term used readily by designers, gamers and journalists, and then goes on to differentiate between “functional realism – [which] relates to how a game behaves as a simulation; … [and] … perceptual realism – [which] relates to how a game looks and sounds as a set of moving sound-images”.93 This is an instructive distinction. Both are necessary but they have an uneven relationship. Many games achieve functional realism and make for strong gameplay but can be less than convincing on the second count. However, games that achieve perceptual realism but lack functional realism are less attractive. Certain video games clearly aim at a sense of both but some specialize in one or the other. Perhaps the clearest examples of the former would be arcade games, where the interface with the game involves a physical part of the game itself. Examples include car or aircraft cockpits, and hand-held implements such as guns, which provide the player with a realistic control as a physiological connection with the activities on screen. Good examples include Namco’s Prop Cycle (1996), where the player must pedal a bicycle to keep a cycle-powered aircraft airborne, or Football Power (1999, Gaelco), where the game players must kick an actual football on a stick to make the avatar players on screen kick the ball.94 Arcade game Sega Rally Championship (1994) was a groundbreaking car racing game which led to the Sega Rally series of games. As a multiplayer car rally game, it could link up to five screens, all with cockpits based on car driving seats with a stick-shiftable gear box, a foot pedal for controlling the acceleration and one for the brakes and a steering wheel. The player could pick a car and pick a view on screen: either a third-person view (from behind the car) or ‘VR View’, which approximated the view from the driving seat of the car. The player was able to choose a ‘World Championship’ track with three successive levels (African desert, South American forests and a Mountain circuit based on Monaco).
Sega Rally Championship aimed for a sense of authenticity as well as a physical feeling of actually driving the car. The car cockpits were certainly an attraction; sitting in a car interface immediately aided a sense of the player perceiving the driving as a convincing operation. Similar combinations of simulated point of view and physical interface included games such as Sega’s Wave Runner (1996) or Namco’s Aqua Jet (1996), jet ski games where the player had to mount a rig and control direction through lateral movement. In the case of Aqua Jet, this was standing on a jet ski and leaning from side to side. The large screen had a subjective view, which was particularly effective at the point where the player makes a large drop into the water and submerges to go through swimming fish. The game’s soundtrack of jaunty and energetic electronic music is altered as the player is submerged, returning to its normal sound and volume upon emerging from the water. This illustrates how far video games, and in particular arcade games, can ignore or see as irrelevant the audiovisual conventions of earlier drama that to some degree they have inherited. The sense of demarcation between the non-diegetic music and the diegesis (the on-screen world of the game) is broken at this point, and the momentary subjective effect of entering the water trumps the integrity of the illusion. This, allied with the sheer physicality of leaning from side to side on the ski platform, makes the gameplay highly immersive and a convincing simulation (including an exciting ‘jump contest’). Jesper Juul describes the game world as ‘half-real’, stating that “To play a video game is to interact with real rules while imagining a fictional world, and a video game is a set of rules as well as a fictional world”.95

Several early arcade games aimed to make the interface as close as possible to actuality. Perhaps the best example of this would be submarine games, such as Midway’s arcade games Sea Wolf (1976) and Submarine (1979), both of which have a physical periscope interface and a point of view towards ships that the player should attempt to sink with torpedoes. This is the origin of the ‘first-person shooter’ (FPS) game, which supplies a subjective point of view, usually with the characteristic hand with a gun in the centre of the screen. This cohered around a number of games in the 1990s, including Wolfenstein 3D (1992), Doom (1993) and Quake (1996). These supplied a very realistic feeling of aiming and shooting things, with the action taking place in a limited space of mostly building interiors in which the player was free to choose where to move. Rather than always following the ‘first-person’ approach, such games often instead use a ‘third-person’ view, where the whole body of the avatar is visible on screen.
This is evident in Resident Evil 4 (2005, Capcom), Dead Space (2008, EA Redwood Shores) and the Silent Hill games (starting 1999, Konami). Massively multiplayer online role-playing games (MMORPGs) regularly offered the choice of first-person or third-person point of view for the player. This is the case in some of Bethesda’s The Elder Scrolls series, including III: Morrowind (2002), IV: Oblivion (2006) and V: Skyrim (2011), and Rockstar Games’ Red Dead Redemption (2010, in later versions) and Red Dead Redemption 2 (2018), PUBG Studios’ PUBG: Battlegrounds ([Player Unknown’s Battle Grounds] 2017) or Aurora’s Ring of Elysium (2018). For the most part, such games retain a functional soundtrack, stratified in a manner similar to mainstream films to fit with distinct locations, although broadly they may seem co-ordinated around the player’s avatar and supportive of the subjective psychology built for the player.

Conclusion

In recent years, the possibilities of digital manipulation in post-production have led to something of a ‘crisis of the real’. While I do not want to discuss at length the way this changes a sense of audiovisual culture’s relationship to reality, it is worth addressing a couple of instances. One returns to a significant moment in film history. As I mentioned at the start of this chapter, the Lumière brothers’ film L’Arrivée d’un train en gare de La Ciotat (Arrival of a Train at La Ciotat, 1896) captured images of a train pulling into a train station in the French coastal town of La Ciotat. In 2020, Denis Shiryaev developed and then posted onto the Internet a version of this film that had been digitally adapted (‘upscaled’) and rendered in 4K and at 60 frames per second.96 This led to a discussion among film historians, some of whom were impressed with the phenomenological effect and potential, while others were either unimpressed or claimed that archive prints were of high quality and audience expectations had been lowered by the prevalence of poor digital copies on YouTube.97 Shiryaev upscaled the film by using artificial intelligence algorithms, which were able to enhance and add further detail to the images, including estimating what was missing from the images. Having never seen a high-quality print of the film, I was impressed with Shiryaev’s version although I was unnerved by the idea that a computer program had guessed what details should be, and no doubt in some cases had guessed wrong.98 Other digitally enhanced versions followed, in some cases as a showcase for certain forms of software.99 No doubt an important element is the effect of the speed of playback at 60fps (frames per second). When the film was shot, films were hand-cranked and usually shot at between 16 and 20fps. This, along with later projection of the film at the wrong speed, is why the motion on early films usually appears fast and jerky.
Today, filmmakers typically shoot film or video at a minimum of 24fps, while 60fps is the standard for high-definition digital video. This is the expectation now and deviation from it registers. Initially, our perception will register activities not looking ‘right’ before our higher-level processes contextualize older films. Yet these films did not have jerky and fast movement when they were released; rather, this effect derives from later changes in the speed of projection.

It may seem obvious to point out certain things, like audiovisual culture being based on perception, yet our tendency, even for analysis, is to forget the frame and often simply to deal with sound and images on screen as if we are encountering actual people and real spaces. This tells us something about the centrality of the process of perception, where we have an overwhelming desire to understand electronic sounds and images as reality. While digital effects may have caused us to doubt sound and images, the ‘reality effect’ prevails to the point where audio and video make excellent courtroom evidence. It is no surprise that the reality effect has dominated film, and at times discussions surrounding it. We take images that have the same shape and details and have the same movement as reality to be something close to reality. Torben Grodal notes the ‘direct drive’ of film and audiovisual culture, where our perceptual faculties process sound and images the same way that they would the actual objects being re-presented as recordings.100 Image and sound regularly double human subjectivity in audiovisual culture through doubling perception, from film’s ‘point-of-view/audition’ implication to the genre of first-person shooter (FPS) videogames. All formats provide at least intermittent first-person perspective or implication of focused subjectivity through direct technological simulation of distinct aspects of human perception in terms of both sound and image. Indeed, audiovisual aesthetics forge subjectivity, both of the characters on screen and the audience. Of course, its configurations match up to perceptual requirements as much as, if not more so than, they do to cognitive expectations. However, this is not far from a simple duplication of sight and hearing. Yet it includes a large degree of approximations and homologies, and even some direct mimicking of human perceptual faculties.
Its relationship is closer to our perceptual faculties than it is to our relationship with reality, assuming that these are not quite the same thing. Culture is a form of evolution, where environmental changes are far too rapid for any developments in human physiology to keep up, and adaptation has taken place immediately around the human form. While technology may be clear in its function, culture is less so, but no doubt helps us cohere with our surroundings as much as it may also offer a separate environment for humanity, outside the everyday. This suggests that culture and the media are like a hat to fit our heads, a helpful extra built around our perceptual requirements, and not in a banal way. It would suggest that audiovisual culture works around not only our senses but the way that our senses work, to play to aspects of seeing and hearing as well as to the ways that the two are merged and ordered as part of the perceptual process. This is not far from Rudolf Arnheim’s suggestion that the effectiveness of film came from the closeness but also the deviation from the human senses, exploiting an overemphasis and exaggeration of perception as well as manipulating the processes of perception themselves.101 That the media are a glove-to-our-senses is testified to perhaps most clearly in the replacement of ‘hi-fidelity’ sound and its conception (prominent during the era of analogue sound) with psychoacoustics, which are not built around any scientifically objective measures of sound reproduction but rather around producing what human hardware processes most readily (which is dominant in contemporary digital culture, with its filter shelves and compression). While in this manner digital culture has accelerated audiovisual culture’s harmonization with human perception, finally, in the second decade of the twenty-first century, virtual reality (VR) has been moving beyond the prototype stage. VR has adopted almost all the visual and sonic conventions of audiovisual culture that developed with synchronized film nearly a century earlier. Audiovisual objects exploit the ambiguities of perception, and points where a synergy of image with sound and music is most clear. Contemporary audiovisual culture is based not simply on the illusion of movement but more crucially on the illusion and effect of sound and image being merged into a coherent whole.

Notes

1. Tom Gunning, “An Aesthetic of Astonishment: Early Film and the (In)credulous Spectator” in Art and Text, vol. 34, Spring 1989, p. 114.
2. Stephen Bottomore gives a little credence to the stories but concludes that they were massively exaggerated. “The Panicking Audience? Early Cinema and the ‘Train Effect’” in Historical Journal of Film, Radio and Television, vol. 19, no. 2, 1999, p. 201.
3. Gunning, op.cit., 1989, p. 118.
4. Ibid., pp. 114–115.
5. Torben Grodal, “The PECMA Flow: A General Model of Visual Aesthetics” in Film Studies, issue 8, Summer 2006, p. 7.
6. There are, of course, questions about objectivism and the notion that reality is coherent and unified, as there are about the subjectivity and coherence of the perceiver. It would not be practical here to get into philosophical discussion of these, or to deepen discussions about consciousness and human psychology.
7. Hugo Münsterberg, The Photoplay: A Psychological Study (New York: D. Appleton and Co., 1916).
8. Gregory Currie, “Film, Reality, and Illusion” in David Bordwell and Noël Carroll, eds., Post-Theory: Reconstructing Film Studies (Madison, WI: University of Wisconsin Press, 1996), p. 325.
9. André Bazin, “The Ontology of the Photographic Image” in What is Cinema? Volume 1 (London: University of California Press, 2005), p. 10.
10. André Bazin, “Cinematic Realism” in Thomas Wartenberg and Angela Curran, eds., The Philosophy of Film: Introductory Text and Readings (New York: Wiley, 2004), p. 68.
11. Roland Barthes’ influential essay about literature, ‘The Reality Effect’, focuses on the importance for fiction of minute and seemingly extraneous detail (‘Flaubertian description’). This forms the basis for Murray Pomerance’s recent edifying discussion of realism and the cinema. Roland Barthes, “The Reality Effect” in The Rustle of Language, trans. R. Howard (Berkeley, CA: University of California Press, 1989); Murray Pomerance, The Eyes Have It: Cinema and the Reality Effect (New Brunswick, NJ: Rutgers University Press, 2013).
12. Bazin, op.cit., 2004, pp. 64, 67.
13. André Bazin, “The Ontology of the Photographic Image” in André Bazin, What Is Cinema?, Volume 1, translated by Hugh Gray (Los Angeles, CA: University of California Press, 1967 [f.p.1945]), p. 15.
14. Discussed by Siegfried Kracauer in Chapter 4 of Theory of Film: The Redemption of Physical Reality (Oxford: Oxford University Press, 1960), pp. 60–76.
15. Ibid., p. 36.
16. Ibid., p. 36.
17. Bazin, op.cit., 2004, p. 67.
18. Ian Aitken notes that, among other ways, realism can be understood ‘in terms of technique, convention and expectation’. The Major Realist Film Theorists: A Critical Anthology (Edinburgh: Edinburgh University Press, 2016), p. 6.
19. Other celebrated long take sequences include one in Robert Altman’s The Player (1992) where characters talk about editing speeds and MTV, and the eight-minute tracking shot alongside a traffic jam in Jean-Luc Godard’s Week End (1967).
20. Lea Jacobs and Richard De Cordova, “Spectacle and Narrative Theory” in Quarterly Review of Film Studies, vol. 7, no. 1, Fall 1982, p. 300.

21. This seems partly derived from R. Murray Schafer’s The Tuning of the World (London: Random House, 1977). Theo van Leeuwen, Speech, Music, Sound (Basingstoke: Macmillan, 1999), p. 15.
22. Gerald Mast, Film/Cinema/Movie: A Theory of Experience (New York: Harper and Row, 1977), p. 216.
23. Christian Metz, “Aural Objects” in Yale French Studies, no. 60, Special Issue “Cinema/Sound”, 1980, p. 29.
24. Theo van Leeuwen, Speech, Music, Sound (Basingstoke: Macmillan, 1999), p. 17.
25. Mark Kerins, Beyond Dolby (Stereo): Cinema in the Digital Sound Age (Bloomington, IN: Indiana University Press, 2010).
26. “Alan Blumlein and the Invention of Stereo” at EMI Archive Trust. https://www.emiarchivetrust.org/alan-blumlein-and-the-invention-of-stereo/ [accessed 20/5/2022].
27. James Lastra notes how John Culshaw’s stereo recording of Wagner’s Das Rheingold in 1958 was important for convincing people of stereo’s abilities as it aimed its sound at home consumption rather than duplication of the stage performance. James Lastra, “Film and the Wagnerian Aspiration: Thoughts on Sound Design and History of the Senses” in Elisabeth Weis and John Belton, eds., Film Sound: Theory and Practice (New York: Columbia University Press, 1985), p. 135.
28. Chris Korff, “An Introduction to Binaural Recording: Use Your Head” at Sound on Sound, April 2021. https://www.soundonsound.com/techniques/introduction-binaural-recording [accessed 20/3/2022].
29. “Welcome to the World of Holophonics”. http://www.acousticintegrity.com/acousticintegrity/Holophonics.html [accessed 20/3/2022].
30. “Ambisonics Explained: A Guide for Sound Engineers” at Waves, October 10, 2017. https://www.waves.com/ambisonics-explained-guide-for-sound-engineers [accessed 20/5/2022].
31. For instance, aimed at VR, Sennheiser’s Ambeo Smart Headset allows for easy binaural recording. Anon, “Sennheiser’s short film shows the power of binaural audio” at The NextWeb. 17 September 2018. https://thenextweb.com/news/sennheisers-short-film-shows-the-power-of-binaural-audio [accessed 11/5/2022].
32. Film dialogue is recorded with shotgun or hypercardioid microphones which are highly focused, whereas binaural recordings use omnidirectional microphones, which would pick up extraneous ambient sounds that are aimed to be avoided, losing a sense of focus and clarity. Of course, we are used to a situation in feature films where most sounds are either enhanced or added in post-production.
33. Rick Altman, “Sound Space” in Rick Altman, ed., Sound Theory Sound Practice (London: Routledge, 1992), pp. 58, 60.

34. John Belton, “1950s Magnetic Sound: The Frozen Revolution” in Rick Altman, ed., Sound Theory Sound Practice (London: Routledge, 1992), pp. 161–162.
35. In the 1990s, IMAXes such as the one at the National Museum of Film and Television at Bradford in the UK regularly included 3-D as part of their spectacle.
36. One of the most influential theories of ideology and culture, Louis Althusser’s notion of ‘interpellation’, involves the subject being ‘hailed’ and thus placed by different discourses, a metaphorical process which suggests sound. Louis Althusser, “Ideology, and Ideological State Apparatuses” in Lenin and Philosophy and Other Essays, translated by Ben Brewster (London: Verso, 1971), p. 11.
37. Raymond Williams, “A Lecture on Realism” in Screen, vol. 18, issue 1, Spring 1977, p. 63.
38. Jordan Raup, “Watch: Lars von Trier Explains His Dogme 95 Manifesto In 1998 Documentary” in The Film Stage, 18 February 2014. https://thefilmstage.com/watch-lars-von-trier-explains-his-dogme-95-manifesto-in-1998-documentary/ [accessed 20/06/2020].
39. Ian Aitken, The Major Realist Film Theorists: A Critical Anthology (Edinburgh: Edinburgh University Press, 2016), p. 2.
40. Christian Metz, Film Language: A Semiotics of Cinema. Translated by Michael Taylor (Oxford: Oxford University Press, 1974).
41. Colin MacCabe, “Realism and Cinema: A Note on Some Brechtian Theses” in Tony Bennett, Susan Boyd-Bowman, Colin Mercer and Janet Woollacott, eds., Popular Television and Film (London: BFI, in association with the Open University, 1981).
42. Colin McArthur, “Days of Hope” in Tony Bennett, Susan Boyd-Bowman, Colin Mercer and Janet Woollacott, eds., Popular Television and Film (London: BFI, in association with the Open University, 1981), p. 307.
43. Ibid., pp. 308, 309.
44. Colin MacCabe, “Days of Hope: A Response to Colin McArthur” in Tony Bennett, Susan Boyd-Bowman, Colin Mercer and Janet Woollacott, eds., Popular Television and Film (London: BFI, in association with the Open University, 1981), p. 312.
45. Colin MacCabe, “Memory, Phantasy, Identity: Days of Hope and the Politics of the Past” in Tony Bennett, Susan Boyd-Bowman, Colin Mercer and Janet Woollacott, eds., Popular Television and Film (London: BFI, in association with the Open University, 1981), p. 315.
46. Troy Kennedy Martin, “Nats Go Home” in Encore, no. 48, March–April 1964, pp. 24–25.
47. Ibid., p. 28.

3  PERPETUAL REALISM: MEDIATING FANTASY AND REALITY 


48. John McGrath, “TV Drama: The Case Against Naturalism” in Sight and Sound, vol. 46, issue 2, 1977, reprinted in Bob Franklin, ed., Television Policy: The MacTaggart Lectures (Edinburgh: Edinburgh University Press, 2005), p. 38. John Caughie includes a section called ‘Boring Naturalism’ in his article “Progressive Television and Documentary Drama” in Tony Bennett, Susan Boyd-Bowman, Colin Mercer and Janet Woollacott, eds., Popular Television and Film (London: BFI, in association with the Open University, 1981), pp. 327–352.
49. McGrath, op. cit., 1977, p. 36.
50. Raymond Williams, “A Lecture on Realism” in Screen, vol. 18, issue 1, 1 March 1977, pp. 61–74.
51. David Bordwell, Narration in the Fiction Film (Madison, WI.: University of Wisconsin Press, 1985), pp. 156–204.
52. James E. Cutting and Ayse Candan, “Shot Durations, Shot Classes, and the Increased Pace of Popular Movies” in Projections, Volume 9, Issue 2, Winter 2015: 40–62. pp. 40–41.
53. David Bordwell, “Intensified Continuity: Visual Style in Contemporary American Film” in Film Quarterly, vol. 55, no. 3, Spring 2002, pp. 16–28.
54. Ibid., pp. 16–21.
55. Ibid., p. 17.
56. Jeff Smith, “The Sound of Intensified Continuity” in John Richardson, Claudia Gorbman, and Carol Vernallis, eds., The Oxford Handbook of New Audiovisual Aesthetics (New York: Oxford University Press, 2013), pp. 353, 338.
57. https://www.hse.gov.uk/enforce/enforcementguide/court/physical-sound.htm [accessed 20/5/2022].
58. It is a total of 26.6 seconds, exposing 486 frames of standard 8 mm Kodachrome II safety film, running at an average of 18.3 frames/second.
59. Analyses of it have abounded. An interesting one with some insights is “Sound Designer Leo Chaloukian on his analysis of the JFK assassination tape” on YouTube (interviews.televisionacademy.com). www.youtube.com/watch?v=LA4zSZC9mIQ [accessed 8/4/2022].
60. For further discussion, see the documentary Image of an Assassination: A New Look at the Zapruder Film (1998); David Wrone, The Zapruder Film: Reframing JFK’s Assassination (Lawrence: University Press of Kansas, 2003) and Alexandra Zapruder, Twenty-Six Seconds: A Personal History of the Zapruder Film (New York: Twelve, 2016).
61. Anon, “25 Films Added to National Registry” in the New York Times, 15 November 1994. https://www.nytimes.com/1994/11/15/movies/25-films-added-to-national-registry.html [accessed 20/05/2018].
62. Clay Risen, “George Holliday, Who Taped Police Beating of Rodney King, Dies at 61” in the New York Times, 22 September 2021.


https://www.nytimes.com/2021/09/22/us/george-holliday-dead.html [accessed 5/10/2021].
63. Anon, “Teen who filmed George Floyd’s murder given journalism award” at BBC News, 11 June 2021. https://www.bbc.co.uk/news/world-us-canada-57449229 [accessed 1/10/2021].
64. Brian Winston, “How the Myth Was Deconstructed”, Wide Angle, Volume 21, Number 2, March 1999, pp. 71–86.
65. The band British Sea Power made a new soundtrack to the film by masking Flaherty’s original, which was enabled by the semi-disconnected status of the film and its sound. Cf. K.J. Donnelly, “Irish Sea Power: a New Version of Man of Aran (1934/2009)” in Holly Rogers, ed., Music and Sound in Documentary Film (New York: Routledge, 2015), pp. 137–150.
66. K.J. Donnelly, Occult Aesthetics: Synchronization in Sound Film (New York: Oxford University Press, 2014), pp. 181–3.
67. For a more detailed discussion, see K.J. Donnelly, Magical Musical Tour: Rock and Pop in Film Soundtracks (New York: Bloomsbury, 2015), pp. 70–71.
68. Lev Manovich, ‘What is Digital Cinema?’, p. 8. http://manovich.net/content/04-projects/009-what-is-digital-cinema/07_article_1995.pdf
69. Jean Baudrillard, “The Precession of Simulacra” in Brian Wallis and Marcia Tucker, eds., Art After Modernism: Rethinking Representation (New York: New Museum of Contemporary Art, 1984), pp. 253–281 (p. 256).
70. Stephen Prince, “True Lies: Perceptual Realism, Digital Images, and Film Theory” in Film Quarterly, vol. 49, no. 3, Spring 1996, pp. 27–38.
71. K.J. Donnelly, “How Far Can Too Far Go? Radical Approaches to Silent Film Music” in K. J. Donnelly and Ann-Kristin Wallengren, eds., Today’s Sounds for Yesterday’s Films: Making Music for Silent Cinema (New York: Palgrave, 2016), p. 10.
72. “We’re on top of the world: Emirates Airline” at YouTube. https://www.youtube.com/watch?v=uQHhYRuaEtM [accessed 18/9/2021].
73. Alan Granville, “This is How that Amazing Anxiety-Inducing Airline Advert Was Made” in Stuff, 10 August 2021. https://www.stuff.co.nz/travel/news/126021186/this-is-how-that-amazing-anxietyinducing-airline-advert-was-made [accessed 17 September 2021].
74. Stephen Prince, Digital Visual Effects in Cinema: The Seduction of the Real (New Brunswick, NJ: Rutgers University Press, 2012), p. 4.
75. Lisa Bode, “‘It’s a Fake!’: Early and Late Incredulous Viewers, Trick Effects, and CGI” in Film History, vol. 30, no. 4, Winter 2018, pp. 1–21.


76. Melissa, “That Jungle Sound” at The Sound and the Foley, 30 May 2013. http://soundandthefoley.com/2013/05/30/that-jungle-sound/ (accessed 10/9/2021).
77. Melissa, “Of Tarzan and Kookaburras” at The Sound and the Foley, 27 August 2013. http://soundandthefoley.com/2013/08/27/of-tarzan-and-kookaburras/ (accessed 10/9/2021).
78. Highly stereotypical animal sounds abound in film and television, with all big cats sounding like lions and all seabirds like herring gulls, for example.
79. Indeed, there is an apocryphal story I was once told about Chuck Norris wanting actual sounds of punching in one of his films, but ultimately he was disappointed as the end results were unusable and had to be replaced with the stock library sounds that gave a conventional effect.
80. See further discussion in Jason Jacobs, “Gunfire” in Jose Arroyo, ed., Action Spectacle Cinema: A Sight and Sound Reader (London: BFI, 2000); Gwyn Symonds, The Aesthetics of Violence in Contemporary Media (London: Continuum, 2008).
81. Hanns Eisler and Theodor Adorno, Composing for the Films (London: Athlone, 1994 [f.p.1947]), pp. 75–76; K. J. Donnelly, “The Ghostly Effect Revisited” in Ron Sadoff, Miguel Mera and Ben Winters, eds., The Routledge Companion to Screen Music and Sound (New York: Routledge, 2017), pp. 17–25.
82. Another example of this is the club sequence in David Lynch’s Twin Peaks: Fire Walk With Me (1992), where, rather than hearing the characters talking, we have a sound perspective that would be closer to the characters in the diegesis, and hear almost exclusively the club’s loud diegetic music.
83. This brief burst of whistling appears to be the Scottish song Comin’ Through the Rye.
84. Angela Dalle Vacche, André Bazin’s Film Theory: Art, Science, Religion (New York: Oxford University Press, 2020), p. 13.
85. Rick Altman, “Sound Space” in Rick Altman, ed., Sound Theory Sound Practice (London: Routledge, 1992), p. 60.
86. Michel Chion, Audio-Vision: Sound on Screen (New York: Columbia University Press, 1994), p. 90.
87. Richard Raskin, “Varieties of Film Sound: A New Typology” from (Pré) Publications, no. 132 (Romansk Instituet, Aarhus Universitet), April 1992, pp. 32–48, quoted in David Sorfa, “Seeing Oneself Speak: Speech and Thought in First-Person Cinema” in JOMEC Journal, 2019, pp. 104–121. p. 104, ff2.
88. Bing Crosby’s recording of To See You is to Love You accompanies ‘Miss Lonelyhearts’ pretending she has a dinner guest.


89. Michel Chion, “Alfred Hitchcock’s Rear Window: The Fourth Side” in John Belton, ed., Rear Window (Cambridge University Press, 2000), pp. 110–117. p. 112.
90. Elisabeth Weis, The Silent Scream: Alfred Hitchcock’s Soundtrack (Rutherford, NJ.: Fairleigh Dickinson University Press, 1982), p. 109.
91. John Belton, “Introduction: Spectacle and Narrative” in John Belton, ed., Rear Window (Cambridge University Press, 2000); Eric Dienstfrey, “The Myth of the Speakers: A Critical Reexamination of Dolby History” in Film History, vol. 28, no. 1, 2016, pp. 167–193, p. 176; Nathan Platte, “Postwar Hollywood, 1947–1967” in Kathryn Kalinak, ed., Sound: Dialogue, Music, and Effects (New Brunswick, NJ: Rutgers University Press, 2015), p. 70.
92. Paul Martin, “Realism in Play: The Uses of Realism in Computer Game Discourse” in Dirk Göttsche, Rosa Mucignat and Robert Weninger, eds., Landscapes of Realism: Rethinking Literary Realism in Comparative Perspectives, Vol. I (Amsterdam: John Benjamins, 2021), pp. 715–733.
93. ‘Deleted user, 2010’, “How Does Realism Effect Game Play?” Eurogamer. www.eurogamer.net/forum/thread/181582 [accessed 22/7/2019].
94. Indeed, the realism of this game was such that on a number of occasions, in kicking the ball, I managed to sustain a foot injury.
95. Jesper Juul, Half-Real: Video Games Between Real Rules and Fictional Worlds (Cambridge, MA: MIT Press, 2011), p. 15.
96. Zack Sharf, “Lumière Brothers’ 1895 Short ‘Arrival of a Train’ Goes Viral with Fan-Made 4K Restoration” at Indiewire, 5 February 2020. https://www.indiewire.com/2020/02/lumiere-brothers-arrival-of-a-train-4k-update-1202208955/ [accessed 10/2/2020].
97. Initially I spoke to my colleagues Michael Williams, Malcolm Cook and Mike Hammond, which broadened my view of the film; some scholars were also unhappy with the details Shiryaev provided.
98. Original video was processed with deep learning algorithms to achieve modern look and quality. Gigapixel AI from Topaz Labs for upscaling to 4K, Dain for adding the missing frames, and DeOldify to add colour: https://www.youtube.com/watch?v=3RYNThid23g (accessed …). https://www.reddit.com/r/videos/comments/eyoxfb/oc_i_have_made_60_fps_4k_version_of_1896_movie/.
99. For instance, the one at ‘Olden Days’. https://www.youtube.com/watch?v=U6GI6dg266E [accessed 4/3/2022].
100. Torben Grodal, “The PECMA Flow: A General Model of Visual Aesthetics” in Film Studies, issue 8, Summer 2006, p. 3.
101. Rudolf Arnheim, Art and Visual Perception: A Psychology of the Creative Eye (Los Angeles: University of California Press, 1974), pp. 10–12.

CHAPTER 4

Mediating the Psychological and the Physiological

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_4

There appears to be a ‘gap’ between human physiology, including the brain and nervous system, and the current environment, which is far different from the one which humans seemingly evolved to ‘fit’. Culture appears to mediate this gulf, and this chapter will address the way audiovisual culture openly mediates between ‘reality’ and ‘imagination’, the physiological and the psychological, in perceptual, aesthetic and narrative terms.1

Following the chapter title, I wonder how audiovisual culture ‘mediates’ between the physiological and the psychological. What is the relationship between these two aspects of being human? Evolutionary Psychologists tend to think of the relationship as direct, with both following the same patterns. One aspect addressed in this chapter is the strategy of ‘toggling’ between physical and mental, which highlights the relationship between the two and poses a particular relationship: one of alternation, of one more exterior and the other interior and more negative. Is this ‘mediating’ process a dialectic? Does it hold the two things in a tense relationship? Held in suspension? Or should it be more profitably thought of as a process that ‘translates’ between the two states? Not necessarily as an ‘intermediary’ as such, or as making a direct and straightforward connection between the two, but perhaps as something in between that embraces but also helps to define both terms.

I will focus on films that oscillate between two radically different worlds, which are heavily demarcated in a few ways. Such diegeses often contain a strong physical and psychological


demarcation between the two worlds, which have a strong physical and perceptual interplay. Much culture sets up a state of ambiguity and some confusion between ‘real’ and interior states of mind. We approach diegetic worlds as on some level being ‘real’, aided by the so-called suspension of disbelief. Indeed, much culture, such as literature and film, is premised upon making fantasy seem real. On the reverse side of this, we can become confused by manifestations of psychological ‘inner states’ in a seemingly ‘realistic’ format. In some diegeses, two different worlds can be present and set apart, which precisely illustrates the process of formulating constructions of ‘fantasy’ and ‘reality’ within the same space. These can play to or across the traditional tendency for audiovisual culture to use sound for interior states of mind and image for more objective situations. Mediating human physiological and psychological aspects, audiovisual culture also mediates ‘reality’ and ‘imagination’ for us, as well as potentially bridging what evolutionary psychology has posed as the ‘gap’ between the environment for which our brains and physiology developed and the very different modern world.2

Bridging ‘The Gap’

For ‘adaptationists’, such as Cosmides and Tooby, and Pinker, human physiological form has remained the same for thousands of years and our minds have basically remained in the same form, too.3 They contend that our modern skulls contain a Stone Age mind. The remarkable developments that humanity has wrought since the Stone Age are not due to evolution but to the powers of the human brain that were in place at that time. The architecture of the human brain evolved into an amalgam of modules with different specialized micro-functions that can work together following different paths to solve problems. It is this seemingly infinite possibility of recombination of functions and approaches that has enabled humanity’s trajectory from ancient to modern life. Indeed, perhaps this is the reason why physiological evolution appears to have stopped. Radical changes in environment might have been negotiated without a need for physiological changes, through changes instituted by thought and culture. However, this nevertheless poses a ‘gap’ between the modern world and the physiological form of human beings, which reached its current form during the Pleistocene era, which ended around 12,000 years ago. By then, Homo sapiens existed in its current form. This point is known to


adaptationists as the ‘environment of evolutionary adaptedness’ (‘the EEA’). This appears to be a static condition where human physiology and brain structure were fixed, sometime in the past 200,000 years or so. The term EEA addresses the ancestral environment to which a species is adapted and describes the conditions and properties of the habitat in which evolutionary adaptations took place, and this applies to the format of the human mind as much as to any other human physiological characteristic. According to Steven Pinker, “The mind is a system of organs of computation, designed by natural selection to solve the kinds of problems our ancestors faced in their foraging way of life, in particular, understanding and outmanoeuvring objects, animals, plants, and other people”.4 In Biophilia, biologist E.O. Wilson discussed ‘the Savannah Hypothesis’, which understands human beings as evolving for and being precisely suited to a life of hunter-gathering on the East African Savannah. He noted that aspects of this environment remain in human desires and motivations.5 He suggests that quite general tendencies remain, and shape current human activities. However, Leda Cosmides and John Tooby theorized the EEA in detail, and noted that this was not a particular location or habitat, nor simply a specific type of terrain and ecosystem, but a conglomeration of aspects ancient man would have encountered.6 Yet these pose an interesting question. If human physiology, the brain included, has not adapted since that time, there is a clear difference between the world for which humans evolved and current environments, the vast majority of which are radically different. Indeed, even the last century has changed the habitats of most human beings beyond all recognition. So, this suggests the likelihood of a gap between mind and world, and perhaps one that has been increasing steadily in magnitude.
Evolution is the result of extremely long-term adaptation of organisms to their environment. Changes in environment will cause slow change in a population to match the requirements of that environment’s particularities. This is all well and good, but it appears from archaeological evidence that in physiological terms human beings have hardly changed in hundreds of thousands of years. Environments have changed radically, particularly over the last few hundred years. This poses an evolutionary mismatch, where environmental change can happen quickly and radically while organisms such as people can take far longer to catch up. Sometimes this is called the ‘Mismatch Theory’, posing a significant ‘gap’ between the human form and the requirements of the world it inhabits.7 Nathan Cofnas notes that this ‘evolutionary mismatch’ between environment and


organism has become a primary notion in evolution-informed psychology and medicine, and indeed is understood as having a significant impact on human health.8 ‘Maladaptation’ describes the situation where, due to rapid changes in the environment, the evolved characteristics are a poor fit, as the environment the organism has evolved to match has gone. This leads to drops in population numbers in most species, but in humans it leads to phenomena such as obesity due to abundant high-calorie food and addictive behaviour (not just drugs but alcohol, food and gambling). Human psychology at times might seem ill-fitted to our current society; indeed, sometimes it can seem extremely self-destructive and perhaps even degenerate. According to EP, it fitted the Pleistocene perfectly.9 The gap between our physical characteristics and modern life is probably most clear in terms of physiology.10 It looks likely that ancient Homo sapiens were athletic, which aided hunting, and lived relatively short lives, meaning that tribes or groupings were not burdened heavily with old and ageing members. Also, at the time, there were limited problems which needed solving. While many physiological aspects might be assigned to idealized hunter-gatherer Adam and Eve figures, the innovative approach of EP is to classify our brains as physiological objects. Consequently, they are limited by the environment in which they initially developed as much as the rest of our bodies. Their architecture and processes have become better understood in recent decades and appear to follow a number of basic activity patterns that are added together to complete complex tasks. However, as we are well aware but few are willing to acknowledge, there are many Homo sapiens who are quite happy to live their lives without stretching their mental capabilities, while others with the same physical brain hardware can push back the frontiers of knowledge as research scientists or inventors.
So, there is no reason to conclude that the human brain needs to be involved in complex problem-solving. An approach that does not privilege the brain but rather sees it as an integral part of the main organism suggests that it is perfectly clear that for human beings, life can be lived in a manner that minimizes the requirement for complex brain activity. Indeed, the limitations/horizons of the human physical form have decided how people live, and retreating into being a lazy ‘couch potato’ is not a corruption of human physicality but an option within its range of possibilities.11 Indeed, (over)pursuit or abuse of actions that stimulate the release of the pleasure-generating neurotransmitter dopamine arguably short-circuits its original function.


Perhaps the best example of this maladaptation is the desire to eat high-calorie foods, which would work well for a hunter-gatherer where food was potentially scarce. However, in the modern developed world, this urge can lead to obesity and a state where the person is less able to deal with perpetuating existence. Some might argue that this is a form of self-destruction, and certainly taken to extremes can be. Certain issues that beset people in many countries might be understood in the light of the ‘gap’ and an inability for people to manage to mentally bridge that gap: for instance, obesity, anorexia, self-harming, violence to others, drug and other addictions and so on. Of course, other reasons are no doubt important or even totally defining. My point is that we can rethink such issues in the light of EP, and this doesn’t have to be accepted as the definitive answer on the matter, merely a shifting of perspective that might provide a new insight or even make us think just slightly differently about things.

One vivid illustration of the gap is in the work of Mark Hanson and Peter Gluckman, whose review of the evolution of puberty found that Palaeolithic girls arrived at first menstruation (menarche) between the ages of 7 and 13 years. This is a similar age to contemporary girls, which suggests that this is the evolutionarily determined age of puberty in girls. It has stabilized at around this age since better conditions and nutrition brought it down from a higher age in earlier centuries. They note that in Palaeolithic times, girls would have reached a maturation sufficient to function as an adult hunter-gatherer. The issue today is that girls do not reach that level of mental maturation due to the complexities of modern living, at least in direct comparison with activities in the old Stone Age.12 Indeed, modern life appears dominated by stress and mental illness in the global north, with the threat of the onset of dementia in later life.
The dominance of culture suggests a question: is culture in general an agent, perhaps a major one, in helping humans negotiate the evolutionary gap? Some might suggest that human social organization is an adaptation, and a direct way of dealing with the gap between human physiological affordances and limitations, and the rapid changes to the environment. While some of these changes have been engineered to meet human needs, others emanate from that—such as polluted and poisonous environments, massive population densities, poverty and depleted natural resources. Culture and society clearly have not adapted sufficiently. There are constant issues of mismatch. An idealised notion of a ‘state of nature’ would have a certain stability in terms of social structure of males producing offspring as well as those electing or not expecting to mate. Western European societies, for


example, now largely set up an expectation of males to have the opportunity to produce children should they so wish, even if that opportunity is not readily on offer. Similarly, while fairly convincing arguments might be made about the increasing prevalence and acceptability of gay and lesbian sexuality as a natural response to human overpopulation, some societies persist in demonizing it or making it illegal. It seems reasonable to assume that human beings were well adapted to earlier life situations and indeed, some people abandon modern life in an attempt to rebalance and better ‘fit’ the world. Similarly, Evolutionary Psychiatry is premised upon the principle of the evolutionary mismatch and sees mental health issues as having environmental causes that are exacerbated by mental machinery that is not geared towards the modern world, and an inherited drive that is not aimed at long-term health.13 Taking it that there is a mismatch between human physiology, which clearly worked very well in hunter-gatherer scenarios, and current environments, in which it often performs less well, is not an outrageous premise. Indeed, it appears utterly unreasonable to assume that there is no gap. This is the EEA’s bequest to us. The next question is why should we assume that the human mind retains the limitations from the EEA, too? There is a large question here, relating to how far the abilities of the human brain have transcended their physiological origins. Some might point to humanity’s achievements and how far these are from hunting on the savannah. However, the horizons of the human brain appear to have remained the same, even if it had few chances to demonstrate its capabilities in the deep past. Physiologically speaking, human brains appear to be the same in terms of size and makeup. Environment has certainly changed far more rapidly than the human form, and not least due to human interventions.
Indeed, as Dave Robson notes in New Scientist, over a period of a million years, the human brain expanded at an increasingly rapid rate, and then, 200,000  years ago, the expansion abruptly stopped.14 Modern popular science appears to have overlooked this in order to maintain that we are at the pinnacle of our evolution. Of course, the brain may have carried on evolving internally, in terms of its connections and abilities.15 Yet its size, structure and composition appear to be the same as it was a couple of hundred thousand years ago. Paralleling this, it is incontrovertible that the human environment has changed exponentially during that same period. Explanations that ignore the primary physical level are limiting their dimensions and validity. As EP would suggest, the gap is evident in the


way our physiology developed for a different environment, and equally, our basic brain activity and psychology developed in the same way.16 Of course, they have had to adapt, and have encouraged adaptation as times have changed radically over thousands of years. Some might object that human beings are adapted to the modern world, yet I would suggest we are not adapted particularly well. It is undeniable that, in my immediate environs and repeated constantly in the media, there is a sense that modern life is overshadowed by stress and issues with mental health, as part of a pervasive feeling that society has followed a path that has made it sick, one way or another. Technology and culture have achieved much in attempting to make that adaptation better. Some might disagree. Perhaps the fact that we humans are so unaware of this ‘gap’ is testament to human culture’s ability to bridge it at least partly.17

Mediating: Physiological Reality, Psychological Fantasy

Audiovisual culture is not simply a useful ‘dry run’ for the real world but in fact bridges the gap between human faculties that evolved for a very different world of hundreds of thousands of years ago (the Pleistocene era), and the requirements of contemporary living. Indeed, the key to Evolutionary Psychology is that our mental and perceptual hardware developed for a situation millennia ago, and addressing this can give us the answers to certain current dispositions of the mind. However, much in audiovisual culture is about compensation or closing a sense of gap between the modern world and antiquated human hardware. I would suggest that audiovisual culture is geared precisely toward this. Not only does it offer certain experiences that sometimes resemble subsistence hunting and desperate basic survival, but it also exploits visual and hearing dispositions founded upon fundamental fears and excitements. I would suggest that we can learn perhaps more from articulation than story, in other words, less from narratives (cf. psychoanalytic and cognitive interpretations) but more from human perception and the proclivities of the audiovisual as a medium. Culture has an ambiguous position in society, with often simplistic answers offered as to why it can be so effective and at times appear so important. These can be platitudes along the lines of ‘it’s only entertainment’ for idle moments, or ‘art’ embodies something of the essence of the human spirit. Culture


might well do something significant to bridge this gap between our human hardware and the requirements of the modern world that vary so radically from the earliest instances of Homo sapiens, whose bones appear almost identical to those of modern people. Indeed, as I have noted, there seems to be conclusive evidence that physically human beings have not developed since that time, although healthier ways of living and less violence have allowed for increased height and age. The dramatic expansion of the human brain before the Pleistocene is not evident since, with the brain cases of human skulls appearing identical. One might almost suggest that the degree to which humanity has had to adapt and develop new technology is a testament to the limitations of human physiology and brain hardware. Technology, in a broad sense, has allowed affordances that can be both physiological (transport allows us to move distances more quickly and easily) and psychological (we are able to know more people, certainly with the development of social media after the Millennium). When thinking about evolutionary adaptation, physical aspects of the human body are the most immediate to come to mind. The psychological advantages supplied by audiovisual culture are perhaps less unequivocal but are also potentially highly significant. A perspective from EP suggests that culture can provide us with a wider view and experience, as well as a wider range of emotions. It makes sense that culture exists for some reason, if it isn’t just to fill up time when people aren’t busy or when they might be forced to sit and think about things. It is not unreasonable to imagine that culture would, both directly and indirectly, address the way human beings relate to their world.
Indeed, I would argue that this is highly evident in culture, both in the stories and subject matter and, perhaps more significantly, in the modes of address as well as the aesthetic formulations that enable the interaction of culture and people on a primary and psychological level. Sometimes, we can reduce complex and heavily textured culture into simple statements of communication through our analysis. This downplays and underestimates the ‘wholes’ of cultural objects in what can be a gross simplification of the complex signal into a simply parsed message, rendering the complexities as ‘window dressing’. And this also underestimates the value of window dressing as well as reducing the complexities and sophistication evident in even the most crude cultural objects. So, how might audiovisual culture work to bridge the gap between our physiological makeup and the rapidly changing environment? By appealing very directly to perception, the level of unconscious activity that we all


share. This is not some Freudian appeal to what we have repressed but rather an appeal to our primary physiological selves, and the level that is at the heart of all of us but is almost never registered. As such, it is in a position to have significant impact. It also works by appealing to our imaginations. The imagination not only allows the ‘suspension of disbelief’ that makes the fantasies of audiovisual culture work, but it also allows us to rethink our relationship with the world and other people, as well as our sense of self. Furthermore, audiovisual culture establishes a sense of a ‘consensus’ representation of the world as an ‘objective reality’. As part of this, there is a sense that we live in a shared reality, which is constructed for us largely through audiovisual convention and our ‘recognition’ of that rather than necessarily any recognition of the real world. As an adaptation, audiovisual culture allows us more of the same. We can experience more people, events and places, extending ourselves and knowing more things than our limited brains evolved for. It also works, as social psychologists have noted, through behavioural models. Indeed, these were held up as a way to criticize culture, and particularly audiovisual culture, as something that supplied negative models of behaviour and made violence (for example) seem more acceptable through repetition, normalization and ultimately a desensitization of the audience. Of course, we could turn this interpretation around and provide a functional view, that audiovisual culture is not ‘diverting’ humanity from its path but instead is aiding audiences to come to terms with and confront issues that are immediate and urgent. Certainly, some violent films can make the audience reflect on the nature of violence, while perhaps in some cases offering models for dealing with violent situations.
It makes sense to suggest that audiovisual culture has a possible function as a mediator of change, preparing us for challenges not simply on an everyday level but on a macro level of decades and lifetimes. The so-called ecological approach adopted by J.J.  Gibson and used by Evolutionary Psychologists, such as Tooby and Cosmides and Joseph Carroll, sees evolution as dynamic.18 The organism is not only changed by the environment but also the environment is changed by the organism, in a circular process of development and feedback. Therefore, evolution may well have not ‘stopped’ as such, but one thing is for certain: the change in environment has been so radical that there hasn’t been a hope of the organism keeping up with physiological adaptation. However, ecologically speaking, audiovisual culture can be a significant agent for both changes in the environment and humanity.

According to Marshall McLuhan, audiovisual culture extends us, allowing us a wider view and experience, a wider range of emotions and a sense of omniscience. McLuhan’s 1964 book Understanding Media has the subtitle ‘The Extensions of Man’. He states that the medium is any extension of ourselves.19 McLuhan sees the content as far less important than the medium itself, which effects a modification of consciousness by changing the relationship between the different senses and faculties. Studies of culture often simply see the surface, the ‘content’, if you like, rather than the underlying structures that may be the most important point. Perhaps an instance proving McLuhan’s ‘the medium is the message’ adage (well, in one of its forms) was when Sony bought Columbia in 1989 and Matsushita bought Universal in 1990. Both of these Japanese electronic home hardware manufacturers arguably wanted to control the ‘software’ (films) in order to regulate and stimulate the sales of their primary product (the hardware: TV sets, DVD players, computers, etc.). We are encouraged to think of films as individual objects that are highly distinctive rather than necessarily as a flow of similar objects. Physiologically, perhaps we don’t desire the latest film starring a certain actor or actress; perhaps we just need some sound and images to run through our senses and brain. Often, the higher the quality, the better, and often we want to repeat experiences that we have found satisfying and reassuring. A crucial but perhaps unanswerable question is about the place of audiovisual aesthetics and culture in human fantasy, and by implication, also their place in defining and guiding a sense of reality. Fantasy is firmly inside our heads, interior to us, and yet it is far from easy to work out where reality stops and fantasy starts. For many people, a significant question is: what is the relationship of the real to fantasy? Culture can redouble this question.
However, culture, and particularly audiovisual culture, is also able to provide an answer of sorts. Conventionalists might say that most of what we experience as reality is in fact a fantasy concocted in our heads, while psychologically disturbed people can be unable to differentiate one from the other at all. Films and other audiovisual culture have expounded these ideas at length. In film and subsequent audiovisual culture, there are conventional approaches to subjectivity. Psychological fantasy, which is in the on-screen character’s head, often contrasts with the sense of the objective achieved by following the dominant audiovisual approaches, as discussed in the previous chapter in relation to ‘naturalism’. Subjectivity can be an effect of limiting events to a character’s experience and use of point-of-view shots
that tie the audience to their position. However, a more extreme sense of being interior to a character’s psychology is achieved through indications of fantasy, which can sometimes be very strongly signalled and at others achieved more subtly. In terms of vision, this manifests as disjunctive editing, confusing a sense of screen direction and orientation, as well as distortions to the images themselves. In terms of sound, this often appears as forms of enhanced or distorted sound, such as added reverb or echo, or filtering that can remove high or low pitches. Synchronization of sound and image can also move apart and seem awry. So, first-person viewpoint can be extreme, and certain distortions may indicate fantasy or drug-induced mind-alteration, while third person suggests objectivity through the unnoticed ‘invisible’ style of naturalism, which equates with classical continuity conventions, allied with its associated omniscient narration, which pulls the audience away from being tied to a single character. At its extremes, this sliding scale inculcates a sense of the objective and consensual outside, which is based on human physiology, in contrast with the highly subjective, which is an effect of the interior of the human head and fantasy. The accepted view of fantasy is that it is not reality. In psychological terms, it is an imagined scenario that is visualized vividly and which stages a desire, conscious or unconscious, where the person imagining has a place somewhere in the scene. Freud discussed how fantasy is an expression of mastery, and how this could manifest in fantasies about beating.20 Just before the turn of the twentieth century, Freud changed his approach and decided that memories of being sexually abused as children were predominantly fantastic projections rather than actual memories. This is controversial,21 but was crucial in bringing Freud’s focus onto fantasy and ultimately made the concept of reality more complex and problematic.
Yet this initial demarcation between fantasy and reality gave way to a sense of reality as being heavily mediated and constructed discursively by the self. Nevertheless, a relatively solid demarcation between fantasy and reality is not only evident in much audiovisual culture, but is considered an important aspect of everyday life, even though it may not be as solid as we like to think. Films about fantasizers and their fantasies often signal strongly the ‘real’ and ‘fantasy’ levels. Examples include The Secret Life of Walter Mitty (1947 and 2013) and Billy Liar (1963 and TV series 1973–1974), in which the eponymous protagonists intermittently daydream. Such films, and arguably all audiovisual drama, make fantasy seem real within the conventions of the
drama at the same time that they are indicating that we should not understand this as being reality. The depiction of these inner states on the one hand questions the status of ‘the real’ and on the other simultaneously guarantees it as a coherent state. Scholarly writing has been little interested in the nature of fantasy in film and has shown far more interest in ‘fantasy’ as a film genre. However, according to psychoanalysis, fantasy establishes a scene that in some way stages the subject’s desire in their imagination. The problem of ‘(what is) reality’ necessitates the solid demarcation that a strong idea of fantasy requires. Yet perhaps ‘reality’ does not need an absolute definition but can be understood more as an effect of audiovisual culture.

Toggling the Phantasmagorical Gap: ‘Fantasy’ and ‘Reality’
The key to fantasy is knowing it isn’t reality. Audiovisual culture emphasizes and sometimes confuses this through ‘toggling’ between different diegeses and dramatizing the transition between two psychological states. These predominantly involve a sense of ‘real’ and ‘interior’ states of mind, which become rendered as different and demarcated levels of diegesis. The two worlds are signalled as different, and yet also signalled as a continuity in certain ways. Audiovisual aesthetics define the states, set the boundaries and mark the transition between them. It is absolutely essential to have a strong idea of each to make the other have any meaning: they have meaning in relation to one another. This relational positioning appears to be a notion of ‘fantasy’ specific to audiovisual culture. Sound and music regularly adopt a crucial role in demarcating between these two states. Rather than the vector of crossing that Robynn Stilwell identifies in music traversing the diegetic/non-diegetic divide in ‘the Fantastical Gap’, this is more of a ‘Phantasmagorical Gap’.22 Here, music in particular is able to indicate clearly that we are now in a different state or in a diegesis of alternative status. It can make a striking change, often although not always allied directly to changes in image modality, and clearly defining the character of the new situation. Audiovisual culture mediates between ‘reality’ and ‘imagination’: often one state is signalled as ‘real’ and the other as in the character’s head. This tends to be divided conceptually into the physiological and the
psychological. This is ‘outside’ and ‘inside’ the character’s head so, while this is a direct mediation of the real and fantasy, it is also a mediation of the physiological and the psychological. The physiological is understood as ‘objective’, the ‘normal’ version of the reality in the diegesis. In contradistinction, the ‘other world’ is defined as being interior and psychological in nature. These can often be presented as two separate realities, with varying status applied to each. In audiovisual culture, it is quite conventional that the real is portrayed as a physical state, while fantasy is situated in the mind, although the latter is not quite so simple. This has a net effect of ‘stabilizing’ our sense of ourselves, and our sense of our relationship with the world, even though on the surface, it might problematize them. Indeed, this appears to restate the Cartesian divide between mind and body. This was mentioned earlier, along with Gilbert Ryle’s formulation of the ‘Ghost in the Machine’ as a dismissal of this.23 While plenty of films, television programmes and other cultural objects address the Cartesian divide, audiovisual culture is able to address it in a discursive manner which emanates from the specificities of the medium rather than merely thematically. Furthermore, rather than simply setting up these poles, audiovisual culture regularly goes about ‘discussing’ how they might and might not work and how we understand the two. Sometimes this can be done almost wholly in a material and stylistic format. The survival horror video game Silent Hill 3 (2003, Konami) intermittently and often randomly toggled between two different versions of the same location, changing both psychology and gameplay at the same time. The third instalment in the Silent Hill series and a direct sequel to the first Silent Hill game,24 it follows Heather, a teenager who becomes entangled in the machinations of the town’s cult, which seeks to revive a malevolent deity.
She attempts to return home to her father from a shopping mall in the deserted town of Silent Hill. The game is rendered with a third-person avatar navigating the misty ghost town of Silent Hill, under threat from various disturbing creatures and attempting to solve a central mystery. Intermittently, the environment transmogrifies into the Otherworld, another dimension which is a more threatening version of Silent Hill. This toggling is at the heart of the gameplay, although many ‘survival horror’ games oscillate between moments of repose and moments of threat and attack.

Figs. 4.1 and 4.2  Silent Hill 3

The Silent Hill games make a distinctive use of their sound component, which plays an important role in providing information and enabling the player’s survival. Changes in the soundtrack signal the change between the ‘normal’ location of Silent Hill and its hell-like Otherworld version (Figs. 4.1 and 4.2). Incidental music serves a notable function in that the games are premised less upon action and excitement than atmosphere, with extended sections of potentially aimless wandering about through a deserted town shrouded in mist. The game exploits sound and image for atmospheric effect with the use of obscured images and sounds of
uncertain origin, and it owes much to the horror film genre. However, it also has its own innovations. Sound is used to warn the player of danger, while distant disturbing sounds provide a constant emotional backdrop for avatar Heather’s activities. There are few moments of relaxed repose. Images can include mist, darkness, fire and smoke, thus lacking the clarity of vision that is often highly evident in video games. The music and the sound design for each of the early games in the series were written and produced by Akira Yamaoka, a specialist in video game music.25 The music embraces morose atmospheres, heavily echoed notes, resonant chords, deep reverbs, ‘trip hop’ grooves and ambient textures. Yamaoka’s music provides an essential character for the game, to the point where the two Silent Hill spinoff films were constructed around Yamaoka’s music for the games. Apart from providing a theme song for each game, Yamaoka’s music includes atmospheric pieces, simple memorable melodic pieces and austere bursts of ‘noise’. The atmospheric pieces appear predominantly as accompaniment for different locations in the deserted town of Silent Hill. The bursts of ‘noise’, consisting of metallic sounds (often isolated in succession) and the constant use of disjointed looped rhythms that sound like broken machines, accompany threatening locations, most notably when there is a transition to the ‘Otherworld’. This is a flip-side manifestation of Silent Hill, which occurs from time to time, where everything is nightmarish. The volume and intensity of the music for the Otherworld is beyond the rest of the game’s norm and at times is almost unbearable. The music includes no direct matching of action, with the musical accompaniment tied instead to the sections of the location that the player’s avatar (in the case of Silent Hill 3, Heather) traverses. Rather than being dynamic and matching momentary changes in the gameplay, the music is keyed to being part of the environment.
The locations in Silent Hill are highly atmospheric and have a certain feeling to them already but the music serves to give a particular emotional cast to them, yielding a fully integrated atmospheric and emotionally tinted succession of game locations that add up to a coherent and continuous version of the deserted town. Zach Whalen notes that “Below the static sound, the dominant base sound in Silent Hill is a chilling ambient wash which throbs with the sound of machinery and sirens. The volume level of this ambient sound is low, but its ubiquitous presence keeps the player on edge and sets an ominous tone for the visual environments of both worlds of Silent Hill. Its mechanical tone also blends smoothly with the machine-produced static of the radio such that the sonic texture of the atmosphere remains consistent”.26

Indeed, in Silent Hill, evil is associated sonically with the appearance of white noise. White noise is a sound that has equal presence and intensity of all pitches and sounds like a radio tuned to no station. In terms of sound, white noise represents a primal soup out of which other sounds can emerge. Significantly, it also suggests a mental state of being overwhelmed by ‘noise’ and marks a homology of the ‘immersion’ at the heart of the game. In the game, the appearance of static on the radio carried by the protagonist warns that monsters are approaching. So, the sound is an indication of off-screen threats. Particularly in the mist-wreathed streets of the town, this crackling and static sound emphasizes one of the evolutionary advantages of hearing: that of knowing of threats that are at a distance. In this case, it also underlines the geographical nature of this video game, in that the town appears like a real space which not only can be traversed but also can be an interesting and extensive place to wander around. The provision of space is at the heart of the game, and it depends on music and sounds as much as on the constantly changing perspective on a seemingly three-dimensional location. Of course, in audiovisual culture generally, the sense of space in sound adds a further dimension to flat images on screens. This is an often-overlooked aspect of all music, and particularly recorded music. In The Philosophy and Aesthetics of Music, Edward Lippman notes that spatial information is central to music, and that sharp high-pitched sounds tend to sound close and are perceived in the head, while dull low-pitched sounds appear distant and are felt in the lower body.27 In the Silent Hill games, there is a dynamic divide between the high-pitched screams and sirens, and the sub-bass rumbles that give a sense of anxiety as well as terror.
Furthermore, in Silent Hill games, as well as in other horror or psychologically based audiovisual culture, the spatial information of sounds and music mismatches the visual spaces on screen. The music often serves a geographical function in that different musical pieces delineate different locations, so that a progression through the game is a progression not only through a series of distinct spaces (shopping streets, hotel, hospital, park, amusement park) but also through a series of distinct musical pieces. Silent Hill 3, like the other video games in the series, focuses heavily on atmosphere and player immersion rather than consistently exciting gameplay and competition. It is a solitary and introspective game and pulls the player into its free wandering and highly engaging music, the first of which underlines player agency, while the second not only builds sonic landscape but also elicits strong emotional involvement. The move to the Otherworld cues discordant music that is disturbing in itself and a continuum of
dissonance to indicate danger but also to elicit fear in itself. This is reminiscent of William James’s theory that our fear comes at least partly from our physical responses to a situation rather than from the scary thing itself. The so-called James-Lange Theory of emotion posits that human experience of emotion arises from physiological changes in the autonomic nervous system’s response to external events.28 William James and Carl Lange separately came to the same conclusion: that emotions come from physiological reactions to a stimulus, rather than being reactions to the stimulus itself. This appears to undermine the conventional understanding of the dominant cause-effect chain. A similar process perhaps is evident in video games like the Silent Hill games. The onset of disturbing music in survival horror games, which is particularly marked in the Silent Hill games, may be the most affecting stimulus to our physiological reaction rather than the situation in the game itself. So, on-screen threats may well not cause our fear as much as the music that indicates (in a Pavlovian manner) the existence of imminent threats, and indeed, music’s often direct impact might trigger physiological fear reactions. Of course, it is difficult to say whether this is the case, but the music for the games plays a crucial role, not only in giving a sense of geography to the game but also in manipulating and regulating emotions throughout gameplay. The musical landscape of the Silent Hill games has a primary psychological function. The ‘believability’ of the three-dimensional space of the town is crucial and the sound and image components of Silent Hill cohere into an almost tangible landscape. This is partly premised upon a sense of ‘equivalence’ of ambience with ideas: the deserted ruined town, mist and falling snow and ash, mysterious apparitions and impossible geographies where the player crosses into the nightmarish Otherworld.
The games appear to be a coherent landscape but suggest a terrain that is interior to the head of the protagonist that is being operated by the player. Silent Hill’s world appears to emanate from the neurotic mind of the game’s central character and the constant dislocation of sound and image establishes the sense of aberrant psychology at the heart of the game. Indeed, the Silent Hill games appear to represent, or perhaps further, to simulate, the mental condition of psychosis. A central tenet of psychosis is the inability to differentiate between what is real and what is fantasy, and a person with a psychotic condition might experience hallucinations and delusions, as well as possibly having the extreme changes of mood associated with bipolar disorder.29 The game’s toggling between states differentiates ‘real’ and ‘in-head’, but this distinction is not as solid as it seems, as
Silent Hill games tend ultimately to confusion about the level and status of objectivity in the game’s ‘real world’. Nevertheless, the division remains significant. The games also alternate between physiological activities (moving around the town and violently killing threatening beings) and ‘mental’ ones: the game includes regular intermittent puzzles that need to be completed to allow progression in the game. Silent Hill is a particularly powerful manifestation of the mediation of the psychological and physiological, even though its aesthetic representation of ‘internal’ landscapes appears to mean that the psychological eclipses the physiological. The Silent Hill games often indicated moving to ‘inside the head’ of the game’s avatar via a process that exploited ‘weird sounds’. A similar oscillation between two states with a clear transition was evident in a remarkable television drama that is premised upon the movement between two worlds or dimensions. In the ATV children’s drama serial Escape into Night (1972), an ill and recuperating schoolgirl called Marianne draws pictures for entertainment as she is bed-ridden for months. She then dreams about the pictures.30 Once she has drawn a house, she becomes able to enter it through dreams. Marianne meets a sick boy in the house, which then becomes threatened by evil forces, which she has also drawn. The same story, Catherine Storr’s Marianne Dreams, was later adapted into the film Paperhouse (1988, Bernard Rose). The ‘dream’ part of Escape into Night is never depicted as a particularly dreamlike experience for Marianne. However, it is highly disturbing and nightmarish, particularly when the dark empty house is threatened by noisy, moving stones, and is more significant for Marianne than her ‘reality’ of being stuck at home in bed (Fig. 4.3).

Figs. 4.3 and 4.4  Escape into Night

In Episode 1’s pre-titles sequence, the narrative is set in motion when Marianne falls off a horse. Yet it is accompanied by strange, saturated reverb on the horse hoof sounds. This is reminiscent of the later strange sounds in the programme and gives something of a presaging of the later disturbing activities. As Marianne first draws the house, there is a passage of music, which sounds like atonal electric piano followed by a pulsing organ cluster. This same music recording reappears as she dreams, accompanying a succession of slow crossfades and a canted frame which show her ‘landing’ at the other location. After this series of visual aspects that indicate a move from normality, sound then takes over this function. The sound of extremely strong wind, which is not evident among the sounds of her bedroom, and dark lighting contrast very starkly with the audiovisual depiction of the banality of her bedroom. Marianne has drawn an isolated house, seemingly on barren moorlands, and she tells her mother it is her house. She then decides to put someone in the empty house and each time she draws something, it appears in the house once she visits in her dreams (Fig. 4.4). The boy is called Mark (who also appears to be taught by Marianne’s home tutor Miss Chesterfield). Inside, the house is empty and unfurnished, with dark lighting, but its sounds are dominated by a deep, saturated, over-echoey grandfather clock ticking on two alternating pitches in the deserted hallway. This is a transposition of her normal house hallway, which has a deep, dead-sounding, reverb-free ticking from a grandfather clock. The sonic characteristic of the two spaces, absorbing sound versus reverb from an empty hallway, is more of an emotional and psychological effect than a representation of the two different spaces. It also has a central structural function of differentiating the two worlds.
In Episode 5 of Escape into Night, the clock’s two pitches are enhanced by electronic echo, and indeed, as is conventional, echo delineates a location that is extraordinary, is in some way not ‘real’ and is disturbing, both for on-screen protagonists and for the audience. Marianne draws stones with eyes in front of the house, just outside the perimeter fence, in a moment of anger at Mark, but these become their bane once they appear. When she next goes to the house, she notices the continuous echoey off-screen sounds that come from the stones. They sound like distorted voices, with burying echo fully obscuring what they are saying into repeated bubbling sounds. The use of electronic echo piles up as Marianne denies the house and Mark, wakes and scribbles all over the drawing of the house. This striking sound has an underwater quality to it and appears based on distorted whispered voices. It is reminiscent of the kind of
distorted voices that can appear on analogue radios that are not properly tuned in. This rhymes with the radio in the house, which makes static sounds and then is occupied by the stones’ sound. At the end of Episode 3 and the start of Episode 4, Marianne looks out the window and sees the watching stones, whose eyes light up to the accompaniment of sinister music. We then are given a succession of images as points of view from the stones, captured in the eyes of the stones so that we see what they are seeing. In the centre is a succession of still images of Marianne, each accompanied by a boat horn-like glissando sound as punctuation. These appear quite discontinuously, merely as a succession rather than with a sense of logical continuity. These stones are the epicentre of the nightmarish aspects of Marianne’s dream world. In Episode 4 of Escape into Night, when they turn the radio on, we hear the voice sound that emanates from the stones, repeating ‘we’re coming’ in a threatening manner. At the conclusion of the serial, Marianne and Mark escape the empty house and pass the stones to reach the lighthouse, all the while accompanied by these disembodied voices/sounds from the stones. While visually the stones are disturbing, each having a single lit-up eye, their sounds are far more affecting. This is, at least partially, due to their pervasive off-screen presence delivered sonically. Their threat is undefined but constant. The very nature and texture of their sounds are disquieting, defined by the heavy use of electronic echo distorting the sounds and obscuring what sounds like a whispered vocal sound. The whole serial appears to be about the mind and psychology. The dream part of the narrative is a figuration of the unconscious mind, with Mark as an autonomous part of the self, and a move from the gloomy house to the lighthouse as a trajectory for Marianne from illness back to health.
The stones and their pervasive sounds provide a figuration and manifestation of Marianne’s anxiety and adopt a location around the periphery of the empty house. Entering the house feels like entering Marianne’s unconscious, and these things occupy its uncertain edges. Conventionally, transitions between states and dimensions require extraordinary sound and image to demarcate them. On a few occasions, in Escape into Night, transitions from Marianne’s bedroom to the dark house include the sounds of extremely heavy wind and the images of the moon in clouds. The banality of the everyday world includes little in the way of ambient sounds, while the other world is dominated almost continuously by sounds. The potent place of sound in this serial is emphasized by the unconventional opening
and closing title sequences, which are accompanied by a voice humming a slightly sinister melody. In Episode 3, Marianne has a dream sequence that is ‘outside’ her regular dream visits to the house. In terms of sound, there is a heartbeat-inspired pulse accompanied by image distortions, marking a clear alternative regime from her other dreams. On occasions, rather than simply being an aesthetically differentiated contrast of spaces, the transitions between the bedroom and the dream house are marked. At one point, Mark desperately shouts, “they can hear, I tell you, they can hear”. The camera moves in to a close-up of Mark’s face in profile, while the stones’ whispering sounds continue. Suddenly, a rapid and progressive reverse echo resonance makes a twanging sonic climax, upon a cut to Marianne in bed talking to Doctor Burton in what she refers to as her ‘ordinary life’. This sound is outside the film’s sonic repertoire but Escape into Night features a range of fantastic sounds, which are beyond everyday experience, or recognition, and therefore disturbing. Indeed, the stones are extremely disturbing and must have been very upsetting for children in the early 1970s. Visually they are certainly scary, but their sonic substance is far more powerful. Marianne and Mark, inside the house, are constantly reminded of the stones’ threat by their sounds from off-screen, which form a continuum in the quiet dream house space. While this might seem like it is mediating fantasy and reality (the subject of the previous chapter), Marianne is physically inert, stuck in bed and lacking power or agency, but when she enters her dream world, she is Godlike in that she creates the world herself in a move reminiscent of lucid dreaming. Here, she can walk and move around, which is underlined by Mark’s inability to walk.
Escape into Night illustrates unmistakeable toggling not only between physiological and psychological states, playing between ‘internal’ and ‘external’ states, but also between the objective ‘consensus’ diegesis and the world inside her head. Marianne is physically inert, stuck in bed and frustrated. Her dream process is about empowering herself and going somewhere where she has physical mobility, and she is contrasted with Mark, who is unable to walk. The conclusion of the narrative shows that Mark was real and seems to remember meeting Marianne in the house. While this has connotations of the two meeting on the ‘astral plane’, its depiction is more like Hades or some other nightmarish hinterland. Indeed, rather than simply fantasy, Escape into Night shows entering another world, like the underworld, which Marianne notes is more ‘real’ than her bedroom.

In a similar manner to lucid dreaming, Marianne appears to enter an ‘astral plane’, a world that is not fantasy but in fact a different level of reality. The differences are clearer in terms of her mental states, where the limits of the physical lead to an active life of the mind, and ultimately an understanding of the greater importance of the mental world she has moulded. One film genre most clearly has a strategy of ‘toggling’, and that is the musical. Musicals habitually shift between song sequences and ‘narrative’ sections that follow the conventions of mainstream cinema. Furthermore, in many cases, the song sequences might be construed as being ‘inside the head’ of a character on screen. Consequently, some musicals might be understood as ‘physical reality’ but with song sequences that are the emotional and psychological comment or revelation of a character. This division can be very stylized, indeed. Acclaimed television playwright Dennis Potter exploited but diverted the musical format in some of his television dramas, including Pennies from Heaven (1978, BBC), The Singing Detective (1986, BBC) and Lipstick on Your Collar (1993, BBC). The Singing Detective used old songs in an ironic manner and had a complex narrative involving multiple diegeses. Bedridden with extreme psoriasis, Marlow, the author of a pulp novel about a detective who also is a professional singer, hallucinates and moves between different diegetic worlds. These are his normality in the hospital, fantasies about his ex-wife conspiring against him, his childhood and the stylized Film Noir world of the detective from his book. This television drama has a gradation from physiological to psychological. The physiological is a relatively banal level, where Marlow lies inert and helpless, stuck in a hospital bed. There is a clear tactile and haptic aspect in the protagonist’s psoriasis that covers his whole body.
The ‘wife story’ fantasy is what the protagonist can’t see and imagines, although it could actually be true. Indeed, the drama contains a number of ambiguities. Another level is the ‘childhood memories’ section, while the fully fantasy level is the Film Noir world of the Singing Detective and his murder mystery investigation. It is this final level that is fully inside the head of Marlow and the place where his mental issues are played out and ultimately resolved.

4  MEDIATING THE PSYCHOLOGICAL AND THE PHYSIOLOGICAL 


Fig. 4.5  The Singing Detective

The Singing Detective is built around re-contextualizing banal popular songs from the pre-World War II period. They are often radically reoriented by the narrative context and the visual rendering. Songs that at face value had quite a simple emotional and communicational character are converted into something extremely different. For instance, in Episode 1, Them Dry Bones is converted into a grotesque Hollywood musical-style song sequence with medical personnel in the middle of their activities lip-synching to the song. As the song finishes, the medics return to their normal activities as if nothing had happened. It is not a new recording but the antiquated monophonic sound of Fred Waring and his Pennsylvanians, which makes the lip-synching of the medics all the stranger an effect. It is motivated as a hallucination by the bedridden protagonist, who leads us through multiple diegeses: him in bed in hospital, his hallucinations, his childhood memories and his fantasies about his wife conspiring against him. These are not fully distinct and crossovers regularly take place. His wife and her lover are presented like a modern television drama, while The Singing Detective’s 1940s world of spies and intrigue looks like a stereotypical Noir to the point of parody. This section constantly uses a single repeated piece of stock dramatic underscore, underlining the repetition and aporias of information. Most song sequences are set in the ‘childhood memories’ sections. Old culture functions not only as personal memory but also as collective memory here. For example, in Episode 6 (‘Who Done It’) of The Singing Detective, the protagonist Marlow as a child converses with some soldiers on a train. As they pass a scarecrow in a field, it begins to sing Al Jolson’s recording of After You’ve Gone. This is highly


disturbing, not least as the scarecrow has the features of Marlow’s schoolteacher. The start of the song has the scarecrow lip-synching the song in close-up, but then, as the song continues, there follows a montage of shots showing the child’s parents separating and him running to escape it. The songs appearing in the ‘childhood memories’ section include both bursting into song, as in the integrated musical, and diegetically motivated songs. An example of the latter is Marlow’s father performing Birdsong at Eventide, including miming its extensive section of bird song mimicking, as if performing in a working men’s club. Although on the surface it follows the structure and logic of a film musical, Glen Creeber calls The Singing Detective a “bizarre subversion of the Hollywood musical”.31 The old song recordings are highly significant in The Singing Detective, and like traditional film musicals, there is a structure of alternation between dialogue sequences and song set pieces. The drama tabulates memories through old popular culture (Film Noir) and old songs. Demarcation is strong between songs and dialogue scenes, as is demarcation between what are signalled aesthetically as the ‘reality’ of present day in the hospital and the other strata of memory and fantasy. However, the insistence of their individual ‘realities’ problematizes the ultimate level of reality (Marlow in bed in the hospital), with a suitable conclusion that brings a unity of fantasy and reality. This is the structure of the musical: alternating songs and narrative and, by implication, oscillating between consensus reality in the film and something closer to an interior psychological world. A similar alternation is at play in Zack Snyder’s film Sucker Punch (2011). It is set in a mental institute where Babydoll has a sustained dream that she is in a brothel and planning an escape. She does a succession of intermittent exotic dances in the brothel which materialise wild fantasy sequences.
So, there are three levels of diegesis in operation, with most seemingly taking place in the first level of fantasy, with loud music cueing entry to the fantasy level within that. This Russian doll-like structure of fantasies within fantasies allows for a maximum of imagination and a sense that nothing is quite what it seems. The vast majority of the film takes place within a subjective fantasy world, reminiscent of Ambrose Bierce’s short story An Occurrence at Owl Creek Bridge or the latter part of Terry Gilliam’s Brazil (1985). While the audience was made aware of most of the action’s status as fantasy, the concluding return from the film’s fantasy level is an effective shock.


The three levels of the diegesis involve characters that we first encounter being reconfigured later in the fantasy levels, moving from the point where Babydoll is committed to the mental institute, to the brothel where she dances, and to the fantasy sequences reminiscent of video games. The film moves from the brothel world to the video game-style world each time Babydoll begins a solo public dance, and her team (Sweet Pea, Amber, Rocket and Blondie) embarks with her on a quest for a variety of objects that will enable their escape. However, Babydoll is never actually shown dancing, and upon returning to this world after each set piece song sequence, we are given reaction shots showing the diegetic audience enrapt by the performance we have not seen. Yet their reaction might be understood as a model for ours, having brought us down from each pyrotechnic action accompanied by a song. In some ways, Sucker Punch might be construed as a musical, in that it has extended set piece sequences based around songs that resemble extravagant and choreographed sequences in musicals. However, while they start as dance sequences, they move rapidly into video game-inspired violent action with loud music and no dialogue. These include attacking a World War I German bunker and a castle defended by what look like Lord of the Rings orcs, as well as fighting robots on a train, and another in samurai-era Japan. In every case, they contain anachronistic and fantastic elements, mixing components from different eras and being reminiscent of various other films and video games. Each song sequence starts with the song recording, which then recedes and is replaced by orchestral film score. Yet this is often based loosely on the musical elements of the particular song. The song recordings themselves are arrangements of the original, too, if you like, as cover versions of famous pop and rock songs. As noted, Sucker Punch might be understood as a musical.
Elsewhere, Beth Carroll and I have pointed out that rather than a strict demarcation between musicals and other genres, the musical form fluidly runs into other genres and is not a matter of simple traditional format.32 Dr. Gorski triggers the music (and the musical sequences) by switching on a reel-to-reel tape machine. As each song starts, Babydoll begins to gently sway as a start to her dance. Then we have an audio dissolve into extreme violent action in a succession of different video game-inspired diegeses, one for each dance. Each of these set piece sequences matches the movement of the song with fast and dramatic action, almost all of which is CGI. Indeed, the look of the sequences aims to be reminiscent of video game graphics. The violence and action are hyperbolic, and, like many


action video games, the enemy is less-than-human and therefore dispatched in massive numbers without any concern about morality. In these song/action sequences, dialogue becomes almost wholly marginalized and the explosive action and music adopt the foreground. For instance, at the beginning of the film, the song Sweet Dreams (Are Made of This) accompanies a visual-only narration of the situation leading up to Babydoll’s incarceration. Indeed, music regularly dominates and the film is premised upon loud stereo music alongside striking images on screen. Sucker Punch has an overall sonic coherence, with sparing incidental music by experienced film composer Tyler Bates and Marius de Vries, who is better known as a dance music producer, alongside a selection of well-known songs performed by artists other than those of the original recordings. For instance, Emily Browning (who plays Babydoll in the film) sings the aforementioned version of Sweet Dreams (Are Made of This), which was originally recorded by the Eurythmics. This appears as non-diegetic music, concluding as Babydoll is brought into the asylum near the start of the film. She also sings the Smiths’ song Asleep and the Pixies’ song Where is My Mind?, which also appear non-diegetically. At the film’s conclusion and for the end titles, Carla Gugino (who plays Gorski) and Oscar Isaac (who plays Club boss/orderly Blue Jones) perform Love is the Drug, which was originally by Roxy Music. The film’s approach to music lends it a sense of unity, with ‘song sequences’ beginning with the new version of the classic song, and as the sequence progresses, the song drops out to be replaced by incidental orchestral and electronic score, which makes variations on the song material, sometimes staying close to it and sometimes moving some distance from it. Like a film musical, sections of the film are narrated by images and music, with dialogue marginalized and often obliterated by loud music.
As noted, at the start of the film, Sweet Dreams (Are Made of This) has an ironic relationship to the images of Babydoll’s framing for murder and incarceration. The whole narration is achieved visually with music, with no recourse to dialogue. Similarly, for the mayor’s arrival at the brothel to see Babydoll dance, there is a mixture of the two Queen songs I Want It All and We Will Rock You, which is motivated as being diegetic with crowd involvement in the chanting and clapping. Without dialogue, this sequence communicates not only that the mayor is important but also that he is more like a gangster than a politician, while the crowd’s sycophantic reaction to him illustrates their subordination to his power over the anticipated event of Babydoll’s dance. Musically, the use of rap music by Armageddon Aka Geddy appears to be a crude allusion to gangsterism, although the segueing into Queen’s We Will Rock You and I Want It All is impressive,


inspiring recognition in the audience while retaining the thumping rhythm of anticipation. The song sequences are startling in their fast-paced action and sense of kinesis. In order, these set piece action/dances are premised upon the songs Army of Me (Sucker Punch Remix) by Björk featuring Skunk Anansie; White Rabbit by Emiliana Torrini, originally by Jefferson Airplane; Search and Destroy by Skunk Anansie, originally by the Stooges; and Tomorrow Never Knows by Alison Mosshart and Carla Azar, originally by The Beatles. Army of Me accompanies a sequence where Babydoll fights against three gigantic Japanese feudal samurai-type figures in the Mount Mikasa temple and its snowy outside (Figs. 4.6 and 4.7). This is distantly reminiscent of the video game Shadow of the Colossus (2005, Japan Studio/Team Ico) where the player must destroy a number of massive combatants. The action is based directly on video game action, with extreme gymnastic movement and fast violence.

Figs. 4.6 and 4.7  Sucker Punch


The second ‘song sequence’ is based on White Rabbit and depicts an attack on German trenches in a version of World War I, where the Germans have reanimated their dead soldiers as a form of zombie. This sequence exhibits a clear sense of steampunk that is evident intermittently throughout the film. As the music and sequences begin, the song’s slow start is accompanied by slow-motion shots of the group of female protagonists walking in the trench. The following song, Search and Destroy, accompanies the group attacking a castle with orcs and dragons, directly extracted from the Lord of the Rings films (2001, 2002, 2003) and reminiscent of the multi-player online game World of Warcraft (2004, Blizzard), although its dragons predate, but look related to, those in Game of Thrones (2011–2019). As in most of the sequences, the song drops out and is replaced by orchestral score that is related distantly and intermittently to the song. Here, the song re-enters dramatically at the precise moment when Babydoll stabs a baby dragon in the throat, a climactic moment in the sequence. The final song sequence is based on a radical rearrangement of The Beatles’ song Tomorrow Never Knows. The original appears on Revolver (1966) and marks the first foray by the group into experimental studio techniques. Indeed, the original still sounds highly singular and Alison Mosshart and Carla Azar have conventionalized the song considerably for Sucker Punch.33 The song is accompanied by a scenario of a bomb on a train that needs to be defused before it reaches a city. The antagonists on the train are robots and the train and landscape are a fantastic science fiction concoction.
As in the other song sequences, the song drops out but reappears at a crucial moment: at the point where the bomb is being lifted from its housing in the hope of rendering it safe, with the action entering still motion and ultimately leading to the death of Rocket, one of the group.34 Like the rest of the action sequences, it is full of sweeping camera movements, avid zooms and bullet time. The film’s action is almost overwhelming, having relentless movement in the song/action sequences and a constant aim for visual variation and spectacle. Indeed, this is perhaps as far from the ‘invisible’ style of naturalism and classical Hollywood continuity as it is possible to get. Not only are there the three diegetic levels, but there is also constant use of CGI to the point where the audience might be unsure of what was really filmed and how much is Manovich’s ‘animation’.35 This stylistic overload is compounded by constant camera movement, wild acrobatics from the characters and the constant intrusion of slow motion.


Sucker Punch is a complex and contradictory film in many ways. Its depiction of female repression takes the form of making a group of women into a bodily spectacle. Yet its multiple diegeses and metatextual levels make everything into a parody of clichéd representations. At the same time, the parody appears to retain a sense of the original more than of its criticism.36 Music dominates, underlining the loss of any sense of depicting a real world across the whole of the film, while the absolute dominance of the song/action sequences connects Sucker Punch very directly to the sense of the stylized and self-consciously ‘unreal’ escapism predominant in the film musical. That these sequences offer an escape for Babydoll and her friends directly engages with Richard Dyer’s notion of the ‘glimpse of utopia’ evident in film musical song sequences,37 and they certainly mark a happier and more empowered world in comparison with the two successive diegetic worlds they have moved from. Certainly, songs are ‘freedom’ from the brothel, which in turn is freedom from the mental hospital.38 The film’s multiple ‘realities’ question the sense of reality as a whole. Each level is fantasy, quite possibly including the base level of the diegesis we enter at the start of the film. Music is the clear agent of moving between these diegeses and emphasizing the fantastic aspects of the film. Indeed, one might argue that musicals are comparative through necessity, where the two ‘channels’ are contrasted (so, backstage musicals in particular such as 42nd Street [1933]). Audiovisual culture tells us what is ‘real’ and what isn’t, and they are often (although not always) well demarcated. However, toggling between two relatively stable states can not only question the status of ‘the real’ and our perception, but also make the fantastic visions appear more ‘real’, in a flattening out of the distinction of status between the two channels.
Furthermore, any level that is ‘in someone’s head’ can be more real than the seemingly ‘real’ diegesis, as it is about the true self rather than being lies or ‘mere surfaces’. This process of ‘toggling’ between different states is central to audiovisual culture, and evident in films as diverse as the adaptations of Robert Louis Stevenson’s Doctor Jekyll and Mr. Hyde and the comedy Gremlins (1984). The two states make for psychological differences and this can often be signalled through distinct stylistic regimes, the inculcation of an uncertainty in the space between the two states, and the anticipation of moving from one to the other. In a way, these illustrate one of the attractions of audiovisual narrative in that we can feel we are involved in the drama through a surrogate. The fantasy of alternative identity is not only depicted in these films but also is figured stylistically.


Conclusion

When our bodies and minds ‘don’t quite fit’ modern life, culture provides fantasy that helps bridge the gap. Well, potentially; I would suggest it is not necessarily to do with the ‘stories’ and how these engage human issues. It might be more in the way that audiovisual culture engages perception in a way that mediates between physiological horizons and psychological states. Audiovisual culture also regularly mediates between ‘reality’ and ‘imagination’, yet the key to fantasy is knowing it isn’t reality, and this conundrum has been at the heart of the illusory status of audiovisual objects since the advent of film. This sense of ‘channel shift’ is perhaps something human beings make constantly, toggling between a modern world and our most natural tendencies, which fit the Pleistocene. Indeed, sometimes we experience situations where we react in an unexpected manner, perhaps even a basic ‘primitive’ manner, that surprises us. We can shift between our reflective, consciously thinking self and our ‘auto-pilot’ physiological self. Such toggling is often represented and dramatized in audiovisual culture, suggesting it might be an important aspect of the human. This process also questions continuity and our understanding of things as continuous rather than fragmented and discontinuous. As such, this engages with one of the principal symptoms of schizophrenia, that of temporal confusion and a sense of discontinuity, and audiovisual culture, while on the one hand presenting a strong sense of a coherent and unified reality (indeed, far more than in reality itself), also indicates the fragility of that experience. Indeed, mediation through a process of toggling between different states questions the status of ‘reality’ more generally, even though simultaneously and perhaps conversely, it strengthens our sense of ‘the real’ by showing it as different from our imagination. A sense of mediation might also be available between sound and image.
While audiovisual culture regularly bases itself on the unified signal of sound and image merged, there are, of course, times when sound and image diverge to a particular effect. Drawing upon J.J. Gibson and Albert Bregman, Joseph Anderson points to humans regularly utilizing ‘cross-modal confirmation’.39 Here, a question posed in the image, for example, might have an answer in sound, or vice versa. This is a higher-level cognitive activity, where information drawn from one sense might be confirmed or denied by others. In a basic state, we hear potential threats at a distance and visual confirmation is often awaited. In an everyday setting, we might


meet a new person and be uncertain of their attitude towards us from their visual presentation. They might say polite words to us, but their tone, the music of their voice, might tell us whether they are sympathetic to us or not. A similar process can happen in films and other audiovisual culture, where information is withheld, or presented partially in sound or image, requiring a confirmation in the other. Yet these moments appear less common than a sense of unified sound and image, as a homology to our lives, where this question between hearing and seeing is an occasional rather than habitual state. This implies another question, though, which is how far our perception ‘fills in the gaps’ to make a unity when we should be questioning those gaps. They may be of significance and our active perception might simply ‘paper them over’. Film and other audiovisual culture perhaps have this issue less than we might as individuals, as they are clearer and more selective in their presentations, regularly aiming less at ambiguity and more at the establishment of a state of seeming stability for the audience.

Notes

1. Hayden White, “The Value of Narrativity in the Representation of Reality” in Critical Inquiry, vol. 7, no. 1, Autumn, 1980, pp. 8–9.
2. Zlatan Krizan, “Ancient Brains in a Modern World: How Our Bodies and Minds deal with Life in the 21st Century” in Psychology Today, blog, no date. https://www.psychologytoday.com/us/blog/ancient-brains-in-modern-world [accessed 5/5/2022]; Adam Gazzaley and Larry D. Rosen, The Distracted Mind: Ancient Brains in a High-Tech World (Cambridge, MASS.: MIT Press, 2016).
3. Leda Cosmides, John Tooby and Jerome Barkow, eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture (Oxford University Press, 1992); Steven Pinker, How the Mind Works (New York: Norton, 1997).
4. Pinker, op. cit., p. 21.
5. Edward O. Wilson, Biophilia (Cambridge, MASS.: Harvard University Press, 1984), pp. 108–110.
6. John Tooby and Leda Cosmides, “The Past Explains the Present: Emotional Adaptations and the Structure of Ancestral Environments” in Ethology and Sociobiology, vol. 11, nos. 4–5, 1990, pp. 386–7.
7. Ronald Giphart and Mark van Vugt, Mismatch: How Our Stone Age Brain Deceives Us Every Day (And What We Can Do About It) (London: Robinson, 2018).
8. Nathan Cofnas, “A Teleofunctional Account of Evolutionary Mismatch” in Biology and Philosophy, vol. 31, no. 4, 2016, pp. 507–525 (507).
9. Steven Pinker states, “People hold many beliefs that are at odds with their experience but were true in the environment in which we evolved, and they pursue goals that subvert their own wellbeing but were adaptive in that environment.” Op. cit., 1997, p. 32.
10. Pamela E. King, Justin L. Barrett, Tyler S. Greenway, Sarah A. Schnitker and James L. Furrow, “Mind the Gap: Evolutionary Psychological Perspectives on Human Thriving” in Journal of Positive Psychology, vol. 13, no. 4, 4 July 2018, pp. 336–345.
11. This situation is summed up well in Daniel Levitin, This is Your Brain on Music (New York: Dutton, 2006), p. 242.
12. Mark A. Hanson and Peter D. Gluckman, “Evolution, Development and Timing of Puberty” in Trends in Endocrinology and Metabolism, vol. 17, no. 1, January–February 2006, pp. 7–12; Mark A. Hanson and Peter D. Gluckman, “Changing Times: the Evolution of Puberty” in Molecular and Cellular Endocrinology, vol. 25, July 2006, pp. 26–31, 254–255.
13. George Williams and Randolph Nesse, Why We Get Sick: The New Science of Darwinian Medicine (New York: First Vintage, 1996).
14. Dave Robson, “A Brief History of the Brain” in New Scientist, issue 2831, 24 September 2011. https://www.newscientist.com/article/mg21128311-800-a-brief-history-of-the-brain/#:~:text=Not%20only%20did%20the%20growth,3%20or%204%20per%20cent [accessed 6/1/2022].
15. Yet according to some, such as Steven Mithen, humans are not limited by brains from the EEA. He poses that the human mind is defined not by its structure but by its ability to blend, merge and interact its particular abilities, creating new ways of problem-solving. He calls this ‘cognitive fluidity’. The Prehistory of the Mind: The Cognitive Origins of Art, Religion, and Science (London: Thames and Hudson, 1996), p. 194.
16. Leda Cosmides, John Tooby and Jerome H. Barkow, “Introduction: Evolutionary Psychology and Conceptual Integration” in Cosmides, Tooby and Barkow, op. cit., 1992, p. 5.
17. Certain psychiatric or problematic conditions may in some cases be evolved states which we are misinterpreting as disorders because they no longer fit our social expectations; or they may be mental states or traits which would manifest healthily in ancestral environments but become pathological due to some feature of modern environments. Equally, some might be early manifestations of evolution in behaviour, brain activity or general physiology.
18. James J. Gibson, The Ecological Approach to Visual Perception (Boston, MA.: Houghton Mifflin, 1979).
19. Marshall McLuhan, Understanding Media: The Extensions of Man (Cambridge, MASS.: MIT Press, 1994 [f.p. 1964]), p. 7.
20. Sigmund Freud, “‘A Child is Being Beaten’: A Contribution to the Study of the Origin of Sexual Perversions” (1919) in The Standard Edition of the Complete Works of Sigmund Freud, translated and edited by James Strachey, vol. XVII (London: Hogarth Press, 1961), p. 177.
21. According to the theory (often referred to as ‘seduction theory’) that Freud disowned, a repressed memory of an early childhood sexual abuse or molestation experience was the essential precondition for hysterical or obsessional symptoms, with the addition of an active sexual experience up to the age of eight for the latter. Jeffrey M. Masson, The Assault on Truth: Freud’s Suppression of the Seduction Theory (New York: Farrar, Straus and Giroux, 1984), p. 109.
22. Robynn J. Stilwell, “The Fantastical Gap between Diegetic and Nondiegetic” in Daniel Goldmark, Lawrence Kramer and Richard Leppert, eds., Beyond the Soundtrack (Berkeley, CA: University of California Press, 2007), pp. 184–202.
23. Gilbert Ryle, The Concept of Mind (London: Hutchinson, 1949), p. 17.
24. Silent Hill is a highly singular series of games produced by Japanese company Konami and has appeared for a number of different gaming platforms: Silent Hill (protagonist Harry) (1999, for Playstation, later versions ported for PC), Silent Hill 2 (James) (2001, Play Station 2, PC, Xbox), Silent Hill 3 (Heather) (2003, PS2, PC), Silent Hill 4: The Room (Henry) (2004, PS2, PC, Xbox), Silent Hill Origins (Travis) (2007, Sony PSP hand-held), Silent Hill: The Escape (2007, mobile iOS, first-person perspective, so no character), Silent Hill: Homecoming (Alex) (2008, PS3, Xbox360, Windows), Silent Hill: Shattered Memories (Harry, like the first) (2009, Nintendo Wii, PS2, PSP), Silent Hill: Downpour (Murphy) (2012, PS3, Xbox360), which features 3D (stereoscopic) graphics, and Silent Hill: Book of Memories (2012, PS Vita), which has an overhead isometric view and in which players can create their own protagonist, male or female, from five archetypes (bookworm, goth, jock, preppy and rocker).
25. “Every sound and every line of sound that is in the game is done by me. And I make all my own sound effects …” “GDC 2005: Akira Yamaoka Interview” Game Informer magazine. www.gameinformer.com/News/Story/200503/N05.0310.1619.39457.htm [accessed 07/03/2007].
26. Zach Whalen, “Film Music vs. Video Game Music: The Case of Silent Hill” in Jamie Sexton, ed., Music, Sound and Multimedia: From the Live to the Virtual (Edinburgh: Edinburgh University Press, 2013).
27. Edward A. Lippman, The Philosophy and Aesthetics of Music (Lincoln, Nebraska: University of Nebraska Press, 1999), p. 27.
28. Peter J. Lang, “The Varieties of Emotional Experience: A Meditation on James–Lange Theory” in Psychological Review, vol. 101, no. 2, 1994, p. 211.
29. “Bipolar Disorder FAQs” at Brain and Behavior Research Foundation. https://www.bbrfoundation.org/faq/frequently-asked-questions-about-bipolar-disorder [accessed 17/6/2022].
30. British television show Escape into Night (1972) was an ATV children’s drama consisting of six 25-minute episodes.
31. Glen Creeber, Dennis Potter: Between Two Worlds (Basingstoke: Macmillan, 1998), p. 8.
32. K.J. Donnelly and Beth Carroll, “Introduction” in K.J. Donnelly and Beth Carroll, eds., Contemporary Musical Film (Edinburgh: Edinburgh University Press, 2017), pp. 4, 6.
33. They weren’t the first. The band 801, featuring Brian Eno, recorded a quite conventional live version of the song for the album 801 Live (1976).
34. The train scenario and shots approaching the train are distantly reminiscent of the film Source Code (2011).
35. Lev Manovich, “What is Digital Cinema?” in Shane Denson and Julia Leyda, eds., Post-Cinema: Theorizing 21st-Century Film (Brighton: REFRAME Books, 2016). https://reframe.sussex.ac.uk/post-cinema/1-1-manovich/ [accessed 2/2/2018].
36. Connor McKeown and Jennifer Ng, “‘You have All the Weapons You Need’—Sucker Punch and the Multiform Gaze” in The Computer Games Journal, vol. 3, issue 2, October 2014, pp. 54–63; Alexander Sergeant, “Zack Snyder’s Impossible Gaze: The Fantasy of ‘Looked-at-ness’ Manifests in Sucker Punch” in Gilad Padva and Nurit Buchweitz, eds., Sensational Pleasures in Cinema, Literature and Visual Culture (Basingstoke: Palgrave Macmillan, 2014).
37. Richard Dyer, “Entertainment and Utopia” in Rick Altman, ed., Genre: The Musical (Routledge and Kegan Paul, 1981), pp. 174–189.
38. The film’s sexual politics are questionable, mixing powerful but sexually displayed female figures alongside repressive men and father figure/saviour. It is all parodic and knowing but deserves a sophisticated discussion somewhere. If anyone knows of one, let me know.
39. Joseph D. Anderson, The Reality of Illusion: An Ecological Approach to Cognitive Film Theory (Carbondale, IL.: Southern Illinois University Press, 1998), pp. 86–89; Paul Taberham, Lessons in Perception: The Avant-Garde Filmmaker as Practical Psychologist (London: Berghahn, 2018), p. 171.

CHAPTER 5

Gestalt, Spandrels and Synergy

Audiovisual culture consistently seems to produce a synergetic effect that is more than the sum of its parts. Consequently, it is a fine example of the Gestalt understanding of perception, and further, it not only embodies Gestalt principles but affords analysis using techniques derived from them. Perhaps audiovisual culture is fortuitously effective and significant, yet in its effectiveness it tells us much about human ‘innards’ (both inner life and physiological being). A Gestalt-inspired line of analysis will tell us something different from the dominant cognitive psychology-inspired ‘atomized’ approach in the arts of looking at individual components at the expense of wholes and their domination of component parts. The McGurk Effect is an excellent example of the principles of Gestalt perception. Two objects are perceived as one, yielding a new and different object. As noted in an earlier chapter, Evolutionary Psychology (EP) often approaches art as an ‘adaptation’, as an evolutionary development with a direct or indirect function of aiding survival. Of course, this might be to a minimal effect. One response to this, from biologist Stephen Jay Gould, was to suggest that art might only be a ‘spandrel’, an evolutionary by-product without a function in this regard.1 In architecture, a spandrel is an inevitable but not necessarily desirable by-product of the addition of a dome to a succession of arches (or a flat surface to the top of an arch). Gould suggests that it is easy to misrecognize the spandrel as a central feature rather than as a consequence of other features. Might we understand the effective power and cultural energy of the audiovisual as a spandrel (both a product of adding ‘film’, or moving images, and ‘music’, or electronic sounds, and a by-product of biological adaptation) rather than a significant adaptation in itself? Perhaps the idea of the ‘spandrel’ could also help account for the synergetic effect of music combined with the moving image. This chapter addresses the audiovisual as a startling by-product of mixing two technologies (and sometimes two separate aesthetic traditions) and also as something that exploits precisely the affordances of human perception.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_5

Audiovisuals and Gestalt Psychology

Gestalt psychology has its origins in the early twentieth century as an approach based on ideas about perception. It offered a radically different approach to perception, which had been conceived as a simple mechanism translating outside influences directly into neurological impulses that were then worked upon by higher-level processes. Perception was thought to be determined by the primary characteristics of outside stimuli which were transduced into raw data by our senses to feed our thinking and emotions.2 Countering this, Gestalt psychology focused on scientific experiments that revealed facts about how perception worked rather than extrapolating the existing logic of how brains were thought to work and dominate human beings. Gestalt established that perception was an active process rather than a passive one of converting stimulus directly into signal. Perception orders and processes stimuli following several standard processes which were presented as laws.3 The first and best-known principle is that human beings perceive the world in terms of a series of wholes rather than elements that are added together in higher-level processing. This is proven by the well-known Gestalt diagrams such as Necker’s cube.


In these cases, it is clear that we perceive an overall shape (or Gestalt) of an object, even when we are aware that we are ‘completing’ an image through adding in elements that are seemingly missing. Indeed, this shows the independence of the perception process, in that although we can become well aware that we are not perceiving what is actually there and we understand what is happening and can see the incomplete images, we nevertheless still tend to ‘see’ the images completed. Initially, perception was thought of in mechanical physiological terms and explained by the mechanisms of the eye or the ear and the nature of light and sound. Instead, Gestalt psychologists shifted focus to principles of organization that were completed before the impulse was sent forward to higher-level activity. One of the founding Gestalt theorists was Max Wertheimer, whose early research outlined the ‘phi effect’ or ‘phi motion’. Initially concerned with the illusion of smooth movement when successive light spots flash in a line across a space, this is the phenomenon at the heart of film and explains the illusion of movement supplied by successive still photographic frames. This displaced the mechanical notion of ‘the persistence of vision’, whereby the eye’s physiology was thought not to be able to register individual frames, and takes into account the more complex procedures involved in perception.
Wertheimer summed up Gestalt theory: “There are wholes, the behaviour of which is not determined by that of their individual elements, but where the part-processes are themselves determined by the intrinsic nature of the whole”.4 Things are perceived as integrated objects rather than assemblages of pieces and “the whole is more than the sum of its parts”.5 Gestalt psychology understands the world in holistic, dynamic, subjective terms, rather than discrete, objective events and units, which has also been influential in terms of focusing importance on the environment and its ecological interaction with people.6 In recent years, neurophysiological research appears to provide some validation to Gestalt concepts and approaches, pointing to more holistic procedures in visual processing.7 Gestalt psychology states that perception organizes itself following a number of ‘laws’. These are the laws of similarity, continuity (smoothness, common fate), closure, symmetry and proximity. The phenomenon of ‘Prägnanz’ is a central principle of Gestalt theory, derived from the way that our perception organizes things. Prägnanz is sometimes called the law of the ‘good Gestalt’ or good figure and has also been called the law of simplicity. When we are presented with ambiguous or complex objects, our perception will make them appear as simple and as stable as possible.8


Perception is organized into an easily manageable format through the selection of laws, and consequently, the aesthetic organization of audiovisual culture clearly works around these principles that are the foundation of human understanding. One of these ‘laws’ is that of perceptual grouping, where we perceive elements in our perceptual field as belonging together more clearly than other elements, and this defines the overall sense of what constitutes the perceptual field itself for us. All of these principles describe how our perception works in unconscious and extremely rapid ways to make a coherent and digestible signal from what we perceive, even when this involves adding in missing parts or altering what is there to make more sense. Rudolf Arnheim noted that Gestalt psychology has to be central to our understanding of art as it supplies us with an understanding of vision and goes on to note that art appears constantly and prominently in the writings of founding Gestalt psychologists Kurt Koffka, Max Wertheimer and Wolfgang Köhler.9 The Gestaltist discovery that vision was not a simple mechanical operation but an active process, creating patterns of significance, had important implications for understanding art. Thus, significance lay not only in interpretation and ‘appreciation’ but in attending structures and aesthetic dynamics. As Arnheim declared in Art and Visual Perception, “…perceiving accomplishes at the sensory level what in the realm of reasoning is known as understanding”.10 The perceptual process is so active that Arnheim likens it to higher-level mental activity while emphasizing its defining aspect for the experience of art, which he posited clearly in the title for his book Visual Thinking (1969). 
Wolfgang Köhler maintained that the relations between elements were fundamental to the emergence of the perceptual whole.11 Going further, he declared that sensory fields inculcate their own psychology in the subject.12 An approach to analysis derived from Gestalt ideas underlines a sense of unity of signal, the importance of the whole rather than the component part in itself, the importance of the component part in relationship to the whole and an understanding that perception is separate from and has a defining effect on cognition. So, audiovisual culture is geared in essence towards a coherent psychological disposition of the audience through providing elements that will be combined in a certain manner. Various principles associated with Gestalt psychology are relevant for aesthetic analysis. I want to concentrate on figure and ground, isomorphism and homology, and multistability. Figure and ground are both a principle of understanding and a principle of general organization and so constantly
evident in different ways in audiovisual culture. Mainstream film has a strong convention of foreground and background, with human figures almost always being the important focus while the background may not be of value in itself, yet remains crucial as a defining aspect for the person in the foreground. The figure is the foreground object upon which the perceiver focuses, while the ground is the backdrop but significantly is central to defining the shape of the figure.13 Indeed, apart from figurative art, such a structure not only adequately describes the general conventions of film and television sound, where dialogue and its intelligibility are prime, but also the format for much music, too, where voices or lead instruments make a seemingly important focus while other instruments form a less prominent ‘backing’ (so-called melody-dominated homophony). The sonic structure of most films conforms remarkably to the basic concept in Gestalt psychology of the figure-ground relationship. Figure-ground organization divides the perceptual field into a ‘figure’, which stands out in the foreground, and a ‘ground’, a background behind the figure. We focus on the figure and the ground works to define the figure. Danish psychologist Edgar Rubin pioneered work on figure-ground. In terms of sound, dialogue tends to occupy the ‘figure’ position while sound effects and incidental music often function as the sonic ground. However, when there is no dialogue, sounds and music in particular can become ‘figure’, adopting the foreground. We should not forget that figure is defined by ground and ground is given meaning by figure. In audiovisual culture, aesthetics can be quite fluid, certainly more than in traditions of Old Master painting, for example.
The malleability of figure and ground in non-dialogue sequences with music makes them the exception to dialogue and medium close-up (MCU)-dominated cinema, which retains a strong conventional sense of figure (‘talking head’) and ground (unimportant background). In terms of film, this might account for how the audience deals with the (previously mentioned) single-shot sequence in Citizen Kane where three image planes spatialize the drama. In a situation of what Gestaltists might call ‘multistability’, we change focus between planes, largely cued by voices. The same can happen with planes of sound,14 and a reversal of conventional processes is regularly used for dramatic effect. For instance, in a startling sequence from Nosferatu: Phantom der Nacht (1979), talking characters are pushed into the background by shots of landscape and music, as Bruno Ganz as Harker walks to the Borgo Pass. While this is most clear in terms of image where the human figure is transitory,
sometimes small and absent in shots, sonic aspects perhaps are less obvious in terms of figure and ground. Human voices are clear figures, while ambient sound and unobtrusive underscore make for a clear ground. Indeed, on occasions disturbing effects can be achieved through breaching sonic convention, where ‘background’ sounds such as birdsong and reverberant distant sounds are moved to the film’s aural foreground. Such an aesthetic ‘move’ appears intermittently in horror films and illustrates the dramatic rearrangement of conventionally ordered elements rather than sound in space simply working to bolster narrative drama. Sounds that conventionally occupy the ‘ground’ of the film are inexplicably moved into a position they normally would not occupy, endowing a sense of unease that is not immediately (and perhaps not consciously) apparent. This can manifest a distinct spatial drama in sound, following an abstract process where film has other organizational levels than narrative and representation. According to R. Murray Schafer, certain sounds, which he calls ‘signals’, occupy the foreground and are listened to consciously, which suggests that background sounds tend to be apprehended in an unconscious manner.15 This underlines a common assumption: that sonic backgrounds are not really noticed and, in a way, constitute a subconscious entity. Moving background to foreground makes elements that are habitually conscious recede, while bringing into focus those that normally remain unconscious. Following the Gestalt principle I noted earlier, in some cases figure and ground can be reversed as an aesthetic effect.
As there is a strong convention for sounds that habitually adopt the foreground to have a clear on-screen source, it is potentially disturbing that these loud, seemingly close sounds, or those foregrounded through isolation such as the echoed sounds, usually lack on-screen sources and fail to be precisely synchronized with the world of vision presented to the audience. Sound is removed from ‘representation’ and used as sound in itself, in the manner of music, becoming ‘organized sound’ at the expense of sound as a guarantee and representation of the illusion of the real world available in the cinema. Such aesthetic rather than representational determination of the film’s sound reorders the conventions of sound in film in a more abstract manner. Indeed, in discussing THX 1138 (1971), William Whittington notes that soundman Walter Murch’s “… approach to sound revealed a constant tension between musicality and
functionality …”16 Such ‘musicality’ is not subject to the dictates of representing events on screen or marking developments but instead follows a musical logic at the expense of the film’s narrative and representational requirements. In fact, some film soundtracks might be approached as if they were an electro-acoustic musical composition, with images and dialogue as merely material elements in an overall aesthetic unity beyond the requirements of narrative and more resembling avant-garde cinema. Isomorphism describes the structural relationship of similarity between different objects. This accounts for a sense of similarity or relationship between different objects as well as at least partially accounting for the sense of ‘what goes with what’ in terms of aesthetics, design, shapes and unity. An example of this is how deep sounds are commonly associated with large objects. For example, a large motor vehicle or an elephant might be expected to fit with a big, deep sound or a beat with a slow surface rate. A large space might be expected to fit with a sound featuring strong echo or reverb. The expectation can overwhelm logic, as illustrated by the use of echoed, expansive music for films set in space, where, of course, there would be no sound due to the vacuum of space. A sense of congruence can come from convention/tradition but can also be determined by an isomorphic relationship. Homology is closely related and marks the rhyming and structural similarity of different elements. While it describes an ‘echoing’ of forms, it can be more loosely applied. According to Rudolf Arnheim, ‘homology’ and ‘structural kinship’ in cultural objects work on a psychological level as an essential part of their unity across forms.
This process is central to the ‘psychophysical parallelism’ of mental state and object perceived.17 So, isomorphism states that there is the same structure to experiences and the processes that underlie them and that consequently processes taking place in differing domains involve a corresponding structural organization. Multistability is a regular perceptual phenomenon where an ambiguous percept can oscillate between two (and sometimes more) different understandings of the same object. This is most clearly illustrated by some of the most famous Gestalt diagrams, perhaps most notably Rubin’s vase. It can be understood either as a vase with a wide top and bottom and a slim waist or as two silhouettes of human faces in profile facing each other. We can ‘toggle’ mentally between the two different ‘stabilities’.


This illustrates vividly how human beings perceive a sense of unity to what is an abstract design or, rather, perceive two different unified configurations within a single form. Multistability allows for conceptual movement between different configurations within the same object. Related directly to this is the notion of ‘invariance’, whereby an object can remain recognized as the same object despite being changed in terms of angle of view or scale or, indeed, other variations. An example of this, in effect what is sometimes called Gestalt movement, is a melodic sequence in music. People can recognize a sequence of perhaps six or seven notes, despite them being transposed into a different tuning or key. This shows that we don’t perceive notes/pitches as individual discrete objects but rather as intervals, variations in pitch between the notes, and, crucially, the succession of intervals as a whole. This underlines how the whole and relationship between the parts is crucial and more defining than the individual notes in themselves. Rudolf Arnheim characterizes the intervention from Gestalt psychologists in the early twentieth century as a necessary and significant enrichment of scientific approaches, noting that “… something like an artistic vision of reality was needed to remind scientists that the most natural phenomena are not described adequately if they are analyzed piece by piece. That a whole cannot be attained by the accretion of isolated parts was not something the artist had to be told”.18 The dominant form of psychology, cognitive psychology, and its counterpart in the study of film, ‘cognitive poetics’, both privilege component analysis and an approach that atomizes elements of the object under scrutiny. Such an approach to audiovisual culture has given many useful insights. However, for audiovisual analysis, a Gestalt-inspired approach makes more sense.
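The melodic invariance described in this passage can be illustrated with a minimal sketch. It assumes, purely for illustration, that pitches are written as MIDI note numbers and that transposition means adding a fixed number of semitones; the phrase and the function names are hypothetical and come from neither the chapter nor any cited study.

```python
def intervals(melody):
    """Return the semitone steps between each pair of successive notes."""
    return [later - earlier for earlier, later in zip(melody, melody[1:])]


def transpose(melody, semitones):
    """Shift every note of the melody by the same number of semitones."""
    return [note + semitones for note in melody]


# Hypothetical seven-note phrase written as MIDI note numbers (60 = middle C).
phrase = [60, 62, 64, 65, 67, 65, 64]
transposed = transpose(phrase, 5)  # the same phrase, a fourth higher

print(phrase == transposed)                        # every individual pitch differs
print(intervals(phrase) == intervals(transposed))  # the interval pattern is unchanged
```

Every pitch changes under transposition while the succession of intervals stays the same, which is one simple way of picturing why the relationship between the parts, rather than the individual notes, carries the identity of the melody.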
It is clear that individual film elements are not autonomous, although on occasion they might be partially uncoupled from the rest of their context, and music is a good example of this. So, for audiovisual culture, an approach based on ‘pulling apart’ elements of what are often understood as separate discourses (e.g. ‘the film’ and ‘its music’) is not as helpful as approaching the whole as the significant object and understanding its components as significant in relation to the whole.

Gestalt Extrapolation

Dominant film theory has tended to underplay the level of perception and emphasize the conscious comprehension of film. Cognitive poetics is without doubt the dominant approach to the analysis of film, and David Bordwell has done more than anyone to establish this highly effective strategy. In Making Meaning, David Bordwell discusses the levels of understanding used in film comprehension and analysis.19 These are:

1. Referential meaning (syntax)
2. Explicit meaning: shared symbols
3. Implicit meaning: from a ‘rich engagement’, themes
4. Symptomatic meaning: film is hiding something

The first, ‘referential meaning’, works on a syntactic level and concerns ‘what happens’ in the film. This requires an understanding of the basics of audiovisual ‘language’, without which elements would seem random and meaningless. ‘Explicit meaning’ works on a level of shared symbols, themes and continuities that hold together and underpin narratives (e.g. murder and crime, police and detection). Understanding narrative, this level comprehends not just what happens but why it happens and might engage generic and other aspects common to audiovisual narratives. The level of ‘implicit meaning’ is on a higher and more abstract level, coming from a ‘rich engagement’ with the film’s themes. This might relate to social and political ideas, for example, that require some extra-filmic knowledge for their realization. The final level, ‘symptomatic meaning’, is when the analysis assumes that the film is hiding something, and he states that this is not a valid form of film analysis. This last approach has been prominent at times in the study of films and audiovisual culture. I understand Bordwell’s antipathy towards such approaches to analysing film.
His negativity towards ‘symptomatic meaning’ was likely provoked by psychoanalysis-inspired approaches to film interpretation, with their ‘diagnoses’ of ‘symptoms’, finding something else hidden underneath that is not apparent or perhaps just a couple of clear elements that need to be ‘joined up’ by some
imaginative commentary that addresses elements not apparent in the film. While I take Bordwell’s point, I am uneasy about this positivistic approach. Sometimes film can be interpreted through what it is not and through what is not there in the film. For instance, classical Hollywood cinema of the 1930s and 1940s has remarkably few African American actors evident on screen. They are absent; one might argue they are a ‘structured absence’ in these films. This could not only lead to questioning why African Americans are not present, but it might also lead to analysing the film texts for evidence of displacement or metaphor. In both cases, it would be something of a symptomatic reading. Similarly, subtext can sometimes be intermittent and incomplete but insistent. An approach dealing only with what is directly evident in the film doesn’t consider how much we mentally fill in with film. Indeed, Bordwell’s influential discussion of art cinema emphasizes its narrative voids, its gapped syuzhet, as a defining characteristic. Furthermore, at various times and in different ways, filmmakers have had to negotiate censorship and have been compelled to code sexual or political ideas in ways that are not explicit in their meaning. Perhaps Bordwell’s levels of interpretation can be rethought as ‘perceptual’, ‘cognitive’, ‘interpretative’ and, beyond that, ‘imaginary’ or ‘exegetic’? While Gestalt perception works at every level, it is worth noting that the first and the fourth might be of particular interest. This final level concerns me. To take this a step further, we most certainly ‘fill in the gaps’ when watching films and other audiovisual culture, often to the degree that we would be surprised just how much we have surmised rather than having been presented with. I would suggest this process might be approached as a way of dealing with aspects that are implied but not evident, not represented but perhaps imagined. 
If we learn anything from Gestalt, it is that we add to what is there: we extrapolate. Partially occluded objects are an interesting phenomenon. Do we perceive them as we see or hear them, as a thing in themselves, or as only a partial object? Children see the crescent of the moon and sometimes in this see the face of the ‘man in the moon’. Adults, in their minds, understand the ‘whole’ of the moon, despite only seeing the crescent.20 Sometimes, it seems like the other part can be seen as we imagine the whole when faced with partial information. Situations like this appear to be an instance of top-down mental processes over-ruling and adding to what is provided by perception. Something similar might be happening in our perception. The Gestaltist ‘law of closure’ illustrates that we ‘finish off’ images that are partially explicit, making them into a coherent whole
despite the missing section(s). Perceptually, the visual system can bridge holes or scotomas in the visual field. This is the case for the pair of holes in vision present naturally in all individuals. This is the ‘scotoma’ (sometimes called the ‘blind spot’ in our vision). Each eye has a zone where no stimulus is received but our perceptual system fills in the gap with information.21 Yet not only do we not notice this gap, even monocularly by shutting one eye at a time, but also, visual objects that run across this spot appear to us to be continuous even if they have gaps that fall within the scotoma. Mechanisms of visual fill-in are present not only here at the scotoma, and such ‘papering over’ of a gap in impulse can be created by causing an aporia of information.22 Perhaps something similar happens with our perception of screens, where we don’t notice beyond the edges of the frame, unless our attention is called to it. Similarly, perhaps, we rarely become aware of sound perspective changing in audiovisual drama, unless called to pay attention to it. And it changes constantly and dynamically, from a pervasive enveloping overall sound to a character’s point of view sound to being able to hear whispering characters perfectly. So, a Gestaltist approach fully understands the principle of a ‘structured absence’, and perhaps we imagine what the film or audiovisual object doesn’t actually show. Of course, film has always relied on this. Gaps in what a character does are either ignored as irrelevant or imagined, whether explicitly or not. We might assume that we are shown everything that is salient. However, that is not always the case: as classical Hollywood was unable to show sex scenes, they had to be implied and understood to have happened by the audience, although perhaps not explicitly imagined. Gestalt principles understand that humans have a desire to apprehend objects as wholes and easily understood unities.
This follows the principles of the ‘good Gestalt’, which applies a sense of closure and comprehensive structure. Faced with an incomplete picture, we will likely ‘fill in’ what we don’t see. Our perception extrapolates for us, and our top-down cognitive processes will do the same on top of that. As with all Gestalts, when viewing a film, the whole is perceived before the parts, the overall pattern before the particular details. The details are often ‘filled in’ by the mind without them actually being seen or even sometimes without them really existing, as happens in the well-known illusion of the so-called Kanizsa triangle, which is defined only by the objects surrounding it.23


This clearly must be a bottom-up process, part of perception, as we still see the triangle if we look away and then look back at the image. This is a fine illustration of the primacy of perception and its trumping of higher-level brain activity, even if that sometimes is only temporary. An approach from Gestalt psychology underlines a reality. If we hear a particular sound, we imagine its completion with an image. The source needs confirmation. However, when we see something without an associated defining sound, we think there’s a problem with our hearing.24 The point is that we assume a unity, a whole, rather than taking the received signals as full. Of course, sounds that we consign to the background soundscape and images of things where we would expect no sound almost prove this point rather than contradicting it. This process takes place across perception, with us making up the full object from incomplete data. A fine illustration is the ‘phantom fundamental’, where we can seem to hear a low note that is not in fact present. If a selection of the harmonics (the specific higher multiples of that low frequency) is sounded, then we add in the ‘implied’ note that is absent.25 If there are spaces, we ‘fill in the gaps’, making perception stronger and more personal and perhaps even making the powerful effect gained by the audiovisual even stronger. Gestalt completion is begun as a bottom-up process but can be finished as a top-down process. We should be careful to be aware that this is often a ‘post-perceptual effect’, a top-down operation to fill in the gaps. Sometimes it might be difficult to tell the difference.
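The arithmetic behind the ‘phantom fundamental’ mentioned above can be sketched in a few lines. This is a deliberately simplified illustration, not a model of auditory processing: it assumes the sounded harmonics are exact whole-number frequencies in hertz and treats the implied note as their greatest common divisor. The function name and the example frequencies are hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental(harmonics_hz):
    """Return the largest frequency of which every sounded partial is a whole multiple."""
    return reduce(gcd, harmonics_hz)


# Hypothetical tone containing only the 2nd, 3rd and 4th harmonics of 200 Hz.
partials = [400, 600, 800]
fundamental = implied_fundamental(partials)

print(fundamental)              # 200
print(fundamental in partials)  # False: the implied note is not actually sounded
```

The 200 Hz component never appears in the signal, yet it is the one frequency that organizes all three partials into a single harmonic series, which is a compact way of picturing the completion the passage describes.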
Andy Clark sees the process as being a matter of incoming sensory signals being met by top-down expectation, and an assessment being made of them.26 This makes sense as we need some model of experience to have any hope of understanding incoming perceptual impulses, and Clark proposes that ‘predictive processing’ is possibly the key area in this procedure.27 So, Gestalt can be a model of understanding, but in some ways more significantly, Gestalt can define the expectation of a complete and coherent object. Implications here go beyond that of the perceptual Gestalt and draw upon both perception and
‘post-perceptual effect’ of top-down processes. I am particularly interested in how this might be used in aesthetic terms as well as in cultural interpretation. In discussing the general approach of Wolfgang Iser and literary ‘reception theory’, Terry Eagleton points to how with cultural objects an extrapolation takes place from available information rather than perceiving something as being full of gaps.28 This gives a conceptual application to Gestalts as well as their perceptual functioning, although the division between the two is far from clear. Is this perception filling in the gaps or a post-perceptual effect or the incomplete signal being supplemented by top-down processes? At the very least, it appears to mark a deactivation of our focus on the incompleteness of objects. Video game theorist Mark J.P. Wolf uses the idea of Gestalt completion to explain how game developers can create the illusion of a complete world in a game through presenting some of it but encouraging the player to fill in the gaps and understand it as a whole.29 It is also clear how this works with films, for example, where we do not worry about the ellipses between sequences, when it is immaterial what a character is spending their time doing between two significant scenes. A similar process appears to take place where there are aporias of any sort that are not signalled to be registered as such. Yet even when it is plain that there is a gap in our information, there appears to be a drive to ‘join up’ what is supplied, to make it into the most pleasing and easily consumed format. Higher processes make inferences to unify. Another film example is the censor’s edit from Frankenstein (1931), which withholds the depiction of the fate of the little girl. This leads to imagination filling the gap, which suggests a brutal wilful murder by the monster and possibly with sexual intentions.
The excised section cuts just as he looks at her and moves towards her, cutting to her father carrying her dead body into the town, losing the sequence where the monster throws her into the water like the flowers floating and then is distraught when she drowns. Aporias get filled. After seeing the film for the first time, many viewers of Tarantino’s Reservoir Dogs (1992) are convinced that they saw a man’s ear being cut off. In fact, the camera coyly moves focus to the corner and we hear his screams. The sound and narrative situation are enough for the audience not only to mentally ‘fill in’ the images but also to believe that they have experienced an image that was absent. While we might imagine that higher operations are separate in character, perhaps they follow the logic evident in perception as well as being dominated by its bounty. Irvin Rock suggests that thinking could well
have evolved from perception and thus might follow the same procedures rather than simply being master of the signals it receives.30 Consequently, we could well have an in-built desire for wholes and completing incomplete objects that might dominate our higher-order processes, too. Post-perceptual effect appears to corroborate this. The Gestalt logic focuses on wholes rather than separate parts, forcing us to understand objects as distinctive things and unities in themselves. Indeed, conspiracy theories and wild interpretations of both films and historical events might be based on some solid evidence, but the extrapolation is the important part, the stretching to accommodate the structured absence and imagining the whole, no matter how strange that might be.

Spandrels and Sweet Spots While Evolutionary Psychology (EP) often tries to understand art as an evolutionary ‘adaptation’, there are some other possibilities. Steven Pinker sees art and wider culture as ‘cheesecake’, something attractive to us but not an evolutionary adaptation. I’ll discuss this shortly. Stephen Jay Gould suggested that art might be a ‘spandrel’, an evolutionary by-product but without a particular function.31 This was derived from architecture, where two separate functional features might sit side by side and appear as a third feature, giving the impression of that being the important feature itself. This is not an uncommon aspect of design. A spandrel is an example of this, where a flat surface is added to the top of an arch, producing triangular spaces which can often be used for holding a decoration. Gould and Lewontin’s point is that some notable aspects of human existence might not be adaptations that developed to help human survival but actually a serendipitous mixture of aspects derived from almost unrelated evolutionary adaptations. Yet this idea offers an intriguing prospect. While something that seems significant in itself might actually turn out to be powerful due to its combination of materials rather than in its singularity, its importance may not be in its function for human survival, but then this is perhaps not currently so important as a direct impulse for human development. Instead, perhaps, it has developed to fit the exigencies of human perception and developed in a particularly efficacious manner. Perhaps Evolutionary Psychologists casting around for vague ideas about how culture serves evolutionary impulses and processes is a waste of time. Ellen Dissanayake suggests that art might be a spandrel.32 Rather than thinking of it as an

adaptation, perhaps it makes more sense as a by-product of adaptation. The notion of the spandrel offers a potential answer, with culture more generally and audiovisual culture more specifically serving as an example of an object that may have derived indirectly from more than one adaptation, but rather than a direct aid for human survival, its central tenets have come from perceptual impulses and their product: leading to the proverbial ‘2 + 2 = 5’. The notion of the spandrel fits well with Gestalt theory and its oft-repeated adage of the whole being more than the sum of its parts. The spandrel also marks a whole new object in itself, rather than simply the product of two different objects being pressed together and exhibits a significance for itself that essentially is not due to its constituent parts. While the synergetic effect of music combined with the moving image can make a startling by-product, we should always remember that it exploits precisely the affordances and propensities of human perception. Perhaps moments of significant combination in terms of our aural and visual perception embody this ‘spandrel’ effect. Such ‘sweet spots’ of audiovisual culture are relatively common, and people often tend to agree on their efficacy. While music and sound aficionados discuss the ‘sweet spot’ of audition, there clearly exists a synergetic ‘sweet spot’ of music and moving image combined, where the whole yields far more than the sum of the parts to exquisite effect. While this may be accepted wisdom, it has been scrutinized little by theory, with almost an assumption that the characteristics of sound and image separately cause effect. These can be particularly effective and strived for by producers. 
They can have a significant effect on the audience, sometimes being breathtaking, and with a high degree of immersion, while afforded by the unity of ‘the audiovisual’ as a particularity rather than an amalgamation of two different media, or as separate channels of aesthetics. Audiovisual culture potentially has a momentous emotional effect which can be a ‘sweet spot’ of perception rather than being a higher-level recognition of the emotional situation on screen. This is less ‘emotional engagement’ and more like the way that sugar hits the human tongue’s ‘sweet spot’, gaining a response. Such sequences and moments owe at least something to the tradition of song and dance sequences in film musicals. They also owe much to the point in the 1960s where filmmakers decided that it made sense to edit their dramatic images to existing pieces of music. Indeed, is this process at the heart of it all? The proliferation of such sequences on YouTube not only points to the increasingly modular construction of films but also to the more general short sequence-based

conception of audiovisual culture. Indeed, this ‘sweet spot’ might be particularly evident when dynamic images are accompanied by dynamic music, with peaks and troughs in both coalescing and diverging in a further dynamic relationship. Such a level of stimulation is likely to produce a state of arousal, as experimental psychologists say, in the audience which would yield a physiological reaction. The issue of conceptualizing sound and image’s relationship: I would suggest that music does not ‘add’ to the image but converts it—through a process of ‘mutual transference’. Music is ‘imagized’ and the moving image is ‘musicalized’, in a reciprocal relationship that embodies and homologizes the perceptual-cognitive process taking place in our heads. The distinctive formation makes a particularly potent combination of the indexical realism of the image with the emotional immediacy of music. As a part of this, the image provides an (imaginary) visual spatialization of musical structure—one that will not correspond with actual musical structure—but adumbrates an ‘emotional sense of structure’, or what might better be thought of as a seemingly logical sense of emotional structure. Returning to the Borgo Pass sequence in Nosferatu: Phantom der Nacht that I discussed earlier, the fairly disjointed succession of images of rivers, waterfalls, mountains and sky are accompanied by Popol Vuh’s On the Way and Richard Wagner’s prelude to Das Rheingold. Both musical pieces lack traditional structure around melody and harmonic movement. The Popol Vuh piece is based on the repetition of two notes a tone apart sung by a choir, while the Wagner overture is an intense build-up of texture, an orchestral dynamic ramp of arpeggios and sustained notes. On one level, there is a clear isomorphic relationship between the inert but emotionally charged images and music. Such moments in film are not about ‘representation’ as such. 
The music does not ‘add’ to the image but converts it into something else, while the music is also transformed by the images. The combination yields a significant effect. The logic of the McGurk Effect is clear here in that the final product is something startling and, significantly, qualitatively different from the constituent parts that nevertheless are impressive on their own in this case. The sequence is overwhelming, yet perhaps it forces the audience into a state of ‘under-perception’ of elements, merely ‘getting the gist’ of it,

receiving a general idea without getting the specific details. Or perhaps this becomes an ‘emotional representation’, which is less of a representation of a place than it is a representation of an emotion. Even the most specific footage of a place becomes something different as film and music combined. Clearly, there is something of the sensual overload here. A physiological-functional approach might be aware of the impact of an audiovisual ‘sweet spot’ coming from an overloading, providing too much in the way of stimuli for the senses. Logically, this makes some sense. The music is made ‘to be listened to’ (both the Wagner and Popol Vuh pieces were not written to fit the film), added to images that are highly impressive and interesting in themselves. Most are extreme long shots. The addition of awe upon awe is near overwhelming, particularly on a large screen with good sound. Of course, director Werner Herzog likely was aiming at an effect approaching the Burkean sublime in this sequence, which marks a psychic transposition between the ‘normality’ of Bremen and the ‘otherworld’ of Count Dracula.

Fig. 5.1  Nosferatu, Phantom der Nacht

Indeed, it is possible almost to feel ‘drunk’ on audiovisual effect. This is not easily accounted for, either. It seems reasonable to suggest that it is something to do with synergetically lining up perception and emotion. It doesn’t seem unreasonable to imagine that at least some of this startling effect might emanate from audiovisual culture ‘hitting the spot’ that has been left as a by-product of evolutionary developments of the human form. Along similar lines, the feeling of significance that wreathes some films might have less to do with ‘content’ than it is to do with the spandrel of the audiovisual, where a sense of quasi-religious transcendence might be accessed as a by-product of mental machinery’s development engaged by culture that seeks out direct affect. Music is often understood as the art with the most direct affect. However, according to cognitive psychologist Steven Pinker, “As far as biological cause and effect are concerned, music is useless. It shows no signs of design for attaining a goal such as long life, grandchildren, or accurate perception and prediction of the world … I suspect that music is auditory cheesecake, an exquisite confection crafted to tickle the sensitive spots of at least six of our mental faculties”.33 He understands music as a spandrel, a by-product of evolutionary adaptations and their mixing rather than an adaptation in itself. He points out that cheesecake is a by-product of the human desire for fat and sugar, a desire which goes back deep into the human past. And so, cheesecake is not important in itself. ‘Spandrel theory’ is a useful outlet for evolutionary psychologists or biologists, who then do not have to find an evolutionary function for everything. However, things are not so simple, and some theorists have suggested that there are direct evolutionary advantages from music. 
For example, Daniel Levitin, in This Is Your Brain on Music, notes that music can supply advantages in social bonding, cognitive development and sexual selection—all crucial aspects of human development and evolution.34 While broadly I have to agree with Levitin, there is something alluring about this ‘useless’ theory being assigned to a significant area of culture. Pinker thus dismisses music as unimportant and merely something that gives fleeting but insubstantial pleasure. I’ve heard this argument many times about certain types of music in comparison to others, or certain films in comparison with others. Perhaps cultural objects should not all be tarred with the same brush but registered as doing different things on every level. Yet on some levels they are remarkably similar.

Now, if the intellectual faculties could identify the pleasure-giving patterns, purify them and concentrate them, the brain could stimulate itself without the messiness of electrodes or drugs. It could give itself intense artificial doses of the sights and sounds and smells that ordinarily are given off by healthful environments. We enjoy strawberry cheesecake, but not because we evolved a taste for it. We evolved circuits that gave us trickles of enjoyment from the sweet taste of ripe fruit, the creamy mouth feel of fats and oils from nuts and meat and the coolness of fresh water. Cheesecake packs a sensual wallop unlike anything in the natural world because it is a brew of megadoses of agreeable stimuli which we concocted for the express purpose of pressing our pleasure buttons. Pornography is another …[and] the arts are a third.35

This is also an alluring idea, although one that atomizes the cheesecake rather than attempting to understand what the cake does as a whole. After all, few of us would take the cake apart into its constituent parts to eat it. Pinker’s approach also negates any sense of the cheesecake’s nutritional (calorific) value, which is important in both positive and negative ways, as a desired infusion of calories or a ‘guilty pleasure’. In some cases, the conglomeration of elements can, while forming a new whole, retain a strong image of separate parts. Perhaps some of the best examples of forcing together two potentially autonomous discourses in the production of something totally new, in what is sometimes called a ‘forced marriage’, are from the addition of new music for silent films from the early twentieth century. A good case in point is the version of The Phantom Carriage (1921, Körkarlen in the original Swedish) toured and released on DVD in 2010 with music by the ensemble KTL. The Phantom Carriage was directed by and starred Victor Sjöström. It concerns self-destructive alcoholic David Holm, who repents his life after being faced with the grim reaper in The Phantom Carriage. It is a dramatic tale of morality, with Holm treating everyone extremely poorly (especially his own wife and child), allowing for a strong redemption at the film’s conclusion. The Phantom Carriage has a notable status as a classic silent film. Not only is it known as the film that inspired Ingmar Bergman to become a filmmaker,36 but it is also a landmark for its double exposure visual effects for ghostly apparitions. Over the years different scores have abounded, in effect yielding a whole new experience of it.37 The Phantom Carriage has received DVD releases as different versions with scores by Matti Bye or KTL. Live versions have been performed by Jonathan Richman (who topped the charts with the Modern Lovers in the late 1970s) in 2007, and The Horses (aka

Acid Pony Club, consisting of DJs Laura Ingalls and Clement Pony) performing live electronic music to the film in 2014 in Shanghai.38 On the Internet, there are also accessible versions by Gustaf Lindström (electronics and voice), Edward Rolf Boensnes (electronic keyboards), Signal to Noise Ratio (rock, 2011), Franz Danksagmüller and Berit Barfred (electronics and voice, Barcelona 2010), Matt Marshall (piano) and the Napa Valley Youth Symphony.39 The 2010 DVD release of the non-traditional ‘KTL version’ (as it is known) aimed at the inculcation of a primal psychological state in a more insistent way than most film scores. KTL’s film music was much like their non-film music: the ensemble is comprised of drone rock guitarist Stephen O’Malley and electronic musician Peter Rehberg.40 The music is based on a continuous droning sound, lacking notable melody or harmonic changes. The default sound of KTL involves O’Malley’s distorted electric guitar sustaining chords seemingly endlessly, while Rehberg uses electronic treatment of that sound and bolsters it with complementary electronic sound from either a synthesizer or his laptop computer. One might argue that their sound is ‘cold’ and that this comes partly from their embrace of technology and lack of organic sound.41 Its bleak austerity is harrowing enough, but allied to Sjöström’s horrifying film, the effect at times is overwhelming. In terms of cumulative effect, the use of constant drones inspires anticipation in the audience but also makes for a physically wearing experience. Furthermore, rather than following the conventional modes of silent film musical accompaniment, or indeed even taking a recognizably ‘musical’ (in the traditional sense) approach, the soundtrack adopts experimental musical aesthetics and aspires to a direct articulation of a distinctive psychology, aiming at wreathing the film in a feeling of dread.

Figs. 5.2 and 5.3  The Phantom Carriage

The KTL soundtrack to The Phantom Carriage is a wholly alien prospect for traditional silent film music and arguably goes against the intention of the film and likely the desires of the filmmakers. Clear differences evident in the KTL version include harsh timbre, regular use of low-frequency sound, use of a drone basis with almost no melody, asynchrony and the cumulative effect of the music’s repetitive character. In terms of asynchrony, the music is not closely integrated with the image and often feels like it is proceeding almost irrespective of the film. This is most evident when David has a flashback to when he was happy, going for a picnic with his family at the lake. The tone of the images is a stark contrast to earlier in the film, yet the music largely carries on with its ominous droning and slow changes, making a sharp Eisensteinian counterpoint with the film’s emotional tone. KTL’s Phantom Carriage invokes some fundamental questions of film theory, pertaining to the possibilities of audiovisual disjunction along the lines discussed by Sergei Eisenstein and Noël Burch in relation to politicized art and by Hanns Eisler and Theodor Adorno with regard to musical integrity in film.42 In Eisenstein’s analyses he looked for a common denominator between music and image and alighted at first upon the notion of ‘movement’.43 His sense of equivalence between film images and music was informed by a sense of structural resemblance at a profound level, allied to a sense of synaesthetic mixing of stimuli and sensation.44 This is on the way to addressing isomorphism and ‘sweet spots’ but approaching them from a very different angle. If an audiovisual product tests the McGurkian melding of sound and image, it might be this. In the KTL The Phantom Carriage, music appears more ‘cyclical’ than developmental. In fact, in musicological terms it embodies stasis rather than a sense of ‘movement’ that we might expect in a dynamic relationship with the images. 
Unlike film scoring traditions of ‘Mickey Mousing’ (mimicking action) and the less crude, precise matching of music to action, here, perhaps, the film moves while the sound freezes. This is the opposite of Eisler and Adorno’s suggestion in Composing for the Films that music ‘breathes life’ into the frozen images.45 Rather than functioning as a stimulus of movement, KTL’s music acts to slow events down.46 The semi-autonomy of music for some silent films has caused consternation in a similar manner. The merest mention of the Moroder version of Metropolis (1927, 1984) used to inspire film historians and film buffs to paroxysms of anger, as would the Type O Negative version of Nosferatu (1922, 1998), had more of them known about it.

Extrapolating Off-Screen Sound: The Technological Supernatural

As noted earlier, off-screen sound can pose a question about the source of the sound. It can also pose a question about the status of perception. Consequently, it has become a central device in horror films after the advent of synchronized recorded sound, in particular for the depiction of the supernatural in film. Sound alone sets vision into action: we hear a sound and look for its origin. While in the real world we might understand a predator as the object, its approaching sound is the symptom. In certain films, sound is used in this way, perhaps most notably in the horror film, which has a strong tradition of activating off-screen space. However, it is often used in a manner that denies the confirmation of the source of the sound, exploiting this anxiety-producing evolutionary trait pertaining to uncertainty about threats. Examples include Norman Bates’ mother in Hitchcock’s Psycho (1960) or the killer in Dario Argento’s Tenebrae (1982). Here the reverse shot showing the killer’s identity is withheld during the attacks, leaving his point of view of murders but furnishing the sounds of his voice: a disturbing approximation of Donald Duck’s voice. This relies upon the kind of cross-modal confirmation discussed at the end of the previous chapter. Pierre Schaeffer’s term ‘acousmatic’ was co-opted by Michel Chion to describe off-screen sound in film that is disconnected from its source and from what is depicted in the accompanying shot. Chion notes that this “… opposition between visualised and acousmatic provides a basis for the fundamental audiovisual notion of offscreen space”.47 In the two cases cited above, this would be what Chion calls an ‘acousmêtre’, an unapparent person or object whose voice can be heard but whose material body cannot be verified by vision to match the voice.48 Matching these concepts, R. 
Murray Schafer’s notion of ‘schizophonia’ has been used to describe the splitting of sounds from their original contexts. Technology, such as recording, allows for a voice to split from its origin and move to somewhere else.49 Indeed, some technologically dislocated voices make an arguable approximation of the supernatural. Is the acousmatic supernatural by nature? Structurally, for audiovisual culture, it might be. The ‘natural’ is synched, on-screen, explained and unproblematic. As I noted, off-screen sounds can pose a question of narrative, of diegesis and of perception.50

The Gestalt of sound added to ‘not its image’ poses an absence, and we are driven to imagine ghosts because of this. Indeed, partial absence and partial presence are defining characteristics of the supernatural. The Innocents (1961) was an adaptation of Henry James’ celebrated short story ‘The Turn of the Screw’ (from 1898), concerning a governess (played by Deborah Kerr) and her young wards in a country mansion. Detailed by their guardian uncle to look after them without disturbing his life in London, Miss Giddens becomes convinced that the siblings Flora and Miles are possessed or threatened by the ghosts of two recently dead lovers: a servant (Peter Quint) and her predecessor as governess (Miss Jessel). In an almost empty house, Giddens, with some reluctant help from housekeeper Mrs. Grose, confronts each of the children over their precocious behaviour, which results in Flora becoming hysterical and Miles dying. The film has strong overtones of child abuse and paedophilia. Following James’s story, the film is ambiguous as to whether it is a ghost story or one of psychological horror, where the governess is imagining a situation that may not be real. Due to this pivotal ambiguity, the story has inspired many different interpretations. Culture often functions to mediate between the physical and the psychological, between the material (so-called ‘reality’) and the imagination. The Innocents is a potentially confusing film: while it follows the general conventions of naturalism that suggest the depiction of a believable reality, in significant ways it also suggests that we are following merely the imagination of the protagonist. While there is seemingly a fluid boundary between ‘reality’ and imagination in the film’s narrative and representations, equally, there is no solid demarcation between the functions of music and sound in the film, and sound and images also have a relationship that at times appears to have drifted apart. 
In The Innocents, one of the two children, Flora, tells her governess: “You get a lot of sounds but Mrs. Grose says you should ignore them”. The film’s soundtrack elements include birds, electronic sound effects, musical score, ambient sound effects, adult and children’s voices (sometimes echoed), a child’s song and children whispering. It is no less dynamic than the film’s black and white images. The relationship of sounds and images, and the way we extrapolate to imagine the wholes, is clear from the way that the film begins in a highly singular manner. While the screen remains a black emptiness, a little girl’s voice appears, singing an unaccompanied song about a dead lover (O Willow Waly). This establishes the

importance of sound for The Innocents, while the film remains sightless. After some time, the film’s titles begin (with ‘20th Century Fox’, then ‘Cinemascope’). Next, the girl’s voice is superseded by the song of a nightingale, and then shortly afterwards the sound is joined by incidental music as the governess’ disembodied praying hands appear in darkness, and then she begins mumbling about saving the children as her body emerges from the darkness of the void like a lost soul or ghost herself. The music has a chamber character, with a thin texture based on woodwind. Once the hands appear on screen, the music becomes more strident, with cymbal rolls and stormy string support. The flutes sound as if they are being overblown with close microphone placement, making them sound distorted. The melodic impetus of the music is a variation on O Willow Waly that works around some of the basic melody without any of the rhythmic articulation of the song.51 The most convincing sonic interpretation of this sequence is not as a range of different sounds (dialogue, sound effects, and music) but as a ‘musical’ melange of sonic elements. Music in the film ought to be approached (at least partially) as a sound effect52 and vice versa. In the opening sequence, the musical score is displaced by voice-over and maintains the morose tone of the voice. Indeed, in film analysis there is a tendency to ignore the ‘music’ (specific sound) of spoken dialogue in favour of its semantic content. The song of the nightingale remains, as a sonic grounding for a two-part temporal structure that is not motivated by narrative development but more by purely aesthetic concerns within a sequence that includes some diegetic action and narrative information, mixed with conventional aspects of film opening credits. The persistent black screen at the start, however, is far from conventional and serves to cede the foreground to the soundtrack. 
From the start, the soundtrack is not ‘grounded’ in the image, which is the normal approach in both theoretical and industrial conceptualizations. The song, O Willow Waly, is haunting and disturbing, not least for being a first-person paean to a dead lover and seemingly sung by a little girl.53 Flora hums its tune on two occasions immediately before the ghosts appear in the film, so it seems to have invocatory powers, and this marks a noteworthy and portentous opening of the film. The film’s opening, like the final sequence in the film, has a level of confusion about diegetic status (as a ‘real’ place in the film or in the governess’ head). In The Innocents, the role more often taken by non-diegetic music is displaced onto diegetic sound elements, which are organized in a

manner that has less to do with diegetic events on screen and more to do with abstract aesthetics, both in terms of structure and in terms of the characteristics of the sounds themselves. Not only is the musical score pushed aside by other sounds, but when it appears it is also often synched precisely, in the face of much unsynchronized off-screen sound. Even though children’s rhymes and singing have become a staple of horror film scores, this was not the case at the time of The Innocents’ release. The film’s musical score was written by Georges Auric and is not a ‘horror score’ in the conventional sense of the time of the film’s production. Rather than noise and dissonance, it is often quiet and unassuming but highly atmospheric. The score has a close feeling and a chamber intimacy to it, comprising a small ensemble with close microphone placement for the instruments. This mirrors the enclosed rooms often evident on screen and close-up shots of faces, although not the shots outside the house or the deep focus shots of large spaces in the house. The closeness of the non-diegetic music makes a strong contrast with the expansive sounds of artificial echo and reverb. Electronic sounds appear at key moments in the film where supernatural events take place, regularly accompanying a succession of antecedent ‘looking’ shots followed by consequent POV shots. The sounds signify ‘abnormal’ situations, both as a code and as an unfamiliar sensual sonic texture. The bursts of electronic music in The Innocents are not melodic but rather single tremulous tones and characteristically electronic sounds that likely were produced using an oscillator and electronic filters.54 This predates the use of integrated synthesizers with keyboards, and it should be emphasized that this is pioneering electronic sound. 
The Innocents’ electronic sound was created by Daphne Oram of the BBC Radiophonic Workshop,55 which might be thought of as the British equivalent to electronic research centres such as IRCAM, Columbia-Princeton and Darmstadt. The BBC Radiophonic Workshop was not an art music organization but produced ‘special sounds’ for radio and television programmes. Oram, who later called her electronic music ‘Oramics’, was not a ‘musician’ as such but was trained as a BBC ‘studio manager’, an official job description based on engineering rather than music, as was the case with the BBC Radiophonic Workshop generally, until much later. She wrote a book called An Individual Note: Of Music, Sound and Electronics (1972), which defined a theory of music derived from ‘natural laws’ inherent in electronics. The early pages expound a correspondence between music and electronics: “Can we enter both the music and the electronic fields at the same time? …

Can composer be mingling with capacitors?”56 Later, there is an appeal to the idea of many circuits being on the verge of the organic, like a human brain. While it is a highly eccentric but engaging book, there are resonances with the notion of taking a holistic approach to soundtracks, adding together parts that seemingly are separate by definition. The book also marks an early call for the history of music (and especially film and television music) to be seen as tied inextricably to developments in electric recording and technology, which has been articulated most forcefully in recent years by Mark Katz in Capturing Sound: How Technology Has Changed Music.57 While there has been some scholarly interest in sound recording, there has been less about electronic sound processing, some of which has been crucial for film sound and music generally and The Innocents specifically. The phenomenon of ‘delay’ comes from sound reflection off hard surfaces and its lack of absorption in certain spaces. This can lead to sounds that range from reverb (a ‘smearing’ of the clarity of sound that comes from a large space) to echo (where repeats or ‘slap back’ occur as sounds bounce towards the listener).58 Reverb (short for reverberation) offers a sense of space, where slight sonic reflections delineate a spatial topography. The effect when speaking into a microphone is to sound like the event is taking place in a large empty room, to a degree that varies with the amount of effect added to the original electronic signal. Echo is a more extreme version of sound reflection, where the space appears larger through repeated reflections of the initial sound, again to a degree depending on the amount of processing of the initial signal.59 Originally, reverb and echo effects were achieved mechanically and electronically, through the use of metal springs and reverberation plates. 
Tape delay used looped magnetic tape to record and repeat a sound, and recent digital delay uses a mathematical processing algorithm. The earliest sound processing of this sort ranged from spring reverbs (such as those that were an essential part of the Hammond organ) and metal plate reverbs to devices that were based on tape recorders with multiple recording and playback heads. Analogue echo units included vital pieces of hardware such as the Echoplex, the Watkins Copicat, the Binson Echorec and the Roland Space Echo, while developments in digital technology have allowed for units such as the Line 6 DL4 and echo and reverb software that varies from basic to massively complex. While some might characterize these devices as merely an accoutrement for musicians, like a trumpet mute or brush drumsticks, in the hands of some, they become a para-musical instrument in themselves.60
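
To make the principle concrete, the repeat-and-decay behaviour of a delay unit can be sketched in a few lines of code. What follows is only an illustrative toy, not a model of any of the devices named above: the function name `feedback_delay` and its parameters (`delay_samples`, `feedback`, `mix`) are hypothetical labels chosen for this sketch. A small circular buffer stands in for the tape loop, with the 'playback head' reading what the 'record head' wrote one loop earlier.

```python
def feedback_delay(signal, delay_samples, feedback=0.5, mix=0.5):
    """Toy feedback delay (echo) applied to a list of float samples.

    All names here are illustrative assumptions, not a real device's API.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    # Circular buffer standing in for the loop of tape.
    buffer = [0.0] * delay_samples
    out = []
    pos = 0
    for x in signal:
        # 'Playback head': read the sample written one loop ago.
        delayed = buffer[pos]
        # 'Record head': write the input plus a quieter copy of the repeat.
        buffer[pos] = x + feedback * delayed
        # Blend the dry input with the delayed (wet) signal.
        out.append((1.0 - mix) * x + mix * delayed)
        pos = (pos + 1) % delay_samples
    return out


# A single click followed by silence, so that the repeats are easy to see.
impulse = [1.0] + [0.0] * 11
echoed = feedback_delay(impulse, delay_samples=4, feedback=0.5, mix=0.5)
```

In this sketch the click returns every four samples, each repeat quieter than the last. A long gap between distinct repeats corresponds roughly to what the text calls echo, whereas reverb would require many dense, overlapping reflections, which this single delay line does not attempt to reproduce.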

Both reverb and echo effects deliver a false space to our perception, an imaginary topography that does not match any reality of space. Delay ‘occultizes’ space in that it exposes its hidden dimensions. It is a concrete effect, delivering the uncanny appearance of familiar sounds. Sonic technology has developed definite associations, manifesting particular psychological states. Since the 1930s, sound effects such as reverb and echo have been used to signify interior or abnormal states, either for memories of spoken words, as voices in the mind or as dreams or indication of supernatural activity. So, the extraordinary electronic sound or processing of sound marks the supernatural through an analogue process of sound that clearly is unnatural. This appears to be a technological more than a technical supernatural, exploiting sound technology more than it does filmic convention. Yet naturally occurring reverberation and echo are also associated with the transcendent and the supernatural. Reverberant spaces have been connected with religion and social power, such as large caves, cathedrals and even underground structures such as the Newgrange chamber tombs in Ireland.61 Such spaces rely more on raw sonic effect than plain communication, although one might argue that the effect of the sound is the communicative aspect. Perhaps the most seemingly supernatural aspect of such reverberant spaces is that they uncouple the direct connection of sound and hearing. Philip Brophy notes: “Psychoacoustically, reverb grants us an out-of-body experience: we can aurally separate what we hear from the space in which it occurs. While this sensation was a wholly acoustic trait since time immemorial, the recording, rendering and representation of its texture was rediscovered as an ‘electro-acoustic’ feature in recorded sound”. 
In the only notable study of delay processing, Echo and Reverb: Fabricating Space in Popular Music Recording, Peter Doyle notes that “… echo as encountered by the human listener is an uncanny phenomenon, as if the sound has been emitted by the mass that reflects it”.62 He proceeds to note that echo from a mountain is like the ‘voice of God’:

Western movie echo might be seen as an enactment of the Protestant conscience; the solitary, nonconforming pilgrim, free from the mediatory interventions of a corrupt social world, provided he is sufficiently pure, may count on receiving divine guidance. … The puritan deity speaks from inside the heart of the believer. Like echo, the deity’s voice will sound remarkably like that of the believer.63

This description is remarkably reminiscent of the solitary governess, Miss Giddens, in The Innocents. The film opens with her disembodied voice discussing the possibility of redemption for the children. She trusts her ‘inner voice’ instinct, which is of unequivocal religious inspiration, and she has an unswerving belief in her self-appointed mission to purify the children. While she is not hearing another voice as such, her irrational self quite possibly has become disentangled from her rational self. Echoed sounds on the one hand suggest an uncertain world of spectral appearances but on the other suggest a confusing world of skewed perception. Echoes were associated with the supernatural (in terms of both religion and the unexplained) before becoming a fixed convention in films. There are several sequences in The Innocents where echoed sounds become highly prominent to the point of domination. One is where Miss Giddens and Mrs. Grose talk in the hallway once they have entered the house. Others are where Miss Giddens is looking for the children around the seemingly empty house and when she is looking through the house in the middle of the night by candlelight. In both sequences, there are heavily echoed children’s voices sourced off-screen. In the latter case, the echoes are highly non-naturalistic and include effects such as speeding up the echoed repeat and some very metallic-sounding reverb ‘smearing’, which obscures the original voice sound. Most of the sounds are laughs and whispers and could not be classified as dialogue; they have a textural rather than communicational value. The voices are unattached to the children and have a spectral life of their own, as uncoupled sound cast adrift from the dominant convention of cinema where the synchronization of sound and image renders an illusion of a coherent ‘reality’ on screen.
Indeed, in addition to the uncanny character of echoed sound, The Innocents also contains lengthy sections where there is a lack of synchronization between sounds and images. In The Innocents, sounds often have off-screen sources, and the disjunction of sound and image spaces has potentially far-reaching consequences. Physical perception of dislocation yields a mental dislocation. After all, human beings rely unswervingly on their senses and the fundamental disruption of perception leads inevitably to mental distress.

Figs. 5.4 and 5.5  The Innocents

The Innocents often relies on sounds to confuse the audience and set up a state somewhere between beholding the supernatural and a psychologically disturbed point of view. Of course, it is a tradition in horror films to have ambiguous perception as a manifestation of the supernatural, rendered through the unclear status of the images and particular audio and visual effects. The Innocents manages to blur the binary between diegetic reality and the supernatural or imagined, yet has a strong sense of the sonic supernatural delivered by electronic sound, exploiting its novel status at the time and the sense of it being ‘unnatural’. Supernatural sounds in horror films often involve some sort of dislocation between sound and source. In The Exorcist (1973), there is not only the uncanny voice for the possessed girl, Regan, which was provided by
the visually absent Mercedes McCambridge, but there is also the scary and unidentified off-screen sound from the attic earlier in the film. In Psycho (1960), the voice of Norman’s mother is never synchronized to her absent mouth or even to Norman’s.64 In Evil Dead II (1987), there are sections of sourceless laughter, possibly imagined by protagonist Ash, and unexplained sounds of wild movement that is not visible, although the characters on screen move their heads rapidly to and fro following the disembodied sounds. Such dislocation is perceptually disturbing, as potentially is any sound with an uncertain source. The term ‘acousmatic’ was developed originally by the pioneer of musique concrète Pierre Schaeffer and redirected for film analysis by Michel Chion.65 Schaeffer’s original, purely sonic, sense is that the audience are uncertain of what the source of the sound might be. Chion’s transposition of the notion to audiovisual culture conflates it with a strategic use of off-screen sound.

Conclusion

Synergy is one of the most important questions in relation to the marriage of moving images and music or sounds. The audiovisual is a thing in itself, despite being produced often by separately created sound and images (or even with divisions within those). Sound and image, as sense-data and aesthetics, have merged into something distinctive and powerful, and perhaps the power of its effect is the by-product of their marriage. Audiovisual culture’s remarkable effect exploits the physiological realities of human perception, as much as and perhaps even more than it engages our higher cognitive faculties. It also exploits the fact that our senses are constantly anticipating something to process. Sensory deprivation makes for bad mental health. On the other hand, sensory overload may be bad for you, but often it is compulsive, for short periods anyway. Despite Gestalt theory’s initial focus on images, it explains the cohesion of sound and image together, even in situations where the sound doesn’t really seem to fit the image. Our perception tries to make them fit together. Perception’s Gestalt basis also accounts for the McGurk effect, in that our perceptual faculties fit together and mix the sounds and images into a configuration that is not in fact present. Although any objective definition is far from straightforward and perhaps not desirable, ‘sweet spots’ might be a symptom of audiovisual culture’s foundational origin as a by-product of survival hardware, fortuitously showing up the synergetic points where sound and image converge into
something special and highly affecting. If, as we might imagine, startling audiovisual configurations are more than simply ‘what is there’, their notable effect could likely be down to the ‘extrapolation’ of human perception and cognition to fill in gaps and produce a remarkable whole in our minds. The notion of spandrels is a handy metaphor for understanding anything. They can be a happy by-product, misrecognized as the ‘main point’ of any situation. I would argue this is an extremely useful application, where a general misrecognition of function dominates and there are unforeseen positive (or negative) consequences. This might be more literal, too. It is more than possible that the considerable effect of audiovisual culture is itself a spandrel, caused by the structure of the human brain and its relationship to the rest of the nervous system. Thus, audiovisual culture’s massive emotional effect is a ‘sweet spot’ of perception. It is based on a fullness of signal in sound and image being merged. Perhaps certain aspects of sound and certain aspects of image serendipitously produce something that the two lack apart, as an unpredictable mixture of sometimes unpredictable effect. This helps to account for moments of extreme emotional effect and the sense that audiovisual culture habitually exceeds the sum of its parts. This is registered in Michel Chion’s notion of ‘synchresis’66 and evident in the synaesthetic effects brought to light by the McGurk effect, which appears to be due to shared processing areas of the brain for audio and visual signals. Gould notes that such spandrels require a Gestaltist approach rather than explaining each element as something freestanding.67 Indeed, this notion illustrates the importance of Gestaltist thinking on such matters.

Notes

1. Stephen Jay Gould and Richard C. Lewontin, “The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme” in Proceedings of the Royal Society: Biological Sciences, vol. 205, no. 1161, 21 September 1979, pp. 581–598.
2. Aniruddha Das, “Contextual Interactions in Visual Processing” in Encyclopaedia of Neuroscience (Amsterdam: Elsevier, 2009), p. 145.
3. Kurt Koffka, Principles of Gestalt Psychology (New York: Harcourt Brace, 1935).
4. Max Wertheimer, “Gestalt Theory” in W.D. Ellis, ed., A Source Book of Gestalt Psychology (London: Kegan Paul, Trench, Trubner & Company, 1938), p. 2.
5. Robert J. Sternberg and Karin Sternberg, Cognitive Psychology (Belmont, Calif.: Cengage Learning, 2012), p. 13.
6. Das, op.cit., 2009, p. 145.
7. Ibid., p. 145.
8. “Gestalt Laws of Perceptual Organization” at Verywellmind. https://www.verywellmind.com/gestalt-laws-of-perceptual-organization-2795835 [accessed 2/4/2022].
9. Rudolf Arnheim, Art and Visual Perception: A Psychology of the Creative Eye (Berkeley, CA.: University of California Press, 1974), pp. 4–5.
10. Ibid., p. 46.
11. Wolfgang Köhler, Gestalt Psychology: An Introduction to New Concepts in Modern Psychology (New York: New American Library, 1947), p. 118.
12. Ibid., p. 20.
13. Marshall McLuhan’s notion of ‘the medium is the message’ is derived from a Gestaltist approach, where the medium is the ground and the message is the figure. See opening chapter of Understanding Media: The Extensions of Man (Cambridge, MASS.: MIT Press, 1964).
14. Theo van Leeuwen, Speech, Music, Sound (Basingstoke: Macmillan, 1999), p. 17.
15. R. Murray Schafer, Our Sonic Environment and the Soundscape: The Tuning of the World (Rochester, Vt.: Destiny, 1994), pp. 9–10.
16. William Whittington, Sound Design and Science Fiction (Austin, TX.: University of Texas Press, 2007), p. 20.
17. Ibid., p. 308.
18. Rudolf Arnheim, Art and Visual Perception: A Psychology of the Creative Eye (Berkeley: University of California Press, 1974), p. 5.
19. David Bordwell, Making Meaning: Inference and Rhetoric in the Interpretation of Cinema (Cambridge, MASS.: Harvard University Press, 1991), pp. 8–9.
20. Is this what the words of the famous Waterboys song are suggesting?
21. According to Das, this mechanism appears to be part of the visual system from birth and a dynamic property of neurons in adults’ visual cortexes. Das, op.cit., 2009, p. 154.
22. Vilayanur S. Ramachandran and Richard L. Gregory, “Perceptual Filling In of Artificially Induced Scotomas in Human Vision” Nature, vol. 350, no. 6320, 1991, pp. 699–702.
23. Maria Poulaki, “The “Good Form” of Film: The Aesthetics of Continuity from Gestalt Psychology to Cognitive Film Theory” in Gestalt Theory, vol. 40, no. 1, 2018, pp. 29–43. p. 30.
24. For instance, as I mentioned briefly at the start of the book, and in more detail at the start of one of my other books, sudden appearances of jet fighters momentarily without their accompanying sound are a big shock.
25. There is a clear explanation and audio example in “What Is the Phantom Fundamental” at Splice blog. https://splice.com/blog/what-is-the-phantom-fundamental/ [accessed 17/7/2022].
26. “Perception as Controlled Hallucination: Predictive Processing and the Nature of Conscious Experience: A Conversation with Andy Clark” in Edge, 8 September 2021. www.edge.org/conversation/andy_clark-perception-as-controlled-hallucination?fbclid=IwAR0XTKw8SWMiW4cLDwOTWu2P3icztzl6fBSZkQKy-dmzkQM4BNB77TyLHIo [accessed 5/11/2021].
27. Andy Clark, Surfing Uncertainty: Prediction, Action, and the Embodied Mind (New York: Oxford University Press, 2016).
28. Terry Eagleton, Literary Theory: An Introduction (Oxford: Blackwell, 1983), pp. 66–67.
29. Mark J.P. Wolf, “World Gestalten: Ellipsis, Logic, and Extrapolation in Imaginary Worlds” in Projections, vol. 6, no. 1, 2012, pp. 124–5.
30. Irvin Rock, “Inference in Perception” in PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, vol. 2, 1982, p. 525.
31. Stephen Jay Gould and Richard C. Lewontin, “The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme” in Proceedings of the Royal Society B: Biological Sciences, vol. 205, no. 1161, 21 September 1979, pp. 581–598.
32. Ellen Dissanayake, “Chimera, Spandrel, or Adaptation: Conceptualizing Art in Human Evolution” in Human Nature, vol. 6, issue 2, 1995, pp. 99–118.
33. Steven Pinker, How the Mind Works (New York: Dutton, 1997), pp. 528, 524.
34. Daniel Levitin, This is Your Brain on Music (New York: Dutton, 2006), pp. 245–259.
35. Pinker, op.cit., 1997, pp. 524–525.
36. Indeed, this is the first thing noted in the DVD review by Andy Battaglia, “Sound: The Phantom Carriage: A Most Unorthodox Victor Sjöström Remix” in Film Comment, May/June 2012. www.filmcomment.com/article/sound-the-phantom-carriage [accessed 20/04/2014].
37. K.J. Donnelly, “How Far Can Too Far Go? Radical Approaches to Silent Film Music” in K.J. Donnelly and Ann-Kristin Wallengren, eds., Today’s Sounds for Yesterday’s Films: Making Music for Silent Films (New York: Palgrave, 2016).
38. www.chinamusicradar.com/uncategorized/phantom-carriage-by-the-horses-acid-pony-club/ [accessed 2/6/2014].
39. See further discussion in K.J. Donnelly, “Music Cultizing Film: KTL and the New Silents” in New Review of Film and Television Studies, vol. 13, issue 1, 2015, pp. 31–44.
40. O’Malley is most recognized as a member of Sunn O))), a drone avant garde/heavy metal band whose live show aims at sonic effects, and Rehberg was a prolific digital electronic musician (he died in 2021).
41. KTL V (2013) expands the sound through the use of an orchestra.
42. Eisler and Adorno state that the radical aesthetic divergence of sound and image is a potentially legitimate means of expression. Hanns Eisler and Theodor Adorno, Composing for the Movies (London: Athlone, 1994), p. 74; Sergei Eisenstein, The Film Sense, Jay Leyda, ed. and trans. (London: Faber and Faber, 1943), pp. 67–68; Noel Burch, Theory of Film Practice (Princeton, NJ: Princeton University Press, 1981), p. 90.
43. Sergei M. Eisenstein, The Film Sense, translated and edited by Jay Leyda (London: Faber and Faber, 1963), p. 67.
44. Evident in his discussions of ‘nonindifferent nature’, the ‘musicality of landscape’ and the ‘musicality of colour and tone’. Sergei M. Eisenstein, Nonindifferent Nature: Film and the Structure of Things, translated by Herbert Marshall (Cambridge: Cambridge University Press, 1987), p. 389.
45. Hanns Eisler and Theodor Adorno, Composing for the Movies (London: Athlone, 1994), p. 78.
46. I tried an experiment with my students of showing the two versions to different classes and asking them how long the film was. Estimates for the KTL version were notably longer.
47. Michel Chion, Audio-Vision: Sound on Screen (Columbia University Press, 1994), p. 73.
48. Michel Chion, The Voice in Cinema (New York: Columbia University Press, 1999), p. 24.
49. R. Murray Schafer, The New Soundscape: A Handbook for the Modern Music Teacher (New York: Associated Music Publishers, 1969), p. 45.
50. In other cases, they do not. John Belton notes that Robert Bresson’s off-screen sounds become essences that ground the films in reality. “The Phenomenology of Film Sound: Robert Bresson’s A Man Escaped” in Rick Altman, ed., Sound Theory Sound Practice (London: Routledge, 1992), p. 25.
51. The song as culture is obscured by nature; human emotion obscured by the ‘mechanical’ indifference of the natural world.
52. K.J. Donnelly, The Spectre of Sound: Film and Television Music (London: BFI, 2005).
53. It is sung with a child’s voice by Scottish folk singer Isla Cameron. This strange masquerade, with an adult playing a child, fits the narrative notion of the adults ‘possessing’ the children.
54. It is often an undulating single tone, sometimes broken into fast pulses that supply a nearly continuous tone, throbbing, reminiscent of an old aeroplane engine at some distance.
55. In his DVD commentary, Sir Christopher Frayling wrongly attributes the electronic music to Auric.
56. Daphne Oram, An Individual Note: Of Music, Sound and Electronics (London: Galliard, 1972), npn.
57. Mark Katz in Capturing Sound: How Technology has Changed Music (Berkeley, CA.: University of California Press, 2004); also Colin Symes, Setting the Record Straight: A Material History of Classical Recording (Middletown, CN.: Wesleyan University Press, 2004); Robert Philip, Performing Music in the Age of Recording (New Haven, CN.: Yale University Press, 2004); Philip Auslander, Liveness: Performance in a Mediatized Culture (London: Routledge, 1999).
58. The human ear is unable to distinguish between the original sound and its echo if the delay is less than 1/10 second.
59. Yet the ‘Precedence Effect’ shows that despite echo, we can still compute the directional source of a sound. Peter H. Lindsay and Donald A. Norman, Human Information Processing (London: Academic Press, 1972), p. 284.
60. An interesting example here might be the Virgin Prunes’ Heresie album (1982, L’Invitation au suicide) which apparently is ‘about madness’. This involves copious use of echo units, far beyond any sense of everyday space.
61. Upon reflection, echo should be rare and perhaps even ‘abnormal’, and yet it appears on pretty much all records today. Indeed, perhaps it is the quotidian acoustic of our times.
62. Peter Doyle, Echo and Reverb: Fabricating Space in Popular Music Recording (Middletown, CN.: Wesleyan University Press, 2006), p. 108.
63. Ibid., pp. 108–109.
64. Michel Chion, Audio-Vision: Sound on Screen (New York: Columbia University Press, 1994), pp. 129–131.
65. Ibid., p. 73.
66. Ibid., pp. 63–64.
67. Gould and Lewontin, op.cit., 1979, pp. 582–3.

CHAPTER 6

‘Gymnasium for the Senses’: The Artificiality of Audiovisual Space

This chapter investigates the dynamic character of audiovisual culture with particular interest in the space of landscapes in film, television and video games. The play of dynamics is one of the defining logics of audiovisual culture, with constant change to perceptual and cognitive stimulation at its heart. Consequently, the sense of visual and sonic space and movement within that is of paramount importance. Audiovisual space is fabricated through dynamics in camera movement, editing, deep space sets as well as depth cues in sound, sound reverb and spatial separation. Contemporary audiovisual culture is based not simply on the illusion of movement but more crucially on the illusion and effect of sound and image being merged into a coherent whole. These mark important moments exercising not only the medium but also the physiological requirements of its human audience. Rudolf Arnheim noted:

The organism … is by no means a closed system. Physically, it counteracts the running-down of usable energy within itself by constantly drawing resources of heat, oxygen, water, sugar and salt, and other nutrients from its environment. Psychologically, too, the living creature replenishes its fuel for action by absorbing information through the senses and processing and transforming it internally. Brain and mind envisage change and crave it; they strive for growth, invite challenge and adventure. Man prefers life to death, activity to inactivity.1

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_6

So, human beings have a mental, perceptual and cultural appetite and requirement, as much as they have physical needs. Our senses crave stimulation. Indeed, denying them stimulation can cause anxiety and psychological issues. On the other hand, an overload of stimulation can prove exciting but ultimately emotionally overwhelming and perhaps mentally disconnecting. ‘Sensory overload’ is considered a distinct condition with general symptoms including difficulty in focusing, irritability, restlessness, stress, anxiety, a higher level of sensitivity to stimuli and feeling overly excited.2 Most of the time, our physiological need is to expend as little energy as possible on perception and cognition. So, recognition of repeated images, ideas and situations is a positive thing, both for ‘bottom-up’ perception which locks on to repeats and for ‘top-down’ cognition, which needs to expend less energy understanding. It is no accident, then, that the repetition of elements and stimuli configured in conventional relationships is at the heart of audiovisual culture. We think of audiovisual culture as being ‘dynamic’, yet these dynamics are held within quite strict boundaries and often are repeated in a stereotypical manner. Dynamics, like all other parameters of audiovisual culture, are set and guided by conventions. Sometimes these can be strict and heavily circumscribed. As I stated, easy recognition of objects in terms of sound and image is a positive occurrence. Conventions and familiarity set up expectation, which can aid perception (keeping us relaxed rather than anxious) and ‘top-down’ cognition (allowing rapid processing and lessening cognitive load). Sound and image dynamics define space in film and subsequent audiovisual culture and are at the heart of it. Predominantly, in terms of image this is a two-dimensional flat space with the illusion of depth. Action and framing furnish a further sense of dimension.
Depending on the number of speakers and the number of different musical signals (often called channels), the ‘sound stage’ of audio will be more or less wide and differentiated. Of course, the two merge together in perception into a seamless space that is viable as an illusion. Even in situations where the sense of space in the sound does not fit the sense of space in the image, we will not perceive it as a mismatch unless the difference is radical. An instance of this might be a large room with few contents and hard wall and ceiling surfaces, accompanied by a ‘dead’ sound with no reflective sounds, yielding a very close sound as if in an enclosed box. Such an instance would signal a psychological dimension: we might be experiencing the character’s point of view, or there may be something ‘unreal’ about this space.

Audiovisual space is all about dynamics as without them there is no sense of space. The separation of sounds on the soundtrack is also crucial, as it can supply a space that allows movement across it, or perhaps even delivers the possibility of movement. Vistas, panoramas and landscapes more generally tend to be characterized by dynamics. A sense of distance is often offset by something closer in the foreground, for example, and it is indeed rare for pleasing landscapes to lack a dynamic sense. With much of the world’s population living in cities where viewing distance is limited, the attraction of looking long distances commands a premium, as with sea-view hotel rooms or chalets up mountains. Landscapes in audiovisual culture are similarly attractive and can help give us a sense of a short ‘holiday’ in the location on screen. Thinking along these lines is reminiscent of the apartment-bound cats whose owners supply them with a video depicting the outdoors and featuring small animals for their excitement. Landscapes in films, television and video games are audiovisual illusions, and yet on occasions they appear extremely tangible and enterable. They are habitually a combination of images and sounds recorded at different places. In the case of music accompanying landscape shots, it is clear that the music was recorded in a studio most likely in an urban environment a long way from the images. Yet even in other situations, it may not be what we are expecting. In natural history documentaries, for example, sound is almost never recorded at the same time as images and very rarely in the same location. Indeed, library sounds can be used, too, to fabricate environmental sound that appears fairly faithful to (or at least not incongruous to) the location and recorded images.
Perhaps, in some oblique way, experiencing landscapes in audiovisual culture is related to experiencing them as a healthy dynamic experience, and as I noted, perhaps a little like a holiday. There is often an interest in showing us remarkable views and places as spectacles, perhaps making us into armchair tourists.3 The backdrop for an interest in ‘healthy’ vistas is that many of us spend far too much time looking at an electronic screen and often with only a keyboard separating it from our bodies. Approaching the audiovisual as dominated by concerns of perception and thus characterizing it as essentially physiological in nature can lead to a reconceptualization of audiovisual culture as an area for exercising perception, something like a gymnasium. ‘Healthy’ use of the eye varies focal length and mixes distant and close vision, while a similar procedure might be argued for hearing. Keeping perceptual faculties in good order involves an attention
to dynamic signals and much audiovisual culture endeavours to provide this. Indeed, it is more than likely that the dynamics of artificial deep space are beneficial, perhaps even like a perceptual ‘workout’. This is not essentially to do with ‘survival fitness’. Although a case certainly can be made, it is a requirement of human hardware (eye lens, etc.), which only indirectly might have evolved for fitness. So, the process might perhaps be incidental, perhaps a translation across two ‘borders’, which makes the process less of a simple adaptation and more of a complex negotiation of determinants. This chapter will explain the concept while attending to important aspects such as the artificiality of electronic audio and visual space, the alternation of proximity and distance effects and the exploration of perceptual overload as an occasional but dramatic effect.

Experiencing Audiovisual Spaces

Landscape painting and later still photography had made us think that ‘landscape’ is essentially visual. All actual landscapes include sounds as an integrated element. Since the advent of film and later integrated audiovisual culture, the rendering of landscapes habitually involves a prominent sonic, and often musical, component. Landscapes can appear real enough to give an impression of being almost enterable. This has been emphasized and concretized by film and television, for instance, in the British film Three Cases of Murder (1955) segment called ‘The Picture’, where characters can enter a house in a painting, or the BBC television adaptation of M.R. James’ The Mezzotint (2021) where something appears imperceptibly to be moving across the picture towards its new owner. These play upon the notion that we believe in the image. The painting and etching respectively exceed the medium’s flatness and texture, becoming real. Audiovisual culture allows for an increased sense of ‘enterability’, particularly through the deployment of sound. Sound can envelop us, removing the perpetual distance of the image.4 Indeed, there is a sense that sound can embody its own landscape and this has been exploited by corresponding visuals. Although not quite the same, the moving image can give a sense of the audience entering through the camera moving forward and perhaps through a movement between long shot and close-up, although this is not as literal as sound and music. Harper and Rayner contend that in cinematic landscape, music retains an equal place with other elements but remains defined through the image.5 This is a traditional view of the image as prime and sound as
secondary, and traditionally Film Theory has suffered from a failure to adequately address sound. I would suggest that music can redefine the image rather than simply being defined by it. This would assume a fundamentally transformative relationship with the images of landscape shots, one that appears more complicated and more far-reaching in its effect and not simply defined by the image. There has been a widespread assumption that music ‘adds to’ the image in audiovisual culture. A more sophisticated understanding would be that it rather converts it, perhaps through a process of ‘mutual transference’. Music accompanying landscape shots might mean that music is ‘landscaped’ and landscape is ‘musicalized’, in a reciprocal relationship that embodies and homologizes the perceptual-cognitive process taking place in our heads. The distinctive formation makes a particularly potent combination of the indexical realism of the image with the emotional immediacy of music. These ‘musicalized screen landscapes’ potentially can become an emotional representation, which might appear less of a representation of a place than a representation of an emotion. I discussed this briefly in the previous chapter. Yet rather than simply a case of music furnishing the emotion of the image, the image reciprocally provides something important for the music. As a part of this, it is possible that the image provides an (imaginary) visual spatialization of musical structure, one that will not correspond with actual musical structure but adumbrates an ‘emotional sense of structure’, or what might better be thought of as a seemingly logical sense of emotional structure. These elements have a defining relationship in the overall effect of the whole. A good example is Maurice Jarre’s music added to the initial landscape shots of the desert in David Lean’s Lawrence of Arabia (1962).
The sequence opens with Lawrence extinguishing a match with his fingers, cueing the heroic orchestral music bursting forth alongside a succession of desert shots joined by crossfades. This combination of sound and image is so iconic it has been reused and parodied, in the James Bond film Moonraker (1979), for example. The music here is the counterpart to the images as drama rather than an atmospheric equivalent to the desert, though. While the sustained notes are perhaps isomorphs of the sweeping sand dunes, the music supplies a sense of movement lacking in the static desert shots. On other occasions, music appears to embody the emotional tone of the landscape more, such as Ry Cooder’s slide guitar music accompanying barren American deserts in Wim Wenders’ Paris, Texas (1984).6 In
other cases, music can sometimes subtly integrate with the images to form an audiovisual environment as a whole. Music on its own can also carry a charge of landscape implication. The so-called ambient music contains a distinct implication of images, or at least something pertaining to a landscape or location as a complement to the music. In The Ambient Century: From Mahler to Moby, Mark Prendergast points to Brian Eno’s central importance to ambient electronica.7 Eno’s seemingly programmatic album Ambient 4: On Land (1982) evinced a lack of purposeful progression or development, alongside an evacuation of melody/harmonic movement. The music thus formed its own sense of environment rather than utilizing traditional aspects of dynamic movement (a concern with melody, harmony and rhythm). This form of static musical structure tends to foreground texture and sonority. It also mixes what is traditionally understood as music with what appear more like sound effects or ambient sounds. As Brian Eno himself notes on the cover of Ambient 4: On Land: “… cluster all disparate sounds into one aural frame: they become music”.8 It is difficult not to be aware that there is a similarity to film scores that have had an important role in furnishing aural impressions of locations and associated images. Ambient 4: On Land was an influential signpost to the future genre of ‘ambient music’. Most of the music lacks notable melody or harmonic movement and instead is premised upon slow musical events and a focus on textures. Timbres are defined by electronic treatments and do not sound like distinguishable instruments. Indeed, many of the sounds are from previous recordings or library sound effects. For instance, The Lost Day involves periodic keyboard that sounds more like the slow clanking of metal ropes on spinnakers in a yacht marina. 
This punctuation provides an important temporal structure but also provides a sense of spatial distance, along with much in the way of deep indistinct sounds. Certainly, compared to other music of the time, this must have sounded strange. Its character is closer to environmental sound than traditional music. It lacks dynamics or development and seems less a piece of coherent music than a span of ambient sound that easily could be employed as backdrop in a film. All the sounds were subjected to lengthy signal processing and development in the studio, what the film industry would refer to as ‘post-production’. Continuity and slow unfolding of sound make this music a far cry from the traditional standard song format of ‘popular music’. Instead, the recordings present a strong impressionist sense of ‘painting pictures’ through sound. Music in its wake has had a significant
impact on audiovisual culture, and indeed, Ambient 4: On Land sounds relatively straightforward today due to the degree of influence in later music and in scores for film and television.

Rural Sights and Sounds

Images of the English countryside have been common in film and television programmes. It has rarely been acknowledged that this image includes an associated sound/music. This has cohered into a welding together of panoramic countryside long shots and ‘pastoral’ music. A good example of this is the Wessex described in detail in Thomas Hardy’s novels. This was translated into music by Gustav Holst in his tone poem Egdon Heath (1927) and into screen adaptations of Far from the Madding Crowd in 1967 and 2015, with music by Richard Rodney Bennett and Craig Armstrong, respectively, and in both cases suitably rustic to fit the films’ country landscapes. Similarly, American Westerns also are associated with a particular sound; in terms of music, this involves a repertoire of particular instruments and combinations of timbre. Similarly, Scandinavian drama that shows landscapes tends to exploit the low-angle light and accompany that with what has been called a ‘silver sound’ in music. This has to do with spare textures and leaving space in the music.9 This might be understood as a clear equivalence of sound and image, which merges into a strong whole through structural isomorphism and homology. Some post-millennial television programmes have aimed for a level of integration between dramatic location and score that does not stem from significant traditions in film and television music scoring. For instance, there have been several successful television adaptations of Henning Mankell’s Wallander novels, which feature a dour ageing Swedish detective in a small Baltic coastal town. They tend to be slow, have an atmosphere of austerity and include little in the way of humour. The first Wallander series starred Rolf Lassgård, consisted of nine episodes from 1994 to 2006 and was made by Sveriges Television. The second was more successful, starring Krister Henriksson. 
It ran for 32 episodes from 2005 to 2013 and was produced by Yellow Bird/Svensk Filmindustri/ARD Degeto. The third television adaptation was made by the BBC and Yellow Bird from 2008 to 2016 and consisted of 12 episodes. It starred Kenneth Branagh and was made in English, primarily with British actors, despite being shot in Sweden. The second and third versions were both co-produced by Yellow Bird and there is an evident continuity, not of the actors or the language but of the location (Ystad in Skåne, southern Sweden) and music and sound world.

Fig. 6.1  Wallander with Krister Henriksson

Lassgård’s version was not shot in picturesque Ystad and has minimal music from Frans Bak, almost always for moments of drama. Music for Henriksson’s version is not matched but more ambient; it was recorded by Adam Nordén and was later replaced by music from the Fläskkvartetten. Branagh’s version is similar in approach and sound to Henriksson’s, with music by British composer Martin Phipps.10 The second version is more stylish than the third but it looks likely that the ‘texture’ of Henriksson’s version was directly ‘ported’ into Branagh’s version. These last two versions emphasize beaches and the Baltic Sea, the beautiful but bleak countryside and particular colours. This often looks like an early evening light, perhaps shot during the ‘golden hour’, and matching the music’s ‘silver sound’, aiming for a sparse but soft and extremely clear tone. The Henriksson and Branagh Wallanders have thick atmospheres, clearly aiming for a sense of continuity in terms of privileging the location and the music. Indeed, these two elements might be understood as the stars of the Wallander series and provide an ‘essence’ to the programmes. The character interaction and narratives vary, and what might be construed as backdrop in many television dramas occupies a position far nearer the centre of the stage.11 Visually, while the programmes embrace a good number of close-ups of talking heads, as befits any television drama, there are plenty of shots that showcase the location. This not only includes the
town of Ystad but the Skåne countryside, with its massive fields of undulating crops and dark woodlands.12 With direct relevance to Wallander, Daniel Grimley has written about Scandinavian music and its relationship to landscape. Landscape, he states, “… presupposes both a process of composition (the creation of frames of reference or forms of spatial organisation) and the presence and active participation of a viewer (in the sense of perspective). Furthermore … landscape is not merely concerned with spatial perception, but also possesses a temporal dimension”.13 Grimley goes on to point to a sense of ‘halted time’, and musicologist Carl Dahlhaus’s description of Naturklang (sounds of nature) and Klangfläche (sound surface). These appear both static through repetition and in constant motion, simultaneously. They openly avoid the impulse of traditional tonal musical ‘development’ to depict nature, and a good example of this is ‘Forest Murmurs’ in Wagner’s Siegfried (1876). The consequence is the importance of a spatial dimension in the sound, as well as an overall sensitivity to sound. By this latter point, I mean an interest in sonority and texture and focusing in on the particularities of the sounds available to specific instruments. This notion transfers well to the ‘music as landscape’ in the Wallander series. There is a tendency to use electronic reverb to give a strong sense of space to the music, allied with sparse and clear instrumental textures. The use of looped arpeggios gives a sense of stasis, like inert landscape, rather than the sense of movement often evident in music. There is also a tendency towards the use of parallel harmony, such as minor chords simply being shifted upwards or downwards in pitch, and the music often appears to have little direct ‘narrative’ cueing. 
Indeed, while there is some action music, the programmes utilize music primarily for mood, and this is compounded by the intermittent inclusion of shots of the southern Swedish landscape, with its sweeping treeless fields of crops like oilseed rape, accompanied by atmospheric music that also isomorphically parallels space, stasis and emptiness.14 Henriksson’s Wallander is in Swedish and so is subtitled in UK television broadcasts. For non-Swedish speakers, this makes the spoken language into something akin to music. It takes on a modal character added to the tone of the images and the rest of the sound. For anyone who saw the Swedish one first, Branagh’s Wallander is confounding in that it is in English and shows a Sweden peopled by British bit-part actors. There is one innovation: Branagh’s Sony Ericsson cell phone ringtone. This is a constant source of anxiety as well as an accelerator of narrative

development. It is a highly characteristic ringtone, made especially for Branagh in Wallander. Indeed, this must be one of the first occasions when the use of a mobile phone is in the foreground of a programme, with his highly specific ringtone recalling the programme for audiences.15 While Lassgård’s version is certainly dramatic, it works in a different way from the other two. It does not emphasize its locations, and indeed they are rather nondescript, and perhaps consequently its music is also not prominent and appears only to underscore dramatic moments. Henriksson and Branagh’s versions both feature music as an analogue of images, isomorphically merging sound and image to yield a relatively static tangible atmosphere. Audiovisual location (landscape shots, music) provides an ‘essence’ of Wallander as an atmospheric location, with musical emotional content, in which narrative and characterization take place. The location and background music function not simply as a backdrop but rather as the heart of the television series. This process of equivalence between certain types of music and images might be understood with reference to notions of ‘congruence’ as well as with the structural idea of isomorphism, discussed in the previous chapter. While isomorphism posits that there is a formal mirroring or structural equivalence between sound and image, the notion of congruence investigates the psychological and cultural sense that certain images seem to fit well with certain sounds. Foundational work on this was completed by Annabel Cohen with some more recent work by David Ireland.16 The sense of whether we think of certain music and certain images as particularly ‘congruent’ has clearly had an important role in setting traditions, for both production and expectation. 
However, some films and television dramas aim for a sharp contrast between music and images through the use of anachronistic music, such as Plunkett and Macleane (1999), Sofia Coppola’s Marie Antoinette (2006) and the BBC television series Peaky Blinders (2013–2022). These challenge expectation, although the more expectation is challenged, of course, the less it remains a solid expectation. However, conventional usage tends to dominate and there can be little to gain from radical divergence of sound and image confounding expectation. Similar in some ways to Wallander, the British television series Midsomer Murders (sometimes known as Barnaby or Inspector Barnaby overseas) has been one of the most successful British television exports.17 The series began in 1997 and is still in production. It is a rurally set police procedural drama, where an inspector and his assistant investigate what habitually is a

Fig. 6.2  Midsomer Murders

succession of gruesome murders in an atmosphere mixing pretty and polite locations often with an air of irony and camp. The ‘whodunit’ prides itself on finding novel ways for murders to be achieved in this country location that is peopled with eccentric and exaggerated characters while also allowing for regular cameos from ageing celebrity actors. The location shots are regular and one of the programme’s attractions. Indeed, the ‘Home Counties’ locations, set across Buckinghamshire, Berkshire, Hertfordshire and Oxfordshire, are listed as points for tourists to visit in the book Midsomer Murders on Location, which is available in the Oxford Tourist Information office. The back cover of the book notes: “This book is a must for all Midsomer Murders enthusiasts as well as those interested in visiting some of England’s finest countryside”.18 The distinctive visuals have an equally distinctive sonic counterpart. The music for Midsomer Murders plays an important part in the programme. It is produced by experienced television music composer Jim Parker and has a ‘mock-classical’ character, not only punctuating the action and providing structure for the narrative but also providing a playful sense to the arch murders and investigation.19 In Midsomer Murders, a small repertoire of reused recorded cues appeared in each episode, alongside new ones that might be specific for the episode. The programme’s main theme, as is traditionally the case, establishes a sense of emotional

tone as well as cutting out an individual sense of the show’s character. A waltz with a swinging tonic-dominant bass, it has an arch and drily comic feel. Its edge of irony is reminiscent of composers such as Michael Nyman or Danny Elfman, whose work can sound like it is not straightforward and is asking you to join in with a parody. Perhaps the clearest antecedent might be the Stranglers’ Waltzinblack, which had become a staple as television library music.20 The Midsomer Murders theme uses a series of parallel minor chords, which sounds unusual in the light of waltz’s tendency to use strong tonal structures of harmony.21 Its melody is also distinctive and exotic, being performed on the theremin, one of the earliest electronic musical instruments and used often for its eerie sound in music for science fiction films such as The Day the Earth Stood Still (1951) or for disturbed psychology as in Spellbound (1945). These musical aspects provide a solid sense of the programme’s distinctive character and mark it out as unusual for a detective show. On the opening titles, the theme accompanies images of the locations and actors— in other words, the diegetic action has already started. This already provides a significant ‘cast’ on the images and provides a certain levity to images which can often include murder and its aftermath, not what might be considered lightly in many television dramas. The theremin reappears, too, in the incidental music. This happens at moments of revelation, or dramatic points that lead into advert breaks, where the instrument makes a startling rise in pitch climax. This resembles a scream and is similar to the sort of portamento that can be achieved on a slide trombone. While this bears similarities to horror film music, it retains its arch, ironic connotation. These moments are a good example of the unity of narrative development and sonic dynamics, both peaking at the same time and underlining the significance of events on screen. 
A prominent repeated theme is called ‘The Village’ and is clearly based on the classical music tradition of music inspired by the rural fox hunt and idealist notions surrounding it. As such, the melody is played by the French horn, a common ‘pastoral’ instrument, and the melody again appears perhaps slightly comic, although not overtly so. It appears to parody the sense of urbane behaviour, which oddly is associated with the blood sport, and this polite melody is given an even more formal sense by the addition of urbane obbligatos and the light triple-time pulse. Sometimes this piece is rearranged and has its melody performed by a clarinet, another instrument regularly associated with the simplicity of the rural and highly evident as a solo instrument in English pastoral classical music.22

So, such music not only furnishes a sense of the rural to reinforce the images on screen of English villages and countryside but also suggests a hidden underside of blood sport and irony, where things are not what they appear to be. This overall unity of sound and image has provided the show with a unique atmosphere that is the ‘static’ solid element with a surface of different narrative intricacies on top of it. It is not only the incidental music that is fully integrated and a clear part of the Midsomer Murders’ character. Sounds are used in a way that extends beyond simple guarantors of diegetic realism. The programme uses sounds of nature very prominently. These are stereotypical sounds of rural England, which reinforce a sense of nostalgia (for lost rurality) and a shared English culture based on land and tradition. Garden shots regularly include a singing blackbird, while sometimes the undercurrent of negativity is indicated by a hoarse call from a magpie. Night sequences almost always include the sound of a lone fox bark. Ironically, all these sounds, and particularly the latter, are far less rural these days in Britain. They are certainly less evident in the intensive farming-based countryside and far more evident in ‘ruburban’ (rural suburban) gardens, or perhaps in these garden outposts in rural villages. Midsomer Murders thus moves aspects that traditionally are on the edges of television dramas into being closer to a featured aspect. These nature sounds, which are all clearly added in post-production, are unsynced and off-screen and comprise an important psychological aspect of the landscape that is the heart of the programme. For instance, in the episode ‘A Tale of Two Hamlets’ (from 2003) as a family sit on their lawn talking to Inspector Barnaby, we hear the cooing of a collared dove, then a tolling church bell, along with a thrush and some crows. 
These are as much, perhaps even more, of the ‘landscape’ of the programme than the images or the narrative and characters. They are also not present for the purposes of realism, and while they are an ‘emotional landscape’ they also function structurally. One potential reason for the importance of the soundscape and its function as landscape is that in some cases, when rural village scenes in England are being shot, they avoid showing the reverse shot, which is a large road full of cars or some form of recent building development. Of course, this is not always the case, but Britain has experienced an unprecedented expansion of cities such as London into the nearby countryside in recent decades. Indeed, the visual construction of Midsomer country is highly selective and takes images from diverse sources to collage an imaginary rural location. Overall, Midsomer Murders’ sound and music provide
the dimension of ‘landscape’ that images cannot (unifying fragmented illusions of the rural) as well as providing a wry commentary on the action. Novelist and filmmaker Iain Sinclair stated that the version shown in France (Inspecteur Barnaby) with its translated French subtitles alters the programme, making it appear less comic and more of an ‘existential drama’.23 It is not necessarily that the translation loses and gains something but that the programme has certain ambiguities that can be emphasized or not and that sonic irony can easily lose any impact due to wider context, as potentially can comedy, too. While it might easily be argued that the landscape images owe something to the English tradition of landscape painting, the music clearly has a debt to the tradition of ‘English Pastoral’ art music.24 This is most commonly associated with British composers such as Ralph Vaughan Williams, George Butterworth, Frederick Delius, Gustav Holst, Frank Bridge and others and was prominent in the first half of the twentieth century. Some prime examples of this particular sound include Butterworth’s A Shropshire Lad Rhapsody and The Banks of Green Willow, Delius’ Brigg Fair, Holst’s Egdon Heath and Vaughan Williams’ The Lark Ascending and Symphony no. 3 ‘Pastoral’. Indeed, the piece ‘Driving Home’ by Parker is reminiscent of Butterworth’s A Shropshire Lad Rhapsody. While this ‘English Pastoral’ was not a movement as such, it appears to have a broad stylistic similarity, often being based on folk songs and with prominent woodwind. Its sense of the rustic music of England’s past homologized with a Romantic notion of Britain’s countryside and the (largely lost) simple rural life. It also coincided with a period of strong nationalism in Europe that also was evident in the Arts. 
In his book Music in England, Eric Blom declared that English Pastoral music was the “counterpart of English lyric poetry”.25 It is perhaps unsurprising that this tradition is also clearly evident in music for British film and television costume drama and films set in Britain’s past. While Richard Rodney Bennett’s gentle rustic flute theme for Far from the Madding Crowd is a good example, so are the BBC Great Expectations (1999) with music by Peter Salem, or the film Akenfield (1974) that featured prominently a piece of Michael Tippett’s music (Fantasia Concertante on a Theme of Corelli). Geoffrey Burgon’s music for Brideshead Revisited (1981, Granada) was perhaps less pastoral and more ‘drawing room classical’ which is the norm for recent British television costume drama. This seems more about class than landscape, although it might be argued that the rural landscape includes (and

sometimes overwhelms) the peasant/working class, whereas the ruling class is shown in expensive clothes and buildings, which demand a different form of music.

Nonindifferent Nature

Audiovisual culture increasingly has developed the notion of the ‘emotional sound effect’, which has become particularly prominent since the increasingly collapsed distinction between music and sound effects.26 A clear outcome of losing the distinction between music and other sound is that sound effects become more emotional, like music, while music has the option to become less emotional but more integrated with the action. Sound effects can grow to be part of ‘music’ and take on some of its traditional functions. Also, at least partly, this is due to digital and ‘musical’ software being used for both sound effects and music and their being mixed also through the same programmes. The merging of these two traditionally separate functions is matched by a similar loss of distinction between sound and images. Here, and in other rurally set films and drama, there can be a sense where the particular images need a particular sound at the same time and, indeed, vice versa. The sound of English pastoral music is ‘completed’ by images of the English countryside, as well as the other way round. From the late 1950s to the early 1980s, British regional television company Southern Television began its daily broadcast with the specially written piece Southern Rhapsody by Richard Addinsell. This was accompanied by a slow cross-faded montage of still photographs and film of predominantly city landmarks, country and seaside locations across the region, the south coast of England. It is clear not only that the producers thought the images and music ‘fitted with each other’ but also that they were the equivalent on some level. This sense of integration between landscape and music (and other sound) is hardly new. Sergei Eisenstein was interested in their expressive possibilities, discussing what he called ‘nonindifferent nature’ where landscape shots had an emotional function similar to music and lacking the burden of providing narrative information. 
Rather than being neutral and representational, such images carry a suitable emotional charge which is in tune with the rest of the film. He calls this ‘the musicality of landscape’.27 Eisenstein’s notion of ‘nonindifferent nature’ posits an equivalence between landscape shots and music.28 Both are more than simple backdrop and supply an emotional charge and valence for the film. Both are

able to move beyond simple communication and narrative function to elicit a sense of thick atmosphere and emotion. Eisenstein was interested in emotional landscape shots and their ability to step outside of narrative and diegesis. Later, he was concerned with sound film beyond basic ‘talking films’ and often thought and theorized in musical metaphors. Landscape shots matched with non-diegetic music would potentially redouble this effect. Rather than simply a representation of location, film landscapes with music regularly become transformed into something extraordinary and, by implication, with their own integrity. Eisenstein appears to pull back from this, though, and focuses on landscape shots’ place within the film system, in relation to other shots and elements, where they offer emotion to their surrounding pieces of celluloid. Yet, they can also be understood as something-in-themselves, as semi-autonomous objects within film. So, this is not about ‘representation’, and the music does not ‘add’ to the image but rather converts it, perhaps into an ‘emotional representation’, which is less of a representation of a place than it is a representation of an emotion. Even the most specific footage of a place becomes something notably different as film and music combined. ‘Double nonindifferent nature’ is not a straightforward process, however. The mixture of the two channels and corresponding senses can produce an ‘artefacting’ or ‘aliasing’ effect, yielding certain ideas, emotions and implications as ‘spectral’. Despite the genetic fusion of landscape image and music, there can remain something of a ‘gap’ between sound and image, which emanates from technological realization and perception. The ‘feeling’ of landscapes resembles the supernatural. The images become ‘more’. They are complex and not straightforward and appear to have subtle connotations and implications. Potentially, illogical emotions might appear. 
While any moving image could fit to any music, some, of course, fit better than others. But sometimes the effect can be strange and uncanny. This is due to the process being more important than the content (as in classical Eisenstein sound montage, or ‘vertical montage’). This is not simply about feeling the landscape, but the conversion of the landscape to something else: music providing a sense of depth (and a sense of different time scale) to the landscape image. Landscapes welded to music are significant for video games, too. Tadhg Kelly, a video game designer, producer and creative director, argues, “forget the person. The art of game design is all about the place”,29 and additionally, Geoff King and Tanya Krzywinska claim that “more than simply a background setting, the world of the game is often as much of a
protagonist, or even antagonist, as its inhabitants”.30 This is certainly true for so-called Open World games, where the player can traverse what appears to be a real space, choosing a path and movement within a highly distinctive location. Good examples of this include the Elder Scrolls and Silent Hill games. In Atlas of Emotion, Giuliana Bruno points to the walk in a city as a ‘haptic geography’ that links film and architectural concerns and then she proceeds to suggest that most Italian Neorealist films might be understood as city walks.31 A relatively recent video game genre appears quite literally to be a walk, the so-called walking simulators, sometimes known as ‘exploration games’. In these games, the player has little or nothing to do as gameplay and simply traverses the world on screen at will. Examples include Dear Esther (2012, The Chinese Room), The Vanishing of Ethan Carter (2014, The Astronauts) and Gone Home (2018, Fulbright/Blitworks). In the same mould, independent video game The Old City: Leviathan (2015, PostMod) is not based on skilful action fighting or thoughtful puzzling. This ‘walking simulator’ involves a detailed visual environment that is open to the player’s aimless exploration and contemplation. As a counterpart to the sumptuous and mobile visuals, there is an extensive and atmospheric music soundtrack by the Swedish dark ambient industrial musician Atrium Carceri. Rather than music derived from an interactive video game tradition,32 the music firmly retains a distinctive aesthetic from ambient music and retains a character of sombre uneasiness. Less concerned with gameplay and more directly concerned with embodying environment, it lacks notable dynamic shifts and development. This decentres the player and makes them a visitor in a large dominant and independent sound and image-scape. 
Before creating the music for The Old City: Leviathan, Atrium Carceri’s music sometimes appeared to be descriptive of cities, such as Inside the City (on Phrenitis from 2009) and A Stroll Through the Abandoned City (on Kapnobatai from 2005). Indeed, it is almost as if the game has actualized the potential in the music. The Old City: Leviathan’s music adopts a different function from most video game music, potentially bringing with it aspects of electro-acoustic environmental music for a game that espouses the tenets of psychogeography, through imbuing a landscape with emotional characteristics and mysterious implications. The PostMod website states: “Players have the option to simply walk from start to finish, but the real meat of the game lies in the hidden nooks and crannies of the world; in secret areas, behind

closed doors ….”33 The music is not only an integrated part of the experience but also follows a similar process of being open to exploration and contemplation. In fact, in a number of ways, The Old City: Leviathan problematizes the notion of the ‘video game’, and indeed some game players do not consider such ‘walking simulators’ as games at all. Prominent game theorist Espen Aarseth stated: “Games are both object and process; they can’t be read as text or listened to as music, they must be played”.34 Yet, the contemplative nature of games such as Leviathan allows a player to listen to the music while wandering or remaining inert. It appears to be a remarkably pure instance of focus on audio and visuals, as there is little narrative that the player doesn’t have to follow anyway and little that resembles the traditions of gameplay in electronic games. The opportunity for immersion is high, though, as there is little to distract the player from wandering and appreciating. As I noted already, Atrium Carceri’s music is not ‘functional’ in the regular video game sense and sounds very similar to his other music. The aesthetics could almost derive from this ambient music that dominates the game’s logic. The music changes for different locations in the game, which is common in video games where the player or avatar moves through a space resembling an actual location. Philip Kirby notes that ‘geographical approaches’ to video games can often exploit or rely on music.35 In The Old City: Leviathan, impressive images and atmospheric music cohere into a sombre but engaging view of the empty city, devoid of a population. This is a compulsive ‘vision of the end’ of humanity, embodying the Freudian death drive as a video game, if you like. The lack of gameplay simulates an afterlife scenario, being a tourist after the demise of humanity, while the music suggests the ghost of human agency and the implication of absent humanity. 
Some video games exploit a similar interest in exploring location while also including goal-oriented activities for the player. S.T.A.L.K.E.R.: Shadow of Chernobyl (2007) is a first-person shooter survival horror video game developed by Ukrainian game developer GSC Game World and published by THQ in 2007 following a long development. The game is set in an alternative reality, where a second nuclear disaster occurred at the Chernobyl Exclusion Zone in Ukraine, causing strange changes in the area around it. The game features a fragmented line of progression and includes role-playing game (RPG) elements such as dealing with nonplayer characters. Inspired by Andrei Tarkovsky’s Stalker (1979), an

adaptation of the Strugatsky brothers’ novel Roadside Picnic, the game’s extremely large traversable area is based at least partly on ‘the zone’ in the film. Similarly, it is a place with strange mutations and developments, but also with changing weather. While the desolate zone appears to be a continuity, it is in fact a large number of different successive 3D locations that are accessed via distinct portals. Like The Old City: Leviathan, the game’s music (credited to MoozE) is more like ambient music used for atmosphere and as a merged part of the location rather than being dynamic game music that develops and changes with specific gameplay. The third game in the series, S.T.A.L.K.E.R.: Call of Pripyat (2010), includes playable locations modelled directly on actual places in the Chernobyl Exclusion Zone, such as the town Pripyat, the village of Kopachi and Yaniv railway station. The use of meticulously copied actual locations can add a significant dimension to the sense of actuality in playing video games. Good examples include Assassin’s Creed (2007, Ubisoft) and its ancient Rome setting and Assassin’s Creed Unity (2014, Ubisoft) in Paris, inFAMOUS Second Son (2014, Sucker Punch) in Seattle and Tom Clancy’s The Division 2 (2016, Massive Entertainment) in Washington, D.C. This ‘realist’ tendency is very different from the dreamlike use of landscape in many games. Indeed, in recent years, the approach of ‘psychogeography’ has become popular, even though it seems to lack a solid definition. One facet is the notion that we have an emotional understanding of landscape, as well as landscapes appearing to be partly ‘inside the head’ of the observer rather than fully separate external objects. 
In an earlier chapter, I pointed to how the Silent Hill games are premised upon a landscape that appears to take place inside the protagonist’s head, while independent game Barrow Hill: The Dark Path (2016, Shadow Tor Studios) has the player move through different locations around an English village in a hypnagogic situation. Indeed, landscapes can be ‘read’ by those skilled in understanding the symptoms of the past (old mines, fields, old rail lines or roads, destroyed villages, etc.), while there are also other means of dealing more ‘magically’ with landscape, such as ‘divining’ and ‘dowsing’. So, the landscape and our understanding of it are not simple things but are premised upon taking basic perceptions and accepting that the ‘face value’ of our initial signal should not be taken as the full truth: we need to ‘think behind’ and imagine in order to understand more fully. This is a fundamental questioning of perception that has been and remains a persistent strand in audiovisual culture.

It is a common strategy for culture to set up a state of ambiguity and some confusion between ‘real’ and interior states of mind, and indeed certain genres arguably rely upon this. There might be some evidence that sound is particularly susceptible to this confusion. Flinker et al. suggest that, judging from auditory cortex activity, human brains cannot easily differentiate between our own voice and that of others, which implies that a solid demarcation between inside and outside our heads may be fragile.36 On one level, we approach diegetic worlds as being in some sense ‘real’, through the so-called suspension of disbelief. Indeed, as already noted, much culture, such as literature and film, is premised upon making fantasy seem real. Conversely, we can become confused when psychological ‘inner states’ are manifested in a seemingly ‘realistic’ format. In some diegeses, two different worlds can be present and set apart, which illustrates precisely the process of formulating constructions of ‘fantasy’ and ‘reality’ within the same space. These can play to or across the traditional tendency for audiovisual culture to use sound for interior states of mind and image for more objective situations. Yet this can often be less schematic, and rather than posing a clear sense of ‘a real world’ and an interior world, stylistic aspects can augur a sense of momentary change or amendment to subjectivity: distortions of perception as a way of expanding human perceptual experience and reach, and of questioning perception itself, or perhaps of perception questioning cognition.

Changed Perception, Underload and Overload

Perceptual distortion and setting cognition into uncertainty have been common strategies. From dream sequences to perception changed by drugs or psychosis, this altered state is habitually telegraphed through distortions and fugues from ‘zero degree’ film style, in terms of both sound and image.37 Extending stylistic conventions beyond the norm foregrounds and focuses on a sense of perception. Early on, it was noted that film had a dreamlike character, for instance by film theorists such as Ricciotto Canudo in his influential book Birth of the Sixth Art (published in 1911).38 The historical movement of Surrealism in the 1920s and 1930s was premised upon film’s ability to seem dream-like, or oneiric, and to exploit something close to a hypnagogic or hypnopompic state. For film analysis, writers often approached the subject using the same inspiration taken by many surrealists: psychoanalysis.39 While not sustained as a movement, Surrealism has had an isolated influence in terms of specific films and a
more pervasive influence around the edges of mainstream cinema, yet the dream sequence itself has become a standardized and accepted aspect of twentieth- and twenty-first-century audiovisual culture. Actual dreams are very personal, about our ‘inner selves’ rather than public and objective. Yet in film dreams are more public and more standardized in terms of format and aesthetics. They are often signalled in a certain formal manner, perhaps by repetitive harp music or by strange, illogical imagery. A good example of this is Salvador Dalí’s dream sequence in Hitchcock’s Spellbound (1945). For the film’s narrative development and conclusion, it is imperative that this level is accessed in order to supply a solution to the film’s central enigma. However, dreams can often be presented as if they are ‘more real than real’, giving us access to the ‘real’ inner character rather than simply their surface manifestation. The oneiric status of dreams on film bears a marked similarity to ‘psychedelia’ with its altered sense of reality and questioning of the status of ‘reality’. Psychedelia is difficult to define. Michael Hicks notes that by the late 1960s “it was a household term that people used to describe almost anything, from neo-expressionist paintings to strip shows”.40 It was commonly imagined to mean ‘drug-inspired’ (particularly by the hallucinogen LSD) or ‘multi-coloured’ and has implications of using a certain type of collaged visual design and espousing a sense of being ‘revolutionary’ and breaking with tradition. As a cultural trend, psychedelia was a massively influential idea for a few years in the wake of the so-called Summer of Love in 1967, which was a watershed for the cultural underground.41 Probably, the most concise definition of psychedelia is “the musical response to LSD”.42 This emphasizes its appeal to instant perception and its distortion.
In film, this itself often appeared as hallucinogenic interludes that resemble surrealist cinema in their stylization and grotesquerie. However, it also involved a very direct use of music associated with psychedelia and the counterculture, particularly as the late 1960s and early 1970s were a period where popular music developed radically.43 For a relatively short period, this coalesced into a distinctive sensorium of sound and vision. Some musical recordings associated with psychedelia were characterized by certain distinctive sound effects, including phasing and the use of magnetic tape recordings run backwards. The visual repertoire of psychedelia included mixtures of different colours, fragmented and obscured vision and juxtapositions of objects from different contexts, often mixing the very new with antiquated objects, as part of a distinctive design ethic. Also evident were the use of disorientating camera
movement, the use of particular camera lenses that distort images, self-conscious sunbursting from shooting into the sun and sometimes fast but often discontinuous editing. Psychedelic aesthetics regularly mimic the effects on perception from hallucinogenic drugs.44 More recently, research into the effects of hallucinogens has become more prominent. For instance, Kometer and Vollenweider investigated perceptual changes caused by hallucinogens, finding ‘visual intensifications’, ‘visual illusions’, ‘altered self-reference’, ‘elementary imagery and hallucinations’, ‘audiovisual synaesthesia’ and ‘complex imagery and hallucinations’. They concluded that a number of neuropsychopharmacological mechanisms, rather than a single one, are likely to be behind the effects.45 In a way, perhaps psychedelic films make for a more coherent discourse of audiovisual effects that, while inspired by the effects of drugs, mould the idea into specific ideas and, in particular, manipulations of the image and sound. Performance (1970) is one of the clearest examples of late 1960s psychedelia on film, a cultural point that George Melly called 1960s popular culture’s “noisy and brilliant decadence”.46 Shot in 1968 but released two years later, Performance is a remarkable film that remains elusive in many ways. Music is crucial for the film, as might be expected from a film that showcased Mick Jagger of the Rolling Stones in his first major acting role. Performance’s complexity allows for different readings and understandings, and the film’s music plays an important role in this. It is about contrasting and parallel performers, gangster Chas (James Fox) and reclusive rock star Turner (Mick Jagger). Chas intimidates businesses into paying protection money to gangster Harry Flowers, but when he kills an associate, the gangsters want him dead. He flees to the house of Turner, a rock musician who has lost his demon, where the two of them investigate each other’s identities.
The film is interesting for its eclectic range of music.47 In an early sequence in the film, the gangsters have set up the ‘piped music’ unit that they stole from the taxi company. The film cuts to Harry Flowers as he says, “I like that, turn it up”. This calls very direct attention to the music. The muzak machine in Harry Flowers’s office provides the closest thing the film has to the traditional Hollywood underscore, yet Jack Nitzsche composed it to sound like ambient ‘elevator music’ (or muzak). Its function initially seems to follow that of the traditional orchestral score, but it moves from the background into the foreground, explicitly articulating the image track and defining the logic of the sequence. This shift of figure
and ground has a disorienting effect. The gangsters are all in their office base, and the ensuing conversation is fragmented, not only by the interposing of non-diegetic insert shots related to what is said but also by seemingly discontinuous dialogue. The sequence starts as a conversation between the gangsters but culminates in a filmic regime of heightened subjectivity, where space is articulated around Chas and his point of view. Throughout the sequence, the music provides a fabric, from the initiation of the scene, where it is hardly audible, to the point where its volume is raised in the diegesis, to its final position in the subjective section, where it has marginalized diegetic sound and taken on some obtrusive distortion. In Unheard Melodies, Claudia Gorbman likens the music of classical cinema to muzak.48 She points to the shared functions, where both ease our anxieties and enable us to be more easily manipulated. In films, this ‘muzak’ enables our suspension of disbelief, making us more willing to believe what we see on screen. Like muzak, the traditional underscore is, in the vast majority of cases, not meant to be heard in a conscious manner. Yet here we have what I would argue is a parody of the scores in classical Hollywood films. It has the soupy string-dominated orchestral sound but is purposely banal. If it were in the mould of the classical film score, it would follow the dynamics of the action; instead, it forces the dynamics of the action to conform to itself. The music consists largely of a string section and a piano, playing a strong but simple, almost prosaic melody, accompanied by well-defined harmonic movement in the bass and chordal background. The timbre (instrumental sound) is more important than the melody, and the harmonic movement (the chords) is aligned firmly to the temporal structure. A sound dissolve brings the music to the fore, as part of this highly aestheticized sequence of audiovisual effects.
This involves music being distorted/treated as well as being loud, allied to changes in image tonality and continuity. The regularity of its structure becomes a skeleton upon which to hang the visuals, with editing built firmly on the musical structure. The piece of music is based on a regular rhythmic formation and a short 2-bar phrase structure, with chords changing at this regular interval. These two aspects regularize the music. In the ‘subjective’ section towards the end of the sequence, the music and the montage coalesce at the critical point. The sequence is a succession of reverse shots of gangster Chas and his boss Harry Flowers, with images distorting where Chas repeats a saying of his boss’ (“At the death, who’s left holding the sodding baby?”).

Figs. 6.3–6.6  Performance

They change tonality drastically, distorting the image in tandem with the sound of the music. At this point there are successive images of the other gangsters and ultimately a strange series of shots of Flowers from increasing distances (Figs. 6.3–6.6), while the music and his voice distort on the soundtrack, through phasing and the removal of high and low pitches so that they sound as if fed through a telephone. While the music supplies a standardized time to the sequence, the distortions make it feel like a shift outside of conventional time; an aesthetic parallel takes place between the image and soundtrack isomorphically, with the images being ‘treated’ in a similar manner to the music. Performance plays with expectation and deals constantly in aesthetic surprise and confusion, at least partly inspired by psychedelia’s rerouting of perception. Andy Clark suggests, “Perception itself is a kind of controlled hallucination. … [T]he sensory information here acts as feedback on your expectations. It allows you to often correct them and to refine them. But the heavy lifting seems to be being done by the expectations. Does that mean that perception is a controlled hallucination? I sometimes think it would be good to flip that and just think that hallucination is a kind of uncontrolled perception. …”49 Perhaps we should think of psychedelia as uncontrolled perception of reality. Psychedelia certainly appears interested in a reorientation of human perception. Clark points to the
work of Robin Carhart-Harris, Head of the Centre for Psychedelic Research in the Faculty of Medicine at Imperial College, London. He has been investigating “the idea that what serotonergic psychedelics do is relax the influence of top-down beliefs and top-down expectations so that sensory information can find new channels”.50 This seems a reasonable notion and one that might easily be transposed to our engagement with audiovisual culture, which might sometimes act to overwhelm perception and also work to limit top-down prediction and computation. This might help account for ‘altered states’ in terms of film style, as well as other issues in audiovisual culture. It is also clear that psychedelic culture can ‘overload’ with too much detail, particularly in visual design, providing an approximation of altered perception, but perhaps something else through cognitive overloading. Films, certainly when seen in the cinema with a big screen and loud sound, can have negative as well as positive physiological effects. They can be tiring or rejuvenating, exercising the emotions wildly, or they can leave a pervasive feeling of anxiety. The KTL version of The Phantom Carriage discussed in the previous chapter appears more physically exhausting than the experience of the film with a more conventional score. Indeed, one of the musicians’ regular ensembles, Sunn O))), is utterly draining in concert due to the combination of volume and vibration, along with low light conditions. I regularly treat my students by showing music videos on a large cinema screen and the effect can sometimes be overwhelming. For example, U2’s (Even Better than) The Real Thing (1991, directed by Kevin Godley) is a tiring experience due to the sheer amount of material presented to be processed alongside a radical aesthetic strategy. The spine of this music video is travelling images of the band performing, shot on a camera that loops above and below them.
As if this were not disconcerting enough, this is intercut with a plethora of other images in fast montages, some so fast you hardly get time to register what appeared onscreen. The speed of editing allows the viewer-auditor just enough or perhaps just not enough time to register what has appeared on screen, leading to a state of mental arousal, aiming at rapid cognition. The camera encircling the band has more of a direct physiological effect. I am always compelled to warn students when I show this to them, as from time to time I still feel slightly nauseous with this fairground-like camera movement. This underlines the point made in Chapter 2 that even though it is not real and consists of images on a screen, our perception registers the movement and the space as a real fairground ride.

Video games can often use the principle of increasingly overloading the abilities of the player, in terms of perception, cognition and motor movement, as the game progresses. This approach was evident early in video game history with Space Invaders (1978, Taito), where, as the player shot and destroyed more and more invaders that progressed down the screen, the pace of the last ones increased exponentially, making it difficult to shoot them and complete the level. As the player progressed to higher levels, the scenario remained exactly the same. The difference was that the pack of invaders would start closer to the player’s gun, to the point where they actually began immediately next to it. As with most shooting games, this tested the player’s perception as well as their hand-eye coordination and speed of reaction. A similar level of overload was central to Lemmings (1991, Psygnosis). The player needed to guide a large pack of ‘lemmings’ from an entry gate to an exit gate. The lemmings walk mechanically and relentlessly, requiring the player to put objects in place to help them past the obstacles (‘blockers’, ‘diggers’, ‘wallsmashers’, stair builders, umbrella parachutes, etc.). Lemmings are released regularly from their entrance portal, and at a certain point the sheer number of them overwhelms (or very nearly overwhelms) the player’s ability to save them. Games like this can start with a fairly pedestrian pace but build to a conclusion of frenzied activity. A more recent game with a corresponding sense of relentlessness and overload is Plants vs Zombies (2009, PopCap).
As a horror comedy-style ‘tower defence’ game, Plants vs Zombies requires the player to defend a house (screen left) from a slowly moving siege of relentless zombies who traverse a lawn from the right side of the screen (sometimes with a swimming pool) and later a roof.51 Defence is achieved through planting various vegetables, fungi and flowers that counter the zombies through blocking, exploding or showering them with projectiles. It has horizontal lanes of movement and many different types of zombie, beginning each level slowly with isolated attacking zombies but concluding with a massive torrent of them.52 The first levels on the back lawn of the player’s unseen house are accompanied by a piece of music by Laura Shigihara that lasts just over 2 min and simply repeats once finished, not fitting activities on the screen. It has a strong rhythmic impetus that is not going to be interrupted (and is a tango). The uniform regularity is essential. The structure is premised upon units of four bars, with the opening tango lasting 16 bars, followed by 8 bars of soaring oboe melody (the same 4 bars repeated), then an orphan
4-bar ‘drop out’ section, then pizzicato strings for 8 bars followed by the same with added sustained strings for 8 bars, leading to piano arpeggios of 16 bars (the same 4 bars repeated), after which the piece repeats all over again. Apart from one section, all are 8 or 16 bars, but the 4-bar strophe is the fundamental structural unit, with regularity giving something of a mechanical character to proceedings. Just after the first 8 bars, when the music begins repeating, an eerie sound accompanied by a voice intoning ‘the zombies are coming’ gives warning that a zombie is about to appear on the right side of the screen. The relation of music to action is almost negligible, with the regularity of the music furnishing something of a mechanical character to gameplay. As the player waits for the zombies’ appearance on the right part of the screen, an eerie wavy synth sound enters with a voice-over whisper saying, ‘the zombies are coming’. At 0:17 a string melody enters and a lone zombie appears shortly afterwards. At 0:48 brass enters and another zombie appears. At 1:29 both strings and brass can be heard as two zombies appear simultaneously. The tango simply halts abruptly when the level is finished successfully, with a crudely intruding burst of unconnected jazz guitar. At the ‘endgame’ section, a drumbeat enters as an accompaniment to the existing music, appearing kinetically to ‘choreograph’ movement through grabbing proceedings by the scruff of the neck as what is billed on-screen as a ‘massive wave of zombies’ approaches at the conclusion of each level. The assumption is that the beat matches the excitement of action (and chaotic simultaneity on screen). Apart from the crude succession of jazz guitar if the zombies are vanquished, there is another possible conclusion precipitated again by a vulgar interruption. Four notes of ponderous sinister cartoon music materialize if the zombies break through and win (eating the player’s brains).
Overall, the music of Plants vs Zombies expresses a burlesque of horror, warding off the possibility of any actual terror, yet its regularity is related directly and isomorphically to the unceasing regularity of the zombie movement and the gameplay. The game’s aesthetics contain some clear homologies or parallels. The relentless forward movement of zombies in Plants vs Zombies homologizes the looped musical accompaniment. Such game music is in effect simply ‘a countdown’ and on one level potentially a more general metaphor for being overtaken by age and death. While these two channels (of sound and image) may involve similar logics, they are not, strictly speaking, matching. They make two separate and parallel paths of ‘inevitability’, and the lack of integration between them suggests a strange, aberrant psychology.

The aesthetic situation has distinct ramifications for perception. We may be forced into a déjà vu, splitting the signal in the brain with a delay in reception and processing, particularly as a discrepancy exists between the speeds of aural and visual perception and processing.53 Sound having to be held up slightly for image impulses illustrates the way the brain is not only a parallel processing device, channelling impulses to different regions and working upon them simultaneously, but also makes unified sound-image signals through using different parts of the brain. Sometimes, brains appear able to go ‘out of synch’, dividing and confusing broad cerebral functions. Perhaps on one level, this is like the ‘brain-dead’ zombies, who appear only to have lower brain functions, based around the cerebellum, which governs motor activity and rhythm-oriented activity. The player in Plants vs Zombies is forced into an out-of-synch engagement with the game whereby they must think slightly ahead of the action, planning in advance to halt the zombie attack rather than simply reacting to the immediate threat. If the game arguably embodies physically something of a split in the brain, this is compounded by the music’s relationship to the gameplay. This seeming ‘split brain’ aspect of the game is also homologized by the game’s central depiction of ‘social’ division. The game’s scenario functions as a clear metaphor: the zombies stand for the social underclass invading the lawns of respectable suburbia. Here, garden plants, which are a ‘useless’ sign of the ‘cultivated’ middle class, prove effective as a bulwark against the great unwashed. Thus, this ‘cultivation’ destroys barbarism, and in evolutionary terms agricultural ‘planters’ succeed over the more primitive hunter-gatherers.
The player embodies the ‘civilized’ and must play through prediction and forethought, while the zombies represent the mindless masses, who ‘live’ (if you like) from hand to mouth. This and other video games with time constraints put increasing pressure on the player, slowly tipping the player into the realization of likely failure and compounding the ‘paranoiac environment’ that Gillian Skirrow ascribes to video games.54 Part of this is surely the principle of ‘overload’, which overwhelms the player not only in terms of the physical demands of gameplay but simultaneously in psychological terms, through the inability to deal with incoming information. Our senses crave stimulation. Sensory deprivation is bad for us. This fundamental physiological and psychological craving is demonstrated by the negative responses that occur when sensory stimulation is withdrawn. As I noted, physiological baselines provide psychological horizons and sensory
deprivation radically changes the psychological disposition and cognitive abilities of a human subject. Indeed, this is one of the foundations of mind control, torture and brainwashing processes. Psychologist D. O. Hebb attempted to persuade behaviourist-dominated psychology in the early to mid-twentieth century that rather than simply logging behaviour, a focus on neural mechanisms would allow for explaining that behaviour.55 His proposition was one of the earliest conceptualizations of the human neurological system as groupings of ‘cell assemblies’, which matches current thinking about neurons and mental activities. Hebb was also one of the first to note the effect of depriving the senses of input. When there is little sensory input, the brain produces sensations that appear to be real. So-called ganzfeld experiments have underlined this.56 A ‘ganzfeld’ white field occupying a subject’s visual field leads to eye receptors shutting down, while the equivalent for sound, white noise, fatigues and underloads ear receptors. As we struggle to make sense of the lack of signals, the visual and auditory cortices amplify any perceptual signals received and ‘hallucinate’ to fill empty space. Exposure to a long period of silence can trigger musical ear syndrome (MES), where people can think they hear music that empirically is not present. One explanation for these ‘phantom sounds’ is apophenia, specifically audio pareidolia. This occurs when our perception tries to find patterns in the noise and might find partial patterns that it develops, filling in the missing pieces to create a specific sound that is not actually there but can seem so to us. I have experienced apophenia in sound and vision a number of times but none more keenly than when I went on a wildlife safari in Sweden. In the middle of a remote forest, a few of us sat for a whole sleepless night in a shed looking out at a clearing frequented by bears. That night they did not frequent it, though.
Expectation, low-light conditions and almost total silence led to constant images and sounds of bears appearing momentarily but without substance. Stranger things appeared, too. I saw strange visions, once it was almost pitch darkness, of odd and disturbing creatures and even of myself at one point. In terms of sounds, hearing speaking at some distance was fairly regular, and occasional closely spoken words, too, and even at one point hearing a burst of the choir-style chorus from King Crimson’s 1960s song The Court of the Crimson King. But in actuality there was nothing and definitely no bears. It is, of course, hard to tell how much of this relies on perception making solid gestalts, and how much of it is post-perceptual organization and making sense. What is clear, though, is that the lack of stimulation of seeing and hearing leads to that lack being filled by our perceptual mechanisms and our semi-conscious imaginations.

Perceptual Health

The ‘gymnasium’ of the chapter title refers to a strong sense of a space and the ‘healthy’ aspects of that audiovisual space. Since the Millennium, the health industry has carried on growing apace, alongside such workplace concerns as ‘health and safety’. During one of the many ‘health and safety’ courses I have been forced to attend (and where I failed the examination at the end), I found myself thinking about health and video games. The ‘health bar’ or ‘health meter’ is an important fixture in first-person shooter video games. It is crucial, showing how much ‘life’ your avatar has left. You can usually replenish life through searching and finding certain objects. First-aid kits are a popular choice for games, but if you let your avatar’s health ebb away, it is the end of the game or the end of that ‘life’ for the player. This is of clear importance in FPS games, which arguably are about lives, losing and regaining them, and regeneration through finding and collecting useful objects in the player’s travels. Indeed, there are few audiovisual forms more concerned with health than video games. Wii Fit (2008, Nintendo) is probably the most explicit in this regard. It is an integrated system for exercise that includes a balance board, and during yoga and strengthening exercises the audiovisual world aims to promote a sense of well-being. The on-screen trainer looks fairly realistic but not enough to disturb, while the musical accompaniment is easy and unassuming to the point where we don’t consciously register it, although the music can remain in our heads through repeated usage. This is clearly what is considered a relaxing and positive environment in terms of sound and image. The irony is that looking closely at screens is widely considered detrimental to our eyesight and general health. We are recommended to follow the ‘20-20-20 rule’.
With this, we are meant to look away from our screens every 20 minutes or so, to an object that is around 20 feet away, and do this for a full 20 seconds.57 This emphasizes dynamics and perception. Changing focal length is crucial here. This is simulated in audiovisual culture and we might imagine that there is something healthy about a sense of dynamics; after all, looking at the same thing and listening to an unerring mechanical drone for hours is the sort of strategy used as torture. On the other hand, sound and enveloping environments are regularly used as therapy. Acoustic ecology has had a significant influence on the sense of there being a ‘healthy’ soundscape. This has also had a practical effect on noise abatement, town planning and architecture.58

Broadly speaking, processing a dynamic signal is healthy.59 Being static and staring at a computer screen all day is bad for the eyes, and the constant use of ear buds for sound can be damaging for the ears. Our senses crave stimulation. Sensory deprivation is used for torture, but most people are more used to aesthetic dynamics that verge on the overloaded. Humans are not ‘built’ to process sustained rapid successions of images or sounds, nor are they ‘built’ for loud sounds and bright lights. Loud sound can damage hearing and, along with undynamic sound signals, has also been used as a central agent in forms of torture and mind control. Approaching audiovisual culture through perception supplies a decidedly physiological dimension to conceiving our relationship with it. While video games clearly have a physiological engagement, film and television are more often conceived in a more ‘Cartesian’ sense of distant processing by the skull-bound mind. It is possible that audiovisual objects might affect us physiologically as much as cognitively. Perhaps the dynamics of artificial deep space are good for you (as a perceptual ‘workout’). This clearly is not related to ‘survival fitness’ but more to the requirements of human hardware (eye lens, etc.), and so while this might have a small connection to evolutionary adaptation, the process is decidedly indirect. It is a translation across two ‘borders’, which makes it less a simple adaptation but perhaps more complex and interesting, as it could be taken as a shadow of physiological activity, transformed into an exaggerated but particularly satisfying form. This is most clearly evident in video games where the avatar habitually is able to exceed the player’s normal physical capabilities by far. Audiovisual culture is able to have a negative (and positive) physiological effect (to be tiring or rejuvenating, to make us happy or unhappy).
Again, thinking in physiological terms, perhaps audiovisual culture might be understood less as ‘information’ and more as something closer to nutrition. As such, it is a banal necessity for the senses, which require stimulation. Yet it is also able to be a sensual feast of great awe but equally can become something of a siphon, where consumption can turn into habitual gluttony. In some ways it might be characterized as a health issue, pertaining to exercise and (over)indulgence of the senses. Similarly, the origins of perception are geared towards dynamics, and the audiovisual is cut to this measure in its ‘healthiest’ form. The necessity for audiovisual dynamics is important for physical health, yet similarly, we might begin to understand audiovisual culture in another physical way: as similar to eating, with its related issues of compulsion for basic nutrition and the capacity for over-consumption. While there is a consistent desire for access to audiovisual culture in its many forms, there is a constant social anxiety about over-stimulation and the negative effects of a spiralling ‘gluttony’ for this material.

Conclusion

Physiological baselines provide psychological horizons. Dynamic variation is the lifeblood of audiovisual culture, and this is most clearly evident in the depiction of landscapes, where sound and image are able to forge a highly dramatic sense of space. This audiovisual space is created through a repertoire of image and sound techniques that yield a desirable illusory space. Mimicking distant vistas with extreme long shots, and large spaces through sound and music with electronic reverb, audiovisual landscapes have a haptic as well as an emotional sense. This chapter addressed the central play of dynamics evident in audiovisual culture, which might be conceived as one of its central and defining logics. This included discussions of culture that limits dynamics and of points where sensual ‘overload’ and tiredness are essential parts of cultural objects. Dynamics are crucial for perception, and their configuration into a consumable format in audiovisual culture arguably has a function in helping regulate the brain. They certainly exploit its central tendencies. Rudolf Arnheim noted, “We envisage the human mind as an interplay of tension-heightening and tension-reduction strivings. The tendency toward tension-reduction cannot run its course unopposed, except in the final disintegration of death”.60 While video games clearly exploit this, it is also a clear strategy in thrillers and horror films, and sometimes this process is mirrored extremely clearly in the incidental music’s regular tension-release structures. These are premised upon repose and anxiety, usually built by narrative development, although in perceptual terms they can follow the music’s building of tension and ultimate release, and so engage both top-down and bottom-up processes rather than simply being based on the illusory situation on screen and narrative context. They remain importantly physiological in their engagement of the body: not only the perceptual organs but also connections to rates of breathing and heartbeat.

The sense of space and aesthetic dynamics are mutually implicating and productive. Landscapes in the real world are characterized by distance and dynamics, and audiovisual culture negotiates these attributes through audiovisual dynamics, long shots supplying visual space, while music and sound cues also have their own sense of dynamics. Both sound and image have their own intrinsic dynamics but, crucially, merge into a new configuration of audiovisual dynamics. Although the metaphor is not ideal, as it is ocularcentric, there is a longstanding saying that the music can either be ‘camera’ (part of narration) or ‘set’ (part of atmosphere, emotional tone). Some television programmes, for instance, provide an instant emotional repertoire and sense of imagination for the show as part of its ‘audiovisual landscape’. We can feel as if we make a ‘visit’ to some drama programmes, like a short holiday, perhaps. Both Wallander and Midsomer Murders are fine examples of this, both concentrating on a coherent audiovisual landscape for the audience to ‘enter’. In a similar manner, Star Trek (retrospectively relabelled ‘The Original Series’, 1966–1969) has a strong feel for its spaceship location through not just the coherent visual structure of the sets but crucially also through the selection of ambient spaceship sounds that belong with it. Indeed, the visuals without those sounds would not be convincing, and for those of us who have watched the show regularly repeated since childhood, there is the reassuring feeling of visiting the ‘place’ of the bridge of the starship USS Enterprise. This sense of coherent sonic and visual landscape is evident in many dramas, in particular in science fiction series, even though they aim for an exoticism of their visual and sonic landscapes that they might visit as ‘alien planets’. While we might imagine this comes from the tradition of ‘sound effects’ rather than music, I would suggest not. Sound becomes wielded in the manner of music, as a modal and emotional element rather than necessarily an object to furnish a sense of reality to the programme world. For instance, the cell phone ringtone in Branagh’s Wallander or the strangely foregrounded animal sounds in Midsomer Murders do little if anything for narrative development but instead give a sense of texture to the show that marks it out as distinctive. Indeed, in the case of the latter, it would be a jarring surprise actually to be shown the night-time off-screen barking fox. The fox’s sound has become decoupled from the animal and reintegrated into the electronic landscape, integrated with dark images of the characteristic location.

Notes

1. Rudolf Arnheim, Art and Visual Perception: A Psychology of the Creative Eye (Berkeley: University of California Press, 1974), p. 411.
2. Anon, “7 Sensory Overload Symptoms to Look Out For” at Experia USA, 15 November 2020. https://www.experia-usa.com/blog/7-sensory-overload-symptoms-to-look-out-for/ [accessed 27/6/2022].
3. John Urry implied this in his influential book The Tourist Gaze (London: Sage, 1990).
4. Virtual reality (VR) is ‘enterable’ as a 3D space, particularly with immersive headphones. In fact, headphones are particularly significant, allowing us to carry a particular soundscape around us like our own mind.
5. Graeme Harper and Jonathan Rayner, “Introduction—Cinema and Landscape” in Graeme Harper and Jonathan Rayner, eds., Cinema and Landscape: Film, Nation and Cultural Geography (Bristol: Intellect, 2009), p. 19.
6. Cooder’s music is based primarily on the blues piece Dark Was the Night, Cold Was the Ground by Blind Willie Johnson.
7. Mark Prendergast, The Ambient Century: From Mahler to Moby (London: Bloomsbury, 2001), pp. 131, 52–53.
8. Brian Eno, cover notes of Ambient 4: On Land (1982 EG Records EG 2311107).
9. Composer Adam Norden discusses this in the sleeve notes for his incidental music recording, Henning Mankell’s Wallander (Kritzerland KR20023-2, 2012).
10. In November 2009, the Royal Television Society presented the series with two awards at the 2009 RTS Craft & Design Awards; Aidan Farrell at postproduction house The Farm was presented with the Effects (Picture Enhancement) award, and Martin Phipps and Emily Barker with the Music (Original Title) award for the opening theme.
11. Michael Tapper, “More than Abba and Skinny-Dipping in Mountain Lakes: Swedish Dystopia, Henning Mankell and the British Wallander series” in Film International, vol. 7, no. 2, April 2009, pp. 60–69.
12. Indeed, Henning Mankell’s books and the television shows have increased tourism in Ystad. In the small ‘Tourist Information’ centre it is possible to buy a small handful of quite incongruous Wallander objects.
13. Daniel Grimley, Grieg, Music, Landscape and Norwegian Identity (Woodbridge: Boydell and Brewer, 2006), p. 56.
14. It includes parallel harmonies, with the theme associated with Kurt starting with a minor chord and then moving to another minor chord 3 semitones higher. The music also includes looped harp arpeggios and sounds with electronic echo.
15. Apparently, “… no fan has been able to track down the name or composer of the tune used in the ring tone, despite repeated requests to Sony Ericsson and the BBC. The tone, it can now be revealed by Telegraph.co.uk, was especially composed for the programme by Lee Crichlow, the sound effects editor for Left Bank Pictures. The company is the independent production house that co-produces the English-language show for the BBC.” Harry Wallop, “Wallander: Mobile Ringtone Mystery Solved”, Daily Telegraph, 13 Jan 2010. http://www.telegraph.co.uk/culture/culturenews/6983024/Wallander-mobile-ringtone-mystery-solved.html [accessed 20/1/20].
16. Annabel J. Cohen, “Congruence-Association Model of Music and Multimedia: Origin and Evolution” in Siu-Lan Tan, Annabel J. Cohen, Scott D. Lipscomb and Roger A. Kendall, eds., The Psychology of Music in Multimedia (Oxford: Oxford University Press, 2013), pp. 17–37; David Ireland, Identifying and Interpreting Incongruent Film Music (New York: Palgrave Macmillan, 2018).
17. BFI Television Yearbook (London: BFI, 2003), p. 39.
18. Sabine Schreiner and Joan Street, edited by Antony J. Richards, Midsomer Murders on Location (Cambridge: Irregular Special Press, 2010).
19. However, the strategy changed in 2019 to have new individual scores by different composers.
20. Which appeared on the album The Gospel According to the Meninblack (1981). A similarly ironic-sounding waltz is Shostakovich’s Waltz No. 2 from Suite for Variety Stage Orchestra, which was used notably in Kubrick’s Eyes Wide Shut (1999).
21. The chord progression starts on a minor chord and then moves down two semitones to make a parallel minor chord, followed by a seventh chord 2 semitones lower and then a seventh chord a semitone lower (Im-VIIm-VI7-V7). This is a variation on what is sometimes called ‘the Aeolian Progression’, a relatively common descending chord progression in minor keys (or the Aeolian mode).
22. The track is called ‘Driving Home’ on the CD release.
23. He states this in Grant Gee’s film Patience (After Sebald) (2012).
24. Which English modernist composer Elisabeth Lutyens allegedly called ‘Cowpat’, a name that stuck.
25. Eric Blom, Music in England (London: Pelican, 1943), p. 200.
26. K.J. Donnelly, “Emotional Sound Effects and Metal Machine Music: Soundworlds in Silent Hill Games and Films” in Liz Greene and Danijela Kulezic-Wilson, eds., The Palgrave Handbook of Sound Design and Music in Screen Media: Integrated Soundtracks (New York: Palgrave, 2016), pp. 73–88.
27. Sergei M. Eisenstein, Nonindifferent Nature: Film and the Structure of Things, Herbert Marshall, trans. (Cambridge: Cambridge University Press, 1987), p. 389.
28. Ibid., p. 389.
29. Tadhg Kelly, ‘Worldmakers [Game Design]’, www.whatgamesare.com/2010/12/worldmakers-game-design.html [accessed 17/06/2015].
30. Geoff King and Tanya Krzywinska, Tomb Raiders & Space Invaders: Videogame Forms and Contexts (London: I.B. Tauris, 2006), p. 76.
31. Giuliana Bruno, Atlas of Emotion: Journeys in Art, Architecture, and Film (London: Verso, 2002), pp. 64, 30.
32. Video game music traditions are discussed in detail by Karen Collins, Game Sound (Cambridge, MASS.: MIT Press, 2008), and Richard Stevens and Dave Raybould, The Game Audio Tutorial: A Practical Guide to Sound and Music for Interactive Games (London: Focal Press, 2011), p. 169.
33. Visiting http://postmodsoftworks.com/ will show you that the company no longer exists and developed only this game.
34. Espen Aarseth, “Computer Game Studies, Year One” in Game Studies, vol. 1, issue 1, July 2001. http://gamestudies.org/0101/editorial.html [accessed 12/5/2009].
35. Philip Kirby, “Musical orientation in virtual space: videogame score and the spatiality of musical style and topic” in Social and Cultural Geography, vol. 23, issue 6, 2020. https://www.tandfonline.com/doi/full/10.1080/14649365.2020.1821392 [accessed 20/2/2021].
36. Adeen Flinker, Edward F. Chang, Heidi E. Kirsch, Nicholas M. Barbaro, Nathan E. Crone and Robert T. Knight, “Single Trial Speech Suppression of Auditory Cortex Activity in Humans” in Journal of Neuroscience, vol. 30, no. 49, 2010, pp. 16643–16650.
37. Although sometimes films can lie to us about the status of what we are shown. A good example is in An American Werewolf in London (1981), where the lead character wakes from a disturbing dream only to be attacked again, with that diegesis also turning out to be a dream.
38. Ricciotto Canudo, Birth of the Sixth Art (New York: Alfred A. Knopf, 1911).
39. Robert T. Eberwein, Film and the Dream Screen: A Sleep and a Forgetting (Princeton, N.J.: Princeton University Press, 1984).
40. Michael Hicks, Sixties Rock: Garage, Psychedelic and Other Satisfactions (Chicago: University of Illinois Press, 2000), p. 58.
41. George Melly, Revolt into Style: The Pop Arts in Britain (London: Penguin, 1970), pp. 115, 106.
42. Andy Davis, Mark Paytress and John Reed, “The Beatles and Psychedelia” in Record Collector, no. 166, June 1993, p. 20.
43. See further discussion in K.J. Donnelly, “The Psychedelic Screen” in Magical Musical Tour: Rock and Pop in Film Soundtracks (London: Bloomsbury, 2015), pp. 31–43.
44. Stuart Laing, “Economy, Society and Culture in the 1960s: Contexts and Conditions for Psychedelic Art” in Christoph Grunenberg and Jonathan Harris, eds., Summer of Love: Psychedelic Art, Social Crisis and Counterculture in the 1960s (Liverpool: Liverpool University Press, 2006), p. 32.
45. Michael Kometer and Franz X. Vollenweider, “Serotonergic Hallucinogen-Induced Visual Perceptual Alterations” in Current Topics in Behavioral Neurosciences, vol. 36, 2018, pp. 257–282.
46. Melly, op.cit., 1970, p. 123.
47. Prominent among the musicians performing Jack Nitzsche’s score is slide guitarist Ry Cooder, who includes some of Dark Was the Night, Cold Was the Ground (originally by bluesman Blind Willie Johnson), which later formed the foundation for Cooder’s acclaimed score for Paris, Texas (1984). Nitzsche assembled a score, which I have called a ‘composite’, as it was composed on a macrolevel from mostly existing or generic elements on a microlevel, making for a diverse assemblage that nevertheless had a cohesive character and unity. K.J. Donnelly, “Performance and the Composite Film Score” in K.J. Donnelly, ed., Film Music: Critical Approaches (Edinburgh: Edinburgh University Press, 2001). Also cf. K.J. Donnelly, “Jack Nitzsche’s Performance” in Mark Goodall, ed., Gathering of the Tribe: Music and Heavy Consciousness Creation (London: Headpress, 2013), pp. 267–272.
48. Claudia Gorbman, Unheard Melodies: Narrative Film Music (London: BFI, 1987), pp. 56–59.
49. Anon, “Perception as Controlled Hallucination: Predictive Processing and the Nature of Conscious Experience: A Conversation with Andy Clark” in Edge, 8 September 2021. https://www.edge.org/conversation/andy_clark-perception-as-controlled-hallucination?fbclid=IwAR0XTKw8SWMiW4cLDwOTWu2P3icztzl6fBSZkQKy-dmzkQM4BNB77TyLHIo [accessed 12/9/2021].
50. For instance, Carhart-Harris and others found that psilocybin dramatically reduced oscillatory power in certain areas of the brain. Suresh D. Muthukumaraswamy, Robin L. Carhart-Harris, Rosalyn J. Moran, Matthew J. Brookes, Tim M. Williams, David Errtizoe, Ben Sessa, Andreas Papadopoulos, Mark Bolstridge, Krish D. Singh, Amanda Feilding, Karl J. Friston and David J. Nutt, “Broadband Cortical Desynchronization Underlies the Human Psychedelic State” in Journal of Neuroscience, vol. 33, 2013, pp. 15171–15183; also see Frederick S. Barrett, Samuel R. Krimmel, Roland R. Griffiths, David A. Seminowicz, Brian N. Mathur, “Psilocybin Acutely Alters the Functional Connectivity of the Claustrum with Brain Networks that Support Perception, Memory, and Attention” in Neuroimage, September 2020, p. 218.
51. Originally to be called ‘Lawn of the Dead’, Plants vs Zombies was developed by PopCap Games and published in 2009 initially for PC and Mac and was later ported to Xbox, PlayStation, Nintendo DS and mobile phones (iOS, Android and BlackBerry).
52. See a more detailed analysis in K.J. Donnelly, “Lawn of the Dead: The Indifference of Musical Destiny in Plants vs Zombies” in K.J. Donnelly, William Gibbons and Neil Lerner, eds., Music in Video Games: Studying Play (London: Routledge, 2014), pp. 151–165.
53. While individual responses vary, auditory information is processed faster. The brain commonly activates 30–50 msecs earlier for sound than for image. Rob L.J. van Eijk, Armin Kohlrausch, James F. Juola and Steven van de Par, “Audiovisual Synchrony and Temporal Order Judgments: Effects of Experimental Method and Stimulus Type” in Perception and Psychophysics, vol. 70, no. 6, 2008, p. 955.
54. Gillian Skirrow, “Hellivision: An Analysis of Video Games” in Colin MacCabe, ed., High Theory/Low Culture: Analyzing Popular Television and Film (New York: St. Martin’s Press, 1986), p. 130.
55. D.O. Hebb, The Organization of Behavior: A Neuropsychological Theory (New York: Wiley, 1949).
56. Ann Pietrangelo, “What is the Ganzfeld?” at Healthline, 15 October 2020. https://www.healthline.com/health/ganzfeld-effect#how-to [accessed 21/2/2022].
57. Anon, “Eye Safety” at Royal National Institute for the Blind (RNIB). https://www.rnib.org.uk/eye-health/safe-eyes [accessed 19/7/2022].
58. Jing Chen and Hui Ma, “A Conceptual Model of the Healthy Acoustic Environment: Elements, Framework, and Definition” in Frontiers in Psychology, 29 October 2020. https://www.frontiersin.org/articles/10.3389/fpsyg.2020.554285 [accessed 15/2/2022].
59. Vaibhav Chhaya, Sutirtha Lahiri, M. Abhinava Jagan, Ram Mohan, Nafisa A. Pathaw and Anand Krishnan, “Community Bioacoustics: Studying Acoustic Community Structure for Ecological and Conservation Insights” in Frontiers in Ecology and Evolution, vol. 9, 2021. https://www.frontiersin.org/articles/10.3389/fevo.2021.706445 [accessed 29/6/2022].
60. Rudolf Arnheim, Art and Visual Perception: A Psychology of the Creative Eye (Berkeley: University of California Press, 1974), p. 411.

CHAPTER 7

Conclusion

In Robert Flaherty’s documentary Nanook of the North (1922) there is a startling moment. It may have been staged with a certain naivety at the time, but now it is difficult, perhaps impossible, to watch without a sense of irony. Nanook listens to a gramophone and is amazed by ‘how the white man cans his voice’. Nanook is apparently bemused and impressed by the ‘magic’ of recorded sound. This sequence sums up western technological, as well as cultural and spatial, hegemony. Flaherty and his crew came to the Inuit land (in northern Canada on maps) and proceeded to make a spectacle of Nanook and his people, anthropologically categorizing and defining him, and then making him look childlike and enchanted by unfamiliar technology (although apparently he already knew of the gramophone and was acting this way at Flaherty’s behest). An interesting point that cannot be overlooked (rather than overheard) is that this is a silent film with no dedicated recorded soundtrack. I have seen the film with different music added. In each case, there was a feeling of uncertainty about how this should be accompanied. Should it be music; should it be a voice (Flaherty’s perhaps); should it be a happy tune or a national anthem? Or perhaps it should be the silence that never would have accompanied it at the time of its initial release.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 K. J. Donnelly, The McGurk Universe, Palgrave Studies in Audio-Visual Culture, https://doi.org/10.1007/978-3-031-18633-2_7

Fig. 7.1  Nanook of the North

Does Nanook think he is listening to reality, or does he think he is listening to technology? I have been concerned less with technology than I would have liked in this book. In recent years, I have been impressed by Friedrich Kittler’s hard focus on technological developments and mediation. With his famous motto of ‘media determine our situation’, he declared that media form the foundational infrastructure as well as the quasi-transcendental condition for human experience and understanding.1 This is particularly apparent with neuroscience, where a new generation of auguring hardware has enabled the radical development of the field and defined corresponding emergent insights. Researchers could register this more often than they do. The history of audiovisual culture might well be the history of its technology, but during that period of development, and seemingly for long before that, the human form has remained constant. It is likely that many of the attractions of audiovisual culture, although developing in certain ways, have retained strong underlying continuities. Audiovisual culture must directly exploit the propensities of perception that are at the heart of being human. Someone attending a film screening in 1905 would not have perceived the moving images as a large selection of still but related images, just as they would not have concentrated on piano accompaniment as if it were a concert impaired by darkness and the film being screened. The overall effect, the ‘film’, was a merging of these elements into a different order of objects. So, rather than the delivery of the truth to our faculties, our perception delivers something improved, reordered, and in some cases better. While it might be even better than the real thing, one concern in this book has been the phenomenological situation of the audience thinking audiovisual culture is real on some level, rather than realism as a cultural discussion about how realistic people think something is, or what the idea might mean to them individually or as a society, although the other two aspects are clearly also highly significant. The combination of moving images with sounds and music has become a core of audiovisual culture and a dominant across the world, embracing film, television, video games and Internet mini films. This audiovisual culture is geared precisely to our perceptual hardware, and negotiates this before engaging higher functions (such as cognition of narrative or thinking about gameplay). It is important to note the direct but ambiguous relationship between psychology and aesthetics, which is an interesting aspect of inquiry, yet I have learned that many of the remarkable insights that have come from neuroscience in the past quarter of a century are clues but cannot be taken as the final truth about the physiology of the brain and its complex workings. Indeed, science regularly develops and reorders its provisional conclusions, yet much of the material I have referenced pertains to experimental work about perception, cognition and the human brain and appears more like accrued knowledge than speculation.
Audiovisual culture has become pervasive not simply as entertainment and diversion but also as a crucial mediation between human psychology and physiology, as well as between the material (‘reality’) and the imagination. Perhaps this is why we increasingly cannot differentiate reality and simulation, and the line between them has become a thin, permeable membrane. If our physical senses bridge the divide between ‘us’ and the world, audiovisual culture is another level of the same process, where a few operations mediate the on-screen (and in-speaker) world for us. Most of these are audiovisual culture aesthetics and conventions. So, they can tell us something about human perception more generally.

The McGurk Effect has significant implications for understanding audiovisual culture. It forces us to focus on perception, emphasizing the physiological rather than film as merely a ‘cerebral’ experience. Whilst I may not think that this is the bottom line, it is a worthwhile endeavour to address the functionality of audiovisual culture and ask whether it is ‘more than entertainment’. A lot of time and effort is spent making us think that it is merely a ‘product’ that we should consume and then forget. The power of individual films, television programmes and video games militates against this ‘flattening’ approach. Yet the net effect of this flattening approach is to make us focus on the discourse rather than the individual cultural objects, rather in the manner of not caring about individual animal specimens in favour of a notion of the whole of the species. Sometimes that is a useful heuristic approach.

Physiological baselines provide psychological horizons. As Joseph Anderson notes, the perceptual basis is a common human denominator and is even cross-cultural.2 Audiovisual aesthetics provide the physiological position, as a process of mirroring. Attending to the physiological is a way of bypassing the (bourgeois?) insistence on everyone’s individuality or the trite insistence that ‘everyone is different’ or ‘everyone has a different reality’.3 A zoological approach would point to the essential similarity of human beings and the superficiality of differences in biological terms. Mechanisms of vision and sound favour certain things (midrange sounds, certain colours, limited movement, simple shapes, etc.), and these are general perceptual points that ‘fit’ the limitations and affordances of the human form’s perceptual ‘hardware’.

My degenerating faculties remind me of disabilities and how they might relate to audiovisual culture. In fact, some aspects of films shown in the cinema with sharp focus and loud sound are an aid to limited visual and aural abilities. The extremely rare Sensory Processing Disorder (SPD) illustrates something about how bottom-up perception and top-down cognition function.
Although it is currently not fully recognized as a medical condition, its previous name ‘sensory integration dysfunction’ tells of its nature.4 The issue is with how the perceptual signals are dealt with once received in the brain, where they are not integrated and regulated. Some stimuli (often sounds or colours) are ‘over-perceived’, while others are ‘under-perceived’, which can lead to sufferers exhibiting under-responsive behaviour and being physically uncoordinated in relation to their environment. This neurological condition is not related to autism, although neurodiversity has become increasingly accepted as a means of understanding that all people may not compute in the same way and indeed can vary radically in their responses. In both cases, these look likely to be top-down functions, potentially underlining the baseline of audiovisual culture being geared towards standard perception. Autism is now understood as a number of conditions corralled together rather than a single unitary condition, and consequently responses can vary significantly. One common issue is ‘Simultagnosia’ or object blindness, which is an inability to recognize multiple elements in a scene.5 Signals are not integrated and computed as a coherent whole, and while some elements might be highly appreciated, the sense of unity among component parts is absent.

There are some good, logical reasons why we might think of putting perception at the heart of audiovisual analysis. I would suggest a compelling one is the immediacy of sound and image, its ‘presence’ (close to us, not avoidable). In essence audiovisual culture is not like literature. It is sensual and causes strong emotional responses. The smallest children or people with advanced dementia can ‘get something’ from film, as in the first instance it works on a most basic level of perception of dynamics in sound and image and their interaction. Approaches derived from Gestalt psychology have a significant potential for engaging with audiovisual culture, precisely as a merging of audio and visual, and I hope that in the future more (and better) analysis will be achieved by scholars inspired by this strategy. The McGurk Effect clearly illustrates the shortcomings of such current theoretical approaches as ‘multimodality’ and atomized Cognitive Psychology-inspired aesthetic analysis.

In general, this book aims to help reconceptualize audiovisual culture less as ‘information’ and more as something tied to physicality and, as I tentatively suggested in the last chapter, perhaps closer to health or even nutrition. As such, in one way it constitutes a banal necessity for the senses, which require stimulation.
Yet it is also able to be a sensual feast of great affect, but equally can become something of a siphon, where consumption can turn into habitual gluttony. In some ways it might be characterized as a health issue, pertaining to exercise and (over)indulgence of the senses. Similarly, the origins of perception are geared towards dynamics and the audiovisual is cut to this measure in its ‘healthiest’ form. This is most evident in the effects of stimulus starvation, where our ‘hardware’ imagines sound and image input in its absence. That this requirement for sound and image input is essential to humans is beyond doubt. However, such a ‘physiologically based’ approach has found little purchase in the study of our most prevalent sound and image forms. Although superficially similar, such an approach should not be confused with the so-called bodily turn in cultural studies, which sometimes is subsumed under the broader ‘affective turn’. It is perhaps more closely related to ‘media naturalness theory’, which approaches media through assumptions about the evolved form of human communication matching the development of the human form.6

I have learned from past book reviews that some reviewers never read as far as the conclusion or have made their minds up in advance about what the book is about. So, more caveats appear here so that they can be missed. As I am concerned with the audiovisual and perception, I have not been concerned with adopting a philosophy-derived or theorist-acolyte approach where quotations from the great and the good are taken as self-evident and fully present readings from the scriptures. Instead, my approach to neurological and neuropsychological insights is largely heuristic, trying to gauge how far these can provide a novel understanding of audiovisual culture. Some of my assumptions and conclusions might not be right or perhaps even valid, but the approach has allowed a different perspective on, or a different angle of approach to, audiovisual culture, and that is the important thing for me. And, actually, it has proven relatively easy to discover that there is much in the way of experimental research that points not only to the paramount unity of sound and vision but also to the importance of sound and the folly of assuming that film or any other audiovisual culture is ‘essentially a visual medium’.

My interests here are focused more directly on how audiovisual culture remains and has developed as a direct homology of human perceptual hardware and how an understanding of human perception and cognition is able to give a deeper understanding of films, television and video games, while reciprocally these dominant cultural objects are able to provide significant insight into human brain processes. They make something physical seem ‘in our heads’ and something in our heads ‘seem physical’.
While I have discussed at length how audio and visual signals are merged into a whole, there are, of course, points where there is ambiguity. Here, the channels may be parted, or a question may be posed in one of them. This is clear in the case of many slasher films and their withheld reverse shot, not showing the source of the point of view shot so as not to reveal the killer. In such cases, we might be given sound as a tantalizing clue. Here, signals are held apart mentally in the search for ‘cross-modal confirmation’, where one sense checks facts that may be absent or ambiguous in the other.7 So, it seems that cross-modal confirmation is intermittent rather than dominant, occurring at points where sound and image diverge rather than in the common situation of a clearly merged signal.


The aims of this book were to address the audiovisual precisely and to gain insights into contemporary audiovisual culture through the application of theory derived from neuropsychology and evolutionary psychology. It moves towards understanding the place of the physiological in audiovisual experience and thus accepting audiovisual culture as an essential part of a human being through its ‘doubling’ of human perception, sometimes in basic aesthetic terms and on other occasions being founded upon the ‘McGurkian’ merging of hearing and seeing in our physiological being. Audio and visual synchronization and unity hold together not only a sense of coherence in the audiovisual object but also a sense of coherence in us, the perceivers.

Notes
1. Friedrich Kittler, Gramophone Film Typewriter (Stanford, CA: Stanford University Press, 1999), p. xxxix.
2. Joseph D. Anderson, The Reality of Illusion: An Ecological Approach to Cognitive Film Theory (Carbondale, IL: Southern Illinois University Press, 1998), p. 52.
3. In recent years there has also been increasing sensitivity and awareness of disability as a spectrum rather than a simple ‘able’/‘disabled’ dichotomy.
4. Brenda Goodman, “Sensory Processing Disorder” at WebMD, 7 February 2021. https://www.webmd.com/children/sensory-processing-disorder [accessed 3/6/2022].
5. Paul Isaacs, “Visual Perception in Autism” at National Autistic Society, 9 June 2016. https://www.autism.org.uk/advice-and-guidance/professional-practice/visual-perception [accessed 8/4/2022].
6. Ned Kock, “Media Richness or Media Naturalness? The Evolution of Our Biological Communication Apparatus and Its Influence on Our Behavior Toward E-Communication Tools” in IEEE Transactions on Professional Communication, vol. 48, no. 2, July 2005, pp. 117–130.
7. Anderson, op.cit., 1998, pp. 87–88; Joseph Anderson, “Sound and Image Together: Cross-Modal Confirmation”, Wide Angle, vol. 15, no. 1, January 1993, pp. 30–43; Paul Taberham, Lessons in Perception: The Avant-Garde Filmmaker as Practical Psychologist (London: Berghahn, 2021), p. 171.