English Pages 336 [324] Year 2023
Aesthetics in Digital Photography
Series Editor Marie-Christine Maurel
Henri Maître
First published 2023 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
www.iste.co.uk
www.wiley.com
© ISTE Ltd 2023
The rights of Henri Maître to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s), contributor(s) or editor(s) and do not necessarily reflect the views of ISTE Group.
Library of Congress Control Number: 2022946611
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78630-753-8
Contents
Introduction: Image and Gaze
Chapter 1. The Legacy of Philosophers
  1.1. The objectivist approach
    1.1.1. The source: ancient Greece
    1.1.2. After Greece
    1.1.3. Kant and modern aesthetics
    1.1.4. Objectivism after Kant: from pseudo-subjectivism to aesthetic realism
  1.2. The subjectivist approach
    1.2.1. From classicism to romanticism
    1.2.2. The moderns
    1.2.3. The influence of neurobiology
  1.3. Subjectivism and objectivism: an ongoing debate
Chapter 2. Neurobiology or the Arbitrator of Consciousness
  2.1. fMRI protocols and neuroaesthetics
  2.2. The fMRI quest for “beauty processes” in the brain
    2.2.1. The role of the prefrontal cortex
    2.2.2. The role of the insular cortex
    2.2.3. The role of the visual areas
    2.2.4. The role of memory and cognition
    2.2.5. The role of embodiment
  2.3. Responses from functional electric encephalography
  2.4. A global cognitive scheme for aesthetic judgment?
    2.4.1. J. Petitot’s neurogeometric model
    2.4.2. A. Chatterjee’s aesthetic emotion model
    2.4.3. The model by Brown et al.
    2.4.4. Model proposed by H. Leder
    2.4.5. The model by C. Redies
    2.4.6. The emotions model developed by S. Koelsch et al.
    2.4.7. L.H. Hsu’s model of emotions based on A. Damásio
    2.4.8. Other models
  2.5. A critique of neuroaesthetic methods
    2.5.1. Criticism of neuroaesthetic methods
    2.5.2. Criticisms of the objectives of neuroaesthetics
Chapter 3. What Are the Criteria For a Beautiful Photo?
  3.1. Before we enter into the fray
    3.1.1. What reference books do we have?
    3.1.2. “Beauty of an image” or “quality of an image”?
    3.1.3. A glossary of aesthetic appraisal
    3.1.4. Measuring beauty
  3.2. Composition
    3.2.1. Complexity versus simplicity
    3.2.2. Unity
    3.2.3. A specific case in composition: landscapes
    3.2.4. Using oculometry to analyze composition
    3.2.5. Format or aspect ratio
    3.2.6. The rule of thirds (RoT)
    3.2.7. The center of the image
    3.2.8. Other rules for composition
  3.3. Histograms, spectral properties and textures
    3.3.1. Histograms and gray levels
    3.3.2. Focus, spectral density, fractals
    3.3.3. Textures
  3.4. Color
    3.4.1. About the concept of color
    3.4.2. Preferences related to isolated colors
    3.4.3. Preferences related to color palettes
  3.5. What behavioral psychosociology has to say
    3.5.1. Images of nature
    3.5.2. The aesthetics of faces
    3.5.3. The role of the signature, title and context
    3.5.4. Perception and memory: prototypicality
Chapter 4. Algorithmic Approaches to “Calculate” Beauty
  4.1. First steps: C. Henry
  4.2. G.D. Birkhoff’s mathematical approach
  4.3. Those who followed G.D. Birkhoff
    4.3.1. Beauty according to H.J. Eysenck
    4.3.2. The Post-War years: the designers, A. Moles and M. Bense
    4.3.3. A dynamic approach: P. Machado and A. Cardoso
    4.3.4. Work carried out by J. Rigau, M. Feixas and M. Sbert
  4.4. Algorithmic approach with AI: J. Schmidhuber
Chapter 5. The Holy Grail of the Digital World: Artificial Intelligence
  5.1. Which artificial intelligence?
    5.1.1. The principles
    5.1.2. Learning algorithms
  5.2. Why artificial intelligence in aesthetics?
  5.3. Expert opinions
  5.4. The database
    5.4.1. Generalist databases, used for aesthetic judgments
    5.4.2. Databases that are specialized for aesthetic photography
    5.4.3. Databases dedicated to artistic judgment
    5.4.4. Other image databases that are sometimes used
    5.4.5. Increasing databases
Chapter 6. Primitive-based Classification Methods
  6.1. Judging aesthetics
    6.1.1. Multimedia primitives: the ACQUINE system (Datta et al.)
    6.1.2. Edges and chromatic distance: Ke et al.
    6.1.3. Photography rules: Luo and Tang and Mavridaki and Mezaris
    6.1.4. High-level primitives: Dhar et al.
    6.1.5. Generic descriptors of vision: Marchesotti et al.
  6.2. Help in composing beautiful photos
    6.2.1. The library of aesthetic primitives developed by Su et al.
    6.2.2. The OSCAR system by Yao et al.
    6.2.3. Embedded systems: Lo et al. and Wang et al.
  6.3. Some specific research related to the evaluation of aesthetics using primitives
    6.3.1. Color harmony: Lu et al.
    6.3.2. Group photography: Wang et al.
    6.3.3. Social networks and crowdsourcing: Schifanella et al.
    6.3.4. Looking at comments: San Pedro et al.
Chapter 7. Deep Neural Network Systems
  7.1. DNNs dedicated to aesthetic evaluation
    7.1.1. High and low resolutions: the RAPID system, Lu et al.
    7.1.2. The multi-path DMA-Net architecture: Lu et al.
    7.1.3. Adapting to the size of the image: Mai et al.
    7.1.4. Finding beauty on the Web: Redi et al.
    7.1.5. Siamese and GAN networks: Kong et al. and Deng et al.
    7.1.6. Paying attention to the image construction: A-Lamp
  7.2. Variants around the basic DNN architecture
    7.2.1. Comparing photos between themselves: Schwarz et al.
    7.2.2. Making use of knowledge of the subject: Kao et al.
    7.2.3. BDN: halfway between classification and DNN
    7.2.4. Using the distribution of the evaluations
    7.2.5. Extracting a “dramatic” image from a panorama: the Creatism system
  7.3. Written appraisals: analyzing them and formulating new ones
    7.3.1. Photo critique captioning dataset (PCCD)
    7.3.2. Neural aesthetic image retriever (NAIR)
    7.3.3. Semantic processing by Ghosal et al.
    7.3.4. Aesthetic multi attribute network (AMAN)
  7.4. Measuring subjective beauty
    7.4.1. Recommendation systems
    7.4.2. Defining the user’s psychological profile
    7.4.3. Learning the user’s tastes through tests
    7.4.4. Multiplying concurrent expertise
Chapter 8. A Critical Analysis of Machine Learning Techniques
  8.1. The popularity of studies on aesthetics
  8.2. A summary of learning methods
    8.2.1. Which architecture? Which software?
    8.2.2. What performances?
  8.3. Questioning the hypotheses
  8.4. Specific features of beautiful images detected by a computer
    8.4.1. Some observations on the photos in the AVA database
    8.4.2. The scores in the AVA database
Conclusion
Appendix 1. A Brief Review of Aesthetics
Appendix 2. Aesthetics in China
Appendix 3. The Aesthetic of Persian Miniatures
Appendix 4. Aesthetics in Japan
References
Index
Introduction: Image and Gaze
The aesthetic emotion, which gives rise to the impression of beauty, is undoubtedly universal in humans.
Edgar Morin (2013, p. 13)
The proliferation of images in the modern world increasingly forces us to select photos from among a vast set. The task of choosing an image to illustrate an article, a book cover or a poster, or to represent an event or a voyage, used to be entrusted only to professionals: those working in publishing, communication or archives, professional photographers and collectors. Today, it is incumbent on each of us to decide what we wish to preserve and what to forget, what we wish to send, post on the Internet, delete forever or archive (in the improbable event of our ever revisiting it). Managing photo archives is a real bane for many of us¹.

¹ It is estimated that 7,400 billion photos have been archived to date, with 1,400 billion being taken in 2020 alone (Carrington, D. (2020). How many photos will be taken in 2020? Life in Focus: https://focus.mylio.com/tech-today/how-many-photos-will-be-takenin).

The step of selecting images has always been assumed to be delicate and of great importance. In the professional context, it is entrusted to those who are renowned as experts or in a position of authority.

Let us pause for a moment to examine the mechanisms that are activated when we make decisions about sorting and selecting images. Where does the beauty of the image fall in our hierarchy? It would seem that the primary mechanisms that are activated when selecting photographs can be divided into three broad categories:
– why the document is of interest, that is, its ability to draw and hold our attention by relating the document to contexts familiar to us;
– the surprise factor, that is, contrary to the previous point, its ability to give us a novel visual or cognitive experience by bringing in an unexpected contribution;
– beauty, that is, the pleasure it brings us, independent of its content, through the arrangement of its elements.

The last point is what we will be looking at, exclusively, in this book. In this framework, the first two points are the result of associative impressions, a distinction that G.T. Fechner was already using in the 19th century (Fechner 1871).

It often happens that the same image combines several of these registers: its attractiveness is heightened, but the contribution of each register of attention becomes less clear. However, with respect to our decision-making, the contributions of interest, surprise and beauty would seem to evolve² in independent or orthogonal spaces, in the sense of Gärdenfors (2000), that is, without any intimate influence on each other. They are therefore evaluated separately by our consciousness, and then probably combined into a single score that ultimately makes us prefer one image to another in a heuristic choice that is difficult to express, but which is likely to follow the empirical decision-making schemas proposed by Tversky and Kahneman (1981).

Before we move away from them, let us briefly study the first two domains. In the field of image processing, these are sometimes referred to by other names as well: interestingness³, memorability⁴, unusualness⁵ or popularity⁶; we will illustrate these with examples (Isola et al. 2011; Gygli et al. 2013; Amengual et al. 2015).

² Or at least that is what we assume, and this could be a risky hypothesis and a weak point in this text.
³ Interestingness: many authors also use this term to refer to aesthetic attraction (Gygli et al. 2013). It can also be used in a very specific sense in the case of social networks, where it measures any form of interest, independently of the causes that are brought into play (Dhar et al. 2011).
⁴ Memorability differentiates between images based on a criterion apart from their immediate impact on the observer. It reflects the capacity of an image to enter long-term memory, which is where memories are stored in our mind in a lasting manner. Although this is a universal quality, it is assessed at the individual level (Khosla et al. 2012; Kim et al. 2013).
⁵ Unusualness refers to the property of images that produce surprise: the images concerned stand out from the other images proposed, especially in the case of animated sequences. This property is specific to a given context and may vanish in a different context (Schuster et al. 2010).
⁶ Popularity is the term used to characterize images posted on social media. Popularity reflects an interest shared by a large number of people, based on criteria that may be very variable (Amengual et al. 2015).
Mechanisms related to interest

These mechanisms have been studied closely by experimental psychologists since the 1920s. The work produced by these psychologists is often brought together under the umbrella term relevance theory (Sperber and Wilson 2004). Interest or relevance, that is, a complex psycho-physiological experience, may be produced:
– by external stimuli (environmental stimuli transmitted by our senses) or internal stimuli (biochemical, created by our cortex). It is then a passive process;
– or through reasoning, that is, cognitive processes that produce new elements of knowledge from a particular context. It is then an active process.

W. James’ work (Lange and James 1922) pioneered studies on the relation between emotion and how a person appraises a situation. The importance of arousal and the order of the various steps involved (arousal, emotion, appraisal) have also been well documented. This research served as a guide for various studies conducted 50 years later in the field of aesthetics. However, it was Sperber and Wilson (2004), above all, who made it possible to construct a relevance theory, bringing together all the cognitive baggage of the recipient and not only their linguistic knowledge, as suggested by W. James. In this context, Dessalles (2008) suggested a more quantitative measure of relevance and endowed it with predictive capacities using an original mathematical model.

If we want to apply this relevance theory to the interest aroused by a collection of images, it is useful to define two extreme cases that may be approached in different ways:
– universal interest refers to themes that are often displayed in society and media and transmitted through context and culture: such-and-such an actor or sportsperson, a car, a monument or an event that is explicitly and regularly covered. The popularity of these themes can today be measured using mediametry tools (Hsieh et al. 2014; Fu et al. 2014): frequency of exposure on television, the number of instances on the Internet, “clicks” or “likes” on social media, etc. (Figure I.1(a));
– personal interest draws us to images whose themes are deeply connected to the viewer’s own life: “my” family, “my” city, “my” work, etc. The evaluation of relevance then takes the conventional forms proposed in Dessalles (2013), for example, which involve variables of space and time that decrease more or less rapidly, and degrees of relational proximity (e.g. in a family tree or a company flow chart) (Figure I.1(b)).
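The description of personal interest above, where relevance falls off with distance in space and time and with the degree of relational proximity, can be made concrete with a toy score. This is only an illustrative sketch: it is not the model of Dessalles (2013), whose formulation is not reproduced here. The function name `personal_relevance`, the exponential form and all scale parameters are assumptions chosen for the example.

```python
import math


def personal_relevance(distance_km, age_years, relation_degree,
                       space_scale_km=100.0, time_scale_years=10.0,
                       relation_scale=2.0):
    """Toy relevance score in (0, 1]; 1.0 means 'here, now, about me'.

    Hypothetical model: the score decays exponentially with
    - distance_km: how far from the viewer's home the photo was taken,
    - age_years: how long ago it was taken,
    - relation_degree: steps separating the subject from the viewer
      in a family tree or organization chart (0 = the viewer).
    The three scale parameters set how quickly each term decays.
    """
    if min(distance_km, age_years, relation_degree) < 0:
        raise ValueError("distances must be non-negative")
    penalty = (distance_km / space_scale_km
               + age_years / time_scale_years
               + relation_degree / relation_scale)
    return math.exp(-penalty)


# A recent photo of a sibling taken nearby versus an old photo of a
# distant cousin taken far away: the first should score higher.
sibling_nearby = personal_relevance(5, 1, 1)
cousin_far_away = personal_relevance(800, 20, 4)
print(sibling_nearby > cousin_far_away)
```

An additive penalty inside a single exponential keeps the three factors independent of one another; any other monotonically decreasing function would serve the illustration equally well.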
Figure I.1. An image’s relevance may be related to the general context of knowledge within a community ((a) actors, sportspersons, objects that frequently appear in newspapers or social media), or, on the contrary, to the personal context of the observer ((b) members of the same family, vacation sites, hobbies, etc.). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
Mechanisms associated with surprise

These mechanisms function a little differently from those mentioned above, as they become more effective when the document presented to us is more removed from the familiar (Figure I.2). Surprise or amazement (as this is a broader term, it may be more useful for us here as it covers a wider range of responses) can take various forms: humor, fear, perplexity, affection, disgust, etc.

It can be clearly seen that relevance, on the one hand, and surprise, on the other, are different from the aesthetic qualities that motivate us to remember a photograph. It is also quite likely that they operate together to lead to a choice⁷, as can be seen in the two photos in Figure I.3. However, in the following sections, we will do our best to set these two mechanisms aside. That is, we will strive to assess aesthetic qualities with relevance and surprise being equal. This will not be easy and it must be remembered that in many situations, the opinions shared during subjective tests are likely to have been somewhat confused on this point (Gygli et al. 2013). This is especially true when opinions are solicited from unknown and remote persons, as happens with evaluations on the Internet, for example.

⁷ It is especially interesting to see how retrospective collections put together regularly by professional journals, with titles like “The 100 most beautiful photos of the year”, largely combine photos with exceptional aesthetics and those that evoke exceptional events from sports, politics or world affairs, which have been selected largely for their memorability.
Figure I.2. Amazement is another mechanism that leads us to pay particular attention to an image. We then strive to clearly identify the specific elements in the image that fall beyond the known schemas of our representation of the universe. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
Pleasure versus arousal

The three motivations behind our attention when we study a photo, namely its beauty, our interest in the subject of the photo, and the surprise it may produce in us, are the result of a long, careful and well-reasoned observation. However, it is also possible to study the immediate effects on ourselves following a very brief viewing of a photo. The aim is then to identify the most elementary biological effects, which involve only the most basic physiological activation and not the elaborate functioning of cognition and reasoning. This is done by presenting a photo to an observer for a fraction of a second and measuring a few physiological indicators that reflect two independent emotional reflexes: pleasure and arousal⁸.

These experiments were first carried out by psychologists at the National Institute of Health (USA) and were then repeated by many other authors. The original team began by creating a database of a thousand reference photographs, the IAPS database (International Affective Picture System⁹) (Lang et al. 1999). They then conducted psychophysiological experiments that made it possible to place each image on a graph of pleasure as a function of arousal, revealing two clear orientations, which they named the direction of appetitive motivation and the direction of defensive motivation (Figure I.4).
Figure I.3. Both these photographs are indisputably remarkable for their aesthetic qualities. Photograph (a) (Portrait of Jean Cocteau by Irving Penn, 1948) holds our attention because we recognize a famous man (“universal interest” stimulus). Photograph (b) (Bruxelles by Henri Cartier-Bresson, 1932) holds us by making us wonder “what are they looking at?” (“surprise” stimulus)
A photo that was selected for its capacity to surprise may evoke pleasure or arousal. A photo selected for the interest it arouses in us is selected especially for pleasure. This is even more true of a photo distinguished by its beauty, and we will see that even today, this pleasure is a commonly accepted determinant of Beauty. However, literature from the 18th and 19th centuries would sometimes place the Sublime¹⁰ above the beautiful. The Sublime had a visceral component of fear (and thus, arousal). The word is not commonly used in this sense today and most often serves only as a superlative for beauty.

⁸ Arousal: this term, which we have already seen, refers to a person coming out of a state of indifference.
⁹ A competitor to the IAPS database, OASIS, was created with the same objective by Harvard University (Kurdi et al. 2017). It also offered the advantage of being open access.
¹⁰ The Sublime is the subject of Edmund Burke’s first philosophical text (in 1757): A philosophical enquiry into the origin of our ideas of the Sublime and Beautiful. The Sublime was soon at the heart of many literary debates across Europe: it brought in the lyricism of enthusiasm to counterbalance good measure and harmony. It was discussed at length by Immanuel Kant in 1790 in his Critique of Judgement (Kant 2015) where he writes, in particular: “The Beautiful prepares us to love disinterestedly something, even nature itself; the Sublime prepares us to esteem something highly even in opposition to our own (sensible) interest”. And he added, “The satisfaction in the Sublime of nature is then only negative (whilst that in the Beautiful is positive)”. Byron waxes lyrical on the Sublime during his travels in a storm in narrow Alpine valleys. A historical study of the sublime can be found in Saint Girons (2005). Our text has very few references to the sublime, apart from the historical overview of the concept of beauty, as in modern usage it is not separate from extreme beauty.

Figure I.4. The affective space defined by Lang et al.
COMMENT ON FIGURE I.4.– The photos from the IAPS database (Lang et al. 1999) are distributed along two axes: one is arousal, which has only positive valence, and the other is pleasure, which has both positive and negative valence. A few annotations isolate specific themes and make it possible to judge the gradation of the affects along the two (hypothetical) axes: appetitive motivation and defensive motivation (Schupp et al. 2004).

Art, Beauty and Aesthetics – how are they related?

Let us now turn to only those images that we judge to be beautiful and let us strive to set aside the more complex motivations that could lead to us being attracted by these images. This text aims to highlight objective elements that would allow us to explain why we attribute this adjective to a photo. However, in order to do this, we must first share a few definitions to give shape to a vocabulary that is often quite fluid.

Aesthetics¹¹ is the science whose aim is to research and determine the characteristics of beauty in works of nature or art (Académie Française, 1835).

Beauty is the quality of that which is beautiful [. . . ]. It is the characteristic of that which evokes admiration and emotion through its forms, proportions, rhythms and harmonies. Today, we often prefer definitions that place more of an emphasis on the origin of the stimulus as well as the form taken by its effects: “Beauty is the characteristic of a person or an object that makes them pleasurable or satisfying to perceive”. Or, as a poet expressed more lyrically, “Beauty is nothing but the promise of happiness”¹².

Art is the range of activities that humans carry out that lead to the production of artifacts, with the objective that they will be appreciated for their beauty or emotional power.

Bringing together these three terms, Aesthetics, Beauty and Art, it can be seen that they cover our field of study. And as each of them has produced abundant literature, it would be tempting to use all three. However, we must be careful with the third term, “Art”. While, for many centuries, the objective of Art tended largely toward the search for Beauty¹³, its evolution from the 19th century onward must be studied with some circumspection. In particular, we must consider the “fracture in the 20th century” as a definitive change in paradigm. It is quite clear that the second term in the above definition (“their emotional power”) has become the central question in art, superseding “Beauty”¹⁴. Looking back, it is easy to find the foundations for this preoccupation well before the Surrealists, as many eminently artistic works (Bosch, Chardin, Goya, etc.) do not claim to be primarily “beautiful” (Figure I.5).

¹¹ In reality, the word aesthetic has three different meanings. We will use here only the definition given above. The Greek origin expresses the general sense of perception, a meaning that is rarely used today. The more common meaning is the technical one that designates the choices made by an artist in producing their creation: Fritz Lang’s aesthetic.
¹² Henri Beyle, writing under the pen-name Stendhal, De l’Amour (Of Love), Chapter XVII.
¹³ Our observations here are restricted to Western art, born out of the Greek school of art. Art from other societies, especially those from the East, was born out of other aims and intentions. Appendices 2 to 4, therefore, offer short studies of the conception of Aesthetics in ancient China, in Persian miniatures, and in Japan.
¹⁴ We will speak here of the stated intentions of the movements that largely contributed to art in the 20th century: Dadaism, Surrealism, Situationism, CoBrA, Fluxus, Pop Art, Guerrilla Girls, Punk, Kitsch, etc.

Figure I.5. Art that is not very aesthetic. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip

COMMENT ON FIGURE I.5.– Many works of art have been designed without the “Beauty” criterion being highlighted. The artist seeks to focus on the “emotional load” of their message. These works “with a message” were very much in the minority in the classical period (they are represented here by J. Bosch, F. Goya and J.S. Chardin). However, they quickly became the dominant form in the 20th century (represented here by M. Duchamp, A. Warhol, E. Munch, F. Bacon) and it would be difficult to evaluate artwork from the 20th century based only on its aesthetic qualities.

Given this split in the very objectives of Art, we must be very careful when relying on texts in our quest for the Beautiful. And while this is especially true for paintings, it is also the case in photography, which also rightly aspires to be
classified as an Art15. We must be very prudent when following the recommendations of photographers, especially the most celebrated, when their advice does not clearly separate the intention of arriving at a greater aesthetic from that of producing a greater emotional impact. We will also see that a clear definition for what Art is in society (see, for example, Danto (1992); Danto and Goehr (2014)) is particularly useful in answering two frequently heard questions that spark off heated debates on contemporary art: – Is a perfect copy of the Mona Lisa as beautiful as the original16? – Is Warhol’s Campbell Soup Can beautiful? Is it more beautiful than the can found in your local grocery store? What makes it a work of art? Our work We must first explain why we are taking up this subject today. The opening lines of this book reminded us how abundant images are in society today. They described the complexity of the necessary task of sorting through the mass of photographs and the role played by aesthetics, along with other emotions, interest and surprise, in carrying out this winnowing. We must therefore take into account the progress that has been made in understanding how we judge the aesthetic quality of a photograph and the attempts made in using information processing tools to facilitate this task. Therefore, in 2018, for the first time ever, an international photography contest was judged in parallel by two juries: one composed of expert photographers and the other consisting of a computer running an artificial intelligence program17. 15 That photography can be classified as art is well-accepted among art lovers today. A survey by the Beaux Arts Magazine in October 2017 showed that when asked, “Using what form of expression are artists most successful in capturing beauty?” the photograph was ranked on top by 32% of French respondents, while the painting received the top vote only in 31% of responses, and the sculpture in 15%. 
We have come quite a long way from the observation made by Bourdieu (1965) in the middle of the 20th century, calling photography in France “a middle-brow art”, not only for its quality of production, but also in terms of the esteem it received.
16 This question was long debated by G.T. Fechner over the Dresden Madonna (see Vidal (2011)). Contrary to the position given in Danto (1992), for example, which presents a consensus that is widely held today, Fechner supported the close dependence between Schönheit (beauty) and Echtheit (authenticity), denying any beauty to any copy, however perfect.
17 This was “Spark a Renaissance”, a photography contest conducted by Huawei, built around the company’s P20 Pro cellphone model, which had an embedded algorithm for the aesthetic evaluation of a picture. The algorithm uses a neural network trained with over 4 million images. This is the algorithm that was in charge of judging the contest, awarding each image a score between 0 and 100. The photographs that got the best scores are available at https://consumer.huawei.com/it/campaign/sparkarenaissance/prizes/ (August 2020). There was very little information available comparing the judgments of the human jury and the algorithm. Later editions of the contest do not seem to have been judged by an algorithm.
It appears that these two aspects, understanding the mechanisms involved in humans, as well as the development of assistance tools, go together. At least, this was the deeply held conviction that motivated us to launch this study. However, even as this book was progressing (i.e. from September 2016 to 2020), there was a constant stream of results that seemed to contradict this idea: deep neural networks, even though in their nascent stage, ignore the need to understand the relation between the observer and the photo, but still lead to astonishingly good judgments of the aesthetic qualities of the photo. While they are still nowhere near replacing an expert (human) eye as of now, they are well ahead of the other tools developed earlier. This will be discussed at length later in Chapter 8. Thus, our subject continued to evolve even as we worked on it. We have fewer explanations than we had hoped for initially, but we can give more detailed reports of the procedure undertaken, report on the evolution in the approaches and compare performances and highlight advances and weaknesses. Our work can be divided into two significant parts. In the first part, we attempt to give an account of the work undertaken in different fields of knowledge to identify the mechanisms underlying aesthetic perception. We begin with the vast corpus of philosophical literature, which we have tried to map in a highly schematic manner in Chapter 1. We have tried to simplify the many texts that have attempted, over 25 centuries, to find certain universal rules. The guiding principle we have used is the contrast between the objectivist and subjectivist perception of Beauty. We then examine what biologists have to say (focusing especially on neurologists) about the mechanisms of aesthetic perception. The development of modern brain-imaging techniques has transformed what we knew of the mapping of areas involved in developing emotions and judgments. 
Work that emerged from the field of art, today grouped under the term “neuroaesthetics”, is then presented in Chapter 2. In the 150 years of its existence, photography has also developed a large corpus of recommendations and teachings, often born out of the practice of shooting a photo, but also often extended through experiments and scientific processes based on knowledge gained from experimental psychology. These results, which are often tracked and verified by image processing experts, are presented in great detail in Chapter 3. We will see in the second part how they are used today by algorithms.
In this second part, we examine the work that uses tools born from mathematics and computer science to measure beauty. In Chapter 4, we begin by presenting the (often quite old) techniques that propose a closed-form measure of beauty (like that proposed by G. Birkhoff), which does not involve a stage of learning. Although quite rare, this research is ongoing even today. The next three chapters are dedicated to current methods that have emerged from artificial intelligence. We start with a
broad study of the context that makes these processes possible, especially image databases available in the network, and then the expertise that was used to bring in the essential step of machine learning (Chapter 5). Chapter 6 details the first generation of research carried out using the detection of primitives that are assumed to express the aesthetic quality and their classification using machine learning techniques. This research came up in the late 1990s. The following chapter covers more current work, essentially carried out since 2014, using deep neural networks and doing away with any human expertise once an annotated database is set up (Chapter 7). The conclusion to these information science methods reviews their results (Chapter 8), highlighting current directions being explored and presenting future perspectives that seem to be revealed. It is tempting to conclude that the immense progress will be fruitful very soon, but it is just as easy to demonstrate the futility of the efforts undertaken given how naive and inherently limited these results are. We must of course provide a global conclusion to this book, by putting into perspective the contributions from all the chapters: philosophy, neurology, experimental and social psychology, image processing and artificial intelligence. Unfortunately, we could not succeed in bringing together the various sciences that have worked on the judgment of Beauty. We can, at most, reveal a few key problems that hinder our understanding, related to the deepest visual perception pathways by bringing in the chronology and causality of activations of various brain regions. A better understanding of these mechanisms could result in models that could then be translated into semiconductors. Finally, four appendices complete the book. The first complements Chapter 1 by tracing aesthetics from the ancient Greeks up to the Enlightenment, leading to the apogee of what is today called the classical period. 
This is done with a directive reading of philosophers who fit into the framework of our project of moving from objectivism to subjectivism. Then, over the next three appendices, we look at aesthetics that followed a different path from that of the West: we travel to China, Persia (in the broader, historical sense) and to Japan18. It seems evident that these aesthetics cannot be served by the approaches used by artificial intelligence as it is today, without entirely redesigning its foundations, which nobody seems interested in doing at present.
18 We chose these countries as there is a large amount of written material that discusses the aesthetic choices that governed their art. While other societies also constructed very coherent systems of “artistic creation” (Australia, Africa, the Americas, etc.), it is more difficult to find primary texts that specify the rules and means.
1 The Legacy of Philosophers
Everyone reasons about the beautiful: it is admired in the works of nature: it is demanded in the productions of the Arts: at each moment its quality is conferred or denied; however, if one were to ask men with the surest and most exquisite taste what is its origin, its nature, the precise notion of it, its true idea, its exact definition; whether it is something absolute or relative; whether there is an essential, eternal, unchanging beautiful that would be the rule and the model for a subaltern beautiful; or whether beauty is like fashions: one would immediately see divided opinions; some declare their ignorance, while others fall into skepticism. How is it that almost all men agree that there is a beautiful; that so many among them feel strongly where it lies, yet so few know what it is?
Denis DIDEROT (1751)
For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip.
The importance that philosophers have accorded the idea of Beauty, from Antiquity onwards, can be seen in the abundance of major writings arising from this concept. The “theoretical works” of Socrates, Plato and Aristotle, based on the “practical works” of Apelles, Phidias and Praxiteles, provided a general framework for reflection that traveled from Greece to the Roman world and, through Medieval times, finally arrived in the Renaissance. In the Renaissance, this framework was accorded a pre-eminent place in a particularly vibrant artistic and intellectual life. In the century of the Enlightenment, the world of ideas often carried the day during heated disputes (Boileau, Locke, Diderot, etc.) but without being really questioned by the world of the Arts (Tatarkiewicz 1970). Toward the end of the 18th century and
in the 19th century, no great philosopher was left out of the debate: Kant, Hegel, Schopenhauer, Nietzsche and so on. All of them produced treatises on this subject and these are essential books in our libraries today, while artists were carrying out a Copernican revolution in how they designed their art, emblematically represented by the Romantics in literature and the Impressionists in painting. In the field of Art, the 20th century dawned on an indescribable battlefield with confrontations between theoreticians, each fervently entrenched in their positions: Heidegger, Adorno, Wittgenstein, Deleuze, Derrida, Foucault, Eco, Danto, etc.1, while the world of artwork was given over to a rare anarchy2. Of course, every contribution cannot be faithfully reported here. We will present an extremely simplified version (which is therefore highly open to criticism), with the aim of contextualizing the central question of our study and placing certain terms in perspective. Thus, for our purposes, we will highlight two, antithetical points of view of aesthetic thought: the “objectivist” approach, which attributes the merit of beauty only to the object that is beautiful, and the “subjectivist” approach which, on the contrary, locates beauty in the “eye of the beholder”3. We will show how objectivism has given way to subjectivism over the last few centuries, and then show how both of these approaches have resulted, for better or for worse, in a hybridized approach. Above all, we will demonstrate the implications that both these theories have on our project of using computerized processes to determine the beauty of a photograph, and we will look at why these approaches are inevitably present in the expectations from software developed today. 
We are inclined toward placing a greater emphasis on the objectivist approach, as it appears that all processes that seek to replace human judgment by an automated decision-making process start from an objectivist point of view, even if some of them have been led to adopt a subjectivist viewpoint as they developed.
1 Interesting works to consult: Gombrich (1960); Danto (1964); Zemach (1987); Ferry (1990); Kemp (1990); Solso (1996).
2 We must cite here the “end of art”, heralded by Danto in 1984 (Danto and Goehr 2014), and proclaimed much earlier by Hegel, Spencer, Heidegger and many others.
3 Other classifications may be possible and some of these would shed additional light on our subject: “monism/dualism”, “empiricism/rationalism”, “physicalism/spiritualism”, etc. These grand conceptualizations of the nature of the spirit and the body are often key to understanding the philosophy of aesthetics in contemporary discourses.
4 Diderot refers here to the mathematician and philosopher, Christian Wolff (1679–1754).
First of all, let us introduce the question of classifying approaches into two opposing families. To do this, we must again quote Diderot, who clearly explained the terms in the Encyclopédie: “It is evident that Saint Augustine had gone much further in his research into the beautiful than the Leibnizian philosopher4: the latter seems to claim first that something is beautiful because it pleases us; when it only pleases us because it is beautiful; as Plato and Saint Augustine have noted quite
well”. Does Beauty give pleasure? Or does pleasure create Beauty? Diderot has made his choice, referring to works by Ancient philosophers. But there are indeed two ways of studying the problem, as we will see below. For example, Luc Ferry has penned the exact counterpoint to Diderot: “it is no longer because the object is intrinsically beautiful that it pleases but, rather, we can go so far as to say that it is because it provides a certain type of pleasure that we call it beautiful” (Ferry 1990, p. 25).
Equipped with both these philosophical frameworks, we will turn to more recent results from neurobiology to see whether they make it possible to choose between these approaches or to validate them through brain imaging experiments. We will also examine how compatible they are with results from studies carried out for over a century in experimental psychology, psychosociology and sociology. We hope to then have a collection of work that will enable us to interpret the processes adopted by automated approaches to determine the beauty of photographs.
1.1. The objectivist approach
1.1.1. The source: ancient Greece
The objectivist approach5 was formalized in the Ancient Greece of Plato6 and Socrates7, and Aristotle8 also adopted it, humanizing it a little. This was the approach that was bequeathed to the West through the Romans and re-adopted by Augustine of Hippo9, who adapted it to Christianity. It then blossomed in the
5 Objectivism is a philosophical school of thought that is founded on three premises: existence exists (i.e. there is a world outside us); consciousness exists (i.e. the results of our cerebral activity are real); consciousness can access existence (through the intervention of perception). Objectivism shares these premises with realism, but objectivism allows itself recourse to transcendence.
The term was reclaimed from 1950 onwards by Ayn Rand, to cover a radical form of objectivism applied to politics and society, which aimed to establish libertarian governments and preached rational self-interest. This sense of objectivism became highly popular in the United States after the publication of the highly successful Atlas Shrugged in 1957. We do not speak here of objectivism as Ayn Rand used it, but of aesthetic objectivism. We reject two expectations that she added: “humans’ ultimate goal is their well-being” and “the best political-economic system is liberal capitalism”.
6 Plato (428–348 BCE) wrote two Theories of Beauty, one in Hippias Major and the other in Phaedrus (Plato 2011).
7 Socrates left behind no text and is only known through what other authors wrote about him: Plato, Aristophanes, Xenophon, Aristotle, Plotinus.
8 Aristotle expressed his views on aesthetics in his Poetics, written around 335 BCE, which focused chiefly on tragedy and the epic (Aristotle 1996).
9 Augustine of Hippo (Saint Augustine) chiefly speaks about aesthetics in his work The Origin of Good, written around 400 CE, but this theme was also explored in many of his other texts.
Renaissance, flourished in the Classical period and remained dominant until the middle of the 19th century in the world of art, if not in philosophy.
The objectivist approach declares that Beauty is a quality of the person, object or music concerned. This Beauty is the reflection, in the observed object, of the universal properties of harmony in proportion and shape, properties that we do not completely understand, but which exist prior to our observation and influence our judgment.10 For this reason, Beauty is perceived equally by any observer11 at any time and any place. Thus, according to the most reputed art historians and experts, the Venus de Milo, the Apollo Belvedere, and Michelangelo’s David, among others, are endowed with these eternal qualities that demand recognition from humanity.
The rules established in Athens in the 5th century BCE, which were adopted into the field of perception, reside in the harmony of form12 and proportion, especially the consonance of the whole and its parts13. The simplest proportions are the most pleasing (1/2, 1/3, 3/4, etc.), but the eye appreciates subtler fractions in more complex constructions, as in architecture. The echo of proportions on various scales and in different parts of the object is important and contributes to the “symmetry”14. All Greek art is designed
10 In an extreme version, beauty is even sometimes explained as the object’s active desire to manifest itself to the observer. Schopenhauer proposes this idea: “It is interesting to see with what insistence the plant world, in particular, invites and almost forces us to contemplate it” (Schopenhauer 1966, p. 259). At this point, Schopenhauer notes that this same idea is found in the writing of Augustine of Hippo (De Civitate Dei contra paganos, XI, p. 27): “Plants offer their diverse forms, which embellish our visible world, to the perception of our senses; [...] in some way they seem to wish to be recognized”.
11 According to some authors, this universality of the perception of beauty by any observer must be qualified: in the Greek democracy, not all men were equal and philosophical discourse was addressed to only a few of them, the καλὸς κἀγαθός (kalos kagathos), “beautiful and good” men, who were worthy of being citizens of Athens; in the Renaissance, texts were addressed to the small number of literate people; and the Enlightenment focused on the Honest Man. The language of art is accessible to any observer, provided they are “sensitive enough” (Kant) to receive the message of the philosopher. Batteux expresses this as follows: “All men are almost united when it comes to the heart... If one man with exquisite taste is attentive to the impression that a Work of Art makes upon him, if he senses it distinctly and if he consequently utters a judgement, it is hardly possible that other men will not accept his judgement” (Batteux 1747).
12 “. . . the chief forms of beauty are order and symmetry and definiteness. . . ” (Aristotle, Metaphysics M3).
13 “Beauty is the regular harmony between components so that we cannot add or subtract anything to it or change anything in it without diminishing its pleasantness”, wrote L.B. Alberti in On the Art of Building in 1485 (Alberti 1992).
14 Symmetry, in the Greek sense (συμμετρία, summetria), designates the opposite of asymmetry (disproportion) and thus expresses the immanence of a common measurement and not simply an identical reproduction of a motif, as it does today. The architect Claude Perrault, who used proportion instead of symmetry in his 1673 treatise, said: “Although the word symmetry has become French, I will not use it here because symmetry, in French, does not mean, at all, what
using a highly visible mathematical framework, which represents the omnipresence of numbers in Athenian cosmogony. However, while absolute beauty is the business of the gods, the rules of harmony are essentially inherited from nature, especially the proportions of the human body. Nature, a reflection of divine constructions, is the primordial guide of aesthetics. All authors preach the necessary simplicity of works, which must be comprehensible at first glance.
On the other hand, in the field of colors, many ancient rules have been lost and most texts that could have testified to the tastes of the Ancient Greeks have disappeared. We know very little about chromatic harmonies that seem to use a range from black to white, harmonically divided by red and yellow.
Greek rules were stated through “canons”, which were applied quite rigidly to shapes, lines and harmonies. The canons limited artistic creation, and innovation was not considered a virtue in artwork. Further, in Athens, the artist was an artisan before being an artist, and was not really a creator, except in the noblest art of the Tragedy or unless he attained perfection, as did Praxiteles, Polyclitus or Zeuxis.
The objectivist hypothesis fully justifies the attempt to determine the aesthetic qualities of an object without using an observer, since these qualities are immanent and are expressed with or without an observer. This is further legitimized by the fact that the rules of aesthetics are universal and largely independent of the form of the artifact: painting, music, architecture, sculpture. Indeed, this was how Greek thinkers of the Hellenistic period had conceived of these15.
1.1.2. After Greece
Rome adopted the Hellenistic aesthetic, and while this then became more flexible, naturalized and immersed in society, the whole of the Mediterranean Basin was given over to the rules of Antiquity. Rome insisted on Greek statuary and architecture serving as ideals and the canon as rules.
Symmetria means in Greek and in Latin, nor what Vitruvius meant by symmetry, which is the relationship between the dimensions of the whole with each of its parts whose dimensions are different” (Dézarnaud-Dandine and Sevin 2007). Further, beyond these geometric proportions, symmetry “is the trace, the imprint that the Demiurge left behind during the creation of the world” (Dézarnaud-Dandine and Sevin 2007, p. 8).
15 The reader is invited to refer to Appendix 1 for a detailed analysis of how reflections on aesthetics have evolved in the West over the past 25 centuries.
16 “For without symmetry and proportion no temple can have a regular plan; that is, it must have an exact proportion worked out after the fashion of the members of a finely shaped human body” (as translated in 1545 by Jean Martin).
Vitruvius’ text On Architecture16, translated
from the late 1st century BCE, transmitted these teachings to us, the original Greek sources having disappeared.
The Roman Empire, however, was large, and between the 4th and 5th centuries CE, it split into two. The Eastern influences from Byzantium led it away from the Athenian aesthetic toward arts that were more fixed both in their inspiration and their forms. The human gave way to the spiritual. The canons were fixed with abstract conventions, governing the style, materials, forms, expressions and colors. Nature retreated and ornaments appeared, symbols with codes reflecting mysticism and religion. This branch of aesthetics split away from the Athenian line and never returned toward this significant Western school of art.
On the other hand, the Western Roman Empire seized upon the Athenian legacy. In North Africa, Augustine of Hippo, brought up on the Hellenistic aesthetic, declared that the Beauty of Antiquity was objective and universal and could express the divine ideal: simplicity, harmony, a respect for proportions. The beauty of Olympus was bestowed upon the Temple in Jerusalem and the Church in Rome. The universality of Beauty stems from divine transcendence, a reflection of the glory of one God, as it once reflected the glory of the pantheon. Christianity, shaped by these teachings, would carry Plato and Aristotle’s precepts across Medieval times, through the copies made by monks and the workshops of artists. In parallel, the same precepts were taught along the Tigris and Euphrates and in the distant Samanid empire, where poetry, cosmology and algebra were discussed and debated. Beyond the Byzantine Empire, this ancient tradition was alive and flourishing and would be transmitted, in turn, to Bologna, Montpellier, Coimbra and Paris at the start of the new millennium.
Around the year 1000, at the turn of the millennium, while aesthetic rules had not evolved very much, Athenian philosophy was reconsidered, especially with regard to the relationship between vision and perception. English and Scottish thinkers began the empirical revolution, which preached the end of the innate and the pre-eminence of cognition in consciousness and emotion. Beauty, like Goodness, was not bestowed upon one at birth, but could be acquired through experience and practice, by following an example. There were still a few centuries to go before the reflections of Ockham, Locke, Hutcheson and then Hume would introduce sceptical relativism (we can refer, among others, to Ferry (1990) and Tatarkiewicz (1970) for the impact these thinkers had on the concept of aesthetics).
17 Published in 1637, Dioptrics was a treatise on light and explained how the eye sees (Descartes 1991); however, it is the mind that judges Beauty, from indications that the eye sends to it. This Beauty is always universal, timeless and immutable.
At the same time, across the Channel, Descartes17 and Spinoza would state the dominance of reason, the separation between Mind and Body which allows the development of the individual, which was the only reference for the mind: “cogito
ergo sum”. We now enter a complex period in aesthetics where the foundations of objectivism were openly undermined, but the walls held strong and a scaffolding of arguments was put up to support the conclusions that society wanted to hear: beauty is universal, timeless and can be immediately perceived by all. For some people, this was a part of God, placed in us so we could judge Beauty and Justice equally, alongside our free will. For some others, this was a pleasing goal of Nature, which ensured harmony in judgment. Others still saw this as the result of an as-yet-unknown biological organ that “perceived” beauty as the eye perceives light.
This was a period where scholarly texts were entirely given over to aesthetics18: Jean-Baptiste Du Bos, Charles Batteux, Dominique Bouhours, Jean-Pierre de Crousaz, William Hogarth, etc. This was also the time when debates between luminaries arose again in society: the Quarrel of the Ancients and the Moderns was a debate between ideas that defended Classicism and ideas that fought for “an aesthetic of sensibility”, and all works of art turned into a battlefield between these two currents: poetry, music, painting, decoration and architecture.
Faced with the rise of the individual and subjectivity, the objectivist position became weaker and weaker: Johann Joachim Winckelmann was one of the last ramparts to stand firm19, while in the field of philosophy the arguments constructed by Immanuel Kant would form the modern foundations of aesthetics, the way in which the field is still largely discussed today (Schaper 1964; Aquila 1970; Crowther 1976; Hopkins 2001). Kant is held to be the primary reference on subjectivism20; however, his vision of Beauty is profoundly influenced by objectivist precepts. In the following paragraphs, we will explain why we consider Kant to be the last of the objectivists and the first of the subjectivists.
1.1.3. Kant and modern aesthetics
The complexity of the universal thinking on aesthetics is seen in Kant’s oeuvre, especially in his Critique of Judgement (Kant 2015), which came out shortly after his Critique of Pure Reason and uses many results from the latter. For Kant, Beauty is the product of the “judgment of taste” and thus arises from judgment and not from perception. Today we would say that it is a cognitive act, and not a perception (“red”, for example, is purely a product of perception). However, Kant also differentiates between several kinds of cognitive judgments.
18 Appendix 1 gives more details on this period.
19 Johann Joachim Winckelmann was a famous 18th century neoclassical theoretician, a passionate defender of Platonic aesthetics in his 1755 work Thoughts on the Imitation of Greek Works in Painting and Sculpture, and the 1764 text History of Ancient Art. He was one of the final defenders of objectivism, especially in his 1763 work: Essay on the Beautiful in Art.
20 For example: “And in the Critique of Judgement, which some however believe to be the apogee of modern subjectivism, [...]” (Ferry 1990, p. 21).
“Beauty”, according
to Kant, is derived from perception, like “square” would be, for instance, and both are products of the mind. Nonetheless, “square” is not a judgment of taste; it is a reasoned result of the application of “concepts”, while a judgment of taste does not require any concept that would “explain” the judgment. It is simply a product of the pleasure experienced during the observation. “Beautiful”, “Good” and “Fair” all share this property of being products of judgment of taste.
On the other hand, “red” cannot be disputed. If 20 people tell me that this object is red, but I only see brown, then I would have to question my visual system. If 20 people tell me that a silhouette seen at a distance is a dog, while I think it is a wolf, we would be able to talk about the cues and reasoning (“concepts”) that led me to the decision and I would probably agree with their opinion or they might change theirs. However, if I judge a painting to be ugly while 20 people find it beautiful, Kant believes that other people’s opinions would not change mine, since, “that a thing has pleased others could never serve as the basis of an aesthetical judgement” (Kant 2015). It can, at most, make me a little less confident of my opinion21.
Is aesthetic judgment objective or subjective according to Kant? Even Kant (2015) finds this a tricky question. He writes: “The green color of the meadows belongs to objective sensation, as a perception of an object of sense; the pleasantness of this belongs to subjective sensation”. However, he specifies that, “[...] the satisfaction presupposes not the mere judgement about it, but the relation of its existence to my state, so far as this is affected by such an Object. Hence we do not merely say of the pleasant, it pleases; but, it gratifies. I give to it no mere approval, but inclination is aroused by it...”. He supplements this, a little further, by saying, “the judgement of taste is merely contemplative; i.e. it is a judgement which, indifferent as regards the being of an object, compares its character with the feeling of pleasure and pain”. The active role played by the object is clear here.
Kant also borrows other arguments from the objectivists: a work of art is universally beautiful and, as we saw earlier, the sensation of beauty can neither be explained nor demonstrated. He also says, “[he] will therefore speak of the beautiful, as if beauty were a characteristic of the object”. In Kant’s view, the judgment of taste is not influenced by any personal interest22 or emotion; it precedes the pleasure from the observation of the object, which cannot then be the determining principle of this sensation.
21 For more details, refer to Hopkins (2000).
22 This idea of free art was championed by Théophile Gautier, who defended it with the phrase, “Art for art’s sake” in his novel Mademoiselle de Maupin in 1834. This idea was spread (and shared) by Oscar Wilde, William Blake and Edgar Allan Poe. Then the 20th century marshaled many authors against this notion, authors such as W. Benjamin (Lamarque 2010), or artists such as W. Kandinsky (1954). From this time onwards, “Art for art’s sake” would begin to sound like a denouncement.
Kant (2015) identifies two different types of beauty: “free” beauty, which is unconditional, and “dependent” beauty, which accompanies a particular function (a beautiful garment, a beautiful shield, a building). Art is concerned with free beauty, while in the case of dependent beauty, the judgment of taste is denounced as being impure23, that is, it is clouded by motives that do not arise from aesthetic criteria.
It is quite clear that Kant’s thinking occupies a special place, straddling both objectivism and subjectivism. It holds this balanced position as there is a convenient deus ex machina brought in, which allows all men to share a common sensibility: a “common sense”, a judgment shared by all, not on rational foundations, like Descartes’ common sense, nor based on the criteria of the majority opinion, as the Stoics saw it, but through spontaneous, universal and transcendental evidence that makes it immediately perceptible to all. He says, “the necessity of the universal agreement that is thought in a judgement of taste is a subjective necessity, which is represented as objective under the presupposition of a common sense” (ibid., paragraph 22).
1.1.4. Objectivism after Kant: from pseudo-subjectivism to aesthetic realism
Following Kant’s monumental work and with the gradual emergence of phenomenology24 as a framework integrating the concepts of perception and thinking, the majority of philosophical reflections accepted a subjectivist framework. Nonetheless, many thinkers still accept certain concepts inherited from objectivism. Like Kant, Georg Wilhelm Friedrich Hegel took up a position at the intersection of these two systems. However, he considered that aesthetics did not arise from the sensible world, but from the rational world, which was a kind of return to Cartesian thinking.
He too believed that Beauty existed, but that it was the result of our reflection, produced by the direct and objective influence of the object.25 Hegel also shared the 23 This distinction has come under much heavy criticism (Lorand 1989). 24 Phenomenology is a critical reading of metaphysics that tries to associate perception and representation of the world by returning to the concrete and to intuition. In this school of thought, there is no longer a separation between the “true” nature of an object and our knowledge of the object. An object is made up of a set of representations that we possess, which are perceived or developed by our consciousness. These are phenomena. Pioneered by Lambert and Kant in the 18th century, modern phenomenology was formalized by Husserl and has many variants thanks to Fichte, Schopenhauer, Ricoeur, Levinas, Sartre and Merleau-Ponty. 25 “...the idea alone is true. [...] And truth not at all in the subjective sense that there is an accordance between some existent and my ideas, but in the objective meaning that the ego or an external object, an action, an event, a situation in its reality is itself a realization of the Concept”. (Hegel 1997).
10
Aesthetics in Digital Photography
Platonic idea that Beauty is universal and atemporal26. However, as we will see later in the text that, because he spoke of his own experience, Hegel largely used very personal and highly subjective elements that left little room for the more objective Platonic arguments. Arthur Schopenhauer also took on Kant’s legacy and blended it with sincere Platonic thought and empiricism. He stated that our knowledge is the conjunction of two processes with different bases: that which he called, “intuition”, an immediate and spontaneous access to the deep nature of the world around us, and that which he called “Will”, which arose from the rational principle. The rational principle is the chief tool of our cognitive construction. However, in the domain of aesthetics, it gives way to intuition, which is the only way to access the true nature and the intention of the artist, if the artist has proven their genius. While Will and the rational principle are subjective manifestations of our self, intuition is naturally objective, in accordance with Plato’s teachings27. With the growing role of our consciousness in decision-making by humans, objectivist interpretations of beauty became rarer. Indeed, at the end of the 19th century, it was the scientific and positivist community that defended the objectivist approach: beauty exists and it is the “human machine” that processes it through a specific adaptation of the mental circuits. This singular path was particularly defended by Charles Henry28 (Henry 1885) and by Gabriel Séailles29 (Séailles 1883). However, this original path remained a minority point of view in its century. It was not until the 20th century that a significant current in aesthetics once again shone the spotlight on these theories, in contrast to the majoritarian subjectivist approach, to corral off an objectivity that was unique to the aesthetic phenomenon, thus emphasizing on the realism of the visible. 
26 We can read, for example, that “Beauty has been tasted by all nations, in all ages” in Aesthetics, written between 1818 and 1829. 27 “In all these reflections it has been my object to bring out clearly the nature and the scope of the subjective element in æsthetic pleasure; the deliverance of knowledge from the service of the will, the forgetting of self as an individual, and the raising of the consciousness to the pure will-less, timeless, subject of knowledge, independent of all relations. With this subjective side of æsthetic contemplation, there must always appear as its necessary correlative the objective side, the intuitive comprehension of the Platonic Idea” (Schopenhauer 1966, p. 258). 28 We will return to Charles Henry later as he is quite probably the first to have imagined that beauty could be detected by “exact” science (see section 4.1). 29 “The formal beauty of musical harmony and the harmony in geometric figures come from the fact that the logical and internal proportions of the ideal content find a tangible representation. The heart of it as such, does not appear to the conscious mind, nonetheless, its richness attracts the waiting ear, the fixed gaze, and fills the mind with an unconscious arithmetic” (Séailles 1877).
However, it was the subjectivist Clive Bell (Bell 1914) who was responsible for the concept of “significant form” to explain how certain conjunctions of lines and colors lead us objectively to a feeling of Beauty. These significant forms are made up of physical elements captured by our visual systems and also of subtler components that would later be described in varied terms: “aesthetic predicates” (Zemach) or “perceptive expressivity” (Arnheim). These are the reason we find a scene to be joyful, poetic or troubling. These components, which we cannot formally transform into optical signals, directly address our sensibility. Well-established in the field of musical aesthetics (a minor chord is “obviously” sad, isn’t it?30), this idea that “invisible” configurations lead objectively to feelings in the observer is also defended in the field of images, although it is not so easily accepted here.
The emergence of psychological sensations following an experience of perception brought together various philosophical theories of the mind, especially around the concept of “supervenience”, proposed by D. Davidson31. The supervenience of aesthetic properties from the physical properties of the object engages both the object and the observer in the aesthetic appreciation. As Pouivet said: “If we state that the aesthetic appearance of an object is a supervening function of its base properties (structural, i.e., perceptible and thus both physical, with respect to the object, and physiological, with respect to the person perceiving it), it is clear that we must reject the essentialist theories that state that the aesthetic character of an object is, sui generis, a property of this object” (Pouivet 1996).
It was the “realist” school that gave these “significant forms” a key role to play in the construction of an aesthetic appraisal. Epistemological realists agreed on the existence of an external reality, independent of thought, on which our knowledge is based. All through the 20th century, aesthetic realism strove to show that aesthetic expressions are communicated to our consciousness through our channels of perception and bring in aesthetic information. This school of thought held that these aesthetic expressions, resulting from the object, are objective and that the aesthetic judgment that is then formed is subjective.
30 We can find writing on this online: “Why is the minor chord sad? The answer to this question is quite simple: minor intervals are smaller, ‘tighter’ than major intervals... this gives the impression of melancholy, introversion, mystery” (Available at: www.musiclodge.fr/article-la-tristesse-du-mineur-37419635.html). This explanation should reassure those who are convinced of this.
31 Donald Davidson proposed supervenience in 1963 to connect thought and action. In aesthetics, the supervenience of aesthetic predicates on non-aesthetic predicates is expressed by a univocal correlation as follows: if two objects have different aesthetic predicates, they necessarily have different non-aesthetic predicates; however, the converse is not true (see Bréhin 2007).
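The one-way correlation stated in footnote 31 can be written compactly. The following is only a sketch in notation introduced here for illustration: A(x) and N(x) are assumed symbols for the sets of aesthetic and non-aesthetic predicates of an object x, and they do not appear in the sources cited.

```latex
% Assumed notation (not from the cited sources):
%   A(x) = set of aesthetic predicates of object x
%   N(x) = set of non-aesthetic predicates of object x

% Supervenience: an aesthetic difference requires a non-aesthetic difference.
\[
\forall x, y:\quad A(x) \neq A(y) \;\Longrightarrow\; N(x) \neq N(y)
\]

% Equivalent contrapositive: same base properties, same aesthetic properties.
\[
\forall x, y:\quad N(x) = N(y) \;\Longrightarrow\; A(x) = A(y)
\]

% The converse does not hold in general: two objects may differ in their
% non-aesthetic predicates and still share their aesthetic predicates.
\[
N(x) \neq N(y) \;\not\Longrightarrow\; A(x) \neq A(y)
\]
```

Read this way, supervenience fixes the aesthetic predicates once the base properties are fixed, but leaves open that very different objects may be aesthetically alike.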
Frank Sibley32 (Sibley 1959) and Eddy Zemach33 (Zemach 1991, 2005) declared that aesthetic predicates are real, observable and dependent on non-aesthetic correlates34. Zemach deduced from this that beauty is a real, objective property of certain objects. He argued that dissenting aesthetic appraisals are due to differences in the observation conditions. These observation conditions are, of course, those related to the material environment (light, space, ambiance), and also those that include the epistemic conditions attached to the observer (their mood, availability, knowledge about the object, etc.). He states that if we have what he calls standard observation conditions (SOC), we will always end up having similar judgments about the beauty of an object; Zemach thus reiterates the invariance in aesthetic judgment. He recognizes that it is difficult to guarantee these SOC, which explains differing judgments. In case of conflicting views, he suggests turning to aesthetic experts whose vocation ensures SOC.
Jerrold Levinson35 agrees with Zemach’s aesthetic realism. However, he has a different interpretation of dissenting judgments, which he explains through the psychological diversity among observers, which induces diversity in their sensibilities. He is, thus, resolutely in the subjectivist camp (Hsu 2009).
Roger Pouivet36 sides with the position taken by Zemach, anti-relativist and non-subjectivist, but still adopting a more moderate realism and questioning the possibility of identifying SOC in certain cases. He supports his arguments by examining the expression of contemporary language, especially the formulation, “A is beautiful”37, thus demonstrating that our linguistic heritage is not compatible with a subjectivist justification of aesthetic evaluation38.
32 The philosopher Frank Sibley is an important contributor to modern aesthetics. In 2001, his work was brought together in the book Approach to Aesthetics.
33 Eddy Zemach has published many texts on aesthetics, notably Analytic Aesthetics in 1970, Aesthetics in 1976 and Real Beauty in 1997.
34 Thus, F. Sibley wrote: “Aesthetic words apply ultimately because of, and aesthetic qualities ultimately depend upon, the presence of features which, like curving or angular lines, color contrasts, placing of masses, or speed of movement, are visible, audible, or otherwise discernible without any exercise of taste or sensibility” (Sibley 1959).
35 Jerrold Levinson has published his work in The Pleasures of Aesthetics, in 1996.
36 Roger Pouivet is a philosopher in the realist stream, who has dedicated his work to aesthetics and metaphysics. He has shared his thoughts in Pouivet (2006).
37 Pouivet uses the following reasoning. The usual expression when one comes across a beautiful object “A”, is, “A is beautiful”, naturally attributing the feeling that we experience to A and not to ourselves, as would be the case if we had a truly subjectivist feeling of aesthetics. We would have said, “I feel beauty”. On the other hand, when we experience pain upon banging into furniture, we say, “I am in pain”, not, “the furniture is painful” (Pouivet 2006).
38 Gérard Genette’s response to this argument is: “We give ourselves over spontaneously to the ‘objectivization’ of our feelings. This is a natural trend in empirical psychology that leads us to attribute to an object (as an objective property) the value that follows from the feeling that we experience with respect to the object” (Pouivet 2006, p. 153).
It can thus be seen that objectivism brings together philosophical attitudes that are quite varied: the most extreme position makes beauty dependent on a transcendent principle outside the object itself (God, the Pantheon or Nature); in this case, the object is simply the receptacle of a shared beauty that is impressed upon the observer, if they are an initiate. Later, Beauty was attributed to the object, to its form and its proportions. Then the role of the observer was gradually recognized. The observer is also the witness to Beauty, which is only recognized in the observer’s pleasure. The observer is “touched” by the beautiful and testifies to its existence. But through which paths? Descartes and Hegel justified this through reasoning, Kant experienced it through a primary intuition, pre-dating any concept but distinct from primary sensory perceptions, an interpretation that would often be adopted by later, phenomenological approaches (Bergson, Merleau-Ponty). Finally, in other very modern processes, the reality of the beautiful is subtly woven into the perceived signal and objectively affects our sensibility when the conditions of its observation come together.
For objectivists, the beautiful exists, it is a matter of near-consensus, it persists in time and we can attest to it.
1.2. The subjectivist approach
The subjectivist approach, which relates any judgment of value to an individual act of consciousness by the observer, does not necessarily strip the object of its aesthetic responsibility, as we have shown with the example of Kant. However, by defining beauty by the pleasure experienced by the observer, the various “sciences of the mind” can be applied in very varied ways, with very different consequences on the interpretation. The most extreme of these could almost exclude the object under consideration from playing any role in the aesthetic decision. Let us look closely at this dilemma.
In the Kantian interpretation, the observer provides the “beauty sensor”, returning a judgment that is subjective, since no one can replace the observer in their decision (Kant believes that we cannot convince someone of beauty). G. Bachelard expresses this clearly: “When we have intimate experiences, we fatally contradict objective experience” (Bachelard 1949). But what the Kantian observer carries out is not really personal and intimate, since neither memory, nor reflection, nor reasoning is involved in this decision.
By contrast, let us revisit the reasons that led G.W.F. Hegel to attribute the supreme aesthetic qualities to Correggio’s Mary Magdalene39. It is clear that such a judgment greatly involves the observer’s culture, sensibility, imagination and memory. Stop the next person you meet in the street and they will be unable to echo any of the reasons that led Hegel to his decision. Hegel’s subjectivism, therefore, is much more marked by the individual and it would be difficult to find the appropriate SOC.
1.2.1. From classicism to romanticism
The paradigm of beauty as universal and intrinsic to the object was thus challenged in the 16th century. Indeed, there was never a complete absence of critics at any point in these philosophical debates, like the Greek Sophists, for example40. However, in the 16th century these criticisms took on a definite form. In 1588, William Shakespeare decried the universality of Beauty: “Beauty is bought by judgement of the eye”41 and in 1591, Giordano Bruno challenged its uniqueness: “Alia est pulchritudo unius speciei, alia alterius, alia unius generis, alia alius”42. In 1630, René Descartes clearly stated that he did not hold Beauty to be real and the object of shared rationality43.
It was the philosopher John Locke who developed the “subjectivist doctrine”. He expressed in the clearest terms that there is nothing innate in human thought, that is, all thought is born of a “sensible reception”, therefore from our observations. Although he did not directly speak about Beauty, it can be understood that this quality, like the other Secondary Qualities of an object44, does not at all exist outside the person who perceives it45.
39 In Hegel (1835–1838), he writes: “...in supreme pictures we cannot find room for any thought except the one which the situation is meant to arouse. This is the reason why Correggio’s Mary Magdalene in Dresden seems to me to be so worthy of admiration and it will ever be admired. She is the repentant sinner, yet we see in her that sin is not the serious thing for her, but that from the start she was noble and cannot have been capable of bad passions and actions. So her profound but reserved withdrawal into herself is but a return to herself and this is no momentary situation but her whole nature. In the whole presentation, in the figure, facial traits, dress, pose, surroundings, etc., the artist has therefore left no trace of reflection on one of the circumstances which could hint back to sin and guilt”.
40 We can thus read in the Dissoi Logoi, which was a collection of controversies from Plato’s time: “I believe, then, that if someone bid all men make a heap of the things which they each deem to be the shameful, and conversely, to take those that each considers as the seemly ones from these collected, nothing would be left behind, but everyone would take everything. For not all have the same opinions” (Dissoi Logoi, 2, 8, reported in Tatarkiewicz (1970)).
41 Love’s Labour’s Lost, 1588.
42 “The various sorts and kinds of beauty differ one from another” (G. Bruno, De vinculis in genere, III, 638, quoted by Tatarkiewicz (1970)).
43 “...in general ‘beautiful’ and ‘pleasing’ each signify merely a relation between our judgement and an object; and because men’s judgements are so various, there can’t be any definite standard of beauty or pleasingness”. Letter to Mersenne on March 18, 1630, in Œuvres et Lettres, Gallimard, Paris, 1953, quoted in Malinowski-Charles (2004).
44 Secondary qualities are those that are not immediately deduced from perception, such as size and color, but result from the conjunction of various primary qualities that thus associate a “complex idea” with the observed object.
45 These ideas are the subject of Locke’s 1689 work, Essay on Human Understanding (Locke 2009): “From whence I think it easy to draw this observation, that the ideas of primary qualities of bodies are resemblances of them, and their patterns do really exist in the bodies themselves; but the ideas, produced in us by these secondary qualities, have no resemblance of them at all. There is nothing like our ideas existing in the bodies themselves” (J. Locke, An Essay concerning Human Understanding, Book II, Chapter 8, Paragraph 15).
Many of the arguments he used fed the Quarrel of the Ancients and the Moderns, which was raging across all Europe at this time. His ideas were largely reproduced in the 18th century in the texts of Denis Diderot46 and Johann Wolfgang von Goethe47. They declare that Beauty is an experience and not an attribute, that this experience is largely born of the individual’s emotion and it is therefore personal and not universal, contingent and not immanent. It is eminently variable in time, dependent on the individual’s “moods”. That which is beautiful will not always be so, and that which is not beautiful today may, perhaps, be beautiful tomorrow. This idea is captured in the famous line “Beauty is in the eye of the beholder”48, which remains open to various interpretations.
1.2.2. The moderns
The philosophers who made concessions to subjectivism are those who moved away from objectivism with a heavy heart. Luc Ferry, for example, wrote, “Modern aesthetics is certainly subjectivist inasmuch as it bases beauty on human faculties, reason, feeling or imagination. It is still, however, driven by the idea that the work of art is inseparable from a certain form of objectivity” (Ferry 1990, p. 20), or “There are fewer disputes over the grandeur of Bach or Shakespeare than over the validity of Einstein’s physics” (Ferry 1990, p. 42).
Those who worked in experimental psychology, following the path carved out by G.T. Fechner in the 19th century, also adopted a pragmatic subjectivism. They strove to systematically test the universality and permanence of Beauty on populations of observers. They showed the breadth of factors that existed, related to the context, to education and to culture, but struggled to generalize their conclusions to very varied populations. Above all, they had great difficulty in identifying constant aesthetic criteria that would be convincing in their generality. Berlyne (1970, 1971) thus identified specific qualities of the aesthetic object: novelty, incongruity and complexity. The first two arose more from the observer. He also showed the emotive power of the form through controlled experiments (by which he responded to C. Bell’s line of thinking) by basing his work on the concept of physiological arousal. He tried to demonstrate that the arousal (of the observer) was a function of the complexity of the parameters of the form (the object)49.
Other work (Reber et al. 2004; Moshagen and Thielsch 2010), borrowing from Gestalt formalism, put forward the concept of the law of Prägnanz50 to explain why certain significant forms preferentially activate our perceptual system. Consequently, the law of Prägnanz should make it possible to share universal criteria and support the objectivist approach. However, it appears that in practice it depends on a large number of parameters inherent to the observer: some arise from their mood, others from their temperament, their culture, their training and, finally, other parameters are a result of the context. All these factors act concurrently upon the observer’s emotions, producing a masking effect or, on the contrary, an amplification of the stimuli.
Some thinkers emphasize the particular role played by the perceptual system (Molnar 1974, 1997), while others draw a link between aesthetics and semiotics, associating the artist’s message with what the observer experienced, and stating that the aesthetic experience was governed by the knowledge of “codes” acquired through a cultural education that is, by nature, very subjective (Eco 1962; Solso 1996).
In contrast to these moderate interpretations of Beauty, the 19th century and the Romantic period produced voluminous subjectivist interpretations that were far removed from Platonic thought. Beauty requires an object, but its relationship with the object is tenuous and it may potentially be transferred to another object if the first were to grow distant or disappear51. While poets and musicians were the first to walk down this path, painters soon adopted it as well. In society, there was a brutal break from the Classical school, with the two currents flowing largely in parallel and competing with each other, each holding on to its truth. The success of the Impressionists over l’Art Pompier (art that used an extravagant or academic style) cemented the overturning of trends in the world of pictorial arts. The blossoming of various artistic movements across the world (pointillism, fauvism, cubism, surrealism, etc.) resulted in the long-term acceptance of this approach, which placed the spectator at the heart of the artist’s process, while small pockets of Classicism survived here and there.
46 Diderot’s texts on aesthetic criticism are collected in the Les Salons columns in Diderot (1769).
47 Goethe’s ideas can be found in The experiment as mediator between subject and object from 1792 and again in Theory of Colors from 1810.
48 A line by Margaret Wolfe Hungerford in Molly Bawn (1878), often attributed to Oscar Wilde – as so many of these lines are! We find quite a similar sentence in David Hume’s 1757 writing, “Beauty is no quality in things themselves: It exists merely in the mind which contemplates them” (D. Hume, Of the Standard of Taste).
49 Berlyne believed that the dependence between form and arousal followed an inverted U-curve distribution passing through an optimum before declining for forms that were too complex.
50 Prägnanz was the term chosen by M. Wertheimer to express the ease of perception of a figure based on the specific relations between its components (symmetry, parallelism, proximity etc.) (Wertheimer 1922).
51 The experiments of Romanticism abound with such examples (Novalis 2005); however, this same argument, elaborated through multiple modalities, was subtly built into À la recherche du temps perdu (In Search of Lost Time) (Proust 1911).
Friedrich Nietzsche was emblematic of this Romantic approach52. Nietzsche denies the existence of Beauty as well as ugliness53; aesthetics is reduced to an organic act54. Nietzsche wished to wrench Beauty away from Platonic rules as well as the Cartesian world, and spoke of the significance of Dionysiac criteria (eminently subjective in their excess) when compared to the Apollonian criteria, consisting of measurement and reflection. In doing this, he rejected the universality and timelessness of Beauty. He only accepted the Beauty of the elites, that which the masses (the “sensual plebeians”) would not perceive: “Pulchrum est paucorum hominum”55. Beauty is not made of canons and harmonies, but from violence and mud, the antithesis of Aristotelian precepts56. However, Nietzsche was complex and changeable and at the end of his life he often denounced that which he had praised in his youth57.
We find many elements of Nietzschean thought in Robert Pepperell’s very modern Posthuman Aesthetics, which claimed, on behalf of art, a break from compromises with society58.
By the end of the 19th century, the psychoanalytic school of thought threw its weight behind subjectivism, rejecting all objectivist arguments about Beauty. Its proponents adopted a rigorous interpretation of Beauty which referred only to the individual and their past experience, burying the deepest of their responses in the fog of the unconscious. Sigmund Freud wrote: “...beauty originates in the domain of sexual feeling”59, echoing Nietzsche, who also stated that the feeling of beauty had a sexual origin, “without a certain overheating of the sexual system a man like Raphael is unthinkable”60.
1.2.3. The influence of neurobiology
It was not until the end of the 20th century that a new pillar was added to the structure of subjectivism, with the advent of neurobiology. A lot of work based on new brain imaging techniques (functional MRI and MEG) made it possible to better understand how our thoughts functioned. The results obtained in physiology, in the middle of the 20th century, made it possible to explain the role that the visual pathways and the posterior cortex played in perception. Today, neurobiology sheds light on the role played by the neo-cortex and the limbic system in these tasks of perception, evaluation and decision-making. It highlights the place of emotions in the creation of our interpretation.
In light of these discoveries, a number of philosophers, art historians and artists, along with neurobiologists, have come together under a new aesthetic label: neuroaesthetics, which has produced a large amount of literature since the late 1990s, including seminal work by Zeki (1999), Livingstone (2002) and Kawabata and Zeki (2004), and more recent work by Cupchik et al. (2009), Di Dio and Vittorio (2009), Di Dio et al. (2011), Ishizu and Zeki (2011) and Kuehn and Gallinat (2012). These will be explored further in Chapter 2.
For many thinkers, these results have firmly established the subjectivist interpretation of Beauty.
52 It is especially in his 1888 work Twilight of the Idols that F. Nietzsche shares his thoughts on aesthetics (Nietzsche 2005); however, several other references are also found in The Will to Power and in The Antichrist.
53 He writes, “Nothing is beautiful, except man alone: all aesthetics rests upon this naïveté, which is its first truth. Let us immediately add the second: nothing is ugly except the degenerating man – and with this the realm of aesthetic judgment is circumscribed” (Nietzsche 2005).
54 Art as physiological function: “Making art physiological is the same as reducing art to the level of gastric secretions”, or the evaluation of Beauty as a sport: “We can use a dynamometer to measure the effect of ugliness” (Nietzsche 2005).
55 “Beauty is for the few”, F. Nietzsche, The Antichrist, 1895.
56 “Beauty is difficult. Let us beware of beauty! Let us dare, my friends, let us dare to be ugly. Wagner dared it. Let us heave the mud of the most repulsive harmonies undauntedly before us” (Nietzsche 2005).
57 For example, Nietzsche defended Corneille against Hugo, contradicting the above lines.
58 In his Posthuman Manifesto, R. Pepperell stated, among other things: “Rich aesthetic experience is generated by the perception, simultaneously, of continuity and discontinuity in the same event”; “All stimulating design relies on balancing the relative quotients of order and disorder in the object. This also goes for the composition of music and literature. However, such judgements cannot be made in isolation from the fact that values of order and disorder are largely prescribed by social agreement”; “Posthuman art uses technology to promote discontinuity. Healthy societies tolerate the promotion of discontinuity since they understand that humans need exposure to it, in spite of themselves. Unhealthy societies discourage the promotion of discontinuity” (Pepperell 2005).
They support conclusions, which in their extreme form would suggest that: – Beauty is a personal emotion, that is, a complex psycho-physiological experience in reaction to biochemical and environmental influences transmitted by our senses; – aesthetic appraisal is distinct from other forms of judgment from our cortex (e.g. judgment about food) as beauty is not vital for a human being (unlike food, for example) and there is therefore no physiological “interest” in expressing it (Beardsley 1966); 59 S. Freud, Civilization and Its Discontents, Presses universitaires de France, Paris, 1985. 60 F. Nietzsche, The Will to Power, incomplete texts from 1888, collected in the Kritische Gesamtausgabe.
The Legacy of Philosophers
19
– given this, artistic material may be processed by different and specialized circuits, sometimes known as the “hedonistic areas”61. Following Changeux (2016), we could say that subjectivism has slowly migrated from sensualism (as described by Condillac) and empiricism created by Locke and Hume toward a cerebralization, the field of the neurobiologists. They join with Martin Heidegger who stated: “As soon as art is simply a result of physiology, the essence and reality of art dissolve into nervous states and processes in nerve cells”. This was also the point of view adopted by many people in the field of cognitive computing, which is far removed from the Platonic pre-suppositions adopted by Big Data enthusiasts. This dilemma among computer scientists is discussed in the last few chapters. 1.3. Subjectivism and objectivism: an ongoing debate The importance of the subjective phenomena in appraising the aesthetic qualities of an image cannot be denied, today, and the role of the observer, endowed with a particular sensibility, education and culture, is indisputable62. However, many witnesses testify to an objective consensus on the beauty of certain objects. This consensus indicates that the “cues for beauty” emanate from this object and are largely recognized and act on a large section of the population. Faced with both sets of evidence, we stand at the heart of a dilemma. This is what this book tries to define and we will set aside all other approaches to Beauty that would take us away from this particular debate63. 61 Thus, around 1730, Francis Hutcheson considered that Beauty is that which is captured by our internal sense of Beauty as the visible that which is captured by our eye. He therefore attributes organs of hedonistic perception to man, which later works would seek in vain to locate (see section 2.2). 62 This necessary implication of the personal qualities that tie beauty to the observer as much as to the object seems to culminate in A. 
Carlson’s statement of faith, which states that the beauty of a landscape cannot be appreciated without a solid foundation in geology, botany, meteorology, history and sociology (Carlson 2005). We must compare this point of view with that expressed by R. Barthes: “...I dismiss all knowledge, all culture, I refuse to inherit anything from another eye than my own” (Barthes 1980, p. 82). 63 Among the most important currents of thought in contemporary aesthetics, we will not be looking at functional aesthetics, which was given considerable importance in the 20th century, especially for its relationship with industrial and architectural design. Following an idea that has earlier been proposed by the Stoics, this school of thought attributed aesthetic quality purely based on the object’s ability to fulfill a function. Many large artistic movements were based on this (Blau Reiter, Suprematism, etc.). It has been discussed at length in the work of Jacques Viénot, Gilbert Simondon and Mikel Dufrenne, in France. Functional aesthetics is not well suited to judge the beauty of photographs. We will also not be studying what is called
Figure 1.1. Are these photos beautiful? This is the question this text attempts to answer
COMMENT ON FIGURE 1.1.– If a large jury is asked this question, a certain consensus often emerges. This supports those who think beauty can be measured by a machine. If everyone perceives criteria for beauty in (a) and (b) that are more obvious than in (c), which is usually seen as ordinary, why would a computer (capable of remarkable shape recognition performances) be unable to detect these? Photo (a) (Poppy by Irving Penn) is a famous, signed photograph and we could object that it benefits from this reference (this is the “signature effect” analyzed in Changeux (2016), discussed at length in section 3.5.3, which often appears as one of the reasons for attributing to a signed work a beauty that would not be attributed to an unsigned one); however, photograph (b) is anonymous (posted on the Web) and is only evaluated based on its inherent qualities. Further, all three photos represent flowers and therefore share the same cultural and semantic context for most observers. Finally, they do not have any properties that would involve interest or surprise, which are the other potential ingredients for differentiation, as we have seen in the Introduction.
contextualized or historical aesthetics, illustrated by Ernst Gombrich and Arthur Danto, which suggests that a large number of the aesthetic qualities of an artwork are supplied by the artistic, political, social and familial context in which the artwork was created. Only a knowledge of this context would allow us to appreciate the artistic intention, from which follows our judgment of beauty. In Bullot and Reber (2013), the authors attempt to bring together the contextualized interpretation and the purely subjective approach to beauty using neuro-aesthetics. Nevertheless, we find this method is not adapted to justifying an aesthetic judgment in a computer processing approach.
It seems largely accepted today that objects recognized as “beautiful” possess intrinsic properties that are conducive to any observer recognizing this (see, for example, Figure 1.1), but it is the observer who finally decides whether or not they judge it to be beautiful. Thus, we once again center this interpretation of Beauty on the ideas we have already seen, proposed by de Crousaz or Kant, which strive to reconcile the subjectivism that has ruled for the past five centuries and the objectivism that dominated for 20 centuries. These approaches did not emerge during the Enlightenment; while they have always been present in debates on aesthetics, their contradictory nature was not highlighted and was rarely discussed. Let us cite the sole example of William of Auvergne, who wrote, in the early 13th century: “in that which is indisputably essentially beautiful and elegant, the deep essence and the quality is beauty”64 followed by: “A thing that is beautiful in itself may be ugly close to another object, or with another object, or in another object”65. Positions like these, which are difficult to reconcile and which call for solutions of compromise, may be found in many texts from Ancient Greece. Today, phenomenological approaches (which we have seen earlier) that intimately associate the observed object with the mind through “phenomena” sometimes offer such a combination. While the object is charged with providing the “aesthetic predicates”, these will then be transformed into qualia66. These qualia will then be the carriers of personal, ineffable (incommunicable), immediate aesthetic appreciation and are related only to perceptive experience and directly apprehended by consciousness (Dennett 1993). However, these interpretations, which strongly oppose the divide between mind/consciousness, defended by rationalists, are still widely debated. Cues for beauty are thus more or less effective at creating emotion and the pleasure that accompanies the feeling of beauty.
This is a central point in our study. What does this effectiveness depend on? We can first look at the object that was photographed. Is it beautiful or not? It is obvious that to answer this primary question we must resolve the entire problem, because when we speak of the beauty of the object chosen as the model, we attribute to this object all the criteria that we would attribute to its photograph (see Figure 1.2). Next, speaking of “aesthetic predicates” invites us to list these out. We would then turn to arguments from Platonic thought: 64 “Hoc indubitantur essentialiter pulchrum est et decorum, et ujus essentia et quidditas ejus est pulchritido” (Du bien et du Mal, 206 (Pouillon 317), cited in Tatarkiewicz (1970)). 65 “Ipsum quod est in se decorum, est turpe ad aliquid, aut cum alio, aut in alio” (Du bien et du Mal, 207 (Pouillon 317), cited in Tatarkiewicz (1970)). 66 For philosophers who have adopted these, the qualia (singular: quale) are the trace of the phenomenological experience. Qualia cannot exist outside the sensible experience and are incommunicable (I cannot explain “red” to someone who cannot see).
simplicity, symmetry, harmony, perhaps associated with Aristotelian nuances to humanize these. We can then use Gestalt criteria (symmetry, parallelism, proximity, etc.) (Reber et al. 2004; Moshagen and Thielsch 2010), the fundamental principles of Prägnanz (Wertheimer 1922), and we could use the more complex criteria identified by Berlyne: novelty, incongruity and complexity (Berlyne 1970). However, we expect to also use a large number of parameters internal to the observer, some arising from their mood, others from their temperament, culture, training and finally, other factors from the context. All of these act simultaneously on their emotions through a masking effect or, on the contrary, amplification (Dufrenne 1967; Solso 1996; Changeux 2016).
Figure 1.2. Why is a photo beautiful?
COMMENT ON FIGURE 1.2.– For photo (a) (Romania by Josef Koudelka), it can be said that its beauty is the result of its object: here, the horse itself is beautiful. But apart from the fact that this observation only shifts our question onto the attributes of beauty, it does not remove the aesthetic effect of the photograph, which makes a clear impression. In photo (b) (Rive gauche by Gordon Parks), the earlier argument does not hold, as the object itself (an alley against daylight) derives its aesthetic qualities only from the image, from its composition and the play of light. The painters who progressively replaced odalisques and carriages by jugs and jam-jars got this quite right.
2 Neurobiology or the Arbitrator of Consciousness
Perception is a subjective act. It cannot be wholly communicated in speech or writing: it requires the experience of the gaze and shared reward. Jean-Pierre CHANGEUX (2008)
There are multiple studies that have allowed us to understand the mechanisms of human vision. Clinical studies carried out on injured patients during surgeries, often complemented by postmortem anatomical studies, paved the way, making it possible to localize the regions involved in certain stages of perception, analysis or the interpretation of visual signals1. Then there are electrophysiological studies on humans (electroretinography, electro-oculometry, event-related potential), and especially those carried out using more invasive techniques on birds, cats and monkeys, which have made it possible to verify in vivo the validity of models on the functioning of the optic pathways and the primary regions related to vision. These pathways are represented in Figure 2.1. However, it was not until the emergence of techniques such as electro-encephalography (EEG), magneto-encephalography (MEG), magnetic resonance imaging (MRI) and positron-emission tomography (PET) that researchers could access the functioning of superior regions associated with vision in (almost) real time. 1 Of particular interest is R. Vigouroux’s work, which offers many different examples of clinical problems affecting famous artists following various accidents (Vigouroux 1992). This book is complemented by a highly detailed and didactic table listing the brain structures engaged by artistic activities, with a special emphasis on creative functions.
Figure 2.1. (a) Diagram of the optic pathways that go from the eyes to the visual regions, passing through the optic chiasm, and then the lateral geniculate nucleus (credit: Dr. Leininger). (b) The visual areas. Certain regions have well-identified functions: V1 for processing contrasts, V2 for contrast and motion, V3A for orientation and motion, V3B for depth, V4 for recognizing faces, writing, colors, etc. (credit: Wikipedia). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
The ascending visual pathways are, of course, those most significantly involved in carrying the message from the observation of a photograph toward the superior regions, but they are not the only ones. M. Imbert initially demonstrated that direct pathways made it possible to short-circuit the visual regions in a way that transmitted the viewed observation to the motor cortex and constituted our proprioceptive response. The role that these pathways play in the perception of the world around us is still not well known, however, and they are poorly integrated into the understanding of vision pathways.
2.1. fMRI protocols and neuroaesthetics
Among all of these techniques, it was the research carried out using functional magnetic resonance imaging (fMRI) that led to the most significant progress in this field. Let us recall that this technique, which emerged in the 1990s, is based on the detection of oxygen consumption (which is related to circulatory activity) in active brain regions. This consumption is very small and results in a very small variation in the signal detected by the MRI (of the order of one percent). In order to study this, the protocols used require the test subject to carry out the same exercise several times in a row, such that multiple measures can be integrated. Since these protocols must also make it possible to isolate the desired signals, it is useful to alternate between the detection step and steps designed to identify oxygen consumption at rest, or consumption due to other brain activity in the subject, independent of the test stimulus. The desired response is then obtained through the difference between these two experimental series. Let us look at a fictional example to see how these protocols work. Assume that we wish to identify whether there is a specific brain circuit to process the color pink. We first produce a series of varied images, some of which contain a range of pinks, while others do not.
These are then presented, one after the other and in a random order, to an observer placed in an MRI machine. If there is a “pink processor” in the brain, then we would expect to see one active region in all the fMRI images taken when an image containing pink was presented. We would say that this processor is specific to pink if the same experiment was repeated with the pink zones converted to purple, with no “lighting up” in the zones activated earlier. In order to generalize these results, we must repeat each experiment on several subjects to ensure that there is a consensus on the results, and then take the average of the results obtained. However, since these averages must take into account the variability in brain morphology, we register each brain map in a reference frame (for example, Talairach coordinates) before computing the average. These complex protocols are, of course, the topic of much discussion and criticism and we must be greatly circumspect when examining all of these results. Despite their limitations, however, they have led us to new understandings of aesthetic judgment and arguments that were unavailable to the remarkable work
carried out earlier on this subject (Gardner 1984; Vigouroux 1992). The complexity of the protocols led to many studies being carried out with very similar objectives, but different experimental conditions. Over the past decade or so, research that tries to track brain processes associated with signals based on their aesthetic qualities has all been grouped together into a specific branch of science called “neuroaesthetics” (Skov 2009; Vidal 2011). This is dedicated specifically to the study of the neural bases of beauty across various modes of perception: vision, of course (painting, photography, architecture, statuary), but also hearing (music and songs), and, in some rare cases, taste and smell. For many authors, one of the important objectives of neuroaesthetics is to “naturalize” the circuits for the evaluation of beauty. That is, to show that they use cortex functions that are already in place for other purposes, especially those that are crucial for the survival of the species (which aesthetic judgments are not) (Brown et al. 2011), as these crucial functions could justify the existence of these brain structures in our species in a Darwinian concept of evolution. 2.2. The fMRI quest for “beauty processes” in the brain We cannot list all of the research carried out on neuroaesthetics. A 2009 review of literature (Di Dio and Vittorio 2009) cited about 30 publications, another in 2016 (Chatterjee and Vartanian 2016) cited close to 200, while a 2017 search for the word neuroaesthetics threw up almost 3,000. We will not be looking at all of this literature individually. Instead, we will isolate the principal brain regions involved and refer the reader to a few articles that will explain how these discoveries were made. 2.2.1. 
The role of the prefrontal cortex The prefrontal cortex (shown in color in Figure 2.3) was the first candidate for the “hedonic center” in research, where its involvement during the observation of works of art was highlighted initially (Kawabata and Zeki 2004; Jacobsen et al. 2006; Cela-Conde et al. 2011). While other, very important regions of the brain were then included, it remains one of the dominant zones in aesthetic vision. The orbitofrontal cortex (OFC) is known to play a significant role in decision-making tasks. It is a component in most perceptual pathways and is an important part of the system that governs appraisal (Kringelbach 2005; Wallis 2007), as well as the regulation of rewards. For these reasons, it could play a central role in the pleasure function (Berridge and Kringelbach 2013). This role will be illustrated through the research carried out by Ishizu and Zeki (2011), which formed the base for neuroaesthetic studies and allowed the authors to propose a “theory of beauty”, which is often used as a reference. In their experiments, subjects in an fMRI machine were exposed to a series of stimuli (photos and music), which had been categorized beforehand into “beautiful”, “indifferent” and “ugly”. The
analysis then consisted of comparing the activated zones based on the “quality” of the stimulus.
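The subtraction logic described in section 2.1 (many repetitions of a signal change of about one percent, alternation with rest, difference between the two series, then averaging over subjects) can be illustrated by a minimal Python sketch. Everything in it is hypothetical: the signal is simulated for a single region, the baseline, noise level and number of repetitions are arbitrary illustration values, and the registration of each brain in a common reference frame is assumed to have been done already. It is not the analysis pipeline of any study cited here.

```python
import random
from statistics import mean


def simulate_run(n_blocks, baseline, effect, noise_sd, rng):
    """Simulate one region: one noisy sample per stimulus block and per rest block."""
    task = [baseline * (1 + effect) + rng.gauss(0, noise_sd) for _ in range(n_blocks)]
    rest = [baseline + rng.gauss(0, noise_sd) for _ in range(n_blocks)]
    return task, rest


def percent_signal_change(task, rest):
    """Difference between the mean stimulus and mean rest signal, in percent of rest."""
    rest_mean = mean(rest)
    return 100.0 * (mean(task) - rest_mean) / rest_mean


def group_contrast(n_subjects, n_blocks, effect, seed=0):
    """Average the per-subject contrast over a group (brains assumed already registered)."""
    rng = random.Random(seed)
    per_subject = []
    for _ in range(n_subjects):
        task, rest = simulate_run(n_blocks, baseline=1000.0, effect=effect,
                                  noise_sd=10.0, rng=rng)
        per_subject.append(percent_signal_change(task, rest))
    return mean(per_subject)


# A region that responds to the stimulus (1% change) and one that does not.
responsive = group_contrast(n_subjects=21, n_blocks=200, effect=0.01)
silent = group_contrast(n_subjects=21, n_blocks=200, effect=0.0, seed=1)
print(f"responsive region: {responsive:.2f}%  silent region: {silent:.2f}%")
```

In this simulation the noise on a single sample is as large as the effect itself, which is why the contrast only becomes readable once many blocks and several subjects have been averaged.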
Figure 2.2. Active zones in Ishizu and Zeki’s experiment. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 2.2.– It is shown that only the central part of the OFC (the spot on the right in image (a)) was systematically activated during experiments with visual or acoustic signals that carried aesthetic qualities. The caudate nucleus (in the center of figure (b)) may also be active in some experiments. The experiments were conducted on 21 subjects. The regions in red were excited by a visual signal, the regions in green by an acoustic signal, while the regions in yellow were excited by both types of signal (Ishizu and Zeki 2011).
The results showed systematic and preferential activity in the central part of the OFC every time a “beautiful” document was presented, rather than an “indifferent” or “ugly” document (Figure 2.2). These results, which had already been reported by others, were regularly reproduced in repeated observations (see the summary by Brown et al. (2011)). Furthermore, inasmuch as it can be deduced from a small number of experiments, the level of activation is directly related to the beauty of the document, as appraised during the calibration phase that preceded the exposure to the stimulus and brain mapping. These experiments thus seem to give a biological basis to the concepts of aesthetics and pleasure, a point that is regularly discussed in the philosophy of Beauty. Finally, it seems as though there is no significant activation that distinguishes
“Ugliness”. This point attempts to answer an important question in aesthetics in philosophy, about whether or not there is a negative valence to Beauty2.
Figure 2.3. Zones active when emotions are manifested in the experiment by Chatterjee and Vartanian (2016). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 2.3.– The OFC is the region seen in violet in views A and C. The anterior insular cortex (AIC) is shown in yellow in view D. It is closely related to the lateral regions of the OFC. The visceromotor circuits consist of the ventral region of the ventromedial prefrontal cortex (vmPFC) (in blue in A, B and C). The vmPFC is closely related to the amygdala (in pink in view D). The anterior cingulate cortex (ACC) is in beige in B, in front of the corpus callosum. The dorsal zone of the prefrontal cortex is associated with control of mental states: dorsal part of the vmPFC, frontal zone (in brown), dorsomedial prefrontal cortex (dmPFC) (in green in A and B). The ventrolateral prefrontal cortex (VLPFC) is in red in view A. The thalamus is in pink in the center of view B and the middle frontal gyrus is in orange.
2 Many essays in aesthetics do not recognize a negative valence to beauty, while common sense accepts that “Ugliness” could fulfill this role. However, we would then need to accept that there are attributes of “Ugliness”, which would be distinct from simply the absence of the criteria for Beauty, which is the most commonly adopted position. The concept of “Ugliness” is not given much space in the literature on aesthetics (especially when compared to the concept of “Evil” in Ethics). Two references can be recommended, however: Rozenkranz (1853) and Bénard (1877). It must be noted here that since neurobiology reveals appraisal zones that are as sensitive to positive valences as to negative valences (especially the OFC), it is trying to answer this question.
The study cited here identifies other regions that are also activated, especially the caudate nucleus (whose role in regulating emotions is known) or the supramarginal gyrus (involved in the perception of space), but in a non-systematic manner. We will return to them later on. 2.2.2. The role of the insular cortex The work by Brown et al. (2011) summarized available studies on neuroaesthetics in 2011 to see if it was possible to find regions that were common to aesthetic appraisal across the various modes of perception (vision, hearing, taste or smell). This also involved a search for the “hedonic function”, and they strove to “factorize” this for all modes of perception. The results from this research were well corroborated by more recent studies that gave more specific information on them (Boccia et al. 2016). The review looked at 93 neuro-imaging studies (fMRI or PET only) that studied which brain regions were activated during exposure to positive aesthetic stimuli across the four domains of perception. There were significantly more articles on vision, and we will focus here on the results related to these. All of these studies were carried out on healthy individuals and provided results that could be plotted within the same reference atlas (Talairach or MNI)3. The study consists of examining the conjunction of activations between authors and between modes across all of the studies, using well-established protocols4. Although we find that the OFC plays an important role in most observations (Figure 2.4), it is not engaged in aesthetic judgments of taste and smell. Brown et al. (2011) believe that it is therefore not systematically associated with aesthetic judgment. Consequently, they do not accord it pre-eminence in aesthetic appraisal. 
On the contrary, this study concludes that the right lobe of the anterior insular cortex was of prime importance, being the site of high activation during an aesthetic stimulus, regardless of the modality (hearing, vision, taste, smell). It is likely to be the only area that is systematically activated by positive aesthetic stimuli across all modalities, and it would seem that this is where the most specific mechanisms related to aesthetic appraisal are located.
3 Because of the geometric diversity in brain anatomy, functionally identical regions from different individuals cannot be superimposed on one another without carrying out geometric transformations. The Talairach atlas is the oldest reference (Talairach and Tournoux 1988); it is very well-suited to superficial structures. However, the MNI atlas (named after the Montreal Neurological Institute at McGill University) is sometimes preferred as it is more representative of the variety in different populations and is better-adapted to deeper structures (Evans et al. 1992).
4 The chosen protocol is GingerALE: ALE = Activation Likelihood Estimation.
Figure 2.4. The chief focal areas of visual activity during the experiments reported in Brown et al. (2011). The reference coordinates chosen are Talairach coordinates. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 2.4.– Axial cuts whose vertical coordinate is indicated by z are presented. The active zones have a threshold probability of 5%. The abbreviations are as follows: IFG = inferior frontal gyrus, IPL = inferior parietal lobe, aMCC = anterior medial cingulate cortex, pgACC = pregenual anterior cingulate cortex, OFC = orbitofrontal cortex. Caudate denotes the caudate nucleus.
Conclusions that concur with these are also found in Cupchik et al. (2009) (Figure 2.5).
Figure 2.5. Experiment carried out by Cupchik et al. (2009). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 2.5.– This experiment yielded results through experimentation with three different types of stimuli: (i) a visual field that was almost empty (a uniform background marked by four crosses), (ii) some generic images and (iii) reproductions of paintings. In two of the images (a and b), we compare the regions that were specifically activated during the observation of images (whether ordinary or artistic) with the observation of the empty visual field. In the lower part of the brain, we can see the activation of tertiary visual areas (inferior fusiform
gyrus), and in the upper part, the activation of the anterior insular cortex. In image (c), we can identify the brain region activated as a discriminant during the observation of “aesthetic” versus ordinary images. However, this region is also activated in a variety of situations where the aesthetic dimension is not engaged, and therefore there is still some confusion about its exact role. It is especially reputed to be active in situations where there is a negative judgment (disgust, pain, empathy, etc.), which is considerably far removed from our conclusions. It is also involved in introspective consciousness (maintaining the state of the human body, homeostasis, self-awareness, etc.), which includes activities that are not very close to aesthetic judgment. Finally, this region also seems to play a role in regulating the expression of emotions by the amygdala, which is probably closer to our subject. Research by Lacey et al. (2011) specifies the importance of the zones that are in charge of “rewards”, the ventral striatum, hypothalamus and OFC when examining works of art. It then offers some schemas for connectivity between these zones. According to Cupchik et al. (2009), the coordination between the OFC and AIC is likely to be complex: the OFC, in charge of the “rewards” related to perception (especially visual perception), is likely to store the “memory” of these rewards during the experience such that the aesthetic attractiveness is maintained through the AIC until the decision is made. 2.2.3. The role of the visual areas These studies have not ignored the activity in the upper visual areas and there is much research to attest to their engagement in vision when viewing works of art (e.g., in Figure 2.5(a), we have noted the engagement of the tertiary visual areas). In Di Dio et al. (2007), an observer is placed in the observation conditions for viewing statues in a museum. 
These statues either have the original proportions or deformed proportions (elongated or shortened). The study showed the involvement of visual areas: temporal areas, as well as the lateral occipital cortex, both known to process images of the human body. In Cupchik et al. (2009), a comparison was carried out between the observation of paintings with clearly defined contours and those with soft contours. The objective of this study was to highlight the possible role of active attention in the interpretation of works of art, since the cue from the contours, which is important in recognizing shapes, is not available at the output of the optic pathways in the absence of clearly defined boundaries. The study thus shows that a distinction is made in the superior parietal lobe, the region that, among other things, is in charge of spatial orientation. In line with the authors’ hypothesis, this indicates that the cortex is indeed engaged in the task of completing the pictorial message if the source image does not carry all of the information (Figure 2.6).
Figure 2.6. In this experiment from Cupchik et al. (2009), we can see the responses of areas that were activated specifically when viewing artistic images “with soft edges” (b) versus images “with hard edges”. (c) The superior parietal lobe is preferentially active in the first case. This area is in charge of the spatial localization function in the interpretation of scenes. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
In Vartanian and Skov (2014), a summary of 15 studies confirmed the systematic engagement of the anterior temporal lobes, the fusiform gyrus and the parahippocampal gyrus, which play a role in recognizing the environment.
2.2.4. The role of memory and cognition
The same studies often conclude that there is activation in the regions involved in memorization. Thus, in the study cited earlier (Di Dio et al. 2007), the authors identified the action of the medial parietal lobes and the pre-frontal lobes, regions that are involved in the memory function. This experiment thus also argues that cognitive functions are engaged when processing reference statues. Further, the study of brain recordings in the presence of “beautiful” objects has shown an overactivation in the amygdala, a region with multiple roles, but which is often engaged for its role in positively or negatively appraising something during learning tests, and in expressing emotions. Similarly, the meta-study carried out by Vartanian and Skov (2014) concluded that there is very general activation in the anterior temporal lobe, which has been identified as playing an important role in semantic memory, as well as in conceptual integration during shape recognition. From their observations, the authors in Di Dio et al. (2007) deduced that the interpretation of artwork is “driven by data” (role of the optic pathways), as well as being “led by emotions” (role of the OFC and AIC).
2.2.5. The role of embodiment
While the research cited previously argued that memory and experience, and ultimately cognition, must be taken into account for aesthetic appraisal, other research has also suggested that, on the contrary, a very different process is set in motion. For example, in the research presented in Freedberg and Gallese (2007), the authors start with the hypothesis that works of art with very different content can engage different brain regions.
This approach seems especially well-suited to photography, which reproduces real scenes, thereby running the risk of mixing up the phenomena that we tried to distinguish in the Introduction. Indeed, the authors demonstrate the variety of regions involved, often related to cognition or memorization (as seen above), but they also find that the observation of the work may affect sites in the observer’s brain that are likely to have been engaged by the artist either when they found inspiration for their creation, or during the creation of the artwork5. The response returned by these sites will actively contribute 5 For example, a seascape can evoke the same memories of holiday-time in the observer, as well as in the painter.
to the creation of our emotions, following a process known variously as embodiment, incarnation or personification. This interpretation is based on the mirror-neuron theory (still debated) that makes it possible to hold, within ourselves, a representation of our own body and, by transition, of the world, resulting in the development of the empathy that is classically expressed when observing this artwork. Thus, the regions in our ventral premotor cortex, in charge of the limbs, are likely to be engaged when observing a photograph of an athlete, and will then be involved in the sensation we experience just as much as, or perhaps more than, the regions in charge of knowledge or memory, which are often evoked (temporal lobes, amygdala and hippocampus). This interpretation, tested using the works of Goya, Michelangelo and Caravaggio, was also verified using abstract art (Pollock, Fontana). It was also supported by experimental findings in Di Dio et al. (2007) and, more recently, in Ticini et al. (2014). This would corroborate other suggestions made about how our judgment functions (Damásio 1994).
2.3. Responses from functional electric encephalography Unlike approaches that use functional nuclear magnetic resonance, techniques that measure electromagnetic potentials in brain activity make it possible to study phenomena that occur very rapidly, but this is at the cost of mediocre localization of the sources, especially at depth (Renault 2004). These techniques may be enhanced by tomography6, which makes it possible to recalculate volume information. Consequently, EEG studies have made it possible to show response times in areas that are sensitive to orientations, colors or that are able to detect the presence of a written text or a face in a photograph, even if observed for a very brief duration (typically 20 ms). These times range from 150 to 300 ms, showing that our brain circuitry has a remarkable ability to identify important primitives in an image (Thorpe et al. 1996; Fize 2004). Other studies have made it possible to measure the time required to carry out more elaborate functions, such as the categorizing of objects, which involve memory abilities (Schendan and Kutas 2007). These times typically range from 300 to 850 ms following a very brief presentation of an object. These activations are called event-related potentials (ERP). The difference that marks the viewing of artworks does not seem to reside in reflex abilities, but rather in ERP activations, which may potentially be delayed, calling upon memory and also involving emotions. These activations appear after a period that may 6 An electromagnetic tomographic reconstruction protocol that is frequently adopted is called LORETA (low-resolution electromagnetic tomography analysis) (Pascual-Marqui 1999).
be relatively long (in the order of a second or more) or, sometimes, very long (from 5 to 10 s)7 (Hillyard and Anllo-Vento 1998; Schupp et al. 2004).

Research carried out by Lengger et al. (2007) examined differences in brain activity in the presence of an abstract or figurative artwork. This research was carried out using contemporary artwork (thus governed by aesthetic criteria that are quite homogeneous), and combined recordings of brain activity with an analysis that aimed to determine two properties of the observer’s experience: their level of pleasure and their understanding of the artwork. With respect to the pleasure that was experienced, the study yielded no significant results: an abstract and a figurative painting evoked similar pleasure ranges, which could not be clearly differentiated from each other in brain activity. Studies on the observer’s understanding of the artwork produced the results presented in Figure 2.7(a). These show that after a delay of 6 to 10 s, different zones are active. According to the authors, this seems to demonstrate that parallel processing is carried out in the brain for both these types of stimuli. They also observe that the figurative images provoke many more associations than abstract images, especially in the left frontal lobe and in the parietal lobes and limbic system. The frontal lobe is known for its role in the integration and combination of diverse information, as well as its role in maintaining attention and memory, a role it shares with the parietal lobe (some of these results can be found in de Tommaso et al. (2008)).

Another contribution this study made was to test the possible role of contextual information brought in prior to the observation. To test this, certain tests were conducted after the observer was given an explanation of the artist’s style and intention. These results are presented in Figure 2.7(b).
They show that the absence of information leads to an increase in activity in the left frontal lobe, parietal lobe and insula. This was interpreted by the authors as an additional action undertaken by the observer to arrive at an interpretation and categorize the incoming signal. It therefore appears that any information on the artwork that is easily available, whether it is found within the work or in any additional source of information, makes interpretation easier. It has not been shown, however, that this information helps in the aesthetic experience.

7 For this reason, these are sometimes called SCP (slow cortical potentials).
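The latency ranges quoted in this section can be gathered into a small lookup table, which makes it easier to see where the windows meet. The following is only a minimal sketch: the stage labels, the function name `stages_for_latency` and the upper bound chosen for the "delayed" window (the text only says "a second or more") are assumptions made for illustration, not values taken from the cited studies.

```python
# Latency windows (in milliseconds) as quoted in section 2.3.
# ASSUMPTION: the labels and the 1000-5000 ms bounds of the "delayed"
# window are illustrative; the text only says "a second or more".
LATENCY_WINDOWS_MS = [
    ("primitive detection (orientation, color, text, face)", 150, 300),
    ("object categorization involving memory", 300, 850),
    ("delayed response to artworks", 1000, 5000),
    ("slow cortical potentials", 5000, 10000),
]


def stages_for_latency(latency_ms):
    """Return every stage whose window contains the given latency.

    Bounds are inclusive, so a latency that falls exactly on a shared
    boundary (e.g. 300 ms) is reported for both neighboring stages.
    """
    return [
        label
        for label, low, high in LATENCY_WINDOWS_MS
        if low <= latency_ms <= high
    ]


if __name__ == "__main__":
    for latency in (50, 200, 300, 7000):
        print(latency, "ms:", stages_for_latency(latency))
```

Returning a list rather than a single label is a deliberate choice here: the ranges reported in the literature are approximate and adjacent, so a measured latency may plausibly belong to more than one stage, or to none.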
Figure 2.7. Two results from EEG analysis showing the role of different brain areas during the observation of artworks. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 2.7.– (a) In an experiment where an observer was presented with contemporary abstract art (at the bottom) or figurative art (at the top), we can see the zones with significantly more activity (in blue) in the presence of figurative art: in particular, the left precentral gyrus, the right medial temporal lingual gyrus, the left central superior gyrus and the paracentral lobule. (b) The experiment consists of showing the observer a work of art accompanied by commentary given beforehand (at the bottom) or with no explanation (at the top). The figure shows the zones with significantly more activity (in red) in the absence of any commentary. These regions are, in particular, the insular cortex, the left superior frontal gyrus and the left precentral gyrus (from Lengger et al. (2007)).

In recent research (Van Dongen et al. 2016), the authors focused on the role of the artistic context in the expression of the emotion experienced upon seeing an image. They did this by presenting the observer with a series of emotionally charged images, taken from the IAPS collection (test images used in psychiatry, refer to footnote 9 in Introduction), either claiming that they were taken from an artistic production or with no such background. They subsequently noted a difference in activity during the observation of the alleged artwork (e.g. unpleasant images appeared to be less unpleasant and sexual content was less sexually arousing). These differences were interpreted as different brain regions being involved (the centro-parietal regions were less engaged); in an artistic context, the observer is likely to be led into paying more attention to the distribution of shapes and colors (characteristics of artwork) than the interpretation of the content.

2.4. A global cognitive scheme for aesthetic judgment?
We cannot help having a rather confused impression of these complex works, which sometimes lead to contradictory results and which are still largely being debated (Cela-Conde et al. 2011; Boccia et al. 2016). It would be preferable to have a functional scheme that described the trajectory of signals and stimuli across the brain regions, as well as the progressive development of emotions, judgments and actions that result from this observation. There are no models that cover the full spectrum of functions involved in aesthetic perception. We come across “low level” perception models, which describe visual areas very precisely, based on the pioneering work by Hubel and Wiesel and resulting in relatively well-developed functional schema that have been verified through experiments. The most famous of these is the David Marr model (Marr 1982). Among the models for early vision, we introduce here, as reference, the Jean Petitot model, first for its extreme mathematical sophistication, and secondly, for the conclusions that it draws with respect to the innate/acquired basis of visual representation (Petitot 2008).
We will also present several models of “higher level” aesthetic vision, which are careful not to establish a link between the transmitted signal and the perception that is experienced, and begin their modeling with higher neurophysiological activations, that is, in a way, with the results of the perception phase. These models are discussed in Liu et al. (2017) and summarized in the following paragraphs. However, it is still surprising to see the diversity of approaches adopted, sometimes placing the brain regions, cognitive functions or biological systems at the heart of the schema.

2.4.1. J. Petitot’s neurogeometric model

J. Petitot proposed an original mathematical model of the primary visual pathways (especially the V1 regions), which he called the “neurogeometric model” (Petitot 2008). He was particularly interested in the architecture of neural connections within the columns in the V1 region, which carry out a filtering of the visual signal based on orientation, as well as spatial frequencies. He draws parallels between these operations and the fibration of the retinal plane by projected lines of planar orientations, such that they could be defined in terms of differential geometry. Elaborating on this analogy, he showed that translation invariance leads us to bring in sub-Riemannian geometry, allowing us to explain the specific characteristics of vision, such as the perceptual grouping described in Gestalt theory, or the closure of contours at a distance illustrated in Kanizsa illusions. This mathematical model has been verified both when compared with neuronal architectures and also in the interpretation of many experiments in early vision.
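To give a rough idea of the kind of structure involved, here is a sketch in notation that is assumed for illustration and not taken from the source. If each V1 column at retinal position $(x, y)$ is tuned to an orientation $\theta$, the space of (position, orientation) pairs is $\mathbb{R}^2 \times S^1$. A contour lifted to this space is admissible only when its direction in the retinal plane agrees with the preferred orientation, which can be written as the vanishing of a 1-form:

```latex
\omega = -\sin\theta \, dx + \cos\theta \, dy ,
\qquad
\omega(\dot\gamma) = 0
\;\Longleftrightarrow\;
\dot y \cos\theta = \dot x \sin\theta .

% Admissible directions are spanned by two vector fields:
X_1 = \cos\theta \, \partial_x + \sin\theta \, \partial_y ,
\qquad
X_2 = \partial_\theta ,

% and their Lie bracket supplies the missing third direction:
[X_1, X_2] = \sin\theta \, \partial_x - \cos\theta \, \partial_y .
```

Only two directions of motion are permitted at each point, yet their bracket generates the third. This is the sense in which the geometry is sub-Riemannian rather than Riemannian: any two points of the space can still be joined, but only along curves that respect the orientation constraint.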
The author also shows that this explains very deep epistemological problems in visual perception, such as the concept of transcendental intuition, which makes it possible to solve tricky problems between what is innate and what is acquired – problems that are at the heart of Kantian aesthetics and also important in all of the phenomenological philosophy from Husserl to Merleau-Ponty via Bergson.

This work in neurogeometry therefore partly goes against the current in most neurobiological approaches, which do not acknowledge the importance of brain architecture in our “pre-wired” representation of space, and give learning prime position in the construction of our representations.

2.4.2. A. Chatterjee’s aesthetic emotion model

A. Chatterjee’s model (Chatterjee 2003) is one of the oldest models that tried to represent higher order perception and the impact it has on the observer’s experience. The model divides the perceptual and cognitive tasks involved in a visual observation into a few broad categories, which it associates with brain structures, and attempts to track the temporal transmissions between these tasks (Figure 2.8).
Figure 2.8. A. Chatterjee’s functional model for vision with aesthetic judgment
COMMENT ON FIGURE 2.8.– Two different functions, attention and representation, interact univocally. Attention has an effect at the level of early vision by guiding exploration. It is partly influenced by the emotional response, and also contributes to the development of this emotional response. Attention acts on our decision, especially through reflex reactions. Representation is with respect to our models of the world and is also involved in decision-making (the justification for the aesthetic judgment) and the emotional response (adapted from Chatterjee (2003)).
Aesthetic appraisal takes two forms: an emotional response and a decision that anticipates actions. According to this flowchart:
– the visual regions, occipital zones, inferior lateral zones, insular cortex, superior parietal lobule, etc., are, of course, activated during early vision, acting directly on the optical flow and also extracting semantic frameworks based on specific regions: shapes, colors, movements, faces;
– the OFC, which is in charge of appraisal, regulating “rewards” and pleasure, is also a key player in controlling our aesthetic decision;8
– the insular cortex, which controls our emotions, is also an important part and is almost always involved in the observation of a work of art. It brings in the contribution required to make an appraisal. Is this contribution prior to the “reward” or is it the consequence of the “reward”? This does not seem to be clear-cut as of now;
– the zones involved in operations of cognition (amygdala) and memory (medial parietal lobes, pre-frontal lobe) are also engaged. They provide the high-level contributions that will be invoked to “justify” the appraisal, especially whenever we seek an explanation from our consciousness about the aesthetic judgment. These justifications may be with reference to our lived experience or deductions from some reasoning;
– the areas in charge of premotor control are called upon in a specific way in situations of high empathy and embodiment (premotor ventral cortex, temporal lobes, hippocampus). They are also involved in the reflex in the visual pathway of the image or when engaged by conclusions from the cognition regions, looking for cues for confirmation or explanation.

8 It must be recalled that the most frequently used definition for Beauty today is “the characteristic of an object, the sight of which gives us the perceptual experience of pleasure” (see Introduction).

2.4.3. The model by Brown et al.

This model was presented in Brown et al. (2011). It is shown in diagrammatic form in Figure 2.9 and reflects the observations made in section 2.2. In this model, aesthetic appraisal results from the conjunction of two types of processing: one is interoceptive (giving us information on our internal states through various indicators), and the other is exteroceptive, reflecting perception, which is visual in the case of an aesthetic experience with a photograph.

The interoceptive pathway arises from sub-cortical zones: essentially, the thalamus, hypothalamus and posterior dorsal insular cortex. The stimuli that are thus established act on the interoceptive neuronal network, in particular on the right and left anterior insular cortex. All of these zones are known to play a role in emotion and empathy and, generally speaking, in expressing appreciation (whether positive or negative).

The exteroceptive pathway comes from the OFC, which is likely to play the role of the gateway for evaluating rewards for various modalities (vision, hearing, taste, etc.). It appears that in the OFC, the various modalities have adjacent regions rather than common areas.
Thus, like the interoceptive pathway, the exteroceptive pathways are also likely to be involved in developing the reward attached to the message from the visual channels, but the pathways through the OFC are more likely to use memory and reasoning. Further, it would seem to maintain the appraisal carried out over time, such that it intervenes in future appraisals.
Figure 2.9. Functional model of connectivity by Brown et al. for vision with an aesthetic judgment
COMMENT ON FIGURE 2.9.– In this model, the appraisal is carried out by the recurrent comparison of external information (exteroceptive) passing through the OFC and interoceptive information passing through the sub-cortical region (anterior insula). We can also see the influence of the cingulate cortex (rostral cortex), which is in charge of emotions, and the influence of the basal ventral ganglia, in charge of the hedonic response (from Brown et al. (2011)).

In Brown’s model, the OFC also interacts with the most anterior part of the cingulate cortex (ACC), which seems to be involved in monitoring visual salience. The connectivity between the AIC (anterior insular cortex) and the OFC will then lead to a biologically stabilized, or homeostatic, emotion which reflects our aesthetic judgment.

2.4.4. Model proposed by H. Leder

At around the same time as A. Chatterjee, H. Leder proposed another flowchart. He approached the modeling of perception and aesthetic appraisal from a very different angle. In order to do this, he proposed a very precise categorization of the operations carried out by the operator, choosing a psychological model rather than a physiological one (Leder et al. 2004).
Figure 2.10. A model of aesthetic perception by H. Leder (from Leder et al. (2004))
COMMENT ON FIGURE 2.10.– In this model, we identify two distinct outputs: one corresponds to the aesthetic appraisal, and the other to the aesthetic emotion. According to the authors, both operations occur in parallel, following distinct circuits, and therefore neither can be reduced to the other. However, they consider that aesthetic pleasure (at the heart of the emotion) is greater when there is greater understanding of the artwork (one of the major points in the appraisal). The study was carried out on contemporary artworks, and it is not clear whether the same arguments can be applied to photographs. Further, the model focuses on the aesthetic appraisal of artworks, a task that probably uses more cognitive and semantic functions (recognition of style, recognition of the artist) than evaluating photographs. Nonetheless, this model offers us an interesting framework for our work.

According to Leder et al., the mechanism is divided into five steps (Figure 2.10): the first two steps are essentially concerned with the visual pathways and are therefore primarily reflexes (for our purposes, we can therefore call them “objectivist”), the next three steps are the recognition of the content, the global interpretation and the appraisal. These steps are cognitive and may be called “subjectivist”: they use our domain expertise, centers of interest, tastes and also elements of the context in which the experience occurs, and which are very important in assessing a work of art (the ambiance in a museum, or the “signature effect” mentioned earlier). The broad principles in Figure 2.10 could certainly inspire a slightly different scheme that would be appropriate for photography. As in Chatterjee’s work, H. Leder proposes that the outputs of this scheme are, first, an emotion and, second, an aesthetic appraisal.
However, both of these results are developed simultaneously in separate circuits throughout an affective evaluation, which progressively brings in elements of expertise in art, and results from prior experience, personal taste and interest. Finally, H. Leder says that there is also the involvement of a context specific to observation in a museum, which acts outside the perceptive circuits only on aesthetic judgment and not on emotion. Leder’s model was complemented and slightly modified in Leder and Nadal (2014), especially to better account for the role emotion plays in aesthetic appraisal.

2.4.5. The model by C. Redies

C. Redies proposed a model of aesthetic judgment that combines the perceptive and cognitive aspects within the same signal flows, but separating them based on two worlds (see Figure 2.11): one external (the image signal as well as the context of the observation), and the other internal, bringing together all biological functions, whether perceptive or cognitive (Redies 2015). In this internal world, the flow processed by the primary visual pathways is directed toward the zones of perception and provides elements for making a decision on beauty, as well as contributing to the establishment
of the emotional response (the upper part of the flow chart). When sent toward the regions in charge of cognitive coding and memory, a contextual response is developed through personal filtering, which also contributes to the emotional response (the lower part of the flowchart). In the long term, this filtering is the result of exposure to the observer’s cultural environment (learning, experience), unlike the world of external information, which is subject to the influence of the observer’s immediate context (ambiance, mood, etc.).
Figure 2.11. Functional model by C. Redies. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 2.11.– This diagram distinguishes between an external world and an internal world. The external world consists of the observed object, which is already endowed with its perceptive attributes (including beauty) and the context of its observation (for instance, in a museum). The internal world separates the processing of the signal (coding, processing) in the upper part from the cognitive processing in the lower part. The emotion results from both streams of processing. The result is an “aesthetic experience”, which includes both emotional and cognitive dimensions (justification, explanation) (adapted from Redies (2015)).

Emotion shares both data streams, perceptual and cognitive, which differentiates this model from the previous one, which does not consider the cognitive steps to have any influence on the emotion. Finally, the aesthetic experience here is the output of
this system, which does not explicitly view aesthetic judgment as being a product of observation.

2.4.6. The emotions model developed by S. Koelsch et al.

This model is not limited to the field of art. It aims to cover the entire field of emotions (Koelsch et al. 2015), especially human emotion, with its very strong and complex social component. It is based on a functional division of the brain into four affect systems (which is why the model is also called the quartet model), which regulate affective activity and emerged progressively over the course of biological evolution: the orbitofrontal areas, hippocampus, diencephalon and brainstem (see Figure 2.12). The authors give a very detailed description of the brain regions associated with each of these four affect systems, their role and their action, as well as their contribution to the development of the emotion. They distribute many psychological concepts over this area, thereby delegating these concepts to different affect systems: learning and the memorization of emotions, emotional satiety, cognitive complexity, subjective emotions, degree of consciousness, etc. They finally demonstrate how this emotive perception enriches expression and language and how these, in turn, function as regulators of emotions.

2.4.7. L.H. Hsu’s model of emotions based on A. Damásio

The model proposed in Hsu (2009) is inspired by A. Damásio’s (1994) theories. It takes a philosophical view of the mind and breaks down the processing of an artwork into two complementary flows (the double-intentional-object model, see Figure 2.13) rather than four, as in the previous model. It is based on a functional scheme of the brain which differentiates the limbic path (which transmits emotions) from the neocortex pathway (which leads to reflective thinking). This limbic-neocortex distinction is discussed (and criticized) in LeDoux (2000). In Hsu’s approach, a proprioceptive flow creates/modifies the observer’s emotional state.
It acts reflexively, with very short response times. The second flow constructs our cognitive response to the stimulus, feeding the peripheral zones of the neocortex. Based on Damásio’s interpretations, the two flows do not merge, but are juxtaposed: the development of a thought formulated along the cognitive pathways is overlaid on the evolution of our perception. The mental image that we form through the interpretation of the image, the association with the contents of our memory and the deduction of our reasoning, etc., is influenced by our somatic state through the intermediary of neurochemical transmitters from the limbic system. L.H. Hsu explains the construction of our experience when faced with such an artwork in the following way:
Figure 2.12. The functional model (called the quartet model) by S. Koelsch et al. The functional system is made up of four active centers (called affect systems): one centered on the orbitofrontal regions, another around the hippocampus, the third around the diencephalon, and the last around the brainstem (on the left). The role of the linguistic system is also highlighted in this model (from Koelsch et al. (2015)). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
Emotions based on the image of the body are not parasites on reason, but the conditions for consciousness. They play an active role in cognition, memorization and decision-making. By preparing the organism for action and by reducing the complexity of the problems encountered, they form the foundation for reason to function. (Hsu 2009, p. 112)
Figure 2.13. Double-intentional object model by L.H. Hsu
COMMENT ON FIGURE 2.13.– The exteroceptive stimulus (through the intermediary of the visual pathways) makes up the input signal, which will distribute the information toward a representation of the external world (the physical perception of the object), and toward a proprioceptive representation (modification in the observer’s state), which will express the emotions experienced upon seeing the object. The sub-cortical pathway, toward the limbic system (hippocampus, amygdala, cingulate gyrus, hypothalamus), is short, while the other pathway, to the neocortex (toward the occipital visual areas, as well as the prefrontal and orbitofrontal zones), is long. This is the cognitive pathway that is likely to be associated with an appraisal (adapted from Hsu (2009)).
Consequently, according to Damásio, the action of observing an artwork has two distinct objectives: first, being conscious of a visible, intentional exteroceptive object, that is observable through our senses; second, the consciousness of an implicit, intentional interoceptive object, within the observer, observable as a feeling of self. This second, internal object reflects the observer as the “arousal” reflects the observed object. It is an expression by the observer “enlightened” by the object. This second object is Damásio’s original contribution to the philosophy of Beauty9 and seems to have been ignored by earlier work, which focused chiefly on the relationship between the object and exteroceptive consciousness. We thus read in Damásio:

The sensory images that you perceive externally, as well as the related internal image that you recall, may take up most of the extent of your mind – but certainly not all of it. In addition to these images, there is also the presence that represents you, as the observer of the things in the images, the owner of the things in the images, the potential actor on the things in the image. There is your presence, in a particular relationship with a certain object. (Damásio (1999), cited in Hsu (2009))

2.4.8. Other models

There are several other models that are more or less well-formulated, and with differing levels of elaborateness. For example, in section 3.3.3, we will look at the model proposed by Thumfart et al., a perceptual model of textures that relates psychophysiological measurements to the perception of textures at three levels: an affective level, a level of judgment and an emotional level. This kind of model will be useful for aesthetic appraisal. In Vartanian and Skov (2014) and Kirsch et al. (2016), we find more complete schema that attempt to illustrate the network of activations when exposed to a work of art, associating the three main functions: cognition, emotion and sensorimotor control.
However, these are still very speculative.

9 In some aspects, Damásio’s approach can be interpreted in terms of affordance, that is, an immediate and non-mediated functional perception, proposed in J.J. Gibson’s “ecological theory of perception” (Gibson 1986). Gibson also referred to a double perception pathway: one proprioceptive and the other exteroceptive.

J.P. Changeux, on the other hand, proposes a near-complete scheme for the progression of signals from their entry into the visual system until they reach consciousness, beyond the “inferotemporal cortex and parietal lobe” (Changeux 2016, pp. 166–173). This diagram gives a good representation of the points that guide the signal’s various components (color, contours, background/form, faces, etc.)
toward specialized regions and then the engagement of the regions in charge of memory, meaning, emotions, etc., and spans the functions of conscious observation. The model is less detailed about the access to “the global neuronal workspace of consciousness, which is based on a neuronal network, whose long-distance axons connect the prefrontal, parieto-temporal and cingulate regions...the ignition [of which] brings in a top-down feedback loop [...] This ignition is likely to be the objective sign of the aesthetic efficacy of the artwork” (Changeux 2016, p. 169).

We effectively have here, well marked out, the very objective that we defined at the beginning of this chapter, but in a form that is too conceptual and hypothetical to be definitively convincing. We must make further progress in our abilities to study brain activities to verify these reflections. This cannot happen unless there are significant advances in the sensitivity and speed of detection by measuring instruments (EEG, MEG), and it is possible to envisage that the advent of better-performing tools will then open up new avenues of thought that could lead to other interpretations.

2.5. A critique of neuroaesthetic methods

Despite the considerable significance of work dedicated to the neural bases of aesthetics, especially the neuroaesthetic approach, and despite the considerable support this has received in the scientific world (Changeux 2008, 2016; Le Bihan 2012; Boulez et al. 2014), there are also important criticisms leveled at this approach. These criticisms, in general, are directed at all brain imaging approaches (Uttal 2002; Dupont 2015), addressing both the objective and the methods. For neuroaesthetics, we find the criticisms collected in Vidal (2011, 2012), and find some rebuttals to these in Pearce et al. (2016).

2.5.1. Criticism of neuroaesthetic methods

Where method is concerned, neuroaesthetics is accused (not wrongly) of highlighting “correlations” (zone X is activated when stimulus Y is presented) and often hastily deriving, from this, functional models that support causalities. Further, brain functions are distributed in the cortex and, simultaneously, different regions are engaged for different kinds of processing. An a posteriori analysis of brain activity maps ignores the diversity of the underlying mechanisms and attempts to look for the most concise explanation between the object and aesthetic appraisal, leading to succinct interpretations such as, “Beauty is a quality of objects that is correlated with activity in the medial prefrontal cortex through the intermediary of our senses” (Ishizu and Zeki 2011). It is clear that the evolution of observation techniques, especially the emergence of methods that make it possible to showcase exchanges between brain areas, will push back against this criticism in the near future.
Another criticism reproaches neuroaesthetics for working in a state of urgency. Many studies seem to poorly define the objective of their experiments and confound or combine different concepts: beauty, art, interest, innovation or singularity, etc. Experiments often ignore determining dimensions: culture, education, social acceptability, etc. They are unable to identify these in the protocols and during the interpretations and cannot distribute them well over the activated brain regions. These dimensions are components that have been identified, for many years now, by social sciences, ecology and psychology. If neurologists fail to take them into account properly, it is not because they are unaware of them, but because they consider (without demonstrating this) that they are included in their process, or because they believe they are too complex to be properly processed and, therefore, prefer to exclude them altogether from the scope of their work.

2.5.2. Criticisms of the objectives of neuroaesthetics

Criticisms of the objectives of neuroaesthetic research are more fundamental. These very radical criticisms are given in Brown and Dissanayake (2009), taking into account the social evolution related to the contemporary multicultural context. They contrast the interpretation of the term aesthetics that arose during the Enlightenment, and that is attached to the emotional response that results from the contemplation of artworks (which is what neuroaesthetics studies), with the historical meaning associated with the value system related to the appreciation of beauty, a meaning that is especially seen in Darwin’s work, where it applies to human animals.
More fundamentally, we cannot ignore the criticisms of the philosophical premises of neuroaesthetic processes that are based on the foundations laid by experimental subjectivists in the 18th century: “nothing is innate, thus any consciousness from perception is the result of biological activation produced in the observer”. To which H. Bergson responded, “intelligence is inept at giving us the reality of things, as it mortifies them and only delivers a hollow scheme of these things” (Bergson 2009). M. Merleau-Ponty adds to this, saying:

It is inevitable that in its general effort to objectivize, Science sees a human being as a physical system in the presence of stimuli, which are themselves defined by their physico-chemical properties, and strives to use this as the basis to reconstruct the effective perception and close the cycle of scientific knowledge by revealing the laws according to which knowledge itself is produced, by creating an objective science of subjectivity. However, it is also inevitable that this attempt will fail. (Merleau-Ponty 1945)
Indeed, as P. Huneman and E. Kulich say: It (philosophy) is not a philosophy of universal reason, but shows a “transcendental field”, that is, a space where there is an inter-dependence between the objects and the meditating subject, thus rendering the subject/object separation null (this is only valid through the independence and abstract characteristic of the subject in philosophy) (Huneman and Kulich 1997). In the appendices, we will see how Asiatic cultures were constructed around these concepts such that the very concept of aesthetics no longer has any meaning.
3 What Are the Criteria For a Beautiful Photo?
The value of certain colours are emphasized by certain forms and dulled by others. In any event, sharp colours sound stronger in sharp forms (for example, yellow in a triangle). Those inclined to be deep are intensified by round forms (for example blue in a circle). On the other hand, if a form does not fit the colour, the conjunction should not be considered “inharmonious”, but rather as a new possibility and, therefore, as harmony. Wassily KANDINSKY (1954)
Let us set aside both philosophy and neurobiology for the moment and look at texts that offer practical advice on how to obtain beautiful photos. Wherever possible, when the authors themselves do not give reasons for their recommendations, we will try to justify the strategies they suggest by using the theories we have just discussed in the previous chapters. As far as possible, we will verify this on the ground, either through tests in experimental psychology (where the effect of such-and-such a parameter is measured across a cohort of observers) or through measurements on several collections of photographs, from which we will derive statistical results that make it possible to distinguish photos that are said to be beautiful from those that are more ordinary.
3.1. Before we enter into the fray

3.1.1. What reference books do we have?

We are spoilt for choice when it comes to guides that advise us on art and how to take good photographs. There are dozens of books like these and they are meant for beginners as well as professionals. Their content swings between two extremes:

– on the one hand, highly technical books that are based on the workings of a camera to indicate the role that each setting plays in the final image: f-number, focal length, shutter speed, sensitivity, etc. They explain how to capture motion or, on the other hand, how to track it over time, how to increase the depth of field, how to arrange lighting and so on;

– at the other end of the spectrum, we have books on the aesthetics of photography, which train the photographer’s eye to capture certain messages and which resemble essays on aesthetics in painting. That is, they seek to instil in photographers a certain sensitivity that will allow them to capture an interesting shot in a given scene. These books explain why a landscape must be muted or sharpened, when to introduce sadness or melancholy, how to create an ambiance, suggest an interpretation, direct the observer’s gaze or misdirect it.

It is in the latter family of books that we will find the most significant aesthetic advice. However, this advice is very rarely described in terms that can be translated into rules. Consider, for example, M. Freeman’s book1 (2018), where, in the introduction, he lays out a short list of qualities that an image must possess to be beautiful:

1) a photo is clear;
2) it stimulates and provokes;
3) it operates on several levels;
4) it conforms to the cultural context;
5) it contains an idea;
6) it corresponds to the media.

We can all accept the relevance of these recommendations for any photographer, or debate them, or compare them with other such recommendations from other authors.
1 Michael Freeman is a professional photographer who is famous for his photographs of open spaces. He has also published many books on photography, which are set apart by the fact that he strives to associate aesthetic intent with precise rules for composition or photographic setting. In doing so, he thus bridges the chasm between the two extremes cited here.

However, it is still difficult to transform them into objective, operational criteria for aesthetic measurements of a photo for a machine that has neither “ideas” nor a “cultural context”, which does not know what “levels” denote, and which is not
sensitive to “stimulation” or “provocation”, and which knows of no other “reading” than compatibility with a programming language. In short, these rules are meant for humans endowed with senses and not for a computer. We cannot use these as a basis to derive algorithms that measure the aesthetic qualities of a photo, although this in no way detracts from their value.

3.1.2. “Beauty of an image” or “quality of an image”?

At this point, it is important to spend a while on the adjacent concepts of “quality of an image” and “beauty of an image”, which we wish to clearly distinguish. The quality of a photo is a property that has been widely studied for over 50 years. It has accompanied the socioeconomic rise of electronic images, from the point an image is captured to its transmission, whether for photographs (tele-detection in medicine) or moving images (television, video-surveillance, etc.) (Maître 2017, Chapter 6). The quality of an image is a property that covers specific technical attributes of the signal:

1) its contrast or dynamic range;
2) the range of its chromatic palette;
3) the focus and precision of the details;
4) the richness of textured zones;
5) the subtleties of the models over zones that have small variability;
6) the absence of noise on uniform fields;
7) the absence of any perceptible geometric distortions of the linear contours and deformations in the field.

These properties allow us to prefer one mode of transmission, coding or display over another, and to this end algorithms have been developed to measure the quality of images: Moorthy and Bovik (2011), Charrier et al. (2012), Chandler (2013), Mittal et al. (2013). However, these algorithms do not study the aesthetic properties of the image, and an image can be of high quality without being beautiful (see Figure 3.1(a)). In this chapter, we will see that a photograph can, conversely, be beautiful without meeting all the requirements listed above.
It may even be that none of the criteria for quality are met but an image is still judged to be very beautiful (Figure 3.1(c)). Beauty and quality, when applied to photography, are two very different concepts. While they often go together (beautiful photos are often high-quality images), the converse is not true at all. Nonetheless, we will also see that the proximity of these two concepts has led to some authors using very similar tools to address both problems (Talebi and Milanfar 2017).
Figure 3.1. “Quality” and “beauty” of a photograph, panels (a) to (c). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.1.– Photo (a) meets all the criteria for a high-quality image but is obviously not a beautiful photo. Photo (b) (Triple falls in Glacier National Park, Montana by Sean Bagshaw) is similarly of high quality: resolution, focus, chromatic palette, nuances, textures. Unlike the previous photo, it also has aesthetic properties. Moreover, it has been lauded for its beauty in the highly technophile Pinterest community. Nevertheless, it is not a given that all experts will vote for its aesthetic qualities. Conversely, in photo (c) (Untitled by Seydou Keïta) the quality is quite mediocre, with respect to the framing as well as the way it was archived, which considerably damaged its technical quality. And yet, in professional photography circles it has won undeniable recognition for its aesthetic.

3.1.3. A glossary of aesthetic appraisal

Let us review the vocabulary used to describe aesthetic qualities and let us note that the various terms used for this are distinguished to a great extent by their semantics. However, this observation, made in the middle of the last century, seems to have had little impact on modern approaches. With new techniques that use large volumes of information, it may be desirable to make use of these differences. Many long debates (Elton 1954; Sibley 1959; Kivy 1968; Iseminger 1981) have been conducted within what is called (by common consensus) “analytic aesthetics” to define the language used in aesthetic critiques. This subject has been briefly discussed in section 1.1.4.
This debate started from the observation that while certain descriptions resulted directly from perception through our senses: red, contrasted, sinuous, etc., others were indirectly obtained from the observation of an artwork: graceful, dynamic, sad, etc. These terms were called aesthetic terms by researchers who were trying to define how these terms were developed by the observer, since words of this kind often form the framework for the explanation given by an expert when justifying their judgment. Some researchers believe that these “aesthetic” qualifiers are directly derived from non-aesthetic characteristics, although this is a complex process. A term such as rectangular can be logically deduced from a few simple rules (“right angles”, “parallel sides”). Even a more complex adjective, such as intelligent, may be verified through a large number of sufficient conditions (“holds several degrees”, “plays chess well”, “speaks many languages”, “is able to respond well”, etc.). On the contrary, there does not appear to be any battery of tests that allows us to decide whether a photo is banal or romantic2. Moving from these non-aesthetic properties, which can be shared by everyone, to the aesthetic properties that are born out of them is a burning question in philosophy, and there are many controversies over how this transition takes place. For Cometti et al. (2000) and Fabrizio (2015), following ideas put forth by Levinson (1980), the aesthetic properties “supervene” (as used by Davidson3) on the non-aesthetic properties4. Other philosophers believe that the “aesthetic” elements are processed separately by vision or hearing through a sense that may be distinct, “taste”5, posing new problems that we will not discuss here (Kivy 1968). Other philosophical questions arise: are aesthetic qualities evaluative6?
2 “Though on seeing the picture we might say, and rightly, that it is delicate or serene or restful or sickly or insipid, no description in non-aesthetic terms permits us to claim that these or any other aesthetic terms must undeniably apply to it” (Sibley 1959).
3 Davidson’s “supervenience” has been briefly discussed in footnote 31 in Chapter 1.
4 “Aesthetic properties supervene on non-aesthetic properties: there is no aesthetic difference without a non-aesthetic difference. Basically, aesthetic properties are likely to be supervenient as well as emerging: it may not necessarily be possible to reduce them to non-aesthetic, physical and perceptible properties, however they are likely to depend on these and co-vary with them, i.e., they are attached and fixed to these [non-aesthetic] properties” (Cometti et al. 2000).
5 This taste must be distinguished from the sense that identifies flavors and is thought to be common to all senses. It must be noted that, to date, functional neurology has not found any evidence to support this hypothesis.
6 The concept of evaluative judgment is a complex one (Tappolet 2000). First of all, it questions the transition from the experience to the object that produced this experience: if “this spider is disgusting!”, what part of this arises from my own disgust, and how much of it can be attributed to the spider itself? We will not discuss this point further here, as it is discussed quite a bit in the rest of the book, but let us look at the quantitative aspect of judgment. Is there a gradation in the value of the judgment? Can we measure the spider’s “disgustingness”?

Yes, we do want this to be the case in order to recognize that a photo is more or less
“beautiful” and we follow in Mikel Dufrenne’s7 steps here: “...beauty is opposed to ugliness and appears to be measurable in degrees: we can speak till we’re blue in the face about something being more or less beautiful” (Dufrenne 1980). But what does “evaluative” mean for a “melancholic” photo? Many qualities are metaphorical (“dramatic”, “romantic”, etc.). How do you imagine the semantic depths here? What must be added to the photograph to make it “romantic”? Is there any configuration of pixels that could mirror the arrangement of notes in Chopin’s Nocturne or the words in Novalis’ work? Is there any “aesthetic property” that exists in the absence of the observer’s culture, their education and their temperament? And finally: is there any “aesthetic property” without the observer? As E. Gilson8 so pertinently asks: what happens to the aesthetic properties of a painting when the museum doors close behind the last visitor? We now fully enter into our debate, which will use terms that are hardly new. Let us replace “non-aesthetic properties” by “computer-assisted measurement” and we have the same set of questions. We will, however, simplify them a little as all we wish to find is an “aesthetic value” rather than exploring the full richness of “aesthetic properties” that critics have proposed. If we use new vocabulary and note that the artwork is projected onto the canvas along perfectly objective lines, and thus objectivist lines as used in the philosophy of perception, is it then possible to use a logical chain (carried out by the computer, here) to derive an evaluation of the aesthetic quality of the work? The process, which we will discuss in the following sections, aims to create an inventory of all the “objective” cues that lead to aesthetic statements. In the following chapters, we will see that it is then possible to use these cues, refine them or distil them, to arrive at an “aesthetic appraisal” using a computer. 3.1.4. 
Measuring beauty Let us now return to a debate that arose at the beginning of the last century, whose objective was to describe this property that we call “Beauty” in an “evaluative” manner and in a post-Kantian framework, where Beauty is perceived subjectively and gives rise to an attested behavioral response at the end of the perceptual chain. The measurement can be carried out in various ways (Palmer et al. 2013), for instance, through physiological measurements (skin conduction, event-related potential, fMRI), although it is more often done through interviews conducted using one of the three following modalities: 7 Mikel Dufrenne is a 20th-century philosopher, whose books: Aesthetics and Philosophy and The Phenomenology of Aesthetic Experience anchored philosophy in the phenomenological current. 8 Quoted in Cometti et al. (2000).
– a two-alternative forced-choice comparison, a combination method that requires experts to put in a lot of time; – classification of samples, which is quick for small sets, but becomes quite complicated for many samples; – awarding a score to each sample, a technique that is often adopted by online questionnaires as it is very flexible, but which presupposes the existence and common knowledge of the same scale among different experts. 3.1.4.1. The dimension of Beauty The earliest research considered the space in which Beauty evolves. Is it a scalar quantity? Does it have several independent components, several dimensions? This problem was taken up using experimental psychology tools that were already outdated (Dewar 1938; Eysenck 1939) in order to determine a global measurement scale for aesthetic appraisal and the “dimension” of aesthetic judgment. This research was based on a collection of artwork that was arranged in order by experimenters who were rather new to this and through the cross-examination of their results. The chosen artworks were reproductions of paintings, vases, flowers, etc., which were to be ranked, within the same theme, based only on their aesthetic qualities. The statistical study was carried out on over 300 observers. This process made it possible to reduce each observer’s individual specificities by only examining the relationships between the works of art. The authors concluded, “that a single general factor of artistic taste is responsible for artistic classification” (Dewar 1938). This factor was thus compared to “Beauty”9 and the dimension 1 attributed to it. The second study added some nuance to this result by showing that a second factor played a role if abstract documents were introduced into the test (drawings, posters, embroidery). This second factor had scattered results along an axis going from the formal to the representative (Eysenck 1939) and the authors concluded that this was then dimension 2. 
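The idea that a single shared factor drives the rankings can be illustrated with a much cruder check than the factor analyses reported in the original studies: if observers largely agree, the rank correlations between their orderings should be high. The following is only a minimal sketch under that assumption. It uses Spearman's rank correlation for rankings without ties; the function names and the rankings are invented for illustration and do not come from Dewar (1938) or Eysenck (1939).

```python
from itertools import combinations


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties.

    rank_a[i] and rank_b[i] are the ranks (1 = most beautiful) that two
    observers gave to the same item i.
    """
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


def mean_agreement(rankings):
    """Average Spearman correlation over all pairs of observers."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


# Invented rankings of five reproductions by three hypothetical observers.
observers = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 4, 5],
    [1, 3, 2, 5, 4],
]
agreement = mean_agreement(observers)
print(f"mean pairwise agreement: {agreement:.3f}")
```

A mean agreement close to 1 would be compatible with one dominant factor of taste, whereas values near 0 would suggest that observers rank along different criteria. It says nothing about how many dimensions are involved, which is what factor analysis is designed to estimate.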
9 It must be noted here that this concept is faithful to the objectivist idea of a unique and universal beauty. On the other hand, it squarely opposes the subjectivist interpretation put forth by Xenophon, for instance, who was a Sophist in the Socratic school and who records Socrates as saying, “A beautiful runner is unlike a beautiful wrestler and a beautiful shield is unlike a beautiful javelin, and it cannot be otherwise because a shield is beautiful when it provides good protection, and a javelin when it is capable of being thrown swiftly”, Xenophon, Commentarii, III, 8, 4, quoted by Tatarkiewicz (1970).

Forty years later, this research was revisited with the development of tests that were said to measure only aesthetic sensitivity, excluding the effects of age, sex, education, ethnicity or training (Goetz et al. 1979). VAST (visual aesthetic sensitivity test) is made up of binary drawings (Figure 3.2) drawn by artists and then intentionally modified to degrade their appearance. After unanimous validation by eight other artists, the pairs were presented to the observer who had to choose the
most aesthetic design. While the statistical results offered quite a reasonable robustness with respect to the variabilities in the sampling of the observers, they did not highlight any specific ability for aesthetic judgment, which was the objective of the test (Eysenck 1983).
Figure 3.2. The figures used in VAST for aesthetic measurements
COMMENT ON FIGURE 3.2.– The observer was asked to choose the most beautiful design from top to bottom. These tests are supposed to vary in difficulty, the shapes on the left being the easiest while the shapes in the center are the most difficult (from Eysenck (1983)). The statistical results from large populations do not lead to very convincing conclusions about the possibility of using these tests to identify a specific sensitivity to aesthetic qualities.

3.1.4.2. A scale for beauty

Most studies in experimental psychology are based on the sincerely held opinions of the experimenter and on a shared “common sense” between them, assumed to be universal (in line with I. Kant’s thinking). However, recent research has highlighted the difficulty in establishing an aesthetic opinion without having an operational measure for what beauty is. It is generally suggested that we use hedonic pleasure as the intermediate variable, as it expresses the emotion experienced by the observer and it may therefore directly reflect the degree of beauty. In the field of visual arts, this research is still finding its feet and the results are not unanimous (Augustin et al. 2011; Blijlevens et al. 2017; Hayn-Leichsenring et al. 2017; Augustin et al. 2018); there has been greater success with industrial design and the designing of human–machine interfaces, fields where it is possible to measure other aspects related to the functional utility.
3.2. Composition Composition is the choice of how to arrange the various elements of interest within the photograph, their individual position and their relative position. It guides the viewer’s attention to the elements that make up the scene. It has always been one of the major criteria in appreciating artwork. The Renaissance in particular has given us remarkable examples of complex constructs based on Pythagorean rules of harmony (Bouleau 2014; Crettez 2017) and some of these principles are still included today in guidelines for photography, as we will see here. Photography manuals give them significant place in their recommendations (Freeman 2018), however they rarely adopt the Greek arguments for harmony, expressing these rules in terms that are more influenced by Romanticism. We speak of “atmosphere”, “ambiance”, “softness” and so on, and prioritize the observer’s sensibilities. 3.2.1. Complexity versus simplicity Among the many attributes of composition, the one cited most often in the context of measuring beauty is the complexity of the scene. D. Berlyne defines this using the number of distinct elements involved, their semantic heterogeneity, the irregularity of their contours, their colors and how they are arranged, and the asymmetry and incongruity of the shapes (Berlyne 1971). Using shapes that were quite simple, he confirmed through test observers that we generally prefer moderate complexity (an inverse U-shaped curve, which goes by his name in aesthetics and which expresses that we set aside images that are too simple as well as overly complex). Berlyne’s conclusions are often followed today, however a long tradition, dating back to Ancient Greece, considers complexity to always be detrimental and G. Birkhoff (in his algorithmic approach, which will be explored in detail in section 4.2) seeks to minimize complexity. Complexity was at the heart of “entropic” approaches (A. Moles, M. Bense), which related complexity to Shannon’s information theory. 
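To make the link with Shannon's information theory concrete, one very simple (and admittedly crude) proxy is the entropy of an image's grey-level histogram. The sketch below is only an illustration under that assumption: it is not the measure used by Moles or Bense, the function name is ours, and the tiny "images" are invented lists of grey levels. Because a histogram ignores where the pixels are, it captures tonal variety only and says nothing about spatial arrangement.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Entropy, in bits per pixel, of the grey-level histogram."""
    total = len(pixels)
    if total == 0:
        raise ValueError("empty image")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the grey levels that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat = [128] * 16        # uniform patch: a single grey level
checker = [0, 255] * 8   # two equally frequent grey levels
ramp = list(range(16))   # sixteen distinct grey levels

for name, img in [("flat", flat), ("checker", checker), ("ramp", ramp)]:
    print(name, shannon_entropy(img))
```

With such a measure, the uniform patch carries no information per pixel, while the ramp carries the most. Where a "moderately complex" image should sit on this scale is exactly what the experimental studies try to establish.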
Later on, the algorithmic theory of complexity developed by Chaitin, Kolmogorov and Solomonoff, based on the length of the description, or the length of the shortest program that could describe the image, was also applied, for example, in J. Schmidhuber’s “lazy brain” theory, which will also be studied in Chapter 4. These theories incline toward very low complexity (Schmidhuber 1997). The influence of complexity on aesthetic appreciation thus remains a controversial topic, but it nonetheless appears to be at the heart of aesthetic appraisal, often counterbalanced by the concept of order. This is still the case in the aesthetics of digital works (Hoenig 2005). The significant elements of this complexity that have an impact on the aesthetic have been studied through multiple works and summarized in Nadal et al. (2010). It is chiefly the number of elements that affects the aesthetic, while their diversity in color,
variety in shapes and how they are distributed seem to be less important. The number of elements is therefore the first factor that determines complexity. The second is what the authors call the unintelligibility of the scene, while the third is disorganization. Ichikawa (1985) states that complexity is constructed from two processes: one is the result of a quick viewing and is quite universal (number, shape and spatial distribution of the components), while the second is a cognitive interpretation that depends on the observer (their age, education and the attention they pay to the image). Both these processes can be expressed through two terms which, when suitably weighted, linearly define the experienced complexity of the image.

3.2.2. Unity

Unity is the relationship that emerges from the various parts of the scene when harmony is created between the different local visual characteristics. A photo has the quality of unity if the various components carry the same perceptual properties: contrast, finesse in details, focus, color palette, etc. Different studies (Bell et al. 1991; Veryzer and Hutchinson 1998) have shed light on the role that unity plays in how a scene is appraised. These studies, carried out on many observers through evaluation tests, suggest that aesthetically attractive objects are distinguished by strong unity, while objects that receive weaker aesthetic opinions often present lower unity. Furthermore, aesthetic properties seem to benefit from superadditivity, that is, the whole is greater than the sum of the parts. Thus, if we consistently improve two components of an object, the overall gain is greater than the sum of the two basic gains (Veryzer and Hutchinson 1998). Nonetheless, these properties could only be rigorously established for simple drawings and it is not certain that this will hold for photos, that is, that the aesthetic appraisal of a complex scene will be greater than the sum of the aesthetic appraisals of its components.

3.2.3.
A specific case in composition: landscapes

The composition of a scene is an oft-discussed item in photography manuals. Curiously, we find few systematic analyses of the general guidelines given for how to successfully compose a scene. However, this side of things has been well covered in solid psychosociological studies for images of landscapes alone (Kaplan et al. 1972; Hunter and Askarinejad 2015). In Kaplan et al. (1972), it has been shown that people prefer photographs whose composition is simple and that the apparent complexity is a negative factor in the appraisal of these photographs. It has also been concluded that nature photographs are generally and clearly preferred to photographs of urban sites. It is seen that urban photos are considered to have greater complexity. However, the difference in appraisal does not seem to be due to this difference in complexity, but the difference
in content (a highly semantic criterion). More recent studies contradict the conclusion that images of nature have lower complexity, but concur on the conclusion that there is a preference for nature versus artificial, which they explain through the predominance of the semantic criterion over the visual criteria (Kotabe et al. 2017). Hunter and Askarinejad (2015) propose a systematic inventory of the components of landscape photos that contribute to aesthetic appraisals in later studies of the landscapes. The study listed 62 indicators that contributed to the aesthetic and many studies were based on some of these primitives to analyze our preferences (Kardan et al. 2017; Ibarra et al. 2017; Meidenbauer et al. 2019). We will discuss three of these briefly, and they have also been illustrated in Figure 3.3, as we find them particularly representative of composition: – framing: it is often recommended that a scene should be delineated by elements that guide the gaze toward the center of the view. This frame, often made up by foliage or trees, often in the foreground, may materialize only on some sides. Frames with three or four borders marked out in this way (configurations 8 to 11 in Figure 3.3) are preferred, as they give the observer an impression of greater safety and better continuity in the scene in the background; – perspective: Hunter proposes seven different types of perspective based on the presence and position of vanishing points. Forced perspective (image 5 in Figure 3.3) is preferred in tests because, according to Kaplan et al. (1972), it lends an air of mystery to the images; – scenography: Hunter used this term to designate the observer’s relative position with respect to the scene and identified seven different configurations. The use of foregrounding (transition from image 3 to image 4 in Figure 3.3) is highly recommended in essays on photography. The influence this has on photographs of mountains has been statistically verified in Schirpke et al. 
(2013). As concerns photographs of undergrowth, in Herzog and Bryce (2007), participants in a survey on such photographs validated the use of long focal distances rather than short distances (thus, narrow angles rather than wide ones), as these preserve an air of mystery that is appreciated. We find very similar opinions in many photography manuals. In section 3.5.1, we will look more closely at other results from these studies on environmental aesthetics, which shed light on how we evaluate these images.

COMMENT ON FIGURE 3.3.– Landscapes form a specific category of photographs whose composition has been systematically studied from an aesthetic point of view. From the systematic analysis carried out here, we have chosen three important aspects of this composition: (a) framing, (b) geometry of perspective and (c) the scenography. In each case, it has been statistically verified that these configurations are given priority (images 8–11 for framing, 5 (forced perspective) for perspective, and images 3 and 4 for the scenography).
Figure 3.3. Elements of the aesthetic judgments of landscapes, from Hunter and Askarinejad (2015)
3.2.4. Using oculometry to analyze composition

The impact that a photograph’s composition has on an observer may be analyzed using oculometry techniques, which track eye movements as the eye studies an image, recording the sequence of its visits and the duration of each visit (Yarbus 1967) (see Figure 3.5). F. Molnar was thus able to differentiate between “good” and “poor” composition by studying these movements (good composition leads to fewer transitions from one site to another and longer periods of fixation (Molnar 1977)). He also noted differences between experts and non-experts in how the eyes travel across an image (Molnar 1981; Nodine et al. 1993). From this, he concluded that the appraisal of a work of art and the aesthetic judgment on this artwork are modified by learning. This interpretation has been contested by Rosenberg and Klein (2015), who link the observed differences to differences in motivation between observers.
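The two quantities that Molnar compares, the number of transitions between sites and the length of fixations, are easy to extract once an eye-tracking recording has been reduced to a list of fixations. The sketch below assumes such a list is available. The `Fixation` record, the region labels and the durations are all hypothetical and serve only to show the computation; they are not taken from any of the studies cited.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    region: str         # label of the area of interest being looked at
    duration_ms: float  # how long the gaze rested there


def scanpath_summary(fixations):
    """Return (number of region-to-region transitions, mean fixation duration)."""
    if not fixations:
        raise ValueError("no fixations recorded")
    # A transition is counted each time two consecutive fixations
    # fall in different regions.
    transitions = sum(
        1
        for prev, cur in zip(fixations, fixations[1:])
        if prev.region != cur.region
    )
    mean_duration = sum(f.duration_ms for f in fixations) / len(fixations)
    return transitions, mean_duration


# Two invented recordings on the same image.
path_a = [Fixation("face", 400), Fixation("face", 350), Fixation("hands", 450)]
path_b = [
    Fixation("face", 150),
    Fixation("door", 120),
    Fixation("face", 130),
    Fixation("window", 100),
]

print(scanpath_summary(path_a))
print(scanpath_summary(path_b))
```

Under Molnar's reading, a recording like `path_a` (few transitions, long fixations) would point toward a "good" composition and `path_b` toward a "poor" one, although, as noted above, observer motivation could produce the same differences.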
Figure 3.4. Several effects listed in this text come together in this landscape. The particularly elongated format guides the gaze in exploring the seascape. The cloudy bar at the top and the jetty at the bottom frame the visual space. The lighthouse is foregrounded. The gradation in hues from dark shades in the foreground to lighter ones further back reinforce depth and draw the eye to the fine details of the spires. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
It seems difficult to go further with the aesthetic judgment of an artwork using this technique without clarifying the reasons (or mechanisms) for these movements. We would like to couple these oculometric examinations with neurological recordings (e.g. fMRI), so as to associate ocular movements with the activation of one or another brain region. Unfortunately, the time constants of these two types of devices are still incompatible today.
Figure 3.5. (a) Oculometry carried out on a painting by Ilya Repin (Unexpected visitors). (b) Results of one of the eye-tracking recordings, obtained by A. Yarbus (1967). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.5.– Yarbus gave the observer several instructions in order to answer different questions. For example, “How old are the people?”, or, “What are they wearing?”. Each instruction led to different trajectories. The trajectory shown here was obtained without any instructions and is likely to correspond well to the trajectory our eyes would follow in a museum. Let us note that this trajectory has not been reframed on the image, which would be done with trajectories obtained these days (Yarbus 1967).

Let us now look at some elements that are involved in the composition of a photograph, which are the subject of recommendations that can be verified a posteriori.

3.2.5. Format or aspect ratio

Photographs are produced in very variable formats10 (see, for instance, Maître 2017). Many photographers refuse to alter the format through cropping. The framing is, therefore, imposed by the technology and this constraint is accepted, just as a poet accepts sonnets or haikus: rules that allow their talents to shine through. When asked, they rarely consider their chosen format to have a greater aesthetic quality. Let us see what formats are available to them. It must be noted that these formats often originated from Pythagorean rules of aesthetics, more or less forgotten today.

10 The word “format” is used here in the geometric sense of a width/height ratio, also called the aspect ratio. This term has a second meaning that refers to the coding adopted to display the image (e.g. JPEG or TIFF), which does not play a role in aesthetic judgment.
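Since the format is nothing more than a ratio of two pixel counts, it can be read directly from an image's dimensions. The short sketch below shows one way this might be done. The small table of formats is our own selection of widely used ratios, and the function names are hypothetical. As a design choice for this illustration, the ratio is taken as long side over short side, so that portrait and landscape orientations of the same format are treated alike.

```python
from fractions import Fraction

# A few widely used formats, chosen here purely for illustration.
COMMON_FORMATS = {
    "1/1": Fraction(1, 1),
    "5/4": Fraction(5, 4),
    "4/3": Fraction(4, 3),
    "3/2": Fraction(3, 2),
    "16/9": Fraction(16, 9),
}


def aspect_ratio(width, height):
    """Exact long-side / short-side ratio, so orientation is ignored."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive integers")
    return Fraction(max(width, height), min(width, height))


def nearest_format(width, height):
    """Name of the listed format whose ratio is closest to the image's."""
    ratio = aspect_ratio(width, height)
    return min(COMMON_FORMATS, key=lambda name: abs(COMMON_FORMATS[name] - ratio))


print(aspect_ratio(6000, 4000), nearest_format(6000, 4000))
print(aspect_ratio(1920, 1080), nearest_format(1920, 1080))
```

Using exact fractions rather than floating-point division avoids rounding surprises when an image has been cropped by a few pixels and its ratio falls between two standard formats.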
The square format results from technical optimization: it is the format that makes the best use of the photographic lens.
The 3/2 format is a historical standard. It was chosen as a “harmonic” fifth and then became dominant, chiefly for commercial reasons, once photography was democratized.
The 4/3 format is a standard that originated with television. This narrower format is better suited to shooting in a studio.
The 16/9 format falls outside the Pythagorean order; it accompanied the rise of cinema halls, where it made it comfortable to view an image from a distance. It is probably the only format based on the user’s experience.
In the digital world, camera manufacturers, and now cellphone manufacturers, offer specific formats related to the geometry of the device’s sensor: 2/3, 4/3, 5/3, 5/4, 7/6, 16/9, etc. These are technical compromises of VLSI manufacturing, which amply illustrates the point that there is no ideal format. The same is shown by the formats offered by photographic paper manufacturers, which generally follow various printing standards and offer yet another range of formats.
Unlike television or cinema, photographs can be in “portrait” or “landscape” mode, as required. This allows photographers to choose the orientation best adapted to their subject.
Apart from adherents of the “golden number” (which we will look at further on), and despite the Pythagorean recommendations that led to the creation of this “golden number”, there are no longer any aesthetic recommendations in favor of one universal format that surpasses all others. To quote R. Arnheim: “The more visual affinity there is between frame and picture, the more they can influence each other” (Arnheim 1983).
Let us also note that it is common in photography to change formats as required. Printing on paper, and later processing on a computer, have largely encouraged the practice of cropping.
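To make the notion of format concrete, the following minimal sketch computes the aspect ratio of an image from its pixel dimensions and names the closest of the formats listed above. The function names and the selection of formats are illustrative assumptions, not something taken from the text, and the comparison ignores orientation by always dividing the long side by the short side.

```python
from fractions import Fraction

# Formats named in the text, expressed as long side / short side.
# This selection is illustrative, not an exhaustive catalogue.
STANDARD_FORMATS = {
    "1/1": Fraction(1, 1),
    "7/6": Fraction(7, 6),
    "5/4": Fraction(5, 4),
    "4/3": Fraction(4, 3),
    "3/2": Fraction(3, 2),
    "5/3": Fraction(5, 3),
    "16/9": Fraction(16, 9),
}


def aspect_ratio(width, height):
    """Return the long-side / short-side ratio, so orientation is ignored."""
    return Fraction(max(width, height), min(width, height))


def nearest_format(width, height):
    """Name the listed format whose ratio is closest to the image's ratio."""
    ratio = aspect_ratio(width, height)
    return min(STANDARD_FORMATS, key=lambda name: abs(STANDARD_FORMATS[name] - ratio))


def orientation(width, height):
    """Classify the frame as square, landscape or portrait."""
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"


print(nearest_format(6000, 4000), orientation(6000, 4000))
print(nearest_format(1080, 1920), orientation(1080, 1920))
```

Exact fractions are used rather than floating-point numbers so that a 6000 × 4000 image compares as exactly 3/2.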
The format thus tends to follow the subject: seascapes are very elongated, as are battle scenes and cityscapes (Figure 3.4). Still life photography, on the other hand, is more compact, as are views of interiors or streets. Likewise, full-length portraits are very tall, while portraits from the shoulders up are stockier.
3.2.5.1. Statistical studies of formats
Several studies have tried to evaluate the distribution of formats among pictorial artworks considered to be beautiful, drawn from classical European painting11. In keeping with Arnheim’s propositions, the most probable format depends on the subject depicted. However, if we ignore the subject, vertical paintings are statistically most often in the 5/4 ratio, while horizontal paintings are most often in the 4/3 ratio (Arnheim 1983). This slight anisotropy can be explained by the anisotropy of the visual field, which is stretched out more along the horizontal axis. Similar studies carried out on collections of art photographs show that the formats dictated by technological standards, whether of film or of photographic sensors, lead to a preponderance of certain formats: 1/1, 2/3 and 3/4 (Arbellini 2017). It would, however, be imprudent to see this as a consequence of Pythagorean rules of aesthetics.
11 As we know, the Chinese tradition of painting on scrolls often adopted highly elongated formats that were not commonly used in Europe. These artworks are not taken into account in these statistical measures.
3.2.5.2. The golden number
The golden number has the value $\Phi = \frac{1+\sqrt{5}}{2} \approx 1.618$. It emerges from many mathematical equations and is endowed with remarkable algebraic properties, which explain its frequency in natural constructions. It also holds a singular place in Art and has been the subject of many essays and much research seeking to study its use since Ancient times and to justify its aesthetic qualities (Ghyka 1931). It was regularly cited during the Renaissance, but saw a real resurgence during the 19th century with publications by A. Zeising, and then M. Ghyka. These books were resolutely engaged in promoting this number, on weak methodological bases. Somewhat hastily obtained results by G. Fechner seemed to establish a perceptual interest in it, and Φ was rapidly promoted to the status of “divine proportion” and invoked as such to explain all of Ancient aesthetics. This reputation explains its abundant use by many artists over the last two centuries (Seurat, Le Corbusier, Duchamp, Pissarro, Picabia, Léger, Dalí), who indisputably created masterpieces using this ratio (as they did with other ratios).
Research conducted from the 1950s onwards largely demonstrated the arbitrary nature of many of the aesthetic virtues attributed to Φ, as well as the weakness of the evidence for its use in Ancient times (Markowsky 1992; Neveux 1995; Livio 2008). We note in particular, with respect to the general aspect of a photograph, that the golden number is not the preferred ratio of width to height of a rectangular surface (this was the proposition that Fechner tried to prove, but his result was marred by a methodological bias (Markowsky 1992)) and that the most probable ratio for these dimensions, in a large number of artworks considered to be beautiful, is not equal to Φ. We will still encounter Φ here and there, offered as an explanation for a particular proportion or position, but without any truly convincing arguments.
We find, in Obrador et al. (2010), the use of proportions based on the golden number and their extension to more complex shapes, such as golden triangles (see Figure 3.6). These proportions replace the rule of thirds (RoT, which will be seen below) in the placement of points of interest and zones of salience. That work seeks to show that using these rules of composition is effective in selecting “beautiful” images, but the demonstration is not very convincing.
3.2.6. The rule of thirds (RoT)
This is, in fact, the only truly universal rule in photography. It is found in all photography manuals, although its origins date back to 19th-century paintings (Amirshahi et al. 2014). This rule states that the objects of interest in an image must
be found on the two horizontal lines or two vertical lines that divide an image into three equal parts (see the images in Figure 3.7).
Figure 3.6. Grids that make it possible to detect the composition of the image according to Obrador et al. (2010). From left to right: the grid for the rule of thirds, then the grid for the golden mean, then a grid with one of the golden triangles, and, finally, a filter combining two golden triangles. Each mask is obtained through the convolution of a line, or several lines, with a Gaussian kernel. There are 55 in total, denoted here by αn
Figure 3.7. The rule of thirds shows how to arrange the zones of attention within the image. Dividing the length and breadth of the image into three, the objects of interest must then be placed along these four lines, ideally at the intersections, if possible. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
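The grids described in Figures 3.6 and 3.7 can be sketched numerically. The example below, which is only an illustration and not the procedure of Obrador et al. (2010), places the dividing lines for the rule of thirds and for the golden mean, lists their four intersections, and assigns a point a weight that falls off as a Gaussian of its distance to the nearest line. The function names and the width `sigma` are hypothetical, and taking the nearest line is a simplification of a line convolved with a Gaussian kernel.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden number, about 1.618


def grid_lines(kind="thirds"):
    """Normalized positions (0..1) of the two dividing lines along one axis."""
    if kind == "thirds":
        return (1 / 3, 2 / 3)
    if kind == "golden":
        return (1 - 1 / PHI, 1 / PHI)  # about 0.382 and 0.618
    raise ValueError(f"unknown grid kind: {kind}")


def power_points(kind="thirds"):
    """The four intersections of the grid lines, in normalized (x, y)."""
    lines = grid_lines(kind)
    return [(x, y) for x in lines for y in lines]


def line_weight(x, y, kind="thirds", sigma=0.05):
    """Soft mask value in (0, 1]: Gaussian falloff from the nearest grid line.

    sigma is a hypothetical width, given as a fraction of the image side.
    """
    lines = grid_lines(kind)
    d = min(min(abs(x - p) for p in lines), min(abs(y - p) for p in lines))
    return math.exp(-(d * d) / (2 * sigma * sigma))


def rot_score(x_px, y_px, width, height, kind="thirds", sigma=0.05):
    """Weight of a point of interest given in pixel coordinates."""
    return line_weight(x_px / width, y_px / height, kind, sigma)


# A subject on the left vertical third scores close to 1; a centered one close to 0.
print(rot_score(200, 50, 600, 400))
print(rot_score(300, 200, 600, 400))
```

Working in normalized coordinates keeps the same grid usable for any format; a score of this kind says only how close a point lies to a line, not whether the image is pleasing.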
Several studies have been dedicated to verifying the RoT. The most comprehensive of these, in Amirshahi et al. (2014), suggests testing the validity of this rule using a
carefully chosen database of images (“beautiful” images are selected from among the highest rated images on the website Photo.net). The zones of interest are identified either by a group of observers or by algorithms that treat a zone of high salience as a zone of interest. The conclusions of this study, presented in Figure 3.8, clearly reveal that the RoT is not respected in the majority of “beautiful” photographs, and that it is respected neither more nor less frequently in them than in ordinary photographs.
Figure 3.8. Aesthetic quality depending on adherence to the rule of thirds (from Amirshahi et al. 2014). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.8.– This study shows the distribution of the aesthetic appraisals of photographs that satisfy the RoT (in red, labeled ROT+) or do not satisfy it (in blue, labeled ROT-). These appraisals were measured either (a) subjectively or (b) computationally. The value for whether the RoT is respected varies from 0 (no agreement) to 1 (perfect agreement). The aesthetic score (vertical axis) is the average of 20 observer judgments. It can be seen that there is no aesthetic difference between the ROT+ and ROT- populations. In an associated result, it can be seen that computer-assisted techniques find it difficult to distinguish between ROT+ and ROT-, a difference that an observer perceives better, even though the authors chose the best of three established techniques for determining salience.
3.2.7. The center of the image
Positioning the object of interest at the center of the image goes against the RoT. Nonetheless, it is one of the rules proposed by psychologist D. Ross in 1907. R. Arnheim, the theoretician, adopted and defended this suggestion (Arnheim 1954, 1983). He made a distinction between the geometric center of the frame and the
center of gravity of the image, which takes into account the visual importance of each zone. Arnheim proposed that when these two points coincide, the construction appears balanced; if not, a recentering force comes into play, which can have a negative effect on the aesthetic appraisal. He also suggested that the horizontal and vertical axes, the diagonal axes where these exist, and the corners are other sites that compete with the center in determining the balance. The study by McManus et al. (2011) attempted to verify this property statistically. It concluded, with results that were not very convincing12, that Arnheim’s proposition was not well verified in practice and that it contradicted another of his propositions, in which he invokes repulsion by the frame as an additional force acting on the balance (McManus et al. 1985). Other studies proposed different rules for measuring the apparent equilibrium of a composition (Hubner and Fillinger 2016). Their verification protocols rely on very simple images (geometric shapes in black on a white background), and the results are difficult to generalize to photographs.
In order to add some flexibility to the notion of the center of the image and to generalize it a little, it was suggested that “focality” be defined as the single point in the image on which the gaze converges, if such a point exists (Ulrich 1983). A strong focal point is appreciated in images. This point is often induced by the convergence of lines and is therefore not necessarily materialized. This idea is in line with several recommendations in photography manuals; however, testing it computationally raises many problems that are not yet resolved.
12 The study adopted an over-simplified definition of the center of gravity, in which the mass of the physical definition was equated with the gray level, and it did not take into account the statistical bias introduced by limiting the size of the image. Further, it tested the observer’s judgment rather than the quality of the document.
3.2.8. Other rules for composition
Photography manuals also present other geometric elements as being important in the aesthetic appraisal of images; however, to the best of our knowledge, these have not been methodically studied.
3.2.8.1. Symmetry
We have seen that this term, which comes from Greek aesthetics, encompasses properties of harmony that go far beyond its modern usage13. It is the modern usage and scope that we will adopt here, especially with respect to horizontal symmetry (that is, symmetry with respect to a vertical axis, the axis most familiar to an observer who is also endowed with this symmetry). Horizontal symmetry is essential in classical texts on architecture, and any deviation from it was thought to be ungraceful. Horizontal symmetry contributes to Arnheim’s “center of gravity”, without being essential to the latter concept, and it reduces the RoT to one dimension. Although this quality is often sought in composition in order to convey a sense of balance (especially in early religious paintings), there is no systematic evidence for it in painting, nor any formal recommendation14. Symmetry was given considerable importance by the highly geometric schools of art of the early 20th century and with the advent of industrial design.
Symmetry contributes to the sense of balance and also to the centering effect. Its role in conveying an impression of peace, or even respectability, has been systematically studied on populations of observers in the context of marketing studies and compared with the impression of dynamism resulting from asymmetry (Luffarelli et al. 2019). It is highly probable that the conclusions of this research, carried out on simple shapes and expressed in terms of arousal, can be extended to the domain of photography, especially its most abstract forms, but at the cost of greater complexity.
13 See section 1.1.1 and footnote 14 in Chapter 1.
3.2.8.2. Negative space
We sometimes use the terms “positive space” and “negative space”, the former for the parts of the photographic field that are used to represent the central subjects (those that have been photographed) and the latter for the background or surrounding environment. Although the negative space is not essential in many photos, it is recommended that it be given a significant place, so as to highlight the main object, and it becomes even more important when the positive space is in motion. In this case, the negative space is greater in the direction of motion (Figure 3.9(b)).
This very empirical rule has significant consequences for the choice of image format (the subject discussed above) and leads to a global format that echoes the shape of the objects of interest (an elongated format for a lake or mountain range, a squarer format for a face).
14 We will see, in the Appendix, that symmetry is very rare in Chinese painting and proscribed in Japanese painting, as stated by Fillinger (2020), for example.
3.2.8.3. Repetitive and periodic objects
In contrast with the preceding recommendation, when the object is regular and the aim of the image is to highlight this regularity, it is often recommended that no negative space accompany the positive space and that a very tight frame be used, limited to the positive space alone (this is illustrated in Figure 3.9(a)).
3.2.8.4. Framing the human body
Tight framing of a human body is recommended in only three positions: to mid-chest, up to mid-thigh, or at full length. Other framings are judged to be ungraceful. It must be noted that we have been unable to find any study to support or justify this recommendation.
3.2.8.5. How to capture vanishing lines
When the image includes parallel lines leading to a vanishing point, their point of convergence in the image, whether virtual or real, attracts attention. It must be carefully placed: in accordance with the RoT (see section 3.2.6) if it is the main subject of the photo, or, if it belongs to the negative space, so as to help in the observation of the positive space (Figure 3.9(c)).
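The point of convergence mentioned above can be located from two image lines. The sketch below is an illustration under simple assumptions: each line is given by two points in pixel coordinates, lines are represented in homogeneous form, and the helper names are hypothetical. The intersection may fall outside the frame, which corresponds to a virtual point of convergence.

```python
def line_through(p, q):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersection(l1, l2, eps=1e-12):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < eps:
        return None  # no finite point of convergence
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


# Two edges of a road in a 600 x 400 image (y grows downward).
left_edge = line_through((100, 400), (250, 200))
right_edge = line_through((500, 400), (350, 200))
vanishing_point = intersection(left_edge, right_edge)
print(vanishing_point)  # x = 300, y = 400/3: on the upper horizontal third
```

In this constructed example the point of convergence lies one third of the way down the frame, so it also satisfies the RoT.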
Figure 3.9. (a) Tight framing is required for an object with a very regular motif. (b) For a scene in motion, the negative space must be large, especially in the direction of motion. (c) The point of convergence of parallel lines is a significant point in the image, often called the “focal point”. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
3.2.8.6. The use of diagonals
Diagonals form lines of force, recommended for introducing depth into a scene or for supporting motion.
3.3. Histograms, spectral properties and textures
3.3.1. Histograms and gray levels
Photographers are generally advised to make the greatest use of the available range of light by avoiding zones that are light and saturated (“burnt” zones) or dark (“blocked” zones), as well as unused ranges in the histogram (Kardan et al. 2017); furthermore, most cameras have a default linear contrast stretching function. At the same time, many beautiful photos have a very low dynamic range, with all shades being rendered through a subtle modulation of grays. Very dark images are called low-key images, while very light images are called high-key (Figure 3.10). Thus, there is no binding recommendation on the general gray level distribution (or light distribution, in the case of color photos). Experience shows that beautiful photos may have the most varied histograms.
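The quantities mentioned in this paragraph are simple to compute. The following minimal sketch assumes an 8-bit gray-level image flattened into a list of integers and uses hypothetical function names: it builds the histogram, measures the fraction of “blocked” and “burnt” pixels, and applies a linear contrast stretch. It is not a description of what any particular camera does.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray level."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    return counts


def clipped_fractions(pixels, low=0, high=255):
    """Fractions of pixels at the darkest ("blocked") and lightest ("burnt") levels."""
    n = len(pixels)
    blocked = sum(1 for v in pixels if v <= low) / n
    burnt = sum(1 for v in pixels if v >= high) / n
    return blocked, burnt


def stretch_contrast(pixels, out_min=0, out_max=255):
    """Linear contrast stretching: map [min, max] of the input onto [out_min, out_max]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)  # flat image: nothing to stretch
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (v - lo) * scale) for v in pixels]


print(clipped_fractions([0, 0, 128, 255]))
print(stretch_contrast([60, 80, 100, 120]))
```

Stretching fills the unused ranges of the histogram, which is exactly what a deliberately low-key or high-key image does not want; the operation is a default, not an aesthetic rule.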
Figure 3.10. Using a small range of gray levels is not generally recommended, however it may lead to beautiful photos. (a) Low key image using only dark shades. (b) High key image using only strong light
Figure 3.11. The role played by the histogram’s bias (third-order moment) on the appearance of an image. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.11.– (a) The photos on top have a negative bias, while those on the bottom, with identical mean and variance, have a positive bias (Attewell and Baddeley 2007). According to Attewell and Baddeley (2007), people generally prefer images with a negative bias; according to Giraldo and Velandia (2020), the beautiful images in the AVA database almost never have a bias far from zero. (b) Two gray level distributions are shown: one with a positive bias and the other with a negative bias.
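As an illustration of the third-order moment discussed in this comment, the sketch below computes a common form of the standardized third moment (the skewness) directly over pixel values. This is an assumption about the estimator: the studies cited may normalize differently, and the function name is hypothetical.

```python
import math


def skewness(values):
    """Third standardized moment: mean of ((v - mu) / sigma) ** 3."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    if var == 0:
        return 0.0  # constant image: no asymmetry to measure
    sigma = math.sqrt(var)
    return sum(((v - mu) / sigma) ** 3 for v in values) / n


mostly_dark = [0, 0, 0, 10]      # long tail toward light values
mostly_light = [10, 10, 10, 0]   # long tail toward dark values
print(skewness(mostly_dark))     # positive
print(skewness(mostly_light))    # negative
```

With this convention, an image dominated by dark pixels with a few bright ones has a positive bias, and the mirror-image distribution has a bias of the same magnitude and opposite sign.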
There is therefore no recommendation for an average value (the mean level in the image) or for the standard deviation (the contrast) of the gray-level distribution represented by the histogram. However, it is surprising to note that, according to several studies, this distribution must satisfy third-order properties (the mean µ and the variance σ² being the first two moments) for the image to be pleasing. If p is the distribution of gray levels i, the bias (skewness) γ is defined by:
\[
\gamma = \frac{1}{N} \sum_{i=1}^{N} \frac{\left(p(i) - \mu\right)^{3}}{\sigma^{3}} \qquad [3.1]
\]
where N is the number of gray levels in the image. A study in Attewell and Baddeley (2007) shows that people prefer images with a negative bias to images with a zero bias (as in Gaussian distributions) or, a fortiori, a positive bias (Figure 3.11). Conclusions by Motoyoshi et al. (2007), on the same subject, are a little different but still confirm that the bias has an important influence on the perceived quality. Our sensitivity to third-order properties may be explained by specific adaptation mechanisms in our visual cells (Laughlin 1987; Brady and Field 2000; Motoyoshi et al. 2007). Research carried out by Giraldo and Velandia (2020) shows that in the AVA database beautiful images generally have a bias close to zero, while for non-beautiful images the bias may take high values.
3.3.2. Focus, spectral density, fractals
Sharp focus, which yields a crisp image with precise details (“acutance”), is a very important element in evaluating the technical quality of a photograph (Maître 2017). But what is its role in the aesthetic appraisal of this image? It has a complex relationship with the beauty of the image and there are many different recommendations on how to approach it:
1) a lack of focus is immediately perceived, and often perceived negatively, especially if it affects the main subject: for many scenes (exteriors, nature), we expect the same focus across all planes of the scene, that is, a large depth of field (Figure 3.12(a));
2) however, photographers may use a blurred focus to create specific ambiances: unreality, distance, oblivion, mystery15; they thus apply it even to the subject of the photograph;
3) it is generally recommended that the background to an object be greatly blurred so as to emphasize the object of interest, as our attention is preferentially drawn by clear-cut objects. This is a particularly important rule for portraits and for photos of flowers or animals. We often appreciate it when the depth of field is very small (Figure 3.12(b)). We also judge the quality of blurring (the bokeh), which reflects the geometric properties of the camera’s diaphragm16.
15 We can note the systematic use of blurred focus in black and white cinema to soften women’s faces.
Figure 3.12. In many images, a precise focus on all objects in the image is a desirable quality. Some images, however, must deviate from this rule so that a single subject is highlighted and not the surrounding background. In image (b), only the flower is in focus and the rest of the image is blurred. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
Many photographers strive to achieve high resolution across the field, and small apertures are recommended for this17. The advent of digital images, with sensors that did not satisfy the Shannon sampling theorem18, introduced the opposite of blurring: aliasing. Several tools are available to reduce aliasing, and toolboxes offer software to work on the signal’s spectral density (modification of the acutance19 of photos).
Fine details contribute to the sharpness of the shot and, along with textures, to the richness of the image. Textures and details contribute to the photo’s spectral density. Studies have shown that, over a range of scales, spectral density must decrease as 1/f² (where f is the spatial frequency) (Alvarez et al. 1999; Gousseau and Roueff 2007). The human visual system seems to be particularly well adapted to processing this kind of signal (Brady and Field 2000). This spectral decrease is related to self-similarity properties of images, which can be modeled by fractals. It has been shown that a photo appears more attractive when it has fractal properties, that we generally prefer structures with an intermediate fractal dimension (of the order of 1.3–1.5), but that the most appreciated value depends on the structure chosen to describe the fractality (Taylor et al. 2005). This research has been used in particular to explain the aesthetic qualities of Jackson Pollock’s paintings (Taylor et al. 2011).
Therefore, the structural complexity, expressed either through spectral properties or through fractal properties, appears to be an important component of aesthetic appreciation (Lakhal et al. 2020). We will revisit this point when we study the work carried out by George Birkhoff (see section 4.2). Other studies have shown that this is indeed the type of property found in images that are aesthetically pleasing (Koch et al. 2010). Further, it has been shown (Schweinhart and Essock 2013) that we appreciate it when this 1/f² decrease is regular in all directions in space and that deviations from this rule lead to “uncomfortable” images (Fernandez and Wilkins 2008).
16 Blurring, or being out of focus, is expressed as a convolution of the image. Bokeh is the impulse response of this convolution. Camera lens manufacturers try to create diaphragms that produce pleasant bokeh effects. Bokeh is especially visible with point-like light sources.
17 Let us recall, in particular, the short-lived photographic aesthetic movement “f/64”, which advocated exclusively using this very small and highly restrictive aperture that guaranteed exceptional focus across the field.
18 Shannon’s theorem establishes the spacing required between signal samples during digitization to avoid introducing any defect into the resulting digital signal. It allows us to obtain a discrete signal that is equivalent to the original, continuous signal. The conditions for the validity of this theorem are related to the extent of the spectral density of the continuous signal. CMOS and CCD sensors, made up of a mosaic of photoreceptors, cannot satisfy Shannon’s sampling conditions (see Maître 2017).
19 Acutance is a measure of the quality of an image that takes into account both the camera’s transfer function and the sensitivity of the human visual system in specific observation conditions (see Chapter 6 in Maître 2017).
3.3.3. Textures
Textures are zones in the image that exhibit homogeneous and distinctive statistical properties. Their role in making images attractive has been studied quite comprehensively, in order to attach semantic and aesthetic properties to low-level statistical measurements. Indeed, textures are not spontaneously perceived in the image, but play an important role in “dressing up” the scene and contribute to highlighting the aesthetic properties (in Sibley’s sense) that we discussed at the start of this chapter. The tool used to establish this link is S. Thumfart’s perceptual model (Thumfart et al. 2011) (Figure 3.13).
Figure 3.13. The texture perception model by Thumfart et al. (2011). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.13.– This model, as used by J. Liu et al. to subjectively evaluate the basis for an aesthetic appreciation of a texture, relies on three layers of interpretation: the affective layer, the layer of judgment of appearance and the emotional layer. Each level is associated with a few pairs of dominant sensations that make it possible to qualify an aesthetic opinion. It is thus possible to deduce three values from the low-level properties, which in turn make it possible to attribute a final emotional score (Thumfart et al. 2011; Liu et al. 2018).
Research by Liu et al. (2018) also aimed to connect the received signal and the perceived sensation. The study was carried out using images with pure texture (from the SynTex database). A vectorial representation of each texture was created from statistical primitives expressing the dependence between pixels (co-occurrence matrices, energy and entropy measurements, wavelet decomposition, etc.). The hundreds of primitives selected in this manner were reduced to around 10 measurements, chosen for their independence.
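To give an idea of what such statistical primitives look like, here is a minimal sketch of a gray-level co-occurrence matrix together with its energy and entropy. It is a generic textbook construction, not the feature set of Liu et al. (2018); the function names, the choice of offset and the tiny test images are assumptions made for illustration.

```python
import math


def cooccurrence(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[c / total for c in row] for row in counts]


def energy(p):
    """Sum of squared probabilities: 1 for a perfectly uniform texture."""
    return sum(v * v for row in p for v in row)


def entropy(p):
    """Shannon entropy in bits of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


flat = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
print(energy(cooccurrence(flat, 2)), entropy(cooccurrence(flat, 2)))
print(energy(cooccurrence(checker, 2)), entropy(cooccurrence(checker, 2)))
```

A uniform patch concentrates all pairs in one cell (maximum energy, zero entropy), whereas an alternating pattern spreads them over several cells; descriptors of this kind are what the reduction to around 10 measurements starts from.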
The sensations that are experienced are described using 20 pairs of semantic antonyms. The subjects are asked to express their feelings when presented with a texture. The most significant pairs of antonyms are chosen at the end of the systematic examination of textures by the assessors: hot/cold, dark/light, rough/soft, etc. By applying the Thumfart model, three layers of interpretation are revealed, associated with specific attributes:
– an affective or perceptual layer: hot/cold, rough/soft, dark/light;
– a layer of judgment of appearance: elegant/inelegant, simple/complex, artificial/natural, homogeneous/disordered;
– an emotional layer: I like it/I don’t like it.
All this work stems from a good understanding of the various factors in aesthetic appraisal. Unfortunately, the results are based on images that are too simple (pure texture) to allow us to judge the role of this texture within a complete scene, like those found in “beautiful photos”.
3.4. Color
Although “beautiful” photos are not restricted to color photographs, color is a very important attribute of photography and of the aesthetics of photography, so much so that some photos are judged to be beautiful only because of the harmony of their colors. However, we must go further: a color photo cannot be beautiful unless we express a positive opinion of the color palette used. Moreover, color is so important that descriptions of artwork often focus on the colors20. The aesthetics of color thus merits special treatment. However, before going on to these studies on color, it is useful to present a brief summary of the history of the perception of color, which will enable us to better understand why this problem has been at the heart of such deep discussion.
20 Let us illustrate this point with an excerpt from one of the last letters Van Gogh wrote to his brother, in which he described the portrait he had made of Mademoiselle Gachet in these terms (and these terms only): “the dress is pink, the wall in the background green with orange spots, the carpet red with green spots, the piano dark violet; it is 1 metre high by 50 cm wide” (Van Gogh 1992).
3.4.1. About the concept of color
To the scientist it might seem as though all knowledge on color is available thanks to three famous discoveries:
1) the splitting of white light into a spectrum of wavelengths by Newton in 1666;
2) the trivariance of chromatic perception in humans, described by Grassmann in 1853;
3) the linearity of properties between radiometric and photometric quantities, discovered by Abney in 1886, which led to the work of the CIE (Commission internationale d’éclairage21), from the 1920s onwards.
These results were unknown however, or if known, were contested, especially by the artistic world, until the early 20th century. We must explain why this was debated.
The chromatic theory, as we know it today, is based on a solid foundation of physics and in-depth knowledge of physiology, both of which give a good account of perception in a static, simple and controlled experimental context: large, well-defined color ranges; stable, controlled and generally neutral lighting; “normal” observers who are relaxed and attentive, etc. However, it leaves out several more complex aspects. Thus, it does not offer a good explanation for the observer’s perception in the presence of a very rich chromatic context (with the effects of influence and masking), in the presence of variable light sources (the experience of color constancy, which “white balance” techniques address), with moving scenes, flash, etc. It does not account for the phenomena of adaptation, for chromatic vision after a flash, in the shadows or under the effect of drugs, or for colors seen in memories or in dreams.
Artists, however, have long been sensitive to these phenomena and tried to account for them. Some authors (e.g. Goethe, Schopenhauer) proposed theories with no real physical basis, but strongly grounded in their lived experiences. Others, often painters, developed pragmatic approaches based on their experience and the availability of pigments. In short, models built on subjective and sensory bases developed alongside a theory of color built from physics and physiology.
Thus, in Europe22, for over 20 centuries there have been approaches based on the very uncertain foundations of Platonic harmony and Pythagorean bases, where, for instance, the octave from black to white is decomposed into single intervals, thirds or fourths. These are the approaches that inspired Goethe and Schopenhauer, who rejected Newton. Among those who accepted the Newtonian approach, many representations coexisted (Runge, Bezold, Chevreul, Itten), offering different “color wheels” that place complementary colors opposite each other (see Figure 3.14). There was considerable confusion over the number of primary colors and their choice: this confusion persisted in the artistic world in the absence of a distinction between additive synthesis (obtained by mixing light sources) and subtractive synthesis (obtained by mixing pigments).
21 International Commission on Illumination, which carried out the studies that made it possible to standardize chromatic representation.
22 In the Appendix, we will see that in China the perception of color is based on a totally different cosmological concept.
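The distinction between the two kinds of synthesis can be illustrated with a deliberately idealized sketch. It assumes that a light source is described by three intensities (red, green, blue) in the range 0 to 1 and that an ideal pigment is described by the fraction of each of these components it reflects; real pigments behave far less simply, and the function names are hypothetical.

```python
def additive_mix(*lights):
    """Mix light sources: channel intensities add, clipped to the displayable range."""
    return tuple(min(1.0, sum(channel)) for channel in zip(*lights))


def subtractive_mix(*pigments):
    """Mix ideal pigments: each one filters the light, so reflectances multiply."""
    result = (1.0, 1.0, 1.0)  # white paper under white light
    for pigment in pigments:
        result = tuple(r * p for r, p in zip(result, pigment))
    return result


RED, GREEN, BLUE = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
CYAN, MAGENTA, YELLOW = (0.0, 1.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0)

print(additive_mix(RED, GREEN))        # red light + green light gives yellow
print(subtractive_mix(YELLOW, CYAN))   # yellow pigment + cyan pigment gives green
```

The same pair of names (“yellow”, “green”) thus appears in opposite roles in the two systems, which is one way of seeing why the choice of primary colors remained confused for so long.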
Figure 3.14. Color wheels proposed by different authors. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.14.– (a) The first color wheel, as envisioned by I. Newton, (b) the sphere of colors by P.O. Runge (source: Wikipedia) and (c) a chromatic circle by E. Chevreul (source: Gallica/BNF). (d) The circle of colors by P. Sérusier (musée d’Orsay), (e) nine elements of chromatic circles by W. Kandinsky (musée d’Orsay) and (f) the circle of colors by J. Itten (source: Wikipedia). It is seen that while the order of colors is respected, complementary colors vary depending on the author.
3.4.2. Preferences related to isolated colors
When we seek to characterize our interest in color, we must differentiate between two distinct aspects. The first concerns our appreciation of the colors taken individually (which we will look at first), the second concerns the harmony of the colors, that is, the manner in which we perceive a field made up of several hues that are juxtaposed (this will be studied in the following section).
What Are the Criteria For a Beautiful Photo?
There is not really much to say about the general rules that govern preferences for colors. An old French adage observes that “there is no arguing with taste or color”, and it still holds: there is neither rule nor rationale for choosing one color over another. Nevertheless, many studies have been carried out to try and establish certain general trends and to explain these23. A very comprehensive review is proposed in Hurlberg and Ling (2012), summarizing various pieces of research carried out over a century on preferences regarding colors, while Mohr et al. (2018) offer a vast international study associating colors and emotions. In general, blue and blue-green appear to be preferred, while yellow and yellow-green are the least appreciated (McManus et al. 1981; Ou et al. 2004). This order of preference is largely independent of the observer’s characteristics (origin, sex, age, education), although the influence of some of these factors can sometimes be detected (see, for instance, Hurlberg and Ling (2007); Taylor et al. (2013)). It must be noted that while these preferences are applicable to large populations, there are many individual deviations that go beyond the singular differences in vision that come under the generic name of “color-blindness”.
There have been attempts to propose theoretical models that predict the appreciation for a color; however, not all of them yield conclusive results:
– models based on our knowledge of the visual system, especially the specialization in the parvocellular and koniocellular pathways. Ling and Hurlberg (2007) and Vienot and Le Rohellec (2012), making use of the LMS representation, use a model with four variables: intensity, saturation, the L-M red-green contrast and the S-(L+M) yellow-blue contrast, with the last two factors explaining 70% of preferences;
– based on criteria of affective psychology (Machajdik and Hanbury 2010).
The three quantities pleasure, excitation and dominance are expressed in terms of the light Y and saturation s by:
\[
\begin{aligned}
\text{pleasure} &= 0.69\,Y + 0.22\,s \\
\text{excitation} &= 0.31\,Y + 0.60\,s \\
\text{dominance} &= 0.76\,Y + 0.32\,s
\end{aligned}
\]
[3.2]
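As a minimal sketch of how equation [3.2] can be applied, the Python function below evaluates the three affective quantities for a given light Y and saturation s. The function name `affective_scores`, the assumption that Y and s are both normalized to the range 0 to 1, and the two sample colors are illustrative choices that do not come from the text; the coefficients are copied as printed in equation [3.2].

```python
def affective_scores(Y, s):
    """Evaluate equation [3.2] for light Y and saturation s.

    Assumption: Y and s are normalized to [0, 1]. The text does not
    state the scale, so this convention is illustrative only.
    """
    pleasure = 0.69 * Y + 0.22 * s
    excitation = 0.31 * Y + 0.60 * s
    dominance = 0.76 * Y + 0.32 * s
    return pleasure, excitation, dominance


# Hypothetical sample colors: a bright pastel and a dark vivid color.
pastel = affective_scores(Y=0.9, s=0.2)
vivid_dark = affective_scores(Y=0.3, s=0.9)
print("pastel:", pastel)
print("vivid dark:", vivid_dark)
```

With these coefficients, the light term carries most of the weight in pleasure and dominance, while saturation carries most of the weight in excitation.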
23 In this text, we will only discuss experimental studies which test propositions by verifying them in the field. Artistic literature abounds in statements that reflect the convictions of the author, but which are not supported by any reproducible validation. They often associate color and emotion in a very subjective approach. Let us illustrate these works, which we have excluded from the text, with this line by the painter E. Delacroix: “Everyone knows that yellow, orange, and red suggest ideas of joy and plenty”. However, unless it is supported by arguments, this kind of statement remains unconvincing, even if it comes from an authoritative source and we prefer quoting statements with these ideas that also use measurements on the ground, for example (Jonauskaite et al. 2019).
– by choosing an emotional weighting for the colors: color activity, weight and warmth. There are 10 emotions, each given a positive/negative polarity, including: heavy/light, modern/classical, masculine/feminine, hard/soft, etc. (Ou et al. 2004);
– based on ecological arguments (colors are strongly associated with nature in our environment: blue with the sky, green with grass, etc.). Palmer and Schloss used four categories (saturated colors, light colors, dark colors and veiled colors) and approximated experimental statistics quite well (see Figure 3.15(a)) (Palmer 2010).
3.4.3. Preferences related to color palettes
While the literature is reticent about recommendations on colors considered in isolation, it is richer in propositions concerning the grouping of several colors into a composition (generally called a “palette”). E. Chevreul is, rightly, considered the first to have significantly explored the richness of chromatic harmonies that were required to create carpets24. However, his interpretation, based on “visual complementary colors”, is refuted today. Another global approach to a palette is the one developed by Itten, who spoke about finding a balance. This balance could be attained with a gray field, or by combining complementary colors. The ideal harmony is achieved when the right balance is found (Itten 1973).
Systematic and reproducible studies on chromatic harmony were made possible by the work of A. Munsell in 1905 (using his rational color atlas – Figure 3.15(b)), then by the standardization carried out by the CIE in 1931 (especially with the setting up of the chromatic space, CIELab, as well as the associated metrics) and finally by the work of D.B. Judd in 1943 (which established a link between Munsell’s experimental intervals and the CIELab metric).
24 Chevreul went beyond earlier work (that of Goethe, for instance), which only thought of color as a component in harmony, distinguishing between harmonies of hue and harmonies of contrast, which made explicit use of the concepts of saturation and intensity, in addition to hue (Chevreul 1864). In doing this, he laid the path for Moon and Spencer.
Figure 3.15. Harmony of colors. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 3.15.– (a) Highlighting chromatic preferences (Palmer 2010). A: the different colors used; B: their position in the (a, b) plane of the Lab space; C: appraisal using the WAVE model (weighted affective valence estimate), which agrees quite well with the experiment. (b) The Munsell color system (simplified), which separates the various ranges of colors based on experimentally controlled degrees of perceptual difference.
3.4.3.1. Harmony according to Moon and Spencer
The work carried out by P. Moon and D.E. Spencer is the scientific reference most frequently cited when we look for a mathematical basis for color harmony (Moon and Spencer 1944c). We must immediately note that this theory provides geometric tools to manipulate colors as well as rules to evaluate the aesthetic quality of a palette. However, these recommendations are only justified through a collection of earlier writings by other authors (Chevreul, Ostwald, Bezold, Brücke, Field). The authors use no physiological arguments nor any results from experimental psychology. They believe that the rules they propose account for a shared, universal practice in the artistic world, and they use a mathematical form to express a universal knowledge that they believe is obvious.
First of all, they propose representing colors in a rational color wheel, ω, like Munsell’s chromatic space (hence three-dimensional), defined by cylindrical coordinates using light, hue and saturation (Figure 3.15(b)). Munsell’s ten basic colors are distributed uniformly and define as many sectors of the cylinder.
What are Moon and Spencer’s rules for chromatic harmony? Any association of colors in this representation, ω, is harmonious if:
1) the relations between any two colors are never ambiguous;
2) the chromatic points representing these colors are arranged along a simple geometric distribution in ω.
There is ambiguity between two colors if their hues belong to two neighboring sectors of hues (Figure 3.16). According to the first law, hues can either be close or have chromatic contrast. According to the second law, two or three variables can be varied as long as the points of color lie on the planes or along lines in space ω. There is thus an infinite number of harmonies and the authors are careful not to create a hierarchy. Further, the authors propose a two-class classification (harmonious or non-harmonious), although they acknowledge that the line between both these is probably blurred and there are probably gradations in the harmony. Moon and Spencer offer specific numerical values to define the “permitted” and “forbidden” zones (specified in Figure 3.16). From these postulates, Moon and Spencer create a methodical classification of harmonies, which they explore systematically. Three main classes are distinguished depending on the number of variables that are allowed to be free. The first class is thus obtained by allowing the
points of color to vary along only one of the three variables of light, hue and saturation. When two variables are free, the points of color can describe more complex shapes, and even more complex shapes when three variables are free.
Figure 3.16. The work carried out by Moon and Spencer: (a) Munsell’s space of representation of a color palette. In images (b) and (c), the cross-hatched regions represent forbidden zones, while the white zones correspond to colors that are in harmonious relationships, in the center are colors with constant light, on the right those with constant hue (from Moon and Spencer (1944c))
It must be noted that Moon and Spencer were more interested in a systematic study of all configurations that were compatible with the two postulates they established at the beginning of their study, rather than verifying the aesthetic qualities of these configurations. They did, however, experiment with the majority of harmonies that
they proposed25 and generally recognized that they had interesting qualities, though they distinguished between configurations that were more pleasing, brighter or subtler.
Figure 3.17. The various types of chromatic harmony according to Moon and Spencer. (a) The set of harmonies; (b) harmonies with a single free variable, thus “Type 1” (from Moon and Spencer (1944c))
This work was complemented by studies that take into account the size of the zones, as well as their distance from a point of chromatic adaptation (Moon and Spencer 1944b). The work can be summarized as follows:
1) a pleasing balance of n colored zones is obtained when the moments of color26 of these regions are equal in the space ω;
2) another pleasing balance is obtained when these moments of color are in simple ratios.
As we can see, Moon and Spencer’s approach follows an “objectivist” aesthetic approach and gives very little space for the observer to state their preferences. Furthermore, the authors doubled down on this approach by expressing their chromatic harmony using Birkhoff’s objectivist formalism (as presented in section 4.2) (Moon and Spencer 1944a). The resulting formulation, which takes into account heuristically determined weighting coefficients, is very convoluted and difficult to use.
3.4.3.2. Harmony according to Matsuda
In 1995, Y. Matsuda proposed an extension of Moon and Spencer’s approach. Situated in the CIE HSV space27, this approach, like Moon and Spencer’s, recognizes harmonious configurations (see Figure 3.18). However, these configurations are distributed differently and can account for harmonies with more than two colors. The overall harmony is measured by summing all the configurations compatible with the given masks, each configuration being weighted by the product of the saturations, thus giving a greater role to vivid hues. It must be noted that Matsuda’s work offers no demonstration of the legitimacy of this approach, but it is based on a long experimental practice.
25 These experiments were carried out using collections of colored paper, according to the Munsell color wheel, which were commercially available.
26 The moment of color is the product of the area of the zone and its distance from the point of chromatic adaptation (often on the axis of neutral hues, but potentially displaced when the scene has a dominant color), a distance measured using Munsell metrics (expressed in JNDs, just noticeable differences).
Figure 3.18. The configurations chosen by Matsuda. The chromatic circle can rotate freely around its center. The configurations associate hue and saturation (from a figure by Lu et al. (2015)). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
27 The HSV space (Hue, Saturation, Value) is a chromatic space that is widely used in the digital world, chosen by the International Commission on Illumination (Commission internationale de l’éclairage – CIE).
More recent work has sought to determine the rules for universal harmony using machine learning techniques and databases on the Internet (Lu et al. 2014a, 2016). We will study them later (section 6.3.1).
3.4.3.3. Universal harmonies?
For the past 50 years, many studies have refuted the hypothesis of universally appreciated colors. Many authors have highlighted the importance of personality, culture and education (see, for instance, Mehrabian (1977); Smith (2008)). Using this approach, Schloss and Palmer (2011), whom we have already met, revisited the concept of chromatic harmony by tempering the objectivist approach. Using a musical analogy28, they considered three criteria for quality when appraising an association of two colors:
1) the observer’s preference for the association of these two colors;
2) the perception of harmony within this association;
3) the appraisal of the agreement of these colors with the other hues considered to be a background.
They thought it necessary to distinguish between preference for a pair and the harmony of a pair, as we distinguish between harmonious sounds and pleasing sounds. Differences between individuals can be located along this distinction (linked to education or personality). All the same, their conclusions did not accord a significant place to this distinction, which is too subtle in practice to be clearly reflected in their experiments.
28 This musical analogy compares the music of Mozart and Stravinsky. While people may prefer Stravinsky to Mozart (individual criteria), it is generally agreed that Mozart is found to be more harmonious.
Going even further with this rejection of objectivism, with the aim of completely personalizing the measurement of harmony, O’Connor (2010) chose several particular factors that come into play in the perception of color: individual differences (gender, age, personality, etc.), cultural experience, the context, perceptual effects and time (influence of fashion). She found these factors to be additive and to have a multiplicative effect on a universal harmony (such as could have been calculated by Moon and Spencer or Matsuda) following a formula of the type:
\[
H_{\text{O'Connor}} = \left[ \Phi_{\text{individual}} + \Phi_{\text{cultural}} + \Phi_{\text{context}} + \Phi_{\text{mode}} + \dots \right] H_{\text{Matsuda}}
\]
[3.3]
Of course, a formulation of this kind requires many parameters to be known before it can be used operationally, which is very rarely the case. It is not surprising, therefore, to find few such formulae in work carried out on large populations.
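Equation [3.3] can be read as a simple computation: the personal factors are summed, and the sum scales a universal harmony score. The Python sketch below illustrates this reading only. The function name `personalized_harmony`, the dictionary keys and all numerical values are hypothetical placeholders, since the text gives neither values for the Φ factors nor a procedure for estimating them.

```python
def personalized_harmony(universal_harmony, factors):
    """Scale a universal harmony score by the sum of personal factors.

    universal_harmony: score from an objectivist model (for example
        Matsuda's), assumed here to be a plain number.
    factors: mapping from factor name to its Phi value (hypothetical).
    """
    return sum(factors.values()) * universal_harmony


# Placeholder values chosen only to show the arithmetic of [3.3].
phi = {"individual": 0.4, "cultural": 0.3, "context": 0.2, "mode": 0.1}
print(personalized_harmony(0.8, phi))
```

The sketch makes the practical difficulty visible: every entry of `phi` has to be known for each observer before the formula yields a number.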
3.5. What behavioral psychosociology has to say
We have studied, in detail, the properties of photographs as manipulated by image processors. These quantities emerge directly from the image and lend themselves well to an objectivist interpretation of aesthetics as they yield quantities that are equally available to everyone for evaluation. Other studies took different paths, questioning observers about aspects of the image that are less easy to measure and that are better-suited to a subjectivist approach. This research relied on survey methods of the kind commonly used in psychology. These methods offer results that sometimes shed new light on our relationship with beauty, but these results are harder to include in automated approaches as they are based on concepts that are not well modeled, in operational terms, by software engineers. These make up what we refer to as the corpus of the psychology of aesthetics. Let us look more closely at this in this section.
3.5.1. Images of nature
3.5.1.1. Nature and beauty
Nature occupies a special place in the literature on aesthetics. In Chapter 1, we reiterated that nature presided over the criteria for beauty in the Greek world, but through humans, in whom all these qualities came together. Nature as landscapes entered into paintings much later. It was only in the Renaissance, and then from the 17th century especially, that there emerged outdoor studies of landscapes, a wave that would culminate with the impressionists. Texts devoted little attention to Nature, with Hegel even excluding it entirely from his majestic Aesthetics, restricting his analysis to the study of man-made objects. On the other hand, M. Dufrenne attributed to nature an exceptional ability to evoke aesthetic pleasure: “This is why we admire the glory of the sunset, which we find off-putting on calendars [...] we accept a spontaneity and exuberance from natural sensibilities that we do not tolerate in artistic sensibilities” (Dufrenne 1980).
Alexandre Lacroix (Lacroix 2018) traces back analytical interest in nature’s beauty to William Gilpin (1724–1804) (the author of the first tourist guidebooks in English) and to Uvedale Price (1747–1829). Both of these writers strove to identify what in nature merited a painter’s attention. Lacroix expressed the aesthetic qualities of a landscape through the French term “picturesque”, to highlight the relationship with the painter, rather than the more ordinary term “pittoresque”. A. Lacroix identified three groups among studies on the aesthetics of nature:
1) the ecological or evolutionary approach, which we have already spoken of. Jay Appleton laid down the broad guidelines for landscapes (Appleton 1975). He stated
that regardless of the habitat they came from, when an observer was asked about their preferences when shown photos of the savanna, forest, desert, etc., they preferred the savanna. Appleton himself, as well as other researchers subsequently (Falk and Balling 2010), verified this experimentally. They explained this by the advantage the savanna offers in terms of survival, as it offers better refuge as well as food sources when compared to the other landscapes; 2) the cultural approach: one persistent tradition holds that people are only likely to appreciate nature after having learnt of beauty through artistic works; Gilpin and Price defended this position. There is an even stronger belief that nature’s beauty is only revealed on richly cultivated land. This was established as a rule by Carlson (1979, 1995) through the argument that a landscape hides its mysteries and beauties unless we have the geological key that explains its folds and tombs, the botanical knowledge to name the different species we see, and the historical knowledge of the monuments, routes and settlements; 3) the mystical approach: this is the approach that connects the observer to forces in the universe and makes them aware of their dependence on nature’s mystery, their participation in the movement of stars, clouds, the wind, streams, etc. Adherents of this approach include Rousseau, Schopenhauer, Hugo, Wagner and Friedrich, shamans and hermits. When we travel beyond the Greco-European sphere, we will see that this is also the point of view of many Asian cultures (see Appendices 2 and 4). The very concept of “nature” is challenging. F. Ducarme (Ducarme and Couvet 2020) highlights the extent to which the term is not a shared one and evokes different concepts fed by different cultures in different countries. 
However, we must not digress too far from our objective of listing out the criteria for beauty and so we will examine only a few texts that analyze our taste for landscapes through sociological studies.
3.5.1.2. Images of nature or artificial scenes?
Research that aimed to study images of nature and associated preferences in society gave rise to a wave in aesthetics called “environmental aesthetics” (Ulrich 1983; Frangne 2018). We have already encountered this as work on environmental aesthetics gives us the most detailed instructions on how to evaluate the composition of a scene (see section 3.2.3). The older research quoted above (Kaplan et al. 1972) showed that in landscape photos people prefer a natural scene (gardens, parks, mountains) to a partly man-made landscape (streets, buildings). The poet and philosopher George Santayana stresses the romantic aspect of this landscape: “This is a beauty dependent on reverie, fancy and objectified emotion”. In more concrete terms, Kaplan postulated that the natural aspect is better for attention and memory and it thus stimulates the mechanisms involved in aesthetic appreciation29.
In Kardan et al. (2017), the authors compare the role of low-level primitives in distinguishing natural scenes from anthropized ones. Using primitives derived from the spread of the histogram, the color palette, the distribution of contours and entropy, the authors show that, with a small number of measurements, it is possible to reasonably predict whether the scene is natural or anthropized and to deduce from this the aesthetic judgment that an observer will make of this landscape. The authors also verify that, complexity being equal, scenes of nature are preferred to artificial landscapes. The three most significant primitives used to distinguish this type of landscape are:
– the absence of straight lines in contours, shadows and silhouettes;
– a dominant color, falling more along the green-yellow antagonism than the blue-red antagonism;
– a large dynamic range in the chromatic saturation.
The authors leave one remark open. When we choose photos where the above primitives are not decisive, we can still reach a decision between natural and anthropized images; however, this takes a longer reaction time, which suggests that mechanisms other than the visual system are active. But which ones?
In Ibarra et al. (2017), the previous study is complemented by taking into account not only low-level primitives, but also high-level primitives from among those proposed by Hunter and Askarinejad (2015) (some have been presented in Figure 3.3). These express aspects of a large thematic variety (vegetation, buildings, transport routes) and semantic richness (obstacle to penetration, winning through, organization, etc.). Among these primitives (62 in number), the authors show that there are 6 which allow us to predict an observer’s aesthetic judgment even better than low-level primitives30. These six criteria account for almost 70% of the difference between natural and anthropized subjects, much more than the lower level indices. They account for over 50% of the variance in predicting aesthetic preference.
These studies were carried out on a few hundred photos and compared with the judgments of about 50 experienced assessors. The low-level features were processed by image-processing software, while high-level features were evaluated using the results of decisions taken by the assessors and were digitally measured using a graphic tablet.
29 Kaplan’s writings (Kaplan et al. 1972) were revisited in Kotabe et al. (2017), which also sought to explain our preference for natural scenes. This research led to the following conclusion: natural scenes are indeed preferred, despite their “disordered” properties, which contradicts our stated preference for order (Berlyne 1971; Ulrich 1983). Three competing hypotheses were postulated to resolve this paradox: 1) the preference for natural scenes is greater than aversion to disorder; 2) disorder does not play a role in the evaluation of natural scenes (masking the semantics); 3) disorder is appreciated in natural scenes. Kotabe et al. (2017) think that the first hypothesis is the most probable.
30 These six high-level primitives are: Scenography Type, Building Distribution, Water Expanse, Built Surfaces for Movement, Skyline Geometry and Skyline Maximum Undulation (Ibarra et al. 2017).
Before we move away from the aesthetic of landscapes, we must emphasize the considerable role that culture and education play in forming our taste for landscapes, despite the apparent unanimity in the opinions presented above. In the 18th century, French parks and English gardens were elevated to the apogee of good taste. Order, alignment and symmetry were propagated across Europe at the same time as gardeners thought up ingenious ways to recreate copses, waterfalls, a maze of paths and woods, to simulate a return to nature. Classicism versus Romanticism? Things are not quite so simple and the “truly natural” nature of Chinese landscapes and the “rigorously random and artificial” nature of Japanese gardens remind us that there are many other paths to beauty in this world, while only a handful of these have the privilege of featuring in publications and commentaries. Let us conclude by quoting Adolphe Garnier who, in the early 19th century, at the peak of the Romantic period, boldly swam against the current of advocating for nature while observing the traffic of steamboats in the port of Le Havre, defending human traces and the mark they leave on society and civilizations: “Like me, you will also be convinced that after admiring the marvels of physical nature, the marvels of intellectual nature are no less worthy of admiration” (Garnier 1826).
3.5.2. The aesthetics of faces
While interest in the aesthetics of landscape photographs is rather reserved with respect to the importance of the subject in photography, the story is a bit different with another frequent type of photograph: portraits. Unlike with landscapes, it may even be that too many different objectives are the reason for too many varied approaches to the subject, which make it difficult to draw any clear conclusions.
Let us first look at the poetry and literature which used beauty, feminine beauty in particular, as an essential, timeless and universal ground for expression. On a more technical front, let us look at the medical and paramedical fields (reparative surgery, dentistry, hairdressing, optometry, cosmetics), clothing and fashion, as well as the vast field of human relations (employment, social networks, leisure, etc.), and finally, society itself with its more worldly preoccupations. As we can see, it is quite difficult to choose from the innumerable texts published on this theme to identify the precise reasons for which we appreciate a face: a “beautiful face” could thus guarantee good sociability, or an exciting personal relationship. It may be regular enough to become a standard over evolution, or show exemplary functional harmony and be used as a basis for reconstruction, or provide a favorable context to accompany a message or an offer and so on. Many studies have highlighted to what extent a beautiful face can induce
people to have a positive opinion on a person’s other qualities. Alley and Hildebrandt call this the beauty-is-good stereotype31.
Unlike these approaches, which are based exclusively on morphological and geometric criteria, many analyses, which are often quite convincing, seized onto subjective manifestations of the portrait. We will illustrate this with the single example of Roland Barthes’ beautiful text, which says it all. Trying to identify what cues led him to distinguish a portrait of his mother in the Jardin des Plantes from among so many others, he spoke about the air, which he first defined at length by describing everything that it is not, before then using these enigmatic words: “is a kind of intractable supplement of identity, what is given as an act of grace, stripped of any ‘importance’” (Barthes 1980, p. 168). This text, more than many others on aesthetics, gives us a glimpse of our relationship with the photograph, but it takes us farther away from the aim of this book: it sheds no light on how computers will relate to images, given that it uses arguments that computers will be unable to process for a long time to come, if ever.
Let us thus return to studies that adopted a more analytical and less intimate process. While the recognition of features depends, notably, on gender, age and origin, both for the observer and the subject, judgments on the beauty of faces seem much more widely shared across populations (Cross et al. 1971). This point has been the subject of many different studies, with conclusions that were more or less in agreement (Patzer 2012).
We must distinguish between two properties that come into play, though separately, in our assessment of a face: one is the general shape (morphology), the other is the expression, which is overlaid on the shape and can modulate it. Ancient literature was quite heavily inclined towards the first point and many treatises were written on Platonic criteria that ruled on morphologies.
Harmony, in this case, was likely to be the result of simple ratios, and the golden number was given a choice position here, more than in any other context. The components of this choice were debated, especially the precise site of measurements (see, for example, Alley and Hildebrandt (1988); Patzer (2012)): the roots of the hair, the eyebrows, the bridge of the nose, the tip of the nose, the outline of the upper and lower lips or the way the mouth folds, the tip of the chin, or the center of the cheeks. Multiple studies tried to verify the ratios between some of these sites; however, these measurements were often imprecise (the soft tissue being very mobile and the bony structures being too deep) and there was uncertainty about the criteria themselves (who is beautiful and according to whom?). It is not surprising, then, that these conclusions are not very reliable (Peck and Peck 1970).
31 “. . . people generally believe that more attractive individuals are typically more competent, likeable, and, in a very broad sense, ‘better’ than less attractive individuals: a ‘beauty-is-good’ stereotype” (Alley and Hildebrandt 1988).
Figure 3.19. Beauty in magazines
COMMENT ON FIGURE 3.19.– The canons of feminine beauty have been widely discussed in earlier centuries. In the early 20th century, beauty recommendations were sold at newsstands: here, the publication L’Art d’être jolie (The art of being beautiful), published by the Institut de beauté scientifique (Institute of Scientific Beauty) (a), as well as a page from 1904 taken from this publication (b). It makes an allusion to “the German scholar”, which refers to the anthropologist and anatomist Gustav Fritsch, who was also an eminent photographer (source: BnF Gallica).
While the second point (the role that expressions play in a judgment of beauty) has also been examined in many studies (for example, Mueser et al. 1984), it must be pointed out that these studies do not really examine beauty as we do here, but focus more on attractiveness, likability or pleasantness. It is not surprising to learn that sorrow decreases the charm of a face, while a neutral expression or joy are equally appreciated by observers. Some studies looked at how to measure a part of the beauty of a face and also its charm, in order to combine them into a global score (Lienhard et al. 2015).
3.5.3. The role of the signature, title and context
Jean-Pierre Changeux, in particular, highlighted how important a signature is when it comes to how much attention we pay to an artwork (Changeux 2016). He rightly observed that when a signature is well known, it leads us to devote more time to observing the artwork and heightens our appreciation. Thus, artworks by big-name creators seem entitled to beauty and advantage on principle. This is something all of us have experienced. Let us recall here the position taken by G.T. Fechner, who maintained that even the best copy of a masterpiece is not beautiful, claiming that beauty must be accompanied by authenticity (see footnote 16 in the Introduction). Is this not an illustration of the subjective weight a signature carries, as opposed to all the objective characteristics of the work of art? This position is rather a singular one today in the field of aesthetics, and the exceptional value of the original, as evidenced on the art market for example, is attributed more to the rareness of the piece, or even to the historical legacy attached to the work, than to any objective nuances in appearance (on which even experts often disagree). A signature is rarely seen in art photography, so Changeux’s observation is less likely to affect our study. However, in photography it is not the physical signature that creates the “signature effect”, but our knowledge of the photographer, even when their name is not given. We may be satisfied that there is no signature effect in aesthetic judgment and that it comes into play only in parallel registers: markets, celebrities, etc. This was the stance taken by John Dewey, who sought to train a judgment free of any influence (Dewey 2010). This is also the challenge faced by certain actors in the artistic revolution who wish to mask their authorial identity so that the artwork alone acts on the observer. In the visual field, R.
Mutt and Duchamp were the first two people to try experimenting with this, however literary creations have many examples of this kind of anonymity. In Frow (2002), we find a close study of the position occupied by the signature in the artistic field. It is difficult to find precise references to testify to this objective effect in the field of judgment of photographs, however we should be able to predict this possibility. Let us mention an experiment that reveals this in the adjacent field of taste of food (McClure et al. 2004). The experiment was carried out with two very popular and similar drinks, which were first tested without their brand names, and then with the brand name being known. There was a total shift in the preferences expressed by one population of testers, proving that only a knowledge of the brand name could have significantly affected the consumer’s judgment. Although we do consider that aesthetic judgment could lend itself to the same bias, we will still be prudent when transposing gustatory conclusions into the field of visual perception. The signature, like the title of an artwork, the description that accompanies it and, more generally, the place where it is exhibited (museum, collection, exhibition, etc.) are all important elements that play a role in how the work is received (Konecni
1979; Jacobsen 2010). They act on our judgment through our cognitive load, as can be seen in neuro-imaging studies (section 2.3). These studies confirm the engagement of specific brain regions during the analysis of a photo accompanied by commentary, but they do not support any specific value of the aesthetic judgment. Several studies aim to measure the role of cognitive interpretation, which comes in alongside emotional sensibility in the aesthetic appraisal of a photo (Leder et al. 2004). These studies, a fortiori, found a place within a historical interpretation of aesthetics (see section 1.3), where an artwork can only be properly evaluated if the observer knows about the conditions in which it was created. For our study, we will pay particular attention to the information carried in the title and the note that accompanies a painting. Several studies have been carried out on the role of the title in how we appreciate an artwork, differentiating the type of work (abstract or figurative), the relevance of the title, its descriptive or explanatory qualities, and the duration of observation (brief = 1 second, or long = 10 seconds or longer) (Franklin et al. 1993; Leder et al. 2006; Thoemmes and Huebner 2014). Overall, the studies concluded that titles do not play a very important role in our judgment of the aesthetics of the work and that this role is only relevant in the case of abstract works. Let us take the example of the work carried out by Swami (2013), which aimed to determine what role the various pieces of knowledge related to an artwork played in the judgment formed by the observer and in their understanding of the work. The experiments looked at the opinions of subjects who were shown abstract paintings (M. Ernst, P. Picasso), with different kinds of information available to them beforehand.
The conclusions were as follows:
– for abstract artworks, as long as the additional information was general (the title of the piece, information on the artist’s biography, or generalities on their style), it did not greatly influence the judgment of the artwork. However, this information played a notable role if it shed any light on the meaning of the artwork and the artist’s intentions when they created it;
– the role played by relevant information regarding the meaning of the work and the author’s intention is especially important in appreciating abstract art (Picasso in the cubist period) but is not very important for figurative artwork (Picasso in the blue period). These results reinforced those published in Leder et al. (2006);
– if the additional information is not relevant (if it is not directly related to the artwork), it plays no role in the judgment that is formed.
These results are for abstract or figurative paintings. They can probably be transposed into the field of photography (as in the study by Thoemmes and Huebner
(2014)), which is also sometimes quite abstract, but in photography, are we likely to have as much information about the photographer’s motivation and style? It is likely that information on the historical, geographical, political and social context (among others) will play a greater role. However, while it is very probable that we will then measure an increased interest in the photo, chiefly due to the effect of relevance (see the Introduction), it is not clear whether aesthetic judgment will also be affected. Finally, in line with this work, we can acknowledge a recent study that examined the impact of a completely absurd title (bullshit title) attributed to an abstract work, in keeping with a trend that is quite common in modern art. It was experimentally shown that these titles could make a work seem profound and could thus alter aesthetic judgment (Turpin et al. 2019). Curiously enough, this was more of a positive influence and enhanced the opinion people had of the artwork.
3.5.4. Perception and memory: prototypicality
It is thanks to schools of design that we have more formal studies on some slightly specific elements that emerged from the global analysis of artworks, and which influence the aesthetic judgment of a scene. This research has been directed towards understanding opinions on various man-made artworks (furniture, interior architecture, haute couture), but there is no reason why its results should not be relevant in the field of fine arts and photography. This research made it possible to identify a specific criterion that is not commonly used in the analysis of photographs but which we think is particularly relevant here: prototypicality. Prototypicality is the property of an object that links it to a category or to the average values of objects in this category. Some domains of photography are more open to this property: nudes, portraits, animal photos, sunsets.
Studies carried out on prototypicality in the field of design led to the quite-unanimous conclusion that greater prototypicality leads to better aesthetics (Barsalou 1985), although this is a bit paradoxical. Undoubtedly, by approaching the typical form, we probably bring together the largest number of traits that are appreciated in each form of expression. Undoubtedly, we feel like in the prototype we find the “canons” of Aristotelian beauty. Undoubtedly, with forms close to the prototype, we expect to offer the observer familiar cues and thus facilitate interpretation and aesthetic acknowledgment. However, we also move away from the elements that are truly at the heart of the artistic work and which are usually inseparable from our judgment of this work: innovation and creation. These disturbing results have, however, been confirmed through several experiments (Whitfield and Slatter 1979; Veryzer and Hutchinson 1998), while Boselie (1991) defends the contrary opinion. In the field of images or photography, the concept of the prototype must be further developed. At the moment it is unused, except in the field of the aesthetics of portraits. While there is abundant literature in this field, it seeks to essentially define the aesthetics of the
model. However, there are some interesting texts devoted to photography itself, especially those that strive to define an average face. It was thus shown that the face obtained by simply taking the average of a large number of faces after registering some distinctive points is a particularly attractive face (Trujillo et al. 2014). We can also cite here work carried out on group photos (Wang et al. 2020) which establish the criteria for acceptability (visibility of all faces, orientation of gaze, the groupings of the people, etc.). In both these cases, however, we have moved far away from the criteria for beauty that we adopted in this book.
4 Algorithmic Approaches to “Calculate” Beauty
Science [...] represents the last remains of our relation to objectivity. Luc FERRY (1990)
We will now look at research that proposes calculating the beauty of a photograph (or an image, or a painting) using a program rather than a human. This research can be divided into two large, highly unequal groups: the first, which we discuss in this chapter, is devoted to algorithmic methods that propose comparing a photo with a calculation whose parameters are known and universal. These methods are thus heavily based on the objectivist approach. The second group works with methods based on machine learning using examples scored by a human observer. This methodology will be studied across three chapters. The first chapter examines the databases used for machine learning, as well as the expertise used as reference. The following chapter looks at the methods developed from 2000 to 2014, which use primitives taken from images, with rules implemented by classifiers acting on the primitives. The third chapter is dedicated to deep neural networks, which have been used since 2015.
4.1. First steps: C. Henry
The first contributions to the computer-assisted determination of beauty were due to Charles Henry (1885, 1891). He clearly noted that determining the exact physical-chemical changes correlated with aesthetic perceptions lay beyond the scope of the science of his time. He thus proposed a general framework to evaluate the links between arousal and sensation, regardless of the modality of perception. Founded on both the “ancient” harmony of proportions (especially the first 12
positive and negative powers of 3/2 and regular inscribed polygons), as well as Van’t Hoff thermodynamic equations and the Lagrange principle of least action, this theory introduces the unifying principle of dynamogeny. He seems to have arrived at this theory, which seems rather confusing today (Henry 1889), through the mathematics of aesthetic evaluations of chromatic harmonies, proposed in work by Chevreul and Helmholtz, of forms, as well as of music as taught by Rameau (Henry 1895). While Henry’s work was very well received at the time, it was not pursued after his death.
4.2. G.D. Birkhoff’s mathematical approach
The mathematician and probabilist George D. Birkhoff adopted a totally different approach, although one still based on the fundamental idea that beauty resides in the object and can be measured through a judicious combination of its attributes (Birkhoff 1933). He proposed a beauty formula to measure this, which he tested quite systematically on highly varied data. This formula established a relationship between three quantities:
– a quantity, denoted by M, which measured the object’s beauty, that is, its aesthetic value, which seems like the reward for the effort required in examining the object¹;
– a quantity, denoted by O, which he called the order and which measured the qualities of harmony, regularity and symmetry, which are essential to aesthetic expression;
– a quantity, denoted by C, which he called complexity and which corresponds to the effort of attention mentioned above.
He then proposed, as conjecture:
$$M = \frac{O}{C} \qquad\text{[4.1]}$$
He spent a major part of the final years of his life verifying the validity of this conjecture in various situations: poetry, music, drawing and sculpture. It expresses beauty as a sort of “reward” for the effort the observer agrees to make to take in the richness of the scene. It must be noted that, at this time, he had no tools to describe shapes (these would come later with image processing and shape recognition), nor to express complexity (these would come with the work carried out by Shannon and Kolmogorov). Finally, he was unable to use the research from the Gestalt school of psychology, as this was published in very restricted circles at the time (Wertheimer 1922, 1923) and would still have been unknown to him. Birkhoff thus had to use very specific expressions for the terms in his equation for every problem studied, which considerably reduced the scope of his work. He had to create a “dictionary” of primitives for each figure (vase, frieze), and then list the properties that created an “order” among these forms (see Figure 4.1). A discussion of Birkhoff’s results can also be found in Delahaye (2015).
1 This attention must extend to the observer empathizing with the object so that the aesthetic dimension of the object is fully expressed. This is directly in line with the work carried out by the psychologist T. Lipps, which inspired Birkhoff (Lipps 1903).
Figure 4.1. Application of Birkhoff’s formula to a vase. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 4.1.– Birkhoff experimentally validated his equation on Chinese porcelain vases. The complexity of the shape was expressed using the diversity of tangent planes, curves and symmetries in the various planes, as well as by the positions of points of inflection and the extrema of curvature. The results from his measurements of beauty were finally compared to the price attached to these vases in shops, as this was considered the objective expression of their beauty. The agreement between these two measures reassured him in his choices (source: Birkhoff (1933)).
4.3. Those who followed G.D. Birkhoff
4.3.1. Beauty according to H.J. Eysenck
Birkhoff’s work led to a large amount of literature. The oldest results were reported in Eysenck (1941), and applied to shapes (generally polygons). This fed into work by Davis, Swifft, Harsh, Eysenck and Schnittkind. Their work only partially agreed with Birkhoff’s formula. H.J. Eysenck wished to verify this equation on the configurations proposed by Birkhoff, and he first showed that it was possible to identify a first term (which he denoted by T), which revealed the elements of regularity that were usually used in Birkhoff’s order term, O. This term is independent of the observer and their cultural loading (Eysenck 1939). He then proposed a linear expression that lent itself better to a regression analysis (Eysenck 1941), which replaced the quotient in equation [4.1]. He thus arrived at a relation of the type:
$$M_2 = T - C \qquad\text{[4.2]}$$
in which he observed that the second term C depends on the observer. We also find another beauty formula by Eysenck (1968), which is the exact opposite regarding complexity to that put forth by Birkhoff, defended in Berlyne (1971) and often used, for example, in Kato and Taishi Matsumoto (2020):
$$M_3 = T \times C \qquad\text{[4.3]}$$
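To make the disagreement over complexity concrete, the short sketch below evaluates the three formulas on two invented shapes. It assumes that [4.2] is the difference T − C and that [4.3] is the product T × C; the numerical scores and the function names are hypothetical and serve only as an illustration.

```python
def birkhoff(order, complexity):
    """Equation [4.1]: M = O / C."""
    return order / complexity


def eysenck_linear(order, complexity):
    """Assumed reading of equation [4.2]: M2 = T - C."""
    return order - complexity


def eysenck_product(order, complexity):
    """Assumed reading of equation [4.3]: M3 = T * C."""
    return order * complexity


# Hypothetical (order, complexity) scores for two shapes
shapes = {"simple": (4.0, 2.0), "intricate": (7.0, 4.0)}


def preferred(measure):
    """Name of the shape that a given formula ranks highest."""
    return max(shapes, key=lambda name: measure(*shapes[name]))


for measure in (birkhoff, eysenck_linear, eysenck_product):
    print(measure.__name__, "prefers", preferred(measure))
```

With these particular scores the quotient favors the simple shape, whereas the other two forms favor the intricate one, so the ranking depends on the role given to C.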
These formulas clearly show how much the role of complexity is debated. Many authors underline the positive aspects of the deviation from regularity while judging deviations due to noise and randomness as being negative. Thus, Boselie and Leeuwenberg (1985) arrived at a better prediction of aesthetic appraisal with a formulation like Birkhoff’s (equation [4.1]) than using the formulations in equations [4.2] and [4.3].
4.3.2. The Post-War years: the designers, A. Moles and M. Bense
In the Post-War period, with the significant industrial design wave, Birkhoff’s statements inspired several researchers who also took into account the progress made in experimental psychology and information theory. The most original contributions came from A. Moles in France (Moles 1957) and M. Bense in Germany (Bense 1969, 2007). Both were strongly influenced by the new information theory, semiotics and cybernetics, which they applied to artistic creation. Consequently, Bense suggested that formula [4.1] could be transformed using Shannon’s information entropy, H, applied only to the color of the image. The order O was then expressed as the reduction of entropy between the complete, equi-distributed palette
$H_{\max}$ and the palette actually used for a given piece, $H_o$, which leads to a beauty formula of the form:
$$M_B = \frac{H_{\max} - H_o}{H_{\max}} \qquad\text{[4.4]}$$
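A minimal sketch of formula [4.4] is given below. It assumes that the image is available as a flat list of palette indices, that the entropy of the complete, equi-distributed palette is the base-2 logarithm of the palette size, and that the entropy of the palette actually used is the empirical Shannon entropy of the indices. The function names and the toy images are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(symbols).values())


def bense_measure(pixels, palette_size):
    """(H_max - H_o) / H_max, the form of equation [4.4]."""
    h_max = math.log2(palette_size)   # equi-distributed palette
    h_o = shannon_entropy(pixels)     # palette actually used
    return (h_max - h_o) / h_max


# Three toy 16-pixel images drawn from a 4-color palette
uniform_use = [0, 1, 2, 3] * 4    # every color used equally often
two_colors = [0] * 8 + [1] * 8    # only half of the palette is used
monochrome = [0] * 16             # a single color

for name, image in [("uniform", uniform_use),
                    ("two colors", two_colors),
                    ("monochrome", monochrome)]:
    print(name, bense_measure(image, palette_size=4))
```

The measure is 0 when the whole palette is used evenly and grows toward 1 as the palette actually used becomes more restricted, which is how the formula rewards order.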
4.3.3. A dynamic approach: P. Machado and A. Cardoso
A few decades later, P. Machado and A. Cardoso (Machado and Cardoso 1998) proposed a formula à la Birkhoff² where the complexity, C, and order, O, were expressed, respectively, by the processing complexity PC and the image complexity IC. After a few manipulations, the formula [4.1] becomes:
$$M_{MC} = \frac{IC^{\,a}}{\left(PC(t_0) \times PC(t_1)\right)^{b}} \times \left(\frac{PC(t_1) - PC(t_0)}{PC(t_1)}\right)^{c} \qquad\text{[4.5]}$$
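As a purely numerical illustration, the sketch below evaluates a formula of the form [4.5], read as IC^a divided by (PC(t0) × PC(t1))^b and multiplied by ((PC(t1) − PC(t0)) / PC(t1))^c. The default exponents and the complexity values are arbitrary placeholders, not values taken from Machado and Cardoso, and the function name is hypothetical.

```python
def machado_cardoso(ic, pc_t0, pc_t1, a=1.0, b=0.5, c=0.5):
    """Evaluate IC^a / (PC(t0) * PC(t1))^b * ((PC(t1) - PC(t0)) / PC(t1))^c.

    ic     : image complexity estimate (> 0)
    pc_t0  : processing complexity estimate at time t0 (> 0)
    pc_t1  : processing complexity estimate at time t1 (>= pc_t0)
    a, b, c: exponents; the defaults are arbitrary placeholders
    """
    if pc_t0 <= 0 or pc_t1 < pc_t0:
        raise ValueError("expected 0 < PC(t0) <= PC(t1)")
    static_part = ic ** a / (pc_t0 * pc_t1) ** b
    dynamic_part = ((pc_t1 - pc_t0) / pc_t1) ** c
    return static_part * dynamic_part


# Hypothetical complexity estimates for one image
print(machado_cardoso(ic=0.8, pc_t0=0.4, pc_t1=0.5))
```

Written this way, the first factor plays the role of Birkhoff's quotient, while the second factor vanishes when the processing complexity does not change between the two instants.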
The exponents a, b and c must be determined, and the times $t_0$ and $t_1$ denote the observation time. Other authors finally evaluated the difference between the image complexity and the processing complexity for the visual system using the difference in size between the original image and the compressed image (using a JPEG format or fractal format), adopting the algorithmic information theory approach proposed by Kolmogorov, Solomonoff and Chaitin. If the JPEG compression is done well (i.e. if the compression rate is reasonable), the compressed image is “visually” equivalent to the non-compressed image. The number of bytes in this compressed image is thus a reasonable approximation of the length of the “minimum length program” that can represent this image, and thus of the quantity of information contained in this image. This method will be revisited further on. This research also offers an interesting experimental verification using a database of images to test artistic abilities (Graves 1977). Furthermore, they explicitly introduce the observation time ($t_1 - t_0$) as a parameter of beauty, a hypothesis that would also be central to the work carried out by J. Schmidhuber and which makes it possible to understand perception across two phases: a brief visual acquisition and a slower time of understanding. We have seen these schemes earlier (see Figure 2.10).
Other authors made greater use of complexity theory to simultaneously derive the measures for O and C, using the work that Kolmogorov and Levin have done on complexity (Koshelev et al. 1998; Kreinovich et al. 1998). These authors believed that beauty could be expressed using the program p, its running time t(p) and its length l(p), all of which gave rise to the formula:
$$M_{kk} = f[t(p), l(p)] \qquad\text{[4.6]}$$
2 It is not clear whether P. Machado and A. Cardoso’s formula was inspired by Birkhoff’s formula.
where f(t, l) is a decreasing function of its two arguments. A functional study of f(t, l), based on the argument that deleting 1 bit in the image requires examining the chain twice (once extended by 1 and once by 0) to determine the best representation, leads to f(t, l) = f(2t, l − 1). They deduced from this that Birkhoff’s original formula [4.1] is the only one that is compatible with these measures of order and complexity.
4.3.4. Work carried out by J. Rigau, M. Feixas and M. Sbert
Despite the interest aroused by his work, it would be several years before Birkhoff’s ideas were adapted for paintings, rather than simple drawings. J. Rigau, M. Feixas and M. Sbert have produced the most complete work based on Birkhoff’s formulation, applied to the evaluation of aesthetic qualities in paintings from the 19th and 20th centuries (Rigau et al. 2008). They did this using image processing tools (segmentation by zone, color classification) and the theory of complexity. Several formulations were put forth. First of all, the Kolmogorov complexity, K, can be used advantageously in expression [4.4]:
$$M_{RFB1} = \frac{n H_{\max} - K}{n H_{\max}} \qquad\text{[4.7]}$$
where n is the number of pixels in the image. Since the Kolmogorov complexity, K, is non-calculable, we approximate it using commercial compression algorithms (JPEG, in this instance, as in Machado and Cardoso (1998)). The above formula was then refined using Zurek’s physical entropy, $H_o$, which led to:
$$M_{RFB2} = \frac{n H_o - K}{n H_o} \qquad\text{[4.8]}$$
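The two formulas above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: zlib (a lossless compressor) stands in for JPEG as the approximation of K, the image is a flat list of palette indices between 0 and 255, and the entropy used for the form of [4.8] is taken to be the empirical entropy of those indices. All names and test images are hypothetical.

```python
import math
import random
import zlib
from collections import Counter


def palette_entropy(pixels):
    """Empirical Shannon entropy of the palette indices, in bits per pixel."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def compression_measure(pixels, bits_per_pixel):
    """(n*H - K) / (n*H), with K approximated by the zlib-compressed size in bits."""
    n = len(pixels)
    k_bits = 8 * len(zlib.compress(bytes(pixels), 9))  # stand-in for K
    return (n * bits_per_pixel - k_bits) / (n * bits_per_pixel)


# Two hypothetical 64 x 64 images stored as palette indices (0..255)
ordered = [i % 16 for i in range(64 * 64)]            # regular stripes
rng = random.Random(0)
noisy = [rng.randrange(256) for _ in range(64 * 64)]  # pure noise

h_max = math.log2(256)                                # full 256-color palette
m1_ordered = compression_measure(ordered, h_max)      # form of [4.7]
m1_noisy = compression_measure(noisy, h_max)          # form of [4.7]
m2_ordered = compression_measure(ordered, palette_entropy(ordered))  # form of [4.8]

print(m1_ordered, m1_noisy, m2_ordered)
```

The striped image compresses to a small fraction of its raw size and scores close to 1, while the noise image cannot be compressed and scores close to 0 (slightly below it, because of the container overhead of zlib).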
Finally, the authors constructed an algorithm to create a tree representation of the image, dividing it into ν zones, based on a color criterion and expressing the variation in beauty of the approximation B(ν) that is obtained, represented by the description $\hat{R}(\nu)$, in the formula:
$$M_{RFB3}(\nu) = \frac{H\left(B(\nu), \hat{R}(\nu)\right) - H\left(B(\nu) \mid \hat{R}(\nu)\right)}{H\left[B(\nu)\right]} \qquad\text{[4.9]}$$
where H(x, y) expresses the joint entropy of the processes x and y, and H(x|y) is the conditional entropy of x knowing y. The results from this approach are depicted in Figure 4.2.
Figure 4.2. In their research, Rigau et al. chose an experimental base of highly varied paintings (a), which they broke down into color patches. Birkhoff’s measurement of beauty was derived from a complex computation that involved Kolmogorov’s complexity. (b) The evolution of Birkhoff’s beauty based on the number of regions chosen (source: Rigau et al. 2008; Mondrian/Holtzman Trust). For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE 4.2.– The paintings used in the test: (A) Composition in red, Piet Mondrian, 1938–1939; (B) Composition in red, blue, black, yellow and gray, Piet Mondrian, 1921; (C) Composition with grid 1, Piet Mondrian, 1918; (D) La Seine à la Grande-Jatte, Georges-Pierre Seurat, 1888; (E) Sous-bois à Pontaubert, Georges-Pierre Seurat, 1881; (F) Un dimanche après-midi à l’île de la Grande-Jatte, Georges-Pierre Seurat, 1884–1886; (G) Starry Night, Vincent Van Gogh, 1889; (H) Olive trees with the Alpilles in the background, Vincent Van Gogh, 1889; (I) Wheatfield with crows, Vincent Van Gogh, 1890.
This research merits further experimentation. First of all, it seems difficult to process different styles in the same way, because the strong cultural context masks the purely aesthetic evaluation, as we highlighted at the beginning. Furthermore, it seems essential to take into account specific properties of the human visual system when evaluating the “complexity” of the image, the chromatic space metric and the masking effects in particular.
4.4. Algorithmic approach with AI: J. Schmidhuber
J. Schmidhuber’s research is not related to earlier research in that it used an original formulation of beauty (Schmidhuber 1997). The research was also not specifically applicable only in an objectivist context, as the author defended the idea that the evaluation of beauty depended on the observer and the precise conditions of observation. Nonetheless, it seems to us that the proposed algorithmic approach, as well as the weight given to simplicity in measuring beauty, locates this work in the same family as the research we just saw, rather than being aligned spiritually with the machine learning methods that we will see further on. J. Schmidhuber chose an information theory and coding approach which he called the “lazy brain” approach.
In this approach, the image, denoted by D, is transmitted to the observer, who then creates a representation of it, D′, through a coding algorithm denoted by C. D, D′ and C can be described by symbol chains, in accordance with Turing’s representation of informatic objects. C represents all the knowledge that the observer could have acquired about D at a given instant. If D′ < D, then D is redundant with respect to the observer’s knowledge and it is therefore compressible in the sense of information theory. J. Schmidhuber held that beauty existed if there was a high probability that D′ was conditional on D knowing C (Schmidhuber 2007). Beauty is directly measured by the gain in compression obtained by the observer in order to move from D to D′. Beauty is thus greater when we have rules to interpret the visual scheme offered to us and to create a simple description of this with regard to our knowledge. This scheme seems apt to decide which stimulus we prefer in the presence of a large number of stimuli D1, D2, D3, etc. However, it does not seem to guide us in
choosing an action when faced with very different stimuli. In this situation, J. Schmidhuber proposes defining the aesthetic process as that which leads to greater compression. In order to model this learning method, he used Kolmogorov’s theory to describe the complexity of the compressor, or more precisely, the variation in the compressor’s complexity over time, for example in a reinforcement learning experiment. In the absence of any external reward, the optimal behavior would consist of optimizing what he called the reward of “curiosity”, which is expressed as a function of the knowledge $C_t$ and $C_{t-1}$ that we are likely to have if we had to observe $D_t$ using the knowledge of $D_t$ or without this knowledge. Any implementation of this highly theoretical strategy makes abundant use of Solomonoff’s prediction schemes and Gödel’s virtual machines.
Figure 4.3. Image (a) is beautiful because it is simple. J. Schmidhuber suggests that this be determined using a very simple iterative scheme, which results in the network of lines seen in image (b). The most remarkable points in the image are found to be represented in very compact form in this adapted meshing (adapted from Schmidhuber (2007))
In practice, it is still very complicated to implement this approach and, as in the case of the experiments by G.D. Birkhoff, the implementation is based on an ad hoc detection of the best coding schemes on a case-by-case basis (Figure 4.3). Furthermore, J. Schmidhuber himself concedes that many representation schemes that should have led to simple coding did not produce any significantly good results. J. Schmidhuber’s model seems to be well confirmed by studies carried out on average faces. Many studies have emphasized that “average faces”, obtained for example for purposes of facial recognition research, are extremely well rated aesthetically. It is also possible that these are the faces that we learn by default throughout life.
5 The Holy Grail of the Digital World: Artificial Intelligence
Is it permitted to suppose that a people whose eyes are accustomed to consider the results of a material science as the products of the beautiful won’t, at the end of a certain time, have singularly diminished its faculty of judging and feeling what is the most ethereal and most immaterial? Charles BAUDELAIRE (1976)
The next three chapters will be dedicated to machine learning methods. The principle behind all the approaches that we will see in the following two chapters is based on the hypothesis that the rules that govern the appraisal of a photo can be derived from the observation of a large number of images judged by human experts. Implicitly, if we do not introduce any other mechanism, we can hypothesize that these rules are universal and timeless and we thus wholly adopt the objectivist framework. Two groups of methods have been successively proposed. The first group was developed from classification algorithms, which were quite simple, with solid mathematical foundations. These used information selected from the image, most often chosen for the role attributed to them in the aesthetic judgment. These methods are said to be “primitive-based” or handcrafted. Then, from 2015 onward, these were slowly displaced by methods based on deep neural networks. These put the onus entirely on the power of the classification system (the deep network), omitting the step of selecting primitives. On the other hand, they required a large database to ensure high-quality learning; hundreds of thousands of examples, while handcrafted methods required only a few hundred images.
This present chapter is dedicated to studying image sources and the sources of expertise that are common to both of these families of research, as well as all considerations that are shared by “learning techniques”¹.
5.1. Which artificial intelligence?
Given the difficulty of psychophysiological or neurobiological approaches to the aesthetics of images, many research projects have proposed studying the problem of evaluation using digital techniques. This research is based on the conjunction of three major advances:
1) the development of efficient and universal learning methods (commonly called machine learning);
2) easy access to large quantities of images available on the Internet, in databases accompanied by judgments on quality;
3) the availability of powerful processors that can process very large quantities of information.
5.1.1. The principles
It is common to divide the intensive learning methods used for aesthetic studies into two broad families based on their chronological evolution:
– the first (Chapter 6) brings together classifiers based on primitives taken from images (handcrafted). Many classification techniques can be implemented, as we will see in the next chapter. However, the most widely used methods in recent years have been support vector machine (SVM) methods, which are quite easy to implement and lead to reasonable learning times with quite small knowledge bases, thanks to a small number of parameters and a good generalization ability. Their performance depends chiefly on the quality of the primitives used. They often perform well, without being excellent, but they struggle to learn subtle relationships (hierarchic relations, exceptions, semantically complex classification, etc.);
– for some years now, the preferred classifiers have been deep neural network (DNN) type classifiers, which will be studied in Chapter 7. DNNs are “end to end” classifiers, that is, they take the image itself as input, and the output is the category or measurement that is expected. DNNs are much harder to implement as they chain together successive processing layers (between 5 and 20) based on carefully constructed architectures. They also require long learning steps so that each classifier layer can be configured (learning the dependencies between layers). Large amounts of learning data must be available for this. Despite this complexity, DNNs offer remarkable performances, even in very difficult cases, and require little expertise. DNNs can also accept pre-calculated data as the input (like the primitives used in the first family) or data that are semantically associated with the image (ancillary information).
1 The Schmidhuber method, seen in the previous chapter (section 4.4), comes up as both an AI method and a deep network learning method. However, it is radically different from the methods that we will study now, as its objective is not to learn the aesthetic criteria for images. Instead, over time, it looks at the behavior of what goes into forming the observer’s judgment, for each image (Schmidhuber 2007).
5.1.2. Learning algorithms

Let us briefly look at the techniques that machine learning uses. An input space is connected to an output space through an algorithm, called a classification algorithm, whose parameters are unknown a priori. The parameters of the classification algorithm are determined using another algorithm: the learning algorithm. A bank of images is identified, providing a measurement of the aesthetic quality of each image. This measurement is not challenged: it is treated as a “truth”, and the bank of images is the learning database.

1) The learning step determines the unknown parameters of the classification algorithm, which is initialized using more or less arbitrary parameters:
a) primitives (which are supposed to highlight the discriminant properties of the aesthetic qualities) are extracted from each of the images from the learning database; in the case of classification by deep neural networks, we can replace the primitives by the image itself, leaving the job of extracting the discriminant elements to the network;
b) these primitives are fed to the input of the classifier, the response is read at the output and compared with “the truth”;
c) based on the difference between the output and the “truth”, the learning algorithm modifies the parameters of the classification algorithm;
d) when these operations have been applied to all images in the database (and perhaps repeated multiple times), the parameters of the classification algorithm are fixed.

2) A scoring step for a photo whose aesthetic qualities are unknown:
a) the unknown image is broken down into the same primitives as those used during the learning;
b) the primitives become the input for the classification algorithm; in deep neural networks, the image itself can be used; the output provides an assessment of the quality of the image.

This “output” can take various forms. It may be a classification like beautiful/not beautiful, professional/amateur and experienced/beginner.
It may also be a score or a set of values from which a single score is derived through regression.
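The learning and scoring steps described above can be sketched in a few lines. The following is only a minimal illustration, assuming a logistic classifier trained by plain gradient descent on two hypothetical primitives (called "sharpness" and "contrast" here); the toy database, the function names and the learning rate are all assumptions made for the example and are not taken from any study cited in this chapter.

```python
import math
import random


def predict(weights, bias, primitives):
    """Classification algorithm: map a vector of primitives to a score in [0, 1]."""
    z = bias + sum(w * x for w, x in zip(weights, primitives))
    return 1.0 / (1.0 + math.exp(-z))


def learn(database, n_features, epochs=200, rate=0.5, seed=0):
    """Learning algorithm: adjust the classifier parameters from the 'truth'."""
    rng = random.Random(seed)
    # Step 1: more or less arbitrary initial parameters.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    bias = 0.0
    for _ in range(epochs):                      # step 1d: repeat over the database
        for primitives, truth in database:       # step 1a: primitives of each image
            output = predict(weights, bias, primitives)   # step 1b: read the output
            error = output - truth               # gap between output and "truth"
            # Step 1c: modify the parameters to reduce that gap.
            weights = [w - rate * error * x for w, x in zip(weights, primitives)]
            bias -= rate * error
    return weights, bias


# Hypothetical learning database: (sharpness, contrast) -> 1 = beautiful, 0 = not.
TOY_DATABASE = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.7, 0.7], 1),
    ([0.2, 0.1], 0), ([0.1, 0.3], 0), ([0.3, 0.2], 0),
]

weights, bias = learn(TOY_DATABASE, n_features=2)

# Step 2: scoring photos whose aesthetic quality is unknown.
good_score = predict(weights, bias, [0.85, 0.75])
poor_score = predict(weights, bias, [0.15, 0.20])
```

Thresholding the score at 0.5 gives a binary beautiful/not beautiful decision, while the raw value can be read as a continuous score; a deep network would replace the hand-chosen primitives with the image itself.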
The classification algorithm can take many forms: Bayesian classifier, K-nearest neighbors, SVM, random forest, neural networks of various types, etc. The learning algorithm sometimes solves the underlying systems exactly through matrix inversion. However, most often, it is based on optimization techniques that aim to iteratively reduce the gap between “output” and “truth”. In the case of neural networks, the most widely used technique is a gradient descent that processes the neural layers successively, beginning with the last layers and moving back up toward the input layer. This is called gradient backpropagation.

In order to measure the efficiency of the classification algorithms, we carry out a test on a part of the “truth” database, being careful not to use these data during the learning (cross-validation). Various measurements reflect the quality of recognition: false alarm rate, recall2, precision3 and the receiver operating characteristic4. Beyond these verifications, the algorithms should be capable of reacting to data that are quite different from those in the learning database (ability to generalize). On the other hand, we must also avoid the algorithm only being capable of identifying the data in the learning base (situation of overfitting).

5.2. Why artificial intelligence in aesthetics?

Not all of the research that studies the aesthetics of photography has the same objective. Thus, some studies only wish to separate mediocre images from good images (i.e. of a quality usable as illustration). Others seek to differentiate between photos by amateurs and photos by professionals, for commercial purposes, for instance. Another use is an aesthetic criterion to classify a limited set of photographs to create a summary, an album or a repertory. This is especially the case with systems designed to assist photographers in finding the best framing for a set of photos.
This can be done a posteriori on a workstation; however, this assistance is also offered as a built-in camera system that works when the shot is taken (Lo et al. 2013; Schwarz et al. 2016). Finally, other studies try to attribute a score based on a universal reference frame (from 0 to 100, or across five levels).

2 Recall, sometimes called “sensitivity”, comes into play in binary classification. It is the proportion of correctly classified examples from the set of relevant examples.
3 Precision also comes into play in binary classification. It is the proportion of correctly classified examples from the set of examples assigned to that class.
4 The ROC curve (receiver operating characteristic) expresses the quality of a binary classification in the form of a curve, which gives the rate of true positives (fraction of the positives detected) as a function of the false positive rate (fraction of negatives that are wrongly detected as positives). The curve is generally obtained by varying the decision criterion. This is therefore a curve defined in the domain [0, 1] × [0, 1]. In the absence of any indication on the cost of an error, we often choose to operate at the point on the curve that is closest to the point (0, 1).
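As an illustration of the measurements defined in footnotes 2 to 4, the sketch below computes recall, precision and the false positive rate for a binary classifier, then sweeps the decision threshold to find the ROC point closest to (0, 1). The scores and labels are invented for the example, and the function names are hypothetical.

```python
import math


def confusion(scores, truths, threshold):
    """Count true/false positives and negatives for a given decision threshold."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, truths):
        predicted_positive = score >= threshold
        if predicted_positive and truth:
            tp += 1
        elif predicted_positive and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rates(scores, truths, threshold):
    """Return (recall, precision, false positive rate) at the given threshold."""
    tp, fp, tn, fn = confusion(scores, truths, threshold)
    recall = tp / (tp + fn) if tp + fn else 0.0       # positives that are detected
    precision = tp / (tp + fp) if tp + fp else 0.0    # detections that are correct
    fpr = fp / (fp + tn) if fp + tn else 0.0          # negatives wrongly detected
    return recall, precision, fpr


def best_operating_point(scores, truths):
    """Sweep thresholds and keep the ROC point closest to the ideal corner (0, 1)."""
    best = None
    for threshold in sorted(set(scores)):
        recall, _, fpr = rates(scores, truths, threshold)
        distance = math.hypot(fpr - 0.0, recall - 1.0)
        if best is None or distance < best[0]:
            best = (distance, threshold, fpr, recall)
    return best[1:]


# Invented classifier outputs and ground truth (1 = beautiful, 0 = not beautiful).
SCORES = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
TRUTHS = [1, 1, 0, 1, 0, 1, 0, 0]

threshold, fpr, recall = best_operating_point(SCORES, TRUTHS)
```

On this toy data, the threshold 0.6 gives a recall of 0.75 for a false positive rate of 0.25, which is the point nearest to (0, 1) among the thresholds tested.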
| Level | Description |
|-------|-------------|
| 1.0 | Beginner: takes a photo without considerations of composition, lighting etc. (the photo may still be technically good) |
| 2.0 | Amateur: a good photo for someone without any photography experience. There is nothing embarrassing, but nothing artistic either. The lighting and composition are correct |
| 2.5 | Experienced amateur: artistic elements are incontestably present. However, it cannot be said that the result in the photograph, its content or the way it is shot are refined |
| 3.0 | Semi-professional: on the way to becoming a professional photographer |
| 3.5 | Semi-professional: the photo is better than many others, but has been better-taken elsewhere |
| 4.0 | Professional: this photo was constructed, not “taken”. Everything in the image works together to create a great photograph |

Table 5.1. Proposal for a hierarchy of photographs
COMMENT ON TABLE 5.1.– The scale with four levels and two sub-levels created by Fang and Zhang (2017) is one of the most developed attempts to formalize an aesthetic hierarchy in photography. Each level receives a label (except level 3.0), a definition (except 2.5) and is further clarified by a set of remarks from professionals who observed the photograph in question. We complement the title and definition here using terms taken from these remarks when needed. Note that 4 is only awarded to a very small number of photographs with an exceptional aesthetic quality. The remarks (not given here) often refer to the author’s “professionalism”, which is a criterion that is probably taken in a more symbolic than strictly social sense. The majority of photographs are classified between 1.0 and 2.0.

It is quite rare for evaluation systems to clearly define their notation scale. The most widely used scales (e.g. DPChallenge5, which is the source for the AVA database that is most frequently used in the context of our discussion here) usually only define the scoring interval, giving the user the freedom to fix the gradations. However, in recent studies there have been efforts to fix the progression of the scores over a four-level scale, presented in Table 5.1 (Fang and Zhang 2017). These studies propose associating a “beginner level” with the 15% of images with the lowest score on DPChallenge, placing the next 55% of images in class 2, the next 15% of images in class 3 and reserving class 4 only for the 15% of images with the highest DPChallenge score (see Table 5.2).

5 The DPChallenge voting instructions for Internet users: “Rate entries on a scale of 1 to 10. A score of 1 is a “bad” photo, and a score of 10 is a “good” photo”.
| Quality according to Fang et al. | Class | DPChallenge classification | DPChallenge notation |
|----------------------------------|-------|----------------------------|----------------------|
| Beginner | 1 | ≤ 15% | ≤ 4.67 |
| Amateur | 2 | ∈ ]15%; 70%] | ∈ ]4.67; 5.82] |
| Semi-professional | 3 | ∈ ]70%; 85%] | ∈ ]5.82; 6.19] |
| Professional | 4 | > 85% | > 6.19 |

Table 5.2. Correspondence between Fang’s classification (Fang and Zhang 2017), the DPChallenge classification and its notation between 0 and 10 (Murray et al. 2012)
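The correspondence in Table 5.2 amounts to a simple lookup on the average DPChallenge score. A minimal sketch follows, assuming the score thresholds exactly as printed in the table (4.67, 5.82 and 6.19); the function name `fang_class` and the data layout are hypothetical and serve only to illustrate the mapping.

```python
# Upper bound of each class (inclusive), taken from the last column of Table 5.2.
FANG_CLASSES = [
    (4.67, 1, "Beginner"),
    (5.82, 2, "Amateur"),
    (6.19, 3, "Semi-professional"),
]


def fang_class(dpchallenge_score):
    """Map an average DPChallenge score to (class number, label) per Table 5.2."""
    for upper_bound, level, label in FANG_CLASSES:
        if dpchallenge_score <= upper_bound:
            return level, label
    # Anything above the last threshold falls in the top class.
    return 4, "Professional"


# A photo at the AVA mean score of 5.43 lands in the "Amateur" class.
example = fang_class(5.43)
```

Because the class boundaries are percentiles of the score distribution (15%, 70% and 85%), the same mapping could instead be computed from the rank of each photo within the database.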
Each level and some intermediate levels are well isolated. These are presented in Table 5.1. Each is succinctly illustrated by remarks from the professionals who classified the photos that belong to this level. However, no other criterion is truly objectivized: sometimes they refer to a debatable perception (“the subject is boring”, level 3.0), sometimes to unspoken professional rules (“You can tell in this photo that the person is paying attention to light. They are starting to think like a photographer”, level 2.5). These texts inevitably remind us of efforts made by the school of analytical aesthetics (see section 3.1.3 and the references (Elton 1954; Sibley 1959)) to explain and operationalize the vocabulary describing the aesthetic predicates of E. Zemach or C. Bell’s significant forms.

While most studies only attribute a score to the images, some try to “explain” the factors involved in their judgment, for example, by providing partial marks across criteria (composition, color, contrast, etc.), and yet others offer maps that indicate the regions where the criteria were detected. Finally, some studies make use of complementary information that can be found in databases indicating the thematic categories to which the images belong (e.g. a label indicating portrait or landscape), such that the score is adapted by prioritizing certain criteria for each theme. When the theme is not given, the aesthetic measurement algorithm may be in charge of determining the theme before carrying out the evaluation. Finally, in a small number of studies the images are associated with descriptions of ambiance, related to emotions: calm, joy, fright, surprise, etc., to help the evaluation. These emotions are then identified by photo-interpreters, although the algorithms increasingly offer to appraise these when the image contains faces in the foreground, for example.

5.3. Expert opinions

Machine learning methods use experts to construct the “truth databases” during the initial learning phase. We use images from this database as the input for the classification algorithm and force the output to be the “truth” given by the expert and in this way, using various techniques, we determine the unknown parameters for the classifier. The choice of expert decision is thus crucial as all algorithms seek to
imitate this style of deciding. Contrary to how other automatic decision-making tools were designed (like expert systems, now obsolete), the only involvement of experts in the construction of the machine learning algorithm is in building the learning database. These new methods only require a knowledge of the expert result in a variety of cases and ignore the processes that led to this expert opinion. There are many differences between studies in how these expert decisions are gathered. Let us take a quick look at them.

The best-known expert opinions are from the field of professional photography: art critics, journalists, professional photographers and juries of photography contests. These expert opinions are available on specialized sites, in online journals, and also during competitions that are often widely publicized. While the opinions shared in this way are not easy to dispute, they often require a step of interpretation as they are rarely expressed in ways that are immediately usable (ideally, we would want an opinion in the form of a score within an absolute reference frame). This interpretation step is highly reductive as it may misleadingly exaggerate the opinions. In addition, these opinions may reflect an adherence to a certain school of thought or a trend, either because the expert is immersed in them, or because the framework of the contest requires this. Also, as we saw in the Introduction when discussing the various reasons that people look at photographs, it is often difficult to isolate, within words that praise an image, the criteria that are purely aesthetic. Let us note, in particular, that photography, following the general trend of 20th-century art, today tends to prioritize “emotional power” over “aesthetic pleasure”, as was also seen in the Introduction, and expert opinions often reflect this trend (see footnote 7 in the Introduction).
Less well known than the expert opinions, but also the most discussed, are the opinions of experienced amateurs, which are generally collected on specialized sites such as DPChallenge. The large number of such opinions makes it possible to carry out serious statistical analyses on them. The wide diversity in the origins of these experts ensures that a much greater breadth of opinions is covered, across age groups, socioprofessional origins and different geographic and cultural contexts. These opinions are usually not very detailed, but can easily be translated into scores without any risk of changing the opinion that is expressed. However, when evaluations are made in the context of “challenges”, we must take into account all the biases of open votes (group behavior, self-appreciation, ballot stuffing, etc.) (Reagle 2013).

Several studies are based on small cohorts of experts generally made up of fine-arts students or image-processing students, specifically trained to evaluate established image databases according to well-established protocols. These studies guarantee neither the quality of professional experts nor the wide variety of results from experienced amateurs. However, they do ensure homogeneity and consistency across the results, which the other two methods cannot.
Finally, some studies are based on appraisals from “non-experts” by directly asking for opinions, often in a binary manner (I like this or I don’t like this), on photo-sharing social media sites, either general or specialized (Instagram, Flickr). Another trend that is very much in vogue is to call upon Internet users for small competitions during crowdsourcing campaigns6. In both cases, we try to arrive at an “average” opinion among the population by notably reducing the expectations on the aesthetic criteria that emerge from this.

5.4. The database

It is very easy to find many images on the Web, but it is much more difficult to find the information related to these images that is essential for learning and, in particular, reliable information on the aesthetic quality of the photos. The databases used thus often compromise between the unreliable but numerous opinions from the general public, and the invaluable but rare opinions from professionals in art photography. Some of these databases are compared in Table 5.3.

The professional sites are excellent sources of photographs of very high quality; however, they are not very useful in solving the problem that we highlighted earlier, of making use of expert opinions. They are thus not often directly used to constitute large databases. Their chief merit lies in providing excellent examples associated with very instructive remarks.

Sites for experienced amateurs are the primary source of images. They offer large quantities of photos, often of good quality. These sites make it possible to avoid the pitfalls of “artistic temptation” such that aesthetic qualities are prioritized, but they are not immune to criticism. In fact, in a movement that can be observed across all social networks, “clan behavior” tends to prevail.
For example, it is striking how on certain sites, special effects (material or software) are used to “improve” an image: improving the focus, contrast, filtering noise, color saturation, geometric distortions, etc. These reflect a truly “technophile” aesthetic that goes beyond the classic criteria laid out by the doyens of photography (refer, for example, to Figure 3.1b). Finally, the judgments about these images can be criticized for the informal nature of these consultations (Reagle 2013).

6 Crowdsourcing is gathering a large number of responses from Internet users to resolve a task. This contribution may or may not be remunerated (Cardon and Casilli 2015; Casilli 2019). In studies on the aesthetics of photographs, it consists of asking Internet users to award a score to a series of images, or to classify two or more images in order of preference. However, crowdsourcing is often criticized for producing unreliable results.
| Criteria | BEAUTY | AADB | Redi Base | AVA | AVA-2 / CUHK-DB | AVA-PD | Uni Tübingen | Psycho Flickr | Flickr AES |
|----------|--------|------|-----------|-----|-----------------|--------|--------------|---------------|------------|
| Size (× 1,000) | 15 | 10 | 100 | 250 | 50 | 120 | 380 | 60 | 40 |
| Aesthetic quality | Low | Low | Low | High | High | High | Low | High | Low |
| Aesthetic score | 3 | 5 | 4 | 10 | 10/2 | 10 | 10 | 5 | 5 |
| Semantic classes | 4 | No | Yes | 44 | 44 | 44 | No | Yes | Yes |
| Label about style | No | 11 | No | 14 | 14 | 14 | No | No | No |
| Annotation | No | Yes | Yes | Yes | Yes | Yes | No | No | Yes |
| Origin | Flickr | Flickr | Web | DP Chal. | AVA | AVA | Flickr | Flickr | Flickr |

Table 5.3. Some databases that are often used to study the aesthetics of images, with certain properties of these databases
REMARKS ON TABLE 5.3.– The size is expressed in “thousands of images”. The aesthetic quality is considered to be “high” if the images come from professionals or experienced amateurs and “low” if they come from social networks. The aesthetic score denotes the number of levels of evaluation. The annotations are made up of the written comments that sometimes accompany the judgments. The evaluations of BEAUTY and the Redi base were crowdsourced. AADB was evaluated by five experts who awarded the aesthetic score when their opinions converged reasonably well. AVA is the reference database, created from photos from DPChallenge that received at least 200 judgments from Internet users. The AVA-2 database and CUHK-DB were derived from the AVA database by choosing only the best and worst images; the AVA-PD (AVA-Photographer Demographic) database is created by only choosing images from certain chosen photographers. Flickr-AES selected photos from Flickr that had crowdsourced annotations on their aesthetic criteria. Psycho-Flickr selected images from Flickr produced by photographers for whom a psychological profile was created based on a survey they answered.

Generalist image databases, like Flickr, Tumblr, Picasa or Instagram, are also sources that have a large variety of images. However, their quality can be criticized, and the associated elements that make it possible to have expert opinion are most often not based on aesthetic criteria. Let us note that these sites generally have commercial goals that are far removed from artistic expertise (especially advertisements, but also the sale of products on some). These sites are constructed such that aesthetic appreciation contributes to this goal, often at the cost of objectivity and quality of judgment. This has been well analyzed in Pasquier et al. (2014). These sites thus seek to solicit opinions that reinforce mass opinions rather than individually pertinent opinions (seeking “influencer” effects (Cochoy 2011)).
For these reasons, these “primary” sources are not often used “as is” to supply studies on aesthetics. They are sometimes filtered to give operational databases, which are presented here.

5.4.1. Generalist databases, used for aesthetic judgments

– Flickr7: Although a primary source of images, Flickr has also been used as a database for studies on aesthetics, starting from the concept of attractiveness, which combines the criteria of “beauty” and “popularity” a little confusingly (San Pedro and Siersdorfer 2009; San Pedro et al. 2012). Attractiveness is measured by counting the remarks made by Internet users: attractive or unattractive. Databases that are better adapted to studying aesthetic properties have been constructed from Flickr, like Flickr-AES8 (Flickr Images with Aesthetics Annotation), which was obtained by getting 210 Internet users to annotate 40,000 photos using a scoring scale from 1 to 5, with each user annotating at least 120 photos. Another database, also derived from Flickr, is PsychoFlickr9, a database of 60,000 images posted by 300 photographers, professional users of Flickr (paying members, and not the general users with free accounts). This database brings together images and the psychological profiles of the photographers (posters and viewers) based on an assessment using the Big Five scale (see section 7.4.2). Each photographer responded to a concise form of the Big Five test (the Big-Five-10 (Rammstedt and John 2007)), making it possible to arrive at a psychological profile for an “author” made up of five major scores lying between -4 and +4. Further, eight experts independently filled in a “receiver” profile for each of the 300 photographers, after having examined 200 photographs of each. Three hundred “perceptual judgments” were thus attributed by taking an average of the eight opinions, accompanied by a measure of the level of agreement between the experts for each photographer (Cristani et al. 2013).

7 Flickr: http://www.flickr.com/.
8 Flickr-AES: https://www.flickr.com/photos/aesimbolon/.

– Photo.net10: This is a website for experienced amateurs and professional photographers, which has around a million good quality photographs, classified into 38 categories. The photos receive a score from 1 to 7, based on two criteria: aesthetics and originality (see Figure 5.1). The photos can be accessed in descending order of the average score awarded.
– DPChallenge11 is also a site for both experienced amateurs and professionals. It contains over 650,000 photos that are generally of good quality. The photos can be retouched and are often heavily modified. DPChallenge gives significant space to photography contests judged by Internet users. The scores go from 1 to 10. The best photos often receive over 200 scores. The many contests are identified by a theme that may serve as an identifier (the archives today have close to 1,000 themes).
DPChallenge is the essential primary base for studies on aesthetics. It gave rise, notably, to the AVA base, which is the most used. Apart from the large number of high-quality images, this database also offers a wide range of scores, as well as written comments on many images. Many studies have striven to get a good understanding of the statistical properties of the appraisals of the DPChallenge images (Kim et al. 2015; Fang and Zhang 2017; Talebi and Milanfar 2017). The distribution of scores is practically Gaussian, around an average that is just slightly greater than 5 (see Figure 5.2(a)). The standard deviation of the scores itself follows an approximately Gaussian distribution. The comments follow a roughly Poissonian distribution, with a mode of around 15 comments per photo (see Figure 5.2). If we compare the DPChallenge statistics with Fang’s notation (Fang and Zhang 2017) (seen earlier), Fang’s level 1 is awarded to photos whose score on DPChallenge is lower than 4.67, level 2 is awarded to photos whose score lies between 4.67 and 5.82, level 3 is awarded to those lying between 5.82 and 6.18, with level 4 being reserved for scores above 6.18. If we compare the comments and scores, it can be seen that the scattering of scores for the same photo is high in the presence of terms that express either the originality of the photo, or a highly personalized interpretation (Kim et al. 2015). This is quite a good reflection of what certain authors call the “non-conventionality” of the photo, that is, the fact that it goes beyond classical canons (Talebi and Milanfar 2017).

9 PsychoFlickr: https://github.com/raoulg/PsychoFlickr/.
10 Photo.net: http://www.photo.net.
11 DPChallenge: http://www.dpchallenge.com.
Figure 5.1. Relationship between the scores for aesthetics and originality in the Photo.net database (from Datta et al. (2006)). There is a near-linear dependence between these two criteria on average, but we can see that there are several notable deviations from the trend. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
– ImageCLEF 12 is taken from the MyFlickr database, itself taken from Flickr (these are thus ordinary images without any specific aesthetic intention). This was developed to test indexing algorithms and image searches. It contains a million photos, with annotations on the content (44 semantic tags), as well as on the emotions associated with the photos. 12 ImageCLEF: http://www.imageclef.org/.
Figure 5.2 (panels a, b and c). Scattering of the average scores awarded to each photo in the DPChallenge database (according to Kim et al. (2015))
COMMENT ON FIGURE 5.2.– (a) Distribution over all the photos of the average scores awarded to each photo (this is practically a Gaussian with an average of 5.43 and standard deviation 0.73); (b) distribution of the variances of the scores awarded to each photo (this is also practically a Gaussian with an average of 1.41 and variance of 0.21); (c) distribution of the number of written comments per photo; this is more of a Poissonian distribution with a parameter of 15.

– SUN13 is an image database compiled by MIT, containing 130,000 indexed images, divided into 900 categories based on scenes. Certain objects are segmented, belonging to over 4,000 categories of objects. These images have no specific aesthetic characteristics.

5.4.2. Databases that are specialized for aesthetic photography

– CUHK-DB14 is a database with two categories (beautiful/not beautiful) developed by the Chinese University of Hong Kong. The images are taken from DPChallenge. The database was created by selecting images belonging to either the best 10% or the worst 10%. The two categories are thus quite unambiguous and the “average” images are not included in the learning set as there is the risk of them being wrongly classified.
– AVA15 (Aesthetic Visual Analysis) is an image database that was specially developed for aesthetic studies. It gradually became the principal source for deep neural network research on aesthetics. It was created from the DPChallenge database and has 250,000 images, sorted into 60 categories, with abundant indexing based on aesthetic scores (over 200 scores per photo, on average), photographic style (14 of these, see Table 5.4), content (44 semantic tags), as well as the emotions associated with the photos (Murray et al. 2012). It thus inherited the statistical properties of the DPChallenge database (Talebi and Milanfar 2017), especially the correspondence between the notations and Fang’s research, seeking to establish an objective aesthetic scale (Table 5.2).
Given the important position the AVA base occupies in all of the studies carried out on the aesthetics of photography, the quality of the images in this database must be examined. AVA is created from DPChallenge, which offers very good quality images at high resolution (generally around 4,000 pixels per line). However, the AVA site only provides images with very low resolution (a maximum of 640 pixels in their largest dimension). It is possible to access the source image through a link given in the AVA, but it is unclear whether high-resolution images are frequently used in the studies that use the AVA, especially since DPChallenge puts mechanisms in place that are quite effective in protecting source images.

13 SUN: https://groups.csail.mit.edu/vision/SUN/.
14 CUHK-DB: http://mmlab.ie.cuhk.edu.hk/CUHKPQ/Dataset.htm.
15 AVA: http://www.computervisiononline.com/dataset/1105138637.

| Style attribute | Number | Style attribute | Number |
|-----------------|--------|-----------------|--------|
| Complementary colors | 949 | Duotones | 1,301 |
| High dynamic range (HDR) | 396 | Image grain | 840 |
| Light on white | 1,199 | Long exposure | 845 |
| Macro | 1,698 | Motion blur | 609 |
| Negative image | 959 | Rule of thirds | 1,031 |
| Shallow DOF | 710 | Silhouettes | 1,389 |
| Soft focus | 1,479 | Vanishing point | 674 |

Table 5.4. The 14 style attributes used in the AVA database and the number of times they occur (from Wang et al. (2016))
There are two adaptations of the AVA database that are seen in studies that significantly modify learning data and thus lead to evaluations that are sometimes difficult to compare. On the one hand, a set of images that are judged to be beautiful are distinguished from a set of images that are judged to be less beautiful by a threshold ± δ around the average value 5.43 and various learning sets were thus constructed for δ values from 0, 1 to 2. This parameter, δ, will be seen throughout the following chapters. On the other hand, the AVA database was reduced to only its most beautiful images (the 10% with the best scores) and least beautiful images (the 10% with the worst scores). Therefore, this new database, the AVA2, only has around 50,000 images. It is very close to the CUHK-DB database. Another sub-database was created from the AVA: AVA-PD (Photographer Demographic), which selected photos taken by regular photographers on the AVA site, presents a large number of photos and associates these photos with information regarding the photographer’s age, gender and country or region of activity. Only images from North America and Europe with complete information were included in this database, around 120,000 images in total. – AADB16. The Aesthetics with Attributes Database was created from the Flickr website. It contains 10,000 images annotated by five expert operators who give them an aesthetic score (on 5 levels), as well as a score for each of the 11 descriptive attributes of quality taken from a list supplied by photography professionals (interest of the content, lighting, color harmony, etc.). The average of the five aesthetic scores is taken as the reference score for the aesthetics, but we can also track a single operator so as to perhaps preserve a consistent aesthetic (Kong et al. 2016). 16 AADB: http://www.ics.uci.edu/skong2/aesthetics.html.
– BEAUTY 17 is a database of 15,000 images taken from Flickr, mixing images of highly varied quality. The indexing of these images based on their aesthetic quality is carried out through crowdsourcing by the community of users. Certain precautions are taken to limit a very large scattering of opinions, as well as to avoid biases in the statistical results. For the first point, only the users who contribute are chosen from a small number of countries that have high cultural homogeneity, and their results are filtered a posteriori to remove any behaviors that are too singular. With respect to the second point, the stock of processed images is divided into three large families of different qualities. The statistical representations of the samples from these families were notably different such that they reflected the natural distributions of the images. Finally, different criteria were chosen to describe the four classes: “people”, “nature”, “animals” and “city” (Schifanella et al. 2015). – The database by M. Redi et al. also has images (100,000) of varying quality, but taken from the Web from 8,000 requests that randomly drew 20 images per request (therefore it does not use a single source, which could potentially be biased, like Flickr or DPChallenge). Thus, 85% of the database comprises photographs and 15% comprises graphics, drawings, synthetic images, etc. These images are annotated by professionals along two indexes: the aesthetic quality (on four levels) and the type (“photo” or “non-photo”) (Redi et al. 2017). It has also been evaluated publically using The Amazon Mechanical Turk18. This evaluation makes it possible to annotate the entire database, but appears much less reliable. – The Tübingen University database (Schwarz et al. 2016) is made up of a selection of images from Flickr and contains 380,000 images. The annotation varies from 0 to 1. It is automatically annotated from accounts on Flickr. 
The score takes into account the number of views from users ($V_i$) (i.e. how many times the image has been accessed) and the number of favorable responses ($F_i$) (measured by the number of clicks on the like button) using the formula:

$$S(i) = \alpha \, \frac{\log(F_i)}{\log(V_i)} \quad \text{if } F_i > 1 \qquad [5.1]$$
where α makes it possible to fix the measurement for the best image at 1. The measurement of quality is thus that given by a large number of observers (an average of 6,800 votes per photo), even if they are neophytes. The distribution of votes is practically flat between 0 and 1.

17 BEAUTY: http://www.di.unito.it/schifane/beauty-icwsm15/.
18 The Amazon Mechanical Turk is a crowdsourcing marketplace; available at: https://www.mturk.com/.
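As an illustration of formula [5.1], here is a minimal sketch in Python. The function names (`popularity_score`, `normalized_scores`) are hypothetical, and so is the choice to return `None` when the score is undefined (F ≤ 1, or V ≤ 1, where the logarithm of the views would be zero or undefined). The normalization step simply assumes that α is the inverse of the best raw score in the collection, as the text suggests.

```python
import math

def popularity_score(favorites, views, alpha=1.0):
    """Raw score alpha * log(F) / log(V), or None when it is undefined."""
    # Formula [5.1] is only stated for F > 1; V > 1 avoids dividing by log(1) = 0.
    if favorites <= 1 or views <= 1:
        return None
    return alpha * math.log(favorites) / math.log(views)

def normalized_scores(photos):
    """photos: list of (favorites, views) pairs.

    Assumption: alpha is chosen as 1 / (best raw score), so that the
    best image of the collection receives the score 1.
    """
    raw = [popularity_score(f, v) for f, v in photos]
    defined = [r for r in raw if r is not None]
    if not defined:
        return raw
    best = max(defined)
    return [None if r is None else r / best for r in raw]

if __name__ == "__main__":
    # Three hypothetical photos: (favorites, views)
    print(normalized_scores([(10, 1000), (100, 10000), (1, 50)]))
```

Because the ratio of logarithms does not depend on the base, the natural logarithm is used here without loss of generality.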
– IDEA (Images Distributed Evenly for Aesthetics)¹⁹. This database is derived from the AVA. It was created such that the learning of aesthetic qualities benefited from the best statistical conditions. To this end, the database was built to contain an equal number of samples for the various values of beauty: 1,000 images are chosen among the most scored in each unit interval [0, 1[, [1, 2[, [2, 3[, ..., [8, 9[ and 191 beyond 9. Thus, the database consists of a total of 9,191 images. This database also presents other interesting statistics: the same average as the AVA, but a much greater standard deviation, of course, and near-zero bias (Jin et al. 2020).

– FACD (Filter Aesthetic Comparison Dataset)²⁰. This is a database derived from the AVA to compare the aesthetic qualities of many widely used methods to process images to “improve” photos uploaded onto sites like Instagram (Sun et al. 2017). The database comprises 1,260 original images edited in 22 different ways, using either Instagram libraries or the GIMP toolbox. These processes essentially consist of changing contrast and colorizations. Each image is placed into one of eight equally populated categories (animal, flora, landscape, architecture, food, portrait, urban scene and still life). The database also has 42,240 pairs of images that are associated with judgments of preference attributed by human observers using the Amazon Mechanical Turk.

– PCCD (Photo Critique Captioning Dataset) is made up of photos and comments taken from the professional photography website Gurushots²¹. It contains 4,200 images and 30,000 comments uploaded by very good photographers. We were unable to access the database itself.

5.4.3. Databases dedicated to artistic judgment

These databases chiefly use paintings as their source images, but sometimes also include photos or graphic design.

– JenAesthetics: The database from the University of Jena²².
The database assembled by the Computer Vision group of the University of Jena (Amirshahi et al. 2013) was set up with the specific goal of understanding the bases for aesthetic judgment given by human experts, to try to approximate this judgment using algorithms. The database is composed of 1,628 high-quality photographs of works of art collected during the Google Art Project, covering all styles of paintings from primitive art to contemporary art. Each artwork is accompanied by about 20 evaluations conducted by a panel of 130 observers who are non-specialists in art. These observers had to assess five criteria: aesthetics, beauty²³, color, composition and content. Each evaluation is carried out using a continuous cursor between 0 and 1.

19 IDEA, a database developed by the Beijing Institute of Electronics, does not seem to be accessible to the public.
20 FACD: https://wtwilsonsun.github.io/FACD/.
21 Available at: https://gurushots.com/.
22 JenAesthetics: http://www.inf-cv.unijena.de/en/jenaesthetics.

– BAM! Behance Artistic Media dataset²⁴. This database was formed from the Behance site²⁵: this was a portal to websites that specialized in presenting professionals in the field of multimedia production. We thus find photos, but also drawings, paintings and videos, which are not directly related to our field of study here. Behance contains over 65 million documents. BAM! selected 1.9 million of these across the following categories: synthetic images, cartoons and graphic designs and paintings. Each image is given labels describing the emotion (“calm”, “happy”, “sad”, “worried”, etc.) and the category of object (only scenes that were covered by the following terms were retained: bicycle, bird, building, car, cat, dog, flower, person and tree). The annotations were obtained in two steps: first by crowdsourcing a limited number of images, and then through iterative automatic classification (Wilber et al. 2017).

– Finally, sites where art is sold online: many sites specialize in selling art online. They often offer photographs alongside paintings and sculptures (Artspace, Artsper, ArtPhoto, KazoArt, Pixopolitan, Singulart, Ugallery, for example). They offer catalogs with a few hundred to a few thousand photos which are distinguished by their affirmed quality. These sites group together photos per category (landscapes, portraits, still life); they mention the authors and often accompany the documents with comments, which can be used as metadata in the classifications. The resources offered by these collections are discussed in Messina et al. (2018), for example.
Let us note that these sites often offer artwork that is more artistically engaged than amateur sites like DPChallenge and refer to aesthetic criteria that could be described as “scholarly”, in that they use advanced photographic techniques.

23 It is not specified how the authors distinguish between aesthetics and beauty, and the statistical results show that these variables are, in the end, strongly correlated.
24 BAM! Behance Artistic Media dataset: https://bam-dataset.org/.
25 Behance: https://behance.net/.

5.4.4. Other image databases that are sometimes used

We will also come across studies that use databases that were designed for a completely different purpose than studies on aesthetics, such as the IAPS and OASIS libraries.

– IAPS (International Affective Picture System)²⁶: IAPS is a database created by psychologists at the National Institute of Health, comprising a thousand reference photos in order to test emotions. These contain scenes that are likely to arouse empathy, surprise or fear: erotic scenes, insects, snakes, road accidents, wounds, etc. (Lang et al. 1999).

– OASIS (Open Affective Standardized Image Set)²⁷: The OASIS library was created for the same purpose by Harvard University in 2015. It contains 900 images accompanied by many affective valence scores, as well as the arousal attributed by Internet users (Kurdi et al. 2017).

26 IAPS is not available online, to prevent the sharing of test images, which would weaken their effectiveness during evaluation.

5.4.5. Increasing databases

Having seen the vast number of image sources, we may think that these sources are enough for the algorithms’ requirements. While this is generally true, with an increase in the depth of the networks certain algorithms need millions of examples, and these numbers are not available in the databases we have seen here. A current practice in the field of AI is to expand these databases using various replicas of the same image, created by modifying the image in ways that do not alter the judgment of the image itself. This practice, which is very widespread in the field of pattern recognition (the same car is reframed, symmetrized, its colors are modified, etc.), was also proposed for studies on aesthetics. It is possible to make an annotated database 10 or 20 times its original size.

However, it is difficult to imagine carrying out operations on images that will not affect the aesthetic qualities of the image and, unfortunately, too many studies apply these operations without considering this, so as to adapt the images to DNN constraints (some of these operations will be seen in Chapter 7 and are illustrated in Figure 7.1). This is why some studies have tried to measure the impact of the most common modifications on the judgments made about the image. We can cite the study in Wang et al. (2016), which was based on a judgment tool (the BDN system) that tried to imitate the process followed by a human expert. We will look at BDN later in the book (see section 7.2.3). In their work, Wang et al.
systematically tested what impact an operation (cropping, symmetry, etc.) had on the score awarded to an image by BDN. These transformations are described in Table 5.5. A measurement (LP) expresses the average effect of each operation over the entire set of images. It can be deduced from this that only reflection, change in scale and the addition of low noise ensure that the score is maintained. We can thus multiply learning data using these three transformations. However, this leads to a gain of only 2 or 3% in the quality of aesthetic evaluation across two classes.

27 OASIS: http://www.benedekkurdi.com/#oasis.
| Transformation | Description | LP |
|---|---|---|
| Reflection | Vertical symmetry | 0.99 |
| Random change in scale | Choice of a unique factor in [0.9, 1.1] | 0.94 |
| Compression/stretching | Choice of 2 factors in [0.9, 1.1] | 0.55 |
| Low noise | Gaussian noise ∈ N(0, 5) | 0.87 |
| Strong noise | Gaussian noise ∈ N(0, 30) | 0.63 |
| Color modification | Independent distortion of the RGB channels | 0.10 |
| Rotation | Random affine transformation | 0.26 |

Table 5.5. Effect that various transformations have on the appraisal score of a photograph. When LP has a value of 1, the evaluation is not affected by the operation; when LP has a value of 0, the evaluation of the modified image no longer depends on the initial image. The evaluation is only resistant to reflection and change in scale and, to a smaller extent, to added noise of low amplitude (Wang et al. 2016)
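To make the practical consequence of Table 5.5 concrete, the following Python sketch enlarges an annotated set using only the three transformations that were found to preserve the score. It is an illustration under stated assumptions, not the procedure of Wang et al.: the function names are hypothetical, "vertical symmetry" is read as a mirror about the vertical axis, the 5 in N(0, 5) is taken as a standard deviation on a 0–255 scale, and the rescaling uses a crude nearest-neighbor resampling.

```python
import numpy as np

def reflect(img):
    """Mirror the image about its vertical axis (left-right flip)."""
    return img[:, ::-1]

def rescale(img, factor):
    """Nearest-neighbor rescaling with one factor for both axes."""
    h, w = img.shape[:2]
    new_h = max(1, int(round(h * factor)))
    new_w = max(1, int(round(w * factor)))
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return img[rows][:, cols]

def add_low_noise(img, rng, sigma=5.0):
    """Additive Gaussian noise; sigma = 5 is an assumed reading of N(0, 5)."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def augment(img, label, rng):
    """Return the original plus three replicas that keep the same label."""
    factor = rng.uniform(0.9, 1.1)  # single factor, as in Table 5.5
    return [
        (img, label),
        (reflect(img), label),
        (rescale(img, factor), label),
        (add_low_noise(img, rng), label),
    ]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)
    for replica, score in augment(image, 6.2, rng):
        print(replica.shape, score)
```

Rotation, color distortion and anisotropic stretching are deliberately left out, since Table 5.5 reports low LP values for them.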
6 Primitive-based Classification Methods
The message of beauty is free of concepts.
Mikel DUFRENNE (1980)

[...] once it is accepted that there is no physical or perceptual property that distinguishes works of art from simple objects, then all that remains, apparently, is proving that the difference is conceptual in nature.
Jean-Pierre COMETTI, Jacques MORIZOT, Roger POUIVET (2000)
Artistic photos are documents of large dimensions that demand to be processed in their full resolution, across their finest gradations and colors. Reducing these photos to a family of primitives is an arduous task, but it was a necessity for researchers working on the earliest machine learning methods¹. In the methods presented here, we will first look at the use of specific primitives to help in aesthetic judgment, primitives chosen to reflect the know-how deployed by experts in their evaluation of images and the criteria that we have examined at length earlier in the book (see Chapter 3). This choice justifies the generic name given to these primitives: handcrafted primitives.

However, equipped with their experience in representing images for indexing and searching for data in the field of multimedia imaging, researchers also suggested that we adopt more generic characteristics, which were known to reflect properties across a large diversity of photographs: describing the light histogram, or color planes, the spatial distribution of color patches, the geometric moments of regions during a hierarchic decomposition, indicators of textures, etc. Other approaches were based on representations used in computer vision to find shapes and objects, lists of interest points (SIFT, SURF, Harris, etc.). Finally, some authors proposed combining these various primitives, as can be seen in the example given in Table 6.1. This chapter aims to introduce these various approaches and compare their efficiency.

For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip.

1 We will see that due to memory-size constraints, even today (in 2020), it is not possible to directly use the image with its full extension, even in the most advanced neural networks, and that certain details are inevitably lost when processing a 4,000 x 4,000 pixel photo using such a network (see Chapter 7).

| Name | Description |
|---|---|
| Brightness | Average and standard deviation in the V-channel in the HSV space |
| Color variance | Color variance in the Lab space |
| Contrast | Width of the middle 96% mass of the histogram of the V-channel in the HSV space |
| # Edges | The image is divided into 16 x 16 blocks. Number of blocks containing over 10% of the edges |
| # Edges L R T C B | Numbers on the left, on the right, on top, in the center and at the bottom |
| # Hue counts | Number of unique hues |
| Saturation | Average and variance of saturation |
| Sharpness | Variance of the Laplacian |
| Distance to the center | Distance between the center of the S region and the center of the image |
| Rule of thirds | The shortest distance from the center of the S region to one of the axes of thirds |
| Hue in the S region | Average hue, brightness and saturation in the S region |
| Sharpness in the S region | Variance of the Laplacian of the S region |
| Size of the S region | Number of pixels in the S region |
| Location of the S region | The image being divided into nine parts, proportions of the S region in each part |
| Color difference | Difference of colors in the Lab space between the S region and the background |
| HSV difference | Difference of hue, brightness and saturation between the S region and the background |
| Sharpness difference | Difference in sharpness between the S region and the background |
Table 6.1. An example of the primitives chosen for a study of aesthetics in photos (adapted from Simond et al. (2015))
COMMENT ON TABLE 6.1.– Some of these are generic primitives used in image processing (like brightness or contrast); others are recommendations made by photography textbooks (distance to the center, rule of thirds, object/background contrast, preferential focus on the subject of interest, represented by the salient region (S region), which is the zone of maximum salience).
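Three of the generic primitives of Table 6.1 (brightness, contrast and sharpness) can be sketched in a few lines of Python with NumPy. This is only an illustration: the function names are hypothetical, the V-channel is taken as the maximum of the R, G and B values scaled to [0, 1], the "middle 96% mass" is read as the span between the 2nd and 98th percentiles, and the Laplacian is the simple 4-neighbor discrete operator evaluated on interior pixels.

```python
import numpy as np

def value_channel(rgb):
    """V of HSV for an 8-bit RGB image: max(R, G, B), scaled to [0, 1]."""
    return rgb.max(axis=-1).astype(np.float64) / 255.0

def brightness(v):
    """Average and standard deviation of the V-channel."""
    return float(v.mean()), float(v.std())

def contrast_width(v, mass=96.0):
    """Width of the central `mass` percent of the V histogram."""
    tail = (100.0 - mass) / 2.0
    low, high = np.percentile(v, [tail, 100.0 - tail])
    return float(high - low)

def sharpness(v):
    """Variance of the 4-neighbor discrete Laplacian (interior pixels only)."""
    lap = (v[:-2, 1:-1] + v[2:, 1:-1] + v[1:-1, :-2] + v[1:-1, 2:]
           - 4.0 * v[1:-1, 1:-1])
    return float(lap.var())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    photo = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    v = value_channel(photo)
    print(brightness(v), contrast_width(v), sharpness(v))
```

The region-based primitives of the table (those referring to the S region) additionally require a salience map, which is outside the scope of this sketch.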
Figure 6.1. (a) Distribution of scores awarded by the ACQUINE software to a collection of 140,000 photographs in the Photo.net database: very good scores are rare, and the most probable scores are quite low (from Datta and Wang (2010)). (b) Two photographs scored by ACQUINE: 8.1/10 for the photograph on top and 2.5/10 for the one below
However, regardless of the primitives chosen, the research is designed as a two-step process: first, determining the parameters of the classification algorithm; second, classifying unknown images. Both these steps use the same core, which we described in section 5.1.2.

We will look at some of the methods proposed in the literature that do not use deep neural networks (which will be studied in the next chapter). We have chosen these as they illustrate the diversity of approaches. In our discussion, we will distinguish between research that aims to provide an evaluation of the quality of the photograph and research that aims to improve how a photograph is taken. We will conclude this chapter by presenting some original research at the margins of aesthetic judgment.

6.1. Judging aesthetics

6.1.1. Multimedia primitives: the ACQUINE system (Datta et al.)

This study is among the first to have approached the problem of computer-assisted evaluation of the aesthetics of a photograph (Datta et al. 2006, 2008; Datta and Wang 2010). It led many other researchers in the community to work in this domain. The result expected from the algorithm was in two parts: first, a two-category classification into low-quality images (learned from the images on Photo.net (see section 5.4) with scores below 4.2) and high-quality images (those with scores above 5.6); second, an individual score between 0 and 100 (this was later brought down to 0–10), with the score increasing with the aesthetic quality of the image. The primitives taken from images are inspired by a few criteria recognized in aesthetics; they chiefly use primitives that have proven themselves in the multimedia domain.
There are 56 of these:

– the size and aspect ratio, and the average gray level;
– a color measurement, derived from a rough quantification of the RGB space into 64 cubes and from the difference (in the Luv space) between the distribution of this image and one which is equi-distributed;
– the average values of saturation and hue;
– three measurements that indicate the degree of agreement with the rule of thirds in the three HSV components;
– two indicators of familiarity, which compare the image to a reference set and compute the average distance to the 20 or 100 closest images;
– 12 texture terms derived from the decomposition of the HSV components into wavelets;
– a region-based decomposition (classification using k-means in the chromatic space), retaining only the five most important regions and describing them using their size, their average color point and two indicators expressing the distribution of hues and how complementary they are on the color wheel;
– three depth of field indicators that make it possible to distinguish in-focus subjects against a blurred background;
– finally, a factor that measures the convexity and regularity of shapes.

The classifier used is an SVM that uses the 30 most discriminating primitives out of the 56. A linear regression on the polynomial expression of the chosen factors allows us to derive the final value for the aesthetic of the image. The high-quality versus low-quality classification is carried out correctly 63% of the time, which is a rather modest rate of success. The variance in this classification is of the order of 0.5 on a 7-level score, which makes it possible to compare it with the Photo.net scoring. It is difficult to draw conclusions about the choice of primitives made by the selection algorithms: the description of the regions appears to rank highest, followed by the average gray level and the convexity of forms. However, these conclusions are neither unanimous nor particularly convincing. The evaluation is quite strict, as can be seen in Figure 6.1(a), and there are not many beautiful images. The system was made open-access under the name ACQUINE².

6.1.2. Edges and chromatic distance: Ke et al.

At around the same time as this research, Ke et al. (2006) presented research with similar objectives: their work was to classify a photo as either amateur or professional. However, the authors also adopted other criteria, as some of the elements used are more specifically related to what we would call the image quality. The database used is the DPChallenge database (see section 5.4); only those photos that received at least 100 votes were considered and they were awarded a score that was the average of all the scores they had received.
Only the highest scored photos (belonging to the top 10%) were retained to form the category P (professional) and the lowest 10% were bracketed into category A (amateur), thus creating a database B of 6,000 photos (see Figure 6.5).

2 ACQUINE = aesthetic quality inference engine, an online service to evaluate the quality of a photograph. Available at: http://acquine.alipr.com/. As of late 2020, ACQUINE was still running. It has found some success among amateur photographers, since it was the only free-to-access evaluation platform for a long time. There is also a forum set up by Internet users who wish to experimentally determine the scoring criteria by submitting, to the engine, a series of images modified through image processing in order to obtain either the highest or lowest score.
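The construction of the two learning categories described above (at least 100 votes per photo, average score, top and bottom 10%) can be sketched as follows. The function name `build_pa_sets` and the data layout (a list of identifier and vote-list pairs) are assumptions made for the example; they do not come from Ke et al.

```python
def build_pa_sets(photos, min_votes=100, fraction=0.10):
    """Split rated photos into P (top fraction) and A (bottom fraction).

    photos: list of (photo_id, list_of_votes) pairs (hypothetical layout).
    Photos with fewer than `min_votes` votes are ignored.
    """
    rated = [
        (photo_id, sum(votes) / len(votes))
        for photo_id, votes in photos
        if len(votes) >= min_votes
    ]
    # Best average score first
    rated.sort(key=lambda item: item[1], reverse=True)
    k = int(len(rated) * fraction)
    professional = [photo_id for photo_id, _ in rated[:k]]
    amateur = [photo_id for photo_id, _ in rated[-k:]] if k > 0 else []
    return professional, amateur

if __name__ == "__main__":
    # Twenty hypothetical photos with 100 identical votes each, plus one
    # photo with too few votes to be considered.
    collection = [(i, [float(i)] * 100) for i in range(20)]
    collection.append(("few-votes", [10.0] * 5))
    print(build_pa_sets(collection))
```

Photos whose average score falls between the two extremes are simply left out of the learning set, which is why the resulting classes are so clearly separated.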
Figure 6.2. Paintings that received a score of 10/10 on the ACQUINE evaluation: (a) Luca Giordano, Minerva and Arachné (exposition Petit-Palais, Paris) (1695); (b) Jean-Antoine Watteau, L’Amante inquiète (The Worried Lover) (Chantilly museum) (1715–1717); (c) Jacques-Louis David, Mars disarmed by Venus (Royal Museum of Beaux-Arts, Belgium) (1824)

COMMENT ON FIGURE 6.2.– These painted artworks, like those given below, were taken from a test conducted on 200 randomly chosen paintings in the classical repertoire from 1400 to 2020. This test does not allow us to establish any law about the style or the theme that would make it possible to obtain a high score. For instance, another artwork by Luca Giordano, Saint Nicholas in Glory, which is close to Minerva and Arachné, given above, obtained only a modest 6.6/10, and Leonardo da Vinci’s Mona Lisa obtained a score of 8.6/10.
Figure 6.3. Other paintings from the classical repertoire which received a score of 10/10 through the ACQUINE evaluation: (a) Nicolas Poussin, Bacchanal with a Lute Player (Louvre-Lens Museum) (1627–1628); (b) Jacques-Louis David, The Intervention of the Sabine Women (The Louvre Museum, Paris) (1799)
Figure 6.4. Other paintings from the classical repertoire that received a score of 10/10 on the ACQUINE evaluation: (a) Arnold Böcklin, Roger and Angélique (Alte Nationalgalerie, Berlin) (1873); (b) Georges Lacombe, Red Pines (temporary exhibition at the Musée d’Orsay, Paris) (1894–1895); (c) Hans Bellmer, Étude pour Ubu (Scharf-Gerstenberg collection, Berlin) (1934)
Figure 6.5. How to separate beautiful and less beautiful photos?
COMMENT ON FIGURE 6.5.– (a) Scores awarded to P (professional) or A (amateur) images by the system developed by Ke et al. It must be noted that these two sets, clearly separated on DPContest.com (the DPChallenge competition forum), are no longer separate. (b) The classifier’s performance can be improved by making stricter choices (left to right) of photos selected for learning. Nonetheless, it can be seen that by reducing the choice to only the top 2% of the most beautiful images and the lowest 2% of the ugliest images, we still have an error rate of 19% in both categories, P and A (from Ke et al. (2006)).

In order to normalize the images, they are reduced to a square format. The primitives used are from the general field of image processing:

– an edge map detected by a Laplacian is compared to the average contour map of the two distributions, P and A; a box enclosing the majority of the contours is evaluated and brought in proportion to the image size (the “beautiful” images tend to have their contours concentrated far from the edges);
– following a quantification of the color planes, we search for the n images of B whose histograms are the most similar (in the sense of the k-nearest-neighbors) to the processed image. If $n_P$ and $n_A$ are the numbers of images taken from P and A ($n = n_P + n_A$), we use $n_P / n_A$ as the indicator of quality;
– a hue index is derived from a count of the number of colors whose hue is greater than 5% of the hue dynamic;
– a contrast index and a brightness index are also used.

Five measures of quality are taken from these primitives, which are combined within a naive Bayes classifier, perhaps followed by a Real-AdaBoost classifier to define the ultimate judgment. The performances obtained through this method are still modest.
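The nearest-neighbor color primitive in the list above can be illustrated by the following Python sketch. It is not the implementation of Ke et al.: the function names, the quantization into 4 levels per channel and the use of the L1 distance between histograms are assumptions made for the example. The function returns the two counts, the numbers of P and A images among the nearest neighbors, and leaves their combination into a single indicator to the caller.

```python
import numpy as np

def color_histogram(img, levels=4):
    """Normalized histogram of an 8-bit RGB image over levels**3 color bins."""
    q = (img.astype(np.int64) * levels) // 256          # per-channel bin index
    codes = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
    hist = np.bincount(codes.ravel(), minlength=levels ** 3)
    return hist.astype(np.float64) / hist.sum()

def neighbour_counts(query_hist, ref_hists, ref_labels, k=5):
    """Count P and A images among the k reference histograms closest to the query.

    ref_hists: array of shape (m, bins); ref_labels: sequence of "P" or "A".
    The L1 distance is an assumed choice of histogram similarity.
    """
    dists = np.abs(ref_hists - query_hist).sum(axis=1)
    nearest = np.argsort(dists, kind="stable")[:k]
    n_p = sum(1 for i in nearest if ref_labels[i] == "P")
    return n_p, len(nearest) - n_p

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    refs = [rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8) for _ in range(10)]
    labels = ["P"] * 5 + ["A"] * 5
    hists = np.stack([color_histogram(r) for r in refs])
    print(neighbour_counts(color_histogram(refs[0]), hists, labels, k=3))
```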
Since the classes are defined based on highly separated truths, the P and A images are found, respectively, in the P and A categories with a high probability (of the order of 90%), but the images that belong to neither P nor A are distributed quite randomly in P and A. Furthermore, what we learn about the primitives that matter most for aesthetic judgment is still quite poor. The presence of blurring that we get from the distribution of contours seems to be one of the most discriminating criteria.

6.1.3. Photography rules: Luo and Tang and Mavridaki and Mezaris

This research is reported in Luo and Tang (2008). It extends the research by Ke et al., but adopts a strategy founded on the distinction between the chief object and its background. The separation between object and background is based on the focus, as seen earlier, and on a hypothesis that we find in many essays on photography, although it is far from being a general rule: the subject is generally sharp. The detection algorithm thus returns to this principle and locally measures various statistics of intensity and its horizontal and vertical derivatives. The authors can thus implement various criteria regarding the object and its background, as well as the contrast that differentiates between them. Thus, the background benefits from being quite uniform, with a reduced chromatic palette, while the object benefits from greater harmonic richness (measured in the HSV space by marginal histograms). The quality of the chromatic palette is obtained by comparing these marginal histograms with the histograms for the “beautiful” images in the learning base that are closest in the space of primitives. Finally, the agreement with the rules for the spatial distribution is also measured, like the rule of thirds, which is measured from the center of gravity of the object. The same database created from DPChallenge, modified by Ke et al., serves as the learning database (3,000 high-quality photos and 3,000 bad photos). The classification is carried out by a Bayesian classifier or by an SVM followed by AdaBoost. The results are significantly superior to the results obtained by Ke et al. (see Figure 6.6), especially when the two families of primitives are combined.

Mavridaki and Mezaris (2015) propose returning to primitives that are closest to photographic criteria. Their method is based on determining five aesthetic components in parallel: simplicity, sharpness, chromatic richness, motifs and composition. Simplicity uses an approach that is quite similar to Luo’s approach, distinguishing the object from the background. The chief original feature of this research concerns the search for motifs. These are identified by carrying out paired image tests on the SURF interest points under various hypotheses of symmetry.
The overall composition is compared to three typical configurations: the rule of thirds, the “full field” (where the object occupies the most significant portion of the view) and landscape (where the regions of salience are vertically distributed; see Figure 6.7). After this, the vector obtained by concatenating these descriptors, which has 1,232 dimensions, is presented to an SVM classifier. The performances obtained are much better than those obtained by Luo or Ke, especially when the classifier is applied to images that have been thematically separated (animals, architecture, landscape, etc.). There is over 92% correct classification in the CUHK database. On the AVA database, performances reach 78% accuracy.
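The rule of thirds appears in several of the methods above, measured from the center of gravity of the object or of the salient region. A minimal Python sketch of such a measurement is given below. It is an illustration only: the function names are hypothetical, the object is assumed to be given as a binary mask, and the distance is normalized by the image dimensions, a choice that the methods cited do not necessarily share.

```python
import numpy as np

def centroid(mask):
    """Center of gravity (row, column) of the non-zero pixels, in pixel units."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("the mask contains no object pixel")
    # +0.5 places the coordinate at the center of the pixel
    return float(rows.mean()) + 0.5, float(cols.mean()) + 0.5

def thirds_distance(mask):
    """Shortest normalized distance from the centroid to one of the four third lines."""
    height, width = mask.shape
    cy, cx = centroid(mask)
    dy = min(abs(cy / height - 1 / 3), abs(cy / height - 2 / 3))
    dx = min(abs(cx / width - 1 / 3), abs(cx / width - 2 / 3))
    return min(dx, dy)

if __name__ == "__main__":
    mask = np.zeros((300, 450), dtype=bool)
    mask[80:120, 130:170] = True      # hypothetical object near a third line
    print(thirds_distance(mask))
```

A value close to 0 indicates that the object sits on one of the third lines; an object at the exact center of the frame gives 1/6 with this normalization.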
Figure 6.6. Comparison of the results (accuracy based on recall) from Ke and Luo
COMMENT ON FIGURE 6.6.– Bayesian classification (a) using Ke’s primitives, (b) using Luo’s primitives, (c) with the combined primitives; (d) and (e) the role played by Ke’s and Luo’s primitives in the classification (for Ke, the dominant quantity is the intensity, for Luo it is lighting); (f) performances of the various classifiers (Bayesian, SVM or AdaBoost). Luo’s choice of primitives leads to smaller errors than Ke’s or Datta’s, regardless of the classifier (from Luo and Tang (2008)).

Figure 6.7. Detection of primitives of motifs (on top) and composition (at the bottom) using Mavridaki’s and Mezaris’ approach
COMMENT ON FIGURE 6.7.– The motifs are measured through the symmetries of the SURF contrast points, applied to multiple sub-images of the photo. Three compositions are tested: full field (a), rule of thirds (b) and landscape (c). Eight principal objects are detected in the scene; we then measure the distance and positions of their centers with respect to the center of each region marked by a +, as well as their overlap. We then place the scene into one of the three categories (from Mavridaki and Mezaris (2015)).

6.1.4. High-level primitives: Dhar et al.

This research is also an extension of Ke’s research, but the authors tested the use of high-level primitives, explicitly describing the composition of the image, its content and illumination in a way that can be interpreted by a human observer (Dhar et al. 2011). Figure 6.8 indicates how these primitives are calculated in addition to Ke’s primitives. The detection of many primitives is complex and yields very varied information: the presence of faces, animals, recognition of a scene (mountain, interiors, sky, etc.). These detections require the use, in parallel, of classification sub-systems (17 SVMs are used to resolve the various problems related to shape recognition), with their own learning databases (e.g. a classifier for each of the 10 types of animals identified). The authors finally chose 26 attributes. The authors recommend using an SVM classifier that gives better results than a Bayes classifier on a database of 16,000 images taken from the DPChallenge database. The study then shows the clear benefits of this approach with respect to Ke’s research. These are summarized in Figure 6.9. This approach also makes it possible to calculate an indicator for the image’s power to “retain” the observer’s interest through a different weighting of the attributes of the content.

6.1.5. Generic descriptors of vision: Marchesotti et al.

This research has been presented in Marchesotti et al. (2011, 2013). While not seeking to codify “rules of beauty”, the authors use primitives for generic description, developed for indexing and searching through images. These are either Bags of Visual Words (BoVW³) or Fisher vectors (FV). They function on the second level of representation of the signal based on low-level primitives (in this case, both SIFTs⁴ and descriptors of color). The SIFTs and color descriptors are processed separately. Through a principal component analysis, they are reduced to 64 dimensions and then grouped together in words that are obtained by concatenating, into a single chain, either the number of occurrences (BoVW) or their statistics up to the second order (using modeling by a mixture of Gaussians) (FV). Information on the composition of the image can be given through a pyramidal representation, but this does not seem to lead to better performances. The classification is carried out through SVM and learning occurs through a stochastic gradient descent algorithm. This is applied separately to the SIFTs and to the colors, and the scores are then averaged. As before, we look for two categories: low and high.
The first database used was the Photo.net database, which was slightly modified, resulting in the selection of around 3,200 photos. The authors then used the CUHK database (12,000 images). The study finally led to the creation of the annotated AVA database, specially dedicated to the study of aesthetic properties of images (Murray et al. 2012).

3 Bags of Visual Words is a pattern recognition technique that uses methods that have been successfully used in processing written documents. The bags of visual words represent an image through a concatenation of primitives, ordered by frequency of appearance.
4 SIFTs (Scale-Invariant Feature Transform) are detectors of multi-scale interest points.
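The principle of the Bag of Visual Words representation mentioned above (counting how often each "visual word" occurs in an image) can be sketched in Python as follows. This is a simplified illustration, not the pipeline of Marchesotti et al.: the function name is hypothetical, the codebook is assumed to have been learned beforehand (for instance by k-means on local descriptors already reduced by principal component analysis), and the Fisher vector variant, which also encodes first- and second-order statistics, is not shown.

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Normalized count of visual words for one image.

    descriptors: array (n, d) of local descriptors extracted from the image.
    codebook:    array (k, d) of visual words, assumed to be learned beforehand.
    """
    # Squared Euclidean distance between every descriptor and every word
    diff = descriptors[:, None, :] - codebook[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    words = dist2.argmin(axis=1)                 # index of the closest word
    counts = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    return counts / counts.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_descriptors = rng.normal(size=(500, 64))   # e.g. 64 dimensions after PCA
    visual_words = rng.normal(size=(32, 64))         # hypothetical codebook
    print(bovw_histogram(local_descriptors, visual_words))
```

The resulting fixed-length vector is what is then presented to the classifier, whatever the number of interest points found in the image.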
Figure 6.8. Schema for computing the aesthetic, developed by Dhar et al.
C OMMENT ON F IGURE 6.8.– It uses detections carried out on a low level through the Ke pre-processing, combined with high level primitives taken from the image. The scheme describes how to derive the high-level primitives from low-level measurements, and then how these high level primitives are combined with the primitives determined by Ke to provide the final aesthetic judgment (from Dhar et al. (2011)).
Figure 6.9. Comparison of the results obtained using Ke’s approach and Dhar’s approach in terms of recall and accuracy
COMMENT ON FIGURE 6.9.– The curves labeled low level (red and black) were obtained from Ke’s primitives. The curve labeled high level (blue) uses Dhar’s high-level primitives, and the curve labeled combined (green) uses all the primitives. The experiment with low-level primitives was carried out using the classifier recommended by Ke (naive Bayes, denoted by NB in black) or an SVM classifier (red), yielding identical results (from Dhar et al. (2011)).

In the conclusions to their research, Marchesotti et al. highlighted the very good performance given by generic primitives associated with a second-level classification (especially FVs). This approach makes it possible to surpass the performances obtained using only the “aesthetic” primitives, which is a bit disappointing for the
proponents of handcrafted primitives, and those who defend the specificity of aesthetic properties. The authors delegated the incorporation of high-level properties into the decision to the FV classifier. The rate of correct classification in the AVA database exceeded 80% and would only be surpassed by techniques using deep neural networks. We must note, finally, that this research technique, associating aesthetics and semantics, has been patented (Murray et al. 2013).
Figure 6.10. Performances using the approach proposed by Marchesotti et al. (panels a and b)
COMMENT ON FIGURE 6.10.– (a) Comparison of the performances of recognition on the Photo.net database using various strategies to represent images. The curve labeled “Datta” is the curve described in section 6.1.1. The variable “delta”, on the X-axis, corresponds to the difference between the scores for the “Low” images and “High” images, enforced during the learning phase. (b) Role of the SIFT descriptors or color descriptors in the performance of the classifier: they are quite complementary. The curve labeled “SP” uses a pyramidal decomposition. Despite greater complexity, it does not offer higher performance (according to Marchesotti et al. 2011).

6.2. Help in composing beautiful photos

A number of studies do not aim to evaluate the beauty of a photo; instead, they offer assistance to the photographer in taking a quality photograph. Most often, these methods use criteria that are very close to those used for aesthetic evaluation, but for different ends.

6.2.1. The library of aesthetic primitives developed by Su et al.

In the research presented in Su et al. (2012), the authors propose learning a catalog of specific configurations seen in aesthetic photography, based on beautiful photographs from the DPChallenge database, and then using the properties learned in this way to assist photographers when framing a shot. In order to do this, photographs are decomposed into fixed cells, determined once and for all, which reflect the principal scales of composition of photographs (see Figure 6.11). Specific statistics of some primitives (average and contrast between the measurements carried out on the sites in black and the sites in white in Figure 6.11) are then measured on this mesh. These primitives are related to color (RGB or HSV), edge distribution (histogram of oriented gradients [HOG] detector (Dalal and Triggs 2005)), a measurement of texture called the LBP (local binary pattern), and a measurement of salience.
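As a rough illustration of the statistics just described, the sketch below computes an average and a black/white contrast for one primitive map and one analysis template. The exact definitions used by Su et al. are not given in the text, so the formulas here (global mean, and difference between the mean over black sites and the mean over white sites) are assumptions, as are the names `cell_statistics` and `left_right_mask`.

```python
import numpy as np

def cell_statistics(feature_map, mask):
    """Average of a primitive over the template, and contrast between
    the 'black' sites (mask True) and the 'white' sites (mask False)."""
    black = feature_map[mask]
    white = feature_map[~mask]
    average = feature_map.mean()
    contrast = black.mean() - white.mean()   # assumed definition of contrast
    return average, contrast

def left_right_mask(height, width):
    """Hypothetical template: left half black, right half white."""
    mask = np.zeros((height, width), dtype=bool)
    mask[:, : width // 2] = True
    return mask

# Toy primitive map (e.g. saturation): bright on the left, dark on the right
feature = np.zeros((4, 6))
feature[:, :3] = 1.0
avg, con = cell_statistics(feature, left_right_mask(4, 6))
```

A library of aesthetic configurations would store such pairs for every primitive and every template of Figure 6.11.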
The computation of the various possible configurations, for example, in the 6 × 6 subdivision (d), is obtained by sorting the 36 measurement windows, which makes it possible to arrive quickly at the configurations that optimize contrasts, and at their distance from the windows in the library. This technique has been shown to be particularly effective in suggesting a reframing of the image around more interesting zones, which was the chief objective of this research.

6.2.2. The OSCAR system by Yao et al.

The OSCAR (on-site composition and aesthetic feedback through exemplars) system (Yao et al. 2012) follows in the same line of work as that carried out by Datta
and Wang (2010), with the goal being that of assisting a photographer equipped with a mobile telephone: when taking the photo, the person is connected to a centralized system that offers recommendations on the parameters to employ for the photo that they wish to take. This system is set apart by an audacious architecture and a highly visionary design. Unfortunately, it was not fully developed: several areas of development appear to have been announced but not realized.
Figure 6.11. The analysis windows proposed by Su et al., which supply the library with aesthetic primitives. The image is divided along the four grids described in the sub-figures (a)–(d). For each configuration, specific properties are computed in the black zones with respect to the white zones (from Su et al. (2012))
The user transmits the image to the centralized system, which carries out an analysis on several aspects of the image:
– the colors are extracted and grouped into categories in the HSV space;
– the descriptors of the composition are calculated, measuring the organization along the diagonal, vertical, horizontal, the center and the presence of textures. These descriptors are based on contour detectors, followed by a filtering for relevance;
– Gestalt elements are used to extract significant motifs expressing parallelism, similarity and continuity;
– the noise is estimated (identified with granular noise).

From these analyses, the system deduces elements for describing the photo. Based on these descriptions, taken separately, the system searches within the Photo.net database for the photos that most resemble it and that are endowed with good aesthetic qualities (this is done using Datta’s ACQUINE engine, as well as the SIMPLIcity classifier, specializing in photos taken in an urban milieu). The best are then selected and suggested to the user as examples to follow. It is then up to the user to seek inspiration from these and modify how they take their photo.
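The retrieval step described above (find library photos that resemble the query and have a good aesthetic score) can be sketched as follows. This is only an illustration under stated assumptions: the descriptors are toy 2D vectors, similarity is a Euclidean distance, and the aesthetic scores stand in for what an engine such as ACQUINE would return. The function name `suggest_exemplars` and the threshold `min_score` are hypothetical.

```python
import numpy as np

def suggest_exemplars(query, library_features, library_scores, min_score, top_k=3):
    """Indices of the top_k library photos closest to the query,
    restricted to photos whose aesthetic score is at least min_score."""
    eligible = np.flatnonzero(library_scores >= min_score)
    distances = np.linalg.norm(library_features[eligible] - query, axis=1)
    order = np.argsort(distances)[:top_k]
    return eligible[order]

# Toy library: one descriptor and one aesthetic score per photo
features = np.array([[0.0, 0.0], [1.0, 0.0], [0.1, 0.1], [0.9, 0.9], [0.2, 0.0]])
scores = np.array([6.5, 3.0, 7.2, 8.0, 4.0])
picked = suggest_exemplars(np.array([0.0, 0.0]), features, scores,
                           min_score=6.0, top_k=2)
```

Photo 4 is close to the query but is excluded because its score falls below the threshold, which is the behavior the text describes: resemblance alone is not enough.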
6.2.3. Embedded systems: Lo et al. and Wang et al.

The systems proposed by Lo et al. and Wang et al. are less ambitious, but have a similar objective. They offer techniques for determining aesthetic qualities that are designed to be embedded in a mobile phone. These are thus restricted to a modest computational complexity and low power consumption (Lo et al.), but they can draw on the power of cloud computing (Wang et al.).

The techniques presented in Lo et al. (2012, 2013) are quite close to those put forth by Dhar et al. (see section 6.1.4). Five scores are awarded to each image, judging the color, saturation, composition, richness and contrast. The primitives detected are:
– descriptors of color based on coarsely quantized histograms (see Figures 6.12(b) and (c));
– indicators of hue, saturation and luminance, obtained in the HSV space;
– contour indicators for the four channels: H, S, V and H+S+V;
– texture indicators calculated after the image is segmented into thirds, horizontally and vertically, and after the integration of the intensities of the gradients in each of these bands;
– a blurring indicator, a black channel indicator and a count of the zeros on the quantized HSV histogram.

A total of around 30 primitives are thus used; they are classified by an SVM, and reduced to 16 to simplify calculations. Learning takes place using the Hong Kong university database of 10,000 images. The database is divided into seven thematic classes. For every class, the weighting of the five criteria varies. The user is given a graphical representation of these five scores (see Figure 6.12(a)). They are also given an overall appraisal of either beautiful or not beautiful.

The approach chosen in Wang et al. (2014, 2015) is based on quite a traditional calculation of aesthetics: segmentation of the image, extraction of primitives and classification.
However, it transfers the computation to the cloud, where it is carried out through distributed computing (using the Hadoop framework), giving very good response times.
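Returning to the texture indicators listed for Lo et al., the idea of integrating gradient intensities over thirds of the image is cheap enough for an embedded device and can be sketched in a few lines. The gradient operator (central differences via `numpy.gradient`) and the function name `band_texture_indicators` are assumptions made for illustration; the text does not say which gradient filter the authors use.

```python
import numpy as np

def band_texture_indicators(gray):
    """Sum of the gradient magnitude in three horizontal bands
    and in three vertical bands of a grayscale image."""
    gy, gx = np.gradient(gray.astype(float))   # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    rows = np.array_split(magnitude, 3, axis=0)   # top, middle, bottom
    cols = np.array_split(magnitude, 3, axis=1)   # left, center, right
    horizontal = [float(band.sum()) for band in rows]
    vertical = [float(band.sum()) for band in cols]
    return horizontal, vertical

# Toy image: vertical stripes in the top third, flat elsewhere
image = np.zeros((9, 9))
image[:3, ::2] = 1.0
horizontal, vertical = band_texture_indicators(image)
```

On this toy image most of the gradient energy falls in the top band and none in the bottom band, giving six numbers that summarize where the texture lies.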
Figure 6.12. The approach used by Lo et al. (a) The final judgment is derived from five partial scores, whose importance varies depending on the category of images being processed. The palette is reduced to five dominant hues and makes it possible to distinguish images with a pleasant harmony (b) and those that are less pleasing (c) (from Lo et al. (2012))
6.3. Some specific research related to the evaluation of aesthetics using primitives

6.3.1. Color harmony: Lu et al.

Lu et al. chose to teach a machine the rules of harmony using an automatic technique. They compared an instance of machine learning with the rules of harmony put forth by Moon and Spencer and by Matsuda, both of which were presented in section 3.4.3. This was done using the AVA database (Murray et al. 2012), sorted into broad categories (flowers, landscapes, etc.). Eight themes were considered. For each theme, two sets were identified: a set of harmonious images and a set of banal images. Then, after each image was decomposed into a mosaic of vignettes, they used a Bayesian approach (the latent Dirichlet allocation method (LDA)) to determine the association of vignettes that were most frequently attributed to “beautiful images”. This is done using Laplacian regularization (normalized or non-normalized), and then a LASSO regression to estimate the aesthetic score of the image (Lu et al. 2016).
Figure 6.13. Rate of correct recognition during the learning of color harmony in the AVA database based on the method adopted: Bayes classifier, LDA or the Matsuda or Moon and Spencer metric (a) with Matsuda’s criteria in the HSV space; (b) Moon and Spencer’s criteria in the Munsell space
COMMENT ON FIGURE 6.13.– The images were separated by thematic categories: animals (ANI), architecture (AT), city scene (CS), food and drink (FD), flowers (FL), landscape (LS), portraits (PT) and still life (SL) (from Lu et al. (2014a)).

The results show higher performance on the recognition of “beautiful” images than those obtained using Moon and Spencer’s criteria, and much higher than those using Matsuda’s criteria (see Figure 3.18), which appear to give the worst performance (see Figure 6.13).

We bring together some research that aims to process certain specific problems related to aesthetic judgment.

6.3.2. Group photography: Wang et al.

It was emphasized that specific criteria affect the aesthetic of group photos: these concern the visibility of faces and eyes, the presence of a smile, the orientation of the busts and the direction of gaze. These criteria are added to other, more general criteria related to colors, focus, textures, etc. Wang et al. (2020) carried out research dedicated to the aesthetic evaluation of group photos. This was based on the work done by Datta (see section 6.1.1) and Machajdik and Hanbury (2010) to extract generic primitives of beauty, and also on two toolboxes used for detecting and characterizing faces (Face++ and Baidu AI) to complement the specific primitives for faces. The learning is carried out on a particular database of 1,000 group photos (Group Photography Dataset [GPD]), which is partly taken from the AVA and AADB, and partly created by the authors. The classification is carried out by SVM and the regression is obtained by using a Random Forest. This research may be criticized for paying too little attention to truly aesthetic criteria, giving preference to functional conformity (visible faces, facing the camera, and smiling).

6.3.3. Social networks and crowdsourcing: Schifanella et al.

The research presented in Schifanella et al. (2015) differs from the earlier work on several points.
It uses ordinary images (the BEAUTY database, taken from the social network Flickr), and the images are annotated through cooperative, crowdsourcing efforts. In order to sidestep the known disadvantages of this mode of evaluation (see section 5.4), significant precautions are taken a posteriori to eliminate participants who are unreliable and to eliminate biases in the statistical results. The question that is asked, “How beautiful is this picture?”, must make it possible to gather only the aesthetic qualities of the image, and not the semantic content. The score must be chosen from one of these five options: unacceptable, has some flaws, ordinary, professional, or exceptional. Finally, different criteria are used during the learning of parameters for the different categories: “people”, “nature”, “animals”, and “city”.
The primitives used are visual primitives of color (based on the criteria resulting from affective psychology; see section 3.4.2 and Machajdik and Hanbury (2010)), texture and spatial arrangement (rule of thirds, gradient distribution). The classification is carried out using partial least squares regression (PLSR). The measurement of quality obtained for an unknown image using this classifier is seen to be unconnected to the image’s popularity score on Flickr. It can also be compared with the score obtained by experts using the Marchesotti method (see section 6.1.5). It is thus seen that there is a tendency to overrate the images.

6.3.4. Looking at comments: San Pedro et al.

Before concluding this review, an original approach deserves to be mentioned, even if it takes us further away from machine learning techniques for evaluating quality based on iconographic properties. This is the approach reported in San Pedro et al. (2012). The idea is that, instead of processing the image, it is the comments accompanying the image that must be studied. The authors use the DPChallenge database, which provides both images and the associated comments, submitted by Internet users. They identify vocabulary with varied roles: some words refer to the properties of the image (“lighting”, “composition” and “colors”), while others are positive or negative adjectives used in the judgments given by Internet users. Collecting all the opinions shared about the photo, the authors determine, for each word, the overall feeling expressed by these opinions. This is done using a Markovian process which estimates a hidden state, the quality of the photo. The learning of the dependencies between the criteria and beauty occurs in a supervised manner through entropy maximization, following the scheme popularized by the natural language processing system, OpinionMiner (Jin et al. 2009).
This strategy is used to re-order the search results in a database, so that the top hits are the images classified as best according to the aesthetic criteria identified from the analysis of comments. A short experiment showed that the selection based on this analysis of comments is more relevant than the selection made using only keywords from the search term, or a selection made using aesthetic criteria taken from the image (using algorithms proposed in Datta et al. (2006), section 6.1.1 and in San Pedro and Siersdorfer (2009)).
7 Deep Neural Network Systems
[...] as long as analytical aesthetics keeps the nature of aesthetic experience ‘on ice’, even those questions about art that it does concern itself with must remain unanswerable. James KIRWAN (2012)
For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip

Techniques that use deep neural networks (DNN) have received a lot of attention in the last few years, whenever researchers wished to reproduce the complex decision-making carried out by an expert. Aesthetic evaluation is precisely this kind of task, and we have thus seen several research studies rush down this path. Deng et al. (2017c) offer a good review of the work that pioneered this process, up to the first few months of 2017, and the work by Apostolidis and Mezaris (2019) and Liu et al. (2020) complements this.

As we have already said, DNNs propose an “end-to-end” approach in which the image to be evaluated is directly input into the system. The input layer thus has one neuron per pixel, or rather, three neurons per pixel, since it is usually color images that are processed. The first few layers (a variable number between 3 and 7) are traditionally convolutional (hence the name convolutional neural networks, CNNs); they make it possible to progressively reduce the size of the processed maps, by connecting a neuron from layer n to a small number (3 × 3, 7 × 7) of neurons in layer n − 1. The dimension of the layers can also be reduced through pooling operations, which replace the sub-sampling of convolutional layers by other decimation functions (“sup”, “median” or “inf” functions, i.e. maximum, median or minimum). The following layers are often fully connected, such that each neuron in layer n is connected to all the neurons in layer n − 1. The final layer contains as many neurons as categories processed (e.g. two categories for a judgment of “amateur/professional”, or five categories for a judgment ranging from
“neophyte” to “expert”). When a continuous score is desired, we often choose an output layer with a few dozen neurons on which the score is calculated through regression. The architectures used are those available online, generally developed for applications for recognizing shapes in images: AlexNet, ResNet, VGG, Inception, GoogLeNet or MobileNet, for instance.

The limitations in evaluating the aesthetic of images using DNNs are strongly related to the need to have high-quality images with fine details and a field width that is compatible with the quality of professional photographs. In practice, this results in:
1) an input layer with several million neurons, generally tripled, to take the color into account;
2) a considerable number of connections within the network, leading to extremely long computation times for learning the parameters and requiring gigantic annotated databases for this learning to take place.

The earliest operational systems functioned largely on thumbnails (whose size was generally smaller than 256 × 256), which did not render the finer details of the high-quality photographs. The solutions adopted at the time were very varied: processing in parallel, breaking them down into sub-images, random draw of high-resolution windows, etc. (see sections 7.1.1 and 7.1.2, for example). With time, this consideration was taken into account more effectively (e.g. the research presented in section 7.1.3 tackles this problem head-on; it has also been dealt with well in Hosu et al. (2019)), but through intensive use of convolution and sub-sampling, which probably further alters the judgment that is given. However, a small size does lead to a loss of many important details required to judge the quality of the image and results obtained on overly mediocre representations cannot be trusted.

Beyond this aspect, related to the size of the image, there is also the problem of the image format (or aspect ratio).
These images can be of different formats, while, by nature, a neural network is of a fixed size (generally square), which is denoted by N × N. Conventionally, this problem is addressed in one of four ways (illustrated in Figure 7.1):
– by selecting a sub-image of dimension N × N, through cropping;
– by reshaping the image to reduce it to an N × N format (resampling);
– by reducing the image such that its largest dimension is N and by filling in the margins with a replica of the smallest dimension of the image (padding);
– using an inverted version of the image in the padding operation, such that discontinuities are avoided during replication (padding/flipping).

From a signal conservation point of view, padding after flipping is the best way to preserve both the geometric and the spectral properties, and does not introduce any discontinuities in the signal.
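The four strategies listed above can be compared on a small array. The sketch below is a minimal illustration, not a reference implementation: resampling uses nearest-neighbour indexing (an assumption, chosen to avoid external libraries), and the two padding variants are mapped onto the `wrap` (replication) and `symmetric` (mirror) modes of `numpy.pad`. All function names are hypothetical.

```python
import numpy as np

def center_crop(img, n):
    """(a) Keep a central n x n window (the image must be at least n x n)."""
    h, w = img.shape[:2]
    top, left = (h - n) // 2, (w - n) // 2
    return img[top:top + n, left:left + n]

def resample(img, out_h, out_w):
    """(b) Nearest-neighbour resampling; the aspect ratio is not preserved."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def fit_longest_side(img, n):
    """Shrink the image so that its longest side equals n."""
    h, w = img.shape[:2]
    scale = n / max(h, w)
    return resample(img, max(1, round(h * scale)), max(1, round(w * scale)))

def pad_to_square(img, n, mode):
    """(c) mode='wrap' replicates the image; (d) mode='symmetric' mirrors it."""
    small = fit_longest_side(img, n)
    h, w = small.shape[:2]
    pad = [(0, n - h), (0, n - w)] + [(0, 0)] * (small.ndim - 2)
    return np.pad(small, pad, mode=mode)

img = np.arange(8 * 4 * 3).reshape(8, 4, 3)    # toy 8 x 4 color image
cropped = center_crop(img, 4)
squeezed = resample(img, 4, 4)
wrapped = pad_to_square(img, 4, mode="wrap")
mirrored = pad_to_square(img, 4, mode="symmetric")
```

With `wrap`, the first padded column repeats the first image column, which creates a jump at the seam; with `symmetric`, it repeats the last image column, so the signal is continuous across the seam, which is the property the text highlights for padding after flipping.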
Figure 7.1. A rectangular image (left) can be adapted for the square input layer of a DNN by (a) cropping, (b) affine resampling, (c) padding and (d) padding after flipping. Preference should be given to the last of these solutions
All articles that use DNNs to evaluate aesthetics discuss image resizing. The impact that this step has on the judgment of the image is discussed in Sheng et al. (2018), among others. Let us look at some of these studies.

7.1. DNNs dedicated to aesthetic evaluation

7.1.1. High and low resolutions: the RAPID system, Lu et al.

The first system to use learning through DNNs is presented in Lu et al. (2014b, 2014c, 2015b): RAPID¹. In RAPID, the image is processed through a series of four convolutional layers, which progressively reduce the signal size, before processing it through three fully connected layers. The output layer offers two categories: beautiful/non-beautiful. This research had an original way of taking into account the size and diversity of formats, and sought to respect the fine details that could not be preserved in the input plane due to the size of the network.

1 RAPID: RAting PIctorial aesthetics using Deep learning (Lu et al. 2014b).
Figure 7.2. The two-path neuromimetic network (RAPID) proposed by Lu et al.
COMMENT ON FIGURE 7.2.– Both pathways are identical. One is fed with the reduced and normalized image, the other with a full-resolution vignette that is randomly extracted from the original image. The first four layers are convolutional (with kernels that are 11 × 11 × 3, 5 × 5 × 64 and twice 3 × 3 × 64). The dimensions of the fully connected layers are reduced through maximum pooling (max-pooling). The first layer has 1,000 neurons, and the second has 256 neurons. Finally, the output has two states (from Lu et al. (2014c)).

As in all other systems that use VGG-Net architecture², the input level forces the image to be reduced to a size of 256 × 256, on three channels, through a zoom function. The image is normalized in aspect into a square field, with the smallest dimension being filled in by zeroes. This is the main DNN used in processing. A second DNN is then used to process a vignette, also 256 × 256 × 3 in size, whose position is randomly drawn, at full resolution, in the original image. To take into account the richness of the image, 50 such vignettes are drawn, both during the learning and during the judgment. Each vignette, associated with the main channel, is processed by the double network and we only retain the one that gives the best score. Finally, both DNNs are joined at the last fully connected layer of the network, to provide a single score: 0 or 1 (Figure 7.2). These choices come at a heavy cost, as the processing takes much longer.

These studies use AVA data for the learning (see section 5.4). Since the authors have information regarding the style³ of certain images in the databases, they carry out an initial learning about the style (through a DNN similar to the earlier one). They then use the information on style as a regularization element when determining the aesthetic appreciation.
The results (Table 7.1) make it possible to validate the choice of architecture as well as the learning protocol, and the use of a style vector to constrain the solution. These results are superior to those presented in Marchesotti et al. (2013), although the cost of implementation is also considerably higher. On the AVA database, they give an agreement of up to 73% with the two class notation of “Low” and “High”.
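The random drawing of full-resolution vignettes used by RAPID, and the rule of keeping the best-scoring one, can be sketched as follows. The two-path network itself is replaced by a placeholder scoring function (`toy_score`, mean brightness), which is purely an assumption for the sake of a runnable example; `random_patches` and `best_patch_score` are hypothetical names.

```python
import numpy as np

def random_patches(image, size, count, rng):
    """Draw `count` full-resolution size x size windows at random positions."""
    h, w = image.shape[:2]
    tops = rng.integers(0, h - size + 1, size=count)
    lefts = rng.integers(0, w - size + 1, size=count)
    return np.stack([image[t:t + size, l:l + size] for t, l in zip(tops, lefts)])

def best_patch_score(image, score_fn, size=256, count=50, seed=0):
    """Score every vignette with score_fn and keep the best one."""
    rng = np.random.default_rng(seed)
    patches = random_patches(image, size, count, rng)
    scores = np.array([score_fn(p) for p in patches])
    best = int(scores.argmax())
    return scores[best], patches[best]

def toy_score(patch):
    # Placeholder for the two-path network: mean brightness of the vignette
    return float(patch.mean())

photo = np.random.default_rng(42).random((600, 800, 3))   # stand-in photograph
score, patch = best_patch_score(photo, toy_score)
```

Each image therefore costs 50 forward passes instead of one, which makes concrete the heavy processing cost mentioned in the text.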
2 VGG-Net is an architecture developed by Oxford University’s Visual Geometry Group. It is distinguished by convolutions that are limited to 3 × 3 kernels. It was placed second in the ILSVRC-2014 competition (ImageNet Large Scale Visual Recognition Competition). VGG-Net is now available in Torch, an open-source machine learning library whose development has been supported by several GAFA companies. VGG16, often used in studies on aesthetics, offers 16 processing layers.

3 “Style” refers to the specific properties of the manually annotated images: presence of complementary colors, two-tone images, black and white images, high dynamic range (HDR) images, macro photos, camera shake, movement, etc.
7.1.2. The multi-path DMA-Net architecture: Lu et al.

The RAPID architecture is further extended in Lu et al. (2015b) by replacing the two-path architecture with a multi-path architecture, conforming to the schema in Figure 7.3. This is called DMA-Net (deep multi-patch aggregative network). In practice, a five-path scheme is adopted, with each path being fed a full-resolution sub-image randomly taken from within the image to be classified. In order to simplify the learning modules, all the CNNs processing a sub-image have identical parameters. The five paths are then merged together using one of two strategies, both of which give similar results: either a statistical operator that is independent of the order of its inputs (mean, max or median), or a layer of neurons whose input is the vectors that have been sorted (Figure 7.4).

δ   Marchesotti et al. (2013)   SCNN     AVG-SCNN   DCNN     RDCNN
0   66.7%                       71.20%   69.91%     73.25%   74.46%
1   67%                         68.63%   71.26%     73.05%   73.70%

Table 7.1. Results of the learning through DNN
COMMENT ON TABLE 7.1.– The δ parameter is the same as the delta in Figure 6.10; it makes it possible to adjust the division between the “beautiful/non-beautiful” classes. The SCNN network has a single pathway. AVG-SCNN is the result of taking the average of the two independent networks. The DCNN network is the network that uses two pathways coupled at the last layer, while the RDCNN network is the one that is regularized by the information on “style” (from Lu et al. (2014b)).

For several years (until 2017), the DMA-Net system presented in Lu et al. (2015b) was the reference for most of the research carried out on the aesthetic qualities of photographs using DNNs. This was because of the results obtained on the AVA database (over 75% accuracy in binary classification). In Sheng et al. (2018), we see the benefit of including wrongly classified images in the cost function of the evaluation. This can increase the two-class recognition performance of a multi-path network to 83%.

7.1.3. Adapting to the size of the image: Mai et al.

These studies also sought to resolve the problem of the input format of the image. They chose a complex but efficient solution by first adopting a pyramidal decomposition approach, and then a parallelization of the processing pathways (Mai et al. 2016).
Figure 7.3. Scheme for the learning in the deep, multi-path neural network
COMMENT ON FIGURE 7.3.– In order to simplify the learning, the various CNNs are identical and their parameters, the results of the learning, are the same. The aggregation stage (orderless multi-patch aggregation) is described in Figure 7.4. This architecture is tested, either using sub-images chosen at fixed positions in the original image, or using randomly taken sub-images. In practice, it has been shown that randomly selecting the sub-images is preferable (from Lu et al. (2015)).

The basic idea consists of placing a convolution and pooling layer ahead of the conventional processing layers of a DNN. In order to adapt to the various possible image sizes, the authors suggest implementing five such DNNs in parallel, with decimation factors of 12 × 12, 9 × 9, 6 × 6, 4 × 4 and 2 × 2, respectively (see Figure 7.5(a)). It thus seems possible to process images whose largest dimension is 12 × 256 = 3,072 pixels, but this has not been specified in the article. Furthermore, since the aspect ratio of the image may not be the same as the (generally square) aspect ratio of the DNN’s input layer, they proposed an adaptive pooling technique derived, like the earlier technique, from work by He et al. (2015b). The architecture thus created is called MNA-CNN (multi-net adaptive pooling CNN). It is built using a VGG-Net (see footnote 2), which is pre-trained using the BVLC/CAFFE⁴ Zoo model.

4 CAFFE is a DNN software platform developed by the Berkeley AI Research (BAIR) and created by Yangqing Jia at BVLC (Berkeley Vision and Learning Center).
Figure 7.4. Merging of the multi-path network: the orderless multi-patch aggregation box in Figure 7.3
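The two merging strategies of Figure 7.4 can be checked numerically: whatever the order in which the sub-images are presented, the merged vector must be the same. The sketch below is an illustration on random feature vectors (five patches of eight neurons, sizes chosen arbitrarily); the function names are hypothetical and the dense layer that would follow strategy (b) is omitted.

```python
import numpy as np

def aggregate_statistic(patch_features, op="max"):
    """(a) Order-independent operator applied to the neurons of the same rank."""
    ops = {"max": np.max, "min": np.min, "median": np.median, "mean": np.mean}
    return ops[op](patch_features, axis=0)

def aggregate_sorted(patch_features):
    """(b) Sort the values of each neuron across patches, then flatten
    them into the input vector of a subsequent layer of neurons."""
    return np.sort(patch_features, axis=0).ravel()

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))          # 5 patches, 8 neurons each
shuffled = features[rng.permutation(5)]     # same patches, different order

merged_max = aggregate_statistic(features, "max")
merged_sorted = aggregate_sorted(features)
```

Strategy (a) reduces the five vectors to a single one of the same length, whereas strategy (b) keeps all the values (here 40) and leaves the combination to be learned.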
COMMENT ON FIGURE 7.4.– Since the neuromimetic networks are sensitive to the order in which the input data is presented, it is advisable to introduce a layer that makes the result independent of the order in which the sub-windows were chosen before the fusion step. This can be done either by merging all the neurons of the same rank using an order-independent operator (max, min, median) (a), or by sorting the data from these neurons such that they are ordered (b) (from Lu et al. (2015)).

Mai et al. then took into account the semantic content of the image by placing in parallel an additional network, the Places network developed by Google (Places 205-GoogLeNet)⁵, specialized in recognizing objects in the scene. They reduced the number of categories to 7 (instead of 205): humans, plants, architecture, landscape, still life, animal and night (Figure 7.5(b)). The experiment was conducted on the AVA database. The results that were obtained were better than those given by the classical cropping methods, sub-sampling methods, scaling methods or tiling methods (see Table 7.2). They also showed the benefit of the categorization channel.

Method                      Precision   F-measure   AUC score
VGG with cropping           71.2%       0.83        0.66
VGG with change in scale    73.8%       0.83        0.74
VGG with padding            72.9%       0.83        0.73
SPP-CNN                     76.06%      0.84        0.77
MNA-CNN                     77.1%       0.85        0.79

Table 7.2. Comparison of the performances of evaluation of images in the AVA database based on the type of input
COMMENT ON TABLE 7.2.– Images reduced through cropping, change in scale or padding (top three rows), or processed using a multi-path network (bottom two rows: SPP is the original network created by He et al. (2015b), MNA is the network created by Mai et al.). The two measurements (F-measure and AUC) are derived from the ROC curve of recognition (from Mai et al. (2016)).

7.1.4. Finding beauty on the Web: Redi et al.

Presented in Redi et al. (2017), this research aims to find criteria for beauty not in photos from specialist sites, but in ordinary images taken from the Web, which include many more mediocre images, or even drawings or graphics, than photo sites.

5 GoogLeNet is a 22-layer network developed by Google. Note that 205 indicates the number of categories in the output layer. GoogLeNet won the ILSVRC-2014 competition.
Figure 7.5. The two architectures proposed by Mai et al. (a) To address the problem of adapting to the image format, five networks are placed in parallel, each one preceded by a (convolution + pooling) layer with variable decimation. (b) An additional channel is added to take into account the category of the input image (from Mai et al. (2016))
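The principle of adaptive pooling, on which the networks of Figure 7.5(a) rely, is that the pooling windows are sized from the input so that the output grid is fixed whatever the image format. The sketch below is a generic version written for illustration, in the spirit of the spatial pyramid pooling of He et al.; it is not the exact layer used by Mai et al., and the name `adaptive_max_pool` and the 4 × 4 output grid are assumptions.

```python
import numpy as np

def adaptive_max_pool(feature_map, out_h, out_w):
    """Max-pool an (H, W) map into a fixed out_h x out_w grid,
    whatever the values of H and W."""
    h, w = feature_map.shape
    # Bin edges are spread evenly so that the bins cover the whole map
    row_edges = np.linspace(0, h, out_h + 1)
    col_edges = np.linspace(0, w, out_w + 1)
    pooled = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        r0, r1 = int(np.floor(row_edges[i])), int(np.ceil(row_edges[i + 1]))
        for j in range(out_w):
            c0, c1 = int(np.floor(col_edges[j])), int(np.ceil(col_edges[j + 1]))
            pooled[i, j] = feature_map[r0:r1, c0:c1].max()
    return pooled

square = np.arange(16).reshape(4, 4)
wide = np.arange(60).reshape(6, 10)     # landscape-format map
tall = np.arange(60).reshape(12, 5)     # portrait-format map

pooled_square = adaptive_max_pool(square, 2, 2)
pooled_wide = adaptive_max_pool(wide, 4, 4)
pooled_tall = adaptive_max_pool(tall, 4, 4)
```

Both the landscape and the portrait maps end up as 4 × 4 grids, so the layers that follow can keep a fixed size without cropping or distorting the image.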
For this reason, this method uses a specific database, named for the first author (see section 5.4). The authors also wished to compare the performances of three different processing chains:
1) a method based on primitives, and a random forest classifier. The chosen primitives quite broadly cover the descriptors of images (color, texture, contrast), but also primitives related to the composition, the presence of objects, symmetries and uniqueness. There are 27 of these;
2) a DNN with three outputs: “high quality”, “ordinary quality” and “low quality”, which is entirely trained using data from the Web;
3) a DNN that is pre-trained using the AVA database (essentially “aesthetic” images) based on the protocol developed by Lu et al. (2015b), and then fine-tuned using images from the Web database (thus including poor quality images).

These studies confirm that some primitives make it possible to distinguish between “photos” and “non-photos” with reasonable reliability, such that the evaluation of the aesthetic quality can then be adapted based on this information. They show that the best performances are obtained using the neural network trained on the Web (and not on the AVA database). They also show that the performances of the classifier on the primitives are close to the best performances. Finally, they show that although a network trained only with good quality images from the AVA database gives mediocre results (especially due to errors committed on “non-photo” images), it is possible to improve these significantly by fine-tuning (although we cannot expect the same performance as with a network trained on the Web).

7.1.5. Siamese and GAN networks: Kong et al. and Deng et al.

The research presented in Kong et al. (2016) listed ambitious objectives to verify the aesthetic evaluation capacities of DNNs.
In order to do this, the authors first developed an original database (AADB, see section 5.4), annotated through crowdsourcing for quality evaluation, classification and the attributes of aesthetic quality. Unlike other databases, AADB retains the identity of the evaluators, photo by photo. The scheme adopted is a DNN working on a reduced image (256 × 256), which may be replicated, without a channel for processing the full-resolution information. The network is enhanced by a “Siamese” montage6, which processes pairs of images in parallel. These images were judged so as to be near-equivalent on the aesthetic criterion (Figure 7.6(a)).
6 A Siamese montage is made up of two networks that have the same architecture and identical (or almost identical) coefficients, generally working on different data. The results from the two networks are then compared in the final layers (Chopra et al. 2005).
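The principle described in footnote 6 can be illustrated with a minimal sketch. It is not the architecture of Kong et al.: the single `tanh` layer standing in for a DNN, the Euclidean comparison and all numerical values are assumptions chosen for illustration. What it shows is the defining property of a Siamese montage, namely that both inputs pass through the same coefficients before their outputs are compared.

```python
import math

def embed(weights, features):
    """One shared branch: a linear layer followed by tanh (a stand-in for a DNN)."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]

def siamese_distance(weights, features_a, features_b):
    """Pass both inputs through the SAME weights, then compare the two outputs."""
    out_a = embed(weights, features_a)
    out_b = embed(weights, features_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(out_a, out_b)))

# Hypothetical coefficients and feature vectors (not real image data).
shared_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
image_a = [0.9, 0.1, 0.4]
image_b = [0.9, 0.1, 0.4]   # identical to image_a
image_c = [0.1, 0.9, 0.7]

print(siamese_distance(shared_weights, image_a, image_b))  # identical inputs give 0.0
print(siamese_distance(shared_weights, image_a, image_c))  # different inputs give a positive value
```

Because the weights are shared, two images that the network should judge as near-equivalent can be pushed toward a small distance during training, whatever loss is actually used for that purpose.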
Figure 7.6. The various components in the architecture used in Kong et al. (2016)
COMMENT ON FIGURE 7.6.– (a) The “Siamese” network, which processes two images in parallel on the same architecture. (b) A network trained to simultaneously give an aesthetic evaluation (bottom) and a detection of the primitives (above), with both these being combined to provide the final evaluation. (c) An architecture where both primitives and content are detected.
The authors first test the roles played by various information in the system’s performance. This was done by developing an architecture that is capable of compensating for the absence of information regarding the category or the attributes (Figure 7.6). They show that all this information is useful, but not essential, as it can be recovered later.
The conclusions from the experiments also show the importance of the observer who annotated the image during the classification, which the authors attribute to the “same aesthetic”. This is an important result, which offers an original method to collect expert opinions during the establishment of the database. This makes it possible to somewhat avoid the flaw in the “objectivist” aesthetic approach, which considers that the effects of culture, taste and education are negligible. This aspect must certainly be looked at in greater detail in future work.
From a technical point of view, these conclusions also show that networks trained in this way generalize very poorly, as the learning and recognition performances across the AADB and AVA databases prove to be mediocre. Finally, the authors wished to propose a “continuous” score between ugly photos and beautiful photos, unlike binary approaches. However, no related results were presented.
In a configuration that is quite similar to Siamese networks, an adversarial network7 is used in Deng et al. (2017b) to improve an image that was initially mediocre by changing the contrast (separately on the Lab color channels) and through cropping.
In order to do this, an image judged to be beautiful and an image judged to be mediocre are opposed in the two branches of the GAN and the parameters of the second are changed until identical aesthetic evaluations are given.
7 Adversarial networks are denoted by “AN”. When they are used to create new images, they are said to be “generative” and are denoted by “GAN” (Isola et al. 2017). An adversarial network modifies the parameters of the input image on one of the two paths in a Siamese network, such that identical outputs are obtained on both paths at the end of an evaluation layer. This is why it is said to be “generative”, as it creates a new image from an older image, with the constraint that this new image is judged to be “equivalent” to the reference image, with respect to the criteria used by the network.
7.1.6. Paying attention to the image construction: A-Lamp
This research follows in the line of the DMA-Net system (see section 7.1.2), which uses several paths in parallel. It was presented in Ma et al. (2017) and its name (A-Lamp CNN, adaptive layout-aware multi-patch) expresses the principal properties of the system: first of all, there is an estimation of the composition (layout), and in parallel, there is also a measurement of the aesthetic across several of the zones that are studied (multi-patch) (see Figure 7.7).
Figure 7.7. Architecture of the A-Lamp system. This highlights, in the upper region, the multi-patch sub-network which processes the full-resolution sub-images in parallel; in the lower region, the network processing the composition is highlighted (from Ma et al. (2017))
COMMENT ON FIGURE 7.7.– An adaptive network to detect interest points (not represented in this figure) is placed immediately after the input image is read and it feeds both parts of the processing.
The composition analysis is inspired by various studies dedicated to the distribution of zones of interest in an image (Liu et al. 2010 or Obrador et al. 2010). The analysis uses an adaptive detection network for the zones of interest whose results also supply the other processing path. Each zone of interest, defined by its bounding box, is a node in the representation graph, which is endowed with local and global attributes. Local attributes describe the relative position of a zone of interest with respect to each of the other zones (distance, orientation of the link, overlap), while the global attributes represent the position of each node in the scene.
The multi-patch path is identical to the DMA-Net used by Lu et al. (2014a) or Lu et al. (2015a), but instead of using the sub-images in a fixed or random position, the parallel, full-resolution processing path works on the sub-images selected by the adaptive detection network. The authors show that this multi-patch path using pre-selected zones performs better than the versions with fixed and random sub-images. They also show that
the complete network, with both its paths, gives better binary classification results (“beautiful” or “not-beautiful”) than the other architectures proposed earlier. These studies use the Multi-Patch VGG16 architecture (see footnote 2) pre-trained on ImageNet and the CAFFE software (see footnote 4). The prior detection of interest points follows the work done by Zhang et al. (2016).
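The attribute graph used for the composition analysis can be sketched as follows. The source only names the attributes (distance, orientation of the link, overlap, position in the scene); the precise definitions below, that is, center-to-center distance, `atan2` orientation, intersection over union for the overlap and normalized centers for the global position, are assumptions made for illustration and are not taken from Ma et al. (2017).

```python
import math

def center(box):
    """Center of a bounding box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def area(box):
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)

def iou(a, b):
    """Overlap of two boxes measured as intersection over union (an assumed choice)."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def local_attributes(a, b):
    """Relative position of zone b with respect to zone a."""
    (ax, ay), (bx, by) = center(a), center(b)
    return {
        "distance": math.hypot(bx - ax, by - ay),
        "orientation": math.atan2(by - ay, bx - ax),
        "overlap": iou(a, b),
    }

def layout_graph(boxes, width, height):
    """Nodes carry a global attribute (normalized center); edges carry local attributes."""
    nodes = [(center(b)[0] / width, center(b)[1] / height) for b in boxes]
    edges = {(i, j): local_attributes(boxes[i], boxes[j])
             for i in range(len(boxes)) for j in range(len(boxes)) if i != j}
    return nodes, edges

# Two hypothetical zones of interest in a 4 x 4 image.
nodes, edges = layout_graph([(0, 0, 2, 2), (1, 1, 3, 3)], width=4, height=4)
```

In a complete system, such node and edge attributes would be fed to the composition path, while the boxes themselves would select the full-resolution patches for the multi-patch path.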
Figure 7.8. The A-Lamp system. (a) The multi-column sub-network processes, in parallel and at full resolution, the zones of interest detected by the adaptive region detection network. (b) The network that processes the composition using the attribute graph (from Ma et al. (2017))
7.2. Variants around the basic DNN architecture
Following the development of research that used neural networks in all fields of application, many studies dedicated to aesthetic judgment used increasingly complex architectures (more neural layers and more complex loops between layers). There was also a general tendency to replace fully connected layers by purely convolutional layers, with shorter range but present in greater numbers, and leading to a multi-scale pyramidal analysis of images. These changes generally enhanced the performances of networks. In parallel, there were efforts made to interpret the outputs from layers before classification, so as to explain the results obtained (Jin et al. 2016a; Kairanbay et al. 2017; Murray and Gordo 2017) using activation maps superimposed on the images. These results are still far from being convincing. Certain approaches aimed to simplify the architectures (Kairanbay et al. 2017), while others attempted to simplify the learning (Srivastava and Kant 2018). We will look at some of these propositions in greater detail.
7.2.1. Comparing photos between themselves: Schwarz et al.
We have spoken of the database from the University of Tübingen, obtained by collecting favorable opinions from visitors to Flickr. This was designed for an original method to determine the aesthetics of images from the non-specialist opinions of a large number of Internet users (Schwarz et al. 2016). To recall, the images were attributed a continuous score of quality between 0 and 1 (see equation [5.1] and Figure 7.9).
The authors then carried out machine learning based on a neural network (ResNet-50 architecture8 and the TensorFlow software9), which uses aesthetic qualities not only to classify images (“beautiful” or “not beautiful”) but also to order them, either through pairwise comparison (Siamese network (Chopra et al. 2005), as in the previous study) or through comparison in groups of three using triplet networks (Hoffer and Ailon 2015).
In the authors’ opinion, this objective (classifying images by aesthetic rank) is closer to how human judgment functions. The quality of this ranking is verified using mass-voting on the Internet to ensure that two images placed in a particular order by the DNN have been ranked in the same order by the Internet users. The authors show that it is nonetheless possible to derive, from this order, a classification whose performance is very similar to the other techniques.
8 ResNet is the convolutional neural network from Microsoft that won the ILSVRC 2015 competition (He et al. 2015a). ResNet-50 is one of the versions of this network.
9 TensorFlow is a free software library for computation that uses data-flow graphs. It was developed by the Google Brain project teams within Google’s artificial intelligence research center.
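Learning to order images rather than to classify them can be sketched with two standard loss functions and a simple agreement measure. The source does not specify the losses used by Schwarz et al.; the hinge form, the squared Euclidean distance and the margin value below are common choices that we assume here for illustration.

```python
def pairwise_hinge(score_better, score_worse, margin=1.0):
    """Pairwise (Siamese) ranking loss: zero once the better image leads by the margin."""
    return max(0.0, margin - (score_better - score_worse))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss on embeddings: the anchor should be closer to the positive than to the negative."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)

def ranking_agreement(scores, human_pairs):
    """Fraction of (better, worse) index pairs that the predicted scores order the same way."""
    agree = sum(1 for better, worse in human_pairs if scores[better] > scores[worse])
    return agree / len(human_pairs)

# Hypothetical predicted scores for three images and three human judgments.
predicted = [0.9, 0.2, 0.5]
votes = [(0, 1), (2, 1), (1, 0)]   # each pair reads (preferred image, other image)
print(ranking_agreement(predicted, votes))
```

The agreement measure mirrors the verification described above: a pair counts as correct when the network and the Internet users place the two images in the same order.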
The authors finally used their software in three different applications:
– to organize the photos in an album;
– to extract the best photo from a video sequence;
– to select the best sub-image from a large scene.
Figure 7.9. How did the Internet vote?
COMMENT ON FIGURE 7.9.– (a) Distribution of the 350,000 scores (between 0 and 1) in the Tübingen database (from Schwarz et al. (2016)). The scores are calculated using formula [5.1], which uses the opinions given by Internet users. It must be noted that the distribution is almost flat, which is very different from the distribution of the images in the AVA database (which is practically Gaussian), represented here by (b), also on a logarithmic scale, obtained from Figure 5.2.
7.2.2. Making use of knowledge of the subject: Kao et al.
In a long series of studies, Y. Kao, K. Huang and S. Maybank sought to make use of knowledge about the scene represented by the image to improve the evaluation score (Kao et al. 2016a, 2016b, 2017a, 2017b). In order to do this, images were divided into three classes: “scene”, “object” or “texture”. The authors determined which class each image belonged to and then, based on the class identified, directed it toward one of three networks with slightly differing architectures. Thus, for the “scene” category, a single conventional network is used, endowed with five convolutional layers, followed by two fully connected layers. For the “object” category, the processing is carried out in parallel on two networks, one processing the overall image, the other focused on the region of greatest salience. Both channels pass through a fully connected layer and are then brought together before going through another fully connected layer. Finally, in the case of a “texture” image, 16 sub-images are chosen by a sliding window, with each window passing through a slightly shortened network and the 16 results then being averaged.
The network gives two results: first, through a binary classification, whether or not the image belongs to the category of “beautiful images”; then, through a regression using the sum of the squares of the output of the last fully connected layer, a continuous score evaluating the aesthetic quality of the photo.
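The processing of a “texture” image described above (16 sub-images taken by a sliding window, scored separately and then averaged) can be sketched as follows. The regular 4 × 4 grid of windows, the patch size and the `mean_brightness` function that stands in for the shortened network are assumptions for illustration; they are not taken from Kao et al.

```python
def sliding_windows(height, width, patch, rows=4, cols=4):
    """Top-left corners of rows * cols windows spread evenly over the image."""
    ys = [round(r * (height - patch) / (rows - 1)) for r in range(rows)]
    xs = [round(c * (width - patch) / (cols - 1)) for c in range(cols)]
    return [(y, x) for y in ys for x in xs]

def crop(image, y, x, patch):
    """Extract a patch x patch sub-image from an image stored as a list of rows."""
    return [row[x:x + patch] for row in image[y:y + patch]]

def mean_brightness(sub_image):
    """Hypothetical stand-in for the shortened network that scores one sub-image."""
    pixels = [p for row in sub_image for p in row]
    return sum(pixels) / len(pixels)

def texture_score(image, patch, score_fn):
    """Score every window separately, then average the 16 results."""
    corners = sliding_windows(len(image), len(image[0]), patch)
    scores = [score_fn(crop(image, y, x, patch)) for y, x in corners]
    return sum(scores) / len(scores)

# A uniform 8 x 8 toy image: every window receives the same score.
toy_image = [[0.5] * 8 for _ in range(8)]
print(texture_score(toy_image, patch=2, score_fn=mean_brightness))
```

For a 256 × 256 image and 64 × 64 windows, the same function places the windows at offsets 0, 64, 128 and 192 in each direction, so that the 16 windows tile the image without overlap.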
Several variants of this network were thus developed based on the scheme described in Figure 7.10. This is the conventional MT-CNN scheme (multi-task convolutional neural net) that is illustrated in Figure 7.3, followed by a learning stage in which the network learns how the aesthetic score depends on the semantic classification. The images here are still of a modest size (256 × 256).
Figure 7.10. The architecture used in the first step of the research by Kao et al. Four convolutional layers with decimation are identified, followed by three fully connected layers, the last of which is made up of two specialized networks: one to attribute an aesthetic score, the other to recognize the semantic content (from Kao et al. (2017b))
An image x_n (n = 1, ..., N) is described by an attribute of the aesthetic evaluation y_n, which can take C values, and by the semantic attributes z_n, whose values z_nm (m = 1, ..., M) are either 0 or 1, depending on whether the image belongs to the semantic class m or not. The (X, Y, Z) database is thus made up of N triplets {x_n, y_n, z_n}. The first step entrusted to MT-CNN is carried out using a multinomial logistic regression. This is followed by a step that takes into account the explicit dependence between the semantic and the aesthetic through the covariance matrix Ω:
\Omega = \frac{(W^{T} W)^{1/2}}{\mathrm{Tr}\left[ (W^{T} W)^{1/2} \right]}    [7.1]
where W is the matrix of the parameters of the final layer (Figure 7.10), with one column per task. The optimization alternates between two steps: backpropagation of the gradient with Ω held constant to calculate the values of W, and recalculation of Ω from these values of W. This method is called the MTRL-CNN (multi-task relationship learning convolutional neural net).
The network learns on the AVA database, from which 185,000 images are taken, each having at least one semantic identifier of belonging to one of the 29 available semantic classes (M = 29). The training is carried out on 165,000 images and the verification takes place on 20,000 images. Only two aesthetic values are used (C = 2). The results obtained in these conditions classify images into one of these two aesthetic categories with a success rate of over 75% (see Table 7.3). The benefit of using semantic information is still verified, but to a moderate degree (of the order of 1–3%). The same network is then successfully used to evaluate images in the Photo.net database.
δ      MT-CNN 1 (without semantics)      MTRL-CNN (with semantics)
0      76.15%                            79.08%
1      75.90%                            77.71%
Table 7.3. The role of semantic information in the performances of the network developed by Kao et al. MT-CNN corresponds to the first stage of the CNN, without the coupling (Figure 7.10). MTRL-CNN is the complete network with the coupling used by Kao et al. The δ parameter is the same as shown in Figure 6.10: δ separates the most beautiful and the ugliest images around the value 5 – (> 5 + δ) and (< 5 − δ), respectively (from Kao et al. (2017b))
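Equation [7.1] can be evaluated by hand in the smallest possible case. The sketch below assumes that W has exactly two columns (one aesthetic task and one semantic task), which is a simplification made here for illustration: for a 2 × 2 positive semi-definite matrix A with s = √det A and t = √(Tr A + 2s), the square root has the closed form (A + sI)/t, whose trace equals t, so that Ω = (A + sI)/t². With M = 29 semantic classes, a general matrix square root would be needed instead.

```python
import math

def gram(W):
    """W^T W for a matrix W given as a list of rows with two columns (one per task)."""
    a = sum(row[0] * row[0] for row in W)
    b = sum(row[0] * row[1] for row in W)
    d = sum(row[1] * row[1] for row in W)
    return [[a, b], [b, d]]

def task_covariance(W):
    """Omega = (W^T W)^(1/2) / Tr[(W^T W)^(1/2)], two-task case only."""
    A = gram(W)
    trace = A[0][0] + A[1][1]
    det = max(0.0, A[0][0] * A[1][1] - A[0][1] * A[1][0])  # clip rounding noise
    s = math.sqrt(det)
    t_squared = trace + 2.0 * s          # equals Tr[sqrt(A)] squared
    if t_squared == 0.0:
        raise ValueError("W is zero: Omega is undefined")
    return [[(A[0][0] + s) / t_squared, A[0][1] / t_squared],
            [A[1][0] / t_squared, (A[1][1] + s) / t_squared]]

# Hypothetical final-layer parameters: two uncorrelated tasks of unequal norm.
omega = task_covariance([[3.0, 0.0], [0.0, 4.0]])
print(omega)   # diagonal, with unit trace
```

In the alternating scheme, this recomputation of Ω would follow each round of backpropagation on W; the off-diagonal terms of Ω then express how strongly the aesthetic task and the semantic task are coupled.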
Finally, these studies were also used to determine the best frame in a photograph within a larger scene (Kao et al. 2017a).
7.2.3. BDN: halfway between classification and DNN
The studies in Wang et al. (2016) tried to couple the approach of selecting primitives and the DNN approach. The primitive selection paths run in parallel, and the authors invoked the analogy with the human brain to justify both the method and its name (brain-inspired-CNN), following in particular the brain model proposed by Chatterjee (2003), which we saw in section 2.4.2. They thus divided the processing into two levels:
– in the first level, the detection is carried out in parallel on elementary primitives: color (directly measured by its HSV components), and then each of the 14 pieces of information on photographic style given in the AVA database (see Figure 5.4), with each piece of information being handled by a DNN made up of four convolutional layers and two fully connected layers;
– in the second level, the outputs of these detectors are used as the input for a network with 128 inputs, four convolutional layers and two fully connected layers (see Figure 7.11).
The final output consists of two values: the mean and variance of the Gaussian, which reproduces the curve of the distribution of opinions of the AVA jury (see Figure 5.2)10. These values are obtained by replacing the softmax output function with a Kullback–Leibler divergence computation. An interesting result from this study is that while it was expected that learning the distribution of scores would allow classification along a continuous scale, the authors insist on the particular role played by this distribution in improving the binary classification: “beautiful image” versus “ugly image”. This proposed system was implemented using a distributed Cuda-Convnet computation package11.
7.2.4. Using the distribution of the evaluations
We have seen that the preponderance of images that received an average score in the AVA database often posed the problem of over-representation of average images in systems that used DNNs. We have also seen that in order to avoid this over-representation, researchers have often chosen to remove average images by using an interval (denoted by ±δ). Another strategy consists of weighting each sample by a weight that is inversely proportional to the frequency of its attributed score. The efficacy of this solution is especially seen in Jin et al. (2016a).
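The two strategies just described for handling the excess of average images can be sketched in a few lines. The central value of 5, the bin width of one score unit and the normalization of the weights to a mean of 1 are assumptions made for illustration; the source only states that average images are removed within ±δ, or that weights are inversely proportional to the frequency of the score.

```python
from collections import Counter

def drop_average_images(mean_scores, delta, center=5.0):
    """Indices of images whose mean score lies outside [center - delta, center + delta]."""
    return [i for i, s in enumerate(mean_scores)
            if s > center + delta or s < center - delta]

def inverse_frequency_weights(mean_scores, bin_width=1.0):
    """One weight per image, inversely proportional to how common its score bin is."""
    bins = [int(s // bin_width) for s in mean_scores]
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    scale = len(raw) / sum(raw)          # rescale so that the mean weight is 1
    return [w * scale for w in raw]

# Hypothetical mean scores: four average images, one good and one poor image.
scores = [5.2, 5.4, 5.1, 5.8, 7.3, 2.9]
print(drop_average_images(scores, delta=1.0))   # keeps only the two extreme images
print(inverse_frequency_weights(scores))        # rare scores weigh four times more
```

The first function discards data, whereas the second keeps every image and lets the rare, very good or very poor photos count for more in the loss.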
10 The authors note, however, that the choice of a Gaussian distribution is not apt for photos of very good or very bad quality, for which the scores have a very asymmetrical distribution.
11 Cuda-Convnet is a package developed by the University of Toronto from 2011 onwards, making it possible to simulate any kind of DNN. It is written in C++ and parallelized in Cuda (Krizhevsky et al. 2017).
The distribution of the scores awarded to the same image is a source of information regarding how the image was received by evaluators, independent of the average value that is generally used. If the scores have a high variance, this expresses the fact that there were very diverse opinions and, probably, a certain difficulty in understanding the beauty of the image and, finally, difficulty in giving an aesthetic judgment. A high variance probably reflects the need to have several personal elements of judgment that not everybody has: taste, sensibility, culture, education, etc. It is, therefore, an important piece of information and several systems have sought to understand the distribution of scores rather than the mean value. This is especially the case with the research by Jin et al. (2016a). While it does not improve the performances in evaluating the beauty of an image, this research does make it possible to assess the difficulties encountered by a panel of observers when judging this image.
Talebi and Milanfar (2017) went further down this path in a system called neural image assessment (NIMA). In order to learn the distribution of scores, they optimized the weights in the network using a criterion taken from optimal transport (Earth Mover’s Distance). They used the TensorFlow software library (see footnote 9) and the DNN was pre-trained on the ImageNet database in order to accelerate the learning phase. Three different architectures were compared: VGG16 (see footnote 2), Inception-v212 and MobileNet13. Since Inception-v2 has a slightly superior performance, it was selected from these three. The results obtained show there is fairly satisfactory agreement with the scores awarded by human judges in the AVA (a mean error of 5% and maximum of 12% on the 27 examples presented), although the mean correlation on the AVA database is mediocre (of the order of 0.6). Going back to a two-category classification with thresholding, these give over 80% of correct responses, close to the state of the art.
A very similar approach was proposed by Murray and Gordo (2017). This approach also made it possible to predict:
– the classification into two categories with a threshold of 5;
– the aesthetic score as the mean of the distribution of the scores;
– the distribution of scores (distributed over 10 levels).
The optimization method adopted is distinguished by a criterion that uses a Huber norm on the distributions (quadratic below the threshold, linear above the threshold).
12 Inception-v2 uses a parallelization of the convolution and pooling steps, and replaced the fully connected layer with a max pooling layer, thus notably reducing the number of parameters to be learnt (Szegedy et al. 2015).
13 MobileNet is a DNN dedicated to embedded vision architectures. It is distinguished by the fact that the convolution steps are replaced by separable filters, which are particularly fast (Howard et al. 2017).
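The two criteria mentioned in this section, the Earth Mover’s Distance between score distributions and the Huber norm, can be compared on a toy example. The formulation below is a sketch under stated assumptions: the distance is computed on cumulative distributions over 10 ordered levels with an exponent r = 2 and a normalization by the number of levels, and the Huber threshold is set to 1. These settings are common choices, not values confirmed by the source.

```python
from itertools import accumulate

def emd(p, q, r=2):
    """Earth Mover's Distance between two distributions over ordered score levels."""
    cdf_p, cdf_q = list(accumulate(p)), list(accumulate(q))
    total = sum(abs(a - b) ** r for a, b in zip(cdf_p, cdf_q))
    return (total / len(p)) ** (1.0 / r)

def huber(residual, threshold=1.0):
    """Quadratic below the threshold, linear above it."""
    a = abs(residual)
    if a <= threshold:
        return 0.5 * a * a
    return threshold * (a - 0.5 * threshold)

def mean_score(distribution):
    """Mean of a distribution over the levels 1, 2, ..., len(distribution)."""
    return sum((level + 1) * p for level, p in enumerate(distribution))

def one_hot(level, n_levels=10):
    """All votes on a single level (levels are numbered from 1)."""
    return [1.0 if k == level - 1 else 0.0 for k in range(n_levels)]

votes_5, votes_6, votes_10 = one_hot(5), one_hot(6), one_hot(10)
print(emd(votes_5, votes_6))    # neighbouring levels: small distance
print(emd(votes_5, votes_10))   # distant levels: larger distance
```

Unlike a criterion that compares the two histograms bin by bin, the distance computed on cumulative distributions grows with the gap between levels, which is why it suits ordered scores.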
Figure 7.11. Architecture of the BDN system developed by Wang et al., inspired by the human brain (from Wang et al. (2016))
COMMENT ON FIGURE 7.11.– There is a set of detectors arranged in parallel at the input, which are supposed to detect primitives independent of one another. Apart from the first three, which measure the chromatic HSV primitives, the other channels use DNNs. The outputs at this stage are then sent to another DNN, called a “high level” DNN. The network also predicts the distribution of scores (similar to what would be given by the AVA).
7.2.5. Extracting a “dramatic” image from a panorama: the Creatism system
The Creatism system was presented in Fang and Zhang (2017). It proposed a complete processing chain to extract the best section of landscape from a 360º panoramic photo. It then applied a series of treatments to enhance the appeal of this segment: contrast enhancement, applying HDR dynamic extension, applying vignetting filters to darken the edges, distorting the grayscales and colors and then applying a “dramatic” filter to darken the clouds and enhance the contrasts. This method is based on the hypothesis that the aesthetic quality function can be broken down into a product of the functions of specific aspects (contrast, color, etc.) that can be sequentially optimized.
The first step consists of searching for the most attractive sub-image. To do this, the panorama is divided into six images of 90º fields, thus ensuring there is an overlap of zones to avoid cutting out any potential interest points. After this, a window is randomly drawn in each image and this window is modified by adding or removing bands on the borders, and then re-sampling the window such that it remains square. It is then re-evaluated through a DNN and the process is iterated by a gradient descent.
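The search for the best window can be sketched as follows. Two points are assumptions and not the method of Fang and Zhang: the six views are taken to be evenly spaced (one every 60º, so that neighbouring 90º fields overlap by 30º), and the gradient descent is replaced here by a simple greedy random search, in which a window is shifted or resized by a small band and the change is kept only if the score improves. The `toy_score` function is a hypothetical stand-in for the DNN that evaluates a window.

```python
import random

def panorama_views(n_views=6, field=90.0):
    """(start, end) angles in degrees of evenly spaced views around a 360-degree panorama."""
    step = 360.0 / n_views
    return [((i * step) % 360.0, (i * step + field) % 360.0) for i in range(n_views)]

def overlap_between_neighbours(n_views=6, field=90.0):
    """Angular overlap shared by two consecutive views."""
    return field - 360.0 / n_views

def refine_window(score_fn, size, start, steps=200, seed=0):
    """Greedy random search over square windows (x, y, side) inside a size x size view."""
    rng = random.Random(seed)
    best, best_score = start, score_fn(start)
    for _ in range(steps):
        x, y, side = best
        # Move or resize the window by a small band while keeping it square.
        candidate = (x + rng.choice([-1, 0, 1]),
                     y + rng.choice([-1, 0, 1]),
                     side + rng.choice([-2, 0, 2]))
        cx, cy, cs = candidate
        if cs < 1 or cx < 0 or cy < 0 or cx + cs > size or cy + cs > size:
            continue                      # the window would leave the view
        score = score_fn(candidate)
        if score > best_score:            # keep only improvements
            best, best_score = candidate, score
    return best, best_score

def toy_score(window):
    """Hypothetical aesthetic score: prefers windows centered on the point (60, 40)."""
    x, y, side = window
    return -((x + side / 2.0 - 60.0) ** 2 + (y + side / 2.0 - 40.0) ** 2)

print(panorama_views())
print(refine_window(toy_score, size=100, start=(0, 0, 20)))
```

A gradient-based variant would require a score that is differentiable with respect to the window parameters; the greedy search only needs the score to be computable.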
When the best window is isolated, then the enhancement operations are carried out in succession, by changing the parameters of each improvement (increasing the contrast, distorting colors, etc.), using a Snapseed toolbox14 and the DNN in a generative, adversarial configuration (GAN) (see section 7.1.5). The Creatism system has been tested by professionals, who have confirmed its ability to identify beautiful sub-images and to modify them aesthetically. One example of this is given in Figure 7.12.
14 Snapseed is an image-processing toolbox created in 2011. Initially developed for Windows, it is now available on Android and iOS. Snapseed is the property of Google: https://snapseed.fr.softonic.com.
7.3. Written appraisals: analyzing them and formulating new ones
Several studies have sought to make use of written comments to accompany the binary or quantified evaluation of an image. They were generally based on the AVA-Comments database that accompanied part of the 250,000 images in the AVA database. However, most of the AVA-Comments are difficult to use in their original form as they are imprecise or uncertain. Composed in natural language, they abound in abbreviations, exclamations and interjections that are difficult to interpret. The opinions are expressed either through the choice of words used or through the syntax. Phrases such as “I love the color” or “Nice shot” abound, and must be accurately weighted.
Figure 7.12. The Creatism system, beginning with the 360º panorama (b), looks for the most aesthetic sub-scene and then applies a series of treatments to this to modify the contrast, the color, the dynamic, etc., so as to make it “dramatic”. All these operations are carried out by a DNN, which has a generative configuration (DNN-GAN) for the modifications of the appearance parameters. The result is the image (a) (from Fang and Zhang (2017))
Most of these studies supported a quantitative classification by associating this textual information with the information from the image: they are distinguished by the ambition of the semantic analysis carried out. Some studies select a few words, judged to be pertinent, which are injected as the input or, more often, into one of the final layers of a DNN that analyzes the image. Other studies carry out a semantic analysis in parallel to the image processing. This semantic analysis can be done using traditional natural language processing (NLP) tools, or using neural approaches. In the latter case, a widely used tool is the long short-term memory (LSTM) (Hochreiter and Schmidhuber 1997). LSTM is a recurrent network architecture whose gated memory cells keep backpropagation and gradient descent effective over long sequences. It allows rapid learning and is well-suited to processing strings of data: text, lyrics and music.
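One step of an LSTM cell can be written out explicitly to show what the memory does. This is the standard textbook formulation of the cell (input, forget and output gates plus a candidate value), given here as a sketch; the parameter layout and the helper `zero_params` are hypothetical conveniences, and no claim is made about the exact variant used in the studies cited.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, U, b, x, h):
    """W x + U h + b for one gate; W and U are lists of rows, b is a list."""
    return [sum(w * xi for w, xi in zip(w_row, x))
            + sum(u * hi for u, hi in zip(u_row, h)) + bias
            for w_row, u_row, bias in zip(W, U, b)]

def lstm_step(params, x, h_prev, c_prev):
    """One time step: the gates decide what to forget, what to store and what to output."""
    i = [sigmoid(v) for v in affine(*params["input"], x, h_prev)]
    f = [sigmoid(v) for v in affine(*params["forget"], x, h_prev)]
    o = [sigmoid(v) for v in affine(*params["output"], x, h_prev)]
    g = [math.tanh(v) for v in affine(*params["candidate"], x, h_prev)]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]  # memory cell
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]                    # hidden state
    return h, c

def zero_params(n_in, n_hidden):
    """Hypothetical all-zero parameters, useful only to check the mechanics."""
    def gate():
        return ([[0.0] * n_in for _ in range(n_hidden)],
                [[0.0] * n_hidden for _ in range(n_hidden)],
                [0.0] * n_hidden)
    return {name: gate() for name in ("input", "forget", "output", "candidate")}

# With zero parameters every gate equals 0.5, so the memory is simply halved.
h, c = lstm_step(zero_params(3, 2), [1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0])
print(c)
```

The additive update of the memory cell `c` is what lets information, and gradients during training, survive across many words of a comment.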
Figure 7.13. Construction of comments from the PCCD database (Chang et al. 2017)
COMMENT ON FIGURE 7.13.– (a) Two examples of photographs with comments are taken from the PCCD base that is used for learning the vocabulary and syntax of comments on aesthetics. The comments touch upon the seven items chosen by Chang et al. Each item is also given a score out of 10. (b) An example of a caption is generated by the automatic system developed by Chang et al. (2017). It can be seen that there is a certain incoherence in the remarks, since extending the upper part of the image (to show more of the sky and branches) could lead to a weakening of the rule of thirds invoked in the first aesthetic criterion.
The studies presented in Hii et al. (2017) illustrate how researchers can make use of AVA-Comments to work toward this goal. The semantic processing of the comments is based on a representation by words and uses the GloVe (Global Vectors) algorithm (Pennington et al. 2014), which makes it possible to select the most significant words through the factorization of the co-occurrence matrix of the words used. Each image is described by 100 words belonging to the complete vocabulary of 20,000 words taken from the comments. A two-layer recurrent network is then used, acting on the textual terms. The results from this are concatenated with those from a multiGAP network that processes the visual part. Both combined lead to a two-state classifier.
7.3.1. Photo critique captioning dataset (PCCD)
There are more ambitious projects that aim to generate written comments that resemble those given by human experts. One of these is the project by the San Pedro team (San Pedro et al. 2012), which used primitives and classifiers (see section 6.3.4). The studies in Chang et al. (2017) try, first of all, to describe the aesthetic qualities.
The learning does not happen on the AVA database, but on a collection of original data, called the PCCD, taken from a professional photography site15. It contains many comments that are much more pertinent and precise than those posted by amateurs on the AVA database, or those of the MSCOCO database, which is a generalist database used to analyze scenes, bringing together images and their captions. Each comment contains seven specific items: the general impression, the subject, composition/perspective, exposure/shutter speed, depth of field, color/lighting, focus. Each item is accompanied by a score of between 1 and 10. Figure 7.13 presents two examples of the learning database, as well as the result of the synthesis obtained for one of the photos.
15 The PCCD (photo critique captioning dataset) is made up of photos and comments taken from the GURU site: https://gurushots.com/. It contains 4,200 images and 30,000 comments. We were unable to access the database itself.
In these studies, an automatic image indexing method used to construct captions (the NeuralTalk2 captioning algorithm16 (Karpathy and Fei-Fei 2015)) is extended to the vocabulary of aesthetics. This algorithm looks for alignment between various forms identified within the image and the words in the given vocabulary. A recurrent network, CNN-LSTM, is used, exploiting the specific properties of photography aesthetics in relation to the themes in the comments. The network makes use of multiple comments on a single image to create a new comment that acts as a synthesis of all the earlier comments.
7.3.2. Neural aesthetic image retriever (NAIR)
This is also the case with the NAIR system (Wang et al. 2018), which is quite close to the earlier one, but which also offers a quantitative analysis of the image. The architecture is again that of a recurrent network with two branches, one branch with LSTM learning and a conventional, convolutional branch (the Inception-v3 model and TensorFlow software). The LSTM branch follows the textual analysis schemes proposed in Vinyals et al. (2015). These have proven to be effective in automatic text translation tasks. The learning database is taken from AVA with the associated comments and captions. It can be seen, upon examining some results in Figure 7.14, that this highly laudable effort to provide a justification for the evaluation must be further improved, as the terms employed at present are imprecise and not very pertinent.
7.3.3. Semantic processing by Ghosal et al.
In Ghosal et al. (2019), the authors tried to extract coherent comments from the imperfect and disorganized remarks accompanying the photographs in the AVA-Comments database. Specific attention was paid to the correction of spelling mistakes, abbreviations and interjections, which bring down the quality of any texts taken from social networks. This is a classic step in processing natural language texts.
A second step makes it possible to measure the scope of information provided by chains of n terms (n-grams) using the TF-IDF approach (term frequency-inverse document frequency), and then to group together the equivalent n-grams through a weakly supervised semantic classification. A photo is thus described by the set of n-grams from various comments and from the terms that are associated with it, considered as the non-noisy ones from AVA-caption. A latent Dirichlet allocation (LDA) makes it possible to extract the most significant 1-grams and 2-grams, which are used to construct the synthesized caption using the NeuralTalk2 network.
16 NeuralTalk and NeuralTalk2 are networks that make it possible to comment on images using the LSTM (long short-term memory) principle: https://deepai.org/machine-learning-model/neuraltalk (August 2020).
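The TF-IDF weighting of n-grams mentioned above can be sketched on a handful of comments. The variant used here (term frequency normalized by the number of n-grams in the comment, inverse document frequency log(N/df), whitespace tokenization) is one common definition that we assume for illustration; the source does not state which variant Ghosal et al. used, and the example comments are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All chains of n consecutive terms in a list of tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tf_idf(documents, n=2):
    """One dictionary per document, mapping each n-gram to its TF-IDF weight."""
    grams = [Counter(ngrams(doc.lower().split(), n)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter(g for counts in grams for g in counts)   # documents containing g
    weights = []
    for counts in grams:
        total = sum(counts.values())
        weights.append({g: (k / total) * math.log(n_docs / doc_freq[g])
                        for g, k in counts.items()})
    return weights

# Hypothetical comments, in the style of those found under a photo.
comments = ["nice shot", "nice shot great colors", "great colors and composition"]
for weights in tf_idf(comments, n=2):
    print(weights)
```

An n-gram that appears in every comment receives a weight of zero, which is how ubiquitous but uninformative phrases are discounted before the semantic grouping.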
Figure 7.14. Results of the aesthetic evaluation of four images in the AVA database by NAIR: an evaluation of the aesthetics as well as a written expression of the evaluation, both compared to their “truth” in the AVA database (from Wang et al. (2018))
7.3.4. Aesthetic multi attribute network (AMAN)
In another publication (Jin et al. 2019), quite a complete system was proposed, which gave both written comments and evaluation scores over five specific properties of the photo: “color and lighting”, “composition”, “depth of field and focus”, “impression and subject” and “use of the camera”. These properties are those from the PCCD (see footnote 15), which is used to train the AMAN. Figure 7.15 presents the keywords associated with each property.
Figure 7.15. The five criteria for judgment in the AMAN system (on the left) and the most-frequently associated keywords encountered in the PCCD database (on the right); the numbers in parentheses indicate the number of occurrences (from Jin et al. (2019))
Figure 7.16. Scheme of the AMAN system by Jin et al.
COMMENT ON FIGURE 7.16.– The system contains two cores. The one in the upper box, denoted by MAFN, takes a dense map of primitives as input and feeds two channels: one dedicated to the overall processing of the primitives and the other to their individual processing. The lower box, denoted by LGN, is dedicated to the synthesis of a written comment (the five colored boxes at the bottom left). It is fed by the output of the MAFN (from Jin et al. (2019)).

A network was constructed as per the scheme in Figure 7.16. It directly processes the image after breaking it down into primitives, yielding an overall score and five individual scores (one per property); it also processes the text associated with each primitive using an LSTM network dedicated specifically to comments. These two processing pathways are combined at the end of the network. Since the PCCD is too small to carry out an end-to-end experiment, the authors first created a database of comments across five levels from the single-level comments database in AVA, using the PCCD as the training database. This allowed them to create a database (which they named DPC-Captions) equipped with a sub-base of comments with five primitives. Tested on a small number of images from the AVA database, the AMAN method yielded comments that seemed to agree reasonably well with what a human expert might have said. Although the verification is very subjective, and thus not very reproducible, it appears that the agreement is satisfactory for two-thirds of the images examined.

7.4. Measuring subjective beauty

In all the studies presented so far, beauty has been treated as a universally accepted and shared property: a totally objective property that depends only on the photo itself. The truth given by the training database (very often the AVA) is not debatable and, if the network functions well, the score it awards a new photo is a measure of its beauty, which is valid for any observer.
As we emphasized in the first few chapters, this point of view is not universally shared, and it is often accepted that aesthetic judgment arises from subjective criteria that depend on many factors: temperament, state of mind, education, culture, environment, etc. (factors that are generally grouped together under the term “idiosyncrasies”). Studies on the beauty of photographs nevertheless never use a purely subjectivist hypothesis. Instead, the objectivist approach presented in the preceding sections has been modified in a few studies, which we present here, in order to take the observer into account, sometimes as a marginal factor, sometimes more in depth (Maître 2020). The chief difficulty in applying earlier research to any specific observer based on a DNN adapted to their preference is the size of the learning database required,
annotated with the judgments of each individual. This difficulty has been overcome in various ways:
– it is possible to find a personalized profile of the observer and then use this to specify the evaluation system (the profile is then associated with the image to be judged at the input of the DNN);
– it is possible to modify a generalist network of aesthetic evaluation with a learning step that uses a few test images, in order to adapt it to the user’s specific aesthetic tastes (here it is the network parameters that will code the specific tastes of the observer);
– several evaluation systems can be developed in parallel, responding to different aesthetics, and we can then see which corresponds best to the specific observer.

These different paths are often followed simultaneously in approaches that are quite complex and which we will see below.

7.4.1. Recommendation systems

7.4.1.1. Recommendation and online purchases

The most widespread techniques today for adapting an offer to a demand that is not explicitly stated are those used by recommendation systems. They are particularly efficient at offering a client products that match their tastes in the fields of cinema, music, and purchases in general, and hobbies in particular. This is done by gathering the client’s past orders, as well as any relevant information about their age, where they live, their social status and even their tastes with respect to the associated domains. Based on all this information, the client is categorized into a group whose practices seem similar. It is then possible to make them offers that this group appreciated unanimously. If we wish to predict their reaction to a particular product, we can attribute to them the average opinion of the other group members, or the opinion of the client who resembles them the most within the group.

In the field of cinema, the concerns seem quite close to those in the field of art photography (Deldjoo et al. 2016; Elahi et al. 2017; Deldjoo et al.
2018), and we could hope to benefit from the analysis carried out of the image signal (choice of contrasts, colors, framing) and transpose it to the fixed image. Unfortunately, the role of the image today is too small for recommendations with respect to movies, and it is also a secondary role with respect to the semantic information which photography does not have (the script, production, casting, as well as the order of images, scene changes, the soundtrack, and so on). In the field of online sales of paintings, recommendations have been the subject of recent studies (Benouaret 2017; Dominguez et al. 2017; Messina et al. 2017, 2018). These may be more easily transposed to photography than the research carried out
on cinema, as the field of art is particularly sensitive to complementary aspects of the style of the artwork as well as the artist’s manner, both of which also play a role in photography. However, certain significant features of the art market also set them apart, such as the fact that there often exists only a single copy of an artwork and, therefore, multiple buyers cannot give their opinion on the work, as it disappears from public view after the purchase. The data from online art galleries mix visual criteria related to the painting (color, dimensions, theme) and high level criteria (author, date, style, etc.). Researchers confirmed that the approaches using low level properties of the image processed by the DNN are more effective than those based on classifiers with dedicated primitives (Dominguez et al. 2017). Nonetheless, the role played by this information from the image remains marginal with respect to the role played by symbolic primitives. Thus, the metadata “Name of the artist” is given prime position in the final recommendation (Messina et al. 2018).

Another recommendation system was also developed to help searches within the BAM! database by the artistic site Behance (see section 5.4) (He et al. 2016). It uses two models concurrently, one using a personalized Markov chain to track the user’s preferences and the evolution of their tastes, and the other modeling the visual appearance of various elements in the scene. The whole setup makes up the VISTA+ recommendation system. These are latent knowledge engines, which make it possible to define the relationship between an observer and an artwork, the relationship between an observer and other observers, the dependence between the works and finally the evolution over time of the user’s preferences over their various visits and especially with respect to their previous purchases.
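The two prediction rules described at the start of section 7.4.1.1 (the average opinion of the group, or the opinion of the most similar client) can be sketched as follows. This is a toy illustration, not the method of any system cited here: the client names, the ratings and the distance measure (mean absolute difference over commonly rated items) are all assumptions.

```python
def distance(u, v):
    """Mean absolute rating difference over the items both clients rated."""
    common = [item for item in u if item in v]
    if not common:
        return float("inf")
    return sum(abs(u[item] - v[item]) for item in common) / len(common)

def predict_group_average(group, item):
    """Attribute to the client the average opinion of the group."""
    ratings = [r[item] for r in group.values() if item in r]
    return sum(ratings) / len(ratings) if ratings else None

def predict_nearest_neighbor(target, group, item):
    """Attribute to the client the opinion of the most similar member."""
    candidates = [(distance(target, r), name)
                  for name, r in group.items() if item in r]
    if not candidates:
        return None
    _, best = min(candidates)
    return group[best][item]

# Hypothetical group of clients with similar practices (ratings from 1 to 5).
group = {
    "ana": {"film_a": 5, "film_b": 1, "film_c": 4},
    "ben": {"film_a": 1, "film_b": 5, "film_c": 2},
    "cy": {"film_a": 4, "film_b": 2, "film_c": 5},
}
target = {"film_a": 5, "film_b": 1}  # new client, has not rated film_c

average_prediction = predict_group_average(group, "film_c")
neighbor_prediction = predict_nearest_neighbor(target, group, "film_c")
```

Here the group average for film_c is 11/3, while the nearest-neighbor rule copies the rating of "ana", whose past ratings coincide with those of the new client.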
Let us finally note that in the field of photography, specialized research has examined a very particular aspect of recommendation, which governs the choice of special effects filters on social networks. These filters, which act on the contrast and colors in an image, are few in number (about 20, to be found in the editing toolboxes on social networks), while the task of recommendation is quite complex, as it consists of offering a given user, for a given image, one or two filters and associated parameters, adapted to their taste and their images (Sun et al. 2017). The proposed software, which learns with a DNN, ideally on pairs of images processed by different software, functions relatively well in 80% of cases.

7.4.1.2. Image recommendations

In the systems described earlier, the profile that is established is a social profile: age, gender, profession, family environment, hobbies, etc. (Kosinski et al. 2013, 2014). It is probable that these determinants play a role in the user’s tastes, but few studies to date have established usable links that would allow us to deduce the user’s aesthetic preferences from this information. Some researchers take into account the user’s social network habits to modify the order of images proposed by the search engine when responding to a search (Cui et al. 2014). No criterion of taste is used here; only habits (membership, connections, activities) are taken into account.
The research reported in Kairanbay et al. (2019) is a step closer to this, however. It shows what information we can extract from the images posted by a photographer on their site. In order to do this, the authors use a very specific subset of photos taken from the AVA database for which, exceptionally, social data (age, gender, country of origin) are available on the photographers who took them, thus constituting a new database called AVA-PD (see section 5.4). The authors then show that a profile reduced to these data alone can partly be derived from the photos posted by the photographer. They also show that, to a certain extent, it is possible to predict how this photographer will evaluate a given photo based on the photographs they have posted. To obtain this result, the authors adopt a conventional scheme for learning taste: transfer learning complemented by a fine-tuning step (Nagabandi et al. 2018) with the help of images posted by the photographer.

Other authors sought to use images posted on social networks (most often Flickr, Instagram or Pinterest) by an Internet user to narrow in on their areas of interest, by examining the style and content of these images (Lovato et al. 2013; Yang et al. 2015; You et al. 2016). These works differ in whether they emphasize the characteristics of the image or its themes. The research produced results that were quite convincing with regard to the ability of the learning techniques to predict the interest that a photographer would show in a given photo. Nonetheless, there is nothing in the results to show that an aesthetic judgment was engaged.

7.4.2. Defining the user’s psychological profile

7.4.2.1. Personality and the Big Five

The observer’s psychological profile is often considered a fundamental determinant of aesthetic taste, outranking culture, mood or context (Konecni 1979; Jacobsen 2010).
Many classic studies inclined toward categorizing personality profiles (Eysenck 1991), while others looked at quick ways of determining a subject’s profile through tests that were as short as possible (John et al. 1991; Costa and McCrae 1992; Rammstedt and John 2007). The use of five major dimensions of personality (the Big Five (Goldberg 1990)) gradually emerged as the dominant approach, especially for online studies. The five most significant dimensions were as follows (with respect to the jargon that was emerging in this domain): openness, conscientiousness, extraversion, agreeableness and neuroticism (see footnote 17).

17 The Big Five: O openness (curiosity, open-mindedness), C conscientiousness (sense of responsibility), E extraversion (ease of making contact), A agreeableness (tendency to generosity), N neuroticism (tendency to see the worst in any situation). These are often represented by the initials and the method is often referred to as OCEAN. It must be recalled, however, as it is in Eysenck (1991), that these dimensions are still debated and other models sometimes oppose them.
An image, especially an image posted on a site by an Internet user, is an objective way of expressing their personality. This may be analyzed, like any exchange during a human interaction, using proven statistical models (Brunswick 1956). Thus, Cristani et al. (2013) associated the photos posted with the psychological profile of the photographer. They first created a database, PsychoFlickr (using quite a complex supervised protocol), associating 60,000 images posted by 300 “professional” users on Flickr with “author” and “recipient” psychological profiles expressed on the Big Five for each of the 300 users. The author profile represents what the author wished to present, while the recipient profile is what a visitor to the site perceives. Then, based on the primitives taken from the photos on the site, a classifier learns to relate the properties of the image and the photographer’s psychological profile. This classifier then makes it possible to draw up the profile of an unknown Internet user based on the photos they have posted. This research, carried out using handcrafted primitives and regression, was taken up again and extended in Segalin et al. (2016, 2017) with neural networks, leading to an improvement, particularly with respect to the correct classification on the five categories of traits (except “neuroticism”, which is not measured as well). The research also distinguished between the profile that the author wished to present and the profile that was actually perceived by an interlocutor, as these two profiles are often considerably different.

7.4.2.2. The Big Five and aesthetics

In Li et al. (2020), we see how an individual profile can be used to obtain a subjective aesthetic judgment. During a learning phase, two networks are placed in parallel, one tasked with giving a “generic” aesthetic score according to the authors (“objective” in our terms), the other tasked with measuring a user profile based on their representation on the Big Five.
These networks are identical in the first few layers (Siamese network) but have differing output layers. The first is trained, using the universal AVA database, to award an aesthetic score to any image that is input. The second is trained using PsychoFlickr and Flickr-AES (see section 5.4). It learns to estimate the photographer’s profile based on the images they have taken and also, with the help of the first network, to determine how this profile will intervene in the aesthetic judgment (Figure 7.17). During the subjective evaluation of a new photo, a “generic” aesthetic score is measured in the first branch, while the Internet user’s profile is determined in the second branch based on their photos. The outputs of the two networks are then combined: the generic score is linearly modified by five corrective terms introducing the subjective contribution inherent to the Internet user’s profile. Each term expresses the contribution of one of the Big Five as the product of this factor as measured in the observer and the weight of this factor in the image. The weight of the Big Five in the image (or the “profile” of the image) is illustrated in Figure 7.18.
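The combination step described above (a generic score corrected linearly by five products, one per Big Five trait) can be written in a few lines. This is only a sketch of the arithmetic stated in the text; the function name and all numerical values are hypothetical, and the actual fusion layer of Li et al. (2020) may include additional parameters.

```python
TRAITS = ("O", "C", "E", "A", "N")

def personalized_score(generic_score, observer_profile, image_profile):
    """Generic score plus one corrective term per Big Five trait.

    Each term is the product of the trait measured in the observer and
    the weight of that trait in the image.
    """
    correction = sum(observer_profile[t] * image_profile[t] for t in TRAITS)
    return generic_score + correction

# Hypothetical values, for illustration only.
observer = {"O": 2.0, "C": 0.0, "E": -1.0, "A": 0.5, "N": 0.0}
image = {"O": 0.1, "C": 0.3, "E": 0.2, "A": -0.2, "N": 0.05}

score = personalized_score(5.5, observer, image)
```

With these values the correction is 0.2 - 0.2 - 0.1 = -0.1, so the generic score of 5.5 becomes approximately 5.4 for this observer. An observer whose five trait scores are all zero would simply receive the generic score.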
Figure 7.17. Architecture of the system developed by Li et al. It uses two branches of a Siamese network: The upper branch measures a generic beauty, identical for all observers; the lower branch determines the profile in terms of the Big Five (represented by the five scores on OCEAN). The two evaluations are then merged in a step that provides a single, personalized score (Li et al. 2020)
Figure 7.18. Two images projected in the space of the Big Five. The five scores lie between −4 and +4: O = openness, C = conscientiousness, E = extraversion, A = agreeableness, N = neuroticism, according to Li et al. (2020)
7.4.3. Learning the user’s tastes through tests

The research presented up to now uses photos posted by the user that reflect their aesthetic tastes. However, this situation is quite rare, as few of the users whose judgment we wish to predict offer this source of information. It is therefore necessary to include a prior testing step (as was done, for example, to determine the Big Five). These tests generally consist of judging a small number of images. As they are time consuming, authors try to shorten them. The literature today retains two families of tests: one is rapid and is carried out using 10 images, and the other, which is longer, uses 100 images. The results do not seem to favor longer tests. While intuitively one would opt for the longer tests, some authors have noticed that performance on these tests dips over time, probably because the observer’s attention fades.

The idea behind several studies conducted in this manner is to specialize a network of “generic beauty” by fine-tuning the final neural layers using the test results. This yields, at low cost, an aesthetic judgment network that is specialized in the tastes of the user who responded to the test.
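The principle of specializing a generic network with a handful of test images can be sketched under strong simplifying assumptions: the early layers are taken as frozen and are represented only by the feature vectors they output, and a single final linear layer is adjusted by stochastic gradient descent on the user's answers. The feature values, scores, learning rate and function names below are hypothetical.

```python
def predict(weights, bias, features):
    """Output of the final linear layer for one frozen feature vector."""
    return bias + sum(w * x for w, x in zip(weights, features))

def fine_tune_last_layer(weights, bias, samples, lr=0.05, epochs=500):
    """Adapt the last layer to a user's scores by gradient descent.

    samples: list of (frozen_features, user_score) pairs from the test.
    """
    weights = list(weights)  # leave the generic weights untouched
    for _ in range(epochs):
        for features, target in samples:
            error = predict(weights, bias, features) - target
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

# Generic last layer (hypothetical): both features count equally.
generic_w, generic_b = [0.5, 0.5], 0.0

# Hypothetical test answers: this user only cares about the first feature.
user_test = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0),
             ([1.0, 1.0], 1.0), ([0.5, 0.5], 0.5)]

user_w, user_b = fine_tune_last_layer(generic_w, generic_b, user_test)
```

After fine-tuning, the prediction for an image with features [1.0, 0.0] moves from the generic value 0.5 toward the score of 1.0 given by this user, while all earlier layers (and hence the feature vectors) are unchanged.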
In Zhu et al. (2020), the researchers made use of an optimization-based meta-learning technique. In order to do this, an annotated database is required that has kept a trace of the people who annotated it (this is the case with two databases: Flickr-AES and AADB). A network was trained (first on ImageNet, and then refined for each observer) and, in this way, the authors first determined the rules shared by all observers when judging the images, and then the parameters that made it possible to customize this model for each observer. When a new, unknown user appears, a series of tests is used to determine which known user they most closely resemble, and the network specialized for that known user is then used to judge new photos for this observer. Evaluations of this method show up to 70% correct classification on two classes when the learning is carried out with a series of 100 test images. However, the rate of correct responses dips to 56% with only 10 images. By comparison, similar research that used matrix factorization and LDA instead of a neural network peaked at 52% in both these situations (annotation of 10 or 100 images) (O’Donovan et al. 2014).

Park et al. (2017) ask the observer to rank test images by preference (and not through a binary choice of “beautiful/not beautiful”). To accelerate this ranking, the observer is offered binary choices: they choose the most beautiful image out of a pair (ten pairs), and the algorithm then orders the set of test images in a way that respects these choices. This sorting, carried out on the small number of test images, is propagated across the entire learning base according to a nearest-neighbor principle. Finally, to evaluate a new image, a compromise is made between an objective score obtained through regression at the output of a generic classifier, and an ordering based on the observer’s personal ranking.
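The two steps attributed above to Park et al. (2017), ordering the test images so that the pairwise choices are respected and then propagating that order by nearest neighbor, can be sketched as follows. The text does not specify the ordering algorithm; a topological sort is used here as one possible choice, and the image names and two-dimensional feature vectors are hypothetical.

```python
from graphlib import TopologicalSorter

def order_from_pairs(images, preferred_pairs):
    """Order test images so that every (winner, loser) choice is respected.

    Raises graphlib.CycleError if the choices are contradictory.
    """
    sorter = TopologicalSorter()
    for image in images:
        sorter.add(image)
    for winner, loser in preferred_pairs:
        sorter.add(loser, winner)  # the winner must come before the loser
    return list(sorter.static_order())

def propagate_rank(base_features, test_features, order):
    """Give each image of the learning base the rank of its nearest test image."""
    rank = {image: position for position, image in enumerate(order)}

    def nearest(vector):
        return min(test_features,
                   key=lambda t: sum((a - b) ** 2
                                     for a, b in zip(vector, test_features[t])))

    return {name: rank[nearest(vec)] for name, vec in base_features.items()}

# Hypothetical test: the observer prefers a to b, and b to c.
order = order_from_pairs(["a", "b", "c"], [("a", "b"), ("b", "c")])

test_features = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
base_features = {"x": (0.1, 0.0), "y": (1.9, 0.1)}
ranks = propagate_rank(base_features, test_features, order)
```

Image "x", which lies close to the preferred test image "a", inherits rank 0, while "y" inherits the rank of "c". When the pairwise choices do not determine a unique order, the topological sort returns one of the admissible orders.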
The original regression algorithm implemented was inspired by support vector regression (SVR) as well as the ranking support vector machine (R-SVM). This compromise was managed through a single objective function that jointly optimized the subjective and the objective parts of the system.

In Lv et al. (2018), the authors carry out learning from a small number of images proposed by the user. A network then makes it possible to extract, from the database, those images that are closest to the user’s choices. Through an interactive feedback loop (reinforcement learning), the user can iteratively improve this selection until a specialized database is created that can account for their aesthetic tastes. Chaining three to four correction steps seems to be sufficient for effective learning, so the interaction phase is relatively short. This database is then available to create a specialized network, adapted to the user’s aesthetic (Figure 7.19). It seems that the results (measured by the correlation between the ranking of a series of photos by the network and by the user) are good (greater than 0.8), and that learning on a very small number of images (5, in this case) is preferable to longer learning.

Figure 7.19. Scheme of the USAR system by Lv et al.

COMMENT ON FIGURE 7.19.– This uses a learning loop that may be traveled iteratively. The system gives the user images from the annotated database that are close to the small set of images appraised by the user. The user confirms the pertinence of these images or rejects them, such that the entire set that is finally chosen matches their taste well. These are the images that will be used to train the system (Lv et al. 2018).

This is not the case with the research presented in Ren et al. (2017), which is quite similar in objective, but which uses a discriminating function (here, a support vector regressor with radial basis functions) instead of a network. Unlike the previous case, it appears that the performances increase regularly and noticeably with the number of images used to qualify a user, at least up to 100 images. To take this property into account, the authors developed an incremental approach, which improves with the user’s responses.

Some works combine tests on the user’s tastes (learning through a small set of tests where images are ranked) and an analysis of their social profile based on their connections on the network (Deng et al. 2017a). The image to be evaluated receives two scores: one reflecting the user’s personal tastes (determined by their responses to the tests and their participation in the social network), the other reflecting how much it conforms to generic tastes. These two evaluations are then combined in quite a complex manner.

7.4.4. Multiplying concurrent expertise

One way of disrupting the aesthetic objectivity of the DNN is to explicitly introduce subjectivity using experts who are in charge of representing competing aesthetic pathways. This is an avenue that was also partly explored in the work of Zhu et al. (2020), seen earlier.

This method was approached by Kong et al. (2016), who tasked a small number of true experts in aesthetics (retaining a trace of their expertise) with the evaluation of their AADB database (see section 5.4). Each expert expressed a different sensibility. The authors were thus able to use scores from only one expert to evaluate an image (if an Internet user felt they agreed the most with one of them) or, on the contrary, were able to use scores from several experts.
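The agreement measure quoted in section 7.4.3 for Lv et al. (2018), a correlation between the ranking produced by the network and the ranking given by the user, can be illustrated with a rank correlation. The text does not say which coefficient was used; Spearman's coefficient (without ties) is assumed here, and the photo names and ranks are hypothetical.

```python
def spearman(rank_a, rank_b):
    """Spearman rank correlation between two rankings without ties.

    rank_a, rank_b: dicts mapping each photo to its rank (1 = best).
    """
    n = len(rank_a)
    squared_gaps = sum((rank_a[photo] - rank_b[photo]) ** 2 for photo in rank_a)
    return 1 - 6 * squared_gaps / (n * (n * n - 1))

# Hypothetical rankings of five photos by the network and by the user.
network_rank = {"p1": 1, "p2": 2, "p3": 3, "p4": 4, "p5": 5}
user_rank = {"p1": 1, "p2": 3, "p3": 2, "p4": 4, "p5": 5}

rho = spearman(network_rank, user_rank)
```

In this example the two rankings differ only by one swap of neighboring photos, which gives a coefficient of 0.9; identical rankings give 1 and fully reversed rankings give -1.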
More ambitious research, reported in Hong et al. (2016), tried to determine communities of tastes among the population of Flickr members. To do this, for every image they used two types of descriptors: semantic descriptors (taken from the captions given on each image) and low level descriptors (color, contrast, contours, textures). In the space of these descriptors, each image was processed as a word by an LDA. The words were grouped into sentences (one sentence represented one photograph), which then provided the latent topics defining the communities (one topic represented one community). The communities shared neighboring subjects in the descriptor space. While there were many mathematical tools used, and while the
method was very interesting, the initial results were not totally convincing. It is tricky to identify the respective weight of the various descriptors, and it is difficult to distinguish between an aesthetic tendency and a photographic theme, and it is very difficult to characterize the real aesthetic differences in distinctions that are still quite coarse (see Figure 7.20).
Figure 7.20. Detection of communities in the cloud of descriptors using a latent Dirichlet allocation and then graph partition
COMMENT ON FIGURE 7.20.– The communities are identified by ellipses of different colors. In red: “designers”; in blue: “colors”; in green: “architecture”; in yellow: “black and white”. The three representations are particular projections of the cloud (from Hong et al. 2016). It can be seen that there is a large overlap between these communities and themes in the photos.

We would like to be able to bring together these last two approaches to construct incontestable communities around experts who are pertinently chosen to express well-established aesthetic tendencies. Furthermore, researchers must think about how to experimentally validate these various approaches, given how difficult it is to develop protocols that are reliable, reproducible and convincing. The methods that propose taking into account the observer’s subjectivity in delivering a judgment of taste suffer not only from methodologies with very complex and confused premises, but also from highly empirical and ad hoc verification strategies.
8 A Critical Analysis of Machine Learning Techniques
It is important to note that the factual basis of a science of aesthetics is not to settle whether some image or object is “objectively beautiful” – we agree with Kant that this is impossible – but rather to determine whether (or to what degree) some representative set of individuals judge or experience it as beautiful (or ugly).
Stephen E. PALMER, Karen B. SCHLOSS and Jonathan SAMMARTINO (2013)
8.1. The popularity of studies on aesthetics

Let us begin by observing that over the past few years the scientific community has grown much more interested in aesthetics. For the past 20 years, there have been over 50,000 publications a year dedicated to this field. Many of these concern the fields of philosophy, where they chiefly study two aspects: the relationship between art and beauty, and the relationship between beauty and culture. However, as we have seen, new studies intertwine aesthetics and technology, either in order to find the biological sources of aesthetic judgments or to reproduce this aesthetic judgment through algorithms. Thus, a search for the rather unusual combination “aesthetics + computer” on Google Scholar returned close to 20,000 hits in 2019, whereas there were no more than 5,000 hits 20 years before (see footnote 1).

For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip.

1 To be accurate, we must compare the frequency of the terms “aesthetics + computer” with the product of the frequency of the terms “aesthetics” and “computer”. This ratio is 16 times greater for 2019 than for 2000.
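The normalization described in footnote 1 can be written explicitly. The symbols below are introduced here for illustration only (they do not appear in the original): f denotes a frequency of occurrence in year y and r the normalized ratio.

```latex
\[
r(y) \;=\; \frac{f_{\text{aesthetics}+\text{computer}}(y)}
               {f_{\text{aesthetics}}(y)\; f_{\text{computer}}(y)},
\qquad
\frac{r(2019)}{r(2000)} \;\approx\; 16 .
\]
```

Dividing by the product of the individual frequencies corrects for the general growth in the use of each term taken separately, so that the factor of 16 reflects the growth of their association rather than the growth of the literature as a whole.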
Reference | Database | Evaluation type | Primitives | Classifier
(Ke et al. 2006) | DPChallenge | 2 classes | generic attributes | Bayes, AdaBoost
(Luo and Tang 2008) | DPChallenge | 2 classes | photo primitives | Bayes, SVM, AdaBoost
(Datta and Wang 2010) | Photo.net | 2 classes + score | photo primitives | SVM
(Marchesotti et al. 2011) | Photo.net + CUHK | score | generic attributes | PCA + FV + SVM
(Dhar et al. 2011) | DPChallenge | 2 classes | high level primitives | SVM
(Yao et al. 2012) | Photo.net | recommendations | composition, texture, color | ACQUINE + SIMPLIcity
(Lo et al. 2012) | CUHK | 2 classes | photo primitives | category-based SVM
(San Pedro et al. 2012) | DPChallenge | continuous score | internet-user opinions | SV
(Su et al. 2012) | DPChallenge | 2 classes | aesthetic library | AdaBoost
(Lu et al. 2014a) | AVA | 2 classes | HSV or Munsell | LDA + Lasso
(Lu et al. 2014b) | AVA | 2 classes | photo + sub-image | DNN - SCNN + RD-CNN
(Schifanella et al. 2015) | BEAUTY | 5 levels | photo primitives | PLSR
(Jin et al. 2016b) | AVA | 2 classes | reduced image | DNN = ILGNet
(Kong et al. 2016) | AADB + AVA | 2 classes | reduced image + category | Siamese DNN
(Mai et al. 2016) | AVA | 2 classes | pyramid + class. info. | Multitask MNA DNN
(Schwarz et al. 2016) | Tübingen | 2 by 2 classification | image: 1 track | Siamese + triplet DNN
(Kao et al. 2016b) | AVA | 2 classes + distrib. | image + 3 categories | A&C DNN
(Wang et al. 2016) | AVA - AVA2 | 2 classes + distrib. | histo. of scores | Brain-inspired DNN
(Redi et al. 2017) | AVA → Redi | 3 classes | web images learning | DNN
(Kao et al. 2017b) | AVA | 2 classes | aesthetics + category | multitask DNN
(Kairanbay et al. 2017) | AVA - Comments | 2 classes | images + style | DNN + GAP layers
(Ma et al. 2017) | AVA | 2 classes | composition + sub-images | A-Lamp, multitask
(Murray and Gordo 2017) | AVA | continuous score | histo. of scores | DNN = ResNet
(Park et al. 2017) | AVA + interact. | personalized choice | personal preference | DNN + SVR
(Talebi and Milanfar 2017) | AVA | continuous score | histo. of scores + category | multitask DNN
(Srivastava and Kant 2018) | AVA | 2 classes | reduced image + LAB color | DNN = ILGNet
(Wang et al. 2018) | AVA - Comments | explanations | AVA-Reviews | DNN = Inception
Table 8.1. The various machine learning systems for assessing photographic quality, in the order in which they appeared
COMMENT ON TABLE 8.1.– The term “photo primitives”, used here, covers all primitives that are specifically derived from photography guidelines: the rule of thirds, the focus on the object and blurring of the background, the importance of constructing the shot, etc. The high level primitives are calculated in the image, while the “categories” are taken from the data, often as metadata, sometimes calculated by the DNN. The studies that used DNNs are in the white rows.

Does this mean that the algorithmic approach is well accepted, today, in the community that studies aesthetics? Certainly not. The emergence of automated techniques has always been received with many reservations, and the support given to digital methods comes chiefly from the new milieux that study aesthetic problems, with the same enthusiasm with which they approached medical diagnostics, financial predictions or competitive games. Aesthetics has become the new challenge for computational and digital neuroscience and, today, it is largely computer scientists who have taken the plunge into digital approaches. Their work is largely unknown in the humanities and barely followed in biology. And while they themselves often seek inspiration in the writings of philosophers or biologists, it is from a distance.

8.2. A summary of learning methods

Table 8.1 summarizes a number of studies, characterized by the attributes used to measure the beauty of photographs and by the algorithms used in their calculations. The dates of these studies show how systems based on primitives and classifiers, which opened up the path into computer-assisted aesthetics, progressively gave way to neural networks. It must be noted, however, that specific classification techniques are still used for particular tasks downstream of a DNN process, as in Park et al. (2017).

8.2.1. Which architecture? Which software?
It is no surprise to note that the research presented here is often satisfied with using conventional architectures, which have proven their strength in international pattern recognition competitions. These architectures, as well as the software, are therefore the best available at a given time t and will be replaced by the generation that wins the next competition. For studies on aesthetics, particular attention has been paid to the format of the images, concerning both the resolution and the aspect ratio. But even here, the solutions adopted were those chosen for other problems. The authors seem to have no problem in processing images of a modest size (the AVA database, universally used for learning, has only reduced versions of the original images in the DPChallenge database, with a maximum size smaller than 640 pixels). Certain developments in DNNs were followed with much interest by the community of researchers working on determining aesthetics using DNNs:
multi-task networks and Siamese networks most of all, but also recurrent networks. Thus, several very different tasks were run in parallel, for example tasks involving processing the image itself, and then the processing of the metadata, or tasks that tried to deduce these from the image (category, style, context, etc.). Nonetheless, the trend seen in 2018 in the most general applications of DNNs (prioritizing an increase in the convolutional layers) was not as widespread in studies on aesthetics, and the joint use of convolutional layers and fully connected layers was followed much more often. Similarly, the architectures adopted were almost all funnel architectures rather than “yoyo” architectures, as was the trend in other domains. This was probably because researchers wished to endow the image with an overall aesthetic quality, rather than spread the evaluation over its different zones.

The network’s learning remains a tricky point. Given how deep current networks are, the databases dedicated to aesthetics appear too small to allow end-to-end learning (this is especially the case with the AVA). Augmenting the database (by replicating samples after geometric or photometric transformations) does not currently give very good results for our problem. Increasing the size by using a database of synthetic images has not yet been implemented, but it could be envisaged in the long term. “Universal” learning carried out on ordinary images (as in ImageNet) for very generic pattern recognition, followed by a transfer of learning, is used much more in current studies, without seeming to notably affect the performances of the studies on aesthetics.

8.2.2. What performances?

Today, the categorization most widely used by the community is two-class categorization (“beautiful image” versus “not a beautiful image”), using the AVA database as the benchmark for evaluating the performance of different systems.
We indicated (see section 5.4) that it is possible to have variants of this criterion based on a δ parameter, which measures the distance between the categories during the learning. The pertinence of this criterion will be discussed further on. The most significant results are presented in Table 8.2 for two values of δ and reviewed for δ = 0 in Figure 8.1. Other results were obtained using the AVA-2 database (i.e. only those images in the AVA that belong to the best 10% and the worst 10%), which is thus a simpler task, giving even higher rates of correct recognition (over 90%) (Dong and Tian 2015; Jin et al. 2016b; Apostolidis and Mezaris 2019). It can be seen that, today, these classification performances are statistically good for large sets of photos: on the AVA there is over 75% accurate classification between “high quality” and “low quality” photos on sets of several hundreds of thousands of images. It can also be seen that with some additional knowledge about the category of the scene being observed or the photographer’s style, this recognition rate can be
increased to 85%. However, it can still be noted that research aiming to obtain excellent classification rates has plateaued somewhat since 2018. Most research studies seem to be satisfied with these performances and have other goals: justifying the opinion delivered, or matching the judgment to the observer or to the scene, for example.

Method             Reference                     δ = 0     δ = 1
SVM+Gauss. mixt.   (Murray et al. 2012)          66.7%     67.0%
SCNN               (Lu et al. 2014b)             71.20%    68.63%
RD-CNN             (Lu et al. 2014b)             74.46%    77.70%
DMA-Net            (Lu et al. 2015b)             75.41%    –
MNA-CNN            (Mai et al. 2016)             77.4%     76.5%
ILGNet             (Jin et al. 2016b)            79.25%    –
BDN                (Wang et al. 2016)            76.80%    76.04%
MTRL-CNN           (Kao et al. 2017b)            79.08%    77.71%
A-LAMP             (Ma et al. 2017)              81.7%     –
NIMA               (Talebi and Milanfar 2017)    81.5%     –
Table 8.2. Comparative results of different methods of aesthetic evaluation in the AVA database, applied to a two-class categorization (“high quality”/“low quality”), for two values of separation between the classes (the δ is the same as the delta on the X-axis in Figure 6.10). The A-LAMP method (Ma et al. 2017) gives the best results (in 2017) (from Kao et al. 2017b)
Figure 8.1. Performances of two-class aesthetic evaluation (beautiful/not beautiful) of different machine learning systems over 10 years, using the AVA database, with δ = 0. It must be noted that the system developed by Murray et al. is the only one on the “podium” of best systems that uses a selection of primitives followed by a classification. All the others use DNNs
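The two-class criterion with a separation δ can be illustrated with a minimal sketch. It assumes the convention commonly used with the AVA, in which the class boundary is the midpoint 5 of the 1 to 10 scale and images whose mean score falls within δ of that midpoint are left out of the learning set; this threshold and the names `MIDPOINT`, `binary_label` and `accuracy` are illustrative assumptions, not details taken from the studies in Table 8.2.

```python
from typing import Optional

# Assumed class boundary on the 1-10 scale (hypothetical constant,
# not stated in this section).
MIDPOINT = 5.0


def binary_label(mean_score: float, delta: float = 0.0) -> Optional[int]:
    """Return 1 ("high quality"), 0 ("low quality"), or None when the
    mean score lies strictly inside the band of width 2 * delta around
    the midpoint (such images are set aside)."""
    if mean_score >= MIDPOINT + delta:
        return 1
    if mean_score <= MIDPOINT - delta:
        return 0
    return None


def accuracy(predictions, labels):
    """Fraction of correct predictions among the images that kept a label."""
    pairs = [(p, y) for p, y in zip(predictions, labels) if y is not None]
    if not pairs:
        raise ValueError("no labelled images to evaluate")
    return sum(p == y for p, y in pairs) / len(pairs)


# Three mean scores quoted in this chapter, labelled with delta = 1:
# the image scored 5.43 falls inside the ambiguous band and is set aside.
labels = [binary_label(s, delta=1.0) for s in (8.60, 5.43, 2.15)]
rate = accuracy([1, 1, 1], labels)
```

With δ = 0 every image receives a label, so an image at the overall average of 5.43 counts as "high quality" even though it sits only 0.43 above the boundary; a larger δ removes such borderline cases from the learning set.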
It can be seen that the proposed methods are well suited to diverse photos that differ clearly in subject, style and format. The algorithms do not seem to be misled by specific categories of photos. Unlike in other domains of application of AI, it does not seem that modifications too subtle for a human observer to notice can fool the algorithm when it comes to an aesthetic judgment. Studies that aim to transfer learning from one database to another seem to conclude, instead, that this learning is quite robust (Kao et al. 2017b), even though some studies give different results (Kong et al. 2016).

Although today these studies produce results that are quite encouraging, they are not immune from criticism. Thus, it can be seen that the two-class categorization is a problem that is not very exciting intellectually, and that also has no socio-economic interest. Its chief attraction is that it offers an easy way to arbitrate between algorithms. Most studies that use this categorization had initially defined a more ambitious goal: either carrying out a fine, non-binary aesthetic evaluation, or comparing a small set of images to each other, or, finally, revealing information that would explain or justify a score. Let us look at these other objectives and the ways in which they were approached.

– Continuous evaluation: Apart from the ACQUINE system (Datta and Wang 2010), which may be consulted online², few other systems offer a score reflecting the continuous evaluation of the aesthetic as the final result. ACQUINE offers a score out of 10 that is reasonably satisfactory, but which has proven easy to perturb. On systematically testing the same image by varying certain parameters (blurring, noise, color), it is possible to obtain very good scores for parameters that clearly move the image away from the photographer’s initial choices (e.g. by introducing an unacceptable blue filter over the image).
² ACQUINE, an online service to evaluate the quality of a photograph. It can be accessed at: http://acquine.alipr.com/.

³ “Spark” (SPARK, a RenAIssance photographic contest), run by Huawei in July 2018 to promote their P20 Pro cellphone model (see footnote 17 in the Introduction). During this contest, two juries selected the best photo: one made up of expert photographers, the other consisting of an algorithm.

During the photography contest conducted by Huawei³, a continuous score between 0 and 100 expressed the judgment of the photos, presumably produced by the algorithm. As the photos were also sorted and ranked by professional, human experts, it would be very interesting to compare and contrast these two rankings. In section 7.2.4, it was seen that some systems propose predicting the distribution of scores that an AVA jury was likely to award, from which an average evaluation could be derived (Murray and Gordo 2017; Talebi and Milanfar 2017). However, their
results are not very convincing and, consequently, they are more often presented as intermediate steps, rather than as the system’s chief output. Further, this distribution of scores has very few uses. The most convincing use is probably to describe the specific difficulty of delivering an aesthetic judgment on a specific photo, but there is still much to be done to validate and make better use of this path.

– Ranking photos: Comparing images is an oft-stated objective, but it is not very widely studied. This was explicitly the objective of the study presented in Schwarz et al. (2016). The proposed method is original and the preliminary results presented are encouraging, but at the price of a certain complexity of the model that is implemented.

– Justifying the score: The earlier techniques based on classifiers and handcrafted primitives were quite well suited to identifying the essential causes for an aesthetic appraisal, as we could trace, among all the criteria adopted, those criteria that are actually active. Unfortunately, the performances were quite low and this path was not pursued. Methods using DNNs ignored the idea of justifying their results, as their decisions were masked in the depths of many successive layers. Nonetheless, researchers tried to use these techniques to explain their decisions by looking at the layers closest to the output, where it was seen that zones were activated just before the decision was made. This is what several studies have done to a modest extent⁴ (Jin et al. 2016a; Kairanbay et al. 2017; Murray and Gordo 2017). Very different arguments are made by studies that seek to exploit written expressions, like those presented in San Pedro et al. (2012) and Wang et al. (2018). Their results are, however, far from being convincing.
While the methodology used seems reasonably well suited to make best use of the comments associated with the judgment of the photos, the quality of the working database is questionable: are the opinions of the “AVA experts” reliable and formalized enough to support the proposed reasoning?

8.3. Questioning the hypotheses

Going back to the sources of the elements that drove these studies on computational aesthetics, we cannot help but return to the hypotheses that underpin this research. It is regrettable, first of all, that these hypotheses are rarely expressed and discussed. The argument that the superiority of the DNNs is established in recognition tasks and in evaluation in various fields is often used to justify the work carried out in the field of aesthetics using these tools. While this argument is quite sufficient to justify an exploratory study, it cannot be used when we look at performances and their limitations.

⁴ If the location of the active zones is closely examined, there are many situations where these zones are alongside elements that must hold attention, as can be judged in Kairanbay et al. (2017) or in Jin et al. (2016a).
As has already been pointed out, the implicit hypothesis behind all these learning methods using neural networks is that of an objective and universal aesthetic. It is thus clear that, while there are a few exceptions (to which we will return), in most studies the heart of the algorithm is a judgment that is valid for everyone at all times, independent of the observer, their culture, their temperament and their mood and emotion at the moment. This judgment is constructed on knowledge, which is itself a universal appraisal attached to the image and to that image alone, contained within the database that allows the learning to occur. This is precisely what computer scientists call a “data-driven” context, with the programmer trying to be as transparent as possible so that the conclusions drawn can be universal. This bias, in aesthetics, is clearly born out of objectivism, as summarized in section 1.1.

This interpretation therefore goes against many approaches that have been developed in other domains of knowledge related to aesthetics, whether in neurobiology (Chapter 2), psychology (social or experimental psychology) (Chapter 3) or philosophy (Chapter 1). Although subjectivism, a position diametrically opposed to objectivism, is no longer the rule as it once was in philosophy, and despite its many supporters (Zeki 1999; Ishizu and Zeki 2011; Changeux 2016), it is commonly called upon (especially in interactionist interpretations, which have gathered much support these days (Reber et al. 2004; Ingarden 2011)), and the observer, with their states of consciousness and attention, is seen as a major determinant in aesthetic appraisal that “uses” the signal emitted by the object, which is the other determinant of the beauty that is experienced. As we have seen, approaches that use artificial intelligence to look at evaluating the aesthetics of photographs leave limited space for subjectivity today.
Subjective factors often intervene only as a correction, on the margins of a process founded on the universality of the judgment of beauty (Kong et al. 2016; Park et al. 2017; Lv et al. 2018; Zhu et al. 2020); moreover, these approaches take into account only a small number of determinants of subjectivity. They thus often leave out known factors, resulting from education, culture, temperament or context, which are likely to be a significant foundation for the appraisal.

8.4. Specific features of beautiful images detected by a computer

Let us now revisit the mechanisms that are implemented in the most advanced studies today that make massive use of DNNs and let us try to draw some conclusions from the choices made. It is in the annotations in the learning databases that we can find the essence of the “aesthetic expertise” deployed in DNNs and, today, the AVA database is the universal choice for this. Let us recall that it is made up of over 250,000 images taken from the DPChallenge site. Each photo is associated with some metadata, sometimes
summary metadata, and also with ratings given by Internet users in the form of scores and, sometimes, written comments. This database is thus distinguished from photo databases that are oriented toward computer vision, such as ImageNet or ImageClef, and from sites that store images for social purposes (Flickr, Instagram), by the emphasis placed on the aesthetic quality of the artworks, and this is very noticeable in the DPChallenge site from which they are taken.

8.4.1. Some observations on the photos in the AVA database

To the best of our knowledge, there are no studies that describe this database, except the article in which it was introduced (Murray et al. 2012). As was said before, it is important to note that, on an initial analysis, the images it offers are generally of very good technical quality and that they cover a very wide range of themes, quite compatible with what we find in art galleries, museums or other similar collections: portraits, landscapes, still life, crowds, interior or outdoor scenes, etc. It must however be noted that there is a very large majority of color photos, to a greater extent than is found in the repertoire of most artists, even contemporary ones.

It is quite obviously very difficult to qualify the statistical representativeness of the themes, styles and compositions offered here. It would also be difficult to fix an ideal objective for representativeness without clearly establishing the aesthetic framework that we wish to characterize. It can still be noted, nonetheless, that abstract or near-abstract scenes (such as in the works of Man Ray, for example), which constitute an important part of the oeuvre of established photographers, as illustrated in Figure 8.2, are often very under-represented in the AVA. Similarly, we find that scenes that include an animal are overrepresented. Among nature photographs, there is much space given to themes around water, and among landscapes there are a large number of sunsets and night photos.
For scenes of daily life, we see a very small proportion of images that reflect situations of work (workshops, factories, fields, etc.), while images of family life abound. When looked at through a historical perspective of photography, the works represented here are resolutely contemporary, whether in terms of the themes adopted or the manner in which they are treated. Very few images use the aesthetic codes from the early days of photography (simplified backgrounds, high contrasts, linearization of geometry), such as can be found in the work of Atget, Cameron, Kertesz, Carjat and so on, whose aesthetic was often dictated by technical constraints, but which influenced many future artists with their sparse scenes and reduced details. Very few AVA images also adopt themes, frames or codes in their photos that are strongly rooted in the geography of their place of origin: Japanese or Korean photography, African or South American photography⁵.

⁵ When we speak of photographs rooted in their geography, we think, for example, of Africa as shown in the works of Seydou Keïta, Malick Sidibé, Jean Depara, Calvin Dondo, Fatoumata
Figure 8.2. A few examples of “abstract” photographs (or weakly figurative), similar to what can often be found in modern exhibitions: from left to right, and top to bottom, they are the works of Fernell Franco, Andrew Gustar, Jacqueline Hammer, Michael Lønfeldt and Les Cunliffe. Photos of this kind seem to be under-represented in the AVA database, and those which approach this style are frequently given poor scores by DPChallenge
The thematic inspiration for the photos in the AVA database is often drawn from intimate settings, familial or daily life, captured without any distance or arrangement of the scene. Landscapes are also the object of much interest but, conversely, grandiose landscapes more so than the intimate landscapes that the impressionists, for example, were so fond of. Another source of inspiration moves away from nature and is clearly from the field of design: pure, simple shapes, uniform backgrounds, few shadows but lighting effects, hues that are often artificial. It is especially striking to see the amount of digital post-processing, which aims to enhance resolution, expand the color palette, augment contrasts and use chromatic selection, modifications that are often introduced without moderation. Inlaying is also a widely used device, as well as motion blur and the superimposition of images. This is what we call “technophile aesthetics”. This appears to be more pronounced than in the “realistic” images taken by professional photographers and far less refined and subtle than in the works of art photographers who also use them.

Diabaté, Aida Muluneh, etc., all of whom share codes of composition and treatment, which make them immediately identifiable.
We know a little about the photographers who created these images thanks to the AVA-PD (Photographer Demographic) database, which gathers whatever little information is available online on the most regular authors in DPChallenge, as discussed earlier (see section 7.4). In Kairanbay et al. (2019), certain statistical elements were taken from these data. Note that 66% of the photographers are North American, and 15% are European (especially English) (see Figure 8.3). Around two-thirds of them are men, aged between 18 and 85, with a mean age of 50 and a mode close to 40. It seems to us that there are certain lessons to be drawn about the representativeness of the AVA database on this point, but these studies are yet to be conducted.

The study carried out in Murray et al. (2012) to identify each image within a psychological frame of reference (in this case, Russell’s circumplex model of affect⁶) is an excellent avenue that deserves to be explored (using Russell’s model or another model⁷), but must be carried out by multidisciplinary teams.

8.4.2. The scores in the AVA database

While the AVA photographs do not call for any important remarks with respect to their content, apart from what we have already highlighted, we can be a little more critical about the evaluations associated with each photo. What do we know about the Internet users judging these images? Unfortunately, very little. We can only speculate on the practices common to social media and imagine that these are the same Internet users who post images and who judge other people’s images. This would indicate that we might be able to apply, to this population of judges, the characteristics identified for the photographers: age, gender, country of origin. This is a hypothesis that must be further strengthened or contradicted. The evaluations, on the other hand, are known and published.
⁶ Russell’s circumplex model of affect proposes distributing psychological states in an ordered manner over a circumference organized around eight principal affects that are regularly distributed over the circle: pleasure, arousal, excitement, distress, displeasure, depression, torpor and relaxation. Russell then proposes placing all other emotions within this reference frame, with polar coordinates (Russell 1980).

⁷ Psychological states like emotions (emotions are intense and brief psychological states associated with a clear cause) have been the subject of many studies which do not agree on a reliable mapping. The five “primary” states (sadness, anger, joy, disgust and fear) are generally identified as important, but many others are recognized and discussed. The challenge in this research is to determine the “geometry” of the space of emotions and, often, to demonstrate that a particular emotion could be derived from a combination of primaries, as is done in the space of colors (Plutchik 1991; Scherer 2005).

In the basic DNN techniques, they are often posited as primary sources of information that must
reproduce human judgment. Do they fulfill their task? The answer is unfortunately debatable.
Figure 8.3. The origins of photos in the AVA database. Two-thirds of the photographers are from North America and Europe. Their photos were chosen for the AVA-PD database, a sub-base of the AVA supplemented with information on the photographers (from Kairanbay et al. 2019)
It must be noted, first of all, that most studies take the average of the scores awarded to be the “aesthetic truth”. From a simple, statistical point of view, the number of scores (over 200 per photo) would justify this choice if the distribution of scores were strongly unimodal, expressing the magnitude of a quantity that is clearly defined. But it must be observed that this is not the case, probably because there is no consensus on what each score represents. A similar value given to one or another photo is probably not measuring only the aesthetic quality of the photo concerned. This can be noticed by comparing a few images, randomly drawn from a group which was given the same score of 5.43, which corresponds to the average of the scores in the AVA database (see Figure 8.4). Some images (the questionnaire, for example) have no other quality than being able to retain attention through a humorous twist that has nothing to do with beauty. Others (the tank behind the wire fence, the statue of the nursing woman) present glaring flaws in the setting that disqualify them from a technical point of view, but the object in the images is rather original. Two others (the piece of jewelry and the poinsettia) have printed matter overlaid onto the image (deliberately?), which would seem incongruous in a photography contest. Thus, only about half the images are left that are exempt from the early elimination you would encounter in any competition: photos that were probably judged on truly aesthetic criteria.
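Why the average alone is a fragile “truth” can be seen from a short sketch that summarizes the votes received by one photo. It assumes that the votes are available as ten counts, one per score from 1 to 10; the function name `score_statistics` and the two histograms are hypothetical illustrations, not data from the AVA.

```python
import math


def score_statistics(vote_counts):
    """Summarize a vote histogram where vote_counts[i] is the number
    of votes for score i + 1 (scores run from 1 to len(vote_counts))."""
    total = sum(vote_counts)
    if total == 0:
        raise ValueError("image has no votes")
    scores = range(1, len(vote_counts) + 1)
    mean = sum(s * n for s, n in zip(scores, vote_counts)) / total
    variance = sum(n * (s - mean) ** 2
                   for s, n in zip(scores, vote_counts)) / total
    # Most frequent score; ties are resolved in favor of the lowest score.
    mode = max(scores, key=lambda s: vote_counts[s - 1])
    return {"votes": total, "mean": mean,
            "std": math.sqrt(variance), "mode": mode}


# Two invented histograms of 200 votes each (illustrative only).
consensus = score_statistics([0, 0, 0, 0, 100, 100, 0, 0, 0, 0])
polarized = score_statistics([100, 0, 0, 0, 0, 0, 0, 0, 0, 100])
```

Both invented photos have the same mean, yet the first reflects agreement among voters and the second a split between the lowest and highest scores. A system trained on the mean alone receives the same target for both, which is one way of stating the objection raised above.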
Figure 8.4. Twelve photos randomly drawn from the AVA library from among those that received the average score of 5.43, which corresponds to the average of the scores awarded by the Internet users
COMMENT ON FIGURE 8.4.– It can be noted that there is significant heterogeneity among these images, which makes it difficult to qualify them from an aesthetic point of view. One carries a brand name, another has advertising text, a third can only be interpreted by reading the written text, and one or two have clear flaws of lighting. These images, which are questionable, were as well scored as others whose aesthetic qualities were less questionable. The numbers of these photos in the database are as follows: 115,821, 120,486, 131,330, 134,515, 135,292, 169,814, 173,889, 232,238, 1,133, 306,322, 276,462, 132,804.

Let us now examine the 16 photos that got the “best” scores among the 225,000 images in the database (Figure 8.5). These are indisputably all superior to the earlier ones and none of them seems to have obtained their score undeservedly, even though each is distinguished by specific criteria. In this selection, we can note the importance
of what we call the “technophile” criteria, and which are quite representative of the Internet-user community that participated in the contest.
Figure 8.5. The 16 “most beautiful” images in the AVA library, arranged in decreasing order of the value of the average score awarded by Internet users
COMMENT ON FIGURE 8.5.– The best photo, top left, has an average score of 8.60. The next-highest, right next to it, has a score of 8.52. All these 16 photos have an average score that is higher than 8.25. We can observe that there is a wide range of styles and inspirations. Extending this analysis to the next set of photos in the classification, it can be seen that as with these 16 there is a high representation of photos that have been enthusiastically improved using image-processing tools or
through special effects. The source of inspiration for these can be traced to the world of design or advertising (more than half of the above selection). We can also see a high representation of animal photos, as well as landscapes in the night, and, compared to what we find in galleries and museums, there is lower representation of portraits, nudes, architecture and street scenes. The images, presented from left to right, and top to bottom, are numbered: 106, 9,482, 150, 491,369, 55,938, 642,962, 543,104, 267,110, 2,892, 455,890, 54,599, 335,951, 106,707, 455,658, 111,547, 957,982.

We will not present the “worst” AVA photos as most are of no interest (empty or barely shaded field), but a small selection of the photos with the lowest scores (Figure 8.6), which also reflects the diversity of the criteria which could have been used to set aside the photos and the ambiguity of a unique score to characterize the aesthetic. This impression of a high diversity in how the photo is interpreted is reinforced when we examine the scattering of the votes that led to the score. Thus, while all these mediocre photographs were, for the most part, awarded the lowest score (1) by at least some voters, only the photos of the bushes (number 45,563) and the green, abstract photo (number 92,643) did not receive a single 10 (the highest score). All the others were awarded this top score at least three times, which is surprising, to say the least.

The comments posted are also open to criticism. They are often very vague, giving little information, or are merely anecdotal in nature. The majority of them are very positive and, thus, are largely addressed only to those images that the commenters have scored well, which does not mean that they are collectively well scored. We thus have the paradox of positive comments on an image that has received an overall bad score.
As a result, most comments on images with overall poor scores are positive comments, and it becomes difficult to find out what these images lack. As has already been highlighted in section 7.3, the comments can often just be summarized by “I like this!”. They rarely refer to aesthetic criteria, or do so only in very general terms: “beautiful colors”, or “good construction”. Very beautiful photos of nature (sunsets, misty forests, beaches, snowy peaks), flowers or animals are definitely popular, just like portraits of old people or of children, if they are not identity photographs. Many photos are appreciated for being quirky, humorous or serious, rather than for their aesthetic qualities, as testified by their comments. Abstract photos, which cannot be immediately interpreted, are generally poorly scored or are the object of very scattered opinions.

When following the debates around the evaluations on the DPChallenge site, we cannot help being struck by the important role played by the subject of the photo with respect to the form of the photo. The narrative dimension of the photo is undoubtedly an important characteristic in the balance of the evaluation, reinforced both by the title of the photo (which is well-positioned and often very descriptive) and by the title of the contest to which the photo was submitted, this title serving as a reference framework within which to read and interpret the photograph. All this takes
us back to the observations in the introduction, where we examined the non-aesthetic reasons for appreciating an image. It must finally be recalled that there are many biases that skew the voting between Internet users, and which have been well analyzed for other applications (Cochoy 2011; Reagle 2013; Pasquier et al. 2014). These have been detailed in section 5.4: inexperience, negligence, bias, favoritism, manipulation, etc.

It is thus seen that what is used to train a DNN is a very particular kind of “truth”, a sort of statistical beauty that reflects a reasonable consensus among a small number of self-proclaimed “judges”, reasonably well qualified, with a task that is poorly defined and not universally understood. While it is possible to support this “truth” during an initial process to sort the photos, it is harder to agree that it is an inherent part of attempts to establish algorithmic canons of photography aesthetics.
Figure 8.6. Five photos among the lowest scored images in the AVA database (their scores do not exceed 2.15). The numbers of these photos are: 11,220, 45,563, 92,663, 142,717, 221,721
Thus, if DNNs function well (and there is every indication that, in the near future, they will), they can judge photos, with great reliability (95%? 98%?), to be “beautiful”, that is, photos that conform to the average tastes of a small group of aficionados, bound together by a common passion for photography. And thus, we return to Bourdieu’s world, rather than Arnheim’s or Berlyne’s world.
Conclusion
It is inevitable that in its general effort to objectivize, Science sees a human being as a physical system in the presence of stimuli, which are themselves defined by their physico-chemical properties, and strives to use this as the basis to reconstruct the effective perception and to close the cycle of scientific knowledge by revealing the laws according to which knowledge itself is produced, by creating an objective science of subjectivity. However, it is also inevitable that this attempt will fail.

Maurice MERLEAU-PONTY (1945, p. 33)
How can we approach the relationship between aesthetics and photography today, in the early 21st century, even as photography itself is undergoing a new cycle, one of the most active periods in its short history, due to changes in camera technology and how images are shared? Let us observe right away that the photographic image is an aesthetic object in its own right and that unlike Bourdieu, 50 years ago, nobody today can deny it a place among the other arts: painting, sculpture, music, etc. By various criteria, photography has not only become an art, but is also one of the most dynamic and widely respected forms of art (Fried 2008; Cotton 2009; Hariman and Lucaites 2016). Let us also note that much more than other modalities of art, photography lends itself to modern techniques of investigation. One reason for this is that it is easy to store and manipulate on computers, and another is that within a digital society it occupies a central position that makes it very easy to access large numbers of photographs and in forms that can be directly integrated into information processing procedures. We presented an analysis of aesthetics in photography across multiple registers in the hope of tying together all these fields into one conclusion. Unfortunately, this is not
the case, as there are very few bridges between these various streams of knowledge, with each deliberately hiding behind its expertise (this point has often been deplored; see, for example, Gombrich (2000); Ramachandran and Freeman (2001); Bullot and Reber (2013)). We must, therefore, be content with looking at each of these fields to take stock of the progress in the relationship between photography and aesthetics. In the field of philosophy, for the past 50 years at least, aesthetics has had to contend with an overhauling of the very concept of art (Arnheim 1986; Danto 1992). This upheaval led to the dominance of artwork where “aesthetic” was at best a secondary concept, if not completely alien. This was also accompanied by considerable confusion in Western thinking on the subject (Danto and Goehr 2014). In parallel to this was the irruption of “new” arts (primitive, naive, raw, aboriginal, etc.) and then the opening up of the international art market and worldwide exposure to highly diverse traditions and cultures, which sparked off reflections on how beauty was contingent on different factors and then how society heavily influenced how beauty was defined (Wascheck 2000; Brown and Dissanayake 2009). However, this inclusion of a cultural influence did not lead to the concept of beauty becoming more malleable. This state of things applied to images just as much as to painting or sculpture, and photography is also subject to it, although it would seem to be somewhat less exposed to this concept given its mode of operation (capturing the external world) and the fact that the creator of the image was accorded less importance. However, even as art disengaged to a great extent from the idea of beauty, beauty itself acquired a singular philosophical importance, as an autonomous experience. The relationship between the object and the observer remained terribly mysterious. 
Nobody denies that the object has its own, inherent properties and nobody denies the observer’s role in ratifying the beauty. What about intuition? What role do idiosyncrasies play? What about reasoning? And concepts? Is beauty itself an emotion, or geometry tied to light? Philosophy has been able to describe the Kantian project, but is still trying to articulate it with the Platonic ideal. In neurobiology, the field of neuroaesthetics is still trying to find convincing answers as to the chronology of the mechanisms of aesthetic appreciation and, consequently, trying to identify the causality in the chain between conscious activities (Zeki 1999). While a small number of hypotheses have been set aside (such as those proposing that this processing is carried out by dedicated areas in a possible hedonic cortex), while some certainties are now established (like the naturalization of sites involved in aesthetic appreciation), and while there are certain new avenues that must be confirmed (such as the role of an external, exteroceptive pathway, and an internal, interoceptive pathway), many significant uncertainties remain. For instance, the selection of the upward flows of the visual cortex, and the conditions for engaging the areas in charge of memory, reasoning or emotion, or knowledge related to the arbitration between the various currents that these produce and their
composition. The overall interpretation remains largely speculative (Changeux 2016). There are cruel reminders of the need for better tools for functional introspection (like the diffusion MRI, which shows connections between active areas and makes it possible to quantify the flows). Similarly, it would be essential to have advances in rapid tools (denser and deeper magnetic or electronic imaging, for example) that make it possible to connect the signals resulting from the visual pathways with activation in the prefrontal region or the limbic cortex. This would allow researchers to support convincing hypotheses regarding the mechanisms of aesthetic evaluation. The experimental protocols used in experimental psychophysiology – which is very well suited for many studies on the visual apparatus, perception and the interpretation of simple forms – are far too simplified at present to answer the host of questions posed by practitioners and theorists in the field of photography when confronting the concept of beauty. The theoretical frameworks developed by Gestalt theory (Wertheimer 1938) are barely seen in recommendations that are couched in the most general semantic terms (“point of interest”, “principal subject”, “central figure”, etc.), and in the overly geometric concepts (“rule of thirds”, “center of fixation”, etc.). Experimental data in the form of a posteriori verification of these recommendations (wherever they have been formulated) through statistical analysis carried out after processing the images would be very helpful. However, research today lacks universally recognized protocols and a corpus of shared and accepted data, and also suffers from a limited ability to produce convincing results. Sociological approaches are essential for taking into account the role of education, context, the influence of society and culture.
However, current protocols find it very difficult, under normal observation conditions, to isolate factors related to the individual, their temperament, and their mood. Finally, there are the newcomers to studies on the aesthetics of photographs: methods based on machine learning and Internet databases. These have established themselves in the field in an iconoclastic and exaggeratedly accultured manner. They offer rich potential for development in the market of new technologies and have a carefully cultivated reputation as being the ultimate progress in “artificial intelligence”, which has some difficulty shedding its image of being socioeconomic propaganda. These approaches are still far from being convincing, given the volume of resources they have been able to mobilize. Current, preliminary research is “promising” at best (which indicates, in the scientific world, that it must still prove itself), and disappointing at worst. It would be more accurate to say that they offer tempting but intriguing results that researchers do not know how to interpret, or exactly how to use as they are. The lines of research carried out at present seem to converge on a common point. Imprecise as it is, it must be retained if we wish to arrive at a consensus. A person may find a photograph beautiful for very personal reasons
that do not really involve aesthetic criteria: a photo of a loved one1, a well-loved space, a memory of an intense moment and so on. This kind of photo is beautiful only for that individual or a few select people, and it is beautiful only because it reminds them of a person, a place or an incident which is beautiful, touching or evokes strong memories. An unconnected viewer would not find the image beautiful, as it has no power to evoke, no way of recreating for them the affective presence of an object from the past, and it will only be viewed on its own merits. If that memory is unrelated to us, the image has no history, is not charged with memories and we can legitimately say that it is not a beautiful photo (which does not imply, of course, that we find it ugly). A photo is aesthetically beautiful when multiple viewers who are not involved in the story the image recounts are persuaded of its beauty. This photo is beautiful if it has a set of aesthetic qualities, the very ones described in the manuals, a mix of qualities just as we may have a mix of virtues to be judged good, or a mix of traits to be judged intelligent. No specific quality is necessary, no specific quality is sufficient and no specific combination is essential. In the particular bond that connects the photo and the observer, these qualities are the photo’s contribution to the beauty that is experienced, what we call the photo’s “objective” contribution. Another contribution comes from the observer, who is sensitive to this quality or this association of qualities. In order to do this, the observer engages a panoply of individual qualities that are a result of their personality, their mood, their culture, their education and the context in which they observe the photo. They thus bring in what we have called the “subjectivity” of beauty.
On the one hand, there is the harmony of forms in the photo, the balance of colors, the richness of textures, the focus, the subtlety of the gradients, the firmness of construction, etc. On the other hand, there is a mind that may be dreamy, feisty or nonchalant, there are states of mind that are epic, serene or tormented, there is impatience or indolence, there are minds sharpened by familiarity with museums, by frequenting vast plains, deserts or the coast, by a rigorous or lax education, by intense or passive pasts, by evenings spent reading, or afternoons spent tinkering, playing team sports, attending political meetings, etc. Through what associations do the properties of a photograph combine with the individual’s unique qualities in order to produce the pleasure experienced, with no vested interest, a pleasure that cannot be shared and is difficult to explain: what we call “beauty”? This remains an enigma at the end of this book. We have been able to narrow down what qualities of the image could be involved, we described algorithms that could learn this information and then find it again in new images, and we vaguely know how to incline a machine learning judgment on beauty toward one or another social group. Nevertheless, we have only some rather mediocre prospects, today, of developing an algorithm that could substitute for individual human judgment with respect to aesthetic tastes.
1 A beautiful photo of this kind has been described at length by Roland Barthes, who wrote about how he found his mother’s face in a photo of a walk in the Jardin des Plantes (Barthes 1980). The reader is soon convinced that it is his mother who inspires these lines, the photo chiefly serving the purpose of evoking her memory, which other photographs could not have done. In this moment, Barthes recreates the Proustian experience: the photo replaces the madeleine and vision replaces taste.
Appendix 1 A Brief Review of Aesthetics
“At any time, and for an infinite number of subjects, what some call Beautiful others refuse to acknowledge as such; thus, it seems we must conclude from this either that humans do not have the same idea of Beauty, or that their Senses resemble each other much less than was believed, making them perceive objects very differently.”
Jean-Pierre DE CROUSAZ (1985 [1715–1754])
To complement Chapter 1, let us look at how Beauty was regarded from Antiquity onwards, to see how various conceptions of Beauty were first classified as “objectivist” and then progressively included subjective concepts. In Panofsky (1989) and Tatarkiewicz (1970), we can find detailed descriptions of the change summarized here.
A1.1. Aesthetics in the ancient world: objectivism ruled supreme
A1.1.1. Aesthetics in the classical Greek period: Hellenic Greece
The classical Athenian period (fifth and fourth centuries BCE1) carried all the earliest references to aesthetics. In this period, Beauty was endowed with virtues that were aesthetic and moral. It was the product of three sources: “the material, provided by nature, knowledge, the fruit of tradition, and the work, given by the artisan”2.
1 BCE: Before Common Era.
2 The Greeks considered the plastic arts (painting, architecture, sculpture, etc.) to be the work of artisans and not of artists, while oral expression, endowed with the quasi-divinatory power of presaging the future, was the work of an artist (Tatarkiewicz 1970, book 1).
The concept of creation was almost completely absent here, and faithfully following tradition was dominant. Tradition was expressed by the “canons”, which fixed precise shapes for the object (Figure A1.1). Canons were most often inspired by distant religious and liturgical sources, but some seemed to have purely aesthetic foundations.
Figure A1.1. There are no longer paintings available that date back to this period, called the Hellenic period, and we must deduce the canons that governed painted artwork based on statues that do exist. Greek artists opened up art to the natural representation of the human body, abandoning the rigid schemes that had prevailed in earlier epochs (Egyptian art, Cycladic, Archaic Greek, etc.)
COMMENT ON FIGURE A1.1.– (a) Torso of a man, a Roman copy, in marble, of a bronze statue from the fifth century BCE. (b) The birth of Aphrodite. (c) Ephebe combing his hair (Diadoumenos). All three are from the Athens Archaeological Museum.
Canons from architecture and statuary are quite well known to us, as many examples of these works are still available. They were largely taken up, with commentary, by Vitruvius3, in Rome, in the first century BCE. There are far fewer remnants of this for painting. The rules are largely simplicity4, order5, equality6 and symmetry, especially through numbers, ratios between quantities that express proportions and reflect harmony7. Simplicity aims to guarantee the “appropriate measure”, that is, using only those elements that are necessary for expression. The simpler the ratios, the greater the aesthetic effect. Aristotle introduced an additional concept, which summarizes all the above properties: perceptibility. This states that the beautiful object’s qualities must appear naturally and spontaneously by adapting the size of the object to the observer’s field of vision and eliminating all superfluous artifice. Further, Aristotle maintained that beauty impressed itself on the mind without any justification, that is, was not a property born out of the senses or judgment8.
3 Marcus Vitruvius Pollio, called Vitruvius, was a Roman architect in the first century BCE. He laid out the rules of aesthetics through six laws: (1) order (ταξις, taxis); (2) arrangement (διαθεσις, diathesis); (3) art of movement or eurythmics (ευρυθμος, eurhythmos); (4) symmetry (συμμετρια, symmetria); (5) propriety; (6) economy (οικονομια, oikonomia).
Aesthetics makes use of mathematics, but it is not abstract in essence. It finds its sources in nature and in the universe, especially in the human body, which is a part of the universe and reflects its harmony. Today, we can say that this harmony is fractal as it concerns proportions on all scales: the bust to the whole body, the hand to the arm, the fingers to the hand, the nose to the face and so on9. The preferred ratios of this kind are 1/2 or 1/3, or those constructed using a “perfect” right triangle, with the sides (3, 4, 5), from which we derive the golden number: $\frac{1+\sqrt{5}}{2} = 1.618$, or the approximate $8/5 = 1.6$. However, these proportions end up being part of ratios that are quite numerous, since in architecture, for example, the ratios must also conform to mechanical requirements to ensure a proper load distribution.
In the domain of colors, harmony can be found in the composition of the four primaries: “there are four colors, corresponding to the number of elements: white, black, red and yellow”, said Empedocles10, agreeing with Democritus.
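As a side note on the arithmetic behind the golden number quoted above, the following lines check the value 1.618 and the quality of the approximation 8/5. Reading 8/5 as a ratio of two consecutive Fibonacci numbers (5 and 8) is an assumption added here for illustration; the text does not make that connection.

```latex
\begin{align*}
  % the golden number is the positive root of x^2 = x + 1
  \varphi &= \frac{1+\sqrt{5}}{2}, & \varphi^{2} &= \varphi + 1 \\
  % numerical value, using \sqrt{5} \approx 2.2361
  \varphi &\approx \frac{1 + 2.2361}{2} \approx 1.618 \\
  % gap between the golden number and the approximation 8/5
  \varphi - \frac{8}{5} &\approx 0.018 & &\text{(about } 1.1\% \text{ of } \varphi\text{)}
\end{align*}
```

The gap of roughly one percent suggests why the simple ratio 8/5 could stand in for the irrational value in practice.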
4 “If someone exceeds measure, even the pleasantest of things become unpleasant”, Democritus, quoted by J. Stobaeus, Flor, III, 17, 38, frg. B 23 Diels, cited in Tatarkiewicz (1970).
5 “Order and proportion are beautiful and useful, while disorder and the absence of proportion are ugly and useless”, the Pythagorean school, quoted by J. Stobaeus IV I, 40 H. frg. D 4 Diels, cited in Tatarkiewicz (1970).
6 “In all things, equality is appropriate, excess and flaws are not, in my opinion”, Democritus, Democrates, Sent. 68, frg. B 102 Diels, cited in Tatarkiewicz (1970).
7 This Pythagorean vision of beauty comes back in Christian texts written at the beginning of our era: “Omnia in mensura et numero et pondere disposuisti” (You have ordered all things in measure, number and weight), Livre de la Sagesse (Book of Wisdom), XI, 21.
8 Diogenes supported this concept of Beauty being a universal a priori. He stated that only a blind person may question why time is wasted on the concept of beauty.
9 Since it concerns ratios in the human body, this aesthetic has been called “anthropometric” by Panofsky (1983).
10 Empedocles, quoted by Aëtius, Plac. I, 15, 3; frg A 92 Diels, cited in Tatarkiewicz (1970).
While the canons are additional constraints, small deviations from the canons are tolerated and even appreciated by some people11. Their objective is to get the observer to perceive the rules of aesthetics better. Consequently, it was noted that parallel columns in a raised temple give the impression that they diverge. Thus, it was recommended that these columns be slightly convergent, such that an observer at the center would perceive them to be parallel. Vitruvius reported that Democritus used such tricks to convey relief effects in his paintings.
A1.1.2. Hellenistic Greece and Rome
In the period following Alexander’s conquests, Classical Greece gave way to Hellenistic Greece and the Roman world. Arts and the concept of beauty both adhered to the classical tradition. Although the social context was completely overturned by the emergence of large empires, borders opening up to distant, highly developed cultures, and the appearance of new political regimes, the guidelines laid down by Plato and Aristotle remained central to the aesthetic doxa. Nonetheless, three philosophical currents emerged, each of which treated aesthetics differently:
– the Hedonists (Epicurus) associated art and pleasure and, symmetrically, pleasure and art. There was no autonomy of art and no rules of aesthetics;
– the Skeptics (Pyrrho) highlighted disagreements and contradictions between judgments on beauty and art. Consequently, theorizing on art was impossible, useless and dangerous. They refuted the Platonic arguments about the universality of art;
– the Stoics (Zeno, Seneca, Cicero) stated that beauty was subordinate to moral values. They distinguished between moral beauty and a beauty of the senses. The latter generally held less value. They agreed with Platonic criteria and echoed them.
In practice, the rules that were most often followed were Platonic and Peripatetic.
The Stoic influence was generally seen in a greater flexibility when it came to implementing the canons that governed the outlines of artwork, while the details followed a decorum12, which may be interpreted as a local and partial application of the canon specific to each represented object.
11 Although not by Plato, who recommended the greatest adherence to tradition: “To put it briefly, then, said I, it is to this that the overseers of our state must cleave and be watchful against [...] innovations in music and gymnastics counter to the established order...”, Plato, The Republic, 424 B, cited in Tatarkiewicz (1970).
12 “Ut enim in vita, sic in oratione nihil est difficilius quam quid deceat videre. Πρέπον appellant hoc Graeci; nos dicamus sane decorum. De quo praeclare et multa praecipiuntur et res est cognitione dignissima” (“In oratory, as in life, there is nothing more difficult than recognizing what is appropriate. The Greeks call this ‘prepon’, while we say ‘decorum’. There are many and excellent rules that deserve to be known”), Cicero, Orator, 21, 70, cited in Tatarkiewicz (1970).
A1.1.3. The Early Medieval Age: Western Europe and Byzantium
Revisited by the emergent Christian world, Platonic aesthetics shifted beauty from nature to the transcendent world (Plotinus, Pseudo-Dionysius the Areopagite). It was the divine world that was endowed with all the attributes of beauty and it was in representing this world that the artist followed recommendations of harmony, balance, simplicity, etc. This was the neo-Platonic period. Beauty gained a new virtue from this divine dimension. While it once impressed itself passively upon the observer, now it was radiant and participated in the diffusion of speech. The concept of light thus occupied a special place with that which accompanied brilliance. On the other hand, however, it lost some of its sensual nature.
Further East, in the other half of the Roman Empire (which had begun to separate from the West), a new figure emerged in paintings: in Byzantium, the ‘Icon’ occupied a central position. Only the image of God or his saints was worthy of effort. Painting reproduced a world that was formally beautiful and shorn of all earthly woes, expressed through a facial expression. Codes (transposition of canons) were established: the eye in the central position, elongated figures, artificial colors, a frame shorn of all material context or reduced to a few symbolic fragments of vegetation, if not idealized by a golden disk.
Finally, in North Africa, Augustine, the bishop of Hippo, used the Platonic texts reviewed by Plotinus to lay the foundations of what would become the aesthetics of the West until the Renaissance: beauty is objective and independent of us; it is created by the harmony of the parts, their order and their unity, and this must be regulated by numbers: “In all things, the greater the measure, the form, the number, the better it is; and the lower the measure, the form or number, the less good it is”13. This culminated in divine beauty, which surpasses and inspires all other beauty.
It was also probably Augustine who introduced the idea that beauty had no negative valence (the opposite of beauty) but there was simply an absence of beauty which we called ugliness: “No nature has evil in it, simply a deficit of goodness”14. It appears that he found it difficult to state that ugliness could be created in a world created by God. This was a point that was the subject of debate.
13 “Omnia enim quanto magis moderata, speciosa, ordinata sunt, tanto magis utique bona sunt; quanto autem minus moderata, minus speciosa, minus ordinata sunt, minus bona sunt”, Augustine, On the Origin of Good, 3, PL, 42, c.554, cited in Tatarkiewicz (1970).
14 “Cuique naturae non est malum nisi minui bono”, Augustine, On the Origin of Good, 14-17, PL, 42, c, 555-6, cited in Tatarkiewicz (1970).
A1.1.4. The later Medieval Age
From 1000 CE onwards, it was religious architecture that manifested a resurgence in a vigorous expression of aesthetics: first through the Romanesque style and then the Gothic style. This was art created by artisans, certainly, but also by engineers. It was thus the art of Numbers in the Pythagorean tradition, which also followed the tradition of teaching and secrecy in workshops and guilds. The rules of proportion, harmony and rhythm were retained and transmitted, but their source was generally omitted: “The unity of dimension, established on the principles of equality, similarity, appropriate arrangement, adaptation and the commensuration of parts, is not the smallest factor of beauty”15 as Baldwin of Canterbury said. In both social and collective wisdom, numbers were now important in themselves as a prerequisite for beauty, rather than being important as ratios in the human body (as originally postulated in Platonic texts). With the aid of technological prowess, light was brought into edifices: the arts took their inspiration from the firmament. R. Grosseteste best brought together the aesthetics of numbers and geometry (“Composition and harmony in anything that is created follows only from the five proportions that we find between four numbers: one, two, three, four”) with that of light (“Light makes things beautiful and shows their beauty to the highest degree”)16, using an approach that aimed to present a unified, mathematical view of beauty.
Alhazen17 introduced a new distinction into aesthetics. A faithful proponent of Classical and Hellenistic Greek writings, he also worked in optics and was especially interested in how images were formed. He thus distinguished between the image received by the eye and the perception that we have of this image. His message spread through Europe quite slowly and it was several centuries before it was revisited by a new school of thought (E. Vitellion, R. Bacon). He defended the idea that the eye only directly perceived color and light. The other components of aesthetics (shape, composition, arrangement) are only appreciated through other actions of perception, namely, memory and reflection.
It was this new and singularly modern conception that formed the basis of aesthetic perception, which could distinguish a simple beauty that was immediately perceived by the eye, and then a more complex beauty that addressed forms that were themselves more complex18. While distinguishing between two types of beauty is conventional in analytical thinking, it is founded on principles of transcendental philosophy. Here, on the other hand, the distinction is based on psychophysiological mechanisms as yet unknown but hypothesized. There are only two properties that contribute to the first type of beauty: light and color. However, Alhazen and Vitellion listed almost 20 properties that contributed to the second (Tatarkiewicz 1970, II, p. 266). Unfortunately, these properties are far from providing operational rules for beauty. Finally, it must be noted that these philosophers remained functionally objective in their conception of beauty. While the observer was taken into account, it was only to understand the mechanisms that made them sensitive to the beauty that existed independent of the individual.
W. Ockham did not study aesthetics very much, but focused on the nature of the image and identified an original dimension, which goes beyond the object that is represented. This is the artist’s intention: “(a) In the strictest sense, the image is a substance formed by the artist in the likeness of another substance. In this conception, it is an essential feature of the image that it arises as imitation of what it is an image of [...] (b) In another way, the image is conceived as anything formed by the maker, regardless of whether it is in imitation of another thing or not”19. This was also a novel concept that would become particularly pertinent in the aesthetics of photography, where the object, which is fixed and the same for all, may be “transcended” into an object of art by the photographer’s eye. Subjectivity was thus introduced, not through the observer, where we might naturally expect it, but through the creator. These ideas would be largely adopted by Hegel.
15 “Parilitas autem dimensionis secundum aequalitatem, similitudinem, compositionem et modificatam et commensuratam congruentiam artium non minima pars pulchritudinis est”, Baldwin, Bishop of Canterbury, Treatise on the Salutation of the Angel, PL 204, c, 469, cited in Tatarkiewicz (1970).
16 “Solae quinque proportiones repertae his quattuor numeris: unum, duo, tris, quattuor, aptantur compositioni et concordiae stabilienti omne compositum”, Grosseteste, On Light (Baur 58) and “lux est maxime pulchrificativa et pulchritudinis manifestiva”, Grosseteste, Comm. in Div. Nom., IV (Pouillon 320), cited in Tatarkiewicz (1970).
17 Alhazen, an Arab mathematician and philosopher, who worked at the turn of the millennium (1000 CE). His full name was Abu Ali al-Hasan ibn al-Hasan ibn al-Haytham.
18 “Omnis vera comprehensio formarum visibilium aut est per solam intuitionem, aut per intuitionem cum scientia praecedente” (“All apprehension of visible forms is based either on mere gazing, or on gazing combined with previous knowledge...”), Vitellion, Optics, III, 62 (Alhazen, II, 69), cited in Tatarkiewicz (1970).
19 “(a) Strictissime et sic imago est substantia formata ab artifice ad similitudinem alterius ... Sic accipiendo imaginem de ratione imaginis est, quod fiat ad imitationem illius, cuius est imago; [...]. (b) Alio modo accipitur imago pro tali formato, sive fiat ad imitationem alterius, sive non”, W. Ockham, Questions in IV, Sententiarum libros, I, d. 3, q. 10B (Baudry), cited in Tatarkiewicz (1970).
A1.2. The Renaissance
It was the Renaissance that led to classical Greek thought being revisited and then reformulated in modern times and spreading through Europe. This was partly the reason behind its name20. This movement grew through the 14th and 15th centuries in the northern part of the Italian peninsula, especially in Florence, Venice and then Rome. The Platonic legacy was the starting point for aesthetic thought, supported by the treatises written by Plotinus and Vitruvius, in the field of architecture, and complemented by concepts taken from Aristotle.
20 “Renaissance” (or rebirth) in two senses: the first referred to the world of the new human, liberated from the mental structures of the Medieval Age; the second was restoring Classical Greek philosophy in society (see Tatarkiewicz (1970), III, p. 32).
Marsilio Ficino, in Florence, made himself the herald of Platonic Beauty as revisited by the Renaissance mind. The harmony, symmetry and balance in the cosmos created by God were reflected in the work of the artist and gave rise to Beauty and to Love: “The Beauty of the body does not lie in the shadows of matter, but in the clarity and grace of the form [...] in the appropriate number and measure”21, and Leon Battista Alberti reinforced this, saying: “Beauty is a certain agreement and, if one may say so, a conspiracy of parts in the whole, where they are established according to a definite number, qualitative order and place as required by harmony, the absolute and primordial principle in nature”22. They inspired Da Vinci, Raphael, and Michelangelo, among many others.
Figure A1.2. Evolution of representations from the Renaissance to Classicism. For a color version of this figure, see www.iste.co.uk/maitre/aesthetics.zip
COMMENT ON FIGURE A1.2.– The Renaissance revitalized the canons of beauty from Antiquity to create majestic religious figures, adorning them with symbols inherited from the Medieval Ages and the Byzantine age (here, the aureoles): (a) The Madonna and Child, with St. Dominic and St. Peter, by Fra Angelico, in 1435, from the Bode museum, Berlin. Later, artists returned to depicting nature alone, shorn of any symbolism: (b) detail from Mary Magdalene in Ecstasy, by Caravaggio, in 1606, private collection. This aesthetic was frozen, in the 18th and 19th centuries, in academic classicism, which led to the “Art Pompier”: (c) The Birth of Painting, Eduard Daege, 1832, Alte Nationalgalerie, Berlin.
In the Renaissance, aesthetics moved into many other domains in daily life (goldsmithing, marquetry work, weaving, etc.). The role of the sciences was reinforced to support the concept of form in the visual arts. Nature remained the primary model for beauty and the artist strove to approach it. Their contribution is often in the subtlety and pertinence of their deviations in this reproduction. The concept of an overall harmony governs the composition of pictorial artwork23. Beauty is universal and everyone must agree: “But there are those who do not acknowledge this and say that there are diverse views concerning beauty and architecture, and that the form of edifices evolves based on each person’s tastes and preferences, without being governed by the commandments of art. This is a common error by ignorant people, who say that that which they do not know does not exist”24, as Alberti said. The Renaissance was also the period during which the artist was truly recognized as being equal to a creator. As Leonardo da Vinci said: “The divinity which is the science of painting transmutes the painter’s mind into a resemblance of the divine mind”25. This was also the period in which it was recognized that the artist’s objective is indeed the pleasure that they derive from their creation, and from sharing this with the viewers.
21 Marsilio Ficino, Letter to A. Canisano, cited by Chastel (1996).
22 L.B. Alberti, De re aedificatoria, cited in Chastel (1996).
A1.3. The modern world: from objectivism to subjectivism
The 17th century dawned on a world that was considerably changed. Scottish and English philosophers with a keen interest in the scientific and experimental method proposed models that were reinforced by descriptions of perceptual and cognitive functions. The physicists Copernicus, Kepler and Galileo had just overturned the Ptolemaic model of the universe and, in doing so, had raised some doubts about the Platonic legacy. However, it was Descartes who took on the role of establishing a resolutely new model that opened the door onto the Enlightenment.
23 “La belleza non consiste solo nelli colori, ma è una qualità que resulta dalla proporzione e corrispondenzia delli membri e delle altre parti del corpo; tu non dirai che una donna sia bella per avere uno bello naso o belle mani, ma quando vi sono tutte le proporzioni” (“Beauty does not consist of colors, but a quality that results from proportion and correspondence between body parts. You would not say a woman is beautiful because she has a beautiful nose or beautiful hands, but only if everything is in proportion”), J. Savonarole, Sermon on Ezekiel, XXVIII, cited in Tatarkiewicz (1970).
24 “Ma e’ci sono alcuni che non appruovano simili cose et che dicono che ella è una certa varia oppenione, con la quale noi facciamo giudicio della bellezza, et di tutte le muraglie, e che la forma degli edificij se muta secundo il diletto, e il piacere di ciascuno, non si ristrignendo dentro ad alcuni comandamenti de la arte. Comune diffeto de gli Ignoranti, é il dire che quelle cose che non sanno loro, non sieno”, L.B. Alberti, Della pittura, Lib. II, cited in Tatarkiewicz (1970).
25 “La deità, ch’a la scientia del pittore fa che la mente del pittore si transmuta in una similitudine di mente divina”, Leonardo da Vinci, Essay on Painting, frag. 280, cited in Tatarkiewicz (1970).
A1.3.1. René Descartes
Descartes did not completely break away from Platonic philosophy. He preserved most of its outline: a universe built around mathematical concepts, a rational universe that is capable of being understood, reflecting truths that are universal and temporally stable; a universe where physics and metaphysics were closely bound. However, unlike the Ancient Greeks, he considered that this world was entirely accessible to rational understanding. While the “Creator” was given a place, it was that of the “initial impulse” that gave rise to the world. But the world then followed its course based on this initial motion and the Creator, in their omnipotence, became indifferent to this fate and did not bother to modify the course, leaving the enlightened human at leisure to explore its mysteries26. Descartes did not write very much on art and aesthetics, which he considered occupations of the salons. But while, like Plato, he considered that Beauty and Good are universal and timeless, he did not believe them to be properties of the universe, which were innate in humans, but instead considered them to be individual experiences unique to the thinking subject. Thus, Beauty, experienced by all, is not a property of the universal soul that bathes creation, but a lived experience of the individual being, and aesthetic judgment is the result of rational deduction. Descartes believed that the perceived world is governed by the laws of physics. We access it through vision and perception, which create a sensible world. This sensible world is distinct from the rational world that our mind processes. Descartes’ aesthetics and the extensions born from it have been the subject of lengthy commentary in Krantz (2016). For the next two centuries, this distinction of “sensible world versus rational world” would be the site of much contention between philosophers.
A1.3.2. Jean-Pierre de Crousaz
Among the most developed objectivist works were the writings of Jean-Pierre de Crousaz, a Swiss philosopher who penned a Traité du Beau (Essay on Beauty, de Crousaz 1985), which goes back to the Cartesian dichotomy and transfers its properties to the field of aesthetics: Beauty has two natures, one created by ideas (and thus, the Platonic world) and the other by feelings27. Syliane Malinowski-Charles said, about this double heredity attributed to Beauty: “The step taken by de Crousaz is significant and may be seen as emblematic of the major
26 “God did indeed create everything, but absented himself from his creation. A complete indifference in God is an important proof of his omnipotence”, Descartes, Principles of Philosophy, II, cited in Dézarnaud-Dandine and Sevin (2007).
27 De Crousaz wrote: “When we ask what is Beauty, we do not claim to speak of an object that exists outside us and separate from any other, as when we ask what is a horse or what is a tree”, in de Crousaz (1985).
transition between the rationalist paradigm (which can also be called objectivist) and the empirical or subjectivist paradigm” (Malinowski-Charles 2004). De Crousaz, who explored the concept of aesthetics in greater depth, arrived at criteria that were very close to those that Birkhoff would frame in his equation (see Chapter 4), intimately combining complexity and simplicity28. De Crousaz’s “transitional” ideas were revisited, in France, by Jean-Baptiste Du Bos and Father André29 and, in Ireland, by Francis Hutcheson (Jullien 2017).
A1.3.3. William Hogarth
After de Crousaz, William Hogarth, a famous painter of his time, wrote an Analysis of Beauty, Written with the View of Fixing the Fluctuating Ideas of Taste in 1753. In this, he clearly subscribed to an objectivist view of aesthetics. Thus, he proposed a “beauty curve” (Figure A1.3) which, in his eyes, reflects aesthetic excellence. Furthermore, he listed criteria for beauty, going far beyond those listed in Platonic aesthetics (simplicity, regularity, symmetry, harmony). Hogarth identified 10 criteria for beauty in painting30; he added two more to these, specifically for the beauty of the human body, and two for movement, which were meant to qualify the beauty of dance. It must be noted that the first criterion, “fitness”, describes the functional adaptation of the object to its use, a criterion that has little in common with the Platonic criteria; it was taken up by other Empiricists such as Hume and would be widely echoed in the 20th century with the school of functional aesthetics. Another criterion, “intricacy”, looked at the object’s ability to defy immediate understanding, a sort of trigger for reflection, which would be echoed by philosophers in the following century.
A1.3.4. David Hume
David Hume expressed his thoughts on aesthetics in several essays between 1740 and 1760. He adopted a clearly subjectivist and experimental position.
Beauty was an experience of the mind, like Justice, resulting from our perception and generally the 28 “Beauty is created by unity within diversity”, in de Crousaz (1985), taking inspiration from the beauty of mathematical shapes: triangle, square, polygon etc. 29 “I saw that there is an essential beauty, & independent of any institution, even divine: that there is a natural beauty, & independent of the opinion of men: finally, that there is a type of beauty, of human institution, & which is arbitrary up to a certain point”, in André (1759), p. 4. 30 Hogarth’s criteria were as follows: intricacy, variety, uniformity, simplicity, quantity (this corresponded, more or less, to harmony of the whole and its parts), fitness (which corresponded to the agreement between form and function), as well as four qualities related to lines, which allowed him to define the serpentine line as being ideal.
result of a positive association between the object and its function, or even between the animal and its nature: a horse’s legs are beautiful if their form is suited to the functions of transport and speed. Transmitted to our mind through perception, the form spontaneously evokes our sympathy, that is, a positive and disinterested resonance, which takes place upstream of our reflection. Pleasing forms are created from rules that Hume does not make explicit, but he recognizes that there is a certain universality to them, and a certain destination which calls to mind certain Greek thinkers: “Certain forms or specific qualities, because of the original structure of Man’s internal constitution, are calculated to please, and others to displease”.31 The receptacle of these impressions (the hedonic brain?), which is human nature, is fundamentally the same in all humans and in all eras.
Figure A1.3. The serpentine line appears on the original cover of Hogarth’s Analysis of Beauty. It represents, to the author, the peak of graphic aesthetics and deserves this attribute both as a fixed image and as a line of motion, where it makes it possible to recognize the most beautiful phases of a dance, for example
Nonetheless, Hume believed that the sentiment of taste is not equal between all humans, but that a small elite group shares the sense of beauty, making up what was called the “aesthetic aristocracy”32.
31 David Hume, “On the Norms of Taste”, in Aesthetic Essays, II, Vrin, 1974. 32 Hume believed that a judgment of beauty could only serve as reference when it was delivered by experts endowed with a certain number of virtues: (1) a clean nature; (2) delicate and refined organs; (3) a culture developed through frequent exposure to art; (4) a natural opinion that is not affected or mannered, an indicator of common sense and common understanding; (5) intelligence that is sharp enough to create an aesthetic of perfection.
A1.3.5. Alexander Baumgarten
Alexander Gottlieb Baumgarten was the first in a long line of German philosophers who centered their philosophy around art. In 1750, he authored Aesthetics, a text that truly created the term that would cover the science of taste, approaching all arts in the same manner, and clearly separating Goodness from Beauty. He clearly showed that this term covered perception (through the different modalities of the different sensations), representations and various mental associations, from illusions to dreams, to hallucinations and to reasoning. Baumgarten attributed considerable importance to sensible intuition, which he defined differently from his predecessors Descartes and Leibniz: it is a direct perception of the properties of an artwork, which is established parallel to reasoning but often makes it possible to access the essence of things more rapidly and effectively, if in a somewhat confused manner (Malinowski-Charles 2005). Baumgarten stated: “Intuition is necessary for beauty”33.
A1.3.6. Immanuel Kant
Kant is recognized as the grand modern theoretician of Aesthetics34. With this essay dedicated specifically to aesthetics, he left behind an invaluable legacy which even today is the subject of much commentary. As already indicated, although Kant believed, like Plato, that Beauty was universal, he ascribed this virtue not so much to the object itself, as to the judgment of the observer’s aesthetic sense. He believed that this judgment was a universal gift, common to all humans and equally shared, and called it “common sense”. In his view, Beauty was tightly bound to the concept of Goodness. This concept was simultaneously perceived by our senses and understood by our conscience, an independent interaction of our imagination and our understanding. It has no biological objective, but aims to establish harmony among our internal representations. This must not be confused with simple pleasure, as this experience is independent of any final purpose.
In this way, it is indeed a property of the observer and not of the object. Nonetheless, the subjectivism of Kantian aesthetics agrees on many points with Platonic objectivism: universality (which Kant calls “inter-subjectivity”) and atemporality. Kant also adds some attributes which he staunchly defends, especially, “the autonomy of judgment”. Kant is willing to recognize the individuality of the sense of taste regarding beauty, which he calls “adherent”, that is, which is attached to an object endowed 33 Baumgarten, Aesthetics, section 37, quoted in Malinowski-Charles (2005). 34 A large part of Kant’s texts related to aesthetics are from the Critique of Judgment, from 1790 (Kant 2015).
with a utilitarian purpose (a church, clothing), which Hogarth and Hume tarried over. However, he does not recognize this free judgment with respect to the beauty that he called “free”, which acts within an artwork. Because of this absence of a final purpose, he considers “free beauty” to be superior to “adherent beauty”. Kant’s positions have been described at length in section 1.1.3. Detailed discussions of Kant’s contribution to modern thought on Aesthetics, and of how Kantian thinking can be interpreted in contemporary terms, can also be found in the literature (Schaper 1964; Aquila 1970; Crowther 1976; Hopkins 2001; Ferry 1990). Let us also mention here Friedrich Wilhelm Joseph von Schelling, author of Philosophy of Art, written between 1802 and 1805, but published posthumously, which follows in the Kantian line of thought, but liberates art from the influence of aesthetics (David 2002) by considering it to be a purely mental creation: “all art is mythology”, as J.F. Marquet says (Marquet 1983).
A1.3.7. Georg Wilhelm Friedrich Hegel
In his very long treatise, titled Aesthetics, Hegel re-examines all concepts of Beauty and places Aesthetics in a global and rational conception of the individual. Starting with dialectical reasoning, free of any earlier schemes, he still concludes, like the Platonic school, that the object exerts an objective influence on the observer to create Beauty, and it is not a subjective interpretation as Locke, for example, suggested35. The originality of Hegel’s thinking is that he considers Beauty to be an intellectual experience, the fruit of reasoning36, and not the result of our sensations. He takes the opposite point of view from the subjectivists and distrusts our sensations37. In parallel, he condemned the approach of “Bel Art” experts38. In art criticism, however, he had a very conventional view of the aesthetic
35 “All that exists, then, has only truth in so far as it is a definite existence of the Idea.
For the Idea is alone the truly real. The truth of the phenomenal is not derived from the fact that its particular existence is of an inward or external character, and as such is in a general sense reality; it is so wholly in virtue of the fact that such reality is adequate to the notion. Then alone is determinate existence real and true. And the truth, to which we here refer, is not a subjective interpretation of it, namely, that a particular existence is accordant with my own conception of it. It is truth in the objective sense that the reality of the Ego, or of any external object, action, or circumstance actually contributes to the realization of the notion”. (Hegel 1997, p. 176).
36 “The beauty of art is beauty born of the spirit and born again, and the higher the spirit and its productions stand above nature and its phenomena, the higher too is the beauty of art above that of nature”. (Hegel 1997, p. 52).
37 “The lowest in grade and that least compatible with relation to intelligence is purely sensuous sensation.” (Hegel 1997, p. 91).
38 “For when great passions and the movements of a profound soul assert themselves, we do not bother ourselves any more with the finer distinctions of taste and its retail traffic in trifles
hierarchy and his stances have few arguments laid out to justify them, being largely born out of classical canons39. In painting, it was light that conveyed the essence of the painter’s message. It was through light that the painter evoked such-and-such an object, as he is otherwise constrained to limit the model’s forms and colors. In this context, light is the “subject part” of the creative work40; however, this is the subjectivity of the creator and not of the observer. Finally, like Kant, he also believes that “Beauty must also be universally recognized” (Hegel 1997, p. 117).
A1.3.8. Arthur Schopenhauer
Unlike the earlier authors, Schopenhauer did not pen any text exclusively on aesthetics. His contributions on this topic can chiefly be found in two texts: On Vision and Colors (1815) and The World as Will and Representation, in 1818 and 1844. In the first text, which he wrote following his discussions with J. W. Goethe, from whom he borrowed the theory of chromatic perception, he returns to the Kantian distinction between perception and vision, as well as the phenomenological conception of the external world. Color is only a product of our subjectivity, and cannot be attached to the concerned object. Color is not present in light41. The reference hues (yellow, orange, blue and violet42) are innate in our consciousness. They result from the black/white interval being subdivided into simple Pythagorean fractions, like the divisions of the musical scale. This text wished to deconstruct the Newtonian approach, which in Schopenhauer’s opinion looked at only physics and thus neglected the deeply subjective concept of color perception.
We can find elements in this text that apply to aesthetics: a subjectivism of principle that phenomenologically creates the object’s properties solely through the observer’s representation of this object, an innate, qualitative reference frame, which gives priority to spontaneity and intuition over conscious reasoning, a base of Platonic concepts (simplicity, harmony, proportions) that are largely universal and transcendental. [..] The connoisseur, or art-scholar, has taken the place of the man, or judge of artistic taste.” (Hegel 1997, p. 89). 39 Thus, he defined the beauty of the abstract form in Platonic terms: “regularity, symmetry, conformance to laws and harmonies in forms and colors” (Hegel 1997, p. 204). 40 “If we ask now which physical element the painting makes use of, it is light, which makes visible the objects in the external world in general. [. . . ] Light, through this ideal identity, offers the only side that responds to the principles of subjectivity. And through this relationship, it has the property of making objects visible”. (Hegel 1997, II, p. 225). 41 Like Goethe, Schopenhauer was greatly struck by the colors that persist after a colored object has been viewed, or those that appear spontaneously in total darkness. 42 In other places, he suggests there are six principal colors, organized in pairs: yellow/violet, orange/blue, red/green.
In The World as Will and Representation, Schopenhauer dedicates a long chapter to art and genius43. In an approach that owes a lot to Kant, he distinguishes instinctive perception (which he called intuition) from reasoned deduction (the principle of sufficient reason), which he places under the control of the will. Genius is only given to those who are capable of this impulsive perception, a profound expression of the state of nature (this calls to mind Nietzsche, who pushed this idea to its limit). It is this naive and native intuition that leads the artist to a more exact representation of nature, and which gives their work its universally recognized qualities44. Schopenhauer holds that art and mathematics are two fields where the principle of sufficient reason, although the best tool to acquire essential concepts for developing thinking, proves to be inferior to intuition45. In this intuition/will duality, there is also a message about the attention of someone who observes a work of art46. Like Kant and Hegel, Schopenhauer wrote at length about the specific properties of the sublime, which in his time held much greater importance than today. He held that the sublime was a state of rapture that excluded any ability to mobilize the principle of reason. The spectacle that leads to the sublime expresses a state of frank hostility to will.
43 The chapter is titled “Book Three: the world as representation” and begins with a quote from Plato.
44 “[. . . ] This is shown by those admirable Dutch artists who directed this purely objective perception to the most insignificant object, and establish a lasting monument to their objectivity and spiritual peace in their pictures of still life; a man of taste cannot contemplate their paintings without emotion... [. . . ]” (Schopenhauer 1966, p. 240).
45 We cannot help but compare these ideas with those that have been so vigorously debated since the 1950s in the field of artistic production, and which accompanied the subject of artists like J. Pollock, G. Mathieu and J.M. Basquiat, or the ideas that J. Dubuffet defended under the term “Art brut”: “this refers to that product created from persons immune from the artistic culture, in which mimetism plays a minimal part if any, in a different way from the activities of the intellectuals. These artists derive all, subjects, the choice of materials, symbologies, rhythms, style, etc., from personal interiority, and not from conventions of the traditional and fashionable art. We find ourselves head to head with a pure, completely crude artistic operation, reinvented in all its procedures subsequent exclusively to the impulses of the artist itself”. (Jean Dubuffet, L’art brut préféré aux arts culturels, Gallimard, 1967).
46 “Only through the pure contemplation described above, which ends entirely in the object, can Ideas be comprehended; and the nature of genius consists in pre-eminent capacity for such contemplation. Now, as this requires that a man should entirely forget himself and the relations in which he stands, genius is simply the completest objectivity, i.e. the objective tendency of the mind, as opposed to the subjective, which is directed to one’s own self – in other words, to the will.
Thus genius is the faculty of continuing in the state of pure perception, of losing oneself in perception...” (Schopenhauer 1966, p. 240).
A1.3.9. Friedrich Nietzsche and the Romantic turn
Nietzsche was probably the philosopher who has had the most significant impact on generations of artists. And yet, his lessons were complex and ever-changing, and one can easily find recommendations in his work that contradict each other. Friedrich Nietzsche wrote about Aesthetics in Twilight of the Idols in 1888 (Nietzsche 2005), but his ideas on this theme are scattered through many other texts, especially in The Birth of Tragedy, The Will to Power and The Antichrist. His contribution to aesthetics has been discussed at length in Ferry (1990). To establish the context of Nietzsche’s reflections on aesthetics, let us note first of all that he considered art to be the peak of human activity and the only activity that truly merited total investment. Nietzsche denounced the cerebralization of classical aesthetics, with its concepts of order, rules, intelligence (what he called Apollonian Art), a cerebralization that produced the Platonic dialectic largely defended by Hegel47, which he considered intellectual fraud. On the contrary, he declared himself for an art of liberty, excess, irrational joy, intoxication (which he called Dionysian art)48. He advocated an individual, egotistic, elitist aesthetic. He recommended participating in orgiastic and pagan festivals, far removed from the stiff, religious ceremonies of Christianity. He searched for a primitive sentiment that had no reference to reason or reflection. He trusted in the primary nature of humans – as long as this was not vulgar – to identify true beauty in nature. And yet, Nietzsche also rejected the aesthetics of the Romantics (who did not greatly bemoan this loss!), especially in his criticism of Wagner, whom he accused of being overly attentive to vulgar physiological sensations. He also turned toward classical aesthetics and eventually said Corneille (the classical and cerebral) was preferable to Victor Hugo (the romantic poet).
47 “According to Deleuze, Nietzsche’s complicated aesthetic position can only be interpreted by relating it to his primary objective: to dismantle the philosophical message of Hegel.” (Deleuze 1962 cited in Ferry 1990). 48 “What is the meaning of the conceptual opposites which I have introduced into aesthetics, Apollonian and Dionysian, both conceived as kinds of frenzy? The Apollonian frenzy excites the eye above all, so that it gains the power of vision. The painter, the sculptor, the epic poet are visionaries par excellence. In the Dionysian state, on the other hand, the whole affective system is excited and enhanced: so that it discharges all its means of expression at once and drives forth simultaneously the power of representation, imitation, transfiguration, transformation, and every kind of mimicking and acting” (Nietzsche 2005).
Appendix 2 Aesthetics in China
The sea has a torrent, the mountains are lurking, the sea has a throughput, and the mountains have arches. The sea can swallow the clouds, the mountains can be transported, the sea has a torrent, and the mountains are also lurking. The sea has a state of vomiting, and the mountains are arched and polite... The mountains and rivers are full of intricate characters. [...] If a person can grasp the “one painting” in a specific way and understand the subtlety, then the intention is clear and the pen and ink are clear. Shitao (Cheng 1991a)
A2.1. The image in Chinese literature
Let us begin with an overview of Chinese aesthetics, which is significantly different from Western aesthetics. Our observations here have been largely borrowed from the following texts:
– first of all, François Jullien’s work: “The Great Image Has No Form”, said Lao Tseu, and F. Jullien, philosopher and sinologist, borrowed this phrase as the title of his work on Chinese aesthetics (Jullien 2003);
– the work of François Cheng, writer, poet, calligrapher, member of the Académie Française, steeped in both cultures (Cheng 1991a and b);
– the memoirs of Fabienne Verdier, painter trained in the demanding classical Chinese school of art (Verdier 2001, 2003);
– the lectures and texts of Xun Jiang, writer and contemporary art historian in Taiwan (Xun 2015, 2016), kindly translated and commented on by Pr. Sun Hong;
For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip
– the collection of translations of ancient texts, with commentary by Yolaine Escande (Escande 2003), sinologist and philosopher of aesthetics.
As we have seen, Western aesthetics is largely based on the work of Greek philosophers from the fifth century BCE. It relies on a faithful imitation of nature, prioritizing the criteria of harmony, both in the construction of the artwork as well as in the reproduction of details or the choice of colors. Chinese aesthetics in the grand artistic tradition of the Middle Kingdom traces its origins back to around the same period as Greek aesthetics, and, like the latter, was very soon the subject of many essays, paving the way for 25 centuries of intense practice. Unlike Western art, it does not appear to have undergone any revolution: the oldest works continued to serve as respected references for a long time, and changes due to evolving tastes took place very gradually, until the end of the Qing dynasty and the emergence of republics1. The rules that governed the construction of the “great image” can be found in the emergence of three large schools of thought: Confucianism (between the fifth and third centuries BCE), Taoism (from the third to second centuries BCE) and Buddhism (from the first century CE onwards). These philosophical influences, radically different from Platonic logic, led to the prescription of practices that were very different from those we have seen so far in this book. Chinese aesthetics concentrated to a great extent on visual arts: painting2 and calligraphy, which holds an eminent position among the fine arts in China3. It pays less attention to music and the plastic arts (statuary, porcelain, etc.), considering them to be a different field of arts from the graphic arts. In this text, we will look only at the graphic arts and, more specifically, at painting and drawing, as they will help us develop rules for beauty in photography.
However, the very term “beauty” seems ill-suited to characterize a successful Chinese painting. The Chinese painter does not aim for beauty; instead, they aim to share an intimate experience with the observer and bring the observer into the world
1 The Qing dynasty collapsed in 1912, followed by periods of considerable political turmoil. On a cultural level, aesthetics that were strongly rooted in tradition slowly opened up to western influences. However, it was not until the Civil War of 1927, and then the Sino-Japanese war (in 1931 in Manchuria, and then again in 1937 across the whole territory) that Chinese civilization underwent a sea change and entered a long, transitory period that brutally cut ties with ancient tradition, especially with the imposition of Maoist realism. This two-millennia-old tradition continued largely in Taiwan.
2 “In China, painting occupies the supreme position among all arts”, writes Cheng (1991b, p. 11), supporting this statement with multiple citations.
3 Chinese calligraphy must be considered as being the peak of painting and any rule that is valid for calligraphy can be transposed to painting.
they have created4. “More than an aesthetic experience, Chinese painting offers a mystical experience” (Cheng 1991b).
Figure A2.1. Making visible the invisible, expressing the fluidity of water, the stability of the mountain and the movements of bamboo. (a) A visual representation of the poem by Du Fu by Zhao Kui (1185–1266), southern Song dynasty. (b) Fishing in the river by Wen Boren (1502–1575), Ming dynasty, Shanghai museum
While Greek art was entirely taken up with the Aristotelian distinction between the objects around us and the thought that grasps them, while Greek art strove to construct images of this object so that the mind would recognize them on the canvas (a prime example of this is the Parrhasius curtain5), Chinese art was based on a contrasting conception of the world, in which the painter is part of the scene that they observe. The object is therefore to reflect this immersion in such a way that the observer also shares in it. This immersion, which binds together the inspired painter 4 This is clearly expressed with respect to calligraphy by Cyrille Javary: “One fact, however, is clearly seen: in Chinese it is never said that a calligraphy is ‘beautiful’. This would mean describing it as inert. In poetry, and in painting, Chinese aesthetic goes past the concept of pure beauty, which Greek reason dreamt of in the heaven of Ideas, preferring the more fleeting idea of a lively beauty” (Cyrille J.-D. Javary, L’écrire étincelant, in Verdier (2001)). We also read: “The criterion in art is not beauty, a subjective notion that varies depending on place and time, but sincerity, authenticity”, words from the painter Huang Yang, reported in Verdier (2003), and “aesthetics as the science of beauty does not exist in Chinese art”, in Escande (2003). 5 In a contest against Parrhasius, the great painter Zeuxis (considered the greatest in Ancient Greece) painted marvelous grapes that sparrows tried to peck at. Convinced by this that he had attained the greatest beauty, he expected to win the prize, but first wanted Parrhasius to draw back the curtain that masked his work. This curtain was painted, and Zeuxis had to admit that he was beaten, as he had been taken in himself, while his work had only fooled the sparrows.
and the landscape that they are preparing to reproduce, is materialized by the “breath of energy”, the qi, which is the vital energy (with the alternating yin/yang forces) which goes through all things and gives the universe its shape6. Capturing the qi such that it flows through the observer is the aim of Chinese art, just as capturing beauty is the aim of Western art. This is not so much a question of skill as of communion with the landscape. As François Jullien reminds the reader, “objective evidence is destroyed, there is no longer any need for a pre-established nature assuming an object that is already given” (Jullien 2003, pp. 107–108). Finer details may be ignored and it is better if the resemblance is a bit distant because, as Jing Hao wrote7: “resemblance means achieving the form and allowing the life-energy to escape” or, again, “when the life-energy is lost in the image phenomenon, it is the death of the image”8, and F. Cheng adds: “lacking energy is the very mark of a mediocre painter” (Cheng 1991b, p. 72). Even beyond an immersion, the Chinese painter aims to create an alliance between the scene and the painter9: “There is no longer a perceived object and a perceiving subject but a correlation and exchange between poles: ‘welcoming’/‘welcomed’”, writes F. Jullien. As Shitao10 said, toward the end of his life, now that he was able to paint, “the landscape calls upon me to speak in its place...”. The Chinese painter is most stimulated by attempting to render that which is not perceived: the disappearance of the mountain into the mist, the trace of the fish in the water, the rain disappearing into the peach trees, the transition of night into day.
While Chinese architecture made empty spaces its chief feature (Li 2002), the Chinese painter thrives on the polarity of “being/not being” and, to this end, makes abundant
6 Thus, we read in François Cheng: “It is known that Chinese aesthetic thought, founded on an organicist conception of the universe proposes an art that has always tended to recreate a total microcosm where the unifying action of the Life-Breath predominates, where even emptiness, far from being synonymous with flux or arbitrary, is the internal site where the network of vital forces is established” (Cheng 1991b, p. 10).
7 Jing Hao (855–915) was a reputed painter and fine essayist from Henan.
8 Both references cited in Jullien (2003, p. 227).
9 This Eastern vision of the intimate union between a person and their landscape finds some echoes in Western philosophy as well. Merleau-Ponty (1964), one of the great phenomenologists, states: “Visible and mobile, my body is a thing among things; it is one of them. It is caught in the fabric of the world, and its cohesion is that of a thing. But because it sees and moves itself, it holds things in a circle around itself. Things are an annex or prolongation of my body; they are incrusted in its flesh, they are part of its full definition; the world is made of the very stuff of the body. [...] vision is caught [...] where the indivision of the sensing and the sensed persists”.
10 Shitao is the famous “Bitter Gourd Monk” (1642–1708), from South China. His work is known in Europe in the translation by Pierre Ryckmans (2007).
Appendix 2
241
a)
b) Figure A2.2. Drawing with emptiness, using simple and efficient lines to evoke an idea. The line leaps out without any work, made necessary by the object. (a) Landscape, detail from a panel, by Hong Ren (1610-1663), Qing dynasty. b) Flowers, detail on a roll, by Chen Chun (1483–1544), Ming dynasty. Shanghai museum
242
Aesthetics in Digital Photography
use of the tool the West usually ignores: emptiness11. Emptiness (hsü or t’ai-hsü) is the privileged site of interaction between bipolar natures: it is what allows these polarities to come together instead of entering into conflict (Cheng 1991b, p. 51). In Chinese painting, emptiness can be found at all levels: in the composition, by reserving large masses for absence, in the traces between the brush-strokes that are laden with color or empty, in the opacity of the ink or painting which blots itself out in smoke, in tree branches, hung far from the trunk, in the fabric of cloth, fluttering in the breeze. According to Chang Shih, it is good to cover only one-third of the paper. In this sense, the sketch is the most accomplished form of Chinese art, leaving forms incomplete and leaving ample space for the viewer to invade and be invaded by the space described.
Figure A2.3. Simplifying lines and letting the viewer guess at the absent; the painter’s gaze does not describe, it only expresses: Poetic scene on the bank of Lake Shihu by Wen Zhengming (1470–1559), Ming dynasty, Shanghai museum
Unlike Western painters, whether of Florentine portraits, intimate Flemish paintings or French landscapes, the Chinese painter does not try to fill up their canvas. They play with space, planes and hollows, and guide the viewer to think of what has not been painted. Do not look for a faithful map of any site, or a precise representation of people. All faces have eyes, why represent them if they add nothing? Even the very format of the canvas defies Western practices. The scroll is stretched out over a large distance around the lake, going over the mountains, accompanying a calm river, like a survey carried out by a walker on the mountain. On the other hand, the wisteria tumbling down over the bamboo and the flight of swallows that darts into the clouds and arrives at the top of the steep cliffs are all arranged in an endless vertical procession.
11 However, we can find in V. Jankélévitch: “It is in the incompleteness that we allow life to settle in”, cited in (Verdier 2003, p. 131).
Figure A2.4. The Five Phases (or wuxing) theory associated with the fundamental elements of the universe of colors (and also of dynasties, tastes, directions etc.)
COMMENT ON FIGURE A2.4.– In order to respect the number of directions and seasons, yellow is represented in the center, which reflects its dominant character (a). The associations between the colors are of two types (b): acceptable (the broad, light arrows), which create acceptable shades, although these are inferior to the initial colors, or forbidden (the thin, dark arrows), which lead to non-aesthetic shades. These relations are justified by the relationships, in nature, between the underlying elements (credit Wikipedia).
The choice of the colors itself is a result of this cosmological vision of the universe. The principles for this are already mentioned in the Essay on Artisans12, written centuries ago. However, we owe the truly cosmogenic principles that govern these choices to Zou Yan13. Although sometimes debated14, these principles are still intact today. There are five primary colors. Each color is associated with an original element in the universe15: wood (green/blue16), metal (white), earth (yellow), water (black) and fire (red). The possible associations of colors follow the affinities of nature (wood for water, etc.). However, these mixtures are always less successful than pure shades. On the other hand, the antagonisms between the elements (water does not mix with fire, etc.) explain the aesthetic incompatibility between certain combinations (see Figure A2.4(b)). This Five Phases theory advocates that yellow (the Emperor’s color) is superior to all other colors from an aesthetic point of view, red is the color associated with life and black is associated with mystery. While colors were very often used in dazzling forms in ancient times, it was gradually recommended that they be expressed using the toned-down colors that predominated in the grand period and were enforced in the artwork that can largely be seen in museums.
12 The Essay on Artisans from the Rites of Zhou dates from the Warring States period (453–221 BCE). There is a commentary on this in Escande (2003).
13 Zou Yan (305–240 BCE) was a major philosopher in the Warring States period who made significant contributions to the theory of yin and yang as well as the Five Element theory.
14 Especially by Zhuang Zi, the Taoist philosopher, who accused them of reining in the imagination.
A2.2. Objective or subjective?
It would be very difficult to place Chinese aesthetics within the debate discussed in this book. It is clear that there is no way to reconcile the terms of Platonic analysis and aesthetic objectivity with the Chinese oeuvre. The objects are not beautiful in themselves, they are beautiful in their relationship to their context and the beauty emanates from the union and conjunction of presences through a delocalized grace, the qi. The most beautiful objects in ancient Greece, the human body, the nude, would hold no charm for the Chinese connoisseur. And, on the other hand, the peak of Chinese art, with its representations of landscapes where the majesty of mountains combines with the fluidity of the rivers and the movements of the forests, in a subtle balance of yin and yang, was likely to leave the Western aesthete totally unmoved, at least until the Florentine or Lorraine painters made these the subject of study.
Making an exact copy of nature to get closer to the model, apprehending rules of lighting, of shadows, of perspectives: all this was an indignity to the Chinese painter who, on the contrary, mastered the art of a few lines that conveyed meaning, ellipses and illusion. It was also meaningless to speak of a “canon”, for while the example of the ancients was venerated, copying methods was beneath the painter17.
We must however emphasize that Chinese aesthetics does share with ancient Western aesthetics the most objectivist belief that beauty (or whatever takes its place in a world which does not accord beauty the same value) is universally shared, as the ultimate cosmic framework. In Western aesthetics, it is a reflection of harmony, in Chinese aesthetics, of a vital breath. This final framework is unique and identically open to all who view it, as long as they acquire the necessary virtues (this is as true for the Athenian kalos kagathos as for the learned Chinese). For this reason, the Chinese masterpiece does not suffer criticism. It imposes itself and will impose itself durably.
While, on the majority of the criteria used, Chinese art does not agree with the objectivist precepts of aesthetics, does it agree with the subjectivist conceptions? Far from it. It is true that the painter’s objective may well be that of getting the observer to experience what the painter experienced in that instant as they created the painting. And it is true that there exists the idea that the artwork does not so much take shape as it is experienced by the spectator, and that the artist’s success depends on the depth of the viewer’s engagement. However, Chinese aesthetics refuses to recognize that the artistic experience depends on the viewer’s abilities and faculties in responding to the perception by an active contribution driven by their temperament, cognition and reasoning.
On the contrary, the aesthetic effect only exists through the complete passiveness of the individual, who allows the energy-breath to travel through them: “Let your heart be empty and free of the smallest speck of dust, and the landscape will flow more intimately into your soul”, said Wang Yu18, cited in Verdier (2001), and Xun Jiang19, today, excludes any conscious action in the perception of beauty, which is only the “awakening of the soul”, and in beauty, which is “the tightrope walk between emotion and feeling, on the one hand, and between arousal and pleasure on the other” (Xun 2015).
Neither objectivism nor subjectivism seems to respond to the requirements of Chinese aesthetics. It might be possible to get closer to this by adopting a phenomenological process that would grant significant importance to intuition (Schopenhauer20, Bergson, Merleau-Ponty). But this is another tale altogether.
15 These notes are presented in great detail in La pensée chinoise by M. Granet (1968).
16 Green/blue, called qing, which is not exactly our cyan, but close to it, must not be confused with green (lu), nor with blue (lan). This intimate alliance between blue and green may be approached through Pier Paolo Pasolini’s expression, “Green... green is the blue of the leaves in the pond. In the evening when the bells ring and the women sing at their doors. And then in the familiar and immemorial peace of the garden, night descends like a shadowy storm. The leaves stay immobile on the surface of the water and become bluer and bluer, until they turn green. But then green or blue?” (screenplay for La Ricotta, 1963, reported in (Lacroix 2018, p. 191)).
17 On this point (should we or should we not copy the masters?), Chinese literature is divided, as testified to by Xie He, painter and art critic from the Liu Song and Qi dynasties in the South, writing around 550. He recommended, in the final point of his “Six principles of painting”, painting not only by drawing inspiration from nature, but also by copying the masters.
18 Wang Yu, who signed all his work with the name Dongzhuang, is a Chinese painter born in the Jiangsu province around 1650 and who died in 1729.
19 Xun Jiang is an eminent contemporary Taiwanese historian and art critic, who has published many texts (all in Chinese) tracing parallels between Eastern and Western art.
20 Let us recall these words of Schopenhauer’s, cited more completely in footnote 27 in Chapter 1: “[...] the subjective element in æsthetic pleasure [consists of] the deliverance of knowledge from the service of the will, the forgetting of self as an individual, and the raising of the consciousness to the pure will-less, timeless, subject of knowledge, independent of all relations”.
A2.3. Artificial intelligence and the aesthetic appraisal of Chinese art
It will not have escaped the reader that traditional Chinese painting defies all the analyses that we carried out in the final chapters of this book. Is it possible to imagine a way to reintegrate this aesthetic into the schemes used by scientists to automate the aesthetic appraisal of an artwork? At this stage, we must divide the discussion into two levels, depending on whether we proceed by first detecting the criteria for beauty (what we called “handcrafted” in Chapter 6) or whether the end-to-end processing is delegated to a neural network.
A2.3.1. Handcrafted primitives
In this approach, we must identify all quantifiable criteria of beauty in the sense it is used in Chinese aesthetics. Thus, an inventory would have to be drawn up, comparable to the list in Chapter 6, based on Chinese literature. It appears that all the teaching of painting in China is based on a set of rules for producing a beautiful artwork: the six laws of Xie He21. These laws are considered as an absolute guide for the painter, intangible and eternal, and every apprentice must conform rigorously to them22. Thus, all teaching begins with the rules of Xie He. However, these have come down to us in the form of six phrases of four characters each which, drawn up in the ancient, very hermetic calligraphy, have given rise to multiple translations and interpretations, sometimes very different from each other23. Unfortunately, the highly abstract nature of these rules gives us very little from which to derive guidance for our objective. We must find more objective traits that can be measured in the photo.
21 Xie He (or Sheikh) was one of the leading theoreticians of Chinese aesthetics. He lived in the sixth century. He was also a painter but his artwork was not as famous as his chief text: Guhua Pinlu, Appreciating Ancient Paintings.
22 Thus, you can read on the website of the Central Academy of Fine Arts in Beijing: “The postulating of the six laws had a decisive influence on the later theory of painting [for Xie He]. The six laws are still the norm, making it possible to judge the quality of a painting”.
23 In this text, we have used the translation by the sinologist Yolande Escande (Escande 2003, p. 296) whose work seems most easily comprehensible: (1) the resonance of the breaths gives life and movement; (2) the method of the bone (i.e. of the skeleton and bony structure) guides the handling of the brush; (3) faithfully respect the object when drawing its form; (4) ensure there is adherence to genres; (5) pay attention to the suitable arrangement in the composition; (6) allow transmission through copies [of the masters]. These rules have a strict hierarchy. The first two prevail over the next two and, a fortiori, over the last two. Some people see an echo of Vitruvius’s six recommendations in these six rules: order, arrangement, eurythmics, symmetry, propriety and economy.
To the best of our knowledge, no list of these properties has been drawn up either for painting or for photography. However, based on texts related to painting, it would be possible to find the important traits. For example, there are rules on the space to be left white at various levels: in the attachment of the leaf to the branch, in the emptiness of the prairies or in the opening of the mountains and the clouds. It might also be possible to identify pairs falling into the yin/yang polarity: mountain/river24, tree/rock, boat/fisherman, path/torrent, bridge/boat, etc. It would also be necessary to identify the traces of disappearance, the line that fades away, the path melting into the forest, the mountain fading into the mist, the evanescence of distant things, the incomplete depiction of people and so on. On the other hand, in notable opposition, in China the line has exceptional vigor. It sweeps away detail and crudely expresses the principal line with the force of a long-reflected gesture, “this driving impulse which proceeds from the vital breath (qi ji)”, as F. Jullien says, quoting Jiang He (Jullien 2003, p. 279). This is probably expressed in its contrast, curves and orientations. It would certainly be desirable to pay greater attention to ink drawings, which hold prime position in Chinese aesthetics. This means taking into account the predominance of ink over painting and the power of the brush (the bi mo25), which leads us to recognize six “nuances”: black, white, dry, damp, thick and fluid, as listed by Tang Dai26. It would probably also be desirable to pay attention to the elementary detail, which is of primordial importance in Chinese painting: the line. The quality of the brushline is the fruit of an education acquired over many years27. The singularity of the “beautiful line”, which does not escape the Chinese scholar, could probably be revealed through a fine morphological analysis by a computer. We have been unable to find any research in this direction.
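As an illustration only, one of the traits listed above, the share of the surface left white, lends itself to a very simple measurement. The sketch below is not taken from any published method: the function names, the brightness threshold `paper_level` and the toy image are assumptions made for the example, and the one-third figure is simply Chang Shih's recommendation quoted earlier in this appendix.

```python
from typing import Sequence


def empty_fraction(gray: Sequence[Sequence[int]], paper_level: int = 230) -> float:
    """Share of pixels at least as bright as `paper_level` (0-255 grayscale).

    `paper_level` is an assumed threshold standing in for "untouched paper";
    a real study would have to calibrate it for each scanned artwork.
    """
    total = sum(len(row) for row in gray)
    if total == 0:
        raise ValueError("image has no pixels")
    empty = sum(1 for row in gray for value in row if value >= paper_level)
    return empty / total


def covered_fraction(gray: Sequence[Sequence[int]], paper_level: int = 230) -> float:
    """Share of the surface carrying ink or color: the complement of emptiness."""
    return 1.0 - empty_fraction(gray, paper_level)


# Toy 3 x 4 grayscale "painting": three dark strokes on bright paper.
toy = [
    [255, 255, 250, 240],
    [255, 40, 60, 245],
    [255, 255, 30, 250],
]

# Hypothetical check against the "cover only one-third of the paper" advice.
respects_one_third = covered_fraction(toy) <= 1 / 3
```

Such a global ratio says nothing about where the emptiness lies (between brush-strokes, around a branch, in the mist), so it could at best be one primitive among many in the inventory discussed here.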
Finally, the absence of shadows is a necessary feature of Chinese painting, which sees shadows as a manifestation of yin, which must be excluded from strong images. This constraint, which can easily be implemented in painting (but is not a discriminating quality when evaluating its beauty), must, in photography, be given greater importance if it is also a tacit aesthetic criterion: do Chinese photographers seek to mask shadows? Do they allow them the space that they naturally occupy in a scene? It does not seem as though this has ever been systematically studied.
24 “Mountain river” is a literal translation of shan shui, the pair that is a generic designation for the landscape, with the polarity carried in the compound word.
25 The bi mo denotes both the small collection of objects used to prepare ink by crushing the powder, as well as a style of atelier painting, as we would say “aquarelle” or “estampe” painting.
26 Tang Dai was a painter in the 18th century, from Manchuria, and the author of an important pedagogic essay on painting (Jullien 2003).
27 The pedagogic syllabus proposed by the master Huang Yuan, in Verdier (2003) reads: “You will begin by tracing lines, only lines, for many months. In Chinese painting, everything is constructed of lines [...] You must first acquire a solid base. This base is the horizontal line, we will not move on to the other lines...”.
A2.3.2. Deep neuromimetic networks
As we have seen in earlier chapters, the question of which primitives to choose soon became a secondary problem because of the considerable advances made by the DNN methods, which bypassed the primitives. It is thus possible to envisage a way to avoid establishing the inventory described in the previous section by directly feeding a network with images that are judged to conform to the prescriptions of Chinese scholars. To the best of our knowledge, no approach has undertaken to teach a computer beauty in the sense of the Great Image. Further, it seems that it is precisely from the Middle Kingdom that we have the greatest and most advanced types of research28, based on a Platonic approach to aesthetics. Should we believe that the digital world is turning its back on the tangible world that nourishes traditional Chinese painting? Probably not, but almost everything leads us to believe that it was chiefly due to the possibility of easily implementing tools that conformed to an objectivist aesthetic that computer scientists dove into the stream of social networks and online photography databases. It is simply a question of opportunity. If we had to teach a computer to distinguish beauty in accordance with Chinese aesthetic criteria, it would not work to simply replace, piece by piece, the AVA database and the collection of evaluations from DPChallenge to train a DNN architecture on Chinese aesthetics. If this line of work was followed, we would have to first collect a few hundred thousand “beautiful images” taken from the best collections from Chinese museums. This is probably not very difficult, given the number of museums.
The next step would be to collect, for each artwork, a reasonable number of comments. This would definitely be more complex, especially if we wanted these opinions to conform to the tradition of judgment in Chinese aesthetics, since this aesthetic is not well known to Internet users today. We have also seen, in Chapter 8, that criticisms may be leveled at this process of learning via DNNs, even without these difficulties. It can therefore be suggested that we profit from the experiments carried out on “Western” photography to challenge the resolutely objectivist approach adopted up to the present. It may be too soon yet to conceive of new paths. It is probably better to wait for clear knowledge to emerge about the limitations of the objectivist approach, based on ongoing experiments, and to then develop other methods that may be more fruitful. It would therefore not be very surprising if we had to go back to square one to redesign learning principles and reasoning mechanisms to better respect the subtlety of aesthetics as per Chinese texts.
28 A close examination of the authors cited as references in Chapters 6 and 7, in China as well as in Western countries, will convince us of this.
Appendix 3 The Aesthetic of Persian Miniatures
Today, the common translation of muDawwir is “creator” in the sense of “painter” and, by extension, “photographer”, “cameraperson”, “illustrator”, etc., while the verb Dawwara refers to any act of creation.
Ioana FEODOROV (2005)
The aesthetic current represented by the miniatures1 of the Middle East is undeniably one of the strongest and best-documented aesthetic traditions in the history of art. While this art spans a wide period, extending approximately from the 12th century to the 19th century, its spatial boundaries are a little less precise. The heart of this form is in Persia (modern-day Iran and Iraq), but it extended to the North, to the cities of Uzbekistan, to the East it covered Afghanistan, Kashmir and the Indian Punjab region, while in the West it extended into modern-day Turkey, as far as Istanbul (Roux 2007).
For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip
1 The term “miniature” is criticized today as it may seem degrading, being too evocative of a reduced form of art (Melikian-Chirvani 2007). However, following a long tradition (Ishaghpour 2009), we will still continue to use it, as in the collective unconscious it recalls just that form of art that we are analyzing here. For us, this word is, of course, free of any qualitative judgment.
A3.1. A brief history
A3.1.1. Image and Islam
One of the questions that quickly comes up when looking at the place of aesthetics in the Middle East is the role of Islam as a religion in the relationship between the artist and the image. This has been the subject of many texts. We recommend the analyses penned by Mohamed Aziza (1978) and by Silvia Naef (2011) for their wealth of detail and relevance. We will briefly summarize these analyses here. When Islam originated (Feodorov 2005), the Prophet Mohammed did not seem very inclined toward images. The Koran does not reject figurative images, but recommends avoiding representations of pre-Islamic pantheons2. Then, “when it became certain that people believed strongly, it accepted these representations and did not concern itself with them any longer”3. Severe judgments against images are found in the commentaries on the Koran, the hadiths, which form the references that are often invoked today, proscribing the representations of living things in images. The image is, first of all, rejected as a creative work that could claim to compete with God4, at the cost of committing the sin of pride. This specifically covers all forms that carry a breath or depict a soul: people or animals. But an image, especially in a large format (paintings, panels), could also become an object of idolatry, which is the second reason to proscribe it. However, images were tolerated in carpets, which are walked upon, and in the miniatures that decorated objects that people would only use occasionally. Finally, an image often depicts a deliberate ostentatious luxury and wealth that is unacceptable in a religion that preaches asceticism and detachment. Independent of these religious arguments, it has sometimes been suggested that these extreme positions may have been adopted because the Umayyad dynasty did not have a strong enough symbolic tradition to supplant the iconic Byzantine dynasty, which was at its peak at this time (Naef 2011).
It will be noted that many cultures, and all the Abrahamic religions, condemn the use of images to varying degrees (a position called aniconism).
2 The texts in the Koran only speak of images through the term “idol”, which appears in verse 74 of Surah 6 (The Cattle), in verse 9 of Surah 5 (The Table Spread), in verses 57 to 59, and 66 to 69 of Surah 21 (The Prophets).
3 The words of Ahmad Muhammad Isa in Muslims and Taswir, from Journal of Al-Azhar, The Muslim World, vol. XLV, 1955, p. 252.
4 “Those whom God will punish most severely on Judgment Day are those who imitate God’s creations”, in the word of Ahmad ibn Hanbal (780–855).
Figure A3.1. Islam’s rules about depicting living objects led to fantastic abstract, geometric art where highly stylized forms were combined to fill up space in a harmonious manner
COMMENT ON FIGURE A3.1.– (a) In the Abdullah-Khan madrassa in Bukhara, a Surah becomes the motif in the decoration. (b) In the Friday Mosque of Na’in in Iran, and (c) in the Mosque of Sheikh Lotfollah in Esfahan, these are highly complex, purely geometric forms, combined with daring architecture that has been skillfully rendered, creating incontestable harmony.
Given these constraints, Islamic civilizations produced carpets, jewels, objects for daily use and monumental work, all of a remarkable elegance and rare beauty, using both calligraphy and geometry (Figure A3.1). In doing so, they created their own active criteria of aesthetics. This aesthetic, which is somewhat removed from photography, will not be looked at in detail. But we will look specifically at the miniatures. Miniature art was not subject to the ban on painting living things, but participated in the overall large movement of Islamic art. Thus, it inherited a marked taste for geometry, allied with great flexibility of strokes, a dazzling color palette, a structuring of space that left no place for emptiness, as well as close attention paid to the whole as well as to the smallest detail.
A3.1.2. The miniature in history
Tracing the source of the Persian miniature obliges us to go far back in time, as we must refer to Ancient Persian tradition, Mazdaism and Manichaeism, the Sassanid legacy, steeped in Chinese influence, the Sogdian experiences in the northern Bactrian territories and then the Arab conquests and the profound Islamization of these earlier traditions, and finally the Armenians, in the 10th and 12th centuries, who have given us the earliest known miniatures and testify to a transition with the late Byzantine art. The oldest Persian manuscripts endowed with miniatures have come to us from the Baghdad school and the early 13th century5 (Busson 1965). This artistic creation was enthusiastically received by the courts of the Sultans. It was encouraged by powerful princes, who found in them a way to ensure the memory of their reign lived on. Miniatures were illustrations designed to enliven and explain the texts that they accompanied: collections of poetry, sacred books, accounts of past exploits, biographies of real or fictional princes.
Unlike Western illumination, miniatures were an art form in themselves, often carried out on a separate page in the text. It was common practice to then re-use them in new collections, as the prince desired, by undoing them from the earlier manuscripts and reintegrating them within different texts, or even scratching out or writing over the drawings to change the dedication. A manuscript was not produced by a single artist. It was the work of a team and carried out within a workshop where people had many different roles, all under the “artistic direction” of a master artisan, who was often very famous and would compose the main part of the manuscript and put the finishing touches on the whole. The artist very rarely signed their work, but their name would stay attached to the work in oral history, both in the workshops as well as in the royal courts. Calligraphers, artists to do the drawing, colorists and artists who added details all would work consecutively on the same drawing. These operations have been described very well by Orhan Pamuk in his book, My Name is Red, dedicated to life in a workshop in Istanbul6, providing a detailed analysis of the aesthetic of the miniature.
The oldest miniatures that have come down to us date to the middle of the 13th century and are often from Baghdad, then the seat of the Abbasid caliphs, around the period when the armies of Genghis Khan began their invasion of Iran. The Mongols spread out across the East from 1220 onwards. These horsemen toppled princes, razed or subjugated cities and displaced hordes of courtiers, artists and poets, whom they attached to their own court. Consequently, they carried with them traditions that originated in modern-day Uzbekistan, Afghanistan, India and distant China. After the Mongol invasions, Persia was conquered by the troops of “Tamerlane” (Timur), continuing the considerable mixing of the arts in the second half of the 14th century. The cities that had the most reputed artistic schools, at this time, were Shiraz, Herat and Tabriz. These schools reached the peak of their glory in the 14th and 15th centuries (the time of the Timurid school) and were very active in diffusing their methods and their principles across all of Persia during the long period of the Safavid school of art (from 1502 to 1722) that would see many centers of art flourish (Samarkand, Bukhara, Esfahan, Istanbul, etc.).
Must we distil this varied and eclectic artistic world into a homogeneous and unique school of Eastern miniature art? Of course not – despite its significant coherence (of which we shall see some examples below) the miniature offers diverse features that specialists have clearly identified. There are multiple ways of classifying the aesthetics of miniatures.
5 The manuscripts sometimes predate this period, going back to the 10th century, but the illuminations are often later than the written texts, often added several centuries later.
Jean-Paul Roux (2007), for example, identifies three distinct families: – the Iranian miniature, illustrated by the schools of Herat and Shiraz, is somewhat the heart of this art form, devoting considerable interest to representing landscapes and a great deal of attention to composition and fine details. This was born around the late 13th century. This form of the miniature drew its inspiration from poetry above all, and 6 Orhan Pamuk is a Turkish author, born in the middle of the 20th century who was awarded the Nobel Prize for Literature for his novel, My Name is Red (Pamuk 1998). Written in the form of a murder mystery and love story within a large workshop under a Sultan of Istanbul in the late 16th century, it describes and comments on the revolution that was introduced in how texts were illuminated in the time of the Ottoman Empire. This revolution was brought about by the entry of techniques that came from the West, especially from Venice. Richly punctuated with references to texts and artwork that were the glory of the miniature form, this book can also be read as a debate between subjectivism and objectivism at the heart of Islam. The author is careful not to take sides – defending both the traditional point of view as well as the progressive point of view equally.
256
Aesthetics in Digital Photography
scenes from ancient epics (Roux 2002). It is particularly famous for having illustrated the works of the great poets: Ferdowsi7, Nizami8, Hafez9, Jami10;
– the Ottoman miniature, created in the reign of Mehmed II, followed the Turkish capture of Constantinople in 1453 and the establishment of a brilliant court that was open to the arts. It was born from the Iranian miniature and reached its peak in the reigns of Suleiman the Magnificent (1520–1566) and Murad III (1574–1595). It is more realistic than the Iranian miniature, more concrete and less fantastical (Roux 2007);
– the Mughal miniature, created in the reign of the great Mughal Emperor, Humayun, who took in exiled painters from the Persian court and installed them first in Kabul and then in Delhi in 1555. However, it was his son Akbar who gave these artists the means to create a school and then allowed these schools to grow, intimately combining the Iranian tradition with Hindu inspiration (Roux 2007).
A3.2. What is the aesthetic of the miniature?
As we have said, the miniaturists approached all themes11. The sole exception is the portrait: portraits are quite rare and only appeared later, probably under the influence of Venetian painting, from the 16th century onwards. The formats are very heterogeneous, making use of the free spaces between sections of text, but many miniatures occupy a whole vertical page in the book, on separate sheets. The composition does not seem to obey any precise rules12. One rule, however, seems to be widely followed regarding exterior scenes: the line of the horizon is placed very high in the image, expressing a downward view of the scene (Figures A3.2(b) and (c)). The space is totally used13, even more than in Western painting and certainly more than in Chinese painting. Each interval between the principal objects, people and monuments is a chance for additional ornamentation: a flower, a rock, a cloud, a bird, a vase, etc. (Figure A3.2(b)). The Persian miniature abhors emptiness!
The use of perspective was forbidden for a long time (Figure A3.2(c)); vanishing lines slid timidly into the miniature after the spread of Western artwork, in the 18th century, but would never create an illusory, three-dimensional world. The world of the miniatures was a flat screen, where people, animals and trees were of the same size, whether near or distant. Shadows were also proscribed, both as shading that models surfaces under light and as cast silhouettes. As in China, the artist does not consider the shadow to belong to the person or the object.
Miniaturists paid special attention to colors14. These were applied in flat areas, without gradation. They were very bright, often violent, put together in clearly divided compositions. The Persians long held on to the secret of these hues that were both bright and nuanced, created from shells, plants or powdered precious gems bound together by coatings15. The Mughals then enriched this tradition with their own contributions. No hue was forbidden and no combination was given greater importance; the artist’s inspiration was the only guide to choosing these palettes.
The aesthetics of miniatures obliges the artist to represent only beautiful objects, each with a clear form16. The scene will be beautiful if each element that constitutes it is perfect: the beautiful Shireen, the central character, the focus of all attention, her embroidered veil and sandal, her lover, Khusrow, the young and pure hero, with broad shoulders and curly hair, the garden filled with delicate fruit trees, the artfully hemmed clouds, each bird sitting on the edge of the fountain, each flower in the lawn, each detail treated with the same attention as though it were the central object of the scene. Youssef Ishaghpour says: “There exists a veritable cult of beauty in the Persian miniature”. As proof that each detail receives the same attention as the whole, the smallest detail is treated with incomparable expertise: the minuscule carnation on the edge of the field is not a buttercup, and the bird the hunter is hunting is not a rock partridge, but a chukar partridge.
7 Ferdowsi lived in the 10th century, and his main text, the Shâhnâmah (The Book of Kings) gave rise to multiple illustrated collections, many of which are today considered as references of miniature painting (Ishaghpour 2009).
8 Nizami was a poet and scholar from modern-day Azerbaijan, living in the 12th century. His principal work Khamseh (Five Treasures) was complemented by famous illuminations painted in Shiraz in 1410.
9 Hafez was a Persian poet and philosopher who lived in Shiraz in the 14th century. His poetry was collected in the book Divan, which inspired many illustrators.
10 Jami was a Sufi poet and philosopher who lived in Herat in the 15th century. His writings were mystical fables which especially inspired the miniatures of the great master Behzad.
11 “...the garden, the palace, romance, festivals, hunting, battles between kings, as well as the trivial such as a bath, a construction or a watermelon seller. All are given the same splendor, appearing as if in the distant suspended world of dreams and fables”, writes Youssef Ishaghpour (2009, p. 10).
12 “The Persian miniature is decentered and, thus, has no power of the frame that could define, with its coordinates, a space within the structures” (Ishaghpour 2009, p. 26).
13 “The horror of emptiness, the characteristic feature of the Muslim decorative genius, renders gardens exuberant, with the smallest plot of soil being covered in flowers. Curved lines, often spirals or semi-circles that may represent streams, contribute to the softness and harmony of the whole”, writes Jean-Paul Roux (2002).
14 “If the draftsman is a master, the colorist is a genius. Although in the 12th and 13th centuries the colors were aquarelle colors, lacking vividness, by the 15th century they had completed their transformation and had become lacquered, thick and brilliant. Each color was pure and of the same intensity, without color shifts, without nuances to create a model” (Roux 2002).
15 “Powdered gold, silver, lapis-lazuli, emeralds and other precious stones transform the materiality so that they are but the reflection of the light” (Ishaghpour 2009, p. 9).
16 “Small size, delicacy, finesse, are the essential and fundamental formal characteristics” (Ishaghpour 2009, p. 17).
Figure A3.2. Some miniatures from the Persian School
COMMENT ON FIGURE A3.2.– (a) From Riza y Abbasi of the Esfehan school, the miniature The Animal Trainer, from 1621, showcases an aesthetic inspired by Chinese art with regard to the features of the characters depicted, as well as the form of the clouds. (b) From Djoneïd Negargar, Khomay and Khomayoun, painted in Baghdad around 1400 based on the poems of K. Khermani. This miniature clearly illustrates
the filling of space, the downward point of view, the attention paid to detail, the choice of a very high horizon line and the characteristic absence of shadows in the Persian miniature. (c) From Nasir al Din Tusi, Painters and Calligraphers at Work, extract from a treatise on Mughal morals from toward the end of the 16th century. It can be noted here that liberties have been taken with the rules of geometric perspective.
The aesthetic of the miniatures does not aim to reproduce the world as it is, but as God made it, or rather, as is said in every workshop, “as God sees it”. Thus, the painter must not paint the model, but the idea that we have of an ideal model. If a horse is to be painted, there is no point seeking inspiration from a given horse, however beautiful it is, because it would certainly be imperfect. To reproduce the image of the horse as God conceived it, one must return to tradition and follow the lesson of the ancient masters who progressively approached this perfect image. When the artist is steeped in the idea of the ideal horse, through observation and the study of tradition, they can then reproduce this internal image of the perfect form, which is not guided by the perception of the eye, but by the divine concept of the horse. The hand is not led by the eye, but by the idea, and the idea is led by tradition. All horses in beautiful miniatures share the same ideal form, which slowly emerged from the best workshops.
Furthermore, the perfection of the line would be that produced by the master at the end of his career, when he had lost his vision, and thus there was nothing to distract him from the internal image. Going blind from working day and night on paintings in his workshop, the artist attained the ultimate level, where the work is traced automatically by the hand, guided solely by the internal image of the horse’s beauty (Pamuk 1998).
Thus, the Persian miniature has identical reproductions of dozens of horses in the same attitudes: lying down, walking, galloping, rearing. A single battle will see 20 of them, with the same caparisons, the same riders with the same gleaming arms. In crowds, in palaces, it is hard to distinguish between old and young, and while men differ from women in dress, their faces are often very similar; Shireen can only be distinguished from her beautiful followers in their finery by the fact that she is in the center of the image and the center of attention. Feminine beauty is a distant legacy of the Mongol invasions and the Chinese aesthetic: the women have broad, very pale faces with high cheekbones and almond-shaped eyes. Also inherited from China are the swirls of clouds, the knotted tree trunks and the sometimes serpentine dragons, which replace the hairy, ape-like monsters inherited from Mazdean liturgies in the depiction of hell (Figure A3.3(b)).
Figure A3.3. Some miniatures from the Persian school
COMMENT ON FIGURE A3.3.– (a) Anonymous miniature from Bijapur, in India, around 1650. Yogini playing a musical instrument. The miniature easily integrates foreign cultures (here Hindu culture) into the Persian aesthetic code. (b) Anonymous miniature extracted from a version of Ferdowsi’s Shâhnâmah painted in Baghdad around 1370: epic scene, Bahram Gour killing the dragon. The Chinese inspiration is particularly noticeable in this work, both in the form of the dragon and in the features of the hero.
This beauty, imposed on the form, is also imposed on the content. The image does not only reproduce peaceful and soft scenes (although these are many). It may sometimes also depict carnage, bloody battles and slaughter. But the impression produced by these violent scenes is of the same serenity as that produced by the image of a fountain in a garden: men are slaughtered in the battle between armies, amid the neighing of the horses, corpses pile up, walls are destroyed – but all this at a distance and calmly, without passion, without horror, without a grimace17. This transposition of the ambience into a world free of human contingencies also operates in landscapes. As Dominique Clévenot says: “Persian art does not propose a naturalist vision of nature, it carries out a derealization of the natural landscape by the sides taken in arranging the space, through the play of forms, the color choices and the very sheen of the pigments employed” (Clévenot 2001).
17 “There is no drama; while there is sometimes violence and death, there is no emotion or sign, but for a celestial feeling of detachment, of peace and eternity” (Ishaghpour 2009).
Unlike Chinese artwork, the Persian miniature is made up of precise objects, marked out by clear lines. The fluidity of the line and its purity in rendering the forms are the qualities of great painters. This line is created by a sure hand, decisive and spontaneous, which is only obtained through long practice and long training in a workshop under the direction of a master. It takes many years of practice, and in this we can see a Chinese manner of conceiving of the line as the result of learning over an entire career (Verdier 2003). It is said that in a Persian miniature, if a horse is to be beautiful, it must be drawn with a single stroke of the pen, starting from the right hindquarters and returning there without lifting the pen (Pamuk 1998).
A3.3. Objective or subjective?
The aesthetic that governs the creation of the most beautiful Persian miniatures brings us back to the tricky problem that underpins our work: which elements support the hypothesis of an objective Beauty and which arguments support a subjective Beauty?
In the Middle Ages, the thinking in Islamic civilization was heavily influenced by the Greek legacy. The thinking and texts of Socrates, Plato and Aristotle were well known to thinkers in the East. They were not only studied in depth, but also commented upon, critiqued and complemented by original thinking, especially taking into account the religious message specific to Islam, which had a deep influence on society. The world is beautiful because it was created by God, and the painted work must reflect this beauty, which is manifested in each object. This, then, has all the elements of an objectivist aesthetic: the universality of the experience on observing the art, the permanence of the judgment that travels across generations and is anchored in memories, and the shared criteria (purity of line, precision of feature, fluidity of forms, richness of details, sumptuous hues), all of which objectivize the opinion and justify it.
These characteristics are translated by the artist into perceptible, shareable traces that can be held up to a viewer who does not perceive this beauty. The Islamic world knew the laws of physics and optics even better than the Greeks. Under the influence of scholars from the East, physiology and medicine made it possible to establish the initial boundaries between that which is perceived and that which is conscious thought. The reality of the physical world is established and affirmed, and it is also distinct from the world of thought.
However, the beauty that the artist creates may belong to the miniature, but does not belong to any natural “model” that the painter perceives. It belongs to the intimate experience, which is only seen through the perceptual pathways as the result of a distant legacy. It is an image conceived in the soul of the painter who is endowed with exceptional sensitivity through laborious practice, rigorous education and continued observation of the examples of the old masters. Beauty is noticeable in the modern miniature, because the old master had conceived of the exact forms within himself, not because he had seen them. His model was an entirely mental image, produced as the result of
a singular sensibility, as well as an advanced education, and not the result of observation. As proof of this: the blind man is able to arrive at this image and use it to guide his hand18. Is this not then a “subjective” beauty, produced by intense spiritual activity and the result of exchanges between the higher regions of the brain: memory, cogitation, intuition? Is this not a situation of “valorized observation or, rather, hypnotized observation”, as Bachelard (1949) said, where objectivity and subjectivity are combined, with the objective coming not from the external world, but from a sublime universe shared by the artists?
18 On the subject of this vision of the world that is not based on perception but on the idea that is transmitted by the image, we can also read: “To establish a relationship between the tangible world and transcendence, the Iranian mystics created the ‘imaginal’ world corresponding to a suprasensible sensibility: a universe of subtle bodies, called the ‘world of the eighth climate’, an intermediary between the world of the pure idea and the world of ordinary perception. Insofar as this refers to the ‘creative imagination’ and the relationship between the sensible and the intelligible, there is significant proximity between this visionary mystic and the questions of art, without referring to these”, in Ishaghpour (2009, p. 60).
Appendix 4 Aesthetics in Japan
Artists attempt to not only represent the passage of water and clouds in their artwork, but they also strive to reproduce the temporal experience.
Yuko HASEGAWA (Fondation du Japon 2018)
Japan is among those countries that have not only produced a highly rich range of artwork over their history, but also possess a specific literature that makes it possible to define original criteria for beauty. The earliest artistic manifestations in Japan can be seen in the Jomon pottery1, from several thousand years ago. The foundations of Japanese aesthetics were considerably strengthened by contributions from China, above all, and also Korea, during the constant exchanges in the Yamato (250–710) and Nara (710–794) periods. Japanese art then gradually cultivated its unique features and often set itself up in opposition to these precursors. Unlike what happened in China, the writings that analyzed and established the framework for this aesthetic came later, chiefly during the Heian period, at the end of the first millennium. The Edo period (1603–1868) and then the Meiji period (1868–1912) saw considerable development in these studies and established an original aesthetic, which was distinguished by its great simplicity, combined with high sophistication. This is therefore a rich and singular artistic tradition that we will try to broadly describe here, a tradition that led to a distinction between nihonga, Japanese painting, and yōga, Western painting2. In some previous studies (Lane 1962; Bayou 2004; Delay 2004; Richie 2016), the reader can find a wealth of information on this field.
For a color version of all figures in this chapter, see www.iste.co.uk/maitre/aesthetics.zip
1 The Jomon period was the first of the traditionally recognized historical periods in Japan. It falls in Japanese proto-history. It is likely to have started around 15,000 years ago and to have ended around the fifth century BCE.
A4.1. A brief history of art in Japan
During the Tang dynasty (618–907), the Chinese Empire enjoyed a period of exceptional prosperity and extended its cultural influence far beyond its borders. The Chinese pictorial arts traditions established over 10 centuries had been consolidated into treatises that were widely available, as we saw earlier. This “aesthetic of the Tangs” (the Japanese called it kara-e) was the norm both in the Korean courts and in Japan. This is particularly evident in the calligraphy and in natural landscapes.
The Nara period was followed by the Heian period (from 794 to 1185); Japan distanced itself considerably from China and entered a period of isolation. This was a great time for Japanese culture which, alongside official art inspired by the Tang tradition, developed a yamato-damashii (spirit of Japan) which gave importance to local customs, family traditions and daily life. A school of painting was created around Heian-kyō (the future Kyoto), a school that would come to be called yamato-e (Ancient Japan). This abandoned large landscapes to depict intimate scenes and interior scenes. It paid great attention to detail but strove for unadorned construction. The yamato-e would serve as the artistic reference in Japan for 10 centuries and was divided into slightly different variants for various dynasties, until the end of the Meiji period. The lines are often very dark, the colors bright as in the style of the 15th century Tosa school. The art of the portrait inspired from yamato-e was expressed in an original style, called nise-e (portrait resembling the original), which appeared toward the late 12th century (Figure A4.1).
The yamato-e also inspired the Rinpa school of Ogata Kōrin, which, with its rules of economy of line and richness of decoration, would travel across centuries and arts, ranging from paintings to sculptures, ceramics to lacquer-work. This movement toward everyday, ordinary themes, reflections of daily life and bourgeois life in the Edo period was solidified by the emergence of the ukiyo-e (floating image) style, which popularized scenes in the garden, the theatre, scenes depicting fighters, geishas playing music, gracious and modest scenes, far removed from the splendors of court or the depths of mountain landscapes.
In the 1880s, during the Meiji era, which followed the Edo period, renewed nationalist feeling led to the conventions, techniques and materials of traditional Japanese painting being held up as an example and reinforced in their role as a model for the Nippon aesthetic. It was at this time that the term nihonga (Japanese painting) grew popular, bringing together all expressions of art that were faithful to this tradition, as opposed to yōga (the Western style), which was knocking loudly on Japan’s doors. The debates from this period gave rise to what we now call the Japanese aesthetic (Charrier 1996).
2 The term nihonga (“Japanese painting”) was proposed by Okakura Kakuzō Tenshin in his 1906 work, Book of Tea, which established the unique features of Japanese art. In this analysis, Tenshin followed the path laid by the Orientalist Ernest Fenollosa with whom he did a lot of work (Mitteau 2015).
Figure A4.1. Two examples of nise-e. (a) Fujiwara Nobuzane (1176–1266), Portrait of the Emperor Go Toba. (b) Fujiwara no Gōshin, Portrait of the Emperor Hanazono (1338). In both these portraits, the essential is communicated without ornamentation: the faces are finely personalized, the clothing is elegant, but drawn with neither detail nor pomp, the attitude expresses authority and, without showing it, hints at the deference of the onlookers
Figure A4.2. Sengai Gibon (1750–1837) is famous for this piece, Circle, triangle, square. Multiple meanings are attached to these three shapes, encoded in Japanese culture: the circle = infinity and Buddhism, the triangle = form and Confucianism, the square = architecture and the universe of Shintoism. Each form is completed by a single stroke, without lifting the brush
A4.2. The art of impermanence
As is the case across Asia, art was a highly intellectual activity; this is recalled particularly by the bunjin-ga (the painting of the literati). All representation is intentional, codified and with respect to a given form, which refers to interior life and to religious references (Figure A4.2). Among these, the Western observer is likely to be most surprised by the role of time, which, paradoxically, tries to find expression in the most static of the arts. Time is present everywhere and Japanese art captures continuous movement, transition and change better than any other3. A term is dedicated to this: mono-no-aware (the interjection of things), which expresses the appeal of the ephemeral, but also the attraction in the instant, the perishable, the melancholic charm of the transient. The expression glides over representations of a group of young women having a happy moment, the subtle decline of a bouquet of flowers, or the disappearance of the sun behind waves. There is a recognition of transition as the very essence of time, a constant in Japanese painting that we do not find in Western art. The impermanence (mujô) and uncertainty (futei) are at the heart of the work of art and its raison d’être, as expressed by Yuko Hasegawa, the Director of the Tokyo Museum of Contemporary Art, quoted at the opening of this chapter.
3 In Parkes and Loughnane (2005), we can read: “The world of flux that presents itself to our senses is the only reality: there is no conception of some stable ’Platonic’ realm above or behind it”.
Figure A4.3. Hakuin Ekaku, Zen Buddhist philosopher and painter (1686–1769), Blind men crossing a bridge. The virtues of Buddhism are presented here: clear lines, modesty of the subject, a simple man in nature and an unadorned scene
Another characteristic, which also arises from Buddhism, can often be found in Japanese artwork: the importance of old objects, polished and worn by much handling, eroded by time, sometimes cracked, sometimes battered, and the prime position accorded to old people with lined, marked yet serene faces4. This attraction to the wear of the past is called sabi, which makes people prefer an old object to a new one, the tarnished to the shining. Glitter is not favored in Japanese art; what is prized is modesty, discretion and probably also sorrow, melancholy and the austere rigor of solitude (the wabi), a modest detachment, like the pebble on its bed of sand5.
4 The poet Jun’ichirō Tanizaki (1886–1965) (Tanizaki 2017) says “We have always preferred deep reflections, somewhat veiled, to a superficial and frozen brilliance; whether in natural stones or artificial materials, this brilliance that is gradually altered irresistibly evokes the effects of time”.
5 Again, in the writings of Jun’ichirō Tanizaki we find: “All is well considered because we, in the East, seek to fit within the limitations that are imposed on us so that we may be always content with our present condition; consequently, we experience no revulsion with respect to that which may be obscure, we resign ourselves to it as to the inevitable; if there is poor light, well, let it be so! Better, we go joyously deeper into the shadows and discover the beauty inherent to them. In the West, on the contrary, they are constantly striving for progress and are in constant agitation in pursuit of a state that is better than the present” (Tanizaki 2017).
The philosopher Shin’ichi Hisamatsu (1889–1980), strongly steeped in Buddhist culture, believes that Zen art has seven characteristics: asymmetry, simplicity, noble austerity, lightness (or the absence of effort), non-conformity, repercussion with an end, and peace. Donald Richie, the eminent American Japanologist, sees “the effects of asymmetry and incompleteness, the “poor” and “natural” materials” as the most distinctive mark of Japanese culture as compared with Western art (Richie 2016).
Symmetry, which is one of the bases for Platonic aesthetics (see section 1.1), is dismissed by Japanese artists as it betrays an illusory stasis or permanence in a world that is constantly evolving, and opposes movement, which expresses life.
Simplicity is required both in the overall construction of the work and in its component elements. It removes superfluous objects and only reveals the essential. It accompanies forms with a precise and fluid line, a single line, without the hand being lifted from the paper. As in Chinese painting, lines are of major importance and are expressed in a precise and varied manner, as required (Melay 2019). Simplicity can be divided into two terms: mu (emptiness, nothingness) and ma (interval, space), both at the heart of Zen objectives (Lucken 2014). This attraction of space, of emptiness, is found in the art and can also be found in everyday photographic preferences (Nisbett and Masuda 2003; Masuda et al. 2008): a preference for horizons placed very high in the landscape, giving the impression of a bird’s eye view, and revealing a large panorama; photos of people from head to toe, rather than a portrait, freeing up the visual field, making the person distant, depersonalizing them, losing them in the decor6.
The relationship between the individual and the world is completely different from that in the West: more inclusive, more integrated, and more “a relation of the part with the whole” than “the subject observing the object” (Markus and Kitayama 1991).
We have said that dark or subdued tones are found frequently in the palette of painters. Rendering regions of shadow, uncertain light and nocturnal lighting is a prized exercise, in which we recognize the accomplished artist, as Tanizaki’s text in footnote 5 recalls. However, Japanese painting does not forbid itself from using saturated hues, bright colors, when they evoke a bird’s plumage, a flowery dress or a child’s clothing. The faces of women, if they are beautiful, are a flat white, without shadows. The eyebrows are shaved, to make the face larger, and reinforce the pallor of the make-up. The lips are heavily loaded with blue and green. By contrast, men, like old people, have a parchment shade, tanned, the mark of a life spent in nature.
Geometry does not follow the rules of perspective, or rather, only respects them at a remove, with no rigor. Cast shadows are absent, as in Chinese art.
6 “It is clear that in the Japanese arts, however they are expressed there is a valorization of distance, of emptiness, of interstitial space, especially as compared to the European arts” writes Michael Lucken (Lucken 2014).
Figure A4.4. (a) Ito Jakuchu (1716–1800), Mandarin ducks in the snow. (b) The Temple of Ryosoku In, Plum tree and rooster in the snow, anonymous painting on silk from the 18th century. (c) Ogata Kōrin (called Ichinojō) (1658–1716), Flowers in autumn (detail from a screen) (Korin school)
Figure A4.5. (a) Ogata Kōrin (called Ichinojō) (1658–1716), Waves at Matsushima (Korin School). (b) Tosa Mitsunobu (1430–1522), Foundations of the temple of Seiko-Ji (Tosa school)
As with Chinese art, Japanese art does not fit within the broad categories of Western aesthetics. While it accords a remarkable place to the sensible world, and while it captures details, light, shadow and movement with great faithfulness, this is not done to testify to the external world, captured and returned as is. It is, instead, in order to place the observer within a particular moment and to reconstruct the whole, with the observer being a part of it: a cool breeze among children having fun with a
kite, a moment with women making music in an arbor, a moment where an old person falls asleep, and so on. Each painting opens a door and invites the observer to visit it, without claiming any other happiness than that of sharing it. If there is any pleasure, it is a pleasure of being, of participating, not the pleasure of seeing, understanding or analyzing.
Figure A4.6. (a) and (b) Tosa Mitsuoki (1617–1691), Scenes of Genji Monogatari (Tosa school). (c) Fukae Roshu (1699–1757), The narrow path (screen)
References
Alberti, L. (1992). De pictura. Macula, Paris.
Alley, T. and Hildebrandt, K. (1988). Determinants and consequences of facial aesthetics. In Social and Applied Aspects of Perceiving Faces, Alley, T. (ed). Lawrence Erlbaum Associates Publishers, Hillsdale, NJ.
Alvarez, L., Gousseau, Y., Morel, J.M. (1999). The size of objects in natural and artificial images. Advances in Imaging & Electron Physics, 111, 167–242.
Amengual, X., Bosch, A., de la Rosa, J.L. (2015). Review of methods to predict social image interestingness and memorability. In International Conference on Computer Analysis of Images and Patterns, Springer, Cham.
Amirshahi, S., Denzler, J., Redies, C. (2013). Jenaesthetics – A public dataset of paintings for aesthetic research. In European Conference on Computer Vision (Workshop), Jena, 3–19.
Amirshahi, S., Hayn-Leichsenring, G., Denzler, J., Redies, C. (2014). Evaluating the rule of thirds in photographs and paintings. Art & Perception, 2(1–2), 163–182.
André, J. (1759). Essai sur le Beau. J.H. Schneider Editions, Amsterdam.
Apostolidis, K. and Mezaris, V. (2019). Image aesthetics assessment using fully convolutional neural networks. In International Conference on Multimedia Modeling, 361–373.
Appleton, J. (1975). The Experience of Landscape. John Wiley & Sons, New York.
Aquila, R. (1970). A new look at Kant’s aesthetic-judgement. Kant Studien, 1–4, 17–34.
Arbellini, J. (2017). Les photographies d’art ont-elles un format compatible avec le nombre d’or ? Master’s thesis, Institut Villebon-Georges Charpak, Telecom-Paris, Dept. IDS, Paris.
Aristotle (1996). Poetics. Penguin, London.
Arnheim, R. (1954). Art and Visual Perception: A Psychology of the Creative Eye. University of California Press, Berkeley, CA.
Arnheim, R. (1983). The Power of the Center. A Study of Composition of the Visual Arts. University of California Press, Berkeley, CA.
Arnheim, R. (1986). New Essays on the Psychology of Art. University of California Press, Berkeley, CA.
Attewell, D. and Baddeley, R. (2007). The distribution of reflectances within the visual environment. Vision Research, 47(4), 548–554.
Augustin, M.D., Carbon, C., Wagemans, J. (2011). Measuring aesthetics impression. In ECVP’11, Toulouse.
Augustin, M.D., Wagemans, J., Carbon, C., Holmes, T., Kapoula, Z., Roberts, M.N. (2018). Measuring aesthetic impressions [Online]. Available at: http://www.gestaltrevision.be/en/?option=com_content&view=article&catid=27&id=283.
Aziza, M. (1978). L’image et l’islam. Albin Michel, Paris.
Bachelard, G. (1949). La psychanalyse du feu. Gallimard, Paris.
Barsalou, L.W. (1985). Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(4), 629.
Barthes, R. (1980). La Chambre claire : note sur la photographie. Gallimard/Le Seuil/Cahiers du cinéma, Paris.
Batteux, C. (1747). Les Beaux Arts réduits à un même principe. Librairie Durand, Paris.
Baudelaire, C. (2019). Oeuvres Intégrales, Carrés Classiques, Nathan, Paris.
Bayer, R. (1956). Traité d’esthétique. Armand Colin, Paris.
Bayou, H. (2004). Images du monde flottant – Peintures et estampes japonaises XVIIe-XVIIIe siècles. Réunion des Musées Nationaux, Paris.
Beardsley, M. (1966). Aesthetics from Classical Greece to the Present. MacMillan, New York.
Bell, C. (1914). Art. Chatto & Windus, London.
Bell, S.S., Holbrook, M.B., Solomon, M.R. (1991). Combining esthetic and social value to explain preferences for product styles with the incorporation of personality and ensemble effects. Journal of Social Behavior and Personality, 6(6), 243.
Bénard, C. (1877). L’esthétique du laid. Revue philosophique de la France et de l’étranger, IV, 233–265, PUF, Paris.
Benouaret, I. (2017). Un système de recommandation contextuel et composite pour la visite personnalisée de sites culturels. Doctorate thesis, Université de technologie de Compiègne, Compiègne.
Bense, M. (1969). Einführung in die informationstheoretische Ästhetik. Grundlegung und Anwendung in der Texttheorie. Rowohlt Taschenbuch Verlag, Reinbek.
Bense, M. (2007). Aesthetica. Les Éditions du Cerf, Paris.
Bergson, H. (2009). La pensée et le mouvant. PUF, Paris.
Berlyne, D. (1970). Novelty, complexity and hedonic values. Perception & Psychophysics, 8, 279–290.
Berlyne, D. (1971). Aesthetics and Psychobiology, vol. 336. Appleton-Century-Crofts, New York.
Berridge, K. and Kringelbach, M. (2013). Neuroscience of affect: Brain mechanisms of pleasure and displeasure. Current Opinion in Neurobiology, 23(3), 294–303.
Birkhoff, G. (1933). Aesthetic Measure. Harvard University Press, Cambridge, MA.
Blijlevens, J., Thurgood, C., Hekkert, P., Chen, L.L., Leder, H., Whitfield, T.W. (2017). The aesthetic pleasure in design scale: The development of a scale to measure aesthetic pleasure for designed artifacts. Psychology of Aesthetics, Creativity, and the Arts, 11(1), 86–97. Boccia, M., Barbetti, S., Piccardi, L., Guariglia, C., Ferlazzo, F., Giannini, A.M., Zaidel, D.W. (2016). Where does brain neural activation in aesthetic responses to visual art occur? Meta-analytic evidence from neuroimaging studies. Neuroscience & Biobehavioral Reviews, 60, 65–71. Boselie, F. (1991). Against prototypicality as a central concept in aesthetics. Empirical Studies of the Arts, 9(1), 65–73. Boselie, F. and Leeuwenberg, E. (1985). Birkhoff revisited: Beauty as a function of effect and means. The American Journal of Psychology, 98(1), 1–39. Bouleau, C. (2014). The Painter’s Secret Geometry: A Study of Composition in Art. Allegro Editions, The Courier Corporation, USA. Boulez, P., Changeux, J., Manoury, P. (2014). Les neurones enchantés. Odile Jacob, Paris. Bourdieu, P. (1965). Un art moyen : essai sur les usages sociaux de la photographie. Les Éditions de Minuit, Paris. Brady, N. and Field, D. (2000). Local contrast in natural images: Normalisation and coding efficiency. Perception, 29, 1041–1055. Bréhin, Y. (2007). Une lecture de la “beauté réelle” d’Eddy Zemach. Marges : revue d’art contemporain, 6, 112–119. Brown, S. and Dissanayake, E. (2009). The arts are more than aesthetics: Neuroaesthetics as narrow aesthetics. Neuroaesthetics, 43–57. Brown, S., Gao, X., Tisdelle, L., Eickoff, S., Lotti, M. (2011). Naturalizing aesthetics: Brain areas for aesthetic appraisal across sensory modalities. Neuroimages, 58, 250–258. Brunswick, E. (1956). Perception and the Representative Design of Psychological Experiments. University of California Press, Berkeley, CA. Bullot, N.-J. and Reber, R. (2013). 
The artful mind meets art history: Towards a psycho-historical framework for the science of art appreciation. Behavioural and Brain Sciences, 36, 123–180. Busson, A. (1965). La miniature : un art d’une richesse inouïe dominé par le réalisme. Le Monde diplomatique, supplément : l’Iran au passé glorieux se tourne vers l’avenir, December, 25. Cardon, D. and Casilli, A. (2015). Qu’est-ce que le digital labor ? INA, France. Carlson, A. (1979). Appreciation and the natural environment. The Journal of Aesthetics and Art Criticism, 37(3), 267–275. Carlson, A. (1995). Nature, aesthetic appreciation, and knowledge. The Journal of Aesthetics and Art Criticism, 53(4), 393–400. Carlson, A. (2005). What is the correct curriculum for landscape appreciation? In The Aesthetics of Everyday Life, Light, A., Smith, J.M. (eds). Columbia University Press, New York. Casilli, A.A. (2019). En attendant les robots – Enquête sur le travail du clic. Le Seuil, Paris. Cela-Conde, C., Agnati, L., Huston, J., Mora, F., Nadal, M. (2011). The neural foundations of aesthetic appreciation. Progress in Neurobiology, 94, 39–48.
Chandler, R. (2013). Seven challenges in image quality assessment: Past, present and future research. ISRN Signal Processing, 2013, 1–53. Chang, K.Y., Lu, K.H., Chen, C.S. (2017). Aesthetic critiques generation for photos. In Proceedings of the IEEE International Conference on Computer Vision, 3514–3523. Changeux, J. (2008). Du vrai, du beau, du bien : une nouvelle approche neuronale. Odile Jacob, Paris. Changeux, J. (2016). La beauté dans le cerveau. Odile Jacob, Paris. Charrier, I. (1996). Débat sur l’avenir de l’art dans le japon de l’époque meiji. D’une vision traditionnelle de l’art à une esthétique. Ebizu, 12(12), 154–180. Charrier, C., Lezoray, O., Lebrun, G. (2012). Machine learning to design full-reference image quality assessment algorithm. Signal Processing: Image Communications, 27(3), 209–219. Chastel, A. (1996). Marsile Ficin et l’Art. Droz, Geneva. Chatterjee, A. (2003). Prospects for a cognitive neuroscience of visual aesthetics. Bulletin of Psychology and the Arts, 4(2), 55–60. Chatterjee, A. and Vartanian, O. (2016). Neuroscience of aesthetics. Annals of the New York Academy of Sciences, 1369, 172–194. Cheng, F. (1991a). Shitao, la saveur du monde, citations sur la peinture, Phébus, Paris. Cheng, F. (1991b). Vide et plein, le langage pictural chinois. Le Seuil, Paris. Chevreul, E. (1864). Des couleurs et leurs applications aux arts industriels à l’aide des cercles chromatiques. J.-B. Baillière et fils, Paris. Chopra, S., Hadsell, R., LeCun, Y. (2005). Learning a similarity metric discriminatively with application to face verification. In CVPR’05, 1, 539–546. Clévenot, D. (2001). Paysages persans – Vers une esthétique de l’imaginal. Horizons maghrébins – Le droit à la mémoire, 45, 34–49. Cochoy, F. (2011). De la curiosité, l’art et la séduction marchande. Armand Colin, Paris. Cometti, J.-P. (2006). Art, représentation, expression. Armand Colin, Paris. Cometti, J.-P., Morizot, J., Pouivet, R. (2000). Questions d’esthétique. PUF, Paris. Costa, P. 
and McCrae, R. (1992). NEO-PI-R Professional manual. Psychological Assessment Resources, Odessa. Cotton, C. (2009). The Photograph as Contemporary Art, vol. 1. Thames & Hudson, London. Crettez, J. (2017). Les supports de la géométrie interne des peintres. ISTE Editions, London. Cristani, M., Vinciarelli, A., Segalin, C., Perina, A. (2013). Unveiling the multimedia unconscious: Implicit cognitive processes and multimedia content analysis. In Proceedings of the 21st ACM International Conference on Multimedia, 213–222. Cross, J.F., Cross, J., Daly, J. (1971). Sex, race, age, and beauty as factors in recognition of faces. Perception & Psychophysics, 10(6), 393–396. de Crousaz, J.-P. (1985). Traité du beau, où l’on montre en quoi consiste ce que l’on nomme ainsi, par des exemples tirés de la plupart des arts et des sciences. Fayard, Paris. Crowther, P. (1976). Fundamental ontology and transcendent beauty: An approach to Kant’s aesthetics. Kant Studien, 1–4, 55–71.
Cui, P., Liu, S., Zhu, W., Luan, H., Chua, T., Yang, S. (2014). Social-sensed image search. ACM Transactions on Information Systems (TOIS), 32(2), 1–23. Cupchik, G., Vartanian, O., Crawley, A., Mikulis, D. (2009). Viewing artworks: Contributions of cognitive control and perceptual facilitation for aesthetic experience. Brain and Cognition, 70, 84–91. Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR Conference Computer Vision and Pattern Recognition, San Diego, 886–893. Damásio, A. (1994). Descartes’ Error: Emotion, Reason and the Human Brain. Grosset/Putnam, New York. Damásio, A. (1999). The Feeling of What Happens: Body, Emotion and Consciousness. Harcourt Brace, New York. Danto, A. (1964). The artworld. The Journal of Philosophy, 61(19), 571–584. Danto, A. (1992). Beyond the Brillo Box: The Visual Arts in Post-Historical Perspective. University of California Press, Berkeley, CA. Danto, A. and Goehr, L. (2014). After the End of Art: Contemporary Art and the Pale of History. Princeton Classics, Princeton, NJ. Datta, R. and Wang, J. (2010). ACQUINE: Aesthetic quality inference engine–real-time automatic rating of photo-aesthetics. In MIR’10, 421–424. Datta, R., Joshi, D., Li, J., Wang, J. (2006). Studying aesthetics in photographic images using a computational approach. In Computer Vision, ECCV 2006, vol. 3953 of Lecture Notes in Computer Science, 288–301. Datta, R., Li, J., Wang, J.Z. (2008). Algorithmic inferencing of aesthetics and emotion in natural images: An exposition. In 15th IEEE International Conference on Image Processing, 105–108. David, P. (2002). Schelling : construction de l’art et récusation de l’esthétique. Revue de Métaphysique et de morale, 34(2), 29–41. Delahaye, J.-P. (2015). La beauté mise en formules. Pour la Science, 455, 78–83. Delay, N. (2004). L’estampe japonaise. Hazan, Paris. Deldjoo, Y., Elahi, M., Cremonesi, P., Garzotto, F., Piazzolla, P., Quadrana, M. (2016). 
Content-based video recommendation system based on stylistic visual features. Journal on Data Semantics, 5(2), 99–113. Deldjoo, Y., Constantin, M.G., Ionescu, B., Schedl, M., Cremonesi, P. (2018). Mmtf-14k: A multifaceted movie trailer feature dataset for recommendation and retrieval. In Proceedings of the 9th ACM Multimedia Systems Conference, 450–455. Deleuze, G. (1962). Nietzsche et la philosophie. PUF, Paris. Deng, X., Cui, C., Fang, H., Nie, X., Yin, Y. (2017a). Personalized image aesthetics assessment. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2043–2046. Deng, Y., Loy, C.C., Tang, X. (2017b). Aesthetic-driven image enhancement by adversarial learning. arXiv, 1707.05251. Deng, Y., Loy, C.C., Tang, X. (2017c). Image aesthetic assessment: An experimental survey. IEEE Signal Processing Magazine, 34(4), 80–106.
Dennett, D. (1993). La conscience expliquée. Odile Jacob, Paris. Descartes, R. (1991). Discours de la méthode suivi de La Dioptrique. Gallimard, Paris. Dessalles, J.L. (2008). La pertinence et ses origines cognitives. Hermes-Lavoisier, Paris. Dessalles, J.L. (2013). Algorithmic simplicity and relevance. In Algorithmic Probability and Friends, Dowe, D. (ed.). Springer-Verlag, Berlin. Dewar, H. (1938). A comparison of tests of artistic appreciation. British Journal of Educational Psychology, 8, 29–49. Dewey, J. (2010). L’Art comme expérience. Gallimard, Paris. Dézarnaud-Dandine, C. and Sevin, A. (2007). Symétrie m’était contée... histoires de symétries. Ellipses, Paris. Dhar, S., Ordonez, V., Berg, T. (2011). High level describable attributes for predicting aesthetics and interestingness. In Computer Vision and Pattern Recognition (CVPR), 1657–1664. Di Dio, C. and Vittorio, G. (2009). Neuroaesthetics: A review. Current Opinion in Neurobiology, 19, 682–687.
Di Dio, C., Macaluso, E., Rizzolatti, G. (2007). The golden beauty: Brain response to classical and renaissance sculptures. PloS One, 2(11), e1201. Di Dio, C., Canessa, N., Cappa, S., Rizzolatti, G. (2011). Specificity of esthetic experience for artworks: An fMRI study. Frontiers in Human Neuroscience, 5(139), 1–14. Diderot, D. (1769). Correspondance littéraire. Furne, Paris. Diderot, D. and d’Alembert, J.L.R. (1777). Encyclopédie. Pellet, Paris. Dominguez, V., Messina, P., Parra, D., Mery, D., Trattner, C., Soto, A. (2017). Comparing neural and attractiveness-based visual features for artwork recommendation. In Proceedings of the 2nd ACM Workshop on Deep Learning for Recommender Systems, 55–59. Dong, Z. and Tian, X. (2015). Multi-level photo quality assessment with multi-view features. Neurocomputing, 168, 308–319. Ducarme, F. and Couvet, D. (2020). What does “nature” mean? Palgrave Communications, 6(1), 1–8. Dufrenne, M. (1967). Phénoménologie de l’expérience esthétique. PUF, Paris. Dufrenne, M. (1980). Esthétique et Philosophie. Klincksieck, Paris. Dupont, J. (2015). Que faire de l’imagerie cérébrale ? Territoires anciens et nouveaux d’une technologie. Comptes rendus Biologie, 8–9, 607–612. Eco, U. (1962). L’oeuvre ouverte. Le Seuil, Paris. Elahi, M., Deldjoo, Y., Bakhshandegan Moghaddam, F., Cella, L., Cereda, S., Cremonesi, P. (2017). Exploring the semantic gap for movie recommendations. In Proceedings of the Eleventh ACM Conference on Recommender Systems, 326–330. Elton, W.R. (1954). Aesthetics and Language. Philosophical Library, New York. Escande, Y. (2003). Traités chinois de peinture et de calligraphie. Tome 1 : Les textes fondateurs (des Han aux Sui). Klincksieck, Paris. Evans, A., Collins, D., Milner, B. (1992). An MRI-based stereotactic atlas from 250 young normal subjects. Journal of Soc. Neurosci., 18, 412.
Eysenck, H. (1939). The general factor in aesthetic judgements. British Psychological Society, 31(1), 94–102. Eysenck, H. (1941). The empirical determination of an aesthetic formula. Psychological Review, 48(1), 83–92. Eysenck, H. (1968). An experimental study of aesthetic preference for polygonal figures. The Journal of General Psychology, 79(1), 3–17. Eysenck, H. (1983). A new measure of good taste in visual art the visual aesthetic sensitivity test. Leonardo, 16, 229–231. Eysenck, H. (1991). Dimensions of personality: 16, 5 or 3? Criteria for a taxonomic paradigm. Personality and Individual Differences, 12(8), 773–790. Fabrizio, D. (2015). Sur l’épigenèse de l’esprit esthétique. Le sens de la beauté, de la survie à la survenance. Nouvelle revue d’esthétique, 15(15), 93–110. Falk, J. and Balling, J. (2010). Evolutionary influence on human landscape preference. Environment and Behavior, 42(4), 479–493. Fang, H. and Zhang, M. (2017). Creatism: A deep-learning photographer capable of creating professional work. arXiv, 1707.03491. Fechner, G. (1871). Zur Experimentalen Aesthetik. Hirtzel, S., Leipzig. Feodorov, I. (2005). La question de l’image chez les musulmans. Miscellanea Archaeus, IX(1–4), 1–18. Fernandez, D. and Wilkins, A. (2008). Uncomfortable images in art and nature. Perception, 37(7), 1098–1113. Ferry, L. (1990). Homo aestheticus, l’invention du goût à l’âge démocratique. Grasset, Paris. Fillinger, M.G. (2020). Effects of perceptual balance on aesthetic appreciation. PhD thesis, University of Konstanz, Baden-Württemberg. Fize, D. (2004). La catégorisation visuelle rapide. In Imagerie cérébrale fonctionnelle électrique et magnétique, Renault, B. (ed.). Hermes-Lavoisier, Paris. Fondation du Japon (2018). FUKAMI – une plongée dans l’esthétique japonaise. Exposition, July 14–August 21. Frangne, P.-H. (2018). Au principe de l’esthétique environnementale. Du paysage de montagne à l’esthétique de la montagne. Nouvelle revue d’esthétique, 22, 37–53. 
Franklin, M., Becklen, R., Doyle, C. (1993). The influence of titles on how paintings are seen. Leonardo, 26(2), 103–108. Freedberg, D. and Gallese, V. (2007). Motion, emotion and empathy in esthetic experience. Trends in Cognitive Sciences, 11(5), 197–203. Freeman, M. (2018). Qu’est-ce qu’une photo réussie ? Dunod, Paris. Fried, M. (2008). Why Photography Matters as Art as Never Before. Yale University Press, New Haven, CT. Frow, J. (2002). Signature and brand [Online]. Available at: https://minerva-access.unimelb.edu.au/bitstream/handle/11343/25746/67014_00002624_01_Frow010.pdf?sequence=5. Fu, Y., Hospedales, T.M., Xiang, T., Gong, S., Yao, Y. (2014). Interestingness prediction by robust learning to rank. In European Conference on Computer Vision, 488–503.
Gärdenfors, P. (2000). Conceptual Spaces, the Geometry of Thought. MIT Press, Cambridge, MA. Gardner, H. (1984). Art, Mind and Brain: A Cognitive Approach to Creativity. Basic Books, New York. Garnier, A. (1826). Du beau dans la nature sauvage et du beau dans la société. Le Producteur, 3, 500–507. Ghosal, K., Rana, A., Smolic, A. (2019). Aesthetic image captioning from weakly-labeled photographs. arXiv, 1908.11310v1. Ghyka, M. (1931). Le nombre d’or. Gallimard, Paris. Gibson, J. (1986). The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, Hillsdale, NJ. Giraldo, K. and Velandia, B. (2020). Beauté des images selon les moments statistiques. Technical report, Telecom-Paris, Palaiseau. Goetz, K., Lynn, R., Borisy, A., Eysenck, H. (1979). A new visual aesthetic sensitivity test: I: Construction and psychometric properties. Perceptual and Motor Skills, 49(3), 795–802. Goldberg, L. (1990). An alternative “description of personality”: The big-five factor structure. Journal of Personality and Social Psychology, 59, 1216–1229. Gombrich, E.H. (1960). Art and Illusion: A Study in the Psychology of Pictorial Representations. Princeton University Press, Washington, DC. Gombrich, E.H. (2000). Concerning “the science of art”: Commentary on Ramachandran and Hirstein. Journal of Consciousness Studies, 7(8–9), 17. Gousseau, Y. and Roueff, F. (2007). Modeling occlusion and scaling in natural images. SIAM Journal of Multiscale Modeling and Simulation, 6(1), 105–134. Granet, M. (1968). La pensée chinoise. Albin Michel, Paris. Graves, M. (1977). Test of Drawing Appreciation. The Psychological Corporation. Gygli, M., Grabner, H., Riemenschneider, H., Nater, F., Van Gool, L. (2013). The interestingness of images. In IEEE International Conference on Computer Vision (ICCV), 1633–1640. Hariman, R. and Lucaites, J.L. (2016). Photography: The abundant art. Photography and Culture, 9(1), 39–58. Hayn-Leichsenring, G.U., Lehmann, T., Redies, C. (2017). 
Subjective ratings of beauty and aesthetics: Correlations with statistical image properties in western oil paintings. i-Perception, 8(3), 204–210. He, K., Zhang, X., Ren, S., Sun, J. (2015a). Deep residual learning for image recognition. arXiv, 1512.03385. He, K., Zhang, X., Ren, S., Sun, J. (2015b). Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans on PAMI, 37(9), 1904–1907. He, R., Fang, C., Wang, Z., McAuley, J. (2016). Vista: A visually, socially, and temporally-aware model for artistic recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems, 309–316. Hegel, G. (1835–1838). Vorlesungen über die Ästhetik. Collected by H.G. Hotho. Duncker und Humblot, Berlin.
Hegel, G. (1997). Esthétique ou Philosophie de l’art, vols 1 and 2. Le Livre de poche, Paris. Henry, C. (1885). Introduction à une esthétique scientifique. Revue contemporaine, Paris. Henry, C. (1889). Éléments d’une théorie générale de la dynamogénie, autrement dit, du contraste, du rythme et de la mesure avec applications spéciales aux sensations visuelles et auditives. Charles Verdin, Paris. Henry, C. (1891). Physiologie générale des sensations et esthétique. Notices sur les travaux scientifiques de M. Charles Henry. Imprimerie des sciences mathématiques et physiques, Rome. Henry, C. (1895). Quelques aperçus sur l’Esthétique des formes. La Revue blanche, 6–61. Herzog, T.R. and Bryce, A.G. (2007). Mystery and preference in within-forest settings. Environment and Behavior, 39(6), 779–796. Hii, Y.L., See, J., Kairanbay, M., Wong, L.K. (2017). Multigap: Multi-pooled inception network with text augmentation for aesthetic prediction of photographs. In IEEE International Conference on Image Processing (ICIP), 1722–1726. Hillyard, S.A. and Anllo-Vento, L. (1998). Event-related brain potentials in the study of visual selective attention. Proceedings of the National Academy of Sciences, 95(3), 781–787. Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. Hoenig, F. (2005). Defining computational aesthetics. In 2005 Computational Aesthetics in Graphics, Visualization and Imaging Conf. The Eurographics Association, 13–18. Hoffer, E. and Ailon, N. (2015). Deep metric learning using triplet network. Lecture Notes in Computer Science, 9370. Hong, R., Zhang, L., Tao, D. (2016). Unified photo enhancement by discovering aesthetic communities from flickr. IEEE Transactions on Image Processing, 25(3), 1124–1135. Hopkins, R. (2000). Beauty and testimony. In Philosophy, the Good, the True and the Beautiful, O’Hear, A. (ed.). Cambridge University Press, Cambridge. Hopkins, R. (2001). 
Kant, quasi-realism and the autonomy of aesthetic judgement. European Journal of Philosophy, 9(2), 166–189. Hosu, V., Goldlucke, B., Saupe, D. (2019). Effective aesthetics prediction with multi-level spatially pooled features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 9375–9383. Howard, A., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Hartwig, A. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv, 1704.04861. Hsieh, L.C., Hsu, W.H., Wang, H.C. (2014). Investigating and predicting social and visual image interestingness on social media by crowdsourcing. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference, 4309–4313. Hsu, L. (2009). Le visible et l’expression. Étude sur la relation intersubjective entre perception visuelle, sentiment esthétique et forme picturale. Doctoral thesis, École des Hautes Études en sciences sociales, Paris. Hubner, R. and Fillinger, M. (2016). Comparison of objective measures for predicting perceptual balance and visual aesthetic preferences. Frontiers in Psychology, 7(335), 1–15.
Huneman, P. and Kulich, E. (1997). Introduction à la phénoménologie. Armand Colin, Paris. Hunter, M.R. and Askarinejad, A. (2015). Designer’s approach for scene selection in tests of preference and restoration along a continuum of natural to manmade environments. Front Psychol., 6, 1228. Hurlberg, A. and Ling, Y. (2007). Biological components of sex differences in color preference. Current Biology, 17(16), R623–R625. Hurlberg, A. and Ling, Y. (2012). Understanding colour perception and preference. In Colour Design, Theories and Applications, Best, J. (ed.). Woodhead Publishing, Cambridge. Ibarra, F.F., Kardan, O., Hunter, M.R., Kotabe, H.P., Meyer, F.A., Berman, M.G. (2017). Image feature types and their predictions of aesthetic preference and naturalness. Frontiers in Psychology, 8, 632–648. Ichikawa, S. (1985). Quantitative and structural factors in the judgment of pattern complexity. Perception & Psychophysics, 38(2), 101–109. Ingarden, R. (2011). Esthétique et ontologie de l’œuvre d’art. Choix de textes 1937–1969. Vrin, Paris. Iseminger, G. (1981). Aesthetic appreciation. The Journal of Aesthetics and Art Criticism, 39(4), 389–397. Ishaghpour, Y. (2009). La miniature persane. Verdier, Lagrasse. Ishizu, T. and Zeki, S. (2011). Toward a brain-based theory of beauty. PLOS One, 6(7), e21852. Isola, P., Xiao, J., Torralba, A., Oliva, A. (2011). What makes an image memorable? In IEEE Conference on Computer Vision and Pattern Recognition, 145–152. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A. (2017). Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1125–1134. Itten, J. (1973). The Art of Color. Wiley, New York. Jacobsen, T. (2010). Beauty and the brain: Culture, history and individual differences in aesthetic appreciation. Journal of Anatomy, 216(2), 184–191. Jacobsen, T., Schubotz, R., Höfel, L., Cramon, D. (2006). Brain correlates of aesthetic judgment of beauty. 
Neuroimage, 29, 276–285. Jin, W., Ho, H., Srihari, R. (2009). OpinionMiner: A novel machine learning system for web opinion mining and extraction. In Proc. ACM SIGKDD, KDD’09, 1195–1204. Jin, B., Ortiz-Segovia, V., Süsstrunk, S. (2016a). Image aesthetic predictors based on weighted CNNs. In ICIP, Int. Conf. on Image Processing 2016, 2294–2295. Jin, X., Chi, J., Peng, S., Tian, Y., Ye, C., Li, X. (2016b). Deep image aesthetics classification using inception modules and fine-tuning connected layer. In 8th IEEE International Conference on Wireless Communications & Signal Processing (WCSP), 1–6. Jin, X., Wu, L., Zhao, G., Li, X., Zhang, X., Ge, S., Zhou, X. (2019). Aesthetic attributes assessment of images. In Proceedings of the 27th ACM International Conference on Multimedia, 311–319. Jin, X., Wu, L., Zhao, G., Zhou, X., Zhang, X., Li, X. (2020). Idea: A new dataset for image aesthetic scoring. Multimed Tools Applications, 79, 14341–14355.
John, O., Donahue, E., Kentle, R. (1991). Big five inventory. Journal of Personality and Social Psychology, 66. Jonauskaite, D., Abdel-Khalek, A., Abu-Akel, A., Al-Rasheed, A., Antonietti, J., Asgeirsson, A., Meziane, M. (2019). The sun is no fun without rain: Physical environments affect how we feel about yellow across 55 countries. Journal of Environmental Psychology, 66, 101350. Jullien, F. (2003). La grande image n’a pas de forme. Le Seuil, Paris. Jullien, C. (2017). Les mathématiques : le langage de la beauté. Maths Langages Express, Comité International des Jeux Mathématiques, Paris, 93–98. Kairanbay, M., See, J., Wong, L.K., Hii, Y.L. (2017). Filling the gaps: Reducing the complexity of networks for multi-attribute image aesthetic prediction. In IEEE International Conference on Image Processing (ICIP), 3051–3055. Kairanbay, M., See, J., Wong, L.K. (2019). Beauty is in the eye of the beholder: Demographically oriented analysis of aesthetics in photographs. ACM Transactions on Multimedia Computing, Communications, and Applications, 15(2s), 1–21. Kandinsky, W. (1954). Du spirituel dans l’art, et dans la peinture en particulier. Denoël, Paris. Kant, I. (2015). Critique of Judgment. Project Gutenberg [Online]. Available at: https://www.gutenberg.org/cache/epub/48433/pg48433-images.html. Kao, Y., He, R., Huang, K. (2016a). Visual aesthetic quality assessment with multi-task deep learning. arXiv, 1604.04970. Kao, Y., Huang, K., Maybank, S. (2016b). Hierarchical aesthetic quality assessment using deep convolutional neural networks. Signal Processing: Image Communication, 47, 500–510. Kao, Y., He, R., Huang, K. (2017a). Automatic image cropping with aesthetic map and gradient energy map. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 47, 500–510. Kao, Y., He, R., Huang, K. (2017b). Deep aesthetic quality assessment with semantic information. IEEE Transactions on Image Processing, 26(3), 1482–1495. 
Kaplan, S., Kaplan, R., Wendt, J.S. (1972). Rated preference and complexity for natural and urban visual material. Perception & Psychophysics, 12(4), 354–356. Kardan, O., Demiralp, E., Hout, M.C., Hunter, M.R., Karimi, H., Hanayik, T., Yourganov, G., Jonides, J., Berman, M.G. (2017). Is the preference of natural versus man-made scenes driven by bottom-up processing of the visual features of nature? Frontiers in Psychology, 6, 471–484. Karpathy, A. and Fei-Fei, L. (2015). Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3128–3137. Kato, T. and Taishi Matsumoto, T. (2020). Morphological evaluation of closed planar curves and its application to aesthetic evaluation. Graphical Models, 109, 10106. Kawabata, H. and Zeki, S. (2004). Neural correlates of beauty. Journal of Neurophysiology, 91(4), 1699–1705. Ke, Y., Tang, X., Jing, F. (2006). The design of high-level features for photo quality assessment. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), 1, 419–426.
Kemp, M. (1990). The Science of Art, Optical Themes in Western Art from Brunelleschi to Seurat. Yale University Press, New Haven, CT. Khosla, A., Xiao, J., Isola, P., Torralba, A., Oliva, A. (2012). Image memorability and visual inception. In SIGGRAPH Asia 2012 Technical Briefs, 35. Kim, J., Yoon, S., Pavlovic, V. (2013). Relative spatial features for image memorability. In Proceedings of the 21st ACM International Conference on Multimedia, 761–764. Kim, W.H., Choi, J.H., Lee, J.S. (2015). Subjectivity in aesthetic quality assessment of digital photographs: Analysis of user comments. In Proceedings of the 23rd ACM International Conference on Multimedia, 983–986. Kirsch, L.P., Urgesi, C., Cross, E.S. (2016). Shaping and reshaping the aesthetic brain: Emerging perspectives on the neurobiology of embodied aesthetics. Neuroscience & Biobehavioral Reviews, 62, 56–68. Kirwan, J. (2011). Esthétiques sans esthétique. Diogène, 1, 253–263. Kivy, P. (1968). Aesthetic aspects and aesthetic qualities. The Journal of Philosophy, LXV(4), 85–93. Koch, M., Denzler, J., Redies, C. (2010). 1/f² characteristics and isotropy in the Fourier power spectra of visual art, cartoons, comics, mangas, and different categories of photographs. PLoS One, 5(8), e12268. Koelsch, S., Jacobs, A.M., Menninghaus, W., Liebal, K., Klann-Delius, G., von Scheve, C., Gebauer, G. (2015). The quartet theory of human emotions: An integrative and neurofunctional model. Physics of Life Reviews, 13, 1–27. Konecni, V. (1979). Determinants of aesthetic preference and effects of exposure to aesthetic stimuli: Social, emotional and cognitive factors. Prog. Exp. Pers. Res., 9, 149–197. Kong, S., Shen, X., Lin, Z., Mech, R., Fowlkes, C. (2016). Photo aesthetics ranking network with attributes and content adaptation. In ECCV European Conference on Computer Vision, 662–679. Koshelev, M., Kreinovich, V., Yam, Y. (1998). Towards the use of aesthetics in decision making: Kolmogorov complexity formalizes Birkhoff’s idea. 
University of Texas Library, Digital Commons, El Paso, TX. Kosinski, M., Stillwell, D., Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802–5805. Kosinski, M., Bachrach, Y., Kohli, P., Stillwell, D., Graepel, T. (2014). Manifestations of user personality in website choice and behaviour on online social networks. Machine Learning, 95(3), 357–380. Kotabe, H.P., Kardan, O., Berman, M.G. (2017). The nature-disorder paradox: A perceptual study on how nature is disorderly yet aesthetically preferred. Journal of Experimental Psychology: General, 146(8), 1126–1142. Krantz, E. (2016). Essai sur l’esthétique de Descartes, étudiée dans les rapports de la doctrine cartésienne avec la littérature classique française au XVIIe siècle (1898). BNF, Paris. Kreinovich, V., Longpre, L., Koshelev, M. (1998). Kolmogorov complexity, statistical regularization of inverse problems, and Birkhoff’s formalization of beauty. Bayesian Inference for Inverse Problems, 3459, 159–170.
Kringelbach, M. (2005). The human orbito-frontal cortex: Linking reward to hedonic experience. Nat. Rev. Neurosciences, 6, 691–702. Krizhevsky, A., Sutskever, I., Hinton, G. (2017). Imagenet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. Kuehn, S. and Gallinat, J. (2012). The neural correlates of subjective pleasantness. Neuroimage, 61, 289–294. Kurdi, B., Lozano, S., Banaji, M.R. (2017). Introducing the open affective standardized image set (OASIS). Behavior Research Methods, 49(2), 457–470. Lacey, S., Hagtvedt, H., Patrick, V.M., Anderson, A., Stilla, R., Deshpande, G., Hu, X., Sato, J., Reddy, S., Sathian, K. (2011). Art for reward’s sake: Visual art recruits the ventral striatum. Neuroimage, 55(1), 420–433. Lacroix, A. (2018). Devant la beauté de la nature. Allary Éditions, Paris. Lakhal, S., Darmon, A., Bouchaud, J., Benzaquen, M. (2020). Beauty and structural complexity. Physical Review Research, 2(2), 022058. Lamarque, P. (2010). The uselessness of art. The Journal of Aesthetics and Art Criticism, 68(3), 205–214. Lane, R. (1962). L’estampe japonaise. Aimery Somogy, Paris. Lang, P.J., Bradley, M.M., Cuthbert, B.N. (1999). International affective picture system (IAPS): Instruction manual and affective ratings (tech. rep. no. a-4). Technical report, University of Florida, Gainesville, FL. Lange, C. and James, W. (1922). The Emotions. Williams & Wilkins, Philadelphia, PA. Laughlin, S. (1987). Form and function in retinal processing. Trends in Neurosciences, 10(11), 478–483. Le Bihan, D. (2012). Le cerveau de cristal. Odile Jacob, Paris. Leder, H. and Nadal, M. (2014). Ten years of a model of aesthetic appreciation and aesthetic judgments: The aesthetic episode-developments and challenges in empirical aesthetics. British Journal of Psychology, 105(4), 443–464. Leder, H., Belke, B., Oeberst, A., Augustin, D. (2004). A model of aesthetic appreciation and aesthetic judgements. 
British Journal of Psychology, 95, 489–508. Leder, H., Carbon, C.C., Ripsas, A.L. (2006). Entitling art: Influence of title information on understanding and appreciation of paintings. Acta Psychologica, 121(2), 176–198. LeDoux, J. (2000). Emotion circuits in the brain. Ann. Rev. Neurosciences, 23, 155–184. Lengger, P.G., Fischmeister, F.P.S., Leder, H., Bauer, H. (2007). Functional neuroanatomy of the perception of modern art: A DC–EEG study on the influence of stylistic information on aesthetic experience. Brain Research, 1158, 93–102. Levinson, J. (1980). Aesthetic uniqueness. The Journal of Aesthetics and Art Criticism, 38(4), 435–449. Li, X. (2002). The aesthetic of the absent: The chinese conception of space. The Journal of Architecture, 7(1), 87–101. Li, L., Zhu, H., Zhao, S., Ding, G., Lin, W. (2020). Personality-assisted multi-task learning for generic and personalized image aesthetics assessment. IEEE Transactions on Image Processing, 29, 3898–3910.
Lienhard, A., Ladret, P., Caplier, A. (2015). How to predict the global instantaneous feeling induced by a facial picture? Signal Processing: Image Communication, 39, 473–486. Ling, Y. and Hurlbert, A. (2007). A new model for color preference: Universality and individuality. In Color and Imaging Conference, 8–11. Lipps, T. (1903). Ästhetik: Psychologie des Schönen und der Kunst. Voss, L., Hamburg. Liu, L., Chen, R., Wolf, L., Cohen-Or, D. (2010). Optimizing photo composition. Computer Graphics Forum, 29(2), 469–478. Liu, J., Lughofer, E., Zeng, X. (2017). Toward model building for visual aesthetic perception. Computational Intelligence and Neuroscience, 1–13. Liu, J., Lughofer, E., Zeng, X., Li, Z. (2018). The power of visual texture in aesthetic perception: An exploration of the predictability of perceived aesthetic emotions. Computational Intelligence and Neuroscience, 1–9. Liu, D., Puri, R., Kamath, N., Bhattacharya, S. (2020). Composition-aware image aesthetics assessment. In The IEEE Winter Conference on Applications of Computer Vision, 3569–3578. Livingstone, M. (2002). Vision and Art: The Biology of Seeing. Harry N. Abrams, Inc. Publishers, New York. Livio, M. (2008). The Golden Ratio: The story of Phi, the World’s Most Astonishing Number. Broadway Books, New York. Lo, K., Liu, K., Chen, C. (2012). Assessment of photo aesthetics with efficiency. In IEEE International Conference on Pattern Recognition (ICPR), 2186–2189. Lo, K., Liu, K., Chen, C. (2013). Intelligent photographing interface with on-device aesthetic quality assessment. In ACCV 2012 Workshop, vol. LNCS 7729, Parks, J. and Kim, J. (eds). Springer Verlag, Berlin. Locke, J. (2009). Essai sur l’entendement humain, translated by P. Coste. Le Livre de Poche, Paris. Lorand, R. (1989). Free and dependent beauty: A puzzling issue. British Journal of Aesthetics, 29(1), 32–40. Lovato, P., Perina, A., Cheng, D.S., Segalin, C., Sebe, N., Cristani, M. (2013). We like it! 
Mapping image preferences on the counting grid. In 2013 IEEE International Conference on Image Processing, 2892–2896. Lu, P., Kuang, Z., Peng, X., Li, R. (2014a). Discovering harmony: A hierarchical colour harmony model for aesthetics assessment. Asian Conference on Computer Vision, 452–467. Lu, X., Lin, Z., Jin, H., Yang, J., Wang, J. (2014b). RAPID: Rating pictorial aesthetics using Deep Learning. 22nd Int. Conf. on Multimedia MM’14, 457–466. Lu, X., Lin, Z., Jin, H., Yang, J., Wang, J.Z. (2014c). Rating image aesthetics using deep learning. IEEE Transactions on Multimedia, 17(11), 2021–2034. Lu, P., Peng, X., Li, R., Wang, X. (2015a). Towards aesthetics of image: A Bayesian framework for color harmony modeling. Signal Processing: Image Communication, 39, 487–498. Lu, X., Lin, Z., Shen, X., Mech, R., Wang, J.Z. (2015b). Deep multi-patch aggregation network for image style, aesthetics, and quality estimation. In Proceedings of the IEEE International Conference on Computer Vision, 990–998.
Lu, P., Peng, X., Zhu, X., Li, R. (2016). An EL-LDA based general color harmony model for photo aesthetics assessment. Signal Processing, 120, 731–745. Lucken, M. (2014). Les limites du ma. Retour à l’émergence d’un concept “japonais”. Nouvelle revue d’esthétique, 13(1), 45–67. Luffarelli, J., Stamatogiannakis, A., Yang, H. (2019). The visual asymmetry effect: An interplay of logo design and brand personality on brand equity. Journal of Marketing Research, 56(1), 89–103. Luo, Y. and Tang, X. (2008). Photo and video quality evaluation: Focusing on the subject. In ECCV 2008, vol. LNCS 5304, Forsyth, D. and Zisserman, A. (eds). Springer-Verlag, Berlin-Heidelberg. Lv, P., Wang, M., Xu, Y., Peng, Z., Sun, J., Su, S., Xu, M. (2018). Usar: An interactive user-specific aesthetic ranking framework for images. In Proceedings of the 26th ACM International Conference on Multimedia, 1328–1336. Ma, S., Liu, J., Chen, C. (2017). A-lamp: Adaptive layout-aware multi-patch deep convolutional neural network for photo aesthetic assessment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4535–4544. Machado, P. and Cardoso, A. (1998). Computing aesthetics. In Brazilian Symposium on Artificial Intelligence, 219–228. Machajdik, J. and Hanbury, A. (2010). Affective image classification using features inspired by psychology and art theory. In MM’10, ACM Conference, 83–92. Mai, L., Niu, Y., Liu, F. (2016). Composition preserving deep photo aesthetic assessment. In Proc. IEEE Conf. Comp. Vision and Pattern Recognition, 497–506. Maître, H. (2017). From Photon to Pixel: The Digital Camera Handbook. ISTE Ltd, London and John Wiley & Sons, New York. Maître, H. (2020). Juger du beau avec subjectivité : le défi de l’esthétique computationnelle. Arts et Sciences, Open Science, 4(4), 80–96. Malinowski-Charles, S. (2004). Entre rationalisme et subjectivisme : l’esthétique de Jean-Pierre de Crousaz. Revue de théologie et de philosophie, 136, 7–21. 
Malinowski-Charles, S. (2005). Baumgarten et le rôle de l’intuition dans les débuts de l’esthétique. Études philosophiques, 75(4), 537–558. Marchesotti, L., Perronnin, F., Larlus, D., Csurka, G. (2011). Assessing the aesthetic quality of photographs using generic image descriptors. In Int. Conf. on Computer Vision, 1784–1791. Marchesotti, L., Perronnin, F., Meylan, F. (2013). Learning beautiful (and ugly) attributes. BMVC, 7, 1–11. Markowsky, G. (1992). Misconceptions about the golden ratio. The College Mathematics Journal, 23(1), 2–19. Markus, H. and Kitayama, S. (1991). Culture and the self: Implications for cognition, emotion, and motivation. Psychological Review, 98(2), 224. Marquet, J. (1983). Contribution à l’histoire de la philosophie moderne. PUF, Paris. Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W.H. Freeman and Company, New York.
Masuda, T., Gonzalez, R., Kwan, L., Nisbett, R. (2008). Culture and aesthetic preference: Comparing the attention to context of East Asians and Americans. Personality and Social Psychology Bulletin, 34(9), 1260–1275. Mavridaki, E. and Mezaris, V. (2015). A comprehensive aesthetic quality assessment method for natural images using basic rules of photography. In 2015 IEEE International Conference on Image Processing (ICIP), 887–891. McClure, S., Li, J., Tomlin, D., Montague, L.M., Montague, P. (2004). Neural correlates of behavioral preferences for culturally familiar drinks. Neuron, 44(2), 379–387. McManus, I., Jones, A., Cottrell, J. (1981). The aesthetic of colour. Perception, 10, 651–666. McManus, I., Edmonson, D., Rodger, J. (1985). Balance in picture. British Journal of Psychology, 76, 311–324. McManus, I., Stöver, K., Kim, D. (2011). Arnheim’s Gestalt theory of visual balance: Examining the compositional structure of art photographs and abstract images. i-Perception, 2(6), 615–647. Mehrabian, A. (1977). Individual differences in stimulus screening and arousability. Journal of Personality and Social Psychology, 45, 237–250. Meidenbauer, K.L., Stenfors, C.U., Young, J., Layden, E.A., Schertz, K.E., Kardan, O., Decety, J., Berman, M.G. (2019). The gradual development of the preference for natural environments. Journal of Environmental Psychology, 65, 101328, 1–11. Melay, A. (2019). Entre abstraction, multiplicité et matérialité dans l’esthétique japonaise. Nouvelle revue d’esthétique, 23, 95–105. Melikian-Chirvani, A. (2007). Le chant du monde : l’art de l’Iran safavide. Somogy Éditions d’art, Paris. Merleau-Ponty, M. (1945). Phénoménologie de la perception. Gallimard, Paris. Merleau-Ponty, M. (1964). L’œil et l’esprit. Gallimard, Paris. Messina, P., Dominguez, V., Parra, D., Trattner, C., Soto, A. (2017). Exploring content-based artwork recommendation with metadata and visual features. arXiv, 1706.05786. 
Messina, P., Dominguez, V., Parra, D., Trattner, C., Soto, A. (2018). Content-based artwork recommendation: Integrating painting metadata with neural and manually-engineered visual features. User Modeling and User-Adapted Interaction, 29, 251–290. Mittal, A., Soundararajan, R., Bovik, A.C. (2013). Making a “completely blind” image quality analyzer. IEEE Signal Processing Letters, 20(3), 209–212. Mitteau, A. (2015). Beauté et pluralité chez Ernest Fenollosa (1853–1908) et Okakura Tenshin (1862–1913). Une application du paradigme de l’esthétique universaliste à l’art japonais ancien, et sa mise à l’épreuve. Doctoral thesis, Université Paris-Sorbonne, Paris. Mohr, C., Jonauskaite, D., Dan-Glauser, E., Uusküla, M., Dael, N. (2018). Unifying research on colour and emotion: Time for a cross-cultural survey on emotion associations to colour terms. Progress in Colour Studies: Cognition, Language and Beyond, 209–222. Moles, A.A. (1957). Théorie de l’information et perception esthétique. Revue philosophique de la France et de l’étranger, 147, 233–242. Molnar, F. (1974). Experimental aesthetics or the science of art. Leonardo, 7(1), 23–26.
Molnar, F. (1977). Mouvement des yeux et l’hypothèse des explorations épistémiques et diversives. Journal national de la Société française de psychologie. Molnar, F. (1981). Towards science in art. In Advances in Intrinsic Motivation and Aesthetics, Hill, A. (ed.). Plenum Press, New York. Molnar, F. (1997). A science of vision for visual art. Leonardo, 30(3), 225–232. Moon, P. and Spencer, D. (1944a). Aesthetic measure applied to color harmony. Journal of the Optical Society of America, 34, 234–242. Moon, P. and Spencer, D. (1944b). Area in color harmony. Journal of the Optical Society of America, 34, 93–103. Moon, P. and Spencer, D. (1944c). Geometric formulation of classical color harmony. Journal of the Optical Society of America, 34(1), 46–63. Moorthy, A.K. and Bovik, A.C. (2011). Blind image quality assessment: From natural scene statistics to perceptual quality. IEEE Transactions on Image Processing, 20(12), 3350–3364. Morin, E. (2016). Sur l’esthétique. Robert Laffont, Paris. Morizot, J. (1996). La philosophie de l’Art de Nelson Goodman. Jacqueline Chambon, Paris. Moshagen, M. and Thielsch, M.T. (2010). Facets of visual aesthetics. International Journal of Human-Computer Studies, 66(10), 689–709. Motoyoshi, I., Ishida, S., Sharan, L., Adelson, E. (2007). Image statistics and the perception of surface qualities. Nature, 447, 206–209. Mueser, K., Grau, B., Sussman, S., Rosen, A. (1984). You’re only as pretty as you feel: Facial expression as a determinant of physical attractiveness. Journal of Personality and Social Psychology, 46(2), 469. Murray, N. and Gordo, A. (2017). A deep architecture for unified aesthetic prediction. arXiv, 1708.04890. Murray, N., Marchesotti, L., Perronnin, F. (2012). AVA: A large-scale database for aesthetic visual analysis. In CVPR: Intern. Conf. on Comp. Vision and Pattern Recognition, 2408–2415. Murray, N., Marchesotti, L., Perronnin, F. (2013). Methods and systems for ranking images using semantic and aesthetic models. U.S. 
Patent No 9,286,325, 15 March 2016. Nadal, M., Munar, E., Marty, G., Cela-Conde, C.J. (2010). Visual complexity and beauty appreciation: Explaining the divergence of results. Empirical Studies of the Arts, 28(2), 173–191. Naef, S. (2011). Y a-t-il une “question de l’image en Islam”? Téraèdre, Paris. Nagabandi, A., Kahn, G., Fearing, R., Levine, S. (2018). Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning. In 2018 IEEE International Conference on Robotics and Automation (ICRA), 7559–7606. Neveux, M. (1995). Le nombre d’or, radiographie d’un mythe. Le Seuil, Paris. Nietzsche, F. (2005). Crépuscule des idoles, translated by Patrick Wotlin. Garnier-Flammarion, Paris. Nisbett, R.E. and Masuda, T. (2003). Culture and point of view. Proceedings of the National Academy of Sciences, 100(19), 11163–11170.
Nodine, C., Locher, P., Krupinski, E. (1993). The role of formal art training on perception and aesthetic judgement of art compositions. Leonardo, 26(3), 219–227. Novalis (2005). Art et Utopie, les derniers fragments (1799–1800). Presse de l’École normale, Paris. O’Connor, Z. (2010). Colour harmony revisited. Color Research & Application, 35(4), 267–273. O’Donovan, P., Agarwala, A., Hertzmann, A. (2014). Collaborative filtering of color aesthetics. In Proceedings of the Workshop on Computational Aesthetics, 33–40. Obrador, P., Schmitt-Hackenberg, L., Oliver, N. (2010). The role of image composition in image aesthetics. In Proc. of the 17th IEEE Int. Conf. on Image Processing, 3185–3188. Ou, L., Luo, M., Woodcock, A., Wright, A. (2004). A study of colour emotion and colour preference: Part I. Colour emotions of single colours. Color Research and Applications, 29(3), 232–240. Palmer, S.E. and Schloss, K.B. (2010). An ecological valence theory of human color preference. Proceedings of the National Academy of Sciences, USA, 107(19), 8877–8882. Palmer, S.E., Schloss, K.B., Sammartino, J. (2013). Visual aesthetics and human preference. Annual Review of Psychology, 64, 77–107. Pamuk, O. (1998). My Name is Red, translated by Erdag Goknar. Kindle edition. Panofsky, E. (1983). Meaning in the Visual Arts. University of Chicago Press, Chicago, IL. Panofsky, E. (1989). Idea. Les Éditions de Minuit, Paris. Park, K., Hong, S., Baek, M., Han, B. (2017). Personalized image aesthetic quality assessment by joint regression and ranking. In IEEE Winter Conference on Applications of Computer Vision (WACV), 1206–1214. Parkes, G. and Loughnane, A. (2005). Japanese aesthetics. In The Stanford Encyclopedia of Philosophy (Winter 2018 Edition) [Online]. Available at: https://plato.stanford.edu/archives/win2018/entries/japanese-aesthetics/. Pascual-Marqui, R.D. (1999). Review of methods for solving the EEG inverse problem. International Journal of Bioelectromagnetism, 1(1), 75–86. 
Pasquier, D., Beaudoin, V., Legon, T. (2014). “Moi je lui donne 5/5”. Paradoxes de la critique amateur en ligne. Presses des Mines, Paris. Patzer, G. (2012). The Physical Attractiveness Phenomena. Springer Science & Business Media, Berlin/Heidelberg. Pearce, M.T., Zaidel, D.W., Vartanian, O., Skov, M., Leder, H., Chatterjee, A., Nadal, M. (2016). Neuroaesthetics: The cognitive neuroscience of aesthetic experience. Perspectives on Psychological Science, 11(2), 265–279. Peck, H. and Peck, S. (1970). A concept of facial esthetics. The Angle Orthodontist, 40(4), 284–317. Pennington, J., Socher, R., Manning, C. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. Pepperell, R. (2005). The Posthuman Manifesto. Kritikos, 2, 1552–5112.
Petitot, J. (2008). Neurogéométrie de la vision. Les Éditions de l’École polytechnique, Palaiseau. Plato (2011). Plato, œuvres complètes. Flammarion, Paris. Plutchik, R. (1991). The Emotions. University Press of America, Lanham, MA. Pouivet, R. (1996). Compétence, survenance et émotion esthétique. Revue internationale de philosophie, 198(4), 635–649. Pouivet, R. (2006). Le réalisme esthétique. PUF, Paris. Ramachandran, V.S. and Freeman, A. (2001). Sharpening up “the science of art”. Journal of Consciousness Studies, 8(1), 9–30. Rammstedt, B. and John, O. (2007). Measuring personality in one minute or less: A 10-item short version of the Big Five inventory in English and German. Journal of Research in Personality, 41(1), 203–212. Reagle, J. (2013). Revenge rating and tweak critique at photo.net [Online]. Available at: http://reagle.org/joseph/2013/photo.net.html. Reber, R., Schwarz, N., Winkielman, P. (2004). Processing fluency and aesthetic pleasure: Is beauty in the perceiver’s processing experience? Personality and Social Psychology Review, 8(4), 364–382. Redi, M., Liu, F., O’Hare, N. (2017). Bridging the aesthetics gap: The wild beauty of web imagery. In ICMR’17 Conference, 242–250. Redies, C. (2015). Combining universal beauty and cultural context in a unifying model of visual aesthetic experience. Frontiers in Human Neuroscience, 9, 218. Ren, J., Shen, X., Lin, Z., Mech, R., Foran, D.J. (2017). Personalized image aesthetics. In Proceedings of the IEEE International Conference on Computer Vision, 638–647. Renault, B. (2004). Imagerie cérébrale fonctionnelle électrique et magnétique. Hermes-Lavoisier, Paris.
Richie, D. (2016). Traité d’esthétique japonaise. Sully-Le Prunier, Vannes. Rigau, J., Feixas, M., Sbert, M. (2008). Information aesthetic measures. IEEE Computer Graphics and Applications, 2, 24–34. Rosenberg, R. and Klein, C. (2015). Art, Aesthetics and the Brain. Oxford University Press, Oxford. Roux, J. (2002). La miniature iranienne, un art figuratif en terre d’islam [Online]. Available at: https://www.clio.fr/BIBLIOTHEQUE/la_miniature_iranienne_un_art_figuratif_en_terre_d_islam.asp. Roux, J. (2007). Dictionnaire des arts de l’Islam. Fayard, Paris. Rosenkranz, K. (1853). Ästhetik des Häßlichen. Heidelberg. Russell, J.A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178. Ryckmans, P. (2007). Les propos sur la peinture du moine Citrouille-Amère. Plon, Paris. Saint Girons, B. (2005). Le sublime de l’Antiquité à nos jours. Desjonquères, Paris. San Pedro, J. and Siersdorfer, S. (2009). Ranking and classifying attractiveness of photos in folksonomies. In Proc. ACM Conf. on World Wide Web WWW’09, 771–780.
San Pedro, J., Yeh, T., Oliver, N. (2012). Leveraging user comments for aesthetic aware image search reranking. In Proceedings of the 21st International Conference on World Wide Web, 439–448. Santayana, G. (2002). Le sentiment de la Beauté : esquisse d’une théorie esthétique. Presses universitaires de Pau, Pau. Schaper, E. (1964). Kant on aesthetics appraisals. Kant Studien, 1–4, 431–449. Schendan, H.E. and Kutas, M. (2007). Neurophysiological evidence for transfer appropriate processing of memory: Processing versus feature similarity. Psychonomic Bulletin & Review, 14(4), 612–619. Scherer, K. (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695–729. Schifanella, R., Redi, M., Aiello, L. (2015). An image is worth more than thousand favourites: Surfacing the hidden beauty of Flickr pictures. arXiv, 1505.03358v2. Schirpke, U., Tasser, E., Tappeiner, U. (2013). Predicting scenic beauty of mountain regions. Landscape and Urban Planning, 111, 1–12. Schloss, K.B. and Palmer, S.E. (2011). Aesthetic response to colour combinations: Preference, harmony and similarity. Attention Perceptual Psychophysics, 73, 551–571. Schmidhuber, J. (1997). Low-complexity art. Leonardo, Journal of the International Society for the Arts, Sciences, and Technology, 30(2), 97–103. Schmidhuber, J. (2007). Simple algorithmic principles of discovery, subjective beauty, selective attention, curiosity and creativity. arXiv, 0709.0674. Schopenhauer, A. (1966). Le monde comme volonté et comme représentation. PUF, Paris. Schupp, H., Cuthbert, B., Bradley, M., Hillman, C., Hamm, A., Lang, P. (2004). Brain processes in emotional perception: Motivated attention. Cognition and Emotion, 18(5), 593–611. Schuster, R., Mörzinger, R., Haas, W., Grabner, H., Van Gool, L. (2010). Real-time detection of unusual regions in image streams. In Proceedings of the 18th ACM International Conference on Multimedia, 1307–1310. Schwarz, K., Wieschollek, F., Lensch, H. (2016). 
Will people like your image? arXiv, 1611.05203.
Schweinhart, A. and Essock, E. (2013). Structural content in paintings: Artists overregularize oriented content of paintings relative to the typical natural scene bias. Perception, 42, 1311–1332. Séailles, G. (1877). L’esthétique de Hartmann. Revue philosophique de la France et de l’étranger, IV, 483–495. Séailles, G. (1883). Essai sur le génie dans l’art. Germer Baillière, Paris. Segalin, C., Cristani, M., Perina, A., Vinciarelli, A. (2016). The pictures we like are our image: Continuous mapping of favorite pictures into self-assessed and attributed personality traits. IEEE Trans. Affect. Comput., 99, 1–1. Segalin, C., Cheng, D., Cristani, M. (2017). Social profiling through image understanding: Personality inference using convolutional neural networks. Computer Vision and Image Understanding, 156, 34–50.
Sheng, K., Dong, W., Ma, C., Mei, X., Huang, F., Hu, B.G. (2018). Attention-based multi-patch aggregation for image aesthetic assessment. In Proceedings of the 26th ACM International Conference on Multimedia, 879–886. Sibley, F. (1959). Aesthetic concepts. Philosophical Review, LXVIII, 421–450. Simond, F., Arvanitopoulos, N., Süsstrunk, S. (2015). Image aesthetics depends on context. In IEEE International Conference on Image Processing, 3788–3792. Skov, M. (2009). Neuroaesthetic problems: A framework for neuroaesthetic research. In Neuroaesthetic, Skov, M. and Vartanian, O. (eds). Baywood, New York. Smith, D. (2008). Color-person-environment relationship. Color Research & Applications, 33, 312–319. Solso, R. (1996). Cognition and the Visual Arts, 5th edition. The MIT Press, Cambridge, MA. Sperber, D. and Wilson, D. (2004). Relevance theory. In The Handbook of Pragmatics, Horn, G. and Ward, L.R. (eds). Blackwell, Oxford. Srivastava, M.M. and Kant, S. (2018). Visual aesthetic analysis using deep neural network: Model and techniques to increase accuracy without transfer learning. arXiv, 1712.03382. Su, H.H., Chen, T.W., Kao, C.C., Hsu, W.H., Chien, S.Y. (2012). Preference-aware view recommendation system for scenic photos based on bag-of-aesthetics-preserving features. IEEE Transactions on Multimedia, 14(3), 833–843. Sun, W.T., Chao, T.H., Kuo, Y.H., Hsu, W.H. (2017). Photo filter recommendation by category-aware aesthetic learning. IEEE Transactions on Multimedia, 19(8), 1870–1880. Swami, V. (2013). Context matters: Investigating the impact of contextual information on aesthetic appreciation of paintings by Max Ernst and Pablo Picasso. Psychology of Aesthetics, Creativity, and the Arts, 7(3), 285–296. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z. (2015). Rethinking the inception architecture for computer vision. arXiv, 1512.00567. Talairach, J. and Tournoux, P. (1988). 
Co-planar Stereotaxic Atlas of the Human Brain: Three-Dimensional Proportional System – An Approach to Cerebral Imaging. Thieme Medical Publishers, New York. Talebi, H. and Milanfar, P. (2017). NIMA: Neural image assessment. arXiv, 17090541v1. Tanizaki, J. (2017). Louange de l’ombre. Philippe Picquier, Arles. Tappolet, C. (2000). Émotions et valeurs. PUF, Paris. Tatarkiewicz, W. (1970). History of Aesthetics. Mouton Editions, The Hague, Paris, Warsaw. Taylor, R., Spehar, B., Wise, J., Clifford, C.W., Newell, B.R., Hagerhall, C.M., Martin, T.P. (2005). Perceptual and physiological responses to the visual complexity of fractal patterns. Nonlinear Dynamics Psychol. Life. Sci., 9, 89–114. Taylor, R., Spehar, B., Hagerhall, C., Van Donkelaar, P. (2011). Perceptual and physiological responses to Jackson Pollock’s fractals. Frontiers in Human Neuroscience, 5, 60. Taylor, C., Clifford, A., Franklin, A. (2013). Color preferences are not universal. Journal of Experimental Psychology, 142(4), 1015–1027. Thoemmes, K. and Huebner, R. (2014). A picture is worth a word: The effect of titles on aesthetic judgments. In Proceedings of the Twenty-third Biennial Congress of the International Association of Empirical Aesthetics, 499–603.
Thorpe, S., Fize, D., Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381(6582), 520–521. Thumfart, S., Jacobs, R.H., Lughofer, E., Eitzinger, C., Cornelissen, F.W., Groissboeck, W., Richter, R. (2011). Modeling human aesthetic perception of visual textures. ACM Transactions on Applied Perception (TAP), 8(4), 27. Ticini, L.F., Rachman, L., Pelletier, J., Dubal, S. (2014). Enhancing aesthetic appreciation by priming canvases with actions that match the artist’s painting style. Frontiers in Human Neuroscience, 8, 391. de Tommaso, M., Pecoraro, C., Sardaro, M., Serpino, C., Lancioni, G., Livrea, P. (2008). Influence of aesthetic perception on visual event-related potentials. Consciousness and Cognition, 17(3), 933–945.
Trujillo, L.T., Jankowitsch, J.M., Langlois, J.H. (2014). Beauty is in the ease of the beholding: A neurophysiological test of the averageness theory of facial attractiveness. Cognitive, Affective, & Behavioral Neuroscience, 14(3), 1061–1076. Turpin, M., Walker, A., Kara-Yakoubian, M., Gabert, N., Fugelsang, J., Stolz, J. (2019). Bullshit makes the art grow profounder. Judgement and Decision Making, 14(6), 658–670. Tversky, A. and Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 451–458. Ulrich, R.S. (1983). Aesthetic and affective response to natural environment. In Behavior and the Natural Environment, Altman, I. and Wohlwill, J.F. (eds). Springer, Boston, MA. Uttal, W. (2002). Precis of the new phrenology: The limits of localizing processes in the brain. Brain and Mind, 3, 221–228. Van Dongen, N.N., Van Strien, J.W., Dijkstra, K. (2016). Implicit emotion regulation in the context of viewing artworks: ERP evidence in response to pleasant and unpleasant pictures. Brain and Cognition, 107, 48–54. Van Gogh, V. (1992). Dernières lettres. Mille et une nuits, Paris. Vartanian, O. and Skov, M. (2014). Neural correlates of viewing paintings: Evidence from a quantitative meta analysis of functional magnetic resonance imaging data. Brain Cogn., 87, 52–56. Verdier, F. (2001). L’unique trait de pinceau – Calligraphie, peinture et pensée chinoise. Albin Michel, Paris. Verdier, F. (2003). Passagère du silence. Albin Michel, Paris. Veryzer, J. and Hutchinson, J. (1998). The influence of unity and prototypicality on aesthetic responses to new product designs. Journal of Consumer Research, 24(4), 374–394. Vidal, F. (2011). La neuroesthétique, esthétique scientiste. Revue d’histoire des sciences humaines, 25(2), 239–264. Vidal, F. (2012). Neuroaesthetics: Getting rid of art and beauty. Biosocieties, 7(2), 209. Vienot, F. and Le Rohellec, J. (2012). Colorimetry and physiology – The LMS specification. 
In Digital Color: Acquisition, Perception, Coding and Rendering, Fernandez-Maloigne, C., Robert-Inacio, F., Macaire, L. (eds), ISTE Ltd, London and John Wiley & Sons, New York. Vigouroux, R. (1992). La fabrique du Beau. Odile Jacob, Paris.
Vinyals, O., Toshev, A., Bengio, S., Erhan, D. (2015). Show and tell: A neural image caption generator. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3156–3164. Wallis, J. (2007). Orbitofrontal cortex and its contribution to decision making. Annu. Rev. Neurosciences, 30, 31–56. Wang, W., Liu, J., Zhao, W., Li, J. (2014). A system of image aesthetic classification and evaluation using cloud computing. In Chinese Conference on Pattern Recognition. Communications in Computer and Information Science, Li, S., Liu, C., Wang, Y. (eds). Springer, Berlin/Heidelberg. Wang, W., Zhao, W.C., Huang, J., Xu, X., Li, L. (2015). An efficient image aesthetic analysis system using hadoop. Signal Processing: Image Communication, 39, 499–508. Wang, Z., Chang, S., Dolcos, F., Beck, D., Huang, T. (2016). Brain-inspired deep network for image aesthetic assessment. arXiv, 1601.04155v2. Wang, W., Yang, S., Zhang, W., Zhang, J. (2018). Neural aesthetic image reviewer. arXiv, 1802.10240. Wang, Y., Ke, Y., Wang, K., Zhang, C., Qin, F. (2020). Aesthetic quality assessment for group photograph. arXiv, 2002.01096. Wascheck, M. (2000). Le chef d’œuvre : un fait culturel. In Qu’est-ce qu’un Chef d’œuvre ? Gallimard, Paris. Wertheimer, M. (1922). Untersuchungen zur Lehre von der Gestalt - i - prinzipielle Bemerkungen. Psychologische Forschung, 1, 47–58. Wertheimer, M. (1923). Untersuchungen zur Lehre von der Gestalt - ii - laws of organization in perceptual forms. Psychologische Forschung, 4, 301–350. Wertheimer, M. (1938). Laws of organization in perceptual forms. In A Source Book of Gestalt Psychology, Ellis, W.D. (ed.). Kegan Paul, Trench, Trubner & Company, London. Whitfield, T.W. and Slatter, P.E. (1979). The effects of categorization and prototypicality on aesthetic choice in a furniture selection task. British Journal of Psychology, 70(1), 65–75. Wilber, M.J., Fang, C., Jin, H., Hertzmann, A., Collomosse, J., Belongie, S. (2017). Bam! 
The Behance artistic media dataset for recognition beyond photography. In Proc. Int. Conf. Comp. Vision, 1–4. Xun, J. (2015). Contemplation sur l’art. SDX Joint Publishing Company, Taiwan. Xun, J. (2016). Aube réelle. People’s Literature Publishing Society, Beijing. Yang, L., Hsieh, C.K., Estrin, D. (2015). Beyond classification: Latent user interests profiling from visual contents analysis. In 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 1410–1416. Yao, L., Suryanarayan, P., Qiao, M., Wang, J.Z., Li, J. (2012). Oscar: On-site composition and aesthetics feedback through exemplars for photographers. International Journal of Computer Vision, 96(3), 353–383. Yarbus, A. (1967). Eye Movements and Vision. Plenum Press, New York. You, Q., Bhatia, S., Luo, J. (2016). A picture tells a thousand words about you! User interest profiling from user generated visual content. Signal Processing, 124, 45–53.
Zeki, S. (1999). Inner Vision: An Exploration of Art and the Brain. Oxford University Press, Oxford. Zemach, E. (1987). Aesthetic properties, aesthetic laws and aesthetic principles. The Journal of Aesthetics and Art Criticism, 46(1), 239–251. Zemach, E. (1991). Real beauty. Midwest studies in Philosophy, XVI, 249–265. Zemach, E. (2005). La beauté réelle : une défense du réalisme esthétique. Presses universitaires de Rennes, Rennes. Zhang, J., Sclaroff, S., Lin, Z., Shen, X., Price, B., Mech, R. (2016). Unconstrained salient object detection via proposal subset optimization. In IEEE Conf. on Comp. Vision and Pattern Recognition (CVPR), 5733–5742. Zhu, H., Li, L., Wu, J., Zhao, S., Ding, G., Shi, G. (2020). Personalized image aesthetics assessment via meta-learning with bilevel gradient optimization. IEEE Transactions on Cybernetics, 52(3), 1798–1811.
Index
A A-Lamp (assessment method), 167 AADB (database of images), 127 acutance, 80 ACQUINE (assessment method), 136 aesthetics analytical, 58 contextualized or historical, 19 environmental, 65, 94 functional, 19 predicates, 11 properties versus non-aesthetic, 59 realism, 12 affordance, 50 Alberti, L.B., 4, 226, 227 AlexNet (DNN architecture), 156 algorithms, 115 Alhazen, 224 Amirshahi, S.A., 71 aniconism, 252 Appleton, J., 93 appraisal, 19 approach objectivist, 3 subjectivist, 13 Aristotle, 3, 220 Arnheim, R., 69, 73 arousal, xv, 16 art brut, 234 art for art, 8 aspect ratio, 68
assessment method A-Lamp, 167 ACQUINE, 136 brain-inspired DNN, 174 DMA-Net, 161 MNA, 163 MTRL-CNN, 174 NAIR, 182 NIMA, 176 OSCAR, 148 RAPID, 157 attraction, 122 Augustin of Hippo, 3, 4, 223 AVA -Captions (database), 183 -Comments (database), 182 database of images, 127 -PD (database of images), 127 AVA2 (database of images), 127 B bag of visual words (BoVW), 144 Baldwin of Canterbury, 224 BAM! (artistic media dataset), 130 Barthes, R., 19, 97 Baudelaire, C., 113 Baumgarten, A., 231 Beauty, xviii BEAUTY (database of images), 128 beauty-is-good (stereotype), 97 Bell, C., 11
Benjamin, W., 8 Bense, M., 107 Berlyne, D.E., 15, 63 bi mo, 247 Big data, 114 Five (personality profiles), 188 Birkhoff, G.D., 63, 104 bokeh, 79 Bourdieu, P., xx brain-inspired DNN (assessment method), 174 Brown, S., 31, 50 Brown, S., et al. (emotion model), 42 Burke, E., xvii C CAFFE (DNN environment), 161 canon (aesthetics), 220 Carlson, A., 94 Changeux, J.-P., 19, 25, 51, 99 Chatterjee, A., 30 aesthetic emotion model, 40 Chevreul, E., 87, 104 China (aesthetics in), 237 Chinese painting, 237 chromatic palette, 87 Cicero, 222 CIE (International Commission on Illumination), 83 CIELab (metrics), 87 circumplex affect, 207 classification, 133 cognition (areas of), 35 color, 82, 234 Cometti, J.P., 133 common sense, 231 complexity, 63, 80, 104 theory, 108 composition, 63, 64, 73 cortex insular, 31 prefrontal, 28 crowdsourcing, 120 Cuda-Convnet, 175 CUHK-DB (database of images), 126 Cupchik, G., 18, 33
D
d’Ockam, G., 225
da Vinci, L., 227
Damasio, A., 36, 47
Danto, A.C., xx, 19
databases, 120
  AADB, 127
  AVA, 127
  AVA-PD, 127, 187
  AVA2, 127
  BAM!, 130
  BEAUTY, 128
  CUHK-DB, 126
  DPChallenge, 124
  FACD, 129
  Flickr, 122
  Flickr-AES, 123
  IDEA, 129
  ImageCLEF, 124
  ImageNet, 122
  JenAesthetics, 129
  PCCD, 129, 181
  Photo.Net, 123
  PsychoFlickr, 123, 189
  SUN, 126
  Uni Tübingen, 128
Datta, R., 137
de Crousaz, J-P., 228
Democritus, 221
depth of field, 77
Descartes, R., 228
design, 19
Devey, J., 99
Di Dio, C., 18, 33, 35
diagonals, 75
Diderot, D., 1–3, 15
DMA-Net (assessment method), 161
DNN (deep neural network), 115, 155
DPChallenge (database of images), 124
Du Bos, J.-B., 229
Dubuffet, J., 234
Dufrenne, M., 60, 93
E
EEG (electro-encephalography), 36, 51
embodiment, 35
emotions, 207
ERP (Event-Related Potential), 37
experimental psychology, 55
experts, 119
Eysenck, H.J., 62
F
FACD (database), 129
Fechner, G.T., 70, 99
Feodorov, I., 251
Ferry, L., 3, 15
Ficin, M., 226
five major dimensions (personality profiles), 188
Flickr-AES (database of images), 123
fMRI (functional magnetic resonance imaging), 27
focality, 73
focus, 80
framing, 75
G
GAN (Generative Adversarial Network), 167
Gestalt (theory), 16, 105
Ghyka, M., 70
Goethe, J.W., 15, 87, 234
golden number, 70
Gombrich, E.H., 2, 19
GoogLeNet (neural network), 163
Grosseteste, R., 224
H
handcrafted (features, primitives), 134
Hasegawa, Y., 263
hedonic (center), 28
Hegel, G.W.F., 10, 14, 232
Heidegger, M., 19
Henry, C., 10, 104
high-keys, 76
histogram, 76
Hogarth, W., 229
hsü, 242
Hume, D., 229
Hutcheson, F., 19, 229
I
IAPS (International Affective Picture System), xvi, 39, 130
icon, 235
IDEA (database), 129
idiosyncrasies, 185
image and Islam, 252
image format, 68, 156
ImageCLEF (database of images), 124
incarnation (embodiment), 35
Inception (neural network), 177
Ishizu, T., 18, 28
Itten, J., 86
J
Japan (aesthetics in), 263
JenAesthetics (painting database), 129
Jullien, F., 237
K
Kandinsky, V., 55
Kant, I., 7, 14, 231
Kaplan, S., 65, 94
kara-e (aesthetic of the Tangs), 264
Ke, Y., 140
Kirwan, J., 155
Koelsch, S., et al. (emotion model), 47
L
Lacroix, A., 93
landscape, 64, 93
learning, 115
Leder, H., 100
  aesthetic emotion model, 43
Lengger, P.G., 37
Levinson, J., 12
Livingstone, M., 18
Locke, J., 15
low-key, 76
LSTM (Long Short Term Memory) (algorithm), 179
M
Marchesotti, L., 144
masses of data, 114
Matsuda, Y., 91
MEG (magneto-encephalography), 36, 51
memory (areas of), 35
Merleau-Ponty, M., 52, 213
mirror-neurons, 36
MNA (assessment method), 163
Mobile-Net (neural network), 177
model
  aesthetic emotion
    A. Chatterjee’s, 40
    C. Redies’, 45
    H. Leder’s, 43
    L.H. Hsu’s, 47
    S. Brown et al’s, 42
    S. Koelsch et al’s, 47
Moles, A., 107
Molnar, F., 16, 67
mono-no-aware (the interjection of things), 267
Moon, P., 88
Morizot, J., 133
mountain river (shan shui), 247
MTRL-CNN (assessment method), 174
Munsell, A., 87
Murray, N., 144
N
NAIR (assessment method), 182
nature, 64, 93
neural network
  adversarial, 167
  deep (DNN), 115, 155
  GoogLeNet, 163
  Inception, 177
  Mobile-Net, 177
  ResNet, 170
  Siamese, 165
  VGG, 157, 163, 177
neuroaesthetics, 18, 28
  biology, 25
  geometric, 40
Nietzsche, F., 17, 235
nihonga (Japanese painting), 264
NIMA (assessment method), 177
Nizami, 256
O
OASIS (Open Affective Standardized Image), 131
objectivist (approach), 3, 220
OSCAR (assessment method), 148
P
Palmer, S.E., 197
Panofsky, E., 220
parallel lines, 75
PCCD (database), 129, 181
Pepperell, R., 17
perceptibility, 221
Persia (aesthetics in), 251
Persian miniatures, 251
personality, 188
personification (embodiment), 35
Petitot, J., 40
phenomenology, 9
Photo.net (database of images), 123
picturesque, 93
Plato, 2, 3, 220
pooling, 155
Pouivet, R., 12, 133
Prägnanz, 16
precision, 116
primitives, 133
  handcrafted, 134
  high level, 143
prototypicality, 101
PsychoFlickr (database of images), 123, 189
psychological states, 207
Q
qi, 240
qualia, 21
quality versus beauty, 57
R
Rand, A., 3
RAPID (assessment method), 157
recall, 116
receiver’s operational characteristic, 128
recommendation systems, 186
Redies, C. (aesthetic emotion model), 45
ResNet (neural network), 170
reward, 28
ROC curve, 116
Ross, D., 73
rule of thirds, 71
S
Saint Augustin (Augustin of Hippo), 3, 4, 223
San Pedro, J., 154
Schelling, F., 232
Schmidhuber, J., 63, 110
Schopenhauer, A., 4, 10, 83, 233
Shâhnâmah, 255
shan shui (Mountain River), 246
Shannon (sampling, theorem), 80
Shannon sampling, 91
Sheikh (or Xie He), 247
Shitao, 237, 240
shot, 77, 80
Sibley, F., 12
signature (effect), 21, 32, 45, 57, 99, 111
significant form, 11
six laws of
  Vitruvius, 220, 247
  Xie He, 246, 247
SOC (Standard Observation Conditions), 12
Socrates, 2, 3, 220
Solso, R.L., 16
spectral density, 77
subjectivist (approach), 13, 226
sublime, 234
SUN (database of images), 126
superadditivity, 64
supervenience, 11, 59
symmetry, 5, 221
T, U
Tatarkiewicz, W., 220
TensorFlow (software library), 177
texture, 80
Thumfart, S., 81
ugliness, 30, 223
ukiyo-e (floating image), 264
unity, 64
V
Vartanian, O., 35
VAST (Visual Aesthetic Sensitivity Test), 62
VGG (neural network), 157, 163, 177
VISTA+ (recommendation system), 187
visual areas, 33
Vitellion, 225
Vitruvius, 220, 247
W, X, Y
written appraisal, 179
Xenophon, 61
Xie He (or Sheikh), 247
Xie He (six laws of), 246, 247
yamato-e (Ancient Japan), 264
Yarbus, A., 67
yin and yang, 240
yōga (Western painting), 264
Z
Zeki, S., 18, 28