Who Wrote Citizen Kane?: Statistical Analysis of Disputed Co-Authorship (Quantitative Methods in the Humanities and Social Sciences) 3031402235, 9783031402234

This book offers a solution to one of film history’s major controversies: the long-running dispute over Orson Welles’ and Herman J. Mankiewicz’s co-authorship of the Citizen Kane screenplay.


English Pages 180 [174] Year 2023


Table of contents:
Acknowledgments
Contents
List of Figures
Chapter 1: Introduction
References
Chapter 2: The Trials of Coauthorship
2.1 The Dispute over the Authorship of the Citizen Kane Screenplay
2.2 The Study of Coauthorship
2.3 The Citizen Kane Screenplays
References
Chapter 3: Screenplays: Words on the Page
3.1 The Components of a Screenplay
3.2 Welles’s Screenplays
3.3 Mankiewicz’s Screenplays
References
Chapter 4: The Statistical Analysis of Style: Aims and Methods
4.1 The Fingerprint Analogy
4.2 Premises of Stylometry
4.2.1 Quantifiable (Measurable) and Computable
4.2.2 High Rate
4.2.3 Context-Free
4.2.4 Multiple
4.2.5 Subconscious (Automatic)
4.2.6 Distinctive
4.2.7 Stable
4.3 The Federalist Papers, the Aristotelian Ethics, and Pericles
4.4 Stylometry and Coauthorship
4.5 Stylometric Data and Tests
4.6 Punctuation and N-Gram Analysis
4.7 Contractions
4.8 Word Analysis
4.8.1 Percentage of Old English Vocabulary
4.8.2 Distinctive Words
4.9 Collocations
4.10 Sentence Length
4.11 Cluster Analysis
4.12 Statistical Significance
4.13 The Distinctiveness Ratio, Confidence Intervals, and Effect Size
References
Chapter 5: Distinguishing Mankiewicz from Welles: Training Phase Results
5.1 Distinctive Words
5.2 Punctuation
5.3 N-Grams
5.4 Contractions
5.5 Word Length
5.6 Word Frequency Profile
5.7 Vocabulary and Percentage of Old English Vocabulary
5.8 Personal Directions and Scene Heading Elements
5.9 Collocations
5.10 Sentence Length
5.11 Cluster Analysis
5.12 Distinctive Features, Fluctuation, and Confidence Intervals
5.13 Distinctive Groups
Appendix
References
Chapter 6: Comparing Mankiewicz and Welles to the Citizen Kane Screenplay (1): Relative Frequencies, Distinctiveness Ratios, and Confidence Intervals
6.1 Whole Screenplay
6.1.1 Relative Frequencies
6.1.2 Ratios
6.1.3 Summary of the Whole Screenplay Analysis
6.2 Seven Segments of Citizen Kane
6.2.1 Relative Frequencies
6.2.2 Ratios
6.2.3 Summary of the Seven-Segment Analysis
6.3 Thirteen Scenes of Citizen Kane
6.3.1 Relative Frequencies
6.3.2 Ratios
6.3.3 Summary of the Thirteen-Scene Analysis
6.4 Preliminary Conclusions
Appendix
References
Chapter 7: Comparing Mankiewicz and Welles to the Citizen Kane Screenplay (2): Cluster Analysis, Type/Token Ratios, Sentence Length, and Linguistic Inquiry and Word Count (LIWC)
7.1 Sentence Length
7.2 Cluster Analysis
7.3 Type/Token Ratios
7.4 His Girl Friday and All the President’s Men
7.5 Mankiewicz’s Man of the World Dialogue and Welles’s Touch of Evil Memo
7.6 Linguistic Inquiry and Word Count (LIWC)
7.6.1 LIWC Results
7.7 Summary
Appendix
References
In Conclusion
References
Index

Quantitative Methods in the Humanities and Social Sciences

Warren Buckland

Who Wrote Citizen Kane? Statistical Analysis of Disputed Co-Authorship

Quantitative Methods in the Humanities and Social Sciences

Series Editors:
Thomas DeFanti, Calit2, University of California San Diego, La Jolla, CA, USA
Anthony Grafton, Princeton University, Princeton, NJ, USA
Thomas E. Levy, Calit2, University of California San Diego, La Jolla, CA, USA
Lev Manovich, The Graduate Center, CUNY, New York, NY, USA
Alyn Rockwood, KAUST, Boulder, CO, USA

Quantitative Methods in the Humanities and Social Sciences is a book series designed to foster research-based conversation with all parts of the university campus  – from buildings of ivy-covered stone to technologically savvy walls of glass. Scholarship from international researchers and the esteemed editorial board represents the far-reaching applications of computational analysis, statistical models, computer-based programs, and other quantitative methods. Methods are integrated in a dialogue that is sensitive to the broader context of humanistic study and social science research. Scholars, including among others historians, archaeologists, new media specialists, classicists and linguists, promote this interdisciplinary approach. These texts teach new methodological approaches for contemporary research. Each volume exposes readers to a particular research method. Researchers and students then benefit from exposure to subtleties of the larger project or corpus of work in which the quantitative methods come to fruition. Editorial Board: Thomas DeFanti, University of California, San Diego & University of Illinois at Chicago Anthony Grafton, Princeton University Thomas E. Levy, University of California, San Diego Lev Manovich, The Graduate Center, CUNY Alyn Rockwood, King Abdullah University of Science and Technology Publishing Editor for the series at Springer: Faith Su, [email protected]

Warren Buckland

Who Wrote Citizen Kane? Statistical Analysis of Disputed Co-Authorship

Warren Buckland School of Arts Oxford Brookes University Oxford, Oxfordshire, UK

ISSN 2199-0956     ISSN 2199-0964 (electronic) Quantitative Methods in the Humanities and Social Sciences ISBN 978-3-031-40223-4    ISBN 978-3-031-40224-1 (eBook) https://doi.org/10.1007/978-3-031-40224-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.

Acknowledgments

This project took shape with the help of a Research Excellence Award from Oxford Brookes University in 2020. Colleagues offered comments and support at various stages of development and revision, including John Fullerton, Paul Gulino, Peter Krämer, Kevin Maynard, Paolo Russo, Daniela Treveri-Gennari, Barry Salt, Yannis Tzioumakis, Robert Williamson, and the editorial board of the Springer series Quantitative Methods in the Humanities and Social Sciences. I presented a preliminary outline of the project (“Welles and Mankiewicz: The Complexities of Co-Authorship”) at the 12th Screenwriting Research Network (SRN) International Conference, Catholic University of Portugal, Porto, in September 2019. Throughout this project, Richard Forsyth, a pioneer in computational authorship attribution, acted as a consultant, offering advice, comments, and at times some complex calculations. And Biman Chakraborty (Lecturer in Statistics at University of Birmingham) commented on the statistical results presented in Chaps. 5, 6, and 7. For the most part, I have kept the statistics straightforward in order to aim the research squarely at an arts and humanities readership.



List of Figures

Fig. 4.1 Frequency distribution of sentence length in Thomas à Kempis and Jean Charlier de Gerson ��������������������������  62 Fig. 4.2 The Junian plus group confidence interval (243–377) together with the 88 texts of the million-word sample ��������������������  70 Fig. 4.3 The Junian minus group confidence interval (96–142) together with the 88 texts of the million-word sample ��������������������  70 Fig. 4.4 Sir Philip Francis’s eight samples from his plus group superimposed over the Junian plus confidence interval��������������������  71 Fig. 4.5 Sir Philip Francis’s eight samples from his minus group superimposed over the Junian minus confidence interval����������������  72 Fig. 5.1 Frequencies of the Trigram e_b in Mankiewicz (divided into 10 segments)����������������������������������������������������������������  92 Fig. 5.2 Frequencies of the Trigram e_b in Welles (divided into 10 segments)����������������������������������������������������������������  93 Fig. 5.3 Distinctive words in Mankiewicz and Welles ����������������������������������  97 Fig. 5.4 Punctuation marks in Mankiewicz and Welles ��������������������������������  98 Fig. 5.5 Letter Unigrams in Mankiewicz and Welles ������������������������������������  98 Fig. 5.6 Distinctive Bigrams in Mankiewicz and Welles ������������������������������  99 Fig. 5.7 Distinctive Trigrams in Mankiewicz and Welles������������������������������  100 Fig. 5.8 Trigram contractions in Mankiewicz and Welles������������������������������  100 Fig. 5.9 Word length in Mankiewicz and Welles�������������������������������������������  101 Fig. 5.10 Mankiewicz: frequency distribution of word length ������������������������  101 Fig. 5.11 Welles: frequency distribution of word length����������������������������������  102 Fig. 5.12 Difference between Mankiewicz and Welles (frequency distribution of word length)��������������������������������������������  102 Fig. 5.13 Word length frequencies and their distinctive ratio values in Mankiewicz and Welles��������������������������������������������������������������������  103 Fig. 5.14 Frequency distribution (occurrences of word types) in Mankiewicz and Welles����������������������������������������������������������������  104 Fig. 5.15 Word frequency profile of Mankiewicz and Welles��������������������������  105 xi


Fig. 5.16 Word frequency profile of Mankiewicz and Welles (logarithmic scale)����������������������������������������������������������������������������  105 Fig. 5.17 Personal directions in Mankiewicz and Welles��������������������������������  106 Fig. 5.18 Basic scene heading elements in Mankiewicz and Welles����������������  106 Fig. 5.19 Two-word collocations in Mankiewicz and Welles��������������������������  107 Fig. 5.20 Sentence length in Mankiewicz and Welles��������������������������������������  107 Fig. 5.21 Difference between Mankiewicz and Welles (frequency distribution of sentence length)��������������������������������������������������������  108 Fig. 5.22 Cluster graph of four Mankiewicz and Welles screenplay samples����������������������������������������������������������������������������������������������  108 Fig. 5.23 Dendrogram of four screenplay samples������������������������������������������  109 Fig. 5.24 Initial list of 37 plus group distinctive features (organized by Distinctiveness Ratio [DR])��������������������������������������������������������������  110 Fig. 5.25 Initial list of 40 minus group distinctive features (organized by Distinctiveness Ratio [DR]) ��������������������������������������  111 Fig. 5.26 The e_b Trigram in Segments 1 and 2 of Welles������������������������������  112 Fig. 5.27 The e_b Trigram in segments 1 and 2 of Mankiewicz����������������������  113 Fig. 5.28 Final list of 22 plus group distinctive features (organized by Distinctiveness Ratio [DR])��������������������������������������������������������������  114 Fig. 5.29 Final List of 22 Minus Group Distinctive Features (organized by Distinctiveness Ratio [DR]) ��������������������������������������  115 Fig. 5.30 Number of linguistic features in the final 44 plus and minus groups������������������������������������������������������������������������������  116 Fig. 6.1 Visualization of the plus group CK distinctiveness ratios����������������  120 Fig. 6.2 Visualization of the minus group CK distinctiveness ratios ������������  121 Fig. 6.3 Percentage matches between Mankiewicz, Welles, and Citizen Kane (Whole screenplay)������������������������������������������������������������������  122 Fig. 6.4 The plus group ratios (Mankiewicz and Welles compared to seven segments of Citizen Kane)��������������������������������������������������  125 Fig. 6.5 The minus group ratios (Mankiewicz and Welles compared to seven segments of Citizen Kane)��������������������������������������������������  126 Fig. 6.6 Percentage matches between Mankiewicz, Welles, and Citizen Kane (Seven Segments) ������������������������������������������������  127 Fig. 6.7 The plus group (Mankiewicz and Welles compared to the 13 scenes of Citizen Kane)����������������������������������������������������������������  129 Fig. 6.8 The minus group (Mankiewicz and Welles compared to the 13 scenes of Citizen Kane)����������������������������������������������������������������  130 Fig. 6.9 Percentage matches between Mankiewicz, Welles, and Citizen Kane (13 scenes)������������������������������������������������������������  131 Fig. 6.10 Mankiewicz’s results from his plus and minus groups ��������������������  132 Fig. 6.11 Welles’s results from his plus and minus groups������������������������������  132 Fig. 6.12 Summary of the percentage matches between Citizen Kane (three tests) and Mankiewicz and Welles������������������������������������������  135


Fig. 6.13 The plus group relative frequencies of Mankiewicz and Welles compared to Citizen Kane (whole screenplay)��������������  137 Fig. 6.14 The minus group relative frequencies of Mankiewicz and Welles compared to Citizen Kane (whole screenplay) ��������������������������������  138 Fig. 6.15 CK distinctiveness ratios: Mankiewicz compared to Citizen Kane����������������������������������������������������������������������������������  139 Fig. 6.16 CK distinctiveness ratios: Welles compared to Citizen Kane ����������  140 Fig. 6.17 The relative frequencies of the plus group of Citizen Kane (seven segments) and their differences from Mankiewicz and Welles ����������������������������������������������������������������������������������������  141 Fig. 6.18 The relative frequencies of the minus group of Citizen Kane (seven segments) and their differences from Mankiewicz and Welles ����������������������������������������������������������������������������������������  141 Fig. 6.19 Citizen Kane divided into its 13 major scenes����������������������������������  142 Fig. 6.20 The relative frequencies of the plus group of Citizen Kane (13 Scenes) and their differences from Mankiewicz and Welles������  142 Fig. 6.21 The relative frequencies of the minus group of Citizen Kane (13 Scenes) and their differences from Mankiewicz and Welles ����������������������������������������������������������������������������������������  143 Fig. 7.1 The sum of relative frequencies and distinctiveness ratios of His Girl Friday (Whole Screenplay) in relation to Mankiewicz and Welles��������������������������������������������������������������������  147 Fig. 7.2 The sum of relative frequencies and distinctiveness ratios of All the President’s Men (Whole Screenplay) in relation to Mankiewicz and Welles��������������������������������������������������������������������  148 Fig. 7.3 Mankiewicz and Welles compared to Welles’s Touch of Evil Memo������������������������������������������������������������������������������������  149 Fig. 7.4 Mankiewicz and Welles compared to Mankiewicz’s Man of the World Dialogue ����������������������������������������������������������������������  149 Fig. 7.5 LIWC categories in Mankiewicz and Welles������������������������������������  151 Fig. 7.6 LIWC categories in Mankiewicz, Welles, and Citizen Kane������������  153 Fig. 7.7 Visualization of LIWC categories in Mankiewicz, Welles, and Citizen Kane ������������������������������������������������������������������������������  153 Fig. 7.8 Sentence lengths in Mankiewicz, Welles, and Citizen Kane������������  155 Fig. 7.9 Visualization of sentence lengths in Mankiewicz, Welles, and Citizen Kane ������������������������������������������������������������������������������  155 Fig. 7.10 Cluster graph of Mankiewicz, Welles, and Individual Scenes from Citizen Kane ����������������������������������������������������������������������������  155 Fig. 7.11 Type/token ratios in Mankiewicz, Welles, and Citizen Kane������������  156

Chapter 1

Introduction

What value is “criticism” which consists merely of opinions without reference to facts? (Joseph McBride 1971, 32).
When experiment is pushed into new domains, we must be prepared for new facts, of an entirely different character from those of our former experience (P.W. Bridgman 1927, 2).

Every 10 years, Sight & Sound magazine carries out a survey to compile a list of critically acclaimed films. For the 2012 poll, the magazine’s editor explained that “we approached more than 1,000 critics, programmers, academics, distributors, writers and other cinephiles, and received (in time for the deadline) precisely 846 top-ten lists that between them mention a total of 2,045 different films” (James 2021). From 1962 to 2002, Citizen Kane (1941) topped Sight & Sound’s list, although in 2012 it came second to Hitchcock’s Vertigo (1958) and fell to third place in 2022. Despite the apparent frivolity in compiling a top ten, a list presents a straightforward way to arrange data (such as film titles) in a descending order of rank (determined by the number of mentions), and 846 responses constitute a sufficiently large data set to represent meaningful results. These polls confirm Citizen Kane’s status as one of the most critically acclaimed films in cinema history. The making of Citizen Kane has even been the subject of two feature films: RKO 281 (Benjamin Ross, 1999) and Mank (David Fincher, 2020), while Orson Welles—the producer, director, and cowriter of Citizen Kane (as well as its star)—was and still is a celebrated public figure. However, Welles’s cowriter credit on Citizen Kane has generated a long-running public dispute, between eminent critics and industry insiders such as Pauline Kael and actor John Houseman on one side, who argue that it is the sole work of Herman J. Mankiewicz, and equally eminent critics and industry insiders such as Andrew Sarris and Peter Bogdanovich on the other, who argue that Welles collaborated fully in the writing process. Mankiewicz’s biographer, Richard Meryman, sums up the dispute: “the authorship of Citizen Kane has become one of film history’s major controversies. And the question of who did what and how much opens up an extraordinary subdrama of jostling egos” (Meryman 1978, 237). The arguments for and against Welles’s coauthorship credit are merely asserted, driven by opinions, impressions, loyalties among friends, and feuds between enemies. Houseman maintained a 40-year feud with Welles, taking every opportunity to (quite literally)


discredit Welles’s coauthorship of the Citizen Kane screenplay. Bogdanovich noted that, in his memoir Run-Through, Houseman (1972, 468) took his revenge on Welles via a strategic rhetorical move—Houseman goes out of his way to praise Welles as a director before proclaiming that Welles did not write any of Citizen Kane, a move that makes Houseman’s account sound sincere, fair, and balanced (Bogdanovich 1972, 101). Houseman’s revenge is matched by Kael’s hostility—not only toward Welles but also toward Andrew Sarris, whose auteur theory canonized directors such as Welles: “The auteur critics are so enthralled with their narcissistic male fantasies,” Kael wrote, “that they seem unable to relinquish their schoolboy notions of human experience” (Kael 1963, 26). To counter Sarris’s veneration, Kael glibly remarked that Welles “was to become perhaps the greatest loser in Hollywood history” (Kael et al. 1971, 46). Here, Kael combines uncertainty (perhaps) with hyperbole (the greatest loser). But by what criteria could Welles be defined as a loser? (Incompetence? Lack of success?) Kael uses the term (and boosts it with a superlative adjective) merely as an emotionally laden derogatory slur. But Sarris’s and Bogdanovich’s attempts to canonize Welles are equally unreserved. Bogdanovich begins his defense of Welles by quoting in passing the quintessential European film auteur Jean-Luc Godard, who said that “All of us will always owe [Welles] everything” (Godard, quoted in Bogdanovich 1972, 99). With all of us, always, and everything packed into one concise sentence, Bogdanovich could not have found a more hyperbolic statement to counter Kael’s rhetoric. Bogdanovich then set out to discredit both Houseman and Kael, claiming, for example, that Houseman’s career was undistinguished and that his only claim to fame was his brief association with Welles. Houseman, Bogdanovich argues, only gets himself noticed by continually repeating that Welles is a fraud: “For many years now, Houseman has been actively promoting the picture of Welles as a credit thief” (Bogdanovich 1972, 190). I untangle these feuds in Chap. 2, and in the following chapters, I bypass such hearsay and rumor and instead tackle the public dispute from a new perspective, one that subjects the language of the Citizen Kane screenplay to close scrutiny via statistical analysis. In this study, I set out to answer two questions: (1) What distinguishes the writing of Mankiewicz from Welles? (2) What did each author contribute to the writing of the Citizen Kane screenplay? This study has no vested interest in privileging one author over the other; instead, it is driven by curiosity and the unknown and is guided by a discovery procedure that aims to find the answers to these two research questions by employing data collection and statistical analysis. I aim to persuade film study scholars that it is only via statistical methods that we can resolve the long-running dispute over the authorship of the Citizen Kane screenplay. Like all statistical analyses, the analysis guiding this study is an inquiry into unknown parameters: the distinctive features of Mankiewicz’s writing, the distinctive features of Welles’s writing, and each author’s contribution to the Citizen Kane screenplay. These unknowns are reduced through the collection, measurement, classification, and statistical analysis of relevant data. This study puts to one side ad hoc


judgments and instead provides internal (textual) evidence of authorship, which supplements traditional methods of authorship attribution such as external historical evidence (discussed in Chap. 2). Moreover, this study examines the internal authorship of the Citizen Kane screenplay with unusual and special types of linguistic evidence not previously considered—new facts of an entirely different character (to paraphrase Bridgman). This evidence becomes accessible via statistical theories and methods that quantify those specific linguistic features that separate Mankiewicz from Welles, features that are then identified in the Citizen Kane screenplay. The simplest statistical techniques from the recent development of new statistics are employed to discover and quantify each author’s contribution to Citizen Kane.1 Quantifying entails translating a text’s linguistic properties into numerical values, which can then be measured via frequency counts and transformed using other statistical operations (calculating the mean, standard deviation, percentages, ratios, confidence intervals, and effect sizes). My motivation behind this study involves replacing impressionistic accounts of Mankiewicz’s and Welles’s contributions to the Citizen Kane screenplay with a rigorous statistical analysis. Attributing authorship to an anonymous or disputed text by quantifying stylistic features requires statistical inferences, for authorship is not an empirical property of a text. In other words, authorship cannot be measured directly and explicitly, for it is an indirect and implicit property that emerges from direct properties such as word frequencies. In logical terms, attributing authorship is not deductively entailed from a text’s empirical properties but is nondemonstrative, an attribution supported but not demonstrated or proven by textual properties. A radical skeptic such as David Hume would deny the veracity of making a nondemonstrative or ampliative inference, for an inference of authorship extends beyond and is (always) underdetermined by the available data and cannot, therefore, be logically validated or justified. This type of skepticism raises metatheoretical issues, such as: Is there any type of data that can count as evidence in a statistical analysis of authorship? How does a statistical analysis validate the data it generates? And how do statistics manage indirect evidence and uncertainty? Statistics relies on a set of inductive reasoning processes that are explicitly defined and shared, which makes them secure, reliable, and robust. Statistics contains and controls uncertainty and rejects Hume’s skepticism by turning a problem into an empirically testable research question by collecting, measuring, and quantifying an enormous amount of data and by identifying consistency (that is, patterns and trends) in that data. Statistical testing then draws inferences from that data, which provides evidence for the empirical research question. Within a statistical framework, linguistic features become independent variables that predict authorship. Controversially, authorship is reduced to a constant quantity, a discriminant numerical value comprising a linear combination of variables that maximizes the difference between authors. The likely success of such a study increases when the linguistic data are sufficiently comprehensive, when the distinctive linguistic variables are constant (or, at least, when their variation can be

1. See, for example, Geoff Cumming (2014) and Cumming and Calin-Jageman (2017).

measured), when the data are drawn from only one genre of writing (the screenplay) and two authors (Mankiewicz and Welles), and when the study employs multiple statistical methods to corroborate and cross-validate the results (which reduces variability in the data and the bias of a single method). In this study, I define authorship in a literal way—in terms of a writer’s organization and the wording of a written text, which in turn produces that author’s style. Such a study assumes that writing is an orderly process that can be quantified and analyzed in precise terms. More specifically, an author’s style emerges from their recurring habits, a systematic series of linguistic choices that make their writing distinctive. (As we shall see, these choices are not necessarily made consciously, which is why we prefer the term habit.) Studying these linguistic habits from a statistical perspective takes place on several levels. For Alvar Ellegård (whose statistical authorship method is central to this study), the term style is synonymous with “features or combinations of features in an author’s way of writing” (Ellegård 1962, 9). Ellegård defines style as a distinctive combination of features that emerge from an author’s systematic set of linguistic habits. Similarly, Fiona Tweedie and her colleagues define style as “a set of measurable patterns which may be unique to an author,” adding that: Almost every conceivable measure has been considered, ranging from sentence lengths to the number of nouns, articles or pronouns occurring in the text. The vocabulary of the author has also undergone scrutiny, with counts being taken of words that occur only once in the text, to the most common words that act as fillers. Between these two extremes are function words, certain non-contextual words occurring in the text. They can be used as “markers” for different authors (Tweedie et al. 1996, 1).

And in his definition of style, N. E. Enkvist compares frequencies: Style is concerned with frequencies of linguistic items in a given context, and thus with contextual probabilities. To measure the style of a passage, the frequencies of its linguistic items of different levels must be compared with the corresponding features in another text or corpus which is regarded as a norm and which has a definite relationship with this passage (Enkvist 1964, 29).

In simple terms, statistical analysis identifies an author’s distinctive style by measuring and then quantifying a vast array of linguistic features via frequency counts. An author’s quantified stylistic features become meaningful only when compared to the quantified style of other authors. Style is therefore defined comparatively, as the quantifiable deviation of one author’s style from the style of other authors. Deviation is measured and quantified in terms of the frequency counts of letters, words, sentence lengths, the number of nouns, articles, pronouns, etc. Such a precise and measured analysis of style can assist us in identifying the author of a disputed text. In this study, a statistically based analysis of the authorship of the Citizen Kane screenplay identifies the discriminant variables, the most relevant or significant linguistic features that distinguish the style of Mankiewicz from Welles, for it is from these discriminant variables that inferences are generated to identify each author’s contribution to the screenplay. However, statistics cannot banish uncertainty but manages and reduces it to a measurable and knowable quantity.
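To make this concrete, here is a minimal sketch of the basic quantification step, written in Python for illustration only: it is not code from the book, and the two toy samples and the short function-word list are invented stand-ins for the 20,000-word screenplay samples the study uses. It counts selected features in each sample and converts the raw counts to relative frequencies so that samples of different lengths can be compared on the same scale.

```python
from collections import Counter
import re

def relative_frequencies(text, features):
    """Convert raw feature counts to relative frequencies: occurrences of each
    feature divided by the total number of word tokens in the sample."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {f: counts[f] / total for f in features}

# Invented toy samples; the study itself works with 20,000-word samples
# drawn from screenplays of known authorship.
sample_a = "the press and the public and the opera house all belong to Kane"
sample_b = "but what was Rosebud and why did no one ever ask him"

function_words = ["the", "and", "but", "what"]
print(relative_frequencies(sample_a, function_words))
print(relative_frequencies(sample_b, function_words))
```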


Because this study uses statistics to measure and quantify writing habits, it forms part of the discipline of stylometry (the quantification of style)—more specifically, of a branch of stylometry devoted to authorship attribution, which in the past has examined in unprecedented detail the disputed authorship of classical texts (Plato, Aristotle), the New Testament, as well as Shakespeare’s plays, plus De Imitatione Christi, The Letters of Junius, The Federalist papers,2 and numerous other disputed works, some of which have made international headlines—such as Don Foster’s successful unmasking of Joe Klein as the author of the anonymous novel Primary Colors and his mistaken attribution of the poem “A Funeral Elegy” to Shakespeare (Foster 2001). Like these previous studies in stylometry (outlined in Chap. 4), in this study, I employ descriptive statistical methods to quantify linguistic features to infer authorship. In the last 40 years, stylometric authorship attribution has been bolstered with the advent of computing and, more recently, with software tools, an integral part of which involves representing the results in tables and graphs. It is via these statistical procedures, software tools, and visual representation of information that we can discover new data relevant to determining the authorship of the Citizen Kane screenplay. This type of research is evidence based. It avoids speculation, overgeneralization, and impressionistic judgments and is interdisciplinary, forming part of the Digital Humanities, which challenges the traditional ways of thinking embedded in the humanities. Digital Humanities research employs statistical solutions to seemingly intractable humanities problems. The present study introduces simple statistical methods to an arts and humanities readership and then demonstrates the value of those methods by carrying out a systematic study of the long-running coauthorship problem that has puzzled the film industry and film critics for decades. Chapter 2, The Trials of Coauthorship, investigates the dispute between Herman J. Mankiewicz and Orson Welles, focusing on Welles’s claim to the status of coauthor of the Citizen Kane screenplay, and attempts by his adversaries to deny him this status, which (they argue) he appropriated unfairly and deceptively. I frame my discussion of this apparent case of modern-day pseudepigrapha (false ascription of authorship to a written text) via two institutions: the current guidelines from the Writers Guild of America (WGA) and the current copyright law formulated by the United States legislature. For both institutions, authorship is premised on self-contained individualism—on writing as a solitary and individual act of creation that produces an original text that (in Aristotle’s formulation) reflects the writer’s character, thereby conferring on the writer the right to claim the ownership of that text. The chapter also examines the assumptions behind the concept of coauthorship, which complicates authorship attribution by challenging the commonplace idea that a written text has a single origin (the mind of one author). Coauthorship involves sharing different writing tasks and roles, such as outlining, planning, drafting, editing, and revising. Chapter 2 ends by discussing Robert Carringer’s authoritative

2. Standard overviews of stylometric authorship attribution include Susan Hockey (1980, Chapter 6); David I. Holmes (1994); Patrick Juola (2008); Efstathios Stamatatos (2009); and Michael P. Oakes (2014).


study of the (co)authorship of seven different versions of the Citizen Kane screenplay (Carringer 1978). Chapter 3, Screenplays: Words on a Page, presents the control set of screenplays—screenplays of known authorship, which will be analyzed (in Chap. 5) to determine an effective set of linguistic variables that distinguish the writing of Mankiewicz from Welles. The control set consists of two screenplays known to be written by Mankiewicz (Made in Heaven (1943–45) and A Woman’s Secret (1949)) and two known to be written by Welles (The Other Side of the Wind (1970) and The Big Brass Ring (1982)). Each screenplay sample is 20,000 words long—the first 20,000 words of each screenplay, minus the character cues and other standard formatting marks, as explained in the chapter. Each sample is therefore 40,000 words, which constitutes a control corpus that serves to identify stylistic features that distinguish Mankiewicz from Welles. To assist in this effort, data from other screenplays are presented in passing, including The Magnificent Ambersons (Welles, 1941), Touch of Evil (Welles, 1957), and Man of the World (Mankiewicz, 1931). For comparative purposes, data from two other screenplays not written by Mankiewicz or Welles are included: His Girl Friday (1940, written by Charles Lederer) and All the President’s Men (1974, written by William Goldman). Mankiewicz’s sample consists of an adapted screenplay—A Woman’s Secret. It will be compared to its source material (the novel Mortgage on Life by Vicki Baum) using a software program called WCopyfind,3 which evaluates two documents by matching the overlapping words and phrases. This will establish how similar the screenplay is to the adapted source material. The first half of Chap. 4, The Statistical Analysis of Style: Aims and Methods, begins by outlining three fundamental distinctions central to statistics: descriptive and inferential statistics, sample and population, and statistical tests and effect sizes. I then present stylometry’s basic premises to arts and humanities scholars: namely, that relevant linguistic features should ideally be quantifiable, high rate, context-­ free (not dependent on the subject matter), multiple, subconscious, distinctive, and stable (consistent and regular). In the second half of the chapter, I present an overview of stylometric methods that have previously been successful in discriminating between authors and the range of data they used—including punctuation, unigrams, contractions, vocabulary analysis (word frequency profile, the type/token ratio, distinctiveness ratio), collocational analysis, and frequency distribution of sentence length. The authorship attribution process comprises a training phase, followed by a testing phase. Chapter 5, Distinguishing Mankiewicz from Welles: Training Phase Results, presents the training phase, where the stylometric methods introduced in Chap. 4 are applied to the control group of screenplays to establish which methods are successful in identifying and quantifying the stylistic features that distinguish Mankiewicz from Welles. Several statistical concepts and methods proved useful: relative frequencies, the distinctiveness ratio, sample means, sample standard

3. https://plagiarism.bloomfieldmedia.com/software/wcopyfind/

deviation, confidence intervals, and effect size. In traditional statistical terms, relative frequency quantifies the frequency of variables within the same sample.4 In contrast, the distinctiveness ratio compares the relative frequency of the same variable in two different samples. Distinctive variables are identified by dividing a variable’s relative frequency in one sample into the same variable’s relative frequency in the other sample.5 The higher the ratio, the more distinctive the variable. The sample mean estimates the population mean, the confidence interval represents variation around that estimated mean, and the effect size measures the scale or magnitude of the difference between two samples. In the following study, and with frequent reference to Ellegård’s theory of authorship attribution, these statistical methods are employed to construct models or statistical profiles of Mankiewicz’s and Welles’s writing styles: relative frequency identifies variables (linguistic features) that have a high or low frequency in Mankiewicz’s and Welles’s writings; the distinctiveness ratio compares Mankiewicz’s and Welles’s relative frequencies in order to identify the most distinctive linguistic features that distinguish the two authors; confidence interval represents the range or variation that linguistic features can take around the mean of an author’s statistical profile, with the upper and lower limits corresponding to the boundaries of that profile; and effect size measures the size of the separation between the two authors. From this training phase, I draw up a list of distinctive linguistic features that distinguish Mankiewicz from Welles. An initial list of 77 distinctive linguistic features and the final list of 44 features are presented in the Appendix to Chap. 5. These distinctive features are divided into two groups—the plus group represents 22 variables distinctive of Mankiewicz (in relation to Welles), and the minus group represents 22 variables distinctive of Welles (in relation to Mankiewicz). In Chap. 6, Comparing Mankiewicz and Welles to the Citizen Kane Screenplay (1), I employ Mankiewicz’s and Welles’s statistical profiles to assign authorship in a precise way to the Citizen Kane screenplay. Whereas in Chap. 5 I distinguish Mankiewicz from Welles using 44 distinctive linguistic features, in Chap. 6 I employ the same features to establish the similarities between each author and the Citizen Kane screenplay. It is in this chapter that I discover the quantity of writing that Mankiewicz and Welles contributed to Citizen Kane. I analyze the screenplay three times: firstly, as a single document; secondly, by segmenting it into 4000-word samples; and thirdly, by dividing it into its 13 major scenes. On each occasion, I compare the Citizen Kane screenplay to Mankiewicz’s and Welles’s statistical profiles. In other words, the 22 distinctive linguistic features of Mankiewicz’s statistical profile and the 22 distinctive features of Welles’s profile are counted in the Citizen Kane screenplay and converted into relative frequencies; these Citizen Kane relative frequencies are then divided into Mankiewicz’s and Welles’s relative frequencies to calculate their distinctiveness ratio. If the ratio is small, this signifies similar

4. Relative frequency of x = the observed frequency of x in the sample divided by the total number of words in that sample.
5. The distinctiveness ratio = relative frequency of x in sample 1 / relative frequency of x in sample 2.

authorship; if the ratio is large, it suggests different authorship, where similar/different authorship is defined in terms of ratios and the boundaries of the confidence interval. I identify the exact sections in the screenplay where Welles’s stylistic signature dominates and where Mankiewicz’s signature dominates. I use these results to revise and update some of the conclusions in Carringer’s authoritative study by promoting a hypothesis that he did not consider: whether Welles wrote part of the Citizen Kane screenplay before Mankiewicz began writing the first draft. In Chap. 7, Comparing Mankiewicz and Welles to the Citizen Kane Screenplay (2), I examine sentence length, clusters, and type/token ratios and use the software program Linguistic Inquiry and Word Count (LIWC) to compare Mankiewicz and Welles to Citizen Kane. LIWC measures and quantifies 92 linguistic features of texts, including grammatical categories such as pronouns, verbs, and function words, together with punctuation, informal expressions, and words expressing positive or negative sentiment. The chapter also compares two screenplays not written by either author (the two mentioned above—His Girl Friday and All the President’s Men) to their statistical profiles to ensure the statistical tests do not falsely attribute these screenplays to Mankiewicz or Welles, and I analyze other texts known to be written by Mankiewicz and Welles to see how well they match each author’s profiles. The conclusion considers how nondemonstrative inferences of authorship can be evaluated and, more generally, reflects on the role and limits of statistics in solving problems in Arts and Humanities research. Rather than rely on point estimates (such as the p values of null hypothesis significance testing) to determine the relevance and importance of the results, this study follows the new statistics by employing confidence intervals and magnitude or effect sizes, which are particularly appropriate for the nondemonstrative inferences generated in the Arts and Humanities. The difference between two data samples (e.g., Mankiewicz and Welles) is quantified in terms of degrees (a range of values located within an interval) and effect size (a ratio value, measured using Cohen’s d),6 which replaces the single all-or-nothing p value. An estimate determines to what extent the data support an inference rather than stating that the data either support or does not support the inference. By employing methods that have become central to the new statistics in his study of The Letters of Junius in 1962, Ellegård was ahead of his time. His methods are also straightforward to understand and sufficiently powerful to offer a precise solution to the authorship of the Citizen Kane screenplay.
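The comparison procedure described above can be sketched as follows. This is a simplified illustration rather than the book’s own code: the feature names and frequencies are invented, and the “closest ratio to 1” rule here stands in for the confidence-interval boundaries the study actually uses to decide what counts as similar authorship.

```python
def distinctiveness_ratio(rel_freq_a, rel_freq_b):
    """Distinctiveness ratio: relative frequency of a feature in one sample
    divided by its relative frequency in another (see footnote 5)."""
    return rel_freq_a / rel_freq_b

# Hypothetical relative frequencies (per 1,000 words) for three features
# in each author's training profile and in a disputed screenplay segment.
mankiewicz_profile = {"dash": 4.2, "ellipsis": 1.1, "exclamation": 2.5}
welles_profile = {"dash": 1.6, "ellipsis": 3.8, "exclamation": 0.9}
disputed_segment = {"dash": 3.9, "ellipsis": 1.3, "exclamation": 2.2}

for feature, freq in disputed_segment.items():
    to_mank = distinctiveness_ratio(freq, mankiewicz_profile[feature])
    to_welles = distinctiveness_ratio(freq, welles_profile[feature])
    # A ratio close to 1 signals similar usage; the further from 1, the more
    # the disputed segment departs from that author's profile.
    closer = "Mankiewicz" if abs(to_mank - 1) < abs(to_welles - 1) else "Welles"
    print(f"{feature}: vs Mankiewicz {to_mank:.2f}, vs Welles {to_welles:.2f} -> {closer}")
```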

6. Confidence intervals and ratios quantify the same differences in different ways. Whereas a ratio measures the difference between two samples numerically, a confidence interval represents that ratio visually. A ratio of one, for example (no difference between two samples), is located in the center of an interval. Cohen’s d measures the difference between two samples in terms of standard deviation. A Cohen’s d value of 0.5 quantifies the difference between two samples as 0.5 standard deviations, while a Cohen’s d value of 0 signifies identity between the two samples (as does a ratio of 1). Cohen’s d is defined more formally in Chap. 4.
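For readers unfamiliar with Cohen’s d, here is a small worked sketch with invented numbers (not results from the study): d is the difference between two sample means divided by their pooled standard deviation, so a value of 0 means the samples coincide and 0.5 means their means sit half a standard deviation apart.

```python
import statistics

def cohens_d(sample_1, sample_2):
    """Cohen's d: difference between sample means divided by the pooled
    standard deviation of the two samples."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    pooled_var = ((n1 - 1) * statistics.variance(sample_1) +
                  (n2 - 1) * statistics.variance(sample_2)) / (n1 + n2 - 2)
    return mean_diff / pooled_var ** 0.5

# Invented relative frequencies of one feature across five segments
# of each author's training corpus.
mankiewicz_segments = [4.1, 3.8, 4.4, 4.0, 3.9]
welles_segments = [2.9, 3.1, 2.6, 3.3, 3.0]
print(round(cohens_d(mankiewicz_segments, welles_segments), 2))  # a large separation
```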


References

Bogdanovich, Peter. 1972. The Kane Mutiny. Esquire (October): 99–105; 180–190.
Bridgman, P.W. 1927. The Logic of Modern Physics. New York: The Macmillan Co.
Carringer, Robert L. 1978. The Scripts of Citizen Kane. Critical Inquiry 5 (2): 369–400.
Cumming, Geoff. 2014. The New Statistics: Why and How. Psychological Science 25 (1): 7–29.
Cumming, Geoff, and Robert Calin-Jageman. 2017. Introduction to the New Statistics: Estimation, Open Science, and Beyond. New York: Routledge.
Ellegård, Alvar. 1962. A Statistical Method for Determining Authorship: The Junius Letters 1769–1772. Gothenburg: Gothenburg Studies in English.
Enkvist, N. E. 1964. On Defining Style. In N. E. Enkvist, J. Spencer & M. J. Gregory, Linguistics and Style, 1–56. Oxford: Oxford University Press.
Foster, Don. 2001. Author Unknown: On the Trail of Anonymous. London: Macmillan.
Hockey, Susan. 1980. A Guide to Computer Applications in the Humanities. London: Duckworth.
Holmes, David I. 1994. Authorship Attribution. Computers and the Humanities 28 (2): 87–106.
Houseman, John. 1972. Run-Through 1902/1941. New York: Touchstone.
James, Nick. 2021. How We Made the Greatest Films of All Time Poll. https://www.bfi.org.uk/sight-and-sound/polls/greatest-films-all-time/introduction
Juola, Patrick. 2008. Authorship Attribution. Foundations and Trends in Information Retrieval 1 (3): 233–334.
Kael, Pauline. 1963. Circles and Squares. Film Quarterly 16 (3): 12–26.
Kael, Pauline, Herman Mankiewicz, and Orson Welles. 1971. The Citizen Kane Book. Boston: Little, Brown.
McBride, Joseph. 1971. Rough Sledding with Pauline Kael. Film Heritage 7 (1): 13–16; 32.
Meryman, Richard. 1978. Mank: The Wit, World, and Life of Herman Mankiewicz. New York: William Morrow.
Oakes, Michael P. 2014. Literary Detective Work on the Computer. Amsterdam: John Benjamins Publishing Company.
Stamatatos, Efstathios. 2009. A Survey of Modern Authorship Attribution Methods. Journal of the American Society for Information Science and Technology 60 (3): 538–556.
Tweedie, F. J., S. Singh, and D. I. Holmes. 1996. Neural Network Applications in Stylometry: The Federalist Papers. Computers and the Humanities 30 (1): 1–10.

Chapter 2

The Trials of Coauthorship

2.1 The Dispute over the Authorship of the Citizen Kane Screenplay

An episode of the popular TV program Columbo called “Murder by the Book” (directed by Steven Spielberg, 1971) features a collaborative writing team, with one writer carrying out the writing of novels and the other involved in their publicity and promotion. The writer plans to strike out on his own as an independent author, prompting the nonwriting collaborator to murder him, for he cannot write and will therefore be exposed as a fake author. “Murder by the Book” sets up the status of the writer as a matter of life and death, mixing conceptual issues around defining an author—someone who generates the wording of a text and claims ownership of it—with emotional concerns around pride and self-esteem. The same issues are central to the conflict over the coauthorship of the Citizen Kane screenplay, with the legacy of two authors at stake. The opening page of the Screen Credits Manual published by the Writers Guild of America (WGA) makes clear the significance of screen credit: “A writer’s credits play an enormous role in determining our position in the motion picture and television industry” (Writers Guild of America 2018, 1). According to Herman Mankiewicz’s biographer, Richard Meryman, it was Orson Welles who planned to write and direct a film about a newspaper mogul told from several different perspectives: “Welles wanted the individual recollections about the newspaper publisher to be radically different, exactly like the later Japanese film Rashomon. Several long sequences would be repeated exactly but acted differently so the same events and dialogue would give conflicting images of Kane” (Meryman 1978, 247).1 The dispute over the authorship of the Citizen Kane screenplay is itself

1. The Rashomon idea is diluted in the final film, which presents a composite portrait of Kane with only minor overlaps.


a Rashomon-like story, with several conflicting accounts already in existence. In this chapter, I do not summarize or synthesize these various versions of events in any detail. Instead, in the first half, I extract from these conflicting accounts Welles's working methods (especially the way coauthorship of the Citizen Kane screenplay is presented), and in the second half, I offer a theoretical overview of the concept of coauthorship, ending on an account of the different drafts of Citizen Kane. My theoretical overview does not focus on the final film, which, as Harlan Lebo has demonstrated in great detail, is significantly different from the final version of the screenplay (a common practice).2 Instead, this chapter frames the study of authorship through the current guidelines of the Writers Guild of America (WGA), the current Copyright Law of the United States (which incorporates previous laws) (Library of Congress and Copyright Office 2021), and the development of "screenplay studies" as embodied on the Screenwriting Research Network (SRN), set up in 2006.3 The SRN regards the screenplay as an object of study in itself and, in doing so, challenges the main premise of la politique des auteurs—which promotes the director and downplays the screenplay and the screenwriter's work. However, the study of the screenwriter is complicated by coauthorship, the common method of writing practiced in the classical Hollywood mode of production. Classical Hollywood's assembly-line system adopted a form of industrial mechanization that standardized the production of creative artifacts, which deskilled artistic practices such as writing, directing, and cinematography, reducing them to impersonal and routine activities. Authors, directors, and cinematographers had to work against this system in order to express any individuality, which is what the auteur critics celebrated—although they tended to confine their focus to the work of directors.4 Meryman portrays the historical background to the coauthorship dispute from Mankiewicz's perspective based primarily on the recollections of Mankiewicz's friends and acquaintances, although he also reproduces several written statements (primarily letters) from Mankiewicz concerning his contribution to the Citizen Kane screenplay. In one letter (to his father), he claimed: "There is hardly a comma that I did not write" (Mankiewicz, quoted in Meryman 1978, 238). In another letter (to Alexander Woollcott), Mankiewicz wrote: "I feel it my modest duty to tell you that the conception of the story, the plot, the characters, the manner of telling the story and about 99 percent of the words are the exclusive creations of Yours, Mank" (Mankiewicz, quoted in Meryman 1978, 255). He later revised his estimate down to 98%.

2. In the archives of the Museum of Modern Art in New York, Lebo identified the "Correction Script": "The eighty-five-page script is undated, but based on the contents, it was written after the Third Revised Final [i.e., the seventh and final official version of the Citizen Kane screenplay] was completed" (Harlan Lebo 2016, 61). Lebo demonstrates how Welles further edited and reduced the screenplay just before as well as during filming in June 1940. He has posted the "Correction Script" online: https://www.scribd.com/document/482031782/Citizen-Kane-The-Correction-Script-is-this-the-last-script-for-Orson-Welles-masterpiece
3. Screenwriting Research Network: https://screenwritingresearch.com/
4. Auteur critics such as Andrew Sarris promoted Welles the director as the ultimate auteur struggling against the Hollywood Studio system's assembly-line mode of production (Sarris 1968, 77–81). Richard Corliss subsequently extended auteur criticism to screenwriting (Corliss 1974).


Variety also reported that Mankiewicz lodged a complaint with the Screen Writers Guild (the WGA's name before 1954) but withdrew it in January 1941:

Herman Mankiewicz, scripter, registered a protest with the Screen Writers Guild, demanding screen credit for his work on "Citizen Kane." When the war between William Randolph Hearst, Orson Welles and RKO broke out over the picture, Mankiewicz lost all interest and withdrew his protest. Now the studio has decided to give him full credit (Anon. 1941, 1).5

For Welles, this was a matter of history repeating itself, for he had recently fought for the right to be considered the author of the War of the Worlds radio play, the play that panicked radio listeners in 1938 and established his reputation. Hadley Cantril’s 1940 book-length study of the psychology of panic in relation to the broadcast published the script but attributed it to Howard Koch (Cantril 1940). Welles wrote to Cantril6 accusing him of misrepresenting the provenance of the War of the Worlds broadcast. And when CBS made a docudrama, “The Night America Trembled,” about the broadcast in 1957, Welles sued CBS, believing he owned the copyright. But he lost the case in 1962 (McFarlin 2016, 704). Both of Welles’s actions again demonstrate how a writer’s credits play an enormous role in determining their reputation. Welles consolidated his career in radio on “The March of Time” series in 1934, a show in which the cast and crew worked quickly to report and dramatically reenact current news events. John Dunning reports that “The March of Time” “was accused of being pompous, pretentious, melodramatic, and bombastic. But it was never dull” (Dunning 1998, 436).7 And Welles’s career as a writer-director in radio began in 1937 with his adaptation of Victor Hugo’s thousand-page novel Les Misérables into a three-and-a-half-hour radio series (divided into 30-minute episodes) for the Mutual Broadcasting System. This series revealed Welles’s talent for condensing, reshaping, and editing long, sprawling narrative texts. Later that year, he and John Houseman cofounded the Mercury Theatre Company and, a year after that, The Mercury Theatre on the Air, which broadcast on CBS radio dramatizations of popular literature, including Dracula, Treasure Island, and A Tale of Two Cities. However, in an interview with Peter Bogdanovich, Welles confirms that Houseman took charge of the radio scripts: “For the radio shows, [Houseman] acted as super editor over all the writers; he produced all the first drafts” (a position Houseman also took with Mankiewicz in writing the first draft of Citizen Kane) (Welles, in Orson Welles, Peter Bogdanovich, and Jonathan Rosenbaum 1992, 55). Patrick McGilligan adds: After talking the stories over with Houseman, Welles left his partner to his own devices. Orson would drop in regularly to make criticisms and changes in the scripts, but the first drafts were Houseman’s responsibility. Houseman worked much the way Orson himself often worked on scripts, lying in bed in his apartment that summer, surrounded by copies of

5. Furthermore, Mankiewicz was given first-position credit, suggesting that he contributed the most substantial amount to the screenplay.
6. Reproduced in Simon Callow's biography of Welles (Callow 1996, 490–92).
7. "The March of Time" newsreel first appeared in cinemas in 1935.


the book they were adapting, samples of usable scripts for reference, and an array of tools: scissors, a paste pot, a supply of pencils (McGilligan 2015, 468–69).

It was during the Mercury Theatre's expansion into radio that Houseman hired Howard Koch to write several adaptations. "Koch signed a six-month contract with Mercury with a clause giving him the future rights to any radio script he wrote," writes McGilligan (2015, 498). War of the Worlds was his fourth assignment. McGilligan charts the writing and revising processes: Welles suggested both adapting H.G. Wells' War of the Worlds and presenting the first half as a series of news bulletins as if the invasion was taking place in real time (while the second half employs first-person singular narration, Welles's favored radio technique, whereby a character embedded within the story acts as the story's narrator). Koch then wrote the first draft in a matter of days, and Welles listened to a read-through of this first draft 4 days before the broadcast. "On Sunday, October 30," McGilligan continues, "Orson arrived shortly after noon for the customary day-long preparations for the evening broadcast, and in the studio he was the leader beyond dispute" (McGilligan 2015, 501). (Welles had been used to such a tight schedule ever since he worked on the "March of Time" series.) McGilligan adds: However divergent the eyewitness accounts of this radio production, they all agree on their portrait of the single-minded and clearheaded Welles, shaping the evolution and quality of "War of the Worlds" in spite of the staff's continued opposition and skepticism, thoroughly in command of the show's concept and details. Every important decision was his to make; he was the producer and star, and the highest artistic executive (McGilligan 2015, 504).

Yet such an account of Welles’s creativity relies only on hearsay, memory, and opinions. Furthermore, Welles’s conflicts with Cantril and CBS show that copyright law and the Writers Guild of America’s definition of authorship do not favor his creative working methods. The WGA guidelines and copyright law are based on a literal definition of authorship: in other words, an author’s contribution needs to be quantifiable and fixed in a tangible work. In terms of tangibility, copyright protection is clear—it does not cover intangible ideas: “Copyright protection subsists, in accordance with this title, in original works of authorship fixed in any tangible medium of expression” (Library of Congress and Copyright Office 2021, section 102, 8). The copyright law adds: “In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work” (ibid). In other words, copyright law (in 1938 as well as in the present8) advocates an expression-only rule: only tangible work can be copyrighted, and an author is someone who creates that work. In terms of quantity, attributing authorship is measured in the WGA guidelines but is more diffuse in copyright law (although ideas briefly expressed in

8. In his commentary on The War of the Worlds legal case, Timothy J. McFarlin points out that "Under prevailing law, both in 1938 and today, the contribution of ideas and ideas alone—no matter how vital—did not and does not constitute authorship of a copyrightable work" (McFarlin 2016, 706).


material form cannot be copyrighted). For an author to be legally defined as an author, they need to create an independently copyrightable work or, if contributing to a collective work, an independently copyrightable contribution to that work. Copyright lawyer William Patry argues: In order to be a "joint" author, one must be an "author." To be an author, one must independently create and contribute at least some minimal amount of expression. The requirement that each joint author contribute expression has important policy and constitutional implications. From a policy perspective, the requirement ensures that the scope of the joint authorship doctrine is not expanded to include editors, research assistants, actors in plays, and movie consultants. Instead, reward for such contribution is left to contract (William Patry, quoted in Michael B. Landau 2014, 170).

A joint author is not, therefore, an editor, research assistant, or consultant; a joint author must firstly be identifiable as an author, someone who independently creates and contributes “at least some minimal amount of expression.” Copyright law does not define creativity in terms of aesthetic or cultural values; instead, creativity means that the work is original in the sense that it is not produced in a mechanical or routine way and is therefore distinguishable from other works. Furthermore, a joint author is an author whose independently copyrightable work is intended to be combined with the work of another author (or authors): “A ‘joint work’ is a work prepared by two or more authors with the intention that their contributions be merged into inseparable or interdependent parts of a unitary whole” (Library of Congress and Copyright Office 2021, section 101, 4). Copyright law does not deny the value or significance of ideas; it instead applies copyright to a tangible object rather than an ineffable idea or process. The stylometric analysis of screenplays carried out later in this study adheres to the expression-only rule—to the analysis of a written work as a form of physical evidence that can be distinguished from other works but without taking into account aesthetic or cultural values embedded in that physical evidence. In September 1939, Welles also hired Mankiewicz (who was recovering in the Cedars of Lebanon hospital in Los Angeles from a broken leg sustained in a car crash) to adapt literary classics for The Mercury Theatre on the Air. Meryman reports that Mankiewicz received $200 a script but (as was the norm for a Mercury Theatre contract) no writing credit: “as a publicity device all the radio shows were billed as written, produced, directed by Orson Welles.” Assisted by Houseman, “Herman wrote weekly Mercury scripts for Huckleberry Finn, Rip Van Winkle, Vanity Fair, The Murder of Roger Ackroyd, Dodsworth” (Meryman 1978, 241–42). Welles also hired Mankiewicz (again under a Mercury Theatre contract) to help him with the screenplay that would eventually become Citizen Kane but claims: “I had no intention on Mank being the coauthor. None. Rightly or wrongly, I was still without self-doubt in my ability to write a film script. I thought Mank would do that anecdotal kind of thing about Hearst, give me a few ideas, fight me a little …” (Welles, quoted in Meryman 1978, 248). A preplanning phase between Mankiewicz and Welles took place throughout January 1940. In February, Mankiewicz then began writing, insisting that Houseman join him to edit the first draft. Both headed to Campbell Ranch in Victorville,


California, and completed an extraordinarily long first draft of the screenplay (called American) on April 16. Welles later stated in an interview with Bogdanovich that “after mutual agreements on story line and character, Mank went off with Houseman and did his version while I stayed in Hollywood and wrote mine” (Orson Welles, in Welles, Peter Bogdanovich, and Jonathan Rosenbaum 1992, 54). However, in an interview with Meryman, Welles claims that he wrote a first draft before Victorville: According to Welles, he himself wrote the first script, a mammoth, 300-page version, mainly dialogue, which Herman actually took with him to Victorville. […] Welles had never before mentioned that massive piece of preliminary work. Hitherto he always based his claim of primary authorship on a very different story. That one has him writing his own, original screenplay, not before Victorville, but in parallel with Herman during Victorville. When the two scripts were finished, Welles combined them both (Meryman 1978, 251–52).

Meryman is skeptical that Welles wrote a complete draft: “There is no corroborating evidence of early drafts by Orson Welles” (Meryman 1978, 256). Welles’s different accounts of what happened 30 or more years in the past either are based on faulty memory or emerge from his copyright issues with Cantril and CBS (i.e., he claimed to have fixed his ideas in tangible form). In terms of coauthorship, Meryman vacillates from one position to the next: [I]n the overall, historical puzzle of who did exactly what, Herman’s percentages [that he wrote 98% or 99%] seem implausible. Houseman, a fluent man of intellect, must have made valuable input. It is impossible to imagine that the protean Welles at the height of his powers was merely an admiring spectator. Moreover, Herman’s version ignores some five weeks of preliminary script discussions (Meryman 1978, 256).

Nonetheless, Meryman concludes that “Herman’s pride of authorship was justified. […] What Herman created in Victorville contained all the characters, nearly all the scenes, and more than 60 percent of the dialogue in the finished film. Herman Mankiewicz wrote Citizen Kane” (Meryman 1978, 256). He revises Mankiewicz’s exaggerated percentage down to 60% of the dialogue but gives him credit for writing most of the scenes. However, Meryman does not indicate how he calculated these percentages. Pauline Kael reignited the dispute in 1971 with her 50,000-word article “Raising Kane,” which reproduced much of the hearsay and rumor surrounding the screenplay’s authorship. Like Meryman, she also sided with Mankiewicz, claiming that he “wrote the first draft in about three months and tightened and polished it into the final shooting script of Citizen Kane in a few more weeks” (Kael, in Pauline Kael, Herman Mankiewicz, and Orson Welles 1971, 50). She denied Welles had any creative input in the screenplay. Her old-time nemesis, Andrew Sarris, responded in a series of articles published in the Village Voice, followed by Welles’s friends and supporters, including Peter Bogdanovich, Joseph McBride, and Jonathan Rosenbaum. “How much of the final script of ‘Citizen Kane’ was written by Herman J. Mankiewicz and how much by Orson Welles?” Sarris asked. He responded that nobody knows because “literary collaboration, like marriage, is a largely unwitnessed interpenetration of psyches” (Sarris 1971a, 63). Nonetheless, the stylometry presented in this study attempts to untangle the “interpenetration of psyches” via a


close analysis of the material evidence that survives—the various drafts of the screenplay. Sarris also offers a number of observations on the quality of Mankiewicz’s dialogue: “The one expression critics invariably used to describe Herman Mankiewicz’s dialogue was ‘grown up,’ not witty or New Yorkerish or hilarious, but simply grown up. And the dialogue in ‘Citizen Kane’ is nothing if not grown up” (Sarris 1971b, 59). Although he seems to be attributing much of the screenplay to Mankiewicz in this comment, Sarris added: In all the space she has devoted in “Raising Kane” to an analysis of the script, Pauline Kael never mentions the brothel scene with Kane and Leland, a scene that the censors knocked out of Mankiewicz’s original script, one of many elisions that tended to tip the viewpoint of the film from the more sensual Mankiewicz to the more theatrical Welles. Curiously, the rigid censorship that was in force in 1941 worked to the advantage of Welles vis-a-vis other directors with fewer hang ups about women (Sarris 1971b, 59).

Some of these qualitative assertions are difficult to substantiate. The sensuality of Mankiewicz’s language and the theatricality of Welles’s could be confirmed via a vocabulary analysis—once the vocabulary of each category (sensuality and theatricality) had been established. (I address this issue in Chap. 7 in relation to the LIWC software program.) However, what is grown-up dialogue? The term itself is too vague (as indeed is the comment about Welles’s hang-ups about women), although one could probably begin with a content analysis of a screenplay’s themes. Peter Bogdanovich published a response to Kael in Esquire in 1972, consisting of long excerpts from his interviews with Welles and Charles Lederer. Bogdanovich (with considerable input from Welles) used his interviews to refute several assertions in Kael’s article, including the claim that Mankiewicz was offered $10,000 to leave his name off the screenplay, and he quotes Lederer, saying that Mankiewicz objected to Welles’s many revisions to the text, which challenges Kael’s account that Welles did not contribute to its writing: Manky was always complaining and sighing about Orson’s changes. And I heard from Benny [Hecht] too, that Manky was terribly upset. But, you see, Manky was a great paragrapher—he wasn’t really a picture writer. I read his script of the film—the long one called American—before Orson really got to changing it and making his version of it—and I thought it was pretty dull (Charles Lederer, in Peter Bogdanovich 1972, 104).

Lederer added that “Orson vivified the material, changed it a lot, and I believe transcended it with his direction” (Lederer, in Bogdanovich 1972, 105). Bogdanovich finds it curious that Kael did not interview Welles or Lederer when researching her article. Kael’s biographer, Brian Kellow, argues that in writing “Raising Kane,” Kael has her own coauthorship issue to address (Kellow 2011). She discovered in mid-1969 that Howard Suber, an assistant professor at UCLA, had carried out extensive work on Citizen Kane (analyzing numerous drafts of the screenplay and interviewing several people, including Dorothy Comingore, who plays Susan Alexander in the film; Mankiewicz’s widow, Sara Mankiewicz; and Robert Wise, the film’s editor). According to Kellow, Kael asked Suber if he would like to cowrite an essay on Citizen Kane to coincide with the screenplay’s publication. Suber agreed, and


“Pauline sent Suber a check for a little over $375, telling him it was half of the advance she had been paid, and he turned over his research materials to her” (Kellow 2011 158). He also wrote up some of his research for the Introduction and sent it to Kael. But their coauthorship was never formally agreed in writing, and Kael never discussed his part of the Introduction. When Suber received the “Raising Kane” copy of The New Yorker, he saw evidence of his research shaping Kael’s essay, but his name was not mentioned anywhere. In his memoir Run-Through, John Houseman claims that he played a major role in creating the first draft of the Citizen Kane screenplay. He also maintains that it was Mankiewicz (not Welles) who had the idea of “telling a man’s private life (preferably one that suggested a recognizable American figure), immediately following his death, through the intimate and often incompatible testimony of those who had known him at different times and in different circumstances” (Houseman 1972, 449). Mankiewicz pitched the idea to Welles, who was looking for a suitable subject to write an original screenplay for his first film (after his initial projects, the adapted screenplays of Joseph Conrad’s Heart of Darkness and Nicholas Blake’s Smiler with the Knife, were shelved). In Houseman’s version of events, he and Mankiewicz worked on the screenplay for ten weeks, producing a 400-page document, which they then spent a further two weeks attempting to shorten, with Houseman taking credit for the “News on the March” sequence (a pastiche of “The March of Time” newsreels): in the final two weeks, “we worked on the connective tissue, substituting sharp cinematic cuts and visual transitions for what, in the first version, had too often been leisurely verbal and literary expositions. And, for the twentieth time, I reorganized the March of Time, which had become my special domain” (Houseman 1972, 456). After hearing that Welles claimed coauthorship of the screenplay, Houseman reveals that he wrote a letter to Welles (but never sent it), saying that if anyone should claim coauthorship, it should be him (Houseman) (Houseman 1972, 464). Yet, as Meryman reminds us, the screenplay was still incredibly long and would require a reduction of 50% to make it filmable. Furthermore, the sharp cinematic cuts and visual transitions were implemented in later drafts. Houseman only mentions in passing Welles’s additional rewrites: “Orson, in the half-dozen ‘revised’ and ‘finally revised’ scripts that ground out of the RKO mimeographing machines, did no more than a creative director is accustomed to do before and during the shooting of a film” (Houseman 1972, 459). Houseman even claims that “under the current rules [of the Writers Guild of America] Welles could not possibly have claimed a writing credit for his contribution to Citizen Kane” (Houseman 1972, 460). However, this opinion is challenged by the following statement, which Meryman quotes from Frank Pierson, who argues that the changes Welles made at that time were more than enough to earn him a second line credit: Screenwriter and director Frank Pierson (Cool Hand Luke, Dog Day Afternoon, A Star Is Born) sits on the guild arbitration committee. It is Pierson’s personal opinion that if Welles wrote original material during the cutting of American, the changes made at that time were more than enough to earn him a second line credit—signifying that Welles was not coauthor but did make significant contributions (Meryman 1978, 265–66).


In the current WGA guidelines, for the first author to receive credit on an original screenplay (one not based on prior written material), his or her contribution must exceed 33% of the final script, and a subsequent writer must contribute at least 50% (Writers Guild of America 2018, 15). The Guild adds the following requirements: The percentage contribution made by writers to screenplay obviously cannot be determined by counting lines or even the number of pages to which a writer has contributed. Arbiters must take into consideration the following elements in determining whether a writer is entitled to screenplay credit:

• dramatic construction;
• original and different scenes;
• characterization or character relationships; and
• dialogue (Writers Guild of America 2018, 16).

Within its quantitative framework, the WGA emphasizes that a second writer's contribution and changes need to be qualitative. However, it is this emphasis on the quality of a screenplay that has generated an unresolvable dispute over Citizen Kane's authorship. Robert Carringer's close study of the screenplay's text in its various drafts (discussed below) has in part transcended this dispute (Carringer 1978).9 In this study, I examine the screenplay's text even more closely, in minute detail, via a quantitative analysis, for authors can be distinguished according to variations in the frequency of linguistic features in their writing. We can use the current WGA guidelines as a general framework to steer our discussion of the conflict over Welles's Citizen Kane writing credit, for the guidelines emphasize what has always been important in determining a writer's screen credit. What is immediately evident is that screen credit cannot be awarded simply by a subsequent writer editing the first writer's screenplay, for the subsequent writer must also write original dialogue and new scenes and develop characters. Rather than focusing exclusively on Welles's contribution to the Citizen Kane screenplay, it is worthwhile considering the status of Mankiewicz's contribution: should he have received a "story by" credit rather than shared the "screenplay by" (or the "original screenplay by") credit? In the current WGA guidelines, a story is distinct from a screenplay and consists of a "basic narrative, idea, theme or outline indicating character development and action" (Writers Guild of America 2018, 14). American, the first draft of Citizen Kane, could by itself only earn Mankiewicz a "story by" credit since it was unfilmable. The actual expression in writing of these story elements in screenplay format (scenes and dialogue) requires a "screenplay by" credit. If the author wrote both the story and screenplay, their credit would say "written by." If Mankiewicz also reshaped American into a filmable screenplay, then he would receive a "written by" credit. The following section examines coauthorship in more detail.

 Carringer’s essay is republished in his book The Making of Citizen Kane (1984).

9


2.2 The Study of Coauthorship

From the perspectives of copyright law and the WGA guidelines, writing is perceived as a solitary and individual act of creation that results in producing an original tangible text that embodies the writer's character. In this conception of authorship, it is through their peculiar selection and combination of words that writers impress their personality into a text, making it both original and personal, enabling the writer to claim ownership of that text. Although the classical Hollywood studios' depersonalized factory system complicates the idea of single authorship, Welles's contract with RKO offered him unprecedented creative control—at least on Citizen Kane. Nonetheless, Welles still needed assistance with writing the screenplay.

Coauthorship also complicates the idea of the solitary writer and their singular act of creation, for individuality is subsumed under a common goal (an agreed plan) and shared credit. The work of Renaissance painters raises similar issues, for Old Masters employed apprentices in their workshop to carry out much of the routine painting activities (especially filling in the background, sky, and décor), creating images that had been planned by the master, who primarily worked on fleshing out figures.

In their study of collaborative writing, Lisa Ede and Andrea Lunsford reverse-engineer the shared writing process. They define collaboration in terms of the following roles and activities: "written and spoken brainstorming, outlining, note-taking, organizational planning, drafting, revising and editing" (Ede and Lunsford 1990, 14). Coauthors take up different roles, share activities within the writing process, and collaborate on different levels. There are three main scenarios to consider. Firstly, more than one author may work on the project at the same time (they work together collaboratively and simultaneously, back and forth in constant dialogue and brainstorming). Secondly, coauthorship can take place collaboratively and consecutively: an author works individually on a text and then passes it on to their coauthor, who may add to it and/or edit what has already been written. This process is usually reciprocal, with each author working in turn as an editor on their coauthor's writing (unless there is a hierarchy between junior and senior authors). In this second scenario, authors remain separate but combine their writing. In a third scenario, coauthors work separately and consecutively: they remain disconnected and write separate parts of a document. In other words, regarding the specific parts of a text they write, there is no collaboration—that is, no coediting (although there may be coplanning); instead, authors contribute independently to the same written document.

In sum, coauthors may collaborate simultaneously or consecutively or contribute independently to a shared document. The first scenario offers the opportunity for a united, harmonious collaboration, with each author's style meshed or comingled, while independent collaboration is more prone to the creation of a disjointed and uneven text. In other words, the second and third scenarios favor individual competition over shared cooperation. Nonetheless, in the second and third scenarios, one author may revise the whole document to make it consistent and coherent. Meryman contends that Welles was


good at “augmenting, revising, rearranging, transposing—imprinting his personality on whatever he controlled” (Meryman 1978, 257). He reinforces this point by quoting Welles’s assistant from the 1940s, Richard Barr (spelled Baer under his Associate Producer credit on Citizen Kane): Mankiewicz provided Welles with the blueprint of a masterpiece. Then Welles, with solid help from Herman, added important enrichments and refinements and lifted the script the final distance. “I know Orson touched every scene,” says Richard Barr, Welles’s assistant, who later became a Broadway producer and president of the League of New York Theaters. “And I don’t mean cutting a word or two. I mean some serious rewriting, and in a few cases he wrote whole scenes. I think it’s time history balanced this situation” (Richard Barr, in Meryman 1978, 257–58).

Simon Callow also enforces the opinion that Welles was a great editor, although not necessarily an accomplished writer: “Writing was something at which [Welles] felt he should be good; he never was. […] he was, however, an inspired editor” (Callow 1996, 518–19). Kenneth Tynan similarly argues that writing was not Welles’s primary strength. He described Welles as “a superb bravura director, a fair bravura actor, and a limited bravura writer; but an incomparable bravura personality” (Kenneth Tynan, quoted in Joseph McBride 2006, 216).10 On Welles’s theatrical production of Julius Caesar, McGilligan describes Welles’s process of editing and rewriting: “When Orson was alone, he worked swiftly and furiously. Slashing away at Shakespeare’s script, he boldly combined and transposed, excising a number of scenes …” (McGilligan 2015, 413). Callow and Tynan suggest that Welles perceived authorship only in terms of total control of all the creative aspects of filmmaking—writing, acting, and directing. Yet editing an author’s work does not in itself define one as a coauthor. We saw above that, in the legal definition of coauthorship, each collaborator must produce a separate independent contribution. If an author is legally defined as someone who expresses an idea in tangible form in an artifact or a material object, then the procedure or process of one author editing another author’s work is secondary, as is an author’s procedure of combining their own work with another author’s work. For example, Welles’s activity of editing Julius Caesar does not make him Shakespeare’s coauthor, and neither does Welles’s activity of merging his fragments of Citizen Kane with Mankiewicz’s fragments of Citizen Kane to create a unitary whole define him as the screenplay’s coauthor. Instead, only if Welles contributed the elements of writing listed by the WGA—dramatic construction, original and different scenes, characterization or character relationships, and dialogue—does he become a coauthor. In his testimony to the “Ferdinand Lundberg vs Orson Welles, Herman J. Mankiewicz and R.K.O. Pictures” copyright infringement case,11 Welles offered a series of brief (and sometimes vague) comments on his collaboration with  McBride (who knew Welles personally) adds that Tynan describes Welles “with considerable accuracy” (McBride 2006, 216). 11  Deposition of Orson Welles (1949), held at U.S.  District Court for the Southern District of New  York (Welles’s deposition was taken in Casablanca, where he was making Othello). The 10


Mankiewicz: “There was nothing unusual in our method of collaboration. Following preliminary conversations, rough drafts of a shooting script were prepared. There were subsequent conversations and discussions, further drafts and finally the finished product was agreed upon” (Welles 1949, 3). Welles’s vagueness is enhanced by his use of the passive voice, which directs attention away from the subjects carrying out the activity and toward the end product of that activity (the shooting script). He added that the process of rewriting the screenplay cannot be documented clearly because “The sequences of scenes in this screen play were changed so often and in so many drafts that it would be impossible for me to state with any reasonable accuracy what methods were employed and what contributions were made in degree, time, sequence or dramatic logic” (Welles 1949, 4). But he added: “I have a clear recollection of the conversations in which scenes and sequences were eliminated since this part of the job were the last one undertaken” (Welles 1949, 4), suggesting that he was heavily involved in editing and cutting the screenplay. Houseman and Kael (the latter in fact simply repeating the former) claim that Welles did very little writing for or editing of the screenplay. The Rashomon effect of incompatible viewpoints is most evident when considering the transition from the first draft (American) to the seventh and final script of Citizen Kane. Robert Carringer has charted this transformation in great detail in his essay “The Scripts of Citizen Kane” (1978).

2.3 The Citizen Kane Screenplays

Carringer carried out extensive research at the RKO archives in Hollywood and concluded that Welles's contribution to the Citizen Kane screenplay "was not only substantial but definitive […]. When Welles himself becomes heavily involved in the writing, it will become apparent almost at once how greatly his ideas on how to deal with the material differ from [Mankiewicz and Houseman's ideas]" (Carringer 1978, 370; 372–73). Carringer identified seven complete versions of the screenplay in the RKO files, the earliest dated April 16, 1940, and the latest July 16, although he notes that revisions were made on a daily basis and inserted into a screenplay carrying an earlier date. The fourth draft is called the "Final,"12 followed by the "Revised Final," "Second Revised Final," and "Third Revised Final." From February to May 1940, Mankiewicz and Houseman, who were in Victorville, regularly updated Welles. Carringer reports that the first draft, called American and dated April 16, is over 250 pages with huge gaps in continuity. The length of the first draft is itself a contentious issue. In the version I have seen, the pagination ends on page 325. However, the screenplay jumps from page 212 to page

complaint involved Lundberg’s accusation that the defendants drew extensively from his book Imperial Hearst: A Social Biography (1937) in creating Citizen Kane. 12  The fourth draft is available from: http://www.dailyscript.com/scripts/citizenkane.html


271, with a note inserted saying "pagination is in error due to combination of two preliminary scripts." Carringer also mentions a 60-page gap in the pagination, although he does not specifically discuss any preliminary scripts written by Welles prior to or concurrently with Mankiewicz's first draft, and therefore does not consider whether American could contain sections of Welles's writing. (I return to this important point when I interpret the results of my stylometric analysis in Chap. 6.) In the second draft, dated 9 May 1940 (while Mankiewicz and Houseman were still working in Victorville), Carringer identifies a number of fundamental changes, including the deletion of several scenes and the addition of scenes of Kane and Susan. He notes that "roughly a third to a half of the lines are written substantially as they will be played in the film" (Carringer 1978, 390). After completing revisions on the second draft, Mankiewicz went to work on a screenplay for MGM while Houseman traveled to New York. Carringer therefore identifies revision to the third draft (which, he suggests, was completed on 27 May) as the moment Welles took over the project. According to him, in the third draft: about seventy-five pages of the Mankiewicz-Houseman material—most of it in the form of expository and character-dialogue sequences—had been eliminated. Typically, many of the deleted sequences have been replaced with snappy or arresting montages. It is the first unmistakable appearance of the witty bravado style that is the film's most characteristic trait. Creative ellipsis of this type will continue to be one of the most apparent signs of Welles' hand in the scripting (Carringer 1978, 394).

Furthermore, he notes that Welles condensed several scenes into short montage sequences and added around 140 new pages of material, although he adds that "not much of the material introduced by Welles at this point [except the montage sequences] will survive in the form in which it first appears" (Carringer 1978, 395). The fourth draft (dated 16 June), called "Final" and the first to be named Citizen Kane, involves further deletions, especially of the Rome sequence of events. Soon afterward (from June 18 to July 27), Mankiewicz was put on the RKO payroll again to further assist in the editing process. The fifth draft (24 June), the Revised Final, kills off Kane's wife and son in an automobile accident and makes a series of minor changes to other scenes. The sixth draft (9 July), the Second Revised Final, consists of 155 mimeographed pages (printed using a stencil and ink) and was the copy sent to the Hays Office (which awarded a seal of approval from the Production Code Administration, enabling films to be screened in major theater chains). This sixth draft contains numerous small revisions, plus a major addition (the famous breakfast table montage sequence), and a major deletion—of the assassination attempt on the President. Finally, the seventh draft (officially called the Third Revised Final, dated 16 July 1940) contains additional revisions, including a clearer development of the Leland-Kane relationship, revisions to some of Susan's lines, and the addition of the tent scene during the picnic. Carringer summarizes his findings:

13. The seventh version is published in Pauline Kael, Herman Mankiewicz, and Orson Welles (1971).


[C]ertain sections of the script were close to their final form at Victorville [i.e., in the first draft]. Principally these are the beginning and end, the newsreel, the projection room sequence, the first visit to Susan, and Colorado; that is, the Rosebud gimmickry and the elaborate plot machinery used to get Charles Foster Kane on and off stage—but none of the parts involving the adult Kane people actually knew. […] The Victorville scripts contain dozens of pages of dull, plodding material that will eventually be discarded or replaced altogether (Carringer 1978, 399).

I have already presented a few anecdotal comments on Welles's revision process. Carringer offers additional insights: "Unlike most writers, Welles' customary approach to revision is not to ponder and polish but to discard and replace. He works rapidly and in broad sweeps, eliminating whole chunks and segments at a stroke and, if necessary, replacing them with material of his own devising" (Carringer 1978, 399). And "[Welles's] somewhat frenetic scripting habits were unusual by Hollywood standards. They are, in fact, habits one associates with live mediums like radio and theater, where one learns quickly to perform with grace and aplomb under the pressures of deadlines, fate, and all the vagaries of the moment" (Carringer 1978, 400). Carringer ends by arguing that Mankiewicz provided Welles with story material, which Welles fundamentally transformed into a filmable screenplay. Carringer's judgment that Welles's contribution to the Citizen Kane screenplay was not only substantial but definitive at first appears to follow the WGA guidelines quoted above, which state that an author receives credit if he or she contributes "dramatic construction; original and different scenes; characterization or character relationships; and dialogue." According to Carringer, the development of the character of Charles Foster Kane (as well as other characters) appears to be the work of Welles: "Most of American is quite simply à clef plotting with only the barest effort at characterization. Kane himself at this stage is more an unfocused composite than a character portrait, a stand-in mouthing dialogue manufactured for some imaginary Hearst" (Carringer 1978, 385). In Carringer's opinion, Welles transformed Mankiewicz's unfocused composite of Kane into an authentic dramatic portrait of a rounded character. Carringer is guided by a comparative study of the seven successive drafts of Citizen Kane. He analyzes a specific set of editing changes that improve the screenplay's storytelling from one version to the next, including fundamental changes to the ordering (ordinatio) of story content, adding new scenes, shortening or deleting other scenes, and rewriting—that is, rewording the screenplay. An editor's addition of new scenes will of course impose their stylometric fingerprint on the screenplay, whereas rewriting has less of an impact, for (unless the original text is replaced entirely) rewriting only modifies the stylometric fingerprint. The other editing procedures rarely involve rewriting, including the deletion of text, Welles's favored method according to Carringer: "Creative ellipsis […] will continue to be one of the most apparent signs of Welles' hand in the scripting" (Carringer 1978, 394).14 But deleting text is secondary to the primary activity of producing new and original

14. Commenting on the fourth draft, Carringer adds that "the principal changes are deletions" (1978, 395).


material, according to copyright law and the WGA. To determine the authorship of the coauthored Citizen Kane screenplay, it is necessary to identify each author's stylometric fingerprint and ask: did Welles add a sufficient quantity of his own original text to the screenplay, and did he carry out enough revisions to completely transform Mankiewicz's stylometric fingerprint into his own stylometric fingerprint? This is an open question I answer in Chap. 6. Whereas Kael, Houseman, and Callow claim that Welles simply tidied up this draft before filming it, Welles, Lederer, Sarris, Bogdanovich, McBride, Rosenbaum, and Carringer claim that Welles was not only compelled to drastically cut the screenplay by deleting entire scenes and storylines but also had to write new scenes, characters, and dialogue. To return to Lunsford and Ede's terms, Mankiewicz and Welles began work on the project at the same time (brainstorming and arguing) before shifting to a different scenario where they worked consecutively: they primarily remained disconnected and wrote separately, with Welles editing and meshing his and Mankiewicz's writing to create a consistent and coherent screenplay.

In Chap. 6, I carry out a stylometric analysis of the seventh version (dated "July 16th, 1940") in order to detect coauthorship. Three outcomes are possible:

(1) Only single authorship is detected (Mankiewicz or Welles).
(2) There is evidence of coauthorship (Mankiewicz and Welles).
(3) Neither author is detected (neither Mankiewicz nor Welles).

Outcome 2 will confirm the coauthorship hypothesis, while outcome 1 will reject it, and option 3 is unlikely (but should nonetheless remain a possibility). With outcome 1, we would need to determine if either Mankiewicz or Welles is the single author. (Historical evidence of course does not rule out Mankiewicz, which means that the issue with outcome 1 would be whether Welles wrote any of the screenplay.) With outcome 2, detecting coauthorship, we need to determine the following:

(2a) The boundaries where authorship changes.
(2b) The identity of the author of each section, and therefore
(2c) The type and quantity of writing each author contributed to the screenplay.

Identifying the boundaries where authorship changes in the screenplay is not straightforward, but locating them would assist in determining each author's contribution. We cannot rule out the possibility that other authors (such as John Houseman) worked on the screenplay, and we need to consider whether the text is fully collaborative at the sentence level (in which the dual authorship is inextricably mixed), which rules out drawing any conclusions beyond outcome 2, evidence of coauthorship. Such a view is bolstered by the belief that coauthors "may sometimes, whether deliberately or unconsciously, each accommodate his own style to that of the other" (Jackson 2003, 7). But, as Brian Vickers points out in Shakespeare, Co-Author, in English Renaissance theater, "collaboration was a normal way of sharing the burden of composition, producing a script more quickly, and taking part in a collective enterprise" (Vickers 2002, 27). Coauthorship was practiced by all the major playwrights (including Shakespeare) in order to save time, which implies a division of


labor in terms of writing separate scenes and acts. Hollywood screenwriting involves a division of labor with a subsequent screenwriter adding to and also rewriting the work of a previous screenwriter. These issues are central to understanding the provenance of the Citizen Kane screenplay, which was written at record speed due to the collapse of Welles's previous two film projects and his need to honor his RKO contract by delivering a completed film. This time pressure and Welles's practice of editing and changing radio scripts primarily by himself at the last moment suggest that Mankiewicz had little involvement in contributing to the screenplay after the second draft. From the qualitative critical judgments reviewed in this chapter, it seems that Welles not only edited Mankiewicz's screenplay; he also rewrote sections of it, contributed new scenes and dialogue, and developed the character of Kane (a role that he of course played in the film). Ambiguity remains over whether Welles wrote a separate draft of the screenplay and, if so, when he wrote it—before, during, or after Mankiewicz wrote his first draft.
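The decision procedure just outlined can be made concrete with a short illustration. The following Python sketch is purely illustrative: the function and the per-segment labels are my own hypothetical assumptions, not the attribution method itself, which is developed in the following chapters. It simply shows how per-segment attributions would map onto the three outcomes listed above and, under outcome 2, onto the boundaries (2a) where authorship changes.

def classify_outcome(attributions):
    """Map per-segment attributions ('M' = Mankiewicz, 'W' = Welles,
    None = neither author detected) onto the three possible outcomes."""
    detected = {a for a in attributions if a is not None}
    if not detected:
        return "Outcome 3: neither author is detected"
    if len(detected) == 1:
        return "Outcome 1: single authorship ({})".format(detected.pop())
    # Outcome 2: report the boundaries (2a) at which the attributed
    # author changes from one segment to the next.
    boundaries = [i for i in range(1, len(attributions))
                  if attributions[i] != attributions[i - 1]]
    return "Outcome 2: coauthorship; attribution changes at segments {}".format(boundaries)

# Purely hypothetical attributions for a sequence of screenplay segments:
print(classify_outcome(["M", "M", "W", "W", "M", "W", "W"]))
# Outcome 2: coauthorship; attribution changes at segments [2, 4, 5]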

References
Anon. 1941. Not Me. Variety 141 (7) (January 22): 1.
Bogdanovich, Peter. 1972. The Kane Mutiny. Esquire (October): 99–105; 180–190.
Callow, Simon. 1996. Orson Welles: The Road to Xanadu. London: Vintage.
Cantril, Hadley. 1940. The Invasion from Mars: A Study in the Psychology of Panic. Princeton: Princeton University Press.
Carringer, Robert L. 1978. The Scripts of Citizen Kane. Critical Inquiry 5 (2): 369–400.
Carringer, Robert L. 1984. The Making of Citizen Kane. Berkeley: University of California Press.
Corliss, Richard. 1974. Talking Pictures: Screenwriters in the American Cinema, 1927-1973. Woodstock, N.Y.: Overlook Press.
Dunning, John. 1998. On the Air: The Encyclopedia of Old-Time Radio. New York: Oxford University Press.
Ede, Lisa, and Andrea Lunsford. 1990. Singular Texts/Plural Authors: Perspectives on Collaborative Writing. Carbondale: Southern Illinois Press.
Houseman, John. 1972. Run-Through 1902/1941. New York: Touchstone.
Jackson, MacDonald. 2003. Defining Shakespeare: Pericles as Test Case. Oxford: Oxford University Press.
Kael, Pauline, Herman Mankiewicz, and Orson Welles. 1971. The Citizen Kane Book. Boston: Little, Brown.
Kellow, Brian. 2011. Pauline Kael: A Life in the Dark. New York: Viking Press.
Landau, Michael B. 2014. Joint Works Under United States Copyright Law: Judicial Legislation Through Statutory Misinterpretation. IDEA: The Intellectual Property Law Review 54 (2): 157–224.
Lebo, Harlan. 2016. Citizen Kane: A Filmmaker's Journey. New York: St Martin's Press.
Library of Congress and Copyright Office. 2021. Copyright Law of the United States and Related Laws Contained in Title 17 of the United States Code. Washington: Library of Congress.
Lundberg, Ferdinand. 1937. Imperial Hearst: A Social Biography. New York: The Modern Library.
McBride, Joseph. 2006. Whatever Happened to Orson Welles? A Portrait of an Independent Career. Lexington: University Press of Kentucky.


McFarlin, Timothy J. 2016. An Idea of Authorship: Orson Welles, the War of the Worlds Copyright, and Why We Should Recognize Idea-Contributors as Joint Authors. Case Western Reserve Law Review 66 (3): 701–67.
McGilligan, Patrick. 2015. Young Orson: The Years of Luck and Genius on the Path to Citizen Kane. New York: HarperCollins.
Meryman, Richard. 1978. Mank: The Wit, World, and Life of Herman Mankiewicz. New York: William Morrow and Company.
Sarris, Andrew. 1968. The American Cinema: Directors and Directions, 1929-1968. New York: E.P. Dutton.
Sarris, Andrew. 1971a. Citizen Kael vs Citizen Kane (I), Village Voice (April 15).
Sarris, Andrew. 1971b. Citizen Kael vs Citizen Kane (IV), Village Voice (June 9).
Vickers, Brian. 2002. Shakespeare, Co-Author. Oxford: Oxford University Press.
Welles, Orson. 1949. Deposition of Orson Welles (dated May 4, 1949) in the case of Ferdinand Lundberg v. Orson Welles, Herman J. Mankiewicz, and R.K.O. Radio Pictures, Inc. Available online from the National Archives: https://catalog.archives.gov/id/195993046
Welles, Orson, and Peter Bogdanovich, edited by Jonathan Rosenbaum. 1992. This Is Orson Welles. New York: HarperCollins.
Writers Guild of America. 2018. Screen Credits Manual: https://www.wga.org/uploadedfiles/credits/manuals/screenscredits_manual18.pdf

Chapter 3

Screenplays: Words on the Page

A peculiar document takes center stage in this study: the screenplay, which has only recently emerged as an object of academic scrutiny.1 Steven Price (2013) traces this shift in focus to the publication of the Citizen Kane Book in 1971 (Kael et al. 1971). He argues that "the competing claims to authorship of Welles and Mankiewicz brought the screenplay, rather than the film, into focus as a contested sight," which in its turn necessitated the "documenting [of] the material texts, rather than confining oneself to anecdotal accounts of screenwriters in Hollywood" (Price 2013, 162). In the following study, several screenplays are subjected to close scrutiny. In addition to Citizen Kane, the data consist of four training texts from Mankiewicz and Welles, texts that will define the statistical profile of both authors. The screenplays are Made in Heaven (Mankiewicz; unproduced, 1943–45), A Woman's Secret (Mankiewicz, 1949) (an adaptation of the novel Mortgage on Life (1946) by Vicki Baum), plus Welles's The Big Brass Ring (1987)2 and The Other Side of the Wind (1970). This selection was not random, but neither was it predetermined; instead, it was decided by the current availability of screenplays. For comparison, I examine two test screenplays (not authored by Mankiewicz or Welles): His Girl Friday (written by Charles Lederer, 1939) and All the President's Men (written by William Goldman, 1976). His Girl Friday was written around the same time as Citizen Kane, while All the President's Men was written between the time Welles wrote The Other Side of the Wind and The Big Brass Ring. Unproduced screenplays are less likely to have been rewritten by others (director, producer, etc.), which makes them suitable for determining a writer's style. Stylometry can still work on adaptations, although this is dependent on whether writers transform the source text into their own style. In this chapter, I compare Mankiewicz's adapted screenplay A Woman's Secret to its source material.

1. Key texts include Claudia Sternberg (1997); Steven Maras (2009); Steven Price (2010); Jill Nelmes (2010); and Ian W. Macdonald (2013).
2. After the publication of the screenplay, The Big Brass Ring was rewritten by George Hickenlooper and F. X. Feeney and directed by Hickenlooper in 1999. My analysis focuses on Welles's unproduced screenplay published in 1987.

3.1 The Components of a Screenplay

A screenplay typically consists of the following eight components:

(1) The Scene Heading (in CAPS), sometimes called a Slug Line, which introduces a new scene. It contains three types of information: whether the scene is an interior (INT.) or exterior (EXT.), the locale, and whether it takes place during the DAY or NIGHT. Location and time are separated by a single dash surrounded by a space before and after, for example:

EXT. XANADU - FAINT DAWN

Extensions or modifications may be added in parentheses to indicate the city (LONDON), season (WINTER), specific year (1941), whether the shot is STOCK footage, a special effect (PROCESS SHOT or MINIATURE), or whether the camera is inside a MOVING or TRAVELING vehicle.

(2) A secondary Slug Line (in CAPS) names a change within a scene (change from one room to another), as well as details of point-of-view (POV) shots (specifying who sees and what they see) or whether the shot is an INSERT.

(3) Scene Text (also called Stage Directions) describe the physical aspects of location, situation, characters, entrances and exits, and actions (without repeating what is in the Scene Heading). They present a visual and aural description of setting and character from the perspective of the film's audience. In a more precise analysis, Claudia Sternberg divides this part of the screenplay into three components: description, report, and commentary. She defines description as "passages which describe the setting and objects to be visualized on the screen" (Sternberg 1997, 71). Description conveys the setting almost frozen in time. For example, the Prologue of Citizen Kane describes Xanadu from various distances, both from the exterior and the interior:

Camera travels up what is now shown to be a gateway of gigantic proportions and holds on the top of it—a huge initial "K" showing darker and darker against the dawn sky. Through this and beyond we see the fairy-tale mountaintop of Xanadu, the great castle a silhouette at its summit, the little window a distant accent in the darkness (In Kael et al. 1971, 91).

The report mode “is typified by events and their temporal sequence and generally centers on the actions of human beings” (Sternberg 1997, 72). In other words, the report mode uses verbs to convey actions and events unfolding in the story:


A hand—Kane’s hand, which has been holding the ball, relaxes. The ball falls out of his hand and bounds down two carpeted steps leading to the bed, the camera following. The ball falls off the last step onto the marble floor where it breaks, the fragments glittering in the first ray of the morning sun (In Kael et al. 1971, 97).

Sternberg defines commentary in screenplays as "passages or parts of sentences which explain, interpret or add to the clearly visible and audible elements of the screenplay" (Sternberg 1997, 73). For example:

The dominating note is one of almost exaggerated tropical lushness, hanging limp and despairing—Moss, moss, moss (In Kael et al. 1971, 95).

Such sentences do not describe physical settings or report on actions but instead convey an abstract mood. Commentaries are more literary and authorial than the standardized description and report sections, and screenwriters are not encouraged to write commentary.

(4) The Character Cue (in CAPS) names the character (or another agent) who is about to speak.

(5) An optional Personal Direction or "Parenthetical" (because it is in parenthesis, immediately underneath the Character Cue) gives the actor instructions for delivering the line—(quietly) or (angrily), for example—or it describes an action to be performed with the dialogue, such as (looks up). A pause is indicated by (beat).

(6) Dialogue refers to the words spoken by the character named in the Character Cue. Pauses in and interruptions to dialogue are formatted using the following conventions: "Pauses in a sentence may be indicated with the ellipsis (three periods)." "The ellipsis is also used when a sentence is interrupted by personal direction. In this case, the ellipsis must appear both at the end break and at the beginning of the remainder of the sentence." "Pauses may also be indicated by using two dashes. Leave a space on each side of them" (Cole et al. 1990, 89). While both ellipses and dashes indicate a pause, dashes more specifically signify an interruption in dialogue or a sudden change in thought. If dialogue continues on the next page, then (MORE) is added to the bottom of the page and (CONT'D) is added to the top of the next page.

(7) Scene transitions: transitional instructions include:

FADE IN:
DISSOLVE TO:
CUT TO:
FADE OUT

Each instruction ends with a colon (except FADE OUT). A screenplay always begins with FADE IN: and always ends with FADE OUT. CUT TO: is only used on


specific occasions: "when there is not a logical progression from one scene to the next" (Cole et al. 1990, 51), that is, to mark a sudden and dramatic change in scene.

(8) Technical instructions: additional elements include technical instructions involving sound cues, camera cues, voice-over (V.O.), and a sound or voice off-screen (O.S.).

In a screenplay, the descriptive passages are written in short sentences in the present tense and active voice and are limited to what can be seen and heard from the perspective of the camera. The hidden aspects of a character are conveyed through their dialogue and the physical description of props, clothing, surroundings, and behavior. Except when characters are introduced for the first time, character cues (name of speaker) and dialogue continuity cues at the top and bottom of pages ("more" and "cont'd") are excluded from the screenplay analyses carried out in Chaps. 5, 6, and 7 because they are standard formatting issues that do not constitute a stylistic choice; furthermore, they are sufficiently numerous (around 4% to 5% of a screenplay) to affect the analysis of style. Deleting redundant elements is a widespread practice in stylometry, especially in the closely associated analysis of drama texts (plays).3
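This kind of exclusion can be automated before any counting begins. The following Python fragment is a minimal sketch only—it is not the preprocessing pipeline used in this study—and its assumption that character cues are short all-caps lines is mine, not the book's:

```python
def strip_formatting_cues(screenplay_text: str) -> str:
    """Remove character cues and dialogue-continuity cues before counting.

    Assumptions (illustrative only): character cues are short all-caps lines,
    and continuity cues appear as (MORE) or (CONT'D) on their own lines.
    """
    kept = []
    for line in screenplay_text.splitlines():
        stripped = line.strip()
        # Drop dialogue-continuity cues added at page breaks.
        if stripped in ("(MORE)", "(CONT'D)"):
            continue
        # Drop character cues: short all-caps lines that are not scene headings.
        is_caps = stripped and stripped == stripped.upper() and any(c.isalpha() for c in stripped)
        if is_caps and len(stripped.split()) <= 3 and not stripped.startswith(("INT.", "EXT.")):
            continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "INT. XANADU - NIGHT\nKANE\n(quietly)\nRosebud...\n(MORE)"
    print(strip_formatting_cues(sample))  # KANE and (MORE) are removed
```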

3.2 Welles's Screenplays

Welles's unproduced screenplay The Big Brass Ring (June 22, 1982, published in 1987) and his screenplay The Other Side of the Wind (1970) are used as test samples of his writing. Additional screenplays will be consulted, including the adapted screenplays The Magnificent Ambersons (adapted from Tarkington 1918) and Touch of Evil (adapted from Masterson 1956). Several sources have discussed the authenticity of the 1987 published version of The Big Brass Ring as Welles's own writing (with Oja Kodar contributing to some of the dialogue of one character). Jonathan Rosenbaum writes: "The Big Brass Ring allows us the rare privilege of hearing [Welles's] voice, resonant and unmistakable, still in the present, addressing us directly" (Rosenbaum 1987, 148). Rosenbaum also mentions how, during a lunch meeting, Henry Jaglom encouraged Welles to write an original screenplay (when Welles could not get financing for his adapted screenplay The Dreamers):

[Jaglom:] One night, about six days later, I got a phone call. [Welles] said, "I've got four pages." He was sweating; I could hear it: "Could I read them to you?" I said, "Sure," and he read to me—and I said, "My God, they're brilliant! Please keep on writing." And he said, "What are you, crazy? It's four in the morning; I've got to go to sleep." The next day he came to lunch, and he had 12 pages. And the next day he had 23. And in three months he had a script, one which I just could not believe. It was called The Big Brass Ring. It was absolutely the bookend to Citizen Kane. It was about America at the end

3 See, for example, Hugh Craig and Brett Greatley-Hirsch (2017, 30) and C.B. Williams (1970, 19).


of the century—socially and politically and morally—as Kane and Ambersons are about America at the beginning of the century (Jaglom, quoted in Rosenbaum “Afterword,” 139).4

The authenticity of The Big Brass Ring as Welles's own writing is indisputable. The Other Side of the Wind is a semi-autobiographical film that focuses on a chaotic birthday party for the legendary Hollywood director Jake Hannaford, who attempts to make a comeback with his new film (also called The Other Side of the Wind) made in European art cinema style. Welles's film combines the chaotic evening with a screening of Hannaford's new film. Joseph McBride (who appears in the film as a young film critic) recounts the time he asked Welles about the screenplay:

I asked Welles if he had been working on the script when I walked in. He laughed and said there wasn't any script; the film would be improvised. Seeing my surprise, he said he had written a script that would have run for nine hours on-screen but put it aside because he realized he was writing a novel. "I'm going to improvise out of everything I know about the characters and the situation," he said. He had a large cardboard box crammed with notes sitting next to the typewriter (McBride 2006, 155).

Josh Karp reports that Welles would regularly rewrite scenes the night before they were filmed: "it wasn't uncommon for cast and crew to leave [Welles] at the end of the night sitting in his pajamas pounding out revisions and then find him in the same position—still typing—when they returned in the morning" (Karp 2015, 67). Karp also quotes script supervisor Mary Ann Newfield, who said the screenplay was disjointed and over 200 pages long (Karp 2015, 147); the Netflix version is 212 pages. Oja Kodar again collaborated with Welles by starring in and directing several scenes of the film-within-the-film.

Turning to Welles's adaptations, we discover that his methods for adapting The Magnificent Ambersons and Touch of Evil are remarkably different. Welles wrote The Magnificent Ambersons in 1941 (the final screenplay is dated October 7, 1941),5 immediately after Citizen Kane. However, unlike Kane, it is an adaptation—of Booth Tarkington's famous novel of the same name. Geoffrey O'Brien has said of Welles's version of The Magnificent Ambersons: "the film is a stunning demonstration of Welles's genius for pinpointing the most expressive moments in the original text, while letting others go by. Tarkington was a masterful storyteller, but his presentation of character has a certain theatrical flatness; Welles's paring away has the effect of making the characters both more mysterious and more profoundly real" (O'Brien 2018). He adds: "What is most striking is Welles's faithfulness to the novel's language. The particularities of the way Tarkington's characters talk, as well as the cadences of the omniscient third-person narrator, were evidently essential to Welles's conception of the film" (O'Brien 2018). O'Brien's qualitative assessment of Welles's screenplay—Welles's faithfulness to the novel's language, especially the way characters talk and the narrator's cadences—could make the screenplay unsuitable as a sample of Welles's writing. I tested O'Brien's observations by

4 Rosenbaum transcribes the conversation from Wilmington (1987).
5 "The Magnificent Ambersons" (final shooting script) is available from: http://www.themagnificentambersons.com/final-shooting-script-cutting-continuity/


comparing the language of the novel and the screenplay using a simple software program called WCopyfind,6 which compares two documents side by side for overlapping words and phrases. The degree of overlap is set in advance by choosing the length of the matching phrase and by ignoring "outer punctuation." The latter setting disregards the quotation marks around dialogue in the novel; because dialogue in the screenplay carries no quotation marks, the software would otherwise fail to match a number of short words that have quotation marks attached to them in the novel. More generally, this software can be used to make adaptation studies more precise by systematically comparing the source material and the resulting screenplay.

To compare the novel The Magnificent Ambersons to the screenplay, the overlap was set to four words, ignoring outer punctuation. With these settings, every fragment of text with the same four consecutive words (minus dialogue quotation marks) is highlighted in both documents. The result is an overlap between novel and screenplay at the four-word level of 51%—that is, 10,569 words (or just over half of the 20,000-word sample of Welles's Magnificent Ambersons screenplay) overlap with Tarkington's novel.7 This overlap is evident in the voice-over (V.O.) spoken by the narrator as well as in character dialogue. The novel begins:

Major Amberson had "made a fortune" in 1873, when other people were losing fortunes, and the magnificence of the Ambersons began then. Magnificence, like the size of a fortune, is always comparative, as even Magnificent Lorenzo may now perceive, if he has happened to haunt New York in 1916; and the Ambersons were magnificent in their day and place. Their splendour lasted throughout all the years that saw their Midland town spread and darken into a city, but reached its topmost during the period when every prosperous family with children kept a Newfoundland dog.

The film retains the omniscient third-person narrator in the form of a voice-over, which is used extensively in the film's opening. The screenplay begins:

FADE In on a dark screen

NARRATOR
The magnificence of the Ambersons began in 1873. Their splendor lasted throughout all the years that saw their Midland town spread and darken into a city.

From these opening 26 words spoken by the film's narrator, 21 derive from Tarkington's novel. (The software does not match the words splendor to splendour due to the different spelling. But clearly, the match needs to be retained.)8

6 https://plagiarism.bloomfieldmedia.com/software/wcopyfind/
7 I then set the level to five words. This had a minor effect, only reducing the overlap to 45%.
8 The software can be set to accept such discrepancies using the 'imperfections' option. Setting it to '1' would then allow such imperfections to be matched.

The narrator in the novel and the film share the following observation concerning the hailing of a horse-drawn streetcar. Firstly, the novel:


… a lady could whistle to it from an upstairs window, and the car would halt at once and wait for her while she shut the window, put on her hat and cloak, went downstairs, found an umbrella, told the “girl” what to have for dinner, and came forth from the house.

In Welles's screenplay, the narrator says:

A lady could whistle to it from an upstairs window and the car would halt at once and wait for her while she shut the window, put on her hat and coat, went downstairs, found an umbrella, told the "girl" what to have for dinner and came forth from the house.

There are four minor nonmatches: Welles capitalizes the letter A; one word is different in the screenplay—Welles uses coat rather than cloak, and he deletes two commas (one after upstairs window and the other after dinner). The software has revealed subtle revisions that Welles has introduced—he capitalizes the A even though it does not appear at the beginning of a sentence (it is preceded by a semicolon); he changes one near synonym for another (cloak/coat), and he deletes two commas that come before the word and – the first one separates two simple sentences conjoined by and, and the second is a serial comma. Clearly, unlike Tarkington, Welles does not see the need to add a comma before and when linking sentences or when creating a list.9

We can explore the punctuation further by comparing the overlapping text. The samples are the same size: 10,569 words. In the fragments of Tarkington's novel that overlap with Welles's screenplay, there are 715 commas. And in the fragments of Welles's screenplay that overlap with Tarkington's novel, there are 521 commas, a reduction of 194 (a 27% decrease). Fifteen of these commas come before and, including the following:

Tarkington
… flute, harp, fiddle, cello, cornet, and bass viol would presently release to the dulcet stars such melodies as …

Welles
flute, harp, fiddle, cello, cornet and bass viol would presently release their melodies to the dulcet stars.

Tarkington
Georgie stepped upon the platform, and took up the emblem of office.

Welles

9 I attribute the rewriting to Welles, for he remains the uncontested author of the screenplay.


George steps upon the platform and takes up the horse-pistol.

In the first example, Welles has again deleted the final serial comma (after cornet); he has also shortened the sentence and changed word order so that the sentence ends on dulcet stars. In the second example, he deletes the comma joining two simple sentences (and, in line with the conventions of screenplay writing, he uses the present tense and makes the scene more descriptive by specifying what the emblem of office actually is). These two examples illustrate Welles's tendency to delete both serial commas and commas between simple sentences, relying instead on and by itself.

The film's dialogue is written in a similar way. Early in the novel, at the Ambersons' ball, George meets Lucy for the first time. George begins to criticize the other men at the ball when several ask Lucy to dance:

"How'd all those ducks get to know you so quick?" George inquired, with little enthusiasm.
[Lucy Morgan] "Oh, I've been here a week."
"Looks as if you'd been pretty busy!" he said. "Most of those ducks, I don't know what my mother wanted to invite 'em here for."
"Don't you like them?"
"Oh, I used to see something of a few of 'em. I was president of a club we had here, and some of 'em belonged to it, but I don't care much for that sort of thing any more. I really don't see why my mother invited 'em."
"Perhaps it was on account of their parents," Miss Morgan suggested mildly. "Maybe she didn't want to offend their fathers and mothers."
"Oh, hardly! I don't think my mother need worry much about offending anybody in this old town."
"It must be wonderful," said Miss Morgan. "It must be wonderful, Mr. Amberson—Mr. Minafer, I mean."
"What must be wonderful?"
"To be so important as that!"
"That isn't 'important,'" George assured her. "Anybody that really is anybody ought to be able to go about as they like in their own town, I should think!"

Welles transcribes all the dialogue (the text in quotation marks) almost word for word from the novel. There are a few minor differences: Welles deletes all in George's opening line as well as his expression Oh, hardly! and deletes all four of Tarkington's exclamation marks. The overlaps between the novel and screenplay continue, constituting in total 51% of the language in The Magnificent Ambersons screenplay at the four-word level, confirming O'Brien's observation about Welles's faithfulness to the novel's language and his ability to pinpoint the novel's most expressive moments. As it stands, Welles's Magnificent Ambersons screenplay is unsuitable as a sample of his writing. Nonetheless, because the software highlights every overlap of four words or more, the highlighted text can in principle be deleted. The resulting text would not be suitable for collocational and sentence analysis but could still be useful for word and n-gram analysis, although this will halve the size of the sample.
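WCopyfind is a stand-alone program, but the core operation it performs—flagging runs of four or more consecutive words shared by two documents—can be sketched briefly. The Python fragment below is an illustrative approximation only (it is not WCopyfind, and the tokenization rule is my assumption); it reports a rough percentage of a screenplay's words covered by four-word sequences that also occur in the source novel.

```python
import re


def tokenize(text: str):
    # Lower-case and keep only letter sequences, loosely mimicking the
    # "ignore outer punctuation" setting described above (an assumption).
    return re.findall(r"[a-z']+", text.lower())


def four_grams(tokens):
    """Set of consecutive four-word sequences in a token list."""
    return {tuple(tokens[i:i + 4]) for i in range(len(tokens) - 3)}


def overlap_percentage(screenplay: str, novel: str) -> float:
    """Rough share of screenplay words covered by four-grams also in the novel."""
    s_tokens, n_tokens = tokenize(screenplay), tokenize(novel)
    shared = four_grams(s_tokens) & four_grams(n_tokens)
    covered = set()
    for i in range(len(s_tokens) - 3):
        if tuple(s_tokens[i:i + 4]) in shared:
            covered.update(range(i, i + 4))
    return 100.0 * len(covered) / max(len(s_tokens), 1)


if __name__ == "__main__":
    novel = "the magnificence of the Ambersons began then"
    script = "The magnificence of the Ambersons began in 1873."
    print(f"{overlap_percentage(script, novel):.1f}% overlap at the four-word level")
```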


Sixteen years later, Welles directed Touch of Evil (released in 1958), an adaptation of the pulp thriller Badge of Evil by Whit Masterson (a pseudonym of Robert Allison Wade and H. Bill Miller) (the 1957 screenplay still bears this title).10 Welles adapted his screenplay not directly from the book but from a screenplay commissioned by the film's producer, Albert Zugsmith, from television writer Paul Monash. Zugsmith had shelved the project after reading Monash's adaptation because, John C. Stubbs suggests, "the screenplay is too prolix, and it contains problems with character motivation and plot" (Stubbs 1985, 23). In rewriting the screenplay, Welles had no coauthorship issues to contend with (no collaboration or coediting). Instead, he wrote his own screenplay by transforming both Monash's draft and Whit Masterson's novel.11 It is therefore no surprise that a direct comparison of the novel and Welles's screenplay yields an overlap of just 2.7% at the four-word level ignoring outer punctuation (a mere 582 words in a screenplay of 21,645 words). The most notable overlaps include two lines of dialogue from Sanchez (eventually discovered to be the criminal who planted the bomb):

"I've been told I have a very winning personality. The best shoe clerk the store ever had --"
"I've been at her feet ever since!"

He utters both phrases as Quinlan questions him about the bomb explosion. A second cluster of phrases relates to division:

on both sides of the
side of the border--?
on the other side of the
side of the fence
on your side of the
on their side of the
on my side of

These phrases highlight one of the main themes of the novel and film—the border between Mexico and the United States, where the police investigate a bomb explosion that takes place in a border town (an innovation Welles added to the screenplay). The repetition of these phrases in the novel and screenplay is therefore unremarkable. They are scattered throughout the whole screenplay. Nonetheless, a pattern in the repetition of words can be detected in the final scene, when Captain Quinlan's criminal activities (of planting evidence) are exposed

10 The "Touch of Evil" screenplay is available online: https://www.wellesnet.com/orson-welles-scripts-online/
11 'Since the Welles screenplay and film draw on story material and lines of dialogue from the novel which are not in the Monash version, we can be sure Welles used the original novel as a source, despite McBride's assertion to the contrary. By the same token, since the Welles screenplay and film also contain material and dialogue from the Monash screenplay not in the novel, we can be assured that Welles drew on the Monash screenplay, too, as he worked toward his final version' (Stubbs 1985, 20).


by Vargas and Quinlan's partner, Pete Menzies. In this final scene, out of the total of 582 words that overlap between the novel and film, we find a repetition of 93 words:

I thought you were
to tell the truth. I supposed to
take another oath in front of you?
too easy to duck
any difference does it make?
just made sure he paid for it.
I believed in you.
all the time, all these years ...
get away with it.
you happen to know about
I'm working for the department,
you can give me
If that's the way you want it,
I've got it right here
I didn't want to, you make me do it?...
dead -- you killed him!
as well as the

In other words, 16% of the overlap between the novel and the screenplay takes place in the final scene of the screenplay. Both The Magnificent Ambersons and Touch of Evil demonstrate Welles’s different approaches to adaptation, with his traditional practice of condensing, reshaping, and editing the source text in evidence much more in Touch of Evil than in The Magnificent Ambersons. But because Welles’s screenplay cannot be compared to Paul Monash’s version of the screenplay, it will not be used to develop Welles’s statistical profile.

3.3 Mankiewicz's Screenplays

Herman Mankiewicz's unproduced screenplay Made in Heaven (1943–45)12 and his adapted screenplay A Woman's Secret (Mankiewicz 2007) from Vicki Baum's novel Mortgage on Life (Baum 1946) are used as test samples of his writing.

12 In the 'RKO Radio Pictures Studio Records (Collection PASC 3)', Made in Heaven (1943–1945) is listed as production no. 1410. UCLA Library Special Collections, Charles E. Young Research Library, University of California, Los Angeles: https://oac.cdlib.org/findaid/ark:/13030/kt267nd72c/

To compare A Woman's Secret to Mortgage on Life, I set the overlap to four words, ignoring outer punctuation. The result was an overlap between the novel and screenplay at


the four-word level of just 3% (that is, 813 words of the 26,668-word screenplay overlap with the novel). The overlaps consist of small fragments of dialogue and scene description scattered throughout the screenplay, with one exception. But to explore that exception, I first need to note that in the screenplay, the names of the characters change. The main protagonist, arrested for shooting Betty, is called Marian, although she is also called Margaret:

INT. POLICE STATION - NIGHT

Margaret, a policewoman, and Fowler are standing in front of the desk. The Lieutenant on duty finishes writing something, then looks up.

MARIAN
You'll make that call right away, Inspector?

INSPECTOR JIM FOWLER
You're sure you don't want to call your lawyer?

MARIAN
Just the one call.

Margaret and the policewoman leave the scene, past the desk, towards the rear of the police station.

In the final film A Woman's Secret (Nicholas Ray, 1949), the woman is called Marian, which suggests that the name change from Margaret to Marian was not fully implemented in the screenplay. Betty is also called Susan, sometimes in the same scene. (She is called Susan in the film.) The following scene is the only moment in the screenplay that shows any significant overlap with the novel. In a flashback, Susan (Betty) explains to Marian and Luke why she left California:

SUSAN
[…] Well, to tell you the truth, I got into a kind of a little scandal back in Azusa and so when Mrs. Burgell offered me this money to come to New York, I grabbed it.

LUKE
(delighted)
A little scandal, eh?
(he rubs his hands)
Let's hear all about it, Betty.

MARIAN
Luke!

BETTY
There wasn't much to it. Naturally, all Mrs. Burgell wanted was to get me out of Azusa and away from that crazy husband of hers.

LUKE
That was all, eh?

BETTY


I figured she was out of her mind, but it was her money -- as if I'd have anything to do with a man twice my age. He must be thirty-six if he's a day. I thought she was just joking when she said he'd asked her for a divorce. Wouldn't you?

MARIAN
(taken aback)
Well, I -- I –

BETTY
You see, when you're working in a cafe, you always have some regular customers who drop in for laughs –
(interrupts herself)
How was I to know Mr. Burgell would turn around and tell Mrs. Burgell he wanted a divorce?
(reminiscent smile)
You should have heard the noise it made in Azusa. Mr. Burgell's about the richest man there, you know.

This is the most substantial overlap in the screenplay. Overall, its 3% overlap with its source novel is comparable to the 2.7% overlap between Touch of Evil and its source novel. The two Mankiewicz screenplays are combined to create a 40,000-word sample, and the two Welles screenplays are combined to create a second 40,000-word sample. Both samples are analyzed in Chap. 5 using the stylometric tests outlined in Chap. 4.

References

Baum, Vicki. 1946. Mortgage on Life. New York: Doubleday & Company.
Cole, Jr., Hillis R., and Judith H. Haag. 1990. The Complete Guide to Standard Script Formats. North Hollywood: CMC Publishing.
Craig, Hugh, and Brett Greatley-Hirsch. 2017. Style, Computers, and Early Modern Drama. Cambridge: Cambridge University Press.
Kael, Pauline, Herman Mankiewicz, and Orson Welles. 1971. The Citizen Kane Book. Boston: Little, Brown.
Karp, Josh. 2015. Orson Welles's Last Movie: The Making of The Other Side of the Wind. New York: St Martins Press.
McBride, Joseph. 2006. Whatever Happened to Orson Welles? A Portrait of an Independent Career. Lexington: University Press of Kentucky.
Macdonald, Ian W. 2013. Screenwriting Poetics and the Screen Idea. Basingstoke: Palgrave.
Mankiewicz, Herman. 2007. A Woman's Secret. Alexandria, VA: Alexander Street Press.
Maras, Steven. 2009. Screenwriting: History, Theory and Practice. London: Wallflower.
Masterson, Whit. 1956. Badge of Evil. New York: Dodd, Mead.
Nelmes, Jill, ed. 2010. Analysing the Screenplay. London: Routledge.
O'Brien, Geoffrey. 2018. Orson Welles's "The Magnificent Ambersons." NYR Daily: https://www.nybooks.com/daily/2018/12/26/orson-welless-magnificent-ambersons/
Price, Steven. 2010. The Screenplay: Authorship, Theory and Criticism. Basingstoke: Palgrave.


Price, Steven. 2013. A History of the Screenplay. Basingstoke: Palgrave.
Rosenbaum, Jonathan. 1987. Afterword, 137–48. In Orson Welles, with Oja Kodar, The Big Brass Ring. Santa Barbara: Santa Teresa Press.
Sternberg, Claudia. 1997. Written for the Screen: The American Motion-Picture Screenplay as Text. Tübingen: Stauffenburg.
Stubbs, John C. 1985. The Evolution of Orson Welles's Touch of Evil from Novel to Film. Cinema Journal 24 (2): 19–39.
Tarkington, Booth. 1918. The Magnificent Ambersons. New York: Doubleday, Page and Co.
Welles, Orson, with Oja Kodar. 1987. The Big Brass Ring. Santa Barbara: Santa Teresa Press.
Williams, C.B. 1970. Style and Vocabulary: Numerical Studies. London: Griffin.
Wilmington, Michael. 1987. The Scorpion and the Frog. L.A. Style 3 (1).

Chapter 4

The Statistical Analysis of Style: Aims and Methods

The types of linguistic data collected in this study and the methods adopted to analyze those data need to be outlined and justified to arts and humanities scholars. In this study, I do not identify the literary quality or general style of Mankiewicz's and Welles's screenplays, nor do I focus on grammar or linguistic meaning or rhetorical effects or an author's worldview. My methods of analysis and results are far removed from a human reader's experience or understanding of written texts. Instead, my study rests on evidence-based research that goes beyond the obvious and self-evident experience of reading, for it isolates, measures, and quantifies a combination of relevant—but seemingly marginal—linguistic features that can distinguish authors and serve as evidence of authorship. In what may sound like a restatement of Saussure's radical thesis that there are only differences with no positive terms, the following study presents a comparative analysis that only examines the differences between Mankiewicz and Welles. In this chapter, I present the ideas underlying the statistical approach to authorship attribution, briefly summarize three exemplary case studies (of The Federalist papers, the Aristotelian Ethics, and Pericles), and identify the linguistic features that provide the primary data of authorship.

4.1 The Fingerprint Analogy

To overcome the qualitative partisan views of the authorship of the Citizen Kane screenplay outlined in Chap. 2, this study carries out a quantitative comparative analysis of linguistic variables to discover those variables that distinguish Mankiewicz from Welles. The comparative analysis is quantified in order to introduce new data, which, in P.W. Bridgman's description of relativity theory, is "of an entirely different character from those of our former experience" (Bridgman 1927, 2). Statistics


measures and analyzes these new data in precise terms, including visually representing them in tables and graphs. Quantification simply involves the frequency count of linguistic features, which adds up the number of occurrences of a repeating event— such as an author reusing the same word type. In authorship attribution studies, quantification identifies linguistic features that different authors repeat at a significantly distinctive rate, for it is the differences in the relative frequency of multiple features that become a potential discriminating marker between the compared authors. The quantitative approach to distinguishing authors searches, in the samples, for the linguistic equivalent of the fingerprint. A fingerprint offers a useful analogy to the extent that it is a shared and inconspicuous part of the human body that nonetheless represents a distinctive trait of each individual when compared to other individuals. Of course, the fingerprints of an individual resemble the fingerprints of everyone else; it is the small variations within those similarities that distinguish one fingerprint from another. (We can also make a similar claim about DNA.) In crime detection, the fingerprint offers a reliable method of identification based on minute variations within an inconspicuous part of the human body. However, manually matching fingerprints in a crime scene to fingerprints held on file is a labor-intensive process—a process that the computer expedites. The fingerprint analogy is therefore useful in the analysis of writing in at least three ways: the focus on inconspicuous shared elements, the use of comparative study to search for minute variations within those inconspicuous elements, and the use of computer software to help in identifying and quantifying those minute variations. Within a shared linguistic community, the writing of an author resembles that of other authors to the extent that they all use the same language; no author possesses an exclusive set of words or unique syntax. Instead, it is the variations between authors in the frequency of their usage of these shared, inconspicuous, low-level linguistic features that contribute to their distinctive style. We find a similar emphasis in the connoisseurship of Giovanni Morelli and Bernard Berenson, who attributed authorship to Renaissance painters based not on subject matter (which was usually limited to the same select array of mythological or religious themes) but on small, inconspicuous, and nonconscious details such as the way earlobes or fingernails were painted (see Carlo Ginzburg and Anna Davin [1980] and Richard Wollheim [1973]). More recent approaches to connoisseurship bypass all aesthetic judgment by carrying out a material analysis of paintings (of their physical and chemical properties) or a visual analysis aided by computer imaging tools—from X-ray to infrared to the microscopic analysis of brushwork, which quantifies features such as the direction, thickness, length, and overlap of a painter’s brush strokes (see, for example, Shahram et al. [2008]). Stylometric authorship attribution therefore studies patterns of behavior in all forms of human activity in an attempt to identify an individual’s decision-making style. In the digital age, stylometry identifies in large sets of data significant uniform patterns and trends that cannot be detected by other means. It therefore harnesses the power of statistics to quantify and measure the underlying organization of data


in the seemingly infinitesimal variations in language use that distinguish author styles. Anthony Kenny reminds us that the data set used in stylometric studies does not invalidate the results: “Being a trivial and humble feature of style would be no objection to its use for identification purposes: the whorls and loops at the ends of our fingers are not valuable or striking part of our bodily appearance” (Kenny 1982, 12), but, he continues, they can nonetheless be used to distinguish one individual from another. Frederick Mosteller and David L. Wallace similarly point out: “we know of no theory relating fingerprints to personality or behavior. Yet fingerprints excel in identification” (Mosteller and Wallace 1964, 265). Stylometric authorship attribution provides new evidence at the micro level of language to identify the relevant stylistic features of authors. At this micro level, new linguistic units of analysis are established and their boundaries defined, and such units only begin to make sense when quantified. Whether the data consist of individual letters, prefixes, or punctuations, all are potential sources of information for predicting authorship— once they have been subjected to statistical testing. The Shakespeare scholar Brian Vickers agrees; referring to the study of function words, he points out that “This unglamorous line of research has turned out to be a most valuable resource in authorship studies” (Vickers 2002, 80). To ensure the validity of this research, several basic statistical tests need to be performed on many forms of linguistic data—a time-consuming but essential stage of the research process, to avoid the bias of a single test. Nonetheless, like all research, this study is guided by methodological finitism—it can only analyze a finite amount of data (the variables on the micro level of language) using a finite number of stylometric theories and methods.

4.2 Premises of Stylometry

Statistically based authorship attribution is carried out within the framework of three fundamental interrelated distinctions central to statistics: descriptive and inferential statistics, sample and population, and statistical tests and effect size. A sample is a selection from the whole population (for example, a screenplay by Orson Welles is a sample of his entire writing output). Descriptive statistics quantifies the sample data and their variations in precise terms, while inferential statistics goes beyond describing samples to make inferences about the population from where the sample was taken. Two descriptive methods are particularly important in this study: proportions and ratios. Proportion relates a part to a whole; it is used to discover the quantity of one category within the total group, such as the relative frequency of a word in an author's corpus, which can be expressed as a fraction or percentage. A ratio, on the other hand, divides a quantity derived from one group into a comparable quantity from another group, such as the frequency of the same word in two different authors. A ratio measures effect size: it compares the magnitude or strength of one quantity in relation to another, which can be expressed as a


fraction (e.g., 2/5) or colon (2:5) or written out using the word to (2 to 5). In inferential statistics, descriptive data inform the generation of inferences that predict (in different ways, outlined at the end of this chapter) whether the samples are representative of the population, and significance tests and/or confidence intervals express the level of confidence/significance of an inference. Attributing authorship to a disputed sample (such as a screenplay) involves (1) inferring whether the disputed sample represents one population (one author) rather than another, and (2) determining the reliability of that inference. To carry out a statistical analysis of a sample's discriminant variables—the linguistic features that maximally differentiate authors—those variables need to be:

• Quantifiable (measurable) and computable.
• High rate.
• Context-free (not dependent on the subject matter).
• Multiple.
• Subconscious (automatic writing habits).
• Distinctive.
• Stable (consistent and regular).
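To make the descriptive side of this framework concrete, the sketch below computes a relative frequency per 1,000 words and the ratio between two such rates—in the spirit of the distinctiveness ratio taken up later in this chapter. The counts are hypothetical, invented purely for illustration; they are not drawn from the screenplay samples.

```python
def rate_per_thousand(count: int, total_words: int) -> float:
    """Relative frequency of a feature, expressed per 1,000 words (a proportion)."""
    return 1000.0 * count / total_words


def distinctiveness_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two rates: a simple measure of effect size."""
    return rate_a / rate_b if rate_b else float("inf")


if __name__ == "__main__":
    # Hypothetical counts for one word in two 40,000-word samples.
    rate_m = rate_per_thousand(count=120, total_words=40_000)  # author M
    rate_w = rate_per_thousand(count=40, total_words=40_000)   # author W
    print(f"M: {rate_m:.2f} per 1,000 words; W: {rate_w:.2f} per 1,000 words")
    print(f"ratio M:W = {distinctiveness_ratio(rate_m, rate_w):.1f}")
```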

4.2.1 Quantifiable (Measurable) and Computable

The variables in the written text must be measurable or countable, preferably automatically by a computer. "In order to build up a definition or description of style to use in a statistical enquiry," writes Gerard Ledger, "it is necessary to select something, or a group of things, which may be measured. These are called 'variables' and they could be anything which is accessible as a physical quantity within the text and are in some way related to what the author wrote or, in other words, to his or her style" (Ledger 1989, 4). The activities of counting and measuring generate the numbers used in statistical analysis. Word and sentence length are easily countable, although they are not always distinctive markers of authorship in English (they work better for Greek texts). Identifying grammatical categories remains difficult to achieve automatically, although they can be more distinctive in predicting the authorship of a disputed or anonymous text. (Some grammatical categories such as English prepositions are easy to identify because they are usually two letters long.) The variables need not only be countable but also unambiguous and computable. Variables become unambiguous when the methods and finite series of procedures used to define them are spelled out and used consistently. A variable is computable when it can be subjected to a series of calculations that manipulate and transform it.
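Word and sentence length show how directly some variables can be counted by a machine. The fragment below is a minimal illustration only, with my own crude tokenization and sentence-splitting rules rather than those used in this study.

```python
import re


def word_lengths(text: str):
    """Length in letters of each word token."""
    return [len(w) for w in re.findall(r"[A-Za-z']+", text)]


def sentence_lengths(text: str):
    """Number of words in each sentence, splitting crudely on ., ! and ?"""
    sentences = re.split(r"[.!?]+", text)
    return [len(re.findall(r"[A-Za-z']+", s)) for s in sentences if s.strip()]


if __name__ == "__main__":
    sample = "The ball falls out of his hand. It breaks on the marble floor!"
    lengths = word_lengths(sample)
    print("mean word length:", round(sum(lengths) / len(lengths), 2))
    print("sentence lengths:", sentence_lengths(sample))
```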


4.2.2 High Rate

If the variables are countable, they also need to "score at a sufficiently high rate to be free of the uncertainties associated with low-level distributions and also to enable us to use small samples" (Ledger 1989, 4). Additionally, if these high-rate variables are common elements of language, this means they are also shared by many authors. Authors can be compared according to the frequency at which they use these shared variables. Function words and n-grams are the most successful variables to satisfy the high rate and shared requirements. At the same time, the highest frequency variables may be shared by authors to such an extent that the overlap is too substantial to differentiate them. We will see later in this chapter that while Mosteller and Wallace identified function words as high rate and shared, they also identified a series of marker words that are distinctive precisely because they are high rate but not shared. In other words, the high frequency of a linguistic feature in one author compared to its low frequency in another author provides strong evidence to distinguish the two authors. Alvar Ellegård also concluded that low-frequency variables tend to be more distinctive of an author than high-frequency variables, although their low frequencies create problems in the fluctuation or variation of the frequency counts (Ellegård 1962, 15–16). He overcame this problem by combining variables into plus and minus groups (discussed below).

4.2.3 Context-Free

Practitioners of stylometry assert that the linguistic variables chosen to distinguish authors must be context-free because the variability of usage should be dependent on the author and not on the genre or the topic under discussion. Mosteller and Wallace argued that "We need variables that depend on authors and nothing else. Some function words come close to this ideal, but most other words do not" (Mosteller and Wallace 1964, 265). A number of function words in English, such as prepositions, mark grammatical structure; the advantage of focusing on them is that an author uses their preferred function words regardless of the context, genre, or topic. Function words also constitute a closed group and are therefore more stable over time. Other word types, such as verbs, are more problematic in distinguishing authors because they constitute an open group, which means variation is due in part to context and not to the author; such variability therefore hinders the identification of stable characteristics of authorship. Nonetheless, if authors cowrite the same document in the same place and time following the same strict genre conventions of screenplay writing, it is worthwhile pursuing some context words, for both writers work under the same stable conditions. Also, authors still have preferences in choosing context words. Because both Mankiewicz and Welles wrote within the same fixed context, it may be possible to distinguish them via their preferred set of vocabulary words which they employ


throughout their writing. Unlike function word analysis, n-gram analysis takes into consideration content words,1 as does the LIWC (Linguistic Inquiry and Word Count) software discussed in Chap. 7.

4.2.4 Multiple

Although individual variables can be tested in isolation, David I. Holmes points out that no single variable distinguishes one writer from another—no author uses a word or phrase exclusively:

All authorship studies begin with a choice of criteria believed to characterize authors. One should probably not believe that any single set of variables is guaranteed to work for every problem, so researchers must be familiar with variables that have worked in previous studies as well as the statistical methods to determine their effectiveness for the current problem (Holmes 1994, 104).

In order to identify the distinctiveness of an author’s writing, multiple tests must therefore be carried out and several variables need to be analyzed to reduce the uncertainty. “In authorship studies,” Vickers writes: “when differing approaches point to the same direction, it shows that the hypothesis was sound, the methodology viable” (Vickers 2002, 89–90). Earlier stylometric studies (such as those of A.Q. Morton) relied too much on a single variable or asked too much of a single test to quantify an author’s style.2

4.2.5 Subconscious (Automatic)

Context-free function words are so frequent in English and "so unlikely to be regulated by authors, their frequencies may reflect authorial habits that remain constant in spite of differences in subject matter, point of view, or theme" (Hoover 2002, 157). These habits remain stable in an author's work because they are to a great extent automatic.

1 'Letter n-grams unavoidably capture thematic information in addition to the stylistic information. Under the assumption that all the available texts are on the same thematic area, this property of letter n-grams can be viewed as an advantage since they provide a richer representation including preference of the authors on specific thematic-related choices of words or expressions' (Efstathios Stamatatos [2013], 428).
2 For an outline and critique of Morton's work, see Susan Hockey (1980, 136–39).


4.2.6 Distinctive

Stylometric authorship attribution aims to identify and quantify the most distinctive variables that differentiate authors and to discard insignificant variables. Furthermore, these distinctive variables must be linked to style rather than to coincidence or to general features of language. Distinctive variables are not known in advance; they are discovered through the trial and error of statistical testing. Frequency of occurrence is a fundamental property of a stylistic variable because it allows norms to be established by statistical methods—both for its rate and the variability of that rate in a given author's writings. This is why the following attempt to differentiate Mankiewicz from Welles focuses primarily on the frequency counts of multiple variables, where distinctiveness is defined in terms of a variable's significantly different rate of occurrence in the two authors and is expressed as a ratio. Collectively, these significant variations in frequency constitute a discriminant pattern that can separate the style of one author from the other.

4.2.7 Stable

Unlike mathematics, which is founded on deductive logic, statistics is empirical and inductive; it involves the collection, description, and analysis of data and attempts to discover uniform patterns in those data. For example, relationships between variables are expressed in terms of correlation rather than causation or logical relations, in which variables are at best assumed to be linked via a mutual relation rather than the direct action of one variable on another. The results of statistical analysis are not, therefore, based on complete certainty but can only be expressed in terms of probability. Yet when comparing and distinguishing authors, a stylometric study needs to focus on those features that are stable and robust—that is, consistent and regular. In their study of The Federalist papers, Mosteller and Wallace argue that their main study "shows stable discrimination for essays on various subjects even with the writing spread over a quarter of a century" (Mosteller and Wallace 1964, 264). Similarly, Ellegård argues:

The main assumption, or working hypothesis, underlying any attempt to determine the authorship of a text from its style or other linguistic characteristics, is that some features, or combinations of features, in a particular writer's style or language, remain reasonably constant, or change in a predictable manner (Ellegård 1962, 8).

We also need to acknowledge that distinctive variables are not necessarily the same as an author’s characteristic variables. Whereas characteristic variables are representative of a single author and remain fixed throughout their work, distinctive variables are relative to a comparative study. An author’s distinctive variables change according to whom they are being compared. A discriminant analysis is relational in that it compares authors’ styles rather than analyzes the intrinsic stylistic properties of each author. This study isolates the distinctive variables that


distinguish the writing of Mankiewicz from Welles—although these variables are not necessarily characteristic or representative of each author.3 Furthermore, no series of statistical tests can guarantee a successful result, for they are always open to revision.

4.3 The Federalist Papers, the Aristotelian Ethics, and Pericles

In one of the most famous and successful studies of authorship attribution in the history of stylometry, Inference and Disputed Authorship: The Federalist, Mosteller and Wallace resolved the authorship of the 12 disputed Federalist papers. They asked: Were they written by Alexander Hamilton or James Madison? They experimented with several different tests (sentence length, word length) without success. Other tests on different variables (percentage of nouns, adjectives, one-letter and two-letter words, frequency of the definite article the) were partly successful. They eventually discovered a series of tests that successfully distinguished Hamilton from Madison, particularly what they called marker words (a word common in one author and rare in another) and frequently used function words, including "prepositions, conjunctions, pronouns, and certain adverbs, adjectives, and auxiliary verbs" (Mosteller and Wallace 1964, 17). Marker words are quantifiable, multiple, subconscious, distinctive, and stable; function words share all these attributes and are, in addition, context-free, high rate, and shared. Marker words include an author's preference for one of two synonymous words (such as on/upon, while/whilst) and the frequency with which an author uses function words (conjunctions and prepositions) that in a weakly inflected language such as English signify grammatical relations. All authors use function words; what is distinctive is an author's rate of usage of these words.

Mosteller and Wallace identified an initial list of 165 words that can potentially distinguish Hamilton from Madison, which they subjected to a series of tests that reduced them to a final list of 30 words—a mix of marker words and high-frequency function words. Here are four examples: they discovered that Hamilton used upon and to frequently and by infrequently and preferred while over whilst. In contrast, Madison used upon and to infrequently and by frequently and preferred whilst to while. The word upon averaged 3.24 appearances per 1000 words in the known writings of Hamilton but only 0.23 in the writings of Madison. Function words tended to be more significant than marker words: "in the end, the high-frequency words outshone all the marker words" (Mosteller and Wallace 1964, 77). The 30 distinctions constitute the multivariate linguistic fingerprint of each author. Applied

3 Of course, there is some overlap between characteristic and distinctive variables. For more details, see Klaussner, Nerbonne, and Çöltekin (2015).


to the 12 disputed papers, Mosteller and Wallace identified Madison as the author of all 12. Mosteller and Wallace analyzed these words from two perspectives: via traditional frequentist methods of discriminant analysis and via Bayesian probability theory. In the latter, the probability of an event (such as the frequency of the word while in an essay of known authorship) predicts the probability of finding that word in an essay of unknown or disputed authorship. They obtained similar results from both methods, which may explain why the more complex Bayesian probability theory remains sidelined in stylometry. Other authors have tried out new methods on The Federalist papers: Holmes and Forsyth (1995) used genetic algorithms; Tweedie, Singh, and Holmes (1996) developed a neural network analysis; and Colin Martindale and Dean McKenzie (1995) developed a quantitative analysis of content. Ben Blatt tested Mosteller and Wallace's initial word list by applying it to 600 novels by 50 authors. He pitted each novel "head-to-head against its actual author and each of the other 49 authors." He reports that, out of almost 29,000 tests, the system "worked all but 176 times. This is over a 99.4% success rate" (Blatt 2017, 64).

Anthony Kenny employed stylometric methods in his study of Aristotle's Eudemian Ethics and the Nicomachean Ethics. The issue he addressed was "whether the three books which make a double appearance in the manuscript tradition, as books 5, 6, and 7 of the Nicomachean Ethics and as books IV, V, and VI of the Eudemian Ethics, were originally written for one context rather than another" (Kenny 2016, 70). Kenny therefore investigated the boundaries between texts by the same author rather than boundaries between authors in the same texts. He presented a straightforward method: he compared the style of the three books to the two treatises in which they appear. His study matched books to treatises in terms of vocabulary—specifically, particles, connectives, prepositions, adverbs, pronouns, and technical terms. Based on the relative frequencies of these features in the undisputed books in each treatise, Kenny generated the mean or expected number of occurrences the disputed books should have in order to be included in the Nicomachean Ethics or the Eudemian Ethics. He concluded that "the overwhelming weight of the evidence suggests that the common books are more at home in the context of the Eudemian Ethics than in the context of the Nicomachean Ethics" (Kenny 2016, 158).

In his study of Pericles, MacDonald P. Jackson (2003) tackled the long-running issue of the disparity between Acts 3–5, which conform to Shakespeare's mature style, and Acts 1–2, which do not. The boundary is clearly marked in the change from Act 2 to Act 3. The issue concerns whether the first two Acts were written by a less talented playwright or whether Shakespeare wrote them early in his career and only finished the play much later, toward the end of his career. Jackson provides evidence and arguments for the dual authorship hypothesis, with George Wilkins the most likely coauthor. One of several tests that worked for Jackson was counting the infinitive form of the verb (to + verb): "In fact, when all twenty-seven scenes of Pericles are ranked in order of 'infinitive richness,' or the number of occurrences of 'to + verb' per one-thousand words, we find a remarkable degree of separation between the eleven scenes in Acts 1–2 and the sixteen in Acts 3–5" (Jackson 2003,


118–19). Overall, infinitives occur at a rate of 20.1 per 1000 words in Acts 1–2 and at a rate of 12.1 in Acts 3–5. Jackson compares these rates to other Shakespeare plays and concludes that “Acts 3–5 of Pericles use infinitives at a rate that is close to the average for Shakespeare’s plays, whereas for Acts 1–2 the rate is anomalously high” (Jackson 2003, 120). The infinitive verb form constitutes one of Wilkins’s high-frequency writing habits in comparison to Shakespeare’s much lower frequency. However, the infinitive verb constitutes just one variable that needs to be studied in relation to other distinctive variables.
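These case studies share one underlying procedure: establish each candidate author's rate for a set of features, then ask which profile a disputed text more closely resembles. The sketch below is a deliberately crude version of that logic—far simpler than Mosteller and Wallace's discriminant and Bayesian analyses—and, apart from the upon rates echoing the figures quoted above, the feature words, rates, and disputed text are invented for illustration.

```python
def rates_per_thousand(tokens, features):
    """Rate of each feature word per 1,000 tokens."""
    total = len(tokens)
    return {f: 1000.0 * tokens.count(f) / total for f in features}


def distance(profile_a, profile_b):
    """Sum of absolute differences between two rate profiles."""
    return sum(abs(profile_a[f] - profile_b[f]) for f in profile_a)


def attribute(disputed_tokens, author_profiles, features):
    """Assign the disputed text to the author with the nearest profile."""
    disputed = rates_per_thousand(disputed_tokens, features)
    return min(author_profiles, key=lambda a: distance(disputed, author_profiles[a]))


if __name__ == "__main__":
    features = ["upon", "by", "to"]  # hypothetical marker/function words
    profiles = {
        # upon rates follow the figures quoted above; by and to are illustrative only
        "Hamilton": {"upon": 3.2, "by": 7.3, "to": 40.8},
        "Madison":  {"upon": 0.2, "by": 11.4, "to": 35.0},
    }
    disputed = ("to be decided by the people upon due reflection " * 50).split()
    print(attribute(disputed, profiles, features))
```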

4.4 Stylometry and Coauthorship

Like the Federalist dispute, my study of Citizen Kane's authorship examines a small and closed set of possible authors. Unlike the Federalist dispute, I assume the document analyzed in this study is coauthored. The authorship of the Federalist papers is usually studied in terms of the single authorship of individual papers. But Douglass Adair raised the issue of coauthorship—that Madison may have turned to Hamilton to assist him in completing a few of the papers, with the result that mixed authorship or coauthorship can be detected in those papers. But what type of coauthorship? Just as Elizabethan playwrights cowrote in order to save time (which tended to rule out coauthorship scenarios 1 and 2 mentioned in Chap. 2—that is, it excludes one author rewriting the work of the other author), a small number of the disputed Federalist papers could have been written collaboratively, in the sense of the third scenario (each author contributing separate paragraphs). But even if coauthors rewrite each other's work, is that sufficient to transform that work's stylistic fingerprint? In considering if Madison rewrote Hamilton, making Hamilton's revised papers sound like Madison's own, Mosteller and Wallace are doubtful:

We give little credence to the possibility that Hamilton wrote but Madison thoroughly edited the disputed papers, so that they finally looked Madisonian, rather than like a mixture of styles or a parody. The reader must appreciate that such a performance is not so trivial as changing two or three words. Even among the 30 words of the main study, Madison would have to make between 50 and 100 changes in each paper, to say nothing of the further changes these would induce. Since Madison could not know that we planned to use these 30 words, the total revision required, so that an analysis shows clear Madison rates, would have to be on a scale even more vast (Mosteller and Wallace 1964, 264).

Mosteller and Wallace raise an important point fundamental to this study of Mankiewicz and Welles: editing and revising a coauthor’s work to the extent that the frequencies of marker and function words are transformed to match those of the editor requires a tremendous act of rewriting. For a short 2000-word paper written by Hamilton, they estimate Madison would need to make 100 changes. Even assuming Mankiewicz only wrote 50% of the final Citizen Kane screenplay (13,000 words), that would roughly equate to Welles making—in addition to significant


changes to story structure and character development—around 650 revisions to Mankiewicz’s punctuation and words (changing his inflections, contractions, vocabulary, and word order). Did Welles rewrite Mankiewicz across these multiple linguistic parameters, as well as restructure the story? Or did he write separate scenes while simply deleting irrelevant scenes from Mankiewicz’s screenplay? Welles’s adaptation of The Magnificent Ambersons discussed in Chap. 3 has already revealed that his editing and rewriting of Tarkington was minimal, limited to a few marks of punctuation. Nonetheless, American (the first draft of Citizen Kane) was not a finished and published piece of work; it was a draft that required substantial editing and rewriting.

4.5 Stylometric Data and Tests

Ledger spells out the methodology of statistical authorship attribution: “We could build up a discriminant function for each and every author, based on known and certain attributions, and then apply it to all works of uncertain origin so as to determine their authorship” (Ledger 1989, 51). Stylometric studies of authorship carry out a variety of statistical tests on different types of data to find the variables that distinguish the style of one author from another. We do not know in advance which specific tests and which variables work effectively to achieve this goal. It was only through extensive trial and error that earlier stylometric studies eventually identified these variables via the most relevant test (or small series of tests). For example, as we just saw, Mosteller and Wallace finally discovered that marker words and function word frequencies distinguish Hamilton from Madison only after their study of sentence length failed to distinguish the two authors. The tests and data outlined in this chapter have been selected because of their potential to distinguish authors by isolating reliable discriminating stylistic variables. These tests range from frequency counts of the smallest units, primarily punctuation and unigrams, to vocabulary analysis of word types and word tokens, collocational methods that quantify groups of words, and various analyses of sentences, including frequency distribution of sentence length. Frequency counts are important because they define linguistic units in relation to one of their most basic statistical properties—their number of occurrences in the texts under analysis. For example, the scene heading element INT. occurs 110 times in the 40,000-word Mankiewicz sample and 23 times in the Welles sample. (In this and the following chapters, Mankiewicz’s statistics are cited first, unless indicated otherwise.) When we group together the frequencies of many linguistic units (e.g., all the bigrams or trigrams), we end up with the frequency distribution of these linguistic units in a sample text. As in the earlier studies, it is not known in advance which of these will work effectively to distinguish Mankiewicz from Welles. The data and the tests applied to them are outlined below, and in Chap. 5, the tests are applied to the control group, the screenplays of known authorship.
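To make the idea of a normalized frequency concrete, the sketch below expresses a raw count as a rate per 1000 words. It is an illustration only: the Mankiewicz figures come from the INT. example above, while the Welles sample size used here is a placeholder, not a figure reported in this study.

```python
def rate_per_1000(count: int, total_words: int) -> float:
    """Convert a raw frequency count into a rate per 1000 words."""
    return 1000 * count / total_words

# INT. in the 40,000-word Mankiewicz sample (figures from the example above).
print(rate_per_1000(110, 40_000))   # 2.75 occurrences per 1000 words

# Hypothetical Welles sample size (placeholder, not a figure from this study).
print(rate_per_1000(23, 40_000))    # 0.575 occurrences per 1000 words
```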



4.6 Punctuation and N-Gram Analysis

“Although the word is the obvious unit of measurement and perhaps the natural one,” writes Ledger, “it is not the only candidate. Letters, syllables, morphemes, phrases, sentences, lines of text, or pages could also be used” (Ledger 1989, 16). N-gram analysis captures small fragments of a text’s style and measures the frequency of those fragments. It breaks apart words and captures information on their internal components, potentially revealing an author’s preference for certain morphemes (prefixes, affixes) or combinations of consonants or vowels. In terms of the criteria established in the first half of this chapter, n-grams meet most of them: they are very easy to quantify; they are both high rate and shared (which can, however, also work against them as discriminators); they are not, however, context-free, for they are dependent on the subject matter (certain letters will appear more frequently depending on the text’s topic or theme, for letters indicate vocabulary preference); they are very stable, for letters are evidently consistent and regular; the analysis of several n-grams meets the multiplicity criterion; and they appear to be subconsciously chosen. Whether they are distinctive for a particular author in relation to another author is an empirical matter that needs to be tested via a frequency count.

In his study of the chronology of Plato’s dialogues, Ledger counted single-letter n-grams, although their frequency was not for him an end in itself; instead, he counted the words in a text based on the letters they contain. To count Plato’s words, Ledger established three n-gram positional categories, in which he defined each n-gram in terms of its position in a word: (1) words containing one of the letters of the (Greek) alphabet (a category he labeled alets), (2) words ending in a specified letter (blets; he limited the count to the nine most frequent letters), and (3) words with a specified letter in the penultimate position (clets; again limited to the nine most frequent letters).4 Ledger divided each of the three categories into letter groups: alets contains 19 groups (alet1, alet2, etc.) representing each letter of the Greek alphabet, with the least frequent letters grouped together, while blets and clets each contain nine letters. Ledger therefore carried out counts for 37 variables in total—19 alets, nine blets, and nine clets. He counted a word and assigned it to a group within a category if it contains the specified letter. For example, the blet4 group contains all words that end in ί, including the Greek word καί (and). In total, καί is grouped under alet8 (κ), alet1 (α), alet7 (ί), blet4 (ί), and clet1 (α). If Ledger’s counts were based only on letters, then καί would have been reduced to three categories, each representing its letters; the word itself would not be recoverable from the categories, and neither would the location of letters in the penultimate and ultimate positions. Ledger justifies his preference for word count over letter count by arguing that “I am uncertain if letter distribution, when correctly measured, could act as such a powerful discriminator of style as the alets” (Ledger 1989, 8 n9). Nonetheless, counting word frequencies via their letters adds a significant level of complexity to the analysis, and it leads to the creation of overlapping groups, which diminishes their independence.

Ledger appears to have overcome such skepticism about letter distributions in his jointly written essay “Shakespeare, Fletcher, and the Two Noble Kinsmen” (Ledger and Merriam 1994). Ledger and Merriam approach the coauthorship of the play The Two Noble Kinsmen using straightforward letter frequency measurements to discriminate between Shakespeare and Fletcher in order to determine which parts of the play match Fletcher and which match Shakespeare. They lose some data (the position of a letter in a word), but they gain simplicity and separation between groups by counting letters regardless of their position in a word. They analyzed this data using cluster analysis in order to determine authorship, results that I discuss at the end of this chapter when I introduce cluster analysis.

Ledger analyzes letter n-grams, the frequency of letters of the alphabet. This is contrasted to byte-level n-gram analysis, pioneered by Vlado Kešelj and his colleagues (Kešelj et al. 2003), which focuses on all features of a text, including punctuation marks and even white spaces: “character-level n-grams use letters only and typically ignore digits, punctuation, and whitespace while byte-level n-grams use all printing and non-printing characters” (Graovac, Kovacevic, and Pavlovic-Lažetic [2015]: 6). A byte-level n-gram analysis treats a text the same way a computer sees it, as a purely linear string of bytes, with each byte a unit of data (eight bits of information in binary code) representing a single feature of that text (a punctuation mark, a letter, a number, a symbol, or a space). Unlike the human reader, the computer cannot detect different levels of bytes grouped into words or sentences; it can only detect different bytes. Chapter 5 focuses on the byte-level n-gram analysis of screenplay texts, with bytes of different lengths (one to three), followed by a select number of word- and sentence-level analyses—those that count words and sentences in terms of two distinct features: length and frequency.

4 Ledger focused on word endings due to the highly inflected nature of the Greek language, where grammatical choices are signified via suffixes; authorial word preference is, in part at least, marked in the letters that make up the suffixes.
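As a rough illustration of byte-level n-gram counting (a sketch in Python, not the procedure used in Chap. 5), a text can be treated as a raw byte string and every overlapping slice of one to three bytes counted; the sample line is quoted from the Citizen Kane screenplay:

```python
from collections import Counter

def byte_ngrams(text: str, max_n: int = 3) -> Counter:
    """Count overlapping byte-level n-grams of length 1 to max_n.

    Every byte counts: letters, digits, punctuation, and white space.
    """
    data = text.encode("utf-8")
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(data) - n + 1):
            counts[data[i:i + n]] += 1
    return counts

sample = "EXT. XANADU - FAINT DAWN - 1940 (MINIATURE)"
counts = byte_ngrams(sample)
print(counts[b" "])     # a single-byte n-gram: the space character
print(counts[b"AN"])    # a two-byte n-gram
```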

4.7 Contractions David Hoover finds contractions to be significant indicators of authorship, in part because they are carried out on high-frequency words and because an author can choose to use or not use the contraction (a personal choice that, like synonymous words, conforms to the statistical criterion of independence).5 Similarly, Vickers presents a summary of research on contractions in drama, which he calls a helpful indicator in authorship studies. Middleton’s single-authored plays, for example, average 90 contractions per play, while in a sample of 100 other plays, the average is 16 (Vickers 2002, 85).6

5 See, for example, Patrick Juola’s summary (2008, 280).
6 Vickers quotes figures compiled by MacDonald P. Jackson (1979).



This study quantifies the following contractions in Mankiewicz and Welles:

n’t (aren’t, can’t)
on’ (don’t, won’t)
t’s (that’s, it’s)
it’ (it’s, it’ll)
sn’ (wasn’t, isn’t)
’ve (I’ve, we’ve)
’re (you’re, we’re)

These seven contractions are easy to identify and quantify; their regular use (especially in dialogue) should ensure that they are stable and high frequency.
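A minimal sketch of how such a count might be automated (my illustration, not the procedure used in Chap. 5): each pattern is matched as a plain substring in the lower-cased text and expressed per 1000 words.

```python
CONTRACTION_PATTERNS = ["n't", "on'", "t's", "it'", "sn'", "'ve", "'re"]

def contraction_rates(text: str) -> dict:
    """Count the seven contraction patterns and normalize per 1000 words."""
    lowered = text.lower().replace("\u2019", "'")  # treat curly and straight apostrophes alike
    total_words = len(lowered.split())
    return {p: 1000 * lowered.count(p) / total_words for p in CONTRACTION_PATTERNS}

dialogue = "I can't tell you. It's something you don't know. We're not sure it'll matter."
print(contraction_rates(dialogue))
```

Note that the patterns deliberately overlap (don’t, for instance, contains both on’ and n’t), mirroring the way the seven strings are defined above.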

4.8 Word Analysis Words can be analyzed from a multitude of statistical perspectives, for many are stable and high rate, while some subgroups (e.g., function words) are context-free, and several of their attributes can be quantified using common statistical measures—sample mean, sample standard deviation, coefficient of variation, frequency distribution, and type/token ratio. These statistical attributes are important because they identify specific patterns in word use in the samples. The mean is a common statistic; it condenses all the values of a variable into a single number, their average, which becomes the expected value. Standard deviation measures the dispersion of all the values around the mean; it indicates how far, on average, each value deviates from the mean. A low standard deviation indicates a small average deviation, in which the variables are closely grouped around the mean, whereas a high standard deviation indicates a wide variation or fluctuation in values. Mean and standard deviation are useful measures in the analysis of style simply because style is the deviation from a predefined average. The coefficient of variation divides the standard deviation by the average value (the mean), which is then multiplied by 100. The coefficient therefore measures the standard deviation relative to the mean. A frequency distribution is potentially more nuanced than these summary statistics, for it measures the frequency of a variable according to how it is distributed across a range of values rather than reducing it to one value. For example, a frequency distribution of word length counts words of different lengths, reveals patterns in the variations of length, and quantifies those variations rather than reducing them to a single number (a mean or standard deviation).7

7 Standard deviation is expressed as more than one value. One standard deviation covers 68% of the dispersion around the mean (in a normal distribution of data); two standard deviations cover 95% of the dispersion; and three standard deviations cover 99.7% of the data. Standard deviation is also relevant to defining confidence intervals, an important statistical concept employed in this book to attribute authorship to the Citizen Kane screenplay.
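These summary statistics are straightforward to compute. The sketch below is an illustration only: it derives the mean, standard deviation, coefficient of variation, and frequency distribution of word lengths in a short sample.

```python
from collections import Counter
from statistics import mean, stdev

def word_length_profile(text: str) -> dict:
    """Summary statistics and frequency distribution of word lengths."""
    lengths = [len(w.strip(".,;:!?\"'()-")) for w in text.split()]
    lengths = [n for n in lengths if n > 0]
    m, s = mean(lengths), stdev(lengths)
    return {
        "mean": m,
        "standard_deviation": s,
        "coefficient_of_variation": 100 * s / m,
        "distribution": Counter(lengths),   # word length -> number of words
    }

print(word_length_profile("The camera moves slowly toward the lighted window of the great house."))
```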



Another important measure for understanding words from a statistical perspective is the distinction between word types and word tokens. Each distinct word form is a separate word type; each occurrence of a type in the text is a word token. The statistical analysis of word types in a text counts every separate word form just once, while the analysis of word tokens counts every occurrence of those word types.8 Counting word types measures the variety of different vocabulary items in a text; counting word tokens measures how many times vocabulary items are used in a text. The number of types and tokens measures the richness of an author’s vocabulary, for it quantifies the variety of word forms used on the one hand and the repetition of the same words on the other.9 Authors are potentially distinguishable according to their type/token ratio: the variety of word types divided by the number of times they repeat those types (tokens). In counting the frequency of all the types of nouns in the disputed work, De Imitatione Christi, G.U. Yule discovered that there are 1168 separate and distinct noun types and 8225 tokens or occurrences of those nouns (a type/token ratio of 1:7). In contrast, in a sample from Jean Charlier de Gerson of the same size, he discovered that there are 1754 noun types and 8196 tokens (a type/token ratio of 1:4.7). The count of noun types and the type/token noun ratio are two of the many statistical tests that work against Gerson as the author of De Imitatione Christi. For Yule, the frequency of different actual word types is not the only relevant statistic. When counting noun types and tokens, he discovered “quite unexpectedly the simple frequency distribution showing merely the numbers of nouns used once, twice, thrice etc. proved to have considerable interest of its own” (Yule 1944, 3). That is, the distribution of the number of words used at a certain frequency can in itself be distinctive. He constructed a table to represent the complete frequency distribution of nouns in De Imitatione Christi. At the top of the table, “there are 520 nouns that occur once only in the whole work, 174 that occur twice, 111 that occur thrice, and so on” (Yule 1944, 9). At the bottom is one noun that occurs 418 times. Authors may be distinguishable by the number of words they use once or twice. These specific measures of vocabulary frequency even have their own terms: hapax legomena (once-occurring word types) and dis legomena (twice-occurring word types). In addition to measuring the frequency distribution of word length, Chap. 5 also measures the frequency distribution of the number of word types, organized according to their occurrences in the screenplays of Mankiewicz and Welles.10

8 In a frequency count, the word and may appear 20 times in a 1000-word text; in this example, and is a single word type with 20 occurrences or tokens.
9 Although not used in this study, another type of vocabulary analysis, called lemmatization, reduces words further to their root form, in which inflected forms of a word are reduced to a single item.
10 However, the quantitative study of an author cannot rely on vocabulary remaining constant, for vocabulary increases with text length, but the rate of increase gradually diminishes because the author begins repeating words used earlier.
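The type/token ratio and the hapax and dis legomena counts can be computed in a few lines; the sketch below is my illustration of those measures, not Yule’s own procedure.

```python
from collections import Counter

def vocabulary_profile(text: str) -> dict:
    """Type/token ratio plus once- and twice-occurring word types."""
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    tokens = [t for t in tokens if t]
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),
        "types": len(counts),
        "type_token_ratio": len(counts) / len(tokens),
        "hapax_legomena": sum(1 for c in counts.values() if c == 1),
        "dis_legomena": sum(1 for c in counts.values() if c == 2),
    }

print(vocabulary_profile("Rosebud. Rosebud is the last word, and the word is all we have."))
```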



4.8.1 Percentage of Old English Vocabulary

The Oxford English Dictionary defines Old English (in use from around 600 to 1150) as an inflected language with few lexical borrowings from French and Latin.11 Several thousand Old English words remain in use in contemporary English (with modern spelling). The percentage of Old English words in comparison to Latinate words from Romance languages is a potential source of data for distinguishing one author’s vocabulary from another.

For example, in his study of coauthorship in Shakespeare, Vickers (2002) reviews stylometric studies of the ratio between Old English and Latinate vocabulary in an author’s work, focusing on F.E. Pierce’s study (1909) of John Webster and Thomas Dekker. Pierce identified the Latinate vocabulary in these authors by counting words with three or more syllables (for Latin words are longer than Old English words) and discovered that Dekker is consistent in using three-syllable words sparingly while Webster used them more frequently. Pierce analyzed their coauthored plays scene by scene, which enabled him, in Vickers’s summary, “to identify their respective contributions to Westward Hoe! and Northward Hoe!, showing that in each play the vocabulary of major characters—Birdlime and Justiniano in the former, Bellamont and Mayberry in the latter—differs in style according to which dramatist is at work” (Vickers 2002, 76).

Using a different technique for the same ends, Yule analyzed the distribution of nouns, organized according to their initial letter, in samples from Bunyan and Macaulay. His distributions revealed that “Macaulay as compared with Bunyan shows a heavy relative excess of A’s, E’s and I’s, and a heavy or moderately heavy deficiency of B’s, F’s, H’s and W’s” (Yule 1944, 183–84). Yule at first speculated that authors may choose certain words because of their sound: “Bunyan, for example, rather liked the explosive B as a start for a word and was repelled by the soft vowel sounds of A, E, and I, and that the difference between his alphabetical distribution and that of Macaulay was due entirely to such idiosyncrasies” (Yule 1944, 198). But Yule ruled out this type of analysis because it is difficult to verify. Instead, he calculated the percentage of Old English nouns and Latinate nouns in the two authors, for vocabulary from both sources is in part distinguishable according to its initial letters: “If one author uses more words of Latin derivation than another, this will tend therefore to increase the vocabulary under certain letters of the alphabet more than that under others” (Yule 1944, 198). According to Yule’s calculations, whereas 43.9% of Bunyan’s noun types derive from Old English, Macaulay has only 26.7% Old English noun types, a difference of 17.2%. These figures demonstrate Macaulay’s preference for Latin-based nouns in comparison to Bunyan’s. Bunyan’s preference for Old English nouns becomes even more apparent when taking into account the tokens of these nouns, their frequency in relation to all nouns in the two authors’ texts: Old English nouns account for 60.3% of noun tokens in Bunyan and only 31.9% in Macaulay (Yule 1944, 213).

In Chap. 5, I invert Pierce’s test by identifying the number of Old English words in Mankiewicz and Welles. I searched OED online, using the Advanced Search options to limit the search to words from 600 to 1150 which are labeled as “still in use today,” and collected 5478 entries. I downloaded these words into a file and compared it to the screenplays using WCopyfind, the same software used to compare screenplays to their adaptations (Chap. 3).

11 The OED also quotes Henry Sweet’s preference for Old English over Anglo-Saxon: ‘I use “Old English” throughout this work to denote the unmixed, inflectional stage of the English language, commonly known by the barbarous and unmeaning title of “Anglo-Saxon.”’ Henry Sweet, King Alfred’s Version of Gregory’s Pastoral Care, Preface, v (quoted in the OED).
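The same comparison can be approximated without WCopyfind by checking each word token against a word list. In the sketch below, the file oldenglish_words.txt is a hypothetical stand-in for the 5478 OED entries described above; it is an assumed input, not a file supplied with this study.

```python
def old_english_percentage(text: str, wordlist_path: str = "oldenglish_words.txt") -> float:
    """Percentage of word tokens found in an Old English word list (one word per line)."""
    with open(wordlist_path, encoding="utf-8") as f:
        old_english = {line.strip().lower() for line in f if line.strip()}
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    tokens = [t for t in tokens if t]
    hits = sum(1 for t in tokens if t in old_english)
    return 100 * hits / len(tokens)
```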

4.8.2 Distinctive Words

In A Statistical Method for Determining Authorship, Alvar Ellegård employed the distinctiveness ratio to solve an authorship attribution problem—who wrote The Letters of Junius? (a series of political pamphlets composed between 1769 and 1772 written under the pseudonym Junius) (Cannon 1978; Ellegård 1962). Ellegård employed the ratio to define a word’s distinctiveness. Calculating the distinctiveness ratio of a word or other linguistic feature involves a three-stage process:

1. Measure the observed frequency of a linguistic feature in two samples.
2. Calculate the proportion or relative frequency of this feature by dividing each frequency by the total number of words in the sample it came from. (The relative frequency can also be expressed as a percentage.)
3. Divide one relative frequency by the other to derive that feature’s distinctiveness ratio across the two samples.

This can be illustrated with an example from Ellegård’s analysis of the authorship of The Letters of Junius:

1. The word uniform occurs 23 times in 82,200 words of The Letters of Junius and 65 times in the one-million-word sample of potential authors.
2. The relative frequency of uniform in The Letters of Junius is 23 / 82,200 = 0.000280 (or 0.0280%); the relative frequency of uniform in the work of potential authors is 65 / 1,000,000 = 0.0000650 (or 0.00650%).
3. Divide 0.0280% by 0.00650%, which gives a distinctiveness ratio of 4.3.

The first step finds the observed frequency of the word uniform in the two samples. The second step finds the relative frequency of the word uniform in relation to all the word tokens in each sample. Expressed as a percentage, the relative frequency of a linguistic feature states the probability of finding that feature in a 100-word segment of text. The probability of finding the low-frequency word uniform in a 100-word segment of The Letters of Junius is 0.0280%. (To avoid small numbers, the probability can be scaled up—that is, expressed in terms of larger segments of text. For example, the expected frequency of uniform in a 10,000-word segment of The Letters of Junius is 2.8.) This second step normalizes or relativizes the observed frequencies, enabling them to be divided into each other and compared, even though the samples are different sizes. (If the size of the samples is identical, the raw



frequencies can be divided into each other to calculate the distinctiveness ratio.) In the third step, the relative frequencies for the word uniform are divided into each other from the perspective of The Letters of Junius, which means the frequency of The Letters of Junius becomes the numerator (is placed on top) of the division. In the uniform example, one relative frequency (0.000280) is divided by the other (0.0000650), and the result of this division (4.3) expresses the difference between the two quantities as a multiple. This multiple expresses the number of times one value contains or is contained within another. In the ratio 0.000280/0.0000650 the numerator is 4.3 times the size of the denominator, which is the same as saying that the rate of occurrence of the word uniform is 4.3 times higher in The Letters of Junius than in the one-million-word sample of potential authors. A ratio therefore measures magnitude, the size of one quantity (the numerator) in relation to another quantity (the denominator). Magnitude is a measure of effect size, for it indicates how much larger or smaller the difference is between two samples. Based on his extensive empirical testing, Ellegård suggested that, when comparing two samples, any linguistic feature with a ratio above 1.5 or below 0.7 is distinctive, whereas a ratio value that falls within this interval (that is, a value above 0.7 but below 1.5) is less distinctive, and a ratio of 1 signifies no distinction between the feature in both samples. Ellegård calls distinctive words above 1.5 plus words and below 0.7 minus words.12 In the above example, the word uniform is a distinctive plus word in The Letters of Junius in relation to the one-million-word sample, for its relative frequency is significantly higher, generating a high ratio (well above 1.5), which indicates that the two samples do not derive from the same population (i.e., the same author).

In his study of the three overlapping chapters in the Nicomachean Ethics and the Eudemian Ethics, Kenny discussed the distinctiveness ratio in terms of means/expected frequencies. That is, the distinctive linguistic features generate expected frequencies (based on the assumption that the frequency of linguistic features is constant), which are then compared to the frequencies of the disputed chapters: “By comparing the expected number of occurrences of each group in each case with the actual occurrences, we can determine for each sample in turn whether it resembles the Nicomachean Ethics more than the Eudemian” (Kenny 2016, 131). The distinctiveness ratio (D.R.) is Kenny’s preferred metric of comparison:

… the D.R. of an expression is the ratio of its frequency in one treatise to its frequency in the other. Let us express the D.R. of an expression occurring in the ethical treatises as its Nicomachean frequency divided by its Eudemian frequency. Thus, a word which occurs more frequently in the Nicomachean Ethics will have a D.R. greater than unity, a Eudemian favourite will have [a ratio] between one and zero, and a word which is used with the same relative frequency in each will have a D.R. of one (Kenny 2016, 130).

When comparing the two treatises with the disputed chapters, a D.R. between 0.7 and 1.5 indicates that the two samples are similar, and a D.R. of 1 means that the expected frequency and the observed frequency match. Outside the interval, the expected and observed frequencies diverge. Kenny uses Ellegård’s 0.7 and 1.5 interval (in fact, 1.4) as a guide to different authorship (or, more accurately, as a similarity/dissimilarity boundary, for the author of the two ethical treatises is known). Kenny therefore divides two relative frequencies to get a D.R. and uses Ellegård’s interval as a similarity/difference boundary, a practice I follow in this study. By calculating the distinctiveness ratio of dozens of words, it is possible to create a statistical profile of an author (a linguistic fingerprint, or what Ledger calls a discriminant function for each author) in terms of high-frequency plus words and low-frequency minus words. “In this way,” Ellegård argued, “the list ought to provide an effective instrument for identifying the author, since it is naturally rather unlikely for different authors to have identical lists of characteristic words” (Ellegård 1962, 15). In Chap. 5, I create statistical profiles of Mankiewicz and Welles by identifying their distinctive linguistic features.

12 A ratio of 0.7 means that the denominator is 1.43 times the size of the numerator. A shortcoming of the minus group is that its value is finite, for the values are compressed between 0.7 and 0, whereas the plus group’s value ranges from 1.5 to infinity.
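Ellegård’s three-stage calculation reduces to a few lines of arithmetic. The sketch below is an illustration only (not Ellegård’s software or the procedure used in Chap. 5); it reproduces the uniform example and applies the 1.5/0.7 boundaries.

```python
def distinctiveness_ratio(count_a: int, words_a: int, count_b: int, words_b: int) -> float:
    """Relative frequency in sample A divided by relative frequency in sample B."""
    return (count_a / words_a) / (count_b / words_b)

def classify(dr: float) -> str:
    if dr >= 1.5:
        return "plus word (distinctive of sample A)"
    if dr <= 0.7:
        return "minus word (distinctive of sample B)"
    return "not distinctive"

# The word 'uniform': 23 times in 82,200 words of Junius,
# 65 times in the one-million-word comparison sample.
dr = distinctiveness_ratio(23, 82_200, 65, 1_000_000)
print(round(dr, 1), classify(dr))   # 4.3 plus word (distinctive of sample A)
```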

4.9 Collocations Collocation analysis (or word n-grams) studies elements of language structure that are downplayed in frequency counts of individual letter n-grams and individual words—the syntagmatic dimension of language, such as the order of words in a text. Like other methods, collocation analysis still relies on the frequency count of words, but it counts different word properties—their co-occurrence in a text. The most basic collocations are of words that appear together consecutively, in close proximity; other studies of English limit a significant co-­occurrence (or span) to four words on either side (Daley et al. [2004]: 13; 42–48). In this study, I limit collocational analysis to the frequency of two-word collocations and determine their frequency and distribution throughout the samples. (Furthermore, the collocational analysis maintains word boundaries. This means, for example, that the collocation to become will not be counted in the collocation to be.) Collocations can be very distinctive, for they are based on an author’s preferences for specific word combinations, but they are also low frequency, which means they are not stable or regular, making them potentially unreliable.
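Counting two-word collocations amounts to counting adjacent word pairs while respecting word boundaries; the following is a minimal sketch (my illustration, not the software used in this study):

```python
from collections import Counter

def two_word_collocations(text: str) -> Counter:
    """Frequency of adjacent word pairs (word bigrams), keeping word boundaries."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    return Counter(zip(words, words[1:]))

pairs = two_word_collocations(
    "He wants to be loved, but only on his own terms, and he wants to be believed."
)
print(pairs.most_common(3))   # e.g. (('wants', 'to'), 2), (('to', 'be'), 2), ...
```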

4.10 Sentence Length

Stylistic patterns have been found in sentence length, and statistical analysis makes possible the comparison of authors according to their sentence length distribution. In one of the first studies to use this measure in a case of disputed authorship, Yule analyzed the frequency distribution of sentence length in De Imitatione Christi (Yule 1939, 363–90). He collected 1200 sentences from two likely authors, Thomas à Kempis and Jean Charlier de Gerson, and organized them according to their length in words, arranged into groups of five in a series of tables. Fig. 4.1 brings together a few of the results from three tables in Yule’s article (Yule 1939, 387–88).

Fig. 4.1 Frequency distribution of sentence length in Thomas à Kempis and Jean Charlier de Gerson

Words per Sentence   De Imitatione   à Kempis   de Gerson
1–5                             39         47          59
6–10                           302        251         166
11–15                          376        333         223
16–20                          237        217         191
21–25                          119        137         146

Yule noted the close similarity between De Imitatione Christi and Thomas à Kempis in comparison to de Gerson. In The Statistical Study of Literary Vocabulary, he also analyzed 8200 words (noun tokens) in relation to vocabulary (word types) and discovered that De Imitatione Christi has the lowest number of types (1168), followed by à Kempis (1406), while de Gerson has 1754 (Yule 1944, 237). Both tests (as well as many others he carried out in Chapters 9 and 10 of his book) reinforce the hypothesis that à Kempis is the author of De Imitatione Christi. Sentence-length analysis of screenplays can be problematic due to their layout, for different sections of text are marked by formatting rather than punctuation (which tends to make the sentences long), while dialogue is short. However, in a study devoted only to the comparison of authors (rather than the screenplay’s intrinsic nature), where the formatting is standard and consistent between screenplays, comparison remains a viable option.
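Yule’s grouping of sentence lengths into bands of five can be reproduced mechanically; the sketch below is an illustration of the measure, not the procedure applied to the screenplays in later chapters.

```python
import re
from collections import Counter

def sentence_length_distribution(text: str, bin_size: int = 5) -> Counter:
    """Frequency distribution of sentence lengths, grouped into bands of five words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    dist = Counter()
    for s in sentences:
        n = len(s.split())
        lower = ((n - 1) // bin_size) * bin_size + 1     # 1-5, 6-10, 11-15, ...
        dist[f"{lower}-{lower + bin_size - 1}"] += 1
    return dist

sample = ("It was no trick. He made it look easy. The paper cost him a million dollars "
          "in its first year, and he expected to lose the same again.")
print(sentence_length_distribution(sample))
```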

4.11 Cluster Analysis Clusters, multidimensional scaling, and dendrograms (or tree diagrams) reduce several textual features to a two-dimensional plane, symbolize those features as points on that plane, and represent their proximity and distance visually. Features with similar values are grouped close together into clusters (preserving the relative distances between the data points), while variables with different values are grouped separately. A multidimensional scaling graph is effective when it represents the data as internally cohesive clusters that are separate from other clusters. In stylometry, samples from the same author (and, on a secondary level, the same screenplay) should cluster together, while different authors and their screenplays should form separate clusters. A cluster dendrogram offers a similar visual representation of the data, which are organized into groups based on their similarities. Dendrograms are



hierarchically organized and are created from agglomerative clustering, which begins with the single samples and progressively groups them into larger, less distinctive clusters. Each dendrogram displays the successive stages of clustering, with the new levels created from the previous levels. In contrast, multidimensional scaling presents a snapshot of the single samples seen from a different perspective. Above we saw that Ledger and Merriam approached the coauthorship of The Two Noble Kinsmen using straightforward letter frequency measurements and cluster analysis. They divided the play into its preexisting 26 segments—24 scenes, prologue, and epilogue, whose word length varies from 579 words (Act 1 scene 5) to 10,746 words (Act 3 scene 6). They carried out their analysis on the assumption that Shakespeare and Fletcher worked separately (the third scenario of coauthorship presented in Chap. 2, where authors remain disconnected and write separate parts of a document), and they were guided by the core idea behind all letter n-gram analyses that “authors have a distinctive and characteristic letter frequency pattern” (Ledger and Merriam 1994, 235). Ledger and Merriam organize the data into 21 letter groups: The variables used are the frequencies of all the letters of the alphabet excluding Q, X and Z. These latter variables were found to be low scoring, sometimes in shorter samples registering zero, so that it was potentially troublesome to include them. Tests showed that their exclusion did not effect [sic] the outcome. I and J were added to form one variable, (IJ), as also were U and V, to form the variable UV (Ledger and Merriam 1994, 237).

The plays of six authors were initially included: Shakespeare and Fletcher, plus a control group consisting of Henry Chettle, Thomas Dekker, Thomas Heywood, and Anthony Munday. Ledger and Merriam used cluster analysis to represent several 25,000 letter samples from each of these authors. Based on 21 variables (letters of the alphabet), the samples from each author tend to cluster together, which means that samples from different authors remain relatively separate. This initial test demonstrated that cluster analysis worked reasonably well, with a number of false results (or misclassifications) due to the attempt to distinguish a large number of authors based on a straightforward method (cluster analysis). After establishing the validity of cluster analysis in addressing this coauthorship problem, Ledger and Merriam deleted the control group of four authors, leaving the Shakespeare and Fletcher samples and adding samples from The Two Noble Kinsmen. The authorship of a scene from The Two Noble Kinsmen is attributed to Shakespeare or Fletcher if it clusters with their samples; it is left unattributed if it forms part of a separate cluster. This technique produced an effective attribution, with two thirds of the scenes clustering around Shakespeare and the other third clustering around Fletcher, although it also produced several indeterminate attributions caused primarily by small sample sizes (i.e., short scenes), which suggests that samples below 500 words (or approximately 2500 letters) are unreliable (Ledger and Merriam 1994, 244).13 Ledger and Merriam conclude that:

13 The authors divide up The Two Noble Kinsmen, attributing scenes to Shakespeare and Fletcher, in two slightly different ways on pages 244 and 245.


The advantage of our approach is that it is easy to apply, does not require measurements of large numbers of variables (which subsequently have to be reduced to just a few), and it gives immediate access to the essential data for serious analysis, adapted to the specific needs of the enquiry. Here, with just 21 easily measured variables, we have provided an adequate basis for a more extended study of authorship discrimination (Ledger and Merriam 1994, 247).

Furthermore, their study of The Two Noble Kinsmen not only demonstrates what can be achieved with unigram data and cluster analysis; their division and analysis of a coauthored text to determine who wrote what sections offers a model of how to determine the contribution of co-authors to the same text. Chapter 5 carries out cluster analysis and produces a multidimensional scaling graph and a dendrogram by uploading the samples to the Texts Similarity Analysis tool on the online site WebStyml.14 The clusters are not based on the exact same variables used in the rest of this study; instead, the software uses its own combination of values that derive from frequency counts of punctuation, words, and collocations, with an emphasis on lexical similarity between clusters. The software creates clusters by quantifying the similarity within and the distance between linguistic variables. But similarity and distance are not fixed values; they are defined and calculated differently by a variety of measures such as Euclidean, Manhattan, and Cosine.15 After a series of experiments, I set the similarity/distance measure in the cluster analysis to Cosine.
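The core of the letter-frequency approach, and of the cosine measure chosen above, can be sketched in a few lines. This is an illustration of the idea only, not WebStyml’s implementation, and it simplifies Ledger and Merriam’s scheme (it does not merge I/J or U/V into single variables).

```python
import math
import string
from collections import Counter

LETTERS = [c for c in string.ascii_lowercase if c not in "qxz"]  # Ledger and Merriam's exclusions

def letter_profile(text: str) -> list:
    """Relative frequencies of the alphabet (minus Q, X, Z), as a vector."""
    counts = Counter(c for c in text.lower() if c in LETTERS)
    total = sum(counts.values()) or 1
    return [counts[c] / total for c in LETTERS]

def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two frequency vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Higher similarity suggests the two samples would cluster together.
print(cosine_similarity(letter_profile("sample text by author one"),
                        letter_profile("another sample by a second author")))
```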

4.12 Statistical Significance When a sample of data is drawn from a larger (possibly infinite) population, a significance test indicates how representative the sample is of that population. It establishes a boundary or threshold to distinguish a representative sample from an unrepresentative sample. The boundary is a fixed numerical point, a p value traditionally set to 0.05 (5%) or occasionally 0.01 (1%), which expresses the probability that the data sample is representative of the population. The initial assumption (the null hypothesis) is that samples are representative of or consistent with the population; in numerical terms, the p value is above the 5% threshold. But if the p value is below the threshold, the sample is said to be significantly different—in the sense of being unexpected, improbable, or rare, for it comprises data whose extreme values place them on the border of or outside the population.  WebStyml  – an Open Stylometric System based on Multilevel Text Analysis, developed by Maciej Eder (University of Kraków), Maciej Piasecki, and Tomasz Walkowiak (both at the Wrocław University of Technology): https://ws.clarin-pl.eu/webstyml.shtml?en 15  The Euclidean distance technique directly measures the physical distance between the points in the graph; the Manhattan technique measures the distance between points by combining vertical and horizontal movement (as if moving through city blocks in Manhattan); the cosine technique turns data points into vectors—that is, traces a line back to the origin or zero point on the graph— and then measures the angles between the vectors of each point in order to calculate their distance. 14

4.13  The Distinctiveness Ratio, Confidence Intervals, and Effect Size


However, dissatisfaction with the limitations of statistical significance tests has been growing for several decades because they only offer a partial or incomplete explanation of the importance of statistical data. Significance tests are structured around a binary choice (a sudden cut-off point where the data are significant or not significant), a choice defined in terms of a single fixed point, the p value. In other words, p values only provide limited information—the threshold for rejecting the null hypothesis and for defining a different or extreme set of data as significant; they do not indicate the strength (size, magnitude) of significance. In contrast, effect size quantifies how large and important the value of one variable is in relation to the value of another variable independent of sample size. Due to the limitations of the significance test, the American Statistical Association recommends that “Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold” (Wasserstein & Lazar 2016, 132). Effect size (in conjunction with confidence intervals) is proposed as one alternative to significance testing. In Chap. 5 and 6, I use Cohen’s d to measure effect size, the scale or magnitude of the difference between two samples.16 It is calculated by taking the difference between the two sample means and dividing the result by the pooled standard deviation of both samples.17 The results are measured in units of standard deviations. For Jacob Cohen, a result of 0.2 is a small effect size (the difference between the two samples is small), 0.5 is a medium effect size, and 0.8 (and over) is a large effect size, signifying a standard deviation difference of 0.8 between the two samples. If d = 0, this means the two samples are identical (Cohen 1988).

4.13 The Distinctiveness Ratio, Confidence Intervals, and Effect Size Ellegård’s distinctiveness ratio (outlined above) is an effect size that is more commonly called relative risk in medical statistics. As we saw above, a ratio measures the relative size of one group in relation to the other group. In medical terms, relative risk is a ratio that compares a group exposed to a new treatment to a control group that was not exposed. Relative risk measures how much more (or less) likely the outcome of a treatment is in the exposed group compared to the control group. In other words, relative risk measures the effect size between a treatment and its outcome; it determines how well a treatment influences the outcome (rather than whether the treatment works or not—a binary choice offered by significance tests). A relative risk of 1 means an equal risk or no difference between a treatment group subjected to medical intervention and a control group not subjected to intervention,  I will in fact use a measure that corrects Cohen’s d for bias in the pooled standard deviation in small samples, and when comparing two different-sized samples. I use the following software (option 2: Comparison of groups with different sample size): https://www.psychometrica.de/ effect_size.html 17  Cohen’s d = (mean of sample A – mean of sample B) / pooled standard deviation. 16

66

4  The Statistical Analysis of Style: Aims and Methods

while a relative risk below 1 signifies a reduction in risk and above 1 signifies an increase in risk. Relative risk and the distinctiveness ratio set out to identify not only whether a difference exists between two groups (or authors, etc.) but also the magnitude of that difference. When calculating the distinctiveness ratio, Ellegård warned of the dangers of low-frequency words, whose frequency may fluctuate widely across a sample. To identify which features remain fairly constant across an author’s writing, he recommended dividing samples into segments. “In order to test the hypothesis that the usage of an author remains fairly constant,” he wrote: […] it is necessary to compare the different parts of an author’s work with one another. For this reason, the material of each author was divided into portions, each of 2000 words. For instance, the primary Junius material (items 1–41) of 82,200 words was divided into 41 sections. The figure of 2000 was chosen as the probable minimum size of text suitable for statistical identification (Ellegård 1962, 26).18

By segmenting his Junius material into 2000-word samples, Ellegård was able to identify fluctuations in the frequency of the distinctive linguistic features across that material by calculating their mean and standard deviation. He discovered that frequencies are not constant but vary slightly and calculated that “the standard deviations of the values […] are between 10 and 20% greater than would have been the case if the frequencies had been constant throughout the text” (Ellegård 1962, 29). He decided against deleting low-frequency words that fluctuated erratically because highly distinctive words usually have a low frequency. Instead, he recommended arranging the distinctive words into groups to ensure that the observed values in the samples sufficiently approximate the population parameters. A group comprises a collection of words (and other linguistic features) with similar distinctiveness ratios. Grouping words in this way avoids the problem of low frequency and variation, for the frequencies of the individual words with similar distinctiveness ratios are simply aggregated, and the group is represented by its average distinctiveness ratio. Sensibly, Ellegård only used the distinctiveness ratio for creating the plus and minus groups (he placed words with a distinctiveness ratio above 1.5 in the plus group and words below 0.7 in the minus group). He carried out all calculations on the relative frequencies of these words rather than on their ratio value, for it is best to compare the data as directly as possible rather than mediate and abstract it through several tests. Ellegård’s distinctiveness ratio cannot in itself take the place of significance tests, for it is a statistical technique that only describes sample data; it cannot be used to make inferences about the population from which those samples were taken, nor can it express the degree of confidence in how representative samples are of that population. To address these limitations, Ellegård supplemented his distinctiveness ratio with a confidence interval and confidence level, both of which help to generate  Similarly, in choosing words that distinguish Hamilton from Madison, Mosteller and Wallace tried to avoid words whose frequency varied excessively: ‘for the main study special pains should be taken to examine all candidates for the final list of discriminators for evidence of large variability between [an author’s different] writings’ (Mosteller and Wallace [1964], 22). 18

4.13  The Distinctiveness Ratio, Confidence Intervals, and Effect Size

67

inferences from samples to population. At the sample level, a confidence interval estimates the value of a parameter in the population (such as its mean) via a range of values enclosed within a lower and upper confidence limit or boundary, and a 95% level of confidence means that an interval estimates the parameter with 95% certainty (which means there is a 5% chance the parameter falls outside the interval). When the population parameters are unknown, an interval is calculated from several variables in the sample: sample size (n), its mean (x-bar), standard deviation (s), and the level of confidence required. The traditional or conventional level of confidence, 95%, is equivalent to 1.96 standard deviations from the mean (in a large normal distribution), whereas 99% confidence is equivalent to three standard deviations.19 At the 95% level of confidence, the lower limit of the interval is calculated by subtracting the sample’s mean from 1.96 times its standard deviation, which is then divided by the square root of the sample size, while the upper limit of the interval is calculated by adding the sample’s mean to 1.96 times its standard deviation, divided by the square root of the sample size. Like the p value, a confidence interval plus its confidence level can function as a test of significance.20 But unlike the p value, the confidence interval presents a range of values rather than a single fixed number. The width of that interval is determined by the confidence level and sample size, with a narrow interval offering a more precise estimate of the population parameters than a wide interval.21 Confidence intervals and confidence levels resemble quality control charts used in statistical process control (SPC), which monitors the variation in manufacturing processes by regularly sampling products to determine if they remain within tolerable control limits. The samples are measured in order to detect variations, and the measurements are plotted on a control chart. The mean and standard deviation of these measurements are calculated to determine the stability of and variation within the samples. The upper confidence interval is called the upper control limit, and the lower confidence interval is the lower control limit. The variations within the samples need to remain within these control limits.

 In smaller samples, the t-distribution with n-1 degrees of freedom is used instead of standard deviation. 20  The two types of test at first appear to be similar: for the p-value, the significance threshold is traditionally defined at 5%, while the 1.96 standard deviation threshold in an interval estimate defines a non-significant difference (the work of the same author) within the 95% interval, with means that a significant difference (a potential change in authorship) lies outside the 95% realm— that is, in the remaining 5%. They are similar because a p-value set at 0.05 is equivalent to 1.96 standard deviations. Nonetheless, an interval estimation (in combination with the distinctiveness ratio) takes into account effect size and defines a flexible range of values rather than a single fixed point. 21  The calculation of a confidence interval for relative frequencies is variable, for the interval needs to be calculated from the sample data, and because each sample’s mean and standard deviation are different, each sample will generate a different interval width. In contrast, a confidence interval for a ratio is fixed, because the ratio expresses the difference between the two quantities as a multiple, and multiples remain the same. Ellegård’s ratio expresses the fixed multiples (0.7 and 1.5) two writing samples must exceed to be considered samples from different authors. 19

68

4  The Statistical Analysis of Style: Aims and Methods

The relevance of confidence intervals (or upper and lower control limits) for authorship attribution will become clearer by discussing Ellegård’s first test (Program 1), presented in Chap. 4 of A Statistical Method for Determining Authorship (1962, 28–38). When the frequencies of the same words differ in two authors being compared, how do we know if the different frequencies are due to mere random fluctuations in the samples rather than to real differences between the authors? After organizing the words into plus and minus groups, Ellegård used the 95% confidence interval to determine whether a difference in the relative frequency of a group of variables between authors is due to fluctuations or reflects real differences. He used the upper and lower confidence limits to define the boundaries of authorship, whereby frequencies must differ by at least 1.96 standard deviations from the mean to be regarded as a real difference in authorship. In this first test, Ellegård created a control group from 41 texts by Junius (82,200 words) and 88 texts by 79 potential authors (the one-million-word sample). He calculated the distinctiveness ratio of 458 words in this control material and calculated the distinctiveness ratios and relative frequencies of all 458 words (Ellegård 1962, 22).22 In the example presented above, uniform has a distinctive ratio of 4.3, making it a plus word distinctive of the Junius sample in relation to the one-million-word sample. Ellegård then grouped together those words whose distinctiveness ratio is 1.5 and above into the plus group, comprising words distinctive in Junius (because their frequency is high in Junius in relation to the one-million-word group). He also grouped together words whose distinctiveness ratio is 0.7 and below into the minus group, comprising words distinctive in the one-million-word sample (because their frequency in the one-million-word sample is high in relation to Junius—which of course means that their frequency is low in Junius). Ellegård then calculated the mean and standard deviation of each group.23 In the Junius/plus group, the mean  =  317 (per 10,000 words) and the standard deviation  =  33.4. In the one-­ million-­ word sample/minus group, the mean  =  116 and the standard deviation  =  11.4. From these figures, he calculated the confidence intervals, which he defined at the 95% confidence level—1.96 (or 2) standard deviations (2 s) from the mean of the plus and minus control groups: Hence, the 2 s range—which in a normal distribution contains c.95% of all the values—will be 243–377 for the plus group, and 96–142 for the minus group […] It is between these values that we should expect to find most of the texts written by Junius (Ellegård 1962, 35).

What this means is that all the fluctuations within the 2 s are attributable to sampling fluctuations (with 95% confidence), and fluctuations that are more than 2 s are due to a real effect, such as a different author. Rather than a single fixed p value defining the threshold of significance (in this case, the threshold of different authorship), Ellegård generated two confidence intervals—an interval for the Junius group  Ellegård constructed his list intuitively, from reading all the material. He presents all 458 words (which include a few phrases) and their frequencies in Appendix II of his study. 23  Ellegård defined the relative frequencies per 10,000 words. The percent difference can be found by dividing the frequencies by 100. 22

4.13  The Distinctiveness Ratio, Confidence Intervals, and Effect Size

69

(the plus group) and an interval for the one-million-word group (the minus group).24 Both of these confidence intervals define the boundaries of Junius’s style because the first interval defines Junius’s words with a high frequency in relation to the one-­ million-­word sample and the second interval defines Junius’s words with a low frequency in relation to the one-million-word sample. To avoid confusion, these groups can be defined from one perspective, that of Junius: the first group (with high-­ frequency Junius words) can be called the Junius plus group, which generates an interval of 243–377, and the second group (with low-frequency Junius words) is called the Junius minus group, which generates an interval of 96–142. These two intervals define Junius’s statistical profile. Ellegård then compared Junius’s statistical profile to the million-word sample in order to determine if the test distinguishes them. He added up the relative frequencies of the plus group and the minus group and presented all the results in a table (Table III in Ellegård 1962, 34). If both the plus and minus values for one of the 88 sample texts fall into the Junius intervals, this means the sample text is similar to the Junius text; if both values fall outside the intervals, this means the sample text is distinct from the Junius text; and if one (plus or minus) value falls inside an interval and the other falls outside it, this means the sample text manifests similarities and differences to the Junius text. Ellegård discovered that, of the 88 texts in the million-­ word sample, “two fall within the Junian range in the positive group, and another six not far short of it. No less than ten texts attain Junian values in the negative group. No text, however, obtains Junian values in both groups. We may therefore maintain that the test excludes all the 88 texts in the comparative material fairly efficiently” (Ellegård 1962, 35). Although Ellegård presented the results in a table, he did not visualize them in graphs. I visualize the results in Figs. 4.2 and 4.3. Figure 4.2 shows the Junian range in the plus group (a confidence interval defined earlier—243–377), together with the results for the 88 texts of the million-word samples. This graph visualizes the comparison between the Junius sample and the 88 texts of the one-million-word sample. We can see two samples (texts 158 and 159) that Ellegård says fall inside the Junian range and the six other close to the interval. Similarly, Fig. 4.3 shows the results for the Junian range in the minus group—the confidence interval defined earlier (96–142), together with the results for the 88 texts of the million-word samples. This graph visualizes the comparison between the Junius sample and the 88 texts of the one-million-word sample. We can see the ten samples that Ellegård says fall inside the Junian range (texts 127, 132, 157, 176, 208, 238, 239, 243, 246, 250). In sum, the texts that fall outside both of the Junian intervals are distinct from Junius; the texts that fall inside one of the Junius intervals are partly similar to Junius; and no text falls into both the plus and minus Junius intervals, demonstrating

 The width of the interval provides more information about the sample data than a p-value, for a wide interval indicates sizeable variability of that sample data. 24



Fig. 4.2  The Junian plus group confidence interval (243–377) together with the 88 texts of the million-word sample

Fig. 4.3  The Junian minus group confidence interval (96–142) together with the 88 texts of the million-word sample

that the test is partly successful in distinguishing Junius from the million-word sample authors. Ellegård also ran a test to determine if Sir Philip Francis is Junius. To support this hypothesis, founded on historical evidence, he compared eight sample texts from Francis to the overall Junius sample. His aim is the opposite of the million-word test: that test aims to distinguish Junius from the 88 texts of the million-word



sample, and the test is successful if those 88 samples fall outside the Junius intervals. However, the Francis test aims to identify similarities between Junius and Sir Philip Francis. Identifying similarities is straightforward: the high (plus) and low (minus) frequency words in the Francis samples must correspond to the high- and low-frequency words in the overall Junius sample. In the test, this similarity is signified by the Francis sample falling inside the Junius interval. Ellegård discovered that: Only one value in the negative group is outside the Junian range, and two in the positive group. Five of the eight texts have Junian values in both groups, all of them in at least one. It is obvious that our test very strongly supports Francis’ Junian claims. The hypothesis that Francis was Junius has been considerably strengthened (Ellegård 1962, 35–36).

Ellegård presented the results in a table but did not visualize them in graphs (Table IV in Ellegård 1962, 35). As with the previous test, I visualize the results in Figs. 4.4 and 4.5. Figure 4.4 shows the results for Francis’s eight samples from the plus group superimposed over the Junian plus confidence interval defined earlier—243–377. This graph visualizes the similarity of each of the eight Francis samples with the overall Junian sample. We can see the six Francis samples that fall inside the Junian range and the two Francis samples (the third and the eighth) that Ellegård says fall outside the Junian range. Similarly, Fig. 4.5 shows the results for Francis’s eight samples from the minus group, superimposed over the minus Junian confidence interval defined earlier—96–142. We can again see the seven Francis samples that fall inside the Junian range and the sample that Ellegård says falls outside the Junian range (the second, with the first sample on the interval).

Fig. 4.4  Sir Philip Francis's eight samples from his plus group superimposed over the Junian plus confidence interval


Fig. 4.5  Sir Philip Francis's eight samples from his minus group superimposed over the Junian minus confidence interval

For Ellegård, the key factor in matching one author with another author is the similarity in both their plus and minus word groups, for similarity in both groups presents a strong case for shared authorship. In the Francis/Junius comparison, five out of eight Francis samples fall into both the plus and minus Junius intervals. At the end of Program 1, he reorganized his plus and minus words into groups comprising only the highest value words—above 2.5 and below 0.4, which created a slightly better separation between Junius and the comparative material (Ellegård 1962, 38). To improve his test further, he decided to revise it in several ways: by studying what he called alternative words (i.e., synonyms) and by creating another sample limited to one genre of writing (political writings that match the Junius letters). In Chap. 5, I apply the statistical methods outlined in this chapter to the control group of screenplays of known authorship to identify stylistic features that distinguish Mankiewicz from Welles. In addition, the majority of the results will be subjected to Ellegård's method (the distinctiveness ratio and confidence intervals) to determine what variables are the most relevant in distinguishing the two authors. Ellegård's first test (Program 1) does not need to be modified for this study, for it is limited to two authors and to the same genre of writing (it is an empirical question to be answered in Chap. 5 whether the plus or minus groups need to be limited to the highest value results). Ellegård's method was chosen for this study because it proposes a simple approach that fulfills this book's agenda—a comparative discriminant analysis that identifies distinctive linguistic variables that register real effects or differences between Mankiewicz and Welles. But, just as importantly, the distinctiveness ratio is used because it quantifies the effect size between those distinctive variables, which cannot be measured by significance tests. (Both the


statistical significance test and effect size observe a difference in the data, but each offers a different explanation of that difference.) At the end of Chap. 5, I create a composite statistical profile of Mankiewicz and Welles from a list of their multiple distinctive plus and minus linguistic features. Only highly distinctive variables are relevant; if the results for each author are similar (because they use the variables at a similar rate), then those variables are irrelevant, for they are unable to distinguish the two authors. In addition, keeping too many similar variables in the samples will give the impression that the samples are related, even though the similarities may be a general feature of language rather than a similarity of authorship. “The problem,” Ledger states, “is one of reducing the variable set in a way which will not prejudge the results or cause such a high level of overlapping and misclassification of samples as to make interpretation very difficult” (Ledger 1989, 58).

References

Blatt, Ben. 2017. Nabokov's Favourite Word is Mauve: The Literary Quirks and Oddities of Our Most-Loved Authors. London: Simon & Schuster.
Bridgman, P.W. 1927. The Logic of Modern Physics. New York: The Macmillan Co.
Cannon, John, ed. 1978. The Letters of Junius. Oxford: Oxford University Press.
Cohen, Jacob. 1988. Statistical Power Analysis for the Behavioral Sciences, second edition. New York, NY: Academic Press.
Daley, Robert, Susan Jones, and John Sinclair. 2004. English Collocation Studies: The OSTI Report, edited by Ramesh Krishnamurthy. London: Continuum.
Ellegård, Alvar. 1962. A Statistical Method for Determining Authorship: The Junius Letters 1769–1772. Gothenburg: Gothenburg Studies in English.
Ginzburg, Carlo, and Anna Davin. 1980. Morelli, Freud and Sherlock Holmes: Clues and Scientific Method. History Workshop 9: 5–36.
Graovac, Jelena, Jovana Kovacevic, and Gordana Pavlovic-Lažetic. 2015. Language Independent n-Gram-Based Text Categorization with Weighting Factors: A Case Study. Journal of Information and Data Management 6 (1): 4–17.
Hockey, Susan. 1980. A Guide to Computer Applications in the Humanities. London: Duckworth.
Holmes, David I. 1994. Authorship Attribution. Computers and the Humanities 28 (2): 87–106.
Holmes, D. I., and R. S. Forsyth. 1995. The Federalist Revisited: New Directions in Authorship Attribution. Literary and Linguistic Computing 10 (2): 111–27.
Hoover, D. L. 2002. Frequent Word Sequences and Statistical Stylistics. Literary and Linguistic Computing 17 (2): 157–80.
Jackson, MacDonald P. 1979. Studies in Attribution: Middleton and Shakespeare. Salzburg: Salzburg University Press.
Jackson, MacDonald P. 2003. Defining Shakespeare: Pericles as Test Case. Oxford: Oxford University Press.
Juola, Patrick. 2008. Authorship Attribution. Foundations and Trends in Information Retrieval 1 (3): 233–334.
Kenny, Anthony. 1982. The Computation of Style. Oxford: Pergamon Press.
Kenny, Anthony. 2016. The Aristotelian Ethics: A Study of the Relationship between the Eudemian and Nicomachean Ethics of Aristotle, second edition. Oxford: Oxford University Press.
Kešelj, Vlado, Fuchun Peng, Nick Cercone, and Calvin Thomas. 2003. N-gram-based Author Profiles for Authorship Attribution. Proceedings of the Conference Pacific Association for Computational Linguistics. Halifax, Nova Scotia: Computer Science Department, Dalhousie University: 255–64.
Klaussner, Carmen, John Nerbonne, and Çağrı Çöltekin. 2015. Finding Characteristic Features in Stylometric Analysis. Digital Scholarship in the Humanities 30 (1): i114–i129. https://doi.org/10.1093/llc/fqv048
Ledger, Gerard. 1989. Re-Counting Plato: A Computer Analysis of Plato's Style. Oxford: Clarendon Press.
Ledger, Gerard, and Thomas Merriam. 1994. Shakespeare, Fletcher, and the Two Noble Kinsmen. Literary and Linguistic Computing 9 (3): 235–48.
Martindale, Colin, and Dean McKenzie. 1995. On the Utility of Content Analysis in Author Attribution: The Federalist. Computers and the Humanities 29 (4): 259–70.
Mosteller, Frederick, and David L. Wallace. 1964. Inference and Disputed Authorship: The Federalist. Reading, Mass.: Addison-Wesley.
Pierce, F. E. 1909. The Collaboration of Webster and Dekker. New York: Henry Holt and Company.
Shahram, Morteza, David G. Stork, and David Donoho. 2008. Recovering Layers of Brush Strokes Through Statistical Analysis of Color and Shape: An Application to van Gogh's Self Portrait with Grey Felt Hat. Proc. SPIE 6810, Computer Image Analysis in the Study of Art, 68100D. https://doi.org/10.1117/12.765773
Stamatatos, Efstathios. 2013. On the Robustness of Authorship Attribution Based on Letter n-gram Features. Journal of Law and Policy 21 (2): 421–39.
Tweedie, F. J., S. Singh, and D. I. Holmes. 1996. Neural Network Applications in Stylometry: The Federalist Papers. Computers and the Humanities 30 (1): 1–10.
Vickers, Brian. 2002. Shakespeare, Co-Author. Oxford: Oxford University Press.
Wasserstein, Ronald L., and Nicole A. Lazar. 2016. The ASA Statement on p-Values: Context, Process, and Purpose. The American Statistician 70 (2): 129–33.
Wollheim, Richard. 1973. Giovanni Morelli and the Origins of Scientific Connoisseurship. In On Art and the Mind: Essays and Lectures, 177–201. London: Allen Lane.
Yule, G. U. 1939. On Sentence-Length as a Statistical Characteristic of Style in Prose: With Application to Two Cases of Disputed Authorship. Biometrika 30: 363–90.
Yule, G. U. 1944. The Statistical Study of Literary Vocabulary. Cambridge: Cambridge University Press.

Chapter 5

Distinguishing Mankiewicz from Welles: Training Phase Results

The comparative statistical analysis of writing style carried out in the remainder of this book consists of a training phase and a testing phase. The training phase, which I undertake in this chapter, analyzes the Mankiewicz and Welles 40,000-word samples three times. In the first analysis, I follow closely Alvar Ellegård’s (1962) methodology by identifying distinctive words. Secondly, I analyze the samples to identify in them additional distinctive linguistic features below and above the level of the word that are capable of distinguishing Mankiewicz from Welles. Thirdly, I divide the 40,000-word samples into smaller 4000-word samples in order to compare different parts of the larger sample with each other to measure the numerical fluctuation of distinctive linguistic features across an author’s work. All three analyses calculate relative values: rather than compare each author to some hypothetical absolute point of reference,1 I compare the two authors’ samples in relation to each other—however, with the proviso that the samples are representative of the population (an author’s entire output), which is measurable via confidence intervals. Like other stylometric studies carried out within the arts and humanities, the following study of the long-running dispute over the coauthorship of the Citizen Kane screenplay uses the most parsimonious statistical methods to distinguish Mankiewicz from Welles. To that extent, in this study, I align myself more with Anthony Kenny rather than with Frederick Mosteller and David L. Wallace. Kenny solves authorship problems by importing a limited number of standard statistical tests into his qualitative research. For example, he writes at the end of his study of the New Testament that he has “refrained from using any but the more elementary statistical

1 Differentiating two authors by comparing them to one another is distinct from the study of keywords, or keyness, in which an author (or a text) is compared to a reference corpus (such as the one-billion-word Corpus of Contemporary American English [COCA]): 'In a quantitative perspective, keywords are those whose frequency (or infrequency) in a text or corpus is statistically significant, when compared to the standards set by a reference corpus' (Marina Bondi 2010, 3).



techniques,” maintaining that more sophisticated methods would not likely “alter any of the conclusions presented here,” although they would “undoubtedly permit a more sophisticated and graphic method of presenting the evidence” (Kenny 1986, 121–22). On the other hand, Mosteller and Wallace’s main aim was to contribute to statistics—to compare two different approaches to statistical discrimination (the classical frequentist method and the methods of Bayesian inference) using authorship attribution as an example. In the Preface to their study of the disputed authorship of the Federalist papers, they wrote: “While we help to solve this historical problem, our practical application of Bayes’ theorem to a large analysis of data is a step to testing the feasibility of a method of inference that has been heavily criticized in the past” (Mosteller and Wallace 1964, ix).

After presenting the analysis of distinctive plus and minus words, I examine in the second analysis the following linguistic features, beginning with the smallest and progressing systematically to the largest features:

• Punctuation.
• N-gram (1, 2, and 3) analysis.
• Contractions.
• Word-length frequency distributions.
• Word frequency profile.
• Vocabulary analysis (including a percentage of Old English vocabulary).
• Two-word collocational analysis.
• Sentence-length frequency distributions.
• Cluster analysis.

These analyses constitute a multivariate examination of each author’s samples. The software program “Quantitative Index Text Analyzer” (QUITA) counted the observed frequencies of linguistic features.2 The two authors’ screenplay samples are the same in terms of word count but differ slightly in byte and letter count: the Mankiewicz 40,000-word sample comprises 218,190 byte-level n-grams (including 165,314 letter tokens), and the Welles sample has 219,881 byte-level n-grams (including 163,924 letter tokens), a rather insignificant difference of 1691 bytes and 1390 letters. In order to count all characters, the Quantitative Index Text Analyzer was set to “treat numbers as words” and “treat nonalphanumeric characters as words.” I carried out the comparisons of variables in the Mankiewicz and Welles samples using the byte-level n-gram relative frequencies—except, of course, at the word level, where comparisons between the two authors were carried out using relative frequencies of words. One may expect half as many bigram tokens as unigram tokens. However, the software counts successive n-grams, in which each byte combines with each consecutive byte to form a bigram. For example, the word prologue_ (followed by a space) yields nine unigrams but also nine bigrams: pr, ro, ol, lo, og, gu, ue, e_ (with the space at the end used again in the next bigram). This

2 The Quantitative Index Text Analyzer is available from: https://www.ram-verlag.eu/software-neu/


“sliding window” method ensures that, in comparing two texts, every matching bigram is identified. In contrast, the “fixed-position” block approach to bigram or trigram analysis would only collect successive bigrams (pr, ol, og, ue) or trigrams (pro, log, etc.)—that is, it would only identify a successively smaller selection of n-grams. The analysis of the 40,000-word samples sets out the expected frequencies of these linguistic features in each author. It is only by carrying out a statistical analysis of all the linguistic features listed above that one can identify which are relevant for distinguishing Mankiewicz from Welles. All results from the analyses will be presented because completeness in reporting—describing all decisions made and procedures carried out and presenting both successful and unsuccessful results—is a fundamental requirement of research that aims to avoid selection bias and that strives to be open (by making evident how the data were collected and analyzed). The American Statistical Association’s statement on significance testing emphasizes that conducting multiple analyses of data and selectively reporting only successful results undermines the research process:

Cherry-picking promising findings, also known by such terms as data dredging, significance chasing, significance questing, selective inference, and “p-hacking,” leads to a spurious excess of statistically significant results in the published literature and should be vigorously avoided (Wasserstein & Lazar 2016, 132).

The results generated in the following pages were regularly checked against the screenplay samples to see the data in situ. Furthermore, to ensure that the flow of the written text is not interrupted, the majority of the figures that present the data are to be found in the Appendix to this chapter.
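Before turning to the individual tests, the sliding-window counting of byte-level n-grams described above can be illustrated with a short sketch. This is my reconstruction of the counting principle only; it does not reproduce QUITA's actual implementation or its settings for numbers and nonalphanumeric characters.

    from collections import Counter

    def char_ngrams(text, n):
        """Overlapping ("sliding window") character n-grams of length n."""
        return [text[i:i + n] for i in range(len(text) - n + 1)]

    sample = "prologue "           # trailing space, as in the example above
    print(char_ngrams(sample, 1))  # nine single characters (unigrams)
    print(char_ngrams(sample, 2))  # eight overlapping bigrams within this string;
                                   # the trailing space also begins the next bigram
                                   # when more text follows

    def rel_freq_per_10k(text, gram):
        """Relative frequency of one n-gram per 10,000 n-grams of the same length."""
        grams = Counter(char_ngrams(text, len(gram)))
        total = sum(grams.values())
        return 10000 * grams[gram] / total if total else 0.0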

5.1 Distinctive Words

In Ellegård’s formulation (outlined in Chap. 4), a word is distinctive if its relative frequency is substantially different in two comparative samples. Ellegård defined substantially different in terms of ratios: a distinctive word manifests a ratio above 1.5 (a plus word) or below 0.7 (a minus word). I entered the Mankiewicz and Welles samples into the QUITA software to generate the observed and relative frequencies of their words. I compared the two samples to identify words with major differences in frequency; I then quantified those differences by calculating the distinctiveness ratios of the words together with 95% confidence intervals for each individual ratio (to measure its precision and variability). Figure 5.3 in the Appendix displays the plus words that exceed Ellegård’s 1.5 boundary and the minus words that exceed his 0.7 boundary (a double line on the graph marks the border between plus and minus words).3

3 The counts maintain word boundaries. For example, the count for her as a whole word in Mankiewicz is 523; ignoring word boundaries, its frequency increases to 1048, because it is treated


For the sake of consistency and to avoid too much repetition, the distinctiveness ratios have only been calculated from Mankiewicz’s perspective (that is, with Mankiewicz in the numerator position). For example, if has a relative frequency of 0.475% in Mankiewicz’s 40,000-word sample and a relative frequency of 0.208% in Welles, yielding a distinctiveness ratio for Mankiewicz of 1:2.28—or, more simply, 2.28. This means that if occurs 2.28 times more in Mankiewicz than in Welles. This ratio can also be expressed from Welles’s perspective by reversing the frequencies, which yields a ratio of 0.44:1. This means that, for every occurrence of if in Mankiewicz, there are 0.44 occurrences in Welles. These ratios (1:2.28 and 0.44:1) express the same quantity but from different perspectives. Each author has 24 word types in the plus group and 16  in the minus group. Mankiewicz’s plus and minus words add up to 4364 word tokens, with two thirds of them in the plus group. It is no surprise that most of Mankiewicz’s words are in the plus group simply because this group is defined in terms of words with a high frequency in Mankiewicz compared to Welles. And the reverse is the case with the minus group: Welles’s plus and minus words add up to 3450 word tokens, with two thirds of them in the minus group.4 Similarly, the 95% confidence intervals of the plus and minus groups are the reverse of each other: the plus group begins with a high distinctiveness ratio value (11.38) and a wide asymmetrical interval [6.82, 18.96], but as the ratio values fall, the interval gradually narrows and becomes more symmetrical. Only entry 23 (smiling) breaks with this pattern due to its very low frequencies (12 in Mankiewicz and 7 in Welles—too low to be of value as a discriminator; furthermore, it overlaps with another plus word, smiles). The 95% confidence intervals of the minus group display the opposite pattern: they begin with low distinctiveness ratio values with narrow and symmetrical intervals (at least from the second entry—the high-frequency what), but as the ratio values increase, the intervals gradually widen and become less symmetrical, especially noticeable in the final six entries (from last to old). The Cohen’s d effect size for the 24 word types in the plus group is 0.694, with a 95% confidence interval (C.I.) of [0.112, 1.277]. This is between a medium and large effect size. And the Cohen’s d effect size for the 16 word types in the minus group is 0.491, with a 95% C.I. of [−0.213, 1.194], which is a medium effect size. The most common function words (a, to, of, and, and in) are absent from the plus and minus groups because their frequency of occurrence is similar. For example, the ratio of the indefinite article a is 0.99, which indicates that its frequency is almost identical in both authors. Only one function word (at) manifests a distinctive ratio (1.96). With the exception of at, high-frequency function words are not suitable as potential discriminators to distinguish Mankiewicz from Welles.

as a trigram that forms other words such as there and another. However, word count does include contractions. For example, the count for she in Mankiewicz is 485, which includes she’s. 4  Mankiewicz has 27% more word tokens in the plus and minus groups because of his restricted vocabulary, in which the words he does use have a high frequency, whereas Welles’s extended vocabulary means that his frequencies are spread over more word types.
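The distinctiveness ratios and the confidence intervals attached to them can be reproduced with a short calculation. The sketch below is my reconstruction, using the standard log rate-ratio approximation for the interval; with the frequencies reported below for Miss (182 occurrences in Mankiewicz against 16 in Welles, in equal-sized samples), it returns the ratio 11.38 and the interval [6.82, 18.96] quoted above for the top entry of the plus group.

    import math

    def distinctiveness_ratio(count_a, count_b, total_a=40000, total_b=40000):
        """Ratio of relative frequencies, with author A in the numerator."""
        return (count_a / total_a) / (count_b / total_b)

    def ratio_confidence_interval(count_a, count_b, total_a=40000, total_b=40000, z=1.96):
        """Approximate 95% confidence interval for the ratio (log rate-ratio method)."""
        ratio = distinctiveness_ratio(count_a, count_b, total_a, total_b)
        se = math.sqrt(1 / count_a + 1 / count_b)   # standard error of log(ratio)
        return ratio * math.exp(-z * se), ratio * math.exp(z * se)

    print(round(distinctiveness_ratio(182, 16), 2))   # 11.38
    low, high = ratio_confidence_interval(182, 16)
    print(round(low, 2), round(high, 2))              # 6.82 18.96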


Mankiewicz’s most distinctive plus word is Miss, with a frequency of 182 compared to 16 for Welles, yielding a distinctiveness ratio of 11.38. These frequencies exclude the verb miss (which only appears four times in Mankiewicz and two times in Welles) but include Missy, which appears three times in Welles’s sample—specifically, in The Big Brass Ring. (In statistical terms, these frequencies are too low to make any difference to the results.) The frequency of Miss in Mankiewicz is an outlier, indicating either that Mankiewicz refers to his female characters (in the scene text) using this form of address or his characters use it in their dialogue. The same applies to Mr, used 96 times by Mankiewicz and 44 times by Welles. (Nonetheless, Welles does spell out the word Mister 11 times, although this figure has not been added to the total.) Although Welles does not use these forms of address at a high frequency, it is notable that he occasionally draws attention to them when he does use them. In Citizen Kane, the following exchange takes place5: Kane meets up with Bernstein and his former guardian Thatcher to sign a document to relinquish all control over his newspapers. Kane says, “All right, Mr. Bernstein. I read it, Mr. Thatcher. Let me sign it and I’ll go home.” Thatcher objects to being called Mr.: “Too old to call me Mr. Thatcher, Charles.” Kane responds: “You’re too old to be called anything else. You were always too old.” Kane’s strained relationship with his former guardian is expressed not only in terms of their difference in age but also by Kane’s formal mode of address (Mr. Thatcher). The Kane song (not in the screenplay’s seventh draft but added during production) also emphasizes the formal type of address: “It’s Mister Kane!/He doesn’t like the Mister!/He likes good old Char-lie Kane!” And in the opening scene of Touch of Evil, Welles draws attention to the use of Miss and Mrs:6

IMMIGRATION OFFICIAL
Where’re you born, Miss?

SUSAN
Mrs.

IMMIGRATION OFFICIAL
(slightly deaf)
What?

SUSAN
Philadelphia.

In the ensuing conversation, Mike Vargas mentions that Susan is his wife, and her marital status again becomes the topic of discussion with the immigration official. Additionally, Marcia Linnekar, whose father was killed by the bomb explosion  I am attributing the dialogue of this scene to Welles because it is not in the seventh and final draft of the screenplay but was added on set, during filming. 6  In the novel (Badge of Evil), Miss is only used five times, to refer to Miss Linneker, and Mrs is used eleven times, to refer to Mrs. Holt (Susan Vargas). The exchange between the immigration official and Susan quoted here does not appear in the novel. 5


at the beginning of the film, is referred to as Miss Linnekar in the dialogue, although it is eventually revealed that she has secretly married Manolo Sanchez. The two couples are therefore linked—Mike (Miguel) and Susan on the one hand and Marcia and Manolo on the other: two interethnic couples secretly (or hurriedly) married and on opposite sides of the law (Mike is a special investigator attached to the Mexican Ministry of Justice, while Sanchez planted the bomb that killed Marcia’s father, presumably with her blessing). Finally, in The Big Brass Ring, as we have seen already, Pellarin uses the derogatory term Missy (but on only three occasions in the 20,000-word sample, and four times in the complete screenplay) when talking to Cela Brandini. The results from Mankiewicz’s distinctive plus (or high-frequency) words include content words describing place and time: day, room, and door, permanent features of the scene heading and scene text that are open to frequency variation. One could develop this further by listing all the variations (bathroom, bedroom, lunchroom, doorway), but the frequencies are too low in the samples used in this study. For example, Mankiewicz uses the word bathroom once in 40,000 words, while Welles uses it six times. However, the vocabulary can be expanded to include all words that (for example) describe domestic space, an idea I revisit in Chap. 7 via the LIWC (Linguistic Inquiry and Word Count) software.

5.2 Punctuation

In terms of punctuation marks in the Mankiewicz and Welles samples, only the ratio values for the exclamation mark, colon, and ellipsis exceed Ellegård’s [0.7, 1.5] interval (Fig. 5.4 in the Appendix). The magnitude of the exclamation mark’s ratio is slight (for it in fact falls on the 1.5 threshold, with half of the fairly wide 95% confidence interval below the threshold and the other half above it). Welles’s frequency is lower than Mankiewicz’s, confirming the observation made in Chap. 3 that Welles uses fewer exclamation marks, which he edited out of The Magnificent Ambersons. The ratio value for the colon marks a large difference (0.07), for its low frequency in Mankiewicz (24) contrasts sharply with its high frequency in Welles (330). The ellipsis will need to be analyzed in relation to dashes, for we saw in Chap. 3 that they serve a similar purpose in dialogue. A single dash in the scene heading separates the location and time, while ellipses (…) and dashes (-- or – or – with a space before and after) appear in the scene text and in the dialogue.7 In dialogue, they signify a pause; more specifically, dashes indicate an interruption, while ellipses indicate that a character’s dialogue simply trails off or fades. In the 40,000-word samples, Mankiewicz uses 45 ellipses and 851 dashes, whereas Welles uses 1107 ellipses and 1137 dashes. The extreme difference in the frequency of ellipses

7 Published versions of screenplays, in this instance The Big Brass Ring and Citizen Kane (seventh draft), replace two dashes “--” with one long en dash “–”. The table representing punctuation (Fig. 5.4 in the Appendix) combines the count of the three different types of dashes.


between the two authors generates an enormous distinctiveness ratio of 0.04 from Mankiewicz’s perspective, which means that Mankiewicz uses 0.04 of an ellipsis for every ellipsis Welles uses. (Viewing the same data from Welles’s perspective, the ratio becomes 24—in other words, Welles uses 24 ellipses for every ellipsis Mankiewicz uses.) In contrast, the frequency of dashes only generates a ratio of 0.75, which means it is not sufficiently distinctive to separate the two authors. The only other punctuation mark with a ratio value above 1 is the comma, although its ratio of 1.38 does not reach Ellegård’s threshold. Welles’s relative frequency (0.844%) is again lower than Mankiewicz’s (1.167%), confirming the observation made in Chap. 3 that Welles uses fewer commas, for he edited them out when adapting Tarkington’s novel. The other 20,000-word samples align with Mankiewicz’s results: His Girl Friday’s comma count is 0.99%, whereas in All the President’s Men, it is 1.02%. These results show that Welles uses fewer commas than the other screenwriters in the control group, and especially fewer than Mankiewicz. The question mark (like the exclamation mark) appears primarily in the dialogue; its lower frequency count in Mankiewicz compared to Welles indicates that Mankiewicz’s characters ask fewer questions than Welles’s characters, although the dialogue of Mankiewicz’s characters is punctuated with more exclamation marks. However, the distinctiveness ratio for question marks is slight, for it falls into the indistinct neutral zone around the ratio of 1. The number of parentheses in a screenplay is important because they enclose personal directions, in which the screenwriter gives the actor instructions for delivering the lines—such as (quietly)—or they describe an action to be performed with the dialogue—such as (looks up). They are potential stylistic markers because screenwriters can opt to add them to a screenplay. However, the frequencies of parentheses in Welles and Mankiewicz are similar. For this reason, we need to look beyond the parentheses and instead manually extract and analyze from each of the 40,000-word samples the vocabulary within the personal directions. The results are reproduced below. According to the distinctiveness ratio, three of the seven punctuation marks are distinct, whereas commas, parentheses, dashes, and question marks fail to reach the distinctive thresholds. However, the Cohen’s d effect size for the plus group (above the double line) is negligible (close to zero) because the mean and standard deviation of the two groups are very similar—and because the sample size is small. The Cohen’s d effect size for the minus group is extremely high (1.76) due to the enormous differences in the mean and standard deviation of the two groups. However, the very small sample size renders the results unreliable. (From now on, I will only report Cohen’s d effect sizes for the groups of distinctive features.)
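The Cohen's d values reported for these groups follow the standard two-sample formula: the difference between the group means divided by the pooled standard deviation. A minimal sketch, which assumes the two authors' feature frequencies are supplied as plain lists (the example values are hypothetical):

    import statistics

    def cohens_d(group_a, group_b):
        """Two-sample Cohen's d using the pooled standard deviation."""
        mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
        var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
        n_a, n_b = len(group_a), len(group_b)
        pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
        return (mean_a - mean_b) / pooled_sd

    # Hypothetical relative frequencies (per 100 words) for one group of features:
    mankiewicz = [0.48, 0.31, 0.22, 0.15]
    welles = [0.21, 0.18, 0.12, 0.09]
    print(round(cohens_d(mankiewicz, welles), 3))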


5.3 N-Grams

Using QUITA, I carried out an analysis of all 26 letters of the alphabet (unigrams) and extracted their frequency. Only three entries (z, q, and j) manifest a ratio sufficiently distinctive to act as potential discriminators (Fig. 5.5 in the Appendix). All three are notable for having low frequencies, which indicates that the majority of unigrams that have high frequencies also have similar frequencies in both authors, confirming Ellegård’s observation, discussed in Chap. 4, that low-frequency linguistic units are more distinctive. We also saw in the same chapter that Vlado Kešelj and his colleagues argued that unigrams contain too little data to distinguish authors, although Gerard Ledger discovered they were satisfactory in analyzing the chronology of Plato’s work. The Mankiewicz-Welles unigram results side with Kešelj in rejecting all but the lowest frequency unigrams as potential discriminators. The results also show that letters with high frequencies present more stable and precise estimations of population parameters, for their 95% confidence intervals are narrow and the ratio value falls squarely in the center. For example, the interval for the high-frequency y [1.16, 1.26] is narrow and completely symmetrical around its ratio value of 1.21—which is not, however, very distinctive. Compare this to the low-frequency z: it has a wide interval [0.54, 0.88] whose ratio value (0.69) is not in the center but is nonetheless distinctive. In addition, unigrams offer authors less choice in terms of selection, for they are, in George Kingsley Zipf’s words, more “crystallized” than higher level linguistic units (Zipf 1932, Part II). That is, an author cannot choose individual letters (they are not independent variables), for they are bound up with other letters embedded in words. More authorial choice exists at the levels of vocabulary, word groupings, and sentence length, where the degree of crystallization is lower.

I then used QUITA to generate bigrams (Fig. 5.6 in the Appendix). There are 19 distinctive bigrams that exceed the [0.7, 1.5] ratio confidence interval. Following common practice in stylometry, I deleted character cues (which name the characters who are about to speak) from the samples before entering them into QUITA. After the analysis, I also deleted bigrams linked to names mentioned frequently in the dialogue and scene descriptions. In Mankiewicz, I deleted the following bigrams: lu and uk, due to 158 mentions of Luke in A Woman’s Secret; ny, due to 95 mentions of Tony in the same screenplay; ma, which appears in four names in Mankiewicz—Margaret, Matthews, Marie, and Marian; rg and ga, due to 280 mentions of Margaret; ew and ws, due to 76 mentions of Matthews; and nn, due to 180 mentions of Kenneth. In Welles, I deleted the following bigrams: na, which appears in three names in The Big Brass Ring—Menakin, Diana, and Tina—and ak, due to its high frequency in the names Menakin and Blake in The Big Brass Ring, as well as Jake and Otterlake in The Other Side of the Wind. The byte-level bigrams consist of a mix of two letters, a letter and punctuation, and on one occasion a space and a letter (_v). The punctuation found in these bigrams comprises the full stop (period), comma, parentheses, question mark, and apostrophe. Although the ratios of the comma, parentheses, and question mark were


not sufficiently distinctive in themselves, their combination with a letter makes them more distinctive. In particular, the comma becomes a more notable part of Mankiewicz’s style, and the question mark becomes a more notable part of Welles’s style, although the frequencies of these punctuation marks when combined with a letter fall severalfold. Other aspects of style can also be retrieved from some of the bigram frequencies, especially word choice. The significantly higher frequency of the bigram mr in Mankiewicz derives from his stylistic choice to use Mr. (96) and Mrs. (34) in front of characters’ names (primarily in the dialogue). The bigram lk derives from the words walk and talk, both of which are more frequent in Mankiewicz than in Welles: walk appears 60 times in Mankiewicz but only 17 times in Welles, and talk appears 76 times in Mankiewicz and only 19 times in Welles. This does not mean that Welles’s characters walk and talk less; instead, Welles’s characters tend to move (36 times) as opposed to Mankiewicz’s characters (five times), and they also speak 38 times, whereas Mankiewicz’s characters only speak three times. Such observations identify the preferred vocabulary choices of both authors.

QUITA generated an enormous number of trigrams that exceed the threshold of the distinctiveness ratio interval. On this occasion, a ratio of 2 was used as the cutoff point for the distinctive trigrams due to their large number (Fig. 5.7 in the Appendix).8 The remaining trigrams fall into three categories: three consecutive letters (enc, act, qui, ang, and ame), two letters combined with a punctuation mark (such as nt. and ll,), and two letters combined with a space (which can occur before or after the two letters or can appear in between them, as in e_b—which represents the final letter of one word and the first letter of the following word). As before, I deleted distinctive n-grams linked to names mentioned frequently in the dialogue and scene descriptions, such as the trigram lar, which has a frequency of 166 in Welles and only 67 in Mankiewicz (a distinctiveness ratio of 0.40) due to the frequent use of the name Pellarin in The Big Brass Ring. Trigrams tend to exhibit a higher frequency in Mankiewicz compared to Welles due to the differences between each author’s vocabulary. Inevitably, a number of bigrams reappear as trigrams, including l, (in the trigram ll,) and mr (in the trigram _mr). It is important to identify these repetitions in order to avoid counting the same features twice, which would compromise the independence between variables. Before carrying out the tests in Chap. 6, I deleted the repeated linguistic features that have a lower distinctiveness ratio.

8 There are an additional 22 trigrams that have a distinctiveness ratio between 1.99 and 1.5. The decision to report trigrams with a distinctiveness ratio above 2 was made after completing the last section of this chapter, which outlines in more detail the final selection of variables to carry over to Chap. 6.


5.4 Contractions

Next, I extracted the frequency of trigram contractions (Fig. 5.8 in the Appendix), which other researchers (specifically David Hoover and Brian Vickers) have found valuable in distinguishing authors (see Chap. 4). Overall, Mankiewicz’s two screenplays contain 0.693% trigram contractions, whereas Welles’s contain 0.533%, a percentage point difference of 0.160% and a distinctiveness ratio of 1.30. All the trigram contractions have a higher frequency in Mankiewicz compared to Welles. However, only on’ reaches the distinctiveness ratio threshold. This means that on’ is the only trigram contraction that can discriminate between the two authors. Unlike in the research carried out by Hoover and Vickers, contractions play only a small role in distinguishing Mankiewicz from Welles.

5.5 Word Length

I compared the two authors using the sample mean (average), median (middle), standard deviation, and coefficient of variation (standard deviation divided by the average value) of word length (Fig. 5.9 in the Appendix). Mankiewicz’s and Welles’s word lengths are similar, with both averaging just over four letters per word with a standard deviation marginally above two letters, which therefore generates a coefficient of variation of around 50% for each author (who differ by less than 0.5%). To identify variations within these numbers, I decided to tabulate the frequency distribution of word lengths of all the words (i.e., tokens, not types) in each sample. I created histograms and a line graph in Excel (Figs. 5.10, 5.11, and 5.12 in the Appendix).9 Hidden within the nearly identical mean, median, standard deviation, and coefficient of variation of word length figures, we discover a more nuanced and moderately skewed frequency distribution. Mankiewicz shows a marked increase in the frequency of words one to four letters long (a total of 1317 additional words) and a decrease in the frequency of words five and seven letters long (327 fewer words), before the figures tail off into low frequencies and differences. However, in terms of stylometric difference—the quantified deviation of Mankiewicz’s style from Welles’s style—Mankiewicz’s additional one- to four-letter words only represent 3.3% of his overall sample, and Welles’s additional five- and seven-letter words only represent 0.8% of his overall sample. We therefore need to turn to the distinctiveness ratio values of these word length frequencies to determine their difference. We discover that none of the word lengths (with a frequency above 100) reaches the distinctiveness ratio

9 The data was generated from the “Analyze My Writing” website. The frequencies do not add up to exactly 40,000 because the software underestimates the word count by around 2%. See Analyze My Writing: https://www.analyzemywriting.com/index.html


thresholds (Fig. 5.13 in the Appendix), which means that none of the word lengths in Mankiewicz and Welles displays distinct frequency distributions.
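The word-length statistics used in this section (mean, median, standard deviation, coefficient of variation, and the length frequency distribution) can be derived directly from a token list, as in the sketch below. The regular-expression tokenizer here is a simplification and will not match the exact tokenization rules of QUITA or the Analyze My Writing site.

    import re
    import statistics
    from collections import Counter

    def word_length_profile(text):
        tokens = re.findall(r"[A-Za-z']+", text)        # crude tokenizer
        lengths = [len(t) for t in tokens]
        mean = statistics.mean(lengths)
        sd = statistics.stdev(lengths)
        return {
            "mean": mean,
            "median": statistics.median(lengths),
            "sd": sd,
            "coefficient_of_variation": 100 * sd / mean,  # as a percentage
            "distribution": Counter(lengths),             # word length -> frequency
        }

    profile = word_length_profile("The screenplay sample would be pasted in here.")
    print(profile["mean"], dict(profile["distribution"]))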

5.6 Word Frequency Profile

The word frequency profile presents a list of vocabulary items that are used once, twice, three times, etc. The vocabulary from each screenplay sample was extracted using the Vocabulary List function on Wcopyfind and pasted into Excel, and the columns were reorganized according to word frequency. I then added up how many word types occur once, twice, three times, etc. As with G.U. Yule’s results (reported in Chap. 4), the data display an extremely skewed, asymmetrical distribution with a huge discrepancy in the frequency of values. Mankiewicz uses 3908 word types (unique word forms), and Welles uses 5332. Of the 3908 word types, 1903 (48.69%) occur once in the Mankiewicz sample, falling to 555 (14.20%) occurring twice, followed by another huge drop to 323 (7.27%) occurring three times, and 203 (5.19%) occurring four times. The results are similar for Welles: from his 5332 word types, 2884 (54.09%) occur once, 840 (15.75%) occur twice, 423 (7.93%) occur three times, and 221 (4.14%) occur four times (Fig. 5.14 in the Appendix). After the 20 occurrences mark, the number of word types tails off considerably. The analysis is therefore limited to occurrences 1 to 20. At the other end of the figure, a handful of single word types occur hundreds of times. For example, in Mankiewicz, one word (the) occurs 1563 times. The results reveal that only word types that occur 10, 15, and 18 times reach the distinctiveness ratio thresholds, all of which are low frequency, which explains why their confidence intervals are very wide. In a line graph of the 20 entries (Fig. 5.15 in the Appendix), it is only the first seven word types that display any separation. The compression of the remaining values on the graph is due to the exponential decline in frequencies of these first seven types, which tend to flatten the nonexponential declines between the remaining 13 entries. One way to overcome this problem (suggested by C.B. Williams in [1940] and [1970]) is to convert the data to a different scale—specifically, to transform the arithmetic scale into a logarithmic scale (for logarithms are the inverse of the exponent). The logarithmic scale (Fig. 5.16 in the Appendix) linearizes the exponential decline in frequencies, making the remaining differences between Mankiewicz and Welles visible—although on this occasion, these differences remain insignificant. These studies demonstrate that vocabulary is not evenly distributed across a text, for many words occur just once or twice while a handful of other words occur hundreds of times. The word frequency profiles in both Mankiewicz and Welles follow a similar pattern (both broadly conform to Zipf’s law, which posits an inverse relation between the rank and frequency distribution of a word).
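The word frequency profile itself, that is, the number of word types occurring once, twice, three times, and so on, is a frequency spectrum. A sketch of how it can be derived is given below; its counts will differ slightly from those reported above because the tokenization is simplified.

    import re
    from collections import Counter

    def frequency_spectrum(text, max_occurrences=20):
        tokens = re.findall(r"[a-z']+", text.lower())
        type_counts = Counter(tokens)                 # word type -> token count
        spectrum = Counter(type_counts.values())      # occurrences -> number of types
        return {k: spectrum[k] for k in range(1, max_occurrences + 1)}

    # spectrum[1] is the number of types occurring once (hapax legomena), and so on.
    print(frequency_spectrum("to be or not to be that is the question", max_occurrences=5))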


5.7 Vocabulary and Percentage of Old English Vocabulary

The previous section reported that Mankiewicz uses 3908 word types in his 40,000-word sample and Welles uses 5332 word types in his sample. The ratio between word types and tokens in Mankiewicz is 3908/40,000, or approximately one type to every ten tokens (1:10), and in Welles the ratio is 5332/40,000, or approximately one type to every seven and a half tokens (1:7.5). Mankiewicz uses 1424 fewer word types than Welles, a percentage point difference of 30.8% (which means that Welles’s vocabulary is richer or more varied than Mankiewicz’s by a third).10 The type/token ratio is a significant discriminator that can clearly distinguish the style of Mankiewicz from Welles. We can explore vocabulary further by identifying each author’s percentage of Old English words. The screenplays were compared to a file consisting of 5478 Old English words (listed in the OED as still in use today) using Wcopyfind, setting the overlap to one word. Mankiewicz’s screenplays yielded a result of 21,127 Old English word tokens from the 40,000-word screenplay sample, a ratio of 1 to 2 (or 52%).11 In Welles, the results are similar: 20,299 of the words in the Welles sample derive from Old English, a ratio of 1 to 2 (or 50%).12 The percentage of Old English word tokens does not distinguish Mankiewicz from Welles. These results contrast sharply with Yule’s analysis of Old English nouns (summarized in Chap. 4), where he discovered twice as many occurrences in Bunyan as in Macaulay (60.3%/31.9%). These contrasting results illustrate that statistical tests that distinguish one pair of authors do not necessarily distinguish another pair. In terms of word types, Mankiewicz’s sample consists of 900 Old English vocabulary items while Welles’s sample has 1071. Welles’s vocabulary therefore comprises 171 (or 17.35%) more Old English word types than Mankiewicz’s (which is almost identical to Yule’s figure of a 17.2% difference between Bunyan’s and Macaulay’s noun types). Whereas the percentage of total Old English word tokens is similar in both authors, Welles uses a wider variety of word types, which means that his vocabulary is richer than Mankiewicz’s (who more often repeats the same smaller group of word types). Unlike Old English word tokens, the percentage of Old English word types is, like vocabulary in general, a significant discriminator between Mankiewicz and Welles.

10 Percentage point refers to the arithmetic difference between two percentages (that is, the subtraction of one figure from the other). It is not the same as percentage difference, the difference between two percentages expressed as a percentage. The percentage difference between Mankiewicz’s and Welles’s vocabulary is 36.43%.
11 Both of Mankiewicz’s screenplays display remarkable consistency, for each yielded similar percentages: 10,489 (52%) Old English word tokens in A Woman’s Secret and 10,633 (53%) in Made in Heaven.
12 Like Mankiewicz, both of Welles’s screenplays display remarkable consistency, with The Big Brass Ring yielding 10,441 (50%) Old English word tokens while The Other Side of the Wind has 9850 (49%).
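Both measures in this section, the type/token ratio and the share of Old English vocabulary, reduce to set and count operations. A sketch follows; the file name old_english_words.txt is a placeholder standing in for the 5478-item list described above, and the tokenizer is again a simplification of the Wcopyfind comparison.

    import re

    def type_token_ratio(text):
        tokens = re.findall(r"[a-z']+", text.lower())
        return len(set(tokens)) / len(tokens)

    def old_english_share(text, wordlist_path="old_english_words.txt"):
        """Share of tokens and of types found in a reference word list (one word per line)."""
        tokens = re.findall(r"[a-z']+", text.lower())
        with open(wordlist_path, encoding="utf-8") as f:
            reference = {line.strip().lower() for line in f if line.strip()}
        token_share = sum(1 for t in tokens if t in reference) / len(tokens)
        type_share = len(set(tokens) & reference) / len(set(tokens))
        return token_share, type_share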


5.8 Personal Directions and Scene Heading Elements

I extracted the personal directions (which are enclosed within parentheses in a screenplay) from each of the 40,000-word samples. Mankiewicz uses 2091 words in 655 personal directions, an average of 3.19 words per personal direction, and Welles uses 2791 words in 661 personal directions, an average of 4.22 words per personal direction. Welles uses more personal directions than Mankiewicz, with each containing just over one additional word, although the distinctiveness ratio of 0.76 suggests that the difference between the two authors is marginally significant at best. To distinguish the two authors’ frequencies, we need to look at the vocabulary items. One of the most common word forms used in personal directions is the adverb ending in -ly. Mankiewicz uses 204 -ly adverbs in his 2091 personal directions (9.7%), and Welles uses 138 out of his 2791 (4.9%), half Mankiewicz’s rate, indicating once again that Welles varies his vocabulary more than Mankiewicz (Fig. 5.17 in the Appendix). Welles does not use slowly and rarely uses suddenly, whereas Mankiewicz does not use quiet(ly) and rarely uses quickly or silence. Furthermore, Mankiewicz does not use (beat), while Welles uses the phrase (after a beat) 16 times in The Other Side of the Wind. But there are issues with taking these results in isolation: all the frequencies are low, and the confidence intervals (where they can be calculated) are extremely wide. These marker words (to use Mosteller and Wallace’s term) may still be relevant to distinguish Mankiewicz from Welles, but only when combined with other linguistic features. A count of the basic scene heading elements yields Mankiewicz using 250 (an average of 125 per screenplay), whereas Welles only uses 64 (32 per screenplay) (Fig. 5.18 in the Appendix). Mankiewicz uses NIGHT, INT., and DAY at a distinctively higher rate than Welles, while EXT. does not sufficiently distinguish the two authors. However, in The Magnificent Ambersons, Welles used 94 of these elements (INT.: 28, EXT.: 19, DAY: 37, NIGHT: 10). Furthermore, Welles occasionally writes out the basic scene heading elements INT. and EXT. in full (INTERIOR and EXTERIOR), making them potential distinctive markers.
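The extraction of personal directions and the count of -ly adverbs inside them can be approximated with regular expressions, as in the sketch below (a simplification of the manual extraction described above; nested or multi-line parentheticals would need extra handling).

    import re
    from collections import Counter

    def personal_directions(screenplay_text):
        """Return the text inside (parentheses), e.g. 'quietly' or 'looks up'."""
        return re.findall(r"\(([^)]*)\)", screenplay_text)

    def ly_adverb_counts(directions):
        counts = Counter()
        for d in directions:
            counts.update(re.findall(r"[a-z]+ly\b", d.lower()))
        return counts

    sample = "He turns (quietly) and (after a beat) walks away (smiling broadly)."
    dirs = personal_directions(sample)
    print(len(dirs), sum(len(d.split()) for d in dirs) / len(dirs))  # 3 directions, 2.0 words each
    print(ly_adverb_counts(dirs))   # Counter({'quietly': 1, 'broadly': 1})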

5.9 Collocations

Even at the two-word level, a collocational analysis can reveal an author’s word combination habits, especially the way they use distinctive words, marker words, and high-frequency words. Out of Mankiewicz’s high-frequency distinctive words (Fig. 5.3 in the Appendix), six reappear in the two-word collocations: looks, at, her, going, room, and door. In Mankiewicz’s two-word distinctive collocations (Fig. 5.19 in the Appendix, the first ten entries whose ratio is above 1.5), we discover three distinct groupings:

(1) Three of Mankiewicz’s distinctive words (looks, at, and her) combine to create two distinctive collocations (looks at and at her).


(2) Three additional entries (going, room, and door) combine with high-frequency but nondistinctive words (to and the).

(3) The remaining Mankiewicz distinctive collocations combine nondistinctive words to create distinctive two-word collocations (there is, to be, is a, and for a).

The first group of collocations is more distinctive than the individual words. For example, looks has a distinctiveness ratio of 2.79 in Mankiewicz, and at has a ratio of 1.96. But when combined into looks at, its distinctiveness ratio increases to 6.52. (Of course, the distinctiveness ratio is determined by the relation between each author’s rate of frequency of these collocations, not by the individual value of each word. The purpose of comparing single-word ratios to two-word ratios is simply to determine which function best as discriminators.) In the second group of collocations, the distinctive word ratio of room (2.80) and door (4.71) increases significantly when combined with the nondistinctive word the: the room (8.53) and the door (6.52). The ratio for going increases slightly when combined with to (from 2.24 to 2.66). The third group is the most interesting in terms of discriminant analysis, for we see nondistinctive words combining to create no less than four distinctive two-word collocations, with two (there is, to be) manifesting a distinctiveness ratio above 2 in Mankiewicz’s favor. To be is the only collocation comprising the infinitive form of a verb. The infinitive verb forms that MacDonald P. Jackson used to distinguish Wilkins from Shakespeare in Pericles (discussed in Chap. 4) therefore have limited value in distinguishing Mankiewicz from Welles. Only one of Welles’s high-frequency distinctive words (from, with a distinctiveness ratio of 0.66) appears in a collocation, combined with the nondistinctive word the. While its distinctiveness increases (the ratio falls from 0.66 to 0.55), its frequency decreases severalfold.
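Two-word collocations are simply adjacent word pairs. The sketch below counts them and computes the ratio of a pair's relative frequencies in two samples; it is my reconstruction of the procedure, not the exact routine used to build Fig. 5.19.

    import re
    from collections import Counter

    def word_pairs(text):
        tokens = re.findall(r"[a-z']+", text.lower())
        return Counter(zip(tokens, tokens[1:]))        # adjacent two-word collocations

    def collocation_ratio(pair, text_a, text_b):
        """Ratio of the pair's relative frequencies in sample A versus sample B."""
        pairs_a, pairs_b = word_pairs(text_a), word_pairs(text_b)
        rel_a = pairs_a[pair] / max(sum(pairs_a.values()), 1)
        rel_b = pairs_b[pair] / max(sum(pairs_b.values()), 1)
        return rel_a / rel_b if rel_b else float("inf")

    # Usage, once the two 40,000-word samples are loaded as strings:
    # print(collocation_ratio(("looks", "at"), mankiewicz_text, welles_text))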

5.10 Sentence Length

Like the mean, median, standard deviation, and coefficient of variation of word length, the sentence length frequencies are similar for both authors (Fig. 5.20 in the Appendix). This is at first surprising because there is more authorial choice at the sentence level than at the word level. However, it appears that screenplay conventions on writing clear, simple dialogue and uncomplicated scene text may preclude an author from varying sentence length (ruling out long sentences). Welles writes shorter sentences with less variation in length, although his difference from Mankiewicz is slight. The frequency distribution of sentence length shows Mankiewicz displaying 23 more one-word sentences and 40 more two-word sentences than Welles (Fig. 5.21 in the Appendix).13 This is followed by the most distinctive feature of sentence length: there are 414 fewer sentences for Mankiewicz in

13 The graph covers sentence lengths from 1 to 50; the frequency of sentences longer than 50 words represents less than 1% of the total sentences. The first half of the graph shows fluctuations, after which it flattens out. The fluctuations can more accurately be described as oscillations, with the


the three- to 12-word range. Mankiewicz displays slightly more sentences in the 13-, 14-, and 15-word range (a total of 29 sentences), and in the 20- and 22- to 26-word range (a total of 92 sentences) before the graph begins to flatten out. Welles’s frequencies completely dominate the three- to 12-word range of sentence lengths, while Mankiewicz dominates the short- and long-length sentences. (This result loosely correlates to word length distribution: Mankiewicz shows an increase for words one to four letters long and a decrease in words five to seven letters long.) The writing style of both authors therefore displays a distinct frequency distribution structure at the sentence level.
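Sentence-length frequencies of the kind shown in Fig. 5.21 can be tabulated with a simple splitter. The sketch below treats full stops, exclamation marks, and question marks as sentence boundaries, which is cruder than a screenplay-aware segmenter (ellipses, abbreviations such as INT., and character cues would all need special handling).

    import re
    from collections import Counter

    def sentence_length_distribution(text, max_len=50):
        sentences = re.split(r"[.!?]+", text)          # crude sentence boundaries
        lengths = [len(s.split()) for s in sentences if s.strip()]
        dist = Counter(lengths)
        return {n: dist[n] for n in range(1, max_len + 1)}

    print(sentence_length_distribution("He stops. Looks at her! What now? Nothing.", max_len=5))
    # {1: 1, 2: 2, 3: 1, 4: 0, 5: 0}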

5.11 Cluster Analysis

I uploaded the four 20,000-word screenplay samples to the “Texts Similarity Analysis” tool on WebStyml. The tool was set to divide each screenplay into ten samples (around 2000 words each), and the cosine measure of distance was chosen, which works best on this dataset—that is, it confirms that these samples of known authorship are internally cohesive. (Additionally, cosine can cluster fragments of different sizes, which makes it relevant for analyzing different-sized samples from Citizen Kane in Chap. 7.) In the multidimensional scaling graph (Fig. 5.22 in the Appendix), all four screenplays separate into their own clusters, with Mankiewicz’s Made in Heaven occupying the bottom left section, A Woman’s Secret in the bottom right, Welles’s The Other Side of the Wind in the top left, and The Big Brass Ring in the top right. Crucially, the authors remain separated: Mankiewicz occupies the graph’s bottom half and Welles the top half. However, two segments within each director’s half of the graph are displaced: a segment of Made in Heaven (at 0.4, −0.1) is on the graph’s bottom right-hand side, and a segment of The Big Brass Ring (at −0.6, 0.1) is on the graph’s top left-hand side. Multidimensional scaling and the dendrogram look at the same data from different angles. It is no surprise, therefore, that the separation between Mankiewicz and Welles is replicated in the dendrogram (Fig. 5.23 in the Appendix), with no displaced segments. The samples are clustered together in successive nodes, with dis/similarity represented by the number on each node and the length of the lines between the nodes (the higher the numbers and the longer the lines, the more distant the clusters). Furthermore, each screenplay is dominated by its own unique node: the whole of The Big Brass Ring is united under node 75, The Other Side of the Wind under 73, Made in Heaven under 71, and A Woman’s Secret under 74. At the next level, both of Mankiewicz’s nodes (71 and 74) are united under node 76, while Welles’s two nodes fall under node 77. All the clusters come together under node 78.

amplitude of the oscillations gradually decreasing until they flatten out. (The word length distribution—Fig. 5.12 in the Appendix—also oscillates in a similar manner.)


Several measures of similarity were used in order to identify the one relevant for this dataset. Some measures were unable to keep all the samples separate. For example, the Manhattan similarity measure (not illustrated here) is unable to separate these samples of known authorship into internally cohesive clusters. Firstly, it does not sufficiently separate each author’s screenplays: the Made in Heaven cluster contains three segments of A Woman’s Secret, while The Other Side of the Wind contains four segments of The Big Brass Ring. Secondly, the authors are also mixed: Welles’s The Other Side of the Wind cluster contains a segment of Mankiewicz’s A Woman’s Secret, while the remaining two clusters mix both screenplays and authors. The Jaccard similarity measure (not illustrated here), by contrast, successfully replicates the cosine similarity measure in separating authors and their screenplays into their own distinct clusters. However, the cosine similarity measure remains the most viable metric because it can cluster different-sized samples. In the final variation, His Girl Friday and All the President’s Men were added to the four samples (not reproduced here). Each screenplay occupies a separate zone on the multidimensional scaling graph (with one sample from Made in Heaven clustered with A Woman’s Secret), with these two additions occupying the graph’s bottom-right corner. The separation of all six screenplays is in fact more distinct than that of the four sample screenplays. On the dendrogram (not reproduced here), each of the six screenplays again formed distinct clusters, with His Girl Friday and All the President’s Men occupying the bottom of the graph. In sum, with cosine similarity, both the multidimensional scaling graphs and the dendrograms accurately cluster the Mankiewicz and Welles samples of known authorship; furthermore, clustering can distinguish the two authors from His Girl Friday and All the President’s Men. These results on the screenplays of known authorship suggest that clustering with cosine similarity can successfully represent Mankiewicz and Welles, making it a potentially useful tool in determining the coauthorship of the Citizen Kane screenplay.
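The idea behind the clustering used in this section can be summarized as follows: each segment becomes a vector of relative word frequencies, cosine similarity is computed between vectors, and the resulting distances feed a hierarchical clustering routine. The sketch below shows only this underlying idea; it is not the implementation of the online tool named above.

    import math
    import re
    from collections import Counter

    def vectorize(text, vocabulary):
        """Relative-frequency vector of a text over a fixed vocabulary."""
        counts = Counter(re.findall(r"[a-z']+", text.lower()))
        total = sum(counts.values()) or 1
        return [counts[w] / total for w in vocabulary]

    def cosine_similarity(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    # Usage sketch: build a shared vocabulary from all of the ~2000-word segments,
    # vectorize each segment, convert similarities to distances (1 - similarity),
    # and pass the distance matrix to a hierarchical clustering routine (for example,
    # scipy.cluster.hierarchy.linkage) to produce a dendrogram or a multidimensional
    # scaling plot.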

5.12 Distinctive Features, Fluctuation, and Confidence Intervals

I applied Ellegård’s distinctiveness ratio to the majority of the results presented in this chapter to separate the distinctive from the nondistinctive linguistic features. This has yielded an initial list of 77 types of potential features that can distinguish Mankiewicz’s statistical profile from Welles’s profile. I present this list in Fig. 5.24 (plus group) and Fig. 5.25 (minus group) in the Appendix, where the 77 features are reorganized and renumbered according to their distinctiveness ratio. The plus or Mankiewicz group comprises variables whose frequency in Mankiewicz is at least 1.5 times their frequency in Welles, whereas the minus or Welles group comprises variables whose frequency in Mankiewicz is at most 0.7 times their frequency in Welles. The

5.12 Distinctive Features, Fluctuation, and Confidence Intervals

91

The Mankiewicz group comprises variables 1 to 37 and the Welles group variables 38 to 77. The correlation and Cohen's d effect size were calculated for both groups. We can think of the frequencies of these distinctive features as 77 subsamples extracted from the larger 40,000-word samples. Because the data collection process was carried out on the same sample at several levels, there is a danger of counting the same features more than once. In order to maintain independence between successive observations, I deleted from the plus and minus groups repetitions between different types of linguistic features (such as bigrams and trigrams) to avoid counting the same features twice. (The repeated feature with the lower distinctiveness ratio was deleted.14) However, no data can be completely independent or pure; the concept of the "complete independence" of observations is a theoretical ideal.

Nonetheless, these 77 variables need to be scrutinized further, specifically in terms of Ellegård's additional requirements (outlined in Chap. 4): that is, to identify the degree of fluctuation of the distinctive linguistic features, to group features together according to their distinctiveness ratios, and to eliminate features close to the [0.7, 1.5] ratio threshold.

To illustrate Ellegård's requirement to examine fluctuation within authors, in the following pages I present a typical feature, the trigram e_b, in Figs. 5.1 and 5.2. Due to low frequencies, the count of the trigram e_b is presented per 10,000 n-grams.15 Across the ten samples we see large fluctuations in frequency, together with samples that fall outside the confidence interval. Furthermore, the ten subsamples are not randomly selected but are consecutive segments (which means their probability of selection is fixed).

The trigram e_b is a minus distinctive linguistic feature for Mankiewicz (and therefore a plus feature for Welles). Across the ten Mankiewicz segments (each with an average of 21,819 n-grams), the trigram's frequency fluctuates from 6.87 per 10,000 n-grams (in segments 2 and 5) to 13.29 per 10,000 (in segment 7). It has a mean of 9.03 and a standard deviation of 2.22, which generate the interval [7.44, 10.62], represented as two lines of dashes on Fig. 5.1.16 Like a quality control chart, these lines represent the upper and lower control limits, and each black box represents a data point, in this instance the relative frequency of the e_b variable in each of Mankiewicz's 4000-word (21,819 n-gram) samples. The interval (calculated from 1.96 standard deviations of the mean) represents the similarity/difference boundary, which means that all the subsamples from the same sample should fall within the interval. Yet only three of the trigram's relative frequency values fall within this interval (segments 1, 3, and 10), with a further three very close to it (4, 8, and 9). The coefficient of variation is 24.61%.

14. The exceptions are door/the door and looks/looks at. Although the ratio values of the door and looks at are higher than door and looks, respectively, the door and looks at were deleted because of their very low frequencies—the door appears just nine times and looks at just ten times in Welles's 40,000-word sample.
15. To return these figures to percentage relative frequencies, simply divide by 100.
16. Due to the small sample size and low frequencies in those samples, I used the t-distribution with nine degrees of freedom.
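The interval calculation for a single feature can be made explicit. The following sketch recomputes the Mankiewicz figures quoted above for the trigram e_b (mean 9.03, standard deviation 2.22, interval [7.44, 10.62]) from the ten observed segment counts; as stated in footnote 16, the critical value is taken from the t-distribution with nine degrees of freedom. The function is a generic illustration, not the author's own script.

# Sketch: per-segment relative frequencies and a control-limit style interval
# for one linguistic feature (here, Mankiewicz's e_b trigram counts).
from statistics import mean, stdev
from scipy.stats import t

# Observed e_b counts in Mankiewicz's ten segments (from Fig. 5.1),
# each segment averaging 21,819 n-grams.
counts = [17, 15, 20, 24, 15, 25, 29, 16, 16, 20]
segment_size = 21_819

# Relative frequency per 10,000 n-grams for each segment.
rates = [c / segment_size * 10_000 for c in counts]

m, sd = mean(rates), stdev(rates)              # sample mean and SD
n = len(rates)
t_crit = t.ppf(0.975, df=n - 1)                # about 2.26 for 9 degrees of freedom
margin = t_crit * sd / n ** 0.5
lower, upper = m - margin, m + margin

print(f"mean = {m:.2f}, sd = {sd:.2f}")          # about 9.03 and 2.22
print(f"interval = [{lower:.2f}, {upper:.2f}]")  # about [7.44, 10.62]
print(f"coefficient of variation = {100 * sd / m:.1f}%")  # about 24.6%
print("segments outside the interval:",
      [i + 1 for i, r in enumerate(rates) if not lower <= r <= upper])

Running the sketch flags segments 2, 4, 5, 6, 7, 8, and 9 as falling outside the control limits, which is consistent with the observation above that only segments 1, 3, and 10 lie inside the interval.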


Fig. 5.1  Frequencies of the trigram e_b in Mankiewicz (divided into 10 segments). [Control-chart-style plot of relative frequency per 10,000 n-grams for each segment, with dashed interval lines at 7.44 and 10.62.] Underlying data:

Segment | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Frequency per 10,000 (observed) | 7.79 (17) | 6.87 (15) | 9.17 (20) | 10.99 (24) | 6.87 (15) | 11.46 (25) | 13.29 (29) | 7.33 (16) | 7.33 (16) | 9.17 (20)

The trigram e_b is a plus linguistic feature for Welles because its frequency is higher than in Mankiewicz, generating a distinctiveness ratio of 0.58. Across the ten Welles segments (each an average of 21,988 n-grams), its frequency fluctuates from 8.64 per 10,000 n-grams (in segment 2) to 20.92 (in segment 1). It has a mean of 15.46 and a standard deviation of 3.56, which generate the interval [12.92, 18.01], represented as dotted lines in Fig. 5.2.17 Six of the trigram's relative frequency values fit within this interval, and the value of segment 5 falls just outside the lower limit. The coefficient of variation is 23.03%, very similar to Mankiewicz's. The intervals, however, are different:

Mankiewicz interval: [7.44, 10.62]
Welles interval: [12.92, 18.01]

These figures show that the e_b trigram is sufficient to distinguish the two authors because their intervals do not overlap (a result reinforced by the trigram's distinctiveness ratio of 0.58). But these figures also reveal that one of Welles's subsamples (segment 2: 8.64) is an outlier, for it falls into or overlaps with Mankiewicz's interval.

17. Generated from 1.96 standard deviations of the mean using the t-distribution and nine degrees of freedom.


Fig. 5.2  Frequencies of the trigram e_b in Welles (divided into 10 segments). [Control-chart-style plot of relative frequency per 10,000 n-grams for each segment, with dotted interval lines at 12.92 and 18.01.] Underlying data:

Segment | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Frequency per 10,000 (observed) | 20.92 (46) | 8.64 (19) | 17.28 (38) | 16.37 (36) | 12.28 (27) | 20.01 (44) | 14.55 (32) | 15.00 (33) | 15.46 (34) | 14.10 (31)

Overlap suggests that the subsample is not sufficiently distinctive and that it cannot differentiate between the two authors. Of course, linguistic features do not belong exclusively to one author; some overlapping in their frequency is inevitable between different authors. When comparing two authors, the ideal is to ensure that their samples and/or intervals do not overlap, although in practice a small overlap does not undermine the results. Several researchers have suggested that an overlap of 25%–30% is acceptable.18 We saw at the end of Chap. 4 that Ellegård discovered an overlap between the sample of texts by potential authors and The Letters of Junius: two out of 88 samples fall within the plus Junius interval and a further ten in the minus Junius interval—but, crucially, no sample falls within both Junius intervals. For Ellegård, this minor overlap (of 12 out of 88, or 13.64%) rules out any similarity between these sample texts and the Junius texts. In contrast, six of the eight samples from Sir Philip Francis fall within the plus Junius interval and seven in the minus Junius interval, with five of the same samples from Francis falling within both Junius intervals. This level of overlap strengthens the hypothesis that Sir Philip was Junius, for it signifies a high probability that the samples derive from the same population (the same author).

The plus and minus intervals for the e_b distinctive feature do not overlap, in part because they are calculated from the whole of the 40,000-word samples. It is the fluctuations within these samples that are more problematic. We can investigate this further by taking a closer look at the frequency fluctuations of e_b in segments 1 and 2 of Welles, because this trigram's value in segment 1 is 2.5 times higher than in segment 2 (Fig. 5.26 in the Appendix). In both segments, the e_b trigram is widely distributed across many words. Nonetheless, the following frequencies distinguish Welles's two segments: have been has a high frequency in segment 1 (seven times) in comparison to segment 2 (one time), followed by the best in segment 1 (three times), with zero frequency in segment 2. Furthermore, only ten entries in segment 1 have zero frequency, in comparison to 28 entries in segment 2, indicating Welles's more varied choice of vocabulary in the first segment. The figure also shows that the e_b trigram is frequent in Welles because of the many times he uses the before a noun or an adjective: 30 times across segments 1 and 2. In contrast, Mankiewicz (Fig. 5.27 in the Appendix) uses the only 17 times across his first two segments (and, incidentally, almost all the nouns and adjectives he uses are different from Welles's: the bag, the battle-axe, the beach, the broadcast, etc., with only two overlaps—the bedroom(s) and the biggest). From this study of one trigram, we can begin to understand the fluctuations within the Mankiewicz and Welles samples as well as the lexical and grammatical differences between the two authors.

18. Peter E. Austin and Janet E. Hux (2002) suggest that a 29% overlap between intervals is still equivalent to a p-value of 0.05, whereas Cumming (2014) gives a figure of 25% overlap. Cumming emphasizes that p-values are not necessary to justify interval overlap; he simply quotes the equivalent p-value for comparative purposes.
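Ellegård's criterion, counting how many of one author's subsamples fall inside the other author's interval, can be checked mechanically. The short sketch below applies that check to the e_b values quoted above; the interval endpoints and per-segment rates are taken directly from Figs. 5.1 and 5.2, and the 25%–30% tolerance mentioned above is simply a threshold the analyst supplies.

# Sketch: count how many of one author's segment values fall inside the
# other author's interval (the overlap check discussed above).
def inside(value, interval):
    lower, upper = interval
    return lower <= value <= upper

mank_interval = (7.44, 10.62)     # from Fig. 5.1
welles_interval = (12.92, 18.01)  # from Fig. 5.2
mank_rates = [7.79, 6.87, 9.17, 10.99, 6.87, 11.46, 13.29, 7.33, 7.33, 9.17]
welles_rates = [20.92, 8.64, 17.28, 16.37, 12.28, 20.01, 14.55, 15.00, 15.46, 14.10]

print("Welles values inside Mankiewicz's interval:",
      [r for r in welles_rates if inside(r, mank_interval)])   # the 8.64 outlier
print("Mankiewicz values inside Welles's interval:",
      [r for r in mank_rates if inside(r, welles_interval)])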

5.13 Distinctive Groups

In comparing Mankiewicz and Welles to Citizen Kane in Chap. 6, I only have the luxury of using the entire Citizen Kane screenplay once; in other tests, I need to use smaller samples. And the analysis of the e_b trigram has revealed the problem of fluctuation within smaller samples. Ellegård proposed two solutions to low frequency and fluctuation in small samples: organize several variables into groups according to their distinctiveness ratio, and only include the more extreme ratios in the groups. As Ellegård predicted, low-frequency variables are also the most distinctive, which explains why they need to be retained. Yet the individual frequencies of most linguistic features by themselves are insufficient to carry out many rigorous statistical analyses on small samples, for it would be difficult to tell whether a difference in frequency is due to sampling fluctuation or whether it signifies an actual difference in writing style. To reduce the fluctuations evident in individual variables and to sufficiently separate the two authors, it is necessary to organize the variables into distinctive groups. Figures 5.24 and 5.25 in the Appendix have already grouped the 77 variables into plus (Mankiewicz) and minus (Welles) groups.


But are these groups sufficiently distinctive? At the end of his initial study (Program 1), Ellegård realized that he needed to delete some variables in his plus and minus groups that were too close to the boundaries of the nondistinctive neutral zone (the [0.7, 1.5] interval). He proposed making his plus and minus groups more distinctive by including in them only the more extreme variables—words with a distinctiveness ratio below 0.4 or above 2.5. Ellegård therefore made his theory more robust by focusing on the most distinctive variables, which involves factoring out the less extreme values (an inversion of standard data trimming methods that remove outliers). This produced "a slightly more thorough separation between Junius and the comparative material" (Ellegård 1962, 38), although it also reduced the overall number of variables employed in his study.

Using Ellegård's revised thresholds would limit the overall number of linguistic features available to distinguish Mankiewicz from Welles from 77 to 25, for only 25 features have a distinctiveness ratio above 2.5 or below 0.4. To maintain a high distinctiveness ratio and at the same time ensure that an adequate number of linguistic features remain, I decided to keep 22 plus and 22 minus features (Figs. 5.28 and 5.29 in the Appendix). The limit could have been set at 20 or 24, but 22 maintains a balance between a high distinctiveness ratio on the one hand and an adequate number of features on the other.19

The improved performance of the smaller plus and minus groups is evident in their correlation (r) and Cohen's d figures. Correlation measures the strength of the relationship between variables in two different groups, with +1 signifying a strong positive correlation, −1 a negative correlation, and 0 no correlation. In a strong positive correlation, for example, the values of variables in one group correspond to the values in the other group. A weak correlation between the Mankiewicz and Welles variables would therefore signify their distinctiveness. Firstly, the 77 variables: the correlation between the 37 variables in the plus group (Fig. 5.24 in the Appendix) is 0.924, and between the 40 variables in the minus group (Fig. 5.25 in the Appendix) it is 0.941, both signifying very strong positive correlations between Mankiewicz and Welles. Secondly, the 44 variables: the 22 plus group (Fig. 5.28 in the Appendix) shows a correlation of 0.606 and the 22 minus group (Fig. 5.29 in the Appendix) a correlation of 0.385. Both figures are weak to medium positive correlations, far better than the strong positive correlations of the 77 variables.

The Cohen's d effect size calculations are encouraging in both groups. In the 77 variable group, the Cohen's d effect size for the plus variables is 0.756, with a 95% confidence interval (C.I.) of [0.284, 1.228]. This figure signifies a medium to large effect size between the two authors. The Cohen's d effect size for the minus variables is 0.490, with a 95% C.I. of [0.045, 0.935], which is a medium effect size between the two authors. In the 44 variable group, the Cohen's d effect size between the 22 plus variables is 1.749, with a 95% C.I. of [1.054, 2.444]. The Cohen's d effect size between the 22 minus variables is 1.403, with a 95% C.I. of [0.744, 2.063]. Both figures signify large effect sizes between the two authors. With their correlation and Cohen's d results, these 44 variables adequately distinguish Mankiewicz from Welles.

Out of the 22 variables in the plus group, there are two bigrams, three trigrams, one personal direction (pause/s), one scene heading element (INT.), ten words, and five two-word collocations. Out of the 22 variables in the minus group, there are two punctuation marks, one unigram, four bigrams, five trigrams, nine distinctive words, and one two-word collocation (Fig. 5.30 in the Appendix). Mankiewicz's 44 plus and minus variables constitute 5.391% of all the linguistic tokens in his sample, and Welles's 44 variables are similar, at 5.149%. These proportions are more than double those Ellegård used: in his analysis of The Letters of Junius, Ellegård's plus and minus words constitute 2.5% (3183 word tokens) of the 127,000-word text. Ellegård noted: "we have in fact no reason to think that more than a small part of the vocabulary of an author is 'distinctive' in the sense we require" (Ellegård 1962, 48). At the end of their study of The Federalist papers, Mosteller and Wallace concluded: "Hamilton's and Madison's styles are unusually similar; new problems, with two authors as candidates, should be easier than distinguishing between Hamilton and Madison" (Mosteller and Wallace 1964, 265).

The statistical stylistic profiles of Mankiewicz and Welles have now been established: they comprise 44 distinctive linguistic features organized into plus and minus groups. In Chap. 6, I employ these statistical profiles of Mankiewicz and Welles to identify their contributions to the Citizen Kane screenplay.

19. In reducing the variables from 77 to 44, I also deleted the scene heading NIGHT, for it has a low frequency (3) in Citizen Kane.
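The correlation and effect-size figures for the final 22-variable plus group can be recomputed directly from the relative frequencies listed in Fig. 5.28. The sketch below uses the standard formulas, Pearson's r, Cohen's d with a pooled standard deviation, and the usual large-sample approximation for d's 95% confidence interval, and reproduces the published values (0.606, 1.749, [1.054, 2.444]) up to rounding; it is an illustrative check rather than the author's own code.

# Sketch: Pearson correlation and Cohen's d (pooled SD) for the 22 plus-group
# variables, using the relative frequencies listed in Fig. 5.28.
import numpy as np

mank = np.array([0.455, 0.128, 0.138, 0.150, 0.120, 0.170, 0.143, 0.320, 0.275,
                 0.183, 0.118, 0.198, 0.066, 0.098, 0.295, 0.273, 0.213, 0.067,
                 0.066, 0.087, 0.051, 0.160])
welles = np.array([0.040, 0.015, 0.020, 0.025, 0.023, 0.033, 0.030, 0.068, 0.058,
                   0.040, 0.033, 0.058, 0.020, 0.033, 0.100, 0.098, 0.080, 0.028,
                   0.028, 0.037, 0.022, 0.070])

r = np.corrcoef(mank, welles)[0, 1]                      # about 0.606

n1 = n2 = len(mank)
pooled_sd = np.sqrt((mank.var(ddof=1) + welles.var(ddof=1)) / 2)
d = (mank.mean() - welles.mean()) / pooled_sd            # about 1.75

# Approximate standard error of d and a 95% confidence interval.
se_d = np.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
ci = (d - 1.96 * se_d, d + 1.96 * se_d)                  # about [1.05, 2.44]

print(f"r = {r:.3f}, d = {d:.3f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")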

Appendix

Fig. 5.3  Distinctive words in Mankiewicz and Welles

Word | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
Miss | 182 (0.455) | 16 (0.040) | 11.38 | [6.82, 18.96]
because | 48 (0.120) | 9 (0.023) | 5.22 | [2.56, 10.64]
mind | 57 (0.143) | 12 (0.030) | 4.77 | [2.56, 8.89]
door | 128 (0.320) | 27 (0.068) | 4.71 | [3.11, 7.13]
dissolve | 73 (0.183) | 16 (0.040) | 4.58 | [2.67, 7.87]
Mrs. | 34 (0.085) | 9 (0.023) | 3.70 | [1.77, 7.71]
smiles | 47 (0.118) | 13 (0.033) | 3.58 | [1.94, 6.62]
day | 79 (0.198) | 23 (0.058) | 3.41 | [2.14, 5.42]
anything | 39 (0.098) | 13 (0.033) | 2.97 | [1.59, 5.56]
right | 118 (0.295) | 40 (0.100) | 2.95 | [2.06, 4.22]
room | 132 (0.330) | 47 (0.118) | 2.80 | [2.01, 3.90]
looks | 109 (0.273) | 39 (0.098) | 2.79 | [1.94, 4.03]
if | 190 (0.475) | 83 (0.208) | 2.28 | [1.76, 2.95]
going | 101 (0.253) | 45 (0.113) | 2.24 | [1.58, 3.19]
Mr. | 96 (0.240) | 44 (0.110) | 2.18 | [1.53, 3.12]
her | 525 (1.313) | 246 (0.615) | 2.13 | [1.84, 2.48]
eyes | 62 (0.155) | 30 (0.075) | 2.07 | [1.34, 3.20]
behind | 42 (0.105) | 21 (0.053) | 1.98 | [1.17, 3.34]
at | 428 (1.070) | 218 (0.545) | 1.96 | [1.67, 2.31]
she | 485 (1.213) | 257 (0.643) | 1.89 | [1.62, 2.19]
which | 52 (0.130) | 28 (0.070) | 1.86 | [1.17, 2.94]
say | 66 (0.165) | 37 (0.093) | 1.77 | [1.18, 2.65]
smiling | 12 (0.030) | 7 (0.018) | 1.67 | [0.66, 4.24]
no | 133 (0.333) | 81 (0.203) | 1.64 | [1.25, 2.16]
away | 41 (0.103) | 62 (0.155) | 0.66 | [0.45, 0.98]
what | 183 (0.458) | 278 (0.695) | 0.66 | [0.55, 0.79]
from | 103 (0.258) | 157 (0.393) | 0.66 | [0.51, 0.84]
he | 405 (1.013) | 624 (1.560) | 0.65 | [0.57, 0.73]
here | 55 (0.138) | 86 (0.215) | 0.64 | [0.46, 0.90]
still | 44 (0.110) | 70 (0.175) | 0.63 | [0.43, 0.92]
off | 57 (0.143) | 94 (0.235) | 0.61 | [0.44, 0.84]
little | 50 (0.125) | 103 (0.258) | 0.48 | [0.34, 0.67]
where | 28 (0.070) | 60 (0.150) | 0.47 | [0.30, 0.73]
some | 43 (0.108) | 93 (0.233) | 0.46 | [0.32, 0.66]
last | 22 (0.055) | 51 (0.128) | 0.43 | [0.26, 0.71]
man | 30 (0.075) | 81 (0.203) | 0.37 | [0.24, 0.56]
another | 28 (0.070) | 90 (0.225) | 0.31 | [0.20, 0.48]
yes | 12 (0.030) | 50 (0.125) | 0.24 | [0.13, 0.45]
silence | 17 (0.043) | 79 (0.198) | 0.22 | [0.13, 0.36]
old | 8 (0.020) | 111 (0.278) | 0.07 | [0.04, 0.15]

Fig. 5.4  Punctuation marks in Mankiewicz and Welles

Punctuation mark | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
Exclamation mark | 204 (0.093) | 136 (0.062) | 1.50 | [1.21, 1.86]
Comma | 2547 (1.167) | 1855 (0.844) | 1.38 | [1.30, 1.46]
Parenthesis | 1502 (0.688) | 1770 (0.805) | 0.85 | [0.79, 0.91]
Dash -- (or – or -) | 851 (0.390) | 1137 (0.517) | 0.75 | [0.69, 0.82]
Question mark | 547 (0.251) | 745 (0.339) | 0.74 | [0.66, 0.83]
Colon : | 24 (0.011) | 330 (0.150) | 0.07 | [0.05, 0.11]
Ellipsis … | 45 (0.021) | 1107 (0.504) | 0.04 | [0.03, 0.05]

Fig. 5.5  Letter unigrams in Mankiewicz and Welles

Letter | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
y | 4434 (2.032) | 3693 (1.680) | 1.21 | [1.16, 1.26]
x | 273 (0.125) | 237 (0.108) | 1.16 | [0.97, 1.38]
m | 4511 (2.068) | 4093 (1.862) | 1.11 | [1.06, 1.16]
z | 108 (0.050) | 158 (0.072) | 0.69 | [0.54, 0.88]
q | 107 (0.049) | 162 (0.074) | 0.66 | [0.52, 0.84]
j | 272 (0.125) | 478 (0.217) | 0.58 | [0.50, 0.67]

Fig. 5.6  Distinctive bigrams in Mankiewicz and Welles

Bigram | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
lk | 143 (0.066) | 43 (0.020) | 3.30 | [2.35, 4.64]
(s | 146 (0.067) | 62 (0.028) | 2.39 | [1.78, 3.22]
mr | 134 (0.061) | 57 (0.026) | 2.35 | [1.72, 3.20]
'd | 154 (0.071) | 73 (0.033) | 2.15 | [1.63, 2.84]
y. | 282 (0.129) | 146 (0.066) | 1.95 | [1.60, 2.38]
rr | 318 (0.146) | 170 (0.077) | 1.90 | [1.58, 2.29]
l, | 136 (0.062) | 72 (0.033) | 1.88 | [1.41, 2.50]
o. | 116 (0.053) | 65 (0.030) | 1.77 | [1.31, 2.40]
s) | 116 (0.053) | 77 (0.035) | 1.51 | [1.13, 2.01]
bl | 175 (0.080) | 254 (0.116) | 0.69 | [0.57, 0.84]
ck | 291 (0.133) | 440 (0.200) | 0.67 | [0.58, 0.78]
_v | 164 (0.075) | 250 (0.114) | 0.66 | [0.54, 0.80]
sp | 110 (0.050) | 171 (0.078) | 0.64 | [0.50, 0.81]
ic | 430 (0.197) | 673 (0.306) | 0.64 | [0.57, 0.72]
e? | 78 (0.036) | 129 (0.059) | 0.61 | [0.46, 0.81]
fr | 187 (0.086) | 312 (0.142) | 0.61 | [0.51, 0.73]
ie | 289 (0.133) | 499 (0.227) | 0.59 | [0.51, 0.68]
ra | 369 (0.169) | 635 (0.289) | 0.58 | [0.51, 0.66]
ov | 134 (0.061) | 234 (0.106) | 0.58 | [0.47, 0.72]

Fig. 5.7  Distinctive trigrams in Mankiewicz and Welles

Trigram | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
nt. | 163 (0.075) | 32 (0.014) | 5.36 | [3.67, 7.83]
_mr | 120 (0.055) | 38 (0.017) | 3.24 | [2.25, 4.67]
_ni | 134 (0.061) | 54 (0.025) | 2.44 | [1.78, 3.35]
_if | 160 (0.073) | 66 (0.030) | 2.43 | [1.82, 3.24]
er, | 143 (0.066) | 62 (0.028) | 2.36 | [1.75, 3.18]
if_ | 190 (0.087) | 80 (0.037) | 2.35 | [1.81, 3.05]
ll, | 112 (0.051) | 49 (0.022) | 2.32 | [1.66, 3.25]
_at | 449 (0.206) | 225 (0.102) | 2.02 | [1.72, 2.37]
ss_ | 271 (0.124) | 136 (0.062) | 2.00 | [1.63, 2.46]
ame | 100 (0.046) | 171 (0.078) | 0.59 | [0.46, 0.76]
e_b | 197 (0.090) | 340 (0.155) | 0.58 | [0.49, 0.69]
ang | 75 (0.034) | 137 (0.062) | 0.55 | [0.42, 0.73]
qui | 52 (0.024) | 109 (0.050) | 0.48 | [0.34, 0.67]
act | 57 (0.026) | 124 (0.056) | 0.46 | [0.34, 0.63]
enc | 64 (0.029) | 160 (0.073) | 0.40 | [0.30, 0.53]

Fig. 5.8  Trigram contractions in Mankiewicz and Welles

Contraction | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
on' | 200 (0.092) | 115 (0.052) | 1.77 | [1.41, 2.23]
've | 111 (0.051) | 80 (0.036) | 1.42 | [1.07, 1.89]
n't | 506 (0.232) | 366 (0.166) | 1.40 | [1.22, 1.60]
it' | 136 (0.062) | 106 (0.048) | 1.29 | [1.00, 1.66]
're | 150 (0.069) | 132 (0.059) | 1.17 | [0.93, 1.48]
sn' | 109 (0.050) | 102 (0.046) | 1.09 | [0.83, 1.43]
t's | 300 (0.137) | 276 (0.126) | 1.09 | [0.93, 1.28]

Fig. 5.9  Word length in Mankiewicz and Welles

Measure | Mankiewicz | Welles
Mean word length | 4.23 | 4.31
Median word length | 4.0 | 4.0
Standard deviation of word length | 2.19 | 2.21
Coefficient of variation | 51.77% | 51.28%

Fig. 5.10  Mankiewicz: frequency distribution of word length. [Bar chart; counts for word lengths 1–15: 1858, 6798, 8675, 7631, 4805, 3185, 2539, 1750, 957, 446, 198, 146, 66, 26, 8.]

Fig. 5.11  Welles: frequency distribution of word length. [Bar chart; counts for word lengths 1–15: 1681, 6235, 8540, 7189, 5094, 3058, 2577, 1604, 1063, 572, 261, 124, 58, 14, 5.]

Fig. 5.12  Difference between Mankiewicz and Welles (frequency distribution of word length). [Bar chart of the count differences (Mankiewicz minus Welles) across word lengths; the values for lengths 1–12 correspond to the frequencies listed in Fig. 5.13.]

Fig. 5.13  Word length frequencies and their distinctiveness ratio values in Mankiewicz and Welles

Word length (in letters) | Mankiewicz freq | Welles freq | Distinctiveness ratio | 95% C.I.
1 | 1858 | 1681 | 1.11 | [1.04, 1.18]
2 | 6798 | 6235 | 1.09 | [1.06, 1.12]
3 | 8675 | 8540 | 1.02 | [0.99, 1.05]
4 | 7631 | 7189 | 1.06 | [1.03, 1.09]
5 | 4805 | 5094 | 0.94 | [0.91, 0.98]
6 | 3185 | 3058 | 1.04 | [0.99, 1.09]
7 | 2539 | 2577 | 0.99 | [0.94, 1.04]
8 | 1750 | 1604 | 1.09 | [1.02, 1.16]
9 | 957 | 1063 | 0.90 | [0.83, 0.98]
10 | 446 | 572 | 0.78 | [0.69, 0.88]
11 | 198 | 261 | 0.76 | [0.63, 0.91]
12 | 146 | 124 | 1.18 | [0.93, 1.50]

Fig. 5.14  Frequency distribution (occurrences of word types) in Mankiewicz and Welles

Occurrences | Mankiewicz word types (%) | Welles word types (%) | Distinctiveness ratio | 95% C.I.
1 | 1903 (48.69) | 2884 (54.09) | 0.90 | [0.85, 0.95]
2 | 555 (14.20) | 840 (15.75) | 0.90 | [0.81, 1.00]
3 | 323 (8.27) | 423 (7.93) | 1.04 | [0.90, 1.20]
4 | 203 (5.19) | 221 (4.14) | 1.25 | [1.03, 1.51]
5 | 142 (3.63) | 138 (2.59) | 1.40 | [1.11, 1.77]
6 | 78 (2.00) | 96 (1.80) | 1.11 | [0.82, 1.50]
7 | 59 (1.51) | 85 (1.59) | 0.95 | [0.68, 1.32]
8 | 54 (1.38) | 71 (1.33) | 1.04 | [0.73, 1.48]
9 | 48 (1.23) | 57 (1.07) | 1.15 | [0.78, 1.69]
10 | 44 (1.13) | 40 (0.75) | 1.51 | [0.98, 2.32]
11 | 38 (0.97) | 41 (0.77) | 1.26 | [0.81, 1.96]
12 | 30 (0.77) | 29 (0.54) | 1.43 | [0.86, 2.38]
13 | 27 (0.69) | 27 (0.51) | 1.35 | [0.79, 2.30]
14 | 28 (0.72) | 26 (0.49) | 1.47 | [0.86, 2.51]
15 | 26 (0.67) | 21 (0.39) | 1.72 | [0.97, 3.06]
16 | 21 (0.54) | 23 (0.43) | 1.26 | [0.70, 2.28]
17 | 13 (0.33) | 21 (0.39) | 0.85 | [0.43, 1.70]
18 | 19 (0.49) | 15 (0.28) | 1.75 | [0.89, 3.44]
19 | 10 (0.26) | 10 (0.19) | 1.37 | [0.57, 3.29]
20 | 11 (0.28) | 13 (0.24) | 1.17 | [0.52, 2.61]
… 1563 | 1 (0.02) | 0 (0) | – | –

Fig. 5.15  Word frequency profile of Mankiewicz and Welles. [Line chart of word types against occurrences 1–20, using the data in Fig. 5.14.]

Fig. 5.16  Word frequency profile of Mankiewicz and Welles (logarithmic scale). [The same data plotted with word types on a logarithmic scale.]

Fig. 5.17  Personal directions in Mankiewicz and Welles

Personal direction | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
slowly | 18 (0.045) | 0 (0) | – | –
pause/s | 68 (0.170) | 13 (0.033) | 5.15 | [2.85, 9.32]
suddenly | 25 (0.063) | 5 (0.013) | 5.00 | [1.91, 13.06]
quiet(ly) | 0 (0) | 11 (0.028) | 0 | –
quickly | 2 (0.005) | 11 (0.028) | 0.18 | [0.04, 0.82]
silence | 1 (0.003) | 14 (0.035) | 0.09 | [0.01, 0.68]

Fig. 5.18  Basic scene heading elements in Mankiewicz and Welles

Scene heading | Mankiewicz count (%) | Welles count (%) | Distinctiveness ratio | 95% C.I.
NIGHT | 74 (0.185) | 15 (0.038) | 4.87 | [2.80, 8.48]
INT. | 110 (0.275) | 23 (0.058) | 4.74 | [3.02, 7.43]
DAY | 48 (0.120) | 12 (0.030) | 4.00 | [2.13, 7.53]
EXT. | 18 (0.045) | 14 (0.035) | 1.29 | [0.64, 2.58]

Fig. 5.19  Two-word collocations in Mankiewicz and Welles

Two-word collocation | Mankiewicz freq (%) | Welles freq (%) | Distinctiveness ratio | 95% C.I.
the room | 51 (0.128) | 6 (0.015) | 8.53 | [3.66, 19.87]
at her | 55 (0.138) | 8 (0.020) | 6.90 | [3.29, 14.48]
the door | 60 (0.150) | 9 (0.023) | 6.52 | [3.24, 13.14]
looks at | 65 (0.163) | 10 (0.025) | 6.52 | [3.35, 12.69]
there is | 60 (0.150) | 10 (0.025) | 6.00 | [3.07, 11.72]
going to | 85 (0.213) | 32 (0.080) | 2.66 | [1.77, 3.99]
a moment | 64 (0.160) | 28 (0.070) | 2.29 | [1.47, 3.56]
to be | 92 (0.230) | 43 (0.108) | 2.13 | [1.48, 3.06]
is a | 55 (0.138) | 28 (0.070) | 1.97 | [1.25, 3.10]
for a | 66 (0.165) | 38 (0.095) | 1.74 | [1.17, 2.59]
with a | 41 (0.103) | 63 (0.158) | 0.65 | [0.44, 0.96]
one of | 29 (0.073) | 46 (0.115) | 0.63 | [0.40, 1.00]
in the | 80 (0.200) | 138 (0.345) | 0.58 | [0.44, 0.76]
from the | 24 (0.060) | 44 (0.110) | 0.55 | [0.33, 0.90]
after a | 29 (0.073) | 54 (0.135) | 0.54 | [0.34, 0.84]

Fig. 5.20  Sentence length in Mankiewicz and Welles

Measure | Mankiewicz | Welles
Number of sentences | 3881 | 4175
Mean sentence length | 10.28 | 9.40
Median sentence length | 8.00 | 7.00
Standard deviation of sentence length | 8.77 | 7.71
Coefficient of variation | 85.31% | 82.02%

Fig. 5.21  Difference between Mankiewicz and Welles (frequency distribution of sentence length). [Bar chart of the count differences (Mankiewicz minus Welles) across sentence lengths from 0 to 50 words.]

Fig. 5.22  Cluster graph of four Mankiewicz and Welles screenplay samples

Fig. 5.23  Dendrogram of four screenplay samples

Fig. 5.24  Initial list of 37 plus group distinctive features (organized by Distinctiveness Ratio [DR])

No. | Variable | Mankiewicz | Welles | DR
1 | Miss | 0.455 | 0.040 | 11.38
2 | the room | 0.128 | 0.015 | 8.53
3 | at her | 0.138 | 0.020 | 6.90
4 | there is | 0.150 | 0.025 | 6.00
5 | because | 0.120 | 0.023 | 5.22
6 | pause/s | 0.170 | 0.033 | 5.15
7 | NIGHT | 0.185 | 0.038 | 4.87
8 | mind | 0.143 | 0.030 | 4.77
9 | door | 0.320 | 0.068 | 4.74
10 | INT. | 0.275 | 0.058 | 4.74
11 | dissolve | 0.183 | 0.040 | 4.58
12 | smiles | 0.118 | 0.033 | 3.58
13 | day | 0.198 | 0.058 | 3.41
14 | lk | 0.066 | 0.020 | 3.30
15 | anything | 0.098 | 0.033 | 2.97
16 | right | 0.295 | 0.100 | 2.95
17 | looks | 0.273 | 0.098 | 2.79
18 | going to | 0.213 | 0.080 | 2.66
19 | (s | 0.067 | 0.028 | 2.39
20 | er, | 0.066 | 0.028 | 2.36
21 | if_ | 0.087 | 0.037 | 2.35
22 | ll, | 0.051 | 0.022 | 2.32
23 | a moment | 0.160 | 0.070 | 2.29
24 | suddenly | 0.063 | 0.028 | 2.25
25 | 'd | 0.071 | 0.033 | 2.15
26 | to be | 0.230 | 0.108 | 2.13
27 | Mr. | 0.240 | 0.113 | 2.12
28 | eyes | 0.155 | 0.075 | 2.07
29 | ss_ | 0.124 | 0.062 | 2.00
30 | is a | 0.138 | 0.070 | 1.97
31 | y. | 0.129 | 0.066 | 1.95
32 | rr | 0.146 | 0.077 | 1.90
33 | which | 0.130 | 0.070 | 1.86
34 | she | 1.213 | 0.673 | 1.80
35 | o. | 0.053 | 0.030 | 1.77
36 | on' | 0.092 | 0.052 | 1.77
37 | for a | 0.165 | 0.095 | 1.74
Sum | | 6.908 | 2.549 |
Correlation (r) 0.924; Cohen's d 0.756

Fig. 5.25  Initial list of 40 minus group distinctive features (organized by Distinctiveness Ratio [DR])

No. | Variable | Mankiewicz | Welles | DR
38 | bl | 0.080 | 0.116 | 0.69
39 | z | 0.050 | 0.072 | 0.69
40 | ck | 0.133 | 0.200 | 0.67
41 | away | 0.103 | 0.155 | 0.66
42 | what | 0.458 | 0.695 | 0.66
43 | _v | 0.075 | 0.114 | 0.66
44 | q | 0.049 | 0.074 | 0.66
45 | sho | 0.062 | 0.096 | 0.65
46 | with a | 0.103 | 0.158 | 0.65
47 | he | 1.013 | 1.560 | 0.65
48 | sp | 0.050 | 0.078 | 0.64
49 | ic | 0.197 | 0.306 | 0.64
50 | one of | 0.073 | 0.115 | 0.63
51 | still | 0.110 | 0.175 | 0.63
52 | here | 0.138 | 0.215 | 0.63
53 | e? | 0.036 | 0.059 | 0.61
54 | te_ | 0.051 | 0.084 | 0.61
55 | off | 0.143 | 0.235 | 0.61
56 | fr | 0.086 | 0.142 | 0.61
57 | ame | 0.046 | 0.078 | 0.59
58 | ie | 0.133 | 0.227 | 0.59
59 | ra | 0.169 | 0.289 | 0.58
60 | ov | 0.061 | 0.106 | 0.58
61 | in the | 0.200 | 0.345 | 0.58
62 | j | 0.125 | 0.217 | 0.58
63 | e_b | 0.090 | 0.155 | 0.58
64 | little | 0.125 | 0.258 | 0.48
65 | qui | 0.024 | 0.050 | 0.48
66 | where | 0.070 | 0.150 | 0.47
67 | some | 0.108 | 0.233 | 0.46
68 | act | 0.026 | 0.056 | 0.46
69 | last | 0.055 | 0.128 | 0.43
70 | enc | 0.029 | 0.073 | 0.40
71 | man | 0.075 | 0.203 | 0.37
72 | another | 0.070 | 0.225 | 0.31
73 | yes | 0.030 | 0.125 | 0.24
74 | silence | 0.043 | 0.198 | 0.22
75 | : (colon) | 0.011 | 0.150 | 0.07
76 | old | 0.020 | 0.278 | 0.07
77 | … (ellipsis) | 0.021 | 0.504 | 0.04
Sum | | 4.541 | 8.697 |
Correlation (r) 0.941; Cohen's d 0.490

Fig. 5.26  The e_b trigram in segments 1 and 2 of Welles. Each entry gives the phrase followed by its count in segment 1 and segment 2:

airplane bag (1, 0); be bad (1, 0); coffee bar (1, 0); come back (1, 0); have been (7, 1); he bobs (1, 0); he breaks (2, 2); he bursts (0, 1); I've been (2, 1); little bitty (1, 0); she been (1, 0); penthouse bedroom (0, 1); plane bumps (0, 1); someone begins (1, 0); someone bound (1, 0); the backgammon (1, 1); the background (1, 0); the bar (1, 0); the Barcelona (0, 1); the bargains (1, 0); the bartender (1, 0); the Batunga (0, 2); the beard (1, 0); the beautiful (0, 1); the bedroom (1, 0); the beginning(s) (2, 0); the best (3, 0); the big (0, 1); the biggest (1, 0); the boat (0, 1); the border('s) (1, 2); the Brandini (0, 1); the brigade (1, 0); the brightest (1, 0); the burden (1, 0); the busy (1, 0); the buzzing (1, 1); they're bound (1, 0); they've been (1, 0); were bold (1, 0); Whipple Butte (1, 0); you're back (2, 0); you're brave (1, 0); wee bit (0, 1)

Fig. 5.27  The e_b trigram in segments 1 and 2 of Mankiewicz. Each entry gives the phrase followed by its count in segment 1 and segment 2:

be both (1, 0); Chesapeake bay (1, 0); cigarette box (0, 2); have been (1, 0); He beams (1, 0); I've been (1, 0); office building (0, 4); office but (0, 1); the b.g. (1, 0); the back (1, 1); the bag (1, 0); the baggage (1, 0); the battle-axe (1, 0); the beach (2, 0); the bed (0, 1); the bedrooms (0, 1); the behind (1, 0); the bell (0, 2); the biggest (0, 1); the booth (1, 0); the bounce (0, 1); the broadcast (1, 0); there by (1, 0); voice belongs (1, 0); We're both (0, 1)

Fig. 5.28  Final list of 22 plus group distinctive features (organized by Distinctiveness Ratio [DR])

No. | Variable | Mankiewicz | Welles | DR
1 | Miss | 0.455 | 0.040 | 11.38
2 | the room | 0.128 | 0.015 | 8.53
3 | at her | 0.138 | 0.020 | 6.90
4 | there is | 0.150 | 0.025 | 6.00
5 | because | 0.120 | 0.023 | 5.22
6 | pause/s | 0.170 | 0.033 | 5.15
7 | mind | 0.143 | 0.030 | 4.77
8 | door | 0.320 | 0.068 | 4.74
9 | INT. | 0.275 | 0.058 | 4.74
10 | dissolve | 0.183 | 0.040 | 4.58
11 | smiles | 0.118 | 0.033 | 3.58
12 | day | 0.198 | 0.058 | 3.41
13 | lk | 0.066 | 0.020 | 3.30
14 | anything | 0.098 | 0.033 | 2.97
15 | right | 0.295 | 0.100 | 2.95
16 | looks | 0.273 | 0.098 | 2.79
17 | going to | 0.213 | 0.080 | 2.66
18 | (s | 0.067 | 0.028 | 2.39
19 | er, | 0.066 | 0.028 | 2.36
20 | if_ | 0.087 | 0.037 | 2.35
21 | ll, | 0.051 | 0.022 | 2.32
22 | a moment | 0.160 | 0.070 | 2.29
Sum | | 3.774 | 0.959 |
Correlation (r) 0.606; Cohen's d 1.749

Fig. 5.29  Final list of 22 minus group distinctive features (organized by Distinctiveness Ratio [DR])

No. | Variable | Mankiewicz | Welles | DR
23 | fr | 0.086 | 0.142 | 0.61
24 | ame | 0.046 | 0.078 | 0.59
25 | ie | 0.133 | 0.227 | 0.59
26 | ra | 0.169 | 0.289 | 0.58
27 | ov | 0.061 | 0.106 | 0.58
28 | in the | 0.200 | 0.345 | 0.58
29 | j | 0.125 | 0.217 | 0.58
30 | e_b | 0.090 | 0.155 | 0.58
31 | little | 0.125 | 0.258 | 0.48
32 | qui | 0.024 | 0.050 | 0.48
33 | where | 0.070 | 0.150 | 0.47
34 | some | 0.108 | 0.233 | 0.46
35 | act | 0.026 | 0.056 | 0.46
36 | last | 0.055 | 0.128 | 0.43
37 | enc | 0.029 | 0.073 | 0.40
38 | man | 0.075 | 0.203 | 0.37
39 | another | 0.070 | 0.225 | 0.31
40 | yes | 0.030 | 0.125 | 0.24
41 | silence | 0.043 | 0.198 | 0.22
42 | : (colon) | 0.011 | 0.150 | 0.07
43 | old | 0.020 | 0.278 | 0.07
44 | … (ellipsis) | 0.021 | 0.504 | 0.04
Sum | | 1.617 | 4.190 |
Correlation (r) 0.385; Cohen's d 1.403

Fig. 5.30  Number of linguistic features in the final 44 plus and minus groups

Linguistic feature | Plus group | Minus group
punctuation | 0 | 2
unigram | 0 | 1
bigram | 2 | 4
trigram | 3 | 5
personal directions | 1 | 0
scene heading | 1 | 0
words | 10 | 9
two-word collocations | 5 | 1

References

Austin, Peter E., and Janet E. Hux. 2002. A Brief Note on Overlapping Confidence Intervals. Journal of Vascular Surgery 36 (1): 194–95.
Bondi, Marina. 2010. Perspectives on Keywords and Keyness: An Introduction. In Keyness in Texts, ed. Marina Bondi and Mike Scott, 1–18. Amsterdam: John Benjamins.
Cumming, Geoff. 2014. The New Statistics: Why and How. Psychological Science 25 (1): 7–29.
Ellegård, Alvar. 1962. A Statistical Method for Determining Authorship: The Junius Letters 1769–1772. Gothenburg: Gothenburg Studies in English.
Kenny, Anthony. 1986. A Stylometric Study of the New Testament. Oxford: Clarendon Press.
Mosteller, Frederick, and David L. Wallace. 1964. Inference and Disputed Authorship: The Federalist. Reading, Mass.: Addison-Wesley.
Wasserstein, Ronald L., and Nicole A. Lazar. 2016. The ASA Statement on p-Values: Context, Process, and Purpose. The American Statistician 70 (2): 129–33.
Williams, C. B. 1940. A Note on the Statistical Analysis of Sentence-Length as a Criterion of Literary Style. Biometrika 31: 356–61.
Williams, C. B. 1970. Style and Vocabulary. London: Griffin.
Zipf, George Kingsley. 1932. Selected Studies of the Principle of Relative Frequency in Language. Cambridge, Mass.: Harvard University Press.

Chapter 6

Comparing Mankiewicz and Welles to the Citizen Kane Screenplay (1): Relative Frequencies, Distinctiveness Ratios, and Confidence Intervals

At the end of Chap. 5, I constructed statistical profiles of Mankiewicz and Welles using relative frequencies, distinctiveness ratios, and confidence intervals, identifying 44 types of distinctive linguistic features that distinguish the two authors (Figs. 5.28 and 5.29). This chapter employs the same 44 linguistic features, but this time to find similarities between each author and the Citizen Kane screenplay. Chapter 2 listed three possible outcomes, which can now be reformulated in terms of relative frequencies and distinctiveness ratios. The relative frequencies of linguistic features in the Citizen Kane screenplay may (1) match or fall close to the relative frequencies of one author (as expressed in a small distinctiveness ratio) or (2) fall somewhere in between the relative frequencies of both authors, with some results aligning with Mankiewicz and others aligning with Welles, or (3) the relative frequency values in the Citizen Kane screenplay may fall completely outside both authors’ relative frequency values. The first scenario confirms single authorship, the second is evidence of coauthorship, and the third scenario is unlikely, for it suggests that neither Mankiewicz nor Welles wrote the screenplay (nonetheless, this outcome remains an option for some sections of the screenplay due in part to John Houseman’s claim to have contributed to the “News on the March” sequence). A clear-cut case of single authorship is confirmed if one author’s plus and minus distinctiveness ratios are located inside the confidence interval and if the other author’s ratios are located outside the interval, and coauthorship is the most likely inference if both Mankiewicz’s and Welles’s results are located in a similar position in relation to the intervals, especially if they overlap within an interval for the same group of linguistic features in the Citizen Kane screenplay. But how do we identify the correspondences between Mankiewicz and Welles on the one hand and Citizen Kane on the other? In this chapter, I undertake three analyses. (1) The first compares each author’s statistical profile to the Citizen Kane screenplay as a whole (as a single entity), which (after editing) is 25,841 words long (111,439 letters, a total of 146,914 byte-level n-grams). (2) The second divides the





screenplay into 4000-word samples (the same length used at the end of Chap. 5 to study the fluctuation of the e_b trigram). This practice, common in statistics, divides a large sample into equal-sized smaller subsamples regardless of high-level structures such as chapters, scenes, paragraphs, or sentences. (3) The third analysis adheres to the natural subdivisions in the screenplay—its 13 major scenes (marked by changes in action, space, and/or time). This third approach worked well for MacDonald P. Jackson (2003) in his study of Pericles and for Ledger and Merriam (1994) in their study of The Two Noble Kinsmen (discussed in Chap. 4). All three analyses reveal the similarities and differences between Mankiewicz and Welles on the one hand and the Citizen Kane screenplay on the other. These analyses follow the same straightforward procedure: (a) the 44 types of linguistic features in the plus and minus groups are counted in (a sample of) Citizen Kane to determine their observed frequencies (token count) and their relative frequencies, and (b) these relative frequencies are compared to the relative frequencies of the same linguistic features in Mankiewicz’s and Welles’s statistical profiles to generate distinctiveness ratios, which determine which author’s statistical profile is closest to Citizen Kane. An author’s similarity to Citizen Kane is presented numerically as a ratio and visually in a confidence interval graph. The area within the confidence interval represents a neutral zone where similarity (a small ratio) is predominant, with the interval’s center signifying equivalence (a ratio of 1:1, an exact match) between an author and the Citizen Kane screenplay. These three analyses involve a certain amount of duplication. The same 44 linguistic features are counted three times, which affords the opportunity to compare the three sets of results in an attempt to ensure that each test is internally valid— which is achieved if the results are consistent—that is, compatible—across all three analyses. Although carrying out an analysis three times is repetitive, it is more valuable than a single analysis, for the repetition aims to reduce the errors, uncertainty, and variability associated with sampling. Furthermore, all three analyses are directly comparable because they use the same author profiles (derived from the same 40,000-word samples) and the same comparative text (the Citizen Kane screenplay). This means that the proportion of byte-level n-grams and word-grams remains constant across the three analyses. Nonetheless, the repetition is not exact, for the same linguistic features are counted in different-sized samples taken from the comparative text. At each stage of the analysis, I spell out the procedures I followed to infer authorship, beginning with the observed data, to data collection, to its quantification as relative frequencies and distinctiveness ratios, and then to the comparison of each author’s ratios using confidence intervals.
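Steps (a) and (b) amount to a few lines of arithmetic once the feature counts are in hand. The sketch below illustrates the calculation with the INT. scene heading discussed in the next section (110 occurrences in Mankiewicz's 40,000-word sample, 23 in Welles's, and 70 in the 25,841-word Citizen Kane screenplay); the helper functions are generic illustrations rather than the author's own code, and the frequencies are rounded to three decimals to match the book's tables.

# Sketch: relative frequencies and CK distinctiveness ratios for one feature.
def relative_frequency(count, total_words):
    # Expressed as a percentage, matching the book's tables.
    return round(100 * count / total_words, 3)

def ck_ratio(ck_rel_freq, author_rel_freq):
    # The author's relative frequency sits in the denominator, so a value
    # close to 1 means the screenplay matches that author's profile.
    return ck_rel_freq / author_rel_freq

# Worked example: the scene-heading element INT.
mank_rf   = relative_frequency(110, 40_000)   # 0.275
welles_rf = relative_frequency(23, 40_000)    # 0.058
kane_rf   = relative_frequency(70, 25_841)    # 0.271

print(round(ck_ratio(kane_rf, mank_rf), 2))    # 0.99 -> inside [0.7, 1.5]
print(round(ck_ratio(kane_rf, welles_rf), 2))  # 4.67 -> far outside the interval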

6.1 Whole Screenplay

119

6.1 Whole Screenplay 6.1.1 Relative Frequencies Firstly, a summary of the observed and relative frequency of the 22 plus distinctive linguistic features found in Citizen Kane (whole screenplay)—the number of tokens adds up to 940, with individual frequencies varying from smiles (nine) to dissolve (127) (Fig. 6.13 in the Appendix at the end of this chapter). Mankiewicz’s 22 plus features add up to 3.774%, Welles’s add up to 0.959%, and Citizen Kane’s sum is 2.706%. Not surprisingly, the Citizen Kane screenplay is located between its two authors, with Mankiewicz’s profile revealing a difference of 1.068 from Citizen Kane and Welles’s profile a difference of 1.747 (in absolute terms, ignoring the negative). Translating these differences into percentages, we see that in the plus group, Mankiewicz diverges 38.87% from Citizen Kane and Welles diverges 61.13%. Of course, our main interest in this chapter is similarity; this means that Mankiewicz’s divergence of 38.87% is the same as a 61.13% match, and Welles’s 61.13% divergence translates into a 38.87% match. Secondly, a summary of the observed and relative frequency of the 22 minus distinctive linguistic features found in Citizen Kane (whole screenplay)—the overall sum of these features in Citizen Kane is 2171, varying from silence (ten) to the bigram ra (440) (Fig. 6.14 in the Appendix). Note that despite its larger observed frequency (2171 in comparison to 940), the minus group relative frequency count (2.565) is less than the plus group (2.706) because n-grams and words are weighted differently.1 The percentage sum of these 22 minus features in Mankiewicz is 1.617%, in Welles their sum is 4.190%, and in Citizen Kane it is 2.565%. Citizen Kane is again located between the two authors, with Mankiewicz’s profile displaying a difference of 0.948 in absolute terms and Welles’s profile a difference of 1.625. Mankiewicz diverges 36.96% from Citizen Kane, and Welles diverges 63.04%. Converting these to similarities, Mankiewicz is 63.04% similar and Welles is 36.96% similar to Citizen Kane.

6.1.2 Ratios I shall call the ratio that compares an author to Citizen Kane the “CK distinctiveness ratio.” The 44 CK distinctiveness ratios (each accompanied by their 95% confidence interval), representing the similarities between Mankiewicz and Citizen Kane (Fig. 6.15 in the Appendix), are generated by dividing the relative frequencies of the linguistic features in Citizen Kane into Mankiewicz’s relative frequencies, with  In other words, the relative frequency of an n-gram is much smaller than the relative frequency of a word. The high count of n-grams in the minus group therefore contributes less than the high count of words in the plus group. 1



[Figure: the plus group CK distinctiveness ratio values (Citizen Kane/Mankiewicz and Citizen Kane/Welles) plotted for each linguistic feature against the [0.7, 1.5] interval.]

Fig. 6.1  Visualization of the plus group CK distinctiveness ratios

Mankiewicz placed in the denominator position. Similarly, the 44 CK distinctiveness ratios representing the similarities between Welles and Citizen Kane (Fig. 6.16 in the Appendix) are generated by dividing the relative frequencies of the linguistic features in Citizen Kane into Welles’s relative frequencies, with Welles placed in the denominator position. I now switch to examining the plus group and minus group data separately using graphs. All the plus group CK distinctiveness ratios are visualized in Fig. 6.1, and the minus group ratios are visualized in Fig. 6.2. Because the results are presented as ratios, I follow Anthony Kenny (2016, 130) in using Ellegård’s [0.7, 1.5] ratio interval as the boundary to mark the similarity/difference between each author and Citizen Kane.2 Any ratio higher than 0.7 but lower than 1.5 falls inside the interval, signifying a similarity between an author and Citizen Kane due to their comparable relative frequencies. For example, there are 110 instances of the scene heading element INT. in Mankiewicz’s 40,000-word sample, yielding a relative frequency of 0.275% (which means that, on average, in every 100 words, INT. occurs 0.275 times). In Citizen Kane, its observed frequency is 70, which yields an almost identical relative frequency of 0.271%, resulting in a CK distinctiveness ratio value of 0.99. In contrast, Welles’s CK distinctiveness ratio value for INT. is 4.67, a significant difference between his relative frequency (0.058) and Citizen Kane’s. In the graphic representation of ratios, the similarity between an author and Citizen Kane is evident if a marker falls inside the [0.7, 1.5] interval.  Kenny uses a ratio of 1.4 rather than 1.5 because 1.4 is the exact inverse of 0.7. I follow Kenny in using the ratio interval as the boundary to mark the similarity/difference between authors, but I stick to Ellegård’s [0.7, 1.5] ratio interval. 2

6.1 Whole Screenplay


[Figure: the minus group CK distinctiveness ratio values (Citizen Kane/Mankiewicz and Citizen Kane/Welles) plotted for each linguistic feature against the [0.7, 1.5] interval.]

Fig. 6.2  Visualization of the minus group CK distinctiveness ratios

In total, 21 of Mankiewicz’s ratio values are located within the [0.7, 1.5] interval—eight in the plus group (features 1–22) and 13 in the minus group (features 23–44). In contrast, 14 of Welles’s CK distinctiveness ratios are located within the interval—five in the plus group and nine in the minus group, indicating their similarity to the Citizen Kane screenplay. However, the bigram fr (entry 23) is distinctive for both Mankiewicz and Welles, which rules it out as a distinctive feature for either author (or, more positively, it signifies 50/50 coauthorship). These graphs reveal the erratic fluctuation of values above and below the intervals due to the use of individual linguistic features with very low frequencies. This is especially evident in Fig.  6.1, where Welles’s ratio for dissolve, 12.287, is an outlier that falls far outside the interval, for it is significantly larger than the other values, and its 95% confidence interval [7.31, 20.67] is wide and asymmetrical. (Compare this result to the bigram ie, which has a CK distinctiveness ratio of 0.67 and a narrow symmetrical 95% confidence interval of [0.57, 0.78].)

6.1.3 Summary of the Whole Screenplay Analysis Figure 6.3 presents a summary of the percentage matches between Mankiewicz and Welles on the one hand and Citizen Kane (the whole screenplay) on the other. Mankiewicz’s matches in his plus and minus groups are similar, averaging out at 62.09%, and Welles’s plus and minus matches are also similar, averaging out at 37.92%. Both authors’ contributions add up to 100.01% (their respective



Fig. 6.3 Percentage matches between Mankiewicz, Welles, and Citizen Kane (Whole screenplay)

Citizen Kane / Citizen Kane / Mankiewicz Welles match match + Group

61.13%

38.87%

– Group

63.04%

36.96%

Average

62.09%

37.92%

percentages have been rounded up, which accounts for the additional 0.01%). In other words, this initial comparison of relative frequencies demonstrates that authorship is shared, although not equally, for Mankiewicz is the dominant author: 62.09% of Mankiewicz’s statistical profile was detected in Citizen Kane, whereas only 37.92% of Welles’s profile was detected. This is a difference of 24.17 percentage points between the two authors, which translates into a 38.93% percentage difference. This means that Mankiewicz’s profile has a 38.93% increased presence in Citizen Kane compared to Welles.3 In terms of the total number of matches—where the ratio values fall within the interval—20 out of 44 (44.5%) of Mankiewicz’s distinctive linguistic features and 13 out of 44 (29.5%) of Welles’s features match Citizen Kane (excluding the shared bigram fr). Although these figures are broadly in line with the relative frequency results (in terms of the percentage difference between Mankiewicz and Welles in relation to Citizen Kane), the simple count of features inside and outside the interval resembles the all-or-nothing p test, for it does not consider the degree of similarity/ difference between the ratios inside and outside the intervals. This is why the graphs are invaluable for showing the magnitude of inclusion/exclusion from the interval. On this occasion, they reveal the erratic fluctuation of individual linguistic features with low frequencies. In this initial analysis, I counted Mankiewicz’s and Welles’s distinctive linguistic features in the Citizen Kane screenplay (as a single entity) in order to quantify each author’s contribution. In the following two analyses, I continue to compare Mankiewicz and Welles to Citizen Kane via relative frequencies, ratios, and confidence intervals, but I divide the screenplay into smaller subsamples and only report the sum of the plus and minus groups of linguistic features in each subsample. Will dividing the screenplay into 4000-word segments offer a more nuanced analysis of its (co)authorship by detecting the boundaries where authorship changes?

 As mentioned in Chap. 5, “percentage point” refers to the arithmetic difference between two percentages (subtraction of one figure from the other), whereas “percentage difference” is the difference between two percentages expressed as a percentage. 3

6.2 Seven Segments of Citizen Kane


6.2 Seven Segments of Citizen Kane The previous section supports the inference that Citizen Kane was coauthored, although the statistical results strongly favor Mankiewicz over Welles. In this section, I divide the Citizen Kane screenplay into six consecutive segments of 4000 words each and a seventh segment of 1841 words (the remaining words at the end of the screenplay).4 Calculating the relative frequencies of the 44 plus and minus features for each segment of the screenplay yields a total of 14 tables containing 308 frequencies. Rather than reproducing these 14 tables, I present the sums of their relative frequencies and their distinctiveness ratios.

6.2.1 Relative Frequencies Mankiewicz’s 22 plus features add up to 3.774% and Welles’s add up to 0.959%. Figure 6.17 in the Appendix presents the relative frequencies of these 22 plus features in each of the seven segments of Citizen Kane, together with their differences from Mankiewicz and Welles. For example, in segment 1 of Citizen Kane, the 22 plus features add up to 0.949%. This means that Mankiewicz differs by 2.825% from segment 1 of Citizen Kane (3.774–0.949), whereas Welles only differs by 0.010% (0.959–0.949: almost a perfect match). Mankiewicz’s 22 minus features add up to 1.617%, and Welles’s add up to 4.190%. Figure 6.18 in the Appendix presents the minus group relative frequencies across the seven segments of Citizen Kane, together with Mankiewicz’s and Welles’s differences from the screenplay across those seven segments. For example, in segment 1 of Citizen Kane, the 22 minus features add up to 3.149%. This means that Mankiewicz differs by 1.532% from segment 1 of Citizen Kane, whereas Welles only differs by 1.041%. These percentage differences are important to the extent that they determine the CK distinctiveness ratios in each of the seven segments of Citizen Kane.

6.2.2 Ratios Figure 6.4 (see page 125) visualizes the CK distinctiveness ratio values for the seven segments of the plus group. The black squares represent Mankiewicz’s CK distinctiveness ratio values across the seven segments. Four of his seven segments are similar to Citizen Kane, for they are located inside the interval (visible in the graph and  Although segment 7 is less than half the size of the other segments, it is significant for identifying a potential change of authorship and is therefore too valuable to discard. And, of course, the results have been normalized. 4



highlighted in the accompanying table). Just as importantly, the degree or magnitude of similarity is indicated by the position of the squares within the interval—the closer they are to the center the stronger the match. The graph confirms that three segments fall outside and below the interval boundary. Mankiewicz’s ratio value for segment 1 (0.25) contrasts sharply with his remaining values, which are located within or close to the interval. The value of segment 6 (0.94), for example, is a close match between this segment of Citizen Kane and Mankiewicz’s statistical profile. One possible explanation for the dissimilarity between Mankiewicz’s segment 1 and Citizen Kane is that, unusually, nine of the 22 features in the plus group of segment 1 of Citizen Kane register zero frequency.5 This is unusual because only 16 additional zero frequencies appear across the remaining 13 (plus and minus) groups. This high concentration of zero frequencies in the plus group of segment 1 of Citizen Kane lowers its overall relative frequency, which affects Mankiewicz more than Welles because Mankiewicz’s overall frequency is high in the plus group (3.774) whereas Welles’s overall frequency is low (0.959). It is the difference between Mankiewicz’s 3.774 and his 0.849 for segment 1 of Citizen Kane that contributes to the low CK distinctiveness ratio of 0.25, which signifies that very few of his distinctive linguistic features appear in segment 1. The black circles in Fig. 6.4 represent Welles’s CK distinctiveness ratio values across the seven segments. Only segment 1 is located within the interval, registering a ratio value of 0.99, an almost perfect match between this segment of Citizen Kane and Welles. His values for the remaining six segments lie outside and significantly above the interval, registering ratio values from 2.06 (segment 7, the second lowest value) to 3.70 (segment 6). Across the seven segments, there are no overlapping values: where one author is in the interval, the other is outside it. In segment 1, the magnitude of Mankiewicz’s difference from Citizen Kane is matched by Welles’s similarity to it. This correlation between the two authors continues across segments 2 to 7, with both rising and falling at the same time, indicating that when one author’s relative frequencies get closer to Citizen Kane’s relative frequencies, the other author veers away from it— although at different magnitudes, with Mankiewicz close to or inside the interval and Welles far from and outside it.6 The CK distinctiveness ratio values for the seven segments of the minus group are visualized in Fig. 6.5 (see page 126). The graph identifies similarities between the two authors and Citizen Kane: four of Mankiewicz’s values (segments 3, 4, 5, and 6) and two of Welles’s values (1 and 7) are located inside the interval (highlighted in the accompanying table), with only one value (Welles’s 0.87 in segment 7) representing a close match with Citizen Kane. Furthermore, just like the plus group, across the seven segments there is no overlap in this minus group: where one author is inside the interval, the other is outside.  The following words and two-word collocations register zero frequencies in segment 1 (plus group) of Citizen Kane: Miss, the room, at her, pause/s, mind, smiles, looks, going to, and a moment. 6  Because Citizen Kane is a text written primarily by these two authors, one would expect Mankiewicz’s profile to recede when Welles’s profile dominates, and vice versa. 5

6.2 Seven Segments of Citizen Kane

125

4.0 3.5 3.0

Ratio

2.5 2.0 1.5 1.0 0.5 0.0 1

2

3

4 5 Segment Citizen Kane / Welles

Citizen Kane / Mankiewicz

6

7

Int.-

Int.+

Segment Number:

1

2

3

4

5

6

7

Citizen Kane / Mankiewicz

0.25

0.71

0.57

0.76

0.88

0.94

0.52

Citizen Kane / Welles

0.99

2.78

2.26

2.99

3.47

3.70

2.06

Fig. 6.4 The plus group ratios (Mankiewicz and Welles compared to seven segments of Citizen Kane)

It is important to study each author’s plus and minus results together in order to derive an accurate account of their contribution to Citizen Kane. This is because a match of the relative frequencies of an author to the screenplay in both the plus and minus groups creates a strong case for single authorship. The results of the seven-­ segment analysis confirm that Citizen Kane matches Mankiewicz’s profile in both the plus and minus groups in segments 4, 5, and 6 and that Citizen Kane matches Welles’s profile in both the plus and minus groups in segment 1. Furthermore, Mankiewicz matches Citizen Kane in segment 3 of the minus group but not the plus group, and similarly, Welles matches Citizen Kane in segment 7 of the minus group but not the plus group. The plus and minus values for segment 2 are anomalous, for they are almost identical (2.669 and 2.705 respectively), and they are very close to the overall percentage sum of Citizen Kane’s plus and minus groups (2.706 and 2.565 respectively). These results indicate that neither Mankiewicz’s nor Welles’s statistical profile is dominant in segment 2, suggesting that authorship is split 50/50 (an assumption I explore further in this chapter). In his minus group, Welles to some extent replicates the CK distinctiveness ratio values in Mankiewicz’s plus group: it is now Welles’s values that are low, placed outside and underneath the interval—except for the first and seventh values, which are located inside it. Nonetheless, the replication is incomplete because four of Mankiewicz’s values fall within the interval in the plus group and only two of

Segment:                    1      2      3      4      5      6      7
Citizen Kane / Mankiewicz   1.95   1.67   1.39   1.42   1.42   1.36   2.26
Citizen Kane / Welles       0.75   0.65   0.54   0.55   0.55   0.53   0.87

Fig. 6.5 The minus group ratios (Mankiewicz and Welles compared to seven segments of Citizen Kane). The graph plots both sets of ratios against the interval bounds Int.- (0.7) and Int.+ (1.5).

Welles’s values fall within the interval in the minus group. Furthermore, Mankiewicz does not replicate Welles in the plus group because, again, four of Mankiewicz’s values in the minus group fall inside the interval compared to one of Welles’s values in the plus group. In terms of the total number of matches, in the plus and minus groups, eight out of 14 (57%) of Mankiewicz’s values and only three out of 14 (21.5%) of Welles’s values are located within the [0.7, 1.5] ratio interval. But, as I pointed out at the end of the whole screenplay analysis, these figures are less relevant than the actual relative frequency and ratio values.

6.2.3 Summary of the Seven-Segment Analysis

Figure 6.6 presents a summary of the percentage matches between Mankiewicz and Welles on the one hand and the seven segments of Citizen Kane on the other. Unlike the whole screenplay analysis, Mankiewicz's matches in his plus and minus groups are different, with the plus group (48.97%) much lower than the minus group (61%). The plus group result brings down Mankiewicz's average to 54.99%. Welles's plus and minus matches remain similar, averaging out at 40.04%. Both authors' averages add up to 95.03%, which means that 5% remains unaccounted for. There is an overall difference of 14.95 percentage points between Mankiewicz and Welles in relation to Citizen Kane (54.99% minus 40.04%), which translates into a 27.19% percentage difference (14.95 divided by 54.99).

          Citizen Kane / Mankiewicz match   Citizen Kane / Welles match
+ Group   48.97%                            38.34%
– Group   61.00%                            41.74%
Average   54.99%                            40.04%

Fig. 6.6 Percentage matches between Mankiewicz, Welles, and Citizen Kane (Seven Segments)

In place of a single analysis of the whole screenplay presented in the earlier section, dividing it into seven segments has generated variations that potentially mark the boundaries of authorship. In particular, the position of each author's values in the intervals (represented as black squares and circles in Figs. 6.4 and 6.5) shows where an author's profile is dominant and where it is subordinate. In regard to the authorship of each segment, Mankiewicz's CK distinctiveness ratio value for segment 1 in the plus group (far outside the interval) indicates that this segment is strongly associated with Welles's statistical profile. In contrast, segments 4, 5, and 6 of the plus group are strongly associated with Mankiewicz's statistical profile because all three of his CK distinctiveness ratio values are inside the interval while all three of Welles's corresponding values are far outside it. In the minus group, Welles's CK distinctiveness ratio value for segment 1 is inside the interval, reinforcing the results from the plus group that Welles is strongly associated with that segment (for his values in both the plus and minus groups are located inside the interval and both of Mankiewicz's values for the same segment are located outside it). For the same reason, Welles's results for segments 4, 5, and 6 of the minus group also reinforce the majority of the results from the plus group that Mankiewicz is strongly associated with these three segments, for Mankiewicz's values in both the plus and minus groups are located inside the interval and both of Welles's values for the same segments are located outside it. In segment 7, only Welles's minus group enters the interval. Segmentation has revealed fluctuations in Mankiewicz's and Welles's statistical profiles in Citizen Kane that were not evident in the analysis of the screenplay as a single entity. The segmentation provides evidence that Welles's profile is prominent (though overwhelmingly so) only at the beginning (segment 1) and end (segment 7) of the screenplay. The statistical tests show that, in the screenplay's central sections (especially segments 3 to 6), Mankiewicz's profile is again prominent. Can the analysis of Citizen Kane's major predefined scenes reveal and identify the author(s) of each scene and therefore quantify further each author's contribution to the screenplay?


6.3 Thirteen Scenes of Citizen Kane

6.3.1 Relative Frequencies

For this third analysis, I divided the Citizen Kane screenplay into its 13 major scenes (CK1 to CK13—see Fig. 6.19 in the Appendix). I then calculated the relative frequencies of the 22 plus and 22 minus features for each scene (a total of 572 frequencies), together with their differences from Mankiewicz and Welles (Figs. 6.20 and 6.21 in the Appendix). These frequencies make sense when converted into ratios.
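As a rough illustration of this bookkeeping, the sketch below computes per-100-word relative frequencies for a handful of the plus-group features. The file name and the truncated feature list are placeholders, and the real feature set (Figs. 6.13 and 6.14) also contains punctuation and character n-gram markers that would need their own counting rules.

import re

def relative_frequencies(text, features):
    # Word tokens only; per-100-word rates, matching the percentages
    # reported in Figs. 6.13 and 6.14. Collocations are matched as
    # substrings of the space-joined word sequence, which is approximate.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    joined = " ".join(words)
    freqs = {}
    for feature in features:
        if " " in feature:                # two-word collocation
            count = joined.count(feature)
        else:                             # single word
            count = words.count(feature)
        freqs[feature] = 100.0 * count / total if total else 0.0
    return freqs

# A few of Mankiewicz's plus-group features (the full list has 22 entries).
plus_features = ["miss", "the room", "at her", "there is", "because"]
scene_text = open("CK01_prologue.txt", encoding="utf-8").read()   # hypothetical file
scene_profile = relative_frequencies(scene_text, plus_features)
print(scene_profile, sum(scene_profile.values()))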

6.3.2 Ratios

I divided the relative frequencies of the 13 scenes of Citizen Kane by the statistical profiles of Mankiewicz and Welles (with the authors in the denominator position) to generate each scene's CK distinctiveness ratio. As with the seven segments, I present the CK distinctiveness ratios of all 13 scenes in a plus graph and table (Fig. 6.7) and in a minus graph and table (Fig. 6.8).

Scene   Citizen Kane plus group   Citizen Kane / Mankiewicz (3.774)   Citizen Kane / Welles (0.959)
CK1     1.740                     0.46                                1.81
CK2     0.655                     0.17                                0.68
CK3     2.143                     0.57                                2.23
CK4     3.540                     0.94                                3.69
CK5     3.945                     1.05                                4.11
CK6     2.561                     0.68                                2.67
CK7     2.961                     0.78                                3.09
CK8     2.469                     0.65                                2.57
CK9     2.290                     0.61                                2.39
CK10    2.993                     0.79                                3.12
CK11    2.923                     0.77                                3.05
CK12    3.549                     0.94                                3.70
CK13    2.176                     0.58                                2.27

Fig. 6.7 The plus group (Mankiewicz and Welles compared to the 13 scenes of Citizen Kane). The graph plots both sets of ratios against the interval bounds Int.- (0.7) and Int.+ (1.5).

Figure 6.7 presents the CK distinctiveness ratio values for the 13 scenes of the plus group. The results from this 13-scene analysis reinforce the results from the seven-segment analysis. The shape of Mankiewicz's plus group graph (the black squares in Fig. 6.7) repeats the equivalent graph for the seven segments (Fig. 6.4): both are below the bottom interval boundary but with several ratio values that enter the interval. Four out of seven segments inside the interval in Fig. 6.4 become six out of 13 scenes inside the interval in Fig. 6.7. Both graphs can be divided into three groups: segment 1 is equivalent to scenes 1, 2, and 3 (all outside the interval); segment 7 is equivalent to scene 13, the final part of the screenplay (again outside the interval); and segments 4, 5, and 6 (inside the interval) are represented as scenes 7 to 12 (four inside the interval and another very close to it). But what stands out in the 13-scene plus group table and graph are the small percentage differences in scenes 4, 5, and 12, representing a near exact match between Mankiewicz and Citizen Kane. Scenes 7, 10, and 11 are also near matches for Mankiewicz. In Welles's plus group (13-scene analysis), the shape of his results (the black circles in Fig. 6.7) also repeats the equivalent graph for the seven segments (Fig. 6.4): a value inside the interval at the beginning, followed by a series of values outside and above the interval, ending with a final value (segment 7, scene 13) that markedly converges toward the interval. However, the near-perfect match between Welles's profile and segment 1 is not replicated in scene 1. Again, it is scene 2 (“News on the March”) that represents the anomaly—Welles's value strays into Mankiewicz's part of the graph close to the interval. All of Welles's values for the remaining 12 scenes are positioned outside and above the interval. Because the screenplay is divided into smaller samples, we see more variation in both Mankiewicz and Welles, and in scene 2 (“News on the March”), the magnitude

of Mankiewicz’s position outside the interval becomes more pronounced, for the ratio falls to 0.17 (which is equivalent to a ratio of 5.76 from Welles’s perspective). Figure 6.8 presents the CK distinctiveness ratio values for the 13 scenes of the minus group. These results confirm the results of the plus group as well as the earlier seven-segment results (Fig.  6.5) while revealing additional fluctuations. For Mankiewicz, his four out of seven segments inside the interval become five out of 13 scenes: that is, segments 3, 4, 5, and 6 become scenes 8, 10, 11, and 12. We shall see ahead that the result for scene 9 in particular reveals important information. For Welles, his two out of seven segments within the interval (segments 1 and 7) become four out of 13 scenes (the three opening scenes, 1, 2, and 3, and the final scenes, 13), with two additional scenes (4 and 9) close to the interval.

Scene   Citizen Kane minus group   Citizen Kane / Mankiewicz (1.617)   Citizen Kane / Welles (4.190)
CK1     3.646                      2.25                                0.87
CK2     2.991                      1.85                                0.71
CK3     3.833                      2.37                                0.91
CK4     2.632                      1.63                                0.63
CK5     2.337                      1.44                                0.56
CK6     2.481                      1.53                                0.59
CK7     2.539                      1.57                                0.61
CK8     2.183                      1.35                                0.52
CK9     2.633                      1.63                                0.63
CK10    2.285                      1.41                                0.55
CK11    2.365                      1.46                                0.56
CK12    1.991                      1.23                                0.48
CK13    3.777                      2.34                                0.90

Fig. 6.8 The minus group (Mankiewicz and Welles compared to the 13 scenes of Citizen Kane). The graph plots both sets of ratios against the interval bounds Int.- (0.7) and Int.+ (1.5).

6.3.3 Summary of the Thirteen-Scene Analysis

Figure 6.9 presents a summary of the percentage matches between Mankiewicz and Welles on the one hand and the 13 scenes of Citizen Kane on the other. Mankiewicz's matches in his plus and minus groups are similar, averaging out to a 56.65% similarity. Welles's plus and minus matches are significantly different, with a low plus group match (34.92%) compared to his minus group match (47.38%), which averages out at 41.15%.

          Citizen Kane / Mankiewicz match   Citizen Kane / Welles match
+ Group   54.46%                            34.92%
– Group   58.84%                            47.38%
Average   56.65%                            41.15%

Fig. 6.9 Percentage matches between Mankiewicz, Welles, and Citizen Kane (13 scenes)

We can begin to draw together some general conclusions from this analysis of the 13 scenes. In scene 1 (Prologue), Mankiewicz's statistical profile is outside the

distinctiveness ratio interval in both the plus and minus groups, while Welles is inside the interval in the minus group. In scene 2 (“News on the March” newsreel), Mankiewicz's statistical profile is largely absent; the scene is instead dominated by Welles's profile combined either with John Houseman's profile or with the newsreel's genre conventions (whose voice-over is distinct from the conventions of dialogue and stage direction). In scene 3 (Projection Room), both the plus and minus values of Mankiewicz's statistical profile are again outside the interval, while Welles's minus value is inside—and is very close to the interval's center. With scene 4 (Thompson's first visit to Susan Alexander Kane), we see a marked shift in the screenplay: Mankiewicz's plus and minus values converge toward and largely fall inside the interval, while Welles's values diverge and largely fall outside it. It is therefore significant to note that scene 4 begins with the following comment, in parentheses:

(Note: Now begins the story proper—the search by Thompson for the facts about Kane—his researches [sic]—his interviews with the people who knew Kane.) (In Kael, Mankiewicz, and Welles [1971], 126.)

With these words, Mankiewicz’s statistical profile becomes dominant. In scene 4, Mankiewicz’s plus value is inside the interval, while both of Welles’s values are outside it. In scene 5 (Thompson’s visit to the Thatcher library), both of Mankiewicz’s values are inside the interval, and their ratios almost overlap because they are similar; in contrast, both of Welles’s values are far apart and outside the interval. This trend is repeated in scenes 6 (Thatcher library flashback), 10 (Leland’s flashback), 11 (Susan’s second framing story), and 12 (Susan’s flashback), where both of Mankiewicz’s values are again inside the interval and both of Welles’s values outside it. Welles’s statistical profile is outside the interval from scenes 4 to 12. Only in the final scene (scene 13) does his profile reenter the screenplay, dominating it in the way evident in scenes 1 (Prologue) and 3 (screening room), with his minus value close to the interval’s center and his plus value converging toward it, with both of Mankiewicz’s values outside the interval. It is important to investigate further the relation between each author’s plus and minus values. In scenes 1, 2, 3, 9, and 13 (Prologue, News on the March, Projection Room, Leland’s framing story, and the final scene), Mankiewicz is outside the intervals in both the plus and minus groups. In contrast, in scenes 5, 10, 11, and 12 (the Thatcher Library, the flashback embedded in Leland’s scene, Susan’s second framing scene, and the flashback embedded in her scene), Mankiewicz is inside the intervals in both the plus and minus groups.

Fig. 6.10 Mankiewicz's results from his plus and minus groups (CK distinctiveness ratios for scenes 1 to 13, plotted against the interval bounds Int.- (0.7) and Int.+ (1.5))

Fig. 6.11 Welles's results from his plus and minus groups (CK distinctiveness ratios for scenes 1 to 13, plotted against the interval bounds Int.- (0.7) and Int.+ (1.5))

The relation between each author’s plus and minus values becomes clearer when combined into a single graph. Figure  6.10 combines Mankiewicz’s results from both his plus and minus groups, and Fig. 6.11 combines Welles’s results from both of his groups. The differences between Figs. 6.10 and 6.11 are striking. In Fig. 6.10, we can clearly see that none of Mankiewicz’s plus and minus values in scenes 1, 2, 3, 9, and 13 are located inside the interval. In contrast, his plus and minus values from scenes 4 to 8 and 10 to 12 are closely interlinked, demonstrating that all his distinctive linguistic features in these scenes manifest a similar relative frequency to


Citizen Kane. Furthermore, most of his values are inside the interval (whether they derive from the plus group or the minus group), for they all have a low CK distinctiveness ratio. Scenes 5 (Thatcher Library) and 12 (Susan’s flashback) are notable because the plus and minus values are close to one another inside the interval. But in scene 13, Mankiewicz’s plus and minus ratio values diverge rather suddenly and are positioned outside (above and below) the interval. The interlinking of Mankiewicz’s plus and minus values in scenes 4 to 8 and 10 to 12 predominately within the interval indicates that his statistical profile closely matches Citizen Kane in those scenes—especially when compared to Welles’s statistical profile in relation to the same scenes (Fig. 6.11). We have already seen that Welles’s plus group values are significantly above the interval and his minus group values are below and outside the interval, especially in the middle sections. Furthermore, unlike Mankiewicz’s plus and minus graphs, there is a wide gap between Welles’s two graphs. Yet Welles’s statistical profile is evident in scenes 1, 2, 3, and 13 (Prologue, News on the March, Projection Room, and the final scene), where he is positioned within the interval, with the minus values of scenes 3 and 13 (the black circles in Fig. 6.11) near the interval’s center (0.91 and 0.90 respectively, close to a ratio of 1, signifying a near match to Citizen Kane). Even Welles’s plus group values for scenes 1 and 13 (the black squares in Fig. 6.11) are marginally closer to the interval than his other values: scene 13 edges closer without reaching its threshold, unlike his minus value for scene 13, which is located within the interval. This means that Welles’s graph for the minus group begins and ends in the interval, unlike Mankiewicz, whose minus values for these scenes fall outside the interval. Furthermore, Welles’s plus value for scene 2 (0.68) is almost identical to its value in the minus group (0.71), which means that they nearly overlap. All of Welles’s distinctive linguistic features in “News on the March” therefore manifest a similar relative frequency, making a case that his statistical profile dominates while Mankiewicz’s is largely absent from this scene. Nonetheless, attributing scene 2 to Welles is not so straightforward. For both Mankiewicz and Welles, their CK distinctiveness ratio values for scene 2 in the plus group are outliers: for Mankiewicz, the ratio extends all the way down to 0.17, far outside the range of his other values. And similarly for Welles, his plus values decrease from 1.81 (scene 1, outside the upper interval) to 0.68 (scene 2 almost on the lower interval) before increasing to 2.23 (scene 3, outside the upper interval), and it remains outside the interval in subsequent scenes. In other words, for both Mankiewicz and Welles, scene 2 in the plus group does not match the CK distinctiveness ratio values of their other scenes. Why is Mankiewicz’s plus value almost off the chart while Welles’s plus and minus values overlap (for the first and only time)? An additional factor appears to be influencing the results. We need to consider John Houseman’s claim to have rewritten the scene many times (see Chap. 2). We can only speculate that perhaps Houseman’s claim to authorship is justifiable in that he did write and rewrite “News on the March” (with Welles) and at the same time his authorial profile is somewhat similar to Welles’s (or, at least, completely different to Mankiewicz’s). 
However, such a claim can only be substantiated by


constructing Houseman’s statistical profile and then comparing it to Welles and to scene 2—which is not attempted here because this study is focused on Mankiewicz and Welles, and the analysis has just established that scene 2 does not match Mankiewicz’s statistical profile.7 Figure 6.11 reveals another pattern: Welles’s CK distinctiveness ratio values in scenes 5 (Thatcher library) and 12 (Susan’s flashback) for the plus group are particularly high (4.11 and 3.70 respectively), and the corresponding values for the minus group are low (0.56 and 0.48 respectively). In other words, Welles’s statistical profile is almost completely absent from these two scenes, which explains why Mankiewicz’s plus and minus values for the same scenes almost overlap inside the interval. Scene 9, part of the framing story depicting Thompson’s visit to interview Leland, shows a minor reversal of this pattern. Patrick McGilligan points out: More than once Orson returned to the Kane-Leland scenes, trying to capture the friendship that perplexes Leland. During filming, he worked with Cotten to touch up Leland’s narration, his wheelchair scenes, and the final clash between Kane and Leland after Kane finishes Leland’s damning review of Susan’s opera debut (McGilligan [2015], 677–78).

Welles’s editing and rewriting of scene 9, the Leland framing story (the “wheelchair scenes”), shows up in Figs. 6.10 and 6.11, with Mankiewicz’s values separating slightly and Welles’s moving slightly closer together. The stylometric analysis supports McGilligan’s claim that Welles robustly edited scene 9 of the screenplay, and his intervention during the filming of the scene may have changed it further (however, the finished film is not the present object of study). Nonetheless, there is no stylometric evidence that Welles robustly edited the Kane-Leland flashback (scene 10). The statistical tools (relative frequencies, distinctiveness ratios, and confidence intervals) that I deployed three times (whole screenplay, seven segments, 13 scenes) in this study discovered unequal coauthorship in Citizen Kane, with Mankiewicz’s statistical profile dominant. And as the analyses progressed to smaller segments, additional information emerged about single authorship, of who dominates what sections. Figure  6.12 presents a summary of the relative frequency similarities between Mankiewicz and Welles in relation to Citizen Kane from all three tests. Because the data are the same (author profiles based on 40,000-word samples, the Citizen Kane screenplay), one would expect the results to be comparable. There is a constant in all three results, a constant that is central to this comparative study of Mankiewicz and Welles: the percentage of Mankiewicz’s statistical profile located in Citizen Kane is always greater than Welles’s profile—and by a similar percentage point margin. In fact, the seven-segment and 13-scene analyses are almost identical,  We can also speculate that scene 2 is an anomaly due to genre conventions, for the voiceover imitates a newsreel, meaning that it is distinct from both the dialogue and the scene text in the rest of the screenplay. Nonetheless, this is a hypothesis that will need to be tested, for the distinctiveness of the newsreel voiceover lies more in its syntax, which reverses words (‘Legendary was Xanadu’) as well as its loud, bombastic tone of voice, linguistic characteristics that do not influence the tests performed in this chapter. 7

6.4 Preliminary Conclusions Fig. 6.12  Summary of the percentage matches between Citizen Kane (three tests) and Mankiewicz and Welles

135

Citizen Kane / Citizen Kane / Mankiewicz Welles match match Whole screenplay

62.09%

37.92%

Seven segments

54.99%

40.04%

13 Scenes

56.65%

41.15%

Grand Total

57.91%

39.70%

whereas the whole screenplay analysis attributes around 7% more to Mankiewicz than those two tests and 3% less to Welles.8 A “Grand Total” of all three tests (rounded up) identifies 58% of Mankiewicz’s statistical profile in Citizen Kane and 40% of Welles’s profile. Nonetheless, these two figures do not add up to 100%; 2.2% are missing—either lost in the rounding of figures or (perhaps) attributable to Houseman.9

6.4 Preliminary Conclusions

In his study of the seven drafts of the Citizen Kane screenplay (discussed in Chap. 2), Robert Carringer does not consider the preliminary discussions that took place before Mankiewicz began writing the first draft at Victorville. Carringer notes that "certain sections of the script were close to their final form at Victorville. Principally these are the beginning [CK1] and end [CK13], the newsreel [CK2], the projection room sequence [CK3], the first visit to Susan [CK4], and Colorado [CK6]" (Carringer [1978], 399). Here, Carringer implies that Mankiewicz wrote these scenes, for they are in the first draft, which he assumes was written by Mankiewicz. However, the statistical analysis of the plus and minus groups carried out in this chapter emphatically attributes the beginning, the projection room, and the end (CK1, CK3, and CK13) to Welles, with the newsreel (CK2) as an anomaly. The analysis also emphatically attributes scenes CK5 (Thatcher library), CK10 (Leland's flashback), CK11 (second visit to Susan Alexander Kane), and CK12 (Susan's flashback) to Mankiewicz. Furthermore, Mankiewicz's profile is dominant in CK4 (first visit to Susan Alexander Kane), CK6 (Flashback to Colorado), CK7 (Thompson's visit to Bernstein), and CK8 (Bernstein's flashback).

8 The difference between Mankiewicz's whole screenplay analysis (62.09%) and the other two analyses (54.99% and 56.65%) is the most problematic result of this study, especially in light of the similarities between all three analyses of Welles.
9 More likely, because the 13-scene analysis uses small sample sizes (scenes 3, 4, 5, 7, 9, and 11 range from 517 to 946 words), it tends to under-estimate the percentage of both authors' profiles in Citizen Kane.


Patrick McGilligan does consider the preliminary meetings before Victorville: “the script’s opening sequence reads as though it was devised primarily by Welles, who from his earliest talks with Mankiewicz conceived of scenes in terms of the camera as well as of the story” (McGilligan [2015], 633). McGilligan adds: The death of Kane, “Rosebud,” and News on the March—the opening sequences of the film—all this had been agreed on by Welles and Mankiewicz before Victorville. But what happened next would depend a lot on the five or six key characters, who took over the film for long sections of subjective memory. Welles and Mankiewicz had discussed the characters, but the different personalities had to be fleshed out, and their accounts had to fit together and overlap just a little, without too much repetition or contradiction—a point Orson conceded to Mank (McGilligan [2015], 634).

McGilligan’s assertions—that the first scene sounds like Welles and that the opening sequences were agreed upon before Mankiewicz started writing the first draft—are confirmed by my statistical analysis, which provides evidence that all the agreed-upon scenes sound like Welles, together with the closing sequence, which includes Thompson’s summation of Kane’s life. (Welles readily attributed the rosebud gimmick to Mankiewicz.) To return to the current WGA guidelines discussed in Chap. 2, for the first author to receive credit on an original screenplay, his or her contribution must exceed 33% of the final script, and a subsequent writer must contribute at least 50%. If Welles identifies as the first author (of the seventh and final draft of the screenplay), I conclude that the relative frequency tests and confidence intervals representing ratios would attribute authorship to him, but if he is identified as the second author (which is how he is listed in the film’s credits, apparently at his own insistence), I conclude he would not receive screen credit. But this second conclusion does not diminish Welles’s herculean task of writing several scenes (some of which—like the Prologue—Mankiewicz integrated into his first draft), restructuring Mankiewicz’s initial draft, rewriting many scenes (especially the early flashbacks), deleting many others, drafting a small number of additional actions and events, and adding dramatic transitions between scenes, all of which reinforce the views of Richard Meryman, Charles Lederer, Richard Barr, Simon Callow, and Kenneth Tynan (discussed in Chap. 2) that Welles was an accomplished editor but not necessarily a talented writer. But this study has demonstrated that editing a coauthor’s prose does not transform its authorial identity—its stylometric fingerprint—unless the editing and rewriting are extensive, as is the case with scene 9 (although, even then, Welles’s extensive editing, which McGilligan discusses only marginally, modified the graphs). Mankiewicz’s claim that the final draft of Citizen Kane contains a substantial amount of his writing is justified by the three statistical analyses of the screenplay carried out in this chapter, for his authorial profile is prominent in the screenplay’s central scenes. However, his contribution does not (as he claimed) add up to 99% or 98% but corresponds to a more modest 58%.

Appendix

Feature          Mank %   Welles %   Citizen Kane % (frequency)
1. Miss          0.455    0.040      0.081 (21)
2. the room      0.128    0.015      0.093 (24)
3. at her        0.138    0.020      0.039 (10)
4. there is      0.150    0.025      0.093 (24)
5. because       0.120    0.023      0.070 (18)
6. pause/s       0.170    0.033      0.070 (18)
7. mind          0.143    0.030      0.039 (10)
8. door          0.320    0.068      0.186 (48)
9. INT.          0.275    0.058      0.271 (70)
10. dissolve     0.183    0.040      0.492 (127)
11. smiles       0.118    0.033      0.035 (9)
12. day          0.198    0.058      0.232 (60)
13. lk           0.066    0.020      0.040 (59)
14. anything     0.098    0.033      0.143 (37)
15. right        0.295    0.100      0.221 (57)
16. looks        0.273    0.098      0.221 (57)
17. going to     0.213    0.080      0.186 (48)
18. (s           0.067    0.028      0.048 (71)
19. er,          0.066    0.028      0.041 (60)
20. if_          0.087    0.037      0.046 (67)
21. ll,          0.051    0.022      0.024 (35)
22. a moment     0.160    0.070      0.039 (10)
Sum:             3.774    0.959      2.706 (940)
Correlation (r)  0.428    0.481
Cohen's d        0.451 (Mank/CK)     0.968 (Welles/CK)

Fig. 6.13 The plus group relative frequencies of Mankiewicz and Welles compared to Citizen Kane (whole screenplay)


Feature          Mank %   Welles %   Citizen Kane % (frequency)
23. fr           0.086    0.142      0.123 (181)
24. ame          0.046    0.078      0.111 (163)
25. ie           0.133    0.227      0.152 (223)
26. ra           0.169    0.289      0.300 (440)
27. ov           0.061    0.106      0.110 (161)
28. in the       0.200    0.345      0.329 (85)
29. j            0.125    0.217      0.133 (196)
30. e_b          0.090    0.155      0.114 (167)
31. little       0.125    0.258      0.155 (40)
32. qui          0.024    0.050      0.068 (100)
33. where        0.070    0.150      0.085 (22)
34. some         0.108    0.233      0.112 (29)
35. act          0.026    0.056      0.029 (43)
36. last         0.055    0.128      0.081 (21)
37. enc          0.029    0.073      0.033 (49)
38. man          0.075    0.203      0.174 (45)
39. another      0.070    0.225      0.077 (20)
40. yes          0.030    0.125      0.155 (40)
41. silence      0.043    0.198      0.039 (10)
42. colon        0.011    0.15       0.045 (66)
43. old          0.020    0.278      0.112 (29)
44. ellipsis     0.021    0.504      0.028 (41)
Sum:             1.617    4.190      2.565 (2171)
Correlation (r)  0.842    0.341
Cohen's d        0.654 (Mank/CK)     0.795 (Welles/CK)

Fig. 6.14 The minus group relative frequencies of Mankiewicz and Welles compared to Citizen Kane (whole screenplay)
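The correlation and effect-size rows in Figs. 6.13 and 6.14 summarize how closely each author's feature frequencies track Citizen Kane's. A minimal sketch of the two statistics is given below, using the pooled-standard-deviation form of Cohen's d (Chap. 4 defines the exact variant used in this study, which may differ in detail) and a deliberately truncated slice of the Fig. 6.13 values.

from statistics import mean, stdev
from math import sqrt

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def cohens_d(xs, ys):
    # Pooled-standard-deviation form; other variants weight the SDs differently.
    pooled_sd = sqrt((stdev(xs) ** 2 + stdev(ys) ** 2) / 2)
    return (mean(xs) - mean(ys)) / pooled_sd

# First five plus-group features from Fig. 6.13 (truncated for illustration).
mank = [0.455, 0.128, 0.138, 0.150, 0.120]
kane = [0.081, 0.093, 0.039, 0.093, 0.070]
print(round(pearson_r(mank, kane), 3), round(cohens_d(mank, kane), 3))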

Feature          CK / Mank   95% C.I. [lower, upper]
1. Miss          0.18        [0.11, 0.28]
2. the room      0.73        [0.45, 1.19]
3. at her        0.28        [0.14, 0.55]
4. there is      0.62        [0.39, 1.00]
5. because       0.58        [0.34, 1.00]
6. pause/s       0.41        [0.24, 0.69]
7. mind          0.27        [0.14, 0.53]
8. door          0.58        [0.42, 0.81]
9. INT.          0.99        [0.73, 1.34]
10. dissolve     2.69        [2.02, 3.59]
11. smiles       0.30        [0.15, 0.61]
12. day          1.17        [0.84, 1.64]
13. lk           0.61        [0.45, 0.82]
14. anything     1.46        [0.93, 2.29]
15. right        0.75        [0.55, 1.03]
16. looks        0.81        [0.59, 1.12]
17. going to     0.87        [0.61, 1.24]
18. (s           0.72        [0.54, 0.96]
19. er,          0.62        [0.46, 0.84]
20. if_          0.52        [0.39, 0.69]
21. ll,          0.47        [0.32, 0.69]
22. a moment     0.24        [0.12, 0.47]
23. fr           1.43        [1.17, 1.75]
24. ame          2.41        [1.88, 3.09]
25. ie           1.14        [0.96, 1.36]
26. ra           1.78        [1.55, 2.04]
27. ov           1.80        [1.43, 2.27]
28. in the       1.65        [1.21, 2.23]
29. j            1.06        [0.89, 1.28]
30. e_b          1.27        [1.03, 1.56]
31. little       1.24        [0.82, 1.88]
32. qui          2.83        [2.03, 3.96]
33. where        1.21        [0.69, 2.12]
34. some         1.04        [0.65, 1.66]
35. act          1.12        [0.75, 1.66]
36. last         1.47        [0.81, 2.68]
37. enc          1.14        [0.78, 1.65]
38. man          2.32        [1.46, 3.68]
39. another      1.10        [0.62, 1.95]
40. yes          5.17        [2.71, 9.85]
41. silence      0.91        [0.42, 1.98]
42. colon        4.09        [2.56, 6.53]
43. old          5.60        [2.56, 12.25]
44. ellipsis     1.33        [0.88, 2.02]

Fig. 6.15 CK distinctiveness ratios: Mankiewicz compared to Citizen Kane


Feature          CK / Welles   95% C.I. [lower, upper]
1. Miss          2.03          [1.06, 3.89]
2. the room      6.19          [2.53, 15.14]
3. at her        1.94          [0.77, 4.91]
4. there is      3.72          [1.78, 7.78]
5. because       3.03          [1.36, 6.74]
6. pause/s       2.11          [1.03, 4.31]
7. mind          1.29          [0.56, 2.99]
8. door          2.73          [1.70, 4.37]
9. INT.          4.67          [2.92, 7.48]
10. dissolve     12.29         [7.31, 20.67]
11. smiles       1.06          [0.45, 2.48]
12. day          4.00          [2.47, 6.47]
13. lk           2.01          [1.36, 2.98]
14. anything     4.34          [2.31, 8.16]
15. right        2.21          [1.48, 3.31]
16. looks        2.25          [1.50, 3.38]
17. going to     2.32          [1.48, 3.63]
18. (s           1.73          [1.23, 2.43]
19. er,          1.46          [1.02, 2.08]
20. if_          1.23          [0.89, 1.70]
21. ll,          1.08          [0.70, 1.67]
22. a moment     0.55          [0.27, 1.13]
23. fr           0.87          [0.72, 1.04]
24. ame          1.42          [1.15, 1.76]
25. ie           0.67          [0.57, 0.78]
26. ra           1.04          [0.92, 1.17]
27. ov           1.04          [0.85, 1.27]
28. in the       0.95          [0.73, 1.24]
29. j            0.61          [0.52, 0.72]
30. e_b          0.74          [0.61, 0.89]
31. little       0.60          [0.42, 0.86]
32. qui          1.36          [1.04, 1.78]
33. where        0.57          [0.35, 0.93]
34. some         0.48          [0.32, 0.73]
35. act          0.52          [0.37, 0.74]
36. last         0.63          [0.38, 1.05]
37. enc          0.45          [0.33, 0.62]
38. man          0.86          [0.60, 1.24]
39. another      0.34          [0.21, 0.55]
40. yes          1.24          [0.82, 1.88]
41. silence      0.20          [0.10, 0.39]
42. colon        0.30          [0.23, 0.39]
43. old          0.40          [0.27, 0.60]
44. ellipsis     0.06          [0.04, 0.08]

Fig. 6.16 CK distinctiveness ratios: Welles compared to Citizen Kane
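One standard way to attach a 95% confidence interval to a ratio of two occurrence rates is the log-ratio normal approximation sketched below; it reproduces the tabulated interval for 'dissolve' to two decimal places, although Chap. 4 remains the authority on the interval construction actually used in this study. The Welles count of 16 is inferred from his 0.040% rate over a 40,000-word sample, and 25,841 is the Citizen Kane word total reported in Chap. 7.

from math import log, exp, sqrt

def ratio_ci(count_a, words_a, count_b, words_b, z=1.96):
    # Ratio of two occurrence rates, with a normal interval on its logarithm.
    ratio = (count_a / words_a) / (count_b / words_b)
    se = sqrt(1 / count_a + 1 / count_b)   # SE of log(ratio) for Poisson counts
    return (round(ratio, 2),
            round(exp(log(ratio) - z * se), 2),
            round(exp(log(ratio) + z * se), 2))

# 'dissolve': 127 occurrences in Citizen Kane (25,841 words) against an
# inferred 16 occurrences in Welles's 40,000-word sample (0.040%).
print(ratio_ci(127, 25841, 16, 40000))   # (12.29, 7.31, 20.67)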


Segment   Citizen Kane plus group   Mankiewicz (3.774) / Citizen Kane difference   Welles (0.959) / Citizen Kane difference
1         0.949                     2.825                                          0.010
2         2.669                     1.105                                          1.710
3         2.165                     1.609                                          1.206
4         2.871                     0.903                                          1.912
5         3.327                     0.447                                          2.368
6         3.549                     0.225                                          2.590
7         1.971                     1.803                                          1.012
SUM:      17.501                    8.917                                          10.808

Fig. 6.17 The relative frequencies of the plus group of Citizen Kane (seven segments) and their differences from Mankiewicz and Welles

Segment   Citizen Kane minus group   Mankiewicz (1.617) / Citizen Kane difference   Welles (4.190) / Citizen Kane difference
1         3.149                      1.53                                           1.04
2         2.705                      1.09                                           1.49
3         2.246                      0.63                                           1.94
4         2.289                      0.67                                           1.90
5         2.294                      0.68                                           1.90
6         2.203                      0.59                                           1.99
7         3.652                      2.04                                           0.54
SUM:      18.538                     7.23                                           10.80

Fig. 6.18 The relative frequencies of the minus group of Citizen Kane (seven segments) and their differences from Mankiewicz and Welles


Citizen Kane Scenes   Counts: words / letters / n-grams   Scene Description
CK1                   939 / 4340 / 5485                   Prologue
CK2                   2736 / 13,085 / 16,865              News on the March
CK3                   643 / 2673 / 3615                   Projection Room
CK4                   652 / 2821 / 3756                   Susan Alexander Kane (framing story)
CK5                   520 / 2374 / 3044                   Thatcher Library (framing story)
CK6                   2167 / 9195 / 12219                 Flashback: Stories embedded in CK5
CK7                   817 / 3307 / 4483                   Bernstein's Office (framing story)
CK8                   4883 / 21172 / 27935                Flashback: Stories embedded in CK7
CK9                   946 / 3858 / 5116                   Leland (framing story)
CK10                  6215 / 26041 / 34620                Flashback: Stories embedded in CK9
CK11                  517 / 2115 / 2789                   Susan Alexander Kane (framing story)
CK12                  2792 / 11754 / 15602                Flashback: Stories embedded in CK11
CK13                  2014 / 8706 / 11388                 Final scene

Fig. 6.19 Citizen Kane divided into its 13 major scenes

Scene   Citizen Kane plus group   Mankiewicz (3.774) / Citizen Kane difference   Welles (0.959) / Citizen Kane difference
CK1     1.740                     2.03                                           0.78
CK2     0.655                     3.12                                           0.30
CK3     2.143                     1.63                                           1.18
CK4     3.540                     0.23                                           2.58
CK5     3.945                     0.17                                           2.99
CK6     2.561                     1.21                                           1.60
CK7     2.961                     0.81                                           2.00
CK8     2.469                     1.31                                           1.51
CK9     2.290                     1.48                                           1.33
CK10    2.993                     0.78                                           2.03
CK11    2.923                     0.85                                           1.96
CK12    3.549                     0.23                                           2.59
CK13    2.176                     1.60                                           1.22
SUM:    33.945                    15.460                                         22.09

Fig. 6.20 The relative frequencies of the plus group of Citizen Kane (13 Scenes) and their differences from Mankiewicz and Welles


Scene   Citizen Kane minus group   Mankiewicz (1.617) / Citizen Kane difference   Welles (4.190) / Citizen Kane difference
CK1     3.646                      2.03                                           0.54
CK2     2.991                      1.37                                           1.20
CK3     3.833                      2.22                                           0.36
CK4     2.632                      1.01                                           1.56
CK5     2.337                      0.72                                           1.85
CK6     2.481                      0.86                                           1.71
CK7     2.539                      0.92                                           1.65
CK8     2.183                      0.57                                           2.01
CK9     2.633                      1.02                                           1.56
CK10    2.285                      0.67                                           1.90
CK11    2.365                      0.75                                           1.82
CK12    1.991                      0.37                                           2.20
CK13    3.777                      2.16                                           0.41
SUM:    35.693                     14.67                                          18.78

Fig. 6.21 The relative frequencies of the minus group of Citizen Kane (13 Scenes) and their differences from Mankiewicz and Welles

References

Carringer, Robert L. 1978. The Scripts of Citizen Kane. Critical Inquiry 5 (2): 369–400.
Jackson, MacDonald P. 2003. Defining Shakespeare: Pericles as Test Case. Oxford: Oxford University Press.
Kael, Pauline, Herman Mankiewicz, and Orson Welles. 1971. The Citizen Kane Book. Boston: Little, Brown.
Kenny, Anthony. 2016. The Aristotelian Ethics: A Study of the Relationship between the Eudemian and Nicomachean Ethics of Aristotle, second edition. Oxford: Oxford University Press.
Ledger, Gerard, and Thomas Merriam. 1994. Shakespeare, Fletcher, and the Two Noble Kinsmen. Literary and Linguistic Computing 9 (3): 235–48.
McGilligan, Patrick. 2015. Young Orson: The Years of Luck and Genius on the Path to Citizen Kane. New York: HarperCollins.

Chapter 7

Comparing Mankiewicz and Welles to the Citizen Kane Screenplay (2): Cluster Analysis, Type/Token Ratios, Sentence Length, and Linguistic Inquiry and Word Count (LIWC)

In this chapter, I present further statistical comparisons between Mankiewicz and Welles on the one hand and Citizen Kane on the other: sentence length, cluster analysis, type/token ratios, analysis of screenplays not authored by Mankiewicz or Welles (His Girl Friday and All the President's Men), analysis of other texts known to be written by Mankiewicz and Welles (Mankiewicz's dialogue from the film Man of the World (1931) and extracts from Welles's memo on Touch of Evil), and, finally, results from the Linguistic Inquiry and Word Count (LIWC) software.

7.1 Sentence Length

What we discover when comparing sentence lengths (Fig. 7.8 in the Appendix of this chapter) is that the average length of a sentence in Citizen Kane is higher than in both Mankiewicz and Welles but is closer to Mankiewicz. The median sentence length, standard deviation of sentence length, and coefficient of variation follow a similar pattern: they are higher in Citizen Kane than in both authors but closer to Mankiewicz. Overall, the differences between Mankiewicz and Welles in relation to Citizen Kane are marginal (visualized in Fig. 7.9 in the Appendix), demonstrating once again that sentence length is unreliable as a mark of authorship. Nonetheless, the graph does reveal a pattern: in terms of short (one- and two-word) sentences, Citizen Kane is similar to Welles, but then it fluctuates above and below both authors. After the 14-word sentence mark, it displays the highest frequencies, always above Welles and (with a few exceptions) above Mankiewicz as well. This graph reinforces the results presented in Fig. 7.8, that sentences in Citizen Kane are marginally longer than in Mankiewicz and in Welles.
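A sketch of the statistics reported in Fig. 7.8 is given below. It uses a naive sentence splitter (abbreviations and screenplay formatting would need extra handling), and the file name is a placeholder rather than an actual file from this study.

import re
from statistics import mean, median, stdev

def sentence_length_stats(text):
    # Naive split on ., !, and ?; abbreviations and scene headings are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    avg = mean(lengths)
    sd = stdev(lengths)
    return {
        "sentences": len(lengths),
        "mean": round(avg, 2),
        "median": median(lengths),
        "standard deviation": round(sd, 2),
        "coefficient of variation %": round(100 * sd / avg, 2),
    }

print(sentence_length_stats(
    open("citizen_kane_screenplay.txt", encoding="utf-8").read()))  # hypothetical file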


7.2 Cluster Analysis

The cluster graph is generated from the 40,000-word Mankiewicz and Welles samples (each divided into 20 smaller samples) and the 13 scenes from Citizen Kane. All 53 files were uploaded to WebStyml and a cluster graph generated (Fig. 7.10 in the Appendix).1 This graph is the same as the cluster graph generated in Chap. 5 (Fig. 5.22 in the Appendix) but with Citizen Kane added. In both graphs, Mankiewicz occupies the left section and Welles the right section. The 13 scenes of Citizen Kane occupy the top, with some scenes close to Mankiewicz and others close to Welles. The distribution of Citizen Kane's 13 scenes in part reinforces the 13-scene analysis carried out in Chap. 6, although several divergences are also evident. In the top right section, near Welles, the Citizen Kane scenes in which Welles's profile is dominant—1, 2, 3, and 13—form a loose group (although 13 is more central on the graph than the other three scenes). All four scenes are closer to Welles than to Mankiewicz, reinforcing the 13-scene analysis carried out in Chap. 6. However, that 13-scene analysis singled out scene 9 (Leland's framing story) because the data show a marked change in the balance between Mankiewicz and Welles, with Welles's statistical profile more prominent than in the scenes before and after it. In the cluster graph, by contrast, scene 9 is the one closest to Mankiewicz and therefore the farthest from Welles, a result contrary to the results presented in Chap. 6. This may point toward the unreliability of cluster graphs when analyzing small sample sizes (for scene 9 comprises a sample of 946 words). The graph groups the scenes set in the film's present together, and the scenes set in the past together, which suggests that the present and past scenes are written differently. The scenes with Thompson interviewing Kane's acquaintances—4, 5, 7, 9, and 11—occupy the top of the graph, spread out in a line. The flashbacks—scenes 6, 8, 10, and 12—are loosely organized in a diagonal line, slightly closer to Mankiewicz than to Welles. Except for correctly placing scenes 1, 2, 3, and 13 close to Welles, the WebStyml software appears to be more adept in grouping the 13 scenes of Citizen Kane into meaningful clusters (scenes set in the present/set in the past) than it is in relating those scenes to Mankiewicz and Welles.

1 To create the clusters, the WebStyml software divides the screenplay into 2000-word segments using different variables than those used in the rest of this study: https://ws.clarin-pl.eu/webstyml.shtml?en
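The general shape of such an analysis can be sketched as agglomerative (Ward) clustering of samples represented by the relative frequencies of their most frequent words. This is a generic stand-in, not the WebStyml procedure, and the sample labels and inputs are placeholders.

import re
from collections import Counter
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

def word_counts(text):
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words), len(words)

def cluster(samples, top_n=100):
    # samples: dict mapping a label (e.g. "CK09", "Mank_07") to its text.
    counts = {label: word_counts(text) for label, text in samples.items()}
    combined = Counter()
    for c, _ in counts.values():
        combined.update(c)
    vocab = [w for w, _ in combined.most_common(top_n)]
    # One row per sample: relative frequency (per 100 words) of each vocabulary item.
    matrix = np.array([[100 * c[w] / total for w in vocab]
                       for c, total in counts.values()])
    return linkage(matrix, method="ward")

# dendrogram(cluster(samples), labels=list(samples))  # draw with matplotlib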

7.3 Type/Token Ratios

Chapter 5 reported type/token ratios (including Old English types) for Mankiewicz's and Welles's 40,000-word samples. When comparing these figures to Citizen Kane (whole screenplay) (Fig. 7.11 in the Appendix), we discover that its type/token ratio of 1:7.2 is much closer to Welles's 1:7.5 than to Mankiewicz's ratio of 1:10. However, in terms of Old English types, Mankiewicz's 23% of Old English types is


identical to Citizen Kane’s percentage, whereas Welles’s figure is 3% lower, at 20%. Because the type/token ratio is not a constant but is dependent on the size of the sample, I reduced the 40,000-word samples to the same size as Citizen Kane (I used the first 25,841 words of each sample). Welles’s new ratio comes out at 1:6.3, which no longer matches Citizen Kane’s ratio of 1:7.2, but it is still closer than Mankiewicz’s new ratio of 1:8.9. In terms of Old English word types, both authors’ percentages rise by 2%, which places Welles 1% closer to Citizen Kane than Mankiewicz. Overall, the type/token ratio marginally favors Welles, while the results for the Old English word types are too similar to yield any significant results.

7.4 His Girl Friday and All the President's Men

This section compares His Girl Friday and All the President's Men (20,000-word samples) to Mankiewicz's and Welles's statistical profiles in terms of relative frequencies and distinctiveness ratios. There is no need to divide the screenplays into segments because we are not attempting to distinguish between authors in the screenplays but instead setting out to determine if the statistical tests sufficiently distinguish between Mankiewicz, Welles, and screenplays they did not write. This section draws attention to the distinction between characteristic and distinctive variables, which I discussed at the beginning of Chap. 4. The statistical profiles of Mankiewicz and Welles are based on the variables that distinguish the two authors, although they are not necessarily characteristic of that author. In other words, they may not necessarily distinguish Mankiewicz and Welles from other authors. The following comparisons are therefore tentative.

Firstly, His Girl Friday. Figure 7.1 presents the sum of the relative frequencies and distinctiveness ratios of His Girl Friday (whole screenplay) in relation to Mankiewicz and Welles. In the plus group, Mankiewicz diverges 1.646 from His Girl Friday (75% difference) and Welles diverges 1.169 (54.93%) in absolute terms. In the minus group, Mankiewicz diverges 0.691 (29.94%) and Welles diverges 1.882 (81.54%). In terms of the ratios (in the final two columns), Mankiewicz's plus value (0.56) and Welles's plus and minus values (2.22 and 0.55 respectively) fall outside Ellegård's [0.7, 1.5] ratio interval, which indicates different authorship; only Mankiewicz's minus value (1.43) falls just inside it. The statistical profiles of Mankiewicz and Welles are therefore sufficiently distinctive to distinguish their authorship from His Girl Friday. In both plus and minus groups, Welles is slightly more distinct from His Girl Friday than Mankiewicz. However, it is evident that only an author's high-frequency words (represented in this study as Mankiewicz's plus words and Welles's minus words) sufficiently distinguish them from His Girl Friday (Mankiewicz: 75% difference, Welles: 81.54% difference). This suggests that high-frequency words are more characteristic of an author.

          Mankiewicz   Welles   His Girl Friday   His Girl Friday / Mankiewicz   His Girl Friday / Welles
+ Group   3.774        0.959    2.128             0.56                           2.22
– Group   1.617        4.190    2.308             1.43                           0.55

Fig. 7.1 The sum of relative frequencies and distinctiveness ratios of His Girl Friday (Whole Screenplay) in relation to Mankiewicz and Welles

Secondly, All the President's Men (Fig. 7.2). In the plus group, compared to All the President's Men, Mankiewicz diverges 2.286 (153.63% difference) and Welles diverges 0.529 (35.55%) in absolute terms. In the minus group, Mankiewicz diverges 0.769 (32.23%) and Welles diverges 1.804 (75.61%). Mankiewicz's 153.63% divergence is caused by the difference between him and All the President's Men (2.286) being larger than the sum of relative frequencies in All the President's Men (1.488). In terms of the ratios, Mankiewicz's plus value (0.39) and Welles's plus and minus values (1.55 and 0.57 respectively) fall outside Ellegård's [0.7, 1.5] ratio interval; only Mankiewicz's minus value (1.48) falls just inside it. The distinctiveness ratio manages to distinguish All the President's Men from Mankiewicz's and Welles's statistical profiles by positioning them outside the [0.7, 1.5] interval. As with the results for His Girl Friday, it is evident that only Mankiewicz's and Welles's high-frequency words sufficiently distinguish them from All the President's Men.

          Mankiewicz   Welles   All the President's Men (APM)   APM / Mankiewicz   APM / Welles
+ Group   3.774        0.959    1.488                           0.39               1.55
– Group   1.617        4.190    2.386                           1.48               0.57

Fig. 7.2 The sum of relative frequencies and distinctiveness ratios of All the President's Men (Whole Screenplay) in relation to Mankiewicz and Welles

It is interesting to see that the plus and minus values for His Girl Friday are similar to each other, as are the values for All the President's Men. Compare these values to Mankiewicz and Welles in the same tables, where the differences between the plus and minus values are large. These results are encouraging because the tests carried out on the control group of Mankiewicz's and Welles's screenplays in Chap. 5 were designed to find the discriminant variables that maximally differentiate Mankiewicz from Welles. This is why the differences between their plus and minus values are large. The tests were not designed to identify the distinctive variables in His Girl Friday or All the President's Men, which is why the differences between their plus and minus values are smaller than the Mankiewicz and Welles differences.

7.5 Mankiewicz’s Man of the World Dialogue and Welles’s Touch of Evil Memo In this section, I analyze additional writing by Mankiewicz and Welles not used in any previous tests in this study to ensure that the statistical tests match the profiles of these two authors to texts known to be written by them. Based on availability, I

7.6  Linguistic Inquiry and Word Count (LIWC)

149

Mankiewicz Welles Welles’s Memo / Memo / Memo Mankiewicz Welles + Group

3.774

0.959

0.881

0.23

0.92

– Group

1.617

4.190

3.101

1.92

0.74

Fig. 7.3  Mankiewicz and Welles compared to Welles’s Touch of Evil Memo Mankiewicz

Welles

Man of the World

Man of the World / Mankiewicz

Man of the World / Welles

+ Group

3.774

0.959

1.634

0.43

1.70

– Group

1.617

4.190

2.771

1.71

0.66

Fig. 7.4  Mankiewicz and Welles compared to Mankiewicz’s Man of the World Dialogue

have chosen Mankiewicz’s dialogue from the film Man of the World (8415 words, 43,039 n-grams)—because it is one of the few additional texts available—plus extracts from Welles’s long memo from 1957 on Touch of Evil (10,726 words; 60,478 byte-level n-grams)—which I used because it is readily available in electronic form.2 Welles’s long memo to Edward I. Muhl, vice president in charge of production at Universal-International Pictures (dated December 5, 1957), has become a legendary document in which Welles presented reediting instructions for Touch of Evil (the memo is detailed because Welles was barred from the editing suite). The expectation, of course, is that the memo is a close match with Welles’s statistical profile and does not match Mankiewicz’s profile. Figure 7.3 presents the results: Welles’s plus group is a close match (0.92), and the minus group (0.74) falls just inside the [0.7, 1.5] interval. In contrast, neither of Mankiewicz’s groups comes close to the interval: his plus group is 0.23, and his minus group is 1.92. With Mankiewicz’s dialogue from Man of the World, we would expect a close match with Mankiewicz’s profile and no match with Welles’s profile. Figure  7.4 presents the results, which are inconclusive because no author in either the plus or minus group falls into the [0.7, 1.5] interval. This may suggest that dialogue by itself is not suitable as a sample.

7.6 Linguistic Inquiry and Word Count (LIWC)

In this final analysis, I separately uploaded the Mankiewicz and Welles 40,000-word samples and the Citizen Kane screenplay to the software program LIWC (Linguistic Inquiry and Word Count) developed by James Pennebaker, Ryan


L.  Boyd, Kayla Jordan, and Kate Blackburn.3 LIWC measures and quantifies 92 linguistic features of texts, including grammatical categories such as pronouns, verbs, and function words, together with punctuation, informal expressions, and words expressing positive or negative sentiment. The software also identifies different types of vocabulary related to social, cognitive, psychological, and biological processes, and personal concerns (such as work, home, leisure, and money). Pennebaker and his colleagues call each of the 92 linguistic categories specialized predefined dictionaries, comprising: 4 summary language variables (analytical thinking, clout, authenticity, and emotional tone), 3 general descriptor categories (words per sentence, percent of target words captured by the dictionary, and percent of words in the text that are longer than six letters), 21 standard linguistic dimensions (e.g., percentage of words in the text that are pronouns, articles, auxiliary verbs, etc.), 41 word categories tapping psychological constructs (e.g., affect, cognition, biological processes, drives), 6 personal concern categories (e.g., work, home, leisure activities), 5 informal language markers (assents, fillers, swear words, netspeak [and nonfluencies]), and 12 punctuation categories (periods, commas, etc.) (Pennebaker et  al. [2015], 2).

Each dictionary contains a closed predefined vocabulary. LIWC scores a text from 0 to 100, which represents the relative frequency at which terms from a dictionary appear in a text. Pennebaker adds that the four summary variables “are the only nontransparent dimensions in the LIWC2015 output” (Pennebaker et  al. (2015), 6), which is why they need to be used with caution. In its analysis of text, the LIWC software calculates the relative frequency of the linguistic features in each category. Running these tests risks duplicating the results in the previous chapter, although LIWC may identify some linguistic features overlooked in the previous tests (sentiment, vocabulary related to social, cognitive, psychological, and biological processes, etc.). Additionally, LIWC is a useful piece of software because Pennebaker and his colleagues interpret these categories in terms of the psychological meaning of words. Whereas the relative frequencies are descriptive, the psychological dimension of their theory is more speculative (that is, inferential). For example, Pennebaker argues that the high frequency of articles (a, an, the), prepositions (by, to, with, at, above, etc.), and nouns is associated with concrete and analytical writing, for this group of words—which he calls the noun cluster—creates formal and precise descriptions of specific objects, events, and plans (Pennebaker 2011, 70–72). Furthermore, the related function words comprising prepositions and conjunctions (and, but, whereas, etc.) are associated with cognitive complexity, for several propositions are conjoined in the same sentence. The opposite of this concrete and analytical writing style is an informal and personal type of writing, dominated by a high frequency of pronouns, verbs, auxiliary verbs (is, have, do, etc.), and hedges. For Pennebaker, this informal style—which he calls the pronoun-verb cluster—reflects a dynamic storytelling mode of language.

3 Linguistic Inquiry and Word Count (LIWC): https://www.liwc.app/
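The scoring principle can be sketched in a few lines: count what share of a text's word tokens belongs to each category's word list. The tiny word lists below are illustrative stand-ins drawn from the examples quoted later in this chapter, not LIWC's actual dictionaries, which number in the hundreds of entries per category and also use stemmed (wildcard) entries; the file name is a placeholder.

import re

# Illustrative stand-ins for two LIWC-style dictionaries (not the real LIWC word lists).
DICTIONARIES = {
    "home":    {"carpet", "curtain", "kitchen", "domestic", "landlord"},
    "leisure": {"arts", "beach", "cocktail", "museum", "poker", "sport"},
}

def category_scores(text):
    # Percentage of word tokens that belong to each category's word list.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {category: round(100 * sum(w in vocab for w in words) / total, 2)
            for category, vocab in DICTIONARIES.items()}

print(category_scores(
    open("citizen_kane_screenplay.txt", encoding="utf-8").read()))  # hypothetical file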


7.6.1 LIWC Results

The relative frequency of the 92 linguistic features of each file creates three LIWC profiles—of Mankiewicz, of Welles, and of Citizen Kane. To qualify as a mark of authorship, the LIWC analysis must follow the tests carried out in Chaps. 5 and 6: a linguistic feature must have different relative frequencies in Mankiewicz and Welles, and it must have a similar relative frequency in one of the authors and Citizen Kane. Mankiewicz's and Welles's similarity to Citizen Kane is again measured using the distinctiveness ratio. Only categories where at least one value is over 1% and the distinctiveness ratio is below 0.7 or above 1.5 are reported. This leaves just a few categories. The differences between Mankiewicz and Welles are represented in Fig. 7.5.

LIWC Category   LIWC Norm (novels) (%)   Mankiewicz (%)   Welles (%)   Distinctiveness Ratio
home            0.56                     1.19             0.54         2.20
authentic       21.56                    38.41            20.91        1.84
female          1.88                     2.95             1.68         1.76
I               2.63                     3.61             2.28         1.58
leisure         0.56                     1.01             1.48         0.68
male            4.09                     2.53             3.78         0.67

Fig. 7.5 LIWC categories in Mankiewicz and Welles

4 LIWC Analysis: https://www.liwc.app/help/liwc

Pennebaker and his colleagues tested the LIWC 2015 version of the software on six genres of writing (blogs, expressive writing, novels, natural speech, The New York Times, and Twitter) comprising a total of 231 million words (Pennebaker et al. [2015], 10–12). The LIWC mean values represent the norm from a large reference corpus, a baseline presenting a broader context in which to understand the relative frequencies in Mankiewicz, Welles, and Citizen Kane. The second column in Fig. 7.5 presents the baseline score for novels (the closest category to screenwriting).

The authentic category refers to people who tend to speak more spontaneously and do not censor what they say. Rather than sounding detached and guarded, an authentic text (as defined by LIWC) is more personal and self-revealing and usually has a higher frequency of personal pronouns: "Examples of texts that score low in Authenticity include prepared texts (i.e., speeches that were written ahead of time) and texts where a person is being socially cautious. Examples of texts that score high in Authenticity tend to be spontaneous conversations between close friends or political leaders with little-to-no social inhibitions."4 The baseline score for novels

in the authentic category is 21.56% (and natural spontaneous speech is 61.32%). Mankiewicz and Welles differ substantially in this category, with Mankiewicz scoring 38.41% and Welles 20.91%. Welles’s writing is more formal than Mankiewicz’s writing, which is more spontaneous. (In Chap. 2, I reported that Andrew Sarris had made a similar observation when he characterized Mankiewicz’s writing as the more sensual in comparison to the more theatrical Welles.) Nonetheless, Welles’s score is in fact similar to the norm for novels, which indicates that Mankiewicz writes more informally than the norm, although still far from natural spontaneous speech. The LIWC results also show a higher frequency of personal pronouns in Mankiewicz’s writing, which means that the character dialogue he writes in his screenplays conforms marginally more to the dynamic pronoun-verb cluster, that is, it is less detached and guarded than the dialogue of Welles’s characters. The Home dictionary comprises 100 words, including carpet, curtain, kitchen, domestic, and landlord. Its norm in novels is 0.56%. Welles’s result is almost identical to the norm (0.54%), whereas Mankiewicz’s results are much higher (1.19%) and therefore more “homely.” This result indicates vocabulary choice and, more importantly, the topics that characters talk about and the spaces where an author prefers to set the action and events. The leisure category comprises 296 words (such as arts, beach, cocktail, museum, poker, sport) that refer to leisure activities, with a norm in the novel of 0.56%. The results are the inverse of the home category: the leisure category words are dominant in Welles, with a score of 1.48%, against Mankiewicz’s score of 1.01%. Both exceed the norm. The male and female categories (the former with vocabulary items such as boy, brother, his, dad and the latter with girl, sister, her, mum, etc.) are split between Mankiewicz and Welles, with female dominant in Mankiewicz (2.95%, against Welles’s 1.68%) and male dominant in Welles (3.78%, against Mankiewicz’s 2.53%). Welles is again close to the norms, which are 1.88% for female and 4.09% for male. Mankiewicz’s LIWC profile is therefore dominated by the categories authentic, home, and female and exceeds the LIWC norms, whereas Welles’s LIWC profile foregrounds leisure and male and conforms to the LIWC norms. The more important results, however, concern the similarity of each author to Citizen Kane’s LIWC profile. The results, presented in Fig.  7.6 and visualized in Fig.  7.7, reveal that Citizen Kane is closer to Mankiewicz than to Welles. This is because the ratio differences between Mankiewicz and Citizen Kane are small, with five categories located inside the [0.7, 1.5] confidence interval, whereas Welles only has two categories inside the interval (I and male). Furthermore, both of Welles’s categories overlap with Mankiewicz inside the interval, ruling them out as distinctive markers. Moreover, with the home, authentic, and leisure categories, Mankiewicz’s ratios are inside the interval and Welles’s ratios are outside. (The opposite does not occur— there are no ratios where Welles is inside the interval and Mankiewicz outside.) In terms of male and female, it is interesting to see both Mankiewicz and Welles outside the interval in the female category and both inside in the male category. 
These results mean that Mankiewicz is close to (and Welles is distant from) three of Citizen Kane's six category values (home, authentic, leisure) and that the vocabulary of Citizen Kane contains few female words (as defined by the LIWC norm).
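
To make the ratio test concrete, the following is a minimal sketch (not the LIWC software itself) of how the CK/author ratios reported in Fig. 7.6 can be recomputed from the category percentages and checked against the [0.7, 1.5] interval used in this study. The category values are taken from Fig. 7.6; the function and variable names are illustrative.

```python
# Sketch: compute Citizen Kane / author ratios for the LIWC category percentages
# in Fig. 7.6 and flag which ratios fall inside the [0.7, 1.5] interval.

LIWC_PERCENTAGES = {
    # category: (Mankiewicz %, Welles %, Citizen Kane %)
    "home":      (1.19, 0.54, 1.14),
    "authentic": (38.41, 20.91, 36.39),
    "female":    (2.95, 1.68, 1.03),
    "I":         (3.61, 2.28, 2.92),
    "leisure":   (1.01, 1.48, 0.78),
    "male":      (2.53, 3.78, 3.03),
}

INTERVAL = (0.7, 1.5)  # similarity interval used throughout this study

def inside(ratio, interval=INTERVAL):
    low, high = interval
    return low <= ratio <= high

for category, (mank, welles, kane) in LIWC_PERCENTAGES.items():
    ck_mank = kane / mank      # CK / Mankiewicz ratio
    ck_welles = kane / welles  # CK / Welles ratio
    print(f"{category:10s}  CK/Mank = {ck_mank:.2f} ({'in' if inside(ck_mank) else 'out'})  "
          f"CK/Welles = {ck_welles:.2f} ({'in' if inside(ck_welles) else 'out'})")
```

Run over the six categories, this reproduces the two ratio columns of Fig. 7.6 and the pattern described above: five Mankiewicz ratios inside the interval, and only two Welles ratios (I and male) inside it.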


LIWC Category    Mankiewicz (%)    Welles (%)    Citizen Kane (%)    CK/Mankiewicz Ratio    CK/Welles Ratio
home             1.19              0.54          1.14                0.96                   2.11
Authentic        38.41             20.91         36.39               0.95                   1.74
female           2.95              1.68          1.03                0.35                   0.61
I                3.61              2.28          2.92                0.81                   1.28
leisure          1.01              1.48          0.78                0.77                   0.53
male             2.53              3.78          3.03                1.20                   0.80

Fig. 7.6  LIWC categories in Mankiewicz, Welles, and Citizen Kane

Fig. 7.7  Visualization of LIWC categories in Mankiewicz, Welles, and Citizen Kane (CK/Mankiewicz and CK/Welles ratios for home, authentic, female, I, leisure, and male, plotted against the interval bounds int− and int+)

7.7 Summary

Several statistical tests carried out in this chapter are inconclusive: sentence length, type/token ratios, and Mankiewicz's dialogue from Man of the World. The cluster analysis in part confirms the results in Chap. 6, but it also contains anomalies and creates a different pattern within the 13 Citizen Kane samples (separating the flashback scenes from the rest of the screenplay). The statistical profiles of Mankiewicz and Welles constructed in Chap. 5 sufficiently distinguish the two authors from screenplays they did not write, although the results are uneven (favoring an author's high-frequency linguistic features) and the differences are modest. These results can be explored further by constructing statistical profiles of the authors of His Girl Friday and All the President's Men and comparing them to Mankiewicz and Welles in order to find their distinctive variables.

In terms of matching writing known to be by Mankiewicz and Welles to their statistical profiles, the Touch of Evil memo was successfully attributed to Welles and sufficiently distinguished from Mankiewicz. However, the same cannot be said of the Man of the World dialogue, which was not attributed to either author. One hypothesis is that screenplay dialogue in itself is not sufficiently distinctive of an author in comparison to scene text.

Finally, the LIWC software favors Mankiewicz over Welles as the dominant author of Citizen Kane, in part confirming the results of Chap. 6; just as importantly, it offers further insights into the stylistic and thematic patterns in their work, as well as in Citizen Kane. All these results present further avenues to explore, from differences between flashback scenes and scenes set in the present, to overlaps between an author's distinctive and characteristic variables, to the differences between a screenplay's dialogue and scene text.5

5  I explore the differences between dialogue and scene text in the Citizen Kane screenplay in Buckland (2023).


Appendix

                                         Mankiewicz    Welles    Citizen Kane
Number of Sentences                      3881          4175      2334
Mean Sentence Length                     10.28         9.40      11.46
Median Sentence Length                   8.00          7.00      9.00
Standard Deviation of Sentence Length    8.77          7.71      9.67
Coefficient of Variation                 85.31%        82.02%    85.17%

Fig. 7.8  Sentence lengths in Mankiewicz, Welles, and Citizen Kane
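
The summary statistics in Fig. 7.8 can be reproduced from raw per-sentence word counts. Below is a minimal sketch, assuming the sentence lengths for one sample are already available as a list of integers; the coefficient of variation is the standard deviation expressed as a percentage of the mean, and the sketch uses the population standard deviation (whether the population or sample version was used for the table is not specified here).

```python
# Sketch: summary statistics for sentence lengths (in words), as in Fig. 7.8.
# `lengths` is assumed to be a list of per-sentence word counts for one sample.
import statistics

def sentence_length_profile(lengths):
    mean = statistics.mean(lengths)
    median = statistics.median(lengths)
    stdev = statistics.pstdev(lengths)   # population standard deviation
    cv = 100 * stdev / mean              # coefficient of variation (%)
    return {
        "Number of Sentences": len(lengths),
        "Mean Sentence Length": round(mean, 2),
        "Median Sentence Length": round(median, 2),
        "Standard Deviation": round(stdev, 2),
        "Coefficient of Variation (%)": round(cv, 2),
    }

# Illustrative toy input only (not the Mankiewicz or Welles data):
print(sentence_length_profile([4, 7, 8, 10, 12, 15, 21]))
```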

Fig. 7.9  Visualization of sentence lengths in Mankiewicz, Welles, and Citizen Kane (percentage of sentences at each sentence length, in words, from 1 to 26)

Fig. 7.10  Cluster graph of Mankiewicz, Welles, and Individual Scenes from Citizen Kane

                         Mank (40,000)   Welles (40,000)   Citizen Kane (25,841)   Mank (25,841)   Welles (25,841)
Types                    3908            5332              3587                    3020            4138
Tokens                   40,000          40,000            25,841                  25,841          25,841
Type / Token Ratio       1:10            1:7.5             1:7.2                   1:8.9           1:6.3
Old English Types        900             1071              842                     754             913
Old English Types (%)    23%             20%               23%                     25%             22%

Fig. 7.11  Type/token ratios in Mankiewicz, Welles, and Citizen Kane
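
A minimal sketch of how type/token figures like those in Fig. 7.11 can be derived is given below, assuming the sample text is available as a plain string. The tokenizer shown (lowercased alphabetic words) is an illustrative simplification, not necessarily the one used to produce the table, and the optional truncation mirrors the equal-size (25,841-token) comparison in the last three columns.

```python
# Sketch: type/token ratio for a text sample, optionally truncated to a fixed
# token count so samples of different lengths can be compared (cf. Fig. 7.11).
import re

def type_token_ratio(text, max_tokens=None):
    tokens = re.findall(r"[a-z']+", text.lower())  # simple illustrative tokenizer
    if max_tokens is not None:
        tokens = tokens[:max_tokens]               # equal-size comparison
    types = set(tokens)
    return len(types), len(tokens), len(tokens) / len(types)

# Toy usage (not the screenplay data):
types, tokens, ratio = type_token_ratio("Rosebud... I always gagged on the silver spoon.")
print(f"{types} types, {tokens} tokens, type/token ratio 1:{ratio:.1f}")
```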

References

Buckland, Warren. 2023. The Motion Picture Screenplay as Data: Quantifying the Stylistic Differences Between Dialogue and Scene Text. In The Palgrave Handbook of Screenwriting Studies: History, Theory and Practice of Screenwriting Research, edited by Rosamund Davies, Paolo Russo, and Claus Tieber. Palgrave Macmillan.
Pennebaker, James W. 2011. The Secret Life of Pronouns: What Our Words Say About Us. New York: Bloomsbury.
Pennebaker, James W., Ryan L. Boyd, Kayla Jordan, and Kate Blackburn. 2015. The Development and Psychometric Properties of LIWC2015. Austin: University of Texas at Austin. https://repositories.lib.utexas.edu/bitstream/handle/2152/31333/LIWC2015_LanguageManual.pdf.

In Conclusion

The statistical reasoning and analysis carried out in this study have followed straightforward quantitative research practices. I created statistical profiles of Mankiewicz and Welles based on linguistic variables that maximally differentiate the two authors. I then compared each author's profile to the same variables in the disputed text, the Citizen Kane screenplay. In comparing the distinctive variables in each author's profile to the same variables in Citizen Kane, it became evident that the procedure demarcating the interval boundary between similarity and difference – for example, to determine if the quantities of linguistic variables that constitute an author's profile are similar to or different from their quantities in the disputed text – needed to be robust and applied in a nonarbitrary way, for it is this boundary that determines what counts as evidence in the inference of authorship.

As in all statistical studies, in my analysis of Mankiewicz, Welles, and Citizen Kane I set out to identify the underlying unity (consistent patterns and trends) in a voluminous amount of data. I sought to define those patterns in terms of theories of style and to construct arguments that attribute stylistic unity to an author. I acknowledged that an inference of authorship cannot be deductively validated, for such an inference extends beyond the available data. To be successful—to establish a link between inputs (quantified linguistic data) and outputs (inference of authorship) in which the output follows from the input—all attributions of authorship must be supported by robust internal evidence.

Furthermore, stylometry fundamentally alters what counts as evidence. On the one hand, the evidence becomes simpler: trace elements that point to the source of that trace. On the other hand, the operations carried out to identify, measure, and justify those data and transform them into evidence become more complicated. The resulting evidence used to infer authorship is not obvious or incontrovertible; it requires statistical reasoning and a precise set of explicitly formulated operations to justify it.

Although stylometry assumes that creating a style is an orderly and regular process that manifests a level of uniqueness that can be analyzed and quantified with precision—which in turn can form the basis of a series of inferences about the authorship of disputed texts—the results of stylometric inquiry can never be established with complete logical certainty; they are always provisional and contingent because 100% statistical certainty is unobtainable and unrealizable, which means that statistics must instead replace certainty with probability estimates from samples. In other words, no number of positive instances will match or equal a deductive entailment. As philosopher Carl Hempel argued, "even extensive testing with favorable results does not establish a hypothesis conclusively, but provides only more or less strong support for it" (Hempel, quoted in Rudolf P. Botha [1973], 36, n69). Like all inductive inferences, attributing authorship is a nondemonstrative inference that cannot be deductively proven. However, nondemonstrative inferences can still be evaluated and assessed.

Following on from the pioneering work of L. Jonathan Cohen and Wesley Salmon, Rudolf P. Botha identified three levels at which nondemonstrative inferences can be evaluated: the level of support, the level of acceptability, and the level of persuasive power (Salmon [1967]; Cohen [1970]; Botha [1973], 39–54). In terms of support, relevant data are measurable via the quantity of positive and negative instances (for example, positive and negative matches between distinctive variables in Welles's authorial profile and the same variables in Citizen Kane) and via the variety of the data (the analysis carried out in this study collected a diverse array of data—including punctuation, byte-level n-grams, vocabulary, plus and minus words, etc.).1

In terms of acceptability, the inference of authorship satisfies one or more of the following: it (i) fits into an already well-established paradigm, (ii) is similar to previous successful studies, (iii) discovers previously unknown facts, (iv) is formulated from a simple (rather than an overly complex) theory, and (v) has significant consequences. The study carried out in this book fits into a well-established paradigm (stylometry) and contributes to a new and emerging paradigm of the Digital Humanities, although it has a limited presence in film studies, which (like many arts and humanities disciplines) expresses a degree of skepticism toward quantification and statistical reasoning. This study discovers new facts in the screenplay of one of the most revered films in the history of cinema and presents a new perspective on the creative contributions of its two authors. This study is also limited to well-established, familiar statistical methods, for an overly complex theory with many qualifications tends to evade or obscure any underlying contradictions in the theory and/or in the phenomena under discussion. And a stylometric analysis of the Citizen Kane screenplay may have significant consequences in reevaluating the careers and reputations of Orson Welles and Herman Mankiewicz.

Finally, persuasive power means the relation of an inference to current opinion, plus the psychological traits (personality) of the researcher and their ability to influence other researchers. However, this third level brings us close to the position this study is attempting to avoid (opinions, impressions, hearsay, rumor, etc.).

1  'Generally speaking, the degree of confirmation of a law on the evidence of a number of confirming experiments should depend not only on the total number of (positive) instances found but also on their variety, i.e. on the way they are distributed among various kinds' (Carnap [1945], 93).


Yet stylometry does not strictly adhere to some of the fundamental assumptions of statistics. Applying statistics to linguistic data presents insurmountable hurdles, especially when that data cannot conform to the fundamental principles of statistics. These include ensuring that (see the sketch after this list for simple checks of two of these assumptions):

• The samples are large.
• The samples are randomly selected.
• The variables display similar variability (homoscedasticity) across samples.
• The variables conform to a normal distribution.
• The variables of interest are independent.
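
A minimal sketch of how the normality and homoscedasticity assumptions might be checked for two samples of sentence lengths is given below. It uses standard SciPy tests and toy data; it is an illustration only, not part of the analysis reported in this book.

```python
# Sketch: checking normality (Shapiro-Wilk) and equality of variances (Levene)
# for two samples of sentence lengths. Toy data, illustrative only.
from scipy import stats

sample_a = [4, 7, 8, 9, 10, 11, 12, 15, 18, 21]   # e.g., sentence lengths, author A
sample_b = [3, 5, 6, 8, 9, 9, 10, 13, 14, 25]     # e.g., sentence lengths, author B

# Shapiro-Wilk: a low p-value suggests the sample departs from a normal distribution.
for name, sample in (("A", sample_a), ("B", sample_b)):
    stat, p = stats.shapiro(sample)
    print(f"Sample {name}: Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")

# Levene's test: a low p-value suggests the two samples do not share similar variance.
stat, p = stats.levene(sample_a, sample_b)
print(f"Levene W = {stat:.3f}, p = {p:.3f}")
```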

All these principles can only be followed under ideal conditions. Choosing data randomly ensures that it is independent, and large sample sizes reduce variability and lead to the formation of narrower and therefore more accurate confidence intervals. In this study, Mankiewicz and Welles are represented by a large sample of 40,000 words each, and the Citizen Kane screenplay is 25,841 words long. However, these samples are also divided into smaller samples, which necessitated the monitoring of variability. Furthermore, the Mankiewicz and Welles 40,000-word samples are of course independent from each other; they derive from two separate populations. But they are not random samples (their selection, which I outlined in Chap. 3, was determined by what screenplays were available at the time the study was carried out).

In statistics, independence means that the occurrence of one event is not affected by the occurrence of another event; there is no connection between the two. For example, rolling a die several times produces independent results because each result will not affect or influence the next one (that the first result was a six does not influence subsequent results). However, we need to distinguish two types of independence: independence between observations and independence between variables.

Independence between observations is essential to stylometric analysis (as with any other type of statistical analysis); it refers to the procedure of data collection, especially when grouping variables into different categories: when repeated observations carried out at different levels (1-gram, 2-gram, etc.) are subsequently grouped together and added up, one needs to ensure that these observations are independent—that is, do not overlap. In other words, data collection should be mutually exclusive, for one needs to avoid double counting the same variables.

Independence between variables is difficult to achieve in linguistics due in part to the inherent characteristics of language, such as sequencing. For example, in a given text, the first sentence may use the past tense. This choice will then influence the second sentence, which (for the purpose of creating a consistent and coherent text) will also most probably be written in the past tense. In other words, the tense of the second sentence is dependent on the first. One solution to overcoming this dependency is to select a smaller sample of random sentences rather than all the sentences consecutively; however, this strategy runs the risk of significantly reducing the sample size and therefore increasing variability. Another solution is to take the dependency of variables in a text as the object of study, treating it (for example) as a Markov chain, in which the probability of an event—such as the choice of a word—depends on the selection of a previous word (the text's "current state"). In other words, probabilities are not limited to independent events, for they also apply to chains of connected events.2

Independence is also complicated by multivariate analysis—the simultaneous study of several variables taken from the same sample. For example, in his famous statistical study of natural selection, the comparative zoologist Hermon Bumpus took eight morphological measurements (wingspan, skull measurements, etc.) and the weight of 49 house sparrows affected by adverse weather conditions. The important point to note is that the study carried out eight independent observations in the sense that they do not overlap (they do not measure the same feature more than once), but the variables themselves are not independent of one another but are correlated (a heavy bird will tend to have a larger skull and a longer wing length, etc.). One objective of multivariate analysis is to investigate dependence among variables. This study of the Citizen Kane screenplay is also a multivariate analysis to the extent that the authorial profiles comprise several (predominately nonoverlapping) variables drawn simultaneously from the Mankiewicz and Welles samples, variables that are nonetheless related to each other.

Dependence between linguistic variables is influenced by multiple external factors, each of which introduces variability into the data, including dialect, time, place, genre conventions, institutional constraints—and authorial style. A study seeking complete independence between all linguistic variables is detrimental to authorship attribution, for "non-independence" is a property of an author: the pattern they impose on language via word repetitions, word preferences, punctuation choices, etc. constitutes the data required in the study of authorial style. In other words, an author is a nonindependent data structure, an external influence that introduces bias and variability into language, and the aim of statistical authorship attribution is to isolate an author's specific pattern of bias and variability.

This study of the Citizen Kane screenplay has therefore aimed to identify nonindependent linguistic bias introduced by one type of external influence—authorship (more specifically, the authorship of Mankiewicz and Welles). Many other external influences mentioned above are shared by both authors: they cowrote the same document in the same context (Hollywood) following the same strict genre conventions of screenplay writing. In other words, they shared the same stable context, whose influence is a fixed constant and can, therefore, be factored out.

Finally, authorship identification is not limited to the arts and humanities; it is also a vital task carried out within a legal setting. From this perspective, the language of the Citizen Kane screenplay and the sample screenplays constitute documentary evidence of authorship. Can this evidence be subjected to the principles of forensic science? Forensics characterizes evidence as (1) trace elements that (2) remain uniform (or invariant/constant) and that (3) point to the source of that trace.3

2  In 1913 A. A. Markov tested out his theory on the distribution of vowels and consonants in Pushkin's Eugene Onegin. He discovered that vowels and consonants are not independent (are not randomly distributed across the text) but have a tendency to alternate (A. A. Markov [2006]).
3  The first principle is called the Exchange Principle; the second is the Uniformitarian Principle; and the third is the Principle of Individuality (or Principle of Difference, for it can only be determined via a comparative analysis). See Houck (ed. [2016], 3–4).
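
By way of illustration of the Markov-chain idea discussed above (and of Markov's own vowel/consonant study described in the footnote), the following minimal sketch estimates first-order transition probabilities between vowels and consonants in a text. It is a toy illustration, not a method used in this study; the function name and the treatment of "y" as a consonant are simplifying assumptions.

```python
# Sketch: first-order Markov chain over the states "vowel" (V) and "consonant" (C),
# estimated from letter-to-letter transitions in a text (cf. Markov's Eugene Onegin study).
from collections import Counter

def vc_transition_probabilities(text):
    letters = [c for c in text.lower() if c.isalpha()]
    states = ["V" if c in "aeiou" else "C" for c in letters]
    counts = Counter(zip(states, states[1:]))        # count V->V, V->C, C->V, C->C
    probs = {}
    for prev in ("V", "C"):
        total = sum(counts[(prev, nxt)] for nxt in ("V", "C"))
        for nxt in ("V", "C"):
            probs[(prev, nxt)] = counts[(prev, nxt)] / total if total else 0.0
    return probs

# Toy usage: if letters were independent, P(V -> C) would simply track the overall
# proportion of consonants; a strong tendency to alternate shows up as P(V -> C)
# and P(C -> V) exceeding those base rates.
for transition, p in vc_transition_probabilities("No trespassing. Rosebud.").items():
    print(transition, round(p, 2))
```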

Forensic data are rarely self-evident; instead, they consist of minute and physical remnants of an activity detectable via specialized tools and methods. Forensic methods must be reliable, repeatable, reproducible, accurate, and relevant. These concepts are of course linked, for data and methods are considered accurate and reliable only if they can be repeated and reproduced. In regard to repeatability, the same group of researchers carries out the same tests and measurements several times using the same data and methods in order to ensure the results are consistent and reliable. Repeating a test or set of observations allows for anomalies to be identified, although systematic errors are more difficult to recognize. Reproducibility is distinct from repeatability in that it refers to the practice of different researchers independent from the original group (usually in a different place and time) carrying out the same tests and measurements using the same data and methods in an attempt to replicate and corroborate the results. Reproducibility aids in determining if the methods are accurate and reliable and if the data are invariant. Finally, the data need to be relevant to the task at hand—to identify the provenance of the trace elements.

Repeatability, reproducibility, and relevance are dependent on the existence of mutually accepted standards and procedures to establish valid criteria for "best practice," together with criteria for the acceptability of the data generated by best practice.4 These standards and procedures constitute a tall order, an ideal that very few hard sciences, let alone a discipline such as stylometry, are able to fulfill—but they remain distant ideals by which to guide research.

Stylometry follows these forensic ideals to the extent that it quantifies the minutiae of language and attempts to recognize patterns and trends within those minutiae in order to identify stylistic trace elements that reflect the character of a writer (as manifest in that writer's set of systematic linguistic habits). The main difference, of course, is that forensics focuses on criminal activity while stylometry analyzes creative activity. Furthermore, the low frequency of variables and variance afflict stylometry more than the hard sciences. However, although stylometry has yet to reach the stage where a data set or a set of methods is validated as standard, frequency counts of n-gram letters and function words (such as prepositions, conjunctions, and auxiliary verbs) come close to being recognized as universal, for they have consistently generated reliable results in authorship attribution studies.5 In an attempt to fulfill the ideal of repeatability and especially reproducibility, the methods in this study were—as far as possible—operationalized by spelling out the tools of measurement and procedures that were followed to generate the data.

4  Carole Chaski (2013) outlines best practice for what she calls forensic computational linguistics.
5  'In several independent studies, it has been demonstrated that function words (defined as the set of the most frequent words of the training set) and letter n-grams are among the most effective stylometric features, though the combination of several feature types usually improves the performance of an attribution model' (Efstathios Stamatatos [2013], 423). Patrick Juola claims to have devised a valid general-purpose algorithm that fulfills the criteria of repeatability, reproducibility, and accuracy. He carried out five separate tests—on vocabulary, word length, character n-grams, most common words, and punctuation (Juola [2020]).

Mosteller and Wallace (1964, 264) argued that a tremendous amount of editing of minor linguistic features such as marker and function words would be required to alter an author's stylometric fingerprint before their frequencies match those of the editor. Just as stylometry cannot quantify complex linguistic features such as homographs, polysemy, metaphor, and sarcasm, neither can it easily quantify an editor's considerable input in reshaping a screenplay. Yet the creative editing and adaptation of another's text was one of Welles's many considerable talents. With Citizen Kane, he followed his earlier working practices: he "adapted" Mankiewicz's initial screenplay (American) in the same way he had creatively adapted classical literature to the radio in the 1930s and Conrad's Heart of Darkness and Nicholas Blake's Smiler with the Knife to the screen when he first arrived in Hollywood in 1939.

I pointed out in Chap. 2 that it is unlikely Welles wrote a complete draft of Citizen Kane before meeting Mankiewicz, for Mankiewicz was hired because Welles did not have time to write a complete screenplay. But this does not rule out Welles writing key scenes in preliminary discussions that took place before Mankiewicz began writing the first draft. The stylometric results presented in Chap. 6 demonstrate that Welles wrote the first draft of the opening three scenes and the final scene, which Mankiewicz inserted into the first draft of the screenplay. In the coauthorship scenarios that Lisa Ede and Andrea Lunsford study (1990), Mankiewicz and Welles conform to scenario 3: after an initial brainstorming meeting, they worked separately and consecutively on the screenplay, but Welles also revised the whole screenplay to make it consistent and coherent. Nonetheless, in terms of the WGA guidelines and copyright law, editing does not in itself constitute a separate independent contribution to a work.

Like all forms of inquiry, stylometry cannot be prolonged indefinitely; its collection and analysis of data is never complete or final. In regard to justifying the results of a stylometric analysis of disputed authorship, Susan Hockey expressed a widespread opinion when she stated that "In only a very few cases have problems of disputed authorship been solved totally and these have been ideally suited to computer analysis. Using such quantitative methods will however accumulate as much evidence as possible and will allow many different tests to be applied systematically" (1980, 130).

References

Botha, Rudolf P. 1973. The Justification of Linguistic Hypotheses: A Study of Nondemonstrative Inference in Transformational Grammar. The Hague: Mouton.
Carnap, Rudolf. 1945. On Inductive Logic. Philosophy of Science 12 (2): 72–97.
Chaski, Carole. 2013. Best Practices and Admissibility of Forensic Author Identification. Journal of Law and Policy 21 (2): 333–76.
Cohen, L. Jonathan. 1970. The Implications of Induction. London: Methuen.
Ede, Lisa, and Andrea Lunsford. 1990. Singular Texts/Plural Authors: Perspectives on Collaborative Writing. Carbondale: Southern Illinois Press.
Hockey, Susan. 1980. A Guide to Computer Applications in the Humanities. London: Duckworth.
Houck, Max M., ed. 2016. Forensic Fingerprints. Amsterdam: Elsevier.
Juola, Patrick. 2020. Verifying Authorship for Forensic Purposes: A Computational Protocol and Its Validation. Forensic Science International. https://doi.org/10.1016/j.forsciint.2021.110824.
Markov, A. A. 2006. An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains. Science in Context 19: 591–600.
Mosteller, Frederick, and David L. Wallace. 1964. Inference and Disputed Authorship: The Federalist. Reading, Mass.: Addison-Wesley.
Salmon, Wesley C. 1967. The Foundations of Scientific Inference. Pittsburgh: University of Pittsburgh Press.
Stamatatos, Efstathios. 2013. On the Robustness of Authorship Attribution Based on Letter n-gram Features. Journal of Law and Policy 21 (2): 421–39.

Index

A Adair, Douglass, 52 All the President’s Men, 6, 8, 29, 81, 90, 145, 148, 154 Authorship defined, 3 joint author (legal definition), 15 quantified, 3, 5 See also Coauthorship; Copyright Law; Style; Stylometry; Writers Guild of America B Barr, Richard, 21, 136 Berenson, Bernard, 44 Blatt, Ben, 51 Bogdanovich, Peter, 1–2, 16, 25 Bridgman, P.W., 1, 3, 43 Bunyan, John, 58–59, 86 C Calan-Jageman, Robert, 3 Callow, Simon, 13, 21, 25, 136 Cantril, Hadley, 13–14, 16 Carnap, Rudolf, 158 Carringer, Robert, 5–6, 19, 22–25, 135 Chaski, Carole, 161 Citizen Kane Bernstein (flashback scenes), 135, 142 Bernstein’s office (framing story), 135, 142

external evidence of authorship, 3 Final scene, 131, 133, 142 internal evidence of authorship, 3 Leland (flashback scenes), 131, 134, 135, 142 Leland (framing story), 131, 134, 142 Mr Kane song, 79 “News on the March” sequence, 18, 117, 128–129, 131, 133–134, 136, 142 Place in the Sight and Sound poll, 1 Projection room scene, 131, 133, 135, 142 Prologue, 130–131, 133, 136, 142 Susan (first framing story), 131, 135, 142 Susan (flashback scenes), 131, 133–135, 142 Susan (second framing story), 131, 135, 142 Thatcher (flashback scenes), 131, 142 Thatcher library (framing story), 131, 133–135, 142 Coauthorship, 1–4, 11–27, 47, 51–53, 122, 127, 130–136 Copyright Law, 5, 14–15, 20, 24–55, 162 Cumming, Geoff, 3 D Davin, Anna, 44 De Imitatione Christi, 5, 57, 61–62 Dekker, Thomas, 58, 63 Digital Humanities, 5, 158 Dunning, John, 13




E Ede, Lisa, 20, 25, 162 Ellegård, Alvar, 4, 8, 47, 49, 59–61, 66–72, 76, 77, 80–82, 90–91, 93–96, 147, 148 Enkvist, N.E., 4 Ethics (Aristotle), 43, 51, 60

Letters of Junius, The, 5, 8, 59–60, 68–72, 93–94, 96 Linguistic Inquiry and Word Count (LIWC), 8, 48, 80, 145, 149–154 Lundberg, Ferdinand, 21 Lunsford, Andrea, 20, 25, 162

F Federalist papers, The, 5, 43, 49–52, 76, 96 Feeney, F.X., 29 Fletcher, John, 55, 63 Forsyth, Richard, 51 Foster, Don, 5 “Funeral Elegy, A”, 5

M Macaulay, Thomas, 58–59, 86 Macdonald, Ian W., 29 Mank (David Fincher 2020), 1 Mankiewicz, Herman J. 22 distinctive linguistic features, 7, 95–96, 115 Made in Heaven, 6, 29, 38, 86, 89–90 Man of the World, 6, 145, 148–149, 153, 154 vocabulary richness, 86 A Woman’s Secret, 6, 29, 30, 38–39, 86, 89–90 Maras, Steven, 29 “March of Time, The”, 13, 18 Markov chain, 159–160 Martindale, Colin, 51 McBride, Joseph, 1, 16, 25, 33 McGilligan, Patrick, 13–14, 21, 134, 136 McKenzie, Dean, 51 Mercury Theatre Company, 13–15 Merriam, Thomas, 55, 63–64, 118 Meryman, Richard, 1, 11–12, 15–16, 18, 20–21, 136 Morelli, Giovanni, 44 Mortgage on Life (Vicki Baum), 6, 29, 38 Morton, A.Q., 48 Mosteller, Frederick, 45, 47, 49–51, 75–76, 96, 162 Muhl, Edward I., 149

G Ginzburg, Carlo, 44 Godard, Jean-Luc, 2 Goldman, William, 29 H Hempel, Carl, 158 Hickenlooper, George, 29 His Girl Friday, 6, 8, 29, 81, 90, 145, 147–148, 154 Hockey, Susan, 5, 48, 162 Holmes, David I., 5, 48, 51 Hoover, D.L., 48, 55, 84 Houseman, John, 1–2, 13–18, 22–23, 25, 117, 131, 133 Hume, David, 3 J Jackson, MacDonald P., 51–52, 88, 118 Juola, Patrick, 5, 55, 161 K Kael, Pauline, 1–2, 16–18, 22, 25 Kellow, Brian, 17–18 Kenny, Anthony, 45, 51, 60–61, 75–76 Kešelj, Vlado, 55, 82 Klein, Joe, 5 Koch, Howard, 13–14 L Lebo, Harlan, 12 Lederer, Charles, 17, 25, 29, 136 Ledger, Gerard, 46, 47, 53–55, 63–64, 73, 82, 118

N Nelmes, Jill, 29 O Oakes, Michael P., 5 P Pennebaker, James, 149–151 p-hacking, 77 Pierson, Frank, 18 Politique des auteurs, la, 12 Price, Steven, 29 Pseudepigrapha, 5

Index Q Quantitative Index Text Analyzer (software), 76, 77, 82, 83 R Rashomon (Kurosawa 1950), 11 RKO 281 (Benjamin Ross 1999), 1 Rosenbaum, Jonathan, 16, 25 S Salmon, Wesley, 158 Sarris, Andrew, 1, 12, 16–17, 25, 152 Selection bias, 77 Shakespeare, 5, 25 Pericles, 43, 51–52, 88, 118 Two Noble Kinsmen, 55, 63–64, 118 Singh, S., 51 Spielberg, Steven, 11 Stamatatos, Efstathios, 5, 48, 161 Statistical independence, 159–160 Statistics Bayesian probability, 51, 76 coefficient of variation, 56, 84, 88, 92 Cohen’s d, 8, 65, 78, 81, 91, 96, 114, 115, 137, 138 confidence intervals, 3, 7, 8, 66–67, 75, 78, 91, 117–122, 134, 136, 152 correlation, 49, 91, 95–96, 114, 115, 124 descriptive vs. inferential, 6, 45–46 effect size, 3, 7, 8, 45–46, 60, 65–66, 72, 78, 96 frequency, 7, 19, 44, 45, 47, 49, 50, 52–64, 66, 68–69, 71, 76–88, 91–94, 117, 119–120, 123–124, 128, 136, 148, 150–151, 161 mean, 3, 56, 84, 88, 91 New Statistics, The, 3 percentage point vs. percentage difference, 86, 122 proportion, 45, 59 p-value, 8, 64–65, 67, 122 ratio, 7–8, 45–46, 59–60, 65–66, 77–78, 81, 84, 90, 92, 95, 117–136, 147, 148 sample vs. population, 6, 45–46 standard deviation, 3, 56, 67, 81, 84, 88, 91 statistical process control, 67 t-distribution, 67, 91 Sternberg, Claudia, 29–31 Stubbs, John C., 37, 38 Style, 4, 43–74, 94 Stylometry, 5, 6, 29–30, 43–74

167 fingerprint analogy, 43–45, 50 premises, 45–50, 157–159 relation to forensic science, 160–161 Suber, Howard, 17–18 T Tweedie, Fiona, 4, 51 Tynan, Kenneth, 21, 136 V Vertigo (Hitchcock 1958), 1 Vickers, Brian, 25, 45, 48, 55, 58, 84 W Wallace, David L., 45, 47, 49–51, 75–76, 96, 162 WCopyfind (software), 6, 59, 85, 86 Webster, John, 58 WebStyml (software), 64, 146 Welles, Orson American (first draft of Citizen Kane), 16–19, 22–24, 53, 162 Big Brass Ring, The, 6, 29, 32–33, 80, 82, 83, 86, 89–90 conflict with Houseman, 1–2, 18, 22 22 distinctive linguistic features, 7, 95–96, 115 Heart of Darkness, 18, 162 Magnificent Ambersons, The, 6, 53, 80, 87 Other Side of the Wind, The, 6, 29, 32–33, 82, 86, 87, 89–90 Smiler with a Knife, 18, 162 Touch of Evil, 6, 29, 32, 33, 36–38, 79–80 Touch of Evil memo, 145, 148–149, 154 vocabulary richness, 86 War of the Worlds radio play, 13–14 Wells, H.G., 14 Wilkins, George, 51–52, 88 Williams, C.B., 85 Wollheim, Richard, 44 Writers Guild of America, 5, 11–13, 20, 24–25, 136, 162 Y Yule, G.U., 57–58, 61–62, 85, 86 Z Zipf, George Kingsley, 82, 85