Referential Null Subjects in Early English (Oxford Studies in Diachronic and Historical Linguistics) 9780198808237, 0198808232

This book offers a large-scale quantitative investigation of referential null subjects as they occur in Old, Middle, and


English Pages 272 [264] Year 2019


Table of contents :
Cover
Referential Null Subjects in Early English
Copyright
Contents
Series Preface
Acknowledgements
List of Figures and Tables
Figures
Tables
List of Abbreviations
1: Introduction
1.1 Background
1.1.1 Referential null subjects in Old Germanic
1.1.2 Referential null subjects in Old English
1.2 Contribution of the book
1.3 Scope and definitions
1.4 Data material and method
1.4.1 Data collection
1.4.2 Representativeness
1.4.3 Old English
1.4.4 Later stages of early English
1.5 Outline of the book
2: Referential null subjects in Old English
2.1 Introduction
2.2 Three types of subjectless structure
2.2.1 ‘Imperative-like’ subjunctives
2.2.2 Non-overt subject relatives in naming structures with hatan
2.3 Referential null subjects in Old English
2.3.1 Null subjects in Old English prose: an overview
2.3.2 Null subjects in Old English poetry: an overview
2.4 Is Old English a canonical pro-drop language?
3: Do Anglian dialects of Old English have a partial pro-drop property?
3.1 Introduction
3.2 The dialect-split hypothesis
3.3 An initial assessment of the dialect-split hypothesis
3.4 Null subjects according to dialect
3.4.1 Prose
3.4.2 Poetry
3.5 Null subjects according to genre
3.6 Null subjects according to period
3.6.1 Prose
3.6.2 Poetry
3.7 Null subjects according to translation status
3.7.1 Prose
3.7.2 Poetry
3.8 Assessing the dialect-split hypothesis
3.9 Concluding summary
4: The morphosyntactic characteristics of null subjects in Old English
4.1 Introduction
4.2 Null subjects according to clause type
4.3 Null subjects according to person and number
4.4 Null subjects according to the position of the finite verb
4.5 Modelling null subjects in Old English according to linguistic and non-linguistic variables
4.6 Concluding summary
5: What could have sanctioned null subjects in Old English?
5.1 Introduction
5.2 Referent identification: antecedent relations and accessibility
5.2.1 Antecedentless null subjects
5.2.2 Antecedent accessibility
5.3 Two generative proposals evaluated: Aboutness topics and verbal inflections
5.4 Closing discussion
5.4.1 A suggestion: null subjects in Old English as argument ellipsis
5.5 Summary
6: The long-term diachrony of referential null subjects in early English
6.1 Introduction
6.2 Referential null subjects in Middle English
6.2.1 Null subjects in Middle English according to text, genre, and period
6.2.2 Morphosyntactic characteristics of null subjects in Middle English
6.2.3 Modelling null subjects in Middle English
6.3 Referential null subjects in Early Modern English
6.3.1 Referential null subjects in Early Modern English according to text and period
6.3.2 Morphosyntactic characteristics of null subjects in Early Modern English
6.3.3 Modelling null subjects in Early Modern English
6.4 The long-term diachrony of null subjects in early English: correlation and regression
6.5 Concluding discussion
6.6 Summary
7: Conclusions
References
Index

OUP CORRECTED PROOF – FINAL, //, SPi

Referential Null Subjects in Early English


OXFORD STUDIES IN DIACHRONIC AND HISTORICAL LINGUISTICS

general editors: Adam Ledgeway and Ian Roberts, University of Cambridge

advisory editors: Cynthia Allen, Australian National University; Ricardo Bermúdez-Otero, University of Manchester; Theresa Biberauer, University of Cambridge; Charlotte Galves, University of Campinas; Geoff Horrocks, University of Cambridge; Paul Kiparsky, Stanford University; Anthony Kroch, University of Pennsylvania; David Lightfoot, Georgetown University; Giuseppe Longobardi, University of York; George Walkden, University of Konstanz; David Willis, University of Cambridge

recently published in the series

Gender from Latin to Romance: History, Geography, Typology
Michele Loporcaro

Clause Structure and Word Order in the History of German
Edited by Agnes Jäger, Gisella Ferraresi, and Helmut Weiß

Word Order Change
Edited by Ana Maria Martins and Adriana Cardoso

Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches
Edited by Clive Holes

Grammaticalization from a Typological Perspective
Edited by Heiko Narrog and Bernd Heine

Negation and Nonveridicality in the History of Greek
Katerina Chatzopoulou

Indefinites between Latin and Romance
Chiara Gianollo

Verb Second in Medieval Romance
Sam Wolfe

Referential Null Subjects in Early English
Kristian A. Rusten

For a complete list of titles published and in preparation for the series, see pp. –.


Referential Null Subjects in Early English
KRISTIAN A. RUSTEN


Great Clarendon Street, Oxford, ox dp, United Kingdom Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries © Kristian A. Rusten  The moral rights of the author have been asserted First Edition published in  Impression:  All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this work in any other form and you must impose this same condition on any acquirer Published in the United States of America by Oxford University Press  Madison Avenue, New York, NY , United States of America British Library Cataloguing in Publication Data Data available Library of Congress Control Number:  ISBN –––– Printed and bound by CPI Group (UK) Ltd, Croydon, cr yy Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.


Series Preface

Modern diachronic linguistics has important contacts with other subdisciplines, notably first-language acquisition, learnability theory, computational linguistics, sociolinguistics, and the traditional philological study of texts. It is now recognized in the wider field that diachronic linguistics can make a novel contribution to linguistic theory, to historical linguistics, and arguably to cognitive science more widely. This series provides a forum for work in both diachronic and historical linguistics, including work on change in grammar, sound, and meaning within and across languages; synchronic studies of languages in the past; and descriptive histories of one or more languages. It is intended to reflect and encourage the links between these subjects and fields such as those mentioned above. The goal of the series is to publish high-quality monographs and collections of papers in diachronic linguistics generally, i.e. studies focusing on change in linguistic structure, and/or change in grammars, which are also intended to make a contribution to linguistic theory by developing and adopting a current theoretical model, by raising wider questions concerning the nature of language change, or by developing theoretical connections with other areas of linguistics and cognitive science as listed above. There is no bias towards a particular language or language family, or towards a particular theoretical framework; work in all theoretical frameworks, and work based on the descriptive tradition of language typology, as well as quantitatively based work using theoretical ideas, also feature in the series.

Adam Ledgeway and Ian Roberts
University of Cambridge


Acknowledgements

This study is based on, and a further development of, my University of Bergen PhD thesis (Rusten b), which was supervised by Kari E. Haugland and Gard B. Jenset. Thanks are first and foremost due to them. Kari’s critical and helpful reading of my research and her acute insight into all matters linguistic—empirical, methodological, theoretical—have been invaluable. I very much appreciate her unstinting support, generosity, encouragement, criticism, attention to detail, and constant availability for discussion. My debt to Gard is also considerable. His aid in methods of data handling and statistical analysis has greatly benefited the work presented in this book. He has also generously written two (R and Perl) scripts (duly acknowledged in the text) which have been of great help. I am also grateful for his willingness to make himself available for consultation and discussion, as well as his eagerness in engaging with my research. The errors and inaccuracies that no doubt remain in the book do so as a result of my own shortcomings.

I would like to sincerely thank the members of the committee which assessed my PhD thesis, Elly van Gelderen, Nils-Lennart Johannesson, and Dagmar Haumann, for discussing my work with me, and for offering extremely helpful advice and feedback. I want to thank Elly van Gelderen especially for encouraging me to pursue publication of this work. I am also happy to acknowledge debts owed to a number of scholars who have contributed advice, assistance, and discussion on various linguistic and academic topics. In this regard, I want to sincerely thank Kristin Bech, Aidan Conti, Jena Habegger-Conti, Karol Janicki, Kari Kinn, Kevin McCafferty, Persijn de Rijke, Victoria Rosén, Koenraad de Smedt, and George Walkden. Furthermore, I want to thank the researchers who have offered comments during and after my conference talks at ICAME , ICAME , ICHL , ICEHL , FGLS , and NAES .
Many thanks are owed also to the anonymous reviewers of this book—their suggestions have made for a better book. I also extend my gratitude to the anonymous reviewers of my previously published work. Their insightful comments have benefited this book as well as the articles for which they were intended. I would especially like to thank Kari Kinn and George Walkden for our collaboration on an article investigating null subjects in Icelandic, and George Walkden again for our collaboration on an article on null subjects in Middle English. These joint works have also influenced the present one (although this is, of course, not to say that Kari or George would necessarily agree with or endorse any of the claims made in this book). Again, all remaining errors are mine.


Additionally, it is my pleasure to thank the Department of Linguistics at the University of California at Berkeley, where I was fortunate enough to spend six months in – as a Visiting Student Researcher under UC Berkeley’s Visiting Scholar and Postdoctoral Program. I am particularly grateful to Gary Holland for his willingness to act as my Faculty sponsor and to Belén Flores for handling many of the formalities of my visit (including graciously answering a barrage of questions from the Visiting Student Researcher). I sincerely thank the Department of Foreign Languages at the University of Bergen and the Meltzer Research Fund for awarding me stipends of , NOK and , NOK, respectively, for the purposes of this research stay. I am also extremely grateful to the Norway—America Association and the American-Scandinavian Foundation’s Andrew E. & G. Norman Wigeland Fund for awarding me , for the purpose of the same research stay. These research funds were instrumental in facilitating my stay at Berkeley. Many thanks are due to the Department of Foreign Languages at the University of Bergen for funding substantial portions of this research, and to my current employer, Western Norway University of Applied Sciences, for funding the remainder. But most of all, I owe an immense debt of gratitude to my wife, Heidi, who has allowed me to selfishly pursue my interest in language and linguistics for many hours more than I probably should. Kristian A. Rusten Bergen  April 


List of Figures and Tables

Figures

. Relationship between text size (measured in terms of the total number of subject pronouns) and proportions of null subjects in Table .



. Mosaic plot showing the distribution of overt and null subjects in West Saxon and non-West Saxon/non-West Saxon-influenced texts of Old English prose



. Cohen–Friendly plot showing the contributions to the chi-squared value in a  ×  contingency table of overt and null subjects according to dialect



. Mosaic plot showing the distribution of overt and null subjects in West Saxon and non-West Saxon/non-West Saxon-influenced texts of Old English poetry



. Mosaic plot showing the distribution of overt and null pronominal subjects in Old English prose and poetry



. Cohen–Friendly plot showing the contributions to the chi-squared value in a  ×  contingency table of overt and null subjects according to genre



. Dotplot showing the variable importance of ‘genre’ and ‘translation’ in a random forest



. Binned residual plot for the logistic regression model in Table .



. Grid showing effect plots for all fixed-effect predictors in the regression model in Table .



. Fixed effects in the generalized mixed-effects logistic regression model in Table . according to the odds ratio of switching from an overt to a null pronoun



. Grid showing effect plots for all fixed-effect predictors in the regression model in Table .



. Binned residual plots for the generalized mixed-effects logistic regression models in Tables . and .



. Grid showing effect plots for all fixed-effect predictors in the regression model in Table .



. Binned residual plot for the generalized mixed-effects logistic regression model in Table .



. Licensing of null subjects in Old English (Walkden : , fig. )



. Grid showing effect plots for all fixed-effect predictors in the generalized mixed-effects logistic regression model in Table .



. Binned residual plot for the generalized mixed-effects logistic regression model in Table .



. Grid showing effect plots for all fixed-effect predictors in the generalized mixed-effects logistic regression model in Table .




. Binned residual plot for the generalized mixed-effects logistic regression model in Table .



. Enhanced scatterplot showing the proportion of null subjects in  OE, ME, and EModE texts listed in chronological order



. Enhanced scatterplot showing the proportion of null subjects in  OE, ME, and EModE non-verse texts listed in chronological order



. Scatterplot showing the results of a robust correlation performed using the mvoutlier package



Tables . Overt and null referential subjects in Old English prose



. Overt and null referential subjects in Old English poetry



. Overt and null subjects in six Anglian/Anglian-influenced texts (extracted from Walkden : –)



. Classification of Old English prose texts according to dialect



. Subject pronouns in Old English prose according to dialect



. Subject pronouns in Old English poetry according to dialect



. Subject pronouns in Old English prose according to subgenre



. Overt and null referential subjects in Old English medical handbooks



. Overt and null referential subjects in Old English historical texts



. Classification of Old English prose texts according to period



. Subject pronouns in Old English prose according to period



. Overt and null referential subjects in early and late Old English prose



. Classification of Old English poetic texts according to period



. Subject pronouns in Old English poetry according to period



. Classification of unlabelled Old English prose texts according to translation status



. Subject pronouns in Old English prose according to translation status



. Classification of Old English verse texts according to translation status



. Subject pronouns in Old English poetry according to translation status



. Results of the generalized fixed-effects logistic regression model in ()



. Null and overt subject pronouns in Old English prose according to clause type



. Null and overt subject pronouns in Old English poetry according to clause type



. Null and overt subject pronouns in Old English prose according to person and number



. Null and overt subject pronouns in Old English poetry according to person and number



. Subject pronouns in Old English prose according to clause type and verb position




. Subject pronouns in Old English poetry according to clause type and verb position



. Results of the generalized mixed-effects logistic regression model in ()



. Results of the generalized mixed-effects logistic regression model in ()



. Results of the generalized mixed-effects logistic regression model in ()



. Antecedent function in Old English prose



. Antecedent function in Old English poetry



. Antecedent form in Old English prose



. Antecedent form in Old English poetry



. Distance measured in words between the null subject and its antecedent in Old English prose



. Distance measured in words between the null subject and its antecedent in Old English poetry



. Old English verbal inflections: helpan ‘to help’ (strong class b verb)



. Old English verbal inflections: fremman ‘to advance, do, effect’ (weak class  verb)



. Overt and null referential subjects in Middle English prose and poetry



. Overt and null referential subjects in Middle English according to genre



. Overt and null subjects in Middle English according to period



. Null and overt subject pronouns in Middle English prose and poetry according to clause type



. Subject pronouns in Middle English prose and poetry according to clause type and verb position



. Overt and null subject pronouns in Middle English prose and poetry according to person and number features



. Results of the generalized mixed-effects logistic regression model in ()



. Overt and null referential subjects in Early Modern English



. Overt and null subjects in Early Modern English according to period



. Overt and null referential subjects in Early Modern English according to clause type



. Subject pronouns in Early Modern English according to clause type and verb position



. Overt and null subject pronouns in Early Modern English according to person features



. Results of the generalized mixed-effects logistic regression model in ()



. Example of the format of a dataframe ranking Old, Middle, and Early Modern English texts according to date of composition




List of Abbreviations

1         first person
2         second person
3         third person
acc       accusative case
comp      complementizer
CP        complementizer phrase
dat       dative case
def       definite
df        degrees of freedom
DP        determiner phrase
EModE     Early Modern English
eOE       early Old English
GB        Government-Binding theory
gen       genitive case
imp       imperative mood
indef     indefinite
indic     indicative mood
inf       infinitive
inst      instrumental case
IP        inflection phrase
lOE       late Old English
ME        Middle English
neg       negative particle
neut      neuter gender
nom       nominative case
NP        noun phrase
Oø        null object
OE        Old English
OGmc      Old Germanic
OHG       Old High German
OIce      Old Icelandic
ON        Old Norse
OSax      Old Saxon
OSwe      Old Swedish
PdE       Present-day English
pl        plural number
pres      present tense
R²        Coefficient of determination (R-squared)
refl      reflexive
sbj       subject
sbv       subjunctive mood
sg        singular number
ShiftP    shift phrase
Spron     overt referential pronominal subject
Sø        referential null subject
Sø.imp    empty imperative subject
Sø.jus    empty subject in jussive subjunctive structure
Sø.rel    empty subject relative
V1        verb-initial
V2        verb-second
ρ         Spearman’s rank-order coefficient (Spearman’s rho)
φ         Pearson’s phi coefficient of association (statistical modelling)/phi-features
φP        phi-phrase
χ²        chi-squared
Ø         empty position (syntax) / null inflection (morphology)
∗         grammatically unacceptable


1 Introduction

This book is concerned with the non-expression of referential pronominal subjects in early English finite clauses and the superficially ‘subjectless’ constructions which occur as a consequence. The book builds directly on the foundations laid by my previous work on structures containing such referential null subjects in Old English (OE) (Rusten , , a, a), but also indirectly on recent work by Walkden (, , c).1 The investigation carried out here is corpus-based, and it analyses , overt and null referential pronominal subjects extracted from  OE, Middle English (ME), and Early Modern English (EModE) texts (and text samples). The book thus gives a detailed account of referential null subjects in three periods of early English, covering c. years of the history of the language, but primary importance is assigned to OE.

The principal aim of the book is to provide an in-depth quantitative analysis of the non-expression of referential subjects in early English, and to attempt to determine what systematicity there is in terms of the morphosyntactic and pragmatic characteristics of this phenomenon. Relying on large corpora and state-of-the-art statistical methods, I aim to analyse the role played by a number of linguistic variables in sanctioning referential null subjects in the early history of English. I also aim to establish and analyse possible correlations between the non-expression of referential subjects and factors such as the translation status of the investigated texts, as well as their period and place of composition. Re-evaluations of previously proposed hypotheses concerning referential null subjects in early English—with special attention paid to OE—will be provided on the basis of these analyses. In pursuing these aims, this book will provide a more exhaustive empirical analysis of subjectless finite clauses in the history of English than I believe has been given to date.

With the exception of my own previous work, as well as that of Walkden, very little large-scale systematic empirical research has been offered by the research tradition. Previous studies on the null subject phenomenon in early English have predominantly been based on unsystematized (cf. e.g. Pogatscher ; Mustanoja ; Visser ; Mitchell a,b) or limited empirical data (e.g. van Gelderen , ). No consensus has been reached with regard to the distribution or the morphosyntactic and pragmatic characteristics of this phenomenon. Nor has an entirely satisfactory model been proposed for explaining the occurrence of subjectless finite clauses, which intermingle with parallel structures containing overt referential pronominal subjects in seemingly erratic fashion (cf. e.g. Mitchell a: –). It is hoped, then, that this study, by carrying out a large-scale statistical analysis of empirical data, can make a contribution to research on early English syntax by providing further concretization of an area which until a few years ago seriously lacked systematic quantitative-linguistic attention, but which nonetheless is of considerable empirical and theoretical interest.

1 In addition to the works cited above, the book draws partially on joint work I did together with Kari Kinn and George Walkden on null subjects in early Icelandic (Kinn et al. ) and Middle English (Walkden & Rusten ). I gratefully acknowledge my debt to Kari and George.

Referential Null Subjects in Early English. First edition. Kristian A. Rusten. © Kristian A. Rusten . First published in  by Oxford University Press.

1.1 Background

It has been noted by both philologists and historical linguists that numerous finite clauses in OE lack an expressed referential subject.2 Present-day English (PdE) requires overtly expressed referential subjects in most finite environments,3 and the presence of subjectless clauses in OE is therefore remarkable. Examples () and () illustrate the phenomenon.4 Note that the PdE equivalents to the OE clauses require overt referential subjects.

()  On þysum life we ateoriað gif [Sø] us mid bigleofan ne
    in this life we waste-away if [we] us with food not
    ferciað:
    sustain.pl
    ‘In this life, we waste away if we do not sustain ourselves with food.’
    (ÆCHom I .)

()  Hraðe siððan wearð [Sø] fetorwrasnum fæst, feores
    quickly after became [he] fetter-chains.dat bound life.gen
    orwena.
    despairing
    ‘Quickly after he was bound by fetters, despairing of life.’
    (And )

2 Cf. the references later in this section, as well as those in sections .. and ... 3 See section . for details. 4 Even though the majority of the OE examples cited in this book were identified by means of the corpora employed in the study (cf. section ., as well as the companion website at www.oup.co.uk/companion/rusten), the text of all examples is taken from the Dictionary of Old English Web Corpus (Healey et al. ). Hence, reference is made to the Dictionary of Old English short titles as opposed to the ID tags used by the corpora. A list of the abbreviations used for OE sources cited in examples is provided at the companion website.


Similarly subjectless clauses appear also in ME and EModE, as illustrated in () and (), respectively.5

()  [ . . . ] yf [Sø] were trewe or no I remytte me to Gode.
    [ . . . ] if [they] were true or no I remit me to God
    ‘[ . . . ] if they were true or not, I remit myself to God.’
    (Gregory’s Chronicle (CMGREGOR,.))

()  but I believe [I] shall be on Munday at a ball at St. Jeames [ . . . ]
    (ALHATTON-E-H,,.)

Current linguistic theory analyses subjectless clauses such as those exemplified above as containing an understood—or ‘underlying’—‘null’ or ‘empty’ pronoun which functions syntactically as the subject, and which is associated with the thematic roles normally assigned to the subject. An analysis along these lines is expressed here through use of the abbreviation Sø to symbolize the ‘presence’ of a referential null subject.6

Extensive theoretical and empirical work on empty syntactic categories has been carried out by linguists since the s. The vast majority of this research has been conducted within the various frameworks of generative linguistics—most notably within Principles and Parameters theory, including the developments of Government–Binding (GB) theory and the Minimalist Programme. In these frameworks, the phenomenon observable by the absence of verbal arguments such as subject and object is commonly referred to as pro-drop. When the ‘missing’ argument is a subject, the clause is said to contain a null subject or an instance of pro,7 and the clause is said to exemplify subject pro-drop. Correspondingly, any language which permits referential null subjects to be present in the finite clause structure is within these frameworks referred to as a pro-drop- or null subject language.8

5 The text of the ME and EModE examples is that given in the corpora used to collect them, with the exception that the characters æ, þ, and ð have been inserted to replace +a, +t, +d, respectively. Consequently, reference is made to the corpus token IDs in these examples. Also, note that no gloss is given in EModE examples; instead, the missing pronoun has been inserted in the corpus text enclosed by square brackets.
6 In other words, the abbreviation Sø signals that an overt referential subject is absent from the finite clause in question. It should not necessarily be taken to signify the presence of the specific theoretical entity pro of Government–Binding theory (cf. also n. ).
7 In the GB era, subjectless finite clauses were typically assumed to contain instances of a specific theoretical entity pro. Under the current Minimalist Programme, the GB view has been questioned. Whenever the term pro-drop is used here, then, it should be understood as denoting a clause containing a null subject; nothing is implied as to the ontological status or featural make-up of that subject.
8 It is, perhaps, necessary to make it clear that pro-drop and null subject are not entirely synonymous terms. The pro-drop phenomenon refers to the ‘dropping’ of a pronoun having any syntactic function, meaning that e.g. object pro-drop is also a possibility for many pro-drop languages. Null subject languages are languages which allow referential null subjects in the finite clause structure, but not necessarily null objects or objects of prepositions.




Introduction

Terms such as pro-drop and null subject carry specific theoretical implications; but regardless of theoretical affiliation, a basic distinction can be made between languages which allow referential null subjects in finite clause structures and languages which generally do not. In languages of the former type—typically exemplified by ‘canonical’ pro-drop languages such as Italian and Spanish—clauses normally do not feature overt subject pronouns, except for emphasis. Non-pro-drop languages, on the other hand, exemplified by e.g. English, French, and the mainland Scandinavian languages, require subject pronouns to be overtly present in the clause structure. The requirement for overt referential subjects in finite clauses exhibited by e.g. English is notable among the world’s languages, of which the majority appear to allow non-expression of referential subjects to some extent or other (see Gilligan ). Among European languages, obligatory overt expression of referential subjects is mainly restricted to the Germanic languages and Modern French. In light of this, it is interesting to note that many of the omitted subjects in OE correspond to the generative pro-drop phenomenon. Since PdE is a non-pro-drop language, the presence of null subjects in OE could suggest that English in the course of its documented history has transitioned from a stage where null subjects were a sanctioned grammatical possibility toward one where they are not. Similar trajectories of development have been argued for the other Old Germanic (OGmc) languages, and there is some consensus that Proto-Germanic, the postulated and partially reconstructed ancestor of the Germanic languages, was a pro-drop language (cf. e.g. Grimm : , referenced in Fertig : , Wright & Wright : , and Walkden c: –). If this is the case, and if it is also the case that Proto-Indo-European was a pro-drop language (see e.g. 
Fertig : , who refers to Schulze : ), it follows that all the Germanic languages (and, indeed, all the Indo-European languages) must have developed from a stage where null subject pronouns were the rule, or at the very least permissible. The presence of null subjects in the OGmc languages has been investigated by both philologists and historical linguists. Until recently, however, the field has suffered from an unfortunate lack of rigorously analysed quantitative data. In the interests of sketching a background for the work presented in this book, then, a short overview of studies on null subjects in OGmc and OE will be given in sections .. and .., respectively. .. Referential null subjects in Old Germanic The present section offers a brief survey of research on null subjects in Gothic, Old High German (OHG), Old Saxon (OSax), Old Swedish (OSwe), Old Norwegian (ONorw), and Old Icelandic (OIce). As the oldest of the attested Germanic languages, Gothic has been of considerable interest to scholars advocating the view that Proto-Germanic was a pro-drop language. Even though the surviving Gothic material is severely limited, evidence from




this language (which, with the exception of certain early runic inscriptions, pre-dates evidence from the other OGmc languages by several centuries) has been considered to represent an early stage of development that cannot be accessed by investigating other OGmc languages. As Fertig (: ) states, ‘by the time texts started appearing in the North and West Germanic dialects, null referential subjects had become the exception rather than the rule’. In Gothic, however, they do seem to be the rule. Null subjects in Gothic have been commented on by e.g. Streitberg () and Abraham (, ), and quantitative studies have been carried out by Fertig (), Ferraresi (), and recently Walkden (, c). According to Fertig (: , , and passim), Gothic natively allowed referential subjects to be realized as null, independently of the reconstructed Greek Vorlage (understood by Fertig as the Greek text in Streitberg , although for his purposes ‘any familiar modern Greek edition [ . . . ] yields virtually the same results’ (Fertig : )).9 Supplementing the studies by Fertig and Ferraresi with an exhaustive investigation of the Gothic Gospel of Matthew, Walkden (: ) finds that referential null subjects are very common in main, conjunct, and subordinate clauses, with relative frequencies for null subjects ranging from . to . of the total number of pronominal subjects. With respect to person and number, he shows that first, second, and third person pronouns, singular and plural, are dropped at frequencies between . and . of the total number of pronouns within each person/number combination (p. ). Thus, Gothic allows omission of pronominal subjects at a stable and high-frequency level, regardless of clause type and person and number specifications.
Research on null subjects in OHG has been carried out by a number of philologists and modern linguists.10 Notably, building on the quantitative material collected by Eggenberger (), Axel (, ) and Axel & Weiß () show that referential null subjects are quite frequent in early OHG, at frequencies ranging from  of all pronominal subjects in Isidor to  in the Monsee Fragments (Axel : ). While these frequencies are considerably lower than those observed for Gothic, Axel (: ) argues that it is ‘very doubtful whether the prevailing position can be upheld that the archaic null-subject property that was still present in Proto-Germanic had already disappeared in pre-OHG’. In later OHG, referential null subjects are virtually unattested, apart from ‘some very rare exceptions’ (Axel : ). For instance, only . of the pronominal subjects in Notker’s translation of Boethius’ Consolatio

9 Walkden (c: ), however, points out that reliance on Streitberg’s edition is problematic since ‘Streitberg was not a Bible scholar, and his version of the Greek New Testament is a hybrid which does not derive from any single manuscript’. Consequently, Walkden uses the Majority Text (Robinson & Pierpont ) for his own comparisons between Greek and Gothic, as the Gothic follows this version closely, according to Ratkus (: ). 10 Cf. e.g. Kraus (), Held (), Paul (), Eggenberger (), Sonderegger (), Axel (, ), Axel & Weiß () and Walkden (, c).


Philosophiae are realized as null (Haugland : ).11 Eggenberger’s () data thus indicate that frequencies for referential null subjects decrease substantially over time. It is also observed that, unlike the situation in Gothic, referential null subjects in the OHG texts are considerably more common in main clauses than in subordinate clauses, and that third person pronouns are more likely to be omitted than first and second person ones (Axel : , –). Claims have also been made about the status of referential null subjects in OSax. For instance, Heyne (: ) notes, according to Pogatscher (: ), that the nominative singular, and more rarely the plural, ‘bisweilen vor dem verbum ausgelassen sei’—i.e. the subject may sometimes be omitted when preceding the verb. Pogatscher (: –) also cites some examples of null subjects in OSax in his discussion of the same phenomenon in OE. To the best of my knowledge, however, Walkden’s (; c) treatment of the OSax Heliand is the only quantitative study of referential null subjects in this language. He demonstrates that null subjects occur at . of the total number of pronominal subjects, while stressing that the result is ‘a lower bound’, since it is possible that some of the instances he has ‘analysed as conjunction reduction are in fact cases of referential null subjects in non-conjunct clauses’ (Walkden : ). He further shows that null subjects are very infrequent in subordinate clauses as compared to main and conjunct clauses (p. ), and also that third person pronouns are significantly more likely to be realized as null than non-third person ones (pp. –). Among the third person null subjects, the frequency for the plural (.) is higher than that for the singular (.). The status of null subjects in Old North Germanic has been investigated, either qualitatively or quantitatively, and to varying degrees of detail, in a relatively substantial number of studies. 
Nygaard (, ), Thráinsson & Hjartardóttir (), Hjartardóttir (), Sigurðsson (), Faarlund (, ), Walkden (, c), Lander & Haegeman (), and Kinn et al. () deal with Old Norse/OIce, while ONorw, OSwe, and Old Danish are investigated in Kinn (, a,b, ), Håkansson (, ), and Heltoft (), respectively. Similar tendencies to those observed for OHG and OSax emerge for ONorw and OSwe: Håkansson investigates pronominal subjects in clauses occurring in extracts of up to c. pages from twelve OSwe texts, and concludes that referential null subjects are very rare. Investigating a sample of , clauses, he also shows that referential null subjects are more common in main () as opposed to subordinate clauses (), and that third person pronouns are more frequently null than first and second person ones ( null subjects with third person reference, compared to  and  with first and second person reference, respectively) (Håkansson : ). Kinn (: ) adduces data showing that null subjects occur at proportions of . and . in 11 Like Axel (, ) and Axel & Weiß (), Haugland’s () statistics for OHG are extrapolated from Eggenberger ().




two ONorw texts. In one of the texts, null subjects are considerably more frequent in main than in subordinate clauses (p. ), and third person pronominal subjects are null much more frequently than first and second person ones in both texts (p. ). Heltoft’s () functionalist analysis of null subjects in Old Danish provides no quantification, but he nevertheless says that this language ‘allowed zero arguments [ . . . ] in any nominal (argument) position’ (p. ). Walkden’s (: –) quantitative investigation of referential null subjects in nine OIce texts demonstrates that frequencies for null subjects are ‘uniformly low, and never above ’ (p. ). This finding is corroborated and extended by Kinn et al. (). For four of these nine texts, Walkden shows that referential null subjects are significantly more frequent in subordinate than in (non-conjunct) main clauses (p. ). Moreover, Walkden demonstrates for two of the texts that null subjects are significantly more frequent with third than with non-third person reference (p. ). This finding, which echoes findings in OHG, OSax, OSwe, and ONorw, is corroborated and extended by Kinn et al. (: ), who show statistically that the third person ‘can be observed to favor nullness across the entire dataset’.12 While most of the works mentioned above focus on individual OGmc languages, Walkden (, c) gives a cross-Germanic perspective. His chapter on null arguments in OGmc13 builds on a previous summary by Rosenkvist (), who, on the basis of the data available at the time, noted that the OGmc languages ‘display some striking similarities’ with regard to the properties displayed by null subjects in these languages (p. ). He states that the distribution of referential null subjects ‘does not in any [OGmc—kar] language depend on the “richness” of verbal inflection’ (p. ), and he also points out that third person null subjects ‘were by far the most frequent’ (p. 
).14 He also comments, on the basis of Sigurðsson (), that null subjects in OGmc ‘all seem to depend on lexically realized antecedents in the preceding discourse’ (p. ).

.. Referential null subjects in Old English

As can be seen, then, referential null subjects have a long-recognized presence in all the OGmc languages. This presence is long recognized also in the case of OE, but prior to Rusten () and Walkden (), very little systematic, quantitative research had been carried out. This fact is reflected in the research tradition by numerous contradictory statements concerning the phenomenon. Traugott (: ), for instance, states that ‘[a] grammatical subject is not obligatory in OE’. Similar claims are found in introductory textbooks, such as Baker (: ), where it is noted that the OE finite verb ‘can sometimes express the subject all by itself’. Mitchell

12 I.e. a dataset covering the entire Icelandic Parsed Historical Corpus (IcePaHC) (Wallenberg et al. ).
13 Walkden’s work on OE will be presented in the immediately following section ...
14 Note that Rosenkvist does not take Gothic into consideration.


& Robinson’s (: ) guide to OE states that a ‘pronoun subject is frequently not expressed’, and while the unexpressed subject is often ‘the same as that of the preceding clause’, this is not always the case.15 Pogatscher (: ), providing a long list of subjectless clauses taken from OE prose and poetry, ventures to bring attention to the ‘von anderen erkannte thatsache, dass im Altenglischen nicht bloss im haupt- sondern auch im nebensatze das subjekt unausgedrückt bleiben kann’, i.e. the fact that the subject can be unexpressed in both main and subordinate clauses in OE. In a discussion of OE poetic syntax, Blockley (: ) claims that ‘[i]n Old English, unexpressed subjects were grammatical’, while Moessner (–: ), discussing the syntax of relative clauses, states that ‘the subject is an optional constituent in OE’. Even more strongly, van Gelderen’s (: ) generative study asserts that ‘pro-drop is quite common’ in OE, and in her more recent work it is argued that OE was ‘a genuine pro drop language’ with ‘Romance-style pro drop’ (van Gelderen : , ). Hulk & van Kemenade (: ), on the other hand, stress that the ‘phenomenon of referential pro-drop does not occur in OE’, and van Kemenade (: ) says that ‘OE allows no referential pro-drop’. Visser (: ) states that ‘use of the subject-pronoun was the rule’, while Mitchell (a: ) says that the subject was ‘only spasmodically’ omitted,16 and that the ‘personal pronoun is normally expressed when it is the subject of a verb’ (p. ). However, he also states that Sø ‘must be accepted as idiomatic OE’ (p. ).

As is evident from these widely diverging accounts, the research tradition has failed to reach consensus with regard to the distribution and status of null subjects in OE. The need for more research was further accentuated by Rusten (, ) and Walkden (, ).
These corpus-based works independently found, in samples of  and  texts,17 respectively, that null subjects are very rare in the majority of the investigated texts (Rusten : ; Walkden : –). In the data of Rusten (: ), Sø is on aggregate most frequent in conjunct clauses, and Walkden shows that, in a subset of six of his analysed texts, the ones which ‘exhibit null subjects to a greater extent’ (Walkden : ), Sø is more frequent in main than in subordinate clauses. This effect is found to be statistically significant in the case of Beowulf (p. ). Moreover, in an early work, Berndt () demonstrated that in two interlinear glosses, those in the Lindisfarne and Rushworth Gospels, third person

15 The implication of this is that the unexpressed subject then cannot be a case of conjunction reduction, which is still permissible in PdE. See section . for further explanation and examples.
16 Mitchell (a: ) prefers the term ‘non-expression’ to ‘omission’, because ‘to speak of “omission” [ . . . ] is to disregard the possibility that we have [what Ardern (: xxv) called – kar] “a genuine archaism”’. This is an important distinction, since if it is the case that OE has developed from a stage where expression of verbal arguments was non-obligatory, diachronic development has moved in the direction of adding pronouns to finite clause structures, instead of omitting them. However that may be, terms such as omission and pro-drop will nevertheless be used here.
17 Walkden () investigates two additional prose texts compared to Walkden (), and Rusten (a) adds four prose, and  verse, texts to the material in Rusten (, ).




pronouns are left unexpressed more frequently than first and second person ones (cf. table . in Walkden : , which summarizes Berndt’s data). Building on this work, Walkden (: –) investigates null subjects according to person and number features in four texts, and finds a similar tendency: third person subjects are more frequently null than non-third person ones. The same tendency is found in Rusten’s (a: section .) examination of  prose and  verse texts. On the basis of the low frequencies for Sø, my earlier work concluded that accounts such as that by van Gelderen () ‘have overestimated the “idiomaticity”’ of null subjects in OE (Rusten : ), and that the null subject phenomenon is ‘nearly dead by the extant Old English period’ (Rusten : ). Walkden (: ), however, finding that proportions for null subjects are higher in Anglian and Anglian-influenced texts than in West Saxon ones, suggests that referential null subjects ‘were available, subject to certain restrictions, in Anglian dialects’. Thus, he proposes a West Saxon/Anglian dialect split with regard to the permissibility of null subjects in OE. Rusten (a) called this suggestion into question, pointing out that the higher frequencies in Walkden’s Anglian and Anglian-influenced texts could be due to factors other than dialect, such as genre, time of composition, and translation status. However, a firm conclusion could not be reached in that work. Thus, previous research has not attained consensus as to the actual pro-drop status of OE, and only a restricted number of texts have been investigated in detail.

The research tradition is also divided as concerns the factors which sanction the occurrence of referential null subjects in OE. Null subjects, both generally and in OE specifically, have long been linked to the presence of a ‘rich’ system of verbal inflections.
With appealing logic, it has been claimed in both traditional and generative works that in pro-drop languages, the subject referent is unambiguously identified by the person and number features displayed by the finite verb.18 Thus, the referent of a null subject is recoverable by the inflections on the finite verb, and an overt subject is therefore redundant. The notion that inflectional ‘richness’ can explain the occurrence of null subjects was formalized by generative linguists in connection with the formulation of the identification hypothesis (cf. e.g. Jaeggli ; Jaeggli & Safir b), which postulated that a null subject may occur when ‘certain important aspects of its reference can be recovered from other parts of the sentence’ (Huang : ). Ohlander (: ) relies on verbal inflections for an early explanation of subject omission in OE, saying that ‘the subject-pronoun was seldom necessary, since the subject was generally sufficiently indicated by the personal ending of the predicate verb’. Mitchell (a: , n. ) rejects this explanation, however, noting that ‘the verb endings were too ambiguous for this to have played much part, even

18 In generative theory, these features are typically referred to as phi-features, occasionally simply abbreviated as φ.


in OE’. Formal syncretism in the OE verbal morphology is well documented and illustrated in grammar-book paradigms, but it was also ‘often more extensive’ in ‘the practice of individual writers’ (Haugland : ) than suggested by such paradigms. Peeters (: ) (quoted in Mitchell a: ) emphasizes this, remarking that ‘the relatively numerous verbal endings that were used could in most cases functionally be dispensed with’, also noting that this is ‘indeed what happened in the course of time’. Visser (: ) agrees, citing ‘extensive formal syncretism’ in the verbal morphology as the reason why ‘use of the subject-pronoun was the rule’. A radically different stance is taken by van Gelderen (: ), who argues that ‘[i]n Old English [ . . . ] pro-drop is quite common’ as a consequence of ‘the strength of the verbal person features’. Referring to Adams (), Abraham (), and Sigurðsson (), she notes that ‘ “[o]lder” languages often license pro-drop’, ‘possibly because the verbal inflection is unambiguous, which weakens the need for overt elements’ (p. ). She states that OE is ‘no exception’ (p. ), and argues that the third person is more specified in terms of phi-features than the first and second persons—which results in third person pronouns being omitted more frequently than first and second person ones (p. ), as demonstrated in Berndt (). Similar statements are made in van Gelderen (: , ), where—as mentioned above— it is claimed that OE was ‘a genuine pro drop language’ with ‘Romance-style pro drop’. Specifically, according to van Gelderen (), ‘the loss of overt agreement on the first and second person pronouns’ verbs’ causes the split between third and non-third person, since ‘agreement features are uninterpretable’ on first and second person verbs (p. ). 
However, ‘[t]hird person verbal agreement, especially singular, remains more stable’, a fact which on her analysis licenses third person referential null subjects ‘up to late Old English because the features on the verb are interpretable’ (p. ). In Rusten (), I argued, in agreement with Mitchell (a), that verbal inflections are insufficient in terms of explaining what sanctions the occurrence of Sø in OE. Walkden (, , c) independently reaches the same conclusion, noting that an agreement-based explanation ‘cannot account for the Old English facts’ (Walkden : ). He extends this argument to the OGmc languages in general, stating that ‘rich agreement is unlikely to have played a role in allowing null arguments in any of the early Germanic languages, with the possible exception of Gothic’ (Walkden : ). Corresponding conclusions have been drawn for e.g. OHG and OIce: Axel (: ) notes that referential null subjects were ‘largely lost in the OHG period even though there was no substantial weakening of inflectional endings’, and Sigurðsson (: ) argues that ‘in spite of its richness Agr[eement] never took any part in identifying subject pro in Icelandic’. Building on Holmberg’s () study of null subjects in Finnish, Walkden promotes a Minimalist analysis where Anglian OE is considered a partial null subject language. According to this analysis, which—like Holmberg’s—is inspired by the innovative approach to referential null subjects presented in Frascarelli (), Sø is licensed




by an Aboutness topic in the specifier of the Shift Phrase (ShiftP): ‘the [uD] feature [i.e. ‘uninterpretable Definite feature’ – kar] of a null argument could [ . . . ] only be valued by agreement with a null aboutness topic’ (Walkden : ). SpecShiftP probes to the specifier of TP, leaving the noun phrase subject null by means of chain reduction. On Walkden’s analysis, Sø predominantly has third person reference because null first and second person subjects are dependent on probing from ‘logophoric agent’ and ‘logophoric patient’ operators, respectively, and these operators ‘lacked the ability to probe in Old English’ (p. ). A somewhat similar generative analysis is given in van Gelderen (). While stating that pro-drop is ‘licensed by agreement on the verb’ (p. ), she also incorporates the approach pioneered by Frascarelli (), which (as just noted) stipulates that null arguments must agree with a topic in the left periphery. On van Gelderen’s analysis, as is the case with Walkden’s, it is the Aboutness topic which is the licensing topic for null subjects in OE.

As is evident, then, these important recent generative works highlight the discourse status of the omitted argument as crucially important in terms of sanctioning referential null subjects. A precursor to this ‘licensing-by-topic’ hypothesis can be found also in early philological works on null subjects in OE. Specifically, Pogatscher (: ) argues that ‘das subjekt nicht ausgedrückt zu werden braucht’ as long as ‘die subjektvorstellung’ (i.e. the ‘concept’ of the subject) ‘dem hörer aus dem zusammenhange genügend deutlich vorschwebt’. That is, the subject pronoun may be omitted as long as the concept expressed by the subject (i.e. the referent) is clear from the context and in the hearer’s mind throughout the discourse. According to Pogatscher, this is accomplished by having ‘the subject itself’ (i.e. 
the overt subject) or a reference to the concept of the subject occur in close vicinity to the empty position (die ersparungsstelle, ‘place of omission’) (p. ). This notion was criticized in Andrew’s () study of the syntax of Beowulf, where it was pointed out that ‘Pogatscher’s “hovering” rule’19 is not systematically applied in parallel cases (p. ). In line with this, Mitchell (a) criticizes Pogatscher’s vorschweben as a notion ‘impossible to apply in practice’, also saying that this explanation does not account for why pronominal subjects are interchangeably realized as overt and null in ‘what appear to us parallel situations’ (p. ). The focus on Aboutness topicality in current generative research on null arguments can arguably be seen as a vindication of Pogatscher’s statement. However, the content of this ‘licensing-by-topic’ hypothesis has not yet been tested sufficiently, and the lack of systematicity in parallel situations pointed out by Andrew and Mitchell is still a problem. I began working towards testing these hypotheses in my earlier work. For example, Rusten () conducted a brief investigation of the notion that subject pronouns may be omitted if they refer to a discourse topic. I found that discourse topicality could not

19 This is presumably Andrew’s somewhat clumsy translation of vorschweben, ‘having in mind, having a vague notion’.


account systematically for the occurrence of referential null subjects, and provided examples of null subject clauses where the omitted argument does not refer to ‘who the narrative is about’ (Rusten : ). Rusten () also began providing quantitative evidence concerning the characteristics of the antecedent, under the assumption that the prominence of the antecedent in the discourse could influence the possibility of a null subject. However, the results of that work built on very restricted data material, and thus additional work on both topicality and the status of the antecedent is highly desirable. This will be provided by the present book.

. Contribution of the book

In summary, then, there can be little doubt that there is more ground to cover in this area of OE syntax. First, there is still controversy as to the actual pro-drop status of OE. A large-scale quantitative investigation into the frequency of Sø, as well as the linguistic and non-linguistic variables associated with its occurrence, can be used to empirically assess the various conflicting claims concerning the licitness of pro-drop in OE. Such a quantitative investigation is also eminently suitable for assessing Walkden’s suggestion of an Anglian/West Saxon dialect split. As pointed out in Rusten (a), the differences attributed to diatopic variation by Walkden may potentially be due to other factors, such as the genre, translation status, or period of composition of the texts used as evidence for the dialect-split hypothesis. A unified, multifactorial analysis of Sø according to these variables is still lacking. In this book, these variables will be weighed against each other using sophisticated statistical measures. Moreover, while Rusten (, , a) and Walkden (, , c) began providing quantitative analyses of null subjects in OE, only a limited number of texts have been examined in detail. Also, with the exception of Rusten (a), which compared the occurrence of Sø in  OE verse texts to that in  prose texts, previous quantitative works have focused mainly on the prose genre, as well as on interlinear glosses.20

This book will also offer further investigation into the degree to which various linguistic factors play a role in permitting Sø in OE. Substantial parts of this study will take the form of a statistical analysis of the role of a number of structural variables proposed by previous research as relevant for subject omission in OE. 
Unlike previous quantitative works on null subjects in OE prose and poetry, this book will apply statistical measures that permit modelling of the importance of individual variables relative to that of others, and all overt and null subject tokens in all the texts in the corpora will be included in the analysis. To the best of my knowledge, moreover, the number of potentially explanatory variables investigated here is larger than in any previous study on Sø in OE. The role played by verbal agreement in sanctioning Sø will also be (re-)examined. As noted above, verbal agreement was deemed insufficient in licensing Sø in OE by

20 Cf. Berndt (), van Gelderen (, ), and Walkden ().


Scope and definitions



Mitchell (a), Rusten (), and Walkden (), while van Gelderen’s (, ) work, on the contrary, attributes licensing of null subjects in OE precisely to verbal agreement. Consequently, this question is still unresolved. Also, the role played by antecedent relations will be examined somewhat closely in the present study— operationalized via testing of the predictions made by Ariel’s () Accessibility Scale and the Givenness Hierarchy of Gundel et al. (). The validity of the licensing-bytopicality analyses of Walkden (, , c) and van Gelderen () will also be tested. Finally, this book contributes a large-scale quantitative investigation of the longterm diachronic development of referential null subjects from Old to Early Modern English. There are as yet few substantial quantitative accounts of Sø at later stages than OE,21 and—to the best of my knowledge—none which offer a unified picture of diachronic changes in the frequency of Sø and its linguistic characteristics across a period of c. years. In the words of Bruce Mitchell, then, there seems still to be ‘room for more work here’ (a:  and passim), and these are the contributions that this book aims to make.

21 Exceptions are constituted by Rusten (a) and Walkden & Rusten (). The former, a short paper written as a pilot study to the longitudinal work presented here, gave basic statistics on the occurrence of Sø in  OE, ME, and EModE texts. The latter is a substantial study of null subjects in  ME prose and verse texts. I will expand on this work in the present book.

1.3 Scope and definitions

A null subject is understood here as a nominative argument which is not phonologically or graphically present in the finite clause structure, but which nevertheless can be analysed as an underlying ‘null analogue of an overt pronoun’ (Huang : ). Languages which permit null subjects thus allow finite verbs to occur without an overt subject, yet the ‘missing’ argument is still considered to be present in the structure of the clause. This omitted element is typically analysed as having nominative case and as being assigned a thematic role by the verb, despite its unexpressed status. The concept of the null subject is not necessarily theory-specific, even though it has received by far the greatest degree of attention from researchers working within transformational-generative frameworks. That is, null subjects are treated within various models of syntax in traditional grammar22 and in non-transformational generative grammar,23 as well as in transformational-generative grammar. It must of course be noted that these theoretical frameworks differ substantially both in their notation and in their ontological understanding of null subjects, and the work couched in GB and Minimalist theory is almost certainly richer and more diverse than that carried out within other frameworks.

There is considerable agreement that there are grounds to distinguish between various distinct types of subject omission. Three main categories will be distinguished here. These encompass expletive null subjects (subsuming for practical purposes all types of non-overt non-referential subject pronouns), subjects deleted under coordination, and referential null subjects ostensibly not deleted under coordination.24, 25 The scope of the present study will be limited to dealing exclusively with the third category. The present section will provide justification for this decision, as well as briefly presenting the three categories.

As mentioned above, some degree of use of referential null subjects is the norm among the world’s languages. Closely related to this is the fact that most languages do not feature overt non-referential pronouns of the kind exemplified in ():

() It is raining

Here, the non-referential pronoun it is used to (as it were) fill the subject slot to the left of the finite verb, despite the fact that the governing verb rain is a zero-place verb which does not take any arguments. The obligatory use of a non-referential pronoun empty of semantic meaning e.g. in structures referring to the weather or to time (referred to by e.g. Chafe  as ambient sentences) is characteristic of the Germanic languages.26 Unlike Gothic and ON, ‘nonreferential hit “it” was virtually obligatory’ with such verbs also at the OE stage of the language (Haugland : ). There were exceptions, though, and structures containing non-referential null subjects are attested also in OE ambient sentences. Consider examples () and ().

() & hit rinde ða ofer eorðan feowertig daga & feowertig
   and it rained then over earth forty days and forty
   nihta on an.
   nights on one
   ‘And it rained then over earth forty days and forty nights without cease.’ (Gen .)

22 Cf. e.g. Bopp (: ), where it is noted that ‘[t]he Latin verb, dat, expresses the proposition, he gives, or he is giving’, and ‘the letter t, indicating the third person, is the subject’. The commonness of such an analysis is also reflected e.g. in Jespersen (: –) where, in a discussion of what constitutes a sentence, it is noted that ‘[m]ost grammarians would probably analyze such Latin one-word sentences as “Canto” [ . . . ] as containing implicitly a subject [ . . . ]’. As can be seen, then, such early analyses represent unexpressed subjects in a fashion which bears at least some resemblance to generative pro-analyses.
23 Cf. e.g. Bresnan’s (: ) Lexical Functional Grammar, which considers a null pronoun to be ‘a functional anaphor (“pro”) which is not expressed in c-structure’. This framework does not permit underlying levels of syntactic representation, and thus the clause does not contain a node with a null NP. Null pronouns are, however, represented in f-structure. In this way, such pronouns can be assigned syntactic and semantic functions without being represented in the actual clause. While subjectless finite clauses do not have a subject in c-structure, then, it ‘is necessary to have an empty element in f-structure in order to satisfy the subcategorization requirements of verbs’ (Rosén : ).
24 But see further below, this section.
25 Another empty category, referred to as PRO (or big PRO) in generative syntax, is not relevant, since it only occurs in infinitival clauses and is permissible in pro-drop and non-pro-drop languages alike. This empty category will not be considered here.
26 Cf. Haiman (), to whom this ‘slot-filler’ theory may in large measure be ascribed.

() Ða cwom þær micel snaw & [Ø] swa miclum sniwde swelce
   then came there much snow and [it] so heavily snowed as-if
   micel flys feolle.
   much fleece fell.sbv
   ‘Then there came much snow, and it snowed so heavily that it seemed as if a lot of fleece was falling.’ (Alex .) (adapted from Haugland’s (: ) example (b))

Non-referential null subjects also occurred in other types of structure, but were far from categorical at the OE stage of the language (cf. Haugland ). Haugland (: ) notes that in weather constructions, ‘the variant with hit is almost obligatory already by early OE’. This forms a clear contrast to archetypal pro-drop languages such as Italian () or Spanish (), where weather verbs cannot take an overt subject (see, among many others, e.g. Haegeman : ; van Gelderen : ).

() [Ø] Piove
   [it] rains
   ‘It is raining.’

() [Ø] Llueve
   [it] rains
   ‘It is raining.’

This book addresses non-expression of referential subjects. Non-referential null subjects of the type exemplified above therefore fall outside the scope of this work. That is not to say, of course, that referential and non-referential null subjects have no mutual relevance. It has been observed by previous research that ‘the occurrence of referential null subjects entails that of expletive null subjects’, so that ‘if a language allows referential null subjects, it will allow expletive null subjects, but not vice versa’ (Huang : ).27 For a comprehensive empirical analysis of null and overt non-referential subjects in OE, the reader is referred to Haugland ().

The second category of subject omission encompasses referential subjects deleted under coordination. To be precise, this category includes cases where the subject of a conjunct clause (i.e. a main clause introduced by a coordinating conjunction) is deleted on the basis of being identical to the subject of the immediately preceding clause. Examples of OE and PdE clauses containing elided coordinated subjects are given in () and (), respectively.

() & he aras & [Ø] ferde to hys huse.
   and he rose and [Ø] went to his house
   ‘And he stood up and went home.’ (Mt (WSCp) .)

27 Note, however, that there are counterexamples to this generalization. As Huang (: ) points out, Platzack () shows that Älvdalsmålet allows referential null subjects but not null expletive ones.

() Linda left home early, and [Ø] arrived early for work.

As should be quite obvious, this type of subject non-expression (often referred to as conjunction reduction) is still grammatical in PdE. Hence, such instances will be excluded from consideration in this study.

The third category, then, encompasses cases where referential subjects are left unexpressed not due to conjunction reduction, but through some other mechanism. It is with this type of null subject that the present work will be concerned. Five examples are given below.

() Þa he þa Wigheard to Rome becwom, ær þon he to
   when he then Wigheard to Rome came, before that he to
   biscophade becuman meahte, [Sø] wæs mid deaðe forgripen,
   bishophood become might, [he] was with death afflicted
   ‘When he, Wigheard, came to Rome, he died before he could become bishop.’ (Bede .)

() Þæs on þæm æfterran geare Hannibal sende sciphere on
   this.gen on the following year Hannibal sent.sg ship-army on
   Rome, & [Sø] þær ungemetlice gehergeadon.
   Rome, and [they] there excessively ravaged.pl
   ‘In the year after this, Hannibal sent a fleet to Rome and they there excessively ravaged.’ (Or .)

() Oft eac gebyreð ðonne se scrift ongit ðæs costunga
   often also happens when the confessor hears-of the temptations
   ðe he him ondetteð ðæt [Sø] eac self bið mid ðæm
   which he to-him confesses that [he] also self is by the
   ilcum gecostod.
   same tempted
   ‘It happens often when the confessor hears of the temptations which he confesses to him, that he himself is tempted by the same thing.’ (CP .)

() Ne nimð hig nan man æt me ac [Sø] læte hig fram
   not takes it not-one man from me but [I] let-down it from
   me sylfum.
   me self
   ‘No man will take it from me, but I will lay it down myself.’ (Jn (WSCp) .)

() He befran ða hwam ða gebytlu gemynte wæron. swa
   he asked then for-whom the buildings meant were, so
   mærlice getimbrode; Him wæs gesæd. þæt hi wæron
   beautifully timbered. him.dat was said that they were
   gemynte anum sutere on romana byrig. And
   meant for-one.dat shoemaker.dat in Roman.gen.pl city. and
   hine [Sø] eac namode
   him [one] also named.sg
   ‘He then asked for whom the buildings were meant, that were so gloriously constructed. He was told that they were meant for a certain shoemaker in Rome, and one also named him.’ (ÆCHom II .)

As can be seen in ()–(), Sø can occur in a varied range of constructions. Instances of Sø can be found in non-conjunct main clauses (), in second conjunct clauses even when not co-referent with the subject of the immediately preceding clause (), (), and (), and in subordinate clauses (). In addition to the fact that Sø occurs in different clause types, the above examples show that both singular and plural pronouns may be realized as null (cp. () and ()), that the null subject may refer to different persons (cp. () and ()), and that Sø can have generic, in addition to specific, reference (). The omitted subjects in the examples given above thus correspond with the characteristics of referential null subjects of the type occurring in pro-drop languages and (as stated above) this is the type of null subject that will be investigated in this study.

A final comment should be made on imperative clauses. As is well known, imperatives constitute a morphosyntactic environment where non-expression of referential subjects is the rule in PdE. While imperatives are possible with an overt second person subject, they are by far most frequently subjectless. Consequently, imperative structures will be omitted from the scope of the study (see further section ..).

In summary, then, this book will concern itself with those occurrences of referential null subjects in early English which are incompatible with the rules of PdE, excluding from its scope expletive null subjects, instances of conjunction reduction, and subjectless imperatives.
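The scope restrictions summarized above can be pictured as a simple filter over subjectless finite clauses. The following is only an illustrative sketch: the `Clause` record, its field names, and the category labels are hypothetical and are not taken from the corpora or from the search files used for this study, and in practice many of these judgements require reading each example in context.

```python
from dataclasses import dataclass
from enum import Enum


class Omission(Enum):
    EXPLETIVE = "expletive"                  # non-referential, e.g. weather verbs
    CONJ_REDUCTION = "conjunction reduction"
    IMPERATIVE = "imperative"
    REFERENTIAL = "referential"              # the category investigated here


@dataclass
class Clause:
    # Hypothetical annotations for a finite clause without an overt subject.
    referential: bool                # does the missing subject refer to something?
    imperative: bool                 # is the clause an imperative?
    conjunct: bool                   # is it introduced by a coordinating conjunction?
    same_subject_as_preceding: bool  # co-referent with the preceding clause's subject?


def classify(clause: Clause) -> Omission:
    """Assign a subjectless clause to one of the categories named in the text."""
    if clause.imperative:
        return Omission.IMPERATIVE
    if not clause.referential:
        return Omission.EXPLETIVE
    if clause.conjunct and clause.same_subject_as_preceding:
        return Omission.CONJ_REDUCTION
    return Omission.REFERENTIAL


def in_scope(clause: Clause) -> bool:
    """Only referential null subjects outside conjunction reduction are kept."""
    return classify(clause) is Omission.REFERENTIAL
```

On this sketch, a clause like the second conjunct of ‘he rose and went to his house’ is filtered out as conjunction reduction, whereas a second conjunct whose null subject is not co-referent with the preceding subject (as in the Hannibal example) stays in scope.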

1.4 Data material and method

This section will give a brief introduction to the data material analysed in this book as well as the method by which it has been collected.28 The immediately following subsections will present the means of data collection (..), address the problem of representativeness (..), and provide general introductions to OE, ME and EModE as well as to the material under analysis (..).

28 The methodologically interested reader is invited to consult the online supplement to this book, which deals with data collection and analysis in a much more detailed manner than the outline offered here. The online supplement can be found at www.oup.co.uk/companion/rusten.

1.4.1 Data collection

Being a study of historical English syntax, the present work is necessarily corpus-based. The investigation relies on five corpora: three large corpora containing prose and two smaller corpora containing poetry. Sorted by period, these corpora are the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE) (Taylor et al. ), the York-Helsinki Parsed Corpus of Old English Poetry (YCOEP) (Pintzuk & Plug ), the Penn-Helsinki Parsed Corpus of Middle English (PPCME) (Kroch & Taylor ), the Parsed Corpus of Middle English Poetry (PCMEP) (Zimmermann ), and the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) (Kroch et al. ). These corpora are phrase-structure treebanks featuring syntactic and morphological annotation. The annotation scheme is comparatively neutral in terms of theoretical assumptions, albeit somewhat partial to early variants of generative syntax, and to a large extent uniform across the corpora. The CorpusSearch  programme (Randall –) was used to pull out all instances of null and overt referential pronominal subjects from all the texts contained in the corpora.29

The YCOE comprises c.. million words, and contains a substantial number of the extant OE prose texts. Use of this corpus thus provides an unprecedented opportunity for conducting a quantitative analysis of referential null subjects in OE. The texts in the corpus are based on the electronic versions created by the Dictionary of Old English project, and the corpus thus builds on the critical editions as opposed to the manuscripts. The YCOEP comprises , words, and is thus substantially smaller than the YCOE. That said, it contains  distinct texts or text excerpts, and it is thus substantial for a poetry corpus, especially when it is taken into consideration that poetry, almost per se, tends to be shorter in length than prose. The sample should thus be large enough to be representative.
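The extraction step described in this subsection can be pictured with a small sketch. It is not CorpusSearch query syntax and not the search used for this study (the actual search parameters are given in the online supplement). It only counts two kinds of subject node in a toy bracketed parse. The `*pro*` marker is the annotation the corpora use for null subjects, but the node labels `NP-NOM` and `PRO^N`, the `*con*` marker for conjunction reduction, and the toy trees themselves are assumptions made for illustration.

```python
import re

# Toy bracketed parses in the general style of Penn-type treebanks.
# Labels and trees are illustrative assumptions, not corpus data.
TOY_TREES = """
( (IP-MAT (CONJ &) (NP-NOM (PRO^N he)) (VBDI aras)))
( (IP-MAT (CONJ &) (NP-NOM *con*) (VBDI ferde)))
( (IP-MAT (NP-NOM *pro*) (BEDI waes) (VBN forgripen)))
"""

# A subject node whose only content is the null-subject marker.
NULL_SUBJECT = re.compile(r"\(NP-NOM \*pro\*\)")
# A subject node dominating a single overt nominative pronoun.
OVERT_PRONOUN = re.compile(r"\(NP-NOM \(PRO\^N [^()\s]+\)\)")


def count_subjects(text: str) -> dict:
    """Count null (*pro*) and overt pronominal subject nodes in bracketed text."""
    return {
        "null": len(NULL_SUBJECT.findall(text)),
        "overt": len(OVERT_PRONOUN.findall(text)),
    }


counts = count_subjects(TOY_TREES)
```

The node marked `*con*` is deliberately matched by neither pattern, mirroring the exclusion of conjunction reduction from the scope of the study.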
The PPCME comprises c.. million words, and contains  text samples covering the ME period (c.–c.). These texts represent a variety of (predominantly prose) genres and authors. The version of the PCMEP used (the most recent as of October ) comprised , words drawn from  ME poems.30 The PPCEME contains  text files totalling c.. million words and covers a period ranging from c. to . The number of genres and authors represented across the  files is substantial. .. Representativeness The problem of representativeness is a perennial one in historical linguistics. Consequently, some attention will be given here to the problem of sampling. The primary 29 Measures have been taken, as far as is possible, to ensure that the statistics for overt pronominal subjects include only instances of referential uses of hit ‘it’ in cases where the subject position is filled by this pronoun. Note, however, that it is impossible to exclude all instances of overt non-referential hit from the automated searches. This is true for all the corpora, so this caveat should be considered to be applicable to all of them. Cf. the online supplement to the book for details. 30 It has since been updated, and is now somewhat larger.

OUP CORRECTED PROOF – FINAL, //, SPi

Data material and method



focus in this section will be on OE, since the material from this period constitutes the main focus of the present book, even though the text samples for all periods are similar in size, as detailed above. The central issue, then, is whether the OE prose and poetic material utilized for this study is sufficiently large to allow for claims with regard to the situation for null subjects in OE as a whole. Small (: ) contends that ‘[w]hether the study is to be descriptive, historical, or comparative, the statistics upon which it is based should embrace all the available examples in Old English literature’. The reason for this requirement, which he admits may ‘at first seem too exacting’, is that ‘scattered and somewhat artificial groups of writings compose the body of Old English literature’, and that this ‘makes it dangerous to generalize upon selected examples’ (pp. –). I agree in principle with Small’s sentiment, and the advent of electronic corpora has provided an opportunity to at least approach fulfilling his requirement. At c.. million words, there can be little doubt that the YCOE is large enough to be representative for the preserved corpus of OE prose. According to Crystal (: ), the total corpus of OE texts compiled at the University of Toronto—containing all the preserved OE texts, but not every variant manuscript—comprises c.. million words. While the YCOE and YCOEP corpora cannot claim to be exhaustive, at roughly  the size of the Toronto corpus, they do contain the majority of the most expansive texts, as well as numerous shorter works. This indicates that the corpora are large enough to ‘scale up’. Of course, the ME and EModE corpora cannot lay claim to this kind of coverage: corpora of .–. million words are not particularly large compared to the total preserved output from the ME and EModE periods. 
Even so, the size of the historical corpora is still considerable, and this, along with their diversity in terms of time of composition, dialect, translation status, and authorship, should allow for representative claims to be made. In this connection, note that Jenset (: –), in a diachronic investigation of existential there, shows statistically that the YCOE is sufficiently large for drawing conclusions about comparatively frequent phenomena.31 Since null subjects are here investigated in the context of the entire population of subject pronouns in the five corpora, there are more than enough observations for conclusions to be drawn.

A special note should be made of the fact that in basing my analysis on these corpora, I am separating myself from the primary linguistic data by several layers of indirectness. As mentioned above, the text of the OE corpora is taken from the Dictionary of Old English project. That project, in turn, builds primarily on philological editions of relatively recent date, and not the extant manuscripts. Therefore, I tacitly accept the decisions made by the editors in the transition from manuscript to edition. This may be problematic, since, as Fischer (: –) points out, editors are known to have made mistakes, for example by ‘misanalysing’ earlier constructions by thinking of them as equivalent to modern constructions. There is potential for the editors to have made mistakes in interpreting the script, and it is also well known that editors have made alterations and emendations on various levels according to their knowledge and tastes. In principle, this problem is compounded by the fact that errors may also have occurred in the transition from editions to machine-readable corpora. Additionally, in relying on the syntactic and morphological annotation of the corpora, I accept the linguistic judgements of the corpus analysts and their computer tools. As should be clear, then, it is quite conceivable that the results for whatever phenomenon is under analysis may be affected first by various editorial decisions and then by the decisions of the corpus analysts.

For our purposes, then, it should be acknowledged that (for example) there exists a risk of editors having inserted referential subject pronouns where none were present in the manuscript. There is also a risk of the corpus analysts overlooking instances of referential null subjects, consequently not tagging them as such. Furthermore, if one accepts the analysis that oblique (i.e. non-nominative) arguments could function as subject, it is also possible that the corpus analysts have indicated the presence of a referential null subject where an accusative or dative argument can be analysed as the subject. These factors could conceivably affect the findings of the study and result in either over- or under-reporting of the occurrence of Sø. It should also be kept in mind that the manuscripts themselves may have been copied and re-copied by a number of scribes, ‘each with their own grammar, whose copies may differ both geographically and diachronically from the text of the original author’ (Fischer : ). While these are potential rather than definite sources of error, they need to be acknowledged here.

31 He demonstrates that ‘with a data set of   adverbs, we have probably captured a fairly good range of the variation in the (hypothetical) population of utterances [with locative adverbs – kar]’ (Jenset : ).
While it may seem like an insurmountable problem that I am removed from the original language users by layers of intermediary scholars, and therefore layers of indirectness, there are obviously numerous advantages to the corpus-based approach. First, the use of annotated corpora ensures replicability, particularly when the scholar allows transparency in terms of the search parameters used.32 Secondly, the corpora employed were not annotated specifically for my study, instead being annotated in such a way as to facilitate investigation of any number of linguistic phenomena, in light of any theoretical framework, and this is a factor which contributes to the objectivity of the study. Finally, on a much more pragmatic note, it is a considerable advantage that use of corpora is hugely time-efficient as compared to combing through the manuscripts by hand. Since, as Fischer (: ) points out, frequency counts are a way of ascertaining ‘whether a particular construction has become or is becoming grammatical [or, for that matter, ungrammatical – kar] in the course of time’, it is of vital interest to use as large a quantitative material as possible. Use of corpora represents the only way to achieve this volume in a reasonable time frame.

32 For this, see the online supplement to the book, where search parameters are presented in detail.
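To illustrate why the volume of material matters for frequency counts, the sketch below computes the relative frequency of null subjects among all pronominal subjects, together with a Wilson score interval for that proportion. This is an assumption-laden illustration: the Wilson interval is chosen here for simplicity and is not claimed to be one of the statistical measures applied in this book, and the counts in the usage lines are invented.

```python
import math


def null_subject_rate(null_count: int, overt_count: int) -> float:
    """Proportion of null subjects among all (null + overt) pronominal subjects."""
    total = null_count + overt_count
    if total == 0:
        raise ValueError("no subject tokens to count")
    return null_count / total


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if total == 0:
        raise ValueError("no observations")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Invented counts: the same 2% rate observed in a small and in a large sample.
small_low, small_high = wilson_interval(2, 100)
large_low, large_high = wilson_interval(200, 10_000)
```

With the same observed rate, the interval for the larger sample is considerably narrower, which is the practical sense in which a larger quantitative material supports firmer conclusions about frequency.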

Fischer (: ) raises another concern linked to the use of corpora in historical linguistics, namely that ‘the examples as they come up on the screen are divorced from context’. ‘Ideally,’ she says, ‘one should check every example in context, which is hardly feasible’ (Fischer : ). In many cases this is no doubt true, but given that the occurrence of referential null subjects is a restricted phenomenon in OE (as shown in Rusten , , a and Walkden , , c), it is more than feasible, and also imperative for the purposes of this study, to investigate every example in context. Jenset (: ) suggests another approach: ‘to use corpus linguistics and statistics to fill out the gaps and make estimations given what we can infer from the data’. As he points out, ‘a combination of large amounts of data and statistical models can yield estimates that are both informative and robust with respect to the main tendencies in language change’ (Jenset : ). This is a view which will be taken up in the present book, which, as a consequence, will implement both solutions to offset the weaknesses inherent in the material. .. Old English The designation Old English is used as a collective term for the dialects spoken and written (chiefly) by the Germanic population of Britain in the period c.–c. ad. While textual evidence of this language first appears c., scholars of OE also consider an undocumented Prehistoric Old English period stretching from the time of the initial Germanic settlement of Britain in the fifth century up to the production of the first preserved texts. Distinction is commonly made between an early (c.–) and a late period of OE (c.–). The Germanic settlement of Britain comprised several West Germanic tribes, including Angles, Saxons, Jutes, and Frisians. 
These bands of soldiers and settlers spoke closely related, but distinct, varieties of West Germanic—a fact reflected in several similarly distinct OE dialect areas. Most of the extant OE texts are written in the West Saxon dialect, but there are also in evidence texts displaying Mercian and Northumbrian (collectively referred to as Anglian) and Kentish varieties of OE. Knowledge of Northumbrian, Mercian, and Kentish is based on very few sources compared to West Saxon (Gneuss : ). Even for West Saxon, there remains only limited data, but the corpus of preserved OE texts is still comparatively rich. According to Robinson (: ) ‘Old English is second only to Old Norse in the volume and variety of texts’. In addition to prose texts, the preserved OE corpus includes a relatively extensive verse tradition, as well as interlinear glosses and runic inscriptions, and Greenfield & Calder (: ) say that in ‘no other medieval vernacular language does such a hoard of verbal treasures exist for such an extended period’. The texts in the YCOE belong to a wide variety of genres, among them histories, homilies, saints’ lives, legal texts, handbooks, letters, and prayers; and although the literate clergy no doubt are the ones responsible for the transmission of OE literature,

OUP CORRECTED PROOF – FINAL, //, SPi



Introduction

the texts exemplify both religious and secular prose.33 Most of the texts are translations of Latin originals, but there are also numerous texts originally composed in the vernacular. One notable example is the Anglo-Saxon Chronicle, which is referred to by Magoun () (cited in Garmonsway : xv) as ‘the most important work written in English before the Norman Conquest’. While the majority of the YCOE texts are West Saxon ones, a number of texts display other dialect features. Therefore, while keeping in mind the limited evidence from non-West Saxon dialects, it should in principle be possible to ascertain quantitatively whether Walkden’s (; ; c) observations about dialectal variation with regard to the licitness of null subjects hold. The text of the YCOEP is taken from the OE section of the Helsinki Corpus of English Texts (Rissanen et al. ), which, like the Dictionary of Old English project, builds on the critical editions. Most of the poems are included in full, but some, such as Andreas, Christ, Genesis, The Meters of Boethius, and Riddles, are included in sample format only. The texts and text excerpts in the YCOEP represent a range of dialects, including West Saxon, Anglian, and Kentish.34 Like the prose texts, many verse texts are translations or adaptations of Latin originals, yet a number of poems composed in the vernacular—most notably Beowulf —are also included. The corpus compilers also point out that the ‘texts included in the corpus represent a range of dates of composition’.35 In this work, I will for all practical purposes endorse this statement; but it should be kept in mind that dating of OE poetry is highly contested, and this claim is thus somewhat controversial (see section .., where this issue is addressed further). 
For a complete list of the texts contained within the YCOE/P, the reader is referred to the web pages of the corpora.36 The abbreviations used to refer to the OE texts in this book are the standard abbreviations employed by the Dictionary of Old English project (cf. also Healey & Venezky ).37

33 It can, however, be noted that the YCOE is slightly skewed in favour of religious prose.
34 Although this should not be taken to mean that the poems are dialectally ‘pure’: they certainly show a mixture of dialect features.
35 The quote is taken from http://www-users.york.ac.uk/~lang/pcorpus.html.
36 http://www-users.york.ac.uk/~lang/YCOE/info/YcoeText.htm and http://www-users.york.ac.uk/~lang/ptext-list.html.
37 There are certain exceptions to this. First, Charters and Wills collected from various texts have been labelled Docu , , etc., in accordance with the YCOE files. The same is true for the texts labelled Heptateuch and Vercelli Homilies. Heptateuch, as the name suggests, contains prose versions of Genesis, Exodus, Leviticus, Numbers, Deuteronomy, Joshua and Judges. Vercelli Homilies contains a number of homilies and saints’ lives, specifically HomS (ScraggVerc , , , , , , , , , , , ), HomU (ScraggVerc , , , , , ), HomM (ScraggVerc , ), and LS (ScraggVerc , ). When reference is made to specific citations, the standard DOE abbreviation for the text in question has been used.

1.4.4 Later stages of early English

The status of English changed drastically after the Norman Conquest of . Late OE lost its status as a literary and administrative language, and was largely replaced by Latin and Norman French as concerns official use. As a consequence, written English was ‘for much of the period’ used as a ‘local [language], catering for local literary tastes’ as well as ‘for the contemporary equivalent of primary education’ (Smith : ). No account will be given here of the numerous and substantial linguistic changes initiated and/or completed during this period, or of their causes. Suffice it to say that post-Conquest English eventually came to exhibit considerable differences from OE in the domains of syntax, morphology, phonology, vocabulary, and orthography.

The designation Middle English, then, is usually taken to refer to the varieties of English spoken and written between c. and . Distinction is commonly made between an early (c.–) and a late stage of ME (c.–), and late ME significantly outweighs early ME in terms of output. The Chancery standard was established during late ME, and Caxton’s printing press was established towards the very end of the period. Thus, late ME ‘pave[d] the way for the Modern period, in view of both literature and nonliterary writings, and of language development’ (Nevanlinna et al. : ).

The Early Modern English period, characterized e.g. by the stabilizing influence of the printing press and increasing literacy, is typically delineated as covering the span c.–. At the end of the period, English had more or less achieved its modern form, and texts from this stage are largely intelligible to native speakers of PdE.

The PPCME texts are quite diverse in terms of dates of composition, dialect, and authorship, but the corpus is skewed in the direction of late ME as a consequence of the available data.
The PPCME does contain important early texts such as the Ormulum, the Peterborough Chronicle, and the Katherine group texts, but there is a particular lack of texts composed in the period c.–. The  texts in the PCMEP can to some extent be used to counter this diachronic skewedness, since this corpus contains a number of verse texts from this period (as noted by Zimmermann , and as also pointed out in Walkden & Rusten : ), such as The Owl and the Nightingale, Havelok the Dane, and The Simonie. The PPCEME texts represent a wide range of authorships, dialects and prose genres. For example, the corpus contains several lengthy excerpts from the correspondence of the Barrington, Hatton, and Oxinden families, as well as texts produced by important Renaissance authors, including Shakespeare, Bacon, and Middleton. Compared to the YCOE, however, which in many cases samples extensive texts in their entirety, samples from individual texts are rather short in the PPCEME. The sample from Shakespeare’s The Merry Wives of Windsor, for instance, includes , words, and while this is a substantial sample, it is much smaller than e.g. Ælfric’s Lives of Saints (,  words).


For complete lists of the texts in the PPCME, PCMEP, and PPCEME, as well as bibliographical details, the reader is directed to the web pages of the corpora.38

1.5 Outline of the book

The book is structured as follows. Chapter  presents quantitative surveys of the occurrence of referential null subjects in the OE prose and verse data. On the basis of these surveys, the chapter assesses the question of whether OE was a canonical pro-drop language. Chapter  investigates the dialect-split hypothesis of Walkden (, , c), i.e. the hypothesis that Anglian OE had a partial pro-drop property, by testing statistical associations between the occurrence of null subjects and the non-linguistic variables of dialect, genre, period, and translation status. Chapter  provides a quantitative investigation of the morphosyntactic characteristics of null subjects in OE. Chapter  addresses the question of what may have licensed the occurrence of null subjects in OE, with special focus on antecedent accessibility and the generative analyses proposed by van Gelderen (, ) and Walkden (). Chapter  investigates the occurrence and linguistic characteristics of null subjects in Middle and Early Modern English. Finally, Chapter  summarizes and concludes.

38 Cf. http://www.ling.upenn.edu/histcorpora/PPCME-RELEASE-/, http://pcmep.net/texts.php and http://www.ling.upenn.edu/histcorpora/PPCEME-RELEASE-/


2 Referential null subjects in Old English . Introduction The present chapter will address the question of whether OE was a canonical pro-drop language. Placing under analysis the YCOE and YCOEP corpora in their entirety, I will provide quantitative overviews of the occurrence of overt and null referential subjects in OE prose and poetry (section .). On the basis of these data, I will, in section ., offer a re-evaluation of van Gelderen’s claims that ‘[i]n Old English [. . .] pro-drop is quite common’ (van Gelderen : ), and that OE was ‘a genuine pro drop language’ (van Gelderen : ). First, however, section . will briefly present two types of subjectless structure which ultimately were omitted from consideration, despite being annotated as ∗ pro∗ in the corpora.

2.2 Three types of subjectless structure

When collecting data, I ran corpus searches designed to find all occurrences of the empty category tagged *pro* in the YCOE and YCOEP. On the basis of a detailed examination of each example in context, a tripartite division was observed with regard to the collected tokens: (i) subjectless ‘imperative-like’ jussive subjunctive structures, (ii) naming structures with the verb hatan ‘be called’ which contain an empty subject relative, and (iii) subjectless clauses featuring a non-overt referential subject corresponding to the pro-drop phenomenon. Only category (iii) is of substantial interest in the context of this work. Indeed, the corpus analysts are careful to point out that the *pro* label ‘is not meant to indicate “small pro” in any theoretical sense, although it may include such cases’. What is meant by the *pro* label is simply that the null non-expletive token so annotated ‘is not exactly co-referent with the labelled subject in the previous clause or token’. The annotators also point out that it ‘is left to the interested investigator to determine the appropriate analysis (or analyses)

Referential Null Subjects in Early English. First edition. Kristian A. Rusten. © Kristian A. Rusten . First published in  by Oxford University Press.


of such subjects’.1 Consequently, Chapters – will mainly be dedicated to an in-depth treatment of category (iii). However, in the interest of exhaustiveness, categories (i) and (ii) will receive detailed attention in the immediately ensuing sections .. and ...2 They will thereafter be excluded from consideration.

2.2.1 ‘Imperative-like’ subjunctives

A substantial proportion of the collected tokens of *pro* occur in subjunctive structures which express exhortations, orders, suggestions, or wishes, and which are notably similar to imperatives, both structurally and functionally. This similarity is a problem in the context of the present work, which concerns itself solely with such instances of subject omission as would be ungrammatical in PdE. As noted previously, imperatives constitute one of few syntactic environments where PdE permits structures with an omitted subject. As a consequence, there is good reason to exclude these subjectless subjunctive structures from consideration on the basis of their similarity to subjectless imperatives.

The first order of business, however, is to actually establish the similarity of such subjunctives to genuine imperatives. For this purpose, two PdE imperatives are given in () and () below. The abbreviation Sø.imp denotes the non-overt subject position in an imperative clause.

() [Sø.imp ] Leave at once.
() You leave at once.

As the examples demonstrate, PdE imperatives may have overt or non-overt subjects. The non-overt variant is by far the most frequent: in a study on imperatives in A Corpus of English Conversation (Svartvik & Quirk ), Aarts (: ) finds that . of what he calls [–LET]-imperatives do not have an overt subject. Variation between these two pronominal variants is observed also in OE. Negative imperatives normally feature an overt pronominal subject, while positive imperatives most commonly do not (Mitchell a: –, –).
Even so, positive imperatives featuring overt pronominal subjects are also observed. All three variants are illustrated below:

() Ne wep þu:
not weep.imp.sg you
‘Do not weep.’ (ÆCHom I .)

() Gif ðu hælend crist sy. Gehæl [Sø.imp ] ðe and us;
if you saviour christ be. save.imp.sg [you] you.refl and us
‘If you are the Saviour Christ, save yourself and us.’ (ÆCHom II .)

1 The quotes are taken from the entry on empty subjects at http://www-users.york.ac.uk/~lang/YCOE/doc/annotation/YcoeLite.htmsyntactic_labels.
2 These sections draw heavily from section . in Rusten ().




() Ondswarede he him: Gif he Godes man sy, fylgað
answered he them: if he God’s man be, follow.imp.pl
ge him.
you him
‘He answered them: “if he is God’s man, follow him”.’ (Bede .)

As concerns the syntax of OE imperatives, it may be observed that positive imperatives often have subject–verb inversion when the pronoun is overt. This is illustrated in example (). By analogy, it can be assumed that the post-verbal position is also the most likely position of the non-overt subject.3 Morphologically, the imperative mood in OE has distinct inflectional endings only in the second person singular (Ø) and plural (-aþ). Quirk & Wrenn (: ) thus maintain that the ‘imperative proper exists only in the second person singular and plural’. They allow that there may exist a very rare first person plural form. For exhortations or orders in the third person, the subjunctive mood takes the function of the imperative, according to them (). The result is constructions which resemble imperatives to a significant degree. These structures should not, however, be conflated with genuine imperatives, a fact which Mitchell (a: ) stresses by ‘unrepentantly’ referring to this verbal category as the ‘jussive subjunctive’.

The similarities between the imperatives in () and () above and the subjunctives given in () and () below should be obvious. The abbreviation Sø.jus denotes the non-overt subject position in a jussive subjunctive structure.

() Drihten. gehæle [Sø.jus ] me
lord, save.sbv [you] me
‘Lord, save me.’ (ÆCHom I .)

() Gif preost biscopes agen geban forbuge, gilde [Sø.jus ] XX or.
if priest bishop’s own decree decline.sbv, pay.sbv [he]  or
‘If a priest does not comply with a bishop’s own decree, let him pay  or.’ (LawNorthu )

Example () exemplifies a clause highly similar to that in (), in terms of both semantics and morphosyntax. As concerns semantics, both constitute pleas for salvation. The main morphosyntactic difference between the two, abstracting away from the subordinate clause preceding the imperative main clause verb in () and the vocative preceding the subjunctive clause in (), is the imperative Ø-ending of the verb in () and the subjunctive -e ending in (). In both cases, the main clause is subjectless, with the verb preceding the position in which it may be assumed that the subject would occur. The likely post-verbal position of the subject may be established by analogy

3 The subject pronoun could, however, also precede the verb; see e.g. Mitchell (a: ).


with overt subjects in the relevant structures, as illustrated in examples () and () below (compare also with the overt imperative subject in example ()):4

() gielde he fulwite
pay.sbv he full-fine
‘Let him pay the full fine.’ (LawIne )

() Monnes cinban, gif hit bið toclofen, geselle mon XII
man’s cheekbone, if it be cloven, give.sbv one twelve
scillinga to bote.
shillings to boot
‘If the cheekbone of a man be split, give twelve shillings in compensation.’ (LawAf .)

Example () likewise bears great similarity to (), since both clauses are introduced by an adverbial clause of condition, followed by a subjectless imperative or subjunctive main clause. Again, the main morphosyntactic difference between the two is the contrast between the imperative Ø-ending of the verb in () and the subjunctive singular -e ending in (). The structure exemplified in () and (), where an introductory adverbial clause of condition is followed by an imperative or subjunctive main clause, is commonly observed for both variants. Thus, while there certainly are inflectional differences, there is little to indicate that subjectless jussive subjunctive structures should be treated as fundamentally different to subjectless imperative clauses syntactically.

Example () offers further illustration of the similarity between imperatives and jussive subjunctives, displaying as it does two parallel examples where the same verb, hælan ‘heal, save’, occurs twice in consecutive clauses, once in the imperative and once in the subjunctive. Note that the subject referent is the same in both clauses, i.e. God.

() Hie cwædon, hæl [Sø.imp ] us on þon hehstan, efne swa swa
they said, save.imp [you] us on the highest, even just as
hie openlice cwædon, Hæle [Sø.jus ] us on eorþan, þu þe
they openly said, save.sbv [you] us on earth, you who
godcund mægen hafast on heofenum.
divine might have in heaven
‘They said: “save us in the highest”, even as they openly said, “save us on earth, you who have divine might in Heaven.” ’ (HomS  (BlHom ) .)

4 A possible exception to this regularity is found in constructions with the indefinite mon ‘one’, which occasionally occurs in the final position, as exemplified in XXX scillinga geselle him mon ‘one should give him  shillings’ (LawAf ). However, it should be mentioned here that mon exhibits characteristics of both pronouns and full NPs. For instance, mon can appear in clause positions where pronouns normally do not occur. This has caused some to consider mon a fully nominal element. Van Bergen (: ), however, concludes that mon must be considered pronominal.




Imperative-like jussive subjunctives most commonly have third person reference, but second person reference also occurs, even though structures with a second person subject which express exhortations or orders are usually the domain of the imperative. Illustrations are given in () and () above and in () below:

() Gemyne [Sø.jus ] þæt ðu gehalgige þone ræstedæg;
remember.sbv [you] that you hallow the rest-day
‘Remember that you keep the sabbath-day holy.’ (LawAfEl )

In summary, then, the jussive subjunctive structures which are the focus of the present section resemble imperatives to a notable extent: the overt subject, and presumably also any possible non-overt subject pronoun, is often post-verbal, and frequently located to the immediate right of the verb in linear order. Jussive subjunctive and imperative main clauses often contain embedded conditional clauses preceding the position of the predicate verb of the main clause. Semantically, both types of structure express commands, suggestions, encouragements, and exhortations. The non-overt subjects occurring in such subjunctive structures thus arguably represent a syntactic ‘behaviour’ which is comparable to that of subjectless imperatives. On this basis, then, imperative-like subjunctives such as those illustrated above will be excluded from consideration.

At this stage, I should point out that a small number of the subjectless clauses analysed as jussive subjunctives here are somewhat less imperative-like than the unambiguous examples presented above. Consider example ():

() [. . .] þæt ilece gebed þære ylecan endebyrdnesse sy geweorðod,
[. . .] the same prayer in-the same manner be recited,
þæt is mid ferse and mid imene þæra sylfra tida mid
that is with versicle and with hymns of-the same time with
þrim sealmum, mid rædinge and ferse and kyrrieleyson, and
three psalms, with reading and versicle and Kyrie-Eleison and
swa mid gebede beon [Sø.jus ] geendode.
thus with prayers be.sbv.pl [they] ended
‘[. . .] the same prayer should be recited in the same order, that is with versicle and with hymn and with three psalms at the same time, with readings and versicle and Kyrie Eleison, and let them be ended with prayer.’ (BenR ..)

The meaning of the subjectless subjunctive in () is that of an imperative, i.e. an instruction to ‘end them with prayer’. It might be viewed as a somewhat complicating factor that the subjunctive plural beon constitutes a complex verb phrase together with the participle geendode, while verb phrases in imperatives and jussive subjunctives are typically simple. However, as pointed out by Mitchell (a: ), there are examples of OE imperatives being part of ‘periphrases with the present participle’. Since there is parallelism between imperatives and jussive subjunctives also with


respect to complex verb phrases, it was deemed acceptable to exclude examples such as that in () from consideration. I have also found scattered examples of non-overt subjects in subjunctive structures where the verb is not agentive, as in (), unlike what is typically the case in imperatives and jussives. Again, it seems acceptable to view these clauses as corollaries to imperatives, as it may be argued that PdE allows unagentive have in such structures, as exemplified in ().5

() & gyf seo lad þonne berste, hæbbe [Sø.jus ] þone ylcan
and if the defence then bursts, have [he] the same
dom, þe se þe þæt fals worhte.
doom, as he who that forgery worked
‘And if the defence then fails, let him have the same punishment as he who worked the forgery.’ (LawIICn:.)

() Have a cookie!

It should be evident that the examples presented in () and () differ slightly from the ‘straightforward’ jussive subjunctives in () and () in terms of morphosyntactic behaviour. There are, however, also notable similarities: the non-overt subject, arguably, occurs in post-finite position in both types of structure, and the sense is clearly imperative in both cases. Thus, examples such as () and () have been counted along with the more unambiguous jussives, and henceforth disregarded.

The exclusion of jussive subjunctives has real consequences for the remainder of this study, since an entire , of the , (.) instances of *pro* initially collected from the YCOE are of the jussive subjunctive variety. Non-overt subjects in jussive subjunctives are found in  text files in the corpus, but the overwhelming majority of them are concentrated in instructional prose: , of the , subjectless jussives occur in ecclesiastical and secular laws, monastic rules, and medical handbooks.
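The bookkeeping behind such exclusions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names, the helper functions `summarize` and `retained`, and the toy counts are all hypothetical, and the classification itself is assumed to have been done by hand through examination of each token in context.

```python
from collections import Counter

# Hypothetical labels for the three-way division of *pro* tokens.
JUSSIVE = "jussive_subjunctive"
SUBJECT_RELATIVE = "subject_relative"
REFERENTIAL = "referential_null_subject"


def summarize(labels):
    """Return {category: (count, share of all tokens)} for hand-assigned labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: (n, n / total) for cat, n in counts.items()}


def retained(labels):
    """Keep only the tokens that survive the two exclusions."""
    return [lab for lab in labels if lab == REFERENTIAL]


# Toy data with invented counts; these are NOT the figures reported for the YCOE.
toy = [JUSSIVE] * 6 + [SUBJECT_RELATIVE] * 1 + [REFERENTIAL] * 3

for category, (n, share) in summarize(toy).items():
    print(f"{category}: {n} ({share:.1%})")
print(len(retained(toy)), "tokens retained")
```

Keeping the excluded tokens labelled rather than deleting them makes it possible to report each category's share of the initial collection and to revisit borderline cases later.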
It may be commented at this point that a notable difference between Walkden (, , c) and the present study is that Walkden investigates null subjects occurring in indicative clauses exclusively, disregarding all null subjects occurring in subjunctive clauses. Blanket exclusion of occurrences of Sø in subjunctive clauses is problematic, however, since even though the vast majority of the occurrences of Sø in subjunctives are of the ‘imperative-like’ type, a number of them may only with quite some difficulty be viewed as imperative-equivalents. An example is given in ():

5 Although I realize that it certainly may be discussed whether () should be understood as ‘receive a cookie’ (unagentive) or ‘take a cookie’ (agentive).




() & gyf he þæt gelæste, þonne bið he wyrðe, þæt hine man
and if he that keeps, then is he worthy, that him one
þe bet healde, wunige [Sø ] þær he wunige.
the better hold, dwell.sbv [he] where he dwell
‘And if he keeps that, then he is worthy to be considered better, no matter where he dwell.’ (LawVAtr .)

In this example, the non-overt subject occurs in a subordinate clause, unlike the examples occurring in main clauses given above. Additionally, the clause is concessive, and cannot be construed as imperative-like. Hence, I have elected to retain this and similar examples in my data material, as such examples have more in common with genuine pro-drop than do subjectless jussives. Likewise, examples such as the first of the two non-overt subjects in () were also judged valid for inclusion in my data.

() Gif [Sø ] ðonne on gafolgeldan huse oððe on gebures
if [anyone] then on rentpayer’s house or on peasant’s
gefeohte, CXX scillinga to wite geselle [Sø.jus ] & þam gebure
fight, CXX shillings in fine give [he] and the peasant
VI scillinga.
VI shillings
‘If anyone then fights in the house of a rent-payer, or a peasant, let him pay  shillings as a fine and six shillings to the peasant.’ (LawIne .)

2.2.2 Non-overt subject relatives in naming structures with hatan

Another distinct group of non-overt subjects annotated as *pro* by the YCOE analysts may reasonably be analysed as omitted relative pronouns or relative particles in contact relative clauses,6 as opposed to null subjects analogous to personal pronouns. These non-overt relativizers can be analysed as replacing the subject or as themselves functioning as subject in a contact relative. Two examples are given below. The abbreviation Sø.rel signifies a non-overt subject relative.

() þa bæd his fæder, [Sø.rel ] wæs eac Fauius haten, þæt
then begged his father, [who] was also Favius called, that
þa senatum forgeafen þæm suna þone gylt
the.pl senator.nom.pl forgive the.dat son.dat the.acc guilt.acc
‘Then his father, who was also called Favius, begged that the senators forgive the son his guilt.’ (Or  ..)

6 Use of the term contact relative to describe a relative clause which immediately follows its antecedent without the presence of a relativizer appears to be due to Jespersen (). Others have used terms such as asyndetic relative construction (Curme ) and zero construction (Visser ).


() On þes ilca Offa dæi. wæs an ealdorman [Sø.rel ] Brordan
on this same Offa’s day, was an alderman [who] Brordan
wæs gehaten.
was called
‘In the time of this very Offa, there was an alderman who was called Brordan.’ (ChronE .)

The corpus annotators take the view that the clauses in question are parenthetical main clauses. On this analysis, the empty subject position is filled by *pro*, as opposed to a subject wh-trace, which means that () should be understood as ‘in the time of this very Offa, there was an alderman; [he] was called Brordan’. In illustration, the corpus tree structure for () is given in ():

()

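[Tree diagram: the matrix clause (IP-MAT) contains the PP on þes ilca Offa dæi, the finite verb wæs, the NP-NOM an ealdorman, and a parenthetical main clause (IP-MAT-PRN) whose constituents are an NP-NOM filled by *pro*, the NP-NOM-PRD Brordan, and the verbal elements wæs gehaten.]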

Despite the corpus annotation, however, I have elected to analyse clauses such as those in () and () as subjectless relative clauses, as indicated above. This view is also taken by e.g. Dekeyser (b: ), who says that while OE features only ‘sparse examples of SCC [subject contact clauses, i.e. contact relatives with omitted subjects – kar]’, these are found to occur ‘mainly with the verb hatan [. . .] or a copula’. Mitchell (b: ) also adopts this analysis, noting that the ‘[a]pparent absence of a relative pronoun in a definite adjective clause’ occurs ‘most frequently in OE with forms of the verb hatan [. . .] viz. hatte, hatton’ when the missing element has nominative case. He also notes that ‘[i]ndisputable examples involving other verbs are few’ (Mitchell b: ).7 Mitchell (a: –) also gives examples of this type of relative clause in a section on OE ‘naming constructions’, hence the title of the present section. The same analysis of this type of clause, i.e. as relative clauses, is also adopted in works such as Erdmann (), Dekeyser (a), Poppe (), Walkden (), and Rusten ().

7 See () and ().




On this analysis, then, there is no referential null subject in these relative clauses, but rather an empty relativizer. Empirical evidence favouring this analysis can be adduced from parallel clauses which contain an overt relativizer. () provides an example:

() Ðær wæs þa sum eald man in Hierusalem in þære byrig in
there was then some old man in Jerusalem in the town in
þa ilcan tid se wæs haten Simeon.
the same time who was called Simeon
‘There was then in the town of Jerusalem at the same time an old man who was called Simeon.’ (LS  (PurifMaryVerc ) )

Here, the pronoun se arguably serves as relativizer. As Mitchell (b: ) points out, however, one cannot be certain that se in examples such as () is a relative, as opposed to a demonstrative, pronoun. If se is considered a demonstrative pronoun in example (), a case could be made that the subject position in () is indeed filled by a null pronominal subject.8 Even so, it is well known that the demonstrative pronoun can be used as a relativizer, both alone and in combination with a form of the relative particle þe. Consequently, it could be considered evidence in favour of an analysis of se as a relativizer in () if this pronoun could be shown to combine with the relative particle þe in the same type of structure. Such cases are in evidence, as shown in ().

() þa werun gesomnade alduras sacerdas & þa aeldra þæs
then were gathered elders of-the-priests and the elders of-the
folkes in cæfertun þæs aldorsacerdæs seþe wæs haten Caifas.
people in palace of-the high-priest who was called Caifas
‘Then were gathered the elders of the priests and the elders of the people in the palace of the High Priest, who was called Caiaphas.’ (MtGl (Ru) .)

Moreover, a simple search for the string þe hatte in the Dictionary of Old English Web Corpus yields  hits in structures such as that exemplified in ():

() On þære halgan bec þe hatte uita patrum us segð
on the holy book which is-called vita patrum us says
swutellice þæt [. . .]
plainly that [. . .]
‘In the holy book which is called Vita Patrum, it tells us clearly that [. . .]’ (ÆHom  )

8 It seems straightforward to assume that a demonstrative pronoun may constitute an overt corollary to a referential null subject, as in Þæt wæs god cyning ‘That was a good king’ (Beo ). Such subjectless clauses could then reasonably be analysed as containing pro.


Here, þe is clearly a relativizer. Thus, all three strategies for overt relativization in OE can be observed in naming structures with hatan. This bolsters the analysis that the empty position in such clauses is filled by a subject relative, and not a referential pronominal subject. Furthermore, since the example in () is taken from the glosses in the Rushworth Gospels, it provides an interesting opportunity for cross-linguistic comparison. The Latin version reads:

() Tunc congregati sunt principes sacerdotum et seniores populi
then congregated are leaders of-priests and elders of-people
in atrium principis sacerdotum qui dicebatur Caifas
in palace of-head of-priests who was-called Caifas
‘Then the chiefs of the priests and the elders of the people congregated in the palace of the high priest, who was called Caiaphas.’ (MtGl (Ru) .)

The Latin qui is unambiguously a relative pronoun, and it seems a straightforward assumption that the OE combination of se and þe in () has been chosen by the glossator in order to introduce a relative clause. The evidence presented above therefore prompts the conclusion that the empty position in these naming constructions is indeed filled by a non-overt relativizer instead of a null subject, and hence these tokens will be excluded from further consideration. Thus, the same approach is taken here as in my previous work (e.g. Rusten , ), as well as in Walkden (, , c).9

As mentioned above, non-overt subject relatives are most frequently found in structures containing the verb hatan. Also attested in my data, however, are scattered examples of non-overt subject relatives co-occurring with other verbs. See () and () for exemplification. Such tokens have been counted as equivalents to the examples discussed above, since overt subject relatives obviously also occur with verbs other than hatan.

9 The above notwithstanding, I would be remiss if I did not mention that examples exist which conform with the analysis of the corpus annotators. One such example is given in (i) (I am grateful to one of the anonymous reviewers for supplying this example).

(i) Þa æfter þam he uteode & geseah Publicanum, he wæs oþrum naman
then after that he out-went and saw publican.acc, he was other.dat name.dat
Leui gehaten æt ceapsceamule sittende, & he cwæþ to him, filig me.
Levi called at toll-booth sitting, and he said to him, follow me
‘Then after that he left and saw a publican (he was called Levi by another name) sitting at a tollbooth, and he said to him “follow me”.’ (Lk (WSCp) .)

In this clearly parenthetical clause, the subject of the VP wæs gehaten is a personal pronoun. If this clause had been subjectless, the analysis exemplified in example () would be appropriate. The existence of such examples does not seriously damage my conclusions about clauses with hatan, even though the relative clause analysis may not be the only one possible. The key point is that apparent instances of an omitted referential subject can be analysed in ways not involving pro-drop.




() Nu we spræcon be cynegum we willað þysne cwyde
now we speak of kings we will this discourse
gelencgan, and be sumum cynincge eow cyðan git, [Sø.rel ]
lengthen, and about some king you tell yet, [who]
Abgarus wæs geciged, sum gesælig cynincg on Syrian lande.
Abgarus was called, some blessed king on Syria land
‘Now that we speak of kings, we will lengthen this discourse, and tell you of yet another King, who was called Abgarus, a certain blessed king in the land of Syria.’ (ÆLS (Abdon and Sennes) )

() Ða het he hie seman, ða wæs ic ðara
then commanded he them.acc.sbj settle.inf, then was I of-the
monna sum ðe ðærto genemned wæran, & Wihtbord &
men some who there-to appointed were, and Wihtbord and
ælfric, [Sø.rel ] wæs ða hrælðen [. . .]
Ælfric, [who] was then robe-thane [. . .]
‘Then he ordered them to come to an agreement; then I was one of the men who were appointed for that, and Wihtbord and Ælfric, who was then the Keeper of the Wardrobe [. . .]’ (Ch  (HarmD ))

 of the , (.) initially collected tokens of *pro* are non-overt subject relatives. These are largely concentrated in historical narratives which introduce characters and places:  tokens occur in historical works. Omitted subject relatives are nevertheless also found in various other genres, including Biblical translations, Saints’ Lives, homilies, and medical texts.

2.3 Referential null subjects in Old English

Having excluded the tokens of *pro* discussed in the immediately preceding sections, the focus of attention now shifts to omitted subjects which are correlates to the empty category discussed in the extensive literature on the pro-drop property, i.e. non-overt obligatory referential subject pronouns occurring in finite clauses. Sections .. and .. give overviews of the occurrence of Sø in OE prose and poetry, respectively. Based on the data presented in these sections, section . re-evaluates the claim that OE was a canonical pro-drop language.

2.3.1 Null subjects in Old English prose: an overview

Table . presents the results of the investigation into the occurrence of Sø in OE prose following removal of subjectless jussive subjunctives and subjectless clauses with hatan. The table gives observed frequencies for overt (Spron ) and null referential pronominal subjects in  individual OE prose texts, as well as the proportion of




Referential null subjects in Old English Table .. Overt and null referential subjects in Old English prose Text

Spron



Total

 Sø

ÆCHom I ÆCHom I (Pref) ÆCHom II ÆCHom II (Pref) ÆGenEp ÆGenPref ÆHom ÆLet (Wulfsige Xa) ÆLet (WulfsigeT) ÆLet (Wulfstan ) ÆLet (Wulfstan ) ÆLet (SigewardB) ÆLet (SigewardZ) ÆLet (Sigefyrth) ÆLet (Wulfgeat) ÆLS ÆLS (Pref) ÆLS (Vincent) ÆTemp Ad Alc Alex ApT Aug Bede BenR BlHom Bo ByrM Ch  (HarmD ) Ch  (HarmD ) Ch  (HarmD ) Ch  (HarmD ) Ch  (Rob ) Ch  (Rob ) Ch  (HarmD ) Ch  (HarmD ) Ch  (HarmD ) Ch  (Rob ) Ch  (Rob ) Ch  (HarmD ) Ch  (Rob )

,  ,    ,         ,         ,  , ,              

                                         

,  ,    ,         ,         ,  , ,              

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

OUP CORRECTED PROOF – FINAL, //, SPi

Table .. Continued

Texts (continued): Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Rob ); Ch  (Whitelock ); Ch  (Whitelock ); Ch  (Whitelock ); Ch  (Whitelock .); Ch  (Whitelock ); Ch  (Whitelock ); Ch  (Whitelock ); ChrodR ; ChronA; ChronC; ChronD; ChronE; CP; CP (Cotton); CPLetWærf; Eluc ; Eluc ; Exod (Ker); GD (C); GD (H); Gen (Ker); Heptateuch; HomS (ScraggVerc ); HomU (ScraggVerc ); HomS (ScraggVerc ); HomU (ScraggVerc ); HomS (ScraggVerc ); HomU (ScraggVerc ); HomU (ScraggVerc ); HomS (ScraggVerc ); HomS (ScraggVerc ); HomS (ScraggVerc ); HomS (ScraggVerc ); HomS (ScraggVerc ); HomS (ScraggVerc ); HomM (ScraggVerc ); HomU (ScraggVerc ); HomU . (Scragg); HomS (ScraggVerc ); HomS (ScraggVerc ); HomS (ScraggVerc ); HomM (ScraggVerc ); HomU (ScraggVerc ); HomS . (Scragg); LawAf; LawAfEl; LawGer; LawICn; LawIICn; LawIne; LawNorthu; LawVAtr; LawVIAtr; LawWLad; Lch I (Herb); Lch II (); Lch II (); Lch II (); Leof; LS ; LS ; LS ; LS ; LS ; LS ; LS ; LS ; LS (ScraggVerc ); LS (ScraggVerc ); LS ; LS ; LS ; Mart ; Mart .; Mart ; Marv; Med .; Med ; Nic (A); Nic (C); Nic (D); Nic (E); Or; Prov ; Sol I; Sol II; Solil; SolilPref; VSal (); WCan ..; WCan ..; WHom; WPol ..; WPol ..; WSCp; Total

Sø expressed as a percentage of the total number of pronominal subjects. The YCOE consists of  text files. Several of these have been split into their constituent texts. Thus, the table does not refer to the files entitled codocu..o, codocu..o, etc., but to the separate charters and wills collected in those files. Individual reference is made also to the three parts of Bald’s Leechbook and the texts comprising the Vercelli Homilies. As is evident from Table ., null subjects are very rare in most OE prose texts. The overall relative frequency for Sø in the  texts is only . This low frequency prompts the conclusion that observations by Rusten (, ) and Walkden (, ) as to the general scarcity of Sø have been corroborated. The table shows that the large majority of the texts under investigation select the overt pronoun in –  of the cases:  of  texts feature frequencies for Sø in the range between  and . An additional  texts have Sø at between . and  of the total pronominal subjects. From a quantitative perspective, it is thus very difficult to argue that null subjects are a productive grammatical feature in most OE prose texts. In light of these data, then, claims highlighting the regularity of null subjects in OE must be characterized as exaggerated. As was concluded also in my previous work, van Gelderen’s (: ) statement that ‘pro-drop is quite common’ in OE must be rejected on the basis of the relative frequencies for null subjects in the vast majority of the prose texts. Similarly, van Gelderen’s (: ) more recent assertion that OE ‘is a genuine pro drop language’ is not supported by the data if the statement is taken to apply to OE as a whole, even though she adds the proviso that ‘the system is in decline’.
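The percentage column of the table can be recomputed from raw counts. The following is a minimal sketch, assuming that the Total column is the sum of overt (Spron) and null (Sø) pronominal subjects; the function name and the counts in the usage line are hypothetical and are not taken from the YCOE.

```python
def null_subject_percentage(overt: int, null: int) -> float:
    """Return null subjects as a percentage of all pronominal subjects.

    Assumption: the total is overt + null, matching the table note that
    the figure is a percentage of the total number of pronominal subjects.
    """
    if overt < 0 or null < 0:
        raise ValueError("counts must be non-negative")
    total = overt + null
    if total == 0:
        raise ValueError("no pronominal subjects to count")
    return 100.0 * null / total


# Hypothetical counts, for illustration only.
print(null_subject_percentage(overt=990, null=10))
```

Because the denominator includes both variants, the measure is contrastive: it describes how often the null variant is chosen where a pronominal subject was possible, rather than how many null subjects a text happens to contain.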


That being said, there is some variation between texts. Twelve texts have Sø at proportions of .–, and six texts have frequencies in the range of .–.. Four texts display frequencies exceeding . These are Scragg’s edition of Homily  of the Vercelli Homilies (HomU . (Scragg)), Bald’s Leechbook (Lch II), and two Anglo-Saxon charters, namely Dispute between Bishop Athelstan and Wulfstan (Ch  (Rob)) and Agreement between Archbishop Eadsige and Æthelric (Ch  (Rob )). These texts have Sø at relative frequencies of ., ., ., and , respectively. After these texts, the highest frequencies for Sø are found in the D manuscript of the Anglo-Saxon Chronicle (ChronD, .), the Laws of Ine (LawIne, .), a short excerpt from the OE Martyrology (Mart , ), the Life of James the Greater (LS , .), Ch  (Rob ) (.), and the C manuscript of the Anglo-Saxon Chronicle (ChronC, .). Interestingly, three of these texts—ChronD, Lch II, and Mart —are recognized as Anglian-influenced ones. This could be viewed as support for Walkden’s dialect-split hypothesis. Additionally, Anglian influence is a possibility for ChronC, HomU ., and LS , although these texts have been classified as West Saxon for the purposes of this study (see section ..). See Chapter  for an evaluation of the dialect-split hypothesis. It must be noted that some of these comparatively high relative frequencies are somewhat suspect, since they result from a severely limited number of null pronouns which form a subset of a similarly restricted total population of pronominal subjects. For example, the notable frequencies of . and  in Ch  (Rob) and Ch  (Rob) are derived from only  and  pronominal tokens respectively, of which only one is null in each text. The . Sø in HomU . (Scragg) is derived from only  observed null pronouns which are part of a total population of no more than  pronominal subjects. 
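The concern about small samples can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, a standard technique brought in here purely for illustration; it is not part of the method described in this chapter, and the counts are hypothetical.

```python
import math


def wilson_interval(null: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for the proportion null / total.

    z = 1.96 corresponds to an approximate 95% interval.
    Returns (low, high) as proportions between 0 and 1.
    """
    if total <= 0 or not 0 <= null <= total:
        raise ValueError("need 0 <= null <= total and total > 0")
    p = null / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Two hypothetical texts with the same observed share of null subjects
# but very different sample sizes.
for null, total in [(1, 10), (100, 1000)]:
    low, high = wilson_interval(null, total)
    print(f"{null}/{total}: {100 * low:.1f}% to {100 * high:.1f}%")
```

In both hypothetical texts the observed share is identical, but the interval for the smaller text is far wider. That is the sense in which a high relative frequency resting on a single null token says little about the underlying grammar.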
Similarly, the frequency of . Sø in LawIne is based on only  null subjects, and LS  (.) and Mart  () only have  and  null subjects, respectively. It is thus difficult to exclude the possibility that the high frequencies in these texts are attributable to coincidence, since these texts represent much smaller samples than many of the other texts in the corpus. The extract from Mart  is illustrative in this context. It comprises only , words, but the YCOE contains two additional excerpts from the Martyrology. The other excerpts, Mart . (. Sø ) and Mart  (.), have Sø at much lower frequencies than Mart , giving an overall relative frequency of . Sø in the combined extracts from the Martyrology. The relative frequency in Mart . is based on a slightly larger sample (, words) than that of Mart , while the sample from Mart  is much larger (, words). This adds weight to the notion that the high frequency in Mart  may be attributable to the small sample it constitutes. The quantitative overviews given in this section make it clear that null subjects constitute a highly infrequent phenomenon in OE prose. However, some of the statements

concerning the commonness and grammaticality of referential null subjects in OE are to a considerable degree predicated on observations of verse data (see primarily van Gelderen ,  and Pogatscher ). Rusten (a) showed that Sø is much more frequent in OE verse than prose, and argued that this could, in turn, explain the differing assessments of the grammaticality of null subjects in OE. Building on my previous work, section .. will present a quantitative survey of the occurrence of Sø in  texts of OE poetry.10 In subsequent chapters, the data collected from OE prose and poetry will form the basis of a rigorous quantitative analysis of the distribution and characteristics of null subjects in OE.

.. Null subjects in Old English poetry: an overview

Table . presents the results of the investigation into the occurrence of Sø in OE poetry. The YCOEP contains  text files, some of which comprise several OE verse texts. The table presents the results for each individual text. Thus, reference is made not to the combined statistics e.g. in the file conorthu.psd, but to the figures in the individual poems contained in that file, i.e. the Northumbrian versions of Bede’s Death Song (BDSN) and Cædmon’s Hymn (CædN), the Ruthwell Cross (RuneRuthwellA), and the Leiden Riddle (LRid). The Exeter Book Riddles constitute an exception. For these texts, the table gives individual statistics for Riddles  and , while the other riddles are treated as a unit.11 Two citations containing subjectless jussive subjunctives and one citation containing a subjectless naming structure with hatan have been subtracted from the material prior to compilation of the table.12 The table gives the total number of occurrences for Spron and Sø, as well as the relative frequency for Sø as a percentage of the total.13 At . of all pronominal subjects, the overall relative frequency for Sø in OE poetry is substantially higher than the  in OE prose.
This high relative frequency is not attributable to any single text, and frequencies are generally higher than in the prose:  of  (.) poetic texts have frequencies for Sø exceeding , and  of  (.) verse texts feature Sø at relative frequencies of  or more. The highest relative frequency for Sø in the poetry is found in CædN: the only simple pronoun

10 An earlier version of this survey was presented in Rusten (a) (see table  in that article), which was published during work on the thesis upon which this book is based. Sections where OE verse data are explored in this book build and expand substantially on Rusten (a).
11 Riddles  and  differ from the other riddles in that (i) they are translations of Latin originals, while the others most likely are not, and (ii) Riddle  was composed at a putatively later stage than the others. These texts are therefore presented separately. See also sections .. and ...
12 It can thus be noted that these two types of subjectless structure are decidedly rare in the poetry.
13 Due to the presentational facts just mentioned, and due to the fact that the data collection procedure was refined for the present book, Table . is quite different from table  in Rusten (a). The overall results, however, are largely the same.

Table .. Overt and null referential subjects in Old English poetry

Columns: Text; Spron; Sø; Total; % Sø

Texts: And; Beo; Brun; CædN; Christ I; Christ II; Christ III; Deor; Dream; Elene; Ex; Fates; Fort; GenA; Jul; KtHy; KtPs; LRid; Max I; Met; Pan; Part; Phoen; Rid; Rid ; Rid ; Rim; RuneRuthwellA; Sea; Wan; Whale; Wid; Wife; Wulf; Total

token in this poem is a null one.14 Six texts have no occurrences of Sø at all, while  texts display frequencies between . and ..

14 The critical reader may object here, as there are actually two occurrences of nominative he in CædN. However, these are reasonably interpreted by the YCOEP annotators as the heads of discontinuous NPs containing appositions, i.e. he [. . .] eci dryctin ‘he, the eternal Lord’ and he [. . .] halig scepen ‘he, the holy creator’. Only nominative pronouns which constitute an NP on their own are counted as overt pronominal


These findings constitute quantitative justification of Mitchell’s (b: ) statement that subject omission appears to be ‘[m]ore common in the poetry than in the prose’, and may serve as an explanation for the conflicting claims concerning the licitness of pro-drop in OE (as also suggested in Rusten a). It is certainly understandable that studies relying on poetic evidence may over-report the frequency of Sø compared to studies relying on prose. In light of the poetic data considered in isolation, then, van Gelderen’s comments as to the ‘commonness’ of pro-drop in OE are somewhat better justified (cf. also Rusten a: ).

. Is Old English a canonical pro-drop language?

We are now in a position to re-evaluate central claims put forth in previous research concerning the status of null subjects in OE. The present section will evaluate van Gelderen’s claim that OE is a ‘genuine’ pro-drop language, while Walkden’s dialect-split hypothesis will be assessed in Chapter . First, however, it must be noted that the criteria for the presentation of evidence are unclear. That is, it is not evident what it ‘means’ to be a pro-drop language in quantitative terms generally, and, more specifically, it is not clear in quantitative terms exactly what van Gelderen takes to be a genuine pro-drop language. The investigation carried out here is a frequency-based one, yet depending on one’s views on grammar and the role of frequency in linguistic investigations, it would conceivably be possible to argue that a given language is a pro-drop language on the simple basis that it displays n occurrences of null pronouns. If such a methodology is adopted, the claims that the ‘phenomenon of referential pro-drop does not occur in OE’ (Hulk & van Kemenade : ), and that ‘OE allows no referential pro-drop’ (van Kemenade : ), could be viewed as categorical statistical statements such that a single observation of Sø would be enough to falsify them. While not necessarily taking such an extreme stance, van Gelderen () does to a notable degree build her arguments pertaining to pro-drop in OE on scattered examples of subjectless clauses, as opposed to contrastive quantitative data which take into account both the overt and the null variants.15 This is an unsatisfactory approach. As Fischer (: ) points out, what is significant in an investigation on language change is the ‘relative frequency of constructions [. . .] not their absolute frequency’, since absolute frequencies do not ‘mean much if the structure under investigation is rare’.
The phenomenon under investigation here is by all means rare, and this means

subjects in this study (see also section .. in the online supplement to this book). The reason for this, obviously, is that null pronouns in pro-drop languages most typically are viewed as analogues to simple pronouns, as opposed to more complex NPs (but cf. section ..). Thus, the pronoun-headed discontinuous NPs in CædN are not directly comparable to null subjects.

15 She does, of course, report Berndt’s () statistics on the Lindisfarne and Rushworth glosses as well as statistics drawn from the texts in the ME Katherine group.


that lists of examples—including substantial lists, such as that given in Pogatscher (), which is also utilized by van Gelderen—are of restricted value. Conversely, it is not clear exactly how low frequencies for Sø should be in order for it to be argued that a language is a non-pro-drop language, which is the essential thrust of Hulk & van Kemenade’s () statement. That is, it is unclear how low relative frequencies for Sø must be in order for van Gelderen’s claim to be falsified. Walkden (: ) states that if OE is taken to be a non-pro-drop language along the lines of PdE, ‘it is not necessarily to be expected that the frequency of *pro* in the YCOE would be ’. On the contrary, it can be expected that pronominal subjects are occasionally omitted in naturally occurring utterances even in languages which are typically considered to be non-pro-drop. Consider e.g. () and (), which illustrate several types of omitted pronouns in Modern Norwegian.16 It may be observed that the null subject in () occurs in a subordinate clause introduced by the complementizer så ‘so’, meaning that the occurrence cannot be interpreted as topic-drop, a type of argument omission which is still permissible in Modern Germanic.17 Example () similarly illustrates a null subject in a subordinate clause, but also features an instance of topic-drop and an instance of object-pro-drop (abbreviated Oø). This is an interesting observation, since topic-drop is the only type of argument drop which is supposed to be licit in Modern Norwegian.18 Note that both examples represent a very informal register.

() Brann trenger virkelig støtte i disse tunge tider, så [Sø]
   Brann needs really support in these heavy times, so [I]
   anbefaler alle som er glad i Brann å stille opp og
   recommend everyone who is fond in Brann to stand up and
   støtte laget
   support team.def
   ‘Brann really needs support in these difficult times, so I urge everyone who cares about Brann to be there and support the team.’

() [Sø] Deler lett [Oø] siden [Sø] er så heldig å få være
   [I] share easily [this] since [I] am so lucky to get be
   mor til to fantastisk flotte gutter:-)
   mother to two fantastic great boys
   ‘I have no problem sharing this, since I am lucky enough to be the mother of two fantastic boys.’

16 The examples were both posted as ‘status updates’ by native speakers of Norwegian on the social network Facebook. It may consequently be assumed that the utterances represent informal written parallels to everyday colloquial speech patterns.
17 See section . for a definition, examples, and a discussion of topic-drop.
18 Different analyses of subject omission in Modern Norwegian exist (cf. e.g. Stjernholm  and Nygård ).


This means that Walkden (: ) is no doubt right that frequencies of  are not necessarily expected even on a null hypothesis predicting ‘that Old English behaved like Modern English in disallowing null subjects’: instances of Sø do appear in naturally occurring language production even in languages assumed not to have a pro-drop property. Consequently, low-frequency occurrence of Sø is not necessarily illustrative of a productive grammatical system sanctioning null subjects. Hence, n occurrences of Sø are not enough to falsify the claims of Hulk & van Kemenade () and van Kemenade (). Thus, given that in the combined datasets of OE prose and poetry there are , instances of Spron and no more than , occurrences of Sø (. Sø ), there is a possibility that the few examples of Sø in OE are actually indistinguishable from noise in the data. On the other hand, it is stressed in Fischer et al. (: ) that ‘frequency in no way corresponds with grammaticality in the generative approach’, and that ‘there is no reason whatever to assume that an infrequent construction is less grammatical or marginally grammatical’. Whatever one’s preferred framework, there can be little doubt that low-frequency phenomena may be perfectly grammatical. That being said, the phenomenon under investigation here is low-frequent enough, particularly in the prose, that the question of performance error is applicable. Interestingly in this context, Santorini (: , , , fns  and ), in an investigation of V in Yiddish, treats as performance errors phenomena which contrast with her generalizations at relative frequencies of ., ., and . Furthermore, Santorini (: , n. ) states that ‘according to detailed quantitative work’ carried out by herself and others, ‘well-established generalizations in a language are violated in naturally-occurring usage at a low, relatively constant rate of about ’. 
This cut-off point is adopted for OE by Pintzuk (), and the same delineation point is adopted in Bies’ () investigation of verb-final word order in early New High German, where it is similarly noted that ‘[e]ven the strongest linguistic generalizations are found to be violated at a low rate of approximately ’ (p. ). She illustrates this by giving examples from corpora of Present-day American English. The examples in (a) and (b), taken from Bies (: ), are clearly ill-formed, ‘even though they are produced by native speakers of American English’ (p. ):

() a. —and I can take plastic like milk cartons or if they have water in them. [talking about recycling] (Switchboard)
   b. The company said there was an additional increase in loss and loss-expense reserves of  million reflecting “higher than expected” development in claims legal expenses from to prior periods. (Wall Street Journal)
   (Bies’  example a,b)

Referring to Bies () and Santorini (), Walkden (: ) states that ‘[t]exts that include only very small numbers of instances of *pro* are not necessarily


evidence for the grammaticality of referential null subjects in these varieties’. I agree, and find the estimate of a c. cut-off point for grammaticality to be reasonable. Impressionistically, it may be speculated that the margin for deviations from prescriptions might be even higher than this in modern spoken language corpora.19 Walkden (: ) dismisses from further consideration texts which have pro-drop at frequencies between – in all clause types, saying that ‘one approach to such low figures is to consider these examples ungrammatical’. In this connection, then, it is worthwhile highlighting again the fact that the  OE prose texts analysed in this investigation feature Sø at an overall frequency of no more than . Moreover, as mentioned in section ..,  of  investigated prose texts display frequencies for null subjects in the range of –. If the same threshold for ungrammaticality is adopted here as in Santorini (, ), Pintzuk (), and Bies (), the occurrence of Sø in OE prose is on aggregate indicative of performance error rather than a productive grammatical system. If the poetry is taken into account, the frequency of . still rounds to . If one takes the stance that these aggregate figures are not representative—since there is, after all, some degree of individual variation among texts—one would still safely be able to disregard as ungrammatical the occurrences of Sø in c. of the texts in the prose material and c. of the combined material. These facts are very difficult to reconcile with van Gelderen’s assertion that pro-drop is ‘common’ in OE, and that OE is a ‘genuine’ pro-drop language of the Romance variety. To make matters worse, it is unclear to what extent frequencies deviating only slightly upwards from  should be considered grammatical or ungrammatical. 
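The threshold reasoning in this passage amounts to counting how many texts fall at or below a chosen cut-off. The following is a minimal sketch under stated assumptions: the text labels and counts are hypothetical, the cut-off is left as a parameter because the choice of any particular value is open to debate, and the percentage is computed as null subjects over all pronominal subjects.

```python
from typing import Dict, Tuple


def share_of_texts_at_or_below(
    counts: Dict[str, Tuple[int, int]], threshold_pct: float
) -> float:
    """Percentage of texts whose null-subject frequency is <= threshold_pct.

    counts maps a text label to (overt, null) pronominal subject counts.
    Every text is assumed to contain at least one pronominal subject.
    """
    if not counts:
        raise ValueError("no texts supplied")
    at_or_below = 0
    for overt, null in counts.values():
        frequency = 100.0 * null / (overt + null)
        if frequency <= threshold_pct:
            at_or_below += 1
    return 100.0 * at_or_below / len(counts)


# Hypothetical per-text counts (overt, null), for illustration only.
sample = {
    "text_a": (990, 10),
    "text_b": (480, 20),
    "text_c": (95, 5),
    "text_d": (999, 1),
}

for cutoff in (1.0, 1.5, 5.0):
    print(cutoff, share_of_texts_at_or_below(sample, cutoff))
```

Running the same data through several cut-offs shows how strongly the share of texts classed as showing only performance error depends on where the line is drawn.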
In an investigation of the position of particles relative to the verb in OE subordinate clauses, Pintzuk (: ) finds that particles are post-verbal in . of verb-final clauses. However, this frequency ‘is within the normal range for violations of well-established generalizations’ (p. , n. ), and the post-verbal tokens are therefore not counterexamples to her prediction that ‘[i]n clauses with auxiliaries and in VF [verb-final—kar] clauses with inflected main verbs, the particle should always appear before the main verb’ (p. ). Presumably, results that could be rounded down to  are ‘within the normal range for violations’, while those that can be rounded up to  may not be, although Pintzuk does not, as far as I have been able to determine, establish an upper boundary for this grammaticality threshold. If, for the sake of the argument, the threshold for ungrammaticality is raised to ., the result is that the

19 The question of how to treat utterances deviating from language prescriptions is, of course, a theoretical one. Functional theories tend to view ‘grammaticality’ as a continuum, whereas formal theorists sometimes view this concept as categorical. This question will not be treated here except to note that linguistic theories in varying ways ‘clean’ the language data with which they are concerned (Fischer : –). For example, according to Fischer, ‘[g]enerative linguists by and large ignore spoken and variant forms’ as well as ‘the circumstances under which language forms are used’, and often ‘purify the facts’ by ‘reducing them to written forms of language’ (p. ; emphasis original). She acknowledges that there are exceptions (p. , n. ).


tokens in  of the prose texts (/) would be inadmissible as evidence for a pro-drop grammar. If relative frequencies of  are also considered ungrammatical, that figure would increase to  of the texts. The point is obviously that it is impossible to draw a meaningful quantitative demarcation point between grammatical and ungrammatical utterances in these low-frequency ranges: the difference between  and  Sø can likely be considered more or less arbitrary, and it would likewise be difficult to argue that occurrences of  represent a stable syntactic system whereas frequencies of  do not. My objective in the present section is not to argue that the occurrences of Sø in the OE prose data are scribal errors (see also below, this section). However, it may be remarked at this point that, according to a well-established tradition of quantitative research, the occurrences in at least . of the prose texts (i.e. prose texts displaying frequencies of . Sø or less, and which therefore can be rounded down to ) are prime candidates for analysis as performance errors. Thus, while it may be fruitless to set an arbitrary frequency limit for what constitutes a pro-drop language in quantitative terms, it seems clear that the OE prose data considered as a whole, and the large majority of the individual texts constituting that whole, would not meet such a frequency limit practically no matter how low it was set. If OE is to be considered a canonical pro-drop language of the Romance variety, it should be comparable to such languages in terms of relative frequencies for null subjects. This is not the case. In an investigation of the realization of subject pronouns in Modern Spanish, building on a material comprising , verbs, Erker & Guy (: –) find that the null variant is used in  of the cases with high-frequency verb forms, and in  of the cases with low-frequency ones.
Similarly, in a variationist study based on sociolinguistic interviews, Flores-Ferrán (: ) finds that null subjects are chosen in  of the cases in the Spanish of  Puerto Ricans living in New York City, and she notes that this is ‘a tendency found in most Spanish varieties’. These figures are in no way comparable to the low frequencies in OE prose, or even to the higher ones in the poetry. As a consequence, it must be concluded that van Gelderen’s statements are not tenable on the basis of the facts evident in  OE prose texts comprising . million words and roughly half of the entire preserved OE output: OE is not a canonical pro-drop language of the Romance variety. In my earlier work, I noted that the phenomenon of referential null subjects ‘must be characterized as nearly dead’ by the OE period (Rusten : ), and that Sø ‘is more or less extinct by the time of the extant OE texts’ (Rusten : ). The OE prose data does not, as a whole, suggest that this position should be revised. The conclusion that OE is not a canonical pro-drop language does not automatically entail that the tokens of Sø in the OE prose material must all be dismissed as scribal errors. Another possibility is that the occurrences of Sø in the corpus represent what may be described as ‘remnants’ of a grammatical system which was productive in prehistoric OE, and certainly in Proto-Germanic (cf. e.g. Walkden c: ,


who suggests that the Gothic null argument system should be reconstructed for Proto-Germanic), yet which is largely unproductive by the time of the extant texts. In accordance with this, my previous work (: ) suggested that the tokens of Sø ‘may represent genuine remnants of an antiquated Germanic grammatical system’, but due to the low frequencies, the possibility of scribal error could not be confidently ruled out. As the discussion above shows, this is still a problem. However, the more satisfactory perspective would be not to dismiss the tokens as mere errors. After all, certain texts do display Sø at comparatively high frequencies, and other Old Germanic languages display Sø at more robust frequencies than those found in OE. The notion of low-frequency ‘remnants’ of previous systematicity could be reconciled with certain statements concerning linguistic facts observed synchronically made in Lass (). Lass advocates an approach where the role of the historical linguist is to take an ‘external or “God’s-eye” ’ perspective on language change rather than an ‘internal or “speaker’s head” ’ perspective (p. ). On such an approach,

one may look at a language as one might at any other historically evolved object, and ask not only how it came to be what it is, but what scars its history has left, how that history helps us to understand the scars, and how they in turn help us to understand history. (Lass : )

The perspective afforded by such a ‘medium-neutral evolutionary model’ (p. ) is partially motivated by the problem of determining ‘how much of what looks like (synchronic) structure really is [synchronic structure—kar]’, and ‘how much is rather detritus left behind by historical processes’ (p. ). According to Lass, ‘[p]ortions of apparent “synchronic” states are relics of the historical processes that brought them into being’, and these ‘relics’ may be compared to ‘evolutionary scars on the present-day body’ (p. ). He also points out that there is a danger that ‘what seems [. . .] to be synchronic may really be residue, often of such antiquity that there is no way of telling whether or not it means anything’ (p. ; my emphasis). Regarding the scattered examples of Sø in OE prose as ‘residue’ could turn out to be an explanation as valid as any for the infrequent—or ‘spasmodic’ (Mitchell a: )—occurrence of null subjects in the early history of English, and it would certainly also be more interesting than simply considering them errors. Previous research indicates that the occurrence of null subjects in OE displays a degree of regularity. Quantitative evidence presented by works such as Berndt (), van Gelderen (, ), Rusten (, , a), and Walkden (, , c) shows that Sø is somewhat more frequent in root as opposed to subordinate clauses, and that Sø is also more frequent with third person than with first and second person reference. Both of these characteristics are observed in Old High German (Eggenberger , reported in Axel , ), Old Saxon (Walkden c), and Old Swedish (Håkansson , ); and the ‘split’ between third and non-third person can also be observed in Old Icelandic (Walkden , c, Kinn et al.

) and Old Norwegian (Kinn ). The fact that Sø in OE, although certainly an infrequent phenomenon, exhibits ‘behaviour’ similar to its cognates would suggest that its occurrence is not entirely haphazard—as would be expected if it was really attributable to performance error—but is in some way structurally governed. This means that further investigation of Sø in OE is worthwhile even though extant OE is not a canonical pro-drop language. So far, comments regarding the pro-drop status of OE have been restricted largely to the prose genre. One potential problem here, of course, is that the proportion of null subjects is much higher in the verse genre. It was noted above that van Gelderen’s claims concerning the ‘commonness’ of pro-drop in OE are better justified for the poetry, and her claims are also to a large extent built on examples taken from this genre. Van Gelderen (: –) provides a famous example from Cædmon’s Hymn, along with numerous examples from Beowulf. Moreover, she refers to Pogatscher (), noting that this work ‘provides many additional Old English examples’ (van Gelderen : ). Very many of Pogatscher’s examples are taken from verse texts, such as Beowulf, Andreas, Elene, and the verse version of Genesis. Van Gelderen also cites several examples taken from Visser (). The majority of these occur in verse texts: four examples are taken from Exodus, two examples are taken from Daniel, two are taken from Judith, and one is taken from Juliana (van Gelderen : –). From the prose genre, van Gelderen (: ) offers four of Visser’s () examples. Of these, two are taken from Orosius, one from Boethius, and one from the first series of Ælfric’s Catholic Homilies. 
She also offers a number of examples taken from the OE glosses in various Latin texts, including the Vespasian Psalter and the Lindisfarne and Rushworth Gospels.20 Consequently, then, much of the weight of van Gelderen’s argument rests on data collected from poetry. However, it is seriously questionable whether observations drawn from the poetry could reasonably be used to make blanket statements about the syntax of OE. There are at least two reasons for this. One is quantitative: the OE poetic corpus comprises only about , lines of text, equalling c. of the total extant OE corpus (Fulk & Cain : ). Thus, even if one considers the entire corpus of OE poetry (which is not the case either here or in van Gelderen’s work), there is a concern of representativeness: in claiming that OE is a pro-drop language, predominantly on the basis of verse evidence, van Gelderen assigns primary importance to a very restricted subset of the OE material, giving short shrift to the far more voluminous prose output.21 The second problem is qualitative: it is well known that OE poetry differs from the prose in several respects. Thus, the preponderance of Sø in OE poetry could in principle be the result of presentational decisions made by composers of poetry operating under stylistic restrictions different from those influencing writers of prose. If this is the case, the results based on the poetry corpus do not facilitate representative generalizations concerning OE syntax either quantitatively or qualitatively.

The uniformity of the poetry is one of the ways in which it differs from the prose. According to Fulk & Cain (: ), the poetry is ‘exceptionally uniform in style’, and Wrenn (: –) states that ‘the basic patterns’ of the verse ‘remained constant’ throughout the entire OE period. It can rather straightforwardly be assumed that the alliterative form imposed restrictions on metre and rhythm. While this may not have directly affected the presence or absence of an unstressed pronominal subject (cf. e.g. Blockley : –, Walkden : ), the composition of the verse is clearly bound by structural requirements external to those of the grammatical system. There is ample reason to assume that the language of the poetry, as a consequence of this, differed at least to some degree from naturally occurring everyday speech. This is a problem, since the spoken language is often taken to be the primary object of linguistic study. Given the lack of living informants, then, one would ideally be interested in approaching the OE text material with a view towards assigning greater evidential value to texts which can be taken as relatively close to the spoken language. In his syntax of Old Norse, Faarlund (: ) exclusively uses data drawn from prose texts, since this genre, according to him, can ‘be assumed to be closest to the spoken language’. Faarlund thus views the verse texts as more distant than the prose from the spoken language.

I take the same stance for OE as Faarlund does for Old Norse: prose is closer to the spoken language than poetry, and therefore evidence from the prose should be considered the primary source of information on OE syntax. In the context of syntactic differences between OE prose and poetry, van Kemenade (: ) states that ‘[w]ord order in poetry is very different from that in prose’ and that poetry, for this reason, ‘cannot be considered a reliable source of information on the standard of Old English’. Also interestingly, Crisma (), in a corpus-based investigation of the emergence of the definite article in English, documents striking differences in the use of determiners between the YCOE texts on the one hand and Beowulf and the verse Genesis and Exodus on the other. For example, where the YCOE data show that . of argumental proper nouns preceded by adjectives also have the determiner se or þes (Crisma : ), the same is true for only  of  argumental proper nouns in Genesis and Exodus, and for  of  such proper nouns in Beowulf (). This leads Crisma to state that ‘the grammars represented [by the OE prose and OE verse data – kar] are two different grammars’ (my emphasis): in her data, the verse grammar shows ‘no hint of obligatory use of expletive articles with PropersNs [proper nouns – kar]’, contrary to the prose grammar (p. ).

20 And, as mentioned above, van Gelderen’s argument concerning the licensing role of verbal agreement is based on Berndt’s () quantitative investigation of pronominal subjects in the Lindisfarne and Rushworth Gospels.

21 The prose tradition is acknowledged, but seems to be considered simply as a later stage of linguistic development as compared to the poetic material. As far as prose works are concerned, van Gelderen (: ) notes only that she ‘assum[es] that pro-drop occurs in Alfred and Ælfric’ but that ‘[i]t is [. . .] much less than in Early OE’. The prose of Alfred and Ælfric thus seems to be considered to be ‘late’, as opposed to the ‘early’ language of the poetic texts. In my view, this abstracts away from the many genre differences between the texts involved in a somewhat unfortunate way.




When formulating statements concerning the characteristics of a language at a given point in time, historical linguists must obviously make the best of a less-than-ideal situation in terms of the available data. Any investigation on OE, specifically, must by necessity be based on the stylistically formal written language output of trained professionals, an output which survived over time in what may be described as a process of chance. It can reasonably be assumed that the data at hand in the best of cases are removed at some distance from the characteristics of everyday spoken OE. This unavoidable issue is in itself problematic, but it is exacerbated if statements concerning linguistic phenomena are predicated on a genre which is (i) limited in size and (ii) further removed from the spoken language than other genres for which data are readily available, i.e. prose. If the expressions of certain syntactic phenomena in prose and poetry are different to the degree that these genres present themselves as different grammars, use of verse data for generalizations on the status of OE is problematic indeed. The view taken here, then, is that the most fruitful approach is to assign primary importance to the genre which is closer to the spoken language. In this context, it may be noted that, in a survey of various differences between OE prose and poetry, Mitchell (b: –) mentions that subject omission is one of several features used ‘to achieve compression and to give the poetry its characteristic texture’. Similarly, Blockley (: ) states that ‘the motivation for omitting’ pronouns is ‘tied up with the peculiar circumstances of Old English poetic style’. Thus, it appears that use of subjectless clauses is among the stylistic choices which characterize OE poetry, a view which is certainly supported by the frequency differences between the YCOE and the YCOEP demonstrated above.
This is cause for caution as far as syntactic argumentation is concerned, and therefore, the conclusion that OE is not a canonical pro-drop language is considered justified despite the higher frequencies for Sø in the poetic texts: subjectless clauses are a (stylistically motivated) possibility in the poetry, but the extremely low relative frequencies in the prose show that Sø was not a productive feature in genres closer to the spoken language.22 Therefore, the results from the poetry do not generalize to OE as a whole. Lass (: ) says that ‘literary (especially poetic) materials have to be used with extreme care’, and he adds that they should be viewed as ‘secondary and a bit suspect’. I see no reason to disagree with this. This is, however, not necessarily to say that poetic evidence is irrelevant or inadmissible. Lass (: ) also points out that it ‘is unlikely in principle [. . .] that any device used in verse will be an absolute violation of the norms of non-verse language’. Similarly, van der Horst (: ) (paraphrased in Fischer : , n. ) makes a point of the fact that ‘poets work with the same language system as prose writers but they often use it in a more original fashion’.

22 But this obviously calls for statistical analysis. See Chapter , and particularly section ., where these notions are developed further.

The question then becomes whether the poetic material should be disregarded entirely, or whether it should be included in the in-depth quantitative analyses
that follow. I take the stance that there are reasons why the verse material should be retained. First and foremost, the poetic data may provide a yardstick against which generalizations derived from the (more representative) prose data could be measured. Such an investigation could reveal that the distribution of Sø across the identified variables is similar in both OE prose and poetry (as is the case in the more limited study in Rusten a), and this could in turn point towards the system which presumably sanctioned subject omission at the unattested stage where such structures were, putatively, productive. The poetic data could prove useful as a means of indirectly accessing this unattested period, since it has been argued e.g. by (Fischer : , n. ) that ‘poems often preserve in their diction archaic expressions’ which can provide information about ‘earlier stages of the language’. This ‘preservative’ effect seems to apply to several areas of linguistic inquiry. For example, Fulk & Cain (: ) note that words such as þengel ‘prince’ and deor ‘bold’ occur in the poetry but not in the prose ‘presumably because they are archaic words that passed out of everyday use but are preserved in verse because of the traditions they evoke’. In a discussion on alternation in feminine o-stem nouns, Ringe & Taylor (: ) note that ‘the formulaic structure of Germanic oral poetry has preserved the original distribution of allophones’. Moreover, Ringe & Taylor (: ) state that the spelling ‘of formulaic poetry can be expected to have lagged behind the actual change in speech’. As an example of syntactic differences which can be related to diachronic development, Haugland’s (: ) quantitative investigation of expletive pronouns in OE finds that ‘in weather statements with zero argument verbs, like “rain” [. . .] 
the variant with hit is almost obligatory already by early OE’ but that ‘the type without a pronoun still appears, for instance in poetic texts’. Finally, Walkden (c: ) takes the occurrence of null subjects in Beowulf to be representative of ‘the early stage of OE’. Thus, going forward, both prose and poetic data will be considered, both separately and as a joint dataset.23 Again, however, despite the higher densities of Sø in the poetry, it is argued here that the extant stage of OE is by no means a canonical pro-drop language, contra van Gelderen. In Chapters  and , this argument will be developed further on the basis of detailed quantitative analyses of both linguistic and extralinguistic variables potentially relevant for subject omission in OE. First, we turn to some of the arguments presented in the work of Walkden, where it is suggested that Anglian dialects of OE exhibit a partial pro-drop property.

23 One of the anonymous reviewers rightly points out that ‘although constructions characteristic of poetry may stem from earlier Germanic/pre-OE grammars, they do not help to date the composition of a poem, since some at least are conventionalized for poetic style and are found in poems known to be composed late’.


3 Do Anglian dialects of Old English have a partial pro-drop property?

3.1 Introduction

The present chapter will evaluate Walkden’s (, , c) ‘dialect-split hypothesis’, i.e. the suggestion that ‘standard’ OE (the West Saxon dialect) was a non-pro-drop language while Anglian dialects of OE exhibited a partial pro-drop property. The chapter builds on, and significantly expands the scope of, the exploratory work first presented in Rusten (a).

The chapter is structured as follows: section . gives a presentation of the dialect-split hypothesis as formulated in Walkden’s work. Section . provides an initial assessment of this hypothesis, raising the possibility that what Walkden takes to be diatopic variation might also be diachronic variation, or variation contingent on genre or translation status, as also suggested in Rusten (a). Sections .–. attempt to clarify this picture by investigating referential null subjects according to dialect, genre, period, and translation status. In section ., the dialect-split hypothesis is assessed statistically by means of generalized logistic regression modelling and a random forest of conditional inference trees. Section . summarizes and concludes.

Referential Null Subjects in Early English. First edition. Kristian A. Rusten. © Kristian A. Rusten . First published in  by Oxford University Press.

3.2 The dialect-split hypothesis

In a chapter of a larger work which investigates the possibility of reconstructing the syntax of unattested stages of Germanic, Walkden (), later published as Walkden (c), supplies a quantitative study of referential null subjects in OE. He also provides a formal analysis seeking to account for the distinct distributional characteristics exhibited by null subjects in some of his investigated texts (see sections .. and . for details on his analysis). His main findings are also presented in Walkden ().

Walkden (: ) reconciles the conflicting claims of Mitchell (a), Hulk & van Kemenade (), and van Gelderen () concerning the occurrence of null subjects in OE by stating that ‘all three suggestions [i.e. that ‘referential pro-drop does not occur in OE’ (Hulk & van Kemenade : ), that ‘Old English has pro-drop’ (van Gelderen : ) and that the phenomenon occurs ‘only spasmodically’ (Mitchell a: ) – kar] appear to be right’. Arguing for this position, he notes that in ‘the majority of classical OE texts, examples of null referential arguments are so rare as to be potentially considered entirely ungrammatical’ (Walkden : ). He finds relative frequencies for null subjects in the range of – in most of his texts, and notes that this ‘arguably lends weight to’ Hulk & van Kemenade’s assertion, ‘since one approach to such low figures is to consider these examples ungrammatical’ (p. ). These texts are subsequently excluded from consideration. However, he also finds that ‘in certain other texts the phenomenon occurs with a frequency and distribution that cannot be attributed entirely to performance errors’ (p. ). The texts in question are Bede, Beowulf, and Bald’s Leechbook, in addition to the C, D, and E manuscripts of the Anglo-Saxon Chronicle, which are all demonstrated to ‘exhibit null subjects to a greater extent’ than the other investigated texts (p. ). Crucially, according to Walkden, these texts are either Anglian (i.e. Mercian or Northumbrian) or display Anglian features, and on this basis, he concludes ‘tentatively’ that ‘referential null subjects were not grammatical in classical OE (West Saxon) [. . .] but were available, subject to certain restrictions, in Anglian dialects’ (p. ). Thus, the ‘key to resolving the apparent contradiction’ arguably ‘lies in dispelling the illusion of OE as a monolithic entity’ (p. ).
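The relative frequencies cited in this section are proportions: the number of null subjects divided by the total number of pronominal subjects, overt and null. As a minimal sketch of why very low figures are hard to interpret, the Python snippet below computes such a proportion together with a Wilson score interval. The interval is an illustrative choice made here, not a method used by Walkden or in the present study, and the text names and counts are invented for the example.

```python
from math import sqrt


def null_subject_rate(null_count, overt_count):
    """Proportion of null subjects among all pronominal subjects."""
    total = null_count + overt_count
    return null_count / total


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Hypothetical counts, invented for illustration only: (null, overt).
texts = {
    "large_text": (20, 1980),
    "small_text": (2, 198),
}

for name, (null, overt) in texts.items():
    rate = null_subject_rate(null, overt)
    low, high = wilson_interval(null, null + overt)
    print(f"{name}: rate={rate:.3f}, interval=({low:.3f}, {high:.3f})")
```

With the same observed proportion, the interval for the smaller hypothetical text is considerably wider, which is one reason why a handful of tokens can be read either as noise or as a grammatical option.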

3.3 An initial assessment of the dialect-split hypothesis

As stated above, Walkden’s dialect-split hypothesis is based chiefly on the observation that the six texts in his sample which have higher frequencies for Sø than the majority of the texts are either Anglian or Anglian-influenced. Statistics for the texts in question, extracted from Walkden (), are given in Table ..

Table .. Overt and null subjects in six Anglian/Anglian-influenced texts (extracted from Walkden : –)
[Table values not recoverable from the source. Rows: Bede, Beo, ChronC, ChronD, ChronE, Lch II. Surviving column labels: Spron, Total, and Sø (relative frequency).]

Note that Walkden’s figures differ slightly from mine. The differences arise from the fact that I have included all finite clauses, whereas he restricts his investigation to indicative clauses. Moreover, I take measures to exclude instances of overt non-referential hit

‘it’ functioning as subject, and I also exclude overt predicative nominative pronouns. Walkden and I make the same exclusions with regard to jussive subjunctives and subjectless structures with the verb hatan.1 In the interest of contrastive clarity, the relative frequencies obtained in the present study are given in parentheses in the rightmost column of the table.

The dialect-split hypothesis is prima facie appealing, but not unproblematic. As can be observed, the relative frequencies for Sø displayed in Table . are not uniformly high: for example, only two of the texts in the table display Sø at frequencies exceeding . The frequencies of . and  in Bede and ChronE are higher than those typically demonstrated in the prose dataset, yet they are at the same time low enough that it may be questioned whether the occurrence of Sø is actually an expression of a stable partial pro-drop grammar in these texts. Sø occurring at proportions of  and below can only be considered ‘high’ in a context of generally very low frequencies. Additionally, the number of Anglian/Anglian-influenced texts used as evidence in favour of the dialect-split hypothesis is somewhat restricted, and it is likely that a larger dataset could present a different picture. In this connection, it may also be noted that the two texts with the highest number of pronominal tokens have the lowest relative frequencies for Sø, while the highest relative frequencies are found in the texts with the lowest number of tokens. The inverse relationship between text size and the proportion of null subjects in Table . is illustrated in Figure .. Contrary to expectations if the dialect-split hypothesis is correct, my data show that several Anglian-influenced texts display frequencies for Sø which are as low as those typically displayed by West Saxon texts.
For instance, the Blickling Homilies has . Sø, the C-manuscript of Gregory’s Dialogues (GD (C)) has ., while the Life of Saint Chad (LS ) and Marvels of the East (Marv) feature Sø at relative frequencies of . and , respectively. Furthermore, as also suggested in section .., a case could be made that ChronC should not be counted among the Anglian-influenced texts, and this text will in fact be considered West Saxon for the purposes of this study (see section .. for a short explanation of the rationale). Even with these provisions, Walkden is no doubt right that it is a conspicuous fact that the texts in his material which ‘display null subjects robustly’ are ‘Anglian [. . .] or exhibit Anglian features’ (Walkden : ). However, in the larger data material employed here, this observation does not hold. Of the ten texts featuring Sø at proportions exceeding  mentioned in section .., only three can unambiguously be considered Anglian-influenced: ChronD, Lch II, and Mart I.2 This, combined with the fact that several Anglian-influenced texts have Sø at very low relative frequencies, might suggest that the observed variation in subject expression among the texts could be attributable to factors other than dialect, or that the variation in fact could be entirely epiphenomenal. It is also a concern here that while many West Saxon texts display non-West Saxon features, the extent to which they do so varies considerably. Additionally, considerable care must be exercised when extrapolating from the existence of documented lexical and morphological differences between West Saxon and Anglian dialects to diatopically conditioned syntactic variation. That is, the presence of lexical, phonological, and morphological influence from Anglian argued by Fulk () to be present in West Saxon texts does not necessarily entail substantial syntactic influence in the same texts. Walkden does, of course, state that his suggestion is only a tentative one (: ), but it will nevertheless be examined closely here.

1 See the online supplement to the book for methodological details.

2 Of course, if ChronC is included, that proportion would increase to four of ten.

Figure .. Relationship between text size (measured in terms of the total number of subject pronouns) and proportions of null subjects in Table .
[Scatter plot: x-axis ‘Total pronouns’, y-axis ‘Proportion of pronouns null’; points labelled Bede, Beo, ChronC, ChronD, ChronE, and Lch II.]

In this chapter, then, quantitative evidence will be adduced and analysed in order to assess the dialect-split hypothesis. First, section . gives a survey and discussion of the distribution of null subjects in OE prose and poetry according to dialect features. However, dialect is not the only parameter according to which it is necessary to evaluate Walkden’s hypothesis, since it is conceivable that the concentration of null




subjects in the Anglian-influenced texts in his sample may be amenable to alternative explanations. Notably, Beowulf is a verse text, and it has been shown above (and in Rusten a) that null subjects are more common in the poetry than in the prose. Thus, genre variation in terms of prose and poetry is a potential alternative explanation for the high frequency in Beowulf. Furthermore, Bede and the sections of Bald’s Leechbook which display the highest frequencies for null subjects are early OE texts, meaning that the higher frequencies in these texts could be indicative of a general diachronic decline in occurrence between early and late periods of OE, as opposed to (or in addition to) diatopic variation. Moreover, Bede and Bald’s Leechbook are both translated texts, and it is possible that the density of null subjects in these texts could be attributed to influence from Latin.3 The significance of these considerations is that the dialect-split hypothesis could be weakened if it is possible to show that there are correlations between null subjects and genre, period, and translation status similar to that between null subjects and dialect (if there is one). The immediately following sections .–. will investigate correlations between null subjects and these variables in both OE prose and poetry. The investigation into the dialect-split hypothesis will be concluded in section ., where the relative importance of the variables of dialect, period, genre, and translation status will be assessed statistically by means of a generalized fixed-effects logistic regression model and a random forest.
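The comparison that the dialect-split hypothesis invites can be illustrated with a two-by-two table: null versus overt pronominal subjects in two groups of texts. The sketch below, in Python, computes the log odds ratio and an approximate standard error for such a table. In a logistic regression with a single binary predictor, the fitted coefficient equals this log odds ratio; the regression model announced above involves several predictors at once, so this is only the simplest special case. All counts are hypothetical, and the function name is introduced here purely for illustration.

```python
from math import exp, log, sqrt


def log_odds_ratio(null_a, overt_a, null_b, overt_b):
    """Return (log odds ratio, standard error) for group A relative to group B.

    The standard error uses the usual large-sample approximation
    sqrt(1/a + 1/b + 1/c + 1/d) and requires all four counts to be non-zero.
    """
    odds_a = null_a / overt_a
    odds_b = null_b / overt_b
    lor = log(odds_a / odds_b)
    se = sqrt(1 / null_a + 1 / overt_a + 1 / null_b + 1 / overt_b)
    return lor, se


# Hypothetical counts, invented for illustration only:
# group A = texts with Anglian features, group B = purely West Saxon texts.
lor, se = log_odds_ratio(null_a=30, overt_a=970, null_b=10, overt_b=990)
low, high = lor - 1.96 * se, lor + 1.96 * se

print(f"odds ratio: {exp(lor):.2f}")
print(f"approx. 95% interval: {exp(low):.2f} to {exp(high):.2f}")
```

A positive value indicates higher odds of a null subject in group A, but a difference of this kind says nothing about whether dialect, genre, period, or translation status is responsible, which is why the variables need to be modelled jointly.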

3.4 Null subjects according to dialect

The majority of the extant OE texts survive mainly in late West Saxon dialect form. This dialect, according to Fulk (: ), ‘was a literary language written everywhere in Anglo-Saxon England by the later tenth century’, and it is represented at its ‘most unmixed’ in the prose of e.g. Ælfric, abbot of Eynsham. The scarcity of non-West Saxon textual material means that it is a more than challenging endeavour to conduct in-depth investigations on syntactic differences between OE dialects. The fact that ‘almost all the texts are in the West Saxon dialect’, while substantial non-West Saxon works ‘consist mostly of interlinear glosses on parts of the Vulgate bible’ (and which therefore are problematic as syntactic evidence), leads Fischer et al. (: ) to state that there ‘is little scope for work on dialect syntax in Old English’. This most central problem in dialect-oriented investigations on historical syntax was highlighted in the workshop on Early English dialect morphosyntax at the th International Conference on Historical Linguistics. The conveners of the workshop point out that the view espoused by Fischer et al. may be ‘in need of qualification’ and that ‘syntactic dialect differences within OE [. . .] can be identified, provided that the (admittedly

3 These possibilities were also raised, and investigated inconclusively, in Rusten (a).


limited) non-West-Saxon OE material is used with care’ (de Haas & Walkden ).4 However, they also state that one should be careful that diatopic variation is not ‘conflated with diachronic change or other types of variation’. As pointed out in de Haas & Walkden (), several studies have identified cases of morphosyntactic variation in early English which may be attributed to dialect differences. Kroch & Taylor (), for instance, demonstrate quantitatively that V2 is more systematic in northern OE. They associate this fact with language contact with Old Scandinavian. Also on the basis of quantitative reasoning, Ingham () challenges ‘the traditional view that OE showed optional use of negative concord’, instead showing that the variation is conditioned by dialect (p. ). The studies on the Northern Subject Rule carried out by de Haas (, ), Cole () and de Haas & van Kemenade () identify regional differences in agreement patterns in OE and ME. Thus, the workshop conveners are no doubt right that there is a place for studies on dialectal variation in historical English syntax. The question for our purposes, then, is whether the (for the most part severely restricted) variation in the occurrence of Sø can clearly be shown to be a case of diatopic–syntactic variation, as Walkden tentatively argues, or whether the data also show correlations which indicate that Sø can be attributed to ‘diachronic change or other types of variation’ (de Haas & Walkden ).

4 Citations from de Haas & Walkden () are taken from the unpaginated online workshop abstract at http://www.arts.kuleuven.be/ling/ICEHL/workshops/early-english-dialect-morphosyntax.

3.4.1 Prose

Before the quantitative results for the prose can be presented, some preliminary statements on classification are necessary. When assigning dialect labels to texts, I have relied mainly on the dialect classifications provided by the YCOE and Helsinki corpora. However,  text files were left unclassified according to dialect features by the corpus compilers. Some of these texts are known to be authored by Ælfric and Wulfstan (e.g. ÆLet (Sigeward Z) and WCan ..), and it has been straightforwardly assumed that these texts are West Saxon. Similarly, the preface to Augustine’s Soliloquies (SolilPref) had not been assigned a dialect, while the main text of the Soliloquies (Solil) had been classified as West Saxon. SolilPref was thus assigned the same label as Solil. However, dialectal classification for a number of texts proved quite challenging. When classifying problematic texts according to dialect, I have relied primarily on statements in the relevant editions and in the philological literature. The problem was, however, especially acute for several texts taken from the Cotton Vespasian D. xiv manuscript, i.e. Alc, Aug, Eluc , LS , LS . These texts have been digitized on the basis of the edition by Warner (), which is rather bare-bones and restricted to transcriptions of the edited texts. Warner (: v) promised that a ‘second part

containing an Introduction, Notes, and Glossary is in preparation’, but as far as I have been able to determine, this work never saw publication. Since I have been unable to find authoritative statements on the existence of any non-West Saxon dialect features in these texts, the majority of them have been classified as West Saxon on the basis of the index to the Toronto corpus. This is true also for other texts for which classification proved excessively difficult. In uncertain cases, then, texts have been given the same label as that listed in the Toronto index, with the exception that some of the Saints’ Lives and homilies of unknown authorship have been classified as Anglian-influenced on the basis of the survey provided in Fulk (: –). Table . gives an overview of my classification of the YCOE texts for which dialect information was not given by the corpus. This classification is tentative, and represents a pragmatic approach to the thorny problem of identifying OE dialect features.5 As can be seen, the majority of the unclassified texts are taken to be West Saxon. This should be a largely uncontroversial decision, even though many of these texts are not as ‘purely’ West Saxon as the prose of Ælfric. For instance, the Vercelli Homilies do exhibit ‘a small number of distinctively early or nonWS forms’, even though the

Table .. Classification of Old English prose texts according to dialect WS

WSA

WSK

K

Alc Aug ChrodR  ChronC CP (C) HomS . (Scragg) HomU . (Scragg) LS  LS  LS  Nic (A) Nic (D) Nic (E) Sol II SolilPref Vercelli VSal ()

LS  ChronD LS  LS  Mart  Mart .

Nic (C)

Eluc 

5 I use the following abbreviations for dialect labels: ‘A’=Anglian, ‘AX’=Anglian+unknown dialect features, ‘K’=Kentish, ‘KX’=Kentish+unknown dialect features, ‘WS’=West Saxon, ‘WSA’=West Saxon+ Anglian dialect features, ‘WSK’=West Saxon+Kentish dialect features, ‘WSX’=West Saxon+unknown dialect features. See also section . in the online supplement to this book.


language generally is ‘conservative late WS’ (Scragg : xliii). The second-largest group comprises Anglian-influenced West Saxon texts. LS , LS , and LS  have been assigned this label since Fulk () finds relatively many Anglian features in these texts.6 ChronD is believed to have been compiled in Worcester, and is known to display a number of Anglian features, even though its language is primarily late West Saxon. Mart  and Mart . are also taken to be Anglian-influenced texts, since this label is given to Mart  by the YCOE analysts. Nic (C) has been classified as West Saxon influenced by Kentish on the basis of Allen (: ). Eluc  has been classified as Kentish by analogy with Eluc , which has been labelled as such by the YCOE analysts. ChronC, also referred to as the ‘Abingdon chronicle’, was composed on the border between West Saxon and Anglian dialect areas as typically represented in OE dialect maps,7 and Walkden (, , c) considers it an Anglian-influenced text. However, I have chosen to classify this text as West Saxon on the basis of O’Keeffe (), who notes that both the phonology (p. xcvi) and the morphology ‘of the C-text of the Chronicle is, in general, that of late West Saxon’ (p. cx). Table . gives the distribution of pronominal subjects in Old English prose according to dialect. It can be observed that the highest relative frequencies for Sø occur in texts which are labelled ‘WSX’ and ‘WSA’. Such texts have Sø at . and . of all subject pronouns, respectively. The corresponding frequency for texts simply labelled ‘WS’ is .. Thus, Sø is slightly more frequent in West Saxon texts which display dialect features classified either as ‘unknown’ or ‘Anglian’ than in texts classified as solely displaying West Saxon features. At this point, however, recall the

Table .. Subject pronouns in Old English prose according to dialect Dialect

Spron



Total

 Sø

A AX K KX WS WSA WSK WSX

    ,  ,   , 

       

    ,  ,   , 

.  .  .  .  .  .  .  . 

Total

, 



, 

. 

6 LS , LS , and LS , on the other hand, are classified as West Saxon, since Fulk’s survey shows that they have comparatively few Anglian features. It should be made clear Fulk does not draw a distinction between West Saxon and non-West Saxon texts; the subclassification is entirely mine. Clearly, there is some potential for inaccuracy here. 7 Such maps are, obviously, abstractions.


discussion of the grammaticality threshold of c. employed by e.g. Santorini (, ), Pintzuk (), and Bies (): it is not straightforwardly the case that the difference between