Cognitive, Social, and Individual Constraints on Linguistic Variation: A Case Study of Presentational 'Haber' Pluralization in Caribbean Spanish 9783110524154, 9783110521627

The present volume tries to answer the question: What constrains morphosyntactic variation? By analyzing the variable ag

English Pages 262 Year 2016

Table of contents :
Acknowledgements
Contents
List of figures
List of tables
Abbreviations and other conventions
Chapter 1: Introduction
1 Introduction
Part A: Preliminaries
Chapter 2: Presentational haber pluralization
2 Presentational haber pluralization
2.1 Earlier studies in (perceptual) dialectology
2.2 Earlier studies in variationist sociolinguistics
2.3 Summary
Chapter 3: Cognitive Construction Grammar and language variation
3 Cognitive Construction Grammar and language variation
3.1 Cognitive Construction Grammar
3.2 Cognitive Construction Grammar and language variation
3.3 Summary
Chapter 4: Research questions and hypotheses
4 Research questions and hypotheses
4.1 Research questions
4.2 Hypotheses
4.3 Summary
Chapter 5: Methodology
5 Methodology
5.1 Judgment sample, selection criteria, and stratification variables
5.2 Fieldwork methods
5.3 Transcription, selection of cases, and envelope of variation
5.4 Statistical toolkit
5.5 Comparative sociolinguistics
5.6 Summary
Chapter 6: Semantic and syntactic properties of presentational haber
6 Semantic and syntactic properties of presentational haber
6.1 The meaning of the presentational haber constructions: POINTING-OUT
6.2 The nominal argument
6.3 The adverbial phrase
6.4 Implicit nominal arguments and adverbial phrases
6.5 Summary and box diagrams
Part B: Cognitive, social, and individual constraints on presentational haber pluralization
Chapter 7: Cognitive constraints on presentational haber pluralization
7 Cognitive constraints on presentational haber pluralization
7.1 Structure of the corpus and the regression models
7.2 Markedness of coding
7.3 Statistical preemption
7.4 Structural priming
7.5 Interaction between the linguistic predictors
7.6 Relative importance of the linguistic predictors
7.7 Summary
Chapter 8: Social constraints on presentational haber pluralization
8 Social constraints on presentational haber pluralization
8.1 Social constraints: Age, education, and gender
8.2 Linguistic predictors across social groups
8.3 Relative importance of the linguistic predictors across social groups
8.4 Summary
Chapter 9: Individual constraints on presentational haber pluralization
9 Individual constraints on presentational haber pluralization
9.1 Distribution across individuals
9.2 Random intercepts, slopes, and participant-specific regression coefficients
9.3 Linguistic predictors across individuals
9.4 The behavior of the Havana university graduates
9.5 Summary
Chapter 10: Cognitive, social, and individual constraints on presentational haber pluralization
10 Cognitive, social, and individual constraints on presentational haber pluralization
10.1 Cognitive constraints
10.2 Social constraints
10.3 Individual constraints
10.4 Conclusion
References
Index

Author / Uploaded
Jeroen Claes


Jeroen Claes Cognitive, Social, and Individual Constraints on Linguistic Variation

Cognitive Linguistics Research

Editors
Dagmar Divjak
Dirk Geeraerts
John R. Taylor

Honorary editors
René Dirven
Ronald W. Langacker

Volume 60

Jeroen Claes

Cognitive, Social, and Individual Constraints on Linguistic Variation: A Case Study of Presentational ‘Haber’ Pluralization in Caribbean Spanish

ISBN 978-3-11-052162-7
e-ISBN (PDF) 978-3-11-052415-4
e-ISBN (EPUB) 978-3-11-052170-2
ISSN 1861-4132

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2016 Walter de Gruyter GmbH, Berlin/Boston
Printing and binding: CPI books GmbH, Leck
Printed on acid-free paper
Printed in Germany
www.degruyter.com

For Sara

Acknowledgements

The present volume constitutes a substantially revised and expanded version of my December 2014 University of Antwerp PhD dissertation. Therefore, it is only natural to acknowledge the many contributions of my former supervisors, Frank Brisard (University of Antwerp) and Nicole Delbecque (KU Leuven). Although Frank and Nicole only took up the supervision of my PhD research around April 2013, it is hard to overestimate the value of their contributions to substantial parts of the corpus analysis upon which this volume draws. Without their helpful criticism, the analyses of Chapter 7 would have been less rigorous and the conclusions drawn from them less solid.

In addition, many people have contributed to the intensive fieldwork upon which this investigation draws. First and foremost, I am indebted to the 72 participants of this study, who agreed to lend their time, voices, and experiences to the project. I should also thank Luis A. Ortiz-López (Universidad de Puerto Rico, Recinto de Río Piedras) and his research assistant Nadja Fuster, who did a great job looking for participants in San Juan. Sunny Cabrera-Salcedo, the Chair of the Graduate Linguistics Program at the Universidad de Puerto Rico, Recinto de Río Piedras, was kind enough to write a letter of recommendation that opened many doors. Her help in locating participants was also appreciated. Boris Carrasquillo, José-Alberto Santiago, and Naiska Guzmán drove me around San Juan to interview friends, acquaintances, or family of theirs. The Casas Culturales of the Municipality of San Juan in the boroughs of Tras Talleres and Chícharo were also a great help, as were the homeless shelter/rehabilitation center Fondita de Jesús and the elderly residence Sagrado Corazón.

In Santo Domingo, I could count on the support of Lilia Ramos (Pontificia Universidad Católica Madre y Maestra, Recinto Tomás de Aquino). The assistance of Ginia Montes de Oca (UNIBE Universidad Iberoamericana) was also of great value. Without their help, it would have been a lot more difficult to fill the quota for young university-educated participants.

While preparing the Havana fieldwork, I could enjoy the support of Sandra Valdés, who guided me through the process of applying for an academic visa. Through her friend Maia Barreda, Sandra also introduced me to Alejandro Sánchez (Universidad de La Habana, Facultad de Letras). It was a relief to hear upon my arrival in Havana that Alejandro already had a list of potential participants in mind. During my stay, he kept looking for more. I am also indebted to the staff of the Casa de Abuelos Eterna Juventud in the Centro Habana borough Cayo Hueso. Thanks to their collaboration, I was able to interview older participants at a very steady pace.

The final revision of the manuscript was performed during the last months of a one-year postdoctoral position at KU Leuven, Quantitative Lexicology and Variational Linguistics. I wish to thank my colleagues at KU Leuven for interesting discussions that have contributed to refining the ideas that are presented in this book. For the same reason, many thanks also go to Daniel Ezra Johnson. In addition, the suggestions of Paul O’Neill (University of Sheffield), who was commissioned by the Cognitive Linguistics Research series to review the manuscript, helped me structure the book more adequately. The remarks of the series editors Dagmar S. Divjak and Dirk Geeraerts also contributed to this. Julie Miess and Antonia Schräder at De Gruyter answered my every query with patience, speed, and precision.

Last, but certainly not least, I would like to thank my girlfriend Sara Lauwers, my parents Herman Bastiaens, Rudy Claes, and Veronique De Wit, my grandparents Jenny Claessens, Leon De Wit, Louis Claes, and Rita Patteet, my older brother Bart Claes, my sister-in-law Ellen Slachmuylders, my sister-in-law Kathleen Lauwers, my in-laws Ludo Lauwers and Lydia Van Espen, and my friends for their encouragement and support. My niece and godchild Olivia Claes, who was born around the time I completed the bulk of this investigation, also deserves a special mention. Although my family and friends do not always understand exactly what my work is about, they have always been there to help me cope with the, at times, difficult circumstances in which I have found myself doing research. Without their continuous encouragement and loving understanding, I would most certainly not have completed this project.

Contents

1 Introduction

Part A: Preliminaries

2 Presentational haber pluralization
2.1 Earlier studies in (perceptual) dialectology
2.2 Earlier studies in variationist sociolinguistics
2.3 Summary

3 Cognitive Construction Grammar and language variation
3.1 Cognitive Construction Grammar
3.2 Cognitive Construction Grammar and language variation
3.3 Summary

4 Research questions and hypotheses
4.1 Research questions
4.2 Hypotheses
4.3 Summary

5 Methodology
5.1 Judgment sample, selection criteria, and stratification variables
5.2 Fieldwork methods
5.3 Transcription, selection of cases, and envelope of variation
5.4 Statistical toolkit
5.5 Comparative sociolinguistics
5.6 Summary

6 Semantic and syntactic properties of presentational haber
6.1 The meaning of the presentational haber constructions: POINTING-OUT
6.2 The nominal argument
6.3 The adverbial phrase
6.4 Implicit nominal arguments and adverbial phrases
6.5 Summary and box diagrams

Part B: Cognitive, social, and individual constraints on presentational haber pluralization

7 Cognitive constraints on presentational haber pluralization
7.1 Structure of the corpus and the regression models
7.2 Markedness of coding
7.3 Statistical preemption
7.4 Structural priming
7.5 Interaction between the linguistic predictors
7.6 Relative importance of the linguistic predictors
7.7 Summary

8 Social constraints on presentational haber pluralization
8.1 Social constraints: Age, education, and gender
8.2 Linguistic predictors across social groups
8.3 Relative importance of the linguistic predictors across social groups
8.4 Summary

9 Individual constraints on presentational haber pluralization
9.1 Distribution across individuals
9.2 Random intercepts, slopes, and participant-specific regression coefficients
9.3 Linguistic predictors across individuals
9.4 The behavior of the Havana university graduates
9.5 Summary

10 Cognitive, social, and individual constraints on presentational haber pluralization
10.1 Cognitive constraints
10.2 Social constraints
10.3 Individual constraints
10.4 Conclusion

References
Index

Appendix A: Story-reading task
Appendix B: Questionnaire-reading task

List of figures

Fig. 1: The English ditransitive construction instantiated by to hand, adapted from Goldberg (1995: 51)
Fig. 2: The non-agreeing presentational haber construction
Fig. 3: The agreeing presentational haber construction
Fig. 4: Distribution of tokens of presentational haber across participants in the Havana, Santo Domingo, and San Juan datasets
Fig. 5: Counts of non-agreeing and agreeing presentational haber in Havana, Santo Domingo, and San Juan
Fig. 6: Effect of typical action-chain position of the referent of the noun on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan
Fig. 7: Effect of the absence/presence of negation on the log-odds of agreeing presentational haber in Havana, Santo Domingo (alternative model), and San Juan
Fig. 8: Effect of tense on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan
Fig. 9: Present- and preterit-tense tokens of presentational haber in Havana, Santo Domingo, and San Juan, by absence/presence of aspectual or modal auxiliary constructions
Fig. 10: Effect of production-to-production priming on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan
Fig. 11: Effect of comprehension-to-production priming on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan
Fig. 12: Conditional inference tree model showing the interaction of linguistic predictors in Havana
Fig. 13: Conditional inference tree model showing the interaction of linguistic predictors in Santo Domingo
Fig. 14: Conditional inference tree model showing the interaction of linguistic predictors in San Juan
Fig. 15: Constraint ranking for the linguistic predictors in Havana, Santo Domingo, and San Juan
Fig. 16: Effect of age on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan in alternative regression models
Fig. 17: Effect of style on the log-odds of agreeing presentational haber in Havana (final model), Santo Domingo, and San Juan (alternative models)
Fig. 18: Interaction effect of style and tense on the log-odds of agreeing presentational haber in Havana
Fig. 19: Effect of education on the log-odds of agreeing presentational haber in Havana (final model), Santo Domingo (alternative model), and San Juan (final model)
Fig. 20: Effect of gender on the log-odds of agreeing presentational haber in Havana (alternative model), Santo Domingo, and San Juan (final models)
Fig. 21: Conditional inference tree model showing the interaction of linguistic and social predictors in Havana
Fig. 22: Conditional inference tree model showing the interaction of linguistic and social predictors in Santo Domingo
Fig. 23: Conditional inference tree model showing the interaction of linguistic and social predictors in San Juan
Fig. 24: Constraint rankings for non-university-educated participants (left) and university-educated participants (center) as compared to the overall Havana constraint ranking (right)
Fig. 25: Constraint rankings for female participants (left) and male participants (center) as compared to the overall Santo Domingo constraint ranking (right)
Fig. 26: Constraint rankings for female participants (left) and male participants (center) as compared to the overall San Juan constraint ranking (right)
Fig. 27: Constraint rankings for non-university-educated participants (left) and university-educated participants (center) as compared to the San Juan constraint ranking (right)
Fig. 28: Usage rates of agreeing presentational haber for the individual participants included in the Havana, Santo Domingo, and San Juan datasets
Fig. 29: Random intercepts for the participants included in the Havana, Santo Domingo, and San Juan datasets (log-odds)
Fig. 30: Effect of typical action-chain position on the log-odds of agreeing presentational haber in Havana, by speaker
Fig. 31: Effect of typical action-chain position on the log-odds of agreeing presentational haber in Santo Domingo, by speaker
Fig. 32: Effect of tense on the log-odds of agreeing presentational haber in San Juan, by speaker
Fig. 33: Conditional inference tree model showing the interaction of linguistic predictors and speakers in Havana
Fig. 34: Conditional inference tree model showing the interaction of linguistic predictors and speakers in Santo Domingo
Fig. 35: Conditional inference tree model showing the interaction of linguistic predictors and speakers in San Juan
Fig. 36: Conditional inference tree model showing the interaction of linguistic predictors and the Havana university graduates

List of tables

Tab. 1: Presentational haber pluralization in the corpus of the Project for the coordinated study of the educated norms of the most important Latin American cities
Tab. 2: Linguistic predictors considered by Díaz-Campos (2003)
Tab. 3: Linguistic predictors considered by D’Aquino-Ruiz (2004)
Tab. 4: Linguistic predictors considered by Freites-Barros (2008)
Tab. 5: Linguistic predictors considered by Quintanilla-Aguilar (2009)
Tab. 6: Overview of constructions of different sizes and degrees of schematicity
Tab. 7: Some compound paradigms of the verb cantar ‘to sing’
Tab. 8: Composition of the sample
Tab. 9: Number of participants from Havana who completed the story- and questionnaire-reading tasks with the help of the interviewer, by age, educational achievement, and gender
Tab. 10: Number of participants from Santo Domingo who completed the story- and questionnaire-reading tasks with the help of the interviewer, by age, educational achievement, and gender
Tab. 11: Number of participants from San Juan who completed the story- and questionnaire-reading tasks with the help of the interviewer or did not complete the tasks, by age, educational achievement, and gender
Tab. 12: Forms of presentational haber included in the story-reading task, by animacy and the absence/presence of negation
Tab. 13: Forms of presentational haber included in the questionnaire-reading task, by animacy and the absence/presence of negation
Tab. 14: Simulated dataset illustrating the dependence of statistical significance upon sample size
Tab. 15: Agreeing and non-agreeing presentational haber in the Spanish of Havana, Santo Domingo, and San Juan
Tab. 16: Logistic generalized linear mixed-effects model of presentational haber pluralization in Havana (sum contrasts, bobyqa optimizer)
Tab. 17: Logistic generalized linear mixed-effects models of presentational haber pluralization in Santo Domingo (sum contrasts, bobyqa optimizer)
Tab. 18: Logistic generalized linear mixed-effects models of presentational haber pluralization in San Juan (sum contrasts, bobyqa optimizer)
Tab. 19: Collocations table
Tab. 20: Frequency counts and ∆P for different third-person singular forms of haber in the Latin American section of Corpus diacrónico del español (1492–1600)
Tab. 21: Frequency counts and ∆P for different third-person singular forms of haber in the twentieth-century section of Corpus del español
Tab. 22: Present- and preterit-tense tokens of presentational haber in Havana, Santo Domingo, and San Juan, by absence/presence of aspectual or modal auxiliary constructions
Tab. 23: Present- and preterit-tense tokens of presentational haber without aspectual or modal auxiliary constructions in Havana, Santo Domingo, and San Juan
Tab. 24: Presentational haber tokens that co-occur with object pronouns in Havana, Santo Domingo, and San Juan, by production-to-production priming and comprehension-to-production priming
Tab. 25: Logistic generalized linear mixed-effects model of presentational haber pluralization in Havana (sum contrasts, bobyqa optimizer)
Tab. 26: Logistic generalized linear mixed-effects model of presentational haber pluralization in Santo Domingo (sum contrasts, bobyqa optimizer)
Tab. 27: Logistic generalized linear mixed-effects model of presentational haber pluralization in San Juan (sum contrasts, bobyqa optimizer)
Tab. 28: Social profiles of the five participants who use agreeing presentational haber less often in Havana, Santo Domingo, and San Juan
Tab. 29: Social profiles of the five participants who use agreeing presentational haber most often in Havana, Santo Domingo, and San Juan
Tab. 30: Social profiles of the Havana university graduates

Abbreviations and other conventions

                      Construction schema
[]                    Literal translation
ACC                   Accusative case
AdvP                  Adverbial phrase
AICc                  Second-Order Akaike Information Criterion
Boldface              Profiled portions of event frames
LH01H22/LH33          The codes at the end of the examples identify the cases in the corpus: LH=Havana (SD=Santo Domingo, SJ=San Juan); 01=informant number 1; H=male informant (M=female); 2=55+ years of age (1=20-35 years of age); 2=university graduate (1=less than university). The code after the slash is the identifier of the specific example.
NOM                   Nominative case
Obj                   Direct object
Obj1                  Indirect object of a three-argument construction
Obj2                  Direct object of a three-argument construction
PL                    Agreeing presentational haber
RAE and ASALE (2009)  Real Academia Española and Asociación de Academias de la Lengua Española (2009)
SG                    Non-agreeing presentational haber
Subj                  Subject
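The example-code convention described above (LH01H22/LH33) can be unpacked mechanically. The following is only a minimal sketch, not part of the book: the names `ExampleCode` and `parse_example_code` are hypothetical, and the pattern assumes that informant numbers always have two digits and that the example identifier after the slash is alphanumeric. The identifier is kept as an opaque string, since its prefix does not always match the city code (compare SD19M12/RD2547).

```python
import re
from dataclasses import dataclass

# Lookup tables taken from the convention stated in the abbreviations list.
CITIES = {"LH": "Havana", "SD": "Santo Domingo", "SJ": "San Juan"}
GENDERS = {"H": "male", "M": "female"}
AGE_GROUPS = {"1": "20-35", "2": "55+"}
EDUCATION = {"1": "less than university", "2": "university graduate"}

# Assumption: two-digit informant number; alphanumeric example identifier.
CODE_PATTERN = re.compile(r"^(LH|SD|SJ)(\d{2})([HM])([12])([12])/(\w+)$")


@dataclass(frozen=True)
class ExampleCode:
    """Hypothetical container for one decoded example code."""
    city: str
    informant: int
    gender: str
    age_group: str
    education: str
    example_id: str


def parse_example_code(code: str) -> ExampleCode:
    """Decode a code such as 'LH01H22/LH33' into its labelled parts."""
    match = CODE_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not a recognised example code: {code!r}")
    city, informant, gender, age, education, example_id = match.groups()
    return ExampleCode(
        city=CITIES[city],
        informant=int(informant),
        gender=GENDERS[gender],
        age_group=AGE_GROUPS[age],
        education=EDUCATION[education],
        example_id=example_id,
    )


if __name__ == "__main__":
    print(parse_example_code("SJ03H22/SJ327"))
```

Keeping the lookup tables separate from the pattern makes it easy to adjust the sketch if the corpus turns out to use other values than the ones listed in the convention.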

Chapter 1: Introduction

1 Introduction

In recent years, the usage-based approach to language (e.g., Bybee 2010; Langacker 1990: Chap. 10) has led cognitive linguists to move increasingly away from introspective methods, in favor of corpus investigation and experimentation. Because these data sources inevitably confront the researcher with social, regional, and stylistic variability (e.g., Geeraerts 2005; Geeraerts and Kristiansen 2015), this methodological shift has sparked interest in language-internal variation and its sociocultural covariates. In this context, Cognitive Sociolinguistics emerged, which proposes that a complete understanding of language can only be reached when the social and cultural factors shaping usage events are considered together with the cognitive ones (Geeraerts and Kristiansen 2015: 366-371; Pütz, Robinson, and Reif 2012).

Because of its general orientation, one would expect Cognitive Sociolinguistics to inspire the enthusiasm of variationist linguistics, including both variationist sociolinguistics in the tradition of Weinreich, Labov, and Herzog (1968) and Probabilistic Grammar, the fairly recent school of thought initiated by Bresnan et al. (2007). Yet, so far, in spite of constituting a booming research endeavor, Cognitive Sociolinguistics remains mostly an enterprise that is practiced by cognitive linguists in the interest of developing Cognitive Linguistics, with results and methods that – innovative as they may be in this context – have little to no impact on the broader community of variationists. For example, it is telling that the third and last volume of Labov’s (2010) magnum opus Principles of Linguistic Change, which deals with the core business of Cognitive Sociolinguistics (i.e., cognitive and cultural factors in language variation), does not include a single reference to work in this field. Similarly, when Probabilistic Grammar invokes hypotheses about the cognitive processes that support language to account for results, those are drawn from psycholinguistic work, rather than from Cognitive Sociolinguistics (see e.g., Szmrecsanyi et al. 2016). Geeraerts and Kristiansen (2015) and Pütz, Robinson, and Reif (2012) offer similar evaluations of the state of the field.

In my analysis, this is mainly due to the fact that Cognitive Sociolinguistics, variationist sociolinguistics, and Probabilistic Grammar approach the question “What constrains linguistic variation?” from different, but not at all incompatible, angles. That is, the first research tradition operates under the working hypothesis that “a difference in syntactic form always spells a difference in meaning” (Bolinger 1968: 127). Therefore, the question as to what constrains linguistic variation can be restated as: What are the semantic differences between alternating linguistic forms? The core business of this strand of variationist linguistics, then, may be defined as the corpus-based study of these subtle semantic differences, taking into account social-interactional meanings such as regional origin (see, e.g., Colleman 2010; Grondelaers, Geeraerts, and Speelman 2009; Levshina, Geeraerts, and Speelman 2013). As a result, Cognitive Sociolinguistics remains mostly concerned with the research questions that have occupied Cognitive Linguistics from the start, that is, individuals’ grammatical and semantic knowledge and the domain-general cognitive abilities that are deployed in applying this knowledge in language use (Croft 2009: 397).

In contrast, variationist sociolinguistics in the tradition of Weinreich, Labov, and Herzog (1968) operates under the working hypothesis that expressions that refer to the same state of affairs are “alternate, semantically equivalent ways of saying ‘the same thing’” (Labov 1982: 18), which are constrained by a set of tacit, abstract norms that reside in the speech community and are exterior to the individual (Labov 2010: 7; Weinreich, Labov, and Herzog 1968: 188). As a result, the central activity of this strand of variationist linguistics may be defined as the study of those abstract, social norms that regulate the social and linguistic conditioning of alternations. The discipline has therefore never felt a strong need to investigate the behavior of individual speakers, the cognitive capacities that allow individuals to represent sociolinguistic behavior, or the meanings of variants beyond establishing their referential equivalence (but see, e.g., Lavandera 1978 and Romaine 1984 for early variationist critiques of this approach).

Probabilistic Grammar finds itself in between these two positions. Like Cognitive Sociolinguistics, this discipline approaches language as a cognitive capacity of the individual. Like variationist sociolinguistics, it is mainly interested in investigating which linguistic and non-linguistic features constrain the use of alternating ways of referring to the same state of affairs, so that its overall orientation remains closer to this strand of variationist research. However, unlike the former two disciplines, Probabilistic Grammar argues that speakers extract and store the probabilities of encountering a particular alternant in a given linguistic context and that these probabilities constrain speakers’ use of alternating linguistic forms (Bresnan 2007; Bresnan et al. 2007; Bresnan and Ford 2010; Bresnan and Hay 2008). As a result, the central activity of Probabilistic Grammar may be defined as uncovering the contexts and probabilities that guide speakers’ usage, and the discipline has never felt a strong need to explain correlation patterns in terms of the semantics of variants or abstract cognitive principles (but see e.g., Szmrecsanyi et al. 2016).

The fact that multiple disciplines of linguistics are concerned with variation shows that understanding linguistic variability and its structure is important for understanding language. Yet, because the different strands of variationist research approach this issue from different angles, there is hardly any communication – let alone, cross-fertilization – between them. While some may consider this situation a natural and unproblematic consequence of the different orientations and sensitivities of cognitive sociolinguists and the larger community of variationists, I follow Geeraerts and Kristiansen (2015:378) in considering that, to be successful, Cognitive Sociolinguistics “will have to interact intensively with existing variationist linguistics, and defend the specific contribution of Cognitive (Socio)linguistics in that context.” This book will argue for one such contribution. In the chapters to follow, I will show that Cognitive Sociolinguistics offers all the necessary components to construct a psychologically plausible theoretical model of the constraints that condition morphosyntactic variation, which generates empirically falsifiable predictions. Specifically, I will explore the hypothesis that usage patterns in morphosyntactic variation reflect the joint action of three types of constraints: cognitive, social, and individual. The first of these types is related to the way language is represented mentally and how these representations are accessed and retrieved during language production. The second group of constraints refers to the fact that while producing language, speakers are not only concerned with expressing ideas, but also with positioning themselves in the social landscape as members of particular geographic and social groups. Of course, any speaker of any language has his or her own idiosyncrasies, which cannot be reduced to the other two types of constraints, but rather reflect individuals’ attitudes, cognitive capacities, or social histories. 
This is what is meant by the third type of constraints.

To explore this hypothesis, I investigate a case study in Cuban, Dominican, and Puerto Rican Spanish of agreement variation under presentational haber, the Spanish analog of English there is/there are. Like its English counterpart, this construction is used to present new information to the hearer – hence, the label ‘presentational’. However, unlike English there is/there are, the presentational haber construction does not display verb agreement with the nominal argument in normative usage. Rather, in example (1), the verb takes the third-person singular verb ending –a, whereas the nominal argument fiestas ‘parties’ is plural, as is shown by the presence of the nominal plurality suffix –s. Since the noun phrase pronominalizes as the accusative pronoun las in example (2), we may consider that the absence of verb agreement indicates that the nominal argument functions as a direct object in normative usage (Gili-Gaya 1980: 78; RAE and ASALE 2009: §41.6b).

(1) Entonces, él siempre estaba velando en el periódico donde era que había fiestas (SJ03H22/SJ327).
‘So, he was always watching in the newspaper where it was that there was parties.’

(2) Interviewer: ¿Y también habían comidas que sólo se preparaban en fiestas, por ejemplo?
Participant: Sí, claro y todavía las hay (SD19M12/RD2547).
Interviewer: ‘And were there also dishes that were only prepared on holidays, for example?’
Participant: [Yes, of course, and still themAcc there is.]
Participant: ‘Yes, of course, and there still is.’

However, in all informal varieties of Spanish, including those spoken in Cuba, the Dominican Republic, and Puerto Rico (Vaquero 1996: 44), a second variant of the presentational haber construction occurs, which features verb agreement with plural noun phrases (see example [3], where the ending –an marks third-person plural; D’Aquino-Ruiz 2008; Kany [1945] 1951: 256–259). The alternation between these two constructions is known as ‘presentational haber pluralization’.

(3) De seguro, no había televisión y, e, no habían computadores (SD04M22/RD437).
‘Surely, there was no television and, er, there weren’t any computers.’

As will become clear in Chapter 2, it is already quite well known in which Spanish-speaking regions the variation between the agreeing and the non-agreeing presentational haber constructions occurs and with which linguistic and social predictors it correlates. However, this should not discourage revisiting the phenomenon. Quite the contrary, since its patterns of variation are already well understood, it constitutes an ideal testing ground for hypotheses about the way alternations are constrained, in our case, the claim that morphosyntactic variation is conditioned by cognitive, social, and individual constraints.
The analyses will draw extensively on concepts and constructs from Cognitive Linguistics (in particular, Cognitive Construction Grammar; Goldberg 1995, 2006a), psycholinguistics, and variationist sociolinguistics. To ensure that the argumentation benefits scholars coming from any of these three disciplines and beyond, I will only assume limited familiarity with any specific research tradition. Since the implications of the results reported in this volume go far beyond the specific case study, no familiarity will be assumed with Spanish or Hispanic (socio)linguistics. To this end, in Part A I will start by introducing the necessary background. In particular, in Chapter 2 the literature on presentational haber pluralization will be reviewed. This will lead to the conclusion that the phenomenon occurs in many varieties of Spanish with similar linguistic constraints and recurring patterns of social covariation. Subsequently, Chapter 3 will introduce Cognitive Construction Grammar and the cognitive, social, and individual constraints on language variation that follow from the language production model that is assumed by this theory. Chapter 4 will apply these constraints to formulate a set of hypotheses regarding the frequency of presentational haber pluralization in specific linguistic environments across speech communities, within social groups belonging to these communities, and among individuals belonging to these groups. Chapter 5 will present the comparative sociolinguistic methodology that was applied in Havana, Santo Domingo, and San Juan to gather the spoken-language corpus of this study, as well as the innovative statistical toolkit that will be used to model presentational haber pluralization. Before concluding Part A, Chapter 6 will examine the pragmatic, semantic, and syntactic properties of the agreeing and the non-agreeing presentational haber constructions to determine whether these constructions can be considered near-synonyms.

Part B will be concerned with the way the statistical results obtained for Havana, Santo Domingo, and San Juan bear on the main hypothesis of this book and the hypotheses of Chapter 4. In particular, Chapter 7 will investigate whether the patterns of covariation with linguistic predictors support the claim that the alternation is conditioned by cognitive constraints.
Subsequently, Chapter 8 will try to establish with which social groups the frequent use of agreeing or non-agreeing presentational haber is associated and whether this motivates seeing the variation as an ongoing linguistic change. The chapter will also investigate whether the effects of cognitive constraints are stable across social groups. Chapter 9 will assess whether any individual-based constraints can be detected on the use of presentational haber pluralization and whether the cognitive constraints have similar effects for all individuals. To conclude, Chapter 10 will answer the research questions and the results will be situated in a broader theoretical perspective. Let us now turn to the review of the literature on presentational haber pluralization.


Part A: Preliminaries


Chapter 2: Presentational haber pluralization

2 Presentational haber pluralization

In this chapter, I will try to establish where the alternation between agreeing and non-agreeing presentational haber occurs and with which linguistic and social predictors it correlates. To this end, Section 2.1 will review the (perceptual) dialectological literature on presentational haber pluralization. Next, Section 2.2 will provide an overview of the results of previous sociolinguistic analyses. The chapter concludes with a brief summary in Section 2.3, in which I will highlight the main trends that emerge from the literature, as well as some of the limitations of earlier studies.

2.1 Earlier studies in (perceptual) dialectology

In general, presentational haber pluralization has been documented in the majority of Spanish dialects (RAE and ASALE 2009: §41.6b). However, some differences can be found between regions when it comes to the degree to which speakers accept agreeing presentational haber. For Spain, the dialectological record suggests that haber pluralization occurs frequently in the varieties of Catalonia, eastern Andalucía, eastern Aragón, eastern Castilla-La Mancha, eastern Murcia, the Valencian Community, and the Canary Islands (Álvarez-Nazarío 1991: 490; Catalán 1989: 155, 199; Gili-Gaya 1980: 78; Llorente 1980: 30; Pérez-Martín 2007; RAE and ASALE 2009: §41.6b) and that it only occurs sporadically in the varieties of Alcalá de Henares (Blas-Arroyo 2016), Cantabria (Nuño-Álvarez 1996: 190), Extremadura (Álvarez-Martínez 1996: 180), Granada (Blas-Arroyo 2016), Madrid (Quilis 1983: 94), Málaga (Blas-Arroyo 2016), Old Castile (Hernández-Alonso 1996: 209), and Seville (DeMello 1991: 449). Indeed, for the former group of varieties, it has been observed that presentational haber pluralization occurs in the speech of all social groups, and even in the written language, without any negative connotation attached to it (Álvarez-Nazarío 1991: 490; Blas-Arroyo 1995: 179, 1999: 55, 2016; Catalán 1989: 155, 199; Gómez-Molina 2013; Pérez-Martín 2007). However, recent data contributed by Pato (2016) and Claes (2017) suggest that haber pluralization is far more widespread than the dialectological record gives reason to believe. Particularly, Pato (2016) shows that agreeing presentational haber can be documented throughout rural Spain. Similarly, Claes (2017), who analyzes a corpus of 5,500 tweets, documents cases of presentational haber pluralization for all Autonomous Communities. Still, these two investigations also indicate that in eastern Spain, agreeing presentational haber is much more frequent, reaching a frequency of 32.74% (N=370/1,130) in the Twitter corpus analyzed by Claes (2017).

Additionally, all Latin American varieties of Spanish feature presentational haber pluralization, albeit in different proportions according to the local social evaluation (Fontanella de Weinberg 1992a: 152–153; Moreno de Alba 1995: 191). In this regard, Kany (1951: 257–259) argues that presentational haber pluralization occurs particularly frequently in Argentina, Central America, and Chile. In a review article of Kany (1951), Flórez (1946: 379) adds that in Bogotá (Colombia), agreeing presentational haber is also quite frequently found among the lower and middle classes. In contrast, DeMello (1991: 449) shows that presentational haber pluralization also occurs among university-educated speakers from this city. Moreover, the use of agreeing presentational haber seems to be a feature of the supranational Latin American standard variety, because it appears in every city included in the corpus of the Project for the coordinated study of the educated norms of the most important Latin American cities,¹ as can be seen in Tab. 1.

Tab. 1: Presentational haber pluralization in the corpus of the Project for the coordinated study of the educated norms of the most important Latin American cities

City                 N          %
Buenos Aires         3/82       4%
Mexico City          7/92       8%
Bogotá               20/127     16%
Havana               12/45      27%
San Juan             29/95      31%
Caracas              55/153     36%
Santiago de Chile    51/132     39%
Lima                 42/104     40%
La Paz               50/83      60%
Total                269/1038   26%

Source: DeMello (1991: 449)
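
The per-city percentages in Tab. 1 follow directly from the reported counts of agreeing tokens over all tokens. The short Python sketch below recomputes them and lists the cities whose rate exceeds a given cut-off. It is only an illustration: the names `TAB1_COUNTS`, `agreement_rate`, and `cities_above` are hypothetical, the 25% cut-off is chosen for the example, and nothing here reproduces DeMello's own procedure.

```python
# Counts of agreeing tokens and total tokens per city, copied from Tab. 1
# (DeMello 1991: 449). All names in this sketch are illustrative assumptions.
TAB1_COUNTS = {
    "Buenos Aires": (3, 82),
    "Mexico City": (7, 92),
    "Bogotá": (20, 127),
    "Havana": (12, 45),
    "San Juan": (29, 95),
    "Caracas": (55, 153),
    "Santiago de Chile": (51, 132),
    "Lima": (42, 104),
    "La Paz": (50, 83),
}


def agreement_rate(agreeing: int, total: int) -> float:
    """Return the share of agreeing tokens as a percentage."""
    return 100 * agreeing / total


def cities_above(counts: dict, threshold: float) -> list:
    """List, in alphabetical order, the cities whose rate exceeds `threshold` percent."""
    return sorted(
        city
        for city, (agreeing, total) in counts.items()
        if agreement_rate(agreeing, total) > threshold
    )


if __name__ == "__main__":
    # Print each city's rate rounded to a whole percentage, as in the table.
    for city, (agreeing, total) in TAB1_COUNTS.items():
        rate = agreement_rate(agreeing, total)
        print(f"{city}: {agreeing}/{total} = {rate:.0f}%")
    # Cities above an illustrative 25% cut-off.
    print(cities_above(TAB1_COUNTS, 25))
```

Keeping the raw counts next to the percentages makes it easy to see how small some of the per-city samples are (for instance, 45 tokens for Havana), which is worth bearing in mind when comparing rates across cities.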

¹ In Spanish: Proyecto de estudio coordinado de la norma lingüística culta de las principales ciudades de Hispanoamérica.


Still, the distributions in the table suggest some differences across the continent. In particular, according to the data represented in Tab. 1, university-educated speakers from Caracas, Havana, La Paz, Lima, San Juan, and Santiago de Chile use agreeing presentational haber in more than 25% of the cases. In contrast, Tab. 1 suggests that presentational haber pluralization is rather infrequent among university-educated speakers from Bogotá, Buenos Aires, and Mexico City. However, at least for Buenos Aires and Mexico City, this is not corroborated by other studies, which have found agreeing presentational haber to be a frequently occurring feature of educated Argentinean (Fontanella de Weinberg 1987: 154, 1992a: 152–153, 1992b: 36) and Mexican Spanish (Castillo-Trelles 2007: 75; Lope-Blanch 1996: 83; Montes de Oca 1994: 21).

With respect to conditioning environments, DeMello (1991: 460) indicates that presentational haber pluralization occurs frequently with the imperfect tense (había, habían). It appears to be less frequent with the preterit (hubo, hubieron) and, especially, the present tense (hay, hayn), for which DeMello (1991: 460) only documents the non-agreeing forms. For preterit hubieron, research in Venezuela suggests that its less frequent use may be due to social stigma (Freites-Barros 2003, 2004; Malaver 1999). The agreeing present tense, which is usually transcribed as hayn, haen, or hain, is also stigmatized. Still, these forms have been documented in Antillean (Holmquist 2008: 28; Vaquero 1996: 64), Antioquian Colombian (Montes-Giraldo 1982: 384), Argentinean (Kany 1951: 256–257), and Venezuelan Spanish (Lapesa 1981: §133; Navarro-Correa 1992: 98).

Turning now to Caribbean Spanish, Vaquero (1996: 64) indicates that the Latin American tendency towards establishing plural agreement with presentational haber can also be observed in the Greater Antilles. Indeed, Kany (1951: 259) cites examples drawn from the work of Cuban and Puerto Rican writers. For Cuba, Padrón (1949: 144) adds to this: “[i]n popular speech, the cases of verb agreement of the impersonal verb with its apparent subject are frequent.”² However, judging from the data tabulated by DeMello (1991: 449; see Tab. 1), presentational haber pluralization is not limited to the popular classes. Rather, the phenomenon appears to be particularly vibrant among young university-educated speakers (Aleza-Izquierdo 2011: 38–41).

For the Dominican Republic, Henríquez-Ureña ([1940] 1982: 224) observes that in the Santo Domingo of the 1930s, only the lower classes use the agreeing construction. More recent reports (Alvar-López 2000: 338; Jiménez-Sabater 1978: 178, 1984: 165) document a generalized use of agreeing presentational haber throughout the country among all strata of the population. This is confirmed by Jorge-Morel (1978: 127), who finds that, in Santo Domingo, individuals of all educational backgrounds report using the agreeing preterit form hubieron, although more of the uneducated participants admit to using it. Additionally, Fernández’s (1982: 93, 102) attitude study shows that two thirds of the students of the Pontificia Universidad Católica Madre y Maestra³ consider future- and imperfect-tense agreeing presentational haber to be correct and part of the standard language. However, the agreeing preterit form hubieron is considered correct by less than 50% of the students (46%, N=62/135 students). Yet, more recently, Alvar-López (2000: 338) finds that “[a]ll social classes use hubieron.”⁴ This is confirmed by Alba (2004: 323), who reports that 53% (N=73/138 students) of his sample of university students consider this form to be correct. These data appear to warrant the conclusion that, contrary to the stigma that rests upon hubieron in, for example, Venezuela (Malaver 1999; Freites-Barros 2003, 2004), the majority of Dominicans are firmly convinced of the normative correctness of agreeing presentational haber (Alba 2004: 28).

For Puerto Rico, Navarro-Tomás (1948: 131) observes in 1927–1928 that agreeing instances of presentational haber “are not only heard in rural settings, but also, as is the case in other countries, in the informal language of the urban classes.”⁵ Similarly, Álvarez-Nazarío (1991: 490, 709) points out that presentational haber pluralization can be found in the speech of all social classes. Indeed, Tab. 1 suggests that university-educated speakers do not refrain from using agreeing presentational haber. Vaquero’s (1978: 135–140) attitude study points in the same direction, as it shows that about one third of the students of the Río Piedras campus of the Universidad de Puerto Rico identify agreeing imperfect habían as correct (34%, N=98/288 students). As was the case in the Dominican Republic, a similar figure is found for agreeing preterit hubieron (29%, N=84/288 students). Finally, López-Morales (1992: 147) reports that 63% of university-educated speakers residing in San Juan consider agreeing imperfect habían to be correct.

In sum, this section has shown that presentational haber pluralization constitutes a widespread phenomenon that appears in most regional varieties of Spanish. In Caribbean Spanish, it occurs in the speech of all social strata, including university-educated speakers. In some varieties, the use of agreeing preterit hubieron is stigmatized, but in Cuba, the Dominican Republic, and Puerto Rico speakers appear to consider all non-present agreeing forms of presentational haber part of the standard language. In contrast, agreeing present-tense hayn is absent from the speech of university-educated speakers, but it may still be found in Antillean Spanish. Let us now consider the results of earlier sociolinguistic studies of presentational haber pluralization.

² In the original: “[e]n el habla popular son frecuentes los casos de concordancia del impersonal con el sujeto aparente” (Padrón 1949: 144).
³ A university located in Santiago de los Caballeros, the second largest city of the Dominican Republic.
⁴ In the original: “[t]odas las clases sociales emplean hubieron” (Alvar-López 2000: 338).
⁵ In the original: “se oyen … no sólo en los medios rurales sino también, como en otros países, en el lenguaje familiar de las clases urbanas” (Navarro-Tomás 1948: 131).

2.2 Earlier studies in variationist sociolinguistics

Presentational haber pluralization has been the subject of various systematic investigations in variationist sociolinguistics. In this section, I will restrict the review of this literature to those studies that have used fixed-effects logistic regression (called VARBRUL/GoldVarb analysis in sociolinguistics; see Section 5.4.1). This will facilitate the identification of recurring linguistic constraints and social covariation patterns. The latter are of particular interest if one wishes to establish whether presentational haber pluralization constitutes an ongoing language change ‘from above’, ‘from below’ or rather a ‘stable variable’. Before turning to the review of the literature, in the following section, I will briefly introduce these types of sociolinguistic variability and their characteristic patterns of social covariation.

2.2.1 Symptoms of change and stability

Language changes ‘from below’ are spontaneous linguistic evolutions that emerge in the middle class (Labov [1966] 2006: 206, 2001: 188) and spread upward through the social hierarchy below speakers’ level of consciousness (Labov 2006: 206–207, 1972: 179). As this type of language change occurs without speakers realizing it, changes from below have a high probability of going to completion (Labov 1972: 178–180, 2001: 517–518). In situations of change from below, female speakers (Labov 2001: 292), middle-class speakers (Labov 2001: 188), and younger speakers (Labov 1994: 43–72) use the innovative forms more frequently. Additionally, the rates of use of the innovative variant do not decrease when formality rises (Labov 1972: 239, 2001: Chap. 3; Silva-Corvalán 2001: 248–249), that is, when the amount of attention that is paid to speech increases (Labov 2006: 59–86, 1972: 99).

Eventually, speakers may grow aware of a change from below, in which case it may become stigmatized, in other words, associated with a group of low social prestige. In other cases, cultural changes may contribute to the stigmatization of forms that were previously not stigmatized. In either case, a conscious effort (through education, mass media, and other linguistic institutions) may be made to replace the innovative variant with a form, usually borrowed from another variety or language, that is judged more favorably (Labov 1994: 78). Such changes, typical of standardization processes, are known as ‘changes from above’ (Labov 1972: 179). In situations of change from above, female speakers and younger speakers typically use the stigmatized variant less often and its frequency is a monotonic function of formality and speakers’ social class (Labov 2006: 204–206, 2001: Chap. 3). Stable variables display a pattern of social covariation similar to that of changes from above, with one difference: they do not covary with speakers’ age (Labov 2001: 119). Let us now review some of the studies of presentational haber pluralization that have been performed in Venezuela.

2.2.2 Venezuela

Díaz-Campos (2003: 4–5) selects a sample of 96 sociolinguistic interviews from the Sociolinguistic study of Caracas corpus⁶, stratified by age, gender, and social class. In general, Díaz-Campos (2003: 8) shows that speakers of this variety establish agreement with presentational haber in 54.3% of the variable contexts (N=245/451), which do not include present-tense presentational haber. Additionally, Díaz-Campos (2003: 5–7) performs a stepwise fixed-effects logistic regression analysis with six predictors: age (14–29 years, 30–45 years, 46–60 years, 61+ years), gender (female, male), social class (lower class, middle class, upper class), and the three linguistic predictors specified in Tab. 2. The results show that the variation is only constrained by the speaker’s social class and the verb tense. For the first of these predictors, Díaz-Campos (2003: 8) finds that lower- and middle-class participants favor agreeing presentational haber. Concerning the verb tense, Díaz-Campos (2003: 8) observes agreeing presentational haber more often with both the imperfect (había, habían) and the present perfect tense (ha habido, han habido). In contrast, the preterit (hubo, hubieron) and the other tenses disfavor pluralization.

Regarding the question as to whether the alternation constitutes an ongoing linguistic change from below or rather a stable variation, Díaz-Campos (2003: 9) notes that, although the frequencies of agreeing presentational haber have increased with respect to earlier investigations of Caracas Spanish, the phenomenon has hardly spread from the imperfect paradigm to others, with the exception of the present perfect. Moreover, the fact that age does not seem to be a conditioning factor in this variation is also suggestive of a stable variable, at least for these two tenses (Díaz-Campos 2003: 11). However, as Díaz-Campos (2003: 11) observes, resolving this matter will require investigating the effect of style on presentational haber pluralization, as the frequency of stable variables typically declines when formality rises (Labov 2001: Chap. 3). Therefore, he calls for investigations that “address this issue by observing the interaction between the pluralization of haber and factors such as speech style, social class, sex, and age” (Díaz-Campos 2003: 11).

⁶ In Spanish: Estudio sociolingüístico de Caracas.

Tab. 2: Linguistic predictors considered by Díaz-Campos (2003)

Predictors    Examples

Reference of the noun phrase
    Human: Habían profesores. ‘There were professors.’
    Nonhuman: No había edificios. ‘There wasn’t buildings.’

Reinforcement of the idea of plurality
    Not reinforced: Habían profesores. ‘There were professors.’
    Reinforced by an adjective: Habían buenos proyectos. ‘There were good projects.’
    Reinforced by coordination of nouns: Habían hornos de cal, alfarería y cuestiones. ‘There were lime ovens, pottery, and things.’
    Reinforced by a determiner: Habían otros grupos. ‘There were other groups.’
    Reinforced by a determiner and an adjective: Habían unos árboles grandes. ‘There were some big trees.’

Verb tense

Source: Examples taken from Díaz-Campos (2003: 5–6)


D’Aquino-Ruiz (2004) studies a larger sample from the same corpus as Díaz-Campos (2003)⁷. With these data, D’Aquino-Ruiz (2004: 18) shows that participants use agreeing presentational haber in 63% of the cases (N=477/754), which do not include the present tense. Additionally, she investigates nine linguistic (see Tab. 3) and three social predictors: age (14–29 years, 30–45 years, 46–60 years, 61+ years), gender (female, male), and social class (lower class, lower middle class, middle class, upper middle class, upper class). Of these, only four turn out to condition presentational haber pluralization in the stepwise fixed-effects logistic regression analysis: absence/presence of negation, social class, type of plural noun, and verb tense. For the first predictor, D’Aquino-Ruiz (2004: 18) finds that agreeing presentational haber occurs less often in clauses involving negation. For social class, the analysis indicates that lower-class participants use agreeing presentational haber more often (D’Aquino-Ruiz 2004: 18). Regarding the type of plural noun, D’Aquino-Ruiz (2004: 18) shows that singular mass nouns, either specific or unspecific, disfavor agreeing presentational haber. For the verb tense, synthetic tenses⁸ are shown to favor agreeing presentational haber, whereas the compound tenses and the preterit disfavor pluralization (D’Aquino-Ruiz 2004: 18). Finally, D’Aquino-Ruiz (2004: 16) argues that, because age and gender do not seem to condition the variation, presentational haber pluralization is most adequately described in terms of a stable variation. However, the alternations between agreeing and non-agreeing presentational haber do appear to be spreading from the lower to the upper classes, which leads her to conclude that the phenomenon could become a future change from below (D’Aquino-Ruiz 2004: 22).

⁷ D’Aquino-Ruiz (2004: 5) analyzes the interviews with 160 participants, whereas Díaz-Campos (2003) only considers the interviews with 96 participants.
⁸ ‘Synthetic’ refers to the forms of presentational haber that consist of just one word, as opposed to the compound tenses and to expressions in which presentational haber forms a verb phrase with an aspectual or modal auxiliary.


Tab. 3: Linguistic predictors considered by D’Aquino-Ruiz (2004)

Predictors    Examples

Absence/presence of aspectual or modal auxiliaries
    Absent: Había muchas peleas entre salones. ‘There was many fights between classrooms.’
    Present: Mínimo deben haber dos personas de acuerdo. ‘Minimally, there have to be two people who agree.’

Absence/presence of negation
    Absent: Tienen que haber productos superfluos también. ‘There also have to be superfluous products.’
    Present: Nunca hubo zapateros. ‘There was never shoemakers.’

Definiteness of the noun phrase
    Definite: Allí habían el partido comunista y el MIR. ‘Over there, there were the communist party and the MIR.’
    Indefinite: También habían fiestas de quince años con Billos. ‘There were also fifteenth-birthday celebrations with Billos.’

Reference of the noun phrase
    Human: Donde habían equis cantidad de estudiantes. ‘Where there were x amount of students.’
    Nonhuman: Siempre habían muchos choques. ‘There were always many clashes.’

Type of clause
    All others: No habían abastos sino pulperías. ‘There weren’t supermarkets, but grocery stores.’
    Relative clause: Todas las matas de mango que habían aquí. ‘All the mango trees that there were here.’

Type of noun phrase
    Implicit noun phrase: …que había dos. ‘…that there was two.’
    Lexical and pronominal noun phrase: Se observan algunas canchas que antes no las habían. [Are observed some courts that before themAcc there weren’t.] ‘Some courts are observed that there weren’t before.’
    Lexical noun phrase: No hubo problemas. ‘There wasn’t problems.’
    Pronoun: Los había preciosos. [ThemAcc there was beautiful.] ‘There was beautiful ones.’

Type of plural noun
    Mass noun: Creo que en Letras había un grupito también. ‘I think that in Arts there was a little group as well.’
    Plural count noun: Había noches que yo no dormía. ‘There was nights that I didn’t sleep.’
    Specific mass noun: Y había un grupito ya grande de muchachos. ‘And there was quite a large group of kids already.’

Word order
    haber + noun phrase: O había pequeñas manifestaciones. ‘Or there was small manifestations.’
    Implicit noun phrase: En el año habían muchas. ‘Throughout the year, there were many.’
    Noun phrase + haber: Diferentes de esas fragatas que habían aquí. ‘Different from those frigates that there were here.’

Verb tense
    Compound tenses: No me acuerdo, así, que haya habido. ‘I don’t remember, like that, that there has been.’
    Synthetic tenses: De repente, habrán otras cosas. ‘Suddenly, there will bePL other things.’
    Preterit: Y parece que hubieron muertos. ‘And it appears that there were casualties.’

Source: Examples taken from D’Aquino-Ruiz (2004: 5–13)

Finally, Freites-Barros (2008) analyzes a sample of 128 interviews with residents of San Cristóbal de Los Andes, stratified by age, gender, and regional origin. In general, agreeing presentational haber appears in no less than 82% (N=245/298) of the cases, which include third-person presentational haber in all non-present tenses and first-person plural haber in all tenses (Freites-Barros 2008: 47). Freites-Barros (2008: 44–47) further examines five linguistic (see Tab. 4) and three social predictors: gender (female, male), age (15–30 years, 31–45 years, 46–60 years, 60+ years), and regional origin (rural, urban). The stepwise fixed-effects logistic regression analysis retains three of these, namely, reference of the noun phrase, reinforcement of the idea of plurality, and type of noun phrase (Freites-Barros 2008: 53). For the first two, Freites-Barros (2008: 53) finds that agreeing presentational haber is favored with human-reference nominal arguments and the presence of elements that reinforce the idea of plurality. For the third predictor, the results suggest that implicit noun phrases and pronouns disfavor agreeing presentational haber (Freites-Barros 2008: 53).


Tab. 4: Linguistic predictors considered by Freites-Barros (2008)

Predictors    Examples

Reference of the noun phrase
    Human: Porque en ese entonces sí habían profesores que valía la pena lo que enseñaban. ‘Because in that time, there were professors that it was worth what they taught.’
    Nonhuman: A veces, cuando habían vacas, se ordeñaban las vacas. ‘Sometimes, when there were cows, the cows were milked.’

Reinforcement of the idea of plurality
    Not reinforced: No habían vagones para transportar cargamento pesado. ‘There weren’t wagons to transport heavy loads.’
    Reinforced by indefinite quantifiers, numerals, or coordination of nouns: Hubieron muchos animales muertos. ‘There were many dead animals.’

Specificity
    Nonspecific: Siempre que habían velorios, yo me escapaba. ‘Whenever there were parties, I escaped.’
    Specific: En esa casa habían dos viejitas que vestían bien. ‘In that house, there were two little old ladies who dressed well.’

Type of noun phrase
    Implicit noun phrase: Yo iba a buscarle sus cigarrillos. Si en la bodega que estaba más cerca no habían tenía que ir a la otra bodega. ‘I fetched his cigarettes. If in the shop that was closest there weren’t, I had to go to the other shop.’
    Lexical noun phrase: Habían trapiches pa’ moler caña. ‘There were sugar cane mills to grind sugar cane.’
    Pronoun: La gente aquí toda es buena; los habrá por allá, pa’ otros barrios, pero aquí no. [All the people around here are good; themAcc there will beSG around there, towards other boroughs, but not here.] ‘All the people around here are good; there will beSG around there, towards other boroughs, but not here.’

Verb tense

Word order
    haber + noun phrase: Sí, había torturas. ‘Yes, there was tortures.’
    Implicit noun phrase: No sé qué hacían con él, pero lo cierto es que no habían. ‘I don’t know what they did with him, but the sure thing is that there weren’t.’
    Noun phrase + haber: Ladrones, no habían. ‘Thieves, there weren’t.’

Source: Examples taken from Freites-Barros (2008: 44–47)

2.2.3 El Salvador

Quintanilla-Aguilar (2009: Chap. 4.2.1) analyzes a sample of 48 interviews with native speakers of San Salvador Spanish, stratified by age, educational achievement, and gender. In general, he finds that agreeing presentational haber occurs in 79.6% (N=218/274) of the cases (Quintanilla-Aguilar 2009: 146), which, as in Freites-Barros’ (2008) study, include third-person presentational haber in all non-present tenses and first-person plural haber in all tenses (Quintanilla-Aguilar 2009: 153). In addition, Quintanilla-Aguilar (2009: 126–129) investigates the influence of ten predictors: the six linguistic predictors listed in Tab. 5 and four social predictors, namely, age (18–35 years, 50+ years), discourse spontaneity (elicited by a question containing non-agreeing presentational haber, spontaneous), educational achievement (basic secondary or less, university), and gender (female, male). Of these, Quintanilla-Aguilar’s (2009: 172–173) stepwise fixed-effects logistic regression analysis only retains two: discourse spontaneity and verb tense. For the first predictor, the regression analysis suggests that participants are less likely to use agreeing presentational haber in answers to questions containing non-agreeing presentational haber. In Quintanilla-Aguilar’s (2009: 162) analysis, this shows that presentational haber pluralization is less frequent in formal registers.⁹ For the second predictor, Quintanilla-Aguilar (2009: 173) finds that the imperfect tense favors agreeing presentational haber, whereas all other tenses disfavor pluralization. Finally, the fact that age and gender did not turn out to be constraints on the variation leads him to conclude that presentational haber pluralization is most adequately described as a stable variable in San Salvador (Quintanilla-Aguilar 2009: 180).

⁹ These results may also suggest that haber pluralization is subject to structural priming, as I will argue in Chapter 4.

Tab. 5: Linguistic predictors considered by Quintanilla-Aguilar (2009)

Predictors (with their levels): Examples

Absence/presence of quantifiers
  Absent: Porque allá no habían escuelas en donde estudiar. ‘Because over there, there weren’t schools to study in.’
  Present: Durante la guerra hubieron más de setenta mil personas muertas. ‘During the war, there were more than seventy thousand people dead.’

Absence/presence of negation
  Absent, present: No examples provided by Quintanilla-Aguilar (2009: 129–130).

Reference of the noun phrase
  Human: Siempre van a haber pobres. ‘There are always going to be poor people.’
  Nonhuman: No habían pugnas entre los sindicatos y directivas. ‘There weren’t conflicts between the unions and directives.’

Type of verb phrase
  Aspectual or modal auxiliary + haber: El Señor siempre dijo: “Pobres siempre van a haber.” ‘The Lord always said: “Poor people there will always bePl.”’
  Compound form: Porque ahí cuando han habido terremotos y todo eso no se han caído casas ni nada. ‘Because over there, when there have been earthquakes and all this stuff, houses have not fallen down or anything.’
  Synthetic form: Allí habían ya prostíbulos con mucha evidencia. ‘Over there, there were already unconcealed brothels.’

Verb tense
  Imperfect, other tenses, preterit: No examples provided by Quintanilla-Aguilar (2009: 128–129).

Word order
  haber + noun phrase: Yo tengo entendido de que hubieron muchísimos más muertos. ‘I understand that there were many more casualties.’
  Noun phrase + haber: ¡Bastantes muertos hubieron! ‘Enough casualties there were!’

Source: Examples taken from Quintanilla-Aguilar (2009: 128–130)


2.2.4 Puerto Rico

In a series of three articles, Esther Brown and Javier Rivas (Brown and Rivas 2012; Rivas and Brown 2012, 2013) analyze a corpus of Caguas, Cayey, and San Juan Spanish and the corpus of the Project for the coordinated study of the educated norms of the most important Latin American cities also used by DeMello (1991). The results show that speakers establish plural agreement in, respectively, 44% (N=83/190; Brown and Rivas 2012: 329) and 58% (N=41/98; Rivas and Brown 2013: 110) of the cases and that the choice between agreeing and non-agreeing presentational haber is controlled by the properties of the noun phrase and the verb tense. Specifically, Brown and Rivas (2012: 331) argue that nouns that are predominantly used as subjects in Spanish trigger the online reanalysis of the noun phrase slot of the presentational construction with haber more often. In their view, this suggests that the mental representations of nouns include a ‘grammatical relation probability’10 and that this probability leads speakers to interpret the noun phrase slot of the presentational haber construction either as a subject or as an object. In a related paper (Rivas and Brown 2012), these authors explore the hypothesis that presentational haber pluralization is constrained by the semantic contrast between ‘stage-level’ and ‘individual-level’ nouns. As Rivas and Brown (2012: 74) observe, the categories individual level and stage level were originally formulated as a way to capture the semantic differences between predicates that denote permanent, intrinsic properties of entities (‘individual-level predicates’; e.g., intelligent) and those that describe more transient characteristics (‘stage-level predicates’; e.g., cold). Because this distinction was devised to categorize predicates, the nouns that occur with presentational haber cannot readily be classified as being individual-level or stage-level. Therefore, the sense given to these notions by Rivas and Brown (2012) deviates significantly from their original formulation. In particular, these authors code nouns such as elecciones ‘elections’ in example (4), which refer to events or entities “that have an understood beginning and ending” (Rivas and Brown 2012: 81), as ‘stage-level nouns’. In turn, nouns such as directores ‘directors’ and superintendentes ‘superintendents’ in example (5), which “have a preferential interpretation as beginning prior to and continuing past the point of reference of the predication” (Rivas and Brown 2012: 81), were coded as ‘individual-level nouns’.

|| 10 In other words, Brown and Rivas (2012) claim that speakers store the frequency with which a particular noun is used in a specific grammatical function and that this probability determines the likelihood that they will use agreeing presentational haber.


(4) Porque fue cuando hubo las elecciones (from Rivas and Brown 2012: 81).
‘Because it was when there was the elections.’

(5) Pero también habían directores, superintendentes (from Rivas and Brown 2012: 82).
‘But there were directors, superintendents.’

The results show that individual-level nouns favor agreeing presentational haber. Additionally, Rivas and Brown (2012: 84, 2013: 111) also observe that this variant occurs most frequently with the imperfect tense. Let us now summarize the main trends that emerge from this overview of the literature.

2.3 Summary

In this chapter, I have reviewed the dialectological and the sociolinguistic literature on presentational haber pluralization. This has shown that the phenomenon constitutes a widely diffused alternation that occurs in Canarian, Latin American, and Peninsular Spanish. Earlier investigations have observed that agreeing presentational haber is more frequent when the noun phrase has human reference, when negation is absent, and when the verb is conjugated in the imperfect tense, a compound tense, or forms a verb phrase with an aspectual or modal auxiliary. Yet, none of the studies that were reviewed above provides an analysis of the phenomenon that goes beyond describing the effect of specific linguistic environments.

When it comes to patterns of social covariation, the frequent use of agreeing presentational haber appears to correlate with lower social class in Venezuela. This is especially true for the agreeing preterit form hubieron (Freites-Barros 2003, 2004; Malaver 1999). However, this does not seem to be the case in the Dominican Republic and Puerto Rico, where hubieron is considered correct by large segments of the population, including university students (Alba 2004; Fernández 1982; Vaquero 1978). In addition, presentational haber pluralization does not seem to correlate with age, which could indicate that we are dealing with a very slowly progressing linguistic change from below (D’Aquino-Ruiz 2008; Díaz-Campos 2003; Fontanella de Weinberg 1992b: 44) or with a stable variable (Quintanilla-Aguilar 2009). Let us now explore how these and other constraints could be modeled in Cognitive Construction Grammar. This will be the topic of the next chapter.


Chapter 3: Cognitive Construction Grammar and language variation

3 Cognitive Construction Grammar and language variation

In this chapter, I will be concerned with Cognitive Construction Grammar and the way this theory can be used to answer the question: What constrains morphosyntactic variation? To this end, Section 3.1 first introduces some basic concepts of Cognitive Construction Grammar. Then, in Section 3.2, I will show that the automatic spreading activation model of language production that is assumed in this theoretical framework not only provides a psychologically plausible way to explain how language variation is constrained, but can also accommodate group-specific and individual-specific patterns of variation. The chapter concludes with a brief summary in Section 3.3.

3.1 Cognitive Construction Grammar

In this section, I will present a thumbnail sketch of Cognitive Construction Grammar. It should be clear from the outset that the aim is not to provide a comprehensive introduction, as this can already be found in the work of, among others, Croft (2007), Croft and Cruse (2004: Chap. 9–10), and Goldberg (2003, 2009, 2010). Specifically, Section 3.1.1 defines the basic concepts Cognitive Construction Grammar hinges upon. Section 3.1.2 presents arguments in favor of a construction-based approach to language. Section 3.1.3, in turn, focuses on the meaning of constructions and Section 3.1.4 identifies some constraints on their use. Finally, in Section 3.1.5, the typical formalism of Cognitive Construction Grammar is presented.

3.1.1 Basic concepts

The first concept that should be introduced is that of ‘grammatical construction’. In (Cognitive) Construction Grammar, this notion denotes a conventional pairing of form and meaning (Croft 2007: 472; Goldberg 1995: 4, 2009: 94; Lakoff 1987: 467). This includes words, but also abstract patterns that involve more than one entity and may contain empty slots, which need to be filled by other constructions. By way of illustration, Tab. 6 presents an overview of constructions of different sizes and degrees of schematicity.

Tab. 6: Overview of constructions of different sizes and degrees of schematicity

Construction type: Examples

Word: tentacle, gangster, the
Partially filled word: ,
Complex word: textbook, drive-in
Idiom (filled): like a bat out of hell
Idiom (partially filled):
Covariational conditional: The more you watch the less you know
Ditransitive: She gave him a kiss.
Passive: The cell phone tower was struck by lightning.

Source: Adapted from Goldberg (2009: 94)

The table shows that grammatical constructions can be taken to represent every aspect of language structure (Croft and Cruse 2004: 254) and that no principled distinction is assumed between the nature of generalizations (e.g., argument structure) and lexical items. Both are conceptualized as form-function pairings, which only differ in schematicity (Langacker 1987: 28, 1990: 16). In other words, Cognitive Construction Grammar, like other representatives of the Cognitive Linguistics movement, pictures the grammar as a network of form/meaning pairs ranging from morphemes to abstract construction schemata (Croft 2007: 471; Langacker 1987: 36, 1990: 12, 2007: 427). In addition, like other branches of Cognitive Linguistics, Cognitive Construction Grammar proposes that the human mind is not modular and that in language production and comprehension, speakers use nothing but domain-general cognitive abilities that are also used in other activities, such as, for example, categorization and analogy (Goldberg 1995, 2006). However, as a ‘usage-based’ model, Cognitive Construction Grammar also recognizes that a substantial part of speakers’ day-to-day language use consists of ready-made units. Therefore, the actual computation of novel expressions is argued to be fairly limited (Bybee 2001: 15, 2006: 713; Dąbrowska 2014; Langacker 1987: 58–59). In rejecting modularity, Cognitive Linguistics also rejects the distinction between ‘world knowledge’ and ‘linguistic knowledge’. Rather, the meaning of a linguistic unit comprises “everything speakers know about the type of entity designated” (Langacker 2007: 432).


3.1.2 Arguments for constructions

One of the advantages of assuming a usage-based, construction-based grammar architecture is that we can account fairly easily for the process of grammaticalization (Bybee 2008: 347). In addition, construction grammars handle ‘peripheral’ syntactic phenomena, such as idioms (e.g., Fillmore, Kay, and O’Connor 1988), and information-structure driven alternations (e.g., Goldberg 2001) with the same ease as ‘core’ syntactic phenomena, such as, for example, transitivity. However, it is probably fair to say that construction grammar and, especially, Cognitive Construction Grammar, has been shown to be most useful in the study of argument-structure alternations. In this regard, recall that in generative syntax, the verb is considered to be the main determinant of the argument structure of the clause (Chomsky 1965: 98–95, 1995: 238). However, this is seriously challenged by two of its implications. First, portraying the verb as the pivot of the clause implies that for every alternation that is uncovered, a copy of the verb must be stored in the lexicon (Goldberg 2001: 504). This would mean that even for the very limited sample of to kick expressions in example (6), we would need seven copies of the verb to accommodate the fluctuations in clause structure and meaning. Second, if this role were to fall to the verb, we would be forced to claim that verbal neologisms, like to flubber in example (7), are stored in the lexicon with their argument-structure specifications (Goldberg 2009: 95). In contrast, in Cognitive Construction Grammar, the overarching argument-structure construction is assumed to determine the overall meaning of the clause, the number of arguments, and their argument role. This way, we can account for the full range of variation with only one verb entry (Goldberg 1995: 9–13).

(6) a. I kick the bed with my heel (Davies 2008-, Fiction).
b. They run past crumbling homes and kick balls into goals with no nets (Davies 2008-, Press).
c. My mother and her friends talk in low voices while the men roll up the hose and kick the shards of glass off the driveway (Davies 2008-, Fiction).
d. It doesn't matter that those black people are big and fierce, when it comes to fighting we can kick the shit out of them (Davies 2008-, Magazine).
e. But I don't kick at them (Davies 2008-, Fiction).
f. The rookie Mario Bates kick-started the New Orleans running attack, then Morten Andersen kicked them to a home victory (Davies 2008-, Press).
g. His fingers fastened on something damp and cool and resilient. It kicked (Davies 2008-, Fiction).


(7) That thing just flubbered into my room (constructed example).

Of course, the fact that a theoretical construct provides parsimonious solutions to long-standing issues in linguistic theory (e.g., Goldberg 1995: Chap. 1) does not necessarily mean that it is psychologically adequate. However, a series of studies conducted by Goldberg and others have provided strong psycholinguistic arguments in favor of argument-structure constructions. Specifically, Goldberg (2006b: 417) has shown that listeners are able to guess the meaning of a nonsense verb correctly based upon the construction pattern it occurs in. This was already evident from the example with to flubber. By the same token, experiments reported by Goldberg (2006a: 112, 2009: 95) suggest that argument-structure constructions are better indicators for the meanings of expressions than individual verbs. Still other experiments indicate that listeners rely on the meaning of argument-structure constructions to determine the verb’s sense when they are confronted with novel noun-to-verb extensions (Goldberg 2009: 95). However, the most compelling evidence that speakers extract and store argument-structure constructions consists in the fact that “individual abstract constructions can be distinguished using fMRI” (Allen et al. 2012: 178), a type of neuroimaging technique.

3.1.3 The meaning of constructions

Like all cognitive-linguistic theories, Cognitive Construction Grammar assumes that the meaning of a lexical unit comprises “everything speakers know about the type of entity designated” (Langacker 2007: 432), including a set of background assumptions. This approach to semantics is known as ‘frame semantics’. Frames consist of two parts: what I have called ‘background assumptions’ are usually referred to as ‘background frame’ (Goldberg 2010: 40) or ‘base’ (Langacker 1987: 180–189). The foreground, in turn, is most commonly indicated with the term ‘profile’ (Langacker 1987: 189). Turning now to the meaning of verbs, Cognitive Construction Grammar proposes that this class of lexical items refers to conceptualizations of specific events. Since an event presupposes entities participating in it, the frames of verbs specify how many participants partake in the event and what role they fulfill in it. This is expressed in the form of verb-specific ‘participant roles’ (see example [8]), event-specific instances of more general argument roles such as, for example, agent, patient, or receiver.

(8) Participant roles: hit .


Besides listing the participants in the event and their roles, the frames of verbs also specify which participants are profiled (Goldberg 2005b: 225).1 Argument-structure constructions, in turn, are assumed to refer to conceptualizations of event types rather than to specific events. Specifically, Goldberg (1995: Chap. 2.3.5) argues that “constructions designate scenes essential to human experience” (Goldberg 1995: 39). Still, as this proposal involves abstraction over and idealization of observed events of the same type, it is compatible with Lakoff’s (1987: 489–490) claim that constructions encode ‘Idealized Cognitive Models’ of events. According to Cienki (2007), Idealized Cognitive Models are proposed as a way in which we organize knowledge, not as a direct reflection of an objective state of affairs in the world, but according to certain cognitive structuring principles. The models are idealized, in that they involve an abstraction, through perceptual and conceptual processes, from the complexities of the physical world. At the same time, these processes impart organizing structure – for example, in the form of conceptual categories (Cienki 2007: 176).

Since argument-structure constructions designate abstractions over events of the same type, they also assign more abstract roles to the participants partaking in them. These are labeled ‘argument roles’ (e.g., agent, patient, receiver…).2 Similarly, like verbs, argument-structure constructions also specify which participants are profiled. As will become evident in the next section, this is a major constraint on the co-occurrence of verbs and constructions.

|| 1 This appears to be a major source of semantic variation between verbs. Consider, for example, to rob and to steal. Both verbs denote that something is taken from someone without permission, but they profile different portions of the event frame. That is, to rob profiles the thief and the victim, whereas to steal profiles the thief and the stolen goods (Goldberg 1995: 45). As a matter of convention, profiled participants are marked with boldface font (Goldberg 1995: 45).

2 However, Cognitive Construction Grammar does not propose a finite list of argument roles. Rather, these follow directly from the construction’s basic sense and, hence, “are more specific and numerous than traditional thematic roles” (Goldberg 2005a: 23). In other words, labels such as agent, patient, etc., should be interpreted as mere shorthand notations that capture the semantic characteristics associated with the slots of constructions (Goldberg 2005b: 224).

3.1.4 Constraints on verbs and constructions

Although Cognitive Linguistics credits speakers with a virtually limitless creative potential, this does not mean that just any verb can be used with just any argument-structure construction. Rather, the meanings of verbs and constructions should at least be relatable to each other in some fashion, as can be deduced from the oddness of example (9).

(9) *Little Johnny slept the pineapples from the ceiling (constructed example).

In the most prototypical cases, the event denoted by the verb instantiates the Idealized Cognitive Model designated by the argument-structure construction (Goldberg 2010: 53). This relationship is illustrated in example (10), where to hand refers to a specific case of the CAUSE-RECEIVE Idealized Cognitive Model designated by the ditransitive construction. In these cases, the contribution of the verb to the meaning of the overall expression is limited, as it only adds more specific information (Goldberg 1995: 51).

(10) I handed her the reins, while she glanced at me below the brim of her hat (Davies 2008-, Fiction).

The participants of verbs and argument-structure constructions impose additional constraints. Specifically, the construction and the verb should share at least one participant (Goldberg 1995: 65), which should satisfy the following two principles:

The Semantic Coherence Principle: The participant role of the verb and the argument role of the construction must be semantically compatible. In particular, the more specific participant role of the verb must be construable as an instance of the more general argument role. General categorization processes are responsible for this categorization task and it is always operative.

The Correspondence Principle: The semantically salient profiled participant roles are encoded by grammatical relations that provide them a sufficient degree of discourse prominence, i.e., by profiled argument roles. An exception arises if a verb has three argument roles; in this case, one can be represented by an unprofiled argument role (and realized as an oblique argument). The Correspondence Principle can be overridden by specifications of particular constructions (Goldberg 2005b: 225–226, emphasis added).
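The two principles can be read as a small matching procedure. The following Python sketch is one possible rendering under strong simplifying assumptions: the role names, the lookup table `CONSTRUABLE_AS`, and the function `fuse` are all hypothetical and not part of Goldberg's formalism, and the exception for verbs with three roles as well as construction-specific overrides are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    profiled: bool = True


def fuse(participant_roles, argument_roles, construable_as):
    """Map each participant role onto an argument role, or return None.

    construable_as is an assumed lookup table: participant-role name ->
    set of argument-role names it can be categorized as an instance of.
    """
    mapping = {}
    free = list(argument_roles)
    for p in participant_roles:
        candidates = [
            a for a in free
            # Semantic Coherence: the participant role must be construable
            # as an instance of the more general argument role.
            if a.name in construable_as.get(p.name, set())
            # Correspondence: a profiled participant role needs a
            # profiled argument role.
            and (a.profiled or not p.profiled)
        ]
        if not candidates:
            return None
        mapping[p.name] = candidates[0].name
        free.remove(candidates[0])
    # Verb and construction must share at least one participant.
    return mapping or None


# Illustrative (assumed) role inventories.
DITRANSITIVE = [Role("agent"), Role("receiver"), Role("patient")]
CAUSED_MOTION = [Role("cause"), Role("theme"), Role("path")]
HAND = [Role("hander"), Role("handee"), Role("handed")]
SLEEP = [Role("sleeper")]
CONSTRUABLE_AS = {
    "hander": {"agent"},
    "handee": {"receiver"},
    "handed": {"patient"},
}

print(fuse(HAND, DITRANSITIVE, CONSTRUABLE_AS))
print(fuse(SLEEP, CAUSED_MOTION, CONSTRUABLE_AS))
```

In this toy setting, to hand fuses with the ditransitive, whereas to sleep finds no argument role it can be construed as, which is how the sketch mirrors the oddness of example (9). Argument roles that remain unfilled are tolerated, on the assumption that the construction itself can contribute them.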

Whenever these conditions are met, verbs and argument-structure constructions can co-occur freely (Goldberg 2006a: 10, 2009: 96). Let us now consider the typical formalism of Cognitive Construction Grammar.

3.1.5 Box diagrams

In any discussion of the formalism of Cognitive Construction Grammar, it should be clear from the beginning that the theory treats its typical notation merely as a device that helps exposition and discussion, without making any claims of psychological reality in its regard (Goldberg 2006a: Chap. 10.4). With this reservation, the framework uses box diagrams to depict speakers’ full grasp of their language. Therefore, semantic, syntactic, and, when relevant, pragmatic constraints on the use of constructions are depicted in the diagrams. As an example, let us consider what such a box diagram would look like for the English ditransitive construction CAUSE-RECEIVE, exemplified in (11).

(11) I hand him his water and he pushes north (Davies 2008-, Press).

The participant roles of the verb to hand are listed in (12).

(12) Participant roles: hand (Goldberg 1995: 51).

The composite structure of the ditransitive and to hand is represented in Fig. 1. In the diagram, Sem indicates the semantic pole of the construction, with the small capitals representing the CAUSE-RECEIVE Idealized Cognitive Model. The profiled argument roles, which need to be fused with the verb’s participant roles, are listed on the right-hand side of the Idealized Cognitive Model. The arrows specify which argument roles are instantiated by which participant roles and how these are mapped onto syntactic functions. Next to the line connecting CAUSE-RECEIVE with hand, a letter R appears. This letter indicates the type of relation that holds between the event denoted by the verb and the Idealized Cognitive Model designated by the argument-structure construction (Goldberg 1995: 50–51). The following section will explore how Cognitive Construction Grammar can shed more light on the question as to what constrains morphosyntactic alternations.

[Box diagram with the rows Sem (CAUSE-RECEIVE), R: instance, means (hand), and Syn (V)]

Fig. 1: The English ditransitive construction instantiated by to hand, adapted from Goldberg (1995: 51)

3.2 Cognitive Construction Grammar and language variation

In this section, I will show that Cognitive Construction Grammar offers all the necessary tools to develop a psychologically plausible theory of morphosyntactic ‘orderly heterogeneity’ (Weinreich, Labov, and Herzog 1968: 188). Combining these tools into a unified theoretical model of the constraints that govern morphosyntactic variation will be the main theoretical contribution of this book to variationist linguistics. The mentalistic character of this approach will challenge the view that “[t]he grammars in which linguistic change occurs are grammars of the speech community” (Weinreich, Labov, and Herzog 1968: 188). Rather, in line with the basic tenets of Cognitive (Socio)Linguistics, I will assume that language (change) resides in the minds of individual speakers and that correspondences between the behavior of individuals belonging to the same speech community are due to shared experiences with language, shared anatomy, and/or shared cognitive capacities (Bybee 2001: Chap. 1; Hollmann and Siewierska 2011; Langacker 2008: Chap. 1).

I will also challenge an assumption that is often implicit in functionalist treatments of linguistic alternations, namely, that speakers consciously reach into long-term memory and select the specific linguistic forms that suit their communicative intentions optimally. Rather, in line with the findings of social psychological research into the automaticity of social behavior (e.g., Bargh and Chartrand 1999; Campbell-Kibler 2010), I will assume that speakers only have (some) conscious control over their communicative intentions and that the way they realize those in language is determined entirely by unconscious, automatic cognitive processes. By treating probabilistic patterns in language variation as reflexes of such cognitive constraints, I will also challenge the core assumption of Probabilistic Grammar, namely, that language variation is conditioned by largely arbitrary probabilistic constraints derived from input.

To achieve these goals, in Section 3.2.1 I will present the language production model that is assumed in Cognitive Construction Grammar and the cognitive constraints on linguistic variation that follow from this model. Subsequently, Section 3.2.2 discusses how social-interactional meanings can be modeled in Cognitive Construction Grammar and how the selective use of variants to express this kind of meaning can be considered as an additional reflex of domain-general cognitive processes. Finally, Section 3.2.3 considers how individual-specific sociolinguistic behavior can be conceptualized in Cognitive Construction Grammar and how such behavior may emerge.

3.2.1 Cognitive Construction Grammar and cognitive constraints on language variation

Following much work in psycholinguistics (e.g., Dell 1986; Dell, Chang, and Griffin 1999: 532–538), Goldberg (1995: 71–74, 2009: 99) assumes that linguistic and other knowledge is stored in a dynamic network of interconnected nodes, called an ‘inheritance hierarchy’. In an inheritance hierarchy, the most schematic constructions stand in the topmost part and pass their form and meaning down to the constructions below them. However, this is not to say that highly abstract information is only represented in the topmost node and that the other nodes simply inherit it. Rather, Cognitive Construction Grammar proposes that information is redundantly stored throughout the network. For instance, frequent, fully compositional instances of constructions are assumed to be stored alongside the more abstract patterns that subsume them; this minimizes computation (Croft and Cruse 2004: 278). In language production, speakers retrieve constructions from this network using nothing but domain-general cognitive abilities such as categorization, our ability to recognize unknown entities as instances of known classes (Bybee 2010: 6–8; Croft and Cruse 2004: 45; Lakoff 1987: 58). In this regard, psycholinguistic research suggests that language production begins with speakers forming a very rich conceptualization (Ferreira and Engelhardt 2006: 63–64; Griffin and Ferreira 2006: 21–23, 41–44; Langacker 2008: 31–34). While speakers conceptualize, domain-general categorization processes compare the bits and pieces of the conceptualization that are being formed with the conceptual import contained in the mental representations of constructions. In most cases, there is a partial match with the conceptual import of multiple constructions. As a result, all alternatives become activated to varying levels and start competing for further activation (Balota, Yap, and Cortese 2006: 332–334; Dell 1986; Griffin and Ferreira 2006: 28; Ober and Shenaut 2006: 406–407). This is called ‘automatic spreading activation’. Eventually, the level of activation of a particular construction rises above that of its competitors, leading to its selection (Griffin and Ferreira 2006: 28; Langacker 2007: 421, 2008: 228–229).

To this model of the retrieval of constructions we only need to add constraints that favor the activation of a particular construction over its competitors to obtain a model of the statistical patterns in language (variation). Since Cognitive Construction Grammar claims that speakers use domain-general cognitive abilities to retrieve constructions from the network, it seems only fair to assume that non-conscious domain-general cognitive constraints will also condition the activation of constructions, alternating or not. In this regard, Langacker (2007: 421, 2010: 93) mentions three such factors: prototype effects (Lakoff 1987: Chap. 3; Langacker 2008: 228), activation shortcuts established by frequently accessing a particular representation (e.g., Goldberg 2006; Langacker 2008: 229, 2010: 94), and residual activation after earlier activation (e.g., Dell, Chang, and Griffin 1999: 530).
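To make the interplay of these three factors concrete, the following Python sketch treats the selection of a construction as a competition between activation scores. It is a toy illustration only: the feature sets, the additive scoring, and the numeric values for `baseline`, `decay`, and `boost` are assumptions made for this example, not quantities proposed in the literature cited above.

```python
from dataclasses import dataclass


@dataclass
class Construction:
    name: str
    features: frozenset      # toy stand-in for the conceptual import
    baseline: float = 0.0    # resting activation, reflecting frequency of use
    residual: float = 0.0    # leftover activation from recent use


def match(conceptualization, cxn):
    """Overlap between conceptualization and conceptual import (Jaccard)."""
    shared = conceptualization & cxn.features
    combined = conceptualization | cxn.features
    return len(shared) / len(combined)


def activation(conceptualization, cxn):
    # Assumed additive combination of the three factors.
    return match(conceptualization, cxn) + cxn.baseline + cxn.residual


def select(conceptualization, candidates, decay=0.5, boost=0.3):
    """Pick the most activated construction and update residual activation."""
    winner = max(candidates, key=lambda c: activation(conceptualization, c))
    for c in candidates:
        c.residual *= decay      # earlier residual activation fades
    winner.residual += boost     # the selected construction stays more active
    return winner


active = Construction("active", frozenset({"transitive_event", "prominent_agent"}), baseline=0.2)
passive = Construction("passive", frozenset({"transitive_event", "prominent_patient"}))

# A patient-prominent conceptualization matches the passive best.
first = select({"transitive_event", "prominent_patient"}, [active, passive])
# A neutral conceptualization right afterwards is still biased by residual activation.
second = select({"transitive_event"}, [active, passive])
print(first.name, second.name)
```

Without the preceding passive, the neutral conceptualization in the second call would go to the more frequent active variant, since both variants match it equally well and only the baseline differs. The sketch thus shows how a close conceptual match, a frequency-based head start, and residual activation can each tip the same competition.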


Regarding prototype effects, spreading activation entails that the better the conceptualization matches the conceptual import associated with the construction, the more the representation of the construction will become activated (Balota, Yap, and Cortese 2006: 332–334; Dell 1986; Ferreira and Engelhardt 2006: 79; Griffin and Ferreira 2006: 28). For instance, referring to the domain of lexical choice, Geeraerts (2016:16) argues that “an expression will be used more often for naming a particular referent when that referent is a member of the prototypical core of that expression’s range of application”. In syntax, similar effects have been observed. One aspect of this is that speakers, when confronted with multiple alternatives to encode a particular conceptualization of an event, tend to use the variant that attributes the right amount of formal prominence to conceptually prominent participants (e.g., Myachykov and Tomlin 2015). For instance, most people will prefer example (13a) (19,900 hits on Google) over example (13b) (10 hits on Google). This falls out naturally from Langacker’s (1991:312) schematic definition of the notion of subject as the most conceptually prominent element of the clause (the ‘primary figure’), as example (13a) encodes the entity that draws most attention (the pedestrian, which is both definite and human) with the grammatical function that signals it as such (i.e., as subject), leading to an optimal correspondence between conceptualization and form. In Cognitive Linguistics, this prototype effect is called ‘markedness of coding’; ‘unmarked coding’, referring to a close match between conceptualization and construction, is favored (Langacker 1991:298). (13) a. The pedestrian was hit by a car. b. A car hit the pedestrian. A second constraint that influences a representation’s level of activation is frequency. 
Particularly, representations that are visited frequently will have higher baseline levels of activation and more detailed representations, giving them a head start over their competitors that are visited less frequently (Bybee 2006; Langacker 1987: 59–60; Schmid 2007: 116). Additionally, when the representations of words and constructions are activated frequently together, the compositional expression becomes stored as a single node in the network; this is called ‘entrenchment’ (Bybee 2001: Chap. 5, 2006; Langacker 1987: 59–60; Schmid 2007: 120). In turn, because this entrenched expression can be activated faster, it is “preferentially produced over items that are licensed but are represented more abstractly, as long as the items share the same semantic and pragmatic constraints” (Goldberg 2006a: 94). This domain-general cognitive constraint is known as ‘statistical preemption’; it has been proposed as a way to explain why speakers do not overgeneralize from input by producing, for example, *stealer


instead of thief or *goed instead of went (Goldberg 2006a: Chap. 5) and, more generally, why speakers prefer to use grammatical constructions in ways they have predominantly observed them (as is shown by the contrast between examples [14a-b]), whereas, in the absence of such experiences, they are perfectly able to accept and produce novel uses of verbs (as in example [15]; e.g., Goldberg 2011; Robenalt and Goldberg 2015).
(14) a. ?She explained me the story (taken from Goldberg 2006a: 96).
b. She explained the story to me.
(15) They coughed the breadcrumbs off the table.
Thirdly, it has long been observed in psycholinguistics (e.g., Bock 1986) and variationist linguistics (Labov 1994: 550–566; Szmrecsanyi 2006; Weiner and Labov 1983: 52) that language users tend to pick up and recycle (unintentionally and unconsciously) construction patterns they have heard or used before, without necessarily repeating the specific words that appear in these structures.3 In the psycholinguistic literature, this tendency is called ‘structural priming’, ‘syntactic priming’, or ‘syntactic persistence’; in the variationist literature, it is known as ‘perseverance’ or ‘persistence’.4 Psycholinguistic research into structural priming has revealed that the phenomenon can be accounted for as a residual activation effect: once a particular representation has been visited, it remains more activated than others for a period of time, giving it a head start over its competitors. At the same time, structural priming also appears to be a mechanism of implicit learning, which permanently adapts the ease of activation of constructions to observed patterns of usage (e.g., Bock and Griffin 2000: 187; Bock et al. 2007; Chang et al. 2000; Pickering and Ferreira 2008: 447).
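The interplay of the three constraints discussed in this section can be pictured with a toy computation. The following is only a minimal sketch under stated assumptions: the literature cited here does not commit to any particular formula, so the additive combination of the three sources of activation, the softmax choice rule, the halving decay of residual activation, and all numerical values are hypothetical choices made purely for illustration.

```python
import math

def activation(baseline, match, residual):
    """Total activation of one variant (assumed to be additive):
    baseline = frequency-based resting level,
    match    = fit between conceptualization and construction,
    residual = activation left over from an earlier use (priming)."""
    return baseline + match + residual

def decayed_residual(boost, clauses_since_prime, decay=0.5):
    """Residual activation after a number of intervening clauses.
    The geometric decay rate is an assumption, not an empirical value."""
    return boost * decay ** clauses_since_prime

def choice_probabilities(activations):
    """Convert activation levels into production probabilities (softmax)."""
    exps = {name: math.exp(level) for name, level in activations.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

# Hypothetical competition modeled on example (13): the passive matches the
# conceptualization better and was used one clause ago; the active has the
# higher baseline because it is the more frequent construction overall.
variants = {
    "passive": activation(baseline=0.2, match=1.0,
                          residual=decayed_residual(1.0, clauses_since_prime=1)),
    "active": activation(baseline=0.8, match=0.1, residual=0.0),
}
probs = choice_probabilities(variants)
print(probs)
```

In this sketch, the variant with the lower baseline still wins the competition because conceptual fit and residual activation outweigh the frequency advantage of its competitor; changing any of the assumed values shifts the probabilities, but never makes the outcome categorical.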
In sum, in this section I have shown that, because Cognitive Construction Grammar assumes an automatic spreading-activation model of language production, the theory can readily accommodate language variation and the statistical patterns in it. Crucially, this account of the constraints that condition morphosyntactic variation proposes that “the same factors operate to produce both regular patterns and the deviations” (Bybee 2010: 6). Still, because these constraints appeal to the mappings between form and meaning, their relevance

|| 3 Priming effects are stronger when the same verb is repeated in both the prime and the target clause (Bock and Griffin 2000: 188). 4 According to Pickering and Ferreira (2008: 427–428), ‘structural priming’ is a more adequate term, because, in principle, all levels of linguistic structure can be primed (as opposed to only the syntactic one) and priming does not always involve persistence or perseverance. I will use ‘structural priming’ throughout.


to phonological variation is less clear, but it has been argued that some version of these constraints also conditions lexical variation (Geeraerts 2016:15). Let us now examine how social-interactional meaning can be integrated into this model of language production.

3.2.2 Cognitive Construction Grammar and social-interactional meaning
One of the basic tenets of Cognitive Linguistics is that no sharp division can be drawn between linguistic and non-linguistic knowledge (e.g., Croft and Cruse 2004: 1–4). Rather, it is assumed that the meaning of a construction also includes information on the speakers who use it, where it is used, and in what kind of situations (Goldberg 2006a: 10; Langacker 1987: 63, 2010: 92). Therefore, besides the three cognitive constraints on the activation of the representations of constructions discussed in the previous section, contextual/social factors (e.g., social class, gender, age, power relationships, etc.) can also be expected to influence the spreading activation of variant constructions. In this regard, recent variationist work of the ‘Third Wave’ (see Eckert 2012) approaches patterns of social covariation as signaling ‘social-interactional meaning’, that is, localized, contextually bound identities, stances, or subgroup membership (Clark 2007: 9–10, 2008: 269–270; Eckert 2008; Kiesling 2005, 2009, 2013; Silverstein 2003). In Cognitive Sociolinguistics, such social-interactional meanings can be taken to arise from the domain-general process of categorization (e.g., Bodenhausen, Kan, and Peery 2012: 318–319; Lakoff 1987: Chap. 2). Specifically, social cognition studies (e.g., Bodenhausen, Kan, and Peery 2012; Sherman et al. 2013) recognize that, starting from an early age, we are continuously and automatically seeking out the similarities and the differences between the people around us and that this leads to the creation of social categories, which help us structure our experience of the social landscape. In turn, domain-general pattern-finding abilities (e.g., Aslin, Safran, and Newport 1999; Bybee and Beckner 2010: 829; Sherman et al. 2013: 549–550) can be expected to detect that individuals instantiating a particular social category use particular variable forms more often than others (Clark 2007: 9–10, 2008: 269–270; Sherman et al. 2013: 562–563). Experimental evidence reported by Hay, Warren, and Drager (2006) supports the claim that, if no further differences in meaning between the variants can be found (Labov 1972: 271; Lavandera 1978), this leads to a metonymic association between the social category and the distributions of linguistic variants (Kristiansen 2008: 67–68). If a particular social category co-occurs frequently with


these distributions in a variety of situations, this link may become entrenched in the speech community (Langacker 1987: 63). In that case, the community members conventionally associate the social category with the variable use of particular linguistic forms (Kristiansen 2008). As a result, the alternation acquires social-interactional meaning (Clark 2007: 9–10). In this light, it is rather unproblematic to suppose that if two constructions refer to exactly the same Idealized Cognitive Model, their representations will include a probabilistic social-interactional meaning specification (Croft 2000: 172; Hollmann and Siewierska 2011: 46; Langacker 1987: 63, 2010: 92). This specification ensures that, in language comprehension, the use of one of these constructions at a certain rate will activate the representation of the social category (‘first-order indexicality’; Silverstein 2003; Eckert 2008) and, potentially, everything the participants of a usage event can be expected to know or infer about individuals instantiating that category, such as, for example, salient personality traits or stances they often take (‘higher-order indexicality’; Silverstein 2003; Eckert 2008).5 In language production, in turn, domain-general pattern-finding abilities allow speakers to unconsciously keep track of the distributions of the alternations they use, which are compared by categorization processes to the probabilities contained in the representations of constructions. Depending on the social-interactional meaning the speaker wishes to encode, this process increases the level of activation of the variant that keeps the overall distribution within the frequency range associated with the intended social category. 
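The production mechanism just described can be pictured as a simple feedback loop. The sketch below is a deliberately crude simplification: it treats the boost in activation as a deterministic choice, and the function name `favored_variant`, the variant labels, the counts, and the target range are all hypothetical values introduced for illustration only.

```python
def favored_variant(counts, target_range):
    """Return the variant whose next use keeps the speaker's running rate of
    the agreeing variant inside (or closest to) the frequency range associated
    with the intended social category.

    counts       -- tokens produced so far, keyed by 'agreeing'/'non_agreeing'
    target_range -- (low, high) proportion of agreeing tokens (assumed values)
    """
    low, high = target_range
    best_distance, best_variant = None, None
    for variant in ("agreeing", "non_agreeing"):
        projected = dict(counts)
        projected[variant] += 1
        rate = projected["agreeing"] / sum(projected.values())
        # Distance to the target range; zero when the rate falls inside it.
        distance = max(low - rate, rate - high, 0.0)
        # Ties are resolved in favor of the first variant checked (arbitrary).
        if best_distance is None or distance < best_distance:
            best_distance, best_variant = distance, variant
    return best_variant

# A speaker who has so far used too little agreement for the intended category
print(favored_variant({"agreeing": 2, "non_agreeing": 8}, (0.4, 0.6)))
# A speaker who has so far used too much agreement for the intended category
print(favored_variant({"agreeing": 8, "non_agreeing": 2}, (0.4, 0.6)))
```

In a fuller model, the outcome of this comparison would be one more source of activation that competes with the cognitive constraints of Section 3.2.1, rather than a categorical decision.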
In sum, the discussion in this section has shown that, because Cognitive Construction Grammar assumes that “knowledge of language is knowledge” (Goldberg 1995: 5), the theory can readily accommodate the social-interactional meaning of alternating constructions that refer to the same event type. The differentiated use of constructions to signal social-interactional meaning can also be modeled straightforwardly in the automatic spreading activation model of language production. However, as will become evident in Chapter 8, the fine-grained, context-dependent study of the social-interactional meanings of presentational haber pluralization goes beyond the scope of this book, as it would require an entirely different type of data. Instead, in Part B, we will mainly

|| 5 In other words, I propose that first-order indexicals are rather similar to the meaning of isolated words – which have the potential of referring to the entire body of knowledge a speaker possesses on the type of entity or activity designated (Langacker 2010: 96–97) –, whereas higher-order indexicals are analogous to the highly specific and contextually bound interpretations speakers and hearers negotiate for words in specific usage events (see Langacker 2008: Chap. 13.2).


be concerned with the metonymic links Cubans, Dominicans, and Puerto Ricans establish between the distribution of agreeing and non-agreeing presentational haber and (knowledge on) social categories.

3.2.3 Cognitive Construction Grammar and individual-based variation
Since Cognitive Construction Grammar adheres to the view that language resides in the minds of individual speakers, language variation can also be hypothesized to reflect constraints related to the individual. Indeed, as a usage-based theory, Cognitive Construction Grammar proposes that no two speakers can have the exact same grammar, because both the representations of constructions and their relative strengths derive from prior experience with language (Bybee 2008: 347; Langacker 1987: 380, 1990: Chap. 10). In this regard, Dąbrowska (2015) presents evidence that speakers may differ substantially from one another in terms of the degree to which they have mastered basic and less common grammatical constructions. Still, because speakers tend to accommodate to each other in usage events and since most members of a speech community will have had similar experiences with language, interspeaker variation stays within a fairly narrow range (Bybee and Beckner 2010: 833; Croft 2000: 166; Dąbrowska 2015: 664). This entails that, whenever speakers’ social histories have provided linguistic experiences that differ sharply from those of the majority, some superficial discrepancies may be expected. Indeed, studies of language variation and change have repeatedly found that individuals may differ greatly from one another in terms of their overall frequencies of use of particular variants and that this correlates rather closely with their specific social histories. Yet, it has also been shown that linguistic environments display the same effect directionalities for all community members (Cedergren and Sankoff 1974: 374; Labov 1972: 120; Weinreich, Labov, and Herzog 1968: 188).
For example, for the island of Bequia (West Indies), Meyerhoff and Walker (2007: 355–356) compare the variable absence/presence of copula be in the island’s English-based Creole variety among ‘urban sojourners’ – individuals who have lived and worked for a stretch of time in Canada or the United Kingdom – and their ‘stay-at-home’ peers. Their findings suggest that, although the urban sojourners sound radically different and use copula be more often, the effect of the linguistic predictors on their copula use is largely identical to the average effect that is observed among the stay-at-home group. As a result of their shared sociolinguistic experiences, speech community members also share social evaluations of variant linguistic forms (Labov 2006:


340). In this regard, more interspeaker variation has been attested in the literature, which offers many examples of individuals who, because of their social histories and/or non-conformist personalities, defy social norms of conduct (e.g., Labov 2001: Chap.12; Wolfram and Beckett 2000: 25–29). Other studies that have attended to individual-based variation have equally revealed that certain speakers use particular variants more or less frequently than other members of their age, education, or gender group (e.g., Carpenter and Hilliard 2005; Smith and Durham 2012a; Wolfram and Beckett 2000). This seems to be especially true for lexical (Smith and Durham 2012a: 208) and morphosyntactic alternations (e.g., Ashby 2001; Coveney 2003, 2004, 2005; Smith and Durham 2012a: 210; Walker and Meyerhoff 2013; Wolfram and Beckett 2000: 18), although exceptions to this pattern have also been reported (Wolfram and Beckett 2000: 19–20). In sum, it is widely recognized that, although the overall frequency of a particular variant may differ considerably from one speaker to another, the constraints that govern the usage of variable linguistic forms have highly similar effects for most members of the speech community. This is especially true for conditioning linguistic environments, which usually display the same directionality of effects for all individuals. This can be taken to support that these tendencies reflect speakers’ ‘shared expertise’ of the community language, which arises from their shared sociolinguistic experiences (Croft 2000: 166), that they are stipulated by a common set of sociolinguistic community norms external to the individual (Labov 1972: 120–121), that they reflect speakers’ knowledge about the probability of particular variants in specific contexts (Bresnan 2007; Bresnan et al. 2007; Bresnan and Ford 2010), or that they reveal the effect of universal domain-general cognitive constraints on spreading activation, as I argued above. 
Therefore, to make this point, it will be necessary to show that even across speech communities individuals respond in a highly similar way to linguistic environments. In Part B, I will evaluate whether this is the case for presentational haber pluralization. Let us now summarize the most important points raised in this chapter.

3.3 Summary
In this chapter, I have presented a concise sketch of some key concepts of Cognitive Construction Grammar. In this framework, speakers’ grasp of their native language is pictured as a network of form-function pairings, called ‘constructions’. The meanings of constructions refer to conceptualizations of things (nouns), events (verbs), qualities (adjectives and adverbs), or abstractions over


events of the same type (argument-structure constructions), that is, Idealized Cognitive Models. In syntax, argument-structure constructions provide psychologically plausible solutions for idioms and pragmatically or semantically motivated alternations. I have also argued that the automatic spreading activation model of language production that is assumed in Cognitive Construction Grammar readily accounts for the statistical patterns in language (variation). Additionally, constructions can be paired with social information. In this regard, I have argued that domain-general pattern-finding abilities and the language production model that is assumed in Cognitive Construction Grammar can model the way speakers unconsciously use the distributions of alternating linguistic forms to position themselves against the background of social categories and to express related social-interactional meanings. Finally, I have observed that Cognitive Construction Grammar recognizes the existence of individual-based variation and I have reviewed some of the literature on this type of linguistic variability. In the following chapter, I will introduce the research questions and the hypotheses.


Chapter 4: Research questions and hypotheses

4 Research questions and hypotheses
The global aim of this book is to investigate, through the study of presentational haber pluralization in Caribbean Spanish, how cognitive, social, and individual-based constraints shape morphosyntactic variation. Now that the previous chapters have introduced presentational haber pluralization, Cognitive Construction Grammar, and the constraints that will be the main focus of this book, in this chapter, I will draw on these concepts to formulate research questions (Section 4.1) and hypotheses (Section 4.2).

4.1 Research questions
The questions around which this volume will revolve are concerned with presentational haber pluralization and the central hypothesis that morphosyntactic variation is conditioned by cognitive, social, and individual constraints (see Section 3.2). Particularly, I will focus on the following issues:

I. Cognitive constraints on presentational haber pluralization
– What are the patterns of covariation with linguistic predictors? Do they support the claim that markedness of coding, statistical preemption, and structural priming constrain presentational haber pluralization?

II. Social constraints on presentational haber pluralization
– How do different social groups use agreeing and non-agreeing presentational haber? Does this portray the variation as an ongoing language change from below?

III. Individual constraints on presentational haber pluralization
– How does haber pluralization pattern in the language production of individual speakers? What do these patterns tell us about individual constraints on morphosyntactic variation?

In the following section, I will propose tentative answers to these questions.


4.2 Hypotheses
Section 4.2.1 will introduce the working hypothesis that presentational haber pluralization involves a spreading-activation competition between two variant construction schemas. Assuming this working hypothesis, I will argue in Section 4.2.2 that the results obtained in earlier studies for the reference of the noun phrase and the verb tense can be seen as reflexes of markedness of coding and statistical preemption, respectively. Against the background of Section 3.2.1, I will also introduce an additional constraint on presentational haber pluralization, namely, structural priming. In Section 4.2.3, drawing on Labov’s (2001) Principles of Linguistic Change, I will formulate a series of predictions about the patterns of social covariation presentational haber pluralization may feature. Finally, Section 4.2.4 will introduce the hypotheses that capture the way the cognitive constraints are expected to apply across different social groups, the individual participants, and Cuban, Dominican, and Puerto Rican Spanish. These hypotheses will be operationalized and tested in Chapter 7, Chapter 8, and Chapter 9.

4.2.1 Working hypothesis
Within the theoretical setting presented in the previous chapter, the case study starts from the following working hypothesis:

In Caribbean Spanish, presentational haber pluralization corresponds to a slowly advancing language change from below: the agreeing presentational haber construction is replacing the non-agreeing presentational haber construction. The variants only differ with regard to the syntactic function of the noun phrase (non-agreeing variant: object; agreeing variant: subject), the ensuing relative prominence they attribute to their nominal arguments (non-agreeing: less; agreeing: more), and the social categories associated with their relative frequencies.

Of course, this is a very abstract description of the variation, which does not allow for any predictions. However, when we apply the constraints presented in Section 3.2 and Labov’s (2001) Principles of Linguistic Change to this hypothesis, we can derive a list of more detailed extrapolations. Let us consider these from up close, beginning with those that follow from markedness of coding.


4.2.2 Cognitive constraints
4.2.2.1 Markedness of coding
Cognitive Linguistics proposes that “the grammatical behaviors used to identify subject and object do not serve to characterize these notions but are merely symptomatic of their conceptual import” (Langacker 2008: 364). Specifically, Langacker (1991: 306) argues that the conceptual import associated with subjecthood consists in indicating that a particular clausal participant is exceptionally prominent, i.e., that the speaker has her/his attention focused on it (Langacker 1991: 294; Myachykov and Tomlin 2015; Talmy 2007). Therefore, the working hypothesis proposes a slight semantic difference between agreeing and non-agreeing presentational haber: the agreeing variant, which has a subject, attributes relatively more prominence to the nominal than its competitor, which has a direct object. In the light of this hypothesis and taking into account that speakers are more likely to attend to other humans (Langacker 1991: Chap. 7), the effect of human-reference nouns that was observed in earlier investigations may reflect the domain-general cognitive constraint of markedness of coding. This is captured by hypothesis 1.
Hypothesis 1, Markedness of coding: Cognitively more prominent nominal arguments will increase the activation of the agreeing presentational haber construction.

In addition, since the working hypothesis proposes that the two variants of the presentational haber construction are associated with different social categories, we may expect that markedness of coding increases the activation of a particular variant as a function of the social-interactional meaning speakers wish to express. However, since these aspects of the variation are closely tied to the hypothesis of change from below, they will be discussed in Section 4.2.3.

4.2.2.2 Statistical preemption
In Spanish, haber has always been used in a variety of constructions. In most of these, the verb merely acts as a tense-aspect-mood morpheme. This is most evidently the case for the compound tense constructions exemplified in Tab. 7.

Tab. 7: Some compound paradigms of the verb cantar ‘to sing’

                   Present perfect    Pluperfect          Subjunctive perfect
First singular     he cantado         había cantado       haya cantado
Second singular    has cantado        habías cantado      hayas cantado
Third singular     ha cantado         había cantado       haya cantado
First plural       hemos cantado      habíamos cantado    hayamos cantado
Second plural      habéis cantado     habíais cantado     hayáis cantado
Third plural       han cantado        habían cantado      hayan cantado

Haber also functions as an auxiliary in two modal constructions. The first construction, haber de + infinitive, expresses deontic obligation (see example [16]), epistemic necessity (see example [17]), prospectivity (see example [18]), and, in certain varieties, futurity (see example [19]) (RAE and ASALE 2009: §28.6ñ-q).
(16) Si ha de crearse un tribunal para juzgar los crímenes de guerra, tiene que ser absolutamente independiente (Real Academia Española 2008b-, Spoken, Cuba).
‘If a war tribunal has to be created to judge the war crimes, it has to be completely independent.’
(17) Participant: Ella ya ha dado clases allí. Interviewer: ¿Cuántos años? Participant: Yo creo que han de ser cuatro con este (Davies 2002-, Spoken, Mexico).
Participant: ‘She has already given classes over there.’ Interviewer: ‘How many years?’ Participant: ‘I think that it must be four with this one.’
(18) Interviewer: Usted contó el Times Square a las doce de la noche de un Año Nuevo. Participant: Seguramente. El Times Square. ¡Uy! es algo tan fantástico, tan... que no he de olvidar nunca en mi vida (Davies 2002-, Spoken, Santiago de Chile).
Interviewer: ‘You told me about Times Square at midnight of a New Year’s Eve.’ Participant: ‘That’s right. Times Square. Wow! It’s something so fantastic, so… that I’m never going to forget in my lifetime.’


(19) Para completar la lista que a continuación hemos de incluir, aparece recientemente una ‘iglesia satánica’ en nuestro pueblo (Internet, Puerto Rico, http://goo.gl/Kypn1Q).
‘To complete the list that we will include below, recently, a ‘Satanist church’ has appeared in our town.’
The second modal construction with haber combines the verb with the complementizer que and an infinitive, as can be seen in example (20). Like haber de + infinitive, this construction also expresses deontic obligation. There is, however, one difference: with haber que + infinitive, the verb is used impersonally (RAE and ASALE 2009: §28.6s-§28.6v).
(20) O sea, era muy distinto a antes. Anteriormente, esta gente era: “Hay que hacer esto, hay que desbaratar la universidad.” Ahora no (Real Academia Española 2008b-, Spoken, Puerto Rico).
‘That is, it was very different from before. Before, those people were like: “This has to be done, the university has to be destroyed.” Now, they’re not.’
Until the fifteenth/sixteenth centuries, haber was also widely used as a possessive lexical verb, and even today it may still be used like this (Álvarez-Martínez 1996: 180; Hernández-Díaz 2006: 1064, Note 2; Real Academia Española 2005: s.v. haber). However, this use does not appear to have much vitality, because RAE and ASALE (2009: §4.13b) do not include possessive haber in their general overview of haber constructions. Indeed, various historical investigations have shown that by the end of the sixteenth century, tener is already the preferred possessive verb in Spanish (Fontanella de Weinberg 1987: 33; Garachana-Camarero 1997: 222; Hernández-Díaz 2006: 1064). Still, RAE and ASALE (2009: §4.13d) state that the substandard first-person plural form habemos occurs sporadically with an abstract direct object and a possessive meaning.
Their example (see example [21]) and Fontanella de Weinberg’s (1987:107) data suggest that, in Modern Spanish, possessive haber only survives in idioms, such as haber menester ‘to need’ or no haber remedio ‘to be a lost cause’. This is also evident from the Academies’ comment that the use of haber to express possession has to be seen as an archaic stylistic figure (Real Academia Española 2005: s.v. haber; RAE and ASALE 2009: §4.13e). (21) ¡Los hombres no habemos remedio! (RAE and ASALE 2009: §4.13d). ‘We men, we’re a lost cause!’ In legal documents and in literature, archaic haber also occurs in participle (see example [22]) and passive constructions (see example [23]). In these cases, the


verb expresses meanings such as ‘to arrive at’, ‘to achieve’, ‘to obtain’, or ‘to catch’ (Bello 1860: 257; RAE and ASALE 2009: §41.6e-§41.6h).
(22) Según las estadísticas, de cada tres matrimonios habidos en el país uno fracasa, con impacto consiguiente en el fruto de los mismos (Davies 2002-, Internet, Cuba).
‘According to the statistics, out of three marriages achieved in the country, one fails, with consequent impact on the result of these.’
(23) No pudo ser habido el reo (Bello 1860: 257).
‘The accused could not be caught.’
Finally, until the eighteenth century, haber also occurred in a presentational construction specifically dedicated to introducing time spans into discourse (see example [24]), which later became supplanted by a competing structure with hacer (Fontanella de Weinberg 1992b: 38).
(24) Cinco años ha que vine de las provincias del Perú con provisiones del marqués y gobernador Don Francisco Pizarro (Real Academia Española 2008a-, sixteenth century).
‘It’s been five years since I came from the Peruvian provinces with provisions of the Marquis and governor Don Francisco Pizarro.’
Since haber has always occurred in multiple constructions, in the light of the working hypothesis, the fact that verb agreement does not occur with equal frequency for all tense forms of presentational haber may reflect the domain-general cognitive constraint ‘statistical preemption’, introduced in Section 3.2.1. As was explained in that section, when a form presents high token frequency in one construction schema, but only occurs sporadically in other patterns, its most accessible cognitive representation is not its independent form, but rather a partially lexically filled instance of this construction (Goldberg 1995: 79; Langacker 1987: 59–60, 1991: 48).
As this sub-construction has a higher baseline level of activation, it disfavors the use of an alternative expression based on a more abstract competing construction schema that shares the same pragmatic and semantic constraints (Goldberg 2006a: 94, 2009: 102–103, 2011; Robenalt and Goldberg 2015). Therefore, if certain tense forms of haber occurred mainly in (non-agreeing) presentational haber expressions before the agreeing construction emerged as a conventional alternative, upon actuation of the change, the agreeing variant would not have been used frequently in expressions involving those tenses. In subsequent generations, repetition usually ensures that this sort of skewed distribution remains intact (Bybee 2006: 715). This leads to hypotheses 2a-2b.


Hypothesis 2a, Statistical preemption: If the third-person singular form of a particular tense of haber was frequently used outside of the non-agreeing presentational construction before presentational haber became involved in community-wide agreement variation, this verb tense will favor the agreeing presentational haber construction.
Hypothesis 2b, Statistical preemption: The other verb tenses will disfavor the agreeing presentational haber construction, provided that both the entrenched instance of the non-agreeing construction and a novel expression based on the agreeing construction could encode the conceptualization equally well (i.e., provided that the coding of the conceptual import does not call for aspectual or modal auxiliaries).1

4.2.2.3 Structural priming
If the variation amounts to a competition between two argument-structure constructions, as the working hypothesis claims, the discussion in Section 3.2.1 leads me to expect the pattern described by hypothesis 3.
Hypothesis 3, Structural priming: The earlier mention of one of the presentational haber constructions in discourse will promote the use of the same construction in the next occurrence, regardless of variations in tense, aspect, or mood.

4.2.3 Principles of Linguistic Change
Based on the results of earlier studies of presentational haber pluralization, the working hypothesis claims that the alternation constitutes a slowly progressing language change from below. This claim entails the prediction that the alternation will display patterns of social and stylistic covariation typical of this type of linguistic evolution. A first such pattern is the ‘apparent-time’ distribution

|| 1 This hypothesis does not imply that frequent combinations of (non-agreeing) presentational haber and aspectual or modal auxiliaries cannot be stored as a single unit. Rather, it is inspired by the fact that aspectual/modal auxiliaries do not co-occur frequently with presentational haber. For example, in the twentieth-century section of Corpus del español (Davies 2002-), there are only 232 presentational cases of third-person singular poder haber ‘there can be’ against 39,472 cases of third-person singular synthetic presentational haber. A similar pattern is found for deber haber ‘there has to be’, with 160 presentational third-person singular cases. Searches for aspectual auxiliary constructions such as acabar de haber ‘to stop being’, dejar de haber ‘to stop being’, and empezar a haber ‘to start being’ do not provide any results.


characteristic of linguistic changes (Labov 1994:43–72),2 which predicts the situation described by hypothesis 4. Hypothesis 4, Apparent time: The youngest participants will favor the agreeing presentational haber construction, whereas older participants will make more use of the nonagreeing presentational haber construction.

However, the research reported in Section 2.2 suggests that presentational haber pluralization may progress too slowly to be observed in apparent time (DíazCampos 2003; Fontanella de Weinberg 1992b). Therefore, more and less direct evidence may be necessary to test the change-in-progress hypothesis. In this regard, we may resort to Labov’s (2001: Chap. 8) Gender Principle, which establishes that “[i]n linguistic change from below, women use higher frequencies of innovative forms than men do” (Labov 2001: 292). This leads to hypothesis 5. Hypothesis 5, Gender Principle: In comparison to men of the same social characteristics, women will use the agreeing presentational haber construction more often.

Yet, since gender-differentiated behavior is also found for changes from above (Labov 2001: 274) and because the possibility of age-graded behavior – “a regular change of linguistic behavior with age that repeats in each generation” (Labov 1994: 45) – always exists for apparent-time distributions, more evidence will be needed before we can confidently conclude that this alternation constitutes a linguistic change from below. In this regard, it has often been shown that, in changes from below, the innovative variants usually display no style shifting or increase in frequency when formality – defined in terms of the amount of attention that is explicitly turned to speech – rises (Labov 1972: 239, 2001: Chap. 3; Silva-Corvalán 2001: 248–249). This leads to hypothesis 6.

Hypothesis 6, Style: When more attention is explicitly focused on language, the frequency of the agreeing presentational haber construction will not decrease.

Furthermore, Labov (1972: 138) observes that highly educated speakers tend to conform to supralocal prestige norms. This is captured by hypothesis 7.

Hypothesis 7, Educational achievement: Higher educational achievement will favor the non-agreeing presentational haber construction, whereas a shorter formal education will promote the agreeing presentational haber construction.

|| 2 In sociolinguistics, the term ‘apparent time’ refers to a methodological construct in which frequency differences between generational groups are used to trace the evolution of frequency distributions over time (see Section 5.1.2.1).


4.2.4 The effects of the linguistic predictors across individuals, social groups, and speech communities

Hypotheses 4–7 formulate predictions regarding the overall frequency of agreeing and non-agreeing presentational haber within certain social groups and speech styles. Since hypotheses 1–3 claim that the effects of specific linguistic predictors are reflexes of domain-general cognitive constraints that condition spreading activation, they can be expected to apply rather uniformly across individuals and social groups. Therefore, hypothesis 8 states that:

Hypothesis 8, Interactions: Although specific social groups and individual participants may feature diverse frequencies of agreeing and non-agreeing presentational haber, the directionalities of the effects of the linguistic predictors will be identical for all social groups and all individuals.

Similarly, the claim that linguistic variation is conditioned by domain-general cognitive constraints on spreading activation implies that across speech communities speakers will respond uniformly to the linguistic environments that model these constraints. In contrast, Section 3.2.2 describes the social-interactional meaning of alternating constructions as the fruit of speakers’ earlier sociolinguistic experiences. Hypothesis 9 attempts to capture this.

Hypothesis 9, Divergence: The data will display highly similar tendencies for the linguistic predictors, but the associations of the presentational haber constructions to social categories will vary according to the respective speech communities.

Before turning to the methods that were used in testing these hypotheses, let us first summarize the most important ideas that were presented in this chapter.

4.3 Summary

In this chapter, I have introduced the research questions and the hypotheses that will form the main topics of Chapter 7, Chapter 8, and Chapter 9. Crucially, the working hypothesis of this study contends that presentational haber pluralization amounts to a competition between two variants of the presentational construction with haber, which only differ in terms of their associations to social categories and the syntactic function of their nominal arguments. Assuming this working hypothesis, in line with the broader aim of this book, I have hypothesized about the way markedness of coding, statistical preemption, and structural priming may shape presentational haber pluralization. Additionally, drawing on Labov’s (2001) Principles of Linguistic Change, I have proposed several hypotheses that intend to capture the potential community-level social-interactional meaning of presentational haber pluralization. In the final section of this chapter, I have introduced two additional hypotheses that capture the way the linguistic predictors are expected to apply across individuals, social groups, and the three dialects of Caribbean Spanish. In Part B, I will operationalize and test these hypotheses. Let us now consider the methods that will be applied in this book. These will be the topic of the next chapter.


Chapter 5: Methodology

5 Methodology

As I mentioned in the introduction, over the past fifteen years, the usage-based approach to language (e.g., Langacker 1990: Chap. 10) has motivated a methodological shift in Cognitive Linguistics in favor of the greater use of corpus data and quantitative methods (e.g., Geeraerts 2005; Geeraerts and Kristiansen 2015; Pütz, Robinson, and Reif 2012). This book follows this current, adopting comparative sociolinguistic methodology. Specifically, Section 5.1 presents the decisions that were taken in sampling participants from the Havana, Santo Domingo, and San Juan speech communities. Subsequently, Section 5.2 discusses the fieldwork methods. Section 5.3 focuses on the transcription procedure, the selection of instances of presentational haber, and the context types that are considered variable in this investigation. Section 5.4, in turn, is dedicated to the statistical toolkit that will be used in Chapter 7, Chapter 8, and Chapter 9. Section 5.5 introduces the comparative sociolinguistic method. The chapter concludes with a brief summary in Section 5.6.

5.1 Judgment sample, selection criteria, and stratification variables

Corpora inevitably constitute limited samples of both the endless expressive possibilities a language has to offer and the usage patterns of all of its speakers. Therefore, it is important to define the samples that were considered for analysis sharply, as they will determine the robustness of the results to a large extent. To this end, Section 5.1.1 introduces the ‘judgment sampling’ technique. Subsequently, Section 5.1.2 will present the general criteria and the social characteristics according to which participants were selected.

5.1.1 Judgment sample

Following standard practice in current variationist methodology (e.g., Milroy and Gordon 2003: 30–33; Tagliamonte 2006: 23–24), I sampled participants from the Havana, Santo Domingo, and San Juan speech communities according to a number of previously set social characteristics. This is called ‘judgment sampling’ and the social characteristics used in this process are called the ‘stratification variables’. With this method, the usage patterns of the members of a speech community can be investigated successfully with a relatively small number of participants. Of course, this requires that the analyst is realistic about the number and the types of hypotheses that can be explored with the data (Paolillo 2013: 113–114), that the social categories are locally meaningful, and that the sample includes enough individuals so that a single idiosyncratic speaker cannot distort the overall results (Guy 1980; Milroy and Gordon 2003: 30; Tagliamonte 2012: Chap. 4). Concerning the latter issue, the literature suggests that three to five participants per cell created by crossing the stratification variables is sufficient (Milroy and Gordon 2003: 30–35; Moreno-Fernández 2003: 8; Tagliamonte 2006: 23–24). In accordance with the guidelines of the international Project for the sociolinguistic study of the Spanish of Spain and America1 (Moreno-Fernández 2003: 8), the samples of this study include three participants per cell. Let us now turn our attention to the social characteristics that were used in selecting participants.

5.1.2 Selection criteria and stratification variables

As a general requirement, in order to be eligible, all participants had to be born and raised in their respective country and have lived in the capital for the last five years. Participants meeting these requirements were then selected and grouped together according to their ages (20–35 years vs. 55+ years), educational achievements (less than university vs. university), and genders (female vs. male). In the remainder of this section, the relevance of the stratification variables will be discussed, beginning with age.

5.1.2.1 Age

Age is a basic biological distinction that has some profound consequences for the roles the individual assumes in society and her/his expectations and views on life (Eckert 1989: 246–247). In addition, age is essential to the study of linguistic change in progress, because contrasts between generational groups are generally assumed to reflect the historical development of the language. This methodological construct is called ‘apparent time’ in variationist sociolinguistics (Labov 1994: 43–72).

|| 1 In Spanish: Proyecto para el estudio sociolingüístico del español de España y de América.


However, Section 2.2 suggests that presentational haber pluralization advances at an extremely slow rate (D’Aquino-Ruiz 2008; Fontanella de Weinberg 1992b), if it progresses at all (Quintanilla-Aguilar 2009). In this light, if any contrasts between generational groups are found, these will probably only be visible between the youngest and the oldest age cohorts. Therefore, there is no need to sample all adult generations available. This way, one can also avoid assigning speakers whose ages differ by only one year to different generations when there is no objective reason to assume that their speech is markedly dissimilar. With this in mind, this study only includes two age groups, defined as follows:
– 20–35 years old
– 55 years and older

5.1.2.2 Educational achievement

Educational achievement rather than social class was selected as the third stratification variable. This was motivated by the fact that Milroy and Gordon (2003) question the usefulness of social class as a stratification variable in the Latin American context, “which is characterized by a large difference in access to power and advantage between the elite and the majority of the population” (Milroy and Gordon 2003: 43). In this light, implementing a judgment sample with an equal representation of all social classes may render a severely disproportionate picture of the speech community. Additionally, since social class is usually defined as a function of multiple demographic parameters, which may be evaluated differently in the societies under study (Milroy and Gordon 2003: 43), comparing samples stratified by social class may actually imply comparing, for all predictors that are examined, the behavior of individuals whose only common feature is the social class index the investigator has superimposed on reality. In this sense, it is preferable to opt for educational achievement, which has sharply defined parallels in Cuba, the Dominican Republic, and Puerto Rico.

Additionally, there is ample evidence that prolonged formal education affects speakers’ native language attainment in ways that should not be underestimated. In variationist sociolinguistics, university education has been shown to bias speakers’ speech patterns towards those that are normatively sanctioned and to affect their abilities to recognize their own speech patterns (Labov 1972: 138, 2010: Chap. 4). In the literature on language processing, the influence of prolonged formal education is also well documented. For instance, Dąbrowska (1997) has shown that speakers who have enjoyed more formal education comprehend complex grammatical constructions more accurately. Particularly,


when presented with written stimuli such as example (25), speakers with lower educational achievements are less successful at providing the correct answer to a question of the type: What surprised Shona?

(25) Paul noticed that the fact that the room was tidy surprised Shona (constructed example, from Dąbrowska and Street 2006: 605).

Similarly, Chipere (2001) finds that participants with lower educational achievements perform less accurately at comprehending and recalling such grammatically complex utterances. Rather than pointing to limitations of working memory, his results suggest that the group with lower educational achievements has mastered these structures incompletely. With additional training, these participants perform as accurately as participants with higher educational achievement. Recent experiments have revealed comparable patterns of education-based variation when it comes to the comprehension of basic grammatical constructions (Dąbrowska 2015; Street and Dąbrowska 2010). In sum, the discussion in this section has shown that university education is an important factor that shapes speakers’ language production and processing abilities. Therefore, this study only distinguishes two educational achievement levels, defined as the most advanced degree the speaker has obtained:
– Less than university
– University2

5.1.2.3 Gender

Although participants were selected according to their biological sex, in accordance with standard practice in current variationist sociolinguistics (e.g., Cheshire 2002; Eckert 1989), the oppositions between men and women will be approached in terms of gender, that is, in terms of the social categories of masculinity and femininity. As the sociologist Epstein (2007) observes,

|| 2 I tried to include as few students as possible, since they cannot be rated satisfactorily using this criterion. Only three participants were still pursuing a degree at the time of the interview. The first, a young Puerto Rican woman, was about to complete her bachelor’s degree, which is why she was included in the university-educated group. The second participant, also a young Puerto Rican woman, had only recently started an associate’s degree at a community college, which is why I included her in the ‘less than university’ education group. The third participant, a young Cuban man, was in his last semester of law school (a six-year program in Cuba) when I interviewed him. Therefore, I rated him as a university graduate as well, because in Puerto Rico or the Dominican Republic he would have earned an undergraduate university degree already.


the gender divide is not determined by biological forces. No society or subgroup leaves social sorting to natural processes. It is through social and cultural mechanisms and their impact on cognitive processes that social sorting by sex occurs and is kept in place by the exercise of force and the threat of force, by law, by persuasion, and embedded cultural schemas that are internalized by individuals in all societies (Epstein 2007: 4, emphasis in the original).

As a result, men and women typically assume different roles in society and society expects different styles of behavior from males and females (Cheshire 2002; Chambers 2009: 116, 140; Eckert 1989: 246–247; Sherman et al. 2013: 567). For example, research in sociology shows that

[f]emales are more likely than males to express concern and responsibility for the wellbeing of others, less likely than males to accept materialism and competition, and more likely than males to indicate that finding purpose and meaning in life is extremely important (Beutel and Mooney-Marini 1995: 446).

These differences in behavior standards, expectations, and experiences have been proven to play an important role in linguistic change (e.g., Cheshire 2002; Labov 2001: Chap. 8, 12). For these reasons, gender is a must-have stratification parameter in all studies of language variation and change (e.g., Labov 2001: 84; Tagliamonte 2006: 23). Tab. 8 summarizes the sample as it was implemented in each of the three cities.

Tab. 8: Composition of the sample

| Educational achievement | 25–35 years, male | 25–35 years, female | 55+ years, male | 55+ years, female | Total |
|---|---|---|---|---|---|
| Less than university degree | 3 | 3 | 3 | 3 | 12 |
| University degree | 3 | 3 | 3 | 3 | 12 |
| Total | 6 | 6 | 6 | 6 | 24 |
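The way the sample size follows from crossing the stratification variables can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind Tab. 8, not part of the study's toolkit; the variable names and the neutral cohort labels ("younger", "older") are assumptions introduced for the example.

```python
from itertools import product

# Stratification variables as described in Section 5.1.2.
# The labels are illustrative placeholders, not the study's coding scheme.
AGES = ("younger", "older")
EDUCATION = ("less than university", "university")
GENDERS = ("female", "male")
CITIES = ("Havana", "Santo Domingo", "San Juan")
PER_CELL = 3  # three participants per cell (Moreno-Fernández 2003: 8)


def build_quota(per_cell=PER_CELL):
    """Cross the stratification variables and assign a quota to each cell."""
    return {cell: per_cell for cell in product(AGES, EDUCATION, GENDERS)}


quota = build_quota()
per_city = sum(quota.values())    # participants needed in one city
total = per_city * len(CITIES)    # participants needed overall

print(len(quota), "cells;", per_city, "participants per city;", total, "in total")
```

With two levels for each of the three variables, the design has eight cells, which yields the 24 participants per city shown in Tab. 8 and 72 participants across the three cities.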

5.2 Fieldwork methods

The fieldwork was carried out in March-April (San Juan), April-May (Santo Domingo), and May-June (Havana) 2011. In the three cities, the author, a fluent second-language speaker of Puerto Rican Spanish, conducted all the interviews. Most participants were volunteers recruited with the help of local consultants,


who introduced the author as a student of the local culture and language on a class assignment. However, as it turned out to be impossible to fill the quotas with volunteers alone, certain participants were given cash incentives. Specifically, in Havana, one speaker received three convertible pesos (1 CUC = 1 USD) for his participation. In Santo Domingo, ten participants received a 200-peso incentive (1 RD$ = 0.02 USD). Finally, in San Juan, four participants received a ten-dollar compensation.

The interviews were recorded using the rear-facing built-in 120° microphone of a Samson Zoom H2 digital recorder, set to 24-bit/96 kHz WAV format, with the low-cut filter enabled and the microphone Auto Gain Control set to AGC 2 (Speech). The majority of the participants were recorded for about 40–120 minutes. The shortest interview lasted 29 minutes, the longest lasted two hours and 25 minutes, and the average duration was around 60 minutes. The total amount of speech data collected amounts to about 76 hours or, roughly, 700,000 orthographic words.

The data were gathered by combining three methods: a sociolinguistic interview, a story-reading task, and a questionnaire-reading task. The motivation for this combination of methods is twofold. First, using story reading and, especially, questionnaire reading, participants can be confronted with more variable contexts and with structures that occur too infrequently to be studied in a corpus of sociolinguistic interviews (Wolfram 1986: 10). Second, combining semi-directed interviews, in which virtually no attention is turned to language, with two tasks that explicitly focus all attention on the speaker’s speech habits creates an opportunity to observe style shifting (Labov 2006: Chap. 4, 1972: 98).3 Therefore, while analyzing the data, I will treat the interview and the two elicitation tasks as two different speech styles.
Let us consider now the three data gathering methods, starting with the sociolinguistic interview.

5.2.1 Sociolinguistic interview

As with most sociolinguistic interviews, the goal of this part of the recording sessions was to obtain 30 to 45 minutes of relaxed speech from the participants as well as the full range of their demographic data. Following standard practice in variationist sociolinguistics, the interview revolved around thematic question

|| 3 Labov (2006: Chap. 4, 1972: 99) argues that the difference between formal and informal styles consists in that, in formal styles, more attention is paid to speech. This implies that, if we focus more attention on speech, speakers will automatically adopt a more formal style.


modules, designed to invite the participants to talk about a particular topic for as long as they wanted. The questions were loosely inspired by Tagliamonte’s (2006: Appendix B) updated version of Labov’s (2006: Appendix A) original interview schedule, the interview format of the Project for the sociolinguistic study of the Spanish of Spain and America (Moreno-Fernández 2003: 12–15), and the list of questions used by Quintanilla-Aguilar (2009: Appendix F). Additionally, in order to investigate comprehension-to-production priming effects, a set of questions with presentational haber (see example [26]) was included in the thematic modules. In these questions, agreeing and non-agreeing presentational haber were used randomly.

(26) Interviewer: ¿Este, y habían castigos por no llevar el uniforme?
Participant: Sí, había castigos, si, si ibas con ropa de calle (LH03M12/LH264-LH265).
Interviewer: ‘Er, and were there punishments for not wearing the uniform?’
Participant: ‘Yes, there was punishments, if, if you dressed casually.’

5.2.2 Story-reading task and questionnaire-reading task

After the interview, the participants were instructed to read out loud a two-page text in which 31 decision contexts with presentational haber and some distracter verbs had been inserted (20 trials, 11 fillers; see Appendix A). As shown in example (27), while reading and without prior preparation, the participants had to choose the variant that corresponded to their own usage.

(27) En una pequeña aldea, había/habían un anciano padre y sus dos hijos. El mayor era trabajador y llenaba de alegría el corazón de su padre, mientras el más joven sólo le daba disgustos…
‘In a small village, there was/there were an old father and his two sons. The oldest worked hard and filled his father’s heart with joy, whereas the youngest only irritated him…’4

|| 4 For the first two interviews of the Puerto Rican dataset, the format was somewhat different. For example, the first line of the text read: En una pequeña aldea ______(haber, pasado) ‘In a small town ______ (there to be, past tense)’. However, participants turned out to have great difficulty using these grammatical terms to insert the intended verb form. Therefore, the story-reading task was quickly adapted to its present format.


Since only basic literacy could be assumed for all participants, the story-reading task was deliberately kept very simple. Rather than confronting the participants with a newspaper article or another kind of text written with an adult audience in mind, the story-reading task was based on a text written for children of about seven years of age: Juan Sin Miedo ‘John Without Fear’. Still, as is shown in Tab. 9–Tab. 11, eight participants without university education were unable to complete the reading tasks on their own. In these cases, depending on the amount of time that had already passed by, the interview was either concluded (one participant from San Juan) or the interviewer read the text to the interviewees, instructing them to identify the form that corresponded to their own usage (the seven other participants).

Tab. 9: Number of participants from Havana who completed the story- and questionnaire-reading tasks with the help of the interviewer, by age, educational achievement, and gender

| Educational achievement | 25–35 years, male | 25–35 years, female | 55+ years, male | 55+ years, female | Total |
|---|---|---|---|---|---|
| Less than university degree | 0 | 0 | 2 | 1 | 3 |
| University degree | 0 | 0 | 0 | 0 | 0 |
| Total | 0 | 0 | 2 | 1 | 3 |

Tab. 10: Number of participants from Santo Domingo who completed the story- and questionnaire-reading tasks with the help of the interviewer, by age, educational achievement, and gender

| Educational achievement | 25–35 years, male | 25–35 years, female | 55+ years, male | 55+ years, female | Total |
|---|---|---|---|---|---|
| Less than university degree | 0 | 1 | 1 | 0 | 2 |
| University degree | 0 | 0 | 1 | 0 | 1 |
| Total | 0 | 1 | 2 | 0 | 3 |


Tab. 11: Number of participants from San Juan who completed the story- and questionnaire-reading tasks with the help of the interviewer or did not complete the tasks, by age, educational achievement, and gender

| Educational achievement | 25–35 years, male | 25–35 years, female | 55+ years, male | 55+ years, female | Total |
|---|---|---|---|---|---|
| Less than university degree | 0 | 0 | 1 | 1 | 2 |
| University degree | 0 | 0 | 0 | 0 | 0 |
| Total | 0 | 0 | 1 | 1 | 2 |

When it comes to the linguistic contexts that were presented to the participants, Section 2.2 suggests that presentational haber pluralization is primarily conditioned by the absence/presence of negation, the characteristics of the nominal argument, and the verb tense. However, incorporating two tokens of all possible combinations of these predictors in the reading task would result in too large a number of trial items. Therefore, the task only includes a selection of verb tenses, multiple types of nominal arguments, and affirmative and negative clauses. Tab. 12 shows that, although not all tenses of presentational haber could be represented in the task, there is an almost equal representation of the present and the preterit tense, which have been shown to disfavor agreeing presentational haber in earlier research (9 tokens), and other tenses, which have been shown to favor agreeing presentational haber (11 tokens). For the noun phrase, Section 2.2 suggests that animacy is a major constraint on presentational haber pluralization. Therefore, animate-reference nouns (8 tokens) and inanimate-reference nouns (12 tokens) are almost equally represented in the text, as are affirmative (11 tokens) and negative clauses (9 tokens).

Tab. 12: Forms of presentational haber included in the story-reading task, by animacy and the absence/presence of negation

| Form | Animate, with negation | Animate, without negation | Inanimate, with negation | Inanimate, without negation | Total |
|---|---|---|---|---|---|
| Imperfect (había/habían) | 0 | 2 | 2 | 1 | 5 |
| Morphological future (habrá/habrán) | 0 | 0 | 0 | 1 | 1 |
| Periphrastic future (va a haber/van a haber) | 0 | 0 | 1 | 0 | 1 |
| Present perfect (ha habido/han habido) | 0 | 2 | 0 | 0 | 2 |
| Present tense (hay/hayn) | 2 | 0 | 4 | 1 | 7 |
| Preterit (hubo/hubieron) | 0 | 2 | 0 | 0 | 2 |
| Subjunctive present (haya/hayan) | 0 | 0 | 0 | 2 | 2 |
| Total | 2 | 6 | 7 | 5 | 20 |
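The token counts cited in the text for the story-reading task can be recovered from the cells of Tab. 12. The following Python sketch shows the aggregation; the dictionary layout, the column order, and the function name are assumptions made for this illustration and do not reflect how the stimuli were actually tabulated.

```python
# Cell counts transcribed from Tab. 12. Assumed column order:
# (animate with negation, animate without negation,
#  inanimate with negation, inanimate without negation)
TABLE_12 = {
    "imperfect": (0, 2, 2, 1),
    "morphological future": (0, 0, 0, 1),
    "periphrastic future": (0, 0, 1, 0),
    "present perfect": (0, 2, 0, 0),
    "present": (2, 0, 4, 1),
    "preterit": (0, 2, 0, 0),
    "subjunctive present": (0, 0, 0, 2),
}


def marginals(table):
    """Aggregate the cell counts into the groupings discussed in the text."""
    anim_neg, anim_aff, inan_neg, inan_aff = (
        sum(column) for column in zip(*table.values())
    )
    total = anim_neg + anim_aff + inan_neg + inan_aff
    present_preterit = sum(table["present"]) + sum(table["preterit"])
    return {
        "animate": anim_neg + anim_aff,
        "inanimate": inan_neg + inan_aff,
        "negative": anim_neg + inan_neg,
        "affirmative": anim_aff + inan_aff,
        "present_preterit": present_preterit,
        "other_tenses": total - present_preterit,
        "total": total,
    }


summary = marginals(TABLE_12)
print(summary)
```

Under these assumptions, the sums reproduce the figures given above: 9 present and preterit tokens against 11 tokens in other tenses, 8 animate against 12 inanimate nouns, and 11 affirmative against 9 negative clauses, for 20 trials in total.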

For the questionnaire-reading task, the participants were given a questionnaire consisting of 45 items (32 trials, containing 41 tokens of presentational haber, 13 fillers; see Appendix B) preceded by a description that evoked the usage context for the interpretation of the trial sentence, as can be seen in example (28). Then, they were instructed to read out loud the descriptions and the trial sentences, while simultaneously filling in the gaps with the multiple-choice answer that corresponded to their usage. The participants who were unable to complete the reading task without the help of the investigator were not handed the full questionnaire, but rather a random selection of three to four pages (minimally 18 trials and 10 fillers).

(28) Después de algún proyecto para mejorar la calidad del agua de las presas del país, un científico comenta: Hace diez años, no ________ más de tres sapos en esta presa. Ahora, cuenta con veinte patos, tres garzas y miles de peces.
a) hubo b) hubieron
‘Following a project to improve the water quality of the country’s basins, a scientist comments: Ten years ago, ______ more than three frogs in this basin. Now, it has twenty ducks, three cranes, and thousands of fish.
a) there wasn’t b) there weren’t’


Tab. 13: Forms of presentational haber included in the questionnaire-reading task, by animacy and the absence/presence of negation

| Form | Animate, with negation | Animate, without negation | Inanimate, with negation | Inanimate, without negation | Total |
|---|---|---|---|---|---|
| Acaba/acaban de haber ‘there has just been’ | 0 | 0 | 0 | 1 | 1 |
| Conditional (habría/habrían) | 1 | 1 | 1 | 0 | 3 |
| Debía/debían haber ‘there must be’ | 0 | 1 | 0 | 0 | 1 |
| Empezó/empezaron a haber ‘there has begun to be’ | 0 | 2 | 0 | 1 | 3 |
| Empieza/empiezan a haber ‘there begins to be’ | 0 | 0 | 0 | 1 | 1 |
| Imperfect (había/habían) | 1 | 2 | 2 | 4 | 9 |
| Morphological future (habrá/habrán) | 0 | 1 | 0 | 2 | 3 |
| Present perfect (ha habido/han habido) | 1 | 1 | 0 | 1 | 3 |
| Present tense (hay/hayn) | 0 | 2 | 1 | 2 | 5 |
| Preterit (hubo/hubieron) | 1 | 3 | 1 | 0 | 5 |
| Pudo/pudieron haber ‘there could be’ | 0 | 0 | 1 | 0 | 1 |
| Seguirá/seguirán habiendo ‘there will continue to be’ | 0 | 1 | 0 | 0 | 1 |
| Subjunctive imperfect (hubiera/hubieran) | 1 | 0 | 0 | 2 | 3 |
| Subjunctive present (haya/hayan) | 0 | 1 | 0 | 0 | 1 |
| Subjunctive present perfect (haya habido/hayan habido) | 0 | 1 | 0 | 0 | 1 |
| Total | 5 | 16 | 6 | 14 | 41 |

As signaled above, the main purpose of the questionnaire-reading task was to confront participants with linguistic contexts that occur too infrequently in


unscripted spoken language. Section 2.2 suggests two such types. First, there are the cases in which presentational haber is not accompanied by a full noun phrase, but rather by a direct-object pronoun. In order to investigate whether participants establish verb agreement with these pronouns, six tokens of presentational haber + plural object pronoun were included in the questionnaire. Second, aspectual and modal auxiliary constructions and the subjunctive tenses also occur rather infrequently in unscripted spoken language. As is shown in Tab. 13, the questionnaire-reading task includes multiple tokens of them. As was the case with the story-reading task, multiple tenses, types of nominal arguments, and both negative and affirmative sentences were included in the questionnaire, but an equal representation of all combinations between these predictors proved unfeasible. In this regard, Tab. 13 shows that the present and preterit tense represent a quarter of the tokens. When it comes to the noun phrases, cases with animate-reference nouns make up 21 of the 41 presentational haber tokens. 11 out of the 41 tokens involve negation.

5.3 Transcription, selection of cases, and envelope of variation

This section focuses on the procedures that were followed while processing the data. Particularly, Section 5.3.1 describes the way the recording sessions were transcribed. Subsequently, Section 5.3.2 introduces the decisions that were taken while selecting and coding the cases of presentational haber + plural noun phrase. Section 5.3.3 describes the forms that were considered for analysis. In variationist linguistics, this is called the ‘envelope of variation’.

5.3.1 Transcription

Once the fieldwork was completed, the 72 recording sessions were transcribed in their full length using Microsoft Word 2011 for Mac and VideoLan Media Player. During this phase, two potential difficulties for the correct transcription of the (agreeing) cases of presentational haber + plural nominal argument were identified. First, Cuban, Dominican, and Puerto Rican Spanish feature three main allophones for the nominal plural marker /-s/: the alveolar sibilant [-s], the laryngeal fricative [-h], and a zero allophone (López-Morales 1983: Chap. 3, 1992: 77–100; Terrell 1979, 1982). At first glance, the latter could be problematic, as it could lead to the incorrect interpretation of plural nouns as singular nouns. However, research into this matter has shown that, in the majority of the cases,


nominal plurality is redundantly marked at multiple sites in the noun phrase and that speakers draw on cultural, phonological, pragmatic, and semantic information to resolve the intended number (López-Morales 1983: 55–57, 1992: 91–93; Poplack 1984: 222). For instance, in example (29), the noun phrase tantos cafés y bares ‘so many cafés and bars’ features three possible sites to mark plurality with [-s]: tantos, cafés, and bares. Of these three, the last marks plurality unequivocally even without [-s] by the addition of plural /-e/ to the stem /bar/. The plurality of the nominal can also be inferred from the coordination of the nouns cafés and bares and from the meaning of the indefinite quantifier tantos ‘so many’. When such disambiguating information is not available, plural /-s/ is rarely realized as zero (Poplack 1984: 210). Therefore, the phonetic variation of /-s/ does not seem to pose severe methodological challenges.

(29) Son años y como aquí hay tantos cafés y bares y, tú sabes, uno ha estado noches y noches, y horas y horas, y conversando sobre temas, y temas y temas y… (SJ12M12/SJ1391).
‘I’ve been around here for years and since there is so many cafés and bars around here, and you know, one has been out here for nights and nights, and hours and hours, and talking about topics, and topics, and topics, and…’

Second, Caribbean Spanish features three allophones for the third-person plural morpheme /-n/: the alveolar nasal [-n], the velar nasal [-ŋ], and a phonologically null variant that is manifested as the backward nasalization of the preceding vowel (López-Morales 1983: 106, 1992: 121). Of these three, the alveolar and the velar nasal are the most frequent realizations in the varieties of Havana, Santo Domingo, and San Juan (López-Morales 1983: 109–110, 1992: 123–125).
Therefore, in the vast majority of the cases, a clearly audible contrast exists between the absence and presence of /-n/, especially for verbs, for which /-n/ is almost never realized as the null variant (Poplack 1984: 222). However, for tokens followed by a nasal consonant, it proved difficult to differentiate the zero allomorph from cases of nasalization caused by assimilation with that consonant. In order to transcribe these cases correctly, I first slowed down the playback of the sound file to 10% of the original speed. Then, I compared the participant’s pronunciation of the target form followed by a nasal consonant with her/his pronunciation of a zero plural followed by a non-nasal consonant. This showed that, in the latter case, the vowel is already markedly nasal from the onset, whereas, in the former, it only becomes nasalized towards the onset of the consonant. This, in turn, helped to identify the absence or

presence of /-n/. However, in spite of these efforts, the data might still display a very small margin of transcription error. Finally, in order to make sure that all tokens of presentational haber had been transcribed correctly, I checked all the transcriptions against the sound files. Whenever disagreement emerged between the forms I heard the first and the second time, I marked the timing of the token. After transcribing all the interviews, these tokens were checked once more.

5.3.2 Selection of cases
While searching for tokens in the transcription files, it became evident that participants hesitate frequently while completing the story- and questionnaire-reading tasks. This leads them to provide multiple contradictory responses to the same item, as is shown in example (30).
(30) Qué raro, esta mañana no, no había, habían más carros que otros domingos (SJ01M22/SJ161-SJ162).
‘How strange, this morning there wasn’t, there weren’t more cars than on other Sundays.’
Therefore, a selection principle was established: only the participants’ final answers were taken into account for the quantification. However, when the speaker repeated the same variant multiple times, all the tokens of that particular variant were quantified. For instance, for example (31), two tokens of habrán ‘there will bePL’ were coded for analysis.
(31) No es tu culpa tuya, es que siempre habrán unas per, habrá, habrá personas, habrán unas personas malas (SD02H21/RD275-RD278).
‘It’s not your fault, it’s that there will always bePL some pers, there will beSG, there will beSG, there will bePL some bad people.’
Let us now turn to the contexts that are considered variable in this study, which will be the topic of the next section.
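The selection principle of Section 5.3.2 can be sketched in a few lines of Python. This is only one possible reading of the rule, not the procedure actually used for coding: the function name select_tokens and the list representation of a response are assumptions made for illustration.

```python
def select_tokens(responses):
    """Return the tokens to quantify for one item.

    `responses` lists, in order of production, every form of presentational
    haber that the participant uttered for the item (hypothetical format).
    The final answer determines the variant; every token of that same
    variant is kept, and tokens of the abandoned variant are discarded.
    """
    if not responses:
        return []
    final_variant = responses[-1]
    return [form for form in responses if form == final_variant]


# Example (30): the speaker corrects 'había' to 'habían'.
print(select_tokens(["había", "habían"]))
# Example (31): 'habrán' is uttered twice, with 'habrá' in between.
print(select_tokens(["habrán", "habrá", "habrá", "habrán"]))
```

Under this reading, example (30) yields one token of habían and example (31) yields two tokens of habrán, which matches the coding reported above.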

5.3.3 Envelope of variation
In general, all contexts in which third-person agreeing or non-agreeing presentational haber is followed by a plural noun phrase, including coordinated singular nouns, are considered variable. This includes cases of presentational haber

followed by gente ‘people’ when this noun is used as a plural count noun, as in example (32), meaning ‘persons, individuals’ (Aleza-Izquierdo 2011: 41–42; Real Academia Española 2005: s.v. gente).
(32) Habrán gentes que lo hagan (SD05H11/RD594).
‘There will bePL persons who do it.’
This also includes the present-tense forms hay-hayn. As we will see in Section 7.3, this is motivated by the fact that the corpus provides 53 tokens of the vernacular plural hayn, which had already been documented in earlier investigations of Antillean Spanish (Holmquist 2008: 28; Vaquero 1996: 64). Therefore, if we want to follow the important ‘Principle of Accountability’⁵, the alternation between hay and hayn cannot be excluded from the scope of this investigation. In contrast, first-person plural haber (see example [33]) and the agreement variation displayed by the modal construction ‘haber que + infinitive’ (see example [34]) are considered to be outside of the envelope of variation, even though these have also been treated as instances of presentational haber pluralization in some surveys (e.g., DeMello 1991; Freites-Barros 2008; Holmquist 2008; Quintanilla-Aguilar 2009).⁶
(33) Y habíamos bastantes, bastantes estudiantes en, e, los salones de clase (SJ03H22).
‘And we were plenty, plenty of students in, er, the classrooms.’
(34) Estamos trabajando y hay que hacer unas chapitas, ¿no? Entonces, mientras más rápido era mejor, porque habían que pasarlas por varias etapas y eran cantidades (Real Academia Española 2008b-, Spoken, Puerto Rico).
‘We are working and one has to make badges, right. Well, it was the faster the better, because they had to pass them through multiple stages and they were many.’
This is motivated by the fact that these two constructions do not refer to exactly the same conceptualization as third-person agreeing and non-agreeing presentational haber, which is why they will not compete for activation with these variants.
For first-person plural presentational haber, the difference is rather subtle. It lies in the fact that first-person plural presentational haber includes the speaker in the presentatum (see example [33]), whereas this is not the case for third-person agreeing presentational haber (see example [35]). As a result, first-person plural haber does not alternate with third-person singular presentational haber, but rather with first-person plural ser (‘to be’) or estar (‘to be in a place’).
(35) Y habían bastantes, bastantes estudiantes en, e, los salones de clase (constructed example).
‘And there werePL a lot, a lot of students in, er, the classrooms.’
Finally, in the case of haber que + infinitive, the contrast with presentational haber is quite clear, because this construction does not function as a presentational, but rather as a deontic modal construction. Let us now take a look at the statistical tools that will be used to analyze the data in Chapter 7, Chapter 8, and Chapter 9.

5 The Principle of Accountability states that all occurrences of the alternation have to be included in the analyses (Labov 1972: 72).
6 My corpus does not provide any example of haber que pluralization. The twentieth-century section of Davies (2002-) only includes four tokens of agreeing haber que against a total of 7,429 tokens of the construction. This suggests that the phenomenon is rather infrequent.

5.4 Statistical toolkit
Once all the cases of presentational haber + plural noun phrase had been selected and coded for the relevant linguistic, social, and individual features, three statistical tools were used to investigate the hypotheses: mixed-effects logistic regression, conditional inference tree models, and conditional variable permutations in random forest models. Since these are fairly innovative techniques, let us consider them briefly, beginning with mixed-effects logistic regression. However, first, I should introduce the statistical terminology that will be used in the chapters to follow.

5.4.1 Statistical terminology
In variationist sociolinguistics, researchers often speak of GoldVarb or VARBRUL analysis when referring to logistic regression, and terms such as ‘constraints’, ‘factor groups’, and ‘factors’ are commonly used to refer to independent variables. This discipline-specific terminology has its roots in the fact that variationist sociolinguists pioneered logistic regression, in the form of the VARBRUL software developed by David Sankoff in the 1970s (see Tagliamonte 2016: Chap. 6 for the historical context). Since variationists were early adopters, they created their own terminology, which remained widespread after logistic regression became more common across disciplines. In contrast, cognitive linguists, corpus linguists, and psycholinguists have adopted the use of regression analysis far more recently and mostly via disciplines outside of the field of linguistics. Therefore, these researchers are more likely to use the more generic statistical terms ‘independent variable’, ‘predictor’, and ‘regressor’, while reserving ‘factor’ for a particular type of independent variable, namely, the categorical one. To ensure that the argumentation benefits as large an audience as possible, I will follow the latter tradition, meaning that I will speak of ‘independent variables’, ‘regressors’, and ‘predictors’ to indicate the linguistic and social features that are included in the analyses to test the hypotheses. I will reserve the term ‘factor’ for categorical predictors, and I will use the term ‘levels’ to refer to the different values such categorical predictors may take. Before we go on to discuss the first statistical tool, mixed-effects logistic regression, two additional terms should be introduced: ‘fixed effects’ and ‘random effects’. The fixed effects of an analysis are predictors that are used to model the hypotheses. They represent aspects of the analysis that are reproducible with other datasets, such as the animacy of referents, the tense of the verb, or the gender of speakers. In turn, the random effects are predictors that model aspects of the analysis that are inherently tied to the specific sample, which cannot be reproduced with other data. For example, in linguistics, we may often find a situation in which the data include particular words, which need not appear in other datasets, and individuals, who almost certainly will not be represented in other datasets.

5.4.2 Mixed-effects logistic regression models
5.4.2.1 Fixed-effects vs. mixed-effects regression models
Sociolinguistic data are always drawn “from the production of individuals, inevitably from less than ideally distributed datasets, and with innumerable cross-cutting social and linguistic factors” (Tagliamonte 2012: 139). When modeling these data, one has to take their internal structure into account. Additionally, since the goal of this volume is to provide a psychologically plausible account of the constraints that govern morphosyntactic variation, the statistical model should be able to accommodate two basic principles of usage-based linguistics: language includes both abstract schemas and highly specific exemplars, and it derives from individual speakers’ accumulated sociolinguistic experience, resulting in a slightly (or, even, considerably) different grammar for each speaker (Bybee 2006: 716–717, 2010: Chap. 1; Bybee and Beckner 2010: 832–833; Dąbrowska 2012, 2013, 2015). This means that generalized linear models (e.g., VARBRUL, GoldVarb, or glm in R; R Core Team 2016) are not entirely appropriate to model sociolinguistic

data, both for reasons of data structure and for theoretical-linguistic reasons (Gries 2013a). Regarding the former, generalized linear models assume that every token is a completely independent piece of data that is only connected to other tokens through the fixed effects it instantiates (e.g., Cedergren and Sankoff 1974). This is almost never the case for natural language data, in which tokens can be grouped together in terms of the speakers that have contributed the data and the words that appear in the tokens (Johnson 2009; Tagliamonte 2012: 130, 137; Tagliamonte and Baayen 2012: 142–146). Regarding the latter, generalized linear models assume that predictors are ‘rules’ that apply uniformly across the board to the language production of a speech community constituted by internally completely homogenous groups, an assumption shared with sociolinguistic theory (Cedergren and Sankoff 1974; Labov 1972: 120). This is less reminiscent of the principles of usage-based linguistics than it is of the idea of an “ideal speaker-listener, in a completely homogeneous speech community, who knows its language perfectly and is unaffected by … grammatically irrelevant conditions” (Chomsky 1965: 3–4). In contrast, the generalized linear mixed-effects regression algorithms implemented in the software package lme4 (Bates et al. 2016) for R allow us to model that certain words or certain speakers might favor a variant over and above (or under and below) the linguistic or social categories they instantiate. With this software package, words and speakers can be incorporated in the regression equation as random effects in two distinct ways. The first, most straightforward way is to incorporate them as ‘random intercepts’. This means that the regression function calculates an identical regression curve for the fixed effects for each individual speaker and word that is specified, taking into account the overall preference of that speaker or word for a particular outcome. 
This results in a series of parallel regression curves, which are situated lower or higher along the y-axis, depending on the overall preference of the speaker or the word for a particular variant. The second way consists in allowing the intercepts and the slopes of the regression curves (their steepness) to vary according to the individual speakers and words, resulting in a series of intersecting regression curves. This sort of model, called a ‘random slopes model’, not only allows us to take into account that certain speakers or words favor (or disfavor) a particular variant over and above the fixed effects they instantiate, but also allows us to recognize that certain speakers or words are influenced more or less strongly by certain constraints, as has become clear in recent research on language processing (e.g., Dąbrowska 2013, 2015; Farmer, Misyak, and Christiansen 2012), variation (e.g., Forrest 2015), and change (e.g., Bybee 2001, 2010). Therefore, for the three datasets, following Baayen, Davidson, and Bates (2008),

I included the individual speakers and the lemmas of the nouns that occur with presentational haber in the models as crossed random intercepts (i.e., the model assumes that lemmas are independent from speakers) and I evaluated whether random slopes were appropriate.
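The difference between random intercepts and random slopes can be made concrete with a small numerical sketch. All values below are hypothetical and chosen only for illustration; they are not estimates from the corpus, and the code does not fit a model. It merely evaluates the linear predictor on the log-odds scale for two invented speakers.

```python
# Hypothetical fixed effects on the log-odds scale (illustrative values only).
INTERCEPT = -0.5
SLOPE = 1.2  # effect of a binary predictor x (0 = absent, 1 = present)

# Hypothetical speaker-specific adjustments.
speaker_intercepts = {"speaker_A": 0.8, "speaker_B": -0.6}
speaker_slopes = {"speaker_A": 0.5, "speaker_B": -0.9}


def log_odds_random_intercept(speaker, x):
    """Same slope for everyone; only the height of the line differs."""
    return INTERCEPT + speaker_intercepts[speaker] + SLOPE * x


def log_odds_random_slope(speaker, x):
    """Both the height and the steepness of the line differ per speaker."""
    return (INTERCEPT + speaker_intercepts[speaker]
            + (SLOPE + speaker_slopes[speaker]) * x)


def gap(model, x):
    """Distance between the two speakers' lines at a given value of x."""
    return model("speaker_A", x) - model("speaker_B", x)


for x in (0, 1):
    print(x, gap(log_odds_random_intercept, x), gap(log_odds_random_slope, x))
```

With random intercepts only, the distance between the two speakers stays the same for every value of x (parallel lines). Once slopes may vary, the distance changes with x, which is what makes it possible to represent speakers who are more or less sensitive to a constraint.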

5.4.2.2 Selection
Choosing whether or not to include random slopes forms part of a broader methodological process, which is called ‘model selection’ in statistics (see Burnham and Anderson 2002: 35–37 for an overview). Essentially, model selection refers to a procedure that takes all theoretically plausible combinations of predictors as its input and returns the most parsimonious predictor combination (or ‘model’) for the data. In this regard, the literature favors selecting models using a combination of theoretical argumentation and Akaike Information Criterion with correction for small sample sizes (AICc) (e.g., Anderson, Burnham, and Thompson 2000: 917–918; Burnham and Anderson 2002: 66, 2004: 267–270; Harrell 2001: 56–58). AICc is a sample-size adjusted measure that expresses how useful the information provided by the candidate model is for predicting the outcome (Anderson, Burnham, and Thompson 2000: 916–917; Burnham and Anderson 2004: 267–270; Harrell 2001: 202). When we add a regressor to the model that does not contribute information that helps to predict the outcome, the AICc rises. As a rule of thumb, the model that has the lowest AICc offers the best tradeoff between model complexity and fit (Anderson, Burnham, and Thompson 2000: 917), provided that all predictors are theoretically motivated (Anderson, Burnham, and Thompson 2000: 916; Burnham and Anderson 2002: 17, 333). To select a parsimonious model, I started out with full models including both the random intercepts and all the fixed effects I hypothesized to have an impact on presentational haber pluralization. Then, I recombined the fixed effects into all possible subsets with the pdredge function of the MuMIn software package for R (Bartoń 2016). The output of this computation-intensive ‘dredging’ procedure is a list of candidate models ordered by their AICc score. The model with the lowest AICc value was selected as the basis for the final model.
Subsequently, I evaluated for which of the predictors interaction terms and random slopes were appropriate. To do so, I started adding interactions and random slopes one by one. If the addition of the interaction or random slope lowered the AICc value of the model by two units or more (Burnham and Anderson 2002: 70) with respect to the model without the added information, I was prepared to include the interaction or random slope in the final model, provided that the model converged (i.e., provided that the regression function could calculate a result) and that the inclusion of the slope or interaction did not result in overfitting (i.e., a model that included more predictors than the data could support).
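The logic of this selection procedure can be illustrated with a minimal sketch. It assumes the standard definition AICc = −2·logLik + 2k + 2k(k + 1)/(n − k − 1), where k is the number of estimated parameters and n the number of observations. The function names and the predictor names are hypothetical, and the code neither fits models nor reproduces the MuMIn implementation.

```python
from itertools import combinations


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n tokens."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def all_subsets(predictors):
    """Every combination of fixed effects, from the empty set to the full set."""
    return [subset
            for size in range(len(predictors) + 1)
            for subset in combinations(predictors, size)]


def worth_adding(aicc_without, aicc_with, threshold=2.0):
    """True if the extra term lowers AICc by at least `threshold` units."""
    return (aicc_without - aicc_with) >= threshold


# Hypothetical predictor names, used only to show how fast the search grows.
predictors = ["tense", "animacy", "priming"]
print(len(all_subsets(predictors)))   # 2 ** 3 candidate fixed-effect sets

# Hypothetical log-likelihoods for a model without and with one extra term.
without_term = aicc(log_likelihood=-410.0, k=5, n=800)
with_term = aicc(log_likelihood=-407.0, k=6, n=800)
print(worth_adding(without_term, with_term))
```

Because the number of candidate models doubles with every additional predictor, the dredging step becomes computation-intensive quickly, which is why the interactions and random slopes are evaluated afterwards, one at a time.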

5.4.2.3 Evaluation
Once I had established reasonable models, I evaluated their quality. A first statistic that can be used to investigate this is the bootstrapped 95% confidence interval. This statistic expresses the range or interval within which we can be 95% sure that the true population effect of a level of a predictor falls (Gries 2013b: 351). Optimally, this interval should be narrow and the regression estimate generated by the model should fall right in the middle of it (Levshina 2015: 168–169). When this is not the case, the model may be overfit or may include collinear (i.e., overlapping) predictors. To compute the intervals, I used the parametric bootstrap percentile method implemented in the confint function of the lme4 software package. This function refits the model a user-specified number of times (in this case, 1,000 repetitions were used) on an equal number of subsamples. These subsamples are established by randomly drawing data points from the original dataset, allowing the same observations to be selected more than once. Thus, the procedure refits the model to a large number of samples that may be very different from the original dataset, so that it comes close to replicating the same analysis on a large number of new samples. After the replications, for each of the predictors, the algorithm orders the estimates generated by the different models by their values and it selects the 25th and the 975th estimate as, respectively, the lower (2.5%) and the upper (97.5%) limits of the confidence interval. While bootstrap confidence intervals provide a convenient way to estimate the population effect of a level of a predictor while also guarding against overfitting and severe multicollinearity (Harrell 2001: Chap. 5.2), less severe multicollinearity may not result in wide confidence intervals. Therefore, I also calculated the variance inflation factors of the models, using the vif function of the car software package for R (Fox and Weisberg 2016).
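The percentile step described above (order the 1,000 estimates and take the 25th and the 975th) can be sketched as follows. The sketch resamples observations with replacement and applies the procedure to a simple proportion instead of a regression coefficient, so it illustrates the principle only. It is not the lme4 implementation, and the data are invented.

```python
import random


def mean(values):
    return sum(values) / len(values)


def percentile_bootstrap(data, statistic, repetitions=1000, seed=1):
    """95% percentile interval from `repetitions` resamples of `data`."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(repetitions)
    )
    lower_rank = repetitions * 25 // 1000    # 25th estimate for 1,000 runs
    upper_rank = repetitions * 975 // 1000   # 975th estimate for 1,000 runs
    return estimates[lower_rank - 1], estimates[upper_rank - 1]


# Hypothetical sample: 60 agreeing (1) and 40 non-agreeing (0) tokens.
tokens = [1] * 60 + [0] * 40
lower, upper = percentile_bootstrap(tokens, mean)
print(lower, mean(tokens), upper)
```

For a regression model, `statistic` would be replaced by a function that refits the model and returns the coefficient of interest, which is what makes the procedure costly.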
The variance inflation factors express the amount of collinearity between the levels of predictors; their values should remain below five (Harrell 2001: 64–66). I also checked for overdispersion, which indicates a situation in which the model leads us to expect less variability than the amount of variance that is observed in the data. Overdispersion can be detected by comparing the sum of the squared Pearson residuals of the model⁷ with the residual degrees of freedom.⁸ The sum of the squared Pearson residuals should be lower than the number of residual degrees of freedom (Speelman 2014). Finally, after establishing that none of the above issues applied to the models, I assessed their discriminative abilities and overall fit. To this end, I calculated the C-index of concordance (with the somers2 function of the Hmisc package; Harrell 2016), which is a measure of the discriminative ability of the model. Hosmer and Lemeshow (2000: 162) propose a classification of C-index values where 0.5 corresponds to chance-level discrimination, values above 0.8 suggest ‘excellent discrimination’, and values above 0.9 indicate ‘outstanding discrimination’. To gauge the overall fit of the regression models, I determined the amount of variance they account for, in particular Nakagawa and Schielzeth’s (2013) conditional pseudo-R² (calculated with r.squaredGLMM from the MuMIn package). Higher values are suggestive of increasingly better fits.
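The overdispersion check can be illustrated with a short sketch. It assumes the usual definition of the Pearson residual for a binomial model, (y − p̂) / √(p̂(1 − p̂)), and it takes the predicted probabilities as given; the observations, the probabilities, and the parameter count below are invented.

```python
import math


def pearson_residuals(observed, predicted):
    """Pearson residuals for binary outcomes and predicted probabilities."""
    return [(y - p) / math.sqrt(p * (1.0 - p))
            for y, p in zip(observed, predicted)]


def dispersion_ratio(observed, predicted, n_parameters):
    """Sum of squared Pearson residuals divided by residual degrees of freedom."""
    chi_square = sum(r * r for r in pearson_residuals(observed, predicted))
    residual_df = len(observed) - n_parameters
    return chi_square / residual_df


# Hypothetical tokens (1 = agreeing haber) and hypothetical model predictions.
observed = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [0.8, 0.3, 0.6, 0.7, 0.2, 0.4, 0.9, 0.1]
ratio = dispersion_ratio(observed, predicted, n_parameters=2)
print(round(ratio, 3))
```

A ratio below 1 corresponds to the situation described above, in which the sum of the squared residuals is lower than the residual degrees of freedom.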

7 The observed value of each data point, minus the value predicted for that data point, divided by the standard deviation that the model expects for that data point.
8 The number of observations, minus the number of predictors in the model.

5.4.2.4 Presentation
Before concluding this section, let us briefly consider how the regression results will be presented in Chapter 7, Chapter 8, and Chapter 9. For this study, coefficients were computed with ‘sum contrasts’. This means that the regression estimates express the deviation caused by the predictor levels with respect to the overall mean likelihood of obtaining agreeing presentational haber, which is expressed by the ‘model intercept’. Positive coefficients indicate that agreeing presentational haber is more likely to occur when the predictor level is present, negative coefficients express that non-agreeing presentational haber is preferred when the predictor level is present, and zero is neutral. The further the coefficient deviates from zero, the larger the effect. At the bottom of the tables, I will provide the AICc of the models, the C-index of concordance, and the conditional pseudo-R². Since this investigation uses mixed-effects models, we will want to know whether the bulk of the explained variability is due to the linguistic and social predictors or rather to the random effects. For this reason, I will provide these measures of model fit for both the full mixed-effects models and for simpler models that only include the fixed predictors. The regression tables will be offered at the beginning of Chapter 7 and Chapter 8. To assist the reader, throughout the discussion, I will also introduce effect plots, generated with the ggplot2 software package for R (Wickham and Chang 2016). These plots represent the effects of the levels of the different predictors on the log-odds scale with a dot that is connected by a straight line, which only serves to assist the reader in interpreting the sizes of the effects of the predictors. Let us turn now to the conditional inference tree models that will be presented in Chapter 7, Chapter 8, and Chapter 9 as companions to the mixed-effects regression models.
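To make the reading of sum-contrast coefficients from Section 5.4.2.4 more tangible, the following sketch converts an intercept and a set of coefficients into probabilities. It relies on one general property of sum coding: for a predictor with k levels only k − 1 coefficients are estimated, and the coefficient of the remaining level equals minus the sum of the others. The intercept, the coefficients, and the level names are hypothetical.

```python
import math


def inv_logit(log_odds):
    """Convert a value on the log-odds scale into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def complete_sum_contrasts(coefficients, implicit_level):
    """Add the level whose coefficient is implied by the sum-to-zero constraint."""
    full = dict(coefficients)
    full[implicit_level] = -sum(coefficients.values())
    return full


# Hypothetical values on the log-odds scale.
intercept = -0.40
estimated = {"level_a": 0.90, "level_b": -0.30}
full = complete_sum_contrasts(estimated, "level_c")

for level, coefficient in full.items():
    probability = inv_logit(intercept + coefficient)
    print(level, round(coefficient, 2), round(probability, 3))
```

A positive coefficient raises the predicted probability of agreeing presentational haber above the value implied by the intercept alone, and a negative coefficient lowers it.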

5.4.3 Conditional inference tree models
Although the unequally distributed datasets typically used in sociolinguistic research are “the epitome of the type of data that mixed models are designed to handle” (Tagliamonte 2012: 139–141), mixed-effects regression models may become less accurate when the data are distributed highly unevenly across regressor levels and represent multiple interactions between predictors and/or empty data cells (Baayen 2014: 363–364; Levshina 2015: Chap. 14). This sort of data structure is often present in sociolinguistic corpora and it is almost impossible to get around it once the analysis is narrowed down to the individual speakers (e.g., Guy 1980). Therefore, for our present purposes it will be useful to combine mixed-effects logistic regression with another statistical approach that rests upon completely different distributional assumptions (Baayen 2014: 364; Tagliamonte and Baayen 2012: 161). If we achieve similar results with both approaches, we can be more confident that they are not due to distributional biases. Additionally, although a mixed-effects model provides insight into the influence of individual predictors while taking all others and intergroup variation into account, it says little about the way these predictors jointly determine speakers’ behavior (Tagliamonte and Baayen 2012: 163). These two concerns can be addressed at the same time with conditional inference tree models (Baayen 2014: 364; Tagliamonte and Baayen 2012: 161, 164), which can be generated in R with the ctree function of the software package party (Hothorn et al. 2016). According to Baayen,

[c]onditional inference trees estimate a regression relationship by means of binary recursive partitioning. The ctree algorithm begins with testing the global null hypothesis of independence between any of the predictors and the response variable. The algorithm terminates if this hypothesis cannot be rejected. Otherwise, that predictor is selected that has the strongest association to the response, as measured by a p-value corresponding to a test for the partial null hypothesis of a single input variable and the response. A binary split in the selected input variable is carried out. These steps are recursively repeated until no further splits are supported (Baayen 2014: 364).

In the conditional inference tree models that will follow in Chapter 7, Chapter 8, and Chapter 9, the ovals represent the predictors. The higher a node is located in the tree, the stronger it conditions the competition between the presentational haber constructions. The branches that go down from the nodes represent the binary split the algorithm has established in the data. At the bottom, bar plots (the ‘leaves’ of the tree) represent the proportion of agreeing presentational haber in dark gray. For these models, I will also provide the C-index of concordance. As for the regression models, C = 0.5 represents the level of chance and higher values suggest increasingly better discriminative abilities.
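The C-index reported for the regression and tree models can be computed directly from its definition: the proportion of pairs consisting of one agreeing and one non-agreeing token in which the agreeing token received the higher predicted probability, with ties counted as one half. The sketch below follows that definition; the function name and the example values are illustrative assumptions, and the code is not the somers2 implementation.

```python
def c_index(outcomes, probabilities):
    """Concordance index for binary outcomes (1 = agreeing, 0 = non-agreeing)."""
    positives = [p for y, p in zip(outcomes, probabilities) if y == 1]
    negatives = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one token of each variant")
    concordant = 0.0
    for positive in positives:
        for negative in negatives:
            if positive > negative:
                concordant += 1.0
            elif positive == negative:
                concordant += 0.5   # ties count as half-concordant
    return concordant / (len(positives) * len(negatives))


# A model that ranks every agreeing token above every non-agreeing token.
print(c_index([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))
# A model that cannot tell the variants apart.
print(c_index([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

The first call illustrates perfect discrimination and the second the chance level of 0.5 mentioned above.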

5.4.4 Conditional variable permutation of predictors in random forest models
Following Tagliamonte and Baayen (2012), I will gauge the relative impacts of the predictors with a random forest model of the variation. According to Baayen, this type of statistical models

unite a large number of conditional inference trees, resulting in a (random) forest of conditional inference trees. Each tree in the forest is grown for a subset of the data generated by randomly sampling without replacement from observations and predictors. The predictions of the random forest are based on a voting scheme for the trees in the forest: each tree in the forest provides a prediction about the most likely class membership, and the class receiving the majority of the votes is selected as the most probable outcome (Baayen 2014: 366).

In R, random forests can be grown with the function cforest of the software package party. Once we have a random forest model of the variation, we can derive the relative importance of the different predictors by calculating the loss in prediction accuracy of the model when the levels of a predictor are randomly permuted, breaking the associations between the dependent variable and the levels of the predictor. This can be achieved with the function varimp of the same software package. The greater the loss in prediction accuracy, the more important a predictor is (Baayen 2014: 366; Tagliamonte and Baayen 2012: 160). Let us now consider the way these quantitative data will be compared.
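The idea behind permutation-based variable importance can be shown with a toy example. Two simplifications are assumed and should be kept in mind: a simple majority-vote lookup table stands in for the random forest, and the permutation is unconditional, whereas the varimp function of party can also permute conditionally on the other predictors. The predictor names and the data are invented.

```python
import random
from collections import Counter, defaultdict


def fit_lookup_model(rows, outcomes):
    """Toy classifier: predict the majority outcome of each predictor combination."""
    votes = defaultdict(Counter)
    for row, y in zip(rows, outcomes):
        votes[tuple(sorted(row.items()))][y] += 1
    overall = Counter(outcomes).most_common(1)[0][0]
    table = {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}
    return lambda row: table.get(tuple(sorted(row.items())), overall)


def accuracy(model, rows, outcomes):
    return sum(model(row) == y for row, y in zip(rows, outcomes)) / len(rows)


def permutation_importance(model, rows, outcomes, predictor, rng):
    """Loss in accuracy after shuffling the values of one predictor."""
    baseline = accuracy(model, rows, outcomes)
    shuffled = [row[predictor] for row in rows]
    rng.shuffle(shuffled)
    permuted = [dict(row, **{predictor: value})
                for row, value in zip(rows, shuffled)]
    return baseline - accuracy(model, permuted, outcomes)


# Invented data: the outcome depends on 'informative' and ignores 'noise'.
rows = [{"informative": a, "noise": b} for a in (0, 1) for b in (0, 1)] * 50
outcomes = [row["informative"] for row in rows]

model = fit_lookup_model(rows, outcomes)
rng = random.Random(1)
importance = {name: permutation_importance(model, rows, outcomes, name, rng)
              for name in ("informative", "noise")}
print(importance)
```

Shuffling the informative predictor breaks its association with the outcome and lowers the accuracy substantially, whereas shuffling the noise predictor leaves the accuracy unchanged, so the former is ranked as the more important predictor.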

5.5 Comparative sociolinguistics
Tagliamonte (2013: 130) states that “similarities and differences in the significance, strength, and ordering of constraints” (i.e., predictors) “provide a microscopic view of the underlying grammatical system”. Therefore, to compare the constraints that govern morphosyntactic variation across speech communities, we may perform parallel regression analyses on datasets gathered from different speech communities and compare the significance of the predictors, their effect sizes and directions, and their relative contribution to explaining the variation (Tagliamonte 2013). This is called the ‘comparative sociolinguistic’ method. Even though this method has provided interesting insights into the behavior of different alternations across various communities (see Tagliamonte 2012: 166, 2013 for overviews), from a statistical point of view, two of its cornerstones are somewhat problematic. Firstly, it makes little sense to compare the significance of predictors for models that are fitted to different datasets, because whether or not a particular predictor reaches significance depends not only on the distribution of the data (i.e., the effect caused by the predictor), but also on the size of the sample (Anderson, Burnham, and Thompson 2000: 914–915; Hubbard and Lindsay 2008: 71; Trafimow and Rice 2009: 262–263). As an illustration, consider the simulated data in Tab. 14. For both the large and the small sample, the effect size (Phi-coefficient) is exactly the same, but because the sample size varies between the two simulations, the effect is only significant for the larger sample.

Tab. 14: Simulated dataset illustrating the dependence of statistical significance upon sample size

Predictors                  Small sample      Large sample
Condition 1                 10      30        100     300
Condition 2                 20      40        200     400
Test statistics
Phi-coefficient             0.089             0.089
p-value (< Chi-squared)     p = 0.504         p = 0.006
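The figures in Tab. 14 can be recomputed with a short sketch. It assumes that the Phi-coefficient is derived from the uncorrected chi-squared statistic and that the p-values stem from a chi-squared test with Yates' continuity correction (one degree of freedom); under these assumptions the tabulated values are reproduced. The function name is hypothetical.

```python
import math


def phi_and_p(a, b, c, d):
    """Phi-coefficient and continuity-corrected p-value for a 2 x 2 table.

    The table is laid out as [[a, b], [c, d]].
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    difference = abs(a * d - b * c)
    chi_square = n * difference ** 2 / denominator
    phi = math.sqrt(chi_square / n)
    # Yates' correction shrinks the difference by n / 2 before squaring.
    corrected = n * max(0.0, difference - n / 2.0) ** 2 / denominator
    # Upper tail of the chi-squared distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(corrected / 2.0))
    return phi, p_value


small = phi_and_p(10, 30, 20, 40)
large = phi_and_p(100, 300, 200, 400)
print(round(small[0], 3), round(small[1], 3))   # 0.089 0.504
print(round(large[0], 3), round(large[1], 3))   # 0.089 0.006
```

Multiplying every cell by ten leaves the proportions, and therefore the effect size, untouched, but it moves the p-value from clearly non-significant to significant at the 0.01 level.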

A second shortcoming that can be identified is related to the way the relative contribution of variables to explaining the variation is gauged in (comparative) variationist sociolinguistics. In studies of language variation and change, it is common to find that the relative importance of predictors is examined by calculating the range between the highest and the lowest regression coefficient that is found for each predictor. While this may provide insight into the size of the effect of a particular predictor and even though predictors with large effect sizes tend to be among the most important ones for explaining the variation, this

need not be the case. For instance, for predictors with more levels, there is a greater probability of obtaining large effect ranges by chance alone. Therefore, in this book, I will adopt a slightly altered comparative sociolinguistic method to contrast the varieties of Havana, Santo Domingo, and San Juan. Rather than comparing the significance of the predictors, I will compare their information-theoretic relevance (i.e., does including this predictor decrease or increase the AICc statistic?).⁹ To assess the relative importance of predictors for explaining the variation, I will use conditional variable permutation in random forest models. Let us now summarize the most important ideas that were put forward in this chapter.

5.6 Summary
This chapter has outlined the methodological framework of this study. Most importantly, we have seen that this investigation draws on a judgment sample of three times 24 participants, equally divided over two education groups, two gender groups, and two age groups. Furthermore, I have explained that the data were collected using a combination of semi-directed interviews and two elicitation tasks. Additionally, I have highlighted the advantages of mixed-effects logistic regression and I have suggested that conditional inference tree models constitute ideal companions for this type of regression analysis. Finally, I have introduced a slightly adapted version of Tagliamonte’s (2013) comparative sociolinguistic method. In Chapter 7, Chapter 8, and Chapter 9, these statistical tools will be put to use. However, before turning to the analysis of the corpus, in the following chapter, the pragmatic, semantic, and syntactic properties of the presentational haber constructions will be discussed.

9 Even though AICc is still dependent upon sample size, tests showed that varying the number of observations only impacts the size of the increase or decrease of the statistic when predictors are added to or removed from the model, not the directionality of the change in AICc.


Chapter 6: Semantic and syntactic properties of presentational haber

6 Semantic and syntactic properties of presentational haber
In Section 4.2.1, the working hypothesis introduces the claim that presentational haber pluralization results from a spreading-activation competition between two largely synonymous variants of the presentational haber construction. However, the pragmatics, semantics, and syntax of the presentational haber constructions and their potential differences in these respects have so far remained without discussion. Therefore, this chapter will provide an overview of the characteristics of the constructions, which will allow me to identify any possible pragmatic/semantic contrasts between them. Particularly, Section 6.1 is concerned with the meaning of presentational haber. Subsequently, Section 6.2 focuses on the nominal argument. Section 6.3 deals with the status of the adverbial phrase that appears frequently with presentational haber. Section 6.4, in turn, introduces the conditions that constrain the use of implicit nominal arguments and adverbial phrases. Finally, in Section 6.5 a brief summary is presented.

6.1 The meaning of the presentational haber constructions: POINTING-OUT

Earlier research suggests that the meaning of presentational haber, like that of all presentational constructions, refers to a cognitive routine that introduces a nominal entity into discourse, asserting its existence and situating it in a mental space¹ (Bolinger 1954: 334, 1977: 92–93; Hernández-Díaz 2006: 1056; Lakoff 1987: 554; Langacker 1991: 352–353; Suñer 1982: 95). This is captured in Lakoff’s (1987) POINTING-OUT Idealized Cognitive Model:

[i]t is assumed as a background that some entity exists and is present at some location in the speaker’s visual field, that the speaker is directing his attention at it, and that the hearer is interested in its whereabouts but does not have his attention focused on it, and may not even know that it is present. The speaker then directs the hearer’s attention to the location of the entity (perhaps accompanied by a pointing gesture) and brings it to the hearer’s attention that the entity is at the specified location (Lakoff 1987: 490).

|| 1 Fauconnier defines mental spaces as “small conceptual packets constructed as we think and talk for the purpose of local understanding and action” (Fauconnier and Turner 1996: 113), and as such, they belong to the realm of working memory (Fauconnier 2007: 351). In other words, mental spaces are novel, temporal conceptualizations that organize the information speakers and hearers are presented with in usage events. This includes the base space, the common ground shared by the hearer and the speaker (Croft and Cruse 2004: 33). New mental spaces are built up dynamically in working memory by mixing fragments of other mental spaces with procedural and factual knowledge (Fauconnier and Turner 1996: 115). This process is called ‘blending’ and the output spaces are called ‘blends’ (Fauconnier 2007: 351–352).

Section 2.2 has shown that agreeing and non-agreeing presentational haber are interchangeable in every context. In this light, the working hypothesis claims that the two constructions encode the same Idealized Cognitive Model. The remainder of this chapter will try to establish whether this assumption is justified.

6.2 The nominal argument

In this section, the characteristics of the nominal arguments of agreeing and non-agreeing presentational haber will be examined in the light of examples drawn from the corpus of this study, Davies (2002-), the Internet, and Real Academia Española (2008b-). Particularly, Section 6.2.1 will be concerned with its argument role, information status, and semantic function. Then, in Section 6.2.2, its syntactic properties will be investigated.

6.2.1 Argument role, information status, and semantic function

Because the POINTING-OUT Idealized Cognitive Model only describes the act of bringing a referent “out of limbo into presence” (Bolinger 1954: 335), the nominal encodes virtually the entire conceptual import of the clause. Semantically, we can conceive of this element as being merely present in the scene that is presented through the construction. Therefore, it is probably safe to assume that it is assigned a ‘zero’ argument role (Langacker 1991: 288). Examples such as (36) and (37) show that this is the case for both agreeing and non-agreeing presentational haber. Still, because the agreeing variant encodes the noun phrase as subject, it may be hypothesized to be slightly more prominent than the noun phrase of the construction without verb agreement (Langacker 1991: Chap. 7).


(36) Después que mataron a Trujillo, pues fue Trujillo que trajo esa gente. Y trajo españoles también. Habían colonias españolas aquí (SD16H22/RD2200).
‘After they killed Trujillo, because it was Trujillo who brought those people here. And he also brought Spaniards. There were Spanish colonies here.’

(37) Pero que sí que hubo muchas, muchas casas, e, destrozadas, muchas casas, e, desaparecidas (SD23H12/RD3065).
‘But that there was many, many, er, destroyed houses, many, er, disappeared houses.’

In turn, from the POINTING-OUT Idealized Cognitive Model it follows that, in affirmative expressions, the noun phrase of presentational haber can only be interpreted as referring to a specific referent (Prince 1992: 299–300) unless, as we will see below, the presentatum is explicitly construed as a type. Again, examples such as (38) and (39) suggest that this is the case for both variants of the presentational haber construction.

(38) En Salcedo habían muchos árabes, que le decían ‘turcos’, porque Turquía, e, parece que dominaba los países árabes y tenían mucha represión (SD16H22/RD2210).
‘In Salcedo, there were many Arabs, who were called ‘Turks’, because it appears that Turkey, er, dominated the Arab countries and they had much repression.’

(39) Este, vivienda acá hay muchos condominios (SJ02M12/SJ168).
‘Er, housing, here, there is many condominiums.’

Regarding information status, POINTING-OUT places stringent constraints on the nominal argument, as this Idealized Cognitive Model implies that the noun phrase of the presentational haber constructions cannot encode information that already forms part of the hearer’s beliefs, consciousness, or world knowledge. Indeed, agreeing and non-agreeing presentational haber co-occur most often with indefinite nominal arguments, as is shown in examples (40)–(42) (Fernández-Soriano and Táboas-Baylín 1999: 1755–1756; RAE and ASALE 2009: §12.2l, §20.2g, §20.3f-h).

(40) Ahora no. Ahora no hay principios (SD02H21/RD230).
‘Not nowadays. Nowadays, there isn’t principles.’

(41) En aquella época había cinco, seis millones de cubanos (LH12H21/LH1628).
‘At that time, there was five, six million Cubans.’


(42) Y desde luego, e, cantidad, no sé qué decirte, pero me imagino que sobre todo en el campo, deben haber muchos más, muchas situaciones de esa naturaleza (SJ03H22/SJ336).
‘And, of course, er, quantity, I don’t know what to tell you, but I imagine that, mostly in the countryside, there must bePL many more, many situations of that nature.’

However, cases such as example (43) and example (44) illustrate that this does not mean that presentational (haber) expressions only allow indefinite, discourse-new nominals, as some authors have argued (e.g., Fernández-Soriano 1999: 131; Fernández-Soriano and Táboas-Baylín 1999: 1755; Freeze 1992: 557).

(43) Y habían los almuerzos, iban los tíos míos, iban los primos (SJ14H22/SJ1681).
‘And there were the lunches, my uncles went, my cousins went.’

(44) Bueno, sí, aquí en Cuba hay todas esas cosas (LH21H11/LH2866).
‘Well, yes, here in Cuba, there is all these things.’

Rather, agreeing and non-agreeing presentational haber, like English presentational there is/there are (Lakoff 1987: 545; Prince 1992: 301; Ward and Birner 1995: 740), seem to allow definite/discourse-old noun phrases, provided they refer to entities that are (or can be construed as) new to the hearer (González-Calvo 2002: 649–650; Suñer 1982: 97–100). In particular, when discussing the co-occurrence of English presentational there is/there are with definite nominal arguments, Ward and Birner (1995: 730) identify five types of definite nominals that can occur with this construction. As the list in (45) shows, many of these are also hearer-old.

(45)
1. Hearer-old entities treated as hearer-new
2. Hearer-new tokens of hearer-old types
3. Hearer-old entities newly instantiating a variable
4. Hearer-new entities with uniquely identifying descriptions
5. False definites²

Although Ward and Birner (1995) base their conclusions on English presentational there is/there are, the fact that the presentational haber constructions fulfill the same discourse function suggests that they might display similar behavior. Additionally, because “[d]ifferences in the packaging of information are perhaps the most important reason why languages have alternative ways to say ‘the same’ thing” (Goldberg 2006a: 129–130), we might find some contrasts here between the two presentational haber constructions. Let us consider this matter more closely.

|| 2 Ward and Birner (1995) use the term ‘false definites’ to refer to syntactically definite noun phrases that encode discourse-new/hearer-new information.

6.2.1.1 Hearer-old entities treated as hearer-new

The first type of definite nominal that can occur with English presentational there is/there are, according to Ward and Birner (1995), consists of hearer-old entities that are treated as hearer-new. In the literature, these are usually labeled ‘reminders’. As the label of this type of definites implies, their felicitous usage requires that the speaker assumes that the hearer has forgotten, at least temporarily, about a referent³ that has already been evoked in earlier discourse (Bolinger 1977: 115–117; Lakoff 1987: 545, 561; Ward and Birner 1995: 750). In other words, with reminders, the use of a presentational construction is licensed by the fact that the speaker assumes that the hearer has forgotten about the referent of the noun phrase, whereas the use of the definite determiner is motivated by the speaker’s expectation that the hearer can at least recognize the entity that is being reintroduced (Suñer 1982: 85; Ward and Birner 1995: 730–731). For this reason, in English, it is most common to form reminding expressions with demonstratives, as in example (46), rather than with definite articles, which would imply that the hearer is expected to recall the referent (Langacker 1991: 98; Ward and Birner 1995: 731).

(46) She is running as an insider. That is a mistake. Then there are just the stray gaffes. She said, in a famous episode, she was asked to go shake hands. And she said: “Well, actually, I. No. What would I gain by shaking hands out of Fenway Park?” Well, that is exactly what you should be doing. You’re running for office. Shake some hands. Go out and meet some people. So, there is that problem (Davies 2008-, Press).

In Spanish, reminders are usually constructed with tales ‘such’, as in examples (47) and (48), or the close-to-hearer demonstrative esas/esos ‘these’, as in examples (49) and (50) (Suñer 1982: 85–86). The examples also suggest that reminding definite noun phrases are more common in expressions involving negation.

|| 3 Observe that it is the referent that is reintroduced to the hearer, not the specific noun, such that, for example, these boys could be used in a reminding expression to reintroduce, say, little Bert and Ernie.


(47) Entonces, esto fue creando en la gente un conocimiento de que no habían tales milagros (Real Academia Española 2008b-, Fiction, Colombia).
‘Well, this began to create in the people a knowledge that there weren’t such miracles.’

(48) Nadie tiene derecho a estar provocando a otro país, enviando aviones con el pretexto de rescate. No había tales pretextos de rescate, lo que había era la promoción de las salidas ilegales del país (Real Academia Española 2008b-, Spoken, Cuba).
‘No one has the right to be provoking another country, sending aircraft under the pretext of rescue operations. There wasn’t such rescue pretexts, what there was, was the promotion of illegal exits out of the country.’

(49) Interviewer: Y ese nuevo orden enriquece la vida del hogar, fíjate. Es más televisión, más radio, más cosas dentro de la casa y enriquece la vida del hogar y por lo tanto, pues, debilita la otra.
Participant: Debilita la otra, sí. Antes no habían esas cosas y tenía que el individuo irse a la calle a, a procurarse la diversión (Davies 2002-, Spoken, Puerto Rico).
Interviewer: ‘And this new order enriches the life of the home, notice that. It is more television, more radio, more things inside the house and this enriches the life of the home, and, therefore, debilitates the other.’
Participant: ‘It debilitates the other, yes. Before there weren’t these things, and the individual had to go out on the streets to, to get diversion.’

(50) Interviewer: En los libros sobre el español de Santo Domingo se dice que de vez en cuando en el interior del país todavía se usa ‘su merced’ o, ‘vuestra merced’. ¿Usted alguna vez lo ha escuchado?
Participant: No.
Interviewer: ¿No?
Participant: No, eso no es verdad.
Participant: Yo no he oído esto, a, aquí no hay, aquí no hay ese, esos términos: ‘su merced’, ‘vuestra’. No, eso no es verdad (SD15M21/RD1908-RD1909).
Interviewer: ‘In books about the Spanish of Santo Domingo, they say that sometimes in the interior of the country they still use ‘your grace’. Have you ever heard it?’
Participant: ‘No.’
Interviewer: ‘No?’
Participant: ‘No, that’s not true.’


Participant: ‘I haven’t heard this, he, here, there isn’t, here, there isn’t that, those terms: ‘your grace’, ‘your’. No, that’s not true.’

In my data, reminding expressions are also formed with the close-to-speaker demonstrative estos ‘these’, but only with the agreeing construction, as can be seen in example (51). However, when we extend the range of data that are considered, non-agreeing tokens can also be found, as is shown in example (52).

(51) Interviewer: ¿Este, cuando se mudó aquí, habían cosas a las que tuvo que acostumbrarse?
Participant: ¿Tales cómo?
Interviewer: No sé por ejemplo. E, este, no sé, cosas.
Participant: Este, bueno, no, me tuve que acostumbrarme a vivir en un condominio cuando yo viví en una casa. ...
Interviewer: ¿Y usted recuerda como la ciudad era antes? O sea, cuando era niña.
Participant: Cuando yo era niña, sí. No habían estos condominios, desde luego (SJ01M22/SJ07).
Interviewer: ‘Er, when you moved here, were there things you had to get used to?’
Participant: ‘Such as?’
Interviewer: ‘I don’t know, for example. Er, er, I don’t know, things.’
Participant: ‘Er, well, no, I had to get used to living in a condominium, when I had always lived in a house.’ …
Interviewer: ‘And do you remember what the city was like before? That is to say, when you were a girl.’
Participant: ‘When I was a girl, yes. There weren’t these condominiums, of course.’

(52) Estudiaba el que tenía plata, el que no, no podía estudiar. Qué diferencia ¿no? Esos bochinches estudiantiles. … Yo entiendo que no es lo mismo. En ese tiempo Mérida contaba cuatro, cinco mil habitantes, era un pueblito. Y no había estos bochinches porque no había nadie, era el trabajo, todo el mundo pegado al trabajo y esas cosas, y nadie estaba pensando en hacerle mal al otro (Real Academia Española 2008b-, Spoken, Venezuela).
‘Those who had money studied, those without could not study. What a difference, right? These student riots. … In my understanding, it’s not the same. At that time Merida had four, five thousand inhabitants, it was a small town. And there wasn’t these riots because there was nobody, everyone was occupied with their jobs and these things, and no one was thinking about doing harm to another.’


6.2.1.2 Hearer-new tokens of a hearer-old type

Ward and Birner (1995: 732–733) observe that the definite noun phrases of English presentational there is/there are constructions can also be licensed if the noun phrase introduces a hearer-new instance of a known or inferable type. As various authors point out, this reading requires an adjective that construes the noun phrase in this way (Lakoff 1987: 546; Ward and Birner 1995: 732–733). With Spanish presentational haber, this function may be fulfilled by adjectives such as, among others, mismas/mismos ‘same’ (see examples [53] and [54]), necesarias/necesarios ‘necessary’ (see examples [55] and [56]), obligatorias/obligatorios ‘obligatory’, and suficientes ‘sufficient’ (see examples [57] and [58]) (RAE and ASALE 2009: §15.6l-ñ; Torrego-Salcedo 1999: 1795).

(53) Ya en Azteca habían los mismos comentarios acerca de su manera prepotente y payasa (Internet, Message board, Mexico, http://goo.gl/9RWHLP).
‘Already in Azteca⁴, there were the same comments about her overbearing and clownish way.’

(54) En las paredes había los mismos mapas de acrílico transparente con las fronteras en negro (Internet, Magazine, Argentina, http://goo.gl/I8auJE).
‘On the walls there was the same maps of transparent acryl with the borders in black.’

(55) Por motivos de economía, en México nunca han habido los filtros necesarios para tamizar esos testimonios (Internet, Press, Mexico, http://goo.gl/Oh2Kp6).
‘For economical reasons, in Mexico, there have never been the necessary filters to sift these testimonies.’

(56) Se habló de que por lo menos unos 200 mil trabajadores en estas condiciones pasarían a las filas de la formalidad, cosa que no ha sucedido en gran parte por la baja en el crecimiento económico y porque no ha habido los incentivos necesarios (Internet, Press, Mexico, http://goo.gl/HDZDVy).
‘There was talk that at least some 200 thousand workers in these conditions would transition to the formal sector, something that has not happened, to a large extent because of the decrease in the economic growth and because there has not been the necessary incentives.’

|| 4 A Mexican soap opera production house.


(57) Simplemente no habían los suficientes trabajadores estadounidenses para recoger las cosechas a precios que las hubiesen hecho rentables (Internet, Blog, Ecuador, http://goo.gl/HUfGvb).
‘Simply there weren’t the necessary amount of American workers to collect the harvests at prices that would have made them profitable.’

(58) Al parecer, ese día, ya había los suficientes equipos como para dar paso a la primera fecha del torneo (Internet, Press, El Salvador, http://goo.gl/jv9IIQ).
‘As it appears, that day, there was already the necessary amount of teams to proceed with the first date of the tournament.’

Anaphoric pronouns, as in example (59), are also interpreted as new instances of a referent type evoked earlier. Here, we would expect to find only the non-agreeing construction, because the pronouns that appear with presentational haber are accusatives. However, as we will see in Section 7.4, the corpus also provides a limited number of agreeing tokens involving direct-object pronouns. As I will show, rather than invalidating the working hypothesis, the fact that tokens such as example (60) typically occur after the interviewer or the participant has used an agreeing presentational haber expression appears to suggest that structural priming causes individual participants to reanalyze the direct-object pronoun as a subject pronoun.

(59) Sí, sí, aquí también los hay. Y yo supongo que los habrá en, en Bélgica, en, en, en Italia, en todos lados (LH15H21/LH1596).
[Yes, yes, here, themACC there is as well. And I suppose that themACC there will beSG in, in Belgium, in, in, in Italy, everywhere.]
‘Yes, yes, here, there is as well. And I suppose that there will beSG in, in Belgium, in, in, in Italy, everywhere.’

(60) ¿A, acá ya habían carros? ¡Claro que los, claro que los habían, no, no soy tan viejo! (LH20H12/LH2765-LH2766).
[W, were there already cars here? Of course that themACC, of course that themACC there were, not, I’m not that old!]
‘W, were there already cars here? Of course that themACC, of course that there were, I’m not, I’m not that old!’

My corpus also provides agreeing and non-agreeing examples of presentational haber expressions introducing a hearer-new token of a hearer-old type with the generic possessive determiner sus ‘one’s’, implying ‘the/your typical’, or ‘the usual’ (see examples [61] and [62]).


(61) Interviewer: ¿Este, y habían platos que tu madre te hacía especialmente para ti porque te gustaban tanto?
Participant: T, sí, ha, habían sus boberías pero no mucho. Mi casa nunca fue una casa que tuvo grandes posibilidades (LH21H11/LH2842).
Interviewer: ‘Er, and were there dishes that your mother made especially for you because you liked them so much?’
Participant: [T, yes, the, there were one’s silly things, but not much. My home was never a home that had great resources.]
Participant: ‘T, yes, the, there were the/your typical silly things, but not much. My home was never a home that had great resources.’

(62) E, sí, siempre había sus diferencias y sus celos, pero nos criamos bien (SD23H12/RD3049).
[Er, yes, there was always one’s differences and envies, but we grew up alright.]
‘Er, yes, there was always the/your typical differences and envies, but we grew up alright.’

Additionally, since types are usually encoded with indefinite noun phrases, presenting hearer-new tokens of hearer-old types does not necessarily involve the use of definite determiners in my data. Rather, it is quite common in Spanish to use agreeing or non-agreeing presentational haber expressions with implicit noun phrases to present new instances of a previously evoked type, as is shown in examples (63) and (64).

(63) Interviewer: ¿En esa época se veía que los hermanos estaban corrigiendo a, a sus hermanas?
Participant: Habían, habían, habían, pero mi mamá decía que no era correcto eso (SD15M21/RD1858-RD1860).
Interviewer: ‘In that time, would you see that brothers were correcting their sisters?’
Participant: ‘There were, there were, there were, but my mom said that that was not correct.’

(64) Interviewer: ¿No? ¿Este, no, no había peleas en aquel entonces?
Participant: No, siempre hay, lo que pasa es que, o no me han tocado, o yo no he querido estar (LH08H12/LH988).
Interviewer: ‘No? Er, no, weren’t there fights back then?’
Participant: ‘No, there is always, what happens is that, either they have not affected me, or I didn’t want to be involved.’


Finally, speakers may also specify the number of new tokens of the type they wish to bring to the hearer’s attention by using a quantifying pronoun or another quantifying expression, as is shown in examples (65) and (66).

(65) Interviewer: ¿Y habían cosas que no le gustaban de la ciudad?
Participant: Habían algunas que no me gustaban, sí, sí (LH17M21/LH2291).
Interviewer: ‘And were there things that you didn’t like about the city?’
Participant: ‘There were a few that I didn’t like, yes, yes.’

(66) Interviewer: ¿Este, entonces, que usted recuerde cuando usted era niño, habían más conversaciones en la calle cuando usted era niño?
Participant: Claro que sí, había más (SD02H21/RD14).
Interviewer: ‘Er, well, for what you can remember, when you were a child, were there more conversations in the streets when you were a child?’
Participant: ‘Of course, there was more.’

6.2.1.3 Hearer-old entities newly instantiating a variable

The third type Ward and Birner (1995) identify is that of hearer-old entities newly instantiating a variable, resulting in a list reading. According to these authors, list-reading definites require a context that evokes an open proposition of the type ‘X is an element of the category Y’ (Ward and Birner 1995: 734–735). The use of a presentational expression, then, is motivated by the fact that the elements of the list are presented as hearer-new instances of the category. In turn, the use of the definite article is licensed by the fact that hearers are expected to uniquely identify the list items (Suñer 1982: 88–90; Ward and Birner 1995: 734–735). This is shown in example (67), in which the character Teófilo Huamani first evokes the category gran civilización ‘great civilization’, upon which the other character enumerates some examples with an agreeing presentational haber expression. Similar cases can be found with the non-agreeing construction, as is shown in example (68).


(67) Teófilo Huamani: Porque, a mí, los balcones representan la opresión.
Professor Brunelli: ¿Se puede saber a quién o a qué oprimen estos pobres balcones?
Teófilo Huamani: Antes de que llegaran aquí los forasteros que los trajeron, en el Perú había una gran civilización, profesor.
Professor Brunelli: La de los incas, lo sé muy bien. Y, antes, habían los chimús, los nazcas, los tiahuanacos, muchos más (Real Academia Española 2008b-, Theater, Peru).
Teófilo Huamani: ‘Because, to me, the balconies represent oppression.’
Professor Brunelli: ‘Can I know who or what these poor balconies oppress?’
Teófilo Huamani: ‘Before the foreigners came here that brought them, in Peru there was a great civilization, professor.’
Professor Brunelli: ‘That of the Incas, I know it very well. And before, there were the Chimus, the Nazcas, the Tiahuanacos, many more.’

(68) Bueno, yo creo, francamente, que tenemos que adaptarnos, en nuestro teatro, y en otras manifestaciones de la vida puertorriqueña, a la forma de hablar puertorriqueña. … Hay también el yeísmo. ¿Verdad? Pero el yeísmo está aceptado ya en el resto de Hispanoamérica, y hay los apócopes de la ese final, que ocurren mucho en, en Puerto Rico (Real Academia Española 2008b-, Spoken, Puerto Rico).
‘Well, frankly, I think that we have to adapt ourselves, in our theater, and in other manifestations of Puerto Rican life, to the Puerto Rican way of talking. … There is also the yeismo. Right? But the yeismo is already accepted in the rest of Hispanic America and there is the apocopes of word-final s, which occur a lot in, in Puerto Rico.’

In this light, one would expect proper names without determiners also to occur in lists. Indeed, in English, this is possible, as is shown by the felicity of the co-occurrence of the proper name John McCain with presentational there is/there are in example (69).

(69) I think, one, there is John McCain and there is everybody else (Davies 2008-, Press).

However, this is impossible with presentational haber. Suñer (1982: 82) attributes this to the fact that a proper name would require differential object marking with the preposition a (e.g., a John McCain), which presentational haber does not allow, because it only has one nominal argument (Delbecque 2002: 107; Torrego-Salcedo 1999: 1785, 1794–1795).


Finally, it should also be observed that definites can be licensed in multiple ways. Consider, for instance, example (70). Here, the definites los juegos estos que te digo ‘those games that I told you about’ and la televisión ‘the television’ are licensed as uniquely identifiable elements of the category pasatiempos ‘pastimes’ evoked in the interviewer’s question. At the same time, however, the use of the first definite noun phrase is also licensed as a reminder.

(70) Interviewer: ¿Este, y cuando tú eras niña, qué pasatiempos habían?
Participant: ¿Qué pasatiempos entonces habían? E, t, bueno, habían los juegos estos que te digo, la televisión, aunque eran pocos los muñequitos que habían, pero, m, m, pero habían algunos (LH03M12/LH287).
Interviewer: ‘Er, when you were a girl, what pastimes were there?’
Participant: ‘What pastimes were there? Er, t, well, there were those games that I told you about, the television, although the cartoons that there were, were few, but, m, m, but there were some.’

6.2.1.4 Hearer-new entities with uniquely identifying descriptions

The nominal argument can also be marked by a definite determiner because it introduces a hearer-new entity with a uniquely identifying description (Lakoff 1987: 546; Ward and Birner 1995: 735–736). Contrary to what we have seen for the other types of definites, the degree of acceptability of this type does not hinge upon the context. Rather, this sort of definite argument introduces brand-new information, but in such a way that the hearer can immediately identify the unique referent the speaker is talking about, which licenses the use of definite determiners (Abbott 2004: 136; Langacker 1991: 98). Example (71) shows that this interpretation emerges with noun phrases introduced by demonstratives and followed by adnominal descriptions. Cases like this can also be found with non-agreeing presentational haber (see example [72]), as well as with the definite article (see examples [73] and [74]).

(71) No habían esos bares y esos, s, esas cosas que hay, que están creando problemas (SD15M21/RD1810).
‘There weren’t these bars and these, t, these things that there is, that are creating problems.’


(72) Yo creo que el país tiene otras urgencias ahora y que, en la situación en que está, la cultura no es una de ellas. Cada quien irá haciendo lo que pueda con los pocos recursos que tenga. … Ya no hay esos grandes subsidios o esas grandes exposiciones y puestas en escena que podías traer del exterior (Real Academia Española 2008b-, Press, Venezuela).
‘I think that the country has other urgencies now and that, in the situation in which it is, culture is not one of them. Everyone will keep doing what he or she can with the few resources they have. … There isn’t these large subsidies or these large exhibitions or stagings that you could bring from abroad anymore.’

(73) Ante las denuncias de robo, acoso sexual y amenazas que han habido en el recinto de Río Piedras de la Universidad de Puerto Rico (UPRRP), el rector Carlos Severino, aseguró que “está cambiando la manera en que hacen la seguridad,” al incluir mayor patrullaje y retomar el tema de la acreditación de la Guardia Universitaria (Internet, Press, Puerto Rico, http://goo.gl/oiBzR3).
‘Faced with the allegations of theft, sexual harassment, and threats that there have been at the Rio Piedras campus of the University of Puerto Rico (UPRRP), the Rector Carlos Severino, assured that “He is changing the way they do security,” by including more patrols and revisiting the issue of the accreditation of the University Guard.’

(74) Los insectos han sobrevivido cuatro de las cinco grandes extinciones que ha habido en el planeta (Internet, Press, Puerto Rico, http://goo.gl/K4ze4g).
‘Insects have survived four of the five great extinctions that there has been on the planet.’

Uniquely identifying descriptions may also be constructed with an anaphoric pronoun followed by a restrictive relative clause (RAE and ASALE 2009: §15.6r), as in example (75). Since this type of expression involves accusative pronouns, they are less common with agreeing presentational haber. Still, some cases can be documented, as is shown in example (76).


(75) Bueno, nosotros en Cuba les llamamos ‘guaguas’. Son unos pequeños insectos, no recuerdo de qué familia. Los hay blancos, pequeñitos, los hay que parecen cucarachitas pequeñitas que específicamente succionan, chupan los jugos vegetales y, entonces, esto lógicamente empobrece las plantas y hay que luchar contra ellas (Davies 2002-, Spoken, Havana).
[Well, we in Cuba, we call them ‘guaguas’. They are little insects, I don’t recall of which family. ThemACC there is white, really small, themACC there is that look like little cockroaches, which specifically suction, suck the plant’s juices and, well, this logically weakens the plants and you have to fight against them.]
‘Well, we in Cuba, we call them ‘guaguas’. They are little insects, I don’t recall of which family. There is some small white ones, there is some that look like little cockroaches, which specifically suction, suck the plant’s juices and, well, this logically weakens the plants and you have to fight against them.’

(76) Claro, al haber tantos alumnos en el aula, un solo maestro para cuarenta o cincuenta muchachos. Y, entonces, los habían disciplinados, pero los habían que eran la candelita (LH21H11/LH2835-LH2836).
[Of course, with there being so many pupils in the classroom, a single teacher for forty or fifty kids. And, so, themACC there were disciplined, but themACC there were who were a handful.]
‘Of course, with there being so many pupils in the classroom, a single teacher for forty or fifty kids. And, so, there were disciplined ones, but there were some who were a handful.’

Further examples of this type can be found with superlatives (see examples [77] and [78]) and cataphoric-reference noun phrases (see examples [79] and [80]) (Bolinger 1977: 117–118; Suñer 1982: 80, 82–84; Ward and Birner 1995: 737).

(77) Por plata han habido los más extraños cambios de postura en toda la historia humana (Internet, Message board, Chile, http://goo.gl/nuyRpY).
‘For money there have been the strangest posture changes in the entire human history.’

(78) Creo que en el Pri hay los mejores políticos del Estado, las gentes que tienen la mejor experiencia de gobierno (Davies 2002-, Spoken, Mexico).
‘I think that in the Pri Party there is the best politicians of the state, the people that have the best governance experience.’


(79) Es posible que la ola que decayó, a nuestro juicio, en esos días, se hubiese elevado de nuevo, si nosotros convocamos a la huelga y la anunciamos 48 horas antes. Claro, habían los criterios siguientes. Si nosotros anunciamos la huelga, el ejército, el régimen, toma medidas en una serie de puntos que nos interesa atacar (Internet, Magazine, Cuba, http://goo.gl/hY5zEh).
‘It is possible that the wave that fell, in our view, in those days, would have raised itself again, if we called for a strike and announced it 48 hours in advance. Of course, there were the following criteria. If we announce the strike, the army, the regime takes action on a number of points we are interested in attacking.’

(80) En los congresos de París (1989) y de Roma (1993) se presentaron numerosos estudios multidisciplinares y todos confirmaban la antigüedad del lienzo. Entre ellos hay los dos siguientes: 1. La irradiacion; 2. Los incendios (Internet, Website, Peru, http://goo.gl/VLIh3P).
‘At the conferences of Paris (1989) and Rome (1993), numerous multidisciplinary studies were presented and all confirmed the antiquity of the cloth. Among them there is the following two: 1. Radiation; 2. Fires.’

The last type of uniquely identifiable definite nouns is known as ‘containing inferables’ (Prince 1992: 303–305). With this type, the reference of the noun is inferred from its adnominal modifier. For instance, in the agreeing case presented in example (81), the reference of las partes ‘the parts’ can be inferred from the prepositional phrase de un hombre ‘of a man’. Similarly, in the nonagreeing example provided in (82), the specific meaning of los elementos ‘the elements’ is inferable from de un golpe de Estado ‘of a (typical) coup’.

(81) En el asiento trasero estaban dos bolsas de plástico color negra, en cuyo interior habían las partes de un hombre de aproximadamente 40 años de edad (Internet, Press, Mexico, http://goo.gl/m0zfV8).
‘Two black-colored plastic bags were in the back seat, inside which there were the parts of a man of about 40 years of age.’

(82) Les expliqué que había los elementos de un golpe de Estado (Internet, Press, Argentina, http://goo.gl/lW0Rfo).
‘I explained to them that there was the elements of a (typical) coup.’

With English presentational there is/there are, Ward and Birner (1995: 737) find that containing inferables can only be used felicitously when a ‘conventional relationship’ holds between the entities denoted by the head (las partes ‘the parts’ and los elementos ‘the elements’, in the examples) and the modifier (de un hombre ‘of a man’ and de un golpe de Estado ‘of a coup’ in the examples). Indeed, for many containing inferables, an ‘intrinsic metonymic association’ (Croft and Cruse 2004: 216–217), such as the part-whole relationships in the examples, can be identified between the modifying adnominal and the noun. In contrast, when no metonymic association exists between the noun and its modifier, the use of a definite determiner leads to an infelicitous expression, as is evident from example (83) (Ward and Birner 1995: 738).

(83) *There was the picture of a young black couple among his papers (constructed example. From Ward and Birner 1995: 738).5

Additionally, because the information status of a containing inferable definite noun phrase is determined by that of the adnominal modifier (Birner 1994: 252), we can expect its use in a presentational haber expression to be odd when the prepositional phrase is hearer-old. Judging from the modified versions of examples (81) and (82) cited in examples (84) and (85), this appears to be the case.6

(84) *En el asiento trasero estaban dos bolsas de plástico color negra, en cuyo interior habían las partes del hombre de aproximadamente 40 años de edad (constructed example).
*‘Two black-colored plastic bags were in the back seat, inside which there were the parts of the man of about 40 years of age.’

(85) *Les expliqué que había los elementos del golpe de Estado (constructed example).
*‘I explained to them that there was the elements of the coup.’

|| 5 However, if this expression were to occur in a context that construes it as another type of definite noun phrase, it would be fine. For example, Then, there was the picture of a young black couple among his papers evokes a list reading.

6 When we add a restrictive relative clause to the prepositional phrase, these utterances are well-formed, as is evident from examples (i) and (ii).

(i) En el asiento trasero estaban dos bolsas de plástico color negra, en cuyo interior habían las partes del hombre de aproximadamente 40 años de edad que habían estado buscando (constructed example).
‘Two black-colored plastic bags were in the back seat, inside which there were the parts of the man they had been looking for.’

(ii) Les expliqué que había los elementos del golpe de Estado que ya denunció mi antecesor en el cargo (constructed example).
‘I explained to them that there was the elements of the coup my predecessor had already denounced.’

However, examples (i) and (ii) are not containing inferables, because the hearer does not draw on the metonymical association between the prepositional phrase and the noun to identify the referent of the noun, but rather uses the information provided by the prepositional phrase and its restrictive relative clause.


6.2.1.5 False definites

Finally, definite noun phrases can be used to refer to ‘brand-new’ referents, or referents that have not been evoked in earlier discourse and with which the hearer has yet to establish mental contact (Prince 1992: 318), in which case the definite determiners function as indications of intensity (Suñer 1982: 81; Ward and Birner 1995: 739). Since the information contained within these noun phrases is truly new, false definites can be used freely with both English presentational there is/there are (Ward and Birner 1995: 738–740) and Spanish presentational haber. With singular noun phrases, this interpretation may emerge in Spanish with certain superlatives (see example [86]), the close-to-hearer deictic esa/ese ‘this’ (see example [87]), and the definite article (see example [88]). With plural noun phrases, false definite readings may arise with todas las/todos los ‘all the’ (see examples [89] and [90]) and the less common form cuantas/cuantos ‘all the’ (see examples [91] and [92]) (RAE and ASALE 2009: §15.6k, §19.3a; Suñer 1982: 81).

(86) Aun a riesgo de repetirme les quiero decir que no hay el menor problema y que los ciudadanos de Canarias pueden estar tranquilos, igual que los ciudadanos de toda España (Real Academia Española 2008b-, Press, Canary Islands).
‘Even at the risk of repeating myself I want to say to you that there isn’t the slightest problem and that the Canarian citizens can rest assured, just like the citizens across Spain.’

(87) En mi casa yo también, e, yo soy, éste era un pobre, no, muy pobre, y, entonces, no había esa, no había ese dinero para tener unos juguetes nuevos, así constantemente, tener muchos juguetes (LH01H22/LH41).
‘At home I as well, er, I am, this was a poor fellow, right, really poor, and, well, there wasn’t this, there wasn’t this money to have some new toys, like that, constantly, to have a lot of toys.’

(88) Hay el hombre y hay la mujer. Y cada uno tiene cosas distintas (René Marqués, La Mirada. From RAE and ASALE 2009: §15.6p).
‘There is man and there is woman. And both have different things.’


(89) Cuentan los abuelos que los tres campesinos se perdieron en aquella espesa y misteriosa selva, llegando, según la leyenda, al encanto invisible de doña Ñuisa, un paraje mítico encantado perdido en la selva, donde habían todos los frutos que hay sobre la tierra, hermosos jardines, quebradas de aguas cristalinas, con arenas de plata y piedras de oro (Internet, Blog, Colombia, http://goo.gl/sjMLgx).
‘The grandfathers recount that the three farmers got lost in that thick and mysterious forest, reaching, as the legend goes, the invisible charm of Mrs. Ñuisa, a mythical enchanted place, lost in the woods, where there were all the fruit trees that there are on earth, beautiful gardens, waterfalls with crystal-clear water, with silver sands and golden stones.’

(90) Halló aquí belleza y pobreza, pero también un pueblo alerta, descalzo y sensible, de infatigables manos hacedoras, dueño de una tierra inmensa donde hay todos los paisajes y los climas, los frutos y los sueños (Real Academia Española 2008b-, Fiction, Mexico).
‘He found here beauty and poverty, but also an alert, barefooted, and sensible people, with untiring working hands, owner of an immense land, where there is all the landscapes, the climates, the fruits, and the dreams.’

(91) Habían mangos, habían piñas, habían cuantas frutas había (SJ16H21/SJ1951).
‘There were mangoes, there were pineapples, there were all the fruits there was.’

(92) En fin abrí el cajón del buró para buscar el control de la televisión, ahí había cuantas cosas extraordinariamente desordenadas pudiese imaginar: pulseras, coletas, trabas, collares entre cientos y cientos de cosas (Internet, Blog, Chile, http://goo.gl/UYLGw9).
‘Eventually, I opened the drawer of the desk to look for the television remote control, in there, there was all the extremely messy things I could imagine: bracelets, pigtails, hair clips, necklaces, among hundreds and hundreds of things.’

In sum, in this section I have shown that the nominal arguments of agreeing and non-agreeing presentational haber are interpreted as being present in a stative situation. This suggests that the noun phrase of both constructions is assigned a zero argument role. Additionally, the examples presented in this section suggest that Ward and Birner’s (1995) analysis of English presentational expressions is also valid for agreeing and non-agreeing presentational haber. This points to a shared pragmatic constraint on the nominal arguments of both presentational haber constructions, namely, that the nominal argument has to convey new information to the hearer. Therefore, this section supports the view that, apart from social-interactional meanings and the relative prominence that is attributed to the noun phrase argument by encoding it as subject or rather as object, agreeing and non-agreeing presentational haber are completely synonymous. As we will see in the next section, the hearer-new constraint also explains why the nominal argument of agreeing presentational haber typically fails syntactic tests of subjecthood (e.g., Rodríguez-Mondoñedo 2006; Suñer 1982).

6.2.2 Syntactic properties

In discussions of the syntactic status of the nominal argument of presentational haber (e.g., Gómez-Torrego 1994: 30; Rodríguez-Mondoñedo 2006: 334; Suñer 1982: 22), it has often been observed that subject-marked personal pronouns are barred from appearing with both the agreeing and the non-agreeing construction, as is shown in examples (93) and (94). Contrary to what is claimed by the working hypothesis, this would suggest that the noun phrase invariantly behaves as an object (Rodríguez-Mondoñedo 2006: 334). In order to shed more light on this matter, in this section, I will review the argumentation that has been proposed in favor of this position. This will lead to the conclusion that most object-like characteristics of the nominal of presentational haber can actually be traced back to the information-status constraint identified in the previous section.

(93) *Había ellos (constructed example).
*‘There was theyNOM.’

(94) *Habían ellos (constructed example).
*‘There were theyNOM.’

First, drawing on Keenan’s (1976) list of subject properties, Rodríguez-Mondoñedo (2006: 330) and Suñer (1982: 121) indicate that the noun phrase of agreeing and non-agreeing presentational haber cannot remain implicit in coordinated structures, whereas this is usually possible for subjects, as is evident from Suñer’s (1982) examples cited in (95).


(95) a. Irradiaban luz y olían agradablemente dos docenas de rosas (constructed example. From Suñer 1982: 104). ‘Radiated light and smelled pleasantly two dozens of roses.’ b. *Habían y olían agradablemente dos docenas de rosas (constructed example. Adapted from Suñer 1982: 104). *‘There were and smelled pleasantly two dozens of roses.’ c. *Había y olían agradablemente dos docenas de rosas (constructed example. From Suñer 1982: 104). *‘There was and smelled pleasantly two dozens of roses.’ Second, it is impossible to interpret the noun phrase of presentational haber as coreferential with the subject of a matrix verb (Gómez-Torrego 1994: 30; Rodríguez-Mondoñedo 2006: 330–331; Suñer 1982: 20), as is shown in example (96). (96) a. *Los perros quieren haber en el jardín (constructed example). *‘The dogs want to there-bePL in the garden.’ b. *Los perros quiere haber en el jardín (constructed example). *‘The dogs want to there-beSG in the garden.’ Third, the default word order with presentational haber is that of verb + noun phrase. Since Spanish, and especially Cuban, Dominican, and Puerto Rican Spanish, is mainly a subject-verb-object language (e.g., Morales 1999), this ordering can be interpreted as evidence in favor of the object status of the noun phrase (Montes de Oca 1994: 11). Moreover, placing the nominal at the beginning of the utterance leads to an unacceptable expression, as is evident from examples (97) and (98). (97) *Unos hombres habían en el jardín (constructed example. Adapted from Rodríguez-Mondoñedo 2006: 333). *‘Some men there were in the garden.’7 (98) *Unos hombres había en el jardín (constructed example. Adapted from Rodríguez-Mondoñedo 2006: 333). *‘Some men there was in the garden.’

|| 7 Rodríguez-Mondoñedo (2006: 333) translates his example Un hombre había en el jardín as ‘A man was in the garden’. However, this is not a presentational, but rather a locative expression, which answers to different information-status constraints and would be translated in Spanish with the verb estar. In English, the conceptual import of presentational haber can only be rendered correctly by there is/there are constructions.


However, Givón (1999: 94–96) has shown that the subject properties proposed by Keenan (1976) are to a large extent epiphenomena of the tendency for subjects to have discourse-old information status. Indeed, the infelicity of nominative personal pronouns can be explained by the fact that these require hearer-old/discourse-old information status (Bolinger 1977: 91). Also, as we will see below, the impossibility of using an implicit nominal argument in coordinated structures shows that the use of implicit arguments requires that the implicit portion of the event frame has already appeared in discourse (Goldberg 2006a: 190). The word order that is typically displayed by presentational haber expressions is also predictable from the information status of the noun phrase argument, as cross-linguistically, new information tends to be placed in post-verbal position (e.g., Birner 1994; Birner and Ward 1996). In turn, the fact that the nominal argument of the presentational haber constructions cannot be interpreted as coreferential with the subject of a matrix verb illustrates that the noun phrase can only be interpreted as a zero participant. In other words, when the pragmatics and the semantics of presentational haber constructions are taken into account, syntactic tests do not necessarily prove that the nominal argument also functions as an object with agreeing presentational haber. Rather, they show that only the absence/presence of verb agreement can be taken as a formal clue for the grammatical status of the noun phrase. Let us turn now to the adverbial phrase.

6.3 The adverbial phrase

As was already mentioned in Section 1, Lakoff (1987: 490) describes the meaning of English presentational there is/there are as an Idealized Cognitive Model that introduces a new referent into discourse while situating it in a mental space. The adverbial phrase of the English presentational there is/there are construction (e.g., In the U.S. in example [99]) is the element that sets up this mental space (Lakoff 1987: 542–543).

(99) In the U.S., there are now more jobs in the wind industry than in the entire coal industry (Davies 2008-, Magazine).

Similarly, with Spanish presentational haber, the adverbial phrase creates the mental space in which the constructions locate the referents of their nominal arguments (Hernández-Díaz 2006: 1130–1132; Lyons 1967; Meulleman and Roegiest 2012). Syntactically, this implies that the presence of the adverbial expression cannot be considered optional (Meulleman and Roegiest 2012: 68–69). Rather, the fact that the adverbial contains necessary information for the interpretation of the expression suggests that its syntactic status is that of an ‘obligatory adjunct’ (Goldberg and Ackerman 2001), that is, a profiled adverbial phrase. Because the adverbial phrase is not claimed to refer to a physical location, but rather serves to construct “small conceptual packets … for the purpose of local understanding and action” (Fauconnier and Turner 1996: 113), it is no surprise that it may be of a spatial (see examples [100] and [101]), a temporal (see examples [102] and [103]) or another nature (Clark 1978: 89; Hernández-Díaz 2006: 1130–1132; Lyons 1967; Meulleman and Roegiest 2012).

(100) En el Norte de Italia habían muchas guerrillas (SD16H22/RD2188).
‘In the North of Italy, there were a lot of guerrillas.’

(101) Se abría en octubre la universidad, pero yo venía en enero, porque en casa no había cuartos para asistir desde, el año entero (SD16H22/RD2068).
‘The university opened in October, but I came in January, because at home there wasn’t moneys to attend from, the entire year.’

(102) E, en los, en los tiempos de antes, la, no habían tantas leyes (SJ16H21/SJ1971).
‘Er, in the, in the old days, the, there weren’t so many laws.’

(103) Es, e, en estos momentos tiene sala, comedo, sala-comedor, la cocina, el baño y, bueno, tres cuartos que antes no existían. Antes solamente había dos (LH07M11/LH842).
‘It is, er, at the moment it has a living room, a dinin, a living-dining room, the kitchen, the bathroom, and, well, three bedrooms that didn’t exist before. Before, there was only two.’

Additionally, this characterization of the adverbial phrase of the presentational haber constructions correctly predicts that it need not be made explicit when the hearer and the speaker can be expected to be able to recover the mental space from context (Goldberg 1995: 58–59). This is the case when the expression situates the noun phrase in a previously constructed mental space8 (see examples [104] and [105]) or the current base space, as in examples (106) and (107).

|| 8 Mental spaces can be set up by adverbial elements, phrases and clauses, but also by verb tenses, negation, or matrix verb constructions such as, for example, to believe that, to remember that, to think that (Croft and Cruse 2004: 32–39).


(104)
Interviewer: ¿Este, y cuando tú eras niña quién te cocinaba?
Participant: Mi abuela.
Interviewer: ¿Tu abuela? ¿Era co, era buena cocinera?
Participant: Mediana. O sea, pues, o sea, habían cosas que las hacía muy bien, pero otras cosas que no (SD24M12/RD3202).
Interviewer: ‘Er, and when you were a child, who cooked for you?’
Participant: ‘My grandmother.’
Interviewer: ‘Your grandmother? Was she a co, was she a good cook?’
Participant: ‘Average. That is, well, that is, there were things that she made very well, but other things that she didn’t.’

(105) La primera experiencia que yo recuerdo fue el huracán Hugo, e, que azotó a la Isla, e, prácticamente el Área Metropolitana fue la más im, e, impactada. El Área Sur como te dije, yo viví en Ponce, yo estaría como en el noveno grado, octavo-noveno no recuerdo el año. E, fue en los ochenta, e, pero recuerdo que fue un, un huracán bastante fuerte. ¿Qué categoría? No me preguntes, pero fue bastante fuerte. Recuerdo que, e, hubo mucha lluvia y muchas inundaciones (SJ06H12/SJ767).
‘The first experience that I remember was the hurricane Hugo, er, which struck the Island, er, practically, the Metropolitan Area was the most, er, impacted. The Southern Area, like I told you, I lived in Ponce, I would have been in the ninth grade, eighth-ninth I don’t remember the year. Er, it was in the eighties, er, but I remember that it was a, a pretty tough hurricane. What category? Don’t ask, but it was pretty rough. I remember that, er, there was a lot of rain and many floods.’

(106) Pero, este, sí, hayn platos como que es, específicos de diciembre (SJ05M12/SJ655).
‘But, er, yes, there are dishes, like, spe, typical of December.’

(107) Hay veces que sí lo notas, porque ellos te usa, te usan unas palabras que te arrastran la ere, ‘jj’ la hacen como, y tú te das cuenta ahí que no es de San Juan (SJ01M22/SJ68).
‘There is times that you do notice it, because they use, they use some words where they drag along the r, ‘jj’ they do it like, and there you immediately realize that they’re not from San Juan.’

When it comes to the information status of the adverbial phrase, in my corpus, all possible combinations between Prince’s (1992) hearer- and discourse-oriented levels of information status seem to occur. For instance, in example (108) the speaker first introduces a particular bridge into discourse and uses it


later on in the interview to situate muchos muchachos bonitos ‘many pretty boys’, so that the adverbial has discourse-old/hearer-old information status.

(108) Pues, yo tengo miedo a las alturas. T, y estuve en un campismo, que todo el mundo, t, se tiraba de un puente. Todo el mundo, el mundo se lanzaba de un puente, como de, o sea, seis metros así de ver. Y yo era adolescente en esa época. Y en el puente habían muchos muchachos bonitos (LH09M12/LH1168).
‘Well, I have a fear of heights. T, and I was on a camping, where everyone, t, was jumping off a bridge. Everyone, everyone was jumping off a bridge, of like, that is, six meters, judging from sight. And I was a teenager at the time. And on the bridge there were many pretty boys.’

Discourse-new/hearer-old adverbial phrases can also be found. These usually refer to geographic landmarks or areas the hearer is expected to know, such as, for instance, the city of San Juan de la Maguana in example (109), or the Pinar del Río province in example (110).

(109) E, en San Juan no habían escuelas privadas (SD04M22/RD447).
‘Er, in San Juan, there weren’t private schools.’

(110) Sí, hay lugares bellos en Pinar del Río (LH21H11/LH2868).
‘Yes, there is beautiful places in Pinar del Río.’

Finally, the adverbial phrase can refer to a mental space that is both new to the hearer and to discourse. For instance, in examples (111) and (112), en una escuela secundaria ‘in a secondary school’ and en lugares públicos ‘in public places’ are newly introduced into the conversation. Still, the hearer can be expected to build up the specific indefinite mental space the speaker has in mind.

(111) Allá en Estados Unidos cuando yo, yo trabajé en una escuela prima, en una escuela secundaria en Nueva York y habían muchachos de distintas partes de Latinoamérica (SD24M12/RD3212).
‘Over there in the United States, when I, I worked in a pri, in a secondary school in New York and there were kids of different parts of Latin America.’


(112) Interviewer: ¿Este, y, entonces, los amigos se, se reunían y se, se co, se hablaban o…? Participant: M, se hablaban o preparaban un motivo, un, un, una especie de fiesta, porque siempre era mejor reunirse en, en una casa que, no en lugares públicos, que hay otros riesgos (LH16H22/LH2206). Interviewer: ‘Er, and, well, friends got to, together and they, ta, they talked to each other or…?’ Participant: ‘M, they talked to each other or they prepared an occasion, a, a, a sort of party, because it was always better getting together in a home than, not in public places, where there is other risks.’ In any case, the variety of configurations documented in the corpus suggests that the presentational haber constructions do not specify the information status of this slot. Let us now consider the conditions that constrain the use of implicit adverbial phrases and/or nominal arguments.

6.4 Implicit nominal arguments and adverbial phrases

As noted in Section 2.1, the POINTING-OUT Idealized Cognitive Model implies that without proper context, the noun phrase argument cannot remain implicit, as it carries virtually the entire conceptual import of the clause (Goldberg 2005a: 29, 2005b: 232). Nevertheless, under specific discourse conditions, it need not be made explicit again. As observed in Section 2.1.2, in my corpus this is especially common for indefinite noun phrases that introduce hearer-new tokens of hearer-old types, as shown in example (113).

(113)
Participant: Niños en la calle, yo creo.
Interviewer: ¿Antes no?
Participant: Hay más.
Interviewer: ¿No habían tantos?
Participant: Habían, siempre han habido (SD19M12/RD2513-RD2514).
Participant: ‘Children on the streets, I think.’
Interviewer: ‘Before not?’
Participant: ‘There are more.’
Interviewer: ‘There weren’t as many?’
Participant: ‘There were, there have always been.’


Similarly, the fact that the adverbial phrase is profiled and sets up a mental space in which presentational haber locates the referent of the noun phrase implies that the adverbial can only be omitted felicitously when it is recoverable (Goldberg 1995: 58–59, 2006a: 39), that is, when it refers to the base space or a previously evoked mental space. Isolated examples such as (114) and (115), which leave us wondering against which setting we have to interpret the utterances, suggest that this is the case.

(114) Podrían haber días en que yo tenía dos horas libres entremedio (SJ13H11/SJ1566).
‘There could bePL days that I had two hours of free time in between.’

(115) Claro, sí hubo muertos (SD20H12/RD2682).
‘Of course, yes, there was casualties.’

In contrast, examples such as (116) and (117) are conceptually complete, because they locate the referent of the noun phrase in the current base space.

(116) Habrán gentes que lo hagan (SD05H11/RD594).
‘There will bePL people that do it.’

(117) El racismo muchas veces viene porque muchos blancos ignoran de que hay blancos que s, negros que son tesoros (SD05H11/RD562).
‘Often, racism comes because many whites are unaware that there is whites that a, blacks that are treasures.’

Let us now summarize the most important results of this chapter.

6.5 Summary and box diagrams

In this chapter, I have argued that both the agreeing and the non-agreeing variant of the presentational construction with haber encode the POINTING-OUT Idealized Cognitive Model proposed by Lakoff (1987). The two constructions also assign the same zero argument role to their nominal argument. Since most alternations like this serve to provide speakers with different ways to package information (Goldberg 2005a: 37, 2005b: 236), subsequently, I have investigated whether agreeing and non-agreeing presentational haber display any differences in this respect. This has shown that the nominal argument of both variants of the presentational haber construction has to provide new information to the hearer. Then, I have indicated that due to this information-status constraint, the results of syntactic tests do not necessarily prove that the nominal is always a direct object. Rather, the only formal clue that remains is the absence or presence of verb agreement. Therefore, this chapter has shown that it is plausible that agreeing presentational haber has a subject, as the working hypothesis claims.

Additionally, I have demonstrated that the two variants of the presentational construction with haber include a profiled adverbial phrase, which functions as setting and evokes the mental space in which the construction localizes the referent of the nominal argument. Subsequently, it was shown that under certain discourse conditions, both the nominal argument and the adverbial phrase may remain implicit. For the nominal argument, this results in its interpretation as a hearer-new token of a hearer-old type, a reading that also emerges for definites in certain discourse contexts. For the adverbial phrase, this results in its interpretation as referring either to the base space or to a previously established mental space.

In sum, the data presented in this chapter support the view that presentational haber pluralization constitutes a competition between two construction schemas (see Fig. 2 and Fig. 3) that only differ when it comes to the syntactic function of their nominal arguments and the social categories associated with their relative frequencies. In the next chapter, I will explore the claim that domain-general cognitive constraints condition the alternation between the non-agreeing and the agreeing presentational haber construction.

[Box diagram with Sem, Syn, and Prag rows for POINTING-OUT]
Fig. 2: The non-agreeing presentational haber construction

[Fig. 3: box diagram of the second construction schema, with Sem, Syn, and Prag rows for POINTING-OUT]

Part B: Cognitive, social, and individual constraints on presentational haber pluralization

Chapter 7: Cognitive constraints on presentational haber pluralization

7 Cognitive constraints on presentational haber pluralization

The preceding chapters have laid the basis for investigating the hypothesis that presentational haber pluralization constitutes a spreading-activation competition between the non-agreeing and the agreeing presentational haber construction, which is conditioned by cognitive, social, and individual constraints. In this chapter, I will begin by discussing the structure of the corpus and the technical details of the regression models (Section 7.1). In Section 7.2, I will operationalize the cognitive constraints and I will assess their effects on presentational haber pluralization. The chapter concludes with a summary of the results in Section 7.3.

7.1 Structure of the corpus and the regression models

In this section, I will start by providing an overview of the overall distribution of agreeing and non-agreeing presentational haber and the structure of the corpus. In Section 7.1.2, I will discuss the architecture of the regression models that will be used throughout this and the following two chapters.

7.1.1 Overall distribution of agreeing and non-agreeing presentational haber

As Tab. 15 shows, the methods described in Chapter 5 yielded a total of 5,589 tokens of presentational haber followed by a plural nominal argument. This corresponds to an average of 77 tokens per participant, distributed as shown in Fig. 4. As is evident from the figure, the three communities display a similar imbalance in the numbers of tokens that the individual participants contributed to the datasets. Still, not a single speaker contributed fewer than twenty tokens and all but two participants contributed more than 30.

Tab. 15: Agreeing and non-agreeing presentational haber in the Spanish of Havana, Santo Domingo, and San Juan

                Havana          Santo Domingo   San Juan
                N       %       N       %       N       %
Agreeing        934     44.6    859     46.7    684     41.3
Non-agreeing    1159    55.4    982     53.3    971     58.7
Total           2093    100     1841    100     1655    100

Fig. 4: Distribution of tokens of presentational haber across participants in the Havana, Santo Domingo, and San Juan datasets

Across the communities, 44.3% (N=2477/5589) of the tokens correspond to the agreeing presentational haber construction. For the individual speech communities, the total number of tokens ranges from about 1,650 to more than 2,000. The community rates of presentational haber pluralization range between 41.3% in San Juan and 46.7% in Santo Domingo, as is shown in Tab. 15 and Fig. 5.
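The rates cited in this paragraph can be recomputed from the counts in Tab. 15. The short Python sketch below does so as a consistency check; the variable and function names are illustrative assumptions and are not part of the original analysis.

```python
# Counts copied from Tab. 15: (agreeing, non-agreeing) tokens per community.
counts = {
    "Havana": (934, 1159),
    "Santo Domingo": (859, 982),
    "San Juan": (684, 971),
}


def agreement_rate(agreeing, non_agreeing):
    """Percentage of agreeing tokens among all tokens."""
    return 100 * agreeing / (agreeing + non_agreeing)


# Community-level rates, rounded to one decimal as in the table.
community_rates = {
    name: round(agreement_rate(a, n), 1) for name, (a, n) in counts.items()
}

# Pooled rate across the three communities.
total_agreeing = sum(a for a, _ in counts.values())
total_tokens = sum(a + n for a, n in counts.values())
pooled_rate = round(100 * total_agreeing / total_tokens, 1)

print(community_rates)
print(total_agreeing, total_tokens, pooled_rate)
```

The pooled figure weights each community by its number of tokens, which is why it is not simply the mean of the three community percentages.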

Fig. 5: Counts of non-agreeing and agreeing presentational haber in Havana, Santo Domingo, and San Juan

In other words, Havana, Santo Domingo, and San Juan are not markedly different from one another when it comes to their overall rates of presentational haber pluralization. In contrast, when compared to earlier work, the rates of presentational haber pluralization reported here are far lower. That is to say, the studies reviewed in Section 2.2 typically document that agreeing presentational haber represents about 60% of the tokens followed by a plural noun phrase (D’Aquino-Ruiz 2004; Díaz-Campos 2003). In San Cristóbal de Los Andes and San Salvador, agreeing presentational haber even occurs in around 80% of the cases (Freites-Barros 2008; Quintanilla-Aguilar 2009). Yet, the differences between the figures reported here and the findings of earlier investigations appear to be due to the fact that this study includes the variation between the present-tense form hay and its rather infrequent vernacular plural hayn. Without these two forms, the frequency of agreeing presentational haber rises to 60.6% (N=926/1527) for Havana, 61.2% (N=835/1320) for Santo Domingo, and 54.3% (N=661/1217) for San Juan.

In sum, this section has shown that, in terms of the overall distributions of agreeing and non-agreeing presentational haber, there is little variation between the speech communities. Additionally, we have seen that the individual participants have contributed highly diverse numbers of tokens to the relevant datasets. As I pointed out in Section 5.4.2, this sort of data structure calls for mixed-effects modeling with a speaker random term (e.g., Gries 2013a; Johnson 2009). In the following section, I will discuss the architecture and overall performance of the mixed-effects models of presentational haber pluralization.

7.1.2 Architecture and performance of the regression models

As is shown in Tab. 16-Tab. 18, the model selection procedure described in Section 5.4.2.2 retained a highly similar set of linguistic predictors for Havana, Santo Domingo, and San Juan.¹ Specifically, for the three speech communities, comprehension-to-production priming, production-to-production priming, tense, and typical action-chain position of the referent of the noun are required on information-theoretic grounds to model speakers’ behavior. For Havana and San Juan, the absence/presence of negation is also a relevant predictor.

Considering the random effects, the variance and standard deviation scores for the nouns and the speakers suggest that there is substantial variation between nouns and between speakers in their preference for agreeing or non-agreeing haber, which justifies the use of mixed-effects modeling. By-noun random slope models failed to converge, but all three models include a by-speaker random slope. In particular, for Havana and Santo Domingo adding a random slope proved appropriate for typical action-chain position. For San Juan, including a by-speaker random slope for tense considerably reduced the AICc score. The information contributed by these random slopes will be examined in Section 9.2.

Turning now to the interactions between predictors, including style as a fixed effect lowered the AICc score for Havana and improved the discriminative ability of this model. Specifying an interaction term for tense and style also proved to be appropriate. Running style as a random effect (Gries 2013a: 14) neither lowered the AICc nor improved the discriminative accuracy (C-index) or fit (pseudo-R2). The same was true for the Santo Domingo and San Juan models. For these models, however, including style as a fixed effect did not lower the AICc score either, nor did it explain more variance or improve the discriminative ability.
¹ Tab. 16-Tab. 18 only display the results that were obtained for the linguistic predictors, but the linguistic and the social predictors were evaluated in the same model. The social predictors will be presented in Chapter 8.

Regarding the fit and discriminative power of the models, the model summaries provided at the bottom of Tab. 16-Tab. 18 indicate that the three models perform very well at predicting speakers’ choice between agreeing and non-agreeing presentational haber. In particular, for all three models, the C indices are in the high eighties (0.87–0.89; ‘excellent discrimination’ on Hosmer and Lemeshow’s 2000: 162 scale). The high pseudo-R2 values point in the same direction. These excellent fits are not due to the random effects, as the C-index remains above 0.80 (0.82–0.85) and R2 remains more than acceptable when the random terms are removed from the models, suggesting that the fixed effects represent the constraints that determine speakers’ behavior very well. For the three models, all regression coefficients also fall nicely in the middle between the 2.5% and the 97.5% bootstrap confidence limits. This suggests that the results generalize well to the population at large and that overfitting and multicollinearity are not issues. Regarding the latter, for all predictors, the variance inflation factors are also below five. Contrasting the sums of the squared Pearson residuals with the residual degrees of freedom of the models showed that overdispersion is not an issue.

In the remainder of this chapter, I will describe how the linguistic predictors that appear in Tab. 16-Tab. 18 can be considered as reflexes of markedness of coding, statistical preemption, and structural priming. Then, I will discuss the results obtained for each of these linguistic predictors. Specifically, Section 7.2 focuses on the effects of markedness of coding, whereas Section 7.3 investigates how statistical preemption constrains presentational haber pluralization. Section 7.4 is concerned with structural priming effects. Section 7.5 investigates the interaction between these cognitive constraints. Before summarizing the main findings of this chapter (Section 7.7), Section 7.6 will present and compare constraint rankings for Havana, Santo Domingo, and San Juan.
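Two of the diagnostics mentioned in this section, the C-index of concordance and the comparison of the summed squared Pearson residuals with the residual degrees of freedom, can be sketched in a few lines of Python. The sketch is only an illustration under stated assumptions: the function names and the toy probabilities and outcomes are hypothetical, and the code does not reproduce the fitting procedure or the data behind Tab. 16-Tab. 18.

```python
def c_index(probabilities, outcomes):
    """Share of (agreeing, non-agreeing) token pairs in which the agreeing
    token received the higher predicted probability; ties count one half."""
    positives = [p for p, y in zip(probabilities, outcomes) if y == 1]
    negatives = [p for p, y in zip(probabilities, outcomes) if y == 0]
    score = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                score += 1.0
            elif pos == neg:
                score += 0.5
    return score / (len(positives) * len(negatives))


def pearson_dispersion(probabilities, outcomes, n_parameters):
    """Sum of squared Pearson residuals divided by the residual degrees of
    freedom; values far above 1 would point to overdispersion."""
    chi_square = sum(
        (y - p) ** 2 / (p * (1 - p)) for p, y in zip(probabilities, outcomes)
    )
    return chi_square / (len(outcomes) - n_parameters)


# Hypothetical predicted probabilities of the agreeing variant and
# hypothetical observed outcomes (1 = agreeing, 0 = non-agreeing).
toy_probabilities = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
toy_outcomes = [1, 1, 0, 1, 0, 0]

print(c_index(toy_probabilities, toy_outcomes))
print(pearson_dispersion(toy_probabilities, toy_outcomes, n_parameters=2))
```

A C-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, which is the scale on which the values in the model summaries are read.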

Tab. 16: Logistic generalized linear mixed-effects model of presentational haber pluralization in Havana (sum contrasts, bobyqa optimizer)

Fixed effects                                     N          %       Coefficient  2.5%     97.5%
(intercept)                                                          -1.023       -1.527   -0.535
Verb tense
  All others                                      819/1298   63.1    1.663        1.450    1.922
  Synthetic present or preterit tense             115/795    14.5    -1.663       -1.450   -1.922
Production-to-production priming
  Agreeing presentational haber construction      556/817    68.1    0.653        0.420    0.879
  First occurrence/distance 20+ clauses           83/297     27.9    -0.268       -0.586   0.080
  Non-agreeing presentational haber construction  295/979    30.1    -0.385       -0.597   -0.181
Comprehension-to-production priming
  Agreeing presentational haber construction      113/239    47.3    0.503        0.177    0.853
  Non-agreeing presentational haber construction  73/204     35.8    -0.151       -0.507   0.162
  First occurrence/distance 20+ clauses           748/1650   45.3    -0.353       -0.619   -0.095
Typical action-chain position of the noun’s referent
  Heads                                           467/925    50.5    0.248        -0.044   0.554
  Tails and settings                              467/1168   40.0    -0.248       0.044    -0.554
Absence/presence of negation
  Absent                                          708/1523   46.48   0.188        0.000    0.373
  Present                                         226/570    39.65   -0.188       0.000    -0.373

Random effects                                    Variance   Std. dev.
  Nouns                                           0.765      0.875
  Speakers                                        0.523      0.725

Model summary                                     Fixed      Full
  C-index of concordance                          0.82       0.89
  Pseudo-R2                                       0.43       0.60
  AICc                                            2124.5     1974.9
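To illustrate how the sum-coded coefficients in Tab. 16 can be read, the following minimal Python sketch converts the Havana intercept and verb-tense coefficients into predicted probabilities of the agreeing variant. It assumes, for illustration only, that all other predictors are held at the unweighted mean of their levels (so that their sum-coded contributions cancel out) and that the random effects are set to zero; the helper names are hypothetical.

```python
import math


def inv_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Values copied from Tab. 16 (Havana).
intercept = -1.023
tense_coefficients = {
    "all others": 1.663,
    "synthetic present or preterit": -1.663,
}


def predicted_probability(tense_level):
    """Predicted probability of agreeing haber for one tense level, with the
    remaining predictors at their sum-coded mean and no random effects."""
    return inv_logit(intercept + tense_coefficients[tense_level])


p_other = predicted_probability("all others")
p_synthetic = predicted_probability("synthetic present or preterit")
print(round(p_other, 3), round(p_synthetic, 3))
```

Under these assumptions the sketch yields roughly 0.65 for the group of all other expressions and roughly 0.06 for synthetic present- or preterit-tense expressions. These are conditional estimates, so they are not expected to coincide with the raw percentages in the N and % columns.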


Tab. 17: Logistic generalized linear mixed-effects model of presentational haber pluralization in Santo Domingo (sum contrasts, bobyqa optimizer)

Fixed effects                                     N          %       Coefficient  2.5%     97.5%
(intercept)                                                          -0.224       -0.553   0.095
Verb tense
  All others                                      720/1103   65.3    1.446        1.280    1.659
  Synthetic present or preterit tense             140/739    18.9    -1.446       -1.280   -1.659
Production-to-production priming
  Agreeing presentational haber construction      484/711    68.1    0.780        0.604    0.985
  First occurrence/distance 20+ clauses           123/337    36.5    -0.125       -0.346   0.099
  Non-agreeing presentational haber construction  253/794    31.9    -0.654       -0.850   -0.458
Comprehension-to-production priming
  Agreeing presentational haber construction      151/264    57.2    0.507        0.228    0.811
  Non-agreeing presentational haber construction  63/185     34.1    -0.189       -0.526   0.107
  First occurrence/distance 20+ clauses           646/1393   46.4    -0.317       -0.560   -0.098
Typical action-chain position of the noun’s referent
  Heads                                           439/815    53.9    0.463        0.243    0.702
  Tails and settings                              421/1027   41.0    -0.463       -0.243   -0.702
Absence/presence of negation
  Absent, Present                                 Not included

Random effects                                    Variance   Std. dev.
  Nouns                                           0.599      0.774
  Speakers                                        0.183      0.427

Model summary                                     Fixed      Full
  C-index of concordance                          0.82       0.87
  Pseudo-R2                                       0.42       0.54
  AICc                                            1931.1     1844.0

Tab. 18: Logistic generalized linear mixed-effects model of presentational haber pluralization in San Juan (sum contrasts, bobyqa optimizer)

Fixed effects                                     N          %       Coefficient  2.5%     97.5%
(intercept)                                                          -0.974       -1.376   -0.535
Verb tense
  All others                                      622/1014   61.3    1.766        1.380    2.142
  Synthetic present or preterit tense             62/641     9.7     -1.766       -1.380   -2.142
Production-to-production priming
  Agreeing presentational haber construction      352/558    63.1    0.597        0.387    0.818
  First occurrence/distance 20+ clauses           88/246     35.8    -0.155       -0.453   0.117
  Non-agreeing presentational haber construction  244/851    28.7    -0.442       -0.661   -0.221
Comprehension-to-production priming
  Agreeing presentational haber construction      92/175     52.6    0.452        0.116    0.858
  Non-agreeing presentational haber construction  30/125     24.0    -0.266       -0.790   0.153
  First occurrence/distance 20+ clauses           562/1355   41.5    -0.186       -0.493   0.123
Typical action-chain position of the noun’s referent
  Heads                                           350/773    45.3    0.418        0.227    0.625
  Tails and settings                              348/882    37.9    -0.418       -0.227   -0.625
Absence/presence of negation
  Absent                                          559/1225   45.6    0.341        0.149    0.540
  Present                                         125/430    29.1    -0.341       -0.149   -0.540

Random effects                                    Variance   Std. dev.
  Nouns                                           0.378      0.651
  Speakers                                        0.334      0.578

Model summary                                     Fixed      Full
  C-index of concordance                          0.85       0.89
  Pseudo-R2                                       0.50       1489.6
  AICc                                            1593.5     1517.9


7.2 Markedness of coding

This section explores how markedness of coding constrains presentational haber pluralization. To this end, in Section 7.2.1, I will show how the typical action-chain position of the referent of the noun increases (or, rather, decreases) the prominence of the noun phrase of presentational haber. Then, Section 7.2.2 will focus on the way the absence/presence of negation contributes to this.

7.2.1 The typical action-chain position of the referent of the noun

In Section 4.2.2.1, hypothesis 1 proposes that markedness of coding increases the activation of the agreeing presentational haber construction with nominal arguments that are likely to draw the speaker’s attention. This raises the question of which features can model the relative amount of attention that is focused on the noun phrase of presentational haber. In this regard, research suggests that agents attract more attention than any other semantic role (Myachykov and Tomlin 2015). Therefore, semantic role would be an ideal candidate to operationalize markedness of coding, even more so because agenthood correlates rather closely with subjecthood (cf. Comrie 1989: 66; Dixon 1979: 86; Langacker 1991: Chap. 7; Myachykov and Tomlin 2015). However, as we have seen in Section 6.2.1, the nominal argument of presentational haber is clearly not an agent, because it is merely present in a stative situation. Still, it is undeniable that some entities (e.g., driver) are intrinsically more likely than others (e.g., invitee) to fulfill this role. Therefore, with constructions such as presentational haber, which do not explicitly construe the nominal argument as an agent or as a patient, entities like driver may be perceived as more likely potential agents, which may make them more prominent than entities like invitee (see Langacker 1991: 294). This may result in a more frequent use of agreeing presentational haber with lexical items such as driver.

The results of earlier studies may reflect such a tendency. As noted in Section 2.3, previous studies of presentational haber pluralization have claimed that human vs. non-human reference (e.g., Bentivoglio and Sedano 2011: 172–174) or the noun’s proportion of subject use (Brown and Rivas 2012) are the relevant predictors related to the noun phrase.
While it has often been proposed that humans are more likely to talk about (and, hence, attend to) other humans or animate beings (Ashby and Bentivoglio 1993; Croft 2003: 130; Du Bois 1987: 829), it should also be observed that this tendency is related to the fact that animate entities are more likely to be agents in events (Dixon 1979: 86; Du Bois 1987: 829). In other words, most animate-reference nouns are also likely agents. Similarly, chances are high that nouns with a high proportion of subject use refer to more typical agents.

In Cognitive Linguistics, the semantic roles ‘agent’ and ‘patient’ are defined in relation to what Langacker (1991: 283–285) calls the ‘canonical event model’ or the ‘action-chain model’: the head initiates physical activity, resulting “through physical contact, in the transfer of energy to an external object” (Langacker 1991: 285) and an internal change of state of that entity, the tail of the chain. The semantic roles of agent and patient, in turn, are defined as, respectively, ‘action-chain head’ and ‘action-chain tail’. Additionally, events take place in a particular setting, such that the event model minimally includes three elements: action-chain head/agent, action-chain tail/patient, and setting. Therefore, to test the first hypothesis, I coded the data for the typical action-chain position of the entity indicated by the isolated noun, for which I relied on the answers to the question in (118).

(118) Is the referent of the noun highly likely to cause an internal change of state to a second entity without being affected by a third entity first?
Yes: Typical action-chain head/more prototypical agent, e.g., temblor ‘earthquake’, madre ‘mother’, carro ‘car’…
No: Typical action-chain setting or tail/more prototypical setting or patient, e.g., actividad ‘activity’, víctima ‘victim’, daño ‘damage’…

As predicted by hypothesis 1, Tab. 16-Tab. 18 and Fig. 6 indicate that speakers of the three varieties are more likely to use the agreeing presentational haber construction with nouns that refer to typical action-chain heads (shown in example [119]), whereas with nouns that refer to typical tails, as in example (120), or settings (see example [121]), they prefer the non-agreeing presentational haber construction.

(119) Humans such as madre ‘mother’, natural phenomena such as huracán ‘hurricane’, self-propelling objects such as carro ‘car’, tiro ‘gunshot’.
(120) Tangible objects such as libro ‘book’, animate beings that undergo an action such as, for example, víctima ‘victim’ and invitado ‘invitee’.
(121) Lugar ‘place’, año ‘year’, nominalized events such as actividad ‘activity’, discusión ‘discussion’.


Fig. 6: Effect of typical action-chain position of the referent of the noun on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan

Additionally, rather than supporting Rivas and Brown’s (2012: 87) claims that temporal persistence (in terms of independent existence and reference) is a feature of prototypical subjects and that ‘stage-level nouns’² disfavor presentational haber pluralization because they are not temporally persistent, the results of this study suggest that these authors’ findings reflect differences in typical action-chain position, as stage-level nouns (e.g., años ‘years’ in example [122], actividades ‘activities’ in example [123], peleas ‘fights’ in example [124]) refer to typical settings of action chains rather than to typical heads.

(122) Pero después de ahí hubo años que no apareció un regalo tampoco (SD20H12/RD2679).
‘But afterwards, there was years that there didn’t appear a gift either.’
(123) E, hay muchas actividades, viernes, sábado y domingo son actividades (SJ16H21/SJ1937).
‘Er, there is a lot of activities, Friday, Saturday, and Sunday are activities.’
(124) Eso, hubo muchas peleas, e, mujeres y mujeres y hombres y hombres y, y de todo, o sea, a, eso se peleó mucho ahí (SJ04M22/SJ473).
‘Er, there was a lot of fights, er, women and women and men and men, and, and, a bit of everything, that is, a, that, they fought a lot over there.’

² For example, event nouns, deverbal nouns, and temporal nouns (Rivas and Brown 2012: 81; see Section 2.2.4 for discussion).


7.2.2 The absence/presence of negation

As noted in Section 2.2, two of the more recent investigations of presentational haber pluralization (D’Aquino-Ruiz 2004; Quintanilla-Aguilar 2009) have examined the effect of the absence/presence of negation, with similar results: the presence of negation disfavors the agreeing presentational haber construction. Similarly, Tab. 16-Tab. 18 show that the presence of negation disfavors the agreeing presentational haber construction in Havana and San Juan, which is also evident from contrastive examples such as (125). In Santo Domingo, however, this predictor does not contribute to explaining the variation. Still, when an alternative regression model is fitted to the Santo Domingo data, the absence of negation does favor pluralization over the presence of negation. However, the size of the difference, 0.018 log-odds, is so small as to be negligible, as is evident from the nearly flat line in the center panel of Fig. 7.

(125) Habían menos, no había tantos salones (SJ07H21/SJ886).
‘There were less, there wasn’t as many class rooms.’

Fig. 7: Effect of the absence/presence of negation on the log-odds of agreeing presentational haber in Havana, Santo Domingo (alternative model), and San Juan

I would like to argue here that the influence of this predictor constitutes an additional reflex of markedness of coding. Specifically, recall that in Section 6.2.1 it was suggested that the POINTING-OUT Idealized Cognitive Model implies that, unless the presentatum is explicitly construed as a type, the noun phrase is always interpreted as referring to a specific instance (Prince 1992: 299–300). In negative clauses, however, the reference of the nominal argument becomes suspended (Brown and Rivas 2012: 327; Keenan 1976: 318; Suñer 1982: 85). As a consequence, it is interpreted as nonspecific indefinite, that is, as being “identifiable only as a type, not as a specific instance or token” (Croft 2003: 132), which makes it less likely to attract speakers’ attention (Langacker 1991: Chap. 7). Therefore, the Havana and San Juan results can be interpreted as an effect of markedness of coding.

Yet, this does not account for the Santo Domingo data. However, as this community also displays the highest overall rates of presentational haber pluralization, the results may indicate that, in Santo Domingo, the agreeing presentational haber construction has invaded the non-specific indefinite conceptual territory. Since such incursions are typical of ongoing language changes (Company-Company 2003: 26), this could be a first indication that presentational haber pluralization constitutes an ongoing linguistic change. Let us now examine whether and how statistical preemption constrains presentational haber pluralization.

7.3 Statistical preemption

Earlier variationist studies have almost consistently documented that presentational haber pluralization occurs far more frequently in the imperfect tense than in the preterit and present tenses. In Section 4.2.2.2, Hypotheses 2a-b propose that the effect of the verb tense is the reflex in this variation of statistical preemption, that is, the tendency to use a partially lexically filled instance of a construction rather than constructing a novel expression based on a more abstract construction schema when both the abstract schema and the entrenched instance could encode the conceptualization equally well (Goldberg 2011: 139). In particular, it was argued that for the tenses that were mainly used in presentational haber expressions before the agreeing construction emerged as a conventional alternative, a partially lexically filled instance of the non-agreeing construction, conserved through repetition, preempts the use of the agreeing presentational haber construction whenever this entrenched instance can encode the conceptualization.

These hypotheses raise four questions: first, when did the variation that affects presentational haber emerge as a community-wide phenomenon; second, how can we measure the relative dispersion of verb forms across constructions; third, which forms of the verb enjoyed a relatively high frequency in a variety of constructions before this happened; and, fourth, is this distribution still reflected in actual usage?

The answer to the first question can only be tentative, as it is difficult to know exactly when and how the variation that affects presentational haber started in Caribbean Spanish. For Buenos Aires, Fontanella de Weinberg (1992b: 39) has shown that the alternations between agreeing and non-agreeing presentational haber already occur with some frequency in written discourse from the eighteenth century onward. Since there is usually a considerable lag between the emergence of a new variant and its trickling down into writing, the variation probably arose sometime in the seventeenth century.

Regarding the question as to how we can measure the depth of entrenchment of each particular form of haber in the non-agreeing presentational haber construction, the Cognitive Linguistics literature offers various suggestions for measures of the association between words and constructions, each with its own profile of advantages and limitations (see Levshina 2015: Chap. 9–10 or Schmid and Küchenhoff 2013 for overviews). Most of these methods depend on a two-by-two collocations table such as Tab. 19.

Tab. 19: Collocations table

Cell A: Frequency of word W in construction Cx (e.g., frequency of hubo in the non-agreeing presentational haber construction)
Cell C: Frequency of words other than W in construction Cx (e.g., frequency of the non-agreeing presentational haber construction with forms other than hubo)
Cell B: Frequency of word W in constructions other than Cx (e.g., frequency of hubo in other constructions)
Cell D: Frequency of words other than W in constructions other than Cx (e.g., frequency of other third-person singular forms of haber)

In recent work, ∆P (delta-p), a measure derived from associative learning theory, has proven to be a viable way to establish how frequently a word occurs in a specific construction as opposed to its occurrences outside that construction (e.g., Ellis and Ferreira-Junior 2007; Schmid and Küchenhoff 2013). ∆P is a unidirectional measure that expresses the probability of observing a construction (Cx) in the presence of a word (W), minus the probability of observing the construction in the absence of the word, as is shown schematically in (126).

(126) ∆P = P(Cx|W) − P(Cx|¬W)

With a two-by-two table like Tab. 19, ∆P can be obtained with the formula in (127).

(127) ∆P = (Cell A/(Cell A + Cell B)) − (Cell C/(Cell C + Cell D))

The higher the resulting ∆P, the deeper the word is entrenched in that particular construction.
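As a hedged illustration of (126) and (127), the following Python sketch computes ∆P from the four cells of a collocations table. The function names are illustrative assumptions and are not part of the original analysis. Applied to the hubo row of Tab. 20, the formula in (127) as written yields about 0.32, whereas the value printed in that table (0.055) is reproduced by the reverse conditional, P(W|Cx) − P(W|¬Cx), computed as Cell A/(Cell A + Cell C) − Cell B/(Cell B + Cell D). The sketch therefore shows both directions without claiming which one was intended.

```python
def delta_p_construction_given_word(a, b, c, d):
    """Formula (127) as printed: P(Cx|W) - P(Cx|not W)."""
    return a / (a + b) - c / (c + d)


def delta_p_word_given_construction(a, b, c, d):
    """Reverse direction: P(W|Cx) - P(W|other constructions)."""
    return a / (a + c) - b / (b + d)


# Cells A, B, C, D copied from Tab. 20 (sixteenth century).
hubo_cells = (406, 176, 4905, 8149)
hay_cells = (3440, 45, 1871, 5246)

for label, cells in (("hubo", hubo_cells), ("hay", hay_cells)):
    print(
        label,
        round(delta_p_construction_given_word(*cells), 3),
        round(delta_p_word_given_construction(*cells), 3),
    )
```

Both directions rank hay above hubo for these two rows, so the choice of direction affects the size of the scores rather than the ordering of these two forms.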


Therefore, to establish which forms of haber occurred in multiple constructions before the agreeing variant emerged and to investigate whether or not this distribution is still reflected in current usage, I computed ∆P measures for frequency scores culled from two ancillary corpora. For the sixteenth century, I analyzed the Latin American subsection of the Spanish Royal Academy’s Corpus diacrónico del español (Real Academia Española 2008a-) for the period 1492–1600. To investigate current usage patterns, I turned to the twentieth-century section of Corpus del español (Davies 2002-).

Tab. 20: Frequency counts and ∆P for different third-person singular forms of haber in the Latin American section of Corpus diacrónico del español (1492–1600)³

Form        Cell A   Cell B   Cell C   Cell D   ∆P
había       870      2579     4441     5282     -0.164
hubiera     78       142      5233     8511     -0.002
ha habido   6        0        5305     8725     0.001
habría      43       14       5268     8674     0.006
haya        295      330      5016     8106     0.016
habrá       173      134      5138     8424     0.017
hubo        406      176      4905     8149     0.055
hay         3440     45       1871     5246     0.639

³ The following parameters were used for collecting the instances of haber: 1492–1600, Lírica, Narrativa, Breve, Relato breve tradicional, and otros.

Tab. 21: Frequency counts and ∆P for different third-person singular forms of haber in the twentieth-century section of Corpus del español⁴

Form        Cell A   Cell B   Cell C   Cell D   ∆P
había       4559     21810    23154    14384    -0.438
hubiera     329      3418     27384    32776    -0.083
habría      514      1952     27199    34242    -0.035
haya        1072     1965     26641    34229    -0.016
habrá       971      1025     26742    35169    0.007
ha habido   685      32       27028    36162    0.024
hubo        2334     450      25379    35744    0.072
hay         17249    5542     10464    30652    0.469

⁴ For the searches in Corpus del español, the presentational haber construction was operationalized as follows: third-person singular haber followed directly by an adjective, an article, a determiner, a noun, a numeral, or a pronoun.

For the sixteenth century, Tab. 20 shows that before presentational haber was subject to large-scale variation in Latin American Spanish, its present- and preterit-tense forms occurred primarily in presentational clauses (compare the Cell A and Cell B columns). As a result, these two forms obtain the highest ∆P scores, which are each more than twice as high as the third-highest score. This suggests that the most salient representations of hay and hubo were their entrenched instances of the non-agreeing presentational haber construction. The other tense forms, on the other hand, were either used more productively (spread over more different constructions) or are restricted to a very low frequency in the corpus (N < 100), which indicates that their independent forms probably also constituted their most accessible cognitive representations. Their ∆P measures point in this direction.

Turning now to the twentieth century, Tab. 21 provides an overview of the frequency readings and ∆P measures that were obtained for the different tense forms of haber in Corpus del español. The table shows that, as was the case for the sixteenth century, two large groups can be distinguished: one group is formed by the present tense hay and the preterit tense hubo, which are most strongly associated with the non-agreeing presentational haber construction. The other group unites all other forms, which are either not at all associated with the non-agreeing presentational haber construction or occur too infrequently to assume that they are stored as entrenched instances of any construction (in the case of habrá, habría, hubiera, and, especially, ha habido). The table also shows that, when compared to the sixteenth century, the depth of the entrenchment of hubo and hay in the non-agreeing presentational haber construction has remained largely stable. These data suggest two relevant types of presentational haber expressions:

– Synthetic expressions⁵ in the present and preterit tense
– All other expressions⁶

Fig. 8: Effect of tense on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan

⁵ That is, without aspectual or modal auxiliaries.
⁶ Present- and preterit-tense expressions involving aspectual or modal auxiliaries (e.g., puede haber ‘there can beSG’, deben haber ‘there have to be’) and the periphrastic future (e.g., va a haber ‘there is going to be’) were also included in the latter group.

Turning now to the results for this predictor, Tab. 16-Tab. 18 and Fig. 8 reveal that the agreeing presentational haber construction is unlikely to be used with the present or preterit when the conceptual import can be coded with the entrenched hay or hubo instances of the non-agreeing construction (i.e., when no aspectual or modal auxiliary constructions are required). This is illustrated in example (128), where the speaker simply points out that, in the past, there have been tsunamis in San Juan.

(128) Y aquí hubo, este, maremotos (SJ04M22/SJ493).
‘And here, er, there was tsunamis.’

For presentational haber expressions that involve other tenses and/or aspectual or modal auxiliaries, in turn, Tab. 16-Tab. 18 and Fig. 8 show that the non-agreeing presentational haber construction is unlikely to be used. This is also evident from contrastive examples such as (129).

(129) Los carnavales es una convocatoria que hace el Estado donde dice: “Van a haber carnavales, porque van a pasar unas carrozas y, y hay ven, puntos de venta de cerveza.” Y la gente va (LH08H12/LH991-LH992).
‘The carnivals is a call that the state puts out, in which it says: “There are going to be carnivals, because some wagons are going to pass by and, and there is, outl, beer outlets.” And the people attend.’

Additionally, in Section 4.2.2.2, hypothesis 2b claimed that statistical preemption would not apply to more complex conceptualizations involving aspectual or modal auxiliaries (see example [130]), as these cannot be encoded with the entrenched hay or hubo instances.

(130) Pueden haber expresiones que, que tengan una acepción y una connotación diferentes en el Cibao que las que tienen aquí, en, en, en, en el Sur (SD20H12/RD2706).
‘There can bePL expressions that, that may have a different meaning and connotation in the Cibao from those that they have here, in, in, in, in the South.’

Tab. 22 and Fig. 9 suggest that this is the case, because in expressions with aspectual or modal auxiliaries, agreeing presentational haber is used as frequently with the present and the preterit tense as with other tenses, as was already observed in earlier investigations (Hernández-Díaz 2006: 1150; Quintanilla-Aguilar 2009: 164–165).


Fig. 9: Present- and preterit-tense tokens of presentational haber in Havana, Santo Domingo, and San Juan, by absence/presence of aspectual or modal auxiliary constructions

Tab. 22: Present- and preterit-tense tokens of presentational haber in Havana, Santo Domingo, and San Juan, by absence/presence of aspectual or modal auxiliary constructions

Type of expression                                   Havana            Santo Domingo     San Juan
                                                     N         %       N         %       N         %
Presentational haber expressions in the present      115/795   14.5    140/739   18.9    62/641    9.7
  and preterit tense without auxiliary constructions
Presentational haber expressions in the present      133/206   64.6    92/143    64.3    84/124    67.7
  and preterit tense involving auxiliary constructions

Additionally, although present- and preterit-tense presentational haber expressions not involving aspectual or modal auxiliaries were consistently binned together in the tables, this is not to say that both types display similar agreement rates. Rather, Tab. 23 indicates that, in synthetic expressions, presentational haber pluralization occurs more often with the preterit than with the present tense. This is especially true for Havana and Santo Domingo, where the frequency of the agreeing variant approximates and crosses, respectively, the 50% threshold. This pattern is readily accounted for in the light of statistical preemption and the ∆P scores in Tab. 20-Tab. 21. Although hubo rarely appears outside of presentational haber expressions in spontaneous discourse, every native speaker of Spanish will have observed it a limited number of times in four constructions: the non-agreeing presentational haber construction, the modal constructions ‘hubo de + infinitive’ and ‘hubo que + infinitive’, and the preterit perfect construction (e.g., hubo hablado ‘had spoken’). In contrast, in every type of discourse, hay only appears in two constructions: the non-agreeing presentational haber construction and the impersonal obligation modal ‘hay que + infinitive’. Consequently, speakers have more evidence that the preterit of presentational haber can occur outside of the non-agreeing presentational haber construction than they have for the present tense. Indeed, the ∆P score that is noted for hay (∆P = 0.469) suggests that the representation of the entrenched non-agreeing hay instance is about six times stronger than that of the corresponding hubo instance (∆P = 0.072). As a result, the preempting effect exerted by the former is much stronger than that exerted by the latter.

For the preterit, Tab. 23 also shows that the division of labor between the agreeing and the non-agreeing presentational haber construction seems to be stricter in San Juan than in Havana and Santo Domingo. That is to say, whereas in San Juan the non-agreeing construction is the dominant variant for construing preterit presentational haber expressions that do not involve aspectual or modal auxiliaries, the agreeing variant is already well on its way to taking over this function in Havana. Moreover, in Santo Domingo, the frequency of the agreeing variant already crosses the 50% threshold. As noted for the absence/presence of negation, the apparent loosening of the restrictions on the use of agreeing presentational haber expressions in Santo Domingo may suggest an ongoing linguistic change.
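The comparison of the two ∆P scores can be made concrete with a small sketch. It assumes the standard one-way contingency definition of ∆P (the probability of an outcome given a cue minus its probability in the absence of that cue); which element counts as cue and which as outcome in Tab. 20-Tab. 21 is not restated here, and the counts passed to the hypothetical helper `delta_p` are invented for illustration only. The last lines simply divide the two scores reported in the text.

```python
def delta_p(a, b, c, d):
    """One-way contingency Delta-P = P(outcome | cue) - P(outcome | no cue).

    a: cue present, outcome present    b: cue present, outcome absent
    c: cue absent,  outcome present    d: cue absent,  outcome absent
    """
    return a / (a + b) - c / (c + d)

# Hypothetical counts, for illustration only (not corpus data).
example = delta_p(a=90, b=10, c=300, d=600)

# Ratio of the two scores reported in the text for hay and hubo.
ratio = 0.469 / 0.072
print(round(example, 3), round(ratio, 1))
```

On these figures, the ratio comes out at roughly 6.5, which is the basis for describing the hay representation as about six times stronger.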


Tab. 23: Present- and preterit-tense tokens of presentational haber without aspectual or modal auxiliary constructions in Havana, Santo Domingo, and San Juan

| Tense | Havana N | Havana % | Santo Domingo N | Santo Domingo % | San Juan N | San Juan % |
| --- | --- | --- | --- | --- | --- | --- |
| Present | 8/566 | 1.4 | 24/521 | 4.6 | 21/433 | 4.8 |
| Preterit | 107/229 | 46.7 | 116/218 | 53.2 | 41/208 | 19.7 |
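As a quick consistency check, the percentages in Tab. 23 can be recomputed from the token counts, and the present and preterit counts can be summed to recover the first row of Tab. 22. The sketch below assumes that each N cell gives agreeing tokens over all tokens; the names `TAB_23`, `rate`, and `pooled` are introduced here for illustration.

```python
# Agreeing tokens / all tokens, copied from Tab. 23 (synthetic expressions only).
TAB_23 = {
    "Havana": {"present": (8, 566), "preterit": (107, 229)},
    "Santo Domingo": {"present": (24, 521), "preterit": (116, 218)},
    "San Juan": {"present": (21, 433), "preterit": (41, 208)},
}

def rate(agreeing, total):
    """Percentage of agreeing tokens, rounded to one decimal as in the tables."""
    return round(100 * agreeing / total, 1)

def pooled(city):
    """Sum present and preterit counts; this should match the first row of Tab. 22."""
    cells = list(TAB_23[city].values())
    return sum(a for a, _ in cells), sum(n for _, n in cells)

for city, tenses in TAB_23.items():
    print(city, {t: rate(*c) for t, c in tenses.items()}, pooled(city))
```

The pooled counts show how binning the two tenses together hides the gap between them: Havana's pooled rate of 14.5% averages a present-tense rate of 1.4% and a preterit rate of 46.7%.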

Contrary to the analysis that was developed in this section, Waltereit and Detges (2008: 27) state that

[i]n spoken language, presentational constructions are most frequently used in the present tense. Hence, of all the forms of haber + NP, the irregular present tense hay + NP is the most solidly entrenched one. Reanalysis based on low frequency will therefore more likely occur in non-present tenses. … [T]he tense most affected by this reanalysis is indeed the imperfect, which is the least frequent of the two Spanish past tenses.

However, without positing a competition between two construction schemas, entrenched instances, and differing strengths of statistical preemption, it is difficult to explain why the agreeing presentational haber construction is used less frequently with the preterit vis-à-vis other non-present tenses. In addition, Waltereit and Detges’s (2008: 27) analysis starts from the wrong premises. That is, although it is true that, on aggregate, the imperfect is less frequent than the preterit tense (respectively, 194,344 and 199,419 tokens in the twentieth century section of Davies 2002-), the differences are minimal (respectively, 49% and 51% of past-tense tokens). Additionally, a completely different picture is found for presentational haber expressions, which occur most often with the simple present and imperfect tense (Brown and Rivas 2012: 79; Rivas and Brown 2013: 115).

Another analysis ascribes the disfavoring effect of the preterit vis-à-vis all other non-present tenses to a differing degree of morphophonological contrast between agreeing and non-agreeing forms. That is, whereas establishing agreement with preterit hubo involves the addition of the two-syllable morpheme –ieron, for other tenses, pluralization only involves the addition of –n. Therefore, the argument goes, preterit-tense agreeing presentational haber would be more salient, which is why speakers would tend to avoid it (e.g., Bentivoglio and Sedano 2011: 174; Hernández-Díaz 2006: 1151). Yet, if the effect of the verb tense were somehow due to differing degrees of morphophonological contrast between the agreeing and non-agreeing forms of presentational haber across tenses rather than to statistical preemption, then we would not expect to find that auxiliary constructions favor the agreeing presentational haber construction, as these display the same contrasts. In other words, the true constraint imposed by the verb tense seems to be statistical preemption, that is, that more specific items are preferentially produced over items that are licensed but are represented more abstractly, as long as the items share the same semantic and pragmatic constraints (Goldberg 2006a: 94).

In the following section, I will examine the effects of structural priming on presentational haber pluralization.

7.4 Structural priming

In Section 4.2.2.3, hypothesis 3 claims that the residual activation of the agreeing or the non-agreeing presentational haber construction after earlier activation could influence the spreading-activation competition between the two construction schemas. In this regard, psycholinguistic experiments have shown priming effects to last for at least ten intervening clauses (Bock and Griffin 2000: 186; Bock et al. 2007: 452; Pickering and Ferreira 2008: 447), to be stronger when prime and target overlap lexically (Pickering and Ferreira 2008: 440–441), and to occur from comprehension to production and from production to production (Bock et al. 2007: 454; Pickering and Ferreira 2008: 440–441). Therefore, the data were coded for the type of last token that was provided by the interviewer (comprehension-to-production priming) and the participant (production-to-production priming) and the number of conjugated verbs that occur between these tokens and the case at hand. While coding, the occurrences were binned together in five-clause lag groups⁷ up until reaching a twenty-clause lag, and the occurrences in which participants repeated the verb form and the presentational haber construction were separated from those in which they only repeated the construction. This resulted in a total of 17 levels for both predictors. However, as the initial results displayed a similar priming effect for all lag conditions, independently of whether or not participants would repeat the same verb form, the levels were collapsed into the following broader categories:

– First occurrence/distance 20+ clauses
– Primed with the agreeing presentational haber construction
– Primed with the non-agreeing presentational haber construction

|| 7 For example, lag: 0–4 clauses; 5–9 clauses, etc.
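The coding scheme described above can be sketched as two small functions. This is one reading of the description, not the coding protocol that was actually used: it assumes four five-clause lag bins (0–4 to 15–19), crossed with the type of prime and with whether the verb form was repeated, plus a single level for first occurrences and lags of twenty clauses or more. The function names and level labels are hypothetical. Under these assumptions, the detailed scheme yields the 17 levels mentioned in the text.

```python
def fine_level(prime, lag, same_form):
    """Detailed coding. prime is 'agreeing', 'non-agreeing', or None (no earlier
    token); lag counts the conjugated verbs between the prime and the target."""
    if prime is None or lag >= 20:
        return "first occurrence/distance 20+ clauses"
    low = 5 * (lag // 5)  # lower bound of the bin: 0, 5, 10, or 15
    form = "same form" if same_form else "different form"
    return f"{prime}, lag {low}-{low + 4}, {form}"

def collapsed_level(prime, lag):
    """Broader three-way coding used once the detailed levels were collapsed."""
    if prime is None or lag >= 20:
        return "first occurrence/distance 20+ clauses"
    return f"primed with the {prime} presentational haber construction"

# Enumerate every combination to count the distinct detailed levels.
levels = {fine_level(p, lag, s)
          for p in ("agreeing", "non-agreeing", None)
          for lag in range(0, 25)
          for s in (True, False)}
print(len(levels))
```

The count follows from 2 prime types × 4 lag bins × 2 verb-form conditions, plus the single unprimed level.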


Fig. 10: Effect of production-to-production priming on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan

Fig. 11: Effect of comprehension-to-production priming on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan

Turning now to the results that were obtained for these predictors, Tab. 16-Tab. 18 and Fig. 10-Fig. 11 show that whenever speakers have used an agreeing presentational haber clause, they are more prone to use another one. This is the case whether or not they repeat the same verb form, at least if the next variable context is situated within a twenty-clause range. The same results were obtained for the non-agreeing presentational haber construction. Similarly, when speakers have processed an agreeing presentational haber clause, they are more likely to utter an expression based on the agreeing construction pattern and vice versa. This supports the hypothesis that presentational haber pluralization is subject to structural priming.


Priming effects also seem to account for the sporadic cases in which the verb agrees with a direct-object pronoun, exemplified in (131).

(131) Interviewer: ¿Este, tú piensas que pueden haber diferencias entre las regiones del país en cuanto a comida?
Participant: Bueno, los, t, tienen que haberlas, porque, por ejemplo, en el Sur se comen más granos (SD19M21/RD2551).
Interviewer: ‘Er, do you think that there can bePL differences between the regions of the country regarding food?’
Participant: [Well, the, t, there have to bePL themACC, because, for example, in the South they eat more grains.]
Participant: ‘Well, the, t, there have to bePL, because, for example, in the South they eat more grains.’

That is, Tab. 24 (like example [131]) shows that the vast majority of the examples of this type occur in contexts directly following an agreeing presentational haber expression (Havana: 73.5% N=36/49; Santo Domingo: 82.8% N=24/29; San Juan: 71.0% N=22/31). Hence, rather than constituting strong evidence arguing against the idea that the noun phrase of agreeing presentational haber acts as its subject, these results may suggest that priming effects cause individuals to reanalyze the direct-object pronoun (a syntactically motivated class of pronouns) as a hearer-new subject pronoun (a pragmatically motivated class of subject pronouns). Still, this appears to be an online phenomenon, because some participants use this agreement pattern multiple times, whereas others do not use it at all. Let us now consider how the three cognitive constraints jointly determine speakers’ use of agreeing and non-agreeing presentational haber.

Tab. 24: Presentational haber tokens that co-occur with object pronouns in Havana, Santo Domingo, and San Juan, by production-to-production priming and comprehension-to-production priming

| Type of last occurrence | Havana N | Havana % | Santo Domingo N | Santo Domingo % | San Juan N | San Juan % |
| --- | --- | --- | --- | --- | --- | --- |
| First occurrence/distance 20+ clauses | 0/2 | 0.0 | 1/2 | 50.0 | 0/2 | 0.0 |
| Non-agreeing presentational haber construction | 13/70 | 18.6 | 5/49 | 10.2 | 9/47 | 16.1 |
| Agreeing presentational haber construction | 36/59 | 61.0 | 24/54 | 44.4 | 22/48 | 45.8 |
| Total | 49/131 | 37.4 | 30/105 | 28.6 | 31/106 | 29.2 |


7.5 Interaction between the linguistic predictors

Up until now, the discussion has been concerned with the way the individual cognitive constraints shape presentational haber pluralization when they are considered jointly with the others, with social constraints, and with the random variation due to individual participants and nouns. What has not been considered is the way these cognitive constraints work in tandem to promote one of the variants or, conversely, interact to cancel each other’s effect. As Tagliamonte and Baayen (2012: 163–164) observe, disentangling this complex interplay of constraints goes beyond the capabilities of a mixed-effects regression model (Harrell 2001: 33–35), but conditional inference tree models are very well suited for such a task. In the conditional inference tree models displayed in Fig. 12-Fig. 14, I only included the linguistic predictors that contributed to an optimal fit of the regression models. In general terms, the conditional inference tree models suggest that statistical preemption constitutes the most important cognitive constraint on the variation, because the verb tense forms the topmost branching node in the three figures.

For Havana, the left-hand side of Fig. 12 (nodes [2] and [3]) also displays a complex interaction between the absence/presence of negation, production-to-production priming, and the verb tense. Particularly, in expressions not involving synthetic present- or preterit-tense presentational haber, the absence/presence of negation only imposes constraints in contexts without priming or in contexts primed by the speaker with non-agreeing presentational haber. In these environments, the absence of negation attenuates the tendency to use this variant (bar plots under nodes [4] and [5]). Nodes [2] and [6] also unveil an interaction between comprehension-to-production priming, production-to-production priming, and the verb tense. Particularly, in expressions not involving the synthetic present or preterit, comprehension-to-production priming is only a relevant constraint in contexts primed by the participant with agreeing presentational haber. In these cases, the rates of presentational haber pluralization are highest when both types of structural priming align with each other in favor of the agreeing variant (bar chart under node [8]). This is exemplified in (132), where both the participant and the interviewer use an agreeing presentational haber expression before the participant utters the last agreeing clause.


(132) Interviewer: ¿Este, y durante tu, el tiempo que tú llevas aquí, este, han habido muchos cambios aquí?
Participant: ¿Cambios, a, en qué sentido?
Interviewer: ¿No sé, este, que han venido muchas personas nuevas, este, que se ha, bueno, no sé? ...
Participant: Han, han habido m, han habido serios cambios. Por la parte social, e, por la parte íntima mía. Amistades, he tenido que hacer amistades nuevas durante muchos años porque casi todos mis amigos han, se han marchado. Unos han ido pa’ el Norte, y otros pa’ el Sur. Totalmente tengo que hacer amistades nuevas. E, proyectos. ¿Cómo no? En el 2007, e, estuve compartiendo con una delegación también, que, donde habían varios estudiantes de diversos países: italianos, franceses, holandeses, que ellos vinieron con un proyecto (LH20H12/LH2699-LH2702).
Interviewer: ‘Er, and, during your, during the time that you have been living here, er, have there been a lot of changes around here?’
Participant: ‘Changes, in, in what sense?’
Interviewer: ‘I don’t know, er, that there have come a lot of new people, er, that they have, well, I don’t know.’ …
Participant: ‘There have, there have been, m, there have been serious changes. For the social part, er, intimately. Friends, I’ve had to make new friends for many years, because almost all of my friends have, have left. Some have gone North, others have gone South. Totally, I have to make new friends. Er, projects. Sure. In 2007, er, I was hanging out with a delegation as well, where, where, there were various students of different countries: Italians, French, Dutch. They came with a project.’

In contrast, the right-hand side of Fig. 12 displays no such interaction. Moreover, comprehension-to-production priming does not even seem to be a relevant factor for synthetic presentational haber expressions in the present and the preterit tense.
Rather, nodes [9] and [13] suggest that production-to-production priming works in tandem with typical action-chain position to promote the agreeing presentational haber construction for this type of expression, as the rates of use of this variant are highest when both predictors line up in its favor (bar plot under node [17]). This is illustrated by the excerpt from the story-reading task cited in example (133), where the use of the agreeing presentational haber clause hubieron unos ladrones ‘there were some thieves’ is favored by both the participant’s earlier use of hubieron and the fact that ladrones ‘thieves’ is a typical action-chain head.

(133) Sí que ayer, hubieron dos lobos que querían devorarme, anteayer hubieron unos ladrones que trataban de matarme y ha habido dos veces que yo tenía que brincar un abismo de treinta pies de ancho y todo esto fue muy molesto, pero miedo como tal no tuve” (LH02M12/LH191-LH193).
‘But yes, yesterday, there were two wolves that wanted to devour me, the day before yesterday, there were some thieves that tried to kill me and there has been two times that I had to jump a gap thirty feet wide and all of this was really annoying, but fear as such I didn’t have.’

Finally, node [14] shows that the absence/presence of negation only imposes constraints for synthetic present- and preterit-tense expressions in environments primed by the speaker with non-agreeing presentational haber or in contexts where no priming effects hold. In these contexts, the bar chart under node [16] indicates that the absence of negation moderates the tendency to use non-agreeing presentational haber, as is exemplified in (134).

(134) En Sagua La Grande hubo muertos. ¿Cómo no? Hubieron muertos, sí. Y después en el mil novecientos cuarenta y cuatro, también un ciclón que hubo aquí en La Habana. También fue, bueno, grandísimo (LH06H21/LH690).
‘In Sagua La Grande there was casualties. Sure! There were casualties, yes. And afterwards, in nineteen forty-four, also a hurricane that there has been here in Havana. It was also, well, huge.’


Fig. 12: Conditional inference tree model showing the interaction of linguistic predictors in Havana⁸

|| 8 Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.81.
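To give an impression of why the topmost node of such a tree is read as the most important constraint, the following sketch illustrates the variable-selection step at a single node: each predictor is tested for association with the response by permutation, and the node splits on the predictor with the smallest adjusted p-value, or not at all if none is significant. This is a deliberately simplified illustration, not the procedure behind Fig. 12. It assumes categorical predictors, a Pearson chi-square statistic, Monte Carlo permutation, and a Bonferroni adjustment, and the toy tokens and all function names are hypothetical.

```python
import random
from collections import Counter

def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((joint[(x, y)] - px[x] * py[y] / n) ** 2 / (px[x] * py[y] / n)
               for x in px for y in py)

def permutation_p(xs, ys, rng, n_perm=999):
    """Monte Carlo p-value: share of shuffled responses at least as associated."""
    observed = chi_square(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += chi_square(xs, shuffled) >= observed
    return (hits + 1) / (n_perm + 1)

def choose_split(data, response, predictors, alpha=0.05, seed=1):
    """Return the predictor with the smallest adjusted p-value (None if it is
    not significant at alpha), together with all adjusted p-values."""
    rng = random.Random(seed)
    ys = [row[response] for row in data]
    pvals = {}
    for name in predictors:
        xs = [row[name] for row in data]
        # Bonferroni adjustment for testing several predictors at one node.
        pvals[name] = min(1.0, len(predictors) * permutation_p(xs, ys, rng))
    best = min(pvals, key=pvals.get)
    return (best if pvals[best] <= alpha else None), pvals

# Hypothetical toy tokens (not corpus data): tense is tied to the variant,
# negation is perfectly balanced across variants.
toy = ([{"tense": "present/preterit", "negation": "absent", "variant": "non-agreeing"}] * 20
       + [{"tense": "present/preterit", "negation": "present", "variant": "non-agreeing"}] * 20
       + [{"tense": "all others", "negation": "absent", "variant": "agreeing"}] * 20
       + [{"tense": "all others", "negation": "present", "variant": "agreeing"}] * 20)
best, pvals = choose_split(toy, "variant", ["tense", "negation"])
print(best, pvals)
```

A full tree would repeat this step recursively within each resulting subset, which is how a predictor such as negation can turn out to matter only in some branches.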


For Santo Domingo, the left-hand side of Fig. 13 (nodes [2], [3], and [6]) displays a similar pattern of interaction for synthetic present- and preterit-tense presentational haber. The right-hand side reveals two additional interactions. First, nodes [9] and [10] suggest an interaction between comprehension-to-production priming, production-to-production priming, and the verb tense. That is, for the ‘all others’ group of expressions, the first predictor only imposes constraints in unprimed contexts or in contexts primed by the speaker with non-agreeing presentational haber. In these cases, comprehension-to-production priming is able to cancel production-to-production priming (bar plot in node [11]). This is evident from interview excerpts such as the one provided in example (135), where the speaker appears to be insensitive to the priming effect that one would expect to result from her earlier use of hay. However, at the same time, the fact that comprehension-to-production priming is only relevant for this restricted subset of the data suggests that this predictor has a less profound impact than production-to-production priming.

(135) Interviewer: ¿Y han habido, o sea, cuando usted, t, o sea, me podría nombrar cinco cosas que existen hoy y que no habían cuando usted era niña? ¿Acá en la ba, en el barrio?
Participant: ¿Cómo así? ¿Cómo así?
Interviewer: ¿Este, como por ejemplo que en, a, edificios que, que, que, que pusieron, remodelaciones, e, restaurantes?
Participant: Aja okay, que no habían cuando yo era niña. Okay. Hay muchas cosas que no habían (SD10M21/RD1151-RD1153).
Interviewer: ‘And have there been, that is, when you, that is, could you name me five things that exist today and that there weren’t when you were a child, here in the nei, in the neighborhood?’
Participant: ‘Like what? Like what?’
Interviewer: ‘Er, like, for example, that in, a, buildings that that they pu, that, that they have put, remodeling, er, restaurants?’
Participant: ‘Aha, okay, that there weren’t when I was a child. Okay, there is a lot of things that there weren’t.’

Second, nodes [9] and [13] suggest an interaction between production-to-production priming, tense, and typical action-chain position. Specifically, with non-present, non-preterit expressions or expressions involving auxiliary constructions, the noun’s typical action-chain position only influences speakers when they have used agreeing presentational haber before. Still, markedness of coding and production-to-production priming appear to work in tandem, because the rates of presentational haber pluralization are highest when typical action-chain position and structural priming line up in favor of the agreeing presentational haber construction (bar plot in node [15]), as illustrated in example (136).

(136) Porque hubieron sitios que, que habían persecuciones todavía. Habían unas gentes muy malas, que el presidente cuando eso era Balaguer (SD03H21/RD346-RD348).
‘Because there were places where, where there were still persecutions. There were some very bad people, that the president back then was Balaguer.’


Fig. 13: Conditional inference tree model showing the interaction of linguistic predictors in Santo Domingo⁹

|| 9 Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.81.
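The C-index reported in the notes to the tree figures is a concordance measure: for a binary outcome, it is the share of all pairs made up of one agreeing and one non-agreeing token in which the model assigns the higher predicted probability to the agreeing token, with ties counted as one half. A value of 0.5 corresponds to chance and 1.0 to perfect discrimination. The following minimal sketch computes it by brute force; the predicted probabilities are hypothetical and the function name is introduced here for illustration.

```python
def c_index(probabilities, outcomes):
    """Share of (agreeing, non-agreeing) token pairs in which the agreeing token
    received the higher predicted probability; ties count as one half."""
    pos = [p for p, y in zip(probabilities, outcomes) if y == 1]
    neg = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcomes must be present")
    score = sum(1.0 if p > q else 0.5 if p == q else 0.0
                for p in pos for q in neg)
    return score / (len(pos) * len(neg))

# Hypothetical predictions for six tokens (1 = agreeing, 0 = non-agreeing).
probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(c_index(probs, labels))
```

In this toy case, eight of the nine pairs are ordered correctly, so the index is 8/9.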


Turning now to San Juan, on the left-hand side of Fig. 14, nodes [3], [4], and [7] suggest an interaction between the two modalities of structural priming and typical action-chain position with tenses other than the synthetic present and preterit. Particularly, the bar plots in nodes [5], [6], [8], and [9] suggest that both modalities of structural priming and typical action-chain position reinforce each other with these tenses. For instance, in example (137), the earlier use of the agreeing presentational haber construction by the interviewer and the speaker, together with the fact that muchachos ‘kids’ is a typical action-chain head, probably tipped the balance in favor of this construction.

(137) Interviewer: ¿Y que tú recuerdes, habían más padres como los tuyos, los tuyos?
Participant: E, ¿que yo recuerde? Pues en el internado había de todo. Habían estudiantes que tenían unos padres que no existían, que las cuidaban las nanas, los cuidaban los… Habían unos much, muchachos de mucho dinero (SJ04M22/SJ454-SJ457).
Interviewer: ‘And, as far as you remember, were there more parents like yours, yours?’
Participant: ‘Er, as far as I remember? Well, in the boarding school, there was a bit of everything. There were students that had parents that didn’t exist, who were looked after by nannies, they were looked after by… There were ki, kids who came from a lot of money.’

Similarly, nodes [2] and [10] suggest that the tendency to use the agreeing presentational haber construction in contexts primed by the speaker with this variant is reinforced by the absence of negation (bar plot in node [12]), as in example (138).

(138) Y habían de aquí. De Puerto Rico, habían dos matrimonios, tres, tres matrimonios y no, no nos conocíamos porque eran de la isla, de por ahí (SJ15M21/SJ1853).
‘And there were from here. From Puerto Rico there were two couples, three, three couples and we didn’t, we didn’t know each other, because they weren’t from San Juan, from around there.’

In turn, the right-hand side of Fig. 14 shows that, for synthetic expressions in the simple present and preterit tense, production-to-production priming appears to operate more independently (node [13]), because in contexts primed by the speaker with the agreeing presentational haber construction, neither the noun’s typical action-chain position nor comprehension-to-production priming imposes constraints (bar plot in node [14]). In example (139), for instance, the interviewer’s earlier mention of hay and the fact that fiesta patronal ‘patron saint celebration’ refers to a typical action-chain setting do not cause the speaker to use an expression based on the non-agreeing construction. Rather, she continues with the agreeing presentational haber construction, which she had already used multiple times before in the immediate context.

(139) Participant: Se pueden comer en todos los momentos, porque, por lo menos en mi casa hayn pasteles todos, toda la sem, todo el año. Pero, este, sí, hayn platos como que es, específicos de diciembre.
Interviewer: ¿Y que tú recuerdes siempre ha sido así o han habido cambios a este respecto?
Participant: Pues, e, cuando yo era más pequeña se mataba el lechón en casa, mi casa de mi abuela. Se compró todos los, lechones y se mataban allí, y allí los hacían.
Interviewer: ¿Y los asaban?
Participant: Y los, exactamente, ahora no, ahora, pues, ellos los compran hechos.
Interviewer: ¿Y hay otras tradiciones por acá, este, fiestas patronales, carnavales?
Participant: Aquí hayn fiestas patronales en todos los municipios (SJ05M12/SJ653-SJ657).
Participant: ‘They can be eaten at all times, because, at least at my home, there are pasteles every, all week, all year round. But, er, yes, there are dishes that, spe, specific of December.’
Interviewer: ‘And as far as you remember, has it always been like that or have there been changes in this regard?’
Participant: ‘Well, er, when I was smaller, they killed the suckling pig at home, my home of my grandmother. They bought every, suckling pigs and they killed them over there and there they made them.’
Interviewer: ‘And you grilled them?’
Participant: ‘And them, exactly, not nowadays, nowadays, they buy them ready-made.’
Interviewer: ‘And is there other traditions around here, er, Patron Saint celebrations, carnivals?’
Participant: ‘Here there are patron saint celebrations in every town.’


Finally, nodes [13] and [15] suggest an interaction between the absence/presence of negation, production-to-production priming, and tense. Specifically, for synthetic expressions in the present and preterit tense, the absence/presence of negation is only a relevant constraint in unprimed contexts or contexts primed by the speaker with non-agreeing presentational haber. In these cases, the absence of negation attenuates the tendency to use the non-agreeing presentational haber construction (bar plot in node [17]).


Fig. 14: Conditional inference tree model showing the interaction of linguistic predictors in San Juan¹⁰

|| 10 Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.83.


The previous discussion has revealed similar patterns of interaction for Havana, Santo Domingo, and San Juan. Firstly, in the three speech communities the frequency of agreeing presentational haber is consistently higher when all predictors align in favor of this variant. Secondly, comprehension-to-production priming and the typical action-chain position of the referent of the noun only impose constraints in contexts primed by the speaker with one of the variants. For Havana and San Juan, this is also true for the absence/presence of negation. Thirdly, the agreeing construction also occurs more often with synthetic present- and preterit-tense presentational haber when speakers have just used the agreeing construction, especially in utterances that also include a typical action-chain head.

These consistent patterns of interaction between the cognitive constraints suggest an antagonistic relationship in (this case of) morphosyntactic variation between statistical preemption on the one hand and markedness of coding and structural priming on the other. Particularly, statistical preemption emerges from the data as a conservative force, which favors the activation of the entrenched non-agreeing instances with hay and hubo. In contrast, the other two constraints may work either way, promoting at times opposite variants. However, when they do pair up to increase the activation of agreeing presentational haber, they tend to extend this construction to more (and new) conceptual regions. As a result, every time markedness of coding and structural priming tip the balance in favor of the agreeing presentational haber construction for the encoding of a present- or preterit-tense POINTING-OUT conceptualization without aspectual or modal nuances, the use of an expression based on this construction weakens the strength of the representations of the entrenched non-agreeing instances. This, in turn, debilitates their preemptive effect, which, eventually, results in the less constrained use of agreeing present- and preterit-tense forms.

The antagonism between, on the one hand, statistical preemption and, on the other, structural priming and markedness of coding is reminiscent of the roles these cognitive constraints play in language acquisition and innovation. That is, in language acquisition, statistical preemption has been shown to be the mechanism that prevents children from overgeneralizing (Goldberg 2006a: Chap. 5, 2011), whereas structural priming has been argued to promote the extension of perceived structures to new conceptualizations of the same type (Bock and Griffin 2000: 189; Bock et al. 2007: 455–456; Goldberg 2009: 107; Pickering and Ferreira 2008: 449–450). Regarding language innovation, Croft (2000: Chap. 5) argues that the tendency to maximize unmarked coding is the prime motivation for form-function reanalysis, which reforms established constructions or, put differently, overrides their preemptive effect. Let us now consider the relative importance of the cognitive constraints.

7.6 Relative importance of the linguistic predictors

In the previous section, it was already observed that the verb tense/statistical preemption and production-to-production priming emerge from the conditional inference tree models as the most important constraints on presentational haber pluralization. As was explained in Section 5.4.4, the conditional permutation of predictors in a random forest model of the variation can provide more insight into this matter (Tagliamonte and Baayen 2012: 162–164). The results of this statistical procedure are presented in Fig. 15, which shows that only the absence/presence of negation occupies a different relative position in the three panels. Without considering this predictor, the relative ordering for Havana, Santo Domingo, and San Juan is identical. Since identical constraint rankings suggest that the structure of the variation is the same for the three speech communities (Szmrecsanyi 2013; Tagliamonte 2006: 246, 2013), these similarities provide additional support for the claim that domain-general cognitive constraints on the spreading activation of constructions condition presentational haber pluralization.

Additionally, as was already evident from the patterns of interaction documented with the conditional inference tree models, comprehension-to-production priming has a less profound impact on presentational haber pluralization than production-to-production priming. In contrast, previous studies of structural priming performed under laboratory conditions found the size of the priming effect to be comparable (Bock et al. 2007: 452). Let us now summarize the most important findings of this chapter.


Fig. 15: Constraint ranking for the linguistic predictors in Havana, Santo Domingo, and San Juan¹¹
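The general logic behind a permutation-based constraint ranking can be sketched as follows: the values of one predictor are shuffled across tokens, and the resulting drop in predictive accuracy is taken as that predictor's importance. The sketch is a simplification under stated assumptions. It performs plain (unconditional) permutation, whereas the procedure referred to in Section 5.4.4 permutes conditionally within a random forest, and it uses a hypothetical rule-based stand-in (`toy_model`) instead of a fitted model. The tokens are invented for illustration.

```python
import random

def accuracy(model, rows, labels):
    """Share of tokens for which the model predicts the observed variant."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, predictors, n_rep=50, seed=1):
    """Mean drop in accuracy after shuffling one predictor's values across tokens."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    importance = {}
    for name in predictors:
        drops = []
        for _ in range(n_rep):
            column = [r[name] for r in rows]
            rng.shuffle(column)
            permuted = [{**r, name: v} for r, v in zip(rows, column)]
            drops.append(base - accuracy(model, permuted, labels))
        importance[name] = sum(drops) / n_rep
    return importance

# Hypothetical stand-in for a fitted model: it only looks at tense.
def toy_model(row):
    return "agreeing" if row["tense"] == "all others" else "non-agreeing"

# Hypothetical tokens (not corpus data), balanced across tense and negation.
rows = [{"tense": t, "negation": n}
        for t in ("present/preterit", "all others")
        for n in ("absent", "present")] * 10
labels = [toy_model(r) for r in rows]

scores = permutation_importance(toy_model, rows, labels, ["tense", "negation"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

Because the stand-in model ignores negation, shuffling that predictor leaves accuracy unchanged, so it ranks last. Comparing such rankings across speech communities is what the text refers to as comparing constraint hierarchies.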

7.7 Summary

After reviewing the overall distribution of agreeing and non-agreeing presentational haber in the three samples, the structure of the corpus, and the architecture and performance of the regression models, in this chapter, I have presented a series of quantitative analyses designed to test the claim that domain-general cognitive constraints on the spreading activation of variant constructions condition presentational haber pluralization. First, I have argued that nouns that refer to typical action-chain heads can be considered as more conceptually prominent, which favors the activation of the agreeing presentational haber construction over that of the non-agreeing presentational haber construction. Indeed, the regression results suggest that in Havana, Santo Domingo, and San Juan, typical action-chain heads favor presentational haber pluralization. In addition, it was argued that the presence of negation decreases the likelihood that speakers will attend strongly to the referent of the noun phrase. Indeed, in Havana and San Juan, speakers are less likely to select the agreeing presentational haber construction when negation is present. For Santo Domingo, in turn, the absence/presence of negation did not turn out to impose constraints on the variation. As this variety also displays the highest overall rates of pluralization, these findings may suggest that the agreeing presentational haber construction has invaded the nonspecific indefinite conceptual region in Santo Domingo.

¹¹ Co.2.Pr: Comprehension-to-production priming; Pr.2.Pr: Production-to-production priming; T.A.C.P: Typical action-chain position.


Still, the data for this and the previous predictor support the first hypothesis: speakers tend to encode more prominent nominal arguments as subjects with the agreeing presentational haber construction. This result supports the claim that presentational haber pluralization is constrained by markedness of coding. Regarding the influence of verb tense, this chapter has shown that the tendency to establish plural agreement with presentational haber less often in synthetic present- and preterit-tense expressions supports hypotheses 2a-b, as these forms were used predominantly in (non-agreeing) presentational haber clauses before agreeing presentational haber emerged as a conventional alternative to non-agreeing presentational haber. These results support the claim that presentational haber pluralization is constrained by statistical preemption. Additionally, it was shown that participants are more likely to establish agreement with presentational haber when they have just processed or used an agreeing presentational haber construction, regardless of variations in tense, aspect, or mood. This supports the third hypothesis: presentational haber pluralization is subject to structural priming. Exploring the interaction between the linguistic predictors, in turn, has revealed that markedness of coding and structural priming may favor the activation of the agreeing presentational haber construction for synthetic present- and preterit-tense expressions, whereas statistical preemption consistently works against this. Finally, contrasting the constraint rankings of the three varieties, we have seen that all linguistic predictors essentially have the same relative impact on the variation, with one exception: the absence/presence of negation.
These striking similarities between the regression results, the patterns of interaction, and the constraint hierarchies obtained for the three speech communities illustrate that the structure of the variation is virtually identical in Havana, Santo Domingo, and San Juan, which provides further support for the claim that domain-general cognitive constraints condition presentational haber pluralization. In the next chapter, I will examine how presentational haber pluralization correlates with age, educational achievement, and gender and how the cognitive constraints apply across these social distinctions.


Chapter 8: Social constraints on presentational haber pluralization

8 Social constraints on presentational haber pluralization

The results obtained for the linguistic predictors in the previous chapter support the claim that markedness of coding, structural priming, and statistical preemption constrain the activation competition between the non-agreeing and the agreeing presentational haber constructions. In this chapter, I will continue to explore how these cognitive constraints condition presentational haber pluralization. However, whereas the previous chapter focused on the differences and similarities between Havana, Santo Domingo, and San Juan, in this chapter, our attention will shift towards variation between age, educational achievement, and gender groups belonging to the same speech community. Specifically, in Section 8.1, I will start by exploring the regression results for age, educational achievement, and gender. This will allow me to answer the question of whether presentational haber pluralization can be considered an ongoing language change from below. In Section 8.2, using a new set of conditional inference tree models, I will examine the way specific social groups react to the linguistic predictors. In Section 8.3, I will investigate the relative importance of the linguistic predictors within the relevant subgroups. The chapter concludes with a summary in Section 8.4.

8.1 Social constraints: Age, education, and gender

Tab. 25–Tab. 27 show that the mixed-effects models do not support the claim that speakers’ age contributes to explaining the variation. Indeed, when alternative regression models are fit to the three data sets, Fig. 16 displays a flat age distribution for agreeing and non-agreeing presentational haber in Havana, Santo Domingo, and San Juan. This result is already implicit in the working hypothesis, which, against the background of the studies reviewed in Section 2.2, describes the phenomenon as a slowly advancing language change from below, that is, one that might be too slow to be observed in apparent time.

Tab. 25: Logistic generalized linear mixed-effects model of presentational haber pluralization in Havana (sum contrasts, bobyqa optimizer)

Predictors                      N          %       Coefficient   2.5%     97.5%
(intercept)                                        -1.043        -1.527   -0.535
Educational achievement
  Less                          512/1041   49.18    0.328        -0.085    0.593
  University                    422/1052   40.11   -0.328         0.085   -0.593
Style
  Interview                     114/450    25.33   -0.589        -0.950   -0.290
  Reading/Questionnaire tasks   820/1643   49.91    0.589         0.950    0.290
Tense*Style (interaction)
  All others: Interview         105/177    59.32    0.487         0.235    0.717
  Present-Preterit: Tasks       106/522    20.31    0.487         0.235    0.717
  All others: Tasks             714/1121   63.69   -0.487        -0.235   -0.717
  Present-Preterit: Interview   9/273      3.30    -0.487        -0.235   -0.717

Random effects                  Variance   Std. dev.
  Nouns                         0.794      0.892
  Speakers                      0.523      0.725

Model summary                   Fixed      Full
  C-index of concordance        0.82       0.89
  Pseudo-R2                     0.43       0.60
  AICc                          2124.5     1974.9
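The C-index of concordance reported in the model summaries can be read as the proportion of (agreeing, non-agreeing) token pairs in which the model assigns the higher predicted probability to the agreeing token. The following minimal Python sketch illustrates that reading; the function name `c_index` and the toy probabilities are hypothetical and are not taken from the corpus or from the software used for the analyses in this book.

```python
from itertools import product


def c_index(probs, outcomes):
    """Share of (agreeing, non-agreeing) token pairs in which the agreeing
    token received the higher predicted probability; ties count one half."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one token of each variant")
    score = 0.0
    for p, n in product(pos, neg):
        if p > n:
            score += 1.0
        elif p == n:
            score += 0.5
    return score / (len(pos) * len(neg))


# Hypothetical toy data (1 = agreeing haber, 0 = non-agreeing haber).
toy_probs = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
toy_outcomes = [1, 1, 0, 1, 0, 0]
print(round(c_index(toy_probs, toy_outcomes), 3))
```

A value of 0.5 corresponds to chance-level discrimination and a value of 1 to perfect discrimination, which is why the values between 0.82 and 0.89 in the tables indicate a good fit.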


Tab. 26: Logistic generalized linear mixed-effects model of presentational haber pluralization in Santo Domingo (sum contrasts, bobyqa optimizer)

Predictors                      N          %       Coefficient   2.5%     97.5%
(intercept)                                        -0.224        -0.553    0.095
Gender
  Female                        450/901    49.94    0.136        -0.045    0.332
  Male                          410/941    43.57   -0.136         0.045   -0.332

Random effects                  Variance   Std. dev.
  Nouns                         0.599      0.774
  Speakers                      0.183      0.427

Model summary                   Fixed      Full
  C-index of concordance        0.82       0.87
  Pseudo-R2                     0.42       0.54
  AICc                          1931.1     1844.0

Tab. 27: Logistic generalized linear mixed-effects model of presentational haber pluralization in San Juan (sum contrasts, bobyqa optimizer)

Predictors                      N          %       Coefficient   2.5%     97.5%
(intercept)                                        -0.974        -1.455   -0.571
Educational achievement
  Less                          324/669    48.43    0.508         0.250    0.763
  University                    360/986    36.51   -0.508        -0.250   -0.763
Gender
  Female                        375/836    44.86    0.261         0.024    0.514
  Male                          309/819    37.73   -0.261        -0.024   -0.514

Random effects                  Variance   Std. dev.
  Nouns                         0.378      0.615
  Speakers                      0.334      0.578

Model summary

Fixed

Full

C-index of concordance

0.85

0.89

Pseudo-R2

0.50

1489.6

AICc

1593.5

1517.9
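To make the sum-coded coefficients more tangible, the sketch below converts the San Juan estimates of Tab. 27 into predicted probabilities by adding the relevant coefficients to the intercept and applying the inverse logit. This is only an illustration under strong assumptions: the random effects and all linguistic predictors not listed in the table are set to zero, so the resulting values are not the rates observed in the corpus. The names `inv_logit` and `predicted_probability` are hypothetical.

```python
import math


def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Coefficients copied from Tab. 27 (San Juan, sum contrasts).
INTERCEPT = -0.974
EDUCATION = {"less": 0.508, "university": -0.508}
GENDER = {"female": 0.261, "male": -0.261}


def predicted_probability(education, gender):
    """Probability of agreeing haber for one education/gender cell,
    assuming every other term in the model contributes zero."""
    return inv_logit(INTERCEPT + EDUCATION[education] + GENDER[gender])


for edu in EDUCATION:
    for gen in GENDER:
        print(f"{edu:10s} {gen:6s} {predicted_probability(edu, gen):.3f}")
```

Because the two levels of each predictor carry coefficients of equal size and opposite sign, the distance between the levels on the log-odds scale is twice the coefficient shown in the table.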


Fig. 16: Effect of age on the log-odds of agreeing presentational haber in Havana, Santo Domingo, and San Juan in alternative regression models

By the same token, hypothesis 6 – which states that occurrence rates of agreeing presentational haber will not decrease when more attention is focused on language – anticipates that the alternations between the two construction schemas will not display any correlations with style. Indeed, Tab. 26, Tab. 27, and Fig. 17 indicate that Dominicans and Puerto Ricans use agreeing and non-agreeing presentational haber in similar ways during the interview and the tasks.

Fig. 17: Effect of style on the log-odds of agreeing presentational haber in Havana (final model), Santo Domingo, and San Juan (alternative models)

In contrast, the leftmost panel of Fig. 17 indicates that Cubans use agreeing presentational haber more often during the tasks, which focus more attention on language than the interview. In addition, as is evident from Tab. 25 and Fig. 18, this increase is mainly due to an interaction between style and tense: Cubans disfavor agreeing haber with the synthetic present and preterit tense in the interview sections of the corpus, but they favor this variant while performing the tasks. This sort of behavior, in which the incoming variant is used more frequently when more attention is focused on language, is typical of changes from below (Labov 1972: Chap. 5).

Fig. 18: Interaction effect of style and tense on the log-odds of agreeing presentational haber in Havana
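The interaction between style and tense can also be inspected directly in the raw counts of Tab. 25. The sketch below, which is not part of the original analysis, computes the log-odds of agreeing haber for each of the four cells and compares the size of the style shift for the two tense groups. These are raw log-odds rather than the partial effects of the mixed-effects model, and the names `CELLS`, `log_odds`, and `style_shift` are hypothetical.

```python
import math

# (agreeing tokens, total tokens) per cell, copied from Tab. 25 (Havana).
CELLS = {
    ("present-preterit", "interview"): (9, 273),
    ("present-preterit", "tasks"): (106, 522),
    ("all others", "interview"): (105, 177),
    ("all others", "tasks"): (714, 1121),
}


def log_odds(agreeing, total):
    """Raw log-odds of agreeing versus non-agreeing haber in one cell."""
    return math.log(agreeing / (total - agreeing))


def style_shift(tense):
    """Change in log-odds when moving from the interview to the tasks."""
    return (log_odds(*CELLS[(tense, "tasks")])
            - log_odds(*CELLS[(tense, "interview")]))


for tense in ("present-preterit", "all others"):
    print(f"{tense:17s} style shift: {style_shift(tense):.2f} log-odds")

# A difference clearly above zero means that the style shift is larger
# for the synthetic present and preterit than for the other tenses.
interaction = style_shift("present-preterit") - style_shift("all others")
print(f"difference between the two shifts: {interaction:.2f}")
```

In these counts, the shift towards agreeing haber in the tasks is much larger for the synthetic present and preterit than for the remaining tenses, which is the pattern that the interaction term in Tab. 25 captures.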

For educational achievement, Tab. 25, Tab. 27, and Fig. 19 indicate that university-educated participants from Havana and San Juan use agreeing presentational haber less frequently. This confirms the expectations put forward by hypothesis 7. However, this does not mean that university-educated speakers from these cities refrain from using agreeing presentational haber altogether. Rather, the relative frequency scores in Tab. 25 and Tab. 27 show that, as earlier studies of Caribbean Spanish had already found (Aleza-Izquierdo 2011; DeMello 1991; López-Morales 1992: 147; Vaquero 1978: 135–140), these speakers also use agreeing presentational haber quite often. In turn, for Santo Domingo, including educational achievement as a predictor proved to increase the AICc score. Indeed, when a regression model including this predictor is fit to the data, the size of the effect of university education is minimal (0.063 log-odds), as is evident from the nearly flat line in the center panel of Fig. 19. This confirms the results of earlier studies of this variety, which have equally shown that, in the Dominican Republic, university education correlates neither with the occurrence rates of agreeing presentational haber nor with speakers’ attitudes towards this variant (Alba 2004; Alvar-López 2000).

Fig. 19: Effect of education on the log-odds of agreeing presentational haber in Havana (final model), Santo Domingo (alternative model), and San Juan (final model)

Tab. 25 also reveals that gender does not contribute to explaining the variation in Havana. Indeed, the leftmost panel of Fig. 20 shows a nearly flat line (representing an effect of 0.078 log-odds in an alternative regression model), which indicates that female and male speakers of this variety use agreeing presentational haber in virtually the same way. For San Juan and Santo Domingo, in turn, Tab. 26 and Tab. 27 suggest that the frequent use of the agreeing presentational haber construction is associated with the female gender role. This is also evident from the two rightmost panels of Fig. 20, which show that in Santo Domingo and San Juan female speakers are more likely to use agreeing presentational haber than male speakers.


Fig. 20: Effect of gender on the log-odds of agreeing presentational haber in Havana (alternative model), Santo Domingo, and San Juan (final models)
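To give an impression of how small the effects of 0.063 and 0.078 log-odds mentioned above are, they can be converted to odds ratios by exponentiation and set against the education coefficient for San Juan in Tab. 27. The sketch below assumes that the reported values are effect sizes on the log-odds scale of the same kind as the table coefficients; the function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(log_odds_effect):
    """Multiplicative change in the odds implied by a log-odds effect."""
    return math.exp(log_odds_effect)


# Values as reported in the running text and in Tab. 27.
effects = {
    "education, Santo Domingo (alternative model)": 0.063,
    "gender, Havana (alternative model)": 0.078,
    "education, San Juan (final model)": 0.508,
}

for label, effect in effects.items():
    print(f"{label}: odds ratio {odds_ratio(effect):.3f}")
```

An odds ratio close to 1 means that the predictor hardly changes the odds of agreeing haber, which matches the nearly flat lines in the corresponding panels of Fig. 19 and Fig. 20.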

These data allow us to explore whether presentational haber pluralization constitutes an ongoing linguistic change from below in Cuban, Dominican, and Puerto Rican Spanish. At this point, recall that typical changes from below correlate with female gender and younger speakers. They may also correlate with more formal speech or display no correlation whatsoever with formality (Labov 2001: Chap. 3; Silva-Corvalán 2001: 248–249). In this sense, even though presentational haber pluralization displays a flat age distribution, the results for style seem to support the change from below hypothesis. The correlation with female gender documented for Santo Domingo and San Juan also points in this direction. Therefore, it seems that the results achieved in this chapter corroborate those of earlier investigations in Latin America (D’Aquino-Ruiz 2008; Díaz-Campos 2003; Fontanella de Weinberg 1992b) and Spain (Blas-Arroyo in press), while at the same time supporting the working hypothesis and hypothesis 9. Let us now consider the way the linguistic predictors condition presentational haber pluralization for the different educational achievement and/or gender groups.

8.2 Linguistic predictors across social groups

So far, this chapter has mainly been concerned with establishing correlations between, on the one hand, presentational haber pluralization and, on the other, participants’ age, educational achievement, and gender. What has not been considered is the way the linguistic predictors apply across education and/or gender groups. If it is the case that – as Chapter 7 gives reason to believe – the alternation between non-agreeing and agreeing presentational haber is constrained by domain-general cognitive constraints on spreading activation, then we would expect these predictors to condition the variation in a rather uniform way for all participants, no matter what their educational achievement or gender might be. To examine whether this is the case, in this section, I will present a new set of conditional inference tree models. In general terms, the conditional inference tree models displayed in Fig. 21–Fig. 23 suggest that the linguistic predictors have identical effects for the two educational achievement and/or gender groups in the majority of the usage contexts, because these predictors consistently motivate low-branching splits. For instance, the left-hand side of Fig. 21 shows that, in Havana, educational achievement is only a relevant constraint in unprimed contexts or in contexts primed by the participant with non-agreeing presentational haber (nodes [2] and [3]). In particular, the interaction between production-to-production priming and the absence/presence of negation noted in Section 7.5 does not apply to university graduates, who generally seem to use agreeing presentational haber less often in these cases (bar plot under node [7]). In contrast, participants who have not enjoyed university education seem to prefer agreeing presentational haber when negation is absent (bar plot under node [6]). Still, both educational achievement groups appear to conform to the overall tendency documented in Section 7.4, because they use agreeing presentational haber less often in unprimed contexts or in contexts primed with non-agreeing presentational haber.


Fig. 21: Conditional inference tree model showing the interaction of linguistic and social predictors in Havana¹

¹ Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.82.


Turning now to Santo Domingo, Fig. 22 suggests that gender only impacts the variation in a limited way. In particular, the left-hand side of the figure indicates that gender differences only exist for synthetic present- and preterit-tense presentational haber expressions. For these expressions, both male and female participants display the interaction noted in Section 7.5 between tense, production-to-production priming, and typical action-chain position (nodes [1], [2], [3], [8]). Yet, in clauses with typical action-chain tails and settings that are situated in contexts primed by the speaker with non-agreeing presentational haber or in contexts without priming, male speakers appear to avoid agreeing synthetic present- or preterit-tense presentational haber (bar plot in node [6]). In contrast, females use agreeing presentational haber more frequently in such contexts (bar plot in node [5]). Still, both genders conform to the overall trend uncovered in Section 7.2.1, as they use agreeing presentational haber less often with typical action-chain tails and settings than with typical action-chain heads.


Fig. 22: Conditional inference tree model showing the interaction of linguistic and social predictors in Santo Domingo²

² Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.81.


For San Juan, the ctree algorithm does not find sufficient motivation for a split along the levels of gender. This confirms that the effect size for this predictor is quite limited, which is also evident from the small effect estimate obtained with the regression model of Tab. 27. With this reservation, the left-hand side of Fig. 23 (nodes [2], [3], and [4]) shows that in expressions not involving synthetic present- or preterit-tense presentational haber, university graduates are much more sensitive to differences in typical action-chain position in unprimed contexts or in contexts primed by the participant with non-agreeing presentational haber. In such environments, university-educated participants use agreeing presentational haber less often than participants without a university degree in clauses with typical action-chain tails and settings (bar plots in nodes [5] and [6]). Additionally, the right-hand side of the figure (nodes [13], [15], and [17]) unveils an interaction between production-to-production priming, education, and typical action-chain position. In unprimed contexts or in contexts primed by the participant with non-agreeing presentational haber, university graduates seem to be much less sensitive to differences in typical action-chain position for synthetic present- or preterit-tense presentational haber, because they use non-agreeing presentational haber nearly consistently (bar plot in node [16]). In contrast, participants without university training do follow the overall tendency documented in Section 7.2.1, as they use agreeing synthetic present- or preterit-tense presentational haber more often in clauses with nouns that refer to typical action-chain heads (bar plot in node [19]).


Fig. 23: Conditional inference tree model showing the interaction of linguistic and social predictors in San Juan³

³ Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.84.


These results suggest that presentational haber pluralization is primarily constrained by the linguistic context. In turn, the consistency of the effects of such conditioning environments across education and gender groups in Havana, Santo Domingo, and San Juan provides further support for the claim that the effects of these environments are reflexes of domain-general cognitive constraints on spreading activation. Still, although the directionalities of the effects remain identical, in the three speech communities, certain educational achievement and/or gender groups appear to be less sensitive to the linguistic predictors that model markedness of coding in unprimed contexts or in contexts primed with non-agreeing presentational haber. Let us now examine whether this means that for these groups the relative importance of the linguistic predictors that model markedness of coding is also different. This will be the topic of the next section.

8.3 Relative importance of the linguistic predictors across social groups

The previous section has shown that educational achievement and gender have a less systematic and profound impact on presentational haber pluralization than the predictors that model markedness of coding, structural priming, and statistical preemption. In this section, we will examine this issue somewhat further, using conditional variable permutation in random forest models that were generated for each of the educational achievement and/or gender groups. For Havana, Fig. 24 reveals that the relative importance of the linguistic predictors for modeling the variation is not at all the same for the two educational achievement groups, who also differ from the overall Havana constraint hierarchy in this respect. The constraint ranking that is obtained for the participants without university education conforms more closely to the community pattern, as this group only diverges from the community in terms of the relative orderings of the predictors that model markedness of coding, i.e., absence/presence of negation and typical action-chain position. In contrast, for the university-educated participants, only the relative orderings of typical action-chain position and comprehension-to-production priming seem to correspond to those of the overall Havana constraint hierarchy. For all other predictors, a different relative importance is obtained.
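One way to put a number on how closely the constraint ranking of a subgroup follows that of the community is a rank correlation such as Kendall's tau, which compares the relative order of every pair of predictors. The sketch below is not part of the analysis reported in this book, which compares the rankings visually, and the two orderings are hypothetical placeholders rather than the rankings shown in the figures. The function name `kendall_tau` is likewise hypothetical.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau for two orderings of the same predictors (no ties):
    (concordant pairs - discordant pairs) / number of pairs."""
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must order the same unique predictors")
    if len(ranking_a) < 2:
        raise ValueError("need at least two predictors")
    position_b = {name: i for i, name in enumerate(ranking_b)}
    concordant = discordant = 0
    # combinations() keeps the order of ranking_a, so x outranks y there.
    for x, y in combinations(ranking_a, 2):
        if position_b[x] < position_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical orderings, from most to least important predictor.
community = ["Tense", "Pr.2.Pr", "Co.2.Pr", "T.A.C.P", "Negation"]
subgroup = ["Tense", "Pr.2.Pr", "Co.2.Pr", "Negation", "T.A.C.P"]
print(kendall_tau(community, subgroup))
```

A value of 1 indicates identical orderings and a value of -1 a completely reversed ordering, so that a subgroup that only swaps two adjacent predictors still obtains a high positive value.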


Fig. 24: Constraint rankings for non-university-educated participants (left) and university-educated participants (center) as compared to the overall Havana constraint ranking (right)⁴

These sharp contrasts are surprising, because any characterization of the constraints that govern sociolinguistic variation – be it Croft’s (2000: 166) notion of ‘shared expertise’, Labov’s (1972: 120) ‘tacit community agreement’, Bresnan’s (2007) ‘Probabilistic Grammar’, or the approach that is being developed in this book – would lead us to expect identical constraint orderings for the two educational achievement groups. A similar pattern has been observed by Lim and Guy (2005), who find that in Singapore English word-final /t, d/ deletion (e.g., tol instead of told) answers to different constraint rankings in formal and informal speech. This leads the authors to propose that in more formal situations speakers self-consciously attempt to adopt a register that conforms more closely to a standard variety. In this light, a plausible explanation of the results could be that in a semi-formal situation such as a recording session with a foreign academic, university-educated participants self-consciously attempt to adopt an academic variety that resembles normative Spanish more closely. The fact that less educated participants – who have less experience with this register – conform more closely to the community pattern suggests that this may be the case. To shed more light on this matter, in Section 9.4, I will zoom in on the individual university-educated speakers of the Havana sample.

Turning now to Santo Domingo, Fig. 25 reveals that both male and female speakers display constraint rankings that follow the overall pattern found at the level of the community.

⁴ Co.2.Pr: Comprehension-to-production priming; Pr.2.Pr: Production-to-production priming; T.A.C.P: Typical action-chain position.


Fig. 25: Constraint rankings for female participants (left) and male participants (center) as compared to the overall Santo Domingo constraint ranking (right)⁵

Fig. 26 reveals a similar picture for the effect of gender in San Juan. The same situation emerges from Fig. 27, which shows that educational achievement has no effect on the constraint rankings of speakers from this city.

Fig. 26: Constraint rankings for female participants (left) and male participants (center) as compared to the overall San Juan constraint ranking (right)⁶

⁵ Co.2.Pr: Comprehension-to-production priming; Pr.2.Pr: Production-to-production priming; T.A.C.P: Typical action-chain position.
⁶ Co.2.Pr: Comprehension-to-production priming; Pr.2.Pr: Production-to-production priming; T.A.C.P: Typical action-chain position.


Fig. 27: Constraint rankings for non-university-educated participants (left) and university-educated participants (center) as compared to the San Juan constraint ranking (right)⁷

The results obtained for Santo Domingo and San Juan suggest that across educational achievement and/or gender groups, participants respond in the same way to linguistic environments. This adds further support for the claim that domain-general cognitive constraints on spreading activation condition morphosyntactic variation. Alternatively, one could also interpret these results as simply indicating that speakers belonging to different social groups imitate the probabilistic usage patterns they have observed throughout their lives (Bresnan 2007; Croft 2000: 166) or follow the constraints that are stipulated by the tacit agreement that defines the speech community (Labov 1972: 120; Weinreich, Labov, and Herzog 1968). However, such accounts do not explain why these linguistic environments have an effect on the variation in the first place and why the effects of these linguistic predictors and their relative importance for modeling presentational haber pluralization are so similar across three different speech communities. Finally, for Havana there is evidence that university-educated speakers use agreeing and non-agreeing presentational haber differently across the board. This matter will be the topic of Section 9.4. Let us now summarize the most important findings of this chapter.

⁷ Co.2.Pr: Comprehension-to-production priming; Pr.2.Pr: Production-to-production priming; T.A.C.P: Typical action-chain position.


8.4 Summary

This chapter started by exploring the regression results for age, education, gender, and style. This showed that presentational haber pluralization correlates with lower educational achievement in Havana and San Juan. For San Juan, as well as for Santo Domingo, a correlation with female gender was also observed. For Havana, style contributes to explaining the variation and speakers favor agreeing presentational haber when more attention is focused on speech. This is especially true for present- and preterit-tense haber. Since the incoming variants of ongoing language changes from below either correlate with female gender and situations that focus more attention on speech or do not correlate at all with formality (Labov 1972: 239, 2001: Chap. 3, 292; Silva-Corvalán 2001: 248–249), these results were taken to suggest that presentational haber pluralization constitutes an ongoing language change from below in Cuban, Dominican, and Puerto Rican Spanish. Subsequently, using conditional inference tree models, we have seen that these correlations with educational achievement emerge because in Havana, Santo Domingo, and San Juan, the linguistic predictors that model markedness of coding have different effect sizes for specific educational achievement and/or gender groups in contexts primed by the speaker with non-agreeing presentational haber or in contexts without priming. Crucially, however, the directionalities of the effects remain identical for the two educational achievement and/or gender groups. Indeed, comparing the constraint rankings of the educational achievement and/or gender groups confirmed that, for all groups, presentational haber pluralization features the same constraint hierarchy in Santo Domingo and San Juan. As this reveals that the variation is structured in the same way for all social groups, these results support the claim that domain-general cognitive constraints on spreading activation constrain presentational haber pluralization. In contrast, comparing the constraint hierarchies of university graduates and non-university-educated participants from Havana revealed that the former diverge substantially from the community constraint ranking. This was taken to suggest that, during the recording sessions, the Havana university graduates tried to avoid using agreeing presentational haber. The following chapter will investigate how individual speakers use agreeing and non-agreeing presentational haber.


Chapter 9: Individual constraints on presentational haber pluralization

9 Individual constraints on presentational haber pluralization

Up until now we have been concerned with the way agreeing and non-agreeing presentational haber pattern across speech communities, educational achievement groups, and/or gender groups and how this bears on the claim that markedness of coding, statistical preemption, structural priming, and social-interactional meanings condition presentational haber pluralization. In this chapter, I will explore how individual speakers use agreeing and non-agreeing presentational haber. To this end, in Section 9.1 I will start by inspecting the frequencies of agreeing and non-agreeing presentational haber in the samples of each individual participant. In Section 9.2, the speaker-based random terms of the regression models will be analyzed. In Section 9.3, I will present and discuss conditional inference tree models that include the participants. Before summarizing the most important results obtained in this chapter (Section 9.5), Section 9.4 will scrutinize the behavior of the Havana university graduates.

9.1 Distribution across individuals

In terms of the overall frequency of agreeing and non-agreeing presentational haber, Fig. 28 shows that in Havana, Santo Domingo, and San Juan a considerable amount of variation exists between the individual participants. As is shown in the leftmost panel of Fig. 28, in Havana, the usage rates of agreeing presentational haber range from a modest 20% in the data provided by LH09M12 to a staggering 80% in the tokens extracted from the recording session with LH17M21. In contrast, the center panel of Fig. 28 reveals a much smaller spread for Santo Domingo, where the range between the lowest and the highest agreement rates amounts to only about 37 percentage points. Without taking SJ19H22 into account, the spread between the highest and the lowest rates of presentational haber pluralization is largely the same in San Juan, as is evident from the rightmost panel of Fig. 28. However, the extremely low rates displayed by SJ19H22 (3.92%; N=2/51) increase the overall spread to almost 57 percentage points.


Fig. 28: Usage rates of agreeing presentational haber for the individual participants included in the Havana, Santo Domingo, and San Juan datasets


Tab. 28: Social profiles of the five participants who use agreeing presentational haber less often in Havana, Santo Domingo, and San Juan

Havana
Speaker   Gender  Age  Educational achievement                                                Occupation
LH03M12   Female  29   Master in Linguistics                                                  Linguistics professor
LH04M22   Female  57   Vocational training as a preschool teacher                             Housewife
LH09M12   Female  27   Bachelor in Philology                                                  Linguistics professor
LH15M22   Female  58   Bachelor in Agricultural Engineering                                   Radio host
LH18H22   Male    57   Bachelor in Architecture                                               Art History professor

Santo Domingo
Speaker   Gender  Age  Educational achievement                                                Occupation
SD04M22   Female  82   Bachelor in Law                                                        Radio host
SD09H11   Male    26   High school                                                            Mechanic
SD20H12   Male    31   Bachelor in Modern Languages with concentration in English and French  English professor
SD23H12   Male    35   Bachelor in Modern Languages with concentration in English             English professor
SD24M12   Female  30   Bachelor in Psychology                                                 Student counselor

San Juan
Speaker   Gender  Age  Educational achievement                                                Occupation
SJ08M21   Female  60   Fourth grade                                                           Housewife
SJ09H12   Male    27   Bachelor in Dramatic Arts                                              Shop assistant
SJ11M22   Female  58   Bachelor in Social Work                                                Teacher
SJ14H22   Male    59   PhD in Education                                                       Music professor
SJ19H22   Male    59   PhD in Hispanic Literature                                             Literature professor

When we compare the social profiles of the five participants who display the lowest rates of presentational haber agreement in Havana, Santo Domingo, and San Juan, some interesting parallels emerge between the three speech communities. That is, Tab. 28 suggests that in the three communities, participants whose professional activities and/or educational backgrounds involve public speaking and/or intensive training in normative grammar are those who use agreeing presentational haber less often. This suggests that these participants associated the recording sessions with their professional activities and that, consequently, they may have shifted into their public speaking style. Since such styles are characterized by a greater bias towards the patterns of normative grammar (e.g., Alba 2004: 79; Labov 1972: 79, 1994: 78), this may explain why they display such a low frequency of agreeing presentational haber. Another possibility could be that markedness of coding reduced the level of activation of the agreeing presentational haber construction because (some of) these participants wished to project a social persona for themselves as ‘decent, well-thinking, well-behaved people’ (e.g., Eckert 2008; Kiesling 2009, 2013). This appears to be the case for two of the four participants whose behavior cannot be explained based on their social profiles. That is, for LH04M21, example (140) illustrates her negative attitude towards the local way of speaking and behaving, which she characterizes as vulgar ‘vulgar’. The example also shows how she distances herself from this behavior, as is evident from her use of the third person plural (rather than the first person plural) when talking about the type of language that is used in Havana.

(140)
Interviewer: ¿Este, cuando usted escucha a la gente por acá, qué particularidades oye en cuanto a su pronunciación, o las palabras que usan, o la manera cómo hablan?
Participant: ¡A! ¡Hablan muy vulgar! ¡Sí, sí! Los cubanos no son de pronunciar muy bien las palabras, no, no, no. El terminar de las palabras, no.
Interviewer: ¿Este, se dice que la lengua está cambiando, usted tiene esta impresión, de que los jóvenes están hablando de otra forma que o, su generación o la generación de sus papás?
Participant: Sí, sí. Sí, sí, ahora todo es más vulgar, aunque, claro, siempre, el, los jóvenes de, en todas las generaciones tienen sus, sus mismos, sus palabras, no (LH04M21).
Interviewer: ‘Er, when you listen to the people around here, what characteristics do you hear in terms of their pronunciation, or the words they use, or the way they speak?’
Participant: ‘A! They speak really vulgarly! Yes, yes. Cubans typically do not pronounce words properly, no, no, no. The endings of words, right.’
Interviewer: ‘Er, they say that the language is changing. Do you have this impression, that the youngsters are speaking in a different way from, either your generation or the generation of your parents?’
Participant: ‘Yes, yes. Yes, yes, nowadays everything is more vulgar, although, obviously, always, the, the youngsters of, in each generation have their, their own, their words, right.’

Distribution across individuals | 187

Similarly, for SD24M21, examples (141) and (142) illustrate that, during the interview, she displays a keen interest in speaking ‘correctly’, which she puts on a par with thinking properly.

(141) Yo tengo hijos, yo tengo tres y, y yo le digo a la mayor: “Bueno, igual, tú hablarás como cualquier dominicana, quitando las eses finales, e, cortando las palabras, pero por lo menos, usa las palabras donde van” (SD24M21).
‘I have kids, I have three and, and, I say to the oldest: “Well, anyway, you’ll speak like any other Dominican woman, removing the final esses, er, cutting words, but at least, use the right words in the right places.”’

(142) A mí me preocupa porque yo… Mira, una vez un profesor en, en la universidad me dijo aquí que la forma de, de hablar la lengua es una forma también de ver el mundo y de pensar. Entonces yo me… Escuchándolo a él decir esto y escuchando esto, pues, cómo la gente habla, y cómo nosotros hablamos, me pregunto yo: “¿Dios mío si parte de los problemas que nosotros tenemos a nivel de economía, a nivel de, sociales, no también se deberán como a esas formas, e, tan incorrectas de construir, de estructurar la lengua que también refleja como una falta, en, en la estructuración misma del pensamiento?” (SD24M21).
‘It worries me, because… Look, once a professor at, at the university here told me that the way to, to speak the language is also a way of seeing the world and thinking. And so I… Listening to him saying this and hearing this, then, how people speak, and how we speak, I ask myself: “My God if some of the problems that we have in terms of the economy, at the level of, social problems, wouldn’t these also be due to, like, these, er, so incorrect ways of structuring the language that also reflects, like, a shortcoming of, of, the very structuring of our thinking?”’

However, for SD09H11 and SJ08M2, the reasons why these speakers use agreeing presentational haber less often do not seem to lie in their occupational status, their educational profiles, or their attitudes towards language. Rather, the limited occurrence rates of this variant seem to be due to the effects of the linguistic predictors.

When we compare the social profiles of the five participants who use agreeing presentational haber most often, similar correspondences between the three cities emerge. As is shown in Tab. 29, the majority of these participants do not have a university degree. LH20H12 (who was in his last semester of law school at the time of the interview) and SD22M12 (who holds a Master's degree in Business Administration) are exceptions to this pattern. Yet these two participants also display a common feature, namely, upward social mobility. That is, LH20H12’s parents do not have a university degree and his family has always lived in underprivileged areas of Havana. At the time of the interview, LH20H12 was employed as a mosquito exterminator. In this capacity, he visited the houses of a working-class borough of Havana on a day-to-day basis, collaborating closely with inhabitants who have little formal education and with LH21H11, his coworker. As a result, throughout his life, LH20H12 has been exposed far more often to the speech patterns of Cubans with little formal education than to those of his university-educated peers. This may explain why his rates of use of agreeing presentational haber are much higher than those of other university-educated speakers. Similarly, SD22M12’s parents have only completed primary education and her family has always resided in an underprivileged area of Eastern Santo Domingo. Therefore, it also appears that, if she uses agreeing presentational haber more often than her peers, this is because, throughout her life, she has been exposed more to the speech patterns of Dominicans with little formal education. In this sense, the data contributed by LH20H12 and SD22M12 confirm one of the basic principles of usage-based linguistics: linguistic representations and their relative strengths are the product of accumulated sociolinguistic experiences.

Tab. 29: Social profiles of the five participants who use agreeing presentational haber most often in Havana, Santo Domingo, and San Juan

City           Speaker   Gender   Age   Educational achievement                Occupation
Havana         LH05M21   Female   58    Ninth grade                            Factory worker
Havana         LH06H21   Male     83    Ninth grade                            Carpenter
Havana         LH17M21   Female   78    No formal education                    Seamstress
Havana         LH20H12   Male     28    Bachelor in Law                        Mosquito exterminator
Havana         LH21H11   Male     29    Vocational training as an accountant   Mosquito exterminator
Santo Domingo  SD06M11   Female   26    High school                            Shop keeper
Santo Domingo  SD07M11   Female   34    Eleventh grade                         Hostel clerk
Santo Domingo  SD12M11   Female   26    High school                            Shop keeper
Santo Domingo  SD15M21   Female   63    High school                            Restaurant owner
Santo Domingo  SD22M12   Female   30    Master in Business Administration      Assistant in the School of Education
San Juan       SJ07H21   Male     75    Ninth grade                            Shop keeper
San Juan       SJ15M21   Female   86    Ninth grade                            Forewoman in a factory
San Juan       SJ16H21   Male     65    Vocational training as a cook          Cook
San Juan       SJ18M11   Female   31    Vocational training as a beautician    Beautician
San Juan       SJ24H11   Male     25    High school                            Unemployed

In sum, in this section we have seen that participants diverge greatly from one another in terms of their overall frequency of use of agreeing presentational haber. As in earlier studies of individual variation, the rates of use of agreeing and non-agreeing presentational haber correspond rather closely with participants’ specific social profiles (Chambers 2009: 92–114; Labov 2001, 2006; Meyerhoff and Walker 2007; Smith and Durham 2012a, 2012b; Walker and Meyerhoff 2013), and/or the social persona they wish to project during the interview (Ashby 2001; Eckert 2008; Kiesling 2009, 2013). Let us now examine how the regression models estimate the individual participants’ preferences for agreeing and non-agreeing presentational haber and the degree to which they are influenced by specific linguistic regressors.

9.2 Random intercepts, slopes, and participant-specific regression coefficients

The previous section suggests substantial variability between individuals belonging to the same speech community when it comes to their overall agreement rates. This is confirmed by the random intercepts of the regression models, displayed in Fig. 29. In these plots, the random intercept coefficients express how the regression models estimate the overall mean likelihood that a particular participant will use agreeing presentational haber, taking into account the effects of all other predictors and the overall likelihood that any speaker of, respectively, Cuban, Dominican, or Puerto Rican Spanish will use that variant. Since agreeing presentational haber is less likely to occur than non-agreeing presentational haber across the board, these estimates are predominantly negative.

Fig. 29: Random intercepts for the participants included in the Havana, Santo Domingo, and San Juan datasets (log-odds)
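
Because the intercepts in Fig. 29 are expressed in log-odds, it can help to see how values on that scale map onto probabilities. The following is a minimal sketch of the standard inverse-logit transformation; the function name and the input values are illustrative assumptions and are not taken from the regression models. Note that a log-odds value only translates directly into a probability of use when it represents the full linear predictor, not a speaker's deviation on its own.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert a value on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical log-odds values, chosen only to illustrate the scale.
for value in (-2.0, -1.0, 0.0, 1.0):
    print(f"{value:+.1f} log-odds -> probability {inv_logit(value):.3f}")
```

On this scale, 0 corresponds to an even chance of either variant, negative values to a probability below 0.5, and positive values to a probability above 0.5.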

Random intercepts, slopes, and participant-specific regression coefficients | 191

When we compare the plots for the three varieties, it immediately becomes clear that speakers differ sharply from one another in terms of their intercept values. For Havana, this variation is limited to the degree to which speakers disfavor agreeing presentational haber. For example, while LH17M21 and LH20H12 are almost neutral with respect to their overall preference for the variants (resp. 0.044 and -0.069 log-odds), LH09M12 and LH04M21 disfavor the agreeing variant sharply (resp. -2.36 and -2.445 log-odds). In contrast, for Santo Domingo the regression model estimates that more than a third of the speakers favor the agreeing presentational haber construction over the non-agreeing construction across the board, with intercepts ranging from a modest preference of 0.040 log-odds for SD22M12 to a substantial preference of 0.293 log-odds for SD15M21. The other Dominican participants prefer non-agreeing presentational haber to varying degrees, with intercept values that range between -0.008 log-odds and -0.943 log-odds. For San Juan, the regression model estimates that the variation between the overall preferences of the participants is mainly limited to the extent to which speakers disfavor agreeing presentational haber. In addition, when compared to the Cuban and Dominican data, the amount of interspeaker variability appears to be less pronounced in the Puerto Rican data, with intercepts ranging between -0.196 and -1.623 log-odds. Still, one speaker (SJ05M12) displays a clear overall preference for agreeing presentational haber (0.125 log-odds).

Besides the overall preferences of individuals for agreeing or non-agreeing presentational haber, the regression models also provide information on the extent to which speakers are influenced by certain predictors. At this juncture, recall that in Section 7.1 it was said that adding a random slope for typical action-chain position lowered the AICc score of the Havana and the Santo Domingo models. This was not the case for the San Juan model, where a random slope proved to be appropriate for tense. As was explained in Section 5.4.2, by adding by-speaker random slopes, we instruct the regression function to entertain the possibility that the effect of a specific predictor may vary for each speaker. This generates a slope estimate that expresses how the effect that is observed for each speaker for the variable of interest differs from the overall effect of the variable. By summing this estimate together with the population-level intercept, the population-level effect estimate for the predictor, and the speaker-specific random intercept, we obtain a speaker-specific estimate of the effect of a particular variable. These estimates are plotted in Fig. 30 to Fig. 32.

For Havana, the by-speaker effects of typical action-chain position are displayed in Fig. 30. Overall, comparing the steepness of the lines between the dots in the different panels of the figure shows that there is substantial variation between participants when it comes to the size of the effect of this predictor. For 12 participants (LH02M12, LH03M12, LH05M21, LH09M12, LH11M22, LH12H21, LH15M22, LH18H22, LH20H12, LH22H11, LH23H12, and LH24H11), the margin of the effect is smaller than the one that is observed at the level of the entire sample (0.496 log-odds) and for ten participants the effect margin is larger (LH01H22, LH04M21, LH07M11, LH10M22, LH13H21, LH14M11, LH16H22, LH17M21, LH19M11, and LH21H11). Still, for all but two participants the direction of the effect remains identical: typical action-chain heads favor agreeing presentational haber. This is not the case for LH06H21 and LH08H12, who disfavor agreeing presentational haber with typical action-chain heads (resp. -0.545 and -0.021 log-odds). However, closer scrutiny reveals that these results are due to the fact that these two participants performed rather idiosyncratically on the questionnaire and reading tasks. When the analysis is restricted to the interview section, positive participant-specific regression coefficients are obtained for all speakers.

Turning now to the Santo Domingo data, Fig. 31 reveals a similar picture to the one that emerges from the Havana data. Specifically, for 13 participants (SD01H21, SD02H21, SD03H21, SD06M11, SD07M11, SD08M11, SD11H22, SD12M11, SD13M22, SD15M21, SD17H22, SD21H12, and SD22M12) the effect is larger than the aggregate effect (0.926 log-odds). For the other 11, the effect is smaller. Crucially, however, for all individuals, the effect of typical action-chain position runs in the same direction: typical action-chain heads favor agreeing presentational haber.

For San Juan, comparing the steepness of the lines in the panels of Fig. 32 reveals a highly similar pattern for the verb tense. As was the case for typical action-chain position in Havana and Santo Domingo, the San Juan data reveal substantial variation between speakers when it comes to their sensitivity to the verb tense. For 14 speakers, the effect is smaller than the margin of 3.532 log-odds that is obtained for the community. This is the case for SJ05M12, SJ07H21, SJ09H12, SJ13H11, SJ14H22, SJ15M21, SJ17M21, SJ18M11, SJ19H22, SJ20H11, SJ21H21, SJ22M11, SJ23M11, and SJ24H11. Within this group, SJ05M12 represents a special case. For this participant, the effect margin for tense is limited to 0.727 log-odds, roughly a fifth of the size of this effect at the aggregate level. For the other ten speakers, the effect is larger than the aggregate effect. However, for all participants, the same directionality is maintained.

These results illustrate that with mixed-effects modeling new patterns of individual variation can be uncovered besides the often-documented overall preferences for a particular variant (e.g., Guy 1980; Meyerhoff and Walker 2007; Walker and Meyerhoff 2013). Specifically, participant-specific regression coefficients allow measuring how far individuals’ sensitivity to specific linguistic environments differs from that of the group, while at the same time taking into account their overall preference for agreeing or non-agreeing presentational haber (see Drager and Hay 2012: 60 and Forrest 2015 for a similar point). This analysis has shown that there is much more variation between individuals in terms of effect sizes and overall preferences for a particular construction than in terms of the directionalities of the effects of linguistic predictors. This is the pattern one would expect in the light of the hypothesis that domain-general cognitive constraints condition linguistic variation. In the following section, I will present a new series of conditional inference tree models, which will provide even more evidence in favor of this analysis.
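
The way participant-specific estimates are assembled from population-level and by-speaker terms can be sketched as follows. This is an illustrative sketch only: the class and function names, the speaker labels, and all numeric values are hypothetical, and the sketch assumes a binary predictor under treatment coding, so that the effect margin for a speaker is the population-level effect plus that speaker's slope deviation.

```python
from dataclasses import dataclass


@dataclass
class SpeakerTerms:
    """By-speaker random terms, i.e. deviations from the population-level values."""
    random_intercept: float
    random_slope: float


def speaker_effect(pop_effect: float, terms: SpeakerTerms) -> float:
    """Speaker-specific effect margin: population effect plus slope deviation."""
    return pop_effect + terms.random_slope


def speaker_log_odds(pop_intercept: float, pop_effect: float,
                     terms: SpeakerTerms) -> float:
    """Sum of the four components named in the text, in log-odds."""
    return (pop_intercept + pop_effect
            + terms.random_intercept + terms.random_slope)


def same_direction(pop_effect: float, terms: SpeakerTerms) -> bool:
    """True if the speaker's effect has the same sign as the population effect."""
    return speaker_effect(pop_effect, terms) * pop_effect > 0


# Hypothetical values for illustration; these are not the model estimates.
POP_INTERCEPT = -1.0
POP_EFFECT = 0.5
speakers = {
    "speaker_A": SpeakerTerms(random_intercept=0.2, random_slope=-0.1),
    "speaker_B": SpeakerTerms(random_intercept=-0.4, random_slope=0.3),
    "speaker_C": SpeakerTerms(random_intercept=0.1, random_slope=-0.7),
}

for name, terms in speakers.items():
    effect = speaker_effect(POP_EFFECT, terms)
    size = "smaller" if abs(effect) < abs(POP_EFFECT) else "larger"
    print(name,
          round(speaker_log_odds(POP_INTERCEPT, POP_EFFECT, terms), 3),
          round(effect, 3), size, same_direction(POP_EFFECT, terms))
```

Separating the effect margin from the summed log-odds mirrors the two comparisons made in this section: whether a speaker's effect is smaller or larger than the aggregate effect, and whether it runs in the same direction.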

Fig. 30: Effect of typical action-chain position on the log-odds of agreeing presentational haber in Havana, by speaker

Fig. 31: Effect of typical action-chain position on the log-odds of agreeing presentational haber in Santo Domingo, by speaker

Fig. 32: Effect of tense on the log-odds of agreeing presentational haber in San Juan, by speaker

Linguistic predictors across individuals | 197

9.3 Linguistic predictors across individuals

For Havana, Fig. 33 shows that only the verb tense has the same effect size for all participants. On the left-hand side of the figure, nodes [2], [3], [4], and [8] suggest that for expressions not involving synthetic present- or preterit-tense presentational haber the speaker split introduced by node [2] is mainly motivated by the fact that production-to-production priming interacts with comprehension-to-production priming for some speakers, but not for others. In any case, the leaves connected with the branches that go down from the production-to-production priming nodes [3] and [8] seem to suggest that the directionality of the effects of this predictor is identical for all participants (bar plots under nodes [5], [6], [7], [9], and [10]). That is, both for participants who display the interaction with comprehension-to-production priming and for those who do not, the frequency of agreeing presentational haber is highest in contexts primed by the speaker with agreeing presentational haber (bar charts under nodes [6] and [10]). Regarding the interaction with comprehension-to-production priming, node [4] suggests that the rates of presentational haber pluralization are higher in contexts primed by both the participant and the interviewer with agreeing presentational haber (bar plot under node [6]), whereas they are slightly lower in contexts where the two priming variables favor opposite variants (bar plot under node [5]).

Turning now to the right-hand side of the figure, nodes [12] and [13] show that in contexts primed by the speaker with non-agreeing presentational haber or in contexts without priming, only certain speakers use the agreeing variants considerably more often with typical action-chain heads (bar plot under node [15]). For others, this predictor cannot put enough weight on the scale to cancel the effect of production-to-production priming completely (bar plot under node [14]). Still, the rates of presentational haber pluralization that are documented in this context remain slightly higher than those that are found for typical action-chain tails and settings.

Fig. 33: Conditional inference tree model showing the interaction of linguistic predictors and speakers in Havana

Note to Fig. 33: Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.84.
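
The C-index reported for the conditional inference tree models summarizes how well predicted probabilities discriminate between the two variants. As a minimal sketch, assuming a binary outcome (1 for agreeing, 0 for non-agreeing presentational haber) and using hypothetical data, the index can be computed as the share of agreeing/non-agreeing token pairs in which the agreeing token receives the higher predicted probability, with ties counted as half. The function name and the example values are assumptions made for illustration; this is not the routine used to obtain the values reported in this chapter.

```python
from typing import Sequence


def c_index(outcomes: Sequence[int], predicted: Sequence[float]) -> float:
    """Concordance index for a binary outcome.

    outcomes: 1 for the variant of interest, 0 otherwise.
    predicted: predicted probability of the variant of interest per token.
    """
    if len(outcomes) != len(predicted):
        raise ValueError("outcomes and predicted must have the same length")
    positives = [p for y, p in zip(outcomes, predicted) if y == 1]
    negatives = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                concordant += 1.0   # pair ranked in the right order
            elif p_pos == p_neg:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical tokens: observed variant and predicted probability.
observed = [1, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.6]
print(c_index(observed, probabilities))
```

A value of 0.5 indicates discrimination no better than chance, and a value of 1 indicates that every agreeing token is ranked above every non-agreeing token.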

As was the case for Havana, Fig. 34 suggests that only the verb tense and production-to-production priming have the same effect sizes for all Dominican participants. Additionally, as was already evident from the regression models, the left-hand side of the figure (nodes [4] and [12]) indicates that participants differ considerably from each other in terms of their sensitivity to differences in typical action-chain position. Specifically, the bar chart under node [8] shows that with the present and the preterit tense, certain speakers use agreeing presentational haber almost as often with action-chain tails and settings as with action-chain heads in unprimed contexts or in contexts primed by the participant with non-agreeing presentational haber. For other participants, typical action-chain position interacts with comprehension-to-production priming in this context (node [5]). In contexts following a token of non-agreeing presentational haber or in unprimed contexts, these participants use [...] or [...] almost consistently (bar plot under node [6]). In turn, nodes [10] and [12] indicate that in synthetic present- or preterit-tense expressions that occur in contexts primed with agreeing presentational haber, all speakers use agreeing presentational haber less often with typical action-chain tails or settings (bar plot under node [11]). However, there is some variation among the participants as to the extent to which they are influenced by the presence of a typical action-chain head (node [12]). For some speakers, this causes only a mild increase in the frequency of the agreeing variants (bar plot under node [13]). For others, the increase is considerable (bar plot under node [14]).

Turning now to the right-hand side of Fig. 34, nodes [15], [16], and [18] indicate that in expressions involving tenses other than the present or the preterit, all participants use agreeing presentational haber more often in contexts primed by the interviewer with agreeing presentational haber that occur either after the participant has used a token of non-agreeing presentational haber or following a stretch of discourse in which presentational haber has not occurred (bar plot under node [17]). However, when the interviewer uses non-agreeing presentational haber or does not present the speaker with any primes, the decrease in the frequency of agreeing presentational haber varies considerably from speaker to speaker, as is shown by the bar charts under nodes [19] and [20].

Fig. 34: Conditional inference tree model showing the interaction of linguistic predictors and speakers in Santo Domingo

Note to Fig. 34: Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.84.

For San Juan, Fig. 35 suggests a much more complex pattern of interaction. Contrary to what was observed for Havana and Santo Domingo, only the verb tense has the same effect size for all speakers. This contradicts the results obtained with the regression analysis, which suggested a considerable amount of variation in the sizes of the effects of this predictor. Additionally, the left-hand side of the figure shows that with tenses other than the present and the preterit, comprehension-to-production priming is as important as production-to-production priming for explaining the variation. For some participants, it even outranks production-to-production priming (nodes [2], [10], and [11]). For these participants, production-to-production priming only conditions presentational haber pluralization in contexts primed by the interviewer with non-agreeing presentational haber or in contexts without comprehension-to-production priming. Additionally, nodes [10], [11], [12], and [13] show that in contexts primed by the participant and the interviewer with non-agreeing presentational haber or in contexts without priming, certain participants respond quite differently to typical action-chain heads. In particular, the presence of a typical action-chain head dramatically increases the rates of presentational haber pluralization for certain participants (bar chart under node [15]), whereas others seem to be less sensitive to this predictor (bar chart under node [14]). In turn, the right-hand side of the figure shows that in synthetic present- and preterit-tense expressions, only the verb tense is a relevant constraint for some speakers (node [19] and the bar plot under node [20]). Additionally, nodes [22] and [23] suggest that the presence of negation only affects the behavior of certain participants considerably (bar plot under node [24]). Others use non-agreeing presentational haber almost consistently across the board (bar chart under node [25]).

Fig. 35: Conditional inference tree model showing the interaction of linguistic predictors and speakers in San Juan

Note to Fig. 35: Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.86.

The behavior of the Havana university graduates | 203

In summary, the data discussed in this section confirm that for all individuals the verb tense/statistical preemption and, to a minor degree, production-to-production priming are the most important constraints on presentational haber pluralization. For comprehension-to-production priming and markedness of coding (absence/presence of negation and typical action-chain position) more interspeaker variation was observed. This is particularly evident for synthetic present- or preterit-tense presentational haber expressions. With these, it appears that only for certain individuals the tandem formed by structural priming and markedness of coding (see Section 7.5) has little trouble tipping the balance in favor of agreeing presentational haber. These results are reminiscent of those that were noted in the previous section, as well as those that have been obtained in earlier investigations of interspeaker variability in language production (e.g., Forrest 2015; Meyerhoff and Walker 2007; Walker and Meyerhoff 2013) and processing (Dąbrowska 2013, 2015), which have equally revealed a considerable amount of interspeaker variation in the sizes of the effects of specific constraints. Let us now consider the behavior of the Havana university graduates.

9.4 The behavior of the Havana university graduates

In Section 8.3, it was shown that the use of agreeing and non-agreeing presentational haber by university-educated participants from Havana follows a different constraint pattern from the one that is observed for other speakers of this variety. In that section, I suggested that this might indicate that some or all university-educated participants self-consciously try to use non-agreeing presentational haber consistently because they have learned that this is the normatively sanctioned variant. In this section, I will provide more evidence in favor of this analysis.

Let us start with a few predictions that follow from the hypothesis that the Havana university graduates are self-consciously trying to correct their speech towards the patterns of normative grammar. Firstly, if this is the case, these participants’ conscious effort can be expected to bypass only the effects of the least influential cognitive constraints (markedness of coding and comprehension-to-production priming), without affecting the more important ones (production-to-production priming and statistical preemption). Secondly, we can expect such self-conscious attempts to become less successful in expressions involving tenses other than the synthetic present and preterit or once the participants slip into using agreeing presentational haber. Thirdly, in the light of the discussion in Section 9.1, we might expect that those university-educated speakers whose professional activities require them to adopt a standard register on a day-to-day basis would be more successful in their attempts to avoid agreeing presentational haber, as they will have succeeded in automating this behavior (see Bargh and Chartrand 1999).

These predictions are all borne out by the conditional inference tree model displayed in Fig. 36. In particular, this model shows that only the verb tense and production-to-production priming constrain presentational haber pluralization for the university-educated speakers from Havana. The left-hand side of Fig. 36 reveals no differences between participants when it comes to the use of the agreeing present- and preterit-tense variants (nodes [2], [3], and [4]). Rather, the Havana university graduates generally seem to avoid these variants, especially in contexts primed with non-agreeing presentational haber or in contexts without structural priming (bar chart under node [3]). Still, when they have used agreeing presentational haber before, all university graduates use these variants more often (bar plot in node [4]). In contrast, the right-hand side of the figure reveals that with tenses other than the synthetic present or preterit the participants are more likely to use agreeing presentational haber when they have used this variant before (bar plots under nodes [7] and [11]). However, for LH03M12, LH08H12, LH09M12, LH15M22, LH16H22, and LH18H22, production-to-production priming appears to have a less profound effect than for other participants (nodes [6], [7], and [8]). Rather, these speakers seem to use non-agreeing presentational haber more frequently in both primed and unprimed contexts (bar plots under nodes [7] and [8]). For these participants, Tab. 30 reveals a common feature, namely, their professional activities either require them to speak in public (e.g., teacher, linguistics professor) or to communicate with foreigners (e.g., taxi driver for tourists), both of which involve adopting a supralocal standard variety of Spanish.

Fig. 36: Conditional inference tree model showing the interaction of linguistic predictors and the Havana university graduates

Note to Fig. 36: Pr.2.Pr: Production-to-production priming; Co.2.Pr: Comprehension-to-production priming; No.priming: First occurrence/distance 20+ clauses; C-index: 0.81.

Tab. 30: Social profiles of the Havana university graduates

Speaker   Gender   Age   Educational achievement     Occupation
LH01H22   Male     60    Bachelor in Economy         Librarian
LH02M12   Female   26    Bachelor in Theology        Secretary
LH03M12   Female   29    Bachelor in Philology       Linguistics professor
LH08H12   Male     29    Bachelor in Philology       Copy-editor at a publishing house
LH09M12   Female   27    Bachelor in Philology       Linguistics professor
LH10M22   Female   83    Bachelor in Pedagogy        High-School teacher
LH11M22   Female   87    Bachelor in Pedagogy        Primary-School teacher
LH15M22   Female   58    Bachelor in Engineering     Radio host
LH16H22   Male     60    Bachelor in History         Taxi driver for tourists
LH18H22   Male     57    Bachelor in Architecture    Art professor
LH20H12   Male     28    Bachelor in Law             Mosquito exterminator
LH23H12   Male     30    Bachelor in Physiotherapy   Physiotherapist

In other words, the differences in the constraint rankings observed in Section 8.3 and the data discussed in this section suggest that the Havana university graduates consciously attempt to avoid agreeing presentational haber. While this explains the behavior of the university graduates from Havana, it does not explain why university-educated participants display this behavior only in Havana. One possible explanation could reside in the socioeconomic differences between, on the one hand, Cuba and, on the other, Puerto Rico and the Dominican Republic. That is, in Cuba, university education does not correlate closely with easier access to material prosperity and higher standards of living, whereas this is the case in Puerto Rico and the Dominican Republic. Rather, for many Cuban university graduates, mastering a normatively sanctioned variety of Spanish may very well be the only outwardly visible trait that identifies them as such. Therefore, mastering normative Spanish may represent more ‘symbolic capital’ (Bourdieu 1977) in Havana than in Santo Domingo or San Juan. As a result, in a semi-spontaneous interview with a foreign academic, these speakers may be more likely to resort to this variety in order to symbolize their status as university graduates. Let us now summarize the most important findings of this chapter.

Summary | 207

9.5 Summary

In this chapter, I have examined whether and how the 72 individual participants of this study use agreeing and non-agreeing presentational haber in their own specific ways and how this bears on the central claim of this book. To investigate this, challenging Labov’s (2001: 34) position that “linguistic analysis cannot recognize individual grammars”, I have presented three distinct lines of evidence: the distribution of agreeing and non-agreeing presentational haber in the samples of the individual speakers, the random terms of the regression models, and conditional inference tree models that include the participants.

In general terms, these three lines of evidence converge. In particular, regarding the distributions of the variants of presentational haber across the participants, we have observed ample frequency spreads for the speakers included in the three datasets. Across the three communities, participants who display a preoccupation with speaking correctly and participants whose professional activities and/or educational background involve public speaking and/or intensive training in normative grammar use agreeing presentational haber less often. Similarly, the random intercepts and speaker-specific effect estimates suggest that certain speakers favor or disfavor agreeing presentational haber over and above what would be expected from the education and/or gender group they belong to. For the linguistic predictors, the relevant random slopes and participant-specific regression coefficients also suggest that not all speakers are equally strongly influenced by the same predictor. Still, the regression analyses indicate that the levels of all linguistic regressors have the same directionality of effects for all individuals.

Exploring this matter further with additional conditional inference tree models confirmed these results. In particular, we have seen that only the verb tense/statistical preemption and, to a lesser extent, production-to-production priming apply with equal strength to all speakers. Crucially, however, the directionality of the effects of these and the other predictors was shown to be identical for all participants, as speakers only differ from one another in terms of the sizes associated with the effects. These converging results are highly favorable to the idea that domain-general cognitive constraints on spreading activation condition linguistic alternations.

The Havana university graduates constitute an apparent contradiction to this claim. However, I have shown that these speakers’ behavior can be interpreted as suggesting a pattern of self-correction towards a standard variety that does not include presentational haber pluralization. This effort was interpreted as a means to reinforce their status as university graduates. Let us now turn to the conclusions of this volume, which are the topic of the following chapter.

Chapter 10: Cognitive, social, and individual constraints on presentational haber pluralization

10 Cognitive, social, and individual constraints on presentational haber pluralization

Part B has presented arguments in favor of the hypothesis that three domain-general cognitive constraints on spreading activation (markedness of coding, statistical preemption, and structural priming), social constraints, and individual constraints condition presentational haber pluralization. In this chapter, I will review the evidence that was presented in Chapters 7, 8, and 9 and I will place these data in a broader theoretical perspective. To this end, let us return now to the research questions posed in Section 4.1. For ease of reference, the questions are repeated here.

I. Cognitive constraints on presentational haber pluralization
– What are the patterns of covariation with linguistic predictors? Do they support that markedness of coding, statistical preemption, and structural priming constrain presentational haber pluralization?

II. Social constraints on presentational haber pluralization
– How do different social groups use agreeing and non-agreeing presentational haber? Does this portray the variation as an ongoing language change from below?

III. Individual constraints on presentational haber pluralization
– How does haber pluralization pattern in the language production of individual speakers? What do these patterns inform us about individual constraints on morphosyntactic variation?

10.1 Cognitive constraints

As regards the cognitive constraints that were investigated in this book, Chapter 7 revealed that in Cuban, Dominican, and Puerto Rican Spanish the linguistic predictors have largely identical effects. In particular, it was shown that nouns that refer to typical action-chain heads favor presentational haber pluralization. In addition, when negation is present, speakers are less likely to use the agreeing presentational haber construction in Havana and San Juan. For Santo Domingo, in contrast, the absence/presence of negation did not contribute to explaining the variation. Still, the fact that typical action-chain heads (i.e., more prominent nominal arguments) are encoded more often with the variant that attributes more formal prominence to these noun phrases suggests that markedness of coding is a cognitive constraint on the variation, as hypothesis 1 proposes. For the verb tense, Chapter 7 showed that speakers use agreeing presentational haber less often in synthetic present- and preterit-tense expressions, whereas they use it more frequently with other types of expressions. Since the former group of tenses occurred (and still occurs) mainly in the non-agreeing presentational haber construction before the agreeing variant emerged as a conventional alternative to the non-agreeing one, this supports the claim that presentational haber pluralization is constrained by statistical preemption, as is claimed by hypotheses 2a-b. Additionally, Chapter 7 demonstrated that speakers are more likely to use the agreeing presentational haber construction in contexts following an agreeing presentational haber clause. Conversely, speakers are less likely to use it when they have just used or processed a non-agreeing presentational haber expression. This supports the idea that, first, presentational haber pluralization is subject to structural priming, as is argued by hypothesis 3 and, second, that presentational haber occurs in two argument-structure constructions, as is claimed by the working hypothesis. Comparing the patterns of interaction and the relative impacts of the linguistic predictors in Cuban, Dominican, and Puerto Rican Spanish revealed further close correspondences between the three varieties.
Subsequently, using conditional inference tree models, Chapter 8 showed that specific groups respond more or less strongly to the linguistic predictors that model markedness of coding in contexts primed by the speaker with non-agreeing presentational haber or in contexts without priming. Still, for all gender and/or education groups, the directionalities of the effects proved to be identical. Chapter 8 also presented a series of constraint hierarchies, which showed that social constraints contribute relatively little to explaining the variation, because all gender and/or educational achievement groups, with the exception of the Havana university graduates, feature identical constraint rankings. Chapter 9 explored this matter further for the individual participants. Drawing on the random terms of the regression models, the chapter revealed that, although not all speakers are equally strongly influenced by the same predictor, the levels of all linguistic regressors have the same directionality of effects for all individual speakers. Conditional inference tree models confirmed this result. Section 9.4 also provided evidence in favor of the view that the behavior of the Havana university graduates differs sharply from that of other participants, because these speakers attempt to adopt a supralocal standard variety of Spanish, which does not feature presentational haber pluralization. In sum, the data presented in Part B support the following answers to research question one. Firstly, the priming effects documented in the three speech communities suggest that agreeing and non-agreeing presentational haber are represented mentally as two distinct constructions. Secondly, striking similarities were found between the three varieties as to the overall rate of presentational haber pluralization, the linguistic predictors that shape the alternation, their patterns of interaction, and their relative impacts on the variation. As a matter of fact, the only difference between the varieties in this respect appears to be that the absence/presence of negation does not contribute to explaining the variation in Santo Domingo. Although comparable findings have also been used to support the central thesis of Probabilistic Grammar (e.g., Bresnan and Hay 2008), because the results display the exact patterns that are predicted by markedness of coding, statistical preemption, and structural priming, I take them to indicate that these domain-general cognitive constraints condition the spreading activation of variant constructions.
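
The priming pattern summarized above can be illustrated with a small sketch. It is not the coding scheme used in Chapter 7, which also distinguishes comprehension-to-production priming and other factors; it only assumes a hypothetical list of one speaker's presentational haber tokens in order of production and computes how often the agreeing variant follows each variant. The function name and the toy data are invented for illustration.

```python
from collections import defaultdict


def agreement_rate_by_prime(tokens):
    """Return the share of agreeing tokens, split by the variant of the
    immediately preceding token (the 'prime').

    tokens: variants in order of production, each either
    "agreeing" or "non-agreeing" (hypothetical coding).
    """
    counts = defaultdict(lambda: [0, 0])  # prime -> [agreeing targets, all targets]
    for prime, target in zip(tokens, tokens[1:]):
        counts[prime][1] += 1
        if target == "agreeing":
            counts[prime][0] += 1
    return {prime: agree / total for prime, (agree, total) in counts.items()}


# Invented toy sequence, not data from the corpus.
toy_tokens = ["agreeing", "agreeing", "agreeing",
              "non-agreeing", "non-agreeing", "non-agreeing", "agreeing"]
rates = agreement_rate_by_prime(toy_tokens)
print(rates)
```

In this toy sequence the agreeing variant follows an agreeing prime in two of three cases and a non-agreeing prime in one of three, which is the direction of effect that structural priming predicts. A real analysis would enter the prime as one predictor among others in a regression model instead of comparing raw rates.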

10.2 Social constraints

Turning now to the second set of questions, which concern the social constraints that were considered in this volume, Chapter 8 explored the regression results for age, education, gender, and style. This revealed that presentational haber pluralization correlates with lower educational achievement in Havana and San Juan. For Havana and for Santo Domingo, a correlation with female gender was also observed. For Havana, the regression results also suggested that haber pluralization occurs more readily when more attention is focused on language, especially for the synthetic present and preterit tense. Using conditional inference tree models, Chapter 8 also explored how these correlations emerge. This suggested that education and/or gender correlations arise because specific groups feature different effect sizes for the linguistic predictors that model markedness of coding in contexts without priming or in contexts primed by the speaker with non-agreeing presentational haber. Because the incoming variants of language changes from below tend to be used more frequently by female speakers without appearing less frequently in formal types of speech (Labov 1972: 239, 2001: Chap. 3, 292), these results can be interpreted as evidence in favor of the hypothesis that presentational haber pluralization constitutes an ongoing language change from below in Cuban, Dominican, and Puerto Rican Spanish. This is also supported by the differences noted in Chapter 7 between the varieties in terms of the frequency of agreeing presentational haber and by the fact that the absence/presence of negation is not a relevant predictor in Santo Domingo, which features the highest overall rates of presentational haber pluralization.

10.3 Individual constraints

Regarding individual constraints (i.e., the third set of questions), Chapter 9 analyzed the behavior of the individual speakers included in the samples of Havana, Santo Domingo, and San Juan. This showed that participants’ attitudes towards language and/or their professional/educational profiles influence the frequency with which they use agreeing presentational haber. Specifically, participants who display an interest in speaking correctly and those participants whose professional activities or educational background place a strong focus on language were shown to use agreeing presentational haber less often. In turn, upwardly mobile university-educated speakers were shown to use agreeing presentational haber more often than their peers. Additionally, the random terms of the regression models revealed that, although not all speakers are equally strongly influenced by the same predictor, the levels of all linguistic regressors have the same directionality of effects across individual participants. Exploring this matter further with conditional inference tree models confirmed this result. Finally, Section 9.4 provided evidence in favor of the view that the behavior of the Havana university graduates differs sharply from that of other speakers of Havana Spanish because these speakers attempt to adopt a standard variety of Spanish that does not feature presentational haber pluralization. These results severely challenge the perspective defended by sociolinguists like Labov (2010: 7) and cognitive linguists like Dąbrowska (2015: 651), who portray language as an abstract social system that is only instantiated partially in the individual. On the contrary, the results of this study show that quantitative tendencies are strikingly similar across individuals, even when individuals belonging to different speech communities are compared. Because these tendencies display the directionalities that are predicted by the domain-general cognitive constraints, these results give reason to believe that community-level usage patterns and usage patterns across communities emerge from universal cognitive constraints on spreading activation working at the level of individual speakers. Thus, rather than supporting Labov’s (2010: 7) and Dąbrowska’s (2015: 651) claim, the results obtained in this study suggest that, while speakers belonging to the same or different speech communities may differ sharply from one another in terms of their qualitative grammatical knowledge (i.e., they may know or not know the meaning of particular grammatical constructions or lexical items, as Dąbrowska 2012, 2013, 2015, for example, has shown), when they have multiple constructional alternatives at their disposal to encode a given conceptualization, they are sensitive to the same constraints on language production, even though these may trigger effects of different sizes. Let us now turn to the conclusions.
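
The distinction drawn above between effects of identical direction and effects of different sizes can be made concrete with a minimal sketch. It assumes, hypothetically, that a mixed-effects model has yielded one fixed slope for a predictor plus one random-slope deviation per speaker, all on the log-odds scale; the speaker labels, the numbers, and the function names are invented for illustration and are not estimates from this study.

```python
def speaker_slopes(fixed_slope, random_deviations):
    """Combine a fixed slope with per-speaker random-slope deviations
    into speaker-specific coefficients (log-odds scale)."""
    return {speaker: fixed_slope + dev
            for speaker, dev in random_deviations.items()}


def same_direction(slopes):
    """True if all non-zero speaker-specific slopes share one sign."""
    signs = {value > 0 for value in slopes.values() if value != 0}
    return len(signs) <= 1


# Invented values: a negative fixed effect with speaker-level deviations.
fixed = -1.2
deviations = {"S01": 0.4, "S02": -0.3, "S03": 0.9}

slopes = speaker_slopes(fixed, deviations)
print(slopes)                  # effect sizes differ per speaker
print(same_direction(slopes))  # but the direction is shared
```

With these invented values, every speaker-specific slope stays negative although the magnitudes differ, which mirrors the pattern reported for the linguistic predictors. A deviation large enough to flip the sign for one speaker would make `same_direction` return False.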

10.4 Conclusion

The results reviewed in this chapter suggest that language variation is not solely constrained by meaning (as is proposed in Cognitive Sociolinguistics and functional linguistics generally), community norms (as is proposed in variationist sociolinguistics; e.g., Labov 1972, 1982; Weinreich, Labov, and Herzog 1968), or speakers’ knowledge of arbitrary probabilistic patterns (as is proposed in Probabilistic Grammar; e.g., Bresnan 2007; Szmrecsanyi 2013). Rather, a mix of meaning, community norms, probabilistic knowledge, and domain-general cognitive constraints on spreading activation seems to condition the use of alternating, nearly synonymous constructions. In particular, the results that were reviewed here give reason to believe that community norms only regulate what variant is appropriate for what situation and for whom, i.e., they determine the social-interactional meaning of variant constructions in the community. Similarly, the results do not contradict the view that speakers learn from experience (e.g., Tomasello 2007) what particular variants mean in comparison with others, that they use these variants to express subtle conceptual-semantic differences (e.g., Bolinger 1968), and that they remember how words are used across constructions (e.g., Goldberg 2011; Robenalt and Goldberg 2015). Rather, the results support the view that all of this information forms part of speakers’ grasp of their language. In language production, then, this knowledge constitutes the input on which three domain-general cognitive constraints on spreading activation operate: markedness of coding, statistical preemption, and structural priming. Together, these constraints constitute an empirically adequate theoretical model of the constraints that govern morphosyntactic variation.
This model may now be compared to a related model, MacDonald’s (2013) Production-Distribution-Comprehension account, which has recently been proposed in psycholinguistics to explain regularities in word order variation. Like the Cognitive Sociolinguistics model that was elaborated in this book, the Production-Distribution-Comprehension model rests on the premise that certain constraints favor the spreading activation of a particular expression type over others during language production. In particular, MacDonald (2013: 2) proposes that “memory and planning demands of language production strongly affect the form of producers’ utterances.” She envisages three such demands:
– Easy-to-retrieve referents are encoded early on in the sentence, reducing overall processing cost (‘Easy First’). This is a strategy that encourages variability.
– Speakers re-use earlier observed/generated utterance plans and utterance plans stored in long-term memory (‘Plan Reuse’); this encourages speakers to be conservative in their usage.
– Speakers omit possibly interfering referents from their utterances (‘Reduce Interference’).
In other words, the Production-Distribution-Comprehension model assumes that speakers are not primarily concerned with communicating and interacting with their interlocutors, but rather with reducing processing costs. In contrast, the approach of this book recognizes that speakers’ communicative intentions (including social-interactional meanings) constitute a central constraint on language variation, as both markedness of coding and statistical preemption refer to the mappings between form and meaning. As a result, unlike ‘Plan Reuse’, statistical preemption proposes that speakers only use stored sequences when these provide an optimal match for the conceptual import they wish to express.
In addition, the Production-Distribution-Comprehension model does not recognize that “language conveys more than simply the meaning of its words” (Tagliamonte 2006: 5–6), that is, that speakers are not only concerned with bringing across states of affairs, but also with other aspects of their conceptualization, such as their position in the social and geographic landscape or attitudes towards their interlocutor (e.g., Eckert 2008; Kiesling 2009, 2013; Labov 2010: 372). It is not at all clear how the mechanical constraints that constitute the Production-Distribution-Comprehension model would activate one variant over another as a function of such meanings. In contrast, in the Cognitive Sociolinguistics model this is just an aspect of markedness of coding: one variant reaches a higher level of activation because it specifies a social-interactional meaning as part of its conceptual import. How Easy First could generalize beyond word order alternations is not clear either. In contrast, markedness of coding predicts that referents on which speakers have their attention focused are encoded more prominently, because they are categorized faster by constructions that allow encoding them with prominent grammatical functions. Thus, in comparison to the model that was developed here, the Production-Distribution-Comprehension model is overly mechanistic and overly mentalistic, as it tries to explain linguistic patterns solely in terms of processing costs and completely ignores semantics and the social-interactional nature of language. The comparison of the Cognitive Sociolinguistics model with the Production-Distribution-Comprehension model shows that, because Cognitive Sociolinguistics approaches language as a cognitive capacity of the individual that has an important social-interactional function, it reaches empirically and psychologically more plausible conclusions. Yet, the potential for cross-fertilization with existing variationist linguistics is not limited to extending the model developed here to more and new morphosyntactic phenomena. Rather, throughout this book I have demonstrated how Cognitive Sociolinguistics may provide variationist approaches with a psychologically plausible theoretical context for operationalizing cognitive constraints as contextual features. This remedies a long-standing issue in variationist linguistics (including Probabilistic Grammar), namely, that the factors chosen for entry into VARBRUL analysis appear without extensive discussion, and it is not clear how, apart from the intuitions of the researcher, these are arrived at, or whether there are any constraints on what can be a factor here (Henry 2002: 277).

However, as I have shown here, this will require reversing the research questions. Rather than inferring hypotheses about the system that regulates variation by examining arbitrary, theoretically unmotivated correlation patterns, this book has tested hypotheses that derive from assumptions about the cognitive underpinnings of language (variation). Even for linguistic alternations and predictors that variationists already consider, these assumptions may offer a new perspective for operationalizing and understanding them. Of the three cognitive constraints considered here, only structural priming is fairly well known to variationists, who understand its effect in essentially the same way that cognitive linguists and psycholinguists do. The second constraint, statistical preemption, can potentially apply to many predictors, including ones already used by variationists. However, for most variationists, finding that a morphosyntactic alternation is favored or disfavored by present or past tense (for example) would be no more than a descriptive discovery: the direction and size of such an effect is not predictable from any more general principles. As I have shown here, statistical preemption provides a new quantitative dimension. For studies focusing on morphosyntactic change over time, this new dimension may help explain why certain lexical items appear to lag behind others, as was shown here for hubo and hay. In addition, markedness of coding is likely to generate predictors that are not typically used by variationists, as they will involve the interplay of semantic and formal factors. On top of that, as observed by Geeraerts and Kristiansen (2015), Cognitive Linguistics, with its focus on and elaborate model of semantics, may assist variationist linguistics in opening new avenues of research, particularly the study of ‘variation in meaning’ (e.g., lexical variation in the spirit of Geeraerts, Grondelaers, and Bakema 1994) and, within the Third-Wave approach to sociolinguistic variability (e.g., Eckert 2008, 2012), of ‘variation in the meaning of variation’ (e.g., Kristiansen 2008; Pizarro-Pedraza 2016). In conclusion, Cognitive (Socio)Linguistics has much to offer to existing variationist linguistics, including Probabilistic Grammar. In particular, I have shown how a unified theoretical model of morphosyntactic variation may be constructed with just three basic domain-general cognitive constraints that are assumed in Cognitive (Socio)Linguistics and how, drawing on Cognitive Linguistics, this model may be operationalized. By approaching morphosyntactic variation through this model, variationist analyses can go beyond the mere description of data and understand the patterns in it as reflexes of constraints that are deeply rooted in general cognition. As a result, variationists may formulate hypotheses that not only describe the data, but also explain, in a psychologically plausible fashion, why the data are the way they are. However, Chapter 8 and Chapter 9 have also demonstrated that cognitive constraints on language variation can only explain the lion’s share of the variability. To gain a complete understanding of the structure of variation, it is imperative to attend to social-interactional meaning and to the individuals behind the regression estimates and effect plots.

References

Abbott, Barbara. 2004. Definiteness and indefiniteness. In Laurence Horn & Gregory Ward (ed.), The handbook of pragmatics, 122–149. Oxford: Blackwell. Alba, Orlando. 2004. ¿Cómo hablamos los dominicanos? Un enfoque sociolingüístico. Santo Domingo: Grupo León Jimenes. Aleza-Izquierdo, Milagros. 2011. Fenómenos gramaticales en el habla culta de la generación joven de La Habana, Cuba. Materiales para su estudio. Itinerarios: Revista de Estudios Lingüísticos, Literarios, Históricos y Antropológicos 13. 29–51. Allen, Kachina, Francisco Pereira, Matthew Botvinick & Adele E. Goldberg. 2012. Distinguishing grammatical constructions with fMRI pattern analysis. Brain & Language 123(3). 174–182. Álvarez-Martínez, María Ángeles. 1996. Extremeño. In Manuel Alvar-López (ed.), Manual de dialectología hispánica: El español de España, 171–182. Barcelona: Ariel. Álvarez-Nazarío, Manuel. 1991. Historia de la lengua española en Puerto Rico: Su pasado y su presente en el marco de la realidad social. San Juan, PR: Academia Puertorriqueña de la Lengua Española. Alvar-López, Manuel. 2000. El español de la República Dominicana: Estudios, encuestas, textos. Alcalá de Henares: La Goleta. Anderson, David R., Kenneth P. Burnham & William S. Thompson. 2000. Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management 64(4). 912–923. Ashby, William. 2001. Un nouveau regard sur la chûte du ne en français parlé tourangeau: s'Agit-il d'un changement en cours? Journal of French Language Studies 11(1). 1–22. Ashby, William J. & Paola Bentivoglio. 1993. Preferred argument structure in French and Spanish. Language Variation and Change 5(1). 61–76. Aslin, Richard N., Jenny R. Saffran & Elissa L. Newport. 1999. Statistical learning in linguistic and nonlinguistic domains. In Brian MacWhinney (ed.), The emergence of language, 359–380. London: Lawrence Erlbaum Associates, Publishers. Baayen, R. Harald, Douglas J. Davidson & Douglas M. Bates. 2008. 
Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59(4). 390–412. Baayen, R. Harald. 2014. Multivariate statistics. In Richard Podesva & Devyani Sharma (ed.), Research methods in linguistics, 337–372. Cambridge, MA: Cambridge University Press. Balota, David A., Melvin A. Yap & Michael J. Cortese. 2006. Visual word recognition: The journey from features to meaning. In Matthew J. Traxler & Morton J. Gernsbacher (ed.), Handbook of psycholinguistics, 285–375. Amsterdam/New York, NY: Elsevier. Bargh, John A. & Tanya L. Chartrand. 1999. The unbearable automaticity of being. American Psychologist 54(7). 462–479. Bartón, Kamil. 2016. MuMIn: Model selection and model averaging based on information criteria (AICc and alike). https://cran.r-project.org/web/packages/MuMIn/index.html. (May 2016). Bates, Douglas, Martin Maechler, Ben Bolker & Steven Walker. 2016. lme4: Linear Mixed-Effects Models using 'Eigen' and S4. https://cran.r-project.org/web/packages/lme4/index.html. (May 2016). Bello, Andrés. 1860. Gramática de la lengua castellana al uso de los americanos. Bogotá: Echeverría Hermanos.

Bentivoglio, Paola & Mercedes Sedano. 2011. Morphosyntactic variation in Spanish-speaking Latin America. In Manuel Díaz-Campos (ed.), The handbook of Hispanic sociolinguistics, 123–147. Oxford: Blackwell. Beutel, Ann M. & Margaret Mooney-Marini. 1995. Gender and values. American Sociological Review 60(3). 436–448. Birner, Betty. 1994. Information status and word order: An analysis of English inversion. Language 70(2). 233–259. Birner, Betty & Gregory Ward. 1996. A crosslinguistic study of postposing in discourse. Language and Speech 39(2–3). 113–142. Blas-Arroyo, José Luis. 1995. A propósito de un caso de convergencia gramatical por causación múltiple en el área de influencia lingüística catalana. Análisis sociolingüístico. Cuadernos de Investigación Filológica 21–22. 175–200. Blas-Arroyo, José Luis. 1999. Lenguas en contacto. Consecuencias lingüísticas del bilingüismo social de las comunidades de habla del este peninsular. Frankfurt am Main/Madrid: Vervuert/Iberoamericana. Blas-Arroyo, José Luis. 2016. Entre la estabilidad y la hipercorrección en un antiguo ‘cambio desde abajo’: haber existencial en las comunidades de habla castellonenses. Lingüística Española Actual 6(1). In press. Bock, Kathryn. 1986. Syntactic persistence in language production. Cognitive Psychology 18(3). 355–387. Bock, Kathryn, Gary S. Dell, Franklin Chang & Kristine Onishi. 2007. Persistent structural priming from language comprehension to language production. Cognition 104(3). 437–458. Bock, Kathryn & Zenzi M. Griffin. 2000. The persistence of structural priming: Activation or implicit learning? Journal of Experimental Psychology: General 129(2). 177–192. Bodenhausen, Galen V., Sonia K. Kan & Destiny Peery. 2012. Social categorization and the perception of social groups. In Susan Fiske & Neil C. Macrae (ed.), The Sage handbook of social cognition, 318–336. New York, NY: Sage. Bolinger, Dwight. 1954. Further comment on haber. Hispania 37(3). 334–335. Bolinger, Dwight. 1968. 
Entailment and the meaning of structures. Glossa 2(2). 119–127. Bolinger, Dwight. 1977. Meaning and form. New York, NY: Longman. Bourdieu, Pierre. 1977. The economics of linguistic exchanges. Social Science Information 16(6). 645–668. Bresnan, Joan. 2007. Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In Sam Featherston & Wolfgang Sternefeld (ed.), Roots: Linguistics in search of its evidential base. Berlin/Boston, MA: De Gruyter. Bresnan, Joan, Ana Cueni, Tatiana Nikitina & R. Harald Baayen. 2007. Predicting the dative alternation. In Gerlof Bouma, Irene Krämer & Joost Zwarts (ed.), Cognitive foundations of interpretation, 69–94. Amsterdam: Royal Netherlands Academy of Science. Bresnan, Joan & Marilyn Ford. 2010. Predicting syntax: Processing dative constructions in American and Australian varieties of English. Language 86(1). 168–213. Bresnan, Joan & Jennifer Hay. 2008. Gradient Grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua 118(2). 245–259. Brown, Esther & Javier Rivas. 2012. Grammatical relation probability: How usage patterns shape analogy. Language Variation and Change 24(3). 317–341. Burnham, Kenneth P. & David R. Anderson. 2002. Model selection and multimodel inference. New York, NY: Springer. Bybee, Joan. 2001. Phonology and language use. Cambridge, MA: Cambridge University Press.

Bybee, Joan. 2006. From usage to grammar: The mind's response to repetition. Language 82(4). 711–733. Bybee, Joan. 2009. Grammaticization: Implications for a theory of language. In Jiansheng, Guo, Elena Lieven, Nancy Budwig, Susan Ervin-Tripp, Seyd Ozcaliskan & Keiko Nakamura (ed.), Crosslinguistic approaches to the psychology of language: Research in the tradition of Dan Isaac Slobin, 345–355. Mahwah, NJ: Taylor and Francis. Bybee, Joan. 2010. Language, usage, and cognition. Cambridge, MA: Cambridge University Press. Bybee, Joan & Clay Beckner. 2010. Usage-based theory. In Bernd Heine & Heiko Narrog (ed.), The Oxford handbook of linguistic analysis, 827–856. Oxford: Oxford University Press. Campbell-Kibler, Kathryn. 2010. New directions in sociolinguistic cognition. University of Pennsylvania Working Papers in Linguistics 15(2). Article 5. Carpenter, Jeannine & Sarah Hilliard. 2005. Shifting parameters of individual and group variation. Journal of English Linguistics 33(2). 161–184. Castillo Trelles, Carolina. 2007. La pluralización del verbo haber impersonal en el español yucateco. In Jonathan Holmquist, Augusto Lorenzino & Lotfi Sayahi (ed.), Selected Proceedings of the Third Workshop on Spanish Sociolinguistics, 74–84. Somerville, MA: Cascadilla Proceedings Project. Catalán, Diego. 1989. El español: Orígenes de su diversidad. Madrid: Paraninfo. Cedergren, Henrietta & David Sankoff. 1974. Variable Rules: performance as a statistical reflection of competence. Language 50(2). 333–355. Chambers, Jack K. 2009. Sociolinguistic theory: Linguistic variation and its social significance. Oxford: Blackwell. Chang, Franklin, Gary S. Dell, Kathryn Bock & Zenzi M. Griffin. 2000. Structural priming as implicit learning: Comparison of models of sentence production. Journal of Psycholinguistic Research 29(2). 217–229. Cheshire, Jenny. 2002. Sex and gender in variationist research. In Jack K. Chambers, Peter Trudgill & N. 
Chilling-Estes (ed.), The handbook of language variation and change, 423–443. Oxford: Blackwell. Chipere, Ngoni. 2001. Variations in native speaker competence: Implications for first-language teaching. Language Awareness 1(2–3). 107–214. Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT University Press. Chomsky, Noam. 1995. The Minimalist Program. Cambridge, MA: The MIT Press. Cienki, Alan. 2007. Frames, idealized cognitive models and domains. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 170–187. Oxford: Oxford University Press. Claes, Jeroen. 2017. La pluralización de haber presentacional en el español peninsular: Datos de Twitter. Sociolinguistics Studies 11(1). In press. Clark, Eve V. 1978. Locationals: Existential, locative and possessive constructions. In Joseph H. Greenberg, Charles H. Ferguson & Edith A. Moravcsik (ed.), Universals of human language. Volume 4: Syntax, 85–126. Stanford, CA: Stanford University Press. Clark, Lynn. 2007. Cognitive Sociolinguistics: A viable approach to variation in linguistic theory. In William J. Sullivan & Arle R. Lommel (ed.), Lacus forum 33: Variation, 1–14. Houston, TX: Lacus. Clark, Lynn. 2008. Re-examining vocalic variation in Scottish English: A Cognitive Grammar approach. Language Variation and Change 20(2). 255–273.

Colleman, Timothy. 2010. Lectal variation in constructional semantics: 'Benefactive' ditransitives in Dutch. In Dirk Geeraerts, Gitte Kristiansen & Yves Peirsman (ed.), Advances in Cognitive Sociolinguistics, 191–223. Berlin/Boston, MA: De Gruyter. Company-Company, Concepción. 2003. La gramaticalización en la historia del español. Medievalia 35. 3–61. Comrie, Bernard. 1989. Language universals and linguistic typology. Chicago, IL: Chicago University Press. Coveney, Alan. 2003. “Anything you can do, tu can do better”: Tu and vous as substitutes for indefinite on in French. Journal of Sociolinguistics 7(2). 164–191. Coveney, Alan. 2004. The alternation between l'on and on in spoken French. Journal of French Language Studies 79(2). 91–112. Coveney, Alan. 2005. Doubling in spoken French: A sociolinguistic approach. The French Review 79(1). 96–111. Croft, William. 2000. Explaining language change: An evolutionary perspective. London/New York, NY: Longman. Croft, William. 2003. Typology and universals. Cambridge, MA: Cambridge University Press. Croft, William. 2007. Construction Grammar. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 463–508. Oxford: Oxford University Press. Croft, William. 2009. Toward a social Cognitive Linguistics. In Vyvyan Evans & Stéphanie Pourcel (ed.), New directions in Cognitive Linguistics, 395–420. Amsterdam/Philadelphia, PA: John Benjamins. Croft, William & Alan Cruse. 2004. Cognitive Linguistics. Cambridge, MA: Cambridge University Press. Dąbrowska, Ewa. 1997. The LAD goes to school: A cautionary tale for nativists. Linguistics 35(4). 735–766. Dąbrowska, Ewa. 2012. Different speakers, different grammars: Individual differences in native language attainment. Linguistic Approaches to Bilingualism 2(3). 219–253. Dąbrowska, Ewa. 2013. Functional constraints, usage, and mental grammars: A study of speakers’ intuitions about questions with long-distance dependencies. 
Cognitive Linguistics 24(5). 617–653. Dąbrowska, Ewa. 2014. Recycling utterances: A speaker’s guide to sentence processing. Cognitive Linguistics 25(4). 617–653. Dąbrowska, Ewa. 2015. Individual differences in grammatical knowledge. In Ewa Dąbrowska & Dagmar S. Divjak (ed.), Handbook of Cognitive Linguistics, 650–667. Berlin/Boston, MA: De Gruyter. Dąbrowska, Ewa & James Street. 2006. Individual differences in language attainment: Comprehension of passive sentences by native and non-native English speakers. Language Sciences 28(6). 604–615. D'Aquino-Ruiz, Giovanna. 2004. Haber impersonal en el habla de Caracas. Análisis sociolingüístico. Boletín de Lingüística 21. 3–26. D'Aquino-Ruiz, Giovanna. 2008. El cambio linguistico de haber impersonal. Núcleo 20(25). 103–124. Davies, Mark. 2008-. Corpus of contemporary American English: 450 million words (19902012). http://corpus.byu.edu/coca/. (May 2016). Davies, Mark. 2002-. Corpus del español: 2 million words (1200s-1900s). http://www.corpusdelespanol.org/. (May 2016).

References | 223

Delbecque, Nicole. 2002. A construction grammar approach to transitivity in Spanish. In Kristin Davidse & Béatrice Lamiroy (ed.), The nominative and accusative and their counterparts, 81–130. Amsterdam/Philadelphia, PA: John Benjamins. Dell, Garry S. 1986. A spreading-activation theory of retrieval in sentence production. Psychological Review 92(3). 283–321. Dell, Garry S., Franklin Chang & Zenzi M. Griffin. (1999). Connectionist models of language production: Lexical access and grammatical encoding. Cognitive Science 23(4), 517–542. DeMello, George. 1991. Pluralización del verbo haber impersonal en el español hablado culto de once ciudades. Thesaurus 46(3). 445–471. Díaz-Campos, Manuel. 2003. The pluralization of haber in Venezuelan Spanish: A sociolinguistic change in real time. IU Working Papers in Linguistics 3. Article 5. Dixon, Robert M. W. 1979. Ergativity. Language 15(1). 59–138. Drager, Katie & Jennifer Hay. 2012. Exploiting random intercepts: Two case studies in sociophonetics. Language Variation and Change 24(1). 59–78. Du Bois, John. 1987. The discourse basis of ergativity. Language 63(4).805-855 Eckert, Penelope. 1989. The whole woman: Sex and gender differences in variation. Language Variation and Change 1(3). 245–267. Eckert, Penelope. 2008. Variation and the indexical field. Journal of Sociolinguistics 12(4). 453–476. Eckert, Penelope. 2012. Three waves of variation study: The emergence of meaning in the study of sociolinguistic variation. Annual Review of Anthropology 41. 87–100. Ellis, Nick C. & Fernando Ferreira-Junior. 2009. Constructions and their acquisition: Islands and the distinctiveness of their occupancy. Annual Review of Cognitive Linguistics 7. 187–220 Epstein, Cyntia Fuchs. 2007. Great divides: The cultural, cognitive, and social bases of the global subordination of women. American Sociological Review 72(1). 1–22. Farmer, Thomas A., Jennifer B. Misyak & Morten H. Christiansen. (2012). Individual differences in sentence processing. 
In Michael J. Spivey, Ken McRae & Marc Joannisse (ed.), Cambridge handbook of psycholinguistics, 353–364. Cambridge, MA: Cambridge University Press. Fauconnier, Gilles. 2007. Mental spaces. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 351–376. Oxford: Oxford University Press. Fauconnier, Giles & Mark Turner. 1996. Blending as a central process of grammar. In Adele E. Goldberg (ed.), Conceptual structure, discourse and language, 113–129. Stanford, CA: CSLI Publications. Fernández, Félix. 1982. Actitudes lingüísticas: Un sondeo preliminar. In Orlando Alba (ed.), El español del Caribe: Ponencias del VI Simposio de dialectología, 89–104. Santiago de Los Caballeros: Pontífica Universidad Católica Madre y Maestra. Fernández-Soriano, Olga. 1999. Two types of impersonal sentences in Spanish: Locative and dative subjects. Syntax 2(2). 101–140. Fernández-Soriano, Olga & Susana Táboas-Baylín. 1999. Construcciones impersonales no reflejas. In Ignacio Bosque & Violeta Demonte (ed.), Gramática descriptiva de la Lengua Española, 1723–1778. Madrid: Espasa-Calpe. Ferreira, Fernanda & Paul E. Engelhardt. 2006. Syntax and production. In Matthew J. Traxler & Morton J. Gernsbacher (ed.), Handbook of psycholinguistics, 61–92. Amsterdam/New York, NY: Elsevier. Fillmore, Charles, Paul Kay & Mary Catherine O'Connor. 1988. Regularity and Idiomaticity in grammatical constructions: The case of let alone. Language 64(3). 501–538.

224 | References Flórez, Luis. 1946. Reseña a "Charles E. Kany, American-Spanish syntax, Chicago, University of Chicago Press, 1945. XII-466 págs.". Thesaurus 2(2). 372–385. Fontanella de Weinberg, María Beatriz. 1987. El español bonaerense: Cuatro siglos de evolución lingüística (1580–1980). Buenos Aires: Hachette. Fontanella de Weinberg, María Beatriz. 1992a. El español de América. Madrid: Mapfre. Fontanella de Weinberg, María Beatriz. 1992b. Variación sincrónica y diacrónica de de las construcciones con haber en el español americano. Boletín de Filología de la Universidad de Chile 33. 35–46. Forrest, Jon. 2015. Community rules and speaker behavior: Individual adherence to group constraints on (ING). Language Variation and Change 27(2). 377–406. Fox, John & Sanford Weisberg. 2016. car: Companion to applied regression. https://cran.rproject.org/web/packages/car/index.html. (May 2016). Freeze, Ray. 1992. Existentials and other locatives. Language 68(3). 553–595. Freites-Barros, Francisco. 2003. Actitudes lingüísticas en torno a la pluralización de haber impersonal en los Andes venezolanos. Interlingüística 14. 375–382. Freites-Barros, Francisco. 2004. Pluralización de haber impersonal en el Táchira: Actitudes lingüísticas. Boletín de Lingüística 22. 32–51. Freites-Barros, Francisco. 2008. Más sobre la pluralización de haber impersonal en Venezuela: El estado Táchira. Lingua Americana 12(22). 36–57. Garachana-Camarero, Mar. 1997. Acerca de los condicionamientos cognitivos y lingüísticos de la sustitución de aver por tener. Verba 24. 203–235. Geeraerts, Dirk. 2005. Lectal variation and empirical data in Cognitive Linguistics. In Francisco Ruiz-Mendoza de Ibañez & Sandra Peña-Cervel (ed.), Cognitive Linguistics: Internal dynamics and interdisciplinary interactions, 163–189. Berlin/Boston, MA: De Gruyter. Geeraerts, Dirk. 2016. Entrenchment as onomasiological salience. 
In Hans-Jorg Schmid (ed.), Entrenchment, memory and automaticity: The psychology of linguistic knowledge and language learning, in press. Berlin/Boston, MA: De Gruyter. Geeraerts, Dirk, Stefan Grondelaers & Peter Bakema. 1994. The structure of lexical variation. Berlin/Boston, MA: De Gruyter. Geeraerts, Dirk & Gitte Kristiansen. 2015. Variationist linguistics. In Ewa Dąbrowska & Dagmar S. Divjak (ed.), Handbook of Cognitive Linguistics, 366–389. Berlin/Boston, MA: De Gruyter. Gili-Gaya, Samuel. 1980. Curso superior de sintaxis española. Barcelona: Vox. Givón, Talmy. 1999. Generativity and variation: The notion 'rule of grammar' revisited. In Brian MacWhinney (ed.), The emergence of language, 81–114. London: Lawrence Erlbaum Associates, Publishers. Goldberg, Adele E. 1995. Constructions: A Construction Grammar approach to argument structure. Chicago, IL: Chicago University Press. Goldberg, Adele E. 2001. Patient arguments of causative verbs can be omitted: The role of information structure in argument distribution. Language Sciences 23. 503–524. Goldberg, Adele E. 2003. Constructions: A new theoretical approach to language. Trends in Cognitive Science 7(5). 219–224. Goldberg, Adele E. 2005a. Argument realization. In Jan-Ola Östman & Mirjam Fried (ed.), Construction grammars: Cognitive grounding and theoretical extensions, 17–43. Amsterdam/Philadelphia, PA: John Benjamins. Goldberg, Adele E. 2005b. Constructions, lexical semantics, and the correspondence principle: Accounting for generalizations and subregularities in the realization of arguments. In No-

References | 225

mi Erteschik-Shir & Tova Rapoport (ed.), The syntax of aspect: Deriving thematic and aspectual interpretation, 215–236. Oxford: Oxford University Press. Goldberg, Adele E. 2006a. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press. Goldberg, Adele E. 2006b. The inherent semantics of argument structure: The case of the English ditransitive construction. In Dirk Geeraerts (ed.), Cognitive Linguistics: Basic readings, 401–437. Oxford: Oxford University Press. Goldberg, Adele E. 2009. The nature of generalization in language. Cognitive Linguistics 20(1). 93–127. Goldberg, Adele E. 2010. Verbs, constructions, and semantic frames. In Malka RappaportHovav, Edit Doron & Ivy Sichel (ed.), Lexical semantics, syntax, and event structure, 39–51. Oxford: Oxford University Press. Goldberg, Adele E. 2011. Corpus evidence of the viability of statistical preemption. Cognitive Linguistics 22(1). 131–153. Goldberg, Adele E. & Farell Ackerman. 2001. The pragmatics of obligatory adjuncts. Language 77(4). 798–814. Gómez-Molina, José-Ramón. 2013. Pluralización de haber impersonal en el español de Valencia (España). Verba 40. 253–284. Gómez-Torrego, Leonardo. 1994. Impersonalidad gramatical: Descripción y norma. Madrid: Arcos libros. González-Calvo, José-Manuel. 2002. Sintaxis y semántica: Haber impersonal en español. In Saraegui Platero & Luis Manuel Casado-Velarde (ed.), Pulchre, bene, recte: Homenaje al prof. Fernando González-Ollé, 639–656. Pamplona: Editorial de la Universidad de Navarra. Gries, Stefan Th. 2013a. Sources of variability relevant to the cognitive sociolinguist, and corpus- as well as psycholinguistic methods and notions to handle them. Journal of Pragmatics 52(1). 5–16. Gries, Stefan Th. 2013b. Statistics for linguistics with R: A practical introduction. Berlin/Boston, MA: De Gruyter. Griffin, Zenzi M. & Victor S. Ferreira. 2006. Properties of spoken language production. In Matthew J. Traxler & Morton J. 
Gernsbacher (ed.), Handbook of psycholinguistics, 21–60. Amsterdam/New York, NY: Elsevier. Grondelaers, Stefan, Dirk Geeraerts & Dirk Speelman. 2009. A case for a cognitive corpus linguistics. In Mónica González-Márquez, Irene Mittelberg, Seana Coulson & Michael J. Spivey (ed.), Methods in Cognitive Linguistics, 149–169. Berlin/Boston, MA: De Gruyter. Guy, Gregory. 1980. Variation in the group and the individual: The case of final stop deletion. In William Labov (ed.), Location language in time and space, 1–36. New York, NY: Academic Press. Harrell, Frank. 2001. Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. New York, NY: Springer. Harrell, Frank E. 2016. Hmisc: Harrell miscellaneous. https://cran.rproject.org/web/packages/Hmisc/index.html. (May 2016). Hay, Jennifer, Paul Warren & Katie Drager. 2006. Factors influencing speech perception in the context of a merger-in-progress. Journal of Phonetics 34. 458–484. Henríquez Ureña, Pedro. 1982 [1940]. El español en Santo Domingo. Santo Domingo: Taller. Henry, Alison. 2002. Variation and syntactic theory. In Jack K. Chambers, Peter Trudgill & Natalie Chilling-Estes (ed.), The handbook of language variation and change, 267–282. Oxford: Blackwell.

226 | References Hernández-Alonso, César. 1996. Castilla La Vieja. In Manuel Alvar-López (ed.), Manual de dialectología hispánica: El español de España, 197–212. Barcelona: Ariel. Hernández-Díaz, Axel. 2006. Gramaticalización y reanálisis: La concordancia del verbo haber existencial en la diacronía del español. In Concepción Company-Company (ed.), Sintaxis histórica de la lengua española. Primera parte: La frase verbal, Vol.2, 1055–1160. Mexico City: Universidad Nacional Autónoma de México/Fondo de Cultura Económica. Hollmann, Willem B. & Anna Siewierska. 2011. The status of frequency, schemas, and identity in Cognitive Sociolinguistics: A case study on definite article reduction. Cognitive Linguistics 22(1). 25–54. Holmquist, Jonathan. 2008. Gender in context: Features and factors in men’s and women’s speech in rural Puerto Rico. In Maurice Westmoreland & Juan Antonio Thomas (ed.), Selected proceedings of the 4th workshop on Spanish sociolinguistics, 17–35. Somerville, MA: Cascadilla Proceedings Project. Hosmer, David W. & Stanley Lemeshow. 2000. Applied logistic regression. Oxford: Wiley. Hothorn, Torsten, Kurt Hornik, Carolin Strobl & Achim Zeileis. 2016. party: A laboratory for recursive partytioning. http://cran.r- project.org/web/packages/party/index.html. (May 2016). Hubbard, Raymond & Murray R. Lindsay. 2008. Why p-values are not a useful measure of evidence in statistical significance testing. Theory & Psychology 18(1). 69–88. Jiménez-Sabater, Max A. 1978. Estructuras morfosintácticas en el español dominicano: Algunas implicaciones sociolingüísticas. In Humberto López-Morales (ed.), Corrientes actuales en la dialectología del Caribe: Actas de un simposio, 167–180. San Juan, PR: Editorial Universitaria de la Universidad de Puerto Rico. Jiménez-Sabater, Max A. 1984. Más datos sobre el español de la República Dominicana. Santo Domingo: Editorial de la Universidad Autónoma de Santo Domingo. Johnson, Daniel Ezra. 2009. 
Getting off the GoldVarb standard: Introducing Rbrul for mixedeffects variable rule analysis. Language and Linguistics Compass 3(1). 359–383. Jorge-Morel, Elercia. 1978. Estudio lingüístico de Santo Domingo. Santo Domingo: Taller. Kany, Charles E. 1951 [1945]. Sintaxis hispanoamericana. Madrid: Gredos. Keenan, Edward. 1976. Towards a universal definition of subject. In Charles N. Li (ed.), Subject and topic, 305–333. New York, NY: Academic Press. Kiesling, Scott F. 2005. Variation, stance and style. Word-final -er, high rising tone, and ethnicity in Australian English. English World-Wide 26(1). 1–42. Kiesling, Scott F. 2009. Style as stance. In Alexandre Jaffa (ed.), Stance: Sociolinguistic perspectives, 171–194. Oxford: Oxford University Press. Kiesling, Scott F. 2013. Constructing identity. In Jack K. Chambers & Natalie Schilling-Estes (ed.), The handbook of language variation and change, 448–468. Oxford: Wiley. Kristiansen, Gitte. 2008. Style-shifting and shifting styles: A socio-cognitive approach to lectal variation. In Gitte Kristiansen & René Dirven (ed.), Cognitive Sociolinguistics: Language variation, cultural models, social systems, 45–89. Berlin/Boston, MA: De Gruyter. Labov, William. 1972. Sociolinguistic patterns. Philadelphia, PA: University of Pennsylvania Press. Labov, William. 1982. Building on empirical foundations. In Winfred Ph. Lehmann & Yakov Malkiel (ed.), Perspectives on historical linguistics, 17–92. Amsterdam/Philadelphia, PA: John Benjamins. Labov, William. 1994. Principles of linguistic change. Volume 1: Internal factors. Oxford: Blackwell.

References | 227

Labov, William. 2001. Principles of linguistic change. Volume 2: Social factors. Oxford: Blackwell. Labov, William. 2006 [1966]. The social stratification of English in New York City. Cambridge, MA: Cambridge University Press. Labov, William. 2010. Principles of linguistic change. Volume 3: Cognitive and cultural factors. Oxford: Wiley-Blackwell. Lakoff, George. 1987. Women, fire, and dangerous things: What categories reveal about the mind. Chicago, IL: Chicago University Press. Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Volume 1: Theoretical prerequisites. Stanford, CA: Stanford University Press. Langacker, Ronald W. 1990. Concept, image, symbol: The cognitive basis of grammar. Berlin/Boston, MA: De Gruyter. Langacker, Ronald W. 1991. Foundations of Cognitive Grammar. Volume 2: Descriptive application. Stanford, CA: Stanford University Press. Langacker, Ronald W. 2007. Cognitive Grammar. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 421–462. Oxford: Oxford University Press. Langacker, Ronald W. 2008. Cognitive Grammar: A basic introduction. Oxford: Oxford University Press. Langacker, Ronald W. 2010. Cognitive Grammar. In Bernd Heine & Heiko Narrog (ed.), The Oxford handbook of linguistic analysis, 87–110. Oxford: Oxford University Press. Lapesa, Rafael. 1981. Historia de la lengua española. Madrid: Gredos. Lavandera, Beatriz. 1978. Where does the sociolinguistic variable stop? Language in Society 7(2). 171–182. Levshina, Natalia. 2015. How to do linguistics with R: Data exploration and statistical analysis. Amsterdam/Philadelphia, PA: John Benjamins. Levshina, Natalia, Dirk Geeraerts & Dirk Speelman. 2013. Towards a 3D-grammar: Interaction of linguistic and extralinguistic factors in the use of Dutch causative constructions. Journal of Pragmatics 52(1). 34–48. Lim, Laureen T. & Gregory R. Guy. 2005. The limits of linguistic community: Speech styles and variable constraint effects. 
University of Pennsylvania Working Papers in Linguistics 10(2). Article 13. Llorente, Antonio Maldonado de Guevara. 1980. Consideraciones sobre el español actual. Anuario de Letras 18. 5–61. Lope-Blanch, Juan Miguel. 1996. México. In Manuel Alvar-López (ed.), Manual de dialectología hispánica: El español de América, 81–89. Barcelona: Ariel. López-Morales, Humberto. 1983. Estratificación social del español de San Juan de Puerto Rico. Mexico City: Universidad Nacional Autónoma de México. López-Morales, Humberto. 1992. El español del Caribe. Madrid: Mapfre. Lyons, John. 1967. A note on possessive, existential and locative sentences. Foundations of Language 3. 390–396. MacDonald, Maryellen. 2013. How language production shapes language form and comprehension. Frontiers in Psychology 4. Article 226. Malaver, Irania. 1999. Estudio de la conciencia lingüística sobre hubieron. Lingua Americana 3(5). 26–42. Meulleman, Machteld & Eugeen Roegiest. 2012. Los locativos en la valencia de la construcción existencial española. ¿Actante o circunstante? Zeitschrift für romanische Philologie 128. 57–70.

228 | References Meyerhoff, Miriam & James A. Walker. 2007. The persistence of variation in individual grammars: Copula absence in 'urban sojourners' and their 'stay-at-home' peers, Bequia (St. Vincent and the Grenadines). Journal of Sociolinguistics 11(3). 346–366. Milroy, Lesley & Matthew Gordon. 2003. Sociolinguistics: Method and analysis. Oxford: Blackwell. Montes de Oca, María del Pilar. 1994. La concordancia con haber impersonal. Anuario de Letras 32. 7–35. Montes-Giraldo, José-Joaquín. 1982. Sobre el sintagma haber + sustantivo. Thesaurus 37(2). 383–385. Morales, Amparo. 1999. Anteposición de sujeto en el español del Caribe. In Manuel ÁlvarezNazarío & Luis A. Ortiz-López (ed.), El Caribe hispánico: Perspectivas lingüísticas actuales, 77–98. Frankfurt am Main/Madrid: Vervuert/Iberoamericana. Moreno de Alba, José G. 1995. El español en América. Mexico City: Fondo de Cultura Económica. Moreno-Fernández, Francisco. 2003. Metodología del proyecto para el estudio sociolingüístico del español de España y América (PRESEEA). www.linguas.net/portalpreseea. (May 2016). Myachykov, Andriy & Russel S. Tomlin. 2015. Attention and salience. In Ewa Dąbrowska & Dagmar S. Divjak (ed.), Handbook of Cognitive Linguistics, 31–52. Berlin/Boston, MA: De Gruyter. Nakagawa, Shinichi & Holger Schielzeth. 2013. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution 4. 133– 142. Navarro-Correa, Manuel. 1992. Valoración social de algunas formas verbales en el habla de Valencia. Lingüística Española Actual 14(1). 97–106. Navarro-Tomás, Tomás. 1948. El español en Puerto Rico: Contribución a la geografía lingüística hispanoamericana. Río Piedras, PR: Editorial de la Universidad de Puerto Rico. Nuño-Álvarez, María del Pilar. 1996. Cantabria. In Manuel Alvar-López (ed.), Manual de dialectología hispánica: El español de España, 183–196. Barcelona: Ariel. Ober, Beth A. & Gregory K. Shenaut. 2006. Semantic memory. In Matthew J. 
Traxler & Morton J. Gernsbacher (ed.), Handbook of psycholinguistics, 403–454. Amsterdam/New York, NY: Elsevier. Padrón, Alfredo. 1949. Giros sintácticos usados en Cuba. Thesaurus 5(1–3). 163–175. Paolillo, John C. 2013. Individual effects in variation analysis: Model, software and research design. Language Variation and Change 25(1). 89–118. Pato, Enrique. 2016. La pluralización de haber en español peninsular. In Carlota de Benito Moreno & Álvaro Octavio de Toledo (ed.), En torno a haber: Construcciones, usos y variación desde el latín hasta la actualidad, 357–391. Berlin/New York, NY: Peter Lang. Pérez-Martín, Ana María. 2007. Pluralización de había en el habla de El Hierro: Datos cuantitativos. Revista de Filología de la Universidad de La Laguna 25. 505–513. Pickering, Martin J. & Victor S. Ferreira. 2008. Structural priming: A critical review. Psychological Bulletin 134(3). 427–459. Pizarro-Pedraza, Andrea. 2016. Variación semántica y significado social: Hacia una sociolingüística cognitiva de la tercera ola. Dicenda: Cuadernos de Filología Hispánica 34(1). in press. Poplack, Shana. 1984. Variable concord and sentential plural marking in Porto Rican Spanish. Hispanic review 52(2). 205–222.

References | 229

Prince, Ellen. 1992. The ZPG letter: Subjects, definiteness, and information-status. In William C. Mann & Sandra A. Thompson (ed.), Discourse description: Diverse linguistic analyses of a fundraising text, 295–326. New York, NY: John Benjamins. Pütz, Martin, Justinya A. Robinson & Monica Reif. 2012. The emergence of Cognitive Sociolinguistics. Review of Cognitive Linguistics 10(2). 241–226 Quilis, Antonio. 1983. La concordancia gramatical en la lengua española hablada en Madrid. Madrid: Consejo Superior de Investigaciones Cientíﬁcas. Quintanilla-Aguilar, José-Roberto Alexander. 2009. La (des)pluralización del verbo haber existencial en el español salvadoreño: ¿Un cambio en progreso? Miami, FL: University of Florida PhD dissertation. R Core Team. 2016. R: A language and environment for statistical computing. https://www.Rproject.org/. (May 2016). Real Academia Española. 2005. Diccionario panhispánico de dudas. Madrid: Espasa-Calpe. Real Academia Española. 2008a-. Corpus diacrónico del español. http://corpus.rae.es/cordenet.html. (October 2011). Real Academia Española. 2008b-. Corpus de referencia del español actual. http://corpus.rae.es/creanet.html. (November 2013). Real Academia Española & Asociación de Academias de la Lengua Española. 2009. Nueva gramática de la lengua española. Madrid: Espasa-Calpe. Rivas, Javier & Esther Brown. 2012. Stage-level and individual-level distinction in morphological variation: An example with variable haber agreement. Borealis: An International Journal of Hispanic Linguistics 1(2). 73–90. Rivas, Javier & Esther Brown. 2013. Concordancia variable con haber en español puertorriqueño. Boletín de Lingüística 24(37–38). 102–118. Robenalt, Clarice & Adele E. Goldberg. 2015. Judgment evidence for statistical preemption: It is relatively better to vanish than to disappear a rabbit, but a lifeguard can equally well backstroke or swim children to shore. Cognitive Linguistics 26(3). 467–503. Rodríguez-Mondoñedo, Miguel. 2006. 
Spanish existentials and other accusative constructions. In Cedric Boeckx (ed.), Minimalist essays, 326–394. Amsterdam/Philadelphia, PA: John Benjamins. Romaine, Susan. 1984. On the problem of syntactic variation and pragmatic meaning in sociolinguistic theory. Folia Linguistica 18(3–4), 409–438. Schmid, Hans Jörg. 2007. Entrenchement, salience and basic levels. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 117–138. Oxford: Oxford University Press. Schmid, Hans-Jörg & Helmut Küchenhoff. 2013. Collostructional analysis and other ways of measuring lexicogrammatical attraction: Theoretical premises, practical problems and cognitive underpinnings. Cognitive Linguistics 24(3). 531–577. Sherman, Steven J., Jeffrey W. Sherman, Elise J. Percy & Courtney K. Soderberg. 2013. Stereotype development and formation. In Donal Carlston (ed.), The handbook of social cognition, 548–574. Oxford: Oxford University Press. Silva-Corvalán, Carmen. 2001. Sociolingüística y pragmática del español. Georgetown, D.C.: Georgetown University Press. Silverstein, Michael. 2003. Indexical order and the dialectics of sociolinguistic life. Language & Communication 23. 193–229. Smith, Jennifer & Mercedes Durham. 2012a. A tipping point in dialect obsolescence? Change across the generations in Lerwick, Shetland. Journal of Sociolinguistics 15(2). 197–225.

230 | References Smith, Jennifer & Mercedes Durham. 2012b. Bidialectalism or dialect death? Explaining generational change in the Shetland Islands, Scotland. American Speech 87(1). 57–88. Speelman, Dirk. 2014. Logistic regression: A confirmatory technique for comparisons in corpus linguistics. In Dylan Glynn & Justyna A. Robinson (ed.), Corpus Methods for Semantics: Quantitative studies in polysemy and synonymy, 487–533. Amsterdam/Philadelphia, PA: John Benjamins. Street, James & Ewa Dąbrowska. 2010. More individual differences in language attainment: How much do adult native speakers of English know about passives and quantifiers? Lingua 120(8). 2080–2094. Suñer, Margarita. 1982. Syntax and semantics of Spanish presentational sentences. Georgetown, D.C.: Georgetown University Press. Szmrecsanyi, Benedikt. 2006. Morphosyntactic persistence in spoken English: A corpus study at the intersection of variationist sociolinguistics, psycholinguistics, and discourse analysis. Berlin/Boston, MA: De Gruyter. Szmrecsanyi, Benedikt. 2013. Diachronic Probabilistic Grammar. English Language and Linguistics 19(3). 41–68. Szmrecsanyi, Benedikt, Douglas Biber, Jesse Egbert & Karlien Franco. 2016. Toward more accountability: Modeling ternary genitive variation in Late Modern English. Language Variation and Change 28(1). 1–29. Tagliamonte, Sali. 2006. Analysing sociolinguistic variation. Cambridge, MA: Cambridge University Press. Tagliamonte, Sali. 2012. Variationist sociolinguistics: Change, observation, interpretation. Oxford: Wiley-Blackwell. Tagliamonte, Sali. 2013. Comparative sociolinguistics. In Jack K. Chambers & Natalie SchillingEstes (ed.), The handbook of language variation and change, 128–156. Oxford: WileyBlackwell. Tagliamonte, Sali. 2016. Making waves: The story of variationist sociolinguistics. Oxford: Wiley. Tagliamonte, Sali & R. Harald Baayen. 2012. Models, forests and trees of York English: Was/were variation as a case study for statistical practice. 
Language Variation and Change 24(2). 135–178. Talmy, Leonard. 2007. Attention phenomena. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 264–293. Oxford: Oxford University Press. Terrell, Tracy. 1979. Final /s/ in Cuban Spanish. Hispania 42(4). 599–612. Terrell, Tracy. 1982. Relexificación en el español dominicano: Implicaciones para la educación. In Orlando Alba (ed.), El español del Caribe: Ponencias del VI simposio de dialectología, 302–318. Santiago de Los Caballeros: Pontífica Universidad Católica Madre y Maestra. Tomasello, Michael. 2007. Cognitive linguistics and first language acquisition. In Dirk Geeraerts & Hubert Cuyckens (ed.), The Oxford handbook of Cognitive Linguistics, 1092– 1112. Oxford: Oxford University Press. Torrego-Salcedo, Esther. 1999. El complemento preposicional. In Ignacio Bosque & Violeta Demonte (ed.), Gramática descriptiva de la lengua española, 1779–1806. Madrid: EspasaCalpe. Trafimow, David & David Rice. 2009. A test of the null hypothesis significance testing procedure correlation argument. Journal of General Psychology 136(3). 261–270. Vaquero, María. 1978. Enseñar español, pero ¿qué español? Boletín de la Academia Puertorriqueña de la Lengua Española 6. 127–146.

References | 231

Vaquero, María. 1996. Antillas. In Manuel Alvar-López (ed.), Manual de dialectología hispánica: El español de América, 51–67. Barcelona: Ariel. Walker, James A. & Miriam Meyerhoff. 2013. Studies of the community and the individual. In Richard Cameron, Robert Bayley & Ceil Lucas (ed.), The Oxford handbook of sociolinguistics, 175–194. Oxford: Oxford University Press. Waltereit, Richard & Ulrich Detges. 2008. Syntactic change from within and from without syntax: A usage-based analysis. In Ulrich Detges & Richard Waltereit (ed.), The paradox of grammatical change: Perspectives from Romance, 13–30. Amsterdam/Philadelphia, PA: John Benjamins. Ward, Gregory & Betty Birner. 1995. Definiteness and the English existential. Language 71(4). 722–742. Weiner, Judith & William Labov. 1983. Constraints on the agentless passive. Journal of Linguistics 19(1). 29–58. Weinreich, Uriel, William Labov & Marvin Herzog. 1968. Empirical foundations for a theory of language change. In Winfred Ph. Lehman & Yakov Malkiel (ed.), Directions for historical linguistics: A symposium, 97–195. Austin, TX: University of Texas Press. Wickham, Hadley & Winston Chang. 2016. ggplot2: An implementation of the grammar of graphics. https://cran.r-project.org/web/packages/ggplot2/index.html. (May 2016). Wolfram, Walt. 1986. Good data in a bad situation: Eliciting vernacular structures. In Joshua A. Fishman, Andrée Tabouret-Keller, Michael Clyne, Bhariraju Krishnamurti & Mohamed Abdulaziz (ed.), The Fergusionan impact: In honor of Charles A. Ferguson on the occasion of his 65th birthday. Volume 2: sociolinguistics and the sociology of language, 3–22. Berlin/Boston, MA: De Gruyter. Wolfram, Walt & Dan Beckett. 2000. The role of the individual and group in earlier African American English. American Speech 75(1).3-33.

Index 52, 53, 140 75, 140 absence/presence of negation 20, 69, 124, 129, 132, 137, 140, 145, 147, 154, 156, 157, 158, 170, 176, 203, 212, 213, 214 – presence of negation 20, 72, 132 action-chain model 130 age 19, 20, 24, 27, 62, 63 age grading 56 apparent time 55, 56 Argentinean Spanish 14, 15 argument role 33, 34, 35, 36, 37, 90 – zero 90, 116 argument-structure alternation 46, 55, 57 argument-structure constructions 33, 34, 35, 36, 37, 43, 46, 55 automatic spreading activation 39 Bolivian Spanish 14 box diagrams 37 Canarian Spanish 27 canonical event model. See action-chain model Caribbean Spanish 15, 109, 133, 167 categorization 36, 39, 42 CAUSE-RECEIVE 32, 36, 37 Central American Spanish 14 Chilean Spanish 14, 15 Colombian Spanish 14, 15 comparative sociolinguistics 85, 157 – constraint rankings based on conditional variable permutation 85 conceptual import 39, 55, 90, 109, 114, 137, 216 conditional inference tree models 76, 82, 83, 85, 145, 157 – C-index of concordance 83 Cuban Spanish 14, 15, 122, 123, 139, 140, 141, 144 differential object marking 100 Dominican Spanish 15, 16, 27, 122, 123, 132, 133, 139, 140, 141, 144, 158, 168, 212

educational achievement 56, 62, 63, 64, 163, 170, 174, 178 English 37, 92, 93, 96, 100, 106, 107 – presentational there is/there are 92, 93, 96, 100, 104, 106, 107, 110 entrenchment 40, 43, 133, 141, 156 envelope of variation 72 false definites 92 first-person plural haber 53, 75 form-function reanalysis 26, 141, 156 frame semantics 34 – base 34 – frame of constructions 35 – profile 34, 35 gender 20, 24, 56, 62, 65, 163 – female 56, 62, 64, 168, 172 – male 64 generalizations 32 grammatical construction 31 grammatical relation probability 26 grammaticalization 33 ha habido, han habido. See tense: present perfect había, habían. See tense:imperfect habrá, habrán. See tense: morphological future habría, habrían. See tense: conditional hay, hayn. See tense: present hubo, hubieron. See tense: preterit Idealized Cognitive Model 35, 36, 37, 43, 46, 90, 91, 114 implicit NPs 110, 114 individual-level/stage-level predicates – stage-level predicates 131 information status – brand-new 106 – discourse-new 92, 113 – discourse-old 110, 113 – hearer-new 92, 93, 96, 98, 99, 101, 114, 116, 144 – hearer-old 92, 93, 98, 99, 105, 110, 113, 114, 116

234 | Index inheritance hierarchy 39 interspeaker variation 44 judgment sample 61, 63 language acquisition 156 language change 18 – actuation 54 – from above 18, 56 – from below 17, 20, 27, 50, 56, 163 – role of statistical preemption markedness of coding, and structural priming 156 Latin American Spanish 14, 15, 27, 169 lexical item 32, 34 list reading 99 markedness of coding 50, 132, 133, 153, 156 mental spaces 89, 113, 116 – base space 89, 116 – blending 89 Mexican Spanish 14, 15 Minimalist Program 33 mixed-effects logistic regression 76, 82, 85 – random effects 78 – random intercept 78 morphophonological contrast 141 object-verb agreement 72, 144 obligatory adjunct 111 participant role 34, 37 Peninsular Spanish 13, 27 perseverance. See Structural priming Peruvian Spanish 14 POINTING-OUT 90, 91, 114, 116, 156 possessives – possessive haber 53 Principle of Accountability 75 Principles of Linguistic Change 50, 58 – Gender Principle 56 Probabilistic Grammar 3, 4, 177, 213, 215, 217 production-to-production priming 172 proportion of noun use as subject 26, 129 prototype effects 39 prototypical subject 131 psychological adequacy 34

Puerto Rican Spanish 14, 15, 16, 26, 27, 122, 123, 132, 133, 137, 139, 140, 141, 144, 158, 168, 212 questionnaire-reading task 66, 70, 71, 74 random forest model 157 random forest models 83 reference of the NP 22 – human-reference NPs 27 regression modeling – 95% confidence interval 80 – factor 76 – factor group 76 – level 76 – model dredging 79 – model intercept 81 – model selection 79 – multicollinearity 80 – overdispersion 80 – overfitting 80 – predictor 76 – pseudo-R2 81 – random effects 77 – regressor 76 – sum contrasts 81 – variance inflation factors 80 reinforcement of the idea of plurality 22 reminders 93, 97 Salvadorian Spanish 24 social class 18, 20, 27, 63 – lower class 18, 20 – middle class 18 – upper class 20 social-interactional meaning 43 – first-order indexicality 43 – groups associated with presentational haber pluralization in Havana 169 – groups associated with presentational haber pluralization in Santo Domingo 169 – groups associated with presentational haber pluralization in San Juan 169 – social categories 42, 43, 44, 89 – social distribution specification 43 – stance 43 – subgroup membership 43 sociolinguistic interview 66, 67

stable variation 18, 19, 20, 25, 27 statistical preemption 40, 49, 50, 51, 54, 125, 133, 140, 141, 145, 156, 157, 159, 163, 176, 183, 203, 207, 211, 212 story-reading task 66, 68, 69, 70, 72, 74 stratification variables 62 structural priming 24, 41, 152, 156, 159 – comprehension-to-production priming 67, 124, 142, 144, 145, 146, 149, 153, 156, 157, 177, 178, 179, 197, 199, 201, 203 – production-to-production priming 124, 142, 144, 145, 146, 149, 152, 154, 157, 158, 170, 172, 174, 177, 178, 179, 197, 199, 201, 203, 204, 207 Structural priming 41 style 19, 55, 66, 166 – attention paid to speech 66 syntactic persistence. See Structural priming syntactic priming. See Structural priming tense 18, 20, 24, 26, 55, 141, 142, 159 – aspectual or modal auxiliaries 27, 137, 139, 142 – compound 20, 27 – conditional 71

– imperfect 15, 16, 18, 19, 24, 27, 70, 71, 132, 133, 141 – morphological future 70, 71 – periphrastic future 70 – present 15, 70, 71, 72, 114, 123, 126, 127, 128, 136, 137, 139, 140, 141, 149, 152, 153, 156, 172 – present perfect 18, 19, 70, 71 – preterit 15, 16, 18, 20, 27, 70, 71, 72, 126, 127, 128, 133, 136, 137, 139, 140, 141, 152, 156, 159, 172, 214 – subjunctive imperfect 71 – subjunctive present 70, 71 – synthetic 20, 25, 140, 152, 159, 212 – synthetic present- and preterit-tense 145, 174, 197, 199 The correspondence principle 36 The semantic coherence principle 36 there is/there are. See English: presentational there is/there are token frequency 54 typical action-chain position 124, 129, 130, 131, 146, 149, 152, 156, 172, 174, 176, 199, 203 usage-based linguistics 32, 33, 61, 77 Venezuelan Spanish 14, 15, 18, 27 word order 109, 110

Appendix A: Story-reading task

The reading task was adapted from a story found on the Internet (http://goo.gl/8W61jT). The original story was preserved in full, but its syntax was updated to reflect a less archaic variety of the language. I also inserted the selection contexts.

Juan Sin Miedo

En una pequeña aldea, había/habían (1) un anciano padre y sus dos hijos. El mayor era trabajador y llenaba de alegría el corazón de su padre, mientras el más joven sólo le daba disgustos. Un día el padre lo llamó y le dijo: — “Hijo mío, sabes que no hay/hain (2) muchas cosas que yo pueda dejarles a tu hermano y a ti, y sin embargo tú aún no aprendiste/aprendías (3) ningún oficio que te sirva para ganarte el pan. ¿Qué te gustaría aprender?” Y le contestó Juan: — “Bueno, hay/hain (4) varias cosas de que me gustaría saber cómo hacerlas. Muchas veces yo oigo relatos en que hay monstruos, fantasmas, fieras y al contrario de la gente, no siento miedo. Papá, yo quiero aprender a tener miedo.” El padre, enfadado, le gritó: — “Estoy hablando de tu futuro, y ¿tú, tú quieres aprender a tener miedo? Si es eso lo que quieres hacer, pues márchate a aprenderlo. Espero que en el camino haya/hayan (5) varias situaciones que te inspiran/inspiren (6) miedo.” Juan recogió sus cosas, se despidió de su hermano y de su padre, y emprendió su camino. Cerca de un molino encontró a un sacristán con quien se puso a hablar. El joven se presentó como Juan Sin Miedo. — “¿Juan Sin Miedo? ¡Extraño nombre!” – El sacristán se admiró/admiraba. (7) Juan dijo: — “Ya vas a ver, no hay/hain (8) peligros, ogros, fieras, bestias que me den miedo, porque nunca de mi vida yo he conocido el miedo. Partí de mi casa para conocer lo que es, pero hasta el momento en el camino no hay/hain (9) personas, no hay/hain (10) situaciones, no hay/hain (11) animales que me inspiren miedo. Sí que ayer, hubo/hubieron (12) dos lobos que querían devorarme, anteayer hubo/hubieron (13) unos ladrones que trataban de matarme y ha habido/han habido (14) dos veces que yo tenía que brincar un abismo de treinta pies de ancho y todo esto fue muy molesto, pero miedo como tal no tuve.” El sacristán dice: — “Quizá yo pueda ayudarte. Cuentan que más allá del valle, muy lejos, hay un castillo encantado por un mago.
El rey que allí gobierna prometió la mano de su linda hija a aquel que consigue/consiga (15) recuperar el castillo y el tesoro. Hasta ahora, todos los que lo intentaron huyeron asustados o murieron de miedo.” Juan se animó: — “Quizá, quizá allí haya/hayan (16) los peligros necesarios para yo sentir el miedo.” Juan decidió caminar, vio a lo lejos las torres más altas de un castillo en el que no había/habían (17) banderas. Se acercó y se dirigió a la residencia del rey. Dos guardias reales cuidaban la puerta principal. Juan se acercó y decía/dijo (18):

— “Soy Juan Sin Miedo, y deseo ver a su Rey. Quizá él me permita entrar en su castillo y sentir a lo que llaman miedo”. El más fuerte lo acompañó al Salón del Trono. El monarca expuso/exponía (19) las condiciones que ya habían escuchado otros candidatos. Dijo: — “Si tú consigues pasar tres noches seguidas en el castillo, derrotar a los espíritus y devolverme mi tesoro, habrá/habrán (20) dos semanas de fiestas en tu honor, te concedo la mano de mi amada y bella hija, y la mitad de mi reino como dote”. Juan replicó: — “Se lo agradezco, Su Majestad, pero yo sólo vine para saber lo que es el miedo.” “Qué hombre tan valiente, qué honesto”, pensó el rey, “pero ya guardo pocas esperanzas de recuperar mis dominios, ya ha habido/han habido (21) tantos que lo han intentado.” Juan sin Miedo se fue al castillo y escogió uno de los 200 cuartos que había/habían (22) ahí. Colgó sus hachas de la pared, pensando “nunca se sabe, y así siempre voy a tenerlas cerca” y se acostó. A medianoche, lo despertó un alarido muy alto. — “¡Uhhhhhhhhh! Un espectro se deslizaba sobre el suelo sin tocarlo.” — “¿Quién eres tú, que te atreves/atrevas (23) a despertarme?” Preguntó Juan. Un nuevo alarido por respuesta, y Juan Sin Miedo le tapó la boca con una bandeja que adornaba la mesa. El espectro se quedó mudo y se desapareció en el aire. A la mañana siguiente el rey visitó a Juan Sin Miedo y pensó: "Es sólo una pequeña batalla. Aún quedan dos noches". Pasó el día y se fue el sol. Como la noche anterior, Juan Sin Miedo se acostó, pero esta vez apareció un fantasma espantoso que lanzó/lanzaba (24) un bramido: ¡Uhhhhhhhhhh! Juan Sin Miedo cogió una de sus hachas y cortó la cadena que el fantasma arrastraba. Al no estar sujeto, el fantasma se elevó y desapareció. Al amanecer, el rey volvió a visitarlo y pensó: “Nada de esto habrá servido si él no repite la hazaña una vez más.” Llegó el tercer atardecer, y después, la noche.
Juan Sin Miedo ya dormía/durmió (25) cuando escuchó acercarse a una momia. Y preguntó: —“Dime qué motivo tienes para interrumpir mi sueño.” Ya que no contestó, Juan agarró un extremo de la venda y tiró. Retiró todas las vendas y encontró a un mago, quien dijo: — “No hay/hain (26) trucos de magia que valgan contra ti. Déjame libre y yo rompo el encantamiento”. Al amanecer, había/habían (27) muchas gentes en las puertas del castillo, y cuando apareció Juan Sin Miedo el rey dijo: "¡Voy a cumplir mi promesa y más! ¡No va a haber/van a haber (28) dos sino cuatro semanas de fiesta!" Pero acá no acabó la historia: Cierto día en que el ahora príncipe dormía, la princesa decidió sorprenderle regalándole una pecera. Pero tropezó/tropezaba (29) al inclinarse, y el contenido, agua y peces cayeron sobre la cama que ocupaba Juan. —“ ¡Ahhhhhh!” exclamó Juan al sentir los peces en su cara - ¡Qué miedo! La princesa rompió a reír, ya que no había/habían (30) peligros, espectros o espantos que asustaban/asustaran (31) a Juan, pero él sí les cogió miedo a unos simples peces de colores. Le dijo, riendo todavía: — “No tengas miedo, te voy a guardar el secreto.” Y así fue, y todavía se le conoce como Juan Sin Miedo.

Appendix B: Questionnaire-reading task

María engañó a su novio. Una amiga común hace de intermediaria. Después de haber hablado con el novio, Juan, dice: Lo siento María, pero Juan dice ______ no quiere verte nunca jamás. a) de que b) que Un periodista entrevista a un pintor que acaba de presentar una serie de cuadros preciosos que son completamente diferentes de los que solía vender antes. Además, resulta que algunos ya los hizo hace veinte años. Pregunta el periodista: ¿Por qué usted esperó tanto antes de presentarnos estas obras? Contesta el artista: Porque pensaba, y todavía pienso, que en aquel momento no ________las críticas tan positivas que estas obras están recibiendo ahora. a) pudo haber b) pudieron haber Un abuelo está contándoles a sus nietos de su niñez. Uno de ellos, ansioso de saber de estos tiempos pasados, pregunta: ¿Papi, cuando usted era niño, ¿acá ya ________ (1) carros? Contesta el abuelo: ¡Claro que los________ (2), no soy tan viejo! (1) a) había b) habían (2) a) había b) habían A Inés le acaban de robar el carro, que tenía aparcado en algún callejón obscuro. Aunque no es la cosa más sensata que se pueda hacer, una amiga trata de consolarla diciendo: No es culpa tuya, es que siempre______ unas personas malas. a) habrá b) habrán Desde pequeño, Francisco ha soñado con mudarse a Madrid. Ahora, su empresa le anunció que, cuando él quiera, lo pueden transferir a la sucursal de esta ciudad. Le dice a su madre: Dentro de dos años, ________a Madrid, ya compré una casa allí. a) voy a mudarme b) me mudo c) me mudaré Dos personas están en un evento en que se presentan los carros del año. Una de ellas pretende que son exactamente los mismos que el año pasado. La otra persona dice: No, no, estás equivocado. Por ejemplo, el año pasado no________los carros amarillos que vimos antes. a) hubo b) hubieron Ana tiene problemas amorosos bastante serios. Después de semanas de sentirse muy mal, le cuenta todo a su prima.
Yo me esforcé muchísimo e intenté ser perfecta para él pero no logré nada. Él estaba como confuso y aunque ________(1) veces que me trataba bien, también ________(2) muchas veces que estaba distante y antipático conmigo. (1) a) hubiera b) hubieran (2) a) había b) habían

Un domingo, dos muchachos se encuentran. Pregunta el primero: ¿Tú y la familia van a la playa esta tarde? Contesta el otro: ________Mi viejo está enfermo. a) No, hoy no vamos, no. b) No, hoy no vamos. Marlén está leyéndole a su hija una historia de horror sobre gallinas posesas que se comen niños. Por supuesto, la hija le coge miedo y dice: “Mami, tengo miedo”. Contesta Marlén: No te preocupes, acá nunca ________las gallinas de que habla el libro. a) ha habido b) han habido Unos amigos invitaron a Marilyn y Julio a cenar. Julio es una de estas personas que nunca quiere llegar con las manos vacías. Marilyn le había prometido cocinar un bizcocho para regalárselo a los amigos. Cuando están por irse, Julio le pregunta a Marilyn: ¿Y en dónde es que está el bizcocho? Contesta Marilyn: Bueno, no es que me olvidara de hacerlo, pero ya que estaba lloviendo tanto, no quise salir y en casa no________los huevos necesarios para cocinarlo. a) había b) habían Ana y María están planeando una excursión. Por fin se pusieron de acuerdo sobre la destinación. Ana, a quien le gusta visitar museos, exposiciones, etc., dice: Pues, déjame buscar toda la información y ________¿Vale? a) te llamo p’atrás b) vuelvo a llamarte Juan, un español, trata de convencerle a Tony, un amigo puertorriqueño, de que en España todo es mejor, lo cual este último no puede creer. Juan acaba de mencionar una pila de problemas que existen en Puerto Rico. Replica Tony: Tienes razón, pero en España________los mismos tipos de problemas. a) habrá b) habrán Dos personas están hablando de que ha aumentado el nivel de la pelota. Una tercera, más crítica, dice: La gente que dice que el nivel era más bajo en el pasado, no se acuerdan de los buenos juegos que veían cuando eran jóvenes. Tal vez no ________(1) los talentos que ________(2) hoy en día, pero también ________(3) muchos peloteros muy buenos. 
(1) a) hubiera b) hubieran (2) a) hay b) hayn (3) a) había b) habían Después de un caso severo de contaminación, comenta un experto ante las cámaras de la prensa: La semana pasada, en esta presa ______miles de peces, veinte patos y tres garzas. Ahora está lleno de basura y los animales se han ido, o, peor aún, están muertos. a) hubo b) hubieron El carro de Fernanda viene fallando desde hace tiempo. No son averías gordas, pero el carro tiene ya más de quince años y la joven no está segura de que las reparaciones, que pueden salir caras, valgan la pena. Le pregunta a su hermano: Yo no sé cuál es la mejor opción: cambiarlo o hacer que me lo arreglen. a) ¿Qué tú harías? b) ¿Qué harías tú? c) ¿Qué harías?

Tony está viendo las noticias. Después de acabado el programa, le pregunta a la novia: Oye ¿tú sabías que cada fin de semana________veinte accidentes fatales en nuestra ciudad? a) hay b) hayn Después de que los vecinos volvieran de una visita al zoológico, Marlén les pregunta: ¿________(1) nuevos animales en el zoológico? La familia, entusiasta, contesta: ¡Sí, sí los ________(2)! Vimos dos nuevos grupos de monos araña y un tigre que acababa de llegar de la India. (1) a) hubo b) hubieron (2) a) hubo b) hubieron Juan está contándole a su madre que a la hermana, María, le explotó una goma en la carretera. Pregunta la madre: ¿Qué ella hizo entonces? Contesta Juan: Llamó al esposo________a cambiarla. a) para que él viniera b) para él venir c) para que viniera Dos personas están hablando de literatura. El primero tiene la impresión de que este año no salieron sino buenas novelas, lo que también es la opinión de la crítica literaria. El otro no está de acuerdo y dice: La gente que dice que este año no salieron sino buenos libros, no saben de qué hablan, porque siempre________ (1) libros malos y libros buenos y siempre los________ (2). (1) a) ha habido b) han habido (2) a) habrá b) habrán Juanito está llenando un crucigrama con la ayuda de su mamá. Después de un rato, la madre le dice al muchacho: Creo que ya ________(1) los suficientes indicios como para tú poder terminar el rompecabezas sin mi ayuda. Contesta Juanito: ¡No mami, no los ________(2) todavía! (1) a) hay b) hayn (2) a) hay b) hayn Armando está hablando con su hijo, Juan, que nació a mediados de los años 80, sobre el día de su nacimiento. Dice: Recuerdo que, para entonces, ________las primeras víctimas de SIDA y estábamos como un poco preocupados, porque a tu mamá le tuvieron que poner sangre después del parto. a) empezó a haber b) empezaron a haber María está hablando con su jefe, Julio, que acaba de pedirle que haga un trabajo importante el día siguiente.
Sin embargo, María ya está metida en un proyecto que le toma mucho de su tiempo. Por ello, le contesta: Mañana________este trabajo, pero puede ser que no me dé tiempo. a) voy a hacer b) hago c) haré Dos niños fueron al parque con la abuela. Cuando vuelven a su casa, la mamá les pregunta: ¿Vieron palomas en el parque? Uno de los hermanitos contesta: Sí, sí ________como once. a) debía haber b) debían haber

La semana pasada, Ana estuvo muy enferma. Ahora ya se siente mejor y le cuenta a una amiga: Ahora ya me siento un poquito mejor, pero la semana pasada, ________veces que yo tenía tanta fiebre que, estando en la propia cama, yo no sabía dónde carajo estaba. a) había b) habían Ana está en una guagua. Delante de ella, un hombre se levanta a bajarse. De pronto, Ana ve que hay un celular en el asiento. Se levanta y dice: Señor, con permiso, ________ a) ¿este celular es suyo? b) ¿este es su celular? Fernanda está viendo las noticias. De golpe, le grita al esposo, que es chileno: ¡Cariño, ven a ver esto, ________tres terremotos en Chile! a) acaba de haber b) acaban de haber Ana y Marilyn quieren organizar una cena. Pregunta Ana: ¿Quien más podríamos invitar? De golpe Marilyn se acuerda de dos chicas, María y María, a quienes conocieron una semana antes y dice: Pues, ________esas dos Marías que conocimos el otro día. a) hay b) hayn Dos amigos están hablando del vegetarianismo. Uno de ellos dice que es de todos los tiempos. El otro replica: Yo pienso que siempre________(1) personas que respetan a los animales, pero no creo que siempre ________(2) vegetarianos. (1) a) ha habido b) han habido (2) a) haya habido b) hayan habido En la empresa donde trabaja Marilyn, hay un compañero nuevo. Los demás le están preguntando por sus experiencias laborales previas. Dice el nuevo: ________ a esta empresa, estaba trabajando en Alemania. a) Antes de yo venir b) Antes de que yo viniera Dos hombres están hablando de literatura británica. Uno de ellos dice que él prefiere leer los libros en inglés. El otro, por el contrario, suele esperar hasta que se traduzcan al español. Dice el primero: Pero entonces tú tienes que esperar muchísimo, ya que recuerdo haber leído que, por ejemplo, en Inglaterra sólo________ los traductores necesarios para traducir un décimo de las novelas que se habían publicado en el 2008. 
a) había b) habían Alicia está describiéndole a su esposo cuánto ha cambiado su casa paterna desde su niñez. Dice: No pienso que en aquel entonces ya ________las butacas que están en la sala, el armario que está en la habitación de mis papás y los cuadros que están en la pared del pasillo. a) hubiera b) hubieran Un maestro está hablándole a uno de sus amigos, Armando, de la importancia que, según él, tiene la educación. Dice: Armando, los estudios son tan precisos como la comida, por ejemplo, si no se estudiara, no ________ los conocimientos de la anatomía humana que te salvaron la vida el año pasado. a) habría b) habrían

Después de algún proyecto para mejorar la calidad del agua de las presas del país, un científico comenta: Hace diez años, no________ más de tres sapos en esta presa. Hoy en día, cuenta con veinte patos, tres garzas y miles de peces. a) hubo b) hubieron Los papás de Alicia organizan una fiesta, a la que van a asistir muchos amigos, de modo que necesitan de la ayuda de la joven. Dice el papá: Lo siento Alicia, pero realmente te necesitamos aquí. Quiero que________ con la fiesta. a) tú nos ayudes b) nos ayudes Armando, a quien le gusta mirar las estrellas, suele levantarse los domingos a las cuatro de la mañana para disfrutar de la vista que tiene en el balcón de su apartamento. Por la tarde, en la playa, le dice a su hermano: Sobre las cuatro, ya ________carros en la calle. Qué raro, ¿verdad? a) empezó a haber b) empezaron a haber Después de que los vecinos volvieran de una visita al zoológico, Marlén les pregunta: ¿Qué animales vieron en el zoológico? Y ellos contestan, un poco desilusionados: ¡Muy pocos!, ni siquiera________los usuales grupos de leones, tigres y monos. a) había b) habían Iraida no encuentra las revistas que acaba de comprar. Le pregunta a Juan, su hermano, si él sabe dónde están. Contesta: Ah sí, las dejé en la sala para que ________. a) se las lea b) se lean Ana tiene problemas amorosos bastante serios. Después de semanas de sentirse muy mal, le cuenta todo a su prima. ¡Ay muchacha! Tienes que ayudarme. Estoy enamorada de mi mejor amigo o, mejor dicho, del que era mi mejor amigo, porque ________otros sentimientos, que están destruyendo la amistad. a) empieza a haber b) empiezan a haber Al inicio del año escolar, una madre le pregunta al maestro: Maestro, con permiso ¿ ________ treinta alumnos en la clase de mi hijo, como el año pasado? a) seguirá habiendo b) seguirán habiendo Marilyn, que es una vegetariana convencida. Dice: ________me tendrán que obligar a la fuerza. 
a) Para yo comer carne, b) Para que yo coma carne, Juan, a quien invitaron a una fiesta a la que no pudo asistir, le pregunta al amigo que la organizó: ¿Qué tal la fiesta que organizaste? Éste contesta: ¡Qué mal estaba! sólo sobre la una de la mañana ________más de dos invitados. a) empezó a haber b) empezaron a haber

Una familia está rumbo al zoológico. En el carro, dice la madre: ¡Ojalá ________(1)esos leones que vimos la vez pasada! Pregunta el esposo: ¿Por qué no los ________(2)? (1) a) haya b) hayan (2) a) habría b) habrían Dos amigos están hablando de excursiones. Uno de ellos dice: Nunca ________en La Parguera, pero me gustaría ir este verano. a) estuve b) he estado Dos niñas fueron al parque con la abuela. Cuando vuelven a su casa, la mamá les pregunta: ¿Vieron ardillas en el parque? Una de las hermanas contesta: Sí, ______ como nueve. a) habría b) habrían Armando, a quien le gusta mirar las estrellas, suele levantarse los domingos a las cuatro de la mañana para disfrutar de la vista que tiene en el balcón de su apartamento. Durante el desayuno, su esposa se queja de que no pudiera dormir por el ruido de la calle. Replica Armando: Qué raro, esta mañana no________más carros que otros domingos. a) había b) habían