Spatializing Social Media: Social Networks Online and Offline 036737420X, 9780367374204

English Pages 200 [201] Year 2021

Table of contents :
Cover
Half Title
Title
Copyright
Contents
Funding
Acknowledgments
Foreword
Introduction
Section I Local and digital: the dyadic interaction of social and virtual
1 Place and space
2 Face-to-face and online communities
3 From global village to identity tribes
Section II Social and spatial networks: the dyadic interaction of virtual and spatial
4 Network spillover
5 Social networks
6 Spatial analysis
Section III Social networks online and offline: the dyadic interaction of social and spatial
7 Spatial and social media data
8 Online-offline coordination
9 The directionality of homophily
Section IV Mapping online to offline social networks: bridging geography and geodesy
10 Network layouts by geodesy and geography
11 Methods in spatial statistics for social networks
12 An R package for spatializing social media
Conclusion
Index of names
Index of subjects

Author / Uploaded
Marco Bastos

SPATIALIZING SOCIAL MEDIA

Spatializing Social Media charts the theoretical and methodological challenges in analyzing and visualizing social media data mapped to geographic areas. It introduces the reader to concepts, theories, and methods that sit at the crossroads between spatial and social network analysis to unpack the conceptual differences between online and face-to-face social networks and the nonlinear effects triggered by social activity that overlaps online and offline. The book is divided into four sections, with the first accounting for the differences between space (the geometrical arrangements that structure and enable forms of interaction) and place (the mechanisms through which social meanings are attached to physical locations). The second section covers the rationale of social network analysis and the ontological differences, stating that relationships, more than individual and independent attributes, are key to understanding social behavior. The third section covers a range of case studies that successfully mapped social media activity to geographically situated areas and considers the inflection of homophilous dependencies across online and offline social networks. The fourth and last section of the book explores a range of networks and discusses methods for and approaches to plotting a social network graph onto a map, including the purpose-built R package Spatial Social Media. The book takes a non-mathematical approach to social networks and spatial statistics suitable for postgraduate students in sociology, psychology and the social sciences.

Marco Bastos is the University College Dublin Ad Astra Fellow at the School of Information and Communication Studies and Senior Lecturer at City, University of London. His research leverages computational methods and network science to explore the intersection of communication and critical data studies.

“Comprehensive and insightful, Spatializing Social Media provides a fascinating, in-depth analysis of the relationship between the social, spatial, and virtual, and the confluence and divergence of social networks on- and offline. Essential reading for anyone interested in understanding the societal and spatial operations and impacts of social media.”
Rob Kitchin, Professor of Geography at Maynooth University, Ireland

“Spatializing Social Media provides an original and highly promising framework for the analysis of social media by focusing on the spatial patterns of digital interactions. This place-based perspective is bound to deepen our understanding of the role and effects of digital media in social life. By providing his readers with instructive case studies and a ready-made analytical tool kit, Bastos has made his book into the natural origin point of a new wave of studies with and on social media.”
Andreas Jungherr, Professor for Digital Transformation and Publics at University of Jena, Germany

“By focusing on theoretically grounded suggestions about what we as researchers can actually do once we get our hands on data, the perspectives contained in this book should remain valid for a long stretch of time. Spatializing Social Media appears as more than a methods book, much like it reads as more than a theoretical account.”
Anders Olof Larsson, Professor of Media and Communication Studies at Kristiania University College, Norway

“The idea that network analysis and geographic relationships are close cousins has long been understood, but the details of laying bare these relationships are too often addressed with hand-waving. Hidden beneath the title of Bastos’s new book is a thorough exploration of the ways in which networked approaches can be employed to provide a much better understanding of flows in online and offline spaces. The broad theoretical and practical approaches will provide the casual reader a sophisticated introduction to networked spaces, while leaving much to engage more experienced scholars as well.”
Alexander Halavais, Associate Professor of Critical Data Studies at Arizona State University, United States of America

SPATIALIZING SOCIAL MEDIA Social Networks Online and Offline

Marco Bastos

First published 2022
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
605 Third Avenue, New York, NY 10158

Routledge is an imprint of the Taylor & Francis Group, an informa business

© 2022 Marco Bastos

The right of Marco Bastos to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
Names: Bastos, Marco, author.
Title: Spatializing social media / Marco Bastos.
Description: Abingdon, Oxon ; New York, NY : Routledge, 2021. | Includes bibliographical references and index.
Identifiers: LCCN 2021007473 (print) | LCCN 2021007474 (ebook) | ISBN 9780367374211 (hbk) | ISBN 9780367374204 (pbk) | ISBN 9780429354328 (ebk)
Subjects: LCSH: Online social networks. | Social media.
Classification: LCC HM742 .B369 2021 (print) | LCC HM742 (ebook) | DDC 302.30285—dc23
LC record available at https://lccn.loc.gov/2021007473
LC ebook record available at https://lccn.loc.gov/2021007474

ISBN: 978-0-367-37421-1 (hbk)
ISBN: 978-0-367-37420-4 (pbk)
ISBN: 978-0-429-35432-8 (ebk)

Typeset in Bembo by Apex CoVantage, LLC

Access the Support Material: www.routledge.com/9780367374204

FUNDING

The research underpinning this book was supported by Twitter, Inc. research grant 50069SS ‘The Brexit Value Space and the Geography of Online Echo Chambers’ and the University College Dublin Research Project 64927 ‘Ad Astra Fellowship.’

ACKNOWLEDGMENTS

To my daughter Sofia, who remains unimpressed with all things digital and who in her own way assisted and retarded this project in an idiosyncratic and invigorating way.

FOREWORD

The seemingly ever-changing nature of data-driven internet research can be seen as primarily dependent on two aspects: possibilities for data access and possibilities for data analysis. With regard to the first of the two, the ways in which researchers have been able to procure data from social media platforms such as Facebook, Twitter, and Instagram have for a long time been in a more or less constant flux, with the events surrounding the Cambridge Analytica revelations sometimes pointed to as a sort of watershed moment. Indeed, while some interesting initiatives have been taken to further cooperation between researchers and platforms (Puschmann, 2019), or directly between researchers and their potential informants (Halavais, 2019), critical voices have also been raised, drawing attention to the repercussions that cooperation between scholars and commercial interests—such as social media platforms—might lead to (Bruns, 2019). Conversely, as opportunities for data gathering have diminished or have become more exclusive, the ways in which data analysis can be carried out have multiplied at a rapid pace. While it appears safe to say that the bulk of comparably early empirical research into the uses of digital media primarily made use of different varieties of standard—one might even call them legacy—software applications common in social scientific analysis (e.g., Lewis et al., 2013), a series of interesting, highly useful, and (last but not least) freely available software applications were soon created and subsequently utilized by interested researchers (e.g., Bastian et al., 2009; Rieder et al., 2015; Rogers, 2019). Even more recent tendencies suggest a gelling of many of the functionalities made available by the aforementioned and other similar applications into what is often referred to as packages or libraries for the R and Python programming languages. 
To be frank, the skills needed to operate such languages are by no means commonplace in the current media and communication research landscape (see
Theocharis & Jungherr, 2020). Nonetheless, several signs point to the increasing importance of these skills and perspectives (such as the founding of a Computational Methods division within the International Communication Association, developments of different varieties of ‘data science’ study programs at a number of research-intensive universities, and a plethora of conference panels and special issues in academic journals drawing on these themes) as well as the launch of a journal dedicated to the development of “relevant paradigms that guide future research” (van Atteveldt et al., 2019: 2).

Spatializing Social Media makes a similar contribution to the one suggested above. Drawing on what is perceived as a dearth of scholarly efforts that take the spatiality of digital media data into account, Marco Bastos draws on and consequently furthers a set of techniques and principles previously used in other scholarly disciplines, providing internet researchers in general and perhaps those of the computational persuasion in particular with food for thought on what is defined as a set of techniques applied to the spatial expression of human behavior to accurately describe geographical patterns. Arguing that surprisingly little attention has been given to the overlap between online and offline social networks, Bastos adopts an interdisciplinary approach to tease out a series of insights otherwise siloed within disciplinary boundaries.

The first two sections of the book feature a rich theoretical backdrop, including, but not limited to, a thorough discussion of the basic tenets of social network analysis, an approach that did indeed grow in popularity among interested scholars from about 2010 onwards. The subsequent third section of the book moves on to discuss the theoretical aspects in practice.
This is done by providing insights into a series of case studies where the theories and methods previously dealt with are put into use, most interestingly, perhaps, in relation to the author’s own work on the United Kingdom ‘Brexit’ campaign and how events related to it played out on and offline. The fourth and final section reviews a series of packages (applications of sorts) for the R programming language that allow the interested reader to try his or her hand at spatial analysis. Bastos also proves his worth as an R programmer by providing the potential analyst with a purpose-built package for this programming language. Such a pairing of book and package appears novel, at least to this reader. Somewhat reminiscent of research methods books where readymade datasets would be provided to help in flattening the learning curve, Spatializing Social Media nevertheless appears as more than a methods book, much like it reads as more than a theoretical account.

Given the previously mentioned constant fluctuation and coming and going of various approaches to data collection, the author has been wise to largely avoid such issues. While these issues are certainly important, a book like the one at hand would risk becoming obsolete sooner rather than later had it centered on them, given these rapid changes. By focusing on theoretically grounded suggestions about what we as researchers actually can do once we (hopefully) can get our hands on data, the perspectives contained
in Spatializing Social Media should remain valid for a longer stretch of time than those championed in your typical digital methods book (if such a thing does indeed exist).

Anders Olof Larsson
Oslo, 22 January 2021

References

Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An Open Source Software for Exploring and Manipulating Networks. Paper presented at the 3rd International Conference on Weblogs and Social Media (ICWSM), San Jose, CA.
Bruns, A. (2019). After the ‘APIcalypse’: Social media platforms and their fight against critical scholarly research. Information, Communication & Society, 22(11), 1544–1566.
Halavais, A. (2019). Overcoming terms of service: A proposal for ethical distributed research. Information, Communication & Society, 22(11), 1567–1581.
Lewis, S. C., Zamith, R., & Hermida, A. (2013). Content analysis in an era of big data: A hybrid approach to computational and manual methods. Journal of Broadcasting & Electronic Media, 57(1), 34–52.
Puschmann, C. (2019). An end to the wild west of social media research: A response to Axel Bruns. Information, Communication & Society, 22(11), 1582–1589.
Rieder, B., Abdulla, R., Poell, T., Woltering, R., & Zack, L. (2015). Data critique and analytical opportunities for very large Facebook Pages: Lessons learned from exploring “We are all Khaled Said.” Big Data & Society, 2(2), 2053951715614980. doi: 10.1177/2053951715614980
Rogers, R. (2019). Doing Digital Methods. London: Sage.
Theocharis, Y., & Jungherr, A. (2020). Computational social science and the study of political communication. Political Communication, 1–22.
van Atteveldt, W., Margolin, D., Shen, C., Trilling, D., & Weber, R. (2019). A roadmap for computational communication research. Computational Communication Research, 1(1), 1–11.

INTRODUCTION

Spatializing Social Media unpacks puzzling social problems that emerge as online social networks scale up and beyond the geographic constraints of real-world communities. It also offers an introduction to the geographic dependencies of social media and the techniques available to study them. Situated at the intersection of computational communication science with social network and spatial analysis, the book tackles two related complex problems. First, it maps the asymmetric social divides that arise as face-to-face communication and social networks disengage from online communities. Second, it introduces the challenges in analyzing and visualizing dyadic relationships anchored in geographic space, chief of which is the intermingling of spatial and nonspatial data. These two developments are central to understanding the spatial representation (or lack thereof) of digital trace data generated in social media platforms, and form the three tenets of this book: spatial relations, social networks, and digital trace data.

The complex web of social interaction that may emerge online intersects with the geographic constraints pertaining to one’s offline, face-to-face social networks. We describe these dependencies through two central threads: (1) space and geography, and plotting social networks on a map; and (2) online/virtual versus offline/face-to-face social networks. The methodological challenges, on the one hand, are largely associated with the first thread, particularly the process of manipulating relational digital trace data to plot them on a geographic grid. The theoretical challenges, on the other hand, are largely associated with the second thread: at the junction of relationships that are spatially embedded through offline contacts and online trace data. This relationship is more convoluted than suggested by the dichotomy online-versus-offline network formations.
We address this theoretical conundrum through three distinct dimensions: (1) social (online and offline), (2) spatial (offline), (3) virtual (online).
These dimensions foreground three distinct dyadic interactions: social ↔ spatial; social ↔ virtual; virtual ↔ spatial. The three emerging axes or dyadic relationships should account for the three-way relationship between social, spatial, and virtual. The dyad social/spatial is the object of spatial social network analysis (Johnston & Pattie, 2011); the dyad social/virtual is the substantive question driving internet research (Hunsinger et al., 2010); and finally, the dyad virtual/spatial is the lesser-explored dimension of the online-offline continuum, likely due to the relative absence of data about users’ location (Scellato et al., 2011). These dimensions constitute the temporality and spatiality of everyday network life by reinforcing, producing, and oftentimes contradicting each other. In other words, the social network milieu develops in subjective zones and wavelengths that rely asymmetrically on virtual, spatial, and social dependencies. This is to say that there are spatial arrangements driving neighborhood relationships (Tonkiss, 2005) as much as there are network effects driving social platforms’ growth (Zhang & Zhu, 2011). Similarly, relationships activated virtually reverberate spatially and we refer to this cross-pollination as a dyadic relationship (Kitchin & Dodge, 2011). Conversely, not all networks are social and not all social networks exist in space. Academic citation networks, for example, are the prototypical sociocultural network resulting from the professional interaction among individuals who may or may not know each other. Researchers may routinely cite scholars who have long died, and they themselves may be cited along with living acquaintances, friends, and contemporaries who they may have never met, and that additionally may be dead. This is in contrast with networks that are social in nature, such as those of sexual partners or needle-sharing drug users (White, 2011), whose social interaction necessarily takes place in space and time. 
Social media platforms, finally, are social in nature, but not necessarily circumscribed by spatial or temporal constraints.
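
As a minimal illustration, and not part of the author's method, the three dimensions named above can be paired off programmatically to confirm that they yield exactly three dyads. The names `DIMENSIONS`, `FIELD_BY_DYAD`, and `dyads` are hypothetical, and the field labels paraphrase the attributions given in the text.

```python
from itertools import combinations

# The three dimensions named in the text.
DIMENSIONS = ("social", "spatial", "virtual")

# Research area the text associates with each dyad.
# Keys are unordered pairs, so ("social", "spatial") equals ("spatial", "social").
FIELD_BY_DYAD = {
    frozenset({"social", "spatial"}): "spatial social network analysis",
    frozenset({"social", "virtual"}): "internet research",
    frozenset({"virtual", "spatial"}): "online-offline continuum (lesser explored)",
}


def dyads(dimensions):
    """Return every unordered pair that can be drawn from the dimensions."""
    return [frozenset(pair) for pair in combinations(dimensions, 2)]


# Print each dyad next to the field the text links it to.
for dyad in dyads(DIMENSIONS):
    print(" <-> ".join(sorted(dyad)), "->", FIELD_BY_DYAD[dyad])
```

Storing each dyad as a `frozenset` reflects the bidirectional arrows in the text: the pairing has no inherent direction, so either ordering refers to the same relationship.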

Framework

The dichotomy between face-to-face and online social networks reads much like a terminological paradox, as social networks are constructs rather than realities, constructs that nonetheless drive much social anxiety about their influence on the fabric of society. Similar to historical concerns over industrialization and urbanization, the collapsing of online and offline social networks can emancipate individuals from the immediate communities where they live and allow them to associate with different others (Wang et al., 2018). Conversely, the affordances of social platforms, particularly reposting, retweeting, liking, and favoriting, can further segregate groups and cliques by coordinating offline meetups, which selectively separate individuals and reinforce homophilous referrals, outgroup derogation, and neighborhood segregation observed offline (Boy & Uitermark, 2016). Additionally, the intertwining of online and offline social networks may conceivably reinforce outgroup stereotyping and mechanisms of fragmentation that expand or intensify extant structures of inequality observed on offline social
networks (Bastos, Mercea, et al., 2018). This process is rarely more salient than in the distribution of propaganda and disinformation to drive wedges into already existing social divisions (Bastos & Farkas, 2019; Benkler et al., 2018). Successful mis-, dis-, or mal-information campaigns aim at sowing confusion and social division by relaying content via anonymous users that may, in fact, be automatic posting protocols (bots) with no affiliation to offline social networks beyond the political and commercial interests behind such campaigns (Bastos & Mercea, 2019; Starbird, 2019). We can also probe this problem by unpacking what ‘offline’ means in this context. This includes not only offline social networks that structure one’s sense of family, community, and society but also the physical networks that impinge on one’s network by facilitating or hindering the places and spaces for work, leisure, and personal relationships. As such, it includes both locations identified in geospatial data as socially meaningful places as well as face-to-face contacts with family and friends. In other words, social networks online and offline refer to the social processes situated at the interplay of digital communications and people’s cognitive maps of the social world. These cognitive maps are often anchored to geospatial sites that interact with the social processes triggered and sustained by social networks, but they may also take place in virtual sites capable of constructing social realities much like face-to-face social networks. It is the cognitive bonding provided by social media affordances, virtually detached from spatial constraints, that may trigger nonconformities to the malleable fabric of reality. Online bonding compounds the effects of mediatization, whereby the media takes over the role of social institutions like family, school, and church in providing information, tradition, and moral orientation for members of society (Hjarvard, 2008). 
These developments may well support social cohesion, with ample evidence that social media support community formation across cultural and geographic locations (Bastos & Mercea, 2016; Myrick et al., 2016). But they also allow social media users to remotely weave the arbitrary ties that link them to society. A pertinent if mundane manifestation of this development is subreddits like ‘Am I the Asshole?’ or ‘Ask Me Anything,’ where strangers in geographically disparate locations come together to agree on rules and conventions governing social nuance and etiquette (Donald, 1981).

The framework outlined in this book deviates from traditional sociological social network analysis, whose central tenets are based on relational data to explain behavior through the analysis of relationships (structure) rather than individuals and attributes (agency). The approach developed in this book is therefore etiologically in line with the work of Tom Snijders and Christian Steglich, who developed and used SIENA software (now RSIENA) to identify the interaction and co-evolution of social influence (network/peer group) effects and individual (agency) effects. Their modeling requires longitudinal data, which is relatively abundant in digital trace data, to perform analyses that are similar to what we propose with geospatial metadata (Snijders et al., 2010, 2017; Steglich et al., 2006, 2010). This is, however, only one of many disciplinary departures explored in this book.
Indeed, the availability of digital trace data has opened new lines of inquiry on social media and spatial statistics. These developments nonetheless remained largely siloed within disciplinary boundaries: social media research explored the network topology of social media interactions, whereas geographic information system (GIS) research focused on gathering, manipulating, and analyzing spatially rich data. As a result, limited attention has been paid to the overlap between online and offline social networks. The elapsed effects of online activity on social groups offline speak to the long-lasting sociological question about the directionality of homophily; yet, we have only started to explore the multitude of personal information, server logs, and users’ online information to understand the complex and ever-evolving negotiation between online and real-world personas in social networks. While approaches to modeling social networks have grown in number and sophistication in recent years (Albert & Barabási, 2002; Newman, 2003), these models are not suitably informed by sociological and psychological insights on how social networks emerge (Robins & Pattison, 2005; Scott, 2011). As such, it is perhaps unsurprising that the wealth of social networks rendered from digital trace data was modeled to test the ever-growing models of diffusion and network formation, largely independent from geographic dependencies. While this line of investigation has undoubtedly advanced the field, the directionality of homophilic relationships from online to offline interactions has rarely featured in network models. More specifically, and reinforcing such limitations, constructs within the exponential random graph model family lack estimates that would allow for measuring inbreeding homophily and endogenous clustering effects (Wong et al., 2006).

Background

Seminal investigations into online-offline interaction stem from internet studies of the 1990s, largely characterized by dichotomous terms of online, virtual communication versus embodied, offline worlds (Baym, 1999); a fascination with the geography of internet infrastructure (Dodge & Kitchin, 2001); and concerns with information technology’s impact on local and nationwide lifestyles (Curry, 2002). This body of scholarship was distinctively interdisciplinary and incorporated ethnographic methods as well as practices that would later be identified with data science. This emerging area of inquiry also explored theoretical perspectives that could provide context to the increasingly blurred line between the virtual (i.e., online) and real (i.e., offline) worlds, thereby accounting for different understandings of how online and offline realms would intersect and embody each other. This body of scholarship offered nuanced but largely aspatial accounts of the online-offline relationship, as if users could operate independent of space (Kitchin & Dodge, 2011).

During the early noughties, the separation between online and offline dimensions transitioned to the assumption that online and offline personas were merging in social networks, an assumption that pervaded much of internet studies and social media research analyzing digital trace data as an implicit extension, often as a significant representation, of real-world politics and social issues. The collapsing of
online and offline identities emerged with an exalting rhetoric praising the democratization of public discourse brought by the advent of social media platforms, particularly Twitter and Facebook, which fueled civil unrest in Western democracies and the MENA region (Castells, 2012). In this body of scholarship, space is no longer a mere container where political activity unfolds, highlighting instead the subtly evolving layers of context shaping social interaction. Yet, this line of inquiry offered limited insight on the vital role played by multiple identities, including automation and anonymity, which remained central to social networks like Reddit and 4chan. Tensions emerging from online-offline social interaction were largely heralded as an upward trajectory toward greater democratization by liberals and civil rights activists (Benkler, 2006). This narrative was challenged in the aftermath of highly contentious issues such as the US presidential election of 2016 or the UK EU membership referendum, also held in 2016. Informed observers sounded a note of alarm about the scope of social media activity for distorting electoral processes in democratic countries. The Brexit campaign and elections in the United States and France that followed the referendum have likewise been linked to disinformation, misinformation, and propaganda campaigns seeking to strategically exploit the mechanisms underpinning online interaction, such as network effects and information diffusion, with the aim of heightening partisanship and eroding the general trust in democratic institutions (Bastos & Mercea, 2019; Ferrara, 2017; Shao et al., 2017). 
The ensuing information warfare leveraging social media communication called into question fundamental expectations about social media platforms (Bastos & Mercea, 2018b), with the weaponization of news items epitomized by the contested and ideologically inflected idea of ‘fake news’ exemplifying the limits of a framework in which content consumed online was undistinguishable from content that appeared in print, with exceptions reserved to the speed, reach, and outlays involved in the production of content. Conversely, social media research started to focus on the affordances of internet-enabled technologies years before the crisis became apparent, with a growing body of scholarship exploring sociological questions underlying the tension between online and offline dynamics (Walker et al., 2019). The third phase of internet research offers a sharp contrast to the heyday of internet euphoria. This research program has swiftly evolved from ontologically distinct constructs of online and offline spaces (Graham et al., 2013) to methodological and ethical challenges raised by Big Data (Tufekci, 2014); the dynamics of online social networks (González-Bailón et al., 2011); and more granular investigations on the mutual elapsed effects of digital, internet-enabled activity on larger communities and social networks (Bastos et al., 2015). In the aftermath of this third phase, internet research studies sought to question the governance mechanisms regulating the generalized use of social platforms in daily life. At this point, social platforms had altered the fabric of social life by offering tangible relationships entirely independent from one’s immediate geographic
surroundings. This relatively uncontroversial observation offers a direct challenge to the geographic premise that social behavior is context specific, and that space and society are mutually constituted (Leitner et al., 2008). Yet, we are only beginning to understand the magnitude of challenges involved in mapping online to offline social networks. The collapsing of distances brought by internet technologies often foregrounds the role of geography within one’s social network. On the other hand, it is often difficult, if at all possible, to determine the directionality of this homophilic relationship, with primary and secondary effects arising from the interaction between existing physical ties and online interactions. Perhaps more critically, patterns observed in online settings may not replicate in offline settings, whether because of social platform affordances (e.g., the anonymity in online settings increases the likelihood of vitriol and uncivil speech occurring in interpersonal communications) or the type of relationship that individuals develop on social media (Barberá & Steinert-Threlkeld, 2019). There are also specific anxieties associated with social platforms’ tight grip on social interaction that either heighten intergroup tensions or bring forth unanticipated conflicts. Early iterations of the business model backing social platforms foregrounded clickable content with little if any value to civic participation. Viral posts combined user-generated content and content created by paid staff members with a topical focus on personal experience, parents’ advice, celebrity-endorsed lifestyles, feel-good human-interest cases, one-liner memes, and videos exploring culinary curiosity that departed from similar content consumed in print (Bastos, 2016). The incentives rewarding clickbait content could be converted, and ultimately would be promptly converted, into influence operations on a scale that cannot be replicated in face-to-face interaction. 
This trait burgeoned as social platforms matured, with targeted political advertising allowing actors to subvert public discourse with limited scrutiny from civil society (Bastos & Mercea, 2018b). Early narratives heralding social platforms as a global utopia metastasized into narratives of digital dystopia, with real-world communities being ripped apart as prejudice, hatred, and outgroup hostility were peddled online. These issues are likely to become more prominent as internet access grows in rural communities (USDA, 2015). Taking the US population as an example, where individuals spend an average of 22.5 hours a week online (Cole et al., 2018), it is possible to estimate the ratio of time interacting with others online versus offline, as there are stringent limits on the amount of time individuals have for social interaction (Dunbar, 1998; Sutcliffe et al., 2012). Twenty-two hours may appear to be a small share of one’s weekly time, indeed a little under one-seventh of the 168 hours in a week, but this does not take into account the hours dedicated to working, sleeping, commuting, eating, exercising, and handling household chores and personal hygiene, which may conceivably be estimated at around 130 hours a week. This leaves around 38 hours of free time available per week, of which 22.5 would represent more than half of the free time available per individual. In other words, it is plausible that individuals with unrestricted access to the internet were
evenly dividing their free time between online and offline activities even before the COVID-19 pandemic.
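The back-of-the-envelope estimate above can be retraced in a few lines. The sketch below uses only the two figures quoted in the text (22.5 hours online and roughly 130 hours of committed time); the variable names are illustrative, and the 168-hour week is plain arithmetic rather than a figure from the cited surveys.

```python
HOURS_PER_WEEK = 7 * 24   # 168 hours in a week
online_hours = 22.5       # weekly time online, as cited in the text (Cole et al., 2018)
committed_hours = 130     # work, sleep, commuting, chores: the text's rough estimate

# Free time is whatever remains after committed hours are subtracted.
free_hours = HOURS_PER_WEEK - committed_hours
share_of_week = online_hours / HOURS_PER_WEEK
share_of_free_time = online_hours / free_hours

print(f"Free time per week: {free_hours} h")
print(f"Online share of the whole week: {share_of_week:.1%}")
print(f"Online share of free time: {share_of_free_time:.1%}")
```

Under these assumptions, time online is about 13% of the full week but close to 60% of the 38 free hours. The estimate is sensitive to the 130-hour figure: a few hours more or less of committed time moves the free-time share considerably.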

Networks

From the perspective of a social network analyst, online and offline realms can only mirror each other in the unlikely scenario where one’s online and offline ties are identical. Granovetter’s (1973) seminal study of job search has implications at the group level as well: the strong pockets of cohesion in communities with many strong ties, such as extended family networks, also imply much weaker global cohesion. In contrast, online communities where the ties connecting users are largely weak may present low local cohesion but strong global cohesion that can be leveraged to distribute information faster, more efficiently, and to larger communities of users than would be conceivable in offline networks alone. In other words, and still following Granovetter’s original insight, communities with a diffuse weak-tie structure, much like communities of interest and of practice forged online, have greater group-level social capital that can be leveraged in collective, goal-oriented tasks such as distributing information and organizing community action to respond to an outside threat. This otherwise apparent problem in the overlap between online and offline networks is often overlooked because network analysis is often approached from a nominalist perspective on networks (Laumann et al., 1989). For nominalists, network ties arise out of a definition, which may be a construct of the method or an arbitrarily defined boundary. Network nodes, including social actors, are by definition persistent, and ties are drawn as a function of the questions being asked or the transactions being recorded. From the perspective of realist network theory, however, networks refer to an underlying reality that can be traced or detected using methods of data collection. 
In other words, a realist invariably sees a node belonging to multiple networks, while a nominalist only conceives such connections to other nodes if one chooses, or indeed if the data allows for, selecting such connections for analysis. However apparent the interplay between networks online and offline may appear to be, the emergence of social platforms actually accelerated this inclination in the field of social network analysis. Relationally defined communities, particularly communities of interest, rapidly superseded spatially defined communities. Geographic dependencies were seldom explored in personal communities defined as an individual set of ties, as spatial distance was perceived as a lesser hindrance in social media communication increasingly defined by social accessibility rather than spatial accessibility (Chua et al., 2011). This change of analytical focus from spatial to relationally defined communities downplayed not only an important driver of tie formation but also the relational multiplexity that characterizes strongly tied pairs meeting offline, and then online again, and offline yet again (Haythornthwaite & Wellman, 1998). It also overlooked the asymmetric formation of weak ties, as social
media interaction provides support for latent ties that may never be activated by social interaction. There remain fundamental challenges in connecting online behavior on social media to offline relationships, often driven by the distinctive dynamics of relationships online where structural requirements for tie formation are noticeably lower compared with tie formation in offline relationships. Twitter, in particular, presents characteristics typical both of social networks and of highly centralized diffusion systems (Bastos, Piccardi, et al., 2018; Wu et al., 2011). Users can reach out to other users on Twitter and other social media platforms more easily than offline, and there are various metrics of social media activity that can be used as a proxy for real-world friendship with varied levels of accuracy: @-mentions, retweets, likes, pins, and particularly Facebook shares or photo tagging, which are commonly used to infer edge strength and real-world friendships. In addition to that, social platforms are designed to operate deeply embedded into one’s social network in ways that are not entirely observable. In other words, unraveling affordances of social networking platforms from the routines through which one engages with others face-to-face is not a pedestrian task. Online communities are constrained by boundaries reshaped by social platforms in ways that they may reflect, at least in part, and often to abhorrent precision and accentuation, the constraints of physical social networks experienced offline. But the constant reshaping of these platforms also creates affordances that diverge radically from observable offline interactions, particularly when their staggering user base surpasses the population of sovereign countries and the stratified marketplaces where they operate.

Polarization

A global social network is a milestone in the evolution of a species marked by a lasting amity-enmity complex that underpins communities and tribal coalitions (Cosmides et al., 2003). Whereas offline groups present clear boundaries defined by space and size typical of classrooms, colleges, workplaces, sports teams, churches, trade unions, and political parties, online social networks are subject to no such spatial and social constraint typical of tribal loyalty within groups. Mass media, notwithstanding its national reach and scale, posed no fundamental challenge to the tribal social fabric. Social platforms, however, are quickly outpacing the scale of mass media with communities that grow beyond this inflexible limitation. The absence of clear group boundaries necessary for in-group favoritism and outgroup derogation may trigger new forms of coalition and affiliation not prescribed in classic social identity theory (Tajfel, 1974); the response amplification may likewise lead to disruption and disintegration of traditional political and partisan allegiances (Bastos & Mercea, 2018a) along with divisive politics known as affective polarization (Iyengar et al., 2019). Social media facilitate, perhaps incentivize, interaction across distance. This requires an analytical framework that can take into account spatial non-adjacency
in the interdependence of social relationships. Specifically, the spatial embedding of social media interactions may reframe the importance of places and immediate social relationships that may be rendered peripheral notwithstanding the centrality of their spatial location to the social fabric of groups, communities, and societies. Perhaps unsurprisingly, studies have found that tie formation continues to be dependent on geography, with the probability of ties reducing with distance (Liben-Nowell et al., 2005; Mok et al., 2010; Preciado et al., 2012; Wong et al., 2006) and proximate actors having similar sociocultural and demographic properties (Hipp et al., 2014). There is also evidence that homophily offline may trigger similarly clustered formations online (Onnela et al., 2011). A large body of scholarship has sought to model the probability of tie formation online as a function of geographic propinquity, that is, the hypothesis that people located closer together in physical space are more likely to form a relationship (Festinger et al., 1950). There is also extensive work probing space-independent communities in spatial networks (Expert et al., 2011). In the seminal study of Onnela et al. (2011), a social network of individuals with precise geographical information for each actor was collected. The variation of geographical span for social groups of varying sizes was surprising, as no correlation between the topological positions and geographic positions of individuals within network communities was found. The results are at odds with scholarship positing a linear association between ties forged online and geographic location, including the established dependence of dyadic social interactions on geography, as friendship probability has been shown to decay with distance (Liben-Nowell et al., 2005). In contrast to the studies just discussed, the results reported by Onnela et al. 
(2011) suggested that spreading processes may face distinct structural and spatial constraints. The relative autonomy of online social networks can insulate individuals from nearby communities. Well-educated and otherwise open communities may conceivably grow apart due to asymmetric information diets feeding different in-group identities. Conflicting information diets across communities can feed motivated reasoning and confirmation biases that jeopardize consensus-driven communication, with climate change being a critical development where the abundance of information, or the scientific consensus about the subject, is not pertinent to several communities (Bardon, 2019; Latour, 2004). Evidence points to ideological polarization actually increasing with respondents’ knowledge of the subject, so that the likelihood of conservatives being climate change deniers is higher if they are college-educated (Bolsen et al., 2015). But disagreeing with facts advanced by the scientific community is not a prerogative of conservatives. While conservatives are more likely to reject established facts about evolution, the age of the Earth, and climate change, liberals are particularly prone to reject scientific facts about fracking, vaccination, and GMO. Liberals are also less likely to accept expert consensus on the possibility of safe storage of nuclear waste or on the effects of concealed-carry gun laws (Kahan et al., 2011). Naturally, social groups formed online or offline are not static; there may exist considerable overlap over time, which adds further challenges to the problem. But
there is at least one key affordance of social platforms rendering online communication substantively different from face-to-face interaction. This is the potential for politically and culturally homogeneous communication. This is encapsulated in the metaphors of filter bubbles and echo chambers, which one may find difficult to route around, or even to escape in an information ecosystem marked by rampant social media usage. This is a considerable departure from the social web marked by open standards and centered on user governance that dominated internet studies in the 1990s and early noughties. Where web portals would bring people of different affiliations together (Tyler et al., 2019), social platforms can fragment and balkanize lifestyles and political allegiances, which further segregates and polarizes online communities. This process of increasing segregation and polarization online may then spill over to face-to-face communication through network externalities, including spillover effects.

Affordances

But the extent to which online social networks effect change to offline communities remains an empirical question. Indeed, the seminal work of Stanley Milgram and Philip Zimbardo on the psychological model of deindividuation hypothesized considerable changes in social behavior due to anonymity (Milgram & Gudehus, 1978; Zimbardo, 1969). Others have traced the behavioral effects of anonymity, including lower self-regulation, greater confidence, and freedom from the norms enforced by a social hierarchy, to communication technologies as old as the telegraph (Watt et al., 2002) and, of course, to the arrival of the internet, but commencing prior to social media (Joinson, 1999). Bisbee and Larson (2017) argued that this body of scholarship raised a concern that the way people relate to others online is fundamentally different from how they relate to others offline. This is indeed a difficult empirical question, as only scant data exists mapping online to offline social networks, and therefore only limited scholarship has assessed the extent to which social networks online are topologically comparable to those observed offline. Research has often and understandably resorted to surveys to probe whether one’s online social network is similar to one’s offline social network. Jones et al. (2013) asked Facebook users to name their closest friends in real life and found that the frequency of Facebook interactions was diagnostic of strong ties, though interestingly, private messages were not necessarily more informative than public communications (comments, wall posts, and other interactions). Similarly, Bisbee and Larson (2017) implemented a survey on Amazon Mechanical Turk and found similarities in the structure of social relationships established online and offline, but this similarity is likely restricted to the structure of the network, with individuals likely favoring online and offline social networks to develop specific relationships. Subrahmanyam et al. 
(2008) surveyed relationships established face-to-face and via social networking sites. The study identified some overlap, as participants would use social networking sites to connect with friends and family members. However,
the overlap was imperfect, and it suggested different patterns across age groups that strengthened aspects of offline connections. The overlap between offline and online social networks was modest, as on average only half of the closest online social network contacts matched face-to-face friends. Out of 70 respondents who used instant messaging, only 12 overlapped 100% and 11 had no overlap at all. The overlap between offline and online social networks was similar; out of 73 respondents, 8 had zero overlap and 16 overlapped entirely with their face-to-face friends. On average, half of users’ best face-to-face friends were also their best friends on social networking sites. For those who used both instant messengers and social networking sites, under half reported no overlap between the online social networks and their face-to-face friends. Similarly, Boyd (2008) studied the online social network Friendster and noted that the occurrence of numerous common ties but no mutual connection was an indication that the two users were likely ex-partners. This is an interesting departure from the seminal definition of triadic closure in social network theory, dating back to the landmark definition introduced by Simmel (1908) and subsequently formalized as network transitivity (Scott & Carrington, 2011). In other words, if A and B were to share several friends but not be connected to one another, this was most likely due to a severed personal connection, not a social opportunity. Boyd (2008) concluded that this rather basic social fact could not be rendered, and that the Friendster network was not modeling everyday social networks, but constituting its own, with distinctive rules and patterns of interaction. Even when online interaction was not marked by anonymity, it was marked by a level of publicness that differed substantively from face-to-face interactions. 
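The pattern described for Friendster, two people who share many friends yet are not tied to each other, corresponds to what network analysts call an open triad. As a minimal sketch, the snippet below scans a made-up friendship list for such pairs; the names, the data, and the `min_shared` threshold are hypothetical and are not taken from the Friendster study.

```python
from itertools import combinations

# Toy undirected friendship network; names are invented for illustration.
friends = {
    "ana": {"cam", "dev", "eli"},
    "ben": {"cam", "dev", "eli"},
    "cam": {"ana", "ben", "dev"},
    "dev": {"ana", "ben", "cam"},
    "eli": {"ana", "ben"},
}

def unconnected_pairs_with_shared_friends(graph, min_shared=3):
    """Return (a, b, shared) for pairs with >= min_shared common friends but no direct tie."""
    hits = []
    for a, b in combinations(sorted(graph), 2):
        if b in graph[a]:
            continue  # already connected, so triads through a and b are closed
        shared = graph[a] & graph[b]
        if len(shared) >= min_shared:
            hits.append((a, b, sorted(shared)))
    return hits

result = unconnected_pairs_with_shared_friends(friends)
print(result)
```

A transitivity-based reading would treat the flagged pair (ana and ben) as a likely future tie. The observation reported above is that, on a publicly performed network, the same gap may instead mark a severed relationship, a distinction the graph structure alone cannot make.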
As such, publicly performed social networks such as Facebook, Twitter, or Instagram differ fundamentally from those that sociologists studied traditionally because they represent more than tie strength. Social media platforms have struggled to cope with the distinctively different social norms that orient online and offline worlds. The conflict was perhaps to be expected: the affordances of online platforms pale in comparison to the overwhelming stream of visual, auditory, and kinesthetic information that underpins face-to-face interaction. The implicit norms and conventions of face-to-face communication are often absent in online interaction, particularly turn-taking and the expectation that conversation will not be recorded or filmed without one’s consent. Interaction on social platforms, on the other hand, is recorded by default, and it is not always clear who owns the data generated. Digital trace data resulting from online interaction may also be stored beyond the life of participants. This caveat of online interaction is augmented by the business model of social platforms supported by advertisement, which requires online activity to be linked to the real-world identity of users, with Facebook being notable in ensuring all users are personally identifiable as real human beings, or perhaps more tangibly as real-world consumers. Facebook is not alone in struggling to manage the collision between online and offline. Google has a track record of underestimating how entrenched relationships
with kith and kin may differ in substantive ways from online transactions. The short-lived microblogging tool Google Buzz shared users’ online activity with people they were trying to avoid. Google engineers assumed email frequency was a reliable proxy for meaningful relationships, which, of course, does not take into account pranksters, stalkers, debt collection agencies, crooks, and scam artists. Similarly, Google’s Glass project failed to note that recording conversations between individuals requires one’s consent. Much like Facebook’s Real Names Policy, Google Plus, another short-lived microblogging and messaging tool owned by Google, sought to force users to link their Google activity to their real name, so that users’ activity would be irrevocably linked to their real-world identity, a condition at odds with the regular forgetfulness of face-to-face interaction. Even Sidewalk, Google’s project of a robot-maintained, data-driven city of the future in Toronto, was eventually scrapped. Sensors would track residents’ movement to optimize traffic flow and clean the streets, while also extending Google’s omnipresent surveillance from the online world to the physical one. The struggle to cope with online and offline norms is often accompanied by overwhelmingly good intentions that set these problems in motion. In the early noughties, when Facebook was setting its agenda to reshape the internet around personal relationships, and then the entire world, few would argue against the mission of making the world more open and connected. A more open and intensively connected world was a logical consequence of the technolibertarianism epitomized by the ‘Declaration of Independence for Cyberspace’ (Barlow, 1996) and the broader political aspirations of the Silicon Valley technorati (Barbrook & Cameron, 1996). 
This political project collapsed in the second half of 2016, when Facebook’s News Feed algorithm was exploited in various influence operations in the run-up to national elections (Bastos & Farkas, 2019), turning a platform originally designed for connecting people into a remarkable driver of political division. The crisis escalated quickly, and in 2018 it reached the point of no return after the Cambridge Analytica data scandal.

Pandemonium

The epochal events that marked 2016 were perhaps thrown into relief in the ensuing years with multiple investigations of influence operations on social platforms to undermine democratic deliberation. This chain of events, and the debate on privacy that ensued, continues to linger and to inform policymakers, but it was quickly superseded by more pressing concerns. In 2020, less than two years after the Cambridge Analytica data scandal, China and then Europe and the rest of the world were dealing with the sudden outbreak of COVID-19, which rapidly escalated to a global pandemic. National governments struggled to cope, and health systems drew near collapse in many countries. The pandemic and the enforced policy of social distancing shifted much of the workflow online, with big tech companies, including the behemoth Amazon, struggling to handle a demand that was truly unprecedented.
If Amazon struggled to cope with demand, available spots for home delivery by regular supermarkets all but disappeared. As movie theaters suspended operations, Netflix and other streaming platforms registered record numbers of new customers. Universities and schools implemented measures to stream lectures and upload course materials to teaching platforms like Moodle, Canvas, and Blackboard. Regular office work moved to remote work, and weekly workouts at the gyms moved to YouTube and fitness trackers. The Peloton indoor bike, aimed at a narrow and wealthier demographic group, offered live at-home classes and competitions with other Peloton riders around the world. Even physicians who long resisted telemedicine were suddenly caring for patients using videoconferencing platforms. Real life suddenly became detached from the physical world, with social relationships carried out over WhatsApp, business meetings moving to Teams, and friends and colleagues socializing on Zoom. The unprecedented scale of social isolation made ever so clear the gregarious nature of human life and the perennial need to interact with real people in physical, geographically bounded spaces. The outbreak brought to the fore the critical role played by digital and networking infrastructure to the regular functioning of businesses and institutions, but it also highlighted the ultimate seclusion between online and offline social networking, with hazards ascribed to the latter not applying to the former. Individuals became simultaneously more dependent on online networking technologies to carry out both routine and critical work tasks and more aware of how short online interaction fell compared with the multisensorial, time-dependent, and irrevocably space-constrained social network interactions that take place offline. 
Another remarkable development during the pandemic lockdown in the US was the series of protests and riots following the killing of George Floyd, a 46-year-old black man who died of strangulation by a white police officer in the Minneapolis metropolitan area. As protests against police brutality spread throughout the country and then worldwide, groups that existed almost exclusively within the realm of social media found in the protests an opportunity to transition from online conspiracy-theorizing to plotting violence and discord onsite. QAnon, the far-right conspiracy theory community whose members believe in a secret plot by the ‘deep state’ against former US President Donald Trump and his supporters, seized the opportunity of ongoing protests and rioting to act out their ideas, previously restricted to chat-rooms and online forums, and take direct action in the form of an insurrection against the government. There remain some important lines separating online from offline life. Anyone attacked by a Twitter mob takes some measure of comfort from the assertion that the internet is not real life. However intimidating Twitterstorms targeting individuals may feel, it is comforting to know a stream of pixels cannot transform itself into a vicious mob. The line separating online and real life is nonetheless porous, disappearing altogether whenever teens are shamed for an Instagram post, or an adult is fired because of an unfortunate tweet, or elderly individuals are targeted in online scams. The implacable effects of the online world on real life are puzzling because
the rules underpinning social interaction online are different from those of offline life, a gap successfully exploited by trolls, botmasters, and underground creatures versed in the cyber dark arts. As we acknowledge the tangible effects of digitalization and automation on our lives, we also come to terms with a certain loss of agency that fed the rampant conspiracy-theorizing observed in the late 2010s and early 2020s. The claim that Twitter is not real life is ultimately a false dichotomy. While the demographics of social platforms are rarely representative of the broader population (Sloan, 2017), online activity is deeply embedded in social and political forces. If Twitter is not real life, neither is broadcast or print media; however diverse the pool of journalists and entertainers working in mainstream media, this group is hardly representative of the opinions shared by rank-and-file voters in national consultations. Mainstream media and political elites have nevertheless an outsized effect on cultural and political trends. They are also particularly prone to populate Twitter, where the interaction between pundits and hyperactive users triggers a feedback loop between institutions and hyperactive users. In that sense, social media in general and Twitter in particular are central components of one’s offline reality, whether one is active in the platform or not. As social platforms pull power away from mainstream media and established institutions, a process referred to by media scholars as ‘mediatization’ (Strömbäck, 2008), it becomes clear that those platforms are not walled off from the real world.

Scalability

The prevailing narrative about the dark turn of social media platforms leading up to the pandemic foregrounds a shift in the business model of startups connecting individuals and communities to an algorithmic model of governance where individuals and groups are fed engaging, which often translates to dividing, information. While the shift in the business model of social platforms from social technologies empowering communities to large-scale, micro-targeting advertising and propaganda platforms is a driving factor in the dark turn of social platforms in the late teens, there is a case to be made that more structural forces would have inevitably triggered the large-scale conflicts observed in the fallout from the Cambridge Analytica data scandal, even in the event that social platforms had chosen to stick to a more benign business model. The integration of individuals and communities at the center of social platforms could hardly have materialized without substantive cultural and economic upheaval. If 20th century history revolves around nation states struggling to manage each other’s boundaries, the 21st century brought about by social platforms was destined to upscale this challenge at both global and local levels. Fundamental differences observed across national and cultural fault lines at the center of national identities were speedily pulverized over much smaller scales. As such, the unrelenting stream of crises upsetting social technology companies in the early 21st century fed from lasting cultural and group differences that were never resolved in the
aftermath of the 20th-century cross-national wars. The integration of groups and communities is rarely a peaceful process, and there is little reason to expect technological scalability and fragmentation to correct this course. Cultural conflicts can instead be stoked over much smaller unit levels compared with nation states, which previously marked the boundaries broadly separating communities. Big tech scalability uncovers out-groups at much finer unit levels. They can be neighbors and may conceivably be among one’s family. By 2018, the reputation of technology companies irreversibly changed, much like the reputation of the finance sector changed a decade before after the housing bubble crash. The underlying argument of this book is that beyond the misguided policies of committing social platforms to a business model that underplays the role of individuals and communities in favor of algorithmic governance and advertising revenue, the very integration of large populations consistently triggers conflicts in ways that are unpredictable and at times counterintuitive. Much like the platforms providing the infrastructure for users, such conflicts can scale up in space and time. From this perspective, social platforms—including Facebook, which is arguably the worst offender—deserve sympathy for trying to juggle the conflicting priorities of privacy, transparency, and safety, while policymakers demand smooth integration of cultures and ideologies across communities that nevertheless continue to propagate negative outgroup-directed attitudes and behaviors. It is against this backdrop that there remains great potential for escalating the hazards brought about by greater integration of peoples and communities within the logic of network effects and increasing returns to scale. This book is a small contribution to untangling these social tensions and spatial relationships. 
It leverages the possibilities opened by the growing availability of spatially rich, digital trace data to bridge the disciplinary and methodological gap separating internet research from social network analysis. A purpose-specific R package was created and accompanies this book. The package offers a series of functions to plot igraph or network objects over geographic grids. The real-world applications discussed in this book draw from this package, and the book itself can be seen as a theoretical construct for the package, which also offers a vignette and tutorials that work through the software and datasets discussed in the book. We expect the two-fold structure of this project (book and software) to offer better support for data-driven, computational research projects. The ensuing chapters outline the theoretical and methodological challenges in analyzing social networks that overlap online and offline by offering an introduction to concepts, theories, and methods that sit at the crossroads between spatial and social network analysis. The first section of the book is dedicated to the differences between space (the geometrical arrangements that structure and enable forms of interaction) and place (the mechanisms through which social meanings are attached to physical locations), while also exploring the asymmetries observed in online and offline social networks. The second section covers the rationale of social network analysis and the ontological differences stating that relationships, more than individual and independent attributes, are critical to the understanding of social

behavior. This section provides an overview of spillover effects and how seemingly unrelated social activities online and offline may reinforce each other. The third section unpacks some case studies that mapped social media activity to geographically bounded areas and considers the inflection of homophilous dependencies across online and offline social networks. The fourth and last section of the book explores a range of networks and discusses methods for and approaches to plotting a social network graph onto a map, including the purpose-built R package Spatial Social Media.
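The idea of plotting a social network graph onto a map can be illustrated without reference to any particular software. What follows is a minimal sketch in Python, not the API of the R package that accompanies this book: the user names, ties, and approximate coordinates are hypothetical and serve only to show how node positions can be taken from geography rather than from a layout algorithm, and how each tie then acquires a geographic length.

```python
import math

# Hypothetical users with approximate (longitude, latitude) pairs; illustrative only.
NODES = {
    "ana": (-0.13, 51.51),   # roughly London
    "ben": (-2.24, 53.48),   # roughly Manchester
    "cai": (-3.19, 55.95),   # roughly Edinburgh
}
# Hypothetical online ties (e.g. mentions or follows) between those users.
EDGES = [("ana", "ben"), ("ben", "cai"), ("ana", "cai")]

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def geographic_layout(nodes):
    """Use longitude as x and latitude as y, so the graph layout follows the map."""
    return {name: (lon, lat) for name, (lon, lat) in nodes.items()}


def edge_lengths_km(nodes, edges):
    """Attach a geographic length to every tie in the network."""
    return {(u, v): haversine_km(nodes[u], nodes[v]) for u, v in edges}


layout = geographic_layout(NODES)
lengths = edge_lengths_km(NODES, EDGES)

# List ties from geographically shortest to longest.
for (u, v), km in sorted(lengths.items(), key=lambda item: item[1]):
    print(f"{u} - {v}: {km:.0f} km")
```

In a sketch of this kind the layout carries information that a force-directed layout would discard: two users who are adjacent in the network may sit far apart on the map, and the resulting edge lengths can be passed on to the spatial statistics discussed later in the book.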

References Albert, R., & Barabási, A.-L. (2002). Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1), 47. Barberá, P., & Steinert-Threlkeld, Z. C. (2019). How to use social media data for political science research. In L. Curini & R. Franzese (Eds.), The Sage Handbook of Research Methods in Political Science and International Relations. London: Sage. Barbrook, R., & Cameron, A. (1996). The Californian ideology. Science as Culture, 6(1), 44–72. Bardon, A. (2019). The Truth about Denial: Bias and Self-Deception in Science, Politics, and Religion. London: Oxford University Press. Barlow, J. P. (1996). Declaration of Independence for Cyberspace. Davos, Switzerland: Electronic Frontier Foundation. Bastos, M. T. (2016). Digital journalism and tabloid journalism. In B. Franklin & S. Eldridge (Eds.), Routledge Companion to Digital Journalism Studies (pp. 217–225). London: Routledge. Bastos, M. T., & Farkas, J. (2019). “Donald Trump is my president!”: The internet research agency propaganda machine. Social Media + Society, 5(3). doi: 10.1177/2056305119865466 Bastos, M. T., & Mercea, D. (2016). Serial activists: Political twitter beyond influentials and the twittertariat. New Media & Society, 18(10). doi: 10.1177/1461444815584764 Bastos, M. T., & Mercea, D. (2018a). Parametrizing Brexit: Mapping Twitter political space to parliamentary constituencies. Information, Communication & Society, 21(7), 921–939. doi: 10.1080/1369118X.2018.1433224 Bastos, M. T., & Mercea, D. (2018b). The public accountability of social platforms: Lessons from a study on bots and trolls in the Brexit campaign. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. doi: 10.1098/rsta.2018.0003 Bastos, M. T., & Mercea, D. (2019). The Brexit Botnet and user-generated hyperpartisan news. Social Science Computer Review, 37(1), 38–54. doi: 10.1177/0894439317734157 Bastos, M. T., Mercea, D., & Baronchelli, A. (2018). 
The geographic embedding of online echo chambers: Evidence from the Brexit campaign. PLOS One, 13(11), e0206841. doi: 10.1371/journal.pone.0206841 Bastos, M. T., Mercea, D., & Charpentier, A. (2015). Tents, Tweets, and events: The interplay between ongoing protests and social media. Journal of Communication, 65(2), 320–350. doi: 10.1111/jcom.12145 Bastos, M. T., Piccardi, C., Levy, M., McRoberts, N., & Lubell, M. (2018). Core-periphery or decentralized? Topological shifts of specialized information on Twitter. Social Networks, 52(Supplement C), 282–293. doi: 10.1016/j.socnet.2017.09.006 Baym, N. K. (1999). Tune in, Log on: Soaps, Fandom, and Online Community (Vol. 3). London: Sage. Benkler, Y. (2006). The Wealth of Networks: How Social Production Transforms Markets and Freedom. London: Yale University Press.

Benkler, Y., Faris, R., & Roberts, H. (2018). Network Propaganda: Manipulation, Disinformation, and Radicalization in American Politics. London: Oxford University Press. Bisbee, J., & Larson, J. M. (2017). Testing social science network theories with online network data: An evaluation of external validity. American Political Science Review, 111(3), 502–521. doi: 10.1017/S0003055417000120 Bolsen, T., Druckman, J. N., & Cook, F. L. (2015). Citizens’, scientists’, and policy advisors’ beliefs about global warming. The ANNALS of the American Academy of Political and Social Science, 658(1), 271–295. doi: 10.1177/0002716214558393 Boy, J. D., & Uitermark, J. (2016). How to study the city on Instagram. PLOS One, 11(6), e0158161. Boyd, D. (2008). None of this is real: Identity and participation in Friendster. In J. Karaganis (Ed.), Structures of Participation in Digital Culture (pp. 132–157). New York, NY: Social Science Research Council. Castells, M. (2012). Networks of Outrage and Hope: Social Movements in the Internet Age. Cambridge: Polity Press. Chua, V., Madej, J., & Wellman, B. (2011). Personal communities: The world according to me. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE publications. Cole, J., Berens, B., Suman, M., Schramm, P., & Zhou, L. (2018). The 2018 Digital Future Report The 16th Annual Study on the Impact of Digital Technology on Americans. Los Angeles, CA: Center for the Digital Future, USC Annenberg. Cosmides, L., Tooby, J., & Kurzban, R. (2003). Perceptions of race. Trends in Cognitive Sciences, 7(4), 173–179. doi: 10.1016/S1364-6613(03)00057-3 Curry, M. R. (2002). Discursive displacement and the seminal ambiguity of space and place. In L. Lievrouw & S. Livingstone (Eds.), The Handbook of New Media (pp. 502–517). London: SAGE Publications. Dodge, M., & Kitchin, R. (2001). Atlas of Cyberspace (Vol. 158). London: Addison-Wesley. Donald, E. B. (1981). Debrett’s Etiquette and Modern Manners. 
London: Debrett’s Ltd. Dunbar, R. (1998). Theory of mind and the evolution of language. In J. R. Hurford, M. Studdert-Kennedy, & C. Knight (Eds.), Approaches to the Evolution of Language (pp. 92–110). Cambridge: Cambridge University Press. Expert, P., Evans, T. S., Blondel, V. D., & Lambiotte, R. (2011). Uncovering space-independent communities in spatial networks. Proceedings of the National Academy of Sciences, 108(19), 7663–7668. Ferrara, E. (2017). Disinformation and social bot operations in the run up to the 2017 French presidential election. First Monday, 22(8). Festinger, L., Schachter, S., & Back, K. (1950). Social Pressures in Informal Groups: A Study of Human Factors in Housing. Michigan: University of Michigan. González-Bailón, S., Borge-Holthoefer, J., Rivero, A., & Moreno, Y. (2011). The dynamics of protest recruitment through an online network. Scientific Reports, 1. doi: 10.1038/srep00197 Graham, M., Zook, M., & Boulton, A. (2013). Augmented reality in urban places: Contested content and the duplicity of code. Transactions of the Institute of British Geographers, 38(3), 464–479. doi: 10.1111/j.1475-5661.2012.00539.x Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360–1380. Haythornthwaite, C., & Wellman, B. (1998). Work, friendship, and media use for information exchange in a networked organization. Journal of the American Society for Information Science, 49(12), 1101–1114. Hipp, J. R., Corcoran, J., Wickes, R., & Li, T. (2014). Examining the social porosity of environmental features on neighborhood sociability and attachment. PLOS One, 9(1), e84544.

Hjarvard, S. (2008). The mediatization of religion: A theory of the media as agents of religious change. Northern Lights: Yearbook of Film & Media Studies. doi: 10.1386/nl.6.1.9_1 Hunsinger, J., Klastrup, L., & Allen, M. (2010). International Handbook of Internet Research. Dordrecht: Springer. Iyengar, S., Lelkes, Y., Levendusky, M., Malhotra, N., & Westwood, S. J. (2019). The origins and consequences of affective polarization in the United States. Annual Review of Political Science, 22, 129–146. Johnston, R., & Pattie, C. (2011). Social networks, geography and neighbourhood effects. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications. Joinson, A. (1999). Social desirability, anonymity, and internet-based questionnaires. Behavior Research Methods, Instruments, & Computers, 31(3), 433–438. Jones, J. J., Settle, J. E., Bond, R. M., Fariss, C. J., Marlow, C., & Fowler, J. H. (2013). Inferring tie strength from online directed behavior. PLOS One, 8(1), e52168. doi: 10.1371/journal.pone.0052168 Kahan, D. M., Jenkins-Smith, H., & Braman, D. (2011). Cultural cognition of scientific consensus. Journal of Risk Research, 14(2), 147–174. doi: 10.1080/13669877.2010.511246 Kitchin, R., & Dodge, M. (2011). Code/Space: Software and Everyday Life. Cambridge, MA: MIT Press. Latour, B. (2004). Why has critique run out of steam? From matters of fact to matters of concern. Critical Inquiry, 30(2), 225–248. Laumann, E. O., Marsden, P. V., & Prensky, D. (1989). The boundary specification problem in network analysis. Research Methods in Social Network Analysis, 61, 87. Leitner, H., Sheppard, E., & Sziarto, K. M. (2008). The spatialities of contentious politics. Transactions of the Institute of British Geographers, 33(2), 157–172. Liben-Nowell, D., Novak, J., Kumar, R., Raghavan, P., & Tomkins, A. (2005). Geographic routing in social networks. Proceedings of the National Academy of Sciences, 102(33), 11623–11628.
Milgram, S., & Gudehus, C. (1978). Obedience to Authority. New York, NY: Ziff-Davis Publishing Company. Mok, D., Wellman, B., & Carrasco, J. (2010). Does distance matter in the age of the Internet? Urban Studies, 47(13), 2747–2783. Myrick, J. G., Holton, A. E., Himelboim, I., & Love, B. (2016). #Stupidcancer: Exploring a typology of social support and the role of emotional expression in a social media community. Health Communication, 31(5), 596–605. Newman, M. E. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256. Onnela, J.-P., Arbesman, S., González, M. C., Barabási, A.-L., & Christakis, N. A. (2011). Geographic constraints on social network groups. PLOS One, 6(4), e16939. Preciado, P., Snijders, T. A., Burk, W. J., Stattin, H., & Kerr, M. (2012). Does proximity matter? Distance dependence of adolescent friendships. Social Networks, 34(1), 18–31. Robins, G. L., & Pattison, P. (2005). Interdependencies and social processes: Generalized dependence structures. In P. J. Carrington, J. Scott, & L. Wasserman (Eds.), Models and Methods in Social Network Analysis (pp. 192–214). Cambridge: Cambridge University Press. Scellato, S., Noulas, A., Lambiotte, R., & Mascolo, C. (2011). Socio-spatial properties of online location-based social networks. ICWSM, 11, 329–336. Scott, J. (2011). Social physics and social networks. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications.

Scott, J., & Carrington, P. J. (2011). The SAGE Handbook of Social Network Analysis. London: SAGE publications. Shao, C., Ciampaglia, G. L., Varol, O., Flammini, A., & Menczer, F. (2017). The spread of fake news by social bots. arxiv.org, 96, 104. Simmel, G. (1908). Soziologie: Untersuchungen über die Formen der Vergesellschaftung. Berlin: Duncker & Humblot. Sloan, L. (2017). Who Tweets in the United Kingdom? Profiling the Twitter population using the British social attitudes survey 2015. Social Media + Society, 3(1), 2056305117698981. doi: 10.1177/2056305117698981 Snijders, T., Steglich, C., & Schweinberger, M. (2017). Modeling the coevolution of networks and behavior. In K. V. Montfort, J. Oud, & A. Satorra (Eds.), Longitudinal Models in the Behavioral and Related Sciences (pp. 41–71). London: Routledge. Snijders, T. A. B., van de Bunt, G. G., & Steglich, C. E. G. (2010). Introduction to stochastic actor-based models for network dynamics. Social Networks, 32(1), 44–60. doi: 10.1016/j.socnet.2009.02.004 Starbird, K. (2019). Disinformation’s spread: Bots, trolls and all of us. Nature, 571(7766), 449. Steglich, C., Snijders, T. A., & Pearson, M. (2010). Dynamic networks and behavior: Separating selection from influence. Sociological Methodology, 40(1), 329–393. Steglich, C., Snijders, T. A., & West, P. (2006). Applying siena. Methodology, 2(1), 48–56. Strömbäck, J. (2008). Four phases of mediatization: An analysis of the mediatization of politics. The International Journal of Press/Politics, 13(3), 228–246. doi: 10.1177/1940161208319097 Subrahmanyam, K., Reich, S. M., Waechter, N., & Espinoza, G. (2008). Online and offline social networks: Use of social networking sites by emerging adults. Journal of Applied Developmental Psychology, 29(6), 420–433. doi: 10.1016/j.appdev.2008.07.003 Sutcliffe, A., Dunbar, R., Binder, J., & Arrow, H. (2012). Relationships and the social brain: Integrating psychological and evolutionary perspectives.
British Journal of Psychology, 103(2), 149–168. Tajfel, H. (1974). Social identity and intergroup behavior. Social Science Information, 13(2), 65–93. Tonkiss, F. (2005). Space, the City and Social Theory: Social Relations and Urban Forms. Cambridge: Polity. Tufekci, Z. (2014). Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls. Paper presented at the 8th International AAAI Conference on Weblogs and Social Media. Ann Arbor, MI: ICWSM14. Tyler, M., Grimmer, J., & Iyengar, S. (2019). Partisan enclaves and information bazaars. Mapping Selective Exposure to Online News. USDA. (2015). Farm Computer Usage and Ownership. Washington, DC: National Agricultural Statistics Service. Walker, S., Mercea, D., & Bastos, M. T. (2019). The disinformation landscape and the lockdown of social platforms. Information, Communication and Society, 22(11), 1531–1543. doi: 10.1080/1369118X.2019.1648536 Wang, Q., Phillips, N. E., Small, M. L., & Sampson, R. J. (2018). Urban mobility and neighborhood isolation in America’s 50 largest cities. Proceedings of the National Academy of Sciences, 115(30), 7735–7740. Watt, S. E., Lea, M., Spears, R., Rogers, P., & Woolgar, S. (2002). How social is internet communication? Anonymity effects in computer-mediated groups. In S. Woolgar (Ed.), Virtual Society? Technology, Cyberbole, Reality (pp. 61–77). Oxford: Oxford University Press.

White, H. D. (2011). Scientific and scholarly networks. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE publications. Wong, L. H., Pattison, P., & Robins, G. (2006). A spatial model for social networks. Physica A: Statistical Mechanics and Its Applications, 360(1), 99–120. Wu, S., Hofman, J. M., Mason, W. A., & Watts, D. J. (2011). Who Says What to Whom on Twitter. Paper presented at the 20th International Conference on World Wide Web, New York, NY. Zhang, X. M., & Zhu, F. (2011). Group size and incentives to contribute: A natural experiment at Chinese Wikipedia. American Economic Review, 101(4), 1601–1615. Zimbardo, P. G. (1969). The Human Choice: Individuation, Reason, and Order versus Deindividuation, Impulse, and Chaos. In W. J. Arnold & D. Devine (Eds.), Nebraska Symposium on Motivation, Vol. 17 (pp. 237–307). Lincoln: University of Nebraska Press.

SECTION I

Local and digital: the dyadic interaction of social and virtual

1 PLACE AND SPACE

The intrinsically related and highly dynamic concepts of place and space are central to understanding the dyadic interaction of social ↔ spatial. Place and space operate as boundary markers for interactions, a spatial limit that drives inclusion and exclusion of participants from conversation. Harrison and Dourish (1996) offered a broadly accepted separation of the concepts in terms of their geometric (space) and experiential (place) understandings. Space therefore refers to geometrical arrangements that structure and enable forms of interaction, while place refers to the mechanisms through which social meanings are attached to physical locations. Place and space are key concepts in spatial inquiries resulting from questions related to geographic boundaries, community-building, and the formation of nation states. The two concepts additionally interact with the continuities and disconnections between online and offline social dynamics. It is the sociological concept of place (as opposed to the geospatial space) that foregrounds people’s sense of the places they inhabit. As such, online and offline are independent from the constructs of space and place which can more properly be mapped to the splintering between spatial and social. Online social networks, nevertheless, generate a wealth of digital trace data that provide evidence for the formations of new places or the redrawing of spaces. The recent availability of user geolocational data has been leveraged to explore on-the-ground dynamics of cities, with results showing that on a large scale the structure of a place may differ from the official boundaries of a city (Cranshaw et al., 2012). People’s movements define the character of an area, with a dynamic view of the social flows of people coinciding or differing entirely from municipal fixed borders, a key example of how digital trace data can foreground the interplay of place and geography over local social interactions. 
Similarly, serendipitous encounters in a geographically distributed community can be countered or perhaps simply altered by routes suggested by transit and map

applications, whereby an efficient and precise but otherwise reified path may take precedence over alternative routes. This dynamic reshaping of geography led some observers to note the affordance transition from maps as ‘conceived space’ or formal abstractions to ‘lived space’ as embodied human interaction (Chesher, 2012), from representations of spaces to spaces of representation. The examination of how places are defined and experienced through large-scale digital trace data implicitly assumes that the virtual and material are entwined and are changing the ways in which places are defined and experienced (Kitchin et al., 2017). The interplay between space and place naturally predates online social networks, as physical distances are perennially implicated in social relations as dimensions in which various social contexts and norms are crafted and tested. The path one takes daily to go to work or return home is not just a necessary chore or a practical convenience but also a tool for maintaining boundaries between connected social worlds. In other words, it comprises the dimension where various social contexts and norms are deployed (Boyd, 2008a). The availability of digital trace data, nevertheless, particularly those including spatially rich social network interactions, allowed social scientists to collect, visualize, and model spatial networks at unprecedented scale and paved the way for the emergence of what is currently referred to as internet research. This later development in internet scholarship was concurrent with advances in geographic information systems (GIS) that impacted multiple disciplines of scientific inquiry. Humanities scholarship resorting to this wealth of data developed rapidly, particularly in disciplines with a strong foothold in spatial inquiries resulting from questions related to geographic boundaries and the perception of space.
Scholars referred to the geographic data advances in the humanities as a ‘spatial turn,’ as it closely linked studies in history, literature, cartography, and geography with vast repositories of data associated with specific cultures, regions, and locations. While methodologically the spatial turn represented an effort to employ new tools against established questions in the humanities, it also required scholars to revisit theoretical frameworks offering new directions of research. Similarly, to connect the imprint of such questions with the possibilities of spatial research, sociologists returned to Georg Simmel; historians, to Lewis Mumford; and literary historians, to Walter Benjamin (Bodenhamer et al., 2010; Massey, 2005; Warf & Arias, 2008). But it was internet scholarship that set out to explore the tensions between online and offline social networks against the backdrop of place and space. Moving away from dichotomous metaphors, internet scholarship addressed the online-offline continuum as a communication process characterized by the distribution of social identities and practices swiftly moving along computer-mediated and non-computer-mediated environments (Han, 2012), much like the interaction of media and interpersonal behavior is part of a continuum rather than a dichotomy (Meyrowitz, 1986). This is in line with the observation of Boyd (2008a) that online social networks do not necessarily mirror everyday social networks but constitute their own network with distinctive rules and patterns of interaction. Individual

identities can thus move within and across online to offline social networks, so that online realities may inform the offline realm and vice versa. This reshaping of one’s experiences across online and offline social contexts is well captured in the spatial paradox termed ‘augmented reality’ (Conley, 2016). The internet scholarship of the 1990s would often contend with this tension between online and offline worlds as a dichotomous relationship that reified digital dualism rhetoric, with virtual referring to that which is less real or fake, but it swiftly advanced to increasingly nuanced definitions in which online social relationships would intersect with offline contexts in synchronous and asynchronous ways (Benedikt, 1991; Markham, 1998). By the early noughties, internet research had amassed a large body of scholarship addressing online and offline interactions, effectively covering the complex interactions between virtual and social (Mitra & Schwartz, 2001; Papacharissi, 2002) and largely identifying online and offline as different presentations of the same realm (Baym, 1999). The connection of virtual and social communities to geographies and spatial boundaries was, however, largely restricted to the topography of the internet infrastructure where space was only a neutral backdrop, prioritizing instead the role played by social relations and time in online settings (Kitchin & Dodge, 2011). It was only in the mid-teens of the 21st century that the notion of ‘post-digital’ began to be employed in the social sciences and humanities (Berry & Fagerjord, 2017) to identify not a reversal of the digital but a penetration of the analogical into the digital, and vice versa, thereby broadening the latitude of the online-offline probe to incorporate the physicality of offline realms into the digital lifeworld.
One such example is the century-old notion of skeuomorphism applied to computer and mobile interfaces, whereby derivative objects retain nonfunctional visual attributes from structures inherited from the original (Bollini, 2017). Apple’s extensive use of skeuomorphism during the era of Steve Jobs and Scott Forstall is exemplary, with graphical user interfaces that resembled the aesthetics of physical objects taking center stage. It was also during this period that the industry of sexbots started to explore teledildonics that offered remote mutual masturbation over a data link, thereby mimicking embodied forms of human communication. There is a long tradition of scholarship accounting for space and place in media and communications. Meyrowitz (1986) argued that the disconnect between place and space was a marker of modern media, an information ecosystem that predates the emergence of online social networks. Individuals become more engaged with distant others and therefore less engaged with proximate people in the places in which they live. Online communities that emerged in the nine-teens would be an extreme manifestation of what Meyrowitz (2005) referred to as ‘generalized elsewhere’: media affordances that allow us to construct an understanding of the world in which our immediate community is not a necessary lens to encode reality. As such, locality ceases to be the necessary center of one’s constructed world or the sole source of social experiences. The latitude of one’s social arena expands, but that social arena also distances itself from the immediate constraints of the local by enhancing connections to distant individuals and places. This expansion of one’s

social arena comes at the cost of weakening local relationships, which are ultimately restricted to a mere ‘backdrop’ for one’s experiences. Whereas face-to-face interactions and social ‘situations’ would traditionally guide appropriate behavior, media developments evolved to allow the overlapping of many previously distinct social spheres that ultimately altered the ‘situational geography’ of social life. Media affordances would support the dynamic change of social situations and roles of affiliation or ‘being’ (group identity), which overlap with roles of transition or ‘becoming’ (socialization), which then overlap with roles of authority (hierarchy). Meyrowitz (1986) argued that such affordances necessarily alter our relationship with a sense of place by supporting or undermining traditional relationships between physical locations and isolated information systems. Within this framework, new media is that which undermines the relations between physical isolation and social inaccessibility, potentially splintering or bonding different social groups into different or similar informational worlds. The link between space and communication was a central component in the establishment of communication studies in the United States, with communication as the transmission of information across space taking the lead in the establishment of the field in North America. Carey (1985) traced this epochal transformation to the invention of the telegraph, the first technological artifact to affect the relationship of distance to social interactions. Telegraph networks singlehandedly triggered the first break between information movement and physical movement. Transport and communication were no longer coupled, as messages need not be transported by hand. It suddenly became possible, albeit prohibitively expensive for daily communication, to move and deliver complex messages across space faster than any messenger (Pred, 1973).
The establishment of telegraph lines paved the way for harmonic telegraphy, with multiplex telegraph messages transferred over a single wire, and finally to telephone networks, after which differences in how information changed when accessed from distant places started to erode. With the relatively low cost of telephone calls from the 1940s onwards, telephone networks realized the decoupling of communication from transport foreshadowed by the telegraph. Much like the global development of network technologies 50 years later, telephone networks allowed for instant communication around the world, so that individuals no longer needed to travel over space to remain in contact (Cacciatore, 1994). This epochal transformation led early observers to note it could lead to the death of distance (Fischer, 1982), much like early observers of the internet anticipated the obsolescence of geography in communication (Cairncross, 1997). From the telegraph to telephone, from the automobile industry to low-cost flying, networked individuals were increasingly able to form communities outside their immediate neighborhoods (Axhausen et al., 2016). Naturally, the development of transportation and communication networks has not supplanted the effect that geographic proximity exerts on social relations, but rather transformed and broadened the pool of opportunities driving tie formation (Dijst, 2009). Meaningful relationships continue to thrive within short

geographic distances, which allow for tangible, inexpensive, and repeated interactions. Indeed, residential proximity was found to be one of the strongest predictors of how often friends see each other prior to the widespread use of online social networks (Verbrugge, 1983), and it continued to be a key predictor in the heyday of social media platforms (Tsai, 2006). Yet, even if the general consensus is that the likelihood of tie formation decreases with distance, little is known about the relevant features of this decline and how it varies over different spatial arrangements (Preciado et al., 2012). Carey (1985) contrasted communication as the transmission of information over space against the etymological affiliation of ‘communication’ and ‘community.’ In doing so, Carey (1985) foregrounded the sharing of common experiences that is central to the concept of place. Communication would encompass messages distributed over space as well as the maintenance of shared beliefs in time, with commonness, communion, community, and communication gravitating toward an idea of place. The first metaphor is eminently spatial, as communication is defined as transportation of information from sender to receiver that mirrors the movement of goods or people. The second metaphor is time-based and foregrounds the role of place, as communication is defined as a ceremony between participants coming together to share an experience, a situation in which nothing new is learned but in which a particular view of the world is portrayed and confirmed (Carey, 1985). Carey’s (1985) definition of communication as the transmission of information versus communication as the sharing of experiences within a community echoes the distinction between the patterns of social interaction in rural and urban areas. 
This distinction draws from the two basic types of social formation, Gemeinschaft (community) and Gesellschaft (association or society), described in the 19th-century sociological work of Tönnies (2012). The underlying assumption was that small settlements such as rural areas were based on intense patterns of social contact, whereas large groups such as urban areas were characterized by much more diverse and transient contact patterns, with comparatively fewer intense relationships due to the fragmentation observed in urban sprawls where home, work, and play were rarely spatially contiguous. Observed evidence of this dichotomy is, however, limited, with contrasting empirical examples of Gemeinschaft appearing within urban areas as well as in geographically dispersed online social networks (Johnston & Pattie, 2011). It was perhaps the work of Meyrowitz (1986) that more clearly charted the conspicuous role of the media in relation to physical boundaries and social places. It challenged the assumption that communication and travel are synonymous by arguing that the bond between physical and social places could be promptly altered by electronic communication binding people to information environments instead. By dissociating the physical and social place, it relegated the place to a backdrop for social events, and physical locations would no longer determine situations and behaviors that underpin the concept of place. With the merging of private and public spheres, strangers could suddenly be experienced as intimates and the relationship between physical location and information access would forever be

undermined. What would follow is the breakdown of distinctions between here/there, live/mediated, and personal/public. The splintering of these stages of socialization and the physical location was to be followed by the weakening of social places, with electronic media bypassing the isolating characteristics of place and blurring the difference between people at different stages of socialization. Similarly, the distinctions among group and nongroup membership would also be blurred, with the possibility of bypassing stages of controlled access to develop new synthesized social roles. The world as presented by the media could then offer a matrix of perspectives, including those originating from distant places and perspectives. Place and location would cease to be determinants of group identity and interaction, an idea expressed in Meyrowitz’s axiom of ‘no sense of place.’ In sharp contrast to the thesis advanced by Carey (1985), territorial control would cease to guarantee information control, as electronic media could suddenly offer previously isolated groups a form of social access that removed individuals socially and psychologically from their physical situations. But Meyrowitz remained ultimately aligned with the McLuhanian premise that new technologies abolished the relationship between place and space. His ‘Sense of Place’ details the shrinking of differences between listening to second-hand information, or an account given by someone else present at the time, and having live communication. Media would close the gap in the significance of physical presence in the experience of people and events, as one could join the audience of a social performance without being physically present or communicate directly with others without being in the same place.
For Meyrowitz, the physical structures that divided society into distinct spatial settings for interaction became increasingly less significant, with electronic media homogenizing places and experiences, a common denominator that would enthusiastically link individuals regardless of status and position. This is arguably a simplification of the relationship between mediated communication and geography, between place and space. Yet, Meyrowitz offers a seminal account of how isolated social settings could shape social interaction, and his situational analysis describes how electronic media interacts with social behavior. The central thesis of ‘Sense of Place’ is that the media reorganize the social settings by weakening the once-strong relationship between physical place and social place, a nexus that underpins not only group identifications but structural roles, including social positions and hierarchies. The assumption is that static social situations would give way to dynamic social situations, as physically defined settings would be replaced by social environments created by media and communication. Social roles would be performed and witnessed by new audiences and in new arenas, both of which would be independent from physical constraints. This thesis is never so vivid as in the examples advanced by Meyrowitz of how backstage areas emerge as a function of media. Literacy allows parents to spell words to each other to prevent their young and illiterate children from understanding what is being said. Similarly, two teenagers speak to each other on the telephone to forge a backstage that excludes those in their surroundings. Meyrowitz
argues these examples foreground how media create backstage areas that override physical distances. While the interplay between backstage areas and space is vividly portrayed in these examples, the assumption that such backstage areas emerge from the elimination of space and geography, or that they forge nonspatial placeness, seems largely inaccurate. This is not to say social situations can only occur face-to-face in set times and places but that the patterns of access to information and socialization do not bypass space even in the absence of it. The example of parents spelling out words to relay a secret message is one where a backstage is formed because of geographic propinquity: the children’s proximity to the conversation between parents triggers this form of cryptic communication. The second example, conversely, creates a backstage because of the geographic distance: the medium of the telephone is used to bridge geographic distances. The assumption that these backstage areas are similar in their relationship with space is imprecise, and so is the proposition that they remove the constraints of space; if anything, they negotiate the interference of space onto social relations. Space is not a hindrance that needs to be removed. It is a vector driving social interaction that is also structurally attached to social interaction, whether mediated or face-to-face. This is remarkable because Meyrowitz has otherwise displayed an acute spatial sensibility. ‘Sense of Place’ describes how the bond between physical place and social place pervaded oral and print cultures and changed the relative positioning of those in different places. The central role of places in the flow of information is conceptualized in the spatial and temporal constraints of any given place-situation: time and distance were measures of social insulation and isolation, with doorways to rooms and buildings setting the boundaries of inclusion and exclusion from situations in observable ways.
This remarkably nuanced account of media and space disappears when Meyrowitz describes how electronic media “lead to a nearly total dissociation of physical place and social place” (Meyrowitz, 1986: 115). This anthropological definition of space is closely aligned with Olwig and Hastrup (1997), who argued for a new sensitivity to the ways in which place is performed and practiced. Rather than describing or approaching place as a site, it involves viewing it as a field of relations where social networks can flourish. As such, place is not in opposition to space, as it is grounded on specific locations, but place foregrounds the connections between locations where the actors work, live, and play. This anthropological understanding of place allows ethnographers to start their research from a particular place and expand the boundaries of their object by following the connections that are rendered meaningful as the investigation progresses. In other words, place becomes a process of following connections as much as a cycle of inhabitance (Olwig & Hastrup, 1997).

2 FACE-TO-FACE AND ONLINE COMMUNITIES

The emergence of the internet and networking technologies altered the inexorable costs imposed by geographic distances on sharing information across communities. Individuals who previously sought to share content faced high costs of production and distribution, as with print, or contended for access to scarce resources, such as the electromagnetic spectrum required for broadcasting television and radio (Bastos et al., 2013). By challenging the monopoly enjoyed by mass media as primary channels for disseminating information under tight content control, networking technologies created a wealth of opportunities to coordinate and distribute information without editorial or curatorial oversight through decentralized and distributed networks (Castells, 2012). The work of Barry Wellman has made a fundamental contribution to understanding how technology has changed the spatial constraints in social networks. Curiously based in the same university where Marshall McLuhan developed his seminal theories, much of Wellman’s work is an empirically supported debate with McLuhan’s insightful probes, with the concept of ‘Global Village’ resonating with Wellman’s construct of ‘Community Liberated,’ and several of Wellman’s studies consisting of attempts to see what the global village looks like around the world (Wellman, 1999). Much of this work addressed the perennial tension between face-to-face and technology-mediated communication. Of particular interest is the study authored by Wellman and Potter (1999), where three types of communities are identified based on the extent to which they relied on face-to-face and phone contacts: lost, saved, and liberated. Individuals who lived near each other continued to have more frequent contact, but social technologies altered the notion of proximity in fundamental ways: most of the networks involving non-kin contacts covered large metropolitan areas rather than just the respondents’ immediate neighborhoods, which
is a considerable departure from previous accounts of the spatial dependencies of social networks of kin (Fischer, 1982). Despite the growing scholarship probing the social dynamics of online communities established in decentralized networks, or perhaps because of this otherwise purely academic enterprise, the elapsed effect of online activity on offline events remained a point of contention throughout this period (Zuckerman, 2014). Critics coined the terms ‘clicktivism’ and ‘slacktivism’ to refer to social media activity stemming from armchair politics with no follow-through beyond the computer screen, typical of the aimless political action supported by the affordances of online social networks (Morozov, 2011, 2013). Effecting change in the world would require high-risk and thick activism, thereby enforcing top-down, hierarchical communication processes with strong bonding factors among participants such as those observed in the Civil Rights Movements (Gladwell, 2010). This line of inquiry was ultimately dismissed by a string of empirical studies that identified, modeled, and estimated the magnitude of the effects between online and offline protests on location (Bastos et al., 2015; Cadena et al., 2015; De Choudhury et al., 2016; Vasi & Suh, 2016). Yet, the assumption that online activity could be inconsequential to events transpiring on the ground stems from the uncertainty about the extent to which virtual and social places, the concepts underpinning much of internet scholarship, are defined by spatial boundaries and the physical dimension of social networks. To put it another way, internet scholarship successfully established that online communities operate with a set of social rules equally complex to those observed offline, but it remained ambiguous about the extent to which the former influenced the latter or the other way around. 
This vagueness resulted from challenges in identifying the directionality of this relationship, but by avoiding the directionality problem in the homophilic relationship between online and offline relationships, it largely obscured the role of geography in online social networks. Space was therefore often treated as a mere backdrop where online social networks would emerge and flourish, notwithstanding the large body of scholarship evidencing the bidirectional association between geography and network formation as a significant driver of tie-selection and retention (McPherson et al., 2001). Similarly, there is ample evidence that geographic proximity affects tie-formation mechanisms associated with both opportunity and preferences, as physical places can be conceived of as a bundle of resources and opportunities with the additional characteristic of spatial contiguity (Glückler, 2007). Encounters with kin, but also colleagues and friends, are particularly constrained by space, not least because of the temporal cost and effort involved in covering large distances to carry out face-to-face interaction (Hägerstrand, 1982). Not only the creation but also the maintenance of social relationships is constrained by space (Johnston & Pattie, 2011). Fischer (1982) charted these dependencies of personal relationships and face-to-face communication in northern California communities. Although the surveys excluded large metropolitan areas like New York or Los Angeles, and no
predominantly black community was surveyed, respondents consistently reported their innermost social network to comprise 15–19 individuals. The network was defined by kin (40%), colleagues (10%), and neighbors (10%). Non-kin and nonwork-related neighbors were peripheral in the network, but they remained important to people’s contact circles. More importantly, local contacts, whether kin or non-kin, colleague or non-colleague, were a major component of the social networks, comprising 16 people on average living within five minutes’ drive of a respondent’s home. Regardless of the type of settlement (semirural, town, metropolitan, regional) and the number of local kin named, respondents had the same average number of contacts within a five-minute drive from their home (Johnston & Pattie, 2011). The spatial dependence in these networks was relatively uniform across communities. It was only when the type of the interpersonal contacts was considered that geography started to become less important. Near-neighbors were central to visiting and dining together, but less so to discussing a hobby or obtaining advice on important matters. Fischer (1982) concluded that the advantage of close associates declines in exchanges where distance is a marginal cost. While social interactions continued to depend on nearby contacts, the discussion of hobbies and personal matters and even the borrowing of money were less reliant on physical presence and could be carried out at a distance. These marginal constraints are similar to those observed in communities of interest that populate online social networks where individuals come together to discuss a common interest. But they differ in fundamental ways from groups discussing politics, attitudes, and behaviors, particularly politics and voting, which are more likely to depend on neighborhood circles.
It is the extent to which social groups formed online, with emerging patterns of attitudes and behavior, map to offline communities that was largely overlooked. Studies exploring this relationship suggested that online communities may incorporate remote strangers who are activated and incorporated as organic members of one’s social network (Rainie & Wellman, 2012). Associational effects have been reported in the literature, with the relationship between spatial distance and users’ interaction on social media found to be significant, and friendship ties in densely connected groups arising at shorter spatial distances compared with social ties between members of different groups (Laniado et al., 2017). More importantly, research found social ties on Twitter to be constrained by geographical distance with an over-representation of ties confined to distances shorter than 100 kilometers (Takhteyev et al., 2012). Other studies have suggested a spillover from in-person interaction patterns to online social networking sites, further problematizing the hypothesis about the direction of homophily (Bastos et al., 2018). The homophily model posits that individuals inhabiting physical communities are more likely to connect with others sharing similar social characteristics, so that cultural similarities and differences among people can be formalized as a function of geographic propinquity (McPherson & Smith-Lovin, 1987; McPherson et al., 2001). In other words, and contrary to the large body of scholarship extoling the virtues of the internet in bringing
together individuals with common but niche interests separated by disparate geographic communities, it may be the case that the communities forged online, however dynamic and niche they may be, are nonetheless attached to and dependent on tangible spatial boundaries that not only constrain these communities, but may indeed structure them. These relationships are difficult to pinpoint and speak to the limited vocabulary available to address that which is social but formed largely online and that which is social but shaped largely offline. Similarly, while some types of networks (e.g., railway network, road and highway network, communication infrastructure) are properly referred to as physical networks, human social relations are necessarily cognitive. To say it another way, these relationships exist within people’s internal worlds, or to put it in social systems’ terms, within the internal boundaries of the psychological systems (Luhmann, 2005), which are not physically visible. While offline communities certainly antecede online communities, the mutual and elapsed effects between online and offline communities added further complications to the directionality problem in homophilous relationships. There are of course marginal cases in which it may not be possible to extricate offline and spatial social networks from online communities. These necessarily include communities whose members are evenly connected to each other both online and offline, though these networks are likely to emerge only infrequently (Subrahmanyam et al., 2008). The typical overlap between established ties online and offline is likely more nuanced, thereby putting the boundaries of offline social networks and physical social networks to the test. 
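The distance constraint on online ties discussed above, such as the reported over-representation of Twitter ties shorter than 100 kilometers, can be made concrete with a small calculation. The following is only a minimal sketch and not the procedure used in the studies cited: it assumes each user has a single known latitude and longitude, uses the haversine great-circle formula, and reports the observed share of ties under a threshold without comparing it against any baseline. The names `haversine_km`, `share_of_short_ties`, `locations`, and `ties`, and the four users themselves, are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def share_of_short_ties(ties, locations, threshold_km=100.0):
    """Fraction of ties whose endpoints are closer than threshold_km."""
    if not ties:
        raise ValueError("at least one tie is required")
    short = 0
    for u, v in ties:
        if haversine_km(*locations[u], *locations[v]) < threshold_km:
            short += 1
    return short / len(ties)


# Hypothetical users with approximate city coordinates (illustrative only).
locations = {
    "ana": (51.5074, -0.1278),   # London
    "ben": (51.7520, -1.2577),   # Oxford
    "cai": (40.7128, -74.0060),  # New York
    "dee": (51.4543, -0.9781),   # Reading
}
ties = [("ana", "ben"), ("ana", "cai"), ("ben", "dee"), ("cai", "dee")]

print(share_of_short_ties(ties, locations))
```

Establishing over-representation, as opposed to a raw share, would additionally require a reference distribution, for instance the distances expected if ties were formed at random among the same users.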
Phone networks, or the network derived from who talks to whom on the phone, can be defined as online networks, though the physical dimension of communication infrastructure and financial burdens imposed by long-distance calls before the availability of voice over IP (Mok et al., 2010) are important physical restrictions that impinge on such networks. Indeed, the cost associated with long-distance links in spatial networks inhibits the emergence of high-degree nodes (hubs) and prevents the rise of long-tailed degree distributions (Liben-Nowell et al., 2005). Other borderline examples include the relationship one has with parents living thousands of miles away. What presumably was a network with palpable physical and spatial boundaries may transition to a place that is largely virtual. This is a particularly interesting case because while one may communicate with one’s parents online, they invariably fall into a different category compared with one’s Twitter followership. Indeed, the perennial tension between spatial and social ties dates to the seminal studies on dining-table partner choices coded as a sociogram by Moreno (1953), a problem that can just as easily be translated to the seating choices of students in a classroom (geographic) in relation to the friendship ties (social) of these individuals. But there is a large universe of online communities that function without any corresponding offline social network, or whose online dynamics are objectively different from those observed on location. The tensions between drill groups, a bleak style of rap music whose songs are disseminated on YouTube, are one such
example. Songs directly threaten rival groups, who similarly respond on social media platforms like Snapchat, Instagram, or YouTube, in an escalating bravado contest that may spill over to the real world. Singer and Brooking (2018) described how online communication can spur violence, which starts not with conflicts on location but with insults hurled online; these have grown to include a number of specialized slang terms such as ‘Facebook drilling’ and ‘wallbanging.’ Observers have sounded a note of caution on the glamorization of violence in drill videos revolving around killing, shooting, and stabbing, which can be acted out and amplified in the real world (Fraser, 2017). This assumption led to a debate over the extent to which drill music is a reflection of the conditions of individuals rather than a creator of them. Much as in the case of videogames, the evidence linking the music and the crime is largely circumstantial, and studies have flagged other possible causes, including access to drugs and cutbacks to social services in the aftermath of countrywide austerity measures (Ilan, 2015). Moreover, studies exploring the network structure of urban gangs have found rivalries in both adjacent and non-adjacent turf, with proximity being a driving factor in institutionalized violence (Papachristos et al., 2013; Radil et al., 2010; Schaefer, 2012). Observers have also noted that physical violence triggered by online activity extends beyond metropolitan areas, with Facebook reportedly acting as an accelerant to violence and playing a determining role in fomenting attacks in Myanmar, Sri Lanka, and beyond (Miles, 2018; Zuckerberg, 2017). Another consequence of ubiquitous online communities and social platforms is the marked decline, often leading to the extinction, of local newspapers, which ironically are found to be more trusted by readers than are the national media. 
The falling levels of trust in the press and the declining fortunes of local newspapers paved the way to a media ecosystem without trusted sources (Mak, 2018; Vosoughi et al., 2018). This is an ecosystem where local newspapers were decimated and readers were comparatively more vulnerable to disinformation and low-quality information, and likewise more likely to engage with content that verges on conspiracy-theorizing or propaganda (Zuckerman, 2017). Conversely, news coverage by national newspapers is leveraging social network information and users’ self-identified preferences to offer location-based, personalized news coverage—a development that may further erode trust in the press. This gerrymandering of news offers geofragmented frames to geotargeted readers in ways that are distinctly different from the traditional nested models of local news offices and national news organizations. Whereas local news desks provided context and details to stories relevant to local audiences, and smaller outlets devoted additional resources to cover specifically local stories, the use of geolocation-targeted push notifications can detach news stories from the broader coverage that oriented the nested models of local and national news desks. News outlets may continue to offer frames for local and national audiences that together point to the same story, but there are no guarantees that readers will follow the story and their exposure to the news may well be restricted to the tailored frame. As such, the
combination of decentralized news making and the geotargeting of readers offers the potential to manipulate, however unintentionally and subtly, the information environment where news is consumed, with the potential for exacerbating social and geographic fragmentation supported by different frames that nonetheless link to the same story (Sanfilippo & Lev-Aretz, 2017). The conditions for online communities to trigger dynamics that deviate from those observed in face-to-face communities are likely to grow with the algorithmization of community governance implemented by social platforms. Smartphones and internet-connected devices continuously store individual trace data in the cloud. This data is imperfectly processed by machine-learning algorithms of limited precision and recall, trained with incomplete datasets, and tailored for purposes that are indistinguishable from those of the company collecting the data. By aggregating digital trace data at individual and group levels, social platforms can offer advertisers granular targets benchmarked with incomplete data and limited precision. These imperfections cause misclassifications and malfunctions largely understood as acceptable downsides of the social media business because imprecisions measured at the individual level are offset at the group level. But these imprecisions and incompleteness are then built into the system recursively, so that social groups identified by and modeled with digital trace data can take up a life of their own on social platforms, with niche subcommunities that only partially meet their physical counterparts. This asymmetry between face-to-face and online communities is never so clear as when botnets are activated. Due to the high overhead involved in growing sockpuppet accounts, botnets are often set up so that bots retweet and comment on other bots’ messages, thereby triggering message cascades read by absolutely no actual person. 
Vast portions of the social media supply chain are conceivably colonized by automated posting protocols and quasi-autonomous parasites. Disinformation campaigns and bot masters alike often resort to zombie accounts, promptly revamped with stolen profile photos of implausibly beautiful young women (Bastos & Farkas, 2019), and yet bot detection remains far from being an exact science, as neither human annotators nor machine-learning algorithms can perform the task particularly well (Varol et al., 2017; Rauchfleisch & Kaiser, 2020). The universe of subcommunities and niche groups where online interaction has given birth to forms of life radically different from those observed in face-to-face interaction is not restricted to automated posting protocols. Scammers continuously search for keywords for which the available relevant data is limited, non-existent, or problematic, thereby providing an opportunity to push content to Google top ranking results (Golebiewski & Boyd, 2018). Such ‘data voids’ are similarly exploited on YouTube and Amazon. The former is populated by algorithms that generate video product reviews by collating relevant photos indexed by search engines and the audio of reviews using automated text-to-speech. Amazon is similarly full of bogus books created by scammers from scratch, compiling and modifying text from other books and Wikipedia to take advantage of loopholes in Amazon’s compensation structure (Farrell, 2018).
Twitterbots are an interesting, if extreme, case of complete departure from face-to-face communication. These can vary in sophistication from simple automated accounts relaying information sourced from websites or retweeting other bots to sophisticated networks combining human and automated accounts designed to invade or take by storm public conversations. Farrell (2018) argues that the world of communication and interaction at a distance is increasingly filled with algorithms that appear human but are not, with the hacking of the dating website Ashley Madison offering a cautionary tale. The data leaked on the internet showed that several thousand women on the website were, in fact, ‘fembots’ programmed to send welcoming messages to deluded male customers who conceivably believed themselves to be surrounded by many potential sexual partners. The world of real users and the world of automated protocols feed from each other, and it is difficult to tell where one ends and the other begins, with rumors spread by Twitterbots merging into other rumors about the ubiquity of Twitterbots. Perhaps more importantly, the distinction between automated and supervised information warfare was often absent in public deliberations. Surpassing bots in complexity and capillarity in the communities in which they operate, supervised accounts (e.g., trolls) were pivotal in the successful disinformation campaign led by the Kremlin-linked and St Petersburg-based Internet Research Agency. This campaign relied primarily on supervised accounts operating on Facebook (US District Court, 2018), a sharp contrast to the desolate life of Twitterbots communicating with each other and having modest impact outside their bubbles. The contrast between the two covert strategies raises topical questions about human-driven, curated, and supervised high-volume posting versus automated, unsupervised, and scripted machine bots.
Taken together, they encapsulate a form of political action that represents a considerable departure from face-to-face interaction without being a mere extension of such interaction on digital media (Massanari, 2015). While bots and trolls continue to be described in the press as analogous enterprises undermining and reshaping political campaigning, there are fundamental differences. These were first identified in studies about serial activists, users who exhibited extraordinary levels of posting activity combined with a savvy strategy for activating opinion leaders such as journalists while at the same time assisting activists to coordinate across national boundaries and protest sites (Bastos & Mercea, 2016; Mercea & Bastos, 2016). This pattern of activity suggested a complex modality of engagement that linked actions online and onsite at multiple protest locations, an astute modus operandi ultimately repurposed for sophisticated covert influence operations. This is in line with early reports of the effective disinformation campaign led by the Internet Research Agency, whose operatives galvanized partisan communication online and agitated for rallies across the United States, often contacting campaign staff members in various US states and appealing to individuals to take their grievances to the streets (Shane & Mazzetti, 2018).

3 FROM GLOBAL VILLAGE TO IDENTITY TRIBES

The Toronto School of Communication, particularly in the works of Eric Havelock and Harold Innis, offered a research program that explored communication networks in relation to the geographic boundaries where these networks emerged. But it was the later work of Marshall McLuhan that came to be primarily associated with the school, particularly the axiomata about the media. The first axiom stated that the mass media were extensions of man, as technology enhanced the physical and nervous systems of individuals and increased information-processing capacity. The second axiom asserted that the medium is the message, as the meaning of a message is ultimately affected by the symbiotic relationship between the communication platform and the content being communicated (McLuhan, 1962). But it was the third axiom that resonated with those envisaging global networks: that communication networks would bring forth a ‘global village,’ a term purposely coined as a contradiction to foreground the seamless integration of villages into a global community. Electronic or digital media would shrink the world and reshape it into a single village by moving information instantaneously from multiple locations at the same time. Network and telecommunication technologies would increase the density of connections within and across social clusters, thereby integrating geographic and cultural areas into a village that stretched across the globe (McLuhan, 1964). In short, global network infrastructure would change the balance between communication and spatial distance and put into effect McLuhan’s vision of a global village. While this vision exudes the formulaic optimism of the 1960s, it marked much of the discussion about the internet in the 1990s, when digital communication was thought to bring the world together, both geographically and politically. 
During the late 1990s, particularly in the second half of the decade that led to the dotcom bubble, technology pundits and observers forecast that the impact of distance would be progressively diminished by communications technology (Cairncross,
1997), notwithstanding studies reporting that geographic proximity remained a critical factor in building relationships and that the negative impacts of distance on cooperation were only partially mitigated by network technology (Kiesler & Cummings, 2002). In other words, the term ‘global village’ epitomized the shrinking of the world into one village through the use of digital media. Since its prescient formulation by McLuhan (1964), the metaphor was popularized to explain the internet, where physical distance is even less of a constraint on the communicative activities of users. Boyd (2008a) argued that the global village metaphor continues to describe effectively how digital communication empowers personal relationships across vast geographic and cultural differences. Instead of initiating relations with strangers, digital communication tools are reportedly used primarily to maintain relationships with people in close physical and social proximity. Friendster, a seminal social networking site predating Facebook, simply provided a tool for scaling up social networks rooted in proximate social relations and representing this dynamic to the community. But the scaling of social networks to online platforms is not a perfect mirror, if it is a mirror at all, of social relationships established in one’s immediate surroundings. Boyd (2008a) noted that the social graph of Friendster users with numerous common ties offered a good indication of severed relationships. Whenever user A and user B shared many friends in common, but were not friends themselves, there was a good indication that this was due to a severed personal connection, not a social opportunity. This dynamic contrasts with the dynamics of social relations observed offline, where exes cannot be simply deleted from one’s life while also maintaining the social network that supported their previous relationship. 
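The Friendster observation above, that two users who share many friends without being friends themselves may point to a severed tie, can be expressed as a simple query over a friendship graph. The sketch below is an illustration only and not Boyd's procedure: it assumes an undirected graph stored as a dictionary of friend sets, and the function name `candidate_severed_ties`, the `min_common` threshold of three, and the five-user example are all hypothetical choices.

```python
from itertools import combinations


def candidate_severed_ties(friends, min_common=3):
    """Return (user_a, user_b, n_common) for pairs that are NOT friends
    but share at least min_common friends, most shared friends first.

    friends: dict mapping each user to the set of that user's friends
             (assumed symmetric, i.e. an undirected network).
    """
    candidates = []
    for a, b in combinations(sorted(friends), 2):
        if b in friends[a]:
            continue  # already connected, so not a candidate
        common = friends[a] & friends[b]
        if len(common) >= min_common:
            candidates.append((a, b, len(common)))
    # Sort by number of shared friends (descending), then by names.
    return sorted(candidates, key=lambda item: (-item[2], item[0], item[1]))


# Hypothetical five-user network: A and B share C, D, and E but are not linked.
friends = {
    "A": {"C", "D", "E"},
    "B": {"C", "D", "E"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
    "E": {"A", "B"},
}

print(candidate_severed_ties(friends))
```

A high count of shared friends is at most an indication. The same pattern is what friend-recommendation features usually treat as a social opportunity, which is why the interpretation depends on context that the graph alone does not supply.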
In other words, the Friendster network was not merely mirroring offline social networks but creating a disparate version with parallel albeit adjoining rules of engagement (Boyd, 2008a). This persistent McLuhanian account of online social networks informed much of early internet research where social platforms were framed as a window to social contexts and local communities. This vision offers a diametrically different account to the set of problems that appeared in the past decade, with a new set of metaphorical tropes equally inspired by geography but emphasizing identity and tribalism instead of integration and cooperation. Both sets of metaphors are spatially inspired but suggest opposing narratives to describe the online realm: the first foregrounds communication and collaboration, and the second highlights polarization and division. Common to both narratives is the foregrounding of real-world consequences either by supporting cooperation or by ripping apart the social fabric of society. Much of internet scholarship in the late 1990s and early noughties was dedicated to discussing and critiquing metaphors and concepts derived from these overarching narratives. But the public discourse continued to approach online realms as forces either liberating or driving social anxieties in offline quarters. This binary framework is encapsulated in common internet fallacies rooted in the assumption that the online and offline worlds either are secluded from each other or mirror each other, with scant consideration for asymmetric interactions. These fallacies have lingered
in the public discourse and include the notions that what happens online stays online, that the internet democratizes participation, that the internet enables a collective intelligence, that online trolls are just insecure people, that cyberbullying happens to people with no life outside the internet, and that cryptocurrencies have no intrinsic value. These widely held assumptions are driven by the belief that online and offline realms are self-contained universes or, inversely, that no significant difference exists between these two modalities of social interaction. These metaphors also refer to two milestones in the narrative of networking technology: first the technology was perceived and conceptualized as a force for integration, only to be subsequently redefined, owing to internet affordances allowing large-scale disinformation campaigns, as a force for polarization. This epochal change was arguably triggered by the deployment of data-driven microtargeting in political campaigning epitomized by the Cambridge Analytica data scandal and the ensuing data lockdown enforced by social media platforms. Since then, digital trace data has been increasingly linked to disinformation, misinformation, and state propaganda across Western industrialized democracies and countries in the Global South, where state and non-state actors seek to strategically diffuse content that heightens partisanship and erodes the general trust in democratic institutions (Walker et al., 2019). Influence operations weaponizing social media emerged across the world, with prominent examples including the 2016 US elections, the UK EU membership referendum, and the 2017 general elections in France (Bastos & Mercea, 2019; Ferrara, 2017; Shao et al., 2017). 
The evolving disinformation landscape required the adoption of specialized vocabulary associated with influence and disruptive operations to describe a set of media practices designed to exploit deep-seated tensions in liberal democracies (Bennett & Livingston, 2018). The effectiveness of dis/misinformation campaigns depended in part on the ability to take advantage of the biases intrinsic to social media platforms (Comor, 2001; Innis, 2008), particularly the attention economy and the social media supply chain relying on viral content (Jenkins et al., 2012) for revenue generation. This backdrop of influence operations and information warfare presents a considerable departure from years of euphoric rhetoric praising the democratization of public discourse brought by networking technology and social media platforms (Howard & Hussain, 2013). Early scholarship extolling the potential of social media for democratization and deliberation inadvertently reinforced a narrative championing communication and collaboration as expected affordances of social platforms (Loader & Mercea, 2011). By the end of the decade, however, the narrative surrounding social platforms rapidly turned to metaphors foregrounding polarization and division in a landscape marked by tribalism and information warfare (Benkler et al., 2018), enabled by a business model driven above all by the commodification of digital circulation and its capitalization on financial markets (Langley & Leyshon, 2017). Scholarship on this hybrid media ecosystem (Chadwick, 2013) explored the technological affordances and ideological leanings that shape social media
interaction, with a topical interest in the potential for civic engagement and democratic revitalization (Zuckerman, 2014). Bennett and Segerberg (2013) expanded on Olson’s seminal work on the logic of collective action to explain the rise of digital networked politics where individuals would come together to address common problems. Benkler (2006) convincingly argued that network technology democratized communications and enhanced civic participation. Similarly, Castells (2009) described a global media ecology of self-publication and scalable mobilization that advanced internet use and political participation (Castells, 2012). Open platforms and unrestricted access offered the cornerstone of networked publics that reconfigured sociality and public life (Boyd, 2008b). The relatively open infrastructure of networked publics was also explored in scholarship detailing how online social networks supported gatewatching (Bruns, 2005) and practices in citizen journalism that are central to a diverse media ecosystem (Hermida, 2010), with citizens auditing the gatekeeping power of mainstream media and holding elite interests to account (Tufekci & Wilson, 2012). By most assessments, social network sites were welcoming challengers to the monopoly enjoyed by the mass media (Castells, 2012), with only limited attention devoted to the opportunities offered to propagandists, who could similarly coordinate and organize disinformation campaigns through decentralized and distributed networks (Benkler et al., 2018). These developments challenged the very notion of networked publics and Castells’ (2012) depiction of the internet as universal commons. And yet, the transition from narratives emphasizing open communication to concerns about information warfare was neither immediate nor inconsequential. With mobile platforms slowly replacing desktop-based applications, open standards gave way to cloud-based, centralized communication systems epitomized by social media platforms. 
Social technologies gradually pivoted from a business model centered on software and services to the leasing and trading of user data. These changes endangered the openness of networked publics, with the debate underpinning networks in the late 1990s being replaced by a focus on the affordances of mobile apps and social platforms, whose user base largely differs from living communities of users who would come together around common interests. Also noticeable in the transition from networked publics to social platforms was the increased commercialization of previously public, open, and often collaborative spaces, which were increasingly reduced to private property. This infrastructural transformation of the networked publics continues to drive anxieties about social media platforms in the aftermath of the Cambridge Analytica data scandal, including topical issues of digital privacy, data access, surveillance, microtargeting, and the growing influence of algorithms in society. Indeed, social platforms built much of their social infrastructure on the back of networked publics and the community organization that shaped internet services in the early 1990s. The drive toward community formation remained an important component of social media
platforms. On 27 February 2017, the CEO of Facebook, Mark Zuckerberg, wrote in a note on his Facebook page titled ‘Building Global Community’:

History is the story of how we’ve learned to come together in ever greater numbers—from tribes to cities to nations. At each step, we built social infrastructure like communities, media, and governments to empower us to achieve things we couldn’t on our own. . . . Today we are close to taking our next step. Our greatest opportunities are now global. . . . Progress now requires humanity coming together not just as cities or nations, but also as a global community. (Zuckerberg, 2017, emphasis added)

Zuckerberg’s vision simultaneously highlighted and projected the end of open networked publics. Social platforms cannot be separated from the user communities that populate them. Yet, the business model of social platforms precludes user governance while also extracting user data from community interactions (i.e. transactions among members) that can be monetized (Dijck, 2013; Fernback, 2007) by a lucrative advertising business. The latter governs group interaction and individual experience alike through a set of intricate learning algorithms (Bucher, 2017). These algorithms rely on users as ‘affective processors’ who interpret and help govern the communities through shares, likes, retweets, and pins (Gehl, 2011; Lomborg & Kapsch, 2019). However opaque to users, algorithms generate knowledge about users beyond their immediate interactions, thereby triggering further interactions and ‘imaginaries of interaction’ such as user theories about what the algorithm is or ought to be (Bucher, 2017). This represents a considerable departure from the landscape shaped by online communities in the mid-1990s and early noughties where members would share their experiences. 
The meteoric rise of social platforms, particularly the behemoth Facebook, came with the promise of a wider audience that successfully pulled members away (and apart) from online communities that evolved from forums and e-zines on Bulletin Board Systems. The promise of a wider audience came at the cost of a dwindling sense of alterity and community. The ensuing algorithmization of communities introduced and championed by social platforms completed the transition by instantaneously rendering networked publics into a profitable source of users’ interactions (Lingel, 2017). Perhaps unintentionally, the transference of community governance from users to algorithms removed a key basis for mutual trust and opened the way for large-scale disinformation campaigns that conspicuously plagued election cycles, ethnic relations, and civic mobilization from 2016 onwards (Apuzzo & Santariano, 2019). By Facebook’s own account (Weedon et al., 2017), its advertising algorithms were harnessed to segment users into belief communities that could be microtargeted with materials that amplified their intimate political preferences. This repurposing of intimate knowledge and networked interaction for revenue-making remained
the corollary of commercial social media enterprises, including the individuals and academics involved in the infamous political consultancy firm Cambridge Analytica (Rosenberg, 2018). The tendency of social media users showing a preference for a subset of content that is at odds with the coverage of newspapers was already apparent before social media became a primary channel for news consumption (Bastos, 2015), but the social algorithmization of news certainly worsened its effects. Benkler et al. (2018) argued that it was Facebook algorithms—more than Facebook communities or specific malicious actors distributing problematic content—that rewarded clickbait websites and tabloid-like sources of information, which often include hyperpartisan content. The algorithmization of social media communities was particularly damaging because it reinforced patterns of interaction and the sharing of content in tightly clustered communities that supported and likely reinforced the relative insularity of users. For Benkler et al. (2018), concerns over the Facebook News Feed algorithm in particular, and over algorithmic shaping of reading and viewing habits in general, are not only legitimate but likely underplayed in the aftermath of rampant disinformation campaigns that leveraged social platforms’ algorithms. This account of social platforms is a considerable departure from the heyday of the internet as a force of liberation. Propaganda efforts led by the Internet Research Agency, a ‘troll factory’ reportedly linked to the Russian government (Bertrand, 2017), weaponized social platforms to meddle in national elections in Western democracies. Since then, the record of demonstrable falsehoods shared on social platforms with real-world consequences has increased steadily. 
Facebook grew more proactive in Myanmar after the United Nations and Western organizations accused it of having played a role in spreading the hatred and disinformation that contributed to acts of ethnic cleansing (Miles, 2018). Narratives of the internet as a community, global or otherwise, were rapidly superseded by metaphors of the internet as a ‘tribe,’ with the meanings associated with community (i.e., identification, communication, and collaboration) being likewise replaced by terms addressing the hostility between tribes: ‘polarization,’ ‘weaponization,’ and ‘nationalism.’ Mark Zuckerberg, Facebook’s CEO, insisted on the path of greater connectivity and ignored the reverse course in his call to the Facebook community, melancholically titled ‘Building Global Community,’ with a suitable reference to tribes, cities, and nations. The missive read much like a reality check for a company that assumed greater interconnection between users would necessarily bring about greater understanding among people in real-world communities. The letter exudes a Silicon Valley feel-good vibe about progress and humanity coming together, not just as cities or nations but as a global community, with scant detail about how the company’s social platform interacts with communities and traditional institutions to fragment our shared sense of reality (Zuckerberg, 2017). The Facebook CEO’s vision of a globally connected world grossly underestimates the extent to which social life is marked by contradictions, swiftly and demonstrably amplified as online and offline social networks collapse into a
common contextual ground. As one’s local experiences are intertwined with global networks, the various paradoxes piecing together social structures and the human experience become all too apparent. In Zuckerberg’s worldview, cooked in a bubble circumscribed by prep school, Harvard, and Silicon Valley, communication and human connectivity can only be conceptualized as inherently positive outcomes even when Facebook finds itself entangled in nefarious incidents, including genocide in Myanmar, religious terrorism in Sri Lanka, and vaccine hesitancy around the globe. This worldview is seemingly unaware of the dialectical tensions between community (Gemeinschaft) and society (Gesellschaft) charted by Tönnies (2012) and the inherent tension between individuals and group members; such tensions are the source of much cruelty and oppression, but also of comfort and kindness. The permanent impetus toward greater connectivity leaves no room for this fundamental and inherent contradiction in human experience. The disconnect between online activity and offline interaction is unambiguously expressed in algorithm-driven newsfeeds underpinning the business model of social platforms like Facebook. These algorithms are parametrized on metrics that maximize user engagement, thereby tapping into primal emotions, such as anger or fear, that scramble users’ perceptions of reality while being oblivious to the real-world repercussions of algorithmic filtering (Ananny & Crawford, 2018). Within this closed environment of algorithms that are agnostic to hatred and vitriol, reality-distorting misinformation can flourish on social platforms by consciously and reliably tapping into users’ darkest impulses and polarized politics. The substitution of real-world community leaders that emerged with the first wave of online communities with algorithms automating the management of social interaction online removed the underlying nexus negotiating the expression of identities online and offline. 
As social platforms began to scale up operations to cater for an increasingly larger user base, the flux of information within online communities was for the first time managed by algorithms dedicated to maximizing engagement, which often translated to maximizing conflict. The rapid deployment of this algorithmic network infrastructure led to a remarkable disconnect between social groups and undermined the fragilely woven fabric of society. The rise of network propaganda embedded in social platform affordances, along with the 2016 election cycle that placed Trump in the White House and brought a near-impossible Brexit to the UK, led the technorati to embrace dystopian narratives that described current events with terms such as ‘Darkest Timeline,’ a reference to the theory that there are multiple universes outside of our own and that we live in the worst possible universe of them all. The Darkest Timeline anecdote foregrounds a split in consensus reality perceived as cognitive dissonance in the cultural and political landscape. This perception is accompanied by a substantive uptake in conspiracy-theorizing (Uscinski, 2018), chief among these being the QAnon metanarrative, which knits together contemporary politics and racist tropes, positioning ‘the people’ against globalist elites it refers to as ‘The Cabal,’ a force that traffics in pedophilia, blood sacrifice, Satanism, and other attention-getting transgressions. Similarly, anti-vaccination
conspiracy-theorizing has rapidly evolved into a cult where members feel an obligation to share the truth with their neighbors and significant others. The economics of social capital underpinning real-world communities drives much of the activity in these loosely knit communitarian narratives, which embrace the participatory nature of the contemporary internet, where storytelling is built upon decentralized fan fiction spiraling within closed universes of mutually reinforcing interpretations (Zuckerman, 2019). Despite its limited overlap with consensus reality, conspiracy-theorizing such as QAnon narratives has successfully found footholds in the offline world. ‘Q’ T-shirts appeared recurrently in Trump reelection rallies during 2019 and 2020 and culminated in the violent storming of the United States Congress on 6 January 2021, when supporters featured Q paraphernalia, carried signs, and celebrated the theory. QAnon surfaced in political campaigns, criminal cases, merchandising, and college classes. The book QAnon: An Invitation to a Great Awakening, written by QAnon followers and supporters, peaked at #2 on Amazon’s list of bestselling books in early 2019. QAnon supporters were often regular citizens who found in Q’s messages a source of partisan energy that confirmed their suspicions about powerful institutions. Many were senior or elderly users who came across the theory through partisan Facebook groups or Twitter threads (McIntire & Roose, 2020). The ease of information sharing supported by social platforms not only allowed content to become untethered from physical communities; it also allowed content untethered from reality to penetrate real communities at scale and speed. There is another way of accounting for these epochal changes from a different theoretical perspective. 
Journalism and information studies highlight that the traditional gatekeepers that used to supervise the flow of information were bypassed by the near-limitless reach of social platforms, whose online social networks are not necessarily dependent or reliant upon offline social norms or communities. In short, these platforms cannot be mere extensions of the offline world if they often perform as tools that can make or break the fabric of society. While there is ample evidence that network externalities arising from one’s social network online and offline impinge on our shared sense of reality, the question about the directionality of the influence remains to be settled, even if much evidence points to the bidirectional association between geography and network formation as a significant driver of tie-selection and retention. In the end, the brief history of online social networks could be summarized in terms of radical epistemological turns, personified at first by the metaphor of a global village, only to be subsequently superseded by a horizon of tribalism and information warfare. These narratives are not value-neutral and foreground a discursive field that is consequential to the implementation of social technology (Gunkel, 2018). While both metaphors are spatially inspired, they advance fundamentally conflicting narratives: one foregrounds communication and collaboration, and the other polarization and division. As mutually exclusive discourses, they project inconsistent and disjointed physical places resulting from the weaving of network technology into the fine textures of the physical world, epitomized
by social platforms but also including mobile, Internet of Things, biometrics, and other technologies that generate relational data but whose treatment through the traditional framework of network analysis is at best imperfect.

References Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 973–989. Apuzzo, M., & Santariano, A. (2019). Russia is targetting Europe’s elections. So are far-right copycats. The New York Times. www.nytimes.com/2019/05/12/world/europe/russianpropaganda-influence-campaign-european-elections-far-right.html Axhausen, K. W., Urry, J., & Larsen, J. (2016). The network society and the networked traveller. In W. Saleh & G. Sammer (Eds.), Travel Demand Management and Road User Pricing (pp. 109–128). Farnham and Burlington, VT: Ashgate Publishing. Bastos, M. T. (2015). Shares, pins, and Tweets: News readership from daily papers to social media. Journalism Studies, 16(3), 305–325. doi: 10.1080/1461670x.2014.891857 Bastos, M. T., & Farkas, J. (2019). “Donald Trump is my President!”: The Internet Research Agency propaganda machine. Social Media + Society, 5(3). doi: 10.1177/2056305119865466 Bastos, M. T., & Mercea, D. (2016). Serial activists: Political Twitter beyond influentials and the twittertariat. New Media & Society, 18(10). doi: 10.1177/1461444815584764 Bastos, M. T., & Mercea, D. (2019). The Brexit Botnet and user-generated hyperpartisan news. Social Science Computer Review, 37(1), 38–54. doi: 10.1177/0894439317734157 Bastos, M. T., Mercea, D., & Baronchelli, A. (2018). The geographic embedding of online echo chambers: Evidence from the Brexit campaign. PLOS One, 13(11), e0206841. doi: 10.1371/journal.pone.0206841 Bastos, M. T., Mercea, D., & Charpentier, A. (2015). Tents, Tweets, and events: The interplay between ongoing protests and social media. Journal of Communication, 65(2), 320– 350. doi: 10.1111/jcom.12145 Bastos, M. T., Raimundo, R. L. G., & Travitzki, R. (2013). Gatekeeping Twitter: Message diffusion in political hashtags. Media, Culture & Society, 35(2), 260–270. doi: 10.1177/0163443712467594 Baym, N. K. (1999). 
Tune in, Log on: Soaps, Fandom, and Online Community (Vol. 3). London: Sage. Benedikt, M. L. (1991). Cyberspace: First Steps. Cambridge: MIT Press. Benkler, Y. (2006). The Wealth of Networks: How Social Production Transforms Markets and Freedom. London: Yale University Press. Benkler, Y., Faris, R., & Roberts, H. (2018). Network Propaganda: Manipulation, Disinformation, and Radicalization in American Politics. Oxford: Oxford University Press. Bennett, W. L., & Livingston, S. (2018). The disinformation order: Disruptive communication and the decline of democratic institutions. European Journal of Communication, 33(2), 122–139. Bennett, W. L., & Segerberg, A. (2013). The Logic of Connective Action: Digital Media and the Personalization of Contentious Politics. Cambridge: Cambridge University Press. Berry, D. M., & Fagerjord, A. (2017). Digital Humanities: Knowledge and Critique in a Digital Age. London: John Wiley & Sons. Bertrand, N. (2017, October 30). Twitter will tell Congress that Russia’s election meddling was worse than we first thought. Business Insider.

46 Local and digital

Bodenhamer, D. J., Corrigan, J., & Harris, T. M. (2010). The Spatial Humanities: GIS and the Future of Humanities Scholarship. Bloomington: Indiana University Press. Bollini, L. (2017). Beautiful interfaces. From user experience to user interface design. The Design Journal, 20(sup1), S89–S101. doi: 10.1080/14606925.2017.1352649 Boyd, D. (2008a). None of this is real: Identity and participation in Friendster. In J. Karaganis (Ed.), Structures of Participation in Digital Culture (pp. 132–157). New York, NY: Social Science Research Council. Boyd, D. (2008b). Taken Out of Context: American Teen Sociality in Networked Publics. Berkeley: University of California. Bruns, A. (2005). Gatewatching: Collaborative Online News Production (Vol. 26). Bern: Peter Lang. Bucher, T. (2017). The algorithmic imaginary: Exploring the ordinary affects of Facebook algorithms. Information, Communication & Society, 20(1), 30–44. Cacciatore, M. A. (1994). America Calling: A Social History of the Telephone to 1940. Berkeley: University of California Press. Cadena, J., Korkmaz, G., Kuhlman, C. J., Marathe, A., Ramakrishnan, N., & Vullikanti, A. (2015). Forecasting social unrest using activity cascades. PLOS One, 10(6), e0128879. doi: 10.1371/journal.pone.0128879 Cairncross, F. (1997). The Death of Distance: How the Communications Revolution will Change Our Lives. Boston, MA: Harvard Business School. Carey, J. (1985). Communication as Culture: Essays on Media and Society. Boston, MA: Unwin Hyman. Castells, M. (2009). Communication Power. Oxford: Oxford University Press. Castells, M. (2012). Networks of Outrage and Hope: Social Movements in the Internet Age. Cambridge: Polity Press. Chadwick, A. (2013). The Hybrid Media System: Politics and Power. Oxford: Oxford University Press. Chesher, C. (2012). Navigating sociotechnical spaces: Comparing computer games and sat navs as digital spatial media. Convergence, 18(3), 315–330. Comor, E. (2001). Harold Innis and ‘the bias of communication’. 
Information, Communication & Society, 4(2), 274–294. Conley, T. L. (2016). Mapping New (er) Connections in a Premature Place: A Case Study on Youth (dis) Connection, Mobilities, and the City. New York, NY: Teachers College, Columbia University. Cranshaw, J., Schwartz, R., Hong, J., & Sadeh, N. (2012). The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City. Paper presented at the 6th International AAAI Conference on Weblogs and Social Media, Dublin. De Choudhury, M., Jhaver, S., Sugar, B., & Weber, I. (2016). Social Media Participation in an Activist Movement for Racial Equality. Paper presented at the 10th International AAAI Conference on Web and Social Media, Cologne, Germany. Dijck, V. J. A. (2013). The Culture of Connectivity: A Critical History of Social Media. Oxford: Oxford University Press. Dijst, M. (2009). ICT and social networks: Towards a situational perspective on the interaction between corporeal and connected presence. The Expanding Sphere of Travel Behaviour Research, 45–75. Farrell, H. (2018). Philip K. Dick and the fake humans. Amass, 22(3), 24–26. Fernback, J. (2007). Beyond the diluted community concept: A symbolic interactionist perspective on online social relations. New Media & Society, 9(1), 49–69. Ferrara, E. (2017). Disinformation and social bot operations in the run up to the 2017 French presidential election. First Monday, 22(8).

From global village to identity tribes 47

Fischer, C. S. (1982). To Dwell Among Friends: Personal Networks in Town and City. Chicago: University of Chicago Press. Fraser, A. (2017). Gangs & Crime: Critical Alternatives. London: Sage. Gehl, R. W. (2011). The archive and the processor: The internal logic of Web 2.0. New Media & Society, 13(8), 1228–1244. Gladwell, M. (2010, October 4). Small change: Why the revolution will not be tweeted. The New Yorker. www.newyorker.com/reporting/2010/10/04/101004fa_fact_gladwell Glückler, J. (2007). Economic geography and the evolution of networks. Journal of Economic Geography, 7(5), 619–634. doi: 10.1093/jeg/lbm023 Golebiewski, M., & Boyd, D. (2018). Data Voids: Where Missing Data Can Easily be Exploited (p. 8). New York, NY: Data & Society. Gunkel, D. J. (2018). Hacking Cyberspace. London: Routledge. Hägerstrand, T. (1982). Diorama, path and project. Tijdschrift voor economische en sociale geografie, 73(6), 323–339. Han, S. (2012). From decentered to distributed selves. In A. Elliott (Ed.), Routledge Handbook of Identity Studies (pp. 219–235). New York, NY: Taylor & Francis. Harrison, S., & Dourish, P. (1996). Re-Place-ing Space: The Roles of Place and Space in Collaborative Systems. Proceedings of the 1996 ACM Conference on Computer Supported Cooperative Work, Boston, MA. Hermida, A. (2010). Twittering the news: The emergence of ambient journalism. Journalism Practice, 4(3), 297–308. doi: 10.1080/17512781003640703 Howard, P. N., & Hussain, M. M. (2013). Democracy’s Fourth Wave?: Digital Media and the Arab Spring. Oxford: Oxford University Press. Ilan, J. (2015). Understanding Street Culture: Poverty, Crime, Youth and Cool. Basingstoke: Macmillan International Higher Education. Innis, H. A. (2008). The Bias of Communication. Toronto, Canada: University of Toronto Press. Jenkins, H., Ford, S., & Green, J. (2012). Spreadable Media: Creating Value and Meaning in a Networked Culture. New York, NY: New York University Press. Johnston, R., & Pattie, C. (2011). 
Social networks, geography and neighbourhood effects. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications. Kiesler, S., & Cummings, J. N. (2002). What do we know about proximity and distance in work groups? A legacy of research. Distributed Work, 1, 57–80. Kitchin, R., & Dodge, M. (2011). Code/Space: Software and Everyday Life. Cambridge, MA: MIT Press. Kitchin, R., Lauriault, T. P., & Wilson, M. W. (2017). Understanding Spatial Media. London: SAGE Publications. Langley, P., & Leyshon, A. (2017). Platform capitalism: The intermediation and capitalisation of digital economic circulation. Finance and Society, 3(1), 11–31. Laniado, D., Volkovich, Y., Scellato, S., Mascolo, C., & Kaltenbrunner, A. (2017). The Impact of Geographic distance on online social interactions. Information Systems Frontiers, 1–16. Liben-Nowell, D., Novak, J., Kumar, R., Raghavan, P., & Tomkins, A. (2005). Geographic routing in social networks. Proceedings of the National Academy of Sciences, 102(33), 11623–11628. Lingel, J. (2017). Digital Countercultures and the Struggle for Community. Cambridge, MA: MIT Press. Loader, B. D., & Mercea, D. (2011). Networking democracy? Information, Communication & Society, 14(6), 757–769. doi: 10.1080/1369118x.2011.592648

48 Local and digital

Lomborg, S., & Kapsch, P. H. (2019). Decoding algorithms. Media, Culture & Society. doi: 10.1177/0163443719855301
Luhmann, N. (2005). Social Systems. Stanford: Stanford University Press.
Mak, T. (2018, July 12). Russian influence campaign sought to exploit Americans’ trust in local news. NPR.
Markham, A. N. (1998). Life Online: Researching Real Experience in Virtual Space (Vol. 6). Lanham: Rowman Altamira.
Massanari, A. (2015). #Gamergate and the fappening: How Reddit’s algorithm, governance, and culture support toxic technocultures. New Media & Society. doi: 10.1177/1461444815608807
Massey, D. (2005). For Space. London: Sage.
McIntire, M., & Roose, K. (2020). What happens when QAnon seeps from the web to the offline world. The New York Times. www.nytimes.com/2020/02/09/us/politics/qanon-trump-conspiracy-theory.html
McLuhan, M. (1962). The Gutenberg Galaxy. Toronto, Canada: University of Toronto Press.
McLuhan, M. (1964). Understanding Media: The Extensions of Man. New York, NY: McGraw Hill.
McPherson, M., & Smith-Lovin, L. (1987). Homophily in voluntary organizations: Status distance and the composition of face-to-face groups. American Sociological Review, 52(3), 370–379. doi: 10.2307/2095356
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415–444. doi: 10.1146/annurev.soc.27.1.415
Mercea, D., & Bastos, M. T. (2016). Being a serial transnational activist. Journal of Computer-Mediated Communication, 21(2), 140–155. doi: 10.1111/jcc4.12150
Meyrowitz, J. (1986). No Sense of Place: The Impact of Electronic Media on Social Behavior. Oxford: Oxford University Press.
Meyrowitz, J. (2005). The rise of glocality. In K. Nyiri (Ed.), A Sense of Place: The Global and the Local in Mobile Communication (pp. 21–30). Vienna, Austria: Passagen Verlag.
Miles, T. (2018, March 12). U.N. investigators cite Facebook role in Myanmar crisis. Reuters.
Mitra, A., & Schwartz, R. L. (2001). From cyber space to cybernetic space: Rethinking the relationship between real and virtual spaces. Journal of Computer-Mediated Communication, 7(1). doi: 10.1111/j.1083-6101.2001.tb00134.x
Mok, D., Wellman, B., & Carrasco, J. (2010). Does distance matter in the age of the Internet? Urban Studies, 47(13), 2747–2783.
Moreno, J. L. (1953). Who Shall Survive? Foundations of Sociometry, Group Psychotherapy and Socio-Drama. Oxford: Beacon House.
Morozov, E. (2011). The Net Delusion: How Not to Liberate the World. London: Allen Lane.
Morozov, E. (2013). To Save Everything, Click Here: The Folly of Technological Solutionism. New York, NY: Public Affairs Books.
Olwig, K. F., & Hastrup, K. (1997). Siting Culture: The Shifting Anthropological Object. London: Routledge.
Papacharissi, Z. (2002). The virtual sphere: The internet as a public sphere. New Media & Society, 4(1), 9–27. doi: 10.1177/14614440222226244
Papachristos, A. V., Hureau, D. M., & Braga, A. A. (2013). The corner and the crew: The influence of geography and social networks on gang violence. American Sociological Review, 78(3), 417–447.
Preciado, P., Snijders, T. A., Burk, W. J., Stattin, H., & Kerr, M. (2012). Does proximity matter? Distance dependence of adolescent friendships. Social Networks, 34(1), 18–31.

Pred, A. R. (1973). Urban Growth and the Circulation of Information. Cambridge, MA: Harvard University Press.
Radil, S. M., Flint, C., & Tita, G. E. (2010). Spatializing social networks: Using social network analysis to investigate geographies of gang rivalry, territoriality, and violence in Los Angeles. Annals of the Association of American Geographers, 100(2), 307–326. doi: 10.1080/00045600903550428
Rainie, H., & Wellman, B. (2012). Networked: The New Social Operating System. Cambridge, MA: MIT Press.
Rauchfleisch, A., & Kaiser, J. (2020). The false positive problem of automatic bot detection in social science research. PLOS One, 15(10), e0241045.
Rosenberg, M. (2018). Professor apologizes for helping Cambridge Analytica harvest Facebook data. The New York Times. www.nytimes.com/2018/04/22/business/media/cambridge-analytica-aleksandr-kogan.html
Sanfilippo, M. R., & Lev-Aretz, Y. (2017). Breaking news: How push notifications alter the fourth estate. First Monday, 22(11).
Schaefer, D. R. (2012). Youth co-offending networks: An investigation of social and spatial effects. Social Networks, 34(1), 141–149. doi: 10.1016/j.socnet.2011.02.001
Shane, S., & Mazzetti, M. (2018, February 16). Inside a 3-year Russian campaign to influence U.S. voters. The New York Times. https://mobile.nytimes.com/2018/02/16/us/politics/russia-mueller-election.html
Shao, C., Ciampaglia, G. L., Varol, O., Flammini, A., & Menczer, F. (2017). The spread of fake news by social bots. arxiv.org.
Singer, P. W., & Brooking, E. T. (2018). LikeWar: The Weaponization of Social Media. New York, NY: Eamon Dolan Books.
Subrahmanyam, K., Reich, S. M., Waechter, N., & Espinoza, G. (2008). Online and offline social networks: Use of social networking sites by emerging adults. Journal of Applied Developmental Psychology, 29(6), 420–433. doi: 10.1016/j.appdev.2008.07.003
Takhteyev, Y., Gruzd, A., & Wellman, B. (2012). Geography of Twitter networks. Social Networks, 34(1), 73–81. doi: 10.1016/j.socnet.2011.05.006
Tönnies, F. (2012). Community and Society (C. P. Loomis, Trans.). Devon: Courier Corporation.
Tsai, M.-C. (2006). Sociable resources and close relationships: Intimate relatives and friends in Taiwan. Journal of Social and Personal Relationships, 23(1), 151–169.
Tufekci, Z., & Wilson, C. (2012). Social media and the decision to participate in political protest: Observations from Tahrir Square. Journal of Communication, 62(2), 363–379.
US District Court. (2018). United States of America versus Internet Research Agency LLC, Case 1:18-cr-00032-DLFFiled C.F.R. (2018).
Uscinski, J. E. (2018). Conspiracy Theories and the People Who Believe Them. Oxford: Oxford University Press.
Varol, O., Ferrara, E., Davis, C. A., Menczer, F., & Flammini, A. (2017). Online Human-Bot Interactions: Detection, Estimation, and Characterization. Paper presented at the 11th International AAAI Conference on Weblogs and Social Media, Montréal, Canada.
Vasi, I. B., & Suh, C. S. (2016). Online activities, spatial proximity, and the diffusion of the Occupy Wall Street movement in the United States. Mobilization: An International Quarterly, 21(2), 139–154.
Verbrugge, L. M. (1983). A research note on adult friendship contact: A dyadic perspective. Social Forces, 62, 78.
Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146–1151. doi: 10.1126/science.aap9559

Walker, S., Mercea, D., & Bastos, M. T. (2019). The disinformation landscape and the lockdown of social platforms. Information, Communication and Society, 22(11), 1531–1543. doi: 10.1080/1369118X.2019.1648536
Warf, B., & Arias, S. (2008). The Spatial Turn: Interdisciplinary Perspectives. London: Routledge.
Weedon, J., Nuland, W., & Stamos, A. (2017). Information Operations and Facebook (Facebook Security). Menlo Park: Facebook.
Wellman, B. (1999). Acknowledgments to my intellectual community. In B. Wellman (Ed.), Networks in the Global Village: Life in Contemporary Communities (pp. XXIII–XXIV). Boulder, CO: Westview Press.
Wellman, B., & Potter, S. (1999). The elements of personal communities. In B. Wellman (Ed.), Networks in the Global Village: Life in Contemporary Communities (pp. 49–82). Boulder, CO: Westview Press.
Zuckerberg, M. (Producer). (2017, January 1). Building Global Community. www.facebook.com/notes/mark-zuckerberg/building-global-community/10154544292806634
Zuckerman, E. (2014). New media, new civics? Policy & Internet, 6(2), 151–168. doi: 10.1002/1944-2866.poi360
Zuckerman, E. (2017). Mistrust, Efficacy and The New Civics: Understanding the Deep Roots of the Crisis of Faith in Journalism. Washington, DC: Knight Commission on Trust, Media, and American Democracy, The Aspen Institute.
Zuckerman, E. (2019). QAnon and the emergence of the unreal. Journal of Design and Science, (6).

SECTION II

Social and spatial networks The dyadic interaction of virtual and spatial

4 NETWORK SPILLOVER

The dyadic interaction of virtual ↔ spatial captures the underexplored connection between online and offline dimensionalities of networks. The very definition of the problem is challenging because it is perfectly possible for network actors to be embedded in space, but for their relationships to occur virtually. In other words, networks may be simultaneously virtual and spatial without comprising a planar graph embedded in a two-dimensional space. Transportation and mobility networks, as well as internet infrastructure, mobile phone networks, power grids, social and contact networks, and even neural networks, are examples where space matters and where topology alone cannot capture all the information. The links connecting social network actors, much like roads connecting cities, are nonetheless better described as having both spatial and nonplanar elements (Barthelemy, 2014). One important consequence of the spatial constraints imposed on networks is the cost associated with the length of ties (edges), which, in turn, has substantive effects on the topological structure of networks. A result of this constraint is that for most real-world spatial networks, the probability of a tie between any two actors decreases with distance. Spatial constraints impinge not only on the structure of networks but also on endogenous processes driving tie formation, such as phase transitions, random walks, synchronization, navigation, resilience, and disease spread (Barthelemy, 2014). This definition, however, does not take into account links that are not necessarily embedded in space. The links that make up social relationships such as friendships, whether established online or offline, are virtual by definition. While the actors are embedded in space, the links connecting individuals are not planar and therefore are fundamentally dissimilar from perfectly two-dimensional spatial networks. 
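
The distance decay described above can be illustrated with a minimal sketch. The chapter does not specify a functional form, so the power-law decay, the `alpha` exponent, and the `scale_km` parameter below are assumptions chosen only for illustration; the coordinates are approximate city-center locations and the function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def tie_probability(distance_km, alpha=1.0, scale_km=50.0):
    """Assumed power-law decay: the probability of a tie shrinks with distance.

    alpha and scale_km are illustrative parameters, not estimates from data.
    """
    return (1.0 + distance_km / scale_km) ** (-alpha)


if __name__ == "__main__":
    # Approximate coordinates for London, Manchester, and Paris.
    london = (51.5074, -0.1278)
    manchester = (53.4808, -2.2426)
    paris = (48.8566, 2.3522)
    for name, city in (("Manchester", manchester), ("Paris", paris)):
        d = haversine_km(*london, *city)
        print(name, round(d), "km", round(tie_probability(d), 3))
```

Under this assumed form, two actors at the same location have a tie probability of 1, and the probability falls monotonically as the great-circle distance grows. Other decay functions (for example, exponential) would serve the same illustrative purpose.
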
These problems are compounded by the difference between relationships established offline, where actors are embedded in geographic space, and online, where individuals may be entirely unaware of each other’s location. The intricate
relationships between physical ties and online interactions are encapsulated in the problem of directionality in social relationships, a sociological debate about the causal direction of homophily. The term ‘homophily’ refers to the phenomenon that like-minded people with similar social characteristics are more likely to be connected with each other, whether online or offline. Homogeneous social networks are marked by assortative social ties and result in limited social worlds that constrain the information individuals can access, the attitudes they form, and the interactions they experience (McPherson et al., 2001). In other words, similarity breeds proximity, which then reinforces homophilous tendencies within cultural, economic, ethnic, sexual, religious, and racial groups, which become more likely to interact or form social ties (Lazarsfeld & Merton, 1954). Causality may, of course, move in either direction, but researchers have focused mostly on the hypothesis that similarity causes interaction (McPherson et al., 2001), likely a reflection of the divide separating social network analysis, applied to social media or otherwise, and spatial analysis, which focuses on spatial instead of social influence. The directionality of homophilic relationships remains an open empirical question, with evidence of significant reciprocal effects between network structure and context (Entwisle et al., 2007). This is unfortunate because primary and secondary effects associated with the scalable deployment of social technologies are situated right at the junction of networks that are simultaneously social and spatial, but whose social/spatial relationship is not symmetric. Individuals can partake in online communities and be exposed to information diets with little similarity to their offline surroundings. 
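
One simple way to make the notion of homophily defined above concrete is to compare the share of ties that connect actors of the same group with the share expected if ties were formed at random. The sketch below is only an illustration: the six actors, their group labels, and the edge list are invented toy data, and the two function names are hypothetical.

```python
from collections import Counter


def same_group_share(edges, group):
    """Observed share of ties whose two endpoints belong to the same group."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


def expected_same_group_share(group):
    """Baseline: chance that two distinct, randomly drawn actors share a group."""
    n = len(group)
    counts = Counter(group.values())
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Toy data (hypothetical): two groups of three actors and six undirected ties.
group = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y", "f": "Y"}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

observed = same_group_share(edges, group)
baseline = expected_same_group_share(group)
print(observed, baseline)
```

An observed share well above the baseline indicates assortative ties. Note that this descriptive comparison says nothing about causal direction, which is precisely the open question raised in the text.
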
At the limit, ideological orientation, identity formation, and feelings of belonging can be pieced together depending on the networks in which individuals find themselves embedded. Perceived social memberships based on age, race, income, and location can be reinforced, but also potentially rewired, as individuals connect with online social networks that may differ from their proximate communities. Network spillover effects happen not only when face-to-face networks are replaced with social media but particularly when the embedding in online networks spills over to geographic communities. While these interactions are not trivial, the network spillover effect is relatively simple: seemingly unrelated social activities online and offline reinforce each other. Network spillover is associated with neighborhood effects that were first described in the seminal studies of Butler and Stokes (1969) about the 1964 and 1966 United Kingdom General Elections. The election cycle consolidated the Labour Party’s majority through a process of partisan reinforcement, that is, the tendency of prevailing opinions to draw additional support and become dominant in local areas. The geographic or neighborhood component of network spillover was mapped by Miller (1977), for whom contact, largely structured by family, choice of friends, social characteristics, and locality such as neighborhoods, is a condition for consensus, with patterns of contact being predictive of political consensus within high-contact groups. The neighborhood effect is perhaps best described in Miller’s
(1977) assertion that locality is a better predictor of how people vote than their social characteristics. Spillover effects are thus intrinsically linked to neighborhood effects due to homophilic interdependencies tying spatial and social cohesion. Locational decisions involve a considerable degree of social selection, whereby people choose to live in neighborhoods populated by people similar to themselves. This results in social networks that are geographically delimited and socially homogeneous, not just in their socioeconomic and demographic composition but also ideologically, behaviorally, and culturally. Although geographic constraints and ideological homogeneity mutually reinforce each other, neighbors may come across attitudes and behaviors different from those held within the group. In particular, those holding a minority view in the group are more likely to come across new information, including new ideas, behaviors, and ideologies. New information may then rapidly break into the group majority, as social interactions are optimized to negotiate, and thereby spread attitudes and behavior. Johnston and Pattie (2011) provided an account of how neighborhood effects can be further amplified by other network externalities, a process that may trigger feedback loops and spillovers. For instance, if a political debate between individuals holding opposing views about the best candidate in an election can persuade people to reconsider their own positions (conversion), then any social network with a majority of the population supporting a given candidate is more likely to switch to the minority candidate than the other way around. This contradicts the assumption of research on opinion evolution, but it is conceptually accurate because voting preferences are socially structured not only by the characteristics of the voter but also by those with whom the voter discusses politics. 
Contact with voters in a given area influences not only individuals directly connected but also others in the neighborhood (Huckfeldt & Sprague, 1995). On the other hand, and perhaps paradoxically, this process would render the majority view within the network even more prevalent, as dominance would be greater than expected from knowledge of the individuals’ personal characteristics alone. If neighborhood effects can be observed whenever conversation networks are spatially constrained, then the political complexion of areas should be more polarized than their social composition implies. Spillover effects are traditionally associated with exogenous policy interventions on subjects who were not originally targeted by the intervention but who happen to be connected to those in the target population. Social media microtargeting can thus spill over to actors in local social environments who were not targeted by the message themselves. In this scenario, the spillover effect is the contagion effect on actors due to interventions targeting their friends and acquaintances. In contrast, endogenous peer effects stem directly from peers. In other words, actors in a network influence each other without being subjected to any microtargeting or intervention. While the source of exogenous effects is situated outside of the network, in reality it is difficult to separate endogenous from exogenous peer effects, as many of the endogenous peer effects may have been originally generated by external

forces unobservable to researchers (An, 2011). Likewise, an exogenous peer effect such as microtargeting can be further compounded by endogenous effects such as the neighborhood effect, so that it is not always possible to untangle nested peer effects (Duflo et al., 2011). The process of converting individuals to different opinions is, of course, the aim of propaganda. In his seminal work, Ellul (1965) reviewed the survey instruments employed to measure propaganda effect—defined as a change of opinion after research subjects are exposed to psychological manipulations via pamphlets, films, and, increasingly, social media (Bail et al., 2019). This body of literature based on social surveys has consistently concluded that propaganda has little effect, that patterns and stereotypes remain unaltered as a result of the exposure, and that group opinion remains largely unchanged. Ellul (1965) argued that survey approaches were wholly inadequate because propaganda effects cannot be measured at the individual level and secluded from the larger universe of information one consumes over time. Media effects research addressed the problem with a more successful approach. Instead of employing methods of social psychology to study swings of opinion in groups, this body of literature explored propaganda and advertisement as attempts to get traction in the public discourse (agenda-setting) or change the narrative about an event (framing and priming). More importantly, propaganda effects recreated in laboratory conditions lack not only the interconnections between various media that are central to orchestrated influence operations but the organic amalgamation of ideas and ideologies that come together from reading and sharing information within one’s community. In other words, it cannot account for network externalities that influence operations seek to achieve. 
The abundance of digital trace data allows campaigners to explore a virtually infinite number of entry points to engineer consensus: user data collected by social platforms offer insights into the swings of opinions that may not yet have crystalized; networks of bots and trolls can confuse audiences about the prevailing opinion regarding a contentious topic. Social realities can thus be architected without a clear way to peg the campaign message back to offline communities. Ultimately, one’s online relationships can conceivably outpace the relationships rooted in offline settings, including the immediate networks comprising friends and family. Similarly, social media studies tracking disinformation and propaganda online are often focused on attempts to identify the extent to which users are convinced to vote for a candidate or change their behavior as a result of targeted advertising. This concern stems from the belief that targeted advertisement affects users in a uniform, atomized fashion. This understanding of media effects dates from early communication research asserting that the media exerts a powerful and persuasive influence on audiences who were believed to be volatile, alienated, and inherently susceptible to manipulation. This framework is known as the hypodermic needle model (Lasswell, 1948) on account of portraying all-powerful messages being ‘injected’ into easy prey and suggestible individuals (Bineham, 1988). In the postwar period, this framework was rebuked and eventually replaced by the two-step

flow of communication, a model emphasizing the importance of opinion leaders and interpersonal communication in the flow of personal influence leading to the promotion of ideas and products (Katz, 1957). It is unsurprising that the framework used in propaganda studies still relies on the two-step flow of communication model (Katz, 1957), only updated to incorporate multiple entry points of influence, thereby foregrounding the role of opinion leaders whose influence in their community is a vector of social persuasion. This model continues to provide a dependable framework for studying Twitter (Wu et al., 2011), a social network largely populated by opinion leaders and in particular by the digiterati, and Instagram, a mobile photo-sharing app particularly suited to influencer marketing (De Veirman et al., 2017). But the topology of social network platforms, which can accommodate various network formations, has modified the relatively simple equation in which persuasion is a function of activation, reinforcement, and conversion, as secondary network effects are not sufficiently developed in classic models of interpersonal communication (Lazarsfeld et al., 1948) and information diffusion theory (Rogers, 1983). In other words, the assessment of campaign effects remains dependent on the assumption of independence to estimate the impact of exposure to partisan content on voting preferences. Political campaigns are expected to result in activation, when unmotivated actors confirm their support to the campaign; conversion, when motivated actors shift their vote to the opposing party; and reinforcement, when the initial vote preference is strengthened due to the campaign (Lazarsfeld et al., 1948). The assumption of independence underpinning campaign effects is correspondingly explored with fixed effects models (Dilliplane, 2014), with exposure to campaign materials leading individuals to activate other actors within their reach. 
Lazarsfeld resorted to the metaphors of photographic development and the rubbing of a coin to describe activation as the emergence of an ideological alignment that existed in latent form but only crystallized because of campaign propaganda, thereby tracing a linear path from voter’s latent tendencies to activation or conversion. The two-step flow model is remarkably nuanced and downplays the power of propaganda epitomized by the hypodermic needle model (Lasswell, 1948). It explores the subtle relationship between political communications broadcast by mass media and the direct personal influences exerted by activated individuals. It also describes how successful activation leads to increasing exposure in a continuous process, with propaganda leading to increased interest, which, in turn, makes individuals more willing to expose themselves to further propaganda. The model nonetheless assumes exposure to be linear: either it flows from the media or it is acquired through personal contacts. Relationships and interactions are thereby defined within the constraints of one’s personal ego network, with no way of accounting for hundreds of millions of dynamic new ties forged and reinforced online through social platforms. The limitations of this framework are compounded by social networking sites whose internal topology is constantly shifting due to a growing user base and

successive modifications to the technologies underpinning them. The small-world properties of physical social networks are one of many topologies found on social platforms which allow for multiple secondary exposures and network effects drawing from single exposure points. This is particularly important because there is evidence that small-world properties observed in the real world are linked to the geographic dependencies of the network, namely the spatial distance between actors (Wong et al., 2006). As such, the potential impact of messages circulating in social networks cannot be benchmarked against the number of users exposed to the content, as activation might be achieved through subsequent steps through which information cascades extrapolate the assumptions of mass communication and propaganda models (Katz, 1957). While Facebook is largely structured as a social network with reciprocal ties and overlapping clusters similar to physical social networks (Backstrom et al., 2012), Twitter is a mixed system that can rapidly shift from decentralized, horizontal networks to highly centralized network formations, with few accounts sourcing information to communities of users (Bastos, Piccardi, et al., 2018). The constantly changing topology of social platforms imposes considerable challenges to modeling distributed spillover of information, but these constraints could be offset by incorporating network science methods to the task of identifying activation thresholds (Hilbert et al., 2016). Network science can trace the processes through which disinformation navigates centralized and small-world networks to maximize the effects of information dissemination (Myers et al., 2014). But even metrics of persuasion such as activation and conversion have limited heuristic power for understanding information warfare. 
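
The point that impact cannot be benchmarked against the number of users initially exposed can be sketched with a simple threshold model of activation. This is an illustrative stand-in, not the method used in the studies cited above: the five-actor network, the uniform threshold of 0.5, and the function name are all assumptions.

```python
def threshold_cascade(neighbors, thresholds, seeds):
    """Return the set of activated actors once no further activation occurs.

    An inactive actor activates when the share of its neighbors that are
    already active reaches its personal threshold.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in active or not nbrs:
                continue
            share = sum(1 for n in nbrs if n in active) / len(nbrs)
            if share >= thresholds[node]:
                active.add(node)
                changed = True
    return active


# Hypothetical undirected network given as adjacency lists.
neighbors = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
thresholds = {node: 0.5 for node in neighbors}  # assumed uniform threshold

print(sorted(threshold_cascade(neighbors, thresholds, {"a"})))
print(sorted(threshold_cascade(neighbors, thresholds, {"a", "b"})))
```

In this toy network a single exposed actor activates nobody else, whereas two exposed actors eventually activate everyone through secondary steps. The outcome depends on where the exposures fall in the topology, not only on how many there are.
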
Not only can activation occur due to secondary network effects but propaganda campaigns can successfully employ psychological warfare techniques that bypass activation altogether. Researchers can track activation times of individuals recruited to political causes up to the moment when critical mass is attained (González-Bailón et al., 2011), but psychological warfare techniques do not require a given threshold of actors to be activated because the objective is to shape perceptions and manipulate cognitions, an objective which can be achieved without change being registered in the public discourse (Jowett & O’Donnell, 2014; Linebarger, 1948). The objective of psychological warfare is not to move public opinion but to create confusion, disorder, and distrust (Jowett & O’Donnell, 2014; Taylor, 2003). The potential reach of propaganda resorting to broadcast channels, such as radio and television, is restricted to the population exposed to it and the interactions between activated individuals and their social networks. Social platforms, however, incorporate network externalities (Katz & Shapiro, 1985), so that users subjected to microtargeted propaganda are also likely to be embedded in cliques or communities equally exposed to the campaign, thereby snowballing the cumulative impression garnered by the piece. In addition to bandwagon effects, network externalities also impinge on an individual’s ability to evaluate the extent to which an opinion is prevailing or dissenting relative to the broader population. These externalities can

play a pivotal role in breaking the critical mass threshold after which social diffusion of new styles of behavior grows rapidly (Bandura, 2001). These problems have long been studied in research on opinion evolution that highlights the non-linear patterns through which opinions and social change emerge from system interactions. While opinion dynamics progress fairly linearly over time, they often lead to non-linearities and complex dynamic behavior, of which clustering (i.e., ‘bubbles’) and the polarization of opinions are common outcomes (Latané, 1996; Nowak et al., 1990). Disinformation campaigns thrive on polarized discourses by mobilizing supporters in opposing clusters, but clusters do not have to convince each other of a prevailing or a minority opinion. In other words, disinformation campaigns are not intended to change the prevailing opinion, and therefore metrics of persuasion, including activation and conversion, provide limited heuristic guidance regarding such operations. For social issues requiring engagement, such as voting, and public deliberation, disengagement is as important an objective as engagement and can be achieved with limited activation. In other words, while changing public opinion is a process governed by intrinsic dynamics, the transition to a new prevailing opinion is likely linked to changes in extrinsic control factors that affect intrinsic dynamics (Nowak et al., 1996). If an organization seeks to optimize the reach of their campaign, they might resort to a social platform as an information diffusion system and target ‘influentials’—that is, users who are central to the network and who perform the role of hubs relaying information to the periphery of the network. Trolls and botmasters do not necessarily have to convince ‘influentials’ of their political agenda. For disinformation purposes, it might suffice that opinion leaders inhabiting the network perceive one side of the public debate as contentious and potentially damaging. 
The result is not a change in the prevailing opinion but a change in public support for a cause that can well translate to disenchantment, apathy, and lower voter turnout by an ill-informed electorate. In this distributed information ecosystem, the negotiation of social values offline can be reinforced online, and vice versa, thereby triggering network spillover effects and feedback loops not prescribed in previous communication models. In our study of echo chambers during the Brexit debate (Bastos, Mercea, et al., 2018), we found that social media echo chambers actually reflected real-life conversations that are linked to the geographic locations of users. The findings contradicted the assumption that echo chambers—online discussions restricted to users sharing politically homogeneous content—were the result of online interactions alone. Indeed, the results were in sharp contrast to the prevailing narrative that social media recommendation algorithms were the central factor driving insularity and radicalization, social mores encapsulated in the terms ‘filter bubbles’ and ‘echo chamber communication.’ There were, in fact, fundamental differences in the geographic embedding of online echo chambers for Leave and Remain campaign subgroups. The average distance between users sharing pro-Leave messages was 168 kilometers (105

miles), while the average for pro-Remain supporters was 208 kilometers (130 miles). Remain-supporting users were more likely to speak to other Remain supporters outside of their own geographic areas, whereas Leave supporters were largely circumscribed to interaction with users from nearby areas. The differences between echo chambers involving Leave and Remain supporters may conceivably be explained by the distinct geographical clustering of their social networks, with communication online echoing the composition of their extant social relations. The study indicated that online echo chambers were the result of conversations that spilled over from in-person interactions and called into question the assumption that echo chambers were a hazard specific to social media, suggesting instead that people were bringing their pub conversations to online debate. The findings of this study also confirmed the hypothesis that similarity triggers interaction, as politically homogeneous network interactions would be triggered by geographic propinquity. A related study found that radicalization on YouTube stems from the same factors that persuade individuals to change their minds in real life—absorbing and interiorizing new information, only at scale (Munger & Phillips, 2020). Instead of depending on algorithms that increasingly radicalized audiences, political content on YouTube would be driven by supply and demand, with the increased supply of right-wing videos online tapping a latent demand previously constrained by limited offer offline. Counterintuitive as it may sound, this predicament is consistent with research cautioning against the perils of politically homogeneous communication such as echo chambers and filter bubbles. 
The information stream of social media users is heavily weighted on what one’s connections have already engaged with, hence not only streamlining locally relevant yet restricted information but also further contributing to widening the supply of content to which there was untapped demand (Zhang et al., 2018). This process continuously corrects the supply-demand imbalance of novel information, with users receiving feedback and responses to their posts that build up a sense of influence and agency, which then increases the probability of users seeding the same content to their communities in a positive feedback loop (Oeldorf-Hirsch & Sundar, 2015) that has the potential to systematically distribute ideologically polarizing information. Further, the effects of network technology on the frequency and intensity of relationships are not linear. Mok et al. (2010) leveraged three waves of social network surveys encompassing the years 1968, 1978, and 2005 in East York, Toronto, with email being a prevalent medium of communication only in the third wave. The most salient differences found between the 1978 and 2005 samples were the characteristics of the relationships. Relationships in the period where email was widely used covered both longer and shorter distances compared with the survey of pre-internet social networks. The mean distance in 2005 was nearly twice that observed in the 1978 study, but the median distance in 2005 was 10 kilometers compared with 14 in the 1978 study. The 2005 survey also found a much higher ratio of intimate relationships, at 53% compared with only 39% in 1978, and higher frequency of phone contact compared with the social networks surveyed in 1978.
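
The observation that the mean distance roughly doubled while the median fell may seem contradictory, but it follows when a few very long-distance ties are added to a network whose typical tie becomes shorter. The numbers below are invented for illustration and are not the East York survey data; only the medians (14 and 10 kilometers) are chosen to match the figures quoted above.

```python
from statistics import mean, median

# Hypothetical tie distances in kilometers for two survey waves.
distances_earlier = [2, 5, 10, 14, 14, 20, 30, 40, 60]
distances_later = [1, 2, 5, 8, 10, 12, 20, 50, 280]

mean_earlier, mean_later = mean(distances_earlier), mean(distances_later)
median_earlier, median_later = median(distances_earlier), median(distances_later)

print("mean:", round(mean_earlier, 1), "->", round(mean_later, 1))
print("median:", median_earlier, "->", median_later)
```

A single very distant contact in the later wave pulls the mean up sharply, while the median, which reflects the typical tie, moves down. This is one way relationships can cover both longer and shorter distances at the same time.
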

In this period, email rapidly became the most used medium of communication, followed by face-to-face conversation, and lastly phone calls. This leads to another key difference reported by Mok et al. (2010): the composition of one’s social network in 2005 included more friends, but fewer neighbors and family members. Perhaps surprisingly, but certainly in contrast to reports that network technologies diminish face-to-face contacts, the study also found that the frequency of face-to-face contact had, in fact, increased in 2005, with the average frequency of in-person interaction surging from 45 times a year in 1978 to 56 in 2005. The exception that confirms not only the rule but also structural demographic changes was face-to-face interactions with intimate neighbors, which declined dramatically from 163 contacts per year in 1978 to only 52 yearly interactions in 2005. Network technologies accordingly have non-linear effects on the frequency and intensity of one’s relationships, with marginal effects on immediate neighbors who may change the composition of one’s social network. The non-linearity of their effects on social relationships allows network technologies to support relationships with extremely distant contacts, but it may not necessarily expand the spatial range of active relationships. In other words, the combined effects of network technology and geographic propinquity may not follow a predictable pattern. This pattern is also strikingly dissimilar to what Preciado et al. (2012) reported in their study on geographic proximity as a decisive factor in friendship ties, findings which did not account for network technology or for friendships forged or sustained online. Interestingly, that study reported a linear dependence between geography and friendship in groups of teenagers living in small, geographically isolated towns.
In addition to being non-linear, the effects of coexisting online and offline interactions may work in the reverse direction, may be bidirectional, and additionally may trigger feedback loops. The asymmetric effects of online interactions on offline communities can only be gauged with the acknowledgment that these companies are platforms offering basic social infrastructure, much like electricity and water. Facebook is not simply an online social network that users may choose to join or ignore. With much of the decision-making processes of real-world communities depending on the infrastructure provided by Google, Apple, Airbnb, Amazon, Uber, and Facebook, the choice of which services to use is an illusory one. While the affordances of social platforms have become deeply integrated into the structures of communities, users and citizens continue to recognize their online activity as that of a user and their offline life as that of a citizen, thereby compartmentalizing two sources of social life assumed to be either natural extensions or spaces secluded from each other. These are false equivalences that overlook the affordances of social technologies, including the conflicting potentials for coordination and isolation beyond the constraints of geographically delimited communities. It similarly supports the diverging possibilities of greater intergroup cohesion and fragmentation processes compounding themselves upon polarization. Ultimately, the effects of online activity on real-world communities are not linear nor entirely observable, at times being

bewildering and ambiguous. Even individuals who choose to remove themselves from social platforms remain de facto users, as social platforms have enough data from any nonusers to generate a version of their profile even if they are not registered to the website (Garcia, 2017; Verma et al., 2013). In the following two chapters, we provide an overview of key concepts in social network and spatial analysis supporting the framework to advance this exceptionally cross-disciplinary field of inquiry.

5 SOCIAL NETWORKS

This chapter introduces the concepts and measures supporting the study of information spillover across online and offline social networks for readers who may not be familiar with them. These methods are drawn from the body of scholarship in network science and the theoretical framework of social network analysis. We proceed by providing an overview of the fundamentals of graph theory and the premise of social network analysis, which states that relationships, more than individual and independent attributes, are critical to understanding social behavior. We review key measures in social network analysis and how different metrics of centrality render different network visualizations that may deviate considerably from the geographic distribution of nodes. We subsequently revisit central concepts of social physics and relational sociology, which are examined in more detail in Chapter 6 when discussing spatial and neighborhood effects in social networks. A social network is typically defined as a set of actors connected by links that represent formal ties or evidence of interaction between individuals or organizations (Wasserman & Faust, 1994). The analysis of social networks has focused traditionally on the topology that emerges from the interaction between actors and links within the graph or the network space, with a number of methodological approaches to modeling network data describing relationships between actors and entities in personal, professional, kinship, political, or romantic relationships (Boccaletti et al., 2006). The focus on the topological space of the network means that there is no requirement or incentive to consider environmental or geographic context when performing traditional social network analysis (Sarkar et al., 2019). The central assumption of network analysis is that social data is necessarily relational, so that independent observations have only limited explanatory power in accounting for social phenomena.
This framework was originally and explicitly formulated in the work of Simmel (1908), who articulated the primacy of social ties and intersections in social relationships. The structural analysis of networks

runs deep in the sociological tradition, but network theory stems largely from the seminal work of Granovetter (1973) on the strength of ties and bridges, along with Burt’s (1978) work on structural holes and social capital. The terminology employed by the authors is substantively different, but their work converges to foreground the key role played by nonredundant ties or actors with more bridges in facilitating the diffusion of novel information (Borgatti & Lopez-Kidwell, 2011). Network data is therefore necessarily relational and structured quite differently from the rectangular data arrays of measurements common to survey and census data, where rows refer to cases, subjects, or observations, and columns are attributes, variables, or measures conveying quantitative or qualitative scores. Network data, on the other hand, traditionally consists of a square array of measurements that allows us to compare how actors are similar or dissimilar in relation to each other across a range of attributes. In contrast to survey data, both the rows and columns in network data refer to cases, subjects, or observations, with the cells indicating the occurrence or intensity of a relationship between actors. This fundamental difference forces the researcher to approach the data in different ways and to pose questions that could not be broached with a rectangular array of data measurements. It is a common misunderstanding that social media data is necessarily network data. Much of the data collected by social platforms consists of independent observations about users (timestamp of their login activity, list of purchased items, etc.). While it is possible to draw a network based on patterns of activity (e.g., streaming services’ recommendation systems and e-commerce ‘customers who bought this item also bought that’), user trace data is not inherently relational. Indeed, many of the services associated with the so-called Web 2.0 are neither social nor particularly structured around relationships.
Social media sites are not necessarily social networking sites, even if user-generated content is prevalent in both. One important example is Wikipedia, which is entirely user-generated but where interaction between users is largely restricted to dispute resolution. Network data is usually represented as an adjacency matrix where edges indicate who is adjacent to whom, so that A_ij = 1 if node i has an edge to node j and A_ij = 0 if node i does not have an edge to j. For social media data, we generally assume A_ii = 0 unless the network allows for self-loops (the retweeting of one’s own tweet, in Twitter parlance). We also assume that A_ij = A_ji if the network is undirected, which is the case for Facebook friendship graphs where i can be friends with j only if j accepts the friendship request from i. Other common ways to represent network data include edge lists, where unique identifiers are assigned to actors and a list of the links is stored (e.g., 2→3, 2→4, 3→1, 3→2, 4→5, 5→3, 5→4), and adjacency lists, which perform well if the network is large, sparse, or expansive. For social media data, one should also pay attention to the directionality of the information flow when graphing a network of retweets and/or @-mentions. For example, A→B when B retweets A, but A←B when B mentions A. The directionality of the information does not necessarily map onto the source of the action. In this example, B both retweeted and @-mentioned A, but the direction of the edges is reversed depending on whether we are exploring @-mention or retweet networks.
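The three representations described above can be sketched in a few lines of Python. This is a minimal illustration that reuses the edge list given in the text; the helper names (`is_symmetric`, `interaction_edge`) are hypothetical and chosen only for this example, and the edge direction follows the information-flow convention stated above.

```python
# Edge list from the text: each pair (i, j) is a directed link i -> j.
edges = [(2, 3), (2, 4), (3, 1), (3, 2), (4, 5), (5, 3), (5, 4)]
nodes = sorted({n for edge in edges for n in edge})
index = {node: k for k, node in enumerate(nodes)}

# Adjacency matrix: A[i][j] = 1 if node i has an edge to node j, else 0.
A = [[0] * len(nodes) for _ in nodes]
for i, j in edges:
    A[index[i]][index[j]] = 1

# Adjacency list: each node maps to the nodes it points to.
adjacency = {node: [] for node in nodes}
for i, j in edges:
    adjacency[i].append(j)


def is_symmetric(matrix):
    """True if A[i][j] == A[j][i] for all pairs, as in an undirected network."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


def interaction_edge(actor, target, kind):
    """Return the edge (source, destination) for an action taken by `actor`.

    Hypothetical helper: a retweet is drawn from the retweeted account to the
    retweeter, while a mention is drawn from the mentioning account to the
    mentioned one, as in the example in the text.
    """
    if kind == "retweet":
        return (target, actor)
    if kind == "mention":
        return (actor, target)
    raise ValueError(f"unknown interaction type: {kind}")


for row in A:
    print(row)
print(adjacency)
print(is_symmetric(A))                     # this edge list is directed
print(interaction_edge("B", "A", "retweet"))
print(interaction_edge("B", "A", "mention"))
```

The matrix stores one cell for every pair of actors, whereas the adjacency list stores only the ties that exist, which is why the latter is usually preferred for large and sparse networks.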

Despite these caveats, it is generally possible to collect and analyze digital trace data as a network of transactions or relationships. In their most fundamental definition, networks are formed by points (also referred to as vertices, nodes, sites, users, or actors) and lines (also referred to as edges, arcs, links, bonds, ties, or relations). The occurrence of ties among actors determines the structure of the network, which can then be processed to calculate the key metrics of social network analysis, including density, cohesion, clustering, but particularly and certainly more commonly, centrality, which is a numeric value summarizing the strength, regularity, or intensity of an actor’s connections. Measures of centrality are based on actor or node degree for undirected networks (graphs where the relationships are mutual), or the in- and out-degree for graphs where relationships have directionality. Degree thus expresses the number of connections a node has: a numeric variable driving all measures of centrality. Social network analysis is typically segmented across three levels of analysis depending on the metrics employed to describe and model the network. These metrics tend to characterize the relationships within the network at different topological scales, namely: entity-level, which focuses on single actors or links; meso-level, where the analysis addresses a collection of actors; and network-level, where complete network metrics are employed to analyze the population dynamics of the social network in its entirety. Meso-scale network structures are typically preferred due to their similarity or close correspondence with the concept of communities in sociology and geography. As such, much attention has been given to algorithms that calculate the modules or groups structuring a given network, largely defined as subnetworks that are locally dense even though the network as a whole may be sparse.
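As a minimal sketch of the quantities just introduced, the snippet below computes in-degree, out-degree, undirected degree, and density for a small edge list. The edge list is the illustrative one used earlier in the chapter, and collapsing reciprocated links into a single undirected tie is an assumption made here only for the example.

```python
from collections import Counter

# Directed edge list (source, target), reused for illustration.
edges = [(2, 3), (2, 4), (3, 1), (3, 2), (4, 5), (5, 3), (5, 4)]
nodes = sorted({n for edge in edges for n in edge})

# Directed case: out-degree counts ties sent, in-degree counts ties received.
out_degree = Counter(source for source, _ in edges)
in_degree = Counter(target for _, target in edges)

# Undirected case: treat 2->3 and 3->2 as one mutual tie, then count ties per node.
undirected_edges = {frozenset(edge) for edge in edges}
degree = Counter(node for pair in undirected_edges for node in pair)


def density(n_nodes, n_edges, directed=True):
    """Share of possible ties that are actually present."""
    possible = n_nodes * (n_nodes - 1)
    if not directed:
        possible //= 2
    return n_edges / possible


for node in nodes:
    print(node, "in:", in_degree[node], "out:", out_degree[node], "degree:", degree[node])
print("directed density:", density(len(nodes), len(edges)))
print("undirected density:", density(len(nodes), len(undirected_edges), directed=False))
```

Node 1 receives one tie but sends none, so its in-degree and out-degree differ, which is the distinction that disappears once the network is treated as undirected.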
Modularity thus estimates the fraction of the links that fall within given groups minus the expected fraction if links were distributed at random (Blondel et al., 2008; Newman, 2006). These levels of analysis are defined in the network space. As such, they are nonspatial and almost entirely nongeographic. While spatial constraints are not an integral component of social network metrics, some measures such as Average Path Length and Network Diameter may provide intuitions about node-hop distance in a network, thereby indicating how quickly one can get from one part of the network to another. Average Path Length is a measure that indicates the average distance between actors, and network diameter is the distance between the two actors that are farthest from each other. Network diameter is thus a measure of the size of the network and how expensive it is to cross it. While diameter is calculated at the network level, it is possible to calculate the largest geodesic distance for each of the actors. This measure is referred to as eccentricity, or how far an actor is from the furthest actor in the network. It follows that the largest eccentricity is the diameter of the network. These are perhaps the most salient measures for networks embedded in geographic space. With social connections tending to be local, a specification of the network diameter may correlate with its spatial coverage. Centrality is likely the most intuitively appealing notion in social network analysis. It is the metric commonly employed to rank actors or nodes in importance

according to various centrality measures, whether degree centrality, betweenness centrality, closeness centrality, or eigenvector centrality. Despite being a widely employed metric in network analysis irrespective of the type of the graph, centrality is actually a construct of the network flow model, so centrality has no application to networks where no information or goods are transported between nodes. This is particularly relevant in our field of inquiry, as spillover and other secondary network effects are largely independent from centrality but may be triggered or evolve along spatial lines. In cases where what is being measured is primarily propagation rather than direct flow, being adjacent to weak nodes may render an otherwise weak node strong, which, in turn, weakens others that the node is connected to (Bonacich, 1987). Measures of centrality are often translated to influence, particularly in social media analytics and ‘influencer’ marketing strategy, where centrality is defined either in terms of degree (that is, the number of connections to or from an actor) or closeness (that is, the length of the shortest paths to all other actors). Degree centrality is therefore the node’s (in- or out-)degree, which is, of course, identical in undirected graphs, used as a measure of a node’s degree of connectedness and a proxy of influence and/or popularity. It is a key metric for assessing which nodes are more central with respect to spreading information and influencing neighboring nodes. Closeness centrality, on the other hand, is the mean length of all shortest paths from a given node to all other nodes in the network (i.e., how many hops it takes on average to reach every other node). Closeness centrality is a measure of reach and indicates the speed at which information can reach other nodes from the starting node. Other key metrics include betweenness centrality, which is calculated from shortest paths, a path between two nodes being any sequence of non-repeating nodes connecting them.
In other words, the shortest path is the one that connects the two nodes with the smallest number of connections. Shorter paths are critical when fast communication is desired and undesirable when the exchange is not desired (e.g., the spread of diseases). Betweenness centrality employs this calculation to identify which nodes are more likely to be in the communication path between other nodes. It is particularly useful in determining points where the network may break apart, and it is an important metric for measuring network vulnerability to random or targeted attacks. Eigenvector centrality, finally, posits that the centrality of any given node depends on how central its neighbors are. The eigenvector centrality of an actor is proportional to the sum of the eigenvector centralities of the nodes connected to it, so that an actor with high eigenvector centrality is necessarily connected to other nodes with high eigenvector centrality as well. Eigenvector centrality is a measure similar to Google’s PageRank algorithm, developed to rank web pages and bring order to the web: links from highly linked-to pages are more important, and therefore count more, than links from obscure websites. Eigenvector is particularly useful in determining which node is connected to the most connected nodes, or which publication is more prestigious, by calculating the incidence of incoming links from other similarly prestigious

publications. The PageRank measure has built on this metric to index the web hierarchically: it takes into account not only the secondary webpages that link to a given website but how many other websites link to those secondary webpages. This recursive definition of centrality makes PageRank especially robust to manipulation and artificial inflation of incoming links. Measures of centrality have rather practical implications in the analysis of social networks. Degree centrality is useful in answering a simple question: how many nodes can a given node reach directly? In a network of British actors, which could be scraped from IMDb.com, it would be possible to calculate how many actors a given individual has collaborated with. Betweenness, on the other hand, would be helpful in identifying the most direct and efficient path between any two actors, so that aspiring actors could trace the fastest path to working with people they admire. Betweenness is, of course, important in the diffusion of information, particularly sensitive information, as it can pinpoint the actor through whom sensitive or confidential information is more likely to flow. Closeness could be used to calculate how fast a given actor reaches all other actors in the British film industry. Conversely, it is also helpful in identifying how fast a given disease can spread from that specific actor to the rest of the network. Eigenvector, finally, is an important measure of how well a given actor is connected to other well-connected actors. Considering the reality of the entertainment business, it is perhaps the most useful metric and the one most likely to be instinctively interiorized by aspiring actors. 
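The measures discussed in this chapter so far (eccentricity, diameter, average path length, closeness, betweenness, and eigenvector centrality) can be sketched with the standard library alone. The toy network below is hypothetical, the function names are chosen for this illustration, and closeness follows the definition used in the text (mean shortest-path length, so lower values mean a more central node); many software packages report its reciprocal instead. Betweenness uses the accumulation scheme commonly attributed to Brandes, and the eigenvector routine is a plain power iteration.

```python
from collections import deque

# Hypothetical undirected network stored as an adjacency list.
adjacency = {1: [3], 2: [3, 4], 3: [1, 2, 5], 4: [2, 5], 5: [3, 4]}


def bfs_distances(adjacency, source):
    """Hop distance from `source` to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def eccentricity(adjacency, node):
    """Distance from `node` to the actor farthest from it."""
    return max(bfs_distances(adjacency, node).values())


def closeness(adjacency, node):
    """Mean shortest-path length from `node` to all other nodes (connected graph assumed)."""
    dist = bfs_distances(adjacency, node)
    return sum(dist.values()) / (len(dist) - 1)


def betweenness(adjacency):
    """Share of shortest paths between other pairs that pass through each node."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        dist, order = {s: 0}, []
        sigma = {v: 0 for v in adjacency}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adjacency}  # predecessors on shortest paths
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):           # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # each undirected pair was counted twice


def eigenvector(adjacency, iterations=200):
    """Power iteration: a node's score is proportional to the sum of its neighbors' scores."""
    score = {v: 1.0 for v in adjacency}
    for _ in range(iterations):
        # Keeping the node's own score in the sum avoids oscillation on bipartite graphs.
        updated = {v: score[v] + sum(score[w] for w in adjacency[v]) for v in adjacency}
        norm = sum(x * x for x in updated.values()) ** 0.5
        score = {v: x / norm for v, x in updated.items()}
    return score


diameter = max(eccentricity(adjacency, v) for v in adjacency)
average_path_length = sum(closeness(adjacency, v) for v in adjacency) / len(adjacency)
print("diameter:", diameter, "average path length:", average_path_length)
print("betweenness:", betweenness(adjacency))
print("eigenvector:", eigenvector(adjacency))
```

In this toy network the same node ranks first on closeness, betweenness, and eigenvector centrality, but the measures need not agree in general, which is why the choice of metric can change which actors appear important.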
A more structural reading of a network consists of identifying strongly connected components, which are subnetworks where each node can be reached from any other node by following directed links, and weakly connected components, where every node can be reached from any other node, ignoring the directionality of the links. This distinction is, of course, only pertinent to directed graphs; in undirected networks one refers simply to connected components. Giant components may emerge from directed and undirected graphs whenever the largest component includes a significant fraction of the graph. Giant components are likely to emerge in networks drawn from social media conversation around a topic, as highly replicated messages and influential users are likely to bridge much of the network. Conversely, networks graphed using survey or snowball methods, along with ego networks, are likely to ignore isolates and the entire network emerges, albeit artificially and as a result of the sampling process, as one giant component. Clustering and bridging are also fundamental concepts one should pay attention to when exploring networks online and offline. Bridges are basically nodes with edges connecting different groups that facilitate intergroup communication, increase social cohesion, and help spur innovation. Bridges and brokers allow diffusion of information between otherwise distant or disconnected communities and perform the exact opposite role to actors deeply embedded in the network. In Granovetter’s terminology, bridges are connected via weak ties, but not every weak tie is necessarily a bridge. Strong ties, such as family relationships, are more likely to be transitive and play an important role in social cohesion, but the regular communication typical of strong ties also means information is highly redundant due to

the high clustering of actors. Weak ties, on the other hand, can bridge and reach far across the network, with the caveat that communication is likely to be infrequent. Recent scholarship has advanced the notion of intermediate ties (Onnela et al., 2007), where novel information is expected to flow through ties of intermediate strength. One concept that brings together micro and structural analysis of networks is homophily, or the tendency of individuals to associate and bond with similar others. Homophily posits that the contact between similar people occurs at a higher rate than among dissimilar people, a thesis encapsulated in the proverb ‘birds of a feather flock together’ and the first law of geography: “everything is related to everything else, but near things are more related than distant things” (Tobler, 1970). It is perhaps unsurprising that this first law is the foundation of spatial dependence and spatial autocorrelation, which are similarly known forces of homophily. Indeed, the pervasiveness of homophily means that cultural, genetic, behavioral, or material information flowing through social networks is likely to be geographically localized, which may run counter to the heterophilous communities where locals and outsiders may intermingle and assemble in diverse groups. In a broad sense, network homophily refers to relational resemblance much like spatial autocorrelation, which then refers to geographic dependence (Adams et al., 2012). The homophily model posits that individuals inhabiting physical communities are more likely to connect with others sharing similar social characteristics (McPherson & Smith-Lovin, 1987; McPherson et al., 2001). As a result, cultural similarities and differences among people can be formalized as a function of geographic propinquity. The question whether online communities are equally driven by homophilous relationships is an empirical question yet to be answered, with scholarship presenting somewhat contrasting evidence. 
Some studies have found that online social networks are more prone to homophily compared with offline networks, the latter being tied to physical locations where serendipitous exposure to social diversity is likelier to happen, while other studies have found that information and influence in online settings are driven by an optimal level of heterophilous connections (Yamamoto & Matsumura, 2009). Homophily can be broadly divided into two distinct types: baseline homophily and inbreeding homophily. Baseline homophily results from the limited potential tie pool due to factors like demography and foci of social activities (Feld, 1981). Inbreeding homophily, on the other hand, is conceptualized as homophily measured over that potential tie pool, including gender, age, ethnicity, religion, social class, and education. Other key characteristics underpinning the most elementary notions of homophily are household location, school, and membership affiliation, which comprise basic measurements of proximity and similarity influencing meeting and interacting opportunities (McPherson et al., 2001). Space thus is an important driver in homophilic relationships and provides opportunities to form and maintain social ties with those who are geographically proximate (Zipf, 2016). In other words, geography is a key constraint to the potential tie pool typical of baseline homophily. Network models often take inbreeding

homophily into account, but they largely assume no baseline homophily effects— that is, the potential tie pool for all actors equals the entire population (Wong et al., 2006). This, of course, is not the case of social networks that overlap across online and offline dimensions, where it may not be possible to establish reliable boundaries for the potential tie pool. Moreover, while geography is typically classified as baseline homophily, reinforcement between geography and online activity may lead to inbreeding homophily (Bastos, Mercea, et al., 2018). There is evidence that homophily emerges in professional and friendship networks as individuals enforce competitive preferences and seek to form ties with the wealthiest, healthiest, and most physically attractive or successful individuals, thereby creating homophilic patterns via the endogenous exclusion of low-health or least attractive members, who are then pushed to form ties with one another (Ali et al., 2012). In online social networks driven by micro-celebrity-logic and influencer notions, ties are invariably and unsurprisingly selected preferentially. This preferential attachment generates a skewed distribution in the number of ties per user, with few users accumulating a large number of followers and most users having only a few ties. In this process of tie formation, which underpins much of the social media supply chain, users seek to befriend and follow socially desirable influencers as opposed to one another, which is an important departure from offline social networks. This results in a network with low levels of clustering and high levels of heterophily, not unlike skewed networks that occur in environments where tie formation is relatively unconstrained such as human sexual contact networks (Centola & van de Rijt, 2015). 
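The distinction between baseline and inbreeding homophily discussed above can be made concrete with a small calculation. The sketch below is one simplified operationalization, not the method of the studies cited: the baseline is taken to be the share of same-group pairs in the whole population (i.e., the potential tie pool is assumed to be everyone), and the inbreeding index is the excess of observed same-group ties over that baseline, rescaled to a maximum of 1. The users, group labels, and ties are hypothetical.

```python
from collections import Counter

# Hypothetical users labelled by group, and the ties observed among them.
group = {"a": "Leave", "b": "Leave", "c": "Leave",
         "d": "Remain", "e": "Remain", "f": "Remain"}
ties = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "d")]


def observed_homophily(ties, group):
    """Share of ties that connect two members of the same group."""
    same = sum(1 for u, v in ties if group[u] == group[v])
    return same / len(ties)


def baseline_homophily(group):
    """Share of same-group pairs if ties formed at random within the population."""
    sizes = Counter(group.values())
    n = len(group)
    same_pairs = sum(size * (size - 1) for size in sizes.values())
    return same_pairs / (n * (n - 1))


def inbreeding_index(ties, group):
    """Homophily in excess of the baseline, rescaled so that 1 means only same-group ties."""
    observed = observed_homophily(ties, group)
    baseline = baseline_homophily(group)
    return (observed - baseline) / (1 - baseline)


print("observed:", observed_homophily(ties, group))
print("baseline:", baseline_homophily(group))
print("inbreeding:", inbreeding_index(ties, group))
```

The result depends entirely on how the potential tie pool is defined; restricting the pool to geographically proximate users, for example, would raise the baseline and lower the inbreeding index for the same set of ties.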
While homophily leads to homogeneous groups that facilitate establishing and cultivating relations, extreme homogenization can act counter to innovation and the generation of new ideas, and when associated with transitivity, it can lead to the formation of cliques or fully connected clusters that can exert influence in neighboring individuals of the network. It is nonetheless unclear the extent to which homophilous dependencies in physical social networks are duplicated, reinforced, or counterbalanced in online social networks (e.g., social media platforms). There are considerable challenges in identifying the directionality of this relationship, which is a necessary component in assessing the role of geography within one’s social network. Cliques are indeed a particularly uninteresting feature of networks from the perspective of a sociologist. No new information can penetrate a clique because every actor is connected to each other. No centrality measures apply to cliques, and there is no core-periphery or small-world structure. Cliques are also fragile: one missing link can disqualify a clique, but the extent to which and how cliques overlap each other may offer interesting analytical insights. Similarly, dense or tight-knit and interconnected networks are not necessarily more efficient in distributing information, as they lack nodes with more structural holes and the information that circulates is more likely to be redundant at any given time compared to a network with more structural holes (Borgatti & Lopez-Kidwell, 2011). Dense networks are also unlikely to be found in real-world situations, whether online or offline, and

therefore network models usually focus on frequently observed network formations such as core-periphery, small-world, and scale-free. These network structures are used to derive the mathematical properties of complex networks driving simple contagion, when actors are infected with some probability at each iteration, or complex contagion, when actors are only infected if a certain number of neighbors are also infected at each iteration. Network models are often based on a clustering coefficient, as real-world networks exhibit this specific property in which two actors are likelier to have an edge if they have a common neighbor. Most real-world networks also present a degree distribution that follows a power-law function; that is, a small number of nodes have a very large degree. This property, also referred to as scale-free, preferential attachment, cumulative advantage, or simply as the-rich-get-richer, stipulates that new edges are to be added to actors that already have a higher-than-average degree, thereby further increasing their degree disproportionately, often exponentially, compared with most other nodes in the network. The result is a network with a long-tailed degree distribution, as it includes only a few highly connected actors directly connected to several other actors with low degree. These networks may, however, have a small-world structure. Defined as the combination of high clustering and short average path length, small-world network formations are also predisposed to preferential attachment. Small-world networks are particularly interesting because they are typical of most cities and are commonly known as ‘real world networks’ (Kaiser & Hilgetag, 2004). These networks are marked by the high incidence of cliques and near-cliques linking strangers by a short chain of acquaintances.
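Two of the properties described above lend themselves to a short sketch: tie formation by preferential attachment and the clustering coefficient that characterizes small-world structures. The growth rule below (each new actor forms a single tie, choosing its target with probability proportional to current degree) is a deliberately simplified assumption, and the function names, network size, and seed are hypothetical choices for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def preferential_attachment(n_nodes, seed=0):
    """Grow a network in which each newcomer ties to one existing actor,
    chosen with probability proportional to that actor's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]  # every actor appears once per tie it holds
    for newcomer in range(2, n_nodes):
        target = rng.choice(endpoints)  # well-connected actors are drawn more often
        edges.append((newcomer, target))
        endpoints.extend([newcomer, target])
    return edges


def local_clustering(neighbors, node):
    """Share of a node's neighbor pairs that are themselves connected."""
    nbrs = neighbors[node]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(sorted(nbrs), 2))
    closed = sum(1 for u, v in pairs if v in neighbors[u])
    return closed / len(pairs)


# Skewed degree distribution produced by preferential attachment.
edges = preferential_attachment(500, seed=42)
degree = Counter(node for edge in edges for node in edge)
print("largest degrees:", sorted(degree.values(), reverse=True)[:5])
print("actors with a single tie:", sum(1 for d in degree.values() if d == 1))

# Clustering on a hypothetical toy network: a triangle (a, b, c) plus a pendant actor d.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
average_clustering = sum(local_clustering(toy, v) for v in toy) / len(toy)
print("clustering of c:", local_clustering(toy, "c"))
print("average clustering:", average_clustering)
```

Because each newcomer forms only one tie, the grown network is a tree and therefore has no clustering at all; it illustrates the long-tailed degree distribution, while the small-world property additionally requires the high clustering shown in the toy example.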
This network formation is not as resilient to targeted attacks as random networks are due to the formation of hubs, but it is often more effective in relaying information by minimizing the number of hops required to reach a given actor in the network. Small-world networking is also particularly important for social movements and the organization of collective action on the ground because of its resistance to change and the filtering performed by highly connected nodes. The original formulation dates to Stanley Milgram’s small-world experiment, which measured the average path length for social networks in the United States by asking random individuals to pass a message to any person they knew who was likely to be closer to a predefined target individual. Milgram (1967) found that human society was a ‘small-world-type’ of network characterized by short path lengths. Indeed, 20% of the initiated chains reached the target with a chain length of 6.5, hence the phrase ‘6 degrees of separation,’ conveying that people are only six social connections away from each other. Milgram’s original experiment was restricted to offline social networks, but Dodds et al. (2003) repeated the study in the early noughties with email messages, only to find an even shorter average path length of four. It is unsurprising that the average path length was shortened as communication networks shifted from posting letters to sending email and again to posting on social platforms. There is a relatively stable number of acquaintances individuals can cultivate at any given time, estimated by Pool and Kochen (1978) to be between

500 and 1500, and an even more constrained number of people individuals can hold a stable relationship with at any given time, estimated by Dunbar (1992) to be 150. Other studies calculated the average number of people US individuals know to be 290 (Killworth et al., 1990; McCarty et al., 2001); different techniques have estimated yet larger networks, with a mean of 610 and a median of 550 people, and projected the adult population to know anywhere between 250 and 1,710 (Wellman, 2012). The variance in the estimates reflects the hierarchically inclusive layers that resulted from the limited amount of time available for social interaction (Miritello et al., 2013). The estimates organize social group size in a scaling ratio of three (Zhou et al., 2005). Each layer accounts for a greater level of interaction frequency and emotional closeness in human relationships (Roberts et al., 2009; Sutcliffe et al., 2012), with the first layers presenting values around 5, 15, 50, and 150, and larger values representing relationships that are less frequent and/or intimate (Hill et al., 2008). The outer layers may include between 500 and 1500 relationships (Dunbar et al., 2015) and correspond to acquaintances or individuals who are not considered personal friends or family. The substantive variation in the estimates for size and number of layers is unsurprising, as the process through which social groups become wider, from relatives to friends—from kin to kith—then to acquaintances and eventually non-related others, is difficult to benchmark and may vary across geographically disparate social groups. Modeling dyadic cooperation is a perennial challenge in social and evolutionary studies, but it is generally accepted that cooperation and persuasion evolved within small groups (Fu et al., 2012), likely developed from parochial altruism—the preference for favoring the members of one’s ethnic, racial, or language group (Bernhard et al., 2006). 
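The scaling ratio of roughly three between successive layers of social group size, mentioned earlier in this passage, can be checked with simple arithmetic. The layer sizes below are the approximate values quoted in the text (5, 15, 50, 150, then 500 and 1,500 for the outer layers); treating them as exact figures is an assumption made only for this illustration.

```python
# Approximate layer sizes quoted in the text, from innermost to outermost.
layers = [5, 15, 50, 150, 500, 1500]

# Ratio between each layer and the one immediately inside it.
ratios = [outer / inner for inner, outer in zip(layers, layers[1:])]
mean_ratio = sum(ratios) / len(ratios)

for inner, outer, ratio in zip(layers, layers[1:], ratios):
    print(f"{inner:>4} -> {outer:>4}: ratio {ratio:.2f}")
print(f"mean scaling ratio: {mean_ratio:.2f}")
```

Each layer is about three times the size of the one it contains, which is consistent with the hierarchically inclusive structure described above.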
Assimilation into a given group or tribe thus required assimilation into the group’s ideological belief system, a psychological construct that ties one’s sense of self to the intrinsic bias of one’s in-group and its worldview (Le & Boyd, 2007). There are real-world constraints applied to social groups formed across geographic boundaries, constraints that may not hold under the extensive interconnectivity allowed by social platforms operating on a wider geographical scale. With Facebook alone extending to one-third of the entire global population, the estimates for second-degree neighbors (i.e., friends of friends) can reach half a million individuals, so that the entire network can be covered via the third- or fourth-degree neighbors (hence the average path length of four). There are, of course, fundamental differences constraining tie formation in social networks online compared to their spatialized, offline counterparts. The spatial dependencies constraining meaningful relationships (i.e., strong ties) are further compounded by the fact that on average 42% of frequent contact ties live within a mere 1 mile radius of a typical person, while the rest of their ties could be directed to individuals far outside their surroundings (Wellman et al., 1988). Dunbar (2016) himself sought to answer this question in a study straightforwardly titled ‘Do Online Social Media Cut through the Constraints That Limit

72 Social and spatial networks

the Size of Offline Social Networks?’ The study revisited the social brain hypothesis, which states that social network size is constrained by cognition and the time involved in servicing relationships, with survey data stratified by age, gender, and regional population size. Dunbar (2016) found that relationships online did not break through the glass ceiling formulated in the social brain hypothesis, as the size and range of online personal social networks (egocentric networks), indexed as the number of Facebook friends, were similar to those of offline face-to-face networks. The number of individuals in the inner layers of their network (formally identified as support clique and sympathy groups) was also found to be similar in size to those observed in offline networks. The first caveat with Dunbar’s estimate is that survey instruments perform particularly poorly at estimating network size, as individuals find it difficult to recall names (Goel & Salganik, 2010). This is compounded by the decision to focus on egocentric networks, which are limited to describing local social environments by measuring the relationships in the vicinity of the actors, whereas online social relationships may conceivably fall outside this parameter (Marsden, 2011). While the results presented by Dunbar (2016) offer evidence that online social networks could not overcome the cognitive constraints on the size of social networks, there was no indication as to whether the online egocentric social networks mapped using the survey instrument overlapped—and if they did, the extent to which they overlapped—with the offline social network. One hypothetical individual could live entirely different lives online and offline, thus rendering two egocentric social networks with no overlap. 
In that scenario, while the egocentric social networks may appear consistent with the threshold size postulated in the social brain hypothesis, they may have objectively doubled that number by simply adding a secondary (online) social network to the preexisting offline social network. Indeed, there is a large body of scholarship probing whether the limit on personal network size observed in the face-to-face world has been altered by the rampant use of social networking platforms, thereby relaxing the otherwise fixed number of individuals one can hold as friends. Wellman (2012) argued that Dunbar’s ‘social brain,’ limited to 150 meaningful relationships, underestimated the breadth of relationships in contemporary Western societies, especially for the two outer layers of acquaintances. Facebook would have increased the carrying capacity of relationships, with heavy internet users having more close ties (Boase, 2008; Wang & Wellman, 2010), and a much larger number of hardly known ties listed as online friends. More critically, Wellman (2012) makes the point that while the social brain of 150 contacts may apply to primates, human societies are fundamentally different in size, diversity, clustering, fragmentation, and, of course, spatial range. Whether or not online social networks effect real changes to one’s circle of friends, there is consensus in the literature that a significant share of one’s social relationships is relatively independent of geography; as such, this segment of one’s social network would likely benefit and potentially thrive on social media platforms, where geography is a lesser barrier to relationships. For large social networks,


Liben-Nowell et al. (2005) have found that one-third of the online friendships are independent of geography and likely derived from other social factors such as occupation or personal interests. Similarly, Wellman et al. (1988) reported that less than half of one’s frequent contact ties live in close proximity, with the rest of the ties linking to actors located in more remote areas. Yet, and unfortunate as it may be, existing models that predict the probability of online friendship solely on the basis of geographic distance are notoriously weak at accounting for these variations.
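The weakness of distance-only models can be illustrated with a minimal sketch that contrasts a pure distance-decay propensity with a mixture reserving a fixed share of ties for geography-independent factors. The one-third share follows the figure reported above; the functional form, the function names, and the parameters `scale` and `exponent` are illustrative assumptions rather than the specification used in the cited studies.

```python
def distance_only(distance_km, scale=1.0, exponent=1.0):
    """Hypothetical distance-decay model: tie propensity falls with distance."""
    # Distances below 1 km are clamped so the propensity never exceeds `scale`.
    return scale / max(distance_km, 1.0) ** exponent


def with_baseline(distance_km, independent_share=1 / 3, scale=1.0, exponent=1.0):
    """Mixture of a geography-independent share and a distance-dependent share."""
    spatial = distance_only(distance_km, scale, exponent)
    return independent_share * scale + (1 - independent_share) * spatial


# Compare the two hypothetical models at increasing distances.
for d in (1, 10, 100, 1000):
    print(d, round(distance_only(d), 4), round(with_baseline(d), 4))
```

Under these assumptions the distance-only propensity collapses toward zero for remote pairs, whereas the mixture levels off near the geography-independent share, which is the kind of variation a purely spatial model cannot reproduce.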

6 SPATIAL ANALYSIS

Spatial analysis refers to a set of techniques applied to the spatial expression of human behavior to accurately describe geographical patterns. This body of methods requires information about the location of the objects and is particularly useful for investigations using point processes with a probability distribution in a finite set of spatial locations (Barthelmé et al., 2012). Much like social network analysis, spatial analysis is marked by the multidimensionality of the object of study, the interdisciplinary nature of the methodological approaches, and strong dependence on computers for data collection and analysis. Together with traditional, often governmental, sources of geographic information, internet-powered devices are producing a wealth of location-rich information ordinarily referred to as the ‘geoweb.’ The availability of such user-generated sources of spatial data, coupled with the rapid development of spatial analysis computing platforms, provides opportunities for quantitative research at unprecedented scales and contributes to bringing these two areas of scholarship into closer contact with one another. Metrics of centrality explored in social network analysis have facilitated the identification and modeling of interpersonal and institutional relationships over time, but the topology of social networks typically comprises nonspatial representations that are limiting because societies and human relationships cannot exist in isolation or detached from spatial embedding. As such, the relational attributes explored in social network analysis coexist among geographic features that affect the forces driving tie selection and retention. Sarkar et al. (2019) sought to develop metrics for characterizing network structure and node importance in spatial social networks, but spatial context is frequently omitted from the list of factors driving social network formation, even if geography is a key vector of influence driving network dynamics (Adams et al., 2012). In other words, while the toolkit of social network analysis includes powerful methods for describing and modeling the structure of relationships among actors,

Spatial analysis 75

it struggles to incorporate spatial information, which would require projecting the relationship between actors not only over the network space, but also over the Euclidean space. In that sense, the analysis of spatial social networks is methodologically similar to the study of multiplex networks, which include multiple overlapping links across heterogeneous networks (Ettlinger, 2003). As such, in spatial networks, actors are simultaneously embedded in network space and Euclidean space, with spatial constraints likely to have a strong effect on their connectivity patterns. Similarly, a strictly geographical approach to relational data does not account for the nuanced and complex interaction in the network space, as actors that are geographically proximate may choose to connect with distant others, thereby forging social ties that exist at a distance and in spite of it. This formation is patently ill-suited for traditional spatial metrics rooted in the notion of proximal effects as predicted by Tobler’s First Law, which stipulates that everything is related to everything else, but near things are more related than distant things. The development of measures, metrics, software, and frameworks for the study of spatial social networks (or spatially embedded networks) is still ongoing, and much remains to be established in the analysis of relational data embedded in space, particularly the crossover of interpersonal connectivity and environmental attachment (Adams et al., 2012; Radil et al., 2010; Sarkar et al., 2019). Spatial social network analysis seeks to fill this gap by advancing a set of metrics and a spatial social network schema that combines the relative distribution of actors in space with the understanding obtained from network topology. 
These seminal studies add to a large body of literature pertaining to propinquity and addressing network structure as a function of distance (Fischer, 1982), but scholarship embedding their systems in the context of multivariate geography to include contextual information about geographic space is still limited (Expert et al., 2011; Sarkar et al., 2019). The emergence of social media platforms, along with a range of geospatial user-generated services, produced a swelling body of content with specific spatial references. This adds to an established, curated, and supervised body of geographic information, including conventional and mobile weather stations, satellite and aerial imagery, weather radar, stream observations and gauge stations, citizen observations, ground and aerial LIDAR, water-quality sampling, gas measures, soil cores, and distributed sensors that measure selected domains such as air temperature and moisture (Kitchin, 2014). These corpora of geographic-tagged content cleared the primary obstacle for researchers exploring the longitudinal and cross-sectional relationship between online and offline contexts with the help of spatial analysis (Graham et al., 2013). The secondary obstacle was cleared by the wide availability and increasing affordability of powerful software applications for spatial analysis, particularly the computational implementation of spatial statistics. Indeed, the computational resources available at the turn of the century required weeks of data preparation before analysis and mapping of distributions could begin. Since then, the increasing computational power of very inexpensive computers has provided formal tools to examine much finer spatial scales, particularly those dealing with local movement such as pedestrian modeling. It is fortunate that


computational resources have also become more widely available and universally cheaper, as the volume of data currently released each day exceeds anything that could be collected in the typical academic lifetime of a generation ago (Cheshire & Batty, 2012). Perhaps unsurprisingly, private companies currently collect more personal information than central governments, with 15 out of 17 business sectors in the US averaging a higher volume of data per company than the Library of Congress (Manyika et al., 2011). Spatial analysis is based on the assumption that the separation of features or events on the planet’s surface provides information about the events under consideration, and that point patterns can be analyzed to identify clustering or dispersion of interacting spatial points. This type of analysis begins with the definition of a window of observation (e.g., a country or a region) where spatial clusters can be detected and inferred (Kulldorff et al., 1998). The transformation of raw spatial data into geographically ordered information can reveal patterns and anomalies that are not immediately obvious, including density estimation and the characterization of neighborhoods. It can also be leveraged to determine relationships between properties of places and used to examine distance effects on the creation of clusters, hotspots, and irregularities. The near complete absence of spatial methods in the toolkit of social media research is an important piece of the puzzle that explains the long-held notion that online social networks mirror everyday social networks. Early internet scholarship welcomed computer-mediated communication as a means to transcend the constraints of physical environments. While geography, politics, and language define terrestrial and tangible societal points of human contact, social media affordances would provide a forum for community-building and human contact that could reach far beyond the limits of geography. 
But the existence of geocoded information in user-generated data, along with the multiplicative effect of network size in social platforms, allowed social media platforms to reconnect online communities to the physical geographies of cities, and to embed location-aware, geographically referenced data in social media content. These possibilities were accelerated as data from the Global Positioning System (GPS) were further complemented by remote and mobile sensors and short-range wireless technologies such as RFID, NFC, and Bluetooth. This resulted in highly granular georeferenced user data linked to the information streams generated by and targeted to users. Miller and Goodchild (2014) argued that the context for geographic research has shifted from a data-scarce to a data-rich environment. As a result, data-driven approaches to geography have emerged to respond to the wealth of georeferenced data flowing from sensors and software that digitize and store a broad spectrum of social, economic, political, and environmental patterns and processes (Crampton et al., 2013; Kitchin, 2013). The ease of collecting, storing, and processing digital data led scholars to inquire about a fourth paradigm of science in which researchers interrogate the world through large-scale, complex instruments and systems that relay observations to large databases subsequently processed and stored as binary data points (Hey, 2012).


The rapid development of such databases allowed the ever-expanding online communities and social media groups to link to ambient populations where users inhabit, work, or visit (Andresen, 2011; Malleson & Andresen, 2015). The collapsing of online and offline contexts can be plainly expressed by conjuring up the concepts of space and place. While space refers to geometrical arrangements that structure, constrain, or enable certain forms of movement and interaction, place refers to the mechanisms through which social meanings are attached to locations and settings. Therefore, while space refers to a dimension of spatiality that is mostly geometric, mathematical, or physical, place refers to an aspect of the spatial context that is necessarily social and cultural (Harrison & Dourish, 2006). One’s social experience and cognitive mapping of one’s place in the world is invariably defined by continuous negotiation between place and space, that is, the relationship between the materiality of the physical space and the social dynamics embedded in it, a negotiation that could be suddenly remapped toward a new sense of place fashioned in day-to-day communication within online social platforms. Radil et al. (2010) posited the need for a concept of social position and the related techniques of structural equivalence as a means to integrate two different kinds of embeddedness: relative location in the geographic space and structural position in the network space. In opposition to the one-dimensionality of social network analysis, where actors are positioned in a social network, this approach would allow for the spatialization of social networks to include actors’ simultaneous positions in networks of relations and places. The technique advanced by Radil et al. 
(2010) supports the analysis of simultaneous embeddedness of actors in both network space and geographic space, thereby bridging different conceptual approaches to centrality and embeddedness, and facilitating the analysis of social behavior within network position and geographic space. This set of problems exceeds the traditional boundaries of geography as a discipline identified with topographic, chorographic, and geographic inquiries (Curry, 2002), as geographers tend to think of embeddedness in purely ‘territorial’ ways (Radil et al., 2010). Spatial analysis can nonetheless reach beyond these constraints, as well as those of ‘urban computing,’ mostly dedicated to analyzing the role of technology in urban experiences (Paulos & Jenkins, 2005). Yet, the multiple dimensions of social media affordances and the dislocated spatiality of social platforms far exceed the relatively narrow scope of analysis focused on the spatial dispersion of the data. The overlapping of online and offline social networks requires one to look at the intersections, superpositions, and overlapping of different spatial systems and various information streams associated with socially constructed phenomena and epistemological assumptions derived from ‘virtuality.’ The complex feedback loop between online and offline practices is typically salient in the context of social media, which urges users to re-encounter everyday space as they circulate in and out of online communities. This feedback loop considerably extends the community-building potential associated with digital media as users swiftly move across many geographic communities. Beyond the new channels of sociality enabled by the emergence of social media technologies, and the


extension of socially meaningful physical places to internet-enabled forums, a host of theoretical questions have emerged that speak to core problems resulting from the complex interaction of cultures, politics, policies, and governments that require contextual explanation of patterns and correlations (Kitchin, 2014). Empirical research, on the other hand, has struggled to incorporate the spatial dimension in modeling social media usage and activity on location (Bastos et al., 2015; Bennett et al., 2014). Existing metrics of centrality in social network analysis rely exclusively on topology and disregard the spatial embedding of the network. As such, they cannot take into account the role individuals or users play at different spatial scales. The challenge in integrating spatial and social network analysis stems from their different definitions of distance. In social network analysis, distance is measured as the number of hops from one actor to the next within the topological space of the network. In geography and spatial analysis, distance is measured by (x, y) coordinates in Euclidean space. As such, traditional metrics of centrality in social network analysis can hardly provide relevant information about spatial dependencies in any given network. Indeed, research has found no relationship between the topological centrality of actors in a network and their geographic centrality within communities (Onnela et al., 2011). Another key contribution to the problem stems from the theory of social impact formulated by Latané (1981), which states that the time spent with alters and their influence on ego is a function of the inverse square of distance. Testing the theory decades later, Latané et al. (1995) concluded that the average number of interactions individuals classify as noteworthy or memorable was proportional to the inverse of the distance at which individuals live. In other quarters, Sarkar et al. 
(2019) addressed the problem by developing the SS tuning parameter (α) to describe the extent to which nodes favor contacts that are near or distant. This tuning parameter could be combined with topological centrality measures to ascertain whether an actor is deemed topologically important merely by its ability to connect its neighbors, but it may not be efficient in connecting clusters of actors over distance. This relationship was formalized with a spatial social network schema comprising Near Strangers (spatially close actors who are topologically far), Far Strangers (spatially distant actors who are topologically far), Near Friends (spatially close actors who are topologically connected), and Far Friends (spatially distant actors who are topologically connected). These axes represent a measure of how large the network is both socially and spatially. In Figure 1a, the actors Cesar and Charlie are geographically proximate, with Cesar in the London area and Charlie in the East of England region, but Cesar and Charlie are topologically far from one another, as it takes a minimum of five hops to reach one actor from the other. Conversely, Barbara and Charlie are spatially distant but topologically close (Far Friends), with Barbara placed in the North of England and Charlie in the East of England, but both are topologically connected to each other. In the network shown in Figure 1a, actors can be spatially close but topologically far, a sharp contrast to the perfectly clustered network shown in Figure 1b, where spatial proximity is associated with topological connectivity. The spatial


FIGURE 1A Network formation with actors who are spatially close, but topologically far.

dependencies of the network shown in Figure 1b result from actors clustering at different Euclidean distances inside the bounded space of the United Kingdom, so that the probability of forming an edge is reduced as a function of the distance between the actors. As such, actors in the network shown in Figure 1b are clustered both spatially and topologically, hence assembled in perfect harmony with the first


FIGURE 1B Network formation with actors who are clustered spatially and topologically.

law of geography, but also consistent with the proposition in graph theory that clustered nodes tend to be well connected with relatively few connections to other distant clusters (Entwisle et al., 2007). In the formalization presented by Sarkar et al. (2019), the network in Figure 1b includes only Near Friends, while Figure 1a includes many Near Friends and Far Strangers, but also a few Far Friends (e.g., Barbara and Charlie) and Near Strangers (Charlie and Cesar).
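The four categories of the schema can be made concrete with a short sketch that labels a pair of actors from two inputs: their coordinates and the network's adjacency list. The actor names follow the example in the text, but the coordinates, the intermediary actors `P1` to `P4`, the `near_threshold` cut-off, and the rule that only a direct tie counts as 'topologically connected' are all assumptions made for illustration; they are not taken from Sarkar et al. (2019).

```python
from collections import deque
from math import hypot


def hop_distance(adjacency, source, target):
    """Breadth-first search: number of hops between two actors (None if unreachable)."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


def classify_dyad(a, b, positions, adjacency, near_threshold):
    """Label a pair of actors as Near/Far (space) and Friends/Strangers (topology)."""
    (xa, ya), (xb, yb) = positions[a], positions[b]
    proximity = "Near" if hypot(xa - xb, ya - yb) <= near_threshold else "Far"
    # Assumption: only a direct tie makes two actors "Friends".
    relation = "Friends" if b in adjacency.get(a, ()) else "Strangers"
    return f"{proximity} {relation}"


# Toy coordinates in arbitrary units (hypothetical, not real locations).
positions = {
    "Cesar": (0, 0), "Charlie": (1, 1), "Barbara": (0, 10),
    "P1": (5, 5), "P2": (6, 8), "P3": (8, 6), "P4": (9, 9),
}
# Undirected ties stored in both directions.
adjacency = {
    "Cesar": ["P1"], "P1": ["Cesar", "P2"], "P2": ["P1", "P3"],
    "P3": ["P2", "P4"], "P4": ["P3", "Charlie"],
    "Charlie": ["P4", "Barbara"], "Barbara": ["Charlie"],
}

print(hop_distance(adjacency, "Cesar", "Charlie"))                 # 5
print(classify_dyad("Cesar", "Charlie", positions, adjacency, 2))    # Near Strangers
print(classify_dyad("Barbara", "Charlie", positions, adjacency, 2))  # Far Friends
```

The sketch keeps the two notions of distance separate on purpose: the hop count comes from the adjacency list alone, while the Near/Far label comes from the coordinates alone.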


This approach requires both the topological information about the graph, specified by the adjacency matrix, and the spatial information about the nodes, given by the position of the nodes (Barthelemy, 2014). These distinctions are important because methods for uncovering useful information from social networks embedded in space are poorly developed, with social network analysis often relying on network metrics and ignoring the spatial arrangement of nodes. The topology of the network is believed to offer enough information to analyze dyadic relationships, thereby disregarding the fact that useful measures for nonspatial networks might yield irrelevant or trivial results for spatial ones. The examples mentioned above centered on clustering coefficients illustrate this perennial problem: because spatial networks are frequently clustered along spatial coordinates, degree distribution is likely to correlate with spatial variables, with high-degree nodes being suppressed by long-distance costs (Expert et al., 2011). The approach of Sarkar et al. (2019) to spatial social networks (SSN) provides a measure of socio-spatial network tightness called the flattening ratio. These two network-level methods characterize the spatial structuring of the social network in terms of distance distribution of connections. In general terms, these methods provide an intuition of where friends are located, how easy it is to meet nth degree mutual friends, and how fast information will percolate to distant places. The scaling parameter α, the metric that modifies the interpretation of distance between nodes to interpret it as either beneficial or detrimental, quantifies the importance of topology at different spatial scales. The modified centrality metrics help distinguish between individuals who have many local friends and those with few local friends but many long-distance friends. 
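The role of a distance-tuning exponent can be illustrated with a minimal sketch of a distance-weighted degree. The formula below, in which each tie contributes its Euclidean length raised to the power α, is an assumed form chosen for illustration; it should not be read as the metric published by Sarkar et al. (2019), and the function name, actors, and coordinates are hypothetical.

```python
from math import hypot


def distance_weighted_degree(node, positions, adjacency, alpha):
    """Sum of tie lengths raised to `alpha` (an assumed form, not the published metric).

    alpha > 0 rewards long-distance ties, alpha < 0 rewards nearby ties,
    and alpha == 0 reduces to the ordinary degree.
    Assumes no two tied actors share exactly the same position.
    """
    x0, y0 = positions[node]
    total = 0.0
    for neighbour in adjacency[node]:
        x1, y1 = positions[neighbour]
        total += hypot(x1 - x0, y1 - y0) ** alpha
    return total


# Two hypothetical actors with the same degree but different spatial reach.
positions = {
    "local": (0, 0), "a": (1, 0), "b": (0, 1), "c": (-1, 0),
    "bridge": (10, 0), "d": (11, 0), "e": (10, 100), "f": (110, 0),
}
adjacency = {"local": ["a", "b", "c"], "bridge": ["d", "e", "f"]}

for alpha in (-1, 0, 1):
    print(alpha,
          round(distance_weighted_degree("local", positions, adjacency, alpha), 2),
          round(distance_weighted_degree("bridge", positions, adjacency, alpha), 2))
```

With α = 0 the two actors are indistinguishable, since both have three ties. A positive α favors the actor with long-distance ties and a negative α favors the one whose ties are all local, which mirrors the distinction drawn above between locally embedded actors and those with many long-distance friends.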
Another critical challenge in mapping social relationships across space is that while links in spatial networks may be spatially embedded, social relationships are virtual in nature. In other words, networks where actors or organizations occupy clear positions in a Euclidean space allow for a straightforward rendering of their links and connections. But the spatial embedding of friendships across online and offline social networks is fundamentally different from the geographically bordered links established by roads or railway lines in transportation networks or the layout of cables connecting network and power grid systems. Despite the spatial dependencies of social networks, whether online or offline, the relationships tying individuals together are by definition cognitive and virtual, much like the functional connectivity in brain networks. In both cases, space plays a central role by affecting, directly or indirectly, network connectivity and rendering its architecture radically different from that of random networks (Gastner & Newman, 2006). Beyond the methodological limitations described in this chapter, there are also considerable limitations associated with the availability of data following the Cambridge Analytica data scandal and the ensuing data lockdown enforced by social media platforms (Walker et al., 2019). While developments in computer infrastructure have improved capacity for accessing and analyzing large-scale


social network and spatial data, social platforms have become notoriously hostile toward data sharing. Another key challenge in collating social networks embedded in space is that geographic information collected from users may vary considerably in terms of reliability and precision. As such, the availability of data derived from social media needs to be considered within the framework of social sciences research, and particularly in the context of the challenges facing data-driven research that deals with populations instead of samples, messy rather than clean data, and frequent correlations that only rarely allow for causal inference (Miller & Goodchild, 2014). This source of spatial data can sit side by side with survey data based on samples and randomization, but there are important caveats in aggregating social media with survey data. If representative, random sampling tends to work well in social science research. But sampled data lack extensibility for secondary uses. Even when the sampling is appropriate, it is often of no use beyond the rationale used for sampling the data. For example, a sampling rate of one in six is adequate for a given set of purposes, but problematic for the analysis of comparatively uncommon subcategories, which is often the case in studying social media activity. Random sampling also requires a process for enumerating and selecting from the population (a sampling frame) that is potentially problematic if enumeration is incomplete. Because randomness is so critical, one must carefully plan for sampling, and it may be difficult to repurpose the data beyond the scope for which it was collected (Mayer-Schönberger & Cukier, 2013). In contrast, many of the social media and ‘geoweb’ data sources offer complete populations, not just samples. Yet, working with populations instead of samples is also not without its problems. 
For one, populations are often self-selected rather than sampled—that is, Twitter or Facebook users comprise individuals who willingly signed up for the service; mobile data is restricted to individuals who carry smartphones; and geocoded tweets are restricted not only to users who willingly signed up for Twitter but to the subset of Twitter users who deliberately agreed to have their messages automatically geotagged and made available to the public. Considering it is currently not possible to know the demographic characteristics of any of these groups, it is also untenable to generalize from the results derived from this data to any larger populations from which they might have been drawn (Miller & Goodchild, 2014: 3–4). Despite these considerable caveats, the combination of spatial analysis and social network analysis can help to unveil the complex interactions between the offline space and online activity that have arguably reconfigured many of the social coordinates in Western industrialized societies. This, of course, requires researchers to gain access to geographically rich social media data, so that user, community, or post information can be organized as point processes within a window of observation to detect and/or infer spatial clusters. Spatialized social media data, hard as it may be to come by, can offer the necessary granularity to test a probability distribution in a finite set of spatial locations.
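As a minimal illustration of how points inside a window of observation can be screened for clustering or dispersion, the sketch below computes the Clark-Evans nearest-neighbour ratio. This index is brought in here purely as a simple example; it is not the scan statistic of Kulldorff et al. (1998) cited earlier in the chapter, and the point sets and window size are hypothetical. No edge correction is applied, so values near the window boundary are approximate.

```python
from math import hypot, sqrt


def clark_evans_ratio(points, window_area):
    """Observed mean nearest-neighbour distance divided by its expectation
    under complete spatial randomness (requires at least two points)."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    # Expected mean nearest-neighbour distance for a random pattern of
    # intensity n / window_area is 1 / (2 * sqrt(intensity)).
    expected = 0.5 / sqrt(n / window_area)
    return observed / expected


# Hypothetical 10 x 10 window of observation (area 100).
clustered = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
dispersed = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]

print(clark_evans_ratio(clustered, 100.0))  # well below 1: clustering
print(clark_evans_ratio(dispersed, 100.0))  # above 1: dispersion
```

Ratios below 1 suggest clustering, ratios above 1 suggest dispersion, and values close to 1 are compatible with a random pattern. With geocoded posts or users as the points, the same calculation gives a first indication of whether activity concentrates in particular places within the chosen window.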


References
Adams, J., Faust, K., & Lovasi, G. S. (2012). Capturing context: Integrating spatial and social network analyses. Social Networks, 34(1), 1–5. doi: 10.1016/j.socnet.2011.10.007
Ali, M. M., Amialchuk, A., & Rizzo, J. A. (2012). The influence of body weight on social network ties among adolescents. Economics & Human Biology, 10(1), 20–34.
An, W. E. (2011). Models and methods to identify peer effects. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications.
Andresen, M. A. (2011). The ambient population and crime analysis. The Professional Geographer, 63(2), 193–212.
Backstrom, L., Boldi, P., Rosa, M., Ugander, J., & Vigna, S. (2012). Four Degrees of Separation. Proceedings of the 4th Annual ACM Web Science Conference, Evanston, IL.
Bail, C. A., Guay, B., Maloney, E., Combs, A., Hillygus, D. S., Merhout, F., et al. (2019). Assessing the Russian Internet Research Agency’s impact on the political attitudes and behaviors of American Twitter users in late 2017. Proceedings of the National Academy of Sciences, 201906420. doi: 10.1073/pnas.1906420116
Bandura, A. (2001). Social cognitive theory of mass communication. Media Psychology, 3(3), 265–299. doi: 10.1207/S1532785XMEP0303_03
Barthelemy, M. (2014). Spatial networks. In R. Alhajj & J. Rokne (Eds.), Encyclopedia of Social Network Analysis and Mining (pp. 1967–1976). New York, NY: Springer.
Barthelmé, S., Trukenbrod, H., Engbert, R., & Wichmann, F. (2012). Modelling fixation locations using spatial point processes. arXiv preprint arXiv:1207.2370.
Bastos, M. T., Mercea, D., & Baronchelli, A. (2018). The geographic embedding of online echo chambers: Evidence from the Brexit campaign. PLOS One, 13(11), e0206841. doi: 10.1371/journal.pone.0206841
Bastos, M. T., Mercea, D., & Charpentier, A. (2015). Tents, Tweets, and events: The interplay between ongoing protests and social media. Journal of Communication, 65(2), 320–350. doi: 10.1111/jcom.12145
Bastos, M. T., Piccardi, C., Levy, M., McRoberts, N., & Lubell, M. (2018). Core-periphery or decentralized? Topological shifts of specialized information on Twitter. Social Networks, 52(Supplement C), 282–293. doi: 10.1016/j.socnet.2017.09.006
Bennett, W. L., Segerberg, A., & Walker, S. (2014). Organization in the crowd: Peer production in large-scale networked protests. Information, Communication & Society, 17(2), 232–260. doi: 10.1080/1369118x.2013.870379
Bernhard, H., Fischbacher, U., & Fehr, E. (2006). Parochial altruism in humans. Nature, 442(7105), 912–915.
Bineham, J. L. (1988). A historical account of the hypodermic model in mass communication. Communication Monographs, 55(3), 230–246. doi: 10.1080/03637758809376169
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
Boase, J. (2008). Personal networks and the personal communication system: Using multiple media to connect. Information, Communication & Society, 11(4), 490–508.
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., & Hwang, D.-U. (2006). Complex networks: Structure and dynamics. Physics Reports, 424(4–5), 175–308.
Bonacich, P. (1987). Power and centrality: A family of measures. American Journal of Sociology, 92(5), 1170–1182.
Borgatti, S. P., & Lopez-Kidwell, V. (2011). Network theory. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications.

Burt, R. S. (1978). Cohesion versus structural equivalence as a basis for network subgroups. Sociological Methods & Research, 7(2), 189–212.
Butler, D., & Stokes, D. (1969). Political Change in Britain: Forces Shaping Electoral Choice. London: Macmillan.
Centola, D., & van de Rijt, A. (2015). Choosing your network: Social preferences in an online health community. Social Science & Medicine, 125, 19–31. doi: 10.1016/j.socscimed.2014.05.019
Cheshire, J., & Batty, M. (2012). Visualisation tools for understanding big data. Environment and Planning B: Planning and Design, 39(3), 413–415.
Crampton, J. W., Graham, M., Poorthuis, A., Shelton, T., Stephens, M., Wilson, M. W., et al. (2013). Beyond the geotag: Situating ‘big data’ and leveraging the potential of the geoweb. Cartography and Geographic Information Science, 40, 130–139.
Curry, M. R. (2002). Discursive displacement and the seminal ambiguity of space and place. In L. Lievrouw & S. Livingstone (Eds.), The Handbook of New Media (pp. 502–517). London: SAGE Publications.
De Veirman, M., Cauberghe, V., & Hudders, L. (2017). Marketing through Instagram influencers: The impact of number of followers and product divergence on brand attitude. International Journal of Advertising, 36(5), 798–828. doi: 10.1080/02650487.2017.1348035
Dilliplane, S. (2014). Activation, conversion, or reinforcement? The impact of partisan news exposure on vote choice. American Journal of Political Science, 58(1), 79–94. doi: 10.1111/ajps.12046
Dodds, P. S., Muhamad, R., & Watts, D. J. (2003). An experimental study of search in global social networks. Science, 301(5634), 827–829.
Duflo, E., Dupas, P., & Kremer, M. (2011). Peer effects, teacher incentives, and the impact of tracking: Evidence from a randomized evaluation in Kenya. American Economic Review, 101(5), 1739–1774.
Dunbar, R. I. M. (1992). Neocortex size as a constraint on group size in primates. Journal of Human Evolution, 22(6), 469–493. doi: 10.1016/0047-2484(92)90081-J
Dunbar, R. I. M. (2016). Do online social media cut through the constraints that limit the size of offline social networks? Royal Society Open Science, 3(1), 150292. doi: 10.1098/rsos.150292
Dunbar, R. I. M., Arnaboldi, V., Conti, M., & Passarella, A. (2015). The structure of online social networks mirrors those in the offline world. Social Networks, 43, 39–47. doi: 10.1016/j.socnet.2015.04.005
Ellul, J. (1965). Propaganda: The Formation of Men’s Attitudes. New York, NY: Alfred A. Knopf.
Entwisle, B., Faust, K., Rindfuss, R. R., & Kaneda, T. (2007). Networks and contexts: Variation in the structure of social ties. American Journal of Sociology, 112(5), 1495–1533.
Ettlinger, N. (2003). Cultural economic geography and a relational and microspace approach to trusts, rationalities, networks, and change in collaborative workplaces. Journal of Economic Geography, 3(2), 145–171.
Expert, P., Evans, T. S., Blondel, V. D., & Lambiotte, R. (2011). Uncovering space-independent communities in spatial networks. Proceedings of the National Academy of Sciences, 108(19), 7663–7668.
Feld, S. L. (1981). The focused organization of social ties. American Journal of Sociology, 86(5), 1015–1035.
Fischer, C. S. (1982). To Dwell Among Friends: Personal Networks in Town and City. Chicago: University of Chicago Press.
Fu, F., Tarnita, C. E., Christakis, N. A., Wang, L., Rand, D. G., & Nowak, M. A. (2012). Evolution of in-group favoritism. Scientific Reports, 2(1), 460. doi: 10.1038/srep00460

Garcia, D. (2017). Leaking privacy and shadow profiles in online social networks. Science Advances, 3(8), e1701172. doi: 10.1126/sciadv.1701172
Gastner, M. T., & Newman, M. E. J. (2006). The spatial structure of networks. The European Physical Journal B - Condensed Matter and Complex Systems, 49(2), 247–252. doi: 10.1140/epjb/e2006-00046-8
Goel, S., & Salganik, M. J. (2010). Assessing respondent-driven sampling. Proceedings of the National Academy of Sciences, 107(15), 6743–6747.
González-Bailón, S., Borge-Holthoefer, J., Rivero, A., & Moreno, Y. (2011). The dynamics of protest recruitment through an online network. Scientific Reports, 1. doi: 10.1038/srep00197
Graham, M., Zook, M., & Boulton, A. (2013). Augmented reality in urban places: Contested content and the duplicity of code. Transactions of the Institute of British Geographers, 38(3), 464–479. doi: 10.1111/j.1475-5661.2012.00539.x
Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360–1380.
Harrison, S., & Dourish, P. (2006). Re-Space-ing Place: Place and Space Ten Years on. Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work, Banff, Alberta, Canada.
Hey, T. (2012). The fourth paradigm – data-intensive scientific discovery. In S. Kurbanoğlu, U. Al, P. Erdoğan, Y. Tonta, & N. Uçak (Eds.), E-Science and Information Management (Vol. 317, p. 1). Berlin, Heidelberg: Springer.
Hilbert, M., Vásquez, J., Halpern, D., Valenzuela, S., & Arriagada, E. (2016). One step, two step, network step? Complementary perspectives on communication flows in Twittered citizen protests. Social Science Computer Review. doi: 10.1177/0894439316639561
Hill, R. A., Bentley, R. A., & Dunbar, R. I. (2008). Network scaling reveals consistent fractal pattern in hierarchical mammalian societies. Biology Letters, 4(6), 748–751.
Huckfeldt, R. R., & Sprague, J. (1995). Citizens, Politics and Social Communication: Information and Influence in an Election Campaign. New York, NY: Cambridge University Press.
Johnston, R., & Pattie, C. (2011). Social networks, geography and neighbourhood effects. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications.
Jowett, G. S., & O’Donnell, V. (2014). Propaganda & Persuasion. London: SAGE Publications.
Kaiser, M., & Hilgetag, C. C. (2004). Spatial growth of real-world networks. Physical Review E, 69(3), 036103.
Katz, E. (1957). The two-step flow of communication: An up-to-date report on an hypothesis. Public Opinion Quarterly, 21(1), 61–78. doi: 10.1086/266687
Katz, M. L., & Shapiro, C. (1985). Network externalities, competition, and compatibility. The American Economic Review, 75(3), 424–440.
Killworth, P. D., Johnsen, E. C., Bernard, H. R., Shelley, G. A., & McCarty, C. (1990). Estimating the size of personal networks. Social Networks, 12(4), 289–312. doi: 10.1016/0378-8733(90)90012-X
Kitchin, R. (2013). Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography, 3(3), 262–267. doi: 10.1177/2043820613513388
Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). doi: 10.1177/2053951714528481
Kulldorff, M., Athas, W. F., Feuer, E. J., Miller, B. A., & Key, C. R. (1998). Evaluating cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico. American Journal of Public Health, 88(9), 1377–1380.
Lasswell, H. D. (1948). The structure and function of communication in society. The Communication of Ideas, 37, 215–228.

Latané, B. (1981). The psychology of social impact. American Psychologist, 36(4), 343.
Latané, B. (1996). Dynamic social impact: The creation of culture by communication. Journal of Communication, 46(4), 13–25.
Latané, B., Liu, J. H., Nowak, A., Bonevento, M., & Zheng, L. (1995). Distance matters: Physical space and social impact. Personality and Social Psychology Bulletin, 21(8), 795–805.
Lazarsfeld, P., Berelson, B., & Gaudet, H. (1948). The People’s Choice: How the Voter Makes Up His Mind in a Presidential Campaign (2nd ed.). New York, NY: Columbia University Press.
Lazarsfeld, P., & Merton, R. (1954). Friendship as a social process: A substantive and methodological analysis. Freedom and Control in Modern Society, 18(1), 18–66.
Le, S., & Boyd, R. (2007). Evolutionary dynamics of the continuous iterated Prisoner’s dilemma. Journal of Theoretical Biology, 245(2), 258–267. doi: 10.1016/j.jtbi.2006.09.016
Liben-Nowell, D., Novak, J., Kumar, R., Raghavan, P., & Tomkins, A. (2005). Geographic routing in social networks. Proceedings of the National Academy of Sciences, 102(33), 11623–11628.
Linebarger, P. (1948). Psychological Warfare. Washington, DC: Infantry Journal Press.
Malleson, N., & Andresen, M. A. (2015). Spatio-temporal crime hotspots and the ambient population. Crime Science, 4(1), 1–8.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., et al. (2011). Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute.
Marsden, P. V. (2011). Survey methods for network data. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications.
Mayer-Schönberger, V., & Cukier, K. (2013). Big Data: A Revolution that will Transform How We Live, Work, and Think. New York, NY: Houghton Mifflin Harcourt.
McCarty, C., Killworth, P. D., Bernard, H. R., Johnsen, E. C., & Shelley, G. A. (2001). Comparing two methods for estimating network size. Human Organization, 60(1), 28–39.
McPherson, M., & Smith-Lovin, L. (1987). Homophily in voluntary organizations: Status distance and the composition of face-to-face groups. American Sociological Review, 52(3), 370–379. doi: 10.2307/2095356
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415–444. doi: 10.1146/annurev.soc.27.1.415
Milgram, S. (1967). The small world problem. Psychology Today, 2(1), 60–67.
Miller, H., & Goodchild, M. (2014). Data-driven geography. GeoJournal, 80(4), 449–461.
Miller, W. L. (1977). Electoral Dynamics in Britain Since 1918. London: Macmillan.
Miritello, G., Moro, E., Lara, R., Martínez-López, R., Belchamber, J., Roberts, S. G., et al. (2013). Time as a limited resource: Communication strategy in mobile phone networks. Social Networks, 35(1), 89–95.
Mok, D., Wellman, B., & Carrasco, J. (2010). Does distance matter in the age of the Internet? Urban Studies, 47(13), 2747–2783.
Munger, K., & Phillips, J. (2020). Right-wing YouTube: A supply and demand perspective. The International Journal of Press/Politics. doi: 10.1177/1940161220964767
Myers, S. A., Sharma, A., Gupta, P., & Lin, J. (2014). Information Network or Social Network? The Structure of the Twitter Follow Graph. Paper presented at the 23rd International Conference on World Wide Web, Seoul, Korea.
Newman, M. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582.
Nowak, A., Lewenstein, M., & Frejlak, P. (1996). Dynamics of public opinion and social change. In R. Hegselmann & H.-O. Peitgen (Eds.), Modelle sozialer Dynamiken: Ordnung, Chaos und Komplexität. Wien: Hölder-Pichler-Tempsky.

Nowak, A., Szamrej, J., & Latané, B. (1990). From private attitude to public opinion: A dynamic theory of social impact. Psychological Review, 97(3), 362–376.
Oeldorf-Hirsch, A., & Sundar, S. S. (2015). Posting, commenting, and tagging: Effects of sharing news stories on Facebook. Computers in Human Behavior, 44, 240–249.
Onnela, J.-P., Arbesman, S., González, M. C., Barabási, A.-L., & Christakis, N. A. (2011). Geographic constraints on social network groups. PLOS One, 6(4), e16939.
Onnela, J.-P., Saramäki, J., Hyvönen, J., Szabó, G., Lazer, D., Kaski, K., et al. (2007). Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences, 104(18), 7332–7336.
Paulos, E., & Jenkins, T. (2005). Urban Probes: Encountering our Emerging Urban Atmospheres. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Portland, OR.
Pool, I. d. S., & Kochen, M. (1978). Contacts and influence. Social Networks, 1(1), 5–51.
Preciado, P., Snijders, T. A., Burk, W. J., Stattin, H., & Kerr, M. (2012). Does proximity matter? Distance dependence of adolescent friendships. Social Networks, 34(1), 18–31.
Radil, S. M., Flint, C., & Tita, G. E. (2010). Spatializing social networks: Using social network analysis to investigate geographies of gang rivalry, territoriality, and violence in Los Angeles. Annals of the Association of American Geographers, 100(2), 307–326. doi: 10.1080/00045600903550428
Roberts, S. G., Dunbar, R. I., Pollet, T. V., & Kuppens, T. (2009). Exploring variation in active network size: Constraints and ego characteristics. Social Networks, 31(2), 138–146.
Rogers, E. M. (1983). Diffusion of Innovations. New York, NY: The Free Press.
Sarkar, D., Andris, C., Chapman, C. A., & Sengupta, R. (2019). Metrics for characterizing network structure and node importance in Spatial Social Networks. International Journal of Geographical Information Science, 33(5), 1017–1039. doi: 10.1080/13658816.2019.1567736
Simmel, G. (1908). Soziologie: Untersuchungen über die Formen der Vergesellschaftung. Berlin: Duncker & Humblot.
Sutcliffe, A., Dunbar, R., Binder, J., & Arrow, H. (2012). Relationships and the social brain: Integrating psychological and evolutionary perspectives. British Journal of Psychology, 103(2), 149–168.
Taylor, P. M. (2003). Munitions of the Mind: A History of Propaganda from the Ancient World to the Present Era. Manchester: Manchester University Press.
Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(sup1), 234–240.
Verma, A. K., Ramaiyer, V., & Prakash, H. (2013). USA Patent No.: I. Yahoo.
Walker, S., Mercea, D., & Bastos, M. T. (2019). The disinformation landscape and the lockdown of social platforms. Information, Communication and Society, 22(11), 1531–1543. doi: 10.1080/1369118X.2019.1648536
Wang, H., & Wellman, B. (2010). Social connectivity in America: Changes in adult friendship network size from 2002 to 2007. American Behavioral Scientist, 53(8), 1148–1169.
Wasserman, S., & Faust, K. (1994). Social Network Analysis. Cambridge: Cambridge University Press.
Wellman, B. (2012). Is Dunbar’s number up? British Journal of Psychology, 103(2), 174–176.
Wellman, B., Carrington, P. J., & Hall, A. (1988). Networks as personal communities. In S. D. Berkowitz & B. Wellman (Eds.), Social Structures: A Network Approach. Cambridge: Cambridge University Press.
Wong, L. H., Pattison, P., & Robins, G. (2006). A spatial model for social networks. Physica A: Statistical Mechanics and its Applications, 360(1), 99–120.
Wu, S., Hofman, J. M., Mason, W. A., & Watts, D. J. (2011). Who Says What to Whom on Twitter. Paper presented at the 20th International Conference on World Wide Web, New York, NY.

Yamamoto, H., & Matsumura, N. (2009, August 29–31). The Power of Grassroots Influentials: The Optimal Heterophily between Sender and Receiver. Paper presented at the 2009 International Conference on Computational Science and Engineering, Vancouver, Canada.
Zhang, Y., Wells, C., Wang, S., & Rohe, K. (2018). Attention and amplification in the hybrid media system: The composition and activity of Donald Trump’s Twitter following during the 2016 presidential election. New Media & Society, 20(9), 3161–3182. doi: 10.1177/1461444817744390
Zhou, W.-X., Sornette, D., Hill, R. A., & Dunbar, R. I. (2005). Discrete hierarchical organization of social group sizes. Proceedings of the Royal Society B: Biological Sciences, 272(1561), 439–444.
Zipf, G. K. (2016). Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Ravenio Books.

SECTION III

Social networks online and offline
The dyadic interaction of social and spatial

7 SPATIAL AND SOCIAL MEDIA DATA

The dyadic interaction social ↔ spatial is perhaps the most challenging to unpack because social ties can be formed both online and offline. This dyad is central to the confounding problem of determining social network effects across spatial boundaries. These effects can be driven by online social ties, offline social ties, or the complex and multimodal interaction drawn from ties spanning online and offline realms. We approach this problem by exploring the geography of Twitter and the extent to which conversation about ongoing events streaming on social media maps to the location where the events were unfolding on the ground. We review research designs that have grappled with this question against the backdrop of methodological challenges in data collection and sampling, as online and offline approaches to data collection may vary widely and result in different network topologies.

Much effort has been devoted to addressing issues emerging from the crossroads of spatial and digital affordances. Backstrom et al. (2008) investigated the association of web search keywords with the geographic location of IP addresses and modeled the spatial variation manifested in search queries. The availability of geographic information on web search data was also explored by Gan et al. (2008) and Ginsberg et al. (2008), who traced the spread of flu in the United States by correlating the number of visits to a local doctor with flu-related search terms on Google. While earlier literature focused on modeling information about web search data, more recent research has turned to user-generated geographic information by mining social media information streams. This latter line of research became possible following the growing availability of geographic information generated by social media users and GPS-enabled devices.

Cheng et al. (2010) used a probabilistic model based on hundreds of tweets to estimate the likelihood of users living in a particular city within a 100-mile radius; Sakaki et al. (2010) investigated the real-time interaction between events on the ground and Twitter activity to monitor messages and detect events in geographic locations. Noulas et al. (2011) studied urban mobility patterns in several metropolitan areas by analyzing a large set of Foursquare users; Gao et al. (2012) offered a sociohistorical model to explore users’ behavior on location-based social networks. The contrast between the geography and topology of social networks was explored by Volkovich et al. (2012) in a study of the interaction between users and spatial distance, which found that ties in highly connected social groups tend to span shorter distances than connections bridging separated portions of the network. Cranshaw et al. (2012) measured the social dynamics of a city based on Foursquare check-ins and compared them with the boundaries of traditional municipal organizational units, such as neighborhoods. The clustered check-in areas generated by social media users differed considerably from traditionally defined neighborhoods.

Twitter hashtags as a proxy for ad hoc publics (Bruns & Burgess, 2011) account for a considerable portion of studies on the intersection of social media and the public sphere. Boyd et al. (2010) described the topic function of hashtags, Huang et al. (2010) explored the conversational nature of Twitter tags, and Bruns and Liang (2012) reviewed how Twitter was used by the public to exchange information about natural disasters in Australia. Although most literature on Twitter hashtags explores the context rather than the geographic information provided, previous studies have also explored the overlap between hashtag and geographic location. Sloan et al. (2013) analyzed a sample of 1 million tweets with geographic information retrieved from user profiles, geotagged tweets, and the content of the messages to estimate demographic information from the messages.

There is also a growing body of research focused on social and geographic information retrieved from user profiles. Hecht et al. (2011) found that 34% of users did not provide real location information, and that when the information was available it was hardly ever specified at a scale more detailed than the city. Quercia et al. (2012) explored a large sample of Twitter profiles to test whether real-life geography and topic associations held true on Twitter. Leetaru et al. (2013) retrieved geographic data from tweets based on geocode, profile, and text message, and reported that geographic proximity played a minimal role both in who users communicate with and what they communicate about, providing preliminary evidence that geographic location was not paramount to the exchange of information on social media.

Conversely, there is a large body of work stressing the importance of geography to Twitter followership and highlighting that face-to-face and online interaction have remained closely intertwined. Kulshrestha et al. (2012) investigated the participation and the social graph of Twitter users and reported a propinquity effect on the probability of interaction between Twitter users. Similarly, Takhteyev et al. (2012) examined the influence of geographic distance and national boundaries in the formation of social ties on Twitter and reported that a substantial share of ties remained within the same geographic region. Yardi and Boyd (2010) investigated tweets related to two local events and found that the geographic location of tweets and users was important in creating context, providing real-time information, and offering eyewitness accounts of the events.

At any rate, the expansion of mobile communications and global satellite together with extensive internet access has further complicated the dyadic interaction social ↔ spatial. These developments have also intensified the preoccupation with the geographic centers of political action, or the lack thereof, particularly against the backdrop of influence operations and social media manipulation through which changes to the mechanisms of political deliberation can be achieved from geographically remote areas, including, of course, by hostile state and non-state actors.

Our own attempt to understand this dyad initially focused on the relationship between the geographic location of protestors attending demonstrations in the 2013 protests in Brazil and the geographic location of users who tweeted the protests (Bastos et al., 2014). The research design of the study required us to compile granular data about the location of protests, along with the number of individuals attending the demonstrations, setting up tents, getting arrested, or suffering injuries in the course of the protests. This information was collected from press reports and aggregated by date and location, thereby generating a dataset of offline protest events. Twitter, on the other hand, offers granular information about protest events online. The combination of the two datasets allowed us to analyze the overlap between online and offline protest activity across a range of geographically defined boundaries, with Twitter offering at the time of the study at least three sources of geographic data: geocode (location identified by GPS coordinates), user profile (location identified by information on user profile), and hashtag (location identified by information in the text message).
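The three location sources can be pictured as optional fields attached to the same message. The following is a minimal sketch in Python rather than the data model used in the study; the class name, field names, and example values are all hypothetical and do not correspond to Twitter API fields.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GeoTweet:
    """A protest-related message with up to three location signals.

    All field names are illustrative; they are not Twitter API fields.
    """
    text: str
    geocode: Optional[Tuple[float, float]] = None  # (latitude, longitude) from GPS
    profile_location: Optional[str] = None         # free text from the user profile
    hashtag_location: Optional[str] = None         # place named in a hashtag

    def available_sources(self) -> List[str]:
        """Return the names of the location sources present on this message."""
        candidates = [
            ("geocode", self.geocode),
            ("profile", self.profile_location),
            ("hashtag", self.hashtag_location),
        ]
        return [name for name, value in candidates if value is not None]


# Hypothetical message: posted near São Paulo but tagged with Rio de Janeiro.
tweet = GeoTweet(
    text="(message text)",
    geocode=(-23.55, -46.63),
    hashtag_location="Rio de Janeiro",
)
print(tweet.available_sources())
```

Keeping the three signals as separate fields, instead of collapsing them into a single location, is what makes it possible to compare where a message was posted from with the place it refers to.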
Most hashtags make no reference to geographic locations, but as the protests in Brazil started to spread across metropolitan areas in the country, Twitter users started to tag messages with the location where protests were taking place. This unconventional development provided social media messages with a description of the location where protest activity was taking place around the country. Users relied on location-based hashtags to join the protest activity irrespective of whether they were attending the demonstrations on the ground. As protests gained momentum, the volume of hashtagged messages in our database became one order of magnitude higher than non-hashtagged messages and remained higher until the end of the period. This affordance of Twitter and the specific use of hashtags in Brazil offered an opportunity to aggregate different sources of geographic data that captured the relationship between online and onsite activity in the context of political protests.

The combination of these two databases allowed us to identify whether users tweeting the protests and those protesting onsite were the same population. Given the uneven concentration of the population in urban areas (Figure 2), we adjusted the data for the population distribution and performed geospatial and spatial clustering analysis over sets of spatial locations. The results revealed that users tweeting the protests were in fact geographically distant from the street protests and that users from geographically remote areas would rely on Twitter hashtags to remotely engage with the demonstrations. These results appeared at odds with the assumption that Facebook and Twitter played a key role in the organization of protests, either by facilitating communication between protestors or by livestreaming the demonstrations.

FIGURE 2A Distribution of messages across the country.

There are important constraints on aggregating individual records across geographic boundaries, whether the source is online social networks or offline activity. In our study, the aggregate data represented the geographic coordinates of individuals participating in or tweeting the protests in Brazil. Population density in Brazil varies considerably, ranging from 3 persons per square kilometer in the Amazon region to 30 persons in the Northeast and 150 in the state of São Paulo. As such, the population-dependent data needed to be normalized using the Brazilian census (Censo, 2010) by calculating the rate of individuals engaged in political protests per Brazilian federative unit. Similarly, we computed the ratio of individuals tweeting protest messages relative to the population of each state (in thousands). With the user base of social platforms being particularly young, urban, well-educated, and wealthy (Pew Research Center, 2013, 2016), it is unsurprising that the normalized Twitter data is skewed and follows the GDP distribution in the country.

FIGURE 2B Tweets across Brazilian states in the period.

This approach allowed us to sample the data and perform spatial analyses over a normalized dataset adjusted to the variations in population density and volume of tweets across the country. The larger dataset of 1.4 million messages was sampled in three randomized samples (without replacement) of 49,611 unique messages with protest, geocode, hashtag, and profile information, thus totaling 148,833 tweets retrieved from a stratified random sample with unequal sampling rates of geocode, hashtag, and profile information streams. A number of functions in the spatstat library (Baddeley & Turner, 2005) for space-time point pattern analysis require unique coordinates, so prior to sampling it was necessary to add a random value ranging from 0.0001 to 0.001 to each geographic observation in order to avoid problems with duplicates. The randomized samples were created by successively resampling the data into groups of 10,000 events to match the nationwide spatial distribution of protests in Brazilian cities and then into groups of 1,000 events of onsite and online protest activity. The resulting datasets of marked planar point patterns included 40,000 and 4,000 points respectively, with average intensities of 52.7 and 5.27 points per square unit. This allowed us to merge the four streams of political unrest into one marked planar point pattern contained by a window area of 758.974 square units (i.e., the territory of Brazil).

After transforming social media data into point patterns (Illian et al., 2008), it was possible to analyze multitype point patterns, which in this case translated to protest activity online and offline. The resulting, geographically rich dataset indicates the number of protests in each region and can be used to project the spread of political upheaval for neighboring points in space. After removing shared borders in the polygons (Brazilian map) to avoid problems with self-intersection and geometrical artifacts in the map, we computed the distance between point patterns and calculated the optimal point matching between multiple sources of protest activity (e.g., the distance from protestors attending a demonstration to those tweeting the same demonstration).

The analysis of spatial locations was performed using point processes to provide a probability distribution of objects in a finite set of spatial locations. This is a necessary step in the manipulation of spatialized social media data if one seeks to examine online and offline activity as points in space. In other words, this approach allows for treating the data as a spatial statistics problem (Barthelmé et al., 2012) within a defined window of observation, including methods for the detection and inference of spatial clusters (Kulldorff, 1997, 2001).

The differences between the geographic locations from where users tweeted (geocode) and the geographic locations where they directed their communications to (hashtag) seem to mirror the socioeconomic gradient of Brazilian society: tweets from peripheral areas of the country are directed to wealthier states in the southeast, where large portions of metropolitan public opinion are located. Nearly 80% of the hashtagged tweets were directed to these areas, and yet the local population was not more likely to join demonstrations than those in other areas of the country.
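Two of the preprocessing steps described above, jittering duplicate coordinates and measuring the distance between two point patterns, can be sketched in a few lines. This is an illustrative sketch in plain Python, not the spatstat workflow used in the study. It assumes the random offset is added to both coordinates and takes "distance between point patterns" to mean the average distance from each point in one pattern to its nearest neighbor in the other; the coordinates are made up.

```python
import math
import random


def jitter(points, low=0.0001, high=0.001, seed=0):
    """Add a small random offset in [low, high] to each coordinate.

    Assumption: the offset is applied to both x and y, so that exact
    duplicates become distinct points.
    """
    rng = random.Random(seed)
    return [(x + rng.uniform(low, high), y + rng.uniform(low, high))
            for x, y in points]


def mean_nearest_distance(source, target):
    """Average distance from each point in `source` to its closest point in `target`."""
    total = 0.0
    for sx, sy in source:
        total += min(math.hypot(sx - tx, sy - ty) for tx, ty in target)
    return total / len(source)


# Hypothetical coordinates, not data from the study.
protests = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0)]  # contains a duplicate
tweets = [(0.0, 3.0), (5.0, 1.0)]

unique_protests = jitter(protests)
print(len(set(unique_protests)) == len(unique_protests))
print(mean_nearest_distance(unique_protests, tweets))
```

The offsets are several orders of magnitude smaller than the distances being compared, so jittering removes duplicates without materially changing the distance estimate.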
There are differences across locations where users reported they live (profile), where they directed their messages to (hashtag), and from where the tweet was posted (geocode). More critically, these differences are largely at odds with the actual location of the protests, which is more closely reflected by the distribution of hashtagged tweets essentially performing the role of ad hoc publics (Bruns & Burgess, 2011). These differences are easily grasped by visualizing the central point of diffusion where demonstrations took place compared with the central point of diffusion from where messages were posted.

Figure 3 shows the hotspot locations of messages associated with protest activity. Hotspots are defined by an intensity function λ(s) in which s is the spatial location; the intensity function describes where events are likely to occur and the expected number of events within the window of observation. The three sources of location retrieved from Twitter present equal centroids around the São Paulo-Rio de Janeiro axis, the economic and arguably political centers of the country. However, the intensity function varies greatly across location sources based on geocode, hashtag, and profile information. The north region of Brazil presents points of diffusion only in the geocode projection plot, and the hashtag projection and the actual location of protests are particularly intense in the southeast region of Brazil.
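To make the intensity function more concrete, the sketch below estimates λ(s) with a Gaussian kernel and computes the average intensity of a pattern as the number of points divided by the window area. It is a simplified illustration in plain Python with no edge correction, not the estimator used to draw Figure 3. The bandwidth and sample points are arbitrary, and the window area is read as 758.974 square units, the value that reproduces the average intensities of 52.7 and 5.27 reported earlier.

```python
import math


def kernel_intensity(s, points, bandwidth):
    """Gaussian kernel estimate of lambda(s), in points per unit area.

    Each observed point contributes a two-dimensional Gaussian bump that
    integrates to one, so the estimate integrates to len(points) over the plane.
    """
    sx, sy = s
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    return sum(
        norm * math.exp(-((sx - x) ** 2 + (sy - y) ** 2) / (2.0 * bandwidth ** 2))
        for x, y in points
    )


def average_intensity(n_points, window_area):
    """Average intensity of a point pattern: points per unit of window area."""
    return n_points / window_area


# Hypothetical cluster of events around the origin.
events = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.05, 0.2)]
print(kernel_intensity((0.0, 0.0), events, bandwidth=0.5))  # inside the hotspot
print(kernel_intensity((5.0, 5.0), events, bandwidth=0.5))  # far from any event

# Window area assumed to be 758.974 square units.
print(round(average_intensity(40000, 758.974), 1))
print(round(average_intensity(4000, 758.974), 2))
```

A location counts as a hotspot when the estimated intensity there is high relative to the pattern's average intensity, which is why the choice of bandwidth affects how many hotspots appear on a map.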

FIGURE 3 Central point of diffusion of street protests and related Twitter activity (geocode, hashtag, and profiles).

The intensity function does not take into account the population density across Brazil, so messages with location based on geocode information include hotspot locations in the northern part of Brazil, particularly around the cities of Belém and Rio Branco, but also in the northeast and the central-west regions of the country. Consistent with the information described in Figure 4, messages with location based on data retrieved from user profiles present a more balanced distribution across the country, with a higher-than-average occurrence in the northeast part of the country. Finally, messages in which location was coded based on hashtag information are more broadly consistent with the actual location of protests, with density curves concentrated in the southeast of Brazil and smaller hotspots in the southern and northeast parts of the country.

Another interesting pattern shown in Figure 3 is that the perimeter of the points of diffusion grows larger as we move away from protests on location toward protest activity online. The distribution of hashtag activity is geographically similar to the restricted perimeter of the actual protests, geocode projection covers a much greater area, and location listed on user profiles reaches most of the Brazilian territory. In other words, the geographic distribution of users who tweeted the protests does not necessarily overlap with the geographic distribution of street protests.
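Because the intensity function ignores population density, raw counts partly reflect where people live. The per-capita normalization described earlier in this chapter, the number of users tweeting the protests per thousand inhabitants of each state, can be sketched as follows. The function name is ours, and the state labels and figures are placeholders rather than census or study data.

```python
def rate_per_thousand(counts, population):
    """Events per 1,000 inhabitants for each region present in `counts`."""
    return {
        region: 1000.0 * counts[region] / population[region]
        for region in counts
    }


# Placeholder figures: a populous state with many tweets and a sparsely
# populated state with few tweets.
tweet_counts = {"state_a": 500, "state_b": 50}
inhabitants = {"state_a": 1_000_000, "state_b": 50_000}

rates = rate_per_thousand(tweet_counts, inhabitants)
for region, rate in sorted(rates.items()):
    print(region, rate)
```

In this made-up example the smaller state has ten times fewer tweets but twice the per-capita rate, which is the kind of reversal that normalization is meant to reveal.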


In our view, hashtagged messages may have performed the role of conduits that brought together users who were sympathetic to demonstrations but were not physically present at the demonstrations. One possible way of reading this plot is that there is a two-way exchange from protests online to protests on location, but that would not account for the shorter geographic perimeter presented in politically influential locations. Another approach to investigating social network activity online and offline consists of counting the number of marks of online activity that are within a 0.1 radius of the geographic location where onsite protests took place. This shows that hashtagged messages included nearly twice as many neighboring marks to street protests compared with geocoded and profile messages. Figure 4a presents the average distance between point patterns in onsite and online streams of protest activity, with the distance between protest on location to hashtag activity being the shortest at 7.31 units. Figure 4b shows the optimal point matching between point patterns with the larger cardinality n that is closest to the point pattern with the smaller cardinality m, so that observations cluster together due to similar neighbors (Schabenberger & Gotway, 2004). Similarly, Figure 4c was drawn by converting the matrices with two-dimensional coordinates into a neighbors list to triangulate the grid points (Zuyev & White,

FIGURE 4A Average distance from point patterns of street protests to tweets.

FIGURE 4B Point matching between street protests and tweets.

FIGURE 4C Gabriel graphs connecting street protests to the location of tweets.

FIGURE 4D Delaunay triangulation in a tangent sphere that maximizes the angles of edges to avoid skinny formations.

2013). Gabriel graphs draw a neighborhood only if there are no other points in their line set and the resulting graph is comparable to what is shown in Figures 4a and 4b, with street protest and hashtagged tweets being similar, with 1.804 and 1.826 average number of links respectively, and 243 and 233 regions with no links. Finally, Figure 4d confirms the above by plotting the symmetry between maps, with the street and hashtag graphs presenting a single clique as the most connected region with seven links, while the geocode and profile graphs presented various connected regions with a range of links. These results surprised us by showing that Twitter users were often far away from the location on which they were commenting. First, it appeared as if a large portion of online activity was driven by individuals who were sympathetic to the protests but were not attending the demonstrations; the geography of street protests was indeed significantly remote from the geography of users tweeting the protests (distance of 7.31, 8.68, and 8.85 units for hashtag, profile, and geocode activity streams respectively). Second, hashtagged messages appeared to be particularly poor at predicting the actual location of users, and the geography of the Twitter user base seems to differ considerably from the geography of political communication. While the events analyzed in this study took place in 2013, they foreshadowed the disconnect between online and offline protest activity that would drive many


influence operations and the information warfare in the following years (Bastos & Farkas, 2019; Walker et al., 2019). This study also forced us to challenge the predominant thinking in internet studies about digital communities, which remained arguably in the shadow of the concept of ‘global village,’ the powerful McLuhanian metaphor describing how new communication technologies empower and bring together geographically disparate individuals across vast territories and cultural differences (McLuhan, 1962). If anything, our results suggested a different emphasis: instead of bridging disparate geographies, social media consolidated extant socioeconomic and political divisions in the country, with geographically distant locations directing their attention toward the metropolitan centers of the public opinion.
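The point-pattern comparisons reported for Figure 4 can be sketched in a few lines. The version below is a simplified illustration under stated assumptions: distances are planar Euclidean distances in coordinate units, the "average distance" is taken to be the mean distance from each protest site to its nearest mark, and the Gabriel graph uses the standard empty-disc criterion. The function names and the toy coordinates are hypothetical and are not taken from the study.

```python
from math import dist

def count_within(sites, marks, radius):
    """Count marks lying within `radius` of at least one site."""
    return sum(1 for m in marks if any(dist(m, s) <= radius for s in sites))

def mean_nearest_distance(sites, marks):
    """Average distance from each site to its nearest mark."""
    return sum(min(dist(s, m) for m in marks) for s in sites) / len(sites)

def gabriel_edges(points):
    """Link i and j when no other point falls inside the disc with diameter i-j."""
    edges = []
    for i, p in enumerate(points):
        for j in range(i + 1, len(points)):
            q = points[j]
            mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
            r = dist(p, q) / 2.0
            others = (points[k] for k in range(len(points)) if k not in (i, j))
            if all(dist(mid, o) > r for o in others):
                edges.append((i, j))
    return edges

# Hypothetical protest sites and tweet locations.
sites = [(0.0, 0.0), (1.0, 1.0)]
marks = [(0.05, 0.0), (0.0, 0.08), (0.5, 0.5), (1.0, 1.2)]

near = count_within(sites, marks, radius=0.1)
avg = mean_nearest_distance(sites, marks)
links = gabriel_edges([(0.0, 0.0), (2.0, 0.0), (1.0, 0.1)])
```

With these toy inputs, two marks fall within the 0.1 radius, and the third point of the small Gabriel example blocks the direct link between the first two. An average number of links per point, comparable in kind to the figures quoted above, would be `2 * len(links) / n` for `n` points.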

8 ONLINE-OFFLINE COORDINATION

The extent to which online social networks overlap with the boundaries of their offline communities is, of course, an empirical question, and there is a growing (if limited) body of scholarship that seeks to address the question. The theoretical and methodological limitations in modeling social activity across computer-mediated and non-computer-mediated environments are compounded by the challenges in collecting relational data that can be mapped onto geographic boundaries while also preserving metrics of online interaction. In other words, studying the overlap of online and offline networks requires researchers to collect, at a minimum, spatial and nonspatial data aggregated at individual or group level. A substantial portion of these efforts emerged from studies on social movements. These studies devoted considerable attention to the relationship between social media usage and physical protests (Bennett et al., 2014; Castells, 2012), the articulation between platforms of self-publication and contentious communication (Castells, 1997, 2009; Diani, 2000; Tarrow, 2005), and the increase in speed and scale of political networks (Bennett et al., 2008; Bennett & Segerberg, 2013). The literature also explored how social media facilitated organization and horizontal logistical coordination (Theocharis, 2013) as well as providing a positive setting for the construction of elective social affinities (Papacharissi & Oliveira, 2012). This body of work also explored the awareness and political participation of social media users (Bekafigo & McBride, 2013; Dimitrova & Bystrom, 2013; Gustafsson, 2012), commented on the role played by Twitter on protest communication (Earl et al., 2013), and assessed the effect of livestreaming events on public conversations (Hawthorne et al., 2013; Shamma et al., 2009). 
Coordination across online and offline realms was found to be instrumental to the rapid formation of geographically interconnected, networked counter publics, particularly in movements such as the Indignados in Spain (Vallina-Rodriguez et al., 2012), Occupy in the United States (Penney & Dadas, 2014), the Kony 2012 campaign worldwide


(Harsin, 2013), and in countries of the MENA region during what was referred to as the Arab Spring (Lim, 2013). There is also a large set of case studies charting how activist communication follows the surge of interest and participation in contentious politics (Earl & Kimport, 2011; Lim, 2013; Valenzuela, 2013). These case studies, however, only rarely provided robust and significant tests on the direction and effect between online communication and onsite protest, with the interaction between social media usage and onsite activities largely restricted to sample-based quantitative content analyses (Earl et al., 2013) or ethnographic observations (Gerbaudo, 2012). Studies that successfully tested these relationships seldom included spatial or geographic data accounting for the spatial dispersion of the activity online and offline (De Choudhury et al., 2016; Freelon et al., 2018; Jungherr & Jürgens, 2013; Theocharis et al., 2017; Vasi & Suh, 2013). Our contribution to this literature relied on modeling the geospatial metadata attached to digital trace data, often collected independently from participants. One question we sought to qualify and quantify was the extent to which social media ignites popular uprisings. This required the modeling of time series data from Twitter and Facebook, along with the actual protests on location, to model whether contentious communication on Twitter and Facebook could be relied upon to forecast onsite protest during the Indignados protests in Spain, the Occupy demonstrations in the United States, and the Vinegar rallies in Brazil. In other words, we sought to measure whether the protests of the Indignados, Occupy, and Vinegar movements were followed by corresponding Facebook and Twitter activity; whether they evolved together by exhibiting bidirectional determination (feedback) between onsite and online protest activity; or whether communication on Twitter and Facebook had any bearing on developments in the street protests. 
The Spanish demonstrations started in Madrid on 15 May 2011 as a protest against welfare cuts; the demonstrations rapidly escalated after a group of protestors set camp at Puerta del Sol, the city’s main square. In a matter of a few days, the protests and night-time campouts spread to more than 30 cities across Spain. The Occupy protests were arguably influenced by the events in Madrid and started on 17 September 2011 when Adbusters launched the proposal for a peaceful demonstration to ‘occupy’ the global financial center at Wall Street (Moynihan, 2011). Protestors also began camping in Zuccotti Park, and demonstrations spread to cities across the United States. On 15 October, one month before New York protestors were forced out of Zuccotti Park, similar demonstrations had happened in 951 cities across 82 countries. The Vinegar protests in Brazil followed a similar pattern, except the number of protestors was significantly higher, with millions of individuals demonstrating in large metropolitan areas across the country. This was no trivial question. In 2014, it was unclear whether activity on social platforms had any bearing on developments on location. While research had identified a positive correlation between internet use and different forms of political participation (Bakker & de Vreese, 2011; Borge-Holthoefer et al., 2011; Valenzuela et al., 2012), and similarly, a linear relationship from digital communication


to onsite activity had been reported by scholarship discussing the diffusion (Tremayne, 2014; Vasi & Suh, 2013) and ranks (Anduiza et al., 2013) of protest, there remained a considerable body of literature challenging this assumption. Such literature claimed that protest activity online was ineffective and had no follow-through, a thesis encapsulated in terms such as ‘clicktivism’ and ‘slacktivism’ passionately advocated by Morozov (2012, 2013) and convincingly argued by Gladwell (2010). This claim was based on the notion that online activism has no impact, as aggrieved individuals would confine their outrage to online communities formed by the weak ties of acquaintances, through which effecting change to the real world was not possible. These claims were relatively well accepted in academia and beyond, as until the mid-twenty-teens little evidence was available of substantial elapsing connection between activist communication on social platforms and protest development on the ground. It is perhaps because these assumptions were so widely accepted that the results were particularly surprising: whereas significant Granger-causality relationships were found between Facebook and Twitter, and from the activity in these platforms to instances of protests, injuries, and arrests on location, there was considerable variation across national contexts. This variation was an indication that the effect of online social networks on events developing on the ground was not symmetric; if anything, the results showed that the elapsed effect of online to offline protests varied across different local and national contexts (Bastos et al., 2015). 
In a series of studies, we tracked around 100 Twitter hashtags associated with the Indignados, Occupy, and Vinegar protests (roughly 35 hashtags per event) and another 100 Facebook pages and groups dedicated to the events to test the hypotheses that the outbreak of online protest activity at one point in time could be used to predict future outbreaks of onsite protest activity. We also hypothesized that protest communication on social platforms could bear on onsite activity only indirectly, as the directionality of this relationship had not been established in previous studies. Instead of relying on individual reports to generalize collective behavior, we inspected press reports of demonstration size and duration across cities, and social media activity related to these events (Koopmans & Rucht, 2002). This approach allowed us to test narratives reporting that Twitter was instrumental to the establishment of encampments in key protest locations during the Indignados protests in Spain and the Occupy protests in the United States (Castells, 2012). The datasets were modeled as stochastic time series to perform Granger causality tests on the two-way, paired relationship between two numeric variables of online activity—tweets and Facebook posts—and four numeric variables of offline activity—protestors attending demonstrations, setting out tents, getting arrested, or being injured in the course of the demonstrations. The assumption underlying the Granger causality test is that the explanatory variable Granger-causes the outcome variable whenever there is a non-expected output that leads to an increase in the outcome variable. This framework states that a process X is considered a cause of another process Y if knowledge about the past of X significantly improves the


prediction of the future of Y, in contrast to the prediction based only on knowledge about the past of Y. Unfortunately, Twitter and Facebook data are highly skewed with heavy tails, considerably affecting the estimation of correlation (i.e. autocorrelation), so we employed a semi-parametric transformation to correct for the non-normality of the individual time series (Sanggyun & Brown, 2010). This method is based on the following procedure: for time series i and j, find an empirical marginal distribution function based on the ranks, $\hat{F}_i$ and $\hat{F}_j$, with $\hat{F}_i(x) = \frac{1}{T+1}\sum_{t=1}^{T} \mathbf{1}(X_{i,t} \le x)$, and similarly for j, and subsequently map the observation into the [0,1] copula space (Taamouti et al., 2014), $\hat{U}_{i,t} = \hat{F}_i(X_{i,t})$ and $\hat{U}_{j,t} = \hat{F}_j(X_{j,t})$. Finally, we define $\tilde{X}_{i,t} = \Phi^{-1}(\hat{U}_{i,t})$ and $\tilde{X}_{j,t} = \Phi^{-1}(\hat{U}_{j,t})$ and performed standard Granger causality tests on permutations of pairs of online and onsite protest activity, $\tilde{X}_{i,t}$ and $\tilde{X}_{j,t}$. The results indicated that online communication on Twitter and Facebook predicted onsite protest activity in the Indignados (p < .01) and the Occupy datasets (p < .01), with bidirectional Granger-causality between online and onsite protest activity in the Occupy series (p < .01 for all pairwise variables). In the case of the Vinegar protests, the direction of the prediction was only from online to online and onsite to onsite variables, that is, from Facebook to Twitter (p < .04) and from protestors to injuries and arrests (p < .04 and p < .01 respectively). After modeling the time series of arrests and campouts, we found significant relationships between online and onsite protest activity in the Indignados and the Occupy series, but again not in the Brazilian case. The results indicated that the relationship between online and onsite protest activity varied considerably across the national contexts and instances of political unrest. 
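The two steps described above, the rank-based transformation and the Granger test, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the studies: the lag order is fixed at one (the text does not state the order), only the F statistic is computed (no p-value), and the series are simulated. The names `normal_scores`, `ols_rss`, and `granger_f` are hypothetical.

```python
import random
from math import exp
from statistics import NormalDist

def normal_scores(series):
    """Rank-based Gaussianization: X -> U = F_hat(X) -> Phi^{-1}(U)."""
    T = len(series)
    inv_cdf = NormalDist().inv_cdf
    # F_hat(x) = (1 / (T + 1)) * #{t : X_t <= x}, so U stays strictly inside (0, 1).
    return [inv_cdf(sum(1 for v in series if v <= x) / (T + 1)) for x in series]

def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    for col in range(k):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for j in range(col, k):
                a[r][j] -= factor * a[col][j]
            c[r] -= factor * c[col]
    beta = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        beta[i] = (c[i] - sum(a[i][j] * beta[j] for j in range(i + 1, k))) / a[i][i]
    return sum((v - sum(b * xi for b, xi in zip(beta, r))) ** 2 for r, v in zip(rows, y))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' with one lag (assumed order)."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]            # past of y only
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]  # plus past of x
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    dof = len(target) - 3
    return (rss_r - rss_u) / (rss_u / dof)

# Simulated heavy-tailed series in which online activity leads onsite activity by one step.
random.seed(1)
signal = [random.gauss(0.0, 1.0) for _ in range(200)]
online = [exp(2.0 * s) for s in signal]
onsite = [1.0] + [exp(0.8 * signal[t - 1] + 0.3 * random.gauss(0.0, 1.0))
                  for t in range(1, 200)]

z_online, z_onsite = normal_scores(online), normal_scores(onsite)
f_forward = granger_f(z_online, z_onsite)  # online -> onsite
f_reverse = granger_f(z_onsite, z_online)  # onsite -> online
```

Because the transformation depends only on ranks, it removes the skew of the simulated counts without changing their ordering. In this simulation the forward statistic is far larger than the reverse one, which is the asymmetric pattern that a one-way relationship from online to onsite activity would produce.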
In other words, while the online and onsite series in the Vinegar dataset evolved mostly self-referentially, the online and onsite series in the Occupy dataset presented a significant level of feedback, with Twitter and Facebook both predicting and being predicted by protestors attending demonstrations onsite. Moreover, while the protests in Spain presented a one-way relationship from online to onsite protest activity, with both Twitter and Facebook predicting protest activity on the ground, and Twitter also predicting campouts, in the Occupy series we found a bidirectional Granger-causality between online and onsite protest activity. The Vinegar protests in Brazil, on the other hand, presented no significant results for the relationship between online and onsite activity, with statistically significant predictions only within the online and onsite realms, but no crossover between the two sources of protest activity. Although the results showed that the relationship between online and offline protest activity was not symmetrical, they also confirmed the hypothesis that online activity could be used to predict onsite activity (with the caveats detailed earlier) and that these two instances of social activity were connected. Except for the Vinegar series, protest activity was found to be predictive of multiple instances of social media activity in the Indignados and Occupy series. The intricacy of these relationships is unpacked in Figure 5, which shows the direction of Granger-causalities across the three instances of political protest, with


FIGURE 5 Directionality of Granger-causality between tweets, Facebook posts, and camped-out, injured, and arrested protestors who participated in demonstrations (all three instances of political unrest considered).

online protest activity on social media depicted in dark shades (online) and offline protest activism shown in light shades. The Indignados, Occupy, and Vinegar political protests were largely organized by grassroots activists working in central city locations over weeks or months. These political movements operated in a horizontal, consensus-based decision-making mode in which face-to-face interaction was an important channel of communication. On the other hand, these movements also relied on social media to recruit participants and enhance mobilization (González-Bailón et al., 2011), resulting in a great deal of discussion about the extent to which social media was a central component in igniting popular protests (Gerbaudo, 2012). Upon testing this hypothesis, we found compelling evidence that online protest activity was informative of and forecasted onsite protest activity across multiple instances of political unrest. However, the significant relationships identified in the studies were not symmetric: there was a feedback in the prediction of online and onsite protest activity in the Occupy series, online communication was predictive of onsite protest activity in the Indignados series, and no significant relationship was found from online to onsite protest activity in the Vinegar series. Yet, in two of the three cases, Twitter and Facebook communication would impact protests on the ground, thus offering evidence of a feedback loop from social media activity to onsite protests and back to social media. In other words, Twitter and Facebook were likely to have amplified demonstrations through networked communication that fed into the process of participant recruitment. The results were therefore at odds with the claims of ineffectiveness of social media communication in promoting onsite protest participation, a presumed state of affairs derided as ‘slacktivism’ (Morozov, 2011). 
The results also contradicted the hypothesis that activity on social media platforms merely mirrored developments on the ground. In fact, the Indignados series showed that one could expect an increase in the number of protestors attending demonstrations whenever there was a rise in the number of messages related to the demonstrations on Twitter and Facebook. This directional relationship from social media communication to demonstrations may be difficult to grasp if one expects offline demonstrations to trigger communication on Twitter and Facebook, or, in


the context of the rising penetration of mobile internet technologies, if one anticipates social media communication to be predominately driven by the reporting of onsite activity (Earl et al., 2013). The results, in short, supported the idea that social media were entwined in a dense, multi-layered matrix of stitching mechanisms (Bennett et al., 2014: 234) entangling online and onsite activities. This symbiosis was, however, incomplete and mutable, with many onsite activities not being reported online and vice versa. A more granular investigation of this relationship would require modeling the data at more disaggregated levels—such as neighborhood level—after which more significant relationships could perhaps be unveiled. Likewise, while the existence of directional relationships was confirmed, the directionality of these relationships remained ambiguous, as questions remained pertaining to the magnitude of mutual elapsed effects of protest activity online and offline. We continued to investigate how protest activity online and offline would come together. Despite our previous studies, the narrative focused on the clicktivism/ slacktivism hypothesis proved resilient, as did the belief that social media activity would stem from armchair politics with no follow-through beyond the computer screen. Conversely, the connective action narrative posited that the decentralized architecture of the internet would work along autonomous action to enable political dissent. These two opposing narratives were not too dissimilar from previous metaphors accounting for the impact of the internet. One foregrounded communication and collaboration; the other highlighted the potential of network communication for tribalism and factionalism detached from real-world grievances. In a follow-up study we encountered a group of distinctively politically active users who were neither elite nor ordinary Twitter users (Bastos & Mercea, 2016). 
After mining 20 million tweets related to nearly 200 instances of political protest from 2009 to 2013, we identified a network of individuals tweeting profusely across multiple instances of geographically disparate political hashtags. The activity of these users was so high that many would face regular suspensions on Twitter, with some reporting having posted thousands of tweets in a busy day. The prevalence of cross-hashtagging was extraordinary: 17% of them had tweeted messages with the hashtag #freeiran, 15% with #jan25; 6% of those who tweeted the hashtag #spanishrevolution also tweeted the hashtag #occupywallstreet. A considerable portion of users who tweeted #occupywallstreet would also have tweeted the protest hashtags in Iran, Egypt, and Spain. We combined the analysis of large-scale social media data with in-depth interviews in an approach we described as ‘scaling-down of big data.’ The aggregate data led us to designate these users as ‘serial activists,’ a term dating from the late 1990s and early noughties that referred to users engaging in various political demonstrations online who might not be dedicated activists themselves (Zuckerman, 2008). While early accounts of serial activism were broadly similar to the clicktivist or slacktivist terminologies, we found that serial activism was not the product of uncommitted click-activists; instead, it encompassed a complex modality of engagement that would often bridge actions online and onsite at multiple protest


locations. If anything, the pattern of activity displayed by these users was remarkable for its magnitude (volume of tweeted messages), time (protest hashtagging over extended periods of time), and space (transnational nature of protest hashtags). This profile was particularly salient in the top tier of the group, which included users who tweeted on a minimum of 40 protest hashtags across five diverse geographic bands, as shown in Figure 6. The target population presented clear patterns in the dimensions of magnitude, space, and time. The magnitude of their activity was indicated by an average of 100,000 and a maximum of 1 million tweets per account. The spatial dimension was expressed by an average of 53 (x̄ = 56) hashtags per user, with a minimum of 43 and a maximum of 101. The average number of area bands tweeted by users was 8 from a total of 17, with a minimum of 5 and a maximum of 13, which again testified to the broad spatial area covered in their tweeting activity. The temporal dimension, finally, was highlighted by their commitment to covering a vast array of protests over extended periods of time. Postees in the target population were also long-time Twitter users, as the majority of accounts (57%) were created between 2007 and 2010; 42% were set up in 2011 (the year the Indignados and Occupy protests erupted), and only 2% of the accounts were established after 2011.
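The selection of the top tier and the cross-hashtag percentages can be illustrated with a short sketch. It assumes the data are available as a mapping from each user to the set of protest hashtags they tweeted, plus a mapping from each hashtag to an area band; both structures, the function names, the band labels, and the toy records are hypothetical. The defaults mirror the thresholds mentioned above (40 hashtags, five area bands).

```python
def top_tier(user_hashtags, hashtag_band, min_hashtags=40, min_bands=5):
    """Users who tweeted at least `min_hashtags` hashtags across `min_bands` area bands."""
    selected = []
    for user, tags in user_hashtags.items():
        bands = {hashtag_band[t] for t in tags}
        if len(tags) >= min_hashtags and len(bands) >= min_bands:
            selected.append(user)
    return sorted(selected)

def overlap_share(user_hashtags, source, target):
    """Share of users who tweeted `source` and also tweeted `target`."""
    src = {u for u, tags in user_hashtags.items() if source in tags}
    both = {u for u in src if target in user_hashtags[u]}
    return len(both) / len(src) if src else 0.0

# Toy records (hypothetical users and band labels).
demo_users = {
    "a": {"#occupywallstreet", "#spanishrevolution", "#jan25"},
    "b": {"#spanishrevolution"},
    "c": {"#occupywallstreet"},
}
demo_bands = {
    "#occupywallstreet": "Northern America",
    "#spanishrevolution": "Southern Europe",
    "#jan25": "Northern Africa",
}

tier = top_tier(demo_users, demo_bands, min_hashtags=3, min_bands=3)
share = overlap_share(demo_users, "#spanishrevolution", "#occupywallstreet")
```

In the toy data only user "a" clears the lowered thresholds, and half of the users who tweeted #spanishrevolution also tweeted #occupywallstreet. The direction of the share matters: the proportion of #occupywallstreet users who also tweeted #spanishrevolution is computed separately and need not be the same.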

FIGURE 6A Users overlapping across national protest hashtags.

FIGURE 6B Users overlapping across area bands.

The activity level for the target population was remarkable not only due to the number of demonstrations to which it was linked but more significantly due to the different locations in which users became remotely immersed. Illustratively, and to emphasize the spatial dimension of this population, one user from Greece tweeted across protest hashtags in locations as varied as their native Greece, Australia, Brazil, Bulgaria, Canada, Egypt, France, Iran, Spain, Turkey, UK, the United States, and countries in Africa. Users were reportedly based in the United States (35), Spain (32), Germany (11), UK (9), Brazil (8), Canada (6), Belgium and Italy (4), Australia, Austria, China, France, Greece, Mexico, and Portugal (2), and 1 user from Argentina, Cuba, Egypt, Ireland, the Netherlands, and New Zealand. There were also considerable linguistic differences, but Francophone, Hispanophone, and Anglophone cliques were connected both internally and to each other. Notwithstanding the different geographic areas where the users operated, they were often connected to at least one other member in the group. One user was connected to more than half of the target population and 15 users presented more


than 50 connections to other serial activists. With an average number of interconnections of 14 (x̄ = 14, x̃ = 8), the group was interconnected right at the first level of the social network as a tight-knit community, often following each other and monitoring each other’s Twitter stream. The average age in the group was also significantly higher than the broader Twitter user base, at 45 (x̄ = 47.29). While gender and education were equally distributed, profession and income highlighted dissimilar groups. Most interviewees reported a low income by choice or circumstance and a professional background in the IT industries. But it was the interviews that revealed unforeseen connections between their online and offline activity. One activist explained he lived on Twitter:

It was basically another appendage and from the time I was awake to the time I went to sleep I was constantly checking it and every notification that went off I had to respond immediately. Around the New Year I took one day off of Twitter—not even 12 hours. People tweeted me and because they didn’t get a reply they thought I’d been kidnapped by the government and sent people to my house to make sure I was okay. (Peter, 2014)

Many reported having met other activists in person or “hav[ing] become personal friends over the years” (Roger, 2014). Common to all interviewees was an emphasis on community-building, shared values, and dedication to the cause. Another interviewee met with a colleague for the first time during an interview, and despite considerably different political stances and personal backgrounds, they found themselves completing each other’s responses. According to Thomas (2014), “not once during the interview we disagreed about a thing. 
It was interesting to see how I could connect so well with someone I didn’t know at all.” Thomas (2014) presented a rather personal take on the mechanics of online-offline coordination:

I never thought I could identify with what’s happening on the ground by watching live streams, but if you cannot impersonate those people it’s very difficult to keep up the work. It’s an immersive experience and suddenly it’s impossible for your brain to separate yourself from what is happening on the ground. It’s an emergent collective identity that binds us all together. When someone from London or Brussels or Madrid feels interconnected, they’ll give support and organize protest and do solidarity acts. It’s more than just retweeting and going to bed. You’re doing that because it affects you whether you’re there or not.

The context collapsing of online and onsite protest actions described by Thomas foregrounded their personal role in providing extensive and often live coverage of physical protests. George (2014), one of the interviewees, stated that his primary


job was “to move information and make sure it was getting out so people could make decisions.” Peter (2014) pointed out that live stream seems to be what builds on a lot of these protests. It allows people who can’t be there to be part of that too. . . . If somebody wanted to follow a couple of people and didn’t want to make Twitter their entire existence, they could follow my feed. I was pulling from enough sources that they could just follow me and get the gist, the flavor of what was going on. I could be the central source of information for them if they followed me. The livestreaming pioneered by serial activists offers another point of connection to the online-offline coordination. While the audience for their videos streamed online was rather small for the standards of livestreaming only five years later, it projected the widespread use of the technology in the late teens. In one high-profile case that went viral on 6 July 2016, Diamond Reynolds used Facebook to livestream the aftermath of the shooting of her boyfriend Philando Castile. The footage provided another wave of protests associated with the Black Lives Matter movement: a wave of demonstrations driven in part by witnessing police brutality against African Americans online. Four years later, again as a result of events captured by bystanders, protests against police brutality spread throughout the country and then worldwide after the killing of George Floyd, a 46-year-old black man choked by a white police officer. Livestream viewers, much like serial activists’ audiences in the early teens, could connect to and disconnect from realities near and far that reshaped their experiences across multiple online and offline contexts. Livestreams underscore the tensions and the continuities between virtual and physical presence by placing the audience in the moment, while also disembodying them. 
In the end, only four interviews had no stories bridging online and onsite protests, while five provided detailed accounts of how online activity helped coordinate actions onsite. Describing his place on the online-onsite continuum, Sam (2014) asserted that “there are people on the ground, which is the Occupy or Gezi Park or whatever, and then there are the anonymous people who are like air support. You’ve got your foot soldiers and then you’ve got air support.” Similarly, Peter (2014) explained how he helped steer onsite actions via Twitter: “social media is great for communication and intelligence during protests and marches. People at home would listen to the feed from the police scanners and feed that to me during the livestreams.” Kate (2014) spoke of the profound investment in the protests she tweeted and the concern for the welfare of onsite contacts. In her words:

There was a youth when the shooting broke out in Tahrir Square and it turned into a terrifying pandemonium. He had been born and brought up in an English-speaking country, but he was back in the Middle East with his girlfriend and they got split up. I’d been following him, and it was obvious he was terrified, so I kind of stepped in and said it’s alright, it’s okay, I’m here, what do you need? I helped to calm him down. He found a toddler, and


everyone was running backward, and forwards and he didn’t know what to do. We managed to get him to this house, and they were treating him at the barracks. I managed to get in touch with the toddler’s relatives while we’re trying to find a place to reunite the toddler with his parents and get him out of there. We did that and the wee boy got taken to a mosque where there was a children’s charity that kept him there until his parents came. We were sitting there watching Al Jazeera and looking at Twitter and telling them what road was blocked, which streets had gunfire in them, which streets to stay away from, and what streets the police had people in handcuffs, go down that street, or go down another one. These results offered detailed and in-depth reports on users who have been mistakenly depicted as uncommitted, short-burst activists. Contrary to earlier accounts pertaining to this group (Zuckerman, 2008), the scope and duration of their immersion in collective action spoke to a high and sustained level of activism. By undertaking vital activist tasks (Bennett & Segerberg, 2013), serial activists managed to build a community of users who were often geographically apart but ideologically proximate. Perhaps ironically, serial activists showed that the technologies that have threatened traditional solidarities by entrenching atomized lifestyles also supported the production of renewed forms of collective resistance across online and offline communities.

9 THE DIRECTIONALITY OF HOMOPHILY

The studies reviewed in this chapter explored the relationship between online and offline mobilization in mass protests and revealed a geographical splintering between online supporters and those taking to the streets. Rotman and Shalev (2020) reviewed these findings and argued they imply a complementary division of labor between the two spheres of participation in large-scale protest campaigns, a proposition that is counter to narratives describing a sequential relationship between local online and offline protest activity. This related body of literature argued that social media activity is followed by street protest (Steinert-Threlkeld, 2017), that the temporal relationship is actually inversed (Porto & Brant, 2015), or that the two trends occur in parallel (Jungherr & Jürgens, 2014). In our own research, we found significant bidirectional Granger-causality between online and offline protest activity, and Rotman and Shalev (2020) argued that identifying the directionality of this relationship would be greatly facilitated by utilizing mobile location data with more precise spatial and temporal coordinates of participation compared with digital trace data rendered from social platform activity. Another point raised by Rotman and Shalev (2020) is that large-scale protest events that overlap across online and offline spheres have consistently presented common denominators, including the exceptional scale of prominent demonstrations, with participants numbering in the hundreds of thousands and oftentimes including over 1 million participants; the persistence of protest activities, which may extend over a period of weeks or even months (Mercea et al., 2018); and in particular the geographical dispersion of such demonstrations, which extend beyond the site where events kicked off. Studies have also considered whether such mass protests could mobilize previously disengaged citizens and manage to assemble a diverse coalition of participants. 
But a newly researchable question emerged from these developments, a question that could not be answered before due to the absence of granular and representative data about the events. For Rotman and

The directionality of homophily 113

Shalev (2020), the main concern raised against the backdrop of granular digital trace data is not what groups were mobilized, but who mobilizes whom, when, and where? In other words, research exploring the extent to which online social networks replicate, overlap, or deviate from offline networks seldom addressed how influence flows from one network to the other, and in which direction it moves: from online to offline or from offline to online. It is difficult to establish the directionality of homophily because it involves complex cultural and technical interactions supporting the flow of influence from one domain to another. The question is nonetheless essential to understanding how the spatiality of social networks evolves in relation to the networks’ virtual counterparts. Scholarship addressing this problem has sought to estimate these effects on individual identity by exploring personal information, server logs, and digital trace data as an implicit extension of real-world social networks. Another framework exploring this problem is found in literature dedicated to issues associated with anonymity and context collapse. This body of work has probed the merging of online and real-world identities, which is rapidly becoming the norm in online social networks (Marwick & Boyd, 2011; Wesch, 2009). This line of inquiry is organically linked to the seminal internet research of the 1990s because it situates the polarity between online and offline human communication in a theoretical perspective that problematizes, while also providing context to, the irreversibly blurred line between virtual (online) and real (offline) worlds, a line which circumscribed the core of computer-mediated communication theory. The inclusion of spatial homophily complicates the problem because effects of triadic closure and homophily may stem from spatial dependencies in addition to social similarity or online in-group membership. 
This perspective not only adds a third potential incentive toward the creation or strengthening of assortative social ties but also foreshadows the crosspollination of various homophilic tendencies. Online personas are not only exposed to independent pressures toward social, spatial, and cultural homophily; they are further subjected to additional projections rendered from their patterns of interaction online. Thus, content offered by Google or Facebook will vary from person to person and from place to place depending on download history, web search history, cookies, and cache—factors which dynamically contextualize users’ activity with respect to location, language, and consumption patterns. This contextualization creates another drive toward homophily at the junction between spatial, virtual, and social by ensuring users are spatially and socially organized in places, groups, and spaces more aligned with them (Wilson & Graham, 2013). Addressing the directionality of homophily between online and offline social networks remains an empirical question that necessarily starts with conceptual and operational definitions. While there is a large body of scholarship on the association between geography and network formation as a significant driver of tie-selection and retention (McPherson et al., 2001), few studies have explored the extent to which online social networks map to offline communities. We approached the

114 Social networks online and offline

problem with a proof-of-concept study to validate the use of social media signal to model the ideological coordinates underpinning the Brexit referendum debate. Geographically enriched Twitter data was coupled with a machine-learning algorithm that identified tweets along the ideological space of populism, economism, globalism, and nationalism. The granular spatial data amassed for this study allowed for mapping the political value space of users tweeting the Brexit debate onto parliamentary constituencies (Bastos & Mercea, 2018). The study required the extensive collection of Twitter data supplemented by multiple queries to the Twitter API to identify the location of users tweeting the UK EU membership referendum. After adding geographic markers to the database, we calculated the ideological inclination of users and mapped them onto voting constituencies in England, Wales, Scotland, and Northern Ireland. As such, unlike most social media scholarship focused on users or posts, the unit of analysis of this study was parliamentary constituencies, from which we could model the prevailing ideological landscape as articulated on Twitter in the run-up to the referendum. The research was informed by suggestions of a geographical and socio-demographic patterning of voting preferences in the referendum explored in the literature (Hanretty, 2017; Rennie Short, 2016). The geography of the vote, it was proposed, reflected two major developments. The first was a socioeconomic imbalance between an affluent metropolitan elite clustered in and around London that voted to remain and parts of England and Wales that were economically worse off and that voted to leave the EU. 
The second was a political cleavage between the seat of the UK government at Westminster, an increasingly independent-minded Scotland, and Northern Ireland, whose economic prosperity and political stability turned on the existence of an open border with fellow EU member, the Republic of Ireland (Rennie Short, 2016). Following this line of inquiry, the political geography of the plebiscite was unpicked at the level of local authority areas (Becker et al., 2016) to examine the relationship between social media communication and the electoral geography of the Brexit referendum. This approach allowed us to assess the extent to which users tweeting nationalist and populist content would overlap across geographic enclaves, and conversely, whether such patterns could be observed in relation to users tweeting globalist or economist content. In other words, we probed whether the Twitter public stream could be used to identify, measure, and model the political consequences of an alignment between the vote and broader ideological orientations expressed by British public opinion. An important step in this research design was to maximize the fraction of users for which geographic information was available. To this end, we queried the Twitter REST API to retrieve the profile of users who tweeted the referendum, along with those who were @-mentioned or retweeted. Profile information, along with information tweeted by users, was pivotal to identifying the location of the user base. We triangulated information from geocoded tweets (subsequently reverse-geocoded), locations identified in the user profile (then geocoded), and


information that appeared in the tweets. The triangulation prioritized the signal with higher precision, hence geocoded information was preferred if present. When not available, we looked at the location field in users’ profiles and geocoded that location. If neither source of information was available, we checked for information in the tweets, but only in cases where the place_id field of the API response returned relevant information. While we succeeded in identifying the geographic location of 60% of users who participated in the referendum debate on Twitter, a considerable portion of their locations could be identified only up to city level, but not postcode level. From this cohort of 482,193 users tweeting the referendum, only 30,122 were based in the UK. Upon identifying the location of users, we removed user accounts located outside the United Kingdom or whose location we could not identify up to postcode level. This reduced our dataset to 565,028 messages, or 11% of all collected messages. Despite multiple efforts to maximize the collection of geographic information about users, research seeking to explore the overlap between online and offline networks must contend with smaller data samples of the collected data. In this study, the various sampling techniques applied to the data, particularly the geographic rendering of user locations up to postcode level, reduced the universe of collected tweets to only 11% of the data. Another challenge in this type of research consists of defining the unit of analysis. In this case, we selected council wards and parliamentary constituencies because they provided the optimal granularity relative to online and offline data sources. Twitter data was aggregated first at user level, and then at constituency level, the latter being the unit of analysis employed in the study. The resulting dataset included multiple streams of Twitter data consolidated into a single database of online and offline activity at constituency level. 
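The priority order used in the triangulation described in this passage (geocoded tweet first, then the profile location field, then the `place_id` field) can be illustrated with a minimal sketch. The function name `resolve_location` and the returned dictionary layout are hypothetical, introduced only for illustration; the actual geocoding and reverse-geocoding steps are not reproduced here.

```python
from typing import Optional, Tuple


def resolve_location(
    geotag: Optional[Tuple[float, float]],
    profile_location: Optional[str],
    place_id: Optional[str],
) -> Optional[dict]:
    """Pick the most precise location signal available for one user.

    Hypothetical helper: mirrors the priority order described in the
    text, not the authors' actual implementation.
    """
    # 1. A geocoded tweet (latitude, longitude) is the most precise signal.
    if geotag is not None:
        return {"source": "geotag", "value": geotag}
    # 2. Otherwise fall back on the free-text location field of the profile,
    #    which would still need to be geocoded downstream.
    if profile_location and profile_location.strip():
        return {"source": "profile", "value": profile_location.strip()}
    # 3. As a last resort, use the place_id returned with the tweet, if any.
    if place_id:
        return {"source": "place_id", "value": place_id}
    # No usable signal: the user is dropped from the geographic analysis.
    return None
```

Users for whom the function returns `None` correspond to the share of accounts whose location could not be identified and who were therefore excluded from the dataset.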
A scaled Poisson regression model was applied to incorporate demographic information from lower-level geographies, thereby aggregating the results at ward or constituency level, along with voting estimates at the level of council wards for authorities that did not disclose the results at such granular levels (Hanretty, 2017; Huyen Do et al., 2015). The resulting referendum database was thus relatively granular with data down to ward level in England, Scotland, and Wales. As the ward system does not exist in Northern Ireland, the data was aggregated at the local authority district, thus overcoming inconsistencies between local authorities and successfully mapping postcodes to parliamentary constituencies. Mapping geographically rich social media data onto census area or electoral districts is another challenge due to the hierarchical subdivision of UK local government areas into various sub-authority areas and lower levels such as enumeration districts. The challenge is particularly salient when studying the United Kingdom, where there is a paucity of voting data at scales other than the level of the parliamentary constituency, which includes on average 70,000 electors and is, of course, a much coarser level of aggregation compared with local neighborhoods or postcodes, where much social interaction is expected to occur. It is therefore difficult to combine the wealth of socioeconomic and demographic data made available


from censuses with much more granular digital trace data gathered from social media activity. As such, it is difficult to study neighborhood effects and the process through which information flows through social networks and may resonate with spatially polarized voting patterns (Johnston & Pattie, 2011). We addressed this challenge by relying on voting estimates at the level of council wards, which is the most granular level for which we could retrieve results or estimates for the referendum vote. Twitter-related activity was thus mapped to this unit of geographic analysis, thereby geocoding and reverse-geocoding the location of users who tweeted the referendum and subsequently matching postcodes to wards and Parliamentary Constituencies using the database provided by National Statistics Postcode Lookup (ONS Geography, 2011). Twitter users were thus matched to the fields OSLAUA, OSWARD, and the PCON11CD (local authority, ward, and constituency codes, respectively). The first field includes local authority district (LAD), unitary authority (UA), metropolitan district (MD), London borough (LB), council area (CA), and district council area (DCA). Where the council ward system did not exist (i.e., Northern Ireland), data was aggregated using these authorities to cover the entirety of the United Kingdom. Upon geocoding the self-reported location of users, we found that only 30% of them were based in the UK, with 19% of users who participated in the Brexit debate based in the United States and nearly 30% in other EU countries. This is, of course, another marker of the differences in political discourse online and offline, as the former allows individuals from different locations to participate in the public discourse on an issue circumscribed to the UK. 
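The matching of user postcodes to the OSLAUA, OSWARD, and PCON11CD fields can be sketched as a simple lookup followed by aggregation at constituency level. The sketch below is illustrative only: the postcodes and area codes in `NSPL_SAMPLE` are made-up placeholders rather than real entries, and the function names are hypothetical. In practice the lookup table would be loaded from the National Statistics Postcode Lookup file.

```python
from collections import Counter

# Placeholder rows standing in for the postcode lookup table.
# None of these postcodes or codes are real; they only show the shape.
NSPL_SAMPLE = {
    "AA11AA": {"OSLAUA": "LAUA_A", "OSWARD": "WARD_A", "PCON11CD": "PCON_A"},
    "AA12BB": {"OSLAUA": "LAUA_A", "OSWARD": "WARD_B", "PCON11CD": "PCON_A"},
    "BB21CC": {"OSLAUA": "LAUA_B", "OSWARD": "WARD_C", "PCON11CD": "PCON_B"},
}


def normalize_postcode(raw: str) -> str:
    """Strip whitespace and upper-case a postcode so lookups are consistent."""
    return "".join(raw.split()).upper()


def match_user(postcode: str, lookup: dict):
    """Return the local authority, ward, and constituency codes, or None."""
    return lookup.get(normalize_postcode(postcode))


def users_per_constituency(user_postcodes: dict, lookup: dict) -> Counter:
    """Count matched users per parliamentary constituency (PCON11CD)."""
    counts = Counter()
    for postcode in user_postcodes.values():
        row = match_user(postcode, lookup)
        if row is not None:  # unmatched users are left out of the aggregate
            counts[row["PCON11CD"]] += 1
    return counts
```

Aggregating on PCON11CD rather than on the ward code reflects the point made in the text that the constituency code is consistent across the four countries of the United Kingdom.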
Also surprising was the large geographic spread of the British Twitter user base, with London accounting for 14%; Lancashire 7%; Kent, Essex, West Yorkshire, and West Midlands ranging 3–4%; and South Yorkshire, Hertfordshire, Cheshire, Merseyside, Surrey, and Hampshire at 2% each. Taken together, each of these geographic groups is of comparable size to London in the share of users who tweeted the referendum. We ultimately consolidated referendum and Twitter data based on OSLAUA (Local Authorities) and PCON11CD, which is the standardized ID code for each parliamentary constituency, the only Government Statistical Service beyond European electoral region that is available for Northern Ireland and is consistent across the four countries included in the United Kingdom (ONS Geography, 2017). Using postcode as the common geographic marker across databases, this last step of data aggregation allows for pairing Twitter and referendum data based on local authority district, each comprising a range of postcodes. We assigned pseudo codes when no postcodes or grid references were made available by the authorities, particularly in the Channel Islands and Isle of Man. Data provided by the Office of National Statistics assigns the range E06 (UA), E07 (LAD), E08 (MD), and E09 (LB) to England; W06 (UA) to Wales; S12 (CA) to Scotland, and N09 (DCA) to Northern Ireland, with the pseudocode L99 being assigned to the Channel Islands and M99 to the Isle of Man. The results somewhat upset our expectations. Figure 7 shows that the geographic coverage of ideologically laden tweets only partially matched the results


FIGURE 7 Mean score of Globalism-Nationalism (a) and Economism-Populism (b) for each parliamentary constituency compared with the results of the referendum vote (c).

of the referendum. Apart from London and north-west Wales (Gwynedd), globalist messages are absent in Figure 7a. Populist messages are also relatively underwhelming, covering only portions of the Midlands and North. Nationalism was indeed quintessential to the referendum debate during most of the campaign, with three-quarters of messages (74%) displaying nationalist sentiments, as opposed to 26% expressing globalist values, such as international cooperation. We did not find that economically fragile northern England, an area generally supportive of Brexit, was any more likely to embrace nationalist content. In fact, it was remain-backing Scotland that appeared a fertile ground for nationalism. Perhaps more surprising, the distribution of globalist, nationalist, populist, and economist content was to some extent at odds with the geographic distribution of the Leave-Remain vote. Though nearly 40% of tweets contained populist sentiments, these messages were concentrated in a small number of constituencies. In only 10% of the parliamentary constituencies did populist sentiments prevail, compared with economic issues, and in less than 5% did globalist sentiments dominate, compared with nationalist sentiments. All 72 constituencies with overwhelming support for Leave (65% or higher voting to leave) presented predominantly nationalist sentiments. Conversely, only 17 of these constituencies had a Twitter debate predominantly defined by populist sentiments, with 55 being classified as concerned with the economic outlook. These were regions in which Brexit was wholeheartedly embraced—and yet populist sentiments were not predominant in these regions, at least not on Twitter.


In a follow-up study we sought to explore the geographic dependencies of echo-chamber communication on Twitter within the Leave and Remain referendum campaigns (Bastos et al., 2018). In other words, we sought to explore whether the ideological clustering of online communications was a phenomenon specific to online networks or whether it was associated with physical, in-person interaction occurring in offline networks. Similar to the earlier study, we identified the location of users, estimated their partisan affiliation, and finally calculated the distance between sender and receiver of @-mentions and retweets to test whether polarized online echo chambers mapped onto geographically situated social networks. The assumption that online echo chambers would be associated with geographic propinquity was at odds with much of the literature on echo chambers, where echo chambers are defined as a process of self-selection that confines online communication to ideologically aligned cliques (Del Vicario et al., 2016; Zollo et al., 2017). The noticeable absence of spatial data in studies of the echo chamber hypothesis was at times out of pace with a wealth of observations made about the spatial embedding of politically homogeneous communication. Kitchin et al. (2017) cautioned that overt content curation ensuring information homogeneity often relies on spatial information for algorithmic filtering, foregrounding information about some places while concealing other content. With algorithms rarely augmenting places with random information, users are given contextually relevant information to make commercially aware and sensitive choices, a set of parameters that can reinforce spatial bubbles (Graham & Zook, 2013). 
These geographic filter bubbles not only effect change in the political heterogeneity of a given community but can also strip away serendipity from spatial encounters and reinforce digital ghettoization by directing different subgroups to different parts of their urban environments (Kitchin et al., 2017). And yet, space was a noticeable absence in the prevailing narrative about politically homogeneous echo chambers. The narrative argued that the interaction patterns in social media platforms lead users to engage with political content resonating with them (Sunstein, 2007). The ideological clustering observed in politically homogeneous echo chambers would stand in contrast to the diversity of opinions found in offline, face-to-face interactions. It would ultimately jeopardize political compromise, as ill-informed individuals populated different networks in a society increasingly segregated along polarized partisan lines (Tucker et al., 2018). Transposed to the network of tweets about the UK EU membership referendum, and following the prevailing narrative found in the literature, we would expect to find echo chambers as a communication artifact resulting from online discussion alone. Similarly, we would not expect the geographic locations of users to play a significant role in the formation of echo chambers, as echo chambers would result from social media interactions unfettered by geographic space. Indeed, social anxieties surrounding echo chambers posited that social media was another force driving political polarization (Bessi et al., 2015), with a substantive body of observational evidence showing the role of social media in stratifying users across information sources (Conover et al., 2011). While the rapid growth


of online social networks fostered an expectation of higher exposure to a variety of news and politically diverse information (Messing & Westwood, 2014), it also increased the appetite for selective exposure in highly polarized social environments (Wojcieszak, 2010), with the sharing of controversial news items being particularly unlikely to take place in these contexts (Bright, 2016). The filter bubble hypothesis encapsulated these claims by positing that social platforms deploy algorithms designed to quantify and monetize social interaction, narrowly confining it to a bubble algorithmically populated with information closely matching observed and expressed user preferences (Pariser, 2012). But researchers had equally challenged the notion that social media caused selective exposure or ideological polarization, the latter being reportedly more pronounced in face-to-face interactions (Boxell et al., 2017; Gentzkow & Shapiro, 2011; Horrigan et al., 2004). Exposure to diverse and even competing opinions on polarizing topics was found to occur on social media across various national contexts (Bakshy et al., 2015; Fletcher & Nielsen, 2017; Kim, 2011). Similarly, social media was shown to be coextensive with more diverse personal networks, which are more likely to include individuals from a different political party (Hampton et al., 2011). Even with scant evidence linking filter bubbles and echo chambers to general social media communication, there was evidence of echo-chamber communication in several political contexts (Barberá et al., 2015; Krasodomski-Jones, 2016; Vaccari et al., 2016; Wojcieszak & Mutz, 2009). In these settings, political information was more likely to be retweeted if received from ideologically similar sources (Barberá et al., 2015), and cross-ideological information was unlikely to circulate in social clusters with a strong group identity (Himelboim et al., 2013). 
One possible explanation for the conflicting evidence on echo chambers is that politically homogeneous communication online may reflect group formations inherited from offline communities. These homophilic preferences can coexist with social media platforms that provide ideologically diverse networks (Barberá, 2014). As such, the boundaries of one’s network can be simultaneously permeated by echo chambers stemming from offline relationships while also being exposed to competing opinions on polarizing topics that circulate on social media. Similar associational effects have been reported in the literature, with the relationship between spatial distance and users’ interaction on social media found to be significant, and friendship ties in densely connected groups arising at shorter spatial distances compared with social ties between members of different groups (Laniado et al., 2017). More importantly, research found social ties on Twitter to be constrained by geographical distance with an over-representation of ties confined to distances shorter than 100 kilometers (Takhteyev et al., 2012). In our study, we expected these geographic constraints to have interacted with the patterning of the Brexit vote. This interaction would reveal spatial and associational segregation, with a spatial distribution in which people are more likely to talk to those who are categorically more similar to them. As such, the null hypothesis of our study was that the forces underpinning echo chambers were misrepresented: instead of resulting from interaction in social platforms, echo


chambers would reproduce the structural political polarization found in offline social networks. The hypothesis was informed by evidence that bidirectional association between geography and network formation was a significant driver of tie-selection and retention (McPherson et al., 2001). We were also cognizant that geographic proximity affects tie-formation mechanisms associated with both opportunity and preferences, as physical places can be conceived of as a bundle of resources and opportunities with the additional characteristic of spatial contiguity (Glückler, 2007). We posited that the homophily model (Mark, 2003) was well positioned to account for echo chambers, a communication pattern largely devoted to censoring, disallowing, or underrepresenting competing views by enforcing social homogeneity. In other words, the hypothesis about the geographic embedment of online echo chambers draws from the homophily model positing that individuals inhabiting physical communities are more likely to connect with others sharing similar social characteristics, so that cultural similarities and differences among people can be formalized as a function of geographic propinquity (McPherson & Smith-Lovin, 1987; McPherson et al., 2001). Online social networks are reportedly more prone to homophily compared with offline networks, with the latter tied to physical locations where serendipitous exposure to social diversity is more likely to happen (Hampton & Gupta, 2008; Hampton et al., 2010). These factors driving homophily in online and offline networks allowed for testing whether users engaging in echo-chamber communication during the Brexit debate were clustered in geographically homogeneous subgraphs. 
We relied on the Twitter Streaming and REST Application Programming Interfaces (API) to amass a total of 5,099,180 tweets using a set of keywords and hashtags, including relatively neutral tags such as referendum, but more importantly, messages that used hashtags clearly aligned with the Leave and Remain campaigns. Highly charged hashtags (i.e., ‘#takebackcontrol’ or ‘#lovenotleave’) were used as a proxy for users’ ideological position. We applied again the triangulating procedure to identify the location of users, prioritizing the signal with higher precision, so that geocoded information was preferred if present. When not available, we looked at the location field in users’ profiles and geocoded that location. If neither source of information was available, we checked for information in their tweets. HERE provided the API used to geocode and reverse geocode geographic location. As the API provides attribute-level information about the match quality, we removed API responses with a MatchCity score < .9 and whose field MatchType of pointAddress failed to pinpoint the location on the map (HERE, 2013). Even with this extensive triangulation, a considerable portion of user locations could be identified only to city or postcode level. Once the location of users was identified, we relied on the longitude and latitude values to calculate the Euclidean distance (in kilometers) covered by the sender and receiver of @-mentions and retweets. We used the canonical mean equatorial radius (6378.145 km or 2.092567257E7 ft.) for earth radius, which means the calculation was not mathematically precise due to the inaccurate estimate of the earth’s radius (R). Finally, differences in distance were analyzed with a


series of statistical tests, including Chi-square and Kolmogorov-Smirnov. For the Chi-squared tests, we rejected the null hypothesis of the independence assumption if the p-value of

$$\chi^2 = \sum_{i,j} \frac{(f_{i,j} - e_{i,j})^2}{e_{i,j}}$$

was less than the given significance level α. There are perennial limitations to this type of research that one needs to be aware of. Identifying the location of social media users is a notoriously difficult task given the multitude of geographic information made available by social media platforms with various levels of accuracy, reliability, and granularity. While only 1% of tweets usually include geolocational information (Sloan, 2017; Sloan et al., 2013), it is possible to maximize this source by relying on the Twitter REST API, which allows for collecting the last 3,200 messages posted by the user, and then searching for geolocation information in their tweets. Positive matches can be used to roll the location to the remainder of the tweets posted by the same user which lack geographic information. While this approach maximizes precision in determining the location of users, there is no way of knowing whether the geolocation refers to a place where the user works, studies, lives, or was simply traversing. The spatial differences between the places one inhabits and the places one frequents constitute an established field of research in the geography of crime, contrasting residential and ambient population denominators. The distinction was first established by Boggs (1965), who found that the rates for a subset of crimes were dependent on the ambient population within an area, rather than on the residential population. Ambient population thus refers to the actual presence of individuals within a given area, a measure that is necessarily dynamic and exhibits strong spatial and temporal fluctuation. These fluctuations follow seasonal patterns resulting from associate structural differences in routine activities across hotspot locations where one travels to carry out routine activities (Brantingham & Brantingham, 1993; Malleson & Andresen, 2016). This is both a limitation and a feature of geolocated social media data, such as GPS-tagged tweets. 
The same ambiguity pervades information made available in user profiles, which furthermore may be entirely fabricated. The results showed that although most interactions were within a 200-kilometer radius, echo-chamber communication was largely restricted to neighboring areas within a 50-kilometer radius. The geographic trend of echo-chamber communication was, however, different between the Leave and Remain campaigns, with the former spanning much shorter distances compared with the latter. The trend was also reversed for non-echo-chamber communication, which covered shorter distances on the Remain side compared with echo-chamber communication. In other words, Leave campaign messages were chiefly exchanged within ideologically and geographically proximate echo chambers. While echo chambers also prevailed on the Remain side, the trend was, however, inverted: as distance between sender and receiver increased, echo chambers became more common and covered increasingly larger geographic areas compared with non-echo-chamber communication. This reverse trend was captured by the mean distance covered by Leave messages, at 199 km for echo chambers and 234 km for non-echo chambers


( x =168 and x =208 respectively). For Remain messages, inversely, the mean distance was 238 kilometers for echo chambers and 204 kilometers for non-echo chambers ( x =209 and x =184 respectively). Echo-chamber communication nonetheless prevailed on both sides of the campaign. The quantifiably higher volume of echo-chamber communication has potential implications on the homophily patterns observed in non-echo-chamber communication, as larger groups are likely to be more homophilous compared with smaller randomized subsets of the same group (Halberstam & Knight, 2016; McPherson & Smith-Lovin, 1987). Yet, while echo chambers in the Leave campaign appeared further constrained by short geographic distances, this was not the case on the Remain side. In fact, Remain-backing echo chambers were likely to span greater geographic distances while their cross-bubble communication was physically concentrated around neighboring communities, an indication that users aligned with the Remain campaign tried to cross the ideological divide within their communities. Leave echo chambers covered considerably shorter geographic distance at only 168 kilometers compared with 208 kilometers for the Remain campaign. Perhaps more puzzling, the trend was reversed for non-echo chamber communication, which covered shorter distances on the Remain side. Figure 8 unpacks these differences and shows the geographic clustering of Leave messages, particularly in-bubble interactions, centered in the Brexit heartlands of the English Midlands, the North, and the East. We subsequently randomly swapped the location of users in each subgraph and recalculated the distance traveled by @-mention and retweet messages. This allowed us to compare the observed distribution of distances against the random distribution of distances traveled by each message. 
In other words, this approach establishes an association between echo-chamber communication and the geography of message diffusion whenever the observed networks, ceteris paribus, differed significantly from the random network. For each iteration of the test, we retained the set of locations in each subgraph but randomly reordered the locations to test whether geographic dependencies found in echo-chamber communication were replicated in the randomized geographic network. After 100 iterations, we found that the high volume of interactions within geographically proximate echo chambers was a considerable departure from the distribution in the randomized network. The deviation was particularly prominent in echo-chamber communication, a pattern that disappeared when the location of users was randomly reshuffled. This offered important evidence that the geographic distribution of echo chambers was not determined by chance and allowed us to conclude that the observed geographic distribution of echo-chamber communication was very unlikely under the randomized null model. If anything, the results seemed to suggest that the collapsing of distances brought by internet technologies could foreground the role of geography within one’s social network. The unlikely distribution of echo chambers was yet more salient in the subgraph of Leave echo-chamber communication; curiously, it disappeared in non-echo-chamber interactions for the Leave campaign and again in the entire network of Remain interactions. In other

FIGURE 8 (a) Geographic pattern of cross-bubble, out-bubble, and in-bubble (echo chamber) with the number of vertices and edges in each subgraph and the distances averaged over the number of edges. (b) Snapshot of the central point of diffusion of Leave echo chambers in the English Midlands, the North, and the East.

The directionality of homophily 123

words, the association between geographic proximity and echo-chamber communication was particularly salient in the Leave campaign. The maximum distance between the distributions observed in the two samples was significantly higher for Leave in-bubble interactions, in which the peak amplitude deviated from the pattern observed for the rest of the network and during the random reshuffling of users’ locations. The probability of seeing a test statistic as high as or higher than the one observed, if the two samples were drawn from the same distribution, was vanishingly small. The results thus supported the hypothesis that echo-chamber communication in the Leave campaign was associated with geographic proximity. They also provided evidence that echo-chamber communication was associated with geographic proximity in the Remain campaign, but the effect was essentially reversed: echo-chamber communication in the Remain campaign was more likely to cover larger distances.

The analysis of echo-chamber communication in the Leave and Remain subgraphs revealed striking interactions between online activity and geography. The results substantiated the existence of geographically bound sociopolitical enclaves materializing in polarized echo-chamber communication online. The results also identified a geographic patterning in online echo chambers, particularly in the Leave campaign, that was exogenous to online interaction. As such, the findings called into question the assumption that echo chambers were a communication effect resulting from online discussion alone. Our expectation was that the unequal geographic distribution of the British population was to be observed at each iteration of the tests, with Greater London remaining the central point of information diffusion and echo chambers reappearing with relatively unchanged geographic coverage.
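The comparison described above, based on the maximum distance between two distributions and the probability of a test statistic at least as large under a common distribution, reads like a two-sample Kolmogorov-Smirnov test. That identification is an assumption, as the text does not name the test. The sketch below computes the statistic from two samples of distances and estimates the probability by permutation; the function names and the sample values are hypothetical.

```python
import bisect
import random


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Right-continuous empirical CDFs evaluated at every observed value.
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def permutation_p_value(sample_a, sample_b, iterations=1000, seed=0):
    """Share of random relabelings with a statistic as high as or higher than observed."""
    rng = random.Random(seed)
    observed = ks_statistic(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (iterations + 1)


# Hypothetical distances in km (illustrative values, not data from the study).
observed_km = [12, 25, 31, 40, 48, 55, 60, 75, 90, 110]
randomized_km = [150, 180, 205, 220, 240, 260, 275, 300, 330, 360]

statistic = ks_statistic(observed_km, randomized_km)
p_value = permutation_p_value(observed_km, randomized_km)
```

A statistic close to 1 with a small `p_value` indicates that the observed distances are unlikely to come from the same distribution as the reshuffled ones.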
While the network topology remained the same at each iteration, along with the distribution of users’ locations, the geographic distribution of echo chambers was significantly different. The bell-shaped, near-normal distribution in the randomized networks presented a significant departure from the observed geographic coverage of echo chambers and suggested that geography was an exogenous force that impinged on actors involved in echo-chamber communication during the Brexit debate. The absence of geographic factors interacting with echo chambers in the randomized networks was puzzling because it deviated from both the geographic propinquity of echo chambers observed in the Leave campaign and the geographic remoteness observed in the Remain camp. The geographic dependence of echo chambers appeared to be driven by the physical clustering of fundamentally disparate social networks. Instead of remote strangers being activated and incorporated as organic members of one’s social network (Rainie & Wellman, 2012), the results suggested a spillover from in-person conversation to online social media interaction. In other words, echo chambers seemed to connect homophilous dependencies in offline social networks that are not, as such, created by social media activity. This is also consistent with the differences in echo chambers observed in the Leave and Remain campaigns, as the demographic makeup of their social networks is considerably different. In other words, the significant geographic variation found in the data would be driven not only by the locations where the two groups were clustered but also by the social positions embedded in the geographical location of Leave and Remain constituencies. The different social positions occupied by Leavers and Remainers are consistent with the geographical splintering of the country expressed in the referendum and reflect the socioeconomic imbalances separating urban loci of political and economic power, clustered around London, where Remain prevailed, and the economically fragile, worse-off parts of England and Wales, where the Leave vote prevailed. With city-dwellers spending more time shopping or exploring entertainment options outside of their neighborhoods (Groves, 2006), as well as living and working in hubs of the national and global economy (Storper, 2018), it is unsurprising that individuals living in urban areas would travel more and that their resulting social networks would be more widely connected. The distances covered by their interactions should therefore also be longer compared with those of people inhabiting rural or low-density areas of the country. This is, of course, only one aspect of the intricate relationship connecting existing physical ties and online interactions, with others likely at play (Laniado et al., 2017; Takhteyev et al., 2012).

The studies discussed in this chapter show that our understanding of the directionality of homophily between online and offline social networks is evolving and that there is evidence for spatial dependencies in social media interaction. This body of work challenges Facebook’s assumption that social platforms necessarily strengthen existing communities, that they invariably help individuals to come together online and offline, or that they enable groups to form completely new communities that transcend physical location (Zuckerberg, 2017).
Indeed, the assumption that social platforms merely connect individuals online to reinforce their physical communities tends to ignore the complex and multidirectional association between geography and network formation that is a significant driver of tie-selection and retention. It also fails to consider that the diffusion of information on social platforms has real-world consequences that may differ from patterns observed offline and that these interactions can foster or hinder social cohesion. In other words, while interaction across social platforms can evolve in the absence of physical ties, the network externalities arising from online interactions impinge on our very sense of what is real offline.

References

Anduiza, E., Cristancho, C., & Sabucedo, J. M. (2013). Mobilization through online social networks: The political protest of the indignados in Spain. Information, Communication & Society, 17(6), 750–764. doi: 10.1080/1369118x.2013.808360
Backstrom, L., Kleinberg, J., Kumar, R., & Novak, J. (2008). Spatial Variation in Search Engine Queries. Paper presented at the 17th International Conference on World Wide Web, Beijing, China.
Baddeley, A., & Turner, R. (2005). Spatstat: An R Package for Analyzing Spatial Point Patterns. www.jstatsoft.org

Bakker, T. P., & de Vreese, C. H. (2011). Good news for the future? Young people, internet use, and political participation. Communication Research, 38(4), 451–470. doi: 10.1177/0093650210381738
Bakshy, E., Messing, S., & Adamic, L. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130–1132. doi: 10.1126/science.aaa1160
Barberá, P. (2014). How Social Media Reduces Mass Political Polarization. Evidence from Germany, Spain, and the US. American Political Science Association Conference Paper, San Francisco, CA.
Barberá, P., Jost, J. T., Nagler, J., Tucker, J. A., & Bonneau, R. (2015). Tweeting from left to right: Is online political communication more than an echo chamber? Psychological Science, 26(10), 1531–1542. doi: 10.1177/0956797615594620
Barthelmé, S., Trukenbrod, H., Engbert, R., & Wichmann, F. (2012). Modelling fixation locations using spatial point processes. arXiv preprint arXiv:1207.2370.
Bastos, M. T., & Farkas, J. (2019). “Donald Trump is my President!”: The internet research agency propaganda machine. Social Media + Society, 5(3). doi: 10.1177/2056305119865466
Bastos, M. T., & Mercea, D. (2016). Serial activists: Political Twitter beyond influentials and the twittertariat. New Media & Society, 18(10). doi: 10.1177/1461444815584764
Bastos, M. T., & Mercea, D. (2018). Parametrizing Brexit: Mapping Twitter political space to parliamentary constituencies. Information, Communication & Society, 21(7), 921–939. doi: 10.1080/1369118X.2018.1433224
Bastos, M. T., Mercea, D., & Baronchelli, A. (2018). The geographic embedding of online echo chambers: Evidence from the Brexit campaign. PLOS One, 13(11), e0206841. doi: 10.1371/journal.pone.0206841
Bastos, M. T., Mercea, D., & Charpentier, A. (2015). Tents, Tweets, and events: The interplay between ongoing protests and social media. Journal of Communication, 65(2), 320–350. doi: 10.1111/jcom.12145
Bastos, M. T., Recuero, R. C., & Zago, G. S. (2014). Taking tweets to the streets: A spatial analysis of the Vinegar Protests in Brazil. First Monday, 19(3). doi: 10.5210/fm.v19i3.5227
Becker, S. O., Fetzer, T., & Novy, D. (2016). Who Voted for Brexit? A Comprehensive District-Level Analysis (pp. 1–69). Coventry: Centre for Competitive Advantage in the Global Economy, University of Warwick.
Bekafigo, M. A., & McBride, A. (2013). Who Tweets about politics?: Political participation of Twitter users during the 2011 gubernatorial elections. Social Science Computer Review, 31(5), 625–643. doi: 10.1177/0894439313490405
Bennett, W. L., Breunig, C., & Givens, T. (2008). Communication and political mobilization: Digital media and the organization of anti-Iraq war demonstrations in the U.S. Political Communication, 25(3), 269–289. doi: 10.1080/10584600802197434
Bennett, W. L., & Segerberg, A. (2013). The Logic of Connective Action: Digital Media and the Personalization of Contentious Politics. Cambridge: Cambridge University Press.
Bennett, W. L., Segerberg, A., & Walker, S. (2014). Organization in the crowd: Peer production in large-scale networked protests. Information, Communication & Society, 17(2), 232–260. doi: 10.1080/1369118x.2013.870379
Bessi, A., Petroni, F., Vicario, M. D., Zollo, F., Anagnostopoulos, A., Scala, A., et al. (2015). Viral Misinformation: The Role of Homophily and Polarization. Proceedings of the 24th International Conference on World Wide Web, Florence, Italy.
Boggs, S. L. (1965). Urban crime patterns. American Sociological Review, 899–908.
Borge-Holthoefer, J., Rivero, A., García, I., Cauhé, E., Ferrer, A., Ferrer, D., et al. (2011). Structural and dynamical patterns on online social networks: The Spanish May 15th movement as a case study. PLOS One, 6(8), e23883. doi: 10.1371/journal.pone.0023883

Boxell, L., Gentzkow, M., & Shapiro, J. M. (2017). Greater Internet use is not associated with faster growth in political polarization among US demographic groups. Proceedings of the National Academy of Sciences, 114(40), 10612–10617. doi: 10.1073/pnas.1706588114
Boyd, D., Golder, S., & Lotan, G. (2010). Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter. Proceedings of the 43rd Hawaii International Conference on System Sciences, Honolulu, HI.
Brantingham, P. L., & Brantingham, P. J. (1993). Nodes, paths and edges: Considerations on the complexity of crime and the physical environment. Journal of Environmental Psychology, 13(1), 3–28.
Bright, J. (2016). The social news gap: How news reading and news sharing diverge. Journal of Communication, 66(3), 343–365.
Bruns, A., & Burgess, J. E. (2011). The Use of Twitter Hashtags in the Formation of ad hoc Publics. Paper presented at the 6th European Consortium for Political Research General Conference. http://eprints.qut.edu.au/46515/
Bruns, A., & Liang, Y. E. (2012). Tools and methods for capturing Twitter data during natural disasters. First Monday, 17(4).
Castells, M. (1997). The Information Age: Economy, Society and Culture Vol. II – The Power of Identity. Cambridge: Blackwell.
Castells, M. (2009). Communication Power. Oxford: Oxford University Press.
Castells, M. (2012). Networks of Outrage and Hope: Social Movements in the Internet Age. Cambridge: Polity Press.
Censo. (2010). Instituto Brasileiro de Geografia e Estatística. Rio de Janeiro, Brazil: IBGE.
Cheng, Z., Caverlee, J., & Lee, K. (2010). You are Where You Tweet: A Content-Based Approach to Geo-Locating Twitter Users. Paper presented at the 19th ACM International Conference on Information and Knowledge Management, Toronto, Canada.
Conover, M. D., Ratkiewicz, J., Francisco, M., Gonçalves, B., Menczer, F., & Flammini, A. (2011). Political Polarization on Twitter. Paper presented at the 5th International AAAI Conference on Weblogs and Social Media (ICWSM11), Barcelona.
Cranshaw, J., Schwartz, R., Hong, J., & Sadeh, N. (2012). The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City. Paper presented at the 6th International AAAI Conference on Weblogs and Social Media, Dublin.
De Choudhury, M., Jhaver, S., Sugar, B., & Weber, I. (2016). Social Media Participation in an Activist Movement for Racial Equality. Paper presented at the 10th International AAAI Conference on Web and Social Media, Cologne, Germany.
Del Vicario, M., Bessi, A., Zollo, F., Petroni, F., Scala, A., Caldarelli, G., et al. (2016). The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3), 554–559. doi: 10.1073/pnas.1517441113
Diani, M. (2000). Social movement networks virtual and real. Information, Communication & Society, 3(3), 386–401.
Dimitrova, D. V., & Bystrom, D. (2013). The effects of social media on political participation and candidate image evaluations in the 2012 Iowa caucuses. American Behavioral Scientist. doi: 10.1177/0002764213489011
Earl, J., & Kimport, K. (2011). Digitally Enabled Social Change: Activism in the Internet Age. Cambridge: MIT Press.
Earl, J., McKee Hurwitz, H., Mejia Mesinas, A., Tolan, M., & Arlotti, A. (2013). This protest will be Tweeted. Information, Communication & Society, 16(4), 459–478. doi: 10.1080/1369118x.2013.777756

Fletcher, R., & Nielsen, R. K. (2017). Are news audiences increasingly fragmented? A cross-national comparative analysis of cross-platform news audience fragmentation and duplication. Journal of Communication. doi: 10.1111/jcom.12315
Freelon, D., McIlwain, C., & Clark, M. (2018). Quantifying the power and consequences of social media protest. New Media & Society, 20(3), 990–1011.
Gan, Q., Attenberg, J., Markowetz, A., & Suel, T. (2008). Analysis of Geographic Queries in a Search Engine Log. Paper presented at the 1st International Workshop on Location and the Web.
Gao, H., Tang, J., & Liu, H. (2012, June 4–8). Exploring Social-Historical Ties on Location-Based Social Networks. Paper presented at the Sixth International Conference on Weblogs and Social Media (ICWSM’12), Dublin, Ireland.
Gentzkow, M., & Shapiro, J. M. (2011). Ideological segregation online and offline. The Quarterly Journal of Economics, 126(4), 1799–1839.
Gerbaudo, P. (2012). Tweets and the Streets: Social Media and Contemporary Activism. London: Pluto Press.
Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., & Brilliant, L. (2008). Detecting influenza epidemics using search engine query data. Nature, 457(7232), 1012–1014.
Gladwell, M. (2010, October 4). Small change: Why the revolution will not be tweeted. The New Yorker. www.newyorker.com/reporting/2010/10/04/101004fa_fact_gladwell
Glückler, J. (2007). Economic geography and the evolution of networks. Journal of Economic Geography, 7(5), 619–634. doi: 10.1093/jeg/lbm023
González-Bailón, S., Borge-Holthoefer, J., Rivero, A., & Moreno, Y. (2011). The dynamics of protest recruitment through an online network. Scientific Reports, 1. doi: 10.1038/srep00197
Graham, M., & Zook, M. (2013). Augmented realities and uneven geographies: Exploring the geolinguistic contours of the web. Environment and Planning A, 45(1), 77–99.
Groves, R. M. (2006). Nonresponse rates and nonresponse bias in household surveys. Public Opinion Quarterly, 70(5), 646–675.
Gustafsson, N. (2012). The subtle nature of Facebook politics: Swedish social network site users and political participation. New Media & Society. doi: 10.1177/1461444812439551
Halberstam, Y., & Knight, B. (2016). Homophily, group size, and the diffusion of political information in social networks: Evidence from Twitter. Journal of Public Economics, 143, 73–88. doi: 10.1016/j.jpubeco.2016.08.011
Hampton, K. N., & Gupta, N. (2008). Community and social interaction in the wireless city: Wi-fi use in public and semi-public spaces. New Media & Society, 10(6), 831–850. doi: 10.1177/1461444808096247
Hampton, K. N., Livio, O., & Goulet, L. (2010). The social life of wireless urban spaces: Internet use, social networks, and the public realm. Journal of Communication, 60(4), 701–722.
Hampton, K. N., Sessions, L. F., & Her, E. J. (2011). Core networks, social isolation and new media. Information, Communication & Society, 14(1), 130–155. doi: 10.1080/1369118X.2010.513417
Hanretty, C. (2017). Areal interpolation and the UK’s referendum on EU membership. Journal of Elections, Public Opinion and Parties, 27(4), 466–483. doi: 10.1080/17457289.2017.1287081
Harsin, J. (2013). WTF was Kony 2012? Considerations for Communication and Critical/Cultural Studies (CCCS). Communication and Critical/Cultural Studies, 10(2–3), 265–272.
Hawthorne, J., Houston, J. B., & McKinney, M. S. (2013). Live-Tweeting a presidential primary debate: Exploring new political conversations. Social Science Computer Review, 31(5), 552–562. doi: 10.1177/0894439313490643

Hecht, B., Hong, L., Suh, B., & Chi, E. H. (2011). Tweets from Justin Bieber’s Heart: The Dynamics of the Location Field in User Profiles. Paper presented at the SIGCHI Conference on Human Factors in Computing Systems, Vancouver, BC, Canada. doi: 10.1145/1978942.1978976
HERE. (2013). Geocoder API Developer’s Guide (Version 6.2.45). http://documentation.developer.here.com/pdf/geocoding_nlp/6.2.45/Geocoder%20API%20v6.2.45%20Developer’s%20Guide.pdf
Himelboim, I., McCreery, S., & Smith, M. (2013). Birds of a feather Tweet together: Integrating network and content analyses to examine cross-ideology exposure on Twitter. Journal of Computer-Mediated Communication, 18(2), 40–60. doi: 10.1111/jcc4.12001
Horrigan, J., Garrett, K., & Resnick, P. (2004). The internet and democratic debate. Pew Internet & American Life Project. Washington, DC: Pew Research Center.
Huang, J., Thornton, K. M., & Efthimiadis, E. N. (2010). Conversational Tagging in Twitter. Paper presented at the 21st ACM Conference on Hypertext and Hypermedia, Toronto, Canada. doi: 10.1145/1810617.1810647
Huyen Do, V., Thomas-Agnan, C., & Vanhems, A. (2015). Spatial reallocation of areal data – another look at basic methods. Revue d’Économie Régionale & Urbaine, mai(1), 58. doi: 10.3917/reru.151.0027
Illian, J., Penttinen, P. A., Stoyan, H., & Stoyan, D. (2008). Statistical Analysis and Modelling of Spatial Point Patterns. Chichester: Wiley.
Johnston, R., & Pattie, C. (2011). Social networks, geography and neighbourhood effects. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis. London: SAGE Publications.
Jungherr, A., & Jürgens, P. (2013, October 23–26). Forecasting the Pulse: How Deviations from Regular Patterns in Online Data can Identify Offline Phenomena. Paper presented at the Internet Research 14.0, Denver, USA.
Jungherr, A., & Jürgens, P. (2014). Stuttgart’s Black Thursday on Twitter: Mapping political protests with social media data. In R. Gibson, M. Cantijoch, & S. Ward (Eds.), Analyzing Social Media Data and Web Networks (pp. 154–196). Houndmills: Palgrave Macmillan.
Kim, Y. (2011). The contribution of social network sites to exposure to political difference: The relationships among SNSs, online political messaging, and exposure to crosscutting perspectives. Computers in Human Behavior, 27(2), 971–977. doi: 10.1016/j.chb.2010.12.001
Kitchin, R., Lauriault, T. P., & Wilson, M. W. (2017). Understanding Spatial Media. London: SAGE Publications.
Koopmans, R., & Rucht, D. (2002). Protest event analysis. In B. Klandermans & S. Staggenborg (Eds.), Methods of Social Movement Research (pp. 231–259). Minneapolis: Minnesota University Press.
Krasodomski-Jones, A. (2016). Talking to Ourselves? Political Debate Online and the Echo Chamber Effect. London: DEMOS.
Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics-Theory and Methods, 26(6), 1481–1496.
Kulldorff, M. (2001). Prospective time periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society: Series A (Statistics in Society), 164(1), 61–72. doi: 10.1111/1467-985x.00186
Kulshrestha, J., Kooti, F., Nikravesh, A., & Gummadi, K. P. (2012). Geographic Dissection of the Twitter Network. Paper presented at the 6th International AAAI Conference on Weblogs and Social Media, Dublin.
Laniado, D., Volkovich, Y., Scellato, S., Mascolo, C., & Kaltenbrunner, A. (2017). The impact of geographic distance on online social interactions. Information Systems Frontiers, 1–16.

Leetaru, K., Wang, S., Cao, G., Padmanabhan, A., & Shook, E. (2013). Mapping the global Twitter heartbeat: The geography of Twitter. First Monday, 18(5).
Lim, M. (2013). Framing Bouazizi: ‘White lies’, hybrid network, and collective/connective action in the 2010–11 Tunisian uprising. Journalism, 14(7), 921–941. doi: 10.1177/1464884913478359
Malleson, N., & Andresen, M. A. (2016). Exploring the impact of ambient population measures on London crime hotspots. Journal of Criminal Justice, 46, 52–63. doi: 10.1016/j.jcrimjus.2016.03.002
Mark, N. P. (2003). Culture and competition: Homophily and distancing explanations for cultural niches. American Sociological Review, 319–345.
Marwick, A. E., & Boyd, D. (2011). I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. New Media & Society, 13(1), 114–133. doi: 10.1177/1461444810365313
McLuhan, M. (1962). The Gutenberg Galaxy. Toronto, Canada: University of Toronto Press.
McPherson, M., & Smith-Lovin, L. (1987). Homophily in voluntary organizations: Status distance and the composition of face-to-face groups. American Sociological Review, 52(3), 370–379. doi: 10.2307/2095356
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415–444. doi: 10.1146/annurev.soc.27.1.415
Mercea, D., Karatas, D., & Bastos, M. (2018). Persistent activist communication in occupy Gezi. Sociology, 52(5), 915–933. doi: 10.1177/0038038517695061
Messing, S., & Westwood, S. J. (2014). Selective exposure in the age of social media. Communication Research, 41(8), 1042–1063. doi: 10.1177/0093650212466406
Morozov, E. (2011). The Net Delusion: How Not to Liberate the World. London: Allen Lane.
Morozov, E. (2012). The Net Delusion: The Dark Side of Internet Freedom. New York, NY: Public Affairs Books.
Morozov, E. (2013). To Save Everything, Click Here: The Folly of Technological Solutionism. New York, NY: Public Affairs Books.
Moynihan, C. (2011, September 17). Wall street protest begins, with demonstrators blocked. The New York Times. http://cityroom.blogs.nytimes.com/2011/09/17/wall-street-protest-begins-with-demonstrators-blocked/
Noulas, A., Scellato, S., Mascolo, C., & Pontil, M. (2011, July 17–21). An Empirical Study of Geographic User Activity Patterns in Foursquare. Paper presented at the Fifth International Conference on Weblogs and Social Media (ICWSM’11), Barcelona, Spain.
ONS Geography. (2011). National statistics postcode lookup UK. ONS. https://data.gov.uk/dataset/national-statistics-postcode-lookup-uk
ONS Geography. (2017). NHS Postcode Directory User Guide. Newport, UK: Office for National Statistics.
Papacharissi, Z., & Oliveira, M. D. F. (2012). Affective news and networked publics: The rhythms of news storytelling on #Egypt. Journal of Communication, 62(2), 266–282.
Pariser, E. (2012). The Filter Bubble: What the Internet is Hiding from You. London: Penguin.
Penney, J., & Dadas, C. (2014). (Re)Tweeting in the service of protest: Digital composition and circulation in the Occupy Wall Street movement. New Media & Society, 16(1), 74–90. doi: 10.1177/1461444813479593
Pew Research Center. (2013). The demographics of social media users, 2012 (M. Duggan & J. Brenner, Eds.). Washington, DC: Pew Research Center’s Internet & American Life Project.

Pew Research Center. (2016). Social media update 2016. Pew Research Center, 11(2).
Porto, M. P., & Brant, J. (2015). Social media and the 2013 protests in Brazil: The contradictory nature of political mobilization in the digital era. In L. Dencik & O. Leistert (Eds.), Critical Perspectives on Social Media and Protest: Between Control and Emancipation (pp. 181–199). New York, NY: Rowman & Littlefield.
Quercia, D., Capra, L., & Crowcroft, J. (2012). The Social World of Twitter: Topics, Geography, and Emotions. Paper presented at the Sixth International Conference on Weblogs and Social Media (ICWSM’12), Dublin, Ireland.
Rainie, H., & Wellman, B. (2012). Networked: The New Social Operating System. Cambridge, MA: MIT Press.
Rennie Short, J. (2016). The geography of Brexit: What the vote reveals about Disunited Kingdom. The Conversation. http://theconversation.com/the-geography-of-brexit-what-the-vote-reveals-about-the-disunited-kingdom-61633
Rotman, A., & Shalev, M. (2020). Using location data from mobile phones to study participation in mass protests. Sociological Methods & Research. doi: 10.1177/0049124120914926
Sakaki, T., Okazaki, M., & Matsuo, Y. (2010). Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors. Paper presented at the 19th International Conference on World Wide Web, Raleigh, NC.
Sanggyun, K., & Brown, E. N. (2010, March 14–19). A General Statistical Framework for Assessing Granger Causality. Paper presented at the 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Dallas, TX.
Schabenberger, O., & Gotway, C. A. (2004). Statistical Methods for Spatial Data Analysis. London: CRC Press.
Shamma, D. A., Kennedy, L., & Churchill, E. F. (2009). Tweet the Debates: Understanding Community Annotation of Uncollected Sources. Paper presented at the 1st SIGMM workshop on Social media, Beijing, China. http://delivery.acm.org/10.1145/1640000/1631148/p3-shamma.pdf?ip=169.237.62.143&id=1631148&acc=ACTIVE%20SERVICE&key=CA367851C7E3CE77%2EBD0EBCE24FE9A3C5%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&CFID=823708297&CFTOKEN=13640073&__acm__=1470779360_b4279060f3cb059e7340daa2be975241
Sloan, L. (2017). Who Tweets in the United Kingdom? Profiling the Twitter population using the British social attitudes survey 2015. Social Media + Society, 3(1), 2056305117698981. doi: 10.1177/2056305117698981
Sloan, L., Morgan, J., Housley, W., Williams, M., Edwards, A., Burnap, P., et al. (2013). Knowing the Tweeters: Deriving sociologically relevant demographics from Twitter. Sociological Research Online, 18(3), 7.
Steinert-Threlkeld, Z. C. (2017). Spontaneous collective action: Peripheral mobilization during the Arab Spring. American Political Science Review, 111(2), 379–403.
Storper, M. (2018). Separate worlds? Explaining the current wave of regional economic polarization. Journal of Economic Geography, 18(2), 247–270. doi: 10.1093/jeg/lby011
Sunstein, C. R. (2007). Republic.com 2.0. Princeton, NJ: Princeton University Press.
Taamouti, A., Bouezmarni, T., & El Ghouch, A. (2014). Nonparametric estimation and inference for conditional density based Granger causality measures. Journal of Econometrics, 180(2), 251–264. doi: 10.1016/j.jeconom.2014.03.001
Takhteyev, Y., Gruzd, A., & Wellman, B. (2012). Geography of Twitter networks. Social Networks, 34(1), 73–81. doi: 10.1016/j.socnet.2011.05.006
Tarrow, S. (2005). The New Transnational Activism. Cambridge: Cambridge University Press.

Theocharis, Y. (2013). The wealth of (occupation) networks? Communication patterns and information distribution in a Twitter protest network. Journal of Information Technology & Politics, 10(1), 35–56.
Theocharis, Y., Vitoratou, S., & Sajuria, J. (2017). Civil society in times of crisis: Understanding collective action dynamics in digitally-enabled volunteer networks. Journal of Computer-Mediated Communication, 22(5), 248–265. doi: 10.1111/jcc4.12194
Tremayne, M. (2014). Anatomy of protest in the digital era: A network analysis of Twitter and Occupy Wall Street. Social Movement Studies, 13(1), 110–126.
Tucker, J. A., Guess, A., Barberá, P., Vaccari, C., Siegel, A., Sanovich, S., et al. (2018). Social Media, Political Polarization, and Political Disinformation: A Review of the Scientific Literature. Menlo Park, CA: William and Flora Hewlett Foundation.
Vaccari, C., Valeriani, A., Barberá, P., Jost, J. T., Nagler, J., & Tucker, J. A. (2016). Of echo chambers and contrarian clubs: Exposure to political disagreement among German and Italian users of Twitter. Social Media + Society, 2(3), 2056305116664221. doi: 10.1177/2056305116664221
Valenzuela, S. (2013). Unpacking the use of social media for protest behavior the roles of information, opinion expression, and activism. American Behavioral Scientist, 57(7), 920–942.
Valenzuela, S., Arriagada, A., & Scherman, A. (2012). The social media basis of youth protest behavior: The case of Chile. Journal of Communication, 62(2), 299–314. doi: 10.1111/j.1460-2466.2012.01635.x
Vallina-Rodriguez, N., Scellato, S., Haddadi, H., Forsell, C., Crowcroft, J., & Mascolo, C. (2012). Los Twindignados: The Rise of the Indignados Movement on Twitter. Proceedings of the 2012 Conference on Social Computing (SocialCom), Amsterdam.
Vasi, I. B., & Suh, C. S. (2013, February 7). Protest in the Internet Age: Public Attention, Social Media, and the Spread of ‘Occupy’ Protests in the United States. Paper presented at the Politics and Protest workshop, CUNY Graduate Center, New York, NY, USA.
Volkovich, Y., Scellato, S., Laniado, D., Mascolo, C., & Kaltenbrunner, A. (2012). The Length of Bridge Ties: Structural and Geographic Properties of Online Social Interactions. Paper presented at the 6th International AAAI Conference on Weblogs and Social Media, Dublin.
Walker, S., Mercea, D., & Bastos, M. T. (2019). The disinformation landscape and the lockdown of social platforms. Information, Communication and Society, 22(11), 1531–1543. doi: 10.1080/1369118X.2019.1648536
Wesch, M. (2009). Youtube and you: Experiences of self-awareness in the context collapse of the recording webcam. Explorations in Media Ecology, 8(2), 19–34.
Wilson, M. W., & Graham, M. (2013). Situating neogeography. Environment and Planning A: Economy and Space, 45(1), 3–9. doi: 10.1068/a44482
Wojcieszak, M. (2010). ‘Don’t talk to me’: Effects of ideologically homogeneous online groups and politically dissimilar offline ties on extremism. New Media & Society, 12(4), 637–655. doi: 10.1177/1461444809342775
Wojcieszak, M., & Mutz, D. C. (2009). Online groups and political discourse: Do online discussion spaces facilitate exposure to political disagreement? Journal of Communication, 59(1), 40–56. doi: 10.1111/j.1460-2466.2008.01403.x
Yardi, S., & Boyd, D. (2010). Tweeting from the Town Square: Measuring Geographic Local Networks. Paper presented at the 4th International AAAI Conference on Weblogs and Social Media, Washington, DC.
Zollo, F., Bessi, A., Del Vicario, M., Scala, A., Caldarelli, G., Shekhtman, L., et al. (2017). Debunking in a world of tribes. PLOS One, 12(7), e0181821. doi: 10.1371/journal.pone.0181821

The directionality of homophily 133

Zuckerberg, M. (Producer). (2017[2018, January 1]). Building Global Community. www.facebook.com/notes/mark-zuckerberg/building-global-community/10154544292806634 Zuckerman, E. (2008). Pros and Cons of Facebook Activism. http://web.archive.org/ web/20110327124755/http:/www.ethanzucker man.com/blog/2008/02/08/ pros-and-cons-of-facebook-activism/ Zuyev, S., & White, D. (2013). tripack: Triangulation of Irregularly Spaced Data (Version R Package Version 1.3–6). http://CRAN.R-project.org/package=tripack

SECTION IV

Mapping online to offline social networks: Bridging geography and geodesy

10 NETWORK LAYOUTS BY GEODESY AND GEOGRAPHY

In this fourth and last section we probe the intersection of spatial and network space (geographic and geodesic coordinates) to map networks that are simultaneously social, virtual, and spatial. In other words, we explore the dyadic interactions of social & virtual ↔ social & spatial. This section also offers a brief introduction to the packages sp for vector data (Bivand et al., 2008; Pebesma & Bivand, 2005); raster for raster data (Hijmans, 2015); rgdal for input, output, and projections (Bivand et al., 2015); rgeos for geometry operations (Bivand & Rundel, 2014); spdep for spatial dependence (Bivand et al., 2013); and spatstat (Baddeley & Turner, 2005) for space-time point pattern analysis. The last chapter provides instructions for loading spatial and social network data into R and details a set of custom functions for plotting social media data onto geographic grids. Plotting social ties in network space renders a rather different picture from plotting the same network on a geographic grid. The rationale orienting the many network layouts available to researchers is perhaps the most intuitive and straightforward way to unpack the challenges in graphing relational data across different geometries. As such, the following chapters explore techniques for the visual inspection of network graphs and informed usage of descriptive social network statistics that are essential for exploratory social network analysis. The objective is to contrast network visualization based on layout algorithms with stationary geographic grids. To this end, we detail the benefits and shortcomings of popular network visualizations, including force-directed, radial tree, and node-link layouts, compared with network lattices and matrices. The last chapter in this section covers computational methods for transforming relational and spatial data toward plotting social networks on a map. Before social media ties can be visualized on a geographic grid, it is necessary to draw a base map.
Base maps are the building blocks of spatial visualization, and there are many different sources of mapping data. They are drawn using a set of
rules for visualizing features like roads and rivers. Tile renderers are used for creating base maps, upon which one can draw choropleths or more complex visualization techniques like heat maps and gravitational interpolation. For our purposes, we will explore simple base maps to plot actors and links encompassing the dyadic interactions of social ↔ spatial, social ↔ virtual, and virtual ↔ spatial. We begin by reviewing the growing body of scholarship exploring the intersections between social interaction on a virtual platform and social interaction in geographic locations. This body of literature has likewise contended with the challenges addressed in this book. Cranshaw et al. (2012) explored geolocated user data from Foursquare to map the on-the-ground dynamics of a city. The study found that, on a large scale, the structure of a place may differ from the official boundaries of a city. The movement of people captured by social platforms defines the character of an area, with a dynamic view of the social flows of individuals throughout different communities, communities which may differ from official neighborhood boundaries. These differences were characterized along three dispersion patterns used to describe the relationship between city neighborhoods and what the study refers to as ‘livehoods,’ or the actual geographic span of online communities: split, spilled, and corresponding. Split patterns show that different demographics or different functions are operating in a certain area, so that the geographic coverage of a community may differ entirely from the official boundaries of that community. Spilled patterns occur when online activity spills across the borders between two or more neighborhoods. The crossover is typically observed in areas that are in transition or indicates a shift in people’s behaviors and perceptions of that area, often due to developers’ efforts to blur the lines between what were once two very different neighborhoods. 
Corresponding patterns happen when online activity seems to either follow or reproduce the official geographic boundaries, so that social interactions online are greatly influenced by municipal borders and geography. These arbitrary borders, set by city urban planners based on census tracts and geographic landmarks such as roads and bridges, are an attempt to bring order to the chaos of urban occupation and development. They play an important role in the allocation of resources and the planning of local development, but these borders may only partially represent the observed areas of the city rendered by patterns of human interaction online (Cranshaw et al., 2012). Similarly, Facebook created the Commuting Zones database to identify geographic areas where people live and work. The methodology was originally developed by the United States Department of Agriculture Economic Research Service (USDA ERS) to map hotspots of human and economic activity that spanned municipal, state, or political boundaries. But unlike traditional commuting zones developed with national census data, Facebook Commuting Zones leveraged a database of 2.5 billion users around the world and could be updated every three months to capture local economic and commuting changes. The social platform aggregated users’ home and work locations to create a graph that measures
movements between locations over time. Much like ‘livehoods,’ the resulting zones define local labor markets and economies that extend beyond regional administrative boundaries. The graph connects population centers and clusters these locations to identify areas in which people tend to spend much of their time and to interact (Facebook Commuting Zones, 2020). There is, of course, a wealth of literature on spatial analysis, including work that combines spatial with network analysis. Okabe and Sugihara (2012) detailed statistical and computational methods in spatial analysis along networks, though it unfortunately focuses on networks that are spatial, but not social, such as planar networks or networks drawn in a plane, such as roads, where nodes are placed at the crossing connecting four branches or a crossing planar. Planar networks, such as paths, cells, trees, and circuits, include the geographically relevant transport circuits and the intricate patterns emerging from road, rail, pipeline, and telecommunications systems (Haggett & Chorley, 1969). Planar networks are, however, limited, even for representing city roads, which may include three-dimensional crossing points like an overpass or underpass that requires nonplanar representation—that is, the associated network structure is not embedded in a plane. Nonplanar structures are abundant: transportation networks of railroads and subways, road networks at ground level, expressways at a higher level, and a network of customers wandering in a multilevel shopping mall are networks that cannot be embedded in a plane without mutual crossing except at nodes (Okabe & Sugihara, 2012). Nonplanar networks, on the other hand, are difficult to split across layers. The routes connecting international airports could potentially be drawn with arcs representing airplanes, arcs that would intersect with each other at several locations, but it is difficult to divide the routes into layers. 
While it is possible to assign a specific trajectory with discrete heights to each airplane to avoid collision, intersections will necessarily occur, and it is not possible to apply the winged-edge data structure to general nonplanar networks. Given these constraints, it is not possible to map social relationships to planar networks. While the actors (nodes) are necessarily embedded in space, social relationships among actors exist in the cognitive space of their psychological systems, which has no geographic positioning. Geographers have regularly encountered this problem when studying the streaming of water within a drainage basin versus the interregional stream of migrants. While both examples represent functional systems where flow is the fundamental property, the problem of organizing human versus natural resources requires distinctive channel networks (Haggett & Chorley, 1969). Network analysis is not quite optimized for spatial analysis. The topology of social networks is usually characterized with an adjacency matrix A whose elements are Aij = 1 if nodes i and j are connected and 0 otherwise, but this matrix is insufficient for spatial networks, which in addition to information about who is connected to whom also need node information X, where row i holds the latitude and longitude of node i, to pinpoint the geographic location of nodes in the network. Given that A and X are independent, two topologically identical graphs may present fundamentally different spatial distributions. The visualization of the
same network will therefore depend on whether positional information, such as latitude and longitude coordinates, is employed to place the actors along the geographic space instead of the somewhat arbitrary position of actors rendered by network layout algorithms. As such, geographically embedded social networks basically consist of two network graphs merged into one. Perhaps due to the added analytical complexity, social media data is rarely explored against the backdrop of geography, notwithstanding online activity being embedded in time and space and regularly exhibiting geographic dependencies. While trade networks and transportation systems have been extensively mapped into geographical maps (Friendly, 2005), information resulting from online interaction is rarely mapped to geographic locations due to the assumption that the internet in general and social media in particular have overcome geographical distance. Krempel (2011) makes the case that given the opportunities opened by digital trace data to model and map human behavior embedded in space and online, it is rather surprising that the field has not been intensively studied. Part of the problem is that network visualization explores geodesics, or the shortest paths connecting any pair of nodes, rather than geographic distances separating any two nodes, with the resulting analysis restricted to the description of visually observable connections on the network image. These visualizations are drawn by algorithms designed primarily to facilitate the reading of the data. Network layouts therefore reduce complexity by offering a glimpse into a universe of interconnection that is otherwise too ample, indeed too complex, for human reckoning and judgment. Such network layouts are, however, not designed to allow for hypothesis testing, or the inference of relationships, nor do they constitute a proxy for actual network analysis. 
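The separation between topology and geography described above can be made concrete with a small sketch in base R. It assumes a hypothetical four-node network, an adjacency matrix A, and a coordinate matrix X with one row per node; the node names, ties, coordinates, and the helper functions hop_distance and haversine_km are illustrative assumptions rather than part of any package discussed in this book. The sketch contrasts the geodesic distance between two actors (counted in network hops) with their geographic distance (in kilometers).

```r
# Illustrative data only: node names, ties, and coordinates are assumptions.
nodes <- c("London", "Paris", "NewYork", "Sydney")

# Topology: symmetric adjacency matrix A (1 = tie, 0 = no tie).
A <- matrix(0, nrow = 4, ncol = 4, dimnames = list(nodes, nodes))
ties <- rbind(c("London", "NewYork"),
              c("NewYork", "Sydney"),
              c("Sydney", "Paris"))
for (k in seq_len(nrow(ties))) {
  A[ties[k, 1], ties[k, 2]] <- 1
  A[ties[k, 2], ties[k, 1]] <- 1
}

# Geography: coordinate matrix X, one row per node (latitude, longitude).
X <- rbind(London  = c(lat = 51.5074,  lon = -0.1278),
           Paris   = c(lat = 48.8566,  lon = 2.3522),
           NewYork = c(lat = 40.7128,  lon = -74.0060),
           Sydney  = c(lat = -33.8688, lon = 151.2093))

# Geodesic distance: number of hops on the shortest path (breadth-first search).
hop_distance <- function(A, from) {
  d <- rep(Inf, nrow(A))
  names(d) <- rownames(A)
  d[from] <- 0
  frontier <- from
  while (length(frontier) > 0) {
    reached <- character(0)
    for (v in frontier) {
      neighbours <- names(which(A[v, ] == 1))
      unseen <- neighbours[is.infinite(d[neighbours])]
      d[unseen] <- d[v] + 1
      reached <- c(reached, unseen)
    }
    frontier <- unique(reached)
  }
  d
}

# Geographic distance: great-circle distance (haversine formula), in km.
haversine_km <- function(p, q, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (q[1] - p[1]) * to_rad
  dlon <- (q[2] - p[2]) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(p[1] * to_rad) * cos(q[1] * to_rad) * sin(dlon / 2)^2
  unname(2 * radius * asin(sqrt(a)))
}

hops <- hop_distance(A, "London")
km_paris <- haversine_km(X["London", ], X["Paris", ])
km_newyork <- haversine_km(X["London", ], X["NewYork", ])
```

In this toy network London and Paris are the closest pair on the map but the most distant pair in the graph, while London and New York are directly tied despite the ocean between them. Changing X leaves every geodesic distance untouched, which is the sense in which A and X are independent.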
This is particularly troubling when force-based layout algorithms are relied upon to draw social relationships, which are necessarily locally and contextually bound, which is to say spatially embedded, while the algorithm is designed to draw graphs in an aesthetically pleasing and simplified fashion. Force-based layout algorithms are designed to continuously adjust the position of nodes according to a physical system of forces such as springs or molecular mechanics. These algorithms combine repulsive forces separating nodes with attractive forces between adjacent nodes, so that the resulting graph has short edge lengths and well-separated nodes. The continuous movement implemented by the algorithm may be gradient or translated into accelerations, but the movement and positioning of nodes are immaterial to their geographic location. Simpler approaches to drawing a graph include arc diagrams, where nodes are placed on a line and links are drawn as semicircles above or below the line, and circular layout, where nodes are positioned around a circle, with adjacent nodes close to each other to reduce crossings (Battista et al., 1994). These algorithms are not designed to convey geographic accuracy but to display dissimilar numerical information simultaneously instead of sequentially (Bertin, 2010). To be sure, force-directed network layouts are the cornerstone of network visualization. Such network plots are appealing and may potentially render unexpected
insights, not because of the information that has been presented but because the complex entanglement of connections can be presented in a way that is elegant and that arrays disparate information in a comprehensive and understandable manner. As a rule of thumb, network layout algorithms will position nodes and edges to maximize the reading and visualization of the network rather than providing information about node distribution or edge formation. While geographic layouts fix nodes in a Euclidean space, force-directed layouts emphasize division, complementarity, or ranking depending on the chosen algorithm (Figure 9). As such, the distance between nodes indicated by the length of any given edge is ultimately arbitrary and meaningless for analytical purposes. The exception that confirms this rule, of course, is the visualization of mental models in which the distance between nodes has actual analytical implications (Carley & Palmquist, 1992). One of the central principles orienting network layouts is the attempt to structure information hierarchically. Layered maps render hierarchies in the network as layers, with the y axis conveying information about actor status and placing nodes on the x axis (Brandes et al., 2001). Similarly, centrality maps render network information as radial orderings and tend to group nodes around the center, where the distance from the center reflects difference in metrics of centrality, influence, or authority. The center of the graph thus features highly central nodes, and nodes with lower values are pushed to more distant concentric circles. Such hierarchical graphs are more constrained, but the links between the units can be optimized so that connected nodes on different circles are rendered closer to each other (Krempel, 2011). 
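The logic of a centrality map can be sketched in a few lines of base R. The example below assumes a hypothetical six-node edge list and uses degree as the centrality metric; the function radial_layout is an illustrative helper, not the algorithm implemented by Gephi or by any R package. It places the most central node at the origin and pushes nodes with lower scores toward the outer circle.

```r
# Hypothetical undirected network, given as an edge list of node ids.
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(1, 5), c(2, 3), c(5, 6))
n <- max(edges)

# Degree centrality: how many ties each node takes part in.
degree <- tabulate(c(edges[, 1], edges[, 2]), nbins = n)

# Radial layout: the radius shrinks as centrality grows, so the node with
# the highest score sits at the origin and the lowest on the unit circle.
# Assumes that the centrality scores are not all identical.
radial_layout <- function(centrality) {
  n <- length(centrality)
  span <- max(centrality) - min(centrality)
  radius <- 1 - (centrality - min(centrality)) / span
  angle <- 2 * pi * (seq_len(n) - 1) / n
  cbind(x = radius * cos(angle), y = radius * sin(angle))
}

layout <- radial_layout(degree)

# Draw the graph when running interactively.
if (interactive()) {
  plot(layout, pch = 19, asp = 1, axes = FALSE, xlab = "", ylab = "")
  segments(layout[edges[, 1], "x"], layout[edges[, 1], "y"],
           layout[edges[, 2], "x"], layout[edges[, 2], "y"])
}
```

The distance from the center carries information here (the rank of a node by degree), but the angle assigned to each node is arbitrary, so the length of an edge still says nothing about how far apart two actors are on the ground.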
Other graph layout algorithms include the tree layout, which renders a tree-like formation, with the parent-child relationships expanding in circles surrounding the node and diminishing at lower levels, so that circles do not overlap. Layered graph algorithms tend to arrange nodes into horizontal layers, so that most links are ordered downward from one layer to the next, and the nodes within each layer are ordered to minimize overlaps. Spectral layout methods, where the eigenvectors of a matrix are used as coordinates, and orthogonal layout methods, where links are drawn horizontally or vertically, rely on a multiphase approach that planarizes the network by replacing crossing points with nodes and minimizing bends. The radial layout relies on measures of centrality to place nodes at the center of the drawing, so that secondary nodes are placed on the periphery, but strongly linked nodes are not necessarily placed close to one another. These layouts are determined only by the weight of the links connecting nodes and incorporate no other constraints (Herman et al., 2000). Figure 10 shows the implementation of these network layouts in Gephi (Bastian et al., 2009).

FIGURE 9 Implementation of network layouts in Gephi that highlight different features of the topology.

Krempel (2011) traced the development of network mapping to the invention of statistical procedures to represent similarities and distances at Bell Labs in the 1960s and 1970s, a treatment that follows ideas on how cartographers map geographical distance into two-dimensional planes. These algorithms embed observed similarities or distances in metric space, with inconsistencies in the data resolved by a type of least squares procedure. These procedures evolved into network layout algorithms employing spring embedders to arrange nodes as a function of the mechanical forces that linked and repelled them, much like electrical fields that enforced a minimal distance around each of the nodes. Forces of attraction and repulsion can be scaled to spread close neighbors and expand the canvas of the image. Due to the mechanical treatment applied to graphs, the scaling does not affect the readability of a layout because cliques and neighborhoods are preserved, with central nodes found at the center of the graph and typically placed close to each other, while nodes with low connectivity are usually placed at the periphery.

FIGURE 10 Gephi implementation of force-directed layouts VxOrd and ForceAtlas (above) compared with radial and geographic layouts (below) using the Airlines sample dataset.

Geographical layouts are fundamentally different because nodes have preassigned coordinates in a planar space. The challenge consists of finding a way to draw non-overlapping links using polylines or spline curves to untangle the complexities of a network in the image space. While proximity in geography is time-invariant, network layout algorithms render proximity between actors as a function of the strength of the observed relationships. In other words, proximity in network layouts reveals who is strongly connected to whom. Social network graphs resort to proximity as a proxy for influence, with the closeness between nodes indicating similar characteristics, though these characteristics may be immaterial to their geographic location. Similarly, actors who are connected only by indirect paths are positioned at a distance, even if they happen to be geographically proximate. Unfortunately, conventional network layouts that translate numerical information into a network space are not tailored to display the same information over a geographic grid, not least because human visual perception struggles with patterns of dispersion and concentration that are typical of spatial distributions. One can adjust the lines representing links so that they have different widths to represent weight, but these adjustments can easily lead to overlapping edges if the network is dense. It is also not possible to compensate for this problem by placing nodes north and south of the image to minimize overlap, as the positions of nodes are determined by their geographic position.
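To illustrate how a spring embedder derives positions from ties alone, the following base R sketch runs a simplified force-directed loop in the spirit of the Fruchterman and Reingold approach. It is a minimal illustration under stated assumptions, not the VxOrd or ForceAtlas implementation shown in Figure 10: the six-node graph, the ideal edge length k, the number of iterations, and the linear cooling schedule are all arbitrary choices made for the example.

```r
set.seed(42)

# Hypothetical graph: two triangles joined by a single bridge (3 to 4).
edges <- rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4), c(4, 5), c(5, 6), c(4, 6))
n <- 6
A <- matrix(0, n, n)
A[edges] <- 1
A <- A + t(A)

# Random starting positions: no geographic information is used anywhere.
pos <- matrix(runif(2 * n), ncol = 2)
k <- 1 / sqrt(n)        # assumed ideal edge length
n_iter <- 200           # assumed number of iterations

for (iter in seq_len(n_iter)) {
  disp <- matrix(0, n, 2)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) next
      delta <- pos[i, ] - pos[j, ]
      d <- max(sqrt(sum(delta^2)), 1e-6)
      # Repulsive force between every pair of nodes.
      disp[i, ] <- disp[i, ] + (delta / d) * (k^2 / d)
      # Attractive force along observed ties only.
      if (A[i, j] == 1) {
        disp[i, ] <- disp[i, ] - (delta / d) * (d^2 / k)
      }
    }
  }
  # Cooling: cap the displacement, with the cap shrinking over time.
  step <- 0.1 * (1 - iter / n_iter)
  len <- pmax(sqrt(rowSums(disp^2)), 1e-6)
  pos <- pos + disp / len * pmin(len, step)
}
```

The resulting coordinates in pos depend only on A and on the random seed. A geographic layout would skip the loop entirely and use the preassigned latitude and longitude of each node, which is why the two drawings of the same network can look so different.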
Coloring, on the other hand, remains a key attribute for displaying spatial social networks efficiently, and the work of Brewer (2015), including the accompanying R package (Neuwirth & Brewer, 2014), was central in developing color schemes that allow qualitative, sequential, or diverging distributions of attributes. These procedures help in the mapping of social space into meaningful planar representations through bijective mappings (Krempel, 2011). Sarkar et al. (2019) summarized the problem by arguing that the wealth of visual approaches available to generate node-and-edge diagrams (i.e., sociograms) is of limited use when the network data contains geographically positioned actors. This is because the positions of nodes are not fixed in geodesic coordinates (measured in network hops) but in Euclidean coordinates (measured in base units of length), which is a considerable departure from graph layout algorithms designed to create aesthetically and analytically viable sociograms that maximize data readability by moving nodes around the plotting area. Another fundamental constraint in spatial social networks is the significant size of these networks, the size being likely to render hairball-like network structures that are difficult to understand even with force-based algorithms, and certainly too dense to be projected in Euclidean space (x, y) using anchored sociograms. But the successful visualization of spatial social networks can lend important insights into the social and spatial dimensions constraining the network and the
spatial distribution of the social connections. It can also identify the location of key actors across different spatial scales, whether their connections are nearby or dispersed, and whether actors deemed topologically important are connected to clusters of nodes both locally and at a distance. Beyond the possibilities of modeling explored by Sarkar et al. (2019), visualizing social networks on a geographic grid allows the researcher to move back and forth between the numerical data and their visualizations. Ultimately, it allows one to combine the statistical exploration with exploratory visual inspection of patterns that may not be clearly understood because our knowledge and information remain limited (Krempel, 2011). Finally, a network model tailored for the geographic embedding of social media interaction needs to account for the emergence of highly connected individuals known as ‘influencers,’ who are themselves geographically entrenched, and the heavy-tailed distribution of activity that is typical of social media interaction. A tentative model that accounts for the interplay between geography and clustering in social media needs to incorporate geography into the scale-free random connection model (Barabási & Réka, 1999). First, a Poisson process places the nodes randomly in space with a given intensity $V$. Second, an independent and uniformly distributed weight $U_x$ is attached to each node $x$, and an edge is drawn between each pair of nodes $x$ and $y$ with probability

$$p_{xy} = P(x \text{ connected to } y) := \phi\left(\frac{1}{\beta}\, U_x^{\gamma}\, U_y^{\gamma}\, |x - y|^{d}\right),$$

where $d$ is the dimension of the model, so that $d = 2$ for network models on a bidimensional plane, parameter $\gamma \in [0,1]$ controls node weight, and parameter $\beta > 0$ enters the monotonically decreasing profile function $\phi$ (Plagge, 2020).
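The two steps of this tentative model can be simulated in base R as a rough sketch. The parameter values, the unit square as the observation window, and the exponential profile function phi are assumptions made for illustration; they are not taken from Plagge (2020).

```r
set.seed(1)

# Assumed parameters for illustration.
intensity <- 50    # intensity of the Poisson process on the unit square
gamma <- 0.5       # controls the influence of node weights
beta <- 0.05       # enters the argument of the profile function
d <- 2             # dimension of the model (a plane)

# Step 1: Poisson process. Draw a Poisson number of nodes, then place them
# uniformly at random in the unit square.
n <- rpois(1, intensity)
xy <- matrix(runif(n * d), ncol = d)

# Step 2: attach an independent uniform weight to each node.
u <- runif(n)

# Assumed profile function: any monotonically decreasing map into [0, 1].
phi <- function(r) exp(-r)

# Connection probability for every pair of nodes:
# phi((1 / beta) * U_x^gamma * U_y^gamma * |x - y|^d)
D <- as.matrix(dist(xy))
p <- phi((1 / beta) * outer(u^gamma, u^gamma) * D^d)
diag(p) <- 0

# Draw each undirected edge once (upper triangle), then symmetrize.
draw <- matrix(runif(n * n), n, n) < p
upper <- draw & upper.tri(draw)
A <- (upper | t(upper)) * 1
```

Under this parameterization nodes with small weights connect over long distances and accumulate many ties, which is how the sketch produces influencer-like hubs, while most remaining ties stay short. The matrix xy plays the role of the coordinate information and A that of the adjacency matrix.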

11 METHODS IN SPATIAL STATISTICS FOR SOCIAL NETWORKS

There is a growing number of open source and proprietary computer applications for plotting geospatial networks, ranging from point-and-click to command-line scripting and programming. Point-and-click interfaces allow for intuitive manipulation and visualization of the graph, including networks where nodes are placed on a geographic grid. Command-line tools can be fully scripted and are therefore more suitable for advanced modeling and data exploration using spatial statistics (these methods are referred to as spatial rather than geographic because they can be applied to data arrayed in any space, not only geographic space). Many applications include both point-and-click visualization tools and a command-line interface for statistical analysis. The former has the advantage of allowing for instantaneous and intuitive inspection of the geographic distribution (and potentially the geographic dependencies) of nodes. In the remainder of this chapter we detail how to plot networks on a map. The functions and snippets of code detailed below rely heavily on R plotting functions available from the R packages igraph (Csardi & Nepusz, 2006), network (Butts, 2008), and ggplot2 (Wickham, 2009). The R environment for statistical computing (R Development Core Team, 2014) offers a robust and unified platform from which social network and spatial analysis can be performed with comparatively modest systems and an enormous selection of community-contributed packages. As an ecosystem, R offers a multitude of packages for spatial analysis and statistical computing along with established packages for social network analysis. Researchers can leverage these resources to perform data analysis of streams of social media data that include geographic references and allow for mapping user activity to specific locations. 
R is becoming increasingly recognized as the de facto standard for data visualization and spatial analysis, with growing interest in the platform, perhaps hindered only by its relatively steep learning curve, complex help files, and idiosyncratic programming language.

Compared to R, most geographic information systems (GIS) software will provide a point-and-click interface that allows for easier visual interaction, data management, geometric operations, and standard workflows, with single map production and speedy execution. R, on the other hand, is more data- and model-focused; provides a greater range of analyses; rates attributes as more important; manages many (simpler) maps; and emphasizes reproducibility (by scripting) with speedy development. A base R installation can read, visualize, and analyze spatial data, geographic or relational. Spatial observations can be identified to locations, and additional information for each observation can be added and retrieved by the user. The R ecosystem also contains various contributed packages that address two important technical issues: moving spatial data into and out of R and analyzing spatial data in R. Some interoperability is possible between base R and commercial GIS software applications, with R packages maptools and shapefiles allowing for reading and writing ArcGIS and ArcView shapefiles (Bivand & Lewin-Koh, 2015; Stabler, 2013). The universe of R packages dedicated to spatial analysis is too large and constantly changing to be covered in this book. But major R packages are relatively stable, and the functions currently available for the analysis of spatial data are likely to remain available, perhaps with improved functionality, in further releases of the packages. R packages can apply spatial analysis on locations or spatial relationships as an explanatory or predictive variable. Broadly speaking, spatial analysis with R relies on packages sp for vector data (Bivand et al., 2008; Pebesma & Bivand, 2005); raster for raster data (Hijmans, 2015); rgdal for input, output, and projections (Bivand et al., 2015); rgeos for geometry operations (Bivand & Rundel, 2014); and spdep for spatial dependence (Bivand et al., 2013).
Although there are currently no shared classes for spatial objects in R, packages often depend on classes created by other packages. The spatial.tools package contains spatial functions meant to enhance the core functionality of the raster package (Greenberg, 2014; Hijmans, 2015), including a parallel processing engine for use with rasters, while spacetime extends the shared classes defined in sp for spatio-temporal data (Pebesma, 2012; Pebesma & Bivand, 2005). Modern country boundaries are provided by the rworldmap package (South, 2011), along with functions to join and map tabular data referenced by country names or codes, with higher-resolution country borders made available by the rworldxtra package (South, 2012). Historical country boundaries (1946–2012) are provided by the cshapes package (Weidmann & Gleditsch, 2013), along with functions for calculating distance matrices. The R ecosystem includes a number of packages for choropleth maps, bubble maps, and user-supplied maps. The leafletR package (Graul, 2015) provides web-mapping functionality that combines vector data files and online map tiles from different sources, while the OpenStreetMap and osmar packages provide access to OpenStreetMap raster images and data from different sources (Eugster & Schlesinger, 2013; Fellows & Stotz, 2013). The spatstat and spatial packages are dedicated to space-time and point pattern analysis (Baddeley & Turner, 2005; Venables & Ripley, 2002), with spatstat making
extensions to marked processes and spatial covariates, providing ample support for model-fitting and simulation, and allowing researchers to define areas of interest. It is currently the only package available from CRAN (R Development Core Team, 2014) that allows users to fit inhomogeneous point process models with interpoint interactions. Another package used for spatial data manipulation and analysis is sp (Pebesma & Bivand, 2005), which benefits from a large user base and supports vector data as points, lines, and polygons. Spatial analysis is performed over an ‘sp object,’ and although it is possible to create a ‘SpatialPoints object’ from scratch, users often resort to a spreadsheet-like file (data frame) or import GIS files to generate the abovementioned sp object. Finally, the package rgdal provides support for importing and exporting both vector and raster GIS data. For illustration purposes, snippet #1 generates and plots a few sp objects (Bivand et al., 2008). With snippet #2, we identify two Twitter users and a message that was retweeted from user #1 by user #2. We plot the location of users on a map and calculate the distance between the two users linked by the retweet. snippet #1 # load sp package library(sp) # generate data temp