The Urban Book Series
S. C. M. Geertman · Christopher Pettit · Robert Goodspeed · Aija Staffans Editors
Urban Informatics and Future Cities
The Urban Book Series Editorial Board
Margarita Angelidou, Aristotle University of Thessaloniki, Thessaloniki, Greece
Fatemeh Farnaz Arefian, The Bartlett Development Planning Unit, UCL, Silk Cities, London, UK
Michael Batty, Centre for Advanced Spatial Analysis, UCL, London, UK
Simin Davoudi, Planning & Landscape Department GURU, Newcastle University, Newcastle, UK
Geoffrey DeVerteuil, School of Planning and Geography, Cardiff University, Cardiff, UK
Paul Jones, School of Architecture, Design and Planning, University of Sydney, Sydney, NSW, Australia
Andrew Kirby, New College, Arizona State University, Phoenix, AZ, USA
Karl Kropf, Department of Planning, Headington Campus, Oxford Brookes University, Oxford, UK
Karen Lucas, Institute for Transport Studies, University of Leeds, Leeds, UK
Marco Maretto, DICATeA, Department of Civil and Environmental Engineering, University of Parma, Parma, Italy
Ali Modarres, Tacoma Urban Studies, University of Washington Tacoma, Tacoma, WA, USA
Fabian Neuhaus, Faculty of Environmental Design, University of Calgary, Calgary, AB, Canada
Steffen Nijhuis, Architecture and the Built Environment, Delft University of Technology, Delft, The Netherlands
Vitor Manuel Aráujo de Oliveira, Porto University, Porto, Portugal
Christopher Silver, College of Design, University of Florida, Gainesville, FL, USA
Giuseppe Strappa, Facoltà di Architettura, Sapienza University of Rome, Rome, Roma, Italy
Igor Vojnovic, Department of Geography, Michigan State University, East Lansing, MI, USA
Jeremy W. R. Whitehand, Earth & Environmental Sciences, University of Birmingham, Birmingham, UK
Claudia Yamu, Department of Spatial Planning and Environment, University of Groningen, Groningen, Groningen, The Netherlands
The Urban Book Series is a resource for urban studies and geography research worldwide. It provides a unique and innovative resource for the latest developments in the field, nurturing a comprehensive and encompassing publication venue for urban studies, urban geography, planning and regional development. The series publishes peer-reviewed volumes related to urbanization, sustainability, urban environments, sustainable urbanism, governance, globalization, urban and sustainable development, spatial and area studies, urban management, transport systems, urban infrastructure, urban dynamics, green cities and urban landscapes. It also invites research which documents urbanization processes and urban dynamics on a national, regional and local level, welcoming case studies, as well as comparative and applied research. The series will appeal to urbanists, geographers, planners, engineers, architects, policy makers, and to all of those interested in a wide-ranging overview of contemporary urban studies and innovations in the field. It accepts monographs, edited volumes and textbooks. Indexed by Scopus.
More information about this series at http://www.springer.com/series/14773
Editors
S. C. M. Geertman, Faculty of Geosciences, Utrecht University, Utrecht, The Netherlands
Christopher Pettit, City Futures Research Centre, University of New South Wales, Sydney, NSW, Australia
Robert Goodspeed, Taubman College of Architecture and Urban Planning, University of Michigan, Ann Arbor, MI, USA
Aija Staffans, Department of Built Environment, Aalto University, Helsinki, Finland
ISSN 2365-757X ISSN 2365-7588 (electronic) The Urban Book Series ISBN 978-3-030-76058-8 ISBN 978-3-030-76059-5 (eBook) https://doi.org/10.1007/978-3-030-76059-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The international CUPUM conference (Computational Urban Planning and Urban Management) has been one of the premier international conferences for the exchange of ideas and applications of computer technologies to address a range of social and environmental problems relating to urban areas. The first conference took place in 1989 in Hong Kong. Since then, this biennial conference has been hosted in cities across Asia, Australia, Europe, North America and South America (Table 1). Now, in 2021, 32 years after the first CUPUM conference, Aalto-Helsinki in Finland will host the 17th CUPUM conference.

Table 1 Past CUPUM conferences

Number  Year  Place                         Country
I       1989  Hong Kong                     Hong Kong
II      1991  Oxford                        United Kingdom
III     1993  Atlanta                       USA
IV      1995  Melbourne                     Australia
V       1997  Mumbai                        India
VI      1999  Venice                        Italy
VII     2001  Honolulu                      USA
VIII    2003  Sendai                        Japan
IX      2005  London                        United Kingdom
X       2007  Iguazu Falls                  Brazil
XI      2009  Hong Kong                     China
XII     2011  Lake Louise (Calgary/Banff)   Canada
XIII    2013  Utrecht                       The Netherlands
XIV     2015  Boston                        USA
XV      2017  Adelaide                      Australia
XVI     2019  Wuhan                         China
XVII    2021  Aalto-Helsinki                Finland
Table 2 Board of directors of CUPUM

Name                                 Institute                              Country  E-mail
Christopher Pettit (Chair of Board)  University of New South Wales          AUS      [email protected]
Andrew Allan                         University of South Australia          AUS      [email protected]
Joseph Ferreira                      Massachusetts Institute of Technology  USA      [email protected]
Stan Geertman                        Utrecht University                     NED      [email protected]
Robert Goodspeed                     University of Michigan                 USA      [email protected]
Weifeng Li                           University of Hong Kong                CHN      [email protected]
Zhan Qingming                        Wuhan University                       CHN      [email protected]
Antônio Nélson Rodrigues da Silva    University of Sao Paulo                BRA      [email protected]
Renee Sieber                         McGill University                      CAN      [email protected]
Atsushi Suzuki                       Meijo University                       JPN      [email protected]
The CUPUM Board (Table 2) has promoted the publication of a Springer CUPUM Book 2021 with a selection of scientific papers that were submitted to the conference. Those papers went through a competitive, double-blind review process that resulted in the selection of what the reviewers deemed to be the best CUPUM papers of 2021. All these papers fit the overarching theme of the Aalto-Helsinki 2021 CUPUM conference: Urban Informatics and Future Cities. Therein, we acknowledge that Future Cities are in need of innovative technologies, associated methodologies and their adoption by the key actors responsible for their planning and management. This aim will be supported by ‘gathering’ at a hybrid conference on June 9–11, 2021, held both online and at Aalto University in the Helsinki metropolitan area, Finland, and by the publication of this Springer CUPUM Book 2021. Through the combined efforts of the conference and book publication we hope to facilitate the exchange of new ideas on this theme and to bring together researchers and practitioners to better use data, technology and tools to address the challenges facing our cities.

Organizing the programme of an international conference and editing a volume of scientific papers requires dedication, time, effort and support. First of all, we would like to thank all the people closely involved in the organization of the Aalto-Helsinki 2021 CUPUM conference. Organizing such a conference always turns out to be much more work, and to generate many more problems and challenges, than envisaged beforehand. Second, as book editors, we would like to thank the authors for their high-quality contributions. We started with 39 proposals for interesting book chapters and finally ended up with 29 high-quality full chapters in this book. The double-blind review process was not an easy task, and it is always difficult when potential authors experience the disappointment of not being selected. By carrying out the double-blind review process and demanding at least two reviews per submission, we believe that the review process has been conducted in a fair and equitable way.

Third, we would like to thank our fellows from the Board of Directors of CUPUM and the Advisors to the CUPUM Board (Table 3) for their time and expertise in assisting with the review process. Without their help we would not have been able to guarantee a double-blind reviewing process. Furthermore, their critical judgement has improved the overall quality of the book substantially.

Table 3 Advisors to the CUPUM board

Name                   Institute                          Country  E-mail
Michael Batty (Chair)  University College London          GBR      [email protected]
Karl Kim               University of Hawaii               USA      [email protected]
Dick Klosterman        University of Akron                USA      [email protected]
Kazuaki Miyamoto       Tokyo City University              JPN      [email protected]
Paola Rizzi            Università degli Studi di Sassari  ITA      [email protected]
John Stillwell         University of Leeds                GBR      [email protected]
Anthony G.O. Yeh       University of Hong Kong            CHN      [email protected]
Ray Wyatt              University of Melbourne            AUS      [email protected]

Fourth, we would like to thank our scientific sponsors (Utrecht University, University of New South Wales, University of Michigan, Aalto University) for their contribution in time and resources to this publication. Finally, we would like to thank Springer Publishers for their willingness to publish these contributions in their academic Urban Book Series.

This is already the fifth time that a selection of best papers from the CUPUM conference has been published by Springer. The first time was in 2013, when we published the book ‘Planning Support Systems for Sustainable Urban Development’ (Stan Geertman, Fred Toppen, John Stillwell (eds.)). The second time was in 2015, when we published the book ‘Planning Support Systems and Smart Cities’ (Stan Geertman, Joe Ferreira, Robert Goodspeed, John Stillwell (eds.)). In 2017 we published the book ‘Planning Support Science for Smarter Urban Futures’ (Stan Geertman, Andrew Allan, Chris Pettit, John Stillwell (eds.)). And in 2019 we published the book ‘Computational Planning and Management for Smart Cities’ (Stan Geertman, Andrew Allan, Chris Pettit, Qingming Zhan (eds.)). We hope more CUPUM books will follow.

Helsinki, Finland 2021
S. C. M. Geertman Christopher Pettit Robert Goodspeed Aija Staffans
Contents

1 Introduction
  Robert Goodspeed, Chris Pettit, Aija Staffans, and Stan Geertman

Part I Data Analytics and the COVID-19 Pandemic

2 Smart Governance and COVID-19 Control in Wuhan, China
  Huaxiong Jiang, Patrick Witte, and Stan Geertman
3 Using Public-Private Data to Understand Compliance with Mobility Restrictions in Sierra Leone
  Innocent Ndubuisi-Obi Jr, Ziyu Ran, Yanchao Li, Chenab Ahuja Navalkha, Sarah Williams, and Lily Tsai
4 Development of a Spatio-Temporal Analysis Method to Support the Prevention of COVID-19 Infection: Space-Time Kernel Density Estimation Using GPS Location History Data
  Haruka Kato

Part II Big Data and Smart Cities

5 A Review of Spatial Network Insights and Methods in the Context of Planning: Applications, Challenges, and Opportunities
  Xiaofan Liang and Yuhao Kang
6 Transport Infrastructure, Twitter and the Politics of Public Participation
  Wayne Williamson
7 Public Perceptions and Attitudes Towards Driverless Technologies in the United States: A Text Mining of Twitter Data
  Zhiqiu Jiang and Max Zheng
8 Assessing the Value of New Big Data Sources for Transportation Planning: Benton Harbor, Michigan Case Study
  Robert Goodspeed, Meixin Yuan, Aaron Krusniak, and Tierra Bills
9 How Various Natural Disasters Impact Urban Human Mobility Patterns: A Comparative Analysis Based on Geotagged Photos Taken in Tokyo
  Ahmed Derdouri and Toshihiro Osaragi
10 Revealing the Spatial Preferences Embedded in Online Activities: A Case Study of Chengdu, China
  Enjia Zhang, Yu Ye, Jingxuan Hou, and Ying Long

Part III Data-Driven Research of Activity Patterns

11 Application for Locational Intelligence and Geospatial Navigation (ALIGN): Smart Navigation Tool for Generating Routes That Meet Individual Preferences
  Ge Zhang, Subhrajit Guhathakurta, Jon Sanford, and Bon Woo Koo
12 Pedestrian Behavior Characteristics Based on an Activity Monitoring Survey in a University Campus Square
  Toshihiro Osaragi, Yuriko Yamada, and Hiroyuki Kaneko
13 Developing a GIS-Based Tourist Walkability Index Based on the AURIN Walkability Toolkit—Case Study: Sydney CBD
  Arsham Bassiri Abyaneh, Andrew Allan, Johannes Pieters, and Gethin Davison
14 Sequential Patterns of Daily Human Activity Extracted from Person Trip Survey Data
  Weiying Wang, Toshihiro Osaragi, and Maki Tagashira
15 Understanding the Economic Value of Walkable Cities
  Josephine Roper, Chris Pettit, and Matthew Ng
16 (Big) Data in Urban Design Practice: Supporting High-Level Design Tasks Using a Visualization of Human Movement Data from Smartphones
  Angela Rout and Wesley Willett
17 Examining Passenger Vehicle Miles Traveled and Carbon Emissions in the Boston Metropolitan Area
  Tigran Aslanyan and Shan Jiang

Part IV Open Data and Spatial Modelling

18 Development of a Household Urban Micro-Simulation Model (HUMS) Using Available Open-Data and Urban Policy Evaluation
  Nao Sugiki, Shogo Nagao, Batzaya Munkhbat, Atsushi Suzuki, and Kojiro Matsuo
19 An Agent-Based Bushfire Visualisation to Support Urban Planning: A Case Study of the South Coast, NSW 2019–2020
  Hitomi Nakanishi, Wendi Han, Milica Muminovic, and Tan Qu
20 Development of an Agent-Based Model on the Decision-Making of Dislocated People After Disasters
  Yasmin Bhattacharya and Takaaki Kato
21 Evidence-Based Design Justice: Synthesizing Statistics and Stories—To Create Future ‘Just’ Cities
  Prithi Yadav, Samuel Patterson, Ana Sima Bilandzic, and Sarah Johnstone

Part V Geodesign and Planning Support Systems (PSS)

22 Geodesign Between IGC and Geodesignhub: Theory and Practice
  Shlomit Flint Ashery and Rinat Steinlauf-Millo
23 The Role of Technology Tools to Support Geodesign in Resilience Planning
  Ripan Debnath, Christopher Pettit, Simone Zarpelon Leao, and Oliver Lock
24 Planning Support Systems for Long-Term Climate Resilience: A Critical Review
  Supriya Krishnan, Nazli Yonca Aydin, and Tina Comes

Part VI Geospatial Data Analysis

25 Aggregation of Geospatial Data on “Street Units”: The Smallest Geographical Unit of Urban Places
  Takuo Inoue, Rikutaro Manabe, Akito Murayama, and Hideki Koizumi
26 Local Betweenness Centrality Analysis of 30 European Cities
  Kaoru Yamaoka, Yusuke Kumakoshi, and Yuji Yoshimura
27 Model for Estimation of Building Structure and Built Year Using Building Facade Images and Attributes Obtained from a Real Estate Database
  Takuya Oki and Yoshiki Ogawa
28 A Spatial Analysis of Crime Incidence and Security Perception Around a University Campus
  Daniela Vanessa Rodriguez Lara and Antônio Nélson Rodrigues da Silva
29 Sightseeing Support System with Augmented Reality and No Language Barriers
  Shinya Abe, Ryo Sasaki, and Kayoko Yamamoto
30 GeoMinasCraft: A Serious Geogame for Geographical Visualization and Exploration
  Ítalo Sousa de Sena, Alenka Poplin, and Bruno de Andrade
Chapter 1
Introduction
Robert Goodspeed, Chris Pettit, Aija Staffans, and Stan Geertman
Abstract This book is a selection of the best papers presented at the CUPUM conference at Aalto University, Helsinki, Finland in June 2021. CUPUM stands for Computational Urban Planning and Urban Management and is a biennial conference held at a different location around the world each time. This chapter introduces the title and central theme of the book: ‘Urban Informatics and Future Cities’. Within it, three cross-cutting themes can be identified: big data, disasters and resilience, and walkability and tourism. The chapter also provides an overview of the volume by briefly presenting each of its constituent chapters: their titles, authors and main content. In total, 30 chapters have been included in this volume.

Keywords Urban planning · Urban management · Urban informatics · Future cities

R. Goodspeed (corresponding author), Taubman College of Architecture and Urban Planning, University of Michigan, 2000 Bonisteel Blvd, Ann Arbor, MI 48109, USA; e-mail: [email protected]
C. Pettit, Faculty of the Built Environment, University of New South Wales, 2024 High Street, Sydney, NSW 2052, Australia
A. Staffans, Department of Built Environment, Aalto University, Välskärinkatu, 00260 Helsinki, Finland
S. Geertman, Faculty of Geosciences, Utrecht University, PO Box 80115, 3508 Utrecht, TC, The Netherlands

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_1

When this volume is released in June 2021, it will appear after one of the most eventful 18 months in recent memory. Since the emergence of the COVID-19 virus in Wuhan, China in December 2019, the disease has spread worldwide in the worst global pandemic in over 100 years. Simultaneously, the intensifying effects of global warming are being felt worldwide, dramatized by historic wildfires in Australia and California in late 2019 and during 2020. In the U.S., simmering anger over police brutality towards Black citizens and persistent racial injustice and inequality in American society boiled over into a series of historic protests in the summer of 2020, influencing a contested election. Worldwide, cities struggled to adapt to these events, as the pandemic caused profound changes to city life, including rapid growth in the popularity of walking and cycling, commuting patterns transformed by widespread working from home, and impacts on public budgets.

The 2021 CUPUM conference, which for over 30 years has convened a unique global community concerned with the application of computational tools and methods to urban management and planning problems, will also be unlike any other, as it will be conducted exclusively online. Even as life hopefully slowly returns to normal in 2021, the pandemic has underscored the centrality of digital technologies to society: so many aspects of life are now conducted via internet technologies, with videoconferencing, email, and social media attracting more users than ever before. The chapters presented here also reflect the impact of the extensive diffusion of digital technologies, even as CUPUM scholars continue to lead the development of novel tools, techniques, and datasets tailored to urban applications.

This volume comprises 29 chapters which will be presented at the CUPUM conference in June 2021, hosted virtually from Helsinki, Finland, and this chapter serves to introduce them in two ways. First, we comment on some of the cross-cutting themes connecting these diverse chapters, and illustrate how they reflect both topics of longstanding interest to the CUPUM community and the unique historical moment during which this volume appears. Second, we provide brief summaries of all of the articles, organizing them by research area. By doing so, we hope to reflect on how the chapters illustrate the ways the CUPUM community is responding to short- and long-term societal trends, as well as provide an entree into the rich and exciting array of research contained within the volume. To conclude, we share some brief thoughts about where the field may be heading next.
1.1 Cross-Cutting Themes
In order to identify themes, we analyzed all the chapters from two perspectives: their technical and topical focuses. Technically, almost all chapters fell into just three categories: papers utilizing big data, tools and planning support systems, and analysis and modeling techniques. Among these, the largest category is big data with 12 chapters, which emerges as the first major theme of the volume. Topically, the chapters fall generally into five categories: disasters and resilience, transport, walkability and tourism, planning practice and geodesign, and the pandemic. Although many chapters focus on transport, this topic has been of perennial interest to urban researchers. Newly popular, disasters and resilience and walkability and tourism emerge as our second and third important themes of the volume. They reflect urban priorities that, although predating the pandemic, gained new importance during it (walkability) and in anticipation of an economic recovery (tourism). We then discuss three minor themes which remain important threads within the CUPUM community's research: planning support systems, geodesign, and research engaging questions of social justice. We conclude this section with some notes about why the topic of smart cities does not appear more prominently in the volume, and take note of the geographic biases of the scholarship.
1.1.1 Big Data
One of the most important consequences of the digitization of society has been the advent of diverse forms of big data. From the perspective of urban management and urban planning, big data holds the promise of providing novel insights and understanding of cities, especially as it becomes increasingly available to researchers through open data and expanded data sharing (Hawken et al. 2020). The array of chapters engaging with big data illustrates the diverse forms it can take in cities. These include cellphone data (Chaps. 3, 4, 8, and 16), administrative datasets (Chap. 17), social media (Chaps. 6, 7, 9, and 10), analysis of photos (Chaps. 12 and 27) and the use of worldwide metrics like Walk Score (Chap. 15). These chapters often demonstrate the novel analysis techniques necessary to analyze such datasets, compare them with traditional data, and show how big data can be applied to diverse topics like pandemic response, transportation, disaster response, and more. With the growing availability and maturity of diverse forms of big data, it seems clear that big data research will remain a major focus of CUPUM scholars for years to come, and we hope these chapters inspire yet further advances in this area.
1.1.2 Disasters and Resilience
Another noteworthy theme is reflected in a group of articles focused on natural disasters and resilience. As noted, the intensity and frequency of a variety of climate-related disasters, including wildfires, flooding, extreme heat and cold, and others, are increasing as a result of climate change. Responding to this development, CUPUM researchers tackled various aspects of this problem, including analyzing Flickr photos to understand how disasters impact human mobility (Chap. 9), developing agent-based models to simulate neighborhood rebuilding (Chap. 20), and analyzing building photos to estimate vulnerability to earthquake and fire at the city scale (Chap. 27). In response to the growing prevalence of disasters, many cities are pursuing planning to become more resilient to the turbulent future which lies ahead. Chapters contributing to this include Chap. 24, which surveys PSS through a lens of resilience to aid practitioners seeking to adopt the best-suited PSS for this type of planning, and Chap. 23, where researchers explore tools suited to geodesign for resilience.
1.1.3 Walkability and Tourism
Cities have long sought to cultivate walkability as a key quality of sustainability and urban quality of life. During the pandemic, with many people working at home and many key public venues closed, walking and public outdoor spaces became central to individual physical activity and safer social interaction. Complementing these shifts in many cities, contributors to this volume include a variety of work focused on walkability and the closely related topic of tourism. Chapters 12 and 16 analyze pedestrian use of urban spaces through the lens of novel datasets, Chap. 13 constructs a walkability index tailored for tourists, and Chap. 15 conducts an analysis showing the value of walkability in real estate prices. As another illustration of the close connection between walkability and tourism, Chap. 29 discusses a new tool that utilizes augmented reality to provide information in a way that overcomes language barriers.
1.1.4 Minor Themes
Finally, the chapters include work on three longstanding themes of research presented at CUPUM conferences. In the more than 30 years since the concept was proposed (Harris 1989), Planning Support Systems (PSS) have graduated from a hypothetical idea to a rich field of technical innovation and scholarship that Geertman and Stillwell (2020) have recently argued qualifies as a planning support science. Indeed, many of the chapters previously mentioned report analyses specifically conducted with an eye towards their incorporation into PSS, which allows the methods to be applied to practice. Increasingly, the idea of PSS has intertwined with the concept of geodesign, an emerging planning paradigm which calls for coupling the use of digital tools with stakeholder design deliberations (Foster 2016). Chapters in this volume provide the opportunity to become immersed in current debates about effective geodesign tools and practices (Chaps. 22, 23, and 24). Finally, many papers touch on the importance of social justice, a normative principle which translates into local contexts in many different ways. These include a commitment to inclusive and participatory planning shown by many of the projects presented. Chapters with a clear focus on this issue include the design of tools specifically seeking to broaden access to the built environment regardless of disability (Chap. 11), and an analysis of the equity implications of using big data which may not include minoritized populations such as Black Americans (Chap. 8). Although social justice is not a prominent theme in this volume, we think it an important one which we urge the CUPUM community to engage with more fully in the years ahead.
1.1.5 A Theme that Wasn’t and a Note on Geographic Bias Finally, we wish to say a word about a topic which readers may have expected would be a theme of this book, but wasn’t: smart cities. Bursting into public consciousness in the early 2000s, propelled in part by a major push among technology companies seeking new markets for IT, the topic of smart cities is the subject of a myriad of books, articles, conferences, and pilot projects (Ruhlandt 2018; Angelidou 2015). However recently Mathis and Kanik (2021) observed a decline in smart city projects, also evident from this volume. The term smart city appears prominently in only three chapters, a discussion of how Wuhan used a variety of IT tools as part of a smart governance approach to control and respond to the COVID-19 outbreak (Chap. 2), and two papers reporting “smart campus” projects (Chaps. 16 and 28). We propose two possible interpretations to the lack of work specifically on smart cities. First, perhaps the underwhelming performance of many smart city projects and developments has cooled interest in the idea, when it has become clear that although IT may have a beneficial role, it’s far from the panacea for cities sold in marketing visions. Second, the chapters collectively illustrate a remarkable advance in the extent of use of advanced computational data, tools, and techniques as applied to urban problems compared with what existed 30 years ago. Whereas in 1989 PSS was simply a concept and most city data was rudimentary databases and GIS, this volume contains chapters describing where Sierra Leon engages in cutting-edge analysis of cellphone data with MIT researchers (Chap. 4), papers involving Australia’s sophisticated AURIN platform (Pettit et al. 2017; Sinnott et al. 2014), and many others that rely on digital infrastructures which would have seemed miraculous 32 years ago at the first CUPUM in Hong Kong. 
In that sense, perhaps the hype about smart has faded because almost all cities are “smart” to some degree. In its place this volume uses the term urban informatics which conveys a rich, multidisciplinary field of research contributing to improved understanding, management, and planning of cities through multiple pathways to impact. Finally we offer a note about the places represented by authors and case studies in these chapters. Although CUPUM, like research in many fields, has always reflected disproportionate participation by researchers based in wealthier parts of the world with more well developed research and technology sectors, CUPUM has always taken pride in its unique global scope, with active participants spread among all of the world regions. Indeed, this diversity is on full display in the current volume, with chapters about cities in Africa (Chap. 3), South America (Chap. 28), Europe (Chap. 26), the United States (Chap. 17), and more. However, we suspect the geographic diversity has been impacted by the pandemic. Japan and Australia are both well represented with multiple papers, perhaps a function of the significant expertise and infrastructures by scholars in these countries, but also because they both have had relatively minor COVID-19 outbreaks, lessening the need for disruptive shutdowns. Unfortunately there are only one chapter each from Africa and Latin America, and none focused on India or the Middle East, all regions hit hard by COVID-19. Finally, we think even wealthy Europe and North America may also be under-represented, although
both are home to many scholars working on these topics, as both experienced severe COVID-19 outbreaks accompanied by disruptions to academic research and society at large. In future years, we hope these disparities diminish as all countries recover from the pandemic, and we hope to take action to ensure CUPUM encompasses work about and by residents of all types of cities worldwide.
1.2 Chapter Overviews
The following section provides an overview of all 29 chapters, divided into six sections: Data analytics and the COVID-19 pandemic, Big data and smart cities, Data-driven research of activity patterns, Open data and spatial modeling, Geodesign and planning support systems, and Geospatial data analysis.
1.2.1 Data Analytics and the COVID-19 Pandemic
In Chap. 2, titled ‘Smart Governance and COVID-19 Control in Wuhan, China’, the authors Huaxiong Jiang, Patrick Witte, and Stan Geertman review the many ways smart governance was deployed as part of China’s COVID-19 response in Wuhan. By discursively analyzing existing data from multiple sources, they show that the real ‘smartness’ of the smart governance of COVID-19 in Wuhan is the innovative use of technologies to develop different types of governance approaches to control COVID-19 in a targeted way. In addition to well-known measures, such as smartphone-based exposure tracking systems, the chapter mentions grassroots governance initiatives such as the use of WeChat groups to coordinate community voluntary responses.
Another way data analytics has been used to support effective pandemic response is through the use of big data to understand the impact of mobility restrictions on the possible spread of COVID-19. This is illustrated in Chap. 3, ‘Using Public-Private Data to Understand Compliance with Mobility Restrictions in Sierra Leone’. The authors Innocent Ndubuisi-Obi, Sarah Williams, Yanchao Li, Ziyu Ran, Chenab Ahuja Navalkha, and Lily Tsai describe how their research is conducted through a partnership between the country of Sierra Leone and researchers at MIT to analyze data from the phone company Africell. This collaborative piece of research demonstrates how the pandemic is also forging new and potentially valuable research partnerships among industry, government, and academia.
Chapter 4, titled ‘Development of a Spatio-Temporal Analysis Method to Support the Prevention of COVID-19 Infection: Space-Time Kernel Density Estimation Using GPS Location History Data’, is about the prevention of COVID-19 infections in places with high population density. In April 2020, the Japanese government implemented a soft lockdown.
It is in this context that the author Haruka Kato developed a spatio-temporal analysis method based on space-time kernel density estimation
that visualizes the space-time of a place with high population density. Point-type data on the floating population with GPS locations, obtained from smartphones at regular intervals (15 min), were used as input for calculating the density estimate. The method is an alternative to the Japanese soft lockdown in that it enables local governments to restrict people’s movements by designating specific space-time areas. In addition, it helps citizens to change their behavior and cooperate in the prevention of COVID-19 infection.
1.2.2 Big Data and Smart Cities
With the rise of geospatial big data, new narratives of cities based on spatial networks and flows have replaced the traditional focus on locations. In Chap. 5 ‘A Review of Spatial Network Insights and Methods in the Context of Planning: Applications, Challenges, and Opportunities’ the authors Xiaofan Liang and Yuhao Kang review the theories, concepts, methods, and applications of spatial network analysis in cities and their insights for planners in five areas of concern: spatial structures, urban infrastructure optimizations, indications of economic wealth, social capital, and residential mobility, and public health control (especially COVID-19). They outline four challenges that planners face when translating planning knowledge from spatial networks into action: data openness and privacy, linkage to direct policy implications, lack of civic engagement, and the difficulty of visualizing spatial networks and integrating them with GIS. Finally, they envision how spatial networks can be integrated into a collaborative planning framework.
In Chap. 6 ‘Transport infrastructure, Twitter and the politics of public participation’ the author Wayne Williamson describes how social media is changing how many local communities seek to mobilize alternative political strategies to disrupt planning processes. The chapter focuses on citizens’ use of social media and hashtags (#) on Twitter during the planning and construction of the WestConnex motorway project in Sydney, Australia. Of particular interest is hashtag use as a form of alternative politics. The chapter identifies the extensive use of Twitter as an additional communications channel to raise concerns at a local community level, and at a broader political level during a 2019 State election.
In Chap.
7 ‘Public Perceptions and Attitudes Towards Driverless Technologies in the United States: A Text Mining of Twitter Data’, the authors Zhiqiu Jiang and Max Zheng use Twitter data to capture insights into public perceptions of and attitudes towards driverless technologies and the factors that influence them. To promote public adoption of driverless vehicles, governments need to better understand these perceptions and influencing factors. By text mining tweets about driverless technology in the U.S. through topic modeling and sentiment analysis, the authors uncovered five latent themes embedded in the tweets. The findings indicate that Ethics and Policy, Safety, and Design and Functionality are major concerns that may inhibit the acceptance of driverless vehicles.
In recent years we have seen the rise of mobility-related big data products which aim to help transport planners better understand the travel behaviours of those traversing our cities. In Chap. 8 titled ‘Assessing the Value of New Big Data Sources for Transportation Planning: Benton Harbor, Michigan Case Study’ the authors Robert Goodspeed, Meixin Yuan, Aaron Krusniak, and Tierra Bills take a case study approach, examining the strengths and weaknesses of two commercial transport big data products available in the United States: SafeGraph and StreetLight. They undertake this comparative assessment against a conventional household travel survey. The results of this research suggest that big data can complement rather than replace conventional survey techniques in understanding regional travel behaviors.
Analysts continue to demonstrate the potential for analysis of social media data to shed light on various urban phenomena. In their Chap. 9 ‘How Various Natural Disasters Impact Urban Human Mobility Patterns: A Comparative Analysis Based on Geotagged Photos Taken in Tokyo’, the authors Ahmed Derdouri and Toshihiro Osaragi demonstrate how a sophisticated analysis of the photo-sharing site Flickr can quantify the impact of several natural disasters on human mobility patterns in Tokyo. This research offers an analytics approach for revealing insights useful for urban managers tasked with disaster response.
In Chap. 10 ‘Revealing the spatial preferences embedded in online activities: A case study of Chengdu, China’, the authors Enjia Zhang, Yu Ye, Jingxuan Hou, and Ying Long reveal the spatial preferences embedded in social media applications, which can be used to better plan and design future cities.
With two different types of social media data (online location tagging from Weibo and online reviews of points of interest on Dianping), they conducted a quantitative analysis to explore the relationship between online activities and elements of the built environment. The results suggest that online activities are still associated with physical urban phenomena, and that the activity represented by Dianping reviews revealed more significant spatial preferences than that represented by Weibo check-ins.
1.2.3 Data-Driven Research of Activity Patterns
In Chap. 11 ‘Application for Locational Intelligence and Geospatial Navigation (ALIGN): Smart navigation tool for generating routes that meet individual preferences’ the authors Ge Zhang, Subhrajit Guhathakurta, Jon Sanford, and Bon Woo Koo develop an application to help people navigate based on their specific preferences. Outdoor environmental barriers, such as uneven sidewalks and missing curb cuts, can significantly impair pedestrian mobility, especially for people with disabilities. The ALIGN app has been built for mobile devices. It identifies routes that are tailored to the individual’s specific needs and abilities, based on real-time or near real-time data. Moreover, it serves to create a repository of user behaviour that can inform policy decisions.
Further on the theme of transport and mobility, and in a time of smart cities and smart campuses, Chap. 12 titled ‘Pedestrian Behaviour Characteristics based on an
Activity Monitoring Survey in a University Campus Square’ by authors Toshihiro Osaragi, Yuriko Yamada, and Hiroyuki Kaneko presents a novel approach to simulating pedestrian movement. The authors apply deep learning algorithms to CCTV imagery to better understand existing pedestrian movements in order to simulate pedestrian behavior. Such a study holds many potential benefits in supporting how planners understand the existing use of open spaces and also how pedestrians might interact in future proposed open spaces in a smart campus.
Extensive research has been conducted on walkability, although the authors Arsham Bassiri Abyaneh, Andrew Allan, Johannes Pieters, and Gethin Davison point out that this literature has neglected the unique needs and perspectives of tourists. In their Chap. 13 ‘Developing a GIS-based tourist walkability index based on the AURIN walkability toolkit—case study: Sydney CBD’ they construct such a walkability index, utilizing not only well-validated built environment metrics but also input from a field survey of tourists in Sydney. As tourism rebounds in the post-pandemic world, many places may be interested in using this methodology to assess and improve the walkability of tourist districts to maximize the enjoyment and financial impact of visitors.
Understanding the fine-scale travel patterns of people is essential for planning more accessible and functional cities. The study presented in Chap. 14 by Weiying Wang, Toshihiro Osaragi, and Maki Tagashira titled ‘Sequential Patterns of Daily Human Activity Extracted from Person Trip Survey Data’ provides an exploratory study using traditional Person Trip Survey Data for a number of prefectures across Japan. Using a combination of the Sequence Alignment Method (SAM) and Hierarchical Clustering, the authors extract sequential patterns of daily activities.
The results show that clusters that are at a similar distance to city centres have similar sequence patterns of activities across different urban geographies in Japan.
Walkability research to date has focused foremost on the influence of the built environment on physical activity associated with health and active transport outcomes. In their Chap. 15 ‘Understanding the economic value of walkable cities’ the authors Josephine Roper, Chris Pettit, and Matthew Ng undertook an empirical study to quantify the economic value of walkability to residential property. Specifically, they demonstrate the use of hedonic price modelling to test measures of walkability with a case study. They find that the walkability index WalkScore is positively related to prices for detached houses in Sydney, Australia, but has no significant relationship to apartment (unit) prices. Possible reasons and directions for future work are discussed.
In Chap. 16 ‘(Big) data in urban design practice: supporting high-level design tasks using a visualization of human movement data from smartphones’ the authors Angela Rout and Wesley Willett claim that although extensive amounts of location data are produced daily by smartphones, existing geospatial tools are not customized to specifically support high-level urban design tasks. To remedy this, they present the SmartCampus visualization tool, representing spatiotemporal data of over 200 student pathways and restpoints on a university campus. The findings of their research showcase the need for location analysis tools tailored to concrete urban design
practices, and also highlight opportunities for Smart City researchers interested in developing domain-specific visualization tools.
The authors Tigran Aslanyan and Shan Jiang investigate in Chap. 17 titled ‘Examining Passenger Vehicle Miles Traveled and Carbon Emissions in the Boston Metropolitan Area’ the greenhouse gas (GHG) emissions generated by on-road passenger vehicles in Massachusetts. For that, they take advantage of two large administrative datasets and combine spatial data analytics, econometrics, and visualization tools. Based on spatial econometric models that examine socioeconomic and built environment factors contributing to the vehicle miles traveled (VMT) at the census tract level, the study offers insights to help cities reduce VMT and the carbon footprint of passenger vehicle travel. Finally, this chapter recommends a pathway for cities and towns in the Boston metropolitan area to curb VMT and mitigate carbon emissions to achieve climate goals of carbon neutrality.
1.2.4 Open Data and Spatial Modelling
Cities around the world are continually challenged by changes in population, be it rapid urbanisation or shrinking cities. The authors Nao Sugiki, Shogo Nagao, Batzaya Munkhbat, Atsushi Suzuki, and Kojiro Matsuo outline in Chap. 18, titled ‘Development of a Household Urban Micro-Simulation Model (HUMS) Using Available Open-Data and Urban Policy Evaluation’, the application of an open data micro-simulation approach to assist planners to understand different population trajectories and future scenarios. The authors applied the HUMS model to Toyohashi City (Japan) to support the planning of sustainable urban development. The research concludes that micro-simulation based on open data offers much promise in supporting local governments in planning future infrastructure. However, the chapter outlines a number of limitations that need to be overcome.
In Chap. 19 titled ‘An agent-based bushfire visualisation to support urban planning: a case study of the South Coast, NSW 2019–2020’, the authors Hitomi Nakanishi, Wendi Han, Milica Muminovic, and Tan Qu present the development of a planning support tool for visualising bushfire spread based on an agent-based simulation platform, and discuss how an agent-based simulation can be used with residents to enhance their understanding of bushfire dynamics and planning. The study is set in Australia. The authors also analysed residents’ Twitter posts to understand how they prepared for evacuations in the midst of a life-threatening situation. The study concludes by outlining a number of recommendations on how simulations can support the planning and preparedness for future bushfires.
As we live in a time of changing climate, there are increasing numbers of extreme events including earthquakes, floods and fires. At the same time, we are witnessing rapid urbanisation in many countries, which is increasing the vulnerability and risk to people.
It is in this context that the authors Yasmin Bhattacharya and Takaaki Kato contribute their research titled ‘Development of an Agent-based Model on the Decision-making of Dislocated People after Disasters’. Chapter 20 outlines the
initial agent-based modelling (ABM) framework, which can be used to explore future recovery scenarios, specifically to better understand how affected neighbourhoods might respond with respect to rebuilding and repopulation. This ABM offers exciting potential in supporting decision-makers and planners in post-disaster recovery planning efforts.
In Chap. 21 ‘Evidence-based Design Justice: Synthesizing statistics and stories– to create future “Just” cities’ the authors Prithi Yadav, Samuel Patterson, Ana Sima Bilandzic and Sarah Johnstone introduce the novel notion of Evidence-based Design Justice (EDJ). This approach draws on the strengths of both Urban Science and Design Research to achieve holistic insights that address (not solve) complex issues such as homelessness. Therein, perceptive design synthesizes quantitative data analytics and qualitative stories of lived experiences of homelessness services in Brisbane, Australia. Outcomes of this research include ways for those experiencing homelessness to influence homelessness services and policies.
1.2.5 Geodesign and Planning Support Systems (PSS)
Geodesign is a relatively new area of both study and practice. In the past ten years, there have been about 20 international conferences and about 200 large studies in geodesign for areas undergoing contentious pressures for significant change. Chapter 22, titled ‘Geodesign between IGC and geodesignhub: Theory and practice’ and written by Shlomit Flint Ashery and Rinat Steinlauf-Millo, compares the IGC academic approach to geodesign with the Geodesignhub approach, which is closer to the real world of interest groups practitioners. The chapter presents two case studies, one from the United Kingdom and one from Israel, and assesses the potential of geodesign as a methodology to bridge academia and practice in future-oriented policy making and spatial-temporal projects.
In Chap. 23 ‘The role of technology tools to support geodesign in resilience planning’ the authors Ripan Debnath, Christopher Pettit, Simone Zarpelon Leao, and Oliver Lock look at the role of technology tools within geodesign for achieving resilient urbanisation. They reviewed the application of various design and planning support system (PSS) tools in resilience planning-related studies. Results indicate that the application of such PSS tools has reportedly been hampered by several usability issues. Using a geodesign case study, the authors examined the usability of several lightweight and open-source tools for collaboration. User evaluation revealed that stakeholders could interact with those tools easily, thus overcoming a significant barrier to the adoption of PSS in planning.
Planning Support Systems (PSS) enable climate-informed planning, but there have been difficulties in the uptake of PSS due to their resource-intensive nature and lack of awareness of their usefulness. In Chap.
24 titled ‘Planning Support Systems for Long-Term Climate Resilience: A critical review’ the authors Supriya Krishnan, Nazli Yonca Aydin, and Tina Comes aim to make headway in understanding research
priorities and gaps that need to be addressed for PSS to support climate resilience. They conducted a literature review and a text-mining analysis of academic and non-academic (practice) literature on urban planning and climate resilience. Based on a range of identified shortcomings, they propose a research agenda for improving the usage of PSS.
1.2.6 Geospatial Data Analysis
Urban analysts have often struggled with the limitations of available small spatial units used by statistical datasets, which too often group together distinct parts of cities and thereby obscure the true underlying patterns. In addition to this long-standing problem, the growing availability of address-level microdata has raised the issue of what geographic units to use to link them with other urban data. In Chap. 25 ‘Aggregation of geospatial data on “street units”: The smallest geographical unit of urban places’, the authors Takuo Inoue, Rikutaro Manabe, Akito Murayama, and Hideki Koizumi demonstrate how a new geography known as street units can be used to link multiple types of data, using a Japanese neighborhood as a case study.
In Chap. 26 ‘Local Betweenness Centrality Analysis of 30 European Cities’ the authors Kaoru Yamaoka, Yusuke Kumakoshi, and Yuji Yoshimura propose a novel methodology to classify the road segments in a street network. To overcome the limitations of existing research on centrality indicators of road networks and cluster analysis, they combined the local betweenness indicator and peak analysis and complemented the analyses with a new visualization method. The developed method was applied to 30 cities in Europe. As a result, they extracted important regions for pedestrians solely from the road network shapes and found common tendencies among the cities. These findings will be useful for urban planners and decision-makers in shifting from car-centric transport strategies to more pedestrian-centric plans.
As highlighted by the authors Takuya Oki and Yoshiki Ogawa, there is a paucity of city databases which comprise details on building structure and age. Such data underpin a number of city analytics applications, such as assessing the vulnerability of building stock to extreme events such as earthquakes and fires. In Chap.
27 ‘Model for Estimation of Building Structure and Built Year Using Building Façade Images and Attributes Obtained from a Real Estate Database’ the authors investigate the use of a deep learning method (a convolutional neural network, CNN) for automatically classifying the building stock from available photographic images. Notably, the authors investigate the use of Grad-CAM for creating activation heatmaps to understand what the AI models have learned and thus improve their transparency. The results of this study hold promise in supporting the automated classification of building structure and age.
Public security is a matter of constant concern among populations, public managers, and urban planners. Although rising crime rates may lead to a feeling of insecurity in an affected community, research in this area has not yet converged on a consensus regarding this perception. Chapter 28 titled ‘A Spatial Analysis
of Crime Incidence and Security Perception around a University Campus’ by the authors Daniela Vanessa Rodriguez Lara and Antonio Nelson Rodrigues da Silva presents a study comparing the official records of crime occurrences registered by the police around a university campus in Brazil with the academic community’s security perceptions. The findings point out divergences between the incidence of crimes and the security perceptions and indicate that crime occurrences are not related to the security perceptions of the community.
Chapter 29, titled ‘Sightseeing Support System with Augmented Reality and No Language Barriers’, shows how tourism can benefit from augmented reality and overcome language barriers. The authors Shinya Abe, Ryo Sasaki, and Kayoko Yamamoto present a sightseeing support system which integrates location-based AR, a web-geographic information system (Web-GIS), and a recommendation system with images and other non-linguistic information. The system in its current state already covers over 1,000 major sightseeing spots from all over Japan, and increased future use of the system can be anticipated.
Serious games have long been an effective tool for engagement and education. The authors Italo de Sena, Alenka Poplin, and Bruno de Andrade present in their Chap. 30 ‘GeoMinasCraft: A serious geogame for geographical visualization and exploration’. This is a new game constructed within Minecraft, involving realistic landscapes, buildings, and other features. It effectively immerses players in the unique landscape of the Brazilian city of Ouro Preto, where local stakeholders are working to conserve the cityscape, designated a World Heritage Site by UNESCO in 1980 but threatened by recent development.
1.3 Conclusion
The chapter summaries convey the breadth and richness of research being conducted by the CUPUM community worldwide. The research contained in this volume touches on many streams of research, each with its own preferred terminology. As previously noted, we have chosen the term urban informatics for this volume’s title, since we feel it has expanded from an emphasis on how IT has changed the urban experience (Foth 2009) to encompass a wide variety of ways computational methods and techniques are applied to urban management, policy, and planning questions (Kontokosta 2018; Goodspeed 2017). As noted, the volume demonstrates how research has moved away from the smart city debate, and has thematic foci in several areas: disasters and resilience, transport, walkability and tourism, planning practice and geodesign, and the pandemic. Many papers involve the use of urban big data, and the volume contains exciting new tools, planning support systems, analysis techniques, and even serious games created to address these topics.
With many cities looking ahead to a post-pandemic future, instead of seeking a quick return to pre-pandemic life, we have the unique opportunity to pursue cities and societies that are more resilient, sustainable, and just. This volume contains many chapters illustrating how data-driven insights and digital collaboration can be used to
support these changes. With cities forecast to continue growing in most parts of the world in the coming years, they will remain key sites for achieving societal goals of greater sustainability and resilience. We look forward to the innovations that emerge from the research community as we collectively strive to meet the urban challenges before us.
References
Angelidou M (2015) Smart cities: a conjuncture of four forces. Cities 47:95–106. https://doi.org/10.1016/j.cities.2015.05.004
Foster K (2016) Geodesign parsed: placing it within the rubric of recognized design theories. Landscape Urban Planning 156:92–100. https://doi.org/10.1016/j.landurbplan.2016.06.017
Foth M (ed) (2009) Handbook of research on urban informatics: the practice and promise of the real-time city. Information Science Reference, Hershey, PA
Geertman S, Stillwell J (2020) Handbook of planning support science. Edward Elgar Publishing, Cheltenham, United Kingdom
Goodspeed R (2017) Urban informatics: defining an emerging field. In: Schintler LA, Chen Z (eds) Big data for regional science. Routledge, New York, NY, pp 324–335
Harris B (1989) Beyond geographic information systems. J Amer Plann Assoc 55(1):85–90
Hawken S, Han H, Pettit C (2019) Open cities | open data: collaborative cities in the information Era. Springer Nature
Kontokosta CE (2018) Urban informatics in the science and practice of planning. J Plann Educ Res. https://doi.org/10.1177/0739456x18793716
Mathis S, Kanik A (2021) Why you’ll be hearing a lot less about ‘smart cities’. Citymonitor. https://citymonitor.ai/government/why-youll-be-hearing-a-lot-less-about-smart-cities
Pettit CJ, Tanton R, Hunter J (2017) An online platform for conducting spatial-statistical analyses of national census data across Australia. Comput Environ Urban Syst. https://doi.org/10.1016/j.compenvurbsys.2016.05.008
Ruhlandt RWS (2018) The governance of smart cities: a systematic literature review. Cities 81:1–23. https://doi.org/10.1016/j.cities.2018.02.014
Sinnott RO, Bayliss C, Bromage A, Galang G, Grazioli G, Greenwood P, Macaulay A, Morandini L, Nogoorani G, Nino-Ruiz M (2015) The Australia urban research gateway. Concurr Comput: Pract Exper 27(2):358–375
Part I
Data Analytics and the COVID-19 Pandemic
Chapter 2
Smart Governance and COVID-19 Control in Wuhan, China
Huaxiong Jiang, Patrick Witte, and Stan Geertman
Abstract In dealing with the global COVID-19 pandemic, China achieved reasonable success in governing COVID-19 within two months with the help of technologies. This study specifically focuses on how these technologies have been implemented at scale to facilitate the smart governance of COVID-19 in Wuhan, China. By discursively analyzing existing data from multiple sources, the results obtained in this chapter show that the real ‘smartness’ of the smart governance of COVID-19 in Wuhan is the innovative use of technologies to develop different types of governance approaches to control COVID-19 in an effective and targeted way. As the pandemic continues to evolve worldwide, lessons learned from Wuhan, China, can be beneficial to other countries in different institutional contexts to build their own, context-specific governance for controlling the pandemic.
Keywords Pandemic · SARS-CoV-2 · ICT · Smart governance · Contextualization
H. Jiang (✉) · P. Witte · S. Geertman
Department of Human Geography and Spatial Planning, Faculty of Geosciences, Utrecht University, 3584 CS Utrecht, The Netherlands
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_2
2.1 Introduction
In December 2019, the novel coronavirus (SARS-CoV-2, the cause of COVID-19) broke out in Wuhan, China. Faced with this unexpected, atypical, and damaging disease, the Chinese government and Chinese society employed a range of methods and means to contain the spread of the virus. Within two months of the initial outbreak, China achieved reasonable success in containing the coronavirus (The State Council Information Office 2020). As COVID-19 continues to spread around the world, many discussions have been initiated on how China has succeeded in
governing and controlling the pandemic and what insights the global community can learn from this. Two groups of discussions can be identified. First, international news reports specifically looked into the measures and actions taken by the Chinese government to control the coronavirus (Normile 2020; Jie 2020). As these reports show, China adopted extreme centralized governance measures to cut off all channels of transmission (e.g., tourism, public places, public transport and entertainment) and thereby contain the coronavirus (Lau et al. 2020; Lin 2020). For instance, commonly reported governance measures include the building of new hospitals, full lockdown and quarantine, social distancing, strict surveillance of suspected and infected COVID-19 cases, and isolation of the infected from others (Cai et al. 2020; Mozur et al. 2020; Lau et al. 2020). Second, scientific literature focused on the governance mechanisms targeted at controlling the coronavirus (Mei 2020; AlTakarli 2020; Lau et al. 2020; Lin 2020). Some research summarizes China’s model for governing the COVID-19 epidemic as a public health emergency governance approach (Ning et al. 2020; Cao et al. 2020; AlTakarli 2020). Within this approach, a leadership system, the Epidemic Prevention and Control Headquarters System (EPCHS), was established at all Chinese administrative levels to promote a whole-of-government response to this pandemic (Ning et al. 2020; Mei 2020). Different social resources, social organizations and individuals were then mobilized to provide the necessary support (e.g., medical staff, hospital equipment, medical supplies, healthcare solutions, etc.) to control COVID-19 in Wuhan (Taghrir et al. 2020; Chan et al. 2020). Other studies focused more on the massive application of information and communication technology (ICT) to build containment measures and prevent the transmission of the pandemic (Pan 2020; He et al. 2020; Shaw et al. 2020).
For instance, Kummitha (2020) characterizes China’s technological response to the transmission of the pandemic as a technology-driven approach. In this process, big data, urban data analytics and artificial intelligence (AI) were widely used to govern the transmission of COVID-19 in Wuhan, for purposes including tracking and diagnosing COVID-19 cases, identifying potential pharmacological treatments, issuing quick and effective pandemic alerts, conducting public health surveillance, and monitoring epidemic outbreaks in real time (Bragazzi et al. 2020).
These initial studies improve our knowledge and understanding of the governance of COVID-19. Nevertheless, they have apparent limitations. First, the governance approaches summarized in the literature or news reports are explained either from a technological perspective or from a societal-actor perspective. However, the interactions between actors and technology in creating ‘smart’ governance approaches to combat COVID-19 are hardly considered. Second, recent observations show that the response to COVID-19 in Wuhan has also witnessed an emergent, pop-up form of massive ICT-enabled, self-organized collaboration, characterized by large-scale, connected, and distributed interactions among citizens and by digital altruism (Xinhua 2020; Wang 2020). Meijer et al. (2019) conceptualize this emerging innovative form of governance as open governance. However, few studies have recognized the existence of this newly arising governance approach and its potential and effectiveness in governing the COVID-19 pandemic.
2 Smart Governance and COVID-19 Control in Wuhan, China
Against this backdrop, this chapter concentrates on the socio-technical developments of smart governance for COVID-19 control in Wuhan, China. Based on the recently emerging conceptualization of smart governance (Jiang 2021), we were able to demonstrate how different types of smart governance can be established to contain COVID-19 in China. The purpose of this chapter is to present a comprehensive picture of the smart governance of COVID-19 control in China. Besides, this chapter also intends to offer useful insights obtained from China for other countries to contain COVID-19 in their respective societal contexts. To do so, Sect. 2.2 focuses on challenges of COVID-19 and explains the potential of smart governance for COVID-19 control. Section 2.3 introduces the methodology. Section 2.4 presents the results of the smart governance of COVID-19 in Wuhan. Section 2.5 ends with conclusions and lessons learned.
2.2 Smart Governance and COVID-19 Control
2.2.1 COVID-19 Challenges and Control
According to medical research, COVID-19 can cause serious health problems such as fatigue, loss of smell, shortness of breath and even death (Riggioni et al. 2020). It can also have severe socio-economic impacts, including interruptions in the global supply chain, unemployment, reduced consumption, gendered effects, and harm to social safety and mental health (Atkeson 2020). In response, effective governance approaches need to be developed and delivered within a very short time span (Taghrir et al. 2020; Janssen and van der Voort 2020). Here, we define “governance” broadly, as the entire system of public, private, semi-public and individual actors that jointly solve an issue (Treib 2007). In the face of COVID-19, the government, for instance, is required to publish infection data through public health platforms and to handle relevant issues such as testing and contact tracing to control the transmission of the coronavirus (He et al. 2020). For non-state actors such as private companies and citizens, it is vital to comply with the rules and measures that can effectively prevent transmission (Kummitha 2020). In addition, the importance of developing and deploying digital public health technologies for pandemic control has received much attention (Shaw et al. 2020; He et al. 2020). For instance, Bragazzi et al. (2020) show that AI technologies can be applied to identify, track, and forecast infections through big data analytics and to enhance public security via improved face recognition and high-temperature detection. Chan et al. (2020) highlight that smart and powerful devices can also contribute to diagnosing the virus and facilitating virtual communication to reduce human-to-human physical contact. Technology is therefore deemed an inseparable part of any governance strategy aiming to contain COVID-19.
H. Jiang et al.
Beyond the discussions above, some studies highlight the importance of deploying ICTs on a massive scale to build innovative, ‘smart’ governance approaches to handle COVID-19. For instance, AlTakarli (2020) urges a focus on a high level of collective action, supported by smart technologies, in containing the transmission of COVID-19. Bragazzi et al. (2020) praise the role of mobile and web-based applications in enhancing government rationality and informed decision-making for controlling the pandemic. In the next subsection, we focus on how different types of smart governance approaches can be built with the help of ICT to contain COVID-19.
2.2.2 Smart Governance for COVID-19
Meijer (2016:73) argues that smart governance “is about using new technologies to develop innovative governance arrangements”, aimed at obtaining better outcomes. Based on an intensive literature review, Meijer and Bolívar (2016) identify four ideal-typical conceptualizations of smart governance: (1) government of a smart city, (2) smart decision-making, (3) smart administration and (4) smart urban collaboration. These conceptualizations reflect the different roles of urban actors (i.e., state, market and citizens) in governance processes and the different types of technological functions (e.g., informing, communicating, analyzing, designing, visualizing) applied to support governance processes and/or handle the (urban) issues concerned (Jiang et al. 2020). In addition to these four types, Meijer et al. (2019) more recently identified a new, innovative form of smart governance, labeled open governance. Building upon the emerging, rich interactions in cities that are facilitated by new ICTs, open governance is argued to be able to manage crises or natural disasters through ‘pop-up’, emergent collective actions. In the following subsections, we further explain these smart governance approaches.
2.2.2.1 Government of a Smart City
In this conceptualization, smart governance is considered as the governance of a smart city. For instance, Nam (2012) claims that smart governance is about implementing smart technologies (e.g., cloud technologies, internet of things (IoTs), AI and data analytics) to promote smart city initiatives. In this process, governments play a central role in policy-making and implementing technologies to make a city smarter. Then, the technological value—the acceptance, adoption, and application of technology itself is treated as value—is stressed. This type of value follows from the study of diffusion and adoption of innovations that the adoption and dissemination of a technology in a population is itself an illustration of its added value for the person who uses or operates that technology (Meijer and Thaens 2018). In this chapter, the so-called ‘government of a smart city’ approach to governing the COVID-19 period is the efforts made by government to deploy new ICTs to find potential solutions to combat this pandemic.
2.2.2.2 Smart Decision-Making
According to UNESCAP (2007), smart governance is about using technologies to improve “the process of decision-making and the process by which decisions are implemented (or not implemented)”. In this conceptualization, different sorts of data, facts and details concerning public management gathered through sensors or sensor networks from a variety of sources can help increase government’s rationality and improve their overall capabilities for smart decision-making (Ju et al. 2018). Then, the instrumental value of technology is emphasized in this process to help the participants to obtain their particular goals or purposes. For instance, by using AI tools and urban data analytics, the distribution of suspected and infected COVID-19 cases can be located and mapped for further targeted actions taken by governments (Shi et al. 2020).
2.2.2.3 Smart Administration
Alawadhi and Scholl (2016:2953) argue that smart governance is about “reshaping administrative structures and processes across multiple local government agencies and departments” by using ICT. In this approach, it is required to restructure the internal organization of government with the help of technological innovations, aimed at enhancing the efficiency, quality and sustainability of government services. Then, the collaborative value of technologies—facilitating the exchange of information and knowledge between different governmental divisions—is highlighted. For instance, the capabilities of local governments worldwide in providing public services during the COVID-19 pandemic are improved by employing integrated government information management systems (OECD 2020).
2.2.2.4 Smart Urban Collaboration
Kourtit et al. (2012) claim that to maximize the operation of cities, more ICT-enabled collaborative governance structures engaging different stakeholders should be put forward. In this process, smart governance requires a higher level of government transformation—that is, the state, market and civil society should communicate with each other and work together to achieve specific goals. Then, emphasis is not only put on the collaborative value but also on the symbolic value—“the technology provides legitimacy to the process of innovation because of the idea that technology helps us to create a better future” (Meijer and Thaens 2018:368). As for COVID-19 control, smart technologies such as public websites and social media can establish various collaborative platforms for actively engaging its residents and sharing COVID-19 information and policy measures for collective actions.
2.2.2.5 Open Governance
Open governance acknowledges the emergent, pop-up character of new collaborations and presents an understanding of massive, individualized, ICT-enabled collaboration in cities (Meijer et al. 2019). Crisis management is usually the empirical domain in which open governance results in interesting practices. Meijer et al. (2019) determine five core elements of the open governance paradigm: radical openness, citizen-centricity, connected intelligence, digital altruism, and crowdsourced deliberation. The relations between citizens, rather than their contacts with the government, are at the heart of open governance. More specifically, governments and other platform providers facilitate interactions among citizens rather than focusing on interactions between citizens and government. Technology in this process acts as a collaboration infrastructure that allows different stakeholders, especially voluntary citizens, to offer public services in a co-production model. In terms of COVID-19 control, the co-production model highlights the volunteering role of citizens in effectively producing knowledge and solutions to reduce the impact of COVID-19 through various connections and radically deconcentrated forms of technological intelligence.
In this chapter, the five types of smart governance described above are applied to study the smart governance of COVID-19 in Wuhan, China. In the next section, we introduce the methods used to gather and analyze the relevant data.
2.3 Methodology
2.3.1 Data Selection
First, a literature survey was conducted to identify the approaches to smart governance of COVID-19 in Wuhan, since part of Wuhan’s response to COVID-19 has been well documented in recent academic literature. The search was conducted in July 2020 and updated in September 2020. Only peer-reviewed academic journal articles were taken into account, to ensure that high-quality research was included. We then limited the search to articles published or accepted in 2020. Both English and Chinese journal articles were considered, as this provides a more comprehensive view of the efforts made to govern COVID-19 in Wuhan. Using the Scopus database, a total of 28 English articles written from an ICT-enabled governance perspective were identified and used in this chapter. The Chinese articles were searched in the China Academic Journals Full-text Database (CJFD), one of the most important online academic databases in China (Jiang et al. 2019). Owing to its strict academic standards and rigorous selection, this database covers the leading natural and social science journals in China. In total, 19 Chinese articles were identified and used in this chapter.
Second, digital news archives, policy documents, index systems and websites were also investigated. Keywords (i.e., smart governance of COVID-19, ICT-enabled governance and COVID-19, technology and COVID-19) were searched in the Google and Baidu search engines to gather English and Chinese documents. The search was again restricted to online documents and reports published in 2020. By using a snowball sampling method, we were able to obtain all potentially relevant documents useful for illustrating the smart governance of COVID-19 in China.
Third, nine citizens living in Wuhan were invited to participate in semi-structured interviews over the course of four months, from May to August 2020. These participants personally experienced Wuhan’s prevention and control of COVID-19, so they were able to offer first-hand information that supplements the web-based data.
2.3.2 Analysis
Discourse analysis was used to analyze and explain the collected data. Discourse analysis aims to understand how discourse is implicated in wider power structures and in social and cultural contexts (Fairclough 2001). By using coding techniques and referring to the five types of smart governance, namely (1) government of a smart city, (2) smart decision-making, (3) smart administration, (4) smart urban collaboration and (5) open governance, we were able to identify the units of analysis (i.e., the roles of actors and the functions of technologies in this chapter) within their semantic contexts. According to Petrina (1998), the units of analysis act as empirical evidence of the latent meaning interpreted in discourse analysis. We therefore use the coded units as discourses to understand the different meanings of smart governance in containing COVID-19 in Wuhan.
2.4 Results
This section presents results obtained from the analysis of the smart governance of COVID-19 in Wuhan. We specifically focus on how the five types of smart governance were applied in practice.
2.4.1 Government of a Smart City
Given the supposition of feichang shiqi (i.e., extraordinary times), reports showed that the Chinese government implemented unconventional and precautionary measures to combat COVID-19 (e.g., closure of public transport, travel restrictions, quarantine, and shutting down all non-essential companies and schools)
(The State Council Information Office 2020). The role of cutting-edge technologies in enabling the implementation of China’s containment measures has been acknowledged by the Chinese authorities (Kummitha 2020). The policy documents collected indicated that many ICTs were employed to govern COVID-19 in Wuhan (i.e., government of a smart city) (The State Council Information Office 2020). For instance, to provide reminders and information about daily confirmed new COVID-19 cases, protective measures, and control orders and rules, the Chinese authorities upgraded networks, apps and platforms such as subway and train station announcements, television broadcasts, social media, and smartphone alerts (Park 2020; Ning et al. 2020). At an early stage, an explosion in the number of severely ill patients requiring treatment in the Intensive Care Unit (ICU) put the healthcare system under unprecedented pressure. In response, technologies were applied on a massive scale to enhance the capacity of hospitals and reduce the patient load in Wuhan. One example is the newly built Huoshenshan and Leishenshan Hospitals, which offered 2,600 beds in total (Cai et al. 2020). Equipped with central oxygen supply systems, negative pressure systems, ventilation systems and air purifiers, the two hospitals had a solid hardware and software basis for working effectively. Robot nurses designed and developed by private companies were also proposed as a policy instrument to combat COVID-19 in Wuhan. For instance, collaborative partnerships were formed between the tech company CloudMinds, China Mobile and the Wuhan Wuchang Smart Field Hospital to build a field hospital staffed by robots (O’Meara 2020). In this hospital, some robots provided patients with food, drinks, medicine and recovery information, while others cleaned floors and sprayed disinfectant. The temperature, blood oxygen and heart rate levels of patients could also be observed and checked in real time by using smart rings and bracelets synced with CloudMinds’ AI platform. In this process, technologies were crucial for supporting the understaffed medical professionals and minimizing the chances of cross-infection.
2.4.2 Smart Decision-Making
Practices showed that, by making use of connected applications and AI systems to collect real-time and transparent data and information, the Wuhan authorities were able to make smart decisions about whether interventions were needed to prevent violations of containment rules. One measure was the use of a combined facial recognition platform and street infra-red camera system to assist in real-time control and surveillance of the coronavirus disease (AlTakarli 2020). When the authorities identified individuals showing a high temperature or COVID-19 symptoms, or detected individuals walking in public without face masks, they passed the information to implementation teams for further action. Another measure was the adoption of drones and robots equipped with AI technologies to decrease the possibility of exposing people to COVID-19 (Shi et al.
2020). Equipped with five high-resolution cameras and infra-red thermometers, these self-driving robots and remote-controlled drones were able to scan the temperatures of several people simultaneously within a radius of five metres (Weekes 2020). Whenever a person with a fever and/or without a mask was detected, the information management systems alerted the authorities so that they could enforce the containment orders or social distancing. To identify those staying at home but potentially affected by COVID-19, Iflytek, a Chinese AI company that specializes in automatic speech recognition, collaborated with the Wuhan government to develop a medical calling robot (Tang et al. 2020). With the help of an AI voice assistant that can make 900 phone calls in one minute, the system can identify potentially affected people in Wuhan and help medical personnel and nurses to deliver further health care and treatment in the most efficient manner. Furthermore, new technology-based tracking systems were also used by the Chinese authorities to combat COVID-19. For instance, big data and mobile technology were combined to establish a color-based health code (or QR code) system. The QR code system was developed to categorize individuals’ health status into three color groups: green, yellow or red (Mozur et al. 2020). By linking the system to individuals’ biometric data (e.g., temperature) and contact and travel history, officials can track and monitor people’s movements, check their health status and make real-time decisions about whether people need to be quarantined. This snapshot of the smart decision-making process in Wuhan showed that, in an unexpected situation requiring immediate action, the deployment of ICTs can help governments make better choices and greatly shorten their emergency response times.
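The color-based health code described above can be read as a simple rule-based classifier. The following is a minimal sketch under stated assumptions, not the actual system: the chapter only reports the three colors and the kinds of input data (temperature, contact history, travel history), so the field names, the fever threshold and the precedence of the rules are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class HealthCode(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


@dataclass
class PersonRecord:
    # All fields are illustrative; the real system's data model is not
    # described in the chapter.
    temperature_c: float          # latest reported body temperature
    confirmed_case: bool          # person has a confirmed diagnosis
    close_contact: bool           # known contact with a confirmed case
    visited_high_risk_area: bool  # travel history includes a high-risk area


# Hypothetical fever cutoff, chosen only for illustration.
FEVER_THRESHOLD_C = 37.3


def assign_health_code(person: PersonRecord) -> HealthCode:
    """Map a person's record to a color using assumed, simplified rules."""
    has_fever = person.temperature_c >= FEVER_THRESHOLD_C
    # Assumed rule: confirmed cases, or contacts who also have a fever, are red.
    if person.confirmed_case or (person.close_contact and has_fever):
        return HealthCode.RED
    # Assumed rule: any single risk signal on its own is yellow.
    if person.close_contact or person.visited_high_risk_area or has_fever:
        return HealthCode.YELLOW
    return HealthCode.GREEN
```

In such a design the classification itself is trivial; the governance questions raised in this chapter concern which data sources feed the record and who may act on the resulting color.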
2.4.3 Smart Administration
The literature gathered also indicated that smart administration platforms were adopted by the Chinese government for COVID-19 control. Within this smart approach, technologies helped to transform internal government structures and to open up government datasets, with the aim of greater transparency, better service provision and more valid public policy. One example was the application of interoperable health information systems to improve cooperation within and across different levels of government and hospitals (Ning et al. 2020). Once a COVID-19 case is suspected or confirmed, the responsible doctor is requested to report the case electronically, and statistics are then generated for the total number of cases in each area. Each province then submits its overall report to the National Health Commission, which generates daily reports on newly suspected, diagnosed, and asymptomatic cases and deaths. Another example was the open government health service platforms (i.e., the Wuhan Municipal Health Commission website) used to release daily case reports and offer other detailed information such as availability of testing stations, places that infected people visited,
and availability of face masks.1 Via these websites, local residents can have access to up-to-date information about local COVID-19 control actions. In addition, the Government Internal Information Management System in Wuhan was used to bring together and coordinate the different government divisions (Peng et al. 2020; Yu and Li 2020). By identifying and encouraging those industries that can offer basic necessities, Wuhan effectively kept the economy running during the peak crisis period. Then, Wuhan Government Service Platform was also employed to offer public services concerning financing, consultation, insurance, etc., aimed at helping those micro, small, and medium-sized enterprises (MSMEs) in crisis.2
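The case-reporting flow described in this subsection (individual electronic reports, per-area statistics, provincial submissions, and a national daily report) amounts to a hierarchical roll-up of counts. The sketch below illustrates that roll-up only; the record layout, the category labels and the function names are assumptions made for illustration and do not describe the actual health information systems.

```python
from collections import Counter, defaultdict

# Categories named in the text; the exact labels are assumed.
CATEGORIES = {"suspected", "diagnosed", "asymptomatic", "death"}


def area_totals(reports):
    """Count reports per (province, area).

    `reports` is an iterable of (province, area, category) tuples, a
    hypothetical stand-in for the doctors' electronic case reports.
    """
    totals = defaultdict(Counter)
    for province, area, category in reports:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[(province, area)][category] += 1
    return totals


def province_totals(areas):
    """Roll per-area counts up to one Counter per province."""
    totals = defaultdict(Counter)
    for (province, _area), counts in areas.items():
        totals[province].update(counts)
    return totals


def national_daily_report(provinces):
    """Sum the provincial submissions into a single national Counter."""
    national = Counter()
    for counts in provinces.values():
        national.update(counts)
    return national
```

Keeping each level as a plain sum of the level below means that a correction to a single case report propagates upward by recomputing the totals, which is one reason interoperable reporting formats matter for this kind of administration.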
2.4.4 Smart Urban Collaboration
Practices indicated that the application of social media and web-based platforms facilitated the rise of smart collaborations between different stakeholders (i.e., state, market and civil society) to control the coronavirus. For example, through the WeChat public account platform, state-owned media published a large amount of useful information for the general public concerning the latest situation of the pandemic and personal prevention measures (Lu and Zhang 2020). Ordinary people could use functions such as the WeChat circle of friends and groups to gather, integrate, and disseminate information on confirmed cases and people suspected of being contagious, to build a shared understanding of the situation and to follow up-to-date control measures and rules. Current practices also revealed that smart urban collaboration networks were widely set up to improve hospital capacities. For instance, by using the industrial internet that had become well established by early 2020, more than 3,000 Chinese companies were able to build a well-aligned, start-to-finish and modular manufacturing chain to produce protective clothing, face masks, disinfectants and medical supplies (Lau 2020). Nationwide healthcare facilities also collaborated with large internet companies such as Tencent and Alibaba to create online healthcare platforms like Dingxiangyuan that provide the public with remote medical services (Aikman and Chan 2020). By using these platforms, ordinary people were able to consult doctors online, carry out self-examination and decide whether they should remain at home or go to a hospital for further medical checks. These platforms not only alleviated the demand on hospitals by reducing caregiver workloads and non-essential hospital visits, but also reduced the likelihood of cross-infection.
Furthermore, collaborations between government and private companies have enabled around 276 million full-time students to restart their studies through online platforms (Lau 2020). For instance, large internet companies such as China Mobile, Alibaba and Tencent worked with the Chinese government to create various online learning, e-learning, and distance learning environments for students, including video messaging, video conferencing and remote consultation. Hundreds of industrial internet-based online educational platforms (e.g., Wangyi Open Course) also provided free-of-charge, individual live streaming services and shared their massive open online courses (Lau 2020). It should be noted that during the quarantine period in Wuhan, already-vulnerable groups including the elderly, children, people with disabilities and pregnant women were confronted with severe problems in securing basic necessities and were even more exposed to COVID-19 (Gabster et al. 2020; Choi et al. 2020). Voluntary citizens and community workers therefore used various WeChat groups to reach the vulnerable people who needed help.
1 http://wjw.wuhan.gov.cn/.
2 http://home.wuhan.gov.cn/.
2.4.5 Open Governance
Interviews with Wuhan citizens also indicated that open governance developed to improve hospital capacities in China. Analyses showed that after the outbreak of COVID-19 in Wuhan, many local chambers of commerce, citizens and local hospitals in Wuhan worked, in a deliberate and intentional way, with overseas individuals and business owners to supply critically needed medical equipment and facilities. For instance, by creating various WeChat groups, people could exchange open government data and other open data about where there was a serious need for medical supplies such as rubber gloves, goggles, thermometers, medical masks, protective suits, disinfectants, and hand lotions. Since real-time information can be updated in WeChat groups, volunteers were able to receive important information for decision-making on primary needs and could help others by offering the latest information on local situations. As one interviewee said:
I was one of the volunteers in the WeChat Group of Volunteers Supporting Wuhan. My job is to voluntarily pick up medical staff to and from work for free…one of my friends went to the community to help prevent and control fever patients…by receiving messages from WeChat groups, he was able to deliver medical supplies to those in need [R6].
It should also be noted that the social media-based open governance identified in Wuhan is inclusive and involves opinions, ideas and suggestions on a broader and more representative scale. For instance, people in WeChat groups can invite new participants to join them at any time if needed. The newcomers are able either to generate new data and combine these with open government data, so that new ideas and insights for controlling COVID-19 are produced, or to spread information concerning COVID-19 to as wide an audience as possible.
WeChat offers us a very convenient way to build networks and targeted groups. It is a very cost-effective, easy-to-use and intelligent tool for people to collect and share information…People can be easily invited to join the groups and contribute their ideas and knowledge…however, if you feel disturbed, you can withdraw from the groups if you want…everything is completely voluntary… [R3].
The analysis revealed that this emergent, pop-up “ad hoc” organization includes different individual, institutional and organizational actors, all using open data and co-produced knowledge to control the pandemic for everyone concerned. Instead of relying on centralized leadership and top-down organizational structures, a range of individual and market interactions facilitated by deconcentrated forms of intelligence (e.g., social media networks and web-based digital platforms) contributed to the establishment of self-organization, which offered timely access to amplified data input and response feedback loops for COVID-19 control.
Our WeChat group includes different kinds of people…It was led by self-selected participants…Although we just know a few members in the group, we are voluntarily working hard to help Wuhan…we have a warm feeling [R3].
The analyses showed that the deconcentrated forms of intelligence bring about the required connections. Through these large-scale, connected, and distributed data and knowledge interactions, more diverse, deliberative and impartial participation and collaboration were enabled for all sectors of society to handle the COVID-19 crisis in Wuhan. Even though the present literature and news reports have seldom recognized its existence and potential, open governance was indeed a convincing supplement to the existing smart governance approaches.
2.5 Conclusions and Recommendations
Since China has achieved a major strategic success in controlling the current COVID-19 pandemic, this study specifically focused on the smart governance of COVID-19 in Wuhan. The analysis in this chapter revealed that different types of smart governance have been developed and used in Wuhan to contain COVID-19. In this process, various stakeholders (e.g., the government, the private sector and civil society) either worked together or voluntarily participated in the smart governance of COVID-19. ICT showed its transformative role in supporting the COVID-19 governance process and handling the problems faced. In brief, the analysis indicated that the smartness of smart governance for COVID-19 in Wuhan lies in the innovative use of ICT to develop effective governance institutions to handle the pandemic. Based on our analysis, we provide some lessons that China offers other countries for containing COVID-19. First, we should acknowledge the potential of newly developed technologies (e.g., computational techniques, social media, AI and big data) in developing innovative smart governance approaches to govern the COVID-19 pandemic. As Janssen and van der Voort (2020) assert, smart technologies enhance COVID-19 governance processes by gathering, exchanging and analyzing data to solve problems without relying much on human intervention, and by improving coordination and interactions between stakeholders for collective action. The analysis in this chapter has revealed that smart ICTs have proved their potential for governing COVID-19 in Wuhan. More specifically, technologies have been used on a massive scale to improve social
organizations and governmental administration, with the aim of changing human relations and improving capacities to handle the emergency. Based on this, it is highly recommended to acknowledge the importance of technology in creating innovative governance approaches to control the COVID-19 pandemic. Second, there is a strong need for different actors (i.e., state, market and civil society) to work together and improve the social and human capital that would assist in creating and implementing smart governance approaches to get COVID-19 under control. The study indicated that cooperation between government and high-tech companies was crucial for developing and implementing technology in China. With the help of various digital community communication and management systems, volunteer teams of residents within communities actively delivered services to those in need. Particularly worth mentioning is the appearance of open governance as a dedicated type of smart governance in governing COVID-19. Enabled by various social media, networks and platforms, emerging, pop-up forms of technology-enabled data-sharing, mass digital altruism, and self-organization practices beyond government were able to occur across a wide range of actors. From this perspective, we highlight that more collaborations and partnerships with meaningful commitments to human health security are needed to handle the current fragmentation in COVID-19 governance. Third, smart governance of COVID-19 should be more pragmatic, contextualizing itself in the situations in which it is embedded and producing adaptive governance solutions. Although at the national scale a whole-of-government and technology-driven approach was employed to govern COVID-19 in China, the meaning of smart governance at the local scale differed considerably. For instance, the detailed enquiry into the governance of COVID-19 in Wuhan indicated various smart models for governance collaborations. This implies that one should understand why a particular smart governance approach is treated as the best solution in a given situation. More specifically, handling the pandemic requires the proposed smart governance approaches to be tailored to local specificities and local environments and to be applied in a targeted way. Finally, the translation of China’s lessons in smart governance of COVID-19 to other countries should consider contextual differences (e.g., in economy, politics, culture and level of technological development) between regions and/or countries. For instance, although digital contact tracing applications proved a powerful strategy to control COVID-19 in China, massive collection of private data and a lax attitude towards privacy protection in the private sector could lead to an erosion of privacy rights and hurt public trust in digital technologies (Bengio et al. 2020). In Western democracies, because of the influence of “the individual freedom and rights conferred to individuals, the privacy protection laws enacted and the human-driven approaches adopted in smart cities” (Kummitha 2020:5), technologies to control COVID-19 often need to be deployed in different ways than in China. Therefore, we emphasize that real smart governance for controlling the worldwide COVID-19 pandemic can be built only when the importance of contextual differences between regions and/or countries in shaping the meaning of governance is well recognized and considered.
References

Aikman D, Chan A (2020) Five ways Chinese companies are responding to coronavirus. Accessed 7 July, 2020. Available at: https://www.weforum.org/agenda/2020/02/coronavirus-chinese-companies-response/
Alawadhi S, Scholl HJ (2016, January) Smart governance: a cross-case analysis of smart city initiatives. In: 2016 49th Hawaii international conference on system sciences (HICSS) (pp 2953–2963). Koloa, HI, USA
AlTakarli NS (2020) China’s response to the COVID-19 outbreak: a model for epidemic preparedness and management. Dubai Med J 3(2):44–49
Atkeson AG (2020) What will be the economic impact of COVID-19 in the US? Rough estimates of disease scenarios. Staff Report 595, Federal Reserve Bank of Minneapolis
Bragazzi NL, Dai H, Damiani G, Behzadifar M, Martini M, Wu J (2020) How big data and artificial intelligence can help better manage the COVID-19 pandemic. Int J Environ Res Public Health 17(9):3176
Bengio Y, Janda R, Yu YW, Ippolito D, Sharma A (2020) The need for privacy with public digital contact tracing during the covid-19 pandemic. Lancet Digital Health 2(7):342–344
Cai Y, Huang T, Liu X, Xu G (2020) The Effects of “Fangcang, Huoshenshan, and Leishenshan” Makeshift Hospitals and Temperature on the Mortality of COVID-19. medRxiv. https://doi.org/10.1101/2020.02.26.20028472
Cao Y, Shan J, Gong Z, Kuang J, Gao Y (2020) Status and challenges of public health emergency management in China related to COVID-19. Front Public Health 8:250. https://doi.org/10.3389/fpubh.2020.00250
Chan AK, Nickson CP, Rudolph JW, Lee A, Joynt GM (2020) Social media for rapid knowledge dissemination: early experience from the COVID-19 pandemic. Anaesthesia 75(12). https://doi.org/10.1111/anae.15057
Choi J, Lee S, Jamal T (2020) Smart Korea: governance for smart justice during a global pandemic. J Sustain Tourism 29(6):1–10
Fairclough N (2001) Critical discourse analysis. How to analyse talk in institutional settings: A casebook of methods (pp 25–38). London, UK: Continuum
Gabster BP, van Daalen K, Dhatt R, Barry M (2020) Challenges for the female academic during the COVID-19 pandemic. Lancet 395(10242):1968–1970
He AJ, Shi Y, Liu H (2020) Crisis governance, Chinese style: distinctive features of China’s response to the Covid-19 pandemic. Policy Design Practice 3(3):1–17
Janssen M, van der Voort H (2020) Agile and adaptive governance in crisis response: Lessons from the COVID-19 pandemic. Int J Inform Manag, 102180. https://doi.org/10.1016/j.ijinfomgt.2020.102180
Jiang H (2021) Smart urban governance: Governing cities in the “smart” era (doctoral dissertation). Utrecht University, Utrecht
Jiang H, Geertman S, Witte P (2019) Comparing smart governance projects in China: a contextual approach. In: Geertman S, Zhan Q, Allan A, Pettit C (eds) Computational urban planning and management for smart cities. Springer, Cham
Jiang H, Geertman S, Witte P (2020) Ignorance is bliss? An empirical analysis of the determinants of PSS usefulness in practice. Comput Environ Urban Syst 83:101505. https://doi.org/10.1016/j.compenvurbsys.2020.101505
Jie K (2020) Fighting on the frontline: COVID-19 pandemic promote robotics industry in China. Accessed 7 July, 2020. Available at: http://en.people.cn/n3/2020/0501/c90000-9686307.html
Ju J, Liu L, Feng Y (2018) Citizen-centered big data analysis-driven governance intelligence framework for smart cities. Telecommunications Policy 42(10):881–896
Kourtit K, Nijkamp P, Arribas D (2012) Smart cities in perspective – a comparative European study by means of self-organizing maps. Innov: Eur J Soc Sci Res 25(2):229–246
Kummitha RKR (2020) Smart technologies for fighting pandemics: The techno- and human-driven approaches in controlling the virus transmission. Government Inform Quart 37(3). https://doi.org/10.1016/j.giq.2020.101481
Lau H, Khosrawipour V, Kocbach P, Mikolajczyk A, Schubert J, Bania J, Khosrawipour T (2020) The positive impact of lockdown in Wuhan on containing the COVID-19 outbreak in China. J Travel Med 27(3):taaa037
Lau S (2020) How China’s industrial internet is fighting COVID-19. Accessed 7 July, 2020. Available at: https://www.weforum.org/agenda/2020/04/china-covid-19-digital-response/
Lin C (2020) Delivery technology is keeping Chinese cities afloat through coronavirus. Harvard Business Review. Accessed 7 July, 2020. Available at: https://hbr.org/2020/03/delivery-technology-is-keeping-chinese-cities-afloat-through-coronavirus
Lu Y, Zhang L (2020) Social media WeChat infers the development trend of COVID-19. J Infect 81(1):e82–e83
Mei C (2020) Policy style, consistency and the effectiveness of the policy mix in China’s fight against COVID-19. Policy Soc 39(3):309–325
Meijer A (2016) Smart city governance: a local emergent perspective. In: Smarter as the new urban agenda (pp 73–85). Springer, Cham
Meijer AJ, Lips M, Chen K (2019) Open governance: A new paradigm for understanding urban governance in an information age. Front Sustain Cities 1:3. https://doi.org/10.3389/frsc.2019.00003
Meijer A, Bolívar MPR (2016) Governing the smart city: a review of the literature on smart urban governance. Int Rev Admin Sci 82(2):392–408
Meijer A, Thaens M (2018) Urban technological innovation: Developing and testing a sociotechnical framework for studying smart city projects. Urban Affairs Rev 54(2):363–387
Mozur P, Zhong R, Krolik A (2020) In Coronavirus Fight, China Gives Citizens a Color Code, With Red Flags. Accessed 7 July, 2020. Available at: https://www.nytimes.com/2020/03/01/business/china-coronavirus-surveillance.html
Nam T (2012) Suggesting frameworks of citizen-sourcing via Government 2.0. Govern Inform Quart 29(1):12–20
Ning Y, Ren R, Nkengurutse G (2020) China’s model to combat the COVID-19 epidemic: a public health emergency governance approach. Global Health Res Policy 5(1):1–4
Normile D (2020) Can China return to normalcy: while keeping the coronavirus in check? Science. https://doi.org/10.1126/science.abb9384
O’Meara S (2020) Meet the engineer behind China’s first robot-run coronavirus ward. Accessed 7 July, 2020. Available at: https://www.nature.com/articles/d41586-020-01794-8
OECD (2020) The Territorial Impact of COVID-19: Managing the Crisis across Levels of Government. Accessed 10 September, 2020. Available at: https://www.oecd.org/coronavirus/policy-responses/the-territorial-impact-of-covid-19-managing-the-crisis-across-levels-of-government-d3e314e1/
Pan XB (2020) Application of personal-oriented digital technology in preventing transmission of COVID-19 China. Irish J Med Sci 189(4):1–2
Park J (2020) Changes in subway ridership in response to COVID-19 in Seoul, South Korea: Implications for social distancing. Cureus 12(4). https://doi.org/10.7759/cureus.7668
Peng F, Tu L, Yang Y, et al (2020) Management and treatment of COVID-19: the Chinese experience. Canadian J Cardiol 36(6). https://doi.org/10.1016/j.cjca.2020.04.010
Petrina S (1998) The politics of research in technology education: a critical content and discourse analysis of the Journal of Technology Education, Volumes 1–8. J Technol Educ 10(1):27–57
Riggioni C, Comberiati P, Giovannini M, et al (2020) A compendium answering 150 questions on COVID-19 and SARS-CoV-2. Allergy, 10–1111. https://doi.org/10.22541/au.159076950.07819469
Shaw R, Kim YK, Hua J (2020) Governance, technology and citizen behavior in pandemic: lessons from COVID-19 in East Asia. Progress Disaster Sci, 100090. https://doi.org/10.1016/j.pdisas.2020.100090
Shi F, Wang J, Shi J, et al (2020) Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for covid-19. IEEE Rev Biomed Eng. https://doi.org/10.1109/RBME.2020.2987975
Taghrir MH, Akbarialiabad H, Marzaleh MA (2020) Efficacy of mass quarantine as leverage of health system governance during COVID-19 outbreak: a mini policy review. Arch Iranian Med 23(4):265–267
Tang N, Huang G, Li M, Xu F (2020) Artificial intelligence plays an important role in containing public health emergencies. Infect Control Hosp Epidemiol 41(7):1–6
The State Council Information Office (2020) White paper – Fighting Covid-19: China in Action. Accessed 7 July, 2020. Available at: https://covid-19.chinadaily.com.cn/a/202006/08/WS5edd8bd6a3108348172515ec.html
Treib O, Bähr H, Falkner G (2007) Modes of governance: towards a conceptual clarification. J Eur Public Policy 14(1):1–20
UNESCAP (2007, January) What is good governance? Accessed 7 July, 2020. Available at: http://www.unescap.org/pdd/prs/ProjectActivities/Ongoing/gg/governance.asp
Wang M (2020) How do overseas Chinese students stand out in virus battle. Accessed 7 July, 2020. Available at: https://news.cgtn.com/news/2020-03-21/How-do-overseas-Chinese-students-stand-out-in-virus-battle-P28DDu4uty/index.html
Xinhua (2020) China Focus: Overseas Chinese join anti-coronavirus campaign. Accessed 7 July, 2020. Available at: http://www.xinhuanet.com/english/2020-02/01/c_138748542.htm
Yu X, Li N (2020) How did Chinese government implement unconventional measures against COVID-19 Pneumonia. Risk Manag Healthcare Policy 2020(13):491–499. https://doi.org/10.2147/RMHP.S251351
Chapter 3
Using Public-Private Data to Understand Compliance with Mobility Restrictions in Sierra Leone

Innocent Ndubuisi-Obi Jr, Ziyu Ran, Yanchao Li, Chenab Ahuja Navalkha, Sarah Williams, and Lily Tsai

Abstract This research investigates the potential for using call detail record (CDR) data to determine public compliance with two government-mandated confinement measures in Sierra Leone: a three-day lockdown and a fourteen-day inter-district travel restriction during the first wave of the COVID19 pandemic in April 2020. We use a distance-based mobility indicator, the average distance travelled per district per day, to determine compliance with government mandates. The measure is used to proxy the change in mobility compared to a baseline period for both inter- and intra-district trips in Sierra Leone. Our results show significant compliance across all districts in Sierra Leone. We also show that the intensity of compliance is influenced by poverty and population. Our work demonstrates how CDR-based mobility analysis was carried out in Sierra Leone during the COVID19 crisis to help policy makers understand the effectiveness of their COVID19 mitigation measures.

Keywords Public-private data · Call detail records · Covid19 · Sierra Leone
I. Ndubuisi-Obi Jr · Z. Ran · Y. Li · C. A. Navalkha · S. Williams · L. Tsai
Department of Urban Studies and Planning, Department of Political Science, Massachusetts Institute of Technology, Cambridge, MA, USA

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_3
3.1 Introduction

Sierra Leone is located on the southwest coast of West Africa. It has a total area of 71,740 km² and a population of 7.1 million based on the 2017 census. The country has five administrative levels: five administrative regions, 16 districts, 206 chiefdoms, 1,273 sections and 9,861 enumeration areas. On March 30th, Sierra Leone reported its first case of the novel coronavirus SARS-CoV-2. On April 5th, the Government of Sierra Leone implemented the first of a series of confinement measures: a three-day lockdown. On April 11th, after the first lockdown ended, the Government implemented a second measure: a 14-day inter-district travel ban. Under President Julius Maada Bio, Sierra Leone has prioritized the use of data analytics and technology to improve and bolster the work of government. As the COVID19 situation developed, a research panel was set up to support the government with actionable insights for responding effectively to COVID19. The government expressed a need to use call detail records to understand the effectiveness of its mobility restrictions and other subsequent policy actions. As a result, a research collaboration was formed between the Government of Sierra Leone (GoSL), Africell (the mobile network operator with the largest market share) and researchers at MIT to use CDR data to evaluate the effectiveness of the government’s COVID19 mitigation strategies. Our research team worked closely with GoSL’s Directorate of Science, Technology, and Innovation (DSTI) to understand the extent to which citizens responded to mobility restrictions.
3.2 Literature Review

Mobility data for policy impact

A number of studies have focused on using mobility data to support public policy and decision-making. Researchers have used call detail records (Phithakkitnukoon et al. 2012), travel surveys (Cottrill et al. 2013), smart cards (Liu et al. 2009), and GPS data (Zheng and Xie 2011) to understand human mobility patterns. These studies have aided the understanding of topics such as traffic patterns (Jiang et al. 2017), disease spread, and transit mode (Wang et al. 2018). During the COVID19 pandemic, there has been a renewed focus on the use of mobility data to support public health responses. Iacus et al. (2020) introduce the concept of mobility functional areas: geographic zones with a high degree of intra-mobility exchanges. They find that mobility-restricting measures implemented in different countries not only reduce the volume of mobility but also shrink the mobility functional areas. Similarly, Santamaria et al. (2020) generate directional daily movement volumes from mobile phone positioning data that can be used for epidemiological and economic modelling and in early warning applications. These studies highlight the relevance of mobility data in providing insights that can aid public health responses and resource allocation.
Call detail records and human mobility

Among the various types of mobility data, our paper focuses on call detail records (CDRs). A call detail record is automatically generated by telecommunications equipment (e.g. a cell tower) whenever a mobile phone call or transaction (e.g. a text message) is made. In 2013, Orange Telecom released a CDR dataset for the Ivory Coast and held the “Data for Development” (D4D) challenge. Researchers around the world used this dataset to address development questions and contribute to the socio-economic development and well-being of the Ivory Coast population. These studies range from mapping population movement to exploring community structure to understanding disease spread (Zhou et al. 2010; Wesolowski et al. 2012; Doyle et al. 2019). They demonstrate the use of CDR data to understand individual or collective mobility patterns in a geographical region, the temporal dynamics of mobility, and the interactions between socioeconomic status and human movement.

Mobility and socioeconomic status

Many studies have found that socioeconomic factors, particularly poverty, income, and population size, influence the structure of mobility networks (Leo et al. 2016; Silm et al. 2018; Östh et al. 2018; Xu et al. 2018; Yechezkel et al. 2020). When assessing the trade-off between poverty and the risk of catching COVID19, some researchers have found that in Latin America and Africa the degree of reduction in mobility is significantly driven by the intensity of poverty (Bargain and Aminjonov 2020). Social distancing measures place a high cost on the poor, who have little savings and rely on casual labor (Ravallion 2020). In this study, we confirm these findings and explore the relationship between the intensity of mobility reductions and the population and poverty levels in Sierra Leone.

Mobility measures for CDR data

Williams et al. (2015) characterize four classes of typical mobility measures derived from CDR data: number of towers used (NTU), distance travelled-straight line (DT-SL), maximum distance travelled (MDT), and radius of gyration (RoG). The NTU measures the number of cell towers used by a subscriber in a given time period. The DT-SL measures the sum of the straight-line distances between the towers where consecutive calls are placed. The MDT measures the maximum straight-line distance between two towers used by a subscriber. The RoG is the root mean square of the distances between the center of mass of all cell towers visited by a subscriber and each of the towers used. Research on mobility measures generated from large-scale CDR data has stressed their inconsistency, inaccuracy, and tendency to be confounded by socio-economic characteristics of the local context. These measures are plagued by problems due to call frequency, tower density and distribution, temporal and spatial sparsity, and assumptions about the nature of human mobility (Williams et al. 2015; Caceres et al. 2020). Williams et al. (2015) recommend that valid mobility measures:
• should be standardized and independent of tower density
• should be less dependent on a subscriber’s call frequency and on the dynamics of the cellular tower network
• should measure clearly defined aspects of mobility such as frequency or spatial range of movement.

In this study, we focus on the second recommendation and design an average distance metric similar to the DT-SL, except that it uses the haversine formula to calculate the distance between towers. Our paper contributes to the existing literature on CDRs by applying methods and processes documented in the literature to the context of Sierra Leone during the COVID19 crisis. Additionally, we show that by including policy makers in our analytics work we aided decision making and transferred knowledge to the GoSL so that it can perform this work after our collaboration is over. We detail the use of such analyses and methods for policy action, identify limitations of CDR data, and recommend effective methods for public and private data analysis. In the following section, we introduce our distance-based metric for compliance and discuss its suitability as a proxy for policy compliance during the study period.
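To make the four measure classes from Williams et al. (2015) concrete, the following is a minimal sketch rather than the implementation used in that work or in this study. It assumes each event for one subscriber has already been mapped to a tower coordinate, uses the haversine distance for every measure as a simplification, and approximates the center of mass by the mean latitude and longitude; the function names and the `visits` sequence are hypothetical.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius


def ntu(towers: List[Coord]) -> int:
    """Number of distinct towers used (NTU)."""
    return len(set(towers))


def dt_sl(towers: List[Coord]) -> float:
    """Sum of distances between the towers of consecutive events (DT-SL)."""
    return sum(haversine_km(p, q) for p, q in zip(towers, towers[1:]))


def mdt(towers: List[Coord]) -> float:
    """Largest distance between any two towers used (MDT)."""
    return max((haversine_km(p, q) for p in towers for q in towers), default=0.0)


def rog(towers: List[Coord]) -> float:
    """Root mean square distance from the center of mass (RoG).

    Assumes a non-empty list; the center is a simple coordinate mean,
    which is only a reasonable approximation over small areas.
    """
    n = len(towers)
    center = (sum(t[0] for t in towers) / n, sum(t[1] for t in towers) / n)
    return math.sqrt(sum(haversine_km(t, center) ** 2 for t in towers) / n)


# Hypothetical tower sequence for one subscriber over one day.
visits = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
print(ntu(visits), dt_sl(visits), mdt(visits), rog(visits))
```

All four functions depend directly on how often a subscriber generates events and on how densely towers are placed, which is the sensitivity the recommendations above are meant to address.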
3.3 Methodology

This study aims to understand whether citizens of Sierra Leone complied with the COVID19 mobility restrictions enacted by the GoSL during the study period, between February 1st and April 30th, 2020. To investigate this question, we calculate a distance-based metric, Average Distance Travelled ($ADT_D^T$), for each district and day. The $ADT_D^T$ metric is similar to those used in the literature on CDRs (see Literature Review). We compare the percent change in this metric during the study period to demonstrate the impact of Sierra Leone’s COVID19 policies on mobility within and across districts. We also include descriptive analysis to demonstrate the correlations between these changes and two socioeconomic covariates, population and poverty. Our methods aim to support the government with situational awareness that can inform its day-to-day COVID19 policy decisions. Below we describe the data and methods used to generate our analysis.
3.3.1 Socioeconomic Data

For socioeconomic indicators, we rely on the Sierra Leone 2015 Population and Housing Census. This dataset was acquired from Statistics Sierra Leone, the country’s central data authority. The Sierra Leone 2015 Population and Housing Census includes results of seven modules reporting on household identification, population characteristics (e.g. age, sex, religion), household facilities (e.g. tenure status, rubbish disposal practices), agricultural activity, household deaths within the past 12 months, Ebola socioeconomic impacts, and ownership of durables. We also include poverty
indicators from the 2018 Sierra Leone Integrated Household Survey (SLIHS) in our analysis. The 2018 SLIHS, which covers the period January–December 2018, contains information on a range of social and economic indicators.
3.3.2 Geospatial Datasets

Shapefiles were obtained at each of the five administrative levels from Statistics Sierra Leone (Stats SL) and from our in-country collaborators. In 2017, administrative regions in Sierra Leone were revised, resulting in the creation of one new province (Northwestern), two new districts (Falaba and Karene), and 41 new chiefdoms. With this change, the country went from 14 districts to 16. Shapefiles representing post-2017 administrative boundaries at each of the five administrative levels were acquired from Stats SL. In addition to geolocated polygons representing administrative regions of Sierra Leone, these shapefiles included region names and codes, the latter of which were used to integrate census data. For the purposes of this research, we focus on district-level analysis (Fig. 3.1).

Fig. 3.1 Map of Sierra Leone with administrative boundary and cell towers

Table 3.1 Sample CDR record

| Cell-Id | Source | Call Time | Duration | Direction | Event | Target |
|---|---|---|---|---|---|---|
| 12103 | 21D7B5 | 2020-03-04 20:00:10 | 00:29:57 | Incoming | call | 65H8I9 |

Cell-Id is a unique identifier representing the antenna or cell where a call was made or received. Source and Target are unique identifiers for subscribers generated by an anonymization or hashing procedure. Duration is the length of a CDR event. Direction logs whether the call or SMS was initiated by the source. Event encodes whether the event is a voice call or an SMS.
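A record with the fields listed in Table 3.1 could be loaded into a typed structure along the following lines. This is an illustrative sketch only: the chapter does not describe the file format Africell delivered, so the dictionary keys, the `CdrEvent` class, and the `parse_cdr_row` helper are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CdrEvent:
    cell_id: str          # antenna or cell where the event was made or received
    source: str           # anonymized subscriber identifier
    call_time: datetime   # start of the event
    duration: timedelta   # length of the event
    direction: str        # "incoming" or "outgoing"
    event: str            # "call" or "sms"
    target: str           # anonymized identifier of the other party


def parse_cdr_row(row: dict) -> CdrEvent:
    """Convert one raw row (column names assumed from Table 3.1) to a CdrEvent."""
    hours, minutes, seconds = (int(part) for part in row["Duration"].split(":"))
    return CdrEvent(
        cell_id=row["Cell-Id"],
        source=row["Source"],
        call_time=datetime.strptime(row["Call Time"], "%Y-%m-%d %H:%M:%S"),
        duration=timedelta(hours=hours, minutes=minutes, seconds=seconds),
        direction=row["Direction"].lower(),
        event=row["Event"].lower(),
        target=row["Target"],
    )


# The sample record from Table 3.1.
sample = {
    "Cell-Id": "12103", "Source": "21D7B5", "Call Time": "2020-03-04 20:00:10",
    "Duration": "00:29:57", "Direction": "Incoming", "Event": "call",
    "Target": "65H8I9",
}
record = parse_cdr_row(sample)
print(record)
```

Parsing timestamps and durations up front keeps later steps, such as computing gaps between consecutive events, free of string handling.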
3.3.3 CDR Data

Our analysis relies on CDR data provided by Africell, one of the leading mobile phone service providers in Sierra Leone, for the study period of February 1st to April 30th, 2020. Table 3.1 provides an example of a CDR record. The Africell CDR data covers 2.2 million unique active subscribers making an average of 7 million connections (voice calls/SMS) per day in February and 4.5 million per day in April. Some of the decrease can be attributed to endogenous factors related to the COVID19 pandemic. However, when reviewing daily trends in call patterns, our team noticed a downward trend in connections that began well before any policy restrictions were imposed. We soon discovered that on March 7th the national telecommunications regulatory body, NATCOM, implemented a price floor of Le590. Before the price floor, between bundles and promotions, the average price per minute was significantly below Le300. In conversations, mobile network operators in Sierra Leone told us they believed that the price floor reduced voice and SMS traffic significantly, by 50% in the case of one operator. This exogenous event complicated the task of building a relevant baseline for our analysis. As a result, our team decided to use a measure that was not sensitive to this price change. As discussed below, we believe the average distance travelled measure ($ADT_D^T$) is a reliable proxy for mobility during the study period that is insensitive to the introduction of the price floor.

Overview of CDR Processing

Before any data was shared, all stakeholders in the project signed a contract covering standards around data use and disclosure. This contract required the mobile network operator, Africell, to provide the analytics partner, MIT, with anonymized call detail records. Our process is described below (Fig. 3.2).

• The Mobile Network Operator (Africell) provided anonymized CDR data to the Data Analytics Partner (MIT). Our research team worked closely with Africell to safely and securely transfer anonymized CDR records to our analytics system. Access was provided through an FTP server behind a virtual private network (VPN), and the credentials for secure access were provided to our team. Because the CDR data was updated daily, our research team ingested data from Africell’s FTP server on a daily basis.
• The data was stored in a secure Amazon S3 bucket with access provided only to members of the MIT research team. Our research team relied on Apache Spark as our big data processing engine. Secure access to a shared notebook environment was given to members of the research team. The environment provided access to an Apache Zeppelin notebook where all data analytics and processing tasks were written and tested. Before any jobs were submitted to Spark, basic validation was carried out on the data.
• With the CDR records, Spark jobs were written to generate stops and journeys. These jobs were submitted via a Zeppelin notebook to run on an Amazon Elastic MapReduce (EMR) cluster provisioned for our large data processing tasks. Outputs were written to S3.
• After processing stops and journeys, another Spark job was submitted to calculate the daily average distance travelled for each district. The resulting measures were then analyzed and visualized. Reports were subsequently prepared and presented to our stakeholders at the Directorate of Science, Technology, and Innovation.
Fig. 3.2 Sierra Leone CDR analysis process and stakeholders
3.3.4 Extracting Stops and Journeys from Anonymized Call Detail Records

CDRs face two significant types of data sparsity, spatial and temporal, that require preprocessing before the data can be used for mobility analysis. Unlike fine-grained GPS data, CDR data is spatially sparse because it maps a subscriber’s movement to the coordinates of the closest cell tower. As a result, there are significant limits to what we can infer about a subscriber’s displacement pattern. CDR data is temporally sparse because records are only created when a subscriber makes or receives a call or text message. The fewer calls or SMS messages a subscriber makes or receives, the less accurately their CDRs trace their mobility patterns. These sparsity issues produce noisy signals that introduce errors when CDRs are used as mobility traces. To correct for this, we use methods similar to those in the literature to extract ‘stops’ from the call detail records. For each user, a stop is a location where that user is considered to have spent a significant amount of time. We define a valid stop as consisting of at least two consecutive events (voice or SMS) where the gap between those events is at least ten minutes and at most four hours. This operation is performed on daily call detail records and generates a list of valid stops for each subscriber. The list of daily stops is then used to generate user journeys. A journey is defined as two contiguous stops, where the first element is encoded as the source and the second element is encoded as the destination. A sequence of journeys provides a proxy for a user’s displacement in a given area on a given day.
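The stop and journey definitions above can be sketched for a single subscriber and day as follows. This is not the Spark job used in the study. It adds one assumption the text does not state explicitly, namely that the two consecutive events forming a stop are observed at the same cell, and the names `extract_stops` and `extract_journeys` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Event = Tuple[datetime, str]  # (timestamp, cell_id) for one subscriber

MIN_GAP = timedelta(minutes=10)
MAX_GAP = timedelta(hours=4)


def extract_stops(events: List[Event]) -> List[str]:
    """Return the ordered cells where the subscriber is considered to have stopped.

    A stop is recorded when two consecutive events occur at the same cell
    (an assumption of this sketch) and are between 10 minutes and 4 hours apart.
    """
    ordered = sorted(events)
    stops: List[str] = []
    for (t0, c0), (t1, c1) in zip(ordered, ordered[1:]):
        if c0 == c1 and MIN_GAP <= t1 - t0 <= MAX_GAP:
            if not stops or stops[-1] != c0:  # avoid repeating the same stop
                stops.append(c0)
    return stops


def extract_journeys(stops: List[str]) -> List[Tuple[str, str]]:
    """Pair contiguous stops as (source, destination) journeys."""
    return list(zip(stops, stops[1:]))


# Hypothetical events for one subscriber on one day.
day = datetime(2020, 3, 4)
events = [
    (day.replace(hour=8, minute=0), "A"),
    (day.replace(hour=8, minute=30), "A"),   # 30-minute gap at A: valid stop
    (day.replace(hour=9, minute=0), "B"),
    (day.replace(hour=9, minute=5), "B"),    # 5-minute gap at B: too short
    (day.replace(hour=12, minute=0), "C"),
    (day.replace(hour=13, minute=0), "C"),   # 1-hour gap at C: valid stop
]
stops = extract_stops(events)
journeys = extract_journeys(stops)
print(stops, journeys)
```

In this example the brief activity at cell B is discarded, so the day reduces to a single journey from A to C.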
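3.3.5 Generating the Average Distance Travelled Metric

\[
ADT_D^T = \frac{\sum_{od_{ij}^T \in OD_D^T} \mathrm{count}(i, j, t) \times \mathrm{hav}(i, j)}{\left| OD_D^T \right|} \tag{3.1}
\]

\[
OD_D^T = \left\{\, od_{ij}^T \;\middle|\; i \neq j,\; (i \in D) \oplus (j \in D) \,\right\} \tag{3.2}
\]

\[
od_{ij}^T = \bigl( a_i,\; a_j,\; \mathrm{count}(i, j, t),\; \mathrm{hav}(i, j) \bigr), \quad \text{where } a \in \mathit{journeys} \tag{3.3}
\]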
As we mentioned in the literature review, a number of measures are used to proxy mobility when working with call detail records. Our team needed a metric that could also serve as a reliable proxy for social distancing or ‘compliance’ with mobility restrictions. Our team was concerned with measuring the impact of a number of policies instituted by the GoSL regarding mobility, the most significant being an inter-district travel ban. With stops and journeys generated, we chose to calculate the average distance travelled in a given district per day. The average distance travelled is defined in Eq. 3.1. To calculate the average distance travelled measure at the district level for both inter- and intra-district trips, we perform a number of different aggregations on the
journey dataset. For each district, to compute the inter-district average, Eq. 3.1 is calculated on the sequence of journeys that have that district as either the source or the destination of a trip. For intra-district trips, it is computed on only those journeys where the source and destination are in the same district. For each unique origin and destination pair in the journey dataset, we count the number of times that trip was made on a given day (see Eq. 3.3). The haversine distance between an origin-destination pair is also calculated and multiplied by the number of trips made for that origin-destination pair. The result is then summed across all valid origin-destination pairs and divided by the total number of unique origin-destination trips. The result is the average distance travelled for a given district on a given day. In order to compare changes before and after the mobility measures were enacted, we develop a baseline using the first two weeks of February. We use Eq. 3.1 to calculate the $ADT_D^T$ on the subset of records where T is within the first two weeks of February. The result is then used as our baseline $ADT_D^T$ measure. To allow comparability, we index the daily $ADT_D^T$ of each district to its February baseline. Figure 3.3 shows the normalized 7-day moving average of the daily percent change in the $ADT_D^T$ for inter-district trips when compared against the baseline period across all districts in Sierra Leone. We observe a large drop in the $ADT_D^T$ during the three days of the first lockdown, followed by a three-day rise back to pre-lockdown rates and then a fall to levels significantly below pre-lockdown levels. The measure is unaffected by the reduction in call volumes due to the imposition of the national price floor on voice and SMS bundles and promotions.
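The aggregation described above can be sketched for one district and one day as follows. This is an illustrative reading of the prose rather than the study's Spark code: journeys are assumed to be (origin tower, destination tower) pairs, the tower coordinates and district assignments are hypothetical lookups, and the denominator is the number of unique origin-destination pairs, as the text states.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def average_distance_travelled(
    journeys: List[Tuple[str, str]],
    tower_coords: Dict[str, Coord],
    tower_district: Dict[str, str],
    district: str,
    inter: bool = True,
) -> float:
    """Average distance travelled for one district on one day."""

    def keep(i: str, j: str) -> bool:
        in_i = tower_district[i] == district
        in_j = tower_district[j] == district
        if inter:
            return i != j and (in_i != in_j)  # exactly one end in the district
        return i != j and in_i and in_j       # both ends in the district

    # Number of trips per unique origin-destination pair.
    counts = Counter((i, j) for i, j in journeys if keep(i, j))
    if not counts:
        return 0.0
    total = sum(n * haversine_km(tower_coords[i], tower_coords[j])
                for (i, j), n in counts.items())
    return total / len(counts)  # divide by the number of unique pairs


# Hypothetical towers, districts and journeys.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 2.0)}
districts = {"A": "X", "B": "Y", "C": "X"}
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "C")]
adt_inter = average_distance_travelled(trips, coords, districts, "X", inter=True)
adt_intra = average_distance_travelled(trips, coords, districts, "X", inter=False)
print(adt_inter, adt_intra)
```

Because each pair's distance is weighted by its trip count while the denominator counts unique pairs, the value rises with both trip length and trip frequency; dividing by the total number of trips instead would give a per-trip average.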
Fig. 3.3 Normalized $ADT_D^T$ (% change in average distance per trip per district per day relative to February baseline, inter-district)
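The normalization plotted in Fig. 3.3 can be approximated with two small helpers. This is a sketch under stated assumptions: the baseline is passed in as a single number per district, and the moving average is taken over the trailing seven days, since the chapter does not say whether the window is trailing or centered. The function names and the sample values are hypothetical.

```python
from statistics import mean
from typing import List


def percent_change(daily: List[float], baseline: float) -> List[float]:
    """Percent change of each daily value relative to the baseline."""
    return [100.0 * (value - baseline) / baseline for value in daily]


def trailing_moving_average(values: List[float], window: int = 7) -> List[float]:
    """Mean of the current value and up to window - 1 preceding values."""
    return [mean(values[max(0, k - window + 1): k + 1])
            for k in range(len(values))]


# Made-up daily values for one district, for illustration only.
daily_adt = [50.0, 100.0, 150.0]
change = percent_change(daily_adt, baseline=100.0)
smoothed = trailing_moving_average(change, window=2)
print(change, smoothed)
```

Indexing each district to its own baseline puts districts with very different typical trip lengths on a common percentage scale.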
3.4 Results and Analysis

Significant reductions in inter-district mobility during the three-day lockdown starting on April 5th and during the 14-day travel ban that started on April 11th

Figure 3.3 suggests that both the April 5th lockdown and the April 11th inter-district travel ban reduced travel across Sierra Leone. Before the first lockdown, the 7-day moving average of the $ADT_D^T$ measure was relatively flat across districts, with a few exceptions whose values still stayed within 10% of the February baseline. As the first lockdown begins, there is a significant drop in the normalized $ADT_D^T$ measure across all districts during the three days of the lockdown. After the lockdown ends, the trend begins to increase marginally and then decreases sharply after the beginning of the 14-day inter-district travel ban on April 11th.

Western Area Urban, Western Area Rural, and Bo experienced the largest changes in inter-district mobility

The map in Fig. 3.4 shows the change between February and April. The baseline used for calculating the change uses only the Sunday, Monday, and Tuesday periods in February. The map compares the $ADT_D^T$ during the three days of the lockdown to the baseline period in February. According to this measure, Western Area Urban (WAU), Western Area Rural (WAR), and Bo are the districts with the largest changes in inter-district mobility. Although all districts experienced reductions in inter-district mobility (as measured by the $ADT_D^T$), WAU, WAR, and Bo experienced declines in mobility greater than or equal to 65%. The Western Area districts are located near Freetown, the nation’s capital, and therefore experience higher rates of business travel, so it is plausible that travel fell most in these areas: fewer businesses were open during the lockdown, making inter-district travel unnecessary.
Inter-district and intra-district compliance differ across districts

In order to compare inter-district to intra-district mobility, we repeat the process detailed in the methodology section, in this case filtering for trips whose origin and destination are in the same district. It is clear from Fig. 3.5 that the magnitude of the decrease within districts is relatively muted when compared to the inter-district changes in Fig. 3.4. In this case, Western Area Urban, Port Loko and Moyamba experience decreases in intra-district travel that equal or exceed 40%. While the other districts experienced decreases in intra-district mobility, Koinadugu is the only one to show an increase. This could be due to a number of factors such as poverty, population size and density, and employment opportunities. We explore the relationship between poverty and population size below.

When looking at inter-district mobility, poverty is negatively correlated with compliance, while population size is positively correlated with compliance

To understand the relationship between intensity of compliance and socioeconomic factors, we explore the correlation between the change in the $ADT_D^T$ measure for
3 Using Public-Private Data to Understand Compliance …
Fig. 3.4 Change in average distance travelled (% change, intra-district)
districts between February and April and two socioeconomic measures: poverty and population size. According to Figs. 3.6 and 3.7, the districts with the largest decreases in mobility between February and April are those that have relatively lower poverty rates and larger populations. This result is in line with research on the impact of poverty on compliance. Ravallion (2020) writes that if COVID-19 confinement measures do not include proper public support, they can cause large migration flows among the poor and vulnerable, potentially spreading the virus at an even faster rate. Yechezkel et al. (2020) also find that poor regions exhibit lower and slower compliance with COVID-19 restrictions and recommend that more resources be devoted to supporting impoverished communities. Our analysis is focused on the district level, which limits our ability to understand the unique patterns that influence the relationship between socioeconomic indicators and compliance with COVID-19
I. Ndubuisi-Obi et al.
Fig. 3.5 Change in average distance travelled (% change, inter-district)
social distancing measures. Future work will explore group- and individual-level characteristics at finer-grained geographic resolutions.
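The district-level comparison behind Figs. 3.6 and 3.7 amounts to correlating the change in ADTDT with each socioeconomic measure. The sketch below shows one way to compute such a correlation with NumPy. The five data points are hypothetical values invented for illustration, not the chapter's data, and the use of the Pearson coefficient is an assumption, since the text does not name the statistic used.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical district-level values, for illustration only.
pct_change_adtdt = [-70.0, -66.0, -65.0, -40.0, -30.0]  # February to April
poverty_rate = [0.20, 0.35, 0.45, 0.60, 0.70]
population = [1_000_000, 450_000, 570_000, 400_000, 300_000]

# Treat the size of the mobility reduction as a proxy for compliance.
reduction = [-c for c in pct_change_adtdt]

r_poverty = pearson_r(reduction, poverty_rate)
r_population = pearson_r(reduction, population)
print(f"reduction vs poverty:    r = {r_poverty:.2f}")
print(f"reduction vs population: r = {r_population:.2f}")
```

With only a handful of districts, a coefficient of this kind describes an association and cannot by itself separate the effect of poverty from that of population size.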
3.5 Discussion
Government reception and use
The method and results discussed above were presented by members of our research team to a high-level panel set up in the Government of Sierra Leone during the COVID-19 pandemic. Policy-makers on this panel were interested in the differential impact of their travel restrictions on mobility patterns between and within districts. Once it was shared with the panel, this work
Fig. 3.6 Relationship between change in ADTDT between February and April and poverty
provided them with evidence of the impact of their lockdown measures. This evidence informed later decisions on lockdowns, travel bans, and curfews. The panel became interested in understanding the socio-economic factors that might influence the intensity of compliance. Our team used relevant census data to demonstrate the correlations shown in Figs. 3.6 and 3.7. The panel recommended adding other public datasets, such as places of interest, school locations, and hospital locations, as a way of understanding how these resources and markers influence the effect size. Another main concern of our government partners was the need to build capacity within the government to facilitate the CDR analysis. We addressed this concern in two ways: working closely with collaborators in the government and offering training on CDR data analysis. From the outset, our team worked very closely with members of the DSTI team on both the data processing and data analytics tasks. This close collaboration added value to our research by providing us with rich domain knowledge and expertise that guided our research methods and data
Fig. 3.7 Relationship between change in ADTDT between February and April and population
processing techniques. It was through this close collaboration that our team identified the anomalies in the data that resulted from the imposition of a price floor by the national telecommunications regulator. We also developed an interactive training for all interested project partners and stakeholders that introduced the tools and techniques our team used to process and analyze CDR data. Even before the first case of COVID-19 was detected in Sierra Leone in March, members of our research team were participating in daily calls with the ICT pillar of the GoSL's COVID-19 response team. Consistent participation in these daily calls, and our willingness to provide feedback on challenges confronting the COVID-19 response team even when not directly related to our CDR research, helped to build trust and led to an initial collaboration on a citizen needs assessment conducted by our research team and civil society partners in collaboration with the GoSL's Directorate of Science, Technology, and Innovation (DSTI), the Ministry of Finance, and Statistics Sierra Leone. Analyses of these data from a nationally representative survey of citizens
were then used by the Presidential Taskforce on COVID-19, in combination with other evidence, to design Sierra Leone's lockdown framework, which further helped to build trust and create a framework for collaborative learning and evidence-based policy making.
CDR data and ethics
A number of ethical and privacy concerns should be taken into consideration when working with CDR data, particularly when using them to address disasters. Each record in a CDR database represents an individual; this information is extremely sensitive and should be protected. Sean McDonald, in his evaluation of the use of CDRs during the Ebola crisis in Liberia, describes how the privacy of citizens is often breached during disaster events due to a lack of care by actors who are often more concerned about responding in a timely manner than about protecting the privacy of those in the data (McDonald 2016). McDonald notes that privacy is breached in several ways:
• by not ensuring the data is on a protected server and only provided to those who understand the ethical concerns,
• by performing analysis that could expose individuals, potentially causing them harm,
• by using the data for purposes other than the disaster context.
Addressing privacy concerns with Africell and GoSL was an important part of the research because we wanted to ensure that the GoSL understood potential privacy issues and also to reassure Africell and GoSL that the privacy of Africell users would be maintained. Our research team took several actions to address the ethical and privacy issues of the data:
• Storing the data on an encrypted server, which was only accessible to those who had completed a human subjects review.
• Developing a license agreement which clearly defines the analysis we will perform with the data.
• Aggregating individual CDR records so that individuals cannot be identified through our analysis.
Developing these standards of ethical practice took some time, and it would be useful for the GoSL and Africell to create standard agreements so that this data can be acquired in a more timely manner during future disasters.
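The aggregation step listed among the privacy measures can be illustrated with a short pandas sketch. It is a hypothetical example rather than the procedure the research team used: the column names, the function name, and in particular the minimum group size of 10 users are assumptions introduced here. The idea is to publish only district-by-day summaries and to suppress any cell that is based on too few distinct users.

```python
import pandas as pd


def aggregate_trips(trips, min_users=10):
    """Aggregate trip records to district-day cells and suppress small cells.

    trips: DataFrame with columns user_id, date, origin_district, distance_km.
    min_users is an assumed disclosure threshold, not a value from the chapter.
    """
    cells = (
        trips.groupby(["date", "origin_district"])
        .agg(
            n_users=("user_id", "nunique"),
            mean_distance_km=("distance_km", "mean"),
        )
        .reset_index()
    )
    # Drop cells where so few users contribute that individuals could stand out.
    return cells[cells["n_users"] >= min_users].reset_index(drop=True)


# Toy records (invented): 12 users in district A, 2 users in district B.
toy = pd.DataFrame(
    {
        "user_id": [f"u{i}" for i in range(12)] + ["v1", "v2"],
        "date": ["2020-04-05"] * 14,
        "origin_district": ["A"] * 12 + ["B"] * 2,
        "distance_km": [5.0] * 12 + [8.0] * 2,
    }
)
safe = aggregate_trips(toy, min_users=10)
print(safe)
```

The output contains no user identifiers, and district B is withheld because only two users contribute to it.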
3.6 Conclusion
This study set out to analyze whether CDR data could be used to determine the intensity of compliance with mobility restrictions imposed by the Government of Sierra Leone soon after the first case of COVID-19 was identified in the country at the end of March. We processed the CDR data into a dataset of stops and trips, and this data was transformed to create a mobility indicator, the Average Distance Traveled Per District Per Day (ADTDT). We find that ADTDT decreased in close
relationship to the dates on which mobility restrictions were imposed. We also demonstrate that the differences in the intensity or magnitude of compliance across districts could be explained by socioeconomic differences. When studying the change in ADTDT between February and April, we find that mobility decreased less in low-income and less populated areas. These results appear to show that there is a relationship between rural poverty and the need to maintain travel even under travel restrictions. Our research illustrated that by working closely with the GoSL and Africell we not only increased the utility of our analysis but also built capacity in the government to use CDR data for policy. Further research should look at the potential reasons that rural low-income communities in Sierra Leone saw smaller reductions in mobility. Do these populations need to travel for essential resources, including food? Is there potential for the Sierra Leone government to help provide those resources in order to reduce travel for vulnerable populations? The work helped to highlight the need for the government to address these important issues.
Acknowledgements We are immensely grateful to the team at DSTI for their support in framing and championing our research agenda. Finally, we are indebted to the team at Africell Sierra Leone for their contribution of CDR data to this research effort.
References
Bargain O, Aminjonov U (2020) Poverty and COVID-19 in Developing Countries. Groupe de Recherche en Economie Théorique et Appliquée (GREThA). Available at: https://ideas.repec.org/p/grt/bdxewp/2020-08.html
Caceres N, Romero LM, Benitez FG (2020) Exploring strengths and weaknesses of mobility inference from mobile phone data vs. travel surveys. Transportmetrica A: Transp Sci 16(3):574–601. https://doi.org/10.1080/23249935.2020.1720857
Cottrill CD et al (2013) Future mobility survey: experience in developing a smartphone-based travel survey in Singapore. Transp Res Record: J Transp Res Board 2354(1):59–67. https://doi.org/10.3141/2354-07
Doyle C et al (2019) Predicting complex user behavior from CDR based social networks. Inf Sci 500:217–228. https://doi.org/10.1016/j.ins.2019.05.082
Iacus S et al (2020) Mapping Mobility Functional Areas (MFA) using mobile positioning data to inform COVID-19 policies: a European regional analysis. Available at: https://op.europa.eu/publication/manifestation_identifier/PUB_KJNA30291ENN. Accessed 26 October 2020
Jiang S, Ferreira J, Gonzalez MC (2017) Activity-based human mobility patterns inferred from mobile phone data: a case study of Singapore. IEEE Trans Big Data 3(2):208–219. https://doi.org/10.1109/TBDATA.2016.2631141
Leo Y et al (2016) Socioeconomic correlations and stratification in social-communication networks. J R Soc Interface 13(125):20160598. https://doi.org/10.1098/rsif.2016.0598
Liu L et al (2009) Understanding individual and collective mobility patterns from smart card records: a case study in Shenzhen. In: 2009 12th International IEEE Conference on Intelligent Transportation Systems (ITSC), St. Louis: IEEE, pp 1–6. https://doi.org/10.1109/itsc.2009.5309662
McDonald SM (2016) Ebola: a big data disaster privacy, property, and the law of disaster experimentation. The Centre for Internet and Society, Bengaluru and Delhi, India
Östh J, Shuttleworth I, Niedomysl T (2018) Spatial and temporal patterns of economic segregation in Sweden's metropolitan areas: a mobility approach. Environ Plann A: Econ Space 50(4):809–825. https://doi.org/10.1177/0308518X18763167
Phithakkitnukoon S, Smoreda Z, Olivier P (2012) Socio-geography of human mobility: a study using longitudinal mobile phone data. PLoS ONE. Edited by Y. Moreno, 7(6):e39253. https://doi.org/10.1371/journal.pone.0039253
Ravallion M (2020) Could pandemic lead to famine? Project Syndicate, 15 April. Available at: https://www.project-syndicate.org/commentary/covid19-lockdowns-threaten-famine-in-poor-countries-by-martin-ravallion-2020-04
Santamaria C et al (2020) Measuring the impact of COVID-19 confinement measures on human mobility using mobile positioning data: a European regional analysis. Available at: https://op.europa.eu/publication/manifestation_identifier/PUB_KJNA30290ENN. Accessed 26 October 2020
Silm S, Ahas R, Mooses V (2018) Are younger age groups less segregated? Measuring ethnic segregation in activity spaces using mobile phone data. J Ethnic Migr Stud 44(11):1797–1817. https://doi.org/10.1080/1369183X.2017.1400425
Wang Z, He SY, Leung Y (2018) Applying mobile phone data to travel behaviour research: a literature review. Travel Behav Soc 11:141–155. https://doi.org/10.1016/j.tbs.2017.02.005
Wesolowski A et al (2012) Quantifying the impact of human mobility on malaria. Science (New York, N.Y.), 338(6104), pp. 267–270. https://doi.org/10.1126/science.1223467
Williams NE et al (2015) Measures of human mobility using mobile phone records enhanced with GIS data. PLOS ONE. Edited by S. Gómez, 10(7):e0133630. https://doi.org/10.1371/journal.pone.0133630
Xu Y et al (2018) Human mobility and socioeconomic status: analysis of Singapore and Boston. Comput Environ Urban Syst 72:51–67. https://doi.org/10.1016/j.compenvurbsys.2018.04.001
Yechezkel M et al (2020) Human mobility and poverty as key drivers of COVID-19 transmission and control. preprint. Epidemiology. https://doi.org/10.1101/2020.06.04.20112417
Zheng Y, Xie X (2011) Learning travel recommendations from user-generated GPS traces. ACM Trans Intell Syst Technol 2(1):1–29. https://doi.org/10.1145/1889681.1889683
Zhou C, Xu Z, Huang B (2010) Activity recognition from call detail record: relation between mobile behavior pattern and social attribute using hierarchical conditional random fields. In: 2010 IEEE/ACM Int'l conference on green computing and communications & Int'l conference on cyber, physical and social computing (CPSCom), Hangzhou, China: IEEE, pp 605–611. https://doi.org/10.1109/greencom-cpscom.2010.141
Chapter 4
Development of a Spatio-Temporal Analysis Method to Support the Prevention of COVID-19 Infection: Space-Time Kernel Density Estimation Using GPS Location History Data
Haruka Kato
Abstract This study aims to develop a spatio-temporal analysis method to support planning for the prevention of COVID-19 infection. The method focuses on space-time kernel density estimation using GPS location history data, that is, GPS location data obtained at regular intervals from smartphones with the consent of the users. The research method was a panel data analysis for April 2019 and April 2020, using Ibaraki City as a case study. In April 2020, the Japanese government implemented a soft lockdown. As a result, this study developed a spatio-temporal analysis method that visualizes the space-time with high population density. Using this method, local governments can restrict people's lives by designating specific space-time areas. In addition, the method helps citizens to change their lifestyle behaviors and cooperate in the prevention of COVID-19 infection. The method is an alternative to the Japanese soft lockdown, which was based on an emergency declaration. This method will be utilized for data analysis in future smart cities.
Keywords Spatio-temporal analysis method · COVID-19 · Space-time kernel density estimation · GPS location history data · Smartphone GPS data
H. Kato (B) Graduate School of Human Life Science, Department of Housing and Environmental Design, Osaka City University, Osaka, JP 5588585, Japan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_4
4.1 Introduction
4.1.1 Background: Prevention of COVID-19
The background of this study is the prevention of COVID-19 infection in places with high population density. A critical feature of COVID-19 is its ability to infect humans through droplets (Leung et al. 2020). In addition, there is a risk of unintentional transmission of the virus by the infected person because the average number of days between
H. Kato
infection and onset of illness is 5.2 days (He et al. 2020). Therefore, it is effective to wear face masks because masks reduce the amount of virus inhaled by 60–80% (Ueki et al. 2020). However, since face masks cannot wholly prevent viral infections, governments worldwide have taken various measures based on social distancing to prevent a rapid increase in the number of infected people. Although contact matters more than density in the spread of the COVID-19 pandemic at the metropolitan scale (Hamidi et al. 2020), at the district scale, there is a medium association between population density and death rate due to COVID-19 infection (Bhadra et al. 2020). The research suggests that governments need to prevent, in advance, the formation of densely populated districts, as the number of infected people might spike in these areas. In Japan, the number of infected people has been increasing since February 2020. Therefore, the Japanese Central Government declared a state of emergency on April 7, 2020 (Prime Minister of Japan and His Cabinet 2020). However, Japanese emergency declarations are referred to as a soft lockdown, compared to the lockdowns in the USA and Europe (The Japan Times 2000). That is because the Japanese emergency declaration was based on requests without legal restrictions. Many Japanese people initially condemned the soft lockdown by the government. However, it could serve as a reference for other countries as a planning support method to balance the prevention of COVID-19 infections with the maintenance of residents’ economic and physical activity. The impact of the Japanese soft lockdown on residents’ daily lives is still being studied. For example, the soft lockdown has led to widespread adoption of telecommuting for work by adults and remote learning for children. Surveys using GPS data from smartphones showed that population density had decreased in central urban areas such as Tokyo Station and Osaka Station (Google 2020). 
On the other hand, unexpectedly high population densities have been reported in suburban cities (Mizuno Laboratory 2020). Based on this background, researchers need to investigate the high-density space-time caused by the Japanese soft lockdown in suburban cities as well as in central urban areas. Local governments need to refrain from restrictions on citywide lifestyle behaviors through emergency declarations if high-density space-time cannot be clarified. Such restrictions would be excessive for residents who have already actively changed their lifestyle behaviors. At the same time, they may ask too little of residents who have not made enough changes in their lifestyle behaviors. Moreover, residents may not feel incentivized to change their lifestyle behaviors because they do not know whether these changes are sufficient. Therefore, if this study establishes a method to visualize densely populated space-time, it may be possible for local governments to implement more effective measures for specific space-time areas. It may also be possible for citizens to change their lifestyle behaviors and cooperate to prevent COVID-19 infection. This method could be the basis for planning support to balance infection prevention and livelihood activities.
4 Development of a Spatio-Temporal Analysis Method …
4.1.2 Purpose
This study aimed to develop a spatio-temporal analysis method to support planning for the prevention of COVID-19 infections. The method focused on space-time kernel density estimation (STKDE) using GPS location history data (LH data). The research method is a panel data analysis for April 2019 and April 2020 using the case study of Ibaraki City in the Osaka Metropolitan suburban area. In April 2020, the Japanese government implemented the Japanese soft lockdown. The LH data is GPS location history data obtained at regular intervals of approximately 15 min from smartphones with the users' consent. As such, the LH data is big data that enables us to understand people's flow at each hour. By analyzing LH data, it is possible to study individual life behaviors as big data science. Using LH data, this study evaluated the effectiveness of STKDE, which is a method for estimating the kernel density distributed in spatio-temporal coordinates based on time geography. The STKDE enables us to visualize the space-time in which LH data are accumulated. In other words, the STKDE provides an intuitive understanding of the space-time with a high population density, as a measure to prevent COVID-19 infection. This study analyzed panel data from April 2020, when the Japanese soft lockdown was implemented, and April 2019, one year earlier. The analysis allows us to evaluate the effectiveness of the spatio-temporal analysis method. As a case study, this study analyzed Ibaraki City, a suburban city located midway between Osaka City and Kyoto City in the Osaka Metropolitan area, as shown in Fig. 4.1 (Kato and Kanki 2019). In April 2020, Ibaraki City experienced an increase in daytime population density because many citizens who had commuted to city centers before the Japanese soft lockdown no longer did so.
In fact, in April 2020, the information distributed by the Ibaraki City government frequently warned of increased population density in its parks (Ibaraki City Government 2020). By investigating the case of Ibaraki City, this study analyzed high-density space-time in a city.
Fig. 4.1 Location of Ibaraki City in the Osaka Metropolitan area
4.1.3 Literature Review
Measures to prevent COVID-19 infection are still in the process of research and development. Several types of smartphone apps are used as COVID-19 countermeasures, such as Bluetooth-based contact tracing, GPS-based tracking, symptom checks and self-diagnosis, and trustworthy information for the public (European Parliament 2020). One noteworthy resource is smartphone data, as smartphones have become popular in developed and developing countries. For example, Menni et al. (2020) evaluated the effectiveness of real-time tracking of self-reported symptoms to predict a potential COVID-19 infection. In addition, the effectiveness of contact tracing applications for smartphones is being evaluated (Hernández-Orallo et al. 2020; Yasaka et al. 2020). Based on these studies, the Japanese Central Government developed COCOA, an acronym for “COVID-19 Contact-Confirming Application,” using Bluetooth (Japanese MHLW 2000). COCOA has made it easier to identify close contacts when infected persons are known. The next step is to develop planning support methods to prevent the spread of the disease before the number of infected people spikes further. Therefore, the use of GPS big data from smartphones has attracted attention. Studies on smartphone GPS data were popularized by González et al. (2008), who pointed out that GPS location data can be used to understand people's movement patterns. In Japan, Seike et al. (2011) pointed out the potential of GPS big data. For example, Yamaguchi et al. (2017) analyzed lifestyle behavior changes during normal, disaster, and recovery periods using the Kumamoto earthquake disaster. GPS data from smartphones are also used for the prevention of COVID-19. For example, Google has disclosed changes in the flow of people in each city (Google 2020). Mizuno Laboratory (2020) has reported regional differences in self-restraint rates based on the difference between daytime and nighttime populations.
However, these studies use a “Mesh type of Floating Population Data,” which collects GPS data separated into meshes of each area. The novelty of this study is its focus on the lifestyle behaviors of residents using LH data, which collects the GPS data of each person. Using LH data, this study performs STKDE. Previously, Gao (2015) used LH data to elucidate human mobility patterns and urban dynamics. In Japan, Sato and Maruyama (2016) clarified visitors’ travel behavior through STKDE, but used massive GPS tracking rather than LH data. Based on the results, the novelty of the present study is that it develops a spatio-temporal analysis method, STKDE, using LH data. The spatio-temporal analysis is a method of analyzing people’s life behavior on a space-time map, which was developed by Hägerstrand (1970). This method allows us to determine the specific high-density space-time and to restrict citizens’ lives by designating spaces and time. Therefore, it is expected to be a new preventive planning method against COVID-19 infection as an alternative to the Japanese soft lockdown, which is based on the declaration of a state of emergency.
4.2 Materials and Method
4.2.1 GPS Location History Data
The data of this study are LH data, which are GPS location data obtained at regular intervals of approximately 15 min from smartphones with users' consent (Fig. 4.2). In Ibaraki City, the LH data comprise approximately 1,600,000 logs per day for approximately 12,000 individuals (Fig. 4.3). Since Ibaraki City's population is approximately 280,000, the LH data can be considered to cover approximately 4.2% of its citizens. In Japan, the Agoop Corporation provides anonymized data for research purposes (Agoop Corporation 2020). The LH data were obtained through contracts between
Fig. 4.2 GPS location history data
Fig. 4.3 Log data per day in Ibaraki City
Table 4.1 Data of GPS location history data
Log No    User ID    Year/Month/Day    Hour/Minutes    Latitude    Longitude    Others
1         AA         2019/4/1          0:00            YY          ZZ           …
2         BB         2019/4/1          0:00            YY’         ZZ’          …
3         CC         2019/4/1          0:00            YY”         ZZ”          …
4         AA         2019/4/1          0:01            YY          ZZ           …
:         :          :                 :               :           :            …
In Ibaraki City, Log: about 1,600,000, and User IDs: about 12,000 per day
Agoop Corporation and the Graduate School of Life Sciences at Osaka City University, to which the author belongs. The LH data provided by Agoop Corporation complies with the “Guidelines for the Use of Device Location Data” (LBMA Japan 2020), a common regulation for location data in Japan. For example, the guideline prohibits the use of GPS data for any purpose that identifies an individual user. The data are collected by obtaining consent from the smartphone users who installed particular applications regarding the type of data to be collected, purpose of use, provision to third parties, and privacy policy. In addition, smartphone users can stop sending the GPS location data at any time by changing their smartphones' basic settings. The LH data are mainly composed of the following attributes: daily ID, user ID, year, month, day, day of week, hour, minutes, latitude, longitude, operating system (OS, Android/iOS), country, GPS accuracy, speed, mesh ID, estimated transportation means, and gender. The user IDs are 96-character alphanumeric codes that have been anonymized. The user ID is also a permanent ID assigned to each device, which enables panel data analysis across dates. However, the data do not include information that could violate personal privacy, such as names, ages, and addresses. Of the LH data, this study used user ID, year/month/day, hour/minutes, latitude, and longitude, as shown in Table 4.1. This study demonstrated causal relationships with COVID-19 using the four data sets. The LH data include user IDs belonging to people who only passed through Ibaraki City, for example, on the way from Osaka to Tokyo. Therefore, this study extracted data on people living in Ibaraki City. Specifically, this study extracted user IDs whose LH data started after 3:00 AM at a location in Ibaraki City. 3:00 AM was selected because it is the time with the largest amount of LH data at which the movement speed is zero.
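The resident-extraction rule described above can be sketched as follows. This is a simplified illustration under several assumptions that are not in the original: the column names (`user_id`, `timestamp`, `lat`, `lon`) and the function name are invented, the logs cover a single day, and the city is approximated by a rough rectangular bounding box with placeholder coordinates, whereas a real analysis would test points against the municipal boundary polygon.

```python
import pandas as pd

# Placeholder bounding box, for illustration only (not an official boundary).
CITY_BBOX = {"lat_min": 34.77, "lat_max": 34.93, "lon_min": 135.50, "lon_max": 135.62}


def resident_ids(logs, bbox=CITY_BBOX, start_hour=3):
    """Return user IDs whose first log at or after start_hour lies in the box.

    logs: one day of records with columns user_id, timestamp, lat, lon.
    """
    after = logs[logs["timestamp"].dt.hour >= start_hour]
    # Earliest qualifying log per user.
    first = after.sort_values("timestamp").groupby("user_id").first()
    inside = first["lat"].between(bbox["lat_min"], bbox["lat_max"]) & first[
        "lon"
    ].between(bbox["lon_min"], bbox["lon_max"])
    return set(first[inside].index)


# Toy logs (invented): AA starts the day inside the box, BB starts outside,
# CC is inside before 3:00 but outside at the first log after 3:00.
toy_logs = pd.DataFrame(
    {
        "user_id": ["AA", "BB", "CC", "CC"],
        "timestamp": pd.to_datetime(
            [
                "2019-04-01 03:05",
                "2019-04-01 03:10",
                "2019-04-01 02:00",
                "2019-04-01 03:20",
            ]
        ),
        "lat": [34.82, 34.70, 34.82, 34.70],
        "lon": [135.57, 135.50, 135.57, 135.50],
    }
)
residents = resident_ids(toy_logs)
print(residents)
```

Filtering on the early-morning position is a heuristic for home location, so people who work night shifts in the city or who stay overnight elsewhere would be misclassified.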
Note that the LH data provided by Agoop are not obtained from all smartphones in Japan. For this reason, Ichinose et al. (2018) proposed an equation to estimate the actual floating population by analyzing the relationship between actual traffic surveys and LH data. However, this study does not aim to measure actual population density, but to clarify the high-density space-time in the city. Furthermore, the penetration rate of smartphones in Japan is high in every generation. Therefore, LH data were used in this study.
4.2.2 Space-Time Kernel Density Estimation
Using LH data, we developed a spatio-temporal analysis method to support the prevention of COVID-19 infection. The method was STKDE, as shown in Fig. 4.4. The STKDE visualizes densely populated space-time. In other words, by analyzing STKDE, this study could confirm the space-time of human accumulation and the change in high-density time (Nakaya and Yano 2010). Kernel density estimation, on which STKDE is based, is a type of home range analysis for animals by radio-telemetry. The home range is defined as the area traversed by the individual in the course of carrying out normal activities of food gathering, mating, and caring for young (Burt 1943). Radio-telemetry analysis is a method of tracking individual animals using GPS, and it has contributed significantly to the development of animal ecological research since the 1960s. Kernel density estimation, which describes the utilization probability of a point by a probability density function, is a well-known method for home range estimation by radio-telemetry. In this study, the method was extended to STKDE to analyze the human home range. STKDE was calculated using Eq. (4.1), in which the kernel functions K_s and K_t are defined using the Epanechnikov kernel (Epanechnikov 1969) in Eqs. (4.2) and (4.3). For the STKDE analysis, Nakaya (2020) developed the Space-Time Density Tool for ArcGIS Pro and made it available to the public. This program was used in the analysis of this study to estimate the kernel density from the LH data and create iso-surfaces. In the analysis, the density values of the iso-surfaces were set to 95%, 99%, and 99.9%, meaning that in this study, iso-surfaces were created for the top 5%, 1%, and 0.1% of the density distribution.
\[
\hat{f}(x, y, t) = \frac{1}{n b_s^{2} b_t} \sum_{i=0}^{n} K_s\!\left(\frac{x - x_i}{b_s}, \frac{y - y_i}{b_s}\right) K_t\!\left(\frac{t - t_i}{b_t}\right) \tag{4.1}
\]
\[
K_s(u, v) = \begin{cases} \dfrac{2}{\pi}\left(1 - \left(u^{2} + v^{2}\right)\right), & u^{2} + v^{2} < 1 \\ 0, & \text{otherwise} \end{cases} \tag{4.2}
\]
\[
K_t(w) = \begin{cases} \dfrac{3}{4}\left(1 - w^{2}\right), & w^{2} < 1 \\ 0, & \text{otherwise} \end{cases} \tag{4.3}
\]
where $\hat{f}(x, y, t)$ is the density estimate at the location $(x, y, t)$, $n$ is the number of points, $b_s$ is the spatial bandwidth, $b_t$ is the temporal bandwidth, $u$ is the difference rate between longitude $x_i$ and $x_{i+1}$, $v$ is the difference rate between latitude $y_i$ and $y_{i+1}$, and $w$ is the difference rate between the time $t_i$ and $t_{i+1}$.
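To make Eqs. (4.1) to (4.3) concrete, the following NumPy sketch evaluates the estimator at a single space-time location. It is a direct, unoptimized transcription written for illustration, not the Space-Time Density Tool used in the study. The function and variable names are assumptions, the sum is taken over the n observed points, and the kernel arguments are the scaled offsets between the evaluation location and each observed point, as they appear in Eq. (4.1).

```python
import numpy as np


def k_s(u, v):
    """Spatial Epanechnikov kernel, Eq. (4.2)."""
    r2 = u**2 + v**2
    return np.where(r2 < 1.0, (2.0 / np.pi) * (1.0 - r2), 0.0)


def k_t(w):
    """Temporal Epanechnikov kernel, Eq. (4.3)."""
    return np.where(w**2 < 1.0, 0.75 * (1.0 - w**2), 0.0)


def stkde(x, y, t, xi, yi, ti, bs, bt):
    """Space-time kernel density at (x, y, t), following Eq. (4.1).

    xi, yi, ti: coordinates and times of the observed points.
    bs, bt: spatial and temporal bandwidths (same units as the inputs).
    """
    xi, yi, ti = (np.asarray(a, dtype=float) for a in (xi, yi, ti))
    n = xi.size
    u = (x - xi) / bs
    v = (y - yi) / bs
    w = (t - ti) / bt
    return float(np.sum(k_s(u, v) * k_t(w)) / (n * bs**2 * bt))


# Toy example (invented points): two logs close together, one far away.
px = [0.0, 0.1, 5.0]
py = [0.0, 0.0, 5.0]
pt = [9.0, 9.5, 20.0]
near = stkde(0.0, 0.0, 9.0, px, py, pt, bs=1.0, bt=2.0)
far = stkde(10.0, 10.0, 9.0, px, py, pt, bs=1.0, bt=2.0)
print(near, far)
```

Because both kernels vanish outside the bandwidths, only points within distance b_s and time b_t of the evaluation location contribute, which is why the density at the distant location is zero.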
Fig. 4.4 The Space-time kernel density estimation
4.2.3 Panel Data Analysis for COVID-19
This study performed panel data analysis for the periods before and during the COVID-19 pandemic. The results indicate the effectiveness of STKDE using LH data as a preventive measure against the spread of infection. For the study period, this study compared April 2020, when Japan implemented its soft lockdown, with April 2019, one year earlier. At the same time, cities worldwide, including Paris and London, were also in lockdown. Therefore, it is valid to analyze the periods April 2019 and April 2020. During April 2020, 0 to 3 Ibaraki citizens were infected with COVID-19 (Osaka Prefecture 2020). This number of cases was minimal compared to that in other cities in Osaka Prefecture. Therefore, it is valid for this study to analyze Ibaraki City as a case study.
4.3 Results
4.3.1 Visualizing Space-Time Kernel Density
Space-time kernel density in April 2019 and April 2020 was analyzed using STKDE with LH data. Figure 4.5 shows the space-time kernel density on April 1, 11, 21, and 30, 2019. Figure 4.6 shows the space-time kernel density on April 1, 11, 21, and 30, 2020. The results are discussed in terms of the iso-surfaces for the top 5%, 1%, and 0.1% of the density distribution. First, focusing on the iso-surfaces for the top 5% of the density distribution, Figs. 4.5 and 4.6 show no significant change between April 2019 and April 2020. Similarly, focusing on the iso-surfaces for the top 1% of the density distribution, Figs. 4.5 and 4.6 show no significant change between April 2019 and April 2020. The results suggest that the total number of residents in Ibaraki City was not affected by the COVID-19 pandemic. However, Figs. 4.5 and 4.6 also show that the high-density space-time changed between April 2019 and April 2020 when focusing on the iso-surfaces for the top 0.1% of the density distribution. In April 2019, the iso-surfaces for the top 0.1% of the density distribution were formed during the morning and evening hours at the JR Ibaraki Station and Hankyu Ibaraki-city Station, which are the central stations in Ibaraki City. In April 2020, however, the iso-surfaces for the top 0.1% of the density distribution were formed throughout the city during the daytime hours, as well as at the JR Ibaraki Station and Hankyu Ibaraki-city Station in the morning and evening hours. For example, Fig. 4.6 shows that the iso-surfaces for the top 0.1% of the density distribution were formed around the South Ibaraki Station on April 1, 2020, Nishigawara Park on April 11, 2020, Ibaraki Central Park on April 21, 2020, and South Ibaraki Station on April 30, 2020. The iso-surfaces for the top 0.1% of the density distribution formed at the JR Ibaraki Station disappeared on April 30, 2020.
The results suggest that residents may not have used the stations to commute to the central cities because of the prevalence of remote work and other activities.
Fig. 4.5 STKDE with LH data in 2019
H. Kato
4 Development of a Spatio-Temporal Analysis Method …
Fig. 4.6 STKDE with LH data in 2020
Furthermore, the results suggest that they did not go to the central cities, but may have gone to city parks in Ibaraki city during the daytime. The work validated the alerts issued by the Ibaraki city government (2020) for the parks. As a result, this study visualized the space-time kernel density. In particular, the research clarified the space-time of high population density focusing on iso-surfaces for the top 0.1% of the density distribution.
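The two operations behind the figures in this subsection, estimating a space-time kernel density and finding the density value that bounds the top share of the distribution, can be sketched as follows. This is a minimal illustration rather than the tool used in the study: it assumes a product of Epanechnikov kernels in space and time, following the general form described by Nakaya and Yano (2010), and the function names, bandwidths `hs` and `ht`, and the event tuples are hypothetical.

```python
import math

def epanechnikov_2d(u, v):
    """2-D Epanechnikov kernel on the unit disc (integrates to 1)."""
    r2 = u * u + v * v
    return (2.0 / math.pi) * (1.0 - r2) if r2 < 1.0 else 0.0

def epanechnikov_1d(w):
    """1-D Epanechnikov kernel on [-1, 1] (integrates to 1)."""
    return 0.75 * (1.0 - w * w) if abs(w) < 1.0 else 0.0

def stkde(x, y, t, events, hs, ht):
    """Space-time kernel density at (x, y, t).

    events: list of (xi, yi, ti) observations (assumed input format).
    hs, ht: spatial and temporal bandwidths (assumed parameters).
    """
    n = len(events)
    total = sum(
        epanechnikov_2d((x - xi) / hs, (y - yi) / hs)
        * epanechnikov_1d((t - ti) / ht)
        for xi, yi, ti in events
    )
    return total / (n * hs * hs * ht)

def top_share_threshold(densities, share):
    """Smallest density among the top `share` fraction of voxel values.

    Voxels at or above this value lie inside the iso-surface for that share
    (for example, share=0.001 for the top 0.1%).
    """
    ranked = sorted(densities, reverse=True)
    # round() guards against floating-point noise before taking the ceiling
    k = max(1, math.ceil(round(share * len(ranked), 9)))
    return ranked[k - 1]
```

In practice the density would be evaluated on a regular grid of (x, y, t) voxels, and the threshold for each share would then define the iso-surface drawn in the space-time cube.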
4.3.2 Effectiveness as a Planning Support Method
The impacts of the Japanese soft lockdown on people’s daily life behavior were analyzed using the spatio-temporal analysis method. For the analysis, this study plotted the space-time kernel density on April 13, 2020, the day in April 2020 on which the most distinctive high-density space-time was formed (Fig. 4.7). By April 13, one week after the state of emergency was declared in Japan on April 7, 2020, schools and many public facilities were closed. In addition, many companies and some government offices were encouraged to have their staff telecommute. However, Japanese emergency declarations were based on requests without legal restrictions. Therefore, citizens lived self-restrained lives while taking measures to prevent COVID-19 infection. For this reason, the declaration was called the Japanese soft lockdown. Fig. 4.7 shows that iso-surfaces for the top 0.1% were formed not only at stations but also in parks, as in Fig. 4.6, specifically in Ibaraki Central Park, Nishigawara Park, and Iwakura Park. These parks are so large that they have been designated as disaster evacuation areas. It is thought that citizens came to these large parks to exercise while trying to avoid high density. One social problem was the rapid decrease in physical activity among older adults, who went out less because of the state of emergency (Yamada et al. 2020). This method allowed us to determine the specific high-density space-time. With this method, local governments can restrict citizens’ activities by designating specific space-time areas. Such information is required by managers of public spaces (Project for Public Spaces 2020). The information distributed by the Ibaraki City government frequently warned of increased population density in its parks (Ibaraki City Government 2020).
Following this publicity, the number of Ibaraki citizens infected with COVID-19 ranged from 0 to 3 during April 2020, and a spike in cases was kept under control (Osaka Prefecture 2020). This number of cases was minimal compared to the number of infected people in Osaka Prefecture. The causes are still being studied, including the possibility of insufficient PCR testing in Japan. However, the results suggest that the method has contributed to preventing the spread of COVID-19 infection and to health maintenance. Therefore, this method is expected to be a new preventive planning method that balances the prevention of COVID-19 infection with the maintenance of residents’ economic and physical activity. The method is an alternative to the Japanese soft lockdown based on the declaration of a state of emergency.
Fig. 4.7 STKDE of April 13, 2020 (Monday)
4.4 Conclusion
This study aimed to develop a spatio-temporal analysis method to support planning related to the prevention of COVID-19 infection. The method employed was STKDE using LH data, which allowed us to determine the specific high-density space-time. With this method, local governments can restrict people’s activities by designating specific space-time areas. From August 6 to 20, 2020, Osaka Prefecture restricted the nighttime operation of restaurants and bars in a specific area, the Minami district of Osaka City. In the Minami district, the number of infected young people had increased rapidly. These types of measures have been implemented in many countries worldwide. Such measures are essential before the number of infected people increases. The method is an alternative to the Japanese soft lockdown based on a declaration of emergency. Therefore, this method could be the basis for ways to balance infection prevention measures and livelihood activities. However, the LH data cannot be acquired in real time owing to the limitations on its use for research purposes. In the future, it is hoped that spatio-temporal analyses can be provided mainly by organizations that provide LH data. In particular, the theory can be further developed by re-examining the STKDE proposed in this study from the LH data provider side. For example, STKDE needs an easy-to-use user interface like Google Earth because its results are shown in 3D space-time coordinates. Bluetooth-based contact tracing applications help identify close contacts once infected persons are identified. Unlike contact tracing applications, STKDE using LH data cannot trace people infected with COVID-19. However, STKDE is a helpful way to prevent COVID-19 infection because it gives policymakers a planning support method for COVID-19 infection prevention measures. In addition, this method helps citizens change their lifestyles and shares the effects of those changes with them.
The results suggest that the method helps citizens, both young and elderly, to cooperate in the prevention of COVID-19 infection. With the COVID-19 pandemic having lasted more than a year, many policymakers have struggled to secure the public’s cooperation based on the right data. This method is expected to be one of the tools for securing that cooperation. A comparison of the tools is shown in Table 4.2. Bluetooth-based contact tracing has been the primary countermeasure for COVID-19. However, by using GPS-based tracking as well, more effective countermeasures can be achieved. Note, however, that the developed spatio-temporal analysis method does not display the actual population density. It is necessary to elucidate the expansion coefficient for estimating the actual population density from LH data. In addition, this study developed a method using only residents’ data, excluding visitors to the city, because the case study was conducted in Ibaraki City. However, visitor data are essential for applying the spatio-temporal analysis method in central urban areas. Therefore, a future challenge is to improve the method to include not only residents but also visitors to the city. This method can be utilized for data analysis in future smart cities because visualizing life behaviors gives residents an incentive to change their lifestyles. After all, residents can then tell whether the changes in their living behaviors are sufficient. For example, this method can be used to analyze changes in residents’ lifestyle behaviors in districts that implement automated driving technology or online medical care support technology. In particular, visualizing the effects of social experiments on future technologies is essential for local government leaders to make budget decisions. I hope that many researchers will further develop the spatio-temporal analysis method.

Table 4.2 Comparison of tool features for COVID-19 countermeasures
Item | Bluetooth-based contact tracing | STKDE using LH data (GPS-based tracking)
Tool features | (1) Experts in epidemiological research can identify concentrated contacts when infected persons are known | (1) Policymakers can apply a planning support method for COVID-19 infection prevention measures; (2) Citizens can change their lifestyle and share its effects
Data | Smartphone Bluetooth data | Smartphone GPS data (GPS location history data)
Effective use | After the infected persons have been identified | Before the infected persons are identified

Acknowledgements The author is deeply grateful for the support of the Urban Development Department of the Ibaraki City Government (Local Government in the Osaka Metropolitan Area). This research was funded by JSPS KAKENHI (Grant number 19K23558). The author would like to thank Editage (www.editage.com) for English language editing.
References
Agoop Corporation. Homepage. Available at: https://www.agoop.co.jp/en/. Accessed 21 Sept 2020
Bhadra A, Mukherjee A, Sarkar K (2020) Impact of population density on Covid-19 infected and mortality rate in India. Model Earth Syst Environ. https://doi.org/10.1007/s40808-020-00984-7
Burt WH (1943) Territoriality and home range concepts as applied to mammals. J Mammal 24(3):346–352
Epanechnikov VA (1969) Non-parametric estimation of a multivariate probability density. Theory Probab Appl 14(1):153–158
European Parliament (2020) National COVID-19 contact tracing apps. Available at: https://www.europarl.europa.eu/RegData/etudes/BRIE/2020/652711/IPOL_BRI(2020)652711_EN.pdf. Accessed 19 Dec 2020
Gao S (2015) Spatio-temporal analytics for exploring human mobility patterns and urban dynamics in the mobile age. Spatial Cognition Comput 15:86–114
González MC, Hidalgo CA, Barabási AL (2008) Understanding individual human mobility patterns. Nature 453:779–782
Google. COVID-19 Community Mobility Reports. Available at: https://www.google.com/covid19/mobility/. Accessed 7 Sept 2020
Hägerstrand T (1970) What about people in regional science? Papers Region Sci Assoc 24:6–21
Hamidi S, Sabouri S, Ewing R (2020) Does density aggravate the COVID-19 pandemic? J Amer Plann Assoc. https://doi.org/10.1080/01944363.2020.1777891
Hernández-Orallo E, Manzoni P, Calafate CT, Cano JC (2020) Evaluating how smartphone contact tracing technology can reduce the spread of infectious diseases: The case of COVID-19. IEEE Access 8:99083–99097
He X, Lau EHY, Wu P et al (2020) Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat Med 26:672–675
Ibaraki City Government (2020) Requests for use of the park in conjunction with measures to prevent the spread of COVID-19 (June 1, 2020) [in Japanese]. Available at: https://www.city.ibaraki.osaka.jp/saigai/shingatacoronavirusinformation/oshirase/kouen/47738.html. Accessed 25 Sept 2020
Ichinose R, Maruyama Y, Nagata S (2018) Estimation of the number of floating population based on location data collected from smartphones [in Japanese]. J Jpn Soc Civil Eng, Ser. A1 (Struct Eng Earthq Eng (SE/EE)) 74(3):I_210–I_219
Japanese MHLW (Ministry of Health Labor and Welfare). Request to install the COVID-19 Contact-Confirming Application COCOA. Available at: https://www.mhlw.go.jp/content/10900000/000647649.pdf. Accessed 7 Sept 2020
Kato H, Kanki K (2019) Evaluating the walkability of urban sprawl areas in the future using scenario planning: smart decline for Ibaraki City in the northern Osaka metropolitan area. Proceedings of 16th international conference on computers in urban planning and urban management, China, 211–230
LBMA Japan. Guidelines for the use of device location data [in Japanese]. Available at: https://www.lbmajapan.com/guideline. Accessed 21 Sept 2020
Leung NHL, Chu DKW, Shiu EYC, Chan KH, McDevitt JJ et al (2020) Respiratory virus shedding in exhaled breath and efficacy of face masks. Nat Med 26(5):676–680
Menni C, Valdes AM, Freidin MB, Sudre CH, Nguyen LH, Drew DA et al (2020) Real-time tracking of self-reported symptoms to predict potential COVID-19. Nat Med 26(7):1037–1040
Mizuno Laboratory. COVID-19 Special Site: Visualization of the percentage of people who refrain from going out. Available at: http://research.nii.ac.jp/~mizuno/. Accessed 7 Sept 2020
Nakaya T, Yano K (2010) Visualizing crime clusters in a space-time cube: An exploratory data-analysis approach using space-time kernel density estimation and scan statistics. Trans GIS 14(3):223–239
Nakaya T (2020) Space-time density tool for ArcGIS Pro Manual Ver. 1.0 [in Japanese]. Available at: https://nakaya-geolab.com/tools/. Accessed 21 Sept 2020
Osaka Prefecture. Latest updates on COVID-19 in Osaka. Available at: https://covid19-osaka.info/en. Accessed 25 Sept 2020
Prime Minister of Japan and His Cabinet (2020) [COVID-19] Declaration of a state of emergency in response to the novel coronavirus disease (April 7, 2020). Available at: https://japan.kantei.go.jp/ongoingtopics/_00018.html. Accessed 25 Sept 2020
Project for Public Spaces (2020) You Asked: How Can Public Space Managers Help Fight COVID-19? (December 19, 2020). Available at: https://www.pps.org/article/you-asked-we-answered-how-can-public-space-managers-help-fight-covid-19. Accessed 19 Dec 2020
Sato T, Maruyama T (2016) A time-space analysis of smartphone-based travel survey data applying kernel density estimation [in Japanese]. J City Plann Inst Jpn 51(2):192–199
Seike T, Mimaki H, Hara Y, Odawara R, Nagata T, Terada M (2011) Research on the applicability of “mobile spatial statistics” for enhanced urban planning [in Japanese]. J City Plann Inst Jpn 46(3):451–456
The Japan Times (2020) The coronavirus and Japan’s Constitution, The Japan Times (April 14, 2020). Available at: https://www.japantimes.co.jp/opinion/2020/04/14/commentary/japan-commentary/coronavirus-japans-constitution/. Accessed 25 Sept 2020
Ueki H, Furusawa Y, Iwatsuki-Horimoto K, Imai M, Kabata H, Nishimura H, Kawaoka Y (2020) Effectiveness of face masks in preventing airborne transmission of SARS-CoV-2. mSphere 5:e00637–20
Yamada M, Kimura Y, Ishiyama D, Otobe Y, Suzuki M, Koyama S, Kikuchi T, Kusumi H, Arai H (2020) Effect of the COVID-19 epidemic on physical activity in community-dwelling older adults in Japan: A cross-sectional online survey. J Nutr Health Aging 3:1–3
Yamaguchi H, Okumura M, Kaneda H, Habu K (2017) Damage and recovery process of Kumamoto earthquake in daily staying patterns: observation by mobile phone GPS data [in Japanese]. J Jpn Soc Civil Eng Ser D3 (Infrastr Plann Manag) 73(3):I_105–I_117
Yasaka TM, Lehrich BM, Sahyouni R (2020) Peer-to-peer contact tracing: Development of a privacy-preserving smartphone app. JMIR mHealth uHealth 8(4):
Part II
Big Data and Smart Cities
Chapter 5
A Review of Spatial Network Insights and Methods in the Context of Planning: Applications, Challenges, and Opportunities
Xiaofan Liang and Yuhao Kang
Abstract With the rise of geospatial big data, new narratives of cities based on spatial networks and flows have replaced the traditional focus on locations. While plenty of research has empirically analyzed network structures, a state-of-the-art synthesis of applicable insights and methods of spatial networks in the planning context is lacking. In this chapter, we reviewed the theories, concepts, methods, and applications of spatial network analysis in cities and their insights for planners in four areas of concern: spatial structures; urban infrastructure optimization; indicators of economic wealth, social capital, and residential mobility; and public health control (especially COVID-19). We also outlined four challenges that planners face when turning planning knowledge from spatial networks into action: data openness and privacy, linkage to direct policy implications, lack of civic engagement, and the difficulty of visualizing spatial networks and integrating them with GIS. Finally, we envisioned how spatial networks can be integrated into a collaborative planning framework.
Keywords Spatial networks · Social network analysis · Social capital · Mobility · Collaborative planning · Public health
X. Liang (B), Friendly Cities Lab, School of City and Regional Planning, Georgia Institute of Technology, Atlanta, GA, USA
Y. Kang, GeoDS Lab, Department of Geography, University of Wisconsin-Madison, Madison, WI, USA
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_5

5.1 Introduction
The world we live in today is increasingly connected through physical, social, and economic ties. Our society has evolved into a “Network Society”, a characterization from Castells (1996), as being more decentralized, open, and organized as “a space of flows”. These connections, which often congregate in large cities, are the keys to unlock economic agglomeration, innovations, and productivity beyond population growth and limited resources (Bettencourt 2013). Therefore, urban planning, as a discipline that makes future-oriented decisions for cities, should grapple with this new social structure and its implications for human activities and urban forms. Yet, planning theories and tools that address cities as systems of networks and flows only started to be integrated around the turn of the century (Albrechts and Mandelbaum 2007; Batty 2013). This perspective on cities comes from complexity science theory, which has been embraced in both collaborative (and communicative) planning theories (Innes and Booher 2018; Graham and Healey 1999; Hajer and Zonneveld 2000) and a positivist approach to “the science of cities” (Bettencourt 2013; Batty 2013). Collaborative planning theorists treat networks as the mechanism to empower stakeholders to build consensus (Innes and Booher 1999; Booher and Innes 2002), while rationalists (Batty 2013) see networks as the mechanism to model and explain urban growth and morphology. Tools that support this vision of cities come from a wide range of disciplines (e.g., agent-based models, social network analysis), most of which rely on interacting individuals to effect changes in collective phenomena. Though critics argue that the technoscientific urbanism thinking embedded in the analytical and modeling approaches over-emphasizes technology’s role in solving urban problems (Brenner and Schmid 2015; Kitchin 2016), attention to networks and flows as the background knowledge for planning keeps rising, especially for planning smart cities (Yang and Yamagata 2020). Metrics derived from network measures such as vulnerability, reliability, and accessibility have also become the new normative goals for planning (Kim 2020).
In this context, spatial network analysis can be interpreted both as a tool to materialize a vision of the city based on connectivity and as a planning process that respects organic organization and emergence from interconnected individuals and infrastructure. The spatial network (and its analysis) is a “fuzzy concept” (Fainstein 2005, p. 223) that lacks clear disciplinary boundaries. Its development has diverging roots in the fields of network science (i.e., graph theory), social science (i.e., social network analysis), and geography (i.e., relational geography). Studies from network science perspectives tend to elaborate on the mathematical and formal elements of networks (Barthélemy 2011). Social network analysis (SNA), which originated in sociology, has a long tradition of focusing on how network structures inform power dynamics, group identity, and social relations (Knoke and Yang 2019). While network science and SNA may not explicitly engage with agents’ spatial locations, relational geography theories affirm that human relationships are spatially embedded in a global system of economic activities and institutional practices (Bathelt and Glückler 2003; Radil and Walther 2018). Agents can reap the benefits of innovation, knowledge diffusion, and collaborations across geography or exert influence at a distance through their connections (Bathelt and Glückler 2005). However, these virtual interactions are still not free from the constraints of geography (Laniado et al. 2018). Here, we define a spatial network as a graph structure in which nodes can be geolocated (Andris and O’Sullivan 2019). Spatial networks have two common types, planar and non-planar networks, which are differentiated by the spatial attributes of their edges. In a planar network, both vertices and edges are geographically embedded
(Andris and O’Sullivan 2019). No edges will cross each other, and each intersection is a node, such as those in the road networks and electrical grid networks. A non-planar network has edges that overlap with each other without creating new nodes at the crossing (Andris and O’Sullivan 2019). These edges can represent non-spatial connections, such as social media friendship and telecommunications. Some networks can stand in between the two typologies, as their edges are somewhat constrained by geography but not completely embedded, such as flight routes and mobility patterns (Andris and O’Sullivan 2019). The separation of planar and nonplanar networks has implications on network statistics (e.g., degree distribution), as outlined in Haggett and Chorley’s early classic Network Analysis in Geography (Haggett and Chorley 1969) and later summarized in Barthélemy (2011). In this chapter, we contributed to the existing literature by associating and contextualizing the insights from spatial networks literature to planning concerns and described the challenges and opportunities to integrate spatial network analysis into planning practices. The following sections are divided into three parts. First, we grouped the literature based on themes and reviewed the methods and insights from empirical studies (with a focus on the 2010–2020 period) that applied spatial network analysis to study urban dynamics. Second, we outlined four challenges of spatial network analysis revealed by the current work along with potential solutions. We concluded the paper with a vision framework for spatial networks to empower collaborative planning and examples to integrate spatial networks for different urban stakeholders. We focused our reviews on spatial networks that engage human activities.
5.2 A Review of Spatial Networks Literature for Planning Knowledge
The literature selected for this review represents major lines of research on spatial networks that have been applied to increase planning knowledge. We consolidated the contents into four subsections based on their research goals.
5.2.1 Revealing Spatial Structures
Spatial network analysis has been used to reveal inter-city and intra-city spatial structures. The word “structure” is often referenced vaguely in the literature. Therefore, we divided the interpretations of structure into three kinds (relationship, hierarchy, and cluster) based on the research questions and the methods in the literature we reviewed. Many studies touched on more than one interpretation in their analysis. Literature focusing on revealing the relational structure tends to ask how social and economic relationships and processes intersect with space. Thus, methods applied in
this line of work rely heavily on GIS and network visualizations to contrast the spatial and social patterns, or construct statistical indices to measure the effects of interest. For example, Graif et al. (2017) explained crime through the neighborhood effect, a network of neighborhoods connected through gang members’ daily commutes and social ties, rather than through isolated incidents. People also form their own “gang turf” by consistently visiting certain places in cities for consumption or activities. Geographers and sociologists also refer to this spatial pattern, prescribed by people’s mobility traces and social media check-in data, as the activity space. People or neighborhoods with various demographic labels can have distinct activity space patterns that reinforce social (Wang et al. 2015) and spatial segregation (Netto et al. 2015; Prestby et al. 2020). The extent of segregation can be measured through the reciprocity and directions of mobility flows (Phillips et al. 2019) or the overlap of demographic attributes (Blumenstock and Fratamico 2013). When connections in spatial networks indicate more than just relationships, the network nodes start to embody influence and power. As a result, hierarchies arise as some nodes become more “central” or “influential”. These characteristics can be measured through the distribution of network metrics (e.g., centrality) and the roles of nodes (e.g., hubs and spokes) at different levels. The three most widely applied centrality measures are degree centrality, closeness centrality, and betweenness centrality (Borgatti 2005). High degree centrality implies influence and vibrancy, as the node attracts a large number of connections. High closeness centrality indicates accessibility, as the node can reach all other nodes in a few hops. Betweenness centrality concentrated in a few nodes represents low resilience, as the shortest paths between other nodes have to pass through those few nodes.
These indices have been used as benchmarks to compare and group cities with similar characteristics in the urban context. For example, Taylor and Derudder’s (2004) world city network literature ranked cities by the number of offices of global firms (weighted by the offices’ importance) and found New York and London to be the alpha cities of the world. Crucitti et al. (2006), on the other hand, used the centrality measures of urban street networks to distinguish self-organized from planned cities. Transportation literature has also leveraged these indicators to assess urban infrastructure and predict traffic flows. For instance, Guimera et al. (2005) observed scale-free small-world characteristics in the flight network through the degree distribution of airports, which means that people can reach any airport in the world within a few layovers. Derrible (2012) measured the betweenness centrality of twenty-eight metro systems and found that a system becomes more resilient (i.e., betweenness centrality is more evenly distributed) as the number of stations increases. Though betweenness centrality is a prominent measure for predicting traffic flows, its efficacy is debatable (Gao et al. 2013a). Also, a node with high centrality in one measure may not be high in another, which gives different roles to the nodes (Neal 2011; Wang et al. 2011). Besides centrality measures, various network typologies and statistics have been devised to capture the core-periphery structure in spatial networks, such as hubs and spokes (O’Kelly 1998), single-allocation networks (Liu et al. 2018) (or single-linkage analysis (Neal 2012)), and rich-club coefficients (Ansell et al. 2016; Wei et al. 2018).
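The three centrality measures described above can be sketched in a few lines. The following is a minimal illustration for small, connected, unweighted, undirected networks stored as adjacency lists; the function names and the input format are assumptions made for this sketch, and betweenness is computed with Brandes' algorithm. In practice, established network libraries would normally be used instead.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree_centrality(adj):
    """Share of the other nodes each node is directly connected to (needs n >= 2)."""
    n = len(adj)
    return {u: len(adj[u]) / (n - 1) for u in adj}

def closeness_centrality(adj):
    """(n - 1) divided by the sum of hop distances; assumes a connected graph."""
    n = len(adj)
    result = {}
    for u in adj:
        total = sum(bfs_distances(adj, u).values())
        result[u] = (n - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """Brandes' algorithm, normalized to [0, 1] for undirected graphs."""
    n = len(adj)
    score = {u: 0.0 for u in adj}
    for s in adj:
        order, dist = [], {s: 0}
        preds = {u: [] for u in adj}
        sigma = {u: 0 for u in adj}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}
        for v in reversed(order):     # accumulate dependencies back toward s
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if v != s:
                score[v] += delta[v]
    # every unordered pair was counted from both endpoints
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {u: score[u] * scale for u in adj}
```

For a hub-and-spoke network with one hub and four spokes, all three measures equal 1 for the hub, while every spoke has a betweenness of 0, which illustrates the concentration of shortest paths that the text associates with low resilience.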
When hierarchy is established, a natural next step is to cluster cities (or other nodes) to find communities in the networks. Questions asked in this type of work concern classifying various spatial units and visualizing enclaves that may not be spatially contiguous but are deeply connected. Fortunato (2010) summarized a series of community detection algorithms that can be applied to delineate areas in a network into fine-grained, non-overlapping regions. The key idea behind community detection is to separate nodes into distinct groups so as to minimize the within-group difference. One class of algorithms relies on traditional graph partitioning methods to either divisively remove edges that bridge between groups (e.g., the Girvan-Newman algorithm) or additively combine groups with similar characteristics (e.g., hierarchical clustering) (Fortunato 2010). The other class is based on optimizing modularity (i.e., the quality of a network partition) or related quality functions, such as Fastgreedy (Clauset et al. 2004), Spinglass (Reichardt and Bornholdt 2006), Walktrap (Pons and Latapy 2005), and Infomap (Rosvall and Bergstrom 2008). Each method has its own strengths with respect to network size, edge directions, running speed, and efficacy (Yang et al. 2016). These methods are implemented in R and Python packages and in network software like Gephi. In terms of applications, Ratti et al. (2010) adopted a spectral clustering algorithm (a way to optimize modularity) to “redraw” the boundaries of Great Britain based on cellphone interaction network data. Follow-up studies expanded the application to France, the UK, Italy, Belgium, Portugal, Saudi Arabia, Ivory Coast, China, and Singapore (Sobolevsky et al. 2013; Liu et al. 2014; Zhong et al. 2014). The same approach was also applied at the neighborhood level to reveal intra-city dynamics (Gao et al. 2013b; Liu et al. 2015).
The generated borders tend to conform with administrative boundaries at the regional and national levels, but less so at the neighborhood level, where emerging urban centers are shaped by people’s telecommunications and daily commutes (Sobolevsky et al. 2013; Liu et al. 2014; Zhong et al. 2014). One limitation of the modularity-based community detection method is that it only clusters based on the edges’ origins and destinations and thus can be subject to the modifiable areal unit problem (MAUP) and edge effects. More recent work uses linear units, such as GPS trajectories and streets, as the new focus for clustering (Kempinska et al. 2018; Zhu et al. 2017). Boeing’s OSMnx package (2017) significantly lowered the difficulty of analyzing street networks by automating data download, processing, visualization, and analysis with OpenStreetMap. An example application used this package to examine street networks’ orientation, configuration, and entropy to group and compare cities with different urban forms (Boeing 2019).
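The modularity score that several of the community detection methods in this subsection optimize can be sketched as follows. This is a minimal illustration of the standard Newman-Girvan modularity for an undirected, unweighted network without self-loops; the function name, the edge-list input, and the node-to-label dictionary are assumptions made for this sketch, not the implementation used in any of the cited studies.

```python
def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges: list of (u, v) pairs, each undirected edge listed once (assumed format).
    community: dict mapping every node to a community label (assumed format).
    Q sums, over communities, the share of edges inside the community minus
    the share expected if edges were placed at random with the same degrees.
    """
    m = len(edges)
    degree = {}
    internal = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            label = community[u]
            internal[label] = internal.get(label, 0) + 1
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    return sum(
        internal.get(label, 0) / m - (d / (2 * m)) ** 2
        for label, d in degree_sum.items()
    )
```

For two triangles joined by a single bridging edge, placing each triangle in its own community gives Q = 5/14, whereas placing all six nodes in one community gives Q = 0; a modularity-optimizing method would therefore prefer the two-community partition.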
5.2.2 Optimization of Urban Infrastructure We wanted to feature a particular line of application that focused on planar networks and optimization methods. The research questions relevant to planning revolve around network traversals (e.g., most efficient network path to cover the problem area) and topology change (e.g., improve traffic and human flows with minimum road network changes).
In transit planning, the network design of public transit is often framed as a bilevel optimization problem. Planners who wish to build the most efficient transit networks (first-level optimization) must first resolve the users’ demand to travel most efficiently (second-level optimization). Spatial networks of human mobility can inform planners of user demands for transit and the current traffic bottlenecks. The transit routes can also be framed as spatial networks to be optimized for structural efficiency. For example, with GPS embedded in bicycles and biking docks, O-D flows and user trajectories were collected to indicate demand and connectivity at various stops and thus to optimize the locations of biking docks and the construction of bike lanes (Caggiani et al. 2019; Gu et al. 2019; Bao et al. 2017; Mesbah et al. 2012). Other applications include reducing bus stop redundancy (Shimamoto et al. 2010), evaluating operational efficiency in bus networks (Delmelle et al. 2012), and designing the spatial allocation of logistics centers (Yang et al. 2020). The application of spatial networks to topology change is still developing. Brelsford et al. (2018) demonstrated how to use network topology methods to suggest the most efficient routes for new roads, which may help reblock urban slums and improve residents’ accessibility to critical infrastructure.
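The bilevel framing described above can be illustrated with a deliberately small sketch. At the lower level, each origin-destination flow takes its cheapest path on the current network; at the upper level, the planner picks the one candidate link that minimizes total user travel cost. The function names, the link tuples, and the demand dictionary are hypothetical, and the sketch ignores congestion, budgets, and route frequencies, all of which real transit network design models would include.

```python
import heapq

def build_adj(links):
    """Undirected adjacency list from (u, v, cost) links."""
    adj = {}
    for u, v, w in links:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj

def shortest_cost(adj, source, target):
    """Dijkstra travel cost from source to target; inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == target:
            return cost
        if cost > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            new_cost = cost + w
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v))
    return float("inf")

def total_user_cost(links, demand):
    """Lower level: every O-D flow travels on its cheapest path."""
    adj = build_adj(links)
    return sum(flow * shortest_cost(adj, o, d) for (o, d), flow in demand.items())

def best_new_link(existing, candidates, demand):
    """Upper level: choose the single candidate link with the lowest total user cost."""
    return min(candidates, key=lambda link: total_user_cost(existing + [link], demand))
```

With existing links A-B and B-C (cost 5 each) and a demand of 10 trips from A to C and 2 trips from A to B, a new direct A-C link of cost 4 lowers the total cost from 110 to 50, whereas a faster A-B link of cost 3 only lowers it to 86, so the direct link is selected.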
5.2.3 Indicators of Economic Wealth, Social Capital, and Residential Mobility
Studies of spatial structure or urban infrastructure tend to treat spatial network edges as homogeneous and assume they come from a single source. In fact, the density and types of edges, and the attributes of the nodes they connect to, all affect the social and economic health of individuals and neighborhoods. The foundational study by Eagle et al. (2010) was the first of its kind to study the economic impacts of network structure empirically. Their findings indicated that the diversity of a neighborhood’s communication connections is strongly associated with its economic development. Evidence from highway transportation networks (Li et al. 2020) also confirmed this positive association. Spatial networks can also inform an individual’s (or a neighborhood’s) level of social capital based on the possession of far-reaching or local ties. For instance, Facebook friendship data showed that counties with a smaller percentage of friends within 100 miles (e.g., San Francisco) are more likely to have higher social capital, social mobility, average income, and education levels (Bailey et al. 2018). At the individual level, these social supports can be maintained in sparse and transitive networks and are unaffected by residential moves (Viry 2012), though the effect may be mediated by income or race. At the neighborhood level, social capital can be measured by the diversity and serendipity of visits, which form the basis of the social diversity index in Hristova et al. (2016). The entanglement between race, income, and interpersonal network can also influence people’s residential mobility and neighborhood mobilizations (Prestby et al.
5 A Review of Spatial Network Insights and Methods …
2020). Conversely, the opportunity to grow and maintain a social network is partially contingent on the locations of the individual and of their ties. As Van Eijk (2010) found, poor people were more likely to have local ties. Living in a mixed (income or race) neighborhood did not necessarily bring resource-rich connections, because interactions with resourceful neighbors were lacking. Poor residents were also more likely to form close-knit, kin-based social networks in geographic proximity, which inhibits information transmission and mobilization for change. Therefore, the destruction of low-income communities is both “convenient,” due to the lack of organized resistance, as evident in stories of 1960s urban renewal (Gans 1962), and detrimental, as it tears down locally maintained social capital. The lack of ties outside poor residents’ own communities may also explain why they were the last to evacuate their homes (or did not evacuate at all) during natural disasters, given the difficulty of relocating (Metaxa-Kakavouli et al. 2018).
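Two of the indicators discussed in this section, the diversity of an area's ties and the share of ties that are local, can be computed in a few lines. The sketch below is only an illustration under stated assumptions: it uses normalized Shannon entropy as one common way to express tie diversity, which is not necessarily the exact formula of the cited studies, and the function names and input layout are hypothetical. The 100-mile radius follows the threshold mentioned in the text.

```python
import math


def tie_diversity(contact_volumes):
    """Normalized Shannon entropy of communication volume across contacts.

    Returns a value near 1.0 when volume is spread evenly over many contacts
    and a value near 0.0 when it is concentrated on one contact.
    """
    volumes = [v for v in contact_volumes.values() if v > 0]
    k = len(volumes)
    if k < 2:
        return 0.0  # diversity is undefined for fewer than two contacts
    total = sum(volumes)
    entropy = -sum((v / total) * math.log(v / total) for v in volumes)
    return entropy / math.log(k)


def share_of_local_ties(tie_distances_miles, radius_miles=100.0):
    """Fraction of ties whose other end lies within the given radius."""
    if not tie_distances_miles:
        return 0.0
    local = sum(1 for d in tie_distances_miles if d <= radius_miles)
    return local / len(tie_distances_miles)


if __name__ == "__main__":
    # Hypothetical call volumes from one neighborhood to four other areas.
    print(tie_diversity({"a": 10, "b": 10, "c": 10, "d": 10}))
    print(tie_diversity({"a": 97, "b": 1, "c": 1, "d": 1}))
    print(share_of_local_ties([5, 50, 150, 400]))
```

Either indicator can then be joined to income or education data by area to examine the associations described above.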
5.2.4 Public Health Control
The Healthy Cities Movement, which started 40 years ago, has increased planners’ attention to managing public health in urban space. The outbreak of the COVID-19 global pandemic has provided a unique context for applying spatial network analysis to public health control and for measuring the effects of policies that constrain human mobility. The spatial structures and urban hierarchies embedded in spatial networks generate natural pathways for contact-based diseases to spread. Performing spatial network analysis helps governments and urban planners evaluate populations at risk, capture epidemiologically relevant behaviors, measure the effects of policies, and support decision-making such as emergency response and the evaluation and allocation of medical resources (Bajardi et al. 2011; Grantz et al. 2020). Research has postulated that highly connected cities may be the first to be infected. Thus, spatial networks can be useful for tracking disease transmission and anticipating populations at risk, as found in road (Strano et al. 2018), migration (Fan et al. 2020), and travel networks (Lai et al. 2020). Several human mobility portals and datasets were created to support spatial network visualization of transmission pathways and of human responses to policies (Warren and Skillman 2020; Gao et al. 2020; Pepe et al. 2020; Kang et al. 2020). Researchers have investigated how different lockdown strategies and intervention scenarios (e.g., social distancing) affect the spread of the disease and economic conditions by constructing human mobility flow networks in various countries, including China (Lai et al. 2020), Italy (Bonaccorsi et al. 2020; Gatto et al. 2020), France (Pullano et al. 2020), the UK (Galeazzi et al. 2020), and the U.S. (Holtz et al. 2020). The availability of spatial network (e.g., mobility) data at the local level also enabled place-based epidemic modeling.
Models that integrated spatial network structure and data (e.g., business foot traffic) were more successful at explaining the spatial heterogeneity in epidemic transmission across different regions and neighborhoods (Thomas et al. 2020; Hou et al. 2020; Peng et al. 2020). All these studies
X. Liang and Y. Kang
demonstrate the potential of spatial network structure for understanding human behaviors in response to the disease and suggest the importance of incorporating spatial networks into health care planning.
In addition to policy evaluation, the COVID-19 public health crisis also recontextualized the flow characteristics associated with points of interest (POIs). As existing studies suggest, mining POI characteristics and urban functions, such as the density of visits, the socioeconomic diversity of customers, and the types of interactions in place, is essential for urban planning (Yuan et al. 2014; Pei et al. 2014; McKenzie et al. 2015; Gao et al. 2017). In the context of COVID-19, these characteristics are reinterpreted as transmission risks. For example, Benzell et al. (2020) incorporated the number of visits (and unique visits), time spent, and median distance traveled to different POIs to calculate transmission risk and to prioritize which kinds of places should be closed or reopened first. Knowing how POI characteristics may affect public health control, urban designers may in the future look into design solutions that address this challenge in advance.
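To make the idea of place-based epidemic modeling on a mobility network more concrete, the following is a minimal sketch of a deterministic metapopulation SIR model. It is not the model of any study cited in this section. The assumptions are ours: `mobility[i][j]` is the share of region i residents' contact time spent in region j (each row sums to 1), and the transmission rate `beta`, recovery rate `gamma`, and all numbers are hypothetical.

```python
def simulate_sir(populations, initial_infected, mobility,
                 beta=0.3, gamma=0.1, days=60):
    """Discrete-time SIR over regions coupled by a mobility matrix."""
    n = len(populations)
    S = [float(populations[i] - initial_infected[i]) for i in range(n)]
    I = [float(x) for x in initial_infected]
    R = [0.0] * n
    for _ in range(days):
        # Infectious people and all people present in each region j today.
        infectious_present = [sum(mobility[i][j] * I[i] for i in range(n))
                              for j in range(n)]
        people_present = [sum(mobility[i][j] * populations[i] for i in range(n))
                          for j in range(n)]
        # Infection pressure experienced by anyone spending time in region j.
        pressure = [beta * infectious_present[j] / people_present[j]
                    if people_present[j] > 0 else 0.0
                    for j in range(n)]
        for i in range(n):
            # Residents of i are exposed in proportion to where they spend time.
            exposure = sum(mobility[i][j] * pressure[j] for j in range(n))
            new_infections = min(S[i], S[i] * exposure)
            new_recoveries = gamma * I[i]
            S[i] -= new_infections
            I[i] += new_infections - new_recoveries
            R[i] += new_recoveries
    return S, I, R


if __name__ == "__main__":
    populations = [1000, 1000]
    seeds = [10, 0]  # the outbreak starts in region 0 only
    isolated = [[1.0, 0.0], [0.0, 1.0]]
    mixed = [[0.9, 0.1], [0.1, 0.9]]
    print(simulate_sir(populations, seeds, isolated))
    print(simulate_sir(populations, seeds, mixed))
```

Comparing the two runs shows the basic mechanism: with no mobility between regions the second region is never infected, while even a small share of cross-region contact lets the outbreak spread. Replacing the matrix with observed flows before and after a restriction is one simple way to explore policy scenarios.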
5.3 Challenges and Opportunities
The application of spatial network analysis in the planning context does not come without challenges. Here, we discuss four constraints of spatial network analysis that need to be addressed for its further adoption in planning routines.
5.3.1 Data Openness and Privacy
Spatial network data are neither widely accessible nor consistently documented. Generally speaking, there are three commonly used data sources: private sources, crowdsourced data, and official records released by governments. Of the fifty-seven empirical studies we reviewed in Sect. 5.2, thirty-three (58%) used data from private sources, such as cell phone calls, location-based social network services, micromobility traces, GPS trajectories, and social media check-ins. These data have the advantage of being fine-grained and of relatively high quality, but they are often bought one-off for research purposes. They were disproportionately applied to study mobility patterns (88%), especially in COVID-19 topics. Six papers (11%) collected data through crowdsourcing, including volunteered GPS trajectories, open-source street networks, surveys, or collaboration with public institutions. These data tend to be small and costly to collect but more informative for specific research questions. The OpenStreetMap (OSM) project, a well-known crowdsourced geographic data platform, provides detailed road networks across the world, though the data quality may vary by region. As for official government sources, eighteen studies (32%) used publicly available data such as tax records, LODES commute data from the U.S. Census, court records, street networks, bus routes, and smart cards. Obtaining some public data (e.g., air and train schedules) can involve time-consuming data scraping and cleaning. These public data also skew heavily toward physical and mobility networks and offer very little documentation of social networks. In addition, these data are often offered in aggregated formats, which are not fine-grained enough for planners’ place-based work.
The stark contrast in the numbers above reveals the shortage of crowdsourced and publicly available spatial network data, especially for mobility and relationships. Considering the wide range of applications to urban affairs, spatial network data should be considered a public good despite being collected through private channels. So, how can we encourage data openness to increase the accessibility of spatial network data? The City of Chicago provides an example of data openness through policymaking. Starting in 2018, all ride-share companies have been required by an ordinance to send routine reports to the City of Chicago, including the origins and destinations of trips (City of Chicago 2020). Products like Uber Movement also help transportation planners monitor traffic flow and increase road safety. During the COVID-19 outbreaks, multiple geospatial data companies (e.g., SafeGraph) also contributed free and open POI-based or tract-based foot traffic data for governments, non-profits, and researchers to download (Kang et al. 2020). Still, more public–private partnership models or policies need to be developed to keep spatial network data open.
Privacy, in particular geoprivacy, is another barrier to collecting large-scale and consistent spatial network data. This challenge is often offloaded to service vendors to resolve. Geoprivacy refers to an individual’s right to prevent the disclosure of sensitive personal locations such as their home, workplace, and travel trips (Kwan et al. 2004). Due to the rapid development of location-based services, information such as users’ location records and attributes is automatically collected or inferred from the spatial, temporal, and thematic characteristics of geographic information. Hence, it is necessary to “encrypt” individual location information to protect users from being identified.
Existing studies have proposed several potential solutions for protecting geoprivacy. The simplest is to aggregate fine-resolution data to coarser scales. This indeed preserves user privacy but also reduces the spatial resolution of the data (Montjoye et al. 2013). Another commonly used method is to group and mix the geographic data (e.g., trajectory points) from k different users into k different regions and then generate k-anonymized location information (Gruteser and Grunwald 2003). Such a method may hide the spatial information of the input data and neglect temporal and semantic attributes. A third is geomasking, which blurs users’ locations through perturbation and added noise so that location information is protected while spatial patterns are preserved (Gao et al. 2019). In addition, Rao et al. (2020) proposed a deep learning method using long short-term memory (LSTM) to generate a privacy-preserving synthetic trajectory that preserves the essential spatio-temporal attributes of the original trajectory (Liu et al. 2018). All these studies may enhance the privacy protection of location-related information.
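Two of the simpler protections described above, aggregation to a coarser grid and geomasking by random perturbation, can be sketched as follows. This is an illustrative sketch rather than the procedure of any cited study: the function names, the grid size, and the displacement bounds are hypothetical; the count threshold is only in the spirit of k-anonymity and is not the cloaking algorithm of Gruteser and Grunwald (2003); and the conversion of about 111,320 m per degree of latitude is a rough approximation.

```python
import math
import random
from collections import Counter

METERS_PER_DEGREE = 111_320.0  # rough approximation for one degree of latitude


def aggregate_to_grid(points, cell_size_deg):
    """Count (lat, lon) points per grid cell, discarding exact coordinates."""
    return Counter(
        (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        for lat, lon in points
    )


def suppress_small_cells(cell_counts, k):
    """Keep only cells holding at least k points (a simple count threshold)."""
    return {cell: n for cell, n in cell_counts.items() if n >= k}


def donut_mask(lat, lon, min_m, max_m, rng):
    """Displace a point by a random distance in [min_m, max_m] and random bearing."""
    distance = rng.uniform(min_m, max_m)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance * math.cos(bearing) / METERS_PER_DEGREE
    dlon = distance * math.sin(bearing) / (
        METERS_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


if __name__ == "__main__":
    # Hypothetical check-in locations.
    points = [(43.071, -89.401), (43.072, -89.402), (43.205, -89.405)]
    counts = aggregate_to_grid(points, cell_size_deg=0.01)
    print(suppress_small_cells(counts, k=2))
    print(donut_mask(43.071, -89.401, 100.0, 500.0, random.Random(0)))
```

The trade-off noted in the text is visible here: larger cells or larger displacement distances protect individuals better but blur the spatial patterns that planners need.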
5.3.2 Lack of Direct Policy Implications
A major critique of the spatial network literature is that the zeal to reveal spatial structures often does not lead to actual policy changes. Not many cities have changed their administrative boundaries according to the borders that emerge from spatial networks. One reason these insights remain on paper is the dynamic and uncertain nature of spatial structures derived from multiple sources (Steiger et al. 2015; Huang and Wong 2016). These boundaries are sensitive to the types of data collected and cannot represent the whole population. Spatial networks are also not the only way to delineate neighborhoods. They may produce narratives that compete with projects like Bostonography, which crowdsourced mental maps from users to represent conceptual neighborhoods (Woodruff 2013). Even if we take one of the derived spatial structures as the ground truth, very few papers go a step further to suggest a clear pathway for planners to act according to local conditions or to provide normative discussions of the results (Andris 2020). If we find spatial segregation in people’s activity spaces, should we respect such a structure or optimize it toward a healthier balance? If we know a city is at the margin of the urban hierarchy, should we intervene at all, and if so, how can we help the city move up the ladder?
We outline three potential ways to answer the questions above. One of the missing pieces in the existing literature is the evolution of urban network structures and the associated socioeconomic phenomena. The economic complexity research of Hidalgo and Hausmann (2009) and Hausmann et al. (2014) offers an excellent example of translating international trade network insights into concrete economic development suggestions for developing countries. The planning literature lacks an equivalent depth of network knowledge at the city level to direct local economic policies, though the work of Park et al. (2019) on labor flows and geo-industrial clusters has started to tackle this challenge. Another way to connect network insights to policies is to causally evaluate the impact of policies that change network links or, conversely, the effect of networks on policy implementation. Cao et al. (2013) and Liu et al. (2019) assessed the effect of the high-speed rail network on city-to-city travel time and air traffic distribution. Such research can help transportation planners quantify the costs and benefits of actions that increase connectivity. Andris (2020) and Goetz (2020) also pointed out that policies are often applied at the regional level but followed at the level of social networks, as seen in the successes and failures of COVID-19 interventions. Tracing policy implementation through social networks may reveal barriers or opportunities for planning actions to take place. Lastly, a more recent piece by Shelton and Poorthuis (2019) exemplified the power of contextualizing the spatial network structures of the inner city to challenge administrative boundaries. They made a compelling case for the City of Atlanta to reconsider the arrangement of Neighborhood Planning Units (a political legacy that lets neighborhoods rally and organize for their interests in urban planning) by tracing the historical evolution of neighborhoods and comparing it to the borders derived from big data. Validating spatial network insights with multiple sources and grounding them in context can further move policies forward.
5.3.3 Lack of Civic, Communicative, and Collaborative Engagement
When spatial network analysis first became popular, there was much excitement in the planning field about its applications in supporting communicative and collaborative planning theories (Albrechts and Mandelbaum 2007; Dempwolf and Lyles 2012). However, as we observed in recent empirical studies, very few consult or engage the information providers in interpreting and explaining the implications. The big geospatial data in spatial network research may produce the illusion that we have a representative picture of the whole population. What if there is a gap between how people conceive of their activity spaces and what their mobility trajectories show? Furthermore, most network insights do not directly serve the interests of local communities or non-profit organizations. While we often applaud the positive impacts of connectivity on social and economic welfare, we should not forget the duality of network edges: when an access point is not available, an edge such as a highway can also negatively affect the surrounding neighborhoods. The preferential attachment mechanism (i.e., the rich get richer) in scale-free networks also deepens inequality in the distribution of connectivity. How can communities use spatial network analysis to advocate for their rights and counter the privileged network discourse? How can we have more “human-in-the-loop” interpretations of spatial network insights?
The volunteered geographic information (VGI) literature in geography may provide some interactive models for planners to engage citizens in spatial network analysis. Participatory mapping and crowdsourcing can potentially be extended to spatial network information, though their validity can be difficult to confirm. A combination of recruited volunteers with GPS trackers and follow-up interviews can also provide valuable context to explain the motivations behind emerging patterns (Sila-Nowicka et al. 2016). Also, to generate counter-narratives, we need to think creatively about what constitutes alternative spatial network data. For instance, Andris et al. (2019) collaborated with the NGO Big Brothers Big Sisters of America to conceptualize mentorship pairs in a city as spatial networks and evaluated the impact of the mentoring program in bridging spatial gaps between places that would not otherwise be connected through commutes or demographic groups.
5.3.4 Difficulty to Automate Visualizations and Integrate with GIS
Geovisualization and maps are essential in geographic information representation and GISystems. A precise visualization presents the data intuitively and vividly, helping the audience understand the story and turn it into knowledge. Though researchers have integrated various spatial network data into GISystems (Andris 2016), visualizing spatial networks and interactions on maps still faces conceptual and methodological challenges (see Andris et al. 2018 for a comprehensive review). For example, traditional flow maps do not work well with large-scale non-planar networks, which can be too dense to display in 2D space and thus result in edge cluttering. In addition, nodes and edges in spatial networks take on multiple attributes (e.g., density, direction, divisions, and hierarchies of flows and the attributes of the destination nodes), which require more aesthetic support than just size and color in GIS.
Researchers have proposed several methods to address these challenges. A set of algorithms has been proposed for visual simplification to reduce cluttering based on aggregation, such as automating thresholds to filter data (Rae 2009); grouping points that are close by or have similar connectivity into one node (i.e., graph partitioning (Guo 2009) and spatial clustering (Zhu and Guo 2014; Von Landesberger et al. 2015)); bundling and summarizing edges that run in the same direction (i.e., edge bundling (Ersoy et al. 2011)); and algebraic multigrid (Wang et al. 2018). Studies have also explored how to represent more network attributes in visualization, such as direction (Yao et al. 2019) and temporal or step-wise changes of flows (i.e., alluvial diagrams (Rosvall and Bergstrom 2010)). Several powerful web-based visualization tools and packages have been developed for spatial network visualization, such as deck.gl (Wang 2019) and flowmap.blue (footnote 1). Compared with traditional static GIS maps, these web-based interfaces can dynamically visualize more network attributes, such as flow directions, the contrast between in-degree and out-degree for nodes, and all connections to one node. All these studies may benefit spatial network-related geovisualization.
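Two of the simplification strategies mentioned in this section, grouping nearby nodes into one node and filtering flows by a threshold, are easy to prototype before moving to a dedicated visualization tool. The sketch below is a simplified illustration and not the algorithm of any cited paper: the zone lookup, the threshold, and the choice to drop within-zone flows are our own assumptions.

```python
from collections import defaultdict


def aggregate_flows(flows, node_to_zone):
    """Merge origin-destination flows between nodes into flows between zones."""
    zone_flows = defaultdict(float)
    for (origin, destination), volume in flows.items():
        zone_o = node_to_zone[origin]
        zone_d = node_to_zone[destination]
        if zone_o != zone_d:  # within-zone flows would collapse to a point
            zone_flows[(zone_o, zone_d)] += volume
    return dict(zone_flows)


def filter_flows(flows, min_volume):
    """Keep only flows at or above a volume threshold to reduce edge clutter."""
    return {od: v for od, v in flows.items() if v >= min_volume}


# Hypothetical directed flows between four stops grouped into two zones.
FLOWS = {("a", "b"): 5, ("a", "c"): 20, ("b", "c"): 15,
         ("c", "a"): 2, ("d", "a"): 1}
ZONES = {"a": "Z1", "b": "Z1", "c": "Z2", "d": "Z2"}

if __name__ == "__main__":
    zone_flows = aggregate_flows(FLOWS, ZONES)
    print(zone_flows)
    print(filter_flows(zone_flows, min_volume=10))
```

Because the flows stay directed after aggregation, the result can still be drawn with arrows or with separate in-flow and out-flow symbols, which keeps some of the attributes the text lists as hard to show on static maps.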
1 https://flowmap.blue/
5.4 Conclusion: Envisioning a Collaborative Planning Model with Spatial Networks
Planning is a process of setting goals, identifying and assessing options, and developing strategies for achieving desired goals (Friedmann 1987). Collaborative planning theory further characterizes the planning process as an “interactive, communicative activity” (Innes 1995, p. 183) that engages diverse urban stakeholders (Booher and Innes 2002). Given the wide range of spatial network applications, we believe that spatial networks can inform collaborations across agents and regions that enrich both the rationality and the humanity of the outcomes. We propose a framework in which spatial networks contextualize collaborative planning theory in practice and resolve some of the challenges we mentioned above (see Fig. 5.1). In this idealized framework, private companies, governments, citizens, and researchers can all contribute spatial network data as a public good and participate in the generation of planning knowledge. The process will be facilitated by collaborations between various urban stakeholders that augment the interpretation of spatial network insights. In the end, planners can use network data and methods to convert planning knowledge into actions by evaluating planning alternatives, establishing expectations of planning impacts, or providing normative evidence to support activism. To further illustrate how different planning stakeholders can use spatial network analysis for decision-making or activism under the collaborative planning framework, we generated a table of examples for each stakeholder (see Table 5.1).
[Figure 5.1 diagram; recoverable labels: Private Company, Government or Public Service Provider, Crowdsourced Data, Researchers, Public-Private Partnership, Public-Academia Partnership, Planning Knowledge, Decisions]
Fig. 5.1 A collaborative planning framework with spatial networks
Table 5.1 Example applications of spatial networks for planning stakeholders
| Stakeholder | Spatial network data | Methods | Example actions |
| --- | --- | --- | --- |
| Regional commission | Population density of cities (Nodes) and multiple urban networks through mobility, employment history, capital investments, and social media (Edges) | Apply community detection to delineate boundaries for megaregions and validate the boundaries with multiple types of networks | Propose urban clusters that can be developed into megaregions and suggest regional planning of infrastructure that can support such an urban system |
| Public administration | Real-time mobility tracking data with LBSN services during COVID-19 | Use community detection to capture urban clusters that have high risks of inter-city disease transmission based on connectivity | Revisit and update a dynamic administrative boundary for urban management and public health control by establishing inter-regional councils |
| Tourism and cultural department | Crowdsourced tourism stop points (Nodes) and routes (Edges) from travel agencies and locals | Convert each tourism route into a chained trip, identify places that are most popular and most likely to be a transfer stop, and cluster the trips based on different themes | Develop an automated system to recommend customized trips based on users’ preferences |
| Economic planners | Economic network constructed through company branches (Nodes) and firm collaboration records (Edges) | Calculate the probability of growing an industry based on connectivity to the industrial profiles of other cities | Strategize which industries to invest in and grow based not only on a city’s endogenous resources but also on the relative advantages of the city in urban networks |
| Transportation planners | Fragmented bike lane network shapefile (Nodes and Edges) | Develop optimization algorithms to connect the fragmented bike lanes and evaluate the costs and benefits | Strategize where to build the next bike lane |
| Urban designers and modelers | Work-home commute data (Nodes and Edges); social media check-ins (Nodes) | Approximate the density of visits and volumes of traffic flows in and out of a target area | Calibrate the model of passenger flow at a design site to simulate the effect of the proposed plan |
| Housing and community planners | Survey of people’s active connections (Edges) within and outside of their current living communities (Nodes) | Construct a residential mobility index based on neighborhood characteristics and people’s social networks | Evaluate the effect of a mixed-income public housing project on people’s social capital and job access |
| Activists | Survey of teachers’ work-home locations (Nodes) and commute routes (Edges) | Visualize the information as a spatial network to show how far away and how scattered teachers currently live | Convince the local community to approve affordable housing for teachers so that they can live near the schools |
| NGO and social enterprise | Locations of coffee farms (Nodes) and their production relationships (Edges) | Interview to reveal collaborative and competitive relationships between local coffee farms and identify critical ties | Form local coffee producers’ co-op networks that improve economic and environmental resilience for individuals |
| Business owners and investors | Place-based or POI-based visit data (from social media) or LBSN services | Evaluate the sociodemographic profiles of people who frequent the locations and how well they match the business | Select optimal locations to open a new business |

In conclusion, we reported research that applied spatial network data and methods in the planning context. We found four common themes: revealing spatial structures; optimizing urban infrastructure; correlating network structure with economic development, social capital, and residential mobility; and monitoring public health. We also discussed the challenges of data openness and privacy, unclear policy implications, lack of civic engagement, and difficulty of integrating with GIS, which have impeded planners’ further adoption of spatial network analysis in daily routines.
However, we are optimistic that spatial networks can be the backbone of a collaborative planning framework. The hypothetical examples we provided from various urban stakeholders’ perspectives show spatial networks’ flexibility to facilitate efficiency, responsiveness, and inclusion in planning practices. Future research should address the applications in environmental, ecological, and energy networks that are not covered in this study.
References
Albrechts L, Mandelbaum S (2007) The network society: a new context for planning. Routledge
Andris C (2016) Integrating social network data into GISystems. Int J Geogr Inf Sci 30(10):2009–2031
Andris C (2020) Regions from social networks: what’s next? NARSC Newsl 8(1):7–10
Andris C, O’Sullivan D (2019) Spatial network analysis. Handb Reg Sci 1–24
Andris C, Liu X, Ferreira J Jr (2018) Challenges for social flows. Comput Environ Urban Syst 70:197–207
Andris C, Liu X, Mitchell J, O’Dwyer J, Van Cleve J (2019) Threads across the urban fabric: youth mentorship relationships as neighborhood bridges. J Urban Aff 1–16
Ansell C, Bichir R, Zhou S (2016) Who says networks, says oligarchy? Oligarchies as “Rich Club” networks. Connect-Off J Int Netw Soc Netw Anal 35(2):20–32
Bailey M, Cao R, Kuchler T, Stroebel J, Wong A (2018) Social connectedness: measurement, determinants, and effects. J Econ Perspect 32(3):259–280
Bajardi P, Poletto C, Ramasco JJ, Tizzoni M, Colizza V, Vespignani A (2011) Human mobility networks, travel restrictions, and the global spread of 2009 H1N1 pandemic. PLoS One 6(1):
Bao J, He T, Ruan S, Li Y, Zheng Y (2017) Planning bike lanes based on sharing-bikes’ trajectories. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1377–1386
Barthélemy M (2011) Spatial networks. Phys Rep 499(1–3):1–101
Bathelt H, Glückler J (2003) Toward a relational economic geography. J Econ Geogr 3(2):117–144
Bathelt H, Glückler J (2005) Resources in economic geography: from substantive concepts towards a relational perspective. Environ Plan A 37(9):1545–1563
Batty M (2013) The new science of cities. MIT Press
Benzell SG, Collis A, Nicolaides C (2020) Rationing social contact during the COVID-19 pandemic: transmission risk and social benefits of US locations. Proc Natl Acad Sci 117(26):14642–14644
Bettencourt LMA (2013) The origins of scaling in cities. Science 340(6139):1438–1441
Blumenstock J, Fratamico L (2013) Social and spatial ethnic segregation: a framework for analyzing segregation with large-scale spatial network data. In: Proceedings of the 4th annual symposium on computing for development, pp 1–10
Boeing G (2017) OSMnx: new methods for acquiring, constructing, analyzing, and visualizing complex street networks. Comput Environ Urban Syst 65:126–139
Boeing G (2019) Urban spatial order: street network orientation, configuration, and entropy. Appl Netw Sci 4(1):67
Bonaccorsi G, Pierri F, Cinelli M, Flori A, Galeazzi A, Porcelli F et al (2020) Economic and social consequences of human mobility restrictions under COVID-19. Proc Natl Acad Sci 117(27):15530–15535
Booher DE, Innes JE (2002) Network power in collaborative planning. J Plan Educ Res 21(3):221–236
Borgatti SP (2005) Centrality and network flow. Soc Netw 27(1):55–71
Brelsford C, Martin T, Hand J, Bettencourt LMA (2018) Toward cities without slums: topology and the spatial evolution of neighborhoods. Sci Adv 4(8):eaar4644
Brenner N, Schmid C (2015) Towards a new epistemology of the urban? City 19(2–3):151–182
Caggiani L, Camporeale R, Marinelli M, Ottomanelli M (2019) User satisfaction based model for resource allocation in bike-sharing systems. Transp Policy 80:117–126
Cao J, Liu XC, Wang Y, Li Q (2013) Accessibility impacts of China’s high-speed rail network. J Transp Geogr 28:12–21
Castells M (1996) The information age, vol 98. Oxford Blackwell Publishers
City of Chicago (2020) Transportation network providers—vehicles. https://data.cityofchicago.org/Transportation/Transportation-Network-ProvidersVehicles/bc6b-sq4u
Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):66111
Crucitti P, Latora V, Porta S (2006) Centrality measures in spatial networks of urban streets. Phys Rev E 73(3):36125
Delmelle EM, Li S, Murray AT (2012) Identifying bus stop redundancy: a GIS-based spatial optimization approach. Comput Environ Urban Syst 36(5):445–455
Dempwolf CS, Lyles LW (2012) The uses of social network analysis in planning: a review of the literature. J Plan Literat 27(1):3–21
Derrible S (2012) Network centrality of metro systems. PLoS One 7(7):
Eagle N, Macy M, Claxton R (2010) Network diversity and economic development. Science 328(5981):1029–1031
Ersoy O, Hurter C, Paulovich F, Cantareiro G, Telea A (2011) Skeleton-based edge bundling for graph visualization. IEEE Trans Visual Comput Graphics 17(12):2364–2373
Fainstein SS (2005) Local networks and capital building. The network society: a new context for planning, pp 222–228
Fan C, Cai T, Gai Z, Wu Y (2020) The relationship between the migrant population’s migration network and the risk of COVID-19 transmission in China—Empirical analysis and prediction in prefecture-level cities. Int J Environ Res Pub Health 17(8):2630
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
Friedmann J (1987) Planning in the public domain. From knowledge to action. Princeton University Press, Princeton, New Jersey
Galeazzi A, Cinelli M, Bonaccorsi G, Pierri F, Schmidt AL, Scala A, Pammolli F, Quattrociocchi W (2020) Human mobility in response to COVID-19 in France, Italy and UK. ArXiv Preprint http://arxiv.org/abs/2005.06341
Gans HJ (1962) The urban villagers. Group and Class in the life of Italian–Americans. Free Press of Glencoe, New York
Gao S, Wang Y, Gao Y, Liu Y (2013a) Understanding urban traffic-flow characteristics: a rethinking of betweenness centrality. Environ Plan 40(1):135–153
Gao S, Liu Y, Wang Y, Ma X (2013b) Discovering spatial interaction communities from mobile phone data. Trans GIS 17(3):463–481
Gao S, Janowicz K, Couclelis H (2017) Extracting urban functional regions from points of interest and human activities on location-based social networks. Trans GIS 21(3):446–467
Gao S, Rao J, Liu X, Kang Y, Huang Q, App J (2019) Exploring the effectiveness of geomasking techniques for protecting the geoprivacy of Twitter users. J Spat Inform Sci 19:105–129. https://doi.org/10.5311/JOSIS.2019.19.510
Gao S, Rao J, Kang Y, Liang Y, Kruse J (2020) Mapping county-level mobility pattern changes in the United States in response to COVID-19. SIGSPATIAL Spec 12(1):16–26
Gatto M, Bertuzzo E, Mari L, Miccoli S, Carraro L, Casagrandi R, Rinaldo A (2020) Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures. Proc Natl Acad Sci 117(19):10484–10491
Goetz SJ (2020) COVID-19, networks and regional science. NARSC Newsl 8(1):5–7
Graham S, Healey P (1999) Relational concepts of space and place: Issues for planning theory and practice. Eur Plan Stud 7(5):623–646
Graif C, Lungeanu A, Yetter AM (2017) Neighborhood isolation in Chicago: violent crime effects on structural isolation and homophily in inter-neighborhood commuting networks. Soc Netw 51:40–59
Grantz KH, Meredith HR, Cummings DA, Metcalf CJE, Grenfell BT, Giles JR, Mehta S, Solomon S, Labrique A, Kishore N, Buckee CO (2020) The use of mobile phone data to inform analysis of COVID-19 pandemic epidemiology. Nat Commun 11(1):1–8
Gruteser M, Grunwald D (2003) Anonymous usage of location-based services through spatial and temporal cloaking. In: Proceedings of the 1st international conference on mobile systems, applications and services, pp 31–42
Gu Z, Zhu Y, Zhang Y, Zhou W, Chen Y (2019) Heuristic bike optimization algorithm to improve usage efficiency of the station-free bike sharing system in Shenzhen, China. ISPRS Int J GeoInform 8(5):239
88
X. Liang and Y. Kang
Guimera R, Mossa S, Turtschi A, Amaral LAN (2005) The world-wide air transportation network: anomalous centrality, community structure, and cities’ global roles. Proc Natl Acad Sci 102(22):7794–7799 Guo D (2009) Flow mapping and multivariate visualization of large spatial interaction data. IEEE Trans Visual Comput Graphics 15(6):1041–1048 Haggett P, Chorley RJ (1969) Network analysis in geography, vol 1. Hodder Education Hajer M, Zonneveld W (2000) Spatial planning in the network society-rethinking the principles of planning in the Netherlands. Eur Plan Stud 8(3):337–355 Hausmann R, Hidalgo CA, Bustos S, Coscia M, Simoes A (2014) The atlas of economic complexity: mapping paths to prosperity. MIT Press Hidalgo CA, Hausmann R (2009) The building blocks of economic complexity. Proc Natl Acad Sci 106(26):10570–10575 Holtz D, Zhao M, Benzell SG, Cao CY, Rahimian MA, Yang J, Allen JNL, Collis A, Moehring AV, Sowrirajan T, Ghosh D (2020) Interdependence and the cost of uncoordinated responses to COVID-19. Proc Natl Acad Sci 117(33):19837–19843 Hou X, Gao S, Li Q, Kang Y, Chen N, Chen K, Rao J, Ellenberg JS, Patz JA (2020) Intra-county modeling of COVID-19 infection with human mobility: assessing spatial heterogeneity with business traffic, age and race. Proc Natl Acad Sci 118(24) Hristova D, Williams MJ, Musolesi M, Panzarasa P, Mascolo C (2016) Measuring urban social diversity using interconnected geo-social networks. In: Proceedings of the 25th international conference on World Wide Web, pp 21–30 Huang Q, Wong DWS (2016) Activity patterns, socioeconomic status and urban spatial structure: what can social media data tell us? Int J Geogr Inf Sci 30(9):1873–1898 Innes JE (1995) Planning theory’s emerging paradigm: communicative action and interactive practice. J Plan Educ Res 14(3):183–189 Innes JE, Booher DE (1999) Consensus building and complex adaptive systems: a framework for evaluating collaborative planning. 
J Am Plan Assoc 65(4):412–423 Innes JE, Booher DE (2018) Planning with complexity: an introduction to collaborative rationality for public policy. Routledge Kang Y, Gao S, Liang Y, Li M, Rao J, Kruse J (2020) Multiscale dynamic human mobility flow dataset in the US during the COVID-19 epidemic. Scientific Data 7(1):1–13 Kempinska K, Longley P, Shawe-Taylor J (2018) Interactional regions in cities: making sense of flows across networked systems. Int J Geogr Inf Sci 32(7):1348–1367 Kim H (2020) Some thoughts concerning network analysis approach in regional science. NARSC Newsl 8(1):11–12 Kitchin R (2016) The ethics of smart cities and urban science. Philos Trans Royal Soc A: Math Phys Eng Sci 374(2083):20160115 Knoke D, Yang S (2019) Social network analysis, vol 154. Sage Kwan M, Casas I, Schmitz B (2004) Protection of geoprivacy and accuracy of spatial information: How effective are geographical masks? Cartographica: Int J Geogr Inform Geovisualization 39(2):15–28 Lai S, Bogoch II, Ruktanonchai NW, Watts A, Lu X, Yang W, Yu H, Khan K, Tatem AJ (2020) Assessing spread risk of Wuhan novel coronavirus within and beyond China. Janurary–April 2020: a travel network-based modeling study. MedRxiv Lai S, Ruktanonchai NW, Zhou L, Prosper O, Luo W, Floyd JR, Wesolowski A, Santillana M, Zhang C, Du X, Yu H (2020) Effect of non-pharmaceutical interventions to contain COVID-19 in China. Nat 685(7825):410-413 Laniado D, Volkovich Y, Scellato S, Mascolo C, Kaltenbrunner A (2018) The impact of geographic distance on online social interactions. Inform Syst Front 20(6):1203–1218 Li B, Gao S, Liang Y, Kang Y, Prestby T, Gao Y, Xiao R (2020) Estimation of regional economic development indicator from transportation network analytics. Sci Rep 10(1). https://doi.org/10. 1038/s41598-020-59505-2
5 A Review of Spatial Network Insights and Methods …
89
Liu Y, Sui Z, Kang C, Gao Y (2014a) Uncovering patterns of inter-urban trip and spatial interaction from social media check-in data. PLoS One 9(1):e86026 Liu X, Gong L, Gong Y, Liu Y (2015) Revealing travel patterns and city structure with taxi trip data. J Transp Geogr 43:78–90 Liu X, Hollister R, Andris C (2018) Wealthy hubs and poor chains: constellations in the US urban migration system. In: Agent-based models and complexity science in the age of geospatial big data. Springer, pp 73–86 Liu X, Chen H, Andris C (2018) trajGANs: using generative adversarial networks for geo-privacy protection of trajectory data (Vision paper). Location Privacy and Security Workshop, pp 1–7 Liu S, Wan Y, Ha H-K, Yoshida Y, Zhang A (2019) Impact of high-speed rail network development on airport traffic and traffic distribution: evidence from China and Japan. Transp Res Part A: Policy Pract 127:115–135 McKenzie G, Janowicz K, Gao S, Yang J-A, Hu Y (2015) POI pulse: a multigranular, semantic signature–based information observatory for the interactive visualization of big geosocial data. Cartographica: Int J Geogr Inform Geovisualization 50(2):71–85 Mesbah M, Thompson R, Moridpour S (2012) Bilevel optimization approach to design of network of bike lanes. Transp Res Rec 2284(1):21–28 Metaxa-Kakavouli D, Maas P, Aldrich DP (2018) How social ties influence hurricane evacuation behavior. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), pp 1–16 Montjoye D, Alexandre Y, Hidalgo C, Verleysen M, Blondel V (2013) Unique in the crowd: the privacy bounds of human mobility. Sci Rep 3:1376 Neal Z (2011) Differentiating centrality and power in the world city network. Urban Stud 48(13):2733–2748 Neal Z (2012) The connected city: How networks are shaping the modern metropolis. Routledge Netto VM, Soares MP, Paschoalino R (2015) Segregated networks in the city. Int J Urban Reg Res 39(6):1084–1102 O’Kelly ME (1998) A geographer’s analysis of hub-and-spoke networks. 
J Transp Geogr 6(3):171– 186 Park J, Wood IB, Jing E, Nematzadeh A, Ghosh S, Conover MD, Ahn YY (2019) Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters. Nat Commun 10(1):1–10 Pei T, Sobolevsky S, Ratti C, Shaw S-LL, Li T, Zhou C (2014) A new insight into land use classification based on aggregated mobile phone data. Int J Geogr Inf Sci 28(9):1988–2007. https://doi. org/10.1080/13658816.2014.913794 Peng Z, Wang R, Liu L, Wu H (2020) Exploring urban spatial features of COVID-19 transmission in Wuhan based on social media data. ISPRS Int J GeoInform 9(6):402 Pepe E, Bajardi P, Gauvin L, Privitera F, Lake B, Cattuto C, Tizzoni M (2020) COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown. Sci Data 7(1):1–7 Phillips NE, Levy BL, Sampson RJ, Small ML, Wang RQ (2019) The social integration of American cities: network measures of connectedness based on everyday mobility across neighborhoods. Sociol Methods Res. https://doi.org/10.1177/0049124119852386 Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International symposium on computer and information sciences, pp 284–293 Prestby T, App J, Kang Y, Gao S (2020) Understanding neighborhood isolation through spatial interaction network analysis using location big data. Environ Plan A: Econ Space. https://doi.org/ 10.1177/0308518X19891911 Pullano G, Valdano E, Scarpa N, Rubrichi S, Colizza V (2020) Population mobility reductions during COVID-19 epidemic in France under lockdown. MedRxiv Radil SM, Walther OJ (2018) Social networks and geography: a review of the literature and its implications. ArXiv Preprint https://arxiv.org/abs/1805.04510
90
X. Liang and Y. Kang
Rae A (2009) From spatial interaction data to spatial interaction information? Geovisualisation and spatial structures of migration from the 2001 UK census. Comput Environ Urban Syst 33(3):161– 178 Rao J, Gao S, Kang Y, Huang Q (2020) LSTM-TrajGAN: a deep learning approach to trajectory privacy protection. ArXiv Preprint https://arxiv.org/pdf/2006.10521 Ratti C, Sobolevsky S, Calabrese F, Andris C, Reades J, Martino M, Claxton R, Strogatz SH (2010) Redrawing the map of Great Britain from a network of human interactions. PLoS One 5(12):e14248 Reichardt J, Bornholdt S (2006) Statistical mechanics of community detection. Phys Rev E 74(1):16110 Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123 Rosvall M, Bergstrom CT (2010) Mapping change in large networks. PLoS One 5(1):e8694 Shelton T, Poorthuis A (2019) The nature of neighborhoods: using big data to rethink the geographies of Atlanta’s neighborhood planning unit system. Ann Am Assoc Geogr 109(5):1341–1361 Shimamoto H, Murayama N, Fujiwara A, Zhang J (2010) Evaluation of an existing bus network using a transit network optimisation model: a case study of the Hiroshima City Bus network. Transportation 37(5):801–823 Siła-Nowicka K, Vandrol J, Oshan T, Long JA, Demšar U, Fotheringham AS (2016) Analysis of human mobility patterns from GPS trajectories and contextual information. Int J Geogr Inf Sci 30(5):881–906 Sobolevsky S, Szell M, Campari R, Couronné T, Smoreda Z, Ratti C (2013) Delineating geographical regions with networks of human interactions in an extensive set of countries. PLoS One 8(12):e81707 Steiger E, De Albuquerque JP, Zipf A (2015) An advanced systematic literature review on spatiotemporal analyses of t witter data. Trans GIS 19(6):809–834 Strano E, Viana MP, Sorichetta A, Tatem AJ (2018) Mapping road network communities for guiding disease surveillance and control strategies. 
Sci Rep 8(1):1–9 Taylor PJ, Derudder B (2004) World city network: a global urban analysis. Routledge Thomas LJ, Huang P, Yin F, Luo XI, Almquist ZW, Hipp JR, Butts CT (2020) Spatial heterogeneity can lead to substantial local variations in COVID-19 timing and severity. Proc Natl Acad Sci. 117(39)24180–24187 Van Eijk G (2010) Unequal networks: spatial segregation, relationships and inequality in the city, vol 32. Gwen van Eijk Viry G (2012) Residential mobility and the spatial dispersion of personal networks: effects on social support. Soc Netw 34(1):59–72 Von Landesberger T, Brodkorb F, Roskosch P, Andrienko N, Andrienko G, Kerren A (2015) Mobility graphs: visual analysis of mass mobility dynamics via spatio-temporal graphs and clustering. IEEE Trans Visual Comput Graphics 22(1):11–20 Wang Y (2019) Deck. gl: Large-scale web-based visual analytics made easy. ArXiv Preprint http:// arxiv.org/abs/1910.08865 Wang J, Mo H, Wang F, Jin F (2011) Exploring the network structure and nodal centrality of China’s air transport network: a complex network approach. J Transp Geogr 19(4):712–721 Wang Y, Kang C, Bettencourt LMA, Liu Y, Andris C (2015) Linked activity spaces: embedding social networks in urban space. In: Computational approaches for urban environments. Springer, pp 313–336 Wang S, Du Y, Jia C, Bian M, Fei T (2018) Integrating algebraic multigrid method in spatial aggregation of massive trajectory data. Int J Geogr Inf Sci 32(12):2477–2496 Warren MS, Skillman SW (2020) Mobility changes in response to COVID-19. ArXiv Preprint https://arxiv.org/pdf/2003.14228 Wei Y, Song W, Xiu C, Zhao Z (2018) The rich-club phenomenon of China’s population flow network during the country’s spring festival. Appl Geogr 96:77–85
5 A Review of Spatial Network Insights and Methods …
91
Woodruff A (2013). Neighborhoods as seen by the people. https://bostonography.com/2013/neighb orhoods-as-seen-by-the-people/ Yang P, Yamagata Y (2020) Urban systems design: shaping smart cities by integrating urban design and systems science. In: Urban systems design. Elsevier, pp 1–22 Yang Z, Algesheimer R, Tessone CJ (2016) A comparative analysis of community detection algorithms on artificial networks. Sci Rep 6:30750 Yang J, Han Y, Wang Y, Jiang B, Lv Z, Song H (2020) Optimization of real-time traffic network assignment based on IoT data using DBN and clustering model in smart city. Fut Gen Comput Syst 108:976–986 Yao X, Wu L, Zhu D, Gao Y, Liu Y (2019) Visualizing spatial interaction characteristics with direction-based pattern maps. J Visual 22(3):555–569 Yuan NJ, Zheng Y, Xie X, Wang Y, Zheng K, Xiong H (2014) Discovering urban functional zones using latent activity trajectories. IEEE Trans Knowl Data Eng 27(3):712–725 Zhong C, Arisona SM, Huang X, Batty M, Schmitt G (2014) Detecting the dynamics of urban structure through spatial network analysis. Int J Geogr Inf Sci 28(11):2178–2199 Zhu X, Guo D (2014) Mapping large spatial flow data with hierarchical clustering. Trans GIS 18(3):421–435 Zhu D, Wang N, Wu L, Liu Y (2017) Street as a big geo-data assembly and analysis unit in urban studies: a case study using Beijing taxi data. Appl Geogr 86:152–164
Chapter 6
Transport Infrastructure, Twitter and the Politics of Public Participation
Wayne Williamson
Abstract Social media is changing how many local communities seek to mobilize alternative political strategies to disrupt planning processes. The focus of this chapter is the social media and hashtag (#) use of citizens on Twitter during the planning and construction of the WestConnex motorway project in Sydney, Australia. To this end, this chapter applies a post-political lens to investigate alternative political strategies mobilized through Twitter to highlight equity issues relating to public participation, transparency and health impacts on the community. Of particular interest is hashtag use as a form of alternative politics. This chapter identifies the extensive use of Twitter as an additional communications channel to raise concerns at a local community level, and at a broader political level during a 2019 political election.
Keywords Public participation · Twitter · Hashtags · Post-political · Australia
W. Williamson (B) Department of Geography and Planning, Macquarie University, Balaclava Road, North Ryde, NSW 2109, Australia e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_6
6.1 Introduction
In Australia’s fastest-growing cities, governments are allocating unprecedented levels of investment to deliver transport, education, health, arts and cultural infrastructure (GSC 2018). Meanwhile, the governance of infrastructure (its planning, financing, contracting, and construction) is facing numerous issues, including tension between temporal and spatial decision-making, the use of expert panels and development authorities, and genuine public engagement (Wegrich 2017). An audit report released by Infrastructure Australia (2019) indicates that infrastructure often does not meet Australia’s needs and is characterized by congestion, overcrowding, rising bills, outages and declining service standards.
A strong strategic focus on infrastructure planning and delivery brings both opportunities and challenges. Since the early 2000s strategic planning in Australian cities has witnessed a shift towards large-scale urban infrastructure as a solution to urban problems (Dodson 2009). For Dodson (2009) this ‘infrastructure turn’ has weakened metropolitan plans in favor of large, complex and fiscally demanding infrastructure projects managed by infrastructure departments. This shift is reflected in the current metropolitan plan for Sydney, which now has a strong focus on infrastructure. The Greater Sydney Commission (GSC 2018) explains that strategic plans have been prepared concurrently to align land use planning, transport and infrastructure outcomes for Sydney for the first time in a generation. The topic of this chapter, the WestConnex motorway project, is central to Sydney’s strategic plans and is often referred to as a “game-changing project” or a “centerpiece infrastructure project” (Public Accountability Committee 2018).
Understanding the use of social media as a tool to facilitate participation in planning is now a topic of considerable interest (Lin and Geertman 2019). The adoption of social media responds to concerns that traditional methods of face-to-face consultation via public meetings and mainstream media are effective for reaching older, politically engaged residents, but have proved less effective for younger cohorts (Ertiö et al. 2016; Lin and Geertman 2019). In an era where social media is increasingly important in framing planning, whether by informing citizens of planning processes or as a means of communication between key stakeholders (Williamson and Ruming 2020), it is vital to examine social media’s role in large-scale transport infrastructure projects that impact wide geographic areas of cities.
This chapter uses a post-political lens to investigate alternative political strategies mobilized by citizens during the planning and construction of WestConnex. In particular, this chapter focuses on social media as a form of alternative politics. Most social media platforms use hashtags to assign one or more keywords as a form of metadata referencing the topic of a message. This topic tag assumes that other users will also adopt the tag and use it as a keyword for other messages on the same topic. The use of hashtags presupposes a virtual community of interest. To gain an understanding of hashtag use in urban planning, this study examines the types of hashtags used to mobilize ethical and equity concerns during the WestConnex project.
6.2 The WestConnex Project
The WestConnex project will provide 30 km of continuous motorway, including 22 km of road tunnels, which will link Sydney’s west and south-west to the Central Business District (CBD) and Sydney Airport (Fig. 6.1). WestConnex now comprises three stages, delivered in six projects over a 10-year period (Public Accountability Committee 2018). The second stage of WestConnex opened in July 2020, and the entire project is due to be completed by late 2023.
Fig. 6.1 WestConnex project (Source Public Accountability Inquiry 2018)
The WestConnex project has generated significant community and local government opposition during both the planning and construction phases of the project. Protest against WestConnex largely concerns the manipulative tactics used by the government to push ahead with the project by denying public debate and creating mechanisms to restrict the public’s ability to question the project (Haughton and McManus 2019). These tactics include, but are not limited to, delegating responsibility to a standalone expert body (Sydney Motorway Corporation) and declaring WestConnex a “critical infrastructure” project to prevent legal challenges and public access to decision-making documents (Haughton and McManus 2019). The use of state significant planning legislation is an attempt to defuse community opposition by rescaling consultation and decision-making to the metropolitan level, which is perceived to be less vulnerable to community opposition (MacDonald 2018). As the construction phase of WestConnex continues, many members of the public feel they have been inadequately consulted and their opinions have not been heard, which has led to significant protests, rallies, and arrests (Haughton and McManus 2019). Moreover, the planning of WestConnex has been organized to allow affected residents to express concerns about the project’s implementation, but not about the efficacy of the project (Searle and Legacy 2020).
Public accountability and participation constitute significant mechanisms for policy-makers to deal with conflicts when they arise. However, these mechanisms can influence decision-making only when decision-makers are sensitive to the alternatives, options, and views expressed by citizens regarding infrastructure decisions (Wegrich 2017). There has never been any opportunity to discuss or debate options during the planning of WestConnex. The number of contentious issues resulting from the WestConnex project has led some to suggest WestConnex is exemplary of post-political urban governance when one considers the significant number of economic, environmental and social injustices (Haughton and McManus 2019). The next section provides an overview of the analytical framework and data collection methods mobilized by the study. A discussion of the results then follows.
6.3 Public Participation and Alternative Politics
Public participation in planning is central to the communicative planning theories that emerged in the late 1980s and 1990s (Healey 1993; Forester 1999). The shift to the communicative turn in planning theory was particularly focused on community engagement as a re-orientation from technocratic planning models towards a more interactive understanding of planning activity (Harris 2002). Communicative planning theory draws on Habermas’ (1984) theory of communicative action. Habermas distinguished communicative rationality as the separation of ‘the sphere of everyday life’ and ‘the system’, which reflects economic or administrative systems (Tewdwr-Jones and Allmendinger 1998). Communicative rationality centralizes consensus through deliberation by harmonizing plans of action on the basis of common definitions (Habermas 1984). Communicative or collaborative planning approaches strive to increase opportunity and access to planning processes and decision-making (Healey 2006; Innes and Booher 2004).
While public participation is not questioned as a democratic practice (Fung 2006), different accounts of public participation have raised questions about the rhetoric and efficacy of participatory approaches, as the act of providing opportunities to take part in planning and decision-making may not necessarily lead to real participation (Maier 2001). Although it is considered important to offer the opportunity for the public to participate, by the time information is distributed about a planning matter, it is often too late in the process to make any significant changes (Hillier 2003). Collaborative planning sees conflict as creative tensions between different spheres of a pluralist society. It assumes people confront each other from different relational positions (Healey 1999) and clearly defines the planner in the role of facilitator (Healey 2006). Innes (1996) argues that collaboration functions best when all participants are equally empowered and fully informed; hence, planning reflects the ideals of democracy. However, the collaborative planning approach makes the somewhat utopian assumption that individuals and key stakeholders are interested in finding middle ground, when numerous examples demonstrate that this is often not the case (Huxley 2000). For Brand and Gaffikin (2007) the paradox of collaborative practice is that it seeks values of cohesion and inclusivity in an increasingly uncollaborative society.
Although communicative rationality has been adopted in numerous planning systems, some researchers question the limits imposed by political and economic contexts on the communicative planning approach. Despite a focus on the collaborative planning ideal in planning theory and practice, the communicative planning approach has been challenged as researchers question the capacity of processes which seem to work to maintain the powerful position of certain actors via “consensus building” processes that seek to delegitimize opposition and define the wider good by working towards general agreement through engagement (Rancière 1998). To this end, the post-political analytical framework has emerged in planning literature as a lens to examine underlying urban politics and the way they are manifest and maintained via planning processes (Rancière 1998; Allmendinger and Haughton 2015; Inch 2015; Legacy 2016, 2017).
Wilson and Swyngedouw (2014) define post-political as the political space of public engagement that is being increasingly colonized by technocratic mechanisms and consensual procedures that operate within a framework of representative democracy. Post-political efforts can reduce democratic contest by managing planning processes with expert reports and legitimizing decisions through participatory processes where the scope of possible outcomes is narrowly defined in advance (Wilson and Swyngedouw 2014). Attempts to depoliticize decision-making by shifting to consensus building are often mobilized through narratives about reducing regulatory burden, responding to key stakeholders, ensuring certainty in the planning process (Inch 2012), keeping business cases confidential, fast-tracking contracts and avoiding public discussion (Legacy 2016).
In response, alternative political strategies appear in many forms. A growing body of literature provides examples of resistance to post-political strategies (Legacy 2016). These examples highlight that the post-political turn has not removed politics from planning. On the contrary, politics often emerges through alternative spheres of protest led by citizens (Enright 2019). The narrow application of participatory planning is a political act to manage contestation by citizens who seek to unsettle the planning process (Legacy 2017). Social media can be mobilized to highlight an alternative reality being experienced by protesters.
6.4 Social Media, Hashtags and Planning
Through the widespread use of social media, citizens now easily establish large-scale online social networks to initiate and maintain collective actions (Lin and Geertman 2019). For Trapenberg Frick (2016) the citizen networks that utilize both digital media and mainstream media such as newspapers heighten other citizens’ concerns about planning processes. Furthermore, digital media allows participants to produce their own material through several communication channels, including YouTube videos, websites and posts on social media, to create a perpetual digital footprint (Trapenberg Frick 2016). Twitter’s open style allows planners to read, see or listen to what the community is saying. However, direct engagement with citizens on Twitter may prove more beneficial than broadly distributing information. These benefits can improve a government agency’s reputation and its planning dialogue more generally (Schweitzer 2014).
Alternative political strategies appear in many forms, and networks of citizens have seized upon the networking capabilities inherent in social media. We no longer live in a mainstream media world where newspaper editors are in charge but in a digitally connected world where the public can spread news of events and instantly respond to newsfeeds (Tufekci 2017). Being political often operates under severe constraints. Rancière (1998) notes that real politics is rare due to the vastly unequal access to the resources required to do politics at the intensity that will effect change. However, Fenton (2016) argues that being political has stopped being about voting once every few years or signing a petition; it has become about doing and being, and social media is increasingly playing a role in how people are being political. Attention is oxygen for being political, but there are significant challenges associated with controlling the volume of information and the consistency of the messaging through social media (Tufekci 2017).
The use of hashtags is a significant component of message distribution through social media. Hashtags assign one or more keywords to a tweet as a form of metadata referencing the topic of a message as specified by the user (Zappavigna 2012). This topic tag assumes that other users will also adopt the tag and use it as a keyword for other tweets on the same topic. Thus, the use of hashtags presupposes a virtual community of interested listeners. Unlike other forms of metadata, hashtags are visible in the text and can hold functional roles in the linguistic structure of the tweet (Zappavigna 2012). The social function of a hashtag is to provide an easy means of grouping together tweets, which in turn can create an ad hoc social group. Zappavigna (2012) terms the searchable aspects of hashtags ambient affiliation, in the sense that while tweets can be grouped by a hashtag, Twitter users may not interact directly, are unlikely to know each other and may not interact again. Hashtags can lead to the formation of ad hoc publics (Burgess et al. 2015), networks that develop around the hashtag. These networked communities can be ephemeral and arise in response to emergencies and crises, or they can be more stable, long-term communities of practice or knowledge that develop to spread ideas, news or opinions on a given topic.
6.5 Data Collection and Analysis
This section outlines the data sources, data collection techniques and methodology undertaken for the case study. Twitter is a social media platform that allows people to publish short messages on the Internet and is commonly referred to as microblogging. Twitter allows users to “follow” other user accounts they are interested in. Twitter enables users to broadcast messages using hashtags (#) and to direct messages to specific users using the “@” symbol. Twitter data were collected monthly from October 2016 through to December 2019 using the keyword “WestConnex” to collect any tweets containing that term. The NodeXL (smrfoundation.org/nodexl/) application retrieves an extensive list of details for each tweet, including the user name, tweet content, date of publication and the number of retweets, in a spreadsheet format.
The data analysis in this chapter is based on 31,000 tweets. Approximately 14,000 tweets did not have a hashtag. The following word clouds present the top 50 hashtags of the 17,000 tweets with hashtags. The Westconnex hashtag was the most used hashtag, appearing in 13,958 tweets (Fig. 6.2). This is expected, as the name of the project is so well known that social media users sought to directly align their tweets with the project. To highlight the secondary hashtags used by Twitter users, #Westconnex was removed from the data and the word cloud was refreshed (Fig. 6.3). This reveals #nswpol, #Sydney and #auspol as commonly used hashtags throughout the campaign. While #Sydney is a location marker, the umbrella #nswpol and #auspol hashtags are commonly used for domestic political issues. The #auspol hashtag is used for political commentary and debate and averages approximately 5000 tweets per day (Bruns and Highfield 2013).
Fig. 6.2 Word cloud with #Westconnex (Source Author)
Fig. 6.3 Word cloud without #Westconnex (Source Author)
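The hashtag tallies behind the two word clouds can be approximated with a short script. The sketch below is only an illustration under stated assumptions, not the chapter's NodeXL workflow: the function names `top_hashtags` and `tweets_without_hashtags`, the placeholder tweets, and the decisions to match hashtags case-insensitively and to count each hashtag at most once per tweet are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=50, exclude=()):
    """Return the n most frequent hashtags as (tag, tweet_count) pairs.

    Tags are lower-cased, and each tag is counted once per tweet, so the
    count is "number of tweets containing the tag". Ties are broken
    alphabetically to keep the output deterministic.
    """
    excluded = {tag.lower().lstrip("#") for tag in exclude}
    counts = Counter()
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        counts.update(tags - excluded)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


def tweets_without_hashtags(tweets):
    """Count tweets that contain no hashtag at all."""
    return sum(1 for text in tweets if not HASHTAG_RE.search(text))


# Hypothetical placeholder tweets, invented purely for illustration.
sample_tweets = [
    "Placeholder tweet one #WestConnex #nswpol",
    "Placeholder tweet two #Westconnex #auspol #nswpol",
    "Placeholder tweet three #Sydney",
    "Placeholder tweet four with no hashtag",
]

# First ranking: all hashtags. Second ranking: project hashtag removed,
# mirroring the step taken before the second word cloud.
print(top_hashtags(sample_tweets, n=50))
print(top_hashtags(sample_tweets, n=50, exclude=["#WestConnex"]))
print(tweets_without_hashtags(sample_tweets))
```

Passing the project hashtag through `exclude` mirrors the step of removing #Westconnex so that secondary hashtags become visible. Counting once per tweet is one reasonable reading of "appearing in 13,958 tweets"; counting every occurrence would be an equally simple variant.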
It is acknowledged that the social media dataset used for this chapter has inherent biases, including the exclusion of certain age groups and those without Internet access, which limits social media’s ability to be representative of a broader population (Lin and Geertman 2019). Moreover,
0.64, we found that when k = 23, the results were more interpretable. Given the large number of topics (k = 23) and the presence of some topic overlaps in the pyLDAvis grid, we summarized our 23 topics into 5 broader, overarching themes. These themes allowed us to effectively conceptualize the dataset.
7.3.3 Results
We uncovered 5 hidden themes from the topic modeling results: “Safety Perception”, “Technology Development”, “Industrial/System Integration”, “Design and Functionality”, and “Ethics and Policy”. Table 7.2 shows the representative terms of each theme. The results suggest that these 5 themes of driverless technology were the most relevant to the public. It is no wonder that the issue of safety became a hot topic on social media. Over the past ten years, safety has been one of the strongest motivations among the driving industry and road safety organizations for the implementation of driverless vehicles, and most debates were about the driverless vehicles’ safety promises (Corwin et al. 2016). The representative terms within the Technology Development theme match current issues around robotic progress, including artificial intelligence, software stability and liability, and data security and protection. Current analyses of accident data from on-road driverless vehicle test programs have shown a strong need for the future reliability of evolving software systems (Favarò et al. 2017). Another topic the public talked about on social media was Industrial and System Integration. This suggests that the public discussed the potential of driverless vehicles to change the future of mobility, with many implications including regulation, infrastructure, and road integration. In addition, the results showed that the public shared their opinions on the Design and Functionality of self-driving cars on social media, which could be valuable for improving the system over time, since members of the public will be the end users and their needs should be taken into account in marketing. It was noticeable that Ethics and Policy was a popular topic on social media when cars became “robotic”. With the technology’s development, the public became more interested in discussing moral choices for programming AVs.
Table 7.2 Topic modeling results
Generated topics | Representative terms for each generated topic (a)
Safety perception | Safe, safety, accident, crash, emergency, human, driver, kill, stop, speed, limit, hit, live, problem, people, fatal, death, police, fault, detect, pedestrian
Technology development | Data, share, algorithm, hack, cybersecurity, software, technology, ai, tech, robot, autonomous, self-driving car, autonomous vehicles, internet of things, innovative, machine learning
Industrial/System integration | System, city, transport, infrastructure, fleet, pilot, ride, car, drive, street, road, bus, transit, shuttle, bike, metro, motor, mobility, company, market, industrial, world, business
Design and functionality | Design, model, year, engine, kid, support, smart, semi, control, transform, function, remote, invention, concept, prototype, power, solar
Ethics and policy | Law, regulation, regulate, rule, legal, lawsuit, insurance, cover, judge, bill, file, right, government, permit, legislation, progress, policy, moral, ethics, permit, patent, question, case, decision
(a) Note: The percentage of tokens in each theme was around 20%
Z. Jiang and M. Zheng
7.4 Sentiment Analysis of Each Topic Generated from Topic Modeling Sentiment analysis (SA) is a form of natural language processing (NLP) that uses machine learning and statistical models to capture the sentiments or feelings contained within a body of text (Liu 2012). In recent years, SA has been applied in a wide range of areas, including advertising, stock market trading, and election outcome prediction. In this study, we built an SA model to identify whether each of the themes generated from LDA topic modeling contributed to the adoption or disruption of driverless technology over time.
7.4.1 Model Building Naïve Bayes, Support Vector Machines (SVM), and Maximum Entropy (MaxEnt) are three of the most popular machine learning algorithms used to conduct SA on Twitter data (Gautam and Yadav 2014; Giachanou and Crestani 2016). Naïve Bayes classifies a given document’s sentiment by calculating the probability of that document belonging to each sentiment class based on Bayes’ rule (McCallum and Nigam 1998). SVM attempts to classify documents by finding an optimal hyperplane in the N-dimensional space of the text corpus. An optimal hyperplane is defined as one that maximizes the distance between the data points of each sentiment class. Support vectors are the data points in close proximity to the hyperplane, and thus have a high influence on the orientation of the classifier (Platt 1998). The MaxEnt algorithm is based on the Principle of Maximum Entropy: among all the models that fit the training data, it selects the one with the largest entropy (Phillips et al. 2004). All three are supervised learning algorithms, which require the data to be labeled before the model is trained. However, manually labeling a large corpus of tweets is labor-intensive, error-prone (due to human bias), and time-consuming (Culotta and McCallum 2005). One alternative to manual labeling is to use the emoticons contained within tweets as labels. This is effective because emoticons generally reflect the tone of the tweet (Kouloumpis et al. 2011). Using emoticons allows sentiment analysis of tweets without manual labeling beforehand, and avoids potential bias introduced by human labelers. We first extracted tweets that contained emoticons, and labeled tweets containing positive emoticons as positive and tweets containing negative emoticons as negative. Tweets that contained both positive and negative emoticons were not considered for model training. We chose a set of positive and negative emoticons from the Emoji sentiment ranking (Kralj Novak et al.
2015), which provided sentiment scores for all emoticons. Then, we implemented the three SA algorithms discussed earlier and determined the best algorithm for our data based on model performance. Specifically, we used NLTK (Bird et al. 2009) to build a Multinomial Naïve Bayes model, implemented Linear SVM (Fung and Mangasarian 2005) with scikit-learn (Pedregosa et al. 2011), and built the MaxEnt model with the scikit-learn Python library (Pedregosa et al. 2011).

Table 7.3 SA accuracy of Naïve Bayes, SVM, and MaxEnt classifiers (sentiment classification accuracy for each theme, and overall accuracy)
Classifier | Safety perception (%) | Technology development (%) | Industrial/System integration (%) | Design and functionality (%) | Ethics and policy (%) | Overall accuracy (%)
Naïve Bayes | 83.4 | 81.6 | 84.1 | 83.9 | 83.6 | 83.32
SVM | 82.7 | 86.1 | 83.5 | 83.2 | 84.4 | 83.98
MaxEnt | 84.0 | 81.6 | 87.2 | 84.0 | 82.1 | 83.78
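The emoticon-based labeling and the Naïve Bayes step described in this section can be sketched roughly as follows. This is a minimal illustration in plain Python, not the NLTK or scikit-learn models used in the study. The emoticon sets, the helper names, the whitespace tokenizer, and the choice to strip emoticons from the text before training are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Illustrative emoticon sets (assumption). The study chose its sets from the
# Emoji Sentiment Ranking, which is not reproduced here.
POSITIVE_EMOTICONS = (":)", ":-)", ":D")
NEGATIVE_EMOTICONS = (":(", ":-(", ":'(")


def label_by_emoticon(text):
    """Return 'positive', 'negative', or None (mixed or no emoticons)."""
    has_pos = any(e in text for e in POSITIVE_EMOTICONS)
    has_neg = any(e in text for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:  # both kinds present, or neither
        return None
    return "positive" if has_pos else "negative"


def tokenize(text):
    """Strip the labeling emoticons, lowercase, and split on whitespace."""
    for emoticon in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
        text = text.replace(emoticon, " ")
    return text.lower().split()


def build_training_set(tweets):
    """Keep only tweets that carry an unambiguous emoticon label."""
    labeled = []
    for tweet in tweets:
        label = label_by_emoticon(tweet)
        if label is not None:
            labeled.append((tokenize(tweet), label))
    return labeled


def train_naive_bayes(labeled):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    n_docs = sum(class_counts.values())

    def predict(text):
        scores = {}
        for label, n_label in class_counts.items():
            denominator = sum(word_counts[label].values()) + len(vocabulary)
            score = math.log(n_label / n_docs)  # log prior of the class
            for token in tokenize(text):
                if token in vocabulary:  # words never seen in training are ignored
                    score += math.log((word_counts[label][token] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)

    return predict


# Invented toy tweets, for illustration only.
tweets = [
    "Self-driving shuttle was smooth and safe :)",
    "Love the new autonomous demo :D",
    "Driverless car crash again :(",
    "Scary accident with a robot car :-(",
    "Mixed feelings about this :) :(",  # both kinds of emoticon: dropped
    "No emoticon in this tweet",        # no emoticon: dropped
]
training_set = build_training_set(tweets)
classify = train_naive_bayes(training_set)
print(classify("another crash"))         # negative
print(classify("safe and smooth ride"))  # positive
```

Stripping the emoticons before training keeps the classifier from simply memorizing the symbols that produced the labels, so that it has to rely on the surrounding words instead.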
7.4.2 Model Evaluation We used an 80/20 training/testing split to validate model accuracy: 80% of the labeled data was used to train the model and the remaining 20% was used to test its accuracy. The Python libraries NLTK and scikit-learn reported their classification accuracy as probabilities. Because classifier accuracy and class probability distributions can vary with each run, we classified each theme five times and calculated the average accuracy for each theme. Table 7.3 compares the SA accuracy of each algorithm. In general, the overall accuracies of the classifiers are close to each other, between 83% and 84%. Since the SVM classifier reached the highest overall accuracy (83.98%) and all of its per-theme accuracies were above 82%, we selected SVM as our SA method to report the sentiment results of each theme.
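The evaluation procedure described here (an 80/20 split, repeated five times and averaged) can be sketched as below. This is a hedged illustration in plain Python rather than the study's code: `holdout_accuracy`, `averaged_accuracy`, and the majority-class `train_majority` stand-in are hypothetical names, the seeds are arbitrary, and the toy examples are invented. The last part recomputes each classifier's overall accuracy as the unweighted mean of its five per-theme accuracies in Table 7.3, which reproduces the reported overall values.

```python
import random
from statistics import mean


def holdout_accuracy(examples, train_fn, seed, test_fraction=0.2):
    """Shuffle, hold out `test_fraction` for testing, and train on the rest."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    predict = train_fn(train)
    hits = sum(predict(text) == label for text, label in test)
    return hits / len(test)


def averaged_accuracy(examples, train_fn, n_runs=5):
    """Average the hold-out accuracy over `n_runs` different random splits."""
    return mean(holdout_accuracy(examples, train_fn, seed) for seed in range(n_runs))


def train_majority(train):
    """Stand-in classifier (assumption): always predict the most common label."""
    labels = [label for _, label in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda text: majority


# Invented toy data: eight positive and two negative examples.
examples = [(f"tweet {i}", "positive") for i in range(8)]
examples += [(f"tweet {i}", "negative") for i in range(8, 10)]
average = averaged_accuracy(examples, train_majority)

# Per-theme accuracies (%) taken from Table 7.3.
PER_THEME_ACCURACY = {
    "Naive Bayes": [83.4, 81.6, 84.1, 83.9, 83.6],
    "SVM": [82.7, 86.1, 83.5, 83.2, 84.4],
    "MaxEnt": [84.0, 81.6, 87.2, 84.0, 82.1],
}
overall = {name: round(mean(values), 2) for name, values in PER_THEME_ACCURACY.items()}
best = max(overall, key=overall.get)
print(overall)  # {'Naive Bayes': 83.32, 'SVM': 83.98, 'MaxEnt': 83.78}
print(best)     # SVM
```

Any real classifier can be plugged in through `train_fn`, as long as it takes the training examples and returns a function that maps a text to a label.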
7.4.3 Results The SA results are shown in Figs. 7.4 and 7.5, which present the overall sentiment for 2012–2019 and the sentiment in each year, respectively. In Fig. 7.4a, the green and red bars show the percentages of positive and negative tweets within each theme, respectively. In general, the majority of tweets were classified as either positive or neutral (grey bar) across all the themes. Figure 7.4b shows the ratio of positive to negative sentiment within each theme, excluding the neutral tweets. Positive sentiment was dominant across all the themes. The comparison also indicated that Design and Functionality, and Ethics and Policy, might be of major concern to the public, since their perceived risks (negative sentiment) took a share of around 40%, which was larger than the share of negative sentiment in the other themes.

Fig. 7.4 SVM sentiment classification results for each theme from 2012 to 2019
(a) Share of Positive, Negative, and Neutral tweets (positive / negative / neutral): Safety Perception 45% / 21% / 34%; Technology Progress 39% / 20% / 41%; Industrial/System Integration 41% / 22% / 38%; Design and Functionality 35% / 26% / 38%; Ethics and Policy 32% / 19% / 49%
(b) Share of Positive vs. Negative tweets, excluding neutral tweets (positive / negative): Safety Perception 68% / 32%; Technology Progress 67% / 33%; Industrial/System Integration 65% / 35%; Design and Functionality 57% / 43%; Ethics and Policy 62% / 38%

Figure 7.5a shows the sentiment results within each theme over time. The neutral sentiment, defined as content in which no information about either benefit or risk perceptions was found, stayed relatively constant over the years. More intriguingly, the shares of positive and negative tweets exhibit opposite trends. The proportion of positive tweets began to decline in 2014–2015, with the sharpest decrease occurring around 2017–2018. The negative sentiment share began to rise around 2015–2016, with a marked increase in 2017–2018. Figure 7.5b shows the ratio of positive to negative tweets excluding the neutral tweets, which follows a pattern similar to that in Fig. 7.5a. A sharp increase in negative sentiment was observed in 2017–2018 for all the themes, and two themes, Ethics and Policy and Safety, became the dominant concerns of the public. While the slopes of the other themes began to flatten around 2018–2019, those two themes continued their trend.

Fig. 7.5 SVM sentiment classification results for each theme over time (Note: This figure is better visualized in color). (a) Share of Positive, Negative, and Neutral tweets over time; (b) Share of Positive vs. Negative tweets, excluding neutral tweets, over time. Horizontal axis: year (2012–2019); vertical axis: share of tweets (0–100%).

Serious discussions involving the ethics of self-driving vehicle accidents began to take place in 2014 (Schroll 2014; Goodall 2014). The sharp increase in negative sentiment coincides with the first driverless vehicle fatality involving a pedestrian: in March 2018, a driverless Uber struck and killed a pedestrian in Tempe, Arizona (Wakabayashi 2018). After that, ethics, liability, and safety were the most widely discussed issues associated with driverless technology (Chee 2018; Endsley 2019). Overall, the sentiment trend provides a detailed picture of how positive and negative sentiment changed over time, suggesting that the public’s opinions and attitudes can be captured through tweets.
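The two views used in Figs. 7.4 and 7.5 (shares that include neutral tweets, and the positive versus negative split once neutral tweets are excluded) reduce to simple proportions. The sketch below is an illustration only: the function names are hypothetical, the per-year records are invented toy data rather than the study's tweets, and the Safety Perception percentages are as read from Fig. 7.4a.

```python
from collections import Counter, defaultdict

SENTIMENTS = ("positive", "negative", "neutral")


def sentiment_shares(labels):
    """Share of each sentiment label among all tweets (neutral included)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: counts[label] / total for label in SENTIMENTS}


def polar_shares(positive, negative):
    """Positive vs. negative split after neutral tweets are excluded."""
    polar_total = positive + negative
    return positive / polar_total, negative / polar_total


def shares_by_year(records):
    """Group (year, label) records by year and compute the shares per year."""
    by_year = defaultdict(list)
    for year, label in records:
        by_year[year].append(label)
    return {year: sentiment_shares(labels) for year, labels in sorted(by_year.items())}


# Safety Perception, as read from Fig. 7.4a: 45% positive, 21% negative.
pos_share, neg_share = polar_shares(45, 21)
print(f"{pos_share:.0%} positive, {neg_share:.0%} negative")  # 68% positive, 32% negative

# Invented toy records, for illustration only.
records = [
    (2017, "positive"), (2017, "positive"), (2017, "neutral"), (2017, "negative"),
    (2018, "positive"), (2018, "negative"), (2018, "negative"), (2018, "neutral"),
]
yearly = shares_by_year(records)
print(yearly[2018]["negative"])  # 0.5
```

Because the published percentages are rounded, recomputing panel (b) from panel (a) in this way can differ from the figure by about one percentage point for some themes.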
7.5 Discussion and Conclusion In this study, we found that Twitter offers an effective means of rapidly gathering information about public perceptions and attitudes towards driverless vehicles and technology. Using topic modeling, we identified the main themes discussed in tweets related to driverless technology. We uncovered a set of five popular topics embedded in the tweets: Safety Perception, Technology Development, Industrial/System Integration, Design and Functionality, and Ethics and Policy. The representative terms that the topic models associated with each theme were highly representative of the corresponding theme. For example, the topic of Safety Perception included terms such as “safe”, “accident”, “crash”, “detect”, “human”, and “driver”. Terms such as “city”, “system”, “transport”, “infrastructure”, and “road” were associated with the Industrial and System Integration theme. In the Ethics and Policy category, terms were more related to the impacts of driverless technology on issues of ethics, human rights, and legislation, including “law”, “regulation”, “legal”, “insurance”, “ethics”, “policy”, and “moral”. Overall, these results showed that a variety of topics about driverless technology were broadly discussed on social media. Some of the uncovered topics might not be that surprising, such as technology safety, detection, and relevant liability issues, which were always highlighted in industrial advertisements, news media, and public hearings. However, topic modeling also allowed us to uncover some hidden topics which might otherwise have been overlooked. For example, we found that Design and Functionality was a hot theme in tweets. The terms in this category included “design”, “model”, “smart”, “semi”, “prototype”, “remote”, “control”, “kid”, “support”, and “function”.
For a long time, Design and Functionality was regarded as an engineering challenge for restoring the self-driving system and minimizing the risk conditions in complex traffic environments, but the experience of the end users (the public) could be valuable for improving the systems over time. Automotive marketers can use the information mined from social media to target specific functionalities, such as “control system design” and “access design for kids”, in driverless vehicle marketing strategies.

The sentiment analysis, based on an SVM supervised learning approach, identified the polarity of attitudes towards each theme of driverless technology. We found that although positive perceptions predominated among the polarity tweets, i.e., tweets with positive or negative sentiment, people displayed a sharp increase in concern about some themes during 2017–2019, especially Ethics and Policy and Safety. Even though some people may consider driverless vehicles to be a significant part of the future of transportation, they still seem doubtful about the regulation of robotic mobility, especially regarding ethical, regulatory, and liability issues. These findings were consistent with findings in previous research. Zmud and Sener (2017) conducted an online survey of 556 respondents and found that the moral and ethical issues of self-driving cars were a controversial topic, especially the questions of who is driving and who assumes responsibility for accidents. In addition, around 67% of perceptions towards Design and Functionality were concerns and critiques until 2019. At the current stage of intelligent vehicle system design, many aspects can be improved to make the vehicles multi-functional and to support a wide range of the population, such as providing convenience for children and seniors.

By analyzing 505,058 tweets, this study shows that using text mining to determine topics and classify social media sentiment automatically is a promising approach to analyzing the acceptance of new transportation technologies such as driverless vehicles. We identified several areas of major concern that probably affect the public’s adoption of driverless technology and vehicles. With many ongoing improvements and innovations in driverless technology, the insights gained from this analysis will also help governments and developers frame prompt communication and marketing strategies for the public.

This research has some limitations. First, the demographics of Twitter users may not necessarily represent the entire population. Compared to the average U.S. adult population, Twitter users are younger, more highly educated, wealthier, and more tech-savvy (Sloan et al. 2015; Pew Research Center 2019). Thus, some research questions may continue to require survey efforts that reach populations that do not participate as readily in Twitter, or for which Twitter does not supply the critical information needed to answer those questions. For example, senior people’s opinions may not be fully represented on Twitter; future research should try to gain a better grasp of the senior population’s views on and needs for driverless technology. Second, we extended LDA topic modeling by applying a hashtag pooling method beforehand; this could be further contextualized through a review of potential inner relationships, for example with a hybrid approach that provides deeper insight into these hashtags.
Furthermore, while sentiment analysis can uncover the overall sentiment and ratios, it cannot explain which factors, such as demographic, socio-economic, or travel behavior characteristics, affect the public’s perceptions and attitudes. Further survey-based research can determine the factors that influence these sentiments, as well as whether the sentiments reflected in survey results agree with those on social media.
References
AAA (2017) Americans Feel Unsafe Sharing the Road with Fully Self-Driving Cars. American Automobile Association. https://newsroom.aaa.com/asset/americans-feel-unsafe-sharing-the-road-with-fully-self-driving-cars-fact-sheet/
Anania EC, Rice S, Walters NW et al (2018) The effects of positive and negative information on consumers’ willingness to ride in a driverless vehicle. Transp Policy 72:218–224. https://doi.org/10.1016/j.tranpol.2018.04.002
Anderson JM, Kalra N, Stanley KD et al (2014) Autonomous vehicle technology: a guide for policymakers. Rand Corporation, Santa Monica, CA
Aslam S (2020) Twitter by the Numbers: Stats, Demographics & Fun Facts. http://www.omnicoreagency.com/twitter-statistics/
Bansal P, Kockelman KM, Singh A (2016) Assessing public opinions of and interest in new vehicle technologies: an Austin perspective. Transp Res Part C Emerg Technol 67:1–14. https://doi.org/10.1016/j.trc.2016.01.019
Bird S, Klein E, Loper E (2009) Natural language processing with Python, 1st edn. O’Reilly, Beijing and Cambridge, MA
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
Boyd-Graber J, Hu Y, Mimno D (2017) Applications of topic models. Found Trends® Inf Retr 11:143–296. https://doi.org/10.1561/1500000030
Burns LD (2013) A vision of our transport future. Nature 497:181–182. https://doi.org/10.1038/497181a
Chee FM (2018) An Uber ethical dilemma: examining the social issues at stake. J Inf Commun Ethics Soc 16:261–274. https://doi.org/10.1108/JICES-03-2018-0024
Corwin S, Jameson N, Pankratz D, Willigmann P (2016) The future of mobility: What’s next? Tomorrow’s mobility ecosystem - and how to succeed in it. Deloitte
Culotta A, McCallum A (2005) Reducing labeling effort for structured prediction tasks. In: Proceedings of the 20th national conference on artificial intelligence, vol 2. AAAI Press, Pittsburgh, Pennsylvania, pp 746–751
Efthymiou D, Antoniou C (2012) Use of social media for transport data collection. Procedia Soc Behav Sci 48:775–785. https://doi.org/10.1016/j.sbspro.2012.06.1055
Endsley MR (2019) Situation awareness in future autonomous vehicles: beware of the unexpected. In: Bagnara S, Tartaglia R, Albolino S et al (eds) Proceedings of the 20th congress of the international ergonomics association (IEA 2018). Springer International Publishing, Cham, pp 303–309
Favarò FM, Nader N, Eurich SO et al (2017) Examining accident reports involving autonomous vehicles in California. PLoS One 12:e0184952. https://doi.org/10.1371/journal.pone.0184952
Fung G, Mangasarian O (2005) Multicategory proximal support vector machine classifiers. Mach Learn. https://doi.org/10.1007/s10994-005-0463-6
Gautam G, Yadav D (2014) Sentiment analysis of twitter data using machine learning approaches and semantic analysis. In: 2014 seventh international conference on contemporary computing (IC3). IEEE, Noida, India, pp 437–442
Giachanou A, Crestani F (2016) Like it or not: a survey of twitter sentiment analysis methods. ACM Comput Surv 49:1–41. https://doi.org/10.1145/2938640
Goodall NJ (2014) Ethical decision making during automated vehicle crashes. Transp Res Rec J Transp Res Board 2424:58–65. https://doi.org/10.3141/2424-07
Gopal GN, Kovoor BC, Mini U (2021) Keyword template based semi-supervised topic modelling in tweets. In: Gupta D, Khanna A, Bhattacharyya S et al (eds) International conference on innovative computing and communications. Springer Singapore, Singapore, pp 659–666
Hajjem M, Latiri C (2017) Combining IR and LDA topic modeling for filtering microblogs. Procedia Comput Sci 112:761–770. https://doi.org/10.1016/j.procs.2017.08.166
Henrique J (2020) GetOldTweets Python. https://github.com/Jefferson-Henrique/GetOldTweets-python
Hensher DA (2018) Tackling road congestion - What might it look like in the future under a collaborative and connected mobility model? Transp Policy 66:A1–A8. https://doi.org/10.1016/j.tranpol.2018.02.007
Hong L, Davison BD (2010) Empirical study of topic modeling in Twitter. In: Proceedings of the first workshop on social media analytics - SOMA ’10. ACM Press, Washington DC, District of Columbia, pp 80–88
Jones T, Baxter M, Khanduja V (2013) A quick guide to survey research. Ann R Coll Surg Engl 95:5–7. https://doi.org/10.1308/003588413X13511609956372
Keerthi Kumar HM, Harish BS (2018) Classification of short text using various preprocessing techniques: an empirical evaluation. In: Sa PK, Bakshi S, Hatzilygeroudis IK, Sahoo MN (eds) Recent findings in intelligent computing techniques. Springer Singapore, Singapore, pp 19–30
Kohl C, Mostafa D, Böhm M, Krcmar H (2017) Disruption of Individual Mobility Ahead? A Longitudinal Study of Risk and Benefit Perceptions of Self-Driving Cars on Twitter. In: Leimeister JM, Brenner W (eds) Proceedings der 13. Internationalen Tagung Wirtschaftsinformatik (WI 2017), St. Gallen, pp 1220–1234
Kohl C, Knigge M, Baader G et al (2018) Anticipating acceptance of emerging technologies using twitter: the case of self-driving cars. J Bus Econ 88:617–642. https://doi.org/10.1007/s11573-018-0897-5
Kouloumpis E, Wilson T, Moore J (2011) Twitter Sentiment Analysis: The Good the Bad and the OMG! In: Proceedings of the International AAAI Conference on Web and Social Media (Vol. 5, No. 1)
Kralj Novak P, Smailović J, Sluban B, Mozetič I (2015) Sentiment of Emojis. PLOS One 10:e0144296. https://doi.org/10.1371/journal.pone.0144296
Lim KW, Buntine W (2014) Twitter opinion topic model: extracting product opinions from tweets by leveraging hashtags and sentiment lexicon. In: Proceedings of the 23rd ACM international conference on conference on information and knowledge management. Association for Computing Machinery, New York, NY, USA, pp 1319–1328
Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5:1–167. https://doi.org/10.2200/S00416ED1V01Y201204HLT016
Lui M, Baldwin T (2012) langid.py: An off-the-shelf language identification tool. In: Proceedings of the ACL 2012 system demonstrations. Association for Computational Linguistics, Jeju Island, Korea, pp 25–30
McCallum AK (2002) MALLET: A Machine Learning for Language Toolkit. http://mallet.cs.umass.edu/
McCallum A, Nigam K (1998) A comparison of event models for naive bayes text classification. In: undefined. /paper/A-comparison-of-event-models-for-naive-bayes-text-McCallumNigam/04ce064505b1635583fa0d9cc07cac7e9ea993cc. Accessed 28 Oct 2020
Mehrotra R, Sanner S, Buntine W, Xie L (2013) Improving LDA topic models for microblogs via tweet pooling and automatic labeling. In: Proceedings of the 36th international ACM SIGIR conference on research and development in information retrieval. Association for Computing Machinery, New York, NY, USA, pp 889–892
Otsuka E, Wallace SA, Chiu D (2014) Design and evaluation of a Twitter hashtag recommendation system. In: Proceedings of the 18th international database engineering & applications symposium. Association for Computing Machinery, New York, NY, USA, pp 330–333
Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Pew Research Center (2019) Sizing Up Twitter Users. https://www.pewresearch.org/internet/2019/04/24/sizing-up-twitter-users/
Phillips SJ, Dudík M, Schapire RE (2004) A maximum entropy approach to species distribution modeling. In: Proceedings of the twenty-first international conference on Machine learning. Association for Computing Machinery, New York, NY, USA, p 83
Platt J (1998) Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines. Microsoft Research. https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/
Rahim Taleqani A, Hough J, Nygard KE (2019) Public opinion on dockless bike sharing: a machine learning approach. Transp Res Rec J Transp Res Board 2673:195–204. https://doi.org/10.1177/0361198119838982
Ramage D, Dumais S, Liebling DJ (2010) Characterizing Microblogs with Topic Models. In: Proceedings of the International AAAI Conference on Web and Social Media (Vol. 4, No. 1)
Rodrigue J-P, Comtois C, Slack B (2017) The geography of transport systems, 4th edn. Routledge, Taylor & Francis Group, London and New York
Roesslein J (2020) Tweepy: Twitter for Python!
Sabab Zulfiker M, Kabir N, Ali HM et al (2020) Sentiment analysis based on users’ emotional reactions about ride-sharing services on Facebook and Twitter. In: Uddin MS, Bansal JC (eds) Proceedings of international joint conference on computational intelligence. Springer, Singapore, pp 397–408
Schoettle B, Sivak M (2014) A survey of public opinion about autonomous and self-driving vehicles in the U.S., the U.K., and Australia
Schroll C (2014) Splitting the bill: creating a national car insurance fund to pay for accidents in autonomous vehicles. Nw UL Rev 109:803
Sievert C, Shirley K (2015) pyLDAvis: Python library for interactive topic model visualization. https://CRAN.R-project.org/package=LDAvis
Sloan L, Morgan J, Burnap P, Williams M (2015) Who Tweets? Deriving the demographic characteristics of age, occupation and social class from twitter user meta-data. PLoS One 10:e0115545. https://doi.org/10.1371/journal.pone.0115545
Steinskog A, Therkelsen J, Gambäck B (2017) Twitter topic modeling by Tweet aggregation. In: Proceedings of the 21st Nordic conference on computational linguistics. Association for Computational Linguistics, Gothenburg, Sweden, pp 77–86
Stevens K, Kegelmeyer P, Andrzejewski D, Buttler D (2012) Exploring topic coherence over many models and many topics. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. Association for Computational Linguistics, Jeju Island, Korea, pp 952–961
Thierer A, Hagemann R (2014) Removing Roadblocks to Intelligent Vehicles and Driverless Cars. Wake Forest Journal of Law & Policy (2015), Vol. 5, Mercatus Research Paper, Mercatus Center at George Mason University, Arlington, VA. Available at SSRN: https://ssrn.com/abstract=2496929 or http://dx.doi.org/10.2139/ssrn.2496929
Wakabayashi D (2018) Self-driving Uber car kills pedestrian in Arizona, where robots roam. New York Times
Wang X, McCallum A (2006) Topics over time: a non-Markov continuous-time model of topical trends. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’06. ACM Press, Philadelphia, PA, USA, p 424
Wang X, Gerber MS, Brown DE (2012) Automatic crime prediction using events extracted from Twitter posts. In: Yang SJ, Greenberg AM, Endsley M (eds) Social computing, behavioral-cultural modeling and prediction. Springer, Berlin, Heidelberg, pp 231–238
Wang S, Jiang Z, Noland RB, Mondschein AS (2020) Attitudes towards privately-owned and shared autonomous vehicles. Transp Res Part F Traffic Psychol Behav 72:297–306. https://doi.org/10.1016/j.trf.2020.05.014
Zhang T, Tan H, Li S et al (2019) Public’s acceptance of automated vehicles: the role of initial trust and subjective norm. Proc Hum Factors Ergon Soc Annu Meet 63:919–923. https://doi.org/10.1177/1071181319631183
Zhao WX, Jiang J, Weng J et al (2011) Comparing Twitter and traditional media using topic models. In: Advances in information retrieval. Springer, Berlin, Heidelberg, pp 338–349
Zmud JP, Sener IN (2017) Towards an understanding of the travel behavior impact of autonomous vehicles. Transp Res Procedia 25:2500–2519. https://doi.org/10.1016/j.trpro.2017.05.281
Chapter 8
Assessing the Value of New Big Data Sources for Transportation Planning: Benton Harbor, Michigan Case Study
Robert Goodspeed, Meixin Yuan, Aaron Krusniak, and Tierra Bills
Abstract Transportation professionals and researchers have traditionally relied on household travel surveys to understand transportation needs, especially for transit-dependent and other environmental justice communities. Recently, new big data sources have become available for transportation planning applications. The purpose of this article is to compare a traditional household travel survey with two such datasets used by practitioners: SafeGraph and StreetLight. The analysis compares the coverage and travel patterns they provide across a socioeconomically diverse small region centered on the twin cities of Benton Harbor and St. Joseph, Michigan, USA. Although lacking demographic data, the big data sources provide greater coverage and detail for parts of the region home to the African American population important for environmental justice analysis. In addition, SafeGraph data derived from cell phones provides potentially useful point-of-interest and time-of-day travel information lacking from the conventional survey. The article describes the potential for data fusion for enhanced understanding of community needs.
Keywords Transportation · Big data · Environmental justice · Transportation planning · Smartphone data
R. Goodspeed (corresponding author) · M. Yuan · A. Krusniak
Taubman College of Architecture and Urban Planning, University of Michigan, Ann Arbor, MI, USA
e-mail: [email protected]
T. Bills
College of Engineering, Wayne State University, Detroit, MI, USA
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_8

8.1 Introduction A fundamental input for transportation planning is data describing people’s travel behaviors in a region. This need has traditionally been met by household travel surveys, which ask a representative sample of residents to record their travel for a selected survey period (Stopher and Greaves 2007). Household travel surveys are used to estimate transportation models, which are in turn used to evaluate proposals for new infrastructure, transit services, or policy changes (Boyce and Williams 2015). Furthermore, household travel surveys are widely employed by researchers to develop and test theories describing various travel behaviors. However, concerns have been raised about the validity of household travel surveys. Surveys often struggle to reach certain vulnerable populations, such as immigrants, the elderly, low-income individuals, youth, and households who speak a language other than English (Stopher and Greaves 2007). In the United States, African American communities have lower auto ownership rates and are therefore more reliant on public transit (Raphael et al. 2006). The assumptions of travel demand models may poorly account for the daily travel needs of marginalized populations (Nostikasari 2015). Since many of these groups also experience the greatest transportation needs, their underrepresentation in surveys may mean that their needs are neglected in plans and infrastructure decisions. In recent years, several research projects have demonstrated the potential for new big data sources to provide insight about transportation behaviors that may go beyond the capabilities of traditional household travel surveys. These include cellular network data (Jiang et al. 2017), smart card data (Zannat and Choudhury 2019), vehicle location data (StreetLight Data 2018a), and others. Although these studies demonstrate the potential of these datasets, they typically rely on proprietary data access and analysis skills only available to sophisticated researchers, and therefore big data has not been widely adopted by practitioners. This has started to change in recent years with the advent of private data brokers who sell big data obtained from smartphones or vehicles directly to transportation professionals in aggregated and processed formats, thereby making data access and analysis easier for professionals.
These generally fall into three categories: companies providing data derived from cell phone networks (AirSage, Cellint, and Citilabs Streetlytics), companies providing data derived from location data obtained from mobile apps (SafeGraph, Veraset), and companies providing data derived from both apps and on-board vehicle infotainment and navigation systems (StreetLight Data).¹ These data have the potential to overcome some of the shortcomings of conventional travel surveys and expand the capacity to understand travel dynamics. For example, Albright (2018) describes how practitioners using AirSage data concluded there were many more visitors to the Lake Tahoe region than traffic count and hotel stay data suggested, confirming stakeholders’ anecdotal observations of traffic problems not apparent from conventional data. However, these data sources may introduce new biases. Big data drawn primarily from vehicles may undercount communities with low vehicle ownership, or those who own older vehicles lacking tracking technology. Tracking data from mobile apps may not count communities with fewer smartphones or those who use phones more intermittently due to unaffordable data costs. For this study we focus on two big data sources which are now readily available to researchers and practitioners and, as a result, are entering into wider use. These are SafeGraph, derived from smartphone apps, and StreetLight, derived from a combination of vehicles and smartphone apps. We seek to understand the practical value of new big data sources for transportation planning, including critically assessing their representativeness and the new insights they can provide. To this end, this study aims to answer the following questions:
(1) How do data from SafeGraph and StreetLight compare with conventional travel surveys in terms of their spatial coverage and the travel patterns they describe?
(2) How well does each of these data sources represent the travel behaviors of low-income African American communities?
(3) What additional information is available from SafeGraph and StreetLight that may complement traditional travel information sources?

¹ AirSage (www.airsage.com), Cellint (www.cellint.com), Citilabs Streetlytics (www.citilabs.com/software/streetlytics/), SafeGraph (www.safegraph.com), Veraset (www.veraset.com), StreetLight Data (www.streetlightdata.com).
8.2 Background and Context

This section provides the background and context for our study: a description of travel needs assessments for environmental justice conducted by U.S. transportation professionals, an overview of literature on big data for mobility analysis, and a description of our case study region.
8.2.1 Travel Needs Assessments for Environmental Justice

In addition to our descriptive analysis, this paper adopts the specific lens of environmental justice travel needs assessments as a common professional analysis task that may benefit from big data. Environmental justice travel needs assessments are regular activities undertaken by transportation planning organizations in the United States, according to requirements in federal laws and regulations. The primary objectives of these analyses include identifying the needs for specific transportation improvements based on gaps in service provision for environmental justice communities, for both current infrastructure and future transportation changes (Forkenbrock and Sheeley 2004). There are three pieces of U.S. federal legislation that require or prompt metropolitan planning organizations (MPOs), the local governments or organizations designated as responsible for transportation planning, to perform these needs assessments for environmental justice communities. First, Title VI of the Civil Rights Act of 1964 requires that no traveler “be excluded from participation in, be denied the benefits of, or be subjected to discrimination” on the basis of “race, color, or national origin” under any programs that receive federal funding (Gareis-Smith 1994). This includes the Transportation Improvement Programs and other Regional Transportation Program activities undertaken by MPOs. Second, Executive Order 12898 requires that federal agencies (and by extension, agencies receiving federal funding) identify and work to address “disproportionately high and adverse human health or environmental effects” of their actions on “minority and low income populations”. Third, the National Environmental Policy Act (NEPA) requires federal agencies to assess the potential environmental, economic, and public health impacts on low-income and minority communities (National Environmental Policy Act 2020). While the methods and data sources for identifying environmental justice (EJ) communities vary, transportation needs assessments for these communities include the following basic steps (Santa Barbara County Association of Governments 2019; Purvis 2001; Mills and Neuhauser 2000; Forkenbrock and Sheeley 2004):

1. Estimate the size and residential locations of EJ communities, typically defined as low-income communities and/or minority ethnic groups;
2. Calculate basic travel characteristics for EJ communities versus non-EJ communities, such as mode shares;
3. Assess the coverage and level-of-service of the available transportation system alternatives, with special emphasis placed on transit and other non-auto alternatives;
4. Identify problems that need to be addressed for EJ and non-EJ communities, such as gaps in coverage, excessive travel times, high costs, etc.
Although this paper does not contain a full EJ travel needs assessment, we use this methodology to provide a lens for evaluating these big data sources, based on common planning objectives. In particular, we examine the value of these datasets for describing the travel behaviors of residents of EJ communities in the study region; however, we note that these big data do not allow for comparing mode share between groups. We also discuss how a full EJ travel needs assessment would require the fusion of traditional and big data sources due to the limitations of the two big data sources analyzed.
8.2.2 Big Data for Mobility Analysis

Commercial big data sources have been increasingly applied in the transportation industry. StreetLight Data is one such source, which gathers billions of location records from smartphones and other GPS-enabled devices (including cars and trucks themselves) and uses these records to produce datasets that describe the flow of traffic between polygons representing neighborhoods across both time and modes of transportation (StreetLight Data 2018a). StreetLight data is primarily marketed toward transportation departments and transportation-oriented private companies and has become one of the most widely applied commercial big data sources in planning practice, having been used by several leading agencies. For example, the California Department of Transportation (Caltrans) partnered with StreetLight to understand statewide bike and pedestrian activity so as to inform infrastructure needs (StreetLight Data 2018b). The Virginia Department of Transportation (VDOT) also purchased a subscription to StreetLight products that involves origin-destination (OD) matrices (Yang and Cetin 2020). The Ohio Department of Transportation purchased StreetLight Insight, a traffic data analysis and visualization platform, to support statewide planning and traffic operations (INRIX 2017a). Other applications have included calculating dwell time at national parks, finding the optimum placement of charging stations for electric vehicles, and estimating vehicle carbon emissions in various communities (StreetLight Data 2020).

Another commercial provider of mobility pattern data, SafeGraph, has also been increasingly used to assess transportation and urban management issues in several cities. Like StreetLight, SafeGraph captures raw data from GPS-enabled smartphones. However, rather than use this data to describe routes, mode choice, or traffic flows, SafeGraph offers details about activity at points of interest, including businesses, government buildings, and places of worship. This information includes, for example, how many visitors stop at a given point on a given day. SafeGraph markets this information primarily to private businesses looking for insights about their customers (or trends among competitors), such as when customers arrive, where they come from, and how long they linger at the place of business (SafeGraph 2021). Urban data scientists have been able to use SafeGraph’s point of interest data to infer origin-destination pairs useful for mobility analysis, leading to a proliferation of SafeGraph-based transportation studies in recent years. Prestby et al. (2019) constructed OD matrices with SafeGraph data for Milwaukee, Wisconsin to understand neighborhood isolation. Gao et al. (2019) integrated SafeGraph and other open data sources to predict spatiotemporal parking legality in New York City with machine learning methods. Andersen (2020) used SafeGraph visitation data to analyze social distancing trends during the COVID-19 pandemic in the U.S.
On this topic, the company itself has recently created data products specifically intended for COVID-19 pandemic-related analysis. These products use SafeGraph’s existing infrastructure to provide insight on the movement of people between census block groups rather than points of interest, which makes it simpler than before to use SafeGraph data for mobility analysis and transportation planning. SafeGraph and StreetLight operate only in the United States and in North America, respectively. However, it is worth noting that commercial mobility data is a growing sector, with a range of companies now playing the same or similar roles as SafeGraph and StreetLight both in the U.S. and around the globe. For example, INRIX, a U.S.-based company that operates worldwide, provides a suite of transportation “IQ” products that analyze traffic flows, drive time, and other metrics and span more than 30 countries, mostly across Europe, North and South America, and the Middle East (INRIX 2017b). A Belgium-based company called Sentiance is another notable example, providing similar global datasets that describe transit mode choice and driver behavior (Sentiance 2018). These global companies not only provide data to transportation agencies and businesses, but also cater to infrastructure and logistics optimization firms, insurance companies, and more.
As commercial data companies continue to proliferate across the globe, the volume, variety, and velocity of big data increasingly allow versatile applications in transportation, planning, and policy analysis. In transportation planning, big data can be used to reveal spatiotemporal transportation patterns, which have traditionally been collected through surveys. However, can big data achieve the same purposes and accuracy as conventional surveys? Our literature review has identified several studies conducting such comparisons. Alexander et al. (2015) analyzed origin-destination (OD) data generated from mobile phone call detail records against results of the Boston Household Travel Survey (1991), the Massachusetts Travel Survey (2010/2011), and the National Household Travel Survey (NHTS). They found that the call detail records can provide results comparable to the selected surveys regarding origin-destination patterns by hour of the day and trip purpose across 164 cities and towns in the Boston metropolitan area. Calabrese et al. (2011) constructed OD matrices in eight Massachusetts counties using mobile location data from a company called AirSage. They compared the key trip characteristics against the NHTS and the 2000 U.S. Census at the census tract level, concluded that AirSage data could be used to describe place of residence, and found a good relationship between the OD matrices created from mobile data and Census data at both the tract and county levels. However, this study may be less relevant for practitioners, since it involved analyzing 829 million mobile phone location data points collected from the cellphone network. These studies have demonstrated the promise of big data for describing aggregate transportation patterns, which makes using readily available big data to understand transportation behaviors an attractive prospect.
However, the existing literature pays little attention to the demographics of the users who generated these big datasets, or to the potential for unequal representation across diverse communities. Since big data collection methods involve barriers for low-income communities, there is a concern that limited representativeness may inhibit more equitable transportation planning when big data are applied at smaller spatial scales than in these studies. We adopt their method of examining big data quality by comparing OD matrices resulting from big data with those from conventional surveys, but also introduce an explicit EJ dimension to the analysis. In addition, our study contributes to the literature since it is focused on a small region. Most of the research focuses on large metropolitan areas; however, in practice, most federally mandated long-range transportation plans are prepared for small regions with fewer than 200,000 people, and 75% of metropolitan planning organizations have regions of fewer than 500,000 people. Hence, there is a lack of knowledge on whether big data is effective for transportation analysis at this scale (Goodspeed and DeBoskey 2020).
8.2.3 Case Study Overview

Our case study focuses on a portion of the Niles-Benton Harbor Metropolitan Statistical Area in Michigan, with emphasis on the twin cities: Benton Harbor and St. Joseph, Michigan. The region is home to the appliance manufacturer Whirlpool and other manufacturing firms with ties to the automotive industry. The area is also an attractive resort location, with Lake Michigan beaches within easy access of both Chicago and Detroit by highway and train. Like many midwestern U.S. cities, Benton Harbor developed a black population through the Great Migration as African Americans moved north seeking economic and social opportunity in the early 20th century. However, racial discrimination has remained a persistent problem in the real estate and labor markets, and changes in the manufacturing sector have meant many jobs have moved to exurban and rural locations, or out of the U.S. entirely. As in many U.S. regions, these patterns have resulted in a high degree of racial and economic segregation, with Benton Harbor being mostly African American, older, and with a lower household income, and St. Joseph being mostly white with higher household incomes. The study region is defined as the cities of Benton Harbor and St. Joseph and five surrounding jurisdictions (Fig. 8.1). This means that some smaller rural areas are excluded from this study. The study area is served by a network of roads and the I-94 highway, as well as the public transit system, the Twin Cities Area Transportation Authority (TCATA). TCATA provides both fixed-route and on-demand services to Benton Harbor and portions of several surrounding jurisdictions, as shown in Fig. 8.1. TCATA receives funding from the U.S. Federal government and from the City of Benton Harbor.
This analysis was conducted as part of a broader research collaboration on smart mobility between the University of Michigan and three regional organizations: TCATA, the Southwest Michigan Planning Commission, and the workforce development agency Kinexus. Table 8.1 presents the key demographics of the selected jurisdictions. There were 63,656 people living in the area, with Benton Township and Lincoln Township being the most populated local jurisdictions. Notice that the City of Benton Harbor and Benton Township have the highest concentrations of households living in poverty, as well as the lowest median household incomes and the highest percentages of African American population among the seven jurisdictions. Royalton Township had the highest median household income and the highest proportion of white residents. Although EJ analyses use different quantitative criteria to account for regional differences, in general they define EJ areas as those with low incomes that are home to minority populations (e.g., EPA 2017). Based on the context of our study area, we defined EJ communities as those with: (1) over 20% of households in poverty; and (2) over 20% of the population African American. Figure 8.1 shows the spatial distribution of census block groups that meet this definition. Census block groups which matched each criterion were mostly contiguous.
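The two-part EJ definition above amounts to a simple rule. The following is a minimal sketch rather than the code used in the study: the `Area` class and its field names are hypothetical, and the percentages are the jurisdiction-level values reported in Table 8.1, used here only for illustration (the study applies the thresholds to census block groups).

```python
from dataclasses import dataclass

# Thresholds taken from the EJ definition in the text (strictly "over 20%").
POVERTY_THRESHOLD_PCT = 20.0
AFRICAN_AMERICAN_THRESHOLD_PCT = 20.0


@dataclass(frozen=True)
class Area:
    """Hypothetical record for one spatial unit; field names are assumptions."""
    name: str
    pct_households_in_poverty: float
    pct_african_american: float


def is_ej_community(area: Area) -> bool:
    """Return True only when BOTH criteria exceed their thresholds."""
    return (
        area.pct_households_in_poverty > POVERTY_THRESHOLD_PCT
        and area.pct_african_american > AFRICAN_AMERICAN_THRESHOLD_PCT
    )


# Jurisdiction-level values from Table 8.1 (illustration only).
areas = [
    Area("Benton Township", 30.8, 46.6),
    Area("City of Benton Harbor", 39.2, 86.9),
    Area("City of St Joseph", 5.2, 3.7),
    Area("St Joseph Township", 3.1, 10.7),
]

ej_names = [a.name for a in areas if is_ej_community(a)]
print(ej_names)
```

Requiring both criteria is the stricter reading of the definition; an analyst who wanted areas meeting either criterion would replace `and` with `or`.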
Fig. 8.1 Study region with local jurisdiction boundaries, TCATA service area, fixed-route services, and environmental justice service areas. Data sources: ACS 2014–2018 5-year estimates; race and poverty status of families in the past 12 months were retrieved at the census block group level. Fixed-route service maps (2020) were available from the TCATA website (https://www.mywaythere.org/fixedroute.asp). Jurisdiction boundaries follow Michigan Minor Civil Divisions (Cities & Townships) (v17a). Census block group boundaries follow the 2018 Michigan census block groups TIGER/Line Shapefile
Fig. 8.2 Dataset spatial units: jurisdiction boundaries, census block groups, StreetLight Zones
8.3 Methodology

This section describes our data sources, the methods for creating common spatial units, and the methods for conducting the coverage and travel flow analyses.
Table 8.1 Summary demographics of seven jurisdictions of the study area, American Community Survey, 2014–2018 estimates

| Study area jurisdiction | Population | Median household income ($) | Median age | Households in poverty (%) | Age over 65 (%) | White (%) | African American (%) | Other races (%) |
|---|---|---|---|---|---|---|---|---|
| Benton Township | 14,813 | 19,788 | 34.5 | 30.8 | 15.1 | 45.0 | 46.6 | 8.4 |
| City of Benton Harbor | 9,597 | 15,044 | 32.9 | 39.2 | 10.4 | 10.3 | 86.9 | 2.8 |
| City of St Joseph | 8,356 | 37,107 | 41.9 | 5.2 | 19.0 | 87.7 | 3.7 | 8.6 |
| Lincoln Township | 14,605 | 36,172 | 43.5 | 4.2 | 20.6 | 91.7 | 2.2 | 6.1 |
| Royalton Township | 4,766 | 40,490 | 42.8 | 4.0 | 17.8 | 92.1 | 0.9 | 7.0 |
| Sodus Township | 1,743 | 27,846 | 52.2 | 5.1 | 29.3 | 88.8 | 3.7 | 7.5 |
| St Joseph Township | 9,776 | 34,360 | 48.9 | 3.1 | 22.9 | 81.8 | 10.7 | 7.5 |
| Study Area Total | 63,656 | N/A | N/A | 14.6 | 18.0 | 66.5 | 26.7 | 6.8 |

Note: Median household income and median age were retrieved from county subdivision level data; the remaining values were aggregated from census block group level data in order to be consistent with the analysis
8.3.1 Data Sources

This study utilizes three data sources, each using its own geography: StreetLight data describing travel between traffic zones in the study area, SafeGraph data summarized by census block group, and the Michigan Travel Counts III survey, also summarized by block group (Table 8.2). Practitioners typically purchase commercial datasets such as StreetLight and SafeGraph. Both were provided to us for free: SafeGraph shares data freely for academic research, and the StreetLight data had already been purchased by a local partner. As StreetLight and SafeGraph data are increasingly applied in planning practice and are intrinsically different, we chose these two as the big data sources to compare with a traditional travel survey, Michigan Travel Counts III (MTCIII) (Wilaby and Casas 2016).

The MTCIII survey provides person- and household-level weights in order to account for differences in response rates between communities, which we incorporate into our analysis. The weights are benchmarked to Census 2010 data by modeling areas, household income, and household size. The survey documentation instructs analysts to apply weights when calculating aggregate statistics by multiplying the responses by the corresponding person or household weights, which results in an estimate for the entire population (Wilaby and Casas 2016, p. 7-5).

Table 8.2 Overview of data sources

| | SafeGraph (Social Distancing Metrics) | StreetLight | Michigan Travel Counts III |
|---|---|---|---|
| Data collection source | Smartphone apps | Vehicle navigation systems and smartphone apps | Conventional log- and GPS-based household travel sample survey |
| Data structure summary | Travel statistics for census block groups (e.g. dwell time at home, median distance traveled, etc.)ᵃ | Relative levels of travel between broad geographic zones, as measured by a proprietary metric called the “StreetLight Index” | Demographic and travel data (self-log and GPS traced) for sample MI households |
| Sample size (study region) | 51 census block groups | 223 StreetLight Index records (weekday trip pairs) | 5,406 (trips); 1,030 (persons) |
| Temporal and spatial resolution | | | |
| Date data available | Jan. 2019–Dec. 2019 | Jan. 2017–Dec. 2017 | April 20–June 24, 2015; Sept. 8–Nov. 24, 2015 |
| Temporal resolution | Daily metrics for each census block group, which also include a breakdown by hour | Data is aggregated over the whole period, distinguishing peak/non-peak hours of the day and weekdays/weekends | Exact trip day, time, mode, origin and destination for weekdays only |
| Spatial resolution (Fig. 8.2) | Census block groups | Proprietary traffic zones | Census block groups |
| Variable comparison | | | |
| Demographic information | No | No | Yes |
| Day of week breakdown | Yes | Yes | Yes |
| Explicit OD pairs | Yes (destination census block groups provided for each origin census block group) | Yes (represented by analysis zone) | Yes for GPS-log households (coordinates) |
| Number of trips | Yes | Indirectly, via StreetLight Index | Yes (sample) |

ᵃ Other SafeGraph datasets track alternate metrics, like visitor statistics for discrete points of interest; see discussion of strengths and weaknesses below

One important limitation of this analysis is the temporal differences between the datasets. MTCIII is the oldest, based on data collected in 2015. StreetLight’s data is from the full calendar year of 2017, and SafeGraph’s data is from 2019. Although there has of course been some change in the region during this time, it is a very slowly growing region, and we are not aware of any significant changes to land use or transportation infrastructure during that period.
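The weighting instruction described in this section (multiply each response by its weight, then sum) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the survey's own tooling: the tuple layout, the zone names, and the weight values are hypothetical.

```python
from collections import defaultdict


def weighted_trip_totals(trips):
    """Sum person weights per origin-destination pair.

    `trips` is an iterable of (origin, destination, person_weight) tuples.
    Each sampled trip contributes its weight, not 1, to the estimate.
    """
    totals = defaultdict(float)
    for origin, destination, person_weight in trips:
        totals[(origin, destination)] += person_weight
    return dict(totals)


# Hypothetical sample: three surveyed trips with made-up person weights.
sample_trips = [
    ("Zone A", "Zone B", 50.0),
    ("Zone A", "Zone B", 70.5),
    ("Zone B", "Zone A", 60.0),
]

estimated = weighted_trip_totals(sample_trips)
print(estimated[("Zone A", "Zone B")])  # 120.5
```

Two sampled trips from Zone A to Zone B thus stand in for an estimated 120.5 trips in the population, which is why unweighted counts and weighted estimates can rank communities differently.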
8.3.2 Establishing Shared Spatial Units

The three data sources provide different spatial resolutions (Fig. 8.2). SafeGraph provides U.S. census block group data, inferring a device’s home census block group from its overnight location. MTCIII also provides home and travel location data at the census block group level. The StreetLight data available for this study is aggregated to company-defined “traffic zones,” with several zones overlapping jurisdiction boundaries, although different aggregations are available from the company. To facilitate comparing these datasets, we decided to use jurisdictions as the shared spatial unit for analysis. As most census block group boundaries are compatible with jurisdiction boundaries and the areas with error do not have high populations (e.g., an airport or park), we assigned census block group data to jurisdictions by geographic centroid. For StreetLight data, we retain the zones but label them by the jurisdictions they contain. In addition, population and other demographics at the census block group level were available from the American Community Survey (ACS) 2014–2018 5-year estimates. Aggregating data by spatial units inevitably introduces error; however, we follow advice to use the smallest unit possible to minimize this problem.
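Assignment by geographic centroid can be illustrated with a short, dependency-free sketch. It assumes simple (non-self-intersecting) polygons in planar coordinates, and all shapes and names below are hypothetical; a production workflow would more likely rely on a GIS library and projected shapefile geometries.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0  # twice the signed area (shoelace formula)
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(point, ring):
    """Ray-casting test: count edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def assign_by_centroid(block_groups, jurisdictions):
    """Map each block group id to the first jurisdiction containing its centroid."""
    assignment = {}
    for bg_id, ring in block_groups.items():
        centroid = polygon_centroid(ring)
        assignment[bg_id] = next(
            (name for name, j_ring in jurisdictions.items()
             if point_in_polygon(centroid, j_ring)),
            None,  # centroid falls outside every jurisdiction
        )
    return assignment


# Hypothetical geometry: two adjacent square jurisdictions.
jurisdictions = {
    "West": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "East": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
# bg_2 straddles the boundary, but its centroid (12, 3) lies in "East".
block_groups = {
    "bg_1": [(2, 2), (6, 2), (6, 6), (2, 6)],
    "bg_2": [(8, 1), (16, 1), (16, 5), (8, 5)],
}

assignment = assign_by_centroid(block_groups, jurisdictions)
print(assignment)
```

The `bg_2` case shows the source of the error noted above: a unit that overlaps two jurisdictions is counted entirely in the one that holds its centroid.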
8.3.3 Coverage Analysis

In order to compare the representativeness of the big data and traditional data sources, we conducted a coverage analysis to compare the proportion of the population contained in each dataset. We were only able to include SafeGraph and MTCIII in this analysis, as the home locations of the people providing data to StreetLight were not available. We compared the percentage of devices (SafeGraph) and the actual and weighted samples from MTCIII to the total estimated population from the ACS at the census block group and jurisdiction levels.
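The per capita comparison reduces to simple ratios. The sketch below uses the population estimates, SafeGraph device counts, and unweighted MTCIII respondent counts reported in Table 8.3; the variable and function names are illustrative assumptions rather than part of the original analysis.

```python
# (population estimate, SafeGraph devices, MTCIII unweighted respondents),
# values as reported in Table 8.3.
coverage_inputs = {
    "City of Benton Harbor": (9597, 1282, 77),
    "City of St. Joseph": (8356, 770, 168),
    "Benton Township": (14813, 1766, 157),
    "Sodus Township": (1743, 150, 39),
    "Royalton Township": (4766, 343, 109),
    "Lincoln Township": (14605, 1258, 304),
    "St. Joseph Township": (9776, 893, 176),
}


def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as in the table."""
    return round(100.0 * numerator / denominator, 1)


devices_per_capita = {
    name: pct(devices, population)
    for name, (population, devices, _respondents) in coverage_inputs.items()
}

total_population = sum(p for p, _, _ in coverage_inputs.values())
total_devices = sum(d for _, d, _ in coverage_inputs.values())

for name, value in devices_per_capita.items():
    print(f"{name}: {value}% devices per capita")
print(f"Region: {pct(total_devices, total_population)}% devices per capita")
```

The same `pct` helper applies to the survey columns by substituting respondent counts (or weighted sample sizes) for device counts.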
8.3.4 Travel Flow Analysis

The travel flow analysis compares the representations of regional travel described by each dataset, for travel occurring within the region only. For MTCIII, we count trips between census block groups and adjust them using the person weights before summarizing travel at the jurisdiction level. StreetLight data is represented using the StreetLight Index, a normalized figure based on the number of local devices, trips, population, etc. found at the traffic zone level. The StreetLight Index values for zones which contain multiple jurisdictions therefore could not be divided. To overcome this limitation, we used a percent index (the StreetLight Index of the designated aggregated zones divided by the sum of the index for the study area) as an indicator of origin-destination patterns between the aggregated StreetLight zones shown in Fig. 8.2. Trips recorded by MTCIII are represented as the weighted trips within and between jurisdictions as a percentage of the total weighted trips in the study area. To directly compare the data, we then aggregate the jurisdictions which are also combined in the StreetLight data. One challenge facing such an analysis is that StreetLight uses an index, which provides a relative and not an absolute measure. This index is put in abstract terms that are left undefined by StreetLight itself; in other words, it is unclear how many actual trips are indicated by a given level on the index. However, one study which used StreetLight data provides some clarification, stating that this proprietary figure is “meant to be consistent across geographies and over time but does not represent actual trip numbers” (SSTI 2017). The StreetLight Index is normalized monthly to adjust for population sampling biases among census block groups (StreetLight Data 2018a).
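Because the index is relative, the comparison rests on shares of a regional total. A minimal sketch of the percent index follows; the zone names and index values are hypothetical, and the function name is an assumption made for illustration.

```python
def percent_of_total(values):
    """Express each value as a percentage of the sum of all values."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive to compute shares")
    return {key: 100.0 * value / total for key, value in values.items()}


# Hypothetical StreetLight Index values for (origin zone, destination zone) pairs.
streetlight_index = {
    ("Zone 1", "Zone 1"): 500.0,
    ("Zone 1", "Zone 2"): 300.0,
    ("Zone 2", "Zone 1"): 200.0,
}

percent_index = percent_of_total(streetlight_index)
print(percent_index[("Zone 1", "Zone 2")])  # 30.0
```

Applying the same function to weighted MTCIII trip totals puts both sources on a common 0 to 100 scale, although the two shares still measure different underlying quantities.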
SafeGraph’s Social Distancing Metrics dataset provides daily data for each census block group, including other census block groups which were visited by devices that reside in the origin census block group. While SafeGraph began publishing this data in 2020 for the purpose of research into mobility and social distancing amid the ongoing COVID-19 pandemic, the dataset has been computed from historical data back to January 1st, 2019. This made it possible to construct an origin-destination matrix that maps census block groups onto each other for the pre-pandemic travel environment, which we then aggregated across the entire year of 2019.
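Building the annual origin-destination matrix amounts to summing daily destination counts per origin. The sketch below is illustrative only: it assumes each daily record carries an origin identifier and a JSON-encoded mapping of destination block groups to device counts, with field names (`origin_census_block_group`, `destination_cbgs`) modeled on the published schema but treated here as assumptions, and the block group identifiers and counts are made up.

```python
import json
from collections import Counter


def build_od_matrix(daily_records):
    """Aggregate daily destination counts into one OD matrix.

    Returns a Counter keyed by (origin_cbg, destination_cbg).
    """
    od = Counter()
    for record in daily_records:
        origin = record["origin_census_block_group"]
        destinations = json.loads(record["destination_cbgs"])
        for destination, device_count in destinations.items():
            od[(origin, destination)] += device_count
    return od


# Hypothetical records: two days for one origin, one day for another.
daily_records = [
    {"origin_census_block_group": "260210001001",
     "destination_cbgs": '{"260210001001": 40, "260210002001": 5}'},
    {"origin_census_block_group": "260210001001",
     "destination_cbgs": '{"260210001001": 35, "260210002001": 7}'},
    {"origin_census_block_group": "260210002001",
     "destination_cbgs": '{"260210002001": 20, "260210001001": 3}'},
]

od_matrix = build_od_matrix(daily_records)
print(od_matrix[("260210001001", "260210002001")])  # 12
```

Rolling the block group pairs up to jurisdictions then only requires replacing each identifier with its assigned jurisdiction before summing.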
8.4 Results

8.4.1 Coverage Analysis

Our population coverage analysis of MTCIII and the SafeGraph dataset is shown in Table 8.3 and Fig. 8.3. StreetLight data has been omitted from this table, as it includes no information about the underlying sample size. As the table illustrates, SafeGraph’s dataset captured just over six times as many people (assuming one device per person) as the number of respondents to the traditional survey. Calculating the number of SafeGraph devices per capita shows the highest coverage for Benton Township and the City of Benton Harbor, and lower values for surrounding municipalities. After applying the provided weights, the traditional survey still resulted in an effective sample size for the City of Benton Harbor and Benton Township that is a smaller proportion of the regional population than estimated from ACS data. The effective sample size for all other communities is a higher proportion of the regional population than described by the ACS data. The share of the total population providing data is 10.2% for SafeGraph and 1.75% for the unweighted MTCIII sample.

In addition to the municipal-level data (Table 8.3), we can analyze each dataset at a finer spatial resolution. Figure 8.4 shows the number of devices per capita for SafeGraph and respondents per capita for the MTCIII at the census block group level. In general, although SafeGraph covers higher percentages of the population, it demonstrates more variance among jurisdictions and across census block groups. In comparison, the sample population percentage of MTCIII is distributed more evenly among jurisdictions and census block groups, since MTCIII used random samples stratified by household size and income. In the Twin-Cities Area, 6.4% of the recruited households participated in the survey, with a retrieval rate of 65.4% (Wilaby and Casas 2016). The different sampling rates among different types of households result in the need for weighting in the analysis, but since the weights are not calibrated to local jurisdiction populations, they reduce but do not eliminate geographic bias.

Table 8.3 Population coverage analysis of SafeGraph and Michigan Household Travel survey

| Jurisdiction | Population estimateᵃ | Proportion of regional population (%) | SafeGraph: # devices | SafeGraph: % of regional devices | SafeGraph: devices per capita (%) | MTCIII: # survey sample population | MTCIII: sample adjusted by person weights | MTCIII: adjusted sample size as % of region | MTCIII: adjusted sample size per capita (%) |
|---|---|---|---|---|---|---|---|---|---|
| City of Benton Harbor | 9,597 | 15.1 | 1,282 | 19.8 | 13.4 | 77 | 6,130 | 8.4 | 63.9 |
| City of St. Joseph | 8,356 | 13.1 | 770 | 11.9 | 9.2 | 168 | 10,923 | 15.0 | 130.7 |
| Benton Township | 14,813 | 23.3 | 1,766 | 27.3 | 11.9 | 157 | 12,292 | 16.9 | 83.0 |
| Sodus Township | 1,743 | 2.7 | 150 | 2.3 | 8.6 | 39 | 2,513 | 3.5 | 144.2 |
| Royalton Township | 4,766 | 7.5 | 343 | 5.3 | 7.2 | 109 | 9,213 | 12.7 | 193.3 |
| Lincoln Township | 14,605 | 22.9 | 1,258 | 19.5 | 8.6 | 304 | 20,198 | 27.8 | 138.3 |
| St. Joseph Township | 9,776 | 15.4 | 893 | 13.8 | 9.1 | 176 | 11,462 | 15.8 | 117.3 |
| Totals | 63,656 | 100 | 6,462 | 100 | 10.2 | 1,030 | 72,734 | 100 | 100 |

ᵃ Population data: ACS 2014–2018 5-year estimate

Fig. 8.3 Proportion of population, SafeGraph devices, and MTCIII responses by local jurisdiction (bar chart; series: proportion of regional population (2014–2018 ACS), SafeGraph devices as % of region, weighted sample size as % of region)

Fig. 8.4 SafeGraph devices and MTCIII samples per capita, by census block group
8.4.2 Travel Flow Comparison

Next, we sought to compare the representation of regional travel flows portrayed by each dataset. This analysis also uses the local jurisdiction as the shared spatial unit. To compare the representation of travel flow, we constructed OD matrices from each of the three datasets and compared them. As noted, due to differences in the spatial units used by the datasets, the comparison with StreetLight data aggregates some communities. The results are shown in Fig. 8.5. To facilitate comparison, Fig. 8.6 shows the percentage of regional travel occurring within, or originating or ending in, each jurisdiction. The percentage of travel represented by each flow before aggregation is provided in Appendix A.
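One plausible way to compute the share of regional travel occurring within, or originating or ending in, each jurisdiction from an OD matrix is sketched below. The chapter does not state how a trip between two jurisdictions is apportioned, so this sketch assumes such a trip counts in full toward both endpoints; the jurisdiction labels and flow values are hypothetical.

```python
def travel_share_by_jurisdiction(od):
    """Percent of all flow that starts, ends, or stays in each jurisdiction.

    `od` maps (origin, destination) to a flow value. Assumption: a flow
    between two different jurisdictions counts fully toward both, so the
    shares can sum to more than 100%.
    """
    total = sum(od.values())
    if total <= 0:
        raise ValueError("total flow must be positive")
    touched = {}
    for (origin, destination), flow in od.items():
        # A set avoids double counting internal trips (origin == destination).
        for jurisdiction in {origin, destination}:
            touched[jurisdiction] = touched.get(jurisdiction, 0.0) + flow
    return {name: 100.0 * flow / total for name, flow in touched.items()}


# Hypothetical OD matrix for two jurisdictions.
od = {
    ("A", "A"): 40.0,
    ("A", "B"): 10.0,
    ("B", "A"): 10.0,
    ("B", "B"): 40.0,
}

shares = travel_share_by_jurisdiction(od)
print(shares)
```

Under the alternative assumption that an inter-jurisdiction trip is split evenly between its endpoints, the shares would instead sum to exactly 100%.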
8.5 Discussion

This section discusses what the results reveal about the potential benefits and risks of utilizing big data for EJ travel needs assessments, and about the unique strengths and weaknesses of these big data sources.
Fig. 8.5 Representations of regional travel flows from SafeGraph, MTCIII, and StreetLight

Fig. 8.6 Comparison of share of travel within, or to or from, study area jurisdictions contained in MTC, SafeGraph, and StreetLight data (bar chart; series: MTC as share of regional trips, SafeGraph travel share, StreetLight travel share; jurisdictions: City of Benton Harbor, Benton Township, City of St. Joseph, Lincoln Township and St. Joseph Township, Royalton Township, Sodus Township)
8.5.1 Potential Benefits and Risks of Big Data for EJ Travel Needs Assessment

Our data illustrate the weakness of conventional household travel surveys. Despite a design seeking a random sample with equal representation of all communities, differential response rates by residents of different locations mean that the effective response rates in the affluent City of St. Joseph and in St. Joseph and Royalton Townships are the highest in the region, with lower response rates in Benton Harbor and Benton Township, home to the region’s African American community and to many people who are dependent on transit due to the lack of an automobile or driver’s license. Statistical weights only partly mitigate this problem, since they are calibrated to regional totals for households of different characteristics and not to jurisdiction populations. These weaknesses may explain the patterns seen in the travel flow analysis. Broadly speaking, the three datasets result in roughly similar depictions of regional travel (Fig. 8.6). MTCIII and SafeGraph show a similar amount of travel in the City of Benton Harbor and Benton Township; however, the two diverge for the City of St. Joseph, and to a lesser extent Royalton Township, where MTCIII shows a much higher percentage, perhaps due to over-representation of households in that community not addressed by weighting. It may also be possible that both MTCIII and SafeGraph share biases which resulted in under-estimation of travel for the City of Benton Harbor and Benton Township, and over-estimation of travel for the more affluent Lincoln Township and St. Joseph Township. Turning to the StreetLight data, it describes a higher proportion of regional travel for the City of Benton Harbor and the City of St. Joseph, both of which are home to many economic, recreational, and other destinations which draw travel from outside of the region. It is possible that this dataset shows this because it may include traffic from people who do not live in the region, whereas both MTCIII and SafeGraph are limited to residents only. The analysis reported here reveals that the two big data sources may have strengths for professionals interested in studying EJ communities. Our coverage analysis showed that SafeGraph data contain more devices for Benton Harbor and Benton Township as a percentage of the regional total than those communities contain in population. However, examining the spatial detail raises cause for concern about how this company assigns devices to census block groups.
Conversely, the MTCIII data have a much more evenly distributed sample; however, there is evidence that the survey underrepresents households in these EJ communities, even after applying weights.
Regarding the travel flow analysis, the differences in representation result in different estimates of the volume of travel to and from particular communities. As shown in Fig. 8.6, both SafeGraph and StreetLight result in OD matrices with a similar share of regional travel within, originating in, or ending in the City of Benton Harbor or Benton Township. Conversely, the MTCIII survey results in greater total travel to or from the affluent City of St. Joseph than either big data source, potentially overrepresenting the travel of these residents. To conclude, our analysis shows that big data do not seem to dramatically underrepresent EJ communities. To the contrary, they may suggest that the traditional source overrepresented travel from affluent jurisdictions. However, it is possible that all of these datasets contain biases affecting their value for detailed transportation planning for these communities.
8 Assessing the Value of New Big Data Sources …
8.5.2 Addressing Strengths and Weaknesses Through Data Fusion
Each of the datasets has unique strengths for transportation planning professionals. The major benefits of the MTCIII survey are consistent with all surveys of this type: the use of random sampling to pursue statistical generalizability, and a high level of detail about respondents’ backgrounds, trip times and purposes, and mode choice. As a result, this analysis does not suggest that big data sources can replace conventional household travel surveys, which remain the only way to answer many important questions. However, it suggests the potential for data fusion to combine the strengths of different data sources to develop a more complete understanding of regional travel.
SafeGraph’s Social Distancing Metrics dataset has the benefit of offering origin-destination pairs between census block groups at a big data scale, much larger than could ever reasonably be collected via survey. It is also worth noting that this is but one dataset provided by SafeGraph. Before the advent of the Social Distancing Metrics dataset, our own study initially considered inferring an OD matrix from another dataset the company offers, simply called Places. Though Social Distancing Metrics proved more useful for our analysis, this alternate dataset is still worth discussing, as it yields even higher spatial resolution at the level of points of interest, including local businesses, schools, churches, and public administration buildings (see Fig. 8.7). This dataset contains detailed data about the number of visitors to these points of interest by time of day and across time, which can be valuable for detailed transportation modeling and analysis. Although neither dataset contains direct demographic data from users, because the dataset provides home locations in terms of
Fig. 8.7 SafeGraph Points of Interest within study region and POI in detail at Benton Harbor
census block group, it may be possible to infer the demographic representativeness of the data. The analysis here shows that SafeGraph’s data have some important weaknesses, however. Based on our comparison with U.S. Census estimates, the data may represent an inconsistent sample across geographic locations; intriguingly, however, the sample may over-represent both lower-income African American communities and high-income white communities. In addition, the census block group with the highest residential population also contains a large supermarket and other stores, suggesting there may be a flaw in the company’s algorithm for assigning residency, for example erroneously assigning this census block group to certain shoppers as their home location.
Finally, this study adds to the body of evidence suggesting that StreetLight data are broadly consistent with real-world travel behaviors. The main strength of the dataset is its detailed temporal breakdown, as it contains travel index values for different times of day for weekdays and weekends. However, its major weakness is the lack of detail about how the data are created and about the demographics of the underlying individuals whose vehicles and smartphones are being tracked.
As much of the big data application research focuses on large metropolitan areas (Prestby et al. 2019; Gao et al. 2019; Alexander et al. 2015; Calabrese et al. 2011), our research explicitly addressed big data’s value in small-region transportation analysis. With its high spatial resolution, SafeGraph can be easily applied to small regions and can complement conventional surveys like the MTCIII by giving analysts the ability to pinpoint activity centers. Big data such as SafeGraph can also complement conventional surveys by providing data over much broader time periods than surveys, allowing for analysis of seasonal and longitudinal change.
Such data may nonetheless be harder for planning agencies in small regions to acquire because of limited staff or budgets. The application of StreetLight data, moreover, is more limited in small regions. As StreetLight represents data in a more closed system, that is, with StreetLight traffic zones and the StreetLight Index, it is harder for planners to use its data as a complement to traditional data sources, which are often collected based on municipal boundaries. For larger regions, StreetLight data can sufficiently demonstrate traffic patterns, and its index also provides a fast and easy way to understand general trends. Nonetheless, in small regions, planners may need more detailed and nuanced information, such as the data available from SafeGraph.
8.6 Conclusion
Big data has been heralded as a potentially revolutionary development for transportation planning. Looking to move beyond the hype, this study conducted a detailed empirical analysis of two widely available big data products used by transportation professionals, SafeGraph and StreetLight, comparing them with a conventional household travel survey. Contrary to our expectations, both datasets seemed to represent the residents and travel behaviors of two EJ communities within our study region. However, SafeGraph data also showed an uneven distribution within
these jurisdictions, raising concerns about data quality. The lack of detailed demographic data, and of other information collected only by surveys, suggests that big data should be thought of as a complement to, not a substitute for, traditional travel surveying techniques.
Appendix A Summary of travel flow information

| Community 1 | Community 2 | SafeGraph (%) | MTC % of regional trips (%) | StreetLight % of regional index values (%) |
|---|---|---|---|---|
| Benton Harbor | Benton Township | 4.0 | 5.8 | 8.3 |
| Benton Harbor | St. Joseph | 0.9 | 2.7 | 5.8 |
| Benton Harbor | Lincoln Township | 0.7 | 1.3 | 5.9 |
| Benton Harbor | St. Joseph Twp | 0.8 | 0.7 | |
| Benton Harbor | Sodus Township | 0.1 | 0.1 | 0.1 |
| Benton Harbor | Royalton Township | 0.3 | 0.7 | 1.5 |
| St. Joseph | Benton Township | 2.2 | 4.2 | 5.2 |
| St. Joseph | Lincoln Township | 3.3 | 5.8 | 11.2 |
| St. Joseph | St. Joseph Twp | 3.6 | 7.2 | |
| St. Joseph | Sodus Township | 0.3 | 0.2 | 0.1 |
| St. Joseph | Royalton Township | 1.3 | 4.1 | 3.2 |
| Benton Township | Lincoln Township | 2.4 | 3.8 | 7.8 |
| Benton Township | St. Joseph Twp | 2.9 | 3.6 | |
| Benton Township | Sodus Township | 0.6 | 1.4 | 0.7 |
| Benton Township | Royalton Township | 1.1 | 2.6 | 3.5 |
| Sodus Township | St. Joseph Twp | 0.2 | 0.1 | 0.4 |
| Sodus Township | Lincoln Township | 0.3 | 0.4 | |
| Sodus Township | Royalton Township | 0.1 | 0.2 | 0.1 |
| Royalton Township | St. Joseph Twp | 1.3 | 3.5 | 7.8 |
| Royalton Township | Lincoln Township | 1.9 | 4.2 | |
| St. Joseph Twp | Lincoln Township | 3.4 | 4.4 | N/A |
| Benton Harbor | Benton Harbor | 7.4 | 3.3 | 4.0 |
| St. Joseph | St. Joseph | 7.5 | 8.9 | 4.4 |
| Benton Township | Benton Township | 16.6 | 9.6 | 10.5 |
| Lincoln Township | Lincoln Township | 18.7 | 13.6 | 17.0 |
| St. Joseph Twp | St. Joseph Twp | 11.9 | 3.6 | |
| Sodus Township | Sodus Township | 1.7 | 0.7 | 0.4 |
| Royalton Township | Royalton Township | 4.4 | 3.1 | 2.1 |

1 StreetLight data includes Bainbridge Township
2 StreetLight zone includes both jurisdictions, as well as a portion of Lake Township
3 StreetLight data includes Pipestone Township
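The pairwise shares in Appendix A can be aggregated to the community level, which is how statements such as "share of regional travel within, originating in, or ending in Benton Harbor or Benton Township" can be checked. The following is a minimal sketch, not the authors' procedure: it transcribes only the SafeGraph and MTC columns, and the function name `share_involving` and the constant names are introduced here for illustration.

```python
# Pairwise shares of regional travel (%), transcribed from Appendix A:
# (community 1, community 2, SafeGraph share, MTC share).
PAIR_SHARES = [
    ("Benton Harbor", "Benton Township", 4.0, 5.8),
    ("Benton Harbor", "St. Joseph", 0.9, 2.7),
    ("Benton Harbor", "Lincoln Township", 0.7, 1.3),
    ("Benton Harbor", "St. Joseph Twp", 0.8, 0.7),
    ("Benton Harbor", "Sodus Township", 0.1, 0.1),
    ("Benton Harbor", "Royalton Township", 0.3, 0.7),
    ("St. Joseph", "Benton Township", 2.2, 4.2),
    ("St. Joseph", "Lincoln Township", 3.3, 5.8),
    ("St. Joseph", "St. Joseph Twp", 3.6, 7.2),
    ("St. Joseph", "Sodus Township", 0.3, 0.2),
    ("St. Joseph", "Royalton Township", 1.3, 4.1),
    ("Benton Township", "Lincoln Township", 2.4, 3.8),
    ("Benton Township", "St. Joseph Twp", 2.9, 3.6),
    ("Benton Township", "Sodus Township", 0.6, 1.4),
    ("Benton Township", "Royalton Township", 1.1, 2.6),
    ("Sodus Township", "St. Joseph Twp", 0.2, 0.1),
    ("Sodus Township", "Lincoln Township", 0.3, 0.4),
    ("Sodus Township", "Royalton Township", 0.1, 0.2),
    ("Royalton Township", "St. Joseph Twp", 1.3, 3.5),
    ("Royalton Township", "Lincoln Township", 1.9, 4.2),
    ("St. Joseph Twp", "Lincoln Township", 3.4, 4.4),
    ("Benton Harbor", "Benton Harbor", 7.4, 3.3),
    ("St. Joseph", "St. Joseph", 7.5, 8.9),
    ("Benton Township", "Benton Township", 16.6, 9.6),
    ("Lincoln Township", "Lincoln Township", 18.7, 13.6),
    ("St. Joseph Twp", "St. Joseph Twp", 11.9, 3.6),
    ("Sodus Township", "Sodus Township", 1.7, 0.7),
    ("Royalton Township", "Royalton Township", 4.4, 3.1),
]

SAFEGRAPH, MTC = 0, 1  # column selectors (illustrative names)


def share_involving(communities, column):
    """Sum the shares of all pairs with at least one endpoint in `communities`."""
    return sum(
        row[2 + column]
        for row in PAIR_SHARES
        if row[0] in communities or row[1] in communities
    )


# The two EJ communities discussed in the chapter.
EJ_COMMUNITIES = {"Benton Harbor", "Benton Township"}
safegraph_share = share_involving(EJ_COMMUNITIES, SAFEGRAPH)
mtc_share = share_involving(EJ_COMMUNITIES, MTC)
print(f"SafeGraph: {safegraph_share:.1f}%  MTC: {mtc_share:.1f}%")
```

With the values as transcribed, the two sources give nearly identical shares for travel involving the two EJ communities (about 40.0% for SafeGraph and 39.8% for MTC). The StreetLight column is left out because several of its zones combine jurisdictions.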
References
Albright C (2018) Planning on the cellular level. Planning (Chicago, Ill. 1969) 84(5):24–28
Alexander L, Jiang S, Murga M, González MC (2015) Origin-destination trips by purpose and time of day inferred from mobile phone data. Transp Res Part C Emerg Technol 58:240–250. https://doi.org/10.1016/j.trc.2015.02.018
Andersen M (2020) Early evidence on social distancing in response to COVID-19 in the United States. SSRN Electron J. https://doi.org/10.2139/ssrn.3569368
Boyce DE, Williams HCWL (2015) Forecasting urban travel: past, present and future. Edward Elgar Publishing Limited, Northampton, Massachusetts
Calabrese F, Di Lorenzo G, Liu L, Ratti C (2011) Estimating origin-destination flows using mobile phone location data. IEEE Pervasive Comput 10(4):36–44. https://doi.org/10.1109/mprv.2011.41
Forkenbrock DJ, Sheeley J (2004) Effective methods for environmental justice assessment. Transportation Research Board, Washington, D.C.
Gao S, Li M, Liang Y, Marks J, Kang Y, Li M (2019) Predicting the spatiotemporal legality of on-street parking using open data and machine learning. Ann GIS 25(4):299–312. https://doi.org/10.1080/19475683.2019.1679882
Gareis-Smith D (1994) Environmental racism: the failure of equal protection to provide a judicial remedy and the potential of Title VI of the 1964 Civil Rights Act. Temple Environ Law Technol J 13(1):57
Goodspeed R, DeBoskey D (2020) Scenario planning for smaller places: aligning methods and context. Lincoln Institute of Land Policy. https://www.lincolninst.edu/publications/working-papers/scenario-planning-smaller-places. Accessed 18 Jan 2021
INRIX (2017a) Ohio DOT selects INRIX and StreetLight data for on-demand mobility intelligence. INRIX. https://inrix.com/press-releases/ohiodot/. Accessed 25 Oct 2020
INRIX (2017b) Global traffic scorecard. INRIX. https://inrix.com/wp-content/uploads/2018/01/inrix_trafficscorecard_global_en_lr-2017.pdf. Accessed 18 Jan 2021
Jiang S, Ferreira J, Gonzalez MC (2017) Activity-based human mobility patterns inferred from mobile phone data: a case study of Singapore. IEEE Trans Big Data 3(2):208–219. https://doi.org/10.1109/tbdata.2016.2631141
Mills GS, Neuhauser KS (2000) Quantitative methods for environmental justice assessment of transportation. Risk Anal 20(3):377–384. https://doi.org/10.1111/0272-4332.203036
National Environmental Policy Act (NEPA) (2020) Environmental justice. https://ceq.doe.gov/nepa-practice/justice.html. Accessed 6 Nov 2020
Nostikasari D (2015) Representations of everyday travel experience: case study of the Dallas-Fort Worth Metropolitan Area. Transp Policy 44:96–107. http://dx.doi.org/10.1016/j.tranpol.2015.06.008
Prestby T, App J, Kang Y, Gao S (2019) Understanding neighborhood isolation through spatial interaction network analysis using location big data. Environ Plann A, 52(6):0308518X19891911031. https://doi.org/10.1177/0308518x19891911
Purvis CL (2001) Data and analysis methods for metropolitan-level environmental justice assessment. Transp Res Rec 1756(1):15–21. https://doi.org/10.3141/1756-02
Raphael S, Berube A, Deakin E (2006) Socioeconomic differences in household automobile ownership rates: implications for evacuation policy. The University of California Transportation Center
SafeGraph (2021) Places schema. https://docs.safegraph.com/docs. SafeGraph. Accessed 18 Jan 2021
Santa Barbara County Association of Governments (SBCAG) (2019) Transit needs assessment 2019. Santa Barbara County Association of Governments. http://www.sbcag.org/uploads/2/4/5/4/24540302/transit_needs_assessment_2019_report.pdf. Accessed 25 Oct 2020
Sentiance (2018) A new world-wide dataset of location and region types. Sentiance. https://www.sentiance.com/2018/09/26/location/. Accessed 18 Jan 2021
State Smart Transportation Initiative (SSTI) (2017) Understanding trip-making with big data: a connecting Sacramento summary brief. State Smart Transportation Initiative. https://ssti.us/wp-content/uploads/sites/1303/2020/05/SSTI_Connecting_Sacramento_Tripmaking.pdf. Accessed 25 Oct 2020
Stopher PR, Greaves SP (2007) Household travel surveys: where are we going? Transp Res Part A Policy Pract 41(5):367–381. https://doi.org/10.1016/j.tra.2006.09.005
StreetLight Data (2018a) StreetLight insight: our methodology and data sources. StreetLight Data. https://3yemud1nnmpw4b6m8m3py1iy-wpengine.netdna-ssl.com/wp-content/uploads/StreetLight-Data_Methodology-and-Data-Sources_181008.pdf. Accessed 25 Oct 2020
StreetLight Data (2018b) Real-world big data for active transportation planning. StreetLight Data. https://www.streetlightdata.com/wp-content/uploads/SLD-Flyer-CalTrans-v4.pdf. Accessed 25 Oct 2020
StreetLight Data (2020) Resources: success stories across North America. StreetLight Data. https://www.streetlightdata.com/transportation-planning-success-stories/. Accessed 18 Jan 2021
Wilaby M, Casas J (2016) MI travel counts III: household travel survey final methodology report. Michigan Department of Transportation. https://www.michigan.gov/documents/mdot/MTC_III_Methodology_Report_554340_7.pdf. Accessed 25 Oct 2020
Yang H, Cetin M (2020) Guidelines for using StreetLight Data for planning tasks. Virginia Transportation Research Council (VTRC). http://www.virginiadot.org/vtrc/main/online_reports/pdf/20-R23.pdf. Accessed 25 Oct 2020
Zannat KE, Choudhury CF (2019) Emerging big data sources for public transport planning: a systematic review on current state of art and future research directions. J Indian Inst Sci 99(4):601–619. https://doi.org/10.1007/s41745-019-00125-9
Chapter 9
How Various Natural Disasters Impact Urban Human Mobility Patterns: A Comparative Analysis Based on Geotagged Photos Taken in Tokyo
Ahmed Derdouri and Toshihiro Osaragi

Abstract This study analyzed the impacts of various extreme natural events that affected Tokyo’s human movement patterns between 2008 and 2019 based on geotagged photos shared online. First, we selected six disasters of different types according to severity, damage, and photo count. Next, we delineated three phases representing steady and perturbed conditions (before, during, and after) for each extreme event based on relevant weather measurements. We then analyzed human mobility patterns via two indicators: displacement and mean squared displacement (MSD). A transfer-learning-based convolutional neural network (CNN) model was also developed to classify the photos according to whether they were taken indoors or outdoors. Thus, we analyzed the characteristics of people’s trips within and between the two environments. The results show that, while all extreme events perturbed mobility patterns to different degrees, these patterns mostly followed a truncated power-law distribution during steady and unsteady states.

Keywords Human mobility · Natural disasters · Flickr · Geotagged photos · Convolutional neural network · Tokyo · Japan
A. Derdouri (B) · T. Osaragi, School of Environment and Society, Tokyo Institute of Technology, 2-12-1-M1-25 Ookayama, Meguro-Ku, Tokyo 152-8550, Japan
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. S. C. M. Geertman et al. (eds.), Urban Informatics and Future Cities, The Urban Book Series, https://doi.org/10.1007/978-3-030-76059-5_9

9.1 Introduction
Earthquakes, tropical cyclones, and floods pose constant threats to urban communities. Given their growing population densities and interdependent infrastructures, metropolitan areas suffer most from such events in terms of human casualties, infrastructural damage, and economic losses (Gu 2019). For instance, the New York metropolitan area, which accounts for 8% of the US’s economic output, was hit by Hurricane Sandy on Monday, October 29, 2012. Forty-eight lives were lost in New York City; there were 72 deaths across eight states; and economic losses climbed to USD 68 billion (Schelske et al. 2013). Additionally, these disasters have both short- and long-term negative impacts on the population’s daily routines, as they affect essential mobility patterns.
Our current understanding of daily human mobility trajectories under steady conditions has improved due to extensive studies on general human movements. During normal conditions, human mobility most likely fits a truncated power-law distribution, with values of the scaling parameter β ranging between 1 and 1.82 (Alessandretti et al. 2017). Studies have also focused on analyzing urban human movements under perturbed states, caused mainly by extreme natural events. Most of these studies have relied on text-based, geotagged records from social networks, generally Twitter because of its freely accessible Application Programming Interface (API) and large amount of data. Fewer studies have used photo-based records collected through photo-sharing platforms, such as Flickr, which provides fewer geotagged records. However, with the advancement of deep-learning techniques, geotagged photos could offer more insight into human mobility patterns during both steady and unsteady conditions.
This study, therefore, investigated and compared urban human mobility patterns before, during, and after six disasters of different types, selected from among the extreme events that impacted Tokyo between July 2008 and December 2019. Urban mobility was evaluated based on geotagged, timestamped Flickr photos by calculating two mobility measurements, displacement and mean squared displacement (MSD), in addition to determining people’s period of stay in both indoor and outdoor environments and the time spent traveling between them. The research questions were as follows.
1. How can human mobility be described during different natural disasters?
2. To what extent is human mobility disturbed by different natural disasters?
3. Are photo-based, geotagged social data capable of modeling such mobility?
The originality of this methodology is its use of geotagged photos instead of text-based geodata to explore patterns of urban human movements affected by natural disasters. The development of a convolutional neural network (CNN)-based model to categorize photos according to whether they were taken in indoor or outdoor environments made it possible to investigate variations in stay periods and travel times between the two environments during different periods of each extreme event. The remainder of this article is organized as follows. Section 9.2 provides a comprehensive literature review. Section 9.3 overviews the study area’s geographical settings, as well as the chronology of the natural disasters that impacted Tokyo between 2008 and 2019. Section 9.4 presents the study’s methodology and describes its key steps. Section 9.5 outlines the obtained results. Section 9.6 discusses the study’s findings, implications, and limitations and suggests directions for future research; and Sect. 9.7 draws conclusions.
9 How Various Natural Disasters Impact Urban …
9.2 Literature Review
Social networks present an enormous opportunity as a source of spatial data with rich metadata. Recently, multiple studies analyzing human mobility patterns during unsteady conditions have been carried out with geotagged social data. These studies differ considerably in their data sources, approaches, study periods, and the spatial extent of the target area. Text-based geotagged data sources are most popular among researchers, given the large number of posts, whereas photo-based geodata studies analyzing the impacts of extreme events are relatively rare. This section presents the most relevant previous studies that have employed social data (largely text- and photo-based geotagged data) to analyze human mobility during disastrous events.
Regarding text-based geotagged data, researchers have principally used Twitter to analyze how human mobility patterns are affected by disasters. New Yorkers’ tweets were used to compare their movement trajectories during steady conditions and perturbation states caused by Hurricane Sandy (Wang and Taylor 2014). Though human mobility patterns were significantly perturbed during the hurricane, they also showed high resilience, based on the correlated values for the center of mass and the radius of gyration of people’s movements during both steady and perturbed states. Human mobility was also better described by truncated power-law distributions during both states. In the same vein, Ahmouda et al. (2019) focused on how hurricanes impact human mobility patterns in the urban and rural areas of different US coastal regions affected by Hurricanes Matthew (2016) and Harvey (2017). Using tweets from the study regions, the authors compared data before, during, and after each hurricane. While displacements and activities were reduced, displacement patterns followed the truncated power law at all phases of the events.
Wang and Taylor (2016) further expanded and confirmed these results for five other types of natural disasters (typhoons, earthquakes, winter storms, thunderstorms, and wildfires), using geotagged tweets to analyze 15 cases across the globe.
To the best of our knowledge, no study has analyzed human mobility patterns during disasters using geotagged photos; however, some studies have used geotagged photos for other purposes. Most of these used Flickr as their main data source. Preis et al. (2013) analyzed the number of photos taken during Hurricane Sandy by Flickr users living in New Jersey. A strong correlation was found between atmospheric pressure and the count of uploaded photos related to the hurricane between October 20 and November 20, 2012. They suggested that Flickr might be useful for monitoring collective human attention during large-scale disasters in real time. Yan et al. (2017) employed geotagged Flickr photos to evaluate tourism recovery in the central Philippines after two disasters in 2013: the Bohol Earthquake (October) and Super Typhoon Haiyan (November). The authors gathered insights regarding the spatiotemporal patterns of recovery conditions and trends. Similarly, Li et al. (2018) used geotagged images to quantify the degree of damage from four natural disasters: Typhoon Ruby (2014), the Nepal Earthquake (2015), the Ecuador Earthquake (2016), and Hurricane Matthew (2016). They developed a CNN-based model
to classify image content into three damage classes illustrated as heatmaps: severe, mild, and no damage. Geotagged images were also successfully used as ground truth data to improve the accuracy of the May 2011 Mississippi River flood hazard map, extracted using remotely sensed data and a digital elevation model (Schnebele and Cervone 2013).
9.3 Study Area: Geographic Settings and Past Disasters
Japan is among the countries most vulnerable to natural disasters (Garschagen et al. 2016; Welle and Birkmann 2015). Its location in the Circum-Pacific Mobile Belt, known for its seismic and volcanic activities, causes frequent earthquakes and tsunamis, which can lead to catastrophes, such as the 2011 Tohoku Earthquake. The Japanese archipelago is also hit by tens of typhoons each year, which form over the Northwest Pacific Ocean, causing rainstorms, heavy rain, and landslides. While typhoons are generally predictable (with Japan’s typhoon season usually lasting from July to October), earthquakes are more elusive.
Tokyo’s 23 special wards (hereinafter called Tokyo) were selected as the target area for this study for several reasons. First, according to the 2015 census, the area is home to over nine million citizens and has a population density of over 1,000 persons/km² (Statistics Bureau 2020). Second, Tokyo is the most popular destination for international tourists. According to the Japan National Tourism Organization’s yearly statistics, approximately half of the international visitors to Japan pass through Tokyo. In 2018, for instance, more than 14 million tourists visited the capital. Third, the study area is part of the Tokyo Metropolitan Area (TMA), which is the heart of Japan’s economy (the third largest in the world). From 2000 to 2014, the TMA generated 37% of Japan’s GDP growth (OECD 2018). Lastly, the area is among the world’s riskiest cities in which to live. Based on a recent report from Lloyd’s City Risk Index, which estimates the economic output threatened by 18 manmade and natural hazards, Tokyo has the highest risk index across the 279 cities surveyed (Lloyd 2018).
Figure 9.1 shows the chronology of the significant natural disasters that impacted the TMA from 2008 to 2019, causing deaths, injuries, and significant economic losses.
Based on the official Tokyo Metropolitan Government disaster prevention guidebook (Tokyo Metropolitan Government 2019), we identified 22 extreme events in six categories: earthquake, heavy rain, snowstorm, typhoon, rainstorm, and snow/rainstorm. From these 22, we selected six case studies, each representing one of the six categories. Table 9.1 lists the date of occurrence and the damage caused by the selected disasters.
Fig. 9.1 Chronology of the natural disasters that impacted Tokyo between July 2008 and December 2019. Disasters circled in orange are the case studies for this analysis
Table 9.1 The selected extreme natural events with the approximate date of occurrence and the damage (adapted from the Tokyo Metropolitan Government’s disaster prevention guidebook [Tokyo Metropolitan Government 2019])

| Disaster | Date | Damage |
|---|---|---|
| Rainstorm | December 3, 2010 | One dead, one injured, 14 homes flooded above ground level, and 14 homes flooded below ground level |
| Tohoku Earthquake | March 11, 2011 | Seven dead, 117 injured, 17 buildings destroyed, and another 195 buildings partially destroyed |
| Typhoon Wipha | October 15, 2013 | 37 persons killed, three missing |
| Snow/rainstorm | February 8, 2014 | Five persons seriously injured and 61 with minor injuries |
| Heavy rain | September 8–11, 2015 | One person with minor injuries, eight buildings flooded above ground level, and 14 flooded below ground level |
| Snowstorm | January 22–23, 2018 | 592 individuals with minor injuries |
9.4 Data and Methods
Figure 9.2 illustrates the study’s methodology. First, we collected Flickr records associated with photos taken 15 days before and after the six selected natural disasters. These records were cleaned and merged with hourly weather data. Based on these weather data, we defined the study periods “before,” “during,” and “after” each disaster. Second, we developed a transfer-learning-based CNN model to classify photos into two categories (indoor and outdoor). We then compared human mobility across the phases of each disaster based on relevant metrics (displacement, MSD, number of trips, etc.). The following sections detail each key step.
Fig. 9.2 The study’s methodological framework
9.4.1 Data Collection and Preprocessing
The first step of the analysis involved collecting and preprocessing geotagged photos (Fig. 9.2a). First, we developed a Python script to collect records from Flickr through its publicly available API. We set a query to obtain only records from Tokyo taken between July 1, 2008, and December 31, 2019. To minimize the effects of “active users,” defined as those who upload a large number of photos (Hollenstein and Purves 2010) and who could bias the results in favor of their behaviors (Hu et al. 2015), we kept only one record if multiple photos were taken within the same minute. We then downloaded Tokyo’s historical hourly weather data from worldweatheronline.com via its accessible API. Weather parameters included, among other factors, temperature, air pressure, humidity, and cloud cover. These data were used to delineate the start and end dates for each of the three phases in each disaster. Of the 22 identified disasters, we chose six case studies according to the availability of Flickr samples for each of the disasters’ three phases. Table 9.2 lists the case studies, the start/end dates of each phase, and the corresponding Flickr sample count. It should be noted that, for the Tohoku Earthquake, we considered the major Mw 7.2 foreshock of March 9, 2011, as the start of the “during” period.
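The filtering rule for active users described above can be sketched as follows. This is an illustrative sketch only, not the authors' script: it assumes the rule is applied per user and per clock minute, and the record keys `user_id` and `taken` are hypothetical names rather than fields of the Flickr API.

```python
from datetime import datetime


def one_record_per_minute(records):
    """Keep the earliest record for each (user, clock minute) pair.

    `records` is an iterable of dicts with the hypothetical keys
    'user_id' (str) and 'taken' (datetime). The result is sorted by
    user and then by time.
    """
    seen = set()
    kept = []
    for rec in sorted(records, key=lambda r: (r["user_id"], r["taken"])):
        # Truncate the timestamp to the minute to build the grouping key.
        minute = rec["taken"].replace(second=0, microsecond=0)
        key = (rec["user_id"], minute)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Hypothetical sample: user u1 takes two photos within the same minute.
sample = [
    {"user_id": "u1", "taken": datetime(2011, 3, 9, 11, 45, 40)},
    {"user_id": "u1", "taken": datetime(2011, 3, 9, 11, 45, 5)},
    {"user_id": "u1", "taken": datetime(2011, 3, 9, 11, 46, 0)},
    {"user_id": "u2", "taken": datetime(2011, 3, 9, 11, 45, 30)},
]
deduped = one_record_per_minute(sample)
print(len(deduped))
```

Truncating to the clock minute is one reading of "within the same minute"; a rolling 60-second window would be an equally plausible alternative and would require comparing each record with the last one kept for the same user.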
9.4.2 Indoor/Outdoor Photo Data Classification
The second key step was identifying the environment in which the photos were taken (indoor or outdoor) (Fig. 9.2b), thereby determining in what environment users
Table 9.2 The considered disasters with the estimated start/end dates for their three phases and the Flickr records count for each phase

| Disaster | Approximate occurrence date and time | Flickr records count: before | during | after | Unique users count (average photos per user): before | during | after |
|---|---|---|---|---|---|---|---|
| Rainstorm | 2/8/2010 12:00:00 AM | 1,346 | 146 | 639 | 147 (9.16) | 34 (4.29) | 115 (5.56) |
| Tohoku Earthquake | 3/9/2011 11:45:00 AM | 533 | 129 | 523 | 94 (5.67) | 11 (11.73) | 118 (4.43) |
| Typhoon Wipha | 10/16/2013 12:00:00 AM | 1,571 | 415 | 700 | 75 (20.95) | 20 (20.75) | 34 (20.59) |
| Snow/rainstorm | 2/8/2014 12:00:00 AM | 911 | 278 | 515 | 69 (13.2) | 54 (5.15) | 80 (6.44) |
| Heavy rain | 9/8/2015 12:00:00 AM | 1,350 | 430 | 631 | 67 (20.15) | 21 (20.47) | 21 (30.05) |
| Snowstorm | 1/22/2018 12:00:00 AM | 268 | 281 | 106 | 37 (7.24) | 23 (12.22) | 20 (5.3) |

Phase offsets in days (columns: phase “before” (− days), phase “during” (− days, + days), phase “after” (+ days)), in the order given in the source: Rainstorm 15, 3, 3; Tohoku Earthquake 15, 0, 5, 3, 15; Typhoon Wipha 15, 15, 3, 5; Snow/rainstorm 15, 7, 5, 3; Heavy rain 19, 15, 3; Snowstorm 15, 3, 3, 15, 15.
were located during the three phases of each case study. The users’ periods of stay before moving to different environments were also estimated. To accomplish this goal, we developed a CNN-based model to classify photos as indoor or outdoor. For the training and testing process, we used two public datasets: Dense Indoor and Outdoor DEpth (DIODE; Vasiljevic et al. 2019) and Indoor Scene Recognition (Quattoni and Torralba 2009). The former contains thousands of RGB photos of indoor and outdoor scenes, while the latter contains 15,620 indoor photos grouped into 67 categories (e.g., store, home, public spaces). Each dataset was split into training and testing/validation sets. Initially, we developed two models, based on the transfer-learning approach via MobileNetV2 (Sandler et al. 2018) and InceptionV3 (Szegedy et al. 2015). Using the training dataset (80%) and the evaluation dataset (20%), we trained and validated the two models considering a binary cross-entropy loss function. We obtained an accuracy of 94% and 84% and a loss of 0.17 and 0.64 for the MobileNetV2 and InceptionV3 models, respectively. Consequently, we selected the MobileNetV2 model to classify the Flickr images as indoor or outdoor. In addition to its high accuracy, the MobileNetV2 was selected because it optimizes
memory consumption and execution time while minimizing prediction errors. The total training time was about 14 h (13 h 59 min 2 s), compared with approximately 29 h (28 h 58 min 46 s) for the InceptionV3 model. Model training was conducted on a machine with a 10-core Intel Core i9-10900X CPU (3.70 GHz) and 120 GB of RAM.
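The two figures used above to compare the models, accuracy and binary cross-entropy loss, can be computed from predicted probabilities as in the following sketch. It is a plain-Python illustration of the metrics only, not the training code of the study; the label coding (1 = outdoor, 0 = indoor), the 0.5 decision threshold, and the sample values are assumptions made for the example.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of samples whose thresholded prediction matches the label."""
    correct = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


# Hypothetical labels (1 = outdoor, 0 = indoor) and model outputs.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.1]
loss = binary_cross_entropy(labels, probs)
acc = accuracy(labels, probs)
print(f"loss={loss:.3f} accuracy={acc:.2f}")
```

A model can have a reasonable accuracy and still a high loss if its wrong predictions are confident ones, which is one reason both figures are usually reported together.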
9.4.3 Calculating Human Mobility Parameters and Trip Characteristics
In the third step, we computed parameters describing human mobility for all users before, during, and after each disaster (Fig. 9.2c). Two parameters were considered: displacements and MSD. Displacement denoted the distance between two consecutive photos taken by an individual. The Haversine formula (Robusto 1957) was used to calculate displacements based on the geographic coordinates of the origin and destination points as follows:
$$d = 2r \times \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\phi_2 - \phi_1}{2}\right)}\right) \tag{9.1}$$
where r referred to the earth’s radius, which equals approximately 6,371 km; φ was the latitude; and ϕ was the longitude. MSD is a concept adapted from statistical physics and Brownian motion (Barbosa et al. 2018), which has been employed across disciplines, such as for hurricane track forecasting (Meuel et al. 2012). MSD was used to calculate the spatial extent of an individual’s displacements in an area, and it was defined by the following formula:
$$\mathrm{MSD}(t) = \left(r(t) - r_0\right)^2 \tag{9.2}$$
where r_0 was the location of the first photo taken at the beginning of the observations, and r(t) was the location of a photo taken at time t. The two mobility parameters were calculated for a 24-h period beginning at 3:00 AM and ending at 2:59 AM the next day. For the displacements, we fitted four distributions to the data for each of the three event phases: exponential (EXP), lognormal (LGN), power-law (PL), and truncated power-law (TPL). The goodness of fit was tested and compared for the four distributions. Maximum likelihood estimation (MLE) was employed to identify the model with the highest goodness of fit using the Python package powerlaw (Alstott et al. 2014). We also investigated the characteristics of trips between or within indoor and outdoor environments. Using the MobileNetV2 model, we predicted the nature of the environment where all sampled images were taken. Then, using the timestamp information from the photos, we measured the users’ length of stay in
9 How Various Natural Disasters Impact Urban …
each environment by calculating the period between the last and first photos among consecutive photos taken in the same type of environment. We categorized these stay periods into three classes: short (≤10 min), medium (>10 min and 80%) of the photos taken during these events. Photos taken during both events were related to food or taken inside a room or restaurant. This could suggest that the population was well prepared for the events. After Typhoon Wipha, human mobility slowly returned to normal, mainly around the Tokyo and Ikebukuro stations, as indicated by the slightly wider photo density in those areas and the increase in the number of photos taken outdoors (+20%). The impacts of the four-day heavy rain of 2015 lasted longer, as supported by the limited number of photos taken after the event and the slight increase in outdoor photos (+2%). The photos taken after these incidents showed indoor environments (e.g., restaurants and rooms), natural scenes, and cultural events. In contrast to Typhoon Wipha and the 2015 heavy rain, the 2010 rainstorm (Fig. 9.3A1) did not heavily impact human movement, as indicated by the large densities of photos taken at the time. Only 32% of those photos were taken indoors. This may be explained by the fact that the 2010 rainstorm was neither as severe nor as long as Typhoon Wipha and the 2015 heavy rain. During snow-related events, a wider spatial extent of photos was observed across the study area, especially during the snowstorm of January 22–23, 2018 (Fig. 9.3A6), while the indoor/outdoor photo percentages show that the event in February 2014 (Fig. 9.3A4) slightly pushed people to stay indoors. However, despite snowy weather, people tended to visit attractions, such as zoos and parks. The snowstorm of 2018 encouraged people to go outside and see Tokyo in such rare, snowy conditions, as evidenced by the increased photo density and number of outdoor photos at the time.
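The stay-length measure described in Sect. 9.4.3 (the time between the first and last of consecutive photos taken in the same type of environment) can be illustrated with a minimal sketch. The data layout, function names, and example timestamps below are hypothetical. Only the short-stay threshold (≤10 min) is taken from the text, so the sketch does not attempt the other classes.

```python
from datetime import datetime
from itertools import groupby


def stay_periods(photos):
    """Return (environment, minutes) for each run of consecutive photos
    taken in the same environment by one user.

    photos: list of (timestamp, env) tuples, env being "I" (indoor) or "O" (outdoor).
    """
    ordered = sorted(photos, key=lambda p: p[0])  # chronological order
    stays = []
    for env, run in groupby(ordered, key=lambda p: p[1]):
        times = [t for t, _ in run]
        minutes = (times[-1] - times[0]).total_seconds() / 60.0
        stays.append((env, minutes))
    return stays


def is_short_stay(minutes):
    """Short stays are those of at most 10 min, as stated in the text."""
    return minutes <= 10


# Hypothetical photo sequence for a single user.
photos = [
    (datetime(2018, 1, 22, 10, 0), "I"),
    (datetime(2018, 1, 22, 10, 8), "I"),
    (datetime(2018, 1, 22, 10, 30), "O"),
    (datetime(2018, 1, 22, 11, 15), "O"),
    (datetime(2018, 1, 22, 11, 20), "I"),
]
stays = stay_periods(photos)
print(stays)
```

Note that a run consisting of a single photo yields a stay of 0 min under this definition, which is one reason sparse photo data tend to understate time spent in an environment.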
Fig. 9.3 Spatial distribution of photos and their density before/during/after the six studied natural disasters (green = before; red = during; blue = after). The numbers and colored, camera-like icons correspond to an example of a path followed by a random user during the three phases. Black dots in the study-area maps represent the main stations located along the Yamanote line. Percentages at the bottom of the maps correspond to the percentages of indoor (I) and outdoor (O) photos taken during each phase
The Tohoku Earthquake (Fig. 9.3A2) affected people’s movement differently from the rain- and snow-related events. Post-earthquake photo clusters were observed across the study area, beyond the usual hotspots near the Yamanote line stations. The percentages of indoor and outdoor photos were roughly equal before and after the event (52% and 48% before; 48% and 52% after, respectively). However, the share of indoor photos increased to 63% during the event. People took photos to document unusual circumstances, such as empty store shelves and water stored in bathtubs. The insights gathered from this temporal analysis of photos’ spatial distribution and density, coupled with the nature of the environment, show how people reacted to different types of natural disasters. While rainy events generally led to fewer activities, their impacts on human mobility depended on their duration and severity. Snow-related events led to more outdoor activities, which might be attributed to the events’ rarity and lack of severity. During most of the sampled events, human activities were mainly located near the Yamanote line. However, the Tohoku Earthquake caused a large number of people to disperse beyond this area.
9.5.2 Displacements
Figure 9.4 shows the heatmaps for comparing the displacement fit of the four distributions (EXP, LGN, PL, and TPL) for each phase of the analyzed disasters. The loglikelihood ratio R values suggest that, although not statistically significant in most cases at levels below 10%, TPL fit the data more appropriately than PL, LGN, and EXP. This partially coincides with results reported by Wang and Taylor (2014). LGN showed a better fit than other distributions during and after the heavy rain of September 8–11, 2015, and after the snowstorm of January 22–23, 2018. EXP showed the worst fit for almost all case studies. The plots in Fig. 9.5 present the complementary cumulative distribution function for the TPL distribution fit of the displacements calculated for each phase of the analyzed events. The equation for the TPL distribution was:
$$P(d) \propto d^{-\beta} e^{-\lambda d} \tag{9.3}$$
where d was the displacement; β was the scaling parameter; and λ was the exponential cutoff value. The λ values increased during wet-weather-related disasters, which suggests limited trips, while values recorded during and after the Tohoku Earthquake increased. The values of β ranged from 1 to 1.92, with a few exceptions returning values above 2. The differences between these values and those reported in the literature could be due to the small sample sizes used in this comparative analysis, as other research has reported that sample size may affect goodness-of-fit results (e.g., Wang and Taylor 2016). Additionally, concerning Twitter, Devkota et al. (2019) pointed out that not all activities are necessarily recorded. This is even more true for Flickr, which is less popular than Twitter.
Fig. 9.4 Heatmaps comparing the displacement fit of the four considered distributions (EXP, LGN, PL, and TPL) for the three phases (before, during, and after) of each analyzed natural disaster. The first value in each cell, R, corresponds to the loglikelihood ratio between two distributions. A positive number indicates that the displacement data are more likely under the y-axis distribution, while a negative number indicates the contrary. The second value, labeled p, denotes the significance value. The last value depicts the statistical significance level of the comparison, in which five levels are considered: (*****) = 0.1%; (****) = 0.5%; (***) = 1%; (**) = 5%; (*) = 10%. () is used when a p-value is not significant at any of the considered levels
Fig. 9.5 Complementary cumulative distribution functions for the displacements and fitted TPL distribution for the three phases of each analyzed natural disaster: a rainstorm, b the Tohoku Earthquake, c Typhoon Wipha, d snow/rainstorm, e heavy rain, and f snowstorm
Overall, the results suggest that all sampled extreme events affected human mobility patterns. These patterns, during different disaster phases, most often followed a TPL distribution, though this finding was not always statistically significant.
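The pairwise comparisons summarized in Fig. 9.4 rest on a loglikelihood ratio R and a significance value p. The sketch below illustrates the idea for two of the four distributions only (LGN and EXP), using a Vuong-style normalization. It is a from-scratch illustration on synthetic data, not a call into the powerlaw package used in the chapter; the function names are hypothetical, and the sketch ignores any lower cutoff on the data, which is a simplifying assumption.

```python
import math
from statistics import NormalDist


def exp_loglik(xs):
    """Pointwise log-likelihoods under an exponential fitted by MLE (rate = 1/mean)."""
    rate = len(xs) / sum(xs)
    return [math.log(rate) - rate * x for x in xs]


def lognormal_loglik(xs):
    """Pointwise log-likelihoods under a lognormal fitted by MLE."""
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return [
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    ]


def loglikelihood_ratio(ll_a, ll_b):
    """Return (R, p). R > 0 favours distribution A; R < 0 favours B."""
    diffs = [a - b for a, b in zip(ll_a, ll_b)]
    n = len(diffs)
    R = sum(diffs)
    mean = R / n
    var = sum((d - mean) ** 2 for d in diffs) / n
    if var == 0:
        return R, 1.0  # identical pointwise fits: no evidence either way
    p = math.erfc(abs(R) / math.sqrt(2 * n * var))
    return R, p


# Synthetic "displacements": evenly spaced quantiles of a lognormal(0, 1).
n = 1000
sample = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]

R, p = loglikelihood_ratio(lognormal_loglik(sample), exp_loglik(sample))
print("R (LGN vs EXP):", R, "p:", p)
```

Because the synthetic sample is lognormal by construction, R comes out positive here. With real displacement data the sign and the p-value together indicate which candidate is preferred and whether that preference is statistically distinguishable.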
9.5.3 Spatial Extent of Human Activity
For all disasters, the spatial extents of Flickr users’ displacements in the study area are presented through the sum of MSD values in 80–140 min periods (Fig. 9.6). The spatial extent of human activity during the analyzed events is limited, as MSD values remained small.
Fig. 9.6 The MSD values computed for the three phases of each analyzed natural disaster: a rainstorm, b the Tohoku Earthquake, c Typhoon Wipha, d snow/rainstorm, e heavy rain, and f snowstorm
Apart from the 2011 Tohoku Earthquake, the MSD values before, during, and after each disaster followed similar patterns. Before the occurrence, people tended to move across the study area freely; however, once a disaster occurred, their spatial activities tended to diminish. The MSD values generally increased slowly over time, a pattern previously reported in several studies (Ahmouda et al. 2019; Barbosa et al. 2018). For the analyzed events, the pace of returning to normal conditions, and the time required to do so, most likely depended on the severity of each disaster. For instance, the MSD values recorded after the 2010 rainstorm showed a quick return to approximately normal values. Conversely, for the 2014 snow/rainstorm and the 2015 heavy rain, the MSD values showed slight growth after the two events but did not change thereafter. Concerning Typhoon Wipha and the 2018 snowstorm, the computed MSD values remained nearly zero throughout the covered 120-min period. Other studies (e.g., Ahmouda et al. 2019) reported a period of 240 min before human activities resumed in Houston, Texas, and in North and South Carolina after Hurricanes Harvey and Matthew, respectively. However, the observed MSD values before, during, and after the Tohoku Earthquake showed different patterns. Only for this event was the spatial extent recorded after the disaster (40 km) wider than that recorded before the disaster (20 km). This might be attributed to the fact that people tend to flee to safer outdoor spaces after earthquakes, a postulation confirmed by analyzing the trips between indoor and outdoor environments, as shown in Sect. 9.5.4.
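For reference, the two mobility parameters behind these results (the Haversine displacement of Eq. 9.1 and the squared displacement from the first photo of Eq. 9.2) can be sketched as follows. The function names and the example track are hypothetical, the station coordinates are approximate, and the sketch covers a single user only; how values are aggregated across users or time windows is not shown and would be an additional assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate earth radius, as given in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def displacements(track):
    """Distances between consecutive photo locations of one user."""
    return [haversine_km(*a, *b) for a, b in zip(track, track[1:])]


def squared_displacement_from_origin(track):
    """(r(t) - r_0)^2 for each photo, with r_0 the first photo's location."""
    origin = track[0]
    return [haversine_km(*origin, *point) ** 2 for point in track]


# Hypothetical track (lat, lon), approximate coordinates of three stations:
# Tokyo, Shinjuku, Ikebukuro.
track = [(35.6812, 139.7671), (35.6896, 139.7006), (35.7295, 139.7109)]

print("displacements (km):", displacements(track))
print("squared displacement from origin (km^2):",
      squared_displacement_from_origin(track))
```

The squared displacement grows with distance from the first photo regardless of direction, so values that stay near zero indicate that a user remained close to the starting location.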
9.5.4 Trip Characteristics Between Indoor and Outdoor Environments
Figure 9.7 illustrates the number of trips between indoor and outdoor environments recorded during the different phases of each sampled event. We adopted the following notation (X, Y, T), where X and Y referred to the nature of the environment (I = indoor or O = outdoor) of the origin and the destination, respectively, and T referred to the category of time spent on each trip (S = Short [≤10 min]; M = Medium [>10 min and 10 min and